YuBaoku
fec58639db
[CI] skip test_structured_outputs* temporarily ( #4055 )
2025-09-11 18:07:50 +08:00
YuanRisheng
d2d04c2d5e
[setup optimize]Support git submodule ( #4033 )
...
* support git submodule
* update setup
* fix ci network
* fix clone
* revert clone linux
* delete args
* fix ci
* update
2025-09-11 17:41:16 +08:00
SuperNova
d60f7c4661
fix import tests.utils error in tests/model_loader/test_load_mtp.py ( #4027 )
...
Co-authored-by: yongqiangma <xing.wo@163.com >
2025-09-11 16:47:16 +08:00
CSWYF3634076
e4c64a71cc
[BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix ( #3921 )
2025-09-11 15:08:24 +08:00
bukejiyu
2650f58740
[docs] Update environment variables documentation ( #3957 )
2025-09-10 21:17:06 -07:00
co63oc
2af0f671b1
【Hackathon 9th No.55】add test_update_inputs_v1.py ( #3992 )
2025-09-11 11:34:22 +08:00
AIbin
a7392a0ff9
【Inference Optimize】DeepSeek-V3-model MLA Optimize ( #3886 )
...
* support MLA chunk_size auto search & cuda_graph
2025-09-11 10:46:09 +08:00
chen
637d96c6ae
[Feature] Support zai-org/GLM-4.5-Air BF16 model ( #3928 )
...
* support glm45_air
2025-09-10 19:36:10 +08:00
freeliuzc
7ee100903f
support rope_3d in spec mode ( #4034 )
2025-09-10 03:15:05 -07:00
ltd0924
684e93269b
[Fix] fix multi api server log dir ( #3967 )
...
* [BugFix] fix max streaming tokens invalid
* fix scheduler bug
* fix scheduler bug
* Update multi_api_server.py
2025-09-10 17:15:30 +08:00
wanrui
276f73cf83
【Hackathon 9th No.28】add test_cutlass_fp8_fp8_fp8_dual_gemm_fused ( #3935 )
...
* add test_cutlass_fp8_fp8_fp8_dual_gemm_fused
* fix the version
* fix code style
---------
Co-authored-by: Tao Luo <luotao02@baidu.com >
2025-09-10 14:57:49 +08:00
RAM
d3e4ae3d49
[Executor] Adjust signal sending order in RL training ( #3773 )
...
* Adjust processing order
* fix bug
* fix update_parameters bug
* refine code
2025-09-10 13:24:20 +08:00
Ayakouji
453487d5b0
[Feat] ernie4_5_vl_moe support CudaGraph ( #3226 )
...
* delete dynamic control flow for decode
* code-style
* fix scatter/gather typos and use input stream instead default stream
* support 0-Size Tensor
* update runner and model
* using static mem address as input
* fix mem leak
* refine code
* update mm_buffer
* fix typo
* fix buffersize
* fix unk token
* refine code
* refine
* support other arch
* open cudagraph in vlci
* fix
* update
* update
* update
* fix cmd
* update
---------
Co-authored-by: aquagull <hongyuh@qq.com >
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
2025-09-10 13:11:57 +08:00
zhupengyang
9d0074a91a
[xpu] add ep custom ops ( #3911 )
2025-09-10 12:22:50 +08:00
Yuanle Liu
c3b2a60fb8
[BugFix] Fix the abnormal memory usage caused by shape errors in the triton moe backend ( #4026 )
...
* fix device_id to in
* fix triton_moe bug
2025-09-09 20:05:54 -07:00
周周周
dbab579299
clean code ( #4020 )
2025-09-10 10:56:15 +08:00
guozhuangzhuang
f078a959b6
metrics shared folder naming ( #4007 )
...
* Fixed the issue of metrics file conflicts between multiple instances on a single machine
* Use uuid to name the metrics shared folder
* Use uuid to name the metrics shared folder
2025-09-10 10:47:20 +08:00
Sunny-bot1
3b1da6e4dd
support v1 loader for machete ( #3999 )
2025-09-10 10:21:33 +08:00
YuanRisheng
b3fac5bde1
[V1 Loader] Ernie kv cache quant support v1 loader ( #3899 )
...
* support c8 for ernie
* add unittest
* support vl
* fix c8
2025-09-09 05:25:08 -07:00
Zero Rains
98bfefea02
get org_vocab_size from args ( #3983 )
2025-09-09 15:08:03 +08:00
Jiang-Jia-Jun
c60adf4281
Revert "【FIX】Change the name of sparse attn from moba to plas ( #3845 )" ( #4001 )
...
This reverts commit e31c8f7336 .
2025-09-09 11:08:23 +08:00
Jiang-Jia-Jun
bbd548ceb6
Revert "【Fix】Change the name of sparse attn from moba to plas ( #3993 )" ( #4002 )
...
This reverts commit a553d1896c .
2025-09-09 11:07:46 +08:00
yangjianfengo1
f556561584
【docs】 update readme ( #4000 )
...
* update docs
* update readme
* update docs
2025-09-09 11:04:08 +08:00
yangjianfengo1
a553d1896c
【Fix】Change the name of sparse attn from moba to plas ( #3993 )
...
* update docs
* update docs
* update docs
* update docs
* rename moba to plas
* code style
* update ci
* code style
* update ci
2025-09-09 10:57:07 +08:00
yangjianfengo1
e31c8f7336
【FIX】Change the name of sparse attn from moba to plas ( #3845 )
...
* update docs
* update docs
* update docs
* update docs
* rename moba to plas
* code style
* update ci
* code style
* update ci
2025-09-09 10:56:50 +08:00
yangjianfengo1
de34222842
Update docs ( #3998 )
2025-09-09 10:44:15 +08:00
JYChen
8e8a5913da
add a3b-thinking doc ( #3994 )
2025-09-09 10:27:01 +08:00
Jiang-Jia-Jun
9f0e2a6854
Update README_CN.md
2025-09-09 10:11:25 +08:00
Jiang-Jia-Jun
30ddcc9115
Update README.md
2025-09-09 10:10:45 +08:00
Zhang Yulong
2359c8d21c
update ci ( #3962 )
...
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-09-09 10:09:13 +08:00
Jiang-Jia-Jun
1dc1397ef6
Update docs for thinking model support
2025-09-09 10:08:05 +08:00
ming1753
12326b60e1
[Docs] update VL best_practices for release/2.2 ( #3965 )
...
* [Docs] update VL best_practices for release/2.2
* fix bug
* modify
2025-09-08 22:07:37 +08:00
lzy
f12159b630
del batch id per token ( #3963 )
...
* Update decoder_write_cache_with_rope_kernel.cu
del batch_id_per_token
* Update decoder_write_cache_with_rope_impl.cuh
* Update test_append_attention.py
* Update test_append_attention.py
2025-09-08 21:58:34 +08:00
bukejiyu
08b3153661
update doc ( #3990 )
...
Co-authored-by: root <root@tjdm-inf-sci-k8s-hzz2-h12ni8-0214.tjdm.baidu.com >
2025-09-08 21:04:26 +08:00
AIbin
d00faeec69
update dsk doc ( #3989 )
2025-09-08 20:42:48 +08:00
yinwei
7e0bfd024f
update release note ( #3986 )
2025-09-08 19:03:14 +08:00
JYChen
1f056a7469
[docs] update best practice docs ( #3969 )
...
* update best practice docs
* add version and v1 loader info
2025-09-08 17:39:38 +08:00
Echo-Nie
319a4bf75f
【Hackathon 9th No.36】add test_extract_text_token_output( #3862 )
2025-09-08 17:31:58 +08:00
co63oc
f884cd4f62
[UnitTest][MTP]add test_speculate_set_stop_value_multi_seqs.py ( #3941 )
2025-09-08 17:11:00 +08:00
co63oc
f32327661c
[UnitTest][MTP]add test_eagle_get_hidden_states ( #3876 )
2025-09-08 17:10:01 +08:00
co63oc
976aa88e66
【Hackathon 9th No.69】add test_draft_model_preprocess ( #3832 )
...
* add test_draft_model_preprocess
* fix
* ci
2025-09-08 17:08:50 +08:00
co63oc
ed462cf238
[UnitTest][MTP] add test_speculate_get_token_penalty_multi_scores.py ( #3742 )
...
* add test_speculate_get_token_penalty_multi_scores
* fix
2025-09-08 17:07:11 +08:00
Echo-Nie
20495f927e
[UnitTest][MTP] supplementary unit test for ngram_match ( #3732 )
...
* supplement unittest for custom_ops: ngram_match
* add annotation
* use step_idx information to check equality at the specific position instead
* del anno
* del print
---------
Co-authored-by: Tao Luo <luotao02@baidu.com >
2025-09-08 17:06:06 +08:00
ooo oo
0c46318b34
【Hackathon 9th No.22】add unit tests for share_external_data ( #3744 )
2025-09-08 17:05:48 +08:00
yangjianfengo1
9ead10e1bc
Update docs ( #3975 )
2025-09-08 16:53:37 +08:00
xiaolei373
571ddc677b
Modify markdown ( #3896 )
...
* feat(log):add_request_and_response_log
* modify markdown graceful shutdown
2025-09-08 16:42:34 +08:00
AIbin
316ac546d3
update_wint2_doc ( #3968 )
2025-09-08 15:53:09 +08:00
zhuzixuan
83bd55100b
[Optimize]Error messages about Model api. ( #3839 )
...
* add v1/models interface related
* add model parameters
* default model verification
* unit test
* check model err_msg
* unit test
* type annotation
* model parameter in response
* modify document description
* modify document description
* unit test
* verification
* verification update
* model_name
* pre-commit
* update test case
* update test case
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/entrypoints/openai/serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Optimize error messages.
---------
Co-authored-by: yangzichao01 <yangzichao01@baidu.com >
Co-authored-by: Yzc216 <101054010+Yzc216@users.noreply.github.com >
Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-09-08 15:52:26 +08:00
co63oc
aadd6a94d8
fix typos ( #3951 )
2025-09-08 15:22:41 +08:00
co63oc
2033450391
rename ep_moe_prefill_func ep_moe_expert_dispatch ( #3938 )
2025-09-08 15:19:28 +08:00