Ryan
e58fed3665
[Graph Optimization][BugFix][CI] Fix 0size bug && add unittest ( #5495 )
2025-12-11 16:25:26 +08:00
YuBaoku
9f4512c932
[CI] disable test_cuda_graph_dynamic_subgraph.py in unit_test
2025-12-11 14:12:49 +08:00
qwes5s5
d79438bb86
add detoken switch ( #5463 )
2025-12-10 21:44:02 +08:00
zccjjj
03819f30c3
[CI][XPU] ep+prefix cache+chunk prefill ( #5489 )
2025-12-10 19:39:49 +08:00
luukunn
fbc9bce1e9
[Feature]Optimization of Thinking Pattern Framework ( #4302 )
...
* add model status in vl
* add x1 parser
* add model_status
* fix parser
* fix parser
* fix parser
* fix parser
* Revert "fix parser"
This reverts commit 300f446d8a .
* fix parser
* fix
* fix
* fix
* fix
* fix parser
* fix unit test
* fix unit test
* add unit test
* fix
* fix
* add unit test
* fix unit test
* add unit test
* add unit test
* fix unit test
* fix unit test
* fix bug
* fix unit test
* x1 tool parser
* fix unit test
* fix unit test
* fix unit test
* fix n
* fix unit test
* add unit test
* add unit test
* remove print
2025-12-10 16:17:06 +08:00
ming1753
9e15191cce
[BugFix] fix audio end bug ( #5464 )
2025-12-10 13:37:26 +08:00
Echo-Nie
1b1bfab341
[CI] Add unittest ( #5328 )
...
* add test_worker_eplb
* remove tensor_wise_fp8
* add copyright
2025-12-09 19:19:42 +08:00
lizexu123
95eab9f9ee
[Feature] support stop_token_ids ( #5399 )
...
* support stop_token_ids
* fix
* delete chinese
* support both
* delete print
2025-12-09 17:49:12 +08:00
Haonan Luo
e397c4fba6
[Others] remove add_bias option ( #5425 )
2025-12-09 17:39:35 +08:00
lizexu123
b0cf2c4b7a
[Feature] Support prefill batch inference for pooling models. ( #5436 )
...
* fix multi-inputs
* fix threshold
* fix threshold
* fix
* support multi-batch
* add tests
* fix test
* test
* fix
2025-12-09 16:21:00 +08:00
Juncai
83ea9646f9
[PD Disaggregation] Unify the disaggregation info and the pd communication ( #5438 )
...
* Unify the disaggregation info and the pd communication
* up
* up
* fix
* fix conflict
* fix unittest
2025-12-09 14:44:59 +08:00
Nyakku Shigure
e1c4a12e34
[Graph Optimization][CINN] Use CINN in PaddleOCR-VL ViT part ( #5223 )
...
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-09 14:37:00 +08:00
K11OntheBoat
8d99bac532
Remove CUDA ERROR 9 of inputs of get_padding_offset kernel ( #5440 )
...
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com ”>
2025-12-09 14:17:30 +08:00
kevin
f7e832efaf
[BugFix] fix mm cudagraph ( #5266 )
...
* fix mm cudagraph
* fix test_prompt_ids bug
* update code
* update ci code
* update ci code
* update ci code
2025-12-09 11:51:00 +08:00
zhouchong
5d9b5e4a5b
[Engine] [Feature] Refactor async_llm:cross-process with EngineService,based on zmq communication ( #4868 )
...
* Refactor async_llm:cross-process with EngineService
* fix: async_llm output process
* fix: return prompt_token_ids and prompt_tokens in first res
* optimize common_engine start func
2025-12-09 10:53:40 +08:00
SunLei
5fb93d84f5
[Feature] [Benchmark]: add ZMQ-based FMQ implementation and benchmark tools ( #5418 )
...
* feat(fmq): add ZMQ-based FMQ implementation and benchmark tools
* move FMQ_CONFIG_JSON to envs
* fix top_p_candidates (#5400 )
Co-authored-by: freeliuzc <lzc842650834@gmail.com >
* [RL] Support Rollout Routing Replay (#5321 )
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* add routing replay ci
* support glm topk
* support other top_k
* fix ci bug
* pre-commit
* only support chatcmpl
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
* [Bug fix] Fix the multi-input accuracy issue in the pooling model. (#5374 )
* fix multi-inputs
* fix threshold
* fix threshold
* fix
* [BugFix]remove _execute_empty_input (#5396 )
* Revert "[RL] Support Rollout Routing Replay (#5321 )" (#5402 )
This reverts commit 96d2d4877b .
* [New][RL] Support Rollout Routing Replay (#5405 )
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* add routing replay ci
* support glm topk
* support other top_k
* fix ci bug
* pre-commit
* only support chatcmpl
* Revert "Revert "[RL] Support Rollout Routing Replay (#5321 )" (#5402 )"
This reverts commit c45e064f3d .
* Fix XPU and NPU bug
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
* bf16 deepseek (#5379 )
* fix deepseek (#5410 )
* Update tests/inter_communicator/test_fmq_factory.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update benchmarks/benchmark_fmq.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/inter_communicator/fmq.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: GoldPancake <56388518+Deleter-D@users.noreply.github.com >
Co-authored-by: freeliuzc <lzc842650834@gmail.com >
Co-authored-by: RAM <gstian5555@outlook.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
Co-authored-by: lizexu123 <39205361+lizexu123@users.noreply.github.com >
Co-authored-by: 周周周 <39978853+zhoutianzi666@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com >
2025-12-08 22:04:49 +08:00
kesmeey
d1bd40d44c
[CI]【Hackathon 9th Sprint Example NO 16】Add unit tests for the functional module fastdeploy/input/ernie4_5_vl_processor/process.py ( #5264 )
...
* test: add unit tests for process.py (NO.16)
* update
* update filename
* update filename
* update
* update
* fix failed testcases
* simplify the code
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-08 14:30:15 +08:00
周周周
2aea8a3a60
[Others] Remove useless code ( #5404 )
2025-12-08 13:59:46 +08:00
Juncai
80efe98f8d
[PD Disaggregation] Add timestamp for analyzing splitwise deployment ( #5317 )
...
* Add timestamp for analyzing splitwise deployment
* up
* up
* up
* up
* up
* up
* fix format
* fix
2025-12-08 10:08:44 +08:00
RAM
b2908b8e82
[New][RL] Support Rollout Routing Replay ( #5405 )
...
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* add routing replay ci
* support glm topk
* support other top_k
* fix ci bug
* pre-commit
* only support chatcmpl
* Revert "Revert "[RL] Support Rollout Routing Replay (#5321 )" (#5402 )"
This reverts commit c45e064f3d .
* Fix XPU and NPU bug
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
2025-12-05 22:06:26 +08:00
Jiang-Jia-Jun
c45e064f3d
Revert "[RL] Support Rollout Routing Replay ( #5321 )" ( #5402 )
...
This reverts commit 96d2d4877b .
2025-12-05 20:19:39 +08:00
lizexu123
d4979347ca
[Bug fix] Fix the multi-input accuracy issue in the pooling model. ( #5374 )
...
* fix multi-inputs
* fix threshold
* fix threshold
* fix
2025-12-05 20:18:17 +08:00
RAM
96d2d4877b
[RL] Support Rollout Routing Replay ( #5321 )
...
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* add routing replay ci
* support glm topk
* support other top_k
* fix ci bug
* pre-commit
* only support chatcmpl
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
2025-12-05 20:01:33 +08:00
kevin
c9d7f9e7c3
[BugFix] fix async download bug ( #5349 )
...
* fix async download bug
* update log
* Revert "update log"
This reverts commit 5816e602f4 .
* update code
* fix mtp bug
2025-12-05 18:59:12 +08:00
zccjjj
5b900667e3
[XPU] support ep4tp1+v1 loader ( #5398 )
2025-12-05 18:51:15 +08:00
zccjjj
e927c65742
[XPU] [Optimization] [EP] EP communication optimization. ( #5145 )
2025-12-05 10:03:45 +08:00
YuBaoku
1b5fd79d6b
[CI] disable test_schedule_output.py in unit_test ( #5377 )
2025-12-04 23:18:23 +08:00
chenjian
3878a99b69
[Feature] Support cache kv cache for output tokens ( #4535 )
...
* [Feature] Support cache kv cache for output tokens
* fix bug
* fix ci bug
* improve coverage
* enable output caching by default
* fix ci
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-04 20:53:08 +08:00
Longzhi Wang
5cd17fd662
[Models] Add forward_meta to moe models' forward function ( #5138 )
...
* [Models] Add forward_meta to moe models' forward function
* fix missing param
* fix
* fix
* fix forward_meta
* fix test and remove chunked MoE related in config
* fix test
* fix
* fix
2025-12-04 13:26:58 +08:00
Juncai
f5bdb36e9b
Reduce timeout in unittest ( #5366 )
2025-12-04 13:19:02 +08:00
lizexu123
946025480e
[Bug fix] fix pooling models ( #5358 )
...
* fix
* fix
* fix test
* fix gpu_model_runner
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-04 11:06:30 +08:00
qwes5s5
a52aea073c
fix logprobs ( #5335 )
2025-12-04 10:38:51 +08:00
ming1753
5f8d4aedea
[Feature] support audio tts ( #5333 )
2025-12-03 21:06:48 +08:00
Daci
83dbc4e5dd
[Feature] Guided Decoding add LLguidance backend ( #5124 )
...
* llguidance
* add requirements_guided_decoding.txt and doc
* fix test_guidance_*.py
* fix test_guidance_*.py && mv
* fix llguidance choice
* test_guidance_*
* rm lazy loader
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-03 20:23:57 +08:00
lzy
f458cc5ba4
[Optimization]1.fix tp+ep moe_forward; 2.set max_prefill_batch=env.MAX_PREFILL_NUM ( #5353 )
...
* [Optimization] 1.fix tp+ep moe_forward; 2.set max_prefill_batch=env.MAX_PREFILL_NUM
* fix test_chunked_moe
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-03 16:42:10 +08:00
YuBaoku
dfeabee123
[CI] Allow occasional distributed worker exit_code ( #5341 )
2025-12-03 10:56:59 +08:00
YuBaoku
3e2c13d8c5
[CI] Disable queue state assertion temporarily ( #5329 )
2025-12-02 18:57:29 +08:00
Sunny-bot1
3629db4129
[Quantization] Support w4afp8 MoE dynamic quantization ( #5282 )
...
* support dynamic activation quant for w4afp8
* support dynamic w4afp8
* add test
* fix
* fix
---------
Co-authored-by: zhoutianzi666 <17801055074@163.com >
2025-12-02 18:56:16 +08:00
周周周
fb7f951612
[UNITEST] add test ( #5305 )
2025-12-02 17:59:01 +08:00
Jiaxin Sui
8e0f4dfd0c
[XPU] [CI] Xpu Ci Refactor ( #5252 )
...
* add xpu ci
* add case
* add case
* fix ci bug
* Update Docker image tag to 'latest' in CI workflow
* Fix set -e usage in run_xpu_ci_pytest.sh
* add pd case
* add case
* Configure pip to use Tsinghua mirror for dependencies
Set the global pip index URL to Tsinghua mirror.
* fix ci bug
* fix bug
* fix bug
---------
Co-authored-by: suijiaxin <suijiaxin@Suis-MacBook-Pro.local >
Co-authored-by: root <root@gajl-bbc-onlinec-com-1511964.gajl.baidu.com >
Co-authored-by: root <root@gajl-bbc-onlinec-com-1511972.gajl.baidu.com >
2025-12-02 17:15:51 +08:00
YuBaoku
69e003abcb
[CI] Fix return_code check in test_chunked_moe.py ( #5326 )
2025-12-02 15:41:26 +08:00
lizexu123
c563eca791
[Feature] support reward model ( #5301 )
...
* Your commit message here
* add test
* update develop
* support reward
* support enable_chunk_prefill
* support concurrency
* support convert is reward
* update test
* delete print
* fix enable_thinking
* add document
* fix place
* fix test
* fix
* support enable_prefix_caching
* add no-enable_prefix-caching test
* fix
* support enable_prefix_caching
* delete print
* fix document
* fix
* fix test
* fix document and delete chinese
* update
* enable_thinking
* fix test
2025-12-02 14:55:31 +08:00
qwes5s5
117980dd4e
[LogProbs]Enable prompt logprobs output and modify data transmission method for the online interface. ( #5089 )
...
* add prompt logprobs
* Merge prompt_logprobs_tensors and prompt_logprobs
* fix param check
* trigger ci
* fix unittest
* fix logprobs bug
2025-12-02 13:49:51 +08:00
YuanRisheng
af39819fcd
Revert "[CI] 【Hackathon 9th Sprint No.18】NO.18 Add unit tests for functional module ( #5064 )" ( #5290 )
...
This reverts commit 7bac016c77 .
2025-12-02 13:43:36 +08:00
YuanRisheng
ded7765dec
Revert "[CI] 【Hackathon 9th Sprint No.41】NO.41 Add unit tests for functional module ( #5062 )" ( #5291 )
...
This reverts commit 373b5c3807 .
2025-12-02 13:43:13 +08:00
YuBaoku
68533ebd95
[CI] disable test_chunked_moe.py in unit_test ( #5322 )
2025-12-02 10:39:50 +08:00
xiaolei373
84e2f6aa75
[CI]add clear to run-batch ci ( #5307 )
2025-12-01 21:18:19 +08:00
Jiaxin Sui
b0113cb0fc
[XPU][CI] Change XPU CI Base Value ( #5318 )
...
* Add '小度' keyword to assertion in run_w4a8.py
* Add keywords to assertion in run_ep_online.py
* Add keywords to assertion in run_w4a8.py
* Update run_45T.py
* Update run_ep_online.py
* Refactor assertion for response content keywords
* Update run_w4a8.py
* Update run_w4a8.py
2025-12-01 21:02:09 +08:00
Juncai
0925d44f18
[PD Disaggregation] support different tp_size for prefill and decode ( #5296 )
...
* up
* up
* up
* fix
2025-12-01 17:50:20 +08:00
Jiaxin Sui
b467e9dadc
[XPU][CI]Change W4A8 Case Base Value ( #5309 )
2025-12-01 15:25:33 +08:00