Commit Graph

4090 Commits

Author SHA1 Message Date
周周周
e9174f25e8 commit (#5452)
2025-12-09 19:36:58 +08:00
chen
b491dcd23c [Optimization] compute real max_logprobs in batch (#5430) (#5448) 2025-12-09 16:48:06 +08:00
gaoziyuan
2c55bbc3f8 support dynamic load for normal (#5437) 2025-12-09 15:07:19 +08:00
周周周
4b9e2c5c8e [BugFix] 0 not into cuda graph to save memory (#5426) (#5432)
2025-12-09 11:08:55 +08:00
Yonghua Li
31436a35e4 [Cherry-Pick] [BugFix] [RL] remove shutdown_process_group/restart_process_group for RL (#5433) (#5434)
* [fix] remove shutdown_process_group/restart_process_group for RL

* [chore] remove log

* [chore] remove log

* [chore] set log to debug level
2025-12-08 19:13:06 +08:00
周周周
d4c16aa63e [BugFix][Cherry-Pick] fix can not enter into cuda graph (#5423)
* fix bug

* fix bug
2025-12-08 13:12:27 +08:00
Jiang-Jia-Jun
1dceb1c48c Update setup.py
2025-12-08 11:21:26 +08:00
Nyakku Shigure
7926add37c [Cherry-Pick][Loader][BugFix] Fix some parameters place on CPU in PaddleOCR-VL (#5413) (#5414)
* [BugFix] Fix some parameter place on CPU in PaddleOCR-VL

* clean log

* fix codestyle
2025-12-08 10:01:20 +08:00
RAM
707d1a1fc9 [New][RL] Support Rollout Routing Replay (#5405) (#5408)
* [RL] Support Rollout Routing Replay

* add routing indices cache

* fix config bug and moe forward bug

* R3 Support GLM

* support eb4.5

* fix merge bug

* Apply suggestion from @Copilot

* Apply suggestion from @Copilot

* Apply suggestion from @Copilot

* Apply suggestion from @Copilot

* add routing replay ci

* support glm topk

* support other top_k

* fix ci bug

* pre-commit

* only support chatcmpl

* Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)"

This reverts commit c45e064f3d.

* Fix XPU and NPU bug

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-08 10:00:35 +08:00
bukejiyu
7eea23f238 cp pr5373 pr5379 pr5410 (#5411) 2025-12-06 00:47:01 +08:00
Jiang-Jia-Jun
c45e064f3d Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)
This reverts commit 96d2d4877b.
2025-12-05 20:19:39 +08:00
周周周
94c57e4175 [BugFix]remove _execute_empty_input (#5396) 2025-12-05 20:19:01 +08:00
lizexu123
d4979347ca [Bug fix] Fix the multi-input accuracy issue in the pooling model. (#5374)
* fix multi-inputs

* fix threshold

* fix threshold

* fix
2025-12-05 20:18:17 +08:00
RAM
96d2d4877b [RL] Support Rollout Routing Replay (#5321)
* [RL] Support Rollout Routing Replay

* add routing indices cache

* fix config bug and moe forward bug

* R3 Support GLM

* support eb4.5

* fix merge bug

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* add routing replay ci

* support glm topk

* support other top_k

* fix ci bug

* pre-commit

* only support chatcmpl

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-05 20:01:33 +08:00
GoldPancake
8545b705ed fix top_p_candidates (#5400)
Co-authored-by: freeliuzc <lzc842650834@gmail.com>
2025-12-05 20:01:05 +08:00
wyw
bae3475926 [BugFix]Fix plugin loading logic and logging messages (#4909)
* Fix plugin loading logic and logging messages

* Fix indentation in plugin loading logic

---------

Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-05 19:25:01 +08:00
kevin
db936ab3e4 fix mtp prefix_cache dy-c8 bug (#5390) 2025-12-05 19:03:19 +08:00
kevin
c9d7f9e7c3 [BugFix] fix async download bug (#5349)
* fix async download bug

* update log

* Revert "update log"

This reverts commit 5816e602f4.

* update code

* fix mtp bug
2025-12-05 18:59:12 +08:00
zccjjj
5b900667e3 [XPU] support ep4tp1+v1 loader (#5398) 2025-12-05 18:51:15 +08:00
Yonghua Li
35846909c7 [fix] fix scheduler hang when input length is very close to max_model_len (#5393)
2025-12-05 18:23:42 +08:00
Ayakouji
a8f8791668 [Optimization] Qwen2.5-VL support multi-batch prefill (#5269)
* update

* fix

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix dict access

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-12-05 18:22:39 +08:00
Lucas
8f2b85362d [XPU] support moe_expert_ffn TGEMM selection (#5375) 2025-12-05 17:49:40 +08:00
Lucas
3aed8d257d [XPU] redirect xvllm/xtdk/xhpc downloading log (#5388) 2025-12-05 17:34:17 +08:00
周周周
c83dc58105 [Feature] support Two batch overlap, mainly used in Prefill (#5078) 2025-12-05 14:58:50 +08:00
qwes5s5
1aefbef0b3 fix trace log (#5386) 2025-12-05 14:45:52 +08:00
lizhenyun01
d436640735 [BugFix] Fix flash_attn_backend 2025-12-05 14:33:38 +08:00
cmcamdy
86b6430582 fix split_rope_cache_kv_encoder in mix mtp (#5384) 2025-12-05 14:33:17 +08:00
Jiaxin Sui
b5a7abe624 [XPU] [CI] Change Paddle Version to Nightly (#5346)
* Enhance run_ci_xpu.sh with caching and prefill options

* Update model path and configuration in run_ci_xpu.sh

* Add '北朝' keyword to assertion in run_45vl.py

* Enhance process termination logic in run_ci_xpu.sh

* Set timeout for CI_XPU job to 60 minutes

* Remove extra newline in stop_processes function

* Update paddlepaddle-xpu installation command

Comment out the previous paddlepaddle-xpu installation command and replace it with a specific version installation due to EP parallel error.

* Update PaddlePaddle installation command
2025-12-05 13:01:29 +08:00
fmiao2372
ebe613ccc8 [Intel HPU] fix bug about RP 5138 (#5380) 2025-12-05 11:33:29 +08:00
Lucas
7b0b6e470a [XPU] support XDNN downloading function (#5365) 2025-12-05 11:16:45 +08:00
ming1753
dd2e9a14c7 [BugFix] Compatible with asynchronous functions (#5378)
* [BugFix] fix data_processor async bug

* fix bug
2025-12-05 11:05:21 +08:00
zccjjj
e927c65742 [XPU] [Optimization] [EP] EP communication optimization. (#5145)
2025-12-05 10:03:45 +08:00
bukejiyu
620d1da1c9 deepseek torch (#5373)
2025-12-04 23:26:53 +08:00
YuBaoku
1b5fd79d6b [CI] disable test_schedule_output.py in unit_test (#5377) 2025-12-04 23:18:23 +08:00
Juncai
7f4fff4d1e fix get_request from scheduler (#5369) 2025-12-04 21:59:10 +08:00
chenjian
3878a99b69 [Feature] Support cache kv cache for output tokens (#4535)
* [Feature] Support cache kv cache for output tokens

* fix bug

* fix ci bug

* improve coverage

* enable output caching by default

* fix ci

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-12-04 20:53:08 +08:00
Yonghua Li
b6f8069b36 [fix] update check_model_weights_status loop (#5249) 2025-12-04 19:43:01 +08:00
Yuanle Liu
41c63f6056 remove fastsafetensors (#5371) 2025-12-04 19:22:04 +08:00
xiegegege
b7e1e6c953 [CE]change yaml name 2025-12-04 19:14:11 +08:00
Nyakku Shigure
f88c159de1 [BugFix] Exit if neither modern nor legacy wheel dir is found (#5367)
---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-04 16:45:48 +08:00
SunLei
3697110599 [Docs] update FAQ with logprobs MQ limits and deprecation (#5368)
* [doc] update FAQ with logprobs MQ limits and deprecation

* [doc] update FAQ with logprobs MQ limits and deprecation

* update faq
2025-12-04 15:57:04 +08:00
Yonghua Li
f4119d51b4 [PD Disaggregation] support DP via v1 router and decouple DP and EP (#5197)
* [fix] support DP via v1 router and decouple DP and EP

* [fix] fix scripts

* [fix] reset model path

* [fix] dp use get_output_ep, fix router port type, update scripts

* [merge] merge with latest code

* [chore] remove some debug log

* [fix] fix code style check

* [fix] fix test_multi_api_server for log_dir name

* [chore] reduce logs

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-04 15:38:43 +08:00
Longzhi Wang
5cd17fd662 [Models] Add forward_meta to moe models' forward function (#5138)
* [Models] Add forward_meta to moe models' forward function

* fix missing param

* fix

* fix

* fix forward_meta

* fix test and remove chunked MoE related in config

* fix test

* fix

* fix
2025-12-04 13:26:58 +08:00
Juncai
f5bdb36e9b Reduce timeout in unittest (#5366) 2025-12-04 13:19:02 +08:00
fmiao2372
209006e6a6 [Intel HPU] fix memory fragmentation issue due to warmup process and fix moe all_reduce issue (#5357) 2025-12-04 11:29:41 +08:00
lizexu123
946025480e [Bug fix] fix pooling models (#5358)
* fix

* fix

* fix test

* fix gpu_model_runner

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-04 11:06:30 +08:00
qwes5s5
a52aea073c fix logprobs (#5335) 2025-12-04 10:38:51 +08:00
Echo-Nie
96ff402d44 [Optimization] Remove version constraints for setuptools, uvicorn, triton and safetensors, del fastsafetensors (#5330)
* Remove version constraints for setuptools, triton, and fastsafetensors.

* remove version for uvicorn

* fix according to review
2025-12-04 10:07:31 +08:00
Yuanle Liu
be0c960260 [BugFix] dynamic cache kv block_wise_fp8 not need create layer.cache_k_scale (#5362)
2025-12-03 05:32:59 -08:00
周周周
a36d60aa18 [FIX BUG] fix bug in TP in permute_x_fp8_kernel (#5350)
* commit

* commit

* commit

* commit

* commit

* commit
2025-12-03 05:17:37 -08:00