gaoziyuan
5a74ee77f1
save model
Added functionality to save the model state dictionary to a specified path.
2025-12-23 16:30:13 +08:00
freeliuzc
ceafd757f0
[Speculative Decoding]Support multi-step mtp with cudagraph ( #5624 ) ( #5670 )
* support multi-step mtp with cudagraph
* fix usage
* fix unit test
2025-12-23 13:18:47 +08:00
ddchenhao66
eb309e5a2a
[XPU]Set top_p=0.0 by default on XPU to optimize performance ( #5688 )
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-12-23 11:00:53 +08:00
Yuanle Liu
90065084cb
[BugFix] fix rl signal ( #5678 )
2025-12-22 00:31:24 -08:00
Yonghua Li
ea16c82b43
[Cherry-Pick] [RL] provide options for whether shutdown comm group after weights cleared ( #5663 ) ( #5664 )
* [rl] provide options for whether shutdown comm group after weights cleared
* [fix] fix args hardcode
* [fix] change args type
* [fix] add worker process args
2025-12-19 23:18:03 +08:00
bukejiyu
dd0014b7b9
del core ( #5659 )
2025-12-19 16:33:44 +08:00
kevin
e10c5d5d61
cp fix eb5 prefix cache bug ( #5644 )
2025-12-19 14:57:17 +08:00
qw86972190
a9bb24bb56
[XPU]logprob bug ( #5636 )
2025-12-19 14:30:14 +08:00
Yuanle Liu
b3f78815d8
update rl signal ( #5650 )
2025-12-18 20:04:18 -08:00
kevin
23bfd28624
[Cherry-Pick][BugFix] cp fix_cpu_cache_bugs( #5544 ) ( #5577 )
* cp fix_cpu_cache_bugs
* update ce case
* update test case
* update code
2025-12-19 11:48:50 +08:00
bukejiyu
2aa88d3621
[Cherry-Pick][RL]Fix RL load_weights #5642 ( #5643 )
2025-12-18 19:17:09 -08:00
Yuanle Liu
9c55bc31cd
[Cherry-Pick][BugFix] fix rl model_weights_signal to support tp>1 #5639 ( #5637 )
2025-12-18 04:44:19 -08:00
bukejiyu
646d1a0aa2
[Cherry-Pick][RL]Support loading weights via the load_weights function for RL #5549 ( #5602 )
* RL support load_weights
* fix
2025-12-18 02:28:53 -08:00
Yuanle Liu
0cb9ad186e
[Cherry-Pick][BugFix] fix speculate_limit_thinking_content_length #5590 ( #5615 )
2025-12-18 01:50:18 -08:00
Longzhi Wang
a30a5b4216
[Model] tp+ep support v1_loader ( #5600 )
* [Model] tp+ep support v1_loader
* fix
* fix mtp_linear
* fix mtp_linear
* fix
* fix
* fix v0 loader
* fix
* Add get_tensor for EP
* fix linear weight_loader
* fix typo
* fix
2025-12-18 15:27:12 +08:00
GoldPancake
e56c4dd0a8
[Cherry-Pick] Support for request-level speculative decoding metrics monitoring.( #5518 ) ( #5614 )
* support spec metrics monitor per request
2025-12-17 20:53:04 +08:00
qwes5s5
d67b64d5e1
add detoken switch ( #5463 ) ( #5572 )
2025-12-17 17:04:45 +08:00
freeliuzc
a7359d1c1d
[Cherry-Pick][CI]Support different inferseed in speculate decoding( #5568 ) ( #5597 )
* fix mtp entropy drop in RL
* optimize usage and fix unit test
* optimize padding_sampling_params speed(vectorized)
2025-12-17 16:53:47 +08:00
RAM
c19af496cb
[Cherry-Pick][RL] R3 Support RDMA Store( #5467 ) ( #5468 )
* [RL] R3 support rdma store
* refine code
* refine notes
* disable prefix cache
* fix ci bug
* support preempted task and put cpu tensor
2025-12-17 09:50:40 +08:00
gaoziyuan
9f74233966
[NewFeature] support load fp8 weight ( #5566 )
2025-12-16 11:24:17 +08:00
Yuanle Liu
99b40247ea
[Cherry-Pick][BugFix] fix dynamic c8 in v1 loader( #5562 ) ( #5519 )
* fix dynamic load bug
* update
* update
2025-12-15 04:08:07 -08:00
chenjian
0fa40f5f0c
Fix bug for caching output when preempted ( #5510 )
2025-12-15 17:25:55 +08:00
Yonghua Li
12e0206d4d
[Cherry-Pick] [BugFix] [RL] skip model executing after clearing/updating is done ( #5527 ) ( #5523 )
* [fix] fix ep loop
* [fix] another try
* [fix] again
2025-12-12 14:56:09 +08:00
chen
4e5e36ec9c
[Cherry-Pick][BugFix] fix hang when n>1 and --enable-logprob ( #5492 )( #5499 ) ( #5498 )
* [BugFix] fix hang when n>1 and --enable-logprob (#5492)
* check
* check
* check
2025-12-11 20:03:22 +08:00
bukejiyu
71781b56e1
RL fix ( #5505 )
2025-12-11 19:25:24 +08:00
Yonghua Li
7019afbb86
[BugFix] fix instability after clearing weight ( #5487 )
* [BugFix] fix instability after clearing weight
* [chore] add todo
2025-12-11 09:58:18 +08:00
freeliuzc
c5c43e3b3d
fix attention bug in spec decoding ( #5481 )
2025-12-10 12:55:13 +08:00
周周周
e9174f25e8
commit ( #5452 )
2025-12-09 19:36:58 +08:00
chen
b491dcd23c
[Optimization] compute real max_logprobs in batch ( #5430 ) ( #5448 )
2025-12-09 16:48:06 +08:00
gaoziyuan
2c55bbc3f8
support dynamic load for normal ( #5437 )
2025-12-09 15:07:19 +08:00
周周周
4b9e2c5c8e
[BugFix] 0 not into cuda graph to save memory ( #5426 ) ( #5432 )
2025-12-09 11:08:55 +08:00
Yonghua Li
31436a35e4
[Cherry-Pick] [BugFix] [RL] remove shutdown_process_group/restart_process_group for RL ( #5433 ) ( #5434 )
* [fix] remove shutdown_process_group/restart_process_group for RL
* [chore] remove log
* [chore] remove log
* [chore] set log to debug level
2025-12-08 19:13:06 +08:00
周周周
d4c16aa63e
[BugFix][Cherry-Pick] fix can not enter into cuda graph ( #5423 )
* fix bug
* fix bug
2025-12-08 13:12:27 +08:00
Nyakku Shigure
7926add37c
[Cherry-Pick][Loader][BugFix] Fix some parameters place on CPU in PaddleOCR-VL ( #5413 ) ( #5414 )
* [BugFix] Fix some parameter place on CPU in PaddleOCR-VL
* clean log
* fix codestyle
2025-12-08 10:01:20 +08:00
RAM
707d1a1fc9
[New][RL] Support Rollout Routing Replay ( #5405 ) ( #5408 )
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
* Apply suggestion from @Copilot
* Apply suggestion from @Copilot
* Apply suggestion from @Copilot
* add routing replay ci
* support glm topk
* support other top_k
* fix ci bug
* pre-commit
* only support chatcmpl
* Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)"
This reverts commit c45e064f3d.
* Fix XPU and NPU bug
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-08 10:00:35 +08:00
bukejiyu
7eea23f238
cp pr5373 pr5379 pr5410 ( #5411 )
2025-12-06 00:47:01 +08:00
Jiang-Jia-Jun
c45e064f3d
Revert "[RL] Support Rollout Routing Replay ( #5321 )" ( #5402 )
This reverts commit 96d2d4877b.
2025-12-05 20:19:39 +08:00
周周周
94c57e4175
[BugFix]remove _execute_empty_input ( #5396 )
2025-12-05 20:19:01 +08:00
lizexu123
d4979347ca
[Bug fix] Fix the multi-input accuracy issue in the pooling model. ( #5374 )
* fix multi-inputs
* fix threshold
* fix threshold
* fix
2025-12-05 20:18:17 +08:00
RAM
96d2d4877b
[RL] Support Rollout Routing Replay ( #5321 )
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* add routing replay ci
* support glm topk
* support other top_k
* fix ci bug
* pre-commit
* only support chatcmpl
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-05 20:01:33 +08:00
wyw
bae3475926
[BugFix]Fix plugin loading logic and logging messages ( #4909 )
* Fix plugin loading logic and logging messages
* Fix indentation in plugin loading logic
---------
Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-05 19:25:01 +08:00
kevin
db936ab3e4
fix mtp prefix_cache dy-c8 bug ( #5390 )
2025-12-05 19:03:19 +08:00
kevin
c9d7f9e7c3
[BugFix] fix async download bug ( #5349 )
* fix async download bug
* update log
* Revert "update log"
This reverts commit 5816e602f4.
* update code
* fix mtp bug
2025-12-05 18:59:12 +08:00
zccjjj
5b900667e3
[XPU] support ep4tp1+v1 loader ( #5398 )
2025-12-05 18:51:15 +08:00
Yonghua Li
35846909c7
[fix] fix scheduler hang when input length is very close to max_model_len ( #5393 )
2025-12-05 18:23:42 +08:00
Ayakouji
a8f8791668
[Optimization] Qwen2.5-VL support multi-batch prefill ( #5269 )
* update
* fix
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix dict access
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-12-05 18:22:39 +08:00
周周周
c83dc58105
[Feature] support Two batch overlap, mainly used in Prefill ( #5078 )
2025-12-05 14:58:50 +08:00
qwes5s5
1aefbef0b3
fix trace log ( #5386 )
2025-12-05 14:45:52 +08:00
lizhenyun01
d436640735
[BugFix] Fix flash_attn_backend
2025-12-05 14:33:38 +08:00
fmiao2372
ebe613ccc8
[Intel HPU] fix bug about RP 5138 ( #5380 )
2025-12-05 11:33:29 +08:00