Commit Graph

4126 Commits

Author SHA1 Message Date
YuBaoku
f50988d917 [Cherry-Pick][CI] Revert adapt vl_model baseline changes due to Paddle update(#5732) (#5733)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [Cherry-Pick][CI] Revert adapt vl_model baseline changes due to Paddle update(#5732)

---------

Co-authored-by: yubaoku <yubaoku@baidu.com>
2025-12-24 12:14:34 +08:00
Yonghua Li
9ff99d2b03 [BugFix] fix double shutdown of comm group when rank0 clears weights slower than other ranks (#5710)
2025-12-23 01:51:35 -08:00
freeliuzc
ceafd757f0 [Speculative Decoding]Support multi-step mtp with cudagraph (#5624) (#5670)
* support multi-step mtp with cudagraph

* fix usage

* fix unit test
2025-12-23 13:18:47 +08:00
ddchenhao66
eb309e5a2a [XPU]Set top_p=0.0 by default on XPU to optimize performance (#5688)
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-12-23 11:00:53 +08:00
Yuanle Liu
90065084cb [BugFix] fix rl signal (#5678)
2025-12-22 00:31:24 -08:00
Yonghua Li
ea16c82b43 [Cherry-Pick] [RL] provide options for whether shutdown comm group after weights cleared (#5663) (#5664)
* [rl] provide options for whether shutdown comm group after weights cleared

* [fix] fix args hardcode

* [fix] change args type

* [fix] add worker process args
2025-12-19 23:18:03 +08:00
chen
abf53b17ea [BugFix] Fix custom_all_reduce overflow (#5662) (#5667)
* check

* check

* code style
2025-12-19 20:04:39 +08:00
bukejiyu
dd0014b7b9 del core (#5659)
2025-12-19 16:33:44 +08:00
kevin
e10c5d5d61 cp fix eb5 prefix cache bug (#5644)
2025-12-19 14:57:17 +08:00
qw86972190
a9bb24bb56 [XPU]logprob bug (#5636)
2025-12-19 14:30:14 +08:00
Yuanle Liu
b3f78815d8 update rl signal (#5650)
2025-12-18 20:04:18 -08:00
kevin
23bfd28624 [Cherry-Pick][BugFix] cp fix_cpu_cache_bugs(#5544) (#5577)
* cp fix_cpu_cache_bugs

* update ce case

* update test case

* update code
2025-12-19 11:48:50 +08:00
bukejiyu
2aa88d3621 [Cherry-Pick][RL]Fix RL load_weights #5642 (#5643)
2025-12-18 19:17:09 -08:00
Yuanle Liu
9c55bc31cd [Cherry-Pick][BugFix] fix rl model_weights_signal to support tp>1 #5639 (#5637)
2025-12-18 04:44:19 -08:00
bukejiyu
646d1a0aa2 [Cherry-Pick][RL]Support loading weights via the load_weights function for RL #5549 (#5602)
* RL support load_weights

* fix
2025-12-18 02:28:53 -08:00
Yuanle Liu
0cb9ad186e [Cherry-Pick][BugFix] fix speculate_limit_thinking_content_length #5590 (#5615)
2025-12-18 01:50:18 -08:00
Longzhi Wang
a30a5b4216 [Model] tp+ep support v1_loader (#5600)
* [Model] tp+ep support v1_loader

* fix

* fix mtp_linear

* fix mtp_linear

* fix

* fix

* fix v0 loader

* fix

* Add get_tensor for EP

* fix linear weight_loader

* fix typo

* fix
2025-12-18 15:27:12 +08:00
lzy
5300e73f8b [Others] Maintain the mtp branch temporarily. (#5446) (#5621)
2025-12-17 06:03:25 -08:00
GoldPancake
e56c4dd0a8 [Cherry-Pick] Support for request-level speculative decoding metrics monitoring.(#5518) (#5614)
* support spec metrics monitor per request
2025-12-17 20:53:04 +08:00
freeliuzc
d7d633a285 [Cherry-Pick][CI]Fix write qknorm cache bug in speculative decoding(#5491) (#5617)
* [liuzichang spend 10 dyas]fix write qknorm cache bug

* fix 'fix cachekv bug''
2025-12-17 20:08:51 +08:00
qwes5s5
d67b64d5e1 add detoken switch (#5463) (#5572)
2025-12-17 17:04:45 +08:00
freeliuzc
a7359d1c1d [Cherry-Pick][CI]Support different inferseed in speculate decoding(#5568) (#5597)
* fix mtp entropy drop in RL

* optimize usage and fix unit test

* optimize padding_sampling_params speed(vectorized)
2025-12-17 16:53:47 +08:00
RAM
c19af496cb [Cherry-Pick][RL] R3 Support RDMA Store(#5467) (#5468)
* [RL] R3 support rdma store

* refine code

* refine notes

* disable prefix cache

* fix ci bug

* support preempted task and put cpu tensor
2025-12-17 09:50:40 +08:00
YuBaoku
53158b7f8d [Cherry-Pick][CI] Adapt unit_test due to incompatibility change(#5578) (#5583)
* [CI] Remove test_metrics.py due to incompatible forced merge (#5578)
* [CI] Adapt vl_model baseline changes due to Paddle update (#5576)
2025-12-16 15:45:49 +08:00
gaoziyuan
9f74233966 【NewFeature】support load fp8 weight (#5566)
-
2025-12-16 11:24:17 +08:00
Yuanle Liu
99b40247ea [Cherry-Pick][BugFix] fix dynamic c8 in v1 loader(#5562) (#5519)
* fix dynamic load bug

* update

* update
2025-12-15 04:08:07 -08:00
chenjian
0fa40f5f0c Fix bug for caching output when preempted (#5510)
2025-12-15 17:25:55 +08:00
chen
5bdef760a2 [Feature][Optimization] Qwen Support Dynamic block_wise_fp8 cache (#5486) (#5536)
2025-12-15 15:53:34 +08:00
Yonghua Li
12e0206d4d [Cherry-Pick] [BugFix] [RL] skip model executing after clearing/updating is done (#5527) (#5523)
* [fix] fix ep loop

* [fix] another try

* [fix] again
2025-12-12 14:56:09 +08:00
chen
4e5e36ec9c [[Cherry-Pick][BugFix] fix hung when n>1 and --enable-logprob (#5492)(#5499) (#5498)
* [BugFix] fix hung when n>1 and --enable-logprob (#5492)

* check

* check

* check
2025-12-11 20:03:22 +08:00
bukejiyu
71781b56e1 RL fix (#5505)
2025-12-11 19:25:24 +08:00
YuBaoku
b43563977d [CI] disable test_cuda_graph_dynamic_subgraph.py in unit_test
2025-12-11 14:14:30 +08:00
Yonghua Li
7019afbb86 [BugFix] fix instability after clearing weight (#5487)
* [BugFix] fix instability after clearing weight

* [chore] add todo
2025-12-11 09:58:18 +08:00
zccjjj
bcde798098 [CI][XPU] ep+prefix cache+chunk prefill (#5490)
2025-12-10 19:40:38 +08:00
freeliuzc
c5c43e3b3d fix attention bug in spec decoding (#5481)
2025-12-10 12:55:13 +08:00
Yuanle Liu
1776d410d0 fix limit_thinking bug (#5469)
2025-12-10 11:56:35 +08:00
周周周
e9174f25e8 commit (#5452)
2025-12-09 19:36:58 +08:00
chen
b491dcd23c [Optimization] compute real max_logprobs in batch (#5430) (#5448)
2025-12-09 16:48:06 +08:00
gaoziyuan
2c55bbc3f8 support dynamic load for normal (#5437)
2025-12-09 15:07:19 +08:00
周周周
4b9e2c5c8e [BugFix] 0 not into cuda graph to save memory (#5426) (#5432)
2025-12-09 11:08:55 +08:00
Yonghua Li
31436a35e4 [Cherry-Pick] [BugFix] [RL] remove shutdown_process_group/restart_process_group for RL (#5433) (#5434)
* [fix] remove shutdown_process_group/restart_process_group for RL

* [chore] remove log

* [chore] remove log

* [chore] set log to debug level
2025-12-08 19:13:06 +08:00
周周周
d4c16aa63e [BugFix][Cherry-Pick] fix can not enter into cuda graph (#5423)
* fix bug

* fix bug
2025-12-08 13:12:27 +08:00
Jiang-Jia-Jun
1dceb1c48c Update setup.py
2025-12-08 11:21:26 +08:00
Nyakku Shigure
7926add37c [Cherry-Pick][Loader][BugFix] Fix some parameters place on CPU in PaddleOCR-VL (#5413) (#5414)
* [BugFix] Fix some parameter place on CPU in PaddleOCR-VL

* clean log

* fix codestyle
2025-12-08 10:01:20 +08:00
RAM
707d1a1fc9 [New][RL] Support Rollout Routing Replay (#5405) (#5408)
* [RL] Support Rollout Routing Replay

* add routing indices cache

* fix config bug and moe forward bug

* R3 Support GLM

* support eb4.5

* fix merge bug

* Apply suggestion from @Copilot

* Apply suggestion from @Copilot

* Apply suggestion from @Copilot

* Apply suggestion from @Copilot

* add routing replay ci

* support glm topk

* support other top_k

* fix ci bug

* pre-commit

* only support chatcmpl

* Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)"

This reverts commit c45e064f3d.

* Fix XPU and NPU bug

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-08 10:00:35 +08:00
bukejiyu
7eea23f238 cp pr5373 pr5379 pr5410 (#5411)
2025-12-06 00:47:01 +08:00
Jiang-Jia-Jun
c45e064f3d Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)
This reverts commit 96d2d4877b.
2025-12-05 20:19:39 +08:00
周周周
94c57e4175 [BugFix]remove _execute_empty_input (#5396)
2025-12-05 20:19:01 +08:00
lizexu123
d4979347ca [Bug fix] Fix the multi-input accuracy issue in the pooling model. (#5374)
* fix multi-inputs

* fix threshold

* fix threshold

* fix
2025-12-05 20:18:17 +08:00
RAM
96d2d4877b [RL] Support Rollout Routing Replay (#5321)
* [RL] Support Rollout Routing Replay

* add routing indices cache

* fix config bug and moe forward bug

* R3 Support GLM

* support eb4.5

* fix merge bug

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* add routing replay ci

* support glm topk

* support other top_k

* fix ci bug

* pre-commit

* only support chatcmpl

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-12-05 20:01:33 +08:00