Commit Graph

1053 Commits

Author SHA1 Message Date
RAM
fbed0ef851 [Cherry-Pick][RL] Support Rollout Routing Replay (#5166)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* support r3

* update

* support tp>1&&ep>1

* support cudagraph padding

* support all backends

* replace env with options

* modularize

* update

* Add RoutingStore and refine code

* add routing replay cofig

* add routing repaly config

* success run routing store

* convert request id as rollout id

* fix rollout config bug

* unify code

* use rollout_id to replace request_id in routing store

* delete code

---------

Co-authored-by: yuanlehome <yuanlehome@163.com>
2025-12-04 00:35:30 -08:00
kevin
74ba637b6b remove close prefix cache (#5363)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-12-03 20:59:32 +08:00
Dangweichong
5c2247c3f0 [Feature] Support async download chunk video features (#5297) 2025-12-03 19:39:45 +08:00
Yuanle Liu
17c88f429f fix skip_quant (#5342)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* fix skip_quant

* fix
2025-12-03 13:20:51 +08:00
kevin
b52e1bd281 [Cherry-Pick][Feature] dy-c8 prefix caching (#4918)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* c8 prefix caching

* update code

* update code

* update cache trans

* update code

* update code
2025-11-28 10:37:49 +08:00
SunLei
f637ba708c [Cherry-Pick] MTP split draft_tokens into standalone post-processing path(#5205) (#5232)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* merge code

* fix Request CONFLICT

* remove unuse unittest

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-11-27 15:30:00 +08:00
GoldPancake
bbcd92c8a0 [BugFix] fix mtp logprob bugs in chunk prefill (#5234)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* fix mtp logprob bugs in chunk prefill

* merge code

* fix Request CONFLICT

* Revert "fix Request CONFLICT"

This reverts commit 7a438e4119.

* Revert "merge code"

This reverts commit 3839559b83.

* fix

* remove print

* fix

---------

Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
2025-11-27 11:32:01 +08:00
chen
cc588b70ab [CP][BugFix]Dev fix custom ar unstable result (#5186)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [CP][BugFix]Dev fix custom ar unstable result (#4437)

* code check

* revert delete

* check

* pre_commit
2025-11-24 15:28:01 +08:00
GoldPancake
fde827f95d fix mtp reschedule (#5164)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-11-21 19:08:33 +08:00
Jiang-Jia-Jun
a7740e56c4 Simplify __repr__ method in Request class (#5154)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Remove detailed string representation from Request class.
2025-11-20 21:31:02 +08:00
GoldPancake
67da16bd7c fix mtp reschedule (#5144) 2025-11-20 21:28:21 +08:00
kevin
966297e5d6 [Feature] mm prefix cache (#4554)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* mm prefix cache

* add _revert_match_blocks

* update code

* update code

* update code

* fix bugs

* add test case

* fix bug

* update code

* update reserved_dec_block_ids
2025-11-19 19:32:14 +08:00
LiqinruiG
b2b7881cca [fix] add more logger info: max_tokens (#5122)
Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-19 18:59:43 +08:00
kevin
3ce2c8f754 [Feature] support async download features (#4910)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* add async download

* update code

* fix bug

* update code

* update code

* fix bugs

* update code

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-11-18 18:37:59 +08:00
SunLei
c55c0d2ca3 fix: Fix block allocation issue when MTP and logprobs are enabled (#5086)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-11-17 17:50:23 +08:00
LiqinruiG
24a7b79eec [BugFix] rollback max_tokens and min_tokens when continue to infer (#5051)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-15 16:40:35 +08:00
GoldPancake
cbcb5c6e84 temporary change mtp logprob msg size (#5026)
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
2025-11-15 13:39:40 +08:00
kxz2002
936a80962f [BugFix] adjust max_tokens and min_tokens when continue to generate tokens (#5010) (#5015)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* fix max and min tokens initial commit

* fix double subtraction

* add unit tests

Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
2025-11-14 22:24:59 +08:00
Yonghua Li
3da9f01e19 [BugFix] fix num_requests_running after clear_data (#4989)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [BugFix] fix num_requests_running after clear_data

* [fix] fix tasks_list and stop flags not cleared when _free_blocks failed
2025-11-13 13:50:38 +08:00
lizhenyun01
8749ca2fb6 support ENCODE_FEATURE_ENDPOINT (#4905)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-11-12 20:01:36 +08:00
xiaoxiaohehe001
4125b97603 [Fix] Fix eplb for ep mixed (#4894)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* fix eplb

* fix eplb
2025-11-10 14:46:26 +08:00
kevin
3dbe5596e6 [Feature] Support eplb for ep (#4786)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* support eplb for ep

* update code

* update code

* update code

* update code

* update code

* update code

* update code

* update code

* update code
2025-11-07 15:42:29 +08:00
freeliuzc
bbae094cb9 [Optimization] Reduce memory allocate for cudaGraph (#4838)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* optimize memory allocate

* rename env
2025-11-06 13:32:47 +08:00
kxz2002
e0d98d00bc [Cherry-Pick] Modify follow-up push parameters and Modify the verification method for thinking length (#4086) (#4826)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* resolve #4086 conflict

* fix unit test
2025-11-05 21:36:28 +08:00
xiaoxiaohehe001
ee37882a26 [NewFeature] support eplb noaux (#4725)
* support eplb noaux

* support eplb noaux

* add  eplb noaux test
2025-11-05 20:59:12 +08:00
Yuan Xiaolan
1e88754133 support set dy-C8 from args (#4475)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-11-04 17:01:35 +08:00
lizhenyun01
d65e00f8fb [Feature] support get_task with tensor (#4751)
* [Feature] support get_task with tensor

* set FD_ENABLE_E2W_TENSOR_CONVERT default 0
2025-11-04 11:00:13 +08:00
kevin
9835697163 [Feature] Check bos url (#4677)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* add bos url check

* update code

* update code
2025-10-31 15:26:30 +08:00
RAM
7847b44172 [Graph Optimization] Refactor default capture list (#4631)
* refactore default capture list & refine code

* fix bug

* fix ci bug

* Fix test case
2025-10-31 14:18:27 +08:00
yangjianfengo1
eef85e4ff0 [BugFix] update eb5 video chunk (#4705)
* 修复视频和图片分chunk

* 修复视频和图片分chunk

* 修复视频和图片分chunk
2025-10-31 14:08:23 +08:00
kevin
c785c2dab1 [Scheduler] update v1 prefill batch (#4563)
* update v1 scheduler

* update code

* update code

* update code

* update code

* remove mm prefill batch

* update code

* fix bug

* update code
2025-10-31 14:03:18 +08:00
Sunny-bot1
3f15e6fa15 load cache scale (#4623) 2025-10-31 11:57:57 +08:00
Jiang-Jia-Jun
71135d58a0 Change log level from info to debug for response
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-10-30 14:02:50 +08:00
GoldPancake
05c1167c74 fix mtp logprob bugs (#4663) 2025-10-30 13:47:23 +08:00
Dangweichong
40cfed5bc9 [Feature] support eb5 video chunk (#4671)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-10-30 11:01:32 +08:00
gongshaotian
1081cad4a0 set default value as false
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-10-29 15:51:50 +08:00
gongshaotian
fa85956c6f add draft model using cudagraph switch 2025-10-29 15:51:50 +08:00
lizhenyun01
006c7e5a0d [Feature] Support attention dp balance for mixed deployment (#4594)
* [Feature] Support attention dp balance for mixed deployment

* add abstractmethod

* add config to rl

* fix argument style
2025-10-29 15:23:51 +08:00
freeliuzc
c9be8762b6 [MTP]Merge support attn (#4591)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* support mask_offset in speculate decoding

* fix dummpy run output

* add unit test

* fix unit test import
2025-10-27 21:13:08 +08:00
GoldPancake
2cf0b0b715 [Feature] support mtp distribution equivalence verification (#4566)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* support mtp distribution equivalence verification

* fix bugs

* add unit test
2025-10-27 11:01:28 +08:00
RAM
1a21e6c529 support mtp draft model with ep (#4581)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-10-27 09:34:54 +08:00
kevin
dd7fe27152 add hasher and ImagePosition 2025-10-23 15:20:21 +08:00
guozhuangzhuang
1531004085 fix image token output (#4487)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* fix

* fix

* fix

* add test case

* add test case

* add test case
2025-10-22 14:59:05 +08:00
李泳桦
ec499a0104 [Cherry-pick] fix requests & block metrics (#4500)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [fix] fix requests & block metrics

* [chore] rename variables
2025-10-21 10:43:33 +08:00
ltd0924
3cd9d3060a [Fearture] Support mm model close prefix cache (#4502)
* support mm prefix cache close

* add

* fix

* fix

* fix

---------

Co-authored-by: ltd0924 <luotingdan@baidu.com>
2025-10-21 09:56:47 +08:00
GoldPancake
9c7187998c [Feature] support mtp logprob (#4457)
* support logprob in mtp

* remove debug code

* fix

* feat: add draft_logprobs for Speculative Decode MTP

* Revert "feat: add draft_logprobs for Speculative Decode MTP"

This reverts commit d5a3c5c933.

* fix

* feat: add draft_logprobs for Speculative Decode MTP

* feat: add draft_logprobs for Speculative Decode MTP

* fix some bugs

* fix codestyle

* fix bugs

* fix bugs

* fix bugs

* fix bus

* fix bugs

* fix unitest

---------

Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
Co-authored-by: sunlei18 <sunlei18@sunlei18deMacBook-Pro.local>
2025-10-20 10:18:00 +08:00
RAM
920df5be5a [Graph Optimization][Speculative Decoding] Fix the bug of CUDAGraph + MTP + EP (#4430)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* Fix MTP dummy run bug

* Target Model and Draft Model using the same flag

* aovid moe bug in cudagraph padding

* In mtp replace use_cudagraph as step_use_cudagraph
2025-10-17 14:22:05 +08:00
guozhuangzhuang
cfd93c0966 fix: image token output (#4399)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* fix: image token output

* fix: code style

* fix: CompletionOutput.decode_type
2025-10-16 14:51:32 +08:00
Yuanle Liu
83f97d1196 support speculate_limit_thinking_content_length_v2 (#4428)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* support speculate_limit_thinking_content_length_v2

* fix

* fix import
2025-10-16 13:23:16 +08:00
gaoziyuan
74ae214f1a fix ep perf (#4381)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-10-15 18:38:20 +08:00