RAM
fbed0ef851
[Cherry-Pick][RL] Support Rollout Routing Replay (#5166)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* support r3
* update
* support tp>1&&ep>1
* support cudagraph padding
* support all backends
* replace env with options
* modularize
* update
* Add RoutingStore and refine code
* add routing replay config
* add routing replay config
* successfully run routing store
* convert request id as rollout id
* fix rollout config bug
* unify code
* use rollout_id to replace request_id in routing store
* delete code
---------
Co-authored-by: yuanlehome <yuanlehome@163.com>
2025-12-04 00:35:30 -08:00
kevin
74ba637b6b
remove close prefix cache (#5363)
2025-12-03 20:59:32 +08:00
Dangweichong
5c2247c3f0
[Feature] Support async download chunk video features (#5297)
2025-12-03 19:39:45 +08:00
Yuanle Liu
17c88f429f
fix skip_quant (#5342)
* fix skip_quant
* fix
2025-12-03 13:20:51 +08:00
kevin
b52e1bd281
[Cherry-Pick][Feature] dy-c8 prefix caching (#4918)
* c8 prefix caching
* update code
* update code
* update cache trans
* update code
* update code
2025-11-28 10:37:49 +08:00
SunLei
f637ba708c
[Cherry-Pick] MTP split draft_tokens into standalone post-processing path (#5205) (#5232)
* merge code
* fix Request CONFLICT
* remove unused unittest
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-11-27 15:30:00 +08:00
GoldPancake
bbcd92c8a0
[BugFix] fix mtp logprob bugs in chunk prefill (#5234)
* fix mtp logprob bugs in chunk prefill
* merge code
* fix Request CONFLICT
* Revert "fix Request CONFLICT"
This reverts commit 7a438e4119.
* Revert "merge code"
This reverts commit 3839559b83.
* fix
* remove print
* fix
---------
Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
2025-11-27 11:32:01 +08:00
chen
cc588b70ab
[CP][BugFix] Dev fix custom ar unstable result (#5186)
* [CP][BugFix] Dev fix custom ar unstable result (#4437)
* code check
* revert delete
* check
* pre_commit
2025-11-24 15:28:01 +08:00
GoldPancake
fde827f95d
fix mtp reschedule (#5164)
2025-11-21 19:08:33 +08:00
Jiang-Jia-Jun
a7740e56c4
Simplify __repr__ method in Request class (#5154)
Remove detailed string representation from Request class.
2025-11-20 21:31:02 +08:00
GoldPancake
67da16bd7c
fix mtp reschedule (#5144)
2025-11-20 21:28:21 +08:00
kevin
966297e5d6
[Feature] mm prefix cache (#4554)
* mm prefix cache
* add _revert_match_blocks
* update code
* update code
* update code
* fix bugs
* add test case
* fix bug
* update code
* update reserved_dec_block_ids
2025-11-19 19:32:14 +08:00
LiqinruiG
b2b7881cca
[fix] add more logger info: max_tokens (#5122)
Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-19 18:59:43 +08:00
kevin
3ce2c8f754
[Feature] support async download features (#4910)
* add async download
* update code
* fix bug
* update code
* update code
* fix bugs
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-11-18 18:37:59 +08:00
SunLei
c55c0d2ca3
fix: Fix block allocation issue when MTP and logprobs are enabled (#5086)
2025-11-17 17:50:23 +08:00
LiqinruiG
24a7b79eec
[BugFix] rollback max_tokens and min_tokens when continue to infer (#5051)
Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-15 16:40:35 +08:00
GoldPancake
cbcb5c6e84
temporary change mtp logprob msg size (#5026)
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
2025-11-15 13:39:40 +08:00
kxz2002
936a80962f
[BugFix] adjust max_tokens and min_tokens when continue to generate tokens (#5010) (#5015)
* fix max and min tokens initial commit
* fix double subtraction
* add unit tests
Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
2025-11-14 22:24:59 +08:00
Yonghua Li
3da9f01e19
[BugFix] fix num_requests_running after clear_data (#4989)
* [BugFix] fix num_requests_running after clear_data
* [fix] fix tasks_list and stop flags not cleared when _free_blocks failed
2025-11-13 13:50:38 +08:00
lizhenyun01
8749ca2fb6
support ENCODE_FEATURE_ENDPOINT (#4905)
2025-11-12 20:01:36 +08:00
xiaoxiaohehe001
4125b97603
[Fix] Fix eplb for ep mixed (#4894)
* fix eplb
* fix eplb
2025-11-10 14:46:26 +08:00
kevin
3dbe5596e6
[Feature] Support eplb for ep (#4786)
* support eplb for ep
* update code
* update code
* update code
* update code
* update code
* update code
* update code
* update code
* update code
2025-11-07 15:42:29 +08:00
freeliuzc
bbae094cb9
[Optimization] Reduce memory allocate for cudaGraph (#4838)
* optimize memory allocate
* rename env
2025-11-06 13:32:47 +08:00
kxz2002
e0d98d00bc
[Cherry-Pick] Modify follow-up push parameters and modify the verification method for thinking length (#4086) (#4826)
* resolve #4086 conflict
* fix unit test
2025-11-05 21:36:28 +08:00
xiaoxiaohehe001
ee37882a26
[NewFeature] support eplb noaux (#4725)
* support eplb noaux
* support eplb noaux
* add eplb noaux test
2025-11-05 20:59:12 +08:00
Yuan Xiaolan
1e88754133
support set dy-C8 from args (#4475)
2025-11-04 17:01:35 +08:00
lizhenyun01
d65e00f8fb
[Feature] support get_task with tensor (#4751)
* [Feature] support get_task with tensor
* set FD_ENABLE_E2W_TENSOR_CONVERT default 0
2025-11-04 11:00:13 +08:00
kevin
9835697163
[Feature] Check bos url (#4677)
* add bos url check
* update code
* update code
2025-10-31 15:26:30 +08:00
RAM
7847b44172
[Graph Optimization] Refactor default capture list (#4631)
* refactor default capture list & refine code
* fix bug
* fix ci bug
* Fix test case
2025-10-31 14:18:27 +08:00
yangjianfengo1
eef85e4ff0
[BugFix] update eb5 video chunk (#4705)
* fix chunking of videos and images
* fix chunking of videos and images
* fix chunking of videos and images
2025-10-31 14:08:23 +08:00
kevin
c785c2dab1
[Scheduler] update v1 prefill batch (#4563)
* update v1 scheduler
* update code
* update code
* update code
* update code
* remove mm prefill batch
* update code
* fix bug
* update code
2025-10-31 14:03:18 +08:00
Sunny-bot1
3f15e6fa15
load cache scale (#4623)
2025-10-31 11:57:57 +08:00
Jiang-Jia-Jun
71135d58a0
Change log level from info to debug for response
2025-10-30 14:02:50 +08:00
GoldPancake
05c1167c74
fix mtp logprob bugs (#4663)
2025-10-30 13:47:23 +08:00
Dangweichong
40cfed5bc9
[Feature] support eb5 video chunk (#4671)
2025-10-30 11:01:32 +08:00
gongshaotian
1081cad4a0
set default value as false
2025-10-29 15:51:50 +08:00
gongshaotian
fa85956c6f
add draft model using cudagraph switch
2025-10-29 15:51:50 +08:00
lizhenyun01
006c7e5a0d
[Feature] Support attention dp balance for mixed deployment (#4594)
* [Feature] Support attention dp balance for mixed deployment
* add abstractmethod
* add config to rl
* fix argument style
2025-10-29 15:23:51 +08:00
freeliuzc
c9be8762b6
[MTP] Merge support attn (#4591)
* support mask_offset in speculate decoding
* fix dummy run output
* add unit test
* fix unit test import
2025-10-27 21:13:08 +08:00
GoldPancake
2cf0b0b715
[Feature] support mtp distribution equivalence verification (#4566)
* support mtp distribution equivalence verification
* fix bugs
* add unit test
2025-10-27 11:01:28 +08:00
RAM
1a21e6c529
support mtp draft model with ep (#4581)
2025-10-27 09:34:54 +08:00
kevin
dd7fe27152
add hasher and ImagePosition
2025-10-23 15:20:21 +08:00
guozhuangzhuang
1531004085
fix image token output (#4487)
* fix
* fix
* fix
* add test case
* add test case
* add test case
2025-10-22 14:59:05 +08:00
李泳桦
ec499a0104
[Cherry-pick] fix requests & block metrics (#4500)
* [fix] fix requests & block metrics
* [chore] rename variables
2025-10-21 10:43:33 +08:00
ltd0924
3cd9d3060a
[Feature] Support mm model close prefix cache (#4502)
* support mm prefix cache close
* add
* fix
* fix
* fix
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com>
2025-10-21 09:56:47 +08:00
GoldPancake
9c7187998c
[Feature] support mtp logprob (#4457)
* support logprob in mtp
* remove debug code
* fix
* feat: add draft_logprobs for Speculative Decode MTP
* Revert "feat: add draft_logprobs for Speculative Decode MTP"
This reverts commit d5a3c5c933.
* fix
* feat: add draft_logprobs for Speculative Decode MTP
* feat: add draft_logprobs for Speculative Decode MTP
* fix some bugs
* fix codestyle
* fix bugs
* fix bugs
* fix bugs
* fix bugs
* fix bugs
* fix unittest
---------
Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
Co-authored-by: sunlei18 <sunlei18@sunlei18deMacBook-Pro.local>
2025-10-20 10:18:00 +08:00
RAM
920df5be5a
[Graph Optimization][Speculative Decoding] Fix the bug of CUDAGraph + MTP + EP (#4430)
* Fix MTP dummy run bug
* Target Model and Draft Model using the same flag
* avoid moe bug in cudagraph padding
* In mtp replace use_cudagraph as step_use_cudagraph
2025-10-17 14:22:05 +08:00
guozhuangzhuang
cfd93c0966
fix: image token output (#4399)
* fix: image token output
* fix: code style
* fix: CompletionOutput.decode_type
2025-10-16 14:51:32 +08:00
Yuanle Liu
83f97d1196
support speculate_limit_thinking_content_length_v2 (#4428)
* support speculate_limit_thinking_content_length_v2
* fix
* fix import
2025-10-16 13:23:16 +08:00
gaoziyuan
74ae214f1a
fix ep perf (#4381)
2025-10-15 18:38:20 +08:00