Commit Graph

7 Commits

Author SHA1 Message Date
Jundong Liu
0b7a5778ab [Executor]CUDAGraph support Speculate Decode (#4258)
* [Executor]CUDAGraph support Speculate Decode

* fix problem

* solve problem

* fix

* fast compile

* CUDAGraph + mtp support eb5(only target model)

* Revert "fast compile"

This reverts commit 3cfe8373ed.

* fix precommit

* solve comment

* fix comment about #pragram unroll

---------

Co-authored-by: gongshaotian <gstain5555@outlook.com>
Co-authored-by: gongshaotian <gstian5555@outlook.com>
2025-10-13 15:21:41 +08:00
freeliuzc
b176cba474 support mtp in ep64 (#4280)
2025-09-26 15:38:03 +08:00
lzy
48d760539b fix deepcopy(tp_group) in spec (#3648) 2025-08-29 16:08:21 +08:00
freeliuzc
52eda7fdb3 [Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610) 2025-08-26 14:29:22 +08:00
YuanRisheng
6ccc10ad47 Unify server-side and model-side Config (Part1) (#3018)
* move cache config

* fix mtp
2025-07-28 10:51:52 +08:00
freeliuzc
667547be59 support chunk_prefill in MTP (#2705) 2025-07-04 11:55:48 +08:00
Jiang-Jia-Jun
92c2cfa2e7 Sync v2.0 version of code to github repo 2025-06-29 23:29:37 +00:00