| Author | Commit | Message | Date |
|---|---|---|---|
| YuanRisheng | 2e9e53ff7e | [FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116): remove max_num_batched_tokens in parallel config; remove max_num_seqs; update test case; fix test; fix. Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com> | 2025-09-17 10:43:35 +08:00 |
| co63oc | d6369b4d51 | fix typos (#3684) | 2025-09-01 17:50:17 +08:00 |
| lzy | 48d760539b | fix deepcopy(tp_group) in spec (#3648) | 2025-08-29 16:08:21 +08:00 |
| freeliuzc | 52eda7fdb3 | [Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610) | 2025-08-26 14:29:22 +08:00 |
| YuanRisheng | 6ccc10ad47 | Unify server-side and model-side Config (Part1) (#3018): move cache config; fix mtp | 2025-07-28 10:51:52 +08:00 |
| freeliuzc | 667547be59 | support chunk_prefill in MTP (#2705) | 2025-07-04 11:55:48 +08:00 |
| Jiang-Jia-Jun | 92c2cfa2e7 | Sync v2.0 version of code to github repo | 2025-06-29 23:29:37 +00:00 |