freeliuzc
2f473ba966
[Feature][MTP] Support MTP for rl-model (#4009)
...
* qk norm for speculate decode C16
* support mtp in v1_scheduler mode
* support mtp rope_3d
* support mtp features
* add unit test && del some log
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
Co-authored-by: xiaoxiaohehe001 <hiteezsf@163.com>
2025-09-10 13:34:37 +08:00
co63oc
d4fc893fe3
fix typos (#3633)
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-28 14:42:24 +08:00
Sunny-bot1
479c8b85d3
[Optimize] support machete weight only gemm (#3561)
...
* support machete weight only gemm
* add generate
* update
* fix
* change file location
* add sm_version limit
* fix
* fix
* fix ci
* fix coverage
* fix xpu
2025-08-28 09:49:58 +08:00
YuanRisheng
642480f5f6
[CI] Standard unittest (#3606)
...
* standard unittest
* fix bugs
* fix script
2025-08-26 19:03:11 +08:00
freeliuzc
52eda7fdb3
[Feature][MTP] support new speculative decoding method named hybrid mtp with ngram (#3610)
2025-08-26 14:29:22 +08:00
Yuan Xiaolan
9205c88da1
support w4afp8 EP inference (#3044)
2025-08-25 11:27:45 +08:00
freeliuzc
76759108c9
[Feature][SpeculativeDecoding] Support tree-attention (#3514)
...
* support tree-attention
* fix merge bug
* fix unit-test api
* fix merge bug
2025-08-22 13:36:41 +08:00
yangjianfengo1
e5aa7087db
[bug fix] Fix slow w4a8 compilation (#3510)
...
* fix w4a8 compilation
* code style
* fix tma copy
2025-08-21 18:50:14 +08:00
YUNSHEN XIE
3a6058e445
Add stable ci (#3460)
...
* add stable ci
* fix
* update
* fix
* rename tests dir; fix stable ci bug
* add timeout limit
* update
2025-08-20 08:57:17 +08:00