GoldPancake
|
f7cad30a38
|
[Feature] Add speculative decoding simulation benchmark. (#2751)
* Add speculative decoding simulation benchmark
* Fix the name of the parameter
|
2025-07-09 12:08:43 +08:00 |
|
ming1753
|
1eb8ea7328
|
[Bug fix] fix compile bug when sm < 89 (#2738)
|
2025-07-08 11:24:52 +08:00 |
|
ming1753
|
ef6649a577
|
[Optimize] Optimize tensorwise fp8 performance (#2729)
* [Optimize] Optimize tensorwise fp8 performance
|
2025-07-07 20:06:28 +08:00 |
|
liddk1121
|
1b54a2831e
|
Adapt for iluvatar gpu (#2684)
|
2025-07-07 16:53:14 +08:00 |
|
Jiang-Jia-Jun
|
05c670e593
|
[Sync] Update to latest code (#2679)
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
|
2025-07-03 15:43:53 +08:00 |
|
MARD1NO
|
ac5f860536
|
use shfl_xor_sync to reduce redundant shfl broadcast
|
2025-06-30 13:12:21 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|
jiangjiajun
|
684703fd72
|
[LLM] First commit the llm deployment code
|
2025-06-09 19:20:15 +08:00 |
|