bukejiyu
bad53c6b6e
[vl]remove duplicated load logic (#2744)
2025-07-13 07:36:26 +08:00
lizexu123
8c660a0dfb
[BugFix] fix RMSNorm rms_norm_esp (#2797)
* fix rms
* add vl
* fix
* add vl
* fix
* fix
2025-07-10 20:02:24 +08:00
zhink
b89180f1cd
[Feature] support custom all-reduce (#2758)
* [Feature] support custom all-reduce
* add vllm adapted
2025-07-09 16:00:27 +08:00
GoldPancake
f7cad30a38
[Feature] Add speculative decoding simulation benchmark. (#2751)
* Add speculative decoding simulation benchmark
* Fix the name of the parameter
2025-07-09 12:08:43 +08:00
Yuanle Liu
240bdac2a4
[feat] support fa3 backend for pd disaggregated (#2695)
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code (#2679)
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00