yzwu
|
fbdd6b0663
|
[Iluvatar GPU] Optimize attention and moe performance (#3234)
|
2025-08-08 10:51:24 +08:00 |
|
YuanRisheng
|
6ccc10ad47
|
Unify server-side and model-side Config (Part1) (#3018)
* move cache config
* fix mtp
|
2025-07-28 10:51:52 +08:00 |
|
lizhenyun01
|
29c3292f02
|
support c4 attn && fix cache
|
2025-07-24 12:00:52 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
YuanRisheng
|
4c7b8bc458
|
Simplify the Config code (#2770)
* simplify the code
* fix vl
* delete config
* fix
* perfect code
* fix ci
* fix xpu
* fix xpu
* fix server
* resolve conflict
* fix mtp
* resolve conflict
* fix xpu
* fix xpu
* fix vl
* fix log
* fix qwen moe
* fix qwen moe
* fix qwen moe
|
2025-07-14 19:50:05 +08:00 |
|
littledgg
|
59071268b6
|
[Executor] Move forward_meta.py to fastdeploy/model_executor (#2774)
* Use PEP 563 in attention.py and fix conflict
* merge commit
* Change what was left out last time
|
2025-07-10 20:36:51 +08:00 |
|
liddk1121
|
1b54a2831e
|
Adapt for iluvatar gpu (#2684)
|
2025-07-07 16:53:14 +08:00 |
|