yinwei
|
20c7b741f4
|
[XPU] Support W4A8C8-TP4-300B Model (#4068)
* support w4a8
* delete ep block attn
* delete moe_topk_select
* update note
* update
* delte useless info
* update
* add some note
* fix some format
* update scale info
* add ans baseline
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
|
2025-10-10 15:41:32 +08:00 |
|
Yuan Xiaolan
|
2cf55168ca
|
load hadamard_block_size from config (#3797)
|
2025-09-05 17:07:58 +08:00 |
|
Yuan Xiaolan
|
5f56d289a7
|
fix is_permuted (#3098)
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-07-31 19:58:05 +08:00 |
|
Yuan Xiaolan
|
3214fb5393
|
support model loading for w4a8 offline quant (#3064)
Support loading offline-quantized weights for W4A8 EP
|
2025-07-29 21:54:37 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|