yangjianfengo1
|
ae7bee8122
|
【New Feature】W4afp8 supports per group quantization (#4987)
* w4afp8 支持per group
* code style
* fix transpose
* revert fast hardmard
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
|
2025-11-13 19:17:27 +08:00 |
|
YuBaoku
|
819b2dbbae
|
Revert "【New Feature】W4afp8 supports per group quantization (#4272)" (#4854)
This reverts commit 93fcf7e4ec.
|
2025-11-06 17:48:28 +08:00 |
|
yangjianfengo1
|
93fcf7e4ec
|
【New Feature】W4afp8 supports per group quantization (#4272)
* w4afp8 支持per group
* code style
* 精度完成
* revert append attn utils
* ffn1 动态量化
* ffn2 支持动态量化
* code style
* code style
* 修改单测
* 修改单测
* fix bug
* Implement conditional parameter creation for layers
Add parameter creation for up_gate_proj_in_scale when ep_size > 1.
* code style
* fix conflict
* code style
* code style
* 修复w4aint8 精度
* fix ci
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
|
2025-11-05 21:00:23 +08:00 |
|
yangjianfengo1
|
89397516a8
|
[New Feature] Support W4Afp8 MoE GroupGemm (#3171)
* init
* 增加多线程编译
* fix bug
* fix bug
* code style
* 增加fp16
* 将print替换成assert
* 修复stmatrix
* 减小单测shape
* 减小单测shape
|
2025-08-06 10:34:05 +08:00 |
|