FastDeploy/fastdeploy/model_executor/layers/quantization at 9058cc712da4e42a8fde63d7121a4dad54f2d1ac - FastDeploy - 子说镜像小站

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Yuanle Liu b8e4828373 [BugFix] fix dynamic c8 in v1 loader (#5562)

2025-12-15 04:07:54 -08:00

..

[BugFix] Add support for weight shape constraints and group size selection in Machete (#4911)

2025-11-10 20:57:35 +08:00

__init__.py

[Quantization] Support w4afp8 MoE dynamic quantization (#5282)

2025-12-02 18:56:16 +08:00

block_wise_fp8.py

refactor pt loading (#4532)

2025-11-11 21:30:39 +08:00

kv_cache.py

[BugFix] fix dynamic c8 in v1 loader (#5562)

2025-12-15 04:07:54 -08:00

mix_quant.py

[Feature] Add an unquantized option for MoE and Dense quant type (#4813)

2025-11-19 16:24:03 +08:00

quant_base.py

…

tensor_wise_fp8.py

…

w4a8.py

…

w4afp8.py

[Others] remove add_bias option (#5425)

2025-12-09 17:39:35 +08:00

w8a8.py

…

weight_only.py

[Others] remove add_bias option (#5425)

2025-12-09 17:39:35 +08:00

wfp8afp8.py

refactor pt loading (#4532)

2025-11-11 21:30:39 +08:00

wint2.py

…