FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 16:48:03 +08:00

Files

SuperNova 805f29a06c [Feature] refactor metax_gpu attention and moe and remove some useless code (#3688)

Co-authored-by: yongqiangma <xing.wo@163.com>

2025-09-12 14:40:25 +08:00

2025-08-30 17:50:17 +08:00

__init__.py

2025-06-29 23:29:37 +00:00

block_wise_fp8.py

2025-08-29 11:07:30 +08:00

kv_cache.py

2025-09-09 05:25:08 -07:00

mix_quant.py

cache feature (#3857)

2025-09-07 18:52:46 +08:00

quant_base.py

2025-07-19 23:19:27 +08:00

tensor_wise_fp8.py

2025-08-26 16:19:30 +08:00

w4a8.py

2025-09-05 17:07:58 +08:00

w4afp8.py

2025-09-05 17:07:58 +08:00

w8a8.py

fix w8a8.py (#3733)

2025-09-03 10:57:26 +08:00

weight_only.py

2025-09-12 14:40:25 +08:00

wfp8afp8.py

2025-09-11 20:08:09 +08:00

wint2.py

2025-07-19 23:19:27 +08:00