FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Files

Yuan Xiaolan 3214fb5393 support model loading for w4a8 offline quant (#3064)

Support loading offline-quantized weights for W4A8 EP

2025-07-29 21:54:37 +08:00

__init__.py

2025-07-19 23:19:27 +08:00

ep.py

2025-07-29 15:06:49 +08:00

fused_moe_backend_base.py

2025-07-29 17:17:24 +08:00

fused_moe_cutlass_backend.py

2025-07-29 21:54:37 +08:00

fused_moe_deepgemm_backend.py

2025-07-29 17:07:44 +08:00

fused_moe_marlin_backend.py

2025-07-24 01:43:31 -07:00

fused_moe_triton_backend.py

2025-07-24 01:43:31 -07:00

fused_moe_wint2_backend.py

2025-07-28 16:31:56 +08:00

fused_moe_xpu_backend.py

2025-07-21 22:52:03 +08:00

moe.py

2025-07-29 21:54:37 +08:00

triton_moe_kernels.py

2025-07-19 23:19:27 +08:00