apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2025-12-24 13:28:13 +08:00)
048ca600133fd4e1c8802034379c7b18e5aa9a1c
FastDeploy/custom_ops/gpu_ops/moe
周周周
95243f012c
[Others] add PADDLE_ENFORCE (#5288)
2025-11-28 14:23:35 +08:00
moe_wna16_marlin_utils
…
deepgemm_preprocess.cu
…
ep_moe_expert_dispatch.cu
[Others] add PADDLE_ENFORCE (#5288)
2025-11-28 14:23:35 +08:00
fused_moe_helper.h
【New Feature】W4afp8 supports per group quantization (#4987)
2025-11-13 19:17:27 +08:00
fused_moe_imp_op.h
[BugFix] Fix zero workspace returned by CUB size query under CUDA Graph in MoE dispatch (#5087)
2025-11-20 20:00:29 +08:00
fused_moe_op.h
【New Feature】W4afp8 supports per group quantization (#4987)
2025-11-13 19:17:27 +08:00
fused_moe.cu
…
gptq_marlin_repack.cu
…
group_swiglu_with_masked.cu
…
group_swiglu_with_masked.h
…
moe_deepgemm_depermute.cu
…
moe_deepgemm_permute.cu
…
moe_dispatch.cu
[BugFix] Fix zero workspace returned by CUB size query under CUDA Graph in MoE dispatch (#5087)
2025-11-20 20:00:29 +08:00
moe_expert_ffn_wint2.cu
…
moe_fast_hardamard_impl_common.h
…
moe_fast_hardamard_impl.cuh
…
moe_fast_hardamard_kernel.cu
【Fix】fix deepep dispatch (#5036)
2025-11-17 10:34:01 +08:00
moe_fast_hardamard_kernel.h
…
moe_ffn.cu
【New Feature】W4afp8 supports per group quantization (#4987)
2025-11-13 19:17:27 +08:00
moe_reduce.cu
…
moe_redundant_topk_select.cu
…
moe_topk_select.cu
…
moe_wna16_marlin_gemm.cu
…
moe_wna16_marlin_gemm.h
…
swigluoai.cu
…
swigluoai.h
…
template_config.json
【New Feature】W4afp8 supports per group quantization (#4987)
2025-11-13 19:17:27 +08:00
tritonmoe_preprocess.cu
…
winx_unzip.cu
…