apps/FastDeploy
mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced
2025-10-18 06:31:17 +08:00
e2c764fd5aa31cfe4130de922350c221f12e289c
FastDeploy/custom_ops/gpu_ops/moe
yangjianfengo1
e81046fdad
[New Feature] Support w4afp8 in centralized deployment (#3644)
* Support TP w4afp8 * code style
2025-08-28 10:53:24 +08:00
..
moe_wna16_marlin_utils
…
deepgemm_preprocess.cu
…
ep_moe_prefill_func.cu
[NewFeature]Support dp multi api server && Fix some bug in mixed ep && merge develop (#3598)
2025-08-26 19:59:02 +08:00
fast_hardamard_kernel.cu
support w4afp8 EP inference (#3044)
2025-08-25 11:27:45 +08:00
fast_hardamard_kernel.h
…
fused_moe_helper.h
…
fused_moe_imp_op.h
…
fused_moe_op.h
[New Feature] Support w4afp8 in centralized deployment (#3644)
2025-08-28 10:53:24 +08:00
fused_moe.cu
…
gptq_marlin_repack.cu
…
group_swiglu_with_masked.cu
…
group_swiglu_with_masked.h
…
moe_deepgemm_depermute.cu
…
moe_deepgemm_permute.cu
…
moe_dispatch.cu
[New Feature] Support w4afp8 in centralized deployment (#3644)
2025-08-28 10:53:24 +08:00
moe_ffn_wint2.cu
…
moe_ffn.cu
[New Feature] Support w4afp8 in centralized deployment (#3644)
2025-08-28 10:53:24 +08:00
moe_reduce.cu
…
moe_redundant_topk_select.cu
topk_gating_softmax support bias (#3405)
2025-08-15 11:57:45 +08:00
moe_topk_select.cu
topk_gating_softmax support bias (#3405)
2025-08-15 11:57:45 +08:00
moe_wna16_marlin_gemm.cu
…
moe_wna16_marlin_gemm.h
…
tritonmoe_preprocess.cu
…
wintx_unzip.cu
…