apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2025-10-05 08:37:06 +08:00)
199f88ce1ef2ebdf63b17f20c3531e00b3f41fe5
FastDeploy/custom_ops/gpu_ops/w4afp8_gemm
Latest commit: yangjianfengo1, 9213a58a06, 2025-09-03 19:36:45 +08:00
[Fix bug] Fix the w4afp8 nblock at 256 and add a mask parameter to the fa3 append attn (#3771) (#3835)
* fix w4afp8 * add centralized configuration * codestyle * fix fa3 append attn
kernel_traits.h          [New Feature] Support W4Afp8 MoE GroupGemm (#3171)          2025-08-06 10:34:05 +08:00
mainloop_fwd.h           [New Feature] Support w4afp8 in centralized mode (#3644)    2025-08-28 10:53:24 +08:00
utils.hpp                [New Feature] Support W4Afp8 MoE GroupGemm (#3171)          2025-08-06 10:34:05 +08:00
w4afp8_gemm_kernel.hpp   [New Feature] Support w4afp8 in centralized mode (#3644)    2025-08-28 10:53:24 +08:00
w4afp8_gemm.cu           [Fix bug] Fix the w4afp8 nblock at 256 and add a mask parameter to the fa3 append attn (#3771) (#3835)    2025-09-03 19:36:45 +08:00
w4afp8_gemm.h            support w4afp8 EP inference (#3044)                         2025-08-25 11:27:45 +08:00