Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2025-12-24 13:28:13 +08:00)
FastDeploy / custom_ops
At commit 6160145f822f0e85d46f02eb0a629aa885105eeb

Latest commit: Sunny-bot1, 930f7b781c, 2025-10-17 10:59:56 +08:00
[Optimization] Put get_block_shape_and_split_kv_block in cuda graph for append attention backend (#4443)
* get block in cuda graph
* fix sot
..
cpu_ops        fix typos (#3951)  2025-09-08 15:22:41 +08:00
gpu_ops        [Optimization] Put get_block_shape_and_split_kv_block in cuda graph for append attention backend (#4443)  2025-10-17 10:59:56 +08:00
iluvatar_ops   [Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)  2025-09-22 21:13:59 +08:00
metax_ops      [Metax] support cutlass moe & optimize flash attention (#4208)  2025-09-29 11:22:43 +08:00
third_party    [setup optimize] Support git submodule (#4033)  2025-09-11 17:41:16 +08:00
utils          [Fix bug] Fix the w4afp8 nblock to 256 and add a mask parameter to fa3 append attn (#3771)  2025-09-02 19:17:01 +08:00
xpu_ops        [XPU] refine fused moe (#4219)  2025-10-16 19:04:07 +08:00
0001-DeepGEMM-95e81b3.patch  …
MANIFEST.in    …
setup_ops_cpu.py  …
setup_ops.py   【Hackathon 9th No.86】autogen MultiQueryDecoderAttention template_instantiation -part (#4383)  2025-10-16 17:08:19 +08:00