apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2025-10-13 20:34:02 +08:00
7c1fd19f0f0ca2b01c96cc31e4b9626d3b6f1b65
FastDeploy/custom_ops/gpu_ops/mla_attn
AIbin  a7392a0ff9  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
* support MLA chunk_size auto search & cuda_graph
attention_updater.cuh  …
batch_mla_with_paged_kv_cache.cu  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
batch_mla_with_paged_kv_cache.h  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
epilogue.cuh  …
kernel_traits.cuh  …
mainloop_load.cuh  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
mainloop_mma.cuh  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
mla_hopper.cuh  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
named_barrier.cuh  …
utils.cuh  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00