apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2025-10-05 16:48:03 +08:00
a7392a0ff944a1f40a26023c53c80b10f263421f
FastDeploy/fastdeploy/model_executor/layers/attention/ops
AIbin  a7392a0ff9  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
...
* support MLA chunk_size auto search & cuda_graph
__init__.py                            Add with_output version AppendAttention (#3302)  2025-08-28 17:10:18 +08:00
append_attention.py                    Add with_output version AppendAttention (#3302)  2025-08-28 17:10:18 +08:00
get_block_shape_and_split_kv_block.py  【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)  2025-09-11 10:46:09 +08:00
gqa_rope_write_cache.py                support fa3 rope3d (#3622)  2025-08-27 11:31:29 +08:00
init_kv_signal_per_query.py            polish code with new pre-commit rule (#2923)  2025-07-19 23:19:27 +08:00
init_signal_layerwise.py               polish code with new pre-commit rule (#2923)  2025-07-19 23:19:27 +08:00
open_shm_and_get_meta_signal.py        polish code with new pre-commit rule (#2923)  2025-07-19 23:19:27 +08:00
pre_cache_len_concat.py                polish code with new pre-commit rule (#2923)  2025-07-19 23:19:27 +08:00