apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2025-12-24 13:28:13 +08:00)
f15edbb6efa9efa1185a931687aef51ac7041db2
FastDeploy/custom_ops/gpu_ops/append_attn
chen
27ef3610b5
support glm fa3 (#5586)
2025-12-16 19:33:27 +08:00
..
template_instantiation
…
append_attention_c4_impl.cuh
…
append_attention_c8_impl.cuh
…
append_attention_c16_impl.cuh
…
append_attention_func.cuh
[Others] Maintain the mtp branch temporarily. (#5446)
2025-12-09 19:17:53 +08:00
append_attention_kernel.h
…
decode_attention_func.cuh
…
decoder_mla_attention_kernel.cu
…
decoder_mla_attention_kernel.h
…
decoder_write_cache_with_rope_impl.cuh
[Feature][Optimization] Qwen Support Dynamic block_wise_fp8 cache (#5486)
2025-12-12 17:10:17 +08:00
decoder_write_cache_with_rope_kernel.cu
[Feature][Optimization] Qwen Support Dynamic block_wise_fp8 cache (#5486)
2025-12-12 17:10:17 +08:00
decoder_write_cache_with_rope_kernel.h
…
encoder_write_cache_with_rope_impl.cuh
…
encoder_write_cache_with_rope_kernel.h
…
get_block_shape_and_split_kv_block.cu
[Metax] modify wrapSize to WARP_SIZE (#5442)
2025-12-09 01:44:02 -08:00
gqa_rope_write_cache.cu
support glm fa3 (#5586)
2025-12-16 19:33:27 +08:00
mem_util.cuh
…
mla_cache_kernel.cu
…
mla_cache_kernel.cuh
…
mma_tensor_op.cuh
…
multiquery_attention_c4_impl.cuh
[Others] Maintain the mtp branch temporarily. (#5446)
2025-12-09 19:17:53 +08:00
multiquery_attention_c4_kernel.h
…
multiquery_attention_c8_impl.cuh
[Others] Maintain the mtp branch temporarily. (#5446)
2025-12-09 19:17:53 +08:00
multiquery_attention_c8_kernel.h
…
multiquery_attention_c16_impl.cuh
[Others] Maintain the mtp branch temporarily. (#5446)
2025-12-09 19:17:53 +08:00
multiquery_attention_c16_kernel.h
…
multiquery_decoder_attention_impl.cuh
…
multiquery_decoder_attention_kernel.h
…
pre_cache_len_concat.cu
…
qwen3_rope.h
…
speculate_write_cache_with_rope_impl.cuh
[BugFix][Speculative Decoding](Spend many days to solve) Fix write qknorm cache bug in speculative decoding (#5491)
2025-12-15 18:27:11 +08:00
speculate_write_cache_with_rope_kernel.cu
[BugFix][Speculative Decoding](Spend many days to solve) Fix write qknorm cache bug in speculative decoding (#5491)
2025-12-15 18:27:11 +08:00
speculate_write_cache_with_rope_kernel.h
…
template_config.json
…
utils.cuh
…