FastDeploy

apps/FastDeploy

Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2025-12-24 13:28:13 +08:00

周周周 aa76085d1f

[Attention] remove cum_offsets from atten, and use cu_seqlens_q (#2870)

2025-07-16 20:10:57 +08:00

template_instantiation

…

append_attention_c4_impl.cuh

…

append_attention_c8_impl.cuh

…

append_attention_c16_impl.cuh

…

append_attention_func.cuh

…

append_attention_kernel.h

…

decode_attention_func.cuh

…

decode_attention_kernel.cu

…

decoder_write_cache_with_rope_impl.cuh

…

decoder_write_cache_with_rope_kernel.cu

…

decoder_write_cache_with_rope_kernel.h

…

encoder_write_cache_with_rope_impl.cuh

…

encoder_write_cache_with_rope_kernel.h

…

get_block_shape_and_split_kv_block.cu

…

gqa_rope_write_cache.cu

…

mem_util.cuh

…

mla_cache_kernel.cu

…

mla_cache_kernel.cuh

…

mma_tensor_op.cuh

…

multi_head_latent_attention_kernel.h

…

pre_cache_len_concat.cu

…

speculate_write_cache_with_rope_impl.cuh

…

speculate_write_cache_with_rope_kernel.cu

…

speculate_write_cache_with_rope_kernel.h

…

utils.cuh

…