FastDeploy/attention at 7568b20098ae71fecf64af6bcef62c56b2b2a727 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-07 09:31:35 +08:00

chen 7568b20098 check (#3720)

2025-08-30 16:04:20 +08:00

..

Add with_output version AppendAttention (#3302)

2025-08-28 17:10:18 +08:00

__init__.py

[Feature] block sparse attention (#3668)

2025-08-29 19:46:30 +08:00

append_attn_backend.py

check (#3720)

2025-08-30 16:04:20 +08:00

attention_selecter.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

attention.py

[Feature] block sparse attention (#3668)

2025-08-29 19:46:30 +08:00

base_attention_backend.py

[MetaxGPU] Support FastDeploy on metax gpu (#3241)

2025-08-13 11:11:54 +08:00

block_multihead_attn_backend.py

enable dcu ci (#3402)

2025-08-29 10:23:08 +08:00

flash_attn_backend.py

Add with_output version AppendAttention (#3302)

2025-08-28 17:10:18 +08:00

iluvatar_attn_backend.py

[Iluvatar GPU] Optimize attention and moe performance (#3234)

2025-08-08 10:51:24 +08:00

mla_attention_backend.py

Add custom op declaration for all_reduce (#3473)

2025-08-20 20:29:58 +08:00

moba_attention_backend.py

[Feature] block sparse attention (#3668)

2025-08-29 19:46:30 +08:00

native_paddle_backend.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

utils.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

xpu_attn_backend.py

[Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel (#2989)

2025-07-31 00:09:31 +08:00