FastDeploy/attention at eda83ca6720d3136375e71ea33c947dda312aa60 - FastDeploy - 子说镜像小站


mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 16:48:03 +08:00


yzwu fbdd6b0663 [Iluvatar GPU] Optimize attention and moe performance (#3234)

2025-08-08 10:51:24 +08:00

..

support qk norm (#3145)

2025-08-05 16:46:14 +08:00

__init__.py

[SOT] Mark dynamic dims by type annotations (#2771)

2025-07-22 00:23:52 -07:00

append_attn_backend.py

support qk norm (#3145)

2025-08-05 16:46:14 +08:00

attention_selecter.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

attention.py

support qk norm (#3145)

2025-08-05 16:46:14 +08:00

base_attention_backend.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

block_multihead_attn_backend.py

[Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel (#2989)

2025-07-31 00:09:31 +08:00

flash_attn_backend.py

[Fix Bug] Fix fa3 support bug in centralized mode (#3235)

2025-08-06 16:24:27 +08:00

iluvatar_attn_backend.py

[Iluvatar GPU] Optimize attention and moe performance (#3234)

2025-08-08 10:51:24 +08:00

mla_attention_backend.py

[Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel (#2989)

2025-07-31 00:09:31 +08:00

native_paddle_backend.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

utils.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

xpu_attn_backend.py

[Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel (#2989)

2025-07-31 00:09:31 +08:00