FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-30 03:22:05 +08:00


AIbin beec24fd89 【Inference Optimize】DeepSeek-v3 model inference performance optimization (#3455)

* DSK_OPT_01

* update FA3

2025-08-19 10:42:42 +08:00

2025-08-14 03:40:55 -07:00

File                              Last commit               Last updated
__init__.py                                                 2025-07-22 00:23:52 -07:00
append_attn_backend.py                                      2025-08-14 03:40:55 -07:00
attention_selecter.py                                       2025-07-19 23:19:27 +08:00
attention.py                      support qk norm (#3145)   2025-08-05 16:46:14 +08:00
base_attention_backend.py                                   2025-08-13 11:11:54 +08:00
block_multihead_attn_backend.py                             2025-07-31 00:09:31 +08:00
flash_attn_backend.py                                       2025-08-06 16:24:27 +08:00
iluvatar_attn_backend.py                                    2025-08-08 10:51:24 +08:00
mla_attention_backend.py                                    2025-08-19 10:42:42 +08:00
native_paddle_backend.py                                    2025-07-19 23:19:27 +08:00
utils.py                                                    2025-07-19 23:19:27 +08:00
xpu_attn_backend.py                                         2025-07-31 00:09:31 +08:00