FastDeploy/custom_ops at ffe7af8a97a6dd3f0fbe63105889b7a9d8f55c81

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Sunny-bot1 6d0cc0dd9c [Optimization] Optimize split_q_block kernel (#4367)

2025-10-15 11:28:00 +08:00

fix typos (#3951)

2025-09-08 15:22:41 +08:00

[Optimization] Optimize split_q_block kernel (#4367)

2025-10-15 11:28:00 +08:00

[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)

2025-09-22 21:13:59 +08:00

[Metax] support cutlass moe & optimize flash attention (#4208)

2025-09-29 11:22:43 +08:00

[setup optimize] Support git submodule (#4033)

2025-09-11 17:41:16 +08:00

[Fix bug] Fix the w4afp8 nblock at 256, and add a mask parameter to the fa3 append attn (#3771)

2025-09-02 19:17:01 +08:00

[XPU] Support W4A8C8-TP4-300B Model (#4068)

2025-10-10 15:41:32 +08:00

0001-DeepGEMM-95e81b3.patch

[feat] support fa3 backend for pd disaggregated (#2695)

2025-07-03 22:33:27 +08:00

MANIFEST.in

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

setup_ops_cpu.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

setup_ops.py

【Hackathon 9th No.86】autogen MultiQueryAppendC8Attention template_instantiation -part (#4330)

2025-10-10 15:07:48 +08:00