FastDeploy/custom_ops at 0cb9ad186e89bf0fc5e1ae1ac82583f133bad128 - FastDeploy - 子说镜像小站

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Yuanle Liu 0cb9ad186e [Cherry-Pick][BugFix] fix speculate_limit_thinking_content_length #5590 (#5615)

2025-12-18 01:50:18 -08:00

c++ code format (#4527)

2025-10-22 17:59:50 +08:00

[Cherry-Pick][BugFix] fix speculate_limit_thinking_content_length #5590 (#5615)

2025-12-18 01:50:18 -08:00

c++ code format (#4527)

2025-10-22 17:59:50 +08:00

[Metax] optimize cutlass moe and flash attention backend (#5128)

2025-11-20 16:12:35 +08:00

[setup optimize] Support git submodule (#4033)

2025-09-11 17:41:16 +08:00

[Quantization] Support w4afp8 MoE dynamic quantization (#5282)

2025-12-02 18:56:16 +08:00

[XPU] support moe_expert_ffn TGEMM selection (#5375)

2025-12-05 17:49:40 +08:00

0001-DeepGEMM-95e81b3.patch

[OP] Remove extra H2D in DeepGemm (#5262)

2025-11-28 14:23:44 +08:00

MANIFEST.in

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

setup_ops_cpu.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

setup_ops.py

[Feature] Support noaux for eplb (#5143)

2025-11-21 14:10:32 +08:00