FastDeploy

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 08:37:06 +08:00

Sunny-bot1 3b1da6e4dd support v1 loader for machete (#3999)

2025-09-10 10:21:33 +08:00

attention

[V1 Loader] Ernie kv cache quant support v1 loader (#3899)

2025-09-09 05:25:08 -07:00

backends

fix typos (#3684)

2025-09-01 17:50:17 +08:00

moe

cache feature (#3857)

2025-09-07 18:52:46 +08:00

quantization

support v1 loader for machete (#3999)

2025-09-10 10:21:33 +08:00

sample

[Feature] mm and thinking model support structured output (#2749)

2025-09-02 16:21:09 +08:00

__init__.py

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

activation.py

[Polish Code] Remove useless notes

2025-08-14 14:04:52 +08:00

embeddings.py

[Feature] ernie4_5_vl_moe support huggingface safetensor loading (#3750)

2025-09-03 02:58:59 -07:00

linear.py

【Inference Optimize】Update MergedReplicatedLinear for DSK qkv_a_proj_with_mqa. (#3673)

2025-09-04 21:16:05 -07:00

lm_head.py

[Precision] Support lm_head layer running in float32 (#3597)

2025-08-27 11:34:53 +08:00

mtp_linear.py

support tmp (#3675)

2025-08-28 19:42:32 +08:00

normalization.py

adaptive rms_norm's dtype (#3617)

2025-08-26 15:29:15 +08:00

rotary_embedding.py

[Model] support qwen2_5_vl (#3557)

2025-08-29 18:28:39 +08:00

utils.py

fix mem boom in ep (#3854)

2025-09-05 11:48:21 +08:00