FastDeploy/fastdeploy/model_executor/layers at 808b5487610fb697c9de09d859660a0ae6be10fa - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

YuanRisheng 808b548761 support tmp (#3675)

2025-08-28 19:42:32 +08:00

Add with_output version AppendAttention (#3302)

2025-08-28 17:10:18 +08:00

[NewFeatures] support eplb (#3547)

2025-08-26 16:19:30 +08:00

[BugFix]fix dp&ep&tp and muti node infer (#3629)

2025-08-28 19:09:10 +08:00

[Optimize]support machete weight only gemm (#3561)

2025-08-28 09:49:58 +08:00

[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3552)

2025-08-25 14:11:49 +08:00

__init__.py

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

activation.py

[Polish Code] Remove useless notes

2025-08-14 14:04:52 +08:00

embeddings.py

Supports DP+TP+EP hybrid parallel deployment strategy (#3489)

2025-08-26 00:04:01 -07:00

linear.py

Supports DP+TP+EP hybrid parallel deployment strategy (#3489)

2025-08-26 00:04:01 -07:00

lm_head.py

[Precision] Support lm_head layer running in float32 (#3597)

2025-08-27 11:34:53 +08:00

mtp_linear.py

support tmp (#3675)

2025-08-28 19:42:32 +08:00

normalization.py

adaptive rms_norm's dtype (#3617)

2025-08-26 15:29:15 +08:00

rotary_embedding.py

[MetaxGPU] Support FastDeploy on metax gpu (#3241)

2025-08-13 11:11:54 +08:00

utils.py

[V1 Loader] support weight_only (#3413)

2025-08-23 13:13:41 +08:00