FastDeploy/layers at 17b414c2df4ed1f7e74f0177bfc307ea29c384b6

apps/FastDeploy

Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2025-10-05 08:37:06 +08:00

周周周 17b414c2df MoE Default use triton's blockwise fp8 in TP Case (#3678)

2025-08-29 11:07:30 +08:00

enable dcu ci (#3402)

2025-08-29 10:23:08 +08:00

enable dcu ci (#3402)

2025-08-29 10:23:08 +08:00

add input_processor plugin (#3657)

2025-08-28 22:53:57 +08:00

MoE Default use triton's blockwise fp8 in TP Case (#3678)

2025-08-29 11:07:30 +08:00

[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3552)

2025-08-25 14:11:49 +08:00

__init__.py

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

activation.py

[Polish Code] Remove useless notes

2025-08-14 14:04:52 +08:00

embeddings.py

Supports DP+TP+EP hybrid parallel deployment strategy (#3489)

2025-08-26 00:04:01 -07:00

linear.py

fix qwen3 235B tp 8 (#3697)

2025-08-28 23:46:25 +08:00

lm_head.py

[Precision] Support lm_head layer running in float32 (#3597)

2025-08-27 11:34:53 +08:00

mtp_linear.py

support tmp (#3675)

2025-08-28 19:42:32 +08:00

normalization.py

adaptive rms_norm's dtype (#3617)

2025-08-26 15:29:15 +08:00

rotary_embedding.py

[MetaxGPU] Support FastDeploy on metax gpu (#3241)

2025-08-13 11:11:54 +08:00

utils.py

[V1 Loader] support weight_only (#3413)

2025-08-23 13:13:41 +08:00