FastDeploy/layers at 2ea267f624cc7971b462511f5537ea0742582f19 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-06 17:17:14 +08:00

chen 888780ffde [Feature] block_wise_fp8 support triton_moe_backend (#2767)

2025-07-09 19:22:47 +08:00

dcu adapter ernie45t (#2756)

2025-07-09 18:56:27 +08:00

dcu adapter ernie45t (#2756)

2025-07-09 18:56:27 +08:00

[Feature] block_wise_fp8 support triton_moe_backend (#2767)

2025-07-09 19:22:47 +08:00

[Feature] block_wise_fp8 support triton_moe_backend (#2767)

2025-07-09 19:22:47 +08:00

[Feature] Add speculative decoding simulation benchmark. (#2751)

2025-07-09 12:08:43 +08:00

__init__.py

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

activation.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

embeddings.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

hydra_head.py

Sync v2.0 version of code to github repo

2025-06-29 23:29:37 +00:00

linear.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

lm_head.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

mtp_linear.py

Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707)

2025-07-04 14:15:04 +08:00

normalization.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

rotary_embedding.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

utils.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00