FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 08:37:06 +08:00

RichardWooSJTU fee544e808 fix ep prefill (#2762)

2025-07-09 14:03:05 +08:00

attention

fix ep prefill (#2762)

2025-07-09 14:03:05 +08:00

backends

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

moe

fix ep prefill (#2762)

2025-07-09 14:03:05 +08:00

quantization

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

sample

[Feature] Add speculative decoding simulation benchmark. (#2751)

2025-07-09 12:08:43 +08:00

__init__.py

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

activation.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

embeddings.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

hydra_head.py

Sync v2.0 version of code to github repo

2025-06-29 23:29:37 +00:00

linear.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

lm_head.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

mtp_linear.py

Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707)

2025-07-04 14:15:04 +08:00

normalization.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

rotary_embedding.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

utils.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00