FastDeploy

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 16:48:03 +08:00

ming1753 1eb8ea7328 [Bug fix] fix compile bug when sm < 89 (#2738)

2025-07-08 11:24:52 +08:00

attention

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00

backends

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

moe

[Bug fix] fix compile bug when sm < 89 (#2738)

2025-07-08 11:24:52 +08:00

quantization

[Optimize] Optimize tensorwise fp8 performance (#2729)

2025-07-07 20:06:28 +08:00

sample

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00

__init__.py

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

activation.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00

embeddings.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

hydra_head.py

Sync v2.0 version of code to github repo

2025-06-29 23:29:37 +00:00

linear.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00

lm_head.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

mtp_linear.py

Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707)

2025-07-04 14:15:04 +08:00

normalization.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00

rotary_embedding.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00

utils.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00