FastDeploy/model_executor at 504461b6b5174d6fadc67db0b7f0fbe27eba7f17 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-30 03:22:05 +08:00

yzwu 504461b6b5 [Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)

2025-09-22 21:13:59 +08:00

graph_optimization

[CUDAGraph] Support multi output buffers and merge some fixes from feature/exp_0908 (#4062)

2025-09-15 16:21:30 +08:00

guided_decoding

[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)

2025-09-17 10:43:35 +08:00

[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)

2025-09-22 21:13:59 +08:00

[Feature] support pool (#3827)

2025-09-22 14:09:09 +08:00

[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)

2025-09-22 21:13:59 +08:00

[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)

2025-09-22 21:13:59 +08:00

__init__.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

forward_meta.py

【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)

2025-09-11 10:46:09 +08:00

load_weight_utils.py

[v1 loader]qwen Offline fp8 (#4036)

2025-09-15 13:44:11 +08:00

pre_and_post_process.py

[Feature] Support pd ep deployment with yiyan adapter (#4029)

2025-09-22 16:41:38 +08:00

utils.py

[Feature] support pool (#3827)

2025-09-22 14:09:09 +08:00