FastDeploy/model_executor at 18f4977aecefe1cdf946cd54186dffb78cd7060a - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-25 09:31:38 +08:00

chen 7c1fd19f0f [OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 (#4238)

2025-09-24 16:39:51 +08:00

graph_optimization

[CUDAGraph] Support multi output buffers and merge some fixes from feature/exp_0908 (#4062)

2025-09-15 16:21:30 +08:00

guided_decoding

[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)

2025-09-17 10:43:35 +08:00

[OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 (#4238)

2025-09-24 16:39:51 +08:00

[v1 loader]code style (#4204)

2025-09-23 19:36:00 +08:00

[BugFix] fix qwen3-embedding model tp>1 (#4223)

2025-09-24 14:13:26 +08:00

[Intel HPU] Support intel hpu platform (#4161)

2025-09-24 12:27:50 +08:00

__init__.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

forward_meta.py

[Intel HPU] Support intel hpu platform (#4161)

2025-09-24 12:27:50 +08:00

load_weight_utils.py

[v1 loader]code style (#4204)

2025-09-23 19:36:00 +08:00

pre_and_post_process.py

[Intel HPU] Support intel hpu platform (#4161)

2025-09-24 12:27:50 +08:00

utils.py

[Feature] support pool (#3827)

2025-09-22 14:09:09 +08:00