FastDeploy/fastdeploy/model_executor at b87e2c6184b1d918b60a528aecfd54aa877e2403 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Ryan b87e2c6184 [CUDAGraph]Add support for custom all-reduce operators under SOT mode (#4386)

2025-10-16 19:31:19 +08:00

graph_optimization

[CUDAGraph]Add support for custom all-reduce operators under SOT mode (#4386)

2025-10-16 19:31:19 +08:00

guided_decoding

[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)

2025-09-17 10:43:35 +08:00

[XPU] refine fused moe (#4219)

2025-10-16 19:04:07 +08:00

[v1 loader]code style (#4204)

2025-09-23 19:36:00 +08:00

[FDConfig]Remove max_model_len in FDConfig (#4350)

2025-10-11 14:04:17 +08:00

[Intel HPU] Support intel hpu platform (#4161)

2025-09-24 12:27:50 +08:00

__init__.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

forward_meta.py

[Optimization] Fuse get_max_len and get_kv_max_len (#4369)

2025-10-13 20:35:00 +08:00

load_weight_utils.py

[v1 loader]code style (#4204)

2025-09-23 19:36:00 +08:00

pre_and_post_process.py

perf: optimize ZMQ communication with async queue and single-threaded… (#4444)

2025-10-16 15:46:26 +08:00

utils.py

V1 loader default (#4251)

2025-10-15 16:49:17 +08:00