FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-18 14:40:44 +08:00

Files

YuanRisheng 2e9e53ff7e [FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)

* remove max_num_batched_tokens in parallel config

* remove max_num_seqs

* update test case

* fix test

* fix

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>

2025-09-17 10:43:35 +08:00

| Name | Last commit | Date |
| --- | --- | --- |
| sched | [FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116) | 2025-09-17 10:43:35 +08:00 |
| __init__.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| args_utils.py | [FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116) | 2025-09-17 10:43:35 +08:00 |
| common_engine.py | [FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116) | 2025-09-17 10:43:35 +08:00 |
| engine.py | [FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116) | 2025-09-17 10:43:35 +08:00 |
| expert_service.py | fix typo EngineSevice EngineService (#3841) | 2025-09-04 11:20:36 +08:00 |
| kv_cache_interface.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| request.py | fix response processsors (#3826) | 2025-09-04 16:01:25 +08:00 |
| resource_manager.py | [metrics] Add serveral observability metrics (#3868) | 2025-09-08 14:13:13 +08:00 |
| sampling_params.py | [Feature] mm and thinking model support structred output (#2749) | 2025-09-02 16:21:09 +08:00 |