[Feature] enable guided decoding when ENABLE_V1_KVCACHE_SCHEDULER = 1 (#5140)

* enable guided decoding when ENABLE_V1_KVCACHE_SCHEDULER = 1

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Daci
2025-11-26 10:22:35 +08:00
committed by GitHub
parent 2d787590c4
commit f25ee3a26f
3 changed files with 38 additions and 5 deletions
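
A minimal sketch of the behavior change this commit describes, assuming that the two lines dropped from `EngineArgs` (the hunk shrinks from 8 lines to 6) are the guided-decoding check. The helper `resolve_v1_scheduler` and its parameters are hypothetical and exist only for illustration; they are not part of the project's API, and `"xgrammar"` stands in for any backend value other than `"off"`.

```python
def resolve_v1_scheduler(
    requested: int,
    platform_supported: bool,
    guided_decoding_backend: str,
    legacy: bool = False,
) -> int:
    """Return the effective ENABLE_V1_KVCACHE_SCHEDULER value (hypothetical helper)."""
    # Unsupported platforms always fall back to the old scheduler.
    if not platform_supported:
        return 0
    # Assumed pre-commit behavior: any guided decoding backend other than
    # "off" also forced the flag to 0.
    if legacy and guided_decoding_backend != "off":
        return 0
    # Assumed post-commit behavior: guided decoding no longer overrides the flag.
    return requested


# Before the change, guided decoding disabled the V1 scheduler.
before = resolve_v1_scheduler(1, True, "xgrammar", legacy=True)
# After the change, the requested value is kept.
after = resolve_v1_scheduler(1, True, "xgrammar")
print(before, after)
```

Under these assumptions, the only remaining automatic override in this hunk is the platform check.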

@@ -535,8 +535,6 @@ class EngineArgs:
         if not (current_platform.is_cuda() or current_platform.is_xpu() or current_platform.is_maca()):
             envs.ENABLE_V1_KVCACHE_SCHEDULER = 0
-        if self.guided_decoding_backend != "off":
-            envs.ENABLE_V1_KVCACHE_SCHEDULER = 0
         if "PaddleOCR" in get_model_architecture(self.model, self.model_config_name):
             envs.FD_ENABLE_MAX_PREFILL = 1