[fix] Modify follow-up push parameters and the verification method for thinking length (#4177)

* [fix] Modify follow-up push parameters and the verification method for thinking length (#4086)

* Rename the continuation parameter `generated_token_ids` to `completion_token_ids`; change how the thinking length is validated

* add completion_token_ids

* add logger

* fix reasoning_max_tokens ParameterError

* add unittest

* add unit test

* fix
This commit is contained in:
luukunn
2025-09-22 21:12:05 +08:00
committed by GitHub
parent 0358329946
commit 6b47773bd6
6 changed files with 75 additions and 24 deletions
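The commit message above describes renaming the continuation-request parameter `generated_token_ids` to `completion_token_ids`. As a rough sketch only (the helper name and payload shape here are hypothetical, not taken from this repository), a backward-compatible handling of such a rename could look like:

```python
def normalize_completion_params(payload: dict) -> dict:
    """Map the legacy key to the new one.

    Hypothetical helper: the actual codebase may handle the rename
    differently. This only illustrates the generated_token_ids ->
    completion_token_ids rename described in the commit message.
    """
    payload = dict(payload)  # avoid mutating the caller's dict
    if "generated_token_ids" in payload and "completion_token_ids" not in payload:
        payload["completion_token_ids"] = payload.pop("generated_token_ids")
    return payload
```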


@@ -255,8 +255,13 @@ class EngineClient:
             raise ValueError(f"max_tokens can be defined [1, {self.max_model_len}).")
         if data.get("reasoning_max_tokens") is not None:
-            if data["reasoning_max_tokens"] > data["max_tokens"] or data["reasoning_max_tokens"] < 1:
-                raise ValueError("reasoning_max_tokens must be between max_tokens and 1")
+            if data["reasoning_max_tokens"] < 1:
+                raise ValueError("reasoning_max_tokens must be greater than 1")
+            if data["reasoning_max_tokens"] > data["max_tokens"]:
+                data["reasoning_max_tokens"] = data["max_tokens"]
+                api_server_logger.warning(
+                    f"req_id: {data['request_id']}, reasoning_max_tokens exceeds max_tokens, the value of reasoning_max_tokens will be adjusted to match that of max_tokens"
+                )
         if data.get("top_p") is not None:
             if data["top_p"] > 1 or data["top_p"] < 0:
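The hunk above replaces a hard failure with a clamp: a `reasoning_max_tokens` below 1 still raises, but a value above `max_tokens` is now lowered to `max_tokens` with a warning instead of rejecting the request. A standalone sketch of that logic (the function name and logger setup are illustrative, not from the codebase):

```python
import logging

logger = logging.getLogger("api_server")

def check_reasoning_max_tokens(data: dict) -> dict:
    """Validate and clamp reasoning_max_tokens against max_tokens.

    Mirrors the behavior introduced in the diff: values below 1 are
    rejected, while values above max_tokens are clamped with a warning
    so oversized requests still succeed.
    """
    if data.get("reasoning_max_tokens") is not None:
        if data["reasoning_max_tokens"] < 1:
            raise ValueError("reasoning_max_tokens must be greater than 1")
        if data["reasoning_max_tokens"] > data["max_tokens"]:
            data["reasoning_max_tokens"] = data["max_tokens"]
            logger.warning(
                f"req_id: {data.get('request_id')}, reasoning_max_tokens "
                f"exceeds max_tokens, adjusted to match max_tokens"
            )
    return data
```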