[Benchmark] Support the Completions API (#5700)

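* The benchmark tool supports specifying response_format for constrained-decoding scenarios
* Update backend_request_func.py: the output.success check now handles the case where the reasoning content is truncated for being too long and the reply content is empty
* Update benchmark_serving.py: update benchmark_metrics
* Support the Completions API
* Support the Completions API
* Support the Completions API
* [Benchmark] Support the Completions API
* [Benchmark] Support the Completions API

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>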
2025-12-24 13:28:13 +08:00 · 2025-12-23 19:46:23 +08:00
parent 04c30521dd
commit 99258e19c8
5 changed files with 17 additions and 13 deletions
--- a/benchmarks/backend_request_func.py
+++ b/benchmarks/backend_request_func.py
@@ -273,7 +273,8 @@ async def async_request_eb_openai_chat_completions(
                     # Add metrics statistics; skip empty packets when computing the first token
                    output.metrics = metrics_summary(metrics_list, token_timestamps[1:])

-                    if output.generated_text.strip() == "":
+                    # Handle the case where overlong reasoning content is truncated, leaving the reply content empty
+                    if output.generated_text.strip() == "" and output.reasoning_content.strip() == "":
                        output.success = False
                        output.reasoning_tokens = output.output_tokens
                        output.error = "No generated text found!"
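
The effect of the changed condition in the hunk above can be sketched in isolation. This is a minimal illustration, not the repository's code: `RequestOutput` and `finalize` are hypothetical names introduced here, and only the fields visible in the diff are modeled. It assumes `reasoning_content` is always a string (never `None`).

```python
from dataclasses import dataclass


@dataclass
class RequestOutput:
    """Hypothetical stand-in for the benchmark's output object."""

    generated_text: str = ""
    reasoning_content: str = ""
    output_tokens: int = 0
    reasoning_tokens: int = 0
    success: bool = True
    error: str = ""


def finalize(output: RequestOutput) -> RequestOutput:
    """Apply the post-change success check (hypothetical helper)."""
    # Fail only when BOTH the reply and the reasoning content are empty.
    # A request whose reasoning was truncated before any reply text was
    # produced therefore still counts as successful.
    if output.generated_text.strip() == "" and output.reasoning_content.strip() == "":
        output.success = False
        output.reasoning_tokens = output.output_tokens
        output.error = "No generated text found!"
    return output


# Reasoning present, reply empty: kept as a success after this change.
truncated = finalize(RequestOutput(reasoning_content="thinking...", output_tokens=5))

# Neither reasoning nor reply: still reported as a failure.
empty = finalize(RequestOutput(output_tokens=3))
```

Under the pre-change condition, which tested only `generated_text`, the `truncated` case would have been marked as failed.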