xiegegege
b7e1e6c953
[CE]change yaml name
2025-12-04 19:14:11 +08:00
tianlef
04d35ace5e
[CE]add wint4 ep ( #5355 )
2025-12-03 15:17:47 +08:00
Zhang Yulong
5b49142988
update ( #5298 )
2025-11-28 18:29:16 +08:00
xiegegege
eae34a416c
[benchmark]add qwen3-235b pd+ep yaml ( #5225 )
2025-11-25 19:53:30 +08:00
tianlef
de43577a7c
[Docs] add ebvlthinking yaml ( #5120 )
2025-11-19 15:27:46 +08:00
Zhang Yulong
83532e1d01
[Benchmark] Enhance benchmark output logging ( #4682 )
...
* Enhance benchmark output logging
Add print statements to display the number of discarded outputs before and after filtering.
* Update benchmark_serving.py
2025-11-06 16:53:31 +08:00
Juncai
08ca0f6aea
[Feature] [PD] add simple router and refine splitwise deployment ( #4709 )
...
* add simple router and refine splitwise deployment
* fix
2025-11-06 14:56:02 +08:00
zhang-prog
4c2ad15258
add paddleocr_vl benchmark ( #4833 )
...
* add paddleocr_vl benchmark
* fix
* fix
* fix
* fix
2025-11-05 19:37:45 +08:00
ophilia-lee
412097c1b8
Benchmark tool supports specifying response_format for constrained decoding scenarios ( #4718 )
2025-10-31 12:26:24 +08:00
Ryan
28de91b50f
[Graph Optimization] SOT+CUDAGraph support ERNIE4.5T VL 28B / 424B ( #4645 )
...
* 45TVL support sot+CUDAGraph
* mv unittest from ce_deploy 2 e2e
* add test_EB_VL_Lite_sot_serving
* rm useless line
* add openai_client
* fix unittest && reduce computing resources
2025-10-31 11:38:43 +08:00
kxz2002
a2870ed4a9
[Feature] Unify the registration name recognition for tool_parser and reasoning_parser to “-” ( #4668 )
...
* parser register name unify
* change ernie_x1 to ernie-x1
* change ernie4_5_vl to ernie-45-vl
* fix unit test
2025-10-31 10:45:27 +08:00
xjkmfa
19df1aec2b
[Docs] add Qwen25vl yaml ( #4662 )
...
* Add ci case for min token and max token
* 【CI case】include total_tokens in the last packet of completion interface stream output
* 【CE】add qwen25-vl
* 【CE】add qwen25-vl
---------
Co-authored-by: xujing43 <xujing43@baidu.com>
2025-10-29 17:39:40 +08:00
RAM
86d5006a57
[Graph Optimization][Speculative Decoding] Update yaml and fix typo ( #4612 )
2025-10-28 11:43:26 +08:00
ophilia-lee
70aa7423f8
Adapt benchmark tool to the SGLang framework ( #4607 )
...
* Adapt benchmark tool to the SGLang framework
* Adapt benchmark tool to the SGLang framework
* Adapt benchmark tool to the SGLang framework
2025-10-27 18:52:56 +08:00
tianlef
2676a918f0
[Doc]fix deepseek ce ( #4560 )
2025-10-23 14:09:11 +08:00
tianlef
153f15db39
[Doc]add deepseek wint4 ce ( #4517 )
2025-10-21 16:41:51 +08:00
RAM
775edcc09a
[Executor] Default use CUDAGraph ( #3594 )
...
* add start intercept
* Adjustment GraphOptConfig
* pre-commit
* default use cudagraph
* set default value
* default use cuda graph
* pre-commit
* fix test case bug
* disable rl
* fix moba attention
* only support gpu
* Temporarily disable PD Disaggregation
* set max_num_seqs of test case as 1
* set max_num_seqs and temperature
* fix max_num_batched_tokens bug
* close cuda graph
* success run wint2
* profile run with max_num_batched_tokens
* 1.add c++ memchecker 2.success run wint2
* update a800 yaml
* update docs
* 1. delete check 2. fix plas attn test case
* default use use_unique_memory_pool
* add try-except for warmup
* ban mtp, mm, rl
* fix test case mock
* fix ci bug
* fix form_model_get_output_topp0 bug
* fix ci bug
* refine deepseek ci
* refine code
* Disable PD
* fix sot yaml
2025-10-21 14:25:45 +08:00
Zhang Yulong
10e85daf15
update benchmark scripts ( #4497 )
2025-10-20 17:03:10 +08:00
Zhang Yulong
8f77adc381
Add data dictionary for API response processing ( #4454 )
...
Initialize data dictionary for response handling.
2025-10-16 17:23:11 +08:00
Zhang Yulong
98f8c3703a
Add filtering for failed requests in benchmark outputs ( #4448 )
...
Filter out requests with end_timestamp == 0.0
2025-10-16 14:57:47 +08:00
Zhang Yulong
9dc3968c13
[benchmark] Fix benchmark duration calculation logic ( #4446 )
...
* Fix benchmark duration calculation logic
Calculate benchmark duration using filtered outputs.
* Fix benchmark duration calculation using benchmark_outputs
2025-10-16 14:36:29 +08:00
Zhang Yulong
7f94f063ff
Update benchmark_serving.py ( #4438 )
...
Discarded requests are still saved for result analysis
2025-10-15 20:36:19 +08:00
Zhang Yulong
c4f866c457
update benchmark tools ( #4416 )
2025-10-15 11:15:25 +08:00
tianlef
14eb8b4f8b
add x1 a3b quantization ( #4397 )
2025-10-14 15:04:06 +08:00
tianlef
8a964329f4
add glm benchmark yaml ( #4289 )
2025-09-26 14:23:29 +08:00
Zhang Yulong
5532e8a323
[FD CLI] Add bench cli ( #4160 )
...
* add bench cli
* Update test_main.py
2025-09-22 20:37:30 +08:00
co63oc
c4830ef24c
fix typos ( #4176 )
...
* fix typos
* fix
2025-09-22 14:27:17 +08:00
tianlef
e79a1a7938
x1_a3b config ( #4135 )
2025-09-16 19:44:46 +08:00
xiegegege
d682c97dd3
[benchmark]add lite-vl and x1 yaml ( #4130 )
2025-09-16 16:38:36 +08:00
tianlef
83bf1fd5aa
[Doc]add plas attention config ( #4128 )
2025-09-16 15:55:12 +08:00
tianlef
0bc7d076fc
[CE]add x1 w4a8c8 benchmark config ( #3607 )
...
* [CE]add x1 w4a8c8 benchmark config
* [CE]add x1 w4a8c8 benchmark config
* [CE]add x1 w4a8c8 benchmark config
2025-08-26 11:27:32 +08:00
Zhang Yulong
9ff2dfb162
Create eb45-8k-fp8-tp1-dp8_ep.yaml ( #3485 )
...
Hybrid-architecture EP-parallel yaml
2025-08-20 14:33:54 +08:00
yinwei
776fb03250
add error info ( #3040 )
2025-07-28 15:10:28 +08:00
Zhang Yulong
5151bc92c8
Update benchmark tools ( #3004 )
...
* update benchmark tools
* update benchmark tools
2025-07-24 15:19:23 +08:00
xiegegege
e3a843f2c5
[benchmark] add quantization for benchmark yaml ( #2995 )
2025-07-24 13:26:34 +08:00
Zero Rains
25698d56d1
polish code with new pre-commit rule ( #2923 )
2025-07-19 23:19:27 +08:00
RAM
0fad10b35a
[Executor] CUDA Graph support padding batch ( #2844 )
...
* cuda graph support padding batch
* Integrate the startup parameters for the graph optimization backend and provide support for user-defined capture sizes.
* Do not insert max_num_seqs when the user specifies a capture list
* Support set graph optimization config from YAML file
* update cuda graph ci
* fix ci bug
* fix ci bug
2025-07-15 19:49:01 -07:00
ophilia-lee
33db137d0b
Add yaml of default request parameters for vLLM
2025-07-15 19:31:27 +08:00
lijingning
9d6a42b334
Adapt to vLLM having no arrival_time; adapt to vLLM requiring the model field; add test case number field no to RequestFuncInput/RequestFuncOutput/SampleRequest
2025-07-15 19:31:27 +08:00
GoldPancake
f7cad30a38
[Feature] Add speculative decoding simulation benchmark. ( #2751 )
...
* Add speculative decoding simulation benchmark
* Fix the name of the parameter
2025-07-09 12:08:43 +08:00
Divano
050d9658a5
Update requirements.txt
2025-07-04 09:53:03 +08:00
Divano
be5cabaf80
add quick benchmark ( #2703 )
...
Test scripts do not need to go through CI
2025-07-04 09:32:36 +08:00
Zhang Yulong
264ddfdf8a
Update README.md
2025-06-30 10:28:15 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00