zhink
2c0d853067
add test for CustomAllreduce ( #3313 )
2025-08-12 20:44:47 +08:00
memoryCoderC
c575611a5b
[BugFix] v1/completions add finish_reason ( #3246 )
...
* [BugFix] v1/completions add finish_reason
* update TestOpenAIServingCompletion for merge
---------
Co-authored-by: YUNSHEN XIE <1084314248@qq.com >
2025-08-12 19:40:26 +08:00
JYChen
973ddad91e
fix unittest ( #3328 )
2025-08-11 20:58:24 +08:00
Divano
8bf96217b4
Update test_evil_cases.py
2025-08-11 20:27:02 +08:00
Zero Rains
b23af29d0b
Launch expert_service before kv_cache initialization in worker_process ( #3045 )
...
* launch expert_service before kv_cache initialization
* add two signals to make sure model loading and expert_service launching have finished
* fix the EP bug
* fix ep
* update launching way
* fix ep
* update
* roll back ep
* pre-commit all files
---------
Co-authored-by: RAM <gstian5555@outlook.com >
Co-authored-by: Divano <dddivano@outlook.com >
2025-08-11 19:38:46 +08:00
Zhang Yulong
c27a3dc43b
Update deploy.py ( #3310 )
...
* Update deploy.py
Update the deployment tool
* Update deploy.py
2025-08-11 19:11:57 +08:00
xjkmfa
71018fb62e
【CI case】include total_tokens in the last packet of completion interface stream output ( #3279 )
...
* Add ci case for min token and max token
* 【CI case】include total_tokens in the last packet of completion interface stream output
---------
Co-authored-by: xujing43 <xujing43@baidu.com >
2025-08-11 10:59:47 +08:00
Divano
0b77d396ad
Acc ( #3301 )
...
* add repetition early stop cases
* add repetition early stop cases
* add accuracy cases
2025-08-11 10:22:06 +08:00
Divano
eaae4a580d
Split cases ( #3297 )
...
* add repetition early stop cases
* add repetition early stop cases
* split repetition_early_stop from the base test
2025-08-11 09:38:35 +08:00
YUNSHEN XIE
22255a65aa
add base test ci ( #3225 )
2025-08-08 19:08:55 +08:00
plusNew001
d0e9a70380
[CI] add CI logprobs case ( #3189 )
...
* [ci] add CI case
* [ci] add CI case
* [ci] add CI case
* [ci] add CI case
---------
Co-authored-by: ZhangYulongg <1272816783@qq.com >
2025-08-08 15:47:55 +08:00
yzwu
fbdd6b0663
[Iluvatar GPU] Optimize attention and moe performance ( #3234 )
2025-08-08 10:51:24 +08:00
Yzc216
6037dd5d9c
[fix] multi source download ( #3259 )
...
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
* modify model download path
* add requirements
* error optimization
* fallback on connection failure
* fallback on connection failure
* fallback on connection failure
* unit test
* unit test
* unit test
* test
* test
* revise fallback
* Trigger CI
2025-08-07 19:30:39 +08:00
JYChen
9423c577fe
[stop_seq] fix out-of-bounds value for stop sequence ( #3216 )
...
* fix out-of-bounds value for stop sequence
* catch error if there are out-of-bounds values
* check in offline mode
* add ut tests
2025-08-07 15:40:21 +08:00
Divano
5885285e57
Ce add benchmark test ( #3262 )
...
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
* add evil cases
* add benchmark gsm8k
2025-08-07 15:28:30 +08:00
YuBaoku
55ac449c31
[CI] remove useless case ( #3261 )
2025-08-07 15:09:40 +08:00
RAM
820798aec5
[Executor]Update graph test case and delete test_attention ( #3257 )
...
* 1.update graph test case 2.delete test_attention
* code style
* delete print
2025-08-07 14:05:15 +08:00
Yzc216
d9e3f88f9e
[Feature] multi source download ( #3125 )
...
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
* modify model download path
* add requirements
* error optimization
* fallback on connection failure
* fallback on connection failure
* fallback on connection failure
* unit test
* unit test
* unit test
* test
* test
2025-08-07 00:40:27 +08:00
lizexu123
afff4d37ea
[Feature] support seed parameter ( #3161 )
...
* support seed
* fix
* add SamplingMetadata seed test
* The next_tokens values are inconsistent!
* add air and rejection seed test
* fix
* add SamplingParams seed test
* fix seed=0
* Default to default
* fix
* fix args_utils
* fix review
* fix review
* fix
* fix
* add xpu,gcu,iluvatar support seed
* fix
2025-08-06 15:20:47 +08:00
bukejiyu
20839abccf
qwen3_moe ( #3084 )
2025-08-06 14:45:27 +08:00
Divano
91dc87f1c5
add some evil cases ( #3240 )
...
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
* add evil cases
2025-08-06 14:23:55 +08:00
xjkmfa
256a82b0b3
Add ci case for min token and max token ( #3229 )
...
Co-authored-by: xujing43 <xujing43@baidu.com >
2025-08-06 14:10:57 +08:00
yangjianfengo1
89397516a8
[New Feature] Support W4Afp8 MoE GroupGemm ( #3171 )
...
* init
* add multi-threaded compilation
* fix bug
* fix bug
* code style
* add fp16
* replace print with assert
* fix stmatrix
* reduce unit test shape
* reduce unit test shape
2025-08-06 10:34:05 +08:00
Yuan Xiaolan
7ce00e597c
support qk norm ( #3145 )
2025-08-05 16:46:14 +08:00
Yuan Xiaolan
af543b7f0f
revise get_moe_scores ( #3164 )
2025-08-05 16:43:07 +08:00
Divano
e24929efa3
Ce add bad cases ( #3215 )
...
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
2025-08-05 16:37:28 +08:00
chen
04fc7eb931
fix test_air_top_p_sampling name ( #3211 )
2025-08-05 15:47:50 +08:00
Divano
9f1936ae28
Ce add repetition early stop cases ( #3213 )
...
* add repetition early stop cases
* add repetition early stop cases
2025-08-05 15:47:28 +08:00
ming1753
14ed75f7d3
[Test] scaled_gemm_f8_i4_f16 skip test while sm != 89 ( #3210 )
2025-08-05 15:25:28 +08:00
yangjianfengo1
40f7f3e0d8
[New Feature] fa3 supports flash mask ( #3184 )
...
* support flash mask
* modify test_flash_mask
* modify test.sh
2025-08-05 12:20:48 +08:00
Divano
fb7a0689cc
add more cases ( #3207 )
2025-08-05 11:17:36 +08:00
RAM
c593e1a39c
[Bug Fix]Fix bug of append attention test case ( #3202 )
2025-08-05 11:04:45 +08:00
Divano
88596c0c63
Add more base chat cases ( #3203 )
...
* add test base class
* fix codestyle
* fix codestyle
* add base chat
2025-08-05 10:24:12 +08:00
lizhenyun01
fe540f6caa
[plugin] Custom model_runner/model support ( #3186 )
...
* support custom model&&model_runner
* fix merge
* add test && update doc
* fix codestyle
* fix unittest
* load model in rl
2025-08-04 18:52:39 -07:00
YuBaoku
3eb9a5df60
[CI] add test_compare_top_logprobs ( #3191 )
2025-08-04 19:49:24 +08:00
SunLei
68bc1d12c0
[Bugfix] Fix uninitialized decoded_token and add corresponding unit test. ( #3195 )
2025-08-04 19:23:58 +08:00
Zero Rains
17f51f0c92
[unittest] fix the bug in test_sampler ( #3157 )
2025-08-04 01:23:25 -07:00
Divano
3bfb2eca92
Update test_base_chat.py ( #3183 )
2025-08-04 15:09:53 +08:00
gaoziyuan
4021d66ea5
【Feature】add fd plugins && rm model_classes ( #3123 )
...
* add fd plugins && rm model_classes
* fix reviews
* add docs
* fix
* fix unittest ci
2025-08-03 19:53:20 -07:00
Divano
66d3bb89ad
Update __init__.py ( #3163 )
...
Improve test base class compatibility
2025-08-04 09:40:09 +08:00
Zhang Yulong
0eb32bb9c8
add cases ( #3155 )
2025-08-01 18:38:57 +08:00
Divano
50db0d7ba9
add case ( #3150 )
...
* add test base class
* fix codestyle
* fix codestyle
* add base chat
2025-08-01 17:30:58 +08:00
JYChen
c34088b0fd
fix stop seq unittest ( #3126 )
2025-08-01 16:50:05 +08:00
Divano
1d93565082
[CE] Add base test class for web server testing ( #3120 )
...
* add test base class
* fix codestyle
* fix codestyle
2025-07-31 23:28:50 +08:00
Zhang Yulong
1a543bca29
Fix test_EB_Lite_serving.py ( #3119 )
...
* Fix test_EB_Lite_serving.py
* fix test_EB_Lite_serving.py
2025-07-31 20:15:25 +08:00
LiqinruiG
25005fee30
[Doc] add chat_template_kwargs and update params docs ( #3103 )
...
* add chat_template_kwargs and update params docs
* add chat_template_kwargs and update params docs
* update enable_thinking
* pre-commit
* update test case
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-07-31 19:44:06 +08:00
YUNSHEN XIE
583eae2fd1
fix ci ( #3106 )
...
* fix ci
* disable test_non_streaming_chat_with_min_tokens
2025-07-31 17:25:08 +08:00
Jiang-Jia-Jun
0616c208d2
[Feature] Support include_stop_str_in_output in completion api ( #3096 )
...
* [Feature] Support include_stop_str_in_output in completion api
* Fix ci test
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
2025-07-30 22:18:48 +08:00
李泳桦
b242150f94
[feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client ( #3058 )
...
* [feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client
* [fix] delete ci test case for enable_thinking
* [fix] add reasoning_parser when server starts
* [fix] fix ci consistency test error with reasoning parser
* [doc] update docs related to metadata
* [fix] cancel enable_thinking default value
2025-07-30 19:25:20 +08:00
AIbin
28fff1b035
Revert "Add uinttest for moe_ffn_wint2. ( #3037 )" ( #3085 )
...
This reverts commit 327e1943fa.
2025-07-30 19:04:07 +08:00