plusNew001
d0e9a70380
[CI] add CI logprobs case ( #3189 )
...
* [ci] add CI case
* [ci] add CI case
* [ci] add CI case
* [ci] add CI case
---------
Co-authored-by: ZhangYulongg <1272816783@qq.com>
2025-08-08 15:47:55 +08:00
yzwu
fbdd6b0663
[Iluvatar GPU] Optimize attention and moe performance ( #3234 )
2025-08-08 10:51:24 +08:00
Yzc216
6037dd5d9c
[fix] multi source download ( #3259 )
...
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
* modify model download path
* add requirements
* error optimization
* fallback on connection failure
* fallback on connection failure
* fallback on connection failure
* unit test
* unit test
* unit test
* test
* test
* revise fallback
* Trigger CI
2025-08-07 19:30:39 +08:00
JYChen
9423c577fe
[stop_seq] fix out-bound value for stop sequence ( #3216 )
...
* fix out-bound value for stop sequence
* catch error if there are out-of-bounds value
* check in offline mode
* add ut tests
2025-08-07 15:40:21 +08:00
Divano
5885285e57
Ce add benchmark test ( #3262 )
...
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
* add evil cases
* add benchmark gsm8k
2025-08-07 15:28:30 +08:00
YuBaoku
55ac449c31
[CI] remove useless case ( #3261 )
2025-08-07 15:09:40 +08:00
RAM
820798aec5
[Executor]Update graph test case and delete test_attention ( #3257 )
...
* 1.update graph test case 2.delete test_attention
* code style
* delete print
2025-08-07 14:05:15 +08:00
Yzc216
d9e3f88f9e
[Feature] multi source download ( #3125 )
...
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
* modify model download path
* add requirements
* error optimization
* fallback on connection failure
* fallback on connection failure
* fallback on connection failure
* unit test
* unit test
* unit test
* test
* test
2025-08-07 00:40:27 +08:00
lizexu123
afff4d37ea
[Feature] support seed parameter ( #3161 )
...
* support seed
* fix
* add SamplingMetadata seed test
* The next_tokens values are inconsistent!
* add air and rejection seed test
* fix
* add SamplingParams seed test
* fix seed=0
* Default to default
* fix
* fix args_utils
* fix review
* fix review
* fix
* fix
* add xpu,gcu,iluvatar support seed
* fix
2025-08-06 15:20:47 +08:00
bukejiyu
20839abccf
qwen3_moe ( #3084 )
2025-08-06 14:45:27 +08:00
Divano
91dc87f1c5
add some evil cases ( #3240 )
...
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
* add evil cases
2025-08-06 14:23:55 +08:00
xjkmfa
256a82b0b3
Add ci case for min token and max token ( #3229 )
...
Co-authored-by: xujing43 <xujing43@baidu.com>
2025-08-06 14:10:57 +08:00
yangjianfengo1
89397516a8
[New Feature] Support W4Afp8 MoE GroupGemm ( #3171 )
...
* init
* add multi-threaded compilation
* fix bug
* fix bug
* code style
* add fp16
* replace print with assert
* fix stmatrix
* reduce unit test shapes
* reduce unit test shapes
2025-08-06 10:34:05 +08:00
Yuan Xiaolan
7ce00e597c
support qk norm ( #3145 )
2025-08-05 16:46:14 +08:00
Yuan Xiaolan
af543b7f0f
revise get_moe_scores ( #3164 )
2025-08-05 16:43:07 +08:00
Divano
e24929efa3
Ce add bad cases ( #3215 )
...
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
2025-08-05 16:37:28 +08:00
chen
04fc7eb931
fix test_air_top_p_sampling name ( #3211 )
2025-08-05 15:47:50 +08:00
Divano
9f1936ae28
Ce add repetition early stop cases ( #3213 )
...
* add repetition early stop cases
* add repetition early stop cases
2025-08-05 15:47:28 +08:00
ming1753
14ed75f7d3
[Test] scaled_gemm_f8_i4_f16 skip test while sm != 89 ( #3210 )
2025-08-05 15:25:28 +08:00
yangjianfengo1
40f7f3e0d8
[New Feature] fa3 supports flash mask ( #3184 )
...
* support flash mask
* modify test_flash_mask
* modify test.sh
2025-08-05 12:20:48 +08:00
Divano
fb7a0689cc
add more cases ( #3207 )
2025-08-05 11:17:36 +08:00
RAM
c593e1a39c
[Bug Fix]Fix bug of append attention test case ( #3202 )
2025-08-05 11:04:45 +08:00
Divano
88596c0c63
Add more base chat cases ( #3203 )
...
* add test base class
* fix codestyle
* fix codestyle
* add base chat
2025-08-05 10:24:12 +08:00
lizhenyun01
fe540f6caa
[plugin] Custom model_runner/model support ( #3186 )
...
* support custom model&&model_runner
* fix merge
* add test && update doc
* fix codestyle
* fix unittest
* load model in rl
2025-08-04 18:52:39 -07:00
YuBaoku
3eb9a5df60
[CI] add test_compare_top_logprobs ( #3191 )
2025-08-04 19:49:24 +08:00
SunLei
68bc1d12c0
[Bugfix] Fix uninitialized decoded_token and add corresponding unit test. ( #3195 )
2025-08-04 19:23:58 +08:00
Zero Rains
17f51f0c92
[unittest] fix the bug in test_sampler ( #3157 )
2025-08-04 01:23:25 -07:00
Divano
3bfb2eca92
Update test_base_chat.py ( #3183 )
2025-08-04 15:09:53 +08:00
gaoziyuan
4021d66ea5
[Feature] add fd plugins && rm model_classes ( #3123 )
...
* add fd plugins && rm model_classes
* fix reviews
* add docs
* fix
* fix unittest ci
2025-08-03 19:53:20 -07:00
Divano
66d3bb89ad
Update __init__.py ( #3163 )
...
Improve test base class compatibility
2025-08-04 09:40:09 +08:00
Zhang Yulong
0eb32bb9c8
add cases ( #3155 )
2025-08-01 18:38:57 +08:00
Divano
50db0d7ba9
add case ( #3150 )
...
* add test base class
* fix codestyle
* fix codestyle
* add base chat
2025-08-01 17:30:58 +08:00
JYChen
c34088b0fd
fix stop seq unittest ( #3126 )
2025-08-01 16:50:05 +08:00
Divano
1d93565082
[CE] Add base test class for web server testing ( #3120 )
...
* add test base class
* fix codestyle
* fix codestyle
2025-07-31 23:28:50 +08:00
Zhang Yulong
1a543bca29
Fix test_EB_Lite_serving.py ( #3119 )
...
* Fix test_EB_Lite_serving.py
* fix test_EB_Lite_serving.py
2025-07-31 20:15:25 +08:00
LiqinruiG
25005fee30
[Doc] add chat_template_kwargs and update params docs ( #3103 )
...
* add chat_template_kwargs and update params docs
* add chat_template_kwargs and update params docs
* update enable_thinking
* pre-commit
* update test case
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 19:44:06 +08:00
YUNSHEN XIE
583eae2fd1
fix ci ( #3106 )
...
* fix ci
* disable test_non_streaming_chat_with_min_tokens
2025-07-31 17:25:08 +08:00
Jiang-Jia-Jun
0616c208d2
[Feature] Support include_stop_str_in_output in completion api ( #3096 )
...
* [Feature] Support include_stop_str_in_output in completion api
* Fix ci test
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-30 22:18:48 +08:00
李泳桦
b242150f94
[feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client ( #3058 )
...
* [feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client
* [fix] delete ci test case for enable_thinking
* [fix] add reasoning_parser when server starts
* [fix] fix ci consistency test error with reasoning parser
* [doc] update docs related to metadata
* [fix] cancel enable_thinking default value
2025-07-30 19:25:20 +08:00
AIbin
28fff1b035
Revert "Add uinttest for moe_ffn_wint2. ( #3037 )" ( #3085 )
...
This reverts commit 327e1943fa.
2025-07-30 19:04:07 +08:00
Jiang-Jia-Jun
ffa0f4d99b
[Fix] Fix version function ( #3076 )
...
* [Fix] Fix version function
* Fix commit
* Fix commit
* fix code sync
* Update coverage_run.sh
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-30 16:05:24 +08:00
YuanRisheng
eeadbf332a
delete unused unittest ( #3065 )
2025-07-30 15:11:58 +08:00
Yiqun Liu
327e1943fa
Add uinttest for moe_ffn_wint2. ( #3037 )
...
Change-Id: Ifd452527eaf87ea96c3fa4fa9aeb17729b33c2de
2025-07-30 15:03:09 +08:00
Sunny-bot1
74aa31d15b
[Feature] support bad_words ( #3055 )
...
* support bad_words
* support online infer bad_words
* update
* add CI test
* update
* update
* update
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-07-30 09:31:29 +08:00
zhuzixuan
ad7bb52a28
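Fix error when max_tokens=1 is passed ( #3068 )
...
* Fix error when max_tokens=1 is passed
* Fix error when max_tokens=1 is passed
* Fix error when max_tokens=1 is passed
* Fix error when max_tokens=1 is passed
* Fix error when max_tokens=1 is passed
* Fix error when max_tokens=1 is passed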
2025-07-29 23:49:28 +08:00
Zero Rains
b2f9a42d87
[Feature] Support repetition early stop ( #3024 )
...
* support repetition early stop and support user to set the parameter
* remove log
* fix codestyle
* add the early_stop_config to rollout_config
* update config and EarlyStopper class
* fix the bug for triton
* modify the stop method
* update description
* modify the usage for stop_flags
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-07-29 22:42:54 +08:00
JYChen
dafe02a7b9
[stop sequence] support stop sequence ( #3025 )
...
* stop seqs in multi-ends
* unittest for gpu stop op
* kernel tid==0
2025-07-29 14:17:37 +08:00
李泳桦
69996a40da
[feat] add disable_chat_template in chat api as a substitute for previous raw_request ( #3020 )
...
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request
* [fix] pre-commit code check
2025-07-25 20:57:32 +08:00
EnflameGCU
7634ffb709
[GCU] Add CI ( #3006 )
2025-07-25 10:59:29 +08:00
Zero Rains
0fb37ab7e4
update flake8 version to support pre-commit in python3.12 ( #3000 )
...
* update flake8 version to support pre-commit in python3.12
* polish code
2025-07-24 01:43:31 -07:00