Commit Graph

111 Commits

Author SHA1 Message Date
SunLei
b9af95cf1c [Feature] Add AsyncTokenizerClient&ChatResponseProcessor with remote encode&decode support. (#3674)
* [Feature] add AsyncTokenizerClient

* add decode_image

* Add response_processors with remote decode support.

* [Feature] add tokenizer_base_url startup argument

* Revert comment removal and restore original content.

* [Feature] Non-streaming requests now support remote image decoding.

* Fix parameter type issue in decode_image call.

* Keep completion_token_ids when return_token_ids = False.

* add copyright
2025-08-30 17:06:26 +08:00
lifulll
72094d4d82 enable dcu ci (#3402) 2025-08-29 10:23:08 +08:00
YUNSHEN XIE
cb166053ba fix test name (#3493)
* fix test name

* update

* update

* fix

* fix

* update

* update

* update

* update

* update

* fix

* update
2025-08-22 23:43:47 +08:00
YUNSHEN XIE
e197894977 add e2e cases (#3476)
* add e2e cases

* fix
2025-08-20 18:50:14 +08:00
xiaolei373
5d131485d8 add error log to file (#3431)
* feat(log):add_request_and_response_log

* feat[log]:add error log to file
2025-08-20 09:52:34 +08:00
YUNSHEN XIE
3a6058e445 Add stable ci (#3460)
* add stable ci

* fix

* update

* fix

* rename tests dir;fix stable ci bug

* add timeout limit

* update
2025-08-20 08:57:17 +08:00
kevin
67298cf4c0 add error traceback info (#3419)
* add error traceback info

* update error msg

* update code

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-19 19:32:04 +08:00
yangjianfengo1
b047681c5d [New Feature] Support 2:4 sparsity for FP8 group GEMM (#3463)
* Support 2:4 sparsity

* code style

* Add a macro-definition check for stmatrix

* code style
2025-08-19 02:54:47 -07:00
ltd0924
d587fb257f [CI] add test generation demo (#3270)
* Create test_generation.py

* update

* update

* format

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update setup.py

* Delete test/plugins/test_model_runner_register.py

---------

Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-19 17:12:40 +08:00
Zero Rains
8b12c80f90 [FixBug] compute early stopping with real batch size (#3418)
* [FixBug] compute early stopping with real batch size

* update

* fix test_sampler
2025-08-18 22:09:21 -07:00
luukunn
3a7a20d191 [Feature] Pass through the chat_template_kwargs to the data processing module (#3421)
* fix chat_template_args

* fix args

* add offline

* add offline

* fix

* fix

* fix default enable_thinking value

* fix default enable_thinking value

* modify condition

* Revert "modify condition"

This reverts commit 26430bdeb1.

* fix unit test
2025-08-19 10:50:01 +08:00
zhuzixuan
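c95b3395e9 [BugFix] Support echo in the completion interface (#3245)
* wenxin-tools-511: fix v1/completion being unable to echo the prompt.

* Support echo for multiple prompts

* Support streaming echo with multiple prompts

* Add unit tests for echo support in the completion interface

* pre-commit

* Remove redundant test files

* Fix the unit test method for echo support in the completion interface

* Add unit test files

* Add unit tests

* unittest

* Add unit tests

* Fix unit tests

* Remove unnecessary assert.

* Resubmit

* Update test method

* ut

* Unit test to verify whether the approach is correct

* Unit test to verify whether the approach is correct

* Unit test to verify whether the approach is correct (3)

* Optimize unit test code and narrow the unit test scope in a targeted way.

* Optimize unit test code (2) and narrow the unit test scope in a targeted way.

* Optimize unit test code (3) and narrow the unit test scope in a targeted way.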

* support 'echo' in chat/completion.

* update

* update

* update

* update

* update

* update

* Add unit tests for token IDs

* update

* Fix index error

* Fix index error
2025-08-19 10:41:51 +08:00
luukunn
9c129813f9 [Feature] add custom chat template (#3251)
* add custom chat_template

* add custom chat_template

* add unittest

* fix

* add docs

* fix comment

* add offline chat

* fix unit test

* fix unit test

* fix

* fix pre commit

* fix unit test

* add unit test

* add unit test

* add unit test

* fix pre_commit

* fix enable_thinking

* fix pre commit

* fix pre commit

* fix unit test

* add requirements
2025-08-18 16:34:08 +08:00
Jundong Liu
ea4a3b479c [Executor] Increase buffer size to prevent address corruption; add forward metadata debug tool (#3404)
* Fix insufficient buffer allocation; add a tool for printing forward metadata

* fix mistake

* Make CPU tensor in CPUPlace

* Add test about forward_meta_str and Add unitest_requirement

---------

Co-authored-by: RAM <gstian5555@outlook.com>
2025-08-18 16:14:09 +08:00
Divano
246cd7b3a5 Perf (#3453)
* add repetition early stop cases

* add repetition early stop cases

* add stress tool
2025-08-18 15:37:46 +08:00
Zhang Yulong
3ee6053e5d Add ci case (#3355)
* add ci cases

* debug

debug H20 baseline

* Update run_pre_ce.sh

* Update test_EB_Lite_serving.py

* Update test_EB_VL_Lite_serving.py

* Update test_EB_Lite_serving_mtp.py

* Update test_Qwen3-MoE_serving.py

* Update test_Qwen2-7B-Instruct_serving.py

* Update run_pre_ce.sh
2025-08-18 11:35:56 +08:00
YUNSHEN XIE
cc8ee50f27 add accuracy check ci (#3389)
* add accuracy ci

* fix

* fix

* update

* rename ci jobs
2025-08-15 15:17:43 +08:00
xjkmfa
ab60292f89 [CI] evil case (#3359)
* Add ci case for min token and max token

* [CI case] include total_tokens in the last packet of completion interface stream output

* Edge-case checks and adversarial tests

* Edge-case checks and adversarial tests

* Edge-case checks and adversarial tests

* Edge-case checks and adversarial tests

---------

Co-authored-by: xujing43 <xujing43@baidu.com>
2025-08-14 20:00:47 +08:00
Sunny-bot1
79d8ae4c38 [UT Fix] Fix bad_words test (#3385)
* fix bad_words test

* add streaming

* fix

* fix
2025-08-14 03:55:02 -07:00
lzy
1e06b9fa6d make append_attn support mask_offset (#3138)
* make append_attn support mask_offset

* add unittest
2025-08-14 03:40:55 -07:00
memoryCoderC
6031f9a5f5 [BugFix] fix ErnieProcessor not set raw_prediction (#3400) 2025-08-14 18:07:49 +08:00
gaoziyuan
0ea8712018 fix op tests (#3398) 2025-08-14 16:45:25 +08:00
memoryCoderC
f702a675a1 fix TestOpenAIServingCompletion fail (#3368) 2025-08-13 15:45:07 +08:00
EnflameGCU
d1a92e3e17 [GCU] Enable gcu CI (#3190)
* [GCU] Update to the latest version

* [GCU] Enable CI
2025-08-13 11:48:24 +08:00
Sunny-bot1
8224b21525 Refactor moe_topk_select op to use apply_norm_weight as a template parameter (#3345)
* Refactor moe_topk_select op to use apply_norm_weight as a template parameter

* update test
2025-08-13 08:44:16 +08:00
zhink
2c0d853067 add test for CustomAllreduce (#3313)
2025-08-12 20:44:47 +08:00
memoryCoderC
c575611a5b [BugFix] v1/completions add finish_reason (#3246)
* [BugFix] v1/completions add finish_reason

* update TestOpenAIServingCompletion for merge

---------

Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-12 19:40:26 +08:00
JYChen
973ddad91e fix unittest (#3328) 2025-08-11 20:58:24 +08:00
Divano
8bf96217b4 Update test_evil_cases.py 2025-08-11 20:27:02 +08:00
Zero Rains
b23af29d0b Launch expert_service before kv_cache initialization in worker_process (#3045)
* launch expert_service before kv_cache initialization

* add two signals to make sure model loading and expert_service launching have finished

* fix the EP bug

* fix ep

* update launching way

* fix ep

* update

* rollback ep

* pre-commit all files

---------

Co-authored-by: RAM <gstian5555@outlook.com>
Co-authored-by: Divano <dddivano@outlook.com>
2025-08-11 19:38:46 +08:00
Zhang Yulong
c27a3dc43b Update deploy.py (#3310)
* Update deploy.py

Update the deployment tool

* Update deploy.py
2025-08-11 19:11:57 +08:00
xjkmfa
71018fb62e [CI case] include total_tokens in the last packet of completion interface stream output (#3279)
* Add ci case for min token and max token

* [CI case] include total_tokens in the last packet of completion interface stream output

---------

Co-authored-by: xujing43 <xujing43@baidu.com>
2025-08-11 10:59:47 +08:00
Divano
0b77d396ad Acc (#3301)
* add repetition early stop cases

* add repetition early stop cases

* add accuracy cases
2025-08-11 10:22:06 +08:00
Divano
eaae4a580d Split cases (#3297)
* add repetition early stop cases

* add repetition early stop cases

* split repetition_early_stop from the base test
2025-08-11 09:38:35 +08:00
YUNSHEN XIE
22255a65aa add base test ci (#3225) 2025-08-08 19:08:55 +08:00
plusNew001
d0e9a70380 [CI] add CI logprobs case (#3189)
* [ci] add CI case

* [ci] add CI case

* [ci] add CI case

* [ci] add CI case

---------

Co-authored-by: ZhangYulongg <1272816783@qq.com>
2025-08-08 15:47:55 +08:00
yzwu
fbdd6b0663 [Iluvatar GPU] Optimize attention and moe performance (#3234) 2025-08-08 10:51:24 +08:00
Yzc216
6037dd5d9c [fix] multi source download (#3259)
* multi-source download

* multi-source download

* huggingface download revision

* requirement

* style

* add revision arg

* test

* pre-commit

* Change default download

* change requirements.txt

* modify English Documentation

* documentation

* modify model download path

* add requirements

* error optimization

* Fallback on connection failure

* Fallback on connection failure

* Fallback on connection failure

* unit test

* unit test

* unit test

* test

* test

* Revise fallback handling

* Trigger CI
2025-08-07 19:30:39 +08:00
JYChen
9423c577fe [stop_seq] fix out-bound value for stop sequence (#3216)
* fix out-bound value for stop sequence

* catch error if there are out-of-bounds values

* check in offline mode

* add ut tests
2025-08-07 15:40:21 +08:00
Divano
5885285e57 Ce add benchmark test (#3262)
* add repetition early stop cases

* add repetition early stop cases

* add bad cases

* add bad cases

* add evil cases

* add benchmark gsm8k
2025-08-07 15:28:30 +08:00
YuBaoku
55ac449c31 [CI] remove useless case (#3261) 2025-08-07 15:09:40 +08:00
RAM
820798aec5 [Executor]Update graph test case and delete test_attention (#3257)
* 1.update graph test case 2.delete test_attention

* code style

* delete print
2025-08-07 14:05:15 +08:00
Yzc216
d9e3f88f9e [Feature] multi source download (#3125)
* multi-source download

* multi-source download

* huggingface download revision

* requirement

* style

* add revision arg

* test

* pre-commit

* Change default download

* change requirements.txt

* modify English Documentation

* documentation

* modify model download path

* add requirements

* error optimization

* Fallback on connection failure

* Fallback on connection failure

* Fallback on connection failure

* unit test

* unit test

* unit test

* test

* test
2025-08-07 00:40:27 +08:00
lizexu123
afff4d37ea [Feature] support seed parameter (#3161)
* support seed

* fix

* add SamplingMetadata seed test

* The next_tokens values are inconsistent!

* add air and rejection seed test

* fix

* add SamplingParams seed test

* fix seed=0

* Default to default

* fix

* fix args_utils

* fix review

* fix review

* fix

* fix

* add xpu,gcu,iluvatar support seed

* fix
2025-08-06 15:20:47 +08:00
bukejiyu
20839abccf qwen3_moe (#3084) 2025-08-06 14:45:27 +08:00
Divano
91dc87f1c5 add some evil cases (#3240)
* add repetition early stop cases

* add repetition early stop cases

* add bad cases

* add bad cases

* add evil cases
2025-08-06 14:23:55 +08:00
xjkmfa
256a82b0b3 Add ci case for min token and max token (#3229)
Co-authored-by: xujing43 <xujing43@baidu.com>
2025-08-06 14:10:57 +08:00
yangjianfengo1
89397516a8 [New Feature] Support W4Afp8 MoE GroupGemm (#3171)
* init

* Add multi-threaded compilation

* fix bug

* fix bug

* code style

* Add fp16

* Replace print with assert

* Fix stmatrix

* Reduce unit test shapes

* Reduce unit test shapes
2025-08-06 10:34:05 +08:00
Yuan Xiaolan
7ce00e597c support qk norm (#3145) 2025-08-05 16:46:14 +08:00
Yuan Xiaolan
af543b7f0f revise get_moe_scores (#3164) 2025-08-05 16:43:07 +08:00