Commit Graph

381 Commits

Author SHA1 Message Date
plusNew001
3665c283b5 [XPU] [CI]Change CI to multi-concurrency (#4866)
* Refactor GPU ID logic in CI workflow

Updated GPU ID assignment logic and removed unused port calculations.

* Refactor GPU device and port configuration

* Update engine_worker_queue_port calculation logic

* Refactor XPU_VISIBLE_DEVICES export logic

* Adjust service port based on GPU ID

* Adjust service HTTP port based on GPU ID

* Adjust service_http_port based on GPU_ID

* Add import for os module in run_45T.py

* Update run_45vl.py

* Import os module in run_w4a8.py

Added import for os module to use environment variables.

* Remove duplicate import of os module

* Remove duplicate import of os module

* Update run_45T.py

* Update run_w4a8.py

* fix bug

* fix bug

* Update run_w4a8.py

* Fix directory change command in run_ci_xpu.sh
2025-11-10 21:09:48 +08:00
Sunny-bot1
59d2edde29 [BugFix] Add support for weight shape constraints and group size selection in Machete (#4911) 2025-11-10 20:57:35 +08:00
Echo-Nie
112623e33e init version, exist some bugs, waiting fix (#4906)
2025-11-10 14:16:09 +08:00
luukunn
41c0bef964 [BugFix] When the value of "temperature" is 0, adjust it to 1e-06 (#4900)
* add default temperature value

* add unit test

* update

* update

* add unit test

* update

* fix unit test
2025-11-10 13:24:33 +08:00
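The rule named in the commit title above (a temperature of 0 becomes 1e-06) can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `normalize_temperature`, the constant `MIN_TEMPERATURE`, and the handling of `None` are assumptions made for the example.

```python
# Hypothetical sketch of the rule in the commit title; all names are illustrative.
MIN_TEMPERATURE = 1e-06  # value taken from the commit title


def normalize_temperature(temperature):
    """Return a temperature that is safe to use as a divisor when scaling logits."""
    if temperature is None:
        # Assumption: an unset temperature is left for the caller's defaults.
        return None
    if temperature == 0:
        # A zero temperature would divide by zero, so use a tiny positive value.
        return MIN_TEMPERATURE
    return temperature
```

Replacing 0 with a very small positive value keeps sampling effectively greedy while avoiding a division by zero.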
plusNew001
0a3bc84f71 [XPU][CI]Update test assertion and base response value (#4907) 2025-11-10 11:44:54 +08:00
kxz2002
87911b7cf1 [Feature] Enable FastDeploy to support adding the “--api-key” authentication parameter. (#4806)
* add api key initial commit

* add unit test

* modify unit test

* move middleware to a single file and add unit tests
2025-11-08 18:24:02 +08:00
plusNew001
fa098383f6 [XPU][CI] CI bug fix (#4889)
* Refactor test_45t by commenting out responses

Comment out base response variables and update assertion.

* Update run_w4a8.py

* Fix assertion syntax in run_45T.py
2025-11-07 17:50:11 +08:00
ming1753
cba185f1fe [Feature] Optim PaddleOCR-VL (#4873)
* [Feature] Optim PaddleOCR-VL

* fix bug
2025-11-07 14:56:44 +08:00
YuBaoku
fa28745f19 [CI] Update ERNIE-4.5-VL baseline to adapt to MoE changes (#4867)
2025-11-06 22:02:10 +08:00
kevin
cc34487810 [Feature] support mm disable_chunked (#4803)
* support mm disable_chunked

* update code

* update code

* update code
2025-11-06 21:32:25 +08:00
YuBaoku
a139f8f3cb [CI] Optimize port cleanup logic (#4860) 2025-11-06 19:13:48 +08:00
Zhang Yulong
5aa73d32f4 Update deploy.py (#4850) 2025-11-06 19:09:28 +08:00
YuBaoku
819b2dbbae Revert "【New Feature】W4afp8 supports per group quantization (#4272)" (#4854)
This reverts commit 93fcf7e4ec.
2025-11-06 17:48:28 +08:00
Juncai
08ca0f6aea [Feature] [PD] add simple router and refine splitwise deployment (#4709)
* add simple router and refine splitwise deployment

* fix
2025-11-06 14:56:02 +08:00
plusNew001
fc8bef2c95 [XPU][CI]Change ci vl model to 28 b (#4764)
* Update XPU_VISIBLE_DEVICES and model parameters

* Update base response and adjust max tokens

* Implement process cleanup in CI workflow

Add process cleanup commands to prevent port conflicts

* Remove process cleanup commands from CI workflow

Removed old process cleanup commands to prevent port conflicts.
2025-11-06 14:12:23 +08:00
Echo-Nie
354ddc8bc5 [CI] Add unittest for activation, native_paddle_backend, w4a8, w4afp8, platforms/utils (#4812)
* add unittest for activation, native_paddle_backend, w4a8, w4afp8, platforms/utils

* Remove activation function retrieval tests

Removed tests for valid and unsupported activation function retrieval.

* move w4a8, w4afp8 to quantization

* fix code style
2025-11-06 14:08:00 +08:00
SunLei
782818c031 fix: ci port conflict (#4840)
2025-11-06 11:56:17 +08:00
kxz2002
5bdd40da5d [BugFix] Fix ernie_vl_reasoning_parsers.py 'end_token' to 'think_end_token' (#4805)
* fix ernie_vl_reasoning_parsers.py 'end_token' to 'think_end_token'

* add unit tests
2025-11-06 11:28:55 +08:00
yangjianfengo1
93fcf7e4ec 【New Feature】W4afp8 supports per group quantization (#4272)
* w4afp8 supports per group

* code style

* accuracy completed

* revert append attn utils

* ffn1 dynamic quantization

* ffn2 supports dynamic quantization

* code style

* code style

* modify unit tests

* modify unit tests

* fix bug

* Implement conditional parameter creation for layers

Add parameter creation for up_gate_proj_in_scale when ep_size > 1.

* code style

* fix conflict

* code style

* code style

* fix w4aint8 accuracy

* fix ci

---------

Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
2025-11-05 21:00:23 +08:00
chenjian
cc8f5312f5 [Feature] Add timestamp for profiler (#4726)
* [Feature] Add timestamp for profiler

* fix bug for offline inference

* fix for ci

* fix

* fix ci
2025-11-05 12:04:59 +08:00
周周周
876e4a8935 remove input_ids from ForwardMeta (#4793)
2025-11-05 11:55:51 +08:00
kxz2002
9676cc87d6 fix parser register name (#4795)
Co-authored-by: luukunn <83932082+luukunn@users.noreply.github.com>
2025-11-05 11:27:30 +08:00
zhupengyang
2fd254e5b7 support ep+tp at op layer (#4688) 2025-11-05 11:15:57 +08:00
周周周
937eb3c6ed [get_padding_offset.] clean get_padding_offset.cu (#4777)
2025-11-05 10:47:40 +08:00
Haonan Luo
2c281e617c Update Unit Test for PaddleOCR-VL (#4802)
* fix paddleocr prefix cache bug

* add test for paddleocr_vl

* disable prefix-caching in ocr

* add test for paddleocr_vl

* Fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

---------

Co-authored-by: ming1753 <ideaminghp@163.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
2025-11-04 22:40:15 +08:00
YuBaoku
722110a952 [CI] Refactor CE wheel upload for multiple target paths (#4790)
* [CI] Refactor CE wheel upload for multiple target paths

* [CI] fix test_streaming_with_stop_str error
2025-11-04 18:56:38 +08:00
kxz2002
8a40374bfe [BugFix] Fix ernie4_5_vl_processor.py and qwen_vl_processor.py can not disable thinking (#4762)
* fix ernie4_5_vl_processor.py and qwen_vl_processor.py

* add unit test
2025-11-04 16:00:32 +08:00
bukejiyu
41bfa1090d [CI]delete test_common_model (#4794)
* fix

* update

* update
2025-11-04 13:57:55 +08:00
plusNew001
9887025926 Update run_w4a8.py (#4783)
2025-11-03 21:41:00 +08:00
bukejiyu
69c2f3cda1 [CI]test common model (#4697)
* ut

* update
2025-11-03 16:48:36 +08:00
luukunn
7b35488779 【DataProcessor】add options thinking_mode (#4735)
* add thinking_mode

* add thinking_mode

* add thinking_mode

* add thinking_mode

* add thinking_mode

* add thinking_mode

* add unit test
2025-11-03 14:30:07 +08:00
yinwei
377f3bf5f2 [XPU] add v1 support for bf16 (#4744)
* support v1 loader

* update code style

* update code
2025-11-03 14:13:17 +08:00
freeliuzc
11398790d3 [Speculative Decoding][MTP]Support attn mask offset (#4641)
* [MTP]Merge support attn (#4591)

* support mask_offset in speculate decoding

* fix dummy run output

* add unit test

* fix unit test import

* support attn_mask_offset in mtp mode

* add update_attn_mask op

* fix unit test && fix code-style
2025-11-03 10:08:01 +08:00
YuBaoku
9eff788658 [CI] fix some ci yaml (#4747)
2025-11-02 21:28:04 +08:00
lizexu123
4ac6de9a3c [Feature] support pooling model runner (#4590)
* support qwen3-embedding

* support qwen3-embedding-0.6b

* fix

* fix bug

* fix test_return_token_ids.py and update enable_thinking

* fix mtp dummy_run

* merge develop

* fix np.float32

* delete FD_DISABLE_CHUNKED_PREFILL and FD_USE_GET_SAVE_OUTPUT_V1

* delete and build_stream_transfer_data

* fix test_update_v1

* fix

* fix

* update dummy_run post_process

* delete test_update_v1

* fix

* fix dummy_run

* fix model_path

* fix model_path

* fix dummy_run
2025-10-31 22:32:05 +08:00
YuBaoku
acef624049 [CI] Fix rollout_model test logic (#4730) 2025-10-31 22:25:24 +08:00
Yuanle Liu
b301bd6c31 [BugFix] fix thinking bug (#4710)
* fix thinking bug

* fix ut

* update

* fix
2025-10-31 22:00:31 +08:00
xiaolei373
dde7ba3f9e [CI]add_tokenizer_cli_unitest (#4620) 2025-10-31 13:57:51 +08:00
周周周
10de7a3b82 add flops and bandwidth to test_ffn.py (#4704) 2025-10-31 12:13:59 +08:00
GoldPancake
1f3ce65b58 [Feature] support mtp distribution equivalence verification (#4699)
2025-10-31 11:45:04 +08:00
Ryan
28de91b50f [Graph Optimization] SOT+CUDAGraph support ERNIE4.5T VL 28B / 424B (#4645)
* 45TVL support sot+CUDAGraph

* move unittest from ce_deploy to e2e

* add test_EB_VL_Lite_sot_serving

* rm useless line

* add openai_client

* fix unittest && reduce computing resources
2025-10-31 11:38:43 +08:00
Longzhi Wang
b61a272385 [BugFix] fix unittest of get_save_output_v1 (#4701)
* [BugFix] fix unittest of get_save_output_v1

* [BugFix] fix unittest of get_save_output_v1

* [BugFix] fix unittest of get_save_output_v1

* [BugFix] fix unittest of get_save_output_v1
2025-10-31 11:23:49 +08:00
kxz2002
a2870ed4a9 [Feature] Unify the registration name recognition for tool_parser and reasoning_parser to “-” (#4668)
* parser register name unify

* change ernie_x1 to ernie-x1

* change ernie4_5_vl to ernie-45-vl

* fix unit test
2025-10-31 10:45:27 +08:00
kxz2002
82bd7e5db4 [BugFix] Fix finish reason in _create_chat_completion_choice (#4582)
* fix n_param _create_chat_completion_choice

* fix unit test

* fix final_res

* modify unit tests
2025-10-31 10:42:19 +08:00
plusNew001
ea866e4b34 [XPU] [CI] Add Vl case (#4649)
* Enhance CI script with health checks and logging

Updated the CI script to include health checks and logging for the VL model testing process.

* Add test for OpenAI chat completions

* Refactor chat completion user message structure

* Fix variable name for exit code in CI script

* Update text prompt to Chinese for artifact question

* Update service port and response assertions in tests

* Refactor assertion for response content comparison

* Update run_45vl.py

* Change service HTTP port from 8123 to 8188
2025-10-31 10:38:09 +08:00
zhouchong
35286ce31a fix total_block_num init error in worker_process (#4687) 2025-10-30 19:53:09 +08:00
周周周
8b9c9463cd add real gate_correction_bias weight to mock un-balanced dispatch (#4676) 2025-10-30 15:13:21 +08:00
Haonan Luo
d7d0112bbf [CI] Add test for paddleocr_vl (#4627) 2025-10-30 13:40:04 +08:00
RAM
cd3b7cc392 [Graph Optimization] Add the CUDAGraph usage switch for Draft Model (#4601)
* add draft model using cudagraph switch

* set default as false

* capture draft model in ci

* fix bug
2025-10-30 11:44:50 +08:00
GoldPancake
fddda50cb9 Add ut for speculative sampler (#4650) 2025-10-30 10:37:49 +08:00