Commit Graph

3695 Commits

Author SHA1 Message Date
yinwei
377f3bf5f2 [XPU] add v1 support for bf16 (#4744)
* support v1 loader

* update code style

* update code
2025-11-03 14:13:17 +08:00
chenjian
f83d0cf127 [Feature] Support eplb for fd (#4599)
* support eplb

* support eplb

---------

Co-authored-by: kevin <chengyf112@gmail.com>
2025-11-03 14:08:15 +08:00
ming1753
c657f8d16a [Docs] fix PaddleOCR-VL docs bug (#4702) 2025-11-03 12:12:14 +08:00
yyssys
b1dd508965 [Docs]Add parameter (#4755) 2025-11-03 11:57:32 +08:00
yyssys
44ce91adea [Docs]Add parameter to the start service command (#4753) 2025-11-03 11:14:07 +08:00
freeliuzc
11398790d3 [Speculative Decoding][MTP]Support attn mask offset (#4641)
* [MTP]Merge support attn (#4591)

* support mask_offset in speculate decoding

* fix dummy run output

* add unit test

* fix unit test import

* support attn_mask_offset in mtp mode

* add update_attn_mask op

* fix unit test && fix code-style
2025-11-03 10:08:01 +08:00
freeliuzc
f44f4bafd1 support mtp in splitewise and scheduler_v1 mode (#4743) 2025-11-03 10:07:15 +08:00
yyssys
b8bf57138f [Docs]Update XPU document version to 2.3.0 (#4741)
* [Doc]Update XPU document version to 2.3.0

* update paddle doc version

* update applicable version
2025-11-03 09:54:51 +08:00
YuBaoku
9eff788658 [CI] fix some ci yaml (#4747)
2025-11-02 21:28:04 +08:00
周周周
6e01be28e0 format code (#4720)
2025-11-01 19:13:50 +08:00
lizexu123
4ac6de9a3c [Feature] support pooling model runner (#4590)
* support qwen3-embedding

* support qwen3-embedding-0.6b

* fix

* fix bug

* fix test_return_token_ids.py and update enable_thinking

* fix mtp dummy_run

* merge develop

* fix np.float32

* delete FD_DISABLE_CHUNKED_PREFILL and FD_USE_GET_SAVE_OUTPUT_V1

* delete and build_stream_transfer_data

* fix test_update_v1:

* fix

* fix

* update dummy_run post_process

* delete test_update_v1

* fix

* fix dummy_run

* fix model_path

* fix model_path

* fix dummy_run
2025-10-31 22:32:05 +08:00
YuBaoku
acef624049 [CI] Fix rollout_model test logic (#4730) 2025-10-31 22:25:24 +08:00
Yuanle Liu
b301bd6c31 [BugFix] fix thinking bug (#4710)
* fix thinking bug

* fix ut

* update

* fix
2025-10-31 22:00:31 +08:00
周周周
10358bf1a0 fix noaux (#4731) 2025-10-31 21:25:11 +08:00
ming1753
27746026c1 Skip building native architecture when specifying arch list (#4727) 2025-10-31 20:32:46 +08:00
ddchenhao66
3cbca75cc8 [XPU] xpu support neox style ROPE (#4719)
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-31 18:14:25 +08:00
Jundong Liu
88a94c821b [FDConfig] [PD Disaggregation] [Graph Optimization] Close Cudagraph for P node when PD Disaggregation (#4632)
* Close cudagraph for P node when PD Disaggregation

* fix problem
2025-10-31 16:44:25 +08:00
AIbin
316f784016 fix wint2 config (#4721) 2025-10-31 15:44:14 +08:00
kevin
c801d31c9c add checker (#4711) 2025-10-31 15:26:35 +08:00
kevin
096d87d335 fix bug (#4679) 2025-10-31 14:59:18 +08:00
李泳桦
0f75b62de2 [BugFix] Fix profile run in pd-disaggregated deployment (#4584)
* [fix] fix pd+dp+ep bug

* [fix] fix again

* [ci] fix code style
2025-10-31 14:42:00 +08:00
kevin
64e875b460 [Scheduler] update v1 prefill batch (#4611)
* update v1 prefill batch

* update code

* update code
2025-10-31 14:03:01 +08:00
xiaolei373
dde7ba3f9e [CI]add_tokenizer_cli_unitest (#4620) 2025-10-31 13:57:51 +08:00
ophilia-lee
412097c1b8 benchmark tool supports specifying response_format for constrained decoding scenarios (#4718) 2025-10-31 12:26:24 +08:00
周周周
10de7a3b82 add flops and bandwidth to test_ffn.py (#4704) 2025-10-31 12:13:59 +08:00
Sunny-bot1
9b18f0b55d cache scale load (#4624) 2025-10-31 11:58:33 +08:00
GoldPancake
1f3ce65b58 [Feature] support mtp distribution equivalence verification (#4699)
2025-10-31 11:45:04 +08:00
Ryan
28de91b50f [Graph Optimization] SOT+CUDAGraph support ERNIE4.5T VL 28B / 424B (#4645)
* 45TVL support sot+CUDAGraph

* mv unitest from ce_deploy 2 e2e

* add test_EB_VL_Lite_sot_serving

* rm useless line

* add openai_client

* fix unitest && reduce computing resources
2025-10-31 11:38:43 +08:00
plusNew001
937bcfc6ed [XPU] [CI] Lock xvllm version (#4715)
* Modify XVLLM_PATH assignment in run_ci_xpu.sh

Update XVLLM_PATH to point to the newly downloaded xvllm directory.

* Lock XVLLM version in CI script

Lock XVLLM version to avoid CI issues due to updates.

* Change xvllm output download link to latest version

Updated the download link for xvllm output to the latest version.
2025-10-31 11:32:38 +08:00
Longzhi Wang
b61a272385 [BugFix] fix unittest of get_save_output_v1 (#4701)
* [BugFix] fix unittest of get_save_output_v1

* [BugFix] fix unittest of get_save_output_v1

* [BugFix] fix unittest of get_save_output_v1

* [BugFix] fix unittest of get_save_output_v1
2025-10-31 11:23:49 +08:00
kxz2002
a2870ed4a9 [Feature] Unify the registration name recognition for tool_parser and reasoning_parser to “-” (#4668)
* parser register name unify

* change ernie_x1 to ernie-x1

* change ernie4_5_vl to ernie-45-vl

* fix unit test
2025-10-31 10:45:27 +08:00
kxz2002
82bd7e5db4 [BugFix] Fix finish reason in _create_chat_completion_choice (#4582)
* fix n_param _create_chat_completion_choicel

* fix unit test

* fix final_res

* modify unit tests
2025-10-31 10:42:19 +08:00
plusNew001
ea866e4b34 [XPU] [CI] Add Vl case (#4649)
* Enhance CI script with health checks and logging

Updated the CI script to include health checks and logging for the VL model testing process.

* Add test for OpenAI chat completions

* Refactor chat completion user message structure

* Fix variable name for exit code in CI script

* Update text prompt to Chinese for artifact question

* Update service port and response assertions in tests

* Refactor assertion for response content comparison

* Update run_45vl.py

* Change service HTTP port from 8123 to 8188
2025-10-31 10:38:09 +08:00
ddchenhao66
b87384aa70 [XPU] xpu currently disable prefix cache for VL model (#4695)
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-31 10:36:39 +08:00
chen
b73a78155f fix --logprobs-mode raw_logits (#4681)
2025-10-30 19:53:42 +08:00
zhouchong
35286ce31a fix total_block_num init error in worker_process (#4687) 2025-10-30 19:53:09 +08:00
kxz2002
7dc9d9885e [BugFix] fix offline llm chat "enable_thinking" is always "False" (#4686)
* fix enable_thinking

* recover ernie4_5_vl_processor
2025-10-30 19:45:41 +08:00
周周周
0089287534 [noauxtc_kernel] remove useless code (#4643)
* remove num_tokens

* remove num_tokens

* false

* final commit
2025-10-30 18:59:04 +08:00
Jiang-Jia-Jun
ec7746bd55 Update multi-node_deployment.md 2025-10-30 16:40:30 +08:00
Jiang-Jia-Jun
ca52cadd74 Update multi-node_deployment.md 2025-10-30 16:40:08 +08:00
周周周
8b9c9463cd add real gate_correction_bias weight to mock un-balanced dispatch (#4676) 2025-10-30 15:13:21 +08:00
Jiang-Jia-Jun
f1de348cbf Update common_engine.py 2025-10-30 14:05:04 +08:00
Haonan Luo
d7d0112bbf [CI] Add test for paddleocr_vl (#4627) 2025-10-30 13:40:04 +08:00
RAM
cd3b7cc392 [Graph Optimization] Add the CUDAGraph usage switch for Draft Model (#4601)
* add draft model using cudagraph switch

* set default as false

* capture draft model in ci

* fix bug
2025-10-30 11:44:50 +08:00
ApplEOFDiscord
cfdd1600a5 update doc (#4675)
2025-10-30 11:19:04 +08:00
GoldPancake
fddda50cb9 Add ut for speculative sampler (#4650) 2025-10-30 10:37:49 +08:00
Zhenghai Zhang
1712e1351b 【Hackathon 9th No.86】autogen MoeFastHardamardImplWrapper template_instantiation (#4592)
* autogen MoeFastHardamardImplWrapper template_instantiation

* fix codestyle

* fix codestyle

* add impl cu files
2025-10-30 10:28:36 +08:00
Ryan
e25c067f70 [OP] Add InferShape&InferDtype for per_token_quant_padding (#4667)
* add InferShape&InferDtype for per_token_quant_padding

* fix codestyle
2025-10-30 10:28:26 +08:00
ltd0924
50be19a88a [EP] fix several bugs in data parallel (#4657)
* Simplify profiling block setup in expert_service.py

Refactor profiling block initialization to avoid duplication.

* Update common_engine.py
2025-10-30 09:50:49 +08:00
周周周
dab04ab413 add noaux_tc to unitest fused_moe (#4656)
2025-10-29 21:50:25 +08:00