Commit Graph

3751 Commits

Author SHA1 Message Date
YuBaoku
819b2dbbae Revert "【New Feature】W4afp8 supports per group quantization (#4272)" (#4854)
This reverts commit 93fcf7e4ec.
2025-11-06 17:48:28 +08:00
YuBaoku
3478d20262 [CI] Add Check PR Template (#4481) 2025-11-06 17:41:14 +08:00
zhupengyang
b54eb7ad81 [XPU] ep+tp all2all (#4836) 2025-11-06 17:26:14 +08:00
Jiang-Jia-Jun
901d559aa7 Update README_CN.md 2025-11-06 17:19:22 +08:00
Jiang-Jia-Jun
0010420c56 Update README_EN.md 2025-11-06 17:19:07 +08:00
Zhang Yulong
83532e1d01 [Benchmark] Enhance benchmark output logging (#4682)
* Enhance benchmark output logging

Add print statements to display the number of discarded outputs before and after filtering.

* Update benchmark_serving.py
2025-11-06 16:53:31 +08:00
Jiang-Jia-Jun
095dada092 Add gemini for code review 2025-11-06 16:42:32 +08:00
Echo-Nie
c18b177f21 fix the get_act_fn,_load_st_projector (#4824) 2025-11-06 16:13:35 +08:00
Echo-Nie
e4f1267186 bug: fix list to List (#4818) 2025-11-06 16:13:12 +08:00
Ding
6c316286c1 fix: correct typo in nvidia_gpu.md (#4848) 2025-11-06 16:03:02 +08:00
Juncai
08ca0f6aea [Feature] [PD] add simple router and refine splitwise deployment (#4709)
* add simple router and refine splitwise deployment

* fix
2025-11-06 14:56:02 +08:00
Ayakouji
831266da7a [Fix] fix ernie4_5_vl model torch format loadding (#4447)
* fix

* add test

* fix test

* fix test

* update
2025-11-06 14:34:21 +08:00
plusNew001
fc8bef2c95 [XPU][CI]Change ci vl model to 28 b (#4764)
* Update XPU_VISIBLE_DEVICES and model parameters

* Update base response and adjust max tokens

* Implement process cleanup in CI workflow

Add process cleanup commands to prevent port conflicts

* Remove process cleanup commands from CI workflow

Removed old process cleanup commands to prevent port conflicts.
2025-11-06 14:12:23 +08:00
Echo-Nie
354ddc8bc5 [CI] Add unittest for activation, native_paddle_backend, w4a8, w4afp8, platforms/utils (#4812)
* add unnitest for activation, native_paddle_backend, w4a8, w4afp8, platforms/utils

* Remove activation function retrieval tests

Removed tests for valid and unsupported activation function retrieval.

* move w4a8, w4afp8 to quantization

* fix code style
2025-11-06 14:08:00 +08:00
SunLei
782818c031 fix: ci port conflict (#4840)
2025-11-06 11:56:17 +08:00
kxz2002
5bdd40da5d [BugFix] Fix ernie_vl_reasoning_parsers.py 'end_token' to 'think_end_token' (#4805)
* fix ernie_vl_reasoning_parsers.py 'end_token' to 'think_end_token'

* add unit tests
2025-11-06 11:28:55 +08:00
周周周
69fa741763 remove seq_lens_this_time (#4821) 2025-11-06 11:06:28 +08:00
K11OntheBoat
62dfad4a5f [PD Disaggregation] Support Qwen3-MoE use PD + EP inference. (#4691)
support Qwen-MoE PD/EP
2025-11-06 10:32:15 +08:00
YuBaoku
e8c3e20ee6 [CI] fix docker_build error and add tag-base (#4810)
2025-11-05 21:57:54 +08:00
yangjianfengo1
93fcf7e4ec 【New Feature】W4afp8 supports per group quantization (#4272)
* w4afp8 supports per group

* code style

* accuracy done

* revert append attn utils

* ffn1 dynamic quantization

* ffn2 supports dynamic quantization

* code style

* code style

* modify unit tests

* modify unit tests

* fix bug

* Implement conditional parameter creation for layers

Add parameter creation for up_gate_proj_in_scale when ep_size > 1.

* code style

* fix conflict

* code style

* code style

* fix w4aint8 accuracy

* fix ci

---------

Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
2025-11-05 21:00:23 +08:00
李泳桦
fcd2f05dff [BugFix] fix messages being inplace modified in offline chat api (#4831) 2025-11-05 20:46:33 +08:00
Jiang-Jia-Jun
6f95df1777 Fix formatting of news section in README_EN.md
2025-11-05 19:47:34 +08:00
Jiang-Jia-Jun
5db1a26340 Update README_CN.md 2025-11-05 19:46:51 +08:00
Jiang-Jia-Jun
aec1a84886 [Doc] Update docs for v2.3.0rc0 (#4828)
* [Doc] Update docs for v2.3.0rc0

* [Doc] Update docs for v2.3.0rc0

* [Doc] Update docs for v2.3.0rc0

* Update README_CN.md

* Add deployment guide link for FastDeploy v2.3-rc0

Updated release note for FastDeploy v2.3-rc0 to include deployment guide link.

* Add Deployment Guide link for FastDeploy v2.3-rc0

Updated the news section to include a link to the Deployment Guide for FastDeploy v2.3-rc0.

---------

Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-11-05 19:45:53 +08:00
zhang-prog
4c2ad15258 add paddleocr_vl benchmark (#4833)
* add paddleocr_vl benchmark

* fix

* fix

* fix

* fix
2025-11-05 19:37:45 +08:00
ApplEOFDiscord
131d76dd64 [Bug Fix] process transparent image (#4807)
* process transparent image

* english comments

* process transparency at downloading

* fix

* remove useless codes
2025-11-05 17:15:24 +08:00
yinwei
ea1dd0e735 [XPU]Support V1 loader in weight_only Model (#4808)
* support v1 loader in wint8

* code style

* update

---------

Co-authored-by: root <root@gajl-bbc-onlinec-com-1498356.gajl.baidu.com>
2025-11-05 17:09:11 +08:00
chenjian
cc8f5312f5 [Feature] Add timestamp for profiler (#4726)
* [Feature] Add timestamp for profiler

* fix bug for offine inference

* fix for ci

* fix

* fix ci
2025-11-05 12:04:59 +08:00
周周周
876e4a8935 remove input_ids from ForwardMeta (#4793)
2025-11-05 11:55:51 +08:00
kxz2002
9676cc87d6 fix parser register name (#4795)
Co-authored-by: luukunn <83932082+luukunn@users.noreply.github.com>
2025-11-05 11:27:30 +08:00
zhupengyang
2fd254e5b7 support ep+tp at op layer (#4688) 2025-11-05 11:15:57 +08:00
周周周
937eb3c6ed [get_padding_offset.] clean get_padding_offset.cu (#4777)
2025-11-05 10:47:40 +08:00
chen
1c3ca48128 [Feature][Executor] GPU Model Runner Supports prompt_logprobs and max_logprobs (#4769) 2025-11-05 10:43:25 +08:00
xiaozude
74722308f2 [Metax] adapt cutlass moe and fix mla attention (#4602)
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-11-05 10:03:49 +08:00
Haonan Luo
2c281e617c Update Unit Test for PaddleOCR-VL (#4802)
* fix paddleocr prefix cache bug

* add test for paddleocr_vl

* disable prefix-caching in ocr

* add test for paddleocr_vl

* Fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

* add test for ocr processor; fix top_p for rejection sampling

---------

Co-authored-by: ming1753 <ideaminghp@163.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
2025-11-04 22:40:15 +08:00
李泳桦
1b61d62ecf [fix] fix v0 pd, let worker step_shm_value create=False (#4780) 2025-11-04 20:37:57 +08:00
yangjianfengo1
73252641dc updata mkdocs.yml (#4804)
Co-authored-by: root <root@yq02-inf-sci-k8s-a100-aa2ni5-0018.yq02.baidu.com>
2025-11-04 19:30:26 +08:00
YuBaoku
722110a952 [CI] Refactor CE wheel upload for multiple target paths (#4790)
* [CI] Refactor CE wheel upload for multiple target paths

* [CI] fix test_streaming_with_stop_str error
2025-11-04 18:56:38 +08:00
ming1753
9547fa204e [Docs] Add new support models (#4801) 2025-11-04 16:49:51 +08:00
lzy
3e9dda39ab supports pd partn (#4615)
* supports pd partn

* fix codestype
2025-11-04 16:36:35 +08:00
lzy
af7e0f27f3 supports internode_ll_two_stage (#4162)
* supports internode_ll_two_stage

* supports internode_ll_two_stage

* supports internode_ll_two_stage

* supports internode_ll_two_stage

* supports D internode_ll_two_stage

* fix codestype

* fix xpu internode_ll_two_stage

* fix xpu internode_ll_two_stage
2025-11-04 16:35:40 +08:00
kxz2002
8a40374bfe [BugFix] Fix ernie4_5_vl_processor.py and qwen_vl_processor.py can not disable thinking (#4762)
* fix ernie4_5_vl_processor.py and qwen_vl_processor.py

* add unit test
2025-11-04 16:00:32 +08:00
Lucas
007ee71208 [XPU] add deploy doc for PaddleOCR-VL in XPU (#4784) 2025-11-04 15:06:19 +08:00
ddchenhao66
bffa08b74b [XPU] fix thinking bug where output only contains reasoning_content (#4761)
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-11-04 14:32:35 +08:00
Jiang-Jia-Jun
4a4948764d Update mkdocs.yml 2025-11-04 14:31:32 +08:00
bukejiyu
41bfa1090d [CI]delete test_common_model (#4794)
* fix

* update

* update
2025-11-04 13:57:55 +08:00
freeliuzc
855a2a609a fix attn_params (#4787) 2025-11-04 13:01:38 +08:00
plusNew001
9887025926 Update run_w4a8.py (#4783)
2025-11-03 21:41:00 +08:00
kevin
5233825562 test scheduler (#4739) 2025-11-03 20:12:14 +08:00
ming1753
35a6969a44 [Docs] PaddleOCR-VL add RTX3060 server param (#4765)
* [Docs] PaddleOCR-VL add RTX3060 server param

* modify config

* fix bug
2025-11-03 19:55:05 +08:00