Commit Graph

3428 Commits

Author SHA1 Message Date
yyssys
abde903813 Automatically configure workers based on max-num-seqs (#3846)
Automatically configure workers based on max-num-seqs
2025-09-03 21:12:42 +08:00
YUNSHEN XIE
7dbd9412b0 reopen ut (#3795)
* reopen ut

* update

* update

* update ci dockerfile
2025-09-03 19:05:20 +08:00
luukunn
fc598d4c5a add reasoning parser plugin (#3811)
* add reasoning parser plugin

* fix finish reason
2025-09-03 18:31:27 +08:00
Ayakouji
31313e0f3d [Feature] ernie4_5_vl_moe support huggingface safetensor loading (#3750)
* update

* update

* update in tp

* add todo

* update

---------

Co-authored-by: aquagull <hongyuh@qq.com>
2025-09-03 02:58:59 -07:00
lizexu123
4c998c3636 [Code Simplification] delete cum_offsets_out (#3815)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix

* fix
2025-09-03 16:15:33 +08:00
YuanRisheng
0a1ce612c2 V1 loader support ep (#3801) 2025-09-03 16:05:41 +08:00
Yuan Xiaolan
fa58a9fa8f qk norm for speculate decode C16 (#3637) 2025-09-03 14:53:56 +08:00
plusNew001
d22d3de256 [XPU] Update XPU CI case (#3837)
* Add debug environment variable exports

Added debug environment variable exports for CLANG_PATH and XVLLM_PATH.

* Lock paddlepaddle-xpu version in CI script

Temporarily lock paddlepaddle-xpu version due to framework update issues.

* Update no_proxy environment variable in CI workflow

* Install lsof tool in run_ci_xpu.sh
2025-09-03 14:32:12 +08:00
lzy
2527eb0e4e fix test_append_attention_with_output.py (#3831)
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
2025-09-03 14:07:50 +08:00
AIbin
54b458fd98 [Doc] update wint2 doc (#3819)
* update_wint2_doc
2025-09-03 11:27:43 +08:00
plusNew001
d81c57146f [XPU] FIX XPU CI BUG (#3829)
* Add debug environment variable exports

Added debug environment variable exports for CLANG_PATH and XVLLM_PATH.

* Lock paddlepaddle-xpu version in CI script

Temporarily lock paddlepaddle-xpu version due to framework update issues.
2025-09-03 11:25:48 +08:00
ooo oo
2396e49f9e 【Hackathon 9th No.73】add unit tests for graph_opt_backend (#3609)
* test: add unit tests for graph_opt_backend

* refactor(tests): improve graph optimization test structure and readability

* fix(tests): correct CUDA graph related typos in test files

- Fix class name: TestCUDAGrpahSubgraph -> TestCUDAGraphSubgraph

* refactor(test): support attention layer and optimize graph optimization backend test to eliminate redundant baseline calculations

* remove some func call

---------

Co-authored-by: RAM <gstian5555@outlook.com>
Co-authored-by: Tao Luo <luotao02@baidu.com>
2025-09-03 11:18:00 +08:00
co63oc
94a61d505c fix dcu_worker.py (#3734) 2025-09-03 10:57:42 +08:00
co63oc
ce998449e0 fix w8a8.py (#3733) 2025-09-03 10:57:26 +08:00
Echo-Nie
f7a4bea785 【Hackathon 9th No.84】Supplementary Unit Test for fastdeploy/reasoning (#3570)
测试内容:测试基类的注册、获取函数功能是否正常

Co-authored-by: Tao Luo <luotao02@baidu.com>
2025-09-03 10:55:02 +08:00
co63oc
5441538173 rename fused_get_rope.cu (#3752)
* rename fused_get_rope.cu

* fix

* fix typos

* fix

* fix
2025-09-03 10:54:34 +08:00
ltd0924
2c9b169c0e [BugFix] fix scheduler invalid (#3803)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* [BugFix] fix max streaming tokens invalid

* fix scheduler bug

* fix scheduler bug
2025-09-02 20:28:51 +08:00
Longzhi Wang
e0c9a6c76c [Feat] Support streaming transfer data using ZMQ (#3521)
* Support streaming transfer data of ZMQ

* fix typo

* fix typo

* support tp

* add unittest

* update

* update

* fix typo

* fix typo

* fix tp_num in ci machine

---------

Co-authored-by: Wanglongzhi2001 <>
2025-09-02 19:52:19 +08:00
Echo-Nie
0fe1d62232 [MTP] add test_draft_model_set_value_by_flags.py (#3741) 2025-09-02 19:33:33 +08:00
Jiang-Jia-Jun
18e5d355a1 Update version in docs 2025-09-02 19:21:10 +08:00
yangjianfengo1
8e1b35a09b 【Fix bug] w4afp8 的nblock固定为256,并且fa3的append attn 增加mask参数 (#3771)
* fix w4afp8

* 增加集中式配置

* codestyle

* fix fa3 append attn
2025-09-02 19:17:01 +08:00
bukejiyu
b6a4115369 [v1loader]Reduce EB300B model loading time (#3700)
* speed up eb45

* update
2025-09-02 19:13:57 +08:00
YUNSHEN XIE
693c7d781c fix ce compile job (#3768)
* fix ce compile job

* update

* update

* update

* update
2025-09-02 18:37:13 +08:00
co63oc
aa067a3106 rename speculate_token_penalty_multi_scores.cu (#3735) 2025-09-02 18:12:11 +08:00
lzy
7a521bbf62 Modify mask_offset‘s format (#3525)
* modify mask_offset in decode

* modify mask_offset unittest

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-09-02 03:05:35 -07:00
co63oc
f296aff6cf rename speculate_stop_generation_multi_stop_seqs (#3743) 2025-09-02 18:04:29 +08:00
RAM
205b706ef8 [Executor] Fix bug of import paddle with RLHF (#3781) 2025-09-02 17:32:13 +08:00
Yuanle Liu
306c024ff3 [BugFix] fix error of import paddle.base.core.Config (#3761)
* 延迟 import Config

* support chunked_prefill

* support chunked_prefill
2025-09-02 17:23:27 +08:00
ltd0924
905d89e42f [Feature] support model weight update in ep (#3765)
* support model weight update in ep

* support model weight update in ep

* support model weight update in ep

* support model weight update in ep

* Update fused_moe_backend_base.py

* Update worker_process.py

* Update worker_process.py

* Update dynamic_weight_manager.py
2025-09-02 17:16:03 +08:00
kevin
1908465542 [Feature] mm and thinking model support structred output (#2749)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* mm support structured output

* update code

* update code

* update format

* update code

* update code

* add enable_thinking default

* update code

* add structured_outputs test case

* add ci install xgrammar

* add ci timeout time

* update test for structured_outputs

* update code

* add error traceback info

* update error msg

* update structred output code

* update code

* update code

* update config

* update torch version

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-02 16:21:09 +08:00
Jiang-Jia-Jun
0e4df5a6f4 [Feature] Setting number of apiserver workers automatically (#3790)
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-09-02 14:17:48 +08:00
ltd0924
bf0cf5167a [BugFix] fix max streaming tokens invalid (#3789) 2025-09-02 13:57:32 +08:00
kevin
7e751c93ae [BugFix] Fix chunked prefill (#3759)
* add error traceback info

* update error msg

* update code

* default enable chunked prefill

* update code

* update code

* add envs

* update code

* update enable chunked_prefill

* update code

* update code

* update code

* update code

* update code

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-02 13:40:45 +08:00
Jiang-Jia-Jun
27f2e7a6f1 Create faq.md 2025-09-02 11:07:37 +08:00
co63oc
6ac7cea81b fix test_load_mtp (#3780) 2025-09-02 10:21:02 +08:00
Zhang Yulong
adc246127b Update test_ernie_21b_mtp.py (#3783)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
暂时跳过多卡MTP case
2025-09-01 20:39:40 +08:00
lizexu123
6dd61a1bab fix Document (#3782)
Co-authored-by: example_name <example_email>
2025-09-01 20:22:43 +08:00
YUNSHEN XIE
253f388372 add ci images build job (#3749)
update

update
2025-09-01 19:57:36 +08:00
co63oc
d6369b4d51 fix typos (#3684) 2025-09-01 17:50:17 +08:00
Jiang-Jia-Jun
0513a78ecc Update docs for reasoing-parser 2025-09-01 17:42:58 +08:00
Jiang-Jia-Jun
0297127a93 Update FASTDEPLOY_VERSION to 2.3.0-dev 2025-09-01 16:48:42 +08:00
Jiang-Jia-Jun
2bd7d90929 Remove useless parameters
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-01 14:43:56 +08:00
YuanRisheng
6566e29807 Add loader test for mtp (#3724)
* add test for mtp

* fix unittest

* fix
2025-09-01 10:55:49 +08:00
Zhang Yulong
085fe070f2 add CI cases (#3714) 2025-09-01 10:06:49 +08:00
ming1753
927e8ec55e Add more runtime information to resource manager (#3706)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-01 00:25:28 +08:00
chenjian
465065cd19 [Bug fix] Fix prefix cache in V1 (#3715)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* [Bug fix] Fix prefix cache in V1

* fix code style
2025-08-31 21:29:33 +08:00
lizhenyun01
bed09ae8f8 fix mask_offset in append_attn (#3745)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix mask_offset in append_attn

* fix test
2025-08-31 15:03:16 +08:00
kevin
753772ace8 default enable chunked prefill (#3731)
* add error traceback info

* update error msg

* update code

* default enable chunked prefill

* update code

* update code

* add envs

* update code

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-31 13:15:13 +08:00
李泳桦
98e03fb4ea [feat] add metrics for yiyan adapter (#3219) (#3614)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* [feat] add metrics for yiyan adapter

* [fix] fix metrics num_requests_waiting and num_requests_running

* [fix] fix metrics gpu_cache_usage_perc

* [refactor] change where requests_number increases

* [chore] rename xxx_block_num as xxx_gpu_block_num, and update their values accordingly

* [chore] delete useless code
2025-08-30 23:20:58 +08:00
Sunny-bot1
fe5d09f9ee [FIX]Fix Machete compile via ENABLE_MACHETE (#3727)
* add ENABLE_MACHETE

* fix

* revert

* update

* pre_commit

* fix

* fix

---------

Co-authored-by: Ayakouji <yuhongh@qq.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: aquagull <hongyuh@qq.com>
2025-08-30 17:50:17 +08:00