yyssys
abde903813
Automatically configure workers based on max-num-seqs ( #3846 )
...
Automatically configure workers based on max-num-seqs
2025-09-03 21:12:42 +08:00
YUNSHEN XIE
7dbd9412b0
reopen ut ( #3795 )
...
* reopen ut
* update
* update
* update ci dockerfile
2025-09-03 19:05:20 +08:00
luukunn
fc598d4c5a
add reasoning parser plugin ( #3811 )
...
* add reasoning parser plugin
* fix finish reason
2025-09-03 18:31:27 +08:00
Ayakouji
31313e0f3d
[Feature] ernie4_5_vl_moe support huggingface safetensor loading ( #3750 )
...
* update
* update
* update in tp
* add todo
* update
---------
Co-authored-by: aquagull <hongyuh@qq.com >
2025-09-03 02:58:59 -07:00
lizexu123
4c998c3636
[Code Simplification] delete cum_offsets_out ( #3815 )
...
* fix
* fix
2025-09-03 16:15:33 +08:00
YuanRisheng
0a1ce612c2
V1 loader support ep ( #3801 )
2025-09-03 16:05:41 +08:00
Yuan Xiaolan
fa58a9fa8f
qk norm for speculate decode C16 ( #3637 )
2025-09-03 14:53:56 +08:00
plusNew001
d22d3de256
[XPU] Update XPU CI case ( #3837 )
...
* Add debug environment variable exports
Added debug environment variable exports for CLANG_PATH and XVLLM_PATH.
* Lock paddlepaddle-xpu version in CI script
Temporarily lock paddlepaddle-xpu version due to framework update issues.
* Update no_proxy environment variable in CI workflow
* Install lsof tool in run_ci_xpu.sh
2025-09-03 14:32:12 +08:00
lzy
2527eb0e4e
fix test_append_attention_with_output.py ( #3831 )
...
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-09-03 14:07:50 +08:00
AIbin
54b458fd98
[Doc] update wint2 doc ( #3819 )
...
* update_wint2_doc
2025-09-03 11:27:43 +08:00
plusNew001
d81c57146f
[XPU] FIX XPU CI BUG ( #3829 )
...
* Add debug environment variable exports
Added debug environment variable exports for CLANG_PATH and XVLLM_PATH.
* Lock paddlepaddle-xpu version in CI script
Temporarily lock paddlepaddle-xpu version due to framework update issues.
2025-09-03 11:25:48 +08:00
ooo oo
2396e49f9e
【Hackathon 9th No.73】add unit tests for graph_opt_backend ( #3609 )
...
* test: add unit tests for graph_opt_backend
* refactor(tests): improve graph optimization test structure and readability
* fix(tests): correct CUDA graph related typos in test files
- Fix class name: TestCUDAGrpahSubgraph -> TestCUDAGraphSubgraph
* refactor(test): support attention layer and optimize graph optimization backend test to eliminate redundant baseline calculations
* remove some func call
---------
Co-authored-by: RAM <gstian5555@outlook.com >
Co-authored-by: Tao Luo <luotao02@baidu.com >
2025-09-03 11:18:00 +08:00
co63oc
94a61d505c
fix dcu_worker.py ( #3734 )
2025-09-03 10:57:42 +08:00
co63oc
ce998449e0
fix w8a8.py ( #3733 )
2025-09-03 10:57:26 +08:00
Echo-Nie
f7a4bea785
【Hackathon 9th No.84】Supplementary Unit Test for fastdeploy/reasoning ( #3570 )
...
Test content: verify that the base class's registration and retrieval functions work correctly
Co-authored-by: Tao Luo <luotao02@baidu.com >
2025-09-03 10:55:02 +08:00
co63oc
5441538173
rename fused_get_rope.cu ( #3752 )
...
* rename fused_get_rope.cu
* fix
* fix typos
* fix
* fix
2025-09-03 10:54:34 +08:00
ltd0924
2c9b169c0e
[BugFix] fix scheduler invalid ( #3803 )
...
* [BugFix] fix max streaming tokens invalid
* fix scheduler bug
* fix scheduler bug
2025-09-02 20:28:51 +08:00
Longzhi Wang
e0c9a6c76c
[Feat] Support streaming transfer data using ZMQ ( #3521 )
...
* Support streaming transfer data of ZMQ
* fix typo
* fix typo
* support tp
* add unittest
* update
* update
* fix typo
* fix typo
* fix tp_num in ci machine
---------
Co-authored-by: Wanglongzhi2001 <>
2025-09-02 19:52:19 +08:00
Echo-Nie
0fe1d62232
[MTP] add test_draft_model_set_value_by_flags.py ( #3741 )
2025-09-02 19:33:33 +08:00
Jiang-Jia-Jun
18e5d355a1
Update version in docs
2025-09-02 19:21:10 +08:00
yangjianfengo1
8e1b35a09b
[Fix bug] Fix nblock of w4afp8 at 256 and add a mask parameter to fa3 append attn ( #3771 )
...
* fix w4afp8
* add centralized configuration
* codestyle
* fix fa3 append attn
2025-09-02 19:17:01 +08:00
bukejiyu
b6a4115369
[v1loader]Reduce EB300B model loading time ( #3700 )
...
* speed up eb45
* update
2025-09-02 19:13:57 +08:00
YUNSHEN XIE
693c7d781c
fix ce compile job ( #3768 )
...
* fix ce compile job
* update
* update
* update
* update
2025-09-02 18:37:13 +08:00
co63oc
aa067a3106
rename speculate_token_penalty_multi_scores.cu ( #3735 )
2025-09-02 18:12:11 +08:00
lzy
7a521bbf62
Modify mask_offset's format ( #3525 )
...
* modify mask_offset in decode
* modify mask_offset unittest
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-09-02 03:05:35 -07:00
co63oc
f296aff6cf
rename speculate_stop_generation_multi_stop_seqs ( #3743 )
2025-09-02 18:04:29 +08:00
RAM
205b706ef8
[Executor] Fix bug of import paddle with RLHF ( #3781 )
2025-09-02 17:32:13 +08:00
Yuanle Liu
306c024ff3
[BugFix] fix error of import paddle.base.core.Config ( #3761 )
...
* defer import of Config
* support chunked_prefill
* support chunked_prefill
2025-09-02 17:23:27 +08:00
ltd0924
905d89e42f
[Feature] support model weight update in ep ( #3765 )
...
* support model weight update in ep
* support model weight update in ep
* support model weight update in ep
* support model weight update in ep
* Update fused_moe_backend_base.py
* Update worker_process.py
* Update worker_process.py
* Update dynamic_weight_manager.py
2025-09-02 17:16:03 +08:00
kevin
1908465542
[Feature] mm and thinking model support structured output ( #2749 )
...
* mm support structured output
* update code
* update code
* update format
* update code
* update code
* add enable_thinking default
* update code
* add structured_outputs test case
* add ci install xgrammar
* add ci timeout time
* update test for structured_outputs
* update code
* add error traceback info
* update error msg
* update structured output code
* update code
* update code
* update config
* update torch version
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-09-02 16:21:09 +08:00
Jiang-Jia-Jun
0e4df5a6f4
[Feature] Setting number of apiserver workers automatically ( #3790 )
...
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
2025-09-02 14:17:48 +08:00
ltd0924
bf0cf5167a
[BugFix] fix max streaming tokens invalid ( #3789 )
2025-09-02 13:57:32 +08:00
kevin
7e751c93ae
[BugFix] Fix chunked prefill ( #3759 )
...
* add error traceback info
* update error msg
* update code
* default enable chunked prefill
* update code
* update code
* add envs
* update code
* update enable chunked_prefill
* update code
* update code
* update code
* update code
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-09-02 13:40:45 +08:00
Jiang-Jia-Jun
27f2e7a6f1
Create faq.md
2025-09-02 11:07:37 +08:00
co63oc
6ac7cea81b
fix test_load_mtp ( #3780 )
2025-09-02 10:21:02 +08:00
Zhang Yulong
adc246127b
Update test_ernie_21b_mtp.py ( #3783 )
...
Temporarily skip the multi-GPU MTP case
2025-09-01 20:39:40 +08:00
lizexu123
6dd61a1bab
fix Document ( #3782 )
...
Co-authored-by: example_name <example_email>
2025-09-01 20:22:43 +08:00
YUNSHEN XIE
253f388372
add ci images build job ( #3749 )
...
update
update
2025-09-01 19:57:36 +08:00
co63oc
d6369b4d51
fix typos ( #3684 )
2025-09-01 17:50:17 +08:00
Jiang-Jia-Jun
0513a78ecc
Update docs for reasoning-parser
2025-09-01 17:42:58 +08:00
Jiang-Jia-Jun
0297127a93
Update FASTDEPLOY_VERSION to 2.3.0-dev
2025-09-01 16:48:42 +08:00
Jiang-Jia-Jun
2bd7d90929
Remove useless parameters
2025-09-01 14:43:56 +08:00
YuanRisheng
6566e29807
Add loader test for mtp ( #3724 )
...
* add test for mtp
* fix unittest
* fix
2025-09-01 10:55:49 +08:00
Zhang Yulong
085fe070f2
add CI cases ( #3714 )
2025-09-01 10:06:49 +08:00
ming1753
927e8ec55e
Add more runtime information to resource manager ( #3706 )
2025-09-01 00:25:28 +08:00
chenjian
465065cd19
[Bug fix] Fix prefix cache in V1 ( #3715 )
...
* [Bug fix] Fix prefix cache in V1
* fix code style
2025-08-31 21:29:33 +08:00
lizhenyun01
bed09ae8f8
fix mask_offset in append_attn ( #3745 )
...
* fix mask_offset in append_attn
* fix test
2025-08-31 15:03:16 +08:00
kevin
753772ace8
default enable chunked prefill ( #3731 )
...
* add error traceback info
* update error msg
* update code
* default enable chunked prefill
* update code
* update code
* add envs
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-08-31 13:15:13 +08:00
李泳桦
98e03fb4ea
[feat] add metrics for yiyan adapter ( #3219 ) ( #3614 )
...
* [feat] add metrics for yiyan adapter
* [fix] fix metrics num_requests_waiting and num_requests_running
* [fix] fix metrics gpu_cache_usage_perc
* [refactor] change where requests_number increases
* [chore] rename xxx_block_num as xxx_gpu_block_num, and update their values accordingly
* [chore] delete useless code
2025-08-30 23:20:58 +08:00
Sunny-bot1
fe5d09f9ee
[FIX]Fix Machete compile via ENABLE_MACHETE ( #3727 )
...
* add ENABLE_MACHETE
* fix
* revert
* update
* pre_commit
* fix
* fix
---------
Co-authored-by: Ayakouji <yuhongh@qq.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: aquagull <hongyuh@qq.com >
2025-08-30 17:50:17 +08:00