freeliuzc
2f473ba966
[Feature][MTP]Support MTP for rl-model ( #4009 )
...
* qk norm for speculate decode C16
* support mtp in v1_scheduler mode
* support mtp rope_3d
* support mtp features
* add unit test && del some log
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com >
Co-authored-by: xiaoxiaohehe001 <hiteezsf@163.com >
2025-09-10 13:34:37 +08:00
ming1753
d6bf6de5e6
[Bug Fix] Fix mm performance degradation ( #3942 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [Bug Fix] Fix mm performance degradation
* formate
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: chenjian <1435317881@qq.com >
2025-09-08 00:32:22 +08:00
chenjian
8d77c1cb51
[Optimize] optimize prefix cache in release22 ( #3889 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* optimize prefix cache in release22
* optimize prefix cache in release22
* fix worker
* fix
* fix
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-09-06 09:52:01 +08:00
lizhenyun01
2d975e16b0
[BugFix] fix TaskQueue dp_id in multi node ( #3919 )
2025-09-05 22:29:26 +08:00
chenjian
8915c8411d
Revert "[Feature] Setting number of apiserver workers automatically ( #3794 )" ( #3918 )
...
This reverts commit d1d063e4af
.
2025-09-05 21:06:50 +08:00
chenjian
fb1e0d6a87
[Feature] Set scheduler v1 as default ( #3812 )
...
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
2025-09-04 11:02:10 +08:00
Jiang-Jia-Jun
d1d063e4af
[Feature] Setting number of apiserver workers automatically ( #3794 )
...
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
2025-09-02 17:19:07 +08:00
李泳桦
98e03fb4ea
[feat] add metrics for yiyan adapter ( #3219 ) ( #3614 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* [feat] add metrics for yiyan adapter
* [fix] fix metrics num_requests_waiting and num_requests_running
* [fix] fix metrics gpu_cache_usage_perc
* [refactor] change where requests_number increases
* [chore] rename xxx_block_num as xxx_gpu_block_num, and update their values accordingly
* [chore] delete useless code
2025-08-30 23:20:58 +08:00
Yuan Xiaolan
c71ee0831c
add w4afp8 offline script ( #3636 )
2025-08-29 17:56:05 +08:00
Ryan
45f81b34f0
add dtype int32 ( #3692 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-29 14:56:35 +08:00
Yuanle Liu
4957908275
add input_processor plugin ( #3657 )
...
* add input_processor plugin
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
2025-08-28 22:53:57 +08:00
ltd0924
3d92fb09f7
[BugFix] fix parameter is 0 ( #3592 )
...
* Update engine_client.py
* fix
* Update common_engine.py
2025-08-28 09:52:36 +08:00
gaoziyuan
82e64b13e1
[NewFeature]Support dp multi api server && Fix some bug in mixed ep && merge develop ( #3598 )
...
* [Feature] update ep
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix queue ports idx
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* Update engine.py
* fix ci
* fix some bug in mixed ep
* add server fix and op fix
* rm some log
* fix code style
* ltd fix
* fix
* fix
* fix some bug
* fix bug
* fix bug
* fix style
* Update config.py
* Update splitwise_connector.py
* Update cache_messager.py
* Update __init__.py
* merge and fix
* Update engine.py
* Update common_engine.py
* Update run_ci_xpu.sh
* Update ernie_processor.py
* Update ernie_processor.py
---------
Co-authored-by: ltd0924 <ltd0924@sina.com >
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com >
2025-08-26 19:59:02 +08:00