ltd0924
de4feff147
[Feature] CP support data clear (#4214)
...
* Update serving_chat.py
* Update serving_completion.py
* Update serving_completion.py
* mv connection_manager init
* [BugFix] fix kv cache
* fix format
* [Feature] support clear data
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
Co-authored-by: RAM <gstian5555@outlook.com>
2025-09-23 16:53:39 +08:00
李泳桦
0fa28b1068
[fix] fix ep group all-reduce (#4140)
...
* [fix] fix ep group all-reduce
* [fix] fix clear/update lock not working when workers > 1
* [chore] add preemption triggered info log
* [fix] fix code style
* fix model_weights_signal (#4092)
* fix model_weights_signal
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-09-18 10:34:49 +08:00
chenjian
38e734e183
[Feature] support hierarchical cache in v1 (#3939)
2025-09-08 00:31:34 +08:00
chenjian
8d77c1cb51
[Optimize] optimize prefix cache in release22 (#3889)
...
* optimize prefix cache in release22
* optimize prefix cache in release22
* fix worker
* fix
* fix
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-06 09:52:01 +08:00
yinwei
77c1bd0813
[XPU] Fixed the issue of performance degradation caused by enabling ENABLE_V1_KVCACHE_SCHEDULER (#3900)
...
* fix bug
* fix bug
* update
* update
* update
2025-09-05 19:17:25 +08:00
chenjian
fb1e0d6a87
[Feature] Set scheduler v1 as default (#3812)
...
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
* [Feature] Set scheduler v1 as default
2025-09-04 11:02:10 +08:00
ming1753
1432e336d7
[Bug Fix] Fix bug of multimodal inputs only text (#3850)
2025-09-03 19:48:10 +08:00
Yuanle Liu
174510180a
[BugFix] fix error of import paddle.base.core.Config (#3761) (#3804)
...
* defer import of Config
* support chunked_prefill
* support chunked_prefill
2025-09-03 10:14:03 +08:00
chenjian
465065cd19
[Bug fix] Fix prefix cache in V1 (#3715)
...
* [Bug fix] Fix prefix cache in V1
* fix code style
2025-08-31 21:29:33 +08:00
李泳桦
98e03fb4ea
[feat] add metrics for yiyan adapter (#3219) (#3614)
...
* [feat] add metrics for yiyan adapter
* [fix] fix metrics num_requests_waiting and num_requests_running
* [fix] fix metrics gpu_cache_usage_perc
* [refactor] change where requests_number increases
* [chore] rename xxx_block_num as xxx_gpu_block_num, and update their values accordingly
* [chore] delete useless code
2025-08-30 23:20:58 +08:00
Yuanle Liu
68f87240da
fix key error in mm (#3702)
2025-08-29 14:35:12 +08:00
Yuanle Liu
2fb2c0f46a
fix MultimodalRegistry (#3699)
2025-08-29 11:01:30 +08:00
Yuanle Liu
4957908275
add input_processor plugin (#3657)
...
* add input_processor plugin
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
2025-08-28 22:53:57 +08:00
YuanRisheng
5b66462f0e
Fix fdconfig bugs (#3528)
...
* fix config
* fix parallel
* fix ips
* fix rl
* open code
2025-08-22 16:17:15 +08:00
kevin
67298cf4c0
add error traceback info (#3419)
...
* add error traceback info
* update error msg
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-19 19:32:04 +08:00
ming1753
396dba0d62
[Bug Fix] Fix V1 video bug (#3388)
2025-08-13 23:04:07 +08:00
ming1753
f5164215be
[Bug Fix] fix vl V1 schedule bug (#3323)
...
* [Bug Fix] fix vl V1 schedule bug
* fix format
2025-08-12 11:31:39 +08:00
Zero Rains
b23af29d0b
Launch expert_service before kv_cache initialization in worker_process (#3045)
...
* launch expert_service before kv_cache initialization
* add two signals to make sure model loading and expert_service launching have finished
* fix the EP bug
* fix ep
* update launching way
* fix ep
* update
* rollback ep
* pre-commit all files
---------
Co-authored-by: RAM <gstian5555@outlook.com>
Co-authored-by: Divano <dddivano@outlook.com>
2025-08-11 19:38:46 +08:00
chenjian
c011cb8b16
[Bug Fix] Fix scheduler bug in develop (#3292)
...
* Fix scheduler bug in develop
* Fix scheduler bug in develop
* Fix scheduler bug in develop
2025-08-10 13:55:38 +08:00
kevin
22cab724e8
[Feature] block scheduler v1 support prefix caching (#3061)
...
* block scheduler v1 support prefix cache
* update code
* update code
* fix code bug
* add timeout time
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 19:29:19 +08:00
chenjian
fe0e3f508b
[BUG FIX] Fix bug when preempted request rescheduled (#3080)
...
* Fix bug when preempted request rescheduled
* Fix bug when preempted request rescheduled
* Fix bug when preempted request rescheduled
2025-07-30 22:25:47 +08:00
ming1753
5acde4eb43
[Feature] Multimodal Scheduler V1 (#3019)
...
* [Feature] Support multimodal scheduler v1
* remove debug log
* fix bug
* fix format
* modify code
* fix bug
* fix bug
* fix bug
* modify code
2025-07-30 16:05:55 +08:00
chenjian
85a78d695d
[Feature] Support block scheduler v1 for FD (#2928)
...
* Support FD block scheduler v1
* Support FD block scheduler v1
* Support FD block scheduler v1
* Fix according to copilot review
* Fix according to review
* Remove is_dummy
* Fix bug when real_bsz=1
* Fix infer first token cost time
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-23 20:31:31 +08:00