yinwei
d998efbc17
[Doc] Release fastdeploy-xpu 2.0.3 (#3408)
...
* fix v1 schedule oom bug
* fix v1 schedule oom bug
* update release note
* update info
v2.1.0
2025-08-14 19:19:54 +08:00
yinwei
8a15bdc0c8
[Doc] Release fastdeploy-xpu 2.1.0 (#3407)
...
* fix v1 schedule oom bug
* fix v1 schedule oom bug
* update release note
2025-08-14 19:11:16 +08:00
memoryCoderC
ad8ea68906
[BugFix] fix ErnieProcessor not setting raw_prediction (#3401)
2025-08-14 19:10:07 +08:00
yinwei
101605869c
[XPU] Fix performance degradation caused by enabling ENABLE_V1_KVCACHE_SCHEDULER (#3393)
...
* fix v1 schedule oom bug
* fix v1 schedule oom bug
2025-08-14 17:41:40 +08:00
Jiang-Jia-Jun
28918702c2
Revert "Merge branch 'feature/online/vs_think_20250813' into release/2.1"
...
This reverts commit 02596fc537, reversing
changes made to 03347626a6.
2025-08-14 17:20:29 +08:00
Jiang-Jia-Jun
02596fc537
Merge branch 'feature/online/vs_think_20250813' into release/2.1
2025-08-14 17:13:36 +08:00
ltd0924
03347626a6
[BugFix] fix control signal release failure (#3374)
...
* [BugFix]
* [BugFix]
* [BugFix]
* [BugFix]
* fix
* fix
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-14 17:01:25 +08:00
YUNSHEN XIE
b2df0311b8
Optimize CI execution workflow. (#3371) (#3384)
...
* fix
2025-08-14 14:51:15 +08:00
xiaolei373
d1d321bafd
feat(log): add_request_and_response_log (#3392)
2025-08-14 14:50:48 +08:00
Jiang-Jia-Jun
dc5d3ff5a0
[Polish Code] Remove useless notes
2025-08-14 14:05:29 +08:00
Jiang-Jia-Jun
f0a707e06f
[BugFix] Fix default log level of paddleformers (#3377)
...
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-08-14 11:36:13 +08:00
JYChen
4870919682
fix stopseq error info (#3342)
...
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-08-14 10:45:05 +08:00
ming1753
a375378cc1
[Bug Fix] Fix V1 video bug (#3387)
2025-08-14 09:49:22 +08:00
YUNSHEN XIE
192f9caab4
Pre ce modified (#3335) (#3360)
...
* Pre ce modified (#3335)
* update
* update
* fix
* fix
* update
* update
* update
* fix
* update
* update
* update
* add ut fix pr(3367)
2025-08-13 18:50:52 +08:00
luukunn
81092c0fe3
add tool parser
2025-08-13 16:06:22 +08:00
YUNSHEN XIE
ad816f20f4
Use latest PaddlePaddle package (#3347) (#3352)
...
* Use latest PaddlePaddle package
* fix
2025-08-13 11:06:01 +08:00
memoryCoderC
37b76158f9
Completion add raw_prediction/text_after_process (#3362)
2025-08-12 23:20:36 +08:00
memoryCoderC
fe2094609f
Release/2.1 (#3361)
...
* [BugFix] v1/completions add finish_reason
* update TestOpenAIServingCompletion for merge
2025-08-12 23:06:51 +08:00
gaoziyuan
b4bb54b56b
bugfix (#3322)
2025-08-12 16:16:37 +08:00
Jiang-Jia-Jun
eeec4bd15e
Remove useless code release/2.1 (#3338)
2025-08-12 11:32:50 +08:00
chenjian
d2592750f7
fix bug for scheduler v0 (#3306)
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-12 00:41:15 +08:00
chenjian
25f51b0611
Fix block num in scheduler v1 for release 2.1 (#3315)
...
* fix bug for scheduler v0
* fix block num setting in scheduler v1 for release 2.1
* fix block num setting in scheduler v1 for release 2.1
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-12 00:41:05 +08:00
ming1753
9b07f85f6d
[Bug Fix] fix vl V1 schedule bug (#3284)
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-12 00:40:45 +08:00
Sunny-bot1
2fe31c6f0f
[Docs] fix sampling docs 2.1 (#3333)
...
* [Docs] fix sampling docs (#3113)
* fix sampling docs
* fix sampling docs
* update
* fix docs
2025-08-11 21:04:10 +08:00
YUNSHEN XIE
a33e557732
fix ci pypi index error (#3327)
2025-08-11 20:24:27 +08:00
kevin
054c790642
fix uvicorn multi worker error (#3309)
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-11 20:19:31 +08:00
Jiang-Jia-Jun
ca4e4ab911
Revert "[BugFix] fix ep (#3290)" (#3317)
...
This reverts commit 86ff68be4b.
2025-08-11 16:17:58 +08:00
chenjian
c000cff744
fix scheduler bug in release2.1 (#3295)
2025-08-10 13:55:22 +08:00
lizexu123
86ff68be4b
[BugFix] fix ep (#3290)
...
* fix ep
* fix
2025-08-09 16:32:35 +08:00
yinwei
702c313ed1
revert pr (#3286)
2025-08-09 16:29:35 +08:00
ltd0924
6706ccb37e
[BugFix] fix too many open files problem (#3275)
2025-08-08 20:11:32 +08:00
JYChen
1b6f482c15
[Cherry-pick] fix stop seq (#3263)
...
* fix out-of-bounds value for stop sequence
* catch error if there is an out-of-bounds value
* check in offline mode
2025-08-07 19:11:37 +08:00
sg263
5d3bf308f6
merge develop trace FD_START (#3253)
...
Co-authored-by: shige <shige@baidu.com>
2025-08-07 11:10:55 +08:00
Sunny-bot1
f672a34f95
[FIX 2.1] fix bad_words when sending requests consecutively (#3199)
...
* fix bad_words
* fix log
* fix log
2025-08-06 15:47:27 +08:00
lizexu123
bc0b92bba4
[BugFix] support real batch_size (#3109) (#3217)
...
* support real bsz
* fix
* fix xpu_model_runner.py, gpu_model_runner.py, gcu_model_runner.py, iluvatar_model_runner.py
* add event_loop_ep
* fix
* Add comments
* fix
* support mtp real_batch_size
* fix
* self.tmp_seq_lens_this_time -> self.seq_lens_this_time_buffer
* fix
* fix VL real_seq_lens_this_time
* fix
* fix mtp
* fix
* fix mtp
* fix xpu
* fix
2025-08-06 14:30:33 +08:00
SunLei
3dd8492601
[Bugfix] Fix uninitialized decoded_token and add corresponding unit test (#3201)
...
* Update test_base_chat.py (#3183)
* [Bugfix] Fix uninitialized decoded_token and add corresponding unit test.
---------
Co-authored-by: Divano <dddivano@outlook.com>
2025-08-05 10:55:22 +08:00
RAM
bd77a3a643
[Bug Fix] Fix bug of MLA Attention Backend (#3178)
...
* fix typo
* fix mla attention backend
2025-08-05 10:53:27 +08:00
YUNSHEN XIE
9561603ed9
Apply CI fix from Develop (#3151)
...
* fix ci approve
* Describe PR diff coverage using JSON file (#3114)
* Refactored ci pipeline
* update
* Describe PR diff coverage using JSON file
* remove pip cache setting from Approve
* fix
* update
* fix ci (#3141)
* fix
2025-08-04 16:30:56 +08:00
plusNew001
e26313a355
Update Dockerfile.xpu (#3147)
2025-08-04 16:25:33 +08:00
yinwei
4367c09a5f
Fix out-of-memory issue during single-XPU deployment (#3131)
2025-08-04 16:02:43 +08:00
bukejiyu
8e789dcb67
fix load_pre_sharded_checkpoint (#3152) (#3169)
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-04 15:44:10 +08:00
ltd0924
5f6fc7f7b9
Update cache_messager.py (#3173)
2025-08-04 15:09:17 +08:00
RAM
d4059cabf0
fix typo (#3153)
2025-08-01 22:34:59 +08:00
chen
c8dd5976ae
fix request_output sampling_params (#3154)
2025-08-01 22:34:33 +08:00
Jiang-Jia-Jun
4880c16be3
Update setup.py
2025-07-31 20:30:24 +08:00
SunLei
dade19d7a4
[Feature] General support for logprobs (#2974)
...
* [Feature] support logprobs in chat/completions and completions endpoints
* Temporarily comment out text_offset due to incorrect logic
* Clean up temporary debug prints
* [Feature] support logprobs in offline mode via SamplingParams
* fix: serialize Logprob as dict before zmq send to fix msgpack error
* refactor: remove redundant methods to simplify codebase
* Fix missing fields in CompletionOutput.to_dict affecting msgpack serialization
* refactor: centralize param validation in engine_client to reduce duplication
* revert: rollback changes in offline_demo.py
* revert: rollback changes in offline_demo.py
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 20:25:56 +08:00
chenjian
fe17410f9c
[BUG] Fix bug for pd in fd (#3034)
...
* Fix bug for pd in fd
* Fix bug for pd in fd
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 20:17:27 +08:00
Zhang Yulong
1a543bca29
Fix test_EB_Lite_serving.py (#3119)
...
* Fix test_EB_Lite_serving.py
* fix test_EB_Lite_serving.py
2025-07-31 20:15:25 +08:00
Yuan Xiaolan
5f56d289a7
fix is_permuted (#3098)
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 19:58:05 +08:00
LiqinruiG
25005fee30
[Doc] add chat_template_kwargs and update params docs (#3103)
...
* add chat_template_kwargs and update params docs
* add chat_template_kwargs and update params docs
* update enable_thinking
* pre-commit
* update test case
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 19:44:06 +08:00