luukunn
bbd50c6717
add tool parser
2025-08-14 21:08:49 +08:00
luukunn
132a8ef425
Release/2.1 (#3414)
* Pre ce modified (#3335) (#3360)
* Pre ce modified (#3335)
* update
* update
* fix
* fix
* update
* update
* update
* fix
* update
* update
* update
* add ut fix pr(3367)
* [Bug Fix] Fix V1 video bug (#3387)
* fix stopseq error info (#3342)
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
* [BugFix] Fix default log level of paddleformers (#3377)
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
* [Polish Code] Remove useless notes
* feat(log):add_request_and_response_log (#3392)
* Optimize CI execution workflow. (#3371) (#3384)
* fix
* [BugFix] fix control signal release failed (#3374)
* [BugFix]
* [BugFix]
* [BugFix]
* [BugFix]
* fix
* fix
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
* Revert "Merge branch 'feature/online/vs_think_20250813' into release/2.1"
This reverts commit 02596fc537, reversing changes made to 03347626a6.
* [XPU] Fixed the issue of performance degradation caused by enabling ENABLE_V1_KVCACHE_SCHEDULER (#3393)
* fix v1 schedule oom bug
* fix v1 schedule oom bug
* [BugFix] fix ErnieProcessor not set raw_prediction (#3401)
* [Doc]Release fastdeploy-xpu 2.1.0 (#3407)
* fix v1 schedule oom bug
* fix v1 schedule oom bug
* update release note
* [Doc]Release fastdeploy-xpu 2.0.3 (#3408)
* fix v1 schedule oom bug
* fix v1 schedule oom bug
* update release note
* update info
---------
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
Co-authored-by: JYChen <zoooo0820@qq.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
Co-authored-by: xiaolei373 <zley373@gmail.com>
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: yinwei <yinwei_hust@163.com>
Co-authored-by: memoryCoderC <1137889088@qq.com>
2025-08-14 20:53:47 +08:00
luukunn
81092c0fe3
add tool parser
2025-08-13 16:06:22 +08:00
SunLei
dade19d7a4
[Feature] General support for logprobs (#2974)
* [Feature] support logprobs in chat/completions and completions endpoints
* Temporarily comment out text_offset due to incorrect logic
* Clean up temporary debug prints
* [Feature] support logprobs in offline mode via SamplingParams
* fix: serialize Logprob as dict before zmq send to fix msgpack error
* refactor: remove redundant methods to simplify codebase
* Fix missing fields in CompletionOutput.to_dict affecting msgpack serialization
* refactor: centralize param validation in engine_client to reduce duplication
* revert: rollback changes in offline_demo.py
* revert: rollback changes in offline_demo.py
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 20:25:56 +08:00
chenjian
fe0e3f508b
[BUG FIX] Fix bug when preempted request rescheduled (#3080)
* Fix bug when preempted request rescheduled
* Fix bug when preempted request rescheduled
* Fix bug when preempted request rescheduled
2025-07-30 22:25:47 +08:00
ming1753
5acde4eb43
[Feature] Multimodal Scheduler V1 (#3019)
* [Feature] Support multimodal scheduler v1
* remove debug log
* fix bug
* fix format
* modify code
* fix bug
* fix bug
* fix bug
* modify code
2025-07-30 16:05:55 +08:00
李泳桦
69996a40da
[feat] add disable_chat_template in chat api as a substitute for previous raw_request (#3020)
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request
* [fix] pre-commit code check
2025-07-25 20:57:32 +08:00
chenjian
85a78d695d
[Feature] Support block scheduler v1 for FD (#2928)
* Support FD block scheduler v1
* Support FD block scheduler v1
* Support FD block scheduler v1
* Fix according to copilot review
* Fix according to review
* Remove is_dummy
* Fix bug when real_bsz=1
* Fix infer first token cost time
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-23 20:31:31 +08:00
李泳桦
8a619e9db5
[Feature] Add return_token_ids, prompt_token_ids, and delete training, raw_request in request body (#2940)
* [feat] add return_token_ids, prompt_token_ids, delete raw_request in request body
* [fix] return_token_ids not working in curl request
* [test] improve some test cases of return_token_ids and prompt_token_ids
* [fix] the server responds ok even if request.messages is an empty list
2025-07-21 19:31:14 +08:00
Zero Rains
25698d56d1
polish code with new pre-commit rule (#2923)
2025-07-19 23:19:27 +08:00
ltd0924
4b14dca1d6
[LLM] delete fixed slots (#2893)
2025-07-17 19:19:54 +08:00
ltd0924
d245d1ca6c
[LLM] support send batch data and aggregate data (#2860)
* [LLM] support send batch data and aggregate data
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] update
2025-07-16 23:42:20 +08:00
sg263
42b80182e0
[Trace] add opentelemetry (#2852)
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-16 15:33:25 +08:00
chen
d33105baeb
[Feature] Online Chat API Support Return logprobs (#2777)
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner and xpu_model_runner
* get_worker() check platform
2025-07-10 16:33:40 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00