ltd0924
fb0f284e67
[BugFix] fix prompt token ids type (#2994)
...
* Update serving_completion.py
* fix
* fix
2025-07-23 21:00:56 +08:00
Jiang-Jia-Jun
e5804b1d98
Revert "[LLM] fix multinode bugs (#2945)" (#2971)
...
This reverts commit b0f1e0eef4.
2025-07-22 21:23:48 +08:00
ltd0924
b0f1e0eef4
[LLM] fix multinode bugs (#2945)
...
* [LLM] fix multinode bugs
* [LLM] fix multinode bugs
* [LLM] fix multinode bugs
* [LLM] fix ci bugs
* fix ci bugs
* fix ci bugs
2025-07-22 20:23:37 +08:00
sg263
580460046f
merge 2.0.2 into 2.0.3 (#2917)
...
Co-authored-by: shige <shige@baidu.com>
2025-07-22 14:46:20 +08:00
Jiang-Jia-Jun
f941124402
[Feature] Support include_stop_str_in_output (#2930)
...
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-21 10:58:32 +08:00
Jiang-Jia-Jun
09d0073fdc
[Sync Code] develop to release/2.0.3 (#2873)
...
* [LLM] support send batch data and aggregate data (#2860)
* [LLM] support send batch data and aggregate data
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] update
* [LLM] Update Multinode Deployment (#2830)
* [LLM] fix multinode bugs
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] fix ci bugs
* Update fastdeploy/engine/args_utils.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* [LLM] update random port
* [LLM] update random port
* [LLM] fix ci bugs
* fix ci bugs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-16 23:44:26 +08:00
sg263
42b80182e0
[Trace] add opentelemetry (#2852)
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-16 15:33:25 +08:00
lddfym
ece88596ed
fix spelling error (#2827)
2025-07-14 13:12:57 +08:00
zhenwenDang
d48c03413f
Feature/logprob bug fix (#2817)
...
* fix: handle missing logprobs at step 0 and incorrect finish reason with max_completion_tokens
* Prevent response_logprobs.logprob_token_ids[0] from going out of bounds
2025-07-12 16:48:51 +08:00
lddfym
b5e4288704
Global scheduler supports configuring hot updates (#2807)
...
* Check if the controller port is available
* Global scheduler supports configuring hot updates
* add interface: /controller/scheduler
* add interface: /controller/scheduler
2025-07-11 13:38:07 +08:00
chen
d33105baeb
[Feature] Online Chat API Support Return logprobs (#2777)
...
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner and xpu_model_runner
* get_worker() check platform
2025-07-10 16:33:40 +08:00
Sunny-bot1
e45050cae3
[Feature] support top_k_top_p sampling (#2753)
...
* support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix
2025-07-09 20:58:58 -07:00
lddfym
4e293e50fa
Check if the controller port is available (#2724)
2025-07-07 13:24:55 +08:00
ltd0924
68b4755587
[LLM] support multi node deploy (#2708)
...
* [LLM] support multi node deploy
* Update engine.py
* fix bugs
* fix
* [LLM] support multi node deploy
* [LLM] support multi node deploy
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-06 10:33:51 +08:00
ltd0924
87e638498c
[RL] update reschedule finish reason (#2709)
2025-07-04 13:47:36 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code (#2679)
...
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00