Jiang-Jia-Jun
a7740e56c4
Simplify __repr__ method in Request class (#5154)
Remove detailed string representation from Request class.
2025-11-20 21:31:02 +08:00
kevin
3ce2c8f754
[Feature] support async download features (#4910)
* add async download
* update code
* fix bug
* update code
* update code
* fix bugs
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-11-18 18:37:59 +08:00
kevin
dd7fe27152
add hasher and ImagePosition
2025-10-23 15:20:21 +08:00
GoldPancake
9c7187998c
[Feature] support mtp logprob (#4457)
* support logprob in mtp
* remove debug code
* fix
* feat: add draft_logprobs for Speculative Decode MTP
* Revert "feat: add draft_logprobs for Speculative Decode MTP"
This reverts commit d5a3c5c933.
* fix
* feat: add draft_logprobs for Speculative Decode MTP
* feat: add draft_logprobs for Speculative Decode MTP
* fix some bugs
* fix codestyle
* fix bugs
* fix bugs
* fix bugs
* fix bugs
* fix bugs
* fix unit test
---------
Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
Co-authored-by: sunlei18 <sunlei18@sunlei18deMacBook-Pro.local>
2025-10-20 10:18:00 +08:00
guozhuangzhuang
cfd93c0966
fix: image token output (#4399)
* fix: image token output
* fix: code style
* fix: CompletionOutput.decode_type
2025-10-16 14:51:32 +08:00
Yuanle Liu
4957908275
add input_processor plugin (#3657)
* add input_processor plugin
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
2025-08-28 22:53:57 +08:00
luukunn
9c129813f9
[Feature] add custom chat template (#3251)
* add custom chat_template
* add custom chat_template
* add unittest
* fix
* add docs
* fix comment
* add offline chat
* fix unit test
* fix unit test
* fix
* fix pre commit
* fix unit test
* add unit test
* add unit test
* add unit test
* fix pre_commit
* fix enable_thinking
* fix pre commit
* fix pre commit
* fix unit test
* add requirements
2025-08-18 16:34:08 +08:00
luukunn
eda83ca672
add Tool Parser (#3272)
* add tool-parser
* add tool-parser
* add tool parser
* add tool parser
* fix
* add offline
* add offline
* fix
* parsers:tool&reasoning
* rename tool parser
* update
* fix reasoning-parser
* add requirements
* fix finish reason
* fix
* fix reasoning-parser
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: zhuzixuan <zhuzixuan@baidu.com>
2025-08-13 01:06:55 +08:00
SunLei
dade19d7a4
[Feature] General support for logprobs (#2974)
* [Feature] support logprobs in chat/completions and completions endpoints
* Temporarily comment out text_offset due to incorrect logic
* Clean up temporary debug prints
* [Feature] support logprobs in offline mode via SamplingParams
* fix: serialize Logprob as dict before zmq send to fix msgpack error
* refactor: remove redundant methods to simplify codebase
* Fix missing fields in CompletionOutput.to_dict affecting msgpack serialization
* refactor: centralize param validation in engine_client to reduce duplication
* revert: rollback changes in offline_demo.py
* revert: rollback changes in offline_demo.py
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-31 20:25:56 +08:00
chenjian
fe0e3f508b
[BUG FIX] Fix bug when preempted request rescheduled (#3080)
* Fix bug when preempted request rescheduled
* Fix bug when preempted request rescheduled
* Fix bug when preempted request rescheduled
2025-07-30 22:25:47 +08:00
ming1753
5acde4eb43
[Feature] Multimodal Scheduler V1 (#3019)
* [Feature] Support multimodal scheduler v1
* remove debug log
* fix bug
* fix format
* modify code
* fix bug
* fix bug
* fix bug
* modify code
2025-07-30 16:05:55 +08:00
李泳桦
69996a40da
[feat] add disable_chat_template in chat api as a substitute for previous raw_request (#3020)
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request
* [fix] pre-commit code check
2025-07-25 20:57:32 +08:00
chenjian
85a78d695d
[Feature] Support block scheduler v1 for FD (#2928)
* Support FD block scheduler v1
* Support FD block scheduler v1
* Support FD block scheduler v1
* Fix according to copilot review
* Fix according to review
* Remove is_dummy
* Fix bug when real_bsz=1
* Fix infer first token cost time
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-23 20:31:31 +08:00
李泳桦
8a619e9db5
[Feature] Add return_token_ids, prompt_token_ids, and delete training, raw_request in request body (#2940)
* [feat] add return_token_ids, prompt_token_ids, delete raw_request in request body
* [fix] return_token_ids not working in curl request
* [test] improve some test cases of return_token_ids and prompt_token_ids
* [fix] the server responds ok even if request.messages is an empty list
2025-07-21 19:31:14 +08:00
Zero Rains
25698d56d1
polish code with new pre-commit rule (#2923)
2025-07-19 23:19:27 +08:00
ltd0924
4b14dca1d6
[LLM] delete fixed slots (#2893)
2025-07-17 19:19:54 +08:00
ltd0924
d245d1ca6c
[LLM] support send batch data and aggregate data (#2860)
* [LLM] support send batch data and aggregate data
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] update
2025-07-16 23:42:20 +08:00
sg263
42b80182e0
[Trace] add opentelemetry (#2852)
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-16 15:33:25 +08:00
chen
d33105baeb
[Feature] Online Chat API Support Return logprobs (#2777)
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner and xpu_model_runner
* get_worker() check platform
2025-07-10 16:33:40 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00