SunLei
2f28f40d90
fix: replace list * n initialization with list comprehension to avoid shared references ( #3618 )
2025-08-26 17:53:31 +08:00
ltd0924
66c5addce4
[Bugfix] fix api server control signal bugs ( #3531 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* Update serving_chat.py
* Update serving_completion.py
* Update serving_completion.py
2025-08-25 21:13:04 +08:00
李泳桦
8bea4b1e25
[fix] fix output tokens count in streaming completion api ( #3507 )
2025-08-21 18:19:13 +08:00
luukunn
371fb3f853
[Feature] add tool parser ( #3483 )
...
* add tool parser
* add x1 enable_thinking
* restart ci
* fix vl reasoning parser
* modify call style
* modify call style
* add offline enablethinking
* fix completion
* fix
* fix unit test
* fix unit test
* fix unit test
* fix vl reasoning parser
* fix vl reasoning parser
2025-08-21 17:25:44 +08:00
Yzc216
466cbb5a99
[Feature] Models api ( #3073 )
...
* add v1/models interface related
* add model parameters
* default model verification
* unit test
* check model err_msg
* unit test
* type annotation
* model parameter in response
* modify document description
* modify document description
* unit test
* verification
* verification update
* model_name
* pre-commit
* update test case
* update test case
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/entrypoints/openai/serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-08-21 17:02:56 +08:00
ltd0924
51f68ae593
[Feature] add dealer manager to reuse the connection ( #3471 )
...
* [BugFix] fix control signal release failed
* [BugFix] fix control signal release failed
* update
* update
* update
* [Feature] add dealer manager to reuse the connection
* fix
* fix
* fix
* fix
* fix
* fix
* Create test_dealer_connection_manager.py
* Delete test/entrypoints/openai directory
* Update test_dealer_connection_manager.py
* Update test_dealer_connection_manager.py
2025-08-21 13:11:13 +08:00
memoryCoderC
31f639f10b
[Feature] add prompt_tokens and completion_tokens ( #3504 )
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-21 10:23:27 +08:00
kevin
67298cf4c0
add error traceback info ( #3419 )
...
Deploy GitHub Pages / deploy (push) Has been cancelled
* add error traceback info
* update error msg
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-08-19 19:32:04 +08:00
ltd0924
bca8905b40
[BugFix] fix control signal release failed ( #3390 )
...
* [BugFix] fix control signal release failed
* [BugFix] fix control signal release failed
* update
* update
* update
2025-08-19 13:51:38 +08:00
zhuzixuan
c95b3395e9
【BugFix】completion接口echo回显支持 ( #3245 )
...
* wenxin-tools-511,修复v1/completion无法回显的问题。
* 支持多prompt的回显
* 支持多prompt情况下的流式回显
* 补充了 completion 接口支持 echo 的单元测试
* pre-commit
* 移除了多余的test文件
* 修复了completion接口echo支持的单测方法
* 补充了单元测试文件
* 补充单测
* unittest
* 补充单测
* 修复单测
* 删除不必要的assert.
* 重新提交
* 更新测试方法
* ut
* 验证是否是正确思路单测
* 验证是否是正确思路单测
* 验证是否是正确思路单测3
* 优化单测代码,有针对性地缩小单测范围。
* 优化单测代码2,有针对性地缩小单测范围。
* 优化单测代码3,有针对性地缩小单测范围。
* support 'echo' in chat/completion.
* update
* update
* update
* update
* update
* update
* 补充了关于tokenid的单元测试
* update
* 修正index错误
* 修正index错误
2025-08-19 10:41:51 +08:00
xiaolei373
d4f610e4cd
feat(log):add_request_and_response_log ( #3373 )
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-13 23:27:41 +08:00
luukunn
eda83ca672
add Tool Parser ( #3272 )
...
Deploy GitHub Pages / deploy (push) Has been cancelled
* add tool-parser
* add tool-parser
* add tool parser
* add tool parser
* fix
* add offline
* add offline
* fix
* parsers:tool&reasoning
* 修改tool parser名称·
* update
* fix reasoning-parser
* add requirements
* fix finish reason
* fix
* fix reasoning-parser
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: zhuzixuan <zhuzixuan@baidu.com >
2025-08-13 01:06:55 +08:00
memoryCoderC
2d1a4cacdf
Completion add raw_prediction/text_after_process ( #3356 )
2025-08-12 23:06:45 +08:00
memoryCoderC
c575611a5b
[BugFix] v1/completions add finish_reason ( #3246 )
...
* [BugFix] v1/completions add finish_reason
* update TestOpenAIServingCompletion for merge
---------
Co-authored-by: YUNSHEN XIE <1084314248@qq.com >
2025-08-12 19:40:26 +08:00
ltd0924
31d4fcb425
[BugFix] fix too many open files problem ( #3256 )
...
Deploy GitHub Pages / deploy (push) Has been cancelled
* Update cache_messager.py
* fix too many open files problem
* fix too many open files problem
* fix too many open files problem
* fix ci bugs
* Update api_server.py
* add parameter
* format
* format
* format
* format
* Update parameters.md
* Update parameters.md
* Update serving_completion.py
* Update serving_chat.py
* Update envs.py
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-08-08 20:10:11 +08:00
李泳桦
09cc4e2802
[fix] fix completion stream api output_tokens not in usage ( #3247 )
2025-08-07 10:36:00 +08:00
SunLei
dade19d7a4
[Feature] General support for logprobs ( #2974 )
...
* [Feature] support logprobs in chat/completions and completions endpoints
* Temporarily comment out text_offset due to incorrect logic
* Clean up temporary debug prints
* [Feature] support logprobs in offline mode via SamplingParams
* fix: serialize Logprob as dict before zmq send to fix msgpack error
* refactor: remove redundant methods to simplify codebase
* Fix missing fields in CompletionOutput.to_dict affecting msgpack serialization
* refactor: centralize param validation in engine_client to reduce duplication
* revert: rollback changes in offline_demo.py
* revert: rollback changes in offline_demo.py
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
* [bugfix] fix parameter validation for logprobs
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-07-31 20:25:56 +08:00
Jiang-Jia-Jun
0616c208d2
[Feature] Support include_stop_str_in_output in completion api ( #3096 )
...
* [Feature] Support include_stop_str_in_output in completion api
* Fix ci test
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
2025-07-30 22:18:48 +08:00
李泳桦
b242150f94
[feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client ( #3058 )
...
* [feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client
* [fix] delete ci test case for enable_thinking
* [fix] add reasoning_parser when server starts
* [fix] fix ci consistency test error with reasoning parser
* [doc] update docs related to metadata
* [fix] cancel enable_thinking default value
2025-07-30 19:25:20 +08:00
Zero Rains
0fb37ab7e4
update flake8 version to support pre-commit in python3.12 ( #3000 )
...
* update flake8 version to support pre-commit in python3.12
* polish code
2025-07-24 01:43:31 -07:00
ltd0924
f935d6f862
[BugFix] fix multinode deployment ( #2977 )
2025-07-24 15:04:04 +08:00
李泳桦
2a8a2c06de
[fix] non-streaming api now returns full output ids if return_token_ids is enabled ( #2951 )
2025-07-22 14:35:56 +08:00
李泳桦
8a619e9db5
[Feature] Add return_token_ids, prompt_token_ids, and delete training, raw_request in request body ( #2940 )
...
* [feat] add return_token_ids, prompt_token_ids, delete raw_request in request body
* [fix] return_token_ids not working in curl request
* [test] improve some test cases of return_token_ids and prompt_token_ids
* [fix] the server responds ok even if request.messages is an empty list
2025-07-21 19:31:14 +08:00
Zero Rains
25698d56d1
polish code with new pre-commit rule ( #2923 )
2025-07-19 23:19:27 +08:00
ltd0924
9c25dcca0b
[LLM] Update Multinode Deployment ( #2830 )
...
Deploy GitHub Pages / deploy (push) Has been cancelled
* [LLM] fix multinode bugs
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] update multinode deployment
* [LLM] fix ci bugs
* Update fastdeploy/engine/args_utils.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* [LLM] update random port
* [LLM] update random port
* [LLM] fix ci bugs
* fix ci bugs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-07-16 23:42:54 +08:00
ltd0924
d245d1ca6c
[LLM] support send batch data and aggregate data ( #2860 )
...
* [LLM] support send batch data and aggregate data
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] fix ci bugs
* [LLM] update
2025-07-16 23:42:20 +08:00
ltd0924
68b4755587
[LLM] support multi node deploy ( #2708 )
...
Deploy GitHub Pages / deploy (push) Has been cancelled
* [LLM] support multi node deploy
* Update engine.py
* fix bugs
* fix
* [LLM] support multi node deploy
* [LLM] support multi node deploy
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-07-06 10:33:51 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00