[Docs] release docs 2.3 (#4951)

* [Docs] release docs 2.3

* modify dockerfiles

* fix bug
Author: ming1753 (committed by GitHub)
Date: 2025-11-11 15:30:11 +08:00
Parent: cba7b2912f
Commit: 38ccf9b00b
25 changed files with 2322 additions and 134 deletions


@@ -5,12 +5,14 @@
Reasoning models return an additional `reasoning_content` field in their output, which contains the reasoning steps that led to the final conclusion.
## Supported Models
| Model Name | Parser Name | Enable thinking by Default | Tool Calling | Thinking switch parameters |
|---------------|-------------|---------|---------|----------------|
| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | ernie-45-vl | | ❌ | `"chat_template_kwargs": {"enable_thinking": true/false}` |
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | ernie-45-vl | | | `"chat_template_kwargs": {"enable_thinking": true/false}` |
| baidu/ERNIE-4.5-21B-A3B-Thinking | ernie-x1 | ✅ Not supported for turning off | ✅ | ❌ |
| baidu/ERNIE-4.5-VL-28B-A3B-Thinking | ernie-45-vl-thinking | ✅ Not recommended to turn off | ✅ | `"chat_template_kwargs": {"options": {"thinking_mode": "open/close"}}` |
The reasoning model requires a specified parser to extract the reasoning content. The thinking mode can be turned off using each model's `Thinking switch parameters` listed above.
Interfaces that support toggling the reasoning mode:
1. `/v1/chat/completions` requests in OpenAI services.
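As a minimal sketch of how the switch is passed in a `/v1/chat/completions` request body (the model name and prompt here are illustrative; the `chat_template_kwargs` field follows the table above):

```python
import json

def build_chat_request(prompt: str, enable_thinking: bool) -> dict:
    """Build a /v1/chat/completions payload that toggles the thinking
    mode via chat_template_kwargs, per the switch parameters above."""
    return {
        "model": "baidu/ERNIE-4.5-VL-28B-A3B-Paddle",
        "messages": [{"role": "user", "content": prompt}],
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }

payload = build_chat_request("Why is the sky blue?", enable_thinking=False)
print(json.dumps(payload, indent=2))
```

The same payload can be sent with `curl` or any HTTP client; only the extra `chat_template_kwargs` field differs from a plain chat request.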
@@ -34,6 +36,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
```
Next, make a request to the model; the response should include the reasoning content. Taking the baidu/ERNIE-4.5-VL-28B-A3B-Paddle model as an example:
```bash
curl -X POST "http://0.0.0.0:8192/v1/chat/completions" \
@@ -81,3 +84,78 @@ for chunk in chat_response:
print(chunk.choices[0].delta, end='')
print("\n")
```
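When consuming the stream, each delta may carry either `reasoning_content` or `content`. A small sketch of separating the two (plain dicts stand in for the delta objects returned by the client; the field names follow the response shape shown above):

```python
def split_stream(deltas):
    """Accumulate a streamed response into (reasoning, answer) strings.

    Each delta is treated as a dict that may contain a
    'reasoning_content' fragment, a 'content' fragment, or neither.
    """
    reasoning, answer = [], []
    for delta in deltas:
        if delta.get("reasoning_content"):
            reasoning.append(delta["reasoning_content"])
        if delta.get("content"):
            answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)

# Simulated stream: reasoning fragments arrive before the final answer.
chunks = [
    {"reasoning_content": "The user greets me, "},
    {"reasoning_content": "so I should greet back."},
    {"content": "Hello! How can I help you?"},
]
reasoning, answer = split_stream(chunks)
print("reasoning:", reasoning)
print("answer:", answer)
```

This mirrors the `chunk.choices[0].delta` loop above; with the real client you would read the same two attributes from each delta object.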
## Tool Calling
The reasoning content is also available when tool calling and the reasoning parser are enabled together. Note that tool calls are parsed only from the `content` field, not from `reasoning_content`.
Model request example:
```bash
curl -X POST "http://0.0.0.0:8390/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{
"role": "user",
"content": "Get the current weather in BeiJing"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Determine weather in my location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"c",
"f"
]
}
},
"additionalProperties": false,
"required": [
"location",
"unit"
]
},
"strict": true
}
}],
"stream": false
}'
```
Model output example:
```json
{
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "",
"reasoning_content": "The user asks about ...",
"tool_calls": [
{
"id": "chatcmpl-tool-311b9bda34274722afc654c55c8ce6a0",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"BeiJing\", \"unit\": \"c\"}"
}
}
]
},
"finish_reason": "tool_calls"
}
]
}
```
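The `arguments` field of each tool call arrives as a JSON string and must be decoded before dispatching to the tool. A minimal sketch of extracting the call from a non-streaming response shaped like the example above (the helper name is illustrative, not part of FastDeploy):

```python
import json

def extract_tool_call(response: dict):
    """Return (function_name, decoded_arguments) from the first tool
    call of a chat completion response."""
    message = response["choices"][0]["message"]
    call = message["tool_calls"][0]["function"]
    # 'arguments' is a JSON-encoded string, not a dict.
    return call["name"], json.loads(call["arguments"])

# Response trimmed to the fields used here, matching the sample output.
response = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "",
            "reasoning_content": "The user asks about ...",
            "tool_calls": [{
                "id": "chatcmpl-tool-311b9bda34274722afc654c55c8ce6a0",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": "{\"location\": \"BeiJing\", \"unit\": \"c\"}",
                },
            }],
        },
        "finish_reason": "tool_calls",
    }],
}

name, args = extract_tool_call(response)
print(name, args)
```

Checking `finish_reason == "tool_calls"` before extracting is a reasonable guard, since a response may instead finish with plain content.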
For more documentation on tool calling usage, see [Tool Calling](./tool_calling.md).