[Docs]Supplement the English and Chinese user documentation for Tool calling (#4895)

* Write tool calling documentation, v1.0
* Write tool calling documentation, v1.0
* Write tool calling documentation, v1.0
* tool calling doc, v1.1
* tool calling doc, v1.1
* tool calling doc, v1.1
* tool calling doc, v1.1
2025-12-24 13:28:13 +08:00 · 2025-11-08 20:05:14 +08:00
parent 87911b7cf1
commit 8a9e7b53af
2 changed files with 454 additions and 0 deletions
--- a/docs/features/tool_calling.md
+++ b/docs/features/tool_calling.md
@@ -0,0 +1,222 @@
+# Tool Calling
+
+This document describes how to configure the server in FastDeploy to use the tool parser, and how to invoke tools from the client.
+
+---
+## Quickstart
+
+### Starting FastDeploy with Tool Calling Enabled
+
+Launch the server with tool calling enabled. This example uses ERNIE-4.5-21B-A3B together with the `ernie-x1` reasoning parser and the `ernie-x1` tool-call parser from the fastdeploy directory, which extract the model's reasoning content, response content, and tool-call information:
+
+```bash
+python -m fastdeploy.entrypoints.openai.api_server \
+    --model /models/ERNIE-4.5-21B-A3B \
+    --port 8000 \
+    --reasoning-parser ernie-x1 \
+    --tool-call-parser ernie-x1
+```
+### Example of Triggering Tool Calling
+Send a request that includes a tool definition so that the model can call the tool:
+```bash
+curl -X POST http://0.0.0.0:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      {
+        "role": "user",
+        "content": "What is the weather in Beijing?"
+      }
+    ],
+    "tools": [
+      {
+        "type": "function",
+        "function": {
+          "name": "get_weather",
+          "description": "Get the current weather in a given location",
+          "parameters": {
+            "type": "object",
+            "properties": {
+              "location": {
+                "type": "string",
+                "description": "City name, for example: Beijing"
+              },
+              "unit": {
+                "type": "string",
+                "enum": ["c", "f"],
+                "description": "Temperature units: c = Celsius, f = Fahrenheit"
+              }
+            },
+            "required": ["location", "unit"],
+            "additionalProperties": false
+          },
+          "strict": true
+        }
+      }
+    ],
+    "stream": false
+  }'
+```
+The example output is shown below. The thought process (`reasoning_content`) and the tool-call information (`tool_calls`) were parsed successfully, the response content (`content`) is empty, and `finish_reason` is `tool_calls`:
+```json
+{
+    "choices": [
+        {
+            "index": 0,
+            "message": {
+                "role": "assistant",
+                "content": "",
+                "multimodal_content": null,
+                "reasoning_content": "User wants to ... ",
+                "tool_calls": [
+                    {
+                        "id": "chatcmpl-tool-bc90641c67e44dbfb981a79bc986fbe5",
+                        "type": "function",
+                        "function": {
+                            "name": "get_weather",
+                            "arguments": "{\"location\": \"北京\", \"unit\": \"c\"}"
+                        }
+                    }
+                ]
+            },
+            "finish_reason": "tool_calls"
+        }
+    ]
+}
+```
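+
+When `finish_reason` is `tool_calls`, the client runs the requested function itself and sends the result back in a follow-up request. The step can be sketched as follows. This is a minimal illustration, not part of FastDeploy: the `get_weather` stub, the `TOOL_REGISTRY` table, and the `run_tool_call` helper are hypothetical names, and the tool message mirrors the shape used in the conversation-history request in this document.
+
+```python
+import json
+
+
+def get_weather(location: str, unit: str) -> dict:
+    # Hypothetical stub; a real implementation would query a weather service.
+    return {"location": location, "temperature": "23", "unit": unit}
+
+
+# Hypothetical lookup table from tool names (as declared in "tools") to callables.
+TOOL_REGISTRY = {"get_weather": get_weather}
+
+
+def run_tool_call(tool_call: dict) -> dict:
+    """Execute one entry of message["tool_calls"] and build the tool message."""
+    function = tool_call["function"]
+    # "arguments" is a JSON-encoded string in the response, so decode it first.
+    arguments = json.loads(function["arguments"])
+    result = TOOL_REGISTRY[function["name"]](**arguments)
+    return {
+        "role": "tool",
+        "tool_call_id": tool_call["id"],
+        "content": {"type": "text", "text": json.dumps(result, ensure_ascii=False)},
+    }
+
+
+# Tool call copied from the example response above.
+tool_call = {
+    "id": "chatcmpl-tool-bc90641c67e44dbfb981a79bc986fbe5",
+    "type": "function",
+    "function": {
+        "name": "get_weather",
+        "arguments": "{\"location\": \"北京\", \"unit\": \"c\"}",
+    },
+}
+tool_message = run_tool_call(tool_call)
+print(tool_message)
+```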
+
+## Parallel Tool Calls
+If the model generates several tool calls in one response, FastDeploy returns all of them in the `tool_calls` list:
+```bash
+tool_calls=[
+  {"id": "...", "function": {...}},
+  {"id": "...", "function": {...}}
+]
+```
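+
+Each entry in the list carries its own `id`, so every tool result can be matched to the call that requested it. A minimal sketch of handling such a list on the client, assuming a hypothetical `get_weather` stub and hypothetical `TOOLS` and `execute` helpers (none of them are FastDeploy APIs); the calls are independent, so they are run in a thread pool:
+
+```python
+import json
+from concurrent.futures import ThreadPoolExecutor
+
+
+def get_weather(location: str, unit: str) -> dict:
+    # Hypothetical stub standing in for a real weather lookup.
+    return {"location": location, "unit": unit, "weather": "sunny"}
+
+
+# Hypothetical lookup table from tool names to local callables.
+TOOLS = {"get_weather": get_weather}
+
+
+def execute(tool_call: dict) -> dict:
+    """Run one tool call and return a tool message tagged with its id."""
+    function = tool_call["function"]
+    result = TOOLS[function["name"]](**json.loads(function["arguments"]))
+    return {
+        "role": "tool",
+        "tool_call_id": tool_call["id"],
+        "content": {"type": "text", "text": json.dumps(result)},
+    }
+
+
+# Illustrative list in the same shape as the "tool_calls" field of a response.
+tool_calls = [
+    {"id": "call_1", "type": "function",
+     "function": {"name": "get_weather",
+                  "arguments": '{"location": "Beijing", "unit": "c"}'}},
+    {"id": "call_2", "type": "function",
+     "function": {"name": "get_weather",
+                  "arguments": '{"location": "Shanghai", "unit": "c"}'}},
+]
+
+with ThreadPoolExecutor() as pool:
+    # pool.map yields results in input order, so messages line up with tool_calls.
+    tool_messages = list(pool.map(execute, tool_calls))
+
+print([message["tool_call_id"] for message in tool_messages])
+```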
+
+## Requests containing tools in the conversation history
+If tool-call information exists in previous turns, you can construct the request as follows:
+```bash
+curl -X POST "http://0.0.0.0:8000/v1/chat/completions" \
+-H "Content-Type: application/json" \
+-d '{
+  "messages": [
+    {
+      "role": "user",
+      "content": "Hello, what is the weather in Beijing?"
+    },
+    {
+      "role": "assistant",
+      "tool_calls": [
+        {
+          "id": "call_1",
+          "type": "function",
+          "function": {
+            "name": "get_weather",
+            "arguments": {
+              "location": "Beijing",
+              "unit": "c"
+            }
+          }
+        }
+      ],
+      "thoughts": "The user wants to check the weather in Beijing today."
+    },
+    {
+      "role": "tool",
+      "tool_call_id": "call_1",
+      "content": {
+        "type": "text",
+        "text": "{\"location\": \"北京\",\"temperature\": \"23\",\"weather\": \"晴\",\"unit\": \"c\"}"
+      }
+    }
+  ],
+  "tools": [
+    {
+      "type": "function",
+      "function": {
+        "name": "get_weather",
+        "description": "Determine weather in my location",
+        "parameters": {
+          "type": "object",
+          "properties": {
+            "location": {
+              "type": "string",
+              "description": "The city and state e.g. San Francisco, CA"
+            },
+            "unit": {
+              "type": "string",
+              "enum": [
+                "c",
+                "f"
+              ]
+            }
+          },
+          "additionalProperties": false,
+          "required": [
+            "location",
+            "unit"
+          ]
+        },
+        "strict": true
+      }
+    }
+  ],
+  "stream": false
+}'
+```
+The parsed model output is shown below. It contains the thought content (`reasoning_content`) and the response content (`content`), and `finish_reason` is `stop`:
+```json
+{
+    "choices": [
+        {
+            "index": 0,
+            "message": {
+                "role": "assistant",
+                "content": "Today's weather in Beijing is sunny with a temperature of 23 degrees Celsius.",
+                "reasoning_content": "User wants to ...",
+                "tool_calls": null
+            },
+            "finish_reason": "stop"
+        }
+    ]
+}
+```
+## Writing a Custom Tool Parser
+FastDeploy supports custom tool parser plugins. Use the existing parsers under `fastdeploy/entrypoints/openai/tool_parser` as a reference when writing your own.
+
+A custom parser should implement:
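+```python
+# Import the required packages from FastDeploy first: ToolParser, ToolParserManager,
+# and the request/response types used in the signatures below.
+from collections.abc import Sequence
+
+# Register the tool parser with ToolParserManager under the name "my-parser".
+@ToolParserManager.register_module("my-parser")
+class MyToolParser(ToolParser):
+    def __init__(self, tokenizer: AnyTokenizer):
+        super().__init__(tokenizer)
+
+    # Parse tool calls from a complete (non-streaming) model output.
+    def extract_tool_calls(
+        self, model_output: str, request: ChatCompletionRequest
+    ) -> ExtractedToolCallInformation:
+        # Placeholder: report no tool calls and pass the raw text through as content.
+        return ExtractedToolCallInformation(tools_called=False, tool_calls=[], content=model_output)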
+
+    # implement the tool call parse for stream call
+    def extract_tool_calls_streaming(
+        self,
+        previous_text: str,
+        current_text: str,
+        delta_text: str,
+        previous_token_ids: Sequence[int],
+        current_token_ids: Sequence[int],
+        delta_token_ids: Sequence[int],
+        request: ChatCompletionRequest,
+    ) -> DeltaMessage | None:
+        # Placeholder: build and return a DeltaMessage for this chunk (the signature also allows None).
+        return None
+```
+Enable the custom parser when launching the server:
+```bash
+python -m fastdeploy.entrypoints.openai.api_server \
+    --model <model path> \
+    --tool-parser-plugin <absolute path of the plugin file> \
+    --tool-call-parser my-parser
+```
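+
+What goes inside `extract_tool_calls` depends entirely on how the model formats its tool calls. As a rough illustration only, the sketch below assumes a hypothetical format in which each call is a JSON object wrapped in `<tool_call>` tags; this is not the ERNIE or FastDeploy format, and `split_tool_calls` is a hypothetical standalone helper rather than a FastDeploy API. It separates the calls from the remaining text, which is the information a parser needs to fill in `tools_called`, `tool_calls`, and `content`.
+
+```python
+import json
+import re
+
+# Hypothetical output format: each call is a JSON object inside <tool_call> tags.
+TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
+
+
+def split_tool_calls(model_output: str) -> tuple[list[dict], str]:
+    """Return (tool_calls, remaining_content) extracted from raw model text."""
+    tool_calls = []
+    for index, raw in enumerate(TOOL_CALL_PATTERN.findall(model_output)):
+        try:
+            payload = json.loads(raw)
+        except json.JSONDecodeError:
+            continue  # skip calls whose body is not valid JSON
+        tool_calls.append({
+            "id": f"call_{index}",
+            "type": "function",
+            "function": {
+                "name": payload["name"],
+                # Re-encode arguments as a JSON string, as in the responses above.
+                "arguments": json.dumps(payload.get("arguments", {}), ensure_ascii=False),
+            },
+        })
+    # Everything outside the tags is treated as ordinary response content.
+    content = TOOL_CALL_PATTERN.sub("", model_output).strip()
+    return tool_calls, content
+
+
+sample = (
+    "Checking. <tool_call>"
+    '{"name": "get_weather", "arguments": {"location": "Beijing", "unit": "c"}}'
+    "</tool_call>"
+)
+calls, content = split_tool_calls(sample)
+print(calls[0]["function"]["name"], "|", content)
+```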
+
+---