Update docs for reasoning-parser

2025-12-24 13:28:13 +08:00 · 2025-09-01 17:42:58 +08:00
parent 0297127a93
commit 0513a78ecc
2 changed files with 2 additions and 2 deletions
--- a/docs/parameters.md
+++ b/docs/parameters.md
@@ -34,7 +34,7 @@ When using FastDeploy to deploy models (including offline inference and service
 | ```max_long_partial_prefills``` | `int` | When Chunked Prefill is enabled, maximum number of long requests in concurrent partial prefill batches, default: 1 |
 | ```long_prefill_token_threshold``` | `int` | When Chunked Prefill is enabled, requests with token count exceeding this value are considered long requests, default: max_model_len*0.04 |
 | ```static_decode_blocks``` | `int` | During inference, each request is forced to allocate corresponding number of blocks from Prefill's KVCache for Decode use, default: 2 |
-| ```reasoning_parser``` | `str` | Specify the reasoning parser to extract reasoning content from model output |
+| ```reasoning_parser``` | `str` | Specify the reasoning parser used to extract reasoning content from model output; refer to [reasoning output](features/reasoning_output.md) for more details |
 | ```use_cudagraph```                | `bool`      | Whether to use CUDA graph, default: False. It is recommended to read [graph_optimization.md](./features/graph_optimization.md) carefully before enabling it. Custom all-reduce must be enabled at the same time in multi-card scenarios. |
 | ```graph_optimization_config```    | `dict[str]`       | Configures parameters related to computation graph optimization, default: '{"use_cudagraph":false, "graph_opt_level":0, "cudagraph_capture_sizes": null }'. See [graph_optimization.md](./features/graph_optimization.md) for details. |
 | ```disable_custom_all_reduce``` | `bool` | Disable Custom all-reduce, default: False |