Modified to support custom all reduce by default (#3538)

This commit is contained in:
zhink
2025-08-22 16:59:05 +08:00
committed by GitHub
parent 27666ee586
commit df7c31012b
15 changed files with 18 additions and 30 deletions

@@ -77,8 +77,7 @@ Add the following lines to the startup parameters
```
Notes:
1. Usually, no additional parameters need to be set, but CUDAGraph will generate some additional memory overhead, which may need to be adjusted in some scenarios with limited memory. For detailed parameter adjustments, please refer to [GraphOptimizationBackend](../features/graph_optimization.md) for related configuration parameter descriptions
-2. When CUDAGraph is enabled, if running with multi-GPUs TP>1, `--enable-custom-all-reduce` must be specified at the same time.
-3. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
+2. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
#### 2.2.6 Rejection Sampling
**Idea:**

@@ -87,8 +87,7 @@ Add the following lines to the startup parameters
```
Notes:
1. Usually, no additional parameters need to be set, but CUDAGraph will generate some additional memory overhead, which may need to be adjusted in some scenarios with limited memory. For detailed parameter adjustments, please refer to [GraphOptimizationBackend](../features/graph_optimization.md) for related configuration parameter descriptions
-2. When CUDAGraph is enabled, if running with multi-GPUs TP>1, `--enable-custom-all-reduce` must be specified at the same time.
-3. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
+2. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
#### 2.2.6 Rejection Sampling
**Idea:**

@@ -132,12 +132,10 @@ CUDAGraph is a GPU computing acceleration technology provided by NVIDIA. It achi
Add the following lines to the startup parameters
```
--use-cudagraph
---enable-custom-all-reduce
```
Notes:
1. Usually, no additional parameters need to be set, but CUDAGraph will generate some additional memory overhead, which may need to be adjusted in some scenarios with limited memory. For detailed parameter adjustments, please refer to [GraphOptimizationBackend](../features/graph_optimization.md) for related configuration parameter descriptions
-2. When CUDAGraph is enabled, if running with multi-GPUs TP>1, `--enable-custom-all-reduce` must be specified at the same time.
-3. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
+2. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
## FAQ
If you encounter any problems during use, you can refer to [FAQ](./FAQ.md).
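
Reviewer sketch (not part of the commit): the one CUDAGraph restriction that remains in the notes, `max-model-len > 32768` being unsupported, could be checked before launch roughly as follows. This is a minimal illustration under stated assumptions. The helper names `read_int_option` and `check_cudagraph_args` are hypothetical, and the `--max-model-len` flag spelling is assumed from the `max-model-len` wording in the notes. Only `--use-cudagraph` and the 32768 limit come from the documentation itself.

```python
from typing import List, Optional

# Limit quoted from the notes: with CUDAGraph enabled,
# max-model-len > 32768 is not currently supported.
CUDAGRAPH_MAX_MODEL_LEN = 32768


def read_int_option(argv: List[str], name: str) -> Optional[int]:
    """Return the integer value of `name`, given as `name VALUE` or `name=VALUE`.

    Hypothetical helper for illustration; returns None if the option is absent.
    """
    for i, arg in enumerate(argv):
        if arg == name and i + 1 < len(argv):
            return int(argv[i + 1])
        if arg.startswith(name + "="):
            return int(arg.split("=", 1)[1])
    return None


def check_cudagraph_args(argv: List[str]) -> List[str]:
    """Return warnings for a launch argument list, based on the CUDAGraph notes."""
    warnings: List[str] = []
    if "--use-cudagraph" not in argv:
        return warnings  # the restriction only applies when CUDAGraph is enabled

    # Assumed flag spelling: the notes write `max-model-len` without dashes.
    max_model_len = read_int_option(argv, "--max-model-len")
    if max_model_len is not None and max_model_len > CUDAGRAPH_MAX_MODEL_LEN:
        warnings.append(
            f"max-model-len={max_model_len} exceeds {CUDAGRAPH_MAX_MODEL_LEN}, "
            "which CUDAGraph does not currently support"
        )
    return warnings


if __name__ == "__main__":
    for message in check_cudagraph_args(["--use-cudagraph", "--max-model-len", "65536"]):
        print("warning:", message)
```

The sketch deliberately does not check `--enable-custom-all-reduce`, since this change removes the requirement to pass it alongside `--use-cudagraph`.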