FastDeploy/benchmarks/yaml/x1-64k-w4a8c8-tp4.yaml
tianlef 0bc7d076fc [CE]add x1 w4a8c8 benchmark config (#3607)
2025-08-26 11:27:32 +08:00


reasoning-parser: ernie_x1
tool_call_parser: ernie_x1
tensor_parallel_size: 4
max_model_len: 65536
max_num_seqs: 128
enable_prefix_caching: True
enable_chunked_prefill: True
gpu_memory_utilization: 0.85
use_cudagraph: True
enable_custom_all_reduce: True
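
The file name x1-64k-w4a8c8-tp4.yaml appears to encode two of the settings above: "64k" would correspond to max_model_len: 65536 (64 × 1024 tokens) and "tp4" to tensor_parallel_size: 4. That reading is an assumption drawn from the name, not something the repository states here. The sketch below checks the correspondence using only the Python standard library. `parse_flat_yaml` and `check_against_filename` are hypothetical helpers, not part of FastDeploy, and the parser handles only flat `key: value` lines rather than full YAML.

```python
# Hypothetical helpers (not part of FastDeploy): check that the config
# values agree with the tags in the file name x1-64k-w4a8c8-tp4.yaml.

CONFIG_TEXT = """\
reasoning-parser: ernie_x1
tool_call_parser: ernie_x1
tensor_parallel_size: 4
max_model_len: 65536
max_num_seqs: 128
enable_prefix_caching: True
enable_chunked_prefill: True
gpu_memory_utilization: 0.85
use_cudagraph: True
enable_custom_all_reduce: True
"""


def parse_flat_yaml(text):
    """Parse flat `key: value` lines (a tiny subset of YAML) into a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, raw = line.partition(":")
        raw = raw.strip()
        if raw in ("True", "False"):
            value = raw == "True"
        else:
            # Try int, then float, and fall back to the raw string.
            try:
                value = int(raw)
            except ValueError:
                try:
                    value = float(raw)
                except ValueError:
                    value = raw
        config[key.strip()] = value
    return config


def check_against_filename(config, filename):
    """Return a list of mismatches between file-name tags and config values.

    Assumes "<N>k" means max_model_len == N * 1024 and "tp<N>" means
    tensor_parallel_size == N; other tags (e.g. "x1", "w4a8c8") are ignored.
    """
    problems = []
    stem = filename.rsplit(".", 1)[0]
    for tag in stem.split("-"):
        if tag.endswith("k") and tag[:-1].isdigit():
            expected = int(tag[:-1]) * 1024
            if config.get("max_model_len") != expected:
                problems.append(f"max_model_len should be {expected}")
        elif tag.startswith("tp") and tag[2:].isdigit():
            expected = int(tag[2:])
            if config.get("tensor_parallel_size") != expected:
                problems.append(f"tensor_parallel_size should be {expected}")
    return problems


if __name__ == "__main__":
    cfg = parse_flat_yaml(CONFIG_TEXT)
    print(check_against_filename(cfg, "x1-64k-w4a8c8-tp4.yaml"))
```

A check like this can catch a copied config whose name was updated but whose values were not. For anything beyond flat scalar keys, a real YAML parser is the better choice.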