[Graph Optimization][Speculative Decoding] Update yaml and fix typo (#4612)

RAM
2025-10-28 11:43:26 +08:00
committed by GitHub
parent b2c6c41447
commit 86d5006a57
2 changed files with 4 additions and 4 deletions

@@ -1,6 +1,6 @@
 max_model_len: 32768
 max_num_seqs: 96
-gpu_memory_utilization: 0.9
+gpu_memory_utilization: 0.85
 kv_cache_ratio: 0.71
 tensor_parallel_size: 4
 quantization: wint4