FastDeploy/benchmarks/yaml/deepseek-32k-tp8-wint4.yaml at 153f15db3934793db8f95f39e947b623a3d1f0d1 - FastDeploy - 子说镜像小站

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

tianlef 153f15db39 [Doc]add deepseek wint4 ce (#4517)

2025-10-21 16:41:51 +08:00

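quantization: wint4              # weight-only INT4 quantization ("wint4" in the filename)
load_choices: "default_v1"
graph_optimization_config:
  use_cudagraph: True            # enable CUDA Graph
  use_unique_memory_pool: True
no_enable_prefix_caching: True   # prefix caching is turned off
max_num_seqs: 256
max_model_len: 32768             # 32 * 1024 tokens ("32k" in the filename)
tensor_parallel_size: 8          # "tp8" in the filename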
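
The filename `deepseek-32k-tp8-wint4.yaml` appears to encode three of the values above (context length, tensor-parallel degree, quantization). The sketch below checks that reading against the config. It is an illustration only: the naming convention is inferred from this one file, `parse_simple_yaml` and `check_against_filename` are hypothetical helpers rather than part of FastDeploy, and the parser handles only the small YAML subset used here (in practice a real YAML library would be the better choice).

```python
import re

# The config from this file, copied verbatim.
CONFIG_TEXT = """\
quantization: wint4
load_choices: "default_v1"
graph_optimization_config:
  use_cudagraph: True
  use_unique_memory_pool: True
no_enable_prefix_caching: True
max_num_seqs: 256
max_model_len: 32768
tensor_parallel_size: 8
"""


def parse_scalar(text):
    """Convert a raw YAML scalar into bool, int, or str."""
    text = text.strip()
    if text in ("True", "true"):
        return True
    if text in ("False", "false"):
        return False
    if re.fullmatch(r"-?\d+", text):
        return int(text)
    return text.strip("\"'")


def parse_simple_yaml(text):
    """Hypothetical helper: parse scalars plus ONE level of nesting.

    Not a general YAML parser; it also treats every '#' as a comment start.
    """
    root = {}
    current = root
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue
        indented = line[0].isspace()
        key, _, value = line.strip().partition(":")
        if not indented:
            current = root          # back at the top level
        if value.strip() == "":
            root[key] = {}          # a key with no value opens a nested mapping
            current = root[key]
        else:
            current[key] = parse_scalar(value)
    return root


def check_against_filename(filename, config):
    """Return a list of mismatches between filename tokens and config values.

    Assumed convention: '<N>k' -> max_model_len = N * 1024,
    'tp<N>' -> tensor_parallel_size = N, 'wint<N>' -> quantization.
    """
    problems = []
    for token in filename.rsplit(".", 1)[0].split("-"):
        if re.fullmatch(r"\d+k", token):
            expected = int(token[:-1]) * 1024
            if config.get("max_model_len") != expected:
                problems.append(f"max_model_len != {expected}")
        elif re.fullmatch(r"tp\d+", token):
            expected = int(token[2:])
            if config.get("tensor_parallel_size") != expected:
                problems.append(f"tensor_parallel_size != {expected}")
        elif re.fullmatch(r"wint\d+", token):
            if config.get("quantization") != token:
                problems.append(f"quantization != {token}")
    return problems


if __name__ == "__main__":
    cfg = parse_simple_yaml(CONFIG_TEXT)
    print(check_against_filename("deepseek-32k-tp8-wint4.yaml", cfg))
```

An empty list means the filename and the config agree under the assumed convention; renaming the file to, say, `deepseek-32k-tp4-wint4.yaml` would report a `tensor_parallel_size` mismatch.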