FastDeploy/benchmarks/yaml/deepseek-32k-tp8-wint4.yaml at 28de91b50feed4c7a6a51ac71934d0d87f1cd62f

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

tianlef 2676a918f0 [Doc]fix deepseek ce (#4560)

2025-10-23 14:09:11 +08:00


quantization: wint4
load_choices: "default_v1"
graph_optimization_config:
  use_cudagraph: True
  use_unique_memory_pool: True
enable_prefix_caching: False
max_num_seqs: 256
max_model_len: 32768
tensor_parallel_size: 8
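
As a rough illustration of what the sizing keys above imply, the sketch below copies the values into a plain Python dict and computes an upper bound on the tokens in flight. It assumes that `max_num_seqs` caps the number of concurrent sequences and `max_model_len` caps the length of each one; the dict layout and the helper names `worst_case_tokens` and `check_config` are hypothetical and not part of FastDeploy.

```python
# Values copied from the YAML above; the dict itself is only an illustration.
config = {
    "quantization": "wint4",
    "enable_prefix_caching": False,
    "max_num_seqs": 256,
    "max_model_len": 32768,
    "tensor_parallel_size": 8,
}


def worst_case_tokens(cfg):
    """Upper bound on tokens in flight, assuming every sequence slot is
    occupied by a sequence of the maximum length (an assumption)."""
    return cfg["max_num_seqs"] * cfg["max_model_len"]


def check_config(cfg):
    """Hypothetical sanity checks: the sizing keys must be positive integers."""
    problems = []
    for key in ("max_num_seqs", "max_model_len", "tensor_parallel_size"):
        value = cfg.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be a positive integer, got {value!r}")
    return problems


if __name__ == "__main__":
    print("worst-case tokens in flight:", worst_case_tokens(config))
    print("problems:", check_config(config))
```

In practice the actual load is usually far below this bound, since few requests reach the full 32768-token context at the same time.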