mirror of
				https://github.com/PaddlePaddle/FastDeploy.git
				synced 2025-10-27 02:20:31 +08:00 
			
		
		
		
	 61c2f87e0c
			
		
	
	61c2f87e0c
	
	
	
		
			
			* Create README_CN.md * Update README.md * Update README_CN.md * Create README_CN.md * Update README.md * Create README_CN.md * Update README.md * Create README_CN.md * Update README.md * Create README_CN.md * Update README.md * Create README_CN.md * Update README.md * Create README_CN.md * Update README.md * Create README_CN.md * Update README.md * Update README.md * Update README_CN.md * Create README_CN.md * Update README.md * Update README.md * Update and rename README_en.md to README_CN.md * Update WebDemo.md * Update and rename WebDemo_en.md to WebDemo_CN.md * Update and rename DEVELOPMENT_cn.md to DEVELOPMENT_CN.md * Update DEVELOPMENT_CN.md * Update DEVELOPMENT.md * Update RNN.md * Update and rename RNN_EN.md to RNN_CN.md * Update README.md * Update and rename README_en.md to README_CN.md * Update README.md * Update and rename README_en.md to README_CN.md * Update README.md * Update README_cn.md * Rename README_cn.md to README_CN.md * Update README.md * Update README_cn.md * Rename README_cn.md to README_CN.md * Update export.md * Update and rename export_EN.md to export_CN.md * Update README.md * Update README.md * Create README_CN.md * Update README.md * Update README.md * Update kunlunxin.md * Update classification_result.md * Update classification_result_EN.md * Rename classification_result_EN.md to classification_result_CN.md * Update detection_result.md * Update and rename detection_result_EN.md to detection_result_CN.md * Update face_alignment_result.md * Update and rename face_alignment_result_EN.md to face_alignment_result_CN.md * Update face_detection_result.md * Update and rename face_detection_result_EN.md to face_detection_result_CN.md * Update face_recognition_result.md * Update and rename face_recognition_result_EN.md to face_recognition_result_CN.md * Update headpose_result.md * Update and rename headpose_result_EN.md to headpose_result_CN.md * Update keypointdetection_result.md * Update and rename keypointdetection_result_EN.md to keypointdetection_result_CN.md * Update matting_result.md * Update and rename matting_result_EN.md to matting_result_CN.md * Update mot_result.md * Update and rename mot_result_EN.md to mot_result_CN.md * Update ocr_result.md * Update and rename ocr_result_EN.md to ocr_result_CN.md * Update segmentation_result.md * Update and rename segmentation_result_EN.md to segmentation_result_CN.md * Update README.md * Update README.md * Update quantize.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md
English | 简体中文
YOLOv7 Quantized Model Deployment
FastDeploy supports the deployment of quantized models and provides a one-click model quantization tool. Users can use the one-click model quantization tool to quantize and deploy the models themselves or download the quantized models provided by FastDeploy directly for deployment.
FastDeploy One-Click Model Quantization Tool
FastDeploy provides a one-click quantization tool that allows users to quantize a model simply with a configuration file. For detailed tutorial, please refer to : One-Click Model Quantization Tool
Download Quantized YOLOv7 Model
Users can also directly download the quantized models in the table below for deployment.
| Model | Inference Backend | Hardware | FP32 Inference Time Delay | FP32 Inference Time Delay | Acceleration ratio | FP32 mAP | INT8 mAP | Method | 
|---|---|---|---|---|---|---|---|---|
| YOLOv7 | TensorRT | GPU | 24.57 | 9.40 | 2.61 | 51.1 | 50.8 | Quantized distillation training | 
| YOLOv7 | Paddle Inference | CPU | 1022.55 | 490.87 | 2.08 | 51.1 | 46.3 | Quantized distillation training | 
The data in the above table shows the end-to-end inference performance of FastDeploy deployment before and after model quantization.
- The test images are from COCO val2017.
- The inference time delay is the inference latency on different Runtimes in milliseconds.
- CPU is Intel(R) Xeon(R) Gold 6271C, GPU is Tesla T4, TensorRT version 8.4.15, and the fixed CPU thread is 1 for all tests.