
# FastDeploy One-Click Model Auto Compression
FastDeploy, based on PaddleSlim's Auto Compression Toolkit (ACT), provides developers with a one-click model auto compression tool that supports post-training quantization and knowledge distillation training. Taking the YOLOv5 series as an example, this document demonstrates how to install and run FastDeploy's one-click model auto compression.
## 1. Install
### Environment Dependencies
1. Install the develop version of PaddlePaddle from the official website:
   https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
2. Install PaddleSlim (develop branch):
```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git && cd PaddleSlim
python setup.py install
```
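After both dependencies are installed, a quick sanity check can confirm the environment (a minimal sketch; `paddle.utils.run_check()` is PaddlePaddle's built-in self-test):

```bash
# Confirm PaddlePaddle runs correctly on this machine (built-in self-test)
python -c "import paddle; paddle.utils.run_check()"
# Confirm PaddleSlim is importable from the same environment
python -c "import paddleslim"
```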
### Install FastDeploy Auto Compression Toolkit
Run the following commands to install:
```bash
# Install the fd-auto-compress package using pip
pip install fd-auto-compress
# Then execute the following command in the parent directory (not in the current directory)
python setup.py install
```
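To verify the installation, check that the package is visible to pip (a minimal check; the version shown depends on what was installed):

```bash
# The package metadata is listed if the installation succeeded
pip show fd-auto-compress
```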
## 2. How to Use
FastDeploy Auto Compression supports multiple compression strategies. At present it mainly provides offline (post-training) quantization and quantized distillation training. The demos below show how to use the one-click auto compression toolkit with each of the two strategies.
### Offline Quantization
#### 1. Prepare the model and calibration dataset
Developers need to prepare the model to be quantized and a calibration dataset. In this demo, execute the following commands to download the yolov5s.onnx model to be quantized and the calibration dataset:
```bash
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx
# Download the dataset. This calibration dataset consists of the first 320 images from COCO val2017
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```
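As a quick check, the extracted folder should contain the 320 calibration images (this assumes the archive extracts to a COCO_val_320 directory with the images at its top level, as the file name suggests):

```bash
# Count the extracted calibration images; 320 is expected
ls COCO_val_320 | wc -l
```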
#### 2. Run the fastdeploy --auto_compress command to compress the model
The following command quantizes the YOLOv5s model. To quantize other models, replace the --config_path value with the corresponding model configuration file in the configs folder.
```bash
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```

[Notice] PTQ is short for post-training quantization.
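The quantized model is written to the directory given by --save_dir. As a sketch of what to expect (the file names below are typical for models produced by PaddleSlim ACT, not guaranteed by this tool):

```bash
ls ./yolov5s_ptq_model/
# Typically a Paddle inference model, e.g.
# model.pdmodel  model.pdiparams
```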
#### 3. Parameters
To complete the quantization, developers only need to provide the model's config file, the quantization method, and the path to save the quantized model.
Parameter | Description |
---|---|
--config_path | Config file required for one-click quantization; ready-made examples are provided in the configs folder |
--method | Quantization method to use: PTQ for post-training quantization, QAT for quantized distillation training |
--save_dir | Output path of the quantized model, which can be deployed directly in FastDeploy |
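The same three parameters work for any model listed in section 3. For example, a PTQ run for YOLOv7 might look as follows (the config path is an assumption, formed from the yolov7_quant entry in the table of section 3 and the location of yolov5s_quant.yaml):

```bash
fastdeploy --auto_compress --config_path=./configs/detection/yolov7_quant.yaml --method='PTQ' --save_dir='./yolov7_ptq_model/'
```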
### Quantized Distillation Training
#### 1. Prepare the model to be quantized and the training dataset
FastDeploy currently supports quantized distillation training only on unlabeled images and does not support evaluating model accuracy during training. The images should come from the actual inference scenario, and their number depends on the scale of your dataset; they should cover the deployment scenarios as much as possible. In this demo, we prepare the first 320 images from the COCO 2017 validation set for users. Note: to obtain a more accurate quantized model through quantized distillation training, feel free to prepare more data and train for more epochs.
```bash
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx
# Download the dataset. This dataset consists of the first 320 images from COCO val2017
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```
#### 2. Run the fastdeploy --auto_compress command to compress the model
The following command quantizes the YOLOv5s model. To quantize other models, replace the --config_path value with the corresponding model configuration file in the configs folder.
```bash
# Pin training to a single GPU before starting, otherwise the training process may hang
export CUDA_VISIBLE_DEVICES=0
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
```
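Because PTQ and QAT read the same config file, one practical workflow is to first produce a quick post-training-quantization baseline from the identical config, then compare it against the QAT result on your own data (a usage sketch relying only on the flags documented in this README):

```bash
# Quick no-training baseline from the same config, for accuracy comparison
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```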
#### 3. Parameters
To complete the quantization, developers only need to provide the model's config file, the quantization method, and the path to save the quantized model.
Parameter | Description |
---|---|
--config_path | Config file required for one-click quantization; ready-made examples are provided in the configs folder |
--method | Quantization method to use: PTQ for post-training quantization, QAT for quantized distillation training |
--save_dir | Output path of the quantized model, which can be deployed directly in FastDeploy |
## 3. FastDeploy One-Click Model Auto Compression Config File Examples
FastDeploy currently provides compression config files for multiple models, together with the corresponding FP32 models. Users can download them and try the tool directly.
Config file | FP32 model to be compressed | Notes |
---|---|---|
mobilenetv1_ssld_quant | mobilenetv1_ssld | |
resnet50_vd_quant | resnet50_vd | |
yolov5s_quant | yolov5s | |
yolov6s_quant | yolov6s | |
yolov7_quant | yolov7 | |
ppyoloe_withNMS_quant | ppyoloe_l | Supports the s, m, l and x models of the PPYOLOE series; when exporting from PaddleDetection, export normally and do not remove the NMS |
ppyoloe_plus_withNMS_quant | ppyoloe_plus_s | Supports the s, m, l and x models of the PPYOLOE+ series; when exporting from PaddleDetection, export normally and do not remove the NMS |
pp_liteseg_quant | pp_liteseg | |
## 4. Deploy Quantized Models on FastDeploy
Once the quantized model is obtained, developers can deploy it on FastDeploy. Please refer to the following docs for more details.
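As an illustration only, deploying the quantized YOLOv5s model with one of FastDeploy's detection examples might look like the sketch below; the repository path and the infer.py flags are assumptions based on FastDeploy's example conventions, not taken from this document:

```bash
# Hypothetical sketch: adjust the example path, flags, and test image to your setup
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/detection/yolov5/python
python infer.py --model ./yolov5s_qat_model --image test.jpg --device gpu
```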