[Quantization] Update quantized model deployment examples and update readme. (#377)
* Add PaddleOCR Support
* Add PaddleOCR Support
* Add PaddleOCRv3 Support
* Add PaddleOCRv3 Support
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Add PaddleOCRv3 Support
* Add PaddleOCRv3 Supports
* Add PaddleOCRv3 Suport
* Fix Rec diff
* Remove useless functions
* Remove useless comments
* Add PaddleOCRv2 Support
* Add PaddleOCRv3 & PaddleOCRv2 Support
* remove useless parameters
* Add utils of sorting det boxes
* Fix code naming convention
* Fix code naming convention
* Fix code naming convention
* Fix bug in the Classify process
* Imporve OCR Readme
* Fix diff in Cls model
* Update Model Download Link in Readme
* Fix diff in PPOCRv2
* Improve OCR readme
* Imporve OCR readme
* Improve OCR readme
* Improve OCR readme
* Imporve OCR readme
* Improve OCR readme
* Fix conflict
* Add readme for OCRResult
* Improve OCR readme
* Add OCRResult readme
* Improve OCR readme
* Improve OCR readme
* Add Model Quantization Demo
* Fix Model Quantization Readme
* Fix Model Quantization Readme
* Add the function to do PTQ quantization
* Improve quant tools readme
* Improve quant tool readme
* Improve quant tool readme
* Add PaddleInference-GPU for OCR Rec model
* Add QAT method to fastdeploy-quantization tool
* Remove examples/slim for now
* Move configs folder
* Add Quantization Support for Classification Model
* Imporve ways of importing preprocess
* Upload YOLO Benchmark on readme
* Upload YOLO Benchmark on readme
* Upload YOLO Benchmark on readme
* Improve Quantization configs and readme
* Add support for multi-inputs model
* Add backends and params file for YOLOv7
* Add quantized model deployment support for YOLO series
* Fix YOLOv5 quantize readme
* Fix YOLO quantize readme
* Fix YOLO quantize readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Fix bug, change Fronted to ModelFormat
* Change Fronted to ModelFormat
* Add examples to deploy quantized paddleclas models
* Fix readme
* Add quantize Readme
* Add quantize Readme
* Add quantize Readme
* Modify readme of quantization tools
* Modify readme of quantization tools
* Improve quantization tools readme
* Improve quantization readme
* Improve PaddleClas quantized model deployment readme
* Add PPYOLOE-l quantized deployment examples
* Improve quantization tools readme
* Improve Quantize Readme
* Fix conflicts
* Fix conflicts
* improve readme
* Improve quantization tools and readme
* Improve quantization tools and readme
* Add quantized deployment examples for PaddleSeg model
* Fix cpp readme
* Fix memory leak of reader_wrapper function
* Fix model file name in PaddleClas quantization examples
* Update Runtime and E2E benchmark
* Update Runtime and E2E benchmark
* Rename quantization tools to auto compression tools
* Remove PPYOLOE data when deployed on MKLDNN
* Fix readme
* Support PPYOLOE with OR without NMS and update readme
* Update Readme
* Update configs and readme
* Update configs and readme
* Add Paddle-TensorRT backend in quantized model deploy examples
* Support PPYOLOE+ series
129
tools/auto_compression/README.md
Normal file
@@ -0,0 +1,129 @@
# FastDeploy One-Click Automatic Model Compression

Built on PaddleSlim's Auto Compression Toolkit (ACT), FastDeploy provides a one-click automatic model compression tool.

This document uses YOLOv5s as an example of how to install and run FastDeploy's one-click automatic model compression.

## 1. Installation

### Environment Dependencies

1. Install the develop build of PaddlePaddle, following the official installation guide:

```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
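
Once PaddlePaddle is installed, you can verify the build with Paddle's built-in self check (a quick sanity test; the exact version string of a develop build will vary):

```shell
python -c "import paddle; print(paddle.__version__); paddle.utils.run_check()"
```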

2. Install the develop branch of PaddleSlim:

```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git && cd PaddleSlim
python setup.py install
```

### Installing the fastdeploy-auto-compression tool

Run the following command in the current directory:

```
python setup.py install
```
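
Installation registers the `fastdeploy_auto_compress` console command (see the `entry_points` in `setup.py` later in this commit), so printing its argparse help is a quick smoke test:

```shell
fastdeploy_auto_compress --help
```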

## 2. Usage

### One-click compression examples

FastDeploy's one-click automatic compression can combine multiple strategies; it currently relies mainly on post-training quantization and quantization-aware distillation training. The two subsections below walk through each strategy.

#### Post-Training Quantization

##### 1. Prepare the model and the calibration dataset

You need to provide the model to be quantized and a calibration dataset.

For this example, run the following commands to download the yolov5s.onnx model to be quantized and a sample calibration dataset:

```shell
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx

# Download the dataset; this calibration set is the first 320 images of COCO val2017
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```

##### 2. Run one-click automatic compression with the fastdeploy_auto_compress command

The following command quantizes the YOLOv5s model; to quantize another model, point --config_path at one of the other model configuration files under the configs folder.

```shell
fastdeploy_auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```

Note: PTQ is short for post-training quantization, i.e. offline quantization after training.

##### 3. Parameters

To complete quantization, you only need to provide a model config file and specify the quantization method and the save path for the quantized model.

| Parameter | Description |
| -------------------- | ------------------------------------------------------------ |
| --config_path | Quantization config file required for one-click compression. [Details](./configs/README.md) |
| --method | Compression method: PTQ for post-training quantization, QAT for quantization-aware distillation training |
| --save_dir | Output path of the quantized model, which can be deployed directly with FastDeploy |

#### Quantization-Aware Distillation Training

##### 1. Prepare the model to be quantized and the training dataset

FastDeploy's quantization-aware distillation training currently only supports training on unlabeled images, and model accuracy cannot be evaluated during training.

The dataset should consist of images from the real deployment scenario; the number of images depends on the dataset size and should cover the deployment scenarios as fully as possible. For this example, we provide the first 320 images of the COCO2017 training set.

Note: to obtain a quantized model with higher accuracy from quantization-aware distillation training, prepare more data and train for more iterations.

```shell
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx

# Download the dataset; this training set is the first 320 images of the COCO2017 training set
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_train_320.tar
tar -xvf COCO_train_320.tar
```

##### 2. Run one-click automatic compression with the fastdeploy_auto_compress command

The following command quantizes the YOLOv5s model; to quantize another model, point --config_path at one of the other model configuration files under the configs folder.

```shell
# The command defaults to single-GPU training; pin a single GPU before training, otherwise training may hang.
export CUDA_VISIBLE_DEVICES=0
fastdeploy_auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
```

##### 3. Parameters

To complete quantization, you only need to provide a model config file and specify the quantization method and the save path for the quantized model.

| Parameter | Description |
| -------------------- | ------------------------------------------------------------ |
| --config_path | Quantization config file required for one-click compression. [Details](./configs/README.md) |
| --method | Compression method: PTQ for post-training quantization, QAT for quantization-aware distillation training |
| --save_dir | Output path of the quantized model, which can be deployed directly with FastDeploy |

## 3. Config File Reference for FastDeploy One-Click Automatic Compression

FastDeploy provides compression [config](./configs/) files for several models, together with the corresponding FP32 models, which you can download and try directly.

| Config file | FP32 model to compress | Notes |
| -------------------- | ------------------------------------------------------------ |----------------------------------------- |
| [mobilenetv1_ssld_quant](./configs/classification/mobilenetv1_ssld_quant.yaml) | [mobilenetv1_ssld](https://bj.bcebos.com/paddlehub/fastdeploy/MobileNetV1_ssld_infer.tgz) | |
| [resnet50_vd_quant](./configs/classification/resnet50_vd_quant.yaml) | [resnet50_vd](https://bj.bcebos.com/paddlehub/fastdeploy/ResNet50_vd_infer.tgz) | |
| [yolov5s_quant](./configs/detection/yolov5s_quant.yaml) | [yolov5s](https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx) | |
| [yolov6s_quant](./configs/detection/yolov6s_quant.yaml) | [yolov6s](https://paddle-slim-models.bj.bcebos.com/act/yolov6s.onnx) | |
| [yolov7_quant](./configs/detection/yolov7_quant.yaml) | [yolov7](https://paddle-slim-models.bj.bcebos.com/act/yolov7.onnx) | |
| [ppyoloe_withNMS_quant](./configs/detection/ppyoloe_withNMS_quant.yaml) | [ppyoloe_l](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_l_300e_coco.tar) | Supports the s/m/l/x PPYOLOE models; export from PaddleDetection normally, without removing NMS |
| [ppyoloe_plus_withNMS_quant](./configs/detection/ppyoloe_plus_withNMS_quant.yaml) | [ppyoloe_plus_s](https://bj.bcebos.com/paddlehub/fastdeploy/ppyoloe_plus_crn_s_80e_coco.tar) | Supports the s/m/l/x PPYOLOE+ models; export from PaddleDetection normally, without removing NMS |
| [pp_liteseg_quant](./configs/segmentation/pp_liteseg_quant.yaml) | [pp_liteseg](https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer.tgz) | |

## 4. Deploying Quantized Models with FastDeploy

Once you have the quantized model, you can deploy it with FastDeploy. See the example documents:

- [YOLOv5 quantized model deployment](../../examples/vision/detection/yolov5/quantize/)
- [YOLOv6 quantized model deployment](../../examples/vision/detection/yolov6/quantize/)
- [YOLOv7 quantized model deployment](../../examples/vision/detection/yolov7/quantize/)
- [PaddleClas quantized model deployment](../../examples/vision/classification/paddleclas/quantize/)
- [PaddleDetection quantized model deployment](../../examples/vision/detection/paddledetection/quantize/)
- [PaddleSeg quantized model deployment](../../examples/vision/segmentation/paddleseg/quantize/)
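
To make the deployment step concrete, below is a minimal Python sketch of loading the quantized YOLOv5s output from the PTQ example above. It assumes the `fastdeploy-gpu-python` package is installed and that `save_dir` contains `model.pdmodel`/`model.pdiparams`; check the linked example documents for the exact, maintained API.

```python
import cv2
import fastdeploy as fd

# Quantized models are typically run with the Paddle Inference or
# TensorRT backends on GPU (assumption based on the examples above).
option = fd.RuntimeOption()
option.use_gpu()
option.use_trt_backend()

# Load the quantized model produced by fastdeploy_auto_compress.
model = fd.vision.detection.YOLOv5(
    "yolov5s_ptq_model/model.pdmodel",
    "yolov5s_ptq_model/model.pdiparams",
    runtime_option=option,
    model_format=fd.ModelFormat.PADDLE)

im = cv2.imread("test.jpg")
print(model.predict(im))
```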
54
tools/auto_compression/configs/README.md
Normal file
@@ -0,0 +1,54 @@
# FastDeploy One-Click Automatic Compression Config File Guide

A FastDeploy one-click automatic compression config file contains global settings, quantization-aware distillation training settings, post-training quantization settings, and training settings.

Besides the config files FastDeploy provides in this directory, you can modify them as in the annotated example below to compress your own model.

## Annotated example

```
# Global settings
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco     # Path of the input model; replace this to quantize your own model
  format: paddle                               # Format of the input model: 'paddle' for Paddle models, 'onnx' for ONNX models
  model_filename: model.pdmodel                # File name of the model after conversion to Paddle format
  params_filename: model.pdiparams             # File name of the parameters after conversion to Paddle format
  qat_image_path: ./COCO_train_320             # Dataset used by quantization-aware distillation training; here a small amount of unlabeled data, the first 320 images of the COCO2017 training set
  ptq_image_path: ./COCO_val_320               # Calibration dataset used by post-training quantization, the first 320 images of the COCO2017 validation set
  input_list: ['image','scale_factor']         # Input names of the model to be quantized
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess   # Preprocess function applied during quantization-aware distillation training; modify it or add a new one in ../fd_auto_compress/dataset.py to support quantizing custom models
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess   # Preprocess function applied during post-training quantization; same mechanism as above
  qat_batch_size: 4                            # Batch size for quantization-aware distillation training; must be 1 for ONNX-format models

# Quantization-aware distillation training settings
Distillation:
  alpha: 1.0             # Weight of the distillation loss
  loss: soft_label       # Distillation loss algorithm

Quantization:
  onnx_format: true      # Whether to use the ONNX standard quantization format; must be true for deployment with FastDeploy
  use_pact: true         # Whether quantization training uses the PACT method
  activation_quantize_type: 'moving_average_abs_max'      # Activation quantization method
  quantize_op_types:     # OP types to quantize
  - conv2d
  - depthwise_conv2d

# Post-training quantization settings
PTQ:
  calibration_method: 'avg'     # Activation calibration algorithm for post-training quantization; options: avg, abs_max, hist, KL, mse, emd
  skip_tensor_list: None        # Conv layers that should be skipped during quantization, if any

# Training settings
TrainConfig:
  train_iter: 3000
  learning_rate: 0.00001
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
  target_metric: 0.365

```
## More detailed configuration

FastDeploy's one-click compression is powered by PaddleSlim; for more detailed quantization configuration options, see:

[Detailed tutorial on auto compression hyperparameters](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/hyperparameter_tutorial.md)
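
Because `qat_preprocess`/`ptq_preprocess` are resolved by function name (the tool calls `eval()` on them against the functions defined in `fd_auto_compress/dataset.py`), supporting a new model mostly means adding a function there. Below is a minimal sketch of a hypothetical custom preprocess, following the same convention as `yolo_image_preprocess`; the function name and the 320x320 input size are illustrative only:

```python
import cv2
import numpy as np


def my_model_image_preprocess(img, target_shape=[320, 320]):
    # Resize without keeping aspect ratio, then scale to [0, 1]
    # and convert HWC -> CHW, as the hypothetical model expects.
    img = cv2.resize(img, tuple(target_shape), interpolation=cv2.INTER_LINEAR)
    img = np.transpose(img / 255.0, [2, 0, 1])
    return img.astype(np.float32)
```

It would then be referenced from the config as `qat_preprocess: my_model_image_preprocess` (and likewise for `ptq_preprocess`).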
50
tools/auto_compression/configs/classification/mobilenetv1_ssld_quant.yaml
Normal file
@@ -0,0 +1,50 @@
Global:
  model_dir: ./MobileNetV1_ssld_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['input']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.70898

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
48
tools/auto_compression/configs/classification/resnet50_vd_quant.yaml
Normal file
@@ -0,0 +1,48 @@
Global:
  model_dir: ./ResNet50_vd_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['input']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7912

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
39
tools/auto_compression/configs/detection/ppyoloe_plus_withNMS_quant.yaml
Normal file
@@ -0,0 +1,39 @@
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco
  format: paddle
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['image','scale_factor']
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess
  qat_batch_size: 4

Distillation:
  alpha: 1.0
  loss: soft_label

Quantization:
  onnx_format: true
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None

TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 6000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
39
tools/auto_compression/configs/detection/ppyoloe_withNMS_quant.yaml
Normal file
@@ -0,0 +1,39 @@
Global:
  model_dir: ./ppyoloe_crn_s_300e_coco
  format: paddle
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['image','scale_factor']
  qat_preprocess: ppyoloe_withNMS_image_preprocess
  ptq_preprocess: ppyoloe_withNMS_image_preprocess
  qat_batch_size: 4

Distillation:
  alpha: 1.0
  loss: soft_label

Quantization:
  onnx_format: true
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None

TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 6000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
37
tools/auto_compression/configs/detection/yolov5s_quant.yaml
Normal file
@@ -0,0 +1,37 @@
Global:
  model_dir: ./yolov5s.onnx
  format: 'onnx'
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['x2paddle_images']
  qat_preprocess: yolo_image_preprocess
  ptq_preprocess: yolo_image_preprocess
  qat_batch_size: 1

Distillation:
  alpha: 1.0
  loss: soft_label

Quantization:
  onnx_format: true
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None

TrainConfig:
  train_iter: 3000
  learning_rate: 0.00001
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
  target_metric: 0.365
38
tools/auto_compression/configs/detection/yolov6s_quant.yaml
Normal file
@@ -0,0 +1,38 @@
Global:
  model_dir: ./yolov6s.onnx
  format: 'onnx'
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['x2paddle_image_arrays']
  qat_preprocess: yolo_image_preprocess
  ptq_preprocess: yolo_image_preprocess
  qat_batch_size: 1

Distillation:
  alpha: 1.0
  loss: soft_label

Quantization:
  onnx_format: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: ['conv2d_2.w_0', 'conv2d_15.w_0', 'conv2d_46.w_0', 'conv2d_11.w_0', 'conv2d_49.w_0']

TrainConfig:
  train_iter: 8000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 0.00004
37
tools/auto_compression/configs/detection/yolov7_quant.yaml
Normal file
@@ -0,0 +1,37 @@
Global:
  model_dir: ./yolov7.onnx
  format: 'onnx'
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['x2paddle_images']
  qat_preprocess: yolo_image_preprocess
  ptq_preprocess: yolo_image_preprocess
  qat_batch_size: 1

Distillation:
  alpha: 1.0
  loss: soft_label

Quantization:
  onnx_format: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None

TrainConfig:
  train_iter: 3000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 0.00004
37
tools/auto_compression/configs/segmentation/pp_liteseg_quant.yaml
Normal file
@@ -0,0 +1,37 @@
Global:
  model_dir: ./PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer
  format: paddle
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./train_stuttgart
  ptq_image_path: ./val_munster
  input_list: ['x']
  qat_preprocess: ppseg_cityscapes_qat_preprocess
  ptq_preprocess: ppseg_cityscapes_ptq_preprocess
  qat_batch_size: 16

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - conv2d_94.tmp_0

Quantization:
  onnx_format: True
  quantize_op_types:
  - conv2d
  - depthwise_conv2d

PTQ:
  calibration_method: 'avg'    # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None

TrainConfig:
  epochs: 10
  eval_iter: 180
  learning_rate: 0.0005
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
0
tools/auto_compression/fd_auto_compress/__init__.py
Normal file
388
tools/auto_compression/fd_auto_compress/dataset.py
Normal file
@@ -0,0 +1,388 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import cv2
import os
import numpy as np
import random
from PIL import Image, ImageEnhance
import paddle
"""
Preprocess for the YOLOv5/v6/v7 series
"""


def generate_scale(im, target_shape):
    # Compute a single keep-ratio scale so the short side matches the target
    # short side without the long side exceeding the target long side.
    origin_shape = im.shape[:2]
    im_size_min = np.min(origin_shape)
    im_size_max = np.max(origin_shape)
    target_size_min = np.min(target_shape)
    target_size_max = np.max(target_shape)
    im_scale = float(target_size_min) / float(im_size_min)
    if np.round(im_scale * im_size_max) > target_size_max:
        im_scale = float(target_size_max) / float(im_size_max)
    im_scale_x = im_scale
    im_scale_y = im_scale

    return im_scale_y, im_scale_x


def yolo_image_preprocess(img, target_shape=[640, 640]):
    # Resize image, keeping the aspect ratio
    im_scale_y, im_scale_x = generate_scale(img, target_shape)
    img = cv2.resize(
        img,
        None,
        None,
        fx=im_scale_x,
        fy=im_scale_y,
        interpolation=cv2.INTER_LINEAR)
    # Pad to the target shape with the conventional YOLO gray value (114)
    im_h, im_w = img.shape[:2]
    h, w = target_shape[:]
    if h != im_h or w != im_w:
        canvas = np.ones((h, w, 3), dtype=np.float32)
        canvas *= np.array([114.0, 114.0, 114.0], dtype=np.float32)
        canvas[0:im_h, 0:im_w, :] = img.astype(np.float32)
        img = canvas
    # Scale to [0, 1] and convert HWC -> CHW
    img = np.transpose(img / 255, [2, 0, 1])

    return img.astype(np.float32)


"""
Preprocess for PaddleClas models
"""


def cls_resize_short(img, target_size):
    # Resize so the short side equals target_size, keeping the aspect ratio.
    img_h, img_w = img.shape[:2]
    percent = float(target_size) / min(img_w, img_h)
    w = int(round(img_w * percent))
    h = int(round(img_h * percent))

    return cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)


def crop_image(img, target_size, center):
    # Center crop (center=True) or random crop a target_size square patch.
    height, width = img.shape[:2]
    size = target_size

    if center:
        w_start = (width - size) // 2
        h_start = (height - size) // 2
    else:
        w_start = np.random.randint(0, width - size + 1)
        h_start = np.random.randint(0, height - size + 1)
    w_end = w_start + size
    h_end = h_start + size

    return img[h_start:h_end, w_start:w_end, :]


def cls_image_preprocess(img):
    # resize
    img = cls_resize_short(img, target_size=256)
    # crop
    img = crop_image(img, target_size=224, center=True)

    # ToCHWImage & Normalize with the ImageNet mean/std
    img = np.transpose(img / 255, [2, 0, 1])

    img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
    img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
    img -= img_mean
    img /= img_std

    return img.astype(np.float32)


"""
Preprocess for PPYOLOE
"""


def ppdet_resize_no_keepratio(img, target_shape=[640, 640]):
    # Resize to target_shape without keeping the aspect ratio, and return
    # the per-axis scale factors, which PPYOLOE takes as a second input.
    im_shape = img.shape

    resize_h, resize_w = target_shape
    im_scale_y = resize_h / im_shape[0]
    im_scale_x = resize_w / im_shape[1]

    scale_factor = np.asarray([im_scale_y, im_scale_x], dtype=np.float32)
    return cv2.resize(
        img, None, None, fx=im_scale_x, fy=im_scale_y,
        interpolation=2), scale_factor  # interpolation=2 is cv2.INTER_CUBIC


def ppyoloe_withNMS_image_preprocess(img):
    img, scale_factor = ppdet_resize_no_keepratio(img, target_shape=[640, 640])

    img = np.transpose(img / 255, [2, 0, 1])

    img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
    img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
    img -= img_mean
    img /= img_std

    return img.astype(np.float32), scale_factor


def ppyoloe_plus_withNMS_image_preprocess(img):
    # PPYOLOE+ folds mean/std normalization into the model, so only
    # rescaling to [0, 1] is needed here.
    img, scale_factor = ppdet_resize_no_keepratio(img, target_shape=[640, 640])

    img = np.transpose(img / 255, [2, 0, 1])

    return img.astype(np.float32), scale_factor


"""
Preprocess for PP-LiteSeg
"""


def ppseg_cityscapes_ptq_preprocess(img):
    # ToCHWImage & Normalize
    img = np.transpose(img / 255.0, [2, 0, 1])

    img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
    img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
    img -= img_mean
    img /= img_std

    return img.astype(np.float32)


def ResizeStepScaling(img,
                      min_scale_factor=0.75,
                      max_scale_factor=1.25,
                      scale_step_size=0.25):
    # Randomly rescale the image by a factor sampled from
    # [min_scale_factor, max_scale_factor]; refer from ppseg.
    if min_scale_factor == max_scale_factor:
        scale_factor = min_scale_factor
    elif scale_step_size == 0:
        scale_factor = np.random.uniform(min_scale_factor, max_scale_factor)
    else:
        num_steps = int((max_scale_factor - min_scale_factor) /
                        scale_step_size + 1)
        scale_factors = np.linspace(min_scale_factor, max_scale_factor,
                                    num_steps).tolist()
        np.random.shuffle(scale_factors)
        scale_factor = scale_factors[0]

    w = int(round(scale_factor * img.shape[1]))
    h = int(round(scale_factor * img.shape[0]))

    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)

    return img


def RandomPaddingCrop(img,
                      crop_size=(512, 512),
                      im_padding_value=(127.5, 127.5, 127.5),
                      label_padding_value=255):
    # Pad the image up to crop_size if needed, then take a random crop.
    if isinstance(crop_size, (list, tuple)):
        if len(crop_size) != 2:
            raise ValueError(
                'Type of `crop_size` is list or tuple. It should include 2 elements, but it is {}'
                .format(crop_size))
    elif not isinstance(crop_size, int):
        raise TypeError(
            "The type of `crop_size` is invalid. It should be list, tuple or int, but it is {}"
            .format(type(crop_size)))

    if isinstance(crop_size, int):
        crop_width = crop_size
        crop_height = crop_size
    else:
        crop_width = crop_size[0]
        crop_height = crop_size[1]

    img_height = img.shape[0]
    img_width = img.shape[1]

    if img_height == crop_height and img_width == crop_width:
        return img
    else:
        pad_height = max(crop_height - img_height, 0)
        pad_width = max(crop_width - img_width, 0)
        if (pad_height > 0 or pad_width > 0):
            img = cv2.copyMakeBorder(
                img,
                0,
                pad_height,
                0,
                pad_width,
                cv2.BORDER_CONSTANT,
                value=im_padding_value)

            img_height = img.shape[0]
            img_width = img.shape[1]

        if crop_height > 0 and crop_width > 0:
            h_off = np.random.randint(img_height - crop_height + 1)
            w_off = np.random.randint(img_width - crop_width + 1)

            img = img[h_off:(crop_height + h_off), w_off:(
                w_off + crop_width), :]

        return img


def RandomHorizontalFlip(img, prob=0.5):
    # Flip the image horizontally with probability `prob`.
    if random.random() < prob:
        if len(img.shape) == 3:
            img = img[:, ::-1, :]
        elif len(img.shape) == 2:
            img = img[:, ::-1]

        return img
    else:
        return img


def brightness(im, brightness_lower, brightness_upper):
    brightness_delta = np.random.uniform(brightness_lower, brightness_upper)
    im = ImageEnhance.Brightness(im).enhance(brightness_delta)
    return im


def contrast(im, contrast_lower, contrast_upper):
    contrast_delta = np.random.uniform(contrast_lower, contrast_upper)
    im = ImageEnhance.Contrast(im).enhance(contrast_delta)
    return im


def saturation(im, saturation_lower, saturation_upper):
    saturation_delta = np.random.uniform(saturation_lower, saturation_upper)
    im = ImageEnhance.Color(im).enhance(saturation_delta)
    return im


def hue(im, hue_lower, hue_upper):
    hue_delta = np.random.uniform(hue_lower, hue_upper)
    im = np.array(im.convert('HSV'))
    im[:, :, 0] = im[:, :, 0] + hue_delta
    im = Image.fromarray(im, mode='HSV').convert('RGB')
    return im


def sharpness(im, sharpness_lower, sharpness_upper):
    sharpness_delta = np.random.uniform(sharpness_lower, sharpness_upper)
    im = ImageEnhance.Sharpness(im).enhance(sharpness_delta)
    return im


def RandomDistort(img,
                  brightness_range=0.5,
                  brightness_prob=0.5,
                  contrast_range=0.5,
                  contrast_prob=0.5,
                  saturation_range=0.5,
                  saturation_prob=0.5,
                  hue_range=18,
                  hue_prob=0.5,
                  sharpness_range=0.5,
                  sharpness_prob=0):
    # Apply the photometric distortions above in a random order, each with
    # its own probability.
    brightness_lower = 1 - brightness_range
    brightness_upper = 1 + brightness_range
    contrast_lower = 1 - contrast_range
    contrast_upper = 1 + contrast_range
    saturation_lower = 1 - saturation_range
    saturation_upper = 1 + saturation_range
    hue_lower = -hue_range
    hue_upper = hue_range
    sharpness_lower = 1 - sharpness_range
    sharpness_upper = 1 + sharpness_range
    ops = [brightness, contrast, saturation, hue, sharpness]
    random.shuffle(ops)
    params_dict = {
        'brightness': {
            'brightness_lower': brightness_lower,
            'brightness_upper': brightness_upper
        },
        'contrast': {
            'contrast_lower': contrast_lower,
            'contrast_upper': contrast_upper
        },
        'saturation': {
            'saturation_lower': saturation_lower,
            'saturation_upper': saturation_upper
        },
        'hue': {
            'hue_lower': hue_lower,
            'hue_upper': hue_upper
        },
        'sharpness': {
            'sharpness_lower': sharpness_lower,
            'sharpness_upper': sharpness_upper,
        }
    }
    prob_dict = {
        'brightness': brightness_prob,
        'contrast': contrast_prob,
        'saturation': saturation_prob,
        'hue': hue_prob,
        'sharpness': sharpness_prob
    }

    img = img.astype('uint8')
    img = Image.fromarray(img)

    for i in range(len(ops)):
        params = params_dict[ops[i].__name__]
        prob = prob_dict[ops[i].__name__]
        params['im'] = img
        if np.random.uniform(0, 1) < prob:
            img = ops[i](**params)
    img = np.asarray(img).astype('float32')
    return img


def ppseg_cityscapes_qat_preprocess(img):
    # Training-time augmentation pipeline for Cityscapes, followed by the
    # same normalization as the PTQ preprocess.
    min_scale_factor = 0.5
    max_scale_factor = 2.0
    scale_step_size = 0.25

    crop_size = (1024, 512)

    brightness_range = 0.5
    contrast_range = 0.5
    saturation_range = 0.5

    img = ResizeStepScaling(
        img,
        min_scale_factor=min_scale_factor,
        max_scale_factor=max_scale_factor,
        scale_step_size=scale_step_size)
    img = RandomPaddingCrop(img, crop_size=crop_size)
    img = RandomHorizontalFlip(img)
    img = RandomDistort(
        img,
        brightness_range=brightness_range,
        contrast_range=contrast_range,
        saturation_range=saturation_range)

    img = np.transpose(img / 255.0, [2, 0, 1])
    img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
    img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
    img -= img_mean
    img /= img_std
    return img.astype(np.float32)
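
A small sketch for sanity-checking one of these preprocess functions in isolation, assuming the package is installed and a local `test.jpg` exists:

```python
import cv2

from fd_auto_compress.dataset import yolo_image_preprocess

img = cv2.imread("test.jpg")      # HWC, BGR, uint8
out = yolo_image_preprocess(img)
# Expect a CHW float32 tensor of shape (3, 640, 640) with values in [0, 1].
print(out.shape, out.dtype, out.min(), out.max())
```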
195
tools/auto_compression/fd_auto_compress/fd_auto_compress.py
Normal file
@@ -0,0 +1,195 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import sys
import numpy as np
import time
import argparse
from tqdm import tqdm
import paddle
from paddleslim.common import load_config, load_onnx_model
from paddleslim.auto_compression import AutoCompression
from paddleslim.quant import quant_post_static
from fd_auto_compress.dataset import *


def argsparser():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        '--config_path',
        type=str,
        default=None,
        help="path of compression strategy config.",
        required=True)
    parser.add_argument(
        '--method',
        type=str,
        default=None,
        help="choose PTQ or QAT as quantization method",
        required=True)
    parser.add_argument(
        '--save_dir',
        type=str,
        default='output',
        help="directory to save compressed model.")
    parser.add_argument(
        '--devices',
        type=str,
        default='gpu',
        help="which device used to compress.")

    return parser


def reader_wrapper(reader, input_list):
    # Wrap a DataLoader so each batch is yielded as a feed dict mapping
    # the model's input names to numpy arrays.

    if isinstance(input_list, list) and len(input_list) == 1:
        input_name = input_list[0]

        def gen():
            for i, data in enumerate(reader()):
                in_dict = {}
                imgs = np.array(data[0])
                in_dict[input_name] = imgs
                yield in_dict

        return gen

    if isinstance(input_list, list) and len(input_list) > 1:

        def gen():
            for idx, data in enumerate(reader()):
                in_dict = {}
                for i in range(len(input_list)):
                    input_name = input_list[i]
                    feed_data = np.array(data[0][i])
                    in_dict[input_name] = feed_data

                yield in_dict

        return gen


def main():

    time_s = time.time()

    paddle.enable_static()
    parser = argsparser()
    FLAGS = parser.parse_args()

    assert FLAGS.devices in ['cpu', 'gpu', 'xpu', 'npu']
    paddle.set_device(FLAGS.devices)
    global global_config

    # QAT (quantization-aware distillation training) compression
    if FLAGS.method == 'QAT':

        all_config = load_config(FLAGS.config_path)
        assert "Global" in all_config, f"Key 'Global' not found in config file. \n{all_config}"
        global_config = all_config["Global"]
        input_list = global_config['input_list']

        assert os.path.exists(global_config[
            'qat_image_path']), "qat_image_path does not exist!"
        paddle.vision.image.set_image_backend('cv2')
        # transform could be customized.
        train_dataset = paddle.vision.datasets.ImageFolder(
            global_config['qat_image_path'],
            transform=eval(global_config['qat_preprocess']))
        train_loader = paddle.io.DataLoader(
            train_dataset,
            batch_size=global_config['qat_batch_size'],
            shuffle=True,
            drop_last=True,
            num_workers=0)
        train_loader = reader_wrapper(train_loader, input_list=input_list)
        eval_func = None

        # ACT compression
        ac = AutoCompression(
            model_dir=global_config['model_dir'],
            model_filename=global_config['model_filename'],
            params_filename=global_config['params_filename'],
            train_dataloader=train_loader,
            save_dir=FLAGS.save_dir,
            config=all_config,
            eval_callback=eval_func)
        ac.compress()

    # PTQ (post-training quantization) compression
    if FLAGS.method == 'PTQ':

        # Read Global config and prepare dataset
        all_config = load_config(FLAGS.config_path)
        assert "Global" in all_config, f"Key 'Global' not found in config file. \n{all_config}"
        global_config = all_config["Global"]
        input_list = global_config['input_list']

        assert os.path.exists(global_config[
            'ptq_image_path']), "ptq_image_path does not exist!"

        paddle.vision.image.set_image_backend('cv2')
        # transform could be customized.
        val_dataset = paddle.vision.datasets.ImageFolder(
            global_config['ptq_image_path'],
            transform=eval(global_config['ptq_preprocess']))
        val_loader = paddle.io.DataLoader(
            val_dataset,
            batch_size=1,
            shuffle=True,
            drop_last=True,
            num_workers=0)
        val_loader = reader_wrapper(val_loader, input_list=input_list)

        # Read PTQ config
        assert "PTQ" in all_config, f"Key 'PTQ' not found in config file. \n{all_config}"
        ptq_config = all_config["PTQ"]

        # Initialize the executor
        place = paddle.CUDAPlace(
            0) if FLAGS.devices == 'gpu' else paddle.CPUPlace()
        exe = paddle.static.Executor(place)

        # Read ONNX or Paddle format model
        if global_config['format'] == 'onnx':
            load_onnx_model(global_config["model_dir"])
            # Note: rstrip removes a trailing set of characters, not a suffix;
            # this assumes the model stem does not end in '.', 'o', 'n' or 'x'.
            inference_model_path = global_config["model_dir"].rstrip().rstrip(
                '.onnx') + '_infer'
        else:
            inference_model_path = global_config["model_dir"].rstrip('/')

        quant_post_static(
            executor=exe,
            model_dir=inference_model_path,
            quantize_model_path=FLAGS.save_dir,
            data_loader=val_loader,
            model_filename=global_config["model_filename"],
            params_filename=global_config["params_filename"],
            batch_size=32,
            batch_nums=10,
            algo=ptq_config['calibration_method'],
            hist_percent=0.999,
            is_full_quantize=False,
            bias_correction=False,
            onnx_format=True,
            skip_tensor_list=ptq_config['skip_tensor_list']
            if 'skip_tensor_list' in ptq_config else None)

    time_total = time.time() - time_s
    print("Finish Compression, total time used is : ", time_total, "seconds.")


if __name__ == '__main__':
    main()
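
To illustrate what `reader_wrapper` feeds the compressor, here is a toy sketch (not part of the tool; it assumes paddle and paddleslim are installed so the module imports cleanly) for the two-input PPYOLOE case; each element yielded by the wrapped reader becomes one feed dict keyed by the model's input names:

```python
import numpy as np

from fd_auto_compress.fd_auto_compress import reader_wrapper


def toy_reader():
    # Mimics one batch from paddle.io.DataLoader over ImageFolder for a
    # two-input model: data[0][0] is the image, data[0][1] the scale factor.
    for _ in range(2):
        yield [(np.zeros((1, 3, 640, 640), np.float32),
                np.ones((1, 2), np.float32))]


gen = reader_wrapper(toy_reader, input_list=['image', 'scale_factor'])
for feed in gen():
    print({name: arr.shape for name, arr in feed.items()})
    # {'image': (1, 3, 640, 640), 'scale_factor': (1, 2)}
```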
1
tools/auto_compression/requirements.txt
Normal file
@@ -0,0 +1 @@
paddleslim
26
tools/auto_compression/setup.py
Normal file
@@ -0,0 +1,26 @@
import setuptools
import fd_auto_compress

long_description = "fastdeploy-auto-compression is a toolkit for model auto compression of FastDeploy.\n\n"
long_description += "Usage: fastdeploy_auto_compress --config_path=./yolov7_tiny_qat_dis.yaml --method='QAT' --save_dir='../v7_qat_outmodel/' \n"

with open("requirements.txt") as fin:
    REQUIRED_PACKAGES = fin.read()

setuptools.setup(
    name="fastdeploy-auto-compression",  # name of package
    description="A toolkit for model auto compression of FastDeploy.",
    long_description=long_description,
    long_description_content_type="text/plain",
    packages=setuptools.find_packages(),
    install_requires=REQUIRED_PACKAGES,
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: Apache Software License",
        "Operating System :: OS Independent",
    ],
    license='Apache 2.0',
    entry_points={
        'console_scripts':
        ['fastdeploy_auto_compress=fd_auto_compress.fd_auto_compress:main', ]
    })