[Quantization] Change the usage of FastDeploy auto compression tool. (#576)
* Add PaddleOCR Support
* Add PaddleOCR Support
* Add PaddleOCRv3 Support
* Add PaddleOCRv3 Support
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Add PaddleOCRv3 Support
* Add PaddleOCRv3 Support
* Add PaddleOCRv3 Support
* Fix Rec diff
* Remove useless functions
* Remove useless comments
* Add PaddleOCRv2 Support
* Add PaddleOCRv3 & PaddleOCRv2 Support
* remove useless parameters
* Add utils of sorting det boxes
* Fix code naming convention
* Fix code naming convention
* Fix code naming convention
* Fix bug in the Classify process
* Improve OCR Readme
* Fix diff in Cls model
* Update Model Download Link in Readme
* Fix diff in PPOCRv2
* Improve OCR readme
* Improve OCR readme
* Improve OCR readme
* Improve OCR readme
* Improve OCR readme
* Improve OCR readme
* Fix conflict
* Add readme for OCRResult
* Improve OCR readme
* Add OCRResult readme
* Improve OCR readme
* Improve OCR readme
* Add Model Quantization Demo
* Fix Model Quantization Readme
* Fix Model Quantization Readme
* Add the function to do PTQ quantization
* Improve quant tools readme
* Improve quant tool readme
* Improve quant tool readme
* Add PaddleInference-GPU for OCR Rec model
* Add QAT method to fastdeploy-quantization tool
* Remove examples/slim for now
* Move configs folder
* Add Quantization Support for Classification Model
* Improve ways of importing preprocess
* Upload YOLO Benchmark on readme
* Upload YOLO Benchmark on readme
* Upload YOLO Benchmark on readme
* Improve Quantization configs and readme
* Add support for multi-inputs model
* Add backends and params file for YOLOv7
* Add quantized model deployment support for YOLO series
* Fix YOLOv5 quantize readme
* Fix YOLO quantize readme
* Fix YOLO quantize readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Improve quantize YOLO readme
* Fix bug, change Frontend to ModelFormat
* Change Frontend to ModelFormat
* Add examples to deploy quantized paddleclas models
* Fix readme
* Add quantize Readme
* Add quantize Readme
* Add quantize Readme
* Modify readme of quantization tools
* Modify readme of quantization tools
* Improve quantization tools readme
* Improve quantization readme
* Improve PaddleClas quantized model deployment readme
* Add PPYOLOE-l quantized deployment examples
* Improve quantization tools readme
* Improve Quantize Readme
* Fix conflicts
* Fix conflicts
* improve readme
* Improve quantization tools and readme
* Improve quantization tools and readme
* Add quantized deployment examples for PaddleSeg model
* Fix cpp readme
* Fix memory leak of reader_wrapper function
* Fix model file name in PaddleClas quantization examples
* Update Runtime and E2E benchmark
* Update Runtime and E2E benchmark
* Rename quantization tools to auto compression tools
* Remove PPYOLOE data when deployed on MKLDNN
* Fix readme
* Support PPYOLOE with OR without NMS and update readme
* Update Readme
* Update configs and readme
* Update configs and readme
* Add Paddle-TensorRT backend in quantized model deploy examples
* Support PPYOLOE+ series
* Add reused_input_tensors for PPYOLOE
* Improve fastdeploy tools usage
* improve fastdeploy tool
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
* remove modify
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
* Improve fastdeploy auto compression tool
@@ -3,3 +3,4 @@ requests
tqdm
numpy
opencv-python
fd-auto-compress>=0.0.0
tools/README.md (new file, 35 lines)
@@ -0,0 +1,35 @@
# FastDeploy Toolkit

FastDeploy provides a set of efficient, easy-to-use tools that optimize the deployment experience and improve inference performance.

For example, based on PaddleSlim's Auto Compression Toolkit (ACT), FastDeploy offers a one-click model auto compression tool: with a single command, users can automatically compress a model and then deploy the compressed model with FastDeploy for faster inference. Taking this tool as an example, this document describes how to install it and where to find the corresponding usage documentation.


## FastDeploy One-Click Model Auto Compression Tool

### Environment Preparation
1. Install the develop version of PaddlePaddle by following the official instructions:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```

2. Install the develop version of PaddleSlim:
```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git && cd PaddleSlim
python setup.py install
```

3. Install the fd-auto-compress one-click model auto compression tool:
```bash
# Install fd-auto-compress via pip.
# The FastDeploy Python package already includes this tool, so there is no need to install it again.
pip install fd-auto-compress

# Run the following command in the current directory
python setup.py install
```

### Using the One-Click Model Auto Compression Tool
Once the steps above have completed successfully, the FastDeploy one-click model auto compression tool is ready to use, for example:

```bash
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```
For detailed usage, please refer to [FastDeploy One-Click Model Auto Compression Tool](./auto_compression/README.md)
tools/README_EN.md (new file, 35 lines)
@@ -0,0 +1,35 @@
# FastDeploy Toolkit

FastDeploy provides a series of efficient and easy-to-use tools to optimize the deployment experience and improve inference performance.

For example, based on PaddleSlim's Auto Compression Toolkit (ACT), FastDeploy provides a one-click model auto compression tool that allows users to compress a model with a single command. This document takes the one-click model auto compression tool as an example, explains how to install it, and points to the corresponding usage documentation.


## FastDeploy One-Click Model Auto Compression Tool

### Environment Preparation
1. Install the develop version of PaddlePaddle
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```

2. Install the develop version of PaddleSlim
```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git && cd PaddleSlim
python setup.py install
```

3. Install the fd-auto-compress package
```bash
# Install fd-auto-compress via pip
# This tool is already included in the FastDeploy Python package, so you do not need to install it again.
pip install fd-auto-compress

# Execute in the current directory
python setup.py install
```

### Usage of the One-Click Model Auto Compression Tool
After the steps above have completed successfully, you can use the FastDeploy one-click model auto compression tool as shown in the following example.

```bash
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```
For detailed documentation, please refer to [FastDeploy One-Click Model Auto Compression Tool](./auto_compression/README.md)
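For reference, the same compression run can also be driven from a Python script. This is a minimal sketch, assuming the fastdeploy-tools package from this commit is installed so that the `fastdeploy` command is on PATH; the config path and output directory simply mirror the example above.

```python
# Sketch: invoke the one-click auto compression CLI from Python.
# Assumes `fastdeploy` (from fastdeploy-tools) is installed and on PATH.
import subprocess

subprocess.run(
    [
        "fastdeploy",
        "--auto_compress",
        "--config_path=./configs/detection/yolov5s_quant.yaml",
        "--method=PTQ",
        "--save_dir=./yolov5s_ptq_model/",
    ],
    check=True,  # raise CalledProcessError if compression exits with a non-zero status
)
```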
@@ -18,8 +18,11 @@ python setup.py install
```

### Installing the fastdeploy-auto-compression One-Click Model Auto Compression Tool
Run the following commands in the current directory:
```
# Install the fd-auto-compress package via pip
pip install fd-auto-compress

# Then run the following command in the parent directory (not this directory)
python setup.py install
```

@@ -43,12 +46,11 @@ wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```

##### 2. Use the fastdeploy_auto_compress command to run one-click model auto compression:
##### 2. Use the fastdeploy --auto_compress command to run one-click model auto compression:
The following command quantizes the yolov5s model; to quantize another model, replace config_path with one of the other model configuration files in the configs folder.
```shell
fastdeploy_auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```
[Note] Offline quantization (post-training quantization) is abbreviated as PTQ.

##### 3. Parameter description

@@ -78,12 +80,12 @@ wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_train_320.tar
tar -xvf COCO_train_320.tar
```

##### 2. Use the fastdeploy_auto_compress command to run one-click model auto compression:
##### 2. Use the fastdeploy --auto_compress command to run one-click model auto compression:
The following command quantizes the yolov5s model; to quantize another model, replace config_path with one of the other model configuration files in the configs folder.
```shell
# The command runs single-GPU training by default; specify a single GPU before training, otherwise the process may hang during training.
export CUDA_VISIBLE_DEVICES=0
fastdeploy_auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
```

##### 3. Parameter description
@@ -1,7 +1,5 @@
# FastDeploy One-Click Model Auto Compression


FastDeploy, based on PaddleSlim's Auto Compression Toolkit (ACT), provides developers with a one-click model auto compression tool that supports post-training quantization and knowledge distillation training.
We take the YOLOv5 series as an example to demonstrate how to install and run FastDeploy's one-click model auto compression.

@@ -24,9 +22,13 @@ python setup.py install

### Install FastDeploy Auto Compression Toolkit

Run the following command in the current directory
Run the following command to install

```
# Install the fd-auto-compress package using pip
pip install fd-auto-compress

# Execute the following command in the parent directory (not in the current directory)
python setup.py install
```

@@ -52,12 +54,12 @@ wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```

##### 2. Run the fastdeploy_auto_compress command to compress the model
##### 2. Run the fastdeploy --auto_compress command to compress the model

The following command quantizes the yolov5s model; to quantize other models, replace config_path with another model configuration file from the configs folder.

```shell
fastdeploy_quant --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```

[notice] PTQ is short for post-training quantization

@@ -89,14 +91,14 @@ wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```

##### 2. Use the fastdeploy_auto_compress command to compress models
##### 2. Use the fastdeploy --auto_compress command to compress models

The following command quantizes the yolov5s model; to quantize other models, replace config_path with another model configuration file from the configs folder.

```shell
# Specify a single GPU before training, otherwise the process may get stuck during training.
export CUDA_VISIBLE_DEVICES=0
fastdeploy_quant --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
fastdeploy --auto_compress --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
```

##### 3. Parameters
@@ -0,0 +1 @@
import fd_auto_compress
@@ -82,13 +82,11 @@ def reader_wrapper(reader, input_list):
    return gen


def main():
def auto_compress(FLAGS):

    #FLAGS needs parse
    time_s = time.time()

    paddle.enable_static()
    parser = argsparser()
    FLAGS = parser.parse_args()

    assert FLAGS.devices in ['cpu', 'gpu', 'xpu', 'npu']
    paddle.set_device(FLAGS.devices)
@@ -189,7 +187,3 @@ def main():
    time_total = time.time() - time_s
    print("Finish Compression, total time used is : ", time_total, "seconds.")


if __name__ == '__main__':
    main()
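With main() refactored into auto_compress(FLAGS), the compression entry point can also be called directly from Python. A minimal sketch, assuming fd_auto_compress is installed and that auto_compress only reads the flags defined by the common_tools argument parser (config_path, method, save_dir, devices):

```python
# Sketch: call the refactored entry point directly instead of via the CLI.
from types import SimpleNamespace

from fd_auto_compress.fd_auto_compress import auto_compress

# Flag values mirror the README examples; adjust paths to your own model/config.
flags = SimpleNamespace(
    config_path="./configs/detection/yolov5s_quant.yaml",
    method="PTQ",
    save_dir="./yolov5s_ptq_model/",
    devices="gpu",
)
auto_compress(flags)
```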
@@ -1,26 +1,25 @@
import setuptools
import fd_auto_compress

long_description = "fastdeploy-auto-compression is a toolkit for model auto compression of FastDeploy.\n\n"
long_description += "Usage: fastdeploy_auto_compress --config_path=./yolov7_tiny_qat_dis.yaml --method='QAT' --save_dir='../v7_qat_outmodel/' \n"
long_description = "fd_auto_compress is a toolkit for model auto compression of FastDeploy.\n\n"
long_description += "Usage: fastdeploy --auto_compress --config_path=./yolov7_tiny_qat_dis.yaml --method='QAT' --save_dir='../v7_qat_outmodel/' \n"

with open("requirements.txt") as fin:
    REQUIRED_PACKAGES = fin.read()

setuptools.setup(
    name="fastdeploy-auto-compression",  # name of package
    name="fd_auto_compress",  # name of package
    description="A toolkit for model auto compression of FastDeploy.",
    long_description=long_description,
    long_description_content_type="text/plain",
    packages=setuptools.find_packages(),
    author='fastdeploy',
    author_email='fastdeploy@baidu.com',
    url='https://github.com/PaddlePaddle/FastDeploy.git',
    install_requires=REQUIRED_PACKAGES,
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: Apache Software License",
        "Operating System :: OS Independent",
    ],
    license='Apache 2.0',
    entry_points={
        'console_scripts':
        ['fastdeploy_auto_compress=fd_auto_compress.fd_auto_compress:main', ]
    })
    license='Apache 2.0', )
tools/common_tools/__init__.py (new empty file)
tools/common_tools/common_tools.py (new file, 51 lines)
@@ -0,0 +1,51 @@
import argparse


def argsparser():

    parser = argparse.ArgumentParser(description=__doc__)
    ## arguments for auto compression
    parser.add_argument('--auto_compress', default=False, action='store_true')
    parser.add_argument(
        '--config_path',
        type=str,
        default=None,
        help="path of compression strategy config.",
        required=True)
    parser.add_argument(
        '--method',
        type=str,
        default=None,
        help="choose PTQ or QAT as quantization method",
        required=True)
    parser.add_argument(
        '--save_dir',
        type=str,
        default='./output',
        help="directory to save compressed model.")
    parser.add_argument(
        '--devices',
        type=str,
        default='gpu',
        help="which device to use for compression.")

    ## arguments for other tools
    return parser


def main():

    args = argsparser().parse_args()
    if args.auto_compress:
        try:
            from fd_auto_compress.fd_auto_compress import auto_compress
            print("Welcome to use FastDeploy Auto Compression Toolkit!")
            auto_compress(args)
        except ImportError:
            print(
                "Can not start auto compression successfully! Please check if you have installed it!"
            )


if __name__ == '__main__':
    main()
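For reference, a minimal sketch of driving the main() defined above programmatically, assuming this common_tools package is importable and using the same illustrative flags as the README examples:

```python
# Sketch: programmatic equivalent of `fastdeploy --auto_compress ...`,
# reusing the main() dispatcher from common_tools.
import sys

from common_tools.common_tools import main

sys.argv = [
    "fastdeploy",
    "--auto_compress",
    "--config_path=./configs/detection/yolov5s_quant.yaml",
    "--method=PTQ",
    "--save_dir=./yolov5s_ptq_model/",
]
main()
```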
tools/setup.py (new file, 20 lines)
@@ -0,0 +1,20 @@
import setuptools

long_description = "fastdeploy-tools is a toolkit for FastDeploy, including auto compression, etc.\n\n"
long_description += "Usage of auto compression: fastdeploy --auto_compress --config_path=./yolov7_tiny_qat_dis.yaml --method='QAT' --save_dir='./v7_qat_outmodel/' \n"

setuptools.setup(
    name="fastdeploy-tools",  # name of package
    description="A toolkit for FastDeploy.",
    long_description=long_description,
    long_description_content_type="text/plain",
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: Apache Software License",
        "Operating System :: OS Independent",
    ],
    license='Apache 2.0',
    entry_points={
        'console_scripts': ['fastdeploy = common_tools.common_tools:main', ]
    })
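The console_scripts entry above is what wires the `fastdeploy` command to common_tools.common_tools:main. A quick check of that wiring after installation, as a sketch assuming Python 3.10+ for the group keyword:

```python
# Sketch: confirm which function the installed `fastdeploy` console script
# points at (expected: common_tools.common_tools:main).
from importlib.metadata import entry_points

for ep in entry_points(group="console_scripts"):
    if ep.name == "fastdeploy":
        print(f"{ep.name} -> {ep.value}")
```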