mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2025-10-26 10:00:33 +08:00
[Doc]Add English version of documents in examples/ (#1042)
* First commit
* Add one missed translation
* deleted: docs/en/quantize.md
* Update one translation
* Update en version
* Update one translation in code
* Standardize one writing
* Standardize one writing
* Update some en version
* Fix a grammar problem
* Update en version for api/vision result
* Merge branch 'develop' of https://github.com/charl-u/FastDeploy into develop
* Checkout the link in README in vision_results/ to the en documents
* Modify a title
* Add link to serving/docs/
* Finish translation of demo.md
* Update english version of serving/docs/
* Update title of readme
* Update some links
* Modify a title
* Update some links
* Update en version of java android README
* Modify some titles
* Modify some titles
* Modify some titles
* modify article to document
* update some english version of documents in examples
* Add english version of documents in examples/visions
* Sync to current branch
* Add english version of documents in examples
@@ -1,36 +1,37 @@
English | [简体中文](README_CN.md)

# PaddleSeg Quantized Model Deployment

FastDeploy supports deploying quantized models and provides a one-click automatic model compression tool.
You can quantize a model yourself with this tool and then deploy it, or directly download the quantized models provided by FastDeploy for deployment.

## FastDeploy One-Click Automatic Model Compression Tool

FastDeploy provides a one-click automatic model compression tool that quantizes a model from a single configuration file.
For details, please refer to the [one-click automatic model compression tool](../../../../../tools/common_tools/auto_compression/).

Note: Inference with a quantized model still requires the deploy.yaml file from the FP32 model folder. The folder of a model you quantized yourself does not contain this file, so copy it from the FP32 model folder into the quantized model folder.

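The copy step in the note above can be sketched as follows. The folder names `fp32_model` and `quantized_model` are hypothetical and only stand in for your own paths; the placeholder file is created just so the sketch runs on its own.

```bash
set -euo pipefail

# Hypothetical folder names under a scratch directory, for illustration only.
work_dir="$(mktemp -d)"
fp32_dir="$work_dir/fp32_model"
quant_dir="$work_dir/quantized_model"

# Stand-in folders and config so the sketch is self-contained;
# in practice both folders already exist.
mkdir -p "$fp32_dir" "$quant_dir"
echo "# placeholder config" > "$fp32_dir/deploy.yaml"

# Copy deploy.yaml only if the quantized folder does not have it yet.
if [ ! -f "$quant_dir/deploy.yaml" ]; then
  cp "$fp32_dir/deploy.yaml" "$quant_dir/deploy.yaml"
fi
```

Guarding the copy with a file test avoids overwriting a deploy.yaml that is already in the quantized folder.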
## Download the Quantized PaddleSeg Model

You can also directly download the quantized models in the following table for deployment (click a model name to download).

Notes on the benchmark tables:
- Runtime latency is the model's inference latency on each Runtime, including CPU->GPU data copy, GPU inference, and GPU->CPU data copy time. It excludes each model's pre- and post-processing time.
- End-to-end latency is the model's latency in an actual inference scenario, including pre- and post-processing.
- All measured latencies are averages over 1000 inferences, in milliseconds.
- INT8 + FP16 means enabling the Runtime's FP16 inference option while running the INT8 quantized model.
- INT8 + FP16 + PM means additionally enabling Pinned Memory while running the INT8 quantized model with FP16 turned on, which speeds up the GPU->CPU data copy.
- Max speedup is the FP32 latency divided by the fastest INT8 inference latency.
- When the method is quantization distillation training, the quantized model is trained on a small unlabeled dataset and its accuracy is verified on the full validation set, so the reported INT8 accuracy is not the highest achievable INT8 accuracy.
- The CPU is an Intel(R) Xeon(R) Gold 6271C with the CPU thread count fixed at 1 in all tests. The GPU is a Tesla T4, and the TensorRT version is 8.4.15.

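The max speedup rule above can be sketched as a small helper. The function name `max_speedup` is an illustrative assumption and not part of FastDeploy; it divides the FP32 latency by the smallest numeric INT8 latency and skips `None` entries.

```bash
# max_speedup FP32_LATENCY INT8_LATENCY...
# Prints FP32 latency divided by the fastest (smallest) numeric INT8 latency.
# Non-numeric entries such as "None" are ignored.
max_speedup() {
  local fp32="$1"
  shift
  printf '%s\n' "$@" | awk -v fp32="$fp32" '
    /^[0-9]+(\.[0-9]+)?$/ { if (best == "" || $1 + 0 < best + 0) best = $1 }
    END { if (best == "") exit 1; printf "%.2f\n", fp32 / best }'
}

# FP32 latency 1138.04 ms; INT8, INT8 + FP16, INT8 + FP16 + PM latencies follow.
max_speedup 1138.04 602.62 None None   # prints 1.89
```

The helper returns a non-zero status when no numeric INT8 latency is supplied, so a row with only `None` values does not yield a misleading ratio.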
#### Runtime Benchmark
| Model | Inference Backend | Hardware | FP32 Runtime Latency | INT8 Runtime Latency | INT8 + FP16 Runtime Latency | INT8+FP16+PM Runtime Latency | Max Speedup | FP32 mIoU | INT8 mIoU | Method |
| ------------------- | -----------------|-----------| -------- |-------- |-------- | --------- |-------- |----- |----- |----- |
| [PP-LiteSeg-T(STDC1)-cityscapes](https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_QAT_new.tar) | Paddle Inference | CPU | 1138.04 | 602.62 | None | None | 1.89 | 77.37 | 71.62 | Quantization Distillation Training |

#### End to End Benchmark
| Model | Inference Backend | Hardware | FP32 End2End Latency | INT8 End2End Latency | INT8 + FP16 End2End Latency | INT8+FP16+PM End2End Latency | Max Speedup | FP32 mIoU | INT8 mIoU | Method |
| ------------------- | -----------------|-----------| -------- |-------- |-------- | --------- |-------- |----- |----- |----- |
| [PP-LiteSeg-T(STDC1)-cityscapes](https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_QAT_new.tar) | Paddle Inference | CPU | 4726.65 | 4134.91 | None | None | 1.14 | 77.37 | 71.62 | Quantization Distillation Training |

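A rough way to read the two tables together is to subtract the Runtime latency from the end-to-end latency. Per the notes, the difference is approximately the pre- and post-processing time, though the two figures come from separate measurements. The sketch below uses only the numbers in the tables; the function name `latency_breakdown` is an illustrative assumption.

```bash
# Latencies in milliseconds, copied from the two benchmark tables.
latency_breakdown() {
  awk 'BEGIN {
    rt_fp32 = 1138.04;  rt_int8 = 602.62     # Runtime latency
    e2e_fp32 = 4726.65; e2e_int8 = 4134.91   # end-to-end latency
    printf "pre/post-processing FP32: %.2f ms\n", e2e_fp32 - rt_fp32
    printf "pre/post-processing INT8: %.2f ms\n", e2e_int8 - rt_int8
    printf "end-to-end speedup: %.2f\n", e2e_fp32 / e2e_int8
  }'
}

latency_breakdown
```

Under this reading, pre- and post-processing take roughly 3.5 seconds in both configurations, which would explain why the end-to-end speedup (1.14) is much smaller than the Runtime speedup (1.89).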
## Detailed Deployment Documents

- [Python Deployment](python)
- [C++ Deployment](cpp)

37
examples/vision/segmentation/paddleseg/quantize/README_CN.md
Normal file
@@ -0,0 +1,37 @@
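[English](README.md) | 简体中文

# PaddleSeg Quantized Model Deployment

FastDeploy supports deploying quantized models and provides a one-click automatic model compression tool.
You can quantize a model yourself with this tool and then deploy it, or directly download the quantized models provided by FastDeploy for deployment.

## FastDeploy One-Click Automatic Model Compression Tool

FastDeploy provides a one-click automatic model compression tool that quantizes a model from a single configuration file.
For the detailed tutorial, see the [one-click automatic model compression tool](../../../../../tools/common_tools/auto_compression/).

Note: Inference with a quantized model still requires the deploy.yaml file from the FP32 model folder. The folder of a model you quantized yourself does not contain this file, so copy it from the FP32 model folder into the quantized model folder.

## Download the Quantized PaddleSeg Model

You can also directly download the quantized models in the following table for deployment (click a model name to download).
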
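Notes on the benchmark tables:
- Runtime latency is the model's inference latency on each Runtime, including CPU->GPU data copy, GPU inference, and GPU->CPU data copy time. It excludes each model's pre- and post-processing time.
- End-to-end latency is the model's latency in an actual inference scenario, including pre- and post-processing.
- All measured latencies are averages over 1000 inferences, in milliseconds.
- INT8 + FP16 means enabling the Runtime's FP16 inference option while running the INT8 quantized model.
- INT8 + FP16 + PM means additionally enabling Pinned Memory while running the INT8 quantized model with FP16 turned on, which speeds up the GPU->CPU data copy.
- Max speedup is the FP32 latency divided by the fastest INT8 inference latency.
- When the method is quantization distillation training, the quantized model is trained on a small unlabeled dataset and its accuracy is verified on the full validation set, so the reported INT8 accuracy is not the highest achievable INT8 accuracy.
- The CPU is an Intel(R) Xeon(R) Gold 6271C with the CPU thread count fixed at 1 in all tests. The GPU is a Tesla T4, and the TensorRT version is 8.4.15.
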
#### Runtime Benchmark
| Model | Inference Backend | Hardware | FP32 Runtime Latency | INT8 Runtime Latency | INT8 + FP16 Runtime Latency | INT8+FP16+PM Runtime Latency | Max Speedup | FP32 mIoU | INT8 mIoU | Method |
| ------------------- | -----------------|-----------| -------- |-------- |-------- | --------- |-------- |----- |----- |----- |
| [PP-LiteSeg-T(STDC1)-cityscapes](https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_QAT_new.tar) | Paddle Inference | CPU | 1138.04 | 602.62 | None | None | 1.89 | 77.37 | 71.62 | Quantization Distillation Training |

#### End to End Benchmark
| Model | Inference Backend | Hardware | FP32 End2End Latency | INT8 End2End Latency | INT8 + FP16 End2End Latency | INT8+FP16+PM End2End Latency | Max Speedup | FP32 mIoU | INT8 mIoU | Method |
| ------------------- | -----------------|-----------| -------- |-------- |-------- | --------- |-------- |----- |----- |----- |
| [PP-LiteSeg-T(STDC1)-cityscapes](https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_QAT_new.tar) | Paddle Inference | CPU | 4726.65 | 4134.91 | None | None | 1.14 | 77.37 | 71.62 | Quantization Distillation Training |

## Detailed Deployment Documents

- [Python Deployment](python)
- [C++ Deployment](cpp)

@@ -1,31 +1,32 @@
English | [简体中文](README_CN.md)

# PaddleSeg Quantized Model C++ Deployment Example

`infer.cc` in this directory helps you quickly deploy a PaddleSeg quantized model on CPU with accelerated inference.

## Deployment Preparations

### FastDeploy Environment Preparations

- 1. For the software and hardware requirements, please refer to [FastDeploy Environment Requirements](../../../../../../docs/en/build_and_install/download_prebuilt_libraries.md).
- 2. For the installation of the FastDeploy Python whl package, please refer to [FastDeploy Python Installation](../../../../../../docs/en/build_and_install/download_prebuilt_libraries.md).

### Quantized Model Preparations

- 1. You can directly use the quantized model provided by FastDeploy for deployment.
- 2. You can use the [one-click automatic model compression tool](../../../../../../tools/common_tools/auto_compression/) provided by FastDeploy to quantize a model yourself and deploy the resulting quantized model. (Note: Inference with a quantized model still requires the deploy.yaml file from the FP32 model folder. The folder of a model you quantized yourself does not contain this file, so copy it from the FP32 model folder into the quantized model folder.)

## Deployment Example with the Quantized PP_LiteSeg_T_STDC1_cityscapes Model

Run the following commands in this directory to compile the example and deploy the quantized model. This model requires FastDeploy version 0.7.0 or higher (x.x.x>=0.7.0).

```bash
mkdir build
cd build
# Download the pre-compiled FastDeploy library. Choose a suitable version from the `pre-compiled FastDeploy libraries` mentioned above.
wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-x.x.x.tgz
tar xvf fastdeploy-linux-x64-x.x.x.tgz
cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-x.x.x
make -j

# Download the PP_LiteSeg_T_STDC1_cityscapes quantized model and the test image provided by FastDeploy.
wget https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
tar -xvf PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
wget https://paddleseg.bj.bcebos.com/dygraph/demo/cityscapes_demo.png

# Run inference on the quantized model with Paddle Inference on CPU.
./infer_demo PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ cityscapes_demo.png 1
```

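Since the example requires FastDeploy 0.7.0 or higher, a version guard before building may be useful. The sketch below is one possible check, not part of FastDeploy: the helper `version_ge` and the value of `fd_version` are hypothetical, and the comparison relies on `sort -V` from GNU coreutils.

```bash
# version_ge A B: succeeds when version A >= version B.
# Relies on version sort (`sort -V`, GNU coreutils): if B sorts first, A >= B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical value; substitute the x.x.x of the package you downloaded.
fd_version="1.0.2"

if version_ge "$fd_version" "0.7.0"; then
  echo "FastDeploy $fd_version meets the 0.7.0 requirement"
else
  echo "FastDeploy $fd_version is older than 0.7.0" >&2
fi
```

Version sort is used instead of a plain string comparison so that a version such as 0.10.0 is correctly treated as newer than 0.7.0.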
@@ -0,0 +1,32 @@
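[English](README.md) | 简体中文

# PaddleSeg Quantized Model C++ Deployment Example

`infer.cc` in this directory helps you quickly deploy a PaddleSeg quantized model on CPU with accelerated inference.

## Deployment Preparations

### FastDeploy Environment Preparations

- 1. For the software and hardware requirements, please refer to [FastDeploy Environment Requirements](../../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md).
- 2. For the installation of the FastDeploy Python whl package, please refer to [FastDeploy Python Installation](../../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md).

### Quantized Model Preparations

- 1. You can directly use the quantized model provided by FastDeploy for deployment.
- 2. You can use the [one-click automatic model compression tool](../../../../../../tools/common_tools/auto_compression/) provided by FastDeploy to quantize a model yourself and deploy the resulting quantized model. (Note: Inference with a quantized model still requires the deploy.yaml file from the FP32 model folder. The folder of a model you quantized yourself does not contain this file, so copy it from the FP32 model folder into the quantized model folder.)

## Deployment Example with the Quantized PP_LiteSeg_T_STDC1_cityscapes Model

Run the following commands in this directory to compile the example and deploy the quantized model. This model requires FastDeploy version 0.7.0 or higher (x.x.x>=0.7.0).
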
```bash
mkdir build
cd build
# Download the pre-compiled FastDeploy library. Choose a suitable version from the `pre-compiled FastDeploy libraries` mentioned above.
wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-x.x.x.tgz
tar xvf fastdeploy-linux-x64-x.x.x.tgz
cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-x.x.x
make -j

# Download the PP_LiteSeg_T_STDC1_cityscapes quantized model and the test image provided by FastDeploy.
wget https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
tar -xvf PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
wget https://paddleseg.bj.bcebos.com/dygraph/demo/cityscapes_demo.png

# Run inference on the quantized model with Paddle Inference on CPU.
./infer_demo PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ cityscapes_demo.png 1
```

@@ -1,28 +1,29 @@
English | [简体中文](README_CN.md)

# PaddleSeg Quantized Model Python Deployment Example

`infer.py` in this directory helps you quickly deploy a PaddleSeg quantized model on CPU/GPU with accelerated inference.

## Deployment Preparations

### FastDeploy Environment Preparations

- 1. For the software and hardware requirements, please refer to [FastDeploy Environment Requirements](../../../../../../docs/en/build_and_install/download_prebuilt_libraries.md).
- 2. For the installation of the FastDeploy Python whl package, please refer to [FastDeploy Python Installation](../../../../../../docs/en/build_and_install/download_prebuilt_libraries.md).

### Quantized Model Preparations

- 1. You can directly use the quantized model provided by FastDeploy for deployment.
- 2. You can use the [one-click automatic model compression tool](../../../../../../tools/common_tools/auto_compression/) provided by FastDeploy to quantize a model yourself and deploy the resulting quantized model. (Note: Inference with a quantized model still requires the deploy.yaml file from the FP32 model folder. The folder of a model you quantized yourself does not contain this file, so copy it from the FP32 model folder into the quantized model folder.)

## Deployment Example with the Quantized PP_LiteSeg_T_STDC1_cityscapes Model

```bash
# Download the deployment example code.
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/segmentation/paddleseg/quantize/python

# Download the PP_LiteSeg_T_STDC1_cityscapes quantized model and the test image provided by FastDeploy.
wget https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
tar -xvf PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
wget https://paddleseg.bj.bcebos.com/dygraph/demo/cityscapes_demo.png

# Run inference on the quantized model with Paddle Inference on CPU.
python infer.py --model PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ --image cityscapes_demo.png --device cpu --backend paddle
```

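Before running `infer.py`, it may help to confirm that the model folder exists and contains the deploy.yaml file that quantized inference requires. The helper below is a minimal sketch; the function name `check_model_dir` is hypothetical, and the folder name assumes the archive extracts to a directory named after the tar file.

```bash
# check_model_dir DIR: succeeds only if DIR exists and contains deploy.yaml.
check_model_dir() {
  local dir="$1"
  [ -d "$dir" ] || { echo "missing model folder: $dir" >&2; return 1; }
  [ -f "$dir/deploy.yaml" ] || { echo "missing $dir/deploy.yaml" >&2; return 1; }
}

# Assumed folder name, matching the downloaded archive.
model_dir="PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ"
if check_model_dir "$model_dir"; then
  echo "model folder looks complete: $model_dir"
fi
```

If the check reports a missing deploy.yaml for a model you quantized yourself, copy the file over from the FP32 model folder as described in the preparation notes.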
@@ -0,0 +1,29 @@
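[English](README.md) | 简体中文

# PaddleSeg Quantized Model Python Deployment Example

`infer.py` in this directory helps you quickly deploy a PaddleSeg quantized model on CPU/GPU with accelerated inference.

## Deployment Preparations

### FastDeploy Environment Preparations

- 1. For the software and hardware requirements, please refer to [FastDeploy Environment Requirements](../../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md).
- 2. For the installation of the FastDeploy Python whl package, please refer to [FastDeploy Python Installation](../../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md).

### Quantized Model Preparations

- 1. You can directly use the quantized model provided by FastDeploy for deployment.
- 2. You can use the [one-click automatic model compression tool](../../../../../../tools/common_tools/auto_compression/) provided by FastDeploy to quantize a model yourself and deploy the resulting quantized model. (Note: Inference with a quantized model still requires the deploy.yaml file from the FP32 model folder. The folder of a model you quantized yourself does not contain this file, so copy it from the FP32 model folder into the quantized model folder.)

## Deployment Example with the Quantized PP_LiteSeg_T_STDC1_cityscapes Model
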
```bash
# Download the deployment example code.
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/segmentation/paddleseg/quantize/python

# Download the PP_LiteSeg_T_STDC1_cityscapes quantized model and the test image provided by FastDeploy.
wget https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
tar -xvf PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ.tar
wget https://paddleseg.bj.bcebos.com/dygraph/demo/cityscapes_demo.png

# Run inference on the quantized model with Paddle Inference on CPU.
python infer.py --model PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer_PTQ --image cityscapes_demo.png --device cpu --backend paddle
```