[RKNPU2] Add Quantized PPHumanSeg (#905)

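* Update rknpu2 backend core code
* Update model export core code
* Remove unused config files
* Add config files and revise documentation
* Model conversion and documentation
* Update documentation
* Update configuration files
* Update fully quantized PPHumanSeg
* Update documentation
* Update documentation
* Update documentation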
2025-11-01 04:12:58 +08:00 · 2022-12-19 20:07:32 +08:00
parent 4ac0e33b71
commit 218f33f8b1
8 changed files with 129 additions and 108 deletions
--- a/docs/cn/faq/rknpu2/rknpu2.md
+++ b/docs/cn/faq/rknpu2/rknpu2.md
@@ -16,21 +16,12 @@ ONNX models cannot directly use the NPU in RK chips for computation; the ONNX model needs to be
 |------------------|-------------------|-------------------------------|--------------------|
 | Detection        | Picodet           | Picodet-s                     | 162/112            |
 | Detection        | RKYOLOV5          | YOLOV5-S-Relu(int8)           | -/57               |
+| Detection        | RKYOLOX           | -                             | -/-                |
+| Detection        | RKYOLOV7          | -                             | -/-                |
 | Segmentation     | Unet              | Unet-cityscapes               | -/-                |
-| Segmentation     | PP-LiteSeg        | PP_LiteSeg_T_STDC1_cityscapes | -/-                |
-| Segmentation     | PP-HumanSegV2Lite | portrait                      | 53/50              |
-| Segmentation     | PP-HumanSegV2Lite | human                         | 53/50              |
-| Face Detection   | SCRFD             | SCRFD-2.5G-kps-640            | 112/108            |
-
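-## TODO
-The following is the TODO plan: models for which support is still being prepared, but which still have problems or room for improvement.
-
-| Task scenario    | Model   | Model version (tested versions) | ARM CPU/RKNN speed (ms) |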
-|------------------|---------|---------------------|--------------------|
-| Detection        | PPYOLOE | PPYOLOE(int8)       | -/-                |
-| Detection        | YOLOv5  | YOLOv5-s_v6.2(int8) | -/-                |
-| Face Recognition | ArcFace | ArcFace_r18         | 600/3              |
-| Face Recognition | cosFace | cosFace_r18         | 600/3              |
+| Segmentation     | PP-HumanSegV2Lite | portrait                      | 133/43             |
+| Segmentation     | PP-HumanSegV2Lite | human                         | 133/43             |
+| Face Detection   | SCRFD             | SCRFD-2.5G-kps-640            | 108/42             |

 ## RKNPU2 Backend Inference Tutorial

--- a/examples/vision/segmentation/paddleseg/rknpu2/README.md
+++ b/examples/vision/segmentation/paddleseg/rknpu2/README.md
@@ -25,80 +25,7 @@ Before deploying a model on RKNPU, the Paddle model must be converted to an RKNN model; the specific steps are as

 ## Model conversion example

-The following uses Portrait-PP-HumanSegV2_Lite (a portrait segmentation model) as an example to show how to convert a PPSeg model to an RKNN model.
-```bash
-# Clone the Paddle2ONNX repository
-git clone https://github.com/PaddlePaddle/Paddle2ONNX
-
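-# Download the Paddle static-graph model and fix its input shape
-## Enter the directory of the tool that fixes the input shape of a Paddle static-graph model
-cd Paddle2ONNX/tools/paddle
-## Download and extract the Paddle static-graph model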
-wget https://bj.bcebos.com/paddlehub/fastdeploy/Portrait_PP_HumanSegV2_Lite_256x144_infer.tgz
-tar xvf Portrait_PP_HumanSegV2_Lite_256x144_infer.tgz
-python paddle_infer_shape.py --model_dir Portrait_PP_HumanSegV2_Lite_256x144_infer/ \
-                             --model_filename model.pdmodel \
-                             --params_filename model.pdiparams \
-                             --save_dir Portrait_PP_HumanSegV2_Lite_256x144_infer \
-                             --input_shape_dict="{'x':[1,3,144,256]}"
-
-# Convert the static-graph model to ONNX. Note: the save_file name should match the archive name
-paddle2onnx --model_dir Portrait_PP_HumanSegV2_Lite_256x144_infer \
-            --model_filename model.pdmodel \
-            --params_filename model.pdiparams \
-            --save_file Portrait_PP_HumanSegV2_Lite_256x144_infer/Portrait_PP_HumanSegV2_Lite_256x144_infer.onnx \
-            --enable_dev_version True
-
-# Convert the ONNX model to an RKNN model
-# Copy the ONNX model directory to the Fastdeploy root directory
-cp -r ./Portrait_PP_HumanSegV2_Lite_256x144_infer /path/to/Fastdeploy
-# Convert the model; the result is generated in the Portrait_PP_HumanSegV2_Lite_256x144_infer directory
-python tools/rknpu2/export.py --config_path tools/rknpu2/config/RK3588/Portrait_PP_HumanSegV2_Lite_256x144_infer.yaml
-```
-
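-## Modify the yaml configuration file
-
-In **Model conversion example**, we fixed the model's input shape, so the corresponding yaml file must be modified as well, as follows:
-
-**Original yaml file**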
-```yaml
-Deploy:
-  input_shape:
-  - -1
-  - 3
-  - -1
-  - -1
-  model: model.pdmodel
-  output_dtype: float32
-  output_op: none
-  params: model.pdiparams
-  transforms:
-  - target_size:
-    - 256
-    - 144
-    type: Resize
-  - type: Normalize
-```
-
-**Modified yaml file**
-```yaml
-Deploy:
-  input_shape:
-  - 1
-  - 3
-  - 144
-  - 256
-  model: model.pdmodel
-  output_dtype: float32
-  output_op: none
-  params: model.pdiparams
-  transforms:
-  - target_size:
-    - 256
-    - 144
-    type: Resize
-  - type: Normalize
-```
+* [PPHumanSeg](./pp_humanseg.md)

 ## Detailed deployment documentation
 - [RKNN overall deployment tutorial](../../../../../docs/cn/faq/rknpu2/rknpu2.md)
--- a/examples/vision/segmentation/paddleseg/rknpu2/cpp/README.md
+++ b/examples/vision/segmentation/paddleseg/rknpu2/cpp/README.md
@@ -62,16 +62,12 @@ make install

 ```bash
 cd ./build/install
-./rknpu_test
+./rknpu_test model/Portrait_PP_HumanSegV2_Lite_256x144_infer/ images/portrait_heng.jpg
 ```

-## Results
-After running, the file human_pp_humansegv2_lite_npu_result.jpg is generated in the install folder, as shown below:
-![](https://user-images.githubusercontent.com/58363586/198875853-72821ad1-d4f7-41e3-b616-bef43027de3c.jpg)
-
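 ## Notes
 RKNPU requires model input in NHWC format, and image normalization is embedded into the model when it is converted to RKNN, so when deploying with FastDeploy
-you must first call `DisableNormalizePermute` (C++) or `disable_normalize_permute` (Python) to disable normalization and data layout conversion in the preprocessing stage.
+you must first call `DisableNormalizeAndPermute` (C++) or `disable_normalize_and_permute` (Python) to disable normalization and data layout conversion in the preprocessing stage.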

 - [Model introduction](../../)
 - [Python deployment](../python)
--- a/examples/vision/segmentation/paddleseg/rknpu2/cpp/infer.cc
+++ b/examples/vision/segmentation/paddleseg/rknpu2/cpp/infer.cc
@@ -92,7 +92,7 @@ int main(int argc, char* argv[]) {
  }

  RKNPU2Infer(argv[1], argv[2]);
-  ONNXInfer(argv[1], argv[2]);
+//  ONNXInfer(argv[1], argv[2]);
  return 0;
 }

--- a/examples/vision/segmentation/paddleseg/rknpu2/pp_humanseg.md
+++ b/examples/vision/segmentation/paddleseg/rknpu2/pp_humanseg.md
@@ -0,0 +1,80 @@
+# PPHumanSeg Model Deployment
+
+## Convert the model
+The following uses Portrait-PP-HumanSegV2_Lite (a portrait segmentation model) as an example to show how to convert a PPSeg model to an RKNN model.
+
+```bash
+# Clone the Paddle2ONNX repository
+git clone https://github.com/PaddlePaddle/Paddle2ONNX
+
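+# Download the Paddle static-graph model and fix its input shape
+## Enter the directory of the tool that fixes the input shape of a Paddle static-graph model
+cd Paddle2ONNX/tools/paddle
+## Download and extract the Paddle static-graph model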
+wget https://bj.bcebos.com/paddlehub/fastdeploy/Portrait_PP_HumanSegV2_Lite_256x144_infer.tgz
+tar xvf Portrait_PP_HumanSegV2_Lite_256x144_infer.tgz
+python paddle_infer_shape.py --model_dir Portrait_PP_HumanSegV2_Lite_256x144_infer/ \
+                             --model_filename model.pdmodel \
+                             --params_filename model.pdiparams \
+                             --save_dir Portrait_PP_HumanSegV2_Lite_256x144_infer \
+                             --input_shape_dict="{'x':[1,3,144,256]}"
+
+# Convert the static-graph model to ONNX. Note: the save_file name should match the archive name
+paddle2onnx --model_dir Portrait_PP_HumanSegV2_Lite_256x144_infer \
+            --model_filename model.pdmodel \
+            --params_filename model.pdiparams \
+            --save_file Portrait_PP_HumanSegV2_Lite_256x144_infer/Portrait_PP_HumanSegV2_Lite_256x144_infer.onnx \
+            --enable_dev_version True
+
+# Convert the ONNX model to an RKNN model
+# Copy the ONNX model directory to the Fastdeploy root directory
+cp -r ./Portrait_PP_HumanSegV2_Lite_256x144_infer /path/to/Fastdeploy
+# Convert the model; the result is generated in the Portrait_PP_HumanSegV2_Lite_256x144_infer directory
+python tools/rknpu2/export.py \
+        --config_path tools/rknpu2/config/Portrait_PP_HumanSegV2_Lite_256x144_infer.yaml \
+        --target_platform rk3588
+```
+
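+## Modify the yaml configuration file
+
+In **Model conversion example**, we fixed the model's input shape, so the corresponding yaml file must be modified as well, as follows:
+
+**Original yaml file**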
+```yaml
+Deploy:
+  input_shape:
+  - -1
+  - 3
+  - -1
+  - -1
+  model: model.pdmodel
+  output_dtype: float32
+  output_op: none
+  params: model.pdiparams
+  transforms:
+  - target_size:
+    - 256
+    - 144
+    type: Resize
+  - type: Normalize
+```
+
+**Modified yaml file**
+```yaml
+Deploy:
+  input_shape:
+  - 1
+  - 3
+  - 144
+  - 256
+  model: model.pdmodel
+  output_dtype: float32
+  output_op: none
+  params: model.pdiparams
+  transforms:
+  - target_size:
+    - 256
+    - 144
+    type: Resize
+  - type: Normalize
+```
--- a/examples/vision/segmentation/paddleseg/rknpu2/python/README.md
+++ b/examples/vision/segmentation/paddleseg/rknpu2/python/README.md
@@ -23,15 +23,11 @@ python3 infer.py --model_file ./Portrait_PP_HumanSegV2_Lite_256x144_infer/Portra
                --image images/portrait_heng.jpg
 ```

-After the run finishes, the visualization result is shown in the figure below
-<div  align="center">  
-<img src="https://user-images.githubusercontent.com/16222477/191712880-91ae128d-247a-43e0-b1e3-cafae78431e0.jpg", width=512px, height=256px />
-</div>
-

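 ## Notes
 RKNPU requires model input in NHWC format, and image normalization is embedded into the model when it is converted to RKNN, so when deploying with FastDeploy
-you must first call `DisableNormalizePermute` (C++) or `disable_normalize_permute` (Python) to disable normalization and data layout conversion in the preprocessing stage.
+you must first call `DisableNormalizeAndPermute` (C++) or `disable_normalize_and_permute` (Python) to disable normalization and data layout conversion in the preprocessing stage.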
+
 ## Other documentation

 - [PaddleSeg model introduction](..)
--- a/fastdeploy/backends/rknpu/rknpu2/rknpu2_backend.cc
+++ b/fastdeploy/backends/rknpu/rknpu2/rknpu2_backend.cc
@@ -12,7 +12,7 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.
 #include "fastdeploy/backends/rknpu/rknpu2/rknpu2_backend.h"
-
+#include "fastdeploy/utils/perf.h"
 namespace fastdeploy {
 RKNPU2Backend::~RKNPU2Backend() {
  // Release memory uniformly here
@@ -178,9 +178,14 @@ bool RKNPU2Backend::GetModelInputOutputInfos() {
  // get input info and copy to input tensor info
  for (uint32_t i = 0; i < io_num.n_input; i++) {
    input_attrs_[i].index = i;
+
    // query info
-    ret = rknn_query(ctx, RKNN_QUERY_INPUT_ATTR, &(input_attrs_[i]),
+    ret = rknn_query(ctx,
+                     RKNN_QUERY_INPUT_ATTR,
+                     &(input_attrs_[i]),
                     sizeof(rknn_tensor_attr));
+    DumpTensorAttr(input_attrs_[i]);
+
    if (ret != RKNN_SUCC) {
      printf("rknn_init error! ret=%d\n", ret);
      return false;
@@ -214,8 +219,12 @@ bool RKNPU2Backend::GetModelInputOutputInfos() {
  for (uint32_t i = 0; i < io_num.n_output; i++) {
    output_attrs_[i].index = i;
    // query info
-    ret = rknn_query(ctx, RKNN_QUERY_OUTPUT_ATTR, &(output_attrs_[i]),
+    ret = rknn_query(ctx,
+                     RKNN_QUERY_OUTPUT_ATTR,
+                     &(output_attrs_[i]),
                     sizeof(rknn_tensor_attr));
+    DumpTensorAttr(output_attrs_[i]);
+
    if (ret != RKNN_SUCC) {
      FDERROR << "rknn_query fail! ret = " << ret << std::endl;
      return false;
@@ -254,11 +263,12 @@ bool RKNPU2Backend::GetModelInputOutputInfos() {
 void RKNPU2Backend::DumpTensorAttr(rknn_tensor_attr& attr) {
  printf("index=%d, name=%s, n_dims=%d, dims=[%d, %d, %d, %d], "
         "n_elems=%d, size=%d, fmt=%s, type=%s, "
-         "qnt_type=%s, zp=%d, scale=%f\n",
+         "qnt_type=%s, zp=%d, scale=%f, pass_through=%d\n",
         attr.index, attr.name, attr.n_dims, attr.dims[0], attr.dims[1],
         attr.dims[2], attr.dims[3], attr.n_elems, attr.size,
         get_format_string(attr.fmt), get_type_string(attr.type),
-         get_qnt_type_string(attr.qnt_type), attr.zp, attr.scale);
+         get_qnt_type_string(attr.qnt_type), attr.zp, attr.scale,
+         attr.pass_through);
 }

 TensorInfo RKNPU2Backend::GetInputInfo(int index) {
@@ -309,7 +319,12 @@ bool RKNPU2Backend::Infer(std::vector<FDTensor>& inputs,
      input_attrs_[i].type = input_type;
      input_attrs_[i].size = inputs[0].Nbytes();
      input_attrs_[i].size_with_stride = inputs[0].Nbytes();
-      input_attrs_[i].pass_through = 0;
+      if(input_attrs_[i].type == RKNN_TENSOR_FLOAT16 ||
+          input_attrs_[i].type == RKNN_TENSOR_FLOAT32){
+        FDINFO << "The input model is not a quantitative model. "
+                  "Close the normalize operation." << std::endl;
+      }
+
      input_mems_[i] = rknn_create_mem(ctx, inputs[i].Nbytes());
      if (input_mems_[i] == nullptr) {
        FDERROR << "rknn_create_mem input_mems_ error." << std::endl;
@@ -340,6 +355,7 @@ bool RKNPU2Backend::Infer(std::vector<FDTensor>& inputs,

      // default output type is depend on model, this requires float32 to compute top5
      ret = rknn_set_io_mem(ctx, output_mems_[i], &output_attrs_[i]);
+
      // set output memory and attribute
      if (ret != RKNN_SUCC) {
        FDERROR << "output tensor memory rknn_set_io_mem fail! ret=" << ret
@@ -350,7 +366,7 @@ bool RKNPU2Backend::Infer(std::vector<FDTensor>& inputs,

    this->infer_init = true;
  }
-  
+
  // Copy input data to input tensor memory
  for (uint32_t i = 0; i < io_num.n_input; i++) {
    uint32_t width = input_attrs_[i].dims[2];
--- a/tools/rknpu2/config/Portrait_PP_HumanSegV2_Lite_256x144_infer.yaml
+++ b/tools/rknpu2/config/Portrait_PP_HumanSegV2_Lite_256x144_infer.yaml
@@ -0,0 +1,15 @@
+mean:
+  -
+    - 128.5
+    - 128.5
+    - 128.5
+std:
+  -
+    - 128.5
+    - 128.5
+    - 128.5
+model_path: ./Portrait_PP_HumanSegV2_Lite_256x144_infer/Portrait_PP_HumanSegV2_Lite_256x144_infer.onnx
+outputs_nodes:
+do_quantization: True
+dataset: "./Portrait_PP_HumanSegV2_Lite_256x144_infer/dataset.txt"
+output_folder: "./Portrait_PP_HumanSegV2_Lite_256x144_infer"
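
Editor's note on the `mean` and `std` entries in the config above: the README notes in this commit say that image normalization is embedded into the model during RKNN conversion, which is why FastDeploy's own normalization is disabled at deploy time. The sketch below illustrates what such an embedded step would compute for one pixel, assuming the conventional per-channel formula `(value - mean) / std`. That formula and the helper name `normalize_pixel` are assumptions made for illustration; only the 128.5 values come from the config file.

```python
# Values copied from the config in this commit.
MEAN = [128.5, 128.5, 128.5]
STD = [128.5, 128.5, 128.5]


def normalize_pixel(pixel, mean=MEAN, std=STD):
    """Hypothetical helper: apply (value - mean) / std to each channel of one pixel."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


# Raw 8-bit pixels at both ends of the [0, 255] range.
black = normalize_pixel([0, 0, 0])
white = normalize_pixel([255, 255, 255])

print("black ->", black)  # each channel is exactly -1.0
print("white ->", white)  # each channel is 126.5 / 128.5, slightly below 1.0
```

Under this assumption, raw 8-bit input is mapped to roughly the range [-1, 1) inside the model, so the application only needs to pass unnormalized pixel data.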