From 0692dcc4057413fe15da9c664d1faac263eae0b5 Mon Sep 17 00:00:00 2001 From: ziqi-jin <67993288+ziqi-jin@users.noreply.github.com> Date: Fri, 7 Oct 2022 21:44:16 +0800 Subject: [PATCH] Add PP-ModNet and PP-HumanMatting Support (#240) * first commit for yolov7 * pybind for yolov7 * CPP README.md * CPP README.md * modified yolov7.cc * README.md * python file modify * delete license in fastdeploy/ * repush the conflict part * README.md modified * README.md modified * file path modified * file path modified * file path modified * file path modified * file path modified * README modified * README modified * move some helpers to private * add examples for yolov7 * api.md modified * api.md modified * api.md modified * YOLOv7 * yolov7 release link * yolov7 release link * yolov7 release link * copyright * change some helpers to private * change variables to const and fix documents. * gitignore * Transfer some funtions to private member of class * Transfer some funtions to private member of class * Merge from develop (#9) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * 
Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason Co-authored-by: root Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> * first commit for yolor * for merge * Develop (#11) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason Co-authored-by: root Co-authored-by: DefTruth 
<31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> * Yolor (#16) * Develop (#11) (#12) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason Co-authored-by: root Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> Co-authored-by: Jason Co-authored-by: root Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> * Develop (#13) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root * Add PaddleDetetion/PPYOLOE model support (#22) * add 
ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason Co-authored-by: root Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * Develop (#14) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * 
fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: root Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> Co-authored-by: Jason Co-authored-by: root Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> Co-authored-by: Jason <928090362@qq.com> * add is_dynamic for YOLO series (#22) * modify ppmatting backend and docs * modify ppmatting docs * fix the PPMatting size problem * fix LimitShort's log * retrigger ci * modify PPMatting docs * modify the way for dealing with LimitShort * add pphumanmatting and modnet series * docs of PPMatting series * add explanation of newly added processors and fix processors * Modify LimitShort function and ppmatting.cc * modify ResizeByShort and ppmatting.cc * change resize_to_int_mult to limit_by_stride and delete resize_by_input_shape * retrigger ci * retrigger ci * fix problem produced by ResizeByShort * Update eigen.cmake * Delete eigen.cmake * refine code * add test file for ppmatting series * add squeeze for fd_tensor and modify ppmatting.cc Co-authored-by: Jason Co-authored-by: root Co-authored-by: DefTruth 
<31974251+DefTruth@users.noreply.github.com> Co-authored-by: huangjianhui <852142024@qq.com> Co-authored-by: Jason <928090362@qq.com> --- examples/vision/matting/README.md | 2 + examples/vision/matting/ppmatting/README.md | 12 +- .../vision/matting/ppmatting/cpp/infer.cc | 2 +- fastdeploy/core/fd_tensor.cc | 7 + fastdeploy/core/fd_tensor.h | 4 + fastdeploy/vision/common/processors/crop.cc | 65 +++++++ fastdeploy/vision/common/processors/crop.h | 47 +++++ ...size_to_int_mult.cc => limit_by_stride.cc} | 18 +- ...resize_to_int_mult.h => limit_by_stride.h} | 15 +- .../vision/common/processors/limit_long.cc | 70 ++++++++ .../vision/common/processors/limit_long.h | 51 ++++++ .../vision/common/processors/limit_short.cc | 12 +- .../vision/common/processors/limit_short.h | 8 +- .../vision/common/processors/pad_to_size.cc | 6 + .../vision/common/processors/pad_to_size.h | 2 +- .../common/processors/resize_by_short.cc | 39 +++-- .../common/processors/resize_by_short.h | 13 +- .../vision/common/processors/transform.h | 4 +- fastdeploy/vision/detection/ppdet/ppyoloe.cc | 7 +- .../vision/matting/ppmatting/ppmatting.cc | 162 ++++++++---------- tests/eval_example/test_ppmatting.py | 109 ++++++++++++ 21 files changed, 523 insertions(+), 132 deletions(-) create mode 100644 fastdeploy/vision/common/processors/crop.cc create mode 100644 fastdeploy/vision/common/processors/crop.h rename fastdeploy/vision/common/processors/{resize_to_int_mult.cc => limit_by_stride.cc} (75%) rename fastdeploy/vision/common/processors/{resize_to_int_mult.h => limit_by_stride.h} (76%) create mode 100644 fastdeploy/vision/common/processors/limit_long.cc create mode 100644 fastdeploy/vision/common/processors/limit_long.h create mode 100644 tests/eval_example/test_ppmatting.py diff --git a/examples/vision/matting/README.md b/examples/vision/matting/README.md index fafc3b6e1..afe434f52 100644 --- a/examples/vision/matting/README.md +++ b/examples/vision/matting/README.md @@ -6,3 +6,5 @@ FastDeploy目前支持如下抠图模型部署 
| :--- | :--- | :------- | :--- | | [ZHKKKe/MODNet](./modnet) | MODNet 系列模型 | ONNX | [CommitID:28165a4](https://github.com/ZHKKKe/MODNet/commit/28165a4) | | [PaddleSeg/PPMatting](./ppmatting) | PPMatting 系列模型 | Paddle | [Release/2.6](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting) | +| [PaddleSeg/PPHumanMatting](./ppmatting) | PPHumanMatting 系列模型 | Paddle | [Release/2.6](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting) | +| [PaddleSeg/ModNet](./ppmatting) | ModNet 系列模型 | Paddle | [Release/2.6](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting) | diff --git a/examples/vision/matting/ppmatting/README.md b/examples/vision/matting/ppmatting/README.md index d71432284..067fde308 100644 --- a/examples/vision/matting/ppmatting/README.md +++ b/examples/vision/matting/ppmatting/README.md @@ -9,11 +9,13 @@ 目前FastDeploy支持如下模型的部署 - [PPMatting系列模型](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting) +- [PPHumanMatting系列模型](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting) +- [ModNet系列模型](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting) ## 导出部署模型 -在部署前,需要先将PPMatting导出成部署模型,导出步骤参考文档[导出模型](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting) +在部署前,需要先将PPMatting导出成部署模型,导出步骤参考文档[导出模型](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/Matting)(Tips:导出PPMatting系列模型和PPHumanMatting系列模型需要设置导出脚本的`--input_shape`参数) ## 下载预训练模型 @@ -25,8 +27,12 @@ | 模型 | 参数大小 | 精度 | 备注 | |:---------------------------------------------------------------- |:----- |:----- | :------ | -| [PPMatting-512](https://bj.bcebos.com/paddlehub/fastdeploy/PP-Matting-512.tgz) | 87MB | - | -| [PPMatting-1024](https://bj.bcebos.com/paddlehub/fastdeploy/PP-Matting-1024.tgz) | 87MB | - | +| [PPMatting-512](https://bj.bcebos.com/paddlehub/fastdeploy/PP-Matting-512.tgz) | 106MB | - | +| [PPMatting-1024](https://bj.bcebos.com/paddlehub/fastdeploy/PP-Matting-1024.tgz) | 106MB | - | +| 
[PPHumanMatting](https://bj.bcebos.com/paddlehub/fastdeploy/PPHumanMatting.tgz) | 247MB | - | +| [Modnet_ResNet50_vd](https://bj.bcebos.com/paddlehub/fastdeploy/PPModnet_ResNet50_vd.tgz) | 355MB | - | +| [Modnet_MobileNetV2](https://bj.bcebos.com/paddlehub/fastdeploy/PPModnet_MobileNetV2.tgz) | 28MB | - | +| [Modnet_HRNet_w18](https://bj.bcebos.com/paddlehub/fastdeploy/PPModnet_HRNet_w18.tgz) | 51MB | - | diff --git a/examples/vision/matting/ppmatting/cpp/infer.cc b/examples/vision/matting/ppmatting/cpp/infer.cc index f47b484e5..d7cad27d4 100644 --- a/examples/vision/matting/ppmatting/cpp/infer.cc +++ b/examples/vision/matting/ppmatting/cpp/infer.cc @@ -81,7 +81,7 @@ void GpuInfer(const std::string& model_dir, const std::string& image_file, cv::imwrite("visualized_result.jpg", vis_im_with_bg); cv::imwrite("visualized_result_fg.jpg", vis_im); std::cout << "Visualized result save in ./visualized_result_replaced_bg.jpg " - "and ./visualized_result_fg.jpgg" + "and ./visualized_result_fg.jpg" << std::endl; } diff --git a/fastdeploy/core/fd_tensor.cc b/fastdeploy/core/fd_tensor.cc index 00a7ae2b7..2670b355e 100644 --- a/fastdeploy/core/fd_tensor.cc +++ b/fastdeploy/core/fd_tensor.cc @@ -85,6 +85,13 @@ void FDTensor::ExpandDim(int64_t axis) { shape.insert(shape.begin() + axis, 1); } +void FDTensor::Squeeze(int64_t axis) { + size_t ndim = shape.size(); + FDASSERT(axis >= 0 && axis < ndim, + "The allowed 'axis' must be in range of (0, %lu)!", ndim); + shape.erase(shape.begin() + axis); +} + void FDTensor::Allocate(const std::vector& new_shape, const FDDataType& data_type, const std::string& tensor_name, diff --git a/fastdeploy/core/fd_tensor.h b/fastdeploy/core/fd_tensor.h index 08e500d08..7e8bb7851 100644 --- a/fastdeploy/core/fd_tensor.h +++ b/fastdeploy/core/fd_tensor.h @@ -71,6 +71,10 @@ struct FASTDEPLOY_DECL FDTensor { // at the `axis` position in the expanded Tensor shape. void ExpandDim(int64_t axis = 0); + // Squeeze the shape of a Tensor. 
Erase the axis that will appear + // at the `axis` position in the squeezed Tensor shape. + void Squeeze(int64_t axis = 0); + // Initialize Tensor // Include setting attribute for tensor // and allocate cpu memory buffer diff --git a/fastdeploy/vision/common/processors/crop.cc b/fastdeploy/vision/common/processors/crop.cc new file mode 100644 index 000000000..19c451616 --- /dev/null +++ b/fastdeploy/vision/common/processors/crop.cc @@ -0,0 +1,65 @@ +// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +#include "fastdeploy/vision/common/processors/crop.h" + +namespace fastdeploy { +namespace vision { + +bool Crop::CpuRun(Mat* mat) { + cv::Mat* im = mat->GetCpuMat(); + int height = static_cast(im->rows); + int width = static_cast(im->cols); + if (height < height_ + offset_h_ || width < width_ + offset_w_) { + FDERROR << "[Crop] Cannot crop [" << height_ << ", " << width_ + << "] from the input image [" << height << ", " << width + << "], with offset [" << offset_h_ << ", " << offset_w_ << "]." 
+ << std::endl; + return false; + } + cv::Rect crop_roi(offset_w_, offset_h_, width_, height_); + *im = (*im)(crop_roi); + mat->SetWidth(width_); + mat->SetHeight(height_); + return true; +} + +#ifdef ENABLE_OPENCV_CUDA +bool Crop::GpuRun(Mat* mat) { + cv::cuda::GpuMat* im = mat->GetGpuMat(); + int height = static_cast(im->rows); + int width = static_cast(im->cols); + if (height < height_ + offset_h_ || width < width_ + offset_w_) { + FDERROR << "[Crop] Cannot crop [" << height_ << ", " << width_ + << "] from the input image [" << height << ", " << width + << "], with offset [" << offset_h_ << ", " << offset_w_ << "]." + << std::endl; + return false; + } + cv::Rect crop_roi(offset_w_, offset_h_, width_, height_); + *im = (*im)(crop_roi); + mat->SetWidth(width_); + mat->SetHeight(height_); + return true; +} +#endif + +bool Crop::Run(Mat* mat, int offset_w, int offset_h, int width, int height, + ProcLib lib) { + auto c = Crop(offset_w, offset_h, width, height); + return c(mat, lib); +} + +} // namespace vision +} // namespace fastdeploy diff --git a/fastdeploy/vision/common/processors/crop.h b/fastdeploy/vision/common/processors/crop.h new file mode 100644 index 000000000..0148faed2 --- /dev/null +++ b/fastdeploy/vision/common/processors/crop.h @@ -0,0 +1,47 @@ +// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +#pragma once + +#include "fastdeploy/vision/common/processors/base.h" + +namespace fastdeploy { +namespace vision { + +class Crop : public Processor { + public: + Crop(int offset_w, int offset_h, int width, int height) { + offset_w_ = offset_w; + offset_h_ = offset_h; + width_ = width; + height_ = height; + } + bool CpuRun(Mat* mat); +#ifdef ENABLE_OPENCV_CUDA + bool GpuRun(Mat* mat); +#endif + std::string Name() { return "Crop"; } + + static bool Run(Mat* mat, int offset_w, int offset_h, int width, int height, + ProcLib lib = ProcLib::OPENCV_CPU); + + private: + int offset_w_; + int offset_h_; + int height_; + int width_; +}; + +} // namespace vision +} // namespace fastdeploy diff --git a/fastdeploy/vision/common/processors/resize_to_int_mult.cc b/fastdeploy/vision/common/processors/limit_by_stride.cc similarity index 75% rename from fastdeploy/vision/common/processors/resize_to_int_mult.cc rename to fastdeploy/vision/common/processors/limit_by_stride.cc index 9659b101f..aa573ce88 100644 --- a/fastdeploy/vision/common/processors/resize_to_int_mult.cc +++ b/fastdeploy/vision/common/processors/limit_by_stride.cc @@ -12,17 +12,17 @@ // See the License for the specific language governing permissions and // limitations under the License. 
-#include "fastdeploy/vision/common/processors/resize_to_int_mult.h" +#include "fastdeploy/vision/common/processors/limit_by_stride.h" namespace fastdeploy { namespace vision { -bool ResizeToIntMult::CpuRun(Mat* mat) { +bool LimitByStride::CpuRun(Mat* mat) { cv::Mat* im = mat->GetCpuMat(); int origin_w = im->cols; int origin_h = im->rows; - int rw = origin_w - origin_w % mult_int_; - int rh = origin_h - origin_h % mult_int_; + int rw = origin_w - origin_w % stride_; + int rh = origin_h - origin_h % stride_; if (rw != origin_w || rh != origin_w) { cv::resize(*im, *im, cv::Size(rw, rh), 0, 0, interp_); mat->SetWidth(im->cols); @@ -32,13 +32,13 @@ bool ResizeToIntMult::CpuRun(Mat* mat) { } #ifdef ENABLE_OPENCV_CUDA -bool ResizeToIntMult::GpuRun(Mat* mat) { +bool LimitByStride::GpuRun(Mat* mat) { cv::cuda::GpuMat* im = mat->GetGpuMat(); int origin_w = im->cols; int origin_h = im->rows; im->convertTo(*im, CV_32FC(im->channels())); - int rw = origin_w - origin_w % mult_int_; - int rh = origin_h - origin_h % mult_int_; + int rw = origin_w - origin_w % stride_; + int rh = origin_h - origin_h % stride_; if (rw != origin_w || rh != origin_w) { cv::cuda::resize(*im, *im, cv::Size(rw, rh), 0, 0, interp_); mat->SetWidth(im->cols); @@ -48,8 +48,8 @@ bool ResizeToIntMult::GpuRun(Mat* mat) { } #endif -bool ResizeToIntMult::Run(Mat* mat, int mult_int, int interp, ProcLib lib) { - auto r = ResizeToIntMult(mult_int, interp); +bool LimitByStride::Run(Mat* mat, int stride, int interp, ProcLib lib) { + auto r = LimitByStride(stride, interp); return r(mat, lib); } } // namespace vision diff --git a/fastdeploy/vision/common/processors/resize_to_int_mult.h b/fastdeploy/vision/common/processors/limit_by_stride.h similarity index 76% rename from fastdeploy/vision/common/processors/resize_to_int_mult.h rename to fastdeploy/vision/common/processors/limit_by_stride.h index 71a2aaa8d..13c6e307e 100644 --- a/fastdeploy/vision/common/processors/resize_to_int_mult.h +++ 
b/fastdeploy/vision/common/processors/limit_by_stride.h @@ -19,24 +19,27 @@ namespace fastdeploy { namespace vision { -class ResizeToIntMult : public Processor { +class LimitByStride : public Processor { public: - explicit ResizeToIntMult(int mult_int = 32, int interp = 1) { - mult_int_ = mult_int; + explicit LimitByStride(int stride = 32, int interp = 1) { + stride_ = stride; interp_ = interp; } + + // Resize Mat* mat to make the size divisible by stride_. + bool CpuRun(Mat* mat); #ifdef ENABLE_OPENCV_CUDA bool GpuRun(Mat* mat); #endif - std::string Name() { return "ResizeToIntMult"; } + std::string Name() { return "LimitByStride"; } - static bool Run(Mat* mat, int mult_int = 32, int interp = 1, + static bool Run(Mat* mat, int stride = 32, int interp = 1, ProcLib lib = ProcLib::OPENCV_CPU); private: int interp_; - int mult_int_; + int stride_; }; } // namespace vision } // namespace fastdeploy diff --git a/fastdeploy/vision/common/processors/limit_long.cc b/fastdeploy/vision/common/processors/limit_long.cc new file mode 100644 index 000000000..246497c97 --- /dev/null +++ b/fastdeploy/vision/common/processors/limit_long.cc @@ -0,0 +1,70 @@ +// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +#include "fastdeploy/vision/common/processors/limit_long.h" + +namespace fastdeploy { +namespace vision { + +bool LimitLong::CpuRun(Mat* mat) { + cv::Mat* im = mat->GetCpuMat(); + int origin_w = im->cols; + int origin_h = im->rows; + int im_size_max = std::max(origin_w, origin_h); + int target = im_size_max; + if (max_long_ > 0 && im_size_max > max_long_) { + target = max_long_; + } else if (min_long_ > 0 && im_size_max < min_long_) { + target = min_long_; + } + if (target != im_size_max) { + double scale = + static_cast(target) / static_cast(im_size_max); + cv::resize(*im, *im, cv::Size(), scale, scale, interp_); + mat->SetWidth(im->cols); + mat->SetHeight(im->rows); + } + return true; +} + +#ifdef ENABLE_OPENCV_CUDA +bool LimitLong::GpuRun(Mat* mat) { + cv::cuda::GpuMat* im = mat->GetGpuMat(); + int origin_w = im->cols; + int origin_h = im->rows; + im->convertTo(*im, CV_32FC(im->channels())); + int im_size_max = std::max(origin_w, origin_h); + int target = im_size_max; + if (max_long_ > 0 && im_size_max > max_long_) { + target = max_long_; + } else if (min_long_ > 0 && im_size_max < min_long_) { + target = min_long_; + } + if (target != im_size_max) { + double scale = + static_cast(target) / static_cast(im_size_max); + cv::cuda::resize(*im, *im, cv::Size(), scale, scale, interp_); + mat->SetWidth(im->cols); + mat->SetHeight(im->rows); + } + return true; +} +#endif + +bool LimitLong::Run(Mat* mat, int max_long, int min_long, ProcLib lib) { + auto l = LimitLong(max_long, min_long); + return l(mat, lib); +} +} // namespace vision +} // namespace fastdeploy diff --git a/fastdeploy/vision/common/processors/limit_long.h b/fastdeploy/vision/common/processors/limit_long.h new file mode 100644 index 000000000..8c67212a5 --- /dev/null +++ b/fastdeploy/vision/common/processors/limit_long.h @@ -0,0 +1,51 @@ +// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. 
+// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +#pragma once + +#include "fastdeploy/vision/common/processors/base.h" + +namespace fastdeploy { +namespace vision { + +class LimitLong : public Processor { + public: + explicit LimitLong(int max_long = -1, int min_long = -1, int interp = 1) { + max_long_ = max_long; + min_long_ = min_long; + interp_ = interp; + } + + // Limit the long edge of image. + // If the long edge is larger than max_long_, resize the long edge + // to max_long_, while scale the short edge proportionally. + // If the long edge is smaller than min_long_, resize the long edge + // to min_long_, while scale the short edge proportionally. 
+ bool CpuRun(Mat* mat); +#ifdef ENABLE_OPENCV_CUDA + bool GpuRun(Mat* mat); +#endif + std::string Name() { return "LimitLong"; } + + static bool Run(Mat* mat, int max_long = -1, int min_long = -1, + ProcLib lib = ProcLib::OPENCV_CPU); + int GetMaxLong() const { return max_long_; } + + private: + int max_long_; + int min_long_; + int interp_; +}; +} // namespace vision +} // namespace fastdeploy diff --git a/fastdeploy/vision/common/processors/limit_short.cc b/fastdeploy/vision/common/processors/limit_short.cc index b08bb0ef5..ce0daa282 100644 --- a/fastdeploy/vision/common/processors/limit_short.cc +++ b/fastdeploy/vision/common/processors/limit_short.cc @@ -28,9 +28,11 @@ bool LimitShort::CpuRun(Mat* mat) { } else if (min_short_ > 0 && im_size_min < min_short_) { target = min_short_; } + double scale = -1.f; if (target != im_size_min) { - double scale = - static_cast(target) / static_cast(im_size_min); + scale = static_cast(target) / static_cast(im_size_min); + } + if (scale > 0) { cv::resize(*im, *im, cv::Size(), scale, scale, interp_); mat->SetWidth(im->cols); mat->SetHeight(im->rows); @@ -51,9 +53,11 @@ bool LimitShort::GpuRun(Mat* mat) { } else if (min_short_ > 0 && im_size_min < min_short_) { target = min_short_; } + double scale = -1.f; if (target != im_size_min) { - double scale = - static_cast(target) / static_cast(im_size_min); + scale = static_cast(target) / static_cast(im_size_min); + } + if (scale > 0) { cv::cuda::resize(*im, *im, cv::Size(), scale, scale, interp_); mat->SetWidth(im->cols); mat->SetHeight(im->rows); diff --git a/fastdeploy/vision/common/processors/limit_short.h b/fastdeploy/vision/common/processors/limit_short.h index 25eff6d71..5995c8753 100644 --- a/fastdeploy/vision/common/processors/limit_short.h +++ b/fastdeploy/vision/common/processors/limit_short.h @@ -26,6 +26,12 @@ class LimitShort : public Processor { min_short_ = min_short; interp_ = interp; } + + // Limit the short edge of image. 
+ // If the short edge is larger than max_short_, resize the short edge + // to max_short_, while scale the long edge proportionally. + // If the short edge is smaller than min_short_, resize the short edge + // to min_short_, while scale the long edge proportionally. bool CpuRun(Mat* mat); #ifdef ENABLE_OPENCV_CUDA bool GpuRun(Mat* mat); @@ -34,7 +40,7 @@ class LimitShort : public Processor { static bool Run(Mat* mat, int max_short = -1, int min_short = -1, ProcLib lib = ProcLib::OPENCV_CPU); - int GetMaxShort() { return max_short_; } + int GetMaxShort() const { return max_short_; } private: int max_short_; diff --git a/fastdeploy/vision/common/processors/pad_to_size.cc b/fastdeploy/vision/common/processors/pad_to_size.cc index d4cbacd87..77bf271bd 100644 --- a/fastdeploy/vision/common/processors/pad_to_size.cc +++ b/fastdeploy/vision/common/processors/pad_to_size.cc @@ -18,6 +18,9 @@ namespace fastdeploy { namespace vision { bool PadToSize::CpuRun(Mat* mat) { + if (width_ == -1 || height_ == -1) { + return true; + } if (mat->layout != Layout::HWC) { FDERROR << "PadToSize: The input data must be Layout::HWC format!" << std::endl; @@ -74,6 +77,9 @@ bool PadToSize::CpuRun(Mat* mat) { #ifdef ENABLE_OPENCV_CUDA bool PadToSize::GpuRun(Mat* mat) { + if (width_ == -1 || height_ == -1) { + return true; + } if (mat->layout != Layout::HWC) { FDERROR << "PadToSize: The input data must be Layout::HWC format!" 
<< std::endl; diff --git a/fastdeploy/vision/common/processors/pad_to_size.h b/fastdeploy/vision/common/processors/pad_to_size.h index ece0158f7..ff3ac159f 100644 --- a/fastdeploy/vision/common/processors/pad_to_size.h +++ b/fastdeploy/vision/common/processors/pad_to_size.h @@ -21,7 +21,7 @@ namespace vision { class PadToSize : public Processor { public: - // only support pad with left-top padding mode + // only support pad with right-bottom padding mode PadToSize(int width, int height, const std::vector& value) { width_ = width; height_ = height; diff --git a/fastdeploy/vision/common/processors/resize_by_short.cc b/fastdeploy/vision/common/processors/resize_by_short.cc index 8e850425f..72ac33ac9 100644 --- a/fastdeploy/vision/common/processors/resize_by_short.cc +++ b/fastdeploy/vision/common/processors/resize_by_short.cc @@ -22,12 +22,14 @@ bool ResizeByShort::CpuRun(Mat* mat) { int origin_w = im->cols; int origin_h = im->rows; double scale = GenerateScale(origin_w, origin_h); - if (use_scale_) { + if (use_scale_ && fabs(scale - 1.0) >= 1e-06) { cv::resize(*im, *im, cv::Size(), scale, scale, interp_); } else { int width = static_cast(round(scale * im->cols)); int height = static_cast(round(scale * im->rows)); - cv::resize(*im, *im, cv::Size(width, height), 0, 0, interp_); + if (width != origin_w || height != origin_h) { + cv::resize(*im, *im, cv::Size(width, height), 0, 0, interp_); + } } mat->SetWidth(im->cols); mat->SetHeight(im->rows); @@ -41,12 +43,14 @@ bool ResizeByShort::GpuRun(Mat* mat) { int origin_h = im->rows; double scale = GenerateScale(origin_w, origin_h); im->convertTo(*im, CV_32FC(im->channels())); - if (use_scale_) { + if (use_scale_ && fabs(scale - 1.0) >= 1e-06) { cv::cuda::resize(*im, *im, cv::Size(), scale, scale, interp_); } else { int width = static_cast(round(scale * im->cols)); int height = static_cast(round(scale * im->rows)); - cv::cuda::resize(*im, *im, cv::Size(width, height), 0, 0, interp_); + if (width != origin_w || height != 
origin_h) { + cv::cuda::resize(*im, *im, cv::Size(width, height), 0, 0, interp_); + } } mat->SetWidth(im->cols); mat->SetHeight(im->rows); @@ -59,18 +63,31 @@ double ResizeByShort::GenerateScale(const int origin_w, const int origin_h) { int im_size_min = std::min(origin_w, origin_h); double scale = static_cast(target_size_) / static_cast(im_size_min); - if (max_size_ > 0) { - if (round(scale * im_size_max) > max_size_) { - scale = static_cast(max_size_) / static_cast(im_size_max); + + if (max_hw_.size() > 0) { + FDASSERT(max_hw_.size() == 2, + "Require size of max_hw_ be 2, but now it's %zu.", max_hw_.size()); + FDASSERT( + max_hw_[0] > 0 && max_hw_[1] > 0, + "Require elements in max_hw_ greater than 0, but now it's [%d, %d].", + max_hw_[0], max_hw_[1]); + + double scale_h = + static_cast(max_hw_[0]) / static_cast(origin_h); + double scale_w = + static_cast(max_hw_[1]) / static_cast(origin_w); + double min_scale = std::min(scale_h, scale_w); + if (min_scale < scale) { + scale = min_scale; } } return scale; } bool ResizeByShort::Run(Mat* mat, int target_size, int interp, bool use_scale, - int max_size, ProcLib lib) { - auto r = ResizeByShort(target_size, interp, use_scale, max_size); + const std::vector& max_hw, ProcLib lib) { + auto r = ResizeByShort(target_size, interp, use_scale, max_hw); return r(mat, lib); } -} // namespace vision -} // namespace fastdeploy +} // namespace vision +} // namespace fastdeploy diff --git a/fastdeploy/vision/common/processors/resize_by_short.h b/fastdeploy/vision/common/processors/resize_by_short.h index 023748e9e..4fadb905e 100644 --- a/fastdeploy/vision/common/processors/resize_by_short.h +++ b/fastdeploy/vision/common/processors/resize_by_short.h @@ -22,9 +22,9 @@ namespace vision { class ResizeByShort : public Processor { public: ResizeByShort(int target_size, int interp = 1, bool use_scale = true, - int max_size = -1) { + const std::vector& max_hw = std::vector()) { target_size_ = target_size; - max_size_ = max_size; + max_hw_ 
= max_hw; interp_ = interp; use_scale_ = use_scale; } @@ -35,15 +35,16 @@ class ResizeByShort : public Processor { std::string Name() { return "ResizeByShort"; } static bool Run(Mat* mat, int target_size, int interp = 1, - bool use_scale = true, int max_size = -1, + bool use_scale = true, + const std::vector& max_hw = std::vector(), ProcLib lib = ProcLib::OPENCV_CPU); private: double GenerateScale(const int origin_w, const int origin_h); int target_size_; - int max_size_; + std::vector max_hw_; int interp_; bool use_scale_; }; -} // namespace vision -} // namespace fastdeploy +} // namespace vision +} // namespace fastdeploy diff --git a/fastdeploy/vision/common/processors/transform.h b/fastdeploy/vision/common/processors/transform.h index cf720ceb7..5522c138c 100644 --- a/fastdeploy/vision/common/processors/transform.h +++ b/fastdeploy/vision/common/processors/transform.h @@ -18,7 +18,10 @@ #include "fastdeploy/vision/common/processors/center_crop.h" #include "fastdeploy/vision/common/processors/color_space_convert.h" #include "fastdeploy/vision/common/processors/convert.h" +#include "fastdeploy/vision/common/processors/crop.h" #include "fastdeploy/vision/common/processors/hwc2chw.h" +#include "fastdeploy/vision/common/processors/limit_by_stride.h" +#include "fastdeploy/vision/common/processors/limit_long.h" #include "fastdeploy/vision/common/processors/limit_short.h" #include "fastdeploy/vision/common/processors/normalize.h" #include "fastdeploy/vision/common/processors/pad.h" @@ -26,5 +29,4 @@ #include "fastdeploy/vision/common/processors/resize.h" #include "fastdeploy/vision/common/processors/resize_by_long.h" #include "fastdeploy/vision/common/processors/resize_by_short.h" -#include "fastdeploy/vision/common/processors/resize_to_int_mult.h" #include "fastdeploy/vision/common/processors/stride_pad.h" diff --git a/fastdeploy/vision/detection/ppdet/ppyoloe.cc b/fastdeploy/vision/detection/ppdet/ppyoloe.cc index 0c0207404..ee28dfe38 100644 --- 
a/fastdeploy/vision/detection/ppdet/ppyoloe.cc +++ b/fastdeploy/vision/detection/ppdet/ppyoloe.cc @@ -122,8 +122,13 @@ bool PPYOLOE::BuildPreprocessPipelineFromConfig() { } else { int min_target_size = std::min(target_size[0], target_size[1]); int max_target_size = std::max(target_size[0], target_size[1]); + std::vector max_size; + if (max_target_size > 0) { + max_size.push_back(max_target_size); + max_size.push_back(max_target_size); + } processors_.push_back(std::make_shared( - min_target_size, interp, true, max_target_size)); + min_target_size, interp, true, max_size)); } } else if (op_name == "Permute") { // Do nothing, do permute as the last operation diff --git a/fastdeploy/vision/matting/ppmatting/ppmatting.cc b/fastdeploy/vision/matting/ppmatting/ppmatting.cc index a3d0a25e4..a11e220b4 100644 --- a/fastdeploy/vision/matting/ppmatting/ppmatting.cc +++ b/fastdeploy/vision/matting/ppmatting/ppmatting.cc @@ -60,33 +60,54 @@ bool PPMatting::BuildPreprocessPipelineFromConfig() { return false; } + FDASSERT((cfg["Deploy"]["input_shape"]), + "The yaml file should include input_shape parameters"); + // input_shape + // b c h w + auto input_shape = cfg["Deploy"]["input_shape"].as>(); + FDASSERT(input_shape.size() == 4, + "The input_shape in yaml file need to be 4-dimensions, but now its " + "dimension is %zu.", + input_shape.size()); + + bool is_fixed_input_shape = false; + if (input_shape[2] > 0 && input_shape[3] > 0) { + is_fixed_input_shape = true; + } + if (input_shape[2] < 0 || input_shape[3] < 0) { + FDWARNING << "Detected dynamic input shape of your model, only Paddle " + "Inference / OpenVINO support this model now." 
+ << std::endl; + } if (cfg["Deploy"]["transforms"]) { auto preprocess_cfg = cfg["Deploy"]["transforms"]; + int long_size = -1; for (const auto& op : preprocess_cfg) { FDASSERT(op.IsMap(), "Require the transform information in yaml be Map type."); if (op["type"].as() == "LimitShort") { - int max_short = -1; - int min_short = -1; - if (op["max_short"]) { - max_short = op["max_short"].as(); + int max_short = op["max_short"] ? op["max_short"].as() : -1; + int min_short = op["min_short"] ? op["min_short"].as() : -1; + if (is_fixed_input_shape) { + // if the input shape is fixed, will resize by scale, and the max + // shape will not exceed input_shape + long_size = max_short; + std::vector max_size = {input_shape[2], input_shape[3]}; + processors_.push_back( + std::make_shared(long_size, 1, true, max_size)); + } else { + processors_.push_back( + std::make_shared(max_short, min_short)); } - if (op["min_short"]) { - min_short = op["min_short"].as(); - } - FDINFO << "Detected LimitShort processing step in yaml file, if the " - "model is exported from PaddleSeg, please make sure the " - "input of your model is fixed with a square shape, and " - "greater than or equal to " - << max_short << "." << std::endl; - processors_.push_back( - std::make_shared(max_short, min_short)); } else if (op["type"].as() == "ResizeToIntMult") { - int mult_int = 32; - if (op["mult_int"]) { - mult_int = op["mult_int"].as(); + if (is_fixed_input_shape) { + std::vector max_size = {input_shape[2], input_shape[3]}; + processors_.push_back( + std::make_shared(long_size, 1, true, max_size)); + } else { + int mult_int = op["mult_int"] ? 
op["mult_int"].as() : 32; + processors_.push_back(std::make_shared(mult_int)); } - processors_.push_back(std::make_shared(mult_int)); } else if (op["type"].as() == "Normalize") { std::vector mean = {0.5, 0.5, 0.5}; std::vector std = {0.5, 0.5, 0.5}; @@ -97,58 +118,40 @@ bool PPMatting::BuildPreprocessPipelineFromConfig() { std = op["std"].as>(); } processors_.push_back(std::make_shared(mean, std)); - } else if (op["type"].as() == "ResizeByLong") { - int target_size = op["long_size"].as(); - processors_.push_back(std::make_shared(target_size)); - } else if (op["type"].as() == "Pad") { - // size: (w, h) - auto size = op["size"].as>(); - std::vector value = {127.5, 127.5, 127.5}; - if (op["fill_value"]) { - auto value = op["fill_value"].as>(); - } - processors_.push_back(std::make_shared("float")); - processors_.push_back( - std::make_shared(size[1], size[0], value)); } else if (op["type"].as() == "ResizeByShort") { - int target_size = op["short_size"].as(); - processors_.push_back(std::make_shared(target_size)); + long_size = op["short_size"].as(); + if (is_fixed_input_shape) { + std::vector max_size = {input_shape[2], input_shape[3]}; + processors_.push_back( + std::make_shared(long_size, 1, true, max_size)); + } else { + processors_.push_back(std::make_shared(long_size)); + } } } + // the default padding value is {127.5,127.5,127.5} so after normalizing, + // ((127.5/255)-0.5)/0.5 = 0.0 + std::vector value = {0.0, 0.0, 0.0}; + processors_.push_back(std::make_shared("float")); + processors_.push_back( + std::make_shared(input_shape[3], input_shape[2], value)); processors_.push_back(std::make_shared()); } + return true; } bool PPMatting::Preprocess(Mat* mat, FDTensor* output, std::map>* im_info) { + (*im_info)["input_shape"] = {mat->Height(), mat->Width()}; for (size_t i = 0; i < processors_.size(); ++i) { - if (processors_[i]->Name().compare("LimitShort") == 0) { - int input_h = static_cast(mat->Height()); - int input_w = static_cast(mat->Width()); - auto processor 
= dynamic_cast(processors_[i].get()); - int max_short = processor->GetMaxShort(); - if (runtime_option.backend != Backend::PDINFER) { - if (input_w != input_h || input_h < max_short || input_w < max_short) { - Resize::Run(mat, max_short, max_short); - } - } - } if (!(*(processors_[i].get()))(mat)) { FDERROR << "Failed to process image data in " << processors_[i]->Name() << "." << std::endl; return false; } - if (processors_[i]->Name().compare("ResizeByLong") == 0) { - (*im_info)["resize_by_long"] = {static_cast(mat->Height()), - static_cast(mat->Width())}; - } } - - // Record output shape of preprocessed image - (*im_info)["output_shape"] = {static_cast(mat->Height()), - static_cast(mat->Width())}; - + (*im_info)["output_shape"] = {mat->Height(), mat->Width()}; mat->ShareWithTensor(output); output->shape.insert(output->shape.begin(), 1); output->name = InputInfoOfRuntime(0).name; @@ -159,8 +162,7 @@ bool PPMatting::Postprocess( std::vector& infer_result, MattingResult* result, const std::map>& im_info) { FDASSERT((infer_result.size() == 1), - "The default number of output tensor must be 1 according to " - "modnet."); + "The default number of output tensor must be 1 "); FDTensor& alpha_tensor = infer_result.at(0); // (1,h,w,1) FDASSERT((alpha_tensor.shape[0] == 1), "Only support batch =1 now."); if (alpha_tensor.dtype != FDDataType::FP32) { @@ -170,41 +172,31 @@ bool PPMatting::Postprocess( auto iter_ipt = im_info.find("input_shape"); auto iter_out = im_info.find("output_shape"); - auto resize_by_long = im_info.find("resize_by_long"); - FDASSERT(iter_out != im_info.end() && iter_ipt != im_info.end(), - "Cannot find input_shape or output_shape from im_info."); - int out_h = iter_out->second[0]; - int out_w = iter_out->second[1]; - int ipt_h = iter_ipt->second[0]; - int ipt_w = iter_ipt->second[1]; - float* alpha_ptr = static_cast(alpha_tensor.Data()); - cv::Mat alpha_zero_copy_ref(out_h, out_w, CV_32FC1, alpha_ptr); - cv::Mat cropped_alpha; - if (resize_by_long != 
im_info.end()) { - int resize_h = resize_by_long->second[0]; - int resize_w = resize_by_long->second[1]; - alpha_zero_copy_ref(cv::Rect(0, 0, resize_w, resize_h)) - .copyTo(cropped_alpha); - } else { - cropped_alpha = alpha_zero_copy_ref; - } - Mat alpha_resized(cropped_alpha); // ref-only, zero copy. + double scale_h = static_cast(iter_out->second[0]) / + static_cast(iter_ipt->second[0]); + double scale_w = static_cast(iter_out->second[1]) / + static_cast(iter_ipt->second[1]); + double actual_scale = std::min(scale_h, scale_w); - if ((out_h != ipt_h) || (out_w != ipt_w)) { - // already allocated a new continuous memory after resize. - // cv::resize(alpha_resized, alpha_resized, cv::Size(ipt_w, ipt_h)); - Resize::Run(&alpha_resized, ipt_w, ipt_h, -1, -1); - } + int size_before_pad_h = round(actual_scale * iter_ipt->second[0]); + int size_before_pad_w = round(actual_scale * iter_ipt->second[1]); + std::vector dim{0, 2, 3, 1}; + Transpose(alpha_tensor, &alpha_tensor, dim); + alpha_tensor.Squeeze(0); + Mat mat = CreateFromTensor(alpha_tensor); + + Crop::Run(&mat, 0, 0, size_before_pad_w, size_before_pad_h); + Resize::Run(&mat, iter_ipt->second[1], iter_ipt->second[0]); result->Clear(); // note: must be setup shape before Resize result->contain_foreground = false; - result->shape = {static_cast(ipt_h), static_cast(ipt_w)}; - int numel = ipt_h * ipt_w; + result->shape = {iter_ipt->second[0], iter_ipt->second[1]}; + int numel = iter_ipt->second[0] * iter_ipt->second[1]; int nbytes = numel * sizeof(float); result->Resize(numel); - std::memcpy(result->alpha.data(), alpha_resized.GetCpuMat()->data, nbytes); + std::memcpy(result->alpha.data(), mat.GetCpuMat()->data, nbytes); return true; } @@ -214,12 +206,6 @@ bool PPMatting::Predict(cv::Mat* im, MattingResult* result) { std::map> im_info; - // Record the shape of image and the shape of preprocessed image - im_info["input_shape"] = {static_cast(mat.Height()), - static_cast(mat.Width())}; - im_info["output_shape"] = 
{static_cast(mat.Height()), - static_cast(mat.Width())}; - if (!Preprocess(&mat, &(processed_data[0]), &im_info)) { FDERROR << "Failed to preprocess input data while using model:" << ModelName() << "." << std::endl; diff --git a/tests/eval_example/test_ppmatting.py b/tests/eval_example/test_ppmatting.py new file mode 100644 index 000000000..190f3017f --- /dev/null +++ b/tests/eval_example/test_ppmatting.py @@ -0,0 +1,109 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
import fastdeploy as fd
import cv2
import os
import pickle
import numpy as np

# Shared test input image for all matting models.
_INPUT_URL = "https://bj.bcebos.com/paddlehub/fastdeploy/matting_input.jpg"
# Max allowed absolute per-pixel difference against the pickled baseline alpha.
_DIFF_THRESHOLD = 1e-05


def _check_matting_model(model_url, model_dir, pkl_url, pkl_name):
    """Download a matting model, run prediction, optionally check a baseline.

    Args:
        model_url: URL of the .tgz model archive to download and decompress.
        model_dir: local directory the archive decompresses into.
        pkl_url: URL of the pickled baseline alpha matte; when empty, the
            baseline comparison is skipped (prediction is still exercised).
        pkl_name: local filename the baseline pickle is saved as.
    """
    fd.download_and_decompress(model_url, ".")
    fd.download(_INPUT_URL, ".")

    # Configure the runtime and load the model.
    runtime_option = fd.RuntimeOption()
    model_file = os.path.join(model_dir, "model.pdmodel")
    params_file = os.path.join(model_dir, "model.pdiparams")
    config_file = os.path.join(model_dir, "deploy.yaml")
    model = fd.vision.matting.PPMatting(
        model_file, params_file, config_file, runtime_option=runtime_option)

    # Predict the matting result for the test image.
    im = cv2.imread("./matting_input.jpg")
    result = model.predict(im.copy())

    if pkl_url:
        # Fix: the original passed the local filename as the URL, so the
        # baseline could never actually be downloaded.
        fd.download(pkl_url, ".")
        with open("./" + pkl_name, "rb") as f:
            baseline = pickle.load(f)

        diff = np.fabs(np.array(result.alpha) - np.array(baseline))
        assert diff.max() < _DIFF_THRESHOLD, \
            "The diff is %f, which is bigger than %f" % (diff.max(),
                                                         _DIFF_THRESHOLD)


def test_matting_ppmatting():
    """Smoke/regression test for the PP-Matting-512 model."""
    _check_matting_model(
        "https://bj.bcebos.com/paddlehub/fastdeploy/PP-Matting-512.tgz",
        "./PP-Matting-512",
        "",  # no baseline published yet; fill in the pickle URL to enable
        "ppmatting_result.pkl")


def test_matting_ppmodnet():
    """Smoke/regression test for the PP-ModNet (MobileNetV2) model."""
    _check_matting_model(
        "https://bj.bcebos.com/paddlehub/fastdeploy/PPModnet_MobileNetV2.tgz",
        "./PPModnet_MobileNetV2",
        "",  # no baseline published yet; fill in the pickle URL to enable
        "ppmodnet_result.pkl")


def test_matting_pphumanmatting():
    """Smoke/regression test for the PP-HumanMatting model."""
    _check_matting_model(
        "https://bj.bcebos.com/paddlehub/fastdeploy/PPHumanMatting.tgz",
        "./PPHumanMatting",
        "",  # no baseline published yet; fill in the pickle URL to enable
        "pphumanmatting_result.pkl")