Add PP-ModNet and PP-HumanMatting Support (#240)

* first commit for yolov7
* pybind for yolov7
* CPP README.md
* modified yolov7.cc
* README.md
* python file modify
* delete license in fastdeploy/
* repush the conflict part
* README.md modified
* file path modified
* README modified
* move some helpers to private
* add examples for yolov7
* api.md modified
* YOLOv7
* yolov7 release link
* copyright
* change some helpers to private
* change variables to const and fix documents.
* gitignore
* Transfer some functions to private member of class
* Merge from develop (#9)
* Fix compile problem in different python version (#26)
* fix some usage problem in linux
* Fix compile problem
* Add PaddleDetection/PPYOLOE model support (#22)
* add ppdet/ppyoloe
* Add demo code and documents
* add convert processor to vision (#27)
* update .gitignore
* Added checking for cmake include dir
* fixed missing trt_backend option bug when init from trt
* remove unneeded data layout and add pre-check for dtype
* changed RGB2BRG to BGR2RGB in ppcls model
* add model_zoo yolov6 c++/python demo
* fixed CMakeLists.txt typos
* update yolov6 cpp/README.md
* add yolox c++/pybind and model_zoo demo
* move some helpers to private
* fixed CMakeLists.txt typos
* add normalize with alpha and beta
* add version notes for yolov5/yolov6/yolox
* add copyright to yolov5.cc
* revert normalize
* fixed some bugs in yolox
* fixed examples/CMakeLists.txt to avoid conflicts
* add convert processor to vision
* format examples/CMakeLists summary
* Fix bug while the inference result is empty with YOLOv5 (#29)
* Add multi-label function for yolov5
* Update README.md Update doc
* Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name
* Update runtime_option.md Update resnet model dynamic shape setting name from images to x
* Fix bug when inference result boxes are empty
* Delete detection.py
* first commit for yolor
* for merge
* Develop (#11)
* Yolor (#16)
* Develop (#11) (#12)
* Develop (#13)
* documents
* Develop (#14)
* add is_dynamic for YOLO series (#22)
* modify ppmatting backend and docs
* modify ppmatting docs
* fix the PPMatting size problem
* fix LimitShort's log
* retrigger ci
* modify PPMatting docs
* modify the way for dealing with LimitShort
* add pphumanmatting and modnet series
* docs of PPMatting series
* add explanation of newly added processors and fix processors
* Modify LimitShort function and ppmatting.cc
* modify ResizeByShort and ppmatting.cc
* change resize_to_int_mult to limit_by_stride and delete resize_by_input_shape
* retrigger ci
* fix problem produced by ResizeByShort
* Update eigen.cmake
* Delete eigen.cmake
* refine code
* add test file for ppmatting series
* add squeeze for fd_tensor and modify ppmatting.cc

Co-authored-by: Jason <jiangjiajun@baidu.com>
Co-authored-by: root <root@bjyz-sys-gpu-kongming3.bjyz.baidu.com>
Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
Co-authored-by: huangjianhui <852142024@qq.com>
Co-authored-by: Jason <928090362@qq.com>
2025-10-08 10:00:29 +08:00 · 2022-10-07 21:44:16 +08:00
parent 1005a09ff1
commit 0692dcc405
21 changed files with 523 additions and 132 deletions
--- a/fastdeploy/vision/common/processors/crop.cc
+++ b/fastdeploy/vision/common/processors/crop.cc
@@ -0,0 +1,65 @@
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "fastdeploy/vision/common/processors/crop.h"
+
+namespace fastdeploy {
+namespace vision {
+
+bool Crop::CpuRun(Mat* mat) {
+  cv::Mat* im = mat->GetCpuMat();
+  int height = static_cast<int>(im->rows);
+  int width = static_cast<int>(im->cols);
+  if (height < height_ + offset_h_ || width < width_ + offset_w_) {
+    FDERROR << "[Crop] Cannot crop [" << height_ << ", " << width_
+            << "] from the input image [" << height << ", " << width
+            << "], with offset [" << offset_h_ << ", " << offset_w_ << "]."
+            << std::endl;
+    return false;
+  }
+  cv::Rect crop_roi(offset_w_, offset_h_, width_, height_);
+  *im = (*im)(crop_roi);
+  mat->SetWidth(width_);
+  mat->SetHeight(height_);
+  return true;
+}
+
+#ifdef ENABLE_OPENCV_CUDA
+bool Crop::GpuRun(Mat* mat) {
+  cv::cuda::GpuMat* im = mat->GetGpuMat();
+  int height = static_cast<int>(im->rows);
+  int width = static_cast<int>(im->cols);
+  if (height < height_ + offset_h_ || width < width_ + offset_w_) {
+    FDERROR << "[Crop] Cannot crop [" << height_ << ", " << width_
+            << "] from the input image [" << height << ", " << width
+            << "], with offset [" << offset_h_ << ", " << offset_w_ << "]."
+            << std::endl;
+    return false;
+  }
+  cv::Rect crop_roi(offset_w_, offset_h_, width_, height_);
+  *im = (*im)(crop_roi);
+  mat->SetWidth(width_);
+  mat->SetHeight(height_);
+  return true;
+}
+#endif
+
+bool Crop::Run(Mat* mat, int offset_w, int offset_h, int width, int height,
+               ProcLib lib) {
+  auto c = Crop(offset_w, offset_h, width, height);
+  return c(mat, lib);
+}
+
+}  // namespace vision
+}  // namespace fastdeploy
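
The bounds check that `Crop::CpuRun` and `Crop::GpuRun` share can be exercised without OpenCV. The following is a minimal sketch, not FastDeploy code: `CropRoi` is a hypothetical stand-in for `cv::Rect`, and `MakeCropRoi` is an illustrative helper. It also rejects negative offsets and non-positive sizes, which the check in the diff does not do; that extra guard is an assumption about what callers would want.

```cpp
#include <cassert>

// Hypothetical stand-in for cv::Rect (x, y, width, height).
struct CropRoi {
  int x;
  int y;
  int width;
  int height;
};

// Builds the crop region if it fits inside an image of image_w x image_h.
// The "fits" test mirrors the condition in Crop::CpuRun. The checks for
// negative offsets and non-positive sizes are additions for this sketch.
bool MakeCropRoi(int image_w, int image_h, int offset_w, int offset_h,
                 int crop_w, int crop_h, CropRoi* roi) {
  assert(roi != nullptr);
  if (offset_w < 0 || offset_h < 0 || crop_w <= 0 || crop_h <= 0) {
    return false;
  }
  if (image_h < crop_h + offset_h || image_w < crop_w + offset_w) {
    return false;
  }
  // Same argument order as cv::Rect(offset_w_, offset_h_, width_, height_).
  *roi = {offset_w, offset_h, crop_w, crop_h};
  return true;
}
```

A crop that ends exactly on the image border is accepted, because the comparison is strict (`<`) rather than `<=`.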
--- a/fastdeploy/vision/common/processors/crop.h
+++ b/fastdeploy/vision/common/processors/crop.h
@@ -0,0 +1,47 @@
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+
+#include "fastdeploy/vision/common/processors/base.h"
+
+namespace fastdeploy {
+namespace vision {
+
+class Crop : public Processor {
+ public:
+  Crop(int offset_w, int offset_h, int width, int height) {
+    offset_w_ = offset_w;
+    offset_h_ = offset_h;
+    width_ = width;
+    height_ = height;
+  }
+  bool CpuRun(Mat* mat);
+#ifdef ENABLE_OPENCV_CUDA
+  bool GpuRun(Mat* mat);
+#endif
+  std::string Name() { return "Crop"; }
+
+  static bool Run(Mat* mat, int offset_w, int offset_h, int width, int height,
+                  ProcLib lib = ProcLib::OPENCV_CPU);
+
+ private:
+  int offset_w_;
+  int offset_h_;
+  int height_;
+  int width_;
+};
+
+}  // namespace vision
+}  // namespace fastdeploy
--- a/fastdeploy/vision/common/processors/resize_to_int_mult.cc
+++ b/fastdeploy/vision/common/processors/resize_to_int_mult.cc
@@ -12,17 +12,17 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.

-#include "fastdeploy/vision/common/processors/resize_to_int_mult.h"
+#include "fastdeploy/vision/common/processors/limit_by_stride.h"

 namespace fastdeploy {
 namespace vision {

-bool ResizeToIntMult::CpuRun(Mat* mat) {
+bool LimitByStride::CpuRun(Mat* mat) {
  cv::Mat* im = mat->GetCpuMat();
  int origin_w = im->cols;
  int origin_h = im->rows;
-  int rw = origin_w - origin_w % mult_int_;
-  int rh = origin_h - origin_h % mult_int_;
+  int rw = origin_w - origin_w % stride_;
+  int rh = origin_h - origin_h % stride_;
  if (rw != origin_w || rh != origin_w) {
    cv::resize(*im, *im, cv::Size(rw, rh), 0, 0, interp_);
    mat->SetWidth(im->cols);
@@ -32,13 +32,13 @@ bool ResizeToIntMult::CpuRun(Mat* mat) {
 }

 #ifdef ENABLE_OPENCV_CUDA
-bool ResizeToIntMult::GpuRun(Mat* mat) {
+bool LimitByStride::GpuRun(Mat* mat) {
  cv::cuda::GpuMat* im = mat->GetGpuMat();
  int origin_w = im->cols;
  int origin_h = im->rows;
  im->convertTo(*im, CV_32FC(im->channels()));
-  int rw = origin_w - origin_w % mult_int_;
-  int rh = origin_h - origin_h % mult_int_;
+  int rw = origin_w - origin_w % stride_;
+  int rh = origin_h - origin_h % stride_;
  if (rw != origin_w || rh != origin_w) {
    cv::cuda::resize(*im, *im, cv::Size(rw, rh), 0, 0, interp_);
    mat->SetWidth(im->cols);
@@ -48,8 +48,8 @@ bool ResizeToIntMult::GpuRun(Mat* mat) {
 }
 #endif

-bool ResizeToIntMult::Run(Mat* mat, int mult_int, int interp, ProcLib lib) {
-  auto r = ResizeToIntMult(mult_int, interp);
+bool LimitByStride::Run(Mat* mat, int stride, int interp, ProcLib lib) {
+  auto r = LimitByStride(stride, interp);
  return r(mat, lib);
 }
 }  // namespace vision
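
The rounding performed by `LimitByStride` can be checked in isolation. The sketch below is a standalone illustration, not FastDeploy code; `LimitByStrideSize` and `NeedsResize` are hypothetical helpers. Note that the guard shown in the diff compares `rh` against `origin_w`, whereas this sketch compares each dimension with its own original value, which is presumably the intent. For a 320x333 input with stride 32 the two conditions disagree: the sketch reports that a resize is needed, while the guard in the diff would skip it.

```cpp
#include <cassert>
#include <utility>

// Rounds width and height down to the nearest multiple of stride,
// as in `origin_w - origin_w % stride_`. A dimension smaller than the
// stride rounds down to 0, so callers should keep inputs >= stride.
std::pair<int, int> LimitByStrideSize(int w, int h, int stride) {
  assert(stride > 0);
  return {w - w % stride, h - h % stride};
}

// True when at least one dimension is not already a multiple of stride.
bool NeedsResize(int w, int h, int stride) {
  const std::pair<int, int> target = LimitByStrideSize(w, h, stride);
  return target.first != w || target.second != h;
}
```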
--- a/fastdeploy/vision/common/processors/resize_to_int_mult.h
+++ b/fastdeploy/vision/common/processors/resize_to_int_mult.h
@@ -19,24 +19,27 @@
 namespace fastdeploy {
 namespace vision {

-class ResizeToIntMult : public Processor {
+class LimitByStride : public Processor {
 public:
-  explicit ResizeToIntMult(int mult_int = 32, int interp = 1) {
-    mult_int_ = mult_int;
+  explicit LimitByStride(int stride = 32, int interp = 1) {
+    stride_ = stride;
    interp_ = interp;
  }
+
+  // Resize Mat* mat to make the size divisible by stride_.
+
  bool CpuRun(Mat* mat);
 #ifdef ENABLE_OPENCV_CUDA
  bool GpuRun(Mat* mat);
 #endif
-  std::string Name() { return "ResizeToIntMult"; }
+  std::string Name() { return "LimitByStride"; }

-  static bool Run(Mat* mat, int mult_int = 32, int interp = 1,
+  static bool Run(Mat* mat, int stride = 32, int interp = 1,
                  ProcLib lib = ProcLib::OPENCV_CPU);

 private:
  int interp_;
-  int mult_int_;
+  int stride_;
 };
 }  // namespace vision
 }  // namespace fastdeploy
--- a/fastdeploy/vision/common/processors/limit_long.cc
+++ b/fastdeploy/vision/common/processors/limit_long.cc
@@ -0,0 +1,70 @@
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "fastdeploy/vision/common/processors/limit_long.h"
+
+namespace fastdeploy {
+namespace vision {
+
+bool LimitLong::CpuRun(Mat* mat) {
+  cv::Mat* im = mat->GetCpuMat();
+  int origin_w = im->cols;
+  int origin_h = im->rows;
+  int im_size_max = std::max(origin_w, origin_h);
+  int target = im_size_max;
+  if (max_long_ > 0 && im_size_max > max_long_) {
+    target = max_long_;
+  } else if (min_long_ > 0 && im_size_max < min_long_) {
+    target = min_long_;
+  }
+  if (target != im_size_max) {
+    double scale =
+        static_cast<double>(target) / static_cast<double>(im_size_max);
+    cv::resize(*im, *im, cv::Size(), scale, scale, interp_);
+    mat->SetWidth(im->cols);
+    mat->SetHeight(im->rows);
+  }
+  return true;
+}
+
+#ifdef ENABLE_OPENCV_CUDA
+bool LimitLong::GpuRun(Mat* mat) {
+  cv::cuda::GpuMat* im = mat->GetGpuMat();
+  int origin_w = im->cols;
+  int origin_h = im->rows;
+  im->convertTo(*im, CV_32FC(im->channels()));
+  int im_size_max = std::max(origin_w, origin_h);
+  int target = im_size_max;
+  if (max_long_ > 0 && im_size_max > max_long_) {
+    target = max_long_;
+  } else if (min_long_ > 0 && im_size_max < min_long_) {
+    target = min_long_;
+  }
+  if (target != im_size_max) {
+    double scale =
+        static_cast<double>(target) / static_cast<double>(im_size_max);
+    cv::cuda::resize(*im, *im, cv::Size(), scale, scale, interp_);
+    mat->SetWidth(im->cols);
+    mat->SetHeight(im->rows);
+  }
+  return true;
+}
+#endif
+
+bool LimitLong::Run(Mat* mat, int max_long, int min_long, ProcLib lib) {
+  auto l = LimitLong(max_long, min_long);
+  return l(mat, lib);
+}
+}  // namespace vision
+}  // namespace fastdeploy
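
The target selection in `LimitLong` reduces to a single scale factor applied to both edges. The following sketch isolates that arithmetic under the same conventions as the diff (a non-positive `max_long` or `min_long` disables that bound); `LimitLongScale` is a hypothetical helper name, not part of FastDeploy.

```cpp
#include <algorithm>
#include <cassert>

// Returns the factor by which both edges would be scaled.
// A result of exactly 1.0 corresponds to the "no resize" branch.
double LimitLongScale(int w, int h, int max_long, int min_long) {
  assert(w > 0 && h > 0);
  const int long_edge = std::max(w, h);
  int target = long_edge;
  if (max_long > 0 && long_edge > max_long) {
    target = max_long;  // Shrink: long edge exceeds the upper bound.
  } else if (min_long > 0 && long_edge < min_long) {
    target = min_long;  // Enlarge: long edge is below the lower bound.
  }
  return static_cast<double>(target) / static_cast<double>(long_edge);
}
```

For example, a 1920x1080 image with `max_long = 960` yields a factor of 0.5, so the short edge becomes 540.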
--- a/fastdeploy/vision/common/processors/limit_long.h
+++ b/fastdeploy/vision/common/processors/limit_long.h
@@ -0,0 +1,51 @@
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+
+#include "fastdeploy/vision/common/processors/base.h"
+
+namespace fastdeploy {
+namespace vision {
+
+class LimitLong : public Processor {
+ public:
+  explicit LimitLong(int max_long = -1, int min_long = -1, int interp = 1) {
+    max_long_ = max_long;
+    min_long_ = min_long;
+    interp_ = interp;
+  }
+
+  // Limit the long edge of the image.
+  // If the long edge is larger than max_long_, resize the long edge
+  // to max_long_ and scale the short edge proportionally.
+  // If the long edge is smaller than min_long_, resize the long edge
+  // to min_long_ and scale the short edge proportionally.
+  bool CpuRun(Mat* mat);
+#ifdef ENABLE_OPENCV_CUDA
+  bool GpuRun(Mat* mat);
+#endif
+  std::string Name() { return "LimitLong"; }
+
+  static bool Run(Mat* mat, int max_long = -1, int min_long = -1,
+                  ProcLib lib = ProcLib::OPENCV_CPU);
+  int GetMaxLong() const { return max_long_; }
+
+ private:
+  int max_long_;
+  int min_long_;
+  int interp_;
+};
+}  // namespace vision
+}  // namespace fastdeploy
--- a/fastdeploy/vision/common/processors/limit_short.cc
+++ b/fastdeploy/vision/common/processors/limit_short.cc
@@ -28,9 +28,11 @@ bool LimitShort::CpuRun(Mat* mat) {
  } else if (min_short_ > 0 && im_size_min < min_short_) {
    target = min_short_;
  }
+  double scale = -1.f;
  if (target != im_size_min) {
-    double scale =
-        static_cast<double>(target) / static_cast<double>(im_size_min);
+    scale = static_cast<double>(target) / static_cast<double>(im_size_min);
+  }
+  if (scale > 0) {
    cv::resize(*im, *im, cv::Size(), scale, scale, interp_);
    mat->SetWidth(im->cols);
    mat->SetHeight(im->rows);
@@ -51,9 +53,11 @@ bool LimitShort::GpuRun(Mat* mat) {
  } else if (min_short_ > 0 && im_size_min < min_short_) {
    target = min_short_;
  }
+  double scale = -1.f;
  if (target != im_size_min) {
-    double scale =
-        static_cast<double>(target) / static_cast<double>(im_size_min);
+    scale = static_cast<double>(target) / static_cast<double>(im_size_min);
+  }
+  if (scale > 0) {
    cv::cuda::resize(*im, *im, cv::Size(), scale, scale, interp_);
    mat->SetWidth(im->cols);
    mat->SetHeight(im->rows);
--- a/fastdeploy/vision/common/processors/limit_short.h
+++ b/fastdeploy/vision/common/processors/limit_short.h
@@ -26,6 +26,12 @@ class LimitShort : public Processor {
    min_short_ = min_short;
    interp_ = interp;
  }
+
+  // Limit the short edge of the image.
+  // If the short edge is larger than max_short_, resize the short edge
+  // to max_short_ and scale the long edge proportionally.
+  // If the short edge is smaller than min_short_, resize the short edge
+  // to min_short_ and scale the long edge proportionally.
  bool CpuRun(Mat* mat);
 #ifdef ENABLE_OPENCV_CUDA
  bool GpuRun(Mat* mat);
@@ -34,7 +40,7 @@ class LimitShort : public Processor {

  static bool Run(Mat* mat, int max_short = -1, int min_short = -1,
                  ProcLib lib = ProcLib::OPENCV_CPU);
-  int GetMaxShort() { return max_short_; }
+  int GetMaxShort() const { return max_short_; }

 private:
  int max_short_;
--- a/fastdeploy/vision/common/processors/pad_to_size.cc
+++ b/fastdeploy/vision/common/processors/pad_to_size.cc
@@ -18,6 +18,9 @@ namespace fastdeploy {
 namespace vision {

 bool PadToSize::CpuRun(Mat* mat) {
+  if (width_ == -1 || height_ == -1) {
+    return true;
+  }
  if (mat->layout != Layout::HWC) {
    FDERROR << "PadToSize: The input data must be Layout::HWC format!"
            << std::endl;
@@ -74,6 +77,9 @@ bool PadToSize::CpuRun(Mat* mat) {

 #ifdef ENABLE_OPENCV_CUDA
 bool PadToSize::GpuRun(Mat* mat) {
+  if (width_ == -1 || height_ == -1) {
+    return true;
+  }
  if (mat->layout != Layout::HWC) {
    FDERROR << "PadToSize: The input data must be Layout::HWC format!"
            << std::endl;
--- a/fastdeploy/vision/common/processors/pad_to_size.h
+++ b/fastdeploy/vision/common/processors/pad_to_size.h
@@ -21,7 +21,7 @@ namespace vision {

 class PadToSize : public Processor {
 public:
-  // only support pad with left-top padding mode
+  // only support pad with right-bottom padding mode
  PadToSize(int width, int height, const std::vector<float>& value) {
    width_ = width;
    height_ = height;
--- a/fastdeploy/vision/common/processors/resize_by_short.cc
+++ b/fastdeploy/vision/common/processors/resize_by_short.cc
@@ -22,12 +22,14 @@ bool ResizeByShort::CpuRun(Mat* mat) {
  int origin_w = im->cols;
  int origin_h = im->rows;
  double scale = GenerateScale(origin_w, origin_h);
-  if (use_scale_) {
+  if (use_scale_ && fabs(scale - 1.0) >= 1e-06) {
    cv::resize(*im, *im, cv::Size(), scale, scale, interp_);
  } else {
    int width = static_cast<int>(round(scale * im->cols));
    int height = static_cast<int>(round(scale * im->rows));
-    cv::resize(*im, *im, cv::Size(width, height), 0, 0, interp_);
+    if (width != origin_w || height != origin_h) {
+      cv::resize(*im, *im, cv::Size(width, height), 0, 0, interp_);
+    }
  }
  mat->SetWidth(im->cols);
  mat->SetHeight(im->rows);
@@ -41,12 +43,14 @@ bool ResizeByShort::GpuRun(Mat* mat) {
  int origin_h = im->rows;
  double scale = GenerateScale(origin_w, origin_h);
  im->convertTo(*im, CV_32FC(im->channels()));
-  if (use_scale_) {
+  if (use_scale_ && fabs(scale - 1.0) >= 1e-06) {
    cv::cuda::resize(*im, *im, cv::Size(), scale, scale, interp_);
  } else {
    int width = static_cast<int>(round(scale * im->cols));
    int height = static_cast<int>(round(scale * im->rows));
-    cv::cuda::resize(*im, *im, cv::Size(width, height), 0, 0, interp_);
+    if (width != origin_w || height != origin_h) {
+      cv::cuda::resize(*im, *im, cv::Size(width, height), 0, 0, interp_);
+    }
  }
  mat->SetWidth(im->cols);
  mat->SetHeight(im->rows);
@@ -59,18 +63,31 @@ double ResizeByShort::GenerateScale(const int origin_w, const int origin_h) {
  int im_size_min = std::min(origin_w, origin_h);
  double scale =
      static_cast<double>(target_size_) / static_cast<double>(im_size_min);
-  if (max_size_ > 0) {
-    if (round(scale * im_size_max) > max_size_) {
-      scale = static_cast<double>(max_size_) / static_cast<double>(im_size_max);
+
+  if (max_hw_.size() > 0) {
+    FDASSERT(max_hw_.size() == 2,
+             "Require size of max_hw_ be 2, but now it's %zu.", max_hw_.size());
+    FDASSERT(
+        max_hw_[0] > 0 && max_hw_[1] > 0,
+        "Require elements in max_hw_ greater than 0, but now it's [%d, %d].",
+        max_hw_[0], max_hw_[1]);
+
+    double scale_h =
+        static_cast<double>(max_hw_[0]) / static_cast<double>(origin_h);
+    double scale_w =
+        static_cast<double>(max_hw_[1]) / static_cast<double>(origin_w);
+    double min_scale = std::min(scale_h, scale_w);
+    if (min_scale < scale) {
+      scale = min_scale;
    }
  }
  return scale;
 }

 bool ResizeByShort::Run(Mat* mat, int target_size, int interp, bool use_scale,
-                        int max_size, ProcLib lib) {
-  auto r = ResizeByShort(target_size, interp, use_scale, max_size);
+                        const std::vector<int>& max_hw, ProcLib lib) {
+  auto r = ResizeByShort(target_size, interp, use_scale, max_hw);
  return r(mat, lib);
 }
-} // namespace vision
-} // namespace fastdeploy
+}  // namespace vision
+}  // namespace fastdeploy
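
The change from a scalar `max_size` to a `max_hw` pair means the scale is now capped per axis rather than by the long edge alone. The sketch below reproduces the arithmetic of `ResizeByShort::GenerateScale` as a free function so it can be tested without the class; the free-function form and the use of `assert` in place of `FDASSERT` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Scale that maps the short edge to target_size, optionally capped so the
// result does not exceed max_hw = {max_height, max_width}.
// An empty max_hw disables the cap, matching the default in the diff.
double GenerateScale(int origin_w, int origin_h, int target_size,
                     const std::vector<int>& max_hw) {
  assert(origin_w > 0 && origin_h > 0 && target_size > 0);
  const int short_edge = std::min(origin_w, origin_h);
  double scale =
      static_cast<double>(target_size) / static_cast<double>(short_edge);
  if (!max_hw.empty()) {
    assert(max_hw.size() == 2 && max_hw[0] > 0 && max_hw[1] > 0);
    const double scale_h =
        static_cast<double>(max_hw[0]) / static_cast<double>(origin_h);
    const double scale_w =
        static_cast<double>(max_hw[1]) / static_cast<double>(origin_w);
    // Take the tightest of the three constraints.
    scale = std::min(scale, std::min(scale_h, scale_w));
  }
  return scale;
}
```

With a 1000x500 input and `target_size = 250`, the uncapped scale is 0.5; passing `max_hw = {125, 1000}` lowers it to 0.25 because the height cap is the binding constraint.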
--- a/fastdeploy/vision/common/processors/resize_by_short.h
+++ b/fastdeploy/vision/common/processors/resize_by_short.h
@@ -22,9 +22,9 @@ namespace vision {
 class ResizeByShort : public Processor {
 public:
  ResizeByShort(int target_size, int interp = 1, bool use_scale = true,
-                int max_size = -1) {
+                const std::vector<int>& max_hw = std::vector<int>()) {
    target_size_ = target_size;
-    max_size_ = max_size;
+    max_hw_ = max_hw;
    interp_ = interp;
    use_scale_ = use_scale;
  }
@@ -35,15 +35,16 @@ class ResizeByShort : public Processor {
  std::string Name() { return "ResizeByShort"; }

  static bool Run(Mat* mat, int target_size, int interp = 1,
-                  bool use_scale = true, int max_size = -1,
+                  bool use_scale = true,
+                  const std::vector<int>& max_hw = std::vector<int>(),
                  ProcLib lib = ProcLib::OPENCV_CPU);

 private:
  double GenerateScale(const int origin_w, const int origin_h);
  int target_size_;
-  int max_size_;
+  std::vector<int> max_hw_;
  int interp_;
  bool use_scale_;
 };
-} // namespace vision
-} // namespace fastdeploy
+}  // namespace vision
+}  // namespace fastdeploy
--- a/fastdeploy/vision/common/processors/transform.h
+++ b/fastdeploy/vision/common/processors/transform.h
@@ -18,7 +18,10 @@
 #include "fastdeploy/vision/common/processors/center_crop.h"
 #include "fastdeploy/vision/common/processors/color_space_convert.h"
 #include "fastdeploy/vision/common/processors/convert.h"
+#include "fastdeploy/vision/common/processors/crop.h"
 #include "fastdeploy/vision/common/processors/hwc2chw.h"
+#include "fastdeploy/vision/common/processors/limit_by_stride.h"
+#include "fastdeploy/vision/common/processors/limit_long.h"
 #include "fastdeploy/vision/common/processors/limit_short.h"
 #include "fastdeploy/vision/common/processors/normalize.h"
 #include "fastdeploy/vision/common/processors/pad.h"
@@ -26,5 +29,4 @@
 #include "fastdeploy/vision/common/processors/resize.h"
 #include "fastdeploy/vision/common/processors/resize_by_long.h"
 #include "fastdeploy/vision/common/processors/resize_by_short.h"
-#include "fastdeploy/vision/common/processors/resize_to_int_mult.h"
 #include "fastdeploy/vision/common/processors/stride_pad.h"