[Model] Modify SR (#674)

* first commit for yolov7
* pybind for yolov7
* CPP README.md
* modified yolov7.cc
* README.md
* python file modify
* delete license in fastdeploy/
* repush the conflict part
* README.md modified
* file path modified
* README modified
* move some helpers to private
* add examples for yolov7
* api.md modified
* YOLOv7
* yolov7 release link
* copyright
* change some helpers to private
* change variables to const and fix documents.
* gitignore
* Transfer some functions to private member of class
* Merge from develop (#9)
* Fix compile problem in different python version (#26)
* fix some usage problem in linux
* Fix compile problem
* Add PaddleDetection/PPYOLOE model support (#22)
* add ppdet/ppyoloe
* Add demo code and documents
* add convert processor to vision (#27)
* update .gitignore
* Added checking for cmake include dir
* fixed missing trt_backend option bug when init from trt
* remove unneeded data layout and add pre-check for dtype
* changed RGB2BRG to BGR2RGB in ppcls model
* add model_zoo yolov6 c++/python demo
* fixed CMakeLists.txt typos
* update yolov6 cpp/README.md
* add yolox c++/pybind and model_zoo demo
* move some helpers to private
* fixed CMakeLists.txt typos
* add normalize with alpha and beta
* add version notes for yolov5/yolov6/yolox
* add copyright to yolov5.cc
* revert normalize
* fixed some bugs in yolox
* fixed examples/CMakeLists.txt to avoid conflicts
* add convert processor to vision
* format examples/CMakeLists summary
* Fix bug while the inference result is empty with YOLOv5 (#29)
* Add multi-label function for yolov5
* Update README.md Update doc
* Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name
* Update runtime_option.md Update resnet model dynamic shape setting name from images to x
* Fix bug when inference result boxes are empty
* Delete detection.py
* first commit for yolor
* for merge
* Develop (#11)
* Yolor (#16)
* Develop (#11) (#12)
* Develop (#13)
* documents
* Develop (#14)
* add is_dynamic for YOLO series (#22)
* modify ppmatting backend and docs
* modify ppmatting docs
* fix the PPMatting size problem
* fix LimitShort's log
* retrigger ci
* modify PPMatting docs
* modify the way for dealing with LimitShort
* add python comments for external models
* modify resnet c++ comments
* modify C++ comments for external models
* modify python comments and add result class comments
* fix comments compile error
* modify result.h comments
* modify examples doc and code for SR models
* code style
* retrigger ci
* python file code style
* fix examples links

Co-authored-by: Jason <jiangjiajun@baidu.com>
Co-authored-by: root <root@bjyz-sys-gpu-kongming3.bjyz.baidu.com>
Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
Co-authored-by: huangjianhui <852142024@qq.com>
Co-authored-by: Jason <928090362@qq.com>
2025-10-07 01:22:59 +08:00 · 2022-11-25 18:31:22 +08:00
parent 86f05e9ac8
commit ad5c9c08b2
24 changed files with 417 additions and 404 deletions
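
Note on the frame loop in examples/vision/sr/edvr/cpp/infer.cc: each of the three inference functions buffers `frame_num` frames, then drops the oldest frame and appends the newest before every `model.Predict` call. The sketch below isolates that buffering so it can be checked without OpenCV or FastDeploy. It makes two assumptions: frames are plain integer ids rather than `cv::Mat`, and `CollectWindows` is a hypothetical helper name that does not exist in FastDeploy. Like the loop in the diff, it never emits the first full window; the first window handed to the model starts at the second frame.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper (not part of FastDeploy): returns every window of
// frame ids that the example loop would pass to Predict.
std::vector<std::vector<int>> CollectWindows(const std::vector<int>& frames,
                                             std::size_t window_size) {
  // An empty window cannot be slid; the example would misbehave here too.
  assert(window_size > 0);
  std::deque<int> window;
  std::vector<std::vector<int>> windows;
  for (int frame : frames) {
    if (window.size() < window_size) {
      // Warm-up: keep buffering until the window is full, no inference yet.
      window.push_back(frame);
      continue;
    }
    // Steady state: drop the oldest frame, append the newest, then emit.
    window.pop_front();
    window.push_back(frame);
    windows.emplace_back(window.begin(), window.end());
  }
  return windows;
}
```

For frames 0 to 4 and a window of 3, the helper yields the windows {1, 2, 3} and {2, 3, 4}. A `std::deque` is used here only because `pop_front` is constant time; the example's `imgs.erase(imgs.begin())` on a `std::vector` gives the same result for small windows.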
--- a/examples/vision/sr/edvr/cpp/infer.cc
+++ b/examples/vision/sr/edvr/cpp/infer.cc
@@ -20,8 +20,8 @@ const char sep = '\\';
 const char sep = '/';
 #endif

-void CpuInfer(const std::string& model_dir,
-              const std::string& video_file, int frame_num) {
+void CpuInfer(const std::string& model_dir, const std::string& video_file,
+              int frame_num) {
  auto model_file = model_dir + sep + "model.pdmodel";
  auto params_file = model_dir + sep + "model.pdiparams";
  auto model = fastdeploy::vision::sr::EDVR(model_file, params_file);
@@ -32,34 +32,36 @@ void CpuInfer(const std::string& model_dir,
  }
  // note: input/output shape is [b, n, c, h, w] (n = frame_nums; b=1(default))
  // b and n is dependent on export model shape
-  // see https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/tutorials/video_super_resolution.md
+  // see
+  // https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/tutorials/video_super_resolution.md
  cv::VideoCapture capture;
  // change your save video path
  std::string video_out_name = "output.mp4";
  capture.open(video_file);
-  if (!capture.isOpened())
-  {
-    std::cout<<"can not open video "<<std::endl;
+  if (!capture.isOpened()) {
+    std::cout << "can not open video " << std::endl;
    return;
  }
  // Get Video info :fps, frame count
  // it used 4.x version of opencv below
  // notice your opencv version and method of api.
  int video_fps = static_cast<int>(capture.get(cv::CAP_PROP_FPS));
-  int video_frame_count = static_cast<int>(capture.get(cv::CAP_PROP_FRAME_COUNT));
+  int video_frame_count =
+      static_cast<int>(capture.get(cv::CAP_PROP_FRAME_COUNT));
  // Set fixed size for output frame, only for msvsr model
  int out_width = 1280;
  int out_height = 720;
-  std::cout << "fps: " << video_fps << "\tframe_count: " << video_frame_count << std::endl;
+  std::cout << "fps: " << video_fps << "\tframe_count: " << video_frame_count
+            << std::endl;

  // Create VideoWriter for output
  cv::VideoWriter video_out;
  std::string video_out_path("./");
  video_out_path += video_out_name;
  int fcc = cv::VideoWriter::fourcc('m', 'p', '4', 'v');
-  video_out.open(video_out_path, fcc, video_fps, cv::Size(out_width, out_height), true);
-  if (!video_out.isOpened())
-  {
+  video_out.open(video_out_path, fcc, video_fps,
+                 cv::Size(out_width, out_height), true);
+  if (!video_out.isOpened()) {
    std::cout << "create video writer failed!" << std::endl;
    return;
  }
@@ -67,42 +69,40 @@ void CpuInfer(const std::string& model_dir,
  cv::Mat frame;
  int frame_id = 0;
  std::vector<cv::Mat> imgs;
-  while (capture.read(frame)){
-    if (!frame.empty())
-    {
-      if(frame_id < frame_num){
+  while (capture.read(frame)) {
+    if (!frame.empty()) {
+      if (frame_id < frame_num) {
        imgs.push_back(frame);
-        frame_id ++;
+        frame_id++;
        continue;
      }
      imgs.erase(imgs.begin());
      imgs.push_back(frame);
    }
-    frame_id ++;
+    frame_id++;
    std::vector<cv::Mat> results;
    model.Predict(imgs, results);
-    for (auto &item : results)
-    {
+    for (auto& item : results) {
      // cv::imshow("13",item);
      // cv::waitKey(30);
      video_out.write(item);
-      std::cout << "Processing frame: "<< frame_id << std::endl;
+      std::cout << "Processing frame: " << frame_id << std::endl;
    }
  }
-  std::cout << "inference finished, output video saved at " << video_out_path << std::endl;
+  std::cout << "inference finished, output video saved at " << video_out_path
+            << std::endl;
  capture.release();
  video_out.release();
 }

-void GpuInfer(const std::string& model_dir,
-              const std::string& video_file, int frame_num) {
+void GpuInfer(const std::string& model_dir, const std::string& video_file,
+              int frame_num) {
  auto model_file = model_dir + sep + "model.pdmodel";
  auto params_file = model_dir + sep + "model.pdiparams";

  auto option = fastdeploy::RuntimeOption();
  option.UseGpu();
-  auto model = fastdeploy::vision::sr::EDVR(
-      model_file, params_file, option);
+  auto model = fastdeploy::vision::sr::EDVR(model_file, params_file, option);

  if (!model.Initialized()) {
    std::cerr << "Failed to initialize." << std::endl;
@@ -110,32 +110,34 @@ void GpuInfer(const std::string& model_dir,
  }
  // note: input/output shape is [b, n, c, h, w] (n = frame_nums; b=1(default))
  // b and n is dependent on export model shape
-  // see https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/tutorials/video_super_resolution.md
+  // see
+  // https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/tutorials/video_super_resolution.md
  cv::VideoCapture capture;
  // change your save video path
  std::string video_out_name = "output.mp4";
  capture.open(video_file);
-  if (!capture.isOpened())
-  {
-    std::cout<<"can not open video "<<std::endl;
+  if (!capture.isOpened()) {
+    std::cout << "can not open video " << std::endl;
    return;
  }
  // Get Video info :fps, frame count
  int video_fps = static_cast<int>(capture.get(cv::CAP_PROP_FPS));
-  int video_frame_count = static_cast<int>(capture.get(cv::CAP_PROP_FRAME_COUNT));
+  int video_frame_count =
+      static_cast<int>(capture.get(cv::CAP_PROP_FRAME_COUNT));
  // Set fixed size for output frame, only for msvsr model
  int out_width = 1280;
  int out_height = 720;
-  std::cout << "fps: " << video_fps << "\tframe_count: " << video_frame_count << std::endl;
+  std::cout << "fps: " << video_fps << "\tframe_count: " << video_frame_count
+            << std::endl;

  // Create VideoWriter for output
  cv::VideoWriter video_out;
  std::string video_out_path("./");
  video_out_path += video_out_name;
  int fcc = cv::VideoWriter::fourcc('m', 'p', '4', 'v');
-  video_out.open(video_out_path, fcc, video_fps, cv::Size(out_width, out_height), true);
-  if (!video_out.isOpened())
-  {
+  video_out.open(video_out_path, fcc, video_fps,
+                 cv::Size(out_width, out_height), true);
+  if (!video_out.isOpened()) {
    std::cout << "create video writer failed!" << std::endl;
    return;
  }
@@ -143,44 +145,44 @@ void GpuInfer(const std::string& model_dir,
  cv::Mat frame;
  int frame_id = 0;
  std::vector<cv::Mat> imgs;
-  while (capture.read(frame)){
-    if (!frame.empty())
-    {
-      if(frame_id < frame_num){
+  while (capture.read(frame)) {
+    if (!frame.empty()) {
+      if (frame_id < frame_num) {
        imgs.push_back(frame);
-        frame_id ++;
+        frame_id++;
        continue;
      }
      imgs.erase(imgs.begin());
      imgs.push_back(frame);
    }
-    frame_id ++;
+    frame_id++;
    std::vector<cv::Mat> results;
    model.Predict(imgs, results);
-    for (auto &item : results)
-    {
+    for (auto& item : results) {
      // cv::imshow("13",item);
      // cv::waitKey(30);
      video_out.write(item);
-      std::cout << "Processing frame: "<< frame_id << std::endl;
+      std::cout << "Processing frame: " << frame_id << std::endl;
    }
  }
-  std::cout << "inference finished, output video saved at " << video_out_path << std::endl;
+  std::cout << "inference finished, output video saved at " << video_out_path
+            << std::endl;
  capture.release();
  video_out.release();
 }

-void TrtInfer(const std::string& model_dir,
-              const std::string& video_file, int frame_num) {
+void TrtInfer(const std::string& model_dir, const std::string& video_file,
+              int frame_num) {
  auto model_file = model_dir + sep + "model.pdmodel";
  auto params_file = model_dir + sep + "model.pdiparams";
  auto option = fastdeploy::RuntimeOption();
  option.UseGpu();
-  option.UseTrtBackend();
  // use paddle-TRT
+  option.UseTrtBackend();
+  option.EnablePaddleTrtCollectShape();
+  option.SetTrtInputShape("x", {1, 5, 3, 180, 320});
  option.EnablePaddleToTrt();
-  auto model = fastdeploy::vision::sr::EDVR(
-      model_file, params_file, option);
+  auto model = fastdeploy::vision::sr::EDVR(model_file, params_file, option);

  if (!model.Initialized()) {
    std::cerr << "Failed to initialize." << std::endl;
@@ -189,75 +191,77 @@ void TrtInfer(const std::string& model_dir,

  // note: input/output shape is [b, n, c, h, w] (n = frame_nums; b=1(default))
  // b and n is dependent on export model shape
-  // see https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/tutorials/video_super_resolution.md
+  // see
+  // https://github.com/PaddlePaddle/PaddleGAN/blob/develop/docs/zh_CN/tutorials/video_super_resolution.md
  cv::VideoCapture capture;
  // change your save video path
  std::string video_out_name = "output.mp4";
  capture.open(video_file);
-  if (!capture.isOpened())
-  {
-    std::cout<<"can not open video "<<std::endl;
+  if (!capture.isOpened()) {
+    std::cout << "can not open video " << std::endl;
    return;
  }
  // Get Video info :fps, frame count
  int video_fps = static_cast<int>(capture.get(cv::CAP_PROP_FPS));
-  int video_frame_count = static_cast<int>(capture.get(cv::CAP_PROP_FRAME_COUNT));
+  int video_frame_count =
+      static_cast<int>(capture.get(cv::CAP_PROP_FRAME_COUNT));
  // Set fixed size for output frame, only for msvsr model
-  //Note that the resolution between the size and the original input is consistent when the model is exported,
+  // Note that the resolution between the size and the original input is
+  // consistent when the model is exported,
  // for example: [1,2,3,180,320], after 4x super separation [1,2,3,720,1080].
-  //Therefore, it is very important to derive the model
+  // Therefore, it is very important to derive the model
  int out_width = 1280;
  int out_height = 720;
-  std::cout << "fps: " << video_fps << "\tframe_count: " << video_frame_count << std::endl;
+  std::cout << "fps: " << video_fps << "\tframe_count: " << video_frame_count
+            << std::endl;

  // Create VideoWriter for output
  cv::VideoWriter video_out;
  std::string video_out_path("./");
  video_out_path += video_out_name;
  int fcc = cv::VideoWriter::fourcc('m', 'p', '4', 'v');
-  video_out.open(video_out_path, fcc, video_fps, cv::Size(out_width, out_height), true);
-  if (!video_out.isOpened())
-  {
+  video_out.open(video_out_path, fcc, video_fps,
+                 cv::Size(out_width, out_height), true);
+  if (!video_out.isOpened()) {
    std::cout << "create video writer failed!" << std::endl;
    return;
  }
  // Capture all frames and do inference
  cv::Mat frame;
  int frame_id = 0;
-    std::vector<cv::Mat> imgs;
-    while (capture.read(frame)){
-        if (!frame.empty())
-        {
-            if(frame_id < frame_num){
-                imgs.push_back(frame);
-                frame_id ++;
-                continue;
-            }
-            imgs.erase(imgs.begin());
-            imgs.push_back(frame);
-        }
-        frame_id ++;
-        std::vector<cv::Mat> results;
-        model.Predict(imgs, results);
-        for (auto &item : results)
-        {
-            // cv::imshow("13",item);
-            // cv::waitKey(30);
-            video_out.write(item);
-            std::cout << "Processing frame: "<< frame_id << std::endl;
-        }
+  std::vector<cv::Mat> imgs;
+  while (capture.read(frame)) {
+    if (!frame.empty()) {
+      if (frame_id < frame_num) {
+        imgs.push_back(frame);
+        frame_id++;
+        continue;
+      }
+      imgs.erase(imgs.begin());
+      imgs.push_back(frame);
    }
-  std::cout << "inference finished, output video saved at " << video_out_path << std::endl;
+    frame_id++;
+    std::vector<cv::Mat> results;
+    model.Predict(imgs, results);
+    for (auto& item : results) {
+      // cv::imshow("13",item);
+      // cv::waitKey(30);
+      video_out.write(item);
+      std::cout << "Processing frame: " << frame_id << std::endl;
+    }
+  }
+  std::cout << "inference finished, output video saved at " << video_out_path
+            << std::endl;
  capture.release();
  video_out.release();
 }

 int main(int argc, char* argv[]) {
  if (argc < 4) {
-    std::cout
-        << "Usage: infer_demo path/to/model_dir path/to/video frame number run_option, "
-           "e.g ./infer_model ./vsr_model_dir ./person.mp4 0 2"
-        << std::endl;
+    std::cout << "Usage: infer_demo path/to/model_dir path/to/video frame "
+                 "number run_option, "
+                 "e.g ./infer_model ./vsr_model_dir ./vsr_src.mp4 0 5"
+              << std::endl;
    std::cout << "The data type of run_option is int, 0: run with cpu; 1: run "
                 "with gpu; 2: run with gpu and use tensorrt backend."
              << std::endl;