[Model] add RobustVideoMatting model (#400)

* add yolov5cls
* fixed bugs
* fixed bugs
* fixed preprocess bug
* add yolov5cls readme
* deal with comments
* Add YOLOv5Cls Note
* add yolov5cls test
* add rvm support
* support rvm model
* add rvm demo
* fixed bugs
* add rvm readme
* add TRT support
* add trt support
* add rvm test
* add EXPORT.md
* rename export.md
* rm poros doxyen
* deal with comments
* deal with comments
* add rvm video_mode note

Co-authored-by: Jason <jiangjiajun@baidu.com>
Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
2025-10-05 16:48:03 +08:00 · 2022-10-26 14:30:04 +08:00
parent ba501fd963
commit 718698a32a
22 changed files with 1080 additions and 16 deletions
--- a/examples/vision/matting/rvm/python/README.md
+++ b/examples/vision/matting/rvm/python/README.md
@@ -0,0 +1,88 @@
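+# RobustVideoMatting Python Deployment Example
+
+Before deployment, confirm the following two steps:
+
+- 1. The software and hardware environment meets the requirements; see [FastDeploy Environment Requirements](../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md)
+- 2. The FastDeploy Python whl package is installed; see [FastDeploy Python Installation](../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md)
+
+This directory provides `infer.py`, a quick example of deploying RobustVideoMatting on CPU/GPU, and on GPU with TensorRT acceleration. Run the following script to complete the deployment.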
+
+```bash
+# Download the deployment example code
+git clone https://github.com/PaddlePaddle/FastDeploy.git
+cd FastDeploy/examples/vision/matting/rvm/python
+
+# Download the RobustVideoMatting model files, the test images, and the test video
+## Original ONNX model
+wget https://bj.bcebos.com/paddlehub/fastdeploy/rvm_mobilenetv3_fp32.onnx
+## ONNX model specially processed for loading with TRT
+wget https://bj.bcebos.com/paddlehub/fastdeploy/rvm_mobilenetv3_trt.onnx
+wget https://bj.bcebos.com/paddlehub/fastdeploy/matting_input.jpg
+wget https://bj.bcebos.com/paddlehub/fastdeploy/matting_bgr.jpg
+wget https://bj.bcebos.com/paddlehub/fastdeploy/video.mp4
+
+# CPU inference
+## Image
+python infer.py --model rvm_mobilenetv3_fp32.onnx --image matting_input.jpg --bg matting_bgr.jpg --device cpu
+## Video
+python infer.py --model rvm_mobilenetv3_fp32.onnx --video video.mp4 --bg matting_bgr.jpg --device cpu
+# GPU inference
+## Image
+python infer.py --model rvm_mobilenetv3_fp32.onnx --image matting_input.jpg --bg matting_bgr.jpg --device gpu
+## Video
+python infer.py --model rvm_mobilenetv3_fp32.onnx --video video.mp4 --bg matting_bgr.jpg --device gpu
+# TRT inference
+## Image
+python infer.py --model rvm_mobilenetv3_trt.onnx --image matting_input.jpg --bg matting_bgr.jpg --device gpu --use_trt True
+## Video
+python infer.py --model rvm_mobilenetv3_trt.onnx --video video.mp4 --bg matting_bgr.jpg --device gpu --use_trt True
+```
+
+The visualized results after the run completes are shown below
+<div width="1240">
+<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852040-759da522-fca4-4786-9205-88c622cd4a39.jpg">
+<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852587-48895efc-d24a-43c9-aeec-d7b0362ab2b9.jpg">
+<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852116-cf91445b-3a67-45d9-a675-c69fe77c383a.jpg">
+<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852554-6960659f-4fd7-4506-b33b-54e1a9dd89bf.jpg">
+<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/19977378/196653716-f7043bd5-dfc2-4e7d-be0f-e12a6af4c55b.gif">
+<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/19977378/196654529-866bff5d-47a2-4584-9627-39b587799228.gif">
+</div>
+
+## RobustVideoMatting Python Interface
+
+```python
+fd.vision.matting.RobustVideoMatting(model_file, params_file=None, runtime_option=None, model_format=ModelFormat.ONNX)
+```
+
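+Loads and initializes the RobustVideoMatting model, where model_file is the exported model in ONNX format.
+
+**Parameters**
+
+> * **model_file**(str): Path to the model file
+> * **params_file**(str): Path to the parameters file; not needed when the model is in ONNX format
+> * **runtime_option**(RuntimeOption): Backend inference configuration; defaults to None, which uses the default configuration
+> * **model_format**(ModelFormat): Model format; defaults to ONNX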
+
+### predict function
+
+> ```python
+> RobustVideoMatting.predict(input_image)
+> ```
+>
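+> Model prediction interface. It takes an input image and directly returns the matting result.
+>
+> **Parameters**
+>
+> > * **input_image**(np.ndarray): Input data; must be in HWC layout and BGR format
+
+> **Returns**
+>
+> > Returns a `fastdeploy.vision.MattingResult` struct; see [Vision Model Prediction Results](../../../../../docs/api/vision_results/) for a description of the struct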
+
+
+## Other Documents
+
+- [RobustVideoMatting Model Description](..)
+- [RobustVideoMatting C++ Deployment](../cpp)
+- [Description of Model Prediction Results](../../../../../docs/api/vision_results/)
+- [How to Switch the Model Inference Backend](../../../../../docs/cn/faq/how_to_change_backend.md)
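The TensorRT commands in the README pass `--use_trt True` with an explicit value because `infer.py` parses the flag with `ast.literal_eval`. The following minimal sketch reproduces only that one argument (the `parse_flags` helper name is hypothetical and not part of FastDeploy). It shows how the literal strings map to booleans, and why `type=bool` would probably not be a safe substitute.

```python
import argparse
import ast


def parse_flags(argv):
    """Parse only the --use_trt flag, mirroring its definition in infer.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--use_trt", type=ast.literal_eval, default=False)
    return parser.parse_args(argv)


# ast.literal_eval turns the text "True"/"False" into real booleans.
enabled = parse_flags(["--use_trt", "True"]).use_trt
disabled = parse_flags(["--use_trt", "False"]).use_trt
default = parse_flags([]).use_trt

# By contrast, bool("False") is True because any non-empty string is truthy,
# so type=bool would silently enable TensorRT for "--use_trt False".
naive = bool("False")

print(enabled, disabled, default, naive)
```

A value that is not a valid Python literal (for example `--use_trt yes`) makes `ast.literal_eval` raise an error, which argparse reports as an invalid argument value.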
--- a/examples/vision/matting/rvm/python/infer.py
+++ b/examples/vision/matting/rvm/python/infer.py
@@ -0,0 +1,112 @@
+import fastdeploy as fd
+import cv2
+import os
+
+
+def parse_arguments():
+    import argparse
+    import ast
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        "--model", required=True, help="Path of RobustVideoMatting model.")
+    parser.add_argument("--image", type=str, help="Path of test image file.")
+    parser.add_argument("--video", type=str, help="Path of test video file.")
+    parser.add_argument(
+        "--bg",
+        type=str,
+        required=True,
+        default=None,
+        help="Path of test background image file.")
+    parser.add_argument(
+        '--output-composition',
+        type=str,
+        default="composition.mp4",
+        help="Path of composition video file.")
+    parser.add_argument(
+        '--output-alpha',
+        type=str,
+        default="alpha.mp4",
+        help="Path of alpha video file.")
+    parser.add_argument(
+        "--device",
+        type=str,
+        default='cpu',
+        help="Type of inference device; supports 'cpu' or 'gpu'.")
+    parser.add_argument(
+        "--use_trt",
+        type=ast.literal_eval,
+        default=False,
+        help="Whether to use TensorRT.")
+    return parser.parse_args()
+
+
+def build_option(args):
+    option = fd.RuntimeOption()
+    if args.device.lower() == "gpu":
+        option.use_gpu()
+    if args.use_trt:
+        option.use_trt_backend()
+        option.set_trt_input_shape("src", [1, 3, 1920, 1080])
+        option.set_trt_input_shape("r1i", [1, 1, 1, 1], [1, 16, 240, 135],
+                                   [1, 16, 240, 135])
+        option.set_trt_input_shape("r2i", [1, 1, 1, 1], [1, 20, 120, 68],
+                                   [1, 20, 120, 68])
+        option.set_trt_input_shape("r3i", [1, 1, 1, 1], [1, 40, 60, 34],
+                                   [1, 40, 60, 34])
+        option.set_trt_input_shape("r4i", [1, 1, 1, 1], [1, 64, 30, 17],
+                                   [1, 64, 30, 17])
+
+    return option
+
+
+args = parse_arguments()
+output_composition = args.output_composition
+output_alpha = args.output_alpha
+
+# Configure the runtime and load the model
+runtime_option = build_option(args)
+model = fd.vision.matting.RobustVideoMatting(
+    args.model, runtime_option=runtime_option)
+bg = cv2.imread(args.bg)
+
+if args.video is not None:
+    # for video
+    cap = cv2.VideoCapture(args.video)
+    # Define the codec and create VideoWriter object
+    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
+    composition = cv2.VideoWriter(output_composition, fourcc, 20.0,
+                                  (1080, 1920))
+    alpha = cv2.VideoWriter(output_alpha, fourcc, 20.0, (1080, 1920))
+
+    frame_id = 0
+    while True:
+        frame_id = frame_id + 1
+        _, frame = cap.read()
+        if frame is None:
+            break
+        result = model.predict(frame)
+        vis_im = fd.vision.vis_matting(frame, result)
+        vis_im_with_bg = fd.vision.swap_background_matting(frame, bg, result)
+        alpha.write(vis_im)
+        composition.write(vis_im_with_bg)
+        cv2.waitKey(30)
+    cap.release()
+    composition.release()
+    alpha.release()
+    cv2.destroyAllWindows()
+    print("Visualized result videos saved in {} and {}".format(
+        output_composition, output_alpha))
+
+if args.image is not None:
+    # for image
+    im = cv2.imread(args.image)
+    result = model.predict(im.copy())
+    print(result)
+    # Visualize the result
+    vis_im = fd.vision.vis_matting(im, result)
+    vis_im_with_bg = fd.vision.swap_background_matting(im, bg, result)
+    cv2.imwrite("visualized_result_fg.jpg", vis_im)
+    cv2.imwrite("visualized_result_replaced_bg.jpg", vis_im_with_bg)
+    print(
+        "Visualized results saved in ./visualized_result_replaced_bg.jpg and ./visualized_result_fg.jpg"
+    )
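The recurrent-state shapes passed to `set_trt_input_shape` in `build_option` appear to follow from the 1920x1080 (height x width) `src` shape. They are consistent with the input being downsampled by a ratio of 0.25 and then halved four times with rounding up. This is an inference from the numbers in the script, not something the commit states, so the sketch below is only a way to sanity-check the shapes or adapt them to another resolution. The helper `rvm_state_shapes`, the `downsample_ratio` parameter, and the channel tuple are illustrative assumptions, not FastDeploy API.

```python
import math


def rvm_state_shapes(height, width, downsample_ratio=0.25,
                     channels=(16, 20, 40, 64)):
    """Return assumed max shapes for r1i..r4i given the src height/width.

    Assumption: the frame is scaled by downsample_ratio, then each
    recurrent state halves the previous resolution, rounding up.
    """
    h = int(height * downsample_ratio)
    w = int(width * downsample_ratio)
    shapes = {}
    for i, c in enumerate(channels, start=1):
        h = math.ceil(h / 2)
        w = math.ceil(w / 2)
        shapes["r{}i".format(i)] = [1, c, h, w]
    return shapes


shapes = rvm_state_shapes(1920, 1080)
for name, shape in shapes.items():
    print(name, shape)
```

For the 1920x1080 input used in `infer.py`, this reproduces the four maximum shapes hard-coded in the script. For other resolutions, the result should be checked against the model's actual recurrent outputs before it is used as a TensorRT shape.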