[Model] add RobustVideoMatting model (#400)

* add yolov5cls

* fixed bugs

* fixed bugs

* fixed preprocess bug

* add yolov5cls readme

* deal with comments

* Add YOLOv5Cls Note

* add yolov5cls test

* add rvm support

* support rvm model

* add rvm demo

* fixed bugs

* add rvm readme

* add TRT support

* add trt support

* add rvm test

* add EXPORT.md

* rename export.md

* rm poros doxyen

* deal with comments

* deal with comments

* add rvm video_mode note

Co-authored-by: Jason <jiangjiajun@baidu.com>
Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
WJJ1995
2022-10-26 14:30:04 +08:00
committed by GitHub
parent ba501fd963
commit 718698a32a
22 changed files with 1080 additions and 16 deletions


@@ -0,0 +1,88 @@
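# RobustVideoMatting Python Deployment Example

Before deployment, confirm the following two steps:

- 1. The software and hardware environment meets the requirements; see [FastDeploy environment requirements](../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md).
- 2. The FastDeploy Python whl package is installed; see [FastDeploy Python installation](../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md).

This directory provides `infer.py`, an example that quickly deploys RobustVideoMatting on CPU, on GPU, and on GPU with TensorRT acceleration. Run the following script to complete the deployment:
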
```bash
# Download the deployment example code
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/matting/rvm/python

# Download the RobustVideoMatting model files, the test images, and the test video
## Original ONNX model
wget https://bj.bcebos.com/paddlehub/fastdeploy/rvm_mobilenetv3_fp32.onnx
## ONNX model specially processed so that TensorRT can load it
wget https://bj.bcebos.com/paddlehub/fastdeploy/rvm_mobilenetv3_trt.onnx
wget https://bj.bcebos.com/paddlehub/fastdeploy/matting_input.jpg
wget https://bj.bcebos.com/paddlehub/fastdeploy/matting_bgr.jpg
wget https://bj.bcebos.com/paddlehub/fastdeploy/video.mp4

# CPU inference
## Image
python infer.py --model rvm_mobilenetv3_fp32.onnx --image matting_input.jpg --bg matting_bgr.jpg --device cpu
## Video
python infer.py --model rvm_mobilenetv3_fp32.onnx --video video.mp4 --bg matting_bgr.jpg --device cpu

# GPU inference
## Image
python infer.py --model rvm_mobilenetv3_fp32.onnx --image matting_input.jpg --bg matting_bgr.jpg --device gpu
## Video
python infer.py --model rvm_mobilenetv3_fp32.onnx --video video.mp4 --bg matting_bgr.jpg --device gpu

# TensorRT inference
## Image
python infer.py --model rvm_mobilenetv3_trt.onnx --image matting_input.jpg --bg matting_bgr.jpg --device gpu --use_trt True
## Video
python infer.py --model rvm_mobilenetv3_trt.onnx --video video.mp4 --bg matting_bgr.jpg --device gpu --use_trt True
```

After the run completes, the visualized results look like this:

<div width="1240">
<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852040-759da522-fca4-4786-9205-88c622cd4a39.jpg">
<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852587-48895efc-d24a-43c9-aeec-d7b0362ab2b9.jpg">
<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852116-cf91445b-3a67-45d9-a675-c69fe77c383a.jpg">
<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/67993288/186852554-6960659f-4fd7-4506-b33b-54e1a9dd89bf.jpg">
<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/19977378/196653716-f7043bd5-dfc2-4e7d-be0f-e12a6af4c55b.gif">
<img width="200" height="200" float="left" src="https://user-images.githubusercontent.com/19977378/196654529-866bff5d-47a2-4584-9627-39b587799228.gif">
</div>
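
When `--use_trt True` is passed, `infer.py` registers TensorRT shape ranges for the `src` input and for the recurrent state inputs `r1i` to `r4i`, hard-coded for a 1920x1080 (height x width) input. The sketch below shows one way those maximum state shapes could be derived for another resolution. The strides (8, 16, 32, 64), the channel counts, and the ceiling division are inferred from the numbers in `infer.py`, not taken from the model's documentation, and `rvm_state_shapes` is a hypothetical helper rather than part of FastDeploy.

```python
def rvm_state_shapes(height, width):
    """Return the largest r1i..r4i shapes for an input of height x width.

    Assumption: each recurrent state is the input downsampled by strides
    8, 16, 32 and 64 (rounded up), with 16, 20, 40 and 64 channels.
    These values reproduce the shapes hard-coded in infer.py.
    """
    channels = (16, 20, 40, 64)
    strides = (8, 16, 32, 64)
    shapes = {}
    for index, (channel, stride) in enumerate(zip(channels, strides), start=1):
        # -(-a // b) is ceiling division on integers.
        shapes[f"r{index}i"] = [
            1, channel, -(-height // stride), -(-width // stride)
        ]
    return shapes


shapes = rvm_state_shapes(1920, 1080)
for name, shape in shapes.items():
    print(name, shape)
```

For 1920x1080 this reproduces the values used in `infer.py`; whether the exported `rvm_mobilenetv3_trt.onnx` model accepts other resolutions is not stated in this example and would need to be checked.
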
## RobustVideoMatting Python Interface

```python
fd.vision.matting.RobustVideoMatting(model_file, params_file=None, runtime_option=None, model_format=ModelFormat.ONNX)
```
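
Loads and initializes the RobustVideoMatting model, where `model_file` is the exported model in ONNX format.

**Parameters**

> * **model_file**(str): Path of the model file
> * **params_file**(str): Path of the parameters file; not needed when the model is in ONNX format
> * **runtime_option**(RuntimeOption): Backend inference configuration; defaults to None, which uses the default configuration
> * **model_format**(ModelFormat): Model format; defaults to ONNX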
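
As a minimal sketch of how the constructor might be called, the helper below builds a `RuntimeOption` and loads the model, using only the calls that appear in `infer.py` (`fd.RuntimeOption()`, `use_gpu()`, `use_trt_backend()`). `load_rvm` is a hypothetical name for this illustration, and rejecting TensorRT on CPU is a design choice of the sketch, not documented FastDeploy behavior.

```python
def load_rvm(model_path, device="cpu", use_trt=False):
    """Build a RuntimeOption and load a RobustVideoMatting model.

    Hypothetical helper for illustration; not a FastDeploy API.
    """
    device = device.lower()
    if device not in ("cpu", "gpu"):
        raise ValueError(f"unsupported device: {device!r}")
    if use_trt and device != "gpu":
        # Assumption of this sketch: TensorRT is only meaningful on GPU.
        raise ValueError("use_trt=True requires device='gpu'")

    # Imported lazily so that argument errors surface before the import.
    import fastdeploy as fd

    option = fd.RuntimeOption()
    if device == "gpu":
        option.use_gpu()
    if use_trt:
        # infer.py additionally registers input shape ranges with
        # option.set_trt_input_shape(...) before loading the model.
        option.use_trt_backend()
    return fd.vision.matting.RobustVideoMatting(
        model_path, runtime_option=option)


# Example usage (requires fastdeploy and the downloaded model file):
# model = load_rvm("rvm_mobilenetv3_fp32.onnx", device="gpu")
```
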

### predict function

> ```python
> RobustVideoMatting.predict(input_image)
> ```
>
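> Model prediction interface: takes an input image and directly returns the matting result.
>
> **Parameters**
>
> > * **input_image**(np.ndarray): Input data; note that it must be in HWC layout with BGR channel order
>
> **Returns**
>
> > Returns a `fastdeploy.vision.MattingResult` struct; for a description of the struct, see [Vision model prediction results](../../../../../docs/api/vision_results/)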
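
To illustrate what a matting result is typically used for, here is a minimal sketch of alpha compositing, the standard way to place a matted foreground over a new background. It assumes the alpha matte is available as an H x W float array in [0, 1]; how to extract that array from `MattingResult` is described in the results documentation linked above. This is not FastDeploy's implementation of `swap_background_matting`, and `composite` is a hypothetical name.

```python
import numpy as np


def composite(frame, background, alpha):
    """Blend `frame` over `background` with a per-pixel alpha matte.

    frame, background: H x W x 3 uint8 images of the same size (BGR order,
        as returned by cv2.imread).
    alpha: H x W float array in [0, 1]; 1 means fully foreground.
    """
    if frame.shape != background.shape:
        raise ValueError("frame and background must have the same shape")
    if alpha.shape != frame.shape[:2]:
        raise ValueError("alpha must match the image height and width")
    # Add a trailing axis so the matte broadcasts over the colour channels.
    a = alpha.astype(np.float32)[..., np.newaxis]
    blended = a * frame.astype(np.float32) \
        + (1.0 - a) * background.astype(np.float32)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


# Tiny 2 x 2 example: a bright frame over a darker background.
frame = np.full((2, 2, 3), 200, dtype=np.uint8)
background = np.full((2, 2, 3), 100, dtype=np.uint8)
alpha = np.array([[1.0, 0.0], [0.5, 0.25]], dtype=np.float32)
out = composite(frame, background, alpha)
```

In practice the background image usually has to be resized to the frame size first, for example with `cv2.resize`.
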
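## Other Documents

- [RobustVideoMatting model description](..)
- [RobustVideoMatting C++ deployment](../cpp)
- [Description of model prediction results](../../../../../docs/api/vision_results/)
- [How to switch the model inference backend engine](../../../../../docs/cn/faq/how_to_change_backend.md)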


@@ -0,0 +1,112 @@
import fastdeploy as fd
import cv2
import os


def parse_arguments():
    import argparse
    import ast
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model", required=True, help="Path of RobustVideoMatting model.")
    parser.add_argument("--image", type=str, help="Path of test image file.")
    parser.add_argument("--video", type=str, help="Path of test video file.")
    parser.add_argument(
        "--bg",
        type=str,
        required=True,
        default=None,
        help="Path of test background image file.")
    parser.add_argument(
        '--output-composition',
        type=str,
        default="composition.mp4",
        help="Path of composition video file.")
    parser.add_argument(
        '--output-alpha',
        type=str,
        default="alpha.mp4",
        help="Path of alpha video file.")
    parser.add_argument(
        "--device",
        type=str,
        default='cpu',
        help="Type of inference device, support 'cpu' or 'gpu'.")
    parser.add_argument(
        "--use_trt",
        type=ast.literal_eval,
        default=False,
        help="Whether to use TensorRT.")
    return parser.parse_args()


def build_option(args):
    option = fd.RuntimeOption()
    if args.device.lower() == "gpu":
        option.use_gpu()

    if args.use_trt:
        option.use_trt_backend()
        # Shapes are hard-coded for a 1920x1080 (HxW) input.
        # r1i..r4i are the recurrent state inputs, given as (min, opt, max).
        option.set_trt_input_shape("src", [1, 3, 1920, 1080])
        option.set_trt_input_shape("r1i", [1, 1, 1, 1], [1, 16, 240, 135],
                                   [1, 16, 240, 135])
        option.set_trt_input_shape("r2i", [1, 1, 1, 1], [1, 20, 120, 68],
                                   [1, 20, 120, 68])
        option.set_trt_input_shape("r3i", [1, 1, 1, 1], [1, 40, 60, 34],
                                   [1, 40, 60, 34])
        option.set_trt_input_shape("r4i", [1, 1, 1, 1], [1, 64, 30, 17],
                                   [1, 64, 30, 17])

    return option


args = parse_arguments()
output_composition = args.output_composition
output_alpha = args.output_alpha

# Configure the runtime and load the model
runtime_option = build_option(args)
model = fd.vision.matting.RobustVideoMatting(
args.model, runtime_option=runtime_option)
bg = cv2.imread(args.bg)

if args.video is not None:
    # for video
    cap = cv2.VideoCapture(args.video)
    # Define the codec and create VideoWriter object
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    # Frame size is (width, height), hard-coded for the 1080x1920 sample video
    composition = cv2.VideoWriter(output_composition, fourcc, 20.0,
                                  (1080, 1920))
    alpha = cv2.VideoWriter(output_alpha, fourcc, 20.0, (1080, 1920))

    frame_id = 0
    while True:
        frame_id = frame_id + 1
        _, frame = cap.read()
        if frame is None:
            break
        result = model.predict(frame)
        vis_im = fd.vision.vis_matting(frame, result)
        vis_im_with_bg = fd.vision.swap_background_matting(frame, bg, result)
        alpha.write(vis_im)
        composition.write(vis_im_with_bg)
        cv2.waitKey(30)
    cap.release()
    composition.release()
    alpha.release()
    cv2.destroyAllWindows()
    print("Visualized result videos saved in {} and {}".format(
        output_composition, output_alpha))

if args.image is not None:
    # for image
    im = cv2.imread(args.image)
    result = model.predict(im.copy())
    print(result)
    # Visualize the results
    vis_im = fd.vision.vis_matting(im, result)
    vis_im_with_bg = fd.vision.swap_background_matting(im, bg, result)
    cv2.imwrite("visualized_result_fg.jpg", vis_im)
    cv2.imwrite("visualized_result_replaced_bg.jpg", vis_im_with_bg)
    print(
        "Visualized results saved in ./visualized_result_replaced_bg.jpg and ./visualized_result_fg.jpg"
    )