[Model] add tracking trail on vis_mot (#461)

* add override mark
* delete some
* recovery
* recovery
* add tracking
* add tracking py_bind and example
* add pptracking
* add pptracking
* iomanip head file
* add opencv_video lib
* add python libs package
  Signed-off-by: ChaoII <849453582@qq.com>
* complete comments
  Signed-off-by: ChaoII <849453582@qq.com>
* add jdeTracker_ member variable
  Signed-off-by: ChaoII <849453582@qq.com>
* add 'FASTDEPLOY_DECL' macro
  Signed-off-by: ChaoII <849453582@qq.com>
* remove kwargs params
  Signed-off-by: ChaoII <849453582@qq.com>
* [Doc]update pptracking docs
* delete 'ENABLE_PADDLE_FRONTEND' switch
* add pptracking unit test
* update pptracking unit test
  Signed-off-by: ChaoII <849453582@qq.com>
* modify test video file path and remove trt test
* update unit test model url
* remove 'FASTDEPLOY_DECL' macro
  Signed-off-by: ChaoII <849453582@qq.com>
* fix build python packages about pptracking on win32
  Signed-off-by: ChaoII <849453582@qq.com>
* update comment
  Signed-off-by: ChaoII <849453582@qq.com>
* add pptracking model explain
  Signed-off-by: ChaoII <849453582@qq.com>
* add tracking trail on vis_mot
* add tracking trail
* modify code for some suggestion
* remove unused import
* fix import bug
  Signed-off-by: ChaoII <849453582@qq.com>

Co-authored-by: Jason <jiangjiajun@baidu.com>
2025-10-06 17:17:14 +08:00 · 2022-11-03 09:57:07 +08:00
parent 328212f270
commit 22d60fdadf
16 changed files with 208 additions and 116 deletions
--- a/examples/vision/README.md
+++ b/examples/vision/README.md
@@ -2,16 +2,18 @@

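 This directory provides deployment examples for various vision models, covering the following task types

-| Task type      | Description                         | Prediction result struct                                                         |
-|:-------------- |:----------------------------------- |:-------------------------------------------------------------------------------- |
-| Detection      | Object detection: takes an image, locates objects in it, and returns bounding-box coordinates, classes, and confidence scores | [DetectionResult](../../docs/api/vision_results/detection_result.md)       |
-| Segmentation   | Semantic segmentation: takes an image and returns the class and confidence of every pixel          | [SegmentationResult](../../docs/api/vision_results/segmentation_result.md) |
-| Classification | Image classification: takes an image and returns its class and confidence             | [ClassifyResult](../../docs/api/vision_results/classification_result.md)   |
-| FaceDetection | Face detection: takes an image, locates faces in it, and returns bounding-box coordinates and facial landmarks             | [FaceDetectionResult](../../docs/api/vision_results/face_detection_result.md)   |
-| KeypointDetection   | Keypoint detection: takes an image and returns the coordinates and confidence of each human-pose keypoint         | [KeyPointDetectionResult](../../docs/api/vision_results/keypointdetection_result.md) |
-| FaceRecognition | Face recognition: takes an image and returns a face-feature embedding usable for similarity computation            | [FaceRecognitionResult](../../docs/api/vision_results/face_recognition_result.md)   |
-| Matting | Matting: takes an image and returns the alpha value of each foreground pixel            | [MattingResult](../../docs/api/vision_results/matting_result.md)   |
-| OCR | Text-box detection, classification, and text recognition: takes an image and returns text-box coordinates, text-box orientation classes, and the text inside each box            | [OCRResult](../../docs/api/vision_results/ocr_result.md)   |
+| Task type         | Description                                     | Prediction result struct                                                             |
+|:------------------|:------------------------------------------------|:-------------------------------------------------------------------------------------|
+| Detection         | Object detection: takes an image, locates objects in it, and returns bounding-box coordinates, classes, and confidence scores | [DetectionResult](../../docs/api/vision_results/detection_result.md)                 |
+| Segmentation      | Semantic segmentation: takes an image and returns the class and confidence of every pixel | [SegmentationResult](../../docs/api/vision_results/segmentation_result.md)           |
+| Classification    | Image classification: takes an image and returns its class and confidence | [ClassifyResult](../../docs/api/vision_results/classification_result.md)             |
+| FaceDetection     | Face detection: takes an image, locates faces in it, and returns bounding-box coordinates and facial landmarks | [FaceDetectionResult](../../docs/api/vision_results/face_detection_result.md)        |
+| KeypointDetection | Keypoint detection: takes an image and returns the coordinates and confidence of each human-pose keypoint | [KeyPointDetectionResult](../../docs/api/vision_results/keypointdetection_result.md) |
+| FaceRecognition   | Face recognition: takes an image and returns a face-feature embedding usable for similarity computation | [FaceRecognitionResult](../../docs/api/vision_results/face_recognition_result.md)    |
+| Matting           | Matting: takes an image and returns the alpha value of each foreground pixel | [MattingResult](../../docs/api/vision_results/matting_result.md)                     |
+| OCR               | Text-box detection, classification, and text recognition: takes an image and returns text-box coordinates, text-box orientation classes, and the text inside each box | [OCRResult](../../docs/api/vision_results/ocr_result.md)                             |
+| MOT               | Multi-object tracking: takes an image, locates objects in it, and returns bounding-box coordinates, object IDs, and class confidence scores | [MOTResult](../../docs/api/vision_results/mot_result.md)                             |
+
 ## FastDeploy API Design

 Vision models share a fairly uniform task paradigm, so when designing the API (both C++ and Python), FastDeploy splits vision model deployment into four steps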
--- a/examples/vision/tracking/pptracking/cpp/infer.cc
+++ b/examples/vision/tracking/pptracking/cpp/infer.cc
@@ -33,25 +33,29 @@ void CpuInfer(const std::string& model_dir, const std::string& video_file) {
  }

  fastdeploy::vision::MOTResult result;
+  fastdeploy::vision::tracking::TrailRecorder recorder;
+  // Each call to Predict inserts trail data into the bound recorder, so memory
+  // keeps growing as predictions accumulate. Call 'UnbindRecorder' to stop the insertion.
+  // int count = 0; // unbind condition
+  model.BindRecorder(&recorder);
  cv::Mat frame;
-  int frame_id=0;
  cv::VideoCapture capture(video_file);
-  // according to the time of prediction to calculate fps
-  float fps= 0.0f;
  while (capture.read(frame)) {
    if (frame.empty()) {
-        break;
+      break;
    }
    if (!model.Predict(&frame, &result)) {
-        std::cerr << "Failed to predict." << std::endl;
-        return;
+      std::cerr << "Failed to predict." << std::endl;
+      return;
    }
+    // For example, uncommenting the next line stops trail recording once 'count' reaches the chosen limit:
+    // if(count++ == 10) model.UnbindRecorder();
    // std::cout << result.Str() << std::endl;
-    cv::Mat out_img = fastdeploy::vision::VisMOT(frame, result, fps , frame_id);
+    cv::Mat out_img = fastdeploy::vision::VisMOT(frame, result, 0.0, &recorder);
    cv::imshow("mot",out_img);
    cv::waitKey(30);
-    frame_id++;
  }
+  model.UnbindRecorder();
  capture.release();
  cv::destroyAllWindows();
 }
@@ -72,25 +76,29 @@ void GpuInfer(const std::string& model_dir, const std::string& video_file) {
  }

  fastdeploy::vision::MOTResult result;
+  fastdeploy::vision::tracking::TrailRecorder trail_recorder;
+  // Each call to Predict inserts trail data into the bound recorder, so memory
+  // keeps growing as predictions accumulate. Call 'UnbindRecorder' to stop the insertion.
+  // int count = 0; // unbind condition
+  model.BindRecorder(&trail_recorder);
  cv::Mat frame;
-  int frame_id=0;
  cv::VideoCapture capture(video_file);
-  // according to the time of prediction to calculate fps
-  float fps= 0.0f;
  while (capture.read(frame)) {
    if (frame.empty()) {
-        break;
+      break;
    }
    if (!model.Predict(&frame, &result)) {
-        std::cerr << "Failed to predict." << std::endl;
-        return;
+      std::cerr << "Failed to predict." << std::endl;
+      return;
    }
+    // For example, uncommenting the next line stops trail recording once 'count' reaches the chosen limit:
+    // if(count++ == 10) model.UnbindRecorder();
    // std::cout << result.Str() << std::endl;
-    cv::Mat out_img = fastdeploy::vision::VisMOT(frame, result, fps , frame_id);
+    cv::Mat out_img = fastdeploy::vision::VisMOT(frame, result, 0.0, &trail_recorder);
    cv::imshow("mot",out_img);
    cv::waitKey(30);
-    frame_id++;
  }
+  model.UnbindRecorder();
  capture.release();
  cv::destroyAllWindows();
 }
@@ -112,11 +120,13 @@ void TrtInfer(const std::string& model_dir, const std::string& video_file) {
  }

  fastdeploy::vision::MOTResult result;
+  fastdeploy::vision::tracking::TrailRecorder recorder;
+  // Each call to Predict inserts trail data into the bound recorder, so memory
+  // keeps growing as predictions accumulate. Call 'UnbindRecorder' to stop the insertion.
+  // int count = 0; // unbind condition
+  model.BindRecorder(&recorder);
  cv::Mat frame;
-  int frame_id=0;
  cv::VideoCapture capture(video_file);
-  // according to the time of prediction to calculate fps
-  float fps= 0.0f;
  while (capture.read(frame)) {
    if (frame.empty()) {
        break;
@@ -125,12 +135,14 @@ void TrtInfer(const std::string& model_dir, const std::string& video_file) {
        std::cerr << "Failed to predict." << std::endl;
        return;
    }
+    // For example, uncommenting the next line stops trail recording once 'count' reaches the chosen limit:
+    // if(count++ == 10) model.UnbindRecorder();
    // std::cout << result.Str() << std::endl;
-    cv::Mat out_img = fastdeploy::vision::VisMOT(frame, result, fps , frame_id);
+    cv::Mat out_img = fastdeploy::vision::VisMOT(frame, result, 0.0, &recorder);
    cv::imshow("mot",out_img);
    cv::waitKey(30);
-    frame_id++;
  }
+  model.UnbindRecorder();
  capture.release();
  cv::destroyAllWindows();
 }
--- a/examples/vision/tracking/pptracking/python/infer.py
+++ b/examples/vision/tracking/pptracking/python/infer.py
@@ -14,7 +14,6 @@

 import fastdeploy as fd
 import cv2
-import time
 import os


@@ -60,20 +59,26 @@ config_file = os.path.join(args.model, "infer_cfg.yml")
 model = fd.vision.tracking.PPTracking(
    model_file, params_file, config_file, runtime_option=runtime_option)

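+# Initialize the trail recorder
+recorder = fd.vision.tracking.TrailRecorder()
+# Bind the recorder. Note: each prediction inserts data into the recorder, so memory keeps growing as predictions accumulate;
+# call unbind_recorder() to remove the binding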
+model.bind_recorder(recorder)
 # Predict tracking results for each video frame
 cap = cv2.VideoCapture(args.video)
-frame_id = 0
+# count = 0
 while True:
-    start_time = time.time()
-    frame_id = frame_id+1
    _, frame = cap.read()
    if frame is None:
        break
    result = model.predict(frame)
-    end_time = time.time()
-    fps = 1.0/(end_time-start_time)
-    img = fd.vision.vis_mot(frame, result, fps, frame_id)
+    # count += 1
+    # if count == 10:
+    #     model.unbind_recorder()
+    img = fd.vision.vis_mot(frame, result, 0.0, recorder)
    cv2.imshow("video", img)
-    cv2.waitKey(30)
+    if cv2.waitKey(30) == ord("q"):
+        break
+model.unbind_recorder()
 cap.release()
 cv2.destroyAllWindows()
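
The recorder comments in the examples above warn that trail data accumulates with every prediction. As a rough illustration only, the Python sketch below shows why an unbounded per-object trail grows linearly with the number of frames, and how capping each trail keeps memory flat. The `BoundedTrailRecorder` class, its `max_points` parameter, and the synthetic frame loop are all hypothetical; they are not part of FastDeploy or of this commit, and the real `TrailRecorder` may store its data differently.

```python
from collections import defaultdict, deque


class BoundedTrailRecorder:
    """Hypothetical stand-in for a trail recorder (not FastDeploy's TrailRecorder)."""

    def __init__(self, max_points=None):
        # max_points=None keeps every point, so memory grows without bound;
        # an integer keeps only the most recent points for each object id.
        self.max_points = max_points
        self.trails = defaultdict(lambda: deque(maxlen=max_points))

    def add(self, obj_id, center):
        # Append one trail point for the given object id.
        self.trails[obj_id].append(center)

    def total_points(self):
        # Total number of stored points across all tracked objects.
        return sum(len(trail) for trail in self.trails.values())


unbounded = BoundedTrailRecorder()
bounded = BoundedTrailRecorder(max_points=30)

# Simulate 1000 frames with three tracked objects per frame (synthetic data).
for frame_idx in range(1000):
    for obj_id in (1, 2, 3):
        center = (frame_idx, obj_id * 10)
        unbounded.add(obj_id, center)
        bounded.add(obj_id, center)

print(unbounded.total_points())  # 3000
print(bounded.total_points())    # 90
```

In the examples in this commit, the equivalent control is to call `UnbindRecorder` / `unbind_recorder()` once enough trail data has been collected; the cap shown here is just one alternative design that an application could layer on top.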