Unverified commit 601964b4, authored by Feng Ni, committed by GitHub

skip frames for faster MOT deploy (#6629)

* skip frames for faster mot

* fix skip frame mot

* fix skip frame mot

* fix skip frame mot

* set skip frame default -1, test=document_fix
Parent a61f6915
......@@ -12,6 +12,7 @@ MOT:
model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
tracker_config: deploy/pipeline/config/tracker_config.yml
batch_size: 1
skip_frame_num: -1 # preferably no more than 3
enable: False
KPT:
......
......@@ -10,6 +10,7 @@ MOT:
model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip
tracker_config: deploy/pipeline/config/tracker_config.yml
batch_size: 1
skip_frame_num: -1 # preferably no more than 3
enable: False
VEHICLE_PLATE:
......
......@@ -87,13 +87,13 @@ attr_thresh: 0.5
visual: True
MOT:
-model_dir: output_inference/mot_ppyoloe_l_36e_pipeline/
+model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
tracker_config: deploy/pipeline/config/tracker_config.yml
batch_size: 1
enable: True
ATTR:
-model_dir: output_inference/strongbaseline_r50_30e_pa100k/
+model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/PPLCNet_x1_0_person_attribute_945_infer.zip
batch_size: 8
enable: True
```
......
......@@ -81,13 +81,13 @@ visual: True
warmup_frame: 50
MOT:
-model_dir: output_inference/mot_ppyoloe_l_36e_ppvehicle/
+model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip
tracker_config: deploy/pipeline/config/tracker_config.yml
batch_size: 1
enable: True
VEHICLE_ATTR:
-model_dir: output_inference/vehicle_attribute_infer/
+model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip
batch_size: 8
color_threshold: 0.5
type_threshold: 0.5
......@@ -146,7 +146,7 @@ python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppv
## Solution Overview
-The overall PP-Vehicle v2 solution is shown in the figure below:
+The overall PP-Vehicle solution is shown in the figure below:
<div width="1000" align="center">
<img src="../../../../docs/images/ppvehicle.png"/>
......
......@@ -21,7 +21,16 @@ python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pph
--image_file=test_image.jpg \
--device=gpu
```
-3. For video input, this is a tracking task. Note that you must first set enable=True for the MOT config in infer_cfg_pphuman.yml, then launch with the command below.
+3. For video input, this is a tracking task. Note that you must first set `enable=True` for the MOT config in infer_cfg_pphuman.yml. If you want to skip frames to speed up the detection and tracking pipeline, you can set `skip_frame_num: 2`; it is recommended that the number of skipped frames not exceed 3:
```
MOT:
model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
tracker_config: deploy/pipeline/config/tracker_config.yml
batch_size: 1
skip_frame_num: 2
enable: True
```
Then start with the following command:
```python
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
--video_file=test_video.mp4 \
......
......@@ -21,7 +21,16 @@ python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pph
--image_file=test_image.jpg \
--device=gpu
```
-3. When using video as input, it is a tracking task. First set "enable: True" in the MOT config of infer_cfg_pphuman.yml, and then launch with the command below.
+3. When using video as input, it is a tracking task. First set "enable: True" in the MOT config of infer_cfg_pphuman.yml. If you want to skip frames to speed up the detection and tracking process, you can set `skip_frame_num: 2`; it is recommended that skip_frame_num not exceed 3, since skipped frames reuse the previous frame's detection results:
```
MOT:
model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
tracker_config: deploy/pipeline/config/tracker_config.yml
batch_size: 1
skip_frame_num: 2
enable: True
```
Then start with the following command:
```python
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
--video_file=test_video.mp4 \
......
......@@ -28,7 +28,7 @@ MOT:
model_dir: output_inference/mot_ppyoloe_l_36e_ppvehicle/ # path to the vehicle tracking model
tracker_config: deploy/pipeline/config/tracker_config.yml
batch_size: 1 # batch_size for inference; tracking only supports 1
-skip_frame_num: 1 # number of frames to skip between detections; the default 1 means no skipping; preferably no more than 4
+skip_frame_num: -1 # number of frames to skip between detections; -1 means no skipping; preferably no more than 3
enable: False # whether to enable this module; must be set to True before tracking is used
```
......@@ -40,7 +40,7 @@ python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppv
--image_file=test_image.jpg \
--device=gpu
```
-3. For video input, this is a tracking task. Note that you must first set `enable=True` for the MOT config in infer_cfg_ppvehicle.yml. If you want to skip frames to speed up the detection and tracking pipeline, you can set `skip_frame_num: 2`; it is recommended that the number of skipped frames not exceed 4:
+3. For video input, this is a tracking task. Note that you must first set `enable=True` for the MOT config in infer_cfg_ppvehicle.yml. If you want to skip frames to speed up the detection and tracking pipeline, you can set `skip_frame_num: 2`; it is recommended that the number of skipped frames not exceed 3:
```
MOT:
model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip
......
......@@ -397,6 +397,7 @@ class PipePredictor(object):
model_dir = mot_cfg['model_dir']
tracker_config = mot_cfg['tracker_config']
batch_size = mot_cfg['batch_size']
skip_frame_num = mot_cfg.get('skip_frame_num', -1)
basemode = self.basemode['MOT']
self.modebase[basemode] = True
self.mot_predictor = SDE_Detector(
......@@ -411,6 +412,7 @@ class PipePredictor(object):
args.trt_calib_mode,
args.cpu_threads,
args.enable_mkldnn,
skip_frame_num=skip_frame_num,
draw_center_traj=self.draw_center_traj,
secs_interval=self.secs_interval,
do_entrance_counting=self.do_entrance_counting,
......@@ -601,9 +603,15 @@ class PipePredictor(object):
if frame_id > self.warmup_frame:
self.pipe_timer.total_time.start()
self.pipe_timer.module_time['mot'].start()
-res = self.mot_predictor.predict_image(
-    [copy.deepcopy(frame_rgb)], visual=False)
+mot_skip_frame_num = self.mot_predictor.skip_frame_num
+reuse_det_result = False
+if mot_skip_frame_num > 1 and frame_id > 0 and frame_id % mot_skip_frame_num > 0:
+    reuse_det_result = True
+res = self.mot_predictor.predict_image(
+    [copy.deepcopy(frame_rgb)],
+    visual=False,
+    reuse_det_result=reuse_det_result)
if frame_id > self.warmup_frame:
self.pipe_timer.module_time['mot'].end()
......
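The block above is the core of the feature: on skipped frames, the pipeline reuses the detector output from the most recent fully processed frame. As a standalone illustration, here is a minimal sketch of the same pattern, using hypothetical `detect`/`track` callables rather than the actual PaddleDetection classes:

```python
# Minimal sketch of skip-frame MOT (detect/track are hypothetical
# stand-ins, not the real PaddleDetection API).
def run_mot(frames, detect, track, skip_frame_num=-1):
    previous_det_result = None
    tracks = []
    for frame_id, frame in enumerate(frames):
        reuse_det_result = (skip_frame_num > 1 and frame_id > 0
                            and frame_id % skip_frame_num > 0)
        if not reuse_det_result:
            previous_det_result = detect(frame)  # full detection pass
        # The tracker still updates on every frame, on fresh or cached boxes.
        tracks.append(track(frame, previous_det_result))
    return tracks
```

With `skip_frame_num: 2` this halves the number of detector passes while the tracker keeps running per frame, which is why the docs recommend keeping the value at 3 or below: reused boxes grow stale under fast motion.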
......@@ -61,6 +61,7 @@ class JDE_Detector(Detector):
save_mot_txts (bool): Whether to save tracking results (txt), default as False
draw_center_traj (bool): Whether drawing the trajectory of center, default as False
secs_interval (int): The seconds interval to count after tracking, default as 10
skip_frame_num (int): Number of frames to skip to get faster MOT results, default as -1 (no skipping)
do_entrance_counting(bool): Whether to count the identifiers entering
    or leaving through the entrance, default as False; only supports single-class
    counting in MOT.
......@@ -93,6 +94,7 @@ class JDE_Detector(Detector):
save_mot_txts=False,
draw_center_traj=False,
secs_interval=10,
skip_frame_num=-1,
do_entrance_counting=False,
do_break_in_counting=False,
region_type='horizontal',
......@@ -114,6 +116,7 @@ class JDE_Detector(Detector):
self.save_mot_txts = save_mot_txts
self.draw_center_traj = draw_center_traj
self.secs_interval = secs_interval
self.skip_frame_num = skip_frame_num
self.do_entrance_counting = do_entrance_counting
self.do_break_in_counting = do_break_in_counting
self.region_type = region_type
......@@ -126,6 +129,8 @@ class JDE_Detector(Detector):
assert batch_size == 1, "MOT model only supports batch_size=1."
self.det_times = Timer(with_tracker=True)
self.num_classes = len(self.pred_config.labels)
if self.skip_frame_num > 1:
self.previous_det_result = None
# tracker config
assert self.pred_config.tracker, "The exported JDE Detector model should have tracker."
......@@ -204,7 +209,8 @@ class JDE_Detector(Detector):
run_benchmark=False,
repeats=1,
visual=True,
-seq_name=None):
+seq_name=None,
+reuse_det_result=False):
mot_results = []
num_classes = self.num_classes
image_list.sort()
......@@ -246,15 +252,22 @@ class JDE_Detector(Detector):
else:
self.det_times.preprocess_time_s.start()
-inputs = self.preprocess(batch_image_list)
+if not reuse_det_result:
+    inputs = self.preprocess(batch_image_list)
self.det_times.preprocess_time_s.end()
self.det_times.inference_time_s.start()
-result = self.predict()
+if not reuse_det_result:
+    result = self.predict()
self.det_times.inference_time_s.end()
self.det_times.postprocess_time_s.start()
-det_result = self.postprocess(inputs, result)
+if not reuse_det_result:
+    det_result = self.postprocess(inputs, result)
+    self.previous_det_result = det_result
+else:
+    assert self.previous_det_result is not None
+    det_result = self.previous_det_result
self.det_times.postprocess_time_s.end()
# tracking process
......@@ -309,7 +322,7 @@ class JDE_Detector(Detector):
fourcc = cv2.VideoWriter_fourcc(*video_format)
writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
-frame_id = 1
+frame_id = 0
timer = MOTTimer()
results = defaultdict(list) # support single class and multi classes
num_classes = self.num_classes
......@@ -355,12 +368,18 @@ class JDE_Detector(Detector):
break
if frame_id % 10 == 0:
print('Tracking frame: %d' % (frame_id))
-frame_id += 1
timer.tic()
mot_skip_frame_num = self.skip_frame_num
reuse_det_result = False
if mot_skip_frame_num > 1 and frame_id > 0 and frame_id % mot_skip_frame_num > 0:
reuse_det_result = True
seq_name = video_out_name.split('.')[0]
mot_results = self.predict_image(
-[frame], visual=False, seq_name=seq_name)
+[frame],
+visual=False,
+seq_name=seq_name,
+reuse_det_result=reuse_det_result)
timer.toc()
online_tlwhs, online_scores, online_ids = mot_results[0]
......@@ -400,6 +419,7 @@ class JDE_Detector(Detector):
cv2.imshow('Mask Detection', im)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
+frame_id += 1
if self.save_mot_txts:
result_filename = os.path.join(
......@@ -439,6 +459,7 @@ def main():
save_mot_txts=FLAGS.save_mot_txts,
draw_center_traj=FLAGS.draw_center_traj,
secs_interval=FLAGS.secs_interval,
skip_frame_num=FLAGS.skip_frame_num,
do_entrance_counting=FLAGS.do_entrance_counting,
do_break_in_counting=FLAGS.do_break_in_counting,
region_type=FLAGS.region_type,
......
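For intuition about the gating in `predict_video`: `frame_id` now starts at 0 and is incremented at the end of the loop, so `frame_id % skip_frame_num > 0` cleanly marks which frames reuse detections. A quick check of just that condition:

```python
# Quick check of the reuse condition used in predict_video above.
def reuses_detection(frame_id, skip_frame_num):
    return (skip_frame_num > 1 and frame_id > 0
            and frame_id % skip_frame_num > 0)

print([reuses_detection(i, 2) for i in range(6)])
# [False, True, False, True, False, True]  -> detection runs on frames 0, 2, 4
print([reuses_detection(i, 3) for i in range(6)])
# [False, True, True, False, True, True]   -> detection runs on frames 0, 3
```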
......@@ -62,6 +62,7 @@ class SDE_Detector(Detector):
save_mot_txts (bool): Whether to save tracking results (txt), default as False
draw_center_traj (bool): Whether drawing the trajectory of center, default as False
secs_interval (int): The seconds interval to count after tracking, default as 10
skip_frame_num (int): Number of frames to skip to get faster MOT results, default as -1 (no skipping)
do_entrance_counting(bool): Whether to count the identifiers entering
    or leaving through the entrance, default as False; only supports single-class
    counting in MOT, and the video should be taken by a static camera.
......@@ -96,6 +97,7 @@ class SDE_Detector(Detector):
save_mot_txts=False,
draw_center_traj=False,
secs_interval=10,
skip_frame_num=-1,
do_entrance_counting=False,
do_break_in_counting=False,
region_type='horizontal',
......@@ -119,6 +121,7 @@ class SDE_Detector(Detector):
self.save_mot_txts = save_mot_txts
self.draw_center_traj = draw_center_traj
self.secs_interval = secs_interval
self.skip_frame_num = skip_frame_num
self.do_entrance_counting = do_entrance_counting
self.do_break_in_counting = do_break_in_counting
self.region_type = region_type
......@@ -131,6 +134,8 @@ class SDE_Detector(Detector):
assert batch_size == 1, "MOT model only supports batch_size=1."
self.det_times = Timer(with_tracker=True)
self.num_classes = len(self.pred_config.labels)
if self.skip_frame_num > 1:
self.previous_det_result = None
# reid config
self.use_reid = False if reid_model_dir is None else True
......@@ -414,7 +419,8 @@ class SDE_Detector(Detector):
run_benchmark=False,
repeats=1,
visual=True,
-seq_name=None):
+seq_name=None,
+reuse_det_result=False):
num_classes = self.num_classes
image_list.sort()
ids2names = self.pred_config.labels
......@@ -468,15 +474,22 @@ class SDE_Detector(Detector):
else:
self.det_times.preprocess_time_s.start()
-inputs = self.preprocess(batch_image_list)
+if not reuse_det_result:
+    inputs = self.preprocess(batch_image_list)
self.det_times.preprocess_time_s.end()
self.det_times.inference_time_s.start()
-result = self.predict()
+if not reuse_det_result:
+    result = self.predict()
self.det_times.inference_time_s.end()
self.det_times.postprocess_time_s.start()
-det_result = self.postprocess(inputs, result)
+if not reuse_det_result:
+    det_result = self.postprocess(inputs, result)
+    self.previous_det_result = det_result
+else:
+    assert self.previous_det_result is not None
+    det_result = self.previous_det_result
self.det_times.postprocess_time_s.end()
# tracking process
......@@ -553,7 +566,7 @@ class SDE_Detector(Detector):
fourcc = cv2.VideoWriter_fourcc(*video_format)
writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
-frame_id = 1
+frame_id = 0
timer = MOTTimer()
results = defaultdict(list)
num_classes = self.num_classes
......@@ -599,12 +612,18 @@ class SDE_Detector(Detector):
break
if frame_id % 10 == 0:
print('Tracking frame: %d' % (frame_id))
-frame_id += 1
timer.tic()
mot_skip_frame_num = self.skip_frame_num
reuse_det_result = False
if mot_skip_frame_num > 1 and frame_id > 0 and frame_id % mot_skip_frame_num > 0:
reuse_det_result = True
seq_name = video_out_name.split('.')[0]
mot_results = self.predict_image(
-[frame], visual=False, seq_name=seq_name)
+[frame],
+visual=False,
+seq_name=seq_name,
+reuse_det_result=reuse_det_result)
timer.toc()
# bs=1 in MOT model
......@@ -661,6 +680,7 @@ class SDE_Detector(Detector):
cv2.imshow('Mask Detection', im)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
+frame_id += 1
if self.save_mot_txts:
result_filename = os.path.join(
......@@ -803,6 +823,7 @@ def main():
save_mot_txts=FLAGS.save_mot_txts,
draw_center_traj=FLAGS.draw_center_traj,
secs_interval=FLAGS.secs_interval,
skip_frame_num=FLAGS.skip_frame_num,
do_entrance_counting=FLAGS.do_entrance_counting,
do_break_in_counting=FLAGS.do_break_in_counting,
region_type=FLAGS.region_type,
......
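The SDE changes mirror the JDE ones: the constructor stores `skip_frame_num`, `predict_image` gains a `reuse_det_result` flag, and `predict_video` computes the gate per frame. A hedged construction sketch follows; only `model_dir`, `tracker_config`, `batch_size`, and `skip_frame_num` are visible in this diff, so the `device` argument and the `predict_video` call are assumptions based on the surrounding deploy code:

```python
# Hedged sketch: enabling frame skipping on SDE_Detector.
# device and the predict_video signature are assumptions, not from this diff.
detector = SDE_Detector(
    model_dir='output_inference/mot_ppyoloe_l_36e_pipeline/',
    tracker_config='deploy/pipeline/config/tracker_config.yml',
    device='GPU',
    batch_size=1,       # MOT models only support batch_size=1
    skip_frame_num=2,   # detect on every other frame, reuse in between
)
detector.predict_video('test_video.mp4', camera_id=-1)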
......@@ -137,6 +137,11 @@ def argsparser():
type=ast.literal_eval,
default=True,
help='Whether to use darkpose to get better keypoint position predictions.')
parser.add_argument(
'--skip_frame_num',
type=int,
default=-1,
help='Skip frames to speed up the process of getting mot results.')
parser.add_argument(
"--do_entrance_counting",
action='store_true',
......
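With the new `--skip_frame_num` flag registered, the standalone deploy scripts can enable skipping from the command line. A hedged example; the script path and the other flags follow the usual PaddleDetection deploy usage and are not part of this diff:

```
python deploy/pptracking/python/mot_sde_infer.py \
    --model_dir=output_inference/mot_ppyoloe_l_36e_pipeline/ \
    --video_file=test_video.mp4 \
    --device=GPU \
    --skip_frame_num=2
```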