Unverified · Commit c643eeab · Authored by: Feng Ni · Committed by: GitHub

[MOT] Upgrade pptracking deploy (#5358)

* add bytetrack yolov3 engine and deploy baseline

* fix bytetrack yolov3 configs

* fix cfgs tools

* fix tracker pred_dets

* fit pptracking jde sde infer

* rename cfgs, fix mot_sde_infer

* fix mot sde single class infer

* remove byte cfgs

* fix mtmct

* clean code
Parent cff6841a
@@ -58,11 +58,12 @@ python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fa
### 2.1 Export the prediction models

Step 1: export the detection model
```bash
# Export the PP-YOLOv2 pedestrian detection model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_640x640_mot17half.pdparams
# Or export the PPYOLOe pedestrian detection model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
```

Step 2: export the pedestrian ReID model
```bash
# Export the PCB Pyramid ReID model
@@ -76,11 +77,10 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid
# Download the pedestrian tracking demo video:
wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4

# Use the exported PP-YOLOv2 pedestrian detection model and PPLCNet ReID model
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file=mot17_demo.mp4 --device=GPU --threshold=0.5 --save_mot_txts --save_images
# Or use the exported PPYOLOe pedestrian detection model and PPLCNet ReID model
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyoloe_crn_l_36e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file=mot17_demo.mp4 --device=GPU --threshold=0.5 --save_mot_txts --save_images
```
### 2.3 Use the exported models for vehicle tracking in Python
@@ -97,17 +97,16 @@ wget https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicl
tar -xvf deepsort_pplcnet_vehicle.tar

# Use the exported PicoDet vehicle detection model and PPLCNet vehicle ReID model
python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --device=GPU --threshold=0.5 --video_file={your video}.mp4 --save_mot_txts --save_images
# Use the exported PP-YOLOv2 vehicle detection model and PPLCNet vehicle ReID model
python deploy/pptracking/python/mot_sde_infer.py --model_dir=ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --device=GPU --threshold=0.5 --video_file={your video}.mp4 --save_mot_txts --save_images
```

**Note:**
 - Tracking models predict on videos; single-image prediction is not supported. By default the visualized tracking result is saved as a video; add `--save_mot_txts` to save one txt file per video, or `--save_images` to save the visualized frames.
 - Each line of the tracking result txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1` (see the parsing sketch after this list).
 - `--threshold` is the confidence threshold for visualization, 0.5 by default; results below it are filtered out. Adjust it as needed for better visualization.
 - DeepSORT supports only single-class tracking, not multi-class tracking, and the ReID model should ideally be trained on the same object class as the detector: a pedestrian ReID model for pedestrian tracking, a vehicle ReID model for vehicle tracking.
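For reference, a minimal parsing sketch for these txt files; the `load_mot_txt` helper is illustrative and not part of PaddleDetection, it only assumes the comma-separated layout described above:

```python
# Minimal sketch: read a MOT results txt saved via --save_mot_txts.
# Each line: frame,id,x1,y1,w,h,score,-1,-1,-1 (the trailing -1 fields are
# MOTChallenge-format placeholders).
def load_mot_txt(txt_path):
    tracks = []
    with open(txt_path) as f:
        for line in f:
            fields = line.strip().split(',')
            frame_id, track_id = int(fields[0]), int(float(fields[1]))
            x1, y1, w, h, score = map(float, fields[2:7])
            tracks.append((frame_id, track_id, x1, y1, w, h, score))
    return tracks
```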
@@ -135,94 +134,22 @@ wget https://paddledet.bj.bcebos.com/data/mot/demo/mtmct-demo.tar
tar -xvf mtmct-demo.tar

# Use the exported PicoDet vehicle detection model and PPLCNet vehicle ReID model
python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir=mtmct-demo --mtmct_cfg=mtmct_cfg.yml --device=GPU --threshold=0.5 --save_mot_txts --save_images
# Use the exported PP-YOLOv2 vehicle detection model and PPLCNet vehicle ReID model
python deploy/pptracking/python/mot_sde_infer.py --model_dir=ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir=mtmct-demo --mtmct_cfg=mtmct_cfg.yml --device=GPU --threshold=0.5 --save_mot_txts --save_images
```

**Note:**
 - Tracking models predict on videos; single-image prediction is not supported. By default the visualized tracking result is saved as a video; add `--save_mot_txts` to save one txt file per video, or `--save_images` to save the visualized frames.
 - Each line of the cross-camera (MTMCT) tracking result txt file is `camera_id,frame,id,x1,y1,w,h,-1,-1`.
 - `--threshold` is the confidence threshold for visualization, 0.5 by default; results below it are filtered out. Adjust it as needed for better visualization.
 - DeepSORT supports only single-class tracking, not multi-class tracking, and the ReID model should ideally be trained on the same object class as the detector: a pedestrian ReID model for pedestrian tracking, a vehicle ReID model for vehicle tracking.
 - `--mtmct_dir` is the folder of one MTMCT scene; it contains image folders extracted from the videos of the scene's different cameras, at least two of them (see the layout sketch after this list).
 - `--mtmct_cfg` is the MTMCT config file for that scene; it contains switches for several trick operations and file paths of the camera-related settings. You can change the paths and toggle the operations as needed.
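Before running, it can help to sanity-check the scene folder. A minimal sketch under the assumptions above; the camera folder names `c003`/`c004` are illustrative:

```python
import os

# Assumed layout (names illustrative):
#   mtmct-demo/
#     c003/  00001.jpg 00002.jpg ...
#     c004/  00001.jpg 00002.jpg ...
def check_mtmct_dir(mtmct_dir):
    cams = [d for d in sorted(os.listdir(mtmct_dir))
            if os.path.isdir(os.path.join(mtmct_dir, d))]
    # MTMCT needs frames from at least two cameras of the same scene
    assert len(cams) >= 2, 'expected >= 2 camera folders, got %d' % len(cams)
    return cams
```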
## 4. API invocation
### 4.1 FairMOT model API call
```python
import mot_jde_infer
# 1.model config and weights
model_dir = 'fairmot_hrnetv2_w18_dlafpn_30e_576x320/'
# 2.inference data
video_file = 'test.mp4'
image_dir = None
# 3.other settings
device = 'CPU' # device should be CPU, GPU or XPU
threshold = 0.3
output_dir = 'output'
# mot predict
mot_jde_infer.predict_naive(model_dir, video_file, image_dir, device, threshold, output_dir)
```
**Note:**
 - The code above must be run from the directory `PaddleDetection/deploy/pptracking/python`.
 - Prediction on videos and image folders is supported; single-image prediction is not. `video_file` and `image_dir` must not both be None; `video_file` is recommended, and `image_dir` must directly contain images named in sequential order.
 - By default, the visualized tracking images and video are saved along with the tracking result txt file; trajectory visualization and flow statistics are off by default.
### 4.2 DeepSORT model API call
```python
import mot_sde_infer
# 1.model config and weights
model_dir = 'ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle/'
reid_model_dir = 'deepsort_pplcnet_vehicle/'
# 2.inference data
video_file = 'test.mp4'
image_dir = None
# 3.other settings
scaled = True # set False only when using JDE YOLOv3
device = 'CPU' # device should be CPU, GPU or XPU
threshold = 0.3
output_dir = 'output'
# 4. MTMCT settings, default None
mtmct_dir = None
mtmct_cfg = None
# mot predict
mot_sde_infer.predict_naive(model_dir,
reid_model_dir,
video_file,
image_dir,
mtmct_dir,
mtmct_cfg,
scaled,
device,
threshold,
output_dir)
```
**Note:**
 - The code above must be run from the directory `PaddleDetection/deploy/pptracking/python`.
 - Prediction on videos and image folders is supported; single-image prediction is not. `video_file`, `image_dir` and `mtmct_dir` must not all be None; `video_file` is recommended, and `image_dir` must directly contain images named in sequential order. A non-None `mtmct_dir` means an MTMCT cross-camera tracking task is run.
 - By default, the visualized tracking images and video are saved along with the tracking result txt file; trajectory visualization and flow statistics are off by default.
 - `scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: False for the JDE YOLOv3 detector, True for general detection models.
 - `mtmct_dir` is the folder of one MTMCT scene, containing image folders captured by different cameras of that scene, at least two of them.
 - `mtmct_cfg` is the MTMCT config file for that scene, containing switches for several trick operations and file paths of the camera-related settings; you can change the paths and toggle the operations as needed.
 - To enable MTMCT prediction, `video_file` and `image_dir` must both be None, and `mtmct_dir` and `mtmct_cfg` must both be non-None.
## 5. Parameters

| Parameter | Required | Description |
|-------|-------|----------|
......
@@ -89,6 +89,8 @@ class PaddleInferBenchmark(object):
        self.preprocess_time_s = perf_info.get('preprocess_time_s', 0)
        self.postprocess_time_s = perf_info.get('postprocess_time_s', 0)
        self.with_tracker = True if 'tracking_time_s' in perf_info else False
        self.tracking_time_s = perf_info.get('tracking_time_s', 0)
        self.total_time_s = perf_info.get('total_time_s', 0)
        self.inference_time_s_90 = perf_info.get("inference_time_s_90", "")
@@ -235,9 +237,19 @@ class PaddleInferBenchmark(object):
        )
        self.logger.info(
            f"{identifier} total time spent(s): {self.total_time_s}")

        if self.with_tracker:
            self.logger.info(
                f"{identifier} preprocess_time(ms): {round(self.preprocess_time_s*1000, 1)}, "
                f"inference_time(ms): {round(self.inference_time_s*1000, 1)}, "
                f"postprocess_time(ms): {round(self.postprocess_time_s*1000, 1)}, "
                f"tracking_time(ms): {round(self.tracking_time_s*1000, 1)}")
        else:
            self.logger.info(
                f"{identifier} preprocess_time(ms): {round(self.preprocess_time_s*1000, 1)}, "
                f"inference_time(ms): {round(self.inference_time_s*1000, 1)}, "
                f"postprocess_time(ms): {round(self.postprocess_time_s*1000, 1)}"
            )
        if self.inference_time_s_90:
            self.logger.info(
                f"{identifier} 90%_cost: {self.inference_time_s_90}, 99%_cost: {self.inference_time_s_99}, succ_rate: {self.succ_rate}"
            )
......
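For orientation, a sketch of the `perf_info` dict this class consumes; the keys come from the `perf_info.get(...)` calls above, while the numbers are invented for illustration:

```python
# Illustrative perf_info for PaddleInferBenchmark (values are made up).
# 'tracking_time_s' is optional; its presence switches with_tracker on.
perf_info = {
    'preprocess_time_s': 0.012,
    'inference_time_s': 0.034,
    'postprocess_time_s': 0.002,
    'tracking_time_s': 0.005,  # present only for tracking models
    'total_time_s': 0.053,
    'img_num': 100,
}
```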
@@ -25,9 +25,14 @@ import paddle
from paddle.inference import Config
from paddle.inference import create_predictor

import sys
# add deploy path of PaddleDetection to sys.path
parent_path = os.path.abspath(os.path.join(__file__, *(['..'])))
sys.path.insert(0, parent_path)

from benchmark_utils import PaddleInferBenchmark
from picodet_postprocess import PicoDetPostProcess
from preprocess import preprocess, Resize, NormalizeImage, Permute, PadStride, LetterBoxResize, decode_image
from visualize import visualize_box_mask
from utils import argsparser, Timer, get_current_memory_mb

@@ -38,9 +43,27 @@ SUPPORT_MODELS = {
    'JDE',
    'FairMOT',
    'DeepSORT',
    'StrongBaseline',
}
def bench_log(detector, img_list, model_info, batch_size=1, name=None):
mems = {
'cpu_rss_mb': detector.cpu_mem / len(img_list),
'gpu_rss_mb': detector.gpu_mem / len(img_list),
'gpu_util': detector.gpu_util * 100 / len(img_list)
}
perf_info = detector.det_times.report(average=True)
data_info = {
'batch_size': batch_size,
'shape': "dynamic_shape",
'data_num': perf_info['img_num']
}
log = PaddleInferBenchmark(detector.config, model_info, data_info,
perf_info, mems)
log(name)
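# Usage sketch (illustrative; mirrors the call sites in main() below):
#   model_info = {'model_name': 'picodet_l_640', 'precision': 'fp32'}
#   bench_log(detector, img_list, model_info, batch_size=1, name='DET')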
class Detector(object):
    """
    Args:
@@ -56,21 +79,25 @@ class Detector(object):
            calibration, trt_calib_mode need to set True
        cpu_threads (int): cpu threads
        enable_mkldnn (bool): whether to open MKLDNN
        output_dir (str): The path of output
        threshold (float): The threshold of score for visualization
    """

    def __init__(
            self,
            model_dir,
            device='CPU',
            run_mode='paddle',
            batch_size=1,
            trt_min_shape=1,
            trt_max_shape=1280,
            trt_opt_shape=640,
            trt_calib_mode=False,
            cpu_threads=1,
            enable_mkldnn=False,
            output_dir='output',
            threshold=0.5, ):
        self.pred_config = self.set_config(model_dir)
        self.predictor, self.config = load_predictor(
            model_dir,
            run_mode=run_mode,
@@ -86,6 +113,12 @@ class Detector(object):
            enable_mkldnn=enable_mkldnn)
        self.det_times = Timer()
        self.cpu_mem, self.gpu_mem, self.gpu_util = 0, 0, 0
        self.batch_size = batch_size
        self.output_dir = output_dir
        self.threshold = threshold

    def set_config(self, model_dir):
        return PredictConfig(model_dir)
    def preprocess(self, image_list):
        preprocess_ops = []
@@ -101,51 +134,32 @@ class Detector(object):
            input_im_lst.append(im)
            input_im_info_lst.append(im_info)
        inputs = create_inputs(input_im_lst, input_im_info_lst)
        return inputs
    def postprocess(self, inputs, result):
        # postprocess output of predictor
        np_boxes_num = result['boxes_num']
        if np_boxes_num[0] <= 0:
            print('[WARNING] No object detected.')
            result = {'boxes': np.zeros([0, 6]), 'boxes_num': [0]}
        result = {k: v for k, v in result.items() if v is not None}
        return result

    def predict(self, repeats=1):
        '''
        Args:
            repeats (int): repeats number for prediction
        Returns:
            result (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
                matrix element: [class, score, x_min, y_min, x_max, y_max]
        '''
        # model prediction
        np_boxes, np_boxes_num = None, None
        for i in range(repeats):
            self.predictor.run()
            output_names = self.predictor.get_output_names()
@@ -153,130 +167,131 @@ class Detector(object):
            boxes_tensor = self.predictor.get_output_handle(output_names[0])
            np_boxes = boxes_tensor.copy_to_cpu()
            boxes_num = self.predictor.get_output_handle(output_names[1])
            np_boxes_num = boxes_num.copy_to_cpu()
        result = dict(boxes=np_boxes, boxes_num=np_boxes_num)
        return result

    def merge_batch_result(self, batch_result):
        if len(batch_result) == 1:
            return batch_result[0]
        res_key = batch_result[0].keys()
        results = {k: [] for k in res_key}
        for res in batch_result:
            for k, v in res.items():
                results[k].append(v)
        for k, v in results.items():
            results[k] = np.concatenate(v)
        return results
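        # Toy illustration of merge_batch_result (shapes invented): merging
        #   [{'boxes': np.zeros([2, 6]), 'boxes_num': np.array([2])},
        #    {'boxes': np.zeros([1, 6]), 'boxes_num': np.array([1])}]
        # yields {'boxes': shape (3, 6), 'boxes_num': array([2, 1])}.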
    def get_timer(self):
        return self.det_times
def predict_image(self,
image_list,
run_benchmark=False,
repeats=1,
visual=True):
batch_loop_cnt = math.ceil(float(len(image_list)) / self.batch_size)
results = []
for i in range(batch_loop_cnt):
start_index = i * self.batch_size
end_index = min((i + 1) * self.batch_size, len(image_list))
batch_image_list = image_list[start_index:end_index]
if run_benchmark:
# preprocess
inputs = self.preprocess(batch_image_list) # warmup
self.det_times.preprocess_time_s.start()
inputs = self.preprocess(batch_image_list)
self.det_times.preprocess_time_s.end()
# model prediction
result = self.predict(repeats=repeats) # warmup
self.det_times.inference_time_s.start()
result = self.predict(repeats=repeats)
self.det_times.inference_time_s.end(repeats=repeats)
# postprocess
result_warmup = self.postprocess(inputs, result) # warmup
self.det_times.postprocess_time_s.start()
result = self.postprocess(inputs, result)
self.det_times.postprocess_time_s.end()
self.det_times.img_num += len(batch_image_list)
cm, gm, gu = get_current_memory_mb()
self.cpu_mem += cm
self.gpu_mem += gm
self.gpu_util += gu
else:
# preprocess
self.det_times.preprocess_time_s.start()
inputs = self.preprocess(batch_image_list)
self.det_times.preprocess_time_s.end()
# model prediction
self.det_times.inference_time_s.start()
result = self.predict()
self.det_times.inference_time_s.end()
# postprocess
self.det_times.postprocess_time_s.start()
result = self.postprocess(inputs, result)
self.det_times.postprocess_time_s.end()
self.det_times.img_num += len(batch_image_list)
if visual:
visualize(
batch_image_list,
result,
self.pred_config.labels,
output_dir=self.output_dir,
threshold=self.threshold)
results.append(result)
if visual:
print('Test iter {}'.format(i))
results = self.merge_batch_result(results)
return results
    def predict_video(self, video_file, camera_id):
        video_out_name = 'output.mp4'
        if camera_id != -1:
            capture = cv2.VideoCapture(camera_id)
        else:
            capture = cv2.VideoCapture(video_file)
            video_out_name = os.path.split(video_file)[-1]
        # Get Video info : resolution, fps, frame count
        width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = int(capture.get(cv2.CAP_PROP_FPS))
        frame_count = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        print("fps: %d, frame_count: %d" % (fps, frame_count))

        if not os.path.exists(self.output_dir):
            os.makedirs(self.output_dir)
        out_path = os.path.join(self.output_dir, video_out_name)
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')
        writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
        index = 1
        while (1):
            ret, frame = capture.read()
            if not ret:
                break
            print('detect frame: %d' % (index))
            index += 1
            results = self.predict_image([frame], visual=False)

            im = visualize_box_mask(
                frame,
                results,
                self.pred_config.labels,
                threshold=self.threshold)
            im = np.array(im)
            writer.write(im)
            if camera_id != -1:
                cv2.imshow('Mask Detection', im)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
        writer.release()
def create_inputs(imgs, im_info):
@@ -433,7 +448,7 @@ def load_predictor(model_dir,
        }
        if run_mode in precision_map.keys():
            config.enable_tensorrt_engine(
                workspace_size=1 << 25,
                max_batch_size=batch_size,
                min_subgraph_size=min_subgraph_size,
                precision_mode=precision_map[run_mode],
@@ -495,22 +510,15 @@ def get_test_images(infer_dir, infer_img):
    return images


def visualize(image_list, result, labels, output_dir='output/', threshold=0.5):
    # visualize the predict result
    start_idx = 0
    for idx, image_file in enumerate(image_list):
        im_bboxes_num = result['boxes_num'][idx]
        im_results = {}
        if 'boxes' in result:
            im_results['boxes'] = result['boxes'][start_idx:start_idx +
                                                  im_bboxes_num, :]
        start_idx += im_bboxes_num
        im = visualize_box_mask(
            image_file, im_results, labels, threshold=threshold)
@@ -529,79 +537,13 @@ def print_arguments(args):
    print('------------------------------------------')
def main():
    deploy_file = os.path.join(FLAGS.model_dir, 'infer_cfg.yml')
    with open(deploy_file) as f:
        yml_conf = yaml.safe_load(f)
    arch = yml_conf['arch']
    detector_func = 'Detector'
    detector = eval(detector_func)(FLAGS.model_dir,
                                   device=FLAGS.device,
                                   run_mode=FLAGS.run_mode,
                                   batch_size=FLAGS.batch_size,
@@ -610,41 +552,29 @@ def main():
                                   trt_opt_shape=FLAGS.trt_opt_shape,
                                   trt_calib_mode=FLAGS.trt_calib_mode,
                                   cpu_threads=FLAGS.cpu_threads,
                                   enable_mkldnn=FLAGS.enable_mkldnn,
                                   threshold=FLAGS.threshold,
                                   output_dir=FLAGS.output_dir)

    # predict from video file or camera video stream
    if FLAGS.video_file is not None or FLAGS.camera_id != -1:
        detector.predict_video(FLAGS.video_file, FLAGS.camera_id)
    else:
        # predict from image
        if FLAGS.image_dir is None and FLAGS.image_file is not None:
            assert FLAGS.batch_size == 1, "batch_size should be 1, when image_file is not None"
        img_list = get_test_images(FLAGS.image_dir, FLAGS.image_file)
        detector.predict_image(img_list, FLAGS.run_benchmark, repeats=10)

        if not FLAGS.run_benchmark:
            detector.det_times.info(average=True)
        else:
            mode = FLAGS.run_mode
            model_dir = FLAGS.model_dir
            model_info = {
                'model_name': model_dir.strip('/').split('/')[-1],
                'precision': mode.split('_')[-1]
            }
            bench_log(detector, img_list, model_info, name='DET')
if __name__ == '__main__':
......
@@ -96,7 +96,8 @@ class DeepSORTTracker(object):
        """
        pred_cls_ids = pred_dets[:, 0:1]
        pred_scores = pred_dets[:, 1:2]
        pred_xyxys = pred_dets[:, 2:6]
        pred_tlwhs = np.concatenate(
            (pred_xyxys[:, 0:2], pred_xyxys[:, 2:4] - pred_xyxys[:, 0:2] + 1),
            axis=1)

        detections = [
            Detection(tlwh, score, feat, cls_id)
......
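The `+ 1` in the new tlwh conversion above follows a pixel-inclusive width/height convention. A small worked check (toy box, standalone sketch):

```python
import numpy as np

pred_xyxys = np.array([[10., 20., 50., 80.]])  # one box as [x1, y1, x2, y2]
pred_tlwhs = np.concatenate(
    (pred_xyxys[:, 0:2], pred_xyxys[:, 2:4] - pred_xyxys[:, 0:2] + 1), axis=1)
print(pred_tlwhs)  # [[10. 20. 41. 61.]] -> top-left x, top-left y, w, h
```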
@@ -182,8 +182,7 @@ def clip_box(xyxy, ori_image_shape):

def get_crops(xyxy, ori_img, w, h):
    crops = []
    xyxy = xyxy.astype(np.int64)
    ori_img = ori_img.transpose(1, 0, 2)  # [h,w,3]->[w,h,3]
    for i, bbox in enumerate(xyxy):
        crop = ori_img[bbox[0]:bbox[2], bbox[1]:bbox[3], :]
        crops.append(crop)
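Why the transpose in `get_crops`: after `[h,w,3] -> [w,h,3]`, axis 0 of the array is x, so an `xyxy` box can be cropped directly as `ori_img[x1:x2, y1:y2]`. A shape-only check (toy values, standalone sketch):

```python
import numpy as np

ori_img = np.zeros((480, 640, 3))     # [h, w, 3]
ori_img = ori_img.transpose(1, 0, 2)  # [w, h, 3]: axis 0 is now x
x1, y1, x2, y2 = 100, 50, 200, 150
crop = ori_img[x1:x2, y1:y2, :]       # matches the indexing in get_crops
assert crop.shape == (100, 100, 3)
```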
@@ -198,7 +197,10 @@ def preprocess_reid(imgs,
                    std=[0.229, 0.224, 0.225]):
    im_batch = []
    for img in imgs:
        try:
            img = cv2.resize(img, (w, h))
        except cv2.error:
            # skip empty or invalid crops instead of dropping into a debugger
            continue
        img = img[:, :, ::-1].astype('float32').transpose((2, 0, 1)) / 255
        img_mean = np.array(mean).reshape((3, 1, 1))
        img_std = np.array(std).reshape((3, 1, 1))
......
@@ -18,21 +18,24 @@ import yaml
import cv2
import numpy as np
from collections import defaultdict

import paddle

from benchmark_utils import PaddleInferBenchmark
from preprocess import decode_image
from utils import argsparser, Timer, get_current_memory_mb
from det_infer import Detector, get_test_images, print_arguments, bench_log, PredictConfig

# add python path
import sys
parent_path = os.path.abspath(os.path.join(__file__, *(['..'] * 2)))
sys.path.insert(0, parent_path)

from mot import JDETracker
from utils import MOTTimer, write_mot_results
from visualize import plot_tracking, plot_tracking_dict

# Global dictionary
MOT_JDE_SUPPORT_MODELS = {
    'JDE',
    'FairMOT',
}
@@ -41,23 +44,22 @@
class JDE_Detector(Detector):
    """
    Args:
        model_dir (str): root path of model.pdiparams, model.pdmodel and infer_cfg.yml
        device (str): Choose the device you want to run, it can be: CPU/GPU/XPU, default is CPU
        run_mode (str): mode of running(paddle/trt_fp32/trt_fp16)
        batch_size (int): size of per batch in inference, default is 1 in tracking models
        trt_min_shape (int): min shape for dynamic shape in trt
        trt_max_shape (int): max shape for dynamic shape in trt
        trt_opt_shape (int): opt shape for dynamic shape in trt
        trt_calib_mode (bool): If the model is produced by TRT offline quantitative
            calibration, trt_calib_mode need to set True
        cpu_threads (int): cpu threads
        enable_mkldnn (bool): whether to open MKLDNN
    """

    def __init__(self,
                 model_dir,
                 tracker_config=None,
                 device='CPU',
                 run_mode='paddle',
                 batch_size=1,
@@ -66,9 +68,10 @@ class JDE_Detector(Detector):
                 trt_opt_shape=608,
                 trt_calib_mode=False,
                 cpu_threads=1,
                 enable_mkldnn=False,
                 output_dir='output',
                 threshold=0.5):
        super(JDE_Detector, self).__init__(
            model_dir=model_dir,
            device=device,
            run_mode=run_mode,
@@ -78,17 +81,21 @@ class JDE_Detector(Detector):
            trt_opt_shape=trt_opt_shape,
            trt_calib_mode=trt_calib_mode,
            cpu_threads=cpu_threads,
            enable_mkldnn=enable_mkldnn,
            output_dir=output_dir,
            threshold=threshold, )
        assert batch_size == 1, "MOT model only supports batch_size=1."
        self.det_times = Timer(with_tracker=True)
        self.num_classes = len(self.pred_config.labels)

        # tracker config
        assert self.pred_config.tracker, "The exported JDE Detector model should have tracker."
        cfg = self.pred_config.tracker
        min_box_area = cfg.get('min_box_area', 200)
        vertical_ratio = cfg.get('vertical_ratio', 1.6)
        conf_thres = cfg.get('conf_thres', 0.0)
        tracked_thresh = cfg.get('tracked_thresh', 0.7)
        metric_type = cfg.get('metric_type', 'euclidean')

        self.tracker = JDETracker(
            num_classes=self.num_classes,
@@ -98,7 +105,18 @@ class JDE_Detector(Detector):
            tracked_thresh=tracked_thresh,
            metric_type=metric_type)
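        # For reference (assumed, model-dependent): the tracker block that
        # self.pred_config.tracker is parsed from in infer_cfg.yml might look
        # like:
        #   tracker:
        #     min_box_area: 200
        #     vertical_ratio: 1.6
        #     conf_thres: 0.0
        #     tracked_thresh: 0.7
        #     metric_type: euclidean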
    def postprocess(self, inputs, result):
        # postprocess output of predictor
        np_boxes = result['pred_dets']
        if np_boxes.shape[0] <= 0:
            print('[WARNING] No object detected.')
            result = {'pred_dets': np.zeros([0, 6]), 'pred_embs': None}
        result = {k: v for k, v in result.items() if v is not None}
        return result

    def tracking(self, det_results):
        pred_dets = det_results['pred_dets']
        pred_embs = det_results['pred_embs']
        online_targets_dict = self.tracker.update(pred_dets, pred_embs)
        online_tlwhs = defaultdict(list)
@@ -110,9 +128,7 @@ class JDE_Detector(Detector):
                tlwh = t.tlwh
                tid = t.track_id
                tscore = t.score
                if tlwh[2] * tlwh[3] <= self.tracker.min_box_area: continue
                if self.tracker.vertical_ratio > 0 and tlwh[2] / tlwh[
                        3] > self.tracker.vertical_ratio:
                    continue
@@ -121,270 +137,181 @@ class JDE_Detector(Detector):
                online_scores[cls_id].append(tscore)
        return online_tlwhs, online_scores, online_ids
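        # Shape note (illustrative): for a single-class model, online_tlwhs[0]
        # is a list of [x, y, w, h] boxes for the targets kept in this frame,
        # aligned index-wise with online_ids[0] and online_scores[0], e.g.:
        #   for tlwh, tid in zip(online_tlwhs[0], online_ids[0]):
        #       ...  # draw or log track tid at box tlwh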
    def predict(self, repeats=1):
        '''
        Args:
            repeats (int): repeats number for prediction
        Returns:
            result (dict): include 'pred_dets': np.ndarray: shape:[N,6], N: number of box,
                matrix element: [x_min, y_min, x_max, y_max, score, class]
                FairMOT(JDE)'s result include 'pred_embs': np.ndarray:
                shape: [N, 128]
        '''
        # model prediction
        np_pred_dets, np_pred_embs = None, None
        for i in range(repeats):
            self.predictor.run()
            output_names = self.predictor.get_output_names()
            boxes_tensor = self.predictor.get_output_handle(output_names[0])
            np_pred_dets = boxes_tensor.copy_to_cpu()
            embs_tensor = self.predictor.get_output_handle(output_names[1])
            np_pred_embs = embs_tensor.copy_to_cpu()

        result = dict(pred_dets=np_pred_dets, pred_embs=np_pred_embs)
        return result

    def predict_image(self,
                      image_list,
                      run_benchmark=False,
                      repeats=1,
                      visual=True):
        mot_results = []
        num_classes = self.num_classes
        image_list.sort()
        ids2names = self.pred_config.labels
        data_type = 'mcmot' if num_classes > 1 else 'mot'
        for frame_id, img_file in enumerate(image_list):
            batch_image_list = [img_file]  # bs=1 in MOT model
            if run_benchmark:
                # preprocess
                inputs = self.preprocess(batch_image_list)  # warmup
                self.det_times.preprocess_time_s.start()
                inputs = self.preprocess(batch_image_list)
                self.det_times.preprocess_time_s.end()

                # model prediction
                result_warmup = self.predict(repeats=repeats)  # warmup
                self.det_times.inference_time_s.start()
                result = self.predict(repeats=repeats)
                self.det_times.inference_time_s.end(repeats=repeats)

                # postprocess
                result_warmup = self.postprocess(inputs, result)  # warmup
                self.det_times.postprocess_time_s.start()
                det_result = self.postprocess(inputs, result)
                self.det_times.postprocess_time_s.end()

                # tracking
                result_warmup = self.tracking(det_result)
                self.det_times.tracking_time_s.start()
                online_tlwhs, online_scores, online_ids = self.tracking(
                    det_result)
                self.det_times.tracking_time_s.end()
                self.det_times.img_num += 1

                cm, gm, gu = get_current_memory_mb()
                self.cpu_mem += cm
                self.gpu_mem += gm
                self.gpu_util += gu
            else:
                self.det_times.preprocess_time_s.start()
                inputs = self.preprocess(batch_image_list)
                self.det_times.preprocess_time_s.end()

                self.det_times.inference_time_s.start()
                result = self.predict()
                self.det_times.inference_time_s.end()

                self.det_times.postprocess_time_s.start()
                det_result = self.postprocess(inputs, result)
                self.det_times.postprocess_time_s.end()

                # tracking process
                self.det_times.tracking_time_s.start()
                online_tlwhs, online_scores, online_ids = self.tracking(
                    det_result)
                self.det_times.tracking_time_s.end()
                self.det_times.img_num += 1

            if visual:
                if frame_id % 10 == 0:
                    print('Tracking frame {}'.format(frame_id))
                frame, _ = decode_image(img_file, {})
                im = plot_tracking_dict(
                    frame,
                    num_classes,
                    online_tlwhs,
                    online_ids,
                    online_scores,
                    frame_id=frame_id,
                    ids2names=ids2names)
                seq_name = image_list[0].split('/')[-2]
                save_dir = os.path.join(self.output_dir, seq_name)
                if not os.path.exists(save_dir):
                    os.makedirs(save_dir)
                cv2.imwrite(
                    os.path.join(save_dir, '{:05d}.jpg'.format(frame_id)), im)

            mot_results.append([online_tlwhs, online_scores, online_ids])
        return mot_results
    def predict_video(self, video_file, camera_id):
        video_out_name = 'mot_output.mp4'
        if camera_id != -1:
            capture = cv2.VideoCapture(camera_id)
        else:
            capture = cv2.VideoCapture(video_file)
            video_out_name = os.path.split(video_file)[-1]
        # Get Video info : resolution, fps, frame count
        width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = int(capture.get(cv2.CAP_PROP_FPS))
        frame_count = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        print("fps: %d, frame_count: %d" % (fps, frame_count))

        if not os.path.exists(self.output_dir):
            os.makedirs(self.output_dir)
        out_path = os.path.join(self.output_dir, video_out_name)
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')
        writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
        frame_id = 1
        timer = MOTTimer()
        results = defaultdict(list)  # support single class and multi classes
        num_classes = self.num_classes
        data_type = 'mcmot' if num_classes > 1 else 'mot'
        ids2names = self.pred_config.labels
        while (1):
            ret, frame = capture.read()
            if not ret:
                break
            if frame_id % 10 == 0:
                print('Tracking frame: %d' % (frame_id))
            frame_id += 1

            timer.tic()
            mot_results = self.predict_image([frame], visual=False)
            timer.toc()

            online_tlwhs, online_scores, online_ids = mot_results[0]
            for cls_id in range(num_classes):
                results[cls_id].append(
                    (frame_id + 1, online_tlwhs[cls_id], online_scores[cls_id],
                     online_ids[cls_id]))

            fps = 1. / timer.duration
            im = plot_tracking_dict(
                frame,
                num_classes,
                online_tlwhs,
                online_ids,
                online_scores,
                frame_id=frame_id,
                fps=fps,
                ids2names=ids2names)

            writer.write(im)
            if camera_id != -1:
                cv2.imshow('Mask Detection', im)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
        writer.release()
def main():
    detector = JDE_Detector(
        FLAGS.model_dir,
        device=FLAGS.device,
        run_mode=FLAGS.run_mode,
@@ -397,50 +324,22 @@ def main():
    # predict from video file or camera video stream
    if FLAGS.video_file is not None or FLAGS.camera_id != -1:
        detector.predict_video(FLAGS.video_file, FLAGS.camera_id)
    else:
        # predict from image
        img_list = get_test_images(FLAGS.image_dir, FLAGS.image_file)
        detector.predict_image(img_list, FLAGS.run_benchmark, repeats=10)

        if not FLAGS.run_benchmark:
            detector.det_times.info(average=True)
        else:
            mode = FLAGS.run_mode
            model_dir = FLAGS.model_dir
            model_info = {
                'model_name': model_dir.strip('/').split('/')[-1],
                'precision': mode.split('_')[-1]
            }
            bench_log(detector, img_list, model_info, name='MOT')
if __name__ == '__main__':
......
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
...@@ -11,64 +11,44 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import time
import yaml
import cv2
import re
import glob
import numpy as np
from collections import defaultdict

import paddle

from benchmark_utils import PaddleInferBenchmark
from preprocess import decode_image
from utils import argsparser, Timer, get_current_memory_mb, _is_valid_video, video2frames
from det_infer import Detector, get_test_images, print_arguments, bench_log, PredictConfig, load_predictor

from mot.tracker import JDETracker, DeepSORTTracker
from mot.utils import MOTTimer, write_mot_results, flow_statistic, get_crops, clip_box
from visualize import plot_tracking, plot_tracking_dict

from mot.mtmct.utils import parse_bias
from mot.mtmct.postprocess import trajectory_fusion, sub_cluster, gen_res, print_mtmct_result
from mot.mtmct.postprocess import get_mtmct_matching_results, save_mtmct_crops, save_mtmct_vis_results


class SDE_Detector(Detector):
    """
    Args:
        model_dir (str): root path of model.pdiparams, model.pdmodel and infer_cfg.yml
        tracker_config (str): tracker config path
        device (str): Choose the device you want to run, it can be: CPU/GPU/XPU, default is CPU
        run_mode (str): mode of running(paddle/trt_fp32/trt_fp16)
        batch_size (int): size of per batch in inference
        trt_min_shape (int): min shape for dynamic shape in trt
        trt_max_shape (int): max shape for dynamic shape in trt
        trt_opt_shape (int): opt shape for dynamic shape in trt
...@@ -76,22 +56,27 @@ class SDE_Detector(Detector):
            calibration, trt_calib_mode need to set True
        cpu_threads (int): cpu threads
        enable_mkldnn (bool): whether to open MKLDNN
        reid_model_dir (str): reid model dir, default None for ByteTrack, but set for DeepSORT
        mtmct_dir (str): MTMCT dir, default None, set for doing MTMCT
    """
    def __init__(self,
                 model_dir,
                 tracker_config=None,
                 device='CPU',
                 run_mode='paddle',
                 batch_size=1,
                 trt_min_shape=1,
                 trt_max_shape=1280,
                 trt_opt_shape=640,
                 trt_calib_mode=False,
                 cpu_threads=1,
                 enable_mkldnn=False,
                 output_dir='output',
                 threshold=0.5,
                 reid_model_dir=None,
                 mtmct_dir=None):
        super(SDE_Detector, self).__init__(
            model_dir=model_dir,
            device=device,
            run_mode=run_mode,
...@@ -101,833 +86,465 @@ class SDE_Detector(Detector):
            trt_opt_shape=trt_opt_shape,
            trt_calib_mode=trt_calib_mode,
            cpu_threads=cpu_threads,
            enable_mkldnn=enable_mkldnn,
            output_dir=output_dir,
            threshold=threshold)
        assert batch_size == 1, "MOT model only supports batch_size=1."
        self.det_times = Timer(with_tracker=True)
        self.num_classes = len(self.pred_config.labels)

        # reid and tracker config
        self.use_reid = False if reid_model_dir is None else True
        if self.use_reid:
            # use DeepSORTTracker
            self.reid_pred_config = self.set_config(reid_model_dir)
            self.reid_predictor, self.config = load_predictor(
                reid_model_dir,
                run_mode=run_mode,
                batch_size=50,  # reid_batch_size
                min_subgraph_size=self.reid_pred_config.min_subgraph_size,
                device=device,
                use_dynamic_shape=self.reid_pred_config.use_dynamic_shape,
                trt_min_shape=trt_min_shape,
                trt_max_shape=trt_max_shape,
                trt_opt_shape=trt_opt_shape,
                trt_calib_mode=trt_calib_mode,
                cpu_threads=cpu_threads,
                enable_mkldnn=enable_mkldnn)

            cfg = self.reid_pred_config.tracker
            max_age = cfg.get('max_age', 30)
            max_iou_distance = cfg.get('max_iou_distance', 0.7)
            self.tracker = DeepSORTTracker(
                max_age=max_age,
                max_iou_distance=max_iou_distance)
        else:
            # use ByteTracker
            self.tracker_config = tracker_config
            cfg = yaml.safe_load(open(self.tracker_config))['tracker']
            min_box_area = cfg.get('min_box_area', 200)
            vertical_ratio = cfg.get('vertical_ratio', 1.6)
            use_byte = cfg.get('use_byte', True)
            match_thres = cfg.get('match_thres', 0.9)
            conf_thres = cfg.get('conf_thres', 0.6)
            low_conf_thres = cfg.get('low_conf_thres', 0.1)
            self.tracker = JDETracker(
                use_byte=use_byte,
                num_classes=self.num_classes,
                min_box_area=min_box_area,
                vertical_ratio=vertical_ratio,
                match_thres=match_thres,
                conf_thres=conf_thres,
                low_conf_thres=low_conf_thres)

        self.do_mtmct = False if mtmct_dir is None else True
        self.mtmct_dir = mtmct_dir

    def postprocess(self, inputs, result):
        # postprocess output of predictor
        np_boxes_num = result['boxes_num']
        if np_boxes_num[0] <= 0:
            print('[WARNING] No object detected.')
            result = {'boxes': np.zeros([0, 6]), 'boxes_num': [0]}
        result = {k: v for k, v in result.items() if v is not None}
        return result

    def reidprocess(self, det_results, repeats=1):
        pred_dets = det_results['boxes']
        pred_xyxys = pred_dets[:, 2:6]

        ori_image = det_results['ori_image']
        ori_image_shape = ori_image.shape[:2]
        pred_xyxys, keep_idx = clip_box(pred_xyxys, ori_image_shape)

        if len(keep_idx[0]) == 0:
            det_results['boxes'] = np.zeros((1, 6), dtype=np.float32)
            det_results['embeddings'] = None
            return det_results

        pred_dets = pred_dets[keep_idx[0]]
        pred_xyxys = pred_dets[:, 2:6]

        w, h = self.tracker.input_size
        crops = get_crops(pred_xyxys, ori_image, w, h)

        # to keep fast speed, only use topk crops
        crops = crops[:50]  # reid_batch_size
        det_results['crops'] = np.array(crops).astype('float32')
        det_results['boxes'] = pred_dets[:50]

        input_names = self.reid_predictor.get_input_names()
        for i in range(len(input_names)):
            input_tensor = self.reid_predictor.get_input_handle(input_names[i])
            input_tensor.copy_from_cpu(det_results[input_names[i]])

        # model prediction
        for i in range(repeats):
            self.reid_predictor.run()
            output_names = self.reid_predictor.get_output_names()
            feature_tensor = self.reid_predictor.get_output_handle(
                output_names[0])
        pred_embs = feature_tensor.copy_to_cpu()

        det_results['embeddings'] = pred_embs
        return det_results
    def tracking(self, det_results):
        pred_dets = det_results['boxes']
        pred_embs = det_results.get('embeddings', None)

        if self.use_reid:
            # use DeepSORTTracker, only support single class
            self.tracker.predict()
            online_targets = self.tracker.update(pred_dets, pred_embs)
            online_tlwhs, online_scores, online_ids = [], [], []
            if self.do_mtmct:
                online_tlbrs, online_feats = [], []
            for t in online_targets:
                if not t.is_confirmed() or t.time_since_update > 1:
                    continue
                tlwh = t.to_tlwh()
                tscore = t.score
                tid = t.track_id
                if self.tracker.vertical_ratio > 0 and tlwh[2] / tlwh[
                        3] > self.tracker.vertical_ratio:
                    continue
                online_tlwhs.append(tlwh)
                online_scores.append(tscore)
                online_ids.append(tid)
                if self.do_mtmct:
                    online_tlbrs.append(t.to_tlbr())
                    online_feats.append(t.feat)
            tracking_outs = {
                'online_tlwhs': online_tlwhs,
                'online_scores': online_scores,
                'online_ids': online_ids,
            }
            if self.do_mtmct:
                seq_name = det_results['seq_name']
                frame_id = det_results['frame_id']
                tracking_outs['feat_data'] = {}
                for _tlbr, _id, _feat in zip(online_tlbrs, online_ids,
                                             online_feats):
                    feat_data = {}
                    feat_data['bbox'] = _tlbr
                    feat_data['frame'] = f"{frame_id:06d}"
                    feat_data['id'] = _id
                    _imgname = f'{seq_name}_{_id}_{frame_id}.jpg'
                    feat_data['imgname'] = _imgname
                    feat_data['feat'] = _feat
                    tracking_outs['feat_data'].update({_imgname: feat_data})
            return tracking_outs
        else:
            # use ByteTracker, support multiple class
            online_tlwhs = defaultdict(list)
            online_scores = defaultdict(list)
            online_ids = defaultdict(list)
            online_targets_dict = self.tracker.update(pred_dets, pred_embs)
            for cls_id in range(self.num_classes):
                online_targets = online_targets_dict[cls_id]
                for t in online_targets:
                    tlwh = t.tlwh
                    tid = t.track_id
                    tscore = t.score
                    if tlwh[2] * tlwh[3] <= self.tracker.min_box_area:
                        continue
                    if self.tracker.vertical_ratio > 0 and tlwh[2] / tlwh[
                            3] > self.tracker.vertical_ratio:
                        continue
                    online_tlwhs[cls_id].append(tlwh)
                    online_ids[cls_id].append(tid)
                    online_scores[cls_id].append(tscore)
            tracking_outs = {
                'online_tlwhs': online_tlwhs,
                'online_scores': online_scores,
                'online_ids': online_ids,
            }
            return tracking_outs
    def predict_image(self,
                      image_list,
                      run_benchmark=False,
                      repeats=1,
                      visual=True,
                      seq_name=None):
        num_classes = self.num_classes
        image_list.sort()
        ids2names = self.pred_config.labels
        if self.do_mtmct:
            mot_features_dict = {}  # cid_tid_fid feats
        else:
            mot_results = []
        for frame_id, img_file in enumerate(image_list):
            if self.do_mtmct:
                if frame_id % 10 == 0:
                    print('Tracking frame: %d' % (frame_id))
            batch_image_list = [img_file]  # bs=1 in MOT model
            frame, _ = decode_image(img_file, {})
            if run_benchmark:
                # preprocess
                inputs = self.preprocess(batch_image_list)  # warmup
                self.det_times.preprocess_time_s.start()
                inputs = self.preprocess(batch_image_list)
                self.det_times.preprocess_time_s.end()

                # model prediction
                result_warmup = self.predict(repeats=repeats)  # warmup
                self.det_times.inference_time_s.start()
                result = self.predict(repeats=repeats)
                self.det_times.inference_time_s.end(repeats=repeats)

                # postprocess
                result_warmup = self.postprocess(inputs, result)  # warmup
                self.det_times.postprocess_time_s.start()
                det_result = self.postprocess(inputs, result)
                self.det_times.postprocess_time_s.end()

                # tracking
                if self.use_reid:
                    det_result['frame_id'] = frame_id
                    det_result['seq_name'] = seq_name
                    det_result['ori_image'] = frame
                    det_result = self.reidprocess(det_result)
                result_warmup = self.tracking(det_result)
                self.det_times.tracking_time_s.start()
                if self.use_reid:
                    det_result = self.reidprocess(det_result)
                tracking_outs = self.tracking(det_result)
                self.det_times.tracking_time_s.end()
                self.det_times.img_num += 1

                cm, gm, gu = get_current_memory_mb()
                self.cpu_mem += cm
                self.gpu_mem += gm
                self.gpu_util += gu
            else:
                self.det_times.preprocess_time_s.start()
                inputs = self.preprocess(batch_image_list)
                self.det_times.preprocess_time_s.end()

                self.det_times.inference_time_s.start()
                result = self.predict()
                self.det_times.inference_time_s.end()

                self.det_times.postprocess_time_s.start()
                det_result = self.postprocess(inputs, result)
                self.det_times.postprocess_time_s.end()

                # tracking process
                self.det_times.tracking_time_s.start()
                if self.use_reid:
                    det_result['frame_id'] = frame_id
                    det_result['seq_name'] = seq_name
                    det_result['ori_image'] = frame
                    det_result = self.reidprocess(det_result)
                tracking_outs = self.tracking(det_result)
                self.det_times.tracking_time_s.end()
                self.det_times.img_num += 1

            online_tlwhs = tracking_outs['online_tlwhs']
            online_scores = tracking_outs['online_scores']
            online_ids = tracking_outs['online_ids']

            if self.do_mtmct:
                feat_data_dict = tracking_outs['feat_data']
                mot_features_dict = dict(mot_features_dict, **feat_data_dict)
            else:
                mot_results.append([online_tlwhs, online_scores, online_ids])

            if visual:
                if frame_id % 10 == 0:
                    print('Tracking frame {}'.format(frame_id))
                frame, _ = decode_image(img_file, {})
                if num_classes == 1:
                    im = plot_tracking(
                        frame,
                        online_tlwhs,
                        online_ids,
                        online_scores,
                        frame_id=frame_id)
                else:
                    im = plot_tracking_dict(
                        frame,
                        num_classes,
                        online_tlwhs,
                        online_ids,
                        online_scores,
                        frame_id=frame_id,
                        ids2names=[])
                save_dir = os.path.join(self.output_dir, seq_name)
                if not os.path.exists(save_dir):
                    os.makedirs(save_dir)
                cv2.imwrite(
                    os.path.join(save_dir, '{:05d}.jpg'.format(frame_id)), im)

        if self.do_mtmct:
            return mot_features_dict
        else:
            return mot_results
    def predict_video(self, video_file, camera_id):
        video_out_name = 'output.mp4'
        if camera_id != -1:
            capture = cv2.VideoCapture(camera_id)
        else:
            capture = cv2.VideoCapture(video_file)
            video_out_name = os.path.split(video_file)[-1]
        # Get Video info : resolution, fps, frame count
        width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = int(capture.get(cv2.CAP_PROP_FPS))
        frame_count = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        print("fps: %d, frame_count: %d" % (fps, frame_count))

        if not os.path.exists(self.output_dir):
            os.makedirs(self.output_dir)
        out_path = os.path.join(self.output_dir, video_out_name)
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')
        writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
        frame_id = 1
        timer = MOTTimer()
        results = defaultdict(list)  # support single class and multi classes
        num_classes = self.num_classes
        while (1):
            ret, frame = capture.read()
            if not ret:
                break
            if frame_id % 10 == 0:
                print('Tracking frame: %d' % (frame_id))
            frame_id += 1

            timer.tic()
            seq_name = video_out_name.split('.')[0]
            mot_results = self.predict_image(
                [frame], visual=False, seq_name=seq_name)
            timer.toc()

            online_tlwhs, online_scores, online_ids = mot_results[0]  # bs=1 in MOT model
            fps = 1. / timer.duration
            if num_classes == 1 and self.use_reid:
                # use DeepSORTTracker, only support single class
                results[0].append(
                    (frame_id + 1, online_tlwhs, online_scores, online_ids))
                im = plot_tracking(
                    frame,
                    online_tlwhs,
                    online_ids,
                    online_scores,
                    frame_id=frame_id,
                    fps=fps)
            else:
                # use ByteTracker, support multiple class
                for cls_id in range(num_classes):
                    results[cls_id].append(
                        (frame_id + 1, online_tlwhs[cls_id],
                         online_scores[cls_id], online_ids[cls_id]))
                im = plot_tracking_dict(
                    frame,
                    num_classes,
                    online_tlwhs,
                    online_ids,
                    online_scores,
                    frame_id=frame_id,
                    fps=fps,
                    ids2names=[])

            writer.write(im)
            if camera_id != -1:
                cv2.imshow('Tracking Detection', im)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break

        writer.release()
    def predict_mtmct(self, mtmct_dir, mtmct_cfg):
        cameras_bias = mtmct_cfg['cameras_bias']
        cid_bias = parse_bias(cameras_bias)
        scene_cluster = list(cid_bias.keys())

        # 1.zone related parameters
        use_zone = mtmct_cfg.get('use_zone', False)
        zone_path = mtmct_cfg.get('zone_path', None)

        # 2.tricks parameters, can be used for other mtmct dataset
        use_ff = mtmct_cfg.get('use_ff', False)
        use_rerank = mtmct_cfg.get('use_rerank', False)

        # 3.camera related parameters
        use_camera = mtmct_cfg.get('use_camera', False)
        use_st_filter = mtmct_cfg.get('use_st_filter', False)

        # 4.zone related parameters
        use_roi = mtmct_cfg.get('use_roi', False)
        roi_dir = mtmct_cfg.get('roi_dir', False)

        mot_list_breaks = []
        cid_tid_dict = dict()

        output_dir = self.output_dir
        if not os.path.exists(output_dir):
            os.makedirs(output_dir)

        seqs = os.listdir(mtmct_dir)
        for seq in sorted(seqs):
            fpath = os.path.join(mtmct_dir, seq)
            if os.path.isfile(fpath) and _is_valid_video(fpath):
                seq = seq.split('.')[-2]
                print('ffmpeg processing of video {}'.format(fpath))
                frames_path = video2frames(
                    video_path=fpath, outpath=mtmct_dir, frame_rate=25)
                fpath = os.path.join(mtmct_dir, seq)

            if os.path.isdir(fpath) == False:
                print('{} is not a image folder.'.format(fpath))
                continue
            if os.path.exists(os.path.join(fpath, 'img1')):
                fpath = os.path.join(fpath, 'img1')
            assert os.path.isdir(fpath), '{} should be a directory'.format(
                fpath)
            image_list = glob.glob(os.path.join(fpath, '*.jpg'))
            image_list.sort()
            assert len(image_list) > 0, '{} has no images.'.format(fpath)
            print('start tracking seq: {}'.format(seq))

            mot_features_dict = self.predict_image(
                image_list, visual=False, seq_name=seq)

            cid = int(re.sub('[a-z,A-Z]', "", seq))
            tid_data, mot_list_break = trajectory_fusion(
                mot_features_dict,
                cid,
                cid_bias,
                use_zone=use_zone,
                zone_path=zone_path)
            mot_list_breaks.append(mot_list_break)
            # single seq process
            for line in tid_data:
                tracklet = tid_data[line]
                tid = tracklet['tid']
                if (cid, tid) not in cid_tid_dict:
                    cid_tid_dict[(cid, tid)] = tracklet

        map_tid = sub_cluster(
            cid_tid_dict,
            scene_cluster,
            use_ff=use_ff,
            use_rerank=use_rerank,
            use_camera=use_camera,
            use_st_filter=use_st_filter)

        pred_mtmct_file = os.path.join(output_dir, 'mtmct_result.txt')
        if use_camera:
            gen_res(pred_mtmct_file, scene_cluster, map_tid, mot_list_breaks)
        else:
            gen_res(
                pred_mtmct_file,
                scene_cluster,
                map_tid,
                mot_list_breaks,
                use_roi=use_roi,
                roi_dir=roi_dir)

        if FLAGS.save_images:
            camera_results, cid_tid_fid_res = get_mtmct_matching_results(
                pred_mtmct_file)
...@@ -942,160 +559,55 @@ def predict_mtmct(detector,
                save_dir=save_dir,
                save_videos=FLAGS.save_images)

        # evaluation metrics
        data_root_gt = os.path.join(mtmct_dir, '..', 'gt', 'gt.txt')
        if os.path.exists(data_root_gt):
            print_mtmct_result(data_root_gt, pred_mtmct_file)
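# A minimal sketch of the mtmct_cfg dict consumed by predict_mtmct() above.
# The keys mirror the .get() lookups in the method; the camera names and all
# values are illustrative assumptions, not the shipped AIC21 config.
example_mtmct_cfg = {
    'cameras_bias': {'c041': 0, 'c042': 0},  # assumed camera ids
    'use_zone': False,
    'zone_path': None,
    'use_ff': False,
    'use_rerank': False,
    'use_camera': False,
    'use_st_filter': False,
    'use_roi': False,
    'roi_dir': None,
}
# detector.predict_mtmct('mtmct_seqs_dir', example_mtmct_cfg)  # hypothetical dir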
def main():
    deploy_file = os.path.join(FLAGS.model_dir, 'infer_cfg.yml')
    with open(deploy_file) as f:
        yml_conf = yaml.safe_load(f)
    arch = yml_conf['arch']
    detector = SDE_Detector(
        FLAGS.model_dir,
        FLAGS.tracker_config,
        device=FLAGS.device,
        run_mode=FLAGS.run_mode,
        batch_size=FLAGS.batch_size,
        trt_min_shape=FLAGS.trt_min_shape,
        trt_max_shape=FLAGS.trt_max_shape,
        trt_opt_shape=FLAGS.trt_opt_shape,
        trt_calib_mode=FLAGS.trt_calib_mode,
        cpu_threads=FLAGS.cpu_threads,
        enable_mkldnn=FLAGS.enable_mkldnn,
        threshold=FLAGS.threshold,
        output_dir=FLAGS.output_dir,
        reid_model_dir=FLAGS.reid_model_dir,
        mtmct_dir=FLAGS.mtmct_dir)

    # predict from video file or camera video stream
    if FLAGS.video_file is not None or FLAGS.camera_id != -1:
        detector.predict_video(FLAGS.video_file, FLAGS.camera_id)
    elif FLAGS.mtmct_dir is not None:
        with open(FLAGS.mtmct_cfg) as f:
            mtmct_cfg = yaml.safe_load(f)
        detector.predict_mtmct(FLAGS.mtmct_dir, mtmct_cfg)
    else:
        # predict from image
        if FLAGS.image_dir is None and FLAGS.image_file is not None:
            assert FLAGS.batch_size == 1, "--batch_size should be 1 in MOT models."
        img_list = get_test_images(FLAGS.image_dir, FLAGS.image_file)
        seq_name = FLAGS.image_dir.split('/')[-1]
        detector.predict_image(
            img_list, FLAGS.run_benchmark, repeats=10, seq_name=seq_name)

        if not FLAGS.run_benchmark:
            detector.det_times.info(average=True)
        else:
            mode = FLAGS.run_mode
            model_dir = FLAGS.model_dir
            model_info = {
                'model_name': model_dir.strip('/').split('/')[-1],
                'precision': mode.split('_')[-1]
            }
            bench_log(detector, img_list, model_info, name='MOT')


if __name__ == '__main__':
......
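Taken together, the refactored class replaces the old two-model driver with a single object. A minimal usage sketch, assuming exported model directories (paths below are placeholders, not shipped files):

```python
from mot_sde_infer import SDE_Detector

detector = SDE_Detector(
    model_dir='output_inference/ppyoloe_crn_l_36e_640x640_mot17half',
    tracker_config='deploy/pptracking/python/tracker_config.yml',
    device='GPU',
    reid_model_dir=None)  # None selects the ByteTracker branch; a ReID dir selects DeepSORT
detector.predict_video('mot17_demo.mp4', camera_id=-1)
detector.det_times.info(average=True)
```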
# config of tracker for MOT SDE Detector, use ByteTracker as default.
# The tracker of MOT JDE Detector is exported together with the model.
# Here 'min_box_area' and 'vertical_ratio' are set for pedestrians; modify them when tracking other object classes.
tracker:
use_byte: true
conf_thres: 0.6
low_conf_thres: 0.1
match_thres: 0.9
min_box_area: 100
vertical_ratio: 1.6
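For reference, a short sketch of how this file is consumed: `SDE_Detector.__init__` (earlier in this diff) reads the `tracker` node and forwards each key to `JDETracker`. Loading and tweaking the thresholds by hand might look like this; the config path and the override value are assumptions:

```python
import yaml
from mot.tracker import JDETracker

cfg = yaml.safe_load(open('tracker_config.yml'))['tracker']
cfg['conf_thres'] = 0.5  # assumed override: high-score detection threshold for ByteTrack
tracker = JDETracker(
    use_byte=cfg.get('use_byte', True),
    num_classes=1,  # single-class pedestrian tracking assumed
    min_box_area=cfg.get('min_box_area', 200),
    vertical_ratio=cfg.get('vertical_ratio', 1.6),
    match_thres=cfg.get('match_thres', 0.9),
    conf_thres=cfg.get('conf_thres', 0.6),
    low_conf_thres=cfg.get('low_conf_thres', 0.1))
```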
...@@ -66,6 +66,11 @@ def argsparser():
        default='cpu',
        help="Choose the device you want to run, it can be: CPU/GPU/XPU, default is CPU."
    )
    parser.add_argument(
        "--use_gpu",
        type=ast.literal_eval,
        default=False,
        help="Deprecated, please use `--device`.")
    parser.add_argument(
        "--run_benchmark",
        type=ast.literal_eval,
...@@ -104,12 +109,18 @@ def argsparser():
        '--save_mot_txts',
        action='store_true',
        help='Save tracking results (txt).')
    parser.add_argument(
        '--save_mot_txt_per_img',
        action='store_true',
        help='Save tracking results (txt) for each image.')
    parser.add_argument(
        '--scaled',
        type=bool,
        default=False,
        help="Whether coords after detector outputs are scaled, False in JDE YOLOv3 "
        "True in general detector.")
    parser.add_argument(
        "--tracker_config", type=str, default=None, help="tracker config")
    parser.add_argument(
        "--reid_model_dir",
        type=str,
...@@ -122,20 +133,10 @@ def argsparser():
        default=50,
        help="max batch_size for reid model inference.")
    parser.add_argument(
        '--use_dark',
        type=ast.literal_eval,
        default=True,
        help='whether to use darkpose to get better keypoint position predict')
    parser.add_argument(
        "--mtmct_dir",
        type=str,
...@@ -146,6 +147,7 @@ def argsparser():
    return parser
class Times(object):
    def __init__(self):
        self.time = 0.
...@@ -174,29 +176,36 @@ class Times(object):


class Timer(Times):
    def __init__(self, with_tracker=False):
        super(Timer, self).__init__()
        self.with_tracker = with_tracker
        self.preprocess_time_s = Times()
        self.inference_time_s = Times()
        self.postprocess_time_s = Times()
        self.tracking_time_s = Times()
        self.img_num = 0

    def info(self, average=False):
        pre_time = self.preprocess_time_s.value()
        infer_time = self.inference_time_s.value()
        post_time = self.postprocess_time_s.value()
        track_time = self.tracking_time_s.value()

        total_time = pre_time + infer_time + post_time
        if self.with_tracker:
            total_time = total_time + track_time
        total_time = round(total_time, 4)
        print("------------------ Inference Time Info ----------------------")
        print("total_time(ms): {}, img_num: {}".format(total_time * 1000,
                                                       self.img_num))
        preprocess_time = round(pre_time / max(1, self.img_num),
                                4) if average else pre_time
        postprocess_time = round(post_time / max(1, self.img_num),
                                 4) if average else post_time
        inference_time = round(infer_time / max(1, self.img_num),
                               4) if average else infer_time
        tracking_time = round(track_time / max(1, self.img_num),
                              4) if average else track_time

        average_latency = total_time / max(1, self.img_num)
        qps = 0
...@@ -204,25 +213,36 @@ class Timer(Times):
            qps = 1 / average_latency
        print("average latency time(ms): {:.2f}, QPS: {:2f}".format(
            average_latency * 1000, qps))
        if self.with_tracker:
            print(
                "preprocess_time(ms): {:.2f}, inference_time(ms): {:.2f}, postprocess_time(ms): {:.2f}, tracking_time(ms): {:.2f}".
                format(preprocess_time * 1000, inference_time * 1000,
                       postprocess_time * 1000, tracking_time * 1000))
        else:
            print(
                "preprocess_time(ms): {:.2f}, inference_time(ms): {:.2f}, postprocess_time(ms): {:.2f}".
                format(preprocess_time * 1000, inference_time * 1000,
                       postprocess_time * 1000))

    def report(self, average=False):
        dic = {}
        pre_time = self.preprocess_time_s.value()
        infer_time = self.inference_time_s.value()
        post_time = self.postprocess_time_s.value()
        track_time = self.tracking_time_s.value()

        dic['preprocess_time_s'] = round(pre_time / max(1, self.img_num),
                                         4) if average else pre_time
        dic['inference_time_s'] = round(infer_time / max(1, self.img_num),
                                        4) if average else infer_time
        dic['postprocess_time_s'] = round(post_time / max(1, self.img_num),
                                          4) if average else post_time
        dic['img_num'] = self.img_num
        total_time = pre_time + infer_time + post_time
        if self.with_tracker:
            dic['tracking_time_s'] = round(track_time / max(1, self.img_num),
                                           4) if average else track_time
            total_time = total_time + track_time
        dic['total_time_s'] = round(total_time, 4)
        return dic
......
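A small sketch of the extended `Timer` in use, assuming the `utils` module above is importable; `with_tracker=True` is what routes the new tracking stage into both `info()` and `report()`:

```python
import time
from utils import Timer

timer = Timer(with_tracker=True)
timer.tracking_time_s.start()
time.sleep(0.01)  # stand-in for a tracker.update() call
timer.tracking_time_s.end()
timer.img_num += 1
timer.info(average=True)           # now also prints tracking_time(ms)
print(timer.report(average=True))  # dict now contains 'tracking_time_s'
```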
...@@ -31,7 +31,7 @@ sys.path.insert(0, parent_path)
from benchmark_utils import PaddleInferBenchmark
from picodet_postprocess import PicoDetPostProcess
from preprocess import preprocess, Resize, NormalizeImage, Permute, PadStride, LetterBoxResize, WarpAffine, decode_image
from keypoint_preprocess import EvalAffine, TopDownEvalAffine, expand_crop
from visualize import visualize_box_mask
from utils import argsparser, Timer, get_current_memory_mb
......
...@@ -25,7 +25,7 @@ from collections import defaultdict
from mot_keypoint_unite_utils import argsparser
from preprocess import decode_image
from infer import print_arguments, get_test_images
from mot_sde_infer import SDE_Detector
from mot_jde_infer import JDE_Detector, MOT_JDE_SUPPORT_MODELS
from keypoint_infer import KeyPointDetector, KEYPOINT_SUPPORT_MODELS
from det_keypoint_unite_infer import predict_with_given_det
......
...@@ -34,13 +34,6 @@ from pptracking.python.mot import JDETracker
from pptracking.python.mot.utils import MOTTimer, write_mot_results
from pptracking.python.visualize import plot_tracking, plot_tracking_dict


class SDE_Detector(Detector):
    """
...@@ -287,7 +280,6 @@ def main():
    with open(deploy_file) as f:
        yml_conf = yaml.safe_load(f)
    arch = yml_conf['arch']
    detector = SDE_Detector(
        FLAGS.model_dir,
        FLAGS.tracker_config,
......
@@ -228,11 +228,11 @@ class Timer(Times):
                                           4) if average else infer_time
         dic['postprocess_time_s'] = round(post_time / max(1, self.img_num),
                                           4) if average else post_time
-        dic['tracking_time_s'] = round(post_time / max(1, self.img_num),
-                                       4) if average else track_time
         dic['img_num'] = self.img_num
         total_time = pre_time + infer_time + post_time
         if self.with_tracker:
+            dic['tracking_time_s'] = round(track_time / max(1, self.img_num),
+                                           4) if average else track_time
             total_time = total_time + track_time
         dic['total_time_s'] = round(total_time, 4)
         return dic
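The old report divided `post_time` when filling `tracking_time_s`, so tracking cost was mis-stated and emitted even when no tracker ran. A minimal standalone sketch of the corrected accounting (hypothetical `report` helper, not the repo's `Timer` class):

```python
# Sketch only: mirrors the fixed logic above, outside the Timer class.
def report(pre_time, infer_time, post_time, track_time, img_num,
           with_tracker=False, average=True):
    n = max(1, img_num)
    avg = lambda t: round(t / n, 4) if average else t
    dic = {
        'preprocess_time_s': avg(pre_time),
        'inference_time_s': avg(infer_time),
        'postprocess_time_s': avg(post_time),
        'img_num': img_num,
    }
    total_time = pre_time + infer_time + post_time
    if with_tracker:
        # the old code divided post_time here, under-reporting tracking time
        dic['tracking_time_s'] = avg(track_time)
        total_time += track_time
    dic['total_time_s'] = round(total_time, 4)
    return dic

print(report(1.0, 8.0, 0.5, 2.0, 100, with_tracker=True))
```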
...
-# coding: utf-8
-# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
...
@@ -41,6 +41,7 @@ TRT_MIN_SUBGRAPH = {
     'HigherHRNet': 3,
     'HRNet': 3,
     'DeepSORT': 3,
+    'ByteTrack': 10,
     'JDE': 10,
     'FairMOT': 5,
     'GFL': 16,
@@ -50,7 +51,7 @@ TRT_MIN_SUBGRAPH = {
 }

 KEYPOINT_ARCH = ['HigherHRNet', 'TopDownHRNet']
-MOT_ARCH = ['DeepSORT', 'JDE', 'FairMOT']
+MOT_ARCH = ['DeepSORT', 'JDE', 'FairMOT', 'ByteTrack']

 def _prune_input_spec(input_spec, program, targets):
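Registering `'ByteTrack'` in both tables is what lets the export tooling pick a TensorRT minimum-subgraph size and route the model through the MOT export path. A toy lookup under the values shown above (standalone sketch, not the export code itself):

```python
# Standalone sketch; values copied from the diff above.
TRT_MIN_SUBGRAPH = {'DeepSORT': 3, 'ByteTrack': 10, 'JDE': 10, 'FairMOT': 5}
MOT_ARCH = ['DeepSORT', 'JDE', 'FairMOT', 'ByteTrack']

def min_subgraph_size(arch):
    # an arch missing here would have no TRT subgraph setting at export time
    assert arch in TRT_MIN_SUBGRAPH, '{} has no TRT subgraph setting'.format(arch)
    return TRT_MIN_SUBGRAPH[arch]

print(min_subgraph_size('ByteTrack'))  # 10
print('ByteTrack' in MOT_ARCH)         # True -> exported as a MOT pipeline
```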
...
@@ -29,6 +29,7 @@ from ppdet.core.workspace import create
 from ppdet.utils.checkpoint import load_weight, load_pretrain_weight
 from ppdet.modeling.mot.utils import Detection, get_crops, scale_coords, clip_box
 from ppdet.modeling.mot.utils import MOTTimer, load_det_results, write_mot_results, save_vis_results
+from ppdet.modeling.mot.tracker import JDETracker, DeepSORTTracker
 from ppdet.metrics import Metric, MOTMetric, KITTIMOTMetric
 from ppdet.metrics import MCMOTMetric
@@ -39,6 +40,11 @@ from .callbacks import Callback, ComposeCallback
 from ppdet.utils.logger import setup_logger
 logger = setup_logger(__name__)

+MOT_ARCH = ['DeepSORT', 'JDE', 'FairMOT', 'ByteTrack']
+MOT_ARCH_JDE = ['JDE', 'FairMOT']
+MOT_ARCH_SDE = ['DeepSORT', 'ByteTrack']
+MOT_DATA_TYPE = ['mot', 'mcmot', 'kitti']
+
 __all__ = ['Tracker']
@@ -109,11 +115,15 @@ class Tracker(object):
         load_weight(self.model, weights, self.optimizer)

     def load_weights_sde(self, det_weights, reid_weights):
-        if self.model.detector:
-            load_weight(self.model.detector, det_weights)
-            load_weight(self.model.reid, reid_weights)
+        with_detector = self.model.detector is not None
+        with_reid = self.model.reid is not None
+
+        if with_detector:
+            load_weight(self.model.detector, det_weights, self.optimizer)
+            if with_reid:
+                load_weight(self.model.reid, reid_weights)
         else:
-            load_weight(self.model.reid, reid_weights, self.optimizer)
+            load_weight(self.model.reid, reid_weights)
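The rewrite makes detector and ReID weights independently optional, which is exactly what ByteTrack (detector-only) needs. Assuming the branching above, the supported call shapes for a `Tracker` instance look like this (checkpoint paths are hypothetical):

```python
# DeepSORT with its own detector: both checkpoints load.
tracker.load_weights_sde('det.pdparams', 'reid.pdparams')
# ByteTrack: detector only, there is no ReID branch to load.
tracker.load_weights_sde('det.pdparams', None)
# DeepSORT on offline detection results: ReID only.
tracker.load_weights_sde(None, 'reid.pdparams')
```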
     def _eval_seq_jde(self,
                       dataloader,
@@ -185,18 +195,21 @@ class Tracker(object):
         if save_dir:
             if not os.path.exists(save_dir): os.makedirs(save_dir)
         use_detector = False if not self.model.detector else True
+        use_reid = False if not self.model.reid else True

         timer = MOTTimer()
         results = defaultdict(list)
         frame_id = 0
         self.status['mode'] = 'track'
         self.model.eval()
-        self.model.reid.eval()
+        if use_reid:
+            self.model.reid.eval()
         if not use_detector:
             dets_list = load_det_results(det_file, len(dataloader))
             logger.info('Finish loading detection results file {}.'.format(
                 det_file))

+        tracker = self.model.tracker
         for step_id, data in enumerate(dataloader):
             self.status['step_id'] = step_id
             if frame_id % 40 == 0:
@@ -257,6 +270,8 @@ class Tracker(object):
                         scale_factor)
                 else:
                     pred_bboxes = outs['bbox'][:, 2:]
+                pred_dets_old = np.concatenate(
+                    (pred_cls_ids, pred_scores, pred_bboxes), axis=1)
             else:
                 logger.warning(
                     'Frame {} has not detected object, try to modify score threshold.'.
@@ -284,50 +299,80 @@ class Tracker(object):
                 pred_cls_ids = pred_cls_ids[keep_idx[0]]
                 pred_scores = pred_scores[keep_idx[0]]
-                pred_tlwhs = np.concatenate(
-                    (pred_xyxys[:, 0:2],
-                     pred_xyxys[:, 2:4] - pred_xyxys[:, 0:2] + 1),
-                    axis=1)
-                pred_dets = np.concatenate(
-                    (pred_cls_ids, pred_scores, pred_tlwhs), axis=1)
-
-                tracker = self.model.tracker
-                crops = get_crops(
-                    pred_xyxys,
-                    ori_image,
-                    w=tracker.input_size[0],
-                    h=tracker.input_size[1])
-                crops = paddle.to_tensor(crops)
-                data.update({'crops': crops})
-                pred_embs = self.model(data).numpy()
-
-                tracker.predict()
-                online_targets = tracker.update(pred_dets, pred_embs)
-
-                online_tlwhs, online_scores, online_ids = [], [], []
-                for t in online_targets:
-                    if not t.is_confirmed() or t.time_since_update > 1:
-                        continue
-                    tlwh = t.to_tlwh()
-                    tscore = t.score
-                    tid = t.track_id
-                    if tscore < draw_threshold: continue
-                    if tlwh[2] * tlwh[3] <= tracker.min_box_area: continue
-                    if tracker.vertical_ratio > 0 and tlwh[2] / tlwh[
-                            3] > tracker.vertical_ratio:
-                        continue
-                    online_tlwhs.append(tlwh)
-                    online_scores.append(tscore)
-                    online_ids.append(tid)
-                timer.toc()
-
-                # save results
-                results[0].append(
-                    (frame_id + 1, online_tlwhs, online_scores, online_ids))
-                save_vis_results(data, frame_id, online_ids, online_tlwhs,
-                                 online_scores, timer.average_time, show_image,
-                                 save_dir, self.cfg.num_classes)
+                pred_dets = np.concatenate(
+                    (pred_cls_ids, pred_scores, pred_xyxys), axis=1)
+
+                if use_reid:
+                    crops = get_crops(
+                        pred_xyxys,
+                        ori_image,
+                        w=tracker.input_size[0],
+                        h=tracker.input_size[1])
+                    crops = paddle.to_tensor(crops)
+                    data.update({'crops': crops})
+                    pred_embs = self.model(data).numpy()
+                else:
+                    pred_embs = None
+
+            if isinstance(tracker, DeepSORTTracker):
+                online_tlwhs, online_scores, online_ids = [], [], []
+                tracker.predict()
+                online_targets = tracker.update(pred_dets, pred_embs)
+                for t in online_targets:
+                    if not t.is_confirmed() or t.time_since_update > 1:
+                        continue
+                    tlwh = t.to_tlwh()
+                    tscore = t.score
+                    tid = t.track_id
+                    if tscore < draw_threshold: continue
+                    if tlwh[2] * tlwh[3] <= tracker.min_box_area: continue
+                    if tracker.vertical_ratio > 0 and tlwh[2] / tlwh[
+                            3] > tracker.vertical_ratio:
+                        continue
+                    online_tlwhs.append(tlwh)
+                    online_scores.append(tscore)
+                    online_ids.append(tid)
+                timer.toc()
+
+                # save results
+                results[0].append(
+                    (frame_id + 1, online_tlwhs, online_scores, online_ids))
+                save_vis_results(data, frame_id, online_ids, online_tlwhs,
+                                 online_scores, timer.average_time, show_image,
+                                 save_dir, self.cfg.num_classes)
+            elif isinstance(tracker, JDETracker):
+                # trick hyperparams only used for MOTChallenge (MOT17, MOT20) Test-set
+                tracker.track_buffer, tracker.conf_thres = get_trick_hyperparams(
+                    seq_name, tracker.track_buffer, tracker.conf_thres)
+
+                online_targets_dict = tracker.update(pred_dets_old, pred_embs)
+                online_tlwhs = defaultdict(list)
+                online_scores = defaultdict(list)
+                online_ids = defaultdict(list)
+                for cls_id in range(self.cfg.num_classes):
+                    online_targets = online_targets_dict[cls_id]
+                    for t in online_targets:
+                        tlwh = t.tlwh
+                        tid = t.track_id
+                        tscore = t.score
+                        if tlwh[2] * tlwh[3] <= tracker.min_box_area: continue
+                        if tracker.vertical_ratio > 0 and tlwh[2] / tlwh[
+                                3] > tracker.vertical_ratio:
+                            continue
+                        online_tlwhs[cls_id].append(tlwh)
+                        online_ids[cls_id].append(tid)
+                        online_scores[cls_id].append(tscore)
+                    # save results
+                    results[cls_id].append(
+                        (frame_id + 1, online_tlwhs[cls_id],
+                         online_scores[cls_id], online_ids[cls_id]))
+                timer.toc()
+                save_vis_results(data, frame_id, online_ids, online_tlwhs,
+                                 online_scores, timer.average_time, show_image,
+                                 save_dir, self.cfg.num_classes)

             frame_id += 1
         return results, frame_id, timer.average_time, timer.calls
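Both tracker branches now consume detections as `[class_id, score, x1, y1, x2, y2]` rows; the tlwh conversion happens inside `DeepSORTTracker.update` (see the tracker hunk further down). A small NumPy illustration of the layout built here, with toy values:

```python
import numpy as np

# Toy detections: two boxes of class 0 with their scores.
pred_cls_ids = np.array([[0.], [0.]])
pred_scores  = np.array([[0.9], [0.8]])
pred_xyxys   = np.array([[10., 20., 50., 80.],
                         [30., 40., 70., 90.]])  # x1, y1, x2, y2

pred_dets = np.concatenate((pred_cls_ids, pred_scores, pred_xyxys), axis=1)
print(pred_dets.shape)  # (2, 6): [cls_id, score, x1, y1, x2, y2] per row
```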
@@ -346,10 +391,10 @@ class Tracker(object):
         if not os.path.exists(output_dir): os.makedirs(output_dir)
         result_root = os.path.join(output_dir, 'mot_results')
         if not os.path.exists(result_root): os.makedirs(result_root)
-        assert data_type in ['mot', 'mcmot', 'kitti'], \
-            "data_type should be 'mot', 'mcmot' or 'kitti'"
-        assert model_type in ['JDE', 'DeepSORT', 'FairMOT'], \
-            "model_type should be 'JDE', 'DeepSORT' or 'FairMOT'"
+        assert data_type in MOT_DATA_TYPE, \
+            "data_type should be 'mot', 'mcmot' or 'kitti'"
+        assert model_type in MOT_ARCH, \
+            "model_type should be 'JDE', 'DeepSORT', 'FairMOT' or 'ByteTrack'"

         # run tracking
         n_frame = 0
@@ -380,13 +425,13 @@ class Tracker(object):
             result_filename = os.path.join(result_root, '{}.txt'.format(seq))

             with paddle.no_grad():
-                if model_type in ['JDE', 'FairMOT']:
+                if model_type in MOT_ARCH_JDE:
                     results, nf, ta, tc = self._eval_seq_jde(
                         dataloader,
                         save_dir=save_dir,
                         show_image=show_image,
                         frame_rate=frame_rate)
-                elif model_type in ['DeepSORT']:
+                elif model_type in MOT_ARCH_SDE:
                     results, nf, ta, tc = self._eval_seq_sde(
                         dataloader,
                         save_dir=save_dir,
@@ -472,10 +517,10 @@ class Tracker(object):
         if not os.path.exists(output_dir): os.makedirs(output_dir)
         result_root = os.path.join(output_dir, 'mot_results')
         if not os.path.exists(result_root): os.makedirs(result_root)
-        assert data_type in ['mot', 'mcmot', 'kitti'], \
-            "data_type should be 'mot', 'mcmot' or 'kitti'"
-        assert model_type in ['JDE', 'DeepSORT', 'FairMOT'], \
-            "model_type should be 'JDE', 'DeepSORT' or 'FairMOT'"
+        assert data_type in MOT_DATA_TYPE, \
+            "data_type should be 'mot', 'mcmot' or 'kitti'"
+        assert model_type in MOT_ARCH, \
+            "model_type should be 'JDE', 'DeepSORT', 'FairMOT' or 'ByteTrack'"

         # run tracking
         if video_file:
@@ -505,14 +550,14 @@ class Tracker(object):
             frame_rate = self.dataset.frame_rate

         with paddle.no_grad():
-            if model_type in ['JDE', 'FairMOT']:
+            if model_type in MOT_ARCH_JDE:
                 results, nf, ta, tc = self._eval_seq_jde(
                     dataloader,
                     save_dir=save_dir,
                     show_image=show_image,
                     frame_rate=frame_rate,
                     draw_threshold=draw_threshold)
-            elif model_type in ['DeepSORT']:
+            elif model_type in MOT_ARCH_SDE:
                 results, nf, ta, tc = self._eval_seq_sde(
                     dataloader,
                     save_dir=save_dir,
@@ -536,3 +581,34 @@ class Tracker(object):
         write_mot_results(result_filename, results, data_type,
                           self.cfg.num_classes)

+def get_trick_hyperparams(video_name, ori_buffer, ori_thresh):
+    if video_name[:3] != 'MOT':
+        # only used for MOTChallenge (MOT17, MOT20) Test-set
+        return ori_buffer, ori_thresh
+
+    video_name = video_name[:8]
+    if 'MOT17-05' in video_name:
+        track_buffer = 14
+    elif 'MOT17-13' in video_name:
+        track_buffer = 25
+    else:
+        track_buffer = ori_buffer
+
+    if 'MOT17-01' in video_name:
+        track_thresh = 0.65
+    elif 'MOT17-06' in video_name:
+        track_thresh = 0.65
+    elif 'MOT17-12' in video_name:
+        track_thresh = 0.7
+    elif 'MOT17-14' in video_name:
+        track_thresh = 0.67
+    else:
+        track_thresh = ori_thresh
+
+    # MOT20 overrides take priority; otherwise keep the MOT17 value above.
+    if 'MOT20-06' in video_name or 'MOT20-08' in video_name:
+        track_thresh = 0.3
+
+    return track_buffer, track_thresh
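As submitted, the function discarded its per-sequence threshold: the final `else` reset `track_thresh` to `ori_thresh`, and the return statement handed back `ori_thresh` rather than `track_thresh`; both are corrected above. A usage sketch of the corrected behavior (expected outputs assume the function as fixed):

```python
# Per-sequence overrides only kick in for MOTChallenge-style names.
print(get_trick_hyperparams('MOT17-05-SDP', 30, 0.6))  # (14, 0.6)
print(get_trick_hyperparams('MOT17-01-DPM', 30, 0.6))  # (30, 0.65)
print(get_trick_hyperparams('my_video', 30, 0.6))      # (30, 0.6) unchanged
```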
...
@@ -51,7 +51,7 @@ logger = setup_logger('ppdet.engine')
 __all__ = ['Trainer']

-MOT_ARCH = ['DeepSORT', 'JDE', 'FairMOT']
+MOT_ARCH = ['DeepSORT', 'JDE', 'FairMOT', 'ByteTrack']

 class Trainer(object):
...
@@ -308,10 +308,10 @@ class MCMOTEvaluator(object):
     def load_annotations(self):
         assert self.data_type == 'mcmot'
-        self.gt_filename = os.path.join(self.data_root, '../', '../',
+        self.gt_filename = os.path.join(self.data_root, '../',
                                         'sequences',
                                         '{}.txt'.format(self.seq_name))

     def reset_accumulator(self):
         import motmetrics as mm
         mm.lap.default_solver = 'lap'
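The extra `'../'` previously climbed one directory too far and escaped the dataset folder. A quick check of the two joins under an assumed MCMOT layout (paths and sequence name are hypothetical):

```python
import os.path

data_root = 'dataset/mcmot/images'   # assumed layout
seq_name = 'S01'                     # hypothetical sequence name
old = os.path.join(data_root, '../', '../', 'sequences', seq_name + '.txt')
new = os.path.join(data_root, '../', 'sequences', seq_name + '.txt')
print(os.path.normpath(old))  # dataset/sequences/S01.txt -> outside mcmot/
print(os.path.normpath(new))  # dataset/mcmot/sequences/S01.txt
```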
...
@@ -27,6 +27,7 @@ from . import detr
 from . import sparse_rcnn
 from . import tood
 from . import retinanet
+from . import bytetrack

 from .meta_arch import *
 from .faster_rcnn import *
@@ -51,3 +52,4 @@ from .detr import *
 from .sparse_rcnn import *
 from .tood import *
 from .retinanet import *
+from .bytetrack import *
...
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from ppdet.core.workspace import register, create
from .meta_arch import BaseArch
__all__ = ['ByteTrack']
@register
class ByteTrack(BaseArch):
"""
    ByteTrack network, see https://arxiv.org/abs/2110.06864
Args:
detector (object): detector model instance
reid (object): reid model instance
tracker (object): tracker instance
"""
__category__ = 'architecture'
def __init__(self,
detector='YOLOX',
reid=None,
tracker='JDETracker'):
super(ByteTrack, self).__init__()
self.detector = detector
self.reid = reid
self.tracker = tracker
@classmethod
def from_config(cls, cfg, *args, **kwargs):
detector = create(cfg['detector'])
if cfg['reid'] != 'None':
reid = create(cfg['reid'])
else:
reid = None
tracker = create(cfg['tracker'])
return {
"detector": detector,
"reid": reid,
"tracker": tracker,
}
def _forward(self):
det_outs = self.detector(self.inputs)
if self.training:
return det_outs
else:
if self.reid is not None:
assert 'crops' in self.inputs
crops = self.inputs['crops']
pred_embs = self.reid(crops)
else:
pred_embs = None
det_outs['embeddings'] = pred_embs
return det_outs
def get_loss(self):
return self._forward()
def get_pred(self):
return self._forward()
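The architecture is a thin composition: in training it returns the detector losses unchanged, and at inference it attaches optional ReID embeddings to the detector outputs. A paddle-free mimic of that forward contract (all names here are hypothetical stand-ins, not the repo's classes):

```python
# Standalone sketch of ByteTrack._forward, without paddle.
class FakeDetector:
    def __call__(self, inputs):
        # pretend detector output: one box as [cls_id, score, x1, y1, x2, y2]
        return {'bbox': [[0, 0.9, 10, 20, 50, 80]], 'bbox_num': [1]}

def forward(detector, reid, inputs, training=False):
    det_outs = detector(inputs)
    if training:
        return det_outs
    # ByteTrack usually runs without ReID: association uses boxes + scores.
    det_outs['embeddings'] = reid(inputs['crops']) if reid else None
    return det_outs

print(forward(FakeDetector(), None, {'image': 'img'})['embeddings'])  # None
```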
@@ -102,7 +102,8 @@ class DeepSORTTracker(object):
         """
         pred_cls_ids = pred_dets[:, 0:1]
         pred_scores = pred_dets[:, 1:2]
-        pred_tlwhs = pred_dets[:, 2:6]
+        pred_xyxys = pred_dets[:, 2:6]
+        pred_tlwhs = np.concatenate((pred_xyxys[:, 0:2], pred_xyxys[:, 2:4] - pred_xyxys[:, 0:2] + 1), axis=1)

         detections = [
             Detection(tlwh, score, feat, cls_id)
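A worked example of the conversion added above (the `+ 1` makes width and height inclusive of the end pixel):

```python
import numpy as np

pred_xyxys = np.array([[10., 20., 50., 80.]])       # x1, y1, x2, y2
pred_tlwhs = np.concatenate(
    (pred_xyxys[:, 0:2],                            # top-left x, y
     pred_xyxys[:, 2:4] - pred_xyxys[:, 0:2] + 1),  # width, height
    axis=1)
print(pred_tlwhs)  # [[10. 20. 41. 61.]]
```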
...
@@ -34,9 +34,6 @@ from ppdet.engine import Tracker
 from ppdet.utils.check import check_gpu, check_version, check_config
 from ppdet.utils.cli import ArgsParser
-from ppdet.utils.logger import setup_logger
-logger = setup_logger('eval')

 def parse_args():
     parser = ArgsParser()
@@ -83,11 +80,8 @@ def run(FLAGS, cfg):
     tracker = Tracker(cfg, mode='eval')

     # load weights
-    if cfg.architecture in ['DeepSORT']:
-        if cfg.det_weights != 'None':
-            tracker.load_weights_sde(cfg.det_weights, cfg.reid_weights)
-        else:
-            tracker.load_weights_sde(None, cfg.reid_weights)
+    if cfg.architecture in ['DeepSORT', 'ByteTrack']:
+        tracker.load_weights_sde(cfg.det_weights, cfg.reid_weights)
     else:
         tracker.load_weights_jde(cfg.weights)
...
@@ -28,7 +28,6 @@ import warnings
 warnings.filterwarnings('ignore')
 import paddle
-from ppdet.core.workspace import load_config, merge_config
 from ppdet.utils.check import check_gpu, check_version, check_config
 from ppdet.utils.cli import ArgsParser
@@ -65,11 +64,8 @@ def run(FLAGS, cfg):
     trainer = Trainer(cfg, mode='test')

     # load weights
-    if cfg.architecture in ['DeepSORT']:
-        if cfg.det_weights != 'None':
-            trainer.load_weights_sde(cfg.det_weights, cfg.reid_weights)
-        else:
-            trainer.load_weights_sde(None, cfg.reid_weights)
+    if cfg.architecture in ['DeepSORT', 'ByteTrack']:
+        trainer.load_weights_sde(cfg.det_weights, cfg.reid_weights)
     else:
         trainer.load_weights(cfg.weights)
...
@@ -34,9 +34,6 @@ from ppdet.engine import Tracker
 from ppdet.utils.check import check_gpu, check_version, check_config
 from ppdet.utils.cli import ArgsParser
-from ppdet.utils.logger import setup_logger
-logger = setup_logger('train')

 def parse_args():
     parser = ArgsParser()
@@ -94,11 +91,8 @@ def run(FLAGS, cfg):
     tracker = Tracker(cfg, mode='test')

     # load weights
-    if cfg.architecture in ['DeepSORT']:
-        if cfg.det_weights != 'None':
-            tracker.load_weights_sde(cfg.det_weights, cfg.reid_weights)
-        else:
-            tracker.load_weights_sde(None, cfg.reid_weights)
+    if cfg.architecture in ['DeepSORT', 'ByteTrack']:
+        tracker.load_weights_sde(cfg.det_weights, cfg.reid_weights)
     else:
         tracker.load_weights_jde(cfg.weights)
...