English | [简体中文](mot.md)

# Detection and Tracking Module of PP-Human

Pedestrian detection and tracking are widely used in intelligent communities, industrial inspection, traffic monitoring, and other scenarios. PP-Human provides a detection and tracking module that underpins downstream tasks such as keypoint detection and attribute/action recognition, and pretrained models are readily available for download here.

| Task                 | Algorithm | Precision | Inference Speed (ms) | Download Link                                                                               |
|:---------------------|:---------:|:------:|:------:| :---------------------------------------------------------------------------------: |
| Pedestrian Detection/Tracking    |  PP-YOLOE-l | mAP: 56.6 <br> MOTA: 79.5 | Detection: 28.0ms <br> Tracking: 33.1ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
| Pedestrian Detection/Tracking    |  PP-YOLOE-s | mAP: 53.2 <br> MOTA: 69.2 | Detection: 22.1ms <br> Tracking: 27.2ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) |

1. The precision of the pedestrian detection/tracking model is obtained by training and testing on [COCO-Person](http://cocodataset.org/), [CrowdHuman](http://www.crowdhuman.org/), [HIEVE](http://humaninevents.org/), and some business data.
2. The inference speed is measured with TensorRT FP16 on a T4 GPU, and the reported time covers data pre-processing, model inference, and post-processing.

## How to Use

1. Download models from the links in the table above and unzip them to `./output_inference`.
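If helpful, a minimal Python sketch of this step (the URL is the PP-YOLOE-l link from the table above; swap in the model you need):

```python
# Download the pedestrian detection/tracking model and unzip it
# into ./output_inference (URL taken from the table above).
import os
import urllib.request
import zipfile

URL = "https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip"
os.makedirs("output_inference", exist_ok=True)
zip_path = os.path.join("output_inference", os.path.basename(URL))
urllib.request.urlretrieve(URL, zip_path)
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall("output_inference")
```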
2. When the input is an image, a detection task is run. The start command is as follows:
```bash
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
                                   --image_file=test_image.jpg \
                                   --device=gpu
```
3. When the input is a video, a tracking task is run. First set `enable: True` under the `MOT` field of `deploy/pipeline/config/infer_cfg_pphuman.yml`.
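The relevant section looks roughly like this sketch (key names follow a typical `infer_cfg_pphuman.yml`; check your local copy, since layouts may differ across versions):

```yaml
MOT:
  model_dir: output_inference/mot_ppyoloe_l_36e_pipeline/  # example local path
  tracker_config: deploy/pipeline/config/tracker_config.yml
  batch_size: 1
  enable: True  # turn the tracking task on
```

Then start the tracking task: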
```bash
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
                                   --video_file=test_video.mp4 \
                                   --device=gpu
```
4. There are two ways to modify the model path:

    - In `./deploy/pipeline/config/infer_cfg_pphuman.yml`, you can configure different model paths: the detection and tracking models correspond to the `DET` and `MOT` fields respectively. Modify the `model_dir` entry of each field to the expected path, as in the sketch below.
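For reference, a sketch of the two fields (the `model_dir` values are example paths; exact keys may vary with the config version):

```yaml
DET:
  model_dir: output_inference/mot_ppyoloe_l_36e_pipeline/  # detection model path (example)
  batch_size: 1

MOT:
  model_dir: output_inference/mot_ppyoloe_l_36e_pipeline/  # tracking model path (example)
  tracker_config: deploy/pipeline/config/tracker_config.yml
  batch_size: 1
  enable: True
```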
    - Add `--model_dir` in the command line to override the model path:

```bash
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
                                   --video_file=test_video.mp4 \
                                   --device=gpu \
                                   --model_dir det=ppyoloe/ \
                                   --do_entrance_counting \
                                   --draw_center_traj
```
**Note:**

 - `--do_entrance_counting`: whether to count entrance/exit flow at the gateway; the default is False.
 - `--draw_center_traj`: whether to draw the center trajectory of each track; the default is False. Note that the test video for trajectory drawing should be filmed with a stationary camera.

The test result is shown below:

<div width="1000" align="center">
  <img src="../images/mot.gif"/>
</div>

Data source and copyright owner: Skyinfor Technology. Thanks for providing the actual scenario data, which are used here for academic research only.


## Introduction to the Solution

1. Obtain the pedestrian detection boxes from the image/video input through object detection and multi-object tracking. The adopted model is PP-YOLOE; for details, please refer to [PP-YOLOE](../../../configs/ppyoloe).

2. The multi-object tracking solution is based on [ByteTrack](https://arxiv.org/pdf/2110.06864.pdf), replacing the original YOLOX detector with PP-YOLOE and using BYTETracker as the tracker; for details, please refer to [ByteTrack](../../../configs/mot/bytetrack). A toy sketch of the association idea follows.
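To make the idea concrete, here is a toy sketch of the BYTE association step. It is an illustration only, not the PP-Human implementation: the helper names and thresholds are made up, and the real tracker uses Kalman-filter prediction and Hungarian matching rather than the greedy matcher below.

```python
# Toy sketch of BYTE association: high-score detections are matched to
# existing tracks first; tracks still unmatched get a second pass against
# low-score detections, recovering occluded targets instead of dropping them.

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_match(tracks, dets, thresh=0.3):
    """Greedy IoU matching (the real tracker uses the Hungarian algorithm)."""
    matches, used, unmatched = [], set(), []
    for t in tracks:
        best_j, best_iou = None, thresh
        for j, d in enumerate(dets):
            score = iou(t["box"], d["box"])
            if j not in used and score > best_iou:
                best_j, best_iou = j, score
        if best_j is None:
            unmatched.append(t)
        else:
            used.add(best_j)
            matches.append((t, dets[best_j]))
    return matches, unmatched

def byte_associate(tracks, dets, high_score=0.6):
    """Two-stage association over one frame's detections."""
    high = [d for d in dets if d["score"] >= high_score]
    low = [d for d in dets if d["score"] < high_score]
    matches, leftover = greedy_match(tracks, high)   # first association
    matches2, lost = greedy_match(leftover, low)     # second association
    return matches + matches2, lost
```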

## Reference
```
@article{zhang2021bytetrack,
  title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
  author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
  journal={arXiv preprint arXiv:2110.06864},
  year={2021}
}
```