未验证 提交 cdd750a4 编写于 作者: F Feng Ni 提交者: GitHub

[MOT] update pptracking and mot doc (#4779)

* update pptracking and mot doc, test=document_fix

* update docs, test=document_fix

* fix notes, test=document_fix
上级 efd21dd9
......@@ -9,22 +9,30 @@
- [数据集准备](#数据集准备)
- [引用](#引用)
## 简介
当前主流的Tracking By Detecting方式的多目标追踪(Multi-Object Tracking, MOT)算法主要由两部分组成:Detection+Embedding。Detection部分即针对视频,检测出每一帧中的潜在目标。Embedding部分则将检出的目标分配和更新到已有的对应轨迹上(即ReID重识别任务)。根据这两部分实现的不同,又可以划分为**SDE**系列和**JDE**系列算法。
- SDE(Separate Detection and Embedding)这类算法完全分离Detection和Embedding两个环节,最具代表性的就是**DeepSORT**算法。这样的设计可以使系统无差别的适配各类检测器,可以针对两个部分分别调优,但由于流程上是串联的导致速度慢耗时较长,在构建实时MOT系统中面临较大挑战。
- JDE(Joint Detection and Embedding)这类算法完是在一个共享神经网络中同时学习Detection和Embedding,使用一个多任务学习的思路设置损失函数。代表性的算法有**JDE****FairMOT**。这样的设计兼顾精度和速度,可以实现高精度的实时多目标跟踪。
当前主流的多目标追踪(MOT)算法主要由两部分组成:Detection+Embedding。Detection部分即针对视频,检测出每一帧中的潜在目标。Embedding部分则将检出的目标分配和更新到已有的对应轨迹上(即ReID重识别任务)。根据这两部分实现的不同,又可以划分为**SDE**系列和**JDE**系列算法
PaddleDetection实现了这两个系列的3种多目标跟踪算法,分别是SDE系列的[DeepSORT](https://arxiv.org/abs/1812.00442)和JDE系列的[JDE](https://arxiv.org/abs/1909.12605)[FairMOT](https://arxiv.org/abs/2004.01888)
- SDE(Separate Detection and Embedding)这类算法完全分离Detection和Embedding两个环节,最具代表性的就是**DeepSORT**算法。这样的设计可以使系统无差别的适配各类检测器,可以针对两个部分分别调优,但由于流程上是串联的导致速度慢耗时较长,在构建实时MOT系统中面临较大挑战。
### PP-Tracking 实时多目标跟踪系统
此外,PaddleDetection还提供了[PP-Tracking](../../deploy/pptracking/README.md)实时多目标跟踪系统。PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统,具有模型丰富、应用广泛和部署高效三大优势。
PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式,针对实际业务的难点和痛点,提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用,部署方式支持API调用和GUI可视化界面,部署语言支持Python和C++,部署平台环境支持Linux、NVIDIA Jetson等。
- JDE(Joint Detection and Embedding)这类算法完是在一个共享神经网络中同时学习Detection和Embedding,使用一个多任务学习的思路设置损失函数。代表性的算法有**JDE****FairMOT**。这样的设计兼顾精度和速度,可以实现高精度的实时多目标跟踪。
### AI Studio公开项目案例
PP-Tracking 提供了AI Studio公开项目案例,教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)
PaddleDetection实现了这两个系列的3种多目标跟踪算法。
- [DeepSORT](https://arxiv.org/abs/1812.00442)(Deep Cosine Metric Learning SORT) 扩展了原有的[SORT](https://arxiv.org/abs/1703.07402)(Simple Online and Realtime Tracking)算法,增加了一个CNN模型用于在检测器限定的人体部分图像中提取特征,在深度外观描述的基础上整合外观信息,将检出的目标分配和更新到已有的对应轨迹上即进行一个ReID重识别任务。DeepSORT所需的检测框可以由任意一个检测器来生成,然后读入保存的检测结果和视频图片即可进行跟踪预测。ReID模型此处选择[PaddleClas](https://github.com/PaddlePaddle/PaddleClas)提供的`PCB+Pyramid ResNet101``PPLCNet`模型
### Python端预测部署
PP-Tracking 支持Python预测部署,教程请参考[PP-Tracking Python部署文档](python/README.md)
- [JDE](https://arxiv.org/abs/1909.12605)(Joint Detection and Embedding)是在一个单一的共享神经网络中同时学习目标检测任务和embedding任务,并同时输出检测结果和对应的外观embedding匹配的算法。JDE原论文是基于Anchor Base的YOLOv3检测器新增加一个ReID分支学习embedding,训练过程被构建为一个多任务联合学习问题,兼顾精度和速度。
### C++端预测部署
PP-Tracking 支持C++预测部署,教程请参考[PP-Tracking C++部署文档](cpp/README.md)
- [FairMOT](https://arxiv.org/abs/2004.01888)以Anchor Free的CenterNet检测器为基础,克服了Anchor-Based的检测框架中anchor和特征不对齐问题,深浅层特征融合使得检测和ReID任务各自获得所需要的特征,并且使用低维度ReID特征,提出了一种由两个同质分支组成的简单baseline来预测像素级目标得分和ReID特征,实现了两个任务之间的公平性,并获得了更高水平的实时多目标跟踪精度。
### GUI可视化界面预测部署
PP-Tracking 提供了简洁的GUI可视化界面,教程请参考[PP-Tracking可视化界面试用版使用文档](https://github.com/yangyudong2020/PP-Tracking_GUi)
[PP-Tracking](../../deploy/pptracking/README.md)是基于PaddlePaddle深度学习框架的业界首个开源实时跟踪系统。针对实际业务的难点痛点,PP-Tracking内置行人车辆跟踪、跨镜头跟踪、多类别跟踪、小目标跟踪及流量计数等能力与产业应用,同时提供可视化开发界面。模型集成多目标跟踪,目标检测,ReID轻量级算法,进一步提升PP-Tracking在服务器端部署性能。同时支持python,C++部署,适配Linux,Nvidia Jetson多平台环境。
<div width="1000" align="center">
<img src="../../docs/images/pptracking.png"/>
</div>
......@@ -32,12 +40,11 @@ PaddleDetection实现了这两个系列的3种多目标跟踪算法。
<div width="1000" align="center">
<img src="../../docs/images/pptracking-demo.gif"/>
<br>
视频来源:VisDrone2021, BDD100K开源数据集</div>
视频来源:VisDrone和BDD100K公开数据集</div>
</div>
## 安装依赖
一键安装MOT相关的依赖:
```
pip install lap sklearn motmetrics openpyxl cython_bbox
......@@ -48,8 +55,8 @@ pip install -r requirements.txt
- `cython_bbox`在windows上安装:`pip install -e git+https://github.com/samson-wang/cython_bbox.git#egg=cython-bbox`。可参考这个[教程](https://stackoverflow.com/questions/60349980/is-there-a-way-to-install-cython-bbox-for-windows)
- 预测需确保已安装[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
## 模型库
## 模型库
- 基础模型
- [DeepSORT](deepsort/README_cn.md)
- [JDE](jde/README_cn.md)
......@@ -63,46 +70,30 @@ pip install -r requirements.txt
- 跨境头跟踪
- [跨境头跟踪](mtmct/README_cn.md)
## 数据集准备
## 数据集准备
### MOT数据集
PaddleDetection使用和[JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) 还有[FairMOT](https://github.com/ifzhang/FairMOT)相同的数据集。请参照[数据准备文档](../../docs/tutorials/PrepareMOTDataSet_cn.md)去下载并准备好所有的数据集包括**Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17和MOT16**。使用前6者作为联合数据集参与训练,MOT16作为评测数据集。此外还可以使用**MOT15和MOT20**进行finetune。所有的行人都有检测框标签,部分有ID标签。如果您想使用这些数据集,请**遵循他们的License**
PaddleDetection复现[JDE](https://github.com/Zhongdao/Towards-Realtime-MOT)[FairMOT](https://github.com/ifzhang/FairMOT),是使用的和他们相同的MIX数据集,包括**Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17和MOT16**。使用前6者作为联合数据集参与训练,MOT16作为评测数据集。如果您想使用这些数据集,请**遵循他们的License**
为了训练更多场景的垂类模型,垂类数据集也是处理成与MIX数据集相同格式,请参照[数据准备文档](../../docs/tutorials/PrepareMOTDataSet_cn.md)去准备数据集。
### 数据格式
这几个相关数据集都遵循以下结构:
```
Caltech
|——————images
| └——————00001.jpg
| |—————— ...
| └——————0000N.jpg
└——————labels_with_ids
└——————00001.txt
|—————— ...
└——————0000N.txt
MOT17
|——————images
| └——————train
| └——————test
└——————labels_with_ids
└——————train
```
所有数据集的标注是以统一数据格式提供的。各个数据集中每张图片都有相应的标注文本。给定一个图像路径,可以通过将字符串`images`替换为`labels_with_ids`并将`.jpg`替换为`.txt`来生成标注文本路径。在标注文本中,每行都描述一个边界框,格式如下:
### 数据集目录
首先按照以下命令下载image_lists.zip并解压放在`PaddleDetection/dataset/mot`目录下:
```
[class] [identity] [x_center] [y_center] [width] [height]
wget https://dataset.bj.bcebos.com/mot/image_lists.zip
```
**注意**:
- `class``0`,目前仅支持单类别多目标跟踪。
- `identity`是从`1``num_identifies`的整数(`num_identifies`是数据集中不同物体实例的总数),如果此框没有`identity`标注,则为`-1`
- `[x_center] [y_center] [width] [height]`是中心点坐标和宽高,注意他们的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。
### 数据集目录
首先按照以下命令下载image_lists.zip并解压放在`dataset/mot`目录下:
然后按照以下命令可以快速下载MIX数据集的各个子数据集,并解压放在`PaddleDetection/dataset/mot`目录下:
```
wget https://dataset.bj.bcebos.com/mot/image_lists.zip
wget https://dataset.bj.bcebos.com/mot/MOT17.zip
wget https://dataset.bj.bcebos.com/mot/Caltech.zip
wget https://dataset.bj.bcebos.com/mot/CUHKSYSU.zip
wget https://dataset.bj.bcebos.com/mot/PRW.zip
wget https://dataset.bj.bcebos.com/mot/Cityscapes.zip
wget https://dataset.bj.bcebos.com/mot/ETHZ.zip
wget https://dataset.bj.bcebos.com/mot/MOT16.zip
```
然后依次下载各个数据集并解压,最终目录为:
最终目录为:
```
dataset/mot
|——————image_lists
......@@ -115,34 +106,37 @@ dataset/mot
|——————cuhksysu.train
|——————cuhksysu.val
|——————eth.train
|——————mot15.train
|——————mot16.train
|——————mot17.train
|——————mot20.train
|——————prw.train
|——————prw.val
|——————Caltech
|——————Cityscapes
|——————CUHKSYSU
|——————ETHZ
|——————MOT15
|——————MOT16
|——————MOT17
|——————MOT20
|——————PRW
```
### 快速下载
按照以下命令可以快速下载各个数据集,注意需要按以上目录解压和存放
### 数据格式
这几个相关数据集都遵循以下结构
```
wget https://dataset.bj.bcebos.com/mot/MOT17.zip
wget https://dataset.bj.bcebos.com/mot/Caltech.zip
wget https://dataset.bj.bcebos.com/mot/CUHKSYSU.zip
wget https://dataset.bj.bcebos.com/mot/PRW.zip
wget https://dataset.bj.bcebos.com/mot/Cityscapes.zip
wget https://dataset.bj.bcebos.com/mot/ETH.zip
wget https://dataset.bj.bcebos.com/mot/MOT16.zip
MOT17
|——————images
| └——————train
| └——————test
└——————labels_with_ids
└——————train
```
所有数据集的标注是以统一数据格式提供的。各个数据集中每张图片都有相应的标注文本。给定一个图像路径,可以通过将字符串`images`替换为`labels_with_ids`并将`.jpg`替换为`.txt`来生成标注文本路径。在标注文本中,每行都描述一个边界框,格式如下:
```
[class] [identity] [x_center] [y_center] [width] [height]
```
**注意**:
- `class`为类别id,从0开始计,支持单类别和多类别。
- `identity`是从`1``num_identifies`的整数(`num_identifies`是数据集中不同物体实例的总数),如果此框没有`identity`标注,则为`-1`
- `[x_center] [y_center] [width] [height]`是中心点坐标和宽高,注意他们的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。
## 引用
......
......@@ -10,34 +10,43 @@ English | [简体中文](README.md)
- [Citations](#Citations)
## Introduction
The current mainstream multi-objective tracking (MOT) algorithm is mainly composed of two parts: detection and embedding. Detection aims to detect the potential targets in each frame of the video. Embedding assigns and updates the detected target to the corresponding track (named ReID task). According to the different implementation of these two parts, it can be divided into **SDE** series and **JDE** series algorithm.
The current mainstream of 'Tracking By Detecting' multi-object tracking (MOT) algorithm is mainly composed of two parts: detection and embedding. Detection aims to detect the potential targets in each frame of the video. Embedding assigns and updates the detected target to the corresponding track (named ReID task). According to the different implementation of these two parts, it can be divided into **SDE** series and **JDE** series algorithm.
- **SDE** (Separate Detection and Embedding) is a kind of algorithm which completely separates Detection and Embedding. The most representative is **DeepSORT** algorithm. This design can make the system fit any kind of detectors without difference, and can be improved for each part separately. However, due to the series process, the speed is slow. Time-consuming is a great challenge in the construction of real-time MOT system.
- **JDE** (Joint Detection and Embedding) is to learn detection and embedding simultaneously in a shared neural network, and set the loss function with a multi task learning approach. The representative algorithms are **JDE** and **FairMOT**. This design can achieve high-precision real-time MOT performance.
Paddledetection implements three MOT algorithms of these two series.
Paddledetection implements three MOT algorithms of these two series, they are [DeepSORT](https://arxiv.org/abs/1812.00442) of SDE algorithm, and [JDE](https://arxiv.org/abs/1909.12605),[FairMOT](https://arxiv.org/abs/2004.01888) of JDE algorithm.
### PP-Tracking real-time MOT system
In addition, PaddleDetection also provides [PP-Tracking](../../deploy/pptracking/README.md) real-time multi-object tracking system.
PP-Tracking is the first open source real-time Multi-Object Tracking system, and it is based on PaddlePaddle deep learning framework. It has rich models, wide application and high efficiency deployment.
PP-Tracking supports two paradigms: single camera tracking (MOT) and multi-camera tracking (MTMCT). Aiming at the difficulties and pain points of actual business, PP-Tracking provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small object tracking, traffic statistics and multi-camera tracking. The deployment method supports API and GUI visual interface, and the deployment language supports Python and C++, The deployment platform environment supports Linux, NVIDIA Jetson, etc.
### AI studio public project tutorial
PP-tracking provides an AI studio public project tutorial. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
- [DeepSORT](https://arxiv.org/abs/1812.00442) (Deep Cosine Metric Learning SORT) extends the original [SORT](https://arxiv.org/abs/1703.07402) (Simple Online and Realtime Tracking) algorithm, it adds a CNN model to extract features in image of human part bounded by a detector. It integrates appearance information based on a deep appearance descriptor, and assigns and updates the detected targets to the existing corresponding trajectories like ReID task. The detection bboxes result required by DeepSORT can be generated by any detection model, and then the saved detection result file can be loaded for tracking. Here we select the `PCB + Pyramid ResNet101` and `PPLCNet` models provided by [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) as the ReID model.
### Python predict and deployment
PP-Tracking supports Python predict and deployment. Please refer to this [doc](../../deploy/pptracking/python/README.md).
- [JDE](https://arxiv.org/abs/1909.12605) (Joint Detection and Embedding) learns the object detection task and appearance embedding task simutaneously in a shared neural network. And the detection results and the corresponding embeddings are also outputed at the same time. JDE original paper is based on an Anchor Base detector YOLOv3 , adding a new ReID branch to learn embeddings. The training process is constructed as a multi-task learning problem, taking into account both accuracy and speed.
### C++ predict and deployment
PP-Tracking supports C++ predict and deployment. Please refer to this [doc](../../deploy/pptracking/cpp/README.md).
- [FairMOT](https://arxiv.org/abs/2004.01888) is based on an Anchor Free detector Centernet, which overcomes the problem of anchor and feature misalignment in anchor based detection framework. The fusion of deep and shallow features enables the detection and ReID tasks to obtain the required features respectively. It also uses low dimensional ReID features. FairMOT is a simple baseline composed of two homogeneous branches propose to predict the pixel level target score and ReID features. It achieves the fairness between the two tasks and obtains a higher level of real-time MOT performance.
### GUI predict and deployment
PP-Tracking supports GUI predict and deployment. Please refer to this [doc](https://github.com/yangyudong2020/PP-Tracking_GUi).
[PP-Tracking](../../deploy/pptracking/README.md) is the first open source real-time tracking system based on PaddlePaddle deep learning framework. Aiming at the difficulties and pain points of the actual business, PP-Tracking has built-in capabilities and industrial applications such as pedestrian and vehicle tracking, cross-camera tracking, multi-class tracking, small target tracking and traffic counting, and provides a visual development interface. The model integrates multi-object tracking, object detection and ReID lightweight algorithm to further improve the deployment performance of PP-Tracking on the server. It also supports Python and C + + deployment and adapts to Linux, NVIDIA and Jetson multi platform environment.。
<div width="1000" align="center">
<img src="../../docs/images/pptracking.png"/>
<img src="../../docs/images/pptracking_en.png"/>
</div>
<div width="1000" align="center">
<img src="../../docs/images/pptracking-demo.gif"/>
<br>
video source:VisDrone2021, BDD100K dataset</div>
video source:VisDrone, BDD100K dataset</div>
</div>
## Installation
Install all the related dependencies for MOT:
```
pip install lap sklearn motmetrics openpyxl cython_bbox
......@@ -50,12 +59,11 @@ pip install -r requirements.txt
## Model Zoo
- base models
- Base models
- [DeepSORT](deepsort/README.md)
- [JDE](jde/README.md)
- [FairMOT](fairmot/README.md)
- feature models
- Feature models
- [Pedestrian](pedestrian/README.md)
- [Head](headtracking21/README.md)
- [Vehicle](vehicle/README.md)
......@@ -66,47 +74,29 @@ pip install -r requirements.txt
## Dataset Preparation
### MOT Dataset
PaddleDetection use the same training data as [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT). Please refer to [PrepareMOTDataSet](../../docs/tutorials/PrepareMOTDataSet.md) to download and prepare all the training data including **Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16**. The former six are used as the mixed dataset for training, and MOT16 are used as the evaluation dataset. In addition, you can use **MOT15 and MOT20** for finetune. All pedestrians in these datasets have detection bbox labels and some have ID labels. If you want to use these datasets, please **follow their licenses**.
PaddleDetection implement [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT), and use the same training data named 'MIX' as them, including **Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16**. The former six are used as the mixed dataset for training, and MOT16 are used as the evaluation dataset. If you want to use these datasets, please **follow their licenses**.
### Data Format
These several relevant datasets have the following structure:
```
Caltech
|——————images
| └——————00001.jpg
| |—————— ...
| └——————0000N.jpg
└——————labels_with_ids
└——————00001.txt
|—————— ...
└——————0000N.txt
MOT17
|——————images
| └——————train
| └——————test
└——————labels_with_ids
└——————train
```
Annotations of these datasets are provided in a unified format. Every image has a corresponding annotation text. Given an image path, the annotation text path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`.
In order to train the feature models of more scenes, more datasets are also processed into the same format as the MIX dataset. Please refer to MOT data preparation [doc](../../docs/tutorials/PrepareMOTDataSet.md) to prepare the dataset.
In the annotation text, each line is describing a bounding box and has the following format:
### Dataset Directory
First, download the image_lists.zip using the following command, and unzip them into `PaddleDetection/dataset/mot`:
```
[class] [identity] [x_center] [y_center] [width] [height]
wget https://dataset.bj.bcebos.com/mot/image_lists.zip
```
**Notes:**
- `class` should be `0`. Only single-class multi-object tracking is supported now.
- `identity` is an integer from `1` to `num_identities`(`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height, note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1.
### Dataset Directory
First, follow the command below to download the `image_list.zip` and unzip it in the `dataset/mot` directory:
Then, download the MIX dataset using the following command, and unzip them into `PaddleDetection/dataset/mot`:
```
wget https://dataset.bj.bcebos.com/mot/image_lists.zip
wget https://dataset.bj.bcebos.com/mot/MOT17.zip
wget https://dataset.bj.bcebos.com/mot/Caltech.zip
wget https://dataset.bj.bcebos.com/mot/CUHKSYSU.zip
wget https://dataset.bj.bcebos.com/mot/PRW.zip
wget https://dataset.bj.bcebos.com/mot/Cityscapes.zip
wget https://dataset.bj.bcebos.com/mot/ETHZ.zip
wget https://dataset.bj.bcebos.com/mot/MOT16.zip
```
Then download and unzip each dataset, and the final directory is as follows:
The final directory is:
```
dataset/mot
|——————image_lists
......@@ -119,34 +109,39 @@ dataset/mot
|——————cuhksysu.train
|——————cuhksysu.val
|——————eth.train
|——————mot15.train
|——————mot16.train
|——————mot17.train
|——————mot20.train
|——————prw.train
|——————prw.val
|——————Caltech
|——————Cityscapes
|——————CUHKSYSU
|——————ETHZ
|——————MOT15
|——————MOT16
|——————MOT17
|——————MOT20
|——————PRW
```
### Quick Download
You can download all the dataset according to the following command. Note that it needs to be decompressed and stored according to the above directory.
### Data Format
These several relevant datasets have the following structure:
```
wget https://dataset.bj.bcebos.com/mot/MOT17.zip
wget https://dataset.bj.bcebos.com/mot/Caltech.zip
wget https://dataset.bj.bcebos.com/mot/CUHKSYSU.zip
wget https://dataset.bj.bcebos.com/mot/PRW.zip
wget https://dataset.bj.bcebos.com/mot/Cityscapes.zip
wget https://dataset.bj.bcebos.com/mot/ETH.zip
wget https://dataset.bj.bcebos.com/mot/MOT16.zip
MOT17
|——————images
| └——————train
| └——————test
└——————labels_with_ids
└——————train
```
Annotations of these datasets are provided in a unified format. Every image has a corresponding annotation text. Given an image path, the annotation text path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`.
In the annotation text, each line is describing a bounding box and has the following format:
```
[class] [identity] [x_center] [y_center] [width] [height]
```
**Notes:**
- `class` is the class id, start from 0, and support single class and multi-class.
- `identity` is an integer from `1` to `num_identities`(`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height, note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1.
## Citations
......
......@@ -109,8 +109,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsor
```
**注意:**
请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
`--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
- `--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
### 3. 导出预测模型
......@@ -142,9 +142,9 @@ python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/jd
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt)或`--save_mot_txt_per_img`(对每张图片保存一个txt)表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
`--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt)或`--save_mot_txt_per_img`(对每张图片保存一个txt)表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- `--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
## 适配其他检测器
......@@ -184,7 +184,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/xxx./ --reid_model_dir=output_inference/deepsort_yyy/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts
```
**注意:**
`--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
- `--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
## 引用
......
......@@ -12,6 +12,16 @@ English | [简体中文](README_cn.md)
[FairMOT](https://arxiv.org/abs/2004.01888) is based on an Anchor Free detector Centernet, which overcomes the problem of anchor and feature misalignment in anchor based detection framework. The fusion of deep and shallow features enables the detection and ReID tasks to obtain the required features respectively. It also uses low dimensional ReID features. FairMOT is a simple baseline composed of two homogeneous branches propose to predict the pixel level target score and ReID features. It achieves the fairness between the two tasks and obtains a higher level of real-time MOT performance.
### PP-Tracking real-time MOT system
In addition, PaddleDetection also provides [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system.
PP-Tracking is the first open source real-time Multi-Object Tracking system, and it is based on PaddlePaddle deep learning framework. It has rich models, wide application and high efficiency deployment.
PP-Tracking supports two paradigms: single camera tracking (MOT) and multi-camera tracking (MTMCT). Aiming at the difficulties and pain points of actual business, PP-Tracking provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small object tracking, traffic statistics and multi-camera tracking. The deployment method supports API and GUI visual interface, and the deployment language supports Python and C++, The deployment platform environment supports Linux, NVIDIA Jetson, etc.
### AI studio public project tutorial
PP-tracking provides an AI studio public project tutorial. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
## Model Zoo
### FairMOT Results on MOT-16 Training Set
......@@ -34,7 +44,7 @@ English | [简体中文](README_cn.md)
| DLA-34 | 576x320 | 69.9 | 70.2 | 1044 | 8869 | 44898 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [config](./fairmot_dla34_30e_576x320.yml) |
**Notes:**
FairMOT DLA-34 used 2 GPUs for training and mini-batch size as 6 on each GPU, and trained for 30 epoches.
- FairMOT DLA-34 used 2 GPUs for training and mini-batch size as 6 on each GPU, and trained for 30 epoches.
### FairMOT enhance model
......@@ -51,7 +61,9 @@ English | [简体中文](README_cn.md)
| HarDNet-85 | 1088x608 | 74.7 | 70.7 | 3210 | 29790 | 109914 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [config](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
**Notes:**
FairMOT enhance used 8 GPUs for training, and the crowdhuman dataset is added to the train-set during training. For FairMOT enhance DLA-34 the batch size is 16 on each GPU,and trained for 60 epoches. For FairMOT enhance HarDNet-85 the batch size is 10 on each GPU,and trained for 30 epoches.
- FairMOT enhance used 8 GPUs for training, and the crowdhuman dataset is added to the train-set during training.
- For FairMOT enhance DLA-34 the batch size is 16 on each GPU,and trained for 60 epoches.
- For FairMOT enhance HarDNet-85 the batch size is 10 on each GPU,and trained for 30 epoches.
### FairMOT light model
### Results on MOT-16 Test Set
......@@ -67,7 +79,7 @@ English | [简体中文](README_cn.md)
| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
**Notes:**
FairMOT HRNetV2-W18 used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches. Only ImageNet pre-train model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.
- FairMOT HRNetV2-W18 used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches. Only ImageNet pre-train model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.
## Getting Start
......@@ -93,7 +105,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_d
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams
```
**Notes:**
The default evaluation dataset is MOT-16 Train Set. If you want to change the evaluation dataset, please refer to the following code and modify `configs/datasets/mot.yml`
- The default evaluation dataset is MOT-16 Train Set. If you want to change the evaluation dataset, please refer to the following code and modify `configs/datasets/mot.yml`
```
EvalMOTDataset:
!MOTImageFolder
......@@ -101,7 +113,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_d
data_root: MOT17/images/train
keep_ori_im: False # set True if save visualization images or video
```
Tracking results will be saved in `{output_dir}/mot_results/`, and every sequence has one txt file, each line of the txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`, and you can set `{output_dir}` by `--output_dir`.
- Tracking results will be saved in `{output_dir}/mot_results/`, and every sequence has one txt file, each line of the txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`, and you can set `{output_dir}` by `--output_dir`.
### 3. Inference
......@@ -112,7 +124,7 @@ Inference a vidoe on single GPU with following command:
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
```
**Notes:**
Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
- Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
### 4. Export model
......@@ -127,8 +139,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairm
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**Notes:**
The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
- The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
- Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
### 6. Using exported MOT and keypoint model for unite python inference
......@@ -137,7 +149,7 @@ Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1
python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
```
**Notes:**
Keypoint model export tutorial: `configs/keypoint/README.md`.
- Keypoint model export tutorial: `configs/keypoint/README.md`.
## Citations
......@@ -148,4 +160,10 @@ python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inferenc
journal={arXiv preprint arXiv:2004.01888},
year={2020}
}
@article{shao2018crowdhuman,
title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
journal={arXiv preprint arXiv:1805.00123},
year={2018}
}
```
......@@ -12,6 +12,13 @@
[FairMOT](https://arxiv.org/abs/2004.01888)以Anchor Free的CenterNet检测器为基础,克服了Anchor-Based的检测框架中anchor和特征不对齐问题,深浅层特征融合使得检测和ReID任务各自获得所需要的特征,并且使用低维度ReID特征,提出了一种由两个同质分支组成的简单baseline来预测像素级目标得分和ReID特征,实现了两个任务之间的公平性,并获得了更高水平的实时多目标跟踪精度。
### PP-Tracking 实时多目标跟踪系统
此外,PaddleDetection还提供了[PP-Tracking](../../../deploy/pptracking/README.md)实时多目标跟踪系统。PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统,具有模型丰富、应用广泛和部署高效三大优势。
PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式,针对实际业务的难点和痛点,提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用,部署方式支持API调用和GUI可视化界面,部署语言支持Python和C++,部署平台环境支持Linux、NVIDIA Jetson等。
### AI Studio公开项目案例
PP-Tracking 提供了AI Studio公开项目案例,教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)
## 模型库
### FairMOT在MOT-16 Training Set上结果
......@@ -33,7 +40,7 @@
| DLA-34 | 576x320 | 69.9 | 70.2 | 1044 | 8869 | 44898 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [配置文件](./fairmot_dla34_30e_576x320.yml) |
**注意:**
FairMOT DLA-34均使用2个GPU进行训练,每个GPU上batch size为6,训练30个epoch。
- FairMOT DLA-34均使用2个GPU进行训练,每个GPU上batch size为6,训练30个epoch。
### FairMOT enhance模型
......@@ -50,8 +57,9 @@
| HarDNet-85 | 1088x608 | 74.7 | 70.7 | 3210 | 29790 | 109914 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [配置文件](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
**注意:**
FairMOT enhance模型均使用8个GPU进行训练,训练集中加入了crowdhuman数据集一起参与训练。DLA-34骨干网络的每个GPU上batch size为16,训练60个epoch。HarDNet-85骨干网络的每个GPU上batch size为10,训练30个epoch。
- FairMOT enhance模型均使用8个GPU进行训练,训练集中加入了crowdhuman数据集一起参与训练。
- FairMOT enhance DLA-34 每个GPU上batch size为16,训练60个epoch。
- FairMOT enhance HarDNet-85 每个GPU上batch size为10,训练30个epoch。
### FairMOT轻量级模型
### 在MOT-16 Test Set上结果
......@@ -67,7 +75,7 @@
| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
**注意:**
FairMOT HRNetV2-W18均使用8个GPU进行训练,每个GPU上batch size为4,训练30个epoch,使用的ImageNet预训练,优化器策略采用的是Momentum,并且训练集中加入了crowdhuman数据集一起参与训练。
- FairMOT HRNetV2-W18均使用8个GPU进行训练,每个GPU上batch size为4,训练30个epoch,使用的ImageNet预训练,优化器策略采用的是Momentum,并且训练集中加入了crowdhuman数据集一起参与训练。
## 快速开始
......@@ -92,7 +100,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_d
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams
```
**注意:**
默认评估的是MOT-16 Train Set数据集, 如需换评估数据集可参照以下代码修改`configs/datasets/mot.yml`
- 默认评估的是MOT-16 Train Set数据集, 如需换评估数据集可参照以下代码修改`configs/datasets/mot.yml`
```
EvalMOTDataset:
!MOTImageFolder
......@@ -100,7 +108,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_d
data_root: MOT17/images/train
keep_ori_im: False # set True if save visualization images or video
```
跟踪结果会存于`{output_dir}/mot_results/`中,里面每个视频序列对应一个txt,每个txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`, 此外`{output_dir}`可通过`--output_dir`设置。
- 跟踪结果会存于`{output_dir}/mot_results/`中,里面每个视频序列对应一个txt,每个txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`, 此外`{output_dir}`可通过`--output_dir`设置。
### 3. 预测
......@@ -112,7 +120,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/fairmot/fairmot_
```
**注意:**
请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
### 4. 导出预测模型
......@@ -126,8 +134,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairm
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
### 6. 用导出的跟踪和关键点模型Python联合预测
......@@ -135,7 +143,7 @@ python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fa
python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
```
**注意:**
关键点模型导出教程请参考`configs/keypoint/README.md`
- 关键点模型导出教程请参考`configs/keypoint/README.md`
## 引用
......@@ -146,4 +154,10 @@ python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inferenc
journal={arXiv preprint arXiv:2004.01888},
year={2020}
}
@article{shao2018crowdhuman,
title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
journal={arXiv preprint arXiv:1805.00123},
year={2018}
}
```
......@@ -24,8 +24,8 @@
| HRNetv2-W18 | 1088x608 | 41.2 | 47.1 | 48809 | 241683 | 204346 | - | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_headtracking21.pdparams) | [配置文件](./fairmot_dla34_30e_1088x608_headtracking21.yml) |
**注意:**
FairMOT DLA-34使用2个GPU进行训练,每个GPU上batch size为6,训练30个epoch。目前MOTA精度位于MOT官网[Head Tracking 21](https://motchallenge.net/results/Head_Tracking_21)榜单榜首。
FairMOT HRNetv2-W18使用4个GPU进行训练,每个GPU上batch size为8,训练30个epoch。
- FairMOT DLA-34使用2个GPU进行训练,每个GPU上batch size为6,训练30个epoch。目前MOTA精度位于MOT官网[Head Tracking 21](https://motchallenge.net/results/Head_Tracking_21)榜单榜首。
- FairMOT HRNetv2-W18使用4个GPU进行训练,每个GPU上batch size为8,训练30个epoch。
## 快速开始
......@@ -52,7 +52,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/headtracking21/fa
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_headtracking21.pdparams --video_file={your video name}.mp4 --save_videos
```
**注意:**
请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
### 4. 导出预测模型
```bash
......@@ -64,8 +64,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/headtracking2
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_headtracking21 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
## 引用
```
......
......@@ -12,6 +12,15 @@ English | [简体中文](README_cn.md)
- [JDE](https://arxiv.org/abs/1909.12605) (Joint Detection and Embedding) learns the object detection task and appearance embedding task simutaneously in a shared neural network. And the detection results and the corresponding embeddings are also outputed at the same time. JDE original paper is based on an Anchor Base detector YOLOv3, adding a new ReID branch to learn embeddings. The training process is constructed as a multi-task learning problem, taking into account both accuracy and speed.
### PP-Tracking real-time MOT system
In addition, PaddleDetection also provides [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system.
PP-Tracking is the first open source real-time Multi-Object Tracking system, and it is based on PaddlePaddle deep learning framework. It has rich models, wide application and high efficiency deployment.
PP-Tracking supports two paradigms: single camera tracking (MOT) and multi-camera tracking (MTMCT). Aiming at the difficulties and pain points of actual business, PP-Tracking provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small object tracking, traffic statistics and multi-camera tracking. The deployment method supports API and GUI visual interface, and the deployment language supports Python and C++, The deployment platform environment supports Linux, NVIDIA Jetson, etc.
### AI studio public project tutorial
PP-tracking provides an AI studio public project tutorial. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
<div align="center">
<img src="../../../docs/images/mot16_jde.gif" width=500 />
</div>
......@@ -37,7 +46,7 @@ English | [简体中文](README_cn.md)
| DarkNet53 | 576x320 | 59.1 | 56.4 | 1911 | 10923 | 61789 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) |
**Notes:**
JDE used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches.
- JDE used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches.
## Getting Start
......@@ -61,7 +70,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=output/jde_darknet53_30e_1088x608/model_final.pdparams
```
**Notes:**
The default evaluation dataset is MOT-16 Train Set. If you want to change the evaluation dataset, please refer to the following code and modify `configs/datasets/mot.yml`
- The default evaluation dataset is MOT-16 Train Set. If you want to change the evaluation dataset, please refer to the following code and modify `configs/datasets/mot.yml`
```
EvalMOTDataset:
!MOTImageFolder
......@@ -69,7 +78,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53
data_root: MOT17/images/train
keep_ori_im: False # set True if save visualization images or video
```
Tracking results will be saved in `{output_dir}/mot_results/`, and every sequence has one txt file, each line of the txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`, and you can set `{output_dir}` by `--output_dir`.
- Tracking results will be saved in `{output_dir}/mot_results/`, and every sequence has one txt file, each line of the txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`, and you can set `{output_dir}` by `--output_dir`.
### 3. Inference
......@@ -80,7 +89,7 @@ Inference a vidoe on single GPU with following command:
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
```
**Notes:**
Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
- Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
### 4. Export model
......@@ -95,8 +104,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/jde/jde_darkn
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**Notes:**
The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
- The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
- Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
## Citations
......
......@@ -12,6 +12,13 @@
[JDE](https://arxiv.org/abs/1909.12605)(Joint Detection and Embedding)是在一个单一的共享神经网络中同时学习目标检测任务和embedding任务,并同时输出检测结果和对应的外观embedding匹配的算法。JDE原论文是基于Anchor Base的YOLOv3检测器新增加一个ReID分支学习embedding,训练过程被构建为一个多任务联合学习问题,兼顾精度和速度。
### PP-Tracking 实时多目标跟踪系统
此外,PaddleDetection还提供了[PP-Tracking](../../../deploy/pptracking/README.md)实时多目标跟踪系统。PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统,具有模型丰富、应用广泛和部署高效三大优势。
PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式,针对实际业务的难点和痛点,提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用,部署方式支持API调用和GUI可视化界面,部署语言支持Python和C++,部署平台环境支持Linux、NVIDIA Jetson等。
### AI Studio公开项目案例
PP-Tracking 提供了AI Studio公开项目案例,教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)
<div align="center">
<img src="../../../docs/images/mot16_jde.gif" width=500 />
</div>
......@@ -38,7 +45,7 @@
| DarkNet53 | 576x320 | 59.1 | 56.4 | 1911 | 10923 | 61789 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) |
**注意:**
JDE使用8个GPU进行训练,每个GPU上batch size为4,训练了30个epoch。
- JDE使用8个GPU进行训练,每个GPU上batch size为4,训练了30个epoch。
## 快速开始
......@@ -62,7 +69,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=output/jde_darknet53_30e_1088x608/model_final.pdparams
```
**注意:**
默认评估的是MOT-16 Train Set数据集, 如需换评估数据集可参照以下代码修改`configs/datasets/mot.yml`
- 默认评估的是MOT-16 Train Set数据集, 如需换评估数据集可参照以下代码修改`configs/datasets/mot.yml`
```
EvalMOTDataset:
!MOTImageFolder
......@@ -70,7 +77,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53
data_root: MOT17/images/train
keep_ori_im: False # set True if save visualization images or video
```
跟踪结果会存于`{output_dir}/mot_results/`中,里面每个视频序列对应一个txt,每个txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`, 此外`{output_dir}`可通过`--output_dir`设置。
- 跟踪结果会存于`{output_dir}/mot_results/`中,里面每个视频序列对应一个txt,每个txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`, 此外`{output_dir}`可通过`--output_dir`设置。
### 3. 预测
......@@ -82,7 +89,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/jde/jde_darknet5
```
**注意:**
请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
### 4. 导出预测模型
......@@ -96,8 +103,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/jde/jde_darkn
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
## 引用
......
......@@ -12,6 +12,15 @@ English | [简体中文](README_cn.md)
MCFairMOT is the Multi-class extended version of [FairMOT](https://arxiv.org/abs/2004.01888).
### PP-Tracking real-time MOT system
In addition, PaddleDetection also provides [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system.
PP-Tracking is the first open source real-time Multi-Object Tracking system, and it is based on PaddlePaddle deep learning framework. It has rich models, wide application and high efficiency deployment.
PP-Tracking supports two paradigms: single camera tracking (MOT) and multi-camera tracking (MTMCT). Aiming at the difficulties and pain points of actual business, PP-Tracking provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small object tracking, traffic statistics and multi-camera tracking. The deployment method supports API and GUI visual interface, and the deployment language supports Python and C++, The deployment platform environment supports Linux, NVIDIA Jetson, etc.
### AI studio public project tutorial
PP-tracking provides an AI studio public project tutorial. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
## Model Zoo
### MCFairMOT Results on VisDrone2019 Val Set
| backbone | input shape | MOTA | IDF1 | IDS | FPS | download | config |
......@@ -22,8 +31,8 @@ MCFairMOT is the Multi-class extended version of [FairMOT](https://arxiv.org/abs
| HRNetV2-W18 | 576x320 | 12.0 | 33.8 | 2178 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.yml) |
**Notes:**
MOTA is the average MOTA of 10 catecories in the VisDrone2019 MOT dataset, and its value is also equal to the average MOTA of all the evaluated video sequences.
MCFairMOT used 4 GPUs for training 30 epoches. The batch size is 6 on each GPU for MCFairMOT DLA-34, and 4 for MCFairMOT HRNetV2-W18.
- MOTA is the average MOTA of 10 catecories in the VisDrone2019 MOT dataset, and its value is also equal to the average MOTA of all the evaluated video sequences.
- MCFairMOT used 4 GPUs for training 30 epoches. The batch size is 6 on each GPU for MCFairMOT DLA-34, and 8 for MCFairMOT HRNetV2-W18.
## Getting Start
......@@ -44,7 +53,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairm
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=output/mcfairmot_dla34_30e_1088x608_visdrone/model_final.pdparams
```
**Notes:**
The default evaluation dataset is VisDrone2019 MOT val-set. If you want to change the evaluation dataset, please refer to the following code and modify `configs/datasets/mcmot.yml`
- The default evaluation dataset is VisDrone2019 MOT val-set. If you want to change the evaluation dataset, please refer to the following code and modify `configs/datasets/mcmot.yml`
```
EvalMOTDataset:
!MOTImageFolder
......@@ -52,7 +61,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairm
data_root: your_dataset/images/val
keep_ori_im: False # set True if save visualization images or video
```
Tracking results will be saved in `{output_dir}/mot_results/`, and every sequence has one txt file, each line of the txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`, and you can set `{output_dir}` by `--output_dir`.
- Tracking results will be saved in `{output_dir}/mot_results/`, and every sequence has one txt file, each line of the txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`, and you can set `{output_dir}` by `--output_dir`.
### 3. Inference
Inference a vidoe on single GPU with following command:
......@@ -61,7 +70,7 @@ Inference a vidoe on single GPU with following command:
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams --video_file={your video name}.mp4 --save_videos
```
**Notes:**
Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
- Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
### 4. Export model
......@@ -74,8 +83,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/mcfairmot/mcf
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**Notes:**
The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`.
- The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
- Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`.
## Citations
......
......@@ -12,6 +12,13 @@
MCFairMOT是[FairMOT](https://arxiv.org/abs/2004.01888)的多类别扩展版本。
### PP-Tracking 实时多目标跟踪系统
此外,PaddleDetection还提供了[PP-Tracking](../../../deploy/pptracking/README.md)实时多目标跟踪系统。PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统,具有模型丰富、应用广泛和部署高效三大优势。
PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式,针对实际业务的难点和痛点,提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用,部署方式支持API调用和GUI可视化界面,部署语言支持Python和C++,部署平台环境支持Linux、NVIDIA Jetson等。
### AI Studio公开项目案例
PP-Tracking 提供了AI Studio公开项目案例,教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)
## 模型库
### MCFairMOT 在VisDrone2019 MOT val-set上结果
......@@ -23,8 +30,8 @@ MCFairMOT是[FairMOT](https://arxiv.org/abs/2004.01888)的多类别扩展版本
| HRNetV2-W18 | 576x320 | 12.0 | 33.8 | 2178 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.pdparams) | [配置文件](./mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.yml) |
**注意:**
MOTA是VisDrone2019 MOT数据集10类目标的平均MOTA, 其值也等于所有评估的视频序列的平均MOTA。
MCFairMOT enhance模型均使用4个GPU进行训练,训练30个epoch。DLA-34骨干网络的每个GPU上batch size为6,HRNetV2-W18骨干网络的每个GPU上batch size为4
- MOTA是VisDrone2019 MOT数据集10类目标的平均MOTA, 其值也等于所有评估的视频序列的平均MOTA。
- MCFairMOT模型均使用4个GPU进行训练,训练30个epoch。DLA-34骨干网络的每个GPU上batch size为6,HRNetV2-W18骨干网络的每个GPU上batch size为8
## 快速开始
......@@ -44,7 +51,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairm
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=output/mcfairmot_dla34_30e_1088x608_visdrone/model_final.pdparams
```
**注意:**
默认评估的是VisDrone2019 MOT val-set数据集, 如需换评估数据集可参照以下代码修改`configs/datasets/mcmot.yml`
- 默认评估的是VisDrone2019 MOT val-set数据集, 如需换评估数据集可参照以下代码修改`configs/datasets/mcmot.yml`
```
EvalMOTDataset:
!MOTImageFolder
......@@ -52,7 +59,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairm
data_root: your_dataset/images/val
keep_ori_im: False # set True if save visualization images or video
```
多类别跟踪结果会存于`{output_dir}/mot_results/`中,里面每个视频序列对应一个txt,每个txt文件每行信息是`frame,id,x1,y1,w,h,score,cls_id,-1,-1`, 此外`{output_dir}`可通过`--output_dir`设置。
- 多类别跟踪结果会存于`{output_dir}/mot_results/`中,里面每个视频序列对应一个txt,每个txt文件每行信息是`frame,id,x1,y1,w,h,score,cls_id,-1,-1`, 此外`{output_dir}`可通过`--output_dir`设置。
### 3. 预测
使用单个GPU通过如下命令预测一个视频,并保存为视频
......@@ -61,7 +68,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairm
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams --video_file={your video name}.mp4 --save_videos
```
**注意:**
请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
### 4. 导出预测模型
```bash
......@@ -73,8 +80,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/mcfairmot/mcf
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
多类别跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,cls_id,-1,-1`
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 多类别跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,cls_id,-1,-1`
## 引用
......
......@@ -9,10 +9,12 @@ English | [简体中文](README_cn.md)
- [引用](#引用)
## 简介
MTMCT (Multi-Target Multi-Camera Tracking) 跨镜头多目标跟踪是某一场景下的不同摄像头拍摄的视频进行多目标跟踪,是跟踪领域一个非常重要的研究课题,在安防监控、自动驾驶、智慧城市等行业起着重要作用。MTMCT预测的是同一场景下的不同摄像头拍摄的视频,其方法的效果受场景先验知识和相机数量角度拓扑结构等信息的影响较大,PaddleDetection此处提供的是去除场景和相机相关优化方法后的一个基础版本的MTMCT算法实现,如果要继续提高效果,需要专门针对该场景和相机信息设计后处理算法。此处选用DeepSORT方案做MTMCT,为了达到实时性选用了PaddleDetection自研的PPYOLOv2和PP-PicoDet作为检测器,选用PaddleClas自研的轻量级网络PP-LCNet作为ReID模型。
MTMCT (Multi-Target Multi-Camera Tracking) 跨镜头多目标跟踪是某一场景下的不同摄像头拍摄的视频进行多目标跟踪,是跟踪领域一个非常重要的研究课题,在安防监控、自动驾驶、智慧城市等行业起着重要作用。MTMCT预测的是同一场景下的不同摄像头拍摄的视频,其方法的效果受场景先验知识和相机数量角度拓扑结构等信息的影响较大,PaddleDetection此处提供的是去除场景和相机相关优化方法后的一个基础版本的MTMCT算法实现,如果要继续提高效果,需要专门针对该场景和相机信息设计后处理算法。此处选用DeepSORT方案做MTMCT,为了达到实时性选用了PaddleDetection自研的[PP-YOLOv2](../../ppyolo/)和轻量级网络[PP-PicoDet](../../picodet/)作为检测器,选用PaddleClas自研的轻量级网络PP-LCNet作为ReID模型。
MTMCT是[PP-Tracking](../../../deploy/pptracking)项目中一个非常重要的方向,[PP-Tracking](../../../deploy/pptracking/README.md)是基于PaddlePaddle深度学习框架的业界首个开源实时跟踪系统。针对实际业务的难点痛点,PP-Tracking内置行人车辆跟踪、跨镜头跟踪、多类别跟踪、小目标跟踪及流量计数等能力与产业应用,同时提供可视化开发界面。模型集成多目标跟踪,目标检测,ReID轻量级算法,进一步提升PP-Tracking在服务器端部署性能。同时支持python,C++部署,适配Linux,Nvidia Jetson多平台环境。具体可前往该目录使用。
MTMCT是[PP-Tracking](../../../deploy/pptracking)项目中一个非常重要的方向,[PP-Tracking](../../../deploy/pptracking/README.md)是基于PaddlePaddle深度学习框架的业界首个开源实时跟踪系统。针对实际业务的难点痛点,PP-Tracking内置行人车辆跟踪、跨镜头跟踪、多类别跟踪、小目标跟踪及流量计数等能力与产业应用,同时提供可视化开发界面。模型集成目标检测、轻量级ReID、多目标跟踪等算法,进一步提升PP-Tracking在服务器端部署性能。同时支持Python、C++部署,适配Linux、NVIDIA Jetson等多个平台环境。具体可前往该目录使用。
### AI Studio公开项目案例
PP-Tracking 提供了AI Studio公开项目案例,教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)
## 模型库
### DeepSORT在 AIC21 MTMCT(CityFlow) 车辆跨境跟踪数据集Test集上的结果
......@@ -23,51 +25,68 @@ MTMCT是[PP-Tracking](../../../deploy/pptracking)项目中一个非常重要的
| PPYOLOv2 | 640x640 | PP-LCNet | S06 | - | 0.4450 | 0.4611 | 0.4300 | 0.6385 | 0.5954 | - |[Detector](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle.tar) |[ReID](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.tar) |
**注意:**
S06是AIC21 MTMCT数据集Test集的场景名称,S06场景下有’c041,c042,c043,c044,c045,c046‘共6个摄像头的视频。
- S06是AIC21 MTMCT数据集Test集的场景名称,S06场景下有’c041,c042,c043,c044,c045,c046‘共6个摄像头的视频。
- 由于在部署过程中只需要前向参数,此处提供的是已经导出的模型,解压后可看到包括`infer_cfg.yml``model.pdiparams``model.pdiparams.info``model.pdmodel`四个文件。
## 数据集准备
此处提供了车辆和行人的两种模型方案,对于车辆是选用的[AIC21 MTMCT](https://www.aicitychallenge.org) (CityFlow)车辆跨境跟踪数据集,对于行人是选用的[WILDTRACK](https://www.epfl.ch/labs/cvlab/data/data-wildtrack)行人跨境跟踪数据集。
AIC21 MTMCT原始数据集的目录如下所示:
对于车辆跨镜头跟踪是选用的[AIC21 MTMCT](https://www.aicitychallenge.org) (CityFlow)车辆跨境跟踪数据集,此处提供PaddleDetection团队整理过后的数据集的下载链接:`wget https://paddledet.bj.bcebos.com/data/mot/aic21mtmct_vehicle.zip`,测试使用的是其中的S06文件夹目录,此外还提供AIC21 MTMCT数据集中S01场景抽出来的极小的一个demo测试数据集:`wget https://paddledet.bj.bcebos.com/data/mot/demo/mtmct-demo.tar`
数据集的处理如下所示:
```
# AIC21 MTMCT原始数据集的目录如下所示:
|——————AIC21_Track3_MTMC_Tracking
|——————cam_framenum (Number of frames below each camera)
|——————cam_loc (Positional relationship between cameras)
|——————cam_framenum (Number of frames below each camera)
|——————cam_loc (Positional relationship between cameras)
|——————cam_timestamp (Time difference between cameras)
|——————eval (evaluation function and ground_truth.txt)
|——————test
|——————train
|——————validation
|——————test (S06 dataset)
|——————train (S01,S03,S04 dataset)
|——————validation (S02,S05 dataset)
|——————DataLicenseAgreement_AICityChallenge_2021.pdf
|——————list_cam.txt (List of all camera paths)
|——————ReadMe.txt (Dataset description)
|——————gen_aicity_mtmct_data.py (Camera data extraction script)
|——————gen_aicity_mtmct_data.py (Camera videos extraction script)
```
需要处理成如下格式:
```
aic21mtmct_vehicle/
├── S01
│ ├── c001
│ ├── roi.jog (Area mask of the road)
│ ├── img1
│ ├── ...
│ ├── c002
│ ├── roi.jog
│ ├── img1
│ ├── ...
│ ├── c003
│ ├── roi.jog
│ ├── img1
│ ├── ...
├── gt
│ ├── ground_truth_train.txt
│ ├── ground_truth_validation.txt
├── zone (only for S06 when use camera track trick)
│ ├── ...
├── gt
│ ├── gt.txt
├── images
├── c001
│ ├── img1
│ │ ├── 0000001.jpg
│ │ ...
│ ├── roi.jpg
├── c002
...
├── c006
├── S02
...
├── S05
├── S06
├── images
├── c041
├── img1
├── 0000001.jpg
...
├── c042
...
├── c046
├── zone (only for test-set S06 when use camera tricks for testing)
├── c041.png
...
├── c046.png
```
#### 生成S01场景的验证集数据
python gen_aicity_mtmct_data.py ./AIC21_Track3_MTMC_Tracking/train/S01
**注意:**
- AIC21 MTMCT数据集共有6个场景共计46个摄像头的数据,其中S01、S03和S04为训练集,S02和S05为验证集,S06是测试集,S06场景下有’c041,c042,c043,c044,c045,c046‘共6个摄像头的视频。
## 快速开始
......@@ -83,22 +102,26 @@ wget https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicl
tar -xvf deepsort_pplcnet_vehicle.tar
```
**注意:**
PP-PicoDet是轻量级检测模型,其训练请参考[configs/picodet](../../picodet/README.md),并注意修改种类数和数据集路径。
PP-LCNet是轻量级ReID模型,其训练请参考[PaddleClas](https://github.com/PaddlePaddle/PaddleClas),是在VERI-Wild车辆重识别数据集训练得到的权重,建议直接使用无需重训。
- PP-PicoDet是轻量级检测模型,其训练请参考[configs/picodet](../../picodet/README.md),并注意修改种类数和数据集路径。
- PP-LCNet是轻量级ReID模型,其训练请参考[PaddleClas](https://github.com/PaddlePaddle/PaddleClas),是在VERI-Wild车辆重识别数据集训练得到的权重,建议直接使用无需重训。
### 2. 用导出的模型基于Python去预测
```bash
# 用导出PicoDet车辆检测模型和PPLCNet车辆ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir={your mtmct scene video folder} --mtmct_cfg=mtmct_cfg --device=GPU --scaled=True --save_mot_txts --save_images
# 下载demo测试视频
wget https://paddledet.bj.bcebos.com/data/mot/demo/mtmct-demo.tar
tar -xvf mtmct-demo.tar
# 用导出的PicoDet车辆检测模型和PPLCNet车辆ReID模型去基于Python预测
python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir=mtmct-demo --mtmct_cfg=mtmct_cfg --device=GPU --scaled=True --save_mot_txts --save_images
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt),或`--save_images`表示保存跟踪结果可视化图片。
`--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
`--mtmct_dir`是MTMCT预测的某个场景的文件夹名字,里面包含该场景不同摄像头拍摄视频的图片文件夹,其数量至少为两个。
`--mtmct_cfg`是MTMCT预测的某个场景的配置文件,里面包含该一些trick操作的开关和该场景摄像头相关设置的文件路径,用户可以自行更改相关路径以及设置某些操作是否启用。
MTMCT跨镜头跟踪输出结果为视频和txt形式。每个图片文件夹各生成一个可视化的跨镜头跟踪结果,与单镜头跟踪的结果是不同的,单镜头跟踪的结果在几个视频文件夹间是独立无关的。MTMCT的结果txt只有一个,比单镜头跟踪结果txt多了第一列镜头id号,跨镜头跟踪结果txt文件每行信息是`carame_id,frame,id,x1,y1,w,h,-1,-1`
MTMCT是[PP-Tracking](../../../deploy/pptracking)项目中的一个非常重要的方向,具体可前往该目录使用。
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt),或`--save_images`表示保存跟踪结果可视化图片。
- `--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
- `--mtmct_dir`是MTMCT预测的某个场景的文件夹名字,里面包含该场景不同摄像头拍摄视频的图片文件夹视频,其数量至少为两个。
- `--mtmct_cfg`是MTMCT预测的某个场景的配置文件,里面包含该一些trick操作的开关和该场景摄像头相关设置的文件路径,用户可以自行更改相关路径以及设置某些操作是否启用。
- MTMCT跨镜头跟踪输出结果为视频和txt形式。每个图片文件夹各生成一个可视化的跨镜头跟踪结果,与单镜头跟踪的结果是不同的,单镜头跟踪的结果在几个视频文件夹间是独立无关的。MTMCT的结果txt只有一个,比单镜头跟踪结果txt多了第一列镜头id号,跨镜头跟踪结果txt文件每行信息是`camera_id,frame,id,x1,y1,w,h,-1,-1`
- MTMCT是[PP-Tracking](../../../deploy/pptracking)项目中的一个非常重要的方向,具体可前往该目录使用。
## 引用
......
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import cv2
import glob
def video2frames(sourceVdo, dstDir):
videoData = cv2.VideoCapture(sourceVdo)
count = 0
while (videoData.isOpened()):
count += 1
ret, frame = videoData.read()
if ret:
cv2.imwrite(f"{dstDir}/{count:07d}.jpg", frame)
if count % 20 == 0:
print(f"{dstDir}/{count:07d}.jpg")
else:
break
videoData.release()
def transSeq(seqs_path, new_root):
sonCameras = glob.glob(seqs_path + "/*")
sonCameras.sort()
for vdoList in sonCameras:
Seq = vdoList.split('/')[-2]
Camera = vdoList.split('/')[-1]
os.system(f"mkdir -p {new_root}/{Seq}/images/{Camera}/img1")
roi_path = vdoList + '/roi.jpg'
new_roi_path = f"{new_root}/{Seq}/images/{Camera}"
os.system(f"cp {roi_path} {new_roi_path}")
video2frames(f"{vdoList}/vdo.avi",
f"{new_root}/{Scd eq}/images/{Camera}/img1")
if __name__ == "__main__":
seq_path = sys.argv[1]
new_root = 'aic21mtmct_vehicle'
seq_name = seq_path.split('/')[-1]
data_path = seq_path.split('/')[-3]
os.system(f"mkdir -p {new_root}/{seq_name}/gt")
os.system(f"cp {data_path}/eval/ground*.txt {new_root}/{seq_name}/gt")
# extract video frames
transSeq(seq_path, new_root)
......@@ -23,7 +23,7 @@
| VisDrone | HRNetv2-W18| 576x320 | 30.6 | 47.2 | - | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian.yml) |
**注意:**
FairMOT均使用DLA-34为骨干网络,4个GPU进行训练,每个GPU上batch size为6,训练30个epoch。
- FairMOT均使用DLA-34为骨干网络,4个GPU进行训练,每个GPU上batch size为6,训练30个epoch。
## 数据集准备和处理
......@@ -86,7 +86,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/pedestrian/fairmo
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_visdrone_pedestrian.pdparams --video_file={your video name}.mp4 --save_videos
```
**注意:**
请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
### 4. 导出预测模型
```bash
......@@ -98,8 +98,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/pedestrian/fa
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_visdrone_pedestrian --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
## 引用
```
......
......@@ -28,7 +28,7 @@
| VisDrone | HRNetv2-W18| 576x320 | 39.8 | 52.4 | - | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml) |
**注意:**
FairMOT均使用DLA-34为骨干网络,4个GPU进行训练,每个GPU上batch size为6,训练30个epoch。
- FairMOT均使用DLA-34为骨干网络,4个GPU进行训练,每个GPU上batch size为6,训练30个epoch。
## 数据集准备和处理
......@@ -118,7 +118,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/vehicle/fairmot_d
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.pdparams --video_file={your video name}.mp4 --save_videos
```
**注意:**
请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`
### 4. 导出预测模型
```bash
......@@ -130,8 +130,8 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/vehicle/fairm
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
## 引用
```
......
# 实时跟踪系统PP-Tracking
# 实时多目标跟踪系统PP-Tracking
PP-Tracking是基于飞桨深度学习框架的业界首个开源实时跟踪系统。针对实际业务的难点痛点,PP-Tracking内置行人车辆跟踪、跨镜头跟踪、多类别跟踪、小目标跟踪及流量计数等能力与产业应用,同时提供可视化开发界面。模型集成多目标跟踪,目标检测,ReID轻量级算法,进一步提升PP-Tracking在服务器端部署性能。同时支持python,C++部署,适配Linux,Nvidia Jetson多平台环境。
PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统,具有模型丰富、应用广泛和部署高效三大优势。
PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式,针对实际业务的难点和痛点,提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用,部署方式支持API调用和GUI可视化界面,部署语言支持Python和C++,部署平台环境支持Linux、NVIDIA Jetson等。
<div width="1000" align="center">
<img src="../../docs/images/pptracking.png"/>
......@@ -9,46 +10,82 @@ PP-Tracking是基于飞桨深度学习框架的业界首个开源实时跟踪系
<div width="1000" align="center">
<img src="../../docs/images/pptracking-demo.gif"/>
<br>
视频来源:VisDrone2021, BDD100K开源数据集</div>
视频来源:VisDrone和BDD100K公开数据集</div>
</div>
### 一、快速开始
PP-Tracking提供了简洁的可视化界面,无需开发即可实现多种跟踪功能,可以参考[PP-Tracking可视化界面使用文档](https://github.com/yangyudong2020/PP-Tracking_GUi)快速上手体验。
PP-Tracking也提供了AI Studio公开项目案例,可以参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)快速上手体验。
### 二、算法介绍
PP-Tracking集成了多目标跟踪,目标检测,ReID轻量级算法,提升跟踪系统实时性能。多目标跟踪算法基于FairMOT进行优化,实现了服务器端轻量级模型,同时基于不同应用场景提供了针对性的预训练模型。
模型训练评估方法请参考[多目标跟踪快速开始](../../configs/mot/README.md#快速开始)
PP-Tracking中提供的多场景预训练模型及导出模型列表如下:
| 场景 | 数据集 | 精度(MOTA) | NX模型预测速度(FPS) | 配置文件 | 模型权重 | 预测部署模型 |
| :---------:|:--------------- | :-------: | :------: | :------: |:---: | :---: |
| 行人跟踪 | MOT17 | 65.3 | 23.9 | [配置文件](../../configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.tar) |
| 行人小目标跟踪 | VisDrone-pedestrian | 40.5 | 8.35 | [配置文件](../../configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.tar) |
| 车辆跟踪 | BDD100k-vehicle | 32.6 | 24.3 | [配置文件](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.tar) |
| 车辆小目标跟踪 | VisDrone-vehicle | 39.8 | 22.8 | [配置文件](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.tar)
| 多类别跟踪 | BDD100k | - | 12.5 | [配置文件](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.tar) |
| 多类别小目标跟踪 | VisDrone | 20.4 | 6.74 | [配置文件](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.tar) |
**注:**
1. 模型预测速度为TensorRT FP16速度,测试环境为CUDA 10.2,JETPACK 4.5.1,TensorRT 7.1
2. 更多跟踪模型请参考[多目标跟踪模型库](../../configs/mot/README.md#模型库)
检测模型使用轻量级特色模型PP-PicoDet,具体请参考[PP-PicoDet文档](../../configs/picodet)
ReID模型使用超轻量骨干网络模型PP-LCNet, 具体请参考[PP-LCNet模型介绍](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/models/PP-LCNet.md)
### 三、Python端预测部署
PP-Tracking 使用python预测部署教程请参考[PP-Tracking python部署文档](python/README.md)
### 四、C++端预测部署
PP-Tracking 使用c++预测部署教程请参考[PP-Tracking c++部署文档](cpp/README.md)
## 一、快速开始
### AI Studio公开项目案例
PP-Tracking 提供了AI Studio公开项目案例,教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)
### Python端预测部署
PP-Tracking 支持Python预测部署,教程请参考[PP-Tracking Python部署文档](python/README.md)
### C++端预测部署
PP-Tracking 支持C++预测部署,教程请参考[PP-Tracking C++部署文档](cpp/README.md)
### GUI可视化界面预测部署
PP-Tracking 提供了简洁的GUI可视化界面,教程请参考[PP-Tracking可视化界面试用版使用文档](https://github.com/yangyudong2020/PP-Tracking_GUi)
## 二、算法介绍
PP-Tracking 支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式。
- 单镜头跟踪同时支持**FairMOT****DeepSORT**两种多目标跟踪算法,跨镜头跟踪只支持**DeepSORT**算法。
- 单镜头跟踪的功能包括行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪以及流量统计,模型主要是基于FairMOT进行优化,实现了实时跟踪的效果,同时基于不同应用场景提供了针对性的预训练模型。
- DeepSORT算法方案(包括跨镜头跟踪用到的DeepSORT),选用的检测器是PaddleDetection自研的高性能检测模型[PP-YOLOv2](../../ppyolo/)和轻量级特色检测模型[PP-PicoDet](../../picodet/),选用的ReID模型是PaddleClas自研的超轻量骨干网络模型[PP-LCNet](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/models/PP-LCNet.md)
PP-Tracking中提供的多场景预训练模型以及导出后的预测部署模型如下:
| 场景 | 数据集 | 精度(MOTA) | 预测速度(FPS) | 配置文件 | 模型权重 | 预测部署模型 |
| :---------: |:--------------- | :-------: | :------: | :------:|:-----: | :--------: |
| 行人跟踪 | MOT17 | 65.3 | 23.9 | [配置文件](../../configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.tar) |
| 行人小目标跟踪 | VisDrone-pedestrian | 40.5 | 8.35 | [配置文件](../../configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.tar) |
| 车辆跟踪 | BDD100k-vehicle | 32.6 | 24.3 | [配置文件](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.tar) |
| 车辆小目标跟踪 | VisDrone-vehicle | 39.8 | 22.8 | [配置文件](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.tar)
| 多类别跟踪 | BDD100k | - | 12.5 | [配置文件](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.tar) |
| 多类别小目标跟踪 | VisDrone | 20.4 | 6.74 | [配置文件](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.tar) |
**注意:**
1. 模型预测速度的设备为**NVIDIA Jetson Xavier NX**,速度为**TensorRT FP16**速度,测试环境为CUDA 10.2、JETPACK 4.5.1、TensorRT 7.1。
2. 模型权重是指使用PaddleDetection训练完直接保存的权重,更多跟踪模型权重请参考[多目标跟踪模型库](../../configs/mot/README.md#模型库)去下载,也可按照相应模型配置文件去训练。
3. 预测部署模型是指导出后的前向参数的模型,因为PP-Tracking项目的部署过程中只需要前向参数,可根据[多目标跟踪模型库](../../configs/mot/README.md#模型库)去下载并导出,也可按照相应模型配置文件去训练并导出。导出后的模型文件夹应包括`infer_cfg.yml``model.pdiparams``model.pdiparams.info``model.pdmodel`四个文件,一般会将它们以tar格式打包。
## 引用
```
@ARTICLE{9573394,
author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Detection and Tracking Meet Drones Challenge},
year={2021},
volume={},
number={},
pages={1-1},
doi={10.1109/TPAMI.2021.3119563}
}
@InProceedings{bdd100k,
author = {Yu, Fisher and Chen, Haofeng and Wang, Xin and Xian, Wenqi and Chen,
Yingying and Liu, Fangchen and Madhavan, Vashisht and Darrell, Trevor},
title = {BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
@article{zhang2020fair,
title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
journal={arXiv preprint arXiv:2004.01888},
year={2020}
}
@inproceedings{Wojke2018deep,
title={Deep Cosine Metric Learning for Person Re-identification},
author={Wojke, Nicolai and Bewley, Alex},
booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
year={2018},
pages={748--756},
organization={IEEE},
doi={10.1109/WACV.2018.00087}
}
```
# Real time Multi-Object Tracking system PP-Tracking
PP-Tracking is the first open source real-time Multi-Object Tracking system, and it is based on PaddlePaddle deep learning framework. It has rich models, wide application and high efficiency deployment.
PP-Tracking supports two paradigms: single camera tracking (MOT) and multi-camera tracking (MTMCT). Aiming at the difficulties and pain points of actual business, PP-Tracking provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small object tracking, traffic statistics and multi-camera tracking. The deployment method supports API and GUI visual interface, and the deployment language supports Python and C++, The deployment platform environment supports Linux, NVIDIA Jetson, etc.
<div width="1000" align="center">
<img src="../../docs/images/pptracking_en.png"/>
</div>
<div width="1000" align="center">
<img src="../../docs/images/pptracking-demo.gif"/>
<br>
video source:VisDrone and BDD100K dataset</div>
</div>
## 一、Quick Start
### AI studio public project case
PP-tracking provides AI studio public project cases. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
### Python predict and deployment
PP-Tracking supports Python predict and deployment. Please refer to this [doc](python/README.md).
### C++ predict and deployment
PP-Tracking supports C++ predict and deployment. Please refer to this [doc](cpp/README.md).
### GUI predict and deployment
PP-Tracking supports GUI predict and deployment. Please refer to this [doc](https://github.com/yangyudong2020/PP-Tracking_GUi).
## 二、Model Zoo
PP-Tracking supports two paradigms: single camera tracking (MOT) and multi-camera tracking (MTMCT).
- Single camera tracking supports **FairMOT** and **DeepSORT** two MOT models, multi-camera tracking only support **DeepSORT**.
- The applications of single camera tracking include pedestrian tracking, vehicle tracking, multi-class tracking, small object tracking and traffic statistics. The models are mainly optimized based on FairMOT to achieve the effect of real-time tracking. At the same time, PP-Tracking provides pre-training models based on different application scenarios.
- In DeepSORT (including DeepSORT used in multi-camera tracking), the selected detectors are PaddeDetection's self-developed high-performance detector [PP-YOLOv2](../../configs/ppyolo/) and lightweight detector [PP-PicoDet](../../configs/picodet/), and the selected ReID model is PaddleClas's self-developed ultra lightweight backbone [PP-LCNet](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/models/PP-LCNet.md)
PP-Tracking provids multi-scenario pre-training models and the exported models for deployment:
| Scene | Dataset | MOTA | Speed(FPS) | config | model weights | inference model |
| :---------: |:--------------- | :-------: | :------: | :------:|:-----: | :------------: |
| pedestrian | MOT17 | 65.3 | 23.9 | [config](../../configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [download](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.tar) |
| pedestrian(small objects) | VisDrone-pedestrian | 40.5| 8.35 | [config](../../configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.pdparams) | [download](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.tar) |
| vehicle | BDD100k-vehicle | 32.6 | 24.3 | [config](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.pdparams)| [download](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.tar) |
| vehicle(small objects)| VisDrone-vehicle | 39.8 | 22.8 | [config](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.pdparams) | [download](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.tar)
| multi-class | BDD100k | - | 12.5 | [config](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.pdparams) | [download](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.tar) |
| multi-class(small objects) | VisDrone | 20.4 | 6.74 | [config](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.pdparams) | [download](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.tar) |
**Note:**
1. The equipment predicted by the model is **NVIDIA Jetson Xavier NX**, the speed is tested by **TensorRT FP16**, and the test environment is CUDA 10.2, JETPACK 4.5.1, TensorRT 7.1.
2. `model weights` means the weights saved directly after PaddleDetection training. For more tracking model weights, please refer to [modelzoo](../../configs/mot/README.md#模型库), you can also train according to the corresponding model config file and get the model weights.
3. `inference model` means the model weights with only forward parameters after exported, because only forward parameters are required during the deployment of PP-Tracking project. It can be downloaded and exported according to [modelzoo](../../configs/mot/README.md#模型库), you can also train according to the corresponding model config file and get the model weights, and then export them。In exported model files, there should be `infer_cfg.yml`,`model.pdiparams`,`model.pdiparams.info` and `model.pdmodel` four files in total, which are generally packaged in tar format.
## Citations
```
@ARTICLE{9573394,
author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Detection and Tracking Meet Drones Challenge},
year={2021},
volume={},
number={},
pages={1-1},
doi={10.1109/TPAMI.2021.3119563}
}
@InProceedings{bdd100k,
author = {Yu, Fisher and Chen, Haofeng and Wang, Xin and Xian, Wenqi and Chen,
Yingying and Liu, Fangchen and Madhavan, Vashisht and Darrell, Trevor},
title = {BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
@article{zhang2020fair,
title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
journal={arXiv preprint arXiv:2004.01888},
year={2020}
}
@inproceedings{Wojke2018deep,
title={Deep Cosine Metric Learning for Person Re-identification},
author={Wojke, Nicolai and Bewley, Alex},
booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
year={2018},
pages={748--756},
organization={IEEE},
doi={10.1109/WACV.2018.00087}
}
```
......@@ -3,7 +3,6 @@
在PaddlePaddle中预测引擎和训练引擎底层有着不同的优化方法, 预测引擎使用了AnalysisPredictor,专门针对推理进行了优化,该引擎可以对模型进行多项图优化,减少不必要的内存拷贝。如果用户在部署已训练模型的过程中对性能有较高的要求,我们提供了独立于PaddleDetection的预测脚本,方便用户直接集成部署。当前C++部署支持基于Fairmot的单镜头类别预测部署,并支持人流量统计、出入口计数功能。
主要包含三个步骤:
- 准备环境
- 导出预测模型
- C++预测
......@@ -17,7 +16,7 @@
- CMake 3.0+
- TensorRT 6/7
NVIDIA Jetson用户请参考[Jetson平台编译指南](../../cpp/Jetson_build.md#jetson环境搭建)完成JetPack安装
NVIDIA Jetson用户请参考[Jetson平台编译指南](../../cpp/docs/Jetson_build.md#jetson环境搭建)完成JetPack安装
### 1. 下载代码
......
......@@ -3,17 +3,15 @@
在PaddlePaddle中预测引擎和训练引擎底层有着不同的优化方法, 预测引擎使用了AnalysisPredictor,专门针对推理进行了优化,是基于[C++预测库](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/native_infer.html)的Python接口,该引擎可以对模型进行多项图优化,减少不必要的内存拷贝。如果用户在部署已训练模型的过程中对性能有较高的要求,我们提供了独立于PaddleDetection的预测脚本,方便用户直接集成部署。
主要包含两个步骤:
- 导出预测模型
- 基于Python进行预测
PaddleDetection在训练过程包括网络的前向和优化器相关参数,而在部署过程中,我们只需要前向参数,具体参考:[导出模型](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/deploy/EXPORT_MODEL.md)
导出后目录下,包括`infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, `model.pdmodel`四个文件。
PP-Tracking也提供了AI Studio公开项目案例,可以参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)快速上手体验
PP-Tracking也提供了AI Studio公开项目案例,教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)
## 1. 对FairMOT模型的导出和预测
### 1.1 导出预测模型
```bash
# 命令行导出PaddleDetection发布的权重
......@@ -35,11 +33,12 @@ tar -xvf fairmot_hrnetv2_w18_dlafpn_30e_576x320.tar
wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
# Python预测视频
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320 --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts --save_images
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320 --video_file=mot17_demo.mp4 --device=GPU --threshold=0.5 --save_mot_txts --save_images
```
**注意:**
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- `--threshold`表示结果可视化的置信度阈值,默认为0.5,低于该阈值的结果会被过滤掉,为了可视化效果更佳,可根据实际情况自行修改。
- 对于多类别或车辆的FairMOT模型的导出和Python预测只需更改相应的config和模型权重即可。如:
```bash
job_name=mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone
......@@ -48,7 +47,7 @@ python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fa
# 命令行导出模型
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c ${config} -o weights=https://paddledet.bj.bcebos.com/models/mot/${job_name}.pdparams
# Python预测视频
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/${job_name} --video_file={your video name}.mp4 --device=GPU --save_mot_txts --save_images
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/${job_name} --video_file={your video name}.mp4 --device=GPU --threshold=0.5 --save_mot_txts --save_images
```
- 多类别跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,cls_id,-1,-1`
- visdrone多类别跟踪demo视频可从此链接下载:`wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/visdrone_demo.mp4`
......@@ -57,17 +56,14 @@ python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fa
## 2. 对DeepSORT模型的导出和预测
### 2.1 导出预测模型
Step 1:导出检测模型
```bash
# 导出JDE YOLOv3行人检测模型
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/detector/jde_yolov3_darknet53_30e_1088x608_mix.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams
# 或导出PPYOLOv2行人检测模型
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_640x640_mot17half.pdparams
```
Step 2:导出ReID模型
Step 2:导出行人ReID模型
```bash
# 导出PCB Pyramid ReID模型
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams
......@@ -75,22 +71,44 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_pplcnet.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
```
### 2.2 用导出的模型基于Python去预测
### 2.2 用导出的模型基于Python去预测行人跟踪
```bash
# 下载行人跟踪demo视频:
wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
# 用导出JDE YOLOv3行人检测模型和PCB Pyramid ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608_mix/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts --save_images
# 用导出JDE YOLOv3行人检测模型和PCB Pyramid ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608_mix/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file=mot17_demo.mp4 --device=GPU --threshold=0.5 --save_mot_txts --save_images
# 或用导出的PPYOLOv2行人检测模型和PPLCNet ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file=mot17_demo.mp4 --device=GPU --scaled=True --save_mot_txts --save_images
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file=mot17_demo.mp4 --device=GPU --threshold=0.5 --scaled=True --save_mot_txts --save_images
```
### 2.3 用导出的模型基于Python去预测车辆跟踪
```bash
# 下载车辆检测PicoDet导出的模型:
wget https://paddledet.bj.bcebos.com/models/mot/deepsort/picodet_l_640_aic21mtmct_vehicle.tar
tar -xvf picodet_l_640_aic21mtmct_vehicle.tar
# 或者车辆检测PP-YOLOv2导出的模型:
wget https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle.tar
tar -xvf ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle.tar
# 下载车辆ReID导出的模型:
wget https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.tar
tar -xvf deepsort_pplcnet_vehicle.tar
# 用导出的PicoDet车辆检测模型和PPLCNet车辆ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --device=GPU --scaled=True --threshold=0.5 --video_file={your video}.mp4 --save_mot_txts --save_images
# 用导出的PP-YOLOv2车辆检测模型和PPLCNet车辆ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --device=GPU --scaled=True --threshold=0.5 --video_file={your video}.mp4 --save_mot_txts --save_images
```
**注意:**
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt)或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`
- `--threshold`表示结果可视化的置信度阈值,默认为0.5,低于该阈值的结果会被过滤掉,为了可视化效果更佳,可根据实际情况自行修改。
- `--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
- DeepSORT算法不支持多类别跟踪,只支持单类别跟踪,且ReID模型最好是与检测模型同一类别的物体训练过的,比如行人跟踪最好使用行人ReID模型,车辆跟踪最好使用车辆ReID模型。
## 3. 跨境跟踪模型的导出和预测
......@@ -99,6 +117,9 @@ Step 1:下载导出的检测模型
```bash
wget https://paddledet.bj.bcebos.com/models/mot/deepsort/picodet_l_640_aic21mtmct_vehicle.tar
tar -xvf picodet_l_640_aic21mtmct_vehicle.tar
# 或者
wget https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle.tar
tar -xvf ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle.tar
```
Step 2:下载导出的ReID模型
```bash
......@@ -106,17 +127,23 @@ wget https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicl
tar -xvf deepsort_pplcnet_vehicle.tar
```
### 3.2 用导出的模型基于Python去预测
### 3.2 用导出的模型基于Python去做跨镜头跟踪
```bash
# 用导出PicoDet车辆检测模型和PPLCNet车辆ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir={your mtmct scene video folder} --mtmct_cfg=mtmct_cfg --device=GPU --scaled=True --save_mot_txts --save_images
# 用导出的PicoDet车辆检测模型和PPLCNet车辆ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir={your mtmct scene video folder} --mtmct_cfg=mtmct_cfg --device=GPU --scaled=True --threshold=0.5 --save_mot_txts --save_images
# 用导出的PP-YOLOv2车辆检测模型和PPLCNet车辆ReID模型
python deploy/pptracking/python/mot_sde_infer.py --model_dir=ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir={your mtmct scene video folder} --mtmct_cfg=mtmct_cfg --device=GPU --scaled=True --threshold=0.5 --save_mot_txts --save_images
```
**注意:**
跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt),或`--save_images`表示保存跟踪结果可视化图片。
跨镜头跟踪结果txt文件每行信息是`carame_id,frame,id,x1,y1,w,h,-1,-1`
`--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
`--mtmct_dir`是MTMCT预测的某个场景的文件夹名字,里面包含该场景不同摄像头拍摄视频的图片文件夹,其数量至少为两个。
`--mtmct_cfg`是MTMCT预测的某个场景的配置文件,里面包含该一些trick操作的开关和该场景摄像头相关设置的文件路径,用户可以自行更改相关路径以及设置某些操作是否启用。
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt),或`--save_images`表示保存跟踪结果可视化图片。
- 跨镜头跟踪结果txt文件每行信息是`camera_id,frame,id,x1,y1,w,h,-1,-1`
- `--threshold`表示结果可视化的置信度阈值,默认为0.5,低于该阈值的结果会被过滤掉,为了可视化效果更佳,可根据实际情况自行修改。
- `--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。
- DeepSORT算法不支持多类别跟踪,只支持单类别跟踪,且ReID模型最好是与检测模型同一类别的物体训练过的,比如行人跟踪最好使用行人ReID模型,车辆跟踪最好使用车辆ReID模型。
- `--mtmct_dir`是MTMCT预测的某个场景的文件夹名字,里面包含该场景不同摄像头拍摄视频的图片文件夹,其数量至少为两个。
- `--mtmct_cfg`是MTMCT预测的某个场景的配置文件,里面包含该一些trick操作的开关和该场景摄像头相关设置的文件路径,用户可以自行更改相关路径以及设置某些操作是否启用。
## 参数说明:
......
......@@ -14,7 +14,7 @@
"""
This code is based on https://github.com/LCFractal/AIC21-MTMC/tree/main/reid/reid-matching/tools
Note: The following codes are strongly related to carame parameters of the AIC21 test-set S06,
Note: The following codes are strongly related to camera parameters of the AIC21 test-set S06,
so they can only be used in S06, and can not be used for other MTMCT datasets.
"""
......@@ -202,7 +202,7 @@ def get_sim_matrix(cid_tid_dict,
use_ff=True,
use_rerank=True,
use_st_filter=False):
# Note: carame releated get_sim_matrix function,
# Note: camera releated get_sim_matrix function,
# which is different from the one in utils.py.
count = len(cid_tids)
......
......@@ -211,21 +211,21 @@ def print_mtmct_result(gt_file, pred_file):
def get_mtmct_matching_results(pred_mtmct_file, secs_interval=0.5,
video_fps=20):
res = np.loadtxt(pred_mtmct_file) # 'cid, tid, fid, x1, y1, w, h, -1, -1'
carame_ids = list(map(int, np.unique(res[:, 0])))
camera_ids = list(map(int, np.unique(res[:, 0])))
res = res[:, :7]
# each line in res: 'cid, tid, fid, x1, y1, w, h'
carame_tids = []
carame_results = dict()
for c_id in carame_ids:
carame_results[c_id] = res[res[:, 0] == c_id]
tids = np.unique(carame_results[c_id][:, 1])
camera_tids = []
camera_results = dict()
for c_id in camera_ids:
camera_results[c_id] = res[res[:, 0] == c_id]
tids = np.unique(camera_results[c_id][:, 1])
tids = list(map(int, tids))
carame_tids.append(tids)
camera_tids.append(tids)
# select common tids throughout each video
common_tids = reduce(np.intersect1d, carame_tids)
common_tids = reduce(np.intersect1d, camera_tids)
if len(common_tids) == 0:
print(
'No common tracked ids in these videos, please check your MOT result or select new videos.'
......@@ -236,15 +236,15 @@ def get_mtmct_matching_results(pred_mtmct_file, secs_interval=0.5,
cid_tid_fid_results = dict()
cid_tid_to_fids = dict()
interval = int(secs_interval * video_fps) # preferably less than 10
for c_id in carame_ids:
for c_id in camera_ids:
cid_tid_fid_results[c_id] = dict()
cid_tid_to_fids[c_id] = dict()
for t_id in common_tids:
tid_mask = carame_results[c_id][:, 1] == t_id
tid_mask = camera_results[c_id][:, 1] == t_id
cid_tid_fid_results[c_id][t_id] = dict()
carame_trackid_results = carame_results[c_id][tid_mask]
fids = np.unique(carame_trackid_results[:, 2])
camera_trackid_results = camera_results[c_id][tid_mask]
fids = np.unique(camera_trackid_results[:, 2])
fids = fids[fids % interval == 0]
fids = list(map(int, fids))
cid_tid_to_fids[c_id][t_id] = fids
......@@ -253,13 +253,13 @@ def get_mtmct_matching_results(pred_mtmct_file, secs_interval=0.5,
st_frame = f_id
ed_frame = f_id + interval
st_mask = carame_trackid_results[:, 2] >= st_frame
ed_mask = carame_trackid_results[:, 2] < ed_frame
st_mask = camera_trackid_results[:, 2] >= st_frame
ed_mask = camera_trackid_results[:, 2] < ed_frame
frame_mask = np.logical_and(st_mask, ed_mask)
cid_tid_fid_results[c_id][t_id][f_id] = carame_trackid_results[
cid_tid_fid_results[c_id][t_id][f_id] = camera_trackid_results[
frame_mask]
return carame_results, cid_tid_fid_results
return camera_results, cid_tid_fid_results
def save_mtmct_crops(cid_tid_fid_res,
......@@ -267,23 +267,23 @@ def save_mtmct_crops(cid_tid_fid_res,
crops_dir,
width=300,
height=200):
carame_ids = cid_tid_fid_res.keys()
camera_ids = cid_tid_fid_res.keys()
seqs_folder = os.listdir(images_dir)
seqs = []
for x in seqs_folder:
if os.path.isdir(os.path.join(images_dir, x)):
seqs.append(x)
assert len(seqs) == len(carame_ids)
assert len(seqs) == len(camera_ids)
seqs.sort()
if not os.path.exists(crops_dir):
os.makedirs(crops_dir)
common_tids = list(cid_tid_fid_res[list(carame_ids)[0]].keys())
common_tids = list(cid_tid_fid_res[list(camera_ids)[0]].keys())
# get crops by name 'tid_cid_fid.jpg
for t_id in common_tids:
for i, c_id in enumerate(carame_ids):
for i, c_id in enumerate(camera_ids):
infer_dir = os.path.join(images_dir, seqs[i])
if os.path.exists(os.path.join(infer_dir, 'img1')):
infer_dir = os.path.join(infer_dir, 'img1')
......@@ -312,24 +312,24 @@ def save_mtmct_crops(cid_tid_fid_res,
t_id, c_id))
def save_mtmct_vis_results(carame_results,
def save_mtmct_vis_results(camera_results,
images_dir,
save_dir,
save_videos=False):
# carame_results: 'cid, tid, fid, x1, y1, w, h'
carame_ids = carame_results.keys()
# camera_results: 'cid, tid, fid, x1, y1, w, h'
camera_ids = camera_results.keys()
seqs_folder = os.listdir(images_dir)
seqs = []
for x in seqs_folder:
if os.path.isdir(os.path.join(images_dir, x)):
seqs.append(x)
assert len(seqs) == len(carame_ids)
assert len(seqs) == len(camera_ids)
seqs.sort()
if not os.path.exists(save_dir):
os.makedirs(save_dir)
for i, c_id in enumerate(carame_ids):
for i, c_id in enumerate(camera_ids):
print("Start visualization for camera {} of sequence {}.".format(
c_id, seqs[i]))
cid_save_dir = os.path.join(save_dir, '{}'.format(seqs[i]))
......@@ -344,7 +344,7 @@ def save_mtmct_vis_results(carame_results,
for f_id, im_path in enumerate(all_images):
img = cv2.imread(os.path.join(infer_dir, im_path))
tracks = carame_results[c_id][carame_results[c_id][:, 2] == f_id]
tracks = camera_results[c_id][camera_results[c_id][:, 2] == f_id]
if tracks.shape[0] > 0:
tracked_ids = tracks[:, 1]
xywhs = tracks[:, 3:]
......
......@@ -498,7 +498,7 @@ def get_sim_matrix(cid_tid_dict,
use_ff=True,
use_rerank=True,
use_st_filter=False):
# Note: carame independent get_sim_matrix function,
# Note: camera independent get_sim_matrix function,
# which is different from the one in camera_utils.py.
count = len(cid_tids)
......
......@@ -870,7 +870,7 @@ def predict_mtmct(detector, reid_model, mtmct_dir, mtmct_cfg):
roi_dir=roi_dir)
if FLAGS.save_images:
carame_results, cid_tid_fid_res = get_mtmct_matching_results(
camera_results, cid_tid_fid_res = get_mtmct_matching_results(
pred_mtmct_file)
crops_dir = os.path.join(output_dir, 'mtmct_crops')
......@@ -879,7 +879,7 @@ def predict_mtmct(detector, reid_model, mtmct_dir, mtmct_cfg):
save_dir = os.path.join(output_dir, 'mtmct_vis')
save_mtmct_vis_results(
carame_results,
camera_results,
images_dir=mtmct_dir,
save_dir=save_dir,
save_videos=FLAGS.save_images)
......
docs/images/pptracking.png

150.2 KB | W: | H:

docs/images/pptracking.png

456.8 KB | W: | H:

docs/images/pptracking.png
docs/images/pptracking.png
docs/images/pptracking.png
docs/images/pptracking.png
  • 2-up
  • Swipe
  • Onion skin
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册