From 0ccce02b356e896e97aa0d199222cfc371830d51 Mon Sep 17 00:00:00 2001 From: Feng Ni Date: Fri, 26 Nov 2021 11:26:04 +0800 Subject: [PATCH] [MOT] fix pptracking doc readme (#4719) --- README_cn.md | 2 +- configs/mot/README.md | 128 ++++++------ configs/mot/README_cn.md | 171 ---------------- configs/mot/README_en.md | 187 ++++++++++++++++++ configs/mot/deepsort/README_cn.md | 6 +- configs/mot/fairmot/README.md | 2 +- configs/mot/fairmot/README_cn.md | 2 +- configs/mot/headtracking21/README_cn.md | 2 +- configs/mot/jde/README.md | 2 +- configs/mot/jde/README_cn.md | 2 +- configs/mot/mcfairmot/README.md | 2 +- configs/mot/mcfairmot/README_cn.md | 2 +- ...2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml | 64 ++++++ configs/mot/pedestrian/README_cn.md | 2 +- configs/mot/vehicle/README_cn.md | 2 +- deploy/pptracking/README.md | 11 +- deploy/pptracking/python/README.md | 36 +++- 17 files changed, 368 insertions(+), 255 deletions(-) delete mode 100644 configs/mot/README_cn.md create mode 100644 configs/mot/README_en.md create mode 100644 configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml diff --git a/README_cn.md b/README_cn.md index 4cab33e9a..91137d630 100644 --- a/README_cn.md +++ b/README_cn.md @@ -257,7 +257,7 @@ PaddleDetection为基于飞桨PaddlePaddle的端到端目标检测套件,提 - HigherHRNet - HRNet - LiteHRNet -- [多目标跟踪](configs/mot/README_cn.md) +- [多目标跟踪](configs/mot/README.md) - [PP-Tracking](deploy/pptracking/README.md) - [DeepSORT](configs/mot/deepsort/README_cn.md) - [JDE](configs/mot/jde/README_cn.md) diff --git a/configs/mot/README.md b/configs/mot/README.md index 5f25ef6e9..afcfa45cc 100644 --- a/configs/mot/README.md +++ b/configs/mot/README.md @@ -1,30 +1,30 @@ -English | [简体中文](README_cn.md) +简体中文 | [English](README_en.md) -# MOT (Multi-Object Tracking) +# 多目标跟踪 (Multi-Object Tracking) -## Table of Contents -- [Introduction](#Introduction) -- [Installation](#Installation) -- [Model Zoo](#Model_Zoo) -- [Dataset Preparation](#Dataset_Preparation) -- [Citations](#Citations) +## 内容 +- [简介](#简介) +- [安装依赖](#安装依赖) +- [模型库](#模型库) +- [数据集准备](#数据集准备) +- [引用](#引用) -## Introduction -The current mainstream multi-objective tracking (MOT) algorithm is mainly composed of two parts: detection and embedding. Detection aims to detect the potential targets in each frame of the video. Embedding assigns and updates the detected target to the corresponding track (named ReID task). According to the different implementation of these two parts, it can be divided into **SDE** series and **JDE** series algorithm. +## 简介 -- **SDE** (Separate Detection and Embedding) is a kind of algorithm which completely separates Detection and Embedding. The most representative is **DeepSORT** algorithm. This design can make the system fit any kind of detectors without difference, and can be improved for each part separately. However, due to the series process, the speed is slow. Time-consuming is a great challenge in the construction of real-time MOT system. +当前主流的多目标追踪(MOT)算法主要由两部分组成:Detection+Embedding。Detection部分即针对视频,检测出每一帧中的潜在目标。Embedding部分则将检出的目标分配和更新到已有的对应轨迹上(即ReID重识别任务)。根据这两部分实现的不同,又可以划分为**SDE**系列和**JDE**系列算法。 -- **JDE** (Joint Detection and Embedding) is to learn detection and embedding simultaneously in a shared neural network, and set the loss function with a multi task learning approach. The representative algorithms are **JDE** and **FairMOT**. This design can achieve high-precision real-time MOT performance. 
+- SDE(Separate Detection and Embedding)这类算法完全分离Detection和Embedding两个环节,最具代表性的就是**DeepSORT**算法。这样的设计可以使系统无差别地适配各类检测器,可以针对两个部分分别调优,但由于流程上是串联的导致速度慢耗时较长,在构建实时MOT系统中面临较大挑战(检测结果与已有轨迹的关联环节可参考下文的示意代码)。

-Paddledetection implements three MOT algorithms of these two series.
+- JDE(Joint Detection and Embedding)这类算法是在一个共享神经网络中同时学习Detection和Embedding,使用一个多任务学习的思路设置损失函数。代表性的算法有**JDE**和**FairMOT**。这样的设计兼顾精度和速度,可以实现高精度的实时多目标跟踪。

-- [DeepSORT](https://arxiv.org/abs/1812.00442) (Deep Cosine Metric Learning SORT) extends the original [SORT](https://arxiv.org/abs/1703.07402) (Simple Online and Realtime Tracking) algorithm, it adds a CNN model to extract features in image of human part bounded by a detector. It integrates appearance information based on a deep appearance descriptor, and assigns and updates the detected targets to the existing corresponding trajectories like ReID task. The detection bboxes result required by DeepSORT can be generated by any detection model, and then the saved detection result file can be loaded for tracking. Here we select the `PCB + Pyramid ResNet101` and `PPLCNet` models provided by [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) as the ReID model.
+PaddleDetection实现了这两个系列的3种多目标跟踪算法。
+- [DeepSORT](https://arxiv.org/abs/1703.07402)(Deep Cosine Metric Learning SORT) 扩展了原有的[SORT](https://arxiv.org/abs/1602.00763)(Simple Online and Realtime Tracking)算法,增加了一个CNN模型用于在检测器限定的人体部分图像中提取特征,在深度外观描述的基础上整合外观信息,将检出的目标分配和更新到已有的对应轨迹上即进行一个ReID重识别任务。DeepSORT所需的检测框可以由任意一个检测器来生成,然后读入保存的检测结果和视频图片即可进行跟踪预测。ReID模型此处选择[PaddleClas](https://github.com/PaddlePaddle/PaddleClas)提供的`PCB+Pyramid ResNet101`和`PPLCNet`模型。

-- [JDE](https://arxiv.org/abs/1909.12605) (Joint Detection and Embedding) learns the object detection task and appearance embedding task simutaneously in a shared neural network. And the detection results and the corresponding embeddings are also outputed at the same time. JDE original paper is based on an Anchor Base detector YOLOv3 , adding a new ReID branch to learn embeddings. The training process is constructed as a multi-task learning problem, taking into account both accuracy and speed.
+- [JDE](https://arxiv.org/abs/1909.12605)(Joint Detection and Embedding)是在一个单一的共享神经网络中同时学习目标检测任务和embedding任务,并同时输出检测结果和对应的外观embedding匹配的算法。JDE原论文是基于Anchor Base的YOLOv3检测器新增加一个ReID分支学习embedding,训练过程被构建为一个多任务联合学习问题,兼顾精度和速度。

-- [FairMOT](https://arxiv.org/abs/2004.01888) is based on an Anchor Free detector Centernet, which overcomes the problem of anchor and feature misalignment in anchor based detection framework. The fusion of deep and shallow features enables the detection and ReID tasks to obtain the required features respectively. It also uses low dimensional ReID features. FairMOT is a simple baseline composed of two homogeneous branches propose to predict the pixel level target score and ReID features. It achieves the fairness between the two tasks and obtains a higher level of real-time MOT performance.
+- [FairMOT](https://arxiv.org/abs/2004.01888)以Anchor Free的CenterNet检测器为基础,克服了Anchor-Based的检测框架中anchor和特征不对齐问题,深浅层特征融合使得检测和ReID任务各自获得所需要的特征,并且使用低维度ReID特征,提出了一种由两个同质分支组成的简单baseline来预测像素级目标得分和ReID特征,实现了两个任务之间的公平性,并获得了更高水平的实时多目标跟踪精度。

-[PP-Tracking](../../deploy/pptracking/README.md) is the first open source real-time tracking system based on PaddlePaddle deep learning framework. 
Aiming at the difficulties and pain points of the actual business, PP-Tracking has built-in capabilities and industrial applications such as pedestrian and vehicle tracking, cross-camera tracking, multi-class tracking, small target tracking and traffic counting, and provides a visual development interface. The model integrates multi-object tracking, object detection and ReID lightweight algorithm to further improve the deployment performance of PP-Tracking on the server. It also supports Python and C + + deployment and adapts to Linux, NVIDIA and Jetson multi platform environment.。
+[PP-Tracking](../../deploy/pptracking/README.md)是基于PaddlePaddle深度学习框架的业界首个开源实时跟踪系统。针对实际业务的难点痛点,PP-Tracking内置行人车辆跟踪、跨镜头跟踪、多类别跟踪、小目标跟踪及流量计数等能力与产业应用,同时提供可视化开发界面。模型集成多目标跟踪,目标检测,ReID轻量级算法,进一步提升PP-Tracking在服务器端部署性能。同时支持Python,C++部署,适配Linux,NVIDIA Jetson多平台环境。
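+
+为帮助理解上面SDE系列算法中"将检出目标关联到已有轨迹"这一环节,下面给出一个基于ReID余弦距离与匈牙利匹配的极简Python示意。注意:这只是概念演示,函数名与阈值均为假设,并非PaddleDetection的实际实现,实际系统中通常还会结合卡尔曼滤波等运动信息:
+
+```python
+# 示意代码:用ReID特征把当前帧检测结果关联到已有轨迹(非PaddleDetection实际实现)
+import numpy as np
+from scipy.optimize import linear_sum_assignment
+
+def associate(track_feats, det_feats, max_cosine_dist=0.4):
+    """track_feats: (M, D),det_feats: (N, D),均为L2归一化后的外观特征。"""
+    cost = 1.0 - track_feats @ det_feats.T          # 余弦距离 = 1 - 余弦相似度
+    rows, cols = linear_sum_assignment(cost)        # 匈牙利算法求最小代价匹配
+    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cosine_dist]
+
+# 随机构造归一化特征演示用法
+rng = np.random.default_rng(0)
+tracks = rng.normal(size=(3, 128)); tracks /= np.linalg.norm(tracks, axis=1, keepdims=True)
+dets = rng.normal(size=(4, 128)); dets /= np.linalg.norm(dets, axis=1, keepdims=True)
+print(associate(tracks, dets))                      # 输出 (轨迹索引, 检测索引) 匹配对
+```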
@@ -32,46 +32,44 @@ Paddledetection implements three MOT algorithms of these two series.

- video source:VisDrone2021, BDD100K dataset
+  视频来源:VisDrone2021, BDD100K开源数据集

-## Installation
+## 安装依赖

-Install all the related dependencies for MOT:
+一键安装MOT相关的依赖:
```
pip install lap sklearn motmetrics openpyxl cython_bbox
-or
+或者
pip install -r requirements.txt
```
-**Notes:**
-- Install `cython_bbox` for Windows: `pip install -e git+https://github.com/samson-wang/cython_bbox.git#egg=cython-bbox`. You can refer to this [tutorial](https://stackoverflow.com/questions/60349980/is-there-a-way-to-install-cython-bbox-for-windows).
-- Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
-
-
-## Model Zoo
-
-- base models
-    - [DeepSORT](deepsort/README.md)
-    - [JDE](jde/README.md)
-    - [FairMOT](fairmot/README.md)
-- feature models
-    - [Pedestrian](pedestrian/README.md)
-    - [Head](headtracking21/README.md)
-    - [Vehicle](vehicle/README.md)
-- Multi-Class Tracking
-    - [MCFairMOT](mcfairmot/README.md)
-- Multi-Target Multi-Camera Tracking
-    - [MTMCT](mtmct/README.md)
-
-
-## Dataset Preparation
-
-### MOT Dataset
-PaddleDetection use the same training data as [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT). Please refer to [PrepareMOTDataSet](../../docs/tutorials/PrepareMOTDataSet.md) to download and prepare all the training data including **Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16**. The former six are used as the mixed dataset for training, and MOT16 are used as the evaluation dataset. In addition, you can use **MOT15 and MOT20** for finetune. All pedestrians in these datasets have detection bbox labels and some have ID labels. If you want to use these datasets, please **follow their licenses**.
-
-### Data Format
-These several relevant datasets have the following structure:
+**注意:**
+- `cython_bbox`在Windows上安装:`pip install -e git+https://github.com/samson-wang/cython_bbox.git#egg=cython-bbox`。可参考这个[教程](https://stackoverflow.com/questions/60349980/is-there-a-way-to-install-cython-bbox-for-windows)。
+- 预测需确保已安装[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`。
+
+## 模型库
+
+- 基础模型
+    - [DeepSORT](deepsort/README_cn.md)
+    - [JDE](jde/README_cn.md)
+    - [FairMOT](fairmot/README_cn.md)
+- 特色垂类模型
+    - [行人跟踪](pedestrian/README_cn.md)
+    - [人头跟踪](headtracking21/README_cn.md)
+    - [车辆跟踪](vehicle/README_cn.md)
+- 多类别跟踪
+    - [多类别跟踪](mcfairmot/README_cn.md)
+- 跨镜头跟踪
+    - [跨镜头跟踪](mtmct/README_cn.md)
+
+## 数据集准备
+
+### MOT数据集
+PaddleDetection使用和[JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) 还有[FairMOT](https://github.com/ifzhang/FairMOT)相同的数据集。请参照[数据准备文档](../../docs/tutorials/PrepareMOTDataSet_cn.md)去下载并准备好所有的数据集,包括**Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17和MOT16**。使用前6者作为联合数据集参与训练,MOT16作为评测数据集。此外还可以使用**MOT15和MOT20**进行finetune。所有的行人都有检测框标签,部分有ID标签。如果您想使用这些数据集,请**遵循他们的License**。
+
+### 数据格式
+这几个相关数据集都遵循以下结构:
```
Caltech
   |——————images
   |        └——————00001.jpg
   |        |—————— ...
   |        └——————0000N.jpg
   └——————labels_with_ids
            └——————00001.txt
            |—————— ...
            └——————0000N.txt
MOT17
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train
```
-Annotations of these datasets are provided in a unified format. Every image has a corresponding annotation text. Given an image path, the annotation text path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`. 
-
-In the annotation text, each line is describing a bounding box and has the following format:
+所有数据集的标注是以统一数据格式提供的。各个数据集中每张图片都有相应的标注文本。给定一个图像路径,可以通过将字符串`images`替换为`labels_with_ids`并将`.jpg`替换为`.txt`来生成标注文本路径。在标注文本中,每行都描述一个边界框,格式如下:
```
[class] [identity] [x_center] [y_center] [width] [height]
```
-**Notes:**
-- `class` should be `0`. Only single-class multi-object tracking is supported now.
-- `identity` is an integer from `1` to `num_identities`(`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
-- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height, note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1.
+**注意**:
+- `class`为`0`,目前仅支持单类别多目标跟踪。
+- `identity`是从`1`到`num_identities`的整数(`num_identities`是数据集中不同物体实例的总数),如果此框没有`identity`标注,则为`-1`。
+- `[x_center] [y_center] [width] [height]`是中心点坐标和宽高,注意它们的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。

-### Dataset Directory
+### 数据集目录

-First, follow the command below to download the `image_list.zip` and unzip it in the `dataset/mot` directory:
+首先按照以下命令下载`image_lists.zip`并解压放在`dataset/mot`目录下:
```
wget https://dataset.bj.bcebos.com/mot/image_lists.zip
```
-Then download and unzip each dataset, and the final directory is as follows:
+然后依次下载各个数据集并解压,最终目录为:
```
dataset/mot
  |——————image_lists
  |——————caltech.10k.val
  |——————caltech.all
  |——————caltech.train
  |——————caltech.val
  |——————citypersons.train
  |——————citypersons.val
  |——————cuhksysu.train
  |——————cuhksysu.val
  |——————eth.train
  |——————mot15.train
  |——————mot16.train
  |——————mot17.train
  |——————mot20.train
  |——————prw.train
  |——————prw.val
  |——————Caltech
  |——————Cityscapes
  |——————CUHKSYSU
  |——————ETHZ
  |——————MOT15
  |——————MOT16
  |——————MOT17
  |——————MOT20
  |——————PRW
```

+### 快速下载
+按照以下命令可以快速下载各个数据集,注意需要按以上目录解压和存放:
+```
+wget https://dataset.bj.bcebos.com/mot/MOT17.zip
+wget https://dataset.bj.bcebos.com/mot/Caltech.zip
+wget https://dataset.bj.bcebos.com/mot/CUHKSYSU.zip
+wget https://dataset.bj.bcebos.com/mot/PRW.zip
+wget https://dataset.bj.bcebos.com/mot/Cityscapes.zip
+wget https://dataset.bj.bcebos.com/mot/ETH.zip
+wget https://dataset.bj.bcebos.com/mot/MOT16.zip
+```
+

-## Citations
+## 引用
```
@inproceedings{Wojke2017simple,
  title={Simple Online and Realtime Tracking with a Deep Association Metric},
diff --git a/configs/mot/README_cn.md b/configs/mot/README_cn.md
deleted file mode 100644
index 6addfd905..000000000
--- a/configs/mot/README_cn.md
+++ /dev/null
@@ -1,171 +0,0 @@
-简体中文 | [English](README.md)
-
-# 多目标跟踪 (Multi-Object Tracking)
-
-## 内容
-- [简介](#简介)
-- [安装依赖](#安装依赖)
-- [模型库](#模型库)
-- [数据集准备](#数据集准备)
-- [引用](#引用)
-
-## 简介
-
-当前主流的多目标追踪(MOT)算法主要由两部分组成:Detection+Embedding。Detection部分即针对视频,检测出每一帧中的潜在目标。Embedding部分则将检出的目标分配和更新到已有的对应轨迹上(即ReID重识别任务)。根据这两部分实现的不同,又可以划分为**SDE**系列和**JDE**系列算法。
-
-- SDE(Separate Detection and Embedding)这类算法完全分离Detection和Embedding两个环节,最具代表性的就是**DeepSORT**算法。这样的设计可以使系统无差别的适配各类检测器,可以针对两个部分分别调优,但由于流程上是串联的导致速度慢耗时较长,在构建实时MOT系统中面临较大挑战。
-
-- JDE(Joint Detection and Embedding)这类算法完是在一个共享神经网络中同时学习Detection和Embedding,使用一个多任务学习的思路设置损失函数。代表性的算法有**JDE**和**FairMOT**。这样的设计兼顾精度和速度,可以实现高精度的实时多目标跟踪。
-
-PaddleDetection实现了这两个系列的3种多目标跟踪算法。
-- [DeepSORT](https://arxiv.org/abs/1812.00442)(Deep Cosine Metric Learning SORT) 扩展了原有的[SORT](https://arxiv.org/abs/1703.07402)(Simple Online and Realtime Tracking)算法,增加了一个CNN模型用于在检测器限定的人体部分图像中提取特征,在深度外观描述的基础上整合外观信息,将检出的目标分配和更新到已有的对应轨迹上即进行一个ReID重识别任务。DeepSORT所需的检测框可以由任意一个检测器来生成,然后读入保存的检测结果和视频图片即可进行跟踪预测。ReID模型此处选择[PaddleClas](https://github.com/PaddlePaddle/PaddleClas)提供的`PCB+Pyramid ResNet101`和`PPLCNet`模型。
-
-- [JDE](https://arxiv.org/abs/1909.12605)(Joint Detection and Embedding)是在一个单一的共享神经网络中同时学习目标检测任务和embedding任务,并同时输出检测结果和对应的外观embedding匹配的算法。JDE原论文是基于Anchor Base的YOLOv3检测器新增加一个ReID分支学习embedding,训练过程被构建为一个多任务联合学习问题,兼顾精度和速度。
-
-- 
[FairMOT](https://arxiv.org/abs/2004.01888)以Anchor Free的CenterNet检测器为基础,克服了Anchor-Based的检测框架中anchor和特征不对齐问题,深浅层特征融合使得检测和ReID任务各自获得所需要的特征,并且使用低维度ReID特征,提出了一种由两个同质分支组成的简单baseline来预测像素级目标得分和ReID特征,实现了两个任务之间的公平性,并获得了更高水平的实时多目标跟踪精度。 - -[PP-Tracking](../../deploy/pptracking/README.md)是基于PaddlePaddle深度学习框架的业界首个开源实时跟踪系统。针对实际业务的难点痛点,PP-Tracking内置行人车辆跟踪、跨镜头跟踪、多类别跟踪、小目标跟踪及流量计数等能力与产业应用,同时提供可视化开发界面。模型集成多目标跟踪,目标检测,ReID轻量级算法,进一步提升PP-Tracking在服务器端部署性能。同时支持python,C++部署,适配Linux,Nvidia Jetson多平台环境。 -
- -
- -
- -
- 视频来源:VisDrone2021, BDD100K开源数据集
- - - -## 安装依赖 - -一键安装MOT相关的依赖: -``` -pip install lap sklearn motmetrics openpyxl cython_bbox -或者 -pip install -r requirements.txt -``` -**注意:** -- `cython_bbox`在windows上安装:`pip install -e git+https://github.com/samson-wang/cython_bbox.git#egg=cython-bbox`。可参考这个[教程](https://stackoverflow.com/questions/60349980/is-there-a-way-to-install-cython-bbox-for-windows)。 -- 预测需确保已安装[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`。 - -## 模型库 - -- 基础模型 - - [DeepSORT](deepsort/README_cn.md) - - [JDE](jde/README_cn.md) - - [FairMOT](fairmot/README_cn.md) -- 特色垂类模型 - - [行人跟踪](pedestrian/README_cn.md) - - [人头跟踪](headtracking21/README_cn.md) - - [车辆跟踪](vehicle/README_cn.md) -- 多类别跟踪 - - [多类别跟踪](mcfairmot/README_cn.md) -- 跨境头跟踪 - - [跨境头跟踪](mtmct/README_cn.md) - -## 数据集准备 - -### MOT数据集 -PaddleDetection使用和[JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) 还有[FairMOT](https://github.com/ifzhang/FairMOT)相同的数据集。请参照[数据准备文档](../../docs/tutorials/PrepareMOTDataSet_cn.md)去下载并准备好所有的数据集包括**Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17和MOT16**。使用前6者作为联合数据集参与训练,MOT16作为评测数据集。此外还可以使用**MOT15和MOT20**进行finetune。所有的行人都有检测框标签,部分有ID标签。如果您想使用这些数据集,请**遵循他们的License**。 - -### 数据格式 -这几个相关数据集都遵循以下结构: -``` -Caltech - |——————images - | └——————00001.jpg - | |—————— ... - | └——————0000N.jpg - └——————labels_with_ids - └——————00001.txt - |—————— ... - └——————0000N.txt -MOT17 - |——————images - | └——————train - | └——————test - └——————labels_with_ids - └——————train -``` -所有数据集的标注是以统一数据格式提供的。各个数据集中每张图片都有相应的标注文本。给定一个图像路径,可以通过将字符串`images`替换为`labels_with_ids`并将`.jpg`替换为`.txt`来生成标注文本路径。在标注文本中,每行都描述一个边界框,格式如下: -``` -[class] [identity] [x_center] [y_center] [width] [height] -``` -**注意**: -- `class`为`0`,目前仅支持单类别多目标跟踪。 -- `identity`是从`1`到`num_identifies`的整数(`num_identifies`是数据集中不同物体实例的总数),如果此框没有`identity`标注,则为`-1`。 -- `[x_center] [y_center] [width] [height]`是中心点坐标和宽高,注意他们的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。 - -### 数据集目录 - -首先按照以下命令下载image_lists.zip并解压放在`dataset/mot`目录下: -``` -wget https://dataset.bj.bcebos.com/mot/image_lists.zip -``` -然后依次下载各个数据集并解压,最终目录为: -``` -dataset/mot - |——————image_lists - |——————caltech.10k.val - |——————caltech.all - |——————caltech.train - |——————caltech.val - |——————citypersons.train - |——————citypersons.val - |——————cuhksysu.train - |——————cuhksysu.val - |——————eth.train - |——————mot15.train - |——————mot16.train - |——————mot17.train - |——————mot20.train - |——————prw.train - |——————prw.val - |——————Caltech - |——————Cityscapes - |——————CUHKSYSU - |——————ETHZ - |——————MOT15 - |——————MOT16 - |——————MOT17 - |——————MOT20 - |——————PRW -``` - - -## 引用 -``` -@inproceedings{Wojke2017simple, - title={Simple Online and Realtime Tracking with a Deep Association Metric}, - author={Wojke, Nicolai and Bewley, Alex and Paulus, Dietrich}, - booktitle={2017 IEEE International Conference on Image Processing (ICIP)}, - year={2017}, - pages={3645--3649}, - organization={IEEE}, - doi={10.1109/ICIP.2017.8296962} -} - -@inproceedings{Wojke2018deep, - title={Deep Cosine Metric Learning for Person Re-identification}, - author={Wojke, Nicolai and Bewley, Alex}, - booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)}, - year={2018}, - pages={748--756}, - organization={IEEE}, - doi={10.1109/WACV.2018.00087} -} - -@article{wang2019towards, - title={Towards Real-Time Multi-Object Tracking}, - author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin}, - journal={arXiv preprint arXiv:1909.12605}, - year={2019} -} - 
-@article{zhang2020fair,
- title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
- author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
- journal={arXiv preprint arXiv:2004.01888},
- year={2020}
-}
-```
diff --git a/configs/mot/README_en.md b/configs/mot/README_en.md
new file mode 100644
index 000000000..2e411d197
--- /dev/null
+++ b/configs/mot/README_en.md
@@ -0,0 +1,187 @@
+English | [简体中文](README.md)
+
+# MOT (Multi-Object Tracking)
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Installation](#Installation)
+- [Model Zoo](#Model_Zoo)
+- [Dataset Preparation](#Dataset_Preparation)
+- [Citations](#Citations)
+
+## Introduction
+The current mainstream multi-object tracking (MOT) algorithms are mainly composed of two parts: detection and embedding. Detection detects the potential targets in each frame of the video. Embedding assigns and updates the detected targets to the corresponding tracks (the ReID task). According to how these two parts are implemented, MOT algorithms can be divided into the **SDE** series and the **JDE** series.
+
+- **SDE** (Separate Detection and Embedding) algorithms completely separate detection from embedding. The most representative one is **DeepSORT**. This design lets the system fit any kind of detector and allows each part to be tuned separately, but because the two stages run in series the pipeline is slow, which is a great challenge for building a real-time MOT system.
+
+- **JDE** (Joint Detection and Embedding) algorithms learn detection and embedding simultaneously in a shared neural network, and set the loss function with a multi-task learning approach. The representative algorithms are **JDE** and **FairMOT**. This design balances accuracy and speed and can achieve high-precision real-time MOT performance.
+
+PaddleDetection implements three MOT algorithms of these two series.
+
+- [DeepSORT](https://arxiv.org/abs/1703.07402) (Deep Cosine Metric Learning SORT) extends the original [SORT](https://arxiv.org/abs/1602.00763) (Simple Online and Realtime Tracking) algorithm by adding a CNN model that extracts appearance features from the human regions cropped by a detector. It integrates appearance information based on a deep appearance descriptor, and assigns and updates the detected targets to the existing corresponding trajectories like a ReID task. The detection bboxes required by DeepSORT can be generated by any detection model, and then the saved detection result file can be loaded for tracking. Here we select the `PCB + Pyramid ResNet101` and `PPLCNet` models provided by [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) as the ReID model.
+
+- [JDE](https://arxiv.org/abs/1909.12605) (Joint Detection and Embedding) learns the object detection task and the appearance embedding task simultaneously in a shared neural network, and outputs the detection results together with the corresponding embeddings. The original JDE paper is based on the anchor-based detector YOLOv3, adding a new ReID branch to learn embeddings. The training process is constructed as a multi-task learning problem, taking into account both accuracy and speed.
+
+- [FairMOT](https://arxiv.org/abs/2004.01888) is based on the anchor-free detector CenterNet, which overcomes the problem of anchor and feature misalignment in anchor-based detection frameworks. The fusion of deep and shallow features enables the detection and ReID tasks to obtain the features each requires, and low-dimensional ReID features are used. FairMOT is a simple baseline composed of two homogeneous branches that predict the pixel-level object score and the ReID features (a minimal decoding sketch is given below). It achieves fairness between the two tasks and obtains a higher level of real-time MOT performance.
+
+[PP-Tracking](../../deploy/pptracking/README.md) is the first open source real-time tracking system based on the PaddlePaddle deep learning framework. Aiming at the difficulties and pain points of actual business, PP-Tracking has built-in capabilities and industrial applications such as pedestrian and vehicle tracking, cross-camera tracking, multi-class tracking, small target tracking and traffic counting, and provides a visual development interface. The model integrates multi-object tracking, object detection and lightweight ReID algorithms to further improve the deployment performance of PP-Tracking on the server. It also supports Python and C++ deployment and adapts to multiple platforms such as Linux and NVIDIA Jetson.
+
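+To make the FairMOT description above concrete, here is a minimal NumPy sketch of how a center-score heatmap and a per-pixel embedding map can be decoded into detections with appearance features. This is illustrative only: the function and shapes are assumptions, real implementations also apply heatmap NMS (max-pooling) and box regression, and this is not the actual PaddleDetection code:
+
+```python
+# Illustrative decoding of FairMOT-style outputs (not the PaddleDetection implementation).
+import numpy as np
+
+def decode(heatmap, emb_map, top_k=3):
+    """heatmap: (H, W) center scores; emb_map: (H, W, D) per-pixel ReID embeddings."""
+    h, w = heatmap.shape
+    flat = heatmap.ravel()
+    idx = np.argsort(flat)[::-1][:top_k]            # take the top-k scoring centers
+    ys, xs = np.unravel_index(idx, (h, w))
+    feats = emb_map[ys, xs]                         # look up the embedding at each center
+    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
+    return list(zip(xs, ys, flat[idx], feats))
+
+rng = np.random.default_rng(0)
+for x, y, score, _ in decode(rng.random((152, 272)), rng.normal(size=(152, 272, 128))):
+    print(int(x), int(y), round(float(score), 3))
+```
+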
+ +
+ +
+ +
+ video source:VisDrone2021, BDD100K dataset
+
+
+## Installation
+
+Install all the related dependencies for MOT:
+```
+pip install lap sklearn motmetrics openpyxl cython_bbox
+or
+pip install -r requirements.txt
+```
+**Notes:**
+- Install `cython_bbox` for Windows: `pip install -e git+https://github.com/samson-wang/cython_bbox.git#egg=cython-bbox`. You can refer to this [tutorial](https://stackoverflow.com/questions/60349980/is-there-a-way-to-install-cython-bbox-for-windows).
+- Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On the Linux(Ubuntu) platform you can directly install it by the following command: `apt-get update && apt-get install -y ffmpeg`.
+
+
+## Model Zoo
+
+- base models
+  - [DeepSORT](deepsort/README.md)
+  - [JDE](jde/README.md)
+  - [FairMOT](fairmot/README.md)
+- feature models
+  - [Pedestrian](pedestrian/README.md)
+  - [Head](headtracking21/README.md)
+  - [Vehicle](vehicle/README.md)
+- Multi-Class Tracking
+  - [MCFairMOT](mcfairmot/README.md)
+- Multi-Target Multi-Camera Tracking
+  - [MTMCT](mtmct/README.md)
+
+
+## Dataset Preparation
+
+### MOT Dataset
+PaddleDetection uses the same training data as [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT). Please refer to [PrepareMOTDataSet](../../docs/tutorials/PrepareMOTDataSet.md) to download and prepare all the training data including **Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16**. The former six are used as the mixed dataset for training, and MOT16 is used as the evaluation dataset. In addition, you can use **MOT15 and MOT20** for fine-tuning. All pedestrians in these datasets have detection bbox labels and some have ID labels. If you want to use these datasets, please **follow their licenses**.
+
+### Data Format
+These several relevant datasets have the following structure:
+```
+Caltech
+   |——————images
+   |        └——————00001.jpg
+   |        |—————— ...
+   |        └——————0000N.jpg
+   └——————labels_with_ids
+            └——————00001.txt
+            |—————— ...
+            └——————0000N.txt
+MOT17
+   |——————images
+   |        └——————train
+   |        └——————test
+   └——————labels_with_ids
+            └——————train
+```
+Annotations of these datasets are provided in a unified format. Every image has a corresponding annotation text. Given an image path, the annotation text path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`.
+
+In the annotation text, each line describes a bounding box and has the following format:
+```
+[class] [identity] [x_center] [y_center] [width] [height]
+```
+**Notes:**
+- `class` should be `0`. Only single-class multi-object tracking is supported now.
+- `identity` is an integer from `1` to `num_identities` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
+- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height; note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1. 
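+
+The two conventions above (the `images` → `labels_with_ids` path mapping and the normalized annotation line) can be exercised with a small Python sketch. This is illustrative only and not part of the PaddleDetection toolkit; the function names are hypothetical:
+
+```python
+# Illustrative helpers for the MOT annotation format described above.
+def label_path(image_path: str) -> str:
+    """Map an image path to its annotation path."""
+    return image_path.replace("images", "labels_with_ids").replace(".jpg", ".txt")
+
+def parse_line(line: str, img_w: int, img_h: int):
+    """Convert one normalized annotation line to a pixel-space (x, y, w, h) box."""
+    cls, identity, xc, yc, w, h = line.split()
+    xc, yc = float(xc) * img_w, float(yc) * img_h
+    w, h = float(w) * img_w, float(h) * img_h
+    return int(cls), int(identity), (xc - w / 2, yc - h / 2, w, h)
+
+print(label_path("MOT17/images/train/MOT17-02/img1/000001.jpg"))
+print(parse_line("0 1 0.5 0.5 0.2 0.4", img_w=1920, img_h=1080))
+```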
+
+### Dataset Directory
+
+First, follow the command below to download `image_lists.zip` and unzip it in the `dataset/mot` directory:
+```
+wget https://dataset.bj.bcebos.com/mot/image_lists.zip
+```
+Then download and unzip each dataset, and the final directory is as follows:
+```
+dataset/mot
+  |——————image_lists
+  |——————caltech.10k.val
+  |——————caltech.all
+  |——————caltech.train
+  |——————caltech.val
+  |——————citypersons.train
+  |——————citypersons.val
+  |——————cuhksysu.train
+  |——————cuhksysu.val
+  |——————eth.train
+  |——————mot15.train
+  |——————mot16.train
+  |——————mot17.train
+  |——————mot20.train
+  |——————prw.train
+  |——————prw.val
+  |——————Caltech
+  |——————Cityscapes
+  |——————CUHKSYSU
+  |——————ETHZ
+  |——————MOT15
+  |——————MOT16
+  |——————MOT17
+  |——————MOT20
+  |——————PRW
+```
+
+### Quick Download
+You can quickly download each dataset with the following commands. Note that the archives need to be unzipped and stored in the directory layout shown above.
+```
+wget https://dataset.bj.bcebos.com/mot/MOT17.zip
+wget https://dataset.bj.bcebos.com/mot/Caltech.zip
+wget https://dataset.bj.bcebos.com/mot/CUHKSYSU.zip
+wget https://dataset.bj.bcebos.com/mot/PRW.zip
+wget https://dataset.bj.bcebos.com/mot/Cityscapes.zip
+wget https://dataset.bj.bcebos.com/mot/ETH.zip
+wget https://dataset.bj.bcebos.com/mot/MOT16.zip
+```
+
+
+## Citations
+```
+@inproceedings{Wojke2017simple,
+  title={Simple Online and Realtime Tracking with a Deep Association Metric},
+  author={Wojke, Nicolai and Bewley, Alex and Paulus, Dietrich},
+  booktitle={2017 IEEE International Conference on Image Processing (ICIP)},
+  year={2017},
+  pages={3645--3649},
+  organization={IEEE},
+  doi={10.1109/ICIP.2017.8296962}
+}
+
+@inproceedings{Wojke2018deep,
+  title={Deep Cosine Metric Learning for Person Re-identification},
+  author={Wojke, Nicolai and Bewley, Alex},
+  booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
+  year={2018},
+  pages={748--756},
+  organization={IEEE},
+  doi={10.1109/WACV.2018.00087}
+}
+
+@article{wang2019towards,
+  title={Towards Real-Time Multi-Object Tracking},
+  author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
+  journal={arXiv preprint arXiv:1909.12605},
+  year={2019}
+}
+
+@article{zhang2020fair,
+  title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+  author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+  journal={arXiv preprint arXiv:2004.01888},
+  year={2020}
+}
+```
diff --git a/configs/mot/deepsort/README_cn.md b/configs/mot/deepsort/README_cn.md
index 50649b4ad..5750d35df 100644
--- a/configs/mot/deepsort/README_cn.md
+++ b/configs/mot/deepsort/README_cn.md
@@ -136,10 +136,10 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid

 ```bash
 # 用导出JDE YOLOv3行人检测模型和PCB Pyramid ReID模型
-python deploy/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608_mix/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608_mix/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts

 # 或用导出的PPYOLOv2行人检测模型和PPLCNet ReID模型
-python deploy/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ 
--reid_model_dir=output_inference/deepsort_pplcnet/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts +python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts ``` **注意:** 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt)或`--save_mot_txt_per_img`(对每张图片保存一个txt)表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 @@ -181,7 +181,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid ``` #### 4.使用导出的检测模型和ReID模型去部署: ``` -python deploy/python/mot_sde_infer.py --model_dir=output_inference/xxx./ --reid_model_dir=output_inference/deepsort_yyy/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts +python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/xxx./ --reid_model_dir=output_inference/deepsort_yyy/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts ``` **注意:** `--scaled`表示在模型输出结果的坐标是否已经是缩放回原图的,如果使用的检测模型是JDE的YOLOv3则为False,如果使用通用检测模型则为True。 diff --git a/configs/mot/fairmot/README.md b/configs/mot/fairmot/README.md index 667365f16..da546b7c1 100644 --- a/configs/mot/fairmot/README.md +++ b/configs/mot/fairmot/README.md @@ -124,7 +124,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairm ### 5. Using exported model for python inference ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **Notes:** The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images. diff --git a/configs/mot/fairmot/README_cn.md b/configs/mot/fairmot/README_cn.md index 368b1cdf6..70d23fe1f 100644 --- a/configs/mot/fairmot/README_cn.md +++ b/configs/mot/fairmot/README_cn.md @@ -123,7 +123,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairm ### 5. 用导出的模型基于Python去预测 ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **注意:** 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 diff --git a/configs/mot/headtracking21/README_cn.md b/configs/mot/headtracking21/README_cn.md index b54b2519d..d42541353 100644 --- a/configs/mot/headtracking21/README_cn.md +++ b/configs/mot/headtracking21/README_cn.md @@ -61,7 +61,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/headtracking2 ### 5. 
用导出的模型基于Python去预测 ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_headtracking21 --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_headtracking21 --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **注意:** 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 diff --git a/configs/mot/jde/README.md b/configs/mot/jde/README.md index 0e897e35c..7ed6f866e 100644 --- a/configs/mot/jde/README.md +++ b/configs/mot/jde/README.md @@ -92,7 +92,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/jde/jde_darkn ### 5. Using exported model for python inference ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **Notes:** The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images. diff --git a/configs/mot/jde/README_cn.md b/configs/mot/jde/README_cn.md index 1500ee8d4..3d50a7acd 100644 --- a/configs/mot/jde/README_cn.md +++ b/configs/mot/jde/README_cn.md @@ -93,7 +93,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/jde/jde_darkn ### 5. 用导出的模型基于Python去预测 ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **注意:** 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 diff --git a/configs/mot/mcfairmot/README.md b/configs/mot/mcfairmot/README.md index 24eb7eeaa..3706fe062 100644 --- a/configs/mot/mcfairmot/README.md +++ b/configs/mot/mcfairmot/README.md @@ -71,7 +71,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/mcfairmot/mcf ### 5. Using exported model for python inference ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **Notes:** The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images. diff --git a/configs/mot/mcfairmot/README_cn.md b/configs/mot/mcfairmot/README_cn.md index 1c64a6899..57cb679db 100644 --- a/configs/mot/mcfairmot/README_cn.md +++ b/configs/mot/mcfairmot/README_cn.md @@ -70,7 +70,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/mcfairmot/mcf ### 5. 
用导出的模型基于Python去预测 ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **注意:** 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 diff --git a/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml b/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml new file mode 100644 index 000000000..da1170ac5 --- /dev/null +++ b/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml @@ -0,0 +1,64 @@ +_BASE_: [ + '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml', + '../../datasets/mcmot.yml' +] + +metric: MCMOT +num_classes: 11 +weights: output/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot/model_final + +# for MCMOT training +TrainDataset: + !MCMOTDataSet + dataset_dir: dataset/mot + image_lists: ['bdd100k_mcmot.train'] + data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide'] + label_list: label_list.txt + +EvalMOTDataset: + !MOTImageFolder + dataset_dir: dataset/mot + data_root: bdd100k_mcmot/images/val + keep_ori_im: False + +# model config +architecture: FairMOT +pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams +for_mot: True + +FairMOT: + detector: CenterNet + reid: FairMOTEmbeddingHead + loss: FairMOTLoss + tracker: JDETracker # multi-class tracker + +CenterNetHead: + regress_ltrb: False + +CenterNetPostProcess: + regress_ltrb: False + max_per_img: 200 + +JDETracker: + min_box_area: 0 + vertical_ratio: 0 # no need to filter bboxes according to w/h + conf_thres: 0.4 + tracked_thresh: 0.4 + metric_type: cosine + +epoch: 30 +LearningRate: + base_lr: 0.0005 + schedulers: + - !PiecewiseDecay + gamma: 0.1 + milestones: [10, 20] + use_warmup: False + +OptimizerBuilder: + optimizer: + type: Adam + regularizer: NULL + +TrainReader: + batch_size: 8 diff --git a/configs/mot/pedestrian/README_cn.md b/configs/mot/pedestrian/README_cn.md index b5c030967..fc1d6d5ce 100644 --- a/configs/mot/pedestrian/README_cn.md +++ b/configs/mot/pedestrian/README_cn.md @@ -95,7 +95,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/pedestrian/fa ### 5. 用导出的模型基于Python去预测 ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_visdrone_pedestrian --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_visdrone_pedestrian --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **注意:** 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 diff --git a/configs/mot/vehicle/README_cn.md b/configs/mot/vehicle/README_cn.md index 6fa77e2ae..7a38e5e05 100644 --- a/configs/mot/vehicle/README_cn.md +++ b/configs/mot/vehicle/README_cn.md @@ -127,7 +127,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/vehicle/fairm ### 5. 
用导出的模型基于Python去预测 ```bash -python deploy/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle --video_file={your video name}.mp4 --device=GPU --save_mot_txts +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle --video_file={your video name}.mp4 --device=GPU --save_mot_txts ``` **注意:** 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 diff --git a/deploy/pptracking/README.md b/deploy/pptracking/README.md index f17e82429..1db6bc250 100644 --- a/deploy/pptracking/README.md +++ b/deploy/pptracking/README.md @@ -14,13 +14,16 @@ PP-Tracking是基于飞桨深度学习框架的业界首个开源实时跟踪系 ### 一、快速开始 -PP-Tracking提供了简洁的可视化界面,无需开发即可实现多种跟踪功能,可以参考[PP-Tracking可视化界面使用文档](https://github.com/yangyudong2020/PP-Tracking_GUi)快速上手体验 +PP-Tracking提供了简洁的可视化界面,无需开发即可实现多种跟踪功能,可以参考[PP-Tracking可视化界面使用文档](https://github.com/yangyudong2020/PP-Tracking_GUi)快速上手体验。 + +PP-Tracking也提供了AI Studio公开项目案例,可以参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)快速上手体验。 + ### 二、算法介绍 PP-Tracking集成了多目标跟踪,目标检测,ReID轻量级算法,提升跟踪系统实时性能。多目标跟踪算法基于FairMOT进行优化,实现了服务器端轻量级模型,同时基于不同应用场景提供了针对性的预训练模型。 -模型训练评估方法请参考[多目标跟踪快速开始](../../configs/mot/README_cn.md#快速开始) +模型训练评估方法请参考[多目标跟踪快速开始](../../configs/mot/README.md#快速开始) PP-Tracking中提供的多场景预训练模型及导出模型列表如下: @@ -30,13 +33,13 @@ PP-Tracking中提供的多场景预训练模型及导出模型列表如下: | 行人小目标跟踪 | VisDrone-pedestrian | 40.5 | 8.35 | [配置文件](../../configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.tar) | | 车辆跟踪 | BDD100k-vehicle | 32.6 | 24.3 | [配置文件](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.tar) | | 车辆小目标跟踪 | VisDrone-vehicle | 39.8 | 22.8 | [配置文件](../../configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.tar) -| 多类别跟踪 | BDD100k | - | 12.5 | [配置文件]() | [下载链接]() | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.tar) | +| 多类别跟踪 | BDD100k | - | 12.5 | [配置文件](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.tar) | | 多类别小目标跟踪 | VisDrone | 20.4 | 6.74 | [配置文件](../../configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml) | [下载链接](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.pdparams) | [下载链接](https://bj.bcebos.com/v1/paddledet/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.tar) | **注:** 1. 
模型预测速度为TensorRT FP16速度,测试环境为CUDA 10.2,JETPACK 4.5.1,TensorRT 7.1 -2. 更多跟踪模型请参考[多目标跟踪模型库](../../configs/mot/README_cn.md#模型库) +2. 更多跟踪模型请参考[多目标跟踪模型库](../../configs/mot/README.md#模型库) 检测模型使用轻量级特色模型PP-PicoDet,具体请参考[PP-PicoDet文档](../../configs/picodet) diff --git a/deploy/pptracking/python/README.md b/deploy/pptracking/python/README.md index 3b410d417..ffee698e0 100644 --- a/deploy/pptracking/python/README.md +++ b/deploy/pptracking/python/README.md @@ -1,4 +1,4 @@ -# Python端预测部署 +# PP-Tracking Python端预测部署 在PaddlePaddle中预测引擎和训练引擎底层有着不同的优化方法, 预测引擎使用了AnalysisPredictor,专门针对推理进行了优化,是基于[C++预测库](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/native_infer.html)的Python接口,该引擎可以对模型进行多项图优化,减少不必要的内存拷贝。如果用户在部署已训练模型的过程中对性能有较高的要求,我们提供了独立于PaddleDetection的预测脚本,方便用户直接集成部署。 @@ -10,30 +10,49 @@ PaddleDetection在训练过程包括网络的前向和优化器相关参数,而在部署过程中,我们只需要前向参数,具体参考:[导出模型](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/deploy/EXPORT_MODEL.md) 导出后目录下,包括`infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, `model.pdmodel`四个文件。 +PP-Tracking也提供了AI Studio公开项目案例,可以参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)快速上手体验。 + ## 1. 对FairMOT模型的导出和预测 ### 1.1 导出预测模型 ```bash +# 命令行导出PaddleDetection发布的权重 CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams + +# 命令行导出训完保存的checkpoint权重 +CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml -o weights=output/fairmot_hrnetv2_w18_dlafpn_30e_576x320/model_final.pdparams + +# 或下载PaddleDetection发布的已导出的模型 +wget https://bj.bcebos.com/v1/paddledet/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.tar +tar -xvf fairmot_hrnetv2_w18_dlafpn_30e_576x320.tar ``` +**注意:** + 导出的模型默认会保存在`output_inference`目录下,如新下载请存放于对应目录下。 ### 1.2 用导出的模型基于Python去预测 ```bash -python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320 --video_file={your video name}.mp4 --device=GPU --save_mot_txts +# 下载行人跟踪demo视频: +wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4 + +# Python预测视频 +python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320 --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts --save_images ``` **注意:** - 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件,或`--save_images`表示保存跟踪结果可视化图片。 - 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`。 - 对于多类别或车辆的FairMOT模型的导出和Python预测只需更改相应的config和模型权重即可。如: - ``` + ```bash job_name=mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone model_type=mot/mcfairmot config=configs/${model_type}/${job_name}.yml - + # 命令行导出模型 CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c ${config} -o weights=https://paddledet.bj.bcebos.com/models/mot/${job_name}.pdparams - python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/${job_name} --video_file={your video name}.mp4 --device=GPU --save_mot_txts + # Python预测视频 + python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/${job_name} --video_file={your video name}.mp4 --device=GPU --save_mot_txts --save_images ``` - 多类别跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,cls_id,-1,-1`。 + - visdrone多类别跟踪demo视频可从此链接下载:`wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/visdrone_demo.mp4` + - bdd100k车辆跟踪和多类别demo视频可从此链接下载:`wget 
https://bj.bcebos.com/v1/paddledet/data/mot/demo/bdd100k_demo.mp4`


## 2. 对DeepSORT模型的导出和预测

### 2.1 导出预测模型

@@ -59,11 +78,14 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid

### 2.2 用导出的模型基于Python去预测

```bash
+# 下载行人跟踪demo视频:
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+
 # 用导出JDE YOLOv3行人检测模型和PCB Pyramid ReID模型
-python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608_mix/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608_mix/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts --save_images

 # 或用导出的PPYOLOv2行人检测模型和PPLCNet ReID模型
-python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyolov2_r50vd_dcn_365e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --video_file=mot17_demo.mp4 --device=GPU --scaled=True --save_mot_txts --save_images
```
**注意:**
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`(对每个视频保存一个txt)或`--save_images`表示保存跟踪结果可视化图片。
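+
+补充说明:上文提到跟踪结果txt每行为`frame,id,x1,y1,w,h,score,-1,-1,-1`(多类别时为`frame,id,x1,y1,w,h,score,cls_id,-1,-1`)。下面给出一个读取该txt的极简Python示意,仅作演示用途,并非PP-Tracking自带脚本,文件路径为假设:
+
+```python
+# 示意代码:解析跟踪结果txt(非PP-Tracking自带脚本)
+# 单类别每行: frame,id,x1,y1,w,h,score,-1,-1,-1
+# 多类别每行: frame,id,x1,y1,w,h,score,cls_id,-1,-1
+from collections import defaultdict
+
+def load_mot_results(txt_path):
+    results = defaultdict(list)  # frame -> [(track_id, (x1, y1, w, h), score, cls_id), ...]
+    with open(txt_path) as f:
+        for line in f:
+            vals = line.strip().split(',')
+            frame, tid = int(vals[0]), int(vals[1])
+            x1, y1, w, h, score = map(float, vals[2:7])
+            cls_id = int(float(vals[7]))  # 单类别结果中该位恒为-1
+            results[frame].append((tid, (x1, y1, w, h), score, cls_id))
+    return results
+
+# 用法示例(路径为假设): results = load_mot_results('output/mot17_demo.txt')
+```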