[MOT] fix JDE doc (#2977)

* fix jde doc and metric test_mode
* add image_lists
* add readme_cn

33c8aebd · George Ni · GitHub · 33dca040
27 changed files
--- a/configs/datasets/mot.yml
+++ b/configs/datasets/mot.yml
-metric: MOTDet
+metric: MOT
 num_classes: 1

 MOTDataZoo: {

--- a/configs/mot/jde/README.md
+++ b/configs/mot/jde/README.md
@@ -6,12 +6,10 @@ English | [简体中文](README_cn.md)
 - [Introduction](#Introduction)
 - [Model Zoo](#Model_Zoo)
 - [Getting Start](#Getting_Start)
- [Citations](#Citations)

 ## Introduction

-[Joint Detection and Embedding](https://arxiv.org/abs/1909.12605)(JDE) is a fast and high-performance multiple-object tracker that learns the object detection task and appearance embedding task simutaneously in a shared neural network。
-JDE reached 64.4 MOTA on MOT16-tesing datatset.
+[Joint Detection and Embedding](https://arxiv.org/abs/1909.12605) (JDE) is a fast and high-performance multiple-object tracker that learns the object detection task and the appearance embedding task simultaneously in a shared neural network.
 <div align="center">
  <img src="../../../../docs/images/mot16_jde.gif" width=500 />
 </div>
@@ -21,11 +19,9 @@ JDE reached 64.4 MOTA on MOT16-tesing datatset.
 ### JDE on MOT-16 training set

 | backbone           | input shape | MOTA | IDF1  |  IDS  |   FP  |  FN  |  FPS  | download | config |
-| :-----------------| :------- | :----: | :----: | :---: | :----: | :---: | :---: |:---: | :---: |
-| DarkNet53(paper)  | 1088x608 |  74.8  |  67.3  | 1189  |  5558  | 21505 |  22.2 | ---- | ---- |
+| :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: |
 | DarkNet53          | 1088x608 |  73.2  |  69.4  | 1320  |  6613  | 21629 |   -   |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) |

-
 **Notes:**
 JDE used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches.

@@ -39,56 +35,25 @@ Training JDE on 8 GPUs with following command
 python -m paddle.distributed.launch --log_dir=./jde_darknet53_30e_1088x608/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml &>jde_darknet53_30e_1088x608.log 2>&1 &
 ```

-
 ### 2. Evaluation

-Evaluating the detector module of JDE on val dataset in single GPU with following commands:
-
-```bash
-# use weights released in PaddleDetection model zoo
-CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
-
-# use saved checkpoint in training
-CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=output/jde_darknet53_30e_1088x608/model_final
-```
-
-Evaluating the ReID module of JDE on val dataset in single GPU with following commands:
-
-```bash
-# use weights released in PaddleDetection model zoo
-CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o metric='MOTDet' weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
-
-# use saved checkpoint in training
-CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o metric='MOT' weights=output/jde_darknet53_30e_1088x608/model_final
-```
-
 Evaluating the track performance of JDE on val dataset in single GPU with following commands:

 ```bash
 # use weights released in PaddleDetection model zoo
-CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric='MOT' weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams

 # use saved checkpoint in training
-CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric='MOT' weights=output/jde_darknet53_30e_1088x608/model_final
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=output/jde_darknet53_30e_1088x608/model_final
 ```

 ### 3. Inference

-Inference images in single GPU with following commands, use `--infer_img` to inference a single image and `--infer_dir` to inference all images in the directory.
-
-```bash
-# inference single image
-CUDA_VISIBLE_DEVICES=0 python tools/infer.py configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric='MOT' weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --infer_img=./demo/000000014439.jpg
-
-# inference all images in the directory
-CUDA_VISIBLE_DEVICES=0 python tools/infer.py configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric='MOT' weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --infer_dir=./demo
-```
-
-Inference vidoe in single GPU with following commands.
+Run inference on a video on a single GPU with the following command.

 ```bash
 # inference on video
-CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric='MOT' weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --video_file={your video name}.mp4
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --video_file={your video name}.mp4

 ```
 ## Citations

--- a/configs/mot/jde/README_cn.md
+++ b/configs/mot/jde/README_cn.md
+简体中文 | [English](README.md)
+
+# JDE (Towards-Realtime-MOT)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo and Baselines](#model-zoo-and-baselines)
+- [Quick Start](#quick-start)
+
+
+## Introduction
+
+[Joint Detection and Embedding](https://arxiv.org/abs/1909.12605) (JDE) is a fast, high-performance multiple-object tracker that learns the object detection task and the appearance embedding task simultaneously in a shared neural network.
+<div align="center">
+  <img src="../../../../docs/images/mot16_jde.gif" width=500 />
+</div>
+
+## Model Zoo and Baselines
+
+### JDE on MOT-16 training set
+
+| backbone           | input shape | MOTA | IDF1  |  IDS  |   FP  |  FN  |  FPS  | download | config |
+| :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: |
+| DarkNet53          | 1088x608 |  73.2  |  69.4  | 1320  |  6613  | 21629 |   -   |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) |
+
+**Notes:**
+ JDE was trained on 8 GPUs with a mini-batch size of 4 per GPU for 30 epochs.
+
+## Quick Start
+
+### 1. Training
+
+Launch training on 8 GPUs with the following command:
+
+```bash
+python -m paddle.distributed.launch --log_dir=./jde_darknet53_30e_1088x608/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml &>jde_darknet53_30e_1088x608.log 2>&1 &
+```
+
+### 2. Evaluation
+
+Evaluate the tracking performance of JDE on a single GPU with the following commands:
+
+```bash
+# use weights released in PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=output/jde_darknet53_30e_1088x608/model_final
+```
+
+### 3. Inference
+
+Run inference on a video on a single GPU with the following command:
+
+```bash
+# inference on a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --video_file={your video name}.mp4
+
+```
+## Citations
+```
+@article{wang2019towards,
+  title={Towards Real-Time Multi-Object Tracking},
+  author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
+  journal={arXiv preprint arXiv:1909.12605},
+  year={2019}
+}
+```
--- a/configs/mot/jde/_base_/optimizer_60e.yml
+++ b/configs/mot/jde/_base_/optimizer_60e.yml
+epoch: 60
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [30, 44]
+    use_warmup: True
+  - !BurninWarmup
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
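
Review note: the schedule in `optimizer_60e.yml` can be read as a burn-in ramp followed by step decay at epochs 30 and 44. The sketch below is only an illustration of that reading, not the ppdet implementation. The function name `learning_rate`, the `steps_per_epoch` argument, and the quartic ramp (`burnin_power=4`) are assumptions introduced here; only `base_lr`, `gamma`, `milestones`, and the 1000 burn-in steps come from the config.

```python
def learning_rate(step, steps_per_epoch, base_lr=0.01, gamma=0.1,
                  milestones=(30, 44), burnin_steps=1000, burnin_power=4):
    """Hypothetical helper: learning rate at a global training step.

    Assumes milestones are expressed in epochs and that the burn-in
    ramp is polynomial with exponent `burnin_power` (an assumption).
    """
    # Burn-in: ramp the rate from 0 up to base_lr over the first steps.
    if step < burnin_steps:
        return base_lr * (step / burnin_steps) ** burnin_power
    # Piecewise decay: multiply by gamma once per milestone already passed.
    epoch = step // steps_per_epoch
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


# Example with a made-up epoch length of 2000 steps.
for epoch in (10, 30, 44):
    print(epoch, learning_rate(epoch * 2000, steps_per_epoch=2000))
```

Under these assumptions the rate stays at 0.01 until epoch 30, drops to roughly 0.001, and drops again to roughly 0.0001 at epoch 44 for the remainder of the 60 epochs.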
--- a/configs/mot/jde/jde_darknet53_30e_1088x608.yml
+++ b/configs/mot/jde/jde_darknet53_30e_1088x608.yml
@@ -11,7 +11,6 @@ JDE:
  detector: YOLOv3
  reid: JDEEmbeddingHead
  tracker: JDETracker
-  metric: 'MOT'

 YOLOv3:
  backbone: DarkNet

--- a/dataset/mot/gen_labels_MOT.py
+++ b/dataset/mot/gen_labels_MOT.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os.path as osp
+import os
+import numpy as np
+
+# choose a dataset from ['MOT15', 'MOT16', 'MOT17', 'MOT20']
+MOT_data = 'MOT16'
+
+
+def mkdirs(d):
+    if not osp.exists(d):
+        os.makedirs(d)
+
+
+seq_root = './{}/images/train'.format(MOT_data)
+label_root = './{}/labels_with_ids/train'.format(MOT_data)
+mkdirs(label_root)
+seqs = [s for s in os.listdir(seq_root)]
+
+tid_curr = 0
+tid_last = -1
+for seq in seqs:
+    with open(osp.join(seq_root, seq, 'seqinfo.ini')) as f:
+        seq_info = f.read()
+    seq_width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+        '\nimHeight')])
+    seq_height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+        '\nimExt')])
+
+    gt_txt = osp.join(seq_root, seq, 'gt', 'gt.txt')
+    gt = np.loadtxt(gt_txt, dtype=np.float64, delimiter=',')
+
+    seq_label_root = osp.join(label_root, seq, 'img1')
+    mkdirs(seq_label_root)
+
+    for fid, tid, x, y, w, h, mark, label, _ in gt:
+        if mark == 0 or not label == 1:
+            continue
+        fid = int(fid)
+        tid = int(tid)
+        if not tid == tid_last:
+            tid_curr += 1
+            tid_last = tid
+        x += w / 2
+        y += h / 2
+        label_fpath = osp.join(seq_label_root, '{:06d}.txt'.format(fid))
+        label_str = '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
+            tid_curr, x / seq_width, y / seq_height, w / seq_width,
+            h / seq_height)
+        with open(label_fpath, 'a') as f:
+            f.write(label_str)
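
Review note: the core of `gen_labels_MOT.py` is the conversion of one `gt.txt` row (top-left box in pixels) into a label line with a normalized box center and size. The sketch below isolates that step so it can be checked by hand. The helper name `mot_row_to_label` and the sample row are hypothetical; the column order and the output format follow the script above.

```python
def mot_row_to_label(row, img_width, img_height, track_id):
    """Hypothetical helper mirroring the per-row logic of the script.

    row: (frame, id, left, top, width, height, mark, class, visibility)
    Returns a line 'class track_id cx cy w h' with values normalized
    by the image size.
    """
    _, _, left, top, w, h, _, _, _ = row
    # Shift from the top-left corner to the box center.
    cx = left + w / 2
    cy = top + h / 2
    return '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
        track_id, cx / img_width, cy / img_height,
        w / img_width, h / img_height)


# Made-up row for a 1920x1080 sequence.
sample = (1, 7, 100.0, 200.0, 50.0, 100.0, 1, 1, 1.0)
print(mot_row_to_label(sample, 1920, 1080, track_id=3), end='')
# prints: 0 3 0.065104 0.231481 0.026042 0.092593
```

Note that the script assigns `track_id` from a running counter rather than from the `id` column, so identities stay unique across sequences; this appears to rely on `gt.txt` rows being grouped by track id.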
--- a/dataset/mot/image_lists/caltech.10k.val
+++ b/dataset/mot/image_lists/caltech.10k.val
--- a/dataset/mot/image_lists/caltech.all
+++ b/dataset/mot/image_lists/caltech.all
--- a/dataset/mot/image_lists/caltech.train
+++ b/dataset/mot/image_lists/caltech.train
--- a/dataset/mot/image_lists/caltech.val
+++ b/dataset/mot/image_lists/caltech.val
--- a/dataset/mot/image_lists/citypersons.train
+++ b/dataset/mot/image_lists/citypersons.train
--- a/dataset/mot/image_lists/citypersons.val
+++ b/dataset/mot/image_lists/citypersons.val
--- a/dataset/mot/image_lists/crowdhuman.train
+++ b/dataset/mot/image_lists/crowdhuman.train
--- a/dataset/mot/image_lists/crowdhuman.val
+++ b/dataset/mot/image_lists/crowdhuman.val
--- a/dataset/mot/image_lists/cuhksysu.train
+++ b/dataset/mot/image_lists/cuhksysu.train
--- a/dataset/mot/image_lists/cuhksysu.val
+++ b/dataset/mot/image_lists/cuhksysu.val
--- a/dataset/mot/image_lists/eth.train
+++ b/dataset/mot/image_lists/eth.train
--- a/dataset/mot/image_lists/mot15.train
+++ b/dataset/mot/image_lists/mot15.train
--- a/dataset/mot/image_lists/mot16.train
+++ b/dataset/mot/image_lists/mot16.train
--- a/dataset/mot/image_lists/mot17.emb
+++ b/dataset/mot/image_lists/mot17.emb
--- a/dataset/mot/image_lists/mot17.half
+++ b/dataset/mot/image_lists/mot17.half
--- a/dataset/mot/image_lists/mot17.train
+++ b/dataset/mot/image_lists/mot17.train
--- a/dataset/mot/image_lists/mot17.val
+++ b/dataset/mot/image_lists/mot17.val
--- a/dataset/mot/image_lists/mot20.train
+++ b/dataset/mot/image_lists/mot20.train
--- a/dataset/mot/image_lists/prw.train
+++ b/dataset/mot/image_lists/prw.train
--- a/dataset/mot/image_lists/prw.val
+++ b/dataset/mot/image_lists/prw.val
--- a/ppdet/modeling/architectures/jde.py
+++ b/ppdet/modeling/architectures/jde.py
@@ -26,6 +26,7 @@ __all__ = ['JDE']

 @register
 class JDE(BaseArch):
+    __category__ = 'architecture'
    __shared__ = ['metric']
    """
    JDE network, see https://arxiv.org/abs/1909.12605v1
@@ -38,13 +39,12 @@ class JDE(BaseArch):
            for ReID embedding evaluation, or 'MOT' for multi object tracking
            evaluation。
    """
-    __category__ = 'architecture'

    def __init__(self,
                 detector='YOLOv3',
                 reid='JDEEmbeddingHead',
                 tracker='JDETracker',
-                 metric='MOTDet'):
+                 metric='MOT'):
        super(JDE, self).__init__()
        self.detector = detector
        self.reid = reid