[MOT] add HRNet fairmot small scales (#4376)

* add 576 864 configs
* update 864 576 modelzoo

bd68a7ef · Feng Ni · GitHub · 8d0ad5fd
7 changed files
--- a/configs/mot/README.md
+++ b/configs/mot/README.md
@@ -164,9 +164,11 @@ If you use a stronger detection model, you can get better results. Each txt is t
 | backbone       | input shape | MOTA | IDF1 |  IDS  |    FP   |   FN   |    FPS    | download | config |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18   | 1088x608 |  70.7  |  65.7  |  4281  |  22485  | 138468 |    -     |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18   | 864x480  |  70.3  |  65.8  |  4056  |  18927  | 144486 |    -     |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18   | 576x320  |  65.3  |  64.8  |  4137  |  28860  | 163017 |    -     |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |

 **Notes:**
- FairMOT HRNetV2-W18 used 8 GPUs for training and mini-batch size as 6 on each GPU, and trained for 30 epoches. Only ImageNet pre-train model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.
+ FairMOT HRNetV2-W18 used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches. Only ImageNet pre-train model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.


 ## Feature Tracking Model
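
Reviewer note on the new table rows: the MOTA column can be cross-checked against the FP, FN and IDS columns with the CLEAR-MOT formula. The sketch below assumes the rows were evaluated on the MOT17 test set with 564,228 ground-truth boxes; that count is not stated anywhere in this diff, and `NUM_GT_BOXES` and `mota` are illustrative names only.

```python
# Assumption: 564,228 ground-truth boxes (MOT17 test set). Not stated in this diff.
NUM_GT_BOXES = 564_228


def mota(fp, fn, ids, num_gt=NUM_GT_BOXES):
    """CLEAR-MOT accuracy in percent: 100 * (1 - (FN + FP + IDS) / GT)."""
    return 100.0 * (1.0 - (fn + fp + ids) / num_gt)


# FP / FN / IDS copied from the three HRNetV2-W18 rows in the table.
rows = {
    "1088x608": dict(fp=22485, fn=138468, ids=4281),
    "864x480": dict(fp=18927, fn=144486, ids=4056),
    "576x320": dict(fp=28860, fn=163017, ids=4137),
}

for shape, counts in rows.items():
    print(shape, round(mota(**counts), 1))
```

Under that assumption the computed values round to the MOTA figures reported in the table, which suggests the three rows are internally consistent.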

--- a/configs/mot/README_cn.md
+++ b/configs/mot/README_cn.md
@@ -164,6 +164,8 @@ wget https://dataset.bj.bcebos.com/mot/det_results_dir.zip
 |    backbone      |  input shape |  MOTA  |  IDF1  |   IDS  |   FP   |   FN   |    FPS   |  download  | config |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18   | 1088x608 |  70.7  |  65.7  |  4281  |  22485  | 138468 |    -     |[download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18   | 864x480  |  70.3  |  65.8  |  4056  |  18927  | 144486 |    -     |[download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18   | 576x320  |  65.3  |  64.8  |  4137  |  28860  | 163017 |    -     |[download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |

 **Notes:**
 All FairMOT HRNetV2-W18 models are trained on 8 GPUs with a batch size of 4 per GPU for 30 epochs, using ImageNet pre-training and the Momentum optimizer; the crowdhuman dataset is added to the training set.

--- a/configs/mot/fairmot/README.md
+++ b/configs/mot/fairmot/README.md
@@ -65,6 +65,8 @@ English | [简体中文](README_cn.md)
 | backbone       | input shape | MOTA | IDF1 |  IDS  |    FP   |   FN   |    FPS    | download | config |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18   | 1088x608 |  70.7  |  65.7  |  4281  |  22485  | 138468 |    -     |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18   | 864x480  |  70.3  |  65.8  |  4056  |  18927  | 144486 |    -     |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18   | 576x320  |  65.3  |  64.8  |  4137  |  28860  | 163017 |    -     |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |

 **Notes:**
 FairMOT HRNetV2-W18 used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches. Only ImageNet pre-train model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.

--- a/configs/mot/fairmot/README_cn.md
+++ b/configs/mot/fairmot/README_cn.md
@@ -64,6 +64,8 @@
 |    backbone      |  input shape |  MOTA  |  IDF1  |   IDS  |   FP   |   FN   |    FPS   |  download  | config |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18   | 1088x608 |  70.7  |  65.7  |  4281  |  22485  | 138468 |    -     |[download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18   | 864x480  |  70.3  |  65.8  |  4056  |  18927  | 144486 |    -     |[download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18   | 576x320  |  65.3  |  64.8  |  4137  |  28860  | 163017 |    -     |[download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |

 **Notes:**
 All FairMOT HRNetV2-W18 models are trained on 8 GPUs with a batch size of 4 per GPU for 30 epochs, using ImageNet pre-training and the Momentum optimizer; the crowdhuman dataset is added to the training set.

--- a/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml
+++ b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml
@@ -6,6 +6,10 @@ _BASE_: [
  '_base_/fairmot_reader_1088x608.yml',
 ]

+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
 # for MOT training
 TrainDataset:
  !MOTDataSet
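
Reviewer note on `use_ema: true` / `ema_decay: 0.9998`: these keys enable an exponential moving average of the model weights. The following is a minimal sketch of the textbook EMA update, not PaddleDetection's actual implementation (which may add warm-up or bias correction); `ema_update` and the dict-of-floats weight representation are hypothetical and chosen only to keep the example dependency-free.

```python
def ema_update(shadow, weights, decay=0.9998):
    """Return new shadow weights: decay * shadow + (1 - decay) * current weights."""
    return {
        name: decay * shadow[name] + (1.0 - decay) * value
        for name, value in weights.items()
    }


# Toy example: one scalar "parameter" that jumps from 1.0 to 0.0.
shadow = {"w": 1.0}
shadow = ema_update(shadow, {"w": 0.0})
print(shadow["w"])

# Rough averaging horizon implied by the decay, in training steps.
print(round(1.0 / (1.0 - 0.9998)))
```

With a decay of 0.9998 each step moves the averaged weights only 0.02% of the way toward the current weights, so the average effectively spans on the order of 5000 steps.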

--- a/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml
+++ b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml
+_BASE_: [
+  '../../datasets/mot.yml',
+  '../../runtime.yml',
+  '_base_/optimizer_30e_momentum.yml',
+  '_base_/fairmot_hrnetv2_w18_dlafpn.yml',
+  '_base_/fairmot_reader_576x320.yml',
+]
+
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# for MOT training
+TrainDataset:
+  !MOTDataSet
+    dataset_dir: dataset/mot
+    image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+    data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+  inputs_def:
+    image_shape: [3, 320, 576]
+  sample_transforms:
+    - Decode: {}
+    - RGBReverse: {}
+    - AugmentHSV: {}
+    - LetterBoxResize: {target_size: [320, 576]}
+    - MOTRandomAffine: {reject_outside: False}
+    - RandomFlip: {}
+    - BboxXYXY2XYWH: {}
+    - NormalizeBox: {}
+    - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+    - RGBReverse: {}
+    - Permute: {}
+  batch_transforms:
+    - Gt2FairMOTTarget: {}
+  batch_size: 4
+  shuffle: True
+  drop_last: True
+  use_shared_memory: True
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_576x320/model_final
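
Reviewer note on `LetterBoxResize: {target_size: [320, 576]}`: letterboxing scales the image to fit the target while keeping its aspect ratio, then pads the remainder. The sketch below shows the usual geometry only; it assumes `target_size` is ordered `[height, width]` (consistent with `image_shape: [3, 320, 576]`) and symmetric padding. `letterbox_geometry` is a hypothetical helper, not a PaddleDetection function.

```python
def letterbox_geometry(src_h, src_w, target_h, target_w):
    """Return (new_h, new_w, pad_h, pad_w) for an aspect-preserving resize.

    pad_h / pad_w are the per-side padding amounts, assuming the resized
    image is centered in the target canvas.
    """
    ratio = min(target_h / src_h, target_w / src_w)
    new_h, new_w = round(src_h * ratio), round(src_w * ratio)
    pad_h = (target_h - new_h) / 2
    pad_w = (target_w - new_w) / 2
    return new_h, new_w, pad_h, pad_w


# A 1920x1080 frame letterboxed into the 576x320 training resolution.
print(letterbox_geometry(1080, 1920, 320, 576))
```

For a 16:9 source frame the height is the limiting dimension here, so the resized image fills the full 320 rows and only a few columns of padding are added on each side.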
--- a/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml
+++ b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml
+_BASE_: [
+  '../../datasets/mot.yml',
+  '../../runtime.yml',
+  '_base_/optimizer_30e_momentum.yml',
+  '_base_/fairmot_hrnetv2_w18_dlafpn.yml',
+  '_base_/fairmot_reader_864x480.yml',
+]
+
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# for MOT training
+TrainDataset:
+  !MOTDataSet
+    dataset_dir: dataset/mot
+    image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+    data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+  inputs_def:
+    image_shape: [3, 480, 864]
+  sample_transforms:
+    - Decode: {}
+    - RGBReverse: {}
+    - AugmentHSV: {}
+    - LetterBoxResize: {target_size: [480, 864]}
+    - MOTRandomAffine: {reject_outside: False}
+    - RandomFlip: {}
+    - BboxXYXY2XYWH: {}
+    - NormalizeBox: {}
+    - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+    - RGBReverse: {}
+    - Permute: {}
+  batch_transforms:
+    - Gt2FairMOTTarget: {}
+  batch_size: 4
+  shuffle: True
+  drop_last: True
+  use_shared_memory: True
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_864x480/model_final
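
Reviewer note on the two new scales: a quick way to see how much smaller the new inputs are is to compare pixel counts against the existing 1088x608 config. The sketch below uses only the `image_shape` values from the configs in this diff; `relative_pixels` is a hypothetical helper, and pixel count is only a rough proxy for compute cost.

```python
# (height, width) taken from the image_shape entries of each config.
SHAPES = {
    "1088x608": (608, 1088),
    "864x480": (480, 864),
    "576x320": (320, 576),
}


def relative_pixels(shape, baseline=SHAPES["1088x608"]):
    """Pixel count of `shape` as a fraction of the baseline resolution."""
    return (shape[0] * shape[1]) / (baseline[0] * baseline[1])


for name, shape in SHAPES.items():
    print(name, round(relative_pixels(shape), 2))
```

By this measure 864x480 has roughly 63% of the pixels of 1088x608, and 576x320 roughly 28%.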