diff --git a/configs/mot/README.md b/configs/mot/README.md
index cbf76371ae76504b1b6d02a47c3fb13997888f5c..16b86527dc925d87902d58be76695d81e144186a 100644
--- a/configs/mot/README.md
+++ b/configs/mot/README.md
@@ -164,9 +164,11 @@ If you use a stronger detection model, you can get better results. Each txt is t
 | backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
 
 **Notes:**
- FairMOT HRNetV2-W18 used 8 GPUs for training and mini-batch size as 6 on each GPU, and trained for 30 epoches. Only ImageNet pre-train model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.
+ FairMOT HRNetV2-W18 used 8 GPUs for training with a mini-batch size of 4 on each GPU, and was trained for 30 epochs. Only the ImageNet pre-trained model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.
 
 ## Feature Tracking Model
 
diff --git a/configs/mot/README_cn.md b/configs/mot/README_cn.md
index 16b8775547eaea3b1269c2aca91e49c5c8eed0b4..0eb20c4271c06a61877e102cb10dc656044d41ac 100644
--- a/configs/mot/README_cn.md
+++ b/configs/mot/README_cn.md
@@ -164,6 +164,8 @@ wget https://dataset.bj.bcebos.com/mot/det_results_dir.zip
 | 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [配置文件](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [配置文件](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [配置文件](./fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
 
 **注意:**
  FairMOT HRNetV2-W18均使用8个GPU进行训练,每个GPU上batch size为4,训练30个epoch,使用的ImageNet预训练,优化器策略采用的是Momentum,并且训练集中加入了crowdhuman数据集一起参与训练。
diff --git a/configs/mot/fairmot/README.md b/configs/mot/fairmot/README.md
index 353a9fce88b106998f983ea80a3eb96d3b3187cc..9ec0cf8b60e6abbe0c37e7aa34eb5214a1b6eed8 100644
--- a/configs/mot/fairmot/README.md
+++ b/configs/mot/fairmot/README.md
@@ -65,6 +65,8 @@ English | [简体中文](README_cn.md)
 | backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
 
 **Notes:**
  FairMOT HRNetV2-W18 used 8 GPUs for training and mini-batch size as 4 on each GPU, and trained for 30 epoches. Only ImageNet pre-train model is used, and the optimizer adopts Momentum. The crowdhuman dataset is added to the train-set during training.
diff --git a/configs/mot/fairmot/README_cn.md b/configs/mot/fairmot/README_cn.md
index e85ebcc45813f178b44fc097b0668e2954374101..72d2595582f6f8326f75f59bd7eaa79f4a4b3d59 100644
--- a/configs/mot/fairmot/README_cn.md
+++ b/configs/mot/fairmot/README_cn.md
@@ -64,6 +64,8 @@
 | 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
 | :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
 | HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
 
 **注意:**
  FairMOT HRNetV2-W18均使用8个GPU进行训练,每个GPU上batch size为4,训练30个epoch,使用的ImageNet预训练,优化器策略采用的是Momentum,并且训练集中加入了crowdhuman数据集一起参与训练。
diff --git a/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml
index c0631af2dc0c48b0f6f6e123600b781b65791f98..dc9178db18fbb27209106b844b2920f65674701a 100644
--- a/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml
+++ b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml
@@ -6,6 +6,10 @@ _BASE_: [
   '_base_/fairmot_reader_1088x608.yml',
 ]
 
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
 # for MOT training
 TrainDataset:
   !MOTDataSet
diff --git a/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a480ebd04c37b5ba4a9002492527df1268cbb923
--- /dev/null
+++ b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml
@@ -0,0 +1,43 @@
+_BASE_: [
+  '../../datasets/mot.yml',
+  '../../runtime.yml',
+  '_base_/optimizer_30e_momentum.yml',
+  '_base_/fairmot_hrnetv2_w18_dlafpn.yml',
+  '_base_/fairmot_reader_576x320.yml',
+]
+
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# for MOT training
+TrainDataset:
+  !MOTDataSet
+    dataset_dir: dataset/mot
+    image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+    data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+  inputs_def:
+    image_shape: [3, 320, 576]
+  sample_transforms:
+    - Decode: {}
+    - RGBReverse: {}
+    - AugmentHSV: {}
+    - LetterBoxResize: {target_size: [320, 576]}
+    - MOTRandomAffine: {reject_outside: False}
+    - RandomFlip: {}
+    - BboxXYXY2XYWH: {}
+    - NormalizeBox: {}
+    - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+    - RGBReverse: {}
+    - Permute: {}
+  batch_transforms:
+    - Gt2FairMOTTarget: {}
+  batch_size: 4
+  shuffle: True
+  drop_last: True
+  use_shared_memory: True
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_576x320/model_final
diff --git a/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml
new file mode 100644
index 0000000000000000000000000000000000000000..25f1b636aa5c85a1ad01cbe9753f78d4b81f7762
--- /dev/null
+++ b/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml
@@ -0,0 +1,43 @@
+_BASE_: [
+  '../../datasets/mot.yml',
+  '../../runtime.yml',
+  '_base_/optimizer_30e_momentum.yml',
+  '_base_/fairmot_hrnetv2_w18_dlafpn.yml',
+  '_base_/fairmot_reader_864x480.yml',
+]
+
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# for MOT training
+TrainDataset:
+  !MOTDataSet
+    dataset_dir: dataset/mot
+    image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+    data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+  inputs_def:
+    image_shape: [3, 480, 864]
+  sample_transforms:
+    - Decode: {}
+    - RGBReverse: {}
+    - AugmentHSV: {}
+    - LetterBoxResize: {target_size: [480, 864]}
+    - MOTRandomAffine: {reject_outside: False}
+    - RandomFlip: {}
+    - BboxXYXY2XYWH: {}
+    - NormalizeBox: {}
+    - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+    - RGBReverse: {}
+    - Permute: {}
+  batch_transforms:
+    - Gt2FairMOTTarget: {}
+  batch_size: 4
+  shuffle: True
+  drop_last: True
+  use_shared_memory: True
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_864x480/model_final
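
For reviewers, the snippet below is a minimal usage sketch of the new configs, assuming PaddleDetection's standard `tools/train.py` and `tools/eval_mot.py` entry points and an 8-GPU machine as described in the README notes; these commands are not part of this patch, and the log directory name is just an example.

```bash
# Train FairMOT HRNetV2-W18 at 576x320 on 8 GPUs
# (the per-GPU batch size of 4 comes from TrainReader.batch_size in the new config).
python -m paddle.distributed.launch --log_dir=./fairmot_hrnetv2_w18_576x320/ --gpus 0,1,2,3,4,5,6,7 \
    tools/train.py -c configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml

# Evaluate the released 864x480 weights linked in the updated README tables.
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py \
    -c configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml \
    -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams
```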