diff --git a/configs/mot/README.md b/configs/mot/README.md index b2f169753d6a97a598fefbc995ff4456e57e7845..b6f7615922a8b309d26f77490188b01c4afb227e 100644 --- a/configs/mot/README.md +++ b/configs/mot/README.md @@ -28,9 +28,9 @@ PaddleDetection implements three multi-object tracking methods. | backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config | | :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: | -| DarkNet53 | 1088x608 | 73.2 | 69.4 | 1320 | 6613 | 21629 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | -| DarkNet53 | 864x480 | 70.1 | 65.4 | 1341 | 6454 | 25208 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | -| DarkNet53 | 576x320 | 63.1 | 64.6 | 1357 | 7083 | 32312 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | +| DarkNet53 | 1088x608 | 73.2 | 69.3 | 1351 | 6591 | 21625 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | +| DarkNet53 | 864x480 | 70.1 | 65.2 | 1328 | 6441 | 25187 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | +| DarkNet53 | 576x320 | 63.2 | 64.5 | 1308 | 7011 | 32252 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | 
[config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | **Notes:** JDE used 8 GPUs for training and a mini-batch size of 4 on each GPU, and trained for 30 epochs. @@ -117,7 +117,7 @@ In the annotation text, each line is describing a bounding box and has the following format: **Notes:** - `class` should be `0`. Only single-class multi-object tracking is supported now. - `identity` is an integer from `0` to `num_identities - 1`(`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation. -- `[x_center] [y_center] [width] [height]` are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1. +- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height; note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1. ### Dataset Directory diff --git a/configs/mot/README_cn.md b/configs/mot/README_cn.md index 1b5cb69b792ade3c6fca0f92840cf6ca621d58ba..15fac03a2e0f9930ca4da7658482e87033def8d2 100644 --- a/configs/mot/README_cn.md +++ b/configs/mot/README_cn.md @@ -29,9 +29,9 @@ PaddleDetection实现了3种多目标跟踪方法。 | 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 检测模型 | 配置文件 | | :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: | -| DarkNet53 | 1088x608 | 73.2 | 69.4 | 1320 | 6613 | 21629 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | -| DarkNet53 | 864x480 | 70.1 | 65.4 | 1341 | 6454 | 25208 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | -| DarkNet53 | 576x320 | 63.1 | 
64.6 | 1357 | 7083 | 32312 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | +| DarkNet53 | 1088x608 | 73.2 | 69.3 | 1351 | 6591 | 21625 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | +| DarkNet53 | 864x480 | 70.1 | 65.2 | 1328 | 6441 | 25187 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | +| DarkNet53 | 576x320 | 63.2 | 64.5 | 1308 | 7011 | 32252 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | **注意:** JDE使用8个GPU进行训练,每个GPU上batch size为4,训练30个epoch。 @@ -111,12 +111,12 @@ MOT17 ``` 所有数据集的标注是以统一数据格式提供的。各个数据集中每张图片都有相应的标注文本。给定一个图像路径,可以通过将字符串`images`替换为`labels_with_ids`并将`.jpg`替换为`.txt`来生成标注文本路径。在标注文本中,每行都描述一个边界框,格式如下: ``` -[class][identity][x_center][y_center][width][height] +[class] [identity] [x_center] [y_center] [width] [height] ``` **注意**: - `class`为`0`,目前仅支持单类别多目标跟踪。 - `identity`是从`0`到`num_identifies-1`的整数(`num_identifies`是数据集中不同物体实例的总数),如果此框没有`identity`标注,则为`-1`。 -- `[x_center][y_center][width][height]`的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。 +- `[x_center] [y_center] [width] [height]`是中心点坐标和宽高,注意它们的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。 ### 数据集目录 diff --git a/configs/mot/deepsort/README.md b/configs/mot/deepsort/README.md index 2ce7754a973d54c6b2073a89f77fd560ab672b32..4847be341cf3c02306ed0f7793b47b6b0afc1192 100644 --- a/configs/mot/deepsort/README.md +++ b/configs/mot/deepsort/README.md @@ -33,10 +33,16 @@ det_results_dir ``` Each txt is the
detection result of all the pictures extracted from each video, and each line describes a bounding box with the following format: ``` -[frame_id][identity][bb_left][bb_top][width][height][conf][x][y][z] +[frame_id],[identity],[bb_left],[bb_top],[width],[height],[conf],[x],[y],[z] ``` **Notes:** -`frame_id` is the frame number of the image, `identity` is the object id using default value `-1`, `bb_left` is the X coordinate of the left bound of the object box, `bb_top` is the Y coordinate of the upper boundary of the object box, `width, height` is the pixel width and height, `conf` is the object score with default value `1` (the results had been filtered out according to the detection score threshold), `x,y,z` are used in 3D, default to `-1` in 2D. +- `frame_id` is the frame number of the image +- `identity` is the object id, with default value `-1` +- `bb_left` is the X coordinate of the left bound of the object box +- `bb_top` is the Y coordinate of the upper bound of the object box +- `width,height` are the pixel width and height +- `conf` is the object score, with default value `1` (the results have already been filtered by the detection score threshold) +- `x,y,z` are used in 3D and default to `-1` in 2D. 
## Getting Started @@ -44,10 +50,10 @@ Each txt is the detection result of all the pictures extracted from each video, ```bash # use weights released in PaddleDetection model zoo -CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric=MOT weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams +CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams # use saved checkpoint after training -CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric=MOT weights=output/jde_darknet53_30e_1088x608/model_final +CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o weights=output/jde_darknet53_30e_1088x608/model_final ``` ### 2. Tracking diff --git a/configs/mot/deepsort/README_cn.md b/configs/mot/deepsort/README_cn.md index 9daaf0b860dec95e95c12cd2899675ddee9697e8..81af5f6c1e0be964d4f51b8d9a0127992f366778 100644 --- a/configs/mot/deepsort/README_cn.md +++ b/configs/mot/deepsort/README_cn.md @@ -33,9 +33,16 @@ det_results_dir ``` 其中每个txt是每个视频中所有图片的检测结果,每行都描述一个边界框,格式如下: ``` -[frame_id][identity][bb_left][bb_top][width][height][conf][x][y][z] +[frame_id],[identity],[bb_left],[bb_top],[width],[height],[conf],[x],[y],[z] ``` -**注意**: `frame_id`是图片帧的序号,`identity`是目标id采用默认值为`-1`,`bb_left`是目标框的左边界的x坐标,`bb_top`是目标框的上边界的y坐标,`width,height`是真实的像素宽高,`conf`是目标得分设置为`1`(已经按检测的得分阈值筛选出的检测结果),`x,y,z`是3D中用到的,在2D中默认为`-1`即可。 +**注意**: +- `frame_id`是图片帧的序号 +- `identity`是目标id采用默认值为`-1` +- `bb_left`是目标框的左边界的x坐标 +- `bb_top`是目标框的上边界的y坐标 +- `width,height`是真实的像素宽高 +- `conf`是目标得分设置为`1`(已经按检测的得分阈值筛选出的检测结果) +- `x,y,z`是3D中用到的,在2D中默认为`-1` ## 快速开始 @@ -43,10 +50,10 @@ det_results_dir ```bash # 使用PaddleDetection发布的权重 -CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c 
configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric=MOT weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams +CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams # 使用训练保存的checkpoint -CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o metric=MOT weights=output/jde_darknet53_30e_1088x608/model_final +CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608_track.yml -o weights=output/jde_darknet53_30e_1088x608/model_final ``` ### 2. 跟踪预测 diff --git a/configs/mot/jde/README.md b/configs/mot/jde/README.md index 3f2751b7f6ae18a51696543ebcac9fe84c7da301..d8ace647b9ebe97a9e288043bd0dada42d4e4a68 100644 --- a/configs/mot/jde/README.md +++ b/configs/mot/jde/README.md @@ -21,9 +21,9 @@ English | [简体中文](README_cn.md) | backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config | | :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: | -| DarkNet53 | 1088x608 | 73.2 | 69.4 | 1320 | 6613 | 21629 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | -| DarkNet53 | 864x480 | 70.1 | 65.4 | 1341 | 6454 | 25208 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | -| DarkNet53 | 576x320 | 63.1 | 64.6 | 1357 | 7083 | 32312 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | 
+| DarkNet53 | 1088x608 | 73.2 | 69.3 | 1351 | 6591 | 21625 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | +| DarkNet53 | 864x480 | 70.1 | 65.2 | 1328 | 6441 | 25187 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | +| DarkNet53 | 576x320 | 63.2 | 64.5 | 1308 | 7011 | 32252 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | **Notes:** diff --git a/configs/mot/jde/README_cn.md b/configs/mot/jde/README_cn.md index 096b62af6ebf201ad9e9ad9857d534de00405340..50672cf4e4c583e96876e507120ec88c90a3e63c 100644 --- a/configs/mot/jde/README_cn.md +++ b/configs/mot/jde/README_cn.md @@ -21,9 +21,9 @@ | 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 检测模型 | 配置文件 | | :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: | -| DarkNet53 | 1088x608 | 73.2 | 69.4 | 1320 | 6613 | 21629 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | -| DarkNet53 | 864x480 | 70.1 | 65.4 | 1341 | 6454 | 25208 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | -| DarkNet53 | 576x320 | 63.1 | 64.6 | 1357 | 7083 | 32312 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | 
[配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | +| DarkNet53 | 1088x608 | 73.2 | 69.3 | 1351 | 6591 | 21625 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_1088x608.yml) | +| DarkNet53 | 864x480 | 70.1 | 65.2 | 1328 | 6441 | 25187 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_864x480.yml) | +| DarkNet53 | 576x320 | 63.2 | 64.5 | 1308 | 7011 | 32252 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/mot/jde/jde_darknet53_30e_576x320.yml) | **注意:** JDE使用8个GPU进行训练,每个GPU上batch size为4,训练了30个epoch。 diff --git a/docs/tutorials/PrepareMOTDataSet.md b/docs/tutorials/PrepareMOTDataSet.md index bce1bc01cfde688684e3efd6b5ae3cdafc9c4021..cfa784771bbbb24ac936bd1422b29b2e9bc35322 100644 --- a/docs/tutorials/PrepareMOTDataSet.md +++ b/docs/tutorials/PrepareMOTDataSet.md @@ -10,7 +10,7 @@ English | [简体中文](PrepareMOTDataSet_cn.md) - [Citations](#Citations) ### MOT Dataset -PaddleDetection uses the same training data as [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT). Please download and prepare all the training data including Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16. MOT15 and MOT20 can also be downloaded from the official webpage of MOT challenge. If you want to use these datasets, please **follow their licenses**. +PaddleDetection uses the same training data as [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT). 
Please download and prepare all the training data including **Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16**. **MOT15 and MOT20** can also be downloaded from the official webpage of MOT challenge. If you want to use these datasets, please **follow their licenses**. ### Data Format These several relevant datasets have the following structure: @@ -37,11 +37,11 @@ In the annotation text, each line is describing a bounding box and has the following format: ``` [class] [identity] [x_center] [y_center] [width] [height] ``` -The field `[class]` should be `0`. Only single-class multi-object tracking is supported in this version. - -The field `[identity]` is an integer from `0` to `num_identities - 1`, or `-1` if this box has no identity annotation. +**Notes:** +- `class` should be `0`. Only single-class multi-object tracking is supported now. +- `identity` is an integer from `0` to `num_identities - 1` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation. +- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height; note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1. -***Note** that the values of `[x_center] [y_center] [width] [height]` are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1. 
### Dataset Directory @@ -81,7 +81,7 @@ dataset/mot ### Custom Dataset Preparation -In order to standardize training and evaluation, custom data needs to be converted into the same directory and format as MOT-17 dataset: +In order to standardize training and evaluation, custom data needs to be converted into the same directory and format as the MOT-16 dataset: ``` custom_data |——————images @@ -124,14 +124,15 @@ imExt=.jpg Each line in `gt.txt` describes a bounding box, with the format as follows: ``` -[frame_id][identity][bb_left][bb_top][width][height][x][y][z] +[frame_id],[identity],[bb_left],[bb_top],[width],[height],[x],[y],[z] ``` **Notes:** - `frame_id` is the current frame id. - `identity` is an integer from `0` to `num_identities - 1`(`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation. - `bb_left` is the x coordinate of the left boundary of the target box - `bb_top` is the Y coordinate of the upper boundary of the target box -- `width, height` are the pixel width and height, and `x,y,z` are only used in 3D. +- `width, height` are the pixel width and height +- `x,y,z` are only used in 3D and default to `-1` in 2D. #### labels_with_ids @@ -144,7 +145,7 @@ In the annotation text, each line is describing a bounding box and has the following format: **Notes:** - `class` should be `0`. Only single-class multi-object tracking is supported now. - `identity` is an integer from `0` to `num_identities - 1`(`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation. -- `[x_center] [y_center] [width] [height]` are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1. +- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height; note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1. 
Generate the corresponding `labels_with_ids` with the following command: ``` diff --git a/docs/tutorials/PrepareMOTDataSet_cn.md b/docs/tutorials/PrepareMOTDataSet_cn.md index 6897a9c352af08aab1ef8b54082b070ad20b384d..2aa95f44dc25ce3469b94f4c03721957ebfda2cb 100644 --- a/docs/tutorials/PrepareMOTDataSet_cn.md +++ b/docs/tutorials/PrepareMOTDataSet_cn.md @@ -10,7 +10,7 @@ - [引用](#引用) ### MOT数据集 -PaddleDetection使用和[JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) 还有[FairMOT](https://github.com/ifzhang/FairMOT)相同的数据集,请先下载并准备好所有的数据集包括Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17和MOT16。此外还可以下载MOT15和MOT20数据集,如果您想使用这些数据集,请**遵循他们的License**。 +PaddleDetection使用和[JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) 还有[FairMOT](https://github.com/ifzhang/FairMOT)相同的数据集,请先下载并准备好所有的数据集包括**Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17和MOT16**。此外还可以下载**MOT15和MOT20**数据集,如果您想使用这些数据集,请**遵循他们的License**。 ### 数据格式 这几个相关数据集都遵循以下结构: @@ -33,12 +33,12 @@ MOT17 ``` 所有数据集的标注是以统一数据格式提供的。各个数据集中每张图片都有相应的标注文本。给定一个图像路径,可以通过将字符串`images`替换为`labels_with_ids`并将`.jpg`替换为`.txt`来生成标注文本路径。在标注文本中,每行都描述一个边界框,格式如下: ``` -[class][identity][x_center][y_center][width][height] +[class] [identity] [x_center] [y_center] [width] [height] ``` **注意**: - `class`为`0`,目前仅支持单类别多目标跟踪。 - `identity`是从`0`到`num_identifies-1`的整数(`num_identifies`是数据集中不同物体实例的总数),如果此框没有`identity`标注,则为`-1`。 -- `[x_center][y_center][width][height]`的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。 +- `[x_center] [y_center] [width] [height]`是中心点坐标和宽高,注意它们的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。 ### 数据集目录 @@ -163,7 +163,7 @@ Google Drive: ### 用户数据准备 -为了规范地进行训练和评测,用户数据需要转成和MOT-17数据集相同的目录和格式: +为了规范地进行训练和评测,用户数据需要转成和MOT-16数据集相同的目录和格式: ``` custom_data |——————images @@ -206,7 +206,7 @@ imExt=.jpg `gt.txt`里是当前视频中所有图片的原始标注文件,每行都描述一个边界框,格式如下: ``` -[frame_id][identity][bb_left][bb_top][width][height][x][y][z] +[frame_id],[identity],[bb_left],[bb_top],[width],[height],[x],[y],[z] ``` **注意**: - `frame_id`为当前图片帧序号 @@ -220,12 +220,12 @@ imExt=.jpg #### 
labels_with_ids文件夹 所有数据集的标注是以统一数据格式提供的。各个数据集中每张图片都有相应的标注文本。给定一个图像路径,可以通过将字符串`images`替换为`labels_with_ids`并将`.jpg`替换为`.txt`来生成标注文本路径。在标注文本中,每行都描述一个边界框,格式如下: ``` -[class][identity][x_center][y_center][width][height] +[class] [identity] [x_center] [y_center] [width] [height] ``` **注意**: - `class`为`0`,目前仅支持单类别多目标跟踪。 - `identity`是从`0`到`num_identifies-1`的整数(`num_identifies`是数据集中不同物体实例的总数),如果此框没有`identity`标注,则为`-1`。 -- `[x_center][y_center][width][height]`的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。 +- `[x_center] [y_center] [width] [height]`是中心点坐标和宽高,注意它们的值是由图片的宽度/高度标准化的,因此它们是从0到1的浮点数。 可采用如下脚本生成相应的`labels_with_ids`: ``` diff --git a/ppdet/modeling/mot/utils.py b/ppdet/modeling/mot/utils.py index eff8d472f4fe2ce1189b6bd2eeacedba8efb63d8..4bf295921723539795474830f365c0055f68227c 100644 --- a/ppdet/modeling/mot/utils.py +++ b/ppdet/modeling/mot/utils.py @@ -124,8 +124,8 @@ def scale_coords(coords, input_shape, im_shape, scale_factor): ratio = scale_factor.numpy()[0][0] img0_shape = [int(im_shape[0] / ratio), int(im_shape[1] / ratio)] - pad_w = (input_shape[1] - img0_shape[1] * ratio) / 2 - pad_h = (input_shape[0] - img0_shape[0] * ratio) / 2 + pad_w = (input_shape[1] - round(img0_shape[1] * ratio)) / 2 + pad_h = (input_shape[0] - round(img0_shape[0] * ratio)) / 2 coords[:, 0::2] -= pad_w coords[:, 1::2] -= pad_h coords[:, 0:4] /= paddle.to_tensor(ratio)
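As context for the `scale_coords` change above, here is a minimal standalone sketch (plain Python with illustrative names, not the PaddleDetection API) of why the `round()` matters: letterbox preprocessing resizes the image by `ratio` and centers it in `input_shape` with symmetric padding, but `img0_shape` is recovered with truncating `int()` division, so multiplying it back by `ratio` generally does not reproduce the resized size exactly; rounding restores it, and `pad_w`/`pad_h` then match the padding that was actually applied.

```python
# Illustrative sketch of the padding computation patched in
# ppdet/modeling/mot/utils.py (function and argument names are hypothetical).

def pad_offsets(input_shape, img0_shape, ratio):
    """Recover the symmetric letterbox padding (pad_w, pad_h).

    input_shape: (H, W) of the network input (after resize + pad)
    img0_shape:  (H, W) of the original image, as recovered with int()
    ratio:       resize scale applied during preprocessing
    """
    # round() recovers the actual resized size; without it, the
    # truncation baked into img0_shape leaks a sub-pixel error
    # into the pads, shifting every rescaled box coordinate.
    pad_w = (input_shape[1] - round(img0_shape[1] * ratio)) / 2
    pad_h = (input_shape[0] - round(img0_shape[0] * ratio)) / 2
    return pad_w, pad_h

# e.g. a 1920x1080 image resized by ratio 608/1080 fills 1081x608 pixels and
# is padded to 1088x608: the true pads are pad_w = 3.5, pad_h = 0.0; without
# round() the width term would be 1080.888..., giving pad_w = 3.555...
```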