From 84f1cc971fade5c41fa593cfabb4549f340d3906 Mon Sep 17 00:00:00 2001
From: George Ni
Date: Wed, 7 Jul 2021 11:38:30 +0800
Subject: [PATCH] [MOT] fix mot identity gttxt doc (#3584)

---
 configs/mot/README.md                  |  6 +++---
 configs/mot/README_cn.md               |  6 +++---
 docs/tutorials/PrepareMOTDataSet.md    | 14 ++++++++------
 docs/tutorials/PrepareMOTDataSet_cn.md | 12 +++++++-----
 4 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/configs/mot/README.md b/configs/mot/README.md
index d500e484a..49aa10bc4 100644
--- a/configs/mot/README.md
+++ b/configs/mot/README.md
@@ -163,7 +163,7 @@ In the annotation text, each line is describing a bounding box and has the follo
 ```
 **Notes:**
 - `class` should be `0`. Only single-class multi-object tracking is supported now.
-- `identity` is an integer from `0` to `num_identities - 1` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
+- `identity` is an integer from `1` to `num_identities` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
 - `[x_center] [y_center] [width] [height]` are the center coordinates, width and height, note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1.
 
 ### Dataset Directory
@@ -257,7 +257,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairm
 ```bash
 python deploy/python/mot_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
 ```
-**Notes:**
+**Notes:**
 The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
 
 ### 6. Using exported MOT and keypoint model for unite python inference
@@ -265,7 +265,7 @@ The tracking model is used to predict the video, and does not support the predic
 ```bash
 python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
 ```
-**Notes:**
+**Notes:**
 Keypoint model export tutorial: `configs/keypoint/README.md`.
 
 ## Citations
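For reference while reading the hunks above: a `labels_with_ids` line is `[class] [identity] [x_center] [y_center] [width] [height]`, with the geometry normalized to the range 0 to 1 and `identity` now documented as running from `1` to `num_identities` (or `-1`). The sketch below is a reading aid only, not part of the patch or the repository; the function name, paths, and image size are illustrative assumptions.

```python
# Minimal sketch: read one labels_with_ids annotation file and recover
# pixel-space boxes. The file path and image size are hypothetical.

def load_labels_with_ids(txt_path, img_w, img_h):
    """Return (class, identity, x1, y1, w, h) tuples in pixel units."""
    boxes = []
    with open(txt_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 6:
                continue  # skip blank or malformed lines
            cls, identity = int(parts[0]), int(parts[1])  # identity: 1..num_identities, or -1
            xc, yc, w, h = map(float, parts[2:])
            # Coordinates are normalized by image width/height; denormalize
            # the center to a top-left corner in pixels.
            x1 = (xc - w / 2) * img_w
            y1 = (yc - h / 2) * img_h
            boxes.append((cls, identity, x1, y1, w * img_w, h * img_h))
    return boxes

# Example (hypothetical path and size):
# boxes = load_labels_with_ids("MOT17/labels_with_ids/train/000001.txt", 1920, 1080)
```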
diff --git a/configs/mot/README_cn.md b/configs/mot/README_cn.md
index 891080507..aedf4f329 100644
--- a/configs/mot/README_cn.md
+++ b/configs/mot/README_cn.md
@@ -161,7 +161,7 @@ MOT17
 ```
 **Notes:**
 - `class` is `0`; only single-class multi-object tracking is supported at present.
-- `identity` is an integer from `0` to `num_identifies-1` (`num_identifies` is the total number of distinct object instances in the dataset), or `-1` if this box has no `identity` annotation.
+- `identity` is an integer from `1` to `num_identifies` (`num_identifies` is the total number of distinct object instances in the dataset), or `-1` if this box has no `identity` annotation.
 - `[x_center] [y_center] [width] [height]` are the center coordinates and the width and height; note that their values are normalized by the image width/height, so they are floating point numbers from 0 to 1.
 
 ### Dataset Directory
@@ -255,7 +255,7 @@ CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairm
 ```bash
 python deploy/python/mot_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
 ```
-**Notes:**
+**Notes:**
 The tracking model predicts on a video and does not support single-image prediction. The visualized tracking result is saved as a video by default; add `--save_mot_txts` to save txt files of the tracking results, or `--save_images` to save the visualized result images.
 
 ### 6. Unite Python inference with the exported MOT and keypoint models
@@ -263,7 +263,7 @@ python deploy/python/mot_infer.py --model_dir=output_inference/fairmot_dla34_30e
 ```bash
 python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
 ```
-**Notes:**
+**Notes:**
 For the keypoint model export tutorial, please refer to `configs/keypoint/README.md`.
 
 ## Citations
diff --git a/docs/tutorials/PrepareMOTDataSet.md b/docs/tutorials/PrepareMOTDataSet.md
index cfa784771..6251bef07 100644
--- a/docs/tutorials/PrepareMOTDataSet.md
+++ b/docs/tutorials/PrepareMOTDataSet.md
@@ -39,7 +39,7 @@ In the annotation text, each line is describing a bounding box and has the follo
 ```
 **Notes:**
 - `class` should be `0`. Only single-class multi-object tracking is supported now.
-- `identity` is an integer from `0` to `num_identities - 1` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
+- `identity` is an integer from `1` to `num_identities` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
 - `[x_center] [y_center] [width] [height]` are the center coordinates, width and height, note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1.
 
 
@@ -124,16 +124,18 @@ imExt=.jpg
 Each line in `gt.txt` describes a bounding box, with the format as follows:
 ```
-[frame_id],[identity],[bb_left],[bb_top],[width],[height],[x],[y],[z]
+[frame_id],[identity],[bb_left],[bb_top],[width],[height],[score],[label],[vis_ratio]
 ```
 **Notes:**
 - `frame_id` is the current frame id.
-- `identity` is an integer from `0` to `num_identities - 1` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
+- `identity` is an integer from `1` to `num_identities` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
 - `bb_left` is the x coordinate of the left boundary of the target box.
 - `bb_top` is the y coordinate of the upper boundary of the target box.
 - `width, height` are the pixel width and height.
-- `x,y,z` are only used in 3D, default to `-1` in 2D.
-
+- `score` acts as a flag for whether the entry is to be considered: a value of `0` means this particular instance is ignored in the evaluation, while `1` marks it as active. `1` by default.
+- `label` is the type of object annotated. Use `1` as the default, because only single-class multi-object tracking is supported now; the other object classes in MOT-16 are treated as ignored.
+- `vis_ratio` is the visibility ratio of each bounding box, reduced by occlusion from another static or moving object or by image border cropping. `1` by default.
 
 #### labels_with_ids
 Annotations of these datasets are provided in a unified format. Every image has a corresponding annotation text. Given an image path, the annotation text path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`.
@@ -144,7 +146,7 @@ In the annotation text, each line is describing a bounding box and has the follo
 ```
 **Notes:**
 - `class` should be `0`. Only single-class multi-object tracking is supported now.
-- `identity` is an integer from `0` to `num_identities - 1` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
+- `identity` is an integer from `1` to `num_identities` (`num_identities` is the total number of instances of objects in the dataset), or `-1` if this box has no identity annotation.
 - `[x_center] [y_center] [width] [height]` are the center coordinates, width and height, note that they are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1.
 
 Generate the corresponding `labels_with_ids` with the following command:
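The two formats above differ in geometry (pixel top-left corner in `gt.txt` versus normalized centers in `labels_with_ids`) and in the extra `score`, `label`, and `vis_ratio` columns. A hedged sketch of that conversion follows; the tutorial's actual command is outside the quoted patch context, so this is not the repository's script, and the function name, output naming, and image size are assumptions.

```python
# Hedged sketch: convert one sequence's gt.txt into per-frame
# labels_with_ids files. Paths, file naming, and image size are illustrative.
import os
from collections import defaultdict

def gt_to_labels_with_ids(gt_path, out_dir, img_w, img_h):
    per_frame = defaultdict(list)
    with open(gt_path) as f:
        for row in f:
            vals = row.strip().split(',')
            frame_id, identity = int(vals[0]), int(vals[1])
            bb_left, bb_top, w, h = map(float, vals[2:6])
            score, label = float(vals[6]), int(vals[7])
            if score == 0 or label != 1:
                continue  # entry is ignored in evaluation, or not the tracked class
            # labels_with_ids stores normalized center coordinates.
            xc = (bb_left + w / 2) / img_w
            yc = (bb_top + h / 2) / img_h
            per_frame[frame_id].append(
                f"0 {identity} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}")
    os.makedirs(out_dir, exist_ok=True)
    for frame_id, lines in per_frame.items():
        with open(os.path.join(out_dir, f"{frame_id:06d}.txt"), "w") as out:
            out.write("\n".join(lines) + "\n")
```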
diff --git a/docs/tutorials/PrepareMOTDataSet_cn.md b/docs/tutorials/PrepareMOTDataSet_cn.md
index 2aa95f44d..c86544c58 100644
--- a/docs/tutorials/PrepareMOTDataSet_cn.md
+++ b/docs/tutorials/PrepareMOTDataSet_cn.md
@@ -37,7 +37,7 @@ MOT17
 ```
 **Notes:**
 - `class` is `0`; only single-class multi-object tracking is supported at present.
-- `identity` is an integer from `0` to `num_identifies-1` (`num_identifies` is the total number of distinct object instances in the dataset), or `-1` if this box has no `identity` annotation.
+- `identity` is an integer from `1` to `num_identifies` (`num_identifies` is the total number of distinct object instances in the dataset), or `-1` if this box has no `identity` annotation.
 - `[x_center] [y_center] [width] [height]` are the center coordinates and the width and height; note that their values are normalized by the image width/height, so they are floating point numbers from 0 to 1.
 
 ### Dataset Directory
@@ -206,15 +206,17 @@ imExt=.jpg
 `gt.txt` contains the original annotations for all frames of the current video; each line describes one bounding box, in the following format:
 ```
-[frame_id],[identity],[bb_left],[bb_top],[width],[height],[x],[y],[z]
+[frame_id],[identity],[bb_left],[bb_top],[width],[height],[score],[label],[vis_ratio]
 ```
 **Notes:**
 - `frame_id` is the frame index of the current image.
-- `identity` is an integer from `0` to `num_identifies-1` (`num_identifies` is the total number of distinct object instances in the dataset), or `-1` if this box has no `identity` annotation.
+- `identity` is an integer from `1` to `num_identifies` (`num_identifies` is the total number of distinct object instances in the current video), or `-1` if this box has no `identity` annotation.
 - `bb_left` is the x coordinate of the left boundary of the target box.
 - `bb_top` is the y coordinate of the top boundary of the target box.
 - `width,height` are the real pixel width and height.
-- `x,y,z` are used in 3D and default to `-1` in 2D.
+- `score` is a flag for whether the target is taken into consideration (`0` means the target is ignored in the computation, while `1` marks it as an active instance); `1` by default.
+- `label` is the class label of the target; since only single-class tracking is supported at present, it defaults to `1`. The MOT-16 dataset contains other class labels, but they are all treated as the ignore class.
+- `vis_ratio` is the visibility ratio of the target after being enclosed or occluded by other targets, a floating point number from 0 to 1; `1` by default.
 
 #### The labels_with_ids folder
 
@@ -224,7 +226,7 @@ imExt=.jpg
 ```
 **Notes:**
 - `class` is `0`; only single-class multi-object tracking is supported at present.
-- `identity` is an integer from `0` to `num_identifies-1` (`num_identifies` is the total number of distinct object instances in the dataset), or `-1` if this box has no `identity` annotation.
+- `identity` is an integer from `1` to `num_identifies` (`num_identifies` is the total number of distinct object instances in the dataset), or `-1` if this box has no `identity` annotation.
 - `[x_center] [y_center] [width] [height]` are the center coordinates and the width and height; note that they are normalized by the image width/height, so they are floating point numbers from 0 to 1.
 
 The corresponding `labels_with_ids` can be generated with the following script:
--
GitLab
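Since the substance of this patch is the corrected `identity` range (from `0..num_identities-1` to `1..num_identities`, with `-1` still meaning no identity annotation), a quick check in that spirit can validate generated labels against the updated documentation. This is a sketch, not a tool from the repository; the directory layout is an assumption.

```python
# Sanity-check sketch: assert every identity in a labels_with_ids tree is
# -1 or within 1..num_identities. The glob layout is hypothetical.
import glob

def check_identity_range(label_dir, num_identities):
    for path in glob.glob(f"{label_dir}/**/*.txt", recursive=True):
        with open(path) as f:
            for n, line in enumerate(f, 1):
                parts = line.split()
                if len(parts) != 6:
                    continue  # skip blank or malformed lines
                identity = int(parts[1])
                assert identity == -1 or 1 <= identity <= num_identities, (
                    f"{path}:{n}: identity {identity} out of documented range")
```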