Add MOTS png format (#2198)

* Add mots format
* fix upload
* update docs
* update changelog
* Update datumaro dependency
* fix header
* update dm dependency
* Support importing with outside property in mot and mots
* fix track exporting

Co-authored-by: N Boris Sekachev <boris.sekachev@yandex.ru>

d4129f28 · Maxim Zhiltsov · GitHub · d957d6a9
9 changed files
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -27,9 +27,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 -   Ability to prepare meta information manually (<https://github.com/openvinotoolkit/cvat/pull/2217>)
 -   Ability to upload prepared meta information along with a video when creating a task (<https://github.com/openvinotoolkit/cvat/pull/2217>)
 -   Optional chaining plugin for cvat-canvas and cvat-ui (<https://github.com/openvinotoolkit/cvat/pull/2249>)
+-   MOTS png mask format support (<https://github.com/openvinotoolkit/cvat/pull/2198>)

 ### Changed
-
 -   UI models (like DEXTR) were redesigned to be more interactive (<https://github.com/opencv/cvat/pull/2054>)
 -   Used Ubuntu:20.04 as a base image for CVAT Dockerfile (<https://github.com/opencv/cvat/pull/2101>)
 -   Right colors of label tags in label mapping when a user runs automatic detection (<https://github.com/openvinotoolkit/cvat/pull/2162>)
@@ -37,6 +37,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 -   A key to remove a point from a polyshape [Ctrl => Alt] (<https://github.com/openvinotoolkit/cvat/pull/2204>)
 -   Updated `docker-compose` file version from `2.3` to `3.3`(<https://github.com/openvinotoolkit/cvat/pull/2235>)
 -   Added auto inference of url schema from host in CLI, if provided (<https://github.com/openvinotoolkit/cvat/pull/2240>)
+-   Track frames in the skips between annotations in MOT and MOTS formats are now marked `outside` (<https://github.com/openvinotoolkit/cvat/pull/2198>)

 ### Deprecated

@@ -47,7 +48,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 -

 ### Fixed
-
 -   Fixed multiple errors which arises when polygon is of length 5 or less (<https://github.com/opencv/cvat/pull/2100>)
 -   Fixed task creation from PDF (<https://github.com/opencv/cvat/pull/2141>)
 -   Fixed CVAT format import for frame stepped tasks (<https://github.com/openvinotoolkit/cvat/pull/2151>)
@@ -60,6 +60,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 -   Fixed use case when logs could be saved twice or more times #2202 (<https://github.com/openvinotoolkit/cvat/pull/2203>)
 -   Fixed issues from #2112 (<https://github.com/openvinotoolkit/cvat/pull/2217>)
 -   Git application name (renamed to dataset_repo) (<https://github.com/openvinotoolkit/cvat/pull/2243>)
+-   A problem in exporting of tracks, where tracks could be truncated (<https://github.com/openvinotoolkit/cvat/issues/2129>)
+

 ### Security


--- a/cvat/apps/dataset_manager/annotation.py
+++ b/cvat/apps/dataset_manager/annotation.py
@@ -729,7 +729,6 @@ class TrackManager(ObjectManager):
        if track.get("interpolated_shapes"):
            return track["interpolated_shapes"]

-        # TODO: should be return an iterator?
        shapes = []
        curr_frame = track["shapes"][0]["frame"]
        prev_shape = {}
@@ -747,9 +746,7 @@ class TrackManager(ObjectManager):
            curr_frame = shape["frame"]
            prev_shape = shape

-        # TODO: Need to modify a client and a database (append "outside" shapes for polytracks)
-        if not prev_shape["outside"] and (prev_shape["type"] == ShapeType.RECTANGLE
-               or prev_shape["type"] == ShapeType.POINTS or prev_shape["type"] == ShapeType.CUBOID):
+        if not prev_shape["outside"]:
            shape = copy(prev_shape)
            shape["frame"] = end_frame
            shapes.extend(interpolate(prev_shape, shape))
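
The hunk above drops the shape-type check, so the last visible shape of any track type (polygons and polylines included) is now carried forward to `end_frame`. A minimal sketch of that idea follows. It is not CVAT's `interpolate` implementation: `extend_to_end` is a hypothetical helper, shapes are plain dicts, and treating `end_frame` as exclusive is an assumption inferred from the updated tests in this commit.

```python
from copy import copy


def extend_to_end(shapes, end_frame):
    """Carry the last shape forward as non-keyframes (hypothetical helper).

    Assumes `shapes` is sorted by frame and `end_frame` is exclusive.
    """
    result = list(shapes)
    last = shapes[-1]
    if not last["outside"]:
        # Applies to every shape type, mirroring the simplified condition.
        for frame in range(last["frame"] + 1, end_frame):
            filler = copy(last)
            filler["frame"] = frame
            filler["keyframe"] = False
            result.append(filler)
    return result


polygon_track = [
    {"frame": 4, "type": "polygon", "outside": False, "keyframe": True},
]
extended = extend_to_end(polygon_track, end_frame=7)
```

With the old condition a polygon track like this one would have stopped at its last keyframe, which is the truncation the changelog entry refers to.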

--- a/cvat/apps/dataset_manager/formats/README.md
+++ b/cvat/apps/dataset_manager/formats/README.md
@@ -13,6 +13,7 @@
  - [CVAT](#cvat)
  - [LabelMe](#labelme)
  - [MOT](#mot)
+  - [MOTS](#mots)
  - [COCO](#coco)
  - [PASCAL VOC and mask](#voc)
  - [YOLO](#yolo)
@@ -708,8 +709,8 @@ Downloaded file: a zip archive of the following structure:
 ``` bash
 taskname.zip/
 ├── img1/
-|   ├── imgage1.jpg
-|   └── imgage2.jpg
+|   ├── image1.jpg
+|   └── image2.jpg
 └── gt/
    ├── labels.txt
    └── gt.txt
@@ -742,6 +743,38 @@ taskname.zip/

 - supported annotations: Rectangle tracks

+### [MOTS PNG](https://www.vision.rwth-aachen.de/page/mots)<a id="mots" />
+
+#### MOTS PNG Dumper
+
+Downloaded file: a zip archive of the following structure:
+
+``` bash
+taskname.zip/
+└── <any_subset_name>/
+    ├── images/
+    |   ├── image1.jpg
+    |   └── image2.jpg
+    └── instances/
+        ├── labels.txt
+        ├── image1.png
+        └── image2.png
+
+# labels.txt
+cat
+dog
+person
+...
+```
+
+- supported annotations: Rectangle and Polygon tracks
+
+#### MOTS PNG Loader
+
+Uploaded file: a zip archive of the structure above
+
+- supported annotations: Polygon tracks
+
 ### [LabelMe](http://labelme.csail.mit.edu/Release3.0)<a id="labelme" />

 #### LabelMe Dumper
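
The MOTS PNG archive layout documented in the README hunk above keeps the label names in `<any_subset_name>/instances/labels.txt`, one per line. A small sketch of reading that file straight from the zip is shown here. `read_mots_labels` and the subset name `train` are hypothetical and only illustrate the documented layout; CVAT itself delegates the parsing to Datumaro.

```python
import io
import zipfile


def read_mots_labels(archive, subset):
    """Return label names from <subset>/instances/labels.txt in a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        text = zf.read(f"{subset}/instances/labels.txt").decode("utf-8")
    # One label per line; skip blank lines.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Build a tiny in-memory archive that follows the documented structure.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("train/instances/labels.txt", "cat\ndog\nperson\n")
    zf.writestr("train/instances/image1.png", b"")
buffer.seek(0)

labels = read_mots_labels(buffer, "train")
```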

--- a/cvat/apps/dataset_manager/formats/mot.py
+++ b/cvat/apps/dataset_manager/formats/mot.py
@@ -79,6 +79,20 @@ def _import(src_file, task_data):
        for track in tracks.values():
            # MOT annotations do not require frames to be ordered
            track.shapes.sort(key=lambda t: t.frame)
+
+            # insert outside=True shapes into the gaps between frames where the track is visible
+            prev_shape_idx = 0
+            prev_shape = track.shapes[0]
+            for shape in track.shapes[1:]:
+                has_skip = task_data.frame_step < shape.frame - prev_shape.frame
+                if has_skip and not prev_shape.outside:
+                    prev_shape = prev_shape._replace(outside=True,
+                            frame=prev_shape.frame + task_data.frame_step)
+                    prev_shape_idx += 1
+                    track.shapes.insert(prev_shape_idx, prev_shape)
+                prev_shape = shape
+                prev_shape_idx += 1
+
            # Append a shape with outside=True to finish the track
            last_shape = track.shapes[-1]
            if last_shape.frame + task_data.frame_step <= \
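
The gap handling added in the hunk above can be tried in isolation. The sketch below is a simplified variant that builds a new list instead of inserting into `track.shapes` in place. `Shape` is a hypothetical stand-in for `task_data.TrackedShape` with only the two fields the logic needs, and `mark_gaps_outside` is not a CVAT function.

```python
from collections import namedtuple

# Hypothetical stand-in for task_data.TrackedShape (only the fields used here).
Shape = namedtuple("Shape", ["frame", "outside"])


def mark_gaps_outside(shapes, frame_step=1):
    """Insert an outside=True shape after each visible shape followed by a gap."""
    shapes = sorted(shapes, key=lambda s: s.frame)
    result = []
    for idx, shape in enumerate(shapes):
        result.append(shape)
        if idx + 1 == len(shapes):
            break
        next_shape = shapes[idx + 1]
        # Same comparison as in the diff: a gap is more than one frame step.
        has_skip = frame_step < next_shape.frame - shape.frame
        if has_skip and not shape.outside:
            result.append(shape._replace(
                outside=True, frame=shape.frame + frame_step))
    return result


visible = [Shape(0, False), Shape(1, False), Shape(5, False)]
filled = mark_gaps_outside(visible)
```

Here the track is visible on frames 0, 1 and 5, so a single `outside` shape lands on frame 2 and the object stays hidden until frame 5.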

--- a/cvat/apps/dataset_manager/formats/mots.py
+++ b/cvat/apps/dataset_manager/formats/mots.py
+# Copyright (C) 2019 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+from tempfile import TemporaryDirectory
+
+from pyunpack import Archive
+
+from cvat.apps.dataset_manager.bindings import (CvatTaskDataExtractor,
+    find_dataset_root, match_dm_item)
+from cvat.apps.dataset_manager.util import make_zip_archive
+from datumaro.components.extractor import AnnotationType, Transform
+from datumaro.components.project import Dataset
+
+from .registry import dm_env, exporter, importer
+
+
+class KeepTracks(Transform):
+    def transform_item(self, item):
+        return item.wrap(annotations=[a for a in item.annotations
+            if 'track_id' in a.attributes])
+
+@exporter(name='MOTS PNG', ext='ZIP', version='1.0')
+def _export(dst_file, task_data, save_images=False):
+    extractor = CvatTaskDataExtractor(task_data, include_images=save_images)
+    envt = dm_env.transforms
+    extractor = extractor.transform(KeepTracks) # can only export tracks
+    extractor = extractor.transform(envt.get('polygons_to_masks'))
+    extractor = extractor.transform(envt.get('boxes_to_masks'))
+    extractor = extractor.transform(envt.get('merge_instance_segments'))
+    extractor = Dataset.from_extractors(extractor) # apply lazy transforms
+    with TemporaryDirectory() as temp_dir:
+        dm_env.converters.get('mots_png').convert(extractor,
+            save_dir=temp_dir, save_images=save_images)
+
+        make_zip_archive(temp_dir, dst_file)
+
+@importer(name='MOTS PNG', ext='ZIP', version='1.0')
+def _import(src_file, task_data):
+    with TemporaryDirectory() as tmp_dir:
+        Archive(src_file.name).extractall(tmp_dir)
+
+        dataset = dm_env.make_importer('mots')(tmp_dir).make_dataset()
+        masks_to_polygons = dm_env.transforms.get('masks_to_polygons')
+        dataset = dataset.transform(masks_to_polygons)
+
+        tracks = {}
+        label_cat = dataset.categories()[AnnotationType.label]
+
+        root_hint = find_dataset_root(dataset, task_data)
+
+        for item in dataset:
+            frame_number = task_data.abs_frame_id(
+                match_dm_item(item, task_data, root_hint=root_hint))
+
+            for ann in item.annotations:
+                if ann.type != AnnotationType.polygon:
+                    continue
+
+                track_id = ann.attributes['track_id']
+                shape = task_data.TrackedShape(
+                    type='polygon',
+                    points=ann.points,
+                    occluded=ann.attributes.get('occluded') == True,
+                    outside=False,
+                    keyframe=True,
+                    z_order=ann.z_order,
+                    frame=frame_number,
+                    attributes=[],
+                    source='manual',
+                )
+
+                # build trajectories as lists of shapes in track dict
+                if track_id not in tracks:
+                    tracks[track_id] = task_data.Track(
+                        label_cat.items[ann.label].name, 0, 'manual', [])
+                tracks[track_id].shapes.append(shape)
+
+        for track in tracks.values():
+            track.shapes.sort(key=lambda t: t.frame)
+
+            # insert outside=True shapes into the gaps between frames where the track is visible
+            prev_shape_idx = 0
+            prev_shape = track.shapes[0]
+            for shape in track.shapes[1:]:
+                has_skip = task_data.frame_step < shape.frame - prev_shape.frame
+                if has_skip and not prev_shape.outside:
+                    prev_shape = prev_shape._replace(outside=True,
+                            frame=prev_shape.frame + task_data.frame_step)
+                    prev_shape_idx += 1
+                    track.shapes.insert(prev_shape_idx, prev_shape)
+                prev_shape = shape
+                prev_shape_idx += 1
+
+            # Append a shape with outside=True to finish the track
+            last_shape = track.shapes[-1]
+            if last_shape.frame + task_data.frame_step <= \
+                    int(task_data.meta['task']['stop_frame']):
+                track.shapes.append(last_shape._replace(outside=True,
+                    frame=last_shape.frame + task_data.frame_step)
+                )
+            task_data.add_track(track)
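
The importer above first groups per-frame polygons into tracks by their `track_id` attribute and then closes each track with a final `outside` shape when the task has frames left. A reduced sketch of those two steps follows. `Shape`, `group_by_track` and `close_track` are hypothetical names, and annotations are simplified to `(track_id, frame)` pairs rather than Datumaro items.

```python
from collections import namedtuple

# Hypothetical stand-in for task_data.TrackedShape (only the fields used here).
Shape = namedtuple("Shape", ["frame", "outside"])


def group_by_track(annotations):
    """Group (track_id, frame) pairs into frame-sorted lists of shapes."""
    tracks = {}
    for track_id, frame in annotations:
        tracks.setdefault(track_id, []).append(Shape(frame, False))
    for shapes in tracks.values():
        shapes.sort(key=lambda s: s.frame)
    return tracks


def close_track(shapes, stop_frame, frame_step=1):
    """Append an outside=True shape if one more frame step fits in the task."""
    last = shapes[-1]
    if last.frame + frame_step <= stop_frame:
        shapes.append(last._replace(
            outside=True, frame=last.frame + frame_step))
    return shapes


tracks = group_by_track([(7, 2), (3, 0), (7, 1)])
closed = close_track(tracks[7], stop_frame=10)
```

A track whose last shape already sits on `stop_frame` is left unchanged, matching the `<=` check in the diff.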
--- a/cvat/apps/dataset_manager/formats/registry.py
+++ b/cvat/apps/dataset_manager/formats/registry.py
@@ -87,6 +87,7 @@ import cvat.apps.dataset_manager.formats.datumaro
 import cvat.apps.dataset_manager.formats.labelme
 import cvat.apps.dataset_manager.formats.mask
 import cvat.apps.dataset_manager.formats.mot
+import cvat.apps.dataset_manager.formats.mots
 import cvat.apps.dataset_manager.formats.pascal_voc
 import cvat.apps.dataset_manager.formats.tfrecord
 import cvat.apps.dataset_manager.formats.yolo
\ No newline at end of file
--- a/cvat/apps/dataset_manager/tests/test_annotation.py
+++ b/cvat/apps/dataset_manager/tests/test_annotation.py
@@ -8,6 +8,17 @@ from unittest import TestCase


 class TrackManagerTest(TestCase):
+    def _check_interpolation(self, track):
+        interpolated = TrackManager.get_interpolated_shapes(track, 0, 7)
+
+        self.assertEqual(len(interpolated), 6)
+        self.assertTrue(interpolated[0]["keyframe"])
+        self.assertFalse(interpolated[1]["keyframe"])
+        self.assertTrue(interpolated[2]["keyframe"])
+        self.assertTrue(interpolated[3]["keyframe"])
+        self.assertFalse(interpolated[4]["keyframe"])
+        self.assertFalse(interpolated[5]["keyframe"])
+
    def test_point_interpolation(self):
        track = {
            "frame": 0,
@@ -32,14 +43,18 @@ class TrackManagerTest(TestCase):
                    "occluded": False,
                    "outside": True
                },
+                {
+                    "frame": 4,
+                    "attributes": [],
+                    "points": [3.0, 4.0, 5.0, 6.0],
+                    "type": "points",
+                    "occluded": False,
+                    "outside": False
+                },
            ]
        }

-        interpolated = TrackManager.get_interpolated_shapes(track, 0, 2)
-
-        self.assertEqual(len(interpolated), 3)
-        self.assertTrue(interpolated[0]["keyframe"])
-        self.assertFalse(interpolated[1]["keyframe"])
+        self._check_interpolation(track)

    def test_polygon_interpolation(self):
        track = {
@@ -65,14 +80,18 @@ class TrackManagerTest(TestCase):
                    "occluded": False,
                    "outside": True
                },
+                {
+                    "frame": 4,
+                    "attributes": [],
+                    "points": [3.0, 4.0, 5.0, 6.0, 7.0, 6.0, 4.0, 5.0],
+                    "type": "polygon",
+                    "occluded": False,
+                    "outside": False
+                },
            ]
        }

-        interpolated = TrackManager.get_interpolated_shapes(track, 0, 2)
-
-        self.assertEqual(len(interpolated), 3)
-        self.assertTrue(interpolated[0]["keyframe"])
-        self.assertFalse(interpolated[1]["keyframe"])
+        self._check_interpolation(track)

    def test_bbox_interpolation(self):
        track = {
@@ -98,14 +117,18 @@ class TrackManagerTest(TestCase):
                    "occluded": False,
                    "outside": True
                },
+                {
+                    "frame": 4,
+                    "attributes": [],
+                    "points": [3.0, 4.0, 5.0, 6.0],
+                    "type": "rectangle",
+                    "occluded": False,
+                    "outside": False
+                },
            ]
        }

-        interpolated = TrackManager.get_interpolated_shapes(track, 0, 2)
-
-        self.assertEqual(len(interpolated), 3)
-        self.assertTrue(interpolated[0]["keyframe"])
-        self.assertFalse(interpolated[1]["keyframe"])
+        self._check_interpolation(track)

    def test_line_interpolation(self):
        track = {
@@ -131,11 +154,15 @@ class TrackManagerTest(TestCase):
                    "occluded": False,
                    "outside": True
                },
+                {
+                    "frame": 4,
+                    "attributes": [],
+                    "points": [3.0, 4.0, 5.0, 6.0],
+                    "type": "polyline",
+                    "occluded": False,
+                    "outside": False
+                },
            ]
        }

-        interpolated = TrackManager.get_interpolated_shapes(track, 0, 2)
-
-        self.assertEqual(len(interpolated), 3)
-        self.assertTrue(interpolated[0]["keyframe"])
-        self.assertFalse(interpolated[1]["keyframe"])
\ No newline at end of file
+        self._check_interpolation(track)
\ No newline at end of file
--- a/cvat/apps/dataset_manager/tests/test_formats.py
+++ b/cvat/apps/dataset_manager/tests/test_formats.py
@@ -267,6 +267,7 @@ class TaskExportTest(_DbTestBase):
            'Datumaro 1.0',
            'LabelMe 3.0',
            'MOT 1.1',
+            'MOTS PNG 1.0',
            'PASCAL VOC 1.1',
            'Segmentation mask 1.1',
            'TFRecord 1.0',
@@ -282,6 +283,7 @@ class TaskExportTest(_DbTestBase):
            'CVAT 1.1',
            'LabelMe 3.0',
            'MOT 1.1',
+            'MOTS PNG 1.0',
            'PASCAL VOC 1.1',
            'Segmentation mask 1.1',
            'TFRecord 1.0',
@@ -316,6 +318,7 @@ class TaskExportTest(_DbTestBase):
            ('Datumaro 1.0', 'datumaro_project'),
            ('LabelMe 3.0', 'label_me'),
            # ('MOT 1.1', 'mot_seq'), # does not support
+            # ('MOTS PNG 1.0', 'mots_png'), # does not support
            ('PASCAL VOC 1.1', 'voc'),
            ('Segmentation mask 1.1', 'voc'),
            ('TFRecord 1.0', 'tf_detection_api'),

--- a/cvat/apps/engine/tests/test_rest_api.py
+++ b/cvat/apps/engine/tests/test_rest_api.py
@@ -3183,6 +3183,39 @@ class TaskAnnotationAPITestCase(JobAnnotationAPITestCase):
                    }
                ]
            }]
+            polygon_tracks_wo_attrs = [{
+                "frame": 0,
+                "label_id": task["labels"][1]["id"],
+                "group": 0,
+                "source": "manual",
+                "attributes": [],
+                "shapes": [
+                    {
+                        "frame": 0,
+                        "attributes": [],
+                        "points": [1.0, 2.1, 50.2, 36.6, 7.0, 10.0],
+                        "type": "polygon",
+                        "occluded": False,
+                        "outside": False,
+                    },
+                    {
+                        "frame": 1,
+                        "attributes": [],
+                        "points": [1.0, 2.1, 51, 36.6, 8.0, 11.0],
+                        "type": "polygon",
+                        "occluded": False,
+                        "outside": False
+                    },
+                    {
+                        "frame": 2,
+                        "attributes": [],
+                        "points": [1.0, 2.1, 51, 36.6, 14.0, 15.0],
+                        "type": "polygon",
+                        "occluded": False,
+                        "outside": True,
+                    }
+                ]
+            }]

            rectangle_shapes_with_attrs = [{
                "frame": 0,
@@ -3287,11 +3320,15 @@ class TaskAnnotationAPITestCase(JobAnnotationAPITestCase):
                    "tracks": [],
                }
            if annotation_format == "CVAT for video 1.1":
-                annotations["tracks"] = rectangle_tracks_with_attrs + rectangle_tracks_wo_attrs
+                annotations["tracks"] = rectangle_tracks_with_attrs \
+                                      + rectangle_tracks_wo_attrs \
+                                      + polygon_tracks_wo_attrs

            elif annotation_format == "CVAT for images 1.1":
-                annotations["shapes"] = rectangle_shapes_with_attrs + rectangle_shapes_wo_attrs \
-                    + polygon_shapes_wo_attrs + polygon_shapes_with_attrs
+                annotations["shapes"] = rectangle_shapes_with_attrs \
+                                      + rectangle_shapes_wo_attrs \
+                                      + polygon_shapes_wo_attrs \
+                                      + polygon_shapes_with_attrs
                annotations["tags"] = tags_with_attrs + tags_wo_attrs

            elif annotation_format == "PASCAL VOC 1.1":
@@ -3306,24 +3343,28 @@ class TaskAnnotationAPITestCase(JobAnnotationAPITestCase):
                annotations["shapes"] = polygon_shapes_wo_attrs

            elif annotation_format == "Segmentation mask 1.1":
-                annotations["shapes"] = rectangle_shapes_wo_attrs + polygon_shapes_wo_attrs
+                annotations["shapes"] = rectangle_shapes_wo_attrs \
+                                      + polygon_shapes_wo_attrs
                annotations["tracks"] = rectangle_tracks_wo_attrs

            elif annotation_format == "MOT 1.1":
                annotations["shapes"] = rectangle_shapes_wo_attrs
                annotations["tracks"] = rectangle_tracks_wo_attrs

+            elif annotation_format == "MOTS PNG 1.0":
+                annotations["tracks"] = polygon_tracks_wo_attrs
+
            elif annotation_format == "LabelMe 3.0":
-                annotations["shapes"] = rectangle_shapes_with_attrs + \
-                                        rectangle_shapes_wo_attrs + \
-                                        polygon_shapes_wo_attrs + \
-                                        polygon_shapes_with_attrs
+                annotations["shapes"] = rectangle_shapes_with_attrs \
+                                      + rectangle_shapes_wo_attrs \
+                                      + polygon_shapes_wo_attrs \
+                                      + polygon_shapes_with_attrs

            elif annotation_format == "Datumaro 1.0":
-                annotations["shapes"] = rectangle_shapes_with_attrs + \
-                                        rectangle_shapes_wo_attrs + \
-                                        polygon_shapes_wo_attrs + \
-                                        polygon_shapes_with_attrs
+                annotations["shapes"] = rectangle_shapes_with_attrs \
+                                      + rectangle_shapes_wo_attrs \
+                                      + polygon_shapes_wo_attrs \
+                                      + polygon_shapes_with_attrs
                annotations["tags"] = tags_with_attrs + tags_wo_attrs

            else:
@@ -3422,7 +3463,7 @@ class TaskAnnotationAPITestCase(JobAnnotationAPITestCase):
                self.assertEqual(response.status_code, HTTP_201_CREATED)

                # 7. check annotation
-                if import_format == "Segmentation mask 1.1":
+                if import_format in {"Segmentation mask 1.1", "MOTS PNG 1.0"}:
                    continue # can't really predict the result to check
                response = self._get_api_v1_tasks_id_annotations(task["id"], annotator)
                self.assertEqual(response.status_code, HTTP_200_OK)