Unverified commit 6c38ad07, authored by Maria Khrustaleva, committed by GitHub

Manifest (#2763)

* Added support for manifest file
* Added data migration
* Updated tests
* Update CHANGELOG
* Update manifest documentation
* Fix case with 3d data
Co-authored-by: Nikita Manovich <nikita.manovich@intel.com>
Parent e41c3012
......@@ -5,6 +5,7 @@ branch = true
source =
cvat/apps/
utils/cli/
utils/dataset_manifest
omit =
cvat/settings/*
......
......@@ -28,6 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- CVAT-3D: Implemented initial cuboid placement in 3D View and select cuboid in Top, Side and Front views
(<https://github.com/openvinotoolkit/cvat/pull/2891>)
- [Market-1501](https://www.aitribune.com/dataset/2018051063) format support (<https://github.com/openvinotoolkit/cvat/pull/2869>)
- Ability to upload a manifest for a dataset with images (<https://github.com/openvinotoolkit/cvat/pull/2763>)
- Annotations filters UI using react-awesome-query-builder (<https://github.com/openvinotoolkit/cvat/issues/1418>)
### Changed
......@@ -42,6 +43,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Image visualizations settings on canvas for faster access (<https://github.com/openvinotoolkit/cvat/pull/2872>)
- Better scale management of left panel when screen is too small (<https://github.com/openvinotoolkit/cvat/pull/2880>)
- Improved error messages for annotation import (<https://github.com/openvinotoolkit/cvat/pull/2935>)
- Using manifest support instead of video meta information and dummy chunks (<https://github.com/openvinotoolkit/cvat/pull/2763>)
### Deprecated
......
......@@ -2,40 +2,25 @@
## Description
Data on the fly processing is a way of working with data, the main idea of which is as follows: when creating a task,
the minimum necessary meta information is collected. This meta information later makes it possible to create the
necessary chunks when a request is received from a client.

Generated chunks are stored in a cache of limited size with a policy of evicting less popular items.

When a request is received from a client, the required chunk is looked up in the cache. If the chunk does not exist
yet, it is created using the prepared meta information and then put into the cache.
This method of working with data allows you to:
- reduce the task creation time.
- store data in a cache of limited size with a policy of evicting less popular items.
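As an illustration, here is a minimal sketch of such a cache built on the `diskcache` library the server already uses; the directory, size limit, and key scheme below are assumptions, not CVAT's actual configuration:

```python
from diskcache import Cache

# a bounded on-disk cache that evicts less popular items first
cache = Cache(
    '/tmp/chunk_cache',                       # hypothetical cache directory
    size_limit=2 * 1024 ** 3,                 # cap the cache at 2 GiB
    eviction_policy='least-frequently-used',  # drop rarely requested chunks first
)

def get_chunk(key, create_chunk):
    """Return the chunk for `key`, creating and caching it on a miss."""
    chunk = cache.get(key)
    if chunk is None:
        chunk = create_chunk()   # build the chunk from the prepared meta information
        cache.set(key, chunk)
    return chunk
```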
## Prepare meta information
Unfortunately, this method will not work for all videos with a valid manifest file. If there are not enough keyframes
in the video for smooth video decoding, the task will be created in another way. Namely, all chunks will be prepared
during task creation, which may take some time.
Different meta information is collected for different types of uploaded data.
#### Uploading a manifest with data
### Video
For video, this is a valid mapping of key frame numbers to their timestamps. This information is saved to `meta_info.txt`.
Unfortunately, this method will not work for all videos with valid meta information.
If there are not enough keyframes in the video for smooth video decoding, the task will be created in the old way.
#### Uploading meta information along with data
When creating a task, you can upload a file with meta information along with the video,
which will further reduce the time for creating a task.
You can see how to prepare meta information [here](/utils/prepare_meta_information/README.md).
It is worth noting that the generated file also contains information about the number of frames in the video at the end.
### Images
A mapping of each chunk number to the paths of the images that make up the chunk
is saved at task creation time in files named `dummy_{chunk_number}.txt`.
When creating a task, you can upload a `manifest.jsonl` file along with the video or dataset with images.
You can see how to prepare it [here](/utils/dataset_manifest/README.md).
......@@ -15,7 +15,6 @@
- [How to create a task with multiple jobs](#how-to-create-a-task-with-multiple-jobs)
- [How to transfer CVAT to another machine](#how-to-transfer-cvat-to-another-machine)
## How to update CVAT
Before upgrading, please follow the [backup guide](backup_guide.md) and backup all CVAT volumes.
......@@ -151,4 +150,5 @@ Set the segment size when you create a new task, this option is available in the
[Advanced configuration](user_guide.md#advanced-configuration) section.
## How to transfer CVAT to another machine
Follow the [backup/restore guide](backup_guide.md#how-to-backup-all-cvat-data).
......@@ -153,8 +153,8 @@ Go to the [Django administration panel](http://localhost:8080/admin). There you
**Select files**. Press tab `My computer` to choose some files for annotation from your PC.
If you select tab `Connected file share` you can choose files for annotation from your network.
If you select `Remote source`, you'll see a field where you can enter a list of URLs (one URL per line).
If you upload video data and select the `Use cache` option, you can attach a file with meta information along with the video file.
You can find how to prepare it [here](/utils/prepare_meta_information/README.md).
If you upload a video or a dataset with images and select the `Use cache` option, you can attach a `manifest.jsonl` file.
You can find how to prepare it [here](/utils/dataset_manifest/README.md).
![](static/documentation/images/image127.jpg)
......@@ -1157,8 +1157,6 @@ Intelligent scissors is an CV method of creating a polygon by placing points wit
The distance between the adjacent points is limited by the threshold of action,
displayed as a red square which is tied to the cursor.
- First, select the label and then click on the `intelligent scissors` button.
![](static/documentation/images/image199.jpg)
......
# Copyright (C) 2020 Intel Corporation
# Copyright (C) 2020-2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
......@@ -9,9 +9,9 @@ from diskcache import Cache
from django.conf import settings
from cvat.apps.engine.media_extractors import (Mpeg4ChunkWriter,
Mpeg4CompressedChunkWriter, ZipChunkWriter, ZipCompressedChunkWriter)
Mpeg4CompressedChunkWriter, ZipChunkWriter, ZipCompressedChunkWriter,
ImageDatasetManifestReader, VideoDatasetManifestReader)
from cvat.apps.engine.models import DataChoice, StorageChoice
from cvat.apps.engine.prepare import PrepareInfo
from cvat.apps.engine.models import DimensionType
class CacheInteraction:
......@@ -51,17 +51,24 @@ class CacheInteraction:
StorageChoice.LOCAL: db_data.get_upload_dirname(),
StorageChoice.SHARE: settings.SHARE_ROOT
}[db_data.storage]
if os.path.exists(db_data.get_meta_path()):
if hasattr(db_data, 'video'):
source_path = os.path.join(upload_dir, db_data.video.path)
meta = PrepareInfo(source_path=source_path, meta_path=db_data.get_meta_path())
for frame in meta.decode_needed_frames(chunk_number, db_data):
images.append(frame)
writer.save_as_chunk([(image, source_path, None) for image in images], buff)
reader = VideoDatasetManifestReader(manifest_path=db_data.get_manifest_path(),
source_path=source_path, chunk_number=chunk_number,
chunk_size=db_data.chunk_size, start=db_data.start_frame,
stop=db_data.stop_frame, step=db_data.get_frame_step())
for frame in reader:
images.append((frame, source_path, None))
else:
with open(db_data.get_dummy_chunk_path(chunk_number), 'r') as dummy_file:
images = [os.path.join(upload_dir, line.strip()) for line in dummy_file]
writer.save_as_chunk([(image, image, None) for image in images], buff)
reader = ImageDatasetManifestReader(manifest_path=db_data.get_manifest_path(),
chunk_number=chunk_number, chunk_size=db_data.chunk_size,
start=db_data.start_frame, stop=db_data.stop_frame,
step=db_data.get_frame_step())
for item in reader:
source_path = os.path.join(upload_dir, f"{item['name']}{item['extension']}")
images.append((source_path, source_path, None))
writer.save_as_chunk(images, buff)
buff.seek(0)
return buff, mime_type
......
......@@ -11,6 +11,7 @@ import itertools
import struct
import re
from abc import ABC, abstractmethod
from contextlib import closing
import av
import numpy as np
......@@ -25,6 +26,7 @@ from cvat.apps.engine.models import DimensionType
ImageFile.LOAD_TRUNCATED_IMAGES = True
from cvat.apps.engine.mime_types import mimetypes
from utils.dataset_manifest import VideoManifestManager, ImageManifestManager
def get_mime(name):
for type_name, type_def in MEDIA_TYPES.items():
......@@ -127,6 +129,10 @@ class ImageListReader(IMediaReader):
img = Image.open(self._source_path[i])
return img.width, img.height
@property
def absolute_source_paths(self):
return [self.get_path(idx) for idx, _ in enumerate(self._source_path)]
class DirectoryReader(ImageListReader):
def __init__(self, source_path, step=1, start=0, stop=None):
image_paths = []
......@@ -317,6 +323,103 @@ class VideoReader(IMediaReader):
image = (next(iter(self)))[0]
return image.width, image.height
class FragmentMediaReader:
def __init__(self, chunk_number, chunk_size, start, stop, step=1):
self._start = start
self._stop = stop + 1 # up to the last inclusive
self._step = step
self._chunk_number = chunk_number
self._chunk_size = chunk_size
self._start_chunk_frame_number = \
self._start + self._chunk_number * self._chunk_size * self._step
self._end_chunk_frame_number = min(self._start_chunk_frame_number \
+ (self._chunk_size - 1) * self._step + 1, self._stop)
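# Example with illustrative numbers: for start=0, stop=20, step=2 and chunk_size=4,
# chunk number 2 starts at absolute frame 0 + 2 * 4 * 2 = 16 and covers frames [16, 18, 20].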
self._frame_range = self._get_frame_range()
@property
def frame_range(self):
return self._frame_range
def _get_frame_range(self):
frame_range = []
for idx in range(self._start, self._stop, self._step):
if idx < self._start_chunk_frame_number:
continue
elif idx < self._end_chunk_frame_number and \
not ((idx - self._start_chunk_frame_number) % self._step):
frame_range.append(idx)
elif (idx - self._start_chunk_frame_number) % self._step:
continue
else:
break
return frame_range
class ImageDatasetManifestReader(FragmentMediaReader):
def __init__(self, manifest_path, **kwargs):
super().__init__(**kwargs)
self._manifest = ImageManifestManager(manifest_path)
self._manifest.init_index()
def __iter__(self):
for idx in self._frame_range:
yield self._manifest[idx]
class VideoDatasetManifestReader(FragmentMediaReader):
def __init__(self, manifest_path, **kwargs):
self.source_path = kwargs.pop('source_path')
super().__init__(**kwargs)
self._manifest = VideoManifestManager(manifest_path)
self._manifest.init_index()
def _get_nearest_left_key_frame(self):
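# The manifest lists only key frames: binary-search it for the last key frame
# whose frame number does not exceed the first frame of the requested chunk,
# so decoding can start from the nearest preceding seek point.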
if self._start_chunk_frame_number >= \
self._manifest[len(self._manifest) - 1].get('number'):
left_border = len(self._manifest) - 1
else:
left_border = 0
delta = len(self._manifest)
while delta:
step = delta // 2
cur_position = left_border + step
if self._manifest[cur_position].get('number') < self._start_chunk_frame_number:
cur_position += 1
left_border = cur_position
delta -= step + 1
else:
delta = step
if self._manifest[cur_position].get('number') > self._start_chunk_frame_number:
left_border -= 1
frame_number = self._manifest[left_border].get('number')
timestamp = self._manifest[left_border].get('pts')
return frame_number, timestamp
def __iter__(self):
start_decode_frame_number, start_decode_timestamp = self._get_nearest_left_key_frame()
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = next(stream for stream in container.streams if stream.type == 'video')
video_stream.thread_type = 'AUTO'
container.seek(offset=start_decode_timestamp, stream=video_stream)
frame_number = start_decode_frame_number - 1
for packet in container.demux(video_stream):
for frame in packet.decode():
frame_number += 1
if frame_number in self._frame_range:
if video_stream.metadata.get('rotate'):
frame = av.VideoFrame().from_ndarray(
rotate_image(
frame.to_ndarray(format='bgr24'),
360 - int(container.streams.video[0].metadata.get('rotate'))
),
format ='bgr24'
)
yield frame
elif frame_number < self._frame_range[-1]:
continue
else:
return
class IChunkWriter(ABC):
def __init__(self, quality, dimension=DimensionType.DIM_2D):
self._image_quality = quality
......
# Generated by Django 3.1.1 on 2021-02-20 08:36
import glob
import os
from re import search
from django.conf import settings
from django.db import migrations
from cvat.apps.engine.models import (DimensionType, StorageChoice,
StorageMethodChoice)
from utils.dataset_manifest import ImageManifestManager, VideoManifestManager
def migrate_data(apps, schema_editor):
Data = apps.get_model("engine", "Data")
query_set = Data.objects.filter(storage_method=StorageMethodChoice.CACHE)
for db_data in query_set:
try:
upload_dir = '{}/{}/raw'.format(settings.MEDIA_DATA_ROOT, db_data.id)
if os.path.exists(os.path.join(upload_dir, 'meta_info.txt')):
os.remove(os.path.join(upload_dir, 'meta_info.txt'))
else:
for path in glob.glob(f'{upload_dir}/dummy_*.txt'):
os.remove(path)
# necessary for the case of a long data migration: a manifest may already have been created by a previous run
if os.path.exists(os.path.join(upload_dir, 'manifest.jsonl')):
continue
data_dir = upload_dir if db_data.storage == StorageChoice.LOCAL else settings.SHARE_ROOT
if hasattr(db_data, 'video'):
media_file = os.path.join(data_dir, db_data.video.path)
manifest = VideoManifestManager(manifest_path=upload_dir)
meta_info = manifest.prepare_meta(media_file=media_file)
manifest.create(meta_info)
manifest.init_index()
else:
manifest = ImageManifestManager(manifest_path=upload_dir)
sources = []
if db_data.storage == StorageChoice.LOCAL:
for (root, _, files) in os.walk(data_dir):
sources.extend([os.path.join(root, f) for f in files])
sources.sort()
# when a share is used, we cannot explicitly restore the entire data structure
else:
sources = [os.path.join(data_dir, db_image.path) for db_image in db_data.images.all().order_by('frame')]
if any(task.dimension == DimensionType.DIM_3D for task in db_data.tasks.all()):
content = []
for source in sources:
name, ext = os.path.splitext(os.path.relpath(source, upload_dir))
content.append({
'name': name,
'extension': ext
})
else:
meta_info = manifest.prepare_meta(sources=sources, data_dir=data_dir)
content = meta_info.content
if db_data.storage == StorageChoice.SHARE:
def _get_frame_step(str_):
# the frame filter looks like 'step=N'
match = search(r"step\s*=\s*([1-9]\d*)", str_)
return int(match.group(1)) if match else 1
step = _get_frame_step(db_data.frame_filter)
start = db_data.start_frame
stop = db_data.stop_frame + 1
images_range = range(start, stop, step)
result_content = []
for i in range(stop):
item = content.pop(0) if i in images_range else dict()
result_content.append(item)
content = result_content
manifest.create(content)
manifest.init_index()
except Exception as ex:
print(str(ex))
class Migration(migrations.Migration):
dependencies = [
('engine', '0037_task_subset'),
]
operations = [
migrations.RunPython(migrate_data)
]
......@@ -138,11 +138,10 @@ class Data(models.Model):
def get_preview_path(self):
return os.path.join(self.get_data_dirname(), 'preview.jpeg')
def get_meta_path(self):
return os.path.join(self.get_upload_dirname(), 'meta_info.txt')
def get_dummy_chunk_path(self, chunk_number):
return os.path.join(self.get_upload_dirname(), 'dummy_{}.txt'.format(chunk_number))
def get_manifest_path(self):
return os.path.join(self.get_upload_dirname(), 'manifest.jsonl')
def get_index_path(self):
return os.path.join(self.get_upload_dirname(), 'index.json')
class Video(models.Model):
data = models.OneToOneField(Data, on_delete=models.CASCADE, related_name="video", null=True)
......
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT
import av
from collections import OrderedDict
import hashlib
import os
from cvat.apps.engine.utils import rotate_image
class WorkWithVideo:
def __init__(self, **kwargs):
if not kwargs.get('source_path'):
raise Exception('No source path')
self.source_path = kwargs.get('source_path')
@staticmethod
def _open_video_container(source_path, mode, options=None):
return av.open(source_path, mode=mode, options=options)
@staticmethod
def _close_video_container(container):
container.close()
@staticmethod
def _get_video_stream(container):
video_stream = next(stream for stream in container.streams if stream.type == 'video')
video_stream.thread_type = 'AUTO'
return video_stream
@staticmethod
def _get_frame_size(container):
video_stream = WorkWithVideo._get_video_stream(container)
for packet in container.demux(video_stream):
for frame in packet.decode():
if video_stream.metadata.get('rotate'):
frame = av.VideoFrame().from_ndarray(
rotate_image(
frame.to_ndarray(format='bgr24'),
360 - int(container.streams.video[0].metadata.get('rotate')),
),
format ='bgr24',
)
return frame.width, frame.height
class AnalyzeVideo(WorkWithVideo):
def check_type_first_frame(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
for packet in container.demux(video_stream):
for frame in packet.decode():
self._close_video_container(container)
assert frame.pict_type.name == 'I', 'First frame is not key frame'
return
def check_video_timestamps_sequences(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
frame_pts = -1
frame_dts = -1
for packet in container.demux(video_stream):
for frame in packet.decode():
if None not in [frame.pts, frame_pts] and frame.pts <= frame_pts:
self._close_video_container(container)
raise Exception('Invalid pts sequences')
if None not in [frame.dts, frame_dts] and frame.dts <= frame_dts:
self._close_video_container(container)
raise Exception('Invalid dts sequences')
frame_pts, frame_dts = frame.pts, frame.dts
self._close_video_container(container)
def md5_hash(frame):
return hashlib.md5(frame.to_image().tobytes()).hexdigest()
class PrepareInfo(WorkWithVideo):
def __init__(self, **kwargs):
super().__init__(**kwargs)
if not kwargs.get('meta_path'):
raise Exception('No meta path')
self.meta_path = kwargs.get('meta_path')
self.key_frames = {}
self.frames = 0
container = self._open_video_container(self.source_path, 'r')
self.width, self.height = self._get_frame_size(container)
self._close_video_container(container)
def get_task_size(self):
return self.frames
@property
def frame_sizes(self):
return (self.width, self.height)
def check_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
if md5_hash(frame) != key_frame[1]['md5'] or frame.pts != key_frame[1]['pts']:
self.key_frames.pop(key_frame[0])
return
def check_seek_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
key_frames_copy = self.key_frames.copy()
for key_frame in key_frames_copy.items():
container.seek(offset=key_frame[1]['pts'], stream=video_stream)
self.check_key_frame(container, video_stream, key_frame)
def check_frames_ratio(self, chunk_size):
return (len(self.key_frames) and (self.frames // len(self.key_frames)) <= 2 * chunk_size)
def save_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
frame_number = 0
for packet in container.demux(video_stream):
for frame in packet.decode():
if frame.key_frame:
self.key_frames[frame_number] = {
'pts': frame.pts,
'md5': md5_hash(frame),
}
frame_number += 1
self.frames = frame_number
self._close_video_container(container)
def save_meta_info(self):
with open(self.meta_path, 'w') as meta_file:
for index, frame in self.key_frames.items():
meta_file.write('{} {}\n'.format(index, frame['pts']))
def get_nearest_left_key_frame(self, start_chunk_frame_number):
start_decode_frame_number = 0
start_decode_timestamp = 0
with open(self.meta_path, 'r') as file:
for line in file:
frame_number, timestamp = line.strip().split(' ')
if int(frame_number) <= start_chunk_frame_number:
start_decode_frame_number = frame_number
start_decode_timestamp = timestamp
else:
break
return int(start_decode_frame_number), int(start_decode_timestamp)
def decode_needed_frames(self, chunk_number, db_data):
step = db_data.get_frame_step()
start_chunk_frame_number = db_data.start_frame + chunk_number * db_data.chunk_size * step
end_chunk_frame_number = min(start_chunk_frame_number + (db_data.chunk_size - 1) * step + 1, db_data.stop_frame + 1)
start_decode_frame_number, start_decode_timestamp = self.get_nearest_left_key_frame(start_chunk_frame_number)
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
container.seek(offset=start_decode_timestamp, stream=video_stream)
frame_number = start_decode_frame_number - 1
for packet in container.demux(video_stream):
for frame in packet.decode():
frame_number += 1
if frame_number < start_chunk_frame_number:
continue
elif frame_number < end_chunk_frame_number and not ((frame_number - start_chunk_frame_number) % step):
if video_stream.metadata.get('rotate'):
frame = av.VideoFrame().from_ndarray(
rotate_image(
frame.to_ndarray(format='bgr24'),
360 - int(container.streams.video[0].metadata.get('rotate'))
),
format ='bgr24'
)
yield frame
elif (frame_number - start_chunk_frame_number) % step:
continue
else:
self._close_video_container(container)
return
self._close_video_container(container)
class UploadedMeta(PrepareInfo):
def __init__(self, **kwargs):
super().__init__(**kwargs)
uploaded_meta = kwargs.get('uploaded_meta')
assert uploaded_meta is not None, 'No uploaded meta path'
with open(uploaded_meta, 'r') as meta_file:
lines = meta_file.read().strip().split('\n')
self.frames = int(lines.pop())
key_frames = {int(line.split()[0]): int(line.split()[1]) for line in lines}
self.key_frames = OrderedDict(sorted(key_frames.items(), key=lambda x: x[0]))
@property
def frame_sizes(self):
container = self._open_video_container(self.source_path, 'r')
video_stream = self._get_video_stream(container)
container.seek(offset=next(iter(self.key_frames.values())), stream=video_stream)
for packet in container.demux(video_stream):
for frame in packet.decode():
if video_stream.metadata.get('rotate'):
frame = av.VideoFrame().from_ndarray(
rotate_image(
frame.to_ndarray(format='bgr24'),
360 - int(container.streams.video[0].metadata.get('rotate'))
),
format ='bgr24'
)
self._close_video_container(container)
return (frame.width, frame.height)
def save_meta_info(self):
with open(self.meta_path, 'w') as meta_file:
for index, pts in self.key_frames.items():
meta_file.write('{} {}\n'.format(index, pts))
def check_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
assert frame.pts == key_frame[1], "Uploaded meta information does not match the video"
return
def check_seek_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
for key_frame in self.key_frames.items():
container.seek(offset=key_frame[1], stream=video_stream)
self.check_key_frame(container, video_stream, key_frame)
self._close_video_container(container)
def check_frames_numbers(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
# not all videos contain information about the number of frames
if video_stream.frames:
self._close_video_container(container)
assert video_stream.frames == self.frames, "Uploaded meta information does not match the video"
return
self._close_video_container(container)
def prepare_meta(media_file, upload_dir=None, meta_dir=None, chunk_size=None):
paths = {
'source_path': os.path.join(upload_dir, media_file) if upload_dir else media_file,
'meta_path': os.path.join(meta_dir, 'meta_info.txt') if meta_dir else os.path.join(upload_dir, 'meta_info.txt'),
}
analyzer = AnalyzeVideo(source_path=paths.get('source_path'))
analyzer.check_type_first_frame()
analyzer.check_video_timestamps_sequences()
meta_info = PrepareInfo(source_path=paths.get('source_path'),
meta_path=paths.get('meta_path'))
meta_info.save_key_frames()
meta_info.check_seek_key_frames()
meta_info.save_meta_info()
smooth_decoding = meta_info.check_frames_ratio(chunk_size) if chunk_size else None
return (meta_info, smooth_decoding)
def prepare_meta_for_upload(func, *args):
meta_info, smooth_decoding = func(*args)
with open(meta_info.meta_path, 'a') as meta_file:
meta_file.write(str(meta_info.get_task_size()))
return smooth_decoding
# Copyright (C) 2018-2020 Intel Corporation
# Copyright (C) 2018-2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
import itertools
import os
import sys
from re import findall
import rq
import shutil
from traceback import print_exception
......@@ -17,8 +16,9 @@ import requests
from cvat.apps.engine.media_extractors import get_mime, MEDIA_TYPES, Mpeg4ChunkWriter, ZipChunkWriter, Mpeg4CompressedChunkWriter, ZipCompressedChunkWriter, ValidateDimension
from cvat.apps.engine.models import DataChoice, StorageMethodChoice, StorageChoice, RelatedFile
from cvat.apps.engine.utils import av_scan_paths
from cvat.apps.engine.prepare import prepare_meta
from cvat.apps.engine.models import DimensionType
from utils.dataset_manifest import ImageManifestManager, VideoManifestManager
from utils.dataset_manifest.core import VideoManifestValidator
import django_rq
from django.conf import settings
......@@ -107,7 +107,7 @@ def _save_task_to_db(db_task):
db_task.data.save()
db_task.save()
def _count_files(data, meta_info_file=None):
def _count_files(data, manifest_file=None):
share_root = settings.SHARE_ROOT
server_files = []
......@@ -134,8 +134,8 @@ def _count_files(data, meta_info_file=None):
mime = get_mime(full_path)
if mime in counter:
counter[mime].append(rel_path)
elif findall('meta_info.txt$', rel_path):
meta_info_file.append(rel_path)
elif 'manifest.jsonl' == os.path.basename(rel_path):
manifest_file.append(rel_path)
else:
slogger.glob.warn("Skip '{}' file (its mime type doesn't "
"correspond to a video or an image file)".format(full_path))
......@@ -154,7 +154,7 @@ def _count_files(data, meta_info_file=None):
return counter
def _validate_data(counter, meta_info_file=None):
def _validate_data(counter, manifest_file=None):
unique_entries = 0
multiple_entries = 0
for media_type, media_config in MEDIA_TYPES.items():
......@@ -164,8 +164,8 @@ def _validate_data(counter, meta_info_file=None):
else:
multiple_entries += len(counter[media_type])
if meta_info_file and media_type != 'video':
raise Exception('File with meta information can only be uploaded with video file')
if manifest_file and media_type not in ('video', 'image'):
raise Exception('A manifest file can only be uploaded together with video or images')
if unique_entries == 1 and multiple_entries > 0 or unique_entries > 1:
unique_types = ', '.join([k for k, v in MEDIA_TYPES.items() if v['unique']])
......@@ -221,10 +221,10 @@ def _create_thread(tid, data):
if data['remote_files']:
data['remote_files'] = _download_data(data['remote_files'], upload_dir)
meta_info_file = []
media = _count_files(data, meta_info_file)
media, task_mode = _validate_data(media, meta_info_file)
if meta_info_file:
manifest_file = []
media = _count_files(data, manifest_file)
media, task_mode = _validate_data(media, manifest_file)
if manifest_file:
assert settings.USE_CACHE and db_data.storage_method == StorageMethodChoice.CACHE, \
"File with meta information can be uploaded if 'Use cache' option is also selected"
......@@ -248,8 +248,10 @@ def _create_thread(tid, data):
if extractor is not None:
raise Exception('Combined data types are not supported')
source_paths=[os.path.join(upload_dir, f) for f in media_files]
if media_type in ('archive', 'zip') and db_data.storage == StorageChoice.SHARE:
if media_type in {'archive', 'zip'} and db_data.storage == StorageChoice.SHARE:
source_paths.append(db_data.get_upload_dirname())
upload_dir = db_data.get_upload_dirname()
db_data.storage = StorageChoice.LOCAL
extractor = MEDIA_TYPES[media_type]['extractor'](
source_path=source_paths,
step=db_data.get_frame_step(),
......@@ -322,68 +324,108 @@ def _create_thread(tid, data):
video_path = ""
video_size = (0, 0)
def _update_status(msg):
job.meta['status'] = msg
job.save_meta()
if settings.USE_CACHE and db_data.storage_method == StorageMethodChoice.CACHE:
for media_type, media_files in media.items():
if not media_files:
continue
# relocate the manifest file (e.g. it was uploaded as 'subdir/manifest.jsonl')
if manifest_file and not os.path.exists(db_data.get_manifest_path()):
shutil.copyfile(os.path.join(upload_dir, manifest_file[0]),
db_data.get_manifest_path())
if upload_dir != settings.SHARE_ROOT:
os.remove(os.path.join(upload_dir, manifest_file[0]))
if task_mode == MEDIA_TYPES['video']['mode']:
try:
if meta_info_file:
manifest_is_prepared = False
if manifest_file:
try:
from cvat.apps.engine.prepare import UploadedMeta
meta_info = UploadedMeta(source_path=os.path.join(upload_dir, media_files[0]),
meta_path=db_data.get_meta_path(),
uploaded_meta=os.path.join(upload_dir, meta_info_file[0]))
meta_info.check_seek_key_frames()
meta_info.check_frames_numbers()
meta_info.save_meta_info()
assert len(meta_info.key_frames) > 0, 'No key frames.'
manifest = VideoManifestValidator(source_path=os.path.join(upload_dir, media_files[0]),
manifest_path=db_data.get_manifest_path())
manifest.init_index()
manifest.validate_seek_key_frames()
manifest.validate_frame_numbers()
assert len(manifest) > 0, 'No key frames.'
all_frames = manifest['properties']['length']
video_size = manifest['properties']['resolution']
manifest_is_prepared = True
except Exception as ex:
base_msg = str(ex) if isinstance(ex, AssertionError) else \
'Invalid meta information was uploaded.'
job.meta['status'] = '{} Start prepare valid meta information.'.format(base_msg)
job.save_meta()
meta_info, smooth_decoding = prepare_meta(
media_file=media_files[0],
upload_dir=upload_dir,
meta_dir=os.path.dirname(db_data.get_meta_path()),
chunk_size=db_data.chunk_size
)
assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'
else:
meta_info, smooth_decoding = prepare_meta(
if os.path.exists(db_data.get_index_path()):
os.remove(db_data.get_index_path())
if isinstance(ex, AssertionError):
base_msg = str(ex)
else:
base_msg = 'Invalid manifest file was uploaded.'
slogger.glob.warning(str(ex))
_update_status('{} Starting to prepare a valid manifest file.'.format(base_msg))
if not manifest_is_prepared:
_update_status('Starting to prepare a manifest file')
manifest = VideoManifestManager(db_data.get_manifest_path())
meta_info = manifest.prepare_meta(
media_file=media_files[0],
upload_dir=upload_dir,
meta_dir=os.path.dirname(db_data.get_meta_path()),
chunk_size=db_data.chunk_size
)
assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'
manifest.create(meta_info)
manifest.init_index()
_update_status('A manifest has been created')
all_frames = meta_info.get_task_size()
video_size = meta_info.frame_sizes
all_frames = meta_info.get_size()
video_size = meta_info.frame_sizes
manifest_is_prepared = True
db_data.size = len(range(db_data.start_frame, min(data['stop_frame'] + 1 if data['stop_frame'] else all_frames, all_frames), db_data.get_frame_step()))
db_data.size = len(range(db_data.start_frame, min(data['stop_frame'] + 1 \
if data['stop_frame'] else all_frames, all_frames), db_data.get_frame_step()))
video_path = os.path.join(upload_dir, media_files[0])
except Exception as ex:
db_data.storage_method = StorageMethodChoice.FILE_SYSTEM
if os.path.exists(db_data.get_meta_path()):
os.remove(db_data.get_meta_path())
base_msg = str(ex) if isinstance(ex, AssertionError) else "Uploaded video does not support a quick way of task creation."
job.meta['status'] = "{} The task will be created using the old method".format(base_msg)
job.save_meta()
else:#images,archive
if os.path.exists(db_data.get_manifest_path()):
os.remove(db_data.get_manifest_path())
if os.path.exists(db_data.get_index_path()):
os.remove(db_data.get_index_path())
base_msg = str(ex) if isinstance(ex, AssertionError) \
else "Uploaded video does not support a quick way of task creation."
_update_status("{} The task will be created using the old method".format(base_msg))
else: # images, archive, pdf
db_data.size = len(extractor)
manifest = ImageManifestManager(db_data.get_manifest_path())
if not manifest_file:
if db_task.dimension == DimensionType.DIM_2D:
meta_info = manifest.prepare_meta(
sources=extractor.absolute_source_paths,
data_dir=upload_dir
)
content = meta_info.content
else:
content = []
for source in extractor.absolute_source_paths:
name, ext = os.path.splitext(os.path.relpath(source, upload_dir))
content.append({
'name': name,
'extension': ext
})
manifest.create(content)
manifest.init_index()
counter = itertools.count()
for chunk_number, chunk_frames in itertools.groupby(extractor.frame_range, lambda x: next(counter) // db_data.chunk_size):
for _, chunk_frames in itertools.groupby(extractor.frame_range, lambda x: next(counter) // db_data.chunk_size):
chunk_paths = [(extractor.get_path(i), i) for i in chunk_frames]
img_sizes = []
with open(db_data.get_dummy_chunk_path(chunk_number), 'w') as dummy_chunk:
for path, frame_id in chunk_paths:
dummy_chunk.write(os.path.relpath(path, upload_dir) + '\n')
img_sizes.append(extractor.get_image_size(frame_id))
for _, frame_id in chunk_paths:
properties = manifest[frame_id]
if db_task.dimension == DimensionType.DIM_2D:
resolution = (properties['width'], properties['height'])
else:
resolution = extractor.get_image_size(frame_id)
img_sizes.append(resolution)
db_images.extend([
models.Image(data=db_data,
......@@ -453,6 +495,10 @@ def _create_thread(tid, data):
if db_data.stop_frame == 0:
db_data.stop_frame = db_data.start_frame + (db_data.size - 1) * db_data.get_frame_step()
else:
# validate stop_frame
db_data.stop_frame = min(db_data.stop_frame, \
db_data.start_frame + (db_data.size - 1) * db_data.get_frame_step())
preview = extractor.get_preview()
preview.save(db_data.get_preview_path())
......
......@@ -30,9 +30,9 @@ from rest_framework.test import APIClient, APITestCase
from cvat.apps.engine.models import (AttributeSpec, AttributeType, Data, Job, Project,
Segment, StatusChoice, Task, Label, StorageMethodChoice, StorageChoice)
from cvat.apps.engine.prepare import prepare_meta, prepare_meta_for_upload
from cvat.apps.engine.media_extractors import ValidateDimension
from cvat.apps.engine.models import DimensionType
from utils.dataset_manifest import ImageManifestManager, VideoManifestManager
def create_db_users(cls):
(group_admin, _) = Group.objects.get_or_create(name="admin")
......@@ -1971,6 +1971,26 @@ def generate_pdf_file(filename, page_count=1):
file_buf.seek(0)
return image_sizes, file_buf
def generate_manifest_file(data_type, manifest_path, sources):
kwargs = {
'images': {
'sources': sources,
'is_sorted': False,
},
'video': {
'media_file': sources[0],
'upload_dir': os.path.dirname(sources[0]),
'force': True
}
}
if data_type == 'video':
manifest = VideoManifestManager(manifest_path)
else:
manifest = ImageManifestManager(manifest_path)
prepared_meta = manifest.prepare_meta(**kwargs[data_type])
manifest.create(prepared_meta)
class TaskDataAPITestCase(APITestCase):
_image_sizes = {}
......@@ -2093,6 +2113,12 @@ class TaskDataAPITestCase(APITestCase):
shutil.rmtree(root_path)
cls._image_sizes[filename] = image_sizes
generate_manifest_file(data_type='video', manifest_path=os.path.join(settings.SHARE_ROOT, 'videos', 'manifest.jsonl'),
sources=[os.path.join(settings.SHARE_ROOT, 'videos', 'test_video_1.mp4')])
generate_manifest_file(data_type='images', manifest_path=os.path.join(settings.SHARE_ROOT, 'manifest.jsonl'),
sources=[os.path.join(settings.SHARE_ROOT, f'test_{i}.jpg') for i in range(1,4)])
@classmethod
def tearDownClass(cls):
super().tearDownClass()
......@@ -2114,7 +2140,10 @@ class TaskDataAPITestCase(APITestCase):
path = os.path.join(settings.SHARE_ROOT, "videos", "test_video_1.mp4")
os.remove(path)
path = os.path.join(settings.SHARE_ROOT, "videos", "meta_info.txt")
path = os.path.join(settings.SHARE_ROOT, "videos", "manifest.jsonl")
os.remove(path)
path = os.path.join(settings.SHARE_ROOT, "manifest.jsonl")
os.remove(path)
def _run_api_v1_tasks_id_data_post(self, tid, user, data):
......@@ -2257,7 +2286,7 @@ class TaskDataAPITestCase(APITestCase):
self.assertEqual(len(images), min(task["data_chunk_size"], len(image_sizes)))
if task["data_original_chunk_type"] == self.ChunkType.IMAGESET:
server_files = [img for key, img in data.items() if key.startswith("server_files")]
server_files = [img for key, img in data.items() if key.startswith("server_files") and not img.endswith("manifest.jsonl")]
client_files = [img for key, img in data.items() if key.startswith("client_files")]
if server_files:
......@@ -2446,7 +2475,7 @@ class TaskDataAPITestCase(APITestCase):
image_sizes = self._image_sizes[task_data["server_files[0]"]]
self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data, self.ChunkType.IMAGESET, self.ChunkType.IMAGESET, image_sizes,
expected_uploaded_data_location=StorageChoice.SHARE)
expected_uploaded_data_location=StorageChoice.LOCAL)
task_spec.update([('name', 'my archive task #12')])
task_data.update([('copy_data', True)])
......@@ -2546,7 +2575,7 @@ class TaskDataAPITestCase(APITestCase):
image_sizes = self._image_sizes[task_data["server_files[0]"]]
self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data, self.ChunkType.IMAGESET,
self.ChunkType.IMAGESET, image_sizes, StorageMethodChoice.CACHE, StorageChoice.SHARE)
self.ChunkType.IMAGESET, image_sizes, StorageMethodChoice.CACHE, StorageChoice.LOCAL)
task_spec.update([('name', 'my cached zip archive task #19')])
task_data.update([('copy_data', True)])
......@@ -2595,11 +2624,6 @@ class TaskDataAPITestCase(APITestCase):
self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data,
self.ChunkType.IMAGESET, self.ChunkType.IMAGESET, image_sizes)
prepare_meta_for_upload(
prepare_meta,
os.path.join(settings.SHARE_ROOT, "videos", "test_video_1.mp4"),
os.path.join(settings.SHARE_ROOT, "videos")
)
task_spec = {
"name": "my video with meta info task without copying #22",
"overlap": 0,
......@@ -2611,7 +2635,7 @@ class TaskDataAPITestCase(APITestCase):
}
task_data = {
"server_files[0]": os.path.join("videos", "test_video_1.mp4"),
"server_files[1]": os.path.join("videos", "meta_info.txt"),
"server_files[1]": os.path.join("videos", "manifest.jsonl"),
"image_quality": 70,
"use_cache": True
}
......@@ -2723,6 +2747,38 @@ class TaskDataAPITestCase(APITestCase):
self.ChunkType.IMAGESET,
image_sizes, dimension=DimensionType.DIM_3D)
task_spec = {
"name": "my images+manifest without copying #26",
"overlap": 0,
"segment_size": 0,
"labels": [
{"name": "car"},
{"name": "person"},
]
}
task_data = {
"server_files[0]": "test_1.jpg",
"server_files[1]": "test_2.jpg",
"server_files[2]": "test_3.jpg",
"server_files[3]": "manifest.jsonl",
"image_quality": 70,
"use_cache": True
}
image_sizes = [
self._image_sizes[task_data["server_files[0]"]],
self._image_sizes[task_data["server_files[1]"]],
self._image_sizes[task_data["server_files[2]"]],
]
self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data, self.ChunkType.IMAGESET, self.ChunkType.IMAGESET,
image_sizes, StorageMethodChoice.CACHE, StorageChoice.SHARE)
task_spec.update([('name', 'my images+manifest #27')])
task_data.update([('copy_data', True)])
self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data, self.ChunkType.IMAGESET, self.ChunkType.IMAGESET,
image_sizes, StorageMethodChoice.CACHE, StorageChoice.LOCAL)
def test_api_v1_tasks_id_data_admin(self):
self._test_api_v1_tasks_id_data(self.admin)
......
# Copyright (C) 2020 Intel Corporation
# Copyright (C) 2020-2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
import ast
import cv2 as cv
from collections import namedtuple
import hashlib
import importlib
import sys
import traceback
import subprocess
import os
from av import VideoFrame
from django.core.exceptions import ValidationError
......@@ -51,6 +53,7 @@ class InterpreterError(Exception):
def execute_python_code(source_code, global_vars=None, local_vars=None):
try:
# pylint: disable=exec-used
exec(source_code, global_vars, local_vars)
except SyntaxError as err:
error_class = err.__class__.__name__
......@@ -72,7 +75,7 @@ def av_scan_paths(*paths):
if 'yes' == os.environ.get('CLAM_AV'):
command = ['clamscan', '--no-summary', '-i', '-o']
command.extend(paths)
res = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
res = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE) # nosec
if res.returncode:
raise ValidationError(res.stdout)
......@@ -88,3 +91,8 @@ def rotate_image(image, angle):
matrix[1, 2] += bound_h/2 - image_center[1]
matrix = cv.warpAffine(image, matrix, (bound_w, bound_h))
return matrix
def md5_hash(frame):
if isinstance(frame, VideoFrame):
frame = frame.to_image()
return hashlib.md5(frame.tobytes()).hexdigest() # nosec
\ No newline at end of file
## Simple command line tool to prepare a dataset manifest file
### Steps before use
When used separately from the Computer Vision Annotation Tool (CVAT), the required dependencies must be installed.
#### Ubuntu 20.04
Install dependencies:
```bash
# General
sudo apt-get update && sudo apt-get --no-install-recommends install -y \
python3-dev python3-pip python3-venv pkg-config
```
```bash
# Library components
sudo apt-get install --no-install-recommends -y \
libavformat-dev libavcodec-dev libavdevice-dev \
libavutil-dev libswscale-dev libswresample-dev libavfilter-dev
```
Create an environment and install the necessary python modules:
```bash
python3 -m venv .env
. .env/bin/activate
pip install -U pip
pip install -r requirements.txt
```
### Usage
```bash
usage: python create.py [-h] [--force] [--output-dir .] source
positional arguments:
source Source paths
optional arguments:
-h, --help show this help message and exit
--force Use this flag to prepare the manifest file for video data if by default the video does not meet the requirements
and a manifest file is not prepared
--output-dir OUTPUT_DIR
Directory where the manifest file will be saved
```
### Alternative way to use with openvino/cvat_server
```bash
docker run -it --entrypoint python3 -v /path/to/host/data/:/path/inside/container/:rw openvino/cvat_server
utils/dataset_manifest/create.py --output-dir /path/to/manifest/directory/ /path/to/data/
```
### Examples
Create a dataset manifest in the current directory for a video that contains enough key frames:
```bash
python create.py ~/Documents/video.mp4
```
Create a dataset manifest for a video that does not contain enough key frames:
```bash
python create.py --force --output-dir ~/Documents ~/Documents/video.mp4
```
Create a dataset manifest with images:
```bash
python create.py --output-dir ~/Documents ~/Documents/images/
```
Create a dataset manifest from a pattern (`*`, `?`, `[]` may be used):
```bash
python create.py --output-dir ~/Documents "/home/${USER}/Documents/**/image*.jpeg"
```
Create a dataset manifest with `openvino/cvat_server`:
```bash
docker run -it --entrypoint python3 -v ~/Documents/data/:${HOME}/manifest/:rw openvino/cvat_server
utils/dataset_manifest/create.py --output-dir ~/manifest/ ~/manifest/images/
```
### Examples of generated `manifest.jsonl` files
A manifest file contains some intuitive information and some specific fields, such as:
`pts` - time at which the frame should be shown to the user
`checksum` - `md5` hash sum for the specific image/frame
#### For a video
```json
{"version":"1.0"}
{"type":"video"}
{"properties":{"name":"video.mp4","resolution":[1280,720],"length":778}}
{"number":0,"pts":0,"checksum":"17bb40d76887b56fe8213c6fded3d540"}
{"number":135,"pts":486000,"checksum":"9da9b4d42c1206d71bf17a7070a05847"}
{"number":270,"pts":972000,"checksum":"a1c3a61814f9b58b00a795fa18bb6d3e"}
{"number":405,"pts":1458000,"checksum":"18c0803b3cc1aa62ac75b112439d2b62"}
{"number":540,"pts":1944000,"checksum":"4551ecea0f80e95a6c32c32e70cac59e"}
{"number":675,"pts":2430000,"checksum":"0e72faf67e5218c70b506445ac91cdd7"}
```
#### For a dataset with images
```json
{"version":"1.0"}
{"type":"images"}
{"name":"image1","extension":".jpg","width":720,"height":405,"checksum":"548918ec4b56132a5cff1d4acabe9947"}
{"name":"image2","extension":".jpg","width":183,"height":275,"checksum":"4b4eefd03cc6a45c1c068b98477fb639"}
{"name":"image3","extension":".jpg","width":301,"height":167,"checksum":"0e454a6f4a13d56c82890c98be063663"}
```
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
from .core import VideoManifestManager, ImageManifestManager
\ No newline at end of file
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
import av
import json
import os
from abc import ABC, abstractmethod
from collections import OrderedDict
from contextlib import closing
from PIL import Image
from .utils import md5_hash, rotate_image
class VideoStreamReader:
def __init__(self, source_path):
self.source_path = source_path
self._key_frames = OrderedDict()
self.frames = 0
with closing(av.open(self.source_path, mode='r')) as container:
self.width, self.height = self._get_frame_size(container)
@staticmethod
def _get_video_stream(container):
video_stream = next(stream for stream in container.streams if stream.type == 'video')
video_stream.thread_type = 'AUTO'
return video_stream
@staticmethod
def _get_frame_size(container):
video_stream = VideoStreamReader._get_video_stream(container)
for packet in container.demux(video_stream):
for frame in packet.decode():
if video_stream.metadata.get('rotate'):
frame = av.VideoFrame().from_ndarray(
rotate_image(
frame.to_ndarray(format='bgr24'),
360 - int(container.streams.video[0].metadata.get('rotate')),
),
format ='bgr24',
)
return frame.width, frame.height
def check_type_first_frame(self):
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = self._get_video_stream(container)
for packet in container.demux(video_stream):
for frame in packet.decode():
if not frame.pict_type.name == 'I':
raise Exception('First frame is not key frame')
return
def check_video_timestamps_sequences(self):
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = self._get_video_stream(container)
frame_pts = -1
frame_dts = -1
for packet in container.demux(video_stream):
for frame in packet.decode():
if None not in {frame.pts, frame_pts} and frame.pts <= frame_pts:
raise Exception('Invalid pts sequences')
if None not in {frame.dts, frame_dts} and frame.dts <= frame_dts:
raise Exception('Invalid dts sequences')
frame_pts, frame_dts = frame.pts, frame.dts
def rough_estimate_frames_ratio(self, upper_bound):
analyzed_frames_number, key_frames_number = 0, 0
_processing_end = False
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = self._get_video_stream(container)
for packet in container.demux(video_stream):
for frame in packet.decode():
if frame.key_frame:
key_frames_number += 1
analyzed_frames_number += 1
if upper_bound == analyzed_frames_number:
_processing_end = True
break
if _processing_end:
break
# in our case there are no videos with a non-key first frame, so at least one key frame is guaranteed
return analyzed_frames_number // key_frames_number
def validate_frames_ratio(self, chunk_size):
upper_bound = 3 * chunk_size
ratio = self.rough_estimate_frames_ratio(upper_bound + 1)
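# e.g. with the default chunk_size of 36, upper_bound is 108: at most 109 frames
# are analyzed, and a single key frame among them fails the check (ratio 109 >= 108)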
assert ratio < upper_bound, 'Too few keyframes'
def get_size(self):
return self.frames
@property
def frame_sizes(self):
return (self.width, self.height)
def validate_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
if md5_hash(frame) != key_frame[1]['md5'] or frame.pts != key_frame[1]['pts']:
self._key_frames.pop(key_frame[0])
return
def validate_seek_key_frames(self):
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = self._get_video_stream(container)
key_frames_copy = self._key_frames.copy()
for key_frame in key_frames_copy.items():
container.seek(offset=key_frame[1]['pts'], stream=video_stream)
self.validate_key_frame(container, video_stream, key_frame)
def save_key_frames(self):
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = self._get_video_stream(container)
frame_number = 0
for packet in container.demux(video_stream):
for frame in packet.decode():
if frame.key_frame:
self._key_frames[frame_number] = {
'pts': frame.pts,
'md5': md5_hash(frame),
}
frame_number += 1
self.frames = frame_number
@property
def key_frames(self):
return self._key_frames
def __len__(self):
return len(self._key_frames)
#TODO: need to change it in future
def __iter__(self):
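# yields (frame number, pts, md5 checksum) records; VideoManifestManager.create
# writes one JSON line per record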
for idx, key_frame in self._key_frames.items():
yield (idx, key_frame['pts'], key_frame['md5'])
class DatasetImagesReader:
def __init__(self, sources, is_sorted=True, use_image_hash=False, *args, **kwargs):
self._sources = sources if is_sorted else sorted(sources)
self._content = []
self._data_dir = kwargs.get('data_dir', None)
self._use_image_hash = use_image_hash
def __iter__(self):
for image in self._sources:
img = Image.open(image, mode='r')
img_name = os.path.relpath(image, self._data_dir) if self._data_dir \
else os.path.basename(image)
name, extension = os.path.splitext(img_name)
image_properties = {
'name': name,
'extension': extension,
'width': img.width,
'height': img.height,
}
if self._use_image_hash:
image_properties['checksum'] = md5_hash(img)
yield image_properties
def create(self):
for item in self:
self._content.append(item)
@property
def content(self):
return self._content
class _Manifest:
FILE_NAME = 'manifest.jsonl'
VERSION = '1.0'
def __init__(self, path, is_created=False):
assert path, 'A path to the manifest file was not specified'
self._path = os.path.join(path, self.FILE_NAME) if os.path.isdir(path) else path
self._is_created = is_created
@property
def path(self):
return self._path
@property
def is_created(self):
return self._is_created
@is_created.setter
def is_created(self, value):
assert isinstance(value, bool)
self._is_created = value
# Needed for faster iteration over the manifest file; the index is generated when
# working inside CVAT and is not generated when a manifest is created manually
class _Index:
FILE_NAME = 'index.json'
def __init__(self, path):
assert path and os.path.isdir(path), 'No index directory path'
self._path = os.path.join(path, self.FILE_NAME)
self._index = {}
@property
def path(self):
return self._path
def dump(self):
with open(self._path, 'w') as index_file:
json.dump(self._index, index_file, separators=(',', ':'))
def load(self):
with open(self._path, 'r') as index_file:
self._index = json.load(index_file,
object_hook=lambda d: {int(k): v for k, v in d.items()})
def create(self, manifest, skip):
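# build a mapping from record number to the byte offset of its line in the
# manifest, so that any record can later be read with a single seek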
assert os.path.exists(manifest), 'The manifest file does not exist, the index cannot be created'
with open(manifest, 'r+') as manifest_file:
while skip:
manifest_file.readline()
skip -= 1
image_number = 0
position = manifest_file.tell()
line = manifest_file.readline()
while line:
if line.strip():
self._index[image_number] = position
image_number += 1
position = manifest_file.tell()
line = manifest_file.readline()
def partial_update(self, manifest, number):
assert os.path.exists(manifest), 'The manifest file does not exist, the index cannot be updated'
with open(manifest, 'r+') as manifest_file:
manifest_file.seek(self._index[number])
line = manifest_file.readline()
while line:
if line.strip():
self._index[number] = manifest_file.tell()
number += 1
line = manifest_file.readline()
def __getitem__(self, number):
assert 0 <= number < len(self), \
'An invalid index number: {}\nMax: {}'.format(number, len(self))
return self._index[number]
def __len__(self):
return len(self._index)
class _ManifestManager(ABC):
BASE_INFORMATION = {
'version' : 1,
'type': 2,
}
def __init__(self, path, *args, **kwargs):
self._manifest = _Manifest(path)
def _parse_line(self, line):
""" Getting a random line from the manifest file """
with open(self._manifest.path, 'r') as manifest_file:
if isinstance(line, str):
assert line in self.BASE_INFORMATION.keys(), \
'An attempt to get non-existent information from the manifest'
for _ in range(self.BASE_INFORMATION[line]):
fline = manifest_file.readline()
return json.loads(fline)[line]
else:
assert self._index, 'No prepared index'
offset = self._index[line]
manifest_file.seek(offset)
properties = manifest_file.readline()
return json.loads(properties)
def init_index(self):
self._index = _Index(os.path.dirname(self._manifest.path))
if os.path.exists(self._index.path):
self._index.load()
else:
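# skip the header lines when indexing: 'version' and 'type', plus 'properties' for a video manifest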
self._index.create(self._manifest.path, 3 if self._manifest.TYPE == 'video' else 2)
self._index.dump()
@abstractmethod
def create(self, content, **kwargs):
pass
@abstractmethod
def partial_update(self, number, properties):
pass
def __iter__(self):
with open(self._manifest.path, 'r') as manifest_file:
manifest_file.seek(self._index[0])
image_number = 0
line = manifest_file.readline()
while line:
    if line.strip():
        yield (image_number, json.loads(line))
        image_number += 1
    line = manifest_file.readline()
@property
def manifest(self):
return self._manifest
def __len__(self):
if hasattr(self, '_index'):
return len(self._index)
else:
return None
def __getitem__(self, item):
return self._parse_line(item)
@property
def index(self):
return self._index
class VideoManifestManager(_ManifestManager):
def __init__(self, manifest_path, *args, **kwargs):
super().__init__(manifest_path)
setattr(self._manifest, 'TYPE', 'video')
self.BASE_INFORMATION['properties'] = 3
def create(self, content, **kwargs):
""" Creating and saving a manifest file """
with open(self._manifest.path, 'w') as manifest_file:
base_info = {
'version': self._manifest.VERSION,
'type': self._manifest.TYPE,
'properties': {
'name': os.path.basename(content.source_path),
'resolution': content.frame_sizes,
'length': content.get_size(),
},
}
for key, value in base_info.items():
json_item = json.dumps({key: value}, separators=(',', ':'))
manifest_file.write(f'{json_item}\n')
for item in content:
json_item = json.dumps({
'number': item[0],
'pts': item[1],
'checksum': item[2]
}, separators=(',', ':'))
manifest_file.write(f"{json_item}\n")
self._manifest.is_created = True
def partial_update(self, number, properties):
pass
@staticmethod
def prepare_meta(media_file, upload_dir=None, chunk_size=36, force=False):
source_path = os.path.join(upload_dir, media_file) if upload_dir else media_file
meta_info = VideoStreamReader(source_path=source_path)
meta_info.check_type_first_frame()
try:
meta_info.validate_frames_ratio(chunk_size)
except AssertionError:
if not force:
raise
meta_info.check_video_timestamps_sequences()
meta_info.save_key_frames()
meta_info.validate_seek_key_frames()
return meta_info
#TODO: add generic manifest structure file validation
class ManifestValidator:
def validate_base_info(self):
with open(self._manifest.path, 'r') as manifest_file:
assert self._manifest.VERSION == json.loads(manifest_file.readline())['version']
assert self._manifest.TYPE == json.loads(manifest_file.readline())['type']
class VideoManifestValidator(VideoManifestManager):
def __init__(self, **kwargs):
self.source_path = kwargs.pop('source_path')
super().__init__(**kwargs)
def validate_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
assert frame.pts == key_frame['pts'], "The uploaded manifest does not match the video"
return
def validate_seek_key_frames(self):
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = self._get_video_stream(container)
last_key_frame = None
for _, key_frame in self:
# check that the key frame sequence is sorted
if last_key_frame and last_key_frame['number'] >= key_frame['number']:
raise AssertionError('Invalid saved key frames sequence in manifest file')
container.seek(offset=key_frame['pts'], stream=video_stream)
self.validate_key_frame(container, video_stream, key_frame)
last_key_frame = key_frame
def validate_frame_numbers(self):
with closing(av.open(self.source_path, mode='r')) as container:
video_stream = self._get_video_stream(container)
# not all videos contain information about the number of frames
frames = video_stream.frames
if frames:
assert frames == self['properties']['length'], "The uploaded manifest does not match the video"
return
class ImageManifestManager(_ManifestManager):
def __init__(self, manifest_path):
super().__init__(manifest_path)
setattr(self._manifest, 'TYPE', 'images')
def create(self, content, **kwargs):
""" Creating and saving a manifest file"""
with open(self._manifest.path, 'w') as manifest_file:
base_info = {
'version': self._manifest.VERSION,
'type': self._manifest.TYPE,
}
for key, value in base_info.items():
json_item = json.dumps({key: value}, separators=(',', ':'))
manifest_file.write(f'{json_item}\n')
for item in content:
json_item = json.dumps({
key: value for key, value in item.items()
}, separators=(',', ':'))
manifest_file.write(f"{json_item}\n")
self._manifest.is_created = True
def partial_update(self, number, properties):
pass
@staticmethod
def prepare_meta(sources, **kwargs):
meta_info = DatasetImagesReader(sources=sources, **kwargs)
meta_info.create()
return meta_info
\ No newline at end of file
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
import argparse
import mimetypes
import os
import sys
from glob import glob
def _define_data_type(media):
media_type, _ = mimetypes.guess_type(media)
if media_type:
return media_type.split('/')[0]
def _is_video(media_file):
return _define_data_type(media_file) == 'video'
def _is_image(media_file):
return _define_data_type(media_file) == 'image'
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('--force', action='store_true',
help='Use this flag to prepare the manifest file for video data '
'if by default the video does not meet the requirements and a manifest file is not prepared')
parser.add_argument('--output-dir', type=str, help='Directory where the manifest file will be saved',
default=os.getcwd())
parser.add_argument('source', type=str, help='Source paths')
return parser.parse_args()
def main():
args = get_args()
manifest_directory = os.path.abspath(args.output_dir)
os.makedirs(manifest_directory, exist_ok=True)
source = os.path.abspath(args.source)
sources = []
if not os.path.isfile(source): # directory/pattern with images
data_dir = None
if os.path.isdir(source):
data_dir = source
for root, _, files in os.walk(source):
sources.extend([os.path.join(root, f) for f in files if _is_image(f)])
else:
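# the source is a glob pattern: everything before the first component containing a wildcard is treated as the common data root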
items = source.lstrip('/').split('/')
position = 0
try:
for item in items:
if set(item) & {'*', '?', '[', ']'}:
break
position += 1
else:
raise Exception('Wrong positional argument')
assert position != 0, 'Wrong pattern: there must be a common root'
data_dir = source.split(items[position])[0]
except Exception as ex:
sys.exit(str(ex))
sources = list(filter(_is_image, glob(source, recursive=True)))
try:
assert len(sources), 'No images were found'
manifest = ImageManifestManager(manifest_path=manifest_directory)
meta_info = manifest.prepare_meta(sources=sources, is_sorted=False,
use_image_hash=True, data_dir=data_dir)
manifest.create(meta_info)
except Exception as ex:
sys.exit(str(ex))
else: # video
try:
assert _is_video(source), 'You can specify a video path or a directory/pattern with images'
manifest = VideoManifestManager(manifest_path=manifest_directory)
try:
meta_info = manifest.prepare_meta(media_file=source, force=args.force)
except AssertionError as ex:
if str(ex) == 'Too few keyframes':
msg = 'NOTE: prepared manifest file contains too few key frames for smooth decoding.\n' \
'Use --force flag if you still want to prepare a manifest file.'
print(msg)
sys.exit(2)
else:
raise
manifest.create(meta_info)
except Exception as ex:
sys.exit(str(ex))
print('The manifest file has been prepared')
if __name__ == "__main__":
base_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(base_dir)
from dataset_manifest.core import VideoManifestManager, ImageManifestManager
main()
\ No newline at end of file
av==8.0.2 --no-binary=av
opencv-python-headless==4.4.0.42
Pillow==7.2.0
\ No newline at end of file
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
import hashlib
import cv2 as cv
from av import VideoFrame
def rotate_image(image, angle):
height, width = image.shape[:2]
image_center = (width/2, height/2)
matrix = cv.getRotationMatrix2D(image_center, angle, 1.)
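# enlarge the output canvas to the bounding box of the rotated image so that nothing is cropped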
abs_cos = abs(matrix[0,0])
abs_sin = abs(matrix[0,1])
bound_w = int(height * abs_sin + width * abs_cos)
bound_h = int(height * abs_cos + width * abs_sin)
matrix[0, 2] += bound_w/2 - image_center[0]
matrix[1, 2] += bound_h/2 - image_center[1]
matrix = cv.warpAffine(image, matrix, (bound_w, bound_h))
return matrix
def md5_hash(frame):
if isinstance(frame, VideoFrame):
frame = frame.to_image()
return hashlib.md5(frame.tobytes()).hexdigest() # nosec
\ No newline at end of file
# Simple command line tool to prepare meta information for video data
**Usage**
```bash
usage: prepare.py [-h] [-chunk_size CHUNK_SIZE] video_file meta_directory
positional arguments:
video_file Path to video file
meta_directory Directory where the file with meta information will be saved
optional arguments:
-h, --help show this help message and exit
-chunk_size CHUNK_SIZE
Chunk size that will be specified when creating the task with specified video and generated meta information
```
**NOTE**: For smooth video decoding, the `chunk size` must be greater than or equal to the ratio of the number of frames
to the number of key frames.
You can estimate an appropriate `chunk size` by preparing the file with meta information and looking at it.
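For example, a video with 900 frames and 30 key frames has a ratio of 30, so smooth decoding needs a `chunk size` of at least 30 (the numbers are illustrative).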
**NOTE**: If the ratio of the number of frames to the number of key frames is small compared to the `chunk size`,
then when creating a task with prepared meta information you should expect the waiting time for some chunks
to be longer than for others (at the first iteration, when the chunk is not yet in the cache).
**Examples**
```bash
python prepare.py ~/Documents/some_video.mp4 ~/Documents
```
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT
import argparse
import sys
import os
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('video_file',
type=str,
help='Path to video file')
parser.add_argument('meta_directory',
type=str,
help='Directory where the file with meta information will be saved')
parser.add_argument('-chunk_size',
type=int,
help='Chunk size that will be specified when creating the task with specified video and generated meta information')
return parser.parse_args()
def main():
args = get_args()
try:
smooth_decoding = prepare_meta_for_upload(prepare_meta, args.video_file, None, args.meta_directory, args.chunk_size)
print('Meta information for video has been prepared')
if smooth_decoding is not None and not smooth_decoding:
print('NOTE: prepared meta information contains too few key frames for smooth decoding.')
except Exception:
print('Impossible to prepare meta information')
if __name__ == "__main__":
base_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
sys.path.append(base_dir)
from cvat.apps.engine.prepare import prepare_meta, prepare_meta_for_upload
main()
\ No newline at end of file