## Introduction

The data pipeline is responsible for loading and converting data. Each
resulting data sample is a tuple of np.ndarrays.
For example, Faster R-CNN training uses samples of this format: `[(im,
im_info, im_id, gt_bbox, gt_class, is_crowd), (...)]`.
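For illustration, one such sample might be assembled as follows (a minimal sketch; the shapes, dtypes, and the `im_info` layout are assumptions for a hypothetical image with two ground-truth boxes):

```python
import numpy as np

# A minimal, illustrative Faster R-CNN training sample; all shapes,
# dtypes, and the im_info layout are assumptions, not the exact spec.
im = np.zeros((3, 800, 600), dtype=np.float32)             # CHW image data
im_info = np.array([800.0, 600.0, 1.0], dtype=np.float32)  # assumed [h, w, scale]
im_id = np.array([42], dtype=np.int64)                     # image ID
gt_bbox = np.array([[10.0, 20.0, 100.0, 200.0],            # one row per box
                    [30.0, 40.0, 150.0, 250.0]], dtype=np.float32)
gt_class = np.array([[1], [3]], dtype=np.int32)            # class ID per box
is_crowd = np.array([[0], [0]], dtype=np.int32)            # crowd markers

sample = (im, im_info, im_id, gt_bbox, gt_class, is_crowd)
```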

### Implementation

The data pipeline consists of four sub-systems: data parsing, image
pre-processing, data conversion, and data feeding APIs.

Data samples are collected to form `data.Dataset`s, usually 3 sets are
needed for training, validation, and testing respectively.

First, `data.source` loads the data files into memory, then
`data.transform` processes them, and lastly, the batched samples
are fetched by `data.Reader`.
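The following toy sketch illustrates the shape of that flow (it is not the ppdet API, just the source → transform → reader pattern described above):

```python
# Toy illustration of the source -> transform -> reader flow; this is
# not the ppdet API, only the pattern described above.
def toy_source():
    for i in range(5):                 # data.source: yield raw samples
        yield {"im_id": i}

def toy_transform(samples):
    for s in samples:                  # data.transform: per-sample processing
        s["processed"] = True
        yield s

def toy_reader(samples, batch_size=2):
    batch = []                         # data.Reader: batch the samples
    for s in samples:
        batch.append(s)
        if len(batch) == batch_size:   # (incomplete final batch is dropped here)
            yield batch
            batch = []

for batch in toy_reader(toy_transform(toy_source())):
    print(batch)
```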

Sub-system details:
1. Data parsing
Parses various data sources and creates `data.Dataset` instances. Currently, the
following data sources are supported:

- COCO data source
Loads `COCO`-type datasets with a directory structure like this:

  ```
  dataset/coco/
  ├── annotations
  │   ├── instances_train2017.json
  │   ├── instances_val2017.json
  |   ...
  ├── train2017
  │   ├── 000000000009.jpg
  │   ├── 000000580008.jpg
  |   ...
  ├── val2017
  │   ├── 000000000139.jpg
  │   ├── 000000000285.jpg
  |   ...
  ```

- Pascal VOC data source
Loads `Pascal VOC`-like datasets with a directory structure like this:

  ```
  data/pascalvoc/
  ├── Annotations
  │   ├── 000050.xml
  │   ├── 003876.xml
  |   ...
  ├── ImageSets
  │   ├── Main
  │   │   ├── train.txt
  │   │   ├── val.txt
  │   │   ├── test.txt
  │   │   ├── dog_train.txt
  │   │   ├── dog_trainval.txt
  │   │   ├── dog_val.txt
  │   │   ├── dog_test.txt
  │   │   └── ...
  │   ├── Layout
  │   │   └── ...
  │   └── Segmentation
  │       └── ...
  ├── JPEGImages
  │   ├── 000050.jpg
  │   ├── 003876.jpg
  |   ...
  ```

- Roidb data source
A generalized data source serialized as pickle files, which have the following
structure:
```python
(records, cname2id)
# `cname2id` is a `dict` mapping category names to class IDs,
# and `records` is a list of dicts with the following structure:
{
    'im_file': im_fname,    # image file name
    'im_id': im_id,         # image ID
    'h': im_h,              # height of image
    'w': im_w,              # width of image
    'is_crowd': is_crowd,   # crowd marker
    'gt_class': gt_class,   # ground truth class
    'gt_bbox': gt_bbox,     # ground truth bounding box
    'gt_poly': gt_poly,     # ground truth segmentation
}
```
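Since a roidb source is just a pickled `(records, cname2id)` tuple as described above, it can be inspected directly (a minimal sketch; the file name is a placeholder):

```python
import pickle

# Load a roidb data source and inspect it; 'roidb_sample.pkl' is a
# placeholder name, and the (records, cname2id) layout is the one
# documented above.
with open('roidb_sample.pkl', 'rb') as f:
    records, cname2id = pickle.load(f)

print('num samples:', len(records))
print('categories:', cname2id)
print('first record keys:', sorted(records[0].keys()))
```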

We provide a tool to generate roidb data sources. To convert a `COCO`-style or
`VOC`-style dataset, run this command:
```sh
# --type: the type of the original annotations (xml or json)
# --annotation: the path of the file that contains the names of the annotation files
# --save-dir: the save path
# --samples: the number of samples to convert (default is -1, which means all samples in the dataset)
python ./tools/generate_data_for_training.py \
            --type=json \
            --annotation=./annotations/instances_val2017.json \
            --save-dir=./roidb \
            --samples=-1
```

2. Image preprocessing
The `data.transform.operator` module provides operations such as image
decoding, expanding, cropping, etc. Multiple operators are combined to form
larger processing pipelines.
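As an illustration of that composition pattern, here is a toy sketch (these are not the actual ppdet classes; the real operators live in `transform/operators.py`):

```python
import numpy as np

# Toy sketch of operator composition: each operator is a callable that
# takes and returns a sample dict, so operators chain naturally.
class ToGray:
    def __call__(self, sample):
        sample['image'] = sample['image'].mean(axis=2, keepdims=True)
        return sample

class Compose:
    def __init__(self, ops):
        self.ops = ops
    def __call__(self, sample):
        for op in self.ops:
            sample = op(sample)
        return sample

pipeline = Compose([ToGray()])
out = pipeline({'image': np.ones((4, 4, 3), dtype=np.float32)})
print(out['image'].shape)  # (4, 4, 1)
```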

3. Data transformer
Transforms a `data.Dataset` to achieve various desired effects. Notably, the
`data.transform.parallel_map` transformer accelerates image processing with
multi-threading or multi-processing. More transformers can be found in
`data.transform.transformer`.

4. Data feeding APIs
To facilitate building the data pipeline, we combine multiple `data.Dataset`s
to form a `data.Reader`, which can provide data for training, validation and
testing respectively. Users can simply call `Reader.[train|eval|infer]` to get
the corresponding data stream. Many aspects of the `Reader`, such as storage
location, preprocessing pipeline, and acceleration mode, can be configured with
YAML files.
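For instance, once a `Reader` has been configured, consuming its training stream might look like the following fragment (a sketch only; how the `reader` object itself is built is determined by the YAML configuration):

```python
# Sketch only: `reader` is assumed to be an already-configured data.Reader;
# the train()/eval()/infer() accessors are the ones described above.
for i, batch in enumerate(reader.train()):
    # each batch holds samples like (im, im_info, im_id, gt_bbox, gt_class, is_crowd)
    if i >= 10:
        break
```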

The main APIs are as follows:

1. Data parsing

 - `source/coco_loader.py`: COCO dataset parser. [source](../ppdet/data/source/coco_loader.py)
 - `source/voc_loader.py`: Pascal VOC dataset parser. [source](../ppdet/data/source/voc_loader.py)
 [Note] To use a non-default label list for VOC datasets, a `label_list.txt`
 file is needed. One can use the provided label list
 (`data/pascalvoc/ImageSets/Main/label_list.txt`) or generate a custom one (with
 `tools/generate_data_for_training.py`). Also, the `use_default_label` option
 should be set to `false` in the configuration file.
 - `source/loader.py`: Roidb dataset parser. [source](../ppdet/data/source/loader.py)

2. Operator
 `transform/operators.py`: Contains a variety of data augmentation methods, including:
- `DecodeImage`: Read images in RGB format.
- `RandomFlipImage`: Horizontal flip.
- `RandomDistort`: Distort brightness, contrast, saturation, and hue.
- `ResizeImage`: Resize image with interpolation.
- `RandomInterpImage`: Use a random interpolation method to resize the image.
- `CropImage`: Crop image with respect to different scale, aspect ratio, and overlap.
- `ExpandImage`: Pad image to a larger size, padding filled with mean image value.
- `NormalizeImage`: Normalize image pixel values.
- `NormalizeBox`: Normalize the bounding box.
- `Permute`: Arrange the channels of the image and optionally convert image to BGR format.
- `MixupImage`: Mixup two images with a given fraction<sup>[1](#mix)</sup> (see the sketch below).

<a name="mix">[1]</a> Please refer to [this paper](https://arxiv.org/pdf/1710.09412.pdf)
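The mixup operation itself is just a convex combination of two images. Here is a minimal numpy sketch of the idea (the blend factor is commonly drawn from a Beta distribution, per the paper; the Beta parameters below are illustrative):

```python
import numpy as np

# Minimal sketch of image mixup (Zhang et al., 2017): blend two images
# with a factor lam sampled from a Beta distribution.
rng = np.random.default_rng(0)
lam = rng.beta(1.5, 1.5)  # illustrative Beta parameters

im1 = rng.random((32, 32, 3)).astype(np.float32)
im2 = rng.random((32, 32, 3)).astype(np.float32)

mixed = lam * im1 + (1.0 - lam) * im2
print(mixed.shape, float(lam))
```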

`transform/arrange_sample.py`: Assembles the data samples needed by different models.
3. Transformer
`transform/post_map.py`: Transformations that operate on whole batches, mainly for:
- Padding whole batches to given stride values
- Resizing images to multiple scales
- Randomly adjusting the image sizes within a batch
`transform/transformer.py`: Data filtering and batching.
`transform/parallel_map.py`: Accelerates data processing with multi-threading/multi-processing.
4. Reader
`reader.py`: Combines sources and transforms, and returns batched data according to `max_iter`.
`data_feed.py`: Configures default parameters for `reader.py`.


### Usage

#### Canned Datasets

Presets for common datasets, e.g., `COCO` and `Pascal VOC`, are included. In
most cases, users can simply use these canned datasets as is. Moreover, the
whole data pipeline is fully customizable through the YAML configuration files.

#### Custom Datasets

- Option 1: Convert the dataset to COCO or VOC format.
```sh
 # a small utility (`tools/labelme2coco.py`) is provided to convert
 # Labelme-annotated datasets to COCO format.
 python ./tools/labelme2coco.py --json_input_dir ./labelme_annos/ \
                                --image_input_dir ./labelme_imgs/ \
                                --output_dir ./cocome/ \
                                --train_proportion 0.8 \
                                --val_proportion 0.2 \
                                --test_proportion 0.0
 # --json_input_dir: the path of the JSON files annotated by Labelme.
 # --image_input_dir: the path of the images.
 # --output_dir: the path of the converted COCO dataset.
 # --train_proportion: the training proportion of the annotated data.
 # --val_proportion: the validation proportion of the annotated data.
 # --test_proportion: the inference proportion of the annotated data.
```

- Option 2:

1. Add `source/XX_loader.py` and implement the `load` function, following the
   example of `source/coco_loader.py` and `source/voc_loader.py`.
2. Modify the `load` function in `source/loader.py` to make use of the newly
   added data loader.
3. Modify `source/__init__.py` accordingly:
```python
if data_cf['type'] in ['VOCSource', 'COCOSource', 'RoiDbSource']:
    source_type = 'RoiDbSource'
# Replace the above code with the following code:
if data_cf['type'] in ['VOCSource', 'COCOSource', 'RoiDbSource', 'XXSource']:
    source_type = 'RoiDbSource'
```
4. In the configuration file, set the `type` of the `dataset` to `XXSource`.

#### How to add data pre-processing?

- To add a pre-processing operation for a single image, refer to the classes in
  `transform/operators.py` and implement the desired transformation with a new
  class (a sketch follows below).
- To add pre-processing for a whole batch, modify the `build_post_map` function
  in `transform/post_map.py`.
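As a rough illustration of the first option, a new single-image operator might look like this (a hypothetical sketch; the `sample['image']` key, the base class, and the registration mechanism are assumptions to be checked against the existing classes in `transform/operators.py`):

```python
import numpy as np

# Hypothetical sketch of a new single-image operator; the real base
# class, registration mechanism, and sample layout should be copied
# from the existing classes in transform/operators.py.
class GrayscaleImage(object):
    """Convert the (assumed HWC, RGB) image in `sample` to 3-channel gray."""

    def __call__(self, sample, context=None):
        im = sample['image'].astype(np.float32)
        gray = im @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
        sample['image'] = np.repeat(gray[..., None], 3, axis=2)
        return sample

# toy usage with a dummy sample dict
sample = {'image': np.ones((8, 8, 3), dtype=np.float32)}
print(GrayscaleImage()(sample)['image'].shape)  # (8, 8, 3)
```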