English | [简体中文](DATA_cn.md)

# Data Pipeline

## Introduction

The data pipeline is responsible for loading and converting data. Each
resulting data sample is a tuple of np.ndarrays.
For example, Faster R-CNN training uses samples of this format: `[(im,
im_info, im_id, gt_bbox, gt_class, is_crowd), (...)]`.
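
For intuition, a hedged sketch of what one such sample might look like (the
array shapes, dtypes and values here are illustrative assumptions, not the
exact output of the pipeline):

```python
import numpy as np

# One hypothetical Faster R-CNN training sample; a batch is a list of these.
im = np.zeros((3, 800, 1333), dtype=np.float32)           # CHW image data
im_info = np.array([800., 1333., 1.], dtype=np.float32)   # height, width, scale
im_id = np.array([42], dtype=np.int64)                    # image ID
gt_bbox = np.array([[10., 20., 110., 220.]], dtype=np.float32)  # xyxy boxes
gt_class = np.array([[1]], dtype=np.int32)                # class ID per box
is_crowd = np.array([[0]], dtype=np.int32)                # crowd marker per box

sample = (im, im_info, im_id, gt_bbox, gt_class, is_crowd)
batch = [sample]
```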

### Implementation

The data pipeline consists of four sub-systems: data parsing, image
pre-processing, data conversion and data feeding APIs.

Data samples are collected to form `data.Dataset`s; usually three sets are
needed for training, validation, and testing respectively.

First, `data.source` loads the data files into memory, then
`data.transform` processes them, and lastly, the batched samples
are fetched by `data.Reader`.

Sub-system details:
1. Data parsing
Parses various data sources and creates `data.Dataset` instances. Currently,
the following data sources are supported:

- COCO data source
Loads `COCO` type datasets with a directory structure like this:

  ```
  dataset/coco/
  ├── annotations
  │   ├── instances_train2014.json
  │   ├── instances_train2017.json
  │   ├── instances_val2014.json
  │   ├── instances_val2017.json
  |   ...
  ├── train2017
  │   ├── 000000000009.jpg
  │   ├── 000000580008.jpg
  |   ...
  ├── val2017
  │   ├── 000000000139.jpg
  │   ├── 000000000285.jpg
  |   ...
  ```

- Pascal VOC data source
Loads `Pascal VOC` like datasets with a directory structure like this:

  ```
  data/pascalvoc/
  ├── Annotations
  │   ├── 000050.xml
  │   ├── 003876.xml
  |   ...
  ├── ImageSets
  │   ├── Main
  │   │   ├── train.txt
  │   │   ├── val.txt
  │   │   ├── test.txt
  │   │   ├── dog_train.txt
  │   │   ├── dog_trainval.txt
  │   │   ├── dog_val.txt
  │   │   ├── dog_test.txt
  │   │   ├── ...
  │   ├── Layout
  │   │   ├── ...
  │   ├── Segmentation
  │   │   ├── ...
  ├── JPEGImages
  │   ├── 000050.jpg
  │   ├── 003876.jpg
  |   ...
  ```

- Roidb data source
A generalized data source serialized as pickle files, which have the following
structure:
```python
(records, cname2id)
# `cname2id` is a `dict` which maps category names to class IDs,
# and `records` is a list of dicts with this structure:
{
    'im_file': im_fname,    # image file name
    'im_id': im_id,         # image ID
    'h': im_h,              # height of image
    'w': im_w,              # width of image
    'is_crowd': is_crowd,   # crowd marker
    'gt_class': gt_class,   # ground truth class
    'gt_bbox': gt_bbox,     # ground truth bounding box
    'gt_poly': gt_poly,     # ground truth segmentation
}
```
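
Reading such a file back is plain `pickle` deserialization; a minimal sketch
(the file path is hypothetical):

```python
import pickle

# Load a roidb file; the tuple layout follows the structure shown above.
with open('./roidb/instances_val2017.roidb', 'rb') as f:  # hypothetical path
    records, cname2id = pickle.load(f)

print('num samples:', len(records))
print('categories:', sorted(cname2id, key=cname2id.get))
print(records[0]['im_file'], records[0]['h'], records[0]['w'])
```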

We provide a tool to generate roidb data sources. To convert `COCO` or `VOC`
like datasets, run this command:
```sh
# --type: the type of the original data (xml or json)
# --annotation: the path of the file which contains the names of the annotation files
# --save-dir: the save path
# --samples: the number of samples (default is -1, which means all samples in the dataset)
python ./ppdet/data/tools/generate_data_for_training.py \
            --type=json \
            --annotation=./annotations/instances_val2017.json \
            --save-dir=./roidb \
            --samples=-1
```

2. Image preprocessing
The `data.transform.operator` module provides operations such as image
decoding, expanding, cropping, etc. Multiple operators are combined to form
larger processing pipelines.

3. Data transformer
Transforms a `data.Dataset` to achieve various desired effects. Notably, the
`data.transform.parallel_map` transformer accelerates image processing with
multiple threads or processes. More transformers can be found in
`data.transform.transformer`.

4. Data feeding APIs
To facilitate data pipeline building, we combine multiple `data.Dataset`s to
form a `data.Reader` which can provide data for training, validation and
testing respectively. Users can simply call `Reader.[train|eval|infer]` to get
the corresponding data stream. Many aspects of the `Reader`, such as the
storage location, preprocessing pipeline, and acceleration mode, can be
configured with YAML files.
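
A minimal usage sketch; the `reader` argument is assumed to be a fully
configured `data.Reader` (its construction from YAML files is omitted here):

```python
def peek_training_batches(reader, num_batches=10):
    """Iterate the training stream of a configured data.Reader."""
    for i, batch in enumerate(reader.train()):  # likewise .eval() / .infer()
        # each batch holds samples shaped as described in the Introduction,
        # e.g. (im, im_info, im_id, gt_bbox, gt_class, is_crowd) tuples
        print('batch', i, 'with', len(batch), 'samples')
        if i + 1 >= num_batches:
            break
```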

### APIs

The main APIs are as follows:

1. Data parsing

 - `source/coco_loader.py`: COCO dataset parser. [source](../ppdet/data/source/coco_loader.py)
 - `source/voc_loader.py`: Pascal VOC dataset parser. [source](../ppdet/data/source/voc_loader.py)
 [Note] To use a non-default label list for VOC datasets, a `label_list.txt`
 file is needed; one can use the provided label list
 (`data/pascalvoc/ImageSets/Main/label_list.txt`) or generate a custom one (with
 `tools/generate_data_for_training.py`). Also, the `use_default_label` option
 should be set to `false` in the configuration file.
 - `source/loader.py`: Roidb dataset parser. [source](../ppdet/data/source/loader.py)

2. Operator
 `transform/operators.py`: Contains a variety of data augmentation methods (a composition sketch follows this list), including:
- `DecodeImage`: Read images in RGB format.
- `RandomFlipImage`: Horizontal flip.
- `RandomDistort`: Distort brightness, contrast, saturation, and hue.
- `ResizeImage`: Resize image with interpolation.
- `RandomInterpImage`: Use a random interpolation method to resize the image.
- `CropImage`: Crop image with respect to different scale, aspect ratio, and overlap.
- `ExpandImage`: Pad image to a larger size, padding filled with mean image value.
- `NormalizeImage`: Normalize image pixel values.
- `NormalizeBox`: Normalize the bounding box.
- `Permute`: Arrange the channels of the image and optionally convert image to BGR format.
- `MixupImage`: Mixup two images with given fraction<sup>[1](#mix)</sup>.

<a name="mix">[1]</a> Please refer to [this paper](https://arxiv.org/pdf/1710.09412.pdf)
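
For intuition, a hedged sketch of chaining several of the operators above on a
single sample; the import path and constructor arguments are assumptions based
on this document, not the exact signatures:

```python
# Assumed import path, following the file layout referenced in this document.
from ppdet.data.transform.operators import (
    DecodeImage, RandomFlipImage, NormalizeImage, Permute)

# Hypothetical composition; constructor arguments are illustrative only.
ops = [
    DecodeImage(to_rgb=True),   # file bytes -> RGB ndarray
    RandomFlipImage(prob=0.5),  # horizontal flip
    NormalizeImage(),           # normalize pixel values
    Permute(),                  # rearrange channels (HWC -> CHW)
]

def apply_ops(sample):
    # each operator is assumed to map a sample dict to a transformed sample
    for op in ops:
        sample = op(sample)
    return sample
```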

`transform/arrange_sample.py`: Assembles the data samples needed by different models.
3. Transformer
`transform/post_map.py`: Transformations that operate on whole batches, mainly for:
- Padding the whole batch to given stride values
- Resizing images to multiple scales
- Randomly adjusting the image size of the batch data
`transform/transformer.py`: Data filtering and batching.
`transform/parallel_map.py`: Accelerates data processing with multiple threads or processes.
4. Reader
`reader.py`: Combines sources and transforms, and returns batch data according to `max_iter`.
`data_feed.py`: Configures default parameters for `reader.py`.


### Usage

#### Canned Datasets

Presets for common datasets, e.g., `COCO` and `Pascal VOC`, are included. In
most cases, users can simply use these canned datasets as is. Moreover, the
whole data pipeline is fully customizable through the YAML configuration files.

#### Custom Datasets

- Option 1: Convert the dataset to COCO or VOC format.
```sh
 # a small utility (`tools/labelme2coco.py`) is provided to convert
 # Labelme-annotated datasets to COCO format.
 python ./ppdet/data/tools/labelme2coco.py --json_input_dir ./labelme_annos/ \
                                --image_input_dir ./labelme_imgs/ \
                                --output_dir ./cocome/ \
                                --train_proportion 0.8 \
                                --val_proportion 0.2 \
                                --test_proportion 0.0
 # --json_input_dir: The path of the JSON files annotated by Labelme.
 # --image_input_dir: The path of the images.
 # --output_dir: The path of the converted COCO dataset.
 # --train_proportion: The train proportion of the annotation data.
 # --val_proportion: The validation proportion of the annotation data.
 # --test_proportion: The inference proportion of the annotation data.
```

- Option 2:

1. Add `source/XX_loader.py` and implement the `load` function, following the
   example of `source/coco_loader.py` and `source/voc_loader.py` (a skeleton
   sketch follows these steps).
2. Modify the `load` function in `source/loader.py` to make use of the newly
   added data loader.
3. Modify `/source/__init__.py` accordingly.
```python
if data_cf['type'] in ['VOCSource', 'COCOSource', 'RoiDbSource']:
    source_type = 'RoiDbSource'
# Replace the above code with the following code:
if data_cf['type'] in ['VOCSource', 'COCOSource', 'RoiDbSource', 'XXSource']:
    source_type = 'RoiDbSource'
```
4. In the configuration file, define the `type` of `dataset` as `XXSource`.
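
A skeletal `load` function for step 1, purely illustrative; the returned record
fields mirror the roidb structure shown earlier:

```python
def load(anno_path, sample_num=-1):
    """Hypothetical XX_loader.load: return (records, cname2id)."""
    records = []
    cname2id = {'person': 1}  # maps category names to class IDs
    # ... parse the annotation files under `anno_path` here, appending one
    # dict per image with keys such as 'im_file', 'im_id', 'h', 'w',
    # 'gt_class', 'gt_bbox' and 'is_crowd' (see the roidb structure above) ...
    if sample_num > 0:
        records = records[:sample_num]
    return records, cname2id
```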

#### How to add data pre-processing?

- To add a pre-processing operation for a single image, refer to the classes in
  `transform/operators.py`, and implement the desired transformation with a new
  class (see the sketch after this list).
- To add pre-processing for a batch, one needs to modify the `build_post_map`
  function in `transform/post_map.py`.
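
As a hedged illustration of the single-image case, a new operator might look
like this; the dict-style sample with an `'image'` key is an assumption modeled
on the record structure shown earlier, not the exact interface used in
`transform/operators.py`:

```python
import numpy as np

class RandomGrayscale(object):
    """Hypothetical single-image operator: grayscale with probability `prob`."""

    def __init__(self, prob=0.1):
        self.prob = prob

    def __call__(self, sample):
        # `sample` is assumed to be a dict holding an HWC image ndarray
        if np.random.uniform() < self.prob:
            im = sample['image']
            gray = im.mean(axis=2, keepdims=True).astype(im.dtype)
            sample['image'] = np.repeat(gray, 3, axis=2)
        return sample
```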