# Data

---

## Introduction
This document describes how to prepare the ImageNet1k and flowers102 datasets.

## Dataset

| Dataset | Training set size | Validation set size | Categories |
|:------:|:---------------:|:---------------------:|:--------:|
| [flowers102](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/) | 1k | 6k | 102 |
| [ImageNet1k](http://www.image-net.org/challenges/LSVRC/2012/) | 1.2M | 50k | 1000 |

* Data format

Organize the data as described below, and provide `train_list.txt` and `val_list.txt`. Each line of a list file contains an image path and its integer class label, separated by a single space.

```shell
# delimiter: "space"
# the following is the content of train_list.txt
train/n01440764/n01440764_10026.JPEG 0
...

# the following is the content of val_list.txt
val/ILSVRC2012_val_00000001.JPEG 65
...
```
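
To make the list format concrete, here is a minimal Python sketch of how such a file could be parsed. The helper names `parse_list_lines` and `read_list_file` are hypothetical and are not part of PaddleClas. The sketch assumes only what is shown above: one sample per line, with the path and the integer label separated by a space.

```python
from typing import List, Tuple


def parse_list_lines(lines: List[str]) -> List[Tuple[str, int]]:
    """Parse 'path label' lines into (path, label) pairs."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space so the label is always the final field.
        path, label = line.rsplit(" ", 1)
        samples.append((path, int(label)))
    return samples


def read_list_file(list_path: str) -> List[Tuple[str, int]]:
    """Read a list file such as train_list.txt or val_list.txt."""
    with open(list_path, encoding="utf-8") as f:
        return parse_list_lines(f.readlines())


print(parse_list_lines(["train/n01440764/n01440764_10026.JPEG 0"]))
```

Splitting from the right keeps the parser tolerant of paths that themselves contain spaces, although the paths shown in this document do not.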

### ImageNet1k
After downloading the data, organize the data directory as follows:

```bash
PaddleClas/dataset/ILSVRC2012/
|_ train/
|  |_ n01440764
|  |  |_ n01440764_10026.JPEG
|  |  |_ ...
|  |_ ...
|  |
|  |_ n15075141
|     |_ ...
|     |_ n15075141_9993.JPEG
|_ val/
|  |_ ILSVRC2012_val_00000001.JPEG
|  |_ ...
|  |_ ILSVRC2012_val_00050000.JPEG
|_ train_list.txt
|_ val_list.txt
```
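
Once the directory is organized, a quick consistency check can catch misplaced files before training. The sketch below is not a PaddleClas utility, and `find_missing_files` is a hypothetical helper. It assumes that the paths in the list files are relative to the dataset root, as the layout above suggests.

```python
import os
from typing import List


def find_missing_files(dataset_root: str, list_name: str) -> List[str]:
    """Return image paths named in a list file that do not exist on disk."""
    missing = []
    with open(os.path.join(dataset_root, list_name), encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rel_path = line.rsplit(" ", 1)[0]  # drop the trailing label
            if not os.path.isfile(os.path.join(dataset_root, rel_path)):
                missing.append(rel_path)
    return missing


if __name__ == "__main__":
    root = "PaddleClas/dataset/ILSVRC2012"
    if os.path.isdir(root):  # only run where the dataset is present
        for name in ("train_list.txt", "val_list.txt"):
            print(name, "missing files:", len(find_missing_files(root, name)))
```

An empty result for both list files indicates that every listed image is where the list says it is.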

### Flowers102 Dataset

Download the [data](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/) and decompress it to obtain:

```shell
jpg/
setid.mat
imagelabels.mat
```

Place all of these files under `PaddleClas/dataset/flowers102`.

Then use `generate_flowers102_list.py` to generate `train_list.txt` and `val_list.txt`:

```bash
python generate_flowers102_list.py jpg train > train_list.txt
python generate_flowers102_list.py jpg valid > val_list.txt
```
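
The source of `generate_flowers102_list.py` is not shown in this document, so the following is only a sketch of what such a script might do, under several labeled assumptions:

* `setid.mat` stores 1-based image ids under split keys such as `trnid` and `valid`.
* `imagelabels.mat` stores 1-based class ids under the key `labels`.
* The class ids are shifted to 0-based labels.

Which split key the script maps to `train` and `valid` is not stated here, so the sketch leaves it as a parameter. The function names are hypothetical.

```python
from typing import Iterable, List, Sequence


def format_flowers_list(image_ids: Iterable[int],
                        labels: Sequence[int],
                        image_dir: str = "jpg") -> List[str]:
    """Build 'path label' lines; labels[i - 1] is the class of image i."""
    lines = []
    for image_id in image_ids:
        image_id = int(image_id)
        path = f"{image_dir}/image_{image_id:05d}.jpg"
        # Assumption: shift the 1-based class ids to 0-based labels.
        lines.append(f"{path} {int(labels[image_id - 1]) - 1}")
    return lines


def load_split(split_key: str):
    """Load image ids and labels from the .mat files (requires SciPy)."""
    import scipy.io  # imported lazily so the formatter has no dependency
    ids = scipy.io.loadmat("setid.mat")[split_key][0]
    labels = scipy.io.loadmat("imagelabels.mat")["labels"][0]
    return ids, labels


print("\n".join(format_flowers_list([1, 3], [5, 9, 102])))
```

Keeping the formatting separate from the `.mat` loading makes the output format easy to check without the dataset files.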

Organize the data directory as follows:

```bash
PaddleClas/dataset/flowers102/
|_ jpg/
|  |_ image_03601.jpg
|  |_ ...
|  |_ image_02355.jpg
|_ train_list.txt
|_ val_list.txt
```