Paddle reads data from a *data reader* during training. The *data reader* is passed into `paddle.train` as a parameter.
At training and testing time, PaddlePaddle programs need to read data. To ease the work of writing data-reading code, we define that
- A *reader* is a function that reads data (from a file, the network, a random number generator, etc.) and yields data items.
- A *reader creator* is a function that returns a reader function.
- A *reader decorator* is a function that accepts one or more readers and returns a reader.
and provide frequently used reader creators and reader decorators.
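
The three definitions above can be illustrated with a minimal sketch in plain Python. The names `reader_creator_range` and `take_first` are hypothetical and chosen only for this example; they are not part of PaddlePaddle's API.

```python
def reader_creator_range(n):
    """Reader creator: returns a reader that yields the integers 0..n-1."""
    def reader():
        for i in range(n):
            yield i
    return reader


def take_first(reader, limit):
    """Reader decorator: accepts a reader and returns a new reader
    that yields at most `limit` items from it."""
    def decorated_reader():
        for count, item in enumerate(reader()):
            if count >= limit:
                break
            yield item
    return decorated_reader


reader = reader_creator_range(5)
print(list(reader()))                 # [0, 1, 2, 3, 4]
print(list(take_first(reader, 2)()))  # [0, 1]
```

Because a reader is a function rather than an iterator, it can be called again to start a fresh pass over the data, which is what a multi-epoch training loop would need.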
## Data Reader Interface
A data reader is a function with no parameters that creates an iterable (anything that can be used in `for x in iterable`):
Indeed, a *data reader* doesn't have to be a function that reads and yields data items. It can be any function with no parameters that creates an iterable (anything that can be used in `for x in iterable`):
```
iterable = data_reader()
...
...
@@ -15,16 +21,20 @@ Element produced for the iterable should be a **single** entry of data, **not**
An example implementation of a single-item data reader:
```python
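import numpy

# Earlier version: the reader itself, with a fixed 20x20 image size.
def data_reader_fake_image():
    while True:
        yield numpy.random.uniform(-1, 1, size=20*20)

# Revised version: a reader creator parameterized by image size.
def reader_creator_random_image(width, height):
    def reader():
        while True:
            yield numpy.random.uniform(-1, 1, size=width*height)
    return reader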
```
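
A reader like the one above never terminates, so a consumer has to bound how many items it takes. The sketch below shows one way to do that with `itertools.islice`. It assumes a standard-library stand-in for `numpy.random.uniform` so that the example has no dependencies; each item is a flat list of `width * height` values rather than a numpy array.

```python
import itertools
import random


def reader_creator_random_image(width, height):
    def reader():
        while True:
            # One flattened image per item: width*height values in [-1, 1].
            yield [random.uniform(-1, 1) for _ in range(width * height)]
    return reader


reader = reader_creator_random_image(4, 3)

# Take only the first two items from the infinite reader.
samples = list(itertools.islice(reader(), 2))
print(len(samples), len(samples[0]))  # 2 12
```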
An example implementation of a multiple-item data reader:
For example, we want to use a source of real images (reusing the MNIST dataset), and a source of fake images as input for [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661).
For example, we want to use a source of real images (reusing the MNIST dataset), and a source of random images as input for [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661).