
# Guide to Fluid Programming

This document shows you how to program and build a simple neural network with the Fluid API. After reading this guide, you will be familiar with:

- Core concepts of Fluid
- How to define a computation in Fluid
- How to run Fluid operators with Executor
- How to model practical problems logically
- How to call the APIs (layers, datasets, loss functions, optimization methods and so on)

Before building a model, you need to understand several core concepts of Fluid:

## Express data with Tensor 

Like other mainstream frameworks, Fluid uses Tensor to hold data.

All data passed through a neural network is represented as Tensors, which can be regarded as multi-dimensional arrays. In general, a Tensor can have any number of dimensions. Each Tensor has its own data type and shape. All elements in a single Tensor share the same data type, and **the shape of a Tensor** refers to the size of each of its dimensions.

The picture below shows Tensors with one to six dimensions:
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/FluidDoc/develop/doc/fluid/beginners_guide/image/tensor.jpg" width="400">
</p>
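
To make the idea of a shape concrete, here is a minimal sketch in plain Python. It treats a regular nested list as a stand-in for a Tensor, which is an assumption made for illustration only; the helper `tensor_shape` is hypothetical and not part of Fluid.

```python
# A nested Python list stands in for a Tensor here (an assumption for
# illustration; this is not how Fluid stores Tensors internally).
def tensor_shape(data):
    """Return the size of each dimension of a regular nested list."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break
        data = data[0]
    return shape

vector = [5.0]                                          # 1 dimension
matrix = [[1, 2, 3], [4, 5, 6]]                         # 2 dimensions
cube = [[[0] * 4 for _ in range(3)] for _ in range(2)]  # 3 dimensions

print(tensor_shape(vector))  # [1]
print(tensor_shape(matrix))  # [2, 3]
print(tensor_shape(cube))    # [2, 3, 4]
```

The number of entries in the shape is the number of dimensions, and each entry is the size along that dimension.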


There are three special kinds of Tensor in Fluid:

**1. Learnable parameters of models**

The lifetime of learnable parameters (such as network weights and biases) equals the duration of the training task. The parameters are updated by optimization algorithms. Fluid uses Parameter, a class derived from Variable, to represent them.

We can create learnable parameters with `fluid.layers.create_parameter` in Fluid:

```python
import paddle.fluid as fluid
w = fluid.layers.create_parameter(name="w",shape=[1],dtype='float32')
```


In general, you don't need to create the learnable parameters of a network explicitly. Fluid encapsulates most of the fundamental computing modules found in common networks. Take the fully connected layer as the simplest example: the code below creates the connection weight (W) and bias (bias) for a fully connected layer without explicitly calling any Parameter-related APIs.

```python
import paddle.fluid as fluid
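# x is assumed here to be an input Tensor with 13 features
x = fluid.layers.data(name="x", shape=[13], dtype='float32')
y = fluid.layers.fc(input=x, size=128, bias_attr=True)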
```


**2. Input and Output Tensor**

The input data of the whole neural network is also a special Tensor: the sizes of some of its dimensions cannot be determined when the model is defined. Such dimensions usually include the batch size, or the width and height of images when these vary within a mini-batch. Placeholders for these undetermined dimensions are required when the model is defined.


`fluid.layers.data` is used to receive input data in Fluid, and it must be given the shape of the input Tensor. When the size of a dimension is not known in advance, define that dimension as None.

The code below exemplifies the usage of `fluid.layers.data` :

```python
import paddle.fluid as fluid

#Define the shape of x as [3,None]. The first dimension of x is known to be 3.
#The second dimension is unknown and is determined only at runtime.
x = fluid.layers.data(name="x", shape=[3,None], dtype="int64")

#The batch size doesn't have to be defined explicitly.
#Fluid automatically adds a batch size dimension as the zeroth dimension and fills in the right size at runtime.
a = fluid.layers.data(name="a",shape=[3,4],dtype='int64')

#If the width and height of the image are variable, define them as None.
#The three dimensions of shape are the channel, the image width and the image height respectively.
b = fluid.layers.data(name="image",shape=[3,None,None],dtype="float32")
```

`dtype="int64"` indicates 64-bit signed integer data. For more data types supported by Fluid, please refer to [Data types currently supported by Fluid](../../user_guides/howto/prepare_data/feeding_data_en.html#fluid).
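
The role of a None placeholder can be sketched in plain Python as follows. The helper `shape_matches` is hypothetical and only illustrates the idea that a None dimension accepts any size at runtime; it is not the check Fluid actually performs.

```python
def shape_matches(declared, actual):
    """Return True if `actual` fits `declared`, where None accepts any size."""
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))

declared = [3, None]                     # same shape as x in the example above
print(shape_matches(declared, [3, 10]))  # True: second dimension is free
print(shape_matches(declared, [3, 1]))   # True
print(shape_matches(declared, [4, 10]))  # False: first dimension must be 3
```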

**3. Constant Tensor**

`fluid.layers.fill_constant` is used to define a constant Tensor in Fluid. You can specify the shape, data type and value of the constant Tensor. For example:

```python
import paddle.fluid as fluid
data = fluid.layers.fill_constant(shape=[1], value=0, dtype='int64')
```

Note that the Tensor defined above has not yet been assigned any values. It merely represents the operation to perform. If you print `data` directly, you get only a description of it:

```python
print(data)
```
Output:

```
name: "fill_constant_0.tmp_0"
type {
    type: LOD_TENSOR
    lod_tensor {
        tensor {
            data_type: INT64
            dims: 1
        }
    }
}
persistable: false
```

The actual value is produced only when Executor runs the program. The detailed process is explained later.
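
The difference between describing an operation and computing its value can be sketched with a toy class. `ConstantOp` is a hypothetical stand-in written for this illustration; it is not a Fluid class and does not reproduce Fluid's internals.

```python
class ConstantOp:
    """Toy description of a fill_constant operation (hypothetical class)."""

    def __init__(self, shape, value, dtype):
        self.shape, self.value, self.dtype = shape, value, dtype

    def __repr__(self):
        # Printing shows only the description, never computed values.
        return "ConstantOp(shape=%r, dtype=%r)" % (self.shape, self.dtype)

    def run(self):
        # Values are materialized only when the operation is executed.
        count = 1
        for dim in self.shape:
            count *= dim
        return [self.value] * count  # flattened for simplicity

data = ConstantOp(shape=[1], value=0, dtype='int64')
print(data)        # ConstantOp(shape=[1], dtype='int64')
print(data.run())  # [0]
```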

## Feed data

Data is fed into Fluid as follows:

Use `fluid.layers.data` to configure the data input layer, and use ``executor.run(feed=...)`` to feed training data into `fluid.Executor` or `fluid.ParallelExecutor`.

For details on preparing data, please refer to [Preparation for data](../../user_guides/howto/prepare_data/index_en.html).


## Operators -- operations on data

All operations on data are achieved by Operators in Fluid.

To facilitate development, on the Python side the Operators in Fluid are further encapsulated into `paddle.fluid.layers`, `paddle.fluid.nets` and other modules.

This is because a common operation on Tensors may be composed of many fundamental operations. To reduce repeated coding, Fluid wraps the fundamental Operators together with the details they depend on, such as creating the learnable parameters an Operator relies on and initializing those parameters.
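
As a rough illustration of such a composition, a fully connected layer can be viewed as a matrix multiplication followed by a bias addition. The plain-Python sketch below uses hypothetical helper names (`matmul`, `add_bias`, `toy_fc`) and takes the weight and bias as arguments, whereas a real Fluid layer would create and initialize them for you.

```python
def matmul(x, w):
    """Multiply a [batch, in] matrix by an [in, out] matrix."""
    return [[sum(xi * wij for xi, wij in zip(row, col)) for col in zip(*w)]
            for row in x]

def add_bias(m, b):
    """Add a bias vector to every row of a matrix."""
    return [[v + bj for v, bj in zip(row, b)] for row in m]

def toy_fc(x, w, b):
    """A fully connected layer composed from the two operations above."""
    return add_bias(matmul(x, w), b)

x = [[1.0, 2.0]]     # one sample with two features
w = [[0.5], [0.25]]  # weight: 2 inputs -> 1 output
b = [0.1]            # bias
print(toy_fc(x, w, b))
```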

For example, you can use `paddle.fluid.layers.elementwise_add()` to add two input Tensors:

```python
#Define network
import paddle.fluid as fluid
a = fluid.layers.data(name="a",shape=[1],dtype='float32')
b = fluid.layers.data(name="b",shape=[1],dtype='float32') 

result = fluid.layers.elementwise_add(a,b)

#Define Executor
cpu = fluid.core.CPUPlace() #define computing place. Here we choose to run on CPU
exe = fluid.Executor(cpu) #create executor
exe.run(fluid.default_startup_program()) #initialize network parameters

#Prepare data
import numpy
data_1 = int(input("a="))
data_2 = int(input("b="))
x = numpy.array([[data_1]])
y = numpy.array([[data_2]])

#Run computing
outs = exe.run(
    feed={'a':x,'b':y},
    fetch_list=[result.name])

#Verify result
print("%d+%d=%d" % (data_1,data_2,outs[0][0][0]))
```

Output:
```
a=7
b=3
7+3=10
```

At runtime, entering a=7 and b=3 gives the output 10.

You can copy the code, run it locally, enter different numbers at the prompts and check the computed result.

If you want to inspect the values of a and b while the network runs, add the variables you want to check to ``fetch_list``.

```python
...
#Run computing
outs = exe.run(
    feed={'a':x,'b':y},
    fetch_list=[a,b,result.name])
#Check output
print(outs)
```

Output:
```
[array([[7]]), array([[3]]), array([[10]])]
```
    
## Use Program to describe neural network model

Fluid differs from most other deep learning frameworks: instead of a static computation graph, it uses a Program to describe the network dynamically. This approach makes the network structure easy to modify and models convenient to build. It also makes models considerably more expressive while preserving performance.

All Operators are written into a Program, which Fluid automatically transforms into a descriptive language named ProgramDesc. Defining a Program is much like writing an ordinary program, so if you are an experienced developer, you can apply the programming knowledge you already have to Fluid.

You can describe any complex model by combining sequential processes, branches and loops supported by Fluid.

**Sequential Process**

You can build a network with a sequential structure:

```python
x = fluid.layers.data(name='x',shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
y = fluid.layers.data(name='y', shape=[1], dtype='float32')
cost = fluid.layers.square_error_cost(input=y_predict, label=y)
```

**Conditional branch: Switch, if-else**

The Switch and if-else classes are used to implement conditional branches in Fluid. You can use them to adjust the learning rate in a learning rate scheduler or to perform other operations:

```python
import paddle.fluid as fluid

lr = fluid.layers.tensor.create_global_var(
        shape=[1],
        value=0.0,
        dtype='float32',
        persistable=True,
        name="learning_rate")

zero_var = fluid.layers.fill_constant(
        shape=[1], dtype='float32', value=0.0)
one_var = fluid.layers.fill_constant(
        shape=[1], dtype='float32', value=1.0)
two_var = fluid.layers.fill_constant(
        shape=[1], dtype='float32', value=2.0)

# global_step is assumed to be a step-counter Variable defined elsewhere in the program

with fluid.layers.control_flow.Switch() as switch:
    with switch.case(global_step == zero_var):
        fluid.layers.tensor.assign(input=one_var, output=lr)
    with switch.default():
        fluid.layers.tensor.assign(input=two_var, output=lr)
```
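
The intended behavior of the Switch structure above (take the first case whose condition holds, otherwise take the default) can be sketched in plain Python. `toy_switch` and `learning_rate_for` are hypothetical helpers written for this illustration; they run eagerly and are not how Fluid builds control flow into a Program.

```python
def toy_switch(cases, default):
    """Run the action of the first true case, else the default action.

    `cases` is a list of (condition, action) pairs. Both names are
    hypothetical and only mirror the Switch structure shown above.
    """
    for condition, action in cases:
        if condition:
            return action()
    return default()

def learning_rate_for(global_step):
    # Step 0 selects 1.0; every other step falls through to 2.0.
    return toy_switch(
        cases=[(global_step == 0, lambda: 1.0)],
        default=lambda: 2.0)

print(learning_rate_for(0))  # 1.0
print(learning_rate_for(5))  # 2.0
```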

    
For detailed design principles of Program, please refer to [Design principle of Fluid](../../user_guides/design_idea/fluid_design_idea_en.html).

For more about control flow in Fluid, please refer to [Control Flow](../../api_guides/low_level/layers/control_flow_en.html).


## Use Executor to run Program

The design of Fluid is similar to that of C++, Java and other high-level programming languages: the execution of a program is divided into two steps, compile and run.

Executor accepts the defined Program and transforms it into an executable Fluid Program in the C++ back end. This transformation happens automatically and is the compilation step.

After compilation, Executor runs the compiled Fluid Program.

Taking the add operator above as an example, after constructing the Program you need to create an Executor to initialize and run it:

```python
#define Executor
cpu = fluid.core.CPUPlace() #define computing place. Here we choose training on CPU
exe = fluid.Executor(cpu) #create executor
exe.run(fluid.default_startup_program()) #initialize Program

#train Program and start computing
#feed defines the order of data transferred to network in the form of dict
#fetch_list defines the output of network
outs = exe.run(
    feed={'a':x,'b':y},
    fetch_list=[result.name])
```
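
The define-then-run pattern, together with the roles of `feed` and `fetch_list`, can be sketched with a toy interpreter in plain Python. The function `run_program` and the tuple format used for `program` are assumptions made for this illustration and do not reflect Fluid's actual Program or Executor implementation.

```python
def run_program(ops, feed, fetch_list):
    """Execute ops in order over a variable scope and return fetched values.

    `ops` is a list of (output_name, function, input_names) tuples, a
    hypothetical stand-in for a compiled Program.
    """
    scope = dict(feed)  # fed inputs become named variables
    for out_name, fn, in_names in ops:
        scope[out_name] = fn(*[scope[name] for name in in_names])
    return [scope[name] for name in fetch_list]

# "Define" phase: describe the computation without running it.
program = [("result", lambda a, b: a + b, ["a", "b"])]

# "Run" phase: supply data and choose which variables to return.
outs = run_program(program, feed={"a": 7, "b": 3},
                   fetch_list=["a", "b", "result"])
print(outs)  # [7, 3, 10]
```

In this sketch `feed` maps variable names to input data and `fetch_list` names the variables whose values are returned, mirroring how the two arguments are used in the example above.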

## Code example

So far, you have a basic knowledge of the core concepts in Fluid. Why not try to configure a simple network? This section guides you through a very simple data prediction task. If you are already familiar with this material, you can skip this section and read [What's next](#what_next).

First, define the input data format, model structure, loss function and optimization algorithm logically. Then use PaddlePaddle APIs and operators to implement the model logic. A typical model mainly contains four parts: the definition of the input data format, the forward computing logic, the loss function, and the optimization algorithm.

1. Problem

    Given a pair of data $<X,Y>$, construct a function $f$ such that $y=f(x)$. $X$ and $Y$ are both one-dimensional Tensors. The goal is for the network to predict $y_{predict}$ accurately from the input $x$.

2. Define data

    Suppose the input data is X=[1, 2, 3, 4] and Y=[2, 4, 6, 8]. Define it in the network:
    
    ```python
    import numpy
    #define X
    train_data=numpy.array([[1.0],[2.0],[3.0],[4.0]]).astype('float32')
    #define ground-truth y_true expected to get from the model prediction
    y_true = numpy.array([[2.0],[4.0],[6.0],[8.0]]).astype('float32')
    ```
        
3. Create network (define forward computing logic)

    Next you need to define the relationship between the predicted value and the input. Take a simple linear regression function for example:
    
    ```python
    #define input data type
    x = fluid.layers.data(name="x",shape=[1],dtype='float32')
    #create fully connected network
    y_predict = fluid.layers.fc(input=x,size=1,act=None)
    ```
    
    Now the network can produce predictions, although the output is just a set of effectively random numbers that is far from the expected results:
    
    ```python
    #load library
    import paddle.fluid as fluid
    import numpy
    #define data
    train_data=numpy.array([[1.0],[2.0],[3.0],[4.0]]).astype('float32')
    y_true = numpy.array([[2.0],[4.0],[6.0],[8.0]]).astype('float32')
    #define predict function
    x = fluid.layers.data(name="x",shape=[1],dtype='float32')
    y_predict = fluid.layers.fc(input=x,size=1,act=None)
    #initialize parameters
    cpu = fluid.core.CPUPlace()
    exe = fluid.Executor(cpu)
    exe.run(fluid.default_startup_program())
    #run forward computation
    outs = exe.run(
        feed={'x':train_data},
        fetch_list=[y_predict.name])
    #observe result
    print(outs)
    ```
    
    Output:
    
    ```
    [array([[0.74079144],
               [1.4815829 ],
               [2.2223744 ],
               [2.9631658 ]], dtype=float32)]
    ```

4. Add loss function 
    
    After constructing the model, we need to evaluate its output in order to make accurate predictions. How do we evaluate a prediction? We usually add a loss function to the network to compute the *distance* between the ground-truth value and the predicted value.

    In this example, we adopt the [mean squared error](https://en.wikipedia.org/wiki/Mean_squared_error) as our loss function:

    ```python
    cost = fluid.layers.square_error_cost(input=y_predict, label=y)
    avg_cost = fluid.layers.mean(cost)
    ```
    Output the predicted values and the loss after one round of computation:
    
    ```python
    #load library
    import paddle.fluid as fluid
    import numpy
    #define data
    train_data=numpy.array([[1.0],[2.0],[3.0],[4.0]]).astype('float32')
    y_true = numpy.array([[2.0],[4.0],[6.0],[8.0]]).astype('float32')
    #define network
    x = fluid.layers.data(name="x",shape=[1],dtype='float32')
    y = fluid.layers.data(name="y",shape=[1],dtype='float32')
    y_predict = fluid.layers.fc(input=x,size=1,act=None)
    #define loss function
    cost = fluid.layers.square_error_cost(input=y_predict,label=y)
    avg_cost = fluid.layers.mean(cost)
    #initialize parameters
    cpu = fluid.core.CPUPlace()
    exe = fluid.Executor(cpu)
    exe.run(fluid.default_startup_program())
    #run forward computation
    outs = exe.run(
        feed={'x':train_data,'y':y_true},
        fetch_list=[y_predict.name,avg_cost.name])
    #observe output
    print(outs)
    ```
    Output:
    
    ```
    [array([[0.9010564],
        [1.8021128],
        [2.7031693],
        [3.6042256]], dtype=float32), array([9.057577], dtype=float32)]
    ```

    The loss after the first iteration is about 9.06, which shows there is considerable room for improvement.
    
5. Optimization of network
    
    After defining the loss function, you can obtain the loss value by forward computation and then obtain the gradients of the parameters with the chain rule.
    
    Once the gradients are available, the parameters should be updated. The simplest algorithm is stochastic gradient descent (SGD): $w = w - \eta \cdot g$, which is implemented by `fluid.optimizer.SGD`:
    ```python
    sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
    ```
    Let's train the network for 100 iterations and check the results:
    
    ```python
    #load library
    import paddle.fluid as fluid
    import numpy
    #define data
    train_data=numpy.array([[1.0],[2.0],[3.0],[4.0]]).astype('float32')
    y_true = numpy.array([[2.0],[4.0],[6.0],[8.0]]).astype('float32')
    #define network
    x = fluid.layers.data(name="x",shape=[1],dtype='float32')
    y = fluid.layers.data(name="y",shape=[1],dtype='float32')
    y_predict = fluid.layers.fc(input=x,size=1,act=None)
    #define loss function
    cost = fluid.layers.square_error_cost(input=y_predict,label=y)
    avg_cost = fluid.layers.mean(cost)
    #define optimization algorithm
    sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
    sgd_optimizer.minimize(avg_cost)
    #initialize parameters
    cpu = fluid.core.CPUPlace()
    exe = fluid.Executor(cpu)
    exe.run(fluid.default_startup_program())
    #start training and iterate 100 times
    for i in range(100):
        outs = exe.run(
            feed={'x':train_data,'y':y_true},
            fetch_list=[y_predict.name,avg_cost.name])
    #observe result
    print(outs)
    ```

    Output:
    ```
    [array([[2.2075021],
            [4.1005487],
            [5.9935956],
            [7.8866425]], dtype=float32), array([0.01651453], dtype=float32)]
    ```
    Now the predicted values are very close to the true values, and after 100 iterations the loss has dropped from about 9.06 to about 0.017.
    
    Congratulations! You have successfully created a simple network. If you want to try a more advanced linear regression example, a model that predicts housing prices, please read [linear regression](../../beginners_guide/quick_start/fit_a_line/README.en.html). More model examples can be found in [models](../../user_guides/models/index_en.html).
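
To see what the loss function and the update rule $w = w - \eta \cdot g$ do numerically, the training loop above can be re-created in plain Python without Fluid. This is a sketch under stated assumptions: the model is a single weight `w` and bias `b` initialized to zero (Fluid chooses its own initial values), the gradients are derived by hand for the mean squared error, and every step uses all four samples as the Fluid example does. The resulting numbers will therefore differ from the outputs shown above.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w, b = 0.0, 0.0          # assumed initial values, not Fluid's initialization
learning_rate = 0.01

def forward(w, b):
    """Linear model: one prediction per input sample."""
    return [w * x + b for x in xs]

def mse(preds):
    """Mean squared error between predictions and ground truth."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

initial_loss = mse(forward(w, b))
for _ in range(100):
    preds = forward(w, b)
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(ys)
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / len(ys)
    # Update rule: parameter = parameter - learning_rate * gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mse(forward(w, b))
print(initial_loss, final_loss)
print(w, b)
```

After the loop, `final_loss` is far smaller than `initial_loss` and `w` has moved toward 2, the slope of the data.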

<a name="what_next"></a>
## What's next

Once you are familiar with the fundamental operations, you can continue learning Fluid:

You will learn how to build a model for a practical problem with Fluid: [The configuration of simple network](../../user_guides/howto/configure_simple_model/index_en.html).

After constructing the network, you can start training it on a single node or on multiple nodes. For detailed procedures, please refer to [train neural network](../../user_guides/howto/training/index_en.html).

In addition, the documentation is organized into three learning levels according to the developer's background and experience: [Beginner's Guide](../../beginners_guide/index_en.html), [User Guides](../../user_guides/index_en.html) and [Advanced User Guides](../../advanced_usage/index_en.html).

If you want to read examples from more application scenarios, you can go to [quick start](../../beginners_guide/quick_start/index_en.html) and [basic knowledge of deep learning](../../beginners_guide/basics/index_en.html). If you have already learned the basics of deep learning, you can start from the [user guide](../../user_guides/index_en.html).