# Design Doc: Python API

The top-level user API in Python should be the same as the API in `paddle.v2` after Paddle is refactored from a layer-based framework to an operator-based framework. There are many new C++ classes in [compile time] for describing neural networks, such as `Variable`, `Operator`, and `Block`. The open question in the current design is how to wrap this C++ API properly, both to expose the `paddle.v2` API and to write layers in Python.

The implementation of the Python API proceeds in two steps.

1. Implement the Python API using current C++ runtime concepts.
2. Replace the implementation by using compile-time concepts when they are completed.

The first step is a temporary implementation. We should design our Python API concepts around the `compile-time` concepts, and use the `runtime` classes to implement them only for now.


## Python Class and compile-time protobuf

Since we design our Python API concepts based on `compile-time` concepts, we map each Python class to a compile-time result, i.e., a protobuf message. The mapping is:


| Python Class | Compile-time protobuf |
| --- | --- |
| Block | BlockDesc |
| Operator | OpDesc |
| Variable | VarDesc |


### Block

A `Block` is like a pair of braces `{}` in a programming language: it contains many operators and variables. A `Block` has two data fields: 1) an associative map whose keys are variable names and whose values are the variables themselves; 2) a list of operators.

The block is hierarchical because PaddlePaddle supports RNN and IfElse. For example, an RNN is like a `for-loop` in a programming language: there is a new `block` inside the `for-loop`. To represent the hierarchy, each `Block` stores its `parent Block`. If `parent=None`, the `Block` is the outermost block, i.e., the `global` block.


```python
class Block(object):
    def __init__(self, parent=None):
        self.vars = dict()  # variable name -> Variable
        self.ops = list()   # operators, in execution order
        self.parent = parent

    def create_var(self, ...):
        # create variable in `self.vars`
        return Variable(...)

    def create_global_var(self, ...):
        # delegate upwards until the outermost (global) block is reached
        if self.parent is not None:
            return self.parent.create_global_var(...)
        else:
            return self.create_var(...)

    def create_parameter(self, ...):
        return self.create_global_var(...)

    def append_operator(self, ...):
        self.ops.append(...)

    def prepend_operator(self, ...):
        self.ops.insert(0, ...)
```

Users are able to create a global variable inside any block, since they may create parameters inside an RNN or IfElseOp. All parameters should be stored in the global block, not in the step block of the RNN.

Users can create local variables for the outputs of operators. Users can also append or prepend an operator in the current block. Prepending a `random initialize` operator or a `load` operator is very useful for initializing parameters before training.
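
The behaviour described above can be tried out with a minimal, runnable sketch. It is not the actual implementation: `Variable` is reduced to a name holder, plain strings stand in for `Operator` objects, and the single `name` argument replaces the elided `...` arguments purely for illustration.

```python
class Variable(object):
    """Stand-in for the real Variable; it only records a name here."""

    def __init__(self, name):
        self.name = name


class Block(object):
    def __init__(self, parent=None):
        self.vars = {}   # variable name -> Variable
        self.ops = []    # operators, in execution order
        self.parent = parent

    def create_var(self, name):
        var = Variable(name)
        self.vars[name] = var
        return var

    def create_global_var(self, name):
        # Walk up the parent chain until the outermost (global) block.
        if self.parent is not None:
            return self.parent.create_global_var(name)
        return self.create_var(name)

    def create_parameter(self, name):
        return self.create_global_var(name)

    def append_operator(self, op):
        self.ops.append(op)
        return op

    def prepend_operator(self, op):
        self.ops.insert(0, op)
        return op


global_block = Block()
step_block = Block(parent=global_block)   # e.g. the step block of an RNN

w = step_block.create_parameter("w")      # ends up in the global block
h = step_block.create_var("h")            # stays local to the step block

global_block.append_operator("sgd")       # strings stand in for Operator objects
global_block.prepend_operator("random_initialize")
```

Although `create_parameter` is called on `step_block`, the parameter `w` is registered in `global_block.vars`, while `h` is visible only in `step_block.vars`. The prepended initializer ends up ahead of every operator appended earlier.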


### Operator

The `Operator` class takes the inputs, outputs, and attributes of an operator, packs them into a `protobuf` `OpDesc`, and creates a C++ `OpDesc` instance. `infer_shape` is performed on the C++ objects.

```python
class Operator(object):
    def __init__(self, type, inputs, outputs, attrs):
        # create OpDesc in Python
        op_desc = ...
        self.cpp_op_desc_ptr = core.OpDesc(op_desc)
        cpp.infer_shape(self.cpp_op_desc_ptr, inputs, outputs)

    def type(self):
        return self.cpp_op_desc_ptr.type()
```

After the C++ `OpDesc` is created, the Python `Operator` can only read attributes from the C++ side.

### Variable

Operators' inputs, outputs, and parameters are all variables. In our design, a variable has four key attributes: its name (`name`), the block it belongs to (`block`), a pointer to its C++ Protobuf object (`cpp_var_desc_ptr`), and the operator that creates it (`op`). All of these attributes are initialized in the constructor except `op`, which stays `None` until the variable is used as an operator's output.

```python
class Variable(object):
    def __init__(self, shape, dtype="float32", name=None, block=None):
        if name is None:
            name = unique_name_generator()
        self.name = name
        self.block = block
        # build C++ Protobuf object
        self.cpp_var_desc_ptr = ...
        self.op = None

    def shape(self):
        cpp_shape = self.cpp_var_desc_ptr.shape()
        return [None if elem < 0 else elem for elem in cpp_shape]
```

The Protobuf object should be created in C++ rather than in Python, because it is needed by `infer_shape`, which is implemented in C++. Python can access the C++ Protobuf object through `cpp_var_desc_ptr`, as the `shape()` function does.
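
As a rough illustration of that access pattern, the sketch below reads a shape through a descriptor object and converts it for Python users. `FakeVarDesc` is a hypothetical stand-in for the C++ binding, and reading negative dimensions as "unknown" is an assumption drawn from the `shape()` code above.

```python
class FakeVarDesc(object):
    """Hypothetical stand-in for the C++ VarDesc binding."""

    def __init__(self, shape):
        self._shape = list(shape)

    def shape(self):
        return list(self._shape)


def python_shape(cpp_var_desc_ptr):
    # Assumption: a negative dimension on the C++ side means "unknown",
    # so it is exposed to Python users as None.
    cpp_shape = cpp_var_desc_ptr.shape()
    return [None if elem < 0 else elem for elem in cpp_shape]


desc = FakeVarDesc([-1, 784])   # e.g. an unknown batch size
print(python_shape(desc))       # prints [None, 784]
```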

Users may build a variable without specifying its name. In that case, it is assigned an automatically generated unique name.
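
One possible way to generate such names is a process-wide counter. The design does not specify the naming scheme, so the `prefix` argument and the `<prefix>_<n>` format below are assumptions made for illustration only.

```python
import itertools

# Process-wide counter shared by every generated name.
_name_counter = itertools.count()


def unique_name_generator(prefix="tmp"):
    # Hypothetical scheme: "<prefix>_<n>", where n never repeats.
    return "%s_%d" % (prefix, next(_name_counter))


first = unique_name_generator()
second = unique_name_generator()
```

Because the counter only ever increases, two calls can never return the same name, even with the same prefix.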

### Parameter

A parameter is a special kind of variable. Parameters need to be initialized at the very beginning and updated after each training batch. So if a variable is a parameter, our compiler adds an initializer op and an optimizer op for it while building the computation graph. Apart from this, there is no difference between a variable and a parameter. In other words, 'parameter' is only a label attached to a variable, telling the compiler that it requires additional processing.

```python
class Parameter(Variable):
    def __init__(self, trainable, initialize_attrs, optimize_attrs):
        pass
```

The class `Parameter` is derived from class `Variable`. In addition to everything a variable has, a parameter holds its initializing and updating information. A parameter's `self.op` will always be `None` because it can never be an operator's output.
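
To make the "label for the compiler" idea concrete, here is a small sketch under stated assumptions. `Variable` is a reduced stand-in, `add_parameter_ops` is a hypothetical compiler pass, ops are modelled as plain tuples, and skipping the optimizer op for non-trainable parameters is an assumption rather than something this design specifies.

```python
class Variable(object):
    """Reduced stand-in for Variable; the C++ desc is omitted."""

    def __init__(self, name):
        self.name = name
        self.op = None


class Parameter(Variable):
    def __init__(self, name, trainable=True,
                 initialize_attrs=None, optimize_attrs=None):
        super(Parameter, self).__init__(name)
        self.trainable = trainable
        self.initialize_attrs = initialize_attrs or {}
        self.optimize_attrs = optimize_attrs or {}


def add_parameter_ops(ops, variables):
    """Hypothetical pass: surround `ops` with the extra ops parameters need."""
    init_ops, optimize_ops = [], []
    for var in variables:
        if not isinstance(var, Parameter):
            continue  # ordinary variables need no additional processing
        init_ops.append(("initialize", var.name, var.initialize_attrs))
        if var.trainable:  # assumption: frozen parameters get no optimizer op
            optimize_ops.append(("optimize", var.name, var.optimize_attrs))
    return init_ops + list(ops) + optimize_ops


x = Variable("x")
w = Parameter("w",
              initialize_attrs={"type": "uniform"},
              optimize_attrs={"learning_rate": 0.01})
program = add_parameter_ops([("fc", "x", {})], [x, w])
```

Here `program` starts with the initializer op for `w`, keeps the user's `fc` op in the middle, and ends with the optimizer op for `w`. The plain variable `x` contributes nothing extra.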


## Layer Functions

A layer is a Python function. When invoked, it creates a series of operators and variables and inserts them into the block. It is something like a macro in C++. It is called a 'Layer' because the combination of added operators acts just like a neural network layer.

Here are examples of how to write a data layer and an FC layer:

### Data Layer

```python
def data_layer(name, type, block=None):
    if block is None:
        block = g_block
    # type = dense_vector(size=10) / integer_value(range=10)
    return block.create_global_var(
            name=name, 
            shape=[None] + type.dims(), 
            dtype=type.dtype)

``` 

Before building new variables, we need to specify which block to use. If we don't, the default block `g_block` is used. In the `data_layer` code above, a variable is created and inserted into the root block to make it global. This variable will be used as the input data of the whole network.

### FC Layer

```python
def fc_layer(input, size, block=None, ...):
    if block is None:
        block = g_block
    w = block.create_parameter(...)
    b = block.create_parameter(...)
    out = block.create_var()
    op = block.append_operator(Operator("FC", X=input, W=w, b=b, Out=out))
    out.op = op
    return out
```

In the `fc_layer` code, we create two parameters (`w` and `b`), one variable (`out`), and one operator (the `FC` operator), then insert all of them into the specified block.
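
The following self-contained sketch shows how two such layer calls chain together. All three classes are simplified stand-ins with no C++ side. The `name` argument, the `.w`/`.b`/`.out` naming convention, and storing `size` as an attribute are assumptions added for illustration.

```python
class Variable(object):
    def __init__(self, name):
        self.name = name
        self.op = None  # set once the variable becomes an operator's output


class Operator(object):
    def __init__(self, type, inputs, outputs, attrs):
        self.type = type
        self.inputs = inputs
        self.outputs = outputs
        self.attrs = attrs


class Block(object):
    def __init__(self, parent=None):
        self.vars, self.ops, self.parent = {}, [], parent

    def create_var(self, name):
        self.vars[name] = Variable(name)
        return self.vars[name]

    def create_parameter(self, name):
        # Parameters always live in the outermost (global) block.
        root = self
        while root.parent is not None:
            root = root.parent
        return root.create_var(name)

    def append_operator(self, op):
        self.ops.append(op)
        return op


g_block = Block()  # default block, as in the layer code above


def fc_layer(input, size, name, block=None):
    if block is None:
        block = g_block
    w = block.create_parameter(name + ".w")
    b = block.create_parameter(name + ".b")
    out = block.create_var(name + ".out")
    op = block.append_operator(Operator(
        "FC",
        inputs={"X": input, "W": w, "b": b},
        outputs={"Out": out},
        attrs={"size": size}))
    out.op = op  # remember which operator produces this variable
    return out


image = g_block.create_var("image")
hidden = fc_layer(image, size=128, name="fc1")
prediction = fc_layer(hidden, size=10, name="fc2")
```

Each call returns its output variable, so the result of one layer can be passed directly as the input of the next. Following `prediction.op.inputs["X"]` leads back to `hidden`, whose own `op` is the first `FC` operator.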