# Computation Graph On Server

([简体中文](./DAG_CN.md)|English)

This document introduces the concept of a computation graph on the server, shows how to define such a graph with PaddleServing built-in operators, and gives examples of some sequential execution logics.

## Computation Graph on Server

Deep neural nets often have preprocessing steps on input data and postprocessing steps on model inference scores. Since deep learning frameworks are now very flexible, it is possible to do preprocessing and postprocessing outside the training computation graph. If we want to do input data preprocessing and inference result postprocessing on the server side, we have to add the corresponding computation logic on the server. Moreover, if a user wants to run inference with the same inputs on more than one model, the best way is to run the inferences concurrently on the server side for a single client request, which saves network communication overhead. For these two reasons, it is natural to use a Directed Acyclic Graph (DAG) as the main computation method for server inference. One example of a DAG is as follows:

<center>
<img src='../images/server_dag.png' width = "450" height = "500" align="middle"/>
</center>

## How to Define Nodes

### Simple series structure

PaddleServing provides some predefined computation nodes in the framework. A very commonly used computation graph is the simple reader-inference-response sequence, which covers most single-model inference scenarios. Here is an example of such a DAG.

<center>
<img src='../images/simple_dag.png' width = "260" height = "370" align="middle"/>
</center>

If you want to start the server through the Python API, the corresponding DAG definition code is as follows.
``` python
import paddle_serving_server as serving
from paddle_serving_server import OpMaker
from paddle_serving_server import OpSeqMaker

op_maker = serving.OpMaker()
# Create the three nodes of the reader-inference-response sequence.
read_op = op_maker.create('GeneralReaderOp')
general_infer_op = op_maker.create('GeneralInferOp')
general_response_op = op_maker.create('GeneralResponseOp')

op_seq_maker = serving.OpSeqMaker()
op_seq_maker.add_op(read_op)
op_seq_maker.add_op(general_infer_op)
op_seq_maker.add_op(general_response_op)
```
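
The sequence above only defines the DAG; to actually serve it, the sequence is handed to a `Server` instance. The following is a minimal sketch, assuming the standard `Server` API (`set_op_sequence`, `load_model_config`, `prepare_server`, `run_server`); the model directory name is a placeholder.

``` python
server = serving.Server()
# Hand the operator sequence built above to the server.
server.set_op_sequence(op_seq_maker.get_op_sequence())
# 'uci_housing_model' is a placeholder for your exported serving model directory.
server.load_model_config("uci_housing_model")
server.prepare_server(workdir="work_dir", port=9292, device="cpu")
server.run_server()
```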

If you start the C++ server through the command line plus a configuration file, you only need to modify [the configuration file](./Serving_Configure_CN.md); none of the code above needs to change.
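
To give a rough idea of what that configuration looks like: the C++ server reads the DAG from a `workflow.prototxt` file. The sketch below mirrors the reader-inference-response sequence above; the field names follow the Serving configuration documentation, but treat the exact layout as an assumption and check [the configuration file](./Serving_Configure_CN.md) for the authoritative format.

```
workflows {
  name: "workflow1"
  workflow_type: "Sequence"
  nodes {
    name: "general_reader_0"
    type: "GeneralReaderOp"
  }
  nodes {
    name: "general_infer_0"
    type: "GeneralInferOp"
    dependencies {
      name: "general_reader_0"
      mode: "RO"
    }
  }
  nodes {
    name: "general_response_0"
    type: "GeneralResponseOp"
    dependencies {
      name: "general_infer_0"
      mode: "RO"
    }
  }
}
```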

For simple series logic, we build the graph with `OpSeqMaker`. The successor of each node is determined by the order in which nodes are added to `OpSeqMaker`, so there is no need to specify the successor of each node explicitly.

Since this code is commonly used and users do not have to change it, PaddleServing provides an easy-to-use launching command for service startup. An example is as follows:

``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
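
Once the service is up, a client can query it over the network. Below is a minimal sketch, assuming the `uci_housing` quickstart model; the client config path, the feed name `x`, and the fetch name `price` come from that example and will differ for other models.

``` python
import numpy as np
from paddle_serving_client import Client

client = Client()
# Path to the client-side config exported together with the serving model.
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])

# One 13-dimensional (normalized) UCI housing sample; random here for brevity.
x = np.random.rand(1, 13).astype("float32")
fetch_map = client.predict(feed={"x": x}, fetch=["price"])
print(fetch_map)
```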

### Nodes with multiple inputs

An example containing multiple input nodes is given in [Model_Ensemble](./Model_Ensemble_EN.md). An example graph and the corresponding DAG definition code are as follows.

<center>
<img src='../images/complex_dag.png' width = "480" height = "400" align="middle"/>
</center>

``` python
from paddle_serving_server import OpMaker
from paddle_serving_server import OpGraphMaker
from paddle_serving_server import Server

op_maker = OpMaker()
read_op = op_maker.create('GeneralReaderOp')
cnn_infer_op = op_maker.create(
    'GeneralInferOp', engine_name='cnn', inputs=[read_op])
bow_infer_op = op_maker.create(
    'GeneralInferOp', engine_name='bow', inputs=[read_op])
response_op = op_maker.create(
    'GeneralResponseOp', inputs=[cnn_infer_op, bow_infer_op])

op_graph_maker = OpGraphMaker()
op_graph_maker.add_op(read_op)
op_graph_maker.add_op(cnn_infer_op)
op_graph_maker.add_op(bow_infer_op)
op_graph_maker.add_op(response_op)
```

For a graph with multiple input nodes, we need to build it with `OpGraphMaker`, and the predecessors of each node must be given explicitly.
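
As with the series case, the finished graph is handed to a `Server`. The sketch below follows the Model_Ensemble example; the `set_op_graph` call and the dict form of `load_model_config` (one model directory per infer op) are assumptions to verify against that document, and the model paths are placeholders.

``` python
server = Server()
# Hand the whole DAG, rather than a linear sequence, to the server.
server.set_op_graph(op_graph_maker.get_op_graph())
# Each infer op is served by its own model; the paths are placeholders.
model_config = {cnn_infer_op: 'imdb_cnn_model', bow_infer_op: 'imdb_bow_model'}
server.load_model_config(model_config)
server.prepare_server(workdir="work_dir", port=9292, device="cpu")
server.run_server()
```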

## More Examples

If a user has sparse features as inputs, and the model does an embedding lookup for each feature, we can perform the distributed embedding lookup as a serving operation that is not part of the Paddle training computation graph. An example is as follows:

``` python
import paddle_serving_server as serving
from paddle_serving_server import OpMaker
from paddle_serving_server import OpSeqMaker

op_maker = serving.OpMaker()
read_op = op_maker.create('GeneralReaderOp')
# GeneralDistKVInferOp fetches sparse embedding parameters from a distributed
# key-value service before the dense model inference runs.
dist_kv_op = op_maker.create('GeneralDistKVInferOp')
general_infer_op = op_maker.create('GeneralInferOp')
general_response_op = op_maker.create('GeneralResponseOp')

op_seq_maker = serving.OpSeqMaker()
op_seq_maker.add_op(read_op)
op_seq_maker.add_op(dist_kv_op)
op_seq_maker.add_op(general_infer_op)
op_seq_maker.add_op(general_response_op)
```
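
Starting this sequence follows the same `Server` pattern as before. The extra requirement is that `GeneralDistKVInferOp` must be able to reach the distributed key-value service holding the embedding tables, which is configured outside this snippet; the model path below is a placeholder.

``` python
server = serving.Server()
server.set_op_sequence(op_seq_maker.get_op_sequence())
# Dense model only; the sparse embedding tables live in the external KV service.
server.load_model_config("ctr_serving_model")
server.prepare_server(workdir="work_dir", port=9292, device="cpu")
server.run_server()
```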