## Build Bert-As-Service in 10 minutes

([简体中文](./BERT_10_MINS_CN.md)|English)

The goal of Bert-As-Service is simple: given a sentence, the service computes a semantic vector representation of it and returns the vector to the user. The [Bert model](https://arxiv.org/abs/1810.04805) is a popular model in the current NLP field and has achieved strong results on a variety of public NLP tasks. The semantic vectors it computes are also commonly used as input to other NLP models, which can greatly improve their performance. Bert-As-Service allows users to easily obtain semantic vector representations of text and apply them to their own tasks. The four steps below show how Paddle Serving can be used to build such a service in ten minutes. All the code and files in this example can be found in the [Example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert) directory of Paddle Serving.

#### Step 1: Save the serviceable model

Paddle Serving supports a variety of models trained with Paddle, and saves a serviceable model by specifying its input and output variables. For convenience, we can load a pre-trained Chinese BERT model from PaddleHub and save a deployable service from it in a few lines of code. The server and client configurations are saved in the `bert_seq20_model` and `bert_seq20_client` folders, respectively.

[//file]:#bert_10.py
``` python
import paddlehub as hub

# Load the pre-trained Chinese BERT model from PaddleHub
model_name = "bert_chinese_L-12_H-768_A-12"
module = hub.Module(model_name)
inputs, outputs, program = module.context(
    trainable=True, max_seq_len=20)

# Input and output variables that define the serviceable model
feed_keys = ["input_ids", "position_ids", "segment_ids",
             "input_mask"]
fetch_keys = ["pooled_output", "sequence_output"]
feed_dict = dict(zip(feed_keys, [inputs[x] for x in feed_keys]))
fetch_dict = dict(zip(fetch_keys, [outputs[x] for x in fetch_keys]))

# Save the server and client configurations into the two folders
import paddle_serving_client.io as serving_io
serving_io.save_model("bert_seq20_model", "bert_seq20_client",
                      feed_dict, fetch_dict, program)
```
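
After the script runs, you can sanity-check that both folders were created. This is a minimal sketch; the exact files inside depend on the Paddle Serving version:

``` python
import os

# Hypothetical sanity check: both folders should exist after save_model
for d in ["bert_seq20_model", "bert_seq20_client"]:
    assert os.path.isdir(d), d + " was not created"
    print(d, "->", sorted(os.listdir(d)))
```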

#### Step 2: Launch the service

[//file]:#server.sh
``` shell
python -m paddle_serving_server_gpu.serve --model bert_seq20_model --thread 10 --port 9292 --gpu_ids 0
```
| Parameter | Meaning                                                |
| --------- | ------------------------------------------------------ |
| model     | path to the server configuration and model files       |
| thread    | number of server-side threads                           |
| port      | server port number                                      |
| gpu_ids   | index(es) of the GPU card(s) to use, e.g. `0` or `0,1` |

#### Step 3: Data preprocessing logic on the client side

Paddle Serving has built-in data preprocessing logic for many models. To compute the Chinese BERT semantic representation, we use the ChineseBertReader class under paddle_serving_app, which lets developers easily turn a raw Chinese sentence into the input fields the model expects (see the short sketch after the install command below).

Install paddle_serving_app:

[//file]:#pip_app.sh
```shell
pip install paddle_serving_app
```
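
With the package installed, the preprocessing step on its own looks roughly like this. This is a minimal sketch: the sample sentence is illustrative, and the returned keys are assumed to match the feed variables saved in Step 1:

``` python
from paddle_serving_app.reader import ChineseBertReader

# Turn a raw Chinese sentence into the BERT input fields
reader = ChineseBertReader({"max_seq_len": 20})
feed_dict = reader.process("今天天气不错")  # hypothetical sample sentence
print(sorted(feed_dict.keys()))
# assumed output: ['input_ids', 'input_mask', 'position_ids', 'segment_ids']
```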

#### Step 4: Access the service from the client

The client-side script bert_client.py is as follows:

[//file]:#bert_client.py
``` python
import sys
import numpy as np
from paddle_serving_client import Client
from paddle_serving_client.utils import benchmark_args
from paddle_serving_app.reader import ChineseBertReader

args = benchmark_args()

# Preprocessor that turns a raw Chinese sentence into BERT input fields
reader = ChineseBertReader({"max_seq_len": 128})
fetch = ["pooled_output"]
endpoint_list = ['127.0.0.1:9292']

client = Client()
client.load_client_config(args.model)
client.connect(endpoint_list)

for line in sys.stdin:
    feed_dict = reader.process(line)
    # Reshape each field to the (max_seq_len, 1) layout expected by the server
    for key in feed_dict.keys():
        feed_dict[key] = np.array(feed_dict[key]).reshape((128, 1))
    result = client.predict(feed=feed_dict, fetch=fetch, batch=False)
```

Run:

[//file]:#bert_10_cli.sh
```shell
cat data.txt | python bert_client.py --model bert_seq20_client/serving_client_conf.prototxt
```

This reads samples from data.txt and prints the resulting vectors to standard output.
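
To fetch a single sentence vector programmatically, the same pieces can be wired together without stdin. This is a sketch under the same assumptions; the sample sentence is illustrative, and `predict` is assumed to return a dict mapping each fetch name to a numpy array:

``` python
import numpy as np
from paddle_serving_client import Client
from paddle_serving_app.reader import ChineseBertReader

# Assumes the server from Step 2 is running on 127.0.0.1:9292
reader = ChineseBertReader({"max_seq_len": 128})
client = Client()
client.load_client_config("bert_seq20_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])

feed_dict = reader.process("今天天气不错")  # hypothetical sample sentence
for key in feed_dict.keys():
    feed_dict[key] = np.array(feed_dict[key]).reshape((128, 1))
result = client.predict(feed=feed_dict, fetch=["pooled_output"], batch=False)
# pooled_output of this BERT-base model should be a 768-dim sentence vector
print(result["pooled_output"].shape)
```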

### Benchmark

We tested the performance of Bert-As-Service built on Paddle Serving on V100 GPUs and compared it with a Bert-As-Service built on TensorFlow. From the user's point of view, both services were stress-tested with the same batch size and number of concurrent clients. The overall throughput measured on 4 V100 cards is shown below.

![4v100_bert_as_service_benchmark](4v100_bert_as_service_benchmark.png)

<!--
yum install -y libXext libSM libXrender
pip install paddlehub paddle_serving_server paddle_serving_client
sh pip_app.sh
python bert_10.py
sh server.sh &
wget https://paddle-serving.bj.bcebos.com/bert_example/data-c.txt --no-check-certificate
head -n 500 data-c.txt > data.txt
cat data.txt | python bert_client.py --model bert_seq20_client/serving_client_conf.prototxt
if [[ $? -eq 0 ]]; then
    echo "test success"
else
    echo "test fail"
fi
ps -ef | grep "paddle_serving_server" | grep -v grep | awk '{print $2}' | xargs kill
-->