# An End-to-end Tutorial from Training to Inference Service Deployment

([简体中文](./TRAIN_TO_SERVICE_CN.md)|English)

Paddle Serving is Paddle's high-performance online inference service framework, which can flexibly support the deployment of most models. In this article, the IMDB review sentiment analysis task is used as an example to show the entire process from model training to deployment of inference service through 9 steps.

## Step1: Prepare the Running Environment
Paddle Serving can be deployed on Linux distributions such as CentOS and Ubuntu. On other systems, or in environments where you do not want to install the serving module, you can still access the server-side prediction service over HTTP.

You can install the CPU or GPU version of the server module according to your requirements and machine environment, and install the client module on the client machine. When you access the server over HTTP, the client module is not required on the client machine.

```shell
pip install paddle_serving_server #cpu version server side 
pip install paddle_serving_server_gpu #gpu version server side
pip install paddle_serving_client #client version
```
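
To confirm which of these modules are available on a machine, a check like the following can help. It is a minimal sketch that only uses the Python standard library; the module names are taken from the pip commands above, and the helper name `installed_serving_modules` is made up for this example.

```python
import importlib.util

# Module names taken from the pip commands above.
SERVING_MODULES = (
    "paddle_serving_server",
    "paddle_serving_server_gpu",
    "paddle_serving_client",
)


def installed_serving_modules(names=SERVING_MODULES):
    """Return {module_name: True/False} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in installed_serving_modules().items():
        print("{}: {}".format(name, "installed" if present else "missing"))
```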

After this preparation, we take the IMDB review sentiment analysis task as an example to show the process from model training to prediction service deployment. All the code in the example can be found in the [IMDB example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb) of the Paddle Serving code base. The data and dictionary files used in the example can be obtained by executing the get_data.sh script in the IMDB example code.

## Step2: Determine Tasks and Raw Data Format

The IMDB review sentiment analysis task classifies movie reviews to determine whether each review is positive or negative.

First let's take a look at the raw data:
```
saw a trailer for this on another video, and decided to rent when it came out. boy, was i disappointed! the story is extremely boring, the acting (aside from christopher walken) is bad, and i couldn't care less about the characters, aside from really wanting to see nora's husband get thrashed. christopher walken's role is such a throw-away, what a tease! | 0
```

This is one sample of English review data. The sample uses `|` as the separator: the review text comes before the separator and the label comes after it. A label of 0 marks a negative review and 1 marks a positive review.
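
The format can be illustrated with a few lines of plain Python. This is only a sketch: the function name `parse_raw_sample` is made up for this example, and the sample string is a shortened version of the review above. Splitting at the last `|` keeps any `|` characters inside the review text intact.

```python
def parse_raw_sample(line):
    """Split one raw line into (review_text, label) at the last '|'."""
    text, _, label = line.rpartition("|")
    return text.strip(), int(label)


sample = "boy, was i disappointed! the story is extremely boring | 0"
text, label = parse_raw_sample(sample)
print(label, "->", "positive" if label == 1 else "negative")
```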

## Step3: Define Reader, divide training set and test set

The original text must be converted to numeric ids that the neural network can use. The imdb_reader.py script defines this text-to-id conversion: words are mapped to integers through the dictionary file imdb.vocab.

<details>
  <summary>imdb_reader.py</summary>

```python
import sys
import os
import paddle
import re
import paddle.fluid.incubate.data_generator as dg


class IMDBDataset(dg.MultiSlotDataGenerator):
    def load_resource(self, dictfile):
        self._vocab = {}
        wid = 0
        with open(dictfile) as f:
            for line in f:
                self._vocab[line.strip()] = wid
                wid += 1
        self._unk_id = len(self._vocab)
        self._pattern = re.compile(r'(;|,|\.|\?|!|\s|\(|\))')
        self.return_value = ("words", [1, 2, 3, 4, 5, 6]), ("label", [0])

    def get_words_only(self, line):
        sent = line.lower().replace("<br />", " ").strip()
        words = [x for x in self._pattern.split(sent) if x and x != " "]
        feas = [
            self._vocab[x] if x in self._vocab else self._unk_id for x in words
        ]
        return feas

    def get_words_and_label(self, line):
        send = '|'.join(line.split('|')[:-1]).lower().replace("<br />",
                                                              " ").strip()
        label = [int(line.split('|')[-1])]

        words = [x for x in self._pattern.split(send) if x and x != " "]
        feas = [
            self._vocab[x] if x in self._vocab else self._unk_id for x in words
        ]
        return feas, label

    def infer_reader(self, infer_filelist, batch, buf_size):
        def local_iter():
            for fname in infer_filelist:
                with open(fname, "r") as fin:
                    for line in fin:
                        feas, label = self.get_words_and_label(line)
                        yield feas, label

        import paddle
        batch_iter = paddle.batch(
            paddle.reader.shuffle(
                local_iter, buf_size=buf_size),
            batch_size=batch)
        return batch_iter

    def generate_sample(self, line):
        def memory_iter():
            for i in range(1000):
                yield self.return_value

        def data_iter():
            feas, label = self.get_words_and_label(line)
            yield ("words", feas), ("label", label)

        return data_iter
```
</details>

The sample after mapping is similar to the following format:

```
257 142 52 898 7 0 12899 1083 824 122 89527 134 6 65 47 48 904 89527 13 0 87 170 8 248 9 15 4 25 1365 4360 89527 702 89527 1 89527 240 3 28 89527 19 7 0 216 219 614 89527 0 84 89527 225 3 0 15 67 2356 89527 0 498 117 2 314 282 7 38 1097 89527 1 0 174 181 38 11 71 198 44 1 3110 89527 454 89527 34 37 89527 0 15 5912 80 2 9856 7748 89527 8 421 80 9 15 14 55 2218 12 4 45 6 58 25 89527 154 119 224 41 0 151 89527 871 89527 505 89527 501 89527 29 2 773 211 89527 54 307 90 0 893 89527 9 407 4 25 2 614 15 46 89527 89527 71 8 1356 35 89527 12 0 89527 89527 89 527 577 374 3 39091 22950 1 3771 48900 95 371 156 313 89527 37 154 296 4 25 2 217 169 3 2759 7 0 15 89527 0 714 580 11 2094 559 34 0 84 539 89527 1 0 330 355 3 0 15 15607 935 80 0 5369 3 0 622 89527 2 15 36 9 2291 2 7599 6968 2449 89527 1 454 37 256 2 211 113 0 480 218 1152 700 4 1684 1253 352 10 2449 89527 39 4 1819 129 1 316 462 29 0 12957 3 6 28 89527 13 0 457 8952 7 225 89527 8 2389 0 1514 89527 1
```

In this way, the text is transformed into integer ids that the neural network can use as input features for training.
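
The mapping can be tried out without Paddle. The sketch below reuses the delimiter pattern and the unknown-word rule from imdb_reader.py, but the function name `text_to_ids` and the four-entry vocabulary are assumptions made for illustration; the real vocabulary is loaded from imdb.vocab.

```python
import re

# Same delimiter pattern as imdb_reader.py; the capturing group keeps
# punctuation marks as separate tokens.
PATTERN = re.compile(r'(;|,|\.|\?|!|\s|\(|\))')


def text_to_ids(text, vocab):
    unk_id = len(vocab)  # out-of-vocabulary words share one extra id
    sent = text.lower().replace("<br />", " ").strip()
    words = [x for x in PATTERN.split(sent) if x and x != " "]
    return [vocab.get(w, unk_id) for w in words]


# Toy vocabulary for illustration only; the real one comes from imdb.vocab.
toy_vocab = {"the": 0, "story": 1, "is": 2, "!": 3}
print(text_to_ids("The story is boring!", toy_vocab))  # [0, 1, 2, 4, 3]
```

Here "boring" is not in the toy vocabulary, so it receives the unknown id 4, which equals the vocabulary size.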

## Step4: Define CNN network for training and saving

Next, we use a [CNN Model](https://www.paddlepaddle.org.cn/documentation/docs/zh/user_guides/nlp_case/understand_sentiment/README.cn.html#cnn) for training. The network structure is defined in nets.py.

<details>
  <summary>nets.py</summary>

```python
import sys
import time
import numpy as np

import paddle
import paddle.fluid as fluid

def cnn_net(data,
            label,
            dict_dim,
            emb_dim=128,
            hid_dim=128,
            hid_dim2=96,
            class_dim=2,
            win_size=3):
    """ conv net. """
    emb = fluid.layers.embedding(
        input=data, size=[dict_dim, emb_dim], is_sparse=True)

    conv_3 = fluid.nets.sequence_conv_pool(
        input=emb,
        num_filters=hid_dim,
        filter_size=win_size,
        act="tanh",
        pool_type="max")

    fc_1 = fluid.layers.fc(input=[conv_3], size=hid_dim2)

    prediction = fluid.layers.fc(input=[fc_1], size=class_dim, act="softmax")
    cost = fluid.layers.cross_entropy(input=prediction, label=label)
    avg_cost = fluid.layers.mean(x=cost)
    acc = fluid.layers.accuracy(input=prediction, label=label)

    return avg_cost, acc, prediction
```

</details>

Train the model on the training dataset. The training script is local_train.py. After training, use the paddle_serving_client.io.save_model function to save the model files and configuration files used for Serving deployment.

<details>
  <summary>local_train.py</summary>

```python
import os
import sys
import paddle
import logging
import paddle.fluid as fluid

logging.basicConfig(format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger("fluid")
logger.setLevel(logging.INFO)

# load dict file
def load_vocab(filename):
    vocab = {}
    with open(filename) as f:
        wid = 0
        for line in f:
            vocab[line.strip()] = wid
            wid += 1
    vocab["<unk>"] = len(vocab)
    return vocab


if __name__ == "__main__":
    from nets import cnn_net
    model_name = "imdb_cnn"
    vocab = load_vocab('imdb.vocab')
    dict_dim = len(vocab)

    # define model input
    data = fluid.layers.data(
        name="words", shape=[1], dtype="int64", lod_level=1)
    label = fluid.layers.data(name="label", shape=[1], dtype="int64")
    # define dataset; train_data is the dataset directory
    dataset = fluid.DatasetFactory().create_dataset()
    filelist = ["train_data/%s" % x for x in os.listdir("train_data")]
    dataset.set_use_var([data, label])
    pipe_command = "python imdb_reader.py"
    dataset.set_pipe_command(pipe_command)
    dataset.set_batch_size(4)
    dataset.set_filelist(filelist)
    dataset.set_thread(10)
    # define model
    avg_cost, acc, prediction = cnn_net(data, label, dict_dim)
    optimizer = fluid.optimizer.SGD(learning_rate=0.001)
    optimizer.minimize(avg_cost)
    # execute training
    exe = fluid.Executor(fluid.CPUPlace())
    exe.run(fluid.default_startup_program())
    epochs = 100

    import paddle_serving_client.io as serving_io

    for i in range(epochs):
        exe.train_from_dataset(
            program=fluid.default_main_program(), dataset=dataset, debug=False)
        logger.info("TRAIN --> pass: {}".format(i))
        if i == 64:
            # After the 65th pass (i == 64), use the save_model interface in
            # Paddle Serving to save the model and configuration files
            # required by Serving
            serving_io.save_model("{}_model".format(model_name),
                                  "{}_client_conf".format(model_name),
                                  {"words": data}, {"prediction": prediction},
                                  fluid.default_main_program())
```

</details>

![Training process](./imdb_loss.png)

As can be seen from the figure above, the loss of the model starts to converge after the 65th round, so we save the model and configuration files once the 65th round of training is completed. The saved files are placed in two folders, imdb_cnn_client_conf and imdb_cnn_model. The former contains the client-side configuration files, and the latter contains the server-side configuration files and the saved model files.

The parameter list of the save_model function is as follows:

| Parameter            | Meaning                                                      |
| -------------------- | ------------------------------------------------------------ |
| server_model_folder  | Directory for saving server-side configuration files and model files |
| client_config_folder | Directory for saving client configuration files              |
| feed_var_dict        | The inputs of the inference model, as a dict. Keys can be customized; each value is an input variable of the model, and each key corresponds to one variable. When using the prediction service, the input data uses the key as the input name. |
| fetch_var_dict       | The outputs of the inference model, as a dict. Keys can be customized; each value is an output variable of the model, and each key corresponds to one variable. When using the prediction service, use the key to get the returned data. |
| main_program         | The model's program                                          |
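
To make the role of the keys concrete, the sketch below checks a request against the key names passed to save_model in local_train.py. The helper `check_request` is hypothetical and written only for illustration; it is not part of Paddle Serving, and the real client may handle unknown names differently.

```python
def check_request(feed, fetch, feed_keys, fetch_keys):
    """Raise ValueError if a request uses names that were not saved."""
    unknown_feed = sorted(set(feed) - set(feed_keys))
    unknown_fetch = sorted(set(fetch) - set(fetch_keys))
    if unknown_feed or unknown_fetch:
        raise ValueError(
            "unknown feed keys {}, unknown fetch keys {}".format(
                unknown_feed, unknown_fetch))


# Keys used in the save_model call of local_train.py.
FEED_KEYS = ["words"]
FETCH_KEYS = ["prediction"]

# The ids below are arbitrary example values.
check_request({"words": [8, 233, 52]}, ["prediction"], FEED_KEYS, FETCH_KEYS)
```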

## Step5: Deploy RPC Prediction Service

The Paddle Serving framework supports two types of prediction service: one communicates through RPC and the other through HTTP. The deployment and use of the RPC prediction service are introduced first. The HTTP prediction service is introduced in Step 8.

```shell
python -m paddle_serving_server.serve --model imdb_cnn_model/ --port 9292 #cpu prediction service
python -m paddle_serving_server_gpu.serve --model imdb_cnn_model/ --port 9292 --gpu_ids 0 #gpu prediction service
```

In the commands above, --model specifies the directory of the previously saved server-side model and configuration files, and --port specifies the port of the prediction service. When deploying the prediction service with the GPU version, use --gpu_ids to specify which GPU to use.

After executing one of the above commands, the RPC prediction service deployment of the IMDB sentiment analysis task is completed.
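
A quick way to see whether something is listening on the service port is a plain TCP connection attempt. The sketch below uses only the standard library; the helper name `port_is_open` is made up for this example, and a successful connection only shows that the port is open, not that the model was loaded correctly.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9292 is the port passed via --port in the commands above.
    print("serving reachable:", port_is_open("127.0.0.1", 9292))
```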

## Step6: Reuse Reader, define remote RPC client
Below we access the RPC prediction service through Python code. The script is test_client.py.

<details>
  <summary>test_client.py</summary>

```python
from paddle_serving_client import Client
from imdb_reader import IMDBDataset
import sys

client = Client()
client.load_client_config(sys.argv[1])
client.connect(["127.0.0.1:9292"])

# The data preprocessing code is reused here to convert the original text into numeric ids
imdb_dataset = IMDBDataset()
imdb_dataset.load_resource(sys.argv[2])

for line in sys.stdin:
    word_ids, label = imdb_dataset.get_words_and_label(line)
    feed = {"words": word_ids}
    # only "prediction" was saved as a fetch variable in local_train.py
    fetch = ["prediction"]
    fetch_map = client.predict(feed=feed, fetch=fetch)
    print("{} {}".format(fetch_map["prediction"][1], label[0]))
```

</details>

The script reads samples from standard input and, for each sample, prints the predicted probability that its label is 1 together with its real label.

## Step7: Call the RPC service to test the model effect

The client implemented in the previous step can now be used to call the prediction service. Run it as follows:

```shell
cat test_data/part-0 | python test_client.py imdb_cnn_client_conf/serving_client_conf.prototxt imdb.vocab
```

Using the 2084 samples in the test_data/part-0 file for testing, the model prediction accuracy is 88.19%.
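
An accuracy figure of this kind can be computed from the client output, which prints one `<probability of label 1> <real label>` pair per line. The sketch below is an assumption about how to do that, not the script used for the number above: the function name `accuracy` and the 0.5 decision threshold are choices made for this example, and the demo lines are made up.

```python
def accuracy(lines, threshold=0.5):
    """Compute accuracy from lines of the form '<prob_positive> <label>'."""
    correct = total = 0
    for line in lines:
        if not line.strip():
            continue
        prob, label = line.split()
        # Assumed rule: predict label 1 when its probability exceeds the threshold.
        predicted = 1 if float(prob) > threshold else 0
        correct += int(predicted == int(label))
        total += 1
    return correct / total if total else 0.0


# Made-up output lines for illustration, not real results.
demo = ["0.91 1", "0.12 0", "0.55 0", "0.30 0"]
print("accuracy: {:.2%}".format(accuracy(demo)))
```

To apply it to real output, the lines printed by test_client.py could be passed in instead of `demo`, for example by reading them from `sys.stdin`.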

**Note**: Results vary slightly between training runs, so the prediction accuracy of your trained model should be close to the example but may not be exactly the same.

## Step8: Deploy HTTP Prediction Service

When using the HTTP prediction service, the client does not need to install any Paddle Serving modules; it only needs to be able to send HTTP requests. The trade-off is that the HTTP method spends more time in the communication phase than the RPC method.

For the IMDB sentiment analysis task, the original text needs to be preprocessed before prediction. In the RPC prediction service, we put the preprocessing in the client's script; in the HTTP prediction service, we put the preprocessing on the server. Paddle Serving's HTTP prediction service framework provides data pre-processing and post-processing interfaces for this situation. We only need to override them according to the needs of the task.
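
The idea of overriding a preprocessing hook can be shown with a toy class. Everything in this sketch is a stand-in written for illustration: `ToyWebService`, its `handle` and `predict` methods, and the placeholder "model" are assumptions and do not reflect how Paddle Serving's WebService is implemented.

```python
class ToyWebService(object):
    """Stand-in for illustration only; NOT Paddle Serving's WebService."""

    def preprocess(self, feed, fetch):
        # Default hook: pass the request through unchanged.
        return feed, fetch

    def postprocess(self, result):
        # Default hook: return the model output unchanged.
        return result

    def predict(self, feed, fetch):
        # Placeholder model: report how many tokens it received.
        return {name: len(feed["words"]) for name in fetch}

    def handle(self, feed, fetch):
        feed, fetch = self.preprocess(feed, fetch)
        return self.postprocess(self.predict(feed, fetch))


class ToyTextService(ToyWebService):
    def preprocess(self, feed, fetch):
        # Tokenize raw text on the server, so clients can send plain text.
        return {"words": feed["words"].lower().split()}, fetch


print(ToyTextService().handle({"words": "I am very sad"}, ["prediction"]))
```

Only the hook is replaced in the subclass; the request flow in `handle` stays the same, which is why the client can stay free of any preprocessing code.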

Serving provides sample code, which can be obtained by executing the imdb_web_service_demo.sh script in the [IMDB Example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb).

Let's take a look at the script text_classify_service.py that starts the HTTP prediction service.
<details>
  <summary>text_classify_service.py</summary>

```python
from paddle_serving_server.web_service import WebService
from imdb_reader import IMDBDataset
import sys

# extend class WebService
class IMDBService(WebService):
    def prepare_dict(self, args={}):
        if len(args) == 0:
            exit(-1)
        self.dataset = IMDBDataset()
        self.dataset.load_resource(args["dict_file_path"])

    # override preprocess() to implement data preprocessing;
    # here we reuse the reader script used for training
    def preprocess(self, feed={}, fetch=[]):
        if "words" not in feed:
            exit(-1)
        res_feed = {}
        res_feed["words"] = self.dataset.get_words_only(feed["words"])[0]
        return res_feed, fetch


# Use the name parameter to specify the name of the prediction service.
imdb_service = IMDBService(name="imdb")
imdb_service.load_model_config(sys.argv[1])
imdb_service.prepare_server(
    workdir=sys.argv[2], port=int(sys.argv[3]), device="cpu")
imdb_service.prepare_dict({"dict_file_path": sys.argv[4]})
imdb_service.run_server()
```
</details>

Run the script as follows:

```shell
python text_classify_service.py imdb_cnn_model/ workdir/ 9292 imdb.vocab
```

In the above command, the first parameter is the directory of the saved server-side model and configuration files. The second parameter is the working directory, where the prediction service stores some of its configuration files. This directory does not need to exist beforehand, because the prediction service creates it, but it must be specified. The third parameter is the port number, and the fourth parameter is the dictionary file.
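
The four positional parameters can also be written down as an argument parser. This is a hypothetical sketch: text_classify_service.py reads `sys.argv` directly, and the names `parse_service_args`, `model_dir`, `workdir`, `port`, and `dict_file` are chosen here only to label the positions.

```python
import argparse


def parse_service_args(argv):
    """Label the four positional arguments of the start command."""
    parser = argparse.ArgumentParser(description="HTTP prediction service arguments")
    parser.add_argument("model_dir", help="saved server-side model and configuration directory")
    parser.add_argument("workdir", help="working directory of the prediction service")
    parser.add_argument("port", type=int, help="port number of the HTTP service")
    parser.add_argument("dict_file", help="dictionary file, e.g. imdb.vocab")
    return parser.parse_args(argv)


args = parse_service_args(["imdb_cnn_model/", "workdir/", "9292", "imdb.vocab"])
print(args.model_dir, args.workdir, args.port, args.dict_file)
```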

## Step9: Call the prediction service with plaintext data
After starting the HTTP prediction service, you can make a prediction with a single command:

```
curl -H "Content-Type: application/json" -X POST -d '{"words": "i am very sad | 0", "fetch": ["prediction"]}' http://127.0.0.1:9292/imdb/prediction
```
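
The same request can be built from Python with the standard library. This is a sketch rather than an official client: the helper name `build_request` is made up, while the URL, header, and payload are copied from the curl command above. Sending is left commented out because it requires a running service.

```python
import json
import urllib.request


def build_request(text, url="http://127.0.0.1:9292/imdb/prediction"):
    """Build the same POST request as the curl command above."""
    payload = {"words": text, "fetch": ["prediction"]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST")


req = build_request("i am very sad | 0")
# Sending requires a running service, so it is left commented out here:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read().decode("utf-8")))
```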
When the inference process is normal, the prediction probability is returned, as shown below.

```
{"prediction": [0.5592559576034546,0.44074398279190063]}
```
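
The returned probabilities can be turned into a label with a few lines of Python. The sketch assumes that index 0 is the probability of label 0 (negative) and index 1 that of label 1 (positive), in line with the RPC client reading index 1 as the probability of label 1; the function name `interpret` is made up for this example.

```python
import json

# Assumed order: index 0 -> label 0 (negative), index 1 -> label 1 (positive).
LABELS = ("negative", "positive")


def interpret(response_text):
    """Return (label_name, probability) for the most likely class."""
    probs = json.loads(response_text)["prediction"]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


body = '{"prediction": [0.5592559576034546,0.44074398279190063]}'
print(interpret(body))
```

For the example response, the larger probability is at index 0, so the text "i am very sad" is classified as negative.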

**Note**: Results vary slightly between training runs, so the probability values inferred by your trained model may not match the example.