## IMDB comment sentiment inference service

([简体中文](./README_CN.md)|English)

### Get model files and sample data

```
sh get_data.sh
```
The downloaded package contains the CNN, LSTM, and BOW model configurations, along with their test data and training data.

### Start RPC inference service

```
python -m paddle_serving_server.serve --model imdb_cnn_model/ --port 9292
```
### RPC Infer
```
head test_data/part-0 | python test_client.py imdb_cnn_client_conf/serving_client_conf.prototxt imdb.vocab
```

This prints prediction results for the first 10 test cases.
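Before a request is sent, the client maps each comment to word ids via `imdb.vocab` (the real preprocessing lives in `test_client.py`). A minimal, self-contained sketch of that lookup, assuming a one-token-per-line vocab file where the line number is the word id:

```
# Minimal sketch of the comment-to-ids step; the tiny vocab below is
# hypothetical -- the real file is imdb.vocab, one token per line.
UNK = "<unk>"

def load_vocab(lines):
    # Line number doubles as the word id.
    return {word.strip(): idx for idx, word in enumerate(lines)}

def words_to_ids(comment, vocab):
    unk_id = vocab[UNK]
    return [vocab.get(w, unk_id) for w in comment.lower().split()]

vocab = load_vocab(["i", "am", "very", "sad", "<unk>"])
ids = words_to_ids("i am very happy", vocab)  # "happy" falls back to <unk>
```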

### Start HTTP inference service
```
python text_classify_service.py imdb_cnn_model/ workdir/ 9292 imdb.vocab
```
### HTTP Infer

```
curl -H "Content-Type:application/json" -X POST -d '{"words": "i am very sad | 0", "fetch":["prediction"]}' http://127.0.0.1:9292/imdb/prediction
```
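The same request can be issued from Python. A sketch using the third-party `requests` package; the helper names are hypothetical, and the HTTP service must already be running on port 9292:

```
import json

# Hypothetical helpers mirroring the curl call above.
SERVICE_URL = "http://127.0.0.1:9292/imdb/prediction"

def build_payload(comment, label="0"):
    # The service expects the raw text and a label joined by " | ".
    return {"words": "{} | {}".format(comment, label), "fetch": ["prediction"]}

def query(comment):
    import requests  # third-party dependency
    resp = requests.post(SERVICE_URL,
                         headers={"Content-Type": "application/json"},
                         data=json.dumps(build_payload(comment)))
    return resp.json()

payload = build_payload("i am very sad")
# query("i am very sad") would POST the payload and return the parsed response.
```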

### Benchmark

CPU: Intel(R) Xeon(R) Gold 6271 CPU @ 2.60GHz * 48

Model: [CNN](https://github.com/PaddlePaddle/Serving/blob/develop/python/examples/imdb/nets.py)

Server thread num: 16

In this test, the client sends 25,000 test samples in total. The bar chart below shows per-thread latency in seconds, from which we can see that multi-threaded prediction improves efficiency greatly over a single thread: 16 client threads achieve an 8.7x speedup.

| client  thread num | prepro | client infer | op0    | op1   | op2    | postpro | total |
| ------------------ | ------ | ------------ | ------ | ----- | ------ | ------- | ----- |
| 1                  | 1.09   | 28.79        | 0.094  | 20.59 | 0.047  | 0.034   | 31.41 |
| 4                  | 0.22   | 7.41         | 0.023  | 5.01  | 0.011  | 0.0098  | 8.01  |
| 8                  | 0.11   | 4.7          | 0.012  | 2.61  | 0.0062 | 0.0049  | 5.01  |
| 12                 | 0.081  | 4.69         | 0.0078 | 1.72  | 0.0042 | 0.0035  | 4.91  |
| 16                 | 0.058  | 3.46         | 0.0061 | 1.32  | 0.0033 | 0.003   | 3.63  |
| 20                 | 0.049  | 3.77         | 0.0047 | 1.03  | 0.0025 | 0.0022  | 3.91  |
| 24                 | 0.041  | 3.86         | 0.0039 | 0.85  | 0.002  | 0.0017  | 3.98  |
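The 8.7x figure follows directly from the `total` column of the table; a quick check:

```
# Total latency (seconds) per client thread count, copied from the table above.
total_latency = {1: 31.41, 4: 8.01, 8: 5.01, 12: 4.91, 16: 3.63, 20: 3.91, 24: 3.98}

# Speedup relative to the single-threaded run; 16 threads give ~8.7x.
speedup = {n: round(total_latency[1] / t, 1) for n, t in total_latency.items()}
```

Note that speedup peaks at 16 threads and then regresses slightly at 20 and 24 threads.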

The thread-latency bar chart is as follows:

![total cost](../../../doc/imdb-benchmark-server-16.png)