([简体中文](./README_CN.md)|English)

<p align="center">
    <br>
<img src='doc/serving_logo.png' width = "600" height = "130">
    <br>
</p>

<p align="center">
    <br>
    <a href="https://travis-ci.com/PaddlePaddle/Serving">
        <img alt="Build Status" src="https://img.shields.io/travis/com/PaddlePaddle/Serving/develop">
    </a>
    <img alt="Release" src="https://img.shields.io/badge/Release-0.0.3-yellowgreen">
    <img alt="Issues" src="https://img.shields.io/github/issues/PaddlePaddle/Serving">
    <img alt="License" src="https://img.shields.io/github/license/PaddlePaddle/Serving">
    <img alt="Slack" src="https://img.shields.io/badge/Join-Slack-green">
    <br>
</p>

<h2 align="center">Motivation</h2>

We believe that deploying deep learning inference services online will become a standard user-facing application. **The goal of this project**: once you have trained a deep neural network with [Paddle](https://github.com/PaddlePaddle/Paddle), you should also be able to deploy the model online with ease. A demo of Paddle Serving is shown below:
<p align="center">
    <img src="doc/demo.gif" width="700">
</p>

<h2 align="center">Installation</h2>

We **highly recommend** running Paddle Serving **in Docker**; please see [Run in Docker](https://github.com/PaddlePaddle/Serving/blob/develop/doc/RUN_IN_DOCKER.md). See the [Docker images document](doc/DOCKER_IMAGES.md) for more available images.
```shell
# Run CPU Docker
docker pull hub.baidubce.com/paddlepaddle/serving:latest
docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest
docker exec -it test bash
```
```shell
# Run GPU Docker
nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
nvidia-docker exec -it test bash
```

```shell
pip install paddle-serving-client==0.3.2
pip install paddle-serving-server==0.3.2 # CPU
pip install paddle-serving-server-gpu==0.3.2.post9 # GPU with CUDA9.0
pip install paddle-serving-server-gpu==0.3.2.post10 # GPU with CUDA10.0
```

You may need to use a domestic mirror source (in China, you can use the Tsinghua mirror: add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command) to speed up the download.

If you need modules compiled against the develop branch, please download packages from the [latest packages list](./doc/LATEST_PACKAGES.md) and install them with `pip install`.

The paddle-serving-server and paddle-serving-server-gpu packages support CentOS 6/7 and Ubuntu 16/18.

The paddle-serving-client and paddle-serving-app packages support Linux and Windows, but paddle-serving-client only supports Python 2.7/3.6/3.7.

We recommend installing paddle >= 1.8.2.
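As a quick sanity check, the version recommendation above can be verified from Python. A minimal sketch, assuming nothing beyond the standard library (the `version_at_least` helper is our own illustration, not part of Paddle Serving):

```python
# Illustrative sketch (not a Paddle Serving API): check that the installed
# paddle meets the recommended minimum version.
def version_at_least(version, minimum):
    # Compare dotted version strings numerically, e.g. "1.10.0" > "1.8.2".
    as_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return as_tuple(version) >= as_tuple(minimum)

# With paddle installed, the check would look like:
# import paddle
# assert version_at_least(paddle.__version__, "1.8.2"), "please upgrade paddle"
```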

<h2 align="center">Pre-built services with Paddle Serving</h2>

<h3 align="center">Latest release</h3>
<p align="center">
    <a href="https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/ocr">Optical Character Recognition</a>
    <br>
    <a href="https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/faster_rcnn_model">Object Detection</a>
    <br>
    <a href="https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/deeplabv3">Image Segmentation</a>
</p>

<h3 align="center">Chinese Word Segmentation</h3>

``` shell
> python -m paddle_serving_app.package --get_model lac
> tar -xzf lac.tar.gz
> python lac_web_service.py lac_model/ lac_workdir 9393 &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9393/lac/prediction
{"result":[{"word_seg":"我|爱|北京|天安门"}]}
```

<h3 align="center">Image Classification</h3>

<p align="center">
    <br>
<img src='https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg' width = "200" height = "200">
    <br>
</p>

``` shell
> python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
> tar -xzf resnet_v2_50_imagenet.tar.gz
> python resnet50_imagenet_classify.py resnet50_serving_model &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"image": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
{"result":{"label":["daisy"],"prob":[0.9341403245925903]}}
```


<h2 align="center">Quick Start Example</h2>

This quick start example is intended for users who already have a model to deploy, so we provide a ready-to-deploy model here. If you want to learn how to use Paddle Serving from offline training to online serving, please refer to [Train_To_Service](https://github.com/PaddlePaddle/Serving/blob/develop/doc/TRAIN_TO_SERVICE.md).

### Boston House Price Prediction model
``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
```

Paddle Serving provides HTTP- and RPC-based services for users to access.

### HTTP service

Paddle Serving provides a built-in Python module called `paddle_serving_server.serve` that can start an RPC service or an HTTP service with a one-line command. If we specify the argument `--name uci`, we get an HTTP service at the URL `$IP:$PORT/uci/prediction`.
``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci
```
<center>

| Argument | Type | Default | Description |
|--------------|------|-----------|--------------------------------|
| `thread` | int | `4` | Concurrency of current service |
| `port` | int | `9292` | Exposed port of current service to users |
| `name` | str | `""` | Service name, which can be used to generate the HTTP request URL |
| `model` | str | `""` | Path of the paddle model directory to be served |
| `mem_optim_off` | - | - | Disable memory / graphic memory optimization |
| `ir_optim` | - | - | Enable analysis and optimization of the computation graph |
| `use_mkl` (Only for CPU version) | - | - | Run inference with MKL |
| `use_trt` (Only for TensorRT version) | - | - | Run inference with TensorRT |

Here, we use `curl` to send an HTTP POST request to the service we just started. Users can use any Python library to send HTTP POST requests as well, e.g., [requests](https://requests.readthedocs.io/en/master/).
</center>

``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
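The same request can also be sent from Python. A minimal sketch using the `requests` library mentioned above; note that `build_prediction_request` is a hypothetical helper for illustration, not a Paddle Serving API:

```python
# Sketch: send the same prediction request from Python instead of curl.
# `build_prediction_request` is our own helper, not part of Paddle Serving.
import json

def build_prediction_request(x, fetch):
    # Mirror the JSON body used in the curl example above.
    return {"feed": [{"x": x}], "fetch": fetch}

payload = build_prediction_request(
    [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
     -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332],
    ["price"])
body = json.dumps(payload)

# With the uci service above running, the request would be:
# import requests
# r = requests.post("http://127.0.0.1:9292/uci/prediction",
#                   data=body, headers={"Content-Type": "application/json"})
# print(r.json())
```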

### RPC service

A user can also start an RPC service with `paddle_serving_server.serve`. An RPC service is usually faster than an HTTP service, although the user needs to do some coding against Paddle Serving's Python client API. Note that we do not specify `--name` here.
``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```

``` python
# A user can access the RPC service through the paddle_serving_client API
from paddle_serving_client import Client

client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": data}, fetch=["price"])
print(fetch_map)
```
Here, the `client.predict` function takes two arguments. `feed` is a Python dict mapping model input variable alias names to their values. `fetch` specifies the prediction variables to be returned by the server. In the example, the names `"x"` and `"price"` were assigned when the servable model was saved during training.

<h2 align="center">Some Key Features of Paddle Serving</h2>

- Integrates with the Paddle training pipeline seamlessly; most Paddle models can be deployed **with a one-line command**.
- **Industrial serving features** supported, such as model management, online loading, online A/B testing, etc.
- **Distributed Key-Value indexing** supported, which is especially useful for large-scale sparse features as model inputs.
- **Highly concurrent and efficient communication** between clients and servers.
- **Multiple programming languages** supported on the client side, such as Golang, C++ and Python.

<h2 align="center">Document</h2>

### New to Paddle Serving
- [How to save a servable model?](doc/SAVE.md)
- [An end-to-end tutorial from training to inference service deployment](doc/TRAIN_TO_SERVICE.md)
- [Write Bert-as-Service in 10 minutes](doc/BERT_10_MINS.md)

### Tutorials at AIStudio
- [Introduction to PaddleServing](https://aistudio.baidu.com/aistudio/projectdetail/605819)
- [Image Segmentation on Paddle Serving](https://aistudio.baidu.com/aistudio/projectdetail/457715)
- [Sentiment Analysis](https://aistudio.baidu.com/aistudio/projectdetail/509014)

### Developers
- [How to configure Serving native operators on the server side?](doc/SERVER_DAG.md)
- [How to develop a new Serving operator?](doc/NEW_OPERATOR.md)
- [How to develop a new Web Service?](doc/NEW_WEB_SERVICE.md)
- [Golang client](doc/IMDB_GO_CLIENT.md)
- [Compile from source code](doc/COMPILE.md)
- [Deploy Web Service with uWSGI](doc/UWSGI_DEPLOY.md)
- [Hot loading for model files](doc/HOT_LOADING_IN_SERVING.md)

### About Efficiency
- [How to profile Paddle Serving latency?](python/examples/util)
- [How to optimize performance?](doc/PERFORMANCE_OPTIM.md)
- [Deploy multiple services on one GPU (Chinese)](doc/MULTI_SERVICE_ON_ONE_GPU_CN.md)
- [CPU Benchmarks (Chinese)](doc/BENCHMARKING.md)
- [GPU Benchmarks (Chinese)](doc/GPU_BENCHMARKING.md)

### FAQ
- [FAQ (Chinese)](doc/FAQ.md)

### Design
- [Design Doc](doc/DESIGN_DOC.md)

<h2 align="center">Community</h2>

### Slack

To connect with other users and contributors, you are welcome to join our [Slack channel](https://paddleserving.slack.com/archives/CUBPKHKMJ).

### Contribution

If you want to contribute code to Paddle Serving, please refer to the [Contribution Guidelines](doc/CONTRIBUTE.md).

### Feedback

For any feedback or to report a bug, please propose a [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues).

### License

[Apache 2.0 License](https://github.com/PaddlePaddle/Serving/blob/develop/LICENSE)