([简体中文](./README_CN.md)|English)

<p align="center">
    <br>
<img src='doc/serving_logo.png' width = "600" height = "130">
    <br>
</p>

<p align="center">
    <br>
    <a href="https://travis-ci.com/PaddlePaddle/Serving">
        <img alt="Build Status" src="https://img.shields.io/travis/com/PaddlePaddle/Serving/develop">
    </a>
    <img alt="Release" src="https://img.shields.io/badge/Release-0.0.3-yellowgreen">
    <img alt="Issues" src="https://img.shields.io/github/issues/PaddlePaddle/Serving">
    <img alt="License" src="https://img.shields.io/github/license/PaddlePaddle/Serving">
    <img alt="Slack" src="https://img.shields.io/badge/Join-Slack-green">
    <br>
</p>

<h2 align="center">Motivation</h2>

We consider deploying a deep learning inference service online to be a user-facing application in the future. **The goal of this project**: once you have trained a deep neural net with [Paddle](https://github.com/PaddlePaddle/Paddle), you should also be able to deploy the model online easily. A demo of Paddle Serving is as follows:
<p align="center">
    <img src="doc/demo.gif" width="700">
</p>


<h2 align="center">Installation</h2>

We **highly recommend** that you **run Paddle Serving in Docker**; please visit [Run in Docker](https://github.com/PaddlePaddle/Serving/blob/develop/doc/RUN_IN_DOCKER.md). See the [document](doc/DOCKER_IMAGES.md) for more Docker images.
```
# Run CPU Docker
docker pull hub.baidubce.com/paddlepaddle/serving:latest
docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest
docker exec -it test bash
```
```
# Run GPU Docker
nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
nvidia-docker exec -it test bash
```

```shell
pip install paddle-serving-client==0.4.0 
pip install paddle-serving-server==0.4.0 # CPU
pip install paddle-serving-app==0.2.0
pip install paddle-serving-server-gpu==0.4.0.post9 # GPU with CUDA9.0
pip install paddle-serving-server-gpu==0.4.0.post10 # GPU with CUDA10.0
pip install paddle-serving-server-gpu==0.4.0.100 # GPU with CUDA10.1+TensorRT
```

You may need to use a domestic mirror source (in China, you can use the Tsinghua mirror source by adding `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command) to speed up the download.
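
For example, installing the CPU client and server packages through the Tsinghua mirror:

```shell
# Add the mirror index to any of the pip commands above
pip install paddle-serving-client==0.4.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install paddle-serving-server==0.4.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
```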

If you need to install modules compiled from the develop branch, please download the packages from the [latest packages list](./doc/LATEST_PACKAGES.md) and install them with the `pip install` command.

The `paddle-serving-server` and `paddle-serving-server-gpu` packages support CentOS 6/7, Ubuntu 16/18, and Windows 10.

The `paddle-serving-client` and `paddle-serving-app` packages support Linux and Windows, but `paddle-serving-client` only supports Python 2.7/3.5/3.6/3.7.

Installing paddle >= 1.8.4 is recommended.
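
For example, a minimal sketch (assuming `paddlepaddle` as the pip package name of Paddle itself; use `paddlepaddle-gpu` on GPU machines):

```shell
# Install the CPU build of Paddle
pip install "paddlepaddle>=1.8.4"
```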

For **Windows users**, please read the document [Paddle Serving for Windows Users](./doc/WINDOWS_TUTORIAL.md).

<h2 align="center">Pre-built services with Paddle Serving</h2>

<h3 align="center">Chinese Word Segmentation</h3>

``` shell
> python -m paddle_serving_app.package --get_model lac
> tar -xzf lac.tar.gz
> python lac_web_service.py lac_model/ lac_workdir 9393 &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9393/lac/prediction
{"result":[{"word_seg":"我|爱|北京|天安门"}]}
```

<h3 align="center">Image Classification</h3>

<p align="center">
    <br>
<img src='https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg' width = "200" height = "200">
    <br>
</p>

``` shell
> python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
> tar -xzf resnet_v2_50_imagenet.tar.gz
> python resnet50_imagenet_classify.py resnet50_serving_model &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"image": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
{"result":{"label":["daisy"],"prob":[0.9341403245925903]}}
```


<h2 align="center">Quick Start Example</h2>

This quick start example is only for users who already have a model to deploy; we provide a ready-to-deploy model here. If you want to learn how to use Paddle Serving from offline training to online serving, please refer to [Train_To_Service](https://github.com/PaddlePaddle/Serving/blob/develop/doc/TRAIN_TO_SERVICE.md).

### Boston House Price Prediction model
``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
```

Paddle Serving provides both HTTP- and RPC-based services for users to access.

### RPC service

A user can start an RPC service with `paddle_serving_server.serve`. An RPC service is usually faster than an HTTP service, although the user needs to do some coding with Paddle Serving's Python client API. Note that we do not specify `--name` here.
``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
<center>

| Argument | Type | Default | Description |
|--------------|------|-----------|--------------------------------|
| `thread` | int | `4` | Concurrency of current service |
| `port` | int | `9292` | Exposed port of current service to users|
| `model` | str | `""` | Path of paddle model directory to be served |
| `mem_optim_off` | - | - | Disable memory / graphic memory optimization |
| `ir_optim` | - | - | Enable analysis and optimization of calculation graph |
| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
| `use_trt` (Only for trt version) | - | - | Run inference with TensorRT |

</center>
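
The flags above can be combined; the following is a sketch using the flag names from the table (the values here are arbitrary):

```shell
# Serve the model on port 9393 with 16 worker threads,
# graph optimization enabled and memory optimization disabled
python -m paddle_serving_server.serve --model uci_housing_model \
    --thread 16 --port 9393 --ir_optim --mem_optim_off
```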

```python
# A user can access the RPC service through the paddle_serving_client API
from paddle_serving_client import Client
import numpy as np
client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
print(fetch_map)
```
Here, the `client.predict` function has two arguments. `feed` is a Python dict mapping model input variable alias names to values. `fetch` specifies the prediction variables to be returned from the server. In the example, the names `"x"` and `"price"` were assigned when the servable model was saved during training.


### Web service

Users can also put the data-format processing logic on the server side, so that they can access the service directly with curl. Refer to the following example, located at `python/examples/fit_a_line`.

```python
from paddle_serving_server.web_service import WebService
import numpy as np

class UciService(WebService):
    def preprocess(self, feed=[], fetch=[]):
        # Pack the list of feed dicts into one batched float32 ndarray
        is_batch = True
        new_data = np.zeros((len(feed), 1, 13)).astype("float32")
        for i, ins in enumerate(feed):
            nums = np.array(ins["x"]).reshape(1, 1, 13)
            new_data[i] = nums
        feed = {"x": new_data}
        return feed, fetch, is_batch

uci_service = UciService(name="uci")
uci_service.load_model_config("uci_housing_model")
uci_service.prepare_server(workdir="workdir", port=9292)
uci_service.run_rpc_service()
uci_service.run_web_service()
```
For the client side:
```
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
The response is:
```
{"result":{"price":[[18.901151657104492]]}}
```
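
The same request can also be sent from Python; below is a minimal sketch using the third-party `requests` library (not part of Paddle Serving), mirroring the curl payload above:

```python
import requests

# Same payload as the curl command above
data = {
    "feed": [{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                    -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}],
    "fetch": ["price"],
}
resp = requests.post("http://127.0.0.1:9292/uci/prediction", json=data)
print(resp.json())  # expected: {"result": {"price": [[18.901151657104492]]}}
```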

<h2 align="center">Some Key Features of Paddle Serving</h2>

- Integrates with the Paddle training pipeline seamlessly; most Paddle models can be deployed **with one command**.
- **Industrial serving features** such as model management, online loading, and online A/B testing.
- **Distributed key-value indexing**, which is especially useful for large-scale sparse features as model inputs.
- **Highly concurrent and efficient communication** between clients and servers.
- **Multiple programming languages** supported on the client side, such as Golang, C++, and Python.

<h2 align="center">Document</h2>

### New to Paddle Serving
- [How to save a servable model?](doc/SAVE.md)
- [An End-to-end tutorial from training to inference service deployment](doc/TRAIN_TO_SERVICE.md)
- [Write Bert-as-Service in 10 minutes](doc/BERT_10_MINS.md)

### Tutorial at AIStudio
- [Introduction to PaddleServing](https://aistudio.baidu.com/aistudio/projectdetail/605819)
- [Image Segmentation on Paddle Serving](https://aistudio.baidu.com/aistudio/projectdetail/457715)
- [Sentiment Analysis](https://aistudio.baidu.com/aistudio/projectdetail/509014)

### Developers
- [How to configure Serving native operators on the server side?](doc/SERVER_DAG.md)
- [How to develop a new Serving operator?](doc/NEW_OPERATOR.md)
- [How to develop a new Web Service?](doc/NEW_WEB_SERVICE.md)
- [Golang client](doc/IMDB_GO_CLIENT.md)
- [Compile from source code](doc/COMPILE.md)
- [Deploy Web Service with uWSGI](doc/UWSGI_DEPLOY.md)
- [Hot loading for model file](doc/HOT_LOADING_IN_SERVING.md)

### About Efficiency
- [How to profile Paddle Serving latency?](python/examples/util)
- [How to optimize performance?](doc/PERFORMANCE_OPTIM.md)
- [Deploy multiple services on one GPU (Chinese)](doc/MULTI_SERVICE_ON_ONE_GPU_CN.md)
- [CPU Benchmarks (Chinese)](doc/BENCHMARKING.md)
- [GPU Benchmarks (Chinese)](doc/GPU_BENCHMARKING.md)

### FAQ
- [FAQ (Chinese)](doc/FAQ.md)


### Design
- [Design Doc](doc/DESIGN_DOC.md)

<h2 align="center">Community</h2>


### Slack

To connect with other users and contributors, you are welcome to join our [Slack channel](https://paddleserving.slack.com/archives/CUBPKHKMJ).

### Contribution

If you want to contribute code to Paddle Serving, please refer to the [Contribution Guidelines](doc/CONTRIBUTE.md).

- Special thanks to [@BeyondYourself](https://github.com/BeyondYourself) for complementing the gRPC tutorial, updating the FAQ doc, and fixing the mkdir command
- Special thanks to [@mcl-stone](https://github.com/mcl-stone) for updating the faster_rcnn benchmark
- Special thanks to [@cg82616424](https://github.com/cg82616424) for updating the unet benchmark and fixing the resize comment error

### Feedback

For any feedback or to report a bug, please open a [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues).

### License

[Apache 2.0 License](https://github.com/PaddlePaddle/Serving/blob/develop/LICENSE)