([简体中文](./README_CN.md)|English)

<p align="center">
    <br>
<img src='doc/serving_logo.png' width = "600" height = "130">
    <br>
</p>

<p align="center">
    <br>
    <a href="https://travis-ci.com/PaddlePaddle/Serving">
        <img alt="Build Status" src="https://img.shields.io/travis/com/PaddlePaddle/Serving/develop">
    </a>
    <img alt="Release" src="https://img.shields.io/badge/Release-0.0.3-yellowgreen">
    <img alt="Issues" src="https://img.shields.io/github/issues/PaddlePaddle/Serving">
    <img alt="License" src="https://img.shields.io/github/license/PaddlePaddle/Serving">
    <img alt="Slack" src="https://img.shields.io/badge/Join-Slack-green">
    <br>
</p>

- [Motivation](./README.md#motivation)
- [AIStudio Tutorial](./README.md#aistudio-tutorial)
- [Installation](./README.md#installation)
- [Quick Start Example](./README.md#quick-start-example)
- [Document](./README.md#document)
- [Community](./README.md#community)

<h2 align="center">Motivation</h2>

We consider deploying deep learning inference services online to be a user-facing application of the future. **The goal of this project**: once you have trained a deep neural network with [Paddle](https://github.com/PaddlePaddle/Paddle), you can easily deploy the model online as a service. A demo of Paddle Serving is shown below:

- Any model trained with [PaddlePaddle](https://github.com/paddlepaddle/paddle) can be deployed with Paddle Serving directly, or after conversion through the [Model Conversion Interface](./doc/SAVE_CN.md).
- Support [Multi-model Pipeline Deployment](./doc/PIPELINE_SERVING.md), which provides both REST and RPC interfaces; see the [Pipeline examples](./python/examples/pipeline).
- Support the major model libraries of the Paddle ecosystem, such as [PaddleDetection](./python/examples/detection), [PaddleOCR](./python/examples/ocr) and [PaddleRec](https://github.com/PaddlePaddle/PaddleRec/tree/master/tools/recserving/movie_recommender).
- Provide a variety of pre- and post-processing utilities for the training and deployment stages, bridging the gap between AI developers and application developers; please refer to the [Serving Examples](./python/examples/).

<p align="center">
    <img src="doc/demo.gif" width="700">
</p>


<h2 align="center">AIStudio Tutorial</h2>

Here we provide a tutorial on AIStudio (Chinese version): [AIStudio教程-Paddle Serving服务化部署框架](https://aistudio.baidu.com/aistudio/projectdetail/1550674)

The tutorial covers:
<ul>
<li>Paddle Serving environment setup</li>
  <ul>
    <li>Running in Docker images</li>
    <li>pip install Paddle Serving</li>
  </ul>
<li>Quick experience of Paddle Serving</li>
<li>Advanced tutorial of model deployment</li>
  <ul>
    <li>Save/convert models for Paddle Serving</li>
    <li>Set up an online inference service</li>
  </ul>
<li>Paddle Serving examples</li>
  <ul>
    <li>Paddle Serving for detection</li>
    <li>Paddle Serving for OCR</li>
  </ul>
</ul>


<h2 align="center">Installation</h2>

We **highly recommend** running **Paddle Serving in Docker**; please visit [Run in Docker](doc/RUN_IN_DOCKER.md). See the [document](doc/DOCKER_IMAGES.md) for more Docker images.

**Attention:** the default GPU environment of paddlepaddle 2.0 is currently CUDA 10.2, so the GPU Docker sample code below is based on CUDA 10.2. We also provide Docker images and whl packages for other GPU environments; if you use a different environment, carefully check and select the appropriate version.

```shell
# Run CPU Docker
docker pull registry.baidubce.com/paddlepaddle/serving:0.5.0-devel
docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.5.0-devel
docker exec -it test bash
git clone https://github.com/PaddlePaddle/Serving
```
```shell
# Run GPU Docker
nvidia-docker pull registry.baidubce.com/paddlepaddle/serving:0.5.0-cuda10.2-cudnn8-devel
nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.5.0-cuda10.2-cudnn8-devel
nvidia-docker exec -it test bash
git clone https://github.com/PaddlePaddle/Serving
```

```shell
pip install paddle-serving-client==0.5.0
pip install paddle-serving-server==0.5.0 # CPU
pip install paddle-serving-app==0.3.0
pip install paddle-serving-server-gpu==0.5.0.post102 #GPU with CUDA10.2 + TensorRT7
# DO NOT RUN ALL COMMANDS! check your GPU env and select the right one
pip install paddle-serving-server-gpu==0.5.0.post9 # GPU with CUDA9.0
pip install paddle-serving-server-gpu==0.5.0.post10 # GPU with CUDA10.0
pip install paddle-serving-server-gpu==0.5.0.post101 # GPU with CUDA10.1 + TensorRT6
pip install paddle-serving-server-gpu==0.5.0.post11 # GPU with CUDA11 + TensorRT7
```

You may need a domestic mirror source to speed up the download (in China, you can use the Tsinghua mirror by adding `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command).

If you need modules compiled from the develop branch, please download packages from the [latest packages list](./doc/LATEST_PACKAGES.md) and install them with `pip install`. If you want to compile by yourself, please refer to [How to compile Paddle Serving?](./doc/COMPILE.md)

The paddle-serving-server and paddle-serving-server-gpu packages support CentOS 6/7, Ubuntu 16/18 and Windows 10.

The paddle-serving-client and paddle-serving-app packages support Linux and Windows, but paddle-serving-client only supports Python 2.7/3.5/3.6/3.7/3.8.

It is recommended to install paddle >= 2.0.0:

```shell
# CPU users, please run
pip install paddlepaddle==2.0.0

# GPU users with CUDA 10.2, please run
pip install paddlepaddle-gpu==2.0.0
```

For **Windows Users**, please read the document [Paddle Serving for Windows Users](./doc/WINDOWS_TUTORIAL.md)

<h2 align="center">Quick Start Example</h2>

This quick start example is mainly for users who already have a model to deploy, and we also provide a ready-made model that can be used. If you want to know how to complete the whole process from offline training to online serving, please refer to the AIStudio tutorial above.

### Boston House Price Prediction model

Enter the Serving git directory and change to the `fit_a_line` example:
``` shell
cd Serving/python/examples/fit_a_line
sh get_data.sh
```

Paddle Serving provides both HTTP- and RPC-based services for users to access.

### RPC service

A user can start an RPC service with `paddle_serving_server.serve`. The RPC service is usually faster than the HTTP service, although the user needs to do some coding with Paddle Serving's Python client API. Note that we do not specify `--name` here.
``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
<center>

| Argument | Type | Default | Description |
|--------------|------|-----------|--------------------------------|
| `thread` | int | `4` | Concurrency of current service |
| `port` | int | `9292` | Exposed port of current service to users|
| `model` | str | `""` | Path of paddle model directory to be served |
| `mem_optim_off` | - | - | Disable memory / graphic memory optimization |
| `ir_optim` | - | - | Enable analysis and optimization of calculation graph |
| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
| `use_trt` (Only for trt version) | - | - | Run inference with TensorRT  |
| `use_lite` (Only for ARM) | - | - | Run PaddleLite inference |
| `use_xpu` (Only for ARM+XPU) | - | - | Run PaddleLite XPU inference |

</center>

```python
# A user can visit rpc service through paddle_serving_client API
from paddle_serving_client import Client
import numpy as np
client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
print(fetch_map)
```
Here, the `client.predict` function takes two arguments. `feed` is a Python dict mapping the model's input variable alias names to values, and `fetch` names the prediction variables to be returned by the server. In this example, the names `"x"` and `"price"` were assigned when the servable model was saved during training.
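The `reshape(1, 13, 1)` in the client snippet above is what gives the feed its expected layout: one sample in the batch, 13 features, each of width 1. A quick standalone check with NumPy (no running server needed):

```python
import numpy as np

# The 13 normalized feature values from the client snippet above
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]

# Shape (batch, features, 1), matching the feed passed to client.predict
x = np.array(data).reshape(1, 13, 1)
print(x.shape)  # (1, 13, 1)
```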


### Web service

Users can also put the data format processing logic on the server side, so that clients can access the service directly with curl; see the following case under `python/examples/fit_a_line`.

```shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci
```
On the client side:
```shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
The response is:
```
{"result":{"price":[[18.901151657104492]]}}
```
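The same request can also be issued from Python instead of curl. A minimal sketch: the payload mirrors the curl body above, and the parsing step assumes the response shape just shown; the commented-out `requests.post` call (third-party `requests` library) is illustrative and requires the `uci` service above to be running.

```python
import json

# Same body as the curl example above
payload = {
    "feed": [{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                    -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}],
    "fetch": ["price"],
}
body = json.dumps(payload)

# With the server running, the POST could be made with the third-party
# `requests` library:
#   resp = requests.post("http://127.0.0.1:9292/uci/prediction",
#                        headers={"Content-Type": "application/json"}, data=body)
#   response = resp.text

# Parsing the response shown above
response = '{"result":{"price":[[18.901151657104492]]}}'
price = json.loads(response)["result"]["price"][0][0]
print(round(price, 3))  # → 18.901
```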

<h2 align="center">Some Key Features of Paddle Serving</h2>

- Integrates seamlessly with the Paddle training pipeline; most Paddle models can be deployed **with a single command**.
- **Industrial serving features** supported, such as model management, online loading and online A/B testing.
- **Distributed key-value indexing** supported, which is especially useful for large-scale sparse features as model inputs.
- **Highly concurrent and efficient communication** between clients and servers.
- **Multiple programming languages** supported on the client side, such as Golang, C++ and Python.

<h2 align="center">Document</h2>

### New to Paddle Serving
- [How to save a servable model?](doc/SAVE.md)
- [Write Bert-as-Service in 10 minutes](doc/BERT_10_MINS.md)
- [Paddle Serving Examples](python/examples)

### Developers
- [How to develop a new Web Service?](doc/NEW_WEB_SERVICE.md)
- [Compile from source code](doc/COMPILE.md)
- [Develop Pipeline Serving](doc/PIPELINE_SERVING.md)
- [Deploy Web Service with uWSGI](doc/UWSGI_DEPLOY.md)
- [Hot loading for model file](doc/HOT_LOADING_IN_SERVING.md)

### About Efficiency
- [How to profile Paddle Serving latency?](python/examples/util)
- [How to optimize performance?](doc/PERFORMANCE_OPTIM.md)
- [Deploy multi-services on one GPU(Chinese)](doc/MULTI_SERVICE_ON_ONE_GPU_CN.md)
- [CPU Benchmarks(Chinese)](doc/BENCHMARKING.md)
- [GPU Benchmarks(Chinese)](doc/GPU_BENCHMARKING.md)

### Design
- [Design Doc](doc/DESIGN_DOC.md)

### FAQ
- [FAQ(Chinese)](doc/FAQ.md)

<h2 align="center">Community</h2>

### Slack

To connect with other users and contributors, you are welcome to join our [Slack channel](https://paddleserving.slack.com/archives/CUBPKHKMJ).

### Contribution

If you want to contribute code to Paddle Serving, please refer to the [Contribution Guidelines](doc/CONTRIBUTE.md).

- Special thanks to [@BeyondYourself](https://github.com/BeyondYourself) for complementing the gRPC tutorial, updating the FAQ doc and fixing the mkdir command
- Special thanks to [@mcl-stone](https://github.com/mcl-stone) for updating the faster_rcnn benchmark
- Special thanks to [@cg82616424](https://github.com/cg82616424) for updating the unet benchmark and fixing a resize comment error

### Feedback

For any feedback or to report a bug, please propose a [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues).

### License

[Apache 2.0 License](https://github.com/PaddlePaddle/Serving/blob/develop/LICENSE)