Unverified commit 0196482e, authored by Jiawei Wang, committed by GitHub

Merge pull request #931 from wangjiawei04/v0.4.0

V0.4.0 930 923 925
@@ -45,10 +45,11 @@ nvidia-docker exec -it test bash
```
```shell
pip install paddle-serving-client==0.4.0
pip install paddle-serving-server==0.4.0 # CPU
pip install paddle-serving-server-gpu==0.4.0.post9 # GPU with CUDA9.0
pip install paddle-serving-server-gpu==0.4.0.post10 # GPU with CUDA10.0
pip install paddle-serving-server-gpu==0.4.0.trt # GPU with CUDA10.1+TensorRT
```
You may need to use a domestic mirror source (in China, you can use the Tsinghua mirror source; add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command) to speed up the download.
@@ -57,7 +58,7 @@ If you need install modules compiled with develop branch, please download packag
Packages of paddle-serving-server and paddle-serving-server-gpu support Centos 6/7, Ubuntu 16/18, and Windows 10.
Packages of paddle-serving-client and paddle-serving-app support Linux and Windows, but paddle-serving-client only supports python2.7/3.5/3.6/3.7.
It is recommended to install paddle >= 1.8.4.
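As a quick sanity check that the paddle requirement above is met, you can print the installed version from Python (a minimal sketch; it assumes `paddle` is already installed in the current environment):

```python
# Sketch: verify the installed Paddle meets the recommended minimum version.
import paddle
print(paddle.__version__)  # expect 1.8.4 or newer
```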
@@ -113,11 +114,11 @@ tar -xzf uci_housing.tar.gz
Paddle Serving provides HTTP and RPC based services for users to access.

### RPC service

A user can start an RPC service with `paddle_serving_server.serve`. An RPC service is usually faster than an HTTP service, although the user needs to do some coding against Paddle Serving's python client API. Note that we do not specify `--name` here.

``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
<center>

@@ -125,39 +126,24 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po
|--------------|------|-----------|--------------------------------|
| `thread` | int | `4` | Concurrency of current service |
| `port` | int | `9292` | Exposed port of current service to users |
| `model` | str | `""` | Path of paddle model directory to be served |
| `mem_optim_off` | - | - | Disable memory / graphic memory optimization |
| `ir_optim` | - | - | Enable analysis and optimization of calculation graph |
| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
| `use_trt` (Only for trt version) | - | - | Run inference with TensorRT |

</center>
``` python
# A user can visit rpc service through paddle_serving_client API
from paddle_serving_client import Client
import numpy as np

client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1, 13, 1)}, fetch=["price"])
print(fetch_map)
```
Here, the `client.predict` function has two arguments. `feed` is a `python dict` with model input variable alias names and values. `fetch` assigns the prediction variables to be returned from the server. In the example, the names `"x"` and `"price"` were assigned when the servable model was saved during training.
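For reference, the returned `fetch_map` is a `python dict` keyed by the fetch alias names; below is a minimal, hypothetical sketch of reading the prediction back out of it, assuming the call above succeeded:

```python
# Hypothetical continuation of the snippet above: each fetched alias maps to
# a numpy array, so the predicted price of the single input sample can be
# read out directly.
price = fetch_map["price"]
print("predicted price:", float(price[0][0]))
```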
@@ -169,6 +155,40 @@ Here, `client.predict` function has two arguments. `feed` is a `python dict` wit
- **Highly concurrent and efficient communication** between clients and servers supported.
- **Multiple programming languages** supported on client side, such as Golang, C++ and python.
### WEB service

Users can also put the data format processing logic on the server side, so that they can use curl to access the service directly. Refer to the following case, located at `python/examples/fit_a_line`.
```python
from paddle_serving_server.web_service import WebService
import numpy as np


class UciService(WebService):
    def preprocess(self, feed=[], fetch=[]):
        # Batch the incoming feed dicts into a single (batch, 1, 13) array.
        feed_batch = []
        is_batch = True
        new_data = np.zeros((len(feed), 1, 13)).astype("float32")
        for i, ins in enumerate(feed):
            nums = np.array(ins["x"]).reshape(1, 1, 13)
            new_data[i] = nums
        feed = {"x": new_data}
        return feed, fetch, is_batch


uci_service = UciService(name="uci")
uci_service.load_model_config("uci_housing_model")
uci_service.prepare_server(workdir="workdir", port=9292)
uci_service.run_rpc_service()
uci_service.run_web_service()
```
For the client side:
```
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
The response is:
```
{"result":{"price":[[18.901151657104492]]}}
```
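If you prefer Python over `curl`, the same HTTP request can be sent with the [requests](https://requests.readthedocs.io/en/master/) library; the sketch below simply mirrors the curl command above and assumes the web service is listening on port 9292:

```python
# Sketch: call the uci web service with the requests library instead of curl.
import requests

payload = {
    "feed": [{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                    -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}],
    "fetch": ["price"],
}
resp = requests.post("http://127.0.0.1:9292/uci/prediction", json=payload)
print(resp.json())  # e.g. {"result": {"price": [[18.901...]]}}
```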
<h2 align="center">Document</h2>

### New to Paddle Serving
......
@@ -47,10 +47,11 @@ nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/se
nvidia-docker exec -it test bash
```
```shell
pip install paddle-serving-client==0.4.0
pip install paddle-serving-server==0.4.0 # CPU
pip install paddle-serving-server-gpu==0.4.0.post9 # GPU with CUDA9.0
pip install paddle-serving-server-gpu==0.4.0.post10 # GPU with CUDA10.0
pip install paddle-serving-server-gpu==0.4.0.trt # GPU with CUDA10.1+TensorRT
```
You may need to use a domestic mirror source (for example, the Tsinghua source; add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command) to speed up the download.
@@ -107,13 +108,12 @@ tar -xzf uci_housing.tar.gz
Paddle Serving provides HTTP and RPC based services for users to access.

<h3 align="center">RPC Service</h3>

A user can start an RPC service with `paddle_serving_server.serve`. Although the user needs to do some development against Paddle Serving's python client API, the RPC service is usually faster than the HTTP service. Note that we do not specify `--name` here.

``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
@@ -128,21 +128,10 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po
| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
| `use_trt` (Only for trt version) | - | - | Run inference with TensorRT |

We use the `curl` command to send an HTTP POST request to the service we just started. Users can also use a python library such as [requests](https://requests.readthedocs.io/en/master/) to send HTTP POST requests.

</center>
``` python
# A user can visit rpc service through paddle_serving_client API
from paddle_serving_client import Client
@@ -152,12 +141,45 @@ client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1, 13, 1)}, fetch=["price"])
print(fetch_map)
```
Here, the `client.predict` function has two arguments. `feed` is a `python dict` with model input variable alias names and values. `fetch` specifies the prediction variables to be returned from the server. In this example, the tensor names `"x"` and `"price"` were assigned when the servable model was saved during training.
<h3 align="center">HTTP Service</h3>

Users can also put the data format processing logic on the server side, so that they can use curl to access the service directly. Refer to the following case, located at `python/examples/fit_a_line`.
```python
from paddle_serving_server.web_service import WebService
import numpy as np


class UciService(WebService):
    def preprocess(self, feed=[], fetch=[]):
        # Batch the incoming feed dicts into a single (batch, 1, 13) array.
        feed_batch = []
        is_batch = True
        new_data = np.zeros((len(feed), 1, 13)).astype("float32")
        for i, ins in enumerate(feed):
            nums = np.array(ins["x"]).reshape(1, 1, 13)
            new_data[i] = nums
        feed = {"x": new_data}
        return feed, fetch, is_batch


uci_service = UciService(name="uci")
uci_service.load_model_config("uci_housing_model")
uci_service.prepare_server(workdir="workdir", port=9292)
uci_service.run_rpc_service()
uci_service.run_web_service()
```
The client sends:
```
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
The response is:
```
{"result":{"price":[[18.901151657104492]]}}
```

<h2 align="center">Core Features of Paddle Serving</h2>

- Tightly integrated with Paddle training; the vast majority of Paddle models can be **deployed with one command**.
......
@@ -100,14 +100,21 @@ make -j10
You can execute `make install` to put the build targets under the `./output` directory; to do so, add `-DCMAKE_INSTALL_PREFIX=./output` to the cmake command shown above.

### Integrated GPU version paddle inference library

### CUDA_PATH is the CUDA install path. Use the command `whereis cuda` to check it; it should usually be /usr/local/cuda.
### CUDNN_LIBRARY and CUDA_CUDART_LIBRARY are the CUDA library paths; they should usually be /usr/local/cuda/lib64/.
``` shell
export CUDA_PATH='/usr/local'
export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
export CUDA_CUDART_LIBRARY="/usr/local/cuda/lib64/"
mkdir server-build-gpu && cd server-build-gpu
cmake -DPYTHON_INCLUDE_DIR=$PYTHONROOT/include/python2.7/ \
    -DPYTHON_LIBRARIES=$PYTHONROOT/lib/libpython2.7.so \
    -DPYTHON_EXECUTABLE=$PYTHONROOT/bin/python \
    -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
    -DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
    -DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
    -DSERVER=ON \
    -DWITH_GPU=ON ..
make -j10
@@ -116,6 +123,10 @@ make -j10
### Integrated TRT version paddle inference library
```
export CUDA_PATH='/usr/local'
export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
export CUDA_CUDART_LIBRARY="/usr/local/cuda/lib64/"
mkdir server-build-trt && cd server-build-trt
cmake -DPYTHON_INCLUDE_DIR=$PYTHONROOT/include/python2.7/ \
    -DPYTHON_LIBRARIES=$PYTHONROOT/lib/libpython2.7.so \
@@ -123,6 +134,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHONROOT/include/python2.7/ \
    -DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
    -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
    -DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
    -DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
    -DSERVER=ON \
    -DWITH_GPU=ON \
    -DWITH_TRT=ON ..
@@ -166,12 +178,14 @@ make
## Install wheel package

Regardless of whether you built the client, server, or App part, after compiling, install the whl package under `python/dist/` in the temporary build directory (`server-build-cpu`, `server-build-gpu`, `client-build`, `app-build`).
For example: `cd server-build-cpu/python/dist && pip install -U xxxxx.whl`

## Note

When running the python server, it checks the `SERVING_BIN` environment variable. If you want to use your own compiled binary, set this environment variable to the path of the corresponding binary, usually `export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`.
`BUILD_DIR` is the absolute path of `server-build-cpu` or `server-build-gpu`.
For example: `cd server-build-cpu && export SERVING_BIN=${PWD}/core/general-server/serving`
......
@@ -97,14 +97,20 @@ make -j10
You can execute `make install` to put the build targets under the `./output` directory; add the `-DCMAKE_INSTALL_PREFIX=./output` option at the cmake stage to specify the output path.

### Integrate the GPU version of the Paddle Inference Library

### CUDA_PATH is the CUDA install path; you can confirm it with `whereis cuda`. It should usually be /usr/local/cuda.
### CUDNN_LIBRARY and CUDA_CUDART_LIBRARY are the CUDA library paths; they should usually be /usr/local/cuda/lib64/.
``` shell
export CUDA_PATH='/usr/local'
export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
export CUDA_CUDART_LIBRARY="/usr/local/cuda/lib64/"
mkdir server-build-gpu && cd server-build-gpu
cmake -DPYTHON_INCLUDE_DIR=$PYTHONROOT/include/python2.7/ \
    -DPYTHON_LIBRARIES=$PYTHONROOT/lib/libpython2.7.so \
    -DPYTHON_EXECUTABLE=$PYTHONROOT/bin/python \
    -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
    -DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
    -DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
    -DSERVER=ON \
    -DWITH_GPU=ON ..
make -j10
@@ -113,6 +119,10 @@ make -j10
### Integrate the TensorRT version of the Paddle Inference Library
```
export CUDA_PATH='/usr/local'
export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
export CUDA_CUDART_LIBRARY="/usr/local/cuda/lib64/"
mkdir server-build-trt && cd server-build-trt
cmake -DPYTHON_INCLUDE_DIR=$PYTHONROOT/include/python2.7/ \
    -DPYTHON_LIBRARIES=$PYTHONROOT/lib/libpython2.7.so \
@@ -120,6 +130,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHONROOT/include/python2.7/ \
    -DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
    -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
    -DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
    -DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
    -DSERVER=ON \
    -DWITH_GPU=ON \
    -DWITH_TRT=ON ..
@@ -162,12 +173,16 @@ make
## Install the wheel package

Regardless of whether you built the client, server, or App part, after compiling, install the whl package under `python/dist/` in the temporary build directory (`server-build-cpu`, `server-build-gpu`, `client-build`, `app-build`).
For example: `cd server-build-cpu/python/dist && pip install -U xxxxx.whl`

## Notes

When running the python server, the `SERVING_BIN` environment variable is checked. If you want to use your own compiled binary, set this environment variable to the path of the corresponding binary, usually `export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`.
`BUILD_DIR` is the absolute path of `server-build-cpu` or `server-build-gpu`.
For example: `cd server-build-cpu && export SERVING_BIN=${PWD}/core/general-server/serving`
......
@@ -28,6 +28,7 @@ You can get images in two ways:
## Image description

Runtime images cannot be used for compilation.
If you want to customize Serving based on the source code, use the version with the `-devel` suffix.

| Description | OS | TAG | Dockerfile |
| :----------------------------------------------------------: | :-----: | :--------------------------: | :----------------------------------------------------------: |
......
@@ -28,6 +28,7 @@
## Image description

Runtime images cannot be used for development or compilation.
If you need to do secondary development and compilation based on the source code, use the version with the `-devel` suffix.

| Description | OS | TAG | Dockerfile |
| -------------------------------------------------- | -------- | ---------------------------- | ------------------------------------------------------------ |
......
@@ -32,63 +32,9 @@ The `-p` option is to map the `9292` port of the container to the `9292` port of
### Install PaddleServing

The image comes with `paddle_serving_server`, `paddle_serving_client`, and `paddle_serving_app` matching the image tag version. If you do not need to change the version, you can use them directly; this is suitable for environments without access to external networks.

If you need to change the version, please refer to the instructions on the homepage to download the pip package of the corresponding version.
## GPU

@@ -100,7 +46,7 @@ The GPU version is basically the same as the CPU version, with only some differe
Refer to [this document](DOCKER_IMAGES.md) for docker images; the following is an example of a `cuda9.0-cudnn7` image:

```shell
docker pull hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
```

### Create container

@@ -110,77 +56,21 @@ nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/se
nvidia-docker exec -it test bash
```

or
```bash
docker run --gpus all -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
docker exec -it test bash
```

The `-p` option is to map the `9292` port of the container to the `9292` port of the host.
### Install PaddleServing

The image comes with `paddle_serving_server_gpu`, `paddle_serving_client`, and `paddle_serving_app` matching the image tag version. If you do not need to change the version, you can use them directly; this is suitable for environments without access to external networks.

If you need to change the version, please refer to the instructions on the homepage to download the pip package of the corresponding version.
## Precautions

Runtime images cannot be used for compilation. If you want to compile from source, refer to [COMPILE](COMPILE.md).
@@ -20,7 +20,6 @@ Docker (the GPU version requires nvidia-docker on a GPU machine)
docker pull hub.baidubce.com/paddlepaddle/serving:latest
```

### Create and enter the container

```bash
@@ -32,74 +31,11 @@ docker exec -it test bash
### Install PaddleServing

The image comes with `paddle_serving_server`, `paddle_serving_client`, and `paddle_serving_app` matching the image tag version. If you do not need to change the version, you can use them directly; this is suitable for environments without access to external networks.
If you need to change the version, please follow the instructions on the homepage to download the pip package of the corresponding version.

## GPU version
The GPU version is basically the same as the CPU version, with only some differences in interface naming (the GPU version requires nvidia-docker on a GPU machine).

### Get the image

Refer to [this document](DOCKER_IMAGES_CN.md) to get the image; the following uses the `cuda9.0-cudnn7` image as an example:

```shell
nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
```

### Create and enter the container
@@ -107,74 +43,19 @@ nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
nvidia-docker exec -it test bash
```

or
```bash
docker run --gpus all -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest-cuda9.0-cudnn7
docker exec -it test bash
```

The `-p` option maps the container's `9292` port to the host's `9292` port.
### Install PaddleServing

The image comes with `paddle_serving_server_gpu`, `paddle_serving_client`, and `paddle_serving_app` matching the image tag version. If you do not need to change the version, you can use them directly; this is suitable for environments without access to external networks.

If you need to change the version, please follow the instructions on the homepage to download the pip package of the corresponding version.
## Notes
......
@@ -8,7 +8,7 @@ This document guides users how to build Paddle Serving service on the Windows pl
### Running Paddle Serving on Native Windows System

**Configure Python environment variables to PATH**: **We only support Python 3.5+ on native Windows.** First, add the directory of the Python executable to PATH. Usually in **System Properties/My Computer Properties**-**Advanced**-**Environment Variables**, click Path and add the path at the beginning, for example `C:\Users\$USER\AppData\Local\Programs\Python\Python36`, and finally click **OK** repeatedly. If entering python in Powershell opens the python interactive interface, the environment variable configuration succeeded.
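A quick way to confirm that the interpreter found on PATH satisfies the Python 3.5+ requirement is to run a small check in that interactive interface (a sketch):

```python
# Sketch: confirm the Python on PATH is 3.5 or newer.
import sys
assert sys.version_info >= (3, 5), sys.version
print(sys.executable, sys.version.split()[0])
```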
**Install wget**: Because all downloads in this tutorial and the built-in model download function in `paddle_serving_app` use the wget tool, download the binary package from this [link](http://gnuwin32.sourceforge.net/packages/wget.htm), unzip it, and copy it to `C:\Windows\System32`; if there is a security prompt, allow it.
@@ -32,6 +32,7 @@ python -m pip install -U paddle_serving_server_gpu paddle_serving_client paddle_
```
git clone https://github.com/paddlepaddle/Serving
pip install -r python/requirements_win.txt
```

**Run OCR example**:

@@ -42,7 +43,7 @@ python -m paddle_serving_app.package --get_model ocr_rec
tar -xzvf ocr_rec.tar.gz
python -m paddle_serving_app.package --get_model ocr_det
tar -xzvf ocr_det.tar.gz
python ocr_debugger_server.py cpu &
python ocr_web_client.py
```
......
@@ -8,7 +8,7 @@
### Running Paddle Serving on Native Windows

**Configure Python environment variables to PATH**: **Currently, native Windows only supports Python 3.5 or higher.** First, add the directory of the Python executable to PATH. Usually in **System Properties/My Computer Properties**-**Advanced**-**Environment Variables**, select Path and add the path at the beginning, for example `C:\Users\$USER\AppData\Local\Programs\Python\Python36`, then click **OK** repeatedly. If entering python in Powershell opens the python interactive interface, the environment variable configuration succeeded.

**Install wget**: Because all downloads in this tutorial and the built-in model download function in `paddle_serving_app` use the wget tool, [download wget](http://gnuwin32.sourceforge.net/packages/wget.htm), unzip it, and copy it to `C:\Windows\System32`; allow it if a security prompt appears.
@@ -32,6 +32,7 @@ python -m pip install -U paddle_serving_server_gpu paddle_serving_client paddle_
```
git clone https://github.com/paddlepaddle/Serving
pip install -r python/requirements_win.txt
```

**Run the OCR example**:

@@ -42,7 +43,7 @@ python -m paddle_serving_app.package --get_model ocr_rec
tar -xzvf ocr_rec.tar.gz
python -m paddle_serving_app.package --get_model ocr_det
tar -xzvf ocr_det.tar.gz
python ocr_debugger_server.py cpu &
python ocr_web_client.py
```
......
## Tutorial of Java Client for Paddle Serving

(English|[简体中文](./README_CN.md))

### Development Environment

To make it easier to develop with java, we provide a java image that contains the pre-compiled Serving project. The way to get the image and enter the development environment is:
```
docker pull hub.baidubce.com/paddlepaddle/serving:0.4.0-java
docker run --rm -dit --name java_serving hub.baidubce.com/paddlepaddle/serving:0.4.0-java
docker exec -it java_serving bash
cd Serving/java
```
The Serving folder corresponds to the develop branch at the time the docker image was generated; you need to `git pull` to the latest version or `git checkout` the branch you want.

### Install client dependencies

Because the number of dependent libraries is large, the image was already compiled once when it was generated; users only need to perform the following operations:
```
mvn compile
mvn install
@@ -9,18 +27,49 @@ mvn compile
mvn install
```

### Start the server

Take the fit_a_line model as an example. Start the server:
```
cd ../../python/examples/fit_a_line
sh get_data.sh
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393 --use_multilang &
```

Client prediction:
```
cd ../../../java/examples/target
java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar PaddleServingClientExample fit_a_line
```

The Java examples also include prediction clients for the Bert, Model_ensemble, asyn_predict, batch_predict, Cube_local, Cube_quant, and Yolov4 models. Take yolov4 as an example. Start the server:
```
python -m paddle_serving_app.package --get_model yolov4
tar -xzvf yolov4.tar.gz
python -m paddle_serving_server_gpu.serve --model yolov4_model --port 9393 --gpu_ids 0 --use_multilang & # This needs to be executed in the GPU Docker image; otherwise, the CPU execution method must be used.
```

Client prediction:
```
# in /Serving/java/examples/target
java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar PaddleServingClientExample yolov4 ../../../python/examples/yolov4/000000570688.jpg
# The yolov4 case needs a picture specified as the input
```
### Customization guidance

The above example runs in CPU mode. If GPU mode is required, there are two options.

The first is to keep GPU Serving and the Java Client in the same image. After starting the corresponding image, the user needs to move /Serving/java from the java image into that image.

The second is to deploy GPU Serving and the Java Client separately. If they run on the same host, you can find the IP address of the corresponding container with ifconfig, modify the endpoint passed to client.connect in `examples/src/main/java/PaddleServingClientExample.java`, and compile again. Alternatively, start docker with `--net=host` to share the network device between docker and the host, so it runs directly without customizing the java code.

**Note that in the examples, all models need to be started with `--use_multilang` to enable gRPC multi-language support, and the port number is 9393. If you need a different port, you need to modify it in the java file.**

**Currently Serving has launched the Pipeline mode (see [Pipeline Serving](../doc/PIPELINE_SERVING.md) for details). The Pipeline Serving Client for Java will be released in the next version (0.4.1).**
## Java Client for Paddle Serving

([English](./README.md)|简体中文)

### Development environment

To make it easier to develop with java, we provide a java image that contains the pre-compiled Serving project. The way to get the image and enter the development environment is:
```
docker pull hub.baidubce.com/paddlepaddle/serving:0.4.0-java
docker run --rm -dit --name java_serving hub.baidubce.com/paddlepaddle/serving:0.4.0-java
docker exec -it java_serving bash
cd Serving/java
```
The Serving folder corresponds to the develop branch at the time the docker image was generated; you need to `git pull` to the latest version or `git checkout` the branch you want.

### Install client dependencies

Because the number of dependent libraries is large, the image was already compiled once when it was generated; users only need to perform the following operations:
```
mvn compile
mvn install
@@ -11,16 +29,47 @@ mvn install
### Start the server

Take the fit_a_line model as an example. Start the server:
```
cd ../../python/examples/fit_a_line
sh get_data.sh
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393 --use_multilang &
```

Client prediction:
```
cd ../../../java/examples/target
java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar PaddleServingClientExample fit_a_line
```

The java examples also include prediction clients for the bert, model_ensemble, asyn_predict, batch_predict, cube_local, cube_quant, and yolov4 models. Take yolov4 as an example. Start the server:
```
python -m paddle_serving_app.package --get_model yolov4
tar -xzvf yolov4.tar.gz
python -m paddle_serving_server_gpu.serve --model yolov4_model --port 9393 --gpu_ids 0 --use_multilang & # This needs to be executed in the GPU Docker image; otherwise, the CPU execution method must be used.
```

Client prediction:
```
# in /Serving/java/examples/target
java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar PaddleServingClientExample yolov4 ../../../python/examples/yolov4/000000570688.jpg
# The yolov4 case needs a picture specified as the input
```
### Customization guidance

The above example runs in CPU mode. If GPU mode is required, there are two options.

The first is to keep GPU Serving and the Java Client in the same image. After starting the corresponding image, the user needs to move /Serving/java from the java image into that image.

The second is to deploy GPU Serving and the Java Client separately. If they run on the same host, you can find the IP address of the corresponding container with ifconfig, modify the endpoint passed to client.connect in `examples/src/main/java/PaddleServingClientExample.java`, and compile again. Alternatively, start docker with `--net=host` to share the network device between docker and the host, so it runs directly without customizing the java code.

**Note that in the examples, all models need to be started with `--use_multilang` to enable gRPC multi-language support, and the port number is 9393. If you need a different port, you need to modify it in the java file.**

**Currently Serving has launched the Pipeline mode (see [Pipeline Serving](../doc/PIPELINE_SERVING_CN.md) for details). The Pipeline Serving Client for Java will be released in the next version (0.4.1). Stay tuned.**
@@ -14,12 +14,6 @@ sh get_data.sh
### Start server

```shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393
```
@@ -40,7 +34,7 @@ python test_client.py uci_housing_client/serving_client_conf.prototxt
Start a web service with the default web service hosting module:
``` shell
python test_server.py
```

### Client prediction
......
@@ -41,7 +41,7 @@ python test_client.py uci_housing_client/serving_client_conf.prototxt
Start the default web service with the following line of code:
``` shell
python test_server.py
```

### Client prediction
......
@@ -15,6 +15,7 @@
from paddle_serving_client import Client
import sys
import numpy as np

client = Client()
client.load_client_config(sys.argv[1])
@@ -27,7 +28,6 @@ test_reader = paddle.batch(
    batch_size=1)
for data in test_reader():
    new_data = np.zeros((1, 1, 13)).astype("float32")
    new_data[0] = data[0][0]
    fetch_map = client.predict(
......
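The diff above cuts off in the middle of the `client.predict` call; for context, here is a hedged sketch of what the end of the loop in `test_client.py` plausibly looks like after this change (the remaining arguments are not shown in the diff, so treat this as illustrative only):

```python
# Hedged reconstruction of the loop shown in the diff above (not verbatim).
for data in test_reader():
    new_data = np.zeros((1, 1, 13)).astype("float32")
    new_data[0] = data[0][0]
    fetch_map = client.predict(feed={"x": new_data}, fetch=["price"])
    print(fetch_map)
```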
@@ -13,24 +13,24 @@
# limitations under the License.
# pylint: disable=doc-string-missing

from paddle_serving_server.web_service import WebService
import numpy as np


class UciService(WebService):
    def preprocess(self, feed=[], fetch=[]):
        # Batch the incoming feed dicts into a single (batch, 1, 13) array.
        feed_batch = []
        is_batch = True
        new_data = np.zeros((len(feed), 1, 13)).astype("float32")
        for i, ins in enumerate(feed):
            nums = np.array(ins["x"]).reshape(1, 1, 13)
            new_data[i] = nums
        feed = {"x": new_data}
        return feed, fetch, is_batch


uci_service = UciService(name="uci")
uci_service.load_model_config("uci_housing_model")
uci_service.prepare_server(workdir="workdir", port=9292)
uci_service.run_rpc_service()
uci_service.run_web_service()
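For intuition, the `preprocess` hook above packs the list of feed dicts that the serving layer passes in into one batched array; the following small illustration of the shapes involved is not part of the file:

```python
# Illustration: two requests batched together become one (2, 1, 13) float32
# array under the "x" key, matching what preprocess() above returns.
import numpy as np

feed = [{"x": [0.0] * 13}, {"x": [1.0] * 13}]
new_data = np.zeros((len(feed), 1, 13)).astype("float32")
for i, ins in enumerate(feed):
    new_data[i] = np.array(ins["x"]).reshape(1, 1, 13)
print(new_data.shape)  # (2, 1, 13)
```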
@@ -25,7 +25,9 @@ from .version import serving_server_version
from contextlib import closing
import argparse
import collections
import sys
# fcntl is POSIX-only, so guard the import to keep this module importable on Windows.
if sys.platform.startswith('win') is False:
    import fcntl
import shutil
import numpy as np
import grpc
......
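The guarded import above keeps the module importable on Windows, where `fcntl` does not exist; any code that actually takes a file lock then has to repeat the platform check. A hedged sketch of that pattern follows (a hypothetical helper, not the project's actual locking code):

```python
# Sketch of the platform-guarded fcntl pattern used above (hypothetical helper).
import sys

if sys.platform.startswith('win') is False:
    import fcntl


def lock_file(opened_file):
    # Take an exclusive advisory lock on POSIX systems; on native Windows this
    # sketch simply skips locking, mirroring the guarded import.
    if sys.platform.startswith('win') is False:
        fcntl.flock(opened_file.fileno(), fcntl.LOCK_EX)
```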
@@ -29,7 +29,7 @@ util.gen_pipeline_code("paddle_serving_server")
REQUIRED_PACKAGES = [
    'six >= 1.10.0', 'protobuf >= 3.11.0', 'grpcio <= 1.33.2', 'grpcio-tools <= 1.33.2',
    'flask >= 1.1.1', 'func_timeout', 'pyyaml'
]

packages=['paddle_serving_server',
......
@@ -31,7 +31,7 @@ util.gen_pipeline_code("paddle_serving_server_gpu")
REQUIRED_PACKAGES = [
    'six >= 1.10.0', 'protobuf >= 3.11.0', 'grpcio <= 1.33.2', 'grpcio-tools <= 1.33.2',
    'flask >= 1.1.1', 'func_timeout', 'pyyaml'
]

packages=['paddle_serving_server_gpu',
......