未验证 提交 a9ec4e4a 编写于 作者: J Jiawei Wang 提交者: GitHub

Merge branch 'develop' into docs_0.5.0

# Paddle Serving Using Baidu Kunlun Chips
(English|[简体中文](./BAIDU_KUNLUN_XPU_SERVING_CN.md))
Paddle serving supports deployment using Baidu Kunlun chips. At present, the pilot support is deployed on the ARM server with Baidu Kunlun chips
(such as Phytium FT-2000+/64). We will improve
the deployment capability on various heterogeneous hardware servers in the future.
# Compilation and installation
Refer to [compile](COMPILE.md) document to setup the compilation environment。
## Compilatiton
* Compile the Serving Server
```
cd Serving
mkdir -p server-build-arm && cd server-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DSERVER=ON ..
make -j10
```
You can run `make install` to produce the target in `./output` directory. Add `-DCMAKE_INSTALL_PREFIX=./output` to specify the output path to CMake command shown above。
* Compile the Serving Client
```
mkdir -p client-build-arm && cd client-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DCLIENT=ON ..
make -j10
```
* Compile the App
```
cd Serving
mkdir -p app-build-arm && cd app-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DAPP=ON ..
make -j10
```
## Install the wheel package
After the compilations stages above, the whl package will be generated in ```python/dist/``` under the specific temporary directories.
For example, after the Server Compiation step,the whl package will be produced under the server-build-arm/python/dist directory, and you can run ```pip install -u python/dist/*.whl``` to install the package。
# Request parameters description
In order to deploy serving
service on the arm server with Baidu Kunlun xpu chips and use the acceleration capability of Paddle-Lite,please specify the following parameters during deployment。
|param|param description|about|
|:--|:--|:--|
|use_lite|using Paddle-Lite Engine|use the inference capability of Paddle-Lite|
|use_xpu|using Baidu Kunlun for inference|need to be used with the use_lite option|
|ir_optim|open the graph optimization|refer to[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)|
# Deplyment examples
## Download the model
```
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
```
## Start RPC service
There are mainly three deployment methods:
* deploy on the ARM server with Baidu xpu using the acceleration capability of Paddle-Lite and xpu;
* deploy on the ARM server standalone with Paddle-Lite;
* deploy on the ARM server standalone without Paddle-Lite。
The first two deployment methods are recommended。
Start the rpc service, deploying on ARM server with Baidu Kunlun chips,and accelerate with Paddle-Lite and Baidu Kunlun xpu.
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --use_xpu --ir_optim
```
Start the rpc service, deploying on ARM server,and accelerate with Paddle-Lite.
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --ir_optim
```
Start the rpc service, deploying on ARM server.
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292
```
##
```
from paddle_serving_client import Client
import numpy as np
client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
-0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
print(fetch_map)
```
Some examples are provided below, and other models can be modifed with reference to these examples。
|sample name|sample links|
|:-----|:--|
|fit_a_line|[fit_a_line_xpu](../python/examples/xpu/fit_a_line_xpu)|
|resnet|[resnet_v2_50_xpu](../python/examples/xpu/resnet_v2_50_xpu)|
# Paddle Serving使用百度昆仑芯片部署
(简体中文|[English](./BAIDU_KUNLUN_XPU_SERVING.md))
Paddle Serving支持使用百度昆仑芯片进行预测部署。目前试验性支持在百度昆仑芯片和arm服务器(如飞腾 FT-2000+/64)上进行部署,后续完善对其他异构硬件服务器部署能力。
# 编译、安装
基本环境配置可参考[该文档](COMPILE_CN.md)进行配置。
## 编译
* 编译server部分
```
cd Serving
mkdir -p server-build-arm && cd server-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DSERVER=ON ..
make -j10
```
可以执行`make install`把目标产出放在`./output`目录下,cmake阶段需添加`-DCMAKE_INSTALL_PREFIX=./output`选项来指定存放路径。
* 编译client部分
```
mkdir -p client-build-arm && cd client-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DCLIENT=ON ..
make -j10
```
* 编译app部分
```
cd Serving
mkdir -p app-build-arm && cd app-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DAPP=ON ..
make -j10
```
## 安装wheel包
以上编译步骤完成后,会在各自编译目录$build_dir/python/dist生成whl包,分别安装即可。例如server步骤,会在server-build-arm/python/dist目录下生成whl包, 使用命令```pip install -u xxx.whl```进行安装。
# 请求参数说明
为了支持arm+xpu服务部署,使用Paddle-Lite加速能力,请求时需使用以下参数。
|参数|参数说明|备注|
|:--|:--|:--|
|use_lite|使用Paddle-Lite Engine|使用Paddle-Lite cpu预测能力|
|use_xpu|使用Baidu Kunlun进行预测|该选项需要与use_lite配合使用|
|ir_optim|开启Paddle-Lite计算子图优化|详细见[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)|
# 部署使用示例
## 下载模型
```
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
```
## 启动rpc服务
主要有三种启动配置:
* 使用arm cpu+xpu部署,使用Paddle-Lite xpu优化加速能力;
* 单独使用arm cpu部署,使用Paddle-Lite优化加速能力;
* 使用arm cpu部署,不使用Paddle-Lite加速。
推荐使用前两种部署方式。
启动rpc服务,使用arm cpu+xpu部署,使用Paddle-Lite xpu优化加速能力
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --use_xpu --ir_optim
```
启动rpc服务,使用arm cpu部署, 使用Paddle-Lite加速能力
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --ir_optim
```
启动rpc服务,使用arm cpu部署, 不使用Paddle-Lite加速能力
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292
```
## client调用
```
from paddle_serving_client import Client
import numpy as np
client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
-0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
print(fetch_map)
```
以下提供部分样例,其他模型可参照进行修改。
|示例名称|示例链接|
|:-----|:--|
|fit_a_line|[fit_a_line_xpu](../python/examples/xpu/fit_a_line_xpu)|
|resnet|[resnet_v2_50_xpu](../python/examples/xpu/resnet_v2_50_xpu)|
# Fit a line prediction example
([简体中文](./README_CN.md)|English)
## Get data
```shell
sh get_data.sh
```
## RPC service
### Start server
```shell
python -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 10 --port 9393 --use_lite --use_xpu --ir_optim
```
### Client prediction
The `paddlepaddle` package is used in `test_client.py`, and you may need to download the corresponding package(`pip install paddlepaddle`).
``` shell
python test_client.py uci_housing_client/serving_client_conf.prototxt
```
## HTTP service
### Start server
Start a web service with default web service hosting modules:
``` shell
python test_server.py
```
### Client prediction
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
# 线性回归预测服务示例
(简体中文|[English](./README.md))
## 获取数据
```shell
sh get_data.sh
```
## RPC服务
### 开启服务端
``` shell
python test_server.py uci_housing_model/
```
也可以通过下面的一行代码开启默认RPC服务:
```shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393 --use_lite --use_xpu --ir_optim
```
### 客户端预测
`test_client.py`中使用了`paddlepaddle`包,需要进行下载(`pip install paddlepaddle`)。
``` shell
python test_client.py uci_housing_client/serving_client_conf.prototxt
```
## HTTP服务
### 开启服务端
通过下面的一行代码开启默认web服务:
``` shell
python test_server.py
```
### 客户端预测
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args
import time
import paddle
import sys
import requests
args = benchmark_args()
def single_func(idx, resource):
if args.request == "rpc":
client = Client()
client.load_client_config(args.model)
client.connect([args.endpoint])
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=1)
start = time.time()
for data in train_reader():
fetch_map = client.predict(feed={"x": data[0][0]}, fetch=["price"])
end = time.time()
return [[end - start]]
elif args.request == "http":
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=1)
start = time.time()
for data in train_reader():
r = requests.post(
'http://{}/uci/prediction'.format(args.endpoint),
data={"x": data[0]})
end = time.time()
return [[end - start]]
multi_thread_runner = MultiThreadRunner()
result = multi_thread_runner.run(single_func, args.thread, {})
print(result)
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
import sys
import paddle
import paddle.fluid as fluid
paddle.enable_static()
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=16)
test_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.test(), buf_size=500),
batch_size=16)
x = fluid.data(name='x', shape=[None, 13], dtype='float32')
y = fluid.data(name='y', shape=[None, 1], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_loss = fluid.layers.mean(cost)
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
sgd_optimizer.minimize(avg_loss)
place = fluid.CPUPlace()
feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
import paddle_serving_client.io as serving_io
for pass_id in range(30):
for data_train in train_reader():
avg_loss_value, = exe.run(fluid.default_main_program(),
feed=feeder.feed(data_train),
fetch_list=[avg_loss])
serving_io.save_model("uci_housing_model", "uci_housing_client", {"x": x},
{"price": y_predict}, fluid.default_main_program())
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_client import Client
import sys
import numpy as np
client = Client()
client.load_client_config(sys.argv[1])
client.connect(["127.0.0.1:9393"])
import paddle
test_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.test(), buf_size=500),
batch_size=1)
for data in test_reader():
new_data = np.zeros((1, 1, 13)).astype("float32")
new_data[0] = data[0][0]
fetch_map = client.predict(
feed={"x": new_data}, fetch=["price"], batch=True)
print(fetch_map)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
import paddle
import numpy as np
def single_func(idx, resource):
client = Client()
client.load_client_config(
"./uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9293", "127.0.0.1:9292"])
x = [
0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584,
0.6283, 0.4919, 0.1856, 0.0795, -0.0332
]
x = np.array(x)
for i in range(1000):
fetch_map = client.predict(feed={"x": x}, fetch=["price"])
if fetch_map is None:
return [[None]]
return [[0]]
multi_thread_runner = MultiThreadRunner()
thread_num = 4
result = multi_thread_runner.run(single_func, thread_num, {})
if None in result[0]:
exit(1)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_server_gpu.web_service import WebService
import numpy as np
class UciService(WebService):
def preprocess(self, feed=[], fetch=[]):
feed_batch = []
is_batch = True
new_data = np.zeros((len(feed), 1, 13)).astype("float32")
for i, ins in enumerate(feed):
nums = np.array(ins["x"]).reshape(1, 1, 13)
new_data[i] = nums
feed = {"x": new_data}
return feed, fetch, is_batch
uci_service = UciService(name="uci")
uci_service.load_model_config("uci_housing_model")
uci_service.prepare_server(workdir="workdir", port=9393, use_lite=True, use_xpu=True, ir_optim=True)
uci_service.run_rpc_service()
uci_service.run_web_service()
# Image Classification
## Get Model
```
python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
tar -xzvf resnet_v2_50_imagenet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model resnet_v2_50_imagenet_model --port 9393 --use_lite --use_xpu --ir_optim
```
### Client Prediction
```
python resnet50_v2_client.py
```
# 图像分类
## 获取模型
```
python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
tar -xzvf resnet_v2_50_imagenet.tar.gz
```
## RPC 服务
### 启动服务端
```
python -m paddle_serving_server_gpu.serve --model resnet_v2_50_imagenet_model --port 9393 --use_lite --use_xpu --ir_optim
```
### 客户端预测
```
python resnet50_v2_client.py
```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_app.reader import Sequential, File2Image, Resize, CenterCrop
from paddle_serving_app.reader import RGB2BGR, Transpose, Div, Normalize
from paddle_serving_app.local_predict import LocalPredictor
import sys
predictor = LocalPredictor()
predictor.load_model_config(sys.argv[1], use_lite=True, use_xpu=True, ir_optim=True)
seq = Sequential([
File2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
])
image_file = "daisy.jpg"
img = seq(image_file)
fetch_map = predictor.predict(feed={"image": img}, fetch=["score"])
print(fetch_map["score"].reshape(-1))
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_client import Client
from paddle_serving_app.reader import Sequential, File2Image, Resize, CenterCrop
from paddle_serving_app.reader import RGB2BGR, Transpose, Div, Normalize
client = Client()
client.load_client_config(
"resnet_v2_50_imagenet_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])
seq = Sequential([
File2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
])
image_file = "daisy.jpg"
img = seq(image_file)
fetch_map = client.predict(feed={"image": img}, fetch=["score"])
print(fetch_map["score"].reshape(-1))
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册