Commit e694e859 authored by TeslaZhao, committed by GitHub

Merge pull request #25 from PaddlePaddle/develop

Develop
# Paddle Serving Using Baidu Kunlun Chips
(English|[简体中文](./BAIDU_KUNLUN_XPU_SERVING_CN.md))
Paddle Serving supports deployment on Baidu Kunlun chips. At present, pilot support covers ARM servers equipped with Baidu Kunlun chips
(such as Phytium FT-2000+/64); the deployment capability on other heterogeneous hardware servers will be improved in the future.
# Compilation and installation
Refer to the [compile](COMPILE.md) document to set up the compilation environment.
## Compilation
* Compile the Serving Server
```
cd Serving
mkdir -p server-build-arm && cd server-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DSERVER=ON ..
make -j10
```
You can run `make install` to place the build output in the `./output` directory; add `-DCMAKE_INSTALL_PREFIX=./output` to the CMake command shown above to specify the install path.
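For example, a sketch of the server build with the install prefix added (same settings as above):
```
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
      -DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
      -DPYTHON_EXECUTABLE=/usr/bin/python \
      -DWITH_PYTHON=ON \
      -DWITH_LITE=ON \
      -DWITH_XPU=ON \
      -DSERVER=ON \
      -DCMAKE_INSTALL_PREFIX=./output ..
make -j10
make install
```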
* Compile the Serving Client
```
mkdir -p client-build-arm && cd client-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DCLIENT=ON ..
make -j10
```
* Compile the App
```
cd Serving
mkdir -p app-build-arm && cd app-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DAPP=ON ..
make -j10
```
## Install the wheel package
After the compilation stages above, the whl packages will be generated in `python/dist/` under the corresponding build directories.
For example, after the server compilation step, the whl package will be produced under the `server-build-arm/python/dist` directory, and you can run `pip install -U python/dist/*.whl` to install it.
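Assuming all three builds above finished, the packages can be installed in one pass (a sketch; the exact wheel file names depend on version and platform):
```
pip install -U server-build-arm/python/dist/*.whl
pip install -U client-build-arm/python/dist/*.whl
pip install -U app-build-arm/python/dist/*.whl
```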
# Request parameters description
To deploy the serving service on an ARM server with Baidu Kunlun XPU chips and use the acceleration capability of Paddle-Lite, please specify the following parameters during deployment.
|Parameter|Description|Note|
|:--|:--|:--|
|use_lite|use the Paddle-Lite engine|enables the inference capability of Paddle-Lite|
|use_xpu|use Baidu Kunlun for inference|must be used together with the use_lite option|
|ir_optim|enable graph optimization|refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)|
# Deployment examples
## Download the model
```
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
```
## Start RPC service
There are mainly three deployment methods:
* deploy on an ARM server with Baidu Kunlun XPU, using the acceleration capability of both Paddle-Lite and the XPU;
* deploy on an ARM server standalone with Paddle-Lite;
* deploy on an ARM server standalone without Paddle-Lite.

The first two deployment methods are recommended.

Start the RPC service, deploying on an ARM server with Baidu Kunlun chips, accelerated by Paddle-Lite and the Kunlun XPU:
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --use_xpu --ir_optim
```
Start the RPC service, deploying on an ARM server, accelerated by Paddle-Lite:
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --ir_optim
```
Start the RPC service, deploying on an ARM server without acceleration:
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292
```
## Client prediction
```
from paddle_serving_client import Client
import numpy as np
client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
-0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
print(fetch_map)
```
Some examples are provided below; other models can be adapted with reference to these examples.
|sample name|sample links|
|:-----|:--|
|fit_a_line|[fit_a_line_xpu](../python/examples/xpu/fit_a_line_xpu)|
|resnet|[resnet_v2_50_xpu](../python/examples/xpu/resnet_v2_50_xpu)|
# Deploying Paddle Serving with Baidu Kunlun Chips
(Simplified Chinese|[English](./BAIDU_KUNLUN_XPU_SERVING.md))
Paddle Serving supports inference deployment with Baidu Kunlun chips. At present, pilot support covers deployment on Baidu Kunlun chips and ARM servers (such as Phytium FT-2000+/64); deployment capability on other heterogeneous hardware servers will be improved later.
# Compilation and installation
Refer to [this document](COMPILE_CN.md) for the basic environment setup.
## Compilation
* Compile the server part
```
cd Serving
mkdir -p server-build-arm && cd server-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DSERVER=ON ..
make -j10
```
You can run `make install` to place the build output in the `./output` directory; add the `-DCMAKE_INSTALL_PREFIX=./output` option at the CMake stage to specify the install path.
* Compile the client part
```
mkdir -p client-build-arm && cd client-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DCLIENT=ON ..
make -j10
```
* Compile the app part
```
cd Serving
mkdir -p app-build-arm && cd app-build-arm
cmake -DPYTHON_INCLUDE_DIR=/usr/include/python3.7m/ \
-DPYTHON_LIBRARIES=/usr/lib64/libpython3.7m.so \
-DPYTHON_EXECUTABLE=/usr/bin/python \
-DWITH_PYTHON=ON \
-DWITH_LITE=ON \
-DWITH_XPU=ON \
-DAPP=ON ..
make -j10
```
## Install the wheel packages
After the compilation steps above, a whl package is generated in $build_dir/python/dist under each build directory; install each of them. For example, the server step generates a whl package under the server-build-arm/python/dist directory, which can be installed with `pip install -U xxx.whl`.
# Request parameters description
To deploy the service on ARM + XPU and use the acceleration capability of Paddle-Lite, the following parameters are required in the request.
|Parameter|Description|Note|
|:--|:--|:--|
|use_lite|use the Paddle-Lite engine|uses the Paddle-Lite CPU inference capability|
|use_xpu|use Baidu Kunlun for inference|must be used together with use_lite|
|ir_optim|enable Paddle-Lite subgraph optimization|see [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for details|
# Deployment example
## Download the model
```
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
```
## Start the RPC service
There are mainly three launch configurations:
* deploy on ARM CPU + XPU, using the Paddle-Lite XPU optimization and acceleration capability;
* deploy on ARM CPU standalone, using the Paddle-Lite optimization and acceleration capability;
* deploy on ARM CPU, without Paddle-Lite acceleration.

The first two deployment methods are recommended.

Start the RPC service, deploying on ARM CPU + XPU with Paddle-Lite XPU acceleration:
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --use_xpu --ir_optim
```
Start the RPC service, deploying on ARM CPU with Paddle-Lite acceleration:
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292 --use_lite --ir_optim
```
Start the RPC service, deploying on ARM CPU without Paddle-Lite acceleration:
```
python3 -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 6 --port 9292
```
## Client invocation
```
from paddle_serving_client import Client
import numpy as np
client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
-0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
print(fetch_map)
```
Some samples are provided below; other models can be adapted with reference to them.
|Sample name|Sample link|
|:-----|:--|
|fit_a_line|[fit_a_line_xpu](../python/examples/xpu/fit_a_line_xpu)|
|resnet|[resnet_v2_50_xpu](../python/examples/xpu/resnet_v2_50_xpu)|
...@@ -11,14 +11,16 @@ This example uses the model [BERT Chinese Model](https://www.paddlepaddle.org.cn/hubdetail?name=bert_chinese_L-12_H-768_A-12&en_category=SemanticModel)
Install paddlehub first
```
pip3 install paddlehub
```
Run
```
python3 prepare_model.py 128
```
**PaddleHub only supports Python 3.5+**

The 128 in the command above means max_seq_len in the BERT model, which is the length of a sample after preprocessing.
The config file and model file for the server side are saved in the folder bert_seq128_model.
The config file generated for the client side is saved in the folder bert_seq128_client.
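Once the model folders exist, the service can be brought up in the usual way (a sketch, not part of this commit; the port and GPU ids are assumptions):
```
python3 -m paddle_serving_server_gpu.serve --model bert_seq128_model --port 9292 --gpu_ids 0
```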
...@@ -28,8 +30,9 @@ You can also download the above model from BOS(max_seq_len=128). After decompres
```shell
wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz
tar -xzf bert_chinese_L-12_H-768_A-12.tar.gz
mv bert_chinese_L-12_H-768_A-12_model bert_seq128_model
mv bert_chinese_L-12_H-768_A-12_client bert_seq128_client
```
If your model is bert_chinese_L-12_H-768_A-12_model, replace the 'bert_seq128_model' field in the following commands with 'bert_chinese_L-12_H-768_A-12_model' and 'bert_seq128_client' with 'bert_chinese_L-12_H-768_A-12_client'.
### Getting Dict and Sample Dataset
......
...@@ -10,11 +10,11 @@
This example uses the [BERT Chinese model](https://www.paddlepaddle.org.cn/hubdetail?name=bert_chinese_L-12_H-768_A-12&en_category=SemanticModel) from [Paddlehub](https://github.com/PaddlePaddle/PaddleHub).
Install paddlehub first
```
pip3 install paddlehub
```
Run
```
python3 prepare_model.py 128
```
The parameter 128 is max_seq_len in the BERT model, i.e. the sample length after preprocessing.
The server-side config file and model file are generated in the bert_seq128_model folder.
...@@ -25,9 +25,9 @@ python prepare_model.py 128
```shell
wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz
tar -xzf bert_chinese_L-12_H-768_A-12.tar.gz
mv bert_chinese_L-12_H-768_A-12_model bert_seq128_model
mv bert_chinese_L-12_H-768_A-12_client bert_seq128_client
```
If you use the bert_chinese_L-12_H-768_A-12_model model, replace the bert_seq128_model field in the commands below with bert_chinese_L-12_H-768_A-12_model, and the bert_seq128_client field with bert_chinese_L-12_H-768_A-12_client.
### Getting the dict and sample dataset
......
...@@ -26,6 +26,6 @@ python -m paddle_serving_server_gpu.serve --model ctr_serving_model/ --port 9292
### RPC Infer
```
python test_client.py ctr_client_conf/serving_client_conf.prototxt raw_data/part-0
```
The latency will be displayed at the end.
...@@ -26,6 +26,6 @@ python -m paddle_serving_server_gpu.serve --model ctr_serving_model/ --port 9292
### Run prediction
```
python test_client.py ctr_client_conf/serving_client_conf.prototxt raw_data/part-0
```
When prediction finishes, the time taken by the prediction process is printed.
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
import sys
import paddle.fluid.incubate.data_generator as dg
class CriteoDataset(dg.MultiSlotDataGenerator):
def setup(self, sparse_feature_dim):
self.cont_min_ = [0, -3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
self.cont_max_ = [
20, 600, 100, 50, 64000, 500, 100, 50, 500, 10, 10, 10, 50
]
self.cont_diff_ = [
20, 603, 100, 50, 64000, 500, 100, 50, 500, 10, 10, 10, 50
]
self.hash_dim_ = sparse_feature_dim
# here, training data are lines with line_index < train_idx_
self.train_idx_ = 41256555
self.continuous_range_ = range(1, 14)
self.categorical_range_ = range(14, 40)
def _process_line(self, line):
features = line.rstrip('\n').split('\t')
dense_feature = []
sparse_feature = []
for idx in self.continuous_range_:
if features[idx] == '':
dense_feature.append(0.0)
else:
dense_feature.append((float(features[idx]) - self.cont_min_[idx - 1]) / \
self.cont_diff_[idx - 1])
for idx in self.categorical_range_:
sparse_feature.append(
[hash(str(idx) + features[idx]) % self.hash_dim_])
return dense_feature, sparse_feature, [int(features[0])]
def infer_reader(self, filelist, batch, buf_size):
def local_iter():
for fname in filelist:
with open(fname.strip(), "r") as fin:
for line in fin:
dense_feature, sparse_feature, label = self._process_line(
line)
#yield dense_feature, sparse_feature, label
yield [dense_feature] + sparse_feature + [label]
import paddle
batch_iter = paddle.batch(
paddle.reader.shuffle(
local_iter, buf_size=buf_size),
batch_size=batch)
return batch_iter
def generate_sample(self, line):
def data_iter():
dense_feature, sparse_feature, label = self._process_line(line)
feature_name = ["dense_input"]
for idx in self.categorical_range_:
feature_name.append("C" + str(idx - 13))
feature_name.append("label")
yield zip(feature_name, [dense_feature] + sparse_feature + [label])
return data_iter
if __name__ == "__main__":
criteo_dataset = CriteoDataset()
criteo_dataset.setup(int(sys.argv[1]))
criteo_dataset.run_from_stdin()
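Since `run_from_stdin()` consumes raw samples from standard input and `sys.argv[1]` is the sparse feature dimension, a typical invocation (the data file name is an assumption) looks like:
```
python criteo_reader.py 1000001 < raw_data/part-0
```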
...@@ -14,43 +14,63 @@
# pylint: disable=doc-string-missing
from paddle_serving_client import Client
import sys
import os
import time
from paddle_serving_client.metric import auc
import numpy as np


class CriteoReader(object):
    def __init__(self, sparse_feature_dim):
        self.cont_min_ = [0, -3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        self.cont_max_ = [
            20, 600, 100, 50, 64000, 500, 100, 50, 500, 10, 10, 10, 50
        ]
        self.cont_diff_ = [
            20, 603, 100, 50, 64000, 500, 100, 50, 500, 10, 10, 10, 50
        ]
        self.hash_dim_ = sparse_feature_dim
        # here, training data are lines with line_index < train_idx_
        self.train_idx_ = 41256555
        self.continuous_range_ = range(1, 14)
        self.categorical_range_ = range(14, 40)

    def process_line(self, line):
        features = line.rstrip('\n').split('\t')
        dense_feature = []
        sparse_feature = []
        for idx in self.continuous_range_:
            if features[idx] == '':
                dense_feature.append(0.0)
            else:
                dense_feature.append((float(features[idx]) - self.cont_min_[idx - 1]) / \
                        self.cont_diff_[idx - 1])
        for idx in self.categorical_range_:
            sparse_feature.append(
                [hash(str(idx) + features[idx]) % self.hash_dim_])
        return sparse_feature


py_version = sys.version_info[0]

client = Client()
client.load_client_config(sys.argv[1])
client.connect(["127.0.0.1:9292"])
reader = CriteoReader(1000001)
batch = 1
buf_size = 100
label_list = []
prob_list = []
start = time.time()
f = open(sys.argv[2], 'r')
for ei in range(10):
    data = reader.process_line(f.readline())
    feed_dict = {}
    for i in range(1, 27):
        feed_dict["sparse_{}".format(i - 1)] = np.array(data[i - 1]).reshape(-1)
        feed_dict["sparse_{}.lod".format(i - 1)] = [0, len(data[i - 1])]
    fetch_map = client.predict(feed=feed_dict, fetch=["prob"])
    print(fetch_map)
end = time.time()
f.close()
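With a CTR server already running on port 9292, the client above is invoked as in the README, taking the client config and one raw data shard:
```
python test_client.py ctr_client_conf/serving_client_conf.prototxt raw_data/part-0
```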
...@@ -12,6 +12,7 @@ Paddle Detection provides a large number of [Model Zoo](https://github.com/Paddl
### Serving example
Several examples of PaddleDetection models used in Serving are given in this folder.
All examples support TensorRT.

- [Faster RCNN](./faster_rcnn_r50_fpn_1x_coco)
- [PPYOLO](./ppyolo_r50vd_dcn_1x_coco)
......
...@@ -13,6 +13,9 @@ tar xf faster_rcnn_r50_fpn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
```
This model supports TensorRT. If you want faster inference, please use `--use_trt`.
### Perform prediction
```
python test_client.py 000000570688.jpg
......
...@@ -13,6 +13,7 @@ wget --no-check-certificate https://paddle-serving.bj.bcebos.com/pddet_demo/2.0/
tar xf faster_rcnn_r50_fpn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
```
This model supports TensorRT; for faster inference, enable the `--use_trt` option.
### Perform prediction
```
......
...@@ -13,6 +13,8 @@ tar xf ppyolo_r50vd_dcn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
```
This model supports TensorRT. If you want faster inference, please use `--use_trt`.
### Perform prediction
```
python test_client.py 000000570688.jpg
......
...@@ -14,6 +14,8 @@ tar xf ppyolo_r50vd_dcn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
```
This model supports TensorRT; for faster inference, enable the `--use_trt` option.
### Perform prediction
```
python test_client.py 000000570688.jpg
......
...@@ -12,6 +12,7 @@ wget --no-check-certificate https://paddle-serving.bj.bcebos.com/pddet_demo/2.0/
tar xf ttfnet_darknet53_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
```
This model supports TensorRT. If you want faster inference, please use `--use_trt`.
### Perform prediction
```
......
...@@ -14,6 +14,8 @@ tar xf ttfnet_darknet53_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
```
This model supports TensorRT; for faster inference, enable the `--use_trt` option.
### Perform prediction
```
python test_client.py 000000570688.jpg
......
...@@ -13,6 +13,8 @@ tar xf yolov3_darknet53_270e_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
```
This model supports TensorRT. If you want faster inference, please use `--use_trt`.
### Perform prediction
```
python test_client.py 000000570688.jpg
......
...@@ -14,6 +14,8 @@ tar xf yolov3_darknet53_270e_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
```
This model supports TensorRT; for faster inference, enable the `--use_trt` option.
### Perform prediction
```
python test_client.py 000000570688.jpg
......
# Imagenet Pipeline WebService
Here, the Imagenet service is used as an example to introduce the use of Pipeline WebService.
## Get the model
```
sh get_model.sh
```
...@@ -10,10 +10,11 @@
## Start the service
```
python resnet50_web_service.py &>log.txt &
```
## Test
```
python pipeline_rpc_client.py
```
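For reference, a minimal sketch of what such a pipeline RPC client can look like (the port, the image file name, and the feed/fetch keys are assumptions, not taken from this commit):
```
from paddle_serving_server_gpu.pipeline import PipelineClient
import base64

client = PipelineClient()
client.connect(['127.0.0.1:9993'])  # pipeline RPC port, an assumption

# images are commonly sent base64-encoded to pipeline services
with open("daisy.jpg", 'rb') as f:  # sample image name, an assumption
    image = base64.b64encode(f.read()).decode('utf8')

# the feed/fetch keys must match the pipeline config of the service
ret = client.predict(feed_dict={"image": image}, fetch=["label", "prob"])
print(ret)
```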
# Fit a line prediction example
([简体中文](./README_CN.md)|English)
## Get data
```shell
sh get_data.sh
```
## RPC service
### Start server
```shell
python -m paddle_serving_server_gpu.serve --model uci_housing_model --thread 10 --port 9393 --use_lite --use_xpu --ir_optim
```
### Client prediction
The `paddlepaddle` package is used in `test_client.py`, and you may need to install it first (`pip install paddlepaddle`).
``` shell
python test_client.py uci_housing_client/serving_client_conf.prototxt
```
## HTTP service
### Start server
Start a web service with default web service hosting modules:
``` shell
python test_server.py
```
### Client prediction
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
# Linear regression prediction service example
(Simplified Chinese|[English](./README.md))
## Get data
```shell
sh get_data.sh
```
## RPC service
### Start server
``` shell
python test_server.py uci_housing_model/
```
You can also start the default RPC service with the following single command:
```shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393 --use_lite --use_xpu --ir_optim
```
### Client prediction
The `paddlepaddle` package is used in `test_client.py` and needs to be installed (`pip install paddlepaddle`).
``` shell
python test_client.py uci_housing_client/serving_client_conf.prototxt
```
## HTTP service
### Start server
Start the default web service with the following single command:
``` shell
python test_server.py
```
### Client prediction
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args
import time
import paddle
import sys
import requests
args = benchmark_args()
def single_func(idx, resource):
if args.request == "rpc":
client = Client()
client.load_client_config(args.model)
client.connect([args.endpoint])
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=1)
start = time.time()
for data in train_reader():
fetch_map = client.predict(feed={"x": data[0][0]}, fetch=["price"])
end = time.time()
return [[end - start]]
elif args.request == "http":
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=1)
start = time.time()
for data in train_reader():
r = requests.post(
'http://{}/uci/prediction'.format(args.endpoint),
data={"x": data[0]})
end = time.time()
return [[end - start]]
multi_thread_runner = MultiThreadRunner()
result = multi_thread_runner.run(single_func, args.thread, {})
print(result)
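A hedged invocation of the benchmark above (the flag names come from `benchmark_args()`; the endpoint is an assumption):
```
python benchmark.py --thread 4 --model uci_housing_client/serving_client_conf.prototxt --request rpc --endpoint 127.0.0.1:9393
```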
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
import sys
import paddle
import paddle.fluid as fluid
paddle.enable_static()
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=16)
test_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.test(), buf_size=500),
batch_size=16)
x = fluid.data(name='x', shape=[None, 13], dtype='float32')
y = fluid.data(name='y', shape=[None, 1], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_loss = fluid.layers.mean(cost)
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
sgd_optimizer.minimize(avg_loss)
place = fluid.CPUPlace()
feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
import paddle_serving_client.io as serving_io
for pass_id in range(30):
for data_train in train_reader():
avg_loss_value, = exe.run(fluid.default_main_program(),
feed=feeder.feed(data_train),
fetch_list=[avg_loss])
serving_io.save_model("uci_housing_model", "uci_housing_client", {"x": x},
{"price": y_predict}, fluid.default_main_program())
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_client import Client
import sys
import numpy as np
client = Client()
client.load_client_config(sys.argv[1])
client.connect(["127.0.0.1:9393"])
import paddle
test_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.test(), buf_size=500),
batch_size=1)
for data in test_reader():
new_data = np.zeros((1, 1, 13)).astype("float32")
new_data[0] = data[0][0]
fetch_map = client.predict(
feed={"x": new_data}, fetch=["price"], batch=True)
print(fetch_map)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
import paddle
import numpy as np
def single_func(idx, resource):
client = Client()
client.load_client_config(
"./uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9293", "127.0.0.1:9292"])
x = [
0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584,
0.6283, 0.4919, 0.1856, 0.0795, -0.0332
]
x = np.array(x)
for i in range(1000):
fetch_map = client.predict(feed={"x": x}, fetch=["price"])
if fetch_map is None:
return [[None]]
return [[0]]
multi_thread_runner = MultiThreadRunner()
thread_num = 4
result = multi_thread_runner.run(single_func, thread_num, {})
if None in result[0]:
exit(1)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_server_gpu.web_service import WebService
import numpy as np
class UciService(WebService):
def preprocess(self, feed=[], fetch=[]):
feed_batch = []
is_batch = True
new_data = np.zeros((len(feed), 1, 13)).astype("float32")
for i, ins in enumerate(feed):
nums = np.array(ins["x"]).reshape(1, 1, 13)
new_data[i] = nums
feed = {"x": new_data}
return feed, fetch, is_batch
uci_service = UciService(name="uci")
uci_service.load_model_config("uci_housing_model")
uci_service.prepare_server(workdir="workdir", port=9393, use_lite=True, use_xpu=True, ir_optim=True)
uci_service.run_rpc_service()
uci_service.run_web_service()
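With the service above listening under the name `uci` on port 9393, a request can be sent as follows (a sketch; the URL is inferred from `prepare_server(port=9393)` and `name="uci"`):
```
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9393/uci/prediction
```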
# Image Classification
## Get Model
```
python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
tar -xzvf resnet_v2_50_imagenet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model resnet_v2_50_imagenet_model --port 9393 --use_lite --use_xpu --ir_optim
```
### Client Prediction
```
python resnet50_v2_client.py
```
# Image classification
## Get model
```
python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
tar -xzvf resnet_v2_50_imagenet.tar.gz
```
## RPC service
### Start server
```
python -m paddle_serving_server_gpu.serve --model resnet_v2_50_imagenet_model --port 9393 --use_lite --use_xpu --ir_optim
```
### Client prediction
```
python resnet50_v2_client.py
```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_app.reader import Sequential, File2Image, Resize, CenterCrop
from paddle_serving_app.reader import RGB2BGR, Transpose, Div, Normalize
from paddle_serving_app.local_predict import LocalPredictor
import sys
predictor = LocalPredictor()
predictor.load_model_config(sys.argv[1], use_lite=True, use_xpu=True, ir_optim=True)
seq = Sequential([
File2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
])
image_file = "daisy.jpg"
img = seq(image_file)
fetch_map = predictor.predict(feed={"image": img}, fetch=["score"])
print(fetch_map["score"].reshape(-1))
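The script takes the server-side model config directory as `sys.argv[1]` and expects `daisy.jpg` in the working directory; a hedged invocation (the script file name is hypothetical):
```
python local_predict.py resnet_v2_50_imagenet_model
```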
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_client import Client
from paddle_serving_app.reader import Sequential, File2Image, Resize, CenterCrop
from paddle_serving_app.reader import RGB2BGR, Transpose, Div, Normalize
client = Client()
client.load_client_config(
"resnet_v2_50_imagenet_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])
seq = Sequential([
File2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
])
image_file = "daisy.jpg"
img = seq(image_file)
fetch_map = client.predict(feed={"image": img}, fetch=["score"])
print(fetch_map["score"].reshape(-1))
...@@ -152,8 +152,8 @@ class MainService(BaseHTTPRequestHandler):
        if "key" not in post_data:
            return False
        else:
            key = base64.b64decode(post_data["key"].encode())
            with open(args.model + "/key", "wb") as f:
                f.write(key)
            return True
...@@ -161,8 +161,8 @@ class MainService(BaseHTTPRequestHandler):
        if "key" not in post_data:
            return False
        else:
            key = base64.b64decode(post_data["key"].encode())
            with open(args.model + "/key", "rb") as f:
                cur_key = f.read()
            return (key == cur_key)
...@@ -203,7 +203,7 @@ class MainService(BaseHTTPRequestHandler):
        self.send_response(200)
        self.send_header('Content-type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps(response).encode())

if __name__ == "__main__":
......
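The diff above is a Python 3 bytes/str fix: `base64.b64decode` returns bytes, so the key file must be opened in binary mode, and encoding the incoming string keeps the payload handling uniform. A standalone illustration (not from this repo):
```
import base64

key = base64.b64decode("c2VjcmV0LWtleQ==".encode())  # returns bytes in Python 3
with open("/tmp/key", "wb") as f:  # binary mode matches the bytes payload
    f.write(key)
```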