Unverified commit fa19afc1, authored by Thomas Young, committed by GitHub

Merge pull request #1 from PaddlePaddle/develop

update from origin
@@ -24,7 +24,6 @@ You can get images in two ways:
```
## Image description
@@ -40,3 +39,13 @@ Runtime images cannot be used for compilation.
| GPU (cuda10.0-cudnn7) development | CentOS7 | latest-cuda10.0-cudnn7-devel | [Dockerfile.cuda10.0-cudnn7.devel](../tools/Dockerfile.cuda10.0-cudnn7.devel) |
| CPU development (Used to compile packages on Ubuntu) | CentOS6 | <None> | [Dockerfile.centos6.devel](../tools/Dockerfile.centos6.devel) |
| GPU (cuda9.0-cudnn7) development (Used to compile packages on Ubuntu) | CentOS6 | <None> | [Dockerfile.centos6.cuda9.0-cudnn7.devel](../tools/Dockerfile.centos6.cuda9.0-cudnn7.devel) |
## Requirements for running CUDA containers
Running a CUDA container requires a machine with at least one CUDA-capable GPU and a driver compatible with the CUDA toolkit version you are using.
The machine running the CUDA container **only requires the NVIDIA driver**, the CUDA toolkit doesn't have to be installed.
For the relationship between CUDA toolkit version, Driver version and GPU architecture, please refer to [nvidia-docker wiki](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA).
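As a quick sanity check of the driver setup, one can run `nvidia-smi` inside a container; a minimal sketch, assuming Docker 19.03+ with the NVIDIA container toolkit installed (the image name and tag are illustrative, taken from the table above):
```shell
# should print the driver version and the visible GPUs
docker run --gpus all --rm registry.baidubce.com/paddlepaddle/serving:latest-cuda10.0-cudnn7-devel nvidia-smi
```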
@@ -22,7 +22,6 @@
```shell
docker build -t <image-name>:<image-tag> .
```
@@ -40,3 +39,13 @@
| GPU (cuda10.0-cudnn7) development image | CentOS7 | latest-cuda10.0-cudnn7-devel | [Dockerfile.cuda10.0-cudnn7.devel](../tools/Dockerfile.cuda10.0-cudnn7.devel) |
| CPU development image (for building Ubuntu packages) | CentOS6 | <None> | [Dockerfile.centos6.devel](../tools/Dockerfile.centos6.devel) |
| GPU (cuda9.0-cudnn7) development image (for building Ubuntu packages) | CentOS6 | <None> | [Dockerfile.centos6.cuda9.0-cudnn7.devel](../tools/Dockerfile.centos6.cuda9.0-cudnn7.devel) |
## Requirements for running CUDA containers
Running a CUDA container requires a machine with at least one CUDA-capable GPU and a driver compatible with the CUDA toolkit version you are using.
The machine running the CUDA container **only requires the NVIDIA driver**; the CUDA toolkit does not have to be installed.
For the relationship between CUDA toolkit version, driver version and GPU architecture, see the [nvidia-docker wiki](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA).
@@ -4,6 +4,35 @@
## Basics
#### Q: What are the differences and relationships among Paddle Serving, Paddle Inference and PaddleHub Serving?
**A:** Paddle Serving is a remote service: the device that issues the prediction request (phone, browser, client, etc.) is not the hardware that actually runs the prediction. Paddle Inference is a library, suitable for embedding into a larger system to guarantee inference efficiency; Paddle Serving calls Paddle Inference to provide the remote service. PaddleHub Serving can be regarded as an example; it also uses Paddle Serving as the unified entry point for prediction services. For interaction from a web page, a remote service is usually called, which can be built with Paddle Serving's web service.
#### Q: Does paddle-serving support int32?
**A:** In the protobuf, the feed_type and fetch_type numbers map to data types as follows:

  0 - int64
  1 - float32
  2 - int32
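For illustration, this is how a hypothetical int32 feed variable could be declared in a client-side prototxt; the field layout follows serving_client_conf.prototxt, and the variable name and shape are made up:
```
feed_var {
  name: "ids"
  alias_name: "ids"
  is_lod_tensor: false
  feed_type: 2
  shape: 1
}
```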
#### Q: Does paddle-serving support multi-threaded calls on Windows and Linux?
**A:** The client can access the server from multiple threads.
#### Q: How do I change paddle-serving's message size limit?
**A:** On both the server and the client, FLAGS_max_body_size raises the payload limit; the unit is bytes and the default is 64 MB.
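For example, a sketch of raising the limit at server startup, assuming the `--max_body_size` flag of `paddle_serving_server.serve` (value in bytes; model path and port are illustrative):
```shell
# raise the body size limit to 512 MB
python -m paddle_serving_server.serve --model uci_housing_model --port 9292 --max_body_size 536870912
```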
#### Q: Which client languages does paddle-serving currently support?
**A:** Java, C++, Python
#### Q: Which protocols does paddle-serving currently support?
**A:** HTTP, RPC
## Compilation issues
@@ -12,6 +41,10 @@
**A:** Install the whl package you compiled via pip, and set the SERVING_BIN environment variable to the path of the compiled serving binary.
#### Q: With the Java client, mvn compile fails with "No compiler is provided in this environment. Perhaps you are running on a JRE rather than a JDK?"
**A:** The JDK is not installed, or JAVA_HOME is misconfigured (it must point to the JDK path; a common mistake is pointing it at the JRE, e.g. a correct value looks like JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el7_8.x86_64/"). For JDK installation see https://segmentfault.com/a/1190000015389941
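For example, a sketch of a correct configuration (the JDK path below is the one from the answer and will differ per machine):
```shell
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el7_8.x86_64
export PATH=$JAVA_HOME/bin:$PATH
mvn -v   # should now report a JDK, not a JRE
```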
## Deployment issues
@@ -46,7 +79,15 @@ InvalidArgumentError: Device id must be less than GPU count, but received id is:
**A:** Currently (0.4.0) only CentOS is supported; see the full list [here](https://github.com/PaddlePaddle/Serving/blob/develop/doc/DOCKER_IMAGES.md).
#### Q: The GCC version Python was built with does not match serving's
**A:** 1) Use the [GPU docker](https://github.com/PaddlePaddle/Serving/blob/develop/doc/RUN_IN_DOCKER.md#gpunvidia-docker) to avoid the environment problem.
  2) Change the gcc version of the Python installed in the anaconda virtual environment; see [this reference](https://www.jianshu.com/p/c498b3d86f77).
#### Q: Does paddle-serving support local offline installation?
**A:** Offline deployment is supported; prepare and install the relevant [dependencies](https://github.com/PaddlePaddle/Serving/blob/develop/doc/COMPILE.md) in advance.
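For example, a sketch of an offline install, assuming the wheel files were downloaded into ./wheels on a machine with network access beforehand:
```shell
pip install --no-index --find-links=./wheels \
    paddle-serving-server paddle-serving-client paddle-serving-app
```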
## Inference issues
@@ -105,6 +146,19 @@ Client logs are printed directly to standard output.
Running 'export GLOG_v=3' before deploying the service produces more detailed log output.
#### Q: After paddle-serving starts successfully, where are the logs configured?
**A:** 1) The warning is printed by the glog component, stating that before glog initialization logs go to STDERR.
  2) Usually the service is started with GLOG_v set, which also selects the log level.
For example:
```
GLOG_v=2 python -m paddle_serving_server.serve --model xxx_conf/ --port 9999
```
#### Q: (With GLOG_v=2) the server logs look fine, but the client never gets a correct prediction result
**A:** The configuration files may be wrong; check them (is_load_tensor, fetch_type, etc.).
......
# Introduction to the gRPC interface
- [1. Comparison with the bRPC interface](#1-comparison-with-the-brpc-interface)
  - [1.1 Server side](#11-server-side)
  - [1.2 Client side](#12-client-side)
  - [1.3 Others](#13-others)
- [2. Example: linear regression prediction service](#2-example-linear-regression-prediction-service)
  - [Get data](#get-data)
  - [Start the gRPC server](#start-the-grpc-server)
  - [Client prediction](#client-prediction)
    - [Synchronous prediction](#synchronous-prediction)
    - [Asynchronous prediction](#asynchronous-prediction)
    - [Batch prediction](#batch-prediction)
    - [General pb prediction](#general-pb-prediction)
    - [Prediction timeout](#prediction-timeout)
    - [List input](#list-input)
- [3. More examples](#3-more-examples)
With the gRPC interface, clients can call the service from different languages on Windows/Linux/macOS. The gRPC interface is structured as follows:

![](https://github.com/PaddlePaddle/Serving/blob/develop/doc/grpc_impl.png)
## 1. Comparison with the bRPC interface
#### 1.1 Server side
* The gRPC server's `load_model_config` function adds a `client_config_path` parameter:
```python
def load_model_config(self, server_config_paths, client_config_path=None)
```
In some examples the configuration files of the bRPC server and the bRPC client differ (e.g. with cube local, the client's data first goes to cube and reaches the inference library only after cube has processed it); in that case the gRPC server must set the gRPC client configuration `client_config_path` manually.

**`client_config_path` defaults to `<server_config_path>/serving_server_conf.prototxt`.**
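For instance, when the two configs differ, the client-side config can be passed explicitly; a sketch with illustrative paths:
```python
server.load_model_config("serving_server_conf",
                         client_config_path="serving_client_conf/serving_client_conf.prototxt")
```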
#### 1.2 Client side
* The gRPC client drops the `load_client_config` step:
  The `connect` step fetches the corresponding prototxt over RPC (any endpoint will do).
* The gRPC client must set the timeout via RPC (the calling form is the same as for the bRPC client):
  Because the bRPC client cannot change its timeout after `connect`, when the gRPC server receives a timeout-change request it recreates its bRPC client instance to apply the new timeout, and the gRPC client sets the gRPC deadline as well.
  **Note: the timeout-setting interface and the inference interface must not be called at the same time (not thread safe); for performance reasons no lock is added for now.**
* The gRPC client's `predict` function adds `asyn` and `is_python` parameters:
```python
def predict(self, feed, fetch, need_variant_tag=False, asyn=False, is_python=True)
```
  1. `asyn` selects asynchronous calling. With `asyn=True` the call is asynchronous and returns a `MultiLangPredictFuture` object, whose `MultiLangPredictFuture.result()` blocks until the prediction is available; with `asyn=False` the call is synchronous.
  2. `is_python` selects the proto format. With `is_python=True`, data is transferred in numpy bytes format, currently usable from Python only; with `is_python=False` a plain data format is used, which is more general. Transferring in numpy bytes format takes much less time than the plain format (see [#654](https://github.com/PaddlePaddle/Serving/pull/654)).
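Putting the two parameters together, a sketch of a synchronous and an asynchronous call, assuming the multi-language client class `MultiLangClient` from `paddle_serving_client` and a fit_a_line-style model served on port 9393:
```python
import numpy as np
from paddle_serving_client import MultiLangClient

client = MultiLangClient()
client.connect(["127.0.0.1:9393"])

x_np = np.random.rand(1, 13).astype("float32")  # illustrative input

# synchronous call: blocks until the result is available
fetch_map = client.predict(feed={"x": x_np}, fetch=["price"])

# asynchronous call: returns a MultiLangPredictFuture immediately
future = client.predict(feed={"x": x_np}, fetch=["price"], asyn=True)
fetch_map = future.result()  # block here to collect the prediction
```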
#### 1.3 Others
* Error handling: when the bRPC client inside the gRPC server fails to predict (returns `None`), the gRPC client also returns None. Other gRPC exceptions are caught inside the client, and a "status_code" field is added to the returned fetch_map to indicate whether the prediction succeeded (see the timeout example).
* Since gRPC only supports the pick_first and round_robin load-balancing policies, the ABTEST feature is not fully available yet.
* System compatibility:
  * [x] CentOS
  * [x] macOS
  * [x] Windows
* Client languages already supported:
  - Python
  - Java
  - Go
## 2. Example: linear regression prediction service
The following example implements a linear regression prediction service over gRPC; the full code is at this [link](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/grpc_impl_example/fit_a_line).
#### Get data
```shell
sh get_data.sh
```
#### Start the gRPC server
``` shell
python test_server.py uci_housing_model/
```
The default gRPC service can also be started with this one-liner:
```shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393 --use_multilang
```
Note: the --use_multilang flag enables the multi-language client.
### Client prediction
#### Synchronous prediction
``` shell
python test_sync_client.py
```
#### Asynchronous prediction
``` shell
python test_asyn_client.py
```
#### Batch prediction
``` shell
python test_batch_client.py
```
#### General pb prediction
``` shell
python test_general_pb_client.py
```
#### Prediction timeout
``` shell
python test_timeout_client.py
```
#### List input
``` shell
python test_list_input_client.py
```
## 3. More examples
See the example files under [`python/examples/grpc_impl_example`](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/grpc_impl_example).
@@ -24,13 +24,13 @@ inference_model_dir = "your_inference_model"
serving_client_dir = "serving_client_dir"
serving_server_dir = "serving_server_dir"
feed_var_names, fetch_var_names = inference_model_to_serving(
    inference_model_dir, serving_server_dir, serving_client_dir)
```
If your model file and params file are stored separately, use the following API:
```
feed_var_names, fetch_var_names = inference_model_to_serving(
    inference_model_dir, serving_server_dir, serving_client_dir,
    model_filename="model", params_filename="params")
```
@@ -23,11 +23,11 @@ inference_model_dir = "your_inference_model"
serving_client_dir = "serving_client_dir"
serving_server_dir = "serving_server_dir"
feed_var_names, fetch_var_names = inference_model_to_serving(
    inference_model_dir, serving_server_dir, serving_client_dir)
```
If the model has a model description file `model_filename` and a parameter file `params_filename`, use:
```
feed_var_names, fetch_var_names = inference_model_to_serving(
    inference_model_dir, serving_server_dir, serving_client_dir,
    model_filename="model", params_filename="params")
```
@@ -75,7 +75,7 @@
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.13.1</version>
  <scope>test</scope>
</dependency>
<dependency>
......
@@ -23,7 +23,7 @@ args = benchmark_args()
reader = ChineseBertReader({"max_seq_len": 128})
fetch = ["pooled_output"]
endpoint_list = ['127.0.0.1:9292']
client = Client()
client.load_client_config(args.model)
client.connect(endpoint_list)
@@ -33,5 +33,5 @@ for line in sys.stdin:
    for key in feed_dict.keys():
        feed_dict[key] = np.array(feed_dict[key]).reshape((128, 1))
    #print(feed_dict)
    result = client.predict(feed=feed_dict, fetch=fetch, batch=False)
    print(result)
@@ -29,13 +29,14 @@ class BertService(WebService):
    def preprocess(self, feed=[], fetch=[]):
        feed_res = []
        is_batch = False
        for ins in feed:
            feed_dict = self.reader.process(ins["words"].encode("utf-8"))
            for key in feed_dict.keys():
                feed_dict[key] = np.array(feed_dict[key]).reshape(
                    (len(feed_dict[key]), 1))
            feed_res.append(feed_dict)
        return feed_res, fetch, is_batch

bert_service = BertService(name="bert")
......
@@ -12,36 +12,29 @@
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_client import Client
from paddle_serving_app.reader import *
import numpy as np

preprocess = Sequential([
    File2Image(), BGR2RGB(), Div(255.0),
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False),
    Resize(800, 1333), Transpose((2, 0, 1)), PadStride(32)
])
postprocess = RCNNPostprocess("label_list.txt", "output")
client = Client()
client.load_client_config("serving_client/serving_client_conf.prototxt")
client.connect(['127.0.0.1:9292'])

im = preprocess('000000570688.jpg')
fetch_map = client.predict(
    feed={
        "image": im,
        "im_info": np.array(list(im.shape[1:]) + [1.0]),
        "im_shape": np.array(list(im.shape[1:]) + [1.0])
    },
    fetch=["multiclass_nms_0.tmp_0"],
    batch=False)
fetch_map["image"] = '000000570688.jpg'
print(fetch_map)
postprocess(fetch_map)
print(fetch_map)
# -*- coding: utf-8 -*-
#
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from __future__ import unicode_literals, absolute_import
import os
import sys
import time
import json
import requests
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args, show_latency
from paddle_serving_app.reader import ChineseBertReader
from paddle_serving_app.reader import *
import numpy as np
args = benchmark_args()
def single_func(idx, resource):
img = "./000000570688.jpg"
profile_flags = False
latency_flags = False
if os.getenv("FLAGS_profile_client"):
profile_flags = True
if os.getenv("FLAGS_serving_latency"):
latency_flags = True
latency_list = []
if args.request == "rpc":
preprocess = Sequential([
File2Image(), BGR2RGB(), Div(255.0),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False),
Resize(640, 640), Transpose((2, 0, 1))
])
postprocess = RCNNPostprocess("label_list.txt", "output")
client = Client()
client.load_client_config(args.model)
client.connect([resource["endpoint"][idx % len(resource["endpoint"])]])
start = time.time()
for i in range(turns):
if args.batch_size >= 1:
l_start = time.time()
feed_batch = []
b_start = time.time()
im = preprocess(img)
for bi in range(args.batch_size):
print("1111batch")
print(bi)
feed_batch.append({
"image": im,
"im_info": np.array(list(im.shape[1:]) + [1.0]),
"im_shape": np.array(list(im.shape[1:]) + [1.0])
})
# im = preprocess(img)
b_end = time.time()
if profile_flags:
sys.stderr.write(
"PROFILE\tpid:{}\tbert_pre_0:{} bert_pre_1:{}\n".format(
os.getpid(),
int(round(b_start * 1000000)),
int(round(b_end * 1000000))))
#result = client.predict(feed=feed_batch, fetch=fetch)
fetch_map = client.predict(
feed=feed_batch, fetch=["multiclass_nms"])
fetch_map["image"] = img
postprocess(fetch_map)
l_end = time.time()
if latency_flags:
latency_list.append(l_end * 1000 - l_start * 1000)
else:
print("unsupport batch size {}".format(args.batch_size))
else:
raise ValueError("not implemented {} request".format(args.request))
end = time.time()
if latency_flags:
return [[end - start], latency_list]
else:
return [[end - start]]
if __name__ == '__main__':
multi_thread_runner = MultiThreadRunner()
endpoint_list = ["127.0.0.1:7777"]
turns = 10
start = time.time()
result = multi_thread_runner.run(
single_func, args.thread, {"endpoint": endpoint_list,
"turns": turns})
end = time.time()
total_cost = end - start
avg_cost = 0
for i in range(args.thread):
avg_cost += result[0][i]
avg_cost = avg_cost / args.thread
print("total cost: {}s".format(total_cost))
print("each thread cost: {}s. ".format(avg_cost))
print("qps: {}samples/s".format(args.batch_size * args.thread * turns /
total_cost))
if os.getenv("FLAGS_serving_latency"):
show_latency(result[1])
rm profile_log*
export CUDA_VISIBLE_DEVICES=0
export FLAGS_profile_server=1
export FLAGS_profile_client=1
export FLAGS_serving_latency=1
gpu_id=0
#save cpu and gpu utilization log
if [ -d utilization ];then
rm -rf utilization
else
mkdir utilization
fi
#start server
$PYTHONROOT/bin/python3 -m paddle_serving_server_gpu.serve --model $1 --port 7777 --thread 4 --gpu_ids 0 --ir_optim > elog 2>&1 &
sleep 5
#warm up
$PYTHONROOT/bin/python3 benchmark.py --thread 4 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
echo -e "import psutil\ncpu_utilization=psutil.cpu_percent(1,False)\nprint('CPU_UTILIZATION:', cpu_utilization)\n" > cpu_utilization.py
for thread_num in 1 4 8 16
do
for batch_size in 1
do
job_bt=`date '+%Y%m%d%H%M%S'`
nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_use.log 2>&1 &
nvidia-smi --id=0 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
gpu_memory_pid=$!
$PYTHONROOT/bin/python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
kill ${gpu_memory_pid}
kill `ps -ef|grep used_memory|awk '{print $2}'`
echo "model_name:" $1
echo "thread_num:" $thread_num
echo "batch_size:" $batch_size
echo "=================Done===================="
echo "model_name:$1" >> profile_log_$1
echo "batch_size:$batch_size" >> profile_log_$1
$PYTHONROOT/bin/python3 cpu_utilization.py >> profile_log_$1
job_et=`date '+%Y%m%d%H%M%S'`
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_use.log >> profile_log_$1
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$1
rm -rf gpu_use.log gpu_utilization.log
$PYTHONROOT/bin/python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
tail -n 8 profile >> profile_log_$1
echo "" >> profile_log_$1
done
done
#Divided log
awk 'BEGIN{RS="\n\n"}{i++}{print > "bert_log_"i}' profile_log_$1
mkdir bert_log && mv bert_log_* bert_log
ps -ef|grep 'serving'|grep -v grep|cut -c 9-15 | xargs kill -9
background
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
@@ -38,7 +38,8 @@ start = time.time()
image_file = "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"
for i in range(10):
    img = seq(image_file)
    fetch_map = client.predict(
        feed={"image": img}, fetch=["score"], batch=False)
    prob = max(fetch_map["score"][0])
    label = label_dict[fetch_map["score"][0].tolist().index(prob)].strip(
    ).replace(",", "")
......
@@ -13,7 +13,7 @@
# limitations under the License.
import sys
from paddle_serving_client import Client
import numpy as np
from paddle_serving_app.reader import Sequential, URL2Image, Resize, CenterCrop, RGB2BGR, Transpose, Div, Normalize, Base64ToImage

if len(sys.argv) != 4:
@@ -44,12 +44,13 @@ class ImageService(WebService):
    def preprocess(self, feed=[], fetch=[]):
        feed_batch = []
        is_batch = True
        for ins in feed:
            if "image" not in ins:
                raise ValueError("feed data error!")
            img = self.seq(ins["image"])
            feed_batch.append({"image": img[np.newaxis, :]})
        return feed_batch, fetch, is_batch

    def postprocess(self, feed=[], fetch=[], fetch_map={}):
        score_list = fetch_map["score"]
......
@@ -18,7 +18,7 @@ import sys
import time
import requests
import numpy as np
from paddle_serving_app.reader.imdb_reader import IMDBDataset
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import MultiThreadRunner, benchmark_args, show_latency
......
@@ -13,7 +13,7 @@
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_client import Client
from paddle_serving_app.reader.imdb_reader import IMDBDataset
import sys
import numpy as np
......
@@ -14,7 +14,7 @@
# pylint: disable=doc-string-missing
from paddle_serving_server.web_service import WebService
from paddle_serving_app.reader.imdb_reader import IMDBDataset
import sys
import numpy as np
@@ -29,13 +29,14 @@ class IMDBService(WebService):
    def preprocess(self, feed={}, fetch=[]):
        feed_batch = []
        words_lod = [0]
        is_batch = True
        for ins in feed:
            words = self.dataset.get_words_only(ins["words"])
            words = np.array(words).reshape(len(words), 1)
            words_lod.append(words_lod[-1] + len(words))
            feed_batch.append(words)
        feed = {"words": np.concatenate(feed_batch), "words.lod": words_lod}
        return feed, fetch, is_batch

imdb_service = IMDBService(name="imdb")
......
@@ -15,6 +15,7 @@
from paddle_serving_server.web_service import WebService
import sys
from paddle_serving_app.reader import LACReader
import numpy as np

class LACService(WebService):
@@ -23,13 +24,21 @@ class LACService(WebService):
    def preprocess(self, feed={}, fetch=[]):
        feed_batch = []
        fetch = ["crf_decode"]
        lod_info = [0]
        is_batch = True
        for ins in feed:
            if "words" not in ins:
                raise ValueError("feed data error!")
            feed_data = self.reader.process(ins["words"])
            feed_batch.append(np.array(feed_data).reshape(len(feed_data), 1))
            lod_info.append(lod_info[-1] + len(feed_data))
        feed_dict = {
            "words": np.concatenate(
                feed_batch, axis=0),
            "words.lod": lod_info
        }
        return feed_dict, fetch, is_batch

    def postprocess(self, feed={}, fetch=[], fetch_map={}):
        batch_ret = []
......
@@ -53,7 +53,9 @@ class OCRService(WebService):
        self.ori_h, self.ori_w, _ = im.shape
        det_img = self.det_preprocess(im)
        _, self.new_h, self.new_w = det_img.shape
        return {
            "image": det_img[np.newaxis, :].copy()
        }, ["concat_1.tmp_0"], True

    def postprocess(self, feed={}, fetch=[], fetch_map=None):
        det_out = fetch_map["concat_1.tmp_0"]
......
@@ -54,7 +54,7 @@ class OCRService(WebService):
        det_img = self.det_preprocess(im)
        _, self.new_h, self.new_w = det_img.shape
        print(det_img)
        return {"image": det_img}, ["concat_1.tmp_0"], False

    def postprocess(self, feed={}, fetch=[], fetch_map=None):
        det_out = fetch_map["concat_1.tmp_0"]
......
@@ -42,10 +42,9 @@ class OCRService(WebService):
        self.det_client = LocalPredictor()
        if sys.argv[1] == 'gpu':
            self.det_client.load_model_config(
                det_model_config, use_gpu=True, gpu_id=1)
        elif sys.argv[1] == 'cpu':
            self.det_client.load_model_config(det_model_config)
        self.ocr_reader = OCRReader()

    def preprocess(self, feed=[], fetch=[]):
@@ -58,7 +57,7 @@ class OCRService(WebService):
        det_img = det_img[np.newaxis, :]
        det_img = det_img.copy()
        det_out = self.det_client.predict(
            feed={"image": det_img}, fetch=["concat_1.tmp_0"], batch=True)
        filter_func = FilterBoxes(10, 10)
        post_func = DBPostProcess({
            "thresh": 0.3,
@@ -91,7 +90,7 @@ class OCRService(WebService):
            imgs[id] = norm_img
        feed = {"image": imgs.copy()}
        fetch = ["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]
        return feed, fetch, True

    def postprocess(self, feed={}, fetch=[], fetch_map=None):
        rec_res = self.ocr_reader.postprocess(fetch_map, with_score=True)
@@ -107,7 +106,8 @@ ocr_service.load_model_config("ocr_rec_model")
ocr_service.prepare_server(workdir="workdir", port=9292)
ocr_service.init_det_debugger(det_model_config="ocr_det_model")
if sys.argv[1] == 'gpu':
    ocr_service.set_gpus("2")
    ocr_service.run_debugger_service()
elif sys.argv[1] == 'cpu':
    ocr_service.run_debugger_service()
ocr_service.run_web_service()
@@ -36,4 +36,5 @@ for img_file in os.listdir(test_img_dir):
    image = cv2_to_base64(image_data1)
    data = {"feed": [{"image": image}], "fetch": ["res"]}
    r = requests.post(url=url, headers=headers, data=json.dumps(data))
    print(r)
    print(r.json())
@@ -50,7 +50,7 @@ class OCRService(WebService):
        ori_h, ori_w, _ = im.shape
        det_img = self.det_preprocess(im)
        det_out = self.det_client.predict(
            feed={"image": det_img}, fetch=["concat_1.tmp_0"], batch=False)
        _, new_h, new_w = det_img.shape
        filter_func = FilterBoxes(10, 10)
        post_func = DBPostProcess({
@@ -77,10 +77,10 @@ class OCRService(WebService):
            max_wh_ratio = max(max_wh_ratio, wh_ratio)
        for img in img_list:
            norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio)
            feed_list.append(norm_img[np.newaxis, :])
        feed_batch = {"image": np.concatenate(feed_list, axis=0)}
        fetch = ["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]
        return feed_batch, fetch, True

    def postprocess(self, feed={}, fetch=[], fetch_map=None):
        rec_res = self.ocr_reader.postprocess(fetch_map, with_score=True)
......
@@ -52,7 +52,7 @@ class OCRService(WebService):
            imgs[i] = norm_img
        feed = {"image": imgs.copy()}
        fetch = ["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]
        return feed, fetch, True

    def postprocess(self, feed={}, fetch=[], fetch_map=None):
        rec_res = self.ocr_reader.postprocess(fetch_map, with_score=True)
......
@@ -51,10 +51,17 @@ class OCRService(WebService):
            max_wh_ratio = max(max_wh_ratio, wh_ratio)
        for img in img_list:
            norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio)
            #feed = {"image": norm_img}
            feed_list.append(norm_img)
        if len(feed_list) == 1:
            feed_batch = {
                "image": np.concatenate(
                    feed_list, axis=0)[np.newaxis, :]
            }
        else:
            feed_batch = {"image": np.concatenate(feed_list, axis=0)}
        fetch = ["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]
        return feed_batch, fetch, True

    def postprocess(self, feed={}, fetch=[], fetch_map=None):
        rec_res = self.ocr_reader.postprocess(fetch_map, with_score=True)
......
# Imagenet Pipeline WebService
This document takes the Imagenet service as an example to introduce how to use Pipeline WebService.
## Get model
```
sh get_model.sh
```
## Start server
```
python resnet50_web_service.py &>log.txt &
```
## RPC test
```
python pipeline_rpc_client.py
```
# Imagenet Pipeline WebService
This section uses the Uci service as an example to introduce the use of Pipeline WebService.
## Get model
```
sh get_data.sh
```
## Start server
```
python web_service.py &>log.txt &
```
## Test
```
curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}'
```
#worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own grpcServer and DAG.
##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the grpc thread pool of the main thread.
worker_num: 1

#http port; rpc_port and http_port may not both be empty. When rpc_port is usable and http_port is empty, http_port is not generated automatically.
http_port: 18082
rpc_port: 9999

dag:
    #op resource type: True for the thread model, False for the process model
    is_thread_op: False
op:
    imagenet:
        #when the op config has no server_endpoints, the local service config is read from local_service_conf
        local_service_conf:
            #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
            concurrency: 2

            #model path
            model_config: ResNet50_vd_model

            #device IDs: "" or unset means CPU inference; "0" or "0,1,2" means GPU inference on the listed cards
            devices: "0" # "0,1"

            #client type: brpc, grpc or local_predictor. local_predictor does inference in-process without starting a Serving service.
            client_type: local_predictor

            #fetch list; names follow the alias_name of fetch_var in client_config
            fetch_list: ["score"]
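#With this config the service listens for HTTP on port 18082 and the op name is
#"imagenet", so (a sketch; the base64 image payload is elided) a request could be:
#  curl -X POST http://localhost:18082/imagenet/prediction \
#       -d '{"key": ["image"], "value": ["<base64-encoded image>"]}'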
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/imagenet-example/ResNet50_vd.tar.gz
tar -xzvf ResNet50_vd.tar.gz
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/imagenet-example/image_data.tar.gz
tar -xzvf image_data.tar.gz
This diff is collapsed.
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_server_gpu.pipeline import PipelineClient
import numpy as np
import requests
import json
import cv2
import base64
import os
client = PipelineClient()
client.connect(['127.0.0.1:9999'])
def cv2_to_base64(image):
return base64.b64encode(image).decode('utf8')
with open("daisy.jpg", 'rb') as file:
image_data = file.read()
image = cv2_to_base64(image_data)
for i in range(1):
ret = client.predict(feed_dict={"image": image}, fetch=["label", "prob"])
print(ret)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
from paddle_serving_app.reader import Sequential, URL2Image, Resize, CenterCrop, RGB2BGR, Transpose, Div, Normalize, Base64ToImage
try:
from paddle_serving_server_gpu.web_service import WebService, Op
except ImportError:
from paddle_serving_server.web_service import WebService, Op
import logging
import numpy as np
import base64, cv2
class ImagenetOp(Op):
def init_op(self):
self.seq = Sequential([
Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225],
True)
])
self.label_dict = {}
label_idx = 0
with open("imagenet.label") as fin:
for line in fin:
self.label_dict[label_idx] = line.strip()
label_idx += 1
def preprocess(self, input_dicts, data_id, log_id):
(_, input_dict), = input_dicts.items()
data = base64.b64decode(input_dict["image"].encode('utf8'))
data = np.fromstring(data, np.uint8)
# Note: class variables(self.var) can only be used in process op mode
im = cv2.imdecode(data, cv2.IMREAD_COLOR)
img = self.seq(im)
return {"image": img[np.newaxis, :].copy()}, False, None, ""
def postprocess(self, input_dicts, fetch_dict, log_id):
print(fetch_dict)
score_list = fetch_dict["score"]
result = {"label": [], "prob": []}
for score in score_list:
score = score.tolist()
max_score = max(score)
result["label"].append(self.label_dict[score.index(max_score)]
.strip().replace(",", ""))
result["prob"].append(max_score)
result["label"] = str(result["label"])
result["prob"] = str(result["prob"])
return result, None, ""
class ImageService(WebService):
def get_pipeline_response(self, read_op):
image_op = ImagenetOp(name="imagenet", input_ops=[read_op])
return image_op
uci_service = ImageService(name="imagenet")
uci_service.prepare_pipeline_config("config.yml")
uci_service.run_service()
# IMDB model ensemble examples
## Get models
```
sh get_data.sh
```
## Start servers
```
python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
python test_pipeline_server.py &>pipeline.log &
```
## Start clients
```
python test_pipeline_client.py
```
@@ -8,8 +8,8 @@ sh get_data.sh
## Start servers
```
python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
python test_pipeline_server.py &>pipeline.log &
```
@@ -17,8 +17,3 @@ python test_pipeline_server.py &>pipeline.log &
```
python test_pipeline_client.py
```
#rpc port; rpc_port and http_port may not both be empty. When rpc_port is empty and http_port is not, rpc_port is automatically set to http_port+1.
rpc_port: 18070

#http port; rpc_port and http_port may not both be empty. When rpc_port is usable and http_port is empty, http_port is not generated automatically.
http_port: 18071

#worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own grpcServer and DAG.
#When build_dag_each_worker=False, the framework sets max_workers=worker_num for the grpc thread pool of the main thread.
worker_num: 4

#build_dag_each_worker: False builds one DAG inside the process; True builds an independent DAG in every process.
build_dag_each_worker: False

dag:
    #op resource type: True for the thread model, False for the process model
    is_thread_op: True

    #number of retries
    retry: 1

    #profiling: True generates Timeline performance data (with some overhead); False disables it
    use_profile: False

    #maximum channel length, 0 by default
    channel_size: 0

    #tracer: tracks framework throughput and the activity of every OP and channel; without it no data is generated
    tracer:
        #interval between traces, in seconds
        interval_s: 10
op:
    bow:
        #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 1

        #client connection type: brpc
        client_type: brpc

        #number of retries for Serving interaction; no retry by default
        retry: 1

        #timeout for Serving interaction, in ms
        timeout: 3000

        #Serving IPs
        server_endpoints: ["127.0.0.1:9393"]

        #client-side config of the bow model
        client_config: "imdb_bow_client_conf/serving_client_conf.prototxt"

        #fetch list; names follow the alias_name of fetch_var in client_config
        fetch_list: ["prediction"]

        #number of requests batched per Serving query, 1 by default. With batch_size>1, auto_batching_timeout must be set, otherwise the op blocks until batch_size is reached.
        batch_size: 1

        #batching timeout, used together with batch_size
        auto_batching_timeout: 2000
    cnn:
        #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 1

        #client connection type: brpc
        client_type: brpc

        #number of retries for Serving interaction; no retry by default
        retry: 1

        #timeout, in ms
        timeout: 3000

        #Serving IPs
        server_endpoints: ["127.0.0.1:9292"]

        #client-side config of the cnn model
        client_config: "imdb_cnn_client_conf/serving_client_conf.prototxt"

        #fetch list; names follow the alias_name of fetch_var in client_config
        fetch_list: ["prediction"]

        #number of requests batched per Serving query, 1 by default. With batch_size>1, auto_batching_timeout must be set, otherwise the op blocks until batch_size is reached.
        batch_size: 1

        #batching timeout, used together with batch_size
        auto_batching_timeout: 2000
    combine:
        #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 1

        #number of retries for Serving interaction; no retry by default
        retry: 1

        #timeout, in ms
        timeout: 3000

        #number of requests batched per Serving query, 1 by default. With batch_size>1, auto_batching_timeout must be set, otherwise the op blocks until batch_size is reached.
        batch_size: 1

        #batching timeout, used together with batch_size
        auto_batching_timeout: 2000
@@ -15,21 +15,22 @@ from paddle_serving_server.pipeline import PipelineClient
import numpy as np

client = PipelineClient()
client.connect(['127.0.0.1:18070'])

words = 'i am very sad | 0'

futures = []
for i in range(100):
    futures.append(
        client.predict(
            feed_dict={"words": words,
                       "logid": 10000 + i},
            fetch=["prediction"],
            asyn=True,
            profile=False))

for f in futures:
    res = f.result()
    if res.err_no != 0:
        print("predict failed: {}".format(res))
    print(res)
@@ -15,10 +15,14 @@
from paddle_serving_server.pipeline import Op, RequestOp, ResponseOp
from paddle_serving_server.pipeline import PipelineServer
from paddle_serving_server.pipeline.proto import pipeline_service_pb2
from paddle_serving_server.pipeline.channel import ChannelDataErrcode
import numpy as np
from paddle_serving_app.reader.imdb_reader import IMDBDataset
import logging
try:
    from paddle_serving_server.web_service import WebService
except ImportError:
    from paddle_serving_server_gpu.web_service import WebService

_LOGGER = logging.getLogger()
user_handler = logging.StreamHandler()
@@ -41,74 +45,68 @@ class ImdbRequestOp(RequestOp):
                continue
            words = request.value[idx]
            word_ids, _ = self.imdb_dataset.get_words_and_label(words)
            word_len = len(word_ids)
            dictdata[key] = np.array(word_ids).reshape(word_len, 1)
            dictdata["{}.lod".format(key)] = np.array([0, word_len])

        log_id = None
        if request.logid is not None:
            log_id = request.logid
        return dictdata, log_id, None, ""


class CombineOp(Op):
    def preprocess(self, input_data, data_id, log_id):
        #_LOGGER.info("Enter CombineOp::preprocess")
        combined_prediction = 0
        for op_name, data in input_data.items():
            _LOGGER.info("{}: {}".format(op_name, data["prediction"]))
            combined_prediction += data["prediction"]
        data = {"prediction": combined_prediction / 2}
        return data, False, None, ""


class ImdbResponseOp(ResponseOp):
    # Here ImdbResponseOp is consistent with the default ResponseOp implementation
    def pack_response_package(self, channeldata):
        resp = pipeline_service_pb2.Response()
        resp.err_no = channeldata.error_code
        if resp.err_no == ChannelDataErrcode.OK.value:
            feed = channeldata.parse()
            # ndarray to string
            for name, var in feed.items():
                resp.value.append(var.__repr__())
                resp.key.append(name)
        else:
            resp.err_msg = channeldata.error_info
        return resp


read_op = ImdbRequestOp()


class BowOp(Op):
    def init_op(self):
        pass


class CnnOp(Op):
    def init_op(self):
        pass


bow_op = BowOp("bow", input_ops=[read_op])
cnn_op = CnnOp("cnn", input_ops=[read_op])
combine_op = CombineOp("combine", input_ops=[bow_op, cnn_op])

# fetch output of bow_op
#response_op = ImdbResponseOp(input_ops=[bow_op])

# fetch output of combine_op
response_op = ImdbResponseOp(input_ops=[combine_op])

# use default ResponseOp implementation
#response_op = ResponseOp(input_ops=[combine_op])

server = PipelineServer()
server.set_response_op(response_op)
......
@@ -28,31 +28,9 @@ python web_service.py &>log.txt &
python pipeline_http_client.py
```

<!--
## More (PipelineServing)
## Client Prediction
### RPC
......
@@ -31,26 +31,6 @@ python pipeline_http_client.py
<!--
## More (PipelineServing)
## Start clients
### RPC
......
#rpc port; rpc_port and http_port may not both be empty. When rpc_port is empty and http_port is not, rpc_port is automatically set to http_port+1.
rpc_port: 18090

#http port; rpc_port and http_port may not both be empty. When rpc_port is usable and http_port is empty, http_port is not generated automatically.
http_port: 9999

#worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own grpcServer and DAG.
##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the grpc thread pool of the main thread.
worker_num: 1

#build_dag_each_worker: False builds one DAG inside the process; True builds an independent DAG in every process.
build_dag_each_worker: false

dag:
    #op resource type: True for the thread model, False for the process model
    is_thread_op: False

    #number of retries
    retry: 1

    #profiling: True generates Timeline performance data (with some overhead); False disables it
    use_profile: false
op:
    det:
        #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 2

        #when the op config has no server_endpoints, the local service config is read from local_service_conf
        local_service_conf:
            #client type: brpc, grpc or local_predictor. local_predictor does inference in-process without starting a Serving service.
            client_type: local_predictor

            #det model path
            model_config: ocr_det_model

            #fetch list; names follow the alias_name of fetch_var in client_config
            fetch_list: ["concat_1.tmp_0"]

            #device IDs: "" or unset means CPU inference; "0" or "0,1,2" means GPU inference on the listed cards
            devices: "0"
    rec:
        #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 2

        #timeout, in ms
        timeout: -1

        #number of retries for Serving interaction; no retry by default
        retry: 1

        #when the op config has no server_endpoints, the local service config is read from local_service_conf
        local_service_conf:
            #client type: brpc, grpc or local_predictor. local_predictor does inference in-process without starting a Serving service.
            client_type: local_predictor

            #rec model path
            model_config: ocr_rec_model

            #fetch list; names follow the alias_name of fetch_var in client_config
            fetch_list: ["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]

            #device IDs: "" or unset means CPU inference; "0" or "0,1,2" means GPU inference on the listed cards
            devices: "0"
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_server_gpu.pipeline import Op, RequestOp, ResponseOp
from paddle_serving_server_gpu.pipeline import PipelineServer
from paddle_serving_server_gpu.pipeline.proto import pipeline_service_pb2
from paddle_serving_server_gpu.pipeline.channel import ChannelDataEcode
from paddle_serving_server_gpu.pipeline import LocalRpcServiceHandler
import numpy as np
import cv2
import time
import base64
import json
from paddle_serving_app.reader import OCRReader
from paddle_serving_app.reader import Sequential, ResizeByFactor
from paddle_serving_app.reader import Div, Normalize, Transpose
from paddle_serving_app.reader import DBPostProcess, FilterBoxes, GetRotateCropImage, SortedBoxes
import time
import re
import base64
import logging
_LOGGER = logging.getLogger()
class DetOp(Op):
def init_op(self):
self.det_preprocess = Sequential([
ResizeByFactor(32, 960), Div(255),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
(2, 0, 1))
])
self.filter_func = FilterBoxes(10, 10)
self.post_func = DBPostProcess({
"thresh": 0.3,
"box_thresh": 0.5,
"max_candidates": 1000,
"unclip_ratio": 1.5,
"min_size": 3
})
def preprocess(self, input_dicts):
(_, input_dict), = input_dicts.items()
data = base64.b64decode(input_dict["image"].encode('utf8'))
data = np.fromstring(data, np.uint8)
# Note: class variables(self.var) can only be used in process op mode
self.im = cv2.imdecode(data, cv2.IMREAD_COLOR)
self.ori_h, self.ori_w, _ = self.im.shape
det_img = self.det_preprocess(self.im)
_, self.new_h, self.new_w = det_img.shape
return {"image": det_img}
def postprocess(self, input_dicts, fetch_dict):
det_out = fetch_dict["concat_1.tmp_0"]
ratio_list = [
float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w
]
dt_boxes_list = self.post_func(det_out, [ratio_list])
dt_boxes = self.filter_func(dt_boxes_list[0], [self.ori_h, self.ori_w])
out_dict = {"dt_boxes": dt_boxes, "image": self.im}
return out_dict
class RecOp(Op):
def init_op(self):
self.ocr_reader = OCRReader()
self.get_rotate_crop_image = GetRotateCropImage()
self.sorted_boxes = SortedBoxes()
def preprocess(self, input_dicts):
(_, input_dict), = input_dicts.items()
im = input_dict["image"]
dt_boxes = input_dict["dt_boxes"]
dt_boxes = self.sorted_boxes(dt_boxes)
feed_list = []
img_list = []
max_wh_ratio = 0
for i, dtbox in enumerate(dt_boxes):
boximg = self.get_rotate_crop_image(im, dt_boxes[i])
img_list.append(boximg)
h, w = boximg.shape[0:2]
wh_ratio = w * 1.0 / h
max_wh_ratio = max(max_wh_ratio, wh_ratio)
for img in img_list:
norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio)
feed = {"image": norm_img}
feed_list.append(feed)
return feed_list
def postprocess(self, input_dicts, fetch_dict):
rec_res = self.ocr_reader.postprocess(fetch_dict, with_score=True)
res_lst = []
for res in rec_res:
res_lst.append(res[0])
res = {"res": str(res_lst)}
return res
read_op = RequestOp()
det_op = DetOp(
name="det",
input_ops=[read_op],
local_rpc_service_handler=LocalRpcServiceHandler(
model_config="ocr_det_model",
workdir="det_workdir", # defalut: "workdir"
thread_num=2, # defalut: 2
devices="0", # gpu0. defalut: "" (cpu)
mem_optim=True, # defalut: True
ir_optim=False, # defalut: False
available_port_generator=None), # defalut: None
concurrency=1)
rec_op = RecOp(
name="rec",
input_ops=[det_op],
server_endpoints=["127.0.0.1:12001"],
fetch_list=["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"],
client_config="ocr_rec_client/serving_client_conf.prototxt",
concurrency=1)
response_op = ResponseOp(input_ops=[rec_op])
server = PipelineServer("ocr")
server.set_response_op(response_op)
server.prepare_server('config.yml')
server.run_server()
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_server_gpu.pipeline import Op, RequestOp, ResponseOp
from paddle_serving_server_gpu.pipeline import PipelineServer
from paddle_serving_server_gpu.pipeline.proto import pipeline_service_pb2
from paddle_serving_server_gpu.pipeline.channel import ChannelDataEcode
from paddle_serving_server_gpu.pipeline import LocalRpcServiceHandler
import numpy as np
import cv2
import time
import base64
import json
from paddle_serving_app.reader import OCRReader
from paddle_serving_app.reader import Sequential, ResizeByFactor
from paddle_serving_app.reader import Div, Normalize, Transpose
from paddle_serving_app.reader import DBPostProcess, FilterBoxes, GetRotateCropImage, SortedBoxes
import time
import re
import base64
import logging
_LOGGER = logging.getLogger()
class DetOp(Op):
def init_op(self):
self.det_preprocess = Sequential([
ResizeByFactor(32, 960), Div(255),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
(2, 0, 1))
])
self.filter_func = FilterBoxes(10, 10)
self.post_func = DBPostProcess({
"thresh": 0.3,
"box_thresh": 0.5,
"max_candidates": 1000,
"unclip_ratio": 1.5,
"min_size": 3
})
def preprocess(self, input_dicts):
(_, input_dict), = input_dicts.items()
data = base64.b64decode(input_dict["image"].encode('utf8'))
data = np.fromstring(data, np.uint8)
# Note: class variables(self.var) can only be used in process op mode
self.im = cv2.imdecode(data, cv2.IMREAD_COLOR)
self.ori_h, self.ori_w, _ = self.im.shape
det_img = self.det_preprocess(self.im)
_, self.new_h, self.new_w = det_img.shape
return {"image": det_img}
def postprocess(self, input_dicts, fetch_dict):
det_out = fetch_dict["concat_1.tmp_0"]
ratio_list = [
float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w
]
dt_boxes_list = self.post_func(det_out, [ratio_list])
dt_boxes = self.filter_func(dt_boxes_list[0], [self.ori_h, self.ori_w])
out_dict = {"dt_boxes": dt_boxes, "image": self.im}
return out_dict
class RecOp(Op):
def init_op(self):
self.ocr_reader = OCRReader()
self.get_rotate_crop_image = GetRotateCropImage()
self.sorted_boxes = SortedBoxes()
def preprocess(self, input_dicts):
(_, input_dict), = input_dicts.items()
im = input_dict["image"]
dt_boxes = input_dict["dt_boxes"]
dt_boxes = self.sorted_boxes(dt_boxes)
feed_list = []
img_list = []
max_wh_ratio = 0
for i, dtbox in enumerate(dt_boxes):
boximg = self.get_rotate_crop_image(im, dt_boxes[i])
img_list.append(boximg)
h, w = boximg.shape[0:2]
wh_ratio = w * 1.0 / h
max_wh_ratio = max(max_wh_ratio, wh_ratio)
for img in img_list:
norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio)
feed = {"image": norm_img}
feed_list.append(feed)
return feed_list
def postprocess(self, input_dicts, fetch_dict):
rec_res = self.ocr_reader.postprocess(fetch_dict, with_score=True)
res_lst = []
for res in rec_res:
res_lst.append(res[0])
res = {"res": str(res_lst)}
return res
read_op = RequestOp()
det_op = DetOp(
name="det",
input_ops=[read_op],
local_rpc_service_handler=LocalRpcServiceHandler(
model_config="ocr_det_model",
workdir="det_workdir", # defalut: "workdir"
thread_num=2, # defalut: 2
devices="0", # gpu0. defalut: "" (cpu)
mem_optim=True, # defalut: True
ir_optim=False, # defalut: False
available_port_generator=None), # defalut: None
concurrency=1)
rec_op = RecOp(
name="rec",
input_ops=[det_op],
local_rpc_service_handler=LocalRpcServiceHandler(
model_config="ocr_rec_model"),
concurrency=1)
response_op = ResponseOp(input_ops=[rec_op])
server = PipelineServer("ocr")
server.set_response_op(response_op)
server.prepare_server('config.yml')
server.run_server()
...@@ -11,7 +11,7 @@ ...@@ -11,7 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
from paddle_serving_server_gpu.pipeline import PipelineClient from paddle_serving_server.pipeline import PipelineClient
import numpy as np import numpy as np
import requests import requests
import json import json
......
...@@ -20,7 +20,7 @@ import base64 ...@@ -20,7 +20,7 @@ import base64
import os import os
client = PipelineClient() client = PipelineClient()
client.connect(['127.0.0.1:18080']) client.connect(['127.0.0.1:18090'])
def cv2_to_base64(image): def cv2_to_base64(image):
...@@ -33,6 +33,6 @@ for img_file in os.listdir(test_img_dir): ...@@ -33,6 +33,6 @@ for img_file in os.listdir(test_img_dir):
image_data = file.read() image_data = file.read()
image = cv2_to_base64(image_data) image = cv2_to_base64(image_data)
for i in range(4): for i in range(1):
ret = client.predict(feed_dict={"image": image}, fetch=["res"]) ret = client.predict(feed_dict={"image": image}, fetch=["res"])
print(ret) print(ret)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_server_gpu.pipeline import Op, RequestOp, ResponseOp
from paddle_serving_server_gpu.pipeline import PipelineServer
from paddle_serving_server_gpu.pipeline.proto import pipeline_service_pb2
from paddle_serving_server_gpu.pipeline.channel import ChannelDataEcode
import numpy as np
import cv2
import time
import base64
import json
from paddle_serving_app.reader import OCRReader
from paddle_serving_app.reader import Sequential, ResizeByFactor
from paddle_serving_app.reader import Div, Normalize, Transpose
from paddle_serving_app.reader import DBPostProcess, FilterBoxes, GetRotateCropImage, SortedBoxes
import time
import re
import base64
import logging
_LOGGER = logging.getLogger()
class DetOp(Op):
def init_op(self):
self.det_preprocess = Sequential([
ResizeByFactor(32, 960), Div(255),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
(2, 0, 1))
])
self.filter_func = FilterBoxes(10, 10)
self.post_func = DBPostProcess({
"thresh": 0.3,
"box_thresh": 0.5,
"max_candidates": 1000,
"unclip_ratio": 1.5,
"min_size": 3
})
def preprocess(self, input_dicts):
(_, input_dict), = input_dicts.items()
data = base64.b64decode(input_dict["image"].encode('utf8'))
data = np.fromstring(data, np.uint8)
# Note: class variables(self.var) can only be used in process op mode
self.im = cv2.imdecode(data, cv2.IMREAD_COLOR)
self.ori_h, self.ori_w, _ = self.im.shape
det_img = self.det_preprocess(self.im)
_, self.new_h, self.new_w = det_img.shape
return {"image": det_img}
def postprocess(self, input_dicts, fetch_dict):
det_out = fetch_dict["concat_1.tmp_0"]
ratio_list = [
float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w
]
dt_boxes_list = self.post_func(det_out, [ratio_list])
dt_boxes = self.filter_func(dt_boxes_list[0], [self.ori_h, self.ori_w])
out_dict = {"dt_boxes": dt_boxes, "image": self.im}
return out_dict
class RecOp(Op):
def init_op(self):
self.ocr_reader = OCRReader()
self.get_rotate_crop_image = GetRotateCropImage()
self.sorted_boxes = SortedBoxes()
def preprocess(self, input_dicts):
(_, input_dict), = input_dicts.items()
im = input_dict["image"]
dt_boxes = input_dict["dt_boxes"]
dt_boxes = self.sorted_boxes(dt_boxes)
feed_list = []
img_list = []
max_wh_ratio = 0
for i, dtbox in enumerate(dt_boxes):
boximg = self.get_rotate_crop_image(im, dt_boxes[i])
img_list.append(boximg)
h, w = boximg.shape[0:2]
wh_ratio = w * 1.0 / h
max_wh_ratio = max(max_wh_ratio, wh_ratio)
for img in img_list:
norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio)
feed = {"image": norm_img}
feed_list.append(feed)
return feed_list
def postprocess(self, input_dicts, fetch_dict):
rec_res = self.ocr_reader.postprocess(fetch_dict, with_score=True)
res_lst = []
for res in rec_res:
res_lst.append(res[0])
res = {"res": str(res_lst)}
return res
read_op = RequestOp()
det_op = DetOp(
name="det",
input_ops=[read_op],
server_endpoints=["127.0.0.1:12000"],
fetch_list=["concat_1.tmp_0"],
client_config="ocr_det_client/serving_client_conf.prototxt",
concurrency=1)
rec_op = RecOp(
name="rec",
input_ops=[det_op],
server_endpoints=["127.0.0.1:12001"],
fetch_list=["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"],
client_config="ocr_rec_client/serving_client_conf.prototxt",
concurrency=1)
response_op = ResponseOp(input_ops=[rec_op])
server = PipelineServer("ocr")
server.set_response_op(response_op)
server.prepare_server('config.yml')
server.run_server()
...@@ -14,7 +14,7 @@ ...@@ -14,7 +14,7 @@
try: try:
from paddle_serving_server.web_service import WebService, Op from paddle_serving_server.web_service import WebService, Op
except ImportError: except ImportError:
from paddle_serving_server.web_service import WebService, Op from paddle_serving_server_gpu.web_service import WebService, Op
import logging import logging
import numpy as np import numpy as np
import cv2 import cv2
...@@ -43,7 +43,7 @@ class DetOp(Op): ...@@ -43,7 +43,7 @@ class DetOp(Op):
"min_size": 3 "min_size": 3
}) })
def preprocess(self, input_dicts): def preprocess(self, input_dicts, data_id, log_id):
(_, input_dict), = input_dicts.items() (_, input_dict), = input_dicts.items()
data = base64.b64decode(input_dict["image"].encode('utf8')) data = base64.b64decode(input_dict["image"].encode('utf8'))
data = np.fromstring(data, np.uint8) data = np.fromstring(data, np.uint8)
...@@ -52,9 +52,9 @@ class DetOp(Op): ...@@ -52,9 +52,9 @@ class DetOp(Op):
self.ori_h, self.ori_w, _ = self.im.shape self.ori_h, self.ori_w, _ = self.im.shape
det_img = self.det_preprocess(self.im) det_img = self.det_preprocess(self.im)
_, self.new_h, self.new_w = det_img.shape _, self.new_h, self.new_w = det_img.shape
return {"image": det_img[np.newaxis, :]} return {"image": det_img[np.newaxis, :].copy()}, False, None, ""
def postprocess(self, input_dicts, fetch_dict): def postprocess(self, input_dicts, fetch_dict, log_id):
det_out = fetch_dict["concat_1.tmp_0"] det_out = fetch_dict["concat_1.tmp_0"]
ratio_list = [ ratio_list = [
float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w
...@@ -63,7 +63,7 @@ class DetOp(Op): ...@@ -63,7 +63,7 @@ class DetOp(Op):
dt_boxes = self.filter_func(dt_boxes_list[0], [self.ori_h, self.ori_w]) dt_boxes = self.filter_func(dt_boxes_list[0], [self.ori_h, self.ori_w])
out_dict = {"dt_boxes": dt_boxes, "image": self.im} out_dict = {"dt_boxes": dt_boxes, "image": self.im}
print("out dict", out_dict) print("out dict", out_dict)
return out_dict return out_dict, None, ""
class RecOp(Op): class RecOp(Op):
...@@ -72,7 +72,7 @@ class RecOp(Op): ...@@ -72,7 +72,7 @@ class RecOp(Op):
self.get_rotate_crop_image = GetRotateCropImage() self.get_rotate_crop_image = GetRotateCropImage()
self.sorted_boxes = SortedBoxes() self.sorted_boxes = SortedBoxes()
def preprocess(self, input_dicts): def preprocess(self, input_dicts, data_id, log_id):
(_, input_dict), = input_dicts.items() (_, input_dict), = input_dicts.items()
im = input_dict["image"] im = input_dict["image"]
dt_boxes = input_dict["dt_boxes"] dt_boxes = input_dict["dt_boxes"]
...@@ -93,15 +93,15 @@ class RecOp(Op): ...@@ -93,15 +93,15 @@ class RecOp(Op):
norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio) norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio)
imgs[id] = norm_img imgs[id] = norm_img
feed = {"image": imgs.copy()} feed = {"image": imgs.copy()}
return feed return feed, False, None, ""
def postprocess(self, input_dicts, fetch_dict): def postprocess(self, input_dicts, fetch_dict, log_id):
rec_res = self.ocr_reader.postprocess(fetch_dict, with_score=True) rec_res = self.ocr_reader.postprocess(fetch_dict, with_score=True)
res_lst = [] res_lst = []
for res in rec_res: for res in rec_res:
res_lst.append(res[0]) res_lst.append(res[0])
res = {"res": str(res_lst)} res = {"res": str(res_lst)}
return res return res, None, ""
class OcrService(WebService): class OcrService(WebService):
...@@ -112,5 +112,5 @@ class OcrService(WebService): ...@@ -112,5 +112,5 @@ class OcrService(WebService):
uci_service = OcrService(name="ocr") uci_service = OcrService(name="ocr")
uci_service.prepare_pipeline_config("brpc_config.yml") uci_service.prepare_pipeline_config("config.yml")
uci_service.run_service() uci_service.run_service()
...@@ -15,5 +15,5 @@ python web_service.py &>log.txt & ...@@ -15,5 +15,5 @@ python web_service.py &>log.txt &
## Http test ## Http test
``` ```
curl -X POST -k http://localhost:18080/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}' curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}'
``` ```
...@@ -15,5 +15,5 @@ python web_service.py &>log.txt & ...@@ -15,5 +15,5 @@ python web_service.py &>log.txt &
## Test ## Test
``` ```
curl -X POST -k http://localhost:18080/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}' curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}'
``` ```
worker_num: 4 #worker_num, the maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG
http_port: 18080 ##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the gRPC thread pool of the main thread
worker_num: 1
#HTTP port; rpc_port and http_port must not both be empty. When rpc_port is available and http_port is empty, http_port is not generated automatically
http_port: 18082
dag: dag:
is_thread_op: false #Op resource type: True for the thread model, False for the process model
is_thread_op: False
op: op:
uci: uci:
#When the op config has no server_endpoints, the local service config is read from local_service_conf
local_service_conf: local_service_conf:
#Concurrency; when is_thread_op=True this is thread-level concurrency, otherwise process-level
concurrency: 2
#Path of the uci model
model_config: uci_housing_model model_config: uci_housing_model
devices: "" # "0,1"
#Compute device IDs; when devices is "" or unset, prediction runs on CPU; when devices is "0" or "0,1,2", prediction runs on the listed GPU cards
devices: "0" # "0,1"
#Client type: brpc, grpc or local_predictor. local_predictor does not start a Serving service; prediction runs in-process
client_type: local_predictor
#Fetch result list, based on the alias_name of fetch_var in client_config
fetch_list: ["price"]
...@@ -17,6 +17,7 @@ except ImportError: ...@@ -17,6 +17,7 @@ except ImportError:
from paddle_serving_server.web_service import WebService, Op from paddle_serving_server.web_service import WebService, Op
import logging import logging
import numpy as np import numpy as np
import sys
_LOGGER = logging.getLogger() _LOGGER = logging.getLogger()
...@@ -25,20 +26,32 @@ class UciOp(Op): ...@@ -25,20 +26,32 @@ class UciOp(Op):
def init_op(self): def init_op(self):
self.separator = "," self.separator = ","
def preprocess(self, input_dicts): def preprocess(self, input_dicts, data_id, log_id):
(_, input_dict), = input_dicts.items() (_, input_dict), = input_dicts.items()
_LOGGER.info(input_dict) _LOGGER.error("UciOp::preprocess >>> log_id:{}, input:{}".format(
log_id, input_dict))
x_value = input_dict["x"] x_value = input_dict["x"]
if isinstance(x_value, (str, unicode)): proc_dict = {}
input_dict["x"] = np.array( if sys.version_info.major == 2:
[float(x.strip()) if isinstance(x_value, (str, unicode)):
for x in x_value.split(self.separator)]).reshape(1, 13) input_dict["x"] = np.array(
return input_dict [float(x.strip())
for x in x_value.split(self.separator)]).reshape(1, 13)
def postprocess(self, input_dicts, fetch_dict): _LOGGER.error("input_dict:{}".format(input_dict))
# _LOGGER.info(fetch_dict) else:
if isinstance(x_value, str):
input_dict["x"] = np.array(
[float(x.strip())
for x in x_value.split(self.separator)]).reshape(1, 13)
_LOGGER.error("input_dict:{}".format(input_dict))
return input_dict, False, None, ""
def postprocess(self, input_dicts, fetch_dict, log_id):
_LOGGER.info("UciOp::postprocess >>> log_id:{}, fetch_dict:{}".format(
log_id, fetch_dict))
fetch_dict["price"] = str(fetch_dict["price"][0][0]) fetch_dict["price"] = str(fetch_dict["price"][0][0])
return fetch_dict return fetch_dict, None, ""
class UciService(WebService): class UciService(WebService):
......
...@@ -37,6 +37,7 @@ class SentaService(WebService): ...@@ -37,6 +37,7 @@ class SentaService(WebService):
#Preprocessing for the senta prediction service; call order: lac reader -> lac model prediction -> postprocessing of results -> senta reader #Preprocessing for the senta prediction service; call order: lac reader -> lac model prediction -> postprocessing of results -> senta reader
def preprocess(self, feed=[], fetch=[]): def preprocess(self, feed=[], fetch=[]):
feed_batch = [] feed_batch = []
is_batch = True
words_lod = [0] words_lod = [0]
for ins in feed: for ins in feed:
if "words" not in ins: if "words" not in ins:
...@@ -64,14 +65,13 @@ class SentaService(WebService): ...@@ -64,14 +65,13 @@ class SentaService(WebService):
return { return {
"words": np.concatenate(feed_batch), "words": np.concatenate(feed_batch),
"words.lod": words_lod "words.lod": words_lod
}, fetch }, fetch, is_batch
senta_service = SentaService(name="senta") senta_service = SentaService(name="senta")
senta_service.load_model_config("senta_bilstm_model") senta_service.load_model_config("senta_bilstm_model")
senta_service.prepare_server(workdir="workdir") senta_service.prepare_server(workdir="workdir")
senta_service.init_lac_client( senta_service.init_lac_client(
lac_port=9300, lac_port=9300, lac_client_config="lac_model/serving_server_conf.prototxt")
lac_client_config="lac/lac_model/serving_server_conf.prototxt")
senta_service.run_rpc_service() senta_service.run_rpc_service()
senta_service.run_web_service() senta_service.run_web_service()
# UNET_BENCHMARK Usage
## Features
* benchmark testing
## Notes
* Place the sample images (more than one is allowed) in the img_data directory; jpg and jpeg are supported
* The number of images should be greater than or equal to the concurrency
## TODO
* http benchmark
#!/bin/bash
python unet_benchmark.py --thread 1 --batch_size 1 --model ../unet_client/serving_client_conf.prototxt
# thread/batch can be modified as you wish
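# optional (assumption based on the os.getenv calls in unet_benchmark.py below):
# export FLAGS_serving_latency=1 to collect per-request latency,
# export FLAGS_profile_client=1 to emit PROFILE trace lines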
# -*- coding: utf-8 -*-
#
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
unet benchmark script
20201130 first edition by cg82616424
"""
from __future__ import unicode_literals, absolute_import
import os
import sys
import time
import json
import requests
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args, show_latency
from paddle_serving_app.reader import Sequential, File2Image, Resize, Transpose, BGR2RGB, SegPostprocess
args = benchmark_args()
def get_img_names(path):
"""
    Brief:
        get image files (jpg/jpeg) under this path;
        returns None if the path does not exist or is not a directory
    Args:
        path (string): image folder path
    Returns:
        list: image file paths under this folder
"""
if not os.path.exists(path):
return None
if not os.path.isdir(path):
return None
list_name = []
for f_handler in os.listdir(path):
file_path = os.path.join(path, f_handler)
if os.path.isdir(file_path):
continue
else:
if not file_path.endswith(".jpeg") and not file_path.endswith(
".jpg"):
continue
list_name.append(file_path)
return list_name
def preprocess_img(img_list):
"""
Brief:
prepare img data for benchmark
Args:
img_list(list): list for img file path
Returns:
image content binary list after preprocess
"""
preprocess = Sequential([File2Image(), Resize((512, 512))])
result_list = []
for img in img_list:
img_tmp = preprocess(img)
result_list.append(img_tmp)
return result_list
def benchmark_worker(idx, resource):
    """
    Brief:
        benchmark a single worker for unet
    Args:
        idx(int): worker idx, used to select the backend unet service
        resource(dict): unet serving endpoint dict
    Returns:
        latency
    TODO:
        http benchmarks
"""
profile_flags = False
latency_flags = False
postprocess = SegPostprocess(2)
if os.getenv("FLAGS_profile_client"):
profile_flags = True
if os.getenv("FLAGS_serving_latency"):
latency_flags = True
latency_list = []
client_handler = Client()
client_handler.load_client_config(args.model)
client_handler.connect(
[resource["endpoint"][idx % len(resource["endpoint"])]])
start = time.time()
turns = resource["turns"]
img_list = resource["img_list"]
for i in range(turns):
if args.batch_size >= 1:
l_start = time.time()
feed_batch = []
b_start = time.time()
for bi in range(args.batch_size):
feed_batch.append({"image": img_list[bi]})
b_end = time.time()
if profile_flags:
sys.stderr.write(
"PROFILE\tpid:{}\tunt_pre_0:{} unet_pre_1:{}\n".format(
os.getpid(),
int(round(b_start * 1000000)),
int(round(b_end * 1000000))))
            # send the whole batch assembled above instead of a single image
            result = client_handler.predict(
                feed=feed_batch, fetch=["output"])
            #result["filename"] = "./img_data/N0060.jpg" % (os.getpid(), idx, time.time())
            #postprocess(result) # if you want to measure post process time, you have to uncomment this line
l_end = time.time()
if latency_flags:
latency_list.append(l_end * 1000 - l_start * 1000)
else:
print("unsupport batch size {}".format(args.batch_size))
end = time.time()
if latency_flags:
return [[end - start], latency_list]
else:
return [[end - start]]
if __name__ == '__main__':
"""
usage:
"""
img_file_list = get_img_names("./img_data")
img_content_list = preprocess_img(img_file_list)
multi_thread_runner = MultiThreadRunner()
endpoint_list = ["127.0.0.1:9494"]
turns = 1
start = time.time()
    result = multi_thread_runner.run(benchmark_worker, args.thread, {
"endpoint": endpoint_list,
"turns": turns,
"img_list": img_content_list
})
end = time.time()
total_cost = end - start
avg_cost = 0
for i in range(args.thread):
avg_cost += result[0][i]
avg_cost = avg_cost / args.thread
print("total cost: {}s".format(total_cost))
print("each thread cost: {}s. ".format(avg_cost))
print("qps: {}samples/s".format(args.batch_size * args.thread * turns /
total_cost))
if os.getenv("FLAGS_serving_latency"):
show_latency(result[1])
...@@ -32,6 +32,12 @@ logger.setLevel(logging.INFO) ...@@ -32,6 +32,12 @@ logger.setLevel(logging.INFO)
class LocalPredictor(object): class LocalPredictor(object):
"""
    Prediction in the current process of the local environment: an in-process
    call. Compared with RPC/HTTP, LocalPredictor has better performance
    because there is no network or serialization overhead.
"""
def __init__(self): def __init__(self):
self.feed_names_ = [] self.feed_names_ = []
self.fetch_names_ = [] self.fetch_names_ = []
...@@ -42,13 +48,41 @@ class LocalPredictor(object): ...@@ -42,13 +48,41 @@ class LocalPredictor(object):
self.fetch_names_to_idx_ = {} self.fetch_names_to_idx_ = {}
self.fetch_names_to_type_ = {} self.fetch_names_to_type_ = {}
def load_model_config(self, model_path, gpu=False, profile=True, cpu_num=1): def load_model_config(self,
model_path,
use_gpu=False,
gpu_id=0,
use_profile=False,
thread_num=1,
mem_optim=True,
ir_optim=False,
use_trt=False,
use_feed_fetch_ops=False):
"""
Load model config and set the engine config for the paddle predictor
Args:
            model_path: model config path.
            use_gpu: run prediction on GPU, False by default.
            gpu_id: GPU id, 0 by default.
            use_profile: enable predictor profiling, False by default.
            thread_num: number of computation threads, 1 by default.
            mem_optim: enable memory optimization, True by default.
            ir_optim: enable computation graph (IR) optimization, False by default.
            use_trt: enable NVIDIA TensorRT optimization, False by default.
            use_feed_fetch_ops: use feed/fetch ops, False by default.
"""
client_config = "{}/serving_server_conf.prototxt".format(model_path) client_config = "{}/serving_server_conf.prototxt".format(model_path)
model_conf = m_config.GeneralModelConfig() model_conf = m_config.GeneralModelConfig()
f = open(client_config, 'r') f = open(client_config, 'r')
model_conf = google.protobuf.text_format.Merge( model_conf = google.protobuf.text_format.Merge(
str(f.read()), model_conf) str(f.read()), model_conf)
config = AnalysisConfig(model_path) config = AnalysisConfig(model_path)
logger.info("load_model_config params: model_path:{}, use_gpu:{},\
gpu_id:{}, use_profile:{}, thread_num:{}, mem_optim:{}, ir_optim:{},\
use_trt:{}, use_feed_fetch_ops:{}".format(
model_path, use_gpu, gpu_id, use_profile, thread_num, mem_optim,
ir_optim, use_trt, use_feed_fetch_ops))
self.feed_names_ = [var.alias_name for var in model_conf.feed_var] self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var] self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
...@@ -64,19 +98,43 @@ class LocalPredictor(object): ...@@ -64,19 +98,43 @@ class LocalPredictor(object):
self.fetch_names_to_idx_[var.alias_name] = i self.fetch_names_to_idx_[var.alias_name] = i
self.fetch_names_to_type_[var.alias_name] = var.fetch_type self.fetch_names_to_type_[var.alias_name] = var.fetch_type
if not gpu: if use_profile:
config.disable_gpu()
else:
config.enable_use_gpu(100, 0)
if profile:
config.enable_profile() config.enable_profile()
if mem_optim:
config.enable_memory_optim()
config.switch_ir_optim(ir_optim)
config.set_cpu_math_library_num_threads(thread_num)
config.switch_use_feed_fetch_ops(use_feed_fetch_ops)
config.delete_pass("conv_transpose_eltwiseadd_bn_fuse_pass") config.delete_pass("conv_transpose_eltwiseadd_bn_fuse_pass")
config.set_cpu_math_library_num_threads(cpu_num)
config.switch_ir_optim(False) if not use_gpu:
config.switch_use_feed_fetch_ops(False) config.disable_gpu()
else:
config.enable_use_gpu(100, gpu_id)
if use_trt:
config.enable_tensorrt_engine(
workspace_size=1 << 20,
max_batch_size=32,
min_subgraph_size=3,
use_static=False,
use_calib_mode=False)
self.predictor = create_paddle_predictor(config) self.predictor = create_paddle_predictor(config)
def predict(self, feed=None, fetch=None, batch=False, log_id=0): def predict(self, feed=None, fetch=None, batch=False, log_id=0):
"""
Predict locally
Args:
feed: feed var
fetch: fetch var
batch: batch data or not, False default.If batch is False, a new
dimension is added to header of the shape[np.newaxis].
log_id: for logging
Returns:
fetch_map: dict
"""
if feed is None or fetch is None: if feed is None or fetch is None:
raise ValueError("You should specify feed and fetch for prediction") raise ValueError("You should specify feed and fetch for prediction")
fetch_list = [] fetch_list = []
......
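Taken together, the new `load_model_config` and `predict` signatures above can be exercised in a few lines. A minimal sketch follows; the `uci_housing_model` directory and the `x`/`price` variable names are assumptions borrowed from the UCI example elsewhere in this diff:

```python
from paddle_serving_app.local_predict import LocalPredictor
import numpy as np

predictor = LocalPredictor()
# engine knobs map one-to-one onto the config calls shown above
predictor.load_model_config(
    "uci_housing_model", use_gpu=False, thread_num=1, mem_optim=True)
# batch=False: a batch dimension is prepended to each feed var via np.newaxis
fetch_map = predictor.predict(
    feed={"x": np.random.rand(13).astype("float32")},
    fetch=["price"],
    batch=False)
print(fetch_map["price"])
```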
...@@ -18,5 +18,5 @@ from .image_reader import RCNNPostprocess, SegPostprocess, PadStride, BlazeFaceP ...@@ -18,5 +18,5 @@ from .image_reader import RCNNPostprocess, SegPostprocess, PadStride, BlazeFaceP
from .image_reader import DBPostProcess, FilterBoxes, GetRotateCropImage, SortedBoxes from .image_reader import DBPostProcess, FilterBoxes, GetRotateCropImage, SortedBoxes
from .lac_reader import LACReader from .lac_reader import LACReader
from .senta_reader import SentaReader from .senta_reader import SentaReader
from .imdb_reader import IMDBDataset #from .imdb_reader import IMDBDataset
from .ocr_reader import OCRReader from .ocr_reader import OCRReader
...@@ -22,18 +22,17 @@ import yaml ...@@ -22,18 +22,17 @@ import yaml
import copy import copy
import argparse import argparse
import logging import logging
import paddle.fluid as fluid
import json import json
FORMAT = '%(asctime)s-%(levelname)s: %(message)s' FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT) logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
precision_map = { #precision_map = {
'trt_int8': fluid.core.AnalysisConfig.Precision.Int8, # 'trt_int8': fluid.core.AnalysisConfig.Precision.Int8,
'trt_fp32': fluid.core.AnalysisConfig.Precision.Float32, # 'trt_fp32': fluid.core.AnalysisConfig.Precision.Float32,
'trt_fp16': fluid.core.AnalysisConfig.Precision.Half # 'trt_fp16': fluid.core.AnalysisConfig.Precision.Half
} #}
class Resize(object): class Resize(object):
......
...@@ -92,9 +92,12 @@ def save_model(server_model_folder, ...@@ -92,9 +92,12 @@ def save_model(server_model_folder,
fetch_var.shape.extend(tmp_shape) fetch_var.shape.extend(tmp_shape)
config.fetch_var.extend([fetch_var]) config.fetch_var.extend([fetch_var])
cmd = "mkdir -p {}".format(client_config_folder) try:
save_dirname = os.path.normpath(client_config_folder)
os.system(cmd) os.makedirs(save_dirname)
except OSError as e:
if e.errno != errno.EEXIST:
raise
with open("{}/serving_client_conf.prototxt".format(client_config_folder), with open("{}/serving_client_conf.prototxt".format(client_config_folder),
"w") as fout: "w") as fout:
fout.write(str(config)) fout.write(str(config))
......
...@@ -23,13 +23,13 @@ import paddle_serving_server as paddle_serving_server ...@@ -23,13 +23,13 @@ import paddle_serving_server as paddle_serving_server
from .version import serving_server_version from .version import serving_server_version
from contextlib import closing from contextlib import closing
import collections import collections
import fcntl
import shutil import shutil
import numpy as np import numpy as np
import grpc import grpc
from .proto import multi_lang_general_model_service_pb2 from .proto import multi_lang_general_model_service_pb2
import sys import sys
if sys.platform.startswith('win') is False:
import fcntl
sys.path.append( sys.path.append(
os.path.join(os.path.abspath(os.path.dirname(__file__)), 'proto')) os.path.join(os.path.abspath(os.path.dirname(__file__)), 'proto'))
from .proto import multi_lang_general_model_service_pb2_grpc from .proto import multi_lang_general_model_service_pb2_grpc
......
...@@ -52,6 +52,20 @@ class WebService(object): ...@@ -52,6 +52,20 @@ class WebService(object):
def load_model_config(self, model_config): def load_model_config(self, model_config):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
self.model_config = model_config self.model_config = model_config
import os
from .proto import general_model_config_pb2 as m_config
import google.protobuf.text_format
if os.path.isdir(model_config):
client_config = "{}/serving_server_conf.prototxt".format(
model_config)
        elif os.path.isfile(model_config):
client_config = model_config
model_conf = m_config.GeneralModelConfig()
f = open(client_config, 'r')
model_conf = google.protobuf.text_format.Merge(
str(f.read()), model_conf)
self.feed_names = [var.alias_name for var in model_conf.feed_var]
self.fetch_names = [var.alias_name for var in model_conf.fetch_var]
def _launch_rpc_service(self): def _launch_rpc_service(self):
op_maker = OpMaker() op_maker = OpMaker()
...@@ -112,13 +126,14 @@ class WebService(object): ...@@ -112,13 +126,14 @@ class WebService(object):
if "fetch" not in request.json: if "fetch" not in request.json:
abort(400) abort(400)
try: try:
feed, fetch = self.preprocess(request.json["feed"], feed, fetch, is_batch = self.preprocess(request.json["feed"],
request.json["fetch"]) request.json["fetch"])
if isinstance(feed, dict) and "fetch" in feed: if isinstance(feed, dict) and "fetch" in feed:
del feed["fetch"] del feed["fetch"]
if len(feed) == 0: if len(feed) == 0:
raise ValueError("empty input") raise ValueError("empty input")
fetch_map = self.client.predict(feed=feed, fetch=fetch, batch=True) fetch_map = self.client.predict(
feed=feed, fetch=fetch, batch=is_batch)
result = self.postprocess( result = self.postprocess(
feed=request.json["feed"], fetch=fetch, fetch_map=fetch_map) feed=request.json["feed"], fetch=fetch, fetch_map=fetch_map)
result = {"result": result} result = {"result": result}
...@@ -174,21 +189,19 @@ class WebService(object): ...@@ -174,21 +189,19 @@ class WebService(object):
from paddle_serving_app.local_predict import LocalPredictor from paddle_serving_app.local_predict import LocalPredictor
self.client = LocalPredictor() self.client = LocalPredictor()
self.client.load_model_config( self.client.load_model_config(
"{}".format(self.model_config), gpu=False, profile=False) "{}".format(self.model_config), use_gpu=False)
def run_web_service(self): def run_web_service(self):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
self.app_instance.run(host="0.0.0.0", self.app_instance.run(host="0.0.0.0", port=self.port, threaded=True)
port=self.port,
threaded=False,
processes=1)
def get_app_instance(self): def get_app_instance(self):
return self.app_instance return self.app_instance
def preprocess(self, feed=[], fetch=[]): def preprocess(self, feed=[], fetch=[]):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
return feed, fetch is_batch = True
return feed, fetch, is_batch
def postprocess(self, feed=[], fetch=[], fetch_map=None): def postprocess(self, feed=[], fetch=[], fetch_map=None):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
......
...@@ -58,6 +58,20 @@ class WebService(object): ...@@ -58,6 +58,20 @@ class WebService(object):
def load_model_config(self, model_config): def load_model_config(self, model_config):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
self.model_config = model_config self.model_config = model_config
import os
from .proto import general_model_config_pb2 as m_config
import google.protobuf.text_format
if os.path.isdir(model_config):
client_config = "{}/serving_server_conf.prototxt".format(
model_config)
        elif os.path.isfile(model_config):
client_config = model_config
model_conf = m_config.GeneralModelConfig()
f = open(client_config, 'r')
model_conf = google.protobuf.text_format.Merge(
str(f.read()), model_conf)
self.feed_names = [var.alias_name for var in model_conf.feed_var]
self.fetch_names = [var.alias_name for var in model_conf.fetch_var]
def set_gpus(self, gpus): def set_gpus(self, gpus):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
...@@ -167,13 +181,14 @@ class WebService(object): ...@@ -167,13 +181,14 @@ class WebService(object):
if "fetch" not in request.json: if "fetch" not in request.json:
abort(400) abort(400)
try: try:
feed, fetch = self.preprocess(request.json["feed"], feed, fetch, is_batch = self.preprocess(request.json["feed"],
request.json["fetch"]) request.json["fetch"])
if isinstance(feed, dict) and "fetch" in feed: if isinstance(feed, dict) and "fetch" in feed:
del feed["fetch"] del feed["fetch"]
if len(feed) == 0: if len(feed) == 0:
raise ValueError("empty input") raise ValueError("empty input")
fetch_map = self.client.predict(feed=feed, fetch=fetch) fetch_map = self.client.predict(
feed=feed, fetch=fetch, batch=is_batch)
result = self.postprocess( result = self.postprocess(
feed=request.json["feed"], fetch=fetch, fetch_map=fetch_map) feed=request.json["feed"], fetch=fetch, fetch_map=fetch_map)
result = {"result": result} result = {"result": result}
...@@ -235,21 +250,19 @@ class WebService(object): ...@@ -235,21 +250,19 @@ class WebService(object):
from paddle_serving_app.local_predict import LocalPredictor from paddle_serving_app.local_predict import LocalPredictor
self.client = LocalPredictor() self.client = LocalPredictor()
self.client.load_model_config( self.client.load_model_config(
"{}".format(self.model_config), gpu=gpu, profile=False) "{}".format(self.model_config), use_gpu=True, gpu_id=self.gpus[0])
def run_web_service(self): def run_web_service(self):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
self.app_instance.run(host="0.0.0.0", self.app_instance.run(host="0.0.0.0", port=self.port, threaded=True)
port=self.port,
threaded=False,
processes=4)
def get_app_instance(self): def get_app_instance(self):
return self.app_instance return self.app_instance
def preprocess(self, feed=[], fetch=[]): def preprocess(self, feed=[], fetch=[]):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
return feed, fetch is_batch = True
return feed, fetch, is_batch
def postprocess(self, feed=[], fetch=[], fetch_map=None): def postprocess(self, feed=[], fetch=[], fetch_map=None):
print("This API will be deprecated later. Please do not use it") print("This API will be deprecated later. Please do not use it")
......
...@@ -312,7 +312,7 @@ class OpAnalyst(object): ...@@ -312,7 +312,7 @@ class OpAnalyst(object):
# reduce op times # reduce op times
op_times = { op_times = {
op_name: sum(step_times.values()) op_name: sum(list(step_times.values()))
for op_name, step_times in op_times.items() for op_name, step_times in op_times.items()
} }
......
...@@ -32,7 +32,10 @@ import copy ...@@ -32,7 +32,10 @@ import copy
_LOGGER = logging.getLogger(__name__) _LOGGER = logging.getLogger(__name__)
class ChannelDataEcode(enum.Enum): class ChannelDataErrcode(enum.Enum):
"""
ChannelData error code
"""
OK = 0 OK = 0
TIMEOUT = 1 TIMEOUT = 1
NOT_IMPLEMENTED = 2 NOT_IMPLEMENTED = 2
...@@ -42,9 +45,21 @@ class ChannelDataEcode(enum.Enum): ...@@ -42,9 +45,21 @@ class ChannelDataEcode(enum.Enum):
CLOSED_ERROR = 6 CLOSED_ERROR = 6
NO_SERVICE = 7 NO_SERVICE = 7
UNKNOW = 8 UNKNOW = 8
PRODUCT_ERROR = 9
class ProductErrCode(enum.Enum):
"""
    ProductErrCode is a base class for recording business error codes.
    Product developers inherit this class and extend it with more error codes.
"""
pass
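# Illustrative note (not part of the framework): because ProductErrCode is
# declared without members, Python's enum rules allow it to be subclassed,
# e.g. a product-side extension such as
#
#     class OcrProductErrCode(ProductErrCode):
#         IMAGE_DECODE_FAILED = 1001
#         TEXT_TOO_LONG = 1002
#
# which keeps business codes apart from the framework's ChannelDataErrcode (0-9).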
class ChannelDataType(enum.Enum): class ChannelDataType(enum.Enum):
"""
Channel data type
"""
DICT = 0 DICT = 0
CHANNEL_NPDATA = 1 CHANNEL_NPDATA = 1
ERROR = 2 ERROR = 2
...@@ -56,20 +71,23 @@ class ChannelData(object): ...@@ -56,20 +71,23 @@ class ChannelData(object):
npdata=None, npdata=None,
dictdata=None, dictdata=None,
data_id=None, data_id=None,
ecode=None, log_id=None,
error_code=None,
error_info=None, error_info=None,
prod_error_code=None,
prod_error_info=None,
client_need_profile=False): client_need_profile=False):
''' '''
There are several ways to use it: There are several ways to use it:
1. ChannelData(ChannelDataType.CHANNEL_NPDATA.value, npdata, data_id) 1. ChannelData(ChannelDataType.CHANNEL_NPDATA.value, npdata, data_id, log_id)
2. ChannelData(ChannelDataType.DICT.value, dictdata, data_id) 2. ChannelData(ChannelDataType.DICT.value, dictdata, data_id, log_id)
3. ChannelData(ecode, error_info, data_id) 3. ChannelData(error_code, error_info, prod_error_code, prod_error_info, data_id, log_id)
Protobufs are not pickle-able: Protobufs are not pickle-able:
https://stackoverflow.com/questions/55344376/how-to-import-protobuf-module https://stackoverflow.com/questions/55344376/how-to-import-protobuf-module
''' '''
if ecode is not None: if error_code is not None or prod_error_code is not None:
if data_id is None or error_info is None: if data_id is None or error_info is None:
_LOGGER.critical("Failed to generate ChannelData: data_id" _LOGGER.critical("Failed to generate ChannelData: data_id"
" and error_info cannot be None") " and error_info cannot be None")
...@@ -77,25 +95,30 @@ class ChannelData(object): ...@@ -77,25 +95,30 @@ class ChannelData(object):
datatype = ChannelDataType.ERROR.value datatype = ChannelDataType.ERROR.value
else: else:
if datatype == ChannelDataType.CHANNEL_NPDATA.value: if datatype == ChannelDataType.CHANNEL_NPDATA.value:
ecode, error_info = ChannelData.check_npdata(npdata) error_code, error_info = ChannelData.check_npdata(npdata)
if ecode != ChannelDataEcode.OK.value: if error_code != ChannelDataErrcode.OK.value:
datatype = ChannelDataType.ERROR.value datatype = ChannelDataType.ERROR.value
_LOGGER.error("(logid={}) {}".format(data_id, error_info)) _LOGGER.error("(data_id={} log_id={}) {}".format(
data_id, log_id, error_info))
elif datatype == ChannelDataType.DICT.value: elif datatype == ChannelDataType.DICT.value:
ecode, error_info = ChannelData.check_dictdata(dictdata) error_code, error_info = ChannelData.check_dictdata(dictdata)
if ecode != ChannelDataEcode.OK.value: if error_code != ChannelDataErrcode.OK.value:
datatype = ChannelDataType.ERROR.value datatype = ChannelDataType.ERROR.value
_LOGGER.error("(logid={}) {}".format(data_id, error_info)) _LOGGER.error("(data_id={} log_id={}) {}".format(
data_id, log_id, error_info))
else: else:
_LOGGER.critical("(logid={}) datatype not match".format( _LOGGER.critical("(data_id={} log_id={}) datatype not match".
data_id)) format(data_id, log_id))
os._exit(-1) os._exit(-1)
self.datatype = datatype self.datatype = datatype
self.npdata = npdata self.npdata = npdata
self.dictdata = dictdata self.dictdata = dictdata
self.id = data_id self.id = data_id
self.ecode = ecode self.log_id = log_id
self.error_code = error_code
self.error_info = error_info self.error_info = error_info
self.prod_error_code = prod_error_code
self.prod_error_info = prod_error_info
self.client_need_profile = client_need_profile self.client_need_profile = client_need_profile
self.profile_data_set = set() self.profile_data_set = set()
...@@ -106,67 +129,67 @@ class ChannelData(object): ...@@ -106,67 +129,67 @@ class ChannelData(object):
@staticmethod @staticmethod
def check_dictdata(dictdata): def check_dictdata(dictdata):
ecode = ChannelDataEcode.OK.value error_code = ChannelDataErrcode.OK.value
error_info = None error_info = None
if isinstance(dictdata, list): if isinstance(dictdata, list):
# batch data # batch data
for sample in dictdata: for sample in dictdata:
if not isinstance(sample, dict): if not isinstance(sample, dict):
ecode = ChannelDataEcode.TYPE_ERROR.value error_code = ChannelDataErrcode.TYPE_ERROR.value
error_info = "Failed to check data: the type of " \ error_info = "Failed to check data: the type of " \
"data must be dict, but get {}.".format(type(sample)) "data must be dict, but get {}.".format(type(sample))
break break
elif not isinstance(dictdata, dict): elif not isinstance(dictdata, dict):
# batch size = 1 # batch size = 1
ecode = ChannelDataEcode.TYPE_ERROR.value error_code = ChannelDataErrcode.TYPE_ERROR.value
error_info = "Failed to check data: the type of data must " \ error_info = "Failed to check data: the type of data must " \
"be dict, but get {}.".format(type(dictdata)) "be dict, but get {}.".format(type(dictdata))
return ecode, error_info return error_code, error_info
@staticmethod @staticmethod
def check_batch_npdata(batch): def check_batch_npdata(batch):
ecode = ChannelDataEcode.OK.value error_code = ChannelDataErrcode.OK.value
error_info = None error_info = None
for npdata in batch: for npdata in batch:
ecode, error_info = ChannelData.check_npdata(npdata) error_code, error_info = ChannelData.check_npdata(npdata)
if ecode != ChannelDataEcode.OK.value: if error_code != ChannelDataErrcode.OK.value:
break break
return ecode, error_info return error_code, error_info
@staticmethod @staticmethod
def check_npdata(npdata): def check_npdata(npdata):
ecode = ChannelDataEcode.OK.value error_code = ChannelDataErrcode.OK.value
error_info = None error_info = None
if isinstance(npdata, list): if isinstance(npdata, list):
# batch data # batch data
for sample in npdata: for sample in npdata:
if not isinstance(sample, dict): if not isinstance(sample, dict):
ecode = ChannelDataEcode.TYPE_ERROR.value error_code = ChannelDataErrcode.TYPE_ERROR.value
error_info = "Failed to check data: the " \ error_info = "Failed to check data: the " \
"value of data must be dict, but get {}.".format( "value of data must be dict, but get {}.".format(
type(sample)) type(sample))
break break
for _, value in sample.items(): for _, value in sample.items():
if not isinstance(value, np.ndarray): if not isinstance(value, np.ndarray):
ecode = ChannelDataEcode.TYPE_ERROR.value error_code = ChannelDataErrcode.TYPE_ERROR.value
error_info = "Failed to check data: the" \ error_info = "Failed to check data: the" \
" value of data must be np.ndarray, but get {}.".format( " value of data must be np.ndarray, but get {}.".format(
type(value)) type(value))
return ecode, error_info return error_code, error_info
elif isinstance(npdata, dict): elif isinstance(npdata, dict):
# batch_size = 1 # batch_size = 1
for _, value in npdata.items(): for _, value in npdata.items():
if not isinstance(value, np.ndarray): if not isinstance(value, np.ndarray):
ecode = ChannelDataEcode.TYPE_ERROR.value error_code = ChannelDataErrcode.TYPE_ERROR.value
error_info = "Failed to check data: the value " \ error_info = "Failed to check data: the value " \
"of data must be np.ndarray, but get {}.".format( "of data must be np.ndarray, but get {}.".format(
type(value)) type(value))
break break
else: else:
ecode = ChannelDataEcode.TYPE_ERROR.value error_code = ChannelDataErrcode.TYPE_ERROR.value
error_info = "Failed to check data: the value of data " \ error_info = "Failed to check data: the value of data " \
"must be dict, but get {}.".format(type(npdata)) "must be dict, but get {}.".format(type(npdata))
return ecode, error_info return error_code, error_info
def parse(self): def parse(self):
feed = None feed = None
...@@ -191,8 +214,9 @@ class ChannelData(object): ...@@ -191,8 +214,9 @@ class ChannelData(object):
return 1 return 1
def __str__(self): def __str__(self):
return "type[{}], ecode[{}], id[{}]".format( return "type[{}], error_code[{}], data_id[{}], log_id[{}], dict_data[{}]".format(
ChannelDataType(self.datatype).name, self.ecode, self.id) ChannelDataType(self.datatype).name, self.error_code, self.id,
self.log_id, str(self.dictdata))
class ProcessChannel(object): class ProcessChannel(object):
...@@ -289,14 +313,14 @@ class ProcessChannel(object): ...@@ -289,14 +313,14 @@ class ProcessChannel(object):
def push(self, channeldata, op_name=None): def push(self, channeldata, op_name=None):
_LOGGER.debug( _LOGGER.debug(
self._log("(logid={}) Op({}) Pushing data".format(channeldata.id, self._log("(data_id={} log_id={}) Op({}) Enter channel::push".
op_name))) format(channeldata.id, channeldata.log_id, op_name)))
if len(self._producers) == 0: if len(self._producers) == 0:
_LOGGER.critical( _LOGGER.critical(
self._log( self._log(
"(logid={}) Op({}) Failed to push data: expected number" "(data_id={} log_id={}) Op({}) Failed to push data: expected number"
" of producers to be greater than 0, but the it is 0.". " of producers to be greater than 0, but the it is 0.".
format(channeldata.id, op_name))) format(channeldata.id, channeldata.log_id, op_name)))
os._exit(-1) os._exit(-1)
elif len(self._producers) == 1: elif len(self._producers) == 1:
with self._cv: with self._cv:
...@@ -310,19 +334,21 @@ class ProcessChannel(object): ...@@ -310,19 +334,21 @@ class ProcessChannel(object):
raise ChannelStopError() raise ChannelStopError()
self._cv.notify_all() self._cv.notify_all()
_LOGGER.debug( _LOGGER.debug(
self._log("(logid={}) Op({}) Pushed data into internal queue.". self._log(
format(channeldata.id, op_name))) "(data_id={} log_id={}) Op({}) Pushed data into internal queue.".
format(channeldata.id, channeldata.log_id, op_name)))
return True return True
elif op_name is None: elif op_name is None:
_LOGGER.critical( _LOGGER.critical(
self._log( self._log(
"(logid={}) Op({}) Failed to push data: there are multiple " "(data_id={} log_id={}) Op({}) Failed to push data: there are multiple "
"producers, so op_name cannot be None.".format( "producers, so op_name cannot be None.".format(
channeldata.id, op_name))) channeldata.id, channeldata.log_id, op_name)))
os._exit(-1) os._exit(-1)
producer_num = len(self._producers) producer_num = len(self._producers)
data_id = channeldata.id data_id = channeldata.id
log_id = channeldata.log_id
put_data = None put_data = None
with self._cv: with self._cv:
if data_id not in self._input_buf: if data_id not in self._input_buf:
...@@ -347,8 +373,8 @@ class ProcessChannel(object): ...@@ -347,8 +373,8 @@ class ProcessChannel(object):
if put_data is None: if put_data is None:
_LOGGER.debug( _LOGGER.debug(
self._log( self._log(
"(logid={}) Op({}) Pushed data into input_buffer.". "(data_id={} log_id={}) Op({}) Pushed data into input_buffer.".
format(data_id, op_name))) format(data_id, log_id, op_name)))
else: else:
while self._stop.value == 0: while self._stop.value == 0:
try: try:
...@@ -361,8 +387,8 @@ class ProcessChannel(object): ...@@ -361,8 +387,8 @@ class ProcessChannel(object):
_LOGGER.debug( _LOGGER.debug(
self._log( self._log(
"(logid={}) Op({}) Pushed data into internal_queue.". "(data_id={} log_id={}) Op({}) Pushed data into internal_queue.".
format(data_id, op_name))) format(data_id, log_id, op_name)))
self._cv.notify_all() self._cv.notify_all()
return True return True
...@@ -403,9 +429,12 @@ class ProcessChannel(object): ...@@ -403,9 +429,12 @@ class ProcessChannel(object):
self._cv.wait() self._cv.wait()
if self._stop.value == 1: if self._stop.value == 1:
raise ChannelStopError() raise ChannelStopError()
_LOGGER.debug(
self._log("(logid={}) Op({}) Got data".format(resp.values()[0] if resp is not None:
.id, op_name))) list_values = list(resp.values())
_LOGGER.debug(
self._log("(data_id={} log_id={}) Op({}) Got data".format(
list_values[0].id, list_values[0].log_id, op_name)))
return resp return resp
elif op_name is None: elif op_name is None:
_LOGGER.critical( _LOGGER.critical(
...@@ -432,10 +461,12 @@ class ProcessChannel(object): ...@@ -432,10 +461,12 @@ class ProcessChannel(object):
try: try:
channeldata = self._que.get(timeout=0) channeldata = self._que.get(timeout=0)
self._output_buf.append(channeldata) self._output_buf.append(channeldata)
list_values = list(channeldata.values())
_LOGGER.debug( _LOGGER.debug(
self._log( self._log(
"(logid={}) Op({}) Pop ready item into output_buffer". "(data_id={} log_id={}) Op({}) Pop ready item into output_buffer".
format(channeldata.values()[0].id, op_name))) format(list_values[0].id, list_values[0].log_id,
op_name)))
break break
except Queue.Empty: except Queue.Empty:
if timeout is not None: if timeout is not None:
...@@ -486,9 +517,12 @@ class ProcessChannel(object): ...@@ -486,9 +517,12 @@ class ProcessChannel(object):
self._cv.notify_all() self._cv.notify_all()
_LOGGER.debug( if resp is not None:
self._log("(logid={}) Op({}) Got data from output_buffer".format( list_values = list(resp.values())
resp.values()[0].id, op_name))) _LOGGER.debug(
self._log(
"(data_id={} log_id={}) Op({}) Got data from output_buffer".
format(list_values[0].id, list_values[0].log_id, op_name)))
return resp return resp
def stop(self): def stop(self):
...@@ -586,14 +620,14 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -586,14 +620,14 @@ class ThreadChannel(Queue.PriorityQueue):
def push(self, channeldata, op_name=None): def push(self, channeldata, op_name=None):
_LOGGER.debug( _LOGGER.debug(
self._log("(logid={}) Op({}) Pushing data".format(channeldata.id, self._log("(data_id={} log_id={}) Op({}) Pushing data".format(
op_name))) channeldata.id, channeldata.log_id, op_name)))
if len(self._producers) == 0: if len(self._producers) == 0:
_LOGGER.critical( _LOGGER.critical(
self._log( self._log(
"(logid={}) Op({}) Failed to push data: expected number of " "(data_id={} log_id={}) Op({}) Failed to push data: expected number of "
"producers to be greater than 0, but the it is 0.".format( "producers to be greater than 0, but the it is 0.".format(
channeldata.id, op_name))) channeldata.id, channeldata.log_id, op_name)))
os._exit(-1) os._exit(-1)
elif len(self._producers) == 1: elif len(self._producers) == 1:
with self._cv: with self._cv:
...@@ -607,19 +641,21 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -607,19 +641,21 @@ class ThreadChannel(Queue.PriorityQueue):
raise ChannelStopError() raise ChannelStopError()
self._cv.notify_all() self._cv.notify_all()
_LOGGER.debug( _LOGGER.debug(
self._log("(logid={}) Op({}) Pushed data into internal_queue.". self._log(
format(channeldata.id, op_name))) "(data_id={} log_id={}) Op({}) Pushed data into internal_queue.".
format(channeldata.id, channeldata.log_id, op_name)))
return True return True
elif op_name is None: elif op_name is None:
_LOGGER.critical( _LOGGER.critical(
self._log( self._log(
"(logid={}) Op({}) Failed to push data: there are multiple" "(data_id={} log_id={}) Op({}) Failed to push data: there are multiple"
" producers, so op_name cannot be None.".format( " producers, so op_name cannot be None.".format(
channeldata.id, op_name))) channeldata.id, channeldata.log_id, op_name)))
os._exit(-1) os._exit(-1)
producer_num = len(self._producers) producer_num = len(self._producers)
data_id = channeldata.id data_id = channeldata.id
log_id = channeldata.log_id
put_data = None put_data = None
with self._cv: with self._cv:
if data_id not in self._input_buf: if data_id not in self._input_buf:
...@@ -639,8 +675,8 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -639,8 +675,8 @@ class ThreadChannel(Queue.PriorityQueue):
if put_data is None: if put_data is None:
_LOGGER.debug( _LOGGER.debug(
self._log( self._log(
"(logid={}) Op({}) Pushed data into input_buffer.". "(data_id={} log_id={}) Op({}) Pushed data into input_buffer.".
format(data_id, op_name))) format(data_id, log_id, op_name)))
else: else:
while self._stop is False: while self._stop is False:
try: try:
...@@ -653,8 +689,8 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -653,8 +689,8 @@ class ThreadChannel(Queue.PriorityQueue):
_LOGGER.debug( _LOGGER.debug(
self._log( self._log(
"(logid={}) Op({}) Pushed data into internal_queue.". "(data_id={} log_id={}) Op({}) Pushed data into internal_queue.".
format(data_id, op_name))) format(data_id, log_id, op_name)))
self._cv.notify_all() self._cv.notify_all()
return True return True
...@@ -696,9 +732,11 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -696,9 +732,11 @@ class ThreadChannel(Queue.PriorityQueue):
self._cv.wait() self._cv.wait()
if self._stop: if self._stop:
raise ChannelStopError() raise ChannelStopError()
_LOGGER.debug( if resp is not None:
self._log("(logid={}) Op({}) Got data".format(resp.values()[0] list_values = list(resp.values())
.id, op_name))) _LOGGER.debug(
self._log("(data_id={} log_id={}) Op({}) Got data".format(
list_values[0].id, list_values[0].log_id, op_name)))
return resp return resp
elif op_name is None: elif op_name is None:
_LOGGER.critical( _LOGGER.critical(
...@@ -725,10 +763,12 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -725,10 +763,12 @@ class ThreadChannel(Queue.PriorityQueue):
try: try:
channeldata = self.get(timeout=0) channeldata = self.get(timeout=0)
self._output_buf.append(channeldata) self._output_buf.append(channeldata)
list_values = list(channeldata.values())
_LOGGER.debug( _LOGGER.debug(
self._log( self._log(
"(logid={}) Op({}) Pop ready item into output_buffer". "(data_id={} log_id={}) Op({}) Pop ready item into output_buffer".
format(channeldata.values()[0].id, op_name))) format(list_values[0].id, list_values[0].log_id,
op_name)))
break break
except Queue.Empty: except Queue.Empty:
if timeout is not None: if timeout is not None:
...@@ -779,9 +819,12 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -779,9 +819,12 @@ class ThreadChannel(Queue.PriorityQueue):
self._cv.notify_all() self._cv.notify_all()
_LOGGER.debug( if resp is not None:
self._log("(logid={}) Op({}) Got data from output_buffer".format( list_values = list(resp.values())
resp.values()[0].id, op_name))) _LOGGER.debug(
self._log(
"(data_id={} log_id={}) Op({}) Got data from output_buffer".
format(list_values[0].id, list_values[0].log_id, op_name)))
return resp return resp
def stop(self): def stop(self):
......
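Two changes repeat through every hunk above: channel log messages now carry the caller-supplied `log_id` alongside the framework-generated `data_id`, and `resp.values()[0]` becomes `list(resp.values())[0]`. The second is a Python 3 compatibility fix, since `dict.values()` returns a non-indexable view object there; a minimal illustration:

```python
# Why the diff rewrites resp.values()[0] as list(resp.values())[0]:
d = {"op_a": "channeldata"}

# Python 2: dict.values() returns a list, so d.values()[0] works.
# Python 3: it returns a dict_values view, which cannot be indexed.
try:
    d.values()[0]
except TypeError:
    pass  # raised on Python 3

# Materializing the view works on both interpreters:
assert list(d.values())[0] == "channeldata"
```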
@@ -19,22 +19,25 @@ option go_package = ".;pipeline_serving";
 import "google/api/annotations.proto";

 message Response {
-  repeated string key = 1;
-  repeated string value = 2;
-  int32 ecode = 3;
-  string error_info = 4;
+  int32 err_no = 1;
+  string err_msg = 2;
+  repeated string key = 3;
+  repeated string value = 4;
 };

 message Request {
   repeated string key = 1;
   repeated string value = 2;
   string name = 3;
-}
+  string method = 4;
+  int64 logid = 5;
+  string clientip = 6;
+};

 service PipelineService {
   rpc inference(Request) returns (Response) {
     option (google.api.http) = {
-      post : "/{name=*}/prediction"
+      post : "/{name=*}/{method=*}"
       body : "*"
     };
   }
...
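With the route changed from `/{name=*}/prediction` to `/{name=*}/{method=*}`, the gateway derives both the pipeline name and the method from the URL, and the new `logid`/`clientip` fields travel with the request. A sketch of an HTTP call against the new schema, assuming a hypothetical pipeline named `ocr` exposing `prediction` behind a gateway on port 18080:

```python
# Hypothetical HTTP call through the grpc-gateway after this change; the
# service name "ocr", method "prediction", and port 18080 are assumptions.
import json
import requests

url = "http://127.0.0.1:18080/ocr/prediction"  # matches /{name=*}/{method=*}
payload = {
    "key": ["words"],
    "value": ["hello"],
    "name": "ocr",
    "method": "prediction",
    "logid": 10000,           # new int64 field for end-to-end tracing
    "clientip": "127.0.0.1",  # new field recording the caller's address
}
resp = requests.post(url, data=json.dumps(payload)).json()
# err_no/err_msg replace the old ecode/error_info pair
print(resp["err_no"], resp["err_msg"], resp.get("value"))
```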
@@ -38,7 +38,8 @@ func run_proxy_server(grpc_port int, http_port int) error {
 	ctx, cancel := context.WithCancel(ctx)
 	defer cancel()

-	mux := runtime.NewServeMux()
+	// EmitDefaults=true: do not filter default-valued fields out of the JSON output
+	mux := runtime.NewServeMux(runtime.WithMarshalerOption(runtime.MIMEWildcard, &runtime.JSONPb{OrigName: true, EmitDefaults: true}))

 	opts := []grpc.DialOption{grpc.WithInsecure()}
 	err := gw.RegisterPipelineServiceHandlerFromEndpoint(ctx, mux, *pipelineEndpoint, opts)
 	if err != nil {
...
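The marshaler option matters for the new `err_no` field: grpc-gateway's default JSON marshaling drops proto3 zero values, so a successful response (`err_no == 0`) would otherwise omit its error fields entirely. A small illustration of the difference, with assumed response contents:

```python
# Assumed gateway responses for the same successful call (err_no == 0).
# The default JSONPb marshaler drops zero-valued fields:
without_emit_defaults = {"key": ["res"], "value": ["ok"]}
# With EmitDefaults=true, zero values are serialized, so clients can read
# err_no/err_msg unconditionally:
with_emit_defaults = {"err_no": 0, "err_msg": "",
                      "key": ["res"], "value": ["ok"]}

assert "err_no" not in without_emit_defaults
assert with_emit_defaults["err_no"] == 0
```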
@@ -15,101 +15,220 @@
 import os
 import logging
 import multiprocessing
-try:
-    from paddle_serving_server_gpu import OpMaker, OpSeqMaker, Server
-    PACKAGE_VERSION = "GPU"
-except ImportError:
-    from paddle_serving_server import OpMaker, OpSeqMaker, Server
-    PACKAGE_VERSION = "CPU"
+#from paddle_serving_server_gpu import OpMaker, OpSeqMaker
+#from paddle_serving_server_gpu import Server as GpuServer
+#from paddle_serving_server import Server as CpuServer
 from . import util
+#from paddle_serving_app.local_predict import LocalPredictor

 _LOGGER = logging.getLogger(__name__)
 _workdir_name_gen = util.NameGenerator("workdir_")


 class LocalServiceHandler(object):
+    """
+    LocalServiceHandler is the processor of the local service. It supports
+    three client types: brpc, grpc and local_predictor. With brpc or grpc,
+    server startup ability is provided; with local_predictor, local predict
+    ability is provided by paddle_serving_app.
+    """
+
     def __init__(self,
                  model_config,
+                 client_type='local_predictor',
                  workdir="",
                  thread_num=2,
                  devices="",
+                 fetch_names=None,
                  mem_optim=True,
                  ir_optim=False,
-                 available_port_generator=None):
+                 available_port_generator=None,
+                 use_trt=False,
+                 use_profile=False):
+        """
+        Initialization of LocalServiceHandler
+
+        Args:
+           model_config: model config path
+           client_type: brpc, grpc or local_predictor[default]
+           workdir: work directory
+           thread_num: number of threads, concurrent quantity
+           devices: gpu id list[gpu], "" default[cpu]
+           fetch_names: get fetch names out of LocalServiceHandler in
+               local_predictor mode. fetch_names_ is compatible for Client().
+           mem_optim: use memory/graphics memory optimization, True default
+           ir_optim: use computation graph (IR) optimization, False default
+           available_port_generator: generate available ports
+           use_trt: use NVIDIA TensorRT engine, False default
+           use_profile: use profiling, False default
+
+        Returns:
+           None
+        """
         if available_port_generator is None:
             available_port_generator = util.GetAvailablePortGenerator()

         self._model_config = model_config
         self._port_list = []
+        self._device_type = "cpu"
         if devices == "":
             # cpu
             devices = [-1]
+            self._device_type = "cpu"
             self._port_list.append(available_port_generator.next())
-            _LOGGER.info("Model({}) will be launch in cpu device. Port({})"
+            _LOGGER.info("Model({}) will be launched in cpu device. Port({})"
                          .format(model_config, self._port_list))
         else:
             # gpu
-            if PACKAGE_VERSION == "CPU":
-                raise ValueError(
-                    "You are using the CPU version package("
-                    "paddle-serving-server), unable to set devices")
+            self._device_type = "gpu"
             devices = [int(x) for x in devices.split(",")]
             for _ in devices:
                 self._port_list.append(available_port_generator.next())
-            _LOGGER.info("Model({}) will be launch in gpu device: {}. Port({})"
+            _LOGGER.info("Model({}) will be launched in gpu device: {}. Port({})"
                          .format(model_config, devices, self._port_list))
+        self._client_type = client_type
         self._workdir = workdir
         self._devices = devices
         self._thread_num = thread_num
         self._mem_optim = mem_optim
         self._ir_optim = ir_optim
+        self._local_predictor_client = None
         self._rpc_service_list = []
         self._server_pros = []
-        self._fetch_vars = None
+        self._use_trt = use_trt
+        self._use_profile = use_profile
+        self.fetch_names_ = fetch_names

     def get_fetch_list(self):
-        return self._fetch_vars
+        return self.fetch_names_

     def get_port_list(self):
         return self._port_list

+    def get_client(self, concurrency_idx):
+        """
+        Only used in the local_predictor case: create one LocalPredictor
+        object and initialize the paddle predictor via load_model_config.
+        The concurrency_idx is used to select the running device.
+
+        Args:
+            concurrency_idx: process/thread index
+
+        Returns:
+            _local_predictor_client
+        """
+        # check the legality of concurrency_idx
+        device_num = len(self._devices)
+        if device_num <= 0:
+            _LOGGER.error("device_num must be greater than 0. devices({})".
+                          format(self._devices))
+            raise ValueError("The number of self._devices is invalid")
+
+        if concurrency_idx < 0:
+            _LOGGER.error("concurrency_idx({}) must be a non-negative number".
+                          format(concurrency_idx))
+            concurrency_idx = 0
+        elif concurrency_idx >= device_num:
+            concurrency_idx = concurrency_idx % device_num
+
+        _LOGGER.info("GET_CLIENT : concurrency_idx={}, device_num={}".format(
+            concurrency_idx, device_num))
+        from paddle_serving_app.local_predict import LocalPredictor
+        if self._local_predictor_client is None:
+            self._local_predictor_client = LocalPredictor()
+            use_gpu = False
+            if self._device_type == "gpu":
+                use_gpu = True
+            self._local_predictor_client.load_model_config(
+                model_path=self._model_config,
+                use_gpu=use_gpu,
+                gpu_id=self._devices[concurrency_idx],
+                use_profile=self._use_profile,
+                thread_num=self._thread_num,
+                mem_optim=self._mem_optim,
+                ir_optim=self._ir_optim,
+                use_trt=self._use_trt)
+        return self._local_predictor_client
+
     def get_client_config(self):
         return os.path.join(self._model_config, "serving_server_conf.prototxt")

     def _prepare_one_server(self, workdir, port, gpuid, thread_num, mem_optim,
                             ir_optim):
-        device = "gpu"
-        if gpuid == -1:
-            device = "cpu"
-        op_maker = OpMaker()
-        read_op = op_maker.create('general_reader')
-        general_infer_op = op_maker.create('general_infer')
-        general_response_op = op_maker.create('general_response')
-
-        op_seq_maker = OpSeqMaker()
-        op_seq_maker.add_op(read_op)
-        op_seq_maker.add_op(general_infer_op)
-        op_seq_maker.add_op(general_response_op)
-
-        server = Server()
+        """
+        According to self._device_type, generate one CpuServer or GpuServer
+        and set the model config and startup params.
+
+        Args:
+            workdir: work directory
+            port: network port
+            gpuid: gpu id
+            thread_num: thread num
+            mem_optim: use memory/graphics memory optimization
+            ir_optim: use computation graph (IR) optimization
+
+        Returns:
+            server: CpuServer/GpuServer
+        """
+        if self._device_type == "cpu":
+            from paddle_serving_server import OpMaker, OpSeqMaker, Server
+            op_maker = OpMaker()
+            read_op = op_maker.create('general_reader')
+            general_infer_op = op_maker.create('general_infer')
+            general_response_op = op_maker.create('general_response')
+            op_seq_maker = OpSeqMaker()
+            op_seq_maker.add_op(read_op)
+            op_seq_maker.add_op(general_infer_op)
+            op_seq_maker.add_op(general_response_op)
+            server = Server()
+        else:
+            # gpu
+            from paddle_serving_server_gpu import OpMaker, OpSeqMaker, Server
+            op_maker = OpMaker()
+            read_op = op_maker.create('general_reader')
+            general_infer_op = op_maker.create('general_infer')
+            general_response_op = op_maker.create('general_response')
+            op_seq_maker = OpSeqMaker()
+            op_seq_maker.add_op(read_op)
+            op_seq_maker.add_op(general_infer_op)
+            op_seq_maker.add_op(general_response_op)
+            server = Server()
+            if gpuid >= 0:
+                server.set_gpuid(gpuid)
+
         server.set_op_sequence(op_seq_maker.get_op_sequence())
         server.set_num_threads(thread_num)
         server.set_memory_optimize(mem_optim)
         server.set_ir_optimize(ir_optim)

         server.load_model_config(self._model_config)
-        if gpuid >= 0:
-            server.set_gpuid(gpuid)
-        server.prepare_server(workdir=workdir, port=port, device=device)
-        if self._fetch_vars is None:
-            self._fetch_vars = server.get_fetch_list()
+        server.prepare_server(
+            workdir=workdir, port=port, device=self._device_type)
+        if self.fetch_names_ is None:
+            self.fetch_names_ = server.get_fetch_list()
         return server

     def _start_one_server(self, service_idx):
+        """
+        Start one server
+
+        Args:
+            service_idx: server index
+
+        Returns:
+            None
+        """
         self._rpc_service_list[service_idx].run_server()

     def prepare_server(self):
+        """
+        Prepare all servers to be started, and append them into list.
+        """
         for i, device_id in enumerate(self._devices):
             if self._workdir != "":
                 workdir = "{}_{}".format(self._workdir, i)
@@ -125,6 +244,9 @@ class LocalServiceHandler(object):
                     ir_optim=self._ir_optim))

     def start_server(self):
+        """
+        Start multiple processes, and start one server in each process.
+        """
         for i, service in enumerate(self._rpc_service_list):
             p = multiprocessing.Process(
                 target=self._start_one_server, args=(i, ))
...
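The new `get_client` path lets a pipeline Op run inference in-process through `LocalPredictor` instead of over brpc/grpc. A rough usage sketch, assuming a model directory `uci_housing_model`, feed name `x`, fetch name `price`, and the import path shown (all illustrative; `LocalPredictor.predict`'s exact signature may also differ by release):

```python
import numpy as np
# Import path is an assumption based on the pipeline package layout.
from paddle_serving_server.pipeline.local_service_handler import \
    LocalServiceHandler

# "uci_housing_model", "x" and "price" are illustrative names only.
handler = LocalServiceHandler(
    model_config="uci_housing_model",
    client_type="local_predictor",
    devices="",      # "" selects CPU; "0,1" would map to GPU cards 0 and 1
    thread_num=2)

predictor = handler.get_client(concurrency_idx=0)  # in-process predictor
feed = {"x": np.random.rand(1, 13).astype("float32")}
fetch_map = predictor.predict(feed=feed, fetch=["price"])
print(fetch_map)
```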
@@ -18,14 +18,20 @@ import numpy as np
 from numpy import *
 import logging
 import functools
-from .channel import ChannelDataEcode
+import json
+import socket
+from .channel import ChannelDataErrcode
 from .proto import pipeline_service_pb2
 from .proto import pipeline_service_pb2_grpc
+import six

 _LOGGER = logging.getLogger(__name__)


 class PipelineClient(object):
+    """
+    PipelineClient provides the basic capabilities of the pipeline SDK
+    """
+
     def __init__(self):
         self._channel = None
         self._profile_key = "pipeline.profile"
@@ -42,13 +48,38 @@ class PipelineClient(object):
     def _pack_request_package(self, feed_dict, profile):
         req = pipeline_service_pb2.Request()
+        logid = feed_dict.get("logid")
+        if logid is None:
+            req.logid = 0
+        else:
+            if sys.version_info.major == 2:
+                req.logid = long(logid)
+            elif sys.version_info.major == 3:
+                req.logid = int(logid)
+            feed_dict.pop("logid")
+
+        clientip = feed_dict.get("clientip")
+        if clientip is None:
+            hostname = socket.gethostname()
+            ip = socket.gethostbyname(hostname)
+            req.clientip = ip
+        else:
+            req.clientip = clientip
+            feed_dict.pop("clientip")
+
         np.set_printoptions(threshold=sys.maxsize)
         for key, value in feed_dict.items():
             req.key.append(key)
+            if (sys.version_info.major == 2 and isinstance(value,
+                (str, unicode)) or
+                    ((sys.version_info.major == 3) and isinstance(value, str))):
+                req.value.append(value)
+                continue
             if isinstance(value, np.ndarray):
                 req.value.append(value.__repr__())
-            elif isinstance(value, (str, unicode)):
-                req.value.append(value)
             elif isinstance(value, list):
                 req.value.append(np.array(value).__repr__())
             else:
@@ -60,29 +91,7 @@ class PipelineClient(object):
         return req

     def _unpack_response_package(self, resp, fetch):
-        if resp.ecode != 0:
-            return {
-                "ecode": resp.ecode,
-                "ecode_desc": ChannelDataEcode(resp.ecode),
-                "error_info": resp.error_info,
-            }
-        fetch_map = {"ecode": resp.ecode}
-        for idx, key in enumerate(resp.key):
-            if key == self._profile_key:
-                if resp.value[idx] != "":
-                    sys.stderr.write(resp.value[idx])
-                continue
-            if fetch is not None and key not in fetch:
-                continue
-            data = resp.value[idx]
-            try:
-                evaled_data = eval(data)
-                if isinstance(evaled_data, np.ndarray):
-                    data = evaled_data
-            except Exception as e:
-                pass
-            fetch_map[key] = data
-        return fetch_map
+        return resp

     def predict(self, feed_dict, fetch=None, asyn=False, profile=False):
         if not isinstance(feed_dict, dict):
...
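On the client side, `logid` and `clientip` are passed through `feed_dict` and split out into the Request proto, and `_unpack_response_package` now hands back the raw proto Response instead of a parsed fetch map. A sketch, assuming a pipeline server reachable at `127.0.0.1:9998` with feed key `words` and fetch name `prediction` (all illustrative):

```python
# Illustrative endpoint and feed/fetch names; not taken from this repo.
from paddle_serving_server.pipeline import PipelineClient

client = PipelineClient()
client.connect(['127.0.0.1:9998'])

feed = {
    "words": "i am a sentence",
    "logid": 10001,           # routed into Request.logid, not the feed
    "clientip": "127.0.0.1",  # routed into Request.clientip
}
resp = client.predict(feed_dict=feed, fetch=["prediction"])
# resp is now the raw proto Response; read error fields directly.
print(resp.err_no, resp.err_msg)
print(list(resp.key), list(resp.value))
```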
@@ -19,13 +19,16 @@ message Request {
   repeated string key = 1;
   repeated string value = 2;
   optional string name = 3;
+  optional string method = 4;
+  optional int64 logid = 5;
+  optional string clientip = 6;
 };

 message Response {
-  repeated string key = 1;
-  repeated string value = 2;
-  required int32 ecode = 3;
-  optional string error_info = 4;
+  optional int32 err_no = 1;
+  optional string err_msg = 2;
+  repeated string key = 3;
+  repeated string value = 4;
 };

 service PipelineService {
...
@@ -32,8 +32,8 @@ if '${PACK}' == 'ON':

 REQUIRED_PACKAGES = [
-    'six >= 1.10.0', 'sentencepiece', 'opencv-python<=4.2.0.32', 'pillow',
-    'shapely<=1.6.1', 'pyclipper'
+    'six >= 1.10.0', 'sentencepiece<=0.1.92', 'opencv-python<=4.2.0.32', 'pillow',
+    'pyclipper'
 ]

 packages=['paddle_serving_app',
...
@@ -43,8 +43,8 @@ if '${PACK}' == 'ON':
     copy_lib()

 REQUIRED_PACKAGES = [
-    'six >= 1.10.0', 'protobuf >= 3.11.0', 'numpy >= 1.12', 'grpcio >= 1.28.1',
-    'grpcio-tools >= 1.28.1'
+    'six >= 1.10.0', 'protobuf >= 3.11.0', 'numpy >= 1.12', 'grpcio <= 1.33.2',
+    'grpcio-tools <= 1.33.2'
 ]
...
@@ -28,7 +28,7 @@ max_version, mid_version, min_version = util.python_version()
 util.gen_pipeline_code("paddle_serving_server")

 REQUIRED_PACKAGES = [
-    'six >= 1.10.0', 'protobuf >= 3.11.0', 'grpcio >= 1.28.1', 'grpcio-tools >= 1.28.1',
+    'six >= 1.10.0', 'protobuf >= 3.11.0', 'grpcio <= 1.33.2', 'grpcio-tools <= 1.33.2',
     'paddle_serving_client', 'flask >= 1.1.1', 'paddle_serving_app', 'func_timeout', 'pyyaml'
 ]
...
@@ -30,7 +30,7 @@ max_version, mid_version, min_version = util.python_version()
 util.gen_pipeline_code("paddle_serving_server_gpu")

 REQUIRED_PACKAGES = [
-    'six >= 1.10.0', 'protobuf >= 3.11.0', 'grpcio >= 1.28.1', 'grpcio-tools >= 1.28.1',
+    'six >= 1.10.0', 'protobuf >= 3.11.0', 'grpcio <= 1.33.2', 'grpcio-tools <= 1.33.2',
     'paddle_serving_client', 'flask >= 1.1.1', 'paddle_serving_app', 'func_timeout', 'pyyaml'
 ]
...
 sphinx==2.1.0
 mistune
 sphinx_rtd_theme
-paddlepaddle>=1.6
+paddlepaddle>=1.8.4
+shapely