Commit 12dbbcfe authored by wangjiawei04

merge

([简体中文](./README_CN.md)|English)
<p align="center"> <p align="center">
<br> <br>
<img src='doc/serving_logo.png' width = "600" height = "130"> <img src='doc/serving_logo.png' width = "600" height = "130">
<br> <br>
<p> <p>
<p align="center"> <p align="center">
<br> <br>
<a href="https://travis-ci.com/PaddlePaddle/Serving"> <a href="https://travis-ci.com/PaddlePaddle/Serving">
...@@ -23,14 +26,6 @@ We consider deploying deep learning inference service online to be a user-facing ...@@ -23,14 +26,6 @@ We consider deploying deep learning inference service online to be a user-facing
<img src="doc/demo.gif" width="700"> <img src="doc/demo.gif" width="700">
</p> </p>
<h2 align="center">Some Key Features</h2>
- Integrate with Paddle training pipeline seamlessly, most paddle models can be deployed **with one line command**.
- **Industrial serving features** supported, such as models management, online loading, online A/B testing etc.
- **Distributed Key-Value indexing** supported which is especially useful for large scale sparse features as model inputs.
- **Highly concurrent and efficient communication** between clients and servers supported.
- **Multiple programming languages** supported on client side, such as Golang, C++ and python.
- **Extensible framework design** which can support model serving beyond Paddle.
<h2 align="center">Installation</h2> <h2 align="center">Installation</h2>
...@@ -58,10 +53,42 @@ You may need to use a domestic mirror source (in China, you can use the Tsinghua ...@@ -58,10 +53,42 @@ You may need to use a domestic mirror source (in China, you can use the Tsinghua
If you need install modules compiled with develop branch, please download packages from [latest packages list](./doc/LATEST_PACKAGES.md) and install with `pip install` command. If you need install modules compiled with develop branch, please download packages from [latest packages list](./doc/LATEST_PACKAGES.md) and install with `pip install` command.
Client package support Centos 7 and Ubuntu 18, or you can use HTTP service without install client. Packages of Paddle Serving support Centos 6/7 and Ubuntu 16/18, or you can use HTTP service without install client.
<h2 align="center"> Pre-built services with Paddle Serving</h2>
<h3 align="center">Chinese Word Segmentation</h4>
``` shell
> python -m paddle_serving_app.package --get_model lac
> tar -xzf lac.tar.gz
> python lac_web_service.py lac_model/ lac_workdir 9393 &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9393/lac/prediction
{"result":[{"word_seg":"我|爱|北京|天安门"}]}
```
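
The same request can be sent from Python instead of `curl`; below is a minimal sketch using the [requests](https://requests.readthedocs.io/en/master/) library, assuming the word segmentation service above is running locally on port 9393:

``` python
import json
import requests

# Same payload as the curl example above: feed the raw sentence, fetch the segmentation.
payload = {"feed": [{"words": "我爱北京天安门"}], "fetch": ["word_seg"]}
r = requests.post("http://127.0.0.1:9393/lac/prediction",
                  data=json.dumps(payload),
                  headers={"Content-Type": "application/json"})
print(r.json())  # expected: {"result": [{"word_seg": "我|爱|北京|天安门"}]}
```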
<h3 align="center">Image Classification</h4>
<p align="center">
<br>
<img src='https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg' width = "200" height = "200">
<br>
<p>
``` shell
> python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
> tar -xzf resnet_v2_50_imagenet.tar.gz
> python resnet50_imagenet_classify.py resnet50_serving_model &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"image": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
{"result":{"label":["daisy"],"prob":[0.9341403245925903]}}
```
<h2 align="center">Quick Start Example</h2> <h2 align="center">Quick Start Example</h2>
This quick start example is only for users who already have a model to deploy and we prepare a ready-to-deploy model here. If you want to know how to use paddle serving from offline training to online serving, please reference to [Train_To_Service](https://github.com/PaddlePaddle/Serving/blob/develop/doc/TRAIN_TO_SERVICE.md)
### Boston House Price Prediction model ### Boston House Price Prediction model
``` shell ``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
@@ -84,9 +111,9 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po
| `port` | int | `9292` | Exposed port of current service to users |
| `name` | str | `""` | Service name, can be used to generate HTTP request url |
| `model` | str | `""` | Path of paddle model directory to be served |
| `mem_optim` | - | - | Enable memory / graphic memory optimization |
| `ir_optim` | - | - | Enable analysis and optimization of calculation graph |
| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
Here, we use `curl` to send an HTTP POST request to the service we just started. Users can use any python library to send HTTP POST as well, e.g., [requests](https://requests.readthedocs.io/en/master/).
</center>
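
For instance, a hedged `requests`-based equivalent of the `curl` call, assuming the HTTP service was started with `--name uci` on port 9292; the 13-dimensional `x` vector below is illustrative sample input for the `uci_housing` model, not data taken from this document:

``` python
import json
import requests

# One feed instance carrying the 13 normalized features expected by uci_housing.
payload = {"feed": [{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0501,
                           -0.2015, -0.0832, 0.0262, 0.0795, -0.1300, 0.0189,
                           0.1867]}],
           "fetch": ["price"]}
r = requests.post("http://127.0.0.1:9292/uci/prediction",
                  data=json.dumps(payload),
                  headers={"Content-Type": "application/json"})
print(r.json())  # e.g. {"result": {"price": [[...]]}}
```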
@@ -117,138 +144,13 @@ print(fetch_map)
```
Here, the `client.predict` function has two arguments. `feed` is a `python dict` with model input variable alias names and values. `fetch` assigns the prediction variables to be returned from servers. In the example, the names `"x"` and `"price"` were assigned when the servable model was saved during training.
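
Putting the pieces together, a condensed sketch of the RPC client flow just described; the config path assumes the quick-start `uci_housing` client package, the port assumes the RPC server example above, and the input values are dummies:

``` python
from paddle_serving_client import Client

client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])

# "x" and "price" are the alias names assigned when the servable model was saved;
# a real call would feed 13 normalized housing features instead of zeros.
fetch_map = client.predict(feed={"x": [0.0] * 13}, fetch=["price"])
print(fetch_map)
```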
<h2 align="center"> Pre-built services with Paddle Serving</h2> <h2 align="center">Some Key Features of Paddle Serving</h2>
<h3 align="center">Chinese Word Segmentation</h4>
- **Description**:
``` shell
Chinese word segmentation HTTP service that can be deployed with one line command.
```
- **Download Servable Package**:
``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/lac/lac_model_jieba_web.tar.gz
```
- **Host web service**:
``` shell
tar -xzf lac_model_jieba_web.tar.gz
python lac_web_service.py jieba_server_model/ lac_workdir 9292
```
- **Request sample**:
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9292/lac/prediction
```
- **Request result**:
``` shell
{"word_seg":"我|爱|北京|天安门"}
```
<h3 align="center">Image Classification</h4>
- **Description**:
``` shell
Image classification trained with Imagenet dataset. A label and corresponding probability will be returned.
Note: This demo needs paddle-serving-server-gpu.
```
- **Download Servable Package**:
``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/imagenet-example/imagenet_demo.tar.gz
```
- **Host web service**:
``` shell
tar -xzf imagenet_demo.tar.gz
python image_classification_service_demo.py resnet50_serving_model
```
- **Request sample**:
<p align="center">
<br>
<img src='https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg' width = "200" height = "200">
<br>
<p>
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"url": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
```
- **Request result**:
``` shell
{"label":"daisy","prob":0.9341403245925903}
```
<h3 align="center">More Demos</h3>
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | Bert-Base-Baike |
| URL | [https://paddle-serving.bj.bcebos.com/bert_example/bert_seq128.tar.gz](https://paddle-serving.bj.bcebos.com/bert_example%2Fbert_seq128.tar.gz) |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert |
| Description | Get semantic representation from a Chinese Sentence |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | Resnet50-Imagenet |
| URL | [https://paddle-serving.bj.bcebos.com/imagenet-example/ResNet50_vd.tar.gz](https://paddle-serving.bj.bcebos.com/imagenet-example%2FResNet50_vd.tar.gz) |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet |
| Description | Get image semantic representation from an image |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | Resnet101-Imagenet |
| URL | https://paddle-serving.bj.bcebos.com/imagenet-example/ResNet101_vd.tar.gz |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet |
| Description | Get image semantic representation from an image |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | CNN-IMDB |
| URL | https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb |
| Description | Get category probability from an English Sentence |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | LSTM-IMDB |
| URL | https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb |
| Description | Get category probability from an English Sentence |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | BOW-IMDB |
| URL | https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb |
| Description | Get category probability from an English Sentence |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | Jieba-LAC |
| URL | https://paddle-serving.bj.bcebos.com/lac/lac_model.tar.gz |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/lac |
| Description | Get word segmentation from a Chinese Sentence |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| Model Name | DNN-CTR |
| URL | https://paddle-serving.bj.bcebos.com/criteo_ctr_example/criteo_ctr_demo_model.tar.gz |
| Client/Server Code | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/criteo_ctr |
| Description | Get click probability from a feature vector of item |
- Seamless integration with the Paddle training pipeline; most Paddle models can be deployed **with a single command**.
- **Industrial serving features** supported, such as model management, online loading, online A/B testing, etc.
- **Distributed Key-Value indexing** supported, which is especially useful for large-scale sparse features as model inputs.
- **Highly concurrent and efficient communication** between clients and servers supported.
- **Multiple programming languages** supported on the client side, such as Golang, C++ and Python.
<h2 align="center">Document</h2> <h2 align="center">Document</h2>
...@@ -268,13 +170,13 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"url": "https://pa ...@@ -268,13 +170,13 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"url": "https://pa
### About Efficiency ### About Efficiency
- [How to profile Paddle Serving latency?](python/examples/util) - [How to profile Paddle Serving latency?](python/examples/util)
- [How to optimize performance?(Chinese)](doc/PERFORMANCE_OPTIM_CN.md) - [How to optimize performance?](doc/PERFORMANCE_OPTIM.md)
- [Deploy multi-services on one GPU(Chinese)](doc/MULTI_SERVICE_ON_ONE_GPU_CN.md) - [Deploy multi-services on one GPU(Chinese)](doc/MULTI_SERVICE_ON_ONE_GPU_CN.md)
- [CPU Benchmarks(Chinese)](doc/BENCHMARKING.md) - [CPU Benchmarks(Chinese)](doc/BENCHMARKING.md)
- [GPU Benchmarks(Chinese)](doc/GPU_BENCHMARKING.md) - [GPU Benchmarks(Chinese)](doc/GPU_BENCHMARKING.md)
### FAQ ### FAQ
- [FAQ(Chinese)](doc/deprecated/FAQ.md) - [FAQ(Chinese)](doc/FAQ.md)
### Design ### Design
......
(简体中文|[English](./README.md))
<p align="center"> <p align="center">
<br> <br>
<img src='https://paddle-serving.bj.bcebos.com/imdb-demo%2FLogoMakr-3Bd2NM-300dpi.png' width = "600" height = "130"> <img src='https://paddle-serving.bj.bcebos.com/imdb-demo%2FLogoMakr-3Bd2NM-300dpi.png' width = "600" height = "130">
<br> <br>
<p> <p>
<p align="center"> <p align="center">
<br> <br>
<a href="https://travis-ci.com/PaddlePaddle/Serving"> <a href="https://travis-ci.com/PaddlePaddle/Serving">
...@@ -24,14 +27,7 @@ Paddle Serving 旨在帮助深度学习开发者轻易部署在线预测服务 ...@@ -24,14 +27,7 @@ Paddle Serving 旨在帮助深度学习开发者轻易部署在线预测服务
<img src="doc/demo.gif" width="700"> <img src="doc/demo.gif" width="700">
</p> </p>
<h2 align="center">核心功能</h2>
- 与Paddle训练紧密连接,绝大部分Paddle模型可以 **一键部署**.
- 支持 **工业级的服务能力** 例如模型管理,在线加载,在线A/B测试等.
- 支持 **分布式键值对索引** 助力于大规模稀疏特征作为模型输入.
- 支持客户端和服务端之间 **高并发和高效通信**.
- 支持 **多种编程语言** 开发客户端,例如Golang,C++和Python.
- **可伸缩框架设计** 可支持不限于Paddle的模型服务.
<h2 align="center">安装</h2> <h2 align="center">安装</h2>
...@@ -59,9 +55,40 @@ pip install paddle-serving-server-gpu # GPU ...@@ -59,9 +55,40 @@ pip install paddle-serving-server-gpu # GPU
如果需要使用develop分支编译的安装包,请从[最新安装包列表](./doc/LATEST_PACKAGES.md)中获取下载地址进行下载,使用`pip install`命令进行安装。 如果需要使用develop分支编译的安装包,请从[最新安装包列表](./doc/LATEST_PACKAGES.md)中获取下载地址进行下载,使用`pip install`命令进行安装。
客户端安装包支持Centos 7和Ubuntu 18,或者您可以使用HTTP服务,这种情况下不需要安装客户端。 Paddle Serving安装包支持Centos 6/7和Ubuntu 16/18,或者您可以使用HTTP服务,这种情况下不需要安装客户端。
<h2 align="center"> Paddle Serving预装的服务 </h2>
<h3 align="center">中文分词</h4>
``` shell
> python -m paddle_serving_app.package --get_model lac
> tar -xzf lac.tar.gz
> python lac_web_service.py lac_model/ lac_workdir 9393 &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9393/lac/prediction
{"result":[{"word_seg":"我|爱|北京|天安门"}]}
```
<h3 align="center">图像分类</h4>
<p align="center">
<br>
<img src='https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg' width = "200" height = "200">
<br>
<p>
``` shell
> python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
> tar -xzf resnet_v2_50_imagenet.tar.gz
> python resnet50_imagenet_classify.py resnet50_serving_model &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"image": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
{"result":{"label":["daisy"],"prob":[0.9341403245925903]}}
```
<h2 align="center">快速启动示例</h2> <h2 align="center">快速开始示例</h2>
这个快速开始示例主要是为了给那些已经有一个要部署的模型的用户准备的,而且我们也提供了一个可以用来部署的模型。如果您想知道如何从离线训练到在线服务走完全流程,请参考[从训练到部署](https://github.com/PaddlePaddle/Serving/blob/develop/doc/TRAIN_TO_SERVICE_CN.md)
<h3 align="center">波士顿房价预测</h3> <h3 align="center">波士顿房价预测</h3>
@@ -88,9 +115,9 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po
| `port` | int | `9292` | Exposed port of current service to users |
| `name` | str | `""` | Service name, can be used to generate HTTP request url |
| `model` | str | `""` | Path of paddle model directory to be served |
| `mem_optim` | - | - | Enable memory optimization |
| `ir_optim` | - | - | Enable analysis and optimization of calculation graph |
| `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
我们使用 `curl` 命令来发送HTTP POST请求给刚刚启动的服务。用户也可以调用python库来发送HTTP POST请求,请参考英文文档 [requests](https://requests.readthedocs.io/en/master/)。
</center>
@@ -122,139 +149,13 @@ print(fetch_map)
```
在这里,`client.predict`函数具有两个参数。`feed`是带有模型输入变量别名和值的`python dict`。`fetch`指定要从服务器返回的预测变量。在该示例中,在训练过程中保存可服务模型时,被赋值的tensor名为`"x"`和`"price"`。
<h2 align="center">Paddle Serving预装的服务</h2> <h2 align="center">Paddle Serving的核心功能</h2>
<h3 align="center">中文分词模型</h4>
- **介绍**:
``` shell
本示例为中文分词HTTP服务一键部署
```
- **下载服务包**:
``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/lac/lac_model_jieba_web.tar.gz
```
- **启动web服务**:
``` shell
tar -xzf lac_model_jieba_web.tar.gz
python lac_web_service.py jieba_server_model/ lac_workdir 9292
```
- **客户端请求示例**:
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9292/lac/prediction
```
- **返回结果示例**:
``` shell
{"word_seg":"我|爱|北京|天安门"}
```
<h3 align="center">图像分类模型</h4>
- **介绍**:
``` shell
图像分类模型由Imagenet数据集训练而成,该服务会返回一个标签及其概率
注意:本示例需要安装paddle-serving-server-gpu
```
- **下载服务包**:
``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/imagenet-example/imagenet_demo.tar.gz
```
- **启动web服务**:
``` shell
tar -xzf imagenet_demo.tar.gz
python image_classification_service_demo.py resnet50_serving_model
```
- **客户端请求示例**:
<p align="center">
<br>
<img src='https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg' width = "200" height = "200">
<br>
<p>
``` shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"url": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
```
- **返回结果示例**:
``` shell
{"label":"daisy","prob":0.9341403245925903}
```
<h3 align="center">更多示例</h3>
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | Bert-Base-Baike |
| 下载链接 | [https://paddle-serving.bj.bcebos.com/bert_example/bert_seq128.tar.gz](https://paddle-serving.bj.bcebos.com/bert_example%2Fbert_seq128.tar.gz) |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert |
| 介绍 | 获得一个中文语句的语义表示 |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | Resnet50-Imagenet |
| 下载链接 | [https://paddle-serving.bj.bcebos.com/imagenet-example/ResNet50_vd.tar.gz](https://paddle-serving.bj.bcebos.com/imagenet-example%2FResNet50_vd.tar.gz) |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet |
| 介绍 | 获得一张图片的图像语义表示 |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | Resnet101-Imagenet |
| 下载链接 | https://paddle-serving.bj.bcebos.com/imagenet-example/ResNet101_vd.tar.gz |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet |
| 介绍 | 获得一张图片的图像语义表示 |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | CNN-IMDB |
| 下载链接 | https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb |
| 介绍 | 从一个中文语句获得类别及其概率 |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | LSTM-IMDB |
| 下载链接 | https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb |
| 介绍 | 从一个英文语句获得类别及其概率 |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | BOW-IMDB |
| 下载链接 | https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb |
| 介绍 | 从一个英文语句获得类别及其概率 |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | Jieba-LAC |
| 下载链接 | https://paddle-serving.bj.bcebos.com/lac/lac_model.tar.gz |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/lac |
| 介绍 | 获取中文语句的分词 |
| Key | Value |
| :----------------- | :----------------------------------------------------------- |
| 模型名 | DNN-CTR |
| 下载链接 | https://paddle-serving.bj.bcebos.com/criteo_ctr_example/criteo_ctr_demo_model.tar.gz |
| 客户端/服务端代码 | https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/criteo_ctr |
| 介绍 | 从项目的特征向量中获得点击概率 |
- 与Paddle训练紧密连接,绝大部分Paddle模型可以 **一键部署**。
- 支持 **工业级的服务能力**,例如模型管理、在线加载、在线A/B测试等。
- 支持 **分布式键值对索引**,助力于大规模稀疏特征作为模型输入。
- 支持客户端和服务端之间 **高并发和高效通信**。
- 支持 **多种编程语言** 开发客户端,例如Golang、C++和Python。
<h2 align="center">文档</h2> <h2 align="center">文档</h2>
...@@ -280,7 +181,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"url": "https://pa ...@@ -280,7 +181,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"url": "https://pa
- [GPU版Benchmarks](doc/GPU_BENCHMARKING.md) - [GPU版Benchmarks](doc/GPU_BENCHMARKING.md)
### FAQ ### FAQ
- [常见问答](doc/deprecated/FAQ.md) - [常见问答](doc/FAQ.md)
### 设计文档 ### 设计文档
- [Paddle Serving设计文档](doc/DESIGN_DOC_CN.md) - [Paddle Serving设计文档](doc/DESIGN_DOC_CN.md)
......
@@ -86,6 +86,63 @@ function(protobuf_generate_python SRCS)
  set(${SRCS} ${${SRCS}} PARENT_SCOPE)
endfunction()
function(grpc_protobuf_generate_python SRCS)
  # shameless copy from https://github.com/Kitware/CMake/blob/master/Modules/FindProtobuf.cmake
  if(NOT ARGN)
    message(SEND_ERROR "Error: GRPC_PROTOBUF_GENERATE_PYTHON() called without any proto files")
    return()
  endif()

  if(PROTOBUF_GENERATE_CPP_APPEND_PATH)
    # Create an include path for each file specified
    foreach(FIL ${ARGN})
      get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
      get_filename_component(ABS_PATH ${ABS_FIL} PATH)
      list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
      if(${_contains_already} EQUAL -1)
        list(APPEND _protobuf_include_path -I ${ABS_PATH})
      endif()
    endforeach()
  else()
    set(_protobuf_include_path -I ${CMAKE_CURRENT_SOURCE_DIR})
  endif()

  if(DEFINED PROTOBUF_IMPORT_DIRS AND NOT DEFINED Protobuf_IMPORT_DIRS)
    set(Protobuf_IMPORT_DIRS "${PROTOBUF_IMPORT_DIRS}")
  endif()

  if(DEFINED Protobuf_IMPORT_DIRS)
    foreach(DIR ${Protobuf_IMPORT_DIRS})
      get_filename_component(ABS_PATH ${DIR} ABSOLUTE)
      list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
      if(${_contains_already} EQUAL -1)
        list(APPEND _protobuf_include_path -I ${ABS_PATH})
      endif()
    endforeach()
  endif()

  set(${SRCS})
  foreach(FIL ${ARGN})
    get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
    get_filename_component(FIL_WE ${FIL} NAME_WE)
    if(NOT PROTOBUF_GENERATE_CPP_APPEND_PATH)
      get_filename_component(FIL_DIR ${FIL} DIRECTORY)
      if(FIL_DIR)
        set(FIL_WE "${FIL_DIR}/${FIL_WE}")
      endif()
    endif()

    list(APPEND ${SRCS} "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}_pb2_grpc.py")
    add_custom_command(
      OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}_pb2_grpc.py"
      COMMAND ${PYTHON_EXECUTABLE} -m grpc_tools.protoc --python_out ${CMAKE_CURRENT_BINARY_DIR} --grpc_python_out ${CMAKE_CURRENT_BINARY_DIR} ${_protobuf_include_path} ${ABS_FIL}
      DEPENDS ${ABS_FIL}
      COMMENT "Running Python grpc protocol buffer compiler on ${FIL}"
      VERBATIM)
  endforeach()

  set(${SRCS} ${${SRCS}} PARENT_SCOPE)
endfunction()
# Print and set the protobuf library information,
# finish this cmake process and exit from this file.
macro(PROMPT_PROTOBUF_LIB)
......
@@ -704,6 +704,15 @@ function(py_proto_compile TARGET_NAME)
  add_custom_target(${TARGET_NAME} ALL DEPENDS ${py_srcs})
endfunction()
function(py_grpc_proto_compile TARGET_NAME)
  set(oneValueArgs "")
  set(multiValueArgs SRCS)
  cmake_parse_arguments(py_grpc_proto_compile "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})
  set(py_srcs)
  grpc_protobuf_generate_python(py_srcs ${py_grpc_proto_compile_SRCS})
  add_custom_target(${TARGET_NAME} ALL DEPENDS ${py_srcs})
endfunction()
function(py_test TARGET_NAME)
  if(WITH_TESTING)
    set(options "")
......
@@ -35,6 +35,13 @@ py_proto_compile(general_model_config_py_proto SRCS proto/general_model_config.p
add_custom_target(general_model_config_py_proto_init ALL COMMAND ${CMAKE_COMMAND} -E touch __init__.py)
add_dependencies(general_model_config_py_proto general_model_config_py_proto_init)
py_grpc_proto_compile(multi_lang_general_model_service_py_proto SRCS proto/multi_lang_general_model_service.proto)
add_custom_target(multi_lang_general_model_service_py_proto_init ALL COMMAND ${CMAKE_COMMAND} -E touch __init__.py)
add_dependencies(multi_lang_general_model_service_py_proto multi_lang_general_model_service_py_proto_init)
py_grpc_proto_compile(general_python_service_py_proto SRCS proto/general_python_service.proto)
add_custom_target(general_python_service_py_proto_init ALL COMMAND ${CMAKE_COMMAND} -E touch __init__.py)
add_dependencies(general_python_service_py_proto general_python_service_py_proto_init)
if (CLIENT)
py_proto_compile(sdk_configure_py_proto SRCS proto/sdk_configure.proto)
add_custom_target(sdk_configure_py_proto_init ALL COMMAND ${CMAKE_COMMAND} -E touch __init__.py)
@@ -51,6 +58,17 @@ add_custom_command(TARGET general_model_config_py_proto POST_BUILD
    COMMENT "Copy generated general_model_config proto file into directory paddle_serving_client/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
add_custom_command(TARGET multi_lang_general_model_service_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_client/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_client/proto
    COMMENT "Copy generated multi_lang_general_model_service proto file into directory paddle_serving_client/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})

add_custom_command(TARGET general_python_service_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_client/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_client/proto
    COMMENT "Copy generated general_python_service proto file into directory paddle_serving_client/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
endif()

if (APP)
@@ -65,6 +83,11 @@ if (SERVER)
py_proto_compile(server_config_py_proto SRCS proto/server_configure.proto)
add_custom_target(server_config_py_proto_init ALL COMMAND ${CMAKE_COMMAND} -E touch __init__.py)
add_dependencies(server_config_py_proto server_config_py_proto_init)
py_proto_compile(pyserving_channel_py_proto SRCS proto/pyserving_channel.proto)
add_custom_target(pyserving_channel_py_proto_init ALL COMMAND ${CMAKE_COMMAND} -E touch __init__.py)
add_dependencies(pyserving_channel_py_proto pyserving_channel_py_proto_init)
if (NOT WITH_GPU)
add_custom_command(TARGET server_config_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
@@ -77,6 +100,24 @@ add_custom_command(TARGET general_model_config_py_proto POST_BUILD
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
    COMMENT "Copy generated general_model_config proto file into directory paddle_serving_server/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
add_custom_command(TARGET general_python_service_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
    COMMENT "Copy generated general_python_service proto file into directory paddle_serving_server/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})

add_custom_command(TARGET pyserving_channel_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
    COMMENT "Copy generated pyserving_channel proto file into directory paddle_serving_server/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})

add_custom_command(TARGET multi_lang_general_model_service_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server/proto
    COMMENT "Copy generated multi_lang_general_model_service proto file into directory paddle_serving_server/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
else()
add_custom_command(TARGET server_config_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory
@@ -95,5 +136,23 @@ add_custom_command(TARGET general_model_config_py_proto POST_BUILD
    COMMENT "Copy generated general_model_config proto file into directory
             paddle_serving_server_gpu/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
add_custom_command(TARGET general_python_service_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server_gpu/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server_gpu/proto
    COMMENT "Copy generated general_python_service proto file into directory paddle_serving_server_gpu/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})

add_custom_command(TARGET pyserving_channel_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server_gpu/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server_gpu/proto
    COMMENT "Copy generated pyserving_channel proto file into directory paddle_serving_server_gpu/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})

add_custom_command(TARGET multi_lang_general_model_service_py_proto POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E make_directory ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server_gpu/proto
    COMMAND cp *.py ${PADDLE_SERVING_BINARY_DIR}/python/paddle_serving_server_gpu/proto
    COMMENT "Copy generated multi_lang_general_model_service proto file into directory paddle_serving_server_gpu/proto."
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
endif()
endif()
@@ -13,6 +13,7 @@
// limitations under the License.
syntax = "proto2";
package baidu.paddle_serving.pyserving;
service GeneralPythonService {
  rpc inference(Request) returns (Response) {}
@@ -21,11 +22,15 @@ service GeneralPythonService {
message Request {
  repeated bytes feed_insts = 1;
  repeated string feed_var_names = 2;
  repeated bytes shape = 3;
  repeated string type = 4;
}

message Response {
  repeated bytes fetch_insts = 1;
  repeated string fetch_var_names = 2;
  required int32 ecode = 3;
  optional string error_info = 4;
  repeated bytes shape = 5;
  repeated string type = 6;
}
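
For illustration, a hedged sketch of how a client might fill the extended `Request` from Python; the module name `general_python_service_pb2` assumes the standard `protoc` output naming used by the build rules above:

```python
import numpy as np
# Generated by grpc_tools.protoc from general_python_service.proto (assumed name).
import general_python_service_pb2 as pb

x = np.random.rand(1, 13).astype("float32")
req = pb.Request()
req.feed_var_names.append("x")
req.feed_insts.append(x.tobytes())  # raw tensor bytes, one entry per feed var
req.shape.append(np.array(x.shape, dtype="int32").tobytes())  # new field 3
req.type.append("float32")                                    # new field 4
```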
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -14,18 +14,37 @@
syntax = "proto2";
message Tensor {
  optional bytes data = 1;
  repeated int32 int_data = 2;
  repeated int64 int64_data = 3;
  repeated float float_data = 4;
  optional int32 elem_type = 5;
  repeated int32 shape = 6;
  repeated int32 lod = 7; // only for fetch tensor currently
};

message FeedInst { repeated Tensor tensor_array = 1; };

message FetchInst { repeated Tensor tensor_array = 1; };

message Request {
  repeated FeedInst insts = 1;
  repeated string feed_var_names = 2;
  repeated string fetch_var_names = 3;
  required bool is_python = 4 [ default = false ];
};

message Response {
  repeated ModelOutput outputs = 1;
  optional string tag = 2;
};

message ModelOutput {
  repeated FetchInst insts = 1;
  optional string engine_name = 2;
}

service MultiLangGeneralModelService {
  rpc inference(Request) returns (Response) {}
};
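
A minimal sketch of calling the new service over gRPC; the `multi_lang_general_model_service_pb2*` module names assume the standard `grpc_tools.protoc` outputs produced by the CMake rules above, and the endpoint, variable names, and `elem_type` code are illustrative:

```python
import grpc
import multi_lang_general_model_service_pb2 as pb2
import multi_lang_general_model_service_pb2_grpc as pb2_grpc

channel = grpc.insecure_channel("127.0.0.1:9393")
stub = pb2_grpc.MultiLangGeneralModelServiceStub(channel)

req = pb2.Request()
req.feed_var_names.append("x")
req.fetch_var_names.append("price")
req.is_python = False
inst = req.insts.add()            # one FeedInst
tensor = inst.tensor_array.add()  # one Tensor holding a 1x13 float input
tensor.float_data.extend([0.0] * 13)
tensor.shape.extend([1, 13])
resp = stub.inference(req)
print(resp.outputs[0].insts[0].tensor_array[0])
```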
@@ -13,17 +13,19 @@
// limitations under the License.
syntax = "proto2";
package baidu.paddle_serving.pyserving;

message ChannelData {
  repeated Inst insts = 1;
  required int32 id = 2;
  required int32 type = 3 [ default = 0 ];
  required int32 ecode = 4;
  optional string error_info = 5;
}

message Inst {
  required bytes data = 1;
  required string name = 2;
  required bytes shape = 3;
  required string type = 4;
}
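
For illustration, a hedged sketch of populating the revised messages from Python; the module name `pyserving_channel_pb2` is assumed from the protoc naming convention:

```python
import numpy as np
import pyserving_channel_pb2 as channel_pb  # assumed generated module name

data = channel_pb.ChannelData(id=0, type=0, ecode=0)  # ecode 0: no error
inst = data.insts.add()
arr = np.array([1.0, 2.0], dtype="float32")
inst.name = "x"
inst.data = arr.tobytes()
inst.shape = np.array(arr.shape, dtype="int32").tobytes()  # new required field
inst.type = "float32"                                      # new required field
print(len(data.SerializeToString()))
```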
@@ -21,7 +21,7 @@ The following Python code will process the data `test_data/part-0` and write to
[//file]:#process.py
``` python
from paddle_serving_app.reader import IMDBDataset
imdb_dataset = IMDBDataset()
imdb_dataset.load_resource('imdb.vocab')
@@ -78,7 +78,7 @@ with open('processed.data') as f:
    feed = {"words": word_ids}
    fetch = ["acc", "cost", "prediction"]
    [fetch_map, tag] = client.predict(feed=feed, fetch=fetch, need_variant_tag=True)
    if (float(fetch_map["prediction"][0][1]) - 0.5) * (float(label[0]) - 0.5) > 0:
        cnt[tag]['acc'] += 1
    cnt[tag]['total'] += 1
@@ -88,7 +88,7 @@ with open('processed.data') as f:
In the code, the function `client.add_variant(tag, clusters, variant_weight)` adds a variant with label `tag` and flow weight `variant_weight`. In this example, a BOW variant with label `bow` and flow weight `10`, and an LSTM variant with label `lstm` and flow weight `90` are added. The flow on the client side will be distributed to the two variants according to the ratio of `10:90`.
When making predictions on the client side, if the parameter `need_variant_tag=True` is specified, the response will contain the variant tag corresponding to the distribution flow.
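
For reference, a condensed sketch of the client-side A/B setup described above; the endpoints and the dummy `words` input are illustrative:

```python
from paddle_serving_client import Client

client = Client()
client.load_client_config('imdb_bow_client_conf/serving_client_conf.prototxt')
# 10% of traffic goes to the BOW variant, 90% to the LSTM variant.
client.add_variant("bow", ["127.0.0.1:8000"], 10)
client.add_variant("lstm", ["127.0.0.1:9000"], 90)
client.connect()

[fetch_map, tag] = client.predict(feed={"words": [1, 2, 3]},
                                  fetch=["prediction"],
                                  need_variant_tag=True)
print(tag)  # "bow" or "lstm", telling which variant served this request
```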
### Expected Results
......
@@ -20,7 +20,7 @@ sh get_data.sh
下面Python代码将处理`test_data/part-0`的数据,写入`processed.data`文件中。
```python
from paddle_serving_app.reader import IMDBDataset
imdb_dataset = IMDBDataset()
imdb_dataset.load_resource('imdb.vocab')
@@ -76,7 +76,7 @@ with open('processed.data') as f:
    feed = {"words": word_ids}
    fetch = ["acc", "cost", "prediction"]
    [fetch_map, tag] = client.predict(feed=feed, fetch=fetch, need_variant_tag=True)
    if (float(fetch_map["prediction"][0][1]) - 0.5) * (float(label[0]) - 0.5) > 0:
        cnt[tag]['acc'] += 1
    cnt[tag]['total'] += 1
......
@@ -59,7 +59,7 @@ the script of client side bert_client.py is as follows:
import os
import sys
from paddle_serving_client import Client
from paddle_serving_app.reader import ChineseBertReader
reader = ChineseBertReader()
fetch = ["pooled_output"]
......
@@ -52,7 +52,7 @@ pip install paddle_serving_app
``` python
import sys
from paddle_serving_client import Client
from paddle_serving_app.reader import ChineseBertReader
reader = ChineseBertReader()
fetch = ["pooled_output"]
......
@@ -20,7 +20,7 @@ This document will take Python2 as an example to show how to compile Paddle Serv
- Set `DPYTHON_INCLUDE_DIR` to `$PYTHONROOT/include/python3.6m/`
- Set `DPYTHON_LIBRARIES` to `$PYTHONROOT/lib64/libpython3.6.so`
- Set `DPYTHON_EXECUTABLE` to `$PYTHONROOT/bin/python3.6`
## Get Code
@@ -36,6 +36,8 @@ cd Serving && git submodule update --init --recursive
export PYTHONROOT=/usr/
```
In the default CentOS 7 image we provide, the Python path is `/usr/bin/python`. If you want to use our CentOS 6 image, you need to set it to `export PYTHONROOT=/usr/local/python2.7/`.
## Compile Server
### Integrated CPU version paddle inference library
......
@@ -20,7 +20,7 @@
- `DPYTHON_INCLUDE_DIR`设置为`$PYTHONROOT/include/python3.6m/`
- `DPYTHON_LIBRARIES`设置为`$PYTHONROOT/lib64/libpython3.6.so`
- `DPYTHON_EXECUTABLE`设置为`$PYTHONROOT/bin/python3.6`
## 获取代码
@@ -36,6 +36,8 @@ cd Serving && git submodule update --init --recursive
export PYTHONROOT=/usr/
```
我们提供的默认CentOS 7镜像中,Python路径为`/usr/bin/python`。如果您要使用我们的CentOS 6镜像,需要将其设置为`export PYTHONROOT=/usr/local/python2.7/`。
## 编译Server部分
### 集成CPU版本Paddle Inference Library
......
# FAQ
- Q:如何调整RPC服务的等待时间,避免超时?
A:使用set_rpc_timeout_ms设置更长的等待时间,单位为毫秒,默认时间为20秒。
示例:
```python
import sys
from paddle_serving_client import Client
client = Client()
client.load_client_config(sys.argv[1])
client.set_rpc_timeout_ms(100000)
client.connect(["127.0.0.1:9393"])
```
@@ -46,7 +46,7 @@ In this example, the production model is uploaded to HDFS in `product_path` fold
### Product model
Run the following Python code in the `product_path` folder to produce models (you need to modify the Hadoop-related parameters before running). Every 60 seconds, the package file of the Boston house price prediction model `uci_housing.tar.gz` will be generated and uploaded to the HDFS path `/`. After uploading, the timestamp file `donefile` will be updated and uploaded to the HDFS path `/`.
```python
import os
@@ -82,9 +82,14 @@ exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
def push_to_hdfs(local_file_path, remote_path):
    afs = 'afs://***.***.***.***:***'  # User needs to change
    uci = '***,***'  # User needs to change
    hadoop_bin = '/path/to/hadoop/bin'  # User needs to change
    prefix = '{} fs -Dfs.default.name={} -Dhadoop.job.ugi={}'.format(hadoop_bin, afs, uci)
    os.system('{} -rmr {}/{}'.format(
        prefix, remote_path, local_file_path))
    os.system('{} -put {} {}'.format(
        prefix, local_file_path, remote_path))
name = "uci_housing"
for pass_id in range(30):
......
@@ -46,7 +46,7 @@ Paddle Serving提供了一个自动监控脚本,远端地址更新模型后会
### 生产模型
在`product_path`下运行下面的Python代码生产模型(运行前需要修改hadoop相关的参数),每隔 60 秒会产出 Boston 房价预测模型的打包文件`uci_housing.tar.gz`并上传至hdfs的`/`路径下,上传完毕后更新时间戳文件`donefile`并上传至hdfs的`/`路径下。
```python
import os
@@ -82,9 +82,14 @@ exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
def push_to_hdfs(local_file_path, remote_path):
    afs = 'afs://***.***.***.***:***'  # User needs to change
    uci = '***,***'  # User needs to change
    hadoop_bin = '/path/to/hadoop/bin'  # User needs to change
    prefix = '{} fs -Dfs.default.name={} -Dhadoop.job.ugi={}'.format(hadoop_bin, afs, uci)
    os.system('{} -rmr {}/{}'.format(
        prefix, remote_path, local_file_path))
    os.system('{} -put {} {}'.format(
        prefix, local_file_path, remote_path))
name = "uci_housing"
for pass_id in range(30):
......
@@ -99,7 +99,7 @@ func main() {
### 基于IMDB测试集的预测
```python
go run imdb_client.go serving_client_conf/serving_client_conf.stream.prototxt test.data > result
```
### 计算精度
......
@@ -2,9 +2,9 @@
([简体中文](./PERFORMANCE_OPTIM_CN.md)|English)
Due to different model structures, different prediction services consume different computing resources when performing predictions. For online prediction services, models that require fewer computing resources have a higher proportion of communication time cost; these are called communication-intensive services. Models that require more computing resources have a higher time cost for inference calculations; these are called computation-intensive services.
For a prediction service, the easiest way to determine the type of service is to look at the time ratio. Paddle Serving provides a [Timeline tool](../python/examples/util/README_CN.md), which can intuitively display the time spent in each stage of the prediction service.
For communication-intensive prediction services, requests can be aggregated: within a limit that can tolerate delay, multiple prediction requests can be combined into a batch for prediction.
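
For example, a hedged sketch of such client-side aggregation, where several pending inputs are sent as one batched `predict` call; the config path, port, and variable names follow the quick-start housing example and the input values are dummies:

```python
from paddle_serving_client import Client

client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])

# Combine several pending requests into a single batched predict call.
pending = [{"x": [0.0] * 13}, {"x": [0.1] * 13}, {"x": [0.2] * 13}]
fetch_map = client.predict(feed=pending, fetch=["price"])  # one RPC, batch of 3
print(fetch_map["price"])
```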
@@ -16,5 +16,5 @@ Parameters for performance optimization:
| Parameters | Type | Default | Description |
| ---------- | ---- | ------- | ------------------------------------------------------------ |
| mem_optim | - | - | Enable memory / graphic memory optimization |
| ir_optim | - | - | Enable analysis and optimization of calculation graph, including OP fusion, etc. |
@@ -16,5 +16,5 @@
| 参数 | 类型 | 默认值 | 含义 |
| --------- | ---- | ------ | -------------------------------- |
| mem_optim | - | - | 开启内存/显存优化 |
| ir_optim | - | - | 开启计算图分析优化,包括OP融合等 |
@@ -34,7 +34,7 @@ for line in sys.stdin:
## Export from saved model files
If you have saved model files using Paddle's `save_inference_model` API, you can use Paddle Serving's `inference_model_to_serving` API to convert them into model files that can be used for Paddle Serving.
```python
import paddle_serving_client.io as serving_io
serving_io.inference_model_to_serving(dirname, serving_server="serving_server", serving_client="serving_client", model_filename=None, params_filename=None)
```
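
For instance, a hedged usage sketch, assuming a model previously saved with `save_inference_model` into `./inference_model` as combined `model`/`params` files (the paths are illustrative):

```python
import paddle_serving_client.io as serving_io

# Produces serving_server/ (for the server side) and serving_client/ (client config).
serving_io.inference_model_to_serving(
    "./inference_model",
    serving_server="serving_server",
    serving_client="serving_client",
    model_filename="model",    # set these two only if the model was saved
    params_filename="params")  # as combined files; otherwise leave them None
```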
......
@@ -35,7 +35,7 @@ for line in sys.stdin:
## 从已保存的模型文件中导出
如果已使用Paddle的`save_inference_model`接口保存出预测要使用的模型,则可以通过Paddle Serving的`inference_model_to_serving`接口转换成可用于Paddle Serving的模型文件。
```python
import paddle_serving_client.io as serving_io
serving_io.inference_model_to_serving(dirname, serving_server="serving_server", serving_client="serving_client", model_filename=None, params_filename=None)
```
......
@@ -18,7 +18,7 @@ http://10.127.3.150:9393/uci/prediction
Here you will be prompted that the HTTP service started is in development mode and cannot be used for production deployment.
The prediction service started by Flask is not stable enough to withstand the concurrency of a large number of requests. In the actual deployment process, WSGI (Web Server Gateway Interface) is used.
Next, we will show how to use the [uWSGI](https://github.com/unbit/uwsgi) module to deploy HTTP prediction services for production environments.
```python
@@ -29,7 +29,7 @@ from paddle_serving_server.web_service import WebService
uci_service = WebService(name = "uci")
uci_service.load_model_config("./uci_housing_model")
uci_service.prepare_server(workdir="./workdir", port=int(9500), device="cpu")
uci_service.run_rpc_service()
#Get flask application
app_instance = uci_service.get_app_instance()
```
......
@@ -29,7 +29,7 @@ from paddle_serving_server.web_service import WebService
uci_service = WebService(name = "uci")
uci_service.load_model_config("./uci_housing_model")
uci_service.prepare_server(workdir="./workdir", port=int(9500), device="cpu")
uci_service.run_rpc_service()
#获取flask服务
app_instance = uci_service.get_app_instance()
```
......
@@ -19,13 +19,11 @@ from __future__ import unicode_literals, absolute_import
import os
import sys
import time
import json
import requests
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args, show_latency
from paddle_serving_app.reader import ChineseBertReader

args = benchmark_args()
@@ -36,42 +34,105 @@ def single_func(idx, resource):
    dataset = []
    for line in fin:
        dataset.append(line.strip())
    profile_flags = False
    latency_flags = False
    if os.getenv("FLAGS_profile_client"):
        profile_flags = True
    if os.getenv("FLAGS_serving_latency"):
        latency_flags = True
        latency_list = []
    if args.request == "rpc":
        reader = ChineseBertReader({"max_seq_len": 128})
        fetch = ["pooled_output"]
        client = Client()
        client.load_client_config(args.model)
        client.connect([resource["endpoint"][idx % len(resource["endpoint"])]])
        start = time.time()
        for i in range(turns):
            if args.batch_size >= 1:
                l_start = time.time()
                feed_batch = []
                b_start = time.time()
                for bi in range(args.batch_size):
                    feed_batch.append(reader.process(dataset[bi]))
                b_end = time.time()
                if profile_flags:
                    sys.stderr.write(
                        "PROFILE\tpid:{}\tbert_pre_0:{} bert_pre_1:{}\n".format(
                            os.getpid(),
                            int(round(b_start * 1000000)),
                            int(round(b_end * 1000000))))
                result = client.predict(feed=feed_batch, fetch=fetch)
                l_end = time.time()
                if latency_flags:
                    latency_list.append(l_end * 1000 - l_start * 1000)
            else:
                print("unsupport batch size {}".format(args.batch_size))
    elif args.request == "http":
        reader = ChineseBertReader({"max_seq_len": 128})
        fetch = ["pooled_output"]
        server = "http://" + resource["endpoint"][idx % len(resource[
            "endpoint"])] + "/bert/prediction"
        start = time.time()
        for i in range(turns):
            if args.batch_size >= 1:
                l_start = time.time()
                feed_batch = []
                b_start = time.time()
                for bi in range(args.batch_size):
                    feed_batch.append({"words": dataset[bi]})
                req = json.dumps({"feed": feed_batch, "fetch": fetch})
                b_end = time.time()
                if profile_flags:
                    sys.stderr.write(
                        "PROFILE\tpid:{}\tbert_pre_0:{} bert_pre_1:{}\n".format(
                            os.getpid(),
                            int(round(b_start * 1000000)),
                            int(round(b_end * 1000000))))
                result = requests.post(
                    server,
                    data=req,
                    headers={"Content-Type": "application/json"})
                l_end = time.time()
                if latency_flags:
                    latency_list.append(l_end * 1000 - l_start * 1000)
            else:
                print("unsupport batch size {}".format(args.batch_size))
    else:
        raise ValueError("not implemented {} request".format(args.request))
    end = time.time()
    if latency_flags:
        return [[end - start], latency_list]
    else:
        return [[end - start]]
if __name__ == '__main__':
    multi_thread_runner = MultiThreadRunner()
    endpoint_list = ["127.0.0.1:9292"]
    turns = 10
    start = time.time()
    result = multi_thread_runner.run(
        single_func, args.thread, {"endpoint": endpoint_list,
                                   "turns": turns})
    end = time.time()
    total_cost = end - start
    avg_cost = 0
    for i in range(args.thread):
        avg_cost += result[0][i]
    avg_cost = avg_cost / args.thread

    print("total cost :{} s".format(total_cost))
    print("each thread cost :{} s. ".format(avg_cost))
    print("qps :{} samples/s".format(args.batch_size * args.thread * turns /
                                     total_cost))
    if os.getenv("FLAGS_serving_latency"):
        show_latency(result[1])
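The reported qps is simply total samples over wall-clock time: for example, with `--thread 8 --batch_size 16` and `turns = 10` the run issues 8 * 16 * 10 = 1280 samples, so a `total_cost` of 4 s would report 320 samples/s.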
rm profile_log
export CUDA_VISIBLE_DEVICES=0,1,2,3
export FLAGS_profile_server=1
export FLAGS_profile_client=1
export FLAGS_serving_latency=1
python3 -m paddle_serving_server_gpu.serve --model $1 --port 9292 --thread 4 --gpu_ids 0,1,2,3 --mem_optim False --ir_optim True 2> elog > stdlog &
sleep 5
#warm up
python3 benchmark.py --thread 8 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
for thread_num in 4 8 16
do
    for batch_size in 1 4 16 64 256
    do
        python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
        echo "model name :" $1
        echo "thread num :" $thread_num
        echo "batch size :" $batch_size
        echo "=================Done===================="
        echo "model name :$1" >> profile_log_$1
        echo "batch size :$batch_size" >> profile_log_$1
        python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
        tail -n 8 profile >> profile_log_$1
        echo "" >> profile_log_$1
    done
done
ps -ef|grep 'serving'|grep -v grep|cut -c 9-15 | xargs kill -9
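For reference, the script takes the server model directory as `$1` and the client config directory as `$2`, so a typical invocation (hypothetical paths) is `sh benchmark.sh bert_seq128_model bert_seq128_client`.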
# -*- coding: utf-8 -*-
#
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from __future__ import unicode_literals, absolute_import
import os
import sys
import time
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args
from batching import pad_batch_data
import tokenization
import requests
import json
from bert_reader import BertReader
args = benchmark_args()
def single_func(idx, resource):
fin = open("data-c.txt")
dataset = []
for line in fin:
dataset.append(line.strip())
profile_flags = False
    if os.environ.get("FLAGS_profile_client"):  # .get() avoids a KeyError when the flag is unset
profile_flags = True
if args.request == "rpc":
reader = BertReader(vocab_file="vocab.txt", max_seq_len=20)
fetch = ["pooled_output"]
client = Client()
client.load_client_config(args.model)
client.connect([resource["endpoint"][idx % len(resource["endpoint"])]])
start = time.time()
for i in range(1000):
if args.batch_size >= 1:
feed_batch = []
b_start = time.time()
for bi in range(args.batch_size):
feed_batch.append(reader.process(dataset[bi]))
b_end = time.time()
if profile_flags:
print("PROFILE\tpid:{}\tbert_pre_0:{} bert_pre_1:{}".format(
os.getpid(),
int(round(b_start * 1000000)),
int(round(b_end * 1000000))))
result = client.predict(feed=feed_batch, fetch=fetch)
else:
print("unsupport batch size {}".format(args.batch_size))
elif args.request == "http":
        raise ValueError("no batch predict for http")
end = time.time()
return [[end - start]]
if __name__ == '__main__':
multi_thread_runner = MultiThreadRunner()
endpoint_list = ["127.0.0.1:9292"]
result = multi_thread_runner.run(single_func, args.thread,
{"endpoint": endpoint_list})
avg_cost = 0
for i in range(args.thread):
avg_cost += result[0][i]
avg_cost = avg_cost / args.thread
print("average total cost {} s.".format(avg_cost))
rm profile_log
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle_serving_server_gpu.serve --model bert_seq20_model/ --port 9295 --thread 4 --gpu_ids 0,1,2,3 2> elog > stdlog &
sleep 5
for thread_num in 1 2 4 8 16
do
for batch_size in 1 2 4 8 16 32 64 128 256 512
do
$PYTHONROOT/bin/python benchmark_batch.py --thread $thread_num --batch_size $batch_size --model serving_client_conf/serving_client_conf.prototxt --request rpc > profile 2>&1
echo "========================================"
echo "thread num: ", $thread_num
echo "batch size: ", $batch_size
echo "batch size : $batch_size" >> profile_log
$PYTHONROOT/bin/python ../util/show_profile.py profile $thread_num >> profile_log
tail -n 1 profile >> profile_log
done
done
@@ -14,15 +14,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
from paddle_serving_client import Client
from paddle_serving_client.utils import benchmark_args
from paddle_serving_app.reader import ChineseBertReader
...
@@ -21,7 +21,10 @@ import os
class BertService(WebService):
    def load(self):
        self.reader = ChineseBertReader({
            "vocab_file": "vocab.txt",
            "max_seq_len": 128
        })

    def preprocess(self, feed=[], fetch=[]):
        feed_res = [
...
@@ -17,6 +17,6 @@
mkdir -p cube_model
mkdir -p cube/data
./seq_generator ctr_serving_model/SparseFeatFactors ./cube_model/feature
./cube/cube-builder -dict_name=test_dict -job_mode=base -last_version=0 -cur_version=0 -depend_version=0 -input_path=./cube_model -output_path=${PWD}/cube/data -shard_num=1 -only_build=false
mv ./cube/data/0_0/test_dict_part0/* ./cube/data/
cd cube && ./cube
@@ -17,6 +17,6 @@
mkdir -p cube_model
mkdir -p cube/data
./seq_generator ctr_serving_model/SparseFeatFactors ./cube_model/feature 8
./cube/cube-builder -dict_name=test_dict -job_mode=base -last_version=0 -cur_version=0 -depend_version=0 -input_path=./cube_model -output_path=${PWD}/cube/data -shard_num=1 -only_build=false
mv ./cube/data/0_0/test_dict_part0/* ./cube/data/
cd cube && ./cube
# Image Segmentation
## Get Model
```
python -m paddle_serving_app.package --get_model deeplabv3
tar -xzvf deeplabv3.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model deeplabv3_server --gpu_ids 0 --port 9494
```
### Client Prediction
```
python deeplabv3_client.py
```
# Image Segmentation
## Get Model
```
python -m paddle_serving_app.package --get_model deeplabv3
tar -xzvf deeplabv3.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model deeplabv3_server --gpu_ids 0 --port 9494
```
### Client Prediction
```
python deeplabv3_client.py
```
@@ -18,7 +18,7 @@ import sys
import cv2

client = Client()
client.load_client_config("deeplabv3_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9494"])
preprocess = Sequential(
...
@@ -12,8 +12,8 @@ If you want to have more detection models, please refer to [Paddle Detection Mod
### Start the service
```
tar xf faster_rcnn_model.tar.gz
mv faster_rcnn_model/pddet* .
GLOG_v=2 python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
```
### Perform prediction
...
@@ -13,7 +13,7 @@ wget https://paddle-serving.bj.bcebos.com/pddet_demo/infer_cfg.yml
```
tar xf faster_rcnn_model.tar.gz
mv faster_rcnn_model/pddet* ./
GLOG_v=2 python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
```
### Perform prediction
...
@@ -22,15 +22,19 @@ def single_func(idx, resource):
    client.load_client_config(
        "./uci_housing_client/serving_client_conf.prototxt")
    client.connect(["127.0.0.1:9293", "127.0.0.1:9292"])
    x = [
        0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584,
        0.6283, 0.4919, 0.1856, 0.0795, -0.0332
    ]
    for i in range(1000):
        fetch_map = client.predict(feed={"x": x}, fetch=["price"])
        if fetch_map is None:
            return [[None]]
    return [[0]]


multi_thread_runner = MultiThreadRunner()
thread_num = 4
result = multi_thread_runner.run(single_func, thread_num, {})
if None in result[0]:
    exit(1)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
from paddle_serving_client import MultiLangClient
import sys
client = MultiLangClient()
client.load_client_config(sys.argv[1])
client.connect(["127.0.0.1:9393"])
import paddle
test_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.test(), buf_size=500),
batch_size=1)
for data in test_reader():
future = client.predict(feed={"x": data[0][0]}, fetch=["price"], asyn=True)
fetch_map = future.result()
print("{} {}".format(fetch_map["price"][0], data[0][1][0]))
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# pylint: disable=doc-string-missing
import os
import sys
from paddle_serving_server import OpMaker
from paddle_serving_server import OpSeqMaker
from paddle_serving_server import MultiLangServer
op_maker = OpMaker()
read_op = op_maker.create('general_reader')
general_infer_op = op_maker.create('general_infer')
response_op = op_maker.create('general_response')
op_seq_maker = OpSeqMaker()
op_seq_maker.add_op(read_op)
op_seq_maker.add_op(general_infer_op)
op_seq_maker.add_op(response_op)
server = MultiLangServer()
server.set_op_sequence(op_seq_maker.get_op_sequence())
server.load_model_config(sys.argv[1])
server.prepare_server(workdir="work_dir1", port=9393, device="cpu")
server.run_server()
@@ -19,10 +19,10 @@ pip install paddle_serving_app
Start the server:
```
python resnet50_web_service.py ResNet50_vd_model cpu 9696 #cpu inference service
```
```
python resnet50_web_service.py ResNet50_vd_model gpu 9696 #gpu inference service
```
...
@@ -73,7 +73,7 @@ def single_func(idx, resource):
        print("unsupported batch size {}".format(args.batch_size))
    elif args.request == "http":
        py_version = sys.version_info[0]
        server = "http://" + resource["endpoint"][idx % len(resource[
            "endpoint"])] + "/image/prediction"
        start = time.time()
@@ -93,7 +93,7 @@ def single_func(idx, resource):
if __name__ == '__main__':
    multi_thread_runner = MultiThreadRunner()
    endpoint_list = ["127.0.0.1:9393"]
    #endpoint_list = endpoint_list + endpoint_list + endpoint_list
    result = multi_thread_runner.run(single_func, args.thread,
                                     {"endpoint": endpoint_list})
...
rm profile_log
export CUDA_VISIBLE_DEVICES=0,1,2,3
export FLAGS_profile_server=1
export FLAGS_profile_client=1
python -m paddle_serving_server_gpu.serve --model $1 --port 9292 --thread 4 --gpu_ids 0,1,2,3 2> elog > stdlog &
sleep 5
#warm up
$PYTHONROOT/bin/python benchmark.py --thread 8 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
for thread_num in 4 8 16
do
    for batch_size in 1 4 16 64 256
    do
        $PYTHONROOT/bin/python benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
        echo "model name :" $1
        echo "thread num :" $thread_num
        echo "batch size :" $batch_size
        echo "=================Done===================="
        echo "model name :$1" >> profile_log
        echo "batch size :$batch_size" >> profile_log
        $PYTHONROOT/bin/python ../util/show_profile.py profile $thread_num >> profile_log
        tail -n 8 profile >> profile_log
    done
done
ps -ef|grep 'serving'|grep -v grep|cut -c 9-15 | xargs kill -9
@@ -13,27 +13,23 @@
# limitations under the License.
from paddle_serving_client.pyclient import PyClient
import numpy as np
from paddle_serving_app.reader import IMDBDataset
from line_profiler import LineProfiler

client = PyClient()
client.connect('localhost:8080')

lp = LineProfiler()
lp_wrapper = lp(client.predict)

words = 'i am very sad | 0'
imdb_dataset = IMDBDataset()
imdb_dataset.load_resource('imdb.vocab')

for i in range(1):
    word_ids, label = imdb_dataset.get_words_and_label(words)
    fetch_map = lp_wrapper(
        feed={"words": word_ids}, fetch=["combined_prediction"])
    print(fetch_map)

#lp.print_stats()
@@ -16,101 +16,54 @@
from paddle_serving_server.pyserver import Op
from paddle_serving_server.pyserver import Channel
from paddle_serving_server.pyserver import PyServer
import numpy as np
import logging

logging.basicConfig(
    format='%(asctime)s %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s',
    datefmt='%Y-%m-%d %H:%M',
    level=logging.INFO)


class CombineOp(Op):
    def preprocess(self, input_data):
        combined_prediction = 0
        for op_name, channeldata in input_data.items():
            data = channeldata.parse()
            logging.info("{}: {}".format(op_name, data["prediction"]))
            combined_prediction += data["prediction"]
        data = {"combined_prediction": combined_prediction / 2}
        return data


read_op = Op(name="read", inputs=None)
bow_op = Op(name="bow",
            inputs=[read_op],
            server_model="imdb_bow_model",
            server_port="9393",
            device="cpu",
            client_config="imdb_bow_client_conf/serving_client_conf.prototxt",
            server_name="127.0.0.1:9393",
            fetch_names=["prediction"],
            concurrency=1,
            timeout=0.1,
            retry=2)
cnn_op = Op(name="cnn",
            inputs=[read_op],
            server_model="imdb_cnn_model",
            server_port="9292",
            device="cpu",
            client_config="imdb_cnn_client_conf/serving_client_conf.prototxt",
            server_name="127.0.0.1:9292",
            fetch_names=["prediction"],
            concurrency=1,
            timeout=-1,
            retry=1)
combine_op = CombineOp(
    name="combine", inputs=[bow_op, cnn_op], concurrency=1, timeout=-1, retry=1)

pyserver = PyServer(profile=False, retry=1)
pyserver.add_ops([read_op, bow_op, cnn_op, combine_op])
pyserver.prepare_server(port=8080, worker_num=2)
pyserver.run_server()
@@ -2,28 +2,27 @@
([简体中文](./README_CN.md)|English)

### Get Model
```
python -m paddle_serving_app.package --get_model lac
tar -xzvf lac.tar.gz
```

#### Start RPC inference service
```
python -m paddle_serving_server.serve --model lac_model/ --port 9292
```
### RPC Infer
```
echo "我爱北京天安门" | python lac_client.py lac_client/serving_client_conf.prototxt
```
It will get the segmentation result.

### Start HTTP inference service
```
python lac_web_service.py lac_model/ lac_workdir 9292
```
### HTTP Infer
...
@@ -2,28 +2,27 @@
(简体中文|[English](./README.md))

### Get Model
```
python -m paddle_serving_app.package --get_model lac
tar -xzvf lac.tar.gz
```

#### Start RPC inference service
```
python -m paddle_serving_server.serve --model lac_model/ --port 9292
```
### Perform RPC inference
```
echo "我爱北京天安门" | python lac_client.py lac_client/serving_client_conf.prototxt
```
This returns the word segmentation result.

### Start HTTP inference service
```
python lac_web_service.py lac_model/ lac_workdir 9292
```
### Perform HTTP inference
...
@@ -16,7 +16,7 @@
import sys
import time
import requests
from paddle_serving_app.reader import LACReader
from paddle_serving_client import Client
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args
@@ -25,7 +25,7 @@ args = benchmark_args()

def single_func(idx, resource):
    reader = LACReader()
    start = time.time()
    if args.request == "rpc":
        client = Client()
...
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/lac/lac_model_jieba_web.tar.gz
tar -zxvf lac_model_jieba_web.tar.gz
@@ -15,7 +15,7 @@
# pylint: disable=doc-string-missing

from paddle_serving_client import Client
from paddle_serving_app.reader import LACReader
import sys
import os
import io
@@ -24,7 +24,7 @@ client = Client()
client.load_client_config(sys.argv[1])
client.connect(["127.0.0.1:9292"])

reader = LACReader()
for line in sys.stdin:
    if len(line) <= 0:
        continue
@@ -32,4 +32,7 @@ for line in sys.stdin:
    if len(feed_data) <= 0:
        continue
    fetch_map = client.predict(feed={"words": feed_data}, fetch=["crf_decode"])
    begin = fetch_map['crf_decode.lod'][0]
    end = fetch_map['crf_decode.lod'][1]
    segs = reader.parse_result(line, fetch_map["crf_decode"][begin:end])
    print("word_seg: " + "|".join(str(words) for words in segs))
@@ -14,12 +14,12 @@
from paddle_serving_server.web_service import WebService
import sys
from paddle_serving_app.reader import LACReader


class LACService(WebService):
    def load_reader(self):
        self.reader = LACReader()

    def preprocess(self, feed={}, fetch=[]):
        feed_batch = []
...
# Image Classification
## Get Model
```
python -m paddle_serving_app.package --get_model mobilenet_v2_imagenet
tar -xzvf mobilenet_v2_imagenet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model mobilenet_v2_imagenet_model --gpu_ids 0 --port 9393
```
### Client Prediction
```
python mobilenet_tutorial.py
```
# Image Classification
## Get Model
```
python -m paddle_serving_app.package --get_model mobilenet_v2_imagenet
tar -xzvf mobilenet_v2_imagenet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model mobilenet_v2_imagenet_model --gpu_ids 0 --port 9393
```
### Client Prediction
```
python mobilenet_tutorial.py
```
# OCR
## Get Model
```
python -m paddle_serving_app.package --get_model ocr_rec
tar -xzvf ocr_rec.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server.serve --model ocr_rec_model --port 9292
```
### Client Prediction
```
python test_ocr_rec_client.py
```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_client import Client
from paddle_serving_app.reader import OCRReader
import cv2
client = Client()
client.load_client_config("ocr_rec_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
image_file_list = ["./test_rec.jpg"]
img = cv2.imread(image_file_list[0])
ocr_reader = OCRReader()
feed = {"image": ocr_reader.preprocess([img])}
fetch = ["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]
fetch_map = client.predict(feed=feed, fetch=fetch)
rec_res = ocr_reader.postprocess(fetch_map)
print(image_file_list[0])
print(rec_res[0][0])
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from paddle_serving_client import Client
from paddle_serving_app.reader import Sequential, File2Image, ResizeByFactor
from paddle_serving_app.reader import Div, Normalize, Transpose
from paddle_serving_app.reader import DBPostProcess, FilterBoxes
client = Client()
client.load_client_config("ocr_det_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9494"])
read_image_file = File2Image()
preprocess = Sequential([
ResizeByFactor(32, 960), Div(255),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
(2, 0, 1))
])
post_func = DBPostProcess({
"thresh": 0.3,
"box_thresh": 0.5,
"max_candidates": 1000,
"unclip_ratio": 1.5,
"min_size": 3
})
filter_func = FilterBoxes(10, 10)
name = "test_det.jpg"  # hypothetical sample image path; the original left `name` undefined
img = read_image_file(name)
ori_h, ori_w, _ = img.shape
img = preprocess(img)
new_h, new_w, _ = img.shape
ratio_list = [float(new_h) / ori_h, float(new_w) / ori_w]
outputs = client.predict(feed={"image": img}, fetch=["concat_1.tmp_0"])
dt_boxes_list = post_func(outputs["concat_1.tmp_0"], [ratio_list])
dt_boxes = filter_func(dt_boxes_list[0], [ori_h, ori_w])
# Image Classification
## Get Model
```
python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
tar -xzvf resnet_v2_50_imagenet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model resnet_v2_50_imagenet_model --gpu_ids 0 --port 9393
```
### Client Prediction
```
python resnet50_v2_tutorial.py
```
# Image Classification
## Get Model
```
python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
tar -xzvf resnet_v2_50_imagenet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model resnet_v2_50_imagenet_model --gpu_ids 0 --port 9393
```
### Client Prediction
```
python resnet50_v2_tutorial.py
```
@@ -14,7 +14,7 @@
from paddle_serving_client import Client
from paddle_serving_app.reader import Sequential, File2Image, Resize, CenterCrop
from paddle_serving_app.reader import RGB2BGR, Transpose, Div, Normalize

client = Client()
client.load_client_config(
@@ -28,5 +28,5 @@ seq = Sequential([
image_file = "daisy.jpg"
img = seq(image_file)
fetch_map = client.predict(feed={"image": img}, fetch=["score"])
print(fetch_map["score"].reshape(-1))
# Chinese Sentence Sentiment Classification
([简体中文](./README_CN.md)|English)

## Get Model
```
python -m paddle_serving_app.package --get_model senta_bilstm
python -m paddle_serving_app.package --get_model lac
tar -xzvf senta_bilstm.tar.gz
tar -xzvf lac.tar.gz
```
## Start HTTP Service
```
python -m paddle_serving_server.serve --model lac_model --port 9300
python senta_web_service.py
```
In the Chinese sentiment classification task, the Chinese word segmentation needs to be done through the [LAC task](../lac).
In this demo, the LAC task is placed in the preprocessing part of the HTTP prediction service of the sentiment classification task.
## Client prediction
```
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "天气不错"}], "fetch":["class_probs"]}' http://127.0.0.1:9393/senta/prediction
```
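A Python equivalent of the curl request above (a sketch, assuming the service is running as started in this README):

```
import json
import requests

# same payload as the curl example above
data = {"feed": [{"words": "天气不错"}], "fetch": ["class_probs"]}
r = requests.post(
    "http://127.0.0.1:9393/senta/prediction",
    data=json.dumps(data),
    headers={"Content-Type": "application/json"})
print(r.json())
```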
# Chinese Sentence Sentiment Classification
(简体中文|[English](./README.md))

## Get Model
```
python -m paddle_serving_app.package --get_model senta_bilstm
python -m paddle_serving_app.package --get_model lac
tar -xzvf lac.tar.gz
tar -xzvf senta_bilstm.tar.gz
```
## Start HTTP Service
```
python -m paddle_serving_server.serve --model lac_model --port 9300
python senta_web_service.py
```
For Chinese sentiment classification, the input text first needs to be segmented through the [LAC task](../lac).
In this demo, the LAC task is placed in the preprocessing part of the sentiment classification HTTP prediction service.
## Client prediction
```
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "天气不错"}], "fetch":["class_probs"]}' http://127.0.0.1:9393/senta/prediction
```
#encoding=utf-8
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -12,56 +13,28 @@
# See the License for the specific language governing permissions and
# limitations under the License.

from paddle_serving_server.web_service import WebService
from paddle_serving_client import Client
from paddle_serving_app.reader import LACReader, SentaReader
import os
import sys


class SentaService(WebService):
    #Initialize the client of the LAC inference service
    def init_lac_client(self, lac_port, lac_client_config):
        self.lac_reader = LACReader()
        self.senta_reader = SentaReader()
        self.lac_client = Client()
        self.lac_client.load_client_config(lac_client_config)
        self.lac_client.connect(["127.0.0.1:{}".format(lac_port)])

    #Preprocessing for the senta service. Call order: lac reader -> lac model inference -> postprocess result -> senta reader
    def preprocess(self, feed=[], fetch=[]):
        feed_data = [{
            "words": self.lac_reader.process(x["words"])
@@ -80,15 +53,9 @@ class SentaService(WebService):

senta_service = SentaService(name="senta")
senta_service.load_model_config("senta_bilstm_model")
senta_service.prepare_server(workdir="workdir")
senta_service.init_lac_client(
    lac_port=9300, lac_client_config="lac_model/serving_server_conf.prototxt")
senta_service.run_rpc_service()
senta_service.run_web_service()
# Image Segmentation
## Get Model
```
python -m paddle_serving_app.package --get_model unet
tar -xzvf unet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model unet_model --gpu_ids 0 --port 9494
```
### Client Prediction
```
python seg_client.py
```
# Image Segmentation
## Get Model
```
python -m paddle_serving_app.package --get_model unet
tar -xzvf unet.tar.gz
```
## RPC Service
### Start Service
```
python -m paddle_serving_server_gpu.serve --model unet_model --gpu_ids 0 --port 9494
```
### Client Prediction
```
python seg_client.py
```
@@ -27,7 +27,8 @@ preprocess = Sequential(
postprocess = SegPostprocess(2)

filename = "N0060.jpg"
im = preprocess(filename)
fetch_map = client.predict(feed={"image": im}, fetch=["output"])
fetch_map["filename"] = filename
postprocess(fetch_map)
@@ -31,7 +31,7 @@ with open(profile_file) as f:
        if line[0] == "PROFILE":
            prase(line[2])

print("thread num :{}".format(thread_num))
for name in time_dict:
    print("{} cost :{} s in each thread ".format(name, time_dict[name] / (
        1000000.0 * float(thread_num))))
@@ -12,7 +12,7 @@ pip install paddle_serving_app
## Get model list
```shell
python -m paddle_serving_app.package --list_model
```

## Download pre-trained models
@@ -21,16 +21,16 @@ python -m paddle_serving_app.package --model_list
python -m paddle_serving_app.package --get_model senta_bilstm
```

11 pre-trained models are built into paddle_serving_app, covering 6 kinds of prediction tasks.
The model files can be directly used for deployment, and the `--tutorial` argument can be added to obtain the deployment method.

| Prediction task | Model name |
| ------------ | ------------------------------------------------ |
| SentimentAnalysis | 'senta_bilstm', 'senta_bow', 'senta_cnn' |
| SemanticRepresentation | 'ernie' |
| ChineseWordSegmentation | 'lac' |
| ObjectDetection | 'faster_rcnn' |
| ImageSegmentation | 'unet', 'deeplabv3', 'deeplabv3+cityscapes' |
| ImageClassification | 'resnet_v2_50_imagenet', 'mobilenet_v2_imagenet' |

## Data preprocess API
@@ -38,7 +38,8 @@ The model files can be directly used for deployment, and the `--tutorial` argume
paddle_serving_app provides a variety of data preprocessing methods for prediction tasks in the field of CV and NLP.

- class ChineseBertReader

Preprocessing for the Chinese semantic representation task.
- `__init__(vocab_file, max_seq_len=20)`
@@ -54,7 +55,8 @@ Preprocessing for Chinese semantic representation task.
[example](../examples/bert/bert_client.py)
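A short usage sketch (the dict-style config mirrors how `bert_web_service.py` and the bert benchmark in this commit call the reader):

```
from paddle_serving_app.reader import ChineseBertReader

# dict-style config, as used by bert_web_service.py in this commit
reader = ChineseBertReader({"vocab_file": "vocab.txt", "max_seq_len": 128})
feed_dict = reader.process("hello")  # returns the feed dict for one sentence
```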
- class LACReader

Preprocessing for the Chinese word segmentation task.
- `__init__(dict_folder)`
@@ -65,7 +67,7 @@ Preprocessing for Chinese word segmentation task.
- words(str): Original text input.
- crf_decode(np.array): CRF code predicted by model.

[example](../examples/lac/lac_web_service.py)
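A condensed sketch of the flow used by `lac_client.py` in this commit (`process()` builds the feed; `parse_result()` decodes the CRF output using the returned lod offsets):

```
from paddle_serving_app.reader import LACReader

reader = LACReader()
line = "我爱北京天安门"
feed_data = reader.process(line)
# after fetch_map = client.predict(feed={"words": feed_data}, fetch=["crf_decode"]):
# begin, end = fetch_map['crf_decode.lod'][0], fetch_map['crf_decode.lod'][1]
# segs = reader.parse_result(line, fetch_map["crf_decode"][begin:end])
```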
- class SentaReader
@@ -76,7 +78,7 @@ Preprocessing for Chinese word segmentation task.
[example](../examples/senta/senta_web_service.py)

- The image preprocessing method is more flexible than the above methods, and can be composed from the following classes, [example](../examples/imagenet/resnet50_rpc_client.py); see the sketch below.
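A minimal sketch of such a composition (the op order here is an assumption modeled on the resnet tutorial in this commit; the linked example is authoritative):

```
from paddle_serving_app.reader import Sequential, File2Image, Resize, CenterCrop
from paddle_serving_app.reader import RGB2BGR, Transpose, Div, Normalize

seq = Sequential([
    File2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
    Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
img = seq("daisy.jpg")  # ndarray ready to feed into client.predict
```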
- class Sequential
...
@@ -11,7 +11,7 @@ pip install paddle_serving_app
## Get model list
```shell
python -m paddle_serving_app.package --list_model
```

## Download pre-trained models
@@ -20,15 +20,15 @@ python -m paddle_serving_app.package --model_list
python -m paddle_serving_app.package --get_model senta_bilstm
```

11 pre-trained models are built into paddle_serving_app, covering 6 kinds of prediction tasks. The downloaded model files can be used directly for deployment; add the `--tutorial` argument to get the corresponding deployment instructions.

| Prediction task | Model name |
| ------------ | ------------------------------------------------ |
| Chinese sentiment analysis | 'senta_bilstm', 'senta_bow', 'senta_cnn' |
| Semantic understanding | 'ernie' |
| Chinese word segmentation | 'lac' |
| Object detection | 'faster_rcnn' |
| Image segmentation | 'unet', 'deeplabv3', 'deeplabv3+cityscapes' |
| Image classification | 'resnet_v2_50_imagenet', 'mobilenet_v2_imagenet' |

## Data preprocess API
@@ -36,7 +36,7 @@ 11 pre-trained models are built into paddle_serving_app, covering 6 kinds of prediction tasks
paddle_serving_app provides common data preprocessing methods for CV and NLP model tasks.

- class ChineseBertReader

Preprocessing for the Chinese semantic understanding model.
- `__init__(vocab_file, max_seq_len=20)`
@@ -71,7 +71,7 @@ paddle_serving_app provides common data preprocessing methods for CV and NLP model tasks
[example](../examples/senta/senta_web_service.py)

- Image preprocessing is more flexible than the methods above and can be composed from the following classes, [example](../examples/imagenet/resnet50_rpc_client.py)
- class Sequential
...
@@ -22,22 +22,26 @@ class ServingModels(object):
        self.model_dict = OrderedDict()
        self.model_dict[
            "SentimentAnalysis"] = ["senta_bilstm", "senta_bow", "senta_cnn"]
        self.model_dict["SemanticRepresentation"] = ["ernie"]
        self.model_dict["ChineseWordSegmentation"] = ["lac"]
        self.model_dict["ObjectDetection"] = ["faster_rcnn"]
        self.model_dict["ImageSegmentation"] = [
            "unet", "deeplabv3", "deeplabv3+cityscapes"
        ]
        self.model_dict["ImageClassification"] = [
            "resnet_v2_50_imagenet", "mobilenet_v2_imagenet"
        ]
        self.model_dict["TextDetection"] = ["ocr_detection"]
        self.model_dict["OCR"] = ["ocr_rec"]

        image_class_url = "https://paddle-serving.bj.bcebos.com/paddle_hub_models/image/ImageClassification/"
        image_seg_url = "https://paddle-serving.bj.bcebos.com/paddle_hub_models/image/ImageSegmentation/"
        object_detection_url = "https://paddle-serving.bj.bcebos.com/paddle_hub_models/image/ObjectDetection/"
        ocr_url = "https://paddle-serving.bj.bcebos.com/paddle_hub_models/image/OCR/"
        senta_url = "https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SentimentAnalysis/"
        semantic_url = "https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/"
        wordseg_url = "https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/LexicalAnalysis/"
        ocr_det_url = "https://paddle-serving.bj.bcebos.com/ocr/"

        self.url_dict = {}
@@ -52,6 +56,8 @@ class ServingModels(object):
        pack_url(self.model_dict, "ObjectDetection", object_detection_url)
        pack_url(self.model_dict, "ImageSegmentation", image_seg_url)
        pack_url(self.model_dict, "ImageClassification", image_class_url)
        pack_url(self.model_dict, "OCR", ocr_url)
        pack_url(self.model_dict, "TextDetection", ocr_det_url)

    def get_model_list(self):
        return self.model_dict
...
@@ -12,7 +12,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.
from .chinese_bert_reader import ChineseBertReader
from .image_reader import ImageReader, File2Image, URL2Image, Sequential, Normalize
from .image_reader import CenterCrop, Resize, Transpose, Div, RGB2BGR, BGR2RGB, ResizeByFactor
from .image_reader import RCNNPostprocess, SegPostprocess, PadStride
from .image_reader import DBPostProcess, FilterBoxes
from .lac_reader import LACReader
from .senta_reader import SentaReader
from .imdb_reader import IMDBDataset
from .ocr_reader import OCRReader
@@ -11,6 +11,9 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import cv2
import os
import numpy as np
@@ -18,6 +21,8 @@ import base64
import sys
from . import functional as F
from PIL import Image, ImageDraw
from shapely.geometry import Polygon
import pyclipper
import json

_cv2_interpolation_to_str = {cv2.INTER_LINEAR: "cv2.INTER_LINEAR", None: "None"}
@@ -43,6 +48,196 @@ def generate_colormap(num_classes):
    return color_map
class DBPostProcess(object):
"""
The post process for Differentiable Binarization (DB).
"""
def __init__(self, params):
self.thresh = params['thresh']
self.box_thresh = params['box_thresh']
self.max_candidates = params['max_candidates']
self.unclip_ratio = params['unclip_ratio']
self.min_size = 3
def boxes_from_bitmap(self, pred, _bitmap, dest_width, dest_height):
'''
_bitmap: single map with shape (1, H, W),
whose values are binarized as {0, 1}
'''
bitmap = _bitmap
height, width = bitmap.shape
outs = cv2.findContours((bitmap * 255).astype(np.uint8), cv2.RETR_LIST,
cv2.CHAIN_APPROX_SIMPLE)
if len(outs) == 3:
img, contours, _ = outs[0], outs[1], outs[2]
elif len(outs) == 2:
contours, _ = outs[0], outs[1]
num_contours = min(len(contours), self.max_candidates)
boxes = np.zeros((num_contours, 4, 2), dtype=np.int16)
scores = np.zeros((num_contours, ), dtype=np.float32)
for index in range(num_contours):
contour = contours[index]
points, sside = self.get_mini_boxes(contour)
if sside < self.min_size:
continue
points = np.array(points)
score = self.box_score_fast(pred, points.reshape(-1, 2))
if self.box_thresh > score:
continue
box = self.unclip(points).reshape(-1, 1, 2)
box, sside = self.get_mini_boxes(box)
if sside < self.min_size + 2:
continue
box = np.array(box)
if not isinstance(dest_width, int):
dest_width = dest_width.item()
dest_height = dest_height.item()
box[:, 0] = np.clip(
np.round(box[:, 0] / width * dest_width), 0, dest_width)
box[:, 1] = np.clip(
np.round(box[:, 1] / height * dest_height), 0, dest_height)
boxes[index, :, :] = box.astype(np.int16)
scores[index] = score
return boxes, scores
def unclip(self, box):
unclip_ratio = self.unclip_ratio
poly = Polygon(box)
distance = poly.area * unclip_ratio / poly.length
offset = pyclipper.PyclipperOffset()
offset.AddPath(box, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
expanded = np.array(offset.Execute(distance))
return expanded
def get_mini_boxes(self, contour):
bounding_box = cv2.minAreaRect(contour)
points = sorted(list(cv2.boxPoints(bounding_box)), key=lambda x: x[0])
index_1, index_2, index_3, index_4 = 0, 1, 2, 3
if points[1][1] > points[0][1]:
index_1 = 0
index_4 = 1
else:
index_1 = 1
index_4 = 0
if points[3][1] > points[2][1]:
index_2 = 2
index_3 = 3
else:
index_2 = 3
index_3 = 2
box = [
points[index_1], points[index_2], points[index_3], points[index_4]
]
return box, min(bounding_box[1])
def box_score_fast(self, bitmap, _box):
h, w = bitmap.shape[:2]
box = _box.copy()
xmin = np.clip(np.floor(box[:, 0].min()).astype(np.int), 0, w - 1)
xmax = np.clip(np.ceil(box[:, 0].max()).astype(np.int), 0, w - 1)
ymin = np.clip(np.floor(box[:, 1].min()).astype(np.int), 0, h - 1)
ymax = np.clip(np.ceil(box[:, 1].max()).astype(np.int), 0, h - 1)
mask = np.zeros((ymax - ymin + 1, xmax - xmin + 1), dtype=np.uint8)
box[:, 0] = box[:, 0] - xmin
box[:, 1] = box[:, 1] - ymin
cv2.fillPoly(mask, box.reshape(1, -1, 2).astype(np.int32), 1)
return cv2.mean(bitmap[ymin:ymax + 1, xmin:xmax + 1], mask)[0]
def __call__(self, pred, ratio_list):
pred = pred[:, 0, :, :]
segmentation = pred > self.thresh
boxes_batch = []
for batch_index in range(pred.shape[0]):
height, width = pred.shape[-2:]
tmp_boxes, tmp_scores = self.boxes_from_bitmap(
pred[batch_index], segmentation[batch_index], width, height)
boxes = []
for k in range(len(tmp_boxes)):
if tmp_scores[k] > self.box_thresh:
boxes.append(tmp_boxes[k])
if len(boxes) > 0:
boxes = np.array(boxes)
ratio_h, ratio_w = ratio_list[batch_index]
boxes[:, :, 0] = boxes[:, :, 0] / ratio_w
boxes[:, :, 1] = boxes[:, :, 1] / ratio_h
boxes_batch.append(boxes)
return boxes_batch
def __repr__(self):
return self.__class__.__name__ + \
" thresh: {0}, box_thresh: {1}, max_candidates: {2}, unclip_ratio: {3}, min_size: {4}".format(
self.thresh, self.box_thresh, self.max_candidates, self.unclip_ratio, self.min_size)
class FilterBoxes(object):
def __init__(self, width, height):
self.filter_width = width
self.filter_height = height
def order_points_clockwise(self, pts):
"""
reference from: https://github.com/jrosebr1/imutils/blob/master/imutils/perspective.py
# sort the points based on their x-coordinates
"""
xSorted = pts[np.argsort(pts[:, 0]), :]
# grab the left-most and right-most points from the sorted
# x-roodinate points
leftMost = xSorted[:2, :]
rightMost = xSorted[2:, :]
# now, sort the left-most coordinates according to their
# y-coordinates so we can grab the top-left and bottom-left
# points, respectively
leftMost = leftMost[np.argsort(leftMost[:, 1]), :]
(tl, bl) = leftMost
rightMost = rightMost[np.argsort(rightMost[:, 1]), :]
(tr, br) = rightMost
rect = np.array([tl, tr, br, bl], dtype="float32")
return rect
def clip_det_res(self, points, img_height, img_width):
for pno in range(4):
points[pno, 0] = int(min(max(points[pno, 0], 0), img_width - 1))
points[pno, 1] = int(min(max(points[pno, 1], 0), img_height - 1))
return points
def __call__(self, dt_boxes, image_shape):
img_height, img_width = image_shape[0:2]
dt_boxes_new = []
for box in dt_boxes:
box = self.order_points_clockwise(box)
box = self.clip_det_res(box, img_height, img_width)
rect_width = int(np.linalg.norm(box[0] - box[1]))
rect_height = int(np.linalg.norm(box[0] - box[3]))
if rect_width <= self.filter_width or \
rect_height <= self.filter_height:
continue
dt_boxes_new.append(box)
dt_boxes = np.array(dt_boxes_new)
return dt_boxes
def __repr__(self):
return self.__class__.__name__ + " filter_width: {0}, filter_height: {1}".format(
self.filter_width, self.filter_height)
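Taken together, DBPostProcess and FilterBoxes make up the text-detection postprocess stage: the first turns the network's probability map into quadrilateral boxes, the second drops boxes that end up too small once clipped to the source image. The sketch below shows how they might be wired together; the threshold values and the synthetic probability map are illustrative assumptions, and the import path assumes both classes are exported from paddle_serving_app.reader as in the OCR examples.

``` python
import numpy as np
from paddle_serving_app.reader import DBPostProcess, FilterBoxes  # assumed export path

post_func = DBPostProcess({
    "thresh": 0.3,          # binarization threshold on the probability map
    "box_thresh": 0.5,      # minimum mean score inside a candidate box
    "max_candidates": 1000,
    "unclip_ratio": 1.5,
})
filter_func = FilterBoxes(width=10, height=10)

# synthetic (N, C, H, W) probability map with one high-confidence region
pred = np.zeros((1, 1, 640, 640), dtype="float32")
pred[0, 0, 100:200, 100:300] = 0.9

boxes_batch = post_func(pred, [(1.0, 1.0)])            # one (ratio_h, ratio_w) pair per image
dt_boxes = filter_func(boxes_batch[0], (640, 640, 3))  # filter against the original image shape
print(dt_boxes.shape)                                  # (num_boxes, 4, 2)
```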
class SegPostprocess(object): class SegPostprocess(object):
def __init__(self, class_num): def __init__(self, class_num):
self.class_num = class_num self.class_num = class_num
...@@ -77,8 +272,7 @@ class SegPostprocess(object): ...@@ -77,8 +272,7 @@ class SegPostprocess(object):
result_png = score_png result_png = score_png
result_png = cv2.resize( result_png = cv2.resize(
result_png, result_png, (ori_shape[1], ori_shape[0]),
ori_shape[:2],
fx=0, fx=0,
fy=0, fy=0,
interpolation=cv2.INTER_CUBIC) interpolation=cv2.INTER_CUBIC)
...@@ -296,7 +490,10 @@ class File2Image(object): ...@@ -296,7 +490,10 @@ class File2Image(object):
pass pass
def __call__(self, img_path): def __call__(self, img_path):
fin = open(img_path) if py_version == 2:
fin = open(img_path)
else:
fin = open(img_path, "rb")
sample = fin.read() sample = fin.read()
data = np.fromstring(sample, np.uint8) data = np.fromstring(sample, np.uint8)
img = cv2.imdecode(data, cv2.IMREAD_COLOR) img = cv2.imdecode(data, cv2.IMREAD_COLOR)
...@@ -470,6 +667,57 @@ class Resize(object): ...@@ -470,6 +667,57 @@ class Resize(object):
_cv2_interpolation_to_str[self.interpolation]) _cv2_interpolation_to_str[self.interpolation])
class ResizeByFactor(object):
"""Resize the input numpy-array image so that both sides become multiples of `factor`, as many detection networks require.
Args:
factor (int): resize factor; width and height are rounded to multiples of this value. Default is 32.
max_side_len (int): maximum allowed side length; if width or height exceeds it, the image is scaled down first so that the longer side equals max_side_len. Default is 2400.
"""
def __init__(self, factor=32, max_side_len=2400):
self.factor = factor
self.max_side_len = max_side_len
def __call__(self, img):
h, w, _ = img.shape
resize_w = w
resize_h = h
if max(resize_h, resize_w) > self.max_side_len:
if resize_h > resize_w:
ratio = float(self.max_side_len) / resize_h
else:
ratio = float(self.max_side_len) / resize_w
else:
ratio = 1.
resize_h = int(resize_h * ratio)
resize_w = int(resize_w * ratio)
if resize_h % self.factor == 0:
resize_h = resize_h
elif resize_h // self.factor <= 1:
resize_h = self.factor
else:
resize_h = (resize_h // self.factor - 1) * self.factor
if resize_w % self.factor == 0:
resize_w = resize_w
elif resize_w // self.factor <= 1:
resize_w = self.factor
else:
resize_w = (resize_w // self.factor - 1) * self.factor
try:
if int(resize_w) <= 0 or int(resize_h) <= 0:
return None
im = cv2.resize(img, (int(resize_w), int(resize_h)))
except:
print(resize_w, resize_h)
sys.exit(0)
return im
def __repr__(self):
return self.__class__.__name__ + '(factor={0}, max_side_len={1})'.format(
self.factor, self.max_side_len)
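ResizeByFactor normally sits at the front of the detection preprocessing chain, since the DB detector expects input sides that are multiples of 32. Below is a hedged sketch of that chain; the 960-pixel cap and the normalization constants are the values commonly used in the OCR detection examples, assumed here rather than dictated by this diff.

``` python
import numpy as np
from paddle_serving_app.reader import Sequential, ResizeByFactor, Div, Normalize, Transpose  # assumed exports

det_preprocess = Sequential([
    ResizeByFactor(32, 960),   # round both sides down to multiples of 32, cap the longer side at 960
    Div(255),                  # scale pixel values to [0, 1]
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False),
    Transpose((2, 0, 1)),      # HWC -> CHW for the detector
])

img = np.random.randint(0, 255, (720, 1280, 3), dtype="uint8")
det_img = det_preprocess(img)
print(det_img.shape)           # (3, 480, 960) for this input size
```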
class PadStride(object): class PadStride(object):
def __init__(self, stride): def __init__(self, stride):
self.coarsest_stride = stride self.coarsest_stride = stride
......
...@@ -111,6 +111,10 @@ class LACReader(object): ...@@ -111,6 +111,10 @@ class LACReader(object):
return word_ids return word_ids
def parse_result(self, words, crf_decode): def parse_result(self, words, crf_decode):
try:
words = unicode(words, "utf-8")
except:
pass
tags = [self.id2label_dict[str(x[0])] for x in crf_decode] tags = [self.id2label_dict[str(x[0])] for x in crf_decode]
sent_out = [] sent_out = []
......
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import copy
import numpy as np
import math
import re
import string  # used by the "en_sensitive" character type below
import sys
import argparse
from paddle_serving_app.reader import Sequential, Resize, Transpose, Div, Normalize
class CharacterOps(object):
""" Convert between text-label and text-index """
def __init__(self, config):
self.character_type = config['character_type']
self.loss_type = config['loss_type']
if self.character_type == "en":
self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
dict_character = list(self.character_str)
elif self.character_type == "ch":
character_dict_path = config['character_dict_path']
self.character_str = ""
with open(character_dict_path, "rb") as fin:
lines = fin.readlines()
for line in lines:
line = line.decode('utf-8').strip("\n").strip("\r\n")
self.character_str += line
dict_character = list(self.character_str)
elif self.character_type == "en_sensitive":
# same with ASTER setting (use 94 char).
self.character_str = string.printable[:-6]
dict_character = list(self.character_str)
else:
self.character_str = None
assert self.character_str is not None, \
"Unsupported character type: {}".format(self.character_type)
self.beg_str = "sos"
self.end_str = "eos"
if self.loss_type == "attention":
dict_character = [self.beg_str, self.end_str] + dict_character
self.dict = {}
for i, char in enumerate(dict_character):
self.dict[char] = i
self.character = dict_character
def encode(self, text):
"""convert text-label into text-index.
input:
text: text labels of each image. [batch_size]
output:
text: concatenated text index for CTCLoss.
[sum(text_lengths)] = [text_index_0 + text_index_1 + ... + text_index_(n - 1)]
length: length of each text. [batch_size]
"""
if self.character_type == "en":
text = text.lower()
text_list = []
for char in text:
if char not in self.dict:
continue
text_list.append(self.dict[char])
text = np.array(text_list)
return text
def decode(self, text_index, is_remove_duplicate=False):
""" convert text-index into text-label. """
char_list = []
char_num = self.get_char_num()
if self.loss_type == "attention":
beg_idx = self.get_beg_end_flag_idx("beg")
end_idx = self.get_beg_end_flag_idx("end")
ignored_tokens = [beg_idx, end_idx]
else:
ignored_tokens = [char_num]
for idx in range(len(text_index)):
if text_index[idx] in ignored_tokens:
continue
if is_remove_duplicate:
if idx > 0 and text_index[idx - 1] == text_index[idx]:
continue
char_list.append(self.character[text_index[idx]])
text = ''.join(char_list)
return text
def get_char_num(self):
return len(self.character)
def get_beg_end_flag_idx(self, beg_or_end):
if self.loss_type == "attention":
if beg_or_end == "beg":
idx = np.array(self.dict[self.beg_str])
elif beg_or_end == "end":
idx = np.array(self.dict[self.end_str])
else:
assert False, "Unsupported type %s in get_beg_end_flag_idx"\
% beg_or_end
return idx
else:
err = "error in get_beg_end_flag_idx when using the loss %s"\
% (self.loss_type)
assert False, err
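A quick sketch of the label round trip CharacterOps provides, using the built-in English alphabet so that no dictionary file is needed; the config values below are illustrative only, and the class is assumed to be used inside (or imported from) this reader module.

``` python
char_ops = CharacterOps({"character_type": "en", "loss_type": "ctc"})

ids = char_ops.encode("Hello")                          # lower-cased, mapped to indices
print(char_ops.decode(ids))                             # -> "hello"
print(char_ops.decode(ids, is_remove_duplicate=True))   # -> "helo": consecutive repeats collapse
print(char_ops.get_char_num())                          # -> 36 (digits + a-z)
```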
class OCRReader(object):
def __init__(self):
args = self.parse_args()
image_shape = [int(v) for v in args.rec_image_shape.split(",")]
self.rec_image_shape = image_shape
self.character_type = args.rec_char_type
self.rec_batch_num = args.rec_batch_num
char_ops_params = {}
char_ops_params["character_type"] = args.rec_char_type
char_ops_params["character_dict_path"] = args.rec_char_dict_path
char_ops_params['loss_type'] = 'ctc'
self.char_ops = CharacterOps(char_ops_params)
def parse_args(self):
parser = argparse.ArgumentParser()
parser.add_argument("--rec_algorithm", type=str, default='CRNN')
parser.add_argument("--rec_model_dir", type=str)
parser.add_argument("--rec_image_shape", type=str, default="3, 32, 320")
parser.add_argument("--rec_char_type", type=str, default='ch')
parser.add_argument("--rec_batch_num", type=int, default=1)
parser.add_argument(
"--rec_char_dict_path", type=str, default="./ppocr_keys_v1.txt")
return parser.parse_args()
def resize_norm_img(self, img, max_wh_ratio):
imgC, imgH, imgW = self.rec_image_shape
if self.character_type == "ch":
imgW = int(32 * max_wh_ratio)
h = img.shape[0]
w = img.shape[1]
ratio = w / float(h)
if math.ceil(imgH * ratio) > imgW:
resized_w = imgW
else:
resized_w = int(math.ceil(imgH * ratio))
seq = Sequential([
Resize(imgH, resized_w), Transpose((2, 0, 1)), Div(255),
Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5], True)
])
resized_image = seq(img)
padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
padding_im[:, :, 0:resized_w] = resized_image
return padding_im
def preprocess(self, img_list):
img_num = len(img_list)
norm_img_batch = []
max_wh_ratio = 0
for ino in range(img_num):
h, w = img_list[ino].shape[0:2]
wh_ratio = w * 1.0 / h
max_wh_ratio = max(max_wh_ratio, wh_ratio)
for ino in range(img_num):
norm_img = self.resize_norm_img(img_list[ino], max_wh_ratio)
norm_img = norm_img[np.newaxis, :]
norm_img_batch.append(norm_img)
norm_img_batch = np.concatenate(norm_img_batch)
norm_img_batch = norm_img_batch.copy()
return norm_img_batch[0]
def postprocess(self, outputs):
rec_res = []
rec_idx_lod = outputs["ctc_greedy_decoder_0.tmp_0.lod"]
predict_lod = outputs["softmax_0.tmp_0.lod"]
rec_idx_batch = outputs["ctc_greedy_decoder_0.tmp_0"]
for rno in range(len(rec_idx_lod) - 1):
beg = rec_idx_lod[rno]
end = rec_idx_lod[rno + 1]
rec_idx_tmp = rec_idx_batch[beg:end, 0]
preds_text = self.char_ops.decode(rec_idx_tmp)
beg = predict_lod[rno]
end = predict_lod[rno + 1]
probs = outputs["softmax_0.tmp_0"][beg:end, :]
ind = np.argmax(probs, axis=1)
blank = probs.shape[1]
valid_ind = np.where(ind != (blank - 1))[0]
score = np.mean(probs[valid_ind, ind[valid_ind]])
rec_res.append([preds_text, score])
return rec_res
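OCRReader bundles the recognition-side preprocessing with the CTC decoding of the fetched tensors. A hedged end-to-end sketch of how it could wrap a Serving client call follows; the client config path, endpoint and image file are placeholders, and ppocr_keys_v1.txt (the default --rec_char_dict_path) is assumed to be present in the working directory.

``` python
import cv2
from paddle_serving_client import Client

client = Client()
client.load_client_config("ocr_rec_client/serving_client_conf.prototxt")  # placeholder path
client.connect(["127.0.0.1:9292"])                                        # placeholder endpoint

ocr_reader = OCRReader()
img = cv2.imread("text_crop.jpg")                    # a cropped text-line image
feed = {"image": ocr_reader.preprocess([img])}
fetch = ["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]
fetch_map = client.predict(feed=feed, fetch=fetch)
print(ocr_reader.postprocess(fetch_map))             # [[recognized_text, score]]
```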
...@@ -21,7 +21,12 @@ import google.protobuf.text_format ...@@ -21,7 +21,12 @@ import google.protobuf.text_format
import numpy as np import numpy as np
import time import time
import sys import sys
from .serving_client import PredictorRes
import grpc
from .proto import multi_lang_general_model_service_pb2
sys.path.append(
os.path.join(os.path.abspath(os.path.dirname(__file__)), 'proto'))
from .proto import multi_lang_general_model_service_pb2_grpc
int_type = 0 int_type = 0
float_type = 1 float_type = 1
...@@ -61,13 +66,18 @@ class SDKConfig(object): ...@@ -61,13 +66,18 @@ class SDKConfig(object):
self.tag_list = [] self.tag_list = []
self.cluster_list = [] self.cluster_list = []
self.variant_weight_list = [] self.variant_weight_list = []
self.rpc_timeout_ms = 20000
self.load_balance_strategy = "la"
def add_server_variant(self, tag, cluster, variant_weight): def add_server_variant(self, tag, cluster, variant_weight):
self.tag_list.append(tag) self.tag_list.append(tag)
self.cluster_list.append(cluster) self.cluster_list.append(cluster)
self.variant_weight_list.append(variant_weight) self.variant_weight_list.append(variant_weight)
def gen_desc(self): def set_load_banlance_strategy(self, strategy):
self.load_balance_strategy = strategy
def gen_desc(self, rpc_timeout_ms):
predictor_desc = sdk.Predictor() predictor_desc = sdk.Predictor()
predictor_desc.name = "general_model" predictor_desc.name = "general_model"
predictor_desc.service_name = \ predictor_desc.service_name = \
...@@ -86,7 +96,7 @@ class SDKConfig(object): ...@@ -86,7 +96,7 @@ class SDKConfig(object):
self.sdk_desc.predictors.extend([predictor_desc]) self.sdk_desc.predictors.extend([predictor_desc])
self.sdk_desc.default_variant_conf.tag = "default" self.sdk_desc.default_variant_conf.tag = "default"
self.sdk_desc.default_variant_conf.connection_conf.connect_timeout_ms = 2000 self.sdk_desc.default_variant_conf.connection_conf.connect_timeout_ms = 2000
self.sdk_desc.default_variant_conf.connection_conf.rpc_timeout_ms = 20000 self.sdk_desc.default_variant_conf.connection_conf.rpc_timeout_ms = rpc_timeout_ms
self.sdk_desc.default_variant_conf.connection_conf.connect_retry_count = 2 self.sdk_desc.default_variant_conf.connection_conf.connect_retry_count = 2
self.sdk_desc.default_variant_conf.connection_conf.max_connection_per_host = 100 self.sdk_desc.default_variant_conf.connection_conf.max_connection_per_host = 100
self.sdk_desc.default_variant_conf.connection_conf.hedge_request_timeout_ms = -1 self.sdk_desc.default_variant_conf.connection_conf.hedge_request_timeout_ms = -1
...@@ -119,6 +129,9 @@ class Client(object): ...@@ -119,6 +129,9 @@ class Client(object):
self.profile_ = _Profiler() self.profile_ = _Profiler()
self.all_numpy_input = True self.all_numpy_input = True
self.has_numpy_input = False self.has_numpy_input = False
self.rpc_timeout_ms = 20000
from .serving_client import PredictorRes
self.predictorres_constructor = PredictorRes
def load_client_config(self, path): def load_client_config(self, path):
from .serving_client import PredictorClient from .serving_client import PredictorClient
...@@ -171,13 +184,19 @@ class Client(object): ...@@ -171,13 +184,19 @@ class Client(object):
self.predictor_sdk_.add_server_variant(tag, cluster, self.predictor_sdk_.add_server_variant(tag, cluster,
str(variant_weight)) str(variant_weight))
def set_rpc_timeout_ms(self, rpc_timeout):
if not isinstance(rpc_timeout, int):
raise ValueError("rpc_timeout must be int type.")
else:
self.rpc_timeout_ms = rpc_timeout
def connect(self, endpoints=None): def connect(self, endpoints=None):
# check whether current endpoint is available # check whether current endpoint is available
# init from client config # init from client config
# create predictor here # create predictor here
if endpoints is None: if endpoints is None:
if self.predictor_sdk_ is None: if self.predictor_sdk_ is None:
raise SystemExit( raise ValueError(
"You must set the endpoints parameter or use add_variant function to create a variant." "You must set the endpoints parameter or use add_variant function to create a variant."
) )
else: else:
...@@ -188,7 +207,7 @@ class Client(object): ...@@ -188,7 +207,7 @@ class Client(object):
print( print(
"parameter endpoints({}) will not take effect, because you use the add_variant function.". "parameter endpoints({}) will not take effect, because you use the add_variant function.".
format(endpoints)) format(endpoints))
sdk_desc = self.predictor_sdk_.gen_desc() sdk_desc = self.predictor_sdk_.gen_desc(self.rpc_timeout_ms)
self.client_handle_.create_predictor_by_desc(sdk_desc.SerializeToString( self.client_handle_.create_predictor_by_desc(sdk_desc.SerializeToString(
)) ))
...@@ -203,7 +222,7 @@ class Client(object): ...@@ -203,7 +222,7 @@ class Client(object):
return return
if isinstance(feed[key], if isinstance(feed[key],
list) and len(feed[key]) != self.feed_tensor_len[key]: list) and len(feed[key]) != self.feed_tensor_len[key]:
raise SystemExit("The shape of feed tensor {} not match.".format( raise ValueError("The shape of feed tensor {} not match.".format(
key)) key))
if type(feed[key]).__module__ == np.__name__ and np.size(feed[ if type(feed[key]).__module__ == np.__name__ and np.size(feed[
key]) != self.feed_tensor_len[key]: key]) != self.feed_tensor_len[key]:
...@@ -292,7 +311,7 @@ class Client(object): ...@@ -292,7 +311,7 @@ class Client(object):
self.profile_.record('py_prepro_1') self.profile_.record('py_prepro_1')
self.profile_.record('py_client_infer_0') self.profile_.record('py_client_infer_0')
result_batch_handle = PredictorRes() result_batch_handle = self.predictorres_constructor()
if self.all_numpy_input: if self.all_numpy_input:
res = self.client_handle_.numpy_predict( res = self.client_handle_.numpy_predict(
float_slot_batch, float_feed_names, float_shape, int_slot_batch, float_slot_batch, float_feed_names, float_shape, int_slot_batch,
...@@ -304,7 +323,7 @@ class Client(object): ...@@ -304,7 +323,7 @@ class Client(object):
int_feed_names, int_shape, fetch_names, result_batch_handle, int_feed_names, int_shape, fetch_names, result_batch_handle,
self.pid) self.pid)
else: else:
raise SystemExit( raise ValueError(
"Please make sure the inputs are all in list type or all in numpy.array type" "Please make sure the inputs are all in list type or all in numpy.array type"
) )
...@@ -360,3 +379,172 @@ class Client(object): ...@@ -360,3 +379,172 @@ class Client(object):
def release(self): def release(self):
self.client_handle_.destroy_predictor() self.client_handle_.destroy_predictor()
self.client_handle_ = None self.client_handle_ = None
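The new rpc_timeout_ms setting only takes effect when it is applied before connect(), because gen_desc() now bakes the value into the SDK descriptor at connection time. A brief sketch; the config path and endpoint are placeholders.

``` python
from paddle_serving_client import Client

client = Client()
client.load_client_config("serving_client_conf.prototxt")
client.set_rpc_timeout_ms(100000)     # must be called before connect()
client.connect(["127.0.0.1:9393"])
```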
class MultiLangClient(object):
def __init__(self):
self.channel_ = None
def load_client_config(self, path):
if not isinstance(path, str):
raise Exception("MultiLangClient only supports a single model config (str path) temporarily")
self._parse_model_config(path)
def connect(self, endpoint):
self.channel_ = grpc.insecure_channel(endpoint[0]) #TODO
self.stub_ = multi_lang_general_model_service_pb2_grpc.MultiLangGeneralModelServiceStub(
self.channel_)
def _flatten_list(self, nested_list):
for item in nested_list:
if isinstance(item, (list, tuple)):
for sub_item in self._flatten_list(item):
yield sub_item
else:
yield item
def _parse_model_config(self, model_config_path):
model_conf = m_config.GeneralModelConfig()
f = open(model_config_path, 'r')
model_conf = google.protobuf.text_format.Merge(
str(f.read()), model_conf)
self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
self.feed_types_ = {}
self.feed_shapes_ = {}
self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
self.fetch_types_ = {}
self.lod_tensor_set_ = set()
for i, var in enumerate(model_conf.feed_var):
self.feed_types_[var.alias_name] = var.feed_type
self.feed_shapes_[var.alias_name] = var.shape
if var.is_lod_tensor:
self.lod_tensor_set_.add(var.alias_name)
else:
counter = 1
for dim in self.feed_shapes_[var.alias_name]:
counter *= dim
for i, var in enumerate(model_conf.fetch_var):
self.fetch_types_[var.alias_name] = var.fetch_type
if var.is_lod_tensor:
self.lod_tensor_set_.add(var.alias_name)
def _pack_feed_data(self, feed, fetch, is_python):
req = multi_lang_general_model_service_pb2.Request()
req.fetch_var_names.extend(fetch)
req.feed_var_names.extend(feed.keys())
req.is_python = is_python
feed_batch = None
if isinstance(feed, dict):
feed_batch = [feed]
elif isinstance(feed, list):
feed_batch = feed
else:
raise Exception("{} not support".format(type(feed)))
init_feed_names = False
for feed_data in feed_batch:
inst = multi_lang_general_model_service_pb2.FeedInst()
for name in req.feed_var_names:
tensor = multi_lang_general_model_service_pb2.Tensor()
var = feed_data[name]
v_type = self.feed_types_[name]
if is_python:
data = None
if isinstance(var, list):
if v_type == 0: # int64
data = np.array(var, dtype="int64")
elif v_type == 1: # float32
data = np.array(var, dtype="float32")
else:
raise Exception("error type.")
else:
data = var
if var.dtype == "float64":
data = data.astype("float32")
tensor.data = data.tobytes()
else:
if v_type == 0: # int64
if isinstance(var, np.ndarray):
tensor.int64_data.extend(var.reshape(-1).tolist())
else:
tensor.int64_data.extend(self._flatten_list(var))
elif v_type == 1: # float32
if isinstance(var, np.ndarray):
tensor.float_data.extend(var.reshape(-1).tolist())
else:
tensor.float_data.extend(self._flatten_list(var))
else:
raise Exception("error type.")
if isinstance(var, np.ndarray):
tensor.shape.extend(list(var.shape))
else:
tensor.shape.extend(self.feed_shapes_[name])
inst.tensor_array.append(tensor)
req.insts.append(inst)
return req
def _unpack_resp(self, resp, fetch, is_python, need_variant_tag):
result_map = {}
inst = resp.outputs[0].insts[0]
tag = resp.tag
for i, name in enumerate(fetch):
var = inst.tensor_array[i]
v_type = self.fetch_types_[name]
if is_python:
if v_type == 0: # int64
result_map[name] = np.frombuffer(var.data, dtype="int64")
elif v_type == 1: # float32
result_map[name] = np.frombuffer(var.data, dtype="float32")
else:
raise Exception("error type.")
else:
if v_type == 0: # int64
result_map[name] = np.array(
list(var.int64_data), dtype="int64")
elif v_type == 1: # float32
result_map[name] = np.array(
list(var.float_data), dtype="float32")
else:
raise Exception("error type.")
result_map[name].shape = list(var.shape)
if name in self.lod_tensor_set_:
result_map["{}.lod".format(name)] = np.array(list(var.lod))
return result_map if not need_variant_tag else [result_map, tag]
def _done_callback_func(self, fetch, is_python, need_variant_tag):
def unpack_resp(resp):
return self._unpack_resp(resp, fetch, is_python, need_variant_tag)
return unpack_resp
def predict(self,
feed,
fetch,
need_variant_tag=False,
asyn=False,
is_python=True):
req = self._pack_feed_data(feed, fetch, is_python=is_python)
if not asyn:
resp = self.stub_.inference(req)
return self._unpack_resp(
resp,
fetch,
is_python=is_python,
need_variant_tag=need_variant_tag)
else:
call_future = self.stub_.inference.future(req)
return MultiLangPredictFuture(
call_future,
self._done_callback_func(
fetch,
is_python=is_python,
need_variant_tag=need_variant_tag))
class MultiLangPredictFuture(object):
def __init__(self, call_future, callback_func):
self.call_future_ = call_future
self.callback_func_ = callback_func
def result(self):
resp = self.call_future_.result()
return self.callback_func_(resp)
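MultiLangClient talks to the new gRPC multi-language endpoint instead of the brpc one, and supports both blocking and future-based calls. A minimal sketch, assuming a server started with --use_multilang is listening on the port; the config path and the fit_a_line-style feed/fetch names are placeholders.

``` python
from paddle_serving_client import MultiLangClient

client = MultiLangClient()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])

x = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583,
     -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]

# blocking call
fetch_map = client.predict(feed={"x": x}, fetch=["price"])

# asynchronous call: returns a future whose result() unpacks the response
future = client.predict(feed={"x": x}, fetch=["price"], asyn=True)
fetch_map = future.result()
```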
...@@ -13,8 +13,8 @@ ...@@ -13,8 +13,8 @@
# limitations under the License. # limitations under the License.
# pylint: disable=doc-string-missing # pylint: disable=doc-string-missing
import grpc import grpc
import general_python_service_pb2 from .proto import general_python_service_pb2
import general_python_service_pb2_grpc from .proto import general_python_service_pb2_grpc
import numpy as np import numpy as np
...@@ -30,27 +30,33 @@ class PyClient(object): ...@@ -30,27 +30,33 @@ class PyClient(object):
def _pack_data_for_infer(self, feed_data): def _pack_data_for_infer(self, feed_data):
req = general_python_service_pb2.Request() req = general_python_service_pb2.Request()
for name, data in feed_data.items(): for name, data in feed_data.items():
if not isinstance(data, np.ndarray): if isinstance(data, list):
raise TypeError( data = np.array(data)
"only numpy array type is supported temporarily.") elif not isinstance(data, np.ndarray):
data2bytes = np.ndarray.tobytes(data) raise TypeError("only list and numpy array type is supported.")
req.feed_var_names.append(name) req.feed_var_names.append(name)
req.feed_insts.append(data2bytes) req.feed_insts.append(data.tobytes())
req.shape.append(np.array(data.shape, dtype="int32").tobytes())
req.type.append(str(data.dtype))
return req return req
def predict(self, feed, fetch_with_type): def predict(self, feed, fetch):
if not isinstance(feed, dict): if not isinstance(feed, dict):
raise TypeError( raise TypeError(
"feed must be dict type with format: {name: value}.") "feed must be dict type with format: {name: value}.")
if not isinstance(fetch_with_type, dict): if not isinstance(fetch, list):
raise TypeError( raise TypeError(
"fetch_with_type must be dict type with format: {name : type}.") "fetch must be list type with format: [name].")
req = self._pack_data_for_infer(feed) req = self._pack_data_for_infer(feed)
resp = self._stub.inference(req) resp = self._stub.inference(req)
fetch_map = {} if resp.ecode != 0:
return {"ecode": resp.ecode, "error_info": resp.error_info}
fetch_map = {"ecode": resp.ecode}
for idx, name in enumerate(resp.fetch_var_names): for idx, name in enumerate(resp.fetch_var_names):
if name not in fetch_with_type: if name not in fetch:
continue continue
fetch_map[name] = np.frombuffer( fetch_map[name] = np.frombuffer(
resp.fetch_insts[idx], dtype=fetch_with_type[name]) resp.fetch_insts[idx], dtype=resp.type[idx])
fetch_map[name].shape = np.frombuffer(
resp.shape[idx], dtype="int32")
return fetch_map return fetch_map
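For completeness, a rough sketch of how the reworked PyClient.predict is meant to be called; the constructor and connection setup are outside this hunk, so the connect call and the variable names below are assumptions for illustration only.

``` python
import numpy as np

client = PyClient()
client.connect("127.0.0.1:9292")      # assumed API: endpoint of the python pipeline service

feed = {"x": np.array([[0.1, 0.2, 0.3]], dtype="float32")}
ret = client.predict(feed=feed, fetch=["prediction"])
if ret["ecode"] != 0:                 # a non-zero ecode carries error_info instead of tensors
    print("inference failed:", ret["error_info"])
else:
    print(ret["prediction"], ret["prediction"].shape)
```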
...@@ -17,6 +17,7 @@ import sys ...@@ -17,6 +17,7 @@ import sys
import subprocess import subprocess
import argparse import argparse
from multiprocessing import Pool from multiprocessing import Pool
import numpy as np
def benchmark_args(): def benchmark_args():
...@@ -35,6 +36,17 @@ def benchmark_args(): ...@@ -35,6 +36,17 @@ def benchmark_args():
return parser.parse_args() return parser.parse_args()
def show_latency(latency_list):
latency_array = np.array(latency_list)
info = "latency:\n"
info += "mean :{} ms\n".format(np.mean(latency_array))
info += "median :{} ms\n".format(np.median(latency_array))
info += "80 percent :{} ms\n".format(np.percentile(latency_array, 80))
info += "90 percent :{} ms\n".format(np.percentile(latency_array, 90))
info += "99 percent :{} ms\n".format(np.percentile(latency_array, 99))
sys.stderr.write(info)
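show_latency simply prints summary statistics over a list of per-request latencies in milliseconds; a small sketch of the benchmark loop that would feed it (do_one_request is a stand-in for a real client.predict call).

``` python
import time

def do_one_request():
    time.sleep(0.01)          # placeholder for a real client.predict(...) call

latency_list = []
for _ in range(100):
    start = time.time()
    do_one_request()
    latency_list.append((time.time() - start) * 1000.0)   # seconds -> milliseconds

show_latency(latency_list)
```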
class MultiThreadRunner(object): class MultiThreadRunner(object):
def __init__(self): def __init__(self):
pass pass
......
...@@ -23,6 +23,17 @@ import paddle_serving_server as paddle_serving_server ...@@ -23,6 +23,17 @@ import paddle_serving_server as paddle_serving_server
from .version import serving_server_version from .version import serving_server_version
from contextlib import closing from contextlib import closing
import collections import collections
import fcntl
import numpy as np
import grpc
from .proto import multi_lang_general_model_service_pb2
import sys
sys.path.append(
os.path.join(os.path.abspath(os.path.dirname(__file__)), 'proto'))
from .proto import multi_lang_general_model_service_pb2_grpc
from multiprocessing import Pool, Process
from concurrent import futures
class OpMaker(object): class OpMaker(object):
...@@ -322,6 +333,10 @@ class Server(object): ...@@ -322,6 +333,10 @@ class Server(object):
bin_url = "https://paddle-serving.bj.bcebos.com/bin/" + tar_name bin_url = "https://paddle-serving.bj.bcebos.com/bin/" + tar_name
self.server_path = os.path.join(self.module_path, floder_name) self.server_path = os.path.join(self.module_path, floder_name)
#acquire lock
version_file = open("{}/version.py".format(self.module_path), "r")
fcntl.flock(version_file, fcntl.LOCK_EX)
if not os.path.exists(self.server_path): if not os.path.exists(self.server_path):
print('First time run, downloading PaddleServing components ...') print('First time run, downloading PaddleServing components ...')
r = os.system('wget ' + bin_url + ' --no-check-certificate') r = os.system('wget ' + bin_url + ' --no-check-certificate')
...@@ -345,6 +360,8 @@ class Server(object): ...@@ -345,6 +360,8 @@ class Server(object):
format(self.module_path)) format(self.module_path))
finally: finally:
os.remove(tar_name) os.remove(tar_name)
#release lock
version_file.close()
os.chdir(self.cur_path) os.chdir(self.cur_path)
self.bin_path = self.server_path + "/serving" self.bin_path = self.server_path + "/serving"
...@@ -421,3 +438,158 @@ class Server(object): ...@@ -421,3 +438,158 @@ class Server(object):
print("Going to Run Command") print("Going to Run Command")
print(command) print(command)
os.system(command) os.system(command)
class MultiLangServerService(
multi_lang_general_model_service_pb2_grpc.MultiLangGeneralModelService):
def __init__(self, model_config_path, endpoints):
from paddle_serving_client import Client
self._parse_model_config(model_config_path)
self.bclient_ = Client()
self.bclient_.load_client_config(
"{}/serving_server_conf.prototxt".format(model_config_path))
self.bclient_.connect(endpoints)
def _parse_model_config(self, model_config_path):
model_conf = m_config.GeneralModelConfig()
f = open("{}/serving_server_conf.prototxt".format(model_config_path),
'r')
model_conf = google.protobuf.text_format.Merge(
str(f.read()), model_conf)
self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
self.feed_types_ = {}
self.feed_shapes_ = {}
self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
self.fetch_types_ = {}
self.lod_tensor_set_ = set()
for i, var in enumerate(model_conf.feed_var):
self.feed_types_[var.alias_name] = var.feed_type
self.feed_shapes_[var.alias_name] = var.shape
if var.is_lod_tensor:
self.lod_tensor_set_.add(var.alias_name)
for i, var in enumerate(model_conf.fetch_var):
self.fetch_types_[var.alias_name] = var.fetch_type
if var.is_lod_tensor:
self.lod_tensor_set_.add(var.alias_name)
def _flatten_list(self, nested_list):
for item in nested_list:
if isinstance(item, (list, tuple)):
for sub_item in self._flatten_list(item):
yield sub_item
else:
yield item
def _unpack_request(self, request):
feed_names = list(request.feed_var_names)
fetch_names = list(request.fetch_var_names)
is_python = request.is_python
feed_batch = []
for feed_inst in request.insts:
feed_dict = {}
for idx, name in enumerate(feed_names):
var = feed_inst.tensor_array[idx]
v_type = self.feed_types_[name]
data = None
if is_python:
if v_type == 0:
data = np.frombuffer(var.data, dtype="int64")
elif v_type == 1:
data = np.frombuffer(var.data, dtype="float32")
else:
raise Exception("error type.")
else:
if v_type == 0: # int64
data = np.array(list(var.int64_data), dtype="int64")
elif v_type == 1: # float32
data = np.array(list(var.float_data), dtype="float32")
else:
raise Exception("error type.")
data.shape = list(feed_inst.tensor_array[idx].shape)
feed_dict[name] = data
feed_batch.append(feed_dict)
return feed_batch, fetch_names, is_python
def _pack_resp_package(self, result, fetch_names, is_python, tag):
resp = multi_lang_general_model_service_pb2.Response()
# Only one model is supported temporarily
model_output = multi_lang_general_model_service_pb2.ModelOutput()
inst = multi_lang_general_model_service_pb2.FetchInst()
for idx, name in enumerate(fetch_names):
tensor = multi_lang_general_model_service_pb2.Tensor()
v_type = self.fetch_types_[name]
if is_python:
tensor.data = result[name].tobytes()
else:
if v_type == 0: # int64
tensor.int64_data.extend(result[name].reshape(-1).tolist())
elif v_type == 1: # float32
tensor.float_data.extend(result[name].reshape(-1).tolist())
else:
raise Exception("error type.")
tensor.shape.extend(list(result[name].shape))
if name in self.lod_tensor_set_:
tensor.lod.extend(result["{}.lod".format(name)].tolist())
inst.tensor_array.append(tensor)
model_output.insts.append(inst)
resp.outputs.append(model_output)
resp.tag = tag
return resp
def inference(self, request, context):
feed_dict, fetch_names, is_python = self._unpack_request(request)
data, tag = self.bclient_.predict(
feed=feed_dict, fetch=fetch_names, need_variant_tag=True)
return self._pack_resp_package(data, fetch_names, is_python, tag)
class MultiLangServer(object):
def __init__(self, worker_num=2):
self.bserver_ = Server()
self.worker_num_ = worker_num
def set_op_sequence(self, op_seq):
self.bserver_.set_op_sequence(op_seq)
def load_model_config(self, model_config_path):
if not isinstance(model_config_path, str):
raise Exception(
"MultiLangServer only supports a single model config temporarily")
self.bserver_.load_model_config(model_config_path)
self.model_config_path_ = model_config_path
def prepare_server(self, workdir=None, port=9292, device="cpu"):
default_port = 12000
self.port_list_ = []
for i in range(1000):
if default_port + i != port and self._port_is_available(default_port
+ i):
self.port_list_.append(default_port + i)
break
self.bserver_.prepare_server(
workdir=workdir, port=self.port_list_[0], device=device)
self.gport_ = port
def _launch_brpc_service(self, bserver):
bserver.run_server()
def _port_is_available(self, port):
with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as sock:
sock.settimeout(2)
result = sock.connect_ex(('0.0.0.0', port))
return result != 0
def run_server(self):
p_bserver = Process(
target=self._launch_brpc_service, args=(self.bserver_, ))
p_bserver.start()
server = grpc.server(
futures.ThreadPoolExecutor(max_workers=self.worker_num_))
multi_lang_general_model_service_pb2_grpc.add_MultiLangGeneralModelServiceServicer_to_server(
MultiLangServerService(self.model_config_path_,
["0.0.0.0:{}".format(self.port_list_[0])]),
server)
server.add_insecure_port('[::]:{}'.format(self.gport_))
server.start()
p_bserver.join()
server.wait_for_termination()
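MultiLangServer wraps an ordinary brpc Server: the brpc service is launched in a child process on an automatically chosen port, while the gRPC front end listens on the port passed to prepare_server. A hedged sketch of standing one up programmatically; the model path is a placeholder and the op names follow the standard serve script. This mirrors what python -m paddle_serving_server.serve --use_multilang now does internally.

``` python
from paddle_serving_server import OpMaker, OpSeqMaker, MultiLangServer

op_maker = OpMaker()
op_seq_maker = OpSeqMaker()
op_seq_maker.add_op(op_maker.create('general_reader'))
op_seq_maker.add_op(op_maker.create('general_infer'))
op_seq_maker.add_op(op_maker.create('general_response'))

server = MultiLangServer()
server.set_op_sequence(op_seq_maker.get_op_sequence())
server.load_model_config("uci_housing_model")                 # placeholder model dir
server.prepare_server(workdir="workdir", port=9393, device="cpu")
server.run_server()   # blocks: starts the brpc worker process and the gRPC service
```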
...@@ -20,7 +20,7 @@ Usage: ...@@ -20,7 +20,7 @@ Usage:
import os import os
import time import time
import argparse import argparse
import commands import subprocess
import datetime import datetime
import shutil import shutil
import tarfile import tarfile
...@@ -209,7 +209,7 @@ class HadoopMonitor(Monitor): ...@@ -209,7 +209,7 @@ class HadoopMonitor(Monitor):
remote_filepath = os.path.join(path, filename) remote_filepath = os.path.join(path, filename)
cmd = '{} -ls {} 2>/dev/null'.format(self._cmd_prefix, remote_filepath) cmd = '{} -ls {} 2>/dev/null'.format(self._cmd_prefix, remote_filepath)
_LOGGER.debug('check cmd: {}'.format(cmd)) _LOGGER.debug('check cmd: {}'.format(cmd))
[status, output] = commands.getstatusoutput(cmd) [status, output] = subprocess.getstatusoutput(cmd)
_LOGGER.debug('resp: {}'.format(output)) _LOGGER.debug('resp: {}'.format(output))
if status == 0: if status == 0:
[_, _, _, _, _, mdate, mtime, _] = output.split('\n')[-1].split() [_, _, _, _, _, mdate, mtime, _] = output.split('\n')[-1].split()
......
...@@ -40,15 +40,23 @@ def parse_args(): # pylint: disable=doc-string-missing ...@@ -40,15 +40,23 @@ def parse_args(): # pylint: disable=doc-string-missing
parser.add_argument( parser.add_argument(
"--device", type=str, default="cpu", help="Type of device") "--device", type=str, default="cpu", help="Type of device")
parser.add_argument( parser.add_argument(
"--mem_optim", type=bool, default=False, help="Memory optimize") "--mem_optim",
default=False,
action="store_true",
help="Memory optimize")
parser.add_argument( parser.add_argument(
"--ir_optim", type=bool, default=False, help="Graph optimize") "--ir_optim", default=False, action="store_true", help="Graph optimize")
parser.add_argument("--use_mkl", type=bool, default=False, help="Use MKL") parser.add_argument(
"--use_mkl", default=False, action="store_true", help="Use MKL")
parser.add_argument( parser.add_argument(
"--max_body_size", "--max_body_size",
type=int, type=int,
default=512 * 1024 * 1024, default=512 * 1024 * 1024,
help="Limit sizes of messages") help="Limit sizes of messages")
parser.add_argument(
"--use_multilang",
action='store_true',
help="Use Multi-language-service")
return parser.parse_args() return parser.parse_args()
...@@ -63,6 +71,7 @@ def start_standard_model(): # pylint: disable=doc-string-missing ...@@ -63,6 +71,7 @@ def start_standard_model(): # pylint: disable=doc-string-missing
ir_optim = args.ir_optim ir_optim = args.ir_optim
max_body_size = args.max_body_size max_body_size = args.max_body_size
use_mkl = args.use_mkl use_mkl = args.use_mkl
use_multilang = args.use_multilang
if model == "": if model == "":
print("You must specify your serving model") print("You must specify your serving model")
...@@ -79,14 +88,19 @@ def start_standard_model(): # pylint: disable=doc-string-missing ...@@ -79,14 +88,19 @@ def start_standard_model(): # pylint: disable=doc-string-missing
op_seq_maker.add_op(general_infer_op) op_seq_maker.add_op(general_infer_op)
op_seq_maker.add_op(general_response_op) op_seq_maker.add_op(general_response_op)
server = serving.Server() server = None
server.set_op_sequence(op_seq_maker.get_op_sequence()) if use_multilang:
server.set_num_threads(thread_num) server = serving.MultiLangServer()
server.set_memory_optimize(mem_optim) server.set_op_sequence(op_seq_maker.get_op_sequence())
server.set_ir_optimize(ir_optim) else:
server.use_mkl(use_mkl) server = serving.Server()
server.set_max_body_size(max_body_size) server.set_op_sequence(op_seq_maker.get_op_sequence())
server.set_port(port) server.set_num_threads(thread_num)
server.set_memory_optimize(mem_optim)
server.set_ir_optimize(ir_optim)
server.use_mkl(use_mkl)
server.set_max_body_size(max_body_size)
server.set_port(port)
server.load_model_config(model) server.load_model_config(model)
server.prepare_server(workdir=workdir, port=port, device=device) server.prepare_server(workdir=workdir, port=port, device=device)
......
...@@ -86,7 +86,7 @@ class WebService(object): ...@@ -86,7 +86,7 @@ class WebService(object):
for key in fetch_map: for key in fetch_map:
fetch_map[key] = fetch_map[key].tolist() fetch_map[key] = fetch_map[key].tolist()
fetch_map = self.postprocess( fetch_map = self.postprocess(
feed=feed, fetch=fetch, fetch_map=fetch_map) feed=request.json["feed"], fetch=fetch, fetch_map=fetch_map)
result = {"result": fetch_map} result = {"result": fetch_map}
except ValueError: except ValueError:
result = {"result": "Request Value Error"} result = {"result": "Request Value Error"}
......
...@@ -25,6 +25,17 @@ from .version import serving_server_version ...@@ -25,6 +25,17 @@ from .version import serving_server_version
from contextlib import closing from contextlib import closing
import argparse import argparse
import collections import collections
import fcntl
import numpy as np
import grpc
from .proto import multi_lang_general_model_service_pb2
import sys
sys.path.append(
os.path.join(os.path.abspath(os.path.dirname(__file__)), 'proto'))
from .proto import multi_lang_general_model_service_pb2_grpc
from multiprocessing import Pool, Process
from concurrent import futures
def serve_args(): def serve_args():
...@@ -46,9 +57,12 @@ def serve_args(): ...@@ -46,9 +57,12 @@ def serve_args():
parser.add_argument( parser.add_argument(
"--name", type=str, default="None", help="Default service name") "--name", type=str, default="None", help="Default service name")
parser.add_argument( parser.add_argument(
"--mem_optim", type=bool, default=False, help="Memory optimize") "--mem_optim",
default=False,
action="store_true",
help="Memory optimize")
parser.add_argument( parser.add_argument(
"--ir_optim", type=bool, default=False, help="Graph optimize") "--ir_optim", default=False, action="store_true", help="Graph optimize")
parser.add_argument( parser.add_argument(
"--max_body_size", "--max_body_size",
type=int, type=int,
...@@ -347,6 +361,11 @@ class Server(object): ...@@ -347,6 +361,11 @@ class Server(object):
download_flag = "{}/{}.is_download".format(self.module_path, download_flag = "{}/{}.is_download".format(self.module_path,
folder_name) folder_name)
#acquire lock
version_file = open("{}/version.py".format(self.module_path), "r")
fcntl.flock(version_file, fcntl.LOCK_EX)
if os.path.exists(download_flag): if os.path.exists(download_flag):
os.chdir(self.cur_path) os.chdir(self.cur_path)
self.bin_path = self.server_path + "/serving" self.bin_path = self.server_path + "/serving"
...@@ -377,6 +396,8 @@ class Server(object): ...@@ -377,6 +396,8 @@ class Server(object):
format(self.module_path)) format(self.module_path))
finally: finally:
os.remove(tar_name) os.remove(tar_name)
#release lock
version_file.close()
os.chdir(self.cur_path) os.chdir(self.cur_path)
self.bin_path = self.server_path + "/serving" self.bin_path = self.server_path + "/serving"
...@@ -461,3 +482,158 @@ class Server(object): ...@@ -461,3 +482,158 @@ class Server(object):
print(command) print(command)
os.system(command) os.system(command)
class MultiLangServerService(
multi_lang_general_model_service_pb2_grpc.MultiLangGeneralModelService):
def __init__(self, model_config_path, endpoints):
from paddle_serving_client import Client
self._parse_model_config(model_config_path)
self.bclient_ = Client()
self.bclient_.load_client_config(
"{}/serving_server_conf.prototxt".format(model_config_path))
self.bclient_.connect(endpoints)
def _parse_model_config(self, model_config_path):
model_conf = m_config.GeneralModelConfig()
f = open("{}/serving_server_conf.prototxt".format(model_config_path),
'r')
model_conf = google.protobuf.text_format.Merge(
str(f.read()), model_conf)
self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
self.feed_types_ = {}
self.feed_shapes_ = {}
self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
self.fetch_types_ = {}
self.lod_tensor_set_ = set()
for i, var in enumerate(model_conf.feed_var):
self.feed_types_[var.alias_name] = var.feed_type
self.feed_shapes_[var.alias_name] = var.shape
if var.is_lod_tensor:
self.lod_tensor_set_.add(var.alias_name)
for i, var in enumerate(model_conf.fetch_var):
self.fetch_types_[var.alias_name] = var.fetch_type
if var.is_lod_tensor:
self.lod_tensor_set_.add(var.alias_name)
def _flatten_list(self, nested_list):
for item in nested_list:
if isinstance(item, (list, tuple)):
for sub_item in self._flatten_list(item):
yield sub_item
else:
yield item
def _unpack_request(self, request):
feed_names = list(request.feed_var_names)
fetch_names = list(request.fetch_var_names)
is_python = request.is_python
feed_batch = []
for feed_inst in request.insts:
feed_dict = {}
for idx, name in enumerate(feed_names):
var = feed_inst.tensor_array[idx]
v_type = self.feed_types_[name]
data = None
if is_python:
if v_type == 0:
data = np.frombuffer(var.data, dtype="int64")
elif v_type == 1:
data = np.frombuffer(var.data, dtype="float32")
else:
raise Exception("error type.")
else:
if v_type == 0: # int64
data = np.array(list(var.int64_data), dtype="int64")
elif v_type == 1: # float32
data = np.array(list(var.float_data), dtype="float32")
else:
raise Exception("error type.")
data.shape = list(feed_inst.tensor_array[idx].shape)
feed_dict[name] = data
feed_batch.append(feed_dict)
return feed_batch, fetch_names, is_python
def _pack_resp_package(self, result, fetch_names, is_python, tag):
resp = multi_lang_general_model_service_pb2.Response()
# Only one model is supported temporarily
model_output = multi_lang_general_model_service_pb2.ModelOutput()
inst = multi_lang_general_model_service_pb2.FetchInst()
for idx, name in enumerate(fetch_names):
tensor = multi_lang_general_model_service_pb2.Tensor()
v_type = self.fetch_types_[name]
if is_python:
tensor.data = result[name].tobytes()
else:
if v_type == 0: # int64
tensor.int64_data.extend(result[name].reshape(-1).tolist())
elif v_type == 1: # float32
tensor.float_data.extend(result[name].reshape(-1).tolist())
else:
raise Exception("error type.")
tensor.shape.extend(list(result[name].shape))
if name in self.lod_tensor_set_:
tensor.lod.extend(result["{}.lod".format(name)].tolist())
inst.tensor_array.append(tensor)
model_output.insts.append(inst)
resp.outputs.append(model_output)
resp.tag = tag
return resp
def inference(self, request, context):
feed_dict, fetch_names, is_python = self._unpack_request(request)
data, tag = self.bclient_.predict(
feed=feed_dict, fetch=fetch_names, need_variant_tag=True)
return self._pack_resp_package(data, fetch_names, is_python, tag)
class MultiLangServer(object):
def __init__(self, worker_num=2):
self.bserver_ = Server()
self.worker_num_ = worker_num
def set_op_sequence(self, op_seq):
self.bserver_.set_op_sequence(op_seq)
def load_model_config(self, model_config_path):
if not isinstance(model_config_path, str):
raise Exception(
"MultiLangServer only supports a single model config temporarily")
self.bserver_.load_model_config(model_config_path)
self.model_config_path_ = model_config_path
def prepare_server(self, workdir=None, port=9292, device="cpu"):
default_port = 12000
self.port_list_ = []
for i in range(1000):
if default_port + i != port and self._port_is_available(default_port
+ i):
self.port_list_.append(default_port + i)
break
self.bserver_.prepare_server(
workdir=workdir, port=self.port_list_[0], device=device)
self.gport_ = port
def _launch_brpc_service(self, bserver):
bserver.run_server()
def _port_is_available(self, port):
with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as sock:
sock.settimeout(2)
result = sock.connect_ex(('0.0.0.0', port))
return result != 0
def run_server(self):
p_bserver = Process(
target=self._launch_brpc_service, args=(self.bserver_, ))
p_bserver.start()
server = grpc.server(
futures.ThreadPoolExecutor(max_workers=self.worker_num_))
multi_lang_general_model_service_pb2_grpc.add_MultiLangGeneralModelServiceServicer_to_server(
MultiLangServerService(self.model_config_path_,
["0.0.0.0:{}".format(self.port_list_[0])]),
server)
server.add_insecure_port('[::]:{}'.format(self.gport_))
server.start()
p_bserver.join()
server.wait_for_termination()
...@@ -20,7 +20,7 @@ Usage: ...@@ -20,7 +20,7 @@ Usage:
import os import os
import time import time
import argparse import argparse
import commands import subprocess
import datetime import datetime
import shutil import shutil
import tarfile import tarfile
...@@ -209,7 +209,7 @@ class HadoopMonitor(Monitor): ...@@ -209,7 +209,7 @@ class HadoopMonitor(Monitor):
remote_filepath = os.path.join(path, filename) remote_filepath = os.path.join(path, filename)
cmd = '{} -ls {} 2>/dev/null'.format(self._cmd_prefix, remote_filepath) cmd = '{} -ls {} 2>/dev/null'.format(self._cmd_prefix, remote_filepath)
_LOGGER.debug('check cmd: {}'.format(cmd)) _LOGGER.debug('check cmd: {}'.format(cmd))
[status, output] = commands.getstatusoutput(cmd) [status, output] = subprocess.getstatusoutput(cmd)
_LOGGER.debug('resp: {}'.format(output)) _LOGGER.debug('resp: {}'.format(output))
if status == 0: if status == 0:
[_, _, _, _, _, mdate, mtime, _] = output.split('\n')[-1].split() [_, _, _, _, _, mdate, mtime, _] = output.split('\n')[-1].split()
......
...@@ -131,7 +131,7 @@ class WebService(object): ...@@ -131,7 +131,7 @@ class WebService(object):
for key in fetch_map: for key in fetch_map:
fetch_map[key] = fetch_map[key].tolist() fetch_map[key] = fetch_map[key].tolist()
result = self.postprocess( result = self.postprocess(
feed=feed, fetch=fetch, fetch_map=fetch_map) feed=request.json["feed"], fetch=fetch, fetch_map=fetch_map)
result = {"result": result} result = {"result": result}
except ValueError: except ValueError:
result = {"result": "Request Value Error"} result = {"result": "Request Value Error"}
......
numpy>=1.12, <=1.16.4 ; python_version<"3.5" numpy>=1.12, <=1.16.4 ; python_version<"3.5"
grpcio-tools>=1.28.1
grpcio>=1.28.1
...@@ -42,7 +42,8 @@ if '${PACK}' == 'ON': ...@@ -42,7 +42,8 @@ if '${PACK}' == 'ON':
REQUIRED_PACKAGES = [ REQUIRED_PACKAGES = [
'six >= 1.10.0', 'sentencepiece', 'opencv-python', 'pillow' 'six >= 1.10.0', 'sentencepiece', 'opencv-python', 'pillow',
'shapely', 'pyclipper'
] ]
packages=['paddle_serving_app', packages=['paddle_serving_app',
......
...@@ -58,7 +58,8 @@ if '${PACK}' == 'ON': ...@@ -58,7 +58,8 @@ if '${PACK}' == 'ON':
REQUIRED_PACKAGES = [ REQUIRED_PACKAGES = [
'six >= 1.10.0', 'protobuf >= 3.1.0', 'numpy >= 1.12' 'six >= 1.10.0', 'protobuf >= 3.1.0', 'numpy >= 1.12', 'grpcio >= 1.28.1',
'grpcio-tools >= 1.28.1'
] ]
if not find_package("paddlepaddle") and not find_package("paddlepaddle-gpu"): if not find_package("paddlepaddle") and not find_package("paddlepaddle-gpu"):
......
...@@ -37,13 +37,10 @@ def python_version(): ...@@ -37,13 +37,10 @@ def python_version():
max_version, mid_version, min_version = python_version() max_version, mid_version, min_version = python_version()
REQUIRED_PACKAGES = [ REQUIRED_PACKAGES = [
'six >= 1.10.0', 'protobuf >= 3.1.0', 'six >= 1.10.0', 'protobuf >= 3.1.0', 'grpcio >= 1.28.1', 'grpcio-tools >= 1.28.1',
'paddle_serving_client', 'flask >= 1.1.1' 'paddle_serving_client', 'flask >= 1.1.1', 'paddle_serving_app'
] ]
if not find_package("paddlepaddle") and not find_package("paddlepaddle-gpu"):
REQUIRED_PACKAGES.append("paddlepaddle")
packages=['paddle_serving_server', packages=['paddle_serving_server',
'paddle_serving_server.proto'] 'paddle_serving_server.proto']
......
...@@ -37,12 +37,10 @@ def python_version(): ...@@ -37,12 +37,10 @@ def python_version():
max_version, mid_version, min_version = python_version() max_version, mid_version, min_version = python_version()
REQUIRED_PACKAGES = [ REQUIRED_PACKAGES = [
'six >= 1.10.0', 'protobuf >= 3.1.0', 'six >= 1.10.0', 'protobuf >= 3.1.0', 'grpcio >= 1.28.1', 'grpcio-tools >= 1.28.1',
'paddle_serving_client', 'flask >= 1.1.1' 'paddle_serving_client', 'flask >= 1.1.1', 'paddle_serving_app'
] ]
if not find_package("paddlepaddle") and not find_package("paddlepaddle-gpu"):
REQUIRED_PACKAGES.append("paddlepaddle")
packages=['paddle_serving_server_gpu', packages=['paddle_serving_server_gpu',
'paddle_serving_server_gpu.proto'] 'paddle_serving_server_gpu.proto']
......
...@@ -9,4 +9,6 @@ RUN yum -y install wget && \ ...@@ -9,4 +9,6 @@ RUN yum -y install wget && \
yum -y install python3 python3-devel && \ yum -y install python3 python3-devel && \
yum clean all && \ yum clean all && \
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
python get-pip.py && rm get-pip.py python get-pip.py && rm get-pip.py && \
localedef -c -i en_US -f UTF-8 en_US.UTF-8 && \
echo "export LANG=en_US.utf8" >> /root/.bashrc
...@@ -44,4 +44,6 @@ RUN yum -y install wget && \ ...@@ -44,4 +44,6 @@ RUN yum -y install wget && \
cd .. && rm -rf Python-3.6.8* && \ cd .. && rm -rf Python-3.6.8* && \
pip3 install google protobuf setuptools wheel flask numpy==1.16.4 && \ pip3 install google protobuf setuptools wheel flask numpy==1.16.4 && \
yum -y install epel-release && yum -y install patchelf libXext libSM libXrender && \ yum -y install epel-release && yum -y install patchelf libXext libSM libXrender && \
yum clean all yum clean all && \
localedef -c -i en_US -f UTF-8 en_US.UTF-8 && \
echo "export LANG=en_US.utf8" >> /root/.bashrc
...@@ -44,4 +44,5 @@ RUN yum -y install wget && \ ...@@ -44,4 +44,5 @@ RUN yum -y install wget && \
cd .. && rm -rf Python-3.6.8* && \ cd .. && rm -rf Python-3.6.8* && \
pip3 install google protobuf setuptools wheel flask numpy==1.16.4 && \ pip3 install google protobuf setuptools wheel flask numpy==1.16.4 && \
yum -y install epel-release && yum -y install patchelf libXext libSM libXrender && \ yum -y install epel-release && yum -y install patchelf libXext libSM libXrender && \
yum clean all yum clean all && \
echo "export LANG=en_US.utf8" >> /root/.bashrc
...@@ -21,4 +21,6 @@ RUN yum -y install wget >/dev/null \ ...@@ -21,4 +21,6 @@ RUN yum -y install wget >/dev/null \
&& yum install -y python3 python3-devel \ && yum install -y python3 python3-devel \
&& pip3 install google protobuf setuptools wheel flask \ && pip3 install google protobuf setuptools wheel flask \
&& yum -y install epel-release && yum -y install patchelf libXext libSM libXrender\ && yum -y install epel-release && yum -y install patchelf libXext libSM libXrender\
&& yum clean all && yum clean all \
&& localedef -c -i en_US -f UTF-8 en_US.UTF-8 \
&& echo "export LANG=en_US.utf8" >> /root/.bashrc
FROM nvidia/cuda:9.0-cudnn7-runtime-centos7 FROM nvidia/cuda:9.0-cudnn7-devel-centos7 as builder
FROM nvidia/cuda:9.0-cudnn7-runtime-centos7
RUN yum -y install wget && \ RUN yum -y install wget && \
yum -y install epel-release && yum -y install patchelf && \ yum -y install epel-release && yum -y install patchelf && \
yum -y install gcc make python-devel && \ yum -y install gcc make python-devel && \
...@@ -13,4 +14,8 @@ RUN yum -y install wget && \ ...@@ -13,4 +14,8 @@ RUN yum -y install wget && \
ln -s /usr/local/cuda-9.0/lib64/libcublas.so.9.0 /usr/local/cuda-9.0/lib64/libcublas.so && \ ln -s /usr/local/cuda-9.0/lib64/libcublas.so.9.0 /usr/local/cuda-9.0/lib64/libcublas.so && \
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> /root/.bashrc && \ echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> /root/.bashrc && \
ln -s /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so.7 /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so && \ ln -s /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so.7 /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so && \
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-9.0/targets/x86_64-linux/lib:$LD_LIBRARY_PATH' >> /root/.bashrc echo 'export LD_LIBRARY_PATH=/usr/local/cuda-9.0/targets/x86_64-linux/lib:$LD_LIBRARY_PATH' >> /root/.bashrc && \
echo "export LANG=en_US.utf8" >> /root/.bashrc && \
mkdir -p /usr/local/cuda/extras
COPY --from=builder /usr/local/cuda/extras/CUPTI /usr/local/cuda/extras/CUPTI
...@@ -22,4 +22,5 @@ RUN yum -y install wget >/dev/null \ ...@@ -22,4 +22,5 @@ RUN yum -y install wget >/dev/null \
&& yum install -y python3 python3-devel \ && yum install -y python3 python3-devel \
&& pip3 install google protobuf setuptools wheel flask \ && pip3 install google protobuf setuptools wheel flask \
&& yum -y install epel-release && yum -y install patchelf libXext libSM libXrender\ && yum -y install epel-release && yum -y install patchelf libXext libSM libXrender\
&& yum clean all && yum clean all \
&& echo "export LANG=en_US.utf8" >> /root/.bashrc
#!/usr/bin/env bash #!/usr/bin/env bash
set -x
function unsetproxy() { function unsetproxy() {
HTTP_PROXY_TEMP=$http_proxy HTTP_PROXY_TEMP=$http_proxy
HTTPS_PROXY_TEMP=$https_proxy HTTPS_PROXY_TEMP=$https_proxy
...@@ -375,16 +375,17 @@ function python_test_multi_process(){ ...@@ -375,16 +375,17 @@ function python_test_multi_process(){
sh get_data.sh sh get_data.sh
case $TYPE in case $TYPE in
CPU) CPU)
check_cmd "python -m paddle_serving_server.serve --model uci_housing_model --port 9292 &" check_cmd "python -m paddle_serving_server.serve --model uci_housing_model --port 9292 --workdir test9292 &"
check_cmd "python -m paddle_serving_server.serve --model uci_housing_model --port 9293 &" check_cmd "python -m paddle_serving_server.serve --model uci_housing_model --port 9293 --workdir test9293 &"
sleep 5 sleep 5
check_cmd "python test_multi_process_client.py" check_cmd "python test_multi_process_client.py"
kill_server_process kill_server_process
echo "uci_housing multi-process RPC inference pass" echo "uci_housing multi-process RPC inference pass"
;; ;;
GPU) GPU)
check_cmd "python -m paddle_serving_server_gpu.serve --model uci_housing_model --port 9292 --gpu_ids 0 &" rm -rf ./image #TODO: The following code tried to create this folder, but no corresponding code was found
check_cmd "python -m paddle_serving_server_gpu.serve --model uci_housing_model --port 9293 --gpu_ids 0 &" check_cmd "python -m paddle_serving_server_gpu.serve --model uci_housing_model --port 9292 --workdir test9292 --gpu_ids 0 &"
check_cmd "python -m paddle_serving_server_gpu.serve --model uci_housing_model --port 9293 --workdir test9293 --gpu_ids 0 &"
sleep 5 sleep 5
check_cmd "python test_multi_process_client.py" check_cmd "python test_multi_process_client.py"
kill_server_process kill_server_process
...@@ -454,15 +455,16 @@ function python_test_lac() { ...@@ -454,15 +455,16 @@ function python_test_lac() {
cd lac # pwd: /Serving/python/examples/lac cd lac # pwd: /Serving/python/examples/lac
case $TYPE in case $TYPE in
CPU) CPU)
sh get_data.sh python -m paddle_serving_app.package --get_model lac
check_cmd "python -m paddle_serving_server.serve --model jieba_server_model/ --port 9292 &" tar -xzvf lac.tar.gz
check_cmd "python -m paddle_serving_server.serve --model lac_model/ --port 9292 &"
sleep 5 sleep 5
check_cmd "echo \"我爱北京天安门\" | python lac_client.py jieba_client_conf/serving_client_conf.prototxt lac_dict/" check_cmd "echo \"我爱北京天安门\" | python lac_client.py lac_client/serving_client_conf.prototxt "
echo "lac CPU RPC inference pass" echo "lac CPU RPC inference pass"
kill_server_process kill_server_process
unsetproxy # maybe the proxy is used on iPipe, which makes web-test failed. unsetproxy # maybe the proxy is used on iPipe, which makes web-test failed.
check_cmd "python lac_web_service.py jieba_server_model/ lac_workdir 9292 &" check_cmd "python lac_web_service.py lac_model/ lac_workdir 9292 &"
sleep 5 sleep 5
check_cmd "curl -H \"Content-Type:application/json\" -X POST -d '{\"feed\":[{\"words\": \"我爱北京天安门\"}], \"fetch\":[\"word_seg\"]}' http://127.0.0.1:9292/lac/prediction" check_cmd "curl -H \"Content-Type:application/json\" -X POST -d '{\"feed\":[{\"words\": \"我爱北京天安门\"}], \"fetch\":[\"word_seg\"]}' http://127.0.0.1:9292/lac/prediction"
# check http code # check http code
......