Unverified commit c1e9e00c, authored by Jiawei Wang, committed by GitHub

Merge pull request #1037 from wangjiawei04/v0.5.0

cherry-pick #1036 #1035 #1034 #1033
# How to Convert Paddle Inference Model To Paddle Serving Format
([简体中文](./INFERENCE_TO_SERVING_CN.md)|English)
You can use the built-in Python module `paddle_serving_client.convert` to convert a saved Paddle inference model.
```shell
python -m paddle_serving_client.convert --dirname ./your_inference_model_dir
```
The arguments are the same as those of the `inference_model_to_serving` API.
| Argument | Type | Default | Description |
|--------------|------|-----------|--------------------------------|
| `dirname` | str | - | Path of saved model files. Program file and parameter files are saved in this directory. |
| `serving_server` | str | `"serving_server"` | The path of model files and configuration files for server. |
| `serving_client` | str | `"serving_client"` | The path of configuration files for client. |
| `model_filename` | str | None | The name of file to load the inference program. If it is None, the default filename `__model__` will be used. |
| `params_filename` | str | None | The name of file to load all parameters. It is only used for the case that all parameters were saved in a single binary file. If parameters were saved in separate files, set it as None. |
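
The table above can be turned into a small helper that assembles the conversion command. This is only a sketch: it assumes the command-line flags carry the same names as the arguments in the table (the text says they match the `inference_model_to_serving` API), and `build_convert_command` and the `__params__` file name are hypothetical, chosen for illustration.

```python
import sys


def build_convert_command(dirname,
                          serving_server="serving_server",
                          serving_client="serving_client",
                          model_filename=None,
                          params_filename=None):
    """Build the argv list for `python -m paddle_serving_client.convert`.

    Flag names are assumed to mirror the argument names in the table.
    """
    cmd = [
        sys.executable, "-m", "paddle_serving_client.convert",
        "--dirname", dirname,
        "--serving_server", serving_server,
        "--serving_client", serving_client,
    ]
    # Optional file names are passed only when set, so the module's own
    # defaults (for example `__model__`) apply otherwise.
    if model_filename is not None:
        cmd += ["--model_filename", model_filename]
    if params_filename is not None:
        cmd += ["--params_filename", params_filename]
    return cmd


# Hypothetical case: all parameters were saved in one file named `__params__`.
cmd = build_convert_command("./your_inference_model_dir",
                            params_filename="__params__")
print(" ".join(cmd[1:]))

# Running the conversion requires paddle_serving_client to be installed:
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the argument list first and passing it to `subprocess.run` avoids shell quoting issues when the model directory contains spaces.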
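# How to Convert a Saved Paddle Inference Model to a Deployable Paddle Serving Model
([English](./INFERENCE_TO_SERVING.md)|Simplified Chinese)
You can use the built-in module `paddle_serving_client.convert`, provided by Paddle Serving, to do the conversion.
```shell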
python -m paddle_serving_client.convert --dirname ./your_inference_model_dir
```
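The module's arguments are the same as those of the `inference_model_to_serving` interface.
| Argument | Type | Default | Description |
|--------------|------|-----------|--------------------------------|
| `dirname` | str | - | Path where the model files to convert are stored. The Program structure file and the parameter files are both saved in this directory. |
| `serving_server` | str | `"serving_server"` | Path where the converted model files and configuration files are stored. Defaults to serving_server. |
| `serving_client` | str | `"serving_client"` | Path where the converted client configuration files are stored. Defaults to serving_client. |
| `model_filename` | str | None | Name of the file that stores the Inference Program structure of the model to convert. If set to None, `__model__` is used as the default file name. |
| `params_filename` | str | None | Name of the file that stores all parameters of the model to convert. It needs to be specified only when all model parameters are saved in a single binary file. If the parameters are stored in separate files, set it to None. |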
...@@ -11,14 +11,16 @@ This example use model [BERT Chinese Model](https://www.paddlepaddle.org.cn/hubd ...@@ -11,14 +11,16 @@ This example use model [BERT Chinese Model](https://www.paddlepaddle.org.cn/hubd
Install paddlehub first Install paddlehub first
``` ```
pip install paddlehub pip3 install paddlehub
``` ```
run run
``` ```
python prepare_model.py 128 python3 prepare_model.py 128
``` ```
**PaddleHub only supports Python 3.5+**
the 128 in the command above means max_seq_len in BERT model, which is the length of sample after preprocessing. the 128 in the command above means max_seq_len in BERT model, which is the length of sample after preprocessing.
the config file and model file for server side are saved in the folder bert_seq128_model. the config file and model file for server side are saved in the folder bert_seq128_model.
the config file generated for client side is saved in the folder bert_seq128_client. the config file generated for client side is saved in the folder bert_seq128_client.
...@@ -28,8 +30,9 @@ You can also download the above model from BOS(max_seq_len=128). After decompres ...@@ -28,8 +30,9 @@ You can also download the above model from BOS(max_seq_len=128). After decompres
```shell ```shell
wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz
tar -xzf bert_chinese_L-12_H-768_A-12.tar.gz tar -xzf bert_chinese_L-12_H-768_A-12.tar.gz
mv bert_chinese_L-12_H-768_A-12_model bert_seq128_model
mv bert_chinese_L-12_H-768_A-12_client bert_seq128_client
``` ```
If your model directory is named bert_chinese_L-12_H-768_A-12_model, replace 'bert_seq128_model' in the following commands with 'bert_chinese_L-12_H-768_A-12_model', and replace 'bert_seq128_client' with 'bert_chinese_L-12_H-768_A-12_client'.
### Getting Dict and Sample Dataset ### Getting Dict and Sample Dataset
......
...@@ -10,11 +10,11 @@ ...@@ -10,11 +10,11 @@
This example uses the [BERT Chinese model](https://www.paddlepaddle.org.cn/hubdetail?name=bert_chinese_L-12_H-768_A-12&en_category=SemanticModel) from [Paddlehub](https://github.com/PaddlePaddle/PaddleHub) This example uses the [BERT Chinese model](https://www.paddlepaddle.org.cn/hubdetail?name=bert_chinese_L-12_H-768_A-12&en_category=SemanticModel) from [Paddlehub](https://github.com/PaddlePaddle/PaddleHub)
Install paddlehub first Install paddlehub first
``` ```
pip install paddlehub pip3 install paddlehub
``` ```
Run Run
``` ```
python prepare_model.py 128 python3 prepare_model.py 128
``` ```
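The argument 128 is max_seq_len in the BERT model, i.e. the sample length after preprocessing. The argument 128 is max_seq_len in the BERT model, i.e. the sample length after preprocessing.
The server-side configuration file and model files are generated and saved in the bert_seq128_model folder. The server-side configuration file and model files are generated and saved in the bert_seq128_model folder.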
...@@ -25,9 +25,9 @@ python prepare_model.py 128 ...@@ -25,9 +25,9 @@ python prepare_model.py 128
```shell ```shell
wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz
tar -xzf bert_chinese_L-12_H-768_A-12.tar.gz tar -xzf bert_chinese_L-12_H-768_A-12.tar.gz
mv bert_chinese_L-12_H-768_A-12_model bert_seq128_model
mv bert_chinese_L-12_H-768_A-12_client bert_seq128_client
``` ```
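If you use the bert_chinese_L-12_H-768_A-12_model model, replace bert_seq128_model in the commands below with bert_chinese_L-12_H-768_A-12_model, and replace bert_seq128_client with bert_chinese_L-12_H-768_A-12_client.
### Getting the Dictionary and Sample Data ### Getting the Dictionary and Sample Data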
......
...@@ -12,6 +12,7 @@ Paddle Detection provides a large number of [Model Zoo](https://github.com/Paddl ...@@ -12,6 +12,7 @@ Paddle Detection provides a large number of [Model Zoo](https://github.com/Paddl
### Serving example ### Serving example
Several examples of PaddleDetection models used in Serving are given in this folder Several examples of PaddleDetection models used in Serving are given in this folder
All examples support TensorRT.
-[Faster RCNN](./faster_rcnn_r50_fpn_1x_coco) -[Faster RCNN](./faster_rcnn_r50_fpn_1x_coco)
-[PPYOLO](./ppyolo_r50vd_dcn_1x_coco) -[PPYOLO](./ppyolo_r50vd_dcn_1x_coco)
......
...@@ -13,6 +13,9 @@ tar xf faster_rcnn_r50_fpn_1x_coco.tar ...@@ -13,6 +13,9 @@ tar xf faster_rcnn_r50_fpn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, add the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
python test_client.py 000000570688.jpg python test_client.py 000000570688.jpg
......
...@@ -13,6 +13,7 @@ wget --no-check-certificate https://paddle-serving.bj.bcebos.com/pddet_demo/2.0/ ...@@ -13,6 +13,7 @@ wget --no-check-certificate https://paddle-serving.bj.bcebos.com/pddet_demo/2.0/
tar xf faster_rcnn_r50_fpn_1x_coco.tar tar xf faster_rcnn_r50_fpn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, enable the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
......
...@@ -13,6 +13,8 @@ tar xf ppyolo_r50vd_dcn_1x_coco.tar ...@@ -13,6 +13,8 @@ tar xf ppyolo_r50vd_dcn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, add the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
python test_client.py 000000570688.jpg python test_client.py 000000570688.jpg
......
...@@ -14,6 +14,8 @@ tar xf ppyolo_r50vd_dcn_1x_coco.tar ...@@ -14,6 +14,8 @@ tar xf ppyolo_r50vd_dcn_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, enable the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
python test_client.py 000000570688.jpg python test_client.py 000000570688.jpg
......
...@@ -12,6 +12,7 @@ wget --no-check-certificate https://paddle-serving.bj.bcebos.com/pddet_demo/2.0/ ...@@ -12,6 +12,7 @@ wget --no-check-certificate https://paddle-serving.bj.bcebos.com/pddet_demo/2.0/
tar xf ttfnet_darknet53_1x_coco.tar tar xf ttfnet_darknet53_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, add the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
......
...@@ -14,6 +14,8 @@ tar xf ttfnet_darknet53_1x_coco.tar ...@@ -14,6 +14,8 @@ tar xf ttfnet_darknet53_1x_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, enable the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
python test_client.py 000000570688.jpg python test_client.py 000000570688.jpg
......
...@@ -13,6 +13,8 @@ tar xf yolov3_darknet53_270e_coco.tar ...@@ -13,6 +13,8 @@ tar xf yolov3_darknet53_270e_coco.tar
python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model serving_server --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, add the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
python test_client.py 000000570688.jpg python test_client.py 000000570688.jpg
......
...@@ -14,6 +14,8 @@ tar xf yolov3_darknet53_270e_coco.tar ...@@ -14,6 +14,8 @@ tar xf yolov3_darknet53_270e_coco.tar
python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0 python -m paddle_serving_server_gpu.serve --model pddet_serving_model --port 9494 --gpu_ids 0
``` ```
This model supports TensorRT. For faster inference, enable the `--use_trt` option.
### Perform prediction ### Perform prediction
``` ```
python test_client.py 000000570688.jpg python test_client.py 000000570688.jpg
......
# Imagenet Pipeline WebService # Imagenet Pipeline WebService
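This section uses the Uci service as an example to show how to use Pipeline WebService. This section uses the Imagenet service as an example to show how to use Pipeline WebService.
## Get the model ## Get the model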
``` ```
...@@ -10,10 +10,11 @@ sh get_model.sh ...@@ -10,10 +10,11 @@ sh get_model.sh
## Start the service ## Start the service
``` ```
python web_service.py &>log.txt & python resnet50_web_service.py &>log.txt &
``` ```
## Test ## Test
``` ```
curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}' python pipeline_rpc_client.py
``` ```
...@@ -152,8 +152,8 @@ class MainService(BaseHTTPRequestHandler): ...@@ -152,8 +152,8 @@ class MainService(BaseHTTPRequestHandler):
if "key" not in post_data: if "key" not in post_data:
return False return False
else: else:
key = base64.b64decode(post_data["key"]) key = base64.b64decode(post_data["key"].encode())
with open(args.model + "/key", "w") as f: with open(args.model + "/key", "wb") as f:
f.write(key) f.write(key)
return True return True
...@@ -161,8 +161,8 @@ class MainService(BaseHTTPRequestHandler): ...@@ -161,8 +161,8 @@ class MainService(BaseHTTPRequestHandler):
if "key" not in post_data: if "key" not in post_data:
return False return False
else: else:
key = base64.b64decode(post_data["key"]) key = base64.b64decode(post_data["key"].encode())
with open(args.model + "/key", "r") as f: with open(args.model + "/key", "rb") as f:
cur_key = f.read() cur_key = f.read()
return (key == cur_key) return (key == cur_key)
...@@ -203,7 +203,7 @@ class MainService(BaseHTTPRequestHandler): ...@@ -203,7 +203,7 @@ class MainService(BaseHTTPRequestHandler):
self.send_response(200) self.send_response(200)
self.send_header('Content-type', 'application/json') self.send_header('Content-type', 'application/json')
self.end_headers() self.end_headers()
self.wfile.write(json.dumps(response)) self.wfile.write(json.dumps(response).encode())
if __name__ == "__main__": if __name__ == "__main__":
......
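
The three hunks above all concern the `str`/`bytes` split in Python 3. The following is a minimal standalone sketch of the same pattern, not the project's code: `save_key` and `check_key` are hypothetical stand-ins for the handler methods, and a temporary directory replaces `args.model`.

```python
import base64
import json
import os
import tempfile


def save_key(model_dir, post_data):
    """Decode the base64 "key" field and store it as raw bytes."""
    if "key" not in post_data:
        return False
    key = base64.b64decode(post_data["key"].encode())  # result is bytes
    # Text mode ("w") would raise TypeError here, because key is bytes.
    with open(os.path.join(model_dir, "key"), "wb") as f:
        f.write(key)
    return True


def check_key(model_dir, post_data):
    """Compare the posted key with the stored one, bytes against bytes."""
    if "key" not in post_data:
        return False
    key = base64.b64decode(post_data["key"].encode())
    # Text mode ("r") would return str, which never equals a bytes object.
    with open(os.path.join(model_dir, "key"), "rb") as f:
        cur_key = f.read()
    return key == cur_key


with tempfile.TemporaryDirectory() as model_dir:
    request = {"key": base64.b64encode(b"\x00\x01secret\xff").decode()}
    assert save_key(model_dir, request)
    assert check_key(model_dir, request)
    wrong = {"key": base64.b64encode(b"other").decode()}
    assert not check_key(model_dir, wrong)

# The response body must also be bytes before it is written to a socket file.
body = json.dumps({"result": True}).encode()
assert isinstance(body, bytes)
```

The binary file modes are the essential part: the decoded key is arbitrary bytes and may not be valid text in any encoding.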
...@@ -767,7 +767,7 @@ class ThreadChannel(Queue.PriorityQueue): ...@@ -767,7 +767,7 @@ class ThreadChannel(Queue.PriorityQueue):
while self._stop is False and self._consumer_cursors[ while self._stop is False and self._consumer_cursors[
op_name] - self._base_cursor >= len(self._output_buf): op_name] - self._base_cursor >= len(self._output_buf):
try: try:
channeldata = self.get(timeout=0) channeldata = self.get(timeout=0)[1]
self._output_buf.append(channeldata) self._output_buf.append(channeldata)
list_values = list(channeldata.values()) list_values = list(channeldata.values())
_LOGGER.debug( _LOGGER.debug(
......
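
The `[1]` added in the hunk above is consistent with how a priority queue returns entries. The sketch below uses the standard-library `queue.PriorityQueue` and assumes, for illustration only, that entries are stored as `(priority, payload)` tuples; how `ThreadChannel` actually enqueues its data is not shown in this diff.

```python
import queue

channel = queue.PriorityQueue()
# Assumed entry layout: (priority, payload). Priorities are distinct, so the
# dict payloads are never compared with each other.
channel.put((2, {"op_b": "second"}))
channel.put((1, {"op_a": "first"}))

entry = channel.get(timeout=0)        # the whole tuple, lowest priority first
priority, payload = entry
assert priority == 1
assert payload == {"op_a": "first"}

payload2 = channel.get(timeout=0)[1]  # index 1 selects the payload only
assert list(payload2.values()) == ["second"]
```

Without the index, a later call such as `channeldata.values()` would fail, because a tuple has no `values` method.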