未验证 提交 36e02995 编写于 作者: T Thomas Young 提交者: GitHub

Merge pull request #1656 from felixhjh/v0.8.0

V0.8.0
......@@ -105,13 +105,13 @@ For Paddle Serving developers, we provide extended documents such as custom OP,
<h2 align="center">Model Zoo</h2>
Paddle Serving works closely with the Paddle model suite, and implements a large number of service deployment examples, including image classification, object detection, language and text recognition, Chinese part of speech, sentiment analysis, content recommendation and other types of examples, for a total of 42 models.
Paddle Serving works closely with the Paddle model suite, and implements a large number of service deployment examples, including image classification, object detection, language and text recognition, Chinese part of speech, sentiment analysis, content recommendation and other types of examples, for a total of 45 models.
<p align="center">
| PaddleOCR | PaddleDetection | PaddleClas | PaddleSeg | PaddleRec | Paddle NLP |
| :----: | :----: | :----: | :----: | :----: | :----: |
| 8 | 12 | 14 | 2 | 3 | 4 |
| 8 | 12 | 14 | 2 | 3 | 6 |
</p>
......
......@@ -97,13 +97,13 @@ Paddle Serving依托深度学习框架PaddlePaddle旨在帮助深度学习开发
<h2 align="center">模型库</h2>
Paddle Serving与Paddle模型套件紧密配合,实现大量服务化部署,包括图像分类、物体检测、语言文本识别、中文词性、情感分析、内容推荐等多种类型示例,以及Paddle全链条项目,共计42个模型。
Paddle Serving与Paddle模型套件紧密配合,实现大量服务化部署,包括图像分类、物体检测、语言文本识别、中文词性、情感分析、内容推荐等多种类型示例,以及Paddle全链条项目,共计45个模型。
<p align="center">
| PaddleOCR | PaddleDetection | PaddleClas | PaddleSeg | PaddleRec | Paddle NLP |
| :----: | :----: | :----: | :----: | :----: | :----: |
| 8 | 12 | 14 | 2 | 3 | 4 |
| 8 | 12 | 14 | 2 | 3 | 6 |
</p>
......
# Paddle Serving 环境检查功能介绍
## 概览
Paddle Serving 提供了一键运行示例,检查 Paddle Serving 环境是否安装正确。
## 启动方式
```
python3 -m paddle_serving_server.serve check
```
|命令|描述|
|---------|----|
|check_all|检查 Paddle Inference、Pipeline Serving、C++ Serving。只打印检测结果,不记录日志|
|check_pipeline|检查 Pipeline Serving,只打印检测结果,不记录日志|
|check_cpp|检查 C++ Serving,只打印检测结果,不记录日志|
|check_inference|检查 Paddle Inference 是否安装正确,只打印检测结果,不记录日志|
|debug|发生报错后,该命令将打印提示日志到屏幕,并记录详细日志文件|
|exit|退出|
>> **注意**:<br>
>> 1.当 C++ Serving 启动报错且是自己编译后 pip 安装的paddle_serving_server, 确认是否执行 `export SERVING_BIN` 导入`SERVING_BIN`真实路径。<br>
>> 2.可以通过 `export SERVING_LOG_PATH` 指定`debug`命令生成log的路径,默认是在当前路径下记录日志。
......@@ -126,3 +126,10 @@ pip3 install https://paddle-inference-lib.bj.bcebos.com/2.2.2/python/Linux/GPU/x
| CUDA11.2 + CUDNN8 | 0.8.0-cuda11.2-cudnn8-devel | Ubuntu 16.04 | 2.2.2-gpu-cuda11.2-cudnn8 | Ubuntu 18.04 |
对于**Windows 10 用户**,请参考文档[Windows平台使用Paddle Serving指导](Windows_Tutorial_CN.md)
## 5.安装完成后的环境检查
当以上步骤均完成后可使用命令行运行环境检查功能,自动运行Paddle Serving相关示例,进行环境相关配置校验。
```
python3 -m paddle_serving_server.serve check
```
详情请参考[环境检查文档](./Check_Env_CN.md)
......@@ -28,6 +28,8 @@
| senta_bilstm | PaddleNLP | [C++ Serving](../examples/C++/PaddleNLP/senta) | [.tar.gz](https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SentimentAnalysis/senta_bilstm.tar.gz) |C++ Serving|
| lac | PaddleNLP | [C++ Serving](../examples/C++/PaddleNLP/lac) | [.tar.gz](https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/LexicalAnalysis/lac.tar.gz) |
| transformer | PaddleNLP | [Pipeline Serving](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/machine_translation/transformer/deploy/serving/README.md) | [model](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/machine_translation/transformer) |
| ELECTRA | PaddleNLP | [Pipeline Serving](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/language_model/electra/deploy/serving/README.md) | [model](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/language_model/electra) |
| In-batch Negatives | PaddleNLP | [Pipeline Serving](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/neural_search/recall/in_batch_negative) | [model](https://bj.bcebos.com/v1/paddlenlp/models/inbatch_model.zip) |
| criteo_ctr | PaddleRec | [C++ Serving](../examples/C++/PaddleRec/criteo_ctr) | [.tar.gz](https://paddle-serving.bj.bcebos.com/criteo_ctr_example/criteo_ctr_demo_model.tar.gz) |
| criteo_ctr_with_cube | PaddleRec | [C++ Serving](../examples/C++/PaddleRec/criteo_ctr_with_cube) | [.tar.gz](https://paddle-serving.bj.bcebos.com/unittest/ctr_cube_unittest.tar.gz) |
| wide&deep | PaddleRec | [C++ Serving](https://github.com/PaddlePaddle/PaddleRec/blob/release/2.1.0/doc/serving.md) | [model](https://github.com/PaddlePaddle/PaddleRec/blob/release/2.1.0/models/rank/wide_deep/README.md) |
......
# 如何配置TensorRT动态shape
# 如何开启 TensorRT 并配置动态 shape
(简体中文|[English](./TensorRT_Dynamic_Shape_EN.md))
## 引言
## 概览
在Pipeline/C++开启TensorRT`--use_trt`后,关于如何进行动态shape的配置,以下会分别给出Pipeline Serving和C++ Serving示例
TensorRT是一个高性能的深度学习推理(Inference)优化器,可以为深度学习应用提供低延迟、高吞吐率的部署推理。
以下将分别从 Pipeline Serving 和 C++ Serving 介绍 Tensorrt 开启方式以及配置动态 shape(Dynamic Shape)。
以下是动态shape api
## Paddle Inference Dynamic Shape Api
```
void SetTRTDynamicShapeInfo(
std::map<std::string, std::vector<int>> min_input_shape,
......@@ -15,7 +16,23 @@
```
具体API说明请参考[C++](https://paddleinference.paddlepaddle.org.cn/api_reference/cxx_api_doc/Config/GPUConfig.html#tensorrt)/[Python](https://paddleinference.paddlepaddle.org.cn/api_reference/python_api_doc/Config/GPUConfig.html#tensorrt)
### C++ Serving
## C++ Serving
**一. C++ Serving Tensorrt 开启方式**
在 C++ Serving 启动命令加上`--use_trt`
```
python -m paddle_serving_server.serve \
--model serving_server \
--thread 2 --port 9000 \
--gpu_ids 0 \
--use_trt \
--precision FP16
```
**二. C++ Serving 设置动态 shape**
`**/paddle_inference/paddle/include/paddle_engine.h` 修改如下代码
```
......@@ -111,44 +128,52 @@
```
### Pipeline Serving
## Pipeline Serving
`**/python/paddle_serving_app/local_predict.py`中修改如下代码
**一. Pipeline Serving Tensorrt 开启方式**
在示例目录下的 config.yml 文件, 修改`device_type: 2`, 配置 GPU 使用的核心 `devices: "0,1,2,3"`
>> **注意**: Tensorrt 需要配合 GPU 使用
**二. Pipeline Serving 设置动态 shape**
在示例目录下的 web_service.py, 在每个 op 下可以通过 `def set_dynamic_shape_info(self):` 添加动态 shape 相关的配置
示例如下
```
if use_trt:
config.enable_tensorrt_engine(
precision_mode=precision_type,
workspace_size=1 << 20,
max_batch_size=32,
min_subgraph_size=3,
use_static=False,
use_calib_mode=False)
head_number = 12
names = [
"placeholder_0", "placeholder_1", "placeholder_2", "stack_0.tmp_0"
]
min_input_shape = [1, 1, 1]
max_input_shape = [100, 128, 1]
opt_input_shape = [10, 60, 1]
config.set_trt_dynamic_shape_info(
{
names[0]: min_input_shape,
names[1]: min_input_shape,
names[2]: min_input_shape,
names[3]: [1, head_number, 1, 1]
}, {
names[0]: max_input_shape,
names[1]: max_input_shape,
names[2]: max_input_shape,
names[3]: [100, head_number, 128, 128]
}, {
names[0]: opt_input_shape,
names[1]: opt_input_shape,
names[2]: opt_input_shape,
names[3]: [10, head_number, 60, 60]
})
def set_dynamic_shape_info(self):
min_input_shape = {
"x": [1, 3, 50, 50],
"conv2d_182.tmp_0": [1, 1, 20, 20],
"nearest_interp_v2_2.tmp_0": [1, 1, 20, 20],
"nearest_interp_v2_3.tmp_0": [1, 1, 20, 20],
"nearest_interp_v2_4.tmp_0": [1, 1, 20, 20],
"nearest_interp_v2_5.tmp_0": [1, 1, 20, 20]
}
max_input_shape = {
"x": [1, 3, 1536, 1536],
"conv2d_182.tmp_0": [20, 200, 960, 960],
"nearest_interp_v2_2.tmp_0": [20, 200, 960, 960],
"nearest_interp_v2_3.tmp_0": [20, 200, 960, 960],
"nearest_interp_v2_4.tmp_0": [20, 200, 960, 960],
"nearest_interp_v2_5.tmp_0": [20, 200, 960, 960],
}
opt_input_shape = {
"x": [1, 3, 960, 960],
"conv2d_182.tmp_0": [3, 96, 240, 240],
"nearest_interp_v2_2.tmp_0": [3, 96, 240, 240],
"nearest_interp_v2_3.tmp_0": [3, 24, 240, 240],
"nearest_interp_v2_4.tmp_0": [3, 24, 240, 240],
"nearest_interp_v2_5.tmp_0": [3, 24, 240, 240],
}
self.dynamic_shape_info = {
"min_input_shape": min_input_shape,
"max_input_shape": max_input_shape,
"opt_input_shape": opt_input_shape,
}
```
具体可以参考[Pipeline OCR](../examples/Pipeline/PaddleOCR/ocr/)
>> **注意**: 由于不同的模型具有不同的动态 shape 配置,因此不存在通用的动态 shape 配置方法。当运行 Pipeline Serving
>> 出现报错信息时,应该使用[netron](https://netron.app/) 加载模型,查看各个 op 的输入输出 shape。之后,结合报错信息,在 web_service.py
>> 添加相应的动态 shape 配置代码。
Http## Bert as service
## Bert as service
([简体中文](./README_CN.md)|English)
......
......@@ -37,6 +37,11 @@ from paddle_serving_server.util import *
from paddle_serving_server.env_check.run import check_env
import cmd
def signal_handler(signal, frame):
print('Process stopped')
sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
# web_service.py is still used by Pipeline.
def port_is_available(port):
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册