Unverified commit 0dc19035, authored by cuicheng01, committed by GitHub

Merge branch 'PaddlePaddle:develop' into develop

...@@ -78,7 +78,7 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
- 推理部署
  - [基于python预测引擎推理](docs/zh_CN/inference_deployment/python_deploy.md#1)
  - [基于C++预测引擎推理](docs/zh_CN/inference_deployment/cpp_deploy.md)
  - [服务化部署](docs/zh_CN/inference_deployment/classification_serving_deploy.md)
  - [端侧部署](docs/zh_CN/inference_deployment/paddle_lite_deploy.md)
  - [Paddle2ONNX模型转化与预测](deploy/paddle2onnx/readme.md)
- [模型压缩](deploy/slim/README.md)
...@@ -93,7 +93,7 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
- 推理部署
  - [基于python预测引擎推理](docs/zh_CN/inference_deployment/python_deploy.md#2)
  - [基于C++预测引擎推理](deploy/cpp_shitu/readme.md)
  - [服务化部署](docs/zh_CN/inference_deployment/recognition_serving_deploy.md)
  - [端侧部署](deploy/lite_shitu/README.md)
- PP系列骨干网络模型
  - [PP-HGNet](docs/zh_CN/models/PP-HGNet.md)
...
Global:
  infer_imgs: "./images/ImageNet/ILSVRC2012_val_00000010.jpeg"
  inference_model_dir: "./models/PPHGNet_tiny_calling_halfbody/"
  batch_size: 1
  use_gpu: True
  enable_mkldnn: True
  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
  use_tensorrt: False
  gpu_mem: 8000
  enable_profile: False
PreProcess:
  transform_ops:
    - ResizeImage:
        resize_short: 224
    - NormalizeImage:
        scale: 0.00392157
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
        channel_num: 3
    - ToCHWImage:
PostProcess:
  main_indicator: Topk
  Topk:
    topk: 2
    class_id_map_file: "../dataset/data/phone_label_list.txt"
  SavePreLabel:
    save_dir: ./pre_label/
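For orientation only, the `PostProcess.Topk` step configured above amounts to taking the `topk` highest scores and mapping class ids to names via `class_id_map_file`. The sketch below is an illustration rather than the shipped implementation; the label-file format (`"<class_id> <label_name>"` per line) is an assumption.

```python
# Illustrative sketch of a Topk post-process step (not the shipped operator).
# Assumes class_id_map_file lines look like "<class_id> <label_name>".
import numpy as np

def topk_postprocess(probs, topk=2, class_id_map_file=None):
    id_map = {}
    if class_id_map_file:
        with open(class_id_map_file, encoding="utf-8") as f:
            for line in f:
                parts = line.strip().split(" ", 1)
                if len(parts) == 2:
                    id_map[int(parts[0])] = parts[1]
    ids = np.argsort(probs)[::-1][:topk]          # indices of the top-k scores
    return [{"class_id": int(i),
             "score": float(probs[i]),
             "label_name": id_map.get(int(i), "")} for i in ids]
```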
简体中文 | [English](readme_en.md)

# 基于 PaddleHub Serving 的服务部署

PaddleClas 支持通过 PaddleHub 快速进行服务化部署。目前支持图像分类的部署,图像识别的部署敬请期待。

## 目录
- [1. 简介](#1-简介)
- [2. 准备环境](#2-准备环境)
- [3. 下载推理模型](#3-下载推理模型)
- [4. 安装服务模块](#4-安装服务模块)
- [5. 启动服务](#5-启动服务)
  - [5.1 命令行启动](#51-命令行启动)
  - [5.2 配置文件启动](#52-配置文件启动)
- [6. 发送预测请求](#6-发送预测请求)
- [7. 自定义修改服务模块](#7-自定义修改服务模块)


<a name="1"></a>
## 1. 简介

hubserving 服务部署配置服务包 `clas` 下包含 3 个必选文件,目录如下:

```shell
deploy/hubserving/clas/
├── __init__.py # 空文件,必选
├── config.json # 配置文件,可选,使用配置启动服务时作为参数传入
├── module.py # 主模块,必选,包含服务的完整逻辑
└── params.py # 参数文件,必选,包含模型路径、前后处理参数等参数
```
<a name="2"></a>
## 2. 准备环境

```shell
# 安装 paddlehub,建议安装 2.1.0 版本
python3.7 -m pip install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```


<a name="3"></a>
## 3. 下载推理模型

安装服务模块前,需要准备推理模型并放到正确路径,默认模型路径为:

* 分类推理模型结构文件:`PaddleClas/inference/inference.pdmodel`
* 分类推理模型权重文件:`PaddleClas/inference/inference.pdiparams`

**注意**:
* 模型文件路径可在 `PaddleClas/deploy/hubserving/clas/params.py` 中查看和修改:

  ```python
  "inference_model_dir": "../inference/"
  ```
* 模型文件(包括 `.pdmodel` 与 `.pdiparams`)的名称必须为 `inference`。
* 我们提供了大量基于 ImageNet-1k 数据集的预训练模型,模型列表及下载地址详见[模型库概览](../../docs/zh_CN/algorithm_introduction/ImageNet_models.md),也可以使用自己训练转换好的模型。
<a name="4"></a>
## 4. 安装服务模块

* 在 Linux 环境下,安装示例如下:
  ```shell
  cd PaddleClas/deploy
  # 安装服务模块:
  hub install hubserving/clas/
  ```

* 在 Windows 环境下(文件夹的分隔符为`\`),安装示例如下:
  ```shell
  cd PaddleClas\deploy
  # 安装服务模块:
  hub install hubserving\clas\
  ```


<a name="5"></a>
## 5. 启动服务

<a name="5.1"></a>
### 5.1 命令行启动

该方式仅支持使用 CPU 预测。启动命令:

```shell
hub serving start \
--modules clas_system \
--port 8866
```

这样就完成了一个服务化 API 的部署,使用默认端口号 8866。

**参数说明**:

| 参数 | 用途 |
| ------------------ | ---- |
| --modules/-m | [**必选**] PaddleHub Serving 预安装模型,以多个 Module==Version 键值对的形式列出<br>*`当不指定 Version 时,默认选择最新版本`* |
| --port/-p | [**可选**] 服务端口,默认为 8866 |
| --use_multiprocess | [**可选**] 是否启用并发方式,默认为单进程方式,推荐多核 CPU 机器使用此方式<br>*`Windows 操作系统只支持单进程方式`* |
| --workers | [**可选**] 在并发方式下指定的并发任务数,默认为 `2*cpu_count-1`,其中 `cpu_count` 为 CPU 核数 |

更多部署细节详见 [PaddleHub Serving 模型一键服务部署](https://paddlehub.readthedocs.io/zh_CN/release-v2.1/tutorial/serving.html)。

<a name="5.2"></a>
### 5.2 配置文件启动

该方式仅支持使用 CPU 或 GPU 预测。启动命令:

```shell
hub serving start -c config.json
```

其中,`config.json` 格式如下:

```json
{
    "modules_info": {
...@@ -97,92 +131,109 @@
}
```
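The diff above elides the body of `config.json`. Purely for orientation, a file with the fields discussed below (`init_args`/`predict_args`, `port`, `use_multiprocess`, `workers`) typically looks roughly like the following sketch; treat it as illustrative and defer to the file actually shipped in `deploy/hubserving/clas/config.json`.

```json
{
    "modules_info": {
        "clas_system": {
            "init_args": {
                "version": "1.0.0",
                "use_gpu": true,
                "enable_mkldnn": false
            },
            "predict_args": {
            }
        }
    },
    "port": 8866,
    "use_multiprocess": false,
    "workers": 2
}
```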
**参数说明**:
* `init_args` 中的可配参数与 `module.py` 中的 `_initialize` 函数接口一致。其中,
  - 当 `use_gpu` 为 `true` 时,表示使用 GPU 启动服务。
  - 当 `enable_mkldnn` 为 `true` 时,表示使用 MKL-DNN 加速。
* `predict_args` 中的可配参数与 `module.py` 中的 `predict` 函数接口一致。

**注意**:
* 使用配置文件启动服务时,将使用配置文件中的参数设置,其他命令行参数将被忽略;
* 如果使用 GPU 预测(即,`use_gpu` 置为 `true`),则需要在启动服务之前,设置 `CUDA_VISIBLE_DEVICES` 环境变量来指定所使用的 GPU 卡号,如:`export CUDA_VISIBLE_DEVICES=0`;
* **`use_gpu` 不可与 `use_multiprocess` 同时为 `true`**;
* **`use_gpu` 与 `enable_mkldnn` 同时为 `true` 时,将忽略 `enable_mkldnn`,而使用 GPU**。

如使用 GPU 3 号卡启动服务:

```shell
cd PaddleClas/deploy
export CUDA_VISIBLE_DEVICES=3
hub serving start -c hubserving/clas/config.json
```
<a name="6"></a>
## 6. 发送预测请求

配置好服务端后,可使用以下命令发送预测请求,获取预测结果:

```shell
cd PaddleClas/deploy
python3.7 hubserving/test_hubserving.py \
--server_url http://127.0.0.1:8866/predict/clas_system \
--image_file ./hubserving/ILSVRC2012_val_00006666.JPEG \
--batch_size 8
```

**预测输出**
```log
The result(s): class_ids: [57, 67, 68, 58, 65], label_names: ['garter snake, grass snake', 'diamondback, diamondback rattlesnake, Crotalus adamanteus', 'sidewinder, horned rattlesnake, Crotalus cerastes', 'water snake', 'sea snake'], scores: [0.21915, 0.15631, 0.14794, 0.13177, 0.12285]
The average time of prediction cost: 2.970 s/image
The average time cost: 3.014 s/image
The average top-1 score: 0.110
```

**脚本参数说明**:
* **server_url**:服务地址,格式为 `http://[ip_address]:[port]/predict/[module_name]`。
* **image_path**:测试图像路径,可以是单张图片路径,也可以是图像集合目录路径。
* **batch_size**:[**可选**] 以 `batch_size` 大小为单位进行预测,默认为 `1`。
* **resize_short**:[**可选**] 预处理时,按短边调整大小,默认为 `256`。
* **crop_size**:[**可选**] 预处理时,居中裁剪的大小,默认为 `224`。
* **normalize**:[**可选**] 预处理时,是否进行 `normalize`,默认为 `True`。
* **to_chw**:[**可选**] 预处理时,是否调整为 `CHW` 顺序,默认为 `True`。

**注意**:如果使用 `Transformer` 系列模型,如 `DeiT_***_384`、`ViT_***_384` 等,请注意模型的输入数据尺寸,需要指定 `--resize_short=384 --crop_size=384`。
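As an illustration, for a 384-input model such as `ViT_base_patch16_384` the request command might look like the following (adjust it to whatever model the service actually loads):

```shell
python3.7 hubserving/test_hubserving.py \
--server_url http://127.0.0.1:8866/predict/clas_system \
--image_file ./hubserving/ILSVRC2012_val_00006666.JPEG \
--batch_size 8 \
--resize_short=384 \
--crop_size=384
```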
**返回结果格式说明**:
返回结果为列表(list),包含 top-k 个分类结果,以及对应的得分,还有此图片预测耗时,具体如下:
```shell
list: 返回结果
└── list: 第一张图片结果
    ├── list: 前 k 个分类结果,依 score 递减排序
    ├── list: 前 k 个分类结果对应的 score,依 score 递减排序
    └── float: 该图分类耗时,单位秒
```
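Given a Python object with the nested-list layout described above (per image: classification results, scores, elapsed seconds), unpacking it is straightforward. A minimal usage sketch, assuming both inner lists are sorted by score in descending order as documented:

```python
def summarize(results):
    # results: list of per-image results, each [top_k_classes, top_k_scores, elapsed_seconds]
    for i, (classes, scores, elapsed) in enumerate(results):
        # index 0 holds the top-1 entry because both lists are score-sorted
        print(f"image {i}: top-1 = {classes[0]} (score {scores[0]:.4f}), cost {elapsed:.3f}s")
```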
**说明:** 如果需要增加、删除、修改返回字段,可对相应模块进行修改,完整流程参考下一节自定义修改服务模块。
<a name="7"></a>
## 7. 自定义修改服务模块

如果需要修改服务逻辑,需要进行以下操作:

1. 停止服务
    ```shell
    hub serving stop --port/-p XXXX
    ```

2. 到相应的 `module.py` 和 `params.py` 等文件中根据实际需求修改代码。`module.py` 修改后需要重新安装(`hub install hubserving/clas/`)并部署。在进行部署前,可先通过 `python3.7 hubserving/clas/module.py` 命令来快速测试准备部署的代码。

3. 卸载旧服务包
    ```shell
    hub uninstall clas_system
    ```

4. 安装修改后的新服务包
    ```shell
    hub install hubserving/clas/
    ```

5. 重新启动服务
    ```shell
    hub serving start -m clas_system
    ```
**注意**:
常用参数可在 `PaddleClas/deploy/hubserving/clas/params.py` 中修改:
* 更换模型,需要修改模型文件路径参数:
  ```python
  "inference_model_dir":
  ```
* 更改后处理时返回的 `top-k` 结果数量:
  ```python
  'topk':
  ```
* 更改后处理时的 label 与 class id 对应映射文件:
  ```python
  'class_id_map_file':
  ```

为了避免不必要的延时以及能够以 batch_size 进行预测,数据预处理逻辑(包括 `resize`、`crop` 等操作)均在客户端完成,因此需要在 [PaddleClas/deploy/hubserving/test_hubserving.py#L41-L47](./test_hubserving.py#L41-L47) 以及 [PaddleClas/deploy/hubserving/test_hubserving.py#L51-L76](./test_hubserving.py#L51-L76) 中修改数据预处理逻辑相关代码,可参考下方示意。
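The authoritative preprocessing code lives in `deploy/hubserving/test_hubserving.py`; the sketch below only illustrates the default client-side pipeline implied by the script parameters documented above (`resize_short=256`, `crop_size=224`, `normalize=True`, `to_chw=True`), with ImageNet mean/std assumed for normalization.

```python
# Rough sketch of the default client-side preprocessing implied by the
# documented parameters; not the shipped implementation.
import cv2
import numpy as np

def client_preprocess(img, resize_short=256, crop_size=224,
                      normalize=True, to_chw=True):
    # resize so that the short side equals resize_short, keeping aspect ratio
    h, w = img.shape[:2]
    ratio = resize_short / min(h, w)
    img = cv2.resize(img, (round(w * ratio), round(h * ratio)))
    # center crop to crop_size x crop_size
    h, w = img.shape[:2]
    top, left = (h - crop_size) // 2, (w - crop_size) // 2
    img = img[top:top + crop_size, left:left + crop_size]
    if normalize:
        img = img.astype(np.float32) / 255.0
        img = (img - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
    if to_chw:
        img = img.transpose(2, 0, 1)   # HWC -> CHW
    return img.astype(np.float32)
```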
English | [简体中文](readme.md)

# Service deployment based on PaddleHub Serving

PaddleClas supports rapid service deployment through PaddleHub. Currently, the deployment of image classification is supported. Please look forward to the deployment of image recognition.

## Catalogue
- [1. Introduction](#1-introduction)
- [2. Prepare the environment](#2-prepare-the-environment)
- [3. Download the inference model](#3-download-the-inference-model)
- [4. Install the service module](#4-install-the-service-module)
- [5. Start service](#5-start-service)
  - [5.1 Start with command line parameters](#51-start-with-command-line-parameters)
  - [5.2 Start with configuration file](#52-start-with-configuration-file)
- [6. Send prediction requests](#6-send-prediction-requests)
- [7. User defined service module modification](#7-user-defined-service-module-modification)


<a name="1"></a>
## 1. Introduction

The hubserving service deployment configuration service package `clas` contains 3 required files, the directory is as follows:

```shell
deploy/hubserving/clas/
├── __init__.py # Empty file, required
├── config.json # Configuration file, optional, passed in as a parameter when starting the service with configuration
├── module.py # The main module, required, contains the complete logic of the service
└── params.py # Parameter file, required, including model path, pre- and post-processing parameters and other parameters
```
<a name="2"></a>
## 2. Prepare the environment

```shell
# Install paddlehub, version 2.1.0 is recommended
python3.7 -m pip install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```


<a name="3"></a>
## 3. Download the inference model

Before installing the service module, you need to prepare the inference model and put it in the correct path. The default model path is:

* Classification inference model structure file: `PaddleClas/inference/inference.pdmodel`
* Classification inference model weight file: `PaddleClas/inference/inference.pdiparams`

**Notice**:
* Model file paths can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`:

  ```python
  "inference_model_dir": "../inference/"
  ```
* Model files (including `.pdmodel` and `.pdiparams`) must be named `inference`.
* We provide a large number of pre-trained models based on the ImageNet-1k dataset. For the model list and download address, see [Model Library Overview](../../docs/en/algorithm_introduction/ImageNet_models_en.md), or you can use your own trained and converted models.
<a name="4"></a>
## 4. Install the service module
* In the Linux environment, the installation example is as follows:
```shell
cd PaddleClas/deploy
# Install the service module:
hub install hubserving/clas/
```
* In the Windows environment (the folder separator is `\`), the installation example is as follows:
```shell
cd PaddleClas\deploy
# Install the service module:
hub install hubserving\clas\
```
<a name="5"></a>
## 5. Start service
<a name="5.1"></a>
### 5.1 Start with command line parameters
This method only supports prediction using CPU. Start command:
```shell
hub serving start \
--modules clas_system \
--port 8866
```

This completes the deployment of a service API, using the default port number 8866.

**Parameter Description**:

| parameters | uses |
| ------------------ | ------------------- |
| --modules/-m | [**Required**] PaddleHub Serving pre-installed model, listed in the form of multiple Module==Version key-value pairs<br>*`When no Version is specified, the latest version is selected by default`* |
| --port/-p | [**Optional**] Service port, default is 8866 |
| --use_multiprocess | [**Optional**] Whether to enable the concurrent mode, the default is single-process mode; this mode is recommended for multi-core CPU machines<br>*`Windows operating system only supports single-process mode`* |
| --workers | [**Optional**] The number of concurrent tasks specified in concurrent mode, the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores |

For more deployment details, see [PaddleHub Serving Model One-Click Service Deployment](https://paddlehub.readthedocs.io/zh_CN/release-v2.1/tutorial/serving.html).
<a name="5.2"></a>
### 5.2 Start with configuration file

This method only supports prediction using CPU or GPU. Start command:

```shell
hub serving start -c config.json
```

The format of `config.json` is as follows:

```json
{
    "modules_info": {
...@@ -96,104 +129,110 @@
    "workers": 2
}
```
**Parameter Description**:
* The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. Among them,
  - When `use_gpu` is `true`, it means the GPU is used to start the service.
  - When `enable_mkldnn` is `true`, it means MKL-DNN acceleration is used.
* The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.

**Notice**:
* When using the configuration file to start the service, the parameter settings in the configuration file will be used, and other command line parameters will be ignored;
* If you use GPU prediction (i.e., `use_gpu` is set to `true`), you need to set the `CUDA_VISIBLE_DEVICES` environment variable to specify the GPU card number before starting the service, such as: `export CUDA_VISIBLE_DEVICES=0`;
* **`use_gpu` cannot be `true` at the same time as `use_multiprocess`**;
* **When both `use_gpu` and `enable_mkldnn` are `true`, `enable_mkldnn` will be ignored and the GPU will be used**.

If you use GPU card No. 3 to start the service:

```shell
cd PaddleClas/deploy
export CUDA_VISIBLE_DEVICES=3
hub serving start -c hubserving/clas/config.json
```
<a name="6"></a>
## 6. Send prediction requests

After configuring the server, you can use the following command to send a prediction request to get the prediction result:

```shell
cd PaddleClas/deploy
python3.7 hubserving/test_hubserving.py \
--server_url http://127.0.0.1:8866/predict/clas_system \
--image_file ./hubserving/ILSVRC2012_val_00006666.JPEG \
--batch_size 8
```

**Predicted output**
```log
The result(s): class_ids: [57, 67, 68, 58, 65], label_names: ['garter snake, grass snake', 'diamondback, diamondback rattlesnake, Crotalus adamanteus', 'sidewinder, horned rattlesnake, Crotalus cerastes', 'water snake', 'sea snake'], scores: [0.21915, 0.15631, 0.14794, 0.13177, 0.12285]
The average time of prediction cost: 2.970 s/image
The average time cost: 3.014 s/image
The average top-1 score: 0.110
```
**Script parameter description**:
* **server_url**: Service address, the format is `http://[ip_address]:[port]/predict/[module_name]`.
* **image_path**: The test image path, which can be a single image path or an image collection directory path.
* **batch_size**: [**Optional**] Make predictions in batches of `batch_size`, default is `1`.
* **resize_short**: [**Optional**] When preprocessing, resize by the short edge, default is `256`.
* **crop_size**: [**Optional**] The size of the center crop during preprocessing, default is `224`.
* **normalize**: [**Optional**] Whether to perform `normalize` during preprocessing, default is `True`.
* **to_chw**: [**Optional**] Whether to adjust to `CHW` order during preprocessing, default is `True`.

**Note**: If you use `Transformer` series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input data size of the model; you need to specify `--resize_short=384 --crop_size=384`.

**Return result format description**:
The returned result is a list, including the top-k classification results, the corresponding scores, and the prediction time for the image, as follows:
```shell
list: return result
└── list: first image result
    ├── list: the top-k classification results, sorted in descending order of score
    ├── list: the scores corresponding to the top-k classification results, sorted in descending order of score
    └── float: the image classification time, in seconds
```
**Note:** If you need to add, delete or modify the returned fields, you can modify the corresponding module. For the details, refer to the user-defined modification service module in the next section.
<a name="7"></a>
## 7. User defined service module modification

If you need to modify the service logic, you need to do the following:

1. Stop the service
    ```shell
    hub serving stop --port/-p XXXX
    ```

2. Go to the corresponding `module.py` and `params.py` and other files to modify the code according to actual needs. `module.py` needs to be reinstalled after modification (`hub install hubserving/clas/`) and deployed. Before deploying, you can use the `python3.7 hubserving/clas/module.py` command to quickly test the code ready for deployment.

3. Uninstall the old service pack
    ```shell
    hub uninstall clas_system
    ```

4. Install the new modified service pack
    ```shell
    hub install hubserving/clas/
    ```

5. Restart the service
    ```shell
    hub serving start -m clas_system
    ```

**Notice**:
Common parameters can be modified in `PaddleClas/deploy/hubserving/clas/params.py`:
* To replace the model, you need to modify the model file path parameter:
  ```python
  "inference_model_dir":
  ```
* To change the number of `top-k` results returned in post-processing:
  ```python
  'topk':
  ```
* To change the mapping file between label and class id used in post-processing:
  ```python
  'class_id_map_file':
  ```

In order to avoid unnecessary delay and to be able to predict with batch_size, data preprocessing logic (including `resize`, `crop` and other operations) is completed on the client side, so the related code in [PaddleClas/deploy/hubserving/test_hubserving.py#L41-L47](./test_hubserving.py#L41-L47) and [PaddleClas/deploy/hubserving/test_hubserving.py#L51-L76](./test_hubserving.py#L51-L76) needs to be modified accordingly.
...@@ -92,9 +92,9 @@ PaddleClas 提供了转换并优化后的推理模型,可以直接参考下方
```shell
# 进入lite_ppshitu目录
cd $PaddleClas/deploy/lite_shitu
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/lite/ppshitu_lite_models_v1.2.tar
tar -xf ppshitu_lite_models_v1.2.tar
rm -f ppshitu_lite_models_v1.2.tar
```

#### 2.1.2 使用其他模型
...@@ -162,7 +162,7 @@ git clone https://github.com/PaddlePaddle/PaddleDetection.git
# 进入PaddleDetection根目录
cd PaddleDetection
# 将预训练模型导出为inference模型
python tools/export_model.py -c configs/picodet/application/mainbody_detection/picodet_lcnet_x2_5_640_mainbody.yml -o weights=https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody.pdparams export_post_process=False --output_dir=inference
# 将inference模型转化为Paddle-Lite优化模型
paddle_lite_opt --model_file=inference/picodet_lcnet_x2_5_640_mainbody/model.pdmodel --param_file=inference/picodet_lcnet_x2_5_640_mainbody/model.pdiparams --optimize_out=inference/picodet_lcnet_x2_5_640_mainbody/mainbody_det
# 将转好的模型复制到lite_shitu目录下
...@@ -183,24 +183,56 @@ cd deploy/lite_shitu
**注意**:`--optimize_out` 参数为优化后模型的保存路径,无需加后缀`.nb`;`--model_file` 参数为模型结构信息文件的路径,`--param_file` 参数为模型权重信息文件的路径,请注意文件名。
### 2.2 生成新的检索库

由于 lite 版本的检索库使用的是 `faiss1.5.3` 版本,与新版本不兼容,因此需要重新生成 index 库。
#### 2.2.1 数据及环境配置
```shell
# 进入上级目录
cd ..
# 下载瓶装饮料数据集
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar
rm -rf drink_dataset_v1.0.tar
rm -rf drink_dataset_v1.0/index
# 安装1.5.3版本的faiss
pip install faiss-cpu==1.5.3
# 下载通用识别模型,可替换成自己的inference model
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar
rm -rf general_PPLCNet_x2_5_lite_v1.0_infer.tar
```
#### 2.2.2 生成新的index文件
```shell
# 生成新的index库,注意指定好识别模型的路径,同时将index_method修改成Flat,HNSW32和IVF在此版本中可能存在bug,请慎重使用。
# 如果使用自己的识别模型,对应的修改inference model的目录
python python/build_gallery.py -c configs/inference_drink.yaml -o Global.rec_inference_model_dir=general_PPLCNet_x2_5_lite_v1.0_infer -o IndexProcess.index_method=Flat
# 进入到lite_shitu目录
cd lite_shitu
mv ../drink_dataset_v1.0 .
```
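For context, an `index_method=Flat` gallery in faiss 1.5.3 boils down to the pattern below. This is a simplified sketch: `build_gallery.py` handles feature extraction, the id map and serialization for you, and the feature matrix, dimension and output filename here are placeholder assumptions.

```python
# Minimal sketch of a Flat index with faiss-cpu 1.5.3 (illustrative only).
import faiss
import numpy as np

dim = 512                                               # assumed feature dimension
gallery_features = np.random.rand(1000, dim).astype("float32")   # placeholder features
# L2-normalize so that inner product equals cosine similarity
gallery_features /= np.linalg.norm(gallery_features, axis=1, keepdims=True)

index = faiss.IndexFlatIP(dim)          # exact (Flat) inner-product index
index.add(gallery_features)
faiss.write_index(index, "vector.index")        # illustrative output path

scores, ids = index.search(gallery_features[:1], 5)   # query the 5 nearest entries
```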
### 2.3 将yaml文件转换成json文件
```shell
# 如果测试单张图像
python generate_json_config.py --det_model_path ppshitu_lite_models_v1.2/mainbody_PPLCNet_x2_5_640_v1.2_lite.nb --rec_model_path ppshitu_lite_models_v1.2/general_PPLCNet_x2_5_lite_v1.2_infer.nb --img_path images/demo.jpeg
# or
# 如果测试多张图像
python generate_json_config.py --det_model_path ppshitu_lite_models_v1.2/mainbody_PPLCNet_x2_5_640_v1.2_lite.nb --rec_model_path ppshitu_lite_models_v1.2/general_PPLCNet_x2_5_lite_v1.2_infer.nb --img_dir images
# 执行完成后,会在lite_shitu下生成shitu_config.json配置文件
```

### 2.4 index字典转换

由于 python 的检索库字典使用 `pickle` 进行序列化存储,C++ 不方便读取,因此需要进行转换。
```shell
# 转化id_map.pkl为id_map.txt
python transform_id_map.py -c ../configs/inference_drink.yaml
```
...@@ -208,7 +240,7 @@ python transform_id_map.py -c ../configs/inference_drink.yaml
转换成功后,会在`IndexProcess.index_dir`目录下生成`id_map.txt`
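Conceptually, the conversion just rewrites the pickled Python dict into a plain-text form that the C++ code can parse. A hedged sketch follows; the paths and the `id<TAB>label` output format are assumptions, and the real `transform_id_map.py` reads its paths from `inference_drink.yaml`.

```python
# Conceptual sketch of the pkl -> txt conversion (not the shipped script).
import pickle

with open("drink_dataset_v1.0/index/id_map.pkl", "rb") as f:   # assumed path
    id_map = pickle.load(f)                                     # {id: label} dict

with open("drink_dataset_v1.0/index/id_map.txt", "w", encoding="utf-8") as f:
    for idx, label in id_map.items():
        f.write(f"{idx}\t{label}\n")
```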
### 2.5 与手机联调

首先需要进行一些准备工作。
1. 准备一台arm8的安卓手机,如果编译的预测库是armv7,则需要arm7的手机,并修改Makefile中`ARM_ABI=arm7`
...@@ -308,8 +340,9 @@ chmod 777 pp_shitu
运行效果如下:
```
images/demo.jpeg:
result0: bbox[344, 98, 527, 593], score: 0.811656, label: 红牛-强化型
result1: bbox[0, 0, 600, 600], score: 0.729664, label: 红牛-强化型
```

## FAQ
...
...@@ -29,16 +29,16 @@

namespace PPShiTu {

void load_jsonf(std::string jsonfile, Json::Value &jsondata);

// Inference model configuration parser
class ConfigPaser {
public:
  ConfigPaser() {}

  ~ConfigPaser() {}

  bool load_config(const Json::Value &config) {
    // Get model arch : YOLO, SSD, RetinaNet, RCNN, Face
    if (config["Global"].isMember("det_arch")) {
...@@ -89,4 +89,4 @@ class ConfigPaser {
  std::vector<int> fpn_stride_;
};

} // namespace PPShiTu
...@@ -18,6 +18,7 @@
#include <arm_neon.h>
#include <chrono>
#include <fstream>
#include <include/preprocess_op.h>
#include <iostream>
#include <math.h>
#include <opencv2/opencv.hpp>
...@@ -48,10 +49,6 @@ public:
        config_file["Global"]["rec_model_path"].as<std::string>());
    this->predictor = CreatePaddlePredictor<MobileConfig>(config);

    SetPreProcessParam(config_file["RecPreProcess"]["transform_ops"]);
    printf("feature extract model create!\n");
  }
...@@ -68,24 +65,29 @@ public:
          this->mean.emplace_back(tmp.as<float>());
        }
        for (auto tmp : item["std"]) {
          this->std.emplace_back(tmp.as<float>());
        }
        this->scale = item["scale"].as<double>();
      }
    }
  }

  void RunRecModel(const cv::Mat &img, double &cost_time,
                   std::vector<float> &feature);
  // void PostProcess(std::vector<float> &feature);
  void FeatureNorm(std::vector<float> &featuer);

private:
  std::shared_ptr<PaddlePredictor> predictor;
  // std::vector<std::string> label_list;
  std::vector<float> mean = {0.485f, 0.456f, 0.406f};
  std::vector<float> std = {0.229f, 0.224f, 0.225f};
  double scale = 0.00392157;
  int size = 224;

  // pre-process
  Resize resize_op_;
  NormalizeImage normalize_op_;
  Permute permute_op_;
};
} // namespace PPShiTu
...@@ -16,24 +16,24 @@
#include <ctime>
#include <memory>
#include <stdlib.h>
#include <string>
#include <utility>
#include <vector>

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

#include "json/json.h"
#include "paddle_api.h" // NOLINT

#include "include/config_parser.h"
#include "include/picodet_postprocess.h"
#include "include/preprocess_op.h"
#include "include/utils.h"

using namespace paddle::lite_api; // NOLINT

namespace PPShiTu {
...@@ -41,53 +41,51 @@ namespace PPShiTu {
std::vector<int> GenerateColorMap(int num_class);

// Visualization Detection Result
cv::Mat VisualizeResult(const cv::Mat &img,
                        const std::vector<PPShiTu::ObjectResult> &results,
                        const std::vector<std::string> &lables,
                        const std::vector<int> &colormap, const bool is_rbox);

class ObjectDetector {
public:
  explicit ObjectDetector(const Json::Value &config,
                          const std::string &model_dir, int cpu_threads = 1,
                          const int batch_size = 1) {
    config_.load_config(config);
    printf("config created\n");
    preprocessor_.Init(config_.preprocess_info_);
    printf("before object detector\n");
    if (config["Global"]["det_model_path"].as<std::string>().empty()) {
      std::cout << "Please set [det_model_path] in config file" << std::endl;
      exit(-1);
    }
    LoadModel(config["Global"]["det_model_path"].as<std::string>(),
              cpu_threads);
    printf("create object detector\n");
  }

  // Load Paddle inference model
  void LoadModel(std::string model_file, int num_theads);

  // Run predictor
  void Predict(const std::vector<cv::Mat> &imgs, const int warmup = 0,
               const int repeats = 1,
               std::vector<PPShiTu::ObjectResult> *result = nullptr,
               std::vector<int> *bbox_num = nullptr,
               std::vector<double> *times = nullptr);

  // Get Model Label list
  const std::vector<std::string> &GetLabelList() const {
    return config_.label_list_;
  }

private:
  // Preprocess image and copy data to input buffer
  void Preprocess(const cv::Mat &image_mat);
  // Postprocess result
  void Postprocess(const std::vector<cv::Mat> mats,
                   std::vector<PPShiTu::ObjectResult> *result,
                   std::vector<int> bbox_num, bool is_rbox);

  std::shared_ptr<PaddlePredictor> predictor_;
  Preprocessor preprocessor_;
...@@ -96,7 +94,6 @@ class ObjectDetector {
  std::vector<int> out_bbox_num_data_;
  float threshold_;
  ConfigPaser config_;
};

} // namespace PPShiTu
...@@ -14,25 +14,23 @@
#pragma once

#include <ctime>
#include <memory>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

#include "include/utils.h"

namespace PPShiTu {

void PicoDetPostProcess(std::vector<PPShiTu::ObjectResult> *results,
                        std::vector<const float *> outs,
                        std::vector<int> fpn_stride,
                        std::vector<float> im_shape,
                        std::vector<float> scale_factor,
                        float score_threshold = 0.3, float nms_threshold = 0.5,
                        int num_class = 80, int reg_max = 7);

} // namespace PPShiTu
...@@ -21,16 +21,16 @@
#include <utility>
#include <vector>

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

#include "json/json.h"

namespace PPShiTu {

// Object for storing all preprocessed data
class ImageBlob {
public:
  // image width and height
  std::vector<float> im_shape_;
  // Buffer for image data after preprocessing
...@@ -45,20 +45,20 @@ class ImageBlob {

// Abstraction of preprocessing operation class
class PreprocessOp {
public:
  virtual void Init(const Json::Value &item) = 0;
  virtual void Run(cv::Mat *im, ImageBlob *data) = 0;
};

class InitInfo : public PreprocessOp {
public:
  virtual void Init(const Json::Value &item) {}
  virtual void Run(cv::Mat *im, ImageBlob *data);
};

class NormalizeImage : public PreprocessOp {
public:
  virtual void Init(const Json::Value &item) {
    mean_.clear();
    scale_.clear();
    for (auto tmp : item["mean"]) {
...@@ -70,9 +70,11 @@ class NormalizeImage : public PreprocessOp {
    is_scale_ = item["is_scale"].as<bool>();
  }

  virtual void Run(cv::Mat *im, ImageBlob *data);
  void Run_feature(cv::Mat *im, const std::vector<float> &mean,
                   const std::vector<float> &std, float scale);

private:
  // CHW or HWC
  std::vector<float> mean_;
  std::vector<float> scale_;
...@@ -80,14 +82,15 @@ class NormalizeImage : public PreprocessOp {
};

class Permute : public PreprocessOp {
public:
  virtual void Init(const Json::Value &item) {}
  virtual void Run(cv::Mat *im, ImageBlob *data);
  void Run_feature(const cv::Mat *im, float *data);
};

class Resize : public PreprocessOp {
public:
  virtual void Init(const Json::Value &item) {
    interp_ = item["interp"].as<int>();
    // max_size_ = item["target_size"].as<int>();
    keep_ratio_ = item["keep_ratio"].as<bool>();
...@@ -98,11 +101,13 @@ class Resize : public PreprocessOp {
  }

  // Compute best resize scale for x-dimension, y-dimension
  std::pair<float, float> GenerateScale(const cv::Mat &im);

  virtual void Run(cv::Mat *im, ImageBlob *data);
  void Run_feature(const cv::Mat &img, cv::Mat &resize_img, int max_size_len,
                   int size = 0);

private:
  int interp_;
  bool keep_ratio_;
  std::vector<int> target_size_;
...@@ -111,46 +116,43 @@ class Resize : public PreprocessOp {

// Models with FPN need input shape % stride == 0
class PadStride : public PreprocessOp {
public:
  virtual void Init(const Json::Value &item) {
    stride_ = item["stride"].as<int>();
  }

  virtual void Run(cv::Mat *im, ImageBlob *data);

private:
  int stride_;
};

class TopDownEvalAffine : public PreprocessOp {
public:
  virtual void Init(const Json::Value &item) {
    trainsize_.clear();
    for (auto tmp : item["trainsize"]) {
      trainsize_.emplace_back(tmp.as<int>());
    }
  }

  virtual void Run(cv::Mat *im, ImageBlob *data);

private:
  int interp_ = 1;
  std::vector<int> trainsize_;
};

void CropImg(cv::Mat &img, cv::Mat &crop_img, std::vector<int> &area,
             std::vector<float> &center, std::vector<float> &scale,
             float expandratio = 0.15);

class Preprocessor {
public:
  void Init(const Json::Value &config_node) {
    // initialize image info at first
    ops_["InitInfo"] = std::make_shared<InitInfo>();
    for (const auto &item : config_node) {
      auto op_name = item["type"].as<std::string>();
      ops_[op_name] = CreateOp(op_name);
...@@ -158,7 +160,7 @@ class Preprocessor {
    }
  }

  std::shared_ptr<PreprocessOp> CreateOp(const std::string &name) {
    if (name == "DetResize") {
      return std::make_shared<Resize>();
    } else if (name == "DetPermute") {
...@@ -176,13 +178,13 @@ class Preprocessor {
    return nullptr;
  }

  void Run(cv::Mat *im, ImageBlob *data);

public:
  static const std::vector<std::string> RUN_ORDER;

private:
  std::unordered_map<std::string, std::shared_ptr<PreprocessOp>> ops_;
};

} // namespace PPShiTu
...@@ -38,6 +38,23 @@ struct ObjectResult {
  std::vector<RESULT> rec_result;
};

void nms(std::vector<ObjectResult> &input_boxes, float nms_threshold,
         bool rec_nms = false);

template <typename T>
static inline bool SortScorePairDescend(const std::pair<float, T> &pair1,
                                        const std::pair<float, T> &pair2) {
  return pair1.first > pair2.first;
}

float RectOverlap(const ObjectResult &a, const ObjectResult &b);

inline void
GetMaxScoreIndex(const std::vector<ObjectResult> &det_result,
                 const float threshold,
                 std::vector<std::pair<float, int>> &score_index_vec);

void NMSBoxes(const std::vector<ObjectResult> det_result,
              const float score_threshold, const float nms_threshold,
              std::vector<int> &indices);

} // namespace PPShiTu
...@@ -70,4 +70,4 @@ private:
  std::vector<faiss::Index::idx_t> I;
  SearchResult sr;
};
} // namespace PPShiTu
...@@ -29,4 +29,4 @@ void load_jsonf(std::string jsonfile, Json::Value &jsondata) {
  }
}
} // namespace PPShiTu
...@@ -13,24 +13,29 @@
// limitations under the License.

#include "include/feature_extractor.h"
#include <cmath>
#include <numeric>

namespace PPShiTu {
void FeatureExtract::RunRecModel(const cv::Mat &img, double &cost_time,
                                 std::vector<float> &feature) {
  // Read img
  cv::Mat img_fp;
  this->resize_op_.Run_feature(img, img_fp, this->size, this->size);
  this->normalize_op_.Run_feature(&img_fp, this->mean, this->std, this->scale);
  std::vector<float> input(1 * 3 * img_fp.rows * img_fp.cols, 0.0f);
  this->permute_op_.Run_feature(&img_fp, input.data());

  // Prepare input data from image
  std::unique_ptr<Tensor> input_tensor(std::move(this->predictor->GetInput(0)));
  input_tensor->Resize({1, 3, this->size, this->size});
  auto *data0 = input_tensor->mutable_data<float>();
  // const float *dimg = reinterpret_cast<const float *>(img_fp.data);
  // NeonMeanScale(dimg, data0, img_fp.rows * img_fp.cols);
  for (int i = 0; i < input.size(); ++i) {
    data0[i] = input[i];
  }

  auto start = std::chrono::system_clock::now();
  // Run predictor
...@@ -38,7 +43,7 @@ void FeatureExtract::RunRecModel(const cv::Mat &img,
  // Get output and post process
  std::unique_ptr<const Tensor> output_tensor(
      std::move(this->predictor->GetOutput(0))); // only one output
  auto end = std::chrono::system_clock::now();
  auto duration =
      std::chrono::duration_cast<std::chrono::microseconds>(end - start);
...@@ -46,7 +51,7 @@ void FeatureExtract::RunRecModel(const cv::Mat &img,
               std::chrono::microseconds::period::num /
               std::chrono::microseconds::period::den;

  // do postprocess
  int output_size = 1;
  for (auto dim : output_tensor->shape()) {
    output_size *= dim;
...@@ -54,63 +59,15 @@ void FeatureExtract::RunRecModel(const cv::Mat &img,
  feature.resize(output_size);
  output_tensor->CopyToCpu(feature.data());

  // postprocess include sqrt or binarize.
  FeatureNorm(feature);
  return;
}

void FeatureExtract::FeatureNorm(std::vector<float> &feature) {
  float feature_sqrt = std::sqrt(std::inner_product(
      feature.begin(), feature.end(), feature.begin(), 0.0f));
  for (int i = 0; i < feature.size(); ++i)
    feature[i] /= feature_sqrt;
}
} // namespace PPShiTu
...@@ -27,6 +27,7 @@ ...@@ -27,6 +27,7 @@
#include "include/feature_extractor.h" #include "include/feature_extractor.h"
#include "include/object_detector.h" #include "include/object_detector.h"
#include "include/preprocess_op.h" #include "include/preprocess_op.h"
#include "include/utils.h"
#include "include/vector_search.h" #include "include/vector_search.h"
#include "json/json.h" #include "json/json.h"
...@@ -158,6 +159,11 @@ int main(int argc, char **argv) { ...@@ -158,6 +159,11 @@ int main(int argc, char **argv) {
<< " [image_dir]>" << std::endl; << " [image_dir]>" << std::endl;
return -1; return -1;
} }
float rec_nms_threshold = 0.05;
if (RT_Config["Global"]["rec_nms_thresold"].isDouble())
rec_nms_threshold = RT_Config["Global"]["rec_nms_thresold"].as<float>();
// Load model and create a object detector // Load model and create a object detector
PPShiTu::ObjectDetector det( PPShiTu::ObjectDetector det(
RT_Config, RT_Config["Global"]["det_model_path"].as<std::string>(), RT_Config, RT_Config["Global"]["det_model_path"].as<std::string>(),
...@@ -174,6 +180,7 @@ int main(int argc, char **argv) { ...@@ -174,6 +180,7 @@ int main(int argc, char **argv) {
// for vector search // for vector search
std::vector<float> feature; std::vector<float> feature;
std::vector<float> features; std::vector<float> features;
std::vector<int> indeices;
double rec_time; double rec_time;
if (!RT_Config["Global"]["infer_imgs"].as<std::string>().empty() || if (!RT_Config["Global"]["infer_imgs"].as<std::string>().empty() ||
!img_dir.empty()) { !img_dir.empty()) {
...@@ -208,9 +215,9 @@ int main(int argc, char **argv) { ...@@ -208,9 +215,9 @@ int main(int argc, char **argv) {
RT_Config["Global"]["max_det_results"].as<int>(), false, &det); RT_Config["Global"]["max_det_results"].as<int>(), false, &det);
// add the whole image for recognition to improve recall // add the whole image for recognition to improve recall
// PPShiTu::ObjectResult result_whole_img = { PPShiTu::ObjectResult result_whole_img = {
// {0, 0, srcimg.cols, srcimg.rows}, 0, 1.0}; {0, 0, srcimg.cols, srcimg.rows}, 0, 1.0};
// det_result.push_back(result_whole_img); det_result.push_back(result_whole_img);
// get rec result // get rec result
PPShiTu::SearchResult search_result; PPShiTu::SearchResult search_result;
...@@ -225,10 +232,18 @@ int main(int argc, char **argv) { ...@@ -225,10 +232,18 @@ int main(int argc, char **argv) {
// do vectore search // do vectore search
search_result = searcher.Search(features.data(), det_result.size()); search_result = searcher.Search(features.data(), det_result.size());
for (int i = 0; i < det_result.size(); ++i) {
det_result[i].confidence = search_result.D[search_result.return_k * i];
}
NMSBoxes(det_result, searcher.GetThreshold(), rec_nms_threshold,
indeices);
PrintResult(img_path, det_result, searcher, search_result); PrintResult(img_path, det_result, searcher, search_result);
batch_imgs.clear(); batch_imgs.clear();
det_result.clear(); det_result.clear();
features.clear();
feature.clear();
indeices.clear();
} }
} }
return 0; return 0;
......
...@@ -13,9 +13,9 @@ ...@@ -13,9 +13,9 @@
// limitations under the License. // limitations under the License.
#include <sstream> #include <sstream>
// for setprecision // for setprecision
#include "include/object_detector.h"
#include <chrono> #include <chrono>
#include <iomanip> #include <iomanip>
#include "include/object_detector.h"
namespace PPShiTu { namespace PPShiTu {
...@@ -30,10 +30,10 @@ void ObjectDetector::LoadModel(std::string model_file, int num_theads) { ...@@ -30,10 +30,10 @@ void ObjectDetector::LoadModel(std::string model_file, int num_theads) {
} }
// Visualiztion MaskDetector results // Visualiztion MaskDetector results
cv::Mat VisualizeResult(const cv::Mat& img, cv::Mat VisualizeResult(const cv::Mat &img,
const std::vector<PPShiTu::ObjectResult>& results, const std::vector<PPShiTu::ObjectResult> &results,
const std::vector<std::string>& lables, const std::vector<std::string> &lables,
const std::vector<int>& colormap, const std::vector<int> &colormap,
const bool is_rbox = false) { const bool is_rbox = false) {
cv::Mat vis_img = img.clone(); cv::Mat vis_img = img.clone();
for (int i = 0; i < results.size(); ++i) { for (int i = 0; i < results.size(); ++i) {
...@@ -75,24 +75,18 @@ cv::Mat VisualizeResult(const cv::Mat& img, ...@@ -75,24 +75,18 @@ cv::Mat VisualizeResult(const cv::Mat& img,
origin.y = results[i].rect[1]; origin.y = results[i].rect[1];
// Configure text background // Configure text background
cv::Rect text_back = cv::Rect(results[i].rect[0], cv::Rect text_back =
results[i].rect[1] - text_size.height, cv::Rect(results[i].rect[0], results[i].rect[1] - text_size.height,
text_size.width, text_size.width, text_size.height);
text_size.height);
// Draw text, and background // Draw text, and background
cv::rectangle(vis_img, text_back, roi_color, -1); cv::rectangle(vis_img, text_back, roi_color, -1);
cv::putText(vis_img, cv::putText(vis_img, text, origin, font_face, font_scale,
text, cv::Scalar(255, 255, 255), thickness);
origin,
font_face,
font_scale,
cv::Scalar(255, 255, 255),
thickness);
} }
return vis_img; return vis_img;
} }
void ObjectDetector::Preprocess(const cv::Mat& ori_im) { void ObjectDetector::Preprocess(const cv::Mat &ori_im) {
// Clone the image : keep the original mat for postprocess // Clone the image : keep the original mat for postprocess
cv::Mat im = ori_im.clone(); cv::Mat im = ori_im.clone();
// cv::cvtColor(im, im, cv::COLOR_BGR2RGB); // cv::cvtColor(im, im, cv::COLOR_BGR2RGB);
...@@ -100,7 +94,7 @@ void ObjectDetector::Preprocess(const cv::Mat& ori_im) { ...@@ -100,7 +94,7 @@ void ObjectDetector::Preprocess(const cv::Mat& ori_im) {
} }
void ObjectDetector::Postprocess(const std::vector<cv::Mat> mats, void ObjectDetector::Postprocess(const std::vector<cv::Mat> mats,
std::vector<PPShiTu::ObjectResult>* result, std::vector<PPShiTu::ObjectResult> *result,
std::vector<int> bbox_num, std::vector<int> bbox_num,
bool is_rbox = false) { bool is_rbox = false) {
result->clear(); result->clear();
...@@ -156,12 +150,11 @@ void ObjectDetector::Postprocess(const std::vector<cv::Mat> mats, ...@@ -156,12 +150,11 @@ void ObjectDetector::Postprocess(const std::vector<cv::Mat> mats,
} }
} }
void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs, void ObjectDetector::Predict(const std::vector<cv::Mat> &imgs, const int warmup,
const int warmup,
const int repeats, const int repeats,
std::vector<PPShiTu::ObjectResult>* result, std::vector<PPShiTu::ObjectResult> *result,
std::vector<int>* bbox_num, std::vector<int> *bbox_num,
std::vector<double>* times) { std::vector<double> *times) {
auto preprocess_start = std::chrono::steady_clock::now(); auto preprocess_start = std::chrono::steady_clock::now();
int batch_size = imgs.size(); int batch_size = imgs.size();
...@@ -180,29 +173,29 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs, ...@@ -180,29 +173,29 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs,
scale_factor_all[bs_idx * 2 + 1] = inputs_.scale_factor_[1]; scale_factor_all[bs_idx * 2 + 1] = inputs_.scale_factor_[1];
// TODO: reduce cost time // TODO: reduce cost time
in_data_all.insert( in_data_all.insert(in_data_all.end(), inputs_.im_data_.begin(),
in_data_all.end(), inputs_.im_data_.begin(), inputs_.im_data_.end()); inputs_.im_data_.end());
} }
auto preprocess_end = std::chrono::steady_clock::now(); auto preprocess_end = std::chrono::steady_clock::now();
std::vector<const float *> output_data_list_; std::vector<const float *> output_data_list_;
// Prepare input tensor // Prepare input tensor
auto input_names = predictor_->GetInputNames(); auto input_names = predictor_->GetInputNames();
for (const auto& tensor_name : input_names) { for (const auto &tensor_name : input_names) {
auto in_tensor = predictor_->GetInputByName(tensor_name); auto in_tensor = predictor_->GetInputByName(tensor_name);
if (tensor_name == "image") { if (tensor_name == "image") {
int rh = inputs_.in_net_shape_[0]; int rh = inputs_.in_net_shape_[0];
int rw = inputs_.in_net_shape_[1]; int rw = inputs_.in_net_shape_[1];
in_tensor->Resize({batch_size, 3, rh, rw}); in_tensor->Resize({batch_size, 3, rh, rw});
auto* inptr = in_tensor->mutable_data<float>(); auto *inptr = in_tensor->mutable_data<float>();
std::copy_n(in_data_all.data(), in_data_all.size(), inptr); std::copy_n(in_data_all.data(), in_data_all.size(), inptr);
} else if (tensor_name == "im_shape") { } else if (tensor_name == "im_shape") {
in_tensor->Resize({batch_size, 2}); in_tensor->Resize({batch_size, 2});
auto* inptr = in_tensor->mutable_data<float>(); auto *inptr = in_tensor->mutable_data<float>();
std::copy_n(im_shape_all.data(), im_shape_all.size(), inptr); std::copy_n(im_shape_all.data(), im_shape_all.size(), inptr);
} else if (tensor_name == "scale_factor") { } else if (tensor_name == "scale_factor") {
in_tensor->Resize({batch_size, 2}); in_tensor->Resize({batch_size, 2});
auto* inptr = in_tensor->mutable_data<float>(); auto *inptr = in_tensor->mutable_data<float>();
std::copy_n(scale_factor_all.data(), scale_factor_all.size(), inptr); std::copy_n(scale_factor_all.data(), scale_factor_all.size(), inptr);
} }
} }
...@@ -216,7 +209,7 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs, ...@@ -216,7 +209,7 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs,
if (config_.arch_ == "PicoDet") { if (config_.arch_ == "PicoDet") {
for (int j = 0; j < output_names.size(); j++) { for (int j = 0; j < output_names.size(); j++) {
auto output_tensor = predictor_->GetTensor(output_names[j]); auto output_tensor = predictor_->GetTensor(output_names[j]);
const float* outptr = output_tensor->data<float>(); const float *outptr = output_tensor->data<float>();
std::vector<int64_t> output_shape = output_tensor->shape(); std::vector<int64_t> output_shape = output_tensor->shape();
output_data_list_.push_back(outptr); output_data_list_.push_back(outptr);
} }
...@@ -242,7 +235,7 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs, ...@@ -242,7 +235,7 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs,
if (config_.arch_ == "PicoDet") { if (config_.arch_ == "PicoDet") {
for (int i = 0; i < output_names.size(); i++) { for (int i = 0; i < output_names.size(); i++) {
auto output_tensor = predictor_->GetTensor(output_names[i]); auto output_tensor = predictor_->GetTensor(output_names[i]);
const float* outptr = output_tensor->data<float>(); const float *outptr = output_tensor->data<float>();
std::vector<int64_t> output_shape = output_tensor->shape(); std::vector<int64_t> output_shape = output_tensor->shape();
if (i == 0) { if (i == 0) {
num_class = output_shape[2]; num_class = output_shape[2];
...@@ -268,16 +261,15 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs, ...@@ -268,16 +261,15 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs,
std::cerr << "[WARNING] No object detected." << std::endl; std::cerr << "[WARNING] No object detected." << std::endl;
} }
output_data_.resize(output_size); output_data_.resize(output_size);
std::copy_n( std::copy_n(output_tensor->mutable_data<float>(), output_size,
output_tensor->mutable_data<float>(), output_size, output_data_.data()); output_data_.data());
int out_bbox_num_size = 1; int out_bbox_num_size = 1;
for (int j = 0; j < out_bbox_num_shape.size(); ++j) { for (int j = 0; j < out_bbox_num_shape.size(); ++j) {
out_bbox_num_size *= out_bbox_num_shape[j]; out_bbox_num_size *= out_bbox_num_shape[j];
} }
out_bbox_num_data_.resize(out_bbox_num_size); out_bbox_num_data_.resize(out_bbox_num_size);
std::copy_n(out_bbox_num->mutable_data<int>(), std::copy_n(out_bbox_num->mutable_data<int>(), out_bbox_num_size,
out_bbox_num_size,
out_bbox_num_data_.data()); out_bbox_num_data_.data());
} }
// Postprocessing result // Postprocessing result
...@@ -285,9 +277,8 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs, ...@@ -285,9 +277,8 @@ void ObjectDetector::Predict(const std::vector<cv::Mat>& imgs,
result->clear(); result->clear();
if (config_.arch_ == "PicoDet") { if (config_.arch_ == "PicoDet") {
PPShiTu::PicoDetPostProcess( PPShiTu::PicoDetPostProcess(
result, output_data_list_, config_.fpn_stride_, result, output_data_list_, config_.fpn_stride_, inputs_.im_shape_,
inputs_.im_shape_, inputs_.scale_factor_, inputs_.scale_factor_, config_.nms_info_["score_threshold"].as<float>(),
config_.nms_info_["score_threshold"].as<float>(),
config_.nms_info_["nms_threshold"].as<float>(), num_class, reg_max); config_.nms_info_["nms_threshold"].as<float>(), num_class, reg_max);
bbox_num->push_back(result->size()); bbox_num->push_back(result->size());
} else { } else {
...@@ -326,4 +317,4 @@ std::vector<int> GenerateColorMap(int num_class) { ...@@ -326,4 +317,4 @@ std::vector<int> GenerateColorMap(int num_class) {
return colormap; return colormap;
} }
} // namespace PPShiTu } // namespace PPShiTu
...@@ -47,9 +47,9 @@ int activation_function_softmax(const _Tp *src, _Tp *dst, int length) { ...@@ -47,9 +47,9 @@ int activation_function_softmax(const _Tp *src, _Tp *dst, int length) {
} }
// PicoDet decode // PicoDet decode
PPShiTu::ObjectResult PPShiTu::ObjectResult disPred2Bbox(const float *&dfl_det, int label,
disPred2Bbox(const float *&dfl_det, int label, float score, int x, int y, float score, int x, int y, int stride,
int stride, std::vector<float> im_shape, int reg_max) { std::vector<float> im_shape, int reg_max) {
float ct_x = (x + 0.5) * stride; float ct_x = (x + 0.5) * stride;
float ct_y = (y + 0.5) * stride; float ct_y = (y + 0.5) * stride;
std::vector<float> dis_pred; std::vector<float> dis_pred;
......
...@@ -20,7 +20,7 @@ ...@@ -20,7 +20,7 @@
namespace PPShiTu { namespace PPShiTu {
void InitInfo::Run(cv::Mat* im, ImageBlob* data) { void InitInfo::Run(cv::Mat *im, ImageBlob *data) {
data->im_shape_ = {static_cast<float>(im->rows), data->im_shape_ = {static_cast<float>(im->rows),
static_cast<float>(im->cols)}; static_cast<float>(im->cols)};
data->scale_factor_ = {1., 1.}; data->scale_factor_ = {1., 1.};
...@@ -28,10 +28,10 @@ void InitInfo::Run(cv::Mat* im, ImageBlob* data) { ...@@ -28,10 +28,10 @@ void InitInfo::Run(cv::Mat* im, ImageBlob* data) {
static_cast<float>(im->cols)}; static_cast<float>(im->cols)};
} }
void NormalizeImage::Run(cv::Mat* im, ImageBlob* data) { void NormalizeImage::Run(cv::Mat *im, ImageBlob *data) {
double e = 1.0; double e = 1.0;
if (is_scale_) { if (is_scale_) {
e *= 1./255.0; e *= 1. / 255.0;
} }
(*im).convertTo(*im, CV_32FC3, e); (*im).convertTo(*im, CV_32FC3, e);
for (int h = 0; h < im->rows; h++) { for (int h = 0; h < im->rows; h++) {
...@@ -46,35 +46,61 @@ void NormalizeImage::Run(cv::Mat* im, ImageBlob* data) { ...@@ -46,35 +46,61 @@ void NormalizeImage::Run(cv::Mat* im, ImageBlob* data) {
} }
} }
void Permute::Run(cv::Mat* im, ImageBlob* data) { void NormalizeImage::Run_feature(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &std, float scale) {
(*im).convertTo(*im, CV_32FC3, scale);
for (int h = 0; h < im->rows; h++) {
for (int w = 0; w < im->cols; w++) {
im->at<cv::Vec3f>(h, w)[0] =
(im->at<cv::Vec3f>(h, w)[0] - mean[0]) / std[0];
im->at<cv::Vec3f>(h, w)[1] =
(im->at<cv::Vec3f>(h, w)[1] - mean[1]) / std[1];
im->at<cv::Vec3f>(h, w)[2] =
(im->at<cv::Vec3f>(h, w)[2] - mean[2]) / std[2];
}
}
}
void Permute::Run(cv::Mat *im, ImageBlob *data) {
(*im).convertTo(*im, CV_32FC3); (*im).convertTo(*im, CV_32FC3);
int rh = im->rows; int rh = im->rows;
int rw = im->cols; int rw = im->cols;
int rc = im->channels(); int rc = im->channels();
(data->im_data_).resize(rc * rh * rw); (data->im_data_).resize(rc * rh * rw);
float* base = (data->im_data_).data(); float *base = (data->im_data_).data();
for (int i = 0; i < rc; ++i) { for (int i = 0; i < rc; ++i) {
cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, base + i * rh * rw), i); cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, base + i * rh * rw), i);
} }
} }
void Resize::Run(cv::Mat* im, ImageBlob* data) { void Permute::Run_feature(const cv::Mat *im, float *data) {
int rh = im->rows;
int rw = im->cols;
int rc = im->channels();
for (int i = 0; i < rc; ++i) {
cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
}
}
void Resize::Run(cv::Mat *im, ImageBlob *data) {
auto resize_scale = GenerateScale(*im); auto resize_scale = GenerateScale(*im);
data->im_shape_ = {static_cast<float>(im->cols * resize_scale.first), data->im_shape_ = {static_cast<float>(im->cols * resize_scale.first),
static_cast<float>(im->rows * resize_scale.second)}; static_cast<float>(im->rows * resize_scale.second)};
data->in_net_shape_ = {static_cast<float>(im->cols * resize_scale.first), data->in_net_shape_ = {static_cast<float>(im->cols * resize_scale.first),
static_cast<float>(im->rows * resize_scale.second)}; static_cast<float>(im->rows * resize_scale.second)};
cv::resize( cv::resize(*im, *im, cv::Size(), resize_scale.first, resize_scale.second,
*im, *im, cv::Size(), resize_scale.first, resize_scale.second, interp_); interp_);
data->im_shape_ = { data->im_shape_ = {
static_cast<float>(im->rows), static_cast<float>(im->cols), static_cast<float>(im->rows),
static_cast<float>(im->cols),
}; };
data->scale_factor_ = { data->scale_factor_ = {
resize_scale.second, resize_scale.first, resize_scale.second,
resize_scale.first,
}; };
} }
std::pair<float, float> Resize::GenerateScale(const cv::Mat& im) { std::pair<float, float> Resize::GenerateScale(const cv::Mat &im) {
std::pair<float, float> resize_scale; std::pair<float, float> resize_scale;
int origin_w = im.cols; int origin_w = im.cols;
int origin_h = im.rows; int origin_h = im.rows;
...@@ -101,7 +127,30 @@ std::pair<float, float> Resize::GenerateScale(const cv::Mat& im) { ...@@ -101,7 +127,30 @@ std::pair<float, float> Resize::GenerateScale(const cv::Mat& im) {
return resize_scale; return resize_scale;
} }
void PadStride::Run(cv::Mat* im, ImageBlob* data) { void Resize::Run_feature(const cv::Mat &img, cv::Mat &resize_img,
int resize_short_size, int size) {
int resize_h = 0;
int resize_w = 0;
if (size > 0) {
resize_h = size;
resize_w = size;
} else {
int w = img.cols;
int h = img.rows;
float ratio = 1.f;
if (h < w) {
ratio = float(resize_short_size) / float(h);
} else {
ratio = float(resize_short_size) / float(w);
}
resize_h = round(float(h) * ratio);
resize_w = round(float(w) * ratio);
}
cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
}
void PadStride::Run(cv::Mat *im, ImageBlob *data) {
if (stride_ <= 0) { if (stride_ <= 0) {
return; return;
} }
...@@ -110,48 +159,44 @@ void PadStride::Run(cv::Mat* im, ImageBlob* data) { ...@@ -110,48 +159,44 @@ void PadStride::Run(cv::Mat* im, ImageBlob* data) {
int rw = im->cols; int rw = im->cols;
int nh = (rh / stride_) * stride_ + (rh % stride_ != 0) * stride_; int nh = (rh / stride_) * stride_ + (rh % stride_ != 0) * stride_;
int nw = (rw / stride_) * stride_ + (rw % stride_ != 0) * stride_; int nw = (rw / stride_) * stride_ + (rw % stride_ != 0) * stride_;
cv::copyMakeBorder( cv::copyMakeBorder(*im, *im, 0, nh - rh, 0, nw - rw, cv::BORDER_CONSTANT,
*im, *im, 0, nh - rh, 0, nw - rw, cv::BORDER_CONSTANT, cv::Scalar(0)); cv::Scalar(0));
data->in_net_shape_ = { data->in_net_shape_ = {
static_cast<float>(im->rows), static_cast<float>(im->cols), static_cast<float>(im->rows),
static_cast<float>(im->cols),
}; };
} }
void TopDownEvalAffine::Run(cv::Mat* im, ImageBlob* data) { void TopDownEvalAffine::Run(cv::Mat *im, ImageBlob *data) {
cv::resize(*im, *im, cv::Size(trainsize_[0], trainsize_[1]), 0, 0, interp_); cv::resize(*im, *im, cv::Size(trainsize_[0], trainsize_[1]), 0, 0, interp_);
// todo: Simd::ResizeBilinear(); // todo: Simd::ResizeBilinear();
data->in_net_shape_ = { data->in_net_shape_ = {
static_cast<float>(trainsize_[1]), static_cast<float>(trainsize_[0]), static_cast<float>(trainsize_[1]),
static_cast<float>(trainsize_[0]),
}; };
} }
// Preprocessor op running order // Preprocessor op running order
const std::vector<std::string> Preprocessor::RUN_ORDER = {"InitInfo", const std::vector<std::string> Preprocessor::RUN_ORDER = {
"DetTopDownEvalAffine", "InitInfo", "DetTopDownEvalAffine", "DetResize",
"DetResize", "DetNormalizeImage", "DetPadStride", "DetPermute"};
"DetNormalizeImage",
"DetPadStride", void Preprocessor::Run(cv::Mat *im, ImageBlob *data) {
"DetPermute"}; for (const auto &name : RUN_ORDER) {
void Preprocessor::Run(cv::Mat* im, ImageBlob* data) {
for (const auto& name : RUN_ORDER) {
if (ops_.find(name) != ops_.end()) { if (ops_.find(name) != ops_.end()) {
ops_[name]->Run(im, data); ops_[name]->Run(im, data);
} }
} }
} }
void CropImg(cv::Mat& img, void CropImg(cv::Mat &img, cv::Mat &crop_img, std::vector<int> &area,
cv::Mat& crop_img, std::vector<float> &center, std::vector<float> &scale,
std::vector<int>& area,
std::vector<float>& center,
std::vector<float>& scale,
float expandratio) { float expandratio) {
int crop_x1 = std::max(0, area[0]); int crop_x1 = std::max(0, area[0]);
int crop_y1 = std::max(0, area[1]); int crop_y1 = std::max(0, area[1]);
int crop_x2 = std::min(img.cols - 1, area[2]); int crop_x2 = std::min(img.cols - 1, area[2]);
int crop_y2 = std::min(img.rows - 1, area[3]); int crop_y2 = std::min(img.rows - 1, area[3]);
int center_x = (crop_x1 + crop_x2) / 2.; int center_x = (crop_x1 + crop_x2) / 2.;
int center_y = (crop_y1 + crop_y2) / 2.; int center_y = (crop_y1 + crop_y2) / 2.;
int half_h = (crop_y2 - crop_y1) / 2.; int half_h = (crop_y2 - crop_y1) / 2.;
...@@ -182,4 +227,4 @@ void CropImg(cv::Mat& img, ...@@ -182,4 +227,4 @@ void CropImg(cv::Mat& img,
scale.emplace_back((crop_y2 - crop_y1)); scale.emplace_back((crop_y2 - crop_y1));
} }
} // namespace PPShiTu } // namespace PPShiTu
...@@ -54,4 +54,53 @@ void nms(std::vector<ObjectResult> &input_boxes, float nms_threshold, ...@@ -54,4 +54,53 @@ void nms(std::vector<ObjectResult> &input_boxes, float nms_threshold,
} }
} }
float RectOverlap(const ObjectResult &a, const ObjectResult &b) {
float Aa = (a.rect[2] - a.rect[0] + 1) * (a.rect[3] - a.rect[1] + 1);
float Ab = (b.rect[2] - b.rect[0] + 1) * (b.rect[3] - b.rect[1] + 1);
int iou_w = max(min(a.rect[2], b.rect[2]) - max(a.rect[0], b.rect[0]) + 1, 0);
int iou_h = max(min(a.rect[3], b.rect[3]) - max(a.rect[1], b.rect[1]) + 1, 0);
float Aab = iou_w * iou_h;
return Aab / (Aa + Ab - Aab);
}
inline void
GetMaxScoreIndex(const std::vector<ObjectResult> &det_result,
const float threshold,
std::vector<std::pair<float, int>> &score_index_vec) {
// Generate index score pairs.
for (size_t i = 0; i < det_result.size(); ++i) {
if (det_result[i].confidence > threshold) {
score_index_vec.push_back(std::make_pair(det_result[i].confidence, i));
}
}
// Sort the score pair according to the scores in descending order
std::stable_sort(score_index_vec.begin(), score_index_vec.end(),
SortScorePairDescend<int>);
}
void NMSBoxes(const std::vector<ObjectResult> det_result,
const float score_threshold, const float nms_threshold,
std::vector<int> &indices) {
// Get top_k scores (with corresponding indices).
std::vector<std::pair<float, int>> score_index_vec;
GetMaxScoreIndex(det_result, score_threshold, score_index_vec);
// Do nms
indices.clear();
for (size_t i = 0; i < score_index_vec.size(); ++i) {
const int idx = score_index_vec[i].second;
bool keep = true;
for (int k = 0; k < (int)indices.size() && keep; ++k) {
const int kept_idx = indices[k];
float overlap = RectOverlap(det_result[idx], det_result[kept_idx]);
keep = overlap <= nms_threshold;
}
if (keep)
indices.push_back(idx);
}
}
} // namespace PPShiTu } // namespace PPShiTu
...@@ -64,4 +64,4 @@ const SearchResult &VectorSearch::Search(float *feature, int query_number) { ...@@ -64,4 +64,4 @@ const SearchResult &VectorSearch::Search(float *feature, int query_number) {
const std::string &VectorSearch::GetLabel(faiss::Index::idx_t ind) { const std::string &VectorSearch::GetLabel(faiss::Index::idx_t ind) {
return this->id_map.at(ind); return this->id_map.at(ind);
} }
} } // namespace PPShiTu
\ No newline at end of file
../../docs/zh_CN/inference_deployment/paddle_serving_deploy.md ../../docs/zh_CN/inference_deployment/classification_serving_deploy.md
\ No newline at end of file \ No newline at end of file
../../docs/en/inference_deployment/classification_serving_deploy_en.md
\ No newline at end of file
../../../docs/zh_CN/inference_deployment/recognition_serving_deploy.md
\ No newline at end of file
../../../docs/en/inference_deployment/recognition_serving_deploy_en.md
\ No newline at end of file
nohup python3 -m paddle_serving_server.serve \ gpu_id=$1
--model ../../models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving \
--port 9293 >>log_mainbody_detection.txt 1&>2 &
nohup python3 -m paddle_serving_server.serve \ # PP-ShiTu CPP serving script
--model ../../models/general_PPLCNet_x2_5_lite_v1.0_serving \ if [[ -n "${gpu_id}" ]]; then
--port 9294 >>log_feature_extraction.txt 1&>2 & nohup python3.7 -m paddle_serving_server.serve \
--model ../../models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving ../../models/general_PPLCNet_x2_5_lite_v1.0_serving \
--op GeneralPicodetOp GeneralFeatureExtractOp \
--port 9400 --gpu_id="${gpu_id}" > log_PPShiTu.txt 2>&1 &
else
nohup python3.7 -m paddle_serving_server.serve \
--model ../../models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving ../../models/general_PPLCNet_x2_5_lite_v1.0_serving \
--op GeneralPicodetOp GeneralFeatureExtractOp \
--port 9400 > log_PPShiTu.txt 2>&1 &
fi
#run cls server: gpu_id=$1
nohup python3 -m paddle_serving_server.serve --model ResNet50_vd_serving --port 9292 &
# ResNet50_vd CPP serving script
if [[ -n "${gpu_id}" ]]; then
    nohup python3.7 -m paddle_serving_server.serve \
    --model ./ResNet50_vd_serving \
    --op GeneralClasOp \
    --port 9292 --gpu_id="${gpu_id}" &
else
    nohup python3.7 -m paddle_serving_server.serve \
    --model ./ResNet50_vd_serving \
    --op GeneralClasOp \
    --port 9292 &
fi
English | [简体中文](../../zh_CN/inference_deployment/classification_serving_deploy.md)
# Classification model service deployment
## Table of contents
- [1. Introduction](#1-introduction)
- [2. Serving installation](#2-serving-installation)
- [3. Image Classification Service Deployment](#3-image-classification-service-deployment)
- [3.1 Model conversion](#31-model-conversion)
- [3.2 Service deployment and request](#32-service-deployment-and-request)
- [3.2.1 Python Serving](#321-python-serving)
- [3.2.2 C++ Serving](#322-c-serving)
<a name="1"></a>
## 1. Introduction
[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning developers easily deploy online prediction services. It supports one-click deployment of industrial-grade services, highly concurrent and efficient communication between client and server, and client development in multiple programming languages.
This section takes HTTP prediction service deployment as an example to introduce how to use PaddleServing to deploy model services in PaddleClas. Currently, only deployment on the Linux platform is supported; the Windows platform is not supported yet.
<a name="2"></a>
## 2. Serving installation
The Serving official website recommends using docker to install and deploy the Serving environment. First, pull the docker image and create a Serving-based container.
```shell
# start GPU docker
docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
nvidia-docker exec -it test bash
# start CPU docker
docker pull paddlepaddle/serving:0.7.0-devel
docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash
docker exec -it test bash
```
After entering the docker container, install the Serving-related Python packages.
```shell
python3.7 -m pip install paddle-serving-client==0.7.0
python3.7 -m pip install paddle-serving-app==0.7.0
python3.7 -m pip install faiss-cpu==1.7.1post2
#If it is a CPU deployment environment:
python3.7 -m pip install paddle-serving-server==0.7.0 #CPU
python3.7 -m pip install paddlepaddle==2.2.0 # CPU
#If it is a GPU deployment environment
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6
python3.7 -m pip install paddlepaddle-gpu==2.2.0 # GPU with CUDA10.2
#Other GPU environments need to confirm the environment and then choose which one to execute
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
* If the installation speed is too slow, you can change the source through `-i https://pypi.tuna.tsinghua.edu.cn/simple` to speed up the installation process.
* For other environment configuration installation, please refer to: [Install Paddle Serving with Docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_EN.md)
<a name="3"></a>
## 3. Image Classification Service Deployment
The following takes the classic ResNet50_vd model as an example to introduce how to deploy the image classification service.
<a name="3.1"></a>
### 3.1 Model conversion
When using PaddleServing for service deployment, you need to convert the saved inference model into a Serving model.
- Go to the working directory:
```shell
cd deploy/paddleserving
```
- Download and unzip the inference model for ResNet50_vd:
```shell
# Download ResNet50_vd inference model
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar
# Decompress the ResNet50_vd inference model
tar xf ResNet50_vd_infer.tar
```
- Use the paddle_serving_client command to convert the downloaded inference model into a model format for easy server deployment:
```shell
# Convert ResNet50_vd model
python3.7 -m paddle_serving_client.convert \
--dirname ./ResNet50_vd_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ResNet50_vd_serving/ \
--serving_client ./ResNet50_vd_client/
```
The specific meanings of the parameters in the above command are shown in the following table:
| parameter | type | default value | description |
| --------- | ---- | ------------- | ----------- |
| `dirname` | str | - | The storage path of the model file to be converted. The program structure file and parameter file are saved in this directory. |
| `model_filename` | str | None | The name of the file storing the model Inference Program structure that needs to be converted. If set to None, use `__model__` as the default filename |
| `params_filename` | str | None | File name where all parameters of the model to be converted are stored. It needs to be specified if and only if all model parameters are stored in a single binary file. If the model parameters are stored in separate files, set it to None |
| `serving_server` | str | `"serving_server"` | The storage path of the converted model files and configuration files. Default is serving_server |
| `serving_client` | str | `"serving_client"` | The converted client configuration file storage path. Default is serving_client |
After the ResNet50_vd inference model conversion is completed, there will be additional `ResNet50_vd_serving` and `ResNet50_vd_client` folders in the current folder, with the following structure:
```shell
├── ResNet50_vd_serving/
│ ├── inference.pdiparams
│ ├── inference.pdmodel
│ ├── serving_server_conf.prototxt
│ └── serving_server_conf.stream.prototxt
└── ResNet50_vd_client/
├── serving_client_conf.prototxt
└── serving_client_conf.stream.prototxt
```
- Serving provides the function of input and output renaming in order to be compatible with the deployment of different models. When deploying a different model for inference, you only need to modify the `alias_name` in the configuration file, and the deployment can be completed without modifying the code. Therefore, after the conversion, you need to modify the alias names in the `serving_server_conf.prototxt` files under `ResNet50_vd_serving` and `ResNet50_vd_client` respectively, changing the `alias_name` under `fetch_var` to `prediction`. The modified `serving_server_conf.prototxt` is shown below:
```log
feed_var {
name: "inputs"
alias_name: "inputs"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "prediction"
is_lod_tensor: false
fetch_type: 1
shape: 1000
}
```
<a name="3.2"></a>
### 3.2 Service deployment and request
The paddleserving directory contains the code for starting the pipeline service, the C++ serving service and sending the prediction request, mainly including:
```shell
__init__.py
classification_web_service.py # Script to start the pipeline server
config.yml # Configuration file to start the pipeline service
pipeline_http_client.py # Script for sending pipeline prediction requests in http mode
pipeline_rpc_client.py # Script for sending pipeline prediction requests in rpc mode
readme.md # Classification model service deployment document
run_cpp_serving.sh # Script to start the C++ Serving service
test_cpp_serving_client.py # Script for sending C++ serving prediction requests in rpc mode
```
<a name="3.2.1"></a>
#### 3.2.1 Python Serving
- Start the service:
```shell
# Start the service and save the running log in log.txt
python3.7 classification_web_service.py &>log.txt &
```
- send request:
```shell
# send service request
python3.7 pipeline_http_client.py
```
After a successful run, the model prediction results will be printed in the terminal window as follows (a minimal Python sketch of this request is given at the end of this subsection):
```log
{'err_no': 0, 'err_msg': '', 'key': ['label', 'prob'], 'value': ["['daisy']", '[0.9341402053833008]'], 'tensors': []}
```
- turn off the service
If the service program is running in the foreground, you can press `Ctrl+C` to terminate the server program; if it is running in the background, you can use the kill command to close related processes, or you can execute the following command in the path where the service program is started to terminate the server program:
```bash
python3.7 -m paddle_serving_server.serve stop
```
After the execution is completed, the `Process stopped` message appears, indicating that the service was successfully shut down.
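For reference, the request sent by `pipeline_http_client.py` can be reproduced with a few lines of Python. The sketch below is only an illustration: the port (`18080`) and service name (`imagenet`) are assumed defaults from `config.yml`, and `./daisy.jpg` is a placeholder image path; adjust these values to match your own configuration.
```python
# Minimal sketch of the HTTP request sent to the pipeline service.
# Assumptions: http_port 18080 and op name "imagenet" (check config.yml),
# "./daisy.jpg" is only a placeholder image path.
import base64
import json

import requests

url = "http://127.0.0.1:18080/imagenet/prediction"

with open("./daisy.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf8")

# Pipeline services expect a {"key": [...], "value": [...]} style payload.
payload = {"key": ["image"], "value": [image_data]}
response = requests.post(url=url, data=json.dumps(payload))
print(response.json())
```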
<a name="3.2.2"></a>
#### 3.2.2 C++ Serving
Different from Python Serving, the C++ Serving client calls C++ OP to predict, so before starting the service, you need to compile and install the serving server package, and set `SERVING_BIN`.
- Compile and install the Serving server package
```shell
# Enter the working directory
cd PaddleClas/deploy/paddleserving
# One-click compile and install Serving server, set SERVING_BIN
source ./build_server.sh python3.7
```
**Note**: The paths set in [build_server.sh](./build_server.sh#L55-L62) may need to be modified according to the actual machine environment, such as the CUDA and Python versions, before compiling.
- Modify the client file `ResNet50_vd_client/serving_client_conf.prototxt`: change the field after `feed_type:` to 20, change the field after the first `shape:` to 1, and delete the rest of the `shape` fields.
```log
feed_var {
name: "inputs"
alias_name: "inputs"
is_lod_tensor: false
feed_type: 20
shape: 1
}
```
- Modify part of the code in [`test_cpp_serving_client.py`](./test_cpp_serving_client.py) (a minimal client sketch is given after this list):
  1. Modify the [`load_client_config`](./test_cpp_serving_client.py#L28) part of the code, changing the path after `load_client_config` to `ResNet50_vd_client/serving_client_conf.prototxt`.
  2. Modify the [`feed={"inputs": image}`](./test_cpp_serving_client.py#L45) part of the code, keeping `inputs` consistent with the `name` field under `feed_var` in `ResNet50_vd_client/serving_client_conf.prototxt`. Since `name` is `x` rather than `inputs` in the client files of some models, pay attention to this when deploying those models with C++ Serving.
- Start the service:
```shell
# Start the service, the service runs in the background, and the running log is saved in nohup.txt
# CPU deployment
sh run_cpp_serving.sh
# GPU deployment and specify card 0
sh run_cpp_serving.sh 0
```
- send request:
```shell
# send service request
python3.7 test_cpp_serving_client.py
```
After a successful run, the results of the model prediction will be printed in the cmd window, and the results are as follows:
```log
prediction: daisy, probability: 0.9341399073600769
```
- close the service:
If the service program is running in the foreground, you can press `Ctrl+C` to terminate the server program; if it is running in the background, you can use the kill command to close related processes, or you can execute the following command in the path where the service program is started to terminate the server program:
```bash
python3.7 -m paddle_serving_server.serve stop
```
After the execution is completed, the `Process stopped` message appears, indicating that the service was successfully shut down.
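For reference, the core RPC client logic described above can be sketched as follows. This is a simplified illustration rather than the full `test_cpp_serving_client.py`: it assumes the client config is at `ResNet50_vd_client/serving_client_conf.prototxt`, the service listens on port 9292, the input variable is named `inputs` (with `feed_type` 20, i.e. a base64 string), and `./daisy.jpg` is a placeholder image path.
```python
# Minimal RPC client sketch for the C++ Serving service (simplified illustration).
import base64

from paddle_serving_client import Client

client = Client()
client.load_client_config("ResNet50_vd_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])

# With feed_type set to 20, the image is sent as a base64 string and the
# preprocessing is performed by the C++ OP on the server side.
with open("./daisy.jpg", "rb") as f:  # placeholder image path
    image = base64.b64encode(f.read()).decode("utf8")

fetch_map = client.predict(feed={"inputs": image}, fetch=["prediction"], batch=False)
print(fetch_map)
```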
## 4. FAQ
**Q1**: No result is returned after the request is sent or an output decoding error is prompted
**A1**: Do not set the proxy when starting the service and sending the request. You can close the proxy before starting the service and sending the request. The command to close the proxy is:
```shell
unset https_proxy
unset http_proxy
```
**Q2**: Nothing happens after starting the service.
**A2**: You can check whether the path corresponding to `model_config` in `config.yml` exists and whether the folder name is correct.
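As an illustration of what to check, a typical pipeline `config.yml` for this example keeps the `model_config` path under the op configuration. The excerpt below is a hypothetical sketch; the op name and surrounding fields may differ in your file.
```yaml
# Hypothetical excerpt of config.yml; only the model_config field matters here.
op:
  imagenet:                     # op name is an assumption, check your own config.yml
    local_service_conf:
      model_config: ResNet50_vd_serving   # path to the converted Serving model
```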
For more service deployment types, such as `RPC prediction service`, you can refer to Serving's [github official website](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples)
# Service deployment based on PaddleHub Serving English | [简体中文](../../zh_CN/inference_deployment/paddle_hub_serving_deploy.md)
PaddleClas supports rapid service deployment through Paddlehub. At present, it supports the deployment of image classification. Please look forward to the deployment of image recognition. # Service deployment based on PaddleHub Serving
--- PaddleClas supports rapid service deployment through PaddleHub. Currently, the deployment of image classification is supported; support for deploying image recognition is coming soon.
## Catalogue ## Catalogue
- [1. Introduction](#1) - [1. Introduction](#1)
- [2. Prepare the environment](#2) - [2. Prepare the environment](#2)
- [3. Download inference model](#3) - [3. Download inference model](#3)
...@@ -16,97 +15,101 @@ PaddleClas supports rapid service deployment through Paddlehub. At present, it s ...@@ -16,97 +15,101 @@ PaddleClas supports rapid service deployment through Paddlehub. At present, it s
- [6. Send prediction requests](#6) - [6. Send prediction requests](#6)
- [7. User defined service module modification](#7) - [7. User defined service module modification](#7)
<a name="1"></a> <a name="1"></a>
## 1. Introduction ## 1 Introduction
HubServing service pack contains 3 files, the directory is as follows: The hubserving service deployment configuration service package `clas` contains 3 required files, the directories are as follows:
```shell
deploy/hubserving/clas/
├── __init__.py # Empty file, required
├── config.json # Configuration file, optional, passed in as a parameter when starting the service with configuration
├── module.py # The main module, required, contains the complete logic of the service
└── params.py # Parameter file, required, including model path, pre- and post-processing parameters and other parameters
``` ```
hubserving/clas/
└─ __init__.py Empty file, required
└─ config.json Configuration file, optional, passed in as a parameter when using configuration to start the service
└─ module.py Main module file, required, contains the complete logic of the service
└─ params.py Parameter file, required, including parameters such as model path, pre- and post-processing parameters
```
<a name="2"></a> <a name="2"></a>
## 2. Prepare the environment ## 2. Prepare the environment
```shell ```shell
# Install version 2.0 of PaddleHub # Install paddlehub, version 2.1.0 is recommended
pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple python3.7 -m pip install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
``` ```
<a name="3"></a> <a name="3"></a>
## 3. Download inference model ## 3. Download the inference model
Before installing the service module, you need to prepare the inference model and put it in the correct path. The default model path is: Before installing the service module, you need to prepare the inference model and put it in the correct path. The default model path is:
* Model structure file: `PaddleClas/inference/inference.pdmodel` * Classification inference model structure file: `PaddleClas/inference/inference.pdmodel`
* Model parameters file: `PaddleClas/inference/inference.pdiparams` * Classification inference model weight file: `PaddleClas/inference/inference.pdiparams`
**Notice**: **Notice**:
* The model file path can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`. * Model file paths can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`:
* It should be noted that the prefix of model structure file and model parameters file must be `inference`.
* More models provided by PaddleClas can be obtained from the [model library](../algorithm_introduction/ImageNet_models_en.md). You can also use models trained by yourself. ```python
"inference_model_dir": "../inference/"
```
* Model files (including `.pdmodel` and `.pdiparams`) must be named `inference`.
* We provide a large number of pre-trained models based on the ImageNet-1k dataset. For the model list and download address, see [Model Library Overview](../algorithm_introduction/ImageNet_models.md), or you can use your own trained and converted models.
<a name="4"></a> <a name="4"></a>
## 4. Install Service Module ## 4. Install the service module
* On Linux platform, the examples are as follows. * In the Linux environment, the installation example is as follows:
```shell ```shell
cd PaddleClas/deploy cd PaddleClas/deploy
hub install hubserving/clas/ # Install the service module:
``` hub install hubserving/clas/
```
* In the Windows environment (the folder separator is `\`), the installation example is as follows:
```shell
cd PaddleClas\deploy
# Install the service module:
hub install hubserving\clas\
```
* On Windows platform, the examples are as follows.
```shell
cd PaddleClas\deploy
hub install hubserving\clas\
```
<a name="5"></a> <a name="5"></a>
## 5. Start service ## 5. Start service
<a name="5.1"></a> <a name="5.1"></a>
### 5.1 Start with command line parameters ### 5.1 Start with command line parameters
This method only supports CPU. Command as follow: This method only supports prediction using CPU. Start command:
```shell ```shell
$ hub serving start --modules Module1==Version1 \ hub serving start \
--port XXXX \ --modules clas_system
--use_multiprocess \ --port 8866
--workers \ ```
``` This completes the deployment of a serviced API, using the default port number 8866.
**parameters:**
|parameters|usage|
|-|-|
|--modules/-m|PaddleHub Serving pre-installed model, listed in the form of multiple Module==Version key-value pairs<br>*`When Version is not specified, the latest version is selected by default`*|
|--port/-p|Service port, default is 8866|
|--use_multiprocess|Enable concurrent mode, the default is single-process mode, this mode is recommended for multi-core CPU machines<br>*`Windows operating system only supports single-process mode`*|
|--workers|The number of concurrent tasks specified in concurrent mode, the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores|
For example, start service:
```shell
hub serving start -m clas_system
```
This completes the deployment of a service API, using the default port number 8866. **Parameter Description**:
|parameters|uses|
|-|-|
|--modules/-m| [**Required**] PaddleHub Serving pre-installed model, listed in the form of multiple Module==Version key-value pairs<br>*`When no Version is specified, the latest version is selected by default`*|
|--port/-p| [**Optional**] Service port, default is 8866|
|--use_multiprocess| [**Optional**] Whether to enable concurrent mode; the default is single-process mode. Concurrent mode is recommended for machines with multi-core CPUs<br>*`The Windows operating system only supports single-process mode`*|
|--workers| [**Optional**] The number of concurrent tasks specified in concurrent mode, the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores|
For more deployment details, see [PaddleHub Serving Model One-Click Service Deployment](https://paddlehub.readthedocs.io/zh_CN/release-v2.1/tutorial/serving.html)
<a name="5.2"></a> <a name="5.2"></a>
### 5.2 Start with configuration file ### 5.2 Start with configuration file
This method supports CPU and GPU. Command as follow: This method supports prediction using CPU or GPU. Start command:
```shell ```shell
hub serving start --config/-c config.json hub serving start -c config.json
``` ```
Wherein, the format of `config.json` is as follows: The format of `config.json` is as follows:
```json ```json
{ {
...@@ -127,18 +130,19 @@ Wherein, the format of `config.json` is as follows: ...@@ -127,18 +130,19 @@ Wherein, the format of `config.json` is as follows:
} }
``` ```
- The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. Among them, **Parameter Description**:
- when `use_gpu` is `true`, it means that the GPU is used to start the service. * The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`, where:
- when `enable_mkldnn` is `true`, it means that use MKL-DNN to accelerate. - When `use_gpu` is `true`, it means to use GPU to start the service.
- The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`. - When `enable_mkldnn` is `true`, it means to use MKL-DNN acceleration.
* The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.
**Note:** **Notice**:
- When using the configuration file to start the service, other parameters will be ignored. * When using the configuration file to start the service, the parameter settings in the configuration file will be used, and other command line parameters will be ignored;
- If you use GPU prediction (that is, `use_gpu` is set to `true`), you need to set the environment variable CUDA_VISIBLE_DEVICES before starting the service, such as: ```export CUDA_VISIBLE_DEVICES=0```, otherwise you do not need to set it. * If you use GPU prediction (ie, `use_gpu` is set to `true`), you need to set the `CUDA_VISIBLE_DEVICES` environment variable to specify the GPU card number used before starting the service, such as: `export CUDA_VISIBLE_DEVICES=0`;
- **`use_gpu` and `use_multiprocess` cannot be `true` at the same time.** * **`use_gpu` and `use_multiprocess` cannot both be `true` at the same time**;
- **When both `use_gpu` and `enable_mkldnn` are set to `true` at the same time, GPU is used to run and `enable_mkldnn` will be ignored.** * **When both `use_gpu` and `enable_mkldnn` are `true`, `enable_mkldnn` will be ignored and the GPU will be used**.
For example, use GPU card No. 3 to start the 2-stage series service: For example, to start the service using GPU card No. 3:
```shell ```shell
cd PaddleClas/deploy cd PaddleClas/deploy
...@@ -149,88 +153,86 @@ hub serving start -c hubserving/clas/config.json ...@@ -149,88 +153,86 @@ hub serving start -c hubserving/clas/config.json
<a name="6"></a> <a name="6"></a>
## 6. Send prediction requests ## 6. Send prediction requests
After the service starting, you can use the following command to send a prediction request to obtain the prediction result: After configuring the server, you can use the following command to send a prediction request to get the prediction result:
```shell ```shell
cd PaddleClas/deploy cd PaddleClas/deploy
python hubserving/test_hubserving.py server_url image_path python3.7 hubserving/test_hubserving.py \
--server_url http://127.0.0.1:8866/predict/clas_system \
--image_file ./hubserving/ILSVRC2012_val_00006666.JPEG \
--batch_size 8
```
**Predicted output**
```log
The result(s): class_ids: [57, 67, 68, 58, 65], label_names: ['garter snake, grass snake', 'diamondback, diamondback rattlesnake, Crotalus adamanteus', 'sidewinder, horned rattlesnake, Crotalus cerastes', 'water snake', 'sea snake'], scores: [0.21915, 0.15631, 0.14794, 0.13177, 0.12285]
The average time of prediction cost: 2.970 s/image
The average time cost: 3.014 s/image
The average top-1 score: 0.110
``` ```
Two required parameters need to be passed to the script: **Script parameter description**:
* **server_url**: Service address, the format is `http://[ip_address]:[port]/predict/[module_name]`.
- **server_url**: service address,format of which is * **image_path**: The test image path, which can be a single image path or an image collection directory path.
`http://[ip_address]:[port]/predict/[module_name]` * **batch_size**: [**OPTIONAL**] Make predictions in `batch_size` size, default is `1`.
- **image_path**: Test image path, can be a single image path or an image directory path * **resize_short**: [**optional**] When preprocessing, resize by short edge, default is `256`.
- **batch_size**: [**Optional**] batch_size. Default by `1`. * **crop_size**: [**Optional**] The size of the center crop during preprocessing, the default is `224`.
- **resize_short**: [**Optional**] In preprocessing, resize according to short size. Default by `256` * **normalize**: [**Optional**] Whether to perform `normalize` during preprocessing, the default is `True`.
- **crop_size**: [**Optional**] In preprocessing, centor crop size. Default by `224` * **to_chw**: [**Optional**] Whether to adjust to `CHW` order when preprocessing, the default is `True`.
- **normalize**: [**Optional**] In preprocessing, whether to do `normalize`. Default by `True`
- **to_chw**: [**Optional**] In preprocessing, whether to transpose to `CHW`. Default by `True`
**Notice**: **Note**: If you use `Transformer` series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input data size of the model; you need to specify `--resize_short=384 --crop_size=384`.
If you want to use `Transformer series models`, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of model, and need to set `--resize_short=384`, `--crop_size=384`.
**Eg.** **Return result format description**:
The returned result is a list, including the top-k classification results, the corresponding scores, and the prediction time for the image, as follows:
```shell ```shell
python hubserving/test_hubserving.py --server_url http://127.0.0.1:8866/predict/clas_system --image_file ./hubserving/ILSVRC2012_val_00006666.JPEG --batch_size 8 list: return result
└──list: first image result
├── list: the top k classification results, sorted in descending order of score
├── list: the scores corresponding to the first k classification results, sorted in descending order of score
└── float: The image classification time, in seconds
``` ```
The returned result is a list, including the `top_k`'s classification results, corresponding scores and the time cost of prediction, details as follows.
```
list: The returned results
└─ list: The result of first picture
└─ list: The top-k classification results, sorted in descending order of score
└─ list: The scores corresponding to the top-k classification results, sorted in descending order of score
└─ float: The time cost of predicting the picture, unit second
```
**Note:** If you need to add, delete or modify the returned fields, you can modify the corresponding module. For the details, refer to the user-defined modification service module in the next section.
<a name="7"></a> <a name="7"></a>
## 7. User defined service module modification ## 7. User defined service module modification
If you need to modify the service logic, the following steps are generally required: If you need to modify the service logic, you need to do the following:
1. Stop service
```shell
hub serving stop --port/-p XXXX
```
2. Modify the code in the corresponding files, like `module.py` and `params.py`, according to the actual needs. You need re-install(hub install hubserving/clas/) and re-deploy after modifing `module.py`.
After modifying and installing and before deploying, you can use `python hubserving/clas/module.py` to test the installed service module.
For example, if you need to replace the model used by the deployed service, you need to modify model path parameters `cfg.model_file` and `cfg.params_file` in `params.py`. Of course, other related parameters may need to be modified at the same time. Please modify and debug according to the actual situation. 1. Stop the service
```shell
3. Uninstall old service module hub serving stop --port/-p XXXX
```shell ```
hub uninstall clas_system
```
4. Install modified service module 2. Go to the corresponding `module.py` and `params.py` and other files to modify the code according to actual needs. `module.py` needs to be reinstalled after modification (`hub install hubserving/clas/`) and deployed. Before deploying, you can use the `python3.7 hubserving/clas/module.py` command to quickly test the code ready for deployment.
```shell
hub install hubserving/clas/
```
5. Restart service 3. Uninstall the old service pack
```shell ```shell
hub serving start -m clas_system hub uninstall clas_system
``` ```
**Note**: 4. Install the new modified service pack
```shell
hub install hubserving/clas/
```
Common parameters can be modified in params.py: 5. Restart the service
* Directory of model files(include model structure file and model parameters file): ```shell
```python hub serving start -m clas_system
"inference_model_dir": ```
```
* The number of Top-k results returned during post-processing:
```python
'topk':
```
* Mapping file corresponding to label and class ID during post-processing:
```python
'class_id_map_file':
```
In order to avoid unnecessary delay and be able to predict in batch, the preprocessing (include resize, crop and other) is completed in the client, so modify [test_hubserving.py](../../../deploy/hubserving/test_hubserving.py#L35-L52) if necessary. **Notice**:
Common parameters can be modified in `PaddleClas/deploy/hubserving/clas/params.py`:
* To replace the model, you need to modify the model file path parameters:
```python
"inference_model_dir":
```
* Change the number of `top-k` results returned when postprocessing:
```python
'topk':
```
* To change the mapping file between label and class id used in post-processing:
```python
'class_id_map_file':
```
In order to avoid unnecessary latency and to support batch prediction, the data preprocessing logic (including `resize`, `crop` and other operations) is performed on the client side, so the related code needs to be modified in [PaddleClas/deploy/hubserving/test_hubserving.py#L41-L47](../../../deploy/hubserving/test_hubserving.py#L41-L47) and [PaddleClas/deploy/hubserving/test_hubserving.py#L51-L76](../../../deploy/hubserving/test_hubserving.py#L51-L76). A sketch of the default preprocessing pipeline is given below.
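A minimal sketch of that client-side preprocessing is shown below, assuming the default options listed above (`resize_short=256`, `crop_size=224`, ImageNet mean/std normalization, `CHW` layout). The helper name and channel-order details are illustrative rather than the exact code in `test_hubserving.py`.
```python
# Illustrative client-side preprocessing, mirroring the default options of
# test_hubserving.py (resize_short=256, crop_size=224, normalize, to_chw).
import cv2
import numpy as np


def preprocess(image_path: str,
               resize_short: int = 256,
               crop_size: int = 224) -> np.ndarray:
    img = cv2.imread(image_path)[:, :, ::-1]  # BGR -> RGB

    # Resize so that the short edge equals resize_short.
    h, w = img.shape[:2]
    scale = resize_short / min(h, w)
    img = cv2.resize(img, (int(round(w * scale)), int(round(h * scale))))

    # Center crop to crop_size x crop_size.
    h, w = img.shape[:2]
    top = (h - crop_size) // 2
    left = (w - crop_size) // 2
    img = img[top:top + crop_size, left:left + crop_size, :]

    # Normalize with ImageNet mean/std, then transpose to CHW.
    img = img.astype("float32") / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype="float32")
    std = np.array([0.229, 0.224, 0.225], dtype="float32")
    img = (img - mean) / std
    return img.transpose((2, 0, 1))
```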
# Model Service Deployment
## Catalogue
- [1. Introduction](#1)
- [2. Installation of Serving](#2)
- [3. Service Deployment for Image Classification](#3)
- [3.1 Model Transformation](#3.1)
- [3.2 Service Deployment and Request](#3.2)
- [4. Service Deployment for Image Recognition](#4)
- [4.1 Model Transformation](#4.1)
- [4.2 Service Deployment and Request](#4.2)
- [5. FAQ](#5)
<a name="1"></a>
## 1. Introduction
[Paddle Serving](https://github.com/PaddlePaddle/Serving) is designed to provide easy deployment of on-line prediction services for deep learning developers, it supports one-click deployment of industrial-grade services, highly concurrent and efficient communication between client and server, and multiple programming languages for client development.
This section, exemplified by HTTP deployment of prediction service, describes how to deploy model services in PaddleClas with PaddleServing. Currently, only deployment on Linux platform is supported. Windows platform is not supported.
<a name="2"></a>
## 2. Installation of Serving
It is officially recommended to use docker for the installation and environment deployment of Serving. First, pull the docker image and create a Serving-based container.
```
docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
nvidia-docker exec -it test bash
```
Once you are in docker, install the Serving-related python packages.
```
pip3 install paddle-serving-client==0.7.0
pip3 install paddle-serving-server==0.7.0 # CPU
pip3 install paddle-serving-app==0.7.0
pip3 install paddle-serving-server-gpu==0.7.0.post102 #GPU with CUDA10.2 + TensorRT6
# For other GPU environments, confirm the environment before choosing which one to execute
pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
- Speed up the installation process by replacing the source with `-i https://pypi.tuna.tsinghua.edu.cn/simple`.
- For other environment configuration and installation, please refer to [Install Paddle Serving using docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_EN.md)
- To deploy CPU services, please install the CPU version of serving-server with the following command.
```
pip install paddle-serving-server
```
<a name="3"></a>
## 3. Service Deployment for Image Classification
<a name="3.1"></a>
### 3.1 Model Transformation
When adopting PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. The following part takes the classic ResNet50_vd model as an example to introduce the deployment of image classification service.
- Enter the working directory:
```
cd deploy/paddleserving
```
- Download the inference model of ResNet50_vd:
```
# Download and decompress the ResNet50_vd model
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
```
- Convert the downloaded inference model into a format that is readily deployable by Server with the help of paddle_serving_client.
```
# Convert the ResNet50_vd model
python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ResNet50_vd_serving/ \
--serving_client ./ResNet50_vd_client/
```
After the transformation, `ResNet50_vd_serving` and `ResNet50_vd_client` will be added to the current folder in the following format:
```
|- ResNet50_vd_serving/
  |- inference.pdiparams
  |- inference.pdmodel
|- serving_server_conf.prototxt
|- serving_server_conf.stream.prototxt
|- ResNet50_vd_client
|- serving_client_conf.prototxt
|- serving_client_conf.stream.prototxt
```
Having obtained the model files, modify the alias name in `serving_server_conf.prototxt` under the directory `ResNet50_vd_serving` by changing `alias_name` in `fetch_var` to `prediction`.
**Notes**: Serving supports input and output renaming to ensure its compatibility with the deployment of different models. In this case, modifying the alias_name of the configuration file is the only step needed to complete the inference and deployment of all kinds of models. The modified serving_server_conf.prototxt is shown below:
```
feed_var {
name: "inputs"
alias_name: "inputs"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "prediction"
is_lod_tensor: true
fetch_type: 1
shape: -1
}
```
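If you prefer not to edit the file by hand, the alias can also be rewritten with a small script. The sketch below is only an illustration and assumes the default output path of the conversion command above; apply it to each prototxt file you need to change.
```python
# Sketch: set alias_name of fetch_var to "prediction" in a serving prototxt.
# Only the fetch_var block is touched; feed_var is left as-is.
import re
from pathlib import Path

def set_fetch_alias(prototxt_path: str, new_alias: str = "prediction") -> None:
    path = Path(prototxt_path)
    text = path.read_text()
    head, sep, fetch_block = text.partition("fetch_var")
    fetch_block = re.sub(r'alias_name:\s*".*?"', f'alias_name: "{new_alias}"',
                         fetch_block, count=1)
    path.write_text(head + sep + fetch_block)

set_fetch_alias("./ResNet50_vd_serving/serving_server_conf.prototxt")  # assumed path
```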
<a name="3.2"></a>
### 3.2 Service Deployment and Request
Paddleserving's directory contains the code to start the pipeline service and send prediction requests, including:
```
__init__.py
config.yml # Configuration file for starting the service
pipeline_http_client.py # Script for sending pipeline prediction requests by http
pipeline_rpc_client.py # Script for sending pipeline prediction requests by rpc
classification_web_service.py # Script for starting the pipeline server
```
- Start the service:
```
# Start the service and the run log is saved in log.txt
python3 classification_web_service.py &>log.txt &
```
Once the service is successfully started, a log will be printed in log.txt similar to the following ![img](../../../deploy/paddleserving/imgs/start_server.png)
- Send request:
```
# Send service request
python3 pipeline_http_client.py
```
Once the request is successfully processed, the prediction results will be printed in the cmd window, see the following example:![img](../../../deploy/paddleserving/imgs/results.png)
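For reference, `pipeline_http_client.py` is essentially a thin HTTP client like the sketch below; the port (`18080`) and the service name (`imagenet`), as well as the `image` input key, are assumptions here and must match the values configured in `config.yml` and the web service script.
```python
# Minimal pipeline HTTP client sketch. The URL (port/service name) and the
# "image" key are assumptions and must match config.yml; the image is sent
# base64-encoded inside the generic key/value JSON body of Pipeline Serving.
import base64
import json
import requests

url = "http://127.0.0.1:18080/imagenet/prediction"
with open("daisy.jpg", "rb") as f:            # any local test image
    img_b64 = base64.b64encode(f.read()).decode("utf8")

payload = {"key": ["image"], "value": [img_b64]}
resp = requests.post(url=url, data=json.dumps(payload))
print(resp.json())                            # e.g. {'err_no': 0, ..., 'value': [...]}
```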
<a name="4"></a>
## 4. Service Deployment for Image Recognition
When using PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. The following part, exemplified by the ultra-lightweight model for image recognition in PP-ShiTu, details the deployment of image recognition service.
<a name="4.1"></a>
### 4.1 Model Transformation
- Download inference models for general detection and general recognition
```
cd deploy
# Download and decompress general recognition models
wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
cd models
tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar
# Download and decompress general detection models
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
```
- Convert the inference model for recognition into a Serving model:
```
# Convert the recognition model
python3 -m paddle_serving_client.convert --dirname ./general_PPLCNet_x2_5_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./general_PPLCNet_x2_5_lite_v1.0_serving/ \
--serving_client ./general_PPLCNet_x2_5_lite_v1.0_client/
```
After the transformation, `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` will be added to the current folder. Modify the alias name in serving_server_conf.prototxt under the directory `general_PPLCNet_x2_5_lite_v1.0_serving/` by changing `alias_name` to `features` in `fetch_var`. The modified serving_server_conf.prototxt is similar to the following:
```
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: true
fetch_type: 1
shape: -1
}
```
- Convert the inference model for detection into a Serving model:
```
# Convert the general detection model
python3 -m paddle_serving_client.convert --dirname ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ \
--serving_client ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
```
After the transformation, `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` and `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` will be added to the current folder.
**Note:** The alias name in the serving_server_conf.prototxt under the directory `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` requires no modification.
- Download and decompress the constructed search library index
```
cd ../
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar
```
<a name="4.2"></a>
### 4.2 Service Deployment and Request
**Note:** Since the recognition service involves multiple models, the Pipeline deployment method is adopted for better performance. This deployment method does not support the Windows platform for now.
- Enter the working directory
```
cd ./deploy/paddleserving/recognition
```
Paddleserving's directory contains the code to start the pipeline service and send prediction requests, including:
```
__init__.py
config.yml # Configuration file for starting the service
pipeline_http_client.py # Script for sending pipeline prediction requests by http
pipeline_rpc_client.py # Script for sending pipeline prediction requests by rpc
recognition_web_service.py # Script for starting the pipeline server
```
- Start the service:
```
# Start the service and the run log is saved in log.txt
python3 recognition_web_service.py &>log.txt &
```
Once the service is successfully started, a log will be printed in log.txt similar to the following ![img](../../../deploy/paddleserving/imgs/start_server_shitu.png)
- Send request:
```
python3 pipeline_http_client.py
```
Once the request is successfully processed, the prediction results will be printed in the cmd window, see the following example: ![img](../../../deploy/paddleserving/imgs/results_shitu.png)
<a name="5"></a>
## 5. FAQ
**Q1**: After sending a request, no result is returned or the output is prompted with a decoding error.
**A1**: Please turn off the proxy before starting the service and sending requests, try the following command:
```
unset https_proxy
unset http_proxy
```
For more types of service deployment, such as `RPC prediction services`, you can refer to the [github official website](https://github.com/PaddlePaddle/Serving/tree/v0.7.0/examples) of Serving.
English | [简体中文](../../zh_CN/inference_deployment/recognition_serving_deploy.md)
# Recognition model service deployment
## Table of contents
- [1. Introduction](#1-introduction)
- [2. Serving installation](#2-serving-installation)
- [3. Image recognition service deployment](#3-image-recognition-service-deployment)
- [3.1 Model conversion](#31-model-conversion)
- [3.2 Service deployment and request](#32-service-deployment-and-request)
- [3.2.1 Python Serving](#321-python-serving)
- [3.2.2 C++ Serving](#322-c-serving)
- [4. FAQ](#4-faq)
<a name="1"></a>
## 1. Introduction
[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning developers easily deploy online prediction services. It supports one-click deployment of industrial-grade services, highly concurrent and efficient communication between client and server, and client development in multiple programming languages.
This section takes HTTP prediction service deployment as an example to introduce how to use PaddleServing to deploy the model service in PaddleClas. Currently, only Linux platform deployment is supported; the Windows platform is not supported yet.
<a name="2"></a>
## 2. Serving installation
The Serving official website recommends using docker to install and deploy the Serving environment. First, you need to pull the docker environment and create a Serving-based docker.
```shell
# start GPU docker
docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
nvidia-docker exec -it test bash
# start CPU docker
docker pull paddlepaddle/serving:0.7.0-devel
docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash
docker exec -it test bash
```
After entering docker, you need to install Serving-related python packages.
```shell
python3.7 -m pip install paddle-serving-client==0.7.0
python3.7 -m pip install paddle-serving-app==0.7.0
python3.7 -m pip install faiss-cpu==1.7.1post2
#If it is a CPU deployment environment:
python3.7 -m pip install paddle-serving-server==0.7.0 #CPU
python3.7 -m pip install paddlepaddle==2.2.0 # CPU
#If it is a GPU deployment environment
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6
python3.7 -m pip install paddlepaddle-gpu==2.2.0 # GPU with CUDA10.2
#Other GPU environments need to confirm the environment and then choose which one to execute
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
* If the installation speed is too slow, you can change the source through `-i https://pypi.tuna.tsinghua.edu.cn/simple` to speed up the installation process.
* For other environment configuration installation, please refer to: [Install Paddle Serving with Docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
<a name="3"></a>
## 3. Image recognition service deployment
When using PaddleServing for image recognition service deployment, **multiple saved inference models need to be converted to Serving models**. The following takes the ultra-lightweight image recognition model in PP-ShiTu as an example to introduce the deployment of the image recognition service.
<a name="3.1"></a>
### 3.1 Model conversion
- Go to the working directory:
```shell
cd deploy/
```
- Download generic detection inference model and generic recognition inference model
```shell
# Create and enter the models folder
mkdir models
cd models
# Download and unzip the generic recognition model
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar
# Download and unzip the generic detection model
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
```
- Convert the generic recognition inference model to the Serving model:
```shell
# Convert the generic recognition model
python3.7 -m paddle_serving_client.convert \
--dirname ./general_PPLCNet_x2_5_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./general_PPLCNet_x2_5_lite_v1.0_serving/ \
--serving_client ./general_PPLCNet_x2_5_lite_v1.0_client/
```
The meaning of each parameter in the above command is described in the parameter table later in this section.
After the recognition inference model is converted, the folders `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` will appear in the current directory. Modify the alias name in `serving_server_conf.prototxt` under both the `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` directories: change `alias_name` in `fetch_var` to `features`. The content of the modified `serving_server_conf.prototxt` is as follows
```log
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
}
```
After the conversion of the general recognition inference model is completed, there will be additional `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` folders in the current folder, with the following structure:
```shell
├── general_PPLCNet_x2_5_lite_v1.0_serving/
│ ├── inference.pdiparams
│ ├── inference.pdmodel
│ ├── serving_server_conf.prototxt
│ └── serving_server_conf.stream.prototxt
└── general_PPLCNet_x2_5_lite_v1.0_client/
├── serving_client_conf.prototxt
└── serving_client_conf.stream.prototxt
```
- Convert general detection inference model to Serving model:
```shell
# Convert generic detection model
python3.7 -m paddle_serving_client.convert --dirname ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ \
--serving_client ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
```
The meaning of each parameter in the above command is described in the parameter table below.
After the conversion of the general detection inference model is completed, there will be additional folders `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` and `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` in the current folder, with the following structure:
```shell
├── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/
│ ├── inference.pdiparams
│ ├── inference.pdmodel
│ ├── serving_server_conf.prototxt
│ └── serving_server_conf.stream.prototxt
└── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
├── serving_client_conf.prototxt
└── serving_client_conf.stream.prototxt
```
The specific meaning of the parameters in the above command is shown in the following table
| parameter | type | default value | description |
| ----------------- | ---- | ------------------ | ----------------------------------------------------- |
| `dirname` | str | - | The storage path of the model file to be converted. The program structure file and parameter file are saved in this directory.|
| `model_filename` | str | None | The name of the file storing the model Inference Program structure that needs to be converted. If set to None, use `__model__` as the default filename |
| `params_filename` | str | None | The name of the file that stores all parameters of the model that needs to be transformed. It needs to be specified if and only if all model parameters are stored in a single binary file. If the model parameters are stored in separate files, set it to None |
| `serving_server` | str | `"serving_server"` | The storage path of the converted model files and configuration files. Default is serving_server |
| `serving_client` | str | `"serving_client"` | The converted client configuration file storage path. Default is serving_client |
- Download and unzip the index of the retrieval library that has been built
```shell
# Go back to the deploy directory
cd ../
# Download the built retrieval library index
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar
# Decompress the built retrieval library index
tar -xf drink_dataset_v1.0.tar
```
<a name="3.2"></a>
### 3.2 Service deployment and request
**Note:** The recognition service involves multiple models, so the Pipeline deployment method is used for performance reasons. The Pipeline deployment method currently does not support the Windows platform.
- Go to the working directory
```shell
cd ./deploy/paddleserving/recognition
```
The paddleserving directory contains code to start the Python Pipeline service, the C++ Serving service, and send prediction requests, including:
```shell
__init__.py
config.yml # The configuration file to start the python pipeline service
pipeline_http_client.py # Script for sending pipeline prediction requests in http mode
pipeline_rpc_client.py # Script for sending pipeline prediction requests in rpc mode
recognition_web_service.py # Script to start the pipeline server
readme.md # Recognition model service deployment documents
run_cpp_serving.sh # Script to start C++ Pipeline Serving deployment
test_cpp_serving_client.py # Script for sending C++ Pipeline serving prediction requests by rpc
```
<a name="3.2.1"></a>
#### 3.2.1 Python Serving
- Start the service:
```shell
# Start the service and save the running log in log.txt
python3.7 recognition_web_service.py &>log.txt &
```
- Send request:
```shell
python3.7 pipeline_http_client.py
```
After a successful run, the results of the model prediction will be printed in the cmd window, and the results are as follows:
```log
{'err_no': 0, 'err_msg': '', 'key': ['result'], 'value': ["[{'bbox': [345, 95, 524, 576], 'rec_docs': 'Red Bull-Enhanced', 'rec_scores': 0.79903316}]"], 'tensors': []}
```
<a name="3.2.2"></a>
#### 3.2.2 C++ Serving
Different from Python Serving, the C++ Serving client calls C++ OP to predict, so before starting the service, you need to compile and install the serving server package, and set `SERVING_BIN`.
- Compile and install the Serving server package
```shell
# Enter the working directory
cd PaddleClas/deploy/paddleserving
# One-click compile and install Serving server, set SERVING_BIN
source ./build_server.sh python3.7
```
**Note:** The paths set in [build_server.sh](../build_server.sh#L55-L62) may need to be modified according to the actual environment on the machine, such as the CUDA and Python versions, before compiling.
- The input and output formats used by C++ Serving differ from those of Python Serving, so you need to execute the following commands to copy the 4 prepared prototxt files and overwrite the corresponding files generated in [3.1 Model conversion](#31-model-conversion).
```shell
# Enter PaddleClas/deploy directory
cd PaddleClas/deploy/
# Overwrite prototxt file
\cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_serving/*.prototxt ./models/general_PPLCNet_x2_5_lite_v1.0_serving/
\cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_client/*.prototxt ./models/general_PPLCNet_x2_5_lite_v1.0_client/
\cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/*.prototxt ./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
\cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/*.prototxt ./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/
```
- Start the service:
```shell
# Enter the working directory
cd PaddleClas/deploy/paddleserving/recognition
# The default port number is 9400; the running log is saved in log_PPShiTu.txt by default
# CPU deployment
sh run_cpp_serving.sh
# GPU deployment, and specify card 0
sh run_cpp_serving.sh 0
```
- Send request:
```shell
# send service request
python3.7 test_cpp_serving_client.py
```
After a successful run, the results of the model predictions are printed in the client's terminal window as follows:
```log
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0614 03:01:36.273097 6084 naming_service_thread.cpp:202] brpc::policy::ListNamingService("127.0.0.1:9400"): added 1
I0614 03:01:37.393564 6084 general_model.cpp:490] [client]logid=0,client_cost=1107.82ms,server_cost=1101.75ms.
[{'bbox': [345, 95, 524, 585], 'rec_docs': 'Red Bull-Enhanced', 'rec_scores': 0.8073724}]
```
- Close the service:
If the service program is running in the foreground, you can press `Ctrl+C` to terminate the server program; if it is running in the background, you can use the kill command to close related processes, or you can execute the following command in the path where the service program is started to terminate the server program:
```bash
python3.7 -m paddle_serving_server.serve stop
```
After the execution is completed, the `Process stopped` message appears, indicating that the service was successfully shut down.
<a name="4"></a>
## 4. FAQ
**Q1**: No result is returned after the request is sent or an output decoding error is prompted
**A1**: Do not set the proxy when starting the service and sending the request. You can close the proxy before starting the service and sending the request. The command to close the proxy is:
```shell
unset https_proxy
unset http_proxy
```
**Q2**: Nothing happens after starting the service
**A2**: You can check whether the path corresponding to `model_config` in `config.yml` exists, and whether the folder name is correct
For more service deployment types, such as `RPC prediction service`, you can refer to Serving's [github official website](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples)
# 基于图像分类的打电话行为识别模型
------
## 目录
- [1. 模型和应用场景介绍](#1)
- [2. 模型训练、评估和预测](#2)
- [2.1 PaddleClas 环境安装](#2.1)
- [2.2 数据准备](#2.2)
- [2.2.1 数据集下载](#2.2.1)
- [2.2.2 训练及测试图像处理](#2.2.2)
- [2.2.3 标注文件准备](#2.2.3)
- [2.3 模型训练](#2.3)
- [2.4 模型评估](#2.4)
- [2.5 模型预测](#2.5)
- [3. 模型推理部署](#3)
- [3.1 模型导出](#3.1)
- [3.2 执行模型预测](#3.2)
- [4. 在PP-Human中使用该模型](#4)
<div align="center">
<img src="../../images/action_rec_by_classification.gif" width='1000'/>
<center>数据来源及版权归属:天覆科技,感谢提供并开源实际场景数据,仅限学术研究使用</center>
</div>
<a name="1"></a>
## 1. 模型和应用场景介绍
行为识别在智慧社区,安防监控等方向具有广泛应用。根据行为的不同,一些行为可以通过图像直接进行行为判断(例如打电话)。这里我们提供了基于图像分类的打电话行为识别模型,对人物图像进行是否打电话的二分类识别。
| 任务 | 算法 | 精度 | 预测速度(ms) | 模型权重 |
| ---- | ---- | ---- | ---- | ------ |
| 打电话行为识别 | PP-HGNet-tiny | 准确率: 86.85 | 单人 2.94ms | [下载链接](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.pdparams) |
注:
1. 该模型使用[UAV-Human](https://github.com/SUTDCV/UAV-Human)的打电话行为部分进行训练和测试。
2. 预测速度为NVIDIA T4 机器上使用TensorRT FP16时的速度, 速度包含数据预处理、模型预测、后处理全流程。
该模型为实时行人分析工具[PP-Human](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/deploy/pipeline)中行为识别功能的一部分,欢迎体验PP-Human的完整功能。
<a name="2"></a>
## 2. 模型训练、评估和预测
<a name="2.1"></a>
### 2.1 PaddleClas 环境安装
请根据[环境准备](../installation/install_paddleclas.md)完成PaddleClas的环境依赖准备。
<a name="2.2"></a>
### 2.2 数据准备
<a name="2.2.1"></a>
#### 2.2.1 数据集下载
打电话的行为识别是基于公开数据集[UAV-Human](https://github.com/SUTDCV/UAV-Human)进行训练的。请通过该链接填写相关数据集申请材料后获取下载链接。
`UAVHuman/ActionRecognition/RGBVideos`路径下包含了该数据集中RGB视频数据集,每个视频的文件名即为其标注信息。
<a name="2.2.2"></a>
#### 2.2.2 训练及测试图像处理
视频文件名中与行为识别相关的是 `A` 开头的字段(即 action),根据该字段我们可以找到期望识别的动作类型数据。
- 正样本视频:以打电话为例,我们只需找到包含`A024`的文件。
- 负样本视频:除目标动作以外所有的视频。
鉴于视频数据转化为图像会有较多冗余,对于正样本视频,我们间隔8帧进行采样,并使用行人检测模型处理为半身图像(取检测框的上半部分,即`img = img[:H/2, :, :]`)。正样本视频中采样得到的图像即视为正样本,负样本视频中采样得到的图像即为负样本。
**注意**: 正样本视频中的动作并不全是打电话,在视频开头和结尾部分会出现部分冗余动作,需要移除。
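下面给出一个按上述规则采样并截取半身图像的示意脚本(仅作参考:此处直接取整帧的上半部分,实际应结合行人检测模型的检测框,取框的上半部分;路径等均为假设):
```python
# 示意代码:对正样本视频每隔 8 帧采样一次,并截取上半部分作为半身图像
import os
import cv2

def sample_half_body(video_path: str, save_dir: str, interval: int = 8) -> None:
    os.makedirs(save_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    frame_idx, saved = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % interval == 0:
            h = frame.shape[0]
            half_body = frame[: h // 2, :, :]      # 即文中的 img[:H/2, :, :]
            cv2.imwrite(os.path.join(save_dir, f"{saved:06d}.jpg"), half_body)
            saved += 1
        frame_idx += 1
    cap.release()
```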
<a name="2.2.3"></a>
#### 2.2.3 标注文件准备
根据[PaddleClas数据集格式说明](../data_preparation/classification_dataset.md),标注文件样例如下,其中`0`,`1`分别是图片对应所属的类别:
```
# 每一行采用"空格"分隔图像路径与标注
train/000001.jpg 0
train/000002.jpg 0
train/000003.jpg 1
...
```
此外,还需准备标签文件 `phone_label_list.txt`,用于将分类序号映射到具体的类型名称:
```
0 make_a_phone_call # 类型0
1 normal # 类型1
```
完成上述内容后,放置于`dataset`目录下,文件结构如下:
```
data/
├── images # 放置所有图片
├── phone_label_list.txt # 标签文件
├── phone_train_list.txt # 训练列表,包含图片及其对应类型
└── phone_val_list.txt # 测试列表,包含图片及其对应类型
```
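上述标注文件也可以用脚本批量生成,下面是一个示意(假设正、负样本图片分别放在 `images/make_a_phone_call/` 和 `images/normal/` 子目录下,实际请按自己的目录组织方式调整路径与前缀):
```python
# 示意脚本:按 "图像路径 空格 类别ID" 的格式生成训练标注文件
import os

label_map = {"make_a_phone_call": 0, "normal": 1}   # 与 phone_label_list.txt 对应

with open("data/phone_train_list.txt", "w") as f:
    for cls_name, cls_id in label_map.items():
        img_dir = os.path.join("data/images", cls_name)
        for name in sorted(os.listdir(img_dir)):
            f.write(f"images/{cls_name}/{name} {cls_id}\n")
```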
<a name="2.3"></a>
### 2.3 模型训练
通过如下命令启动训练:
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/practical_models/PPHGNet_tiny_calling_halfbody.yaml \
-o Arch.pretrained=True
```
其中 `Arch.pretrained``True`表示使用预训练权重帮助训练。
<a name="2.4"></a>
### 2.4 模型评估
训练好模型之后,可以通过以下命令实现对模型指标的评估。
```bash
python3 tools/eval.py \
-c ./ppcls/configs/practical_models/PPHGNet_tiny_calling_halfbody.yaml \
-o Global.pretrained_model=output/PPHGNet_tiny/best_model
```
其中 `-o Global.pretrained_model="output/PPHGNet_tiny/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
<a name="2.5"></a>
### 2.5 模型预测
模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
```bash
python3 tools/infer.py \
-c ./ppcls/configs/practical_models/PPHGNet_tiny_calling_halfbody.yaml \
-o Global.pretrained_model=output/PPHGNet_tiny/best_model \
-o Infer.infer_imgs={your test image}
```
<a name="3"></a>
## 3. 模型推理部署
Paddle Inference 是飞桨的原生推理库,作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用 MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于 Paddle Inference 推理引擎的介绍,可以参考 [Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)
<a name="3.1"></a>
### 3.1 模型导出
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/practical_models/PPHGNet_tiny_calling_halfbody.yaml \
-o Global.pretrained_model=output/PPHGNet_tiny/best_model \
-o Global.save_inference_dir=deploy/models/PPHGNet_tiny_calling_halfbody/
```
执行完该脚本后会在 `deploy/models/` 下生成 `PPHGNet_tiny_calling_halfbody` 文件夹,文件结构如下:
```
├── PPHGNet_tiny_calling_halfbody
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
<a name="3.2"></a>
### 3.2 执行模型预测
`deploy`下,执行下列命令:
```bash
# Current path is {root of PaddleClas}/deploy
python3 python/predict_cls.py -c configs/inference_cls_based_action.yaml
```
<a name="4"></a>
## 4. 在PP-Human中使用该模型
[PP-Human](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/deploy/pipeline)是基于飞桨深度学习框架的业界首个开源产业级实时行人分析工具,具有功能丰富,应用广泛和部署高效三大优势。该模型可以应用于PP-Human中,实现实时视频的打电话行为识别功能。
由于当前的PP-Human功能集成在[PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection)中,需要按以下步骤实现该模型在PP-Human中的调用适配。
1. 完成模型导出
2. 重命名模型
```bash
cd deploy/models/PPHGNet_tiny_calling_halfbody
mv inference.pdiparams model.pdiparams
mv inference.pdiparams.info model.pdiparams.info
mv inference.pdmodel model.pdmodel
```
3. 下载[预测配置文件](https://bj.bcebos.com/v1/paddledet/models/pipeline/infer_configs/PPHGNet_tiny_calling_halfbody/infer_cfg.yml)
``` bash
wget https://bj.bcebos.com/v1/paddledet/models/pipeline/infer_configs/PPHGNet_tiny_calling_halfbody/infer_cfg.yml
```
完成后文件结构如下,即可在PP-Human中使用:
```
PPHGNet_tiny_calling_halfbody
├── infer_cfg.yml
├── model.pdiparams
├── model.pdiparams.info
└── model.pdmodel
```
详细请参考[基于图像分类的行为识别——打电话识别](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/deploy/pipeline/docs/tutorials/action.md#%E5%9F%BA%E4%BA%8E%E5%9B%BE%E5%83%8F%E5%88%86%E7%B1%BB%E7%9A%84%E8%A1%8C%E4%B8%BA%E8%AF%86%E5%88%AB%E6%89%93%E7%94%B5%E8%AF%9D%E8%AF%86%E5%88%AB)
简体中文 | [English](../../en/inference_deployment/classification_serving_deploy_en.md)
# 分类模型服务化部署
## 目录
- [1. 简介](#1-简介)
- [2. Serving 安装](#2-serving-安装)
- [3. 图像分类服务部署](#3-图像分类服务部署)
- [3.1 模型转换](#31-模型转换)
- [3.2 服务部署和请求](#32-服务部署和请求)
- [3.2.1 Python Serving](#321-python-serving)
- [3.2.2 C++ Serving](#322-c-serving)
- [4. FAQ](#4-faq)
<a name="1"></a>
## 1. 简介
[Paddle Serving](https://github.com/PaddlePaddle/Serving) 旨在帮助深度学习开发者轻松部署在线预测服务,支持一键部署工业级的服务能力、客户端和服务端之间高并发和高效通信、并支持多种编程语言开发客户端。
该部分以 HTTP 预测服务部署为例,介绍怎样在 PaddleClas 中使用 PaddleServing 部署模型服务。目前只支持 Linux 平台部署,暂不支持 Windows 平台。
<a name="2"></a>
## 2. Serving 安装
Serving 官网推荐使用 docker 安装并部署 Serving 环境。首先需要拉取 docker 环境并创建基于 Serving 的 docker。
```shell
# 启动GPU docker
docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
nvidia-docker exec -it test bash
# 启动CPU docker
docker pull paddlepaddle/serving:0.7.0-devel
docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash
docker exec -it test bash
```
进入 docker 后,需要安装 Serving 相关的 python 包。
```shell
python3.7 -m pip install paddle-serving-client==0.7.0
python3.7 -m pip install paddle-serving-app==0.7.0
python3.7 -m pip install faiss-cpu==1.7.1post2
#若为CPU部署环境:
python3.7 -m pip install paddle-serving-server==0.7.0 # CPU
python3.7 -m pip install paddlepaddle==2.2.0 # CPU
#若为GPU部署环境
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6
python3.7 -m pip install paddlepaddle-gpu==2.2.0 # GPU with CUDA10.2
#其他GPU环境需要确认环境再选择执行哪一条
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
* 如果安装速度太慢,可以通过 `-i https://pypi.tuna.tsinghua.edu.cn/simple` 更换源,加速安装过程。
* 其他环境配置安装请参考:[使用Docker安装Paddle Serving](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
<a name="3"></a>
## 3. 图像分类服务部署
下面以经典的 ResNet50_vd 模型为例,介绍如何部署图像分类服务。
<a name="3.1"></a>
### 3.1 模型转换
使用 PaddleServing 做服务化部署时,需要将保存的 inference 模型转换为 Serving 模型。
- 进入工作目录:
```shell
cd deploy/paddleserving
```
- 下载并解压 ResNet50_vd 的 inference 模型:
```shell
# 下载 ResNet50_vd inference 模型
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar
# 解压 ResNet50_vd inference 模型
tar xf ResNet50_vd_infer.tar
```
- 用 paddle_serving_client 命令把下载的 inference 模型转换成易于 Server 部署的模型格式:
```shell
# 转换 ResNet50_vd 模型
python3.7 -m paddle_serving_client.convert \
--dirname ./ResNet50_vd_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ResNet50_vd_serving/ \
--serving_client ./ResNet50_vd_client/
```
上述命令中参数具体含义如下表所示
| 参数 | 类型 | 默认值 | 描述 |
| ----------------- | ---- | ------------------ | ------------------------------------------------------------ |
| `dirname` | str | - | 需要转换的模型文件存储路径,Program结构文件和参数文件均保存在此目录。 |
| `model_filename` | str | None | 存储需要转换的模型Inference Program结构的文件名称。如果设置为None,则使用 `__model__` 作为默认的文件名 |
| `params_filename` | str | None | 存储需要转换的模型所有参数的文件名称。当且仅当所有模型参数被保存在一个单独的二进制文件中,它才需要被指定。如果模型参数是存储在各自分离的文件中,设置它的值为None |
| `serving_server` | str | `"serving_server"` | 转换后的模型文件和配置文件的存储路径。默认值为serving_server |
| `serving_client` | str | `"serving_client"` | 转换后的客户端配置文件存储路径。默认值为serving_client |
ResNet50_vd 推理模型转换完成后,会在当前文件夹多出 `ResNet50_vd_serving` 和 `ResNet50_vd_client` 的文件夹,具备如下结构:
```shell
├── ResNet50_vd_serving/
│ ├── inference.pdiparams
│ ├── inference.pdmodel
│ ├── serving_server_conf.prototxt
│ └── serving_server_conf.stream.prototxt
└── ResNet50_vd_client/
├── serving_client_conf.prototxt
└── serving_client_conf.stream.prototxt
```
- Serving 为了兼容不同模型的部署,提供了输入输出重命名的功能。让不同的模型在推理部署时,只需要修改配置文件的 `alias_name` 即可,无需修改代码即可完成推理部署。因此在转换完毕后需要分别修改 `ResNet50_vd_serving` 下的文件 `serving_server_conf.prototxt``ResNet50_vd_client` 下的文件 `serving_client_conf.prototxt`,将 `fetch_var``alias_name:` 后的字段改为 `prediction`,修改后的 `serving_server_conf.prototxt``serving_client_conf.prototxt` 如下所示:
```log
feed_var {
name: "inputs"
alias_name: "inputs"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "prediction"
is_lod_tensor: false
fetch_type: 1
shape: 1000
}
```
<a name="3.2"></a>
### 3.2 服务部署和请求
paddleserving 目录包含了启动 pipeline 服务、C++ serving服务和发送预测请求的代码,主要包括:
```shell
__init__.py
classification_web_service.py # 启动pipeline服务端的脚本
config.yml # 启动pipeline服务的配置文件
pipeline_http_client.py # http方式发送pipeline预测请求的脚本
pipeline_rpc_client.py # rpc方式发送pipeline预测请求的脚本
readme.md # 分类模型服务化部署文档
run_cpp_serving.sh # 启动C++ Serving部署的脚本
test_cpp_serving_client.py # rpc方式发送C++ serving预测请求的脚本
```
<a name="3.2.1"></a>
#### 3.2.1 Python Serving
- 启动服务:
```shell
# 启动服务,运行日志保存在 log.txt
python3.7 classification_web_service.py &>log.txt &
```
- 发送请求:
```shell
# 发送服务请求
python3.7 pipeline_http_client.py
```
成功运行后,模型预测的结果会打印在客户端中,如下所示:
```log
{'err_no': 0, 'err_msg': '', 'key': ['label', 'prob'], 'value': ["['daisy']", '[0.9341402053833008]'], 'tensors': []}
```
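返回结果中的 `value` 字段是字符串形式的列表,如需在客户端进一步处理,可以按如下方式解析(示意代码,以上面打印的返回结果为例):
```python
# 示意:解析 Python Serving 返回结果中的 label 与 prob
import ast

result = {'err_no': 0, 'err_msg': '', 'key': ['label', 'prob'],
          'value': ["['daisy']", '[0.9341402053833008]'], 'tensors': []}

parsed = dict(zip(result["key"], (ast.literal_eval(v) for v in result["value"])))
print(parsed["label"][0], parsed["prob"][0])        # daisy 0.9341402053833008
```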
- 关闭服务
如果服务程序在前台运行,可以按下`Ctrl+C`来终止服务端程序;如果在后台运行,可以使用kill命令关闭相关进程,也可以在启动服务程序的路径下执行以下命令来终止服务端程序:
```bash
python3.7 -m paddle_serving_server.serve stop
```
执行完毕后出现`Process stopped`信息表示成功关闭服务。
<a name="3.2.2"></a>
#### 3.2.2 C++ Serving
与Python Serving不同,C++ Serving客户端调用 C++ OP来预测,因此在启动服务之前,需要编译并安装 serving server包,并设置 `SERVING_BIN`
- 编译并安装Serving server包
```shell
# 进入工作目录
cd PaddleClas/deploy/paddleserving
# 一键编译安装Serving server、设置 SERVING_BIN
source ./build_server.sh python3.7
```
**注:**[build_server.sh](./build_server.sh#L55-L62)所设定的路径可能需要根据实际机器上的环境如CUDA、python版本等作一定修改,然后再编译。
- 修改客户端文件 `ResNet50_vd_client/serving_client_conf.prototxt` ,将 `feed_type:` 后的字段改为20,将第一个 `shape:` 后的字段改为1并删掉其余的 `shape` 字段。
```log
feed_var {
name: "inputs"
alias_name: "inputs"
is_lod_tensor: false
feed_type: 20
shape: 1
}
```
- 修改 [`test_cpp_serving_client`](./test_cpp_serving_client.py) 的部分代码
1. 修改 [`load_client_config`](./test_cpp_serving_client.py#L28) 处的代码,将 `load_client_config` 后的路径改为 `ResNet50_vd_client/serving_client_conf.prototxt`
2. 修改 [`feed={"inputs": image}`](./test_cpp_serving_client.py#L45) 处的代码,将 `inputs` 改为与 `ResNet50_vd_client/serving_client_conf.prototxt``feed_var` 字段下面的 `name` 一致。由于部分模型client文件中的 `name``x` 而不是 `inputs` ,因此使用这些模型进行C++ Serving部署时需要注意这一点。
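修改后的关键调用大致如下(示意代码,端口号、图片路径以及 base64 输入方式均为假设,请以 `run_cpp_serving.sh` 和 `test_cpp_serving_client.py` 的实际内容为准):
```python
# 示意:test_cpp_serving_client.py 中与上述两处修改对应的关键调用
import base64
from paddle_serving_client import Client

with open("./daisy.jpg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf8")    # feed_type=20 对应字符串输入(此处假设为 base64)

client = Client()
client.load_client_config("ResNet50_vd_client/serving_client_conf.prototxt")  # 修改点 1
client.connect(["127.0.0.1:9292"])                        # 端口为假设值
fetch_map = client.predict(feed={"inputs": image},        # 修改点 2:key 与 feed_var 的 name 一致
                           fetch=["prediction"], batch=False)
print(fetch_map)
```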
- 启动服务:
```shell
# 启动服务, 服务在后台运行,运行日志保存在 nohup.txt
# CPU部署
bash run_cpp_serving.sh
# GPU部署并指定0号卡
bash run_cpp_serving.sh 0
```
- 发送请求:
```shell
# 发送服务请求
python3.7 test_cpp_serving_client.py
```
成功运行后,模型预测的结果会打印在客户端中,如下所示:
```log
prediction: daisy, probability: 0.9341399073600769
```
- 关闭服务:
如果服务程序在前台运行,可以按下`Ctrl+C`来终止服务端程序;如果在后台运行,可以使用kill命令关闭相关进程,也可以在启动服务程序的路径下执行以下命令来终止服务端程序:
```bash
python3.7 -m paddle_serving_server.serve stop
```
执行完毕后出现`Process stopped`信息表示成功关闭服务。
## 4. FAQ
**Q1**: 发送请求后没有结果返回或者提示输出解码报错
**A1**: 启动服务和发送请求时不要设置代理,可以在启动服务前和发送请求前关闭代理,关闭代理的命令是:
```shell
unset https_proxy
unset http_proxy
```
**Q2**: 启动服务后没有任何反应
**A2**: 可以检查`config.yml``model_config`对应的路径是否存在,文件夹命名是否正确
更多的服务部署类型,如 `RPC 预测服务` 等,可以参考 Serving 的[github 官网](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples)
...@@ -5,13 +5,13 @@ PaddleClas 在 Windows 平台下基于 `Visual Studio 2019 Community` 进行了 ...@@ -5,13 +5,13 @@ PaddleClas 在 Windows 平台下基于 `Visual Studio 2019 Community` 进行了
----- -----
## 目录 ## 目录
* [1. 前置条件](#1) * [1. 前置条件](#1)
* [1.1 下载 PaddlePaddle C++ 预测库 paddle_inference_install_dir](#1.1) * [1.1 下载 PaddlePaddle C++ 预测库 paddle_inference_install_dir](#1.1)
* [1.2 安装配置 OpenCV](#1.2) * [1.2 安装配置 OpenCV](#1.2)
* [2. 使用 Visual Studio 2019 编译](#2) * [2. 使用 Visual Studio 2019 编译](#2)
* [3. 预测](#3) * [3. 预测](#3)
* [3.1 准备 inference model](#3.1) * [3.1 准备 inference model](#3.1)
* [3.2 运行预测](#3.2) * [3.2 运行预测](#3.2)
* [3.3 注意事项](#3.3) * [3.3 注意事项](#3.3)
<a name='1'></a> <a name='1'></a>
## 1. 前置条件 ## 1. 前置条件
......
...@@ -91,9 +91,9 @@ python3 tools/export_model.py \ ...@@ -91,9 +91,9 @@ python3 tools/export_model.py \
导出的 inference 模型文件可用于预测引擎进行推理部署,根据不同的部署方式/平台,可参考: 导出的 inference 模型文件可用于预测引擎进行推理部署,根据不同的部署方式/平台,可参考:
* [Python 预测](./python_deploy.md) * [Python 预测](./inference/python_deploy.md)
* [C++ 预测](./cpp_deploy.md)(目前仅支持分类模型) * [C++ 预测](./inference/cpp_deploy.md)(目前仅支持分类模型)
* [Python Whl 预测](./whl_deploy.md)(目前仅支持分类模型) * [Python Whl 预测](./inference/whl_deploy.md)(目前仅支持分类模型)
* [PaddleHub Serving 部署](./paddle_hub_serving_deploy.md)(目前仅支持分类模型) * [PaddleHub Serving 部署](./deployment/paddle_hub_serving_deploy.md)(目前仅支持分类模型)
* [PaddleServing 部署](./paddle_serving_deploy.md) * [PaddleServing 部署](./deployment/paddle_serving_deploy.md)
* [PaddleLite 部署](./paddle_lite_deploy.md)(目前仅支持分类模型) * [PaddleLite 部署](./deployment/paddle_lite_deploy.md)(目前仅支持分类模型)
简体中文 | [English](../../en/inference_deployment/paddle_hub_serving_deploy_en.md)
# 基于 PaddleHub Serving 的服务部署 # 基于 PaddleHub Serving 的服务部署
PaddleClas 支持通过 PaddleHub 快速进行服务化部署。目前支持图像分类的部署,图像识别的部署敬请期待。 PaddleClas 支持通过 PaddleHub 快速进行服务化部署。目前支持图像分类的部署,图像识别的部署敬请期待。
---
## 目录 ## 目录
- [1. 简介](#1) - [1. 简介](#1)
...@@ -22,20 +22,20 @@ PaddleClas 支持通过 PaddleHub 快速进行服务化部署。目前支持图 ...@@ -22,20 +22,20 @@ PaddleClas 支持通过 PaddleHub 快速进行服务化部署。目前支持图
hubserving 服务部署配置服务包 `clas` 下包含 3 个必选文件,目录如下: hubserving 服务部署配置服务包 `clas` 下包含 3 个必选文件,目录如下:
``` ```shell
hubserving/clas/ deploy/hubserving/clas/
└─ __init__.py 空文件,必选 ├── __init__.py # 空文件,必选
└─ config.json 配置文件,可选,使用配置启动服务时作为参数传入 ├── config.json # 配置文件,可选,使用配置启动服务时作为参数传入
└─ module.py 主模块,必选,包含服务的完整逻辑 ├── module.py # 主模块,必选,包含服务的完整逻辑
└─ params.py 参数文件,必选,包含模型路径、前后处理参数等参数 └── params.py # 参数文件,必选,包含模型路径、前后处理参数等参数
``` ```
<a name="2"></a> <a name="2"></a>
## 2. 准备环境 ## 2. 准备环境
```shell ```shell
# 安装 paddlehub,请安装 2.0 版本 # 安装 paddlehub,建议安装 2.1.0 版本
pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple python3.7 -m pip install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
``` ```
...@@ -53,30 +53,27 @@ pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/sim ...@@ -53,30 +53,27 @@ pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/sim
```python ```python
"inference_model_dir": "../inference/" "inference_model_dir": "../inference/"
``` ```
需要注意, * 模型文件(包括 `.pdmodel``.pdiparams`)的名称必须为 `inference`
* 模型文件(包括 `.pdmodel``.pdiparams`)名称必须为 `inference` * 我们提供了大量基于 ImageNet-1k 数据集的预训练模型,模型列表及下载地址详见[模型库概览](../algorithm_introduction/ImageNet_models.md),也可以使用自己训练转换好的模型。
* 我们也提供了大量基于 ImageNet-1k 数据集的预训练模型,模型列表及下载地址详见[模型库概览](../algorithm_introduction/ImageNet_models.md),也可以使用自己训练转换好的模型。
<a name="4"></a> <a name="4"></a>
## 4. 安装服务模块 ## 4. 安装服务模块
针对 Linux 环境和 Windows 环境,安装命令如下。
* 在 Linux 环境下,安装示例如下: * 在 Linux 环境下,安装示例如下:
```shell ```shell
cd PaddleClas/deploy cd PaddleClas/deploy
# 安装服务模块: # 安装服务模块:
hub install hubserving/clas/ hub install hubserving/clas/
``` ```
* 在 Windows 环境下(文件夹的分隔符为`\`),安装示例如下: * 在 Windows 环境下(文件夹的分隔符为`\`),安装示例如下:
```shell ```shell
cd PaddleClas\deploy cd PaddleClas\deploy
# 安装服务模块: # 安装服务模块:
hub install hubserving\clas\ hub install hubserving\clas\
``` ```
<a name="5"></a> <a name="5"></a>
...@@ -84,36 +81,34 @@ hub install hubserving\clas\ ...@@ -84,36 +81,34 @@ hub install hubserving\clas\
<a name="5.1"></a> <a name="5.1"></a>
### 5.1 命令行命令启动 ### 5.1 命令行启动
该方式仅支持使用 CPU 预测。启动命令: 该方式仅支持使用 CPU 预测。启动命令:
```shell ```shell
$ hub serving start --modules Module1==Version1 \ hub serving start \
--port XXXX \ --modules clas_system \
--use_multiprocess \ --port 8866
--workers \ ```
``` 这样就完成了一个服务化 API 的部署,使用默认端口号 8866。
**参数说明**: **参数说明**:
|参数|用途| |参数|用途|
|-|-| |-|-|
|--modules/-m| [**必选**] PaddleHub Serving 预安装模型,以多个 Module==Version 键值对的形式列出<br>*`当不指定 Version 时,默认选择最新版本`*| |--modules/-m| [**必选**] PaddleHub Serving 预安装模型,以多个 Module==Version 键值对的形式列出<br>*`当不指定 Version 时,默认选择最新版本`*|
|--port/-p| [**可选**] 服务端口,默认为 8866| |--port/-p| [**可选**] 服务端口,默认为 8866|
|--use_multiprocess| [**可选**] 是否启用并发方式,默认为单进程方式,推荐多核 CPU 机器使用此方式<br>*`Windows 操作系统只支持单进程方式`*| |--use_multiprocess| [**可选**] 是否启用并发方式,默认为单进程方式,推荐多核 CPU 机器使用此方式<br>*`Windows 操作系统只支持单进程方式`*|
|--workers| [**可选**] 在并发方式下指定的并发任务数,默认为 `2*cpu_count-1`,其中 `cpu_count` 为 CPU 核数| |--workers| [**可选**] 在并发方式下指定的并发任务数,默认为 `2*cpu_count-1`,其中 `cpu_count` 为 CPU 核数|
更多部署细节详见 [PaddleHub Serving模型一键服务部署](https://paddlehub.readthedocs.io/zh_CN/release-v2.1/tutorial/serving.html)
如按默认参数启动服务:```hub serving start -m clas_system```
这样就完成了一个服务化 API 的部署,使用默认端口号 8866。
<a name="5.2"></a> <a name="5.2"></a>
### 5.2 配置文件启动 ### 5.2 配置文件启动
该方式仅支持使用 CPU 或 GPU 预测。启动命令: 该方式仅支持使用 CPU 或 GPU 预测。启动命令:
```hub serving start -c config.json``` ```shell
hub serving start -c config.json
```
其中,`config.json` 格式如下: 其中,`config.json` 格式如下:
...@@ -163,12 +158,21 @@ hub serving start -c hubserving/clas/config.json ...@@ -163,12 +158,21 @@ hub serving start -c hubserving/clas/config.json
```shell ```shell
cd PaddleClas/deploy cd PaddleClas/deploy
python hubserving/test_hubserving.py server_url image_path python3.7 hubserving/test_hubserving.py \
``` --server_url http://127.0.0.1:8866/predict/clas_system \
--image_file ./hubserving/ILSVRC2012_val_00006666.JPEG \
--batch_size 8
```
**预测输出**
```log
The result(s): class_ids: [57, 67, 68, 58, 65], label_names: ['garter snake, grass snake', 'diamondback, diamondback rattlesnake, Crotalus adamanteus', 'sidewinder, horned rattlesnake, Crotalus cerastes', 'water snake', 'sea snake'], scores: [0.21915, 0.15631, 0.14794, 0.13177, 0.12285]
The average time of prediction cost: 2.970 s/image
The average time cost: 3.014 s/image
The average top-1 score: 0.110
```
**脚本参数说明**: **脚本参数说明**:
* **server_url**:服务地址,格式为 * **server_url**:服务地址,格式为`http://[ip_address]:[port]/predict/[module_name]`。
`http://[ip_address]:[port]/predict/[module_name]`
* **image_path**:测试图像路径,可以是单张图片路径,也可以是图像集合目录路径。 * **image_path**:测试图像路径,可以是单张图片路径,也可以是图像集合目录路径。
* **batch_size**:[**可选**] 以 `batch_size` 大小为单位进行预测,默认为 `1`。 * **batch_size**:[**可选**] 以 `batch_size` 大小为单位进行预测,默认为 `1`。
* **resize_short**:[**可选**] 预处理时,按短边调整大小,默认为 `256`。 * **resize_short**:[**可选**] 预处理时,按短边调整大小,默认为 `256`。
...@@ -178,41 +182,44 @@ python hubserving/test_hubserving.py server_url image_path ...@@ -178,41 +182,44 @@ python hubserving/test_hubserving.py server_url image_path
**注意**:如果使用 `Transformer` 系列模型,如 `DeiT_***_384`, `ViT_***_384` 等,请注意模型的输入数据尺寸,需要指定`--resize_short=384 --crop_size=384`。 **注意**:如果使用 `Transformer` 系列模型,如 `DeiT_***_384`, `ViT_***_384` 等,请注意模型的输入数据尺寸,需要指定`--resize_short=384 --crop_size=384`。
访问示例:
```shell
python hubserving/test_hubserving.py --server_url http://127.0.0.1:8866/predict/clas_system --image_file ./hubserving/ILSVRC2012_val_00006666.JPEG --batch_size 8
```
**返回结果格式说明**: **返回结果格式说明**:
返回结果为列表(list),包含 top-k 个分类结果,以及对应的得分,还有此图片预测耗时,具体如下: 返回结果为列表(list),包含 top-k 个分类结果,以及对应的得分,还有此图片预测耗时,具体如下:
``` ```shell
list: 返回结果 list: 返回结果
└─ list: 第一张图片结果 └─list: 第一张图片结果
─ list: 前 k 个分类结果,依 score 递减排序 ├── list: 前 k 个分类结果,依 score 递减排序
─ list: 前 k 个分类结果对应的 score,依 score 递减排序 ├── list: 前 k 个分类结果对应的 score,依 score 递减排序
└─ float: 该图分类耗时,单位秒 └─ float: 该图分类耗时,单位秒
``` ```
<a name="7"></a> <a name="7"></a>
## 7. 自定义修改服务模块 ## 7. 自定义修改服务模块
如果需要修改服务逻辑,需要进行以下操作: 如果需要修改服务逻辑,需要进行以下操作:
1. 停止服务 1. 停止服务
```hub serving stop --port/-p XXXX``` ```shell
hub serving stop --port/-p XXXX
```
2. 到相应的 `module.py` 和 `params.py` 等文件中根据实际需求修改代码。`module.py` 修改后需要重新安装(`hub install hubserving/clas/`)并部署。在进行部署前,可通过 `python hubserving/clas/module.py` 测试已安装服务模块 2. 到相应的 `module.py` 和 `params.py` 等文件中根据实际需求修改代码。`module.py` 修改后需要重新安装(`hub install hubserving/clas/`)并部署。在进行部署前,可先通过 `python3.7 hubserving/clas/module.py` 命令来快速测试准备部署的代码
3. 卸载旧服务包 3. 卸载旧服务包
```hub uninstall clas_system``` ```shell
hub uninstall clas_system
```
4. 安装修改后的新服务包 4. 安装修改后的新服务包
```hub install hubserving/clas/``` ```shell
hub install hubserving/clas/
```
5.重新启动服务 5. 重新启动服务
```hub serving start -m clas_system``` ```shell
hub serving start -m clas_system
```
**注意**: **注意**:
常用参数可在 `PaddleClas/deploy/hubserving/clas/params.py` 中修改: 常用参数可在 `PaddleClas/deploy/hubserving/clas/params.py` 中修改:
...@@ -229,4 +236,4 @@ list: 返回结果 ...@@ -229,4 +236,4 @@ list: 返回结果
'class_id_map_file': 'class_id_map_file':
``` ```
为了避免不必要的延时以及能够以 batch_size 进行预测,数据预处理逻辑(包括 `resize`、`crop` 等操作)均在客户端完成,因此需要在 `PaddleClas/deploy/hubserving/test_hubserving.py#L35-L52` 中修改 为了避免不必要的延时以及能够以 batch_size 进行预测,数据预处理逻辑(包括 `resize`、`crop` 等操作)均在客户端完成,因此需要在 [PaddleClas/deploy/hubserving/test_hubserving.py#L41-L47](../../../deploy/hubserving/test_hubserving.py#L41-L47) 以及 [PaddleClas/deploy/hubserving/test_hubserving.py#L51-L76](../../../deploy/hubserving/test_hubserving.py#L51-L76) 中修改数据预处理逻辑相关代码
...@@ -231,9 +231,9 @@ adb push imgs/tabby_cat.jpg /data/local/tmp/arm_cpu/ ...@@ -231,9 +231,9 @@ adb push imgs/tabby_cat.jpg /data/local/tmp/arm_cpu/
```shell ```shell
clas_model_file ./MobileNetV3_large_x1_0.nb # 模型文件地址 clas_model_file ./MobileNetV3_large_x1_0.nb # 模型文件地址
label_path ./imagenet1k_label_list.txt # 类别映射文本文件 label_path ./imagenet1k_label_list.txt # 类别映射文本文件
resize_short_size 256 # resize之后的短边边长 resize_short_size 256 # resize之后的短边边长
crop_size 224 # 裁剪后用于预测的边长 crop_size 224 # 裁剪后用于预测的边长
visualize 0 # 是否进行可视化,如果选择的话,会在当前文件夹下生成名为clas_result.png的图像文件 visualize 0 # 是否进行可视化,如果选择的话,会在当前文件夹下生成名为clas_result.png的图像文件
num_threads 1 # 线程数,默认是1。 num_threads 1 # 线程数,默认是1。
precision FP32 # 精度类型,可以选择 FP32 或者 INT8,默认是 FP32。 precision FP32 # 精度类型,可以选择 FP32 或者 INT8,默认是 FP32。
...@@ -263,4 +263,3 @@ A1:如果已经走通了上述步骤,更换模型只需要替换 `.nb` 模 ...@@ -263,4 +263,3 @@ A1:如果已经走通了上述步骤,更换模型只需要替换 `.nb` 模
Q2:换一个图测试怎么做? Q2:换一个图测试怎么做?
A2:替换 debug 下的测试图像为你想要测试的图像,使用 ADB 再次 push 到手机上即可。 A2:替换 debug 下的测试图像为你想要测试的图像,使用 ADB 再次 push 到手机上即可。
# 模型服务化部署
--------
## 目录
- [1. 简介](#1)
- [2. Serving 安装](#2)
- [3. 图像分类服务部署](#3)
- [3.1 模型转换](#3.1)
- [3.2 服务部署和请求](#3.2)
- [3.2.1 Python Serving](#3.2.1)
- [3.2.2 C++ Serving](#3.2.2)
- [4. 图像识别服务部署](#4)
- [4.1 模型转换](#4.1)
- [4.2 服务部署和请求](#4.2)
- [4.2.1 Python Serving](#4.2.1)
- [4.2.2 C++ Serving](#4.2.2)
- [5. FAQ](#5)
<a name="1"></a>
## 1. 简介
[Paddle Serving](https://github.com/PaddlePaddle/Serving) 旨在帮助深度学习开发者轻松部署在线预测服务,支持一键部署工业级的服务能力、客户端和服务端之间高并发和高效通信、并支持多种编程语言开发客户端。
该部分以 HTTP 预测服务部署为例,介绍怎样在 PaddleClas 中使用 PaddleServing 部署模型服务。目前只支持 Linux 平台部署,暂不支持 Windows 平台。
<a name="2"></a>
## 2. Serving 安装
Serving 官网推荐使用 docker 安装并部署 Serving 环境。首先需要拉取 docker 环境并创建基于 Serving 的 docker。
```shell
# 启动GPU docker
docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
nvidia-docker exec -it test bash
# 启动CPU docker
docker pull paddlepaddle/serving:0.7.0-devel
docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash
docker exec -it test bash
```
进入 docker 后,需要安装 Serving 相关的 python 包。
```shell
pip3 install paddle-serving-client==0.7.0
pip3 install paddle-serving-app==0.7.0
pip3 install faiss-cpu==1.7.1post2
#若为CPU部署环境:
pip3 install paddle-serving-server==0.7.0 # CPU
pip3 install paddlepaddle==2.2.0 # CPU
#若为GPU部署环境
pip3 install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6
pip3 install paddlepaddle-gpu==2.2.0 # GPU with CUDA10.2
#其他GPU环境需要确认环境再选择执行哪一条
pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
* 如果安装速度太慢,可以通过 `-i https://pypi.tuna.tsinghua.edu.cn/simple` 更换源,加速安装过程。
* 其他环境配置安装请参考: [使用Docker安装Paddle Serving](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
<a name="3"></a>
## 3. 图像分类服务部署
<a name="3.1"></a>
### 3.1 模型转换
使用 PaddleServing 做服务化部署时,需要将保存的 inference 模型转换为 Serving 模型。下面以经典的 ResNet50_vd 模型为例,介绍如何部署图像分类服务。
- 进入工作目录:
```shell
cd deploy/paddleserving
```
- 下载 ResNet50_vd 的 inference 模型:
```shell
# 下载并解压 ResNet50_vd 模型
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
```
- 用 paddle_serving_client 把下载的 inference 模型转换成易于 Server 部署的模型格式:
```
# 转换 ResNet50_vd 模型
python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ResNet50_vd_serving/ \
--serving_client ./ResNet50_vd_client/
```
ResNet50_vd 推理模型转换完成后,会在当前文件夹多出 `ResNet50_vd_serving``ResNet50_vd_client` 的文件夹,具备如下格式:
```
|- ResNet50_vd_serving/
|- inference.pdiparams
|- inference.pdmodel
|- serving_server_conf.prototxt
|- serving_server_conf.stream.prototxt
|- ResNet50_vd_client
|- serving_client_conf.prototxt
|- serving_client_conf.stream.prototxt
```
得到模型文件之后,需要分别修改 `ResNet50_vd_serving``ResNet50_vd_client` 下文件 `serving_server_conf.prototxt` 中的 alias 名字:将 `fetch_var` 中的 `alias_name` 改为 `prediction`
**备注**: Serving 为了兼容不同模型的部署,提供了输入输出重命名的功能。这样,不同的模型在推理部署时,只需要修改配置文件的 alias_name 即可,无需修改代码即可完成推理部署。
修改后的 serving_server_conf.prototxt 如下所示:
```
feed_var {
name: "inputs"
alias_name: "inputs"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "prediction"
is_lod_tensor: false
fetch_type: 1
shape: 1000
}
```
<a name="3.2"></a>
### 3.2 服务部署和请求
paddleserving 目录包含了启动 pipeline 服务、C++ serving服务和发送预测请求的代码,包括:
```shell
__init__.py
config.yml # 启动pipeline服务的配置文件
pipeline_http_client.py # http方式发送pipeline预测请求的脚本
pipeline_rpc_client.py # rpc方式发送pipeline预测请求的脚本
classification_web_service.py # 启动pipeline服务端的脚本
run_cpp_serving.sh # 启动C++ Serving部署的脚本
test_cpp_serving_client.py # rpc方式发送C++ serving预测请求的脚本
```
<a name="3.2.1"></a>
#### 3.2.1 Python Serving
- 启动服务:
```shell
# 启动服务,运行日志保存在 log.txt
python3 classification_web_service.py &>log.txt &
```
- 发送请求:
```shell
# 发送服务请求
python3 pipeline_http_client.py
```
成功运行后,模型预测的结果会打印在 cmd 窗口中,结果如下:
```
{'err_no': 0, 'err_msg': '', 'key': ['label', 'prob'], 'value': ["['daisy']", '[0.9341402053833008]'], 'tensors': []}
```
<a name="3.2.2"></a>
#### 3.2.2 C++ Serving
- 启动服务:
```shell
# 启动服务, 服务在后台运行,运行日志保存在 nohup.txt
sh run_cpp_serving.sh
```
- 发送请求:
```shell
# 发送服务请求
python3 test_cpp_serving_client.py
```
成功运行后,模型预测的结果会打印在 cmd 窗口中,结果如下:
```
prediction: daisy, probability: 0.9341399073600769
```
<a name="4"></a>
## 4. 图像识别服务部署
使用 PaddleServing 做服务化部署时,需要将保存的 inference 模型转换为 Serving 模型。 下面以 PP-ShiTu 中的超轻量图像识别模型为例,介绍图像识别服务的部署。
<a name="4.1"></a>
### 4.1 模型转换
- 下载通用检测 inference 模型和通用识别 inference 模型
```
cd deploy
# 下载并解压通用识别模型
wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
cd models
tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar
# 下载并解压通用检测模型
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
```
- 转换识别 inference 模型为 Serving 模型:
```
# 转换识别模型
python3 -m paddle_serving_client.convert --dirname ./general_PPLCNet_x2_5_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./general_PPLCNet_x2_5_lite_v1.0_serving/ \
--serving_client ./general_PPLCNet_x2_5_lite_v1.0_client/
```
识别推理模型转换完成后,会在当前文件夹多出 `general_PPLCNet_x2_5_lite_v1.0_serving/``general_PPLCNet_x2_5_lite_v1.0_client/` 的文件夹。分别修改 `general_PPLCNet_x2_5_lite_v1.0_serving/``general_PPLCNet_x2_5_lite_v1.0_client/` 目录下的 serving_server_conf.prototxt 中的 alias 名字: 将 `fetch_var` 中的 `alias_name` 改为 `features`
修改后的 serving_server_conf.prototxt 内容如下:
```
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
}
```
- 转换通用检测 inference 模型为 Serving 模型:
```
# 转换通用检测模型
python3 -m paddle_serving_client.convert --dirname ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ \
--serving_client ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
```
检测 inference 模型转换完成后,会在当前文件夹多出 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/``picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` 的文件夹。
**注意:** 此处不需要修改 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` 目录下的 serving_server_conf.prototxt 中的 alias 名字。
- 下载并解压已经构建后的检索库 index
```
cd ../
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar
```
<a name="4.2"></a>
### 4.2 服务部署和请求
**注意:** 识别服务涉及到多个模型,出于性能考虑采用 PipeLine 部署方式。Pipeline 部署方式当前不支持 windows 平台。
- 进入到工作目录
```shell
cd ./deploy/paddleserving/recognition
```
paddleserving 目录包含启动 Python Pipeline 服务、C++ Serving 服务和发送预测请求的代码,包括:
```
__init__.py
config.yml # 启动python pipeline服务的配置文件
pipeline_http_client.py # http方式发送pipeline预测请求的脚本
pipeline_rpc_client.py # rpc方式发送pipeline预测请求的脚本
recognition_web_service.py # 启动pipeline服务端的脚本
run_cpp_serving.sh # 启动C++ Pipeline Serving部署的脚本
test_cpp_serving_client.py # rpc方式发送C++ Pipeline serving预测请求的脚本
```
<a name="4.2.1"></a>
#### 4.2.1 Python Serving
- 启动服务:
```
# 启动服务,运行日志保存在 log.txt
python3 recognition_web_service.py &>log.txt &
```
- 发送请求:
```
python3 pipeline_http_client.py
```
成功运行后,模型预测的结果会打印在 cmd 窗口中,结果如下:
```
{'err_no': 0, 'err_msg': '', 'key': ['result'], 'value': ["[{'bbox': [345, 95, 524, 576], 'rec_docs': '红牛-强化型', 'rec_scores': 0.79903316}]"], 'tensors': []}
```
<a name="4.2.2"></a>
#### 4.2.2 C++ Serving
- 启动服务:
```shell
# 启动服务: 此处会在后台同时启动主体检测和特征提取服务,端口号分别为9293和9294;
# 运行日志分别保存在 log_mainbody_detection.txt 和 log_feature_extraction.txt中
sh run_cpp_serving.sh
```
- 发送请求:
```shell
# 发送服务请求
python3 test_cpp_serving_client.py
```
成功运行后,模型预测的结果会打印在 cmd 窗口中,结果如下所示:
```
[{'bbox': [345, 95, 524, 586], 'rec_docs': '红牛-强化型', 'rec_scores': 0.8016462}]
```
<a name="5"></a>
## 5. FAQ
**Q1**: 发送请求后没有结果返回或者提示输出解码报错
**A1**: 启动服务和发送请求时不要设置代理,可以在启动服务前和发送请求前关闭代理,关闭代理的命令是:
```
unset https_proxy
unset http_proxy
```
更多的服务部署类型,如 `RPC 预测服务` 等,可以参考 Serving 的[github 官网](https://github.com/PaddlePaddle/Serving/tree/v0.7.0/examples)
...@@ -8,9 +8,9 @@ ...@@ -8,9 +8,9 @@
- [1. 图像分类模型推理](#1) - [1. 图像分类模型推理](#1)
- [2. PP-ShiTu模型推理](#2) - [2. PP-ShiTu模型推理](#2)
- [2.1 主体检测模型推理](#2.1) - [2.1 主体检测模型推理](#2.1)
- [2.2 特征提取模型推理](#2.2) - [2.2 特征提取模型推理](#2.2)
- [2.3 PP-ShiTu PipeLine推理](#2.3) - [2.3 PP-ShiTu PipeLine推理](#2.3)
<a name="1"></a> <a name="1"></a>
## 1. 图像分类推理 ## 1. 图像分类推理
......
简体中文 | [English](../../en/inference_deployment/recognition_serving_deploy_en.md)
# 识别模型服务化部署
## 目录
- [1. 简介](#1-简介)
- [2. Serving 安装](#2-serving-安装)
- [3. 图像识别服务部署](#3-图像识别服务部署)
- [3.1 模型转换](#31-模型转换)
- [3.2 服务部署和请求](#32-服务部署和请求)
- [3.2.1 Python Serving](#321-python-serving)
- [3.2.2 C++ Serving](#322-c-serving)
- [4. FAQ](#4-faq)
<a name="1"></a>
## 1. 简介
[Paddle Serving](https://github.com/PaddlePaddle/Serving) 旨在帮助深度学习开发者轻松部署在线预测服务,支持一键部署工业级的服务能力、客户端和服务端之间高并发和高效通信、并支持多种编程语言开发客户端。
该部分以 HTTP 预测服务部署为例,介绍怎样在 PaddleClas 中使用 PaddleServing 部署模型服务。目前只支持 Linux 平台部署,暂不支持 Windows 平台。
<a name="2"></a>
## 2. Serving 安装
Serving 官网推荐使用 docker 安装并部署 Serving 环境。首先需要拉取 docker 环境并创建基于 Serving 的 docker。
```shell
# 启动GPU docker
docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
nvidia-docker exec -it test bash
# 启动CPU docker
docker pull paddlepaddle/serving:0.7.0-devel
docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash
docker exec -it test bash
```
进入 docker 后,需要安装 Serving 相关的 python 包。
```shell
python3.7 -m pip install paddle-serving-client==0.7.0
python3.7 -m pip install paddle-serving-app==0.7.0
python3.7 -m pip install faiss-cpu==1.7.1post2
#若为CPU部署环境:
python3.7 -m pip install paddle-serving-server==0.7.0 # CPU
python3.7 -m pip install paddlepaddle==2.2.0 # CPU
#若为GPU部署环境
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6
python3.7 -m pip install paddlepaddle-gpu==2.2.0 # GPU with CUDA10.2
# 其他 GPU 环境需要先确认 CUDA、TensorRT 版本,再选择执行对应的一条安装命令
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
* 如果安装速度太慢,可以通过 `-i https://pypi.tuna.tsinghua.edu.cn/simple` 更换源,加速安装过程。
* 其他环境配置安装请参考:[使用Docker安装Paddle Serving](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
<a name="3"></a>
## 3. 图像识别服务部署
使用 PaddleServing 做图像识别服务化部署时,**需要将保存的多个 inference 模型都转换为 Serving 模型**。 下面以 PP-ShiTu 中的超轻量图像识别模型为例,介绍图像识别服务的部署。
<a name="3.1"></a>
### 3.1 模型转换
- 进入工作目录:
```shell
cd deploy/
```
- 下载通用检测 inference 模型和通用识别 inference 模型
```shell
# 创建并进入models文件夹
mkdir models
cd models
# 下载并解压通用识别模型
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar
# 下载并解压通用检测模型
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
```
- 转换通用识别 inference 模型为 Serving 模型:
```shell
# 转换通用识别模型
python3.7 -m paddle_serving_client.convert \
--dirname ./general_PPLCNet_x2_5_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./general_PPLCNet_x2_5_lite_v1.0_serving/ \
--serving_client ./general_PPLCNet_x2_5_lite_v1.0_client/
```
上述命令中各参数的具体含义见本节下方的参数说明表
通用识别 inference 模型转换完成后,会在当前文件夹多出 `general_PPLCNet_x2_5_lite_v1.0_serving/``general_PPLCNet_x2_5_lite_v1.0_client/` 的文件夹,具备如下结构:
```shell
├── general_PPLCNet_x2_5_lite_v1.0_serving/
│ ├── inference.pdiparams
│ ├── inference.pdmodel
│ ├── serving_server_conf.prototxt
│ └── serving_server_conf.stream.prototxt
└── general_PPLCNet_x2_5_lite_v1.0_client/
├── serving_client_conf.prototxt
└── serving_client_conf.stream.prototxt
```
- 转换通用检测 inference 模型为 Serving 模型:
```shell
# 转换通用检测模型
python3.7 -m paddle_serving_client.convert --dirname ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ \
--serving_client ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
```
上述命令中各参数的具体含义见本节下方的参数说明表
识别 inference 模型转换完成后,还需要分别修改 `general_PPLCNet_x2_5_lite_v1.0_serving/` 目录下的 `serving_server_conf.prototxt` 与 `general_PPLCNet_x2_5_lite_v1.0_client/` 目录下的 `serving_client_conf.prototxt` 中的 `alias` 名字:将 `fetch_var` 中的 `alias_name` 改为 `features`。修改后的 `serving_server_conf.prototxt` 内容如下
```log
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
}
```
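如果不想手动编辑,也可以用类似下面的脚本批量完成上述修改(仅为示意,假设在 `deploy/models/` 目录下执行、每个配置文件只包含一个 `fetch_var`,修改前建议先备份):
```python
# 将识别模型 serving/client 配置中 fetch_var 的 alias_name 统一改为 "features"(示意脚本)
import re

cfg_files = [
    "general_PPLCNet_x2_5_lite_v1.0_serving/serving_server_conf.prototxt",
    "general_PPLCNet_x2_5_lite_v1.0_client/serving_client_conf.prototxt",
]

for path in cfg_files:
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    # 只替换 fetch_var 块内的 alias_name,feed_var 中的 alias_name 保持不变
    content = re.sub(
        r'(fetch_var\s*\{[^}]*?alias_name:\s*")[^"]*(")',
        r"\1features\2",
        content,
        flags=re.S)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    print(f"updated alias_name in {path}")
```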
通用检测 inference 模型转换完成后,会在当前文件夹多出 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/``picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` 的文件夹,具备如下结构:
```shell
├── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/
│ ├── inference.pdiparams
│ ├── inference.pdmodel
│ ├── serving_server_conf.prototxt
│ └── serving_server_conf.stream.prototxt
└── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
├── serving_client_conf.prototxt
└── serving_client_conf.stream.prototxt
```
上述命令中参数具体含义如下表所示
| 参数 | 类型 | 默认值 | 描述 |
| ----------------- | ---- | ------------------ | ------------------------------------------------------------ |
| `dirname` | str | - | 需要转换的模型文件存储路径,Program结构文件和参数文件均保存在此目录。 |
| `model_filename` | str | None | 存储需要转换的模型Inference Program结构的文件名称。如果设置为None,则使用 `__model__` 作为默认的文件名 |
| `params_filename` | str | None | 存储需要转换的模型所有参数的文件名称。当且仅当所有模型参数被保存在一个单独的二进制文件中时,才需要指定该参数。如果模型参数存储在各自分离的文件中,将其设置为 None |
| `serving_server` | str | `"serving_server"` | 转换后的模型文件和配置文件的存储路径。默认值为serving_server |
| `serving_client` | str | `"serving_client"` | 转换后的客户端配置文件存储路径。默认值为serving_client |
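除命令行方式外,也可以在 Python 中调用转换接口完成同样的转换。下面是一个示意写法(接口 `inference_model_to_serving` 的参数以所安装的 paddle-serving-client 版本文档为准):
```python
# 用 Python API 将通用识别 inference 模型转换为 Serving 模型(示意,与上面的命令行转换等价)
from paddle_serving_client.io import inference_model_to_serving

inference_model_to_serving(
    dirname="./general_PPLCNet_x2_5_lite_v1.0_infer/",
    model_filename="inference.pdmodel",
    params_filename="inference.pdiparams",
    serving_server="./general_PPLCNet_x2_5_lite_v1.0_serving/",
    serving_client="./general_PPLCNet_x2_5_lite_v1.0_client/")
```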
- 下载并解压已经构建完成的检索库 index
```shell
# 回到deploy目录
cd ../
# 下载构建完成的检索库 index
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar
# 解压构建完成的检索库 index
tar -xf drink_dataset_v1.0.tar
```
<a name="3.2"></a>
### 3.2 服务部署和请求
**注意:** 识别服务涉及多个模型,出于性能考虑采用 Pipeline 部署方式。Pipeline 部署方式当前不支持 Windows 平台。
- 进入到工作目录
```shell
cd ./deploy/paddleserving/recognition
```
paddleserving 目录包含启动 Python Pipeline 服务、C++ Serving 服务和发送预测请求的代码,包括:
```shell
__init__.py
config.yml # 启动python pipeline服务的配置文件
pipeline_http_client.py # http方式发送pipeline预测请求的脚本
pipeline_rpc_client.py # rpc方式发送pipeline预测请求的脚本
recognition_web_service.py # 启动pipeline服务端的脚本
readme.md # 识别模型服务化部署文档
run_cpp_serving.sh # 启动C++ Pipeline Serving部署的脚本
test_cpp_serving_client.py # rpc方式发送C++ Pipeline serving预测请求的脚本
```
<a name="3.2.1"></a>
#### 3.2.1 Python Serving
- 启动服务:
```shell
# 启动服务,运行日志保存在 log.txt
python3.7 recognition_web_service.py &>log.txt &
```
- 发送请求:
```shell
python3.7 pipeline_http_client.py
```
成功运行后,模型预测的结果会打印在客户端中,如下所示:
```log
{'err_no': 0, 'err_msg': '', 'key': ['result'], 'value': ["[{'bbox': [345, 95, 524, 576], 'rec_docs': '红牛-强化型', 'rec_scores': 0.79903316}]"], 'tensors': []}
```
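除 HTTP 方式外,也可以参考 `pipeline_rpc_client.py`,以 RPC 方式发送预测请求。下面是一个最小示意(端口与图片路径均为示例,实际以 `config.yml` 中的 `rpc_port` 和仓库中的脚本为准):
```python
# RPC 方式发送 pipeline 预测请求的示意脚本(端口与图片路径均为示例)
import base64

from paddle_serving_server.pipeline import PipelineClient

client = PipelineClient()
client.connect(["127.0.0.1:9994"])  # 以 config.yml 中的 rpc_port 为准

with open("../../drink_dataset_v1.0/test_images/001.jpeg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf8")

ret = client.predict(feed_dict={"image": image}, fetch=["result"])
print(ret)
```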
<a name="3.2.2"></a>
#### 3.2.2 C++ Serving
与 Python Serving 不同,C++ Serving 客户端调用 C++ OP 来预测,因此在启动服务之前,需要编译并安装 serving server 包,并设置 `SERVING_BIN`。
- 编译并安装Serving server包
```shell
# 进入工作目录
cd PaddleClas/deploy/paddleserving
# 一键编译安装Serving server、设置 SERVING_BIN
source ./build_server.sh python3.7
```
**注:** [build_server.sh](../build_server.sh#L55-L62) 所设定的路径可能需要根据实际机器上的环境(如 CUDA、Python 版本等)作相应修改,然后再编译。
- C++ Serving 使用的输入输出格式与 Python Serving 不同,因此需要执行以下命令,用 `deploy/paddleserving/recognition/preprocess/` 目录下的 4 个 prototxt 文件,覆盖 [3.1 模型转换](#31-模型转换) 中得到的模型文件夹下对应的 4 个 prototxt 文件。
```shell
# 进入PaddleClas/deploy目录
cd PaddleClas/deploy/
# 覆盖prototxt文件
\cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_serving/*.prototxt ./models/general_PPLCNet_x2_5_lite_v1.0_serving/
\cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_client/*.prototxt ./models/general_PPLCNet_x2_5_lite_v1.0_client/
\cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/*.prototxt ./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
\cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/*.prototxt ./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/
```
- 启动服务:
```shell
# 进入工作目录
cd PaddleClas/deploy/paddleserving/recognition
# 端口号默认为9400;运行日志默认保存在 log_PPShiTu.txt 中
# CPU部署
bash run_cpp_serving.sh
# GPU部署,并指定第0号卡
bash run_cpp_serving.sh 0
```
- 发送请求:
```shell
# 发送服务请求
python3.7 test_cpp_serving_client.py
```
成功运行后,模型预测的结果会打印在客户端中,如下所示:
```log
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0614 03:01:36.273097 6084 naming_service_thread.cpp:202] brpc::policy::ListNamingService("127.0.0.1:9400"): added 1
I0614 03:01:37.393564 6084 general_model.cpp:490] [client]logid=0,client_cost=1107.82ms,server_cost=1101.75ms.
[{'bbox': [345, 95, 524, 585], 'rec_docs': '红牛-强化型', 'rec_scores': 0.8073724}]
```
- 关闭服务
如果服务程序在前台运行,可以按下`Ctrl+C`来终止服务端程序;如果在后台运行,可以使用kill命令关闭相关进程,也可以在启动服务程序的路径下执行以下命令来终止服务端程序:
```bash
python3.7 -m paddle_serving_server.serve stop
```
执行完毕后出现`Process stopped`信息表示成功关闭服务。
<a name="4"></a>
## 4. FAQ
**Q1**: 发送请求后没有结果返回或者提示输出解码报错
**A1**: 启动服务和发送请求时不要设置代理,可以在启动服务前和发送请求前关闭代理,关闭代理的命令是:
```shell
unset https_proxy
unset http_proxy
```
**Q2**: 启动服务后没有任何反应
**A2**: 可以检查`config.yml``model_config`对应的路径是否存在,文件夹命名是否正确
更多的服务部署类型,如 `RPC 预测服务` 等,可以参考 Serving 的 [GitHub 官网](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples)。
...@@ -67,6 +67,9 @@ from ppcls.arch.backbone.model_zoo.pvt_v2 import PVT_V2_B0, PVT_V2_B1, PVT_V2_B2 ...@@ -67,6 +67,9 @@ from ppcls.arch.backbone.model_zoo.pvt_v2 import PVT_V2_B0, PVT_V2_B1, PVT_V2_B2
from ppcls.arch.backbone.model_zoo.mobilevit import MobileViT_XXS, MobileViT_XS, MobileViT_S from ppcls.arch.backbone.model_zoo.mobilevit import MobileViT_XXS, MobileViT_XS, MobileViT_S
from ppcls.arch.backbone.model_zoo.repvgg import RepVGG_A0, RepVGG_A1, RepVGG_A2, RepVGG_B0, RepVGG_B1, RepVGG_B2, RepVGG_B1g2, RepVGG_B1g4, RepVGG_B2g4, RepVGG_B3g4 from ppcls.arch.backbone.model_zoo.repvgg import RepVGG_A0, RepVGG_A1, RepVGG_A2, RepVGG_B0, RepVGG_B1, RepVGG_B2, RepVGG_B1g2, RepVGG_B1g4, RepVGG_B2g4, RepVGG_B3g4
from ppcls.arch.backbone.model_zoo.van import VAN_tiny from ppcls.arch.backbone.model_zoo.van import VAN_tiny
from ppcls.arch.backbone.model_zoo.peleenet import PeleeNet
from ppcls.arch.backbone.model_zoo.convnext import ConvNeXt_tiny
from ppcls.arch.backbone.variant_models.resnet_variant import ResNet50_last_stage_stride1 from ppcls.arch.backbone.variant_models.resnet_variant import ResNet50_last_stage_stride1
from ppcls.arch.backbone.variant_models.vgg_variant import VGG19Sigmoid from ppcls.arch.backbone.variant_models.vgg_variant import VGG19Sigmoid
from ppcls.arch.backbone.variant_models.pp_lcnet_variant import PPLCNet_x2_5_Tanh from ppcls.arch.backbone.variant_models.pp_lcnet_variant import PPLCNet_x2_5_Tanh
......
# MIT License
#
# Copyright (c) Meta Platforms, Inc. and affiliates.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#
# Code was heavily based on https://github.com/facebookresearch/ConvNeXt
import paddle
import paddle.nn as nn
from paddle.nn.initializer import TruncatedNormal, Constant
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
MODEL_URLS = {
"ConvNeXt_tiny": "", # TODO
}
__all__ = list(MODEL_URLS.keys())
trunc_normal_ = TruncatedNormal(std=.02)
zeros_ = Constant(value=0.)
ones_ = Constant(value=1.)
def drop_path(x, drop_prob=0., training=False):
"""Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ...
"""
if drop_prob == 0. or not training:
return x
keep_prob = paddle.to_tensor(1 - drop_prob)
shape = (paddle.shape(x)[0], ) + (1, ) * (x.ndim - 1)
random_tensor = keep_prob + paddle.rand(shape, dtype=x.dtype)
random_tensor = paddle.floor(random_tensor) # binarize
output = x.divide(keep_prob) * random_tensor
return output
class DropPath(nn.Layer):
"""Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
"""
def __init__(self, drop_prob=None):
super(DropPath, self).__init__()
self.drop_prob = drop_prob
def forward(self, x):
return drop_path(x, self.drop_prob, self.training)
class ChannelsFirstLayerNorm(nn.Layer):
r""" LayerNorm that supports two data formats: channels_last (default) or channels_first.
The ordering of the dimensions in the inputs. channels_last corresponds to inputs with
shape (batch_size, height, width, channels) while channels_first corresponds to inputs
with shape (batch_size, channels, height, width).
"""
def __init__(self, normalized_shape, epsilon=1e-5):
super().__init__()
self.weight = self.create_parameter(
shape=[normalized_shape], default_initializer=ones_)
self.bias = self.create_parameter(
shape=[normalized_shape], default_initializer=zeros_)
self.epsilon = epsilon
self.normalized_shape = [normalized_shape]
def forward(self, x):
u = x.mean(1, keepdim=True)
s = (x - u).pow(2).mean(1, keepdim=True)
x = (x - u) / paddle.sqrt(s + self.epsilon)
x = self.weight[:, None, None] * x + self.bias[:, None, None]
return x
class Block(nn.Layer):
r""" ConvNeXt Block. There are two equivalent implementations:
(1) DwConv -> LayerNorm (channels_first) -> 1x1 Conv -> GELU -> 1x1 Conv; all in (N, C, H, W)
(2) DwConv -> Permute to (N, H, W, C); LayerNorm (channels_last) -> Linear -> GELU -> Linear; Permute back
    We use (2), which the original authors found slightly faster in PyTorch
Args:
dim (int): Number of input channels.
drop_path (float): Stochastic depth rate. Default: 0.0
layer_scale_init_value (float): Init value for Layer Scale. Default: 1e-6.
"""
def __init__(self, dim, drop_path=0., layer_scale_init_value=1e-6):
super().__init__()
self.dwconv = nn.Conv2D(
dim, dim, 7, padding=3, groups=dim) # depthwise conv
self.norm = nn.LayerNorm(dim, epsilon=1e-6)
# pointwise/1x1 convs, implemented with linear layers
self.pwconv1 = nn.Linear(dim, 4 * dim)
self.act = nn.GELU()
self.pwconv2 = nn.Linear(4 * dim, dim)
if layer_scale_init_value > 0:
self.gamma = self.create_parameter(
shape=[dim],
default_initializer=Constant(value=layer_scale_init_value))
else:
self.gamma = None
self.drop_path = DropPath(
drop_path) if drop_path > 0. else nn.Identity()
def forward(self, x):
input = x
x = self.dwconv(x)
x = x.transpose([0, 2, 3, 1]) # (N, C, H, W) -> (N, H, W, C)
x = self.norm(x)
x = self.pwconv1(x)
x = self.act(x)
x = self.pwconv2(x)
if self.gamma is not None:
x = self.gamma * x
x = x.transpose([0, 3, 1, 2]) # (N, H, W, C) -> (N, C, H, W)
x = input + self.drop_path(x)
return x
class ConvNeXt(nn.Layer):
r""" ConvNeXt
    A Paddle impl of: `A ConvNet for the 2020s` -
https://arxiv.org/pdf/2201.03545.pdf
Args:
in_chans (int): Number of input image channels. Default: 3
class_num (int): Number of classes for classification head. Default: 1000
depths (tuple(int)): Number of blocks at each stage. Default: [3, 3, 9, 3]
dims (int): Feature dimension at each stage. Default: [96, 192, 384, 768]
drop_path_rate (float): Stochastic depth rate. Default: 0.
layer_scale_init_value (float): Init value for Layer Scale. Default: 1e-6.
head_init_scale (float): Init scaling value for classifier weights and biases. Default: 1.
"""
def __init__(self,
in_chans=3,
class_num=1000,
depths=[3, 3, 9, 3],
dims=[96, 192, 384, 768],
drop_path_rate=0.,
layer_scale_init_value=1e-6,
head_init_scale=1.):
super().__init__()
# stem and 3 intermediate downsampling conv layers
self.downsample_layers = nn.LayerList()
stem = nn.Sequential(
nn.Conv2D(
in_chans, dims[0], 4, stride=4),
ChannelsFirstLayerNorm(
dims[0], epsilon=1e-6))
self.downsample_layers.append(stem)
for i in range(3):
downsample_layer = nn.Sequential(
ChannelsFirstLayerNorm(
dims[i], epsilon=1e-6),
nn.Conv2D(
dims[i], dims[i + 1], 2, stride=2), )
self.downsample_layers.append(downsample_layer)
# 4 feature resolution stages, each consisting of multiple residual blocks
self.stages = nn.LayerList()
dp_rates = [
x.item() for x in paddle.linspace(0, drop_path_rate, sum(depths))
]
cur = 0
for i in range(4):
stage = nn.Sequential(*[
Block(
dim=dims[i],
drop_path=dp_rates[cur + j],
layer_scale_init_value=layer_scale_init_value)
for j in range(depths[i])
])
self.stages.append(stage)
cur += depths[i]
self.norm = nn.LayerNorm(dims[-1], epsilon=1e-6) # final norm layer
self.head = nn.Linear(dims[-1], class_num)
self.apply(self._init_weights)
self.head.weight.set_value(self.head.weight * head_init_scale)
self.head.bias.set_value(self.head.bias * head_init_scale)
def _init_weights(self, m):
if isinstance(m, (nn.Conv2D, nn.Linear)):
trunc_normal_(m.weight)
if m.bias is not None:
zeros_(m.bias)
def forward_features(self, x):
for i in range(4):
x = self.downsample_layers[i](x)
x = self.stages[i](x)
# global average pooling, (N, C, H, W) -> (N, C)
return self.norm(x.mean([-2, -1]))
def forward(self, x):
x = self.forward_features(x)
x = self.head(x)
return x
def _load_pretrained(pretrained, model, model_url, use_ssld=False):
if pretrained is False:
pass
elif pretrained is True:
load_dygraph_pretrain_from_url(model, model_url, use_ssld=use_ssld)
elif isinstance(pretrained, str):
load_dygraph_pretrain(model, pretrained)
else:
raise RuntimeError(
"pretrained type is not available. Please use `string` or `boolean` type."
)
def ConvNeXt_tiny(pretrained=False, use_ssld=False, **kwargs):
model = ConvNeXt(depths=[3, 3, 9, 3], dims=[96, 192, 384, 768], **kwargs)
_load_pretrained(
pretrained, model, MODEL_URLS["ConvNeXt_tiny"], use_ssld=use_ssld)
return model
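# ---------------------------------------------------------------------------
# Usage sketch (illustrative, not part of the upstream file): build
# ConvNeXt_tiny and run one forward pass. The class_num and input shape below
# are example values.
# ---------------------------------------------------------------------------
if __name__ == "__main__":
    net = ConvNeXt_tiny(pretrained=False, class_num=1000)
    net.eval()
    dummy = paddle.randn([1, 3, 224, 224])  # NCHW input
    with paddle.no_grad():
        logits = net(dummy)
    print(logits.shape)  # expected: [1, 1000]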
# copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#
# Code was heavily based on https://github.com/Robert-JunWang/PeleeNet
# reference: https://arxiv.org/pdf/1804.06882.pdf
import math
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.nn.initializer import Normal, Constant
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
MODEL_URLS = {
"peleenet": "" # TODO
}
__all__ = list(MODEL_URLS.keys())
normal_ = lambda x, mean=0, std=1: Normal(mean, std)(x)
constant_ = lambda x, value=0: Constant(value)(x)
zeros_ = Constant(value=0.)
ones_ = Constant(value=1.)
class _DenseLayer(nn.Layer):
def __init__(self, num_input_features, growth_rate, bottleneck_width, drop_rate):
super(_DenseLayer, self).__init__()
growth_rate = int(growth_rate / 2)
inter_channel = int(growth_rate * bottleneck_width / 4) * 4
if inter_channel > num_input_features / 2:
inter_channel = int(num_input_features / 8) * 4
print('adjust inter_channel to ', inter_channel)
self.branch1a = BasicConv2D(
num_input_features, inter_channel, kernel_size=1)
self.branch1b = BasicConv2D(
inter_channel, growth_rate, kernel_size=3, padding=1)
self.branch2a = BasicConv2D(
num_input_features, inter_channel, kernel_size=1)
self.branch2b = BasicConv2D(
inter_channel, growth_rate, kernel_size=3, padding=1)
self.branch2c = BasicConv2D(
growth_rate, growth_rate, kernel_size=3, padding=1)
def forward(self, x):
branch1 = self.branch1a(x)
branch1 = self.branch1b(branch1)
branch2 = self.branch2a(x)
branch2 = self.branch2b(branch2)
branch2 = self.branch2c(branch2)
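        # both branches output half of the original growth_rate in channels,
        # so the concat below adds growth_rate channels in total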
return paddle.concat([x, branch1, branch2], 1)
class _DenseBlock(nn.Sequential):
def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate):
super(_DenseBlock, self).__init__()
for i in range(num_layers):
layer = _DenseLayer(num_input_features + i *
growth_rate, growth_rate, bn_size, drop_rate)
setattr(self, 'denselayer%d' % (i + 1), layer)
class _StemBlock(nn.Layer):
def __init__(self, num_input_channels, num_init_features):
super(_StemBlock, self).__init__()
num_stem_features = int(num_init_features/2)
self.stem1 = BasicConv2D(
num_input_channels, num_init_features, kernel_size=3, stride=2, padding=1)
self.stem2a = BasicConv2D(
num_init_features, num_stem_features, kernel_size=1, stride=1, padding=0)
self.stem2b = BasicConv2D(
num_stem_features, num_init_features, kernel_size=3, stride=2, padding=1)
self.stem3 = BasicConv2D(
2*num_init_features, num_init_features, kernel_size=1, stride=1, padding=0)
self.pool = nn.MaxPool2D(kernel_size=2, stride=2)
def forward(self, x):
out = self.stem1(x)
branch2 = self.stem2a(out)
branch2 = self.stem2b(branch2)
branch1 = self.pool(out)
out = paddle.concat([branch1, branch2], 1)
out = self.stem3(out)
return out
class BasicConv2D(nn.Layer):
def __init__(self, in_channels, out_channels, activation=True, **kwargs):
super(BasicConv2D, self).__init__()
self.conv = nn.Conv2D(in_channels, out_channels,
bias_attr=False, **kwargs)
self.norm = nn.BatchNorm2D(out_channels)
self.activation = activation
def forward(self, x):
x = self.conv(x)
x = self.norm(x)
if self.activation:
return F.relu(x)
else:
return x
class PeleeNetDY(nn.Layer):
r"""PeleeNet model class, based on
`"Densely Connected Convolutional Networks" <https://arxiv.org/pdf/1608.06993.pdf> and
"Pelee: A Real-Time Object Detection System on Mobile Devices" <https://arxiv.org/pdf/1804.06882.pdf>`
Args:
growth_rate (int or list of 4 ints) - how many filters to add each layer (`k` in paper)
block_config (list of 4 ints) - how many layers in each pooling block
num_init_features (int) - the number of filters to learn in the first convolution layer
bottleneck_width (int or list of 4 ints) - multiplicative factor for number of bottle neck layers
(i.e. bn_size * k features in the bottleneck layer)
drop_rate (float) - dropout rate after each dense layer
class_num (int) - number of classification classes
"""
def __init__(self, growth_rate=32, block_config=[3, 4, 8, 6],
num_init_features=32, bottleneck_width=[1, 2, 4, 4],
drop_rate=0.05, class_num=1000):
super(PeleeNetDY, self).__init__()
self.features = nn.Sequential(*[
('stemblock', _StemBlock(3, num_init_features)),
])
if type(growth_rate) is list:
growth_rates = growth_rate
assert len(growth_rates) == 4, \
'The growth rate must be the list and the size must be 4'
else:
growth_rates = [growth_rate] * 4
if type(bottleneck_width) is list:
bottleneck_widths = bottleneck_width
assert len(bottleneck_widths) == 4, \
'The bottleneck width must be the list and the size must be 4'
else:
bottleneck_widths = [bottleneck_width] * 4
# Each denseblock
num_features = num_init_features
for i, num_layers in enumerate(block_config):
block = _DenseBlock(num_layers=num_layers,
num_input_features=num_features,
bn_size=bottleneck_widths[i],
growth_rate=growth_rates[i],
drop_rate=drop_rate)
setattr(self.features, 'denseblock%d' % (i + 1), block)
num_features = num_features + num_layers * growth_rates[i]
setattr(self.features, 'transition%d' % (i + 1), BasicConv2D(
num_features, num_features, kernel_size=1, stride=1, padding=0))
if i != len(block_config) - 1:
setattr(self.features, 'transition%d_pool' %
(i + 1), nn.AvgPool2D(kernel_size=2, stride=2))
num_features = num_features
# Linear layer
self.classifier = nn.Linear(num_features, class_num)
self.drop_rate = drop_rate
self.apply(self._initialize_weights)
def forward(self, x):
features = self.features(x)
out = F.avg_pool2d(features, kernel_size=features.shape[2:4]).flatten(1)
if self.drop_rate > 0:
out = F.dropout(out, p=self.drop_rate, training=self.training)
out = self.classifier(out)
return out
def _initialize_weights(self, m):
if isinstance(m, nn.Conv2D):
n = m._kernel_size[0] * m._kernel_size[1] * m._out_channels
normal_(m.weight, std=math.sqrt(2. / n))
if m.bias is not None:
zeros_(m.bias)
elif isinstance(m, nn.BatchNorm2D):
ones_(m.weight)
zeros_(m.bias)
elif isinstance(m, nn.Linear):
normal_(m.weight, std=0.01)
zeros_(m.bias)
def _load_pretrained(pretrained, model, model_url, use_ssld):
if pretrained is False:
pass
elif pretrained is True:
load_dygraph_pretrain_from_url(model, model_url, use_ssld=use_ssld)
elif isinstance(pretrained, str):
load_dygraph_pretrain(model, pretrained)
else:
raise RuntimeError(
"pretrained type is not available. Please use `string` or `boolean` type."
)
def PeleeNet(pretrained=False, use_ssld=False, **kwargs):
model = PeleeNetDY(**kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["peleenet"], use_ssld)
return model
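# ---------------------------------------------------------------------------
# Usage sketch (illustrative, not part of the upstream file): build the default
# PeleeNet and run one forward pass; growth_rate / bottleneck_width may also be
# given as 4-element lists to configure each stage separately. Values below are
# examples.
# ---------------------------------------------------------------------------
if __name__ == "__main__":
    net = PeleeNet(pretrained=False, class_num=1000)
    net.eval()
    with paddle.no_grad():
        out = net(paddle.randn([1, 3, 224, 224]))
    print(out.shape)  # expected: [1, 1000]
    # per-stage configuration example
    wider = PeleeNetDY(
        growth_rate=[32, 32, 48, 48],
        bottleneck_width=[1, 2, 4, 4],
        class_num=1000)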
此差异已折叠。
此差异已折叠。
...@@ -18,6 +18,7 @@ from __future__ import print_function ...@@ -18,6 +18,7 @@ from __future__ import print_function
from __future__ import unicode_literals from __future__ import unicode_literals
from functools import partial from functools import partial
import io
import six import six
import math import math
import random import random
...@@ -138,28 +139,53 @@ class OperatorParamError(ValueError): ...@@ -138,28 +139,53 @@ class OperatorParamError(ValueError):
class DecodeImage(object): class DecodeImage(object):
""" decode image """ """ decode image """
def __init__(self, to_rgb=True, to_np=False, channel_first=False): def __init__(self,
self.to_rgb = to_rgb to_np=True,
to_rgb=True,
channel_first=False,
backend="cv2"):
self.to_np = to_np # to numpy self.to_np = to_np # to numpy
self.to_rgb = to_rgb # only enabled when to_np is True
self.channel_first = channel_first # only enabled when to_np is True self.channel_first = channel_first # only enabled when to_np is True
if backend.lower() not in ["cv2", "pil"]:
logger.warning(
f"The backend of DecodeImage only support \"cv2\" or \"PIL\". \"f{backend}\" is unavailable. Use \"cv2\" instead."
)
backend = "cv2"
self.backend = backend.lower()
if not to_np:
logger.warning(
f"\"to_rgb\" and \"channel_first\" are only enabled when to_np is True. \"to_np\" is now {to_np}."
)
def __call__(self, img): def __call__(self, img):
if not isinstance(img, np.ndarray): if isinstance(img, Image.Image):
if six.PY2: assert self.backend == "pil", "invalid input 'img' in DecodeImage"
assert type(img) is str and len( elif isinstance(img, np.ndarray):
img) > 0, "invalid input 'img' in DecodeImage" assert self.backend == "cv2", "invalid input 'img' in DecodeImage"
elif isinstance(img, bytes):
if self.backend == "pil":
data = io.BytesIO(img)
img = Image.open(data)
else: else:
assert type(img) is bytes and len( data = np.frombuffer(img, dtype="uint8")
img) > 0, "invalid input 'img' in DecodeImage" img = cv2.imdecode(data, 1)
data = np.frombuffer(img, dtype='uint8') else:
img = cv2.imdecode(data, 1) raise ValueError("invalid input 'img' in DecodeImage")
if self.to_rgb:
assert img.shape[2] == 3, 'invalid shape of image[%s]' % ( if self.to_np:
img.shape) if self.backend == "pil":
img = img[:, :, ::-1] assert img.mode == "RGB", f"invalid shape of image[{img.shape}]"
                img = np.asarray(img)[:, :, ::-1]  # to BGR
if self.channel_first:
img = img.transpose((2, 0, 1)) if self.to_rgb:
assert img.shape[2] == 3, f"invalid shape of image[{img.shape}]"
img = img[:, :, ::-1]
if self.channel_first:
img = img.transpose((2, 0, 1))
return img return img
......
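上面的改动使 `DecodeImage` 同时支持 `cv2` 与 `pil` 两种后端,并可接受 bytes、numpy 数组或 `PIL.Image` 输入。下面给出一个基于该改动的使用示意(导入路径与图片路径均为示例,以实际仓库为准):
```python
# DecodeImage 新行为的使用示意(导入路径与图片路径均为示例)
from ppcls.data.preprocess.ops.operators import DecodeImage

decoder = DecodeImage(to_np=True, to_rgb=True, channel_first=False, backend="cv2")
with open("docs/images/inference_deployment/whl_demo.jpg", "rb") as f:
    raw = f.read()           # bytes 输入
img = decoder(raw)           # 解码为 HWC、RGB 顺序的 numpy 数组
print(img.shape, img.dtype)
```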
此差异已折叠。
此差异已折叠。
...@@ -54,12 +54,12 @@ def log_info(trainer, batch_size, epoch_id, iter_id): ...@@ -54,12 +54,12 @@ def log_info(trainer, batch_size, epoch_id, iter_id):
ips_msg = "ips: {:.5f} samples/s".format( ips_msg = "ips: {:.5f} samples/s".format(
batch_size / trainer.time_info["batch_cost"].avg) batch_size / trainer.time_info["batch_cost"].avg)
eta_sec = ((trainer.config["Global"]["epochs"] - epoch_id + 1 eta_sec = ((trainer.config["Global"]["epochs"] - epoch_id + 1
) * len(trainer.train_dataloader) - iter_id ) * trainer.max_iter - iter_id
) * trainer.time_info["batch_cost"].avg ) * trainer.time_info["batch_cost"].avg
eta_msg = "eta: {:s}".format(str(datetime.timedelta(seconds=int(eta_sec)))) eta_msg = "eta: {:s}".format(str(datetime.timedelta(seconds=int(eta_sec))))
logger.info("[Train][Epoch {}/{}][Iter: {}/{}]{}, {}, {}, {}, {}".format( logger.info("[Train][Epoch {}/{}][Iter: {}/{}]{}, {}, {}, {}, {}".format(
epoch_id, trainer.config["Global"]["epochs"], iter_id, epoch_id, trainer.config["Global"]["epochs"], iter_id,
len(trainer.train_dataloader), lr_msg, metric_msg, time_msg, ips_msg, trainer.max_iter, lr_msg, metric_msg, time_msg, ips_msg,
eta_msg)) eta_msg))
for i, lr in enumerate(trainer.lr_sch): for i, lr in enumerate(trainer.lr_sch):
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
...@@ -179,6 +179,22 @@ for batch_size in ${batch_size_list[*]}; do ...@@ -179,6 +179,22 @@ for batch_size in ${batch_size_list[*]}; do
func_sed_params "$FILENAME" "${line_epoch}" "$epoch" func_sed_params "$FILENAME" "${line_epoch}" "$epoch"
gpu_id=$(set_gpu_id $device_num) gpu_id=$(set_gpu_id $device_num)
# if batch_size is big, copy train_list.txt to generate more train logs
# There are about 50k images in train_list.txt, and the train log is printed every 10 iterations.
# At least 25 log entries are needed to calculate ips for the benchmark system.
# So the copy number for train_list.txt is computed as follows:
total_batch_size=`echo $[$batch_size*${device_num:1:1}*${device_num:3:3}]`
copy_num=`echo $[$total_batch_size/200]`
if [ $copy_num -gt 1 ];then
cd dataset/ILSVRC2012
rm -rf train_list.txt
for ((i=1; i <=$copy_num; i++));do
cat val_list.txt >> train_list.txt
done
cd ../../
fi
if [ ${#gpu_id} -le 1 ];then if [ ${#gpu_id} -le 1 ];then
log_path="$SAVE_LOG/profiling_log" log_path="$SAVE_LOG/profiling_log"
mkdir -p $log_path mkdir -p $log_path
......
此差异已折叠。