Unverified · Commit f99b1ec5 authored by cuicheng01, committed by GitHub

Merge branch 'develop' into Add_PULC_demo

......@@ -53,52 +53,65 @@ The Res2Net200_vd pretrained model reaches a Top-1 accuracy of 85.1%.
PP-ShiTu image recognition quick start: [click here](./docs/zh_CN/quick_start/quick_start_recognition.md)
## Documentation Tutorials
- Installation
- [Install Paddle](./docs/zh_CN/installation/install_paddle.md)
- [Install PaddleClas](./docs/zh_CN/installation/install_paddleclas.md)
- Quick start
- [PP-ShiTu image recognition quick start](./docs/zh_CN/quick_start/quick_start_recognition.md)
- Image classification quick start
- [Beginner tutorial](./docs/zh_CN/quick_start/quick_start_classification_new_user.md)
- [Advanced tutorial](./docs/zh_CN/quick_start/quick_start_classification_professional.md)
- [Multi-label classification](./docs/zh_CN/quick_start/quick_start_multilabel_classification.md)
- [Environment setup]()
- [PULC: practical ultra lightweight image classification solution]()
- [Ultra lightweight image classification quick start (@崔程)]()
- [Ultra lightweight image classification model zoo](including benchmark, @崔程)
- xx
- [Solution introduction and model training]()
- [Inference and deployment](@水龙)
- Inference with the Python prediction engine
- Inference with the C++ prediction engine
- Serving deployment
- On-device deployment
- Paddle2ONNX model conversion and prediction
- [Model compression](@崔程)
- [Introduction to the PP-ShiTu image recognition system](#图像识别系统介绍)
- Image recognition quick start
- Module introduction
- [Mainbody detection](./docs/zh_CN/image_recognition_pipeline/mainbody_detection.md)
- [Feature extraction](./docs/zh_CN/image_recognition_pipeline/feature_extraction.md)
- [Feature extraction models](./docs/zh_CN/image_recognition_pipeline/feature_extraction.md)
- [Vector search](./docs/zh_CN/image_recognition_pipeline/vector_search.md)
- [Backbone networks and pretrained model zoo](./docs/zh_CN/algorithm_introduction/ImageNet_models.md)
- Data preparation
- [Image classification datasets](./docs/zh_CN/data_preparation/classification_dataset.md)
- [Image recognition datasets](./docs/zh_CN/data_preparation/recognition_dataset.md)
- Model training
- [Image classification task](./docs/zh_CN/models_training/classification.md)
- [Image recognition task](./docs/zh_CN/models_training/recognition.md)
- [Training parameter tuning strategies](./docs/zh_CN/models_training/train_strategy.md)
- [Configuration file description](./docs/zh_CN/models_training/config_description.md)
- Model inference and deployment
- [Model export](./docs/zh_CN/inference_deployment/export_model.md)
- Python/C++ prediction engine
- [Inference with the Python prediction engine](./docs/zh_CN/inference_deployment/python_deploy.md)
- [Inference with the C++ classification prediction engine](./docs/zh_CN/inference_deployment/cpp_deploy.md) / [Inference with the C++ PP-ShiTu prediction engine](deploy/cpp_shitu/readme.md)
- Hash encoding
- Model training (including dataset format description, etc.)
- Inference and deployment
- Inference with the Python prediction engine
- Inference with the C++ prediction engine
- Serving deployment
- [Paddle Serving deployment (recommended)](./docs/zh_CN/inference_deployment/paddle_serving_deploy.md)
- [Hub Serving deployment](./docs/zh_CN/inference_deployment/paddle_hub_serving_deploy.md)
- [On-device deployment](./deploy/lite/readme.md)
- [Prediction with the whl package](./docs/zh_CN/inference_deployment/whl_deploy.md)
- Algorithm introduction
- [Introduction to image classification](./docs/zh_CN/algorithm_introduction/image_classification.md)
- [Introduction to metric learning](./docs/zh_CN/algorithm_introduction/metric_learning.md)
- Advanced usage
- [Data augmentation](./docs/zh_CN/advanced_tutorials/DataAugmentation.md)
- [Model quantization](./docs/zh_CN/advanced_tutorials/model_prune_quantization.md)
- [Knowledge distillation](./docs/zh_CN/advanced_tutorials/knowledge_distillation.md)
- [PaddleClas code overview](./docs/zh_CN/advanced_tutorials/code_overview.md)
- [Community contribution guide](./docs/zh_CN/advanced_tutorials/how_to_contribute.md)
- On-device deployment
- Paddle2ONNX model conversion and prediction
- Model compression
- Model quantization
- Model pruning
- [Backbone networks and pretrained model zoo](./docs/zh_CN/algorithm_introduction/ImageNet_models.md)
- PP-series backbone models (including algorithm introduction, usage, train-infer pipeline links, etc.) (@崔程)
- PP-HGNet
- PP-LCNet v2
- PP-LCNet
- SSLD semi-supervised knowledge distillation solution (@若愚)
- Introduction to the SSLD algorithm
- Pretrained model zoo
- Usage (?)
- Cutting-edge algorithms
- Backbone networks and pretrained model zoo (@崔程)
- Server-side CNN model zoo
- Mobile-side CNN model zoo
- Vision Transformer model zoo
- Metric learning (arcmargin and other algorithms) (@水龙)
- ReID (@水龙)
- Vector search (@水龙)
- Hash features (@水龙)
- Model distillation (@若愚)
- Data augmentation (@崔程)
- Industrial practice examples (@胜禹)
- 30-minute image classification quick start (formerly the beginner tutorial) (@崔程)
- FAQ
- [Selected image recognition questions](docs/zh_CN/faq_series/faq_2021_s2.md)
- [Selected image classification questions](docs/zh_CN/faq_series/faq_selected_30.md)
- [Image classification FAQ Season 1](docs/zh_CN/faq_series/faq_2020_s1.md)
- [Image classification FAQ Season 2](docs/zh_CN/faq_series/faq_2021_s1.md)
- [PaddleClas code overview](./docs/zh_CN/advanced_tutorials/code_overview.md)
- [Community contribution guide](./docs/zh_CN/advanced_tutorials/how_to_contribute.md)
- [License](#许可证书)
- [Contributing](#贡献代码)
......
Global:
infer_imgs: "./images/PULC/person/objects365_02035329.jpg"
inference_model_dir: "./models/person_cls_infer"
infer_imgs: "./images/PULC/person_exists/objects365_02035329.jpg"
inference_model_dir: "./models/person_exists_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: False
......
Global:
infer_imgs: "./images/PULC/traffic_sign/99603_17806.jpg"
inference_model_dir: "./models/traffic_sign_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
benchmark: False
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: Topk
Topk:
topk: 5
class_id_map_file: "../dataset/traffic_sign/label_name_id.txt"
SavePreLabel:
save_dir: ./pre_label/
Global:
infer_imgs: "./images/PULC/vehicle_attr/0002_c002_00030670_0.jpg"
inference_model_dir: "./models/vehicle_attr_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
benchmark: False
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: VehicleAttribute
VehicleAttribute:
color_threshold: 0.5
type_threshold: 0.5
......@@ -280,3 +280,44 @@ class Attribute(object):
batch_res.append([label_res, pred_res])
return batch_res
class VehicleAttribute(object):
def __init__(self, color_threshold=0.5, type_threshold=0.5):
self.color_threshold = color_threshold
self.type_threshold = type_threshold
self.color_list = [
"yellow", "orange", "green", "gray", "red", "blue", "white",
"golden", "brown", "black"
]
self.type_list = [
"sedan", "suv", "van", "hatchback", "mpv", "pickup", "bus",
"truck", "estate"
]
def __call__(self, batch_preds, file_names=None):
# postprocess output of predictor
batch_res = []
for res in batch_preds:
res = res.tolist()
label_res = []
color_idx = np.argmax(res[:10])
type_idx = np.argmax(res[10:])
if res[color_idx] >= self.color_threshold:
color_info = f"Color: ({self.color_list[color_idx]}, prob: {res[color_idx]})"
else:
color_info = "Color unknown"
if res[type_idx + 10] >= self.type_threshold:
type_info = f"Type: ({self.type_list[type_idx]}, prob: {res[type_idx + 10]})"
else:
type_info = "Type unknown"
label_res = f"{color_info}, {type_info}"
threshold_list = [self.color_threshold
] * 10 + [self.type_threshold] * 9
pred_res = (np.array(res) > np.array(threshold_list)
).astype(np.int8).tolist()
batch_res.append({"attributes": label_res, "output": pred_res})
return batch_res
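For reference, a minimal usage sketch of this post-processor (assuming the `VehicleAttribute` class above is in scope and that `numpy` is imported as `np`, as the class requires; the scores below are made up). The input is a batch of 19-dim sigmoid outputs: 10 color scores followed by 9 type scores.

```python
import numpy as np

# Assume VehicleAttribute is the class defined above.
postprocess = VehicleAttribute(color_threshold=0.5, type_threshold=0.5)

# One fake prediction: high scores for "yellow" (index 0)
# and "hatchback" (type index 3, i.e. overall index 10 + 3).
scores = np.zeros((1, 19), dtype=np.float32)
scores[0, 0] = 0.99   # color: yellow
scores[0, 13] = 0.97  # type: hatchback
print(postprocess(scores))
# [{'attributes': 'Color: (yellow, prob: 0.99...), Type: (hatchback, prob: 0.97...)',
#   'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]}]
```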
......@@ -138,12 +138,11 @@ def main(config):
continue
batch_results = cls_predictor.predict(batch_imgs)
for number, result_dict in enumerate(batch_results):
if "Attribute" in config["PostProcess"]:
if "Attribute" in config[
"PostProcess"] or "VehicleAttribute" in config[
"PostProcess"]:
filename = batch_names[number]
attr_message = result_dict[0]
pred_res = result_dict[1]
print("{}:\t attributes: {}, \npredict output: {}".format(
filename, attr_message, pred_res))
print("{}:\t {}".format(filename, result_dict))
else:
filename = batch_names[number]
clas_ids = result_dict["class_ids"]
......
# Building a Human Presence Classification Case with PaddleClas
This tutorial shows how to use PaddleClas to quickly build a lightweight, high-accuracy, deployable binary classifier that tells whether a person is present in an image. Based on data from the human presence scenario, it combines the lightweight PPLCNet backbone, SSLD pretrained weights, the EDA data augmentation strategy, the SKL-UGI knowledge distillation strategy, and the SHAS hyperparameter search strategy to obtain an accurate, fast, easy-to-deploy binary classification model.
------
## Contents
- [1. Environment Setup](#1)
- [2. Inference for the Human Presence Scenario](#2)
  - [2.1 Download the Model](#2.1)
  - [2.2 Model Inference](#2.2)
    - [2.2.1 Predict a Single Image](#2.2.1)
    - [2.2.2 Batch Prediction on a Folder](#2.2.2)
- [3. Training for the Human Presence Scenario](#3)
  - [3.1 Data Preparation](#3.1)
  - [3.2 Model Training](#3.2)
    - [3.2.1 Training with Default Hyperparameters](#3.2.1)
      - [3.2.1.1 Training a Lightweight Model with Default Hyperparameters](#3.2.1.1)
      - [3.2.1.2 Training a Teacher Model with Default Hyperparameters](#3.2.1.2)
      - [3.2.1.3 Distillation Training with Default Hyperparameters](#3.2.1.3)
    - [3.2.2 Training with Hyperparameter Search](#3.2.2)
- [4. Model Evaluation and Inference](#4)
  - [4.1 Model Evaluation](#4.1)
  - [4.2 Model Prediction](#4.2)
  - [4.3 Inference with the inference Model](#4.3)
    - [4.3.1 Export the inference Model](#4.3.1)
    - [4.3.2 Inference with the inference Model](#4.3.2)
<a name="1"></a>
## 1. Environment Setup
* Installation: please refer to the [Paddle installation guide](../installation/install_paddle.md) and the [PaddleClas installation guide](../installation/install_paddleclas.md) to set up the PaddleClas environment.
<a name="2"></a>
## 2. Inference for the Human Presence Scenario
<a name="2.1"></a>
### 2.1 Download the Model
* Enter the `deploy` working directory.
```
cd deploy
```
Download the human presence classification model.
```
mkdir models
cd models
# Download the inference model and untar it
wget https://paddleclas.bj.bcebos.com/models/PULC/person_cls_infer.tar && tar -xf person_cls_infer.tar
```
After extraction, the `models` folder should have the following structure:
```
├── person_cls_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
<a name="2.2"></a>
### 2.2 Model Inference
<a name="2.2.1"></a>
#### 2.2.1 Predict a Single Image
Return to the `deploy` directory:
```
cd ../
```
Run the following command to classify the image `./images/PULC/person/objects365_02035329.jpg` as containing a person or not.
```shell
# Predict with the GPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o PostProcess.ThreshOutput.threshold=0.9794
# Predict with the CPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o PostProcess.ThreshOutput.threshold=0.9794 -o Global.use_gpu=False
```
The output is as follows.
```
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
```
**Note:** In real-world applications, one usually wants the best true positive rate (Tpr) subject to the false positive rate (Fpr) staying below a given level. On the `val` dataset of this scenario, the best Tpr at an Fpr of 1/1000 is reached at a threshold of `0.9794`, so `threshold` is set to `0.9794` here. See [Section 3.2](#3.2) for how this threshold is determined.
<a name="2.2.2"></a>
#### 2.2.2 Batch Prediction on a Folder
To predict all images in a folder, you can either modify the `Global.infer_imgs` field in the configuration file or override it with the `-o` option as shown below.
```shell
# Predict with the GPU; to predict with the CPU instead, append -o Global.use_gpu=False to the command
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o Global.infer_imgs="./images/PULC/person/"
```
The classification results of all images in the folder are printed to the terminal, as shown below.
```
objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
```
Here, `someone` means a person is present in the image and `nobody` means no person is present.
<a name="3"></a>
## 3. Training for the Human Presence Scenario
<a name="3.1"></a>
### 3.1 Data Preparation
Enter the PaddleClas directory.
```
cd path_to_PaddleClas
```
Enter the `dataset/` directory, then download and extract the data for the human presence scenario.
```shell
cd dataset
wget https://paddleclas.bj.bcebos.com/data/cls_demo/person.tar
tar -xf person.tar
cd ../
```
After the commands above finish, the `person` directory exists under `dataset/` and contains the following data:
```
├── train
│   ├── 000000000009.jpg
│   ├── 000000000025.jpg
...
├── val
│   ├── objects365_01780637.jpg
│   ├── objects365_01780640.jpg
...
├── ImageNet_val
│   ├── ILSVRC2012_val_00000001.JPEG
│   ├── ILSVRC2012_val_00000002.JPEG
...
├── train_list.txt
├── train_list.txt.debug
├── train_list_for_distill.txt
├── val_list.txt
└── val_list.txt.debug
```
`train/` and `val/` are the training set and validation set, respectively. `train_list.txt` and `val_list.txt` are their label files; `train_list.txt.debug` and `val_list.txt.debug` are `debug` label files, subsets of `train_list.txt` and `val_list.txt` that allow a quick walk-through of this case. `ImageNet_val/` is the ImageNet validation set; mixed with the `train` set, it is used by the `SKL-UGI knowledge distillation strategy` of this case, with the corresponding training label file `train_list_for_distill.txt`.
* **Note**:
  * All datasets used in this case are open source: the `train` set is a subset of the training set of [MS-COCO](https://cocodataset.org/#overview), the `val` set is a subset of the training set of [Objects365](https://www.objects365.org/overview.html), and `ImageNet_val` is the validation set of [ImageNet](https://www.image-net.org/). The dataset filtering procedure is described in [Dataset filtering for the human presence scenario]().
<a name="3.2"></a>
### 3.2 Model Training
<a name="3.2.1"></a>
#### 3.2.1 Training with Default Hyperparameters
<a name="3.2.1.1"></a>
##### 3.2.1.1 Training a Lightweight Model with Default Hyperparameters
`ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml` provides the training configuration for this scenario; training can be launched with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml
```
The best validation metric is between 0.94 and 0.95 (the dataset is small, so some fluctuation is expected).
**Note:**
* The metric here is Tpr: the true positive rate under the constraint that the false positive rate (Fpr) stays below a given level. It is a common industrial metric for binary classification; in this case, the Fpr level is 1/1000. For more on Fpr and Tpr, see [here](https://baike.baidu.com/item/AUC/19282953).
* During eval, the best TprAtFpr metric so far is printed, namely the current `Fpr`, `Tpr`, and `threshold` values. `Tpr` is the recall at the current `Fpr`; the higher, the better the model. `threshold` is the classification threshold at the current best `Fpr` and can be reused later for deployment (a toy sketch of this computation follows).
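For intuition, here is a minimal NumPy sketch (not the PaddleClas implementation) of how a best-Tpr-at-Fpr threshold can be computed from validation scores and binary labels:

```python
import numpy as np

def tpr_at_fpr(scores, labels, max_fpr=0.001):
    """Scan candidate thresholds and return (best_tpr, threshold)
    among thresholds whose Fpr does not exceed max_fpr."""
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)
    best_tpr, best_thresh = 0.0, 1.0
    for thresh in np.unique(scores):
        pred = scores >= thresh
        fp = np.sum(pred & (labels == 0))
        tn = np.sum(~pred & (labels == 0))
        tp = np.sum(pred & (labels == 1))
        fn = np.sum(~pred & (labels == 1))
        fpr = fp / max(fp + tn, 1)
        tpr = tp / max(tp + fn, 1)
        if fpr <= max_fpr and tpr > best_tpr:
            best_tpr, best_thresh = tpr, float(thresh)
    return best_tpr, best_thresh

# Toy example: six validation samples with "someone" probability scores.
scores = [0.99, 0.98, 0.97, 0.40, 0.30, 0.95]
labels = [1, 1, 1, 0, 0, 0]
print(tpr_at_fpr(scores, labels, max_fpr=0.001))  # -> (1.0, 0.97)
```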
<a name="3.2.1.2"></a>
##### 3.2.1.2 Training a Teacher Model with Default Hyperparameters
Reusing the hyperparameters in `ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml`, train the teacher model with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
-o Arch.name=ResNet101_vd
```
The best validation metric is between 0.96 and 0.98; the best teacher weights are saved to `output/ResNet101_vd/best_model.pdparams`.
<a name="3.2.1.3"></a>
##### 3.2.1.3 Distillation Training with Default Hyperparameters
The configuration file `ppcls/configs/PULC/person/Distillation/PPLCNet_x1_0_distillation.yaml` provides the configuration of the `SKL-UGI knowledge distillation strategy`. It uses `ResNet101_vd` as the teacher model and `PPLCNet_x1_0` as the student model, with the ImageNet validation set as additional unlabeled data. The training script is as follows:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person/Distillation/PPLCNet_x1_0_distillation.yaml \
-o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
```
The best validation metric is between 0.95 and 0.97; the best student weights are saved to `output/DistillationModel/best_model_student.pdparams`.
<a name="3.2.2"></a>
#### 3.2.2 Training with Hyperparameter Search
[Section 3.2](#3.2) trains with hyperparameters that were already obtained by search; this section describes the search process itself, which is used to find better training hyperparameters.
* The search is run with the following script:
```shell
python tools/search_strategy.py -c ppcls/configs/StrategySearch/person.yaml
```
`ppcls/configs/StrategySearch/person.yaml` specifies the GPU ids and the search configuration. By default, the search training logs and models are stored under `output/search_person`, and the final distilled model under `output/search_person/search_res/DistillationModel/best_model_student.pdparams`.
* **Note**:
  * The default configuration used in [Section 3.2.1](#3.2.1) has already been obtained through this search, so this step is optional; it is worth trying if your training data differs.
  * On the current dataset, this process takes roughly 10 hours on 4 V100 GPUs. With limited resources, you can still walk through the search by replacing `train_list.txt` and `val_list.txt` in `ppcls/configs/cls_demo/person/PPLCNet/PPLCNet_x1_0_search.yaml` with `train_list.txt.debug` and `val_list.txt.debug`, respectively. This only speeds up the walk-through; because of the small data volume, the search result is not meaningful. The search space can also be adjusted to your resources: shrink it if resources are tight, enlarge it if they are plentiful.
  * If the hyperparameters found here differ from those in [Section 3.2.1](#3.2.1), this is mainly fluctuation caused by the small training set and can be ignored.
<a name="4"></a>
## 4. Model Evaluation and Inference
<a name="4.1"></a>
### 4.1 Model Evaluation
After training, the model metrics can be evaluated with the following command.
```bash
python3 tools/eval.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
-o Global.pretrained_model="output/DistillationModel/best_model_student"
```
<a name="4.2"></a>
### 4.2 Model Prediction
After training, the trained weights can be loaded for prediction. A complete example is provided in `tools/infer.py`; prediction only requires the following command:
```bash
python3 tools/infer.py \
    -c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
    -o Infer.infer_imgs=./dataset/person/val/objects365_01780637.jpg \
    -o Global.pretrained_model=output/DistillationModel/best_model_student \
    -o Infer.PostProcess.threshold=0.9794
```
The output is as follows:
```
[{'class_ids': [0], 'scores': [0.9878496769815683], 'label_names': ['nobody'], 'file_name': './dataset/person/val/objects365_01780637.jpg'}]
```
**Note:** The value of `Infer.PostProcess.threshold` must be chosen for the actual scenario; `0.9794` here is the threshold of the best Tpr at an Fpr of 1/1000 on the `val` dataset of this scenario.
<a name="4.3"></a>
### 4.3 Inference with the inference Model
<a name="4.3.1"></a>
#### 4.3.1 Export the inference Model
By exporting an inference model, PaddlePaddle can run prediction with its inference engine. The following shows how to do this.
First, convert the trained weights:
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person
```
After this script finishes, a `PPLCNet_x1_0_person` folder is generated under `deploy/models/`; its model format is the same as that of the inference model downloaded in Section 2.2.
<a name="4.3.2"></a>
#### 4.3.2 Inference with the inference Model
The inference command is:
```
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o Global.inference_model_dir="models/PPLCNet_x1_0_person" -o PostProcess.ThreshOutput.threshold=0.9794
```
**Notes:**
- `PostProcess.ThreshOutput.threshold` here is the best `threshold` reported during eval.
- For more inference details, see [Section 2.2](#2.2).
# PULC Human Presence Classification Model
------
## Contents
- [1. Introduction to the Model and Application Scenario](#1)
- [2. Quick Start](#2)
- [3. Model Training, Evaluation and Prediction](#3)
  - [3.1 Environment Setup](#3.1)
  - [3.2 Data Preparation](#3.2)
    - [3.2.1 Dataset Source](#3.2.1)
    - [3.2.2 Obtaining the Dataset](#3.2.2)
  - [3.3 Model Training](#3.3)
  - [3.4 Model Evaluation](#3.4)
  - [3.5 Model Prediction](#3.5)
- [4. Model Compression](#4)
  - [4.1 SKL-UGI Knowledge Distillation](#4.1)
    - [4.1.1 Teacher Model Training](#4.1.1)
    - [4.1.2 Distillation Training](#4.1.2)
- [5. Hyperparameter Search](#5)
- [6. Model Inference and Deployment](#6)
  - [6.1 Preparing the Inference Model](#6.1)
    - [6.1.1 Export the inference Model from Trained Weights](#6.1.1)
    - [6.1.2 Download the inference Model Directly](#6.1.2)
  - [6.2 Inference with the Python Prediction Engine](#6.2)
    - [6.2.1 Predict a Single Image](#6.2.1)
    - [6.2.2 Batch Prediction on a Folder](#6.2.2)
  - [6.3 Inference with the C++ Prediction Engine](#6.3)
  - [6.4 Serving Deployment](#6.4)
  - [6.5 On-device Deployment](#6.5)
  - [6.6 Paddle2ONNX Model Conversion and Prediction](#6.6)
<a name="1"></a>
## 1. Introduction to the Model and Application Scenario
This case shows how to use PULC (Practical Ultra Lightweight Classification), the ultra lightweight image classification solution of PaddleClas, to quickly build a lightweight, high-accuracy, deployable human presence classification model. The model applies broadly to surveillance, entrance control, bulk data filtering, and similar scenarios.
The table below lists the metrics of binary models that decide whether a person appears in an image. The first two rows are models trained with SwinTransformer_tiny and MobileNetV3_large_x1_0 as backbones; rows three to six successively show the effect of switching the backbone to PPLCNet_x1_0, adding SSLD pretrained weights, adding the EDA strategy, and adding the SKL-UGI knowledge distillation strategy.

| Model | Tpr(%) | Latency(ms) | Size(M) | Strategy |
|-------|-----------|----------|---------------|---------------|
| SwinTransformer_tiny | 95.69 | 175.52 | 107 | ImageNet pretrained model |
| MobileNetV3_large_x1_0 | 91.97 | 4.70 | 17 | ImageNet pretrained model |
| PPLCNet_x1_0 | 89.57 | 2.36 | 6.5 | ImageNet pretrained model |
| PPLCNet_x1_0 | 92.10 | 2.36 | 6.5 | SSLD pretrained model |
| PPLCNet_x1_0 | 93.43 | 2.36 | 6.5 | SSLD pretrained model + EDA strategy |
| <b>PPLCNet_x1_0</b> | <b>95.60</b> | <b>2.36</b> | <b>6.5</b> | SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation |

The table shows that SwinTransformer_tiny yields high accuracy but slow inference. Replacing the backbone with the lightweight MobileNetV3_large_x1_0 greatly improves speed at a clear cost in accuracy. Switching the backbone to PPLCNet_x1_0 loses a bit more than two percentage points relative to MobileNetV3_large_x1_0 while roughly doubling the speed. On top of that, SSLD pretrained weights add about 2.6 percentage points without changing inference speed, fusing the EDA strategy adds another 1.3 points, and SKL-UGI knowledge distillation adds a further 2.2 points. At that point PPLCNet_x1_0 matches the accuracy of SwinTransformer_tiny while being 70+ times faster. The PULC training and inference-deployment procedures are detailed below.
**Notes:**
* For an introduction to the `Tpr` metric, see the note in [Section 3.3](#3.3). Latency is measured on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with MKLDNN enabled and 10 threads.
* For an introduction to PPLCNet, see [PPLCNet introduction](../models/PP-LCNet.md) and the [PPLCNet paper](https://arxiv.org/abs/2109.15099).
<a name="2"></a>
## 2. Quick Start
(pip-based, to be added)
<a name="3"></a>
## 3. Model Training, Evaluation and Prediction
<a name="3.1"></a>
### 3.1 Environment Setup
* Installation: please refer to the [Paddle installation guide](../installation/install_paddle.md) and the [PaddleClas installation guide](../installation/install_paddleclas.md) to set up the PaddleClas environment.
<a name="3.2"></a>
### 3.2 Data Preparation
<a name="3.2.1"></a>
#### 3.2.1 Dataset Source
All datasets used in this case are open source: the `train` set is a subset of the training set of [MS-COCO](https://cocodataset.org/#overview), the `val` set is a subset of the training set of [Objects365](https://www.objects365.org/overview.html), and `ImageNet_val` is the validation set of [ImageNet-1k](https://www.image-net.org/).
<a name="3.2.2"></a>
#### 3.2.2 Obtaining the Dataset
The data for this case is derived from the public datasets above, post-processed as follows (a sketch of the positive/negative labeling rule appears after this list):
- Training set: the MS-COCO training annotations are processed so that an image is labeled as containing a person if it has a "person" label whose box covers more than 10% of the image area, and as containing no person if it has no "person" label at all. This yields 92964 usable samples: 39813 with a person and 53151 without.
- Validation set: a small random subset of Objects365 is predicted with a strong model trained on MS-COCO; the predictions are intersected with the annotations, and the intersection is filtered by the same rule as the training set. This yields 27820 usable samples: 2255 with a person and 25565 without.
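As an illustration of the rule above, here is a hedged sketch using `pycocotools` (the annotation path is hypothetical and the 10% area-ratio rule follows the description; this is not the exact script used to build the released dataset):

```python
from pycocotools.coco import COCO

# Hypothetical annotation path; adjust to your local MS-COCO layout.
coco = COCO("annotations/instances_train2017.json")
person_cat = coco.getCatIds(catNms=["person"])

positives, negatives = [], []
for img_id in coco.getImgIds():
    info = coco.loadImgs(img_id)[0]
    img_area = info["height"] * info["width"]
    anns = coco.loadAnns(
        coco.getAnnIds(imgIds=img_id, catIds=person_cat, iscrowd=None))
    # Label the image "someone" only if some person box covers > 10% of it.
    if any(a["bbox"][2] * a["bbox"][3] / img_area > 0.1 for a in anns):
        positives.append(info["file_name"])
    elif not anns:  # no "person" label at all -> "nobody"
        negatives.append(info["file_name"])
print(len(positives), len(negatives))
```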
Some samples of the processed dataset are visualized below:
![](../../images/PULC/docs/person_exists_data_demo.png)
Data processed by the method above is provided for direct download.
Enter the PaddleClas directory.
```
cd path_to_PaddleClas
```
进入 `dataset/` 目录,下载并解压有人/无人场景的数据。
```shell
cd dataset
wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
tar -xf person_exists.tar
cd ../
```
After the commands above finish, the `person_exists` directory exists under `dataset/` and contains the following data:
```
├── train
│   ├── 000000000009.jpg
│   ├── 000000000025.jpg
...
├── val
│   ├── objects365_01780637.jpg
│   ├── objects365_01780640.jpg
...
├── ImageNet_val
│   ├── ILSVRC2012_val_00000001.JPEG
│   ├── ILSVRC2012_val_00000002.JPEG
...
├── train_list.txt
├── train_list.txt.debug
├── train_list_for_distill.txt
├── val_list.txt
└── val_list.txt.debug
```
`train/` and `val/` are the training and validation sets. `train_list.txt` and `val_list.txt` are their label files; `train_list.txt.debug` and `val_list.txt.debug` are `debug` label files, subsets of `train_list.txt` and `val_list.txt` that allow a quick walk-through of this case. `ImageNet_val/` is the ImageNet-1k validation set; mixed with the `train` set, it is used by the `SKL-UGI knowledge distillation strategy` of this case, with the corresponding training label file `train_list_for_distill.txt`.
**Notes:**
* For the format of `train_list.txt` and `val_list.txt`, see [PaddleClas classification dataset format](../data_preparation/classification_dataset.md#1-数据集格式说明) and the illustrative lines below.
* For how to obtain the distillation label file, see [How to obtain knowledge distillation labels](@ruoyu).
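The label files are plain text with one sample per line: the image path and the class id separated by a space. The lines below are illustrative (the label values are made up); in this case, `0` means nobody and `1` means someone:

```
train/000000000009.jpg 1
train/000000000025.jpg 0
```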
<a name="3.3"></a>
### 3.3 Model Training
`ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml` provides the training configuration for this scenario; training can be launched with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml
```
The best validation metric is between `0.94` and `0.95` (the dataset is small, so some fluctuation is expected).
**Note:**
* The metric here is Tpr: the true positive rate under the constraint that the false positive rate (Fpr) stays below a given level. It is a common industrial metric for binary classification; in this case, the Fpr level is 1/1000. For more on Fpr and Tpr, see [here](https://baike.baidu.com/item/AUC/19282953).
* During eval, the best TprAtFpr metric so far is printed, namely the current `Fpr`, `Tpr`, and `threshold` values. `Tpr` is the recall at the current `Fpr`; the higher, the better the model. `threshold` is the classification threshold at the current best `Fpr` and can be reused later for deployment.
<a name="3.4"></a>
### 3.4 Model Evaluation
After training, the model metrics can be evaluated with the following command.
```bash
python3 tools/eval.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
-o Global.pretrained_model="output/DPPLCNet_x1_0/best_model"
```
`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the current best weights; to evaluate other weights, just replace the path.
<a name="3.5"></a>
### 3.5 Model Prediction
After training, the trained weights can be loaded for prediction. A complete example is provided in `tools/infer.py`; prediction only requires the following command:
```bash
python3 tools/infer.py \
    -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
    -o Global.pretrained_model=output/DistillationModel/best_model_student \
    -o Infer.PostProcess.threshold=0.9794
```
The output is as follows:
```
[{'class_ids': [0], 'scores': [0.9878496769815683], 'label_names': ['nobody'], 'file_name': './dataset/person_exists/val/objects365_01780637.jpg'}]
```
**Notes:**
* `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the current best weights; to use other weights, just replace the path.
* By default `deploy/images/PULC/person_exists/objects365_02035329.jpg` is predicted; to predict other images, add the field `-o Infer.infer_imgs=xxx`.
* The value of `Infer.PostProcess.threshold` must be chosen for the actual scenario; `0.9794` here is the threshold of the best Tpr at an Fpr of 1/1000 on the `val` dataset of this scenario.
<a name="4"></a>
## 4. Model Compression
<a name="4.1"></a>
### 4.1 SKL-UGI Knowledge Distillation
SKL-UGI knowledge distillation is a simple yet effective knowledge distillation method proposed by PaddleClas; for an introduction, see [SKL-UGI knowledge distillation](@ruoyu).
<a name="4.1.1"></a>
#### 4.1.1 Teacher Model Training
Reusing the hyperparameters in `ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml`, train the teacher model with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
    -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
-o Arch.name=ResNet101_vd
```
The best validation metric is between `0.96` and `0.98`; the best teacher weights are saved to `output/ResNet101_vd/best_model.pdparams`.
<a name="4.1.2"></a>
#### 4.1.2 Distillation Training
The configuration file `ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml` provides the configuration of the `SKL-UGI knowledge distillation strategy`. It uses `ResNet101_vd` as the teacher model and `PPLCNet_x1_0` as the student model, with the ImageNet validation set as additional unlabeled data. The training script is as follows:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml \
-o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
```
The best validation metric is between `0.95` and `0.97`; the best student weights are saved to `output/DistillationModel/best_model_student.pdparams`.
<a name="5"></a>
## 5. Hyperparameter Search
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` of PaddleClas. To get better results on your own dataset, see the [SHAS hyperparameter search strategy](#TODO) for better training hyperparameters.
**Note:** This section is optional; the search takes considerable time, so decide according to your hardware budget. If the dataset is unchanged, this section can be skipped.
<a name="6"></a>
## 6. Model Inference and Deployment
<a name="6.1"></a>
### 6.1 Preparing the Inference Model
Paddle Inference is the native inference library of PaddlePaddle, targeting servers and the cloud and providing high-performance inference. Compared with predicting directly from the trained model, Paddle Inference can use MKLDNN, CUDNN, and TensorRT to accelerate prediction and achieve better performance. For more on the Paddle Inference engine, see the [Paddle Inference tutorial](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html).
When using Paddle Inference, the loaded model must be in the inference format. This case provides two ways to obtain the inference model; to reproduce the results of this document, please choose [downloading the inference model directly](#6.1.2).
<a name="6.1.1"></a>
#### 6.1.1 Export the inference Model from Trained Weights
A script is provided to convert the trained weights and model; running it produces the corresponding inference model:
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person_exists_infer
```
After this script finishes, a `PPLCNet_x1_0_person_exists_infer` folder is generated under `deploy/models/`; the `models` folder should have the following structure:
```
├── PPLCNet_x1_0_person_exists_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
**Note:** The best weights here are the weights after knowledge distillation; if the distillation step was skipped, the best model is saved at `output/PPLCNet_x1_0/best_model.pdparams`.
<a name="6.1.2"></a>
#### 6.1.2 Download the inference Model Directly
[Section 6.1.1](#6.1.1) exports the inference model from trained weights; a downloadable inference model for this scenario is also provided for direct use.
```
cd deploy/models
# Download the inference model and untar it
wget https://paddleclas.bj.bcebos.com/models/PULC/person_exists_infer.tar && tar -xf person_exists_infer.tar
```
After extraction, the `models` folder should have the following structure:
```
├── person_exists_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
<a name="6.2"></a>
### 6.2 Inference with the Python Prediction Engine
<a name="6.2.1"></a>
#### 6.2.1 Predict a Single Image
Return to the `deploy` directory:
```
cd ../
```
Run the following command to classify the image `./images/PULC/person_exists/objects365_02035329.jpg` as containing a person or not.
```shell
# Predict with the GPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o PostProcess.ThreshOutput.threshold=0.9794
# Predict with the CPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o PostProcess.ThreshOutput.threshold=0.9794 -o Global.use_gpu=False
```
The output is as follows.
```
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
```
**Note:** In real-world applications, one usually wants the best true positive rate (Tpr) subject to the false positive rate (Fpr) staying below a given level. On the `val` dataset of this scenario, the best Tpr at an Fpr of 1/1000 is reached at a threshold of `0.9794`, so `threshold` is set to `0.9794` here. See the note in [Section 3.3](#3.3) for how this threshold is determined.
<a name="6.2.2"></a>
#### 6.2.2 Batch Prediction on a Folder
To predict all images in a folder, you can either modify the `Global.infer_imgs` field in the configuration file or override it with the `-o` option as shown below.
```shell
# Predict with the GPU; to predict with the CPU instead, append -o Global.use_gpu=False to the command
python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o Global.infer_imgs="./images/PULC/person_exists/"
```
The classification results of all images in the folder are printed to the terminal, as shown below.
```
objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
```
Here, `someone` means a person is present in the image and `nobody` means no person is present.
<a name="6.3"></a>
### 6.3 Inference with the C++ Prediction Engine
PaddleClas provides an example of inference with the C++ prediction engine; see [Server-side C++ prediction](../inference_deployment/cpp_deploy.md) for the deployment steps. On Windows, see the [Visual Studio 2019 Community CMake build guide](../inference_deployment/cpp_deploy_on_windows.md) to build the prediction library and run prediction.
<a name="6.4"></a>
### 6.4 Serving Deployment
Paddle Serving provides a high-performance, flexible, easy-to-use industrial-grade online inference service. It supports RESTful, gRPC, bRPC and other protocols, and offers inference solutions on a variety of heterogeneous hardware and operating systems. For more on Paddle Serving, see the [Paddle Serving repository](https://github.com/PaddlePaddle/Serving).
PaddleClas provides an example of serving deployment based on Paddle Serving; see [Model serving deployment](../inference_deployment/paddle_serving_deploy.md) for the deployment steps.
<a name="6.5"></a>
### 6.5 On-device Deployment
Paddle Lite is a high-performance, lightweight, flexible, easily extensible deep learning inference framework targeting multiple hardware platforms, including mobile, embedded, and server. For more on Paddle Lite, see the [Paddle Lite repository](https://github.com/PaddlePaddle/Paddle-Lite).
PaddleClas provides an example of on-device deployment based on Paddle Lite; see [On-device deployment](../inference_deployment/paddle_lite_deploy.md) for the deployment steps.
<a name="6.6"></a>
### 6.6 Paddle2ONNX Model Conversion and Prediction
Paddle2ONNX converts PaddlePaddle model formats to the ONNX format. Through ONNX, Paddle models can be deployed to many inference engines, including TensorRT/OpenVINO/MNN/TNN/NCNN, as well as other inference engines or hardware that support the open ONNX format. For more on Paddle2ONNX, see the [Paddle2ONNX repository](https://github.com/PaddlePaddle/Paddle2ONNX).
PaddleClas provides an example of converting an inference model to ONNX and running inference; see [Paddle2ONNX model conversion and prediction](@shuilong) for the deployment steps.
# PULC Traffic Sign Classification Model
------
## Contents
- [1. Introduction to the Model and Application Scenario](#1)
- [2. Quick Start](#2)
- [3. Model Training, Evaluation and Prediction](#3)
  - [3.1 Environment Setup](#3.1)
  - [3.2 Data Preparation](#3.2)
    - [3.2.1 Dataset Source](#3.2.1)
    - [3.2.2 Obtaining the Dataset](#3.2.2)
  - [3.3 Model Training](#3.3)
  - [3.4 Model Evaluation](#3.4)
  - [3.5 Model Prediction](#3.5)
- [4. Model Compression](#4)
  - [4.1 SKL-UGI Knowledge Distillation](#4.1)
    - [4.1.1 Teacher Model Training](#4.1.1)
    - [4.1.2 Distillation Training](#4.1.2)
- [5. Hyperparameter Search](#5)
- [6. Model Inference and Deployment](#6)
  - [6.1 Preparing the Inference Model](#6.1)
    - [6.1.1 Export the inference Model from Trained Weights](#6.1.1)
    - [6.1.2 Download the inference Model Directly](#6.1.2)
  - [6.2 Inference with the Python Prediction Engine](#6.2)
    - [6.2.1 Predict a Single Image](#6.2.1)
    - [6.2.2 Batch Prediction on a Folder](#6.2.2)
  - [6.3 Inference with the C++ Prediction Engine](#6.3)
  - [6.4 Serving Deployment](#6.4)
  - [6.5 On-device Deployment](#6.5)
  - [6.6 Paddle2ONNX Model Conversion and Prediction](#6.6)
<a name="1"></a>
## 1. Introduction to the Model and Application Scenario
This case shows how to use PULC (Practical Ultra Lightweight Classification), the ultra lightweight image classification solution of PaddleClas, to quickly build a lightweight, high-accuracy, deployable traffic sign classification model. The model applies broadly to autonomous driving, road monitoring, and similar scenarios.
The table below lists the metrics of different traffic sign classification models. The first two rows are models trained with SwinTransformer_tiny and MobileNetV3_large_x1_0 as backbones; rows three to six successively show the effect of switching the backbone to PPLCNet_x1_0, adding SSLD pretrained weights, adding the EDA strategy, and adding the SKL-UGI knowledge distillation strategy.

| Model | Top-1 Acc(%) | Latency(ms) | Size(M) | Strategy |
|-------|-----------|----------|---------------|---------------|
| SwinTransformer_tiny | 98.11 | 87.19 | 111 | ImageNet pretrained model |
| MobileNetV3_large_x1_0 | 97.79 | 5.59 | 23 | ImageNet pretrained model |
| PPLCNet_x1_0 | 97.78 | 2.67 | 8.2 | ImageNet pretrained model |
| PPLCNet_x1_0 | 97.84 | 2.67 | 8.2 | SSLD pretrained model |
| PPLCNet_x1_0 | 98.14 | 2.67 | 8.2 | SSLD pretrained model + EDA strategy |
| <b>PPLCNet_x1_0</b> | <b>98.35</b> | <b>2.67</b> | <b>8.2</b> | SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation |

The table shows that SwinTransformer_tiny yields high accuracy but slow inference. Replacing the backbone with the lightweight MobileNetV3_large_x1_0 greatly improves speed at a clear cost in accuracy. Switching the backbone to PPLCNet_x1_0 loses only 0.01% accuracy while roughly doubling the speed. On top of that, SSLD pretrained weights add about 0.06% without changing inference speed, fusing the EDA strategy adds another 0.3%, and SKL-UGI knowledge distillation adds a further 0.21%. At that point PPLCNet_x1_0 surpasses the accuracy of SwinTransformer_tiny while being 32 times faster. The PULC training and inference-deployment procedures are detailed below.
**Note:**
* For an introduction to PPLCNet, see [PPLCNet introduction](../models/PP-LCNet.md) and the [PPLCNet paper](https://arxiv.org/abs/2109.15099).
<a name="2"></a>
## 2. Quick Start
(pip-based, to be added)
<a name="3"></a>
## 3. Model Training, Evaluation and Prediction
<a name="3.1"></a>
### 3.1 Environment Setup
* Installation: please refer to the [Paddle installation guide](../installation/install_paddle.md) and the [PaddleClas installation guide](../installation/install_paddleclas.md) to set up the PaddleClas environment.
<a name="3.2"></a>
### 3.2 Data Preparation
<a name="3.2.1"></a>
#### 3.2.1 Dataset Source
The data used in this case is the [Tsinghua-Tencent 100K dataset (CC-BY-NC license)](https://cg.cs.tsinghua.edu.cn/traffic-sign/), hereafter `TT100K`. The traffic sign detection boxes are randomly expanded and then cropped to produce the images used for training and testing.
<a name="3.2.2"></a>
#### 3.2.2 Obtaining the Dataset
On the TT100K dataset, the traffic sign detection boxes are randomly expanded and then cropped to produce the training and test images. The logic of the random box expansion is shown below.
```python
import random

def get_random_crop_box(xmin, ymin, xmax, ymax, img_height, img_width, ratio=1.0):
    # Box height and width.
    h = ymax - ymin
    w = xmax - xmin  # fixed: the original snippet computed w from the y coordinates
    # Randomly expand each side by up to `ratio` times the box size,
    # clipped so the expanded box stays inside the image
    # (the right/bottom margins below use xmax/ymax; the snippet used xmin/ymin).
    xmin_diff = random.random() * ratio * min(w, xmin / ratio)
    ymin_diff = random.random() * ratio * min(h, ymin / ratio)
    xmax_diff = random.random() * ratio * min(w, (img_width - xmax - 1) / ratio)
    ymax_diff = random.random() * ratio * min(h, (img_height - ymax - 1) / ratio)
    new_xmin = round(xmin - xmin_diff)
    new_ymin = round(ymin - ymin_diff)
    new_xmax = round(xmax + xmax_diff)
    new_ymax = round(ymax + ymax_diff)
    return new_xmin, new_ymin, new_xmax, new_ymax
```
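For illustration, a hedged sketch of applying this expansion to one annotation and cropping the patch with Pillow (the file name and box coordinates are made up):

```python
import random
from PIL import Image

# Assume get_random_crop_box is defined as above.
img = Image.open("10056.jpg")                  # hypothetical TT100K image
xmin, ymin, xmax, ymax = 782, 211, 856, 282    # hypothetical sign box
box = get_random_crop_box(xmin, ymin, xmax, ymax,
                          img_height=img.height, img_width=img.width)
patch = img.crop(box)                          # (left, upper, right, lower)
patch.save("sign_crop.jpg")
```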
The complete preprocessing logic can be found in the `deal.py` file inside the downloaded dataset folder.
Some samples of the processed dataset are visualized below.
<div align="center">
<img src="../../images/PULC/docs/traffic_sign_data_demo.png" width = "500" />
</div>
Data processed by the method above is provided for direct download.
Enter the PaddleClas directory.
```
cd path_to_PaddleClas
```
Enter the `dataset/` directory, then download and extract the traffic sign classification data.
```shell
cd dataset
wget https://paddleclas.bj.bcebos.com/data/PULC/traffic_sign.tar
tar -xf traffic_sign.tar
cd ../
```
After the commands above finish, the `traffic_sign` directory exists under `dataset/` and contains the following data:
```
traffic_sign
├── train
│ ├── 0_62627.jpg
│ ├── 100000_89031.jpg
│ ├── 100001_89031.jpg
...
├── test
│ ├── 100423_2315.jpg
│ ├── 100424_2315.jpg
│ ├── 100425_2315.jpg
...
├── other
│ ├── 100603_3422.jpg
│ ├── 100604_3422.jpg
...
├── label_list_train.txt
├── label_list_test.txt
├── label_list_other.txt
├── label_list_train_for_distillation.txt
├── label_list_train.txt.debug
├── label_list_test.txt.debug
├── label_name_id.txt
├── deal.py
```
`train/` and `test/` are the training and validation sets. `label_list_train.txt` and `label_list_test.txt` are their label files; `label_list_train.txt.debug` and `label_list_test.txt.debug` are `debug` label files, subsets of `label_list_train.txt` and `label_list_test.txt` that allow a quick walk-through of this case. The mixture of `train` and `other` is used by the `SKL-UGI knowledge distillation strategy` of this case, with the corresponding training label file `label_list_train_for_distillation.txt`.
**Notes:**
* For the format of `label_list_train.txt` and `label_list_test.txt`, see [PaddleClas classification dataset format](../data_preparation/classification_dataset.md#1-数据集格式说明).
* For how to obtain the distillation label file, see [How to obtain knowledge distillation labels](@ruoyu).
<a name="3.3"></a>
### 3.3 Model Training
`ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml` provides the training configuration for this scenario; training can be launched with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml
```
The best validation metric is around `98.14%` (the dataset is small; fluctuation of about 0.1% is normal).
<a name="3.4"></a>
### 3.4 Model Evaluation
After training, the model metrics can be evaluated with the following command.
```bash
python3 tools/eval.py \
-c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
```
`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the current best weights; to evaluate other weights, just replace the path.
<a name="3.5"></a>
### 3.5 Model Prediction
After training, the trained weights can be loaded for prediction. A complete example is provided in `tools/infer.py`; prediction only requires the following command:
```bash
python3 tools/infer.py \
-c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model
```
The output is as follows:
```
99603_17806.jpg: class id(s): [216, 145, 49, 207, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pl25', 'pm15']
```
**Notes:**
* `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the current best weights; to use other weights, just replace the path.
* By default `deploy/images/PULC/traffic_sign/99603_17806.jpg` is predicted; to predict other images, add the field `-o Infer.infer_imgs=xxx`.
<a name="4"></a>
## 4. Model Compression
<a name="4.1"></a>
### 4.1 SKL-UGI Knowledge Distillation
SKL-UGI knowledge distillation is a simple yet effective knowledge distillation method proposed by PaddleClas; for an introduction, see [SKL-UGI knowledge distillation](@ruoyu).
<a name="4.1.1"></a>
#### 4.1.1 Teacher Model Training
Reusing the hyperparameters in `ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml`, train the teacher model with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
-o Arch.name=ResNet101_vd
```
The best validation metric is around `98.59%`; the best teacher weights are saved to `output/ResNet101_vd/best_model.pdparams`.
<a name="4.1.2"></a>
#### 4.1.2 Distillation Training
The configuration file `ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_distillation.yaml` provides the configuration of the `SKL-UGI knowledge distillation strategy`. It uses `ResNet101_vd` as the teacher model and `PPLCNet_x1_0` as the student model, with the ImageNet validation set as additional unlabeled data. The training script is as follows:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_distillation.yaml \
-o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
```
The best validation metric is around `98.35%`; the best student weights are saved to `output/DistillationModel/best_model_student.pdparams`.
<a name="5"></a>
## 5. Hyperparameter Search
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` of PaddleClas. To get better results on your own dataset, see the [SHAS hyperparameter search strategy](#TODO) for better training hyperparameters.
**Note:** This section is optional; the search takes considerable time, so decide according to your hardware budget. If the dataset is unchanged, this section can be skipped.
<a name="6"></a>
## 6. Model Inference and Deployment
<a name="6.1"></a>
### 6.1 Preparing the Inference Model
Paddle Inference is the native inference library of PaddlePaddle, targeting servers and the cloud and providing high-performance inference. Compared with predicting directly from the trained model, Paddle Inference can use MKLDNN, CUDNN, and TensorRT to accelerate prediction and achieve better performance. For more on the Paddle Inference engine, see the [Paddle Inference tutorial](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html).
When using Paddle Inference, the loaded model must be in the inference format. This case provides two ways to obtain the inference model; to reproduce the results of this document, please choose [downloading the inference model directly](#6.1.2).
<a name="6.1.1"></a>
#### 6.1.1 Export the inference Model from Trained Weights
A script is provided to convert the trained weights and model; running it produces the corresponding inference model:
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_traffic_sign_infer
```
After this script finishes, a `PPLCNet_x1_0_traffic_sign_infer` folder is generated under `deploy/models/`; the `models` folder should have the following structure:
```
├── PPLCNet_x1_0_traffic_sign_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
**Note:** The best weights here are the weights after knowledge distillation; if the distillation step was skipped, the best model is saved at `output/PPLCNet_x1_0/best_model.pdparams`.
<a name="6.1.2"></a>
#### 6.1.2 Download the inference Model Directly
[Section 6.1.1](#6.1.1) exports the inference model from trained weights; a downloadable inference model for this scenario is also provided for direct use.
```
cd deploy/models
# Download the inference model and untar it
wget https://paddleclas.bj.bcebos.com/models/PULC/traffic_sign_infer.tar && tar -xf traffic_sign_infer.tar
```
After extraction, the `models` folder should have the following structure:
```
├── traffic_sign_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
<a name="6.2"></a>
### 6.2 Inference with the Python Prediction Engine
<a name="6.2.1"></a>
#### 6.2.1 Predict a Single Image
Return to the `deploy` directory:
```
cd ../
```
Run the following command to classify the traffic sign in the image `./images/PULC/traffic_sign/99603_17806.jpg`.
```shell
# Predict with the GPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml
# Predict with the CPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml -o Global.use_gpu=False
```
The output is as follows.
```
99603_17806.jpg: class id(s): [216, 145, 49, 207, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pl25', 'pm15']
```
<a name="6.2.2"></a>
#### 6.2.2 Batch Prediction on a Folder
To predict all images in a folder, you can either modify the `Global.infer_imgs` field in the configuration file or override it with the `-o` option as shown below.
```shell
# Predict with the GPU; to predict with the CPU instead, append -o Global.use_gpu=False to the command
python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml -o Global.infer_imgs="./images/PULC/traffic_sign/"
```
The classification results of all images in the folder are printed to the terminal, as shown below.
```
100999_83928.jpg: class id(s): [182, 179, 162, 128, 24], score(s): [0.99, 0.01, 0.00, 0.00, 0.00], label_name(s): ['pl110', 'pl100', 'pl120', 'p26', 'pm10']
99603_17806.jpg: class id(s): [216, 145, 49, 24, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pm10', 'pm15']
```
The images corresponding to the output `label_name`s can be looked up in the `dataset/traffic_sign/report.pdf` file.
<a name="6.3"></a>
### 6.3 Inference with the C++ Prediction Engine
PaddleClas provides an example of inference with the C++ prediction engine; see [Server-side C++ prediction](../inference_deployment/cpp_deploy.md) for the deployment steps. On Windows, see the [Visual Studio 2019 Community CMake build guide](../inference_deployment/cpp_deploy_on_windows.md) to build the prediction library and run prediction.
<a name="6.4"></a>
### 6.4 Serving Deployment
Paddle Serving provides a high-performance, flexible, easy-to-use industrial-grade online inference service. It supports RESTful, gRPC, bRPC and other protocols, and offers inference solutions on a variety of heterogeneous hardware and operating systems. For more on Paddle Serving, see the [Paddle Serving repository](https://github.com/PaddlePaddle/Serving).
PaddleClas provides an example of serving deployment based on Paddle Serving; see [Model serving deployment](../inference_deployment/paddle_serving_deploy.md) for the deployment steps.
<a name="6.5"></a>
### 6.5 On-device Deployment
Paddle Lite is a high-performance, lightweight, flexible, easily extensible deep learning inference framework targeting multiple hardware platforms, including mobile, embedded, and server. For more on Paddle Lite, see the [Paddle Lite repository](https://github.com/PaddlePaddle/Paddle-Lite).
PaddleClas provides an example of on-device deployment based on Paddle Lite; see [On-device deployment](../inference_deployment/paddle_lite_deploy.md) for the deployment steps.
<a name="6.6"></a>
### 6.6 Paddle2ONNX Model Conversion and Prediction
Paddle2ONNX converts PaddlePaddle model formats to the ONNX format. Through ONNX, Paddle models can be deployed to many inference engines, including TensorRT/OpenVINO/MNN/TNN/NCNN, as well as other inference engines or hardware that support the open ONNX format. For more on Paddle2ONNX, see the [Paddle2ONNX repository](https://github.com/PaddlePaddle/Paddle2ONNX).
PaddleClas provides an example of converting an inference model to ONNX and running inference; see [Paddle2ONNX model conversion and prediction](@shuilong) for the deployment steps.
# PULC Vehicle Attribute Recognition Model
------
## Contents
- [1. Introduction to the Model and Application Scenario](#1)
- [2. Quick Start](#2)
- [3. Model Training, Evaluation and Prediction](#3)
  - [3.1 Environment Setup](#3.1)
  - [3.2 Data Preparation](#3.2)
    - [3.2.1 Dataset Source](#3.2.1)
    - [3.2.2 Obtaining the Dataset](#3.2.2)
  - [3.3 Model Training](#3.3)
  - [3.4 Model Evaluation](#3.4)
  - [3.5 Model Prediction](#3.5)
- [4. Model Compression](#4)
  - [4.1 SKL-UGI Knowledge Distillation](#4.1)
    - [4.1.1 Teacher Model Training](#4.1.1)
    - [4.1.2 Distillation Training](#4.1.2)
- [5. Hyperparameter Search](#5)
- [6. Model Inference and Deployment](#6)
  - [6.1 Preparing the Inference Model](#6.1)
    - [6.1.1 Export the inference Model from Trained Weights](#6.1.1)
    - [6.1.2 Download the inference Model Directly](#6.1.2)
  - [6.2 Inference with the Python Prediction Engine](#6.2)
    - [6.2.1 Predict a Single Image](#6.2.1)
    - [6.2.2 Batch Prediction on a Folder](#6.2.2)
  - [6.3 Inference with the C++ Prediction Engine](#6.3)
  - [6.4 Serving Deployment](#6.4)
  - [6.5 On-device Deployment](#6.5)
  - [6.6 Paddle2ONNX Model Conversion and Prediction](#6.6)
<a name="1"></a>
## 1. Introduction to the Model and Application Scenario
This case shows how to use PULC (Practical Ultra Lightweight Classification), the ultra lightweight image classification solution of PaddleClas, to quickly build a lightweight, high-accuracy, deployable vehicle attribute recognition model. The model applies broadly to vehicle identification, road monitoring, and similar scenarios.
The table below lists the metrics of different vehicle attribute recognition models. The first two rows are models trained with Res2Net200_vd_26w_4s and MobileNetV3_large_x1_0 as backbones; rows three to six successively show the effect of switching the backbone to PPLCNet_x1_0, adding SSLD pretrained weights, adding the EDA strategy, and adding the SKL-UGI knowledge distillation strategy.

| Model | ma(%) | Latency(ms) | Size(M) | Strategy |
|-------|-----------|----------|---------------|---------------|
| Res2Net200_vd_26w_4s | 91.36 | 66.58 | 293 | ImageNet pretrained model |
| ResNet50 | 89.98 | 12.74 | 92 | ImageNet pretrained model |
| MobileNetV3_large_x1_0 | 89.77 | 5.59 | 23 | ImageNet pretrained model |
| PPLCNet_x1_0 | 89.57 | 2.56 | 8.2 | ImageNet pretrained model |
| PPLCNet_x1_0 | 90.07 | 2.56 | 8.2 | SSLD pretrained model |
| PPLCNet_x1_0 | 90.59 | 2.56 | 8.2 | SSLD pretrained model + EDA strategy |
| <b>PPLCNet_x1_0</b> | <b>90.81</b> | <b>2.56</b> | <b>8.2</b> | SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation |

The table shows that Res2Net200_vd_26w_4s yields high accuracy but slow inference. Replacing the backbone with the lightweight MobileNetV3_large_x1_0 greatly improves speed at a clear cost in accuracy. Switching the backbone to PPLCNet_x1_0 loses 0.2% accuracy while roughly doubling the speed. On top of that, SSLD pretrained weights add about 0.5% without changing inference speed, fusing the EDA strategy adds another 0.52%, and SKL-UGI knowledge distillation adds a further 0.23%. At that point PPLCNet_x1_0 is only 0.55% behind Res2Net200_vd_26w_4s while being 26 times faster. The PULC training and inference-deployment procedures are detailed below.
**Note:**
* For an introduction to PPLCNet, see [PPLCNet introduction](../models/PP-LCNet.md) and the [PPLCNet paper](https://arxiv.org/abs/2109.15099).
<a name="2"></a>
## 2. Quick Start
```
(pip-based, to be added)
```
<a name="3"></a>
## 3. Model Training, Evaluation and Prediction
<a name="3.1"></a>
### 3.1 Environment Setup
* Installation: please refer to the [Paddle installation guide](../installation/install_paddle.md) and the [PaddleClas installation guide](../installation/install_paddleclas.md) to set up the PaddleClas environment.
<a name="3.2"></a>
### 3.2 Data Preparation
<a name="3.2.1"></a>
#### 3.2.1 Dataset Source
The data used in this case is the [VeRi dataset](https://www.v7labs.com/open-datasets/veri-dataset).
<a name="3.2.2"></a>
#### 3.2.2 Obtaining the Dataset
Some samples of the data are visualized below.
<div align="center">
<img src="../../images/PULC/docs/vehicle_attr_data_demo.png" width = "500" />
</div>
First, apply for and download the data from the [VeRi dataset website](https://www.v7labs.com/open-datasets/veri-dataset), place it under the `dataset` directory of PaddleClas with the directory name `VeRi`, and enter that folder with the following command.
```shell
cd PaddleClas/dataset/VeRi/
```
Then use the following code to convert the labels (run the commands in a Python shell, or save them to a file and run it with `python3 convert.py`).
```python
import os
from xml.dom.minidom import parse

vehicleids = []

def convert_annotation(input_fp, output_fp):
    in_file = open(input_fp)
    list_file = open(output_fp, 'w')
    tree = parse(in_file)
    root = tree.documentElement
    for item in root.getElementsByTagName("Item"):
        # 19-dim multi-hot label: indices 0-9 encode the color, 10-18 the type.
        label = ['0'] * 19
        if item.hasAttribute("imageName"):
            name = item.getAttribute("imageName")
        if item.hasAttribute("vehicleID"):
            vehicleid = item.getAttribute("vehicleID")
            if vehicleid not in vehicleids:
                vehicleids.append(vehicleid)
            vid = vehicleids.index(vehicleid)  # collected but not written to the label
        if item.hasAttribute("colorID"):
            colorid = int(item.getAttribute("colorID"))
            label[colorid - 1] = '1'
        if item.hasAttribute("typeID"):
            typeid = int(item.getAttribute("typeID"))
            label[typeid + 9] = '1'
        label = ','.join(label)
        list_file.write(os.path.join('image_train', name) + "\t" + label + "\n")
    list_file.close()

convert_annotation('train_label.xml', 'train_list.txt')  # image name, vehicle id, color id, type id
convert_annotation('test_label.xml', 'test_list.txt')
After running the commands above, the `VeRi` directory contains the following data:
```
VeRi
├── image_train
│ ├── 0001_c001_00016450_0.jpg
│ ├── 0001_c001_00016460_0.jpg
│ ├── 0001_c001_00016470_0.jpg
...
├── image_test
│ ├── 0002_c002_00030600_0.jpg
│ ├── 0002_c002_00030605_1.jpg
│ ├── 0002_c002_00030615_1.jpg
...
...
├── train_list.txt
├── test_list.txt
├── train_label.xml
├── test_label.xml
```
其中`train/``test/`分别为训练集和验证集。`train_list.txt``test_list.txt`分别为训练集和验证集的转换后用于训练的标签文件。
<a name="3.3"></a>
### 3.3 Model Training
`ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0.yaml` provides the training configuration for this scenario; training can be launched with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0.yaml
```
The best validation metric is around `90.07%` (the dataset is small; fluctuation of about 0.3% is normal).
<a name="3.4"></a>
### 3.4 Model Evaluation
After training, the model metrics can be evaluated with the following command.
```bash
python3 tools/eval.py \
-c ./ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0.yaml \
-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
```
`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the current best weights; to evaluate other weights, just replace the path.
<a name="3.5"></a>
### 3.5 Model Prediction
After training, the trained weights can be loaded for prediction. A complete example is provided in `tools/infer.py`; prediction only requires the following command:
```bash
python3 tools/infer.py \
-c ./ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model
```
The output is as follows:
```
[{'attr': 'Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.9734100103378296)', 'pred': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 'file_name': './deploy/images/PULC/vehicle_attr/0002_c002_00030670_0.jpg'}]
```
**Notes:**
* `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the current best weights; to use other weights, just replace the path.
* By default `./deploy/images/PULC/vehicle_attr/0002_c002_00030670_0.jpg` is predicted; to predict other images, add the field `-o Infer.infer_imgs=xxx`.
<a name="4"></a>
## 4. Model Compression
<a name="4.1"></a>
### 4.1 SKL-UGI Knowledge Distillation
SKL-UGI knowledge distillation is a simple yet effective knowledge distillation method proposed by PaddleClas; for an introduction, see [SKL-UGI knowledge distillation](@ruoyu).
<a name="4.1.1"></a>
#### 4.1.1 Teacher Model Training
Reusing the hyperparameters in `ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0.yaml`, train the teacher model with the following script:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0.yaml \
-o Arch.name=ResNet101_vd
```
The best validation metric is around `91.60%`; the best teacher weights are saved to `output/ResNet101_vd/best_model.pdparams`.
<a name="4.1.2"></a>
#### 4.1.2 Distillation Training
The configuration file `ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0_distillation.yaml` provides the configuration of the `SKL-UGI knowledge distillation strategy`. It uses `ResNet101_vd` as the teacher model and `PPLCNet_x1_0` as the student model. The training script is as follows:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0_distillation.yaml \
-o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
```
The best validation metric is around `90.81%`; the best student weights are saved to `output/DistillationModel/best_model_student.pdparams`.
<a name="5"></a>
## 5. Hyperparameter Search
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` of PaddleClas. To get better results on your own dataset, see the [SHAS hyperparameter search strategy](#TODO) for better training hyperparameters.
**Note:** This section is optional; the search takes considerable time, so decide according to your hardware budget. If the dataset is unchanged, this section can be skipped.
<a name="6"></a>
## 6. Model Inference and Deployment
<a name="6.1"></a>
### 6.1 Preparing the Inference Model
Paddle Inference is the native inference library of PaddlePaddle, targeting servers and the cloud and providing high-performance inference. Compared with predicting directly from the trained model, Paddle Inference can use MKLDNN, CUDNN, and TensorRT to accelerate prediction and achieve better performance. For more on the Paddle Inference engine, see the [Paddle Inference tutorial](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html).
When using Paddle Inference, the loaded model must be in the inference format. This case provides two ways to obtain the inference model; to reproduce the results of this document, please choose [downloading the inference model directly](#6.1.2).
<a name="6.1.1"></a>
#### 6.1.1 Export the inference Model from Trained Weights
A script is provided to convert the trained weights and model; running it produces the corresponding inference model:
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_vehicle_attr_infer
```
After this script finishes, a `PPLCNet_x1_0_vehicle_attr_infer` folder is generated under `deploy/models/`; the `models` folder should have the following structure:
```
├── PPLCNet_x1_0_vehicle_attr_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
**Note:** The best weights here are the weights after knowledge distillation; if the distillation step was skipped, the best model is saved at `output/PPLCNet_x1_0/best_model.pdparams`.
<a name="6.1.2"></a>
#### 6.1.2 Download the inference Model Directly
[Section 6.1.1](#6.1.1) exports the inference model from trained weights; a downloadable inference model for this scenario is also provided for direct use.
```
cd deploy/models
# Download the inference model and untar it
wget https://paddleclas.bj.bcebos.com/models/PULC/vehicle_attr_infer.tar && tar -xf vehicle_attr_infer.tar
```
After extraction, the `models` folder should have the following structure:
```
├── vehicle_attr_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
<a name="6.2"></a>
### 6.2 Inference with the Python Prediction Engine
<a name="6.2.1"></a>
#### 6.2.1 Predict a Single Image
Return to the `deploy` directory:
```
cd ../
```
Run the following command to recognize the vehicle attributes in the image `./images/PULC/vehicle_attr/0002_c002_00030670_0.jpg`.
```shell
# Predict with the GPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/vehicle_attr/inference_vehicle_attr.yaml -o Global.use_gpu=True
# Predict with the CPU using the following command
python3.7 python/predict_cls.py -c configs/PULC/vehicle_attr/inference_vehicle_attr.yaml -o Global.use_gpu=False
```
The output is as follows.
```
0002_c002_00030670_0.jpg: attributes: Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.97340989112854),
predict output: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```
<a name="6.2.2"></a>
#### 6.2.2 Batch Prediction on a Folder
To predict all images in a folder, you can either modify the `Global.infer_imgs` field in the configuration file or override it with the `-o` option as shown below.
```shell
# Predict with the GPU; to predict with the CPU instead, append -o Global.use_gpu=False to the command
python3.7 python/predict_cls.py -c configs/PULC/vehicle_attr/inference_vehicle_attr.yaml -o Global.infer_imgs="./images/PULC/vehicle_attr/"
```
The attribute recognition results of all images in the folder are printed to the terminal, as shown below.
```
0002_c002_00030670_0.jpg: attributes: Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.97340989112854),
predict output: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
0014_c012_00040750_0.jpg: attributes: Color: (red, prob: 0.9998721480369568), Type: (sedan, prob: 0.999976634979248),
predict output: [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```
<a name="6.3"></a>
### 6.3 Inference with the C++ Prediction Engine
PaddleClas provides an example of inference with the C++ prediction engine; see [Server-side C++ prediction](../inference_deployment/cpp_deploy.md) for the deployment steps. On Windows, see the [Visual Studio 2019 Community CMake build guide](../inference_deployment/cpp_deploy_on_windows.md) to build the prediction library and run prediction.
<a name="6.4"></a>
### 6.4 Serving Deployment
Paddle Serving provides a high-performance, flexible, easy-to-use industrial-grade online inference service. It supports RESTful, gRPC, bRPC and other protocols, and offers inference solutions on a variety of heterogeneous hardware and operating systems. For more on Paddle Serving, see the [Paddle Serving repository](https://github.com/PaddlePaddle/Serving).
PaddleClas provides an example of serving deployment based on Paddle Serving; see [Model serving deployment](../inference_deployment/paddle_serving_deploy.md) for the deployment steps.
<a name="6.5"></a>
### 6.5 On-device Deployment
Paddle Lite is a high-performance, lightweight, flexible, easily extensible deep learning inference framework targeting multiple hardware platforms, including mobile, embedded, and server. For more on Paddle Lite, see the [Paddle Lite repository](https://github.com/PaddlePaddle/Paddle-Lite).
PaddleClas provides an example of on-device deployment based on Paddle Lite; see [On-device deployment](../inference_deployment/paddle_lite_deploy.md) for the deployment steps.
<a name="6.6"></a>
### 6.6 Paddle2ONNX Model Conversion and Prediction
Paddle2ONNX converts PaddlePaddle model formats to the ONNX format. Through ONNX, Paddle models can be deployed to many inference engines, including TensorRT/OpenVINO/MNN/TNN/NCNN, as well as other inference engines or hardware that support the open ONNX format. For more on Paddle2ONNX, see the [Paddle2ONNX repository](https://github.com/PaddlePaddle/Paddle2ONNX).
PaddleClas provides an example of converting an inference model to ONNX and running inference; see [Paddle2ONNX model conversion and prediction](@shuilong) for the deployment steps.
......@@ -133,6 +133,8 @@ The accuracy and speed of the PP-LCNet series models are shown in the table below; for more about this
**: Based on the Intel-Xeon-Gold-6271C hardware platform and the OpenVINO 2021.4.2 inference platform.
<a name="PPHGNet"></a>
## PP-HGNet Series
The accuracy and speed of the PP-HGNet series models are shown in the table below; for more about this series, see the [PP-HGNet series model docs](../models/PP-HGNet.md).
......@@ -140,7 +142,10 @@ The accuracy and speed of the PP-HGNet series models are shown in the table below; for more about this
| Model | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | time(ms)<br/>bs=8 | FLOPs(G) | Params(M) | Pretrained model download | Inference model download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PPHGNet_tiny | 0.7983 | 0.9504 | 1.77 | - | - | 4.54 | 14.75 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_pretrained.pdparams) | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_infer.tar) |
| PPHGNet_tiny_ssld | 0.8195 | 0.9612 | 1.77 | - | - | 4.54 | 14.75 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_ssld_pretrained.pdparams) | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_ssld_infer.tar) |
| PPHGNet_small | 0.8151 | 0.9582 | 2.52 | - | - | 8.53 | 24.38 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_pretrained.pdparams) | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_infer.tar) |
| PPHGNet_small_ssld | 0.8382 | 0.9681 | 2.52 | - | - | 8.53 | 24.38 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_ssld_pretrained.pdparams) | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_ssld_infer.tar) |
| PPHGNet_base_ssld | 0.8500 | 0.9735 | 5.97 | - | - | 25.14 | 71.62 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_base_ssld_pretrained.pdparams) | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_base_ssld_infer.tar) |
<a name="ResNet"></a>
......
......@@ -46,6 +46,11 @@ A comparison of PP-HGNet with other models is shown below; the test machine is an NVIDIA® Tesla®
| SwinTransformer_tiny | 81.2 | 95.5 | 6.59 |
| <b>PPHGNet_small</b> | <b>81.51</b> | <b>95.82</b> | <b>2.52</b> |
| <b>PPHGNet_small_ssld</b> | <b>83.82</b> | <b>96.81</b> | <b>2.52</b> |
| Res2Net200_vd_26w_4s_ssld| 85.13 | 97.42 | 11.45 |
| ResNeXt101_32x48d_wsl | 85.37 | 97.69 | 55.07 |
| SwinTransformer_base | 85.2 | 97.5 | 13.53 |
| <b>PPHGNet_base_ssld</b> | <b>85.00</b>| <b>97.35</b> | <b>5.97</b> |
More about PP-HGNet and its performance on downstream tasks is coming soon.
# Distributed Training
## 1. Introduction
* Distributed training splits a training job across multiple compute nodes, then aggregates and applies the gradients computed on the splits. PaddlePaddle's distributed training technology comes from Baidu's production practice and has been validated on ultra-large-scale workloads in NLP, computer vision, search, and recommendation. High-performance distributed training is one of PaddlePaddle's core strengths; on tasks such as image classification it achieves a near-linear speedup. Image classification jobs often involve large amounts of training data; ImageNet22k, for example, contains 14 million images, which would make single-GPU training very slow. PaddleClas therefore uses the distributed training API to run training jobs, supporting both single-machine and multi-machine training. For more on distributed training methods, see the [distributed training quick start tutorial](https://fleet-x.readthedocs.io/en/latest/paddle_fleet_rst/parameter_server/ps_quick_start.html).
## 2. Usage
### 2.1 Single-machine Training
* Taking recognition as an example, once the data is ready locally, launch the training job with the `paddle.distributed.launch` interface. An example run is shown below.
```shell
python3 -m paddle.distributed.launch \
--log_dir=./log/ \
--gpus "0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/ImageNet/ResNet/ResNet50.yaml
```
### 2.2 Multi-machine Training
* Compared with single-machine training, multi-machine training only needs the extra `--ips` argument: a comma-separated list of the IPs of the machines participating in distributed training. An example run is shown below.
```shell
ip_list="192.168.0.1,192.168.0.2"
python3 -m paddle.distributed.launch \
--log_dir=./log/ \
--ips="${ip_list}" \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/ImageNet/ResNet/ResNet50.yaml
```
**Notes:**
* The IPs of the machines are comma-separated and can be found with `ifconfig` or `ipconfig`.
* Passwordless SSH must be set up between the machines, and they must be able to ping each other directly; otherwise communication fails.
* The code, data, and launch command or script must be identical across machines, and the configured training command or script must be run on all of them. The first device of the first machine in `ip_list` becomes trainer0, and so on.
* The starting port may differ across machines; before launching a multi-machine job, it is recommended to set the same starting port on every machine via `export FLAGS_START_PORT=17000`, with a port value preferably between `10000` and `20000`.
## 3. Performance
* On 4 machines with 8 V100 GPUs each, models were trained with the [SSLD knowledge distillation strategy](../advanced_tutorials/knowledge_distillation.md) (5 million images); the training time and multi-machine speedup of different models are shown below.
| Model | Accuracy | Time (1 machine × 8 GPUs) | Time (4 machines × 8 GPUs) | Speedup |
|:---------:|:--------:|:--------:|:--------:|:------:|
| PPHGNet-base_ssld | 85.00% | 15.74d | 4.86d | **3.23** |
| PPLCNetv2-base_ssld | 80.10% | 6.4d | 1.67d | **3.83** |
| PPLCNet_x0_25_ssld | 53.43% | 6.2d | 1.78d | **3.48** |
......@@ -154,7 +154,8 @@ class MobileNetV3(TheseusLayer):
class_expand=LAST_CONV,
dropout_prob=0.2,
return_patterns=None,
return_stages=None):
return_stages=None,
**kwargs):
super().__init__()
self.cfg = config
......
......@@ -27,7 +27,8 @@ MODEL_URLS = {
"PPHGNet_tiny":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_pretrained.pdparams",
"PPHGNet_small":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_pretrained.pdparams"
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_pretrained.pdparams",
"PPHGNet_base": ""
}
__all__ = list(MODEL_URLS.keys())
......@@ -344,7 +345,7 @@ def PPHGNet_small(pretrained=False, use_ssld=False, **kwargs):
return model
def PPHGNet_base(pretrained=False, use_ssld=False, **kwargs):
def PPHGNet_base(pretrained=False, use_ssld=True, **kwargs):
"""
PPHGNet_base
Args:
......
......@@ -132,7 +132,6 @@ class DepthwiseSeparable(TheseusLayer):
lr_mult=lr_mult)
if use_se:
self.se = SEModule(num_channels, lr_mult=lr_mult)
self.pw_conv = ConvBNLayer(
num_channels=num_channels,
filter_size=1,
......@@ -193,7 +192,8 @@ class PPLCNet(TheseusLayer):
stride_list=[2, 2, 2, 2, 2],
use_last_conv=True,
return_patterns=None,
return_stages=None):
return_stages=None,
**kwargs):
super().__init__()
self.scale = scale
self.class_expand = class_expand
......
......@@ -165,7 +165,8 @@ class BottleneckBlock(nn.Layer):
class Res2Net_vd(nn.Layer):
def __init__(self, layers=50, scales=4, width=26, class_num=1000):
def __init__(self, layers=50, scales=4, width=26, class_num=1000,
**kwargs):
super(Res2Net_vd, self).__init__()
self.layers = layers
......
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output/"
device: "gpu"
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: "./inference"
use_dali: false
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
# O1: mixed fp16
level: O1
# model architecture
Arch:
name: "DistillationModel"
class_num: &class_num 1000
  # if not null, its length should be the same as models
pretrained_list:
  # if not null, its length should be the same as models
freeze_params_list:
- True
- False
models:
- Teacher:
name: Res2Net200_vd_26w_4s
class_num: *class_num
pretrained: True
use_ssld: True
- Student:
name: PPHGNet_base
class_num: *class_num
pretrained: False
infer_model_name: "Student"
# loss function config for training/eval process
Loss:
Train:
- DistillationCELoss:
weight: 1.0
model_name_pairs:
- ["Student", "Teacher"]
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.5
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: "./dataset/ILSVRC2012/"
cls_label_path: "./dataset/ILSVRC2012/train_list.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
interpolation: bicubic
backend: pil
- RandFlipImage:
flip_code: 1
- TimmAutoAugment:
config_str: rand-m7-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: "./dataset/ILSVRC2012/"
cls_label_path: "./dataset/ILSVRC2012/val_list.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 236
interpolation: bicubic
backend: pil
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Infer:
infer_imgs: "docs/images/inference_deployment/whl_demo.jpg"
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 236
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: DistillationPostProcess
func: Topk
topk: 5
class_id_map_file: "ppcls/utils/imagenet1k_label_list.txt"
Metric:
Train:
- DistillationTopkAcc:
model_key: "Student"
topk: [1, 5]
Eval:
- DistillationTopkAcc:
model_key: "Student"
topk: [1, 5]
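A distillation config like the one above (Res2Net200_vd_26w_4s teacher, PPHGNet_base student) is normally driven through the standard PaddleClas training entry point; a minimal single-node sketch, with the config path left as a placeholder rather than a file named in this diff:

```shell
python3 -m paddle.distributed.launch --gpus="0,1,2,3,4,5,6,7" \
    tools/train.py \
    -c <path/to/the_distillation_config_above.yaml>
```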
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 600
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
use_dali: False
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
# O1: mixed fp16
level: O1
# model architecture
Arch:
name: PPHGNet_base
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.5
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
interpolation: bicubic
backend: pil
- RandFlipImage:
flip_code: 1
- TimmAutoAugment:
config_str: rand-m15-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.4
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.4
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 16
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 236
interpolation: bicubic
backend: pil
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 16
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 236
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
......@@ -113,7 +113,7 @@ DataLoader:
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
infer_imgs: deploy/images/PULC/person_exists/objects365_02035329.jpg
batch_size: 10
transforms:
- DecodeImage:
......
......@@ -119,7 +119,7 @@ DataLoader:
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
infer_imgs: deploy/images/PULC/person_exists/objects365_02035329.jpg
batch_size: 10
transforms:
- DecodeImage:
......
......@@ -135,7 +135,7 @@ DataLoader:
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
infer_imgs: deploy/images/PULC/person_exists/objects365_02035329.jpg
batch_size: 10
transforms:
- DecodeImage:
......
......@@ -119,7 +119,7 @@ DataLoader:
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
infer_imgs: deploy/images/PULC/person_exists/objects365_02035329.jpg
batch_size: 10
transforms:
- DecodeImage:
......
......@@ -136,7 +136,7 @@ DataLoader:
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
infer_imgs: deploy/images/PULC/person_exists/objects365_02035329.jpg
batch_size: 10
transforms:
- DecodeImage:
......
base_config_file: ppcls/configs/PULC/person_exists/PPLCNet_x1_0_search.yaml
distill_config_file: ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml
gpus: 0,1,2,3
output_dir: output/search_person_cls
search_times: 1
search_dict:
- search_key: lrs
replace_config:
- Optimizer.lr.learning_rate
search_values: [0.0075, 0.01, 0.0125]
- search_key: resolutions
replace_config:
- DataLoader.Train.dataset.transform_ops.1.RandCropImage.size
- DataLoader.Train.dataset.transform_ops.3.TimmAutoAugment.img_size
search_values: [176, 192, 224]
- search_key: ra_probs
replace_config:
- DataLoader.Train.dataset.transform_ops.3.TimmAutoAugment.prob
search_values: [0.0, 0.1, 0.5]
- search_key: re_probs
replace_config:
- DataLoader.Train.dataset.transform_ops.5.RandomErasing.EPSILON
search_values: [0.0, 0.1, 0.5]
- search_key: lr_mult_list
replace_config:
- Arch.lr_mult_list
search_values:
- [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
- [0.0, 0.4, 0.4, 0.8, 0.8, 1.0]
- [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
teacher:
rm_keys:
- Arch.lr_mult_list
search_values:
- ResNet101_vd
- ResNet50_vd
final_replace:
Arch.lr_mult_list: Arch.models.1.Student.lr_mult_list
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 10
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: MobileNetV3_large_x1_0
class_num: 232
pretrained: True
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00002
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_train.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_test.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: dataset/traffic_sign/label_name_id.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
start_eval_epoch: 0
epochs: 10
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
use_dali: False
# model architecture
Arch:
name: PPLCNet_x1_0
class_num: 232
pretrained: True
use_ssld: True
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.02
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/tt100k_clas_v2/label_list_train.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- TimmAutoAugment:
prob: 0.5
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.0
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_test.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Infer:
infer_imgs: deploy/images/PULC/traffic_sign/99603_17806.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: dataset/traffic_sign/label_name_id.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 10
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
use_dali: False
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
# O1: mixed fp16
level: O1
# model architecture
Arch:
name: "DistillationModel"
class_num: &class_num 232
# if not null, its length should match the number of models
pretrained_list:
# if not null, its length should match the number of models
freeze_params_list:
- True
- False
models:
- Teacher:
name: ResNet101_vd
class_num: *class_num
pretrained: False
- Student:
name: PPLCNet_x1_0
class_num: *class_num
pretrained: True
use_ssld: True
infer_model_name: "Student"
# loss function config for training/eval process
Loss:
Train:
- DistillationDMLLoss:
weight: 1.0
model_name_pairs:
- ["Student", "Teacher"]
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_train_for_distillation.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- TimmAutoAugment:
prob: 0.0
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.0
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_test.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: dataset/traffic_sign/label_name_id.txt
Metric:
Train:
- DistillationTopkAcc:
model_key: "Student"
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
start_eval_epoch: 0
epochs: 10
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
use_dali: False
# model architecture
Arch:
name: PPLCNet_x1_0
class_num: 232
pretrained: True
# use_ssld: True
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_train.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- TimmAutoAugment:
prob: 0.0
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.0
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_test.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Infer:
# infer_imgs: dataset/traffic_sign_demo/
infer_imgs: dataset/tt100k_clas_v2/test/
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: dataset/traffic_sign/label_name_id.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
start_eval_epoch: 0
epochs: 10
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
use_dali: False
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
# O1: mixed fp16
level: O1
# model architecture
Arch:
name: SwinTransformer_tiny_patch4_window7_224
class_num: 232
pretrained: True
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: AdamW
beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
one_dim_param_no_weight_decay: True
lr:
name: Cosine
learning_rate: 2e-4
eta_min: 2e-6
warmup_epoch: 5
warmup_start_lr: 2e-7
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/traffic_sign/label_list_train.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
interpolation: bicubic
backend: pil
- RandFlipImage:
flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/tt100k_clas_v2/label_list_test.txt
delimiter: "\t"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Infer:
infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: dataset/traffic_sign/label_name_id.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
base_config_file: ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_search.yaml
distill_config_file: ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_distillation.yaml
gpus: 0,1,2,3
output_dir: output/search_traffic_sign
search_times: 1
search_dict:
- search_key: lrs
replace_config:
- Optimizer.lr.learning_rate
search_values: [0.0075, 0.01, 0.0125]
- search_key: resolutions
replace_config:
- DataLoader.Train.dataset.transform_ops.1.RandCropImage.size
- DataLoader.Train.dataset.transform_ops.2.TimmAutoAugment.img_size
search_values: [176, 192, 224]
- search_key: ra_probs
replace_config:
- DataLoader.Train.dataset.transform_ops.2.TimmAutoAugment.prob
search_values: [0.0, 0.1, 0.5]
- search_key: re_probs
replace_config:
- DataLoader.Train.dataset.transform_ops.4.RandomErasing.EPSILON
search_values: [0.0, 0.1, 0.5]
- search_key: lr_mult_list
replace_config:
- Arch.lr_mult_list
search_values:
- [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
- [0.0, 0.4, 0.4, 0.8, 0.8, 1.0]
- [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
teacher:
algorithm: "skl-ugi"
rm_keys:
- Arch.lr_mult_list
search_values:
- ResNet101_vd
- ResNet50_vd
final_replace:
Arch.lr_mult_list: Arch.models.1.Student.lr_mult_list
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output/"
device: "gpu"
save_interval: 5
eval_during_train: True
eval_interval: 1
epochs: 30
print_batch_step: 20
use_visualdl: False
# used for static mode and model export
image_shape: [3, 192, 256]
save_inference_dir: "./inference"
use_multilabel: True
# model architecture
Arch:
name: "MobileNetV3_large_x1_0"
pretrained: True
class_num: 19
infer_add_softmax: False
# loss function config for training/eval process
Loss:
Train:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Eval:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/train_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- Padv2:
size: [276, 212]
pad_mode: 1
fill_value: 0
- RandomCropImage:
size: [256, 192]
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: True
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/test_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Metric:
Eval:
- ATTRMetric:
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output/"
device: "gpu"
save_interval: 5
eval_during_train: True
eval_interval: 1
epochs: 30
print_batch_step: 20
use_visualdl: False
# used for static mode and model export
image_shape: [3, 192, 256]
save_inference_dir: "./inference"
use_multilabel: True
# model architecture
Arch:
name: "PPLCNet_x1_0"
pretrained: True
class_num: 19
use_ssld: True
lr_mult_list: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
infer_add_softmax: False
# loss function config for training/eval process
Loss:
Train:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Eval:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.0125
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/train_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- TimmAutoAugment:
prob: 0.0
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: [256, 192]
- Padv2:
size: [276, 212]
pad_mode: 1
fill_value: 0
- RandomCropImage:
size: [256, 192]
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.5
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: True
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/test_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Infer:
infer_imgs: ./deploy/images/PULC/vehicle_attr/0002_c002_00030670_0.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: VehicleAttribute
color_threshold: 0.5
type_threshold: 0.5
Metric:
Eval:
- ATTRMetric:
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output/"
device: "gpu"
save_interval: 5
eval_during_train: True
eval_interval: 1
epochs: 30
print_batch_step: 20
use_visualdl: False
# used for static mode and model export
image_shape: [3, 192, 256]
save_inference_dir: "./inference"
use_multilabel: True
# model architecture
Arch:
name: "DistillationModel"
class_num: &class_num 19
# if not null, its length should match the number of models
pretrained_list:
# if not null, its length should match the number of models
freeze_params_list:
- True
- False
use_ssld: True
models:
- Teacher:
name: ResNet101_vd
class_num: *class_num
- Student:
name: PPLCNet_x1_0
class_num: *class_num
pretrained: True
use_ssld: True
# loss function config for training/eval process
Loss:
Train:
- DistillationMultiLabelLoss:
weight: 1.0
model_names: ["Student"]
weight_ratio: True
size_sum: True
- DistillationDMLLoss:
weight: 1.0
weight_ratio: True
sum_across_class_dim: False
model_name_pairs:
- ["Student", "Teacher"]
Eval:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/train_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- TimmAutoAugment:
prob: 0.0
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: [256, 192]
- Padv2:
size: [276, 212]
pad_mode: 1
fill_value: 0
- RandomCropImage:
size: [256, 192]
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.0
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: True
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/test_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Metric:
Eval:
- ATTRMetric:
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output/"
device: "gpu"
save_interval: 5
eval_during_train: True
eval_interval: 1
epochs: 30
print_batch_step: 20
use_visualdl: False
# used for static mode and model export
image_shape: [3, 192, 256]
save_inference_dir: "./inference"
use_multilabel: True
# model architecture
Arch:
name: "PPLCNet_x1_0"
pretrained: True
use_ssld: True
class_num: 19
infer_add_softmax: False
# loss function config for training/eval process
Loss:
Train:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Eval:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/train_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- TimmAutoAugment:
prob: 0.0
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: [256, 192]
- Padv2:
size: [276, 212]
pad_mode: 1
fill_value: 0
- RandomCropImage:
size: [256, 192]
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.0
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: True
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/test_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Metric:
Eval:
- ATTRMetric:
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output/mo"
device: "gpu"
save_interval: 5
eval_during_train: True
eval_interval: 1
epochs: 30
print_batch_step: 20
use_visualdl: False
# used for static mode and model export
image_shape: [3, 192, 256]
save_inference_dir: "./inference"
use_multilabel: True
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
# O1: mixed fp16
level: O1
# model architecture
Arch:
name: "Res2Net200_vd_26w_4s"
pretrained: True
class_num: 19
infer_add_softmax: False
# loss function config for training/eval process
Loss:
Train:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Eval:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/train_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- Padv2:
size: [276, 212]
pad_mode: 1
fill_value: 0
- RandomCropImage:
size: [256, 192]
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: True
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/test_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Metric:
Eval:
- ATTRMetric:
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output/"
device: "gpu"
save_interval: 5
eval_during_train: True
eval_interval: 1
epochs: 30
print_batch_step: 20
use_visualdl: False
# used for static mode and model export
image_shape: [3, 192, 256]
save_inference_dir: "./inference"
use_multilabel: True
# model architecture
Arch:
name: "ResNet50"
pretrained: True
class_num: 19
infer_add_softmax: False
# loss function config for training/eval process
Loss:
Train:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Eval:
- MultiLabelLoss:
weight: 1.0
weight_ratio: True
size_sum: True
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/train_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- Padv2:
size: [276, 212]
pad_mode: 1
fill_value: 0
- RandomCropImage:
size: [256, 192]
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: True
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: MultiLabelDataset
image_root: "dataset/VeRi/"
cls_label_path: "dataset/VeRi/test_list.txt"
label_ratio: True
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Metric:
Eval:
- ATTRMetric:
base_config_file: ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0_search.yaml
distill_config_file: ppcls/configs/PULC/vehicle_attr/PPLCNet_x1_0_distillation.yaml
gpus: 0,1,2,3
output_dir: output/search_vehicle_attr
search_times: 1
search_dict:
- search_key: lrs
replace_config:
- Optimizer.lr.learning_rate
search_values: [0.0075, 0.01, 0.0125]
- search_key: ra_probs
replace_config:
- DataLoader.Train.dataset.transform_ops.2.TimmAutoAugment.prob
search_values: [0.0, 0.1, 0.5]
- search_key: re_probs
replace_config:
- DataLoader.Train.dataset.transform_ops.7.RandomErasing.EPSILON
search_values: [0.0, 0.1, 0.5]
- search_key: lr_mult_list
replace_config:
- Arch.lr_mult_list
search_values:
- [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
- [0.0, 0.4, 0.4, 0.8, 0.8, 1.0]
- [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
teacher:
algorithm: "skl-ugi"
rm_keys:
- Arch.lr_mult_list
search_values:
- ResNet101_vd
- ResNet50_vd
final_replace:
Arch.lr_mult_list: Arch.models.1.Student.lr_mult_list
......@@ -18,6 +18,7 @@ from . import topk, threshoutput
from .topk import Topk, MultiLabelTopk
from .threshoutput import ThreshOutput
from .attr_rec import VehicleAttribute
def build_postprocess(config):
......
# copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import numpy as np
import paddle
import paddle.nn.functional as F
class VehicleAttribute(object):
def __init__(self, color_threshold=0.5, type_threshold=0.5):
self.color_threshold = color_threshold
self.type_threshold = type_threshold
self.color_list = [
"yellow", "orange", "green", "gray", "red", "blue", "white",
"golden", "brown", "black"
]
self.type_list = [
"sedan", "suv", "van", "hatchback", "mpv", "pickup", "bus",
"truck", "estate"
]
def __call__(self, x, file_names=None):
if isinstance(x, dict):
x = x['logits']
assert isinstance(x, paddle.Tensor)
if file_names is not None:
assert x.shape[0] == len(file_names)
x = F.sigmoid(x).numpy()
# postprocess output of predictor
batch_res = []
for idx, res in enumerate(x):
res = res.tolist()
label_res = []
color_idx = np.argmax(res[:10])
type_idx = np.argmax(res[10:])
if res[color_idx] >= self.color_threshold:
color_info = f"Color: ({self.color_list[color_idx]}, prob: {res[color_idx]})"
else:
color_info = "Color unknown"
if res[type_idx + 10] >= self.type_threshold:
type_info = f"Type: ({self.type_list[type_idx]}, prob: {res[type_idx + 10]})"
else:
type_info = "Type unknown"
label_res = f"{color_info}, {type_info}"
threshold_list = [self.color_threshold
] * 10 + [self.type_threshold] * 9
pred_res = (np.array(res) > np.array(threshold_list)
).astype(np.int8).tolist()
batch_res.append({
"attr": label_res,
"pred": pred_res,
"file_name": file_names[idx]
})
return batch_res
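A minimal usage sketch for this post-processor, assuming the `VehicleAttribute` class above is importable; the input is random 19-dimensional logits (10 color scores followed by 9 type scores), purely for demonstration:

```python
import paddle

post = VehicleAttribute(color_threshold=0.5, type_threshold=0.5)
fake_logits = paddle.rand([1, 19])  # one image: 10 color + 9 type scores
print(post(fake_logits, file_names=["demo.jpg"]))
```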
......@@ -21,9 +21,9 @@ import paddle.nn.functional as F
class Topk(object):
def __init__(self, topk=1, class_id_map_file=None, delimiter=None):
assert isinstance(topk, (int, ))
self.class_id_map = self.parse_class_id_map(class_id_map_file)
self.topk = topk
self.delimiter = delimiter if delimiter is not None else " "
self.class_id_map = self.parse_class_id_map(class_id_map_file)
def parse_class_id_map(self, class_id_map_file):
if class_id_map_file is None:
......
......@@ -365,9 +365,6 @@ class RandomCropImage(object):
j = random.randint(0, w - tw)
img = img[i:i + th, j:j + tw, :]
if img.shape[0] != 256 or img.shape[1] != 192:
raise ValueError('sample: ', h, w, i, j, th, tw, img.shape)
return img
......
......@@ -152,39 +152,33 @@ class Engine(object):
self.eval_loss_func = None
# build metric
if self.mode == 'train':
metric_config = self.config.get("Metric")
if metric_config is not None:
metric_config = metric_config.get("Train")
if metric_config is not None:
if hasattr(
self.train_dataloader, "collate_fn"
if self.mode == 'train' and "Metric" in self.config and "Train" in self.config[
"Metric"] and self.config["Metric"]["Train"]:
metric_config = self.config["Metric"]["Train"]
if hasattr(self.train_dataloader, "collate_fn"
) and self.train_dataloader.collate_fn is not None:
for m_idx, m in enumerate(metric_config):
if "TopkAcc" in m:
msg = f"'TopkAcc' metric can not be used when setting 'batch_transform_ops' in config. The 'TopkAcc' metric has been removed."
msg = f"Unable to calculate accuracy when using \"batch_transform_ops\". The metric \"{m}\" has been removed."
logger.warning(msg)
break
metric_config.pop(m_idx)
self.train_metric_func = build_metrics(metric_config)
else:
self.train_metric_func = None
else:
self.train_metric_func = None
if self.mode == "eval" or (self.mode == "train" and
self.config["Global"]["eval_during_train"]):
metric_config = self.config.get("Metric")
if self.eval_mode == "classification":
if metric_config is not None:
metric_config = metric_config.get("Eval")
if metric_config is not None:
self.eval_metric_func = build_metrics(metric_config)
if "Metric" in self.config and "Eval" in self.config["Metric"]:
self.eval_metric_func = build_metrics(self.config["Metric"]
["Eval"])
else:
self.eval_metric_func = None
elif self.eval_mode == "retrieval":
if metric_config is None:
metric_config = [{"name": "Recallk", "topk": (1, 5)}]
if "Metric" in self.config and "Eval" in self.config["Metric"]:
metric_config = self.config["Metric"]["Eval"]
else:
metric_config = metric_config["Eval"]
metric_config = [{"name": "Recallk", "topk": (1, 5)}]
self.eval_metric_func = build_metrics(metric_config)
else:
self.eval_metric_func = None
......@@ -222,7 +216,7 @@ class Engine(object):
AMP_RELATED_FLAGS_SETTING.update({
'FLAGS_cudnn_batchnorm_spatial_persistent': 1
})
paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
paddle.set_flags(AMP_RELATED_FLAGS_SETTING)
self.scale_loss = self.config["AMP"].get("scale_loss", 1.0)
self.use_dynamic_loss_scaling = self.config["AMP"].get(
......@@ -460,7 +454,7 @@ class Engine(object):
assert self.mode == "export"
use_multilabel = self.config["Global"].get(
"use_multilabel",
False) and not "ATTRMetric" in self.config["Metric"]["Eval"][0]
False) and "ATTRMetric" in self.config["Metric"]["Eval"][0]
model = ExportModel(self.config["Arch"], self.model, use_multilabel)
if self.config["Global"]["pretrained_model"] is not None:
load_dygraph_pretrain(model.base_model,
......
......@@ -34,7 +34,6 @@ def classification_eval(engine, epoch_id=0):
}
print_batch_step = engine.config["Global"]["print_batch_step"]
metric_key = None
tic = time.time()
accum_samples = 0
total_samples = len(
......
......@@ -24,6 +24,7 @@ from .distillationloss import DistillationDistanceLoss
from .distillationloss import DistillationRKDLoss
from .distillationloss import DistillationKLDivLoss
from .distillationloss import DistillationDKDLoss
from .distillationloss import DistillationMultiLabelLoss
from .multilabelloss import MultiLabelLoss
from .afdloss import AFDLoss
......
......@@ -22,6 +22,7 @@ from .distanceloss import DistanceLoss
from .rkdloss import RKdAngle, RkdDistance
from .kldivloss import KLDivLoss
from .dkdloss import DKDLoss
from .multilabelloss import MultiLabelLoss
class DistillationCELoss(CELoss):
......@@ -89,13 +90,16 @@ class DistillationDMLLoss(DMLLoss):
def __init__(self,
model_name_pairs=[],
act="softmax",
weight_ratio=False,
sum_across_class_dim=False,
key=None,
name="loss_dml"):
super().__init__(act=act)
super().__init__(act=act, sum_across_class_dim=sum_across_class_dim)
assert isinstance(model_name_pairs, list)
self.key = key
self.model_name_pairs = model_name_pairs
self.name = name
self.weight_ratio = weight_ratio
def forward(self, predicts, batch):
loss_dict = dict()
......@@ -105,6 +109,9 @@ class DistillationDMLLoss(DMLLoss):
if self.key is not None:
out1 = out1[self.key]
out2 = out2[self.key]
if self.weight_ratio is True:
loss = super().forward(out1, out2, batch)
else:
loss = super().forward(out1, out2)
if isinstance(loss, dict):
for key in loss:
......@@ -122,6 +129,7 @@ class DistillationDistanceLoss(DistanceLoss):
def __init__(self,
mode="l2",
model_name_pairs=[],
act=None,
key=None,
name="loss_",
**kargs):
......@@ -130,6 +138,13 @@ class DistillationDistanceLoss(DistanceLoss):
self.key = key
self.model_name_pairs = model_name_pairs
self.name = name + mode
assert act in [None, "sigmoid", "softmax"]
if act == "sigmoid":
self.act = nn.Sigmoid()
elif act == "softmax":
self.act = nn.Softmax(axis=-1)
else:
self.act = None
def forward(self, predicts, batch):
loss_dict = dict()
......@@ -139,6 +154,9 @@ class DistillationDistanceLoss(DistanceLoss):
if self.key is not None:
out1 = out1[self.key]
out2 = out2[self.key]
if self.act is not None:
out1 = self.act(out1)
out2 = self.act(out2)
loss = super().forward(out1, out2)
for key in loss:
loss_dict["{}_{}_{}".format(self.name, key, idx)] = loss[key]
......@@ -235,3 +253,34 @@ class DistillationDKDLoss(DKDLoss):
loss = super().forward(out1, out2, batch)
loss_dict[f"{self.name}_{pair[0]}_{pair[1]}"] = loss
return loss_dict
class DistillationMultiLabelLoss(MultiLabelLoss):
"""
DistillationMultiLabelLoss
"""
def __init__(self,
model_names=[],
epsilon=None,
size_sum=False,
weight_ratio=False,
key=None,
name="loss_mll"):
super().__init__(
epsilon=epsilon, size_sum=size_sum, weight_ratio=weight_ratio)
assert isinstance(model_names, list)
self.key = key
self.model_names = model_names
self.name = name
def forward(self, predicts, batch):
loss_dict = dict()
for name in self.model_names:
out = predicts[name]
if self.key is not None:
out = out[self.key]
loss = super().forward(out, batch)
for key in loss:
loss_dict["{}_{}".format(key, name)] = loss[key]
return loss_dict
......@@ -16,13 +16,15 @@ import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from ppcls.loss.multilabelloss import ratio2weight
class DMLLoss(nn.Layer):
"""
DMLLoss
"""
def __init__(self, act="softmax", eps=1e-12):
def __init__(self, act="softmax", sum_across_class_dim=False, eps=1e-12):
super().__init__()
if act is not None:
assert act in ["softmax", "sigmoid"]
......@@ -33,6 +35,7 @@ class DMLLoss(nn.Layer):
else:
self.act = None
self.eps = eps
self.sum_across_class_dim = sum_across_class_dim
def _kldiv(self, x, target):
class_num = x.shape[-1]
......@@ -40,11 +43,20 @@ class DMLLoss(nn.Layer):
(target + self.eps) / (x + self.eps)) * class_num
return cost
def forward(self, x, target):
def forward(self, x, target, gt_label=None):
if self.act is not None:
x = self.act(x)
target = self.act(target)
loss = self._kldiv(x, target) + self._kldiv(target, x)
loss = loss / 2
loss = paddle.mean(loss)
# for multi-label dml loss
if gt_label is not None:
gt_label, label_ratio = gt_label[:, 0, :], gt_label[:, 1, :]
targets_mask = paddle.cast(gt_label > 0.5, 'float32')
weight = ratio2weight(targets_mask, paddle.to_tensor(label_ratio))
weight = weight * (gt_label > -1)
loss = loss * weight
loss = loss.sum(1).mean() if self.sum_across_class_dim else loss.mean()
return {"DMLLoss": loss}
......@@ -26,6 +26,7 @@ from easydict import EasyDict
from ppcls.metric.avg_metrics import AvgMetrics
from ppcls.utils.misc import AverageMeter, AttrMeter
from ppcls.utils import logger
class TopkAcc(AvgMetrics):
......@@ -39,7 +40,7 @@ class TopkAcc(AvgMetrics):
def reset(self):
self.avg_meters = {
"top{}".format(k): AverageMeter("top{}".format(k))
f"top{k}": AverageMeter(f"top{k}")
for k in self.topk
}
......@@ -47,11 +48,21 @@ class TopkAcc(AvgMetrics):
if isinstance(x, dict):
x = x["logits"]
output_dims = x.shape[-1]
metric_dict = dict()
for k in self.topk:
metric_dict["top{}".format(k)] = paddle.metric.accuracy(
x, label, k=k)
self.avg_meters["top{}".format(k)].update(metric_dict["top{}".format(k)], x.shape[0])
for idx, k in enumerate(self.topk):
if output_dims < k:
msg = f"The output dims({output_dims}) is less than k({k}), and the argument {k} of Topk has been removed."
logger.warning(msg)
self.avg_meters.pop(f"top{k}")
continue
metric_dict[f"top{k}"] = paddle.metric.accuracy(x, label, k=k)
self.avg_meters[f"top{k}"].update(metric_dict[f"top{k}"],
x.shape[0])
self.topk = list(filter(lambda k: k <= output_dims, self.topk))
return metric_dict
......
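The guard above exists because `paddle.metric.accuracy` cannot compute top-k accuracy when k exceeds the number of output dims; a quick illustration with 4-class logits (random values, for demonstration only):

```python
import paddle

logits = paddle.rand([8, 4])                    # 4 output dims
labels = paddle.randint(0, 4, shape=[8, 1])
print(paddle.metric.accuracy(logits, labels, k=1))  # valid: k <= 4
# k=5 would exceed the 4 output dims, which is why TopkAcc drops it
```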
......@@ -62,8 +62,8 @@ def load_params(exe, prog, path, ignore_params=None):
"""
Load model from the given path.
Args:
exe (fluid.Executor): The fluid.Executor object.
prog (fluid.Program): load weight to which Program object.
exe (paddle.static.Executor): The paddle.static.Executor object.
prog (paddle.static.Program): the Program object to load the weights into.
path (string): URL string or local model path.
ignore_params (list): variables to ignore when loading during finetuning.
It can be specified by finetune_exclude_pretrained_params
......
......@@ -87,7 +87,7 @@ def main(args):
'FLAGS_max_inplace_grad_add': 8,
}
os.environ['FLAGS_cudnn_batchnorm_spatial_persistent'] = '1'
paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
paddle.set_flags(AMP_RELATED_FLAGS_SETTING)
use_xpu = global_config.get("use_xpu", False)
use_npu = global_config.get("use_npu", False)
......
......@@ -112,7 +112,7 @@ def get_path_from_url(url,
str: a local path to save downloaded models & weights & datasets.
"""
from paddle.fluid.dygraph.parallel import ParallelEnv
from paddle.distributed import ParallelEnv
assert is_url(url), "downloading from {} not a url".format(url)
# parse path after download to decompress under root_dir
......
......@@ -35,18 +35,23 @@
│ ├── MobileNetV3 # test config directory for the MobileNetV3 series
│ │ ├── MobileNetV3_large_x1_0_train_infer_python.txt # basic training & inference config
│ │ ├── MobileNetV3_large_x1_0_train_linux_gpu_fleet_amp_infer_python_linux_gpu_cpu.txt # multi-machine multi-GPU training & inference config
│ │ └── MobileNetV3_large_x1_0_train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt # mixed-precision training & inference config
│ └── ResNet # test config directory for the ResNet series
│ ├── ResNet50_vd_train_infer_python.txt # basic training & inference config
│ ├── ResNet50_vd_train_linux_gpu_fleet_amp_infer_python_linux_gpu_cpu.txt # multi-machine multi-GPU training & inference config
│ └── ResNet50_vd_train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt # mixed-precision training & inference config
| ......
│ │ ├── MobileNetV3_large_x1_0_train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt # mixed-precision training & inference config
│ │ ├── MobileNetV3_large_x1_0_paddle2onnx_infer_python.txt # paddle2onnx inference test config
│ │ └── ......
│ ├── ResNet # test config directory for the ResNet series
│ │ ├── ResNet50_vd_train_infer_python.txt # basic training & inference config
│ │ ├── ResNet50_vd_train_linux_gpu_fleet_amp_infer_python_linux_gpu_cpu.txt # multi-machine multi-GPU training & inference config
│ │ ├── ResNet50_vd_train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt # mixed-precision training & inference config
│ │ ├── ResNet50_vd_paddle2onnx_infer_python.txt # paddle2onnx inference test config
│ │ └── ......
│ └── ......
├── docs
│ ├── guide.png
│ └── test.png
├── prepare.sh # downloads the data and models required to run test_*.sh
├── README.md # usage documentation
├── results # pre-saved prediction results, used for accuracy comparison with actual predictions
├── test_paddle2onnx.sh # main script for testing paddle2onnx inference
└── test_train_inference_python.sh # main script for testing Python training & inference
```
......@@ -106,3 +111,5 @@ bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/MobileNetV3/Mo
- [Using test_serving](docs/test_serving.md): tests service-oriented deployment based on Paddle Serving.
- [Using test_lite_arm_cpu_cpp](docs/test_lite_arm_cpu_cpp.md): tests C++ inference deployment on ARM CPU based on Paddle-Lite.
- [Using test_paddle2onnx](docs/test_paddle2onnx.md): tests model conversion with Paddle2ONNX and verifies its correctness.
- [Using test_serving_infer_python](docs/test_serving_infer_python.md): tests the Python serving pipeline.
- [Using test_train_fleet_inference_python](./docs/test_train_fleet_inference_python.md): tests basic multi-machine multi-GPU training and inference with Python.
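As a rough usage sketch (exact mode names depend on the config file; the paths below follow the directory tree above, and `lite_train_lite_infer` is one of the modes referenced by the configs in this diff):

```shell
# prepare the data/models for a config and mode, then run the test
bash test_tipc/prepare.sh test_tipc/configs/MobileNetV3/MobileNetV3_large_x1_0_train_infer_python.txt lite_train_lite_infer
bash test_tipc/test_train_inference_python.sh test_tipc/configs/MobileNetV3/MobileNetV3_large_x1_0_train_infer_python.txt lite_train_lite_infer
```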
===========================train_params===========================
model_name:GeneralRecognition_PPLCNet_x2_5
python:python3.7
gpu_list:192.168.0.1,192.168.0.2;0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_lite_infer=2|whole_train_whole_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
quant_export:null
fpgm_export:null
distill_export:null
kl_quant:null
export2:null
pretrained_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0.pdparams
infer_model:../inference/
infer_export:True
infer_quant:False
inference:python/predict_rec.py -c configs/inference_rec.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1|16
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.rec_inference_model_dir:../inference
-o Global.infer_imgs:../dataset/Aliproduct/demo_test/
-o Global.save_log_path:null
-o Global.benchmark:True
null:null
null:null
===========================infer_benchmark_params==========================
random_infer_input:[{float32,[3,224,224]}]
\ No newline at end of file
===========================cpp_infer_params===========================
model_name:MobileNetV3_large_x1_0
cpp_infer_type:cls
cls_inference_model_dir:./MobileNetV3_large_x1_0_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/MobileNetV3_large_x1_0_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:MobileNetV3_large_x1_0
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/MobileNetV3_large_x1_0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/MobileNetV3_large_x1_0_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/MobileNetV3_large_x1_0_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/MobileNetV3_large_x1_0_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:MobileNetV3_large_x1_0
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/MobileNetV3_large_x1_0_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/MobileNetV3_large_x1_0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/MobileNetV3_large_x1_0_serving/
--serving_client:./deploy/paddleserving/MobileNetV3_large_x1_0_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================paddle2onnx_params===========================
model_name:PP-ShiTu_general_rec
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/general_PPLCNet_x2_5_lite_v1.0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/general_PPLCNet_x2_5_lite_v1.0_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/general_PPLCNet_x2_5_lite_v1.0_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================cpp_infer_params===========================
model_name:PPShiTu
cpp_infer_type:shitu
feature_inference_model_dir:./feature_inference/
det_inference_model_dir:./det_inference
feature_inference_model_dir:./general_PPLCNet_x2_5_lite_v1.0_infer/
det_inference_model_dir:./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
det_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
infer_quant:False
inference_cmd:./deploy/cpp_shitu/build/pp_shitu -c inference_drink.yaml
use_gpu:True|False
enable_mkldnn:True|False
cpu_threads:1|6
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False|True
precision:fp32|fp16
use_tensorrt:False
precision:fp32
data_dir:./dataset/drink_dataset_v1.0
benchmark:True
generate_yaml_cmd:python3 test_tipc/generate_cpp_yaml.py
transform_index_cmd:python3 deploy/cpp_shitu/tools/transform_id_map.py -c inference_drink.yaml
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
transform_index_cmd:python3.7 deploy/cpp_shitu/tools/transform_id_map.py -c inference_drink.yaml
===========================serving_params===========================
model_name:PPShiTu
python:python3.7
cls_inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar
det_inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./models/general_PPLCNet_x2_5_lite_v1.0_infer/
--dirname:./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./models/general_PPLCNet_x2_5_lite_v1.0_serving/
--serving_client:./models/general_PPLCNet_x2_5_lite_v1.0_client/
--serving_server:./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/
--serving_client:./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
serving_dir:./paddleserving/recognition
web_service:recognition_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================paddle2onnx_params===========================
model_name:PP-ShiTu_mainbody_det
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================cpp_infer_params===========================
model_name:PPHGNet_small
cpp_infer_type:cls
cls_inference_model_dir:./PPHGNet_small_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPHGNet_small
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPHGNet_small_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPHGNet_small_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPHGNet_small_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPHGNet_small
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPHGNet_small_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPHGNet_small_serving/
--serving_client:./deploy/paddleserving/PPHGNet_small_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================train_params===========================
model_name:PPHGNet_small
python:python3.7
gpu_list:192.168.0.1,192.168.0.2;0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_lite_infer=2|whole_train_whole_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml
quant_export:null
fpgm_export:null
distill_export:null
kl_quant:null
export2:null
pretrained_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_pretrained.pdparams
infer_model:../inference/
infer_export:True
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml -o PreProcess.transform_ops.0.ResizeImage.resize_short=236
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1|16
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
null:null
===========================infer_benchmark_params==========================
random_infer_input:[{float32,[3,224,224]}]
===========================cpp_infer_params===========================
model_name:PPHGNet_tiny
cpp_infer_type:cls
cls_inference_model_dir:./PPHGNet_tiny_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPHGNet_tiny
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPHGNet_tiny_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPHGNet_tiny_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPHGNet_tiny_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPHGNet_tiny
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPHGNet_tiny_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPHGNet_tiny_serving/
--serving_client:./deploy/paddleserving/PPHGNet_tiny_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:PPLCNet_x0_25
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x0_25_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_25_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x0_25
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x0_25_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x0_25_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x0_25_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x0_25
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_25_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x0_25_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x0_25_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x0_25_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:PPLCNet_x0_35
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x0_35_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_35_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x0_35
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x0_35_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x0_35_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_35_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x0_35_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x0_35
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_35_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x0_35_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x0_35_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x0_35_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:PPLCNet_x0_5
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x0_5_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_5_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x0_5
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x0_5_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x0_5_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_5_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x0_5_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x0_5
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_5_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x0_5_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x0_5_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x0_5_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
\ No newline at end of file
===========================cpp_infer_params===========================
model_name:PPLCNet_x0_75
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x0_75_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_75_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x0_75
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x0_75_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x0_75_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_75_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x0_75_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x0_75
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_75_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x0_75_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x0_75_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x0_75_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:PPLCNet_x1_0
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x1_0_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_0_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x1_0
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x1_0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x1_0_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_0_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x1_0_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x1_0
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_0_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x1_0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x1_0_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x1_0_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================train_params===========================
model_name:PPLCNet_x1_0
python:python3.7
gpu_list:192.168.0.1,192.168.0.2;0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_lite_infer=2|whole_train_whole_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/PPLCNet/PPLCNet_x1_0.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/PPLCNet/PPLCNet_x1_0.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/PPLCNet/PPLCNet_x1_0.yaml
quant_export:null
fpgm_export:null
distill_export:null
kl_quant:null
export2:null
pretrained_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams
infer_model:../inference/
infer_export:True
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1|16
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
null:null
===========================infer_benchmark_params==========================
random_infer_input:[{float32,[3,224,224]}]
\ No newline at end of file
===========================cpp_infer_params===========================
model_name:PPLCNet_x1_5
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x1_5_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_5_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x1_5
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x1_5_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x1_5_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_5_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x1_5_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x1_5
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_5_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x1_5_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x1_5_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x1_5_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:PPLCNet_x2_0
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x2_0_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_0_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x2_0
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x2_0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x2_0_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_0_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x2_0_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x2_0
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_0_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x2_0_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x2_0_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x2_0_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:PPLCNet_x2_5
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNet_x2_5_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_5_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNet_x2_5
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNet_x2_5_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNet_x2_5_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_5_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNet_x2_5_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNet_x2_5
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_5_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNet_x2_5_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNet_x2_5_serving/
--serving_client:./deploy/paddleserving/PPLCNet_x2_5_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:PPLCNetV2_base
cpp_infer_type:cls
cls_inference_model_dir:./PPLCNetV2_base_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNetV2_base_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:PPLCNetV2_base
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/PPLCNetV2_base_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/PPLCNetV2_base_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNetV2_base_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/PPLCNetV2_base_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:PPLCNetV2_base
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNetV2_base_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/PPLCNetV2_base_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/PPLCNetV2_base_serving/
--serving_client:./deploy/paddleserving/PPLCNetV2_base_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================train_params===========================
model_name:PPLCNetV2_base
python:python3.7
gpu_list:192.168.0.1,192.168.0.2;0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_lite_infer=2|whole_train_whole_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.first_bs:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/PPLCNetV2/PPLCNetV2_base.yaml -o Global.seed=1234 -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/PPLCNetV2/PPLCNetV2_base.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/PPLCNetV2/PPLCNetV2_base.yaml
quant_export:null
fpgm_export:null
distill_export:null
kl_quant:null
export2:null
pretrained_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNetV2_base_pretrained.pdparams
infer_model:../inference/
infer_export:True
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1|16
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
null:null
===========================infer_benchmark_params==========================
random_infer_input:[{float32,[3,224,224]}]
===========================cpp_infer_params===========================
model_name:ResNet50
cpp_infer_type:cls
cls_inference_model_dir:./ResNet50_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:ResNet50
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/ResNet50_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/ResNet50_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/ResNet50_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:ResNet50
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/ResNet50_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/ResNet50_serving/
--serving_client:./deploy/paddleserving/ResNet50_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:ResNet50_vd
cpp_infer_type:cls
cls_inference_model_dir:./ResNet50_vd_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
......@@ -8,7 +8,9 @@ python:python3.7
--save_file:./deploy/models/ResNet50_vd_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar
inference: python/predict_cls.py -c configs/inference_cls.yaml
Global.use_onnx:True
Global.inference_model_dir:models/ResNet50_vd_infer/
Global.use_gpu:False
-c:configs/inference_cls.yaml
===========================serving_params===========================
model_name:ResNet50_vd
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/ResNet50_vd_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/ResNet50_vd_serving/
--serving_client:./deploy/paddleserving/ResNet50_vd_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
===========================cpp_infer_params===========================
model_name:SwinTransformer_tiny_patch4_window7_224
cpp_infer_type:cls
cls_inference_model_dir:./SwinTransformer_tiny_patch4_window7_224_infer/
det_inference_model_dir:
cls_inference_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/SwinTransformer_tiny_patch4_window7_224_infer.tar
det_inference_url:
infer_quant:False
inference_cmd:./deploy/cpp/build/clas_system -c inference_cls.yaml
use_gpu:True|False
enable_mkldnn:False
cpu_threads:1
batch_size:1
use_tensorrt:False
precision:fp32
image_dir:./dataset/ILSVRC2012/val/ILSVRC2012_val_00000001.JPEG
benchmark:False
generate_yaml_cmd:python3.7 test_tipc/generate_cpp_yaml.py
===========================paddle2onnx_params===========================
model_name:SwinTransformer_tiny_patch4_window7_224
python:python3.7
2onnx: paddle2onnx
--model_dir:./deploy/models/SwinTransformer_tiny_patch4_window7_224_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--save_file:./deploy/models/SwinTransformer_tiny_patch4_window7_224_infer/inference.onnx
--opset_version:10
--enable_onnx_checker:True
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/SwinTransformer_tiny_patch4_window7_224_infer.tar
inference:./python/predict_cls.py
Global.use_onnx:True
Global.inference_model_dir:./models/SwinTransformer_tiny_patch4_window7_224_infer
Global.use_gpu:False
-c:configs/inference_cls.yaml
\ No newline at end of file
===========================serving_params===========================
model_name:SwinTransformer_tiny_patch4_window7_224
python:python3.7
inference_model_url:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/SwinTransformer_tiny_patch4_window7_224_infer.tar
trans_model:-m paddle_serving_client.convert
--dirname:./deploy/paddleserving/SwinTransformer_tiny_patch4_window7_224_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
--serving_server:./deploy/paddleserving/SwinTransformer_tiny_patch4_window7_224_serving/
--serving_client:./deploy/paddleserving/SwinTransformer_tiny_patch4_window7_224_client/
serving_dir:./deploy/paddleserving
web_service:classification_web_service.py
--use_gpu:0|null
pipline:pipeline_http_client.py
\ No newline at end of file
# Linux GPU/CPU C++ inference test

The entry point for the Linux GPU/CPU C++ inference test is `test_inference_cpp.sh`, which tests inference based on the C++ prediction engine.
## 1. Summary of test conclusions

- Inference:

| Algorithm | Model | device_CPU | device_GPU |
| :----: | :----: | :----: | :----: |
| MobileNetV3 | MobileNetV3_large_x1_0 | Supported | Supported |
| PP-ShiTu | PPShiTu_general_rec, PPShiTu_mainbody_det | Supported | Supported |
| PPHGNet | PPHGNet_small | Supported | Supported |
| PPHGNet | PPHGNet_tiny | Supported | Supported |
| PPLCNet | PPLCNet_x0_25 | Supported | Supported |
| PPLCNet | PPLCNet_x0_35 | Supported | Supported |
| PPLCNet | PPLCNet_x0_5 | Supported | Supported |
| PPLCNet | PPLCNet_x0_75 | Supported | Supported |
| PPLCNet | PPLCNet_x1_0 | Supported | Supported |
| PPLCNet | PPLCNet_x1_5 | Supported | Supported |
| PPLCNet | PPLCNet_x2_0 | Supported | Supported |
| PPLCNet | PPLCNet_x2_5 | Supported | Supported |
| PPLCNetV2 | PPLCNetV2_base | Supported | Supported |
| ResNet | ResNet50 | Supported | Supported |
| ResNet | ResNet50_vd | Supported | Supported |
| SwinTransformer | SwinTransformer_tiny_patch4_window7_224 | Supported | Supported |
## 2. Test procedure (using **ResNet50** as an example)

<details>
<summary><b>Prepare the data, prepare the inference model, build opencv, build (or download) Paddle Inference, build the C++ inference demo (prepare.sh runs all of these steps automatically; click to expand or collapse)
</b></summary>

### 2.1 Prepare data and the inference model

#### 2.1.1 Prepare data

`./deploy/images/ILSVRC2012_val_00000010.jpeg` is used as the test input image by default.

#### 2.1.2 Prepare the inference model

* If you have already trained a model, export the `inference model` following [model export](../../docs/zh_CN/inference_deployment/export_model.md) and set the export path to `./deploy/models/ResNet50_infer`.
After export, the directory is laid out as follows; if you still need to produce the model, see the export sketch right after this block.

```shell
./deploy/models/ResNet50_infer/
├── inference.pdmodel
├── inference.pdiparams
└── inference.pdiparams.info
```
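If the model has not been exported yet, the export step looks roughly like this (a sketch; the checkpoint path under `Global.pretrained_model` is an assumption from a typical training run):

```shell
# Hypothetical checkpoint path; config and keys follow the repo's export convention.
python3.7 tools/export_model.py \
    -c ppcls/configs/ImageNet/ResNet/ResNet50.yaml \
    -o Global.pretrained_model=./output/ResNet50/best_model \
    -o Global.save_inference_dir=./deploy/models/ResNet50_infer
```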
### 2.2 Prepare the environment

#### 2.2.1 Runtime preparation

Set up a suitable build and runtime environment, including the compiler, CUDA and other base libraries; a docker environment is recommended, see [this link](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/docker/linux-docker.html). A possible setup is sketched below.
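A minimal sketch, assuming the `paddlepaddle/paddle` image tag below is available on your mirror (pick whichever tag matches your CUDA/cuDNN stack):

```shell
# Example image tag only; any recent Paddle GPU image works the same way.
docker pull paddlepaddle/paddle:2.2.2-gpu-cuda11.2-cudnn8
nvidia-docker run --name paddle_tipc --network=host -v $PWD:/paddle -it \
    paddlepaddle/paddle:2.2.2-gpu-cuda11.2-cudnn8 /bin/bash
```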
#### 2.2.2 Build the opencv library

* First download the Linux source package from the opencv website; taking version 3.4.7 as an example, download and unpack it as follows:

```
cd deploy/cpp
wget https://github.com/opencv/opencv/archive/3.4.7.tar.gz
tar -xvf 3.4.7.tar.gz
```

* To build opencv, first set the opencv source path (`root_path`) and the install path (`install_path`): `root_path` is the downloaded opencv source directory and `install_path` is where opencv will be installed. In this example the source path is `opencv-3.4.7/` under the current directory.

```shell
cd ./opencv-3.4.7
export root_path=$PWD
export install_path=${root_path}/opencv3
```
* Then run the build inside the opencv source directory with the commands below.

```shell
rm -rf build
mkdir build
cd build

cmake .. \
-DCMAKE_INSTALL_PREFIX=${install_path} \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DWITH_IPP=OFF \
-DBUILD_IPP_IW=OFF \
-DWITH_LAPACK=OFF \
-DWITH_EIGEN=OFF \
-DCMAKE_INSTALL_LIBDIR=lib64 \
-DWITH_ZLIB=ON \
-DBUILD_ZLIB=ON \
-DWITH_JPEG=ON \
-DBUILD_JPEG=ON \
-DWITH_PNG=ON \
-DBUILD_PNG=ON \
-DWITH_TIFF=ON \
-DBUILD_TIFF=ON

make -j
make install
```

* After `make install` finishes, opencv header and library files are generated in that directory; they are used for the later build.

Taking opencv 3.4.7 as an example, the final layout under the install path is shown below. **Note**: the layout may differ for other opencv versions.

```shell
opencv3/
├── bin     : executables
├── include : header files
├── lib64   : library files
└── share   : some third-party libraries
```
#### 2.2.3 Download or build the Paddle inference library

* There are two ways to get the Paddle inference library, described in detail below.

##### Build the inference library from source

* To get the latest inference-library features, clone the latest Paddle code from github and build the library from source.
* Follow the instructions on the [Paddle inference library website](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/guides/05_inference_deployment/inference/build_and_install_lib_cn.html#id16) to fetch the Paddle code from github and build the latest library. Get the code with git as follows.

```shell
git clone https://github.com/PaddlePaddle/Paddle.git
```

* After entering the Paddle directory, build with the following commands.

```shell
rm -rf build
mkdir build
cd build

cmake .. \
    -DWITH_CONTRIB=OFF \
    -DWITH_MKL=ON \
    -DWITH_MKLDNN=ON \
    -DWITH_TESTING=OFF \
    -DCMAKE_BUILD_TYPE=Release \
    -DWITH_INFERENCE_API_TEST=OFF \
    -DON_INFER=ON \
    -DWITH_PYTHON=ON
make -j
make inference_lib_dist
```

More build options are documented on the Paddle C++ inference library website: [https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/guides/05_inference_deployment/inference/build_and_install_lib_cn.html#id16](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/guides/05_inference_deployment/inference/build_and_install_lib_cn.html#id16)

* After the build, the following files and folders appear under `build/paddle_inference_install_dir/`.

```
build/paddle_inference_install_dir/
├── CMakeCache.txt
├── paddle
├── third_party
└── version.txt
```

`paddle` is the Paddle library needed for C++ inference later; `version.txt` records the version of the library.
##### Download directly

* The [Paddle inference library website](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html) provides Linux inference libraries for different cuda versions; pick a suitable version there.

Taking the `manylinux_cuda11.1_cudnn8.1_avx_mkl_trt7_gcc8.2` build as an example, download and unpack it with:

```shell
wget https://paddle-inference-lib.bj.bcebos.com/2.2.2/cxx_c/Linux/GPU/x86-64_gcc8.2_avx_mkl_cuda11.1_cudnn8.1.1_trt7.2.3.4/paddle_inference.tgz
tar -xvf paddle_inference.tgz
```

This creates a `paddle_inference/` subfolder in the current directory with the same contents as the `paddle_inference_install_dir` described above.
#### 2.2.4 Build the C++ inference demo

* Build with the command below; the paths of the Paddle C++ inference library, opencv and the other dependencies must be replaced with the actual paths on your machine.

```shell
# run this under deploy/cpp
bash tools/build.sh
```

Specifically, `tools/build.sh` contains the following.

```shell
OPENCV_DIR=your_opencv_dir
LIB_DIR=your_paddle_inference_dir
CUDA_LIB_DIR=your_cuda_lib_dir
CUDNN_LIB_DIR=your_cudnn_lib_dir
TENSORRT_DIR=your_tensorrt_lib_dir

BUILD_DIR=build
rm -rf ${BUILD_DIR}
mkdir ${BUILD_DIR}
cd ${BUILD_DIR}
cmake .. \
-DPADDLE_LIB=${LIB_DIR} \
-DWITH_MKL=ON \
-DDEMO_NAME=clas_system \
-DWITH_GPU=OFF \
-DWITH_STATIC_LIB=OFF \
-DWITH_TENSORRT=OFF \
-DTENSORRT_DIR=${TENSORRT_DIR} \
-DOPENCV_DIR=${OPENCV_DIR} \
-DCUDNN_LIB=${CUDNN_LIB_DIR} \
-DCUDA_LIB=${CUDA_LIB_DIR} \

make -j
```

In the command above,

* `OPENCV_DIR` is the opencv install path (here, the path of the `opencv-3.4.7/opencv3` folder);
* `LIB_DIR` is the path of the downloaded Paddle inference library (the `paddle_inference` folder) or of the one built from source (the `build/paddle_inference_install_dir` folder);
* `CUDA_LIB_DIR` is the cuda library path, usually `/usr/local/cuda/lib64` inside docker;
* `CUDNN_LIB_DIR` is the cudnn library path, usually `/usr/lib64` inside docker;
* `TENSORRT_DIR` is the tensorrt library path, usually `/usr/local/TensorRT-7.2.3.4/` inside docker; TensorRT is only relevant together with a GPU.

After the build finishes, a `build` folder appears in the current path, containing an executable named `clas_system`; example variable values are sketched right after this section.
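For illustration, filled-in values might look like this (paths are assumptions; verify each one on your machine):

```shell
# Illustrative values only; adjust to where you built/unpacked things.
OPENCV_DIR=$(pwd)/opencv-3.4.7/opencv3   # opencv install path from 2.2.2
LIB_DIR=$(pwd)/paddle_inference          # downloaded Paddle inference library
CUDA_LIB_DIR=/usr/local/cuda/lib64
CUDNN_LIB_DIR=/usr/lib64
TENSORRT_DIR=/usr/local/TensorRT-7.2.3.4
```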
</details>
* Alternatively, the following command completes all of the environment preparation above automatically:
```shell
bash test_tipc/prepare.sh test_tipc/config/ResNet/ResNet50_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt cpp_infer
```
### 2.3 Functional test

The test method is shown below; to test a different model, just switch to that model's parameter config file.

Run:

```shell
bash test_tipc/test_inference_cpp.sh ${your_params_file}
```

Taking the `Linux GPU/CPU C++ inference test` of `ResNet50` as an example, the command is:

```shell
bash test_tipc/test_inference_cpp.sh test_tipc/config/ResNet/ResNet50_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
```

Output like the following indicates that the command ran successfully.

```shell
Run successfully with command - ./deploy/cpp/build/clas_system -c inference_cls.yaml > ./test_tipc/output/ResNet50/cls_cpp_infer_gpu_usetrt_False_precision_fp32_batchsize_1.log 2>&1!
Run successfully with command - ./deploy/cpp/build/clas_system -c inference_cls.yaml > ./test_tipc/output/ResNet50/cls_cpp_infer_cpu_usemkldnn_False_threads_1_precision_fp32_batchsize_1.log 2>&1!
```
The final log prints the prediction result, as shown below.
```log
You are using Paddle compiled with TensorRT, but TensorRT dynamic library is not found. Ignore this if TensorRT is not needed.
=======Paddle Class inference config======
Global:
infer_imgs: ./deploy/images/ILSVRC2012_val_00000010.jpeg
inference_model_dir: ./deploy/models/ResNet50_infer
batch_size: 1
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ""
channel_num: 3
- ToCHWImage: ~
PostProcess:
main_indicator: Topk
Topk:
topk: 5
class_id_map_file: ./ppcls/utils/imagenet1k_label_list.txt
SavePreLabel:
save_dir: ./pre_label/
=======End of Paddle Class inference config======
img_file_list length: 1
Current image path: ./deploy/images/ILSVRC2012_val_00000010.jpeg
Current total inferen time cost: 5449.39 ms.
Top1: class_id: 153, score: 0.4144, label: Maltese dog, Maltese terrier, Maltese
Top2: class_id: 332, score: 0.3909, label: Angora, Angora rabbit
Top3: class_id: 229, score: 0.0514, label: Old English sheepdog, bobtail
Top4: class_id: 204, score: 0.0430, label: Lhasa, Lhasa apso
Top5: class_id: 265, score: 0.0420, label: toy poodle
```
The detailed logs are in `./test_tipc/output/ResNet50/cls_cpp_infer_gpu_usetrt_False_precision_fp32_batchsize_1.log` and `./test_tipc/output/ResNet50/cls_cpp_infer_cpu_usemkldnn_False_threads_1_precision_fp32_batchsize_1.log`.

If a run fails, the failure log and the corresponding command are printed in the terminal as well; start from that command when analyzing the failure.
# Paddle2ONNX inference test

The entry point for the Paddle2ONNX inference test is `test_paddle2onnx.sh`, which tests the Paddle2ONNX model conversion and verifies its correctness.

## 1. Summary of test conclusions

Depending on whether quantization was used during training, the tested models fall into `normal models` and `quantized models`; Paddle2ONNX inference support for the two classes is summarized below:

| Model type | device |
| ---- | ---- |
| normal model | GPU |
| normal model | CPU |

## 2. Test procedure

The following uses the paddle2onnx test of the `ResNet50` model as an example.

### 2.1 Functional test

First run `prepare.sh` to prepare the data and model, then run `test_paddle2onnx.sh` for the test; log files with the `paddle2onnx_infer_*.log` suffix end up in the `test_tipc/output/ResNet50` directory.

The test commands and results for `ResNet50` are shown below.
```shell
bash test_tipc/prepare.sh ./test_tipc/config/ResNet/ResNet50_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt paddle2onnx_infer
# usage:
bash test_tipc/test_paddle2onnx.sh ./test_tipc/config/ResNet/ResNet50_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
```
#### Run results

The status of each test is printed to `./test_tipc/output/ResNet50/results_paddle2onnx.log`:

On success it prints:
```
Run successfully with command - paddle2onnx --model_dir=./deploy/models/ResNet50_infer/ --model_filename=inference.pdmodel --params_filename=inference.pdiparams --save_file=./deploy/models/ResNet50_infer/inference.onnx --opset_version=10 --enable_onnx_checker=True!
Run successfully with command - cd deploy && python3.7 ./python/predict_cls.py -o Global.inference_model_dir=./models/ResNet50_infer -o Global.use_onnx=True -o Global.use_gpu=False -c=configs/inference_cls.yaml > ../test_tipc/output/ResNet50/paddle2onnx_infer_cpu.log 2>&1 && cd ../!
```
On failure it prints:
```
Run failed with command - paddle2onnx --model_dir=./deploy/models/ResNet50_infer/ --model_filename=inference.pdmodel --params_filename=inference.pdiparams --save_file=./deploy/models/ResNet50_infer/inference.onnx --opset_version=10 --enable_onnx_checker=True!
Run failed with command - cd deploy && python3.7 ./python/predict_cls.py -o Global.inference_model_dir=./models/ResNet50_infer -o Global.use_onnx=True -o Global.use_gpu=False -c=configs/inference_cls.yaml > ../test_tipc/output/ResNet50/paddle2onnx_infer_cpu.log 2>&1 && cd ../!
...
```
## 3. More tutorials

This document covers functional testing only; for a more detailed Paddle2ONNX tutorial see [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX)
# Linux GPU/CPU PYTHON serving deployment test

The entry point for the Linux GPU/CPU PYTHON serving deployment test is `test_serving_infer.sh`, which tests Python-based model serving.

## 1. Summary of test conclusions

- Inference:
| Algorithm | Model | device_CPU | device_GPU |
| :----: | :----: | :----: | :----: |
| MobileNetV3 | MobileNetV3_large_x1_0 | Supported | Supported |
| PP-ShiTu | PPShiTu_general_rec, PPShiTu_mainbody_det | Supported | Supported |
| PPHGNet | PPHGNet_small | Supported | Supported |
| PPHGNet | PPHGNet_tiny | Supported | Supported |
| PPLCNet | PPLCNet_x0_25 | Supported | Supported |
| PPLCNet | PPLCNet_x0_35 | Supported | Supported |
| PPLCNet | PPLCNet_x0_5 | Supported | Supported |
| PPLCNet | PPLCNet_x0_75 | Supported | Supported |
| PPLCNet | PPLCNet_x1_0 | Supported | Supported |
| PPLCNet | PPLCNet_x1_5 | Supported | Supported |
| PPLCNet | PPLCNet_x2_0 | Supported | Supported |
| PPLCNet | PPLCNet_x2_5 | Supported | Supported |
| PPLCNetV2 | PPLCNetV2_base | Supported | Supported |
| ResNet | ResNet50 | Supported | Supported |
| ResNet | ResNet50_vd | Supported | Supported |
| SwinTransformer | SwinTransformer_tiny_patch4_window7_224 | Supported | Supported |
## 2. Test procedure

### 2.1 Prepare data

Classification models use `./deploy/paddleserving/daisy.jpg` as the test input image by default; no download is needed.
Recognition models use `drink_dataset_v1.0/test_images/001.jpeg` as the test input image by default; it is downloaded during **2.2 Prepare the environment**.

### 2.2 Prepare the environment

- Install PaddlePaddle: if paddlepaddle 2.2 or above is already installed, skip the commands below.
```shell
# Paddle 2.2 or above is required
# install the GPU build of Paddle
python3.7 -m pip install paddlepaddle-gpu==2.2.0
# install the CPU build of Paddle
python3.7 -m pip install paddlepaddle==2.2.0
```
- Install the dependencies
```shell
python3.7 -m pip install -r requirements.txt
```
- Install the PaddleServing components (serving-server, serving_client, serving-app); the command below also downloads and unpacks the inference models automatically. A manual alternative is sketched after the command.
```bash
bash test_tipc/prepare.sh test_tipc/configs/ResNet50/ResNet50_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt serving_infer
```
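If you would rather install the serving components by hand than let `prepare.sh` do it, a sketch along these lines works (version numbers are examples; match them to your Paddle and CUDA builds):

```shell
# Example versions only; check the Paddle Serving compatibility matrix.
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102  # GPU build (CUDA 10.2)
# python3.7 -m pip install paddle-serving-server==0.7.0            # CPU-only alternative
python3.7 -m pip install paddle-serving-client==0.7.0
python3.7 -m pip install paddle-serving-app==0.7.0
```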
### 2.3 Functional test

The test method is shown below; to test a different model, just switch to that model's parameter config file.
```bash
bash test_tipc/test_serving_infer_python.sh ${your_params_file} serving_infer
```
Taking the `Linux GPU/CPU PYTHON serving deployment test` of `ResNet50` as an example, the command is:
```bash
bash test_tipc/test_serving_infer_python.sh test_tipc/configs/ResNet50/ResNet50_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt serving_infer
```
Output like the following indicates that the command ran successfully.
```
Run successfully with command - python3.7 pipeline_http_client.py > ../../test_tipc/output/ResNet50/server_infer_gpu_pipeline_http_batchsize_1.log 2>&1!
Run successfully with command - python3.7 pipeline_http_client.py > ../../test_tipc/output/ResNet50/server_infer_cpu_pipeline_http_batchsize_1.log 2>&1 !
```
The prediction results are saved to `./test_tipc/output/ResNet50/server_infer_gpu_pipeline_http_batchsize_1.log`, where the PaddleServing output can be inspected:
```
{'err_no': 0, 'err_msg': '', 'key': ['label', 'prob'], 'value': ["['daisy']", '[0.998314619064331]']}
```
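To poke the deployed pipeline by hand instead of through `pipeline_http_client.py`, a request of the following shape can be sent; port `18080` and the `imagenet` route are assumptions taken from the example serving config, so check your `config.yml`:

```shell
# The pipeline HTTP protocol expects base64-encoded image bytes in "value".
img_b64=$(base64 -w 0 ./deploy/paddleserving/daisy.jpg)
curl -X POST http://127.0.0.1:18080/imagenet/prediction \
    -H 'Content-Type: application/json' \
    -d "{\"key\": [\"image\"], \"value\": [\"${img_b64}\"]}"
```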
If a run fails, the failure log and the corresponding command are printed in the terminal as well; start from that command when analyzing the failure.
# Linux GPU/CPU multi-node multi-GPU training and inference test

The entry point for the Linux GPU/CPU multi-node multi-GPU test is `test_train_inference_python.sh`, which tests Python-based multi-node multi-GPU training, evaluation and inference.

## 1. Summary of test conclusions

- Training:
| Algorithm | Model | Multi-node multi-GPU |
| :-------: | :-----------------: | :------------------: |
| PPLCNet | PPLCNet_x1_0 | distributed training |
| PPLCNetV2 | PPLCNetV2_base | distributed training |
| PPHGNet | PPHGNet_small | distributed training |
| PP-ShiTu | PPShiTu_general_rec | distributed training |
- Inference:

| Algorithm | Model | device_CPU | device_GPU | batchsize |
| :-------: | :-----------------: | :--------: | :--------: | :-------: |
| PPLCNet | PPLCNet_x1_0 | Supported | Supported | 1 |
| PPLCNetV2 | PPLCNetV2_base | Supported | Supported | 1 |
| PPHGNet | PPHGNet_small | Supported | Supported | 1 |
| PP-ShiTu | PPShiTu_general_rec | Supported | Supported | 1 |
## 2. Test procedure

Configure the TIPC runtime environment as described in [this document](./install.md).

**The walkthrough below uses the PPLCNet_x1_0 model as an example.**

### 2.1 Functional test

#### 2.1.1 Modify the config file

First, modify the `gpu_list` setting in the config file `test_tipc/config/PPLCNet/PPLCNet_x1_0_train_fleet_infer_python.txt`: assuming the `ip` addresses of the two machines are `192.168.0.1` and `192.168.0.2`, the `gpu_list` field must be changed to `gpu_list:192.168.0.1,192.168.0.2;0,1`.

**The `ip` address can be looked up with `ifconfig`; it is the value after the `inet addr:` field** (see the sketch below).
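A quick sketch for looking the address up (the interface name `eth0` is an assumption; yours may differ):

```shell
# Classic ifconfig output contains a line like "inet addr:192.168.0.1 ...":
ifconfig eth0 | grep 'inet addr'
# On systems without ifconfig, this prints the assigned addresses directly:
hostname -I
```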
#### 2.1.2 Prepare data

Run `prepare.sh` to prepare the data and model; the command is shown below.
```shell
bash test_tipc/prepare.sh test_tipc/config/PPLCNet/PPLCNet_x1_0_train_fleet_infer_python.txt lite_train_lite_infer
```
**Note:** since this is multi-node training, the command above must be run once on every node to prepare the data.
#### 2.1.3 Set the starting port and start the test

On each node, set the starting port for distributed training with the command below (otherwise the later run will hang because no usable port can be found); a value between `10000~20000` is recommended.
```shell
export FLAGS_START_PORT=17000
```
**Note:** the port setting above likewise has to be executed on every node.

Now the test can be started; the command is shown below.
```shell
bash test_tipc/test_train_inference_python.sh test_tipc/config/PPLCNet/PPLCNet_x1_0_train_fleet_infer_python.txt
```
**Note:** since this is multi-node training, the command above must be run on all nodes.
#### 2.1.4 Output results

The output is saved in `test_tipc/output/PPLCNet_x1_0/results_python.log`; a line starting with `Run successfully` means the test command passed, anything else means it failed. The content looks like this:
```bash
Run successfully with command - python3.7 -m paddle.distributed.launch --ips=192.168.0.1,192.168.0.2 --gpus=0,1 tools/train.py -c ppcls/configs/ImageNet/PPLCNet/PPLCNet_x1_0.yaml -o Global.seed=1234 -o DataL
oader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False -o Global.device=gpu -o Global.output_dir=./test_tipc/output/PPLCNet_x1_0/norm_train_gpus_0,
1_autocast_null_nodes_2 -o Global.epochs=2 -o DataLoader.Train.sampler.batch_size=8 !
...
...
Run successfully with command - python3.7 python/predict_cls.py -c configs/inference_cls.yaml -o Global.use_gpu=False -o Global.enable_mkldnn=True -o Global.cpu_num_threads=1 -o Global.inference_model_dir=.././t
est_tipc/output/PPLCNet_x1_0/norm_train_gpus_0,1_autocast_null_nodes_2 -o Global.batch_size=16 -o Global.infer_imgs=../dataset/ILSVRC2012/val -o Global.benchmark=True > .././test_tipc/output/PPLCNet_x1_0/infer_cpu_us
emkldnn_True_threads_1_batchsize_16.log 2>&1 !
```
The config file sets `-o Global.benchmark:True` by default, which enables the benchmark option and produces detailed test data: the runtime environment (OS version, CUDA version, CUDNN version, driver version), the Paddle version, the parameter settings (device, thread count, memory optimization, etc.), the model info (model name, precision), the data info (batchsize, dynamic shape or not) and the performance info (CPU and GPU usage, total runtime, preprocessing/inference/postprocessing time), as shown below:
```log
[2022/06/07 17:01:41] root INFO: ---------------------- Env info ----------------------
[2022/06/07 17:01:41] root INFO: OS_version: CentOS 6.10
[2022/06/07 17:01:41] root INFO: CUDA_version: 10.1.243
[2022/06/07 17:01:41] root INFO: CUDNN_version: None.None.None
[2022/06/07 17:01:41] root INFO: drivier_version: 460.32.03
[2022/06/07 17:01:41] root INFO: ---------------------- Paddle info ----------------------
[2022/06/07 17:01:41] root INFO: paddle_version: 2.3.0-rc0
[2022/06/07 17:01:41] root INFO: paddle_version: 2.3.0-rc0
[2022/06/07 17:01:41] root INFO: paddle_commit: 5d4980c052583fec022812d9c29460aff7cdc18b
[2022/06/07 17:01:41] root INFO: log_api_version: 1.0
[2022/06/07 17:01:41] root INFO: ----------------------- Conf info -----------------------
[2022/06/07 17:01:41] root INFO: runtime_device: cpu
[2022/06/07 17:01:41] root INFO: ir_optim: True
[2022/06/07 17:01:41] root INFO: enable_memory_optim: True
[2022/06/07 17:01:41] root INFO: enable_tensorrt: False
[2022/06/07 17:01:41] root INFO: enable_mkldnn: False
[2022/06/07 17:01:41] root INFO: cpu_math_library_num_threads: 6
[2022/06/07 17:01:41] root INFO: ----------------------- Model info ----------------------
[2022/06/07 17:01:41] root INFO: model_name: cls
[2022/06/07 17:01:41] root INFO: precision: fp32
[2022/06/07 17:01:41] root INFO: ----------------------- Data info -----------------------
[2022/06/07 17:01:41] root INFO: batch_size: 16
[2022/06/07 17:01:41] root INFO: input_shape: [3, 224, 224]
[2022/06/07 17:01:41] root INFO: data_num: 3
[2022/06/07 17:01:41] root INFO: ----------------------- Perf info -----------------------
[2022/06/07 17:01:41] root INFO: cpu_rss(MB): 726.5586, gpu_rss(MB): None, gpu_util: None%
[2022/06/07 17:01:41] root INFO: total time spent(s): 0.3527
[2022/06/07 17:01:41] root INFO: preprocess_time(ms): 33.2723, inference_time(ms): 317.9824, postprocess_time(ms): 1.4579
```
This information can be found in the run log, located at `test_tipc/output/PPLCNet_x1_0/infer_gpu_usetrt_True_precision_True_batchsize_1.log`.

If a run fails, the failure log and the corresponding command are printed in the terminal as well; start from that command when analyzing the failure.

**Note:** in distributed training only the node with `trainer_id=0` saves the model, so model export and inference on the other nodes fail with a missing-model error; this is expected.
......@@ -66,6 +66,10 @@ def main():
"test_images")
config["IndexProcess"]["index_dir"] = os.path.join(args.data_dir,
"index")
config["IndexProcess"]["image_root"] = os.path.join(args.data_dir,
"gallery")
config["IndexProcess"]["data_file"] = os.path.join(args.data_dir,
"drink_label.txt")
assert args.cls_model_dir
assert args.det_model_dir
config["Global"]["det_inference_model_dir"] = args.det_model_dir
......
......@@ -12,7 +12,7 @@ dataline=$(cat ${FILENAME})
IFS=$'\n'
lines=(${dataline})
function func_parser_key() {
strs=$1
IFS=":"
array=(${strs})
......@@ -20,7 +20,7 @@ function func_parser_key(){
echo ${tmp}
}
function func_parser_value() {
strs=$1
IFS=":"
array=(${strs})
......@@ -33,46 +33,84 @@ function func_parser_value(){
fi
}
function func_get_url_file_name() {
strs=$1
IFS="/"
array=(${strs})
tmp=${array[${#array[@]} - 1]}
echo ${tmp}
}
model_name=$(func_parser_value "${lines[1]}")
if [ ${MODE} = "cpp_infer" ];then
if [[ $FILENAME == *infer_cpp_linux_gpu_cpu.txt ]];then
if [[ ${MODE} = "cpp_infer" ]]; then
if [ -d "./deploy/cpp/opencv-3.4.7/opencv3/" ] && [ $(md5sum ./deploy/cpp/opencv-3.4.7.tar.gz | awk -F ' ' '{print $1}') = "faa2b5950f8bee3f03118e600c74746a" ]; then
echo "################### build opencv skipped ###################"
else
echo "################### build opencv ###################"
rm -rf ./deploy/cpp/opencv-3.4.7.tar.gz ./deploy/cpp/opencv-3.4.7/
pushd ./deploy/cpp/
wget -nc https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/opencv-3.4.7.tar.gz
tar -xf opencv-3.4.7.tar.gz
cd opencv-3.4.7/
install_path=$(pwd)/opencv3
rm -rf build
mkdir build
cd build
cmake .. \
-DCMAKE_INSTALL_PREFIX=${install_path} \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DWITH_IPP=OFF \
-DBUILD_IPP_IW=OFF \
-DWITH_LAPACK=OFF \
-DWITH_EIGEN=OFF \
-DCMAKE_INSTALL_LIBDIR=lib64 \
-DWITH_ZLIB=ON \
-DBUILD_ZLIB=ON \
-DWITH_JPEG=ON \
-DBUILD_JPEG=ON \
-DWITH_PNG=ON \
-DBUILD_PNG=ON \
-DWITH_TIFF=ON \
-DBUILD_TIFF=ON
make -j
make install
cd ../../
popd
echo "################### build opencv finished ###################"
fi
if [[ $FILENAME == *infer_cpp_linux_gpu_cpu.txt ]]; then
cpp_type=$(func_parser_value "${lines[2]}")
cls_inference_model_dir=$(func_parser_value "${lines[3]}")
det_inference_model_dir=$(func_parser_value "${lines[4]}")
cls_inference_url=$(func_parser_value "${lines[5]}")
det_inference_url=$(func_parser_value "${lines[6]}")
if [[ $cpp_type == "cls" ]];then
if [[ $cpp_type == "cls" ]]; then
eval "wget -nc $cls_inference_url"
tar xf "${model_name}_inference.tar"
eval "mv inference $cls_inference_model_dir"
tar xf "${model_name}_infer.tar"
cd dataset
rm -rf ILSVRC2012
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/whole_chain_infer.tar
tar xf whole_chain_infer.tar
ln -s whole_chain_infer ILSVRC2012
cd ..
elif [[ $cpp_type == "shitu" ]];then
elif [[ $cpp_type == "shitu" ]]; then
eval "wget -nc $cls_inference_url"
tar_name=$(func_get_url_file_name "$cls_inference_url")
model_dir=${tar_name%.*}
eval "tar xf ${tar_name}"
eval "mv ${model_dir} ${cls_inference_model_dir}"
eval "wget -nc $det_inference_url"
tar_name=$(func_get_url_file_name "$det_inference_url")
model_dir=${tar_name%.*}
eval "tar xf ${tar_name}"
eval "mv ${model_dir} ${det_inference_model_dir}"
cd dataset
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar
tar -xf drink_dataset_v1.0.tar
......@@ -90,12 +128,12 @@ model_name=$(func_parser_value "${lines[1]}")
model_url_value=$(func_parser_value "${lines[35]}")
model_url_key=$(func_parser_key "${lines[35]}")
if [[ $FILENAME == *GeneralRecognition* ]];then
if [[ $FILENAME == *GeneralRecognition* ]]; then
cd dataset
rm -rf Aliproduct
rm -rf train_reg_all_data.txt
rm -rf demo_train
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/tipc_shitu_demo_data.tar
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/tipc_shitu_demo_data.tar --no-check-certificate
tar -xf tipc_shitu_demo_data.tar
ln -s tipc_shitu_demo_data Aliproduct
ln -s tipc_shitu_demo_data/demo_train.txt train_reg_all_data.txt
......@@ -103,21 +141,21 @@ if [[ $FILENAME == *GeneralRecognition* ]];then
cd tipc_shitu_demo_data
ln -s demo_test.txt val_list.txt
cd ../../
eval "wget -nc $model_url_value"
eval "wget -nc $model_url_value --no-check-certificate"
mv general_PPLCNet_x2_5_pretrained_v1.0.pdparams GeneralRecognition_PPLCNet_x2_5_pretrained.pdparams
exit 0
fi
if [[ $FILENAME == *use_dali* ]];then
if [[ $FILENAME == *use_dali* ]]; then
python_name=$(func_parser_value "${lines[2]}")
${python_name} -m pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/nightly --upgrade nvidia-dali-nightly-cuda102
fi
if [ ${MODE} = "lite_train_lite_infer" ] || [ ${MODE} = "lite_train_whole_infer" ];then
if [[ ${MODE} = "lite_train_lite_infer" ]] || [[ ${MODE} = "lite_train_whole_infer" ]]; then
# pretrain lite train data
cd dataset
rm -rf ILSVRC2012
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/whole_chain_little_train.tar
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/whole_chain_little_train.tar --no-check-certificate
tar xf whole_chain_little_train.tar
ln -s whole_chain_little_train ILSVRC2012
cd ILSVRC2012
......@@ -125,7 +163,7 @@ if [ ${MODE} = "lite_train_lite_infer" ] || [ ${MODE} = "lite_train_whole_infer"
mv val.txt val_list.txt
cp -r train/* val/
cd ../../
elif [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ];then
elif [[ ${MODE} = "whole_infer" ]] || [[ ${MODE} = "klquant_whole_infer" ]]; then
# download data
cd dataset
rm -rf ILSVRC2012
......@@ -140,14 +178,14 @@ elif [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ];then
eval "wget -nc $model_url_value"
if [[ $model_url_key == *inference* ]]; then
rm -rf inference
tar xf "${model_name}_inference.tar"
tar xf "${model_name}_infer.tar"
fi
if [[ $model_name == "SwinTransformer_large_patch4_window7_224" || $model_name == "SwinTransformer_large_patch4_window12_384" ]];then
if [[ $model_name == "SwinTransformer_large_patch4_window7_224" || $model_name == "SwinTransformer_large_patch4_window12_384" ]]; then
cmd="mv ${model_name}_22kto1k_pretrained.pdparams ${model_name}_pretrained.pdparams"
eval $cmd
fi
elif [ ${MODE} = "whole_train_whole_infer" ];then
elif [[ ${MODE} = "whole_train_whole_infer" ]]; then
cd dataset
rm -rf ILSVRC2012
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/whole_chain_CIFAR100.tar
......@@ -159,31 +197,51 @@ elif [ ${MODE} = "whole_train_whole_infer" ];then
cd ../../
fi
if [ ${MODE} = "serving_infer" ];then
if [[ ${MODE} = "serving_infer" ]]; then
# prepare serving env
python_name=$(func_parser_value "${lines[2]}")
${python_name} -m pip install install paddle-serving-server-gpu==0.6.1.post101
${python_name} -m pip install paddle_serving_client==0.6.1
${python_name} -m pip install paddle-serving-app==0.6.1
${python_name} -m pip install paddle-serving-server-gpu==0.7.0.post102
${python_name} -m pip install paddle_serving_client==0.7.0
${python_name} -m pip install paddle-serving-app==0.7.0
if [[ ${model_name} =~ "ShiTu" ]]; then
cls_inference_model_url=$(func_parser_value "${lines[3]}")
cls_tar_name=$(func_get_url_file_name "${cls_inference_model_url}")
det_inference_model_url=$(func_parser_value "${lines[4]}")
det_tar_name=$(func_get_url_file_name "${det_inference_model_url}")
cd ./deploy
mkdir models
cd models
wget -nc ${cls_inference_model_url} && tar xf ${cls_tar_name}
wget -nc ${det_inference_model_url} && tar xf ${det_tar_name}
cd ..
else
cls_inference_model_url=$(func_parser_value "${lines[3]}")
cls_tar_name=$(func_get_url_file_name "${cls_inference_model_url}")
cd ./deploy/paddleserving
wget -nc ${cls_inference_model_url} && tar xf ${cls_tar_name}
cd ../../
fi
unset http_proxy
unset https_proxy
cd ./deploy/paddleserving
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
fi
if [ ${MODE} = "paddle2onnx_infer" ];then
if [[ ${MODE} = "paddle2onnx_infer" ]]; then
# prepare paddle2onnx env
python_name=$(func_parser_value "${lines[2]}")
inference_model_url=$(func_parser_value "${lines[10]}")
tar_name=${inference_model_url##*/}
${python_name} -m pip install paddle2onnx
${python_name} -m pip install onnxruntime
# wget model
cd deploy && mkdir models && cd models
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
cd deploy
mkdir models
cd models
wget -nc ${inference_model_url}
tar xf ${tar_name}
cd ../../
fi
if [ ${MODE} = "benchmark_train" ];then
if [[ ${MODE} = "benchmark_train" ]]; then
pip install -r requirements.txt
cd dataset
rm -rf ILSVRC2012
......
......@@ -6,12 +6,11 @@ GPUID=$2
if [[ ! $GPUID ]];then
GPUID=0
fi
dataline=$(awk 'NR==1, NR==16{print}' $FILENAME)
dataline=$(awk 'NR==1, NR==19{print}' $FILENAME)
# parser params
IFS=$'\n'
lines=(${dataline})
# parser cpp inference model
model_name=$(func_parser_value "${lines[1]}")
cpp_infer_type=$(func_parser_value "${lines[2]}")
......@@ -31,7 +30,7 @@ cpp_benchmark_value=$(func_parser_value "${lines[16]}")
generate_yaml_cmd=$(func_parser_value "${lines[17]}")
transform_index_cmd=$(func_parser_value "${lines[18]}")
LOG_PATH="./test_tipc/output"
LOG_PATH="./test_tipc/output/${model_name}"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_cpp.log"
# generate_yaml_cmd="python3 test_tipc/generate_cpp_yaml.py"
......@@ -58,11 +57,10 @@ function func_shitu_cpp_inference(){
precison="int8"
fi
_save_log_path="${_log_path}/shitu_cpp_infer_cpu_usemkldnn_${use_mkldnn}_threads_${threads}_precision_${precision}_batchsize_${batch_size}.log"
eval $transform_index_cmd
command="${generate_yaml_cmd} --type shitu --batch_size ${batch_size} --mkldnn ${use_mkldnn} --gpu ${use_gpu} --cpu_thread ${threads} --tensorrt False --precision ${precision} --data_dir ${_img_dir} --benchmark True --cls_model_dir ${cpp_infer_model_dir} --det_model_dir ${cpp_det_infer_model_dir} --gpu_id ${GPUID}"
eval $command
eval $transform_index_cmd
command="${_script} 2>&1|tee ${_save_log_path}"
command="${_script} > ${_save_log_path} 2>&1"
eval $command
last_status=${PIPESTATUS[0]}
status_check $last_status "${command}" "${status_log}"
......@@ -83,13 +81,13 @@ function func_shitu_cpp_inference(){
fi
for batch_size in ${cpp_batch_size_list[*]}; do
_save_log_path="${_log_path}/shitu_cpp_infer_gpu_usetrt_${use_trt}_precision_${precision}_batchsize_${batch_size}.log"
eval $transform_index_cmd
command="${generate_yaml_cmd} --type shitu --batch_size ${batch_size} --mkldnn False --gpu ${use_gpu} --cpu_thread 1 --tensorrt ${use_trt} --precision ${precision} --data_dir ${_img_dir} --benchmark True --cls_model_dir ${cpp_infer_model_dir} --det_model_dir ${cpp_det_infer_model_dir} --gpu_id ${GPUID}"
eval $command
eval $transform_index_cmd
command="${_script} 2>&1|tee ${_save_log_path}"
command="${_script} > ${_save_log_path} 2>&1"
eval $command
last_status=${PIPESTATUS[0]}
status_check $last_status "${_script}" "${status_log}"
status_check $last_status "${command}" "${status_log}"
done
done
done
......@@ -124,7 +122,7 @@ function func_cls_cpp_inference(){
command="${generate_yaml_cmd} --type cls --batch_size ${batch_size} --mkldnn ${use_mkldnn} --gpu ${use_gpu} --cpu_thread ${threads} --tensorrt False --precision ${precision} --data_dir ${_img_dir} --benchmark True --cls_model_dir ${cpp_infer_model_dir} --gpu_id ${GPUID}"
eval $command
command1="${_script} 2>&1|tee ${_save_log_path}"
command1="${_script} > ${_save_log_path} 2>&1"
eval ${command1}
last_status=${PIPESTATUS[0]}
status_check $last_status "${command1}" "${status_log}"
......@@ -147,7 +145,7 @@ function func_cls_cpp_inference(){
_save_log_path="${_log_path}/cls_cpp_infer_gpu_usetrt_${use_trt}_precision_${precision}_batchsize_${batch_size}.log"
command="${generate_yaml_cmd} --type cls --batch_size ${batch_size} --mkldnn False --gpu ${use_gpu} --cpu_thread 1 --tensorrt ${use_trt} --precision ${precision} --data_dir ${_img_dir} --benchmark True --cls_model_dir ${cpp_infer_model_dir} --gpu_id ${GPUID}"
eval $command
command="${_script} 2>&1|tee ${_save_log_path}"
command="${_script} > ${_save_log_path} 2>&1"
eval $command
last_status=${PIPESTATUS[0]}
status_check $last_status "${command}" "${status_log}"
......@@ -195,49 +193,11 @@ if [[ $cpp_infer_type == "shitu" ]]; then
cd ..
fi
if [ -d "opencv-3.4.7/opencv3/" ] && [ $(md5sum opencv-3.4.7.tar.gz | awk -F ' ' '{print $1}') = "faa2b5950f8bee3f03118e600c74746a" ];then
echo "################### build opencv skipped ###################"
else
echo "################### build opencv ###################"
rm -rf opencv-3.4.7.tar.gz opencv-3.4.7/
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/opencv-3.4.7.tar.gz
tar -xf opencv-3.4.7.tar.gz
cd opencv-3.4.7/
install_path=$(pwd)/opencv3
rm -rf build
mkdir build
cd build
cmake .. \
-DCMAKE_INSTALL_PREFIX=${install_path} \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DWITH_IPP=OFF \
-DBUILD_IPP_IW=OFF \
-DWITH_LAPACK=OFF \
-DWITH_EIGEN=OFF \
-DCMAKE_INSTALL_LIBDIR=lib64 \
-DWITH_ZLIB=ON \
-DBUILD_ZLIB=ON \
-DWITH_JPEG=ON \
-DBUILD_JPEG=ON \
-DWITH_PNG=ON \
-DBUILD_PNG=ON \
-DWITH_TIFF=ON \
-DBUILD_TIFF=ON
make -j
make install
cd ../../
echo "################### build opencv finished ###################"
fi
echo "################### build PaddleClas demo ####################"
OPENCV_DIR=$(pwd)/opencv-3.4.7/opencv3/
# LIB_DIR=/work/project/project/test/paddle_inference/
LIB_DIR=$(pwd)/Paddle/build/paddle_inference_install_dir/
# pwd = /workspace/hesensen/PaddleClas/deploy/cpp_shitu
OPENCV_DIR=$(dirname $PWD)/cpp/opencv-3.4.7/opencv3/
LIB_DIR=$(dirname $PWD)/cpp/paddle_inference/
CUDA_LIB_DIR=$(dirname `find /usr -name libcudart.so`)
CUDNN_LIB_DIR=$(dirname `find /usr -name libcudnn.so`)
......
......@@ -11,7 +11,7 @@ python=$(func_parser_value "${lines[2]}")
# parser params
dataline=$(awk 'NR==1, NR==14{print}' $FILENAME)
dataline=$(awk 'NR==1, NR==16{print}' $FILENAME)
IFS=$'\n'
lines=(${dataline})
......@@ -32,15 +32,17 @@ opset_version_value=$(func_parser_value "${lines[8]}")
enable_onnx_checker_key=$(func_parser_key "${lines[9]}")
enable_onnx_checker_value=$(func_parser_value "${lines[9]}")
# parser onnx inference
inference_py=$(func_parser_value "${lines[10]}")
use_onnx_key=$(func_parser_key "${lines[11]}")
use_onnx_value=$(func_parser_value "${lines[11]}")
inference_model_dir_key=$(func_parser_key "${lines[12]}")
inference_model_dir_value=$(func_parser_value "${lines[12]}")
inference_hardware_key=$(func_parser_key "${lines[13]}")
inference_hardware_value=$(func_parser_value "${lines[13]}")
inference_py=$(func_parser_value "${lines[11]}")
use_onnx_key=$(func_parser_key "${lines[12]}")
use_onnx_value=$(func_parser_value "${lines[12]}")
inference_model_dir_key=$(func_parser_key "${lines[13]}")
inference_model_dir_value=$(func_parser_value "${lines[13]}")
inference_hardware_key=$(func_parser_key "${lines[14]}")
inference_hardware_value=$(func_parser_value "${lines[14]}")
inference_config_key=$(func_parser_key "${lines[15]}")
inference_config_value=$(func_parser_value "${lines[15]}")
LOG_PATH="./test_tipc/output"
LOG_PATH="./test_tipc/output/${model_name}"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_paddle2onnx.log"
......@@ -65,7 +67,8 @@ function func_paddle2onnx(){
set_model_dir=$(func_set_params "${inference_model_dir_key}" "${inference_model_dir_value}")
set_use_onnx=$(func_set_params "${use_onnx_key}" "${use_onnx_value}")
set_hardware=$(func_set_params "${inference_hardware_key}" "${inference_hardware_value}")
infer_model_cmd="cd deploy && ${python} ${inference_py} -o ${set_model_dir} -o ${set_use_onnx} -o ${set_hardware} >${_save_log_path} 2>&1 && cd ../"
set_inference_config=$(func_set_params "${inference_config_key}" "${inference_config_value}")
infer_model_cmd="cd deploy && ${python} ${inference_py} -o ${set_model_dir} -o ${set_use_onnx} -o ${set_hardware} ${set_inference_config} > ${_save_log_path} 2>&1 && cd ../"
eval $infer_model_cmd
last_status=$?
status_check $last_status "${infer_model_cmd}" "${status_log}"
}
......
#!/bin/bash
source test_tipc/common_func.sh
FILENAME=$1
dataline=$(awk 'NR==1, NR==18{print}' $FILENAME)
# parser params
IFS=$'\n'
lines=(${dataline})
# parser serving
model_name=$(func_parser_value "${lines[1]}")
python=$(func_parser_value "${lines[2]}")
trans_model_py=$(func_parser_value "${lines[3]}")
infer_model_dir_key=$(func_parser_key "${lines[4]}")
infer_model_dir_value=$(func_parser_value "${lines[4]}")
model_filename_key=$(func_parser_key "${lines[5]}")
model_filename_value=$(func_parser_value "${lines[5]}")
params_filename_key=$(func_parser_key "${lines[6]}")
params_filename_value=$(func_parser_value "${lines[6]}")
serving_server_key=$(func_parser_key "${lines[7]}")
serving_server_value=$(func_parser_value "${lines[7]}")
serving_client_key=$(func_parser_key "${lines[8]}")
serving_client_value=$(func_parser_value "${lines[8]}")
serving_dir_value=$(func_parser_value "${lines[9]}")
web_service_py=$(func_parser_value "${lines[10]}")
web_use_gpu_key=$(func_parser_key "${lines[11]}")
web_use_gpu_list=$(func_parser_value "${lines[11]}")
web_use_mkldnn_key=$(func_parser_key "${lines[12]}")
web_use_mkldnn_list=$(func_parser_value "${lines[12]}")
web_cpu_threads_key=$(func_parser_key "${lines[13]}")
web_cpu_threads_list=$(func_parser_value "${lines[13]}")
web_use_trt_key=$(func_parser_key "${lines[14]}")
web_use_trt_list=$(func_parser_value "${lines[14]}")
web_precision_key=$(func_parser_key "${lines[15]}")
web_precision_list=$(func_parser_value "${lines[15]}")
pipeline_py=$(func_parser_value "${lines[16]}")
image_dir_key=$(func_parser_key "${lines[17]}")
image_dir_value=$(func_parser_value "${lines[17]}")
LOG_PATH="../../test_tipc/output"
mkdir -p ./test_tipc/output
status_log="${LOG_PATH}/results_serving.log"
function func_serving(){
IFS='|'
_python=$1
_script=$2
_model_dir=$3
# pdserving
set_dirname=$(func_set_params "${infer_model_dir_key}" "${infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${serving_server_key}" "${serving_server_value}")
set_serving_client=$(func_set_params "${serving_client_key}" "${serving_client_value}")
set_image_dir=$(func_set_params "${image_dir_key}" "${image_dir_value}")
trans_model_cmd="${python} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval $trans_model_cmd
cd ${serving_dir_value}
echo $PWD
unset https_proxy
unset http_proxy
for python in ${python[*]}; do
if [ ${python} = "cpp"]; then
for use_gpu in ${web_use_gpu_list[*]}; do
if [ ${use_gpu} = "null" ]; then
web_service_cpp_cmd="${python} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293"
eval $web_service_cpp_cmd
sleep 2s
_save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
last_status=$?
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 2s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
else
web_service_cpp_cmd="${python} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293 --gpu_id=0"
eval $web_service_cpp_cmd
sleep 2s
_save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
last_status=$?
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 2s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
fi
done
else
# python serving
for use_gpu in ${web_use_gpu_list[*]}; do
echo ${use_gpu}
if [ ${use_gpu} = "null" ]; then
for use_mkldnn in ${web_use_mkldnn_list[*]}; do
if [ ${use_mkldnn} = "False" ]; then
continue
fi
for threads in ${web_cpu_threads_list[*]}; do
set_cpu_threads=$(func_set_params "${web_cpu_threads_key}" "${threads}")
web_service_cmd="${python} ${web_service_py} ${web_use_gpu_key}=${use_gpu} ${web_use_mkldnn_key}=${use_mkldnn} ${set_cpu_threads} &"
eval $web_service_cmd
sleep 2s
for pipeline in ${pipeline_py[*]}; do
_save_log_path="${LOG_PATH}/server_infer_cpu_${pipeline%_client*}_usemkldnn_${use_mkldnn}_threads_${threads}_batchsize_1.log"
pipeline_cmd="${python} ${pipeline} ${set_image_dir} > ${_save_log_path} 2>&1 "
eval $pipeline_cmd
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 2s
done
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
done
done
elif [ ${use_gpu} = "0" ]; then
for use_trt in ${web_use_trt_list[*]}; do
for precision in ${web_precision_list[*]}; do
if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
continue
fi
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
continue
fi
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
continue
fi
set_tensorrt=$(func_set_params "${web_use_trt_key}" "${use_trt}")
set_precision=$(func_set_params "${web_precision_key}" "${precision}")
web_service_cmd="${python} ${web_service_py} ${web_use_gpu_key}=${use_gpu} ${set_tensorrt} ${set_precision} & "
eval $web_service_cmd
sleep 2s
for pipeline in ${pipeline_py[*]}; do
_save_log_path="${LOG_PATH}/server_infer_gpu_${pipeline%_client*}_usetrt_${use_trt}_precision_${precision}_batchsize_1.log"
pipeline_cmd="${python} ${pipeline} ${set_image_dir}> ${_save_log_path} 2>&1"
eval $pipeline_cmd
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 2s
done
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
done
done
else
echo "Does not support hardware other than CPU and GPU Currently!"
fi
done
fi
done
}
# set cuda device
GPUID=$2
if [ ${#GPUID} -le 0 ];then
env=" "
else
env="export CUDA_VISIBLE_DEVICES=${GPUID}"
fi
set CUDA_VISIBLE_DEVICES
eval $env
echo "################### run test ###################"
export Count=0
IFS="|"
func_serving "${web_service_cmd}"
#!/bin/bash
source test_tipc/common_func.sh
FILENAME=$1
dataline=$(awk 'NR==1, NR==19{print}' $FILENAME)
# parser params
IFS=$'\n'
lines=(${dataline})
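# return the last "/"-separated component of a URL, i.e. its file name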
function func_get_url_file_name(){
strs=$1
IFS="/"
array=(${strs})
tmp=${array[${#array[@]}-1]}
echo ${tmp}
}
# parser serving
model_name=$(func_parser_value "${lines[1]}")
python=$(func_parser_value "${lines[2]}")
trans_model_py=$(func_parser_value "${lines[4]}")
infer_model_dir_key=$(func_parser_key "${lines[5]}")
infer_model_dir_value=$(func_parser_value "${lines[5]}")
model_filename_key=$(func_parser_key "${lines[6]}")
model_filename_value=$(func_parser_value "${lines[6]}")
params_filename_key=$(func_parser_key "${lines[7]}")
params_filename_value=$(func_parser_value "${lines[7]}")
serving_server_key=$(func_parser_key "${lines[8]}")
serving_server_value=$(func_parser_value "${lines[8]}")
serving_client_key=$(func_parser_key "${lines[9]}")
serving_client_value=$(func_parser_value "${lines[9]}")
serving_dir_value=$(func_parser_value "${lines[10]}")
web_service_py=$(func_parser_value "${lines[11]}")
web_use_gpu_key=$(func_parser_key "${lines[12]}")
web_use_gpu_list=$(func_parser_value "${lines[12]}")
pipeline_py=$(func_parser_value "${lines[13]}")
function func_serving_cls(){
LOG_PATH="../../test_tipc/output/${model_name}"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_serving.log"
IFS='|'
# pdserving
set_dirname=$(func_set_params "${infer_model_dir_key}" "${infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${serving_server_key}" "${serving_server_value}")
set_serving_client=$(func_set_params "${serving_client_key}" "${serving_client_value}")
trans_model_cmd="${python} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval $trans_model_cmd
# modify the alias_name of fetch_var to "outputs"
server_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"prediction\"/' ${serving_server_value}/serving_server_conf.prototxt"
eval ${server_fetch_var_line_cmd}
client_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"prediction\"/' ${serving_client_value}/serving_client_conf.prototxt"
eval ${client_fetch_var_line_cmd}
prototxt_dataline=$(awk 'NR==1, NR==3{print}' ${serving_server_value}/serving_server_conf.prototxt)
IFS=$'\n'
prototxt_lines=(${prototxt_dataline})
feed_var_name=$(func_parser_value "${prototxt_lines[2]}")
IFS='|'
cd ${serving_dir_value}
unset https_proxy
unset http_proxy
# modify the input_name in "classification_web_service.py" to be consistent with feed_var.name in prototxt
set_web_service_feed_var_cmd="sed -i '/preprocess/,/input_imgs}/s/{.*: input_imgs}/{${feed_var_name}: input_imgs}/' ${web_service_py}"
eval ${set_web_service_feed_var_cmd}
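# 21 is the line number of the model_config field in config.yml;
# the sed command below patches the serving config in place by line number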
model_config=21
serving_server_dir_name=$(func_get_url_file_name "$serving_server_value")
set_model_config_cmd="sed -i '${model_config}s/model_config: .*/model_config: ${serving_server_dir_name}/' config.yml"
eval ${set_model_config_cmd}
for python in ${python[*]}; do
if [[ ${python} = "cpp" ]]; then
for use_gpu in ${web_use_gpu_list[*]}; do
if [ ${use_gpu} = "null" ]; then
web_service_cpp_cmd="${python} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293"
eval $web_service_cpp_cmd
sleep 5s
_save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
last_status=$?
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
else
web_service_cpp_cmd="${python} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293 --gpu_id=0"
eval $web_service_cpp_cmd
sleep 5s
_save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
last_status=$?
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
fi
done
else
# python serving
for use_gpu in ${web_use_gpu_list[*]}; do
if [[ ${use_gpu} = "null" ]]; then
device_type_line=24
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 0/' config.yml"
eval $set_device_type_cmd
devices_line=27
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} &"
eval $web_service_cmd
sleep 5s
for pipeline in ${pipeline_py[*]}; do
_save_log_path="${LOG_PATH}/server_infer_cpu_${pipeline%_client*}_batchsize_1.log"
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1 "
eval $pipeline_cmd
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 5s
done
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
elif [ ${use_gpu} -eq 0 ]; then
if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
continue
fi
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
continue
fi
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
continue
fi
device_type_line=24
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 1/' config.yml"
eval $set_device_type_cmd
devices_line=27
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"${use_gpu}\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} & "
eval $web_service_cmd
sleep 5s
for pipeline in ${pipeline_py[*]}; do
_save_log_path="${LOG_PATH}/server_infer_gpu_${pipeline%_client*}_batchsize_1.log"
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1"
eval $pipeline_cmd
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 5s
done
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
else
echo "Does not support hardware [${use_gpu}] other than CPU and GPU Currently!"
fi
done
fi
done
}
function func_serving_rec(){
LOG_PATH="../../../test_tipc/output/${model_name}"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_serving.log"
trans_model_py=$(func_parser_value "${lines[5]}")
cls_infer_model_dir_key=$(func_parser_key "${lines[6]}")
cls_infer_model_dir_value=$(func_parser_value "${lines[6]}")
det_infer_model_dir_key=$(func_parser_key "${lines[7]}")
det_infer_model_dir_value=$(func_parser_value "${lines[7]}")
model_filename_key=$(func_parser_key "${lines[8]}")
model_filename_value=$(func_parser_value "${lines[8]}")
params_filename_key=$(func_parser_key "${lines[9]}")
params_filename_value=$(func_parser_value "${lines[9]}")
cls_serving_server_key=$(func_parser_key "${lines[10]}")
cls_serving_server_value=$(func_parser_value "${lines[10]}")
cls_serving_client_key=$(func_parser_key "${lines[11]}")
cls_serving_client_value=$(func_parser_value "${lines[11]}")
det_serving_server_key=$(func_parser_key "${lines[12]}")
det_serving_server_value=$(func_parser_value "${lines[12]}")
det_serving_client_key=$(func_parser_key "${lines[13]}")
det_serving_client_value=$(func_parser_value "${lines[13]}")
serving_dir_value=$(func_parser_value "${lines[14]}")
web_service_py=$(func_parser_value "${lines[15]}")
web_use_gpu_key=$(func_parser_key "${lines[16]}")
web_use_gpu_list=$(func_parser_value "${lines[16]}")
pipeline_py=$(func_parser_value "${lines[17]}")
IFS='|'
# pdserving
cd ./deploy
set_dirname=$(func_set_params "${cls_infer_model_dir_key}" "${cls_infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${cls_serving_server_key}" "${cls_serving_server_value}")
set_serving_client=$(func_set_params "${cls_serving_client_key}" "${cls_serving_client_value}")
cls_trans_model_cmd="${python} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval $cls_trans_model_cmd
set_dirname=$(func_set_params "${det_infer_model_dir_key}" "${det_infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${det_serving_server_key}" "${det_serving_server_value}")
set_serving_client=$(func_set_params "${det_serving_client_key}" "${det_serving_client_value}")
det_trans_model_cmd="${python} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval $det_trans_model_cmd
# modify the alias_name of fetch_var to "outputs"
server_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"features\"/' $cls_serving_server_value/serving_server_conf.prototxt"
eval ${server_fetch_var_line_cmd}
client_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"features\"/' $cls_serving_client_value/serving_client_conf.prototxt"
eval ${client_fetch_var_line_cmd}
prototxt_dataline=$(awk 'NR==1, NR==3{print}' ${cls_serving_server_value}/serving_server_conf.prototxt)
IFS=$'\n'
prototxt_lines=(${prototxt_dataline})
feed_var_name=$(func_parser_value "${prototxt_lines[2]}")
IFS='|'
cd ${serving_dir_value}
unset https_proxy
unset http_proxy
# modify the input_name in "recognition_web_service.py" to be consistent with feed_var.name in prototxt
set_web_service_feed_var_cmd="sed -i '/preprocess/,/input_imgs}/s/{.*: input_imgs}/{${feed_var_name}: input_imgs}/' ${web_service_py}"
eval ${set_web_service_feed_var_cmd}
for python in ${python[*]}; do
if [[ ${python} = "cpp" ]]; then
for use_gpu in ${web_use_gpu_list[*]}; do
if [ ${use_gpu} = "null" ]; then
web_service_cpp_cmd="${python} web_service_py"
eval $web_service_cmd
sleep 5s
_save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
last_status=$?
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
else
web_service_cpp_cmd="${python} web_service_py"
eval $web_service_cmd
sleep 5s
_save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
last_status=$?
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
fi
done
else
# python serving
for use_gpu in ${web_use_gpu_list[*]}; do
if [[ ${use_gpu} = "null" ]]; then
device_type_line=24
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 0/' config.yml"
eval $set_device_type_cmd
devices_line=27
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} &"
eval $web_service_cmd
sleep 5s
for pipeline in ${pipeline_py[*]}; do
_save_log_path="${LOG_PATH}/server_infer_cpu_${pipeline%_client*}_batchsize_1.log"
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1 "
eval $pipeline_cmd
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 5s
done
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
elif [ ${use_gpu} -eq 0 ]; then
if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
continue
fi
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
continue
fi
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
continue
fi
device_type_line=24
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 1/' config.yml"
eval $set_device_type_cmd
devices_line=27
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"${use_gpu}\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} & "
eval $web_service_cmd
sleep 10s
for pipeline in ${pipeline_py[*]}; do
_save_log_path="${LOG_PATH}/server_infer_gpu_${pipeline%_client*}_batchsize_1.log"
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1"
eval $pipeline_cmd
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}"
sleep 10s
done
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
else
echo "Does not support hardware [${use_gpu}] other than CPU and GPU Currently!"
fi
done
fi
done
}
# set cuda device
GPUID=$2
if [ ${#GPUID} -le 0 ];then
env=" "
else
env="export CUDA_VISIBLE_DEVICES=${GPUID}"
fi
set CUDA_VISIBLE_DEVICES
eval $env
echo "################### run test ###################"
export Count=0
IFS="|"
if [[ ${model_name} =~ "ShiTu" ]]; then
func_serving_rec
else
func_serving_cls
fi
......@@ -90,7 +90,7 @@ infer_value1=$(func_parser_value "${lines[50]}")
if [ ! $epoch_num ]; then
epoch_num=2
fi
if [ $MODE = 'benchmark_train' ]; then
if [[ $MODE = 'benchmark_train' ]]; then
epoch_num=1
fi
......@@ -161,7 +161,7 @@ function func_inference(){
done
}
if [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ]; then
if [[ ${MODE} = "whole_infer" ]] || [[ ${MODE} = "klquant_whole_infer" ]]; then
IFS="|"
infer_export_flag=(${infer_export_flag})
if [ ${infer_export_flag} != "null" ] && [ ${infer_export_flag} != "False" ]; then
......@@ -171,7 +171,7 @@ if [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ]; then
fi
fi
if [ ${MODE} = "whole_infer" ]; then
if [[ ${MODE} = "whole_infer" ]]; then
GPUID=$3
if [ ${#GPUID} -le 0 ];then
env=" "
......@@ -191,7 +191,7 @@ if [ ${MODE} = "whole_infer" ]; then
done
cd ..
elif [ ${MODE} = "klquant_whole_infer" ]; then
elif [[ ${MODE} = "klquant_whole_infer" ]]; then
# for kl_quant
if [ ${kl_quant_cmd_value} != "null" ] && [ ${kl_quant_cmd_value} != "False" ]; then
echo "kl_quant"
......@@ -270,7 +270,9 @@ else
set_batchsize=$(func_set_params "${train_batch_key}" "${train_batch_value}")
set_train_params1=$(func_set_params "${train_param_key1}" "${train_param_value1}")
set_use_gpu=$(func_set_params "${train_use_gpu_key}" "${train_use_gpu_value}")
if [ ${#ips} -le 26 ];then
if [ ${#ips} -le 15 ];then
# if length of ips >= 15, then it is seen as multi-machine
# 15 is the min length of ips info for multi-machine: 0.0.0.0,0.0.0.0
save_log="${LOG_PATH}/${trainer}_gpus_${gpu}_autocast_${autocast}"
nodes=1
else
......@@ -289,7 +291,7 @@ else
set_save_model=$(func_set_params "${save_model_key}" "${save_log}")
if [ ${#gpu} -le 2 ];then # train with cpu or single gpu
cmd="${python} ${run_train} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_autocast} ${set_batchsize} ${set_train_params1} "
elif [ ${#ips} -le 26 ];then # train with multi-gpu
elif [ ${#ips} -le 15 ];then # train with multi-gpu
cmd="${python} -m paddle.distributed.launch --gpus=${gpu} ${run_train} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_autocast} ${set_batchsize} ${set_train_params1}"
else # train with multi-machine
cmd="${python} -m paddle.distributed.launch --ips=${ips} --gpus=${gpu} ${run_train} ${set_use_gpu} ${set_save_model} ${set_pretrain} ${set_epoch} ${set_autocast} ${set_batchsize} ${set_train_params1}"
......
......@@ -20,8 +20,13 @@ def get_result(log_dir):
return res
def search_train(search_list, base_program, base_output_dir, search_key,
config_replace_value, model_name, search_times=1):
def search_train(search_list,
base_program,
base_output_dir,
search_key,
config_replace_value,
model_name,
search_times=1):
best_res = 0.
best = search_list[0]
all_result = {}
......@@ -33,7 +38,8 @@ def search_train(search_list, base_program, base_output_dir, search_key,
model_name = search_i
res_list = []
for j in range(search_times):
output_dir = "{}/{}_{}_{}".format(base_output_dir, search_key, search_i, j).replace(".", "_")
output_dir = "{}/{}_{}_{}".format(base_output_dir, search_key,
search_i, j).replace(".", "_")
program += ["-o", "Global.output_dir={}".format(output_dir)]
process = subprocess.Popen(program)
process.communicate()
......@@ -50,14 +56,17 @@ def search_train(search_list, base_program, base_output_dir, search_key,
def search_strategy():
args = config.parse_args()
configs = config.get_config(args.config, overrides=args.override, show=False)
configs = config.get_config(
args.config, overrides=args.override, show=False)
base_config_file = configs["base_config_file"]
distill_config_file = configs["distill_config_file"]
model_name = config.get_config(base_config_file)["Arch"]["name"]
gpus = configs["gpus"]
gpus = ",".join([str(i) for i in gpus])
base_program = ["python3.7", "-m", "paddle.distributed.launch", "--gpus={}".format(gpus),
"tools/train.py", "-c", base_config_file]
base_program = [
"python3.7", "-m", "paddle.distributed.launch",
"--gpus={}".format(gpus), "tools/train.py", "-c", base_config_file
]
base_output_dir = configs["output_dir"]
search_times = configs["search_times"]
search_dict = configs.get("search_dict")
......@@ -67,14 +76,22 @@ def search_strategy():
search_values = search_i["search_values"]
replace_config = search_i["replace_config"]
res = search_train(search_values, base_program, base_output_dir,
search_key, replace_config, model_name, search_times)
search_key, replace_config, model_name,
search_times)
all_results[search_key] = res
best = res.get("best")
for v in replace_config:
base_program += ["-o", "{}={}".format(v, best)]
teacher_configs = configs.get("teacher", None)
if teacher_configs is not None:
if teacher_configs is None:
print(all_results, base_program)
return
algo = teacher_configs.get("algorithm", "skl-ugi")
supported_list = ["skl-ugi", "udml"]
assert algo in supported_list, f"algorithm must be in {supported_list} but got {algo}"
if algo == "skl-ugi":
teacher_program = base_program.copy()
# remove incompatible keys
teacher_rm_keys = teacher_configs["rm_keys"]
......@@ -85,20 +102,32 @@ def search_strategy():
rm_indices.append(ind)
for rm_index in rm_indices[::-1]:
teacher_program.pop(rm_index)
teacher_program.pop(rm_index-1)
teacher_program.pop(rm_index - 1)
replace_config = ["Arch.name"]
teacher_list = teacher_configs["search_values"]
res = search_train(teacher_list, teacher_program, base_output_dir, "teacher", replace_config, model_name)
res = search_train(teacher_list, teacher_program, base_output_dir,
"teacher", replace_config, model_name)
all_results["teacher"] = res
best = res.get("best")
t_pretrained = "{}/{}_{}_0/{}/best_model".format(base_output_dir, "teacher", best, best)
base_program += ["-o", "Arch.models.0.Teacher.name={}".format(best),
"-o", "Arch.models.0.Teacher.pretrained={}".format(t_pretrained)]
t_pretrained = "{}/{}_{}_0/{}/best_model".format(base_output_dir,
"teacher", best, best)
base_program += [
"-o", "Arch.models.0.Teacher.name={}".format(best), "-o",
"Arch.models.0.Teacher.pretrained={}".format(t_pretrained)
]
elif algo == "udml":
if "lr_mult_list" in all_results:
base_program += [
"-o", "Arch.models.0.Teacher.lr_mult_list={}".format(
all_results["lr_mult_list"]["best"])
]
output_dir = "{}/search_res".format(base_output_dir)
base_program += ["-o", "Global.output_dir={}".format(output_dir)]
final_replace = configs.get('final_replace')
for i in range(len(base_program)):
base_program[i] = base_program[i].replace(base_config_file, distill_config_file)
base_program[i] = base_program[i].replace(base_config_file,
distill_config_file)
for k in final_replace:
v = final_replace[k]
base_program[i] = base_program[i].replace(k, v)
......