提交 421a2fc4 编写于 作者: H HydrogenSulfate

Merge branch 'develop' into polish_serving_docs

......@@ -12,3 +12,4 @@ build/
log/
nohup.out
.DS_Store
.idea
include LICENSE.txt
include README.md
include docs/en/whl_en.md
recursive-include deploy/python predict_cls.py preprocess.py postprocess.py det_preprocess.py
recursive-include deploy/python *.py
recursive-include deploy/configs *.yaml
recursive-include deploy/utils get_image_list.py config.py logger.py predictor.py
recursive-include ppcls/ *.py *.txt
\ No newline at end of file
......@@ -4,40 +4,46 @@
## 简介
飞桨图像识别套件PaddleClas是飞桨为工业界和学术界所准备的一个图像识别任务的工具集,助力使用者训练出更好的视觉模型和应用落地。
飞桨图像识别套件PaddleClas是飞桨为工业界和学术界所准备的一个图像识别和图像分类任务的工具集,助力使用者训练出更好的视觉模型和应用落地。
**近期更新**
- 🔥️ 2022.5.26 [飞桨产业实践范例直播课](http://aglc.cn/v-c4FAR),解读**超轻量重点区域人员出入管理方案**,欢迎报名来交流。
<div align="center">
<img src="https://user-images.githubusercontent.com/80816848/170166458-767a01ca-1429-437f-a628-dd184732ef53.png" width = "150" />
</div>
- 2022.5.23 新增[人员出入管理范例库](https://aistudio.baidu.com/aistudio/projectdetail/4094475),具体内容可以在 AI Stuio 上体验。
- 2022.5.20 上线[PP-HGNet](./docs/zh_CN/models/PP-HGNet.md), [PP-LCNet v2](./docs/zh_CN/models/PP-LCNetV2.md)
- 2022.4.21 新增 CVPR2022 oral论文 [MixFormer](https://arxiv.org/pdf/2204.02557.pdf) 相关[代码](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files)
- 2022.1.27 全面升级文档;新增[PaddleServing C++ pipeline部署方式](./deploy/paddleserving)[18M图像识别安卓部署Demo](./deploy/lite_shitu)
- 2021.11.1 发布[PP-ShiTu技术报告](https://arxiv.org/pdf/2111.00775.pdf),新增饮料识别demo
- 2021.10.23 发布轻量级图像识别系统PP-ShiTu,CPU上0.2s即可完成在10w+库的图像识别。
[点击这里](./docs/zh_CN/quick_start/quick_start_recognition.md)立即体验
- 2021.09.17 发布PP-LCNet系列超轻量骨干网络模型, 在Intel CPU上,单张图像预测速度约5ms,ImageNet-1K数据集上Top1识别准确率达到80.82%,超越ResNet152的模型效果。PP-LCNet的介绍可以参考[论文](https://arxiv.org/pdf/2109.15099.pdf), 或者[PP-LCNet模型介绍](docs/zh_CN/models/PP-LCNet.md),相关指标和预训练权重可以从 [这里](docs/zh_CN/algorithm_introduction/ImageNet_models.md)下载。
- [more](./docs/zh_CN/others/update_history.md)
## 特性
- PP-ShiTu轻量图像识别系统:集成了目标检测、特征学习、图像检索等模块,广泛适用于各类图像识别任务。cpu上0.2s即可完成在10w+库的图像识别。
<div align="center">
<img src="./docs/images/class_simple.gif" width = "600" />
<p>PULC实用图像分类模型效果展示</p>
</div>
&nbsp;
- PP-LCNet轻量级CPU骨干网络:专门为CPU设备打造轻量级骨干网络,速度、精度均远超竞品。
- 丰富的预训练模型库:提供了36个系列共175个ImageNet预训练模型,其中7个精选系列模型支持结构快速修改。
<div align="center">
<img src="./docs/images/recognition.gif" width = "400" />
<p>PP-ShiTu图像识别系统效果展示</p>
</div>
- 全面易用的特征学习组件:集成arcmargin, triplet loss等12度量学习方法,通过配置文件即可随意组合切换。
- SSLD知识蒸馏:14个分类预训练模型,精度普遍提升3%以上;其中ResNet50_vd模型在ImageNet-1k数据集上的Top-1精度达到了84.0%,
Res2Net200_vd预训练模型Top-1精度高达85.1%。
## 近期更新
- 📢将于**<u>6月15-6月17日晚20:30</u>**进行为期三天的课程直播,详细介绍超轻量图像分类方案,对各场景模型优化原理及使用方式进行拆解,之后还有产业案例全流程实操,对各类痛难点解决方案进行手把手教学,加上现场互动答疑,抓紧扫码上车吧!
<div align="center">
<img src="./docs/images/recognition.gif" width = "400" />
<img src="https://user-images.githubusercontent.com/45199522/173483779-2332f990-4941-4f8d-baee-69b62035fc31.png" width = "200" height = "200"/>
</div>
- 🔥️ 2022.6.15 发布[PULC超轻量图像分类实用方案](docs/zh_CN/PULC/PULC_train.md),CPU推理3ms,精度比肩SwinTransformer,覆盖人、车、OCR场景九大常见任务。
- 2022.5.26 [飞桨产业实践范例直播课](http://aglc.cn/v-c4FAR),解读**超轻量重点区域人员出入管理方案**
- 2022.5.23 新增[人员出入管理范例库](https://aistudio.baidu.com/aistudio/projectdetail/4094475),具体内容可以在 AI Stuio 上体验。
- 2022.5.20 上线[PP-HGNet](./docs/zh_CN/models/PP-HGNet.md), [PP-LCNetv2](./docs/zh_CN/models/PP-LCNetV2.md)
- 2022.4.21 新增 CVPR2022 oral论文 [MixFormer](https://arxiv.org/pdf/2204.02557.pdf) 相关[代码](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files)
- [more](./docs/zh_CN/others/update_history.md)
## 特性
PaddleClas发布了[PP-HGNet](docs/zh_CN/models/PP-HGNet.md)[PP-LCNetv2](docs/zh_CN/models/PP-LCNetV2.md)[PP-LCNet](docs/zh_CN/models/PP-LCNet.md)[SSLD半监督知识蒸馏方案](docs/zh_CN/advanced_tutorials/ssld.md)等算法,
并支持多种图像分类、识别相关算法,在此基础上打造[PULC超轻量图像分类方案](docs/zh_CN/PULC/PULC_quickstart.md)[PP-ShiTu图像识别系统](./docs/zh_CN/quick_start/quick_start_recognition.md)
![](https://user-images.githubusercontent.com/19523330/173273046-239a42da-c88d-4c2c-94b1-2134557afa49.png)
## 欢迎加入技术交流群
......@@ -50,60 +56,78 @@ Res2Net200_vd预训练模型Top-1精度高达85.1%。
## 快速体验
PULC超轻量图像分类方案快速体验:[点击这里](docs/zh_CN/PULC/PULC_quickstart.md)
PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick_start_recognition.md)
## 文档教程
- 安装说明
- [安装Paddle](./docs/zh_CN/installation/install_paddle.md)
- [安装PaddleClas](./docs/zh_CN/installation/install_paddleclas.md)
- 快速体验
- [PP-ShiTu图像识别快速体验](./docs/zh_CN/quick_start/quick_start_recognition.md)
- 图像分类快速体验
- [尝鲜版](./docs/zh_CN/quick_start/quick_start_classification_new_user.md)
- [进阶版](./docs/zh_CN/quick_start/quick_start_classification_professional.md)
- [多标签分类](./docs/zh_CN/quick_start/quick_start_multilabel_classification.md)
- [环境准备](docs/zh_CN/installation/install_paddleclas.md)
- [PULC超轻量图像分类实用方案](docs/zh_CN/PULC/PULC_train.md)
- [超轻量图像分类快速体验](docs/zh_CN/PULC/PULC_quickstart.md)
- [超轻量图像分类模型库](docs/zh_CN/PULC/PULC_model_list.md)
- [PULC有人/无人分类模型](docs/zh_CN/PULC/PULC_person_exists.md)
- [PULC人体属性识别模型](docs/zh_CN/PULC/PULC_person_attribute.md)
- [PULC佩戴安全帽分类模型](docs/zh_CN/PULC/PULC_safety_helmet.md)
- [PULC交通标志分类模型](docs/zh_CN/PULC/PULC_traffic_sign.md)
- [PULC车辆属性识别模型](docs/zh_CN/PULC/PULC_vehicle_attribute.md)
- [PULC有车/无车分类模型](docs/zh_CN/PULC/PULC_car_exists.md)
- [PULC含文字图像方向分类模型](docs/zh_CN/PULC/PULC_text_image_orientation.md)
- [PULC文本行方向分类模型](docs/zh_CN/PULC/PULC_textline_orientation.md)
- [PULC语种分类模型](docs/zh_CN/PULC/PULC_language_classification.md)
- [模型训练](docs/zh_CN/PULC/PULC_train.md)
- 推理部署
- [基于python预测引擎推理](docs/zh_CN/inference_deployment/python_deploy.md#1)
- [基于C++预测引擎推理](docs/zh_CN/inference_deployment/cpp_deploy.md)
- [服务化部署](docs/zh_CN/inference_deployment/paddle_serving_deploy.md)
- [端侧部署](docs/zh_CN/inference_deployment/paddle_lite_deploy.md)
- [Paddle2ONNX模型转化与预测](deploy/paddle2onnx/readme.md)
- [模型压缩](deploy/slim/README.md)
- [PP-ShiTu图像识别系统介绍](#图像识别系统介绍)
- [图像识别快速体验](docs/zh_CN/quick_start/quick_start_recognition.md)
- 模块介绍
- [主体检测](./docs/zh_CN/image_recognition_pipeline/mainbody_detection.md)
- [特征提取](./docs/zh_CN/image_recognition_pipeline/feature_extraction.md)
- [特征提取模型](./docs/zh_CN/image_recognition_pipeline/feature_extraction.md)
- [向量检索](./docs/zh_CN/image_recognition_pipeline/vector_search.md)
- [骨干网络和预训练模型库](./docs/zh_CN/algorithm_introduction/ImageNet_models.md)
- 数据准备
- [图像分类数据集介绍](./docs/zh_CN/data_preparation/classification_dataset.md)
- [图像识别数据集介绍](./docs/zh_CN/data_preparation/recognition_dataset.md)
- 模型训练
- [图像分类任务](./docs/zh_CN/models_training/classification.md)
- [图像识别任务](./docs/zh_CN/models_training/recognition.md)
- [训练参数调整策略](./docs/zh_CN/models_training/train_strategy.md)
- [配置文件说明](./docs/zh_CN/models_training/config_description.md)
- 模型预测部署
- [模型导出](./docs/zh_CN/inference_deployment/export_model.md)
- Python/C++ 预测引擎
- [基于Python预测引擎预测推理](./docs/zh_CN/inference_deployment/python_deploy.md)
- [基于C++分类预测引擎预测推理](./docs/zh_CN/inference_deployment/cpp_deploy.md)[基于C++的PP-ShiTu预测引擎预测推理](deploy/cpp_shitu/readme.md)
- 服务化部署
- [Paddle Serving服务化部署(推荐)](./docs/zh_CN/inference_deployment/paddle_serving_deploy.md)
- [Hub serving服务化部署](./docs/zh_CN/inference_deployment/paddle_hub_serving_deploy.md)
- [端侧部署](./deploy/lite/readme.md)
- [whl包预测](./docs/zh_CN/inference_deployment/whl_deploy.md)
- 算法介绍
- [图像分类任务介绍](./docs/zh_CN/algorithm_introduction/image_classification.md)
- [度量学习介绍](./docs/zh_CN/algorithm_introduction/metric_learning.md)
- 高阶使用
- [数据增广](./docs/zh_CN/advanced_tutorials/DataAugmentation.md)
- [模型量化](./docs/zh_CN/advanced_tutorials/model_prune_quantization.md)
- [知识蒸馏](./docs/zh_CN/advanced_tutorials/knowledge_distillation.md)
- [PaddleClas结构解析](./docs/zh_CN/advanced_tutorials/code_overview.md)
- [社区贡献指南](./docs/zh_CN/advanced_tutorials/how_to_contribute.md)
- [哈希编码](docs/zh_CN/image_recognition_pipeline/)
- [模型训练](docs/zh_CN/models_training/recognition.md)
- 推理部署
- [基于python预测引擎推理](docs/zh_CN/inference_deployment/python_deploy.md#2)
- [基于C++预测引擎推理](deploy/cpp_shitu/readme.md)
- [服务化部署](docs/zh_CN/inference_deployment/paddle_serving_deploy.md)
- [端侧部署](deploy/lite_shitu/README.md)
- PP系列骨干网络模型
- [PP-HGNet](docs/zh_CN/models/PP-HGNet.md)
- [PP-LCNetv2](docs/zh_CN/models/PP-LCNetV2.md)
- [PP-LCNet](docs/zh_CN/models/PP-LCNet.md)
- [SSLD半监督知识蒸馏方案](docs/zh_CN/advanced_tutorials/ssld.md)
- 前沿算法
- [骨干网络和预训练模型库](docs/zh_CN/algorithm_introduction/ImageNet_models.md)
- [度量学习](docs/zh_CN/algorithm_introduction/metric_learning.md)
- [模型压缩](docs/zh_CN/algorithm_introduction/model_prune_quantization.md)
- [模型蒸馏](docs/zh_CN/algorithm_introduction/knowledge_distillation.md)
- [数据增强](docs/zh_CN/advanced_tutorials/DataAugmentation.md)
- [产业实用范例库](docs/zh_CN/samples)
- [30分钟快速体验图像分类](docs/zh_CN/quick_start/quick_start_classification_new_user.md)
- FAQ
- [图像识别精选问题](docs/zh_CN/faq_series/faq_2021_s2.md)
- [图像分类精选问题](docs/zh_CN/faq_series/faq_selected_30.md)
- [图像分类FAQ第一季](docs/zh_CN/faq_series/faq_2020_s1.md)
- [图像分类FAQ第二季](docs/zh_CN/faq_series/faq_2021_s1.md)
- [图像识别精选问题](docs/zh_CN/faq_series/faq_2021_s2.md)
- [图像分类精选问题](docs/zh_CN/faq_series/faq_selected_30.md)
- [图像分类FAQ第一季](docs/zh_CN/faq_series/faq_2020_s1.md)
- [图像分类FAQ第二季](docs/zh_CN/faq_series/faq_2021_s1.md)
- [社区贡献指南](./docs/zh_CN/advanced_tutorials/how_to_contribute.md)
- [许可证书](#许可证书)
- [贡献代码](#贡献代码)
<a name="PULC超轻量图像分类方案"></a>
## PULC超轻量图像分类方案
<div align="center">
<img src="https://user-images.githubusercontent.com/19523330/173011854-b10fcd7a-b799-4dfd-a1cf-9504952a3c44.png" width = "800" />
</div>
PULC融合了骨干网络、数据增广、蒸馏等多种前沿算法,可以自动训练得到轻量且高精度的图像分类模型。
PaddleClas提供了覆盖人、车、OCR场景九大常见任务的分类模型,CPU推理3ms,精度比肩SwinTransformer。
<a name="图像识别系统介绍"></a>
## PP-ShiTu图像识别系统介绍
## PP-ShiTu图像识别系统
<div align="center">
<img src="./docs/images/structure.jpg" width = "800" />
......@@ -111,6 +135,11 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
PP-ShiTu是一个实用的轻量级通用图像识别系统,主要由主体检测、特征学习和向量检索三个模块组成。该系统从骨干网络选择和调整、损失函数的选择、数据增强、学习率变换策略、正则化参数选择、预训练模型使用以及模型裁剪量化8个方面,采用多种策略,对各个模块的模型进行优化,最终得到在CPU上仅0.2s即可完成10w+库的图像识别的系统。更多细节请参考[PP-ShiTu技术方案](https://arxiv.org/pdf/2111.00775.pdf)
<a name="分类效果展示"></a>
## PULC实用图像分类模型效果展示
<div align="center">
<img src="docs/images/classification.gif">
</div>
<a name="识别效果展示"></a>
## PP-ShiTu图像识别系统效果展示
......
......@@ -4,10 +4,24 @@
## Introduction
PaddleClas is an image recognition toolset for industry and academia, helping users train better computer vision models and apply them in real scenarios.
PaddleClas is an image classification and image recognition toolset for industry and academia, helping users train better computer vision models and apply them in real scenarios.
**Recent updates**
<div align="center">
<img src="./docs/images/class_simple.gif" width = "600" />
PULC demo images
</div>
&nbsp;
<div align="center">
<img src="./docs/images/recognition.gif" width = "400" />
PP-ShiTu demo images
</div>
**Recent updates**
- 2022.6.15 Release [**P**ractical **U**ltra **L**ight-weight image **C**lassification solutions](./docs/en/PULC/PULC_quickstart_en.md). PULC models inference within 3ms on CPU devices, with accuracy comparable with SwinTransformer. We also release 9 practical models covering pedestrian, vehicle and OCR.
- 2022.4.21 Added the related [code](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files) of the CVPR2022 oral paper [MixFormer](https://arxiv.org/pdf/2204.02557.pdf).
- 2021.09.17 Add PP-LCNet series model developed by PaddleClas, these models show strong competitiveness on Intel CPUs.
......@@ -19,24 +33,12 @@ For the introduction of PP-LCNet, please refer to [paper](https://arxiv.org/pdf/
## Features
- A practical image recognition system consist of detection, feature learning and retrieval modules, widely applicable to all types of image recognition tasks.
Four sample solutions are provided, including product recognition, vehicle recognition, logo recognition and animation character recognition.
- Rich library of pre-trained models: Provide a total of 164 ImageNet pre-trained models in 35 series, among which 6 selected series of models support fast structural modification.
- Comprehensive and easy-to-use feature learning components: 12 metric learning methods are integrated and can be combined and switched at will through configuration files.
- SSLD knowledge distillation: The 14 classification pre-training models generally improved their accuracy by more than 3%; among them, the ResNet50_vd model achieved a Top-1 accuracy of 84.0% on the Image-Net-1k dataset and the Res2Net200_vd pre-training model achieved a Top-1 accuracy of 85.1%.
PaddleClas release PP-HGNet、PP-LCNetv2、 PP-LCNet and **S**imple **S**emi-supervised **L**abel **D**istillation algorithms, and support plenty of
image classification and image recognition algorithms.
Based on th algorithms above, PaddleClas release PP-ShiTu image recognition system and [**P**ractical **U**ltra **L**ight-weight image **C**lassification solutions](docs/en/PULC/PULC_quickstart_en.md).
- Data augmentation: Provide 8 data augmentation algorithms such as AutoAugment, Cutout, Cutmix, etc. with detailed introduction, code replication and evaluation of effectiveness in a unified experimental environment.
<div align="center">
<img src="./docs/images/recognition_en.gif" width = "400" />
</div>
![](https://user-images.githubusercontent.com/19523330/173347904-f2998e00-7b86-4adf-b546-23c684fc67b9.png)
## Welcome to Join the Technical Exchange Group
......@@ -48,11 +50,13 @@ Four sample solutions are provided, including product recognition, vehicle recog
</div>
## Quick Start
Quick experience of image recognition:[Link](./docs/en/tutorials/quick_start_recognition_en.md)
Quick experience of PP-ShiTu image recognition system:[Link](./docs/en/tutorials/quick_start_recognition_en.md)
Quick experience of **P**ractical **U**ltra **L**ight-weight image **C**lassification models:[Link](docs/en/PULC/PULC_quickstart.md)
## Tutorials
- [Quick Installation](./docs/en/tutorials/install_en.md)
- [Practical Ultra Light-weight image Classification solutions](./docs/en/PULC/PULC_quickstart_en.md)
- [Quick Start of Recognition](./docs/en/tutorials/quick_start_recognition_en.md)
- [Introduction to Image Recognition Systems](#Introduction_to_Image_Recognition_Systems)
- [Demo images](#Demo_images)
......@@ -83,6 +87,14 @@ Quick experience of image recognition:[Link](./docs/en/tutorials/quick_start_r
- [License](#License)
- [Contribution](#Contribution)
<a name="Introduction_to_PULC"></a>
## Introduction to Practical Ultra Light-weight image Classification solutions
<div align="center">
<img src="https://user-images.githubusercontent.com/19523330/173011854-b10fcd7a-b799-4dfd-a1cf-9504952a3c44.png" width = "800" />
</div>
PULC solutions consists of PP-LCNet light-weight backbone, SSLD pretrained models, Ensemble of Data Augmentation strategy and SKL-UGI knowledge distillation.
PULC models inference within 3ms on CPU devices, with accuracy comparable with SwinTransformer. We also release 9 practical models covering pedestrian, vehicle and OCR.
<a name="Introduction_to_Image_Recognition_Systems"></a>
## Introduction to Image Recognition Systems
......@@ -97,8 +109,13 @@ Image recognition can be divided into three steps:
For a new unknown category, there is no need to retrain the model, just prepare images of new category, extract features and update retrieval database and the category can be recognised.
<a name="Demo_images"></a>
## Demo images [more](https://github.com/PaddlePaddle/PaddleClas/tree/release/2.2/docs/images/recognition/more_demo_images)
## PULC demo images
<div align="center">
<img src="docs/images/classification.gif">
</div>
<a name="Rec_Demo_images"></a>
## Image Recognition Demo images [more](https://github.com/PaddlePaddle/PaddleClas/tree/release/2.2/docs/images/recognition/more_demo_images)
- Product recognition
<div align="center">
<img src="https://user-images.githubusercontent.com/18028216/122769644-51604f80-d2d7-11eb-8290-c53b12a5c1f6.gif" width = "400" />
......
Global:
infer_imgs: "./images/PULC/car_exists/objects365_00001507.jpeg"
inference_model_dir: "./models/car_exists_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: False
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: ThreshOutput
ThreshOutput:
threshold: 0.5
label_0: no_car
label_1: contains_car
SavePreLabel:
save_dir: ./pre_label/
Global:
infer_imgs: "./images/PULC/language_classification/word_35404.png"
inference_model_dir: "./models/language_classification_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: False
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
size: [160, 80]
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: Topk
Topk:
topk: 2
class_id_map_file: "../ppcls/utils/PULC_label_list/language_classification_label_list.txt"
SavePreLabel:
save_dir: ./pre_label/
Global:
infer_imgs: "./images/PULC/person_attribute/090004.jpg"
inference_model_dir: "./models/person_attribute_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
benchmark: False
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
size: [192, 256]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: PersonAttribute
PersonAttribute:
threshold: 0.5 #default threshold
glasses_threshold: 0.3 #threshold only for glasses
hold_threshold: 0.6 #threshold only for hold
Global:
infer_imgs: "./images/PULC/person/objects365_02035329.jpg"
inference_model_dir: "./models/person_cls_infer"
infer_imgs: "./images/PULC/person_exists/objects365_02035329.jpg"
inference_model_dir: "./models/person_exists_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: False
......@@ -29,7 +29,7 @@ PreProcess:
PostProcess:
main_indicator: ThreshOutput
ThreshOutput:
threshold: 0.9
threshold: 0.5
label_0: nobody
label_1: someone
SavePreLabel:
......
Global:
infer_imgs: "./images/PULC/safety_helmet/safety_helmet_test_1.png"
inference_model_dir: "./models/safety_helmet_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: False
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: ThreshOutput
ThreshOutput:
threshold: 0.5
label_0: wearing_helmet
label_1: unwearing_helmet
SavePreLabel:
save_dir: ./pre_label/
Global:
infer_imgs: "./images/PULC/text_image_orientation/img_rot0_demo.jpg"
inference_model_dir: "./models/text_image_orientation_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: False
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: Topk
Topk:
topk: 2
class_id_map_file: "../ppcls/utils/PULC_label_list/text_image_orientation_label_list.txt"
SavePreLabel:
save_dir: ./pre_label/
Global:
infer_imgs: "./images/PULC/textline_orientation/textline_orientation_test_0_0.png"
inference_model_dir: "./models/textline_orientation_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
size: [160, 80]
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: Topk
Topk:
topk: 1
class_id_map_file: "../ppcls/utils/PULC_label_list/textline_orientation_label_list.txt"
SavePreLabel:
save_dir: ./pre_label/
Global:
infer_imgs: "./images/PULC/traffic_sign/99603_17806.jpg"
inference_model_dir: "./models/traffic_sign_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
benchmark: False
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: Topk
Topk:
topk: 5
class_id_map_file: "../ppcls/utils/PULC_label_list/traffic_sign_label_list.txt"
SavePreLabel:
save_dir: ./pre_label/
Global:
infer_imgs: "./images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg"
inference_model_dir: "./models/vehicle_attribute_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
benchmark: False
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
PreProcess:
transform_ops:
- ResizeImage:
size: [256, 192]
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
PostProcess:
main_indicator: VehicleAttribute
VehicleAttribute:
color_threshold: 0.5
type_threshold: 0.5
......@@ -25,9 +25,9 @@ PreProcess:
- ToCHWImage:
PostProcess:
main_indicator: Attribute
Attribute:
main_indicator: PersonAttribute
PersonAttribute:
threshold: 0.5 #default threshold
glasses_threshold: 0.3 #threshold only for glasses
hold_threshold: 0.6 #threshold only for hold
\ No newline at end of file
Global:
infer_imgs: "./images/ILSVRC2012_val_00000010.jpeg"
infer_imgs: "./images/ImageNet/ILSVRC2012_val_00000010.jpeg"
inference_model_dir: "./models"
batch_size: 1
use_gpu: True
......@@ -32,4 +32,4 @@ PostProcess:
topk: 5
class_id_map_file: "../ppcls/utils/imagenet1k_label_list.txt"
SavePreLabel:
save_dir: ./pre_label/
\ No newline at end of file
save_dir: ./pre_label/
Global:
infer_imgs: "./images/ILSVRC2012_val_00000010.jpeg"
infer_imgs: "./images/ImageNet/ILSVRC2012_val_00000010.jpeg"
inference_model_dir: "./models"
batch_size: 1
use_gpu: True
......@@ -32,4 +32,4 @@ PostProcess:
topk: 5
class_id_map_file: "../ppcls/utils/imagenet1k_label_list.txt"
SavePreLabel:
save_dir: ./pre_label/
\ No newline at end of file
save_dir: ./pre_label/
# 使用镜像:
# registry.baidubce.com/paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82
# 编译Serving Server:
# client和app可以直接使用release版本
# server因为加入了自定义OP,需要重新编译
# 默认编译时的${PWD}=PaddleClas/deploy/paddleserving/
python_name=${1:-'python'}
apt-get update
apt install -y libcurl4-openssl-dev libbz2-dev
wget -nc https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar
tar xf centos_ssl.tar
rm -rf centos_ssl.tar
mv libcrypto.so.1.0.2k /usr/lib/libcrypto.so.1.0.2k
mv libssl.so.1.0.2k /usr/lib/libssl.so.1.0.2k
ln -sf /usr/lib/libcrypto.so.1.0.2k /usr/lib/libcrypto.so.10
ln -sf /usr/lib/libssl.so.1.0.2k /usr/lib/libssl.so.10
ln -sf /usr/lib/libcrypto.so.10 /usr/lib/libcrypto.so
ln -sf /usr/lib/libssl.so.10 /usr/lib/libssl.so
# 安装go依赖
rm -rf /usr/local/go
wget -qO- https://paddle-ci.cdn.bcebos.com/go1.17.2.linux-amd64.tar.gz | tar -xz -C /usr/local
export GOROOT=/usr/local/go
export GOPATH=/root/gopath
export PATH=$PATH:$GOPATH/bin:$GOROOT/bin
go env -w GO111MODULE=on
go env -w GOPROXY=https://goproxy.cn,direct
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go install google.golang.org/grpc@v1.33.0
go env -w GO111MODULE=auto
# 下载opencv库
wget https://paddle-qa.bj.bcebos.com/PaddleServing/opencv3.tar.gz
tar -xvf opencv3.tar.gz
rm -rf opencv3.tar.gz
export OPENCV_DIR=$PWD/opencv3
# clone Serving
git clone https://github.com/PaddlePaddle/Serving.git -b develop --depth=1
cd Serving # PaddleClas/deploy/paddleserving/Serving
export Serving_repo_path=$PWD
git submodule update --init --recursive
${python_name} -m pip install -r python/requirements.txt
# set env
export PYTHON_INCLUDE_DIR=$(${python_name} -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())")
export PYTHON_LIBRARIES=$(${python_name} -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
export PYTHON_EXECUTABLE=`which ${python_name}`
export CUDA_PATH='/usr/local/cuda'
export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
export CUDA_CUDART_LIBRARY='/usr/local/cuda/lib64/'
export TENSORRT_LIBRARY_PATH='/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/'
# cp 自定义OP代码
\cp ../preprocess/general_clas_op.* ${Serving_repo_path}/core/general-server/op
\cp ../preprocess/preprocess_op.* ${Serving_repo_path}/core/predictor/tools/pp_shitu_tools
# 编译Server
mkdir server-build-gpu-opencv
cd server-build-gpu-opencv
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
-DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
-DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
-DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
-DOPENCV_DIR=${OPENCV_DIR} \
-DWITH_OPENCV=ON \
-DSERVER=ON \
-DWITH_GPU=ON ..
make -j32
${python_name} -m pip install python/dist/paddle*
# export SERVING_BIN
export SERVING_BIN=$PWD/core/general-server/serving
cd ../../
\ No newline at end of file
......@@ -30,4 +30,4 @@ op:
client_type: local_predictor
#Fetch结果列表,以client_config中fetch_var的alias_name为准
fetch_list: ["prediction"]
fetch_list: ["prediction"]
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "core/general-server/op/general_clas_op.h"
#include "core/predictor/framework/infer.h"
#include "core/predictor/framework/memory.h"
#include "core/predictor/framework/resource.h"
#include "core/util/include/timer.h"
#include <algorithm>
#include <iostream>
#include <memory>
#include <sstream>
namespace baidu {
namespace paddle_serving {
namespace serving {
using baidu::paddle_serving::Timer;
using baidu::paddle_serving::predictor::MempoolWrapper;
using baidu::paddle_serving::predictor::general_model::Tensor;
using baidu::paddle_serving::predictor::general_model::Response;
using baidu::paddle_serving::predictor::general_model::Request;
using baidu::paddle_serving::predictor::InferManager;
using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
int GeneralClasOp::inference() {
VLOG(2) << "Going to run inference";
const std::vector<std::string> pre_node_names = pre_names();
if (pre_node_names.size() != 1) {
LOG(ERROR) << "This op(" << op_name()
<< ") can only have one predecessor op, but received "
<< pre_node_names.size();
return -1;
}
const std::string pre_name = pre_node_names[0];
const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
if (!input_blob) {
LOG(ERROR) << "input_blob is nullptr,error";
return -1;
}
uint64_t log_id = input_blob->GetLogId();
VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
GeneralBlob *output_blob = mutable_data<GeneralBlob>();
if (!output_blob) {
LOG(ERROR) << "output_blob is nullptr,error";
return -1;
}
output_blob->SetLogId(log_id);
if (!input_blob) {
LOG(ERROR) << "(logid=" << log_id
<< ") Failed mutable depended argument, op:" << pre_name;
return -1;
}
const TensorVector *in = &input_blob->tensor_vector;
TensorVector *out = &output_blob->tensor_vector;
int batch_size = input_blob->_batch_size;
output_blob->_batch_size = batch_size;
VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
Timer timeline;
int64_t start = timeline.TimeStampUS();
timeline.Start();
// only support string type
char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
std::string base64str = total_input_ptr;
cv::Mat img = Base2Mat(base64str);
// RGB2BGR
cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
// Resize
cv::Mat resize_img;
resize_op_.Run(img, resize_img, resize_short_size_);
// CenterCrop
crop_op_.Run(resize_img, crop_size_);
// Normalize
normalize_op_.Run(&resize_img, mean_, scale_, is_scale_);
// Permute
std::vector<float> input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);
permute_op_.Run(&resize_img, input.data());
float maxValue = *max_element(input.begin(), input.end());
float minValue = *min_element(input.begin(), input.end());
TensorVector *real_in = new TensorVector();
if (!real_in) {
LOG(ERROR) << "real_in is nullptr,error";
return -1;
}
std::vector<int> input_shape;
int in_num = 0;
void *databuf_data = NULL;
char *databuf_char = NULL;
size_t databuf_size = 0;
input_shape = {1, 3, resize_img.rows, resize_img.cols};
in_num = std::accumulate(input_shape.begin(), input_shape.end(), 1,
std::multiplies<int>());
databuf_size = in_num * sizeof(float);
databuf_data = MempoolWrapper::instance().malloc(databuf_size);
if (!databuf_data) {
LOG(ERROR) << "Malloc failed, size: " << databuf_size;
return -1;
}
memcpy(databuf_data, input.data(), databuf_size);
databuf_char = reinterpret_cast<char *>(databuf_data);
paddle::PaddleBuf paddleBuf(databuf_char, databuf_size);
paddle::PaddleTensor tensor_in;
tensor_in.name = in->at(0).name;
tensor_in.dtype = paddle::PaddleDType::FLOAT32;
tensor_in.shape = {1, 3, resize_img.rows, resize_img.cols};
tensor_in.lod = in->at(0).lod;
tensor_in.data = paddleBuf;
real_in->push_back(tensor_in);
if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
batch_size)) {
LOG(ERROR) << "(logid=" << log_id
<< ") Failed do infer in fluid model: " << engine_name().c_str();
return -1;
}
int64_t end = timeline.TimeStampUS();
CopyBlobInfo(input_blob, output_blob);
AddBlobInfo(output_blob, start);
AddBlobInfo(output_blob, end);
return 0;
}
cv::Mat GeneralClasOp::Base2Mat(std::string &base64_data) {
cv::Mat img;
std::string s_mat;
s_mat = base64Decode(base64_data.data(), base64_data.size());
std::vector<char> base64_img(s_mat.begin(), s_mat.end());
img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
return img;
}
std::string GeneralClasOp::base64Decode(const char *Data, int DataByte) {
const char DecodeTable[] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0,
62, // '+'
0, 0, 0,
63, // '/'
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
0, 0, 0, 0, 0, 0, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
};
std::string strDecode;
int nValue;
int i = 0;
while (i < DataByte) {
if (*Data != '\r' && *Data != '\n') {
nValue = DecodeTable[*Data++] << 18;
nValue += DecodeTable[*Data++] << 12;
strDecode += (nValue & 0x00FF0000) >> 16;
if (*Data != '=') {
nValue += DecodeTable[*Data++] << 6;
strDecode += (nValue & 0x0000FF00) >> 8;
if (*Data != '=') {
nValue += DecodeTable[*Data++];
strDecode += nValue & 0x000000FF;
}
}
i += 4;
} else // 回车换行,跳过
{
Data++;
i++;
}
}
return strDecode;
}
DEFINE_OP(GeneralClasOp);
} // namespace serving
} // namespace paddle_serving
} // namespace baidu
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include "core/general-server/general_model_service.pb.h"
#include "core/general-server/op/general_infer_helper.h"
#include "core/predictor/tools/pp_shitu_tools/preprocess_op.h"
#include "paddle_inference_api.h" // NOLINT
#include <string>
#include <vector>
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <vector>
#include <cstring>
#include <fstream>
#include <numeric>
namespace baidu {
namespace paddle_serving {
namespace serving {
class GeneralClasOp
: public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
public:
typedef std::vector<paddle::PaddleTensor> TensorVector;
DECLARE_OP(GeneralClasOp);
int inference();
private:
// clas preprocess
std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
bool is_scale_ = true;
int resize_short_size_ = 256;
int crop_size_ = 224;
PaddleClas::ResizeImg resize_op_;
PaddleClas::Normalize normalize_op_;
PaddleClas::Permute permute_op_;
PaddleClas::CenterCropImg crop_op_;
// read pics
cv::Mat Base2Mat(std::string &base64_data);
std::string base64Decode(const char *Data, int DataByte);
};
} // namespace serving
} // namespace paddle_serving
} // namespace baidu
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "paddle_api.h"
#include "paddle_inference_api.h"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <vector>
#include <cstring>
#include <fstream>
#include <math.h>
#include <numeric>
#include "preprocess_op.h"
namespace Feature {
void Permute::Run(const cv::Mat *im, float *data) {
int rh = im->rows;
int rw = im->cols;
int rc = im->channels();
for (int i = 0; i < rc; ++i) {
cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
}
}
void Normalize::Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &std, float scale) {
(*im).convertTo(*im, CV_32FC3, scale);
for (int h = 0; h < im->rows; h++) {
for (int w = 0; w < im->cols; w++) {
im->at<cv::Vec3f>(h, w)[0] =
(im->at<cv::Vec3f>(h, w)[0] - mean[0]) / std[0];
im->at<cv::Vec3f>(h, w)[1] =
(im->at<cv::Vec3f>(h, w)[1] - mean[1]) / std[1];
im->at<cv::Vec3f>(h, w)[2] =
(im->at<cv::Vec3f>(h, w)[2] - mean[2]) / std[2];
}
}
}
void CenterCropImg::Run(cv::Mat &img, const int crop_size) {
int resize_w = img.cols;
int resize_h = img.rows;
int w_start = int((resize_w - crop_size) / 2);
int h_start = int((resize_h - crop_size) / 2);
cv::Rect rect(w_start, h_start, crop_size, crop_size);
img = img(rect);
}
void ResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,
int resize_short_size, int size) {
int resize_h = 0;
int resize_w = 0;
if (size > 0) {
resize_h = size;
resize_w = size;
} else {
int w = img.cols;
int h = img.rows;
float ratio = 1.f;
if (h < w) {
ratio = float(resize_short_size) / float(h);
} else {
ratio = float(resize_short_size) / float(w);
}
resize_h = round(float(h) * ratio);
resize_w = round(float(w) * ratio);
}
cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
}
} // namespace Feature
namespace PaddleClas {
void Permute::Run(const cv::Mat *im, float *data) {
int rh = im->rows;
int rw = im->cols;
int rc = im->channels();
for (int i = 0; i < rc; ++i) {
cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
}
}
void Normalize::Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &scale, const bool is_scale) {
double e = 1.0;
if (is_scale) {
e /= 255.0;
}
(*im).convertTo(*im, CV_32FC3, e);
for (int h = 0; h < im->rows; h++) {
for (int w = 0; w < im->cols; w++) {
im->at<cv::Vec3f>(h, w)[0] =
(im->at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
im->at<cv::Vec3f>(h, w)[1] =
(im->at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
im->at<cv::Vec3f>(h, w)[2] =
(im->at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
}
}
}
void CenterCropImg::Run(cv::Mat &img, const int crop_size) {
int resize_w = img.cols;
int resize_h = img.rows;
int w_start = int((resize_w - crop_size) / 2);
int h_start = int((resize_h - crop_size) / 2);
cv::Rect rect(w_start, h_start, crop_size, crop_size);
img = img(rect);
}
void ResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,
int resize_short_size) {
int w = img.cols;
int h = img.rows;
float ratio = 1.f;
if (h < w) {
ratio = float(resize_short_size) / float(h);
} else {
ratio = float(resize_short_size) / float(w);
}
int resize_h = round(float(h) * ratio);
int resize_w = round(float(w) * ratio);
cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
}
} // namespace PaddleClas
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <vector>
#include <cstring>
#include <fstream>
#include <numeric>
namespace Feature {
class Normalize {
public:
virtual void Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &std, float scale);
};
// RGB -> CHW
class Permute {
public:
virtual void Run(const cv::Mat *im, float *data);
};
class CenterCropImg {
public:
virtual void Run(cv::Mat &im, const int crop_size = 224);
};
class ResizeImg {
public:
virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len,
int size = 0);
};
} // namespace Feature
namespace PaddleClas {
class Normalize {
public:
virtual void Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &scale, const bool is_scale = true);
};
// RGB -> CHW
class Permute {
public:
virtual void Run(const cv::Mat *im, float *data);
};
class CenterCropImg {
public:
virtual void Run(cv::Mat &im, const int crop_size = 224);
};
class ResizeImg {
public:
virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len);
};
} // namespace PaddleClas
......@@ -31,7 +31,7 @@ op:
#Fetch结果列表,以client_config中fetch_var的alias_name为准
fetch_list: ["features"]
det:
concurrency: 1
local_service_conf:
......
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
feed_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
feed_type: 1
shape: 6
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
}
fetch_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
fetch_type: 1
shape: 6
}
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
feed_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
feed_type: 1
shape: 6
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
}
fetch_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
fetch_type: 1
shape: 6
}
feed_var {
name: "im_shape"
alias_name: "im_shape"
is_lod_tensor: false
feed_type: 1
shape: 2
}
feed_var {
name: "image"
alias_name: "image"
is_lod_tensor: false
feed_type: 7
shape: -1
shape: -1
shape: 3
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "save_infer_model/scale_0.tmp_1"
is_lod_tensor: true
fetch_type: 1
shape: -1
}
fetch_var {
name: "save_infer_model/scale_1.tmp_1"
alias_name: "save_infer_model/scale_1.tmp_1"
is_lod_tensor: false
fetch_type: 2
}
feed_var {
name: "im_shape"
alias_name: "im_shape"
is_lod_tensor: false
feed_type: 1
shape: 2
}
feed_var {
name: "image"
alias_name: "image"
is_lod_tensor: false
feed_type: 7
shape: -1
shape: -1
shape: 3
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "save_infer_model/scale_0.tmp_1"
is_lod_tensor: true
fetch_type: 1
shape: -1
}
fetch_var {
name: "save_infer_model/scale_1.tmp_1"
alias_name: "save_infer_model/scale_1.tmp_1"
is_lod_tensor: false
fetch_type: 2
}
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import numpy as np
from paddle_serving_client import Client
......@@ -22,181 +21,101 @@ import faiss
import os
import pickle
class MainbodyDetect():
"""
pp-shitu mainbody detect.
include preprocess, process, postprocess
return detect results
Attention: Postprocess include num limit and box filter; no nms
"""
def __init__(self):
self.preprocess = DetectionSequential([
DetectionFile2Image(), DetectionNormalize(
[0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True),
DetectionResize(
(640, 640), False, interpolation=2), DetectionTranspose(
(2, 0, 1))
])
self.client = Client()
self.client.load_client_config(
"../../models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/serving_client_conf.prototxt"
)
self.client.connect(['127.0.0.1:9293'])
self.max_det_result = 5
self.conf_threshold = 0.2
def predict(self, imgpath):
im, im_info = self.preprocess(imgpath)
im_shape = np.array(im.shape[1:]).reshape(-1)
scale_factor = np.array(list(im_info['scale_factor'])).reshape(-1)
fetch_map = self.client.predict(
feed={
"image": im,
"im_shape": im_shape,
"scale_factor": scale_factor,
},
fetch=["save_infer_model/scale_0.tmp_1"],
batch=False)
return self.postprocess(fetch_map, imgpath)
def postprocess(self, fetch_map, imgpath):
#1. get top max_det_result
det_results = fetch_map["save_infer_model/scale_0.tmp_1"]
if len(det_results) > self.max_det_result:
boxes_reserved = fetch_map[
"save_infer_model/scale_0.tmp_1"][:self.max_det_result]
else:
boxes_reserved = det_results
#2. do conf threshold
boxes_list = []
for i in range(boxes_reserved.shape[0]):
if (boxes_reserved[i, 1]) > self.conf_threshold:
boxes_list.append(boxes_reserved[i, :])
#3. add origin image box
origin_img = cv2.imread(imgpath)
boxes_list.append(
np.array([0, 1.0, 0, 0, origin_img.shape[1], origin_img.shape[0]]))
return np.array(boxes_list)
class ObjectRecognition():
"""
pp-shitu object recognion for all objects detected by MainbodyDetect.
include preprocess, process, postprocess
preprocess include preprocess for each image and batching.
Batch process
postprocess include retrieval and nms
"""
def __init__(self):
self.client = Client()
self.client.load_client_config(
"../../models/general_PPLCNet_x2_5_lite_v1.0_client/serving_client_conf.prototxt"
)
self.client.connect(["127.0.0.1:9294"])
self.seq = Sequential([
BGR2RGB(), Resize((224, 224)), Div(255),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225],
False), Transpose((2, 0, 1))
])
self.searcher, self.id_map = self.init_index()
self.rec_nms_thresold = 0.05
self.rec_score_thres = 0.5
self.feature_normalize = True
self.return_k = 1
def init_index(self):
index_dir = "../../drink_dataset_v1.0/index"
assert os.path.exists(os.path.join(
index_dir, "vector.index")), "vector.index not found ..."
assert os.path.exists(os.path.join(
index_dir, "id_map.pkl")), "id_map.pkl not found ... "
searcher = faiss.read_index(os.path.join(index_dir, "vector.index"))
with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
id_map = pickle.load(fd)
return searcher, id_map
def predict(self, det_boxes, imgpath):
#1. preprocess
batch_imgs = []
origin_img = cv2.imread(imgpath)
for i in range(det_boxes.shape[0]):
box = det_boxes[i]
x1, y1, x2, y2 = [int(x) for x in box[2:]]
cropped_img = origin_img[y1:y2, x1:x2, :].copy()
tmp = self.seq(cropped_img)
batch_imgs.append(tmp)
batch_imgs = np.array(batch_imgs)
#2. process
fetch_map = self.client.predict(
feed={"x": batch_imgs}, fetch=["features"], batch=True)
batch_features = fetch_map["features"]
#3. postprocess
if self.feature_normalize:
feas_norm = np.sqrt(
np.sum(np.square(batch_features), axis=1, keepdims=True))
batch_features = np.divide(batch_features, feas_norm)
scores, docs = self.searcher.search(batch_features, self.return_k)
results = []
for i in range(scores.shape[0]):
pred = {}
if scores[i][0] >= self.rec_score_thres:
pred["bbox"] = [int(x) for x in det_boxes[i, 2:]]
pred["rec_docs"] = self.id_map[docs[i][0]].split()[1]
pred["rec_scores"] = scores[i][0]
results.append(pred)
return self.nms_to_rec_results(results)
def nms_to_rec_results(self, results):
filtered_results = []
x1 = np.array([r["bbox"][0] for r in results]).astype("float32")
y1 = np.array([r["bbox"][1] for r in results]).astype("float32")
x2 = np.array([r["bbox"][2] for r in results]).astype("float32")
y2 = np.array([r["bbox"][3] for r in results]).astype("float32")
scores = np.array([r["rec_scores"] for r in results])
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]
while order.size > 0:
i = order[0]
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= self.rec_nms_thresold)[0]
order = order[inds + 1]
filtered_results.append(results[i])
return filtered_results
rec_nms_thresold = 0.05
rec_score_thres = 0.5
feature_normalize = True
return_k = 1
index_dir = "../../drink_dataset_v1.0/index"
def init_index(index_dir):
assert os.path.exists(os.path.join(
index_dir, "vector.index")), "vector.index not found ..."
assert os.path.exists(os.path.join(
index_dir, "id_map.pkl")), "id_map.pkl not found ... "
searcher = faiss.read_index(os.path.join(index_dir, "vector.index"))
with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
id_map = pickle.load(fd)
return searcher, id_map
#get box
def nms_to_rec_results(results, thresh=0.1):
filtered_results = []
x1 = np.array([r["bbox"][0] for r in results]).astype("float32")
y1 = np.array([r["bbox"][1] for r in results]).astype("float32")
x2 = np.array([r["bbox"][2] for r in results]).astype("float32")
y2 = np.array([r["bbox"][3] for r in results]).astype("float32")
scores = np.array([r["rec_scores"] for r in results])
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]
while order.size > 0:
i = order[0]
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= thresh)[0]
order = order[inds + 1]
filtered_results.append(results[i])
return filtered_results
def postprocess(fetch_dict, feature_normalize, det_boxes, searcher, id_map,
return_k, rec_score_thres, rec_nms_thresold):
batch_features = fetch_dict["features"]
#do feature norm
if feature_normalize:
feas_norm = np.sqrt(
np.sum(np.square(batch_features), axis=1, keepdims=True))
batch_features = np.divide(batch_features, feas_norm)
scores, docs = searcher.search(batch_features, return_k)
results = []
for i in range(scores.shape[0]):
pred = {}
if scores[i][0] >= rec_score_thres:
pred["bbox"] = [int(x) for x in det_boxes[i, 2:]]
pred["rec_docs"] = id_map[docs[i][0]].split()[1]
pred["rec_scores"] = scores[i][0]
results.append(pred)
#do nms
results = nms_to_rec_results(results, rec_nms_thresold)
return results
#do client
if __name__ == "__main__":
det = MainbodyDetect()
rec = ObjectRecognition()
#1. get det_results
imgpath = "../../drink_dataset_v1.0/test_images/001.jpeg"
det_results = det.predict(imgpath)
#2. get rec_results
rec_results = rec.predict(det_results, imgpath)
print(rec_results)
client = Client()
client.load_client_config([
"../../models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client",
"../../models/general_PPLCNet_x2_5_lite_v1.0_client"
])
client.connect(['127.0.0.1:9400'])
im = cv2.imread("../../drink_dataset_v1.0/test_images/001.jpeg")
im_shape = np.array(im.shape[:2]).reshape(-1)
fetch_map = client.predict(
feed={"image": im,
"im_shape": im_shape},
fetch=["features", "boxes"],
batch=False)
#add retrieval procedure
det_boxes = fetch_map["boxes"]
searcher, id_map = init_index(index_dir)
results = postprocess(fetch_map, feature_normalize, det_boxes, searcher,
id_map, return_k, rec_score_thres, rec_nms_thresold)
print(results)
......@@ -12,16 +12,20 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import base64
import time
from paddle_serving_client import Client
#app
from paddle_serving_app.reader import Sequential, URL2Image, Resize
from paddle_serving_app.reader import CenterCrop, RGB2BGR, Transpose, Div, Normalize
import time
def bytes_to_base64(image: bytes) -> str:
"""encode bytes into base64 string
"""
return base64.b64encode(image).decode('utf8')
client = Client()
client.load_client_config("./ResNet50_vd_serving/serving_server_conf.prototxt")
client.load_client_config("./ResNet50_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
label_dict = {}
......@@ -31,22 +35,17 @@ with open("imagenet.label") as fin:
label_dict[label_idx] = line.strip()
label_idx += 1
#preprocess
seq = Sequential([
URL2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
])
start = time.time()
image_file = "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"
image_file = "./daisy.jpg"
for i in range(1):
img = seq(image_file)
fetch_map = client.predict(
feed={"inputs": img}, fetch=["prediction"], batch=False)
prob = max(fetch_map["prediction"][0])
label = label_dict[fetch_map["prediction"][0].tolist().index(prob)].strip(
).replace(",", "")
print("prediction: {}, probability: {}".format(label, prob))
end = time.time()
print(end - start)
start = time.time()
with open(image_file, 'rb') as img_file:
image_data = img_file.read()
image = bytes_to_base64(image_data)
fetch_dict = client.predict(
feed={"inputs": image}, fetch=["prediction"], batch=False)
prob = max(fetch_dict["prediction"][0])
label = label_dict[fetch_dict["prediction"][0].tolist().index(
prob)].strip().replace(",", "")
print("prediction: {}, probability: {}".format(label, prob))
end = time.time()
print(end - start)
......@@ -189,7 +189,7 @@ class Binarize(object):
return byte
class Attribute(object):
class PersonAttribute(object):
def __init__(self,
threshold=0.5,
glasses_threshold=0.3,
......@@ -277,6 +277,46 @@ class Attribute(object):
threshold_list[18] = self.hold_threshold
pred_res = (np.array(res) > np.array(threshold_list)
).astype(np.int8).tolist()
batch_res.append({"attributes": label_res, "output": pred_res})
return batch_res
class VehicleAttribute(object):
def __init__(self, color_threshold=0.5, type_threshold=0.5):
self.color_threshold = color_threshold
self.type_threshold = type_threshold
self.color_list = [
"yellow", "orange", "green", "gray", "red", "blue", "white",
"golden", "brown", "black"
]
self.type_list = [
"sedan", "suv", "van", "hatchback", "mpv", "pickup", "bus",
"truck", "estate"
]
def __call__(self, batch_preds, file_names=None):
# postprocess output of predictor
batch_res = []
for res in batch_preds:
res = res.tolist()
label_res = []
color_idx = np.argmax(res[:10])
type_idx = np.argmax(res[10:])
if res[color_idx] >= self.color_threshold:
color_info = f"Color: ({self.color_list[color_idx]}, prob: {res[color_idx]})"
else:
color_info = "Color unknown"
batch_res.append([label_res, pred_res])
if res[type_idx + 10] >= self.type_threshold:
type_info = f"Type: ({self.type_list[type_idx]}, prob: {res[type_idx + 10]})"
else:
type_info = "Type unknown"
label_res = f"{color_info}, {type_info}"
threshold_list = [self.color_threshold
] * 10 + [self.type_threshold] * 9
pred_res = (np.array(res) > np.array(threshold_list)
).astype(np.int8).tolist()
batch_res.append({"attributes": label_res, "output": pred_res})
return batch_res
......@@ -138,12 +138,11 @@ def main(config):
continue
batch_results = cls_predictor.predict(batch_imgs)
for number, result_dict in enumerate(batch_results):
if "Attribute" in config["PostProcess"]:
if "PersonAttribute" in config[
"PostProcess"] or "VehicleAttribute" in config[
"PostProcess"]:
filename = batch_names[number]
attr_message = result_dict[0]
pred_res = result_dict[1]
print("{}:\t attributes: {}, \npredict output: {}".format(
filename, attr_message, pred_res))
print("{}:\t {}".format(filename, result_dict))
else:
filename = batch_names[number]
clas_ids = result_dict["class_ids"]
......
# PULC Model Zoo
------
The PULC model zoo is provided here, mainly providing indicators, model storage size, and download links of the model. The pre-trained model can be used for fine-tuning training, and the inference model can be directly used for prediction and deployment.
|Model name| Model Description | Metrics |Storage Size| Latency| Download Address|
| --- | --- | --- | --- | --- | --- |
| person_exists |[Human Exists Classification](PULC_person_exists_en.md)| 95.60 |6.5M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_exists_pretrained.pdparams)|
| person_attribute |[Pedestrian Attribute Classification](PULC_person_attribute_en.md)| 78.59 |6.6M|2.01ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_attribute_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_attribute_pretrained.pdparams)|
| safety_helmet |[Classification of Wheather Wearing Safety Helmet](PULC_safety_helmet_en.md)| 99.38 |6.5M|2.03ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/safety_helmet_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/safety_helmet_pretrained.pdparams)|
| traffic_sign |[Traffic Sign Classification](PULC_traffic_sign_en.md)| 98.35 |8.2M|2.10ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/traffic_sign_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/traffic_sign_pretrained.pdparams)|
| vehicle_attribute |[Vehicle Attribute Classification](PULC_vehicle_attribute_en.md)| 90.81 |7.2M|2.36ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/vehicle_attribute_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/vehicle_attribute_pretrained.pdparams)|
| car_exists |[Car Exists Classification](PULC_car_exists_en.md) | 95.92 | 6.6M | 2.38ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/car_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/car_exists_pretrained.pdparams)|
| text_image_orientation |[Text Image Orientation Classification](PULC_text_image_orientation_en.md)| 99.06 | 6.5M | 2.16ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/text_image_orientation_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/text_image_orientation_pretrained.pdparams)|
| textline_orientation |[Text-line Orientation Classification](PULC_textline_orientation_en.md)| 96.01 |6.5M|2.72ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/textline_orientation_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/textline_orientation_pretrained.pdparams)|
| language_classification |[Language Classification](PULC_language_classification_en.md)| 99.26 |6.5M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/language_classification_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/language_classification_pretrained.pdparams)|
**Note:**
* The backbone of all the above models is PPLCNet_x1_0. The different sizes of some models are caused by the different output sizes of the classification layer. The inference time is tested on the Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. During the test process, the MKLDNN acceleration strategy is turned on, and the number of threads is 10. There will be slight fluctuations during the speed test process.
* The evaluation indicators of person_exists, safety_helmet, and car_exists are TprAtFpr. The evaluation indicators of person_attribute and vehicle_attribute are ma. The evaluation indicators of traffic_sign, text_image_orientation, textline_orientation and language_classification are Top-1 Acc.
# PULC Quick Start
------
This document introduces the prediction using PULC series model based on PaddleClas wheel.
## Catalogue
- [1. Installation](#1)
- [1.1 PaddlePaddle Installation](#11)
- [1.2 PaddleClas wheel Installation](#12)
- [2. Quick Start](#2)
- [2.1 Predicion with Command Line](#2.1)
- [2.2 Predicion with Python](#2.2)
- [2.3 Supported Model List](#2.3)
- [3. Summary](#3)
<a name="1"></a>
## 1. Installation
<a name="1.1"></a>
### 1.1 PaddlePaddle Installation
- Run the following command to install if CUDA9 or CUDA10 is available.
```bash
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
```
- Run the following command to install if GPU device is unavailable.
```bash
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, for examples other versions.
<a name="1.2"></a>
### 1.2 PaddleClas wheel Installation
```bash
pip3 install paddleclas
```
<a name="2"></a>
## 2. Quick Start
PaddleClas provides a series of test cases, which contain demos of different scenes about people, cars, OCR, etc. Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the data.
<a name="2.1"></a>
### 2.1 Predicion with Command Line
```
cd /path/to/pulc_demo_imgs
```
The prediction command:
```bash
paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
```
Result:
```
>>> result
class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
Predict complete!
```
`Nobody` means there is no one in the image, `someone` means there is someone in the image. Therefore, the prediction result indicates that there is no one in the figure.
**Note**: The "--infer_imgs" argument specify the image(s) to be predict, and you can also specify a directoy contains images. If use other model, you can specify the `--model_name` argument. Please refer to [2.3 Supported Model List](#2.3) for the supported model list.
<a name="2.2"></a>
### 2.2 Predicion with Python
You can also use in Python:
```python
import paddleclas
model = paddleclas.PaddleClas(model_name="person_exists")
result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
print(next(result))
```
The printed result information:
```
>>> result
[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
```
**Note**: `model.predict()` is a generator, so `next()` or `for` is needed to call it. This would to predict by batch that length is `batch_size`, default by 1. You can specify the argument `batch_size` and `model_name` when instantiating PaddleClas object, for example: `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`. Please refer to [2.3 Supported Model List](#2.3) for the supported model list.
<a name="2.3"></a>
### 2.3 Supported Model List
The name of PULC series models are as follows:
| Name | Intro |
| --- | --- |
| person_exists | Human Exists Classification |
| person_attribute | Pedestrian Attribute Classification |
| safety_helmet | Classification of Wheather Wearing Safety Helmet |
| traffic_sign | Traffic Sign Classification |
| vehicle_attribute | Vehicle Attribute Classification |
| car_exists | Car Exists Classification |
| text_image_orientation | Text Image Orientation Classification |
| textline_orientation | Text-line Orientation Classification |
| language_classification | Language Classification |
<a name="3"></a>
## 3. Summary
The PULC series models have been verified to be effective in different scenarios about people, vehicles, OCR, etc. The ultra lightweight model can achieve the accuracy close to SwinTransformer model, and the speed is increased by 40+ times. And PULC also provides the whole process of dataset getting, model training, model compression and deployment. Please refer to [Human Exists Classification](PULC_person_exists_en.md)[Pedestrian Attribute Classification](PULC_person_attribute_en.md)[Classification of Wheather Wearing Safety Helmet](PULC_safety_helmet_en.md)[Traffic Sign Classification](PULC_traffic_sign_en.md)[Vehicle Attribute Classification](PULC_vehicle_attribute_en.md)[Car Exists Classification](PULC_car_exists_en.md)[Text Image Orientation Classification](PULC_text_image_orientation_en.md)[Text-line Orientation Classification](PULC_textline_orientation_en.md)[Language Classification](PULC_language_classification_en.md) for more information about different scenarios.
## Practical Ultra Lightweight Classification scheme PULC
------
## Catalogue
- [1. Introduction of PULC solution](#1)
- [2. Data preparation](#2)
- [2.1 Dataset format description](#2.1)
- [2.2 Annotation file generation method](#2.2)
- [3. Training with standard classification configuration](#3)
- [3.1 PP-LCNet as backbone](#3.1)
- [3.2 SSLD pretrained model](#3.2)
- [3.3 EDA strategy](#3.3)
- [3.4 SKL-UGI knowledge distillation](#3.4)
- [3.5 Summary](#3.5)
- [4. Hyperparameter Search](#4)
- [4.1 Search based on default configuration](#4.1)
- [4.2 Custom search configuration](#4.2)
<a name="1"></a>
### 1. Introduction of PULC solution
Image classification is one of the basic algorithms of computer vision, and it is also the most common algorithm in enterprise applications, and further, it is also an important part of many CV applications. In recent years, the backbone network model has developed rapidly, and the accuracy record of ImageNet has been continuously refreshed. However, the performance of these models in practical scenarios is sometimes unsatisfactory. On the one hand, models with high precision tend to have large storage and slow inference speed, which are often difficult to meet actual deployment requirements; on the other hand, after selecting a suitable model, experienced engineers are often required to adjust parameters, which is time-consuming and labor-intensive. In order to solve the problems of enterprise application and make the training and parameter adjustment of classification models easier, PaddleClas summarized and launched a Practical Ultra Lightweight Classification (PULC) solution. PULC integrates various state-of-the-art algorithms such as backbone network, data augmentation and distillation, etc., and finally can automatically obtain a lightweight and high-precision image classification model.
The PULC solution has been verified to be effective in many scenarios, such as human-related scenarios, car-related scenarios, and OCR-related scenarios. With an ultra-lightweight model, the accuracy close to SwinTransformer can be achieved, and the inference speed can be 40+ times faster.
<div align="center">
<img src="https://user-images.githubusercontent.com/19523330/173011854-b10fcd7a-b799-4dfd-a1cf-9504952a3c44.png" width = "800" />
</div>
The solution mainly includes 4 parts, namely: PP-LCNet lightweight backbone network, SSLD pre-trained model, Ensemble Data Augmentation (EDA) and SKL-UGI knowledge distillation algorithm. In addition, we also adopt the method of hyperparameter search to efficiently optimize the hyperparameters in training. Below, we take the person exists or not scene as an example to illustrate the solution.
**Note**:For some specific scenarios, we provide basic training documents for reference, such as [person exists or not classification model](PULC_person_exists_en.md), etc. You can find these documents [here](./PULC_model_list_en.md). If the methods in these documents do not meet your needs, or if you need a custom training task, you can refer to this document.
<a name="2"></a>
### 2. Data preparation
<a name="2.1"></a>
#### 2.1 Dataset format description
PaddleClas uses the `txt` format file to specify the training set and validation set. Take the person exists or not scene as an example, you need to specify `train_list.txt` and `val_list.txt` as the data labels of the training set and validation set. The format is in the form of as follows:
```
# Each line uses "space" to separate the image path and label
train/1.jpg 0
train/10.jpg 1
...
```
If you want to get more information about common classification datasets, you can refer to the document [PaddleClas Classification Dataset Format Description](../data_preparation/classification_dataset_en.md).
<a name="2.2"></a>
#### 2.2 Annotation file generation method
If you already have the data in the actual scene, you can label it according to the format in the previous section. Here, we provide a script to quickly generate annotation files. You only need to put different categories of data in folders and run the script to generate annotation files.
First, assume that the path where you store the data is `./train`, `train/` contains the data of each category, the category number starts from 0, and the folder of each category contains specific image data.
```shell
train
├── 0
│   ├── 0.jpg
│   ├── 1.jpg
│   └── ...
└── 1
├── 0.jpg
├── 1.jpg
└── ...
└── ...
```
```shell
tree -r -i -f train | grep -E "jpg|JPG|jpeg|JPEG|png|PNG" | awk -F "/" '{print $0" "$2}' > train_list.txt
```
Among them, if more image name suffixes are involved, the content after `grep -E` can be added, and the `2` in `$2` is the level of the category number folder.
**Note:** The above is an introduction to the method of dataset acquisition and generation. Here you can directly download the person exists or not scene data to quickly start the experience.
Go to the PaddleClas directory.
```
cd path_to_PaddleClas
```
Go to the `dataset/` directory, download and unzip the data.
```shell
cd dataset
wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
tar -xf person_exists.tar
cd ../
```
<a name="3"></a>
### 3. Training with standard classification configuration
<a name="3.1"></a>
#### 3.1 PP-LCNet as backbone
PULC adopts the lightweight backbone network PP-LCNet, which is 50% faster than other networks with the same accuracy. You can view the detailed introduction of the backbone network in [PP-LCNet Introduction](../models/PP-LCNet_en.md).
The command to train with PP-LCNet is:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_search.yaml
```
For performance comparison, we also provide configuration files for the large model SwinTransformer_tiny and the lightweight model MobileNetV3_small_x0_35, which you can train with the command:
SwinTransformer_tiny:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/SwinTransformer_tiny_patch4_window7_224.yaml
```
MobileNetV3_small_x0_35:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/MobileNetV3_small_x0_35.yaml
```
The accuracy of the trained models is compared in the following table.
| Model | Tpr(%) | Latency(ms) | Storage Size(M) | Strategy |
|-------|-----------|----------|---------------|---------------|
| SwinTranformer_tiny | 95.69 | 95.30 | 107 | Use ImageNet pretrained model|
| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | Use ImageNet pretrained model |
| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | Use ImageNet pretrained model |
It can be seen that PP-LCNet is much faster than SwinTransformer, but the accuracy is also slightly lower. Below we improve the accuracy of the PP-LCNet model through a series of optimizations.
<a name="3.2"></a>
#### 3.2 SSLD pretrained model
SSLD is a semi-supervised distillation algorithm developed by Baidu. On the ImageNet dataset, the model accuracy can be improved by 3-7 points. You can find a detailed introduction in [SSLD introduction](../advanced_tutorials/distillation/distillation_en.md). We found that using SSLD pre-trained weights can effectively improve the accuracy of the applied classification model. In addition, using a smaller resolution in training can effectively improve model accuracy. At the same time, we also optimize the learning rate.
Based on the above three improvements, the accuracy of our trained model is 92.1%, an increase of 2.6%.
<a name="3.3"></a>
#### 3.3 EDA strategy
Data augmentation is a commonly used optimization strategy in vision algorithms, which can significantly improve the accuracy of the model. In addition to the traditional RandomCrop, RandomFlip, etc. methods, we also apply RandomAugment and RandomErasing. You can find a detailed introduction at [Data Augmentation Introduction](../advanced_tutorials/DataAugmentation_en.md).
Since these two kinds of data augmentation greatly modify the picture, making the classification task more difficult, it may lead to under-fitting of the model on some datasets. We will set the probability of enabling these two methods in advance.
Based on the above improvements, we obtained a model accuracy of 93.43%, an increase of 1.3%.
<a name="3.4"></a>
#### 3.4 SKL-UGI knowledge distillation
Knowledge distillation is a method that can effectively improve the accuracy of small models. You can find a detailed introduction in [Introduction to Knowledge Distillation](../advanced_tutorials/distillation/distillation_en.md). We choose ResNet101_vd as the teacher model for distillation. In order to adapt to the distillation process, we also adjust the learning rate of different stages of the network here. Based on the above improvements, we trained the model to get a model accuracy of 95.6%, an increase of 1.4%.
<a name="3.5"></a>
#### 3.5 Summary
After the optimization of the above methods, the final accuracy of PP-LCNet reaches 95.6%, reaching the accuracy level of the large model. We summarize the experimental results in the following table:
| Model | Tpr(%) | Latency(ms) | Storage Size(M) | Strategy |
|-------|-----------|----------|---------------|---------------|
| SwinTranformer_tiny | 95.69 | 95.30 | 107 | Use ImageNet pretrained model |
| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | Use ImageNet pretrained model |
| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | Use ImageNet pretrained model |
| PPLCNet_x1_0 | 92.10 | 2.12 | 6.5 | Use SSLD pretrained model |
| PPLCNet_x1_0 | 93.43 | 2.12 | 6.5 | Use SSLD pretrained model + EDA Strategy|
| <b>PPLCNet_x1_0<b> | <b>95.60<b> | <b>2.12<b> | <b>6.5<b> | Use SSLD pretrained model + EDA Strategy + SKL-UGI knowledge distillation |
We also used the same optimization strategy in the other 8 scenarios and got the following results:
| scenarios | large model | large model metrics(%) | small model | small model metrics(%) |
|----------|----------|----------|----------|----------|
| Pedestrian Attribute Classification | Res2Net200_vd | 81.25 | PPLCNet_x1_0 | 78.59 |
| Classification of Wheather Wearing Safety Helmet | Res2Net200_vd| 98.92 | PPLCNet_x1_0 |99.38 |
| Traffic Sign Classification | SwinTransformer_tiny | 98.11 | PPLCNet_x1_0 | 98.35 |
| Vehicle Attribute Classification | Res2Net200_vd_26w_4s | 91.36 | PPLCNet_x1_0 | 90.81 |
| Car Exists Classification | SwinTransformer_tiny | 97.71 | PPLCNet_x1_0 | 95.92 |
| Text Image Orientation Classification | SwinTransformer_tiny |99.12 | PPLCNet_x1_0 | 99.06 |
| Text-line Orientation Classification | SwinTransformer_tiny | 93.61 | PPLCNet_x1_0 | 96.01 |
| Language Classification | SwinTransformer_tiny | 98.12 | PPLCNet_x1_0 | 99.26 |
It can be seen from the results that the PULC scheme can improve the model accuracy in multiple application scenarios. Using the PULC scheme can greatly reduce the workload of model optimization and quickly obtain models with higher accuracy.
<a name="4"></a>
### 4. Hyperparameter Search
In the above training process, we adjusted parameters such as learning rate, data augmentation probability, and stage learning rate mult list. The optimal values of these parameters may not be the same in different scenarios. We provide a quick hyperparameter search script to automate the process of hyperparameter tuning. This script traverses the parameters in the search value list to replace the parameters in the default configuration, then trains in sequence, and finally selects the parameters corresponding to the model with the highest accuracy as the search result.
<a name="4.1"></a>
#### 4.1 Search based on default configuration
The configuration file [search.yaml](../../../ppcls/configs/PULC/person_exists/search.yaml) defines the configuration of hyperparameter search in person exists or not scenarios. Use the following commands to complete hyperparameter search.
```bash
python3 tools/search_strategy.py -c ppcls/configs/PULC/person_exists/search.yaml
```
**Note**:Regarding the search part, we are also constantly improving, so stay tuned.
<a name="4.2"></a>
#### 4.2 Custom search configuration
You can also modify the configuration of hyperparameter search based on training results or your parameter tuning experience.
Modify the `search_values` field in `lrs` to modify the list of learning rate search values;
Modify the `search_values` field in `resolutions` to modify the search value list of resolutions;
Modify the `search_values` field in `ra_probs` to modify the search value list of RandAugment activation probability;
Modify the `search_values` field in `re_probs` to modify the search value list of RnadomErasing on probability;
Modify the `search_values` field in `lr_mult_list` to modify the lr_mult search value list;
Modify the `search_values` field in `teacher` to modify the search list of the teacher model.
After the search is completed, the final results will be generated in `output/search_person_exists`, where, except for `search_res`, the directories in `output/search_person_exists` are the weights and training log files of the results of the corresponding hyperparameters of each search training, ` search_res` corresponds to the result of knowledge distillation, that is, the final model. The weights of the model are stored in `output/output_dir/search_person_exists/DistillationModel/best_model_student.pdparams`.
# PaddleClas wheel package
Paddleclas supports Python WHL package for prediction. At present, WHL package only supports image classification, but does not support subject detection, feature extraction and vector retrieval.
PaddleClas supports Python wheel package for prediction. At present, PaddleClas wheel supports image classification including ImagetNet1k models and PULC models, but does not support mainbody detection, feature extraction and vector retrieval.
---
......@@ -9,7 +9,7 @@ Paddleclas supports Python WHL package for prediction. At present, WHL package o
- [1. Installation](#1)
- [2. Quick Start](#2)
- [3. Definition of Parameters](#3)
- [4. Usage](#4)
- [4. More usage](#4)
- [4.1 View help information](#4.1)
- [4.2 Prediction using inference model provide by PaddleClas](#4.2)
- [4.3 Prediction using local model files](#4.3)
......@@ -20,6 +20,7 @@ Paddleclas supports Python WHL package for prediction. At present, WHL package o
- [4.8 Specify the mapping between class id and label name](#4.8)
<a name="1"></a>
## 1. Installation
* installing from pypi
......@@ -36,8 +37,14 @@ pip3 install dist/*
```
<a name="2"></a>
## 2. Quick Start
* Using the `ResNet50` model provided by PaddleClas, the following image(`'docs/images/inference_deployment/whl_demo.jpg'`) as an example.
<a name="2.1"></a>
### 2.1 ImageNet1k models
Using the `ResNet50` model provided by PaddleClas, the following image(`'docs/images/inference_deployment/whl_demo.jpg'`) as an example.
![](../../images/inference_deployment/whl_demo.jpg)
......@@ -68,25 +75,89 @@ filename: docs/images/inference_deployment/whl_demo.jpg, top-5, class_ids: [8, 7
Predict complete!
```
<a name="2.2"></a>
### 2.2 PULC models
PULC integrates various state-of-the-art algorithms such as backbone network, data augmentation and distillation, etc., and finally can automatically obtain a lightweight and high-precision image classification model.
PaddleClas provides a series of test cases, which contain demos of different scenes about people, cars, OCR, etc. Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the data.
Prection using the PULC "Human Exists Classification" model provided by PaddleClas:
* Python
```python
import paddleclas
model = paddleclas.PaddleClas(model_name="person_exists")
result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
print(next(result))
```
```
>>> result
[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
```
`Nobody` means there is no one in the image, `someone` means there is someone in the image. Therefore, the prediction result indicates that there is no one in the figure.
**Note**: `model.predict()` is a generator, so `next()` or `for` is needed to call it. This would to predict by batch that length is `batch_size`, default by 1. You can specify the argument `batch_size` and `model_name` when instantiating PaddleClas object, for example: `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`. Please refer to [Supported Model List](#PULC_Models) for the supported model list.
* CLI
```bash
paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
```
```
>>> result
class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
Predict complete!
```
**Note**: The "--infer_imgs" argument specify the image(s) to be predict, and you can also specify a directoy contains images. If use other model, you can specify the `--model_name` argument. Please refer to [Supported Model List](#PULC_Models) for the supported model list.
<a name="PULC_Models"></a>
**Supported Model List**
The name of PULC series models are as follows:
| Name | Intro |
| --- | --- |
| person_exists | Human Exists Classification |
| person_attribute | Pedestrian Attribute Classification |
| safety_helmet | Classification of Wheather Wearing Safety Helmet |
| traffic_sign | Traffic Sign Classification |
| vehicle_attribute | Vehicle Attribute Classification |
| car_exists | Car Exists Classification |
| text_image_orientation | Text Image Orientation Classification |
| textline_orientation | Text-line Orientation Classification |
| language_classification | Language Classification |
Please refer to [Human Exists Classification](../PULC/PULC_person_exists_en.md)[Pedestrian Attribute Classification](../PULC/PULC_person_attribute_en.md)[Classification of Wheather Wearing Safety Helmet](../PULC/PULC_safety_helmet_en.md)[Traffic Sign Classification](../PULC/PULC_traffic_sign_en.md)[Vehicle Attribute Classification](../PULC/PULC_vehicle_attribute_en.md)[Car Exists Classification](../PULC/PULC_car_exists_en.md)[Text Image Orientation Classification](../PULC/PULC_text_image_orientation_en.md)[Text-line Orientation Classification](../PULC/PULC_textline_orientation_en.md)[Language Classification](../PULC/PULC_language_classification_en.md) for more information about different scenarios.
<a name="3"></a>
## 3. Definition of Parameters
The following parameters can be specified in Command Line or used as parameters of the constructor when instantiating the PaddleClas object in Python.
* model_name(str): If using inference model based on ImageNet1k provided by Paddle, please specify the model's name by the parameter.
* inference_model_dir(str): Local model files directory, which is valid when `model_name` is not specified. The directory should contain `inference.pdmodel` and `inference.pdiparams`.
* infer_imgs(str): The path of image to be predicted, or the directory containing the image files, or the URL of the image from Internet.
* use_gpu(bool): Whether to use GPU or not, default by `True`.
* gpu_mem(int): GPU memory usages,default by `8000`
* use_tensorrt(bool): Whether to open TensorRT or not. Using it can greatly promote predict preformance, default by `False`.
* enable_mkldnn(bool): Whether enable MKLDNN or not, default `False`.
* cpu_num_threads(int): Assign number of cpu threads, valid when `--use_gpu` is `False` and `--enable_mkldnn` is `True`, default by `10`.
* batch_size(int): Batch size, default by `1`.
* resize_short(int): Resize the minima between height and width into `resize_short`, default by `256`.
* crop_size(int): Center crop image to `crop_size`, default by `224`.
* topk(int): Print (return) the `topk` prediction results, default by `5`.
* class_id_map_file(str): The mapping file between class ID and label, default by `ImageNet1K` dataset's mapping.
* pre_label_image(bool): whether prelabel or not, default=False.
* save_dir(str): The directory to save the prediction results that can be used as pre-label, default by `None`, that is, not to save.
* use_gpu(bool): Whether to use GPU or not.
* gpu_mem(int): GPU memory usages.
* use_tensorrt(bool): Whether to open TensorRT or not. Using it can greatly promote predict preformance.
* enable_mkldnn(bool): Whether enable MKLDNN or not.
* cpu_num_threads(int): Assign number of cpu threads, valid when `--use_gpu` is `False` and `--enable_mkldnn` is `True`.
* batch_size(int): Batch size.
* resize_short(int): Resize the minima between height and width into `resize_short`.
* crop_size(int): Center crop image to `crop_size`.
* topk(int): Print (return) the `topk` prediction results when Topk postprocess is used.
* threshold(float): The threshold of ThreshOutput when postprocess is used.
* class_id_map_file(str): The mapping file between class ID and label.
* save_dir(str): The directory to save the prediction results that can be used as pre-label.
**Note**: If you want to use `Transformer series models`, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of model, and need to set `resize_short=384`, `resize=384`. The following is a demo.
......@@ -103,6 +174,7 @@ clas = PaddleClas(model_name='ViT_base_patch16_384', resize_short=384, crop_size
```
<a name="4"></a>
## 4. Usage
PaddleClas provides two ways to use:
......@@ -110,6 +182,7 @@ PaddleClas provides two ways to use:
2. Bash command line programming.
<a name="4.1"></a>
### 4.1 View help information
* CLI
......@@ -118,6 +191,7 @@ paddleclas -h
```
<a name="4.2"></a>
### 4.2 Prediction using inference model provide by PaddleClas
You can use the inference model provided by PaddleClas to predict, and only need to specify `model_name`. In this case, PaddleClas will automatically download files of specified model and save them in the directory `~/.paddleclas/`.
......@@ -136,6 +210,7 @@ paddleclas --model_name='ResNet50' --infer_imgs='docs/images/inference_deploymen
```
<a name="4.3"></a>
### 4.3 Prediction using local model files
You can use the local model files trained by yourself to predict, and only need to specify `inference_model_dir`. Note that the directory must contain `inference.pdmodel` and `inference.pdiparams`.
......@@ -154,6 +229,7 @@ paddleclas --inference_model_dir='./inference/' --infer_imgs='docs/images/infere
```
<a name="4.4"></a>
### 4.4 Prediction by batch
You can predict by batch, only need to specify `batch_size` when `infer_imgs` is direcotry contain image files.
......@@ -173,6 +249,7 @@ paddleclas --model_name='ResNet50' --infer_imgs='docs/images/' --batch_size 2
```
<a name="4.5"></a>
### 4.5 Prediction of Internet image
You can predict the Internet image, only need to specify URL of Internet image by `infer_imgs`. In this case, the image file will be downloaded and saved in the directory `~/.paddleclas/images/`.
......@@ -191,6 +268,7 @@ paddleclas --model_name='ResNet50' --infer_imgs='https://raw.githubusercontent.c
```
<a name="4.6"></a>
### 4.6 Prediction of NumPy.array format image
In Python code, you can predict the `NumPy.array` format image, only need to use the `infer_imgs` to transfer variable of image data. Note that the models in PaddleClas only support to predict 3 channels image data, and channels order is `RGB`.
......@@ -205,6 +283,7 @@ print(next(result))
```
<a name="4.7"></a>
### 4.7 Save the prediction result(s)
You can save the prediction result(s) as pre-label, only need to use `pre_label_out_dir` to specify the directory to save.
......@@ -212,17 +291,18 @@ You can save the prediction result(s) as pre-label, only need to use `pre_label_
```python
from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50', save_dir='./output_pre_label/')
infer_imgs = 'docs/images/inference_deployment/whl_' # it can be infer_imgs folder path which contains all of images you want to predict.
infer_imgs = 'docs/images/' # it can be infer_imgs folder path which contains all of images you want to predict.
result=clas.predict(infer_imgs)
print(next(result))
```
* CLI
```bash
paddleclas --model_name='ResNet50' --infer_imgs='docs/images/inference_deployment/whl_' --save_dir='./output_pre_label/'
paddleclas --model_name='ResNet50' --infer_imgs='docs/images/' --save_dir='./output_pre_label/'
```
<a name="4.8"></a>
### 4.8 Specify the mapping between class id and label name
You can specify the mapping between class id and label name, only need to use `class_id_map_file` to specify the mapping file. PaddleClas uses ImageNet1K's mapping by default.
......
此差异已折叠。
此差异已折叠。
# PULC 模型库
------
此处提供了 PULC 模型库的相关指标和模型的下载链接,其中预训练模型可以用来微调训练,推理模型可以直接用来预测和部署。
|模型名称|模型简介|模型精度 |模型大小|推理耗时|下载地址|
| --- | --- | --- | --- | --- | --- |
| person_exists |[PULC有人/无人分类模型](PULC_person_exists.md)| 95.60 |6.5M|2.58ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_exists_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_exists_pretrained.pdparams)|
| person_attribute |[PULC人体属性识别模型](PULC_person_attribute.md)| 78.59 |6.6M|2.01ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_attribute_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_attribute_pretrained.pdparams)|
| safety_helmet |[PULC佩戴安全帽分类模型](PULC_safety_helmet.md)| 99.38 |6.5M|2.03ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/safety_helmet_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/safety_helmet_pretrained.pdparams)|
| traffic_sign |[PULC交通标志分类模型](PULC_traffic_sign.md)| 98.35 |8.2M|2.10ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/traffic_sign_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/traffic_sign_pretrained.pdparams)|
| vehicle_attribute |[PULC车辆属性识别模型](PULC_vehicle_attribute.md)| 90.81 |7.2M|2.36ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/vehicle_attribute_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/vehicle_attribute_pretrained.pdparams)|
| car_exists |[PULC有车/无车分类模型](PULC_car_exists.md) | 95.92 | 6.6M | 2.38ms |[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/car_exists_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/car_exists_pretrained.pdparams)|
| text_image_orientation |[PULC含文字图像方向分类模型](PULC_text_image_orientation.md)| 99.06 | 6.5M | 2.16ms |[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/text_image_orientation_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/text_image_orientation_pretrained.pdparams)|
| textline_orientation |[PULC文本行方向分类模型](PULC_textline_orientation.md)| 96.01 |6.5M|2.72ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/textline_orientation_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/textline_orientation_pretrained.pdparams)|
| language_classification |[PULC语种分类模型](PULC_language_classification.md)| 99.26 |6.5M|2.58ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/language_classification_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/language_classification_pretrained.pdparams)|
**备注:**
* 以上所有的模型的 backbone 均为 PPLCNet_x1_0,部分模型大小不同是由于分类的输出大小不同导致的,推理耗时是基于Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,其中测试过程开启 MKLDNN 加速策略,线程数为10。速度测试过程会有轻微波动。
* person_exists、safety_helmet、car_exists 的评测指标为 TprAtFpr,person_attribute、vehicle_attribute的评测指标为ma、traffic_sign、text_image_orientation、textline_orientation、language_classification的评测指标为Top-1 Acc。
# PULC 人体属性识别模型
------
## 目录
- [1. 模型和应用场景介绍](#1)
- [2. 模型快速体验](#2)
- [2.1 安装 paddlepaddle](#2.1)
- [2.2 安装 paddleclas](#2.2)
- [2.3 预测](#2.3)
- [3. 模型训练、评估和预测](#3)
- [3.1 环境配置](#3.1)
- [3.2 数据准备](#3.2)
- [3.2.1 数据集来源](#3.2.1)
- [3.2.2 数据集获取](#3.2.2)
- [3.3 模型训练](#3.3)
- [3.4 模型评估](#3.4)
- [3.5 模型预测](#3.5)
- [4. 模型压缩](#4)
- [4.1 SKL-UGI 知识蒸馏](#4.1)
- [4.1.1 教师模型训练](#4.1.1)
- [4.1.2 蒸馏训练](#4.1.2)
- [5. 超参搜索](#5)
- [6. 模型推理部署](#6)
- [6.1 推理模型准备](#6.1)
- [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
- [6.1.2 直接下载 inference 模型](#6.1.2)
- [6.2 基于 Python 预测引擎推理](#6.2)
- [6.2.1 预测单张图像](#6.2.1)
- [6.2.2 基于文件夹的批量预测](#6.2.2)
- [6.3 基于 C++ 预测引擎推理](#6.3)
- [6.4 服务化部署](#6.4)
- [6.5 端侧部署](#6.5)
- [6.6 Paddle2ONNX 模型转换与预测](#6.6)
<a name="1"></a>
## 1. 模型和应用场景介绍
该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight Classification)快速构建轻量级、高精度、可落地的人体属性识别模型。该模型可以广泛应用于行人分析、行人跟踪等场景。
下表列出了不同人体属性识别模型的相关指标,前两行展现了使用 SwinTransformer_tiny、Res2Net200_vd_26w_4s 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第三行至第六行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。
| 模型 | ma(%) | 延时(ms) | 存储(M) | 策略 |
|-------|-----------|----------|---------------|---------------|
| Res2Net200_vd_26w_4s | 81.25 | 77.51 | 293 | 使用ImageNet预训练模型 |
| SwinTransformer_tiny | 80.17 | 89.51 | 107 | 使用ImageNet预训练模型 |
| MobileNetV3_small_x0_35 | 70.79 | 2.90 | 1.7 | 使用ImageNet预训练模型 |
| PPLCNet_x1_0 | 76.31 | 2.01 | 6.6 | 使用ImageNet预训练模型 |
| PPLCNet_x1_0 | 77.31 | 2.01 | 6.6 | 使用SSLD预训练模型 |
| PPLCNet_x1_0 | 77.71 | 2.01 | 6.6 | 使用SSLD预训练模型+EDA策略|
| <b>PPLCNet_x1_0<b> | <b>78.59<b> | <b>2.01<b> | <b>6.6<b> | 使用SSLD预训练模型+EDA策略+SKL-UGI知识蒸馏策略|
从表中可以看出,backbone 为 Res2Net200_vd_26w_4s 和 SwinTransformer_tiny 时精度较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,但是精度也大幅下降。将 backbone 替换为 PPLCNet_x1_0 时,精度较 MobileNetV3_small_x0_35 高 5.5%,于此同时,速度更快。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升 1%,进一步地,当融合EDA策略后,精度可以再提升 0.4%,最后,在使用 SKL-UGI 知识蒸馏后,精度可以继续提升 0.88%。此时,PPLCNet_x1_0 的精度与 SwinTransformer_tiny 仅相差1.58%,但是速度快 44 倍。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
**备注:**
* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)
<a name="2"></a>
## 2. 模型快速体验
<a name="2.1"></a>
### 2.1 安装 paddlepaddle
- 您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
```bash
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
```
- 您的机器是CPU,请运行以下命令安装
```bash
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
<a name="2.2"></a>
### 2.2 安装 paddleclas
使用如下命令快速安装 paddleclas
```
pip3 install paddleclas
```
<a name="2.3"></a>
### 2.3 预测
点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
* 使用命令行快速预测
```bash
paddleclas --model_name=person_attribute --infer_imgs=pulc_demo_imgs/person_attribute/090004.jpg
```
结果如下:
```
>>> result
attributes: ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], output: [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1], filename: pulc_demo_imgs/person_attribute/090004.jpg
Predict complete!
```
**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
* 在 Python 代码中预测
```python
import paddleclas
model = paddleclas.PaddleClas(model_name="person_attribute")
result = model.predict(input_data="pulc_demo_imgs/person_attribute/090004.jpg")
print(next(result))
```
**备注**`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="person_attribute", batch_size=2)`, 使用默认的代码返回结果示例如下:
```
>>> result
[{'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1], 'filename': 'pulc_demo_imgs/person_attribute/090004.jpg'}]
```
<a name="3"></a>
## 3. 模型训练、评估和预测
<a name="3.1"></a>
### 3.1 环境配置
* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
<a name="3.2"></a>
### 3.2 数据准备
<a name="3.2.1"></a>
#### 3.2.1 数据集来源
本案例中所使用的数据为[pa100k 数据集](https://www.v7labs.com/open-datasets/pa-100k)
<a name="3.2.2"></a>
#### 3.2.2 数据集获取
部分数据可视化如下所示。
<div align="center">
<img src="../../images/PULC/docs/person_attribute_data_demo.png" width = "500" />
</div>
我们将原始数据转换成了 PaddleClas 多标签可读的数据格式,可以直接下载。
进入 PaddleClas 目录。
```
cd path_to_PaddleClas
```
进入 `dataset/` 目录,下载并解压有人/无人场景的数据。
```shell
cd dataset
wget https://paddleclas.bj.bcebos.com/data/PULC/pa100k.tar
tar -xf pa100k.tar
cd ../
```
执行上述命令后,`dataset/` 下存在 `pa100k` 目录,该目录中具有以下数据:
执行上述命令后,`pa100k`目录中具有以下数据:
```
pa100k
├── train
│   ├── 000001.jpg
│   ├── 000002.jpg
...
├── val
│   ├── 080001.jpg
│   ├── 080002.jpg
...
├── test
│   ├── 090001.jpg
│   ├── 090002.jpg
...
...
├── train_list.txt
├── train_val_list.txt
├── val_list.txt
├── test_list.txt
```
其中`train/``val/``test/`分别为训练集、验证集和测试集。`train_list.txt``val_list.txt``test_list.txt`分别为训练集、验证集、测试集的标签文件。在本例子中,`test_list.txt`暂时没有使用。
<a name="3.3"></a>
### 3.3 模型训练
`ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml
```
验证集的最佳指标在 `90.07%` 左右(数据集较小,一般有0.3%左右的波动)。
<a name="3.4"></a>
### 3.4 模型评估
训练好模型之后,可以通过以下命令实现对模型指标的评估。
```bash
python3 tools/eval.py \
-c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
```
其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
<a name="3.5"></a>
### 3.5 模型预测
模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
```bash
python3 tools/infer.py \
-c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/PPLCNet_x1_0/best_model
```
输出结果如下:
```
[{'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}]
```
**备注:**
* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
* 默认是对 `deploy/images/PULC/person_attribute/090004.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
<a name="4"></a>
## 4. 模型压缩
<a name="4.1"></a>
### 4.1 SKL-UGI 知识蒸馏
SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)
<a name="4.1.1"></a>
#### 4.1.1 教师模型训练
复用 `ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml` 中的超参数,训练教师模型,训练脚本如下:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
-o Arch.name=ResNet101_vd
```
验证集的最佳指标为 `80.10%` 左右,当前教师模型最好的权重保存在 `output/ResNet101_vd/best_model.pdparams`
<a name="4.1.2"></a>
#### 4.1.2 蒸馏训练
配置文件`ppcls/configs/PULC/person_attribute/PPLCNet_x1_0_Distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型。训练脚本如下:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0_Distillation.yaml \
-o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
```
验证集的最佳指标为 `78.5%` 左右,当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`
<a name="5"></a>
## 5. 超参搜索
[3.2 节](#3.2)[4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `SHAS 超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[SHAS 超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
<a name="6"></a>
## 6. 模型推理部署
<a name="6.1"></a>
### 6.1 推理模型准备
Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)
当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
<a name="6.1.1"></a>
### 6.1.1 基于训练得到的权重导出 inference 模型
此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person_attribute_infer
```
执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_person_attribute_infer` 文件夹,`models` 文件夹下应有如下文件结构:
```
├── PPLCNet_x1_0_person_attribute_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
<a name="6.1.2"></a>
### 6.1.2 直接下载 inference 模型
[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
```
cd deploy/models
# 下载 inference 模型并解压
wget https://paddleclas.bj.bcebos.com/models/PULC/person_attribute_infer.tar && tar -xf person_attribute_infer.tar
```
解压完毕后,`models` 文件夹下应有如下文件结构:
```
├── person_attribute_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
<a name="6.2"></a>
### 6.2 基于 Python 预测引擎推理
<a name="6.2.1"></a>
#### 6.2.1 预测单张图像
返回 `deploy` 目录:
```
cd ../
```
运行下面的命令,对图像 `./images/PULC/person_attribute/090004.jpg` 进行车辆属性识别。
```shell
# 使用下面的命令使用 GPU 进行预测
python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.use_gpu=True
# 使用下面的命令使用 CPU 进行预测
python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.use_gpu=False
```
输出结果如下。
```
090004.jpg: {'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}
```
<a name="6.2.2"></a>
#### 6.2.2 基于文件夹的批量预测
如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
```shell
# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.infer_imgs="./images/PULC/person_attribute/"
```
终端中会输出该文件夹内所有图像的属性识别结果,如下所示。
```
090004.jpg: {'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}
090007.jpg: {'attributes': ['Female', 'Age18-60', 'Side', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'No bag', 'Upper: ShortSleeve', 'Lower: Skirt&Dress', 'No boots'], 'output': [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0]}
```
<a name="6.3"></a>
### 6.3 基于 C++ 预测引擎推理
PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
<a name="6.4"></a>
### 6.4 服务化部署
Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)
PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
<a name="6.5"></a>
### 6.5 端侧部署
Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)
PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
<a name="6.6"></a>
### 6.6 Paddle2ONNX 模型转换与预测
Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)
PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
# PaddleClas构建有人/无人分类案例
此处提供了用户使用 PaddleClas 快速构建轻量级、高精度、可落地的有人/无人的分类模型教程,主要基于有人/无人场景的数据,融合了轻量级骨干网络PPLCNet、SSLD预训练权重、EDA数据增强策略、SKL-UGI知识蒸馏策略、SHAS超参数搜索策略,得到精度高、速度快、易于部署的二分类模型。
------
## 目录
- [1. 环境配置](#1)
- [2. 有人/无人场景推理预测](#2)
- [2.1 下载模型](#2.1)
- [2.2 模型推理预测](#2.2)
- [2.2.1 预测单张图像](#2.2.1)
- [2.2.2 基于文件夹的批量预测](#2.2.2)
- [3.有人/无人场景训练](#3)
- [3.1 数据准备](#3.1)
- [3.2 模型训练](#3.2)
- [3.2.1 基于默认超参数训练](#3.2.1)
- [3.2.1.1 基于默认超参数训练轻量级模型](#3.2.1.1)
- [3.2.1.2 基于默认超参数训练教师模型](#3.2.1.2)
- [3.2.1.3 基于默认超参数进行蒸馏训练](#3.2.1.3)
- [3.2.2 超参数搜索训练](#3.2)
- [4. 模型评估与推理](#4)
- [4.1 模型评估](#3.1)
- [4.2 模型预测](#3.2)
- [4.3 使用 inference 模型进行推理](#4.3)
- [4.3.1 导出 inference 模型](#4.3.1)
- [4.3.2 模型推理预测](#4.3.2)
<a name="1"></a>
## 1. 环境配置
* 安装:请先参考 [Paddle 安装教程](../installation/install_paddle.md) 以及 [PaddleClas 安装教程](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
<a name="2"></a>
## 2. 有人/无人场景推理预测
<a name="2.1"></a>
### 2.1 下载模型
* 进入 `deploy` 运行目录。
```
cd deploy
```
下载有人/无人分类的模型。
```
mkdir models
cd models
# 下载inference 模型并解压
wget https://paddleclas.bj.bcebos.com/models/PULC/person_cls_infer.tar && tar -xf person_cls_infer.tar
```
解压完毕后,`models` 文件夹下应有如下文件结构:
```
├── person_cls_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
<a name="2.2"></a>
### 2.2 模型推理预测
<a name="2.2.1"></a>
#### 2.2.1 预测单张图像
返回 `deploy` 目录:
```
cd ../
```
运行下面的命令,对图像 `./images/PULC/person/objects365_02035329.jpg` 进行有人/无人分类。
```shell
# 使用下面的命令使用 GPU 进行预测
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o PostProcess.ThreshOutput.threshold=0.9794
# 使用下面的命令使用 CPU 进行预测
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o PostProcess.ThreshOutput.threshold=0.9794 -o Global.use_gpu=False
```
输出结果如下。
```
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
```
**备注:** 真实场景中往往需要在假正类率(Fpr)小于某一个指标下求真正类率(Tpr),该场景中的`val`数据集在千分之一Fpr下得到的最佳Tpr所得到的阈值为`0.9794`,故此处的`threshold``0.9794`。该阈值的确定方法可以参考[3.2节](#3.2)
<a name="2.2.2"></a>
#### 2.2.2 基于文件夹的批量预测
如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
```shell
# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o Global.infer_imgs="./images/PULC/person/"
```
终端中会输出该文件夹内所有图像的分类结果,如下所示。
```
objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
```
其中,`someone` 表示该图里存在人,`nobody` 表示该图里不存在人。
<a name="3"></a>
## 3.有人/无人场景训练
<a name="3.1"></a>
### 3.1 数据准备
进入 PaddleClas 目录。
```
cd path_to_PaddleClas
```
进入 `dataset/` 目录,下载并解压有人/无人场景的数据。
```shell
cd dataset
wget https://paddleclas.bj.bcebos.com/data/cls_demo/person.tar
tar -xf person.tar
cd ../
```
执行上述命令后,`dataset/`下存在`person`目录,该目录中具有以下数据:
```
├── train
│   ├── 000000000009.jpg
│   ├── 000000000025.jpg
...
├── val
│   ├── objects365_01780637.jpg
│   ├── objects365_01780640.jpg
...
├── ImageNet_val
│   ├── ILSVRC2012_val_00000001.JPEG
│   ├── ILSVRC2012_val_00000002.JPEG
...
├── train_list.txt
├── train_list.txt.debug
├── train_list_for_distill.txt
├── val_list.txt
└── val_list.txt.debug
```
其中`train/``val/`分别为训练集和验证集。`train_list.txt``val_list.txt`分别为训练集和验证集的标签文件,`train_list.txt.debug``val_list.txt.debug`分别为训练集和验证集的`debug`标签文件,其分别是`train_list.txt``val_list.txt`的子集,用该文件可以快速体验本案例的流程。`ImageNet_val/`是ImageNet的验证集,该集合和`train`集合的混合数据用于本案例的`SKL-UGI知识蒸馏策略`,对应的训练标签文件为`train_list_for_distill.txt`
* **注意**:
* 本案例中所使用的所有数据集均为开源数据,`train`集合为[MS-COCO数据](https://cocodataset.org/#overview)的训练集的子集,`val`集合为[Object365数据](https://www.objects365.org/overview.html)的训练集的子集,`ImageNet_val`[ImageNet数据](https://www.image-net.org/)的验证集。数据集的筛选流程可以参考[有人/无人场景数据集筛选方法]()。
<a name="3.2"></a>
### 3.2 模型训练
<a name="3.2.1"></a>
#### 3.2.1 基于默认超参数训练
<a name="3.2.1.1"></a>
##### 3.2.1.1 基于默认超参数训练轻量级模型
`ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml`中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml
```
验证集的最佳指标在0.94-0.95之间(数据集较小,容易造成波动)。
**备注:**
* 此时使用的指标为Tpr,该指标描述了在假正类率(Fpr)小于某一个指标时的真正类率(Tpr),是产业中二分类问题常用的指标之一。在本案例中,Fpr为千分之一。关于Fpr和Tpr的更多介绍,可以参考[这里](https://baike.baidu.com/item/AUC/19282953)
* 在eval时,会打印出来当前最佳的TprAtFpr指标,具体地,其会打印当前的`Fpr``Tpr`值,以及当前的`threshold`值,`Tpr`值反映了在当前`Fpr`值下的召回率,该值越高,代表模型越好。`threshold` 表示当前最佳`Fpr`所对应的分类阈值,可用于后续模型部署落地等。
<a name="3.2.1.2"></a>
##### 3.2.1.2 基于默认超参数训练教师模型
复用`ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml`中的超参数,训练教师模型,训练脚本如下:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
-o Arch.name=ResNet101_vd
```
验证集的最佳指标为0.96-0.98之间,当前教师模型最好的权重保存在`output/ResNet101_vd/best_model.pdparams`
<a name="3.2.1.3"></a>
##### 3.2.1.3 基于默认超参数进行蒸馏训练
配置文件`ppcls/configs/PULC/PULC/Distillation/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型,使用ImageNet数据集的验证集作为新增的无标签数据。训练脚本如下:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person/Distillation/PPLCNet_x1_0_distillation.yaml \
-o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
```
验证集的最佳指标为0.95-0.97之间,当前模型最好的权重保存在`output/DistillationModel/best_model_student.pdparams`
<a name="3.2.2"></a>
#### 3.2.2 超参数搜索训练
[3.2 小节](#3.2) 提供了在已经搜索并得到的超参数上进行了训练,此部分内容提供了搜索的过程,此过程是为了得到更好的训练超参数。
* 搜索运行脚本如下:
```shell
python tools/search_strategy.py -c ppcls/configs/StrategySearch/person.yaml
```
`ppcls/configs/StrategySearch/person.yaml`中指定了具体的 GPU id 号和搜索配置, 默认搜索的训练日志和模型存放于`output/search_person`中,最终的蒸馏模型存放于`output/search_person/search_res/DistillationModel/best_model_student.pdparams`
* **注意**:
* 3.1小节提供的默认配置已经经过了搜索,所以此过程不是必要的过程,如果自己的训练数据集有变化,可以尝试此过程。
* 此过程基于当前数据集在 V100 4 卡上大概需要耗时 10 小时,如果缺少机器资源,希望体验搜索过程,可以将`ppcls/configs/cls_demo/person/PPLCNet/PPLCNet_x1_0_search.yaml`中的`train_list.txt``val_list.txt`分别替换为`train_list.txt.debug``val_list.txt.debug`。替换list只是为了加速跑通整个搜索过程,由于数据量较小,其搜素的结果没有参考性。另外,搜索空间可以根据当前的机器资源来调整,如果机器资源有限,可以尝试缩小搜索空间,如果机器资源较充足,可以尝试扩大搜索空间。
* 如果此过程搜索的得到的超参数与[3.2.1小节](#3.2.1)提供的超参数不一致,主要是由于训练数据较小造成的波动导致,可以忽略。
<a name="4"></a>
## 4. 模型评估与推理
<a name="4.1"></a>
### 4.1 模型评估
训练好模型之后,可以通过以下命令实现对模型指标的评估。
```bash
python3 tools/eval.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
-o Global.pretrained_model="output/DistillationModel/best_model_student"
```
<a name="4.2"></a>
### 4.2 模型预测
模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
```python
python3 tools/infer.py \
-c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
-o Infer.infer_imgs=./dataset/person/val/objects365_01780637.jpg \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.pretrained_model=Infer.PostProcess.threshold=0.9794
```
输出结果如下:
```
[{'class_ids': [0], 'scores': [0.9878496769815683], 'label_names': ['nobody'], 'file_name': './dataset/person/val/objects365_01780637.jpg'}]
```
**备注:** 这里的`Infer.PostProcess.threshold`的值需要根据实际场景来确定,此处的`0.9794`是在该场景中的`val`数据集在千分之一Fpr下得到的最佳Tpr所得到的。
<a name="4.3"></a>
### 4.3 使用 inference 模型进行推理
<a name="4.3.1"></a>
### 4.3.1 导出 inference 模型
通过导出 inference 模型,PaddlePaddle 支持使用预测引擎进行预测推理。接下来介绍如何用预测引擎进行推理:
首先,对训练好的模型进行转换:
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/cls_demo/PULC/PPLCNet/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person
```
执行完该脚本后会在`deploy/models/`下生成`PPLCNet_x1_0_person`文件夹,该文件夹中的模型与 2.2 节下载的推理预测模型格式一致。
<a name="4.3.2"></a>
### 4.3.2 基于 inference 模型推理预测
推理预测的脚本为:
```
python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o Global.inference_model_dir="models/PPLCNet_x1_0_person" -o PostProcess.ThreshOutput.threshold=0.9794
```
**备注:**
- 此处的`PostProcess.ThreshOutput.threshold`由eval时的最佳`threshold`来确定。
- 更多关于推理的细节,可以参考[2.2节](#2.2)
此差异已折叠。
# PULC 快速体验
------
本文主要介绍通过 PaddleClas whl 包,使用 PULC 系列模型进行预测。
## 目录
- [1. 安装](#1)
- [1.1 安装PaddlePaddle](#11)
- [1.2 安装PaddleClas whl包](#12)
- [2. 快速体验](#2)
- [2.1 命令行使用](#2.1)
- [2.2 Python脚本使用](#2.2)
- [2.3 模型列表](#2.3)
- [3.小结](#3)
<a name="1"></a>
## 1. 安装
<a name="1.1"></a>
### 1.1 安装 PaddlePaddle
- 您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
```bash
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
```
- 您的机器是CPU,请运行以下命令安装
```bash
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
<a name="1.2"></a>
### 1.2 安装 PaddleClas whl 包
```bash
pip3 install paddleclas
```
<a name="2"></a>
## 2. 快速体验
PaddleClas 提供了一系列测试图片,里边包含人、车、OCR等方向的多个场景的demo数据。点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载并解压,然后在终端中切换到相应目录。
<a name="2.1"></a>
### 2.1 命令行使用
```
cd /path/to/pulc_demo_imgs
```
使用命令行预测:
```bash
paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
```
结果如下:
```
>>> result
class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
Predict complete!
```
若预测结果为 `nobody`,表示该图中没有人,若预测结果为 `someone`,则表示该图中有人。此处预测结果为 `nobody`,表示该图中没有人。
**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹,如需要替换模型,更改 `--model_name` 中的模型名字即可,模型名字可以参考[2.3 模型列表](#2.3)
<a name="2.2"></a>
### 2.2 Python 脚本使用
此处提供了在 python 脚本中使用 PULC 有人/无人分类模型预测的例子。
```python
import paddleclas
model = paddleclas.PaddleClas(model_name="person_exists")
result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
print(next(result))
```
打印的结果如下:
```
>>> result
[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
```
**备注**`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`。更换其他模型只需要替换`model_name`, `model_name`,可以参考[2.3 模型列表](#2.3)
<a name="2.3"></a>
### 2.3 模型列表
PULC 系列模型的名称和简介如下:
|模型名称|模型简介|
| --- | --- |
| person_exists | PULC有人/无人分类模型 |
| person_attribute | PULC人体属性识别模型 |
| safety_helmet | PULC佩戴安全帽分类模型 |
| traffic_sign | PULC交通标志分类模型 |
| vehicle_attribute | PULC车辆属性识别模型 |
| car_exists | PULC有车/无车分类模型 |
| text_image_orientation | PULC含文字图像方向分类模型 |
| textline_orientation | PULC文本行方向分类模型 |
| language_classification | PULC语种分类模型 |
<a name="3"></a>
## 3. 小结
通过本节内容,相信您已经熟练掌握 PaddleClas whl 包的 PULC 模型使用方法并获得了初步效果。
PULC 方法产出的系列模型在人、车、OCR等方向的多个场景中均验证有效,用超轻量模型就可实现与 SwinTransformer 模型接近的精度,预测速度提高 40+ 倍。并且打通数据、模型训练、压缩和推理部署全流程,具体地,您可以参考[PULC有人/无人分类模型](PULC_person_exists.md)[PULC人体属性识别模型](PULC_person_attribute.md)[PULC佩戴安全帽分类模型](PULC_safety_helmet.md)[PULC交通标志分类模型](PULC_traffic_sign.md)[PULC车辆属性识别模型](PULC_vehicle_attribute.md)[PULC有车/无车分类模型](PULC_car_exists.md)[PULC含文字图像方向分类模型](PULC_text_image_orientation.md)[PULC文本行方向分类模型](PULC_textline_orientation.md)[PULC语种分类模型](PULC_language_classification.md)
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
## 超轻量图像分类方案PULC
------
## 目录
- [1. PULC方案简介](#1)
- [2. 数据准备](#2)
- [2.1 数据集格式说明](#2.1)
- [2.2 标注文件生成](#2.2)
- [3. 使用标准分类配置进行训练](#3)
- [3.1 骨干网络PP-LCNet](#3.1)
- [3.2 SSLD预训练权重](#3.2)
- [3.3 EDA数据增强策略](#3.3)
- [3.4 SKL-UGI模型蒸馏](#3.4)
- [3.5 总结](#3.5)
- [4. 超参搜索](#4)
- [4.1 基于默认配置搜索](#4.1)
- [4.2 自定义搜索配置](#4.2)
<a name="1"></a>
### 1. PULC方案简介
图像分类是计算机视觉的基础算法之一,是企业应用中最常见的算法,也是许多 CV 应用的重要组成部分。近年来,骨干网络模型发展迅速,ImageNet 的精度纪录被不断刷新。然而,这些模型在实用场景的表现有时却不尽如人意。一方面,精度高的模型往往体积大,运算慢,常常难以满足实际部署需求;另一方面,选择了合适的模型之后,往往还需要经验丰富的工程师进行调参,费时费力。PaddleClas 为了解决企业应用难题,让分类模型的训练和调参更加容易,总结推出了实用轻量图像分类解决方案(PULC, Practical Ultra Lightweight Classification)。PULC融合了骨干网络、数据增广、蒸馏等多种前沿算法,可以自动训练得到轻量且高精度的图像分类模型。
PULC 方案在人、车、OCR等方向的多个场景中均验证有效,用超轻量模型就可实现与 SwinTransformer 模型接近的精度,预测速度提高 40+ 倍。
<div align="center">
<img src="https://user-images.githubusercontent.com/19523330/173011854-b10fcd7a-b799-4dfd-a1cf-9504952a3c44.png" width = "800" />
</div>
方案主要包括 4 部分,分别是:PP-LCNet轻量级骨干网络、SSLD预训练权重、数据增强策略集成(EDA)和 SKL-UGI 知识蒸馏算法。此外,我们还采用了超参搜索的方法,高效优化训练中的超参数。下面,我们以有人/无人场景为例,对方案进行说明。
**备注**:针对一些特定场景,我们提供了基础的训练文档供参考,例如[有人/无人分类模型](PULC_person_exists.md)等,您可以在[这里](./PULC_model_list.md)找到这些文档。如果这些文档中的方法不能满足您的需求,或者您需要自定义训练任务,您可以参考本文档。
<a name="2"></a>
### 2. 数据准备
<a name="2.1"></a>
#### 2.1 数据集格式说明
PaddleClas 使用 `txt` 格式文件指定训练集和测试集,以有人/无人场景为例,其中需要指定 `train_list.txt``val_list.txt` 当作训练集和验证集的数据标签,格式形如:
```
# 每一行采用"空格"分隔图像路径与标注
train/1.jpg 0
train/10.jpg 1
...
```
如果您想获取更多常用分类数据集的信息,可以参考文档可以参考 [PaddleClas 分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明)
<a name="2.2"></a>
#### 2.2 标注文件生成
如果您已经有实际场景中的数据,那么按照上节的格式进行标注即可。这里,我们提供了一个快速生成数据的脚本,您只需要将不同类别的数据分别放在文件夹中,运行脚本即可生成标注文件。
首先,假设您存放数据的路径为`./train``train/` 中包含了每个类别的数据,类别号从 0 开始,每个类别的文件夹中有具体的图像数据。
```shell
train
├── 0
│   ├── 0.jpg
│   ├── 1.jpg
│   └── ...
└── 1
├── 0.jpg
├── 1.jpg
└── ...
└── ...
```
```shell
tree -r -i -f train | grep -E "jpg|JPG|jpeg|JPEG|png|PNG" | awk -F "/" '{print $0" "$2}' > train_list.txt
```
其中,如果涉及更多的图片名称尾缀,可以增加 `grep -E`后的内容, `$2` 中的 `2` 为类别号文件夹的层级。
**备注:** 以上为数据集获取和生成的方法介绍,这里您可以直接下载有人/无人场景数据快速开始体验。
进入 PaddleClas 目录。
```
cd path_to_PaddleClas
```
进入 `dataset/` 目录,下载并解压有人/无人场景的数据。
```shell
cd dataset
wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
tar -xf person_exists.tar
cd ../
```
<a name="3"></a>
### 3. 使用标准分类配置进行训练
<a name="3.1"></a>
#### 3.1 骨干网络PP-LCNet
PULC 采用了轻量骨干网络 PP-LCNet,相比同精度竞品速度快 50%,您可以在[PP-LCNet介绍](../models/PP-LCNet.md)查阅该骨干网络的详细介绍。
直接使用 PP-LCNet 训练的命令为:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_search.yaml
```
为了方便性能对比,我们也提供了大模型 SwinTransformer_tiny 和轻量模型 MobileNetV3_small_x0_35 的配置文件,您可以使用命令训练:
SwinTransformer_tiny:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/SwinTransformer_tiny_patch4_window7_224.yaml
```
MobileNetV3_small_x0_35:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/MobileNetV3_small_x0_35.yaml
```
训练得到的模型精度对比如下表。
| 模型 | Tpr(%) | 延时(ms) | 存储(M) | 策略 |
|-------|-----------|----------|---------------|---------------|
| SwinTranformer_tiny | 95.69 | 95.30 | 107 | 使用 ImageNet 预训练模型 |
| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | 使用 ImageNet 预训练模型 |
| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | 使用 ImageNet 预训练模型 |
从中可以看出,PP-LCNet 的速度比 SwinTransformer 快很多,但是精度也略低。下面我们通过一系列优化来提高 PP-LCNet 模型的精度。
<a name="3.2"></a>
#### 3.2 SSLD预训练权重
SSLD 是百度自研的半监督蒸馏算法,在 ImageNet 数据集上,模型精度可以提升 3-7 个点,您可以在 [SSLD 介绍](../advanced_tutorials/ssld.md)找到详细介绍。我们发现,使用SSLD预训练权重,可以有效提升应用分类模型的精度。此外,在训练中使用更小的分辨率,可以有效提升模型精度。同时,我们也对学习率进行了优化。
基于以上三点改进,我们训练得到模型精度为 92.1%,提升 2.6%。
<a name="3.3"></a>
#### 3.3 EDA数据增强策略
数据增强是视觉算法中常用的优化策略,可以对模型精度有明显提升。除了传统的 RandomCrop,RandomFlip 等方法之外,我们还应用了 RandomAugment 和 RandomErasing。您可以在[数据增强介绍](../advanced_tutorials/DataAugmentation.md)找到详细介绍。
由于这两种数据增强对图片的修改较大,使分类任务变难,在一些小数据集上可能会导致模型欠拟合,我们将提前设置好这两种方法启用的概率。
基于以上改进,我们训练得到模型精度为 93.43%,提升 1.3%。
<a name="3.4"></a>
#### 3.4 SKL-UGI模型蒸馏
模型蒸馏是一种可以有效提升小模型精度的方法,您可以在[知识蒸馏介绍](../advanced_tutorials/ssld.md)找到详细介绍。我们选择 ResNet101_vd 作为教师模型进行蒸馏。为了适应蒸馏过程,我们在此也对网络不同 stage 的学习率进行了调整。基于以上改进,我们训练得到模型精度为 95.6%,提升 1.4%。
<a name="3.5"></a>
#### 3.5 总结
经过以上方法优化,PP-LCNet最终精度达到 95.6%,达到了大模型的精度水平。我们将实验结果总结如下表:
| 模型 | Tpr(%) | 延时(ms) | 存储(M) | 策略 |
|-------|-----------|----------|---------------|---------------|
| SwinTranformer_tiny | 95.69 | 95.30 | 107 | 使用 ImageNet 预训练模型 |
| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | 使用 ImageNet 预训练模型 |
| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | 使用 ImageNet 预训练模型 |
| PPLCNet_x1_0 | 92.10 | 2.12 | 6.5 | 使用 SSLD 预训练模型 |
| PPLCNet_x1_0 | 93.43 | 2.12 | 6.5 | 使用 SSLD 预训练模型+EDA 策略|
| <b>PPLCNet_x1_0<b> | <b>95.60<b> | <b>2.12<b> | <b>6.5<b> | 使用 SSLD 预训练模型+EDA 策略+SKL-UGI 知识蒸馏策略|
我们在其他 8 个场景中也使用了同样的优化策略,得到如下结果:
| 场景 | 大模型 | 大模型精度(%) | 小模型 | 小模型精度(%) |
|----------|----------|----------|----------|----------|
| 人体属性识别 | Res2Net200_vd | 81.25 | PPLCNet_x1_0 | 78.59 |
| 佩戴安全帽分类 | Res2Net200_vd| 98.92 | PPLCNet_x1_0 |99.38 |
| 交通标志分类 | SwinTransformer_tiny | 98.11 | PPLCNet_x1_0 | 98.35 |
| 车辆属性识别 | Res2Net200_vd_26w_4s | 91.36 | PPLCNet_x1_0 | 90.81 |
| 有车/无车分类 | SwinTransformer_tiny | 97.71 | PPLCNet_x1_0 | 95.92 |
| 含文字图像方向分类 | SwinTransformer_tiny |99.12 | PPLCNet_x1_0 | 99.06 |
| 文本行方向分类 | SwinTransformer_tiny | 93.61 | PPLCNet_x1_0 | 96.01 |
| 语种分类 | SwinTransformer_tiny | 98.12 | PPLCNet_x1_0 | 99.26 |
从结果可以看出,PULC 方案在多个应用场景中均可提升模型精度。使用 PULC 方案可以大大减少模型优化的工作量,快速得到精度较高的模型。
<a name="4"></a>
### 4. 超参搜索
在上述训练过程中,我们调节了学习率、数据增广方法开启概率、分阶段学习率倍数等参数。
这些参数在不同场景中最优值可能并不相同。我们提供了一个快速超参搜索的脚本,将超参调优的过程自动化。
这个脚本会遍历搜索值列表中的参数来替代默认配置中的参数,依次训练,最终选择精度最高的模型所对应的参数作为搜索结果。
<a name="4.1"></a>
#### 4.1 基于默认配置搜索
配置文件 [search.yaml](../../../ppcls/configs/PULC/person_exists/search.yaml) 定义了有人/无人场景超参搜索的配置,使用如下命令即可完成超参数的搜索。
```bash
python3 tools/search_strategy.py -c ppcls/configs/PULC/person_exists/search.yaml
```
**备注**:关于搜索部分,我们也在不断优化,敬请期待。
<a name="4.2"></a>
#### 4.2 自定义搜索配置
您也可以根据训练结果或调参经验,修改超参搜索的配置。
修改 `lrs` 中的`search_values`字段,可以修改学习率搜索值列表;
修改 `resolutions` 中的 `search_values` 字段,可以修改分辨率的搜索值列表;
修改 `ra_probs` 中的 `search_values` 字段,可以修改 RandAugment 开启概率的搜索值列表;
修改 `re_probs` 中的 `search_values` 字段,可以修改 RnadomErasing 开启概率的搜索值列表;
修改 `lr_mult_list` 中的 `search_values` 字段,可以修改 lr_mult 搜索值列表;
修改 `teacher` 中的 `search_values` 字段,可以修改教师模型的搜索列表。
搜索完成后,会在 `output/search_person_exists` 中生成最终的结果,其中,除`search_res``output/search_person_exists` 中目录为对应的每个搜索的超参数的结果的权重和训练日志文件,`search_res` 对应的是蒸馏后的结果,也就是最终的模型,该模型的权重保存在`output/output_dir/search_person_exists/DistillationModel/best_model_student.pdparams`
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册