diff --git a/.gitignore b/.gitignore
index d8f8bca6ec5faa7f12c78c2dbeb9c7d11e903b97..f1e7651dabdb487f76efa9c992407bb077feac35 100644
--- a/.gitignore
+++ b/.gitignore
@@ -12,3 +12,4 @@ build/
log/
nohup.out
.DS_Store
+.idea
diff --git a/MANIFEST.in b/MANIFEST.in
index b0a4f6dc151b0e11d83655d3f7ef40c200a88ee8..97372da0035488913c83dfe6f2ddfb8fe0c906c3 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,7 +1,8 @@
include LICENSE.txt
include README.md
include docs/en/whl_en.md
-recursive-include deploy/python predict_cls.py preprocess.py postprocess.py det_preprocess.py
+recursive-include deploy/python *.py
+recursive-include deploy/configs *.yaml
recursive-include deploy/utils get_image_list.py config.py logger.py predictor.py
recursive-include ppcls/ *.py *.txt
\ No newline at end of file
diff --git a/README.md b/README.md
index 44885f554afdc7e00188fae2987e7fbbb4278fcc..13c4f964bb9063f28d6e08dfb8c6b828a81d2536 120000
--- a/README.md
+++ b/README.md
@@ -1 +1 @@
-README_ch.md
\ No newline at end of file
+README_en.md
\ No newline at end of file
diff --git a/README_ch.md b/README_ch.md
index 74f02ecca839b53217b2189a65afaf0b012b3261..fbc7aa6fcf1180d6ab733e3d739dca0f3861e149 100644
--- a/README_ch.md
+++ b/README_ch.md
@@ -4,106 +4,130 @@
## Introduction
-PaddleClas, the PaddlePaddle image recognition toolkit, is a toolset prepared by PaddlePaddle for industry and academia for image recognition tasks, helping users train better vision models and deploy them in real applications.
+PaddleClas, the PaddlePaddle image recognition toolkit, is a toolset prepared by PaddlePaddle for industry and academia for image recognition and image classification tasks, helping users train better vision models and deploy them in real applications.
-**Recent updates**
-- 🔥️ 2022.5.26 [PaddlePaddle industry practice live course](http://aglc.cn/v-c4FAR), explaining the **ultra-lightweight solution for managing people entering and leaving key areas**; you are welcome to sign up and join the discussion.
-

@@ -111,6 +135,11 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
PP-ShiTu is a practical, lightweight, general-purpose image recognition system, mainly composed of three modules: mainbody detection, feature learning, and vector search. The system applies multiple strategies across eight aspects (backbone network selection and adjustment, loss function selection, data augmentation, learning rate schedule, regularization parameter selection, use of pretrained models, and model pruning and quantization) to optimize the model of each module, finally yielding a system that completes image recognition over a gallery of 100k+ images in only 0.2 s on CPU. For more details, please refer to the [PP-ShiTu technical report](https://arxiv.org/pdf/2111.00775.pdf).
+
+## Demos of PULC Practical Image Classification Models
+
+

+
## Demos of the PP-ShiTu Image Recognition System
diff --git a/README_en.md b/README_en.md
index 9b0d7c85d76cf06eac8fb265abb85c3bb98a275f..4bf960e57f2e56972f889c4bcf6a6d715b903477 100644
--- a/README_en.md
+++ b/README_en.md
@@ -4,39 +4,41 @@
## Introduction
-PaddleClas is an image recognition toolset for industry and academia, helping users train better computer vision models and apply them in real scenarios.
+PaddleClas is an image classification and image recognition toolset for industry and academia, helping users train better computer vision models and apply them in real scenarios.
-**Recent updates**
-
-- 2022.4.21 Added the related [code](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files) of the CVPR2022 oral paper [MixFormer](https://arxiv.org/pdf/2204.02557.pdf).
-
-- 2021.09.17 Add PP-LCNet series model developed by PaddleClas, these models show strong competitiveness on Intel CPUs.
-For the introduction of PP-LCNet, please refer to [paper](https://arxiv.org/pdf/2109.15099.pdf) or [PP-LCNet model introduction](docs/en/models/PP-LCNet_en.md). The metrics and pretrained model are available [here](docs/en/ImageNet_models_en.md).
-
-- 2021.06.29 Add Swin-transformer series model,Highest top1 acc on ImageNet1k dataset reaches 87.2%, training, evaluation and inference are all supported. Pretrained models can be downloaded [here](docs/en/models/models_intro_en.md).
-- 2021.06.16 PaddleClas release/2.2. Add metric learning and vector search modules. Add product recognition, animation character recognition, vehicle recognition and logo recognition. Added 30 pretrained models of LeViT, Twins, TNT, DLA, HarDNet, and RedNet, and the accuracy is roughly the same as that of the paper.
-- [more](./docs/en/update_history_en.md)
+
+

-## Features
+PULC demo images
+
+
-- A practical image recognition system consist of detection, feature learning and retrieval modules, widely applicable to all types of image recognition tasks.
-Four sample solutions are provided, including product recognition, vehicle recognition, logo recognition and animation character recognition.
-- Rich library of pre-trained models: Provide a total of 164 ImageNet pre-trained models in 35 series, among which 6 selected series of models support fast structural modification.
+
+

-- Comprehensive and easy-to-use feature learning components: 12 metric learning methods are integrated and can be combined and switched at will through configuration files.
+PP-ShiTu demo images
+
-- SSLD knowledge distillation: The 14 classification pre-training models generally improved their accuracy by more than 3%; among them, the ResNet50_vd model achieved a Top-1 accuracy of 84.0% on the Image-Net-1k dataset and the Res2Net200_vd pre-training model achieved a Top-1 accuracy of 85.1%.
+**Recent updates**
+- 2022.6.15 Release [**P**ractical **U**ltra **L**ight-weight image **C**lassification solutions](./docs/en/PULC/PULC_quickstart_en.md). PULC models run inference within 3 ms on CPU devices, with accuracy on par with SwinTransformer. We also release 9 practical classification models covering pedestrian, vehicle and OCR scenarios.
+- 2022.4.21 Added the related [code](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files) of the CVPR2022 oral paper [MixFormer](https://arxiv.org/pdf/2204.02557.pdf).
-- Data augmentation: Provide 8 data augmentation algorithms such as AutoAugment, Cutout, Cutmix, etc. with detailed introduction, code replication and evaluation of effectiveness in a unified experimental environment.
+- 2021.09.17 Added the PP-LCNet series models developed by PaddleClas; these models show strong competitiveness on Intel CPUs.
+For the introduction of PP-LCNet, please refer to the [paper](https://arxiv.org/pdf/2109.15099.pdf) or the [PP-LCNet model introduction](docs/en/models/PP-LCNet_en.md). The metrics and pretrained models are available [here](docs/en/algorithm_introduction/ImageNet_models_en.md).
+- 2021.06.29 Added the [Swin-Transformer](docs/en/models/SwinTransformer_en.md) series models; the highest top-1 accuracy on the ImageNet-1k dataset reaches 87.2%. Training, evaluation and inference are all supported. Pretrained models can be downloaded [here](docs/en/algorithm_introduction/ImageNet_models_en.md#16).
+- 2021.06.16 PaddleClas release/2.2. Added metric learning and vector search modules, along with product recognition, animation character recognition, vehicle recognition and logo recognition. Added 30 pretrained models of LeViT, Twins, TNT, DLA, HarDNet and RedNet, with accuracy roughly on par with the papers.
+- [more](./docs/en/others/update_history_en.md)
+## Features
+PaddleClas releases PP-HGNet, PP-LCNetv2, PP-LCNet and the **S**imple **S**emi-supervised **L**abel **D**istillation (SSLD) algorithm, and supports plenty of
+image classification and image recognition algorithms.
+Based on the algorithms above, PaddleClas releases the PP-ShiTu image recognition system and [**P**ractical **U**ltra **L**ight-weight image **C**lassification solutions](docs/en/PULC/PULC_quickstart_en.md).
-
-

-
+
## Welcome to Join the Technical Exchange Group
@@ -48,41 +50,57 @@ Four sample solutions are provided, including product recognition, vehicle recog
## Quick Start
-Quick experience of image recognition:[Link](./docs/en/tutorials/quick_start_recognition_en.md)
+Quick experience of the PP-ShiTu image recognition system: [Link](./docs/en/quick_start/quick_start_recognition_en.md)
+
+Quick experience of **P**ractical **U**ltra **L**ight-weight image **C**lassification models: [Link](docs/en/PULC/PULC_quickstart_en.md)
## Tutorials
-- [Quick Installation](./docs/en/tutorials/install_en.md)
-- [Quick Start of Recognition](./docs/en/tutorials/quick_start_recognition_en.md)
+- [Install Paddle](./docs/en/installation/install_paddle_en.md)
+- [Install PaddleClas Environment](./docs/en/installation/install_paddleclas_en.md)
+- [Practical Ultra Light-weight image Classification solutions](./docs/en/PULC/PULC_train_en.md)
+ - [PULC Quick Start](docs/en/PULC/PULC_quickstart_en.md)
+ - [PULC Model Zoo](docs/en/PULC/PULC_model_list_en.md)
+ - [PULC Classification Model of Someone or Nobody](docs/en/PULC/PULC_person_exists_en.md)
+ - [PULC Recognition Model of Person Attribute](docs/en/PULC/PULC_person_attribute_en.md)
+ - [PULC Classification Model of Wearing or Not Wearing Safety Helmet](docs/en/PULC/PULC_safety_helmet_en.md)
+ - [PULC Classification Model of Traffic Sign](docs/en/PULC/PULC_traffic_sign_en.md)
+ - [PULC Recognition Model of Vehicle Attribute](docs/en/PULC/PULC_vehicle_attribute_en.md)
+ - [PULC Classification Model of Containing or Not Containing Car](docs/en/PULC/PULC_car_exists_en.md)
+ - [PULC Classification Model of Text Image Orientation](docs/en/PULC/PULC_text_image_orientation_en.md)
+ - [PULC Classification Model of Textline Orientation](docs/en/PULC/PULC_textline_orientation_en.md)
+ - [PULC Classification Model of Language](docs/en/PULC/PULC_language_classification_en.md)
+- [Quick Start of Recognition](./docs/en/quick_start/quick_start_recognition_en.md)
- [Introduction to Image Recognition Systems](#Introduction_to_Image_Recognition_Systems)
-- [Demo images](#Demo_images)
+- [Image Recognition Demo images](#Rec_Demo_images)
+- [PULC demo images](#Clas_Demo_images)
- Algorithms Introduction
- - [Backbone Network and Pre-trained Model Library](./docs/en/ImageNet_models_en.md)
- - [Mainbody Detection](./docs/en/application/mainbody_detection_en.md)
- - [Image Classification](./docs/en/tutorials/image_classification_en.md)
- - [Feature Learning](./docs/en/application/feature_learning_en.md)
- - [Product Recognition](./docs/en/application/product_recognition_en.md)
- - [Vehicle Recognition](./docs/en/application/vehicle_recognition_en.md)
- - [Logo Recognition](./docs/en/application/logo_recognition_en.md)
- - [Animation Character Recognition](./docs/en/application/cartoon_character_recognition_en.md)
+ - [Backbone Network and Pre-trained Model Library](./docs/en/algorithm_introduction/ImageNet_models_en.md)
+ - [Mainbody Detection](./docs/en/image_recognition_pipeline/mainbody_detection_en.md)
+ - [Feature Learning](./docs/en/image_recognition_pipeline/feature_extraction_en.md)
- [Vector Search](./deploy/vector_search/README.md)
-- Models Training/Evaluation
- - [Image Classification](./docs/en/tutorials/getting_started_en.md)
- - [Feature Learning](./docs/en/tutorials/getting_started_retrieval_en.md)
- Inference Model Prediction
- - [Python Inference](./docs/en/inference.md)
+ - [Python Inference](./docs/en/inference_deployment/python_deploy_en.md)
- [C++ Classification Inference](./deploy/cpp/readme_en.md), [C++ PP-ShiTu Inference](deploy/cpp_shitu/readme_en.md)
- Model Deploy (only supports classification for now; recognition coming soon)
- [Hub Serving Deployment](./deploy/hubserving/readme_en.md)
- [Mobile Deployment](./deploy/lite/readme_en.md)
- - [Inference Using whl](./docs/en/whl_en.md)
+ - [Inference Using whl](./docs/en/inference_deployment/whl_deploy_en.md)
- Advanced Tutorial
- [Knowledge Distillation](./docs/en/advanced_tutorials/distillation/distillation_en.md)
- - [Model Quantization](./docs/en/extension/paddle_quantization_en.md)
- - [Data Augmentation](./docs/en/advanced_tutorials/image_augmentation/ImageAugment_en.md)
+ - [Model Quantization](./docs/en/algorithm_introduction/model_prune_quantization_en.md)
+ - [Data Augmentation](./docs/en/advanced_tutorials/DataAugmentation_en.md)
- [License](#License)
- [Contribution](#Contribution)
+

diff --git a/deploy/configs/PULC/car_exists/inference_car_exists.yaml b/deploy/configs/PULC/car_exists/inference_car_exists.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..b6733069d99b5622c83321bc628f3d70274ce8d4
--- /dev/null
+++ b/deploy/configs/PULC/car_exists/inference_car_exists.yaml
@@ -0,0 +1,36 @@
+Global:
+ infer_imgs: "./images/PULC/car_exists/objects365_00001507.jpeg"
+ inference_model_dir: "./models/car_exists_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: False
+ cpu_num_threads: 10
+ enable_benchmark: True
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ resize_short: 256
+ - CropImage:
+ size: 224
+ - NormalizeImage:
+ scale: 0.00392157
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: ThreshOutput
+ ThreshOutput:
+ threshold: 0.5
+ label_0: no_car
+ label_1: contains_car
+ SavePreLabel:
+ save_dir: ./pre_label/
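For reference, the following is a minimal Python sketch of the `ThreshOutput` decision this config drives: the score for class 1 is compared against `threshold` to pick between `label_0` and `label_1`. The output layout shown is an assumption for illustration, not PaddleClas' exact post-processor.

```python
# Hedged sketch of a ThreshOutput-style binary postprocess; the result keys
# ("class_ids", "scores", "label_names") are illustrative assumptions.
def thresh_output(score, threshold=0.5, label_0="no_car", label_1="contains_car"):
    """score: predicted probability of class 1 ("contains_car") for one image."""
    if score < threshold:
        return {"class_ids": [0], "scores": [1 - score], "label_names": [label_0]}
    return {"class_ids": [1], "scores": [score], "label_names": [label_1]}

print(thresh_output(0.93))
# {'class_ids': [1], 'scores': [0.93], 'label_names': ['contains_car']}
```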
diff --git a/deploy/configs/PULC/language_classification/inference_language_classification.yaml b/deploy/configs/PULC/language_classification/inference_language_classification.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..fb9fb6b6631e774e7486bcdb31c25621e2b7d790
--- /dev/null
+++ b/deploy/configs/PULC/language_classification/inference_language_classification.yaml
@@ -0,0 +1,33 @@
+Global:
+ infer_imgs: "./images/PULC/language_classification/word_35404.png"
+ inference_model_dir: "./models/language_classification_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: False
+ cpu_num_threads: 10
+ enable_benchmark: True
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ size: [160, 80]
+ - NormalizeImage:
+ scale: 0.00392157
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: Topk
+ Topk:
+ topk: 2
+ class_id_map_file: "../ppcls/utils/PULC_label_list/language_classification_label_list.txt"
+ SavePreLabel:
+ save_dir: ./pre_label/
diff --git a/deploy/configs/PULC/person_attribute/inference_person_attribute.yaml b/deploy/configs/PULC/person_attribute/inference_person_attribute.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..d5be2a3568291d0a31a7026974fc22ecf54a8f4c
--- /dev/null
+++ b/deploy/configs/PULC/person_attribute/inference_person_attribute.yaml
@@ -0,0 +1,32 @@
+Global:
+ infer_imgs: "./images/PULC/person_attribute/090004.jpg"
+ inference_model_dir: "./models/person_attribute_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: True
+ cpu_num_threads: 10
+ benchmark: False
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ size: [192, 256]
+ - NormalizeImage:
+ scale: 1.0/255.0
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: PersonAttribute
+ PersonAttribute:
+ threshold: 0.5 #default threshold
+ glasses_threshold: 0.3 #threshold only for glasses
+ hold_threshold: 0.6 #threshold only for hold
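The three thresholds above suggest per-attribute overrides on top of a global default. Below is a minimal sketch of such multi-label decoding; the attribute names and their order are illustrative assumptions, not the actual PaddleClas output layout.

```python
# Hedged sketch: keep an attribute if its probability clears its threshold;
# "glasses" and "hold" get dedicated thresholds, everything else uses 0.5.
def decode_attributes(probs, names, threshold=0.5, overrides=None):
    overrides = overrides or {}
    return [n for n, p in zip(names, probs) if p >= overrides.get(n, threshold)]

names = ["glasses", "hold", "hat"]  # illustrative attribute order
probs = [0.35, 0.55, 0.20]          # per-attribute sigmoid outputs
print(decode_attributes(probs, names, overrides={"glasses": 0.3, "hold": 0.6}))
# ['glasses']  (glasses clears its lowered 0.3 threshold, hold misses 0.6)
```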
diff --git a/deploy/configs/PULC/person/inference_person_cls.yaml b/deploy/configs/PULC/person_exists/inference_person_exists.yaml
similarity index 81%
rename from deploy/configs/PULC/person/inference_person_cls.yaml
rename to deploy/configs/PULC/person_exists/inference_person_exists.yaml
index a70f663a792fcdcab3b7d45059f2afe0b1efbf07..3df94a80c7c75814e778e5320a31b20a8a7eb742 100644
--- a/deploy/configs/PULC/person/inference_person_cls.yaml
+++ b/deploy/configs/PULC/person_exists/inference_person_exists.yaml
@@ -1,6 +1,6 @@
Global:
- infer_imgs: "./images/PULC/person/objects365_02035329.jpg"
- inference_model_dir: "./models/person_cls_infer"
+ infer_imgs: "./images/PULC/person_exists/objects365_02035329.jpg"
+ inference_model_dir: "./models/person_exists_infer"
batch_size: 1
use_gpu: True
enable_mkldnn: False
@@ -29,7 +29,7 @@ PreProcess:
PostProcess:
main_indicator: ThreshOutput
ThreshOutput:
- threshold: 0.9
+ threshold: 0.5
label_0: nobody
label_1: someone
SavePreLabel:
diff --git a/deploy/configs/PULC/safety_helmet/inference_safety_helmet.yaml b/deploy/configs/PULC/safety_helmet/inference_safety_helmet.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..66a4cebb359a9b1f03a205ee6a031ca6464cffa8
--- /dev/null
+++ b/deploy/configs/PULC/safety_helmet/inference_safety_helmet.yaml
@@ -0,0 +1,36 @@
+Global:
+ infer_imgs: "./images/PULC/safety_helmet/safety_helmet_test_1.png"
+ inference_model_dir: "./models/safety_helmet_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: False
+ cpu_num_threads: 10
+ enable_benchmark: True
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ resize_short: 256
+ - CropImage:
+ size: 224
+ - NormalizeImage:
+ scale: 0.00392157
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: ThreshOutput
+ ThreshOutput:
+ threshold: 0.5
+ label_0: wearing_helmet
+ label_1: unwearing_helmet
+ SavePreLabel:
+ save_dir: ./pre_label/
diff --git a/deploy/configs/PULC/text_image_orientation/inference_text_image_orientation.yaml b/deploy/configs/PULC/text_image_orientation/inference_text_image_orientation.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..c6c3969ffa627288fe58fab28b3fe1cbffe9dd03
--- /dev/null
+++ b/deploy/configs/PULC/text_image_orientation/inference_text_image_orientation.yaml
@@ -0,0 +1,35 @@
+Global:
+ infer_imgs: "./images/PULC/text_image_orientation/img_rot0_demo.jpg"
+ inference_model_dir: "./models/text_image_orientation_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: False
+ cpu_num_threads: 10
+ enable_benchmark: True
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ resize_short: 256
+ - CropImage:
+ size: 224
+ - NormalizeImage:
+ scale: 0.00392157
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: Topk
+ Topk:
+ topk: 2
+ class_id_map_file: "../ppcls/utils/PULC_label_list/text_image_orientation_label_list.txt"
+ SavePreLabel:
+ save_dir: ./pre_label/
diff --git a/deploy/configs/PULC/textline_orientation/inference_textline_orientation.yaml b/deploy/configs/PULC/textline_orientation/inference_textline_orientation.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..108b3dd53a95c06345bdd7ccd34b2e5252d2df19
--- /dev/null
+++ b/deploy/configs/PULC/textline_orientation/inference_textline_orientation.yaml
@@ -0,0 +1,33 @@
+Global:
+ infer_imgs: "./images/PULC/textline_orientation/textline_orientation_test_0_0.png"
+ inference_model_dir: "./models/textline_orientation_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: True
+ cpu_num_threads: 10
+ enable_benchmark: True
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ size: [160, 80]
+ - NormalizeImage:
+ scale: 0.00392157
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: Topk
+ Topk:
+ topk: 1
+ class_id_map_file: "../ppcls/utils/PULC_label_list/textline_orientation_label_list.txt"
+ SavePreLabel:
+ save_dir: ./pre_label/
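With `topk: 1`, the post-processor keeps only the single best class and maps its id to a name via `class_id_map_file`. A minimal sketch follows, under the assumption that the label list uses "<id> <name>" lines (e.g. `0 0_degree`); the file format and label names are assumptions here.

```python
import numpy as np

# Hedged sketch of Topk postprocessing over a probability vector.
def topk(probs, k, class_id_map):
    idx = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i]), class_id_map[int(i)]) for i in idx]

class_id_map = {0: "0_degree", 1: "180_degree"}  # as if parsed from the file
print(topk(np.array([0.91, 0.09]), k=1, class_id_map=class_id_map))
# [(0, 0.91, '0_degree')]
```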
diff --git a/deploy/configs/PULC/traffic_sign/inference_traffic_sign.yaml b/deploy/configs/PULC/traffic_sign/inference_traffic_sign.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..53699718b4fdd38da86eaee4cccc584dcc87d2b7
--- /dev/null
+++ b/deploy/configs/PULC/traffic_sign/inference_traffic_sign.yaml
@@ -0,0 +1,35 @@
+Global:
+ infer_imgs: "./images/PULC/traffic_sign/99603_17806.jpg"
+ inference_model_dir: "./models/traffic_sign_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: True
+ cpu_num_threads: 10
+ benchmark: False
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ resize_short: 256
+ - CropImage:
+ size: 224
+ - NormalizeImage:
+ scale: 0.00392157
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: Topk
+ Topk:
+ topk: 5
+ class_id_map_file: "../ppcls/utils/PULC_label_list/traffic_sign_label_list.txt"
+ SavePreLabel:
+ save_dir: ./pre_label/
diff --git a/deploy/configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml b/deploy/configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..14ae348d09faca113d5863fbb57f066675b3f447
--- /dev/null
+++ b/deploy/configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml
@@ -0,0 +1,32 @@
+Global:
+ infer_imgs: "./images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg"
+ inference_model_dir: "./models/vehicle_attribute_infer"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: True
+ cpu_num_threads: 10
+ benchmark: False
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ size: [256, 192]
+ - NormalizeImage:
+ scale: 1.0/255.0
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: VehicleAttribute
+ VehicleAttribute:
+ color_threshold: 0.5
+ type_threshold: 0.5
+
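Presumably the model emits one probability vector that splits into a color segment and a type segment, each decoded by argmax against its own threshold. The segment boundary and label names below are illustrative assumptions, not the actual PaddleClas decoding.

```python
import numpy as np

# Hedged sketch of VehicleAttribute-style decoding with separate thresholds.
def decode_vehicle(probs, n_colors, color_names, type_names,
                   color_threshold=0.5, type_threshold=0.5):
    color_probs, type_probs = probs[:n_colors], probs[n_colors:]
    c, t = int(np.argmax(color_probs)), int(np.argmax(type_probs))
    color = color_names[c] if color_probs[c] >= color_threshold else "unknown"
    vtype = type_names[t] if type_probs[t] >= type_threshold else "unknown"
    return color, vtype

probs = np.array([0.7, 0.2, 0.1, 0.3, 0.6])  # 3 colors + 2 types, illustrative
print(decode_vehicle(probs, 3, ["yellow", "red", "blue"], ["car", "truck"]))
# ('yellow', 'truck')
```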
diff --git a/deploy/configs/inference_attr.yaml b/deploy/configs/inference_attr.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..88f73db5419414812450b768ac783982386f0a78
--- /dev/null
+++ b/deploy/configs/inference_attr.yaml
@@ -0,0 +1,33 @@
+Global:
+ infer_imgs: "./images/Pedestrain_Attr.jpg"
+ inference_model_dir: "../inference/"
+ batch_size: 1
+ use_gpu: True
+ enable_mkldnn: False
+ cpu_num_threads: 10
+ enable_benchmark: True
+ use_fp16: False
+ ir_optim: True
+ use_tensorrt: False
+ gpu_mem: 8000
+ enable_profile: False
+
+PreProcess:
+ transform_ops:
+ - ResizeImage:
+ size: [192, 256]
+ - NormalizeImage:
+ scale: 1.0/255.0
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: ''
+ channel_num: 3
+ - ToCHWImage:
+
+PostProcess:
+ main_indicator: PersonAttribute
+ PersonAttribute:
+ threshold: 0.5 #default threshold
+ glasses_threshold: 0.3 #threshold only for glasses
+ hold_threshold: 0.6 #threshold only for hold
+
diff --git a/deploy/configs/inference_cls.yaml b/deploy/configs/inference_cls.yaml
index fc0f0fe67aa628e504bb6fcb743f29fd020548cc..d9181278cc617822f98e4966abf0d12ceca498a4 100644
--- a/deploy/configs/inference_cls.yaml
+++ b/deploy/configs/inference_cls.yaml
@@ -1,5 +1,5 @@
Global:
- infer_imgs: "./images/ILSVRC2012_val_00000010.jpeg"
+ infer_imgs: "./images/ImageNet/ILSVRC2012_val_00000010.jpeg"
inference_model_dir: "./models"
batch_size: 1
use_gpu: True
@@ -32,4 +32,4 @@ PostProcess:
topk: 5
class_id_map_file: "../ppcls/utils/imagenet1k_label_list.txt"
SavePreLabel:
- save_dir: ./pre_label/
\ No newline at end of file
+ save_dir: ./pre_label/
diff --git a/deploy/configs/inference_cls_ch4.yaml b/deploy/configs/inference_cls_ch4.yaml
index 9b740ed8293c3d66a325682cafc42e2b1415df4d..85f9acb29a88772da63abe302354f5e17a9c3e59 100644
--- a/deploy/configs/inference_cls_ch4.yaml
+++ b/deploy/configs/inference_cls_ch4.yaml
@@ -1,5 +1,5 @@
Global:
- infer_imgs: "./images/ILSVRC2012_val_00000010.jpeg"
+ infer_imgs: "./images/ImageNet/ILSVRC2012_val_00000010.jpeg"
inference_model_dir: "./models"
batch_size: 1
use_gpu: True
@@ -32,4 +32,4 @@ PostProcess:
topk: 5
class_id_map_file: "../ppcls/utils/imagenet1k_label_list.txt"
SavePreLabel:
- save_dir: ./pre_label/
\ No newline at end of file
+ save_dir: ./pre_label/
diff --git a/deploy/images/ILSVRC2012_val_00000010.jpeg b/deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg
similarity index 100%
rename from deploy/images/ILSVRC2012_val_00000010.jpeg
rename to deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg
diff --git a/deploy/images/ILSVRC2012_val_00010010.jpeg b/deploy/images/ImageNet/ILSVRC2012_val_00010010.jpeg
similarity index 100%
rename from deploy/images/ILSVRC2012_val_00010010.jpeg
rename to deploy/images/ImageNet/ILSVRC2012_val_00010010.jpeg
diff --git a/deploy/images/ILSVRC2012_val_00020010.jpeg b/deploy/images/ImageNet/ILSVRC2012_val_00020010.jpeg
similarity index 100%
rename from deploy/images/ILSVRC2012_val_00020010.jpeg
rename to deploy/images/ImageNet/ILSVRC2012_val_00020010.jpeg
diff --git a/deploy/images/ILSVRC2012_val_00030010.jpeg b/deploy/images/ImageNet/ILSVRC2012_val_00030010.jpeg
similarity index 100%
rename from deploy/images/ILSVRC2012_val_00030010.jpeg
rename to deploy/images/ImageNet/ILSVRC2012_val_00030010.jpeg
diff --git a/deploy/images/PULC/car_exists/objects365_00001507.jpeg b/deploy/images/PULC/car_exists/objects365_00001507.jpeg
new file mode 100644
index 0000000000000000000000000000000000000000..9959954b6b8bf27589e1d2081f86c6078d16e2c1
Binary files /dev/null and b/deploy/images/PULC/car_exists/objects365_00001507.jpeg differ
diff --git a/deploy/images/PULC/car_exists/objects365_00001521.jpeg b/deploy/images/PULC/car_exists/objects365_00001521.jpeg
new file mode 100644
index 0000000000000000000000000000000000000000..ea65b3108ec0476ce952b3221c31ac54fcef161d
Binary files /dev/null and b/deploy/images/PULC/car_exists/objects365_00001521.jpeg differ
diff --git a/deploy/images/PULC/language_classification/word_17.png b/deploy/images/PULC/language_classification/word_17.png
new file mode 100644
index 0000000000000000000000000000000000000000..c0cd74632460e01676fbc5a43b220c0a7f7d0474
Binary files /dev/null and b/deploy/images/PULC/language_classification/word_17.png differ
diff --git a/deploy/images/PULC/language_classification/word_20.png b/deploy/images/PULC/language_classification/word_20.png
new file mode 100644
index 0000000000000000000000000000000000000000..f9149670e8a2aa086c91451442f63a727661fd7d
Binary files /dev/null and b/deploy/images/PULC/language_classification/word_20.png differ
diff --git a/deploy/images/PULC/language_classification/word_35404.png b/deploy/images/PULC/language_classification/word_35404.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e1789ab47aefecac8eaf1121decfc6a8cfb1e8b
Binary files /dev/null and b/deploy/images/PULC/language_classification/word_35404.png differ
diff --git a/deploy/images/PULC/person_attribute/090004.jpg b/deploy/images/PULC/person_attribute/090004.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..140694eeec3d2925303e8c0d544ef5979cd78219
Binary files /dev/null and b/deploy/images/PULC/person_attribute/090004.jpg differ
diff --git a/deploy/images/PULC/person_attribute/090007.jpg b/deploy/images/PULC/person_attribute/090007.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..9fea2e7c9e0047a8b59606877ad41fe24bf2e24c
Binary files /dev/null and b/deploy/images/PULC/person_attribute/090007.jpg differ
diff --git a/deploy/images/PULC/person/objects365_01780782.jpg b/deploy/images/PULC/person_exists/objects365_01780782.jpg
similarity index 100%
rename from deploy/images/PULC/person/objects365_01780782.jpg
rename to deploy/images/PULC/person_exists/objects365_01780782.jpg
diff --git a/deploy/images/PULC/person/objects365_02035329.jpg b/deploy/images/PULC/person_exists/objects365_02035329.jpg
similarity index 100%
rename from deploy/images/PULC/person/objects365_02035329.jpg
rename to deploy/images/PULC/person_exists/objects365_02035329.jpg
diff --git a/deploy/images/PULC/safety_helmet/safety_helmet_test_1.png b/deploy/images/PULC/safety_helmet/safety_helmet_test_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..c28f54f77d54df6e68e471538846b01db4387e08
Binary files /dev/null and b/deploy/images/PULC/safety_helmet/safety_helmet_test_1.png differ
diff --git a/deploy/images/PULC/safety_helmet/safety_helmet_test_2.png b/deploy/images/PULC/safety_helmet/safety_helmet_test_2.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e784af808afb58d67fdb3e277dfeebd134ee846
Binary files /dev/null and b/deploy/images/PULC/safety_helmet/safety_helmet_test_2.png differ
diff --git a/deploy/images/PULC/text_image_orientation/img_rot0_demo.jpg b/deploy/images/PULC/text_image_orientation/img_rot0_demo.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..412d41956ba48c8e3243bdeff746d389be7e762b
Binary files /dev/null and b/deploy/images/PULC/text_image_orientation/img_rot0_demo.jpg differ
diff --git a/deploy/images/PULC/text_image_orientation/img_rot180_demo.jpg b/deploy/images/PULC/text_image_orientation/img_rot180_demo.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..f4725b96698e2ac222ae9d4830d8f29a33322443
Binary files /dev/null and b/deploy/images/PULC/text_image_orientation/img_rot180_demo.jpg differ
diff --git a/deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png b/deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png
new file mode 100644
index 0000000000000000000000000000000000000000..4b8d24d29ff0f8b4befff6bf943d506c36061d4d
Binary files /dev/null and b/deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png differ
diff --git a/deploy/images/PULC/textline_orientation/textline_orientation_test_0_1.png b/deploy/images/PULC/textline_orientation/textline_orientation_test_0_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..42ad5234973679e65be6054f90c1cc7c0f989bd2
Binary files /dev/null and b/deploy/images/PULC/textline_orientation/textline_orientation_test_0_1.png differ
diff --git a/deploy/images/PULC/textline_orientation/textline_orientation_test_1_0.png b/deploy/images/PULC/textline_orientation/textline_orientation_test_1_0.png
new file mode 100644
index 0000000000000000000000000000000000000000..ac2447842dd0fac260c0d3c6e0d156dda9890923
Binary files /dev/null and b/deploy/images/PULC/textline_orientation/textline_orientation_test_1_0.png differ
diff --git a/deploy/images/PULC/textline_orientation/textline_orientation_test_1_1.png b/deploy/images/PULC/textline_orientation/textline_orientation_test_1_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..7d5b75f7e5bbeabded56eba1b4b566c4ca019590
Binary files /dev/null and b/deploy/images/PULC/textline_orientation/textline_orientation_test_1_1.png differ
diff --git a/deploy/images/PULC/traffic_sign/100999_83928.jpg b/deploy/images/PULC/traffic_sign/100999_83928.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..6f32ed5ae2d8483d29986e3a45db1789da2a4d43
Binary files /dev/null and b/deploy/images/PULC/traffic_sign/100999_83928.jpg differ
diff --git a/deploy/images/PULC/traffic_sign/99603_17806.jpg b/deploy/images/PULC/traffic_sign/99603_17806.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..c792fdf6eb64395fffaf8289a1ec14d47279860e
Binary files /dev/null and b/deploy/images/PULC/traffic_sign/99603_17806.jpg differ
diff --git a/deploy/images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg b/deploy/images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..bb5de9fc6ff99550bf9bff8d4a9f0d0e0fe18c06
Binary files /dev/null and b/deploy/images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg differ
diff --git a/deploy/images/PULC/vehicle_attribute/0014_c012_00040750_0.jpg b/deploy/images/PULC/vehicle_attribute/0014_c012_00040750_0.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..76207d43ce597a1079c523dca0c32923bf15db19
Binary files /dev/null and b/deploy/images/PULC/vehicle_attribute/0014_c012_00040750_0.jpg differ
diff --git a/deploy/images/Pedestrain_Attr.jpg b/deploy/images/Pedestrain_Attr.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..6a87e856af8c17a3b93617b93ea517b91c508619
Binary files /dev/null and b/deploy/images/Pedestrain_Attr.jpg differ
diff --git a/deploy/paddle2onnx/readme.md b/deploy/paddle2onnx/readme.md
index d1307ea84e3d7a1465c7c464d3b41dfa7613a046..bacc202806bf1a60e85790969edcb70f1489f7df 100644
--- a/deploy/paddle2onnx/readme.md
+++ b/deploy/paddle2onnx/readme.md
@@ -1,53 +1,59 @@
# paddle2onnx Model Conversion and Prediction
-This section introduces how to convert the ResNet50_vd model to an ONNX model and run prediction based on the ONNX engine.
+## Contents
+
+- [paddle2onnx Model Conversion and Prediction](#paddle2onnx-模型转化与预测)
+  - [1. Environment Preparation](#1-环境准备)
+  - [2. Model Conversion](#2-模型转换)
+  - [3. ONNX Prediction](#3-onnx-预测)
## 1. Environment Preparation
You need to prepare the Paddle2ONNX model conversion environment and the ONNX model prediction environment.
-Paddle2ONNX supports converting the PaddlePaddle model format to the ONNX model format. Operators currently stably support exporting ONNX Opset 9~11, and some Paddle operators support conversion to lower ONNX Opsets.
-For more details, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md)
+Paddle2ONNX supports converting the PaddlePaddle inference model format to the ONNX model format. Operators currently stably support exporting ONNX Opset 9~11.
+For more details, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX#paddle2onnx)
- Install Paddle2ONNX
-```
-python3.7 -m pip install paddle2onnx
-```
+ ```shell
+ python3.7 -m pip install paddle2onnx
+ ```
-- Install the ONNX runtime
-```
-python3.7 -m pip install onnxruntime
-```
+- Install the ONNX inference engine
+ ```shell
+ python3.7 -m pip install onnxruntime
+ ```
+The following takes ResNet50_vd as an example to introduce how to convert a PaddlePaddle inference model to an ONNX model and run prediction based on the ONNX engine.
## 2. Model Conversion
- Download the ResNet50_vd inference model
-```
-cd deploy
-mkdir models && cd models
-wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
-cd ..
-```
+ ```shell
+ cd deploy
+ mkdir models && cd models
+ wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
+ cd ..
+ ```
- Model conversion
-Use Paddle2ONNX to convert the Paddle static graph model to the ONNX model format:
-```
-paddle2onnx --model_dir=./models/ResNet50_vd_infer/ \
---model_filename=inference.pdmodel \
---params_filename=inference.pdiparams \
---save_file=./models/ResNet50_vd_infer/inference.onnx \
---opset_version=10 \
---enable_onnx_checker=True
-```
+ Use Paddle2ONNX to convert the Paddle static graph model to the ONNX model format:
+ ```shell
+ paddle2onnx --model_dir=./models/ResNet50_vd_infer/ \
+ --model_filename=inference.pdmodel \
+ --params_filename=inference.pdiparams \
+ --save_file=./models/ResNet50_vd_infer/inference.onnx \
+ --opset_version=10 \
+ --enable_onnx_checker=True
+ ```
-After execution, the ONNX model `inference.onnx` will be saved under the `./models/ResNet50_vd_infer/` path
+After conversion, the generated ONNX model `inference.onnx` will be saved under the `./models/ResNet50_vd_infer/` path
## 3. ONNX Prediction
Run the following command:
-```
+```shell
python3.7 python/predict_cls.py \
-c configs/inference_cls.yaml \
-o Global.use_onnx=True \
diff --git a/deploy/paddle2onnx/readme_en.md b/deploy/paddle2onnx/readme_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..6df13e5fe31805d642432dea8526661e82b6e95b
--- /dev/null
+++ b/deploy/paddle2onnx/readme_en.md
@@ -0,0 +1,59 @@
+# Paddle2ONNX: Converting To ONNX and Deployment
+
+This section introduces how to convert the ResNet50_vd Paddle inference model to an ONNX model and run deployment based on the ONNX engine.
+
+## 1. Installation
+
+First, you need to install Paddle2ONNX and onnxruntime. Paddle2ONNX is a toolkit for converting Paddle inference models to ONNX models. Please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_en.md) for more information.
+
+- Paddle2ONNX Installation
+```
+python3.7 -m pip install paddle2onnx
+```
+
+- onnxruntime Installation
+```
+python3.7 -m pip install onnxruntime
+```
+
+## 2. Converting to ONNX
+
+Download the Paddle Inference Model ResNet50_vd:
+
+```
+cd deploy
+mkdir models && cd models
+wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
+cd ..
+```
+
+Converting to ONNX model:
+
+```
+paddle2onnx --model_dir=./models/ResNet50_vd_infer/ \
+--model_filename=inference.pdmodel \
+--params_filename=inference.pdiparams \
+--save_file=./models/ResNet50_vd_infer/inference.onnx \
+--opset_version=10 \
+--enable_onnx_checker=True
+```
+
+After running the above command, the converted ONNX model file will be saved in `./models/ResNet50_vd_infer/`.
+
+## 3. Deployment
+
+Run deployment with the ONNX model using the command shown below.
+
+```
+python3.7 python/predict_cls.py \
+-c configs/inference_cls.yaml \
+-o Global.use_onnx=True \
+-o Global.use_gpu=False \
+-o Global.inference_model_dir=./models/ResNet50_vd_infer
+```
+
+The prediction results:
+
+```
+ILSVRC2012_val_00000010.jpeg: class id(s): [153, 204, 229, 332, 155], score(s): [0.69, 0.10, 0.02, 0.01, 0.01], label_name(s): ['Maltese dog, Maltese terrier, Maltese', 'Lhasa, Lhasa apso', 'Old English sheepdog, bobtail', 'Angora, Angora rabbit', 'Shih-Tzu']
+```
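For a quick sanity check of the converted model outside the PaddleClas pipeline, a minimal onnxruntime session can also be run directly. This sketch feeds a dummy normalized NCHW tensor; real usage would reuse the resize/crop/normalize preprocessing from `configs/inference_cls.yaml`.

```python
import numpy as np
import onnxruntime as ort

# Load the converted model and query its input name.
sess = ort.InferenceSession("./models/ResNet50_vd_infer/inference.onnx")
input_name = sess.get_inputs()[0].name

# Dummy 1x3x224x224 float32 input standing in for a preprocessed image.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
logits = sess.run(None, {input_name: x})[0]
print(logits.shape)  # expected (1, 1000) for an ImageNet-1k classifier
```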
diff --git a/deploy/paddleserving/build_server.sh b/deploy/paddleserving/build_server.sh
new file mode 100644
index 0000000000000000000000000000000000000000..1329a3684ff72862858ee25c0a938bd61ff654ae
--- /dev/null
+++ b/deploy/paddleserving/build_server.sh
@@ -0,0 +1,88 @@
+# Docker image to use:
+# registry.baidubce.com/paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82
+
+# Build the Serving server:
+
+# The client and app can use the release versions directly
+
+# The server must be recompiled because custom OPs are added
+
+# By default, ${PWD}=PaddleClas/deploy/paddleserving/ when building
+
+python_name=${1:-'python'}
+
+apt-get update
+apt install -y libcurl4-openssl-dev libbz2-dev
+wget -nc https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar
+tar xf centos_ssl.tar
+rm -rf centos_ssl.tar
+mv libcrypto.so.1.0.2k /usr/lib/libcrypto.so.1.0.2k
+mv libssl.so.1.0.2k /usr/lib/libssl.so.1.0.2k
+ln -sf /usr/lib/libcrypto.so.1.0.2k /usr/lib/libcrypto.so.10
+ln -sf /usr/lib/libssl.so.1.0.2k /usr/lib/libssl.so.10
+ln -sf /usr/lib/libcrypto.so.10 /usr/lib/libcrypto.so
+ln -sf /usr/lib/libssl.so.10 /usr/lib/libssl.so
+
+# Install Go dependencies
+rm -rf /usr/local/go
+wget -qO- https://paddle-ci.cdn.bcebos.com/go1.17.2.linux-amd64.tar.gz | tar -xz -C /usr/local
+export GOROOT=/usr/local/go
+export GOPATH=/root/gopath
+export PATH=$PATH:$GOPATH/bin:$GOROOT/bin
+go env -w GO111MODULE=on
+go env -w GOPROXY=https://goproxy.cn,direct
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
+go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
+go install google.golang.org/grpc@v1.33.0
+go env -w GO111MODULE=auto
+
+# Download the OpenCV library
+wget https://paddle-qa.bj.bcebos.com/PaddleServing/opencv3.tar.gz
+tar -xvf opencv3.tar.gz
+rm -rf opencv3.tar.gz
+export OPENCV_DIR=$PWD/opencv3
+
+# clone Serving
+git clone https://github.com/PaddlePaddle/Serving.git -b develop --depth=1
+
+cd Serving # PaddleClas/deploy/paddleserving/Serving
+export Serving_repo_path=$PWD
+git submodule update --init --recursive
+${python_name} -m pip install -r python/requirements.txt
+
+# set env
+export PYTHON_INCLUDE_DIR=$(${python_name} -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())")
+export PYTHON_LIBRARIES=$(${python_name} -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
+export PYTHON_EXECUTABLE=`which ${python_name}`
+
+export CUDA_PATH='/usr/local/cuda'
+export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
+export CUDA_CUDART_LIBRARY='/usr/local/cuda/lib64/'
+export TENSORRT_LIBRARY_PATH='/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/'
+
+# Copy the custom OP code
+\cp ../preprocess/general_clas_op.* ${Serving_repo_path}/core/general-server/op
+\cp ../preprocess/preprocess_op.* ${Serving_repo_path}/core/predictor/tools/pp_shitu_tools
+
+# Build the server
+mkdir server-build-gpu-opencv
+cd server-build-gpu-opencv
+cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
+-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
+-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
+-DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
+-DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
+-DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
+-DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
+-DOPENCV_DIR=${OPENCV_DIR} \
+-DWITH_OPENCV=ON \
+-DSERVER=ON \
+-DWITH_GPU=ON ..
+make -j32
+
+${python_name} -m pip install python/dist/paddle*
+
+# export SERVING_BIN
+export SERVING_BIN=$PWD/core/general-server/serving
+cd ../../
\ No newline at end of file
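Per `python_name=${1:-'python'}` at the top of the script, it takes the target Python executable as an optional first argument (defaulting to `python`), so it would typically be invoked from `PaddleClas/deploy/paddleserving/` as `bash build_server.sh python3.7`.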
diff --git a/deploy/paddleserving/config.yml b/deploy/paddleserving/config.yml
index d9f464dd093d5a3d0ac34a61f4af17e3792fcd86..92d8297f9f23a4082cb0a499ca4c172e71d79caf 100644
--- a/deploy/paddleserving/config.yml
+++ b/deploy/paddleserving/config.yml
@@ -30,4 +30,4 @@ op:
client_type: local_predictor
#Fetch list, based on the alias_name of fetch_var in client_config
- fetch_list: ["prediction"]
+ fetch_list: ["prediction"]
diff --git a/deploy/paddleserving/preprocess/general_clas_op.cpp b/deploy/paddleserving/preprocess/general_clas_op.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..e0ab48fa52da70a558b34e7ab1deda52675e99bc
--- /dev/null
+++ b/deploy/paddleserving/preprocess/general_clas_op.cpp
@@ -0,0 +1,206 @@
+// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "core/general-server/op/general_clas_op.h"
+#include "core/predictor/framework/infer.h"
+#include "core/predictor/framework/memory.h"
+#include "core/predictor/framework/resource.h"
+#include "core/util/include/timer.h"
+#include <algorithm>
+#include <iostream>
+#include <memory>
+#include <sstream>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+using baidu::paddle_serving::Timer;
+using baidu::paddle_serving::predictor::MempoolWrapper;
+using baidu::paddle_serving::predictor::general_model::Tensor;
+using baidu::paddle_serving::predictor::general_model::Response;
+using baidu::paddle_serving::predictor::general_model::Request;
+using baidu::paddle_serving::predictor::InferManager;
+using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
+
+int GeneralClasOp::inference() {
+ VLOG(2) << "Going to run inference";
+ const std::vector<std::string> pre_node_names = pre_names();
+ if (pre_node_names.size() != 1) {
+ LOG(ERROR) << "This op(" << op_name()
+ << ") can only have one predecessor op, but received "
+ << pre_node_names.size();
+ return -1;
+ }
+ const std::string pre_name = pre_node_names[0];
+
+ const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
+ if (!input_blob) {
+ LOG(ERROR) << "input_blob is nullptr,error";
+ return -1;
+ }
+ uint64_t log_id = input_blob->GetLogId();
+ VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
+
+ GeneralBlob *output_blob = mutable_data<GeneralBlob>();
+ if (!output_blob) {
+ LOG(ERROR) << "output_blob is nullptr,error";
+ return -1;
+ }
+ output_blob->SetLogId(log_id);
+
+ if (!input_blob) {
+ LOG(ERROR) << "(logid=" << log_id
+ << ") Failed mutable depended argument, op:" << pre_name;
+ return -1;
+ }
+
+ const TensorVector *in = &input_blob->tensor_vector;
+ TensorVector *out = &output_blob->tensor_vector;
+
+ int batch_size = input_blob->_batch_size;
+ output_blob->_batch_size = batch_size;
+ VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
+
+ Timer timeline;
+ int64_t start = timeline.TimeStampUS();
+ timeline.Start();
+
+ // only support string type
+
+ char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
+ std::string base64str = total_input_ptr;
+
+ cv::Mat img = Base2Mat(base64str);
+
+ // RGB2BGR
+ cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
+
+ // Resize
+ cv::Mat resize_img;
+ resize_op_.Run(img, resize_img, resize_short_size_);
+
+ // CenterCrop
+ crop_op_.Run(resize_img, crop_size_);
+
+ // Normalize
+ normalize_op_.Run(&resize_img, mean_, scale_, is_scale_);
+
+ // Permute
+ std::vector<float> input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);
+ permute_op_.Run(&resize_img, input.data());
+ float maxValue = *max_element(input.begin(), input.end());
+ float minValue = *min_element(input.begin(), input.end());
+
+ TensorVector *real_in = new TensorVector();
+ if (!real_in) {
+ LOG(ERROR) << "real_in is nullptr,error";
+ return -1;
+ }
+
+ std::vector<int> input_shape;
+ int in_num = 0;
+ void *databuf_data = NULL;
+ char *databuf_char = NULL;
+ size_t databuf_size = 0;
+
+ input_shape = {1, 3, resize_img.rows, resize_img.cols};
+ in_num = std::accumulate(input_shape.begin(), input_shape.end(), 1,
+ std::multiplies<int>());
+
+ databuf_size = in_num * sizeof(float);
+ databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+ if (!databuf_data) {
+ LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+ return -1;
+ }
+
+ memcpy(databuf_data, input.data(), databuf_size);
+ databuf_char = reinterpret_cast<char *>(databuf_data);
+ paddle::PaddleBuf paddleBuf(databuf_char, databuf_size);
+ paddle::PaddleTensor tensor_in;
+ tensor_in.name = in->at(0).name;
+ tensor_in.dtype = paddle::PaddleDType::FLOAT32;
+ tensor_in.shape = {1, 3, resize_img.rows, resize_img.cols};
+ tensor_in.lod = in->at(0).lod;
+ tensor_in.data = paddleBuf;
+ real_in->push_back(tensor_in);
+
+ if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
+ batch_size)) {
+ LOG(ERROR) << "(logid=" << log_id
+ << ") Failed do infer in fluid model: " << engine_name().c_str();
+ return -1;
+ }
+
+ int64_t end = timeline.TimeStampUS();
+ CopyBlobInfo(input_blob, output_blob);
+ AddBlobInfo(output_blob, start);
+ AddBlobInfo(output_blob, end);
+ return 0;
+}
+
+cv::Mat GeneralClasOp::Base2Mat(std::string &base64_data) {
+ cv::Mat img;
+ std::string s_mat;
+ s_mat = base64Decode(base64_data.data(), base64_data.size());
+ std::vector<char> base64_img(s_mat.begin(), s_mat.end());
+ img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
+ return img;
+}
+
+std::string GeneralClasOp::base64Decode(const char *Data, int DataByte) {
+ const char DecodeTable[] = {
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 62, // '+'
+ 0, 0, 0,
+ 63, // '/'
+ 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
+ 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
+ 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
+ 0, 0, 0, 0, 0, 0, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
+ 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
+ };
+
+ std::string strDecode;
+ int nValue;
+ int i = 0;
+ while (i < DataByte) {
+ if (*Data != '\r' && *Data != '\n') {
+ nValue = DecodeTable[*Data++] << 18;
+ nValue += DecodeTable[*Data++] << 12;
+ strDecode += (nValue & 0x00FF0000) >> 16;
+ if (*Data != '=') {
+ nValue += DecodeTable[*Data++] << 6;
+ strDecode += (nValue & 0x0000FF00) >> 8;
+ if (*Data != '=') {
+ nValue += DecodeTable[*Data++];
+ strDecode += nValue & 0x000000FF;
+ }
+ }
+ i += 4;
+ } else // carriage return / line feed: skip
+ {
+ Data++;
+ i++;
+ }
+ }
+ return strDecode;
+}
+DEFINE_OP(GeneralClasOp);
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
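As the `only support string type` comment above indicates, this op expects the input tensor to carry a base64-encoded image string, which `Base2Mat()`/`base64Decode()` turn back into a `cv::Mat`. A minimal sketch of the client-side encoding follows; the feed variable name and client API are not shown and would follow the serving configuration.

```python
import base64

# Encode an image file into the base64 string the custom op decodes.
with open("demo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf8")
```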
diff --git a/deploy/paddleserving/preprocess/general_clas_op.h b/deploy/paddleserving/preprocess/general_clas_op.h
new file mode 100644
index 0000000000000000000000000000000000000000..69b7a8e005872d7b66b9a61265ca5798b4ac8bab
--- /dev/null
+++ b/deploy/paddleserving/preprocess/general_clas_op.h
@@ -0,0 +1,70 @@
+// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+#include "core/general-server/general_model_service.pb.h"
+#include "core/general-server/op/general_infer_helper.h"
+#include "core/predictor/tools/pp_shitu_tools/preprocess_op.h"
+#include "paddle_inference_api.h" // NOLINT
+#include <string>
+#include <vector>
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+class GeneralClasOp
+ : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
+public:
+ typedef std::vector<paddle::PaddleTensor> TensorVector;
+
+ DECLARE_OP(GeneralClasOp);
+
+ int inference();
+
+private:
+ // clas preprocess
+ std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
+ std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
+ bool is_scale_ = true;
+
+ int resize_short_size_ = 256;
+ int crop_size_ = 224;
+
+ PaddleClas::ResizeImg resize_op_;
+ PaddleClas::Normalize normalize_op_;
+ PaddleClas::Permute permute_op_;
+ PaddleClas::CenterCropImg crop_op_;
+
+ // read pics
+ cv::Mat Base2Mat(std::string &base64_data);
+ std::string base64Decode(const char *Data, int DataByte);
+};
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
diff --git a/deploy/paddleserving/preprocess/preprocess_op.cpp b/deploy/paddleserving/preprocess/preprocess_op.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..9c79342ceda115fe3c213bb6f5d32c6e56f2380a
--- /dev/null
+++ b/deploy/paddleserving/preprocess/preprocess_op.cpp
@@ -0,0 +1,149 @@
+// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include "paddle_api.h"
+#include "paddle_inference_api.h"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cmath>
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+#include "preprocess_op.h"
+
+namespace Feature {
+
+void Permute::Run(const cv::Mat *im, float *data) {
+ int rh = im->rows;
+ int rw = im->cols;
+ int rc = im->channels();
+ for (int i = 0; i < rc; ++i) {
+ cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
+ }
+}
+
+void Normalize::Run(cv::Mat *im, const std::vector<float> &mean,
+ const std::vector<float> &std, float scale) {
+ (*im).convertTo(*im, CV_32FC3, scale);
+ for (int h = 0; h < im->rows; h++) {
+ for (int w = 0; w < im->cols; w++) {
+ im->at<cv::Vec3f>(h, w)[0] =
+ (im->at<cv::Vec3f>(h, w)[0] - mean[0]) / std[0];
+ im->at<cv::Vec3f>(h, w)[1] =
+ (im->at<cv::Vec3f>(h, w)[1] - mean[1]) / std[1];
+ im->at<cv::Vec3f>(h, w)[2] =
+ (im->at<cv::Vec3f>(h, w)[2] - mean[2]) / std[2];
+ }
+ }
+}
+
+void CenterCropImg::Run(cv::Mat &img, const int crop_size) {
+ int resize_w = img.cols;
+ int resize_h = img.rows;
+ int w_start = int((resize_w - crop_size) / 2);
+ int h_start = int((resize_h - crop_size) / 2);
+ cv::Rect rect(w_start, h_start, crop_size, crop_size);
+ img = img(rect);
+}
+
+void ResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,
+ int resize_short_size, int size) {
+ int resize_h = 0;
+ int resize_w = 0;
+ if (size > 0) {
+ resize_h = size;
+ resize_w = size;
+ } else {
+ int w = img.cols;
+ int h = img.rows;
+
+ float ratio = 1.f;
+ if (h < w) {
+ ratio = float(resize_short_size) / float(h);
+ } else {
+ ratio = float(resize_short_size) / float(w);
+ }
+ resize_h = round(float(h) * ratio);
+ resize_w = round(float(w) * ratio);
+ }
+ cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
+}
+
+} // namespace Feature
+
+namespace PaddleClas {
+void Permute::Run(const cv::Mat *im, float *data) {
+ int rh = im->rows;
+ int rw = im->cols;
+ int rc = im->channels();
+ for (int i = 0; i < rc; ++i) {
+ cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
+ }
+}
+
+void Normalize::Run(cv::Mat *im, const std::vector<float> &mean,
+ const std::vector<float> &scale, const bool is_scale) {
+ double e = 1.0;
+ if (is_scale) {
+ e /= 255.0;
+ }
+ (*im).convertTo(*im, CV_32FC3, e);
+ for (int h = 0; h < im->rows; h++) {
+ for (int w = 0; w < im->cols; w++) {
+ im->at<cv::Vec3f>(h, w)[0] =
+ (im->at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
+ im->at<cv::Vec3f>(h, w)[1] =
+ (im->at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
+ im->at<cv::Vec3f>(h, w)[2] =
+ (im->at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
+ }
+ }
+}
+
+void CenterCropImg::Run(cv::Mat &img, const int crop_size) {
+ int resize_w = img.cols;
+ int resize_h = img.rows;
+ int w_start = int((resize_w - crop_size) / 2);
+ int h_start = int((resize_h - crop_size) / 2);
+ cv::Rect rect(w_start, h_start, crop_size, crop_size);
+ img = img(rect);
+}
+
+void ResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,
+ int resize_short_size) {
+ int w = img.cols;
+ int h = img.rows;
+
+ float ratio = 1.f;
+ if (h < w) {
+ ratio = float(resize_short_size) / float(h);
+ } else {
+ ratio = float(resize_short_size) / float(w);
+ }
+
+ int resize_h = round(float(h) * ratio);
+ int resize_w = round(float(w) * ratio);
+
+ cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
+}
+
+} // namespace PaddleClas
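To make the preprocessing above concrete, here is a hedged NumPy equivalent of `PaddleClas::ResizeImg` (scale the short side to `resize_short_size`) and `Normalize` (divide by 255, then subtract the per-channel mean and divide by the per-channel scale). It mirrors the C++ logic but is not part of the repository.

```python
import numpy as np

def resize_short_hw(h, w, resize_short=256):
    # Scale so the shorter side becomes resize_short, as in ResizeImg::Run.
    ratio = resize_short / min(h, w)
    return round(h * ratio), round(w * ratio)

def normalize(img_uint8, mean, std):
    # (x / 255 - mean) / std per channel, as in Normalize::Run with is_scale.
    x = img_uint8.astype(np.float32) / 255.0
    return (x - np.array(mean, np.float32)) / np.array(std, np.float32)

print(resize_short_hw(480, 640))  # (256, 341)
img = np.zeros((256, 341, 3), dtype=np.uint8)
out = normalize(img, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(out.shape)  # (256, 341, 3)
```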
diff --git a/deploy/paddleserving/preprocess/preprocess_op.h b/deploy/paddleserving/preprocess/preprocess_op.h
new file mode 100644
index 0000000000000000000000000000000000000000..0ea9d2e14a525365bb049a13358660a2567dadc8
--- /dev/null
+++ b/deploy/paddleserving/preprocess/preprocess_op.h
@@ -0,0 +1,81 @@
+// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+namespace Feature {
+
+class Normalize {
+public:
+ virtual void Run(cv::Mat *im, const std::vector &mean,
+ const std::vector &std, float scale);
+};
+
+// RGB -> CHW
+class Permute {
+public:
+ virtual void Run(const cv::Mat *im, float *data);
+};
+
+class CenterCropImg {
+public:
+ virtual void Run(cv::Mat &im, const int crop_size = 224);
+};
+
+class ResizeImg {
+public:
+ virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len,
+ int size = 0);
+};
+
+} // namespace Feature
+
+namespace PaddleClas {
+
+class Normalize {
+public:
+ virtual void Run(cv::Mat *im, const std::vector &mean,
+ const std::vector &scale, const bool is_scale = true);
+};
+
+// RGB -> CHW
+class Permute {
+public:
+ virtual void Run(const cv::Mat *im, float *data);
+};
+
+class CenterCropImg {
+public:
+ virtual void Run(cv::Mat &im, const int crop_size = 224);
+};
+
+class ResizeImg {
+public:
+ virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len);
+};
+
+} // namespace PaddleClas
diff --git a/deploy/paddleserving/recognition/config.yml b/deploy/paddleserving/recognition/config.yml
index 6ecc32e22435f07a549ffcdeb6a435b33c4901f1..e4108006e6f2ea1a3698e4fdf9c32f25dcbfbeb0 100644
--- a/deploy/paddleserving/recognition/config.yml
+++ b/deploy/paddleserving/recognition/config.yml
@@ -31,7 +31,7 @@ op:
#Fetch list, based on the alias_name of fetch_var in client_config
fetch_list: ["features"]
-
+
det:
concurrency: 1
local_service_conf:
diff --git a/deploy/paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_client/serving_client_conf.prototxt b/deploy/paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_client/serving_client_conf.prototxt
new file mode 100644
index 0000000000000000000000000000000000000000..c781eb6f449fe06afbba7f96e01798c974bccf54
--- /dev/null
+++ b/deploy/paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_client/serving_client_conf.prototxt
@@ -0,0 +1,32 @@
+feed_var {
+ name: "x"
+ alias_name: "x"
+ is_lod_tensor: false
+ feed_type: 1
+ shape: 3
+ shape: 224
+ shape: 224
+}
+feed_var {
+ name: "boxes"
+ alias_name: "boxes"
+ is_lod_tensor: false
+ feed_type: 1
+ shape: 6
+}
+fetch_var {
+ name: "save_infer_model/scale_0.tmp_1"
+ alias_name: "features"
+ is_lod_tensor: false
+ fetch_type: 1
+ shape: 512
+}
+fetch_var {
+ name: "boxes"
+ alias_name: "boxes"
+ is_lod_tensor: false
+ fetch_type: 1
+ shape: 6
+}
+
+
diff --git a/deploy/paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_serving/serving_server_conf.prototxt b/deploy/paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_serving/serving_server_conf.prototxt
new file mode 100644
index 0000000000000000000000000000000000000000..04812f42ed90fbbd47c73b9ec706d57c04b4c571
--- /dev/null
+++ b/deploy/paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_serving/serving_server_conf.prototxt
@@ -0,0 +1,30 @@
+feed_var {
+ name: "x"
+ alias_name: "x"
+ is_lod_tensor: false
+ feed_type: 1
+ shape: 3
+ shape: 224
+ shape: 224
+}
+feed_var {
+ name: "boxes"
+ alias_name: "boxes"
+ is_lod_tensor: false
+ feed_type: 1
+ shape: 6
+}
+fetch_var {
+ name: "save_infer_model/scale_0.tmp_1"
+ alias_name: "features"
+ is_lod_tensor: false
+ fetch_type: 1
+ shape: 512
+}
+fetch_var {
+ name: "boxes"
+ alias_name: "boxes"
+ is_lod_tensor: false
+ fetch_type: 1
+ shape: 6
+}
diff --git a/deploy/paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/serving_client_conf.prototxt b/deploy/paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/serving_client_conf.prototxt
new file mode 100644
index 0000000000000000000000000000000000000000..d9ab81a8b3c275f638f314489a84deef46011d73
--- /dev/null
+++ b/deploy/paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/serving_client_conf.prototxt
@@ -0,0 +1,29 @@
+feed_var {
+ name: "im_shape"
+ alias_name: "im_shape"
+ is_lod_tensor: false
+ feed_type: 1
+ shape: 2
+}
+feed_var {
+ name: "image"
+ alias_name: "image"
+ is_lod_tensor: false
+ feed_type: 7
+ shape: -1
+ shape: -1
+ shape: 3
+}
+fetch_var {
+ name: "save_infer_model/scale_0.tmp_1"
+ alias_name: "save_infer_model/scale_0.tmp_1"
+ is_lod_tensor: true
+ fetch_type: 1
+ shape: -1
+}
+fetch_var {
+ name: "save_infer_model/scale_1.tmp_1"
+ alias_name: "save_infer_model/scale_1.tmp_1"
+ is_lod_tensor: false
+ fetch_type: 2
+}
diff --git a/deploy/paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/serving_server_conf.prototxt b/deploy/paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/serving_server_conf.prototxt
new file mode 100644
index 0000000000000000000000000000000000000000..d9ab81a8b3c275f638f314489a84deef46011d73
--- /dev/null
+++ b/deploy/paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/serving_server_conf.prototxt
@@ -0,0 +1,29 @@
+feed_var {
+ name: "im_shape"
+ alias_name: "im_shape"
+ is_lod_tensor: false
+ feed_type: 1
+ shape: 2
+}
+feed_var {
+ name: "image"
+ alias_name: "image"
+ is_lod_tensor: false
+ feed_type: 7
+ shape: -1
+ shape: -1
+ shape: 3
+}
+fetch_var {
+ name: "save_infer_model/scale_0.tmp_1"
+ alias_name: "save_infer_model/scale_0.tmp_1"
+ is_lod_tensor: true
+ fetch_type: 1
+ shape: -1
+}
+fetch_var {
+ name: "save_infer_model/scale_1.tmp_1"
+ alias_name: "save_infer_model/scale_1.tmp_1"
+ is_lod_tensor: false
+ fetch_type: 2
+}
diff --git a/deploy/paddleserving/recognition/test_cpp_serving_client.py b/deploy/paddleserving/recognition/test_cpp_serving_client.py
index a2bf1ae3e9d0a69628319b9f845a1e6f7701b391..e2cd17e855ebfe8fb286ebaeff8ab63874e2e972 100644
--- a/deploy/paddleserving/recognition/test_cpp_serving_client.py
+++ b/deploy/paddleserving/recognition/test_cpp_serving_client.py
@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-import sys
import numpy as np
from paddle_serving_client import Client
@@ -22,181 +21,101 @@ import faiss
import os
import pickle
-
-class MainbodyDetect():
- """
- pp-shitu mainbody detect.
- include preprocess, process, postprocess
- return detect results
- Attention: Postprocess include num limit and box filter; no nms
- """
-
- def __init__(self):
- self.preprocess = DetectionSequential([
- DetectionFile2Image(), DetectionNormalize(
- [0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True),
- DetectionResize(
- (640, 640), False, interpolation=2), DetectionTranspose(
- (2, 0, 1))
- ])
-
- self.client = Client()
- self.client.load_client_config(
- "../../models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/serving_client_conf.prototxt"
- )
- self.client.connect(['127.0.0.1:9293'])
-
- self.max_det_result = 5
- self.conf_threshold = 0.2
-
- def predict(self, imgpath):
- im, im_info = self.preprocess(imgpath)
- im_shape = np.array(im.shape[1:]).reshape(-1)
- scale_factor = np.array(list(im_info['scale_factor'])).reshape(-1)
-
- fetch_map = self.client.predict(
- feed={
- "image": im,
- "im_shape": im_shape,
- "scale_factor": scale_factor,
- },
- fetch=["save_infer_model/scale_0.tmp_1"],
- batch=False)
- return self.postprocess(fetch_map, imgpath)
-
- def postprocess(self, fetch_map, imgpath):
- #1. get top max_det_result
- det_results = fetch_map["save_infer_model/scale_0.tmp_1"]
- if len(det_results) > self.max_det_result:
- boxes_reserved = fetch_map[
- "save_infer_model/scale_0.tmp_1"][:self.max_det_result]
- else:
- boxes_reserved = det_results
-
- #2. do conf threshold
- boxes_list = []
- for i in range(boxes_reserved.shape[0]):
- if (boxes_reserved[i, 1]) > self.conf_threshold:
- boxes_list.append(boxes_reserved[i, :])
-
- #3. add origin image box
- origin_img = cv2.imread(imgpath)
- boxes_list.append(
- np.array([0, 1.0, 0, 0, origin_img.shape[1], origin_img.shape[0]]))
- return np.array(boxes_list)
-
-
-class ObjectRecognition():
- """
- pp-shitu object recognion for all objects detected by MainbodyDetect.
- include preprocess, process, postprocess
- preprocess include preprocess for each image and batching.
- Batch process
- postprocess include retrieval and nms
- """
-
- def __init__(self):
- self.client = Client()
- self.client.load_client_config(
- "../../models/general_PPLCNet_x2_5_lite_v1.0_client/serving_client_conf.prototxt"
- )
- self.client.connect(["127.0.0.1:9294"])
-
- self.seq = Sequential([
- BGR2RGB(), Resize((224, 224)), Div(255),
- Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225],
- False), Transpose((2, 0, 1))
- ])
-
- self.searcher, self.id_map = self.init_index()
-
- self.rec_nms_thresold = 0.05
- self.rec_score_thres = 0.5
- self.feature_normalize = True
- self.return_k = 1
-
- def init_index(self):
- index_dir = "../../drink_dataset_v1.0/index"
- assert os.path.exists(os.path.join(
- index_dir, "vector.index")), "vector.index not found ..."
- assert os.path.exists(os.path.join(
- index_dir, "id_map.pkl")), "id_map.pkl not found ... "
-
- searcher = faiss.read_index(os.path.join(index_dir, "vector.index"))
-
- with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
- id_map = pickle.load(fd)
- return searcher, id_map
-
- def predict(self, det_boxes, imgpath):
- #1. preprocess
- batch_imgs = []
- origin_img = cv2.imread(imgpath)
- for i in range(det_boxes.shape[0]):
- box = det_boxes[i]
- x1, y1, x2, y2 = [int(x) for x in box[2:]]
- cropped_img = origin_img[y1:y2, x1:x2, :].copy()
- tmp = self.seq(cropped_img)
- batch_imgs.append(tmp)
- batch_imgs = np.array(batch_imgs)
-
- #2. process
- fetch_map = self.client.predict(
- feed={"x": batch_imgs}, fetch=["features"], batch=True)
- batch_features = fetch_map["features"]
-
- #3. postprocess
- if self.feature_normalize:
- feas_norm = np.sqrt(
- np.sum(np.square(batch_features), axis=1, keepdims=True))
- batch_features = np.divide(batch_features, feas_norm)
- scores, docs = self.searcher.search(batch_features, self.return_k)
-
- results = []
- for i in range(scores.shape[0]):
- pred = {}
- if scores[i][0] >= self.rec_score_thres:
- pred["bbox"] = [int(x) for x in det_boxes[i, 2:]]
- pred["rec_docs"] = self.id_map[docs[i][0]].split()[1]
- pred["rec_scores"] = scores[i][0]
- results.append(pred)
- return self.nms_to_rec_results(results)
-
- def nms_to_rec_results(self, results):
- filtered_results = []
- x1 = np.array([r["bbox"][0] for r in results]).astype("float32")
- y1 = np.array([r["bbox"][1] for r in results]).astype("float32")
- x2 = np.array([r["bbox"][2] for r in results]).astype("float32")
- y2 = np.array([r["bbox"][3] for r in results]).astype("float32")
- scores = np.array([r["rec_scores"] for r in results])
-
- areas = (x2 - x1 + 1) * (y2 - y1 + 1)
- order = scores.argsort()[::-1]
- while order.size > 0:
- i = order[0]
- xx1 = np.maximum(x1[i], x1[order[1:]])
- yy1 = np.maximum(y1[i], y1[order[1:]])
- xx2 = np.minimum(x2[i], x2[order[1:]])
- yy2 = np.minimum(y2[i], y2[order[1:]])
-
- w = np.maximum(0.0, xx2 - xx1 + 1)
- h = np.maximum(0.0, yy2 - yy1 + 1)
- inter = w * h
- ovr = inter / (areas[i] + areas[order[1:]] - inter)
- inds = np.where(ovr <= self.rec_nms_thresold)[0]
- order = order[inds + 1]
- filtered_results.append(results[i])
- return filtered_results
-
-
+rec_nms_thresold = 0.05
+rec_score_thres = 0.5
+feature_normalize = True
+return_k = 1
+index_dir = "../../drink_dataset_v1.0/index"
+
+
+def init_index(index_dir):
+ assert os.path.exists(os.path.join(
+ index_dir, "vector.index")), "vector.index not found ..."
+ assert os.path.exists(os.path.join(
+ index_dir, "id_map.pkl")), "id_map.pkl not found ... "
+
+ searcher = faiss.read_index(os.path.join(index_dir, "vector.index"))
+
+ with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
+ id_map = pickle.load(fd)
+ return searcher, id_map
+
+
+# nms over recognition results: keep the highest-score box among overlapping ones
+def nms_to_rec_results(results, thresh=0.1):
+ filtered_results = []
+
+ x1 = np.array([r["bbox"][0] for r in results]).astype("float32")
+ y1 = np.array([r["bbox"][1] for r in results]).astype("float32")
+ x2 = np.array([r["bbox"][2] for r in results]).astype("float32")
+ y2 = np.array([r["bbox"][3] for r in results]).astype("float32")
+ scores = np.array([r["rec_scores"] for r in results])
+
+ areas = (x2 - x1 + 1) * (y2 - y1 + 1)
+ order = scores.argsort()[::-1]
+ while order.size > 0:
+ i = order[0]
+ xx1 = np.maximum(x1[i], x1[order[1:]])
+ yy1 = np.maximum(y1[i], y1[order[1:]])
+ xx2 = np.minimum(x2[i], x2[order[1:]])
+ yy2 = np.minimum(y2[i], y2[order[1:]])
+
+ w = np.maximum(0.0, xx2 - xx1 + 1)
+ h = np.maximum(0.0, yy2 - yy1 + 1)
+ inter = w * h
+ ovr = inter / (areas[i] + areas[order[1:]] - inter)
+ inds = np.where(ovr <= thresh)[0]
+ order = order[inds + 1]
+ filtered_results.append(results[i])
+ return filtered_results
+
+
+def postprocess(fetch_dict, feature_normalize, det_boxes, searcher, id_map,
+ return_k, rec_score_thres, rec_nms_thresold):
+ batch_features = fetch_dict["features"]
+
+ #do feature norm
+ if feature_normalize:
+ feas_norm = np.sqrt(
+ np.sum(np.square(batch_features), axis=1, keepdims=True))
+ batch_features = np.divide(batch_features, feas_norm)
+
+ scores, docs = searcher.search(batch_features, return_k)
+
+ results = []
+ for i in range(scores.shape[0]):
+ pred = {}
+ if scores[i][0] >= rec_score_thres:
+ pred["bbox"] = [int(x) for x in det_boxes[i, 2:]]
+ pred["rec_docs"] = id_map[docs[i][0]].split()[1]
+ pred["rec_scores"] = scores[i][0]
+ results.append(pred)
+
+ #do nms
+ results = nms_to_rec_results(results, rec_nms_thresold)
+ return results
+
+
+#do client
if __name__ == "__main__":
- det = MainbodyDetect()
- rec = ObjectRecognition()
-
- #1. get det_results
- imgpath = "../../drink_dataset_v1.0/test_images/001.jpeg"
- det_results = det.predict(imgpath)
-
- #2. get rec_results
- rec_results = rec.predict(det_results, imgpath)
- print(rec_results)
+ client = Client()
+ client.load_client_config([
+ "../../models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client",
+ "../../models/general_PPLCNet_x2_5_lite_v1.0_client"
+ ])
+ client.connect(['127.0.0.1:9400'])
+
+ im = cv2.imread("../../drink_dataset_v1.0/test_images/001.jpeg")
+ im_shape = np.array(im.shape[:2]).reshape(-1)
+ fetch_map = client.predict(
+ feed={"image": im,
+ "im_shape": im_shape},
+ fetch=["features", "boxes"],
+ batch=False)
+
+ #add retrieval procedure
+ det_boxes = fetch_map["boxes"]
+ searcher, id_map = init_index(index_dir)
+ results = postprocess(fetch_map, feature_normalize, det_boxes, searcher,
+ id_map, return_k, rec_score_thres, rec_nms_thresold)
+ print(results)
diff --git a/deploy/paddleserving/test_cpp_serving_client.py b/deploy/paddleserving/test_cpp_serving_client.py
index 50794b363767c8236ccca1001a441b535a9f9db3..ba5399c90dcd5e0701df26e2d2f8337a4105ab51 100644
--- a/deploy/paddleserving/test_cpp_serving_client.py
+++ b/deploy/paddleserving/test_cpp_serving_client.py
@@ -12,16 +12,20 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-import sys
+import base64
+import time
+
from paddle_serving_client import Client
-#app
-from paddle_serving_app.reader import Sequential, URL2Image, Resize
-from paddle_serving_app.reader import CenterCrop, RGB2BGR, Transpose, Div, Normalize
-import time
+
+def bytes_to_base64(image: bytes) -> str:
+ """encode bytes into base64 string
+ """
+ return base64.b64encode(image).decode('utf8')
+
client = Client()
-client.load_client_config("./ResNet50_vd_serving/serving_server_conf.prototxt")
+client.load_client_config("./ResNet50_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
label_dict = {}
@@ -31,22 +35,17 @@ with open("imagenet.label") as fin:
label_dict[label_idx] = line.strip()
label_idx += 1
-#preprocess
-seq = Sequential([
- URL2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
- Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
-])
-
-start = time.time()
-image_file = "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"
+image_file = "./daisy.jpg"
for i in range(1):
- img = seq(image_file)
- fetch_map = client.predict(
- feed={"inputs": img}, fetch=["prediction"], batch=False)
-
- prob = max(fetch_map["prediction"][0])
- label = label_dict[fetch_map["prediction"][0].tolist().index(prob)].strip(
- ).replace(",", "")
- print("prediction: {}, probability: {}".format(label, prob))
-end = time.time()
-print(end - start)
+ start = time.time()
+ with open(image_file, 'rb') as img_file:
+ image_data = img_file.read()
+ image = bytes_to_base64(image_data)
+ fetch_dict = client.predict(
+ feed={"inputs": image}, fetch=["prediction"], batch=False)
+ prob = max(fetch_dict["prediction"][0])
+ label = label_dict[fetch_dict["prediction"][0].tolist().index(
+ prob)].strip().replace(",", "")
+ print("prediction: {}, probability: {}".format(label, prob))
+ end = time.time()
+ print(end - start)
diff --git a/deploy/python/postprocess.py b/deploy/python/postprocess.py
index 4f4d005fdff2bf17e04265e136443d0cd837f10e..23a803e284361e98b60f193c450318536d992937 100644
--- a/deploy/python/postprocess.py
+++ b/deploy/python/postprocess.py
@@ -64,9 +64,17 @@ class ThreshOutput(object):
for idx, probs in enumerate(x):
score = probs[1]
if score < self.threshold:
- result = {"class_ids": [0], "scores": [1 - score], "label_names": [self.label_0]}
+ result = {
+ "class_ids": [0],
+ "scores": [1 - score],
+ "label_names": [self.label_0]
+ }
else:
- result = {"class_ids": [1], "scores": [score], "label_names": [self.label_1]}
+ result = {
+ "class_ids": [1],
+ "scores": [score],
+ "label_names": [self.label_1]
+ }
if file_names is not None:
result["file_name"] = file_names[idx]
y.append(result)
@@ -179,3 +187,136 @@ class Binarize(object):
byte[:, i:i + 1] = np.dot(x[:, i * 8:(i + 1) * 8], self.unit)
return byte
+
+
+class PersonAttribute(object):
+ def __init__(self,
+ threshold=0.5,
+ glasses_threshold=0.3,
+ hold_threshold=0.6):
+ self.threshold = threshold
+ self.glasses_threshold = glasses_threshold
+ self.hold_threshold = hold_threshold
+
+ def __call__(self, batch_preds, file_names=None):
+ # postprocess output of predictor
+ age_list = ['AgeLess18', 'Age18-60', 'AgeOver60']
+ direct_list = ['Front', 'Side', 'Back']
+ bag_list = ['HandBag', 'ShoulderBag', 'Backpack']
+ upper_list = ['UpperStride', 'UpperLogo', 'UpperPlaid', 'UpperSplice']
+ lower_list = [
+ 'LowerStripe', 'LowerPattern', 'LongCoat', 'Trousers', 'Shorts',
+ 'Skirt&Dress'
+ ]
+ batch_res = []
+ for res in batch_preds:
+ res = res.tolist()
+ label_res = []
+ # gender
+ gender = 'Female' if res[22] > self.threshold else 'Male'
+ label_res.append(gender)
+ # age
+ age = age_list[np.argmax(res[19:22])]
+ label_res.append(age)
+ # direction
+ direction = direct_list[np.argmax(res[23:])]
+ label_res.append(direction)
+ # glasses
+ glasses = 'Glasses: '
+ if res[1] > self.glasses_threshold:
+ glasses += 'True'
+ else:
+ glasses += 'False'
+ label_res.append(glasses)
+ # hat
+ hat = 'Hat: '
+ if res[0] > self.threshold:
+ hat += 'True'
+ else:
+ hat += 'False'
+ label_res.append(hat)
+ # hold obj
+ hold_obj = 'HoldObjectsInFront: '
+ if res[18] > self.hold_threshold:
+ hold_obj += 'True'
+ else:
+ hold_obj += 'False'
+ label_res.append(hold_obj)
+ # bag
+ bag = bag_list[np.argmax(res[15:18])]
+ bag_score = res[15 + np.argmax(res[15:18])]
+ bag_label = bag if bag_score > self.threshold else 'No bag'
+ label_res.append(bag_label)
+ # upper
+ upper_res = res[4:8]
+ upper_label = 'Upper:'
+ sleeve = 'LongSleeve' if res[3] > res[2] else 'ShortSleeve'
+ upper_label += ' {}'.format(sleeve)
+ for i, r in enumerate(upper_res):
+ if r > self.threshold:
+ upper_label += ' {}'.format(upper_list[i])
+ label_res.append(upper_label)
+ # lower
+ lower_res = res[8:14]
+ lower_label = 'Lower: '
+ has_lower = False
+ for i, l in enumerate(lower_res):
+ if l > self.threshold:
+ lower_label += ' {}'.format(lower_list[i])
+ has_lower = True
+ if not has_lower:
+ lower_label += ' {}'.format(lower_list[np.argmax(lower_res)])
+
+ label_res.append(lower_label)
+ # shoe
+ shoe = 'Boots' if res[14] > self.threshold else 'No boots'
+ label_res.append(shoe)
+
+ threshold_list = [0.5] * len(res)
+ threshold_list[1] = self.glasses_threshold
+ threshold_list[18] = self.hold_threshold
+ pred_res = (np.array(res) > np.array(threshold_list)
+ ).astype(np.int8).tolist()
+ batch_res.append({"attributes": label_res, "output": pred_res})
+ return batch_res
+
+
+class VehicleAttribute(object):
+ def __init__(self, color_threshold=0.5, type_threshold=0.5):
+ self.color_threshold = color_threshold
+ self.type_threshold = type_threshold
+ self.color_list = [
+ "yellow", "orange", "green", "gray", "red", "blue", "white",
+ "golden", "brown", "black"
+ ]
+ self.type_list = [
+ "sedan", "suv", "van", "hatchback", "mpv", "pickup", "bus",
+ "truck", "estate"
+ ]
+
+ def __call__(self, batch_preds, file_names=None):
+ # postprocess output of predictor
+ batch_res = []
+ for res in batch_preds:
+ res = res.tolist()
+ label_res = []
+ color_idx = np.argmax(res[:10])
+ type_idx = np.argmax(res[10:])
+ if res[color_idx] >= self.color_threshold:
+ color_info = f"Color: ({self.color_list[color_idx]}, prob: {res[color_idx]})"
+ else:
+ color_info = "Color unknown"
+
+ if res[type_idx + 10] >= self.type_threshold:
+ type_info = f"Type: ({self.type_list[type_idx]}, prob: {res[type_idx + 10]})"
+ else:
+ type_info = "Type unknown"
+
+ label_res = f"{color_info}, {type_info}"
+
+ threshold_list = [self.color_threshold
+ ] * 10 + [self.type_threshold] * 9
+ pred_res = (np.array(res) > np.array(threshold_list)
+ ).astype(np.int8).tolist()
+ batch_res.append({"attributes": label_res, "output": pred_res})
+ return batch_res
diff --git a/deploy/python/predict_cls.py b/deploy/python/predict_cls.py
index 64c07ea875eaa2c456393328183b7270080a64d1..49bf62fa3060b9336a3438b2ee5c25b2bac49667 100644
--- a/deploy/python/predict_cls.py
+++ b/deploy/python/predict_cls.py
@@ -138,13 +138,20 @@ def main(config):
continue
batch_results = cls_predictor.predict(batch_imgs)
for number, result_dict in enumerate(batch_results):
- filename = batch_names[number]
- clas_ids = result_dict["class_ids"]
- scores_str = "[{}]".format(", ".join("{:.2f}".format(
- r) for r in result_dict["scores"]))
- label_names = result_dict["label_names"]
- print("{}:\tclass id(s): {}, score(s): {}, label_name(s): {}".
- format(filename, clas_ids, scores_str, label_names))
+ if "PersonAttribute" in config[
+ "PostProcess"] or "VehicleAttribute" in config[
+ "PostProcess"]:
+ filename = batch_names[number]
+ print("{}:\t {}".format(filename, result_dict))
+ else:
+ filename = batch_names[number]
+ clas_ids = result_dict["class_ids"]
+ scores_str = "[{}]".format(", ".join("{:.2f}".format(
+ r) for r in result_dict["scores"]))
+ label_names = result_dict["label_names"]
+ print(
+ "{}:\tclass id(s): {}, score(s): {}, label_name(s): {}".
+ format(filename, clas_ids, scores_str, label_names))
batch_imgs = []
batch_names = []
if cls_predictor.benchmark:
diff --git a/deploy/slim/quant_post_static.py b/deploy/slim/quant_post_static.py
index 5c8469794ad29e18dad15f985b611e423fd4b474..20507c66ad1ed583c2baf1bae7e812e0364e015e 100644
--- a/deploy/slim/quant_post_static.py
+++ b/deploy/slim/quant_post_static.py
@@ -43,6 +43,7 @@ def main():
'inference.pdiparams'))
config["DataLoader"]["Eval"]["sampler"]["batch_size"] = 1
config["DataLoader"]["Eval"]["loader"]["num_workers"] = 0
+
init_logger()
device = paddle.set_device("cpu")
train_dataloader = build_dataloader(config["DataLoader"], "Eval", device,
@@ -67,6 +68,7 @@ def main():
quantize_model_path=os.path.join(
config["Global"]["save_inference_dir"], "quant_post_static_model"),
sample_generator=sample_generator(train_dataloader),
+ batch_size=config["DataLoader"]["Eval"]["sampler"]["batch_size"],
batch_nums=10)
diff --git a/docs/en/PULC/PULC_car_exists_en.md b/docs/en/PULC/PULC_car_exists_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..33c0932e6f118d7f9e31650e7d1e9754af19ec17
--- /dev/null
+++ b/docs/en/PULC/PULC_car_exists_en.md
@@ -0,0 +1,457 @@
+# PULC Classification Model for Car Presence
+
+------
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model that determines whether an image contains a car, using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in monitoring scenarios, massive data filtering scenarios, etc.
+
+The following table lists the relevant metrics of the models. The first two rows use SwinTransformer_tiny and MobileNetV3_small_x0_35 as backbones. In the third to sixth rows, the backbone is replaced by PPLCNet_x1_0, and the SSLD pretrained model, the EDA strategy, and the SKL-UGI knowledge distillation strategy are added in turn.
+
+| Backbone | Tpr(%) | Latency(ms) | Size(M)| Training Strategy |
+|-------|----------------|----------|---------------|---------------|
+| SwinTranformer_tiny | 97.71 | 95.30 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 81.23 | 2.85 | 2.7 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 94.72 | 2.12 | 7.1 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 95.48 | 2.12 | 7.1 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 95.48 | 2.12 | 7.1 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0 | 95.92 | 2.12 | 7.1 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+The table shows that the Tpr is high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the Tpr drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 raises the Tpr about 13.5 percentage points above MobileNetV3_small_x0_35 while also being more than 20% faster. On top of that, using the SSLD pretrained model improves the Tpr by about 0.7 percentage points without affecting the inference speed, and additionally using SKL-UGI knowledge distillation improves it by a further 0.44 percentage points. At this point, the Tpr is close to that of SwinTransformer_tiny, while the speed is more than 40 times faster. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+
+* For the `Tpr` metric, please refer to [section 3.3](#3.3) for more information.
+* The Latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. The MKLDNN is enabled and the number of threads is 10.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if no GPU device is available.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, such as for other PaddlePaddle versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+Install the PaddleClas wheel with the following command:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the test demo images.
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=car_exists --infer_imgs=pulc_demo_imgs/car_exists/objects365_00001507.jpeg
+```
+
+Results:
+
+```
+>>> result
+class_ids: [1], scores: [0.9871138], label_names: ['contains_car'], filename: pulc_demo_imgs/car_exists/objects365_00001507.jpeg
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="car_exists")
+result = model.predict(input_data="pulc_demo_imgs/car_exists/objects365_00001507.jpeg")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so you need to call `next()` on it or iterate it with a `for` loop. Each call runs prediction on one batch of `batch_size` images and returns the results. The default `batch_size` is 1, and you can also specify it when instantiating, such as `model = paddleclas.PaddleClas(model_name="car_exists", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'class_ids': [1], 'scores': [0.9871138], 'label_names': ['contains_car'], 'filename': 'pulc_demo_imgs/car_exists/objects365_00001507.jpeg'}]
+```
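+
+For batched prediction over many images, one possible usage pattern (a sketch based on the generator behavior described above, using the demo image directory) is:
+
+```python
+import paddleclas
+
+model = paddleclas.PaddleClas(model_name="car_exists", batch_size=2)
+results = model.predict(input_data="pulc_demo_imgs/car_exists/")
+for batch in results:  # each iteration yields the predictions of one batch
+    print(batch)
+```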
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) for a description of the installation process.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+All datasets used in this case are open-source data. The training and validation data are subsets of the [Objects365](https://www.objects365.org/overview.html) data, and ImageNet_val is the validation data of [ImageNet-1k](https://www.image-net.org/).
+
+
+
+#### 3.2.2 Getting Dataset
+
+The data used in this case is obtained by processing the open-source data. The detailed process is as follows:
+
+- Training data. This case processes the annotation file of the Objects365 training data. If an image contains a "car" label and the corresponding box covers more than 10% of the whole image, the image is considered to contain a car; if an image carries no vehicle label at all (car, bus and so on), it is considered not to contain a car. After processing, 108629 images were obtained, including 27422 images containing a car and 81207 images without one. A sketch of this filtering rule follows the note below.
+- Validation data: built the same way as the training data, but checked manually to remove some mislabeled images.
+
+**Note**: the labels of Objects365 are not completely mutually exclusive. For example, an F1 racing car may be labeled "F1 formula" or "car". To reduce this interference, we only treat the "car" label as containing a car, and images without any vehicle label as not containing a car.
+
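+The filtering rule above can be sketched in a few lines of Python. This is only an illustration, not the script used to build the released dataset; the helper name and the COCO-style annotation layout are assumptions:
+
+```python
+def image_contains_car(image_info, annotations, car_category_id, min_area_ratio=0.1):
+    """True if any "car" box covers more than min_area_ratio of the image area."""
+    image_area = image_info["width"] * image_info["height"]
+    for ann in annotations:  # annotations belonging to this image only
+        if ann["category_id"] != car_category_id:
+            continue
+        _, _, box_w, box_h = ann["bbox"]  # COCO-style [x, y, w, h]
+        if box_w * box_h / image_area > min_area_ratio:
+            return True
+    return False
+```
+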
+Some images from the processed dataset are shown below:
+
+
+
+You can also download the processed data directly.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/car_exists.tar
+tar -xf car_exists.tar
+cd ../
+```
+
+The contents of the `car_exists` directory are as follows:
+
+```
+
+├── objects365_car
+│ ├── objects365_00000039.jpg
+│ ├── objects365_00000099.jpg
+├── ImageNet_val
+│ ├── ILSVRC2012_val_00000001.JPEG
+│ ├── ILSVRC2012_val_00000002.JPEG
+...
+├── train_list.txt
+├── train_list.txt.debug
+├── train_list_for_distill.txt
+├── val_list.txt
+└── val_list.txt.debug
+```
+
+`train/` and `val/` are the training set and validation set respectively. `train_list.txt` and `val_list.txt` are the label files of the training and validation data respectively. `train_list.txt.debug` and `val_list.txt.debug` are subsets of `train_list.txt` and `val_list.txt` respectively. `ImageNet_val/` is the validation data of ImageNet-1k, which is used for SKL-UGI knowledge distillation; its label file is `train_list_for_distill.txt`.
+
+**Note**:
+
+* About the contents format of `train_list.txt` and `val_list.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md).
+* About the `train_list_for_distill.txt`, please refer to [Knowledge Distillation Label](../advanced_tutorials/distillation/distillation_en.md).
+
+
+
+### 3.3 Training
+
+The details of the training config are in `ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml
+```
+
+The best metric on the validation data is between `0.95` and `0.96`. The result fluctuates because the dataset is small.
+
+**Note**:
+
+* The metric Tpr describes the True Positive Rate when the False Positive Rate is less than a certain threshold (1/100 is used in this case); it is a commonly used metric for binary classification. For details about Fpr and Tpr, please refer to [here](https://en.wikipedia.org/wiki/Receiver_operating_characteristic).
+* During evaluation, the best TprAtFpr metric is printed, including the `Fpr`, the `Tpr` and the current `threshold`. `Tpr` is the recall rate under the current `Fpr`; the higher the `Tpr`, the better the model. The `threshold` is the classification threshold under the best `Fpr`, and it is used later in deployment. A minimal sketch of this metric follows these notes.
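+
+The sketch below illustrates how such a TprAtFpr-style metric can be computed with NumPy. It is an illustration under stated assumptions, not PaddleClas's implementation:
+
+```python
+import numpy as np
+
+
+def tpr_at_fpr(scores, labels, max_fpr=0.01):
+    """Best true positive rate over all thresholds whose false positive rate <= max_fpr."""
+    scores = np.asarray(scores, dtype=float)
+    labels = np.asarray(labels, dtype=bool)
+    best_tpr, best_thresh = 0.0, None
+    for thresh in np.unique(scores):
+        pred = scores >= thresh
+        fpr = np.sum(pred & ~labels) / max(np.sum(~labels), 1)
+        tpr = np.sum(pred & labels) / max(np.sum(labels), 1)
+        if fpr <= max_fpr and tpr > best_tpr:
+            best_tpr, best_thresh = tpr, float(thresh)
+    return best_tpr, best_thresh
+```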
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model for inference. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+The results:
+
+```
+[{'class_ids': [1], 'scores': [0.9871138], 'label_names': ['contains_car'], 'filename': 'deploy/images/PULC/car_exists/objects365_00001507.jpeg'}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/car_exists/objects365_00001507.jpeg`. You can test another image by specifying the argument `-o Infer.infer_imgs=path_to_test_image`.
+* The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9794`; the threshold should be set according to your specific case. `0.9794` is the best threshold when `Fpr` is less than `1/100` on this validation dataset.
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple but effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Train the teacher model with the hyperparameters specified in `ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+The best metric on the validation data is between `0.96` and `0.98`. The best teacher model weights are saved in `output/ResNet101_vd/best_model.pdparams`.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The training strategy is specified in the config file `ppcls/configs/PULC/car_exists/PPLCNet_x1_0_distillation.yaml`: the teacher model is `ResNet101_vd`, the student model is `PPLCNet_x1_0`, and the validation data of ImageNet1k is used as additional unlabeled training data. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+The best metric is between `0.95` and `0.97`. The best student model weights are saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.3](#3.3) and [section 4.1](#4.1) were obtained with the `Hyperparameters Searching` strategy in PaddleClas. If you want better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to find better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your needs. If you do not change the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle; it provides high-performance inference for server deployment. Compared with predicting directly from the pretrained model, Paddle Inference can use tools to accelerate prediction and achieve better performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle inference model for prediction. The two ways to get one are described below. If you want to use the model provided by PaddleClas, you can download it directly; see [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command for exporting the Paddle inference model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_car_exists_infer
+```
+
+After running the above command, the inference model files are saved in `deploy/models/PPLCNet_x1_0_car_exists_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_car_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The best model here comes from knowledge distillation training. If knowledge distillation training is not used, the best model is saved in `output/PPLCNet_x1_0/best_model.pdparams`.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model directly.
+
+```
+cd deploy/models
+# download and decompress the inference model
+wget https://paddleclas.bj.bcebos.com/models/PULC/car_exists_infer.tar && tar -xf car_exists_infer.tar
+```
+
+After decompression, the `models` directory should contain the following:
+
+```
+├── car_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify whether there are cars in the image `./images/PULC/car_exists/objects365_00001507.jpeg`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/car_exists/inference_car_exists.yaml
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/car_exists/inference_car_exists.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+objects365_00001507.jpeg: class id(s): [1], score(s): [0.99], label_name(s): ['contains_car']
+```
+
+**Note**: The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9794`; the threshold should be set according to your specific case. `0.9794` is the best threshold when `Fpr` is less than `1/100` on this validation dataset. Please refer to [section 3.3](#3.3) for details.
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, specify the directory path with the argument `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If you want to use the CPU instead, add the argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/car_exists/inference_car_exists.yaml -o Global.infer_imgs="./images/PULC/car_exists/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+objects365_00001507.jpeg: class id(s): [1], score(s): [0.99], label_name(s): ['contains_car']
+objects365_00001521.jpeg: class id(s): [0], score(s): [0.99], label_name(s): ['no_car']
+```
+
+In the prediction results above, `contains_car` means that there is a car in the image, and `no_car` means that there is not.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example about how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance serving framework for machine learning models. It supports different protocols, such as RESTful, gRPC and bRPC, and provides deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example about how to deploy as service by Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open-source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile by Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle inference models to ONNX models, which can then be deployed on different inference engines, such as TensorRT, OpenVINO, MNN/TNN, NCNN and so on. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert a Paddle inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_language_classification_en.md b/docs/en/PULC/PULC_language_classification_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..c7cd5f5db9c01f01c4fbb2299086bc1adcfc98d1
--- /dev/null
+++ b/docs/en/PULC/PULC_language_classification_en.md
@@ -0,0 +1,470 @@
+# PULC Classification Model of Language
+
+------
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical model that classifies the language of the text in an image, using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in scenarios involving multilingual OCR processing, such as finance and government affairs.
+
+The following table lists the relevant metrics of the models. The first two rows use SwinTransformer_tiny and MobileNetV3_small_x0_35 as backbones. In the third to sixth rows, the backbone is replaced by PPLCNet_x1_0, and the SSLD pretrained model, the EDA strategy, and the SKL-UGI knowledge distillation strategy are added in turn. When the backbone is replaced with PPLCNet_x1_0, the input shape of the model is changed to [192, 48] and the stride of the network is changed to [2, [2, 1], [2, 1], [2, 1]].
+
+| Backbone | Top1-Acc(%) | Latency(ms) | Size(M)| Training Strategy |
+| ----------------------- | --------- | -------- | ------- | ---------------------------------------------- |
+| SwinTranformer_tiny | 98.12 | 89.09 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 95.92 | 2.98 | 3.7 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 98.35 | 2.58 | 7.1 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 98.7 | 2.58 | 7.1 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 99.12 | 2.58 | 7.1 | using SSLD pretrained model + EDA strategy |
+| **PPLCNet_x1_0** | **99.26** | **2.58** | **7.1** | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+The table shows that the accuracy is high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 and changing the input shape and stride of the network raises the accuracy 2.43 percentage points above MobileNetV3_small_x0_35 while also being faster (2.58 ms vs. 2.98 ms). On top of that, using the SSLD pretrained model improves the accuracy by about 0.35 percentage points without affecting the inference speed, additionally using the EDA strategy increases it by a further 0.42 percentage points, and SKL-UGI knowledge distillation improves it by another 0.14 percentage points. At this point, the accuracy exceeds that of SwinTransformer_tiny at a much faster speed. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+
+* The Latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. The MKLDNN is enabled and the number of threads is 10.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if no GPU device is available.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, such as for other PaddlePaddle versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+Install the PaddleClas wheel with the following command:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the test demo images.
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=language_classification --infer_imgs=pulc_demo_imgs/language_classification/word_35404.png
+```
+
+Results:
+
+```
+>>> result
+class_ids: [4, 6], scores: [0.88672, 0.01434], label_names: ['japan', 'korean'], filename: pulc_demo_imgs/language_classification/word_35404.png
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="language_classification")
+result = model.predict(input_data="pulc_demo_imgs/language_classification/word_35404.png")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so you need to call `next()` on it or iterate it with a `for` loop. Each call runs prediction on one batch of `batch_size` images and returns the results. The default `batch_size` is 1, and you can also specify it when instantiating, such as `model = paddleclas.PaddleClas(model_name="language_classification", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'class_ids': [4, 6], 'scores': [0.88672, 0.01434], 'label_names': ['japan', 'korean'], 'filename': 'pulc_demo_imgs/language_classification/word_35404.png'}]
+```
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) for a description of the installation process.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+The models we provide are trained with internal data, which is not open source yet. It is therefore suggested to construct a dataset based on the open-source dataset [Multi-lingual scene text detection and recognition](https://rrc.cvc.uab.es/?ch=15&com=downloads) to experience this case.
+
+Some images from the processed dataset are shown below:
+
+
+
+
+
+#### 3.2.2 Getting Dataset
+
+The provided models support the classification of 10 languages, as shown in the following list:
+
+`0` : means arabic
+`1` : means chinese_cht
+`2` : means cyrillic
+`3` : means devanagari
+`4` : means japan
+`5` : means ka
+`6` : means korean
+`7` : means ta
+`8` : means te
+`9` : means latin
+
+The `Multi-lingual scene text detection and recognition` dataset only includes Arabic, Japanese, Korean and Latin data. For each of the four languages, 1600 images are taken as the training data of this case, 300 images as the evaluation data, and 400 images as supplementary data for `SKL-UGI Knowledge Distillation`.
+
+Therefore, for the demo dataset in this case, the language categories are shown in the following list:
+`0` : means arabic
+`4` : means japan
+`6` : means korean
+`9` : means latin
+
+**Note**: The images used in this task should be text regions cropped from the original images; only the text line part is used as the image data. A minimal cropping sketch follows this note.
+
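+As an illustration, a text line can be cropped from an axis-aligned detection box with a few lines of OpenCV. The box format `[x1, y1, x2, y2]` is an assumption:
+
+```python
+import cv2
+
+
+def crop_text_line(image_path, box):
+    """Crop an axis-aligned text box [x1, y1, x2, y2] out of the image."""
+    img = cv2.imread(image_path)
+    x1, y1, x2, y2 = [int(v) for v in box]
+    return img[y1:y2, x1:x2]
+```
+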
+If you want to create your own dataset, you can collect and sort out data for the languages required by your task. You can also download the processed data directly.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/language_classification.tar
+tar -xf language_classification.tar
+cd ../
+```
+
+The contents of the `language_classification` directory are as follows:
+
+```
+├── img
+│ ├── word_1.png
+│ ├── word_2.png
+...
+├── train_list.txt
+├── train_list_for_distill.txt
+├── test_list.txt
+└── label_list.txt
+```
+
+`img/` is the directory containing 9200 images in 4 languages. `train_list.txt` and `test_list.txt` are the label files of the training and validation data respectively. `label_list.txt` is the mapping file of the four languages. `train_list_for_distill.txt` is the label list of the images used for `SKL-UGI Knowledge Distillation`.
+
+**Note**:
+
+* About the contents format of `train_list.txt` and `val_list.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md).
+* About the `train_list_for_distill.txt`, please refer to [Knowledge Distillation Label](../advanced_tutorials/distillation/distillation_en.md).
+
+
+
+### 3.3 Training
+
+The details of the training config are in `ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Arch.class_num=4
+```
+
+**Note**: Because the class number of the demo dataset is 4, the argument `-o Arch.class_num=4` should be specified to change the number of classes predicted by the model to 4.
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model" \
+ -o Arch.class_num=4
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model for inference. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model" \
+ -o Arch.class_num=4
+```
+
+The results:
+
+```
+[{'class_ids': [4, 9], 'scores': [0.96809, 0.01001], 'file_name': 'deploy/images/PULC/language_classification/word_35404.png', 'label_names': ['japan', 'latin']}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/language_classification/word_35404.png`. You can test another image by specifying the argument `-o Infer.infer_imgs=path_to_test_image`.
+* Among the prediction results, `japan` means Japanese and `korean` means Korean.
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple but effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Train the teacher model with the hyperparameters specified in `ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd \
+ -o Arch.class_num=4
+```
+
+The best teacher model weights are saved in `output/ResNet101_vd/best_model.pdparams`.
+
+**Note**: Training the ResNet101_vd model requires more GPU memory. If the memory is not enough, you can reduce the learning rate and batch size in the same proportion.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The training strategy is specified in the config file `ppcls/configs/PULC/language_classification/PPLCNet_x1_0_distillation.yaml`: the teacher model is `ResNet101_vd` and the student model is `PPLCNet_x1_0`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model \
+ -o Arch.class_num=4
+```
+
+The best student model weights are saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.3](#3.3) and [section 4.1](#4.1) were obtained with the `Hyperparameters Searching` strategy in PaddleClas. If you want better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to find better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your needs. If you do not change the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle; it provides high-performance inference for server deployment. Compared with predicting directly from the pretrained model, Paddle Inference can use tools to accelerate prediction and achieve better performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle inference model for prediction. The two ways to get one are described below. If you want to use the model provided by PaddleClas, you can download it directly; see [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command for exporting the Paddle inference model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_language_classification_infer
+```
+
+After running the above command, the inference model files are saved in `deploy/models/PPLCNet_x1_0_language_classification_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_language_classification_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The best model here comes from knowledge distillation training. If knowledge distillation training is not used, the best model is saved in `output/PPLCNet_x1_0/best_model.pdparams`.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model directly.
+
+```
+cd deploy/models
+# download and decompress the inference model
+wget https://paddleclas.bj.bcebos.com/models/PULC/language_classification_infer.tar && tar -xf language_classification_infer.tar
+```
+
+After decompression, the `models` directory should contain the following:
+
+```
+├── language_classification_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify the language of the text in the image `./images/PULC/language_classification/word_35404.png`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/language_classification/inference_language_classification.yaml
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/language_classification/inference_language_classification.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+word_35404.png: class id(s): [4, 6], score(s): [0.89, 0.01], label_name(s): ['japan', 'korean']
+```
+
+**Note**: In the prediction results, `japan` means Japanese and `korean` means Korean.
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, specify the directory path with the argument `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If want to replace with CPU, you can add argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/language_classification/inference_language_classification.yaml -o Global.infer_imgs="./images/PULC/language_classification/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+word_17.png: class id(s): [9, 4], score(s): [0.80, 0.09], label_name(s): ['latin', 'japan']
+word_20.png: class id(s): [0, 4], score(s): [0.91, 0.02], label_name(s): ['arabic', 'japan']
+word_35404.png: class id(s): [4, 6], score(s): [0.89, 0.01], label_name(s): ['japan', 'korean']
+```
+
+In the prediction results above, `japan` means Japanese, `latin` means Latin, `arabic` means Arabic and `korean` means Korean.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example of how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance carrier for machine learning models that supports different protocols such as RESTful, gRPC and bRPC, providing deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example of how to deploy a service with Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile with Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models, which can then be deployed on inference engines such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert a Paddle Inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_model_list_en.md b/docs/en/PULC/PULC_model_list_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..a7de0ce2c996132e6c882a10f5fcecd22398cc22
--- /dev/null
+++ b/docs/en/PULC/PULC_model_list_en.md
@@ -0,0 +1,25 @@
+# PULC Model Zoo
+
+------
+
+This page provides the PULC model zoo, mainly listing each model's metrics, storage size, inference latency and download links. The pretrained models can be used for fine-tuning, and the inference models can be used directly for prediction and deployment.
+
+
+|Model name| Model Description | Metrics |Storage Size| Latency| Download Address|
+| --- | --- | --- | --- | --- | --- |
+| person_exists |[Human Exists Classification](PULC_person_exists_en.md)| 96.23 |7.0M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_exists_pretrained.pdparams)|
+| person_attribute |[Pedestrian Attribute Classification](PULC_person_attribute_en.md)| 78.59 |7.2M|2.01ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_attribute_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_attribute_pretrained.pdparams)|
+| safety_helmet |[Classification of Whether Wearing Safety Helmet](PULC_safety_helmet_en.md)| 99.38 |7.1M|2.03ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/safety_helmet_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/safety_helmet_pretrained.pdparams)|
+| traffic_sign |[Traffic Sign Classification](PULC_traffic_sign_en.md)| 98.35 |8.2M|2.10ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/traffic_sign_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/traffic_sign_pretrained.pdparams)|
+| vehicle_attribute |[Vehicle Attribute Classification](PULC_vehicle_attribute_en.md)| 90.81 |7.2M|2.36ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/vehicle_attribute_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/vehicle_attribute_pretrained.pdparams)|
+| car_exists |[Car Exists Classification](PULC_car_exists_en.md) | 95.92 | 7.1M | 2.38ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/car_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/car_exists_pretrained.pdparams)|
+| text_image_orientation |[Text Image Orientation Classification](PULC_text_image_orientation_en.md)| 99.06 | 7.1M | 2.16ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/text_image_orientation_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/text_image_orientation_pretrained.pdparams)|
+| textline_orientation |[Text-line Orientation Classification](PULC_textline_orientation_en.md)| 96.01 |7.0M|2.72ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/textline_orientation_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/textline_orientation_pretrained.pdparams)|
+| language_classification |[Language Classification](PULC_language_classification_en.md)| 99.26 |7.1M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/language_classification_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/language_classification_pretrained.pdparams)|
+
+
+**Note:**
+
+* The backbone of all the above models is PPLCNet_x1_0. The different sizes of some models are caused by the different output sizes of the classification layer. The inference time is tested on the Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. During the test process, the MKLDNN acceleration strategy is turned on, and the number of threads is 10. There will be slight fluctuations during the speed test process.
+
+* The evaluation indicators of person_exists, safety_helmet, and car_exists are TprAtFpr. The evaluation indicators of person_attribute and vehicle_attribute are ma. The evaluation indicators of traffic_sign, text_image_orientation, textline_orientation and language_classification are Top-1 Acc.
diff --git a/docs/en/PULC/PULC_person_attribute_en.md b/docs/en/PULC/PULC_person_attribute_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..173313aad1a684289f3a6825cdf73ea01493847d
--- /dev/null
+++ b/docs/en/PULC/PULC_person_attribute_en.md
@@ -0,0 +1,448 @@
+# PULC Recognition Model of Person Attribute
+
+------
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of person attributes using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in pedestrian analysis, pedestrian tracking and similar scenarios.
+
+The following table lists the relevant metrics of the models. The first three rows use Res2Net200_vd_26w_4s, SwinTransformer_tiny and MobileNetV3_small_x0_35 as backbones. Rows four to seven replace the backbone with PPLCNet_x1_0 and progressively add the SSLD pretrained model, the EDA strategy, and the SKL-UGI knowledge distillation strategy.
+
+
+| Backbone | ma(%) | Latency(ms) | Size(M) | Training Strategy |
+|-------|-----------|----------|---------------|---------------|
+| Res2Net200_vd_26w_4s | 81.25 | 77.51 | 293 | using ImageNet pretrained |
+| SwinTransformer_tiny | 80.17 | 89.51 | 111 | using ImageNet pretrained |
+| MobileNetV3_small_x0_35 | 70.79 | 2.90 | 1.7 | using ImageNet pretrained |
+| PPLCNet_x1_0 | 76.31 | 2.01 | 7.1 | using ImageNet pretrained |
+| PPLCNet_x1_0 | 77.31 | 2.01 | 7.1 | using SSLD pretrained |
+| PPLCNet_x1_0 | 77.71 | 2.01 | 7.1 | using SSLD pretrained + EDA strategy|
+| PPLCNet_x1_0 | 78.59 | 2.01 | 7.1 | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+It can be seen that a high ma metric can be obtained when the backbone is Res2Net200_vd_26w_4s or SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the ma metric drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 brings an ma metric 5.5 percentage points higher than MobileNetV3_small_x0_35 while being more than 20% faster. On top of that, using the SSLD pretrained model improves the ma metric by about 1 percentage point without affecting the inference speed, adding the EDA strategy increases it by another 0.4 percentage points, and adding SKL-UGI knowledge distillation improves it by a further 0.88 percentage points. At this point, the ma metric of PPLCNet_x1_0 is only 1.58 percentage points lower than that of SwinTransformer_tiny, while the speed is more than 44 times faster. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+
+* The Latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. The MKLDNN is enabled and the number of threads is 10.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more installation information, for example for other versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+Install PaddleClas with the following command:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the test demo images.
+
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=person_attribute --infer_imgs=pulc_demo_imgs/person_attribute/090004.jpg
+```
+
+Results:
+```
+>>> result
+attributes: ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], output: [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1], filename: pulc_demo_imgs/person_attribute/090004.jpg
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="person_attribute")
+result = model.predict(input_data="pulc_demo_imgs/person_attribute/090004.jpg")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so you need to call it with `next()` or iterate over it with a `for` loop. Each call predicts a batch of `batch_size` images and returns the results. The default `batch_size` is 1, and you can also specify it when instantiating, such as `model = paddleclas.PaddleClas(model_name="person_attribute", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1], 'filename': 'pulc_demo_imgs/person_attribute/090004.jpg'}]
+```
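+
+If you need all results instead of only the first batch, you can iterate over the generator. A minimal sketch follows; it assumes, as with `--infer_imgs`, that `input_data` may also point to a directory of images:
+
+```python
+import paddleclas
+
+# Iterate over every batch instead of calling next() once.
+model = paddleclas.PaddleClas(model_name="person_attribute", batch_size=2)
+results = model.predict(input_data="pulc_demo_imgs/person_attribute/")
+for batch in results:  # each item is a list of up to batch_size prediction dicts
+    for pred in batch:
+        print(pred["filename"], pred["attributes"])
+```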
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) to get the description about installation.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+The data used in this case is the [pa100k dataset](https://www.v7labs.com/open-datasets/pa-100k).
+
+
+
+#### 3.2.2 Getting Dataset
+
+Some images of the processed dataset are shown below:
+
+
+
+
+We converted the data into a multi-label format that PaddleClas can read, so the processed dataset can be downloaded directly.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/pa100k.tar
+tar -xf pa100k.tar
+cd ../
+```
+
+The contents of the `pa100k` directory:
+
+```
+pa100k
+├── train
+│ ├── 000001.jpg
+│ ├── 000002.jpg
+...
+├── val
+│ ├── 080001.jpg
+│ ├── 080002.jpg
+...
+├── test
+│ ├── 090001.jpg
+│ ├── 090002.jpg
+...
+...
+├── train_list.txt
+├── train_val_list.txt
+├── val_list.txt
+├── test_list.txt
+```
+
+Where `train/`, `val/` and `test/` are the training set, validation set and test set respectively. `train_list.txt`, `val_list.txt` and `test_list.txt` are the label files of the training set, validation set and test set respectively. In this case, `test_list.txt` is not used for now. An illustrative line of the multi-label format is shown below.
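+
+For reference, each line of the multi-label label files pairs an image path with a comma-separated 0/1 vector over the 26 PA100k attributes. The line below is illustrative only; check the downloaded label files for the exact delimiter and values:
+
+```
+train/000001.jpg	0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,1
+```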
+
+
+
+
+### 3.3 Training
+
+The details of the training config are in `./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml
+```
+
+The best metric for the validation set is around `77.71%` (the dataset is small, so the metric generally fluctuates by about 0.3%).
+
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model for inference. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+The results:
+
+```
+[{'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/person_attribute/090004.jpg`. To test another image, you only need to specify the argument `-o Infer.infer_imgs=path_to_test_image`.
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple yet effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Train the teacher model with the hyperparameters specified in `ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+The best metric for the validation set is around `80.10%`. The best teacher model weights will be saved as `output/ResNet101_vd/best_model.pdparams`.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The knowledge distillation strategy is specified in the config file `ppcls/configs/PULC/person_attribute/PPLCNet_x1_0_Distillation.yaml`, where the teacher model is `ResNet101_vd` and the student model is `PPLCNet_x1_0`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0_Distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+The best metric for the validation set is around `78.5%`. The best student model weights will be saved as `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.2](#3.2) and [section 4.1](#4.1) were obtained with the hyperparameters searching strategy in PaddleClas. If you want to get better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to get better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your needs. If you do not replace the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle, which provides high-performance inference for server deployment. Compared with predicting directly from the pretrained model, Paddle Inference can use optimization tools to accelerate prediction and achieve better inference performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle Inference model for prediction. There are two ways to get one: export it yourself, or download the model provided by PaddleClas; for the latter, jump to [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command to export the Paddle Inference Model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person_attribute_infer
+```
+
+After running the above command, the inference model files will be saved in `deploy/models/PPLCNet_x1_0_person_attribute_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_person_attribute_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The path above assumes that the best model comes from knowledge distillation training. If knowledge distillation training is not used, the best model is saved as `output/PPLCNet_x1_0/best_model.pdparams` instead.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model provided by PaddleClas directly:
+
+```
+cd deploy/models
+# download the inference model and decompression
+wget https://paddleclas.bj.bcebos.com/models/PULC/person_attribute_infer.tar && tar -xf person_attribute_infer.tar
+```
+
+After decompression, the `models` directory should contain the following:
+
+```
+├── person_attribute_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to predict the attributes of the person in the image `./images/PULC/person_attribute/090004.jpg`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.use_gpu=True
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+090004.jpg: {'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}
+```
+
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, specify the directory path with the argument `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If want to replace with CPU, you can add argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.infer_imgs="./images/PULC/person_attribute/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+090004.jpg: {'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}
+090007.jpg: {'attributes': ['Female', 'Age18-60', 'Side', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'No bag', 'Upper: ShortSleeve', 'Lower: Skirt&Dress', 'No boots'], 'output': [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0]}
+```
+
+For each image above, the predicted `attributes` are listed together with the corresponding raw multi-label `output` vector.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example of how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance carrier for machine learning models that supports different protocols such as RESTful, gRPC and bRPC, providing deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example of how to deploy a service with Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile with Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models, which can then be deployed on inference engines such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert a Paddle Inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_person_exists_en.md b/docs/en/PULC/PULC_person_exists_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..baf5ce3e4c295a57d928853f5a0b3da1d3c7b366
--- /dev/null
+++ b/docs/en/PULC/PULC_person_exists_en.md
@@ -0,0 +1,458 @@
+# PULC Classification Model of Someone or Nobody
+
+------
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model that determines whether a person is present in an image, using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in monitoring scenarios, personnel access control scenarios, massive data filtering scenarios, etc.
+
+The following table lists the relevant metrics of the models. The first two rows use SwinTransformer_tiny and MobileNetV3_small_x0_35 as backbones. Rows three to six replace the backbone with PPLCNet_x1_0 and progressively add the SSLD pretrained model, the EDA strategy, and the SKL-UGI knowledge distillation strategy.
+
+| Backbone | Tpr(%) | Latency(ms) | Size(M)| Training Strategy |
+|-------|-----------|----------|---------------|---------------|
+| SwinTranformer_tiny | 95.69 | 95.30 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 2.6 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 89.57 | 2.12 | 7.0 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 92.10 | 2.12 | 7.0 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 93.43 | 2.12 | 7.0 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0 | 96.23 | 2.12 | 7.0 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+It can be seen that a high Tpr can be obtained when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the Tpr drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 brings a Tpr more than 20 percentage points higher than MobileNetV3_small_x0_35 while being more than 20% faster. On top of that, using the SSLD pretrained model improves the Tpr by about 2.6 percentage points without affecting the inference speed, adding the EDA strategy increases it by another 1.3 percentage points, and adding SKL-UGI knowledge distillation improves it by a further 2.8 percentage points. At this point, the Tpr is close to that of SwinTransformer_tiny, while the speed is more than 40 times faster. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+
+* About the `Tpr` metric, please refer to [3.2 section](#3.2) for more information.
+* The Latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. The MKLDNN is enabled and the number of threads is 10.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more installation information, for example for other versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+Install PaddleClas with the following command:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the test demo images.
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
+```
+
+Results:
+```
+>>> result
+class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="person_exists")
+result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so you need to call it with `next()` or iterate over it with a `for` loop. Each call predicts a batch of `batch_size` images and returns the results. The default `batch_size` is 1, and you can also specify it when instantiating, such as `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
+```
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) to get the description about installation.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+All datasets used in this case are open source data. The training data is a subset of the [MS-COCO](https://cocodataset.org/#overview) training data, and the validation data is a subset of the [Objects365](https://www.objects365.org/overview.html) training data. `ImageNet_val` is the [ImageNet-1k](https://www.image-net.org/) validation data.
+
+
+
+#### 3.2.2 Getting Dataset
+
+The data used in this case can be obtained by processing the open source data. The detailed process is as follows:
+
+- Training data: this case processes the annotation file of the MS-COCO training data. If an image contains a "person" box whose area is greater than 10% of the whole image, the image is considered to contain a human; if an image has no "person" label at all, it is considered to contain no human (see the sketch after this list). After processing, 92964 usable images were obtained, including 39813 images containing humans and 53151 images without humans.
+- Validation data: a small part of the Objects365 data is randomly selected, a well-trained MS-COCO model is used to predict these images, the intersection of the prediction results and the annotation file is taken, and the intersection is filtered into the validation set by the same rule used for the training set. After processing, 27820 usable images were obtained, including 2255 images with humans and 25565 images without humans.
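+
+A minimal sketch of the training-data labeling rule described above, applied to a COCO-style annotation file; the file path is an assumption, and this is an illustrative reimplementation rather than the original processing script:
+
+```python
+import json
+from collections import defaultdict
+
+with open("instances_train2017.json") as f:  # assumed COCO annotation path
+    coco = json.load(f)
+
+person_id = next(c["id"] for c in coco["categories"] if c["name"] == "person")
+img_areas = {img["id"]: img["width"] * img["height"] for img in coco["images"]}
+person_boxes = defaultdict(list)
+for ann in coco["annotations"]:
+    if ann["category_id"] == person_id:
+        person_boxes[ann["image_id"]].append(ann["bbox"])  # [x, y, w, h]
+
+labels = {}
+for img_id, area in img_areas.items():
+    boxes = person_boxes.get(img_id, [])
+    if not boxes:
+        labels[img_id] = 0  # no "person" annotation at all: nobody
+    elif any(w * h / area > 0.1 for _, _, w, h in boxes):
+        labels[img_id] = 1  # a big-enough person box: someone
+    # images with only small person boxes are discarded
+```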
+
+Some images of the processed dataset are shown below:
+
+
+
+You can also download the processed data directly.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
+tar -xf person_exists.tar
+cd ../
+```
+
+The contents of the `person_exists` directory:
+
+```
+├── train
+│ ├── 000000000009.jpg
+│ ├── 000000000025.jpg
+...
+├── val
+│ ├── objects365_01780637.jpg
+│ ├── objects365_01780640.jpg
+...
+├── ImageNet_val
+│ ├── ILSVRC2012_val_00000001.JPEG
+│ ├── ILSVRC2012_val_00000002.JPEG
+...
+├── train_list.txt
+├── train_list.txt.debug
+├── train_list_for_distill.txt
+├── val_list.txt
+└── val_list.txt.debug
+```
+
+Where `train/` and `val/` are the training set and validation set respectively. `train_list.txt` and `val_list.txt` are the label files of the training and validation data, and `train_list.txt.debug` and `val_list.txt.debug` are small subsets of them for debugging. `ImageNet_val/` is the ImageNet-1k validation data used for SKL-UGI knowledge distillation; its label file is `train_list_for_distill.txt`. An example of the label format is shown below.
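+
+For reference, each line of `train_list.txt` and `val_list.txt` pairs an image path with its class id (0 for `nobody`, 1 for `someone`), separated by a space. The lines below are illustrative; the actual labels are in the downloaded files:
+
+```
+train/000000000009.jpg 1
+train/000000000025.jpg 0
+```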
+
+**Note**:
+
+* About the content format of `train_list.txt` and `val_list.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md).
+* About `train_list_for_distill.txt`, please refer to [Knowledge Distillation Label](../advanced_tutorials/distillation/distillation_en.md).
+
+
+
+### 3.3 Training
+
+The details of the training config are in `ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml
+```
+
+The best metric of the validation data is between `0.94` and `0.95`. There may be some fluctuation because the dataset is small.
+
+**Note**:
+
+* The metric Tpr describes the True Positive Rate (Tpr = TP / (TP + FN)) that can be achieved while the False Positive Rate (Fpr = FP / (FP + TN)) stays below a certain threshold (1/1000 in this case); it is one of the commonly used metrics for binary classification. For details about Fpr and Tpr, please refer to [here](https://en.wikipedia.org/wiki/Receiver_operating_characteristic).
+* During evaluation, the best TprAtFpr metric is printed, including the `Fpr`, the `Tpr` and the current `threshold`. The `Tpr` is the recall rate under the current `Fpr`; the higher the `Tpr`, the better the model. The `threshold` is the classification threshold under the best `Fpr` and is used later in deployment. A minimal sketch of how such a metric can be computed follows this note.
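+
+Below is a minimal NumPy sketch of such a TprAtFpr computation; it is an illustrative reimplementation, not the PaddleClas evaluation code:
+
+```python
+import numpy as np
+
+def tpr_at_fpr(scores, labels, max_fpr=1e-3):
+    """Best Tpr achievable while Fpr <= max_fpr, and the score threshold.
+    scores: predicted probability of the positive class; labels: 0/1 array."""
+    order = np.argsort(-scores)              # sort descending by score
+    scores, labels = scores[order], labels[order]
+    pos = labels.sum()
+    neg = len(labels) - pos
+    tp = np.cumsum(labels)                   # positives among top-k predictions
+    fp = np.cumsum(1 - labels)               # negatives among top-k predictions
+    ok = fp / max(neg, 1) <= max_fpr         # thresholds keeping Fpr low enough
+    if not ok.any():
+        return 0.0, 1.0
+    k = np.flatnonzero(ok)[-1]               # loosest threshold still meeting Fpr
+    return tp[k] / max(pos, 1), scores[k]
+```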
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model for inference. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+The results:
+
+```
+[{'class_ids': [1], 'scores': [0.9999976], 'label_names': ['someone'], 'file_name': 'deploy/images/PULC/person_exists/objects365_02035329.jpg'}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/person_exists/objects365_02035329.jpg`. To test another image, you only need to specify the argument `-o Infer.infer_imgs=path_to_test_image`.
+* The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9794`; the threshold should be chosen according to the specific case. Here, `0.9794` is the best threshold when `Fpr` is less than `1/1000` on this validation dataset.
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple yet effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Train the teacher model with the hyperparameters specified in `ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+The best metric of the validation data is between `0.96` and `0.98`. The best teacher model weights will be saved as `output/ResNet101_vd/best_model.pdparams`.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The knowledge distillation strategy is specified in the config file `ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml`, where the teacher model is `ResNet101_vd`, the student model is `PPLCNet_x1_0`, and the additional unlabeled training data is the validation data of ImageNet-1k. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+The best metric is between `0.95` and `0.97`. The best student model weights will be saved as `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.2](#3.2) and [section 4.1](#4.1) were obtained with the hyperparameters searching strategy in PaddleClas. If you want to get better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to get better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your needs. If you do not replace the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle, which provides high-performance inference for server deployment. Compared with predicting directly from the pretrained model, Paddle Inference can use optimization tools to accelerate prediction and achieve better inference performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle Inference model for prediction. There are two ways to get one: export it yourself, or download the model provided by PaddleClas; for the latter, jump to [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command to export the Paddle Inference Model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person_exists_infer
+```
+
+After running the above command, the inference model files will be saved in `deploy/models/PPLCNet_x1_0_person_exists_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_person_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The path above assumes that the best model comes from knowledge distillation training. If knowledge distillation training is not used, the best model is saved as `output/PPLCNet_x1_0/best_model.pdparams` instead.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model provided by PaddleClas directly:
+
+```
+cd deploy/models
+# download the inference model and decompression
+wget https://paddleclas.bj.bcebos.com/models/PULC/person_exists_infer.tar && tar -xf person_exists_infer.tar
+```
+
+After decompression, the `models` directory should contain the following:
+
+```
+├── person_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify whether there is a human in the image `./images/PULC/person_exists/objects365_02035329.jpg`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
+```
+
+**Note**: The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9794`; the threshold should be chosen according to the specific case. Here, `0.9794` is the best threshold when `Fpr` is less than `1/1000` on this validation dataset. Please refer to [3.3 section](#3.3) for details.
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, specify the directory path with the argument `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If want to replace with CPU, you can add argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o Global.infer_imgs="./images/PULC/person_exists/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
+objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
+```
+
+Among the prediction results above, `someone` means that there is a human in the image, and `nobody` means that there is no human in the image.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example of how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance carrier for machine learning models that supports different protocols such as RESTful, gRPC and bRPC, providing deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example of how to deploy a service with Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile with Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models, which can then be deployed on inference engines such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert a Paddle Inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_quickstart_en.md b/docs/en/PULC/PULC_quickstart_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..087c359283c0e288db91bc80774163eda336853b
--- /dev/null
+++ b/docs/en/PULC/PULC_quickstart_en.md
@@ -0,0 +1,123 @@
+# PULC Quick Start
+
+------
+
+This document introduces prediction with the PULC series models based on the PaddleClas wheel.
+
+## Catalogue
+
+- [1. Installation](#1)
+ - [1.1 PaddlePaddle Installation](#11)
+ - [1.2 PaddleClas wheel Installation](#12)
+- [2. Quick Start](#2)
+  - [2.1 Prediction with Command Line](#2.1)
+  - [2.2 Prediction with Python](#2.2)
+ - [2.3 Supported Model List](#2.3)
+- [3. Summary](#3)
+
+
+
+## 1. Installation
+
+
+
+### 1.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more installation information, for example for other versions.
+
+
+
+### 1.2 PaddleClas wheel Installation
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+## 2. Quick Start
+
+PaddleClas provides a series of test cases, which contain demos of different scenarios such as people, vehicles and OCR. Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the data.
+
+
+
+### 2.1 Prediction with Command Line
+
+```
+cd /path/to/pulc_demo_imgs
+```
+
+The prediction command:
+
+```bash
+paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
+```
+
+Result:
+
+```
+>>> result
+class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
+Predict complete!
+```
+`nobody` means there is no one in the image, and `someone` means there is someone in the image. Therefore, the prediction result above indicates that there is no one in the image.
+
+**Note**: The "--infer_imgs" argument specify the image(s) to be predict, and you can also specify a directoy contains images. If use other model, you can specify the `--model_name` argument. Please refer to [2.3 Supported Model List](#2.3) for the supported model list.
+
+
+
+### 2.2 Prediction with Python
+
+You can also use it in Python:
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="person_exists")
+result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
+print(next(result))
+```
+
+The printed result information:
+
+```
+>>> result
+[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
+```
+
+**Note**: `model.predict()` returns a generator, so it must be called with `next()` or iterated with a `for` loop. Each call predicts one batch of `batch_size` images; the default is 1. You can specify the arguments `batch_size` and `model_name` when instantiating the PaddleClas object, for example: `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`. Please refer to [2.3 Supported Model List](#2.3) for the supported model list.
+
+
+
+### 2.3 Supported Model List
+
+The names of the PULC series models are as follows:
+
+| Name | Intro |
+| --- | --- |
+| person_exists | Human Exists Classification |
+| person_attribute | Pedestrian Attribute Classification |
+| safety_helmet | Classification of Whether Wearing Safety Helmet |
+| traffic_sign | Traffic Sign Classification |
+| vehicle_attribute | Vehicle Attribute Classification |
+| car_exists | Car Exists Classification |
+| text_image_orientation | Text Image Orientation Classification |
+| textline_orientation | Text-line Orientation Classification |
+| language_classification | Language Classification |
+
+
+
+## 3. Summary
+
+The PULC series models have been verified to be effective in different scenarios about people, vehicles, OCR, etc. The ultra lightweight models can achieve accuracy close to that of SwinTransformer while being more than 40 times faster. PULC also provides the whole pipeline of dataset acquisition, model training, model compression and deployment. Please refer to [Human Exists Classification](PULC_person_exists_en.md), [Pedestrian Attribute Classification](PULC_person_attribute_en.md), [Classification of Whether Wearing Safety Helmet](PULC_safety_helmet_en.md), [Traffic Sign Classification](PULC_traffic_sign_en.md), [Vehicle Attribute Classification](PULC_vehicle_attribute_en.md), [Car Exists Classification](PULC_car_exists_en.md), [Text Image Orientation Classification](PULC_text_image_orientation_en.md), [Text-line Orientation Classification](PULC_textline_orientation_en.md) and [Language Classification](PULC_language_classification_en.md) for more information about the different scenarios.
diff --git a/docs/en/PULC/PULC_safety_helmet_en.md b/docs/en/PULC/PULC_safety_helmet_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..d2e5cb32931cdc98b0776f4692e6162e907aa6fa
--- /dev/null
+++ b/docs/en/PULC/PULC_safety_helmet_en.md
@@ -0,0 +1,432 @@
+# PULC Classification Model of Whether Wearing Safety Helmet or Not
+
+-----
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of whether a safety helmet is worn, using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in construction scenes, factory workshop scenes, traffic scenes and so on.
+
+The following table lists the relevant metrics of the models. The first three rows use SwinTransformer_tiny, Res2Net200_vd_26w_4s and MobileNetV3_small_x0_35 as backbones. Rows four to seven replace the backbone with PPLCNet_x1_0 and progressively add the SSLD pretrained model, the EDA strategy, and the SKL-UGI knowledge distillation strategy.
+
+| Backbone | Tpr(%) | Latency(ms) | Size(M)| Training Strategy |
+|-------|-----------|----------|---------------|---------------|
+| SwinTranformer_tiny | 93.57 | 91.32 | 111 | using ImageNet pretrained model |
+| Res2Net200_vd_26w_4s | 98.92 | 80.99 | 284 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 84.83 | 2.85 | 2.6 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 93.27 | 2.03 | 7.1 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 98.16 | 2.03 | 7.1 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 99.30 | 2.03 | 7.1 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0 | 99.38 | 2.03 | 7.1 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+It can be seen that a high Tpr can be obtained when the backbone is Res2Net200_vd_26w_4s, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the Tpr drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 brings a Tpr about 8.5 percentage points higher than MobileNetV3_small_x0_35 while being more than 20% faster. On top of that, using the SSLD pretrained model improves the Tpr by about 4.9 percentage points without affecting the inference speed, adding the EDA strategy increases it by another 1.1 percentage points, and adding knowledge distillation improves it by a further 0.08 percentage points (as the table above shows). At this point, the Tpr is higher than that of Res2Net200_vd_26w_4s, while the speed is more than 70 times faster. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+
+* About the `Tpr` metric, please refer to [3.2 section](#3.2) for more information.
+* The Latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. The MKLDNN is enabled and the number of threads is 10.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more installation information, for example for other versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+Install PaddleClas with the following command:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the test demo images.
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=safety_helmet --infer_imgs=pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png
+```
+
+Results:
+
+```
+>>> result
+class_ids: [1], scores: [0.9986255], label_names: ['unwearing_helmet'], filename: pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png
+Predict complete!
+```
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="safety_helmet")
+result = model.predict(input_data="pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so you need to call `next()` on it or iterate over it with a `for` loop. Each call predicts one batch of `batch_size` images and returns the prediction results. The default `batch_size` is 1, and you can also specify `batch_size` when instantiating, such as `model = paddleclas.PaddleClas(model_name="safety_helmet", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'class_ids': [1], 'scores': [0.9986255], 'label_names': ['unwearing_helmet'], 'filename': 'pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png'}]
+```
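+
+For example, a minimal sketch of batch prediction over a directory; the directory path here is illustrative, and it is assumed that `input_data` accepts a directory, as the CLI `--infer_imgs` does:
+
+```python
+import paddleclas
+
+# Predict all images in a directory, two images per batch.
+model = paddleclas.PaddleClas(model_name="safety_helmet", batch_size=2)
+results = model.predict(input_data="pulc_demo_imgs/safety_helmet/")
+for batch in results:  # each iteration yields one batch of prediction dicts
+    for pred in batch:
+        print(pred["filename"], pred["label_names"], pred["scores"])
+```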
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) for a description of the installation process.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+All datasets used in this case are open source. The training data is a subset of [Safety-Helmet-Wearing-Dataset](https://github.com/njvisionpower/Safety-Helmet-Wearing-Dataset), [hard-hat-detection](https://www.kaggle.com/datasets/andrewmvd/hard-hat-detection) and the [Large-scale CelebFaces Attributes (CelebA) Dataset](https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html).
+
+
+
+#### 3.2.2 Getting Dataset
+
+The data used in this case can be obtained by processing the open source data. The detailed processing is as follows:
+
+* `Safety-Helmet-Wearing-Dataset`: according to the bbox labels, each image is cropped after enlarging the width and height of the bbox by 3 times (see the sketch below). The label is 0 if a safety helmet is worn in the image, and 1 if not;
+* `hard-hat-detection`: only the images labeled "hat" are used and cropped by bbox. The label is 0;
+* `CelebA`: only the images labeled "wearing_hat" are used and cropped by bbox. The label is 0;
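+
+A minimal sketch of the bbox-based cropping described above, assuming PIL and a simple `(xmin, ymin, xmax, ymax)` box; the exact preprocessing script is not published:
+
+```python
+from PIL import Image
+
+def crop_with_expanded_bbox(img_path, xmin, ymin, xmax, ymax, scale=3.0):
+    """Enlarge the bbox width/height by `scale` around its center,
+    clamp it to the image borders, and return the cropped region."""
+    img = Image.open(img_path)
+    w, h = xmax - xmin, ymax - ymin
+    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
+    half_w, half_h = w * scale / 2, h * scale / 2
+    box = (max(0, int(cx - half_w)), max(0, int(cy - half_h)),
+           min(img.width, int(cx + half_w)), min(img.height, int(cy + half_h)))
+    return img.crop(box)
+```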
+
+After processing, the dataset totals about 150000 images, of which about 28000 are with safety helmets and about 121000 are without. Then 5600 images per label are randomly selected as the validation data, about 11200 images in total, and the remaining roughly 138000 images are used as the training data.
+
+Some images of the processed dataset are shown below:
+
+
+
+You can also download the processed data directly.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/safety_helmet.tar
+tar -xf safety_helmet.tar
+cd ../
+```
+
+The contents of the `safety_helmet` directory:
+
+```
+├── images
+│ ├── VOC2028_part2_001209_1.jpg
+│ ├── HHD_hard_hat_workers23_1.jpg
+│ ├── CelebA_077809.jpg
+│ ├── ...
+│ └── ...
+├── train_list.txt
+└── val_list.txt
+```
+
+The `train_list.txt` and `val_list.txt` are the label files of the training data and validation data respectively. All images are in the `images/` directory.
+
+**Note**:
+
+* For the content format of `train_list.txt` and `val_list.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md).
+
+
+
+### 3.3 Training
+
+The details of the training config are in `ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml
+```
+
+The best metric on the validation data is between `0.985` and `0.993`. The result fluctuates because the dataset is small.
+
+**Note**:
+
+* The metric Tpr describes the True Positive Rate when the False Positive Rate is less than a certain threshold (1/10000 in this case); it is one of the commonly used metrics for binary classification. For details about Fpr and Tpr, please refer to [here](https://en.wikipedia.org/wiki/Receiver_operating_characteristic).
+* During evaluation, the best TprAtFpr metric is printed, including `Fpr`, `Tpr` and the current `threshold`. `Tpr` is the recall under the current `Fpr`; the higher the `Tpr`, the better the model. The `threshold` is the classification threshold at the best `Fpr`, and it will be used in deployment; see the sketch below.
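+
+A minimal numpy sketch of a TprAtFpr-style metric, assuming `scores` are the model's positive-class probabilities and `labels` are 0/1 ground truth; this is illustrative only, not the exact PaddleClas implementation:
+
+```python
+import numpy as np
+
+def tpr_at_fpr(scores, labels, max_fpr=1e-4):
+    """Return the highest Tpr whose Fpr stays below max_fpr, and its threshold."""
+    scores = np.asarray(scores, dtype=float)
+    labels = np.asarray(labels, dtype=int)
+    order = np.argsort(scores)[::-1]                 # sort by score, descending
+    scores, labels = scores[order], labels[order]
+    tpr = np.cumsum(labels) / labels.sum()           # recall at each cutoff
+    fpr = np.cumsum(1 - labels) / (1 - labels).sum()
+    valid = fpr <= max_fpr
+    if not valid.any():
+        return 0.0, 1.0
+    best = np.argmax(tpr * valid)   # tpr is nondecreasing, so this picks the best valid cutoff
+    return tpr[best], scores[best]
+```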
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model for inference. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+The results:
+
+```
+[{'class_ids': [1], 'scores': [0.9524797], 'label_names': ['unwearing_helmet'], 'file_name': 'deploy/images/PULC/safety_helmet/safety_helmet_test_1.png'}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/safety_helmet/safety_helmet_test_1.png`. You can test other images by specifying the argument `-o Infer.infer_imgs=path_to_test_image`.
+* The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9167`. The threshold should be chosen for the specific case; `0.9167` is the best threshold when `Fpr` is less than `1/10000` on this validation dataset. A minimal sketch of applying such a threshold is shown below.
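+
+A purely illustrative sketch of applying a deployment threshold, assuming the threshold is compared against the positive-class probability (here `unwearing_helmet`, following the demo output above):
+
+```python
+def classify(unwearing_score, threshold=0.9167):
+    """Predict the positive class only when its score clears the threshold."""
+    return "unwearing_helmet" if unwearing_score >= threshold else "wearing_helmet"
+
+print(classify(0.9986))  # -> unwearing_helmet
+print(classify(0.60))    # -> wearing_helmet
+```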
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 UDML Knowledge Distillation
+
+UDML is a simple but effective knowledge distillation algorithm proposed by PaddleClas. Please refer to [UDML Knowledge Distillation](../advanced_tutorials/knowledge_distillation_en.md#1.2.3) for more details.
+
+
+
+#### 4.1.1 Knowledge Distillation Training
+
+Train with the hyperparameters specified in `ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0_distillation.yaml`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0_distillation.yaml
+```
+
+The best metric is between `0.990` and `0.993`. The best student model weights would be saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.2](#3.2) and [section 4.1](#4.1) were obtained by `Hyperparameters Searching` in PaddleClas. If you want better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to find better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your specific scenario. If you do not replace the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle, which provides high-performance inference for server deployment. Compared with predicting directly from the trained model, Paddle Inference can use optimization tools to accelerate prediction and achieve better performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle Inference Model for prediction. Two ways to get such a model are provided below. If you want to use the model provided by PaddleClas, you can download it directly; see [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command to export the Paddle Inference Model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_safety_helmet_infer
+```
+
+After running the above command, the inference model files are saved in `deploy/models/PPLCNet_x1_0_safety_helmet_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_safety_helmet_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The best model here comes from knowledge distillation training. If knowledge distillation training is not used, the best model is saved in `output/PPLCNet_x1_0/best_model.pdparams`.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model directly.
+
+```
+cd deploy/models
+# download the inference model and decompress it
+wget https://paddleclas.bj.bcebos.com/models/PULC/safety_helmet_infer.tar && tar -xf safety_helmet_infer.tar
+```
+
+After decompression, the `models` directory should contain the following:
+
+```
+├── safety_helmet_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify whether a safety helmet is worn in the image `./images/PULC/safety_helmet/safety_helmet_test_1.png`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+safety_helmet_test_1.png: class id(s): [1], score(s): [1.00], label_name(s): ['unwearing_helmet']
+```
+
+**Note**: The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9167`. The threshold should be chosen for the specific case; `0.9167` is the best threshold when `Fpr` is less than `1/10000` on this validation dataset. Please refer to [section 3.3](#3.3) for details.
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, specify the directory path via the argument `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If you want to use CPU instead, add the argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml -o Global.infer_imgs="./images/PULC/safety_helmet/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+safety_helmet_test_1.png: class id(s): [1], score(s): [1.00], label_name(s): ['unwearing_helmet']
+safety_helmet_test_2.png: class id(s): [0], score(s): [1.00], label_name(s): ['wearing_helmet']
+```
+
+In the prediction results above, `wearing_helmet` means that a safety helmet is worn in the image, and `unwearing_helmet` means it is not.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example about how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance carrier for machine learning models. It supports different protocols, such as RESTful, gRPC and bRPC, and provides deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example about how to deploy as service by Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile by Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models. With an ONNX model you can deploy on different inference engines, such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert a Paddle Inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_text_image_orientation_en.md b/docs/en/PULC/PULC_text_image_orientation_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..1d3cc41f992adff90f396463205cd060147023c1
--- /dev/null
+++ b/docs/en/PULC/PULC_text_image_orientation_en.md
@@ -0,0 +1,466 @@
+# PULC Classification Model of Text Image Orientation
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+In processes such as document scanning and license photographing, the camera is sometimes rotated to get a clearer shot, resulting in photos in different orientations that the standard OCR pipeline cannot handle well. With text image orientation classification, the direction of a text image can be predicted and corrected, improving the accuracy of OCR. This case provides a way for users to quickly build a lightweight, high-precision and practical text image orientation classification model using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in OCR scenarios with rotated images in finance, government and other industries.
+
+The following table lists the relevant metrics of the model. The first two rows use SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone. For the third to fifth rows, the backbone is replaced by PPLCNet, with the SSLD pretrained model and the hyperparameters searching strategy added in turn.
+
+| Backbone | Top1-Acc(%) | Latency(ms) | Size(M) | Training Strategy |
+| ----------------------- | --------- | ---------- | --------- | ------------------------------------- |
+| SwinTransformer_tiny | 99.12 | 89.65 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 83.61 | 2.95 | 2.6 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 97.85 | 2.16 | 7.1 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 98.02 | 2.16 | 7.1 | using SSLD pretrained model |
+| **PPLCNet_x1_0** | **99.06** | **2.16** | **7.1** | using SSLD pretrained model + hyperparameters searching strategy |
+
+It can be seen that high accuracy can be achieved when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the accuracy drops significantly. Replacing the backbone with the faster PPLCNet_x1_0 gives an accuracy 14 percentage points higher than MobileNetV3_small_x0_35, while also being faster. Using the SSLD pretrained model on top of that improves the accuracy by about 0.17 percentage points without affecting the inference speed. Finally, with the hyperparameters searching strategy, the accuracy improves by another 1.04 percentage points. At this point, the accuracy is close to that of SwinTransformer_tiny, while the speed is much faster. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+
+* The latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with MKLDNN enabled and 10 threads.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, for example for other versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+The command to install PaddleClas is as follows:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the test demo images.
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=text_image_orientation --infer_imgs=pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg
+```
+
+Results:
+
+```
+>>> result
+class_ids: [0, 2], scores: [0.85615, 0.05046], label_names: ['0', '180'], filename: pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="text_image_orientation")
+result = model.predict(input_data="pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so you need to call `next()` on it or iterate over it with a `for` loop. Each call predicts one batch of `batch_size` images and returns the prediction results. The default `batch_size` is 1, and you can also specify `batch_size` when instantiating, such as `model = paddleclas.PaddleClas(model_name="text_image_orientation", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'class_ids': [0, 2], 'scores': [0.85615, 0.05046], 'label_names': ['0', '180'], 'filename': 'pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg'}]
+```
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) for a description of the installation process.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+The model provided in [section 1](#1) is trained on internal data, which is not open source. So we provide a dataset composed of [ICDAR2019-ArT](https://ai.baidu.com/broad/introduction?dataset=art), [XFUND](https://github.com/doc-analysis/XFUND) and [ICDAR2015](https://rrc.cvc.uab.es/?ch=4&com=introduction) for you to experience the whole process.
+
+
+
+
+
+#### 3.2.2 Getting Dataset
+
+The data used in this case can be obtained by processing the open source data. The detailed processing is as follows:
+
+Since the resolution of the original images is too high and would lead to long training times, all data are scaled in advance: keeping the aspect ratio, the short edge is scaled to 384. The images are then rotated clockwise to generate composite data of 90, 180 and 270 degrees. Among them, the 41460 images generated from ICDAR2019-ArT and XFUND are randomly divided into a training set and a validation set at a ratio of 9:1, and the 6000 images generated from ICDAR2015 are used as supplementary data for the `SKL-UGI knowledge distillation` experiment.
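+
+A minimal preprocessing sketch under the assumption that PIL is used (the exact internal script is not published): it scales the short edge to 384 while keeping the aspect ratio, then writes the 0/90/180/270-degree versions.
+
+```python
+from PIL import Image
+
+def make_orientation_samples(src_path, dst_prefix, short_edge=384):
+    img = Image.open(src_path).convert("RGB")
+    w, h = img.size
+    scale = short_edge / min(w, h)
+    img = img.resize((round(w * scale), round(h * scale)))
+    for angle in (0, 90, 180, 270):
+        # PIL's rotate() is counterclockwise, so pass -angle for clockwise.
+        img.rotate(-angle, expand=True).save(f"{dst_prefix}_rot{angle}.jpg")
+```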
+
+Some images of the processed dataset are shown below:
+
+
+
+You can also download the processed data directly.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/text_image_orientation.tar
+tar -xf text_image_orientation.tar
+cd ../
+```
+
+The contents of the `text_image_orientation` directory:
+
+```
+├── img_0
+│ ├── img_rot0_0.jpg
+│ ├── img_rot0_1.png
+...
+├── img_90
+│ ├── img_rot90_0.jpg
+│ ├── img_rot90_1.png
+...
+├── img_180
+│ ├── img_rot180_0.jpg
+│ ├── img_rot180_1.png
+...
+├── img_270
+│ ├── img_rot270_0.jpg
+│ ├── img_rot270_1.png
+...
+├── distill_data
+│ ├── gt_7060_0.jpg
+│ ├── gt_7060_90.jpg
+...
+├── train_list.txt
+├── train_list.txt.debug
+├── train_list_for_distill.txt
+├── test_list.txt
+├── test_list.txt.debug
+└── label_list.txt
+```
+
+Among them, `img_0/`, `img_90/`, `img_180/` and `img_270/` hold the data of the 4 angles respectively. `train_list.txt` and `test_list.txt` are the label files of the training data and validation data respectively, and `train_list.txt.debug` and `test_list.txt.debug` are subsets of them. `distill_data/` is the supplementary data used for SKL-UGI knowledge distillation, and its label file is `train_list_for_distill.txt`.
+
+**Note**:
+
+* For the content format of `train_list.txt` and `test_list.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md).
+* For `train_list_for_distill.txt`, please refer to [Knowledge Distillation Label](../advanced_tutorials/distillation/distillation_en.md).
+
+
+
+### 3.3 Training
+
+The details of the training config are in `ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml
+```
+
+The best metric on the validation data is about `0.99`.
+
+
+**Note**:
+* The metrics mentioned in this document are obtained by training on a large-scale internal dataset. When training with the demo data, these metrics cannot be reached because the dataset is small and its distribution differs from the internal data. You can expand your own data and use the optimization methods described in this case to achieve higher accuracy.
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model for inference. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+The results:
+
+```
+[{'class_ids': [0, 2], 'scores': [0.85615, 0.05046], 'file_name': 'deploy/images/PULC/text_image_orientation/img_rot0_demo.jpg', 'label_names': ['0', '180']}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/text_image_orientation/img_rot0_demo.jpg`. You can test other images by specifying the argument `-o Infer.infer_imgs=path_to_test_image`.
+* The Top-2 result is printed: `0` means the text direction of the image is 0 degrees, `90` means 90 degrees clockwise, `180` means 180 degrees clockwise, and `270` means 270 degrees clockwise.
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple but effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Train the teacher model with the hyperparameters specified in `ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+The best metric on the validation data is about `0.996`. The best teacher model weights would be saved in `output/ResNet101_vd/best_model.pdparams`.
+
+**Note**: Training ResNet101_vd needs more GPU memory, so you can reduce the `batch_size` and the learning rate at the same time, such as `-o DataLoader.Train.sampler.batch_size=64 -o Optimizer.lr.learning_rate=0.1`.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The training strategy is specified in the config file `ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0_distillation.yaml`, where the teacher model is `ResNet101_vd` and the student model is `PPLCNet_x1_0`.
+
+The command is as follow:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+The best metric is about `0.99`. The best student model weights would be saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.2](#3.2) and [section 4.1](#4.1) were obtained by `Hyperparameters Searching` in PaddleClas. If you want better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to find better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your specific scenario. If you do not replace the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle, which provides high-performance inference for server deployment. Compared with predicting directly from the trained model, Paddle Inference can use optimization tools to accelerate prediction and achieve better performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle Inference Model for prediction. Two ways to get such a model are provided below. If you want to use the model provided by PaddleClas, you can download it directly; see [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command to export the Paddle Inference Model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_text_image_orientation_infer
+```
+
+After running the above command, the inference model files are saved in `deploy/models/PPLCNet_x1_0_text_image_orientation_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_text_image_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The best model here comes from knowledge distillation training. If knowledge distillation training is not used, the best model is saved in `output/PPLCNet_x1_0/best_model.pdparams`.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model directly.
+
+```
+cd deploy/models
+# download the inference model and decompress it
+wget https://paddleclas.bj.bcebos.com/models/PULC/text_image_orientation_infer.tar && tar -xf text_image_orientation_infer.tar
+```
+
+After decompression, the `models` directory should contain the following:
+
+```
+├── text_image_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify the orientation of the text image `./images/PULC/text_image_orientation/img_rot0_demo.jpg`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/text_image_orientation/inference_text_image_orientation.yaml
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/text_image_orientation/inference_text_image_orientation.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+img_rot0_demo.jpg: class id(s): [0, 2], score(s): [0.86, 0.05], label_name(s): ['0', '180']
+```
+
+Among the results, `0` means the text direction of the image is 0 degrees, `90` means 90 degrees clockwise, `180` means 180 degrees clockwise, and `270` means 270 degrees clockwise. A minimal sketch of correcting the orientation with the predicted label is shown below.
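+
+An illustrative sketch of using the Top-1 predicted label to correct the image orientation, assuming PIL and a `label` of `'0'`, `'90'`, `'180'` or `'270'` as above:
+
+```python
+from PIL import Image
+
+def correct_orientation(img_path, predicted_label):
+    """Undo a clockwise rotation given the predicted label.
+    PIL's rotate() is counterclockwise, so rotating by the predicted
+    angle turns the image back to 0 degrees."""
+    img = Image.open(img_path)
+    return img.rotate(int(predicted_label), expand=True)
+```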
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, specify the directory path via the argument `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If you want to use CPU instead, add the argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/text_image_orientation/inference_text_image_orientation.yaml -o Global.infer_imgs="./images/PULC/text_image_orientation/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+img_rot0_demo.jpg: class id(s): [0, 2], score(s): [0.86, 0.05], label_name(s): ['0', '180']
+img_rot180_demo.jpg: class id(s): [2, 1], score(s): [0.88, 0.04], label_name(s): ['180', '90']
+```
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example about how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance carrier for machine learning models. It supports different protocols, such as RESTful, gRPC and bRPC, and provides deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example about how to deploy as service by Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile by Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models. With an ONNX model you can deploy on different inference engines, such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert a Paddle Inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_textline_orientation_en.md b/docs/en/PULC/PULC_textline_orientation_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..d11307d0b5aafe056c1f1e53a85882d2449ac277
--- /dev/null
+++ b/docs/en/PULC/PULC_textline_orientation_en.md
@@ -0,0 +1,450 @@
+# PULC Classification Model of Textline Orientation
+
+------
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical textline orientation classification model using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in character correction, character recognition, etc.
+
+The following table lists the relevant metrics of the model. The first two rows use SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone. For the third to seventh rows, the backbone is replaced by PPLCNet, with a changed input resolution and stride, the SSLD pretrained model, the EDA strategy and the SKL-UGI knowledge distillation strategy added in turn.
+
+| Backbone | Top-1 Acc(%) | Latency(ms) | Size(M) | Training Strategy |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 93.61 | 89.64 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 81.40 | 2.96 | 2.6 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 89.99 | 2.11 | 7.0 | using ImageNet pretrained model |
+| PPLCNet_x1_0* | 94.06 | 2.68 | 7.0 | using ImageNet pretrained model |
+| PPLCNet_x1_0* | 94.11 | 2.68 | 7.0 | using SSLD pretrained model |
+| PPLCNet_x1_0** | 96.01 | 2.72 | 7.0 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0** | 95.86 | 2.72 | 7.0 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+It can be seen that high accuracy can be achieved when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the accuracy drops significantly. Replacing the backbone with the faster PPLCNet_x1_0 gives an accuracy 8.6 percentage points higher than MobileNetV3_small_x0_35, while also being more than 10% faster. On this basis, changing the resolution and stride (refer to [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)) makes the speed 27% slower but improves the accuracy by 4.5 percentage points. Using the SSLD pretrained model on top of that improves the accuracy by about 0.05 percentage points without affecting the inference speed. Finally, with the EDA strategy, the accuracy increases by another 1.9 percentage points. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+* A backbone name without \* means the resolution is 224x224, and with \* means the resolution is 48x192 (h\*w). The stride of the network is changed to `[2, [2, 1], [2, 1], [2, 1]]`. Please refer to [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) for more details.
+* A backbone name with \*\* means the resolution is 80x160 (h\*w), and the stride of the network is changed to `[2, [2, 1], [2, 1], [2, 1]]`. This resolution was found by [Hyperparameters Searching](PULC_train_en.md#4).
+* The latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with MKLDNN enabled and 10 threads.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, for example for other versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+The command to install PaddleClas is as follows:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the test demo images.
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=textline_orientation --infer_imgs=pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png
+```
+
+Results:
+
+```
+>>> result
+class_ids: [0], scores: [1.0], label_names: ['0_degree'], filename: pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="textline_orientation")
+result = model.predict(input_data="pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so you need to call `next()` on it or iterate over it with a `for` loop. Each call predicts one batch of `batch_size` images and returns the prediction results. The default `batch_size` is 1, and you can also specify `batch_size` when instantiating, such as `model = paddleclas.PaddleClas(model_name="textline_orientation", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'class_ids': [0], 'scores': [1.0], 'label_names': ['0_degree'], 'filename': 'pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png'}]
+```
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) for a description of the installation process.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+The data used in this case comes from internal data. If you want to experience the training process, you can use open source data, such as [ICDAR2019-LSVT](https://aistudio.baidu.com/aistudio/datasetdetail/8429).
+
+
+
+#### 3.2.2 Getting Dataset
+
+Taking ICDAR2019-LSVT as an example, images with ID numbers from 0 to 1999 are processed and used. After rotation, each image is assigned to class 0 or class 1: class 0 means the textline rotation angle is 0 degrees, and class 1 means 180 degrees. A minimal processing sketch is shown after the list below.
+
+- Training data: the images with ID numbers from 0 to 1799 are used as the training set, 3600 images in total.
+- Evaluation data: the images with ID numbers from 1800 to 1999 are used as the evaluation set, 400 images in total.
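+
+A minimal sketch of generating the two classes and the label lists from the LSVT crops; the paths, file naming and space-separated `path label` list format are assumptions for illustration:
+
+```python
+import os
+from PIL import Image
+
+def build_textline_dataset(src_dir, dst_dir, n_train=1800, n_total=2000):
+    """Keep each crop as class 0 and its 180-degree rotation as class 1;
+    IDs below n_train go to train_list.txt, the rest to val_list.txt."""
+    os.makedirs(os.path.join(dst_dir, "0"), exist_ok=True)
+    os.makedirs(os.path.join(dst_dir, "1"), exist_ok=True)
+    train, val = [], []
+    for i in range(n_total):
+        img = Image.open(os.path.join(src_dir, f"img_{i}.jpg"))
+        img.save(os.path.join(dst_dir, "0", f"img_{i}.jpg"))
+        img.rotate(180).save(os.path.join(dst_dir, "1", f"img_{i}.jpg"))
+        lines = [f"0/img_{i}.jpg 0", f"1/img_{i}.jpg 1"]
+        (train if i < n_train else val).extend(lines)
+    for name, items in (("train_list.txt", train), ("val_list.txt", val)):
+        with open(os.path.join(dst_dir, name), "w") as f:
+            f.write("\n".join(items) + "\n")
+```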
+
+Some images of the processed dataset are shown below:
+
+
+
+You can also download the processed data directly.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/textline_orientation.tar
+tar -xf textline_orientation.tar
+cd ../
+```
+
+The contents of the `textline_orientation` directory:
+
+```
+├── 0
+│ ├── img_0.jpg
+│ ├── img_1.jpg
+...
+├── 1
+│ ├── img_0.jpg
+│ ├── img_1.jpg
+...
+├── train_list.txt
+└── val_list.txt
+```
+
+Among them, `0/` and `1/` hold the class 0 and class 1 data respectively. The `train_list.txt` and `val_list.txt` are the label files of the training data and validation data respectively.
+
+**Note**:
+
+* For the content format of `train_list.txt` and `val_list.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md).
+
+
+
+### 3.3 Training
+
+The details of the training config are in `ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml
+```
+
+**Note**:
+
+* The ICDAR2019-LSVT dataset is different from the dataset used for the provided pretrained model, so the metrics above may not be reached with it. If you want higher accuracy, you can process more data from [ICDAR2019-LSVT](https://aistudio.baidu.com/aistudio/datasetdetail/8429).
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model for inference. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+The results:
+
+```
+[{'class_ids': [0], 'scores': [1.0], 'file_name': 'deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png', 'label_names': ['0_degree']}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png`. You can test other images by specifying the argument `-o Infer.infer_imgs=path_to_test_image`.
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple but effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Train the teacher model with the hyperparameters specified in `ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+The best metric on the validation data is between `0.96` and `0.98`. The best teacher model weights would be saved in `output/ResNet101_vd/best_model.pdparams`.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The training strategy is specified in the config file `ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0_distillation.yaml`, where the teacher model is `ResNet101_vd` and the student model is `PPLCNet_x1_0`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+The best metric is between `0.95` and `0.97`. The best student model weights would be saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.2](#3.2) and [section 4.1](#4.1) were obtained by `Hyperparameters Searching` in PaddleClas. If you want better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to find better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your specific scenario. If you do not replace the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle, which provides high-performance inference for server deployment. Compared with predicting directly from the trained model, Paddle Inference can use optimization tools to accelerate prediction and achieve better performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle Inference Model for prediction. Two ways to get such a model are provided below. If you want to use the model provided by PaddleClas, you can download it directly; see [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command to export the Paddle Inference Model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_textline_orientation_infer
+```
+
+After running the above command, the inference model files are saved in `deploy/models/PPLCNet_x1_0_textline_orientation_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_textline_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The command above exports the model trained without knowledge distillation. If knowledge distillation training is used, the best student model weights are saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model directly.
+
+```
+cd deploy/models
+# download the inference model and decompress it
+wget https://paddleclas.bj.bcebos.com/models/PULC/textline_orientation_infer.tar && tar -xf textline_orientation_infer.tar
+```
+
+After decompression, the `models` directory should contain the following:
+
+```
+├── textline_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify the rotation of the image `./images/PULC/textline_orientation/textline_orientation_test_0_0.png`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/textline_orientation/inference_textline_orientation.yaml
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/textline_orientation/inference_textline_orientation.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+textline_orientation_test_0_0.png: class id(s): [0], score(s): [1.00], label_name(s): ['0_degree']
+```
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, specify the directory path via the argument `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If you want to use CPU instead, add the argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/textline_orientation/inference_textline_orientation.yaml -o Global.infer_imgs="./images/PULC/textline_orientation/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+textline_orientation_test_0_0.png: class id(s): [0], score(s): [1.00], label_name(s): ['0_degree']
+textline_orientation_test_0_1.png: class id(s): [0], score(s): [1.00], label_name(s): ['0_degree']
+textline_orientation_test_1_0.png: class id(s): [1], score(s): [1.00], label_name(s): ['180_degree']
+textline_orientation_test_1_1.png: class id(s): [1], score(s): [1.00], label_name(s): ['180_degree']
+```
+
+Among the prediction results above, `0_degree` means the rotation angle of the textline image is 0 degrees, and `180_degree` means it is 180 degrees.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example about how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance carrier for machine learning models. It supports different protocols, such as RESTful, gRPC and bRPC, and provides deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example about how to deploy as service by Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile by Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models. With an ONNX model you can deploy on different inference engines, such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert a Paddle Inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_traffic_sign_en.md b/docs/en/PULC/PULC_traffic_sign_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..baa0faf4828a6c7acc16f8c12587a2af58c04f99
--- /dev/null
+++ b/docs/en/PULC/PULC_traffic_sign_en.md
@@ -0,0 +1,475 @@
+# PULC Classification Model of Traffic Sign
+
+------
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical traffic sign classification model using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in automatic driving, road monitoring and other scenarios.
+
+The following table lists the relevant metrics of the model. The first two rows use SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone. For the third to sixth rows, the backbone is replaced by PPLCNet, with the SSLD pretrained model, the EDA strategy and the SKL-UGI knowledge distillation strategy added in turn.
+
+| Backbone | Top-1 Acc(%) | Latency(ms) | Size(M) | Training Strategy |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 98.11 | 89.45 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 93.88 | 3.01 | 3.9 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 97.78 | 2.10 | 8.2 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 97.84 | 2.10 | 8.2 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 98.14 | 2.10 | 8.2 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0 | 98.35 | 2.10 | 8.2 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+It can be seen that high accuracy can be achieved when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves the speed, but the accuracy drops significantly. Replacing the backbone with the faster PPLCNet_x1_0 gives an accuracy 3.9 percentage points higher than MobileNetV3_small_x0_35, while also being more than 43% faster. Using the SSLD pretrained model on top of that improves the accuracy by about 0.06 percentage points without affecting the inference speed, and adding the EDA strategy increases the accuracy by another 0.3 percentage points. Finally, with SKL-UGI knowledge distillation, the accuracy improves by a further 0.21 percentage points. At this point, the accuracy exceeds that of SwinTransformer_tiny, while the speed is more than 41 times faster. The training method and deployment instructions of PULC are introduced in detail below.
+
+**Note**:
+
+* The latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with MKLDNN enabled and 10 threads.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, for example, for other versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+The command to install PaddleClas is as follows:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip to get the test demo images.
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=traffic_sign --infer_imgs=pulc_demo_imgs/traffic_sign/100999_83928.jpg
+```
+
+Results:
+
+```
+>>> result
+class_ids: [182, 179, 162, 128, 24], scores: [0.98623, 0.01255, 0.00022, 0.00021, 0.00012], label_names: ['pl110', 'pl100', 'pl120', 'p26', 'pm10'], filename: pulc_demo_imgs/traffic_sign/100999_83928.jpg
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="traffic_sign")
+result = model.predict(input_data="pulc_demo_imgs/traffic_sign/100999_83928.jpg")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so it must be consumed with `next()` or a `for` loop. Each call predicts a batch of `batch_size` images and returns the prediction results. The default `batch_size` is 1, and you can also specify `batch_size` when instantiating, such as `model = paddleclas.PaddleClas(model_name="traffic_sign", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'class_ids': [182, 179, 162, 128, 24], 'scores': [0.98623, 0.01255, 0.00022, 0.00021, 0.00012], 'label_names': ['pl110', 'pl100', 'pl120', 'p26', 'pm10'], 'filename': 'pulc_demo_imgs/traffic_sign/100999_83928.jpg'}]
+```
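+
+For example, to predict all demo images in a directory with `batch_size=2` and consume the generator with a `for` loop, a minimal sketch (the directory path is the demo data downloaded above):
+
+```python
+import paddleclas
+
+# with batch_size=2, each item yielded by the generator is a list of up to 2 results
+model = paddleclas.PaddleClas(model_name="traffic_sign", batch_size=2)
+results = model.predict(input_data="pulc_demo_imgs/traffic_sign/")
+for batch in results:
+    for pred in batch:
+        print(pred["filename"], pred["label_names"][0], pred["scores"][0])
+```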
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) to get the description about installation.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+The dataset used in this case is based on the [Tsinghua-Tencent 100K dataset (CC-BY-NC license), TT100K](https://cg.cs.tsinghua.edu.cn/traffic-sign/), which is randomly expanded and cropped according to the bounding boxes.
+
+
+
+#### 3.2.2 Getting Dataset
+
+The processing of `TT100K` includes random expansion and cropping, as shown below.
+
+```python
+import random
+
+def get_random_crop_box(xmin, ymin, xmax, ymax, img_height, img_width, ratio=1.0):
+    # height and width of the original bounding box
+    h = ymax - ymin
+    w = xmax - xmin
+
+    # random expansion of each side, limited by the box size and the image border
+    xmin_diff = random.random() * ratio * min(w, xmin/ratio)
+    ymin_diff = random.random() * ratio * min(h, ymin/ratio)
+    xmax_diff = random.random() * ratio * min(w, (img_width-xmax-1)/ratio)
+    ymax_diff = random.random() * ratio * min(h, (img_height-ymax-1)/ratio)
+
+ new_xmin = round(xmin - xmin_diff)
+ new_ymin = round(ymin - ymin_diff)
+ new_xmax = round(xmax + xmax_diff)
+ new_ymax = round(ymax + ymax_diff)
+
+ return new_xmin, new_ymin, new_xmax, new_ymax
+```
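+
+As an illustration of how this function would be applied, the following sketch crops an image with the expanded box; the image path and bounding box below are hypothetical, and the actual `deal.py` shipped with the dataset may differ:
+
+```python
+from PIL import Image
+
+img = Image.open("100999_83928.jpg")          # hypothetical source image
+xmin, ymin, xmax, ymax = 120, 80, 260, 210    # hypothetical sign bounding box
+box = get_random_crop_box(xmin, ymin, xmax, ymax, img.height, img.width, ratio=1.0)
+img.crop(box).save("cropped.jpg")             # PIL crop takes (left, upper, right, lower)
+```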
+
+Some images from the processed dataset are shown below:
+
+
+

+
+
+You can also directly download the processed data; the processing script `deal.py` is included in it. First, go to the PaddleClas directory.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, download and unzip the dataset.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/traffic_sign.tar
+tar -xf traffic_sign.tar
+cd ../
+```
+
+The data under the `traffic_sign` directory is organized as follows:
+
+```
+traffic_sign
+├── train
+│ ├── 0_62627.jpg
+│ ├── 100000_89031.jpg
+│ ├── 100001_89031.jpg
+...
+├── test
+│ ├── 100423_2315.jpg
+│ ├── 100424_2315.jpg
+│ ├── 100425_2315.jpg
+...
+├── other
+│ ├── 100603_3422.jpg
+│ ├── 100604_3422.jpg
+...
+├── label_list_train.txt
+├── label_list_test.txt
+├── label_list_other.txt
+├── label_list_train_for_distillation.txt
+├── label_list_train.txt.debug
+├── label_list_test.txt.debug
+├── label_name_id.txt
+├── deal.py
+```
+
+Here, `train/` and `test/` are the training set and validation set, respectively. `label_list_train.txt` and `label_list_test.txt` are the label files of the training data and validation data, respectively. `label_list_train.txt.debug` and `label_list_test.txt.debug` are subsets of `label_list_train.txt` and `label_list_test.txt`, respectively. `other/` is used for SKL-UGI knowledge distillation, and its label file is `label_list_train_for_distillation.txt`.
+
+**Note**:
+
+* About the content format of `label_list_train.txt` and `label_list_test.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md); a short illustrative example follows these notes.
+* About the `label_list_train_for_distillation.txt`, please refer to [Knowledge Distillation Label](../advanced_tutorials/distillation/distillation_en.md).
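+
+For illustration, each line of `label_list_train.txt` pairs an image path with a class id, separated by a space (the file names are taken from the listing above; the class ids are hypothetical):
+
+```
+train/0_62627.jpg 182
+train/100000_89031.jpg 216
+```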
+
+
+
+### 3.3 Training
+
+The details of the training config are in `ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml
+```
+
+The best metric on the validation data is between `98.0` and `98.2`. The result fluctuates because the dataset is small.
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model to infer. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model
+```
+
+The results:
+
+```
+99603_17806.jpg: class id(s): [216, 145, 49, 207, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pl25', 'pm15']
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+* The default test image is `deploy/images/PULC/traffic_sign/99603_17806.jpg`. You can test another image by specifying the argument `-o Infer.infer_imgs=path_to_test_image`.
+
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple yet effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Training the teacher model with hyperparameters specified in `ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml`. The command is as follow:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+The best metric on the validation data is about `98.59%`. The best teacher model weights are saved in `output/ResNet101_vd/best_model.pdparams`.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The training strategy is specified in the config file `ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_distillation.yaml`: the teacher model is `ResNet101_vd`, the student model is `PPLCNet_x1_0`, and the additional unlabeled training data is the validation data of ImageNet1k. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+The best metric is about `98.35%`. The best student model weights are saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.2](#3.2) and [section 4.1](#4.1) were obtained by the `Hyperparameters Searching` process in PaddleClas. If you want to get better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to get better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your needs. If you do not replace the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle, which provides high-performance inference for server deployment. Compared with predicting directly from the pretrained model, Paddle Inference can use acceleration tools to achieve better inference performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle Inference Model to predict. Two ways are provided to get the Paddle Inference Model: exporting it yourself, or directly downloading the one provided by PaddleClas, see [Downloading Inference Model](#6.1.2).
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command to export the Paddle Inference Model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_traffic_sign_infer
+```
+
+After running the above command, the inference model files are saved in `deploy/models/PPLCNet_x1_0_traffic_sign_infer`, as shown below:
+
+```
+├── PPLCNet_x1_0_traffic_sign_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**Note**: The best model is from knowledge distillation training. If knowledge distillation training is not used, the best model would be saved in `output/PPLCNet_x1_0/best_model.pdparams`.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model directly.
+
+```
+cd deploy/models
+# download the inference model and decompression
+wget https://paddleclas.bj.bcebos.com/models/PULC/traffic_sign_infer.tar && tar -xf traffic_sign_infer.tar
+```
+
+After decompression, the `models` directory should contain the following files:
+
+```
+├── traffic_sign_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify the traffic sign in the image `./images/PULC/traffic_sign/99603_17806.jpg`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+99603_17806.jpg: class id(s): [216, 145, 49, 207, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pl25', 'pm15']
+```
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict the images in a directory, please specify the argument `Global.infer_imgs` as the directory path by `-o Global.infer_imgs`. The command is as follows.
+
+```shell
+# Use the following command to predict with GPU. If want to replace with CPU, you can add argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml -o Global.infer_imgs="./images/PULC/traffic_sign/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+100999_83928.jpg: class id(s): [182, 179, 162, 128, 24], score(s): [0.99, 0.01, 0.00, 0.00, 0.00], label_name(s): ['pl110', 'pl100', 'pl120', 'p26', 'pm10']
+99603_17806.jpg: class id(s): [216, 145, 49, 24, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pm10', 'pm15']
+```
+
+For details about `label_name`, please refer to `dataset/traffic_sign/report.pdf`.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example about how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance serving framework for machine learning models. It supports different protocols, such as RESTful, gRPC and bRPC, and provides deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example about how to deploy as service by Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile by Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models. ONNX models can then be deployed on different inference engines, such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert Paddle Inference model to ONNX model by paddle2onnx toolkit and predict by ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/PULC/PULC_train_en.md b/docs/en/PULC/PULC_train_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..9f94265e9ffb38f40633c671b0f6a60846f8cd08
--- /dev/null
+++ b/docs/en/PULC/PULC_train_en.md
@@ -0,0 +1,246 @@
+## Practical Ultra Lightweight Classification scheme PULC
+------
+
+
+## Catalogue
+
+- [1. Introduction of PULC solution](#1)
+- [2. Data preparation](#2)
+ - [2.1 Dataset format description](#2.1)
+ - [2.2 Annotation file generation method](#2.2)
+- [3. Training with standard classification configuration](#3)
+ - [3.1 PP-LCNet as backbone](#3.1)
+ - [3.2 SSLD pretrained model](#3.2)
+ - [3.3 EDA strategy](#3.3)
+ - [3.4 SKL-UGI knowledge distillation](#3.4)
+ - [3.5 Summary](#3.5)
+- [4. Hyperparameters Searching](#4)
+ - [4.1 Search based on default configuration](#4.1)
+ - [4.2 Custom search configuration](#4.2)
+
+
+
+### 1. Introduction of PULC solution
+
+Image classification is one of the basic algorithms of computer vision, the most common algorithm in enterprise applications, and an important component of many CV applications. In recent years, backbone network models have developed rapidly, and the accuracy record on ImageNet has been refreshed continuously. However, the performance of these models in practical scenarios is sometimes unsatisfactory. On the one hand, models with high precision tend to have large storage and slow inference speed, which often makes them difficult to deploy; on the other hand, after a suitable model is selected, experienced engineers are still required to tune its hyperparameters, which is time-consuming and labor-intensive. In order to solve these problems and make the training and tuning of classification models easier, PaddleClas summarized and launched the Practical Ultra Lightweight Classification (PULC) solution. PULC integrates various state-of-the-art algorithms such as backbone networks, data augmentation and distillation, and can automatically produce a lightweight and high-precision image classification model.
+
+
+The PULC solution has been verified to be effective in many scenarios, such as human-related scenarios, car-related scenarios, and OCR-related scenarios. With an ultra-lightweight model, the accuracy close to SwinTransformer can be achieved, and the inference speed can be 40+ times faster.
+
+
+

+
+
+The solution mainly includes 4 parts, namely: PP-LCNet lightweight backbone network, SSLD pre-trained model, Ensemble Data Augmentation (EDA) and SKL-UGI knowledge distillation algorithm. In addition, we also adopt the method of hyperparameters searching to efficiently optimize the hyperparameters in training. Below, we take the person exists or not scene as an example to illustrate the solution.
+
+**Note**: For some specific scenarios, we provide basic training documents for reference, such as the [person exists or not classification model](PULC_person_exists_en.md); you can find these documents [here](./PULC_model_list_en.md). If the methods in these documents do not meet your needs, or if you need a custom training task, you can refer to this document.
+
+
+
+### 2. Data preparation
+
+
+
+#### 2.1 Dataset format description
+
+PaddleClas uses `txt` format files to specify the training set and validation set. Take the person exists or not scene as an example: you need to specify `train_list.txt` and `val_list.txt` as the label files of the training set and validation set. The format is as follows:
+
+```
+# Each line uses "space" to separate the image path and label
+train/1.jpg 0
+train/10.jpg 1
+...
+```
+
+If you want to get more information about common classification datasets, you can refer to the document [PaddleClas Classification Dataset Format Description](../data_preparation/classification_dataset_en.md).
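+
+As a quick sanity check before training, you can verify that every image referenced in the label file exists; a minimal sketch, assuming it is run from the dataset root directory:
+
+```python
+import os
+
+with open("train_list.txt") as f:
+    for line_no, line in enumerate(f, 1):
+        path, label = line.strip().split(" ")
+        assert os.path.isfile(path), f"line {line_no}: missing image {path}"
+        assert label.isdigit(), f"line {line_no}: label should be an integer"
+```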
+
+
+
+
+#### 2.2 Annotation file generation method
+
+If you already have the data in the actual scene, you can label it according to the format in the previous section. Here, we provide a script to quickly generate annotation files. You only need to put different categories of data in folders and run the script to generate annotation files.
+
+First, assume that the data is stored under `./train`, where `train/` contains one folder per category, category numbers start from 0, and each category folder contains the image files.
+
+```shell
+train
+├── 0
+│ ├── 0.jpg
+│ ├── 1.jpg
+│ └── ...
+└── 1
+ ├── 0.jpg
+ ├── 1.jpg
+ └── ...
+└── ...
+```
+
+```shell
+tree -r -i -f train | grep -E "jpg|JPG|jpeg|JPEG|png|PNG" | awk -F "/" '{print $0" "$2}' > train_list.txt
+```
+
+If more image name suffixes are involved, add them to the pattern after `grep -E`; the `2` in `$2` is the directory level of the category number folder.
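+
+If `tree` is not available, an equivalent annotation file can be generated in Python; this is a sketch under the same folder layout assumption (category folders directly under `train/`):
+
+```python
+import os
+
+exts = (".jpg", ".jpeg", ".png")  # add more suffixes here if needed
+with open("train_list.txt", "w") as f:
+    for category in sorted(os.listdir("train")):
+        cat_dir = os.path.join("train", category)
+        if not os.path.isdir(cat_dir):
+            continue
+        for name in sorted(os.listdir(cat_dir)):
+            if name.lower().endswith(exts):
+                # mirror the shell one-liner: "<image path> <category id>"
+                f.write(f"{cat_dir}/{name} {category}\n")
+```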
+
+**Note:** The above is an introduction to the method of dataset acquisition and generation. Here you can directly download the person exists or not scene data to quickly start the experience.
+
+
+Go to the PaddleClas directory.
+
+```
+cd path_to_PaddleClas
+```
+
+Go to the `dataset/` directory, download and unzip the data.
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
+tar -xf person_exists.tar
+cd ../
+```
+
+
+
+### 3. Training with standard classification configuration
+
+
+
+#### 3.1 PP-LCNet as backbone
+
+PULC adopts the lightweight backbone network PP-LCNet, which is 50% faster than other networks with the same accuracy. You can view the detailed introduction of the backbone network in [PP-LCNet Introduction](../models/PP-LCNet_en.md).
+
+The command to train with PP-LCNet is:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_search.yaml
+```
+
+For performance comparison, we also provide configuration files for the large model SwinTransformer_tiny and the lightweight model MobileNetV3_small_x0_35, which you can train with the command:
+
+SwinTransformer_tiny:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/SwinTransformer_tiny_patch4_window7_224.yaml
+```
+
+MobileNetV3_small_x0_35:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/MobileNetV3_small_x0_35.yaml
+```
+
+
+The accuracy of the trained models is compared in the following table.
+
+| Model | Tpr(%) | Latency(ms) | Storage Size(M) | Strategy |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 95.69 | 95.30 | 107 | Use ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | Use ImageNet pretrained model |
+| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | Use ImageNet pretrained model |
+
+It can be seen that PP-LCNet is much faster than SwinTransformer_tiny, but its accuracy is also lower. Below we improve the accuracy of the PP-LCNet model through a series of optimizations.
+
+
+
+#### 3.2 SSLD pretrained model
+
+SSLD is a semi-supervised distillation algorithm developed by Baidu. On the ImageNet dataset, the model accuracy can be improved by 3-7 points. You can find a detailed introduction in [SSLD introduction](../advanced_tutorials/distillation/distillation_en.md). We found that using SSLD pre-trained weights can effectively improve the accuracy of the applied classification model. In addition, using a smaller resolution in training can effectively improve model accuracy. At the same time, we also optimize the learning rate.
+Based on the above three improvements, the accuracy of our trained model is 92.1%, an increase of about 2.5 percentage points.
+
+
+
+#### 3.3 EDA strategy
+
+Data augmentation is a commonly used optimization strategy in vision algorithms, which can significantly improve the accuracy of the model. In addition to the traditional RandomCrop, RandomFlip, etc. methods, we also apply RandomAugment and RandomErasing. You can find a detailed introduction at [Data Augmentation Introduction](../advanced_tutorials/DataAugmentation_en.md).
+Since these two kinds of data augmentation modify the image heavily and make the classification task more difficult, they may lead to under-fitting on some datasets, so we set the probability of enabling them in advance.
+Based on the above improvements, we obtained a model accuracy of 93.43%, an increase of 1.3%.
+
+
+
+#### 3.4 SKL-UGI knowledge distillation
+
+Knowledge distillation is a method that can effectively improve the accuracy of small models. You can find a detailed introduction in [Introduction to Knowledge Distillation](../advanced_tutorials/distillation/distillation_en.md). We choose ResNet101_vd as the teacher model for distillation. In order to adapt to the distillation process, we also adjust the learning rate of different stages of the network here. Based on the above improvements, we trained the model to an accuracy of 95.6%, an increase of about 2.2 percentage points.
+
+
+
+#### 3.5 Summary
+
+After the optimization of the above methods, the final accuracy of PP-LCNet reaches 95.6%, reaching the accuracy level of the large model. We summarize the experimental results in the following table:
+
+| Model | Tpr(%) | Latency(ms) | Storage Size(M) | Strategy |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 95.69 | 95.30 | 107 | Use ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | Use ImageNet pretrained model |
+| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | Use ImageNet pretrained model |
+| PPLCNet_x1_0 | 92.10 | 2.12 | 6.5 | Use SSLD pretrained model |
+| PPLCNet_x1_0 | 93.43 | 2.12 | 6.5 | Use SSLD pretrained model + EDA Strategy|
+| PPLCNet_x1_0 | 95.60 | 2.12 | 6.5 | Use SSLD pretrained model + EDA Strategy + SKL-UGI knowledge distillation |
+
+We also used the same optimization strategy in the other 8 scenarios and got the following results:
+
+| scenarios | large model | large model metrics(%) | small model | small model metrics(%) |
+|----------|----------|----------|----------|----------|
+| Pedestrian Attribute Classification | Res2Net200_vd | 81.25 | PPLCNet_x1_0 | 78.59 |
+| Classification of Whether Wearing Safety Helmet | Res2Net200_vd | 98.92 | PPLCNet_x1_0 | 99.38 |
+| Traffic Sign Classification | SwinTransformer_tiny | 98.11 | PPLCNet_x1_0 | 98.35 |
+| Vehicle Attribute Classification | Res2Net200_vd_26w_4s | 91.36 | PPLCNet_x1_0 | 90.81 |
+| Car Exists Classification | SwinTransformer_tiny | 97.71 | PPLCNet_x1_0 | 95.92 |
+| Text Image Orientation Classification | SwinTransformer_tiny |99.12 | PPLCNet_x1_0 | 99.06 |
+| Text-line Orientation Classification | SwinTransformer_tiny | 93.61 | PPLCNet_x1_0 | 96.01 |
+| Language Classification | SwinTransformer_tiny | 98.12 | PPLCNet_x1_0 | 99.26 |
+
+
+It can be seen from the results that the PULC scheme can improve the model accuracy in multiple application scenarios. Using the PULC scheme can greatly reduce the workload of model optimization and quickly obtain models with higher accuracy.
+
+
+
+
+### 4. Hyperparameters Searching
+
+In the above training process, we adjusted parameters such as learning rate, data augmentation probability, and stage learning rate mult list. The optimal values of these parameters may not be the same in different scenarios. We provide a quick hyperparameters searching script to automate the process of hyperparameter tuning. This script traverses the parameters in the search value list to replace the parameters in the default configuration, then trains in sequence, and finally selects the parameters corresponding to the model with the highest accuracy as the search result.
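+
+Conceptually, the search behaves like the following Python sketch; this is a simplified greedy reading of the description above, and the actual logic lives in `tools/search_strategy.py`:
+
+```python
+def search_hyperparameters(base_config, search_space, train_and_eval):
+    """Greedy sketch: fix the best value of each parameter before moving on.
+
+    search_space, e.g. {"lr": [0.005, 0.01, 0.02], "resolution": [176, 192, 224]}
+    train_and_eval: runs a full training with the given config, returns accuracy.
+    """
+    config = dict(base_config)
+    for name, values in search_space.items():
+        # train once per candidate value, keeping all other parameters fixed
+        scores = {value: train_and_eval({**config, name: value}) for value in values}
+        config[name] = max(scores, key=scores.get)
+    return config
+```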
+
+
+
+#### 4.1 Search based on default configuration
+
+The configuration file [search.yaml](../../../ppcls/configs/PULC/person_exists/search.yaml) defines the configuration of hyperparameters searching in person exists or not scenarios. Use the following commands to complete hyperparameters searching.
+
+```bash
+python3 tools/search_strategy.py -c ppcls/configs/PULC/person_exists/search.yaml
+```
+
+**Note**: Regarding the search part, we are constantly improving it, so stay tuned.
+
+
+
+#### 4.2 Custom search configuration
+
+
+You can also modify the configuration of hyperparameters searching based on training results or your parameter tuning experience.
+
+Modify the `search_values` field in `lrs` to modify the list of learning rate search values;
+
+Modify the `search_values` field in `resolutions` to modify the search value list of resolutions;
+
+Modify the `search_values` field in `ra_probs` to modify the search value list of RandAugment activation probability;
+
+Modify the `search_values` field in `re_probs` to modify the search value list of RandomErasing activation probability;
+
+Modify the `search_values` field in `lr_mult_list` to modify the lr_mult search value list;
+
+Modify the `search_values` field in `teacher` to modify the search list of the teacher model.
+
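+For example, to change the learning-rate search list programmatically, a sketch assuming PyYAML is installed (the exact nesting of the real `search.yaml` may differ, so treat the keys below as illustrative):
+
+```python
+import yaml
+
+path = "ppcls/configs/PULC/person_exists/search.yaml"
+with open(path) as f:
+    cfg = yaml.safe_load(f)
+
+# illustrative structure: a section named "lrs" holding a "search_values" list
+cfg["lrs"]["search_values"] = [0.005, 0.01, 0.02]
+
+with open(path, "w") as f:
+    yaml.safe_dump(cfg, f)
+```
+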
+After the search is completed, the final results are generated in `output/search_person_exists`. Except for `search_res`, each directory in `output/search_person_exists` contains the weights and training log files for one search training run with the corresponding hyperparameters; `search_res` corresponds to the result of knowledge distillation, that is, the final model. The weights of the model are stored in `output/output_dir/search_person_exists/DistillationModel/best_model_student.pdparams`.
diff --git a/docs/en/PULC/PULC_vehicle_attribute_en.md b/docs/en/PULC/PULC_vehicle_attribute_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..47d7c963e9de6e4bde9fd3338830611e59b60695
--- /dev/null
+++ b/docs/en/PULC/PULC_vehicle_attribute_en.md
@@ -0,0 +1,481 @@
+# PULC Recognition Model of Vehicle Attribute
+
+------
+
+## Catalogue
+
+- [1. Introduction](#1)
+- [2. Quick Start](#2)
+ - [2.1 PaddlePaddle Installation](#2.1)
+ - [2.2 PaddleClas Installation](#2.2)
+ - [2.3 Prediction](#2.3)
+- [3. Training, Evaluation and Inference](#3)
+ - [3.1 Installation](#3.1)
+ - [3.2 Dataset](#3.2)
+ - [3.2.1 Dataset Introduction](#3.2.1)
+ - [3.2.2 Getting Dataset](#3.2.2)
+ - [3.3 Training](#3.3)
+ - [3.4 Evaluation](#3.4)
+ - [3.5 Inference](#3.5)
+- [4. Model Compression](#4)
+ - [4.1 SKL-UGI Knowledge Distillation](#4.1)
+ - [4.1.1 Teacher Model Training](#4.1.1)
+ - [4.1.2 Knowledge Distillation Training](#4.1.2)
+- [5. Hyperparameters Searching](#5)
+- [6. Inference Deployment](#6)
+ - [6.1 Getting Paddle Inference Model](#6.1)
+ - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
+ - [6.1.2 Downloading Inference Model](#6.1.2)
+ - [6.2 Prediction with Python](#6.2)
+ - [6.2.1 Image Prediction](#6.2.1)
+ - [6.2.2 Images Prediction](#6.2.2)
+ - [6.3 Deployment with C++](#6.3)
+ - [6.4 Deployment as Service](#6.4)
+ - [6.5 Deployment on Mobile](#6.5)
+ - [6.6 Converting To ONNX and Deployment](#6.6)
+
+
+
+## 1. Introduction
+
+This case provides a way for users to quickly build a lightweight, high-precision and practical vehicle attribute classification model using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in vehicle identification, road monitoring and other scenarios.
+
+The following table lists the relevant metrics of the models. The first three rows use Res2Net200_vd_26w_4s, ResNet50 and MobileNetV3_small_x0_35 as the backbone for training. For the fourth to seventh rows, the backbone is replaced by PPLCNet, with the SSLD pretrained model, the EDA strategy, and the SKL-UGI knowledge distillation strategy added in turn.
+
+
+| Backbone | mA(%) | Latency(ms) | Size(M) | Training Strategy |
+|-------|-----------|----------|---------------|---------------|
+| Res2Net200_vd_26w_4s | 91.36 | 79.46 | 293 | using ImageNet pretrained |
+| ResNet50 | 89.98 | 12.83 | 92 | using ImageNet pretrained |
+| MobileNetV3_small_x0_35 | 87.41 | 2.91 | 2.8 | using ImageNet pretrained |
+| PPLCNet_x1_0 | 89.57 | 2.36 | 7.2 | using ImageNet pretrained |
+| PPLCNet_x1_0 | 90.07 | 2.36 | 7.2 | using SSLD pretrained |
+| PPLCNet_x1_0 | 90.59 | 2.36 | 7.2 | using SSLD pretrained + EDA strategy|
+| PPLCNet_x1_0 | 90.81 | 2.36 | 7.2 | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy|
+
+
+It can be seen from the table that the mA metric is higher when the backbone is Res2Net200_vd_26w_4s, but the inference speed is slower. After replacing the backbone with the lightweight model MobileNetV3_small_x0_35, the speed improves greatly, but the mA metric drops significantly. When the backbone is replaced by PPLCNet_x1_0, the mA metric is about 2 percentage points higher than that of MobileNetV3_small_x0_35, and the speed is about 23% faster. On this basis, using the SSLD pretrained model improves the mA metric by about 0.5 percentage points without changing the inference speed. Further, integrating the EDA strategy improves the mA metric by another 0.52 percentage points. Finally, after using SKL-UGI knowledge distillation, the mA metric improves by a further 0.22 percentage points. At this point, the mA metric of PPLCNet_x1_0 is only 0.55 percentage points away from Res2Net200_vd_26w_4s, while it is more than 32 times faster. The training method and deployment instructions of PULC are introduced in detail below.
+
+
+**Note**:
+
+* The Latency is tested on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz. The MKLDNN is enabled and the number of threads is 10.
+* About PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
+
+
+
+## 2. Quick Start
+
+
+
+### 2.1 PaddlePaddle Installation
+
+- Run the following command to install if CUDA9 or CUDA10 is available.
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- Run the following command to install if GPU device is unavailable.
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, for example, for other versions.
+
+
+
+### 2.2 PaddleClas wheel Installation
+
+The command to install PaddleClas is as follows:
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+### 2.3 Prediction
+
+First, please click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip to get the test demo images.
+
+
+* Prediction with CLI
+
+```bash
+paddleclas --model_name=vehicle_attribute --infer_imgs=pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg
+```
+
+Results:
+```
+>>> result
+attributes: Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505), output: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], filename: pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg
+Predict complete!
+```
+
+**Note**: If you want to test other images, you only need to specify the `--infer_imgs` argument; a directory containing images is also supported.
+
+* Prediction in Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="vehicle_attribute")
+result = model.predict(input_data="pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg")
+print(next(result))
+```
+
+**Note**: The `result` returned by `model.predict()` is a generator, so it must be consumed with `next()` or a `for` loop. Each call predicts a batch of `batch_size` images and returns the prediction results. The default `batch_size` is 1, and you can also specify `batch_size` when instantiating, such as `model = paddleclas.PaddleClas(model_name="vehicle_attribute", batch_size=2)`. The result of the demo above:
+
+```
+>>> result
+[{'attributes': 'Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 'filename': 'pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg'}]
+```
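+
+The 19-dimensional `output` vector is a multi-hot encoding: based on the label-conversion script in section 3.2.2 below, the first 10 entries encode the color id and the last 9 the type id. A decoding sketch follows; the id-to-name tables are assumed from the VeRi label definition and should be checked against the dataset:
+
+```python
+# assumed id-to-name tables; verify against the VeRi label definition
+COLORS = ["yellow", "orange", "green", "gray", "red",
+          "blue", "white", "golden", "brown", "black"]
+TYPES = ["sedan", "suv", "van", "hatchback", "mpv",
+         "pickup", "bus", "truck", "estate"]
+
+def decode_output(output):
+    """Split the 19-dim multi-hot vector: entries 0-9 are colors, 10-18 are types."""
+    colors = [COLORS[i] for i, v in enumerate(output[:10]) if v]
+    types = [TYPES[i] for i, v in enumerate(output[10:]) if v]
+    return colors, types
+
+print(decode_output([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]))
+# -> (['yellow'], ['hatchback'])
+```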
+
+
+
+## 3. Training, Evaluation and Inference
+
+
+
+### 3.1 Installation
+
+Please refer to [Installation](../installation/install_paddleclas_en.md) to get the description about installation.
+
+
+
+### 3.2 Dataset
+
+
+
+#### 3.2.1 Dataset Introduction
+
+The data used in this case is the [VeRi dataset](https://www.v7labs.com/open-datasets/veri-dataset).
+
+
+
+#### 3.2.2 Getting Dataset
+
+
+Part of the data visualization is shown below.
+
+
+

+
+
+First, apply for and download the data from the [VeRi dataset official website](https://www.v7labs.com/open-datasets/veri-dataset), put it in the `dataset` directory of PaddleClas with the directory name `VeRi`, and use the following command to enter the folder.
+
+
+```shell
+cd PaddleClas/dataset/VeRi/
+```
+
+Then use the following code to convert the labels (you can run it in a Python terminal, or write it to a file and run it with `python3 convert.py`).
+
+```python
+import os
+from xml.dom.minidom import parse
+
+def convert_annotation(input_fp, output_fp):
+    # each <Item> carries imageName, vehicleID, colorID (1-10) and typeID (1-9);
+    # vehicleID is not needed for the attribute labels
+    root = parse(input_fp).documentElement
+
+    with open(output_fp, 'w') as list_file:
+        for item in root.getElementsByTagName("Item"):
+            # 19-dim multi-hot label: entries 0-9 encode color, 10-18 encode type
+            label = ['0'] * 19
+            name = item.getAttribute("imageName")
+            if item.hasAttribute("colorID"):
+                colorid = int(item.getAttribute("colorID"))
+                label[colorid - 1] = '1'
+            if item.hasAttribute("typeID"):
+                typeid = int(item.getAttribute("typeID"))
+                label[typeid + 9] = '1'
+            list_file.write(os.path.join('image_train', name) + "\t" + ','.join(label) + "\n")
+
+convert_annotation('train_label.xml', 'train_list.txt')
+convert_annotation('test_label.xml', 'test_list.txt')
+```
+
+
+After executing the above code, the `VeRi` directory contains the following data:
+
+```
+VeRi
+├── image_train
+│ ├── 0001_c001_00016450_0.jpg
+│ ├── 0001_c001_00016460_0.jpg
+│ ├── 0001_c001_00016470_0.jpg
+...
+├── image_test
+│ ├── 0002_c002_00030600_0.jpg
+│ ├── 0002_c002_00030605_1.jpg
+│ ├── 0002_c002_00030615_1.jpg
+...
+...
+├── train_list.txt
+├── test_list.txt
+├── train_label.xml
+├── test_label.xml
+```
+
+Here, `image_train/` and `image_test/` contain the images of the training set and validation set, respectively. `train_list.txt` and `test_list.txt` are the converted label files of the training and validation sets, respectively.
+
+
+
+
+### 3.3 Training
+
+The details of the training config are in `./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml`. The training command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml
+```
+
+The best metric for the validation set is around `90.59%` (the dataset is small, so the result generally fluctuates by around 0.3%).
+
+
+
+
+### 3.4 Evaluation
+
+After training, you can use the following commands to evaluate the model.
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+
+
+
+### 3.5 Inference
+
+After training, you can use the trained model to infer. The command is as follows:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model
+```
+
+The results:
+
+```
+[{'attr': 'Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.9734100103378296)', 'pred': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 'file_name': './deploy/images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg'}]
+```
+
+**Note**:
+
+* In the above command, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weight file. You can specify another path if needed.
+* The default test image is `./deploy/images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg`. You can test another image by specifying the argument `-o Infer.infer_imgs=path_to_test_image`.
+
+
+
+## 4. Model Compression
+
+
+
+### 4.1 SKL-UGI Knowledge Distillation
+
+SKL-UGI is a simple yet effective knowledge distillation algorithm proposed by PaddleClas.
+
+
+
+
+
+
+#### 4.1.1 Teacher Model Training
+
+Training the teacher model with hyperparameters specified in `ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml`. The command is as follow:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+The best metric for the validation set is around `91.60%`. The best teacher model weights are saved in `output/ResNet101_vd/best_model.pdparams`.
+
+
+
+#### 4.1.2 Knowledge Distillation Training
+
+The training strategy is specified in the config file `ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0_distillation.yaml`: the teacher model is `ResNet101_vd` and the student model is `PPLCNet_x1_0`. The command is as follows:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+The best metric for the validation set is around `90.81%`. The best student model weights are saved in `output/DistillationModel/best_model_student.pdparams`.
+
+
+
+## 5. Hyperparameters Searching
+
+The hyperparameters used in [section 3.2](#3.2) and [section 4.1](#4.1) were obtained by the `Hyperparameters Searching` process in PaddleClas. If you want to get better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to get better hyperparameters.
+
+**Note**: This section is optional. Because the search process takes a long time, you can run it selectively according to your needs. If you do not replace the dataset, you can skip this section.
+
+
+
+## 6. Inference Deployment
+
+
+
+### 6.1 Getting Paddle Inference Model
+
+Paddle Inference is the native inference library of PaddlePaddle, which provides high-performance inference for server deployment. Compared with predicting directly from the pretrained model, Paddle Inference can use acceleration tools to achieve better inference performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
+
+Paddle Inference needs a Paddle Inference Model to predict. Two ways are provided to get the Paddle Inference Model: exporting it yourself, or directly downloading the one provided by PaddleClas, see [Downloading Inference Model](#6.1.2).
+
+
+
+### 6.1.1 Exporting Paddle Inference Model
+
+The command to export the Paddle Inference Model is as follows:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_vehicle_attribute_infer
+```
+
+After running the above command, the inference model files are saved in `deploy/models/PPLCNet_x1_0_vehicle_attribute_infer`, as shown below:
+
+```
+└── PPLCNet_x1_0_vehicle_attribute_infer
+ ├── inference.pdiparams
+ ├── inference.pdiparams.info
+ └── inference.pdmodel
+```
+
+**Note**: The best model is from knowledge distillation training. If knowledge distillation training is not used, the best model would be saved in `output/PPLCNet_x1_0/best_model.pdparams`.
+
+
+
+### 6.1.2 Downloading Inference Model
+
+You can also download the inference model directly.
+
+```
+cd deploy/models
+# download the inference model and decompression
+wget https://paddleclas.bj.bcebos.com/models/PULC/vehicle_attribute_infer.tar && tar -xf vehicle_attribute_infer.tar
+```
+
+After decompression, the `models` directory should contain the following files:
+
+```
+├── vehicle_attribute_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 Prediction with Python
+
+
+
+#### 6.2.1 Image Prediction
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify the attributes of the vehicle in the image `./images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg`.
+
+```shell
+# Use the following command to predict with GPU.
+python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml -o Global.use_gpu=True
+# Use the following command to predict with CPU.
+python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml -o Global.use_gpu=False
+```
+
+The prediction results:
+
+```
+0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.9734099507331848)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]}
+```
+
+
+
+
+#### 6.2.2 Images Prediction
+
+If you want to predict images in directory, please specify the argument `Global.infer_imgs` as directory path by `-o Global.infer_imgs`. The command is as follow.
+
+```shell
+# Use the following command to predict with GPU. If want to replace with CPU, you can add argument -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml -o Global.infer_imgs="./images/PULC/vehicle_attribute/"
+```
+
+All prediction results will be printed, as shown below.
+
+```
+0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]}
+0014_c012_00040750_0.jpg: {'attributes': 'Color: (red, prob: 0.999872088432312), Type: (sedan, prob: 0.999976634979248)', 'output': [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]}
+```
+
+In the prediction results above, each line reports the predicted color and type of the vehicle together with their probabilities.
+
+
+
+### 6.3 Deployment with C++
+
+PaddleClas provides an example about how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
+
+
+
+### 6.4 Deployment as Service
+
+Paddle Serving is a flexible, high-performance serving framework for machine learning models. It supports different protocols, such as RESTful, gRPC and bRPC, and provides deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
+
+PaddleClas provides an example about how to deploy as service by Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
+
+
+
+### 6.5 Deployment on Mobile
+
+Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
+
+PaddleClas provides an example of how to deploy on mobile by Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
+
+
+
+### 6.6 Converting To ONNX and Deployment
+
+Paddle2ONNX supports converting Paddle Inference models to ONNX models. ONNX models can then be deployed on different inference engines, such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of how to convert Paddle Inference model to ONNX model by paddle2onnx toolkit and predict by ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
diff --git a/docs/en/inference_deployment/whl_deploy_en.md b/docs/en/inference_deployment/whl_deploy_en.md
index 224d41a7c1f2de9886fd830a36b8910dae0f97b6..e2666458a27f55bdb44f5fcb2646ba9107e80163 100644
--- a/docs/en/inference_deployment/whl_deploy_en.md
+++ b/docs/en/inference_deployment/whl_deploy_en.md
@@ -1,6 +1,6 @@
# PaddleClas wheel package
-Paddleclas supports Python WHL package for prediction. At present, WHL package only supports image classification, but does not support subject detection, feature extraction and vector retrieval.
+PaddleClas supports Python wheel package for prediction. At present, the PaddleClas wheel supports image classification, including ImageNet1k models and PULC models, but does not support mainbody detection, feature extraction or vector retrieval.
---
@@ -8,8 +8,10 @@ Paddleclas supports Python WHL package for prediction. At present, WHL package o
- [1. Installation](#1)
- [2. Quick Start](#2)
+ - [2.1 ImageNet1k models](#2.1)
+ - [2.2 PULC models](#2.2)
- [3. Definition of Parameters](#3)
-- [4. Usage](#4)
+- [4. More usage](#4)
- [4.1 View help information](#4.1)
- [4.2 Prediction using inference model provide by PaddleClas](#4.2)
- [4.3 Prediction using local model files](#4.3)
@@ -20,6 +22,7 @@ Paddleclas supports Python WHL package for prediction. At present, WHL package o
- [4.8 Specify the mapping between class id and label name](#4.8)
+
## 1. Installation
* installing from pypi
@@ -36,8 +39,14 @@ pip3 install dist/*
```
+
## 2. Quick Start
-* Using the `ResNet50` model provided by PaddleClas, the following image(`'docs/images/inference_deployment/whl_demo.jpg'`) as an example.
+
+
+
+### 2.1 ImageNet1k models
+
+Using the `ResNet50` model provided by PaddleClas, the following image(`'docs/images/inference_deployment/whl_demo.jpg'`) as an example.

@@ -68,25 +77,88 @@ filename: docs/images/inference_deployment/whl_demo.jpg, top-5, class_ids: [8, 7
Predict complete!
```
+
+
+### 2.2 PULC models
+
+PULC integrates various state-of-the-art algorithms such as backbone network, data augmentation and distillation, etc., and finally can automatically obtain a lightweight and high-precision image classification model.
+
+PaddleClas provides a series of test cases, which contain demos of different scenes such as people, cars and OCR. Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the data.
+
+Prediction using the PULC "Human Exists Classification" model provided by PaddleClas:
+
+* Python
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="person_exists")
+result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
+print(next(result))
+```
+
+```
+>>> result
+[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
+```
+
+`nobody` means there is no one in the image, `someone` means there is someone in the image. Therefore, the prediction result indicates that there is no one in the image.
+
+**Note**: `model.predict()` returns a generator, so `next()` or a `for` loop is needed to consume it. Each call predicts a batch of `batch_size` images; the default is 1. You can specify the arguments `batch_size` and `model_name` when instantiating the PaddleClas object, for example: `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`. Please refer to [Supported Model List](#PULC_Models) for the supported models.
+
+* CLI
+
+```bash
+paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
+```
+
+```
+>>> result
+class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
+Predict complete!
+```
+
+**Note**: The `--infer_imgs` argument specifies the image(s) to be predicted; you can also specify a directory containing images. To use another model, specify the `--model_name` argument. Please refer to [Supported Model List](#PULC_Models) for the supported models.
+
+
+
+**Supported Model List**
+
+The name of PULC series models are as follows:
+
+| Name | Intro |
+| --- | --- |
+| person_exists | Human Exists Classification |
+| person_attribute | Pedestrian Attribute Classification |
+| safety_helmet | Classification of Whether Wearing Safety Helmet |
+| traffic_sign | Traffic Sign Classification |
+| vehicle_attribute | Vehicle Attribute Classification |
+| car_exists | Car Exists Classification |
+| text_image_orientation | Text Image Orientation Classification |
+| textline_orientation | Text-line Orientation Classification |
+| language_classification | Language Classification |
+
+Please refer to [Human Exists Classification](../PULC/PULC_person_exists_en.md), [Pedestrian Attribute Classification](../PULC/PULC_person_attribute_en.md), [Classification of Whether Wearing Safety Helmet](../PULC/PULC_safety_helmet_en.md), [Traffic Sign Classification](../PULC/PULC_traffic_sign_en.md), [Vehicle Attribute Classification](../PULC/PULC_vehicle_attribute_en.md), [Car Exists Classification](../PULC/PULC_car_exists_en.md), [Text Image Orientation Classification](../PULC/PULC_text_image_orientation_en.md), [Text-line Orientation Classification](../PULC/PULC_textline_orientation_en.md) and [Language Classification](../PULC/PULC_language_classification_en.md) for more information about different scenarios.
+
+
## 3. Definition of Parameters
The following parameters can be specified in Command Line or used as parameters of the constructor when instantiating the PaddleClas object in Python.
* model_name(str): If using inference model based on ImageNet1k provided by Paddle, please specify the model's name by the parameter.
* inference_model_dir(str): Local model files directory, which is valid when `model_name` is not specified. The directory should contain `inference.pdmodel` and `inference.pdiparams`.
* infer_imgs(str): The path of image to be predicted, or the directory containing the image files, or the URL of the image from Internet.
-* use_gpu(bool): Whether to use GPU or not, default by `True`.
-* gpu_mem(int): GPU memory usages,default by `8000`。
-* use_tensorrt(bool): Whether to open TensorRT or not. Using it can greatly promote predict preformance, default by `False`.
-* enable_mkldnn(bool): Whether enable MKLDNN or not, default `False`.
-* cpu_num_threads(int): Assign number of cpu threads, valid when `--use_gpu` is `False` and `--enable_mkldnn` is `True`, default by `10`.
-* batch_size(int): Batch size, default by `1`.
-* resize_short(int): Resize the minima between height and width into `resize_short`, default by `256`.
-* crop_size(int): Center crop image to `crop_size`, default by `224`.
-* topk(int): Print (return) the `topk` prediction results, default by `5`.
-* class_id_map_file(str): The mapping file between class ID and label, default by `ImageNet1K` dataset's mapping.
-* pre_label_image(bool): whether prelabel or not, default=False.
-* save_dir(str): The directory to save the prediction results that can be used as pre-label, default by `None`, that is, not to save.
+* use_gpu(bool): Whether to use GPU or not.
+* gpu_mem(int): GPU memory usage.
+* use_tensorrt(bool): Whether to enable TensorRT or not. Using it can greatly improve prediction performance.
+* enable_mkldnn(bool): Whether to enable MKLDNN or not.
+* cpu_num_threads(int): Assign number of cpu threads, valid when `--use_gpu` is `False` and `--enable_mkldnn` is `True`.
+* batch_size(int): Batch size.
+* resize_short(int): Resize the minima between height and width into `resize_short`.
+* crop_size(int): Center crop image to `crop_size`.
+* topk(int): Print (return) the `topk` prediction results when Topk postprocess is used.
+* threshold(float): The threshold of ThreshOutput when postprocess is used.
+* class_id_map_file(str): The mapping file between class ID and label.
+* save_dir(str): The directory to save the prediction results that can be used as pre-label.
**Note**: If you want to use `Transformer series models`, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of model, and need to set `resize_short=384`, `resize=384`. The following is a demo.
@@ -103,6 +175,7 @@ clas = PaddleClas(model_name='ViT_base_patch16_384', resize_short=384, crop_size
```
+
## 4. Usage
PaddleClas provides two ways to use:
@@ -110,6 +183,7 @@ PaddleClas provides two ways to use:
2. Bash command line programming.
+
### 4.1 View help information
* CLI
@@ -118,6 +192,7 @@ paddleclas -h
```
+
### 4.2 Prediction using inference model provide by PaddleClas
You can use the inference model provided by PaddleClas to predict, and only need to specify `model_name`. In this case, PaddleClas will automatically download files of specified model and save them in the directory `~/.paddleclas/`.
@@ -136,6 +211,7 @@ paddleclas --model_name='ResNet50' --infer_imgs='docs/images/inference_deploymen
```
+
### 4.3 Prediction using local model files
You can use the local model files trained by yourself to predict, and only need to specify `inference_model_dir`. Note that the directory must contain `inference.pdmodel` and `inference.pdiparams`.
@@ -154,6 +230,7 @@ paddleclas --inference_model_dir='./inference/' --infer_imgs='docs/images/infere
```
+
### 4.4 Prediction by batch
You can predict by batch; you only need to specify `batch_size` when `infer_imgs` is a directory containing image files.
@@ -173,6 +250,7 @@ paddleclas --model_name='ResNet50' --infer_imgs='docs/images/' --batch_size 2
```
+
### 4.5 Prediction of Internet image
You can predict an image from the Internet; you only need to specify the URL of the image by `infer_imgs`. In this case, the image file will be downloaded and saved in the directory `~/.paddleclas/images/`.
@@ -191,6 +269,7 @@ paddleclas --model_name='ResNet50' --infer_imgs='https://raw.githubusercontent.c
```
+
### 4.6 Prediction of NumPy.array format image
In Python code, you can predict an image in `NumPy.array` format; you only need to pass the image data via `infer_imgs`. Note that the models in PaddleClas only support predicting 3-channel image data, with channels in `RGB` order.
@@ -205,6 +284,7 @@ print(next(result))
```
+
### 4.7 Save the prediction result(s)
You can save the prediction result(s) as pre-labels; you only need to use `save_dir` to specify the directory to save them in.
@@ -212,17 +292,18 @@ You can save the prediction result(s) as pre-label, only need to use `pre_label_
```python
from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50', save_dir='./output_pre_label/')
-infer_imgs = 'docs/images/inference_deployment/whl_' # it can be infer_imgs folder path which contains all of images you want to predict.
+infer_imgs = 'docs/images/' # it can be infer_imgs folder path which contains all of images you want to predict.
result=clas.predict(infer_imgs)
print(next(result))
```
* CLI
```bash
-paddleclas --model_name='ResNet50' --infer_imgs='docs/images/inference_deployment/whl_' --save_dir='./output_pre_label/'
+paddleclas --model_name='ResNet50' --infer_imgs='docs/images/' --save_dir='./output_pre_label/'
```
+
### 4.8 Specify the mapping between class id and label name
You can specify the mapping between class IDs and label names; you only need to use `class_id_map_file` to specify the mapping file. PaddleClas uses the `ImageNet1K` mapping by default.
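+
+For reference, the mapping file is a plain-text file with one `class_id class_name` pair per line (the default `ImageNet1K` mapping uses this format). A hypothetical two-class mapping file `./my_label_map.txt` and its usage could look like this:
+
+```
+0 nobody
+1 someone
+```
+
+* Python
+```python
+from paddleclas import PaddleClas
+# './my_label_map.txt' is a hypothetical custom mapping file.
+clas = PaddleClas(model_name='ResNet50', class_id_map_file='./my_label_map.txt')
+```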
diff --git a/docs/images/PULC/docs/car_exists_data_demo.jpeg b/docs/images/PULC/docs/car_exists_data_demo.jpeg
new file mode 100644
index 0000000000000000000000000000000000000000..9959954b6b8bf27589e1d2081f86c6078d16e2c1
Binary files /dev/null and b/docs/images/PULC/docs/car_exists_data_demo.jpeg differ
diff --git a/docs/images/PULC/docs/language_classification_original_data.png b/docs/images/PULC/docs/language_classification_original_data.png
new file mode 100644
index 0000000000000000000000000000000000000000..42c4a03ebe3df6b4563e6f006d61faa0a4b1fdea
Binary files /dev/null and b/docs/images/PULC/docs/language_classification_original_data.png differ
diff --git a/docs/images/PULC/docs/person_attribute_data_demo.png b/docs/images/PULC/docs/person_attribute_data_demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..c9b276af0a554bbe07d807224d56fbbe5e2b7400
Binary files /dev/null and b/docs/images/PULC/docs/person_attribute_data_demo.png differ
diff --git a/docs/images/PULC/docs/person_exists_data_demo.png b/docs/images/PULC/docs/person_exists_data_demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..b74ab64b6f62b83880aa426c1d05cb1fc53840e4
Binary files /dev/null and b/docs/images/PULC/docs/person_exists_data_demo.png differ
diff --git a/docs/images/PULC/docs/safety_helmet_data_demo.jpg b/docs/images/PULC/docs/safety_helmet_data_demo.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..70bd2d952fd20e6f8fe39182914e400177d913c4
Binary files /dev/null and b/docs/images/PULC/docs/safety_helmet_data_demo.jpg differ
diff --git a/docs/images/PULC/docs/text_image_orientation_data_demo.png b/docs/images/PULC/docs/text_image_orientation_data_demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..756b18e03077f7c631deb39390aa84ba0f4580ae
Binary files /dev/null and b/docs/images/PULC/docs/text_image_orientation_data_demo.png differ
diff --git a/docs/images/PULC/docs/text_image_orientation_original_data.png b/docs/images/PULC/docs/text_image_orientation_original_data.png
new file mode 100644
index 0000000000000000000000000000000000000000..9014179214224c21f50a595f414617ab12538b8e
Binary files /dev/null and b/docs/images/PULC/docs/text_image_orientation_original_data.png differ
diff --git a/docs/images/PULC/docs/textline_orientation_data_demo.png b/docs/images/PULC/docs/textline_orientation_data_demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..fcb48732026e48e14a616967ee06904c2feb9449
Binary files /dev/null and b/docs/images/PULC/docs/textline_orientation_data_demo.png differ
diff --git a/docs/images/PULC/docs/traffic_sign_data_demo.png b/docs/images/PULC/docs/traffic_sign_data_demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..6fac97a299b6fbf037a931f7ba56607f791271f3
Binary files /dev/null and b/docs/images/PULC/docs/traffic_sign_data_demo.png differ
diff --git a/docs/images/PULC/docs/vehicle_attribute_data_demo.png b/docs/images/PULC/docs/vehicle_attribute_data_demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..68c67acb331de19b688b9b9111fb8c20ff42fc2a
Binary files /dev/null and b/docs/images/PULC/docs/vehicle_attribute_data_demo.png differ
diff --git a/docs/images/algorithm_introduction/hnsw.png b/docs/images/algorithm_introduction/hnsw.png
new file mode 100644
index 0000000000000000000000000000000000000000..eeacd32bd31e690bca2363932ca7ab9d78750313
Binary files /dev/null and b/docs/images/algorithm_introduction/hnsw.png differ
diff --git a/docs/images/class_simple.gif b/docs/images/class_simple.gif
new file mode 100644
index 0000000000000000000000000000000000000000..c30122dfa239e14901738f0c6583be6a259d339f
Binary files /dev/null and b/docs/images/class_simple.gif differ
diff --git a/docs/images/class_simple_en.gif b/docs/images/class_simple_en.gif
new file mode 100644
index 0000000000000000000000000000000000000000..14c3a678f6b0ba81b7761c397ddc97826817409a
Binary files /dev/null and b/docs/images/class_simple_en.gif differ
diff --git a/docs/images/classification.gif b/docs/images/classification.gif
new file mode 100644
index 0000000000000000000000000000000000000000..db2ff2a56be31793402a350f68e59eb924d7c1bf
Binary files /dev/null and b/docs/images/classification.gif differ
diff --git a/docs/images/classification_en.gif b/docs/images/classification_en.gif
new file mode 100644
index 0000000000000000000000000000000000000000..884d5ba1453a3c717a9060e3a9831ea6e5160e7d
Binary files /dev/null and b/docs/images/classification_en.gif differ
diff --git a/docs/zh_CN/PULC/PULC_car_exists.md b/docs/zh_CN/PULC/PULC_car_exists.md
new file mode 100644
index 0000000000000000000000000000000000000000..4107363534f9c76508d660ffb7d69dc705076a1a
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_car_exists.md
@@ -0,0 +1,470 @@
+# PULC 有车/无车分类模型
+
+------
+
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的有车/无车的分类模型。该模型可以广泛应用于如监控场景、海量数据过滤场景等。
+
+下表列出了判断图片中是否有车的二分类模型的相关指标,前两行展现了使用 SwinTranformer_tiny 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第三行至第六行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。
+
+
+| 模型 | Tpr(%)@Fpr0.01 | 延时(ms) | 存储(M) | 策略 |
+|-------|----------------|----------|---------------|---------------|
+| SwinTranformer_tiny | 97.71 | 95.30 | 111 | 使用 ImageNet 预训练模型 |
+| MobileNetV3_small_x0_35 | 81.23 | 2.85 | 2.7 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 94.72 | 2.12 | 7.1 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 95.48 | 2.12 | 7.1 | 使用 SSLD 预训练模型 |
+| PPLCNet_x1_0 | 95.48 | 2.12 | 7.1 | 使用 SSLD 预训练模型+EDA 策略|
+| PPLCNet_x1_0 | 95.92 | 2.12 | 7.1 | 使用 SSLD 预训练模型+EDA 策略+SKL-UGI 知识蒸馏策略|
+
+从表中可以看出,backbone 为 SwinTranformer_tiny 时精度较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,但是会导致精度大幅下降。将 backbone 替换为速度更快的 PPLCNet_x1_0 时,精度较 MobileNetV3_small_x0_35 高 13 个百分点,与此同时速度依旧可以快 20% 以上。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升约 0.7 个百分点,进一步地,在使用 SKL-UGI 知识蒸馏后,精度可以继续提升 0.44 个百分点。此时,PPLCNet_x1_0 达到了接近 SwinTranformer_tiny 模型的精度,但是速度快 40 多倍。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* `Tpr`指标的介绍可以参考 [3.3节](#3.3)的备注部分,延时是基于 Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,开启 MKLDNN 加速策略,线程数为10。
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=car_exists --infer_imgs=pulc_demo_imgs/car_exists/objects365_00001507.jpeg
+```
+
+结果如下:
+```
+>>> result
+class_ids: [1], scores: [0.9871138], label_names: ['contains_car'], filename: pulc_demo_imgs/car_exists/objects365_00001507.jpeg
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="car_exists")
+result = model.predict(input_data="pulc_demo_imgs/car_exists/objects365_00001507.jpeg")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="car_exists", batch_size=2)`, 使用默认的代码返回结果示例如下:
+
+```
+>>> result
+[{'class_ids': [1], 'scores': [0.9871138], 'label_names': ['contains_car'], 'filename': 'pulc_demo_imgs/car_exists/objects365_00001507.jpeg'}]
+```
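+
+下面给出对整个文件夹进行批量预测的一个示意写法(仅为示意,`batch_size=2` 与文件夹路径均为假设值),通过 `for` 循环对返回的 `generator` 进行迭代:
+
+```python
+import paddleclas
+
+# 示意:以 batch_size=2 对文件夹内的所有图片进行批量预测(路径为假设值)
+model = paddleclas.PaddleClas(model_name="car_exists", batch_size=2)
+results = model.predict(input_data="pulc_demo_imgs/car_exists/")
+for batch in results:  # 每次迭代返回一个 batch 的预测结果列表
+    print(batch)
+```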
+
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档[环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+本案例中所使用的所有数据集均为开源数据,`train`和`val` 集合均为[Objects365 数据](https://www.objects365.org/overview.html)的子集,`ImageNet_val` 为[ImageNet-1k 数据](https://www.image-net.org/)的验证集。
+
+
+
+#### 3.2.2 数据集获取
+
+在公开数据集的基础上经过后处理即可得到本案例需要的数据,具体处理方法如下:
+
+- 训练集合,本案例处理了 Objects365 数据训练集的标注文件,如果某张图含有“car”的标签,且这个框的面积在整张图中的占比大于 10%,即认为该张图中含有车;如果某张图中没有任何与交通工具(例如 car、bus 等)相关的标签,则认为该张图中不含有车。经过处理后,得到 108629 条可用数据,其中有车的数据有 27422 条,无车的数据 81207 条。
+
+- 验证集合,处理方法与训练集相同,数据来源于 Objects365 数据集的验证集。为了测试结果准确,验证集经过人工校正,去除了一些可能存在标注错误的图像。
+
+* 注:由于 Objects365 的标签并不是完全互斥的,例如一张 F1 赛车的图可能被标注为“F1 Formula”,也可能被标注为“car”。为了减轻干扰,我们仅保留含“car”标签的图作为有车,而将不含任何交通工具标签的图作为无车(下方给出该筛选规则的一个示意脚本)。
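+
+下面给出上述筛选规则的一个示意脚本(仅为示意,非实际处理代码,假设标注文件为 COCO 风格,其中的字段名与交通工具类别集合均为假设,以实际标注文件为准):
+
+```python
+# 示意:按上述规则为单张图打“有车/无车”标签(字段名与类别集合均为假设)
+VEHICLE_NAMES = {"car", "bus", "truck", "motorcycle"}  # 交通工具相关类别,仅为示例
+
+def label_image(image_info, annos, categories):
+    w, h = image_info["width"], image_info["height"]
+    has_vehicle = False
+    for anno in annos:  # 该图的全部标注框
+        name = categories[anno["category_id"]]
+        if name in VEHICLE_NAMES:
+            has_vehicle = True
+        # "car" 框面积占整图比例大于 10%,判定为“有车”
+        if name == "car" and anno["bbox"][2] * anno["bbox"][3] / (w * h) > 0.1:
+            return 1  # contains_car
+    # 不含任何交通工具标签的图判定为“无车”,其余图不使用
+    return 0 if not has_vehicle else None
+```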
+
+处理后的数据集部分数据可视化如下:
+
+
+
+此处提供了经过上述方法处理好的数据,可以直接下载得到。
+
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压有车/无车场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/car_exists.tar
+tar -xf car_exists.tar
+cd ../
+```
+
+执行上述命令后,`dataset/` 下存在 `car_exists` 目录,该目录中具有以下数据:
+
+```
+
+├── objects365_car
+│ ├── objects365_00000039.jpg
+│ ├── objects365_00000099.jpg
+├── ImageNet_val
+│ ├── ILSVRC2012_val_00000001.JPEG
+│ ├── ILSVRC2012_val_00000002.JPEG
+...
+├── train_list.txt
+├── train_list.txt.debug
+├── train_list_for_distill.txt
+├── val_list.txt
+└── val_list.txt.debug
+```
+
+其中 `train/` 和 `val/` 分别为训练集和验证集。`train_list.txt` 和 `val_list.txt` 分别为训练集和验证集的标签文件,`train_list.txt.debug` 和 `val_list.txt.debug` 分别为训练集和验证集的 `debug` 标签文件,其分别是 `train_list.txt` 和 `val_list.txt` 的子集,用该文件可以快速体验本案例的流程。`ImageNet_val/` 是 ImageNet-1k 的验证集,该集合和 `train` 集合的混合数据用于本案例的 `SKL-UGI知识蒸馏策略`,对应的训练标签文件为 `train_list_for_distill.txt` 。
+
+**备注:**
+
+* 关于 `train_list.txt`、`val_list.txt` 的格式说明,可以参考 [PaddleClas 分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明),下方也给出了一个简单的格式示意。
+
+* 关于如何得到蒸馏的标签文件可以参考[知识蒸馏标签获得方法](../advanced_tutorials/ssld.md#3.2)。
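+
+上文提到的 `train_list.txt` 与 `val_list.txt` 为纯文本文件,每行由图片相对路径和类别 id(`0` 表示无车,`1` 表示有车)组成,中间以空格分隔。以下仅为格式示意,文件名取自上面的目录结构,标签为假设值:
+
+```
+objects365_car/objects365_00000039.jpg 1
+objects365_car/objects365_00000099.jpg 0
+```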
+
+
+
+
+### 3.3 模型训练
+
+
+在 `ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml
+```
+
+验证集的最佳指标在 `0.95-0.96` 之间(数据集较小,容易造成波动)。
+
+**备注:**
+
+* 此时使用的指标为Tpr,该指标描述了在假正类率(Fpr)小于某一个指标时的真正类率(Tpr),是产业中二分类问题常用的指标之一。在本案例中,Fpr 为 1/100 。关于 Fpr 和 Tpr 的更多介绍,可以参考[这里](https://baike.baidu.com/item/AUC/19282953)。
+
+* 在 eval 时,会打印出当前最佳的 TprAtFpr 指标,具体地,其会打印当前的 `Fpr`、`Tpr` 值,以及当前的 `threshold` 值。`Tpr` 值反映了在当前 `Fpr` 值下的召回率,该值越高,代表模型越好;`threshold` 表示当前最佳 `Fpr` 所对应的分类阈值,可用于后续模型部署落地等。该指标的计算方式可参考下方的示意实现。
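+
+下面给出 TprAtFpr 指标计算方式的一个 numpy 示意实现(仅为帮助理解,并非 PaddleClas 的源码实现):
+
+```python
+import numpy as np
+
+def tpr_at_fpr(scores, labels, max_fpr=0.01):
+    """在 Fpr 不超过 max_fpr 的前提下,返回最佳 Tpr 及对应的分类阈值。"""
+    scores, labels = np.asarray(scores), np.asarray(labels)
+    num_neg = max((labels == 0).sum(), 1)
+    num_pos = max((labels == 1).sum(), 1)
+    best_tpr, best_thr = 0.0, 1.0
+    for thr in np.unique(scores):  # 遍历所有候选阈值
+        pred = scores >= thr
+        fpr = (pred & (labels == 0)).sum() / num_neg  # 假正类率
+        tpr = (pred & (labels == 1)).sum() / num_pos  # 真正类率
+        if fpr <= max_fpr and tpr > best_tpr:
+            best_tpr, best_thr = tpr, float(thr)
+    return best_tpr, best_thr
+```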
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [1], 'scores': [0.9871138], 'label_names': ['contains_car'], 'filename': 'deploy/images/PULC/car_exists/objects365_00001507.jpeg'}]
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `deploy/images/PULC/car_exists/objects365_00001507.jpeg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+* 二分类默认的阈值为0.5, 如果需要指定阈值,可以重写 `Infer.PostProcess.threshold` ,如`-o Infer.PostProcess.threshold=0.9794`,该值需要根据实际场景来确定,此处的 `0.9794` 是在该场景中的 `val` 数据集在百分之一 Fpr 下得到的最佳 Tpr 所得到的。
+
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用 `ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml` 中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+验证集的最佳指标在 `0.96-0.98` 之间,当前教师模型最好的权重保存在 `output/ResNet101_vd/best_model.pdparams`。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/car_exists/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型,使用ImageNet数据集的验证集作为新增的无标签数据。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+验证集的最佳指标在 `0.95-0.97` 之间,当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`。
+
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用 MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于 Paddle Inference 推理引擎的介绍,可以参考 [Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/car_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_car_exists_infer
+```
+执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_car_exists_infer` 文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+├── PPLCNet_x1_0_car_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载 inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/car_exists_infer.tar && tar -xf car_exists_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── car_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/car_exists/objects365_00001507.jpeg` 进行有车/无车分类。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/car_exists/inference_car_exists.yaml
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/car_exists/inference_car_exists.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+objects365_00001507.jpeg: class id(s): [1], score(s): [0.99], label_name(s): ['contains_car']
+```
+
+
+**备注:** 二分类默认的阈值为0.5, 如果需要指定阈值,可以重写 `Infer.PostProcess.threshold` ,如`-o Infer.PostProcess.threshold=0.9794`,该值需要根据实际场景来确定,此处的 `0.9794` 是在该场景中的 `val` 数据集在百分之一 Fpr 下得到的最佳 Tpr 所得到的。该阈值的确定方法可以参考[3.3节](#3.3)备注部分。
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/car_exists/inference_car_exists.yaml -o Global.infer_imgs="./images/PULC/car_exists/"
+```
+
+终端中会输出该文件夹内所有图像的分类结果,如下所示。
+
+```
+objects365_00001507.jpeg: class id(s): [1], score(s): [0.99], label_name(s): ['contains_car']
+objects365_00001521.jpeg: class id(s): [0], score(s): [0.99], label_name(s): ['no_car']
+```
+
+其中,`contains_car` 表示该图里存在车,`no_car` 表示该图里不存在车。
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_language_classification.md b/docs/zh_CN/PULC/PULC_language_classification.md
new file mode 100644
index 0000000000000000000000000000000000000000..309f3e9cc8a0c3c519722baeb13e5b90a8312e51
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_language_classification.md
@@ -0,0 +1,453 @@
+# PULC 语种分类模型
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的语种分类模型。使用该方法训练得到的模型可以快速判断图片中的文字语种,该模型可以广泛应用于金融、政务等各种涉及多语种OCR处理的场景中。
+
+下表列出了语种分类模型的相关指标,前两行展现了使用 SwinTranformer_tiny 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第三行至第六行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。其中替换 backbone 为 PPLCNet_x1_0时,将数据预处理时的输入尺寸变为[192,48],且网络的下采样stride调整为[2, [2, 1], [2, 1], [2, 1], [2, 1]]。
+
+| 模型 | 精度 | 延时 | 存储 | 策略 |
+| ----------------------- | --------- | -------- | ------- | ---------------------------------------------- |
+| SwinTranformer_tiny | 98.12 | 89.09 | 111 | 使用ImageNet预训练模型 |
+| MobileNetV3_small_x0_35 | 95.92 | 2.98 | 3.7 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 98.35 | 2.58 | 7.1 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 98.7 | 2.58 | 7.1 | 使用SSLD预训练模型 |
+| PPLCNet_x1_0 | 99.12 | 2.58 | 7.1 | 使用SSLD预训练模型+EDA策略 |
+| **PPLCNet_x1_0** | **99.26** | **2.58** | **7.1** | 使用SSLD预训练模型+EDA策略+SKL-UGI知识蒸馏策略 |
+
+从表中可以看出,backbone 为 SwinTranformer_tiny 时精度比较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度提升明显,但精度有了大幅下降。将 backbone 替换为 PPLCNet_x1_0 且调整预处理输入尺寸和网络的下采样 stride 时,速度略为提升,同时精度较 MobileNetV3_small_x0_35 高 2.43 个百分点。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升 0.35 个百分点,进一步地,当融合 EDA 策略后,精度可以再提升 0.42 个百分点,最后,在使用 SKL-UGI 知识蒸馏后,精度可以继续提升 0.14 个百分点。此时,PPLCNet_x1_0 超过了 SwinTranformer_tiny 模型的精度,并且速度有了明显提升。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=language_classification --infer_imgs=pulc_demo_imgs/language_classification/word_35404.png
+```
+
+结果如下:
+```
+>>> result
+class_ids: [4, 6], scores: [0.88672, 0.01434], label_names: ['japan', 'korean'], filename: pulc_demo_imgs/language_classification/word_35404.png
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="language_classification")
+result = model.predict(input_data="pulc_demo_imgs/language_classification/word_35404.png")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="language_classification", batch_size=2)`, 使用默认的代码返回结果示例如下:
+
+```
+>>> result
+[{'class_ids': [4, 6], 'scores': [0.88672, 0.01434], 'label_names': ['japan', 'korean'], 'filename': 'pulc_demo_imgs/language_classification/word_35404.png'}]
+```
+
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+[第1节](#1)中提供的模型使用内部数据训练得到,该数据集暂时不方便公开。这里基于 [Multi-lingual scene text detection and recognition](https://rrc.cvc.uab.es/?ch=15&com=downloads) 开源数据集构造了一个多语种demo数据集,用于体验本案例的预测过程。
+
+
+
+
+
+#### 3.2.2 数据集获取
+
+[第1节](#1)中提供的模型共支持10个类别,分别为:
+
+`0` 表示阿拉伯语(arabic);`1` 表示中文繁体(chinese_cht);`2` 表示斯拉夫语(cyrillic);`3` 表示梵文(devanagari);`4` 表示日语(japan);`5` 表示卡纳达文(ka);`6` 表示韩语(korean);`7` 表示泰米尔文(ta);`8` 表示泰卢固文(te);`9` 表示拉丁语(latin)。
+
+在 Multi-lingual scene text detection and recognition 数据集中,仅包含了阿拉伯语、日语、韩语和拉丁语数据,这里分别将 4 个语种的数据各抽取 1600 张作为本案例的训练数据,300 张作为测试数据,以及 400 张作为补充数据和训练数据混合用于本案例的`SKL-UGI知识蒸馏策略`实验。
+
+因此,对于本案例中的demo数据集,类别为:
+
+`0` 表示阿拉伯语(arabic);`1` 表示日语(japan);`2` 表示韩语(korean);`3` 表示拉丁语(latin)。
+
+如果想要制作自己的多语种数据集,可以按照需求收集并整理自己任务中需要语种的数据,此处提供了经过上述方法处理好的demo数据,可以直接下载得到。
+
+**备注:** 语种分类任务中的图片数据需要将整图中的文字区域抠取出来,仅使用文本行部分作为图片数据(下方给出一个抠取的示意代码)。
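+
+下面给出抠取文本行区域的一个示意代码(仅为示意,假设已通过文字检测模型得到文本行的矩形框坐标,文件名与坐标均为假设值):
+
+```python
+from PIL import Image
+
+# 示意:根据文字检测得到的矩形框抠取文本行(文件名与坐标均为假设值)
+img = Image.open("demo.jpg")
+box = (60, 120, 380, 160)  # (left, top, right, bottom),来自文字检测结果
+textline = img.crop(box)
+textline.save("word_demo.png")  # 保存后即可作为语种分类模型的输入
+```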
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压多语种场景的demo数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/language_classification.tar
+tar -xf language_classification.tar
+cd ../
+```
+
+执行上述命令后,`dataset/`下存在`language_classification`目录,该目录中具有以下数据:
+
+```
+├── img
+│ ├── word_1.png
+│ ├── word_2.png
+...
+├── train_list.txt
+├── train_list_for_distill.txt
+├── test_list.txt
+└── label_list.txt
+```
+
+其中`img/`存放了 4 种语言总计 9200 张数据。`train_list.txt`和`test_list.txt`分别为训练集和验证集的标签文件,`label_list.txt`是 4 类语言分类模型对应的类别列表,`SKL-UGI 知识蒸馏策略`对应的训练标签文件为`train_list_for_distill.txt`。用这些图片可以快速体验本案例中模型的训练预测过程。
+
+***备注:***
+
+- 这里的`label_list.txt`是 4 类语种分类模型对应的类别列表(内容示意见下方),如果自己构造的数据集语种类别发生变化,需要自行调整。
+- 如果想要自己构造训练集和验证集,可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
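+
+上文提到的 `label_list.txt`,其内容示意如下(每行由类别 id 与语种名称组成,中间以空格分隔,具体以解压后的文件为准):
+
+```
+0 arabic
+1 japan
+2 korean
+3 latin
+```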
+
+
+
+### 3.3 模型训练
+
+在`ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml`中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Arch.class_num=4
+```
+
+- 由于本文档中的demo数据集的类别数量为 4,所以需要添加`-o Arch.class_num=4`来将模型的类别数量指定为4。
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model" \
+ -o Arch.class_num=4
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model" \
+ -o Arch.class_num=4
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [4, 9], 'scores': [0.96809, 0.01001], 'file_name': 'deploy/images/PULC/language_classification/word_35404.png', 'label_names': ['japan', 'latin']}]
+```
+
+***备注:***
+
+- 其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+- 默认是对 `deploy/images/PULC/language_classification/word_35404.png` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+- 预测输出为top2的预测结果,`japan` 表示该图中文字语种识别为日语,`latin` 表示该图中文字语种识别为拉丁语。
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用`ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml`中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd \
+ -o Arch.class_num=4
+```
+
+当前教师模型最好的权重保存在`output/ResNet101_vd/best_model.pdparams`。
+
+**备注:** 训练ResNet101_vd模型需要的显存较多,如果机器显存不够,可以将学习率和 batch size 同时缩小一定的倍数进行训练。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/language_classification/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型,使用[3.2.2节](#3.2.2)中介绍的蒸馏数据作为新增的无标签数据。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model \
+ -o Arch.class_num=4
+```
+
+当前模型最好的权重保存在`output/DistillationModel/best_model_student.pdparams`。
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_language_classification_infer
+```
+
+执行完该脚本后会在`deploy/models/`下生成`PPLCNet_x1_0_language_classification_infer`文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+├── PPLCNet_x1_0_language_classification_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/language_classification_infer.tar && tar -xf language_classification_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── language_classification_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/language_classification/word_35404.png` 进行语种分类。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/language_classification/inference_language_classification.yaml
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/language_classification/inference_language_classification.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+word_35404.png: class id(s): [4, 6], score(s): [0.89, 0.01], label_name(s): ['japan', 'korean']
+```
+
+其中,输出为top2的预测结果,`japan` 表示该图中文字语种为日语,`korean` 表示该图中文字语种为韩语。
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/language_classification/inference_language_classification.yaml -o Global.infer_imgs="./images/PULC/language_classification/"
+```
+
+终端中会输出该文件夹内所有图像的分类结果,如下所示。
+
+```
+word_17.png: class id(s): [9, 4], score(s): [0.80, 0.09], label_name(s): ['latin', 'japan']
+word_20.png: class id(s): [0, 4], score(s): [0.91, 0.02], label_name(s): ['arabic', 'japan']
+word_35404.png: class id(s): [4, 6], score(s): [0.89, 0.01], label_name(s): ['japan', 'korean']
+```
+
+其中,输出为top2的预测结果,`japan` 表示该图中文字语种为日语,`latin` 表示该图中文字语种为拉丁语,`arabic` 表示该图中文字语种为阿拉伯语,`korean` 表示该图中文字语种为韩语。
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_model_list.md b/docs/zh_CN/PULC/PULC_model_list.md
new file mode 100644
index 0000000000000000000000000000000000000000..4b2d7a8774d7d64a634a1bebc96481fc2ad076eb
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_model_list.md
@@ -0,0 +1,25 @@
+# PULC 模型库
+
+------
+
+此处提供了 PULC 模型库的相关指标和模型的下载链接,其中预训练模型可以用来微调训练,推理模型可以直接用来预测和部署。
+
+
+|模型名称|模型简介|模型精度 |模型大小|推理耗时|下载地址|
+| --- | --- | --- | --- | --- | --- |
+| person_exists |[PULC有人/无人分类模型](PULC_person_exists.md)| 96.23 |7.0M|2.58ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_exists_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_exists_pretrained.pdparams)|
+| person_attribute |[PULC人体属性识别模型](PULC_person_attribute.md)| 78.59 |7.2M|2.01ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_attribute_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_attribute_pretrained.pdparams)|
+| safety_helmet |[PULC佩戴安全帽分类模型](PULC_safety_helmet.md)| 99.38 |7.1M|2.03ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/safety_helmet_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/safety_helmet_pretrained.pdparams)|
+| traffic_sign |[PULC交通标志分类模型](PULC_traffic_sign.md)| 98.35 |8.2M|2.10ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/traffic_sign_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/traffic_sign_pretrained.pdparams)|
+| vehicle_attribute |[PULC车辆属性识别模型](PULC_vehicle_attribute.md)| 90.81 |7.2M|2.36ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/vehicle_attribute_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/vehicle_attribute_pretrained.pdparams)|
+| car_exists |[PULC有车/无车分类模型](PULC_car_exists.md) | 95.92 | 7.1M | 2.38ms |[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/car_exists_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/car_exists_pretrained.pdparams)|
+| text_image_orientation |[PULC含文字图像方向分类模型](PULC_text_image_orientation.md)| 99.06 | 7.1M | 2.16ms |[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/text_image_orientation_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/text_image_orientation_pretrained.pdparams)|
+| textline_orientation |[PULC文本行方向分类模型](PULC_textline_orientation.md)| 96.01 |7.0M|2.72ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/textline_orientation_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/textline_orientation_pretrained.pdparams)|
+| language_classification |[PULC语种分类模型](PULC_language_classification.md)| 99.26 |7.1M|2.58ms|[推理模型](https://paddleclas.bj.bcebos.com/models/PULC/inference/language_classification_infer.tar) / [预训练模型](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/language_classification_pretrained.pdparams)|
+
+
+**备注:**
+
+* 以上所有的模型的 backbone 均为 PPLCNet_x1_0,部分模型大小不同是由于分类的输出大小不同导致的,推理耗时是基于Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,其中测试过程开启 MKLDNN 加速策略,线程数为10。速度测试过程会有轻微波动。
+
+* person_exists、safety_helmet、car_exists 的评测指标为 TprAtFpr;person_attribute、vehicle_attribute 的评测指标为 mA;traffic_sign、text_image_orientation、textline_orientation、language_classification 的评测指标为 Top-1 Acc。
diff --git a/docs/zh_CN/PULC/PULC_person_attribute.md b/docs/zh_CN/PULC/PULC_person_attribute.md
new file mode 100644
index 0000000000000000000000000000000000000000..a144aed80b1e3b3ccca6a530c3f8392a057e3190
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_person_attribute.md
@@ -0,0 +1,453 @@
+# PULC 人体属性识别模型
+
+------
+
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的人体属性识别模型。该模型可以广泛应用于行人分析、行人跟踪等场景。
+
+下表列出了不同人体属性识别模型的相关指标,前三行展现了使用 SwinTransformer_tiny、Res2Net200_vd_26w_4s 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第四行至第七行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。
+
+
+| 模型 | mA(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| Res2Net200_vd_26w_4s | 81.25 | 77.51 | 293 | 使用ImageNet预训练模型 |
+| SwinTransformer_tiny | 80.17 | 89.51 | 111 | 使用ImageNet预训练模型 |
+| MobileNetV3_small_x0_35 | 70.79 | 2.90 | 1.7 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 76.31 | 2.01 | 7.1 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 77.31 | 2.01 | 7.1 | 使用SSLD预训练模型 |
+| PPLCNet_x1_0 | 77.71 | 2.01 | 7.1 | 使用SSLD预训练模型+EDA策略|
+| PPLCNet_x1_0 | 78.59 | 2.01 | 7.1 | 使用SSLD预训练模型+EDA策略+SKL-UGI知识蒸馏策略|
+
+从表中可以看出,backbone 为 Res2Net200_vd_26w_4s 和 SwinTransformer_tiny 时精度较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,但是精度也大幅下降。将 backbone 替换为 PPLCNet_x1_0 时,精度较 MobileNetV3_small_x0_35 高 5.5%,与此同时,速度更快。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升 1%,进一步地,当融合 EDA 策略后,精度可以再提升 0.4%,最后,在使用 SKL-UGI 知识蒸馏后,精度可以继续提升 0.88%。此时,PPLCNet_x1_0 的精度与 SwinTransformer_tiny 仅相差 1.58%,但是速度快 44 倍。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* 延时是基于 Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,开启 MKLDNN 加速策略,线程数为10。
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=person_attribute --infer_imgs=pulc_demo_imgs/person_attribute/090004.jpg
+```
+
+结果如下:
+```
+>>> result
+attributes: ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], output: [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1], filename: pulc_demo_imgs/person_attribute/090004.jpg
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="person_attribute")
+result = model.predict(input_data="pulc_demo_imgs/person_attribute/090004.jpg")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="person_attribute", batch_size=2)`, 使用默认的代码返回结果示例如下:
+
+```
+>>> result
+[{'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1], 'filename': 'pulc_demo_imgs/person_attribute/090004.jpg'}]
+```
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+本案例中所使用的数据为[pa100k 数据集](https://www.v7labs.com/open-datasets/pa-100k)。
+
+
+
+#### 3.2.2 数据集获取
+
+部分数据可视化如下所示。
+
+
+

+
+
+
+我们将原始数据转换成了 PaddleClas 可读的多标签数据格式,可以直接下载使用。
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压人体属性识别场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/pa100k.tar
+tar -xf pa100k.tar
+cd ../
+```
+
+执行上述命令后,`dataset/` 下存在 `pa100k` 目录,该目录中具有以下数据:
+
+```
+pa100k
+├── train
+│ ├── 000001.jpg
+│ ├── 000002.jpg
+...
+├── val
+│ ├── 080001.jpg
+│ ├── 080002.jpg
+...
+├── test
+│ ├── 090001.jpg
+│ ├── 090002.jpg
+...
+...
+├── train_list.txt
+├── train_val_list.txt
+├── val_list.txt
+├── test_list.txt
+```
+
+其中`train/`、`val/`、`test/`分别为训练集、验证集和测试集。`train_list.txt`、`val_list.txt`、`test_list.txt`分别为训练集、验证集、测试集的标签文件(格式示意见下方)。在本例子中,`test_list.txt`暂时没有使用。
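+
+与单标签分类不同,人体属性识别是多标签任务,标签文件的每一行由图片路径与逗号分隔的 0/1 属性向量组成,两者之间以制表符分隔。以下内容仅为格式示意(标签向量为假设值),具体以解压后的文件为准:
+
+```
+train/000001.jpg	0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,1
+```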
+
+
+
+
+### 3.3 模型训练
+
+
+在 `ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml
+```
+
+验证集的最佳指标在 `77.71%` 左右(数据集较小,一般有0.3%左右的波动)。
+
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+输出结果如下:
+
+```
+[{'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}]
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `deploy/images/PULC/person_attribute/090004.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用 `ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml` 中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+验证集的最佳指标为 `80.10%` 左右,当前教师模型最好的权重保存在 `output/ResNet101_vd/best_model.pdparams`。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/person_attribute/PPLCNet_x1_0_Distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0_Distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+验证集的最佳指标为 `78.5%` 左右,当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`。
+
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person_attribute_infer
+```
+执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_person_attribute_infer` 文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+├── PPLCNet_x1_0_person_attribute_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载 inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/person_attribute_infer.tar && tar -xf person_attribute_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── person_attribute_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/person_attribute/090004.jpg` 进行人体属性识别。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.use_gpu=True
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+090004.jpg: {'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}
+```
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/person_attribute/inference_person_attribute.yaml -o Global.infer_imgs="./images/PULC/person_attribute/"
+```
+
+终端中会输出该文件夹内所有图像的属性识别结果,如下所示。
+
+```
+090004.jpg: {'attributes': ['Male', 'Age18-60', 'Back', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'Backpack', 'Upper: LongSleeve UpperPlaid', 'Lower: Trousers', 'No boots'], 'output': [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]}
+090007.jpg: {'attributes': ['Female', 'Age18-60', 'Side', 'Glasses: False', 'Hat: False', 'HoldObjectsInFront: False', 'No bag', 'Upper: ShortSleeve', 'Lower: Skirt&Dress', 'No boots'], 'output': [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0]}
+```
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_person_cls.md b/docs/zh_CN/PULC/PULC_person_cls.md
deleted file mode 100644
index ff3508c35c3ff9394da9f5c82e0b4001ee8394a3..0000000000000000000000000000000000000000
--- a/docs/zh_CN/PULC/PULC_person_cls.md
+++ /dev/null
@@ -1,332 +0,0 @@
-# PaddleClas构建有人/无人分类案例
-
-此处提供了用户使用 PaddleClas 快速构建轻量级、高精度、可落地的有人/无人的分类模型教程,主要基于有人/无人场景的数据,融合了轻量级骨干网络PPLCNet、SSLD预训练权重、EDA数据增强策略、SKL-UGI知识蒸馏策略、SHAS超参数搜索策略,得到精度高、速度快、易于部署的二分类模型。
-
-------
-
-
-## 目录
-
-- [1. 环境配置](#1)
-- [2. 有人/无人场景推理预测](#2)
- - [2.1 下载模型](#2.1)
- - [2.2 模型推理预测](#2.2)
- - [2.2.1 预测单张图像](#2.2.1)
- - [2.2.2 基于文件夹的批量预测](#2.2.2)
-- [3.有人/无人场景训练](#3)
- - [3.1 数据准备](#3.1)
- - [3.2 模型训练](#3.2)
- - [3.2.1 基于默认超参数训练](#3.2.1)
- - [3.2.1.1 基于默认超参数训练轻量级模型](#3.2.1.1)
- - [3.2.1.2 基于默认超参数训练教师模型](#3.2.1.2)
- - [3.2.1.3 基于默认超参数进行蒸馏训练](#3.2.1.3)
- - [3.2.2 超参数搜索训练](#3.2)
-- [4. 模型评估与推理](#4)
- - [4.1 模型评估](#3.1)
- - [4.2 模型预测](#3.2)
- - [4.3 使用 inference 模型进行推理](#4.3)
- - [4.3.1 导出 inference 模型](#4.3.1)
- - [4.3.2 模型推理预测](#4.3.2)
-
-
-
-
-## 1. 环境配置
-
-* 安装:请先参考 [Paddle 安装教程](../installation/install_paddle.md) 以及 [PaddleClas 安装教程](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
-
-
-
-## 2. 有人/无人场景推理预测
-
-
-
-### 2.1 下载模型
-
-* 进入 `deploy` 运行目录。
-
-```
-cd deploy
-```
-
-下载有人/无人分类的模型。
-
-```
-mkdir models
-cd models
-# 下载inference 模型并解压
-wget https://paddleclas.bj.bcebos.com/models/PULC/person_cls_infer.tar && tar -xf person_cls_infer.tar
-```
-
-解压完毕后,`models` 文件夹下应有如下文件结构:
-
-```
-├── person_cls_infer
-│ ├── inference.pdiparams
-│ ├── inference.pdiparams.info
-│ └── inference.pdmodel
-```
-
-
-
-### 2.2 模型推理预测
-
-
-
-#### 2.2.1 预测单张图像
-
-返回 `deploy` 目录:
-
-```
-cd ../
-```
-
-运行下面的命令,对图像 `./images/PULC/person/objects365_02035329.jpg` 进行有人/无人分类。
-
-```shell
-# 使用下面的命令使用 GPU 进行预测
-python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o PostProcess.ThreshOutput.threshold=0.9794
-# 使用下面的命令使用 CPU 进行预测
-python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o PostProcess.ThreshOutput.threshold=0.9794 -o Global.use_gpu=False
-```
-
-输出结果如下。
-
-```
-objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
-```
-
-
-**备注:** 真实场景中往往需要在假正类率(Fpr)小于某一个指标下求真正类率(Tpr),该场景中的`val`数据集在千分之一Fpr下得到的最佳Tpr所得到的阈值为`0.9794`,故此处的`threshold`为`0.9794`。该阈值的确定方法可以参考[3.2节](#3.2)
-
-
-
-#### 2.2.2 基于文件夹的批量预测
-
-如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
-
-```shell
-# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
-python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o Global.infer_imgs="./images/PULC/person/"
-```
-
-终端中会输出该文件夹内所有图像的分类结果,如下所示。
-
-```
-objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
-objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
-```
-
-其中,`someone` 表示该图里存在人,`nobody` 表示该图里不存在人。
-
-
-
-## 3.有人/无人场景训练
-
-
-
-### 3.1 数据准备
-
-进入 PaddleClas 目录。
-
-```
-cd path_to_PaddleClas
-```
-
-进入 `dataset/` 目录,下载并解压有人/无人场景的数据。
-
-```shell
-cd dataset
-wget https://paddleclas.bj.bcebos.com/data/cls_demo/person.tar
-tar -xf person.tar
-cd ../
-```
-
-执行上述命令后,`dataset/`下存在`person`目录,该目录中具有以下数据:
-
-```
-
-├── train
-│ ├── 000000000009.jpg
-│ ├── 000000000025.jpg
-...
-├── val
-│ ├── objects365_01780637.jpg
-│ ├── objects365_01780640.jpg
-...
-├── ImageNet_val
-│ ├── ILSVRC2012_val_00000001.JPEG
-│ ├── ILSVRC2012_val_00000002.JPEG
-...
-├── train_list.txt
-├── train_list.txt.debug
-├── train_list_for_distill.txt
-├── val_list.txt
-└── val_list.txt.debug
-```
-
-其中`train/`和`val/`分别为训练集和验证集。`train_list.txt`和`val_list.txt`分别为训练集和验证集的标签文件,`train_list.txt.debug`和`val_list.txt.debug`分别为训练集和验证集的`debug`标签文件,其分别是`train_list.txt`和`val_list.txt`的子集,用该文件可以快速体验本案例的流程。`ImageNet_val/`是ImageNet的验证集,该集合和`train`集合的混合数据用于本案例的`SKL-UGI知识蒸馏策略`,对应的训练标签文件为`train_list_for_distill.txt`。
-
-* **注意**:
-
-* 本案例中所使用的所有数据集均为开源数据,`train`集合为[MS-COCO数据](https://cocodataset.org/#overview)的训练集的子集,`val`集合为[Object365数据](https://www.objects365.org/overview.html)的训练集的子集,`ImageNet_val`为[ImageNet数据](https://www.image-net.org/)的验证集。数据集的筛选流程可以参考[有人/无人场景数据集筛选方法]()。
-
-
-
-### 3.2 模型训练
-
-
-
-#### 3.2.1 基于默认超参数训练
-
-
-
-##### 3.2.1.1 基于默认超参数训练轻量级模型
-
-在`ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml`中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
-
-```shell
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python3 -m paddle.distributed.launch \
- --gpus="0,1,2,3" \
- tools/train.py \
- -c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml
-```
-
-验证集的最佳指标在0.94-0.95之间(数据集较小,容易造成波动)。
-
-**备注:**
-
-* 此时使用的指标为Tpr,该指标描述了在假正类率(Fpr)小于某一个指标时的真正类率(Tpr),是产业中二分类问题常用的指标之一。在本案例中,Fpr为千分之一。关于Fpr和Tpr的更多介绍,可以参考[这里](https://baike.baidu.com/item/AUC/19282953)。
-
-* 在eval时,会打印出来当前最佳的TprAtFpr指标,具体地,其会打印当前的`Fpr`、`Tpr`值,以及当前的`threshold`值,`Tpr`值反映了在当前`Fpr`值下的召回率,该值越高,代表模型越好。`threshold` 表示当前最佳`Fpr`所对应的分类阈值,可用于后续模型部署落地等。
-
-
-
-##### 3.2.1.2 基于默认超参数训练教师模型
-
-复用`ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml`中的超参数,训练教师模型,训练脚本如下:
-
-```shell
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python3 -m paddle.distributed.launch \
- --gpus="0,1,2,3" \
- tools/train.py \
- -c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
- -o Arch.name=ResNet101_vd
-```
-
-验证集的最佳指标为0.96-0.98之间,当前教师模型最好的权重保存在`output/ResNet101_vd/best_model.pdparams`。
-
-
-
-##### 3.2.1.3 基于默认超参数进行蒸馏训练
-
-配置文件`ppcls/configs/PULC/PULC/Distillation/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型,使用ImageNet数据集的验证集作为新增的无标签数据。训练脚本如下:
-
-```shell
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python3 -m paddle.distributed.launch \
- --gpus="0,1,2,3" \
- tools/train.py \
- -c ./ppcls/configs/PULC/person/Distillation/PPLCNet_x1_0_distillation.yaml \
- -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
-```
-
-验证集的最佳指标为0.95-0.97之间,当前模型最好的权重保存在`output/DistillationModel/best_model_student.pdparams`。
-
-
-
-#### 3.2.2 超参数搜索训练
-
-[3.2 小节](#3.2) 提供了在已经搜索并得到的超参数上进行了训练,此部分内容提供了搜索的过程,此过程是为了得到更好的训练超参数。
-
-* 搜索运行脚本如下:
-
-```shell
-python tools/search_strategy.py -c ppcls/configs/StrategySearch/person.yaml
-```
-
-在`ppcls/configs/StrategySearch/person.yaml`中指定了具体的 GPU id 号和搜索配置, 默认搜索的训练日志和模型存放于`output/search_person`中,最终的蒸馏模型存放于`output/search_person/search_res/DistillationModel/best_model_student.pdparams`。
-
-* **注意**:
-
-* 3.1小节提供的默认配置已经经过了搜索,所以此过程不是必要的过程,如果自己的训练数据集有变化,可以尝试此过程。
-
-* 此过程基于当前数据集在 V100 4 卡上大概需要耗时 10 小时,如果缺少机器资源,希望体验搜索过程,可以将`ppcls/configs/cls_demo/person/PPLCNet/PPLCNet_x1_0_search.yaml`中的`train_list.txt`和`val_list.txt`分别替换为`train_list.txt.debug`和`val_list.txt.debug`。替换list只是为了加速跑通整个搜索过程,由于数据量较小,其搜素的结果没有参考性。另外,搜索空间可以根据当前的机器资源来调整,如果机器资源有限,可以尝试缩小搜索空间,如果机器资源较充足,可以尝试扩大搜索空间。
-
-* 如果此过程搜索的得到的超参数与[3.2.1小节](#3.2.1)提供的超参数不一致,主要是由于训练数据较小造成的波动导致,可以忽略。
-
-
-
-
-## 4. 模型评估与推理
-
-
-
-
-### 4.1 模型评估
-
-训练好模型之后,可以通过以下命令实现对模型指标的评估。
-
-```bash
-python3 tools/eval.py \
- -c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
- -o Global.pretrained_model="output/DistillationModel/best_model_student"
-```
-
-
-
-### 4.2 模型预测
-
-模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
-
-```python
-python3 tools/infer.py \
- -c ./ppcls/configs/PULC/person/PPLCNet/PPLCNet_x1_0.yaml \
- -o Infer.infer_imgs=./dataset/person/val/objects365_01780637.jpg \
- -o Global.pretrained_model=output/DistillationModel/best_model_student \
- -o Global.pretrained_model=Infer.PostProcess.threshold=0.9794
-```
-
-输出结果如下:
-
-```
-[{'class_ids': [0], 'scores': [0.9878496769815683], 'label_names': ['nobody'], 'file_name': './dataset/person/val/objects365_01780637.jpg'}]
-```
-
-**备注:** 这里的`Infer.PostProcess.threshold`的值需要根据实际场景来确定,此处的`0.9794`是在该场景中的`val`数据集在千分之一Fpr下得到的最佳Tpr所得到的。
-
-
-
-### 4.3 使用 inference 模型进行推理
-
-
-
-### 4.3.1 导出 inference 模型
-
-通过导出 inference 模型,PaddlePaddle 支持使用预测引擎进行预测推理。接下来介绍如何用预测引擎进行推理:
-首先,对训练好的模型进行转换:
-
-```bash
-python3 tools/export_model.py \
- -c ./ppcls/configs/cls_demo/PULC/PPLCNet/PPLCNet_x1_0.yaml \
- -o Global.pretrained_model=output/DistillationModel/best_model_student \
- -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person
-```
-执行完该脚本后会在`deploy/models/`下生成`PPLCNet_x1_0_person`文件夹,该文件夹中的模型与 2.2 节下载的推理预测模型格式一致。
-
-
-
-### 4.3.2 基于 inference 模型推理预测
-推理预测的脚本为:
-
-```
-python3.7 python/predict_cls.py -c configs/PULC/person/inference_person_cls.yaml -o Global.inference_model_dir="models/PPLCNet_x1_0_person" -o PostProcess.ThreshOutput.threshold=0.9794
-```
-
-**备注:**
-
-- 此处的`PostProcess.ThreshOutput.threshold`由eval时的最佳`threshold`来确定。
-- 更多关于推理的细节,可以参考[2.2节](#2.2)。
-
diff --git a/docs/zh_CN/PULC/PULC_person_exists.md b/docs/zh_CN/PULC/PULC_person_exists.md
new file mode 100644
index 0000000000000000000000000000000000000000..b3b830a893a4648645beab3a447ec8d894a5da4c
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_person_exists.md
@@ -0,0 +1,472 @@
+# PULC 有人/无人分类模型
+
+------
+
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的有人/无人的分类模型。该模型可以广泛应用于如监控场景、人员进出管控场景、海量数据过滤场景等。
+
+下表列出了判断图片中是否有人的二分类模型的相关指标,前两行展现了使用 SwinTransformer_tiny 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第三行至第六行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。
+
+
+| 模型 | Tpr(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 95.69 | 95.30 | 111 | 使用 ImageNet 预训练模型 |
+| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 2.6 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 89.57 | 2.12 | 7.0 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 92.10 | 2.12 | 7.0 | 使用 SSLD 预训练模型 |
+| PPLCNet_x1_0 | 93.43 | 2.12 | 7.0 | 使用 SSLD 预训练模型+EDA 策略|
+| PPLCNet_x1_0 | 96.23 | 2.12 | 7.0 | 使用 SSLD 预训练模型+EDA 策略+SKL-UGI 知识蒸馏策略|
+
+从表中可以看出,backbone 为 SwinTransformer_tiny 时精度较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,但是会导致精度大幅下降。将 backbone 替换为速度更快的 PPLCNet_x1_0 时,精度较 MobileNetV3_small_x0_35 高 20 多个百分点,与此同时速度依旧可以快 20% 以上。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升约 2.6 个百分点,进一步地,当融合 EDA 策略后,精度可以再提升 1.3 个百分点,最后,在使用 SKL-UGI 知识蒸馏后,精度可以继续提升 2.8 个百分点。此时,PPLCNet_x1_0 达到了 SwinTransformer_tiny 模型的精度,但是速度快 40 多倍。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* `Tpr` 指标的介绍可以参考 [3.3 小节](#3.3)的备注部分,延时是基于 Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,开启 MKLDNN 加速策略,线程数为10。
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器已安装 CUDA9 或 CUDA10,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
+```
+
+结果如下:
+```
+>>> result
+class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="person_exists")
+result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`, 使用默认的代码返回结果示例如下:
+
+```
+>>> result
+[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
+```
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档[环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+本案例中所使用的所有数据集均为开源数据,`train` 集合为[MS-COCO 数据](https://cocodataset.org/#overview)的训练集的子集,`val` 集合为[Object365 数据](https://www.objects365.org/overview.html)的训练集的子集,`ImageNet_val` 为[ImageNet-1k 数据](https://www.image-net.org/)的验证集。
+
+
+
+#### 3.2.2 数据集获取
+
+在公开数据集的基础上经过后处理即可得到本案例需要的数据,具体处理方法如下:
+
+- 训练集合,本案例处理了 MS-COCO 数据训练集的标注文件,如果某张图含有“人”的标签,且该框的面积在整张图中的比例大于 10%,即认为该张图中含有人;如果某张图中没有“人”的标签,则认为该张图中不含有人。经过处理后,得到 92964 条可用数据,其中有人的数据有 39813 条,无人的数据有 53151 条(筛选逻辑的示意代码见下)。
+
+- 验证集合,从 Object365 数据中随机抽取一小部分数据,使用在 MS-COCO 上训练得到的较好的模型预测这些数据,将预测结果和数据的标注文件取交集,将交集的结果按照得到训练集的方法筛选出验证集合。经过处理后,得到 27820 条可用数据。其中有人的数据有 2255 条,无人的数据有 25565 条。
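+
+作为参考,下面给出基于 pycocotools 按上述规则筛选训练集的示意代码。该代码仅为便于理解的草稿:标注文件路径、变量命名均为假设,实际筛选细节以上文描述为准。
+
+```python
+# 示意:按 “人” 标注框面积占整图比例筛选 MS-COCO 训练集
+from pycocotools.coco import COCO
+
+coco = COCO("annotations/instances_train2017.json")  # 假设的标注文件路径
+person_cat = coco.getCatIds(catNms=["person"])
+labels = {}
+for img_id in coco.getImgIds():
+    img = coco.loadImgs(img_id)[0]
+    img_area = img["height"] * img["width"]
+    ann_ids = coco.getAnnIds(imgIds=img_id, catIds=person_cat, iscrowd=None)
+    anns = coco.loadAnns(ann_ids)
+    if not anns:
+        labels[img["file_name"]] = 0  # 无 “人” 标签,记为无人(0 类)
+    elif any(a["bbox"][2] * a["bbox"][3] / img_area > 0.1 for a in anns):
+        labels[img["file_name"]] = 1  # 存在面积占比大于 10% 的 “人” 框,记为有人(1 类)
+    # 其余图像(有 “人” 但框面积占比过小)不纳入本案例数据
+```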
+
+处理后的数据集部分数据可视化如下:
+
+
+
+此处提供了经过上述方法处理好的数据,可以直接下载得到。
+
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压有人/无人场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
+tar -xf person_exists.tar
+cd ../
+```
+
+执行上述命令后,`dataset/` 下存在 `person_exists` 目录,该目录中具有以下数据:
+
+```
+
+├── train
+│ ├── 000000000009.jpg
+│ ├── 000000000025.jpg
+...
+├── val
+│ ├── objects365_01780637.jpg
+│ ├── objects365_01780640.jpg
+...
+├── ImageNet_val
+│ ├── ILSVRC2012_val_00000001.JPEG
+│ ├── ILSVRC2012_val_00000002.JPEG
+...
+├── train_list.txt
+├── train_list.txt.debug
+├── train_list_for_distill.txt
+├── val_list.txt
+└── val_list.txt.debug
+```
+
+其中 `train/` 和 `val/` 分别为训练集和验证集。`train_list.txt` 和 `val_list.txt` 分别为训练集和验证集的标签文件,`train_list.txt.debug` 和 `val_list.txt.debug` 分别为训练集和验证集的 `debug` 标签文件,其分别是 `train_list.txt` 和 `val_list.txt` 的子集,用该文件可以快速体验本案例的流程。`ImageNet_val/` 是 ImageNet-1k 的验证集,该集合和 `train` 集合的混合数据用于本案例的 `SKL-UGI知识蒸馏策略`,对应的训练标签文件为 `train_list_for_distill.txt` 。
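+
+作为参考,标签文件中每一行为 “图像相对路径 类别标签” 的形式,大致如下(以下内容仅为示意,具体格式请以下方备注中的文档为准):
+
+```
+train/000000000009.jpg 1
+train/000000000025.jpg 0
+```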
+
+**备注:**
+
+* 关于 `train_list.txt`、`val_list.txt`的格式说明,可以参考 [PaddleClas 分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
+
+* 关于如何得到蒸馏的标签文件可以参考[知识蒸馏标签获得方法](../advanced_tutorials/ssld.md#3.2)。
+
+
+
+
+### 3.3 模型训练
+
+
+在 `ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml
+```
+
+验证集的最佳指标在 `0.94-0.95` 之间(数据集较小,容易造成波动)。
+
+**备注:**
+
+* 此时使用的指标为Tpr,该指标描述了在假正类率(Fpr)小于某一个指标时的真正类率(Tpr),是产业中二分类问题常用的指标之一。在本案例中,Fpr 为千分之一。关于 Fpr 和 Tpr 的更多介绍,可以参考[这里](https://baike.baidu.com/item/AUC/19282953)。
+
+* 在 eval 时,会打印出当前最佳的 TprAtFpr 指标,具体地,其会打印当前的 `Fpr`、`Tpr` 值,以及当前的 `threshold` 值。`Tpr` 值反映了在当前 `Fpr` 值下的召回率,该值越高,代表模型越好;`threshold` 表示当前最佳 `Fpr` 所对应的分类阈值,可用于后续模型部署落地等(该指标计算方式的示意代码见下)。
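+
+为便于理解该指标与分类阈值的关系,下面给出一段计算 TprAtFpr 的示意代码。它并非 PaddleClas 源码,函数与变量命名均为假设,实际实现请以仓库中的评估代码为准:
+
+```python
+import numpy as np
+
+def tpr_at_fpr(scores, labels, max_fpr=0.001):
+    """在 Fpr 不超过 max_fpr 的前提下,返回可达到的最佳 Tpr 及对应阈值。
+    scores 为正类(someone)的预测概率,labels 为 0/1 真实标签。"""
+    scores, labels = np.asarray(scores), np.asarray(labels)
+    neg = max(int((labels == 0).sum()), 1)
+    pos = max(int((labels == 1).sum()), 1)
+    best_tpr, best_thr = 0.0, 1.0
+    for thr in np.unique(scores):
+        pred = scores >= thr
+        fpr = (pred & (labels == 0)).sum() / neg  # 假正类率
+        tpr = (pred & (labels == 1)).sum() / pos  # 真正类率(召回率)
+        if fpr <= max_fpr and tpr > best_tpr:
+            best_tpr, best_thr = float(tpr), float(thr)
+    return best_tpr, best_thr
+```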
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [1], 'scores': [0.9999976], 'label_names': ['someone'], 'file_name': 'deploy/images/PULC/person_exists/objects365_02035329.jpg'}]
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `deploy/images/PULC/person_exists/objects365_02035329.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+* 二分类默认的阈值为0.5, 如果需要指定阈值,可以重写 `Infer.PostProcess.threshold` ,如`-o Infer.PostProcess.threshold=0.9794`,该值需要根据实际场景来确定,此处的 `0.9794` 是在该场景中的 `val` 数据集在千分之一 Fpr 下得到的最佳 Tpr 所得到的。
+
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用 `ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml` 中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+验证集的最佳指标为 `0.96-0.98` 之间,当前教师模型最好的权重保存在 `output/ResNet101_vd/best_model.pdparams`。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型,使用ImageNet数据集的验证集作为新增的无标签数据。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+验证集的最佳指标为 `0.95-0.97` 之间,当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`。
+
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
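+
+作为参考,搜索过程大致通过如下命令启动(此处的脚本与配置文件路径仅为示意,具体请以上述超参数搜索策略文档为准):
+
+```shell
+python3 tools/search_strategy.py -c ppcls/configs/StrategySearch/person.yaml
+```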
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用 MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于 Paddle Inference 推理引擎的介绍,可以参考 [Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person_exists_infer
+```
+执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_person_exists_infer` 文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+├── PPLCNet_x1_0_person_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载 inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/person_exists_infer.tar && tar -xf person_exists_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── person_exists_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/person_exists/objects365_02035329.jpg` 进行有人/无人分类。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
+```
+
+
+**备注:** 二分类默认的阈值为0.5, 如果需要指定阈值,可以重写 `Infer.PostProcess.threshold` ,如`-o Infer.PostProcess.threshold=0.9794`,该值需要根据实际场景来确定,此处的 `0.9794` 是在该场景中的 `val` 数据集在千分之一 Fpr 下得到的最佳 Tpr 所得到的。该阈值的确定方法可以参考[3.3节](#3.3)备注部分。
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o Global.infer_imgs="./images/PULC/person_exists/"
+```
+
+终端中会输出该文件夹内所有图像的分类结果,如下所示。
+
+```
+objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
+objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
+```
+
+其中,`someone` 表示该图里存在人,`nobody` 表示该图里不存在人。
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_quickstart.md b/docs/zh_CN/PULC/PULC_quickstart.md
new file mode 100644
index 0000000000000000000000000000000000000000..c7c6980625d6325bddbd5a6fed619147534c43b7
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_quickstart.md
@@ -0,0 +1,125 @@
+# PULC 快速体验
+
+------
+
+本文主要介绍通过 PaddleClas whl 包,使用 PULC 系列模型进行预测。
+
+## 目录
+
+- [1. 安装](#1)
+ - [1.1 安装PaddlePaddle](#11)
+ - [1.2 安装PaddleClas whl包](#12)
+- [2. 快速体验](#2)
+ - [2.1 命令行使用](#2.1)
+ - [2.2 Python脚本使用](#2.2)
+ - [2.3 模型列表](#2.3)
+- [3. 小结](#3)
+
+
+
+## 1. 安装
+
+
+
+### 1.1 安装 PaddlePaddle
+
+- 如果您的机器已安装 CUDA9 或 CUDA10,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 1.2 安装 PaddleClas whl 包
+
+```bash
+pip3 install paddleclas
+```
+
+
+
+## 2. 快速体验
+
+PaddleClas 提供了一系列测试图片,其中包含人、车、OCR 等方向多个场景的 demo 数据。点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载并解压,然后在终端中切换到相应目录。
+
+
+
+### 2.1 命令行使用
+
+```
+cd /path/to/pulc_demo_imgs
+```
+
+使用命令行预测:
+
+```bash
+paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
+```
+
+结果如下:
+```
+>>> result
+class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
+Predict complete!
+```
+
+若预测结果为 `nobody`,表示该图中没有人;若预测结果为 `someone`,则表示该图中有人。此处的预测结果为 `nobody`,即该图中没有人。
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹,如需要替换模型,更改 `--model_name` 中的模型名字即可,模型名字可以参考[2.3 模型列表](#2.3)。
+
+
+
+### 2.2 Python 脚本使用
+
+此处提供了在 python 脚本中使用 PULC 有人/无人分类模型预测的例子。
+
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="person_exists")
+result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
+print(next(result))
+```
+
+打印的结果如下:
+
+```
+>>> result
+[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果,默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`。更换其他模型时只需要替换 `model_name`,可选的 `model_name` 可以参考[2.3 模型列表](#2.3)。
+
+
+
+### 2.3 模型列表
+
+PULC 系列模型的名称和简介如下:
+
+|模型名称|模型简介|
+| --- | --- |
+| person_exists | PULC有人/无人分类模型 |
+| person_attribute | PULC人体属性识别模型 |
+| safety_helmet | PULC佩戴安全帽分类模型 |
+| traffic_sign | PULC交通标志分类模型 |
+| vehicle_attribute | PULC车辆属性识别模型 |
+| car_exists | PULC有车/无车分类模型 |
+| text_image_orientation | PULC含文字图像方向分类模型 |
+| textline_orientation | PULC文本行方向分类模型 |
+| language_classification | PULC语种分类模型 |
+
+
+
+## 3. 小结
+
+通过本节内容,相信您已经熟练掌握 PaddleClas whl 包的 PULC 模型使用方法并获得了初步效果。
+
+PULC 方法产出的系列模型在人、车、OCR等方向的多个场景中均验证有效,用超轻量模型就可实现与 SwinTransformer 模型接近的精度,预测速度提高 40+ 倍。并且打通数据、模型训练、压缩和推理部署全流程,具体地,您可以参考[PULC有人/无人分类模型](PULC_person_exists.md)、[PULC人体属性识别模型](PULC_person_attribute.md)、[PULC佩戴安全帽分类模型](PULC_safety_helmet.md)、[PULC交通标志分类模型](PULC_traffic_sign.md)、[PULC车辆属性识别模型](PULC_vehicle_attribute.md)、[PULC有车/无车分类模型](PULC_car_exists.md)、[PULC含文字图像方向分类模型](PULC_text_image_orientation.md)、[PULC文本行方向分类模型](PULC_textline_orientation.md)、[PULC语种分类模型](PULC_language_classification.md)。
diff --git a/docs/zh_CN/PULC/PULC_safety_helmet.md b/docs/zh_CN/PULC/PULC_safety_helmet.md
new file mode 100644
index 0000000000000000000000000000000000000000..0467b61b12c629ebc7a6e2a2268b4c82fe512abe
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_safety_helmet.md
@@ -0,0 +1,438 @@
+# PULC 佩戴安全帽分类模型
+
+------
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 UDML 知识蒸馏](#4.1)
+  - [4.1.1 蒸馏训练](#4.1.1)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的“是否佩戴安全帽”的二分类模型。该模型可以广泛应用于如建筑施工场景、工厂车间场景、交通场景等。
+
+下表列出了判断图片中是否佩戴安全帽的二分类模型的相关指标,前三行展现了使用 SwinTransformer_tiny、Res2Net200_vd_26w_4s 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第四行至第七行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + UDML 知识蒸馏策略训练得到的模型的相关指标。
+
+| 模型 | Tpr(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 93.57 | 91.32 | 111 | 使用ImageNet预训练模型 |
+| Res2Net200_vd_26w_4s | 98.92 | 80.99 | 284 | 使用ImageNet预训练模型 |
+| MobileNetV3_small_x0_35 | 84.83 | 2.85 | 2.6 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 93.27 | 2.03 | 7.1 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 98.16 | 2.03 | 7.1 | 使用SSLD预训练模型 |
+| PPLCNet_x1_0 | 99.30 | 2.03 | 7.1 | 使用SSLD预训练模型+EDA策略|
+| PPLCNet_x1_0 | 99.38 | 2.03 | 7.1 | 使用SSLD预训练模型+EDA策略+UDML知识蒸馏策略|
+
+从表中可以看出,在使用服务器端大模型作为 backbone 时,SwinTransformer_tiny 精度较低,Res2Net200_vd_26w_4s 精度较高,但服务器端大模型推理速度普遍较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,但是精度显著降低。在将 backbone 替换为 PPLCNet_x1_0 后,精度较 MobileNetV3_small_x0_35 提高约 8.5 个百分点,与此同时速度快 20% 以上。在此基础上,将 PPLCNet_x1_0 的预训练模型替换为 SSLD 预训练模型后,在对推理速度无影响的前提下,精度提升约 4.9 个百分点,进一步地使用 EDA 策略后,精度可以再提升 1.1 个百分点。此时,PPLCNet_x1_0 已经超过 Res2Net200_vd_26w_4s 模型的精度,但是速度快 70+ 倍。最后,在使用 UDML 知识蒸馏后,精度可以再提升 0.08 个百分点。下面详细介绍关于 PULC 安全帽模型的训练方法和推理部署方法。
+
+**备注:**
+
+* `Tpr`指标的介绍可以参考 [3.3小节](#3.3)的备注部分,延时是基于 Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,开启MKLDNN加速策略,线程数为10。
+
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器已安装 CUDA9 或 CUDA10,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=safety_helmet --infer_imgs=pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png
+```
+
+结果如下:
+```
+>>> result
+class_ids: [1], scores: [0.9986255], label_names: ['unwearing_helmet'], filename: pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="safety_helmet")
+result = model.predict(input_data="pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="safety_helmet", batch_size=2)`, 使用默认的代码返回结果示例如下:
+
+```
+>>> result
+[{'class_ids': [1], 'scores': [0.9986255], 'label_names': ['unwearing_helmet'], 'filename': 'pulc_demo_imgs/safety_helmet/safety_helmet_test_1.png'}]
+```
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+本案例中所使用的所有数据集均为开源数据,数据集基于[Safety-Helmet-Wearing-Dataset](https://github.com/njvisionpower/Safety-Helmet-Wearing-Dataset)、[hard-hat-detection](https://www.kaggle.com/datasets/andrewmvd/hard-hat-detection)与[Large-scale CelebFaces Attributes (CelebA) Dataset](https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)处理整合而来。
+
+
+
+#### 3.2.2 数据集获取
+
+在公开数据集的基础上经过后处理即可得到本案例需要的数据,具体处理方法如下:
+
+* 对于 Safety-Helmet-Wearing-Dataset 数据集:根据 bbox 标签数据,将标注框的宽、高放大 3 倍作为裁剪框对图像进行裁剪(裁剪方式的示意代码见下方列表之后),其中带有安全帽的图像类别为 0,不戴安全帽的图像类别为 1;
+* 对于 hard-hat-detection 数据集:仅使用其中类别标签为 “hat” 的图像,并使用 bbox 标签进行裁剪,图像类别为0;
+* 对于 CelebA 数据集:仅使用其中类别标签为 “Wearing_Hat” 的图像,并使用 bbox 标签进行裁剪,图像类别为0。
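+
+下面给出按上述方式放大标注框并裁剪图像的示意代码,仅为便于理解的草稿(假设 bbox 为像素坐标 `[x1, y1, x2, y2]`,实际处理以各数据集的标注格式为准):
+
+```python
+# 示意:以原框中心为中心,将宽高各放大 scale 倍后裁剪,并截断到图像边界内
+from PIL import Image
+
+def crop_with_expanded_bbox(img_path, bbox, scale=3.0):
+    img = Image.open(img_path)
+    w, h = img.size
+    x1, y1, x2, y2 = bbox
+    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
+    bw, bh = (x2 - x1) * scale, (y2 - y1) * scale
+    left, top = max(int(cx - bw / 2), 0), max(int(cy - bh / 2), 0)
+    right, bottom = min(int(cx + bw / 2), w), min(int(cy + bh / 2), h)
+    return img.crop((left, top, right, bottom))
+```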
+
+在整合上述数据后,可得到共约 15 万数据,其中戴安全帽与不戴安全帽的图像数量分别约为 2.8 万与 12.1 万,然后在两个类别上分别随机选取 0.56 万张图像作为测试集,共约 1.12 万张图像,其他约 13.8 万张图像作为训练集。
+
+处理后的数据集部分数据可视化如下:
+
+
+
+此处提供了经过上述方法处理好的数据,可以直接下载得到。
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压安全帽场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/safety_helmet.tar
+tar -xf safety_helmet.tar
+cd ../
+```
+
+执行上述命令后,`dataset/` 下存在 `safety_helmet` 目录,该目录中具有以下数据:
+
+```
+├── images
+│ ├── VOC2028_part2_001209_1.jpg
+│ ├── HHD_hard_hat_workers23_1.jpg
+│ ├── CelebA_077809.jpg
+│ ├── ...
+│ └── ...
+├── train_list.txt
+└── val_list.txt
+```
+
+其中,`train_list.txt` 和 `val_list.txt` 分别为训练集和验证集的标签文件,所有的图像数据在 `images/` 目录下。
+
+**备注:**
+
+* 关于 `train_list.txt`、`val_list.txt`的格式说明,可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
+
+
+
+### 3.3 模型训练
+
+在 `ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml
+```
+
+验证集的最佳指标在 `0.985-0.993` 之间(数据集较小,容易造成波动)。
+
+**备注:**
+
+* 此时使用的指标为Tpr,该指标描述了在假正类率(Fpr)小于某一个指标时的真正类率(Tpr),是产业中二分类问题常用的指标之一。在本案例中,Fpr 为万分之一。关于 Fpr 和 Tpr 的更多介绍,可以参考[这里](https://baike.baidu.com/item/AUC/19282953)。
+
+* 在eval时,会打印出来当前最佳的 TprAtFpr 指标,具体地,其会打印当前的 `Fpr`、`Tpr` 值,以及当前的 `threshold`值,`Tpr` 值反映了在当前 `Fpr` 值下的召回率,该值越高,代表模型越好。`threshold` 表示当前最佳 `Fpr` 所对应的分类阈值,可用于后续模型部署落地等。
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了训练过程中的最佳参数权重文件所在的路径,如需指定其他权重文件,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [1], 'scores': [0.9524797], 'label_names': ['unwearing_helmet'], 'file_name': 'deploy/images/PULC/safety_helmet/safety_helmet_test_1.png'}]
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `deploy/images/PULC/safety_helmet/safety_helmet_test_1.png` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+* 二分类默认的阈值为0.5, 如果需要指定阈值,可以重写 `Infer.PostProcess.threshold` ,如 `-o Infer.PostProcess.threshold=0.9167`,该值需要根据实际应用场景来确定,在 safety_helmet 数据集的 val 验证集上,在万分之一 Fpr 下得到的最佳 Tpr 时,该值为 0.9167。
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 UDML 知识蒸馏
+
+UDML 知识蒸馏是一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[UDML 知识蒸馏](../advanced_tutorials/knowledge_distillation.md#1.2.3)。
+
+
+
+#### 4.1.1 蒸馏训练
+
+配置文件 `ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0_distillation.yaml` 提供了 `UDML知识蒸馏策略` 的配置。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0_distillation.yaml
+```
+
+验证集的最佳指标为 `0.990-0.993` 之间,当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`。
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注**:此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference 可使用 MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于 Paddle Inference 推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/safety_helmet/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_safety_helmet_infer
+```
+
+执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_safety_helmet_infer` 目录,该目录下有如下文件结构:
+
+```
+├── PPLCNet_x1_0_safety_helmet_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在 `output/PPLCNet_x1_0/best_model.pdparams` 中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载 inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/safety_helmet_infer.tar && tar -xf safety_helmet_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── safety_helmet_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/safety_helmet/safety_helmet_test_1.png` 进行是否佩戴安全帽分类。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+safety_helmet_test_1.png: class id(s): [1], score(s): [1.00], label_name(s): ['unwearing_helmet']
+```
+
+**备注:** 二分类默认的阈值为0.5, 如果需要指定阈值,可以重写 `Infer.PostProcess.threshold` ,如 `-o Infer.PostProcess.threshold=0.9167`,该值需要根据实际应用场景来确定,在 safety_helmet 数据集的 val 验证集上,在万分之一 Fpr 下得到的最佳 Tpr 时,该值为 0.9167。该阈值的确定方法可以参考[3.3节](#3.3)备注部分。
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml -o Global.infer_imgs="./images/PULC/safety_helmet/"
+```
+
+终端中会输出该文件夹内所有图像的分类结果,如下所示。
+
+```
+safety_helmet_test_1.png: class id(s): [1], score(s): [1.00], label_name(s): ['unwearing_helmet']
+safety_helmet_test_2.png: class id(s): [0], score(s): [1.00], label_name(s): ['wearing_helmet']
+```
+
+其中,`wearing_helmet` 表示该图中的人佩戴了安全帽,`unwearing_helmet` 表示该图中的人未佩戴安全帽。
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_text_image_orientation.md b/docs/zh_CN/PULC/PULC_text_image_orientation.md
new file mode 100644
index 0000000000000000000000000000000000000000..d89396f0a0c4a67dd0990bd4e19725684b894020
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_text_image_orientation.md
@@ -0,0 +1,460 @@
+# PULC 含文字图像方向分类模型
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图片](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+## 1. 模型和应用场景介绍
+
+在诸如文档扫描、证照拍摄等过程中,有时为了拍摄更清晰,会将拍摄设备进行旋转,导致得到的图片也是不同方向的。此时,标准的OCR流程无法很好地应对这些数据。利用图像分类技术,可以预先判断含文字图像的方向,并将其进行方向调整,从而提高OCR处理的准确性。该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的含文字图像方向的分类模型。该模型可以广泛应用于金融、政务等行业的旋转图片的OCR处理场景中。
+
+下表列出了判断含文字图像方向分类模型的相关指标,前两行展现了使用 SwinTransformer_tiny 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第三行至第五行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略训练得到的模型的相关指标。
+
+| 模型 | 精度(%) | 延时(ms) | 存储(M) | 策略 |
+| ----------------------- | --------- | ---------- | --------- | -------------------------- |
+| SwinTransformer_tiny | 99.12 | 89.65 | 111 | 使用ImageNet预训练模型 |
+| MobileNetV3_small_x0_35 | 83.61 | 2.95 | 2.6 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 97.85 | 2.16 | 7.1 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 99.02 | 2.16 | 7.1 | 使用SSLD预训练模型 |
+| **PPLCNet_x1_0** | **99.06** | **2.16** | **7.1** | 使用SSLD预训练模型+EDA策略 |
+
+从表中可以看出,backbone 为 SwinTransformer_tiny 时精度比较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度提升明显,但精度有了大幅下降。将 backbone 替换为 PPLCNet_x1_0 时,速度略为提升,同时精度较 MobileNetV3_small_x0_35 高了 14.24 个百分点。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升 1.17 个百分点,进一步地使用 EDA 策略后,精度可以再提升 0.04 个百分点。此时,PPLCNet_x1_0 与 SwinTransformer_tiny 的精度差别不大,但是速度明显变快。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器已安装 CUDA9 或 CUDA10,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=text_image_orientation --infer_imgs=pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg
+```
+
+结果如下:
+```
+>>> result
+class_ids: [0, 2], scores: [0.85615, 0.05046], label_names: ['0', '180'], filename: pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="text_image_orientation")
+result = model.predict(input_data="pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="text_image_orientation", batch_size=2)`, 使用默认的代码返回结果示例如下:
+
+```
+>>> result
+[{'class_ids': [0, 2], 'scores': [0.85615, 0.05046], 'label_names': ['0', '180'], 'filename': 'pulc_demo_imgs/text_image_orientation/img_rot0_demo.jpg'}]
+```
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+[第1节](#1)中提供的模型使用内部数据训练得到,该数据集暂时不方便公开。这里基于 [ICDAR2019-ArT](https://ai.baidu.com/broad/introduction?dataset=art)、 [XFUND](https://github.com/doc-analysis/XFUND) 和 [ICDAR2015](https://rrc.cvc.uab.es/?ch=4&com=introduction) 三个公开数据集构造了一个小规模含文字图像方向分类数据集,用于体验本案例。
+
+
+
+
+
+#### 3.2.2 数据集获取
+
+在公开数据集的基础上经过后处理即可得到本案例需要的数据,具体处理方法如下:
+
+考虑到原始图片的分辨率较高,模型训练时间较长,这里将所有数据预先进行了缩放处理,在保持长宽比不变的前提下,将短边缩放到384。然后将数据进行顺时针旋转处理,分别生成90度、180度和270度的合成数据。其中,将 ICDAR2019-ArT 和 XFUND 生成的41460张数据按照 9:1 的比例随机划分成了训练集和验证集, ICDAR2015 生成的6000张数据作为`SKL-UGI知识蒸馏策略`实验中的补充数据。
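+
+下面给出上述缩放与旋转处理的示意代码,仅为便于理解的草稿(假设使用 PIL,文件读取逻辑为假设):
+
+```python
+# 示意:短边缩放到 384(保持长宽比),并生成顺时针 90/180/270 度的旋转数据
+from PIL import Image
+
+def make_rotations(img_path, short_side=384):
+    img = Image.open(img_path)
+    w, h = img.size
+    scale = short_side / min(w, h)
+    img = img.resize((round(w * scale), round(h * scale)))
+    # PIL 的 rotate 以逆时针为正方向,顺时针旋转 90 度对应 rotate(-90)
+    return {deg: img.rotate(-deg, expand=True) for deg in (0, 90, 180, 270)}
+```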
+
+处理后的数据集部分数据可视化如下:
+
+
+
+此处提供了经过上述方法处理好的数据,可以直接下载得到。
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压含文字图像方向场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/text_image_orientation.tar
+tar -xf text_image_orientation.tar
+cd ../
+```
+
+执行上述命令后,`dataset/`下存在`text_image_orientation`目录,该目录中具有以下数据:
+
+```
+├── img_0
+│ ├── img_rot0_0.jpg
+│ ├── img_rot0_1.png
+...
+├── img_90
+│ ├── img_rot90_0.jpg
+│ ├── img_rot90_1.png
+...
+├── img_180
+│ ├── img_rot180_0.jpg
+│ ├── img_rot180_1.png
+...
+├── img_270
+│ ├── img_rot270_0.jpg
+│ ├── img_rot270_1.png
+...
+├── distill_data
+│ ├── gt_7060_0.jpg
+│ ├── gt_7060_90.jpg
+...
+├── train_list.txt
+├── train_list.txt.debug
+├── train_list_for_distill.txt
+├── test_list.txt
+├── test_list.txt.debug
+└── label_list.txt
+```
+
+其中`img_0/`、`img_90/`、`img_180/`和`img_270/`分别存放了4个角度的训练集和验证集数据。`train_list.txt`和`test_list.txt`分别为训练集和验证集的标签文件,`train_list.txt.debug`和`test_list.txt.debug`分别为训练集和验证集的`debug`标签文件,其分别是`train_list.txt`和`test_list.txt`的子集,用该文件可以快速体验本案例的流程。`distill_data/`是补充文字数据,该集合和`train`集合的混合数据用于本案例的`SKL-UGI知识蒸馏策略`,对应的训练标签文件为`train_list_for_distill.txt`。关于如何得到蒸馏的标签可以参考[知识蒸馏标签获得](../advanced_tutorials/ssld.md#3.2)。
+
+**备注:**
+
+* 关于 `train_list.txt`、`val_list.txt`的格式说明,可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
+
+* 关于如何得到蒸馏的标签文件可以参考[知识蒸馏标签获得方法](../advanced_tutorials/ssld.md#3.2)。
+
+
+
+### 3.3 模型训练
+
+在`ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml`中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml
+```
+
+验证集的最佳指标在0.99左右。
+
+**备注**:本文档中提到的训练指标均为在大规模内部数据上的训练指标,使用 demo 数据训练时,由于数据集规模较小且分布与大规模内部数据不同,无法达到该指标。可以进一步扩充自己的数据并且使用本案例中介绍的优化方法进行调优,从而达到更高的精度。
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [0, 2], 'scores': [0.85615, 0.05046], 'file_name': 'deploy/images/PULC/text_image_orientation/img_rot0_demo.jpg', 'label_names': ['0', '180']}]
+```
+
+**备注:**
+
+- 其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+- 默认是对 `deploy/images/PULC/text_image_orientation/img_rot0_demo.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+- 输出为top2的预测结果,`0` 表示该图文本方向为0度,`90` 表示该图文本方向为顺时针90度,`180` 表示该图文本方向为顺时针180度,`270` 表示该图文本方向为顺时针270度。
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用`ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml`中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+验证集的最佳指标为 0.996 左右,当前教师模型最好的权重保存在`output/ResNet101_vd/best_model.pdparams`。
+
+**备注:** 训练 ResNet101_vd 模型需要的显存较多,如果机器显存不够,可以将学习率和 batch size 同时缩小一定的倍数进行训练,每个配置项各用一个 `-o` 指定,如在命令后添加参数 `-o DataLoader.Train.sampler.batch_size=64 -o Optimizer.lr.learning_rate=0.1`。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI 知识蒸馏策略`的配置。该配置将 `ResNet101_vd` 当作教师模型,`PPLCNet_x1_0` 当作学生模型,使用[3.2.2节](#3.2.2)中介绍的蒸馏数据作为新增的无标签数据。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+验证集的最佳指标为0.99左右,当前模型最好的权重保存在`output/DistillationModel/best_model_student.pdparams`。
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_text_image_orientation_infer
+```
+
+执行完该脚本后会在`deploy/models/`下生成`PPLCNet_x1_0_text_image_orientation_infer`文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+├── PPLCNet_x1_0_text_image_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/text_image_orientation_infer.tar && tar -xf text_image_orientation_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── text_image_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/text_image_orientation/img_rot0_demo.jpg` 进行含文字图像方向分类。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/text_image_orientation/inference_text_image_orientation.yaml
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/text_image_orientation/inference_text_image_orientation.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+img_rot0_demo.jpg: class id(s): [0, 2], score(s): [0.86, 0.05], label_name(s): ['0', '180']
+```
+
+其中,输出为top2的预测结果,`0` 表示该图文本方向为0度,`90` 表示该图文本方向为顺时针90度,`180` 表示该图文本方向为顺时针180度,`270` 表示该图文本方向为顺时针270度。
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/text_image_orientation/inference_text_image_orientation.yaml -o Global.infer_imgs="./images/PULC/text_image_orientation/"
+```
+
+终端中会输出该文件夹内所有图像的分类结果,如下所示。
+
+```
+img_rot0_demo.jpg: class id(s): [0, 2], score(s): [0.86, 0.05], label_name(s): ['0', '180']
+img_rot180_demo.jpg: class id(s): [2, 1], score(s): [0.88, 0.04], label_name(s): ['180', '90']
+```
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_textline_orientation.md b/docs/zh_CN/PULC/PULC_textline_orientation.md
new file mode 100644
index 0000000000000000000000000000000000000000..eea10307532eb0a8a323a82108b0c5f9691a82f8
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_textline_orientation.md
@@ -0,0 +1,457 @@
+# PULC 文本行方向分类模型
+
+------
+
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的文本行方向分类模型。该模型可以广泛应用于如文字矫正、文字识别等场景。
+
+下表列出了文本行方向分类模型的相关指标,前两行展现了使用 SwinTransformer_tiny 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第三行至第七行依次展现了替换 backbone 为 PPLCNet_x1_0、更改分辨率和 stride、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。
+
+
+| 模型 | Top-1 Acc(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 93.61 | 89.64 | 111 | 使用 ImageNet 预训练模型 |
+| MobileNetV3_small_x0_35 | 81.40 | 2.96 | 2.6 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 89.99 | 2.11 | 7.0 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0* | 94.06 | 2.68 | 7.0 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0* | 94.11 | 2.68 | 7.0 | 使用 SSLD 预训练模型 |
+| PPLCNet_x1_0** | 96.01 | 2.72 | 7.0 | 使用 SSLD 预训练模型+EDA 策略|
+| PPLCNet_x1_0** | 95.86 | 2.72 | 7.0 | 使用 SSLD 预训练模型+EDA 策略+SKL-UGI 知识蒸馏策略|
+
+从表中可以看出,backbone 为 SwinTransformer_tiny 时精度较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,精度下降也比较明显。将 backbone 替换为 PPLCNet_x1_0 时,精度较 MobileNetV3_small_x0_35 高 8.6 个百分点,速度快 10% 左右。在此基础上,更改分辨率和 stride 后,速度变慢 27%,但是精度可以提升约 4.1 个百分点(采用 [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) 的方案);使用 SSLD 预训练模型后,精度可以继续提升约 0.05 个百分点;进一步地,当融合 EDA 策略后,精度可以再提升 1.9 个百分点。最后,再融合 SKL-UGI 知识蒸馏策略后,精度反而略有下降,该策略在此场景下无效。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* 其中不带\*的模型表示分辨率为 224x224,带\*的模型表示分辨率为 48x192(h\*w),网络中的 stride 改为 `[2, [2, 1], [2, 1], [2, 1], [2, 1]]`,其中,外层列表中的每一个元素代表网络结构下采样层的 stride,该策略为 [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) 提供的文本行方向分类器方案。带\*\*的模型表示分辨率为 80x160(h\*w),网络中的 stride 同样改为 `[2, [2, 1], [2, 1], [2, 1], [2, 1]]`,此分辨率是经过[超参数搜索策略](PULC_train.md#4-超参搜索)搜索得到的。
+* 延时是基于 Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,开启 MKLDNN 加速策略,线程数为10。
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器已安装 CUDA9 或 CUDA10,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装:
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=textline_orientation --infer_imgs=pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png
+```
+
+结果如下:
+```
+>>> result
+class_ids: [0], scores: [1.0], label_names: ['0_degree'], filename: pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="textline_orientation")
+result = model.predict(input_data="pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测,并返回预测结果, 默认 `batch_size` 为 1,如果需要更改 `batch_size`,实例化模型时,需要指定 `batch_size`,如 `model = paddleclas.PaddleClas(model_name="textline_orientation", batch_size=2)`, 使用默认的代码返回结果示例如下:
+
+```
+>>> result
+[{'class_ids': [0], 'scores': [1.0], 'label_names': ['0_degree'], 'filename': 'pulc_demo_imgs/textline_orientation/textline_orientation_test_0_0.png'}]
+```
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+本案例中所使用的所有数据集来源于内部数据,如果您希望体验训练过程,可以使用开源数据如[ICDAR2019-LSVT 文本行识别数据](https://aistudio.baidu.com/aistudio/datasetdetail/8429)。
+
+
+
+#### 3.2.2 数据集获取
+
+在公开数据集的基础上经过后处理即可得到本案例需要的数据,具体处理方法如下:
+
+本案例处理了 ICDAR2019-LSVT 文本行识别数据,将其中 id 号为 0-1999 的数据作为本案例的数据集合,经过旋转处理生成 0 类和 1 类,其中 0 类代表文本行为正,即 0 度;1 类代表文本行为反,即 180 度(生成方式的示意代码见下方列表之后)。
+
+- 训练集合,id号为 0-1799 作为训练集合,0 类和 1 类共 3600 张。
+
+- 验证集合,id号为 1800-1999 作为验证集合,0 类和 1 类共 400 张。
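+
+按上述说明生成两类数据并写出标签文件的示意代码如下,仅为便于理解的草稿(图像命名、目录组织均为假设,标签文件为每行 “图像相对路径 类别标签” 的形式):
+
+```python
+# 示意:将文本行图像作为 0 度(0 类),旋转 180 度后作为 180 度(1 类)
+import os
+from PIL import Image
+
+os.makedirs("0", exist_ok=True)
+os.makedirs("1", exist_ok=True)
+with open("train_list.txt", "w") as f:
+    for i in range(1800):  # 假设 id 0-1799 为训练集合
+        img = Image.open(f"lsvt_crops/img_{i}.jpg")  # 假设的文本行裁剪图路径
+        img.save(f"0/img_{i}.jpg")                   # 0 度,0 类
+        img.rotate(180, expand=True).save(f"1/img_{i}.jpg")  # 180 度,1 类
+        f.write(f"0/img_{i}.jpg 0\n1/img_{i}.jpg 1\n")
+```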
+
+处理后的数据集部分数据可视化如下:
+
+
+
+此处提供了经过上述方法处理好的数据,可以直接下载得到。
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压文本行方向分类场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/textline_orientation.tar
+tar -xf textline_orientation.tar
+cd ../
+```
+
+执行上述命令后,`dataset/` 下存在 `textline_orientation` 目录,该目录中具有以下数据:
+
+```
+├── 0
+│ ├── img_0.jpg
+│ ├── img_1.jpg
+...
+├── 1
+│ ├── img_0.jpg
+│ ├── img_1.jpg
+...
+├── train_list.txt
+└── val_list.txt
+```
+
+其中 `0/` 和 `1/` 分别存放 0 类和 1 类的数据。`train_list.txt` 和 `val_list.txt` 分别为训练集和验证集的标签文件。
+
+**备注:**
+
+* 关于 `train_list.txt`、`val_list.txt` 的格式说明,可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
+
+
+
+
+### 3.3 模型训练
+
+
+在 `ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml
+```
+
+
+**备注:**
+
+* 由于此处使用的是开源数据集,而非训练提供模型时所用的内部数据集,因此不能直接复现文档中给出的模型指标,如果希望得到更高的精度,可以根据需要处理[ICDAR2019-LSVT 文本行识别数据](https://aistudio.baidu.com/aistudio/datasetdetail/8429)。
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [0], 'scores': [1.0], 'file_name': 'deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png', 'label_names': ['0_degree']}]
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `deploy/images/PULC/textline_orientation/textline_orientation_test_0_0.png` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用 `./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml` 中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+当前教师模型最好的权重保存在 `output/ResNet101_vd/best_model.pdparams`。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`。
+
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/textline_orientation/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/PPLCNet_x1_0/best_model \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_textline_orientation_infer
+```
+执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_textline_orientation_infer` 文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+├── PPLCNet_x1_0_textline_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重可以根据实际情况来选择,如果希望导出知识蒸馏后的权重,则最佳权重保存在`output/DistillationModel/best_model_student.pdparams`,在导出命令中更改`-o Global.pretrained_model=xx`中的字段为`output/DistillationModel/best_model_student`即可。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载 inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/textline_orientation_infer.tar && tar -xf textline_orientation_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── textline_orientation_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/textline_orientation/textline_orientation_test_0_0.png` 进行文本行方向分类。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/textline_orientation/inference_textline_orientation.yaml
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/textline_orientation/inference_textline_orientation.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+textline_orientation_test_0_0.png: class id(s): [0], score(s): [1.00], label_name(s): ['0_degree']
+```
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/textline_orientation/inference_textline_orientation.yaml -o Global.infer_imgs="./images/PULC/textline_orientation/"
+```
+
+终端中会输出该文件夹内所有图像的分类结果,如下所示。
+
+```
+textline_orientation_test_0_0.png: class id(s): [0], score(s): [1.00], label_name(s): ['0_degree']
+textline_orientation_test_0_1.png: class id(s): [0], score(s): [1.00], label_name(s): ['0_degree']
+textline_orientation_test_1_0.png: class id(s): [1], score(s): [1.00], label_name(s): ['180_degree']
+textline_orientation_test_1_1.png: class id(s): [1], score(s): [1.00], label_name(s): ['180_degree']
+```
+
+其中,`0_degree` 表示该文本行为 0 度,`180_degree` 表示该文本行为 180 度。
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_traffic_sign.md b/docs/zh_CN/PULC/PULC_traffic_sign.md
new file mode 100644
index 0000000000000000000000000000000000000000..700cbd58b89501ec8b7fe9add5bdceb373a36936
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_traffic_sign.md
@@ -0,0 +1,485 @@
+# PULC 交通标志分类模型
+
+------
+
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的交通标志分类模型。该模型可以广泛应用于自动驾驶、道路监控等场景。
+
+下表列出了不同交通标志分类模型的相关指标,前两行展现了使用 SwinTransformer_tiny 和 MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第三行至第六行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。
+
+
+| 模型 | Top-1 Acc(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 98.11 | 89.45 | 111 | 使用ImageNet预训练模型 |
+| MobileNetV3_small_x0_35 | 93.88 | 3.01 | 3.9 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 97.78 | 2.10 | 8.2 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 97.84 | 2.10 | 8.2 | 使用SSLD预训练模型 |
+| PPLCNet_x1_0 | 98.14 | 2.10 | 8.2 | 使用SSLD预训练模型+EDA策略|
+| PPLCNet_x1_0 | 98.35 | 2.10 | 8.2 | 使用SSLD预训练模型+EDA策略+SKL-UGI知识蒸馏策略|
+
+从表中可以看出,backbone 为 SwinTransformer_tiny 时精度较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,但是精度下降明显。将 backbone 替换为 PPLCNet_x1_0 时,相比 MobileNetV3_small_x0_35,精度提升 3.9 个百分点,同时速度提升 43% 左右。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升约 0.06%;进一步地,当融合 EDA 策略后,精度可以再提升 0.3%;最后,在使用 SKL-UGI 知识蒸馏后,精度可以继续提升 0.21%。此时,PPLCNet_x1_0 的精度超越了 SwinTransformer_tiny,且速度快 41 倍。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=traffic_sign --infer_imgs=pulc_demo_imgs/traffic_sign/100999_83928.jpg
+```
+
+结果如下:
+```
+>>> result
+class_ids: [182, 179, 162, 128, 24], scores: [0.98623, 0.01255, 0.00022, 0.00021, 0.00012], label_names: ['pl110', 'pl100', 'pl120', 'p26', 'pm10'], filename: pulc_demo_imgs/traffic_sign/100999_83928.jpg
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="traffic_sign")
+result = model.predict(input_data="pulc_demo_imgs/traffic_sign/100999_83928.jpg")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测并返回预测结果,默认 `batch_size` 为 1。如果需要更改 `batch_size`,需要在实例化模型时指定,如 `model = paddleclas.PaddleClas(model_name="traffic_sign", batch_size=2)`。使用默认配置时的返回结果示例如下:
+
+```
+>>> result
+[{'class_ids': [182, 179, 162, 128, 24], 'scores': [0.98623, 0.01255, 0.00022, 0.00021, 0.00012], 'label_names': ['pl110', 'pl100', 'pl120', 'p26', 'pm10'], 'filename': 'pulc_demo_imgs/traffic_sign/100999_83928.jpg'}]
+```
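+
+如果需要对整个文件夹进行批量预测,可以结合 `for` 循环对返回的生成器迭代处理(以下为示意代码,文件夹路径为假设值):
+
+```python
+import paddleclas
+
+# batch_size=2:每次迭代返回 2 张图片的预测结果
+model = paddleclas.PaddleClas(model_name="traffic_sign", batch_size=2)
+results = model.predict(input_data="pulc_demo_imgs/traffic_sign/")
+for batch_result in results:       # 每个元素是一个 batch 的结果列表
+    for res in batch_result:
+        print(res["filename"], res["label_names"])
+```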
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+本案例中所使用的数据为[Tsinghua-Tencent 100K dataset (CC-BY-NC license)](https://cg.cs.tsinghua.edu.cn/traffic-sign/),在使用的过程中,对交通标志检测框进行随机扩充与裁剪,从而得到用于训练与测试的图像,下面简称该数据集为`TT100K`数据集。
+
+
+
+#### 3.2.2 数据集获取
+
+在TT100K数据集上,对交通标志检测框进行随机扩充与裁剪,从而得到用于训练与测试的图像。随机扩充检测框的逻辑如下所示。
+
+```python
+import random
+
+def get_random_crop_box(xmin, ymin, xmax, ymax, img_height, img_width, ratio=1.0):
+    # 检测框的高与宽
+    h = ymax - ymin
+    w = xmax - xmin
+
+    # 向四个方向随机扩充,扩充量不超过框的边长,且不会越过图像边界
+    xmin_diff = random.random() * ratio * min(w, xmin / ratio)
+    ymin_diff = random.random() * ratio * min(h, ymin / ratio)
+    xmax_diff = random.random() * ratio * min(w, (img_width - xmax - 1) / ratio)
+    ymax_diff = random.random() * ratio * min(h, (img_height - ymax - 1) / ratio)
+
+    new_xmin = round(xmin - xmin_diff)
+    new_ymin = round(ymin - ymin_diff)
+    new_xmax = round(xmax + xmax_diff)
+    new_ymax = round(ymax + ymax_diff)
+
+    return new_xmin, new_ymin, new_xmax, new_ymax
+```
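+
+下面是该函数的一个使用示意(图片路径与检测框坐标均为假设值,实际的坐标来自 TT100K 的标注文件):
+
+```python
+# 示例:对一个标注的检测框做随机扩充,再从原图中裁剪出对应区域
+from PIL import Image
+
+img = Image.open("demo.jpg")                    # 假设的图片路径
+xmin, ymin, xmax, ymax = 100, 120, 180, 200     # 假设的检测框坐标
+box = get_random_crop_box(xmin, ymin, xmax, ymax, img.height, img.width)
+img.crop(box).save("demo_crop.jpg")             # PIL 的 crop 接受 (左, 上, 右, 下)
+```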
+
+完整的预处理逻辑,可以参考下载好的数据集文件夹中的`deal.py`文件。
+
+
+处理后的数据集部分数据可视化如下。
+
+
+

+
+
+
+此处提供了经过上述方法处理好的数据,可以直接下载得到。
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压交通标志分类场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/traffic_sign.tar
+tar -xf traffic_sign.tar
+cd ../
+```
+
+执行上述命令后,`dataset/`下存在`traffic_sign`目录,该目录中具有以下数据:
+
+```
+traffic_sign
+├── train
+│ ├── 0_62627.jpg
+│ ├── 100000_89031.jpg
+│ ├── 100001_89031.jpg
+...
+├── test
+│ ├── 100423_2315.jpg
+│ ├── 100424_2315.jpg
+│ ├── 100425_2315.jpg
+...
+├── other
+│ ├── 100603_3422.jpg
+│ ├── 100604_3422.jpg
+...
+├── label_list_train.txt
+├── label_list_test.txt
+├── label_list_other.txt
+├── label_list_train_for_distillation.txt
+├── label_list_train.txt.debug
+├── label_list_test.txt.debug
+├── label_name_id.txt
+├── deal.py
+```
+
+其中`train/`和`test/`分别为训练集和验证集。`label_list_train.txt`和`label_list_test.txt`分别为训练集和验证集的标签文件,`label_list_train.txt.debug`和`label_list_test.txt.debug`分别为训练集和验证集的`debug`标签文件,其分别是`label_list_train.txt`和`label_list_test.txt`的子集,用该文件可以快速体验本案例的流程。`train`与`other`的混合数据用于本案例的`SKL-UGI知识蒸馏策略`,对应的训练标签文件为`label_list_train_for_distillation.txt`。
+
+
+**备注:**
+
+* 关于 `label_list_train.txt`、`label_list_test.txt`的格式说明,可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
+
+* 关于如何得到蒸馏的标签文件可以参考[知识蒸馏标签获得方法](../advanced_tutorials/ssld.md)。
+
+
+
+
+### 3.3 模型训练
+
+
+在 `ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml
+```
+
+验证集的最佳指标在 `98.14%` 左右(数据集较小,一般有0.1%左右的波动)。
+
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+    -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+输出结果如下:
+
+```
+99603_17806.jpg: class id(s): [216, 145, 49, 207, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pl25', 'pm15']
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `deploy/images/PULC/traffic_sign/99603_17806.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md#3.2)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用 `ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml` 中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+验证集的最佳指标为 `98.59%` 左右,当前教师模型最好的权重保存在 `output/ResNet101_vd/best_model.pdparams`。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型,使用ImageNet数据集的验证集作为新增的无标签数据。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+验证集的最佳指标为 `98.35%` 左右,当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`。
+
+
+
+
+## 5. 超参搜索
+
+在 [3.2 节](#3.2)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/traffic_sign/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_traffic_sign_infer
+```
+执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_traffic_sign_infer` 文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+├── PPLCNet_x1_0_traffic_sign_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载 inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/traffic_sign_infer.tar && tar -xf traffic_sign_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── traffic_sign_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/traffic_sign/99603_17806.jpg` 进行交通标志分类。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+99603_17806.jpg: class id(s): [216, 145, 49, 207, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pl25', 'pm15']
+```
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/traffic_sign/inference_traffic_sign.yaml -o Global.infer_imgs="./images/PULC/traffic_sign/"
+```
+
+终端中会输出该文件夹内所有图像的分类结果,如下所示。
+
+```
+100999_83928.jpg: class id(s): [182, 179, 162, 128, 24], score(s): [0.99, 0.01, 0.00, 0.00, 0.00], label_name(s): ['pl110', 'pl100', 'pl120', 'p26', 'pm10']
+99603_17806.jpg: class id(s): [216, 145, 49, 24, 169], score(s): [1.00, 0.00, 0.00, 0.00, 0.00], label_name(s): ['pm20', 'pm30', 'pm40', 'pm10', 'pm15']
+```
+
+对于输出的 `label_name`,可以在 `dataset/traffic_sign/report.pdf` 文件中查阅其对应的交通标志图片。
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/PULC/PULC_train.md b/docs/zh_CN/PULC/PULC_train.md
new file mode 100644
index 0000000000000000000000000000000000000000..035535c7f9eb04af952c628fca85cedaaffc97b8
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_train.md
@@ -0,0 +1,241 @@
+## 超轻量图像分类方案PULC
+------
+
+
+## 目录
+
+- [1. PULC方案简介](#1)
+- [2. 数据准备](#2)
+ - [2.1 数据集格式说明](#2.1)
+ - [2.2 标注文件生成](#2.2)
+- [3. 使用标准分类配置进行训练](#3)
+ - [3.1 骨干网络PP-LCNet](#3.1)
+ - [3.2 SSLD预训练权重](#3.2)
+ - [3.3 EDA数据增强策略](#3.3)
+ - [3.4 SKL-UGI模型蒸馏](#3.4)
+ - [3.5 总结](#3.5)
+- [4. 超参搜索](#4)
+ - [4.1 基于默认配置搜索](#4.1)
+ - [4.2 自定义搜索配置](#4.2)
+
+
+
+### 1. PULC方案简介
+
+图像分类是计算机视觉的基础算法之一,是企业应用中最常见的算法,也是许多 CV 应用的重要组成部分。近年来,骨干网络模型发展迅速,ImageNet 的精度纪录被不断刷新。然而,这些模型在实用场景的表现有时却不尽如人意。一方面,精度高的模型往往体积大,运算慢,常常难以满足实际部署需求;另一方面,选择了合适的模型之后,往往还需要经验丰富的工程师进行调参,费时费力。PaddleClas 为了解决企业应用难题,让分类模型的训练和调参更加容易,总结推出了实用轻量图像分类解决方案(PULC, Practical Ultra Lightweight Classification)。PULC融合了骨干网络、数据增广、蒸馏等多种前沿算法,可以自动训练得到轻量且高精度的图像分类模型。
+
+PULC 方案在人、车、OCR等方向的多个场景中均验证有效,用超轻量模型就可实现与 SwinTransformer 模型接近的精度,预测速度提高 40+ 倍。
+
+
+

+
+
+方案主要包括 4 部分,分别是:PP-LCNet轻量级骨干网络、SSLD预训练权重、数据增强策略集成(EDA)和 SKL-UGI 知识蒸馏算法。此外,我们还采用了超参搜索的方法,高效优化训练中的超参数。下面,我们以有人/无人场景为例,对方案进行说明。
+
+**备注**:针对一些特定场景,我们提供了基础的训练文档供参考,例如[有人/无人分类模型](PULC_person_exists.md)等,您可以在[这里](./PULC_model_list.md)找到这些文档。如果这些文档中的方法不能满足您的需求,或者您需要自定义训练任务,您可以参考本文档。
+
+
+
+### 2. 数据准备
+
+
+
+#### 2.1 数据集格式说明
+
+PaddleClas 使用 `txt` 格式文件指定训练集和测试集,以有人/无人场景为例,其中需要指定 `train_list.txt` 和 `val_list.txt` 当作训练集和验证集的数据标签,格式形如:
+
+```
+# 每一行采用"空格"分隔图像路径与标注
+train/1.jpg 0
+train/10.jpg 1
+...
+```
+
+如果您想获取更多常用分类数据集的信息,可以参考文档 [PaddleClas 分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明)。
+
+
+
+#### 2.2 标注文件生成
+
+如果您已经有实际场景中的数据,那么按照上节的格式进行标注即可。这里,我们提供了一个快速生成数据的脚本,您只需要将不同类别的数据分别放在文件夹中,运行脚本即可生成标注文件。
+
+首先,假设您存放数据的路径为`./train`,`train/` 中包含了每个类别的数据,类别号从 0 开始,每个类别的文件夹中有具体的图像数据。
+
+```shell
+train
+├── 0
+│ ├── 0.jpg
+│ ├── 1.jpg
+│ └── ...
+└── 1
+ ├── 0.jpg
+ ├── 1.jpg
+ └── ...
+└── ...
+```
+
+```shell
+tree -r -i -f train | grep -E "jpg|JPG|jpeg|JPEG|png|PNG" | awk -F "/" '{print $0" "$2}' > train_list.txt
+```
+
+其中,如果涉及更多的图片名称尾缀,可以增加 `grep -E` 后的内容;`$2` 中的 `2` 为类别号文件夹所在的层级。
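+
+如果不方便使用 `tree`、`awk` 等命令行工具,也可以用一段简单的 Python 脚本生成同样格式的标注文件(示意代码,目录名 `train` 与输出文件名 `train_list.txt` 与上文一致):
+
+```python
+# 遍历 train/ 下的各个类别子文件夹,生成 "图像路径 类别号" 格式的标注文件
+import os
+
+exts = (".jpg", ".jpeg", ".png")  # 如需支持更多图片尾缀,在此追加即可
+with open("train_list.txt", "w") as f:
+    for label in sorted(os.listdir("train")):
+        class_dir = os.path.join("train", label)
+        if not os.path.isdir(class_dir):
+            continue
+        for name in sorted(os.listdir(class_dir)):
+            if name.lower().endswith(exts):
+                f.write("{} {}\n".format(os.path.join("train", label, name), label))
+```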
+
+**备注:** 以上为数据集获取和生成的方法介绍,这里您可以直接下载有人/无人场景数据快速开始体验。
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,下载并解压有人/无人场景的数据。
+
+```shell
+cd dataset
+wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
+tar -xf person_exists.tar
+cd ../
+```
+
+
+
+### 3. 使用标准分类配置进行训练
+
+
+
+#### 3.1 骨干网络PP-LCNet
+
+PULC 采用了轻量骨干网络 PP-LCNet,相比同精度竞品速度快 50%,您可以在[PP-LCNet介绍](../models/PP-LCNet.md)查阅该骨干网络的详细介绍。
+直接使用 PP-LCNet 训练的命令为:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_search.yaml
+```
+
+为了方便性能对比,我们也提供了大模型 SwinTransformer_tiny 和轻量模型 MobileNetV3_small_x0_35 的配置文件,您可以使用命令训练:
+
+SwinTransformer_tiny:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/SwinTransformer_tiny_patch4_window7_224.yaml
+```
+
+MobileNetV3_small_x0_35:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/person_exists/MobileNetV3_small_x0_35.yaml
+```
+
+训练得到的模型精度对比如下表。
+
+| 模型 | Tpr(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 95.69 | 95.30 | 107 | 使用 ImageNet 预训练模型 |
+| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | 使用 ImageNet 预训练模型 |
+
+从中可以看出,PP-LCNet 的速度比 SwinTransformer 快很多,但是精度也略低。下面我们通过一系列优化来提高 PP-LCNet 模型的精度。
+
+
+
+#### 3.2 SSLD预训练权重
+
+SSLD 是百度自研的半监督蒸馏算法,在 ImageNet 数据集上,模型精度可以提升 3-7 个点,您可以在 [SSLD 介绍](../advanced_tutorials/ssld.md)找到详细介绍。我们发现,使用SSLD预训练权重,可以有效提升应用分类模型的精度。此外,在训练中使用更小的分辨率,可以有效提升模型精度。同时,我们也对学习率进行了优化。
+基于以上三点改进,我们训练得到模型精度为 92.1%,提升 2.6%。
+
+
+
+#### 3.3 EDA数据增强策略
+
+数据增强是视觉算法中常用的优化策略,可以对模型精度有明显提升。除了传统的 RandomCrop,RandomFlip 等方法之外,我们还应用了 RandomAugment 和 RandomErasing。您可以在[数据增强介绍](../advanced_tutorials/DataAugmentation.md)找到详细介绍。
+由于这两种数据增强对图片的修改较大,使分类任务变难,在一些小数据集上可能会导致模型欠拟合,我们将提前设置好这两种方法启用的概率。
+基于以上改进,我们训练得到模型精度为 93.43%,提升 1.3%。
+
+
+
+#### 3.4 SKL-UGI模型蒸馏
+
+模型蒸馏是一种可以有效提升小模型精度的方法,您可以在[知识蒸馏介绍](../advanced_tutorials/ssld.md)找到详细介绍。我们选择 ResNet101_vd 作为教师模型进行蒸馏。为了适应蒸馏过程,我们在此也对网络不同 stage 的学习率进行了调整。基于以上改进,我们训练得到模型精度为 95.6%,提升 1.4%。
+
+
+
+#### 3.5 总结
+
+经过以上方法优化,PP-LCNet最终精度达到 95.6%,达到了大模型的精度水平。我们将实验结果总结如下表:
+
+| 模型 | Tpr(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| SwinTransformer_tiny | 95.69 | 95.30 | 107 | 使用 ImageNet 预训练模型 |
+| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | 使用 ImageNet 预训练模型 |
+| PPLCNet_x1_0 | 92.10 | 2.12 | 6.5 | 使用 SSLD 预训练模型 |
+| PPLCNet_x1_0 | 93.43 | 2.12 | 6.5 | 使用 SSLD 预训练模型+EDA 策略|
+| PPLCNet_x1_0 | 95.60 | 2.12 | 6.5 | 使用 SSLD 预训练模型+EDA 策略+SKL-UGI 知识蒸馏策略|
+
+我们在其他 8 个场景中也使用了同样的优化策略,得到如下结果:
+
+| 场景 | 大模型 | 大模型精度(%) | 小模型 | 小模型精度(%) |
+|----------|----------|----------|----------|----------|
+| 人体属性识别 | Res2Net200_vd | 81.25 | PPLCNet_x1_0 | 78.59 |
+| 佩戴安全帽分类 | Res2Net200_vd | 98.92 | PPLCNet_x1_0 | 99.38 |
+| 交通标志分类 | SwinTransformer_tiny | 98.11 | PPLCNet_x1_0 | 98.35 |
+| 车辆属性识别 | Res2Net200_vd_26w_4s | 91.36 | PPLCNet_x1_0 | 90.81 |
+| 有车/无车分类 | SwinTransformer_tiny | 97.71 | PPLCNet_x1_0 | 95.92 |
+| 含文字图像方向分类 | SwinTransformer_tiny | 99.12 | PPLCNet_x1_0 | 99.06 |
+| 文本行方向分类 | SwinTransformer_tiny | 93.61 | PPLCNet_x1_0 | 96.01 |
+| 语种分类 | SwinTransformer_tiny | 98.12 | PPLCNet_x1_0 | 99.26 |
+
+
+从结果可以看出,PULC 方案在多个应用场景中均可提升模型精度。使用 PULC 方案可以大大减少模型优化的工作量,快速得到精度较高的模型。
+
+
+
+### 4. 超参搜索
+
+在上述训练过程中,我们调节了学习率、数据增广方法开启概率、分阶段学习率倍数等参数。
+这些参数在不同场景中最优值可能并不相同。我们提供了一个快速超参搜索的脚本,将超参调优的过程自动化。
+这个脚本会遍历搜索值列表中的参数来替代默认配置中的参数,依次训练,最终选择精度最高的模型所对应的参数作为搜索结果。
+
+
+
+#### 4.1 基于默认配置搜索
+
+配置文件 [search.yaml](../../../ppcls/configs/PULC/person_exists/search.yaml) 定义了有人/无人场景超参搜索的配置,使用如下命令即可完成超参数的搜索。
+
+```bash
+python3 tools/search_strategy.py -c ppcls/configs/PULC/person_exists/search.yaml
+```
+
+**备注**:关于搜索部分,我们也在不断优化,敬请期待。
+
+
+
+#### 4.2 自定义搜索配置
+
+您也可以根据训练结果或调参经验,修改超参搜索的配置。
+
+修改 `lrs` 中的`search_values`字段,可以修改学习率搜索值列表;
+
+修改 `resolutions` 中的 `search_values` 字段,可以修改分辨率的搜索值列表;
+
+修改 `ra_probs` 中的 `search_values` 字段,可以修改 RandAugment 开启概率的搜索值列表;
+
+修改 `re_probs` 中的 `search_values` 字段,可以修改 RandomErasing 开启概率的搜索值列表;
+
+修改 `lr_mult_list` 中的 `search_values` 字段,可以修改 lr_mult 搜索值列表;
+
+修改 `teacher` 中的 `search_values` 字段,可以修改教师模型的搜索列表。
+
+搜索完成后,会在 `output/search_person_exists` 中生成最终的结果。其中,除 `search_res` 外的各个目录,分别保存了每组搜索超参数对应的训练权重和训练日志文件;`search_res` 目录对应蒸馏后的结果,也就是最终的模型,该模型的权重保存在 `output/output_dir/search_person_exists/DistillationModel/best_model_student.pdparams`。
diff --git a/docs/zh_CN/PULC/PULC_vehicle_attribute.md b/docs/zh_CN/PULC/PULC_vehicle_attribute.md
new file mode 100644
index 0000000000000000000000000000000000000000..35b731f324236f4b9bcade4074c4a7afd21b9e8e
--- /dev/null
+++ b/docs/zh_CN/PULC/PULC_vehicle_attribute.md
@@ -0,0 +1,477 @@
+# PULC 车辆属性识别模型
+
+------
+
+
+## 目录
+
+- [1. 模型和应用场景介绍](#1)
+- [2. 模型快速体验](#2)
+ - [2.1 安装 paddlepaddle](#2.1)
+ - [2.2 安装 paddleclas](#2.2)
+ - [2.3 预测](#2.3)
+- [3. 模型训练、评估和预测](#3)
+ - [3.1 环境配置](#3.1)
+ - [3.2 数据准备](#3.2)
+ - [3.2.1 数据集来源](#3.2.1)
+ - [3.2.2 数据集获取](#3.2.2)
+ - [3.3 模型训练](#3.3)
+ - [3.4 模型评估](#3.4)
+ - [3.5 模型预测](#3.5)
+- [4. 模型压缩](#4)
+ - [4.1 SKL-UGI 知识蒸馏](#4.1)
+ - [4.1.1 教师模型训练](#4.1.1)
+ - [4.1.2 蒸馏训练](#4.1.2)
+- [5. 超参搜索](#5)
+- [6. 模型推理部署](#6)
+ - [6.1 推理模型准备](#6.1)
+ - [6.1.1 基于训练得到的权重导出 inference 模型](#6.1.1)
+ - [6.1.2 直接下载 inference 模型](#6.1.2)
+ - [6.2 基于 Python 预测引擎推理](#6.2)
+ - [6.2.1 预测单张图像](#6.2.1)
+ - [6.2.2 基于文件夹的批量预测](#6.2.2)
+ - [6.3 基于 C++ 预测引擎推理](#6.3)
+ - [6.4 服务化部署](#6.4)
+ - [6.5 端侧部署](#6.5)
+ - [6.6 Paddle2ONNX 模型转换与预测](#6.6)
+
+
+
+
+## 1. 模型和应用场景介绍
+
+该案例提供了用户使用 PaddleClas 的超轻量图像分类方案(PULC,Practical Ultra Lightweight image Classification)快速构建轻量级、高精度、可落地的车辆属性识别模型。该模型可以广泛应用于车辆识别、道路监控等场景。
+
+下表列出了不同车辆属性识别模型的相关指标,前三行展现了使用 Res2Net200_vd_26w_4s、 ResNet50、MobileNetV3_small_x0_35 作为 backbone 训练得到的模型的相关指标,第四行至第七行依次展现了替换 backbone 为 PPLCNet_x1_0、使用 SSLD 预训练模型、使用 SSLD 预训练模型 + EDA 策略、使用 SSLD 预训练模型 + EDA 策略 + SKL-UGI 知识蒸馏策略训练得到的模型的相关指标。
+
+
+| 模型 | mA(%) | 延时(ms) | 存储(M) | 策略 |
+|-------|-----------|----------|---------------|---------------|
+| Res2Net200_vd_26w_4s | 91.36 | 79.46 | 293 | 使用ImageNet预训练模型 |
+| ResNet50 | 89.98 | 12.83 | 92 | 使用ImageNet预训练模型 |
+| MobileNetV3_small_x0_35 | 87.41 | 2.91 | 2.8 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 89.57 | 2.36 | 7.2 | 使用ImageNet预训练模型 |
+| PPLCNet_x1_0 | 90.07 | 2.36 | 7.2 | 使用SSLD预训练模型 |
+| PPLCNet_x1_0 | 90.59 | 2.36 | 7.2 | 使用SSLD预训练模型+EDA策略|
+| PPLCNet_x1_0 | 90.81 | 2.36 | 7.2 | 使用SSLD预训练模型+EDA策略+SKL-UGI知识蒸馏策略|
+
+从表中可以看出,backbone 为 Res2Net200_vd_26w_4s 时精度较高,但是推理速度较慢。将 backbone 替换为轻量级模型 MobileNetV3_small_x0_35 后,速度可以大幅提升,但是精度下降明显。将 backbone 替换为 PPLCNet_x1_0 时,精度提升 2 个百分点,同时速度也提升 23% 左右。在此基础上,使用 SSLD 预训练模型后,在不改变推理速度的前提下,精度可以提升约 0.5 个百分点,进一步地,当融合EDA策略后,精度可以再提升 0.52 个百分点,最后,在使用 SKL-UGI 知识蒸馏后,精度可以继续提升 0.23 个百分点。此时,PPLCNet_x1_0 的精度与 Res2Net200_vd_26w_4s 仅相差 0.55 个百分点,但是速度快 32 倍。关于 PULC 的训练方法和推理部署方法将在下面详细介绍。
+
+**备注:**
+
+* 延时是基于 Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz 测试得到,开启 MKLDNN 加速策略,线程数为10。
+* 关于PP-LCNet的介绍可以参考[PP-LCNet介绍](../models/PP-LCNet.md),相关论文可以查阅[PP-LCNet paper](https://arxiv.org/abs/2109.15099)。
+
+
+
+
+## 2. 模型快速体验
+
+
+
+### 2.1 安装 paddlepaddle
+
+- 如果您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+```
+
+- 如果您的机器是 CPU,请运行以下命令安装
+
+```bash
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+```
+
+更多的版本需求,请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
+
+
+
+### 2.2 安装 paddleclas
+
+使用如下命令快速安装 paddleclas
+
+```
+pip3 install paddleclas
+```
+
+
+
+### 2.3 预测
+
+点击[这里](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip)下载 demo 数据并解压,然后在终端中切换到相应目录。
+
+* 使用命令行快速预测
+
+```bash
+paddleclas --model_name=vehicle_attribute --infer_imgs=pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg
+```
+
+结果如下:
+```
+>>> result
+attributes: Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505), output: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], filename: pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg
+Predict complete!
+```
+
+**备注**: 更换其他预测的数据时,只需要改变 `--infer_imgs=xx` 中的字段即可,支持传入整个文件夹。
+
+
+* 在 Python 代码中预测
+```python
+import paddleclas
+model = paddleclas.PaddleClas(model_name="vehicle_attribute")
+result = model.predict(input_data="pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg")
+print(next(result))
+```
+
+**备注**:`model.predict()` 为可迭代对象(`generator`),因此需要使用 `next()` 函数或 `for` 循环对其迭代调用。每次调用将以 `batch_size` 为单位进行一次预测并返回预测结果,默认 `batch_size` 为 1。如果需要更改 `batch_size`,需要在实例化模型时指定,如 `model = paddleclas.PaddleClas(model_name="vehicle_attribute", batch_size=2)`。使用默认配置时的返回结果示例如下:
+
+```
+>>> result
+[{'attributes': 'Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 'filename': 'pulc_demo_imgs/vehicle_attribute/0002_c002_00030670_0.jpg'}]
+```
+
+
+
+
+## 3. 模型训练、评估和预测
+
+
+
+### 3.1 环境配置
+
+* 安装:请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 3.2 数据准备
+
+
+
+#### 3.2.1 数据集来源
+
+本案例中所使用的数据为[VeRi 数据集](https://www.v7labs.com/open-datasets/veri-dataset)。
+
+
+
+#### 3.2.2 数据集获取
+
+部分数据可视化如下所示。
+
+
+

+
+
+首先从[VeRi数据集官网](https://www.v7labs.com/open-datasets/veri-dataset)中申请并下载数据,放在PaddleClas的`dataset`目录下,数据集目录名为`VeRi`,使用下面的命令进入该文件夹。
+
+```shell
+cd PaddleClas/dataset/VeRi/
+```
+
+然后使用下面的代码转换label(可以在python终端中执行下面的命令,也可以将其写入一个文件,然后使用`python3 convert.py`的方式运行该文件)。
+
+
+```python
+import os
+from xml.dom.minidom import parse
+
+vehicleids = []
+
+def convert_annotation(input_fp, output_fp):
+ in_file = open(input_fp)
+ list_file = open(output_fp, 'w')
+ tree = parse(in_file)
+
+ root = tree.documentElement
+
+ for item in root.getElementsByTagName("Item"):
+ label = ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
+ if item.hasAttribute("imageName"):
+ name = item.getAttribute("imageName")
+ if item.hasAttribute("vehicleID"):
+ vehicleid = item.getAttribute("vehicleID")
+ if vehicleid not in vehicleids :
+ vehicleids.append(vehicleid)
+ vid = vehicleids.index(vehicleid)
+ if item.hasAttribute("colorID"):
+            colorid = int(item.getAttribute("colorID"))
+ label[colorid-1] = '1'
+ if item.hasAttribute("typeID"):
+            typeid = int(item.getAttribute("typeID"))
+ label[typeid+9] = '1'
+ label = ','.join(label)
+ list_file.write(os.path.join('image_train', name) + "\t" + label + "\n")
+
+    list_file.close()
+    in_file.close()
+
+convert_annotation('train_label.xml', 'train_list.txt') #imagename vehiclenum colorid typeid
+convert_annotation('test_label.xml', 'test_list.txt')
+```
+
+执行上述命令后,`VeRi`目录中具有以下数据:
+
+```
+VeRi
+├── image_train
+│ ├── 0001_c001_00016450_0.jpg
+│ ├── 0001_c001_00016460_0.jpg
+│ ├── 0001_c001_00016470_0.jpg
+...
+├── image_test
+│ ├── 0002_c002_00030600_0.jpg
+│ ├── 0002_c002_00030605_1.jpg
+│ ├── 0002_c002_00030615_1.jpg
+...
+...
+├── train_list.txt
+├── test_list.txt
+├── train_label.xml
+├── test_label.xml
+```
+
+其中`image_train/`和`image_test/`分别为训练集和验证集。`train_list.txt`和`test_list.txt`分别为训练集和验证集的转换后用于训练的标签文件。
+
+
+
+
+### 3.3 模型训练
+
+
+在 `ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml` 中提供了基于该场景的训练配置,可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml
+```
+
+验证集的最佳指标在 `90.59%` 左右(数据集较小,一般有0.3%左右的波动)。
+
+
+
+
+### 3.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
+
+```bash
+python3 tools/eval.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+```
+
+其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 3.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+    -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
+```
+
+输出结果如下:
+
+```
+[{'attr': 'Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.9734100103378296)', 'pred': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 'file_name': './deploy/images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg'}]
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `./deploy/images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+
+
+## 4. 模型压缩
+
+
+
+### 4.1 SKL-UGI 知识蒸馏
+
+SKL-UGI 知识蒸馏是 PaddleClas 提出的一种简单有效的知识蒸馏方法,关于该方法的介绍,可以参考[SKL-UGI 知识蒸馏](../advanced_tutorials/ssld.md)。
+
+
+
+#### 4.1.1 教师模型训练
+
+复用 `ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml` 中的超参数,训练教师模型,训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+ -o Arch.name=ResNet101_vd
+```
+
+验证集的最佳指标为 `91.60%` 左右,当前教师模型最好的权重保存在 `output/ResNet101_vd/best_model.pdparams`。
+
+
+
+#### 4.1.2 蒸馏训练
+
+配置文件`ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型,`PPLCNet_x1_0`当作学生模型。训练脚本如下:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0_distillation.yaml \
+ -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+```
+
+验证集的最佳指标为 `90.81%` 左右,当前模型最好的权重保存在 `output/DistillationModel/best_model_student.pdparams`。
+
+
+
+
+## 5. 超参搜索
+
+在 [3.3 节](#3.3)和 [4.1 节](#4.1)所使用的超参数是根据 PaddleClas 提供的 `超参数搜索策略` 搜索得到的,如果希望在自己的数据集上得到更好的结果,可以参考[超参数搜索策略](PULC_train.md#4-超参搜索)来获得更好的训练超参数。
+
+**备注:** 此部分内容是可选内容,搜索过程需要较长的时间,您可以根据自己的硬件情况来选择执行。如果没有更换数据集,可以忽略此节内容。
+
+
+
+## 6. 模型推理部署
+
+
+
+### 6.1 推理模型准备
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+当使用 Paddle Inference 推理时,加载的模型类型为 inference 模型。本案例提供了两种获得 inference 模型的方法,如果希望得到和文档相同的结果,请选择[直接下载 inference 模型](#6.1.2)的方式。
+
+
+
+#### 6.1.1 基于训练得到的权重导出 inference 模型
+
+此处,我们提供了将权重和模型转换的脚本,执行该脚本可以得到对应的 inference 模型:
+
+```bash
+python3 tools/export_model.py \
+ -c ./ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model_student \
+ -o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_vehicle_attribute_infer
+```
+执行完该脚本后会在 `deploy/models/` 下生成 `PPLCNet_x1_0_vehicle_attribute_infer` 文件夹,`models` 文件夹下应有如下文件结构:
+
+```
+└── PPLCNet_x1_0_vehicle_attribute_infer
+ ├── inference.pdiparams
+ ├── inference.pdiparams.info
+ └── inference.pdmodel
+```
+
+**备注:** 此处的最佳权重是经过知识蒸馏后的权重路径,如果没有执行知识蒸馏的步骤,最佳模型保存在`output/PPLCNet_x1_0/best_model.pdparams`中。
+
+
+
+#### 6.1.2 直接下载 inference 模型
+
+[6.1.1 小节](#6.1.1)提供了导出 inference 模型的方法,此处也提供了该场景可以下载的 inference 模型,可以直接下载体验。
+
+```
+cd deploy/models
+# 下载 inference 模型并解压
+wget https://paddleclas.bj.bcebos.com/models/PULC/vehicle_attribute_infer.tar && tar -xf vehicle_attribute_infer.tar
+```
+
+解压完毕后,`models` 文件夹下应有如下文件结构:
+
+```
+├── vehicle_attribute_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 6.2 基于 Python 预测引擎推理
+
+
+
+
+#### 6.2.1 预测单张图像
+
+返回 `deploy` 目录:
+
+```
+cd ../
+```
+
+运行下面的命令,对图像 `./images/PULC/vehicle_attribute/0002_c002_00030670_0.jpg` 进行车辆属性识别。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml -o Global.use_gpu=True
+# 使用下面的命令使用 CPU 进行预测
+python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml -o Global.use_gpu=False
+```
+
+输出结果如下。
+
+```
+0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.9734099507331848)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]}
+```
+
+
+
+#### 6.2.2 基于文件夹的批量预测
+
+如果希望预测文件夹内的图像,可以直接修改配置文件中的 `Global.infer_imgs` 字段,也可以通过下面的 `-o` 参数修改对应的配置。
+
+```shell
+# 使用下面的命令使用 GPU 进行预测,如果希望使用 CPU 预测,可以在命令后面添加 -o Global.use_gpu=False
+python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehicle_attribute.yaml -o Global.infer_imgs="./images/PULC/vehicle_attribute/"
+```
+
+终端中会输出该文件夹内所有图像的属性识别结果,如下所示。
+
+```
+0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]}
+0014_c012_00040750_0.jpg: {'attributes': 'Color: (red, prob: 0.999872088432312), Type: (sedan, prob: 0.999976634979248)', 'output': [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]}
+```
+
+
+
+### 6.3 基于 C++ 预测引擎推理
+
+PaddleClas 提供了基于 C++ 预测引擎推理的示例,您可以参考[服务器端 C++ 预测](../inference_deployment/cpp_deploy.md)来完成相应的推理部署。如果您使用的是 Windows 平台,可以参考[基于 Visual Studio 2019 Community CMake 编译指南](../inference_deployment/cpp_deploy_on_windows.md)完成相应的预测库编译和模型预测工作。
+
+
+
+### 6.4 服务化部署
+
+Paddle Serving 提供高性能、灵活易用的工业级在线推理服务。Paddle Serving 支持 RESTful、gRPC、bRPC 等多种协议,提供多种异构硬件和多种操作系统环境下推理解决方案。更多关于Paddle Serving 的介绍,可以参考[Paddle Serving 代码仓库](https://github.com/PaddlePaddle/Serving)。
+
+PaddleClas 提供了基于 Paddle Serving 来完成模型服务化部署的示例,您可以参考[模型服务化部署](../inference_deployment/paddle_serving_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.5 端侧部署
+
+Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及服务器端在内的多硬件平台。更多关于 Paddle Lite 的介绍,可以参考[Paddle Lite 代码仓库](https://github.com/PaddlePaddle/Paddle-Lite)。
+
+PaddleClas 提供了基于 Paddle Lite 来完成模型端侧部署的示例,您可以参考[端侧部署](../inference_deployment/paddle_lite_deploy.md)来完成相应的部署工作。
+
+
+
+### 6.6 Paddle2ONNX 模型转换与预测
+
+Paddle2ONNX 支持将 PaddlePaddle 模型格式转化到 ONNX 模型格式。通过 ONNX 可以完成将 Paddle 模型到多种推理引擎的部署,包括TensorRT/OpenVINO/MNN/TNN/NCNN,以及其它对 ONNX 开源格式进行支持的推理引擎或硬件。更多关于 Paddle2ONNX 的介绍,可以参考[Paddle2ONNX 代码仓库](https://github.com/PaddlePaddle/Paddle2ONNX)。
+
+PaddleClas 提供了基于 Paddle2ONNX 来完成 inference 模型转换 ONNX 模型并作推理预测的示例,您可以参考[Paddle2ONNX 模型转换与预测](../../../deploy/paddle2onnx/readme.md)来完成相应的部署工作。
diff --git a/docs/zh_CN/advanced_tutorials/DataAugmentation.md b/docs/zh_CN/advanced_tutorials/DataAugmentation.md
index 9e5159a4a148b75fa28d1cd774a8e8498e6da460..7097ff637b9f204f19d596445b2d0376e7b52d3b 100644
--- a/docs/zh_CN/advanced_tutorials/DataAugmentation.md
+++ b/docs/zh_CN/advanced_tutorials/DataAugmentation.md
@@ -1,33 +1,149 @@
# 数据增强分类实战
---
-本节将基于 ImageNet-1K 的数据集详细介绍数据增强实验,如果想快速体验此方法,可以参考 [**30 分钟玩转 PaddleClas(进阶版)**](../quick_start/quick_start_classification_professional.md)中基于 CIFAR100 的数据增强实验。如果想了解相关算法的内容,请参考[数据增强算法介绍](../algorithm_introduction/DataAugmentation.md)。
-
-
## 目录
-- [1. 参数配置](#1)
- - [1.1 AutoAugment](#1.1)
- - [1.2 RandAugment](#1.2)
- - [1.3 TimmAutoAugment](#1.3)
- - [1.4 Cutout](#1.4)
- - [1.5 RandomErasing](#1.5)
- - [1.6 HideAndSeek](#1.6)
- - [1.7 GridMask](#1.7)
- - [1.8 Mixup](#1.8)
- - [1.9 Cutmix](#1.9)
- - [1.10 Mixup 与 Cutmix 同时使用](#1.10)
-- [2. 启动命令](#2)
-- [3. 注意事项](#3)
-- [4. 实验结果](#4)
+- [1. 算法介绍](#1)
+ - [1.1 数据增强简介](#1.1)
+ - [1.2 图像变换类数据增强](#1.2)
+ - [1.2.1 AutoAugment](#1.2.1)
+ - [1.2.1.1 AutoAugment 算法介绍](#1.2.1.1)
+ - [1.2.1.2 AutoAugment 配置](#1.2.1.2)
+ - [1.2.2 RandAugment](#1.2.2)
+ - [1.2.2.1 RandAugment 算法介绍](#1.2.2.1)
+ - [1.2.2.2 RandAugment 配置](#1.2.2.2)
+ - [1.2.3 TimmAutoAugment](#1.2.3)
+ - [1.2.3.1 TimmAutoAugment 算法介绍](#1.2.3.1)
+ - [1.2.3.2 TimmAutoAugment 配置](#1.2.3.2)
+ - [1.3 图像裁剪类数据增强](#1.3)
+ - [1.3.1 Cutout](#1.3.1)
+ - [1.3.1.1 Cutout 算法介绍](#1.3.1.1)
+ - [1.3.1.2 Cutout 配置](#1.3.1.2)
+ - [1.3.2 RandomErasing](#1.3.2)
+ - [1.3.2.1 RandomErasing 算法介绍](#1.3.2.1)
+ - [1.3.2.2 RandomErasing 配置](#1.3.2.2)
+ - [1.3.3 HideAndSeek](#1.3.3)
+ - [1.3.3.1 HideAndSeek 算法介绍](#1.3.3.1)
+ - [1.3.3.2 HideAndSeek 配置](#1.3.3.2)
+ - [1.3.4 GridMask](#1.3.4)
+ - [1.3.4.1 GridMask 算法介绍](#1.3.4.1)
+ - [1.3.4.2 GridMask 配置](#1.3.4.2)
+ - [1.4 图像混叠类数据增强](#1.4)
+ - [1.4.1 Mixup](#1.4.1)
+ - [1.4.1.1 Mixup 算法介绍](#1.4.1.1)
+ - [1.4.1.2 Mixup 配置](#1.4.1.2)
+ - [1.4.2 Cutmix](#1.4.2)
+ - [1.4.2.1 Cutmix 算法介绍](#1.4.2.1)
+ - [1.4.2.2 Cutmix 配置](#1.4.2.2)
+ - [1.4.2.3 Mixup 和 Cutmix 混合使用配置](#1.4.2.3)
+- [2. 模型训练、评估和预测](#2)
+ - [2.1 环境配置](#2.1)
+ - [2.2 数据准备](#2.2)
+ - [2.3 模型训练](#2.3)
+ - [2.4 模型评估](#2.4)
+ - [2.5 模型预测](#2.5)
+- [3. 参考文献](#3)
+
-## 1. 参数配置
-由于不同的数据增强方式含有不同的超参数,为了便于理解和使用,我们在 `configs/DataAugment` 里分别列举了 8 种训练 ResNet50 的数据增强方式的参数配置文件,用户可以在 `tools/run.sh` 里直接替换配置文件的路径即可使用。此处分别挑选了图像变换、图像裁剪、图像混叠中的一个示例展示,其他参数配置用户可以自查配置文件。
+## 1. 算法介绍
+
+在图像分类任务中,图像数据的增广是一种常用的正则化方法,常用于数据量不足或者模型参数较多的场景。在本章节中,我们将对除 ImageNet 分类任务标准数据增强外的 8 种数据增强方式进行简单的介绍和对比,用户也可以将这些增广方法应用到自己的任务中,以获得模型精度的提升。这 8 种数据增强方式在 ImageNet 上的精度指标如下所示。
+
+
+
+更具体的指标如下表所示:
+
+
+| 模型 | 初始学习率策略 | l2 decay | batch size | epoch | 数据变化策略 | Top1 Acc | 论文中结论 |
+|-------------|------------------|--------------|------------|-------|----------------|------------|----|
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | 标准变换 | 0.7731 | - |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | AutoAugment | 0.7795 | 0.7763 |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | mixup | 0.7828 | 0.7790 |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | cutmix | 0.7839 | 0.7860 |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | cutout | 0.7801 | - |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | gridmask | 0.7785 | 0.7790 |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | random-augment | 0.7770 | 0.7760 |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | random erasing | 0.7791 | - |
+| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | hide and seek | 0.7743 | 0.7720 |
-### 1.1 AutoAugment
+
+### 1.1 数据增强简介
+
+如果没有特殊说明,本章节中所有示例为 ImageNet 分类,并且假设最终输入网络的数据维度为:`[batch-size, 3, 224, 224]`
+
+其中 ImageNet 分类训练阶段的标准数据增强方式分为以下几个步骤(列表之后给出了对应的示意代码):
+
+1. 图像解码:简写为 `ImageDecode`
+2. 随机裁剪到长宽均为 224 的图像:简写为 `RandCrop`
+3. 水平方向随机翻转:简写为 `RandFlip`
+4. 图像数据的归一化:简写为 `Normalize`
+5. 图像数据的重排,`[224, 224, 3]` 变为 `[3, 224, 224]`:简写为 `Transpose`
+6. 多幅图像数据组成 batch 数据,如 `batch-size` 个 `[3, 224, 224]` 的图像数据拼组成 `[batch-size, 3, 224, 224]`:简写为 `Batch`
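+
+下面用 `paddle.vision.transforms` 给出上述标准流程的一个示意实现(并非 PaddleClas 的实际实现,图片路径为假设值):
+
+```python
+from PIL import Image
+from paddle.vision import transforms
+
+train_transform = transforms.Compose([
+    transforms.RandomResizedCrop(224),              # RandCrop:随机裁剪到 224x224
+    transforms.RandomHorizontalFlip(),              # RandFlip:随机水平翻转
+    transforms.Normalize(mean=[123.675, 116.28, 103.53],
+                         std=[58.395, 57.12, 57.375],
+                         data_format='HWC'),        # Normalize:减均值除方差
+    transforms.Transpose(),                         # Transpose:[224, 224, 3] -> [3, 224, 224]
+])
+
+img = Image.open("demo.jpg")                        # ImageDecode:图像解码
+x = train_transform(img)                            # 得到 [3, 224, 224] 的数据
+# 多幅图像再由 DataLoader 拼组成 [batch-size, 3, 224, 224] 的 Batch
+```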
+
+相比于上述标准的图像增广方法,研究者也提出了很多改进的图像增广策略,这些策略均是在标准增广方法的不同阶段插入一定的操作,基于这些策略操作所处的不同阶段,我们将其分为了三类:
+
+1. 对 `RandCrop` 后的 224 的图像进行一些变换: AutoAugment,RandAugment
+2. 对 `Transpose` 后的 224 的图像进行一些裁剪: CutOut,RandErasing,HideAndSeek,GridMask
+3. 对 `Batch` 后的数据进行混合: Mixup,Cutmix
+
+增广后的可视化效果如下所示。
+
+
+
+具体如下表所示:
+
+
+| 变换方法 | 输入 | 输出 | Auto-<br>Augment\[1\] | Rand-<br>Augment\[2\] | CutOut\[3\] | Rand<br>Erasing\[4\] | HideAnd-<br>Seek\[5\] | GridMask\[6\] | Mixup\[7\] | Cutmix\[8\] |
+|-------------|---------------------------|---------------------------|------------------|------------------|-------------|------------------|------------------|---------------|------------|------------|
+| Image<br>Decode | Binary | (224, 224, 3)<br>uint8 | Y | Y | Y | Y | Y | Y | Y | Y |
+| RandCrop | (:, :, 3)<br>uint8 | (224, 224, 3)<br>uint8 | Y | Y | Y | Y | Y | Y | Y | Y |
+| **Process** | (224, 224, 3)<br>uint8 | (224, 224, 3)<br>uint8 | Y | Y | \- | \- | \- | \- | \- | \- |
+| RandFlip | (224, 224, 3)<br>uint8 | (224, 224, 3)<br>uint8 | Y | Y | Y | Y | Y | Y | Y | Y |
+| Normalize | (224, 224, 3)<br>uint8 | (224, 224, 3)<br>float32 | Y | Y | Y | Y | Y | Y | Y | Y |
+| Transpose | (224, 224, 3)<br>float32 | (3, 224, 224)<br>float32 | Y | Y | Y | Y | Y | Y | Y | Y |
+| **Process** | (3, 224, 224)<br>float32 | (3, 224, 224)<br>float32 | \- | \- | Y | Y | Y | Y | \- | \- |
+| Batch | (3, 224, 224)<br>float32 | (N, 3, 224, 224)<br>float32 | Y | Y | Y | Y | Y | Y | Y | Y |
+| **Process** | (N, 3, 224, 224)<br>float32 | (N, 3, 224, 224)<br>float32 | \- | \- | \- | \- | \- | \- | Y | Y |
+
+
+PaddleClas 中集成了上述所有的数据增强策略,每种数据增强策略的参考论文与参考开源代码均在下面的介绍中列出。下文将介绍这些策略的原理与使用方法,并以下图为例,对变换后的效果进行可视化。为了说明问题,本章节中将 `RandCrop` 替换为 `Resize`。
+
+![][test_baseline]
+
+
+
+### 1.2 图像变换类
+
+图像变换类指的是对 `RandCrop` 后的 224 的图像进行一些变换,主要包括
+
++ AutoAugment
++ RandAugment
++ TimmAutoAugment
+
+
+
+#### 1.2.1 AutoAugment
+
+
+
+##### 1.2.1.1 AutoAugment 算法介绍
+
+论文地址:[https://arxiv.org/abs/1805.09501v1](https://arxiv.org/abs/1805.09501v1)
+
+开源代码 github 地址:[https://github.com/DeepVoltaire/AutoAugment](https://github.com/DeepVoltaire/AutoAugment)
+
+不同于常规的人工设计图像增广方式,AutoAugment 是在一系列图像增广子策略的搜索空间中通过搜索算法找到的适合特定数据集的图像增广方案。针对 ImageNet 数据集,最终搜索出来的数据增强方案包含 25 个子策略组合,每个子策略中都包含两种变换,针对每幅图像都随机的挑选一个子策略组合,然后以一定的概率来决定是否执行子策略中的每种变换。
+
+经过 AutoAugment 数据增强后结果如下图所示。
+
+![][test_autoaugment]
+
+
+
+##### 1.2.1.2 AutoAugment 配置
`AutoAugment` 的图像增广方式的配置如下。`AutoAugment` 是在 uint8 的数据格式上转换的,所以其处理过程应该放在归一化操作(`NormalizeImage`)之前。
@@ -48,8 +164,31 @@
order: ''
```
-
-### 1.2 RandAugment
+
+
+#### 1.2.2 RandAugment
+
+
+
+##### 1.2.2.1 RandAugment 算法介绍
+
+论文地址:[https://arxiv.org/pdf/1909.13719.pdf](https://arxiv.org/pdf/1909.13719.pdf)
+
+开源代码 github 地址:[https://github.com/heartInsert/randaugment](https://github.com/heartInsert/randaugment)
+
+
+`AutoAugment` 的搜索方法比较暴力,直接在数据集上搜索针对该数据集的最优策略,其计算量很大。在 `RandAugment` 文章中作者发现,一方面,针对越大的模型,越大的数据集,使用 `AutoAugment` 方式搜索到的增广方式产生的收益也就越小;另一方面,这种搜索出的最优策略是针对该数据集的,其迁移能力较差,并不太适合迁移到其他数据集上。
+
+在 `RandAugment` 中,作者提出了一种随机增广的方式,不再像 `AutoAugment` 中那样使用特定的概率确定是否使用某种子策略,而是所有的子策略都会以同样的概率被选择到,论文中的实验也表明这种数据增强方式即使在大模型的训练中也具有很好的效果。
+
+
+经过 RandAugment 数据增强后结果如下图所示。
+
+![][test_randaugment]
+
+
+
+##### 1.2.2.2 RandAugment 配置
`RandAugment` 的图像增广方式的配置如下,其中用户需要指定其中的参数 `num_layers` 与 `magnitude`,默认的数值分别是 `2` 和 `5`。`RandAugment` 是在 uint8 的数据格式上转换的,所以其处理过程应该放在归一化操作(`NormalizeImage`)之前。
@@ -72,8 +211,21 @@
order: ''
```
-
-### 1.3 TimmAutoAugment
+
+
+#### 1.2.3 TimmAutoAugment
+
+
+
+##### 1.2.3.1 TimmAutoAugment 算法介绍
+
+开源代码 github 地址:[https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/auto_augment.py](https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/auto_augment.py)
+
+`TimmAutoAugment` 是开源作者对 AutoAugment 和 RandAugment 的改进,事实证明,其在很多视觉任务上有更好的表现,目前绝大多数 VisionTransformer 模型都是基于 TimmAutoAugment 去实现的。
+
+
+
+##### 1.2.3.2 TimmAutoAugment 配置
`TimmAutoAugment` 的图像增广方式的配置如下,其中用户需要指定其中的参数 `config_str`、`interpolation`、`img_size`,默认的数值分别是 `rand-m9-mstd0.5-inc1`、`bicubic`、`224`。`TimmAutoAugment` 是在 uint8 的数据格式上转换的,所以其处理过程应该放在归一化操作(`NormalizeImage`)之前。
@@ -97,8 +249,43 @@
order: ''
```
-
-### 1.4 Cutout
+
+
+### 1.3 图像裁剪类
+
+图像裁剪类主要是对 `Transpose` 后的 224 的图像进行一些裁剪,并将裁剪区域的像素值置为特定的常数(默认为 0),主要包括:
+
++ CutOut
++ RandErasing
++ HideAndSeek
++ GridMask
+
+图像裁剪的这些增广并非一定要放在归一化之后,也有不少实现是放在归一化之前的,也就是直接对 uint8 的图像进行操作,两种方式的差别是:如果直接对 uint8 的图像进行操作,那么再经过归一化之后被裁剪的区域将不再是纯黑或纯白(减均值除方差之后像素值不为 0)。而对归一后之后的数据进行操作,裁剪的区域会是纯黑或纯白。
+
+上述的裁剪变换思路是相同的,都是为了解决训练出的模型在有遮挡数据上泛化能力较差的问题,不同的是他们的裁剪方式、区域不太一样。
+
+
+
+#### 1.3.1 Cutout
+
+
+
+##### 1.3.1.1 Cutout 算法介绍
+
+论文地址:[https://arxiv.org/abs/1708.04552](https://arxiv.org/abs/1708.04552)
+
+开源代码 github 地址:[https://github.com/uoguelph-mlrg/Cutout](https://github.com/uoguelph-mlrg/Cutout)
+
+Cutout 可以理解为 Dropout 的一种扩展操作,不同的是 Dropout 是对图像经过网络后生成的特征进行遮挡,而 Cutout 是直接对输入的图像进行遮挡,相对于 Dropout 对噪声的鲁棒性更好。作者在论文中也进行了说明,这样的做法有以下两点优势:(1)通过 Cutout 可以模拟真实场景中主体被部分遮挡时的分类场景;(2)可以促进模型充分利用图像中更多的内容来进行分类,防止网络只关注显著性的图像区域,从而发生过拟合。
+
+
+经过 Cutout 数据增强后结果如下图所示。
+
+![][test_cutout]
+
+
+
+##### 1.3.1.2 Cutout 配置
`Cutout` 的图像增广方式的配置如下,其中用户需要指定其中的参数 `n_holes` 与 `length`,默认的数值分别是 `1` 和 `112`。类似其他图像裁剪类的数据增强方式,`Cutout` 既可以在 uint8 格式的数据上操作,也可以在归一化(`NormalizeImage`)后的数据上操作,此处给出的是在归一化后的操作。
@@ -121,8 +308,31 @@
length: 112
```
-
-### 1.5 RandomErasing
+
+
+
+#### 1.3.2 RandomErasing
+
+
+
+##### 1.3.2.1 RandomErasing 算法介绍
+
+论文地址:[https://arxiv.org/pdf/1708.04896.pdf](https://arxiv.org/pdf/1708.04896.pdf)
+
+开源代码 github 地址:[https://github.com/zhunzhong07/Random-Erasing](https://github.com/zhunzhong07/Random-Erasing)
+
+`RandomErasing` 与 `Cutout` 方法类似,同样是为了解决训练出的模型在有遮挡数据上泛化能力较差的问题,作者在论文中也指出,随机裁剪的方式与随机水平翻转具有一定的互补性。作者也在行人再识别(REID)上验证了该方法的有效性。与 `Cutout` 不同的是,在 `RandomErasing` 中,图片以一定的概率接受该种预处理方法,生成掩码的尺寸大小与长宽比也是根据预设的超参数随机生成。
+
+
+PaddleClas 中 `RandomErasing` 的使用方法如下所示。
+
+经过 RandomErasing 数据增强后结果如下图所示。
+
+![][test_randomerassing]
+
+
+
+##### 1.3.2.2 RandomErasing 配置
`RandomErasing` 的图像增广方式的配置如下,其中用户需要指定其中的参数 `EPSILON`、`sl`、`sh`、`r1`、`attempt`、`use_log_aspect`、`mode`,默认的数值分别是 `0.25`、`0.02`、`1.0/3.0`、`0.3`、`10`、`True`、`pixel`。类似其他图像裁剪类的数据增强方式,`RandomErasing` 既可以在 uint8 格式的数据上操作,也可以在归一化(`NormalizeImage`)后的数据上操作,此处给出的是在归一化后的操作。
@@ -150,8 +360,35 @@
mode: pixel
```
-
-### 1.6 HideAndSeek
+
+
+#### 1.3.3 HideAndSeek
+
+
+
+##### 1.3.3.1 HideAndSeek 算法介绍
+
+论文地址:[https://arxiv.org/pdf/1811.02545.pdf](https://arxiv.org/pdf/1811.02545.pdf)
+
+开源代码 github 地址:[https://github.com/kkanshul/Hide-and-Seek](https://github.com/kkanshul/Hide-and-Seek)
+
+
+`HideAndSeek` 论文将图像分为若干块区域(patch),对于每块区域,都以一定的概率生成掩码,不同区域的掩码含义如下图所示。
+
+
+![][hide_and_seek_mask_expanation]
+
+
+PaddleClas 中 `HideAndSeek` 的使用方法如下所示。
+
+
+经过 HideAndSeek 数据增强后结果如下图所示。
+
+![][test_hideandseek]
+
+
+
+##### 1.3.3.2 HideAndSeek 配置
`HideAndSeek` 的图像增广方式的配置如下。类似其他图像裁剪类的数据增强方式,`HideAndSeek` 既可以在 uint8 格式的数据上操作,也可以在归一化(`NormalizeImage`)后的数据上操作,此处给出的是在归一化后的操作。
@@ -172,9 +409,43 @@
- HideAndSeek:
```
-
+
+
+#### 1.3.4 GridMask
+
+
+
+##### 1.3.4.1 GridMask 算法介绍
+
+论文地址:[https://arxiv.org/abs/2001.04086](https://arxiv.org/abs/2001.04086)
+
+开源代码 github 地址:[https://github.com/akuxcw/GridMask](https://github.com/akuxcw/GridMask)
-### 1.7 GridMask
+
+作者在论文中指出,此前存在的基于对图像 crop 的方法存在两个问题,如下图所示:
+
+1. 过度删除区域可能造成目标主体大部分甚至全部被删除,或者导致上下文信息的丢失,导致增广后的数据成为噪声数据;
+2. 保留过多的区域,对目标主体及上下文基本产生不了什么影响,失去增广的意义。
+
+![][gridmask-0]
+
+因此如果避免过度删除或过度保留成为需要解决的核心问题。
+
+
+`GridMask` 是通过生成一个与原图分辨率相同的掩码,并将掩码进行随机翻转,与原图相乘,从而得到增广后的图像,通过超参数控制生成的掩码网格的大小。
+
+
+在训练过程中,有两种以下使用方法:
+1. 设置一个概率 p,从训练开始就对图片以概率 p 使用 `GridMask` 进行增广。
+2. 一开始设置增广概率为 0,随着迭代轮数增加,对训练图片进行 `GridMask` 增广的概率逐渐增大,最后变为 p。
+
+论文中验证上述第二种方法的训练效果更好一些。
+
+
+
+
+##### 1.3.4.2 GridMask 配置
`GridMask` 的图像增广方式的配置如下,其中用户需要指定其中的参数 `d1`、`d2`、`rotate`、`ratio`、`mode`, 默认的数值分别是 `96`、`224`、`1`、`0.5`、`0`。类似其他图像裁剪类的数据增强方式,`GridMask` 既可以在 uint8 格式的数据上操作,也可以在归一化(`NormalizeImage`)后的数据上操作,此处给出的是在归一化后的操作。
@@ -200,8 +471,43 @@
mode: 0
```
-
-### 1.8 Mixup
+经过 GridMask 数据增强后结果如下图所示。
+
+![][test_gridmask]
+
+
+
+### 1.4 图像混叠类
+
+图像混叠主要对 `Batch` 后的数据进行混合,包括:
+
++ Mixup
++ Cutmix
+
+前文所述的图像变换与图像裁剪都是针对单幅图像进行的操作,而图像混叠是对两幅图像进行融合,生成一幅图像,两种方法的主要区别为混叠的方式不太一样。
+
+
+
+#### 1.4.1 Mixup
+
+
+
+##### 1.4.1.1 Mixup 算法介绍
+
+论文地址:[https://arxiv.org/pdf/1710.09412.pdf](https://arxiv.org/pdf/1710.09412.pdf)
+
+开源代码 github 地址:[https://github.com/facebookresearch/mixup-cifar10](https://github.com/facebookresearch/mixup-cifar10)
+
+Mixup 是最先提出的图像混叠增广方案,其原理简单、方便实现,不仅在图像分类上,在目标检测上也取得了不错的效果。为了便于实现,通常只对一个 batch 内的数据进行混叠,在 `Cutmix` 中也是如此。
+
+下面给出批内 Mixup 的一个示意实现(基于 numpy,并非 PaddleClas 源码)。需要指出的是,下述实现会出现对同一幅图与其自身相加的情况,也就是最终得到的图和原图一样,随着 `batch-size` 的增加这种情况出现的概率也会逐渐减小。
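+
+```python
+# Mixup 的示意实现(基于 numpy,非 PaddleClas 源码):对一个 batch 内的图像按
+# Beta(alpha, alpha) 分布采样的系数 lam 做线性插值,标签按同样的系数加权
+import numpy as np
+
+def mixup_batch(images, labels, alpha=0.2):
+    """images: [N, 3, H, W] 的 float32 数组;labels: [N, num_classes] 的 one-hot 标签"""
+    lam = np.random.beta(alpha, alpha)
+    index = np.random.permutation(images.shape[0])   # 随机打乱,得到 batch 内的配对
+    mixed_images = lam * images + (1 - lam) * images[index]
+    mixed_labels = lam * labels + (1 - lam) * labels[index]
+    return mixed_images, mixed_labels
+```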
+
+
+经过 Mixup 数据增强后结果如下图所示。
+
+![][test_mixup]
+
+
+
+##### 1.4.1.2 Mixup 配置
`Mixup` 的图像增广方式的配置如下,其中用户需要指定其中的参数 `alpha`,默认的数值是 `0.2`。类似其他图像混合类的数据增强方式,`Mixup` 是在图像做完数据处理后将每个 batch 内的数据做图像混叠,将混叠后的图像和标签输入网络中训练,所以其是在图像数据处理(图像变换、图像裁剪)后操作。
@@ -224,8 +530,26 @@
alpha: 0.2
```
-
-### 1.9 Cutmix
+
+#### 1.4.2 Cutmix
+
+
+
+##### 1.4.2.1 Cutmix 算法介绍
+
+论文地址:[https://arxiv.org/pdf/1905.04899v2.pdf](https://arxiv.org/pdf/1905.04899v2.pdf)
+
+开源代码 github 地址:[https://github.com/clovaai/CutMix-PyTorch](https://github.com/clovaai/CutMix-PyTorch)
+
+与 `Mixup` 直接对两幅图进行相加不一样,`Cutmix` 是从一幅图中随机裁剪出一个 `ROI`,然后覆盖当前图像中对应的区域,其示意实现如下(基于 numpy,并非 PaddleClas 源码):
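+
+```python
+# Cutmix 的示意实现:从配对图像中裁剪出面积比例约为 (1 - lam) 的矩形区域,
+# 覆盖到当前图像的对应位置,标签按实际被覆盖的面积比例加权
+import numpy as np
+
+def rand_bbox(h, w, lam):
+    """按约 (1 - lam) 的面积比例,在 (h, w) 的图像上随机生成一个矩形框"""
+    cut_ratio = np.sqrt(1.0 - lam)
+    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
+    cy, cx = np.random.randint(h), np.random.randint(w)      # 随机中心点
+    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
+    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
+    return y1, y2, x1, x2
+
+def cutmix_batch(images, labels, alpha=0.2):
+    """images: [N, 3, H, W] 的 float32 数组;labels: [N, num_classes] 的 one-hot 标签"""
+    lam = np.random.beta(alpha, alpha)
+    index = np.random.permutation(images.shape[0])
+    h, w = images.shape[2], images.shape[3]
+    y1, y2, x1, x2 = rand_bbox(h, w, lam)
+    images[:, :, y1:y2, x1:x2] = images[index, :, y1:y2, x1:x2]
+    # 按实际被覆盖的面积重新计算 lam,再对标签加权
+    lam = 1.0 - (y2 - y1) * (x2 - x1) / float(h * w)
+    mixed_labels = lam * labels + (1 - lam) * labels[index]
+    return images, mixed_labels
+```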
+
+经过 Cutmix 数据增强后结果如下图所示。
+
+![][test_cutmix]
+
+
+
+##### 1.4.2.2 Cutmix 配置
`Cutmix` 的图像增广方式的配置如下,其中用户需要指定其中的参数 `alpha`,默认的数值是 `0.2`。类似其他图像混合类的数据增强方式,`Cutmix` 是在图像做完数据处理后将每个 batch 内的数据做图像混叠,将混叠后的图像和标签输入网络中训练,所以其是在图像数据处理(图像变换、图像裁剪)后操作。
@@ -248,8 +572,9 @@
alpha: 0.2
```
-
-### 1.10 Mixup 与 Cutmix 同时使用
+
+
+##### 1.4.2.3 Mixup 和 Cutmix 混合使用配置
`Mixup` 与 `Cutmix` 同时使用的配置如下,其中用户需要指定额外的参数 `prob`,该参数控制不同数据增强的概率,默认为 `0.5`。
@@ -277,55 +602,149 @@
```
-## 2. 启动命令
-当用户配置完训练环境后,类似于训练其他分类任务,只需要将 `tools/train.sh` 中的配置文件替换成为相应的数据增强方式的配置文件即可。
+## 2. 模型训练、评估和预测
+
+
-其中 `train.sh` 中的内容如下:
+### 2.1 环境配置
-```bash
+* 安装:请先参考 [Paddle 安装教程](../installation/install_paddle.md) 以及 [PaddleClas 安装教程](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+
+
+
+### 2.2 数据准备
+
+请在[ImageNet 官网](https://www.image-net.org/)准备 ImageNet-1k 相关的数据。
+
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,将下载好的数据命名为 `ILSVRC2012` ,存放于此。 `ILSVRC2012` 目录中具有以下数据:
+
+```
+├── train
+│ ├── n01440764
+│ │ ├── n01440764_10026.JPEG
+│ │ ├── n01440764_10027.JPEG
+├── train_list.txt
+...
+├── val
+│ ├── ILSVRC2012_val_00000001.JPEG
+│ ├── ILSVRC2012_val_00000002.JPEG
+├── val_list.txt
+```
+
+其中 `train/` 和 `val/` 分别为训练集和验证集。`train_list.txt` 和 `val_list.txt` 分别为训练集和验证集的标签文件。
+
+**备注:**
+* 关于 `train_list.txt`、`val_list.txt`的格式说明,可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
+
+
+
+
+### 2.3 模型训练
+
+
+在 `ppcls/configs/ImageNet/DataAugment` 中提供了基于 ResNet50 的多种数据增强方式的训练配置,这里以使用 `AutoAugment` 为例,介绍数据增强的使用方法。可以通过如下脚本启动训练:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
- --selected_gpus="0,1,2,3" \
- --log_dir=ResNet50_Cutout \
+ --gpus="0,1,2,3" \
tools/train.py \
- -c ./ppcls/configs/ImageNet/DataAugment/ResNet50_Cutout.yaml
+ -c ppcls/configs/ImageNet/DataAugment/ResNet50_AutoAugment.yaml
```
-运行 `train.sh`:
+
+**备注:**
+
+* 1.当前精度最佳的模型会保存在 `output/ResNet50/best_model.pdparams`。
+* 2.如需更改数据增强类型,只需要替换`ppcls/configs/ImageNet/DataAugment`中的其他的配置文件即可。
+* 3.如果希望多种数据增强混合使用,请参考[第 1 节](#1)中的相关配置,更改配置文件中的数据增强部分即可。
+* 4.由于图像混叠时需对 label 进行混叠,无法计算训练数据的准确率,所以在训练过程中没有打印训练准确率。
+* 5.在使用数据增强后,由于训练数据更难,所以训练损失函数可能较大,训练集的准确率相对较低,但其拥有更好的泛化能力,所以验证集的准确率相对较高。
+* 6.在使用数据增强后,模型可能会趋于欠拟合状态,建议可以适当的调小 `l2_decay` 的值来获得更高的验证集准确率。
+* 7.几乎每一类图像增强均含有超参数,我们只提供了基于 ImageNet-1k 的超参数,其他数据集需要用户自己调试超参数,具体超参数的含义用户可以阅读相关的论文,调试方法也可以参考[训练技巧](../models_training/train_strategy.md)。
+
+
+
+### 2.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
```bash
-sh tools/train.sh
+python3 tools/eval.py \
+ -c ppcls/configs/ImageNet/DataAugment/ResNet50_AutoAugment.yaml \
+ -o Global.pretrained_model=output/ResNet50/best_model
```
+其中 `-o Global.pretrained_model="output/ResNet50/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 2.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```bash
+python3 tools/infer.py \
+ -c ppcls/configs/ImageNet/DataAugment/ResNet50_AutoAugment.yaml \
+ -o Global.pretrained_model=output/ResNet50/best_model
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [8, 7, 86, 81, 85], 'scores': [0.91347, 0.03779, 0.0036, 0.00117, 0.00112], 'file_name': 'docs/images/inference_deployment/whl_demo.jpg', 'label_names': ['hen', 'cock', 'partridge', 'ptarmigan', 'quail']}]
+```
+
+**备注:**
+
+* 这里`-o Global.pretrained_model="output/ResNet50/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+* 默认是对 `docs/images/inference_deployment/whl_demo.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
+
+* 默认输出的是 Top-5 的值,如果希望输出 Top-k 的值,可以指定`-o Infer.PostProcess.topk=k`,其中,`k` 为您指定的值。
+
+
+
-## 3. 注意事项
-* 由于图像混叠时需对 label 进行混叠,无法计算训练数据的准确率,所以在训练过程中没有打印训练准确率。
+## 3. 参考文献
-* 在使用数据增强后,由于训练数据更难,所以训练损失函数可能较大,训练集的准确率相对较低,但其有拥更好的泛化能力,所以验证集的准确率相对较高。
+[1] Cubuk E D, Zoph B, Mane D, et al. Autoaugment: Learning augmentation strategies from data[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2019: 113-123.
-* 在使用数据增强后,模型可能会趋于欠拟合状态,建议可以适当的调小 `l2_decay` 的值来获得更高的验证集准确率。
-* 几乎每一类图像增强均含有超参数,我们只提供了基于 ImageNet-1k 的超参数,其他数据集需要用户自己调试超参数,具体超参数的含义用户可以阅读相关的论文,调试方法也可以参考[训练技巧](../models_training/train_strategy.md)。
+[2] Cubuk E D, Zoph B, Shlens J, et al. Randaugment: Practical automated data augmentation with a reduced search space[J]. arXiv preprint arXiv:1909.13719, 2019.
-
-## 4. 实验结果
+[3] DeVries T, Taylor G W. Improved regularization of convolutional neural networks with cutout[J]. arXiv preprint arXiv:1708.04552, 2017.
+
+[4] Zhong Z, Zheng L, Kang G, et al. Random erasing data augmentation[J]. arXiv preprint arXiv:1708.04896, 2017.
+
+[5] Singh K K, Lee Y J. Hide-and-seek: Forcing a network to be meticulous for weakly-supervised object and action localization[C]//2017 IEEE international conference on computer vision (ICCV). IEEE, 2017: 3544-3553.
+
+[6] Chen P. GridMask Data Augmentation[J]. arXiv preprint arXiv:2001.04086, 2020.
+
+[7] Zhang H, Cisse M, Dauphin Y N, et al. mixup: Beyond empirical risk minimization[J]. arXiv preprint arXiv:1710.09412, 2017.
+
+[8] Yun S, Han D, Oh S J, et al. Cutmix: Regularization strategy to train strong classifiers with localizable features[C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 6023-6032.
-基于 PaddleClas,在 ImageNet1k 数据集上的分类精度如下。
-| 模型 | 初始学习率策略 | l2 decay | batch size | epoch | 数据变化策略 | Top1 Acc | 论文中结论 |
-|-------------|------------------|--------------|------------|-------|----------------|------------|----|
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | 标准变换 | 0.7731 | - |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | AutoAugment | 0.7795 | 0.7763 |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | mixup | 0.7828 | 0.7790 |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | cutmix | 0.7839 | 0.7860 |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | cutout | 0.7801 | - |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | gridmask | 0.7785 | 0.7790 |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | random-augment | 0.7770 | 0.7760 |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | random erasing | 0.7791 | - |
-| ResNet50 | 0.1/cosine_decay | 0.0001 | 256 | 300 | hide and seek | 0.7743 | 0.7720 |
-**注意**:
-* 在这里的实验中,为了便于对比,我们将 l2 decay 固定设置为 1e-4,在实际使用中,我们推荐尝试使用更小的 l2 decay。结合数据增强,我们发现将 l2 decay 由 1e-4 减小为 7e-5 均能带来至少 0.3~0.5% 的精度提升。
-* 我们目前尚未对不同策略进行组合并验证效果,这一块后续我们会开展更多的对比实验,敬请期待。
+[test_baseline]: ../../images/image_aug/test_baseline.jpeg
+[test_autoaugment]: ../../images/image_aug/test_autoaugment.jpeg
+[test_cutout]: ../../images/image_aug/test_cutout.jpeg
+[test_gridmask]: ../../images/image_aug/test_gridmask.jpeg
+[gridmask-0]: ../../images/image_aug/gridmask-0.png
+[test_hideandseek]: ../../images/image_aug/test_hideandseek.jpeg
+[test_randaugment]: ../../images/image_aug/test_randaugment.jpeg
+[test_randomerassing]: ../../images/image_aug/test_randomerassing.jpeg
+[hide_and_seek_mask_expanation]: ../../images/image_aug/hide-and-seek-visual.png
+[test_mixup]: ../../images/image_aug/test_mixup.png
+[test_cutmix]: ../../images/image_aug/test_cutmix.png
diff --git a/docs/zh_CN/advanced_tutorials/knowledge_distillation.md b/docs/zh_CN/advanced_tutorials/knowledge_distillation.md
index d3e6d77cf254a933fd6e6776e361f2c499b5c14d..6224e82a79eb39cc62641c66acd8bbc0133070ce 100644
--- a/docs/zh_CN/advanced_tutorials/knowledge_distillation.md
+++ b/docs/zh_CN/advanced_tutorials/knowledge_distillation.md
@@ -1,209 +1,411 @@
-# 知识蒸馏
+# 知识蒸馏实战
## 目录
- - [1. 模型压缩与知识蒸馏方法简介](#1)
- - [2. SSLD 蒸馏策略](#2)
- - [2.1 简介](#2.1)
- - [2.2 数据选择](#2.2)
- - [3. 实验](#3)
- - [3.1 教师模型的选择](#3.1)
- - [3.2 大数据蒸馏](#3.2)
- - [3.3 ImageNet1k 训练集 finetune](#3.3)
- - [3.4 数据增广以及基于 Fix 策略的微调](#3.4)
- - [3.5 实验过程中的一些问题](#3.5)
- - [4. 蒸馏模型的应用](#4)
- - [4.1 使用方法](#4.1)
- - [4.2 迁移学习 finetune](#4.2)
- - [4.3 目标检测](#4.3)
- - [5. SSLD 实战](#5)
- - [5.1 参数配置](#5.1)
- - [5.2 启动命令](#5.2)
- - [5.3 注意事项](#5.3)
- - [6. 参考文献](#6)
+
+- [1. 算法介绍](#1)
+ - [1.1 知识蒸馏简介](#1.1)
+ - [1.1.1 Response based distillation](#1.1.1)
+ - [1.1.2 Feature based distillation](#1.1.2)
+ - [1.1.3 Relation based distillation](#1.1.3)
+ - [1.2 PaddleClas支持的知识蒸馏算法](#1.2)
+ - [1.2.1 SSLD](#1.2.1)
+ - [1.2.2 DML](#1.2.2)
+ - [1.2.3 UDML](#1.2.3)
+ - [1.2.4 AFD](#1.2.4)
+ - [1.2.5 DKD](#1.2.5)
+- [2. 使用方法](#2)
+ - [2.1 环境配置](#2.1)
+ - [2.2 数据准备](#2.2)
+ - [2.3 模型训练](#2.3)
+ - [2.4 模型评估](#2.4)
+ - [2.5 模型预测](#2.5)
+ - [2.6 模型导出与推理](#2.6)
+- [3. 参考文献](#3)
+
+
-## 1. 模型压缩与知识蒸馏方法简介
+
+## 1. 算法介绍
+
+
+
+### 1.1 知识蒸馏简介
近年来,深度神经网络在计算机视觉、自然语言处理等领域被验证是一种极其有效的解决问题的方法。通过构建合适的神经网络,加以训练,最终网络模型的性能指标基本上都会超过传统算法。
在数据量足够大的情况下,通过合理构建网络模型的方式增加其参数量,可以显著改善模型性能,但是这又带来了模型复杂度急剧提升的问题。大模型在实际场景中使用的成本较高。
-深度神经网络一般有较多的参数冗余,目前有几种主要的方法对模型进行压缩,减小其参数量。如裁剪、量化、知识蒸馏等,其中知识蒸馏是指使用教师模型(teacher model)去指导学生模型(student model)学习特定任务,保证小模型在参数量不变的情况下,得到比较大的性能提升,甚至获得与大模型相似的精度指标 [1]。 PaddleClas 融合已有的蒸馏方法 [2,3],提供了一种简单的半监督标签知识蒸馏方案(SSLD,Simple Semi-supervised Label Distillation),基于 ImageNet1k 分类数据集,在 ResNet_vd 以及 MobileNet 系列上的精度均有超过 3% 的绝对精度提升,具体指标如下图所示。
+深度神经网络一般有较多的参数冗余,目前有几种主要的方法对模型进行压缩,减小其参数量。如裁剪、量化、知识蒸馏等,其中知识蒸馏是指使用教师模型(teacher model)去指导学生模型(student model)学习特定任务,保证小模型在参数量不变的情况下,得到比较大的性能提升,甚至获得与大模型相似的精度指标 [1]。
-
-
-## 2. SSLD 蒸馏策略
+根据蒸馏方式的不同,可以将知识蒸馏方法分为3个不同的类别:Response based distillation、Feature based distillation、Relation based distillation。下面进行详细介绍。
-
-### 2.1 简介
+
-SSLD 的流程图如下图所示。
+#### 1.1.1 Response based distillation
-
-首先,我们从 ImageNet22k 中挖掘出了近 400 万张图片,同时与 ImageNet-1k 训练集整合在一起,得到了一个新的包含 500 万张图片的数据集。然后,我们将学生模型与教师模型组合成一个新的网络,该网络分别输出学生模型和教师模型的预测分布,与此同时,固定教师模型整个网络的梯度,而学生模型可以做正常的反向传播。最后,我们将两个模型的 logits 经过 softmax 激活函数转换为 soft label,并将二者的 soft label 做 JS 散度作为损失函数,用于蒸馏模型训练。下面以 MobileNetV3(该模型直接训练,精度为 75.3%)的知识蒸馏为例,介绍该方案的核心关键点(baseline 为 79.12% 的 ResNet50_vd 模型蒸馏 MobileNetV3,训练集为 ImageNet1k 训练集,loss 为 cross entropy loss,迭代轮数为 120epoch,精度指标为 75.6%)。
+最早的知识蒸馏算法 KD,由 Hinton 提出,训练的损失函数中除了 gt loss 之外,还引入了学生模型与教师模型输出的 KL 散度,最终精度超过单纯使用 gt loss 训练的精度。这里需要注意的是,在训练的时候,需要首先训练得到一个更大的教师模型,来指导学生模型的训练过程。
-* 教师模型的选择。在进行知识蒸馏时,如果教师模型与学生模型的结构差异太大,蒸馏得到的结果反而不会有太大收益。相同结构下,精度更高的教师模型对结果也有很大影响。相比于 79.12% 的 ResNet50_vd 教师模型,使用 82.4% 的 ResNet50_vd 教师模型可以带来 0.4% 的绝对精度收益(`75.6%->76.0%`)。
+PaddleClas 中提出了一种简单实用的 SSLD 知识蒸馏算法 [6],在训练的时候去除了对 gt label 的依赖,结合大量无标注数据,最终蒸馏训练得到的预训练模型在 15 个模型上的精度提升平均高达 3%。
-* 改进 loss 计算方法。分类 loss 计算最常用的方法就是 cross entropy loss,我们经过实验发现,在使用 soft label 进行训练时,相对于 cross entropy loss,KL div loss 对模型性能提升几乎无帮助,但是使用具有对称特性的 JS div loss 时,在多个蒸馏任务上相比 cross entropy loss 均有 0.2% 左右的收益(`76.0%->76.2%`),SSLD 中也基于 JS div loss 展开实验。
+上述标准的蒸馏方法是通过一个大模型作为教师模型来指导学生模型提升效果,而后来又发展出 DML(Deep Mutual Learning)互学习蒸馏方法 [7],即通过两个结构相同的模型互相学习。具体地,相比于 KD 等依赖于大的教师模型的知识蒸馏算法,DML 脱离了对大的教师模型的依赖,蒸馏训练的流程更加简单,模型产出效率也要更高一些。
-* 更多的迭代轮数。蒸馏的 baseline 实验只迭代了 120 个 epoch 。实验发现,迭代轮数越多,蒸馏效果越好,最终我们迭代了 360 epoch,精度指标可以达到 77.1%(`76.2%->77.1%`)。
+
-* 无需数据集的真值标签,很容易扩展训练集。 SSLD 的 loss 在计算过程中,仅涉及到教师和学生模型对于相同图片的处理结果(经过 softmax 激活函数处理之后的 soft label),因此即使图片数据不包含真值标签,也可以用来进行训练并提升模型性能。该蒸馏方案的无标签蒸馏策略也大大提升了学生模型的性能上限(`77.1%->78.5%`)。
+#### 1.1.2 Feature based distillation
-* ImageNet1k 蒸馏 finetune 。 我们仅使用 ImageNet1k 数据,使用蒸馏方法对上述模型进行 finetune,最终仍然可以获得 0.4% 的性能提升(`78.5%->78.9%`)。
+Heo 等人提出了 OverHaul [8],计算学生模型与教师模型的 feature map distance,作为蒸馏的 loss。在这里对学生模型、教师模型的特征进行了变换,来保证二者的 feature map 可以正常地进行 distance 的计算。
+基于 feature map distance 的知识蒸馏方法也能够和 `1.1.1 章节` 中的基于 response 的知识蒸馏算法融合在一起,同时对学生模型的输出结果和中间层 feature map 进行监督。而对于 DML 方法来说,这种融合过程更为简单,因为不需要对学生和教师模型的 feature map 进行转换,便可以完成对齐(alignment)过程。PP-OCRv2 系统中便使用了这种方法,最终大幅提升了 OCR 文字识别模型的精度。
-
-### 2.2 数据选择
+
-* SSLD 蒸馏方案的一大特色就是无需使用图像的真值标签,因此可以任意扩展数据集的大小,考虑到计算资源的限制,我们在这里仅基于 ImageNet22k 数据集对蒸馏任务的训练集进行扩充。在 SSLD 蒸馏任务中,我们使用了 `Top-k per class` 的数据采样方案 [3] 。具体步骤如下。
- * 训练集去重。我们首先基于 SIFT 特征相似度匹配的方式对 ImageNet22k 数据集与 ImageNet1k 验证集进行去重,防止添加的 ImageNet22k 训练集中包含 ImageNet1k 验证集图像,最终去除了 4511 张相似图片。部分过滤的相似图片如下所示。
+#### 1.1.3 Relation based distillation
- 
+[1.1.1](#1.1.1) 和 [1.1.2](#1.1.2) 章节中的方法主要关注学生模型与教师模型的输出或者中间层 feature map,这些知识蒸馏算法只关注个体的输出结果,没有考虑到个体之间的输出关系。
- * 大数据集 soft label 获取,对于去重后的 ImageNet22k 数据集,我们使用 `ResNeXt101_32x16d_wsl` 模型进行预测,得到每张图片的 soft label 。
- * Top-k 数据选择,ImageNet1k 数据共有 1000 类,对于每一类,找出属于该类并且得分最高的 `k` 张图片,最终得到一个数据量不超过 `1000*k` 的数据集(某些类上得到的图片数量可能少于 `k` 张)。
- * 将该数据集与 ImageNet1k 的训练集融合组成最终蒸馏模型所使用的数据集,数据量为 500 万。
+Park 等人提出了 RKD [10],即基于关系的知识蒸馏算法。不同于之前只关注个体输出结果的方法,RKD 进一步考虑个体输出之间的关系,使用两种损失函数:二阶的距离损失(distance-wise)和三阶的角度损失(angle-wise)。
-
-## 3. 实验
-* PaddleClas 的蒸馏策略为`大数据集训练 + ImageNet1k 蒸馏 finetune` 的策略。选择合适的教师模型,首先在挑选得到的 500 万数据集上进行训练,然后在 ImageNet1k 训练集上进行 finetune,最终得到蒸馏后的学生模型。
+RKD 算法将教师模型输出结果之间的结构化关系迁移给学生模型。在最终计算蒸馏损失函数的时候,同时考虑 KD loss 和 RKD loss,最终精度优于单独使用 KD loss 蒸馏得到的模型精度。
+
+
+
+### 1.2 PaddleClas支持的知识蒸馏算法
+
+
+
+#### 1.2.1 SSLD
+
+##### 1.2.1.1 SSLD 算法介绍
+
+论文信息:
+
+> [Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones](https://arxiv.org/abs/2103.05959)
+>
+> Cheng Cui, Ruoyu Guo, Yuning Du, Dongliang He, Fu Li, Zewu Wu, Qiwen Liu, Shilei Wen, Jizhou Huang, Xiaoguang Hu, Dianhai Yu, Errui Ding, Yanjun Ma
+>
+> arxiv, 2021
+
+SSLD是百度于2021年提出的一种简单的半监督知识蒸馏方案,通过设计一种改进的JS散度作为损失函数,结合基于ImageNet22k数据集的数据挖掘策略,最终帮助15个骨干网络模型的精度平均提升超过3%。
-
-### 3.1 教师模型的选择
+更多关于SSLD的原理、模型库与使用介绍,请参考:[SSLD知识蒸馏算法介绍](./ssld.md)。
-为了验证教师模型和学生模型的模型大小差异和教师模型的模型精度对蒸馏结果的影响,我们做了几组实验验证。训练策略统一为:`cosine_decay_warmup,lr=1.3, epoch=120, bs=2048`,学生模型均为从头训练。
-|Teacher Model | Teacher Top1 | Student Model | Student Top1|
-|- |:-: |:-: | :-: |
-| ResNeXt101_32x16d_wsl | 84.2% | MobileNetV3_large_x1_0 | 75.78% |
-| ResNet50_vd | 79.12% | MobileNetV3_large_x1_0 | 75.60% |
-| ResNet50_vd | 82.35% | MobileNetV3_large_x1_0 | 76.00% |
+##### 1.2.1.2 SSLD 配置
+SSLD配置如下所示。在模型构建Arch字段中,需要同时定义学生模型与教师模型,教师模型固定梯度,并且加载预训练参数。在损失函数Loss字段中,需要定义`DistillationDMLLoss`,作为训练的损失函数。
-从表中可以看出
+```yaml
+# model architecture
+Arch:
+  name: "DistillationModel" # 模型名称,这里使用的是蒸馏模型
+ class_num: &class_num 1000 # 类别数量,对于ImageNet1k数据集来说,类别数为1000
+ pretrained_list: # 预训练模型列表,因为在下面的子网络中指定了预训练模型,这里无需指定
+ freeze_params_list: # 固定网络参数列表,为True时,表示固定该index对应的网络
+ - True
+ - False
+ infer_model_name: "Student" # 在模型导出的时候,会导出Student子网络
+ models: # 子网络列表
+ - Teacher: # 教师模型
+ name: ResNet50_vd # 模型名称
+ class_num: *class_num # 类别数
+ pretrained: True # 预训练模型路径,如果为True,则会从官网下载默认的预训练模型
+ use_ssld: True # 是否使用SSLD蒸馏得到的预训练模型,精度会更高一些
+ - Student: # 学生模型
+ name: PPLCNet_x2_5 # 模型名称
+ class_num: *class_num # 类别数
+ pretrained: False # 预训练模型路径,可以指定为bool值或者字符串,这里为False,表示学生模型默认不加载预训练模型
+
+# loss function config for training/eval process
+Loss: # 定义损失函数
+ Train: # 定义训练的损失函数,为列表形式
+ - DistillationDMLLoss: # 蒸馏的DMLLoss,对DMLLoss进行封装,支持蒸馏结果(dict形式)的损失函数计算
+ weight: 1.0 # loss权重
+ model_name_pairs: # 用于计算的模型对,这里表示计算Student和Teacher输出的损失函数
+ - ["Student", "Teacher"]
+ Eval: # 定义评估时的损失函数
+ - CELoss:
+ weight: 1.0
+```
-> 教师模型结构相同时,其精度越高,最终的蒸馏效果也会更好一些。
+
+
+#### 1.2.2 DML
+
+##### 1.2.2.1 DML 算法介绍
+
+论文信息:
+
+> [Deep Mutual Learning](https://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Deep_Mutual_Learning_CVPR_2018_paper.html)
+>
+> Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu
>
-> 教师模型与学生模型的模型大小差异不宜过大,否则反而会影响蒸馏结果的精度。
+> CVPR, 2018
+
+DML论文中,蒸馏过程不依赖于教师模型,两个结构相同的模型互相学习,计算彼此输出(logits)的KL散度,最终完成训练过程。
+
+
+在ImageNet1k公开数据集上,效果如下所示。
+
+| 策略 | 骨干网络 | 配置文件 | Top-1 acc | 下载链接 |
+| --- | --- | --- | --- | --- |
+| baseline | PPLCNet_x2_5 | [PPLCNet_x2_5.yaml](../../../ppcls/configs/ImageNet/PPLCNet/PPLCNet_x2_5.yaml) | 74.93% | - |
+| DML | PPLCNet_x2_5 | [PPLCNet_x2_5_dml.yaml](../../../ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_dml.yaml) | 76.68%(**+1.75%**) | - |
+
+
+* 注:完整的PPLCNet_x2_5模型训练了360epoch,这里为了方便对比,baseline和DML均训练了100epoch,因此baseline的指标比官网最终开源出来的模型精度(76.60%)低一些。
+
+
+##### 1.2.2.2 DML 配置
+
+DML配置如下所示。在模型构建Arch字段中,需要同时定义学生模型与教师模型,教师模型与学生模型均保持梯度更新状态。在损失函数Loss字段中,需要定义`DistillationDMLLoss`(学生与教师之间的JS-Div loss)以及`DistillationGTCELoss`(学生与教师关于真值标签的CE loss),作为训练的损失函数。
+
+```yaml
+Arch:
+ name: "DistillationModel"
+ class_num: &class_num 1000
+ pretrained_list:
+ freeze_params_list: # 两个模型互相学习,因此这里两个子网络的参数均不能固定
+ - False
+ - False
+ models:
+ - Teacher:
+ name: PPLCNet_x2_5 # 两个模型互学习,因此均没有加载预训练模型
+ class_num: *class_num
+ pretrained: False
+ - Student:
+ name: PPLCNet_x2_5
+ class_num: *class_num
+ pretrained: False
+
+Loss:
+ Train:
+ - DistillationGTCELoss: # 因为2个子网络均没有加载预训练模型,这里需要同时计算不同子网络的输出与真值标签之间的CE loss
+ weight: 1.0
+ model_names: ["Student", "Teacher"]
+ - DistillationDMLLoss:
+ weight: 1.0
+ model_name_pairs:
+ - ["Student", "Teacher"]
+ Eval:
+ - CELoss:
+ weight: 1.0
+```
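+
+上述配置对应的完整配置文件即前文表格中的 `PPLCNet_x2_5_dml.yaml`,DML 蒸馏训练的启动方式与普通分类任务一致,下面给出一个参考命令(示意,GPU 卡数请根据实际环境调整):
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+    --gpus="0,1,2,3" \
+    tools/train.py \
+    -c ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_dml.yaml
+```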
+
+
+
+#### 1.2.3 UDML
+
+##### 1.2.3.1 UDML 算法介绍
+
+论文信息:
+
+UDML 是百度飞桨视觉团队提出的无需依赖教师模型的知识蒸馏算法,它基于DML进行改进,在蒸馏的过程中,除了考虑两个模型的输出信息,也考虑两个模型的中间层特征信息,从而进一步提升知识蒸馏的精度。更多关于UDML的说明与应用,请参考[PP-ShiTu论文](https://arxiv.org/abs/2111.00775)以及[PP-OCRv2论文](https://arxiv.org/abs/2109.03144)。
-因此最终在蒸馏实验中,对于 ResNet 系列学生模型,我们使用 `ResNeXt101_32x16d_wsl` 作为教师模型;对于 MobileNet 系列学生模型,我们使用蒸馏得到的 `ResNet50_vd` 作为教师模型。
-
-### 3.2 大数据蒸馏
+在ImageNet1k公开数据集上,效果如下所示。
-基于 PaddleClas 的蒸馏策略为`大数据集训练 + imagenet1k finetune` 的策略。
+| 策略 | 骨干网络 | 配置文件 | Top-1 acc | 下载链接 |
+| --- | --- | --- | --- | --- |
+| baseline | PPLCNet_x2_5 | [PPLCNet_x2_5.yaml](../../../ppcls/configs/ImageNet/PPLCNet/PPLCNet_x2_5.yaml) | 74.93% | - |
+| UDML | PPLCNet_x2_5 | [PPLCNet_x2_5_udml.yaml](../../../ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_udml.yaml) | 76.74%(**+1.81%**) | - |
+
+
+##### 1.2.3.2 UDML 配置
+
+
+```yaml
+Arch:
+ name: "DistillationModel"
+ class_num: &class_num 1000
+ # if not null, its lengths should be same as models
+ pretrained_list:
+ # if not null, its lengths should be same as models
+ freeze_params_list:
+ - False
+ - False
+ models:
+ - Teacher:
+ name: PPLCNet_x2_5
+ class_num: *class_num
+ pretrained: False
+ # return_patterns表示除了返回输出的logits,也会返回对应名称的中间层feature map
+ return_patterns: ["blocks3", "blocks4", "blocks5", "blocks6"]
+ - Student:
+ name: PPLCNet_x2_5
+ class_num: *class_num
+ pretrained: False
+ return_patterns: ["blocks3", "blocks4", "blocks5", "blocks6"]
+
+# loss function config for training/eval process
+Loss:
+ Train:
+ - DistillationGTCELoss:
+ weight: 1.0
+ key: logits
+ model_names: ["Student", "Teacher"]
+ - DistillationDMLLoss:
+ weight: 1.0
+ key: logits
+ model_name_pairs:
+ - ["Student", "Teacher"]
+ - DistillationDistanceLoss: # 基于蒸馏结果的距离loss,这里默认使用l2 loss计算block5之间的损失函数
+ weight: 1.0
+ key: "blocks5"
+ model_name_pairs:
+ - ["Student", "Teacher"]
+ Eval:
+ - CELoss:
+ weight: 1.0
+```
-针对从 ImageNet22k 挑选出的 400 万数据,融合 imagenet1k 训练集,组成共 500 万的训练集进行训练,具体地,在不同模型上的训练超参及效果如下。
+**注意:** 上述在网络中指定`return_patterns`,返回中间层特征的功能是基于TheseusLayer,更多关于TheseusLayer的使用说明,请参考:[TheseusLayer 使用说明](./theseus_layer.md)。
-|Student Model | num_epoch | l2_ecay | batch size/gpu cards | base lr | learning rate decay | top1 acc |
-| - |:-: |:-: | :-: |:-: |:-: |:-: |
-| MobileNetV1 | 360 | 3e-5 | 4096/8 | 1.6 | cosine_decay_warmup | 77.65% |
-| MobileNetV2 | 360 | 1e-5 | 3072/8 | 0.54 | cosine_decay_warmup | 76.34% |
-| MobileNetV3_large_x1_0 | 360 | 1e-5 | 5760/24 | 3.65625 | cosine_decay_warmup | 78.54% |
-| MobileNetV3_small_x1_0 | 360 | 1e-5 | 5760/24 | 3.65625 | cosine_decay_warmup | 70.11% |
-| ResNet50_vd | 360 | 7e-5 | 1024/32 | 0.4 | cosine_decay_warmup | 82.07% |
-| ResNet101_vd | 360 | 7e-5 | 1024/32 | 0.4 | cosine_decay_warmup | 83.41% |
-| Res2Net200_vd_26w_4s | 360 | 4e-5 | 1024/32 | 0.4 | cosine_decay_warmup | 84.82% |
+
-
-### 3.3 ImageNet1k 训练集 finetune
+#### 1.2.4 AFD
-对于在大数据集上训练的模型,其学习到的特征可能与 ImageNet1k 数据特征有偏,因此在这里使用 ImageNet1k 数据集对模型进行 finetune。 finetune 的超参和 finetune 的精度收益如下。
+##### 1.2.4.1 AFD 算法介绍
+论文信息:
-|Student Model | num_epoch | l2_ecay | batch size/gpu cards | base lr | learning rate decay | top1 acc |
-| - |:-: |:-: | :-: |:-: |:-: |:-: |
-| MobileNetV1 | 30 | 3e-5 | 4096/8 | 0.016 | cosine_decay_warmup | 77.89% |
-| MobileNetV2 | 30 | 1e-5 | 3072/8 | 0.0054 | cosine_decay_warmup | 76.73% |
-| MobileNetV3_large_x1_0 | 30 | 1e-5 | 2048/8 | 0.008 | cosine_decay_warmup | 78.96% |
-| MobileNetV3_small_x1_0 | 30 | 1e-5 | 6400/32 | 0.025 | cosine_decay_warmup | 71.28% |
-| ResNet50_vd | 60 | 7e-5 | 1024/32 | 0.004 | cosine_decay_warmup | 82.39% |
-| ResNet101_vd | 30 | 7e-5 | 1024/32 | 0.004 | cosine_decay_warmup | 83.73% |
-| Res2Net200_vd_26w_4s | 360 | 4e-5 | 1024/32 | 0.004 | cosine_decay_warmup | 85.13% |
-
-### 3.4 数据增广以及基于 Fix 策略的微调
+> [Show, attend and distill: Knowledge distillation via attention-based feature matching](https://arxiv.org/abs/2102.02973)
+>
+> Mingi Ji, Byeongho Heo, Sungrae Park
+>
+> AAAI, 2021
-* 基于前文所述的实验结论,我们在训练的过程中加入自动增广(AutoAugment)[4],同时进一步减小了 l2_decay(4e-5->2e-5),最终 ResNet50_vd 经过 SSLD 蒸馏策略,在 ImageNet1k 上的精度可以达到 82.99%,相比之前不加数据增广的蒸馏策略再次增加了 0.6% 。
+AFD提出在蒸馏的过程中,利用基于注意力的元网络学习特征之间的相对相似性,并利用所识别出的相似关系来控制所有可能的特征图 pair 的蒸馏强度。
+在ImageNet1k公开数据集上,效果如下所示。
-* 对于图像分类任务,在测试的时候,测试尺度为训练尺度的 1.15 倍左右时,往往在不需要重新训练模型的情况下,模型的精度指标就可以进一步提升 [5],对于 82.99% 的 ResNet50_vd 在 320x320 的尺度下测试,精度可达 83.7%,我们进一步使用 Fix 策略,即在 320x320 的尺度下进行训练,使用与预测时相同的数据预处理方法,同时固定除 FC 层以外的所有参数,最终在 320x320 的预测尺度下,精度可以达到 **84.0%**。
+| 策略 | 骨干网络 | 配置文件 | Top-1 acc | 下载链接 |
+| --- | --- | --- | --- | --- |
+| baseline | ResNet18 | [ResNet18.yaml](../../../ppcls/configs/ImageNet/ResNet/ResNet18.yaml) | 70.8% | - |
+| AFD | ResNet18 | [resnet34_distill_resnet18_afd.yaml](../../../ppcls/configs/ImageNet/Distillation/resnet34_distill_resnet18_afd.yaml) | 71.68%(**+0.88%**) | - |
-
-### 3.5 实验过程中的一些问题
+注意:这里为了与论文的训练配置保持对齐,设置训练的迭代轮数为100epoch,因此baseline精度低于PaddleClas中开源出的模型精度(71.0%)。
-* 在预测过程中,batch norm 的平均值与方差是通过加载预训练模型得到(设其模式为 test mode)。在训练过程中,batch norm 是通过统计当前 batch 的信息(设其模式为 train mode),与历史保存信息进行滑动平均计算得到,在蒸馏任务中,我们发现通过 train mode,即教师模型的均值与方差实时变化的模式,去指导学生模型,比通过 test mode 蒸馏,得到的学生模型性能更好一些,下面是一组实验结果。因此我们在该蒸馏方案中,均使用 train mode 去得到教师模型的 soft label 。
+##### 1.2.4.2 AFD 配置
-|Teacher Model | Teacher Top1 | Student Model | Student Top1|
-|- |:-: |:-: | :-: |
-| ResNet50_vd | 82.35% | MobileNetV3_large_x1_0 | 76.00% |
-| ResNet50_vd | 82.35% | MobileNetV3_large_x1_0 | 75.84% |
+AFD配置如下所示。在模型构建Arch字段中,需要同时定义学生模型与教师模型,固定教师模型的权重。这里需要对从教师模型获取的特征进行变换,进而与学生模型进行损失函数的计算。在损失函数Loss字段中,需要定义`DistillationKLDivLoss`(学生与教师之间的KL-Div loss)、`AFDLoss`(学生与教师之间的AFD loss)以及`DistillationGTCELoss`(学生与教师关于真值标签的CE loss),作为训练的损失函数。
-
-## 4. 蒸馏模型的应用
+```yaml
+Arch:
+ name: "DistillationModel"
+ pretrained_list:
+ freeze_params_list:
+ models:
+ - Teacher:
+ name: AttentionModel # 包含若干个串行的网络,后面的网络会将前面的网络输出作为输入并进行处理
+ pretrained_list:
+ freeze_params_list:
+ - True
+ - False
+ models:
+ # AttentionModel 的基础网络
+ - ResNet34:
+ name: ResNet34
+ pretrained: True
+ # return_patterns表示除了返回输出的logits,也会返回对应名称的中间层feature map
+ return_patterns: &t_keys ["blocks[0]", "blocks[1]", "blocks[2]", "blocks[3]",
+ "blocks[4]", "blocks[5]", "blocks[6]", "blocks[7]",
+ "blocks[8]", "blocks[9]", "blocks[10]", "blocks[11]",
+ "blocks[12]", "blocks[13]", "blocks[14]", "blocks[15]"]
+ # AttentionModel的变换网络,会对基础子网络的特征进行变换
+ - LinearTransformTeacher:
+ name: LinearTransformTeacher
+ qk_dim: 128
+ keys: *t_keys
+ t_shapes: &t_shapes [[64, 56, 56], [64, 56, 56], [64, 56, 56], [128, 28, 28],
+ [128, 28, 28], [128, 28, 28], [128, 28, 28], [256, 14, 14],
+ [256, 14, 14], [256, 14, 14], [256, 14, 14], [256, 14, 14],
+ [256, 14, 14], [512, 7, 7], [512, 7, 7], [512, 7, 7]]
-
-### 4.1 使用方法
+ - Student:
+ name: AttentionModel
+ pretrained_list:
+ freeze_params_list:
+ - False
+ - False
+ models:
+ - ResNet18:
+ name: ResNet18
+ pretrained: False
+ return_patterns: &s_keys ["blocks[0]", "blocks[1]", "blocks[2]", "blocks[3]",
+ "blocks[4]", "blocks[5]", "blocks[6]", "blocks[7]"]
+ - LinearTransformStudent:
+ name: LinearTransformStudent
+ qk_dim: 128
+ keys: *s_keys
+ s_shapes: &s_shapes [[64, 56, 56], [64, 56, 56], [128, 28, 28], [128, 28, 28],
+ [256, 14, 14], [256, 14, 14], [512, 7, 7], [512, 7, 7]]
+ t_shapes: *t_shapes
-* 中间层学习率调整。蒸馏得到的模型的中间层特征图更加精细化,因此将蒸馏模型预训练应用到其他任务中时,如果采取和之前相同的学习率,容易破坏中间层特征。而如果降低整体模型训练的学习率,则会带来训练收敛速度慢的问题。因此我们使用了中间层学习率调整的策略。具体地:
- * 针对 ResNet50_vd,我们设置一个学习率倍数列表,res block 之前的 3 个 conv2d 卷积参数具有统一的学习率倍数,4 个 res block 的 conv2d 分别有一个学习率参数,共需设置 5 个学习率倍数的超参。在实验中发现。用于迁移学习 finetune 分类模型时,`[0.1,0.1,0.2,0.2,0.3]` 的中间层学习率倍数设置在绝大多数的任务中都性能更好;而在目标检测任务中,`[0.05,0.05,0.05,0.1,0.15]` 的中间层学习率倍数设置能够带来更大的精度收益。
- * 对于 MoblileNetV3_large_x1_0,由于其包含 15 个 block,我们设置每 3 个 block 共享一个学习率倍数参数,因此需要共 5 个学习率倍数的参数,最终发现在分类和检测任务中,`[0.25,0.25,0.5,0.5,0.75]` 的中间层学习率倍数能够带来更大的精度收益。
+ infer_model_name: "Student"
-* 适当的 l2 decay 。不同分类模型在训练的时候一般都会根据模型设置不同的 l2 decay,大模型为了防止过拟合,往往会设置更大的 l2 decay,如 ResNet50 等模型,一般设置为 `1e-4` ;而如 MobileNet 系列模型,在训练时往往都会设置为 `1e-5~4e-5`,防止模型过度欠拟合,在蒸馏时亦是如此。在将蒸馏模型应用到目标检测任务中时,我们发现也需要调节 backbone 甚至特定任务模型模型的 l2 decay,和预训练蒸馏时的 l2 decay 尽可能保持一致。以 Faster RCNN MobiletNetV3 FPN 为例,我们发现仅修改该参数,在 COCO2017 数据集上就可以带来最多 0.5% 左右的精度(mAP)提升(默认 Faster RCNN l2 decay 为 1e-4,我们修改为 1e-5~4e-5 均有 0.3%~0.5% 的提升)。
+# loss function config for training/eval process
+Loss:
+ Train:
+ - DistillationGTCELoss:
+ weight: 1.0
+ model_names: ["Student"]
+ key: logits
+ - DistillationKLDivLoss: # 蒸馏的KL-Div loss,会根据model_name_pairs中的模型名称去提取对应模型的输出特征,计算loss
+ weight: 0.9 # 该loss的权重
+ model_name_pairs: [["Student", "Teacher"]]
+ temperature: 4
+ key: logits
+ - AFDLoss: # AFD loss
+ weight: 50.0
+ model_name_pair: ["Student", "Teacher"]
+ student_keys: ["bilinear_key", "value"]
+ teacher_keys: ["query", "value"]
+ s_shapes: *s_shapes
+ t_shapes: *t_shapes
+ Eval:
+ - CELoss:
+ weight: 1.0
+```
-
-### 4.2 迁移学习 finetune
-* 为验证迁移学习的效果,我们在 10 个小的数据集上验证其效果。在这里为了保证实验的可对比性,我们均使用 ImageNet1k 数据集训练的标准预处理过程,对于蒸馏模型我们也添加了蒸馏模型中间层学习率的搜索。
-* 对于 ResNet50_vd, baseline 为 Top1 Acc 79.12% 的预训练模型基于 grid search 搜索得到的最佳精度,对比实验则为基于该精度对预训练和中间层学习率进一步搜索得到的最佳精度。下面给出 10 个数据集上所有 baseline 和蒸馏模型的精度对比。
+**注意:** 上述在网络中指定`return_patterns`,返回中间层特征的功能是基于TheseusLayer,更多关于TheseusLayer的使用说明,请参考:[TheseusLayer 使用说明](./theseus_layer.md)。
+
-| Dataset | Model | Baseline Top1 Acc | Distillation Model Finetune |
-|- |:-: |:-: | :-: |
-| Oxford102 flowers | ResNete50_vd | 97.18% | 97.41% |
-| caltech-101 | ResNete50_vd | 92.57% | 93.21% |
-| Oxford-IIIT-Pets | ResNete50_vd | 94.30% | 94.76% |
-| DTD | ResNete50_vd | 76.48% | 77.71% |
-| fgvc-aircraft-2013b | ResNete50_vd | 88.98% | 90.00% |
-| Stanford-Cars | ResNete50_vd | 92.65% | 92.76% |
-| SUN397 | ResNete50_vd | 64.02% | 68.36% |
-| cifar100 | ResNete50_vd | 86.50% | 87.58% |
-| cifar10 | ResNete50_vd | 97.72% | 97.94% |
-| Food-101 | ResNete50_vd | 89.58% | 89.99% |
+#### 1.2.5 DKD
-* 可以看出在上面 10 个数据集上,结合适当的中间层学习率倍数设置,蒸馏模型平均能够带来 1% 以上的精度提升。
+##### 1.2.5.1 DKD 算法介绍
-
-### 4.3 目标检测
+论文信息:
-我们基于两阶段目标检测 Faster/Cascade RCNN 模型验证蒸馏得到的预训练模型的效果。
-* ResNet50_vd
+> [Decoupled Knowledge Distillation](https://arxiv.org/abs/2203.08679)
+>
+> Borui Zhao, Quan Cui, Renjie Song, Yiyu Qiu, Jiajun Liang
+>
+> CVPR, 2022
-设置训练与评测的尺度均为 640x640,最终 COCO 上检测指标如下。
+DKD 将蒸馏中常用的 KD Loss 解耦为 Target Class Knowledge Distillation(TCKD,目标类知识蒸馏)以及 Non-target Class Knowledge Distillation(NCKD,非目标类知识蒸馏)两个部分,分别研究两个部分的作用,并使它们各自的权重可以独立调节,提升了蒸馏的精度和灵活性。
-| Model | train/test scale | pretrain top1 acc | feature map lr | coco mAP |
-|- |:-: |:-: | :-: | :-: |
-| Faster RCNN R50_vd FPN | 640/640 | 79.12% | [1.0,1.0,1.0,1.0,1.0] | 34.8% |
-| Faster RCNN R50_vd FPN | 640/640 | 79.12% | [0.05,0.05,0.1,0.1,0.15] | 34.3% |
-| Faster RCNN R50_vd FPN | 640/640 | 82.18% | [0.05,0.05,0.1,0.1,0.15] | 36.3% |
+在ImageNet1k公开数据集上,效果如下所示。
-在这里可以看出,对于未蒸馏模型,过度调整中间层学习率反而降低最终检测模型的性能指标。基于该蒸馏模型,我们也提供了领先的服务端实用目标检测方案,详细的配置与训练代码均已开源,可以参考 [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_enhance)。
+| 策略 | 骨干网络 | 配置文件 | Top-1 acc | 下载链接 |
+| --- | --- | --- | --- | --- |
+| baseline | ResNet18 | [ResNet18.yaml](../../../ppcls/configs/ImageNet/ResNet/ResNet18.yaml) | 70.8% | - |
+| DKD | ResNet18 | [resnet34_distill_resnet18_dkd.yaml](../../../ppcls/configs/ImageNet/Distillation/resnet34_distill_resnet18_dkd.yaml) | 72.59%(**+1.79%**) | - |
-
-## 5. SSLD 实战
-本节将基于 ImageNet-1K 的数据集详细介绍 SSLD 蒸馏实验,如果想快速体验此方法,可以参考 [**30 分钟玩转 PaddleClas(进阶版)**](../quick_start/quick_start_classification_professional.md)中基于 CIFAR100 的 SSLD 蒸馏实验。
+##### 1.2.5.2 DKD 配置
-
-### 5.1 参数配置
+DKD 配置如下所示。在模型构建Arch字段中,需要同时定义学生模型与教师模型,教师模型固定参数,且需要加载预训练模型。在损失函数Loss字段中,需要定义`DistillationDKDLoss`(学生与教师之间的DKD loss)以及`DistillationGTCELoss`(学生与教师关于真值标签的CE loss),作为训练的损失函数。
-实战部分提供了 SSLD 蒸馏的示例,在 `ppcls/configs/ImageNet/Distillation/mv3_large_x1_0_distill_mv3_small_x1_0.yaml` 中提供了 `MobileNetV3_large_x1_0` 蒸馏 `MobileNetV3_small_x1_0` 的配置文件,用户可以在 `tools/train.sh` 里直接替换配置文件的路径即可使用。
```yaml
Arch:
@@ -216,53 +418,165 @@ Arch:
- False
models:
- Teacher:
- name: MobileNetV3_large_x1_0
+ name: ResNet34
pretrained: True
- use_ssld: True
+
- Student:
- name: MobileNetV3_small_x1_0
+ name: ResNet18
pretrained: False
infer_model_name: "Student"
+
+
+# loss function config for training/eval process
+Loss:
+ Train:
+ - DistillationGTCELoss:
+ weight: 1.0
+ model_names: ["Student"]
+ - DistillationDKDLoss:
+ weight: 1.0
+ model_name_pairs: [["Student", "Teacher"]]
+ temperature: 1
+ alpha: 1.0
+ beta: 1.0
+ Eval:
+ - CELoss:
+ weight: 1.0
```
+
-在参数配置中,`freeze_params_list` 中需要指定模型是否需要冻结参数,`models` 中需要指定 Teacher 模型和 Student 模型,其中 Teacher 模型需要加载预训练模型。用户可以直接在此处更改模型。
+## 2. 使用方法
-
-### 5.2 启动命令
+
-当用户配置完训练环境后,类似于训练其他分类任务,只需要将 `tools/train.sh` 中的配置文件替换成为相应的蒸馏配置文件即可。
+### 2.1 环境配置
-其中 `train.sh` 中的内容如下:
+* 安装:请先参考 [Paddle 安装教程](../installation/install_paddle.md) 以及 [PaddleClas 安装教程](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
-```bash
+
+
+### 2.2 数据准备
+
+请从 [ImageNet 官网](https://www.image-net.org/)下载并准备 ImageNet-1k 相关的数据。
+
+
+进入 PaddleClas 目录。
+
+```
+cd path_to_PaddleClas
+```
+
+进入 `dataset/` 目录,将下载好的数据命名为 `ILSVRC2012` ,存放于此。 `ILSVRC2012` 目录中具有以下数据:
+
+```
+├── train
+│ ├── n01440764
+│ │ ├── n01440764_10026.JPEG
+│ │ ├── n01440764_10027.JPEG
+├── train_list.txt
+...
+├── val
+│ ├── ILSVRC2012_val_00000001.JPEG
+│ ├── ILSVRC2012_val_00000002.JPEG
+├── val_list.txt
+```
+
+其中 `train/` 和 `val/` 分别为训练集和验证集。`train_list.txt` 和 `val_list.txt` 分别为训练集和验证集的标签文件。
+
+
+如果包含与训练集场景相似的无标注数据,则也可以按照与训练集标注完全相同的方式进行整理,将文件与当前有标注的数据集放在相同目录下,将其标签值记为0,假设整理的标签文件名为`train_list_unlabel.txt`,则可以通过下面的命令生成用于SSLD训练的标签文件。
+
+```shell
+cat train_list.txt train_list_unlabel.txt > train_list_all.txt
+```
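+
+合并后的 `train_list_all.txt` 中,每一行为 `图片相对路径 标签值` 的形式,无标注数据的标签值统一记为 0,内容示例如下(图片路径仅为示意):
+
+```
+train/n01440764/n01440764_10026.JPEG 0
+train/n01443537/n01443537_10007.JPEG 1
+train/unlabel_images/unlabel_00001.JPEG 0
+```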
+
+
+**备注:**
-python -m paddle.distributed.launch \
- --selected_gpus="0,1,2,3" \
- --log_dir=mv3_large_x1_0_distill_mv3_small_x1_0 \
+* 关于 `train_list.txt`、`val_list.txt`的格式说明,可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
+
+
+
+
+### 2.3 模型训练
+
+
+以SSLD知识蒸馏算法为例,介绍知识蒸馏算法的模型训练、评估、预测等过程。配置文件为 [PPLCNet_x2_5_ssld.yaml](../../../ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_ssld.yaml) ,使用下面的命令可以完成模型训练。
+
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
tools/train.py \
- -c ./ppcls/configs/ImageNet/Distillation/mv3_large_x1_0_distill_mv3_small_x1_0.yaml
+ -c ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_ssld.yaml
```
-运行 `train.sh` :
+
+
+### 2.4 模型评估
+
+训练好模型之后,可以通过以下命令实现对模型指标的评估。
```bash
-sh tools/train.sh
+python3 tools/eval.py \
+ -c ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_ssld.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model
+```
+
+其中 `-o Global.pretrained_model="output/DistillationModel/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
+
+
+
+### 2.5 模型预测
+
+模型训练完成之后,可以加载训练得到的预训练模型,进行模型预测。在模型库的 `tools/infer.py` 中提供了完整的示例,只需执行下述命令即可完成模型预测:
+
+```shell
+python3 tools/infer.py \
+ -c ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_ssld.yaml \
+ -o Global.pretrained_model=output/DistillationModel/best_model
+```
+
+输出结果如下:
+
+```
+[{'class_ids': [8, 7, 86, 82, 21], 'scores': [0.87908, 0.12091, 0.0, 0.0, 0.0], 'file_name': 'docs/images/inference_deployment/whl_demo.jpg', 'label_names': ['hen', 'cock', 'partridge', 'ruffed grouse, partridge, Bonasa umbellus', 'kite']}]
```
-
-### 5.3 注意事项
-* 用户在使用 SSLD 蒸馏之前,首先需要在目标数据集上训练一个教师模型,该教师模型用于指导学生模型在该数据集上的训练。
+**备注:**
-* 如果学生模型没有加载预训练模型,训练的其他超参数可以参考该学生模型在 ImageNet-1k 上训练的超参数,如果学生模型加载了预训练模型,学习率可以调整到原来的 1/10 或者 1/100 。
+* 这里`-o Global.pretrained_model="output/DistillationModel/best_model"` 指定了当前最佳权重所在的路径,如果指定其他权重,只需替换对应的路径即可。
-* 在 SSLD 蒸馏的过程中,学生模型只学习 soft-label 导致训练目标变的更加复杂,建议可以适当的调小 `l2_decay` 的值来获得更高的验证集准确率。
+* 默认是对 `docs/images/inference_deployment/whl_demo.jpg` 进行预测,此处也可以通过增加字段 `-o Infer.infer_imgs=xxx` 对其他图片预测。
-* 若用户准备添加无标签的训练数据,只需要将新的训练数据放置在原本训练数据的路径下,生成新的数据 list 即可,另外,新生成的数据 list 需要将无标签的数据添加伪标签(只是为了统一读数据)。
-
-## 6. 参考文献
+
+
+### 2.6 模型导出与推理
+
+
+Paddle Inference 是飞桨的原生推理库, 作用于服务器端和云端,提供高性能的推理能力。相比于直接基于预训练模型进行预测,Paddle Inference可使用MKLDNN、CUDNN、TensorRT 进行预测加速,从而实现更优的推理性能。更多关于Paddle Inference推理引擎的介绍,可以参考[Paddle Inference官网教程](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html)。
+
+在模型推理之前需要先导出模型。对于知识蒸馏训练得到的模型,在导出时需要指定`-o Arch.infer_model_name=Student`,来表示导出的模型为学生模型。具体命令如下所示。
+
+```shell
+python3 tools/export_model.py \
+ -c ppcls/configs/ImageNet/Distillation/PPLCNet_x2_5_ssld.yaml \
+ -o Global.pretrained_model=./output/DistillationModel/best_model \
+ -o Arch.infer_model_name=Student
+```
+
+最终在`inference`目录下会产生`inference.pdiparams`、`inference.pdiparams.info`、`inference.pdmodel` 3个文件。
+
+关于更多模型推理相关的教程,请参考:[Python 预测推理](../inference_deployment/python_deploy.md)。
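+
+作为参考,下面给出一个使用上述导出的 inference 模型、基于 Python 预测引擎完成分类预测的示例命令(配置文件与模型路径仅为示意,请以实际情况为准):
+
+```shell
+cd deploy
+python3 python/predict_cls.py \
+    -c configs/inference_cls.yaml \
+    -o Global.inference_model_dir=../inference
+```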
+
+
+
+
+## 3. 参考文献
[1] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint arXiv:1503.02531, 2015.
@@ -273,3 +587,17 @@ sh tools/train.sh
[4] Cubuk E D, Zoph B, Mane D, et al. Autoaugment: Learning augmentation strategies from data[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2019: 113-123.
[5] Touvron H, Vedaldi A, Douze M, et al. Fixing the train-test resolution discrepancy[C]//Advances in Neural Information Processing Systems. 2019: 8250-8260.
+
+[6] Cui C, Guo R, Du Y, et al. Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones[J]. arXiv preprint arXiv:2103.05959, 2021.
+
+[7] Zhang Y, Xiang T, Hospedales T M, et al. Deep mutual learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4320-4328.
+
+[8] Heo B, Kim J, Yun S, et al. A comprehensive overhaul of feature distillation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 1921-1930.
+
+[9] Du Y, Li C, Guo R, et al. PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System[J]. arXiv preprint arXiv:2109.03144, 2021.
+
+[10] Park W, Kim D, Lu Y, et al. Relational knowledge distillation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3967-3976.
+
+[11] Zhao B, Cui Q, Song R, et al. Decoupled Knowledge Distillation[J]. arXiv preprint arXiv:2203.08679, 2022.
+
+[12] Ji M, Heo B, Park S. Show, attend and distill: Knowledge distillation via attention-based feature matching[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2021, 35(9): 7945-7952.
diff --git a/docs/zh_CN/advanced_tutorials/ssld.md b/docs/zh_CN/advanced_tutorials/ssld.md
new file mode 100644
index 0000000000000000000000000000000000000000..e19a98cbc866bc02f0ca9df6d8e939b3342663f5
--- /dev/null
+++ b/docs/zh_CN/advanced_tutorials/ssld.md
@@ -0,0 +1,171 @@
+
+# SSLD 知识蒸馏实战
+
+## 目录
+
+- [1. 算法介绍](#1)
+ - [1.1 知识蒸馏简介](#1.1)
+ - [1.2 SSLD蒸馏策略](#1.2)
+ - [1.3 SKL-UGI蒸馏策略](#1.3)
+- [2. SSLD预训练模型库](#2)
+- [3. SSLD使用方法](#3)
+ - [3.1 加载SSLD模型进行微调](#3.1)
+ - [3.2 使用SSLD方案进行知识蒸馏](#3.2)
+- [4. 参考文献](#4)
+
+
+
+
+
+## 1. 算法介绍
+
+
+
+### 1.1 知识蒸馏简介
+
+PaddleClas 融合已有的知识蒸馏方法 [2,3],提供了一种简单的半监督标签知识蒸馏方案(SSLD,Simple Semi-supervised Label Distillation),基于 ImageNet1k 分类数据集,在 ResNet_vd 以及 MobileNet 系列上的精度均有超过 3% 的绝对精度提升,具体指标如下图所示。
+
+
+

+
+
+
+
+### 1.2 SSLD蒸馏策略
+
+SSLD 的流程图如下图所示。
+
+
+

+
+
+首先,我们从 ImageNet22k 中挖掘出了近 400 万张图片,同时与 ImageNet-1k 训练集整合在一起,得到了一个新的包含 500 万张图片的数据集。然后,我们将学生模型与教师模型组合成一个新的网络,该网络分别输出学生模型和教师模型的预测分布,与此同时,固定教师模型整个网络的梯度,而学生模型可以做正常的反向传播。最后,我们将两个模型的 logits 经过 softmax 激活函数转换为 soft label,并将二者的 soft label 做 JS 散度作为损失函数,用于蒸馏模型训练。
+
+以 MobileNetV3(该模型直接训练,精度为 75.3%)的知识蒸馏为例,该方案的核心策略优化点如下所示。
+
+
+| 实验ID | 策略 | Top-1 acc |
+|:------:|:---------:|:--------:|
+| 1 | baseline | 75.60% |
+| 2 | 更换教师模型精度为82.4%的权重 | 76.00% |
+| 3 | 使用改进的JS散度损失函数 | 76.20% |
+| 4 | 迭代轮数增加至360epoch | 77.10% |
+| 5 | 添加400W挖掘得到的无标注数据 | 78.50% |
+| 6 | 基于ImageNet1k数据微调 | 78.90% |
+
+* 注:其中baseline的训练条件为
+ * 训练数据:ImageNet1k数据集
+ * 损失函数:Cross Entropy Loss
+ * 迭代轮数:120epoch
+
+
+SSLD 蒸馏方案的一大特色就是无需使用图像的真值标签,因此可以任意扩展数据集的大小,考虑到计算资源的限制,我们在这里仅基于 ImageNet22k 数据集对蒸馏任务的训练集进行扩充。在 SSLD 蒸馏任务中,我们使用了 `Top-k per class` 的数据采样方案 [3] 。具体步骤如下。
+
+(1)训练集去重。我们首先基于 SIFT 特征相似度匹配的方式对 ImageNet22k 数据集与 ImageNet1k 验证集进行去重,防止添加的 ImageNet22k 训练集中包含 ImageNet1k 验证集图像,最终去除了 4511 张相似图片。部分过滤的相似图片如下所示。
+
+
+

+
+
+(2)大数据集 soft label 获取,对于去重后的 ImageNet22k 数据集,我们使用 `ResNeXt101_32x16d_wsl` 模型进行预测,得到每张图片的 soft label 。
+
+(3)Top-k 数据选择,ImageNet1k 数据共有 1000 类,对于每一类,找出属于该类并且得分最高的 `k` 张图片,最终得到一个数据量不超过 `1000*k` 的数据集(某些类上得到的图片数量可能少于 `k` 张)。
+
+(4)将该数据集与 ImageNet1k 的训练集融合组成最终蒸馏模型所使用的数据集,数据量为 500 万。
+
+
+
+
+### 1.3 SKL-UGI蒸馏策略
+
+此外,在无标注数据选择的过程中,我们发现使用更加通用的数据,即使不需要严格的数据筛选过程,也可以帮助知识蒸馏任务获得稳定的精度提升,因而提出了SKL-UGI (Symmetrical-KL Unlabeled General Images distillation)知识蒸馏方案。
+
+通用数据可以使用ImageNet数据或者与场景相似的数据集。更多关于SKL-UGI的应用,请参考:[超轻量图像分类方案PULC使用教程](../PULC/PULC_train.md)。
+
+
+
+
+## 2. SSLD预训练模型库
+
+
+移动端预训练模型库列表如下所示。
+
+| 模型 | FLOPs(M) | Params(M) | top-1 acc | SSLD top-1 acc | 精度收益 | 下载链接 |
+|-------------------|----------|-----------|----------|---------------|--------|------|
+| PPLCNetV2_base | 604.16 | 6.54 | 77.04% | 80.10% | +3.06% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNetV2_base_ssld_pretrained.pdparams) |
+| PPLCNet_x2_5 | 906.49 | 9.04 | 76.60% | 80.82% | +4.22% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_ssld_pretrained.pdparams) |
+| PPLCNet_x1_0 | 160.81 | 2.96 | 71.32% | 74.39% | +3.07% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_ssld_pretrained.pdparams) |
+| PPLCNet_x0_5 | 47.28 | 1.89 | 63.14% | 66.10% | +2.96% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_ssld_pretrained.pdparams) |
+| PPLCNet_x0_25 | 18.43 | 1.52 | 51.86% | 53.43% | +1.57% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_ssld_pretrained.pdparams) |
+| MobileNetV1 | 578.88 | 4.19 | 71.00% | 77.90% | +6.90% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_ssld_pretrained.pdparams) |
+| MobileNetV2 | 327.84 | 3.44 | 72.20% | 76.74% | +4.54% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_ssld_pretrained.pdparams) |
+| MobileNetV3_large_x1_0 | 229.66 | 5.47 | 75.30% | 79.00% | +3.70% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_ssld_pretrained.pdparams) |
+| MobileNetV3_small_x1_0 | 63.67 | 2.94 | 68.20% | 71.30% | +3.10% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x1_0_ssld_pretrained.pdparams) |
+| MobileNetV3_small_x0_35 | 14.56 | 1.66 | 53.00% | 55.60% | +2.60% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x0_35_ssld_pretrained.pdparams) |
+| GhostNet_x1_3_ssld | 236.89 | 7.30 | 75.70% | 79.40% | +3.70% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/GhostNet_x1_3_ssld_pretrained.pdparams) |
+
+* 注:其中的`top-1 acc`表示使用普通训练方式得到的模型精度,`SSLD top-1 acc`表示使用SSLD知识蒸馏训练策略得到的模型精度。
+
+
+服务端预训练模型库列表如下所示。
+
+| 模型 | FLOPs(G) | Params(M) | top-1 acc | SSLD top-1 acc | 精度收益 | 下载链接 |
+|----------------------|----------|-----------|----------|---------------|--------|-------------------------------------------------------------------------------------------|
+| PPHGNet_base | 25.14 | 71.62 | - | 85.00% | - | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_base_ssld_pretrained.pdparams) |
+| PPHGNet_small | 8.53 | 24.38 | 81.50% | 83.80% | +2.30% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_ssld_pretrained.pdparams) |
+| PPHGNet_tiny | 4.54 | 14.75 | 79.83% | 81.95% | +2.12% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_ssld_pretrained.pdparams) |
+| ResNet50_vd | 8.67 | 25.58 | 79.10% | 83.00% | +3.90% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_ssld_pretrained.pdparams) |
+| ResNet101_vd | 16.1 | 44.57 | 80.20% | 83.70% | +3.50% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet101_vd_ssld_pretrained.pdparams) |
+| ResNet34_vd | 7.39 | 21.82 | 76.00% | 79.70% | +3.70% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet34_vd_ssld_pretrained.pdparams) |
+| Res2Net50_vd_26w_4s | 8.37 | 25.06 | 79.80% | 83.10% | +3.30% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net50_vd_26w_4s_ssld_pretrained.pdparams) |
+| Res2Net101_vd_26w_4s | 16.67 | 45.22 | 80.60% | 83.90% | +3.30% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net101_vd_26w_4s_ssld_pretrained.pdparams) |
+| Res2Net200_vd_26w_4s | 31.49 | 76.21 | 81.20% | 85.10% | +3.90% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net200_vd_26w_4s_ssld_pretrained.pdparams) |
+| HRNet_W18_C | 4.14 | 21.29 | 76.90% | 81.60% | +4.70% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_ssld_pretrained.pdparams) |
+| HRNet_W48_C | 34.58 | 77.47 | 79.00% | 83.60% | +4.60% | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W48_C_ssld_pretrained.pdparams) |
+| SE_HRNet_W64_C | 57.83 | 128.97 | - | 84.70% | - | [链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/SE_HRNet_W64_C_ssld_pretrained.pdparams) |
+
+
+
+
+## 3. SSLD使用方法
+
+
+
+### 3.1 加载SSLD模型进行微调
+
+如果希望直接使用预训练模型,可以在训练的时候,加入参数`-o Arch.pretrained=True -o Arch.use_ssld=True`,表示使用基于SSLD的预训练模型,示例如下所示。
+
+```shell
+# 单机单卡训练
+python3 tools/train.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml -o Arch.pretrained=True -o Arch.use_ssld=True
+# 单机多卡训练
+python3 -m paddle.distributed.launch --gpus="0,1,2,3" tools/train.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml -o Arch.pretrained=True -o Arch.use_ssld=True
+```
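+
+上述命令大致等价于在配置文件中直接设置 `Arch` 字段中的相应参数,示意如下(仅列出相关字段):
+
+```yaml
+Arch:
+  name: ResNet50_vd
+  pretrained: True  # 加载官方默认的预训练模型
+  use_ssld: True    # 使用SSLD蒸馏得到的预训练模型
+```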
+
+
+
+### 3.2 使用SSLD方案进行知识蒸馏
+
+相比于其他大多数知识蒸馏算法,SSLD摆脱了对数据标注的依赖,通过引入无标注数据,可以进一步提升模型精度。
+
+对于无标注数据,需要按照与有标注数据完全相同的整理方式,将文件与当前有标注的数据集放在相同目录下,将其标签值记为`0`,假设整理的标签文件名为`train_list_unlabel.txt`,则可以通过下面的命令生成用于SSLD训练的标签文件。
+
+```shell
+cat train_list.txt train_list_unlabel.txt > train_list_all.txt
+```
+
+更多关于图像分类任务的数据标签说明,请参考:[PaddleClas图像分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明)。
+
+PaddleClas中集成了PULC超轻量图像分类实用方案,里面包含SSLD ImageNet预训练模型的使用以及更加通用的无标签数据的知识蒸馏方案,更多详细信息,请参考[PULC超轻量图像分类实用方案使用教程](../PULC/PULC_train.md)。
+
+
+
+## 4. 参考文献
+
+[1] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint arXiv:1503.02531, 2015.
+
+[2] Bagherinezhad H, Horton M, Rastegari M, et al. Label refinery: Improving imagenet classification through label progression[J]. arXiv preprint arXiv:1805.02641, 2018.
+
+[3] Yalniz I Z, Jégou H, Chen K, et al. Billion-scale semi-supervised learning for image classification[J]. arXiv preprint arXiv:1905.00546, 2019.
+
+[4] Touvron H, Vedaldi A, Douze M, et al. Fixing the train-test resolution discrepancy[C]//Advances in Neural Information Processing Systems. 2019: 8250-8260.
diff --git a/docs/zh_CN/algorithm_introduction/ImageNet_models.md b/docs/zh_CN/algorithm_introduction/ImageNet_models.md
index 8e847bb8c17db46e71e8542b954fdf49e8cd549d..ad32788a8579ccf22ddb72dd40f9f0a8daa019d9 100644
--- a/docs/zh_CN/algorithm_introduction/ImageNet_models.md
+++ b/docs/zh_CN/algorithm_introduction/ImageNet_models.md
@@ -133,6 +133,8 @@ PP-LCNet 系列模型的精度、速度指标如下表所示,更多关于该
**: 基于 Intel-Xeon-Gold-6271C 硬件平台与 OpenVINO 2021.4.2 推理平台。
+
+
## PP-HGNet 系列
PP-HGNet 系列模型的精度、速度指标如下表所示,更多关于该系列的模型介绍可以参考:[PP-HGNet 系列模型文档](../models/PP-HGNet.md)。
@@ -140,7 +142,10 @@ PP-HGNet 系列模型的精度、速度指标如下表所示,更多关于该
| 模型 | Top-1 Acc | Top-5 Acc | time(ms)
bs=1 | time(ms)
bs=4 | time(ms)
bs=8 | FLOPs(G) | Params(M) | 预训练模型下载地址 | inference模型下载地址 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PPHGNet_tiny | 0.7983 | 0.9504 | 1.77 | - | - | 4.54 | 14.75 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_pretrained.pdparams) | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_infer.tar) |
+| PPHGNet_tiny_ssld | 0.8195 | 0.9612 | 1.77 | - | - | 4.54 | 14.75 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_ssld_pretrained.pdparams) | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_ssld_infer.tar) |
| PPHGNet_small | 0.8151 | 0.9582 | 2.52 | - | - | 8.53 | 24.38 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_pretrained.pdparams) | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_infer.tar) |
+| PPHGNet_small_ssld | 0.8382 | 0.9681 | 2.52 | - | - | 8.53 | 24.38 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_ssld_pretrained.pdparams) | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_ssld_infer.tar) |
+| PPHGNet_base_ssld | 0.8500 | 0.9735 | 5.97 | - | - | 25.14 | 71.62 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_base_ssld_pretrained.pdparams) | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_base_ssld_infer.tar) |
diff --git a/docs/zh_CN/algorithm_introduction/knowledge_distillation.md b/docs/zh_CN/algorithm_introduction/knowledge_distillation.md
index 58092195956119416277e62ec225318373a2bfa3..afedce7f07117c857717575ca063bab8a5decc66 100644
--- a/docs/zh_CN/algorithm_introduction/knowledge_distillation.md
+++ b/docs/zh_CN/algorithm_introduction/knowledge_distillation.md
@@ -42,7 +42,7 @@
PaddleClas 中提出了一种简单使用的 SSLD 知识蒸馏算法 [6],在训练的时候去除了对 gt label 的依赖,结合大量无标注数据,最终蒸馏训练得到的预训练模型在 15 个模型上的精度提升平均高达 3%。
-上述标准的蒸馏方法是通过一个大模型作为教师模型来指导学生模型提升效果,而后来又发展出 DML(Deep Mutual Learning)互学习蒸馏方法 [7],即通过两个结构相同的模型互相学习。具体的。相比于 KD 等依赖于大的教师模型的知识蒸馏算法,DML 脱离了对大的教师模型的依赖,蒸馏训练的流程更加简单,模型产出效率也要更高一些。
+上述标准的蒸馏方法是通过一个大模型作为教师模型来指导学生模型提升效果,而后来又发展出 DML(Deep Mutual Learning)互学习蒸馏方法 [7],即通过两个结构相同的模型互相学习。具体地,相比于 KD 等依赖于大的教师模型的知识蒸馏算法,DML 脱离了对大的教师模型的依赖,蒸馏训练的流程更加简单,模型产出效率也要更高一些。
### 3.2 Feature based distillation
diff --git a/docs/zh_CN/image_recognition_pipeline/feature_extraction.md b/docs/zh_CN/image_recognition_pipeline/feature_extraction.md
index 1438e9661200ede1adf67cf6813f763c3a13c095..368abc3da9856c8d9232819aef3b43f0ef66735d 100644
--- a/docs/zh_CN/image_recognition_pipeline/feature_extraction.md
+++ b/docs/zh_CN/image_recognition_pipeline/feature_extraction.md
@@ -1,182 +1,247 @@
+简体中文|[English](../../en/image_recognition_pipeline/feature_extraction_en.md)
# 特征提取
## 目录
-- [1. 简介](#1)
-- [2. 网络结构](#2)
-- [3. 通用识别模型](#3)
-- [4. 自定义特征提取](#4)
- - [4.1 数据准备](#4.1)
- - [4.2 模型训练](#4.2)
- - [4.3 模型评估](#4.3)
- - [4.4 模型推理](#4.4)
- - [4.4.1 导出推理模型](#4.4.1)
- - [4.4.2 获取特征向量](#4.4.2)
+- [1. 摘要](#1-摘要)
+- [2. 介绍](#2-介绍)
+- [3. 方法](#3-方法)
+ - [3.1 Backbone](#31-backbone)
+ - [3.2 Neck](#32-neck)
+ - [3.3 Head](#33-head)
+ - [3.4 Loss](#34-loss)
+- [4. 实验部分](#4-实验部分)
+- [5. 自定义特征提取](#5-自定义特征提取)
+ - [5.1 数据准备](#51-数据准备)
+ - [5.2 模型训练](#52-模型训练)
+ - [5.3 模型评估](#53-模型评估)
+ - [5.4 模型推理](#54-模型推理)
+ - [5.4.1 导出推理模型](#541-导出推理模型)
+ - [5.4.2 获取特征向量](#542-获取特征向量)
+- [6. 总结](#6-总结)
+- [7. 参考文献](#7-参考文献)
-## 1. 简介
+## 1. 摘要
-特征提取是图像识别中的关键一环,它的作用是将输入的图片转化为固定维度的特征向量,用于后续的[向量检索](./vector_search.md)。好的特征需要具备相似度保持性,即在特征空间中,相似度高的图片对其特征相似度要比较高(距离比较近),相似度低的图片对,其特征相似度要比较小(距离比较远)。[Deep Metric Learning](../algorithm_introduction/metric_learning.md)用以研究如何通过深度学习的方法获得具有强表征能力的特征。
+特征提取是图像识别中的关键一环,它的作用是将输入的图片转化为固定维度的特征向量,用于后续的[向量检索](./vector_search.md)。一个好的特征需要具备“相似度保持性”,即相似度高的图片对,其特征的相似度也比较高(特征空间中的距离比较近),相似度低的图片对,其特征相似度要比较低(特征空间中的距离比较远)。为此[Deep Metric Learning](../algorithm_introduction/metric_learning.md)领域内提出了不少方法用以研究如何通过深度学习来获得具有强表征能力的特征。
-## 2. 网络结构
+## 2. 介绍
+
为了图像识别任务的灵活定制,我们将整个网络分为 Backbone、 Neck、 Head 以及 Loss 部分,整体结构如下图所示:

图中各个模块的功能为:
-- **Backbone**: 指定所使用的骨干网络。 值得注意的是,PaddleClas 提供的基于 ImageNet 的预训练模型,最后一层的输出为 1000,我们需要依据所需的特征维度定制最后一层的输出。
-- **Neck**: 用以特征增强及特征维度变换。这儿的 Neck,可以是一个简单的 Linear Layer,用来做特征维度变换;也可以是较复杂的 FPN 结构,用以做特征增强。
-- **Head**: 用来将 feature 转化为 logits。除了常用的 Fc Layer 外,还可以替换为 cosmargin, arcmargin, circlemargin 等模块。
-- **Loss**: 指定所使用的 Loss 函数。我们将 Loss 设计为组合 loss 的形式,可以方便地将 Classification Loss 和 Pair_wise Loss 组合在一起。
+- **Backbone**: 用于提取输入图像初步特征的骨干网络,一般由配置文件中的 [`Backbone`](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml#L26-L29) 以及 [`BackboneStopLayer`](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml#L30-L31) 字段共同指定。
+- **Neck**: 用以特征增强及特征维度变换。可以是一个简单的 FC Layer,用来做特征维度变换;也可以是较复杂的 FPN 结构,用以做特征增强,一般由配置文件中的 [`Neck`](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml#L32-L35)字段指定。
+- **Head**: 用来将 feature 转化为 logits,让模型在训练阶段能以分类任务的形式进行训练。除了常用的 FC Layer 外,还可以替换为 cosmargin, arcmargin, circlemargin 等模块,一般由配置文件中的 [`Head`](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml#L36-L41)字段指定。
+- **Loss**: 指定所使用的 Loss 函数。我们将 Loss 设计为组合 loss 的形式,可以方便地将 Classification Loss 和 Metric learning Loss 组合在一起,一般由配置文件中的 [`Loss`](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml#L44-L50)字段指定。
-## 3. 通用识别模型
+## 3. 方法
+
+### 3.1 Backbone
+
+Backbone 部分采用了 [PP_LCNet_x2_5](../models/PP-LCNet.md),其针对 Intel CPU 端的性能优化探索了多个有效的结构设计方案,实现了在不增加推理时间的情况下进一步提升模型性能,最终大幅度超越现有的 SOTA 模型。
+
+### 3.2 Neck
+
+Neck 部分采用了 [FC Layer](../../../ppcls/arch/gears/fc.py),对 Backbone 抽取得到的特征进行降维,减少了特征存储的成本与计算量。
+
+### 3.3 Head
+
+Head 部分选用 [ArcMargin](../../../ppcls/arch/gears/arcmargin.py),在训练时通过指定 margin,增大不同类别特征之间的角度差异再进行分类,进一步提升抽取特征的表征能力。
-在 PP-Shitu 中, 我们采用 [PP_LCNet_x2_5](../models/PP-LCNet.md) 作为骨干网络 Neck 部分选用 Linear Layer, Head 部分选用 [ArcMargin](../../../ppcls/arch/gears/arcmargin.py),Loss 部分选用 CELoss,详细的配置文件见[通用识别配置文件](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml)。其中,训练数据为如下 7 个公开数据集的汇总:
+### 3.4 Loss
-| 数据集 | 数据量 | 类别数 | 场景 | 数据集地址 |
-| :------------: | :-------------: | :-------: | :-------: | :--------: |
-| Aliproduct | 2498771 | 50030 | 商品 | [地址](https://retailvisionworkshop.github.io/recognition_challenge_2020/) |
-| GLDv2 | 1580470 | 81313 | 地标 | [地址](https://github.com/cvdfoundation/google-landmark) |
-| VeRI-Wild | 277797 | 30671 | 车辆 | [地址](https://github.com/PKU-IMRE/VERI-Wild)|
-| LogoDet-3K | 155427 | 3000 | Logo | [地址](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
-| iCartoonFace | 389678 | 5013 | 动漫人物 | [地址](http://challenge.ai.iqiyi.com/detail?raceId=5def69ace9fcf68aef76a75d) |
-| SOP | 59551 | 11318 | 商品 | [地址](https://cvgl.stanford.edu/projects/lifted_struct/) |
-| Inshop | 25882 | 3997 | 商品 | [地址](http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html) |
-| **Total** | **5M** | **185K** | ---- | ---- |
+Loss 部分选用 [Cross entropy loss](../../../ppcls/loss/celoss.py),在训练时以分类任务的损失函数来指导网络进行优化。详细的配置文件见[通用识别配置文件](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml)。
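+
+将上述 Backbone、Neck、Head、Loss 各模块对应到配置文件中,关键字段大致如下(节选示意,完整字段与具体取值请以通用识别配置文件为准):
+
+```yaml
+Arch:
+  name: RecModel
+  Backbone:
+    name: PPLCNet_x2_5
+    pretrained: True
+  Neck:
+    name: FC
+    embedding_size: 1280  # Backbone 输出特征的维度(示意)
+    class_num: 512        # 降维后的特征维度(示意)
+  Head:
+    name: ArcMargin
+    embedding_size: 512
+    class_num: 185341     # 训练集类别数
+
+Loss:
+  Train:
+    - CELoss:
+        weight: 1.0
+```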
+
+
+
+## 4. 实验部分
+
+训练数据为如下 7 个公开数据集的汇总:
+
+| 数据集 | 数据量 | 类别数 | 场景 | 数据集地址 |
+| :----------: | :-----: | :------: | :------: | :--------------------------------------------------------------------------: |
+| Aliproduct | 2498771 | 50030 | 商品 | [地址](https://retailvisionworkshop.github.io/recognition_challenge_2020/) |
+| GLDv2 | 1580470 | 81313 | 地标 | [地址](https://github.com/cvdfoundation/google-landmark) |
+| VeRI-Wild | 277797 | 30671 | 车辆 | [地址](https://github.com/PKU-IMRE/VERI-Wild) |
+| LogoDet-3K | 155427 | 3000 | Logo | [地址](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
+| iCartoonFace | 389678 | 5013 | 动漫人物 | [地址](http://challenge.ai.iqiyi.com/detail?raceId=5def69ace9fcf68aef76a75d) |
+| SOP | 59551 | 11318 | 商品 | [地址](https://cvgl.stanford.edu/projects/lifted_struct/) |
+| Inshop | 25882 | 3997 | 商品 | [地址](http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html) |
+| **Total** | **5M** | **185K** | ---- | ---- |
最终的模型效果如下表所示:
-| 模型 | Aliproduct | VeRI-Wild | LogoDet-3K | iCartoonFace | SOP | Inshop | Latency(ms) |
-| :----------: | :---------: | :-------: | :-------: | :--------: | :--------: | :--------: | :--------: |
-PP-LCNet-2.5x | 0.839 | 0.888 | 0.861 | 0.841 | 0.793 | 0.892 | 5.0
+| 模型 | Aliproduct | VeRI-Wild | LogoDet-3K | iCartoonFace | SOP | Inshop | Latency(ms) |
+| :-----------------------------: | :--------: | :-------: | :--------: | :----------: | :---: | :----: | :---------: |
+| GeneralRecognition_PPLCNet_x2_5 | 0.839 | 0.888 | 0.861 | 0.841 | 0.793 | 0.892 | 5.0 |
+* 预训练模型地址:[通用识别预训练模型](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0.pdparams)
* 采用的评测指标为:`Recall@1`
* 速度评测机器的 CPU 具体信息为:`Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz`
* 速度指标的评测条件为: 开启 MKLDNN, 线程数设置为 10
-* 预训练模型地址:[通用识别预训练模型](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0.pdparams)
-
+
-## 4. 自定义特征提取
+## 5. 自定义特征提取
-自定义特征提取,是指依据自己的任务,重新训练特征提取模型。主要包含四个步骤:1)数据准备;2)模型训练;3)模型评估;4)模型推理。
+自定义特征提取,是指依据自己的任务,重新训练特征提取模型。
-
+下面基于`GeneralRecognition_PPLCNet_x2_5.yaml`配置文件,介绍主要的四个步骤:1)数据准备;2)模型训练;3)模型评估;4)模型推理。
-### 4.1 数据准备
-首先,需要基于任务定制自己的数据集。数据集格式参见[格式说明](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/data_preparation/recognition_dataset.md#%E6%95%B0%E6%8D%AE%E9%9B%86%E6%A0%BC%E5%BC%8F%E8%AF%B4%E6%98%8E)。在启动模型训练之前,需要在配置文件中修改数据配置相关的内容, 主要包括数据集的地址以及类别数量。对应到配置文件中的位置如下所示:
-```
- Head:
- name: ArcMargin
- embedding_size: 512
- class_num: 185341 #此处表示类别数
-```
-```
- Train:
- dataset:
- name: ImageNetDataset
- image_root: ./dataset/ #此处表示train数据所在的目录
- cls_label_path: ./dataset/train_reg_all_data.txt #此处表示train数据集label文件的地址
-```
-```
- Query:
- dataset:
- name: VeriWild
- image_root: ./dataset/Aliproduct/. #此处表示query数据集所在的目录
- cls_label_path: ./dataset/Aliproduct/val_list.txt. #此处表示query数据集label文件的地址
-```
-```
- Gallery:
- dataset:
- name: VeriWild
- image_root: ./dataset/Aliproduct/ #此处表示gallery数据集所在的目录
- cls_label_path: ./dataset/Aliproduct/val_list.txt. #此处表示gallery数据集label文件的地址
-```
-
-
+
-### 4.2 模型训练
+### 5.1 数据准备
-- 单机单卡训练
-```shell
-export CUDA_VISIBLE_DEVICES=0
-python tools/train.py -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
-```
-- 单机多卡训练
-```shell
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python -m paddle.distributed.launch \
- --gpus="0,1,2,3" tools/train.py \
- -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
-```
-**注意:**
-配置文件中默认采用`在线评估`的方式,如果你想加快训练速度,去除`在线评估`,只需要在上述命令后面,增加 `-o eval_during_train=False`。训练完毕后,在 output 目录下会生成最终模型文件 `latest`,`best_model` 和训练日志文件 `train.log`。其中,`best_model` 用来存储当前评测指标下的最佳模型;`latest` 用来存储最新生成的模型, 方便在任务中断的情况下从断点位置启动训练。
+首先需要基于任务定制自己的数据集。数据集格式与文件结构详见[数据集格式说明](../data_preparation/recognition_dataset.md)。
-- 断点续训:
-```shell
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python -m paddle.distributed.launch \
- --gpus="0,1,2,3" tools/train.py \
- -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
- -o Global.checkpoint="output/RecModel/latest"
-```
+准备完毕之后还需要在配置文件中修改数据配置相关的内容, 主要包括数据集的地址以及类别数量。对应到配置文件中的位置如下所示:
-
+- 修改类别数:
+ ```yaml
+ Head:
+ name: ArcMargin
+ embedding_size: 512
+ class_num: 185341 # 此处表示类别数
+ ```
+- 修改训练数据集配置:
+ ```yaml
+ Train:
+ dataset:
+ name: ImageNetDataset
+ image_root: ./dataset/ # 此处表示train数据所在的目录
+ cls_label_path: ./dataset/train_reg_all_data.txt # 此处表示train数据集label文件的地址
+ ```
+- 修改评估数据集中query数据配置:
+ ```yaml
+ Query:
+ dataset:
+ name: VeriWild
+ image_root: ./dataset/Aliproduct/ # 此处表示query数据集所在的目录
+ cls_label_path: ./dataset/Aliproduct/val_list.txt # 此处表示query数据集label文件的地址
+ ```
+- 修改评估数据集中gallery数据配置:
+ ```yaml
+ Gallery:
+ dataset:
+ name: VeriWild
+ image_root: ./dataset/Aliproduct/ # 此处表示gallery数据集所在的目录
+ cls_label_path: ./dataset/Aliproduct/val_list.txt # 此处表示gallery数据集label文件的地址
+ ```
+
+
+
+### 5.2 模型训练
+
+模型训练主要包括启动训练和断点恢复训练的功能。
-### 4.3 模型评估
+- 单机单卡训练
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0
+ python3.7 tools/train.py \
+ -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
+ ```
+- 单机多卡训练
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0,1,2,3
+ python3.7 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" tools/train.py \
+ -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
+ ```
+**注意:**
+配置文件中默认采用`在线评估`的方式,如果你想加快训练速度,可以关闭`在线评估`功能,只需要在上述命令的后面,增加 `-o Global.eval_during_train=False`。
+
+训练完毕后,在 output 目录下会生成最终模型文件 `latest.pdparams`,`best_model.pdparams` 和训练日志文件 `train.log`。其中,`best_model` 保存了当前评测指标下的最佳模型,`latest` 用来保存最新生成的模型, 方便在任务中断的情况下从断点位置恢复训练。通过在上述训练命令的末尾加上`-o Global.checkpoint="path_to_resume_checkpoint"`即可从断点恢复训练,示例如下。
+
+- 单机单卡断点恢复训练
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0
+ python3.7 tools/train.py \
+ -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
+ -o Global.checkpoint="output/RecModel/latest"
+ ```
+- 单机多卡断点恢复训练
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0,1,2,3
+ python3.7 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" tools/train.py \
+ -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
+ -o Global.checkpoint="output/RecModel/latest"
+ ```
+
+
+
+### 5.3 模型评估
+
+除了训练过程中对模型进行的在线评估,也可以手动启动评估程序来获得指定的模型的精度指标。
- 单卡评估
-```shell
-export CUDA_VISIBLE_DEVICES=0
-python tools/eval.py \
--c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
--o Global.pretrained_model="output/RecModel/best_model"
-```
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0
+ python3.7 tools/eval.py \
+ -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
+ -o Global.pretrained_model="output/RecModel/best_model"
+ ```
- 多卡评估
-```shell
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python -m paddle.distributed.launch \
- --gpus="0,1,2,3" tools/eval.py \
- -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
- -o Global.pretrained_model="output/RecModel/best_model"
-```
-**推荐:** 建议使用多卡评估。多卡评估方式可以利用多卡并行计算快速得到整体数据集的特征集合,能够加速评估的过程。
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0,1,2,3
+ python3.7 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" tools/eval.py \
+ -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
+ -o Global.pretrained_model="output/RecModel/best_model"
+ ```
+**注:** 建议使用多卡评估。该方式可以利用多卡并行计算快速得到全部数据的特征,能够加速评估的过程。
-
+
-### 4.4 模型推理
+### 5.4 模型推理
-推理过程包括两个步骤: 1)导出推理模型; 2)获取特征向量
+推理过程包括两个步骤: 1)导出推理模型;2)模型推理以获取特征向量
-
+#### 5.4.1 导出推理模型
-#### 4.4.1 导出推理模型
-
-```
-python tools/export_model.py \
+首先需要将 `*.pdparams` 模型文件转换成 inference 格式,转换命令如下。
+```shell
+python3.7 tools/export_model.py \
-c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
-o Global.pretrained_model="output/RecModel/best_model"
```
-生成的推理模型位于 `inference` 目录,里面包含三个文件,分别为 `inference.pdmodel`、`inference.pdiparams`、`inference.pdiparams.info`。
-其中: `inference.pdmodel` 用来存储推理模型的结构, `inference.pdiparams` 和 `inference.pdiparams.info` 用来存储推理模型相关的参数信息。
+生成的推理模型默认位于 `PaddleClas/inference` 目录,里面包含三个文件,分别为 `inference.pdmodel`、`inference.pdiparams`、`inference.pdiparams.info`。
+其中`inference.pdmodel` 用来存储推理模型的结构, `inference.pdiparams` 和 `inference.pdiparams.info` 用来存储推理模型相关的参数信息。
-
+#### 5.4.2 获取特征向量
-#### 4.4.2 获取特征向量
+使用上一步转换得到的 inference 格式模型,将输入图片转换为对应的特征向量,推理命令如下。
-```
+```shell
cd deploy
-python python/predict_rec.py \
+python3.7 python/predict_rec.py \
-c configs/inference_rec.yaml \
-o Global.rec_inference_model_dir="../inference"
```
得到的特征输出格式如下图所示:

-在实际使用过程中,单纯得到特征往往并不能够满足业务的需求。如果想进一步通过特征检索来进行图像识别,可以参照文档[向量检索](./vector_search.md)。
+在实际使用过程中,仅仅得到特征可能并不能满足业务需求。如果想进一步通过特征检索来进行图像识别,可以参照文档[向量检索](./vector_search.md)。
+
+
+
+## 6. 总结
+
+特征提取模块作为图像识别中的关键一环,在网络结构的设计,损失函数的选取上有很大的改进空间。不同的数据集类型有各自不同的特点,如行人重识别、商品识别、人脸识别数据集的分布、图片内容都不尽相同。学术界根据这些特点提出了各种各样的方法,如PCB、MGN、ArcFace、CircleLoss、TripletLoss等,围绕的还是增大类间差异、减少类内差异的最终目标,从而有效地应对各种真实场景数据。
+
+
+
+## 7. 参考文献
+
+1. [PP-LCNet: A Lightweight CPU Convolutional Neural Network](https://arxiv.org/pdf/2109.15099.pdf)
+2. [ArcFace: Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
diff --git a/docs/zh_CN/image_recognition_pipeline/mainbody_detection.md b/docs/zh_CN/image_recognition_pipeline/mainbody_detection.md
index f3d7989029ae763523cebf3d504920863b356adc..828fdf4f1f017d524aa9ebea1f1a409dee0eaf43 100644
--- a/docs/zh_CN/image_recognition_pipeline/mainbody_detection.md
+++ b/docs/zh_CN/image_recognition_pipeline/mainbody_detection.md
@@ -19,9 +19,13 @@
- [3.3 配置文件改动和说明](#3.3)
- [3.4 启动训练](#3.4)
- [3.5 模型预测与调试](#3.5)
- - [3.6 模型导出与预测部署](#3.6)
+- [4. 模型推理部署](#4)
+ - [4.1 推理模型准备](#4.1)
+ - [4.2 基于python预测引擎推理](#4.2)
+ - [4.3 其他推理方式](#4.3)
-
+
+
## 1. 数据集
@@ -37,7 +41,7 @@
在实际训练的过程中,将所有数据集混合在一起。由于是主体检测,这里将所有标注出的检测框对应的类别都修改为 `前景` 的类别,最终融合的数据集中只包含 1 个类别,即前景。
-
+
## 2. 模型选择
@@ -55,7 +59,7 @@
* 速度评测机器的 CPU 具体信息为:`Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz`,速度指标为开启 mkldnn,线程数设置为 10 测试得到。
* 主体检测的预处理过程较为耗时,平均每张图在上述机器上的时间在 40~55 ms 左右,没有包含在上述的预测耗时统计中。
-
+
### 2.1 轻量级主体检测模型
@@ -72,7 +76,7 @@ PicoDet 由 [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection)
在轻量级主体检测任务中,为了更好地兼顾检测速度与效果,我们使用 PPLCNet_x2_5 作为主体检测模型的骨干网络,同时将训练与预测的图像尺度修改为了 640x640,其余配置与 [picodet_lcnet_1_5x_416_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/picodet/more_config/picodet_lcnet_1_5x_416_coco.yml) 完全一致。将数据集更换为自定义的主体检测数据集,进行训练,最终得到检测模型。
-
+
### 2.2 服务端主体检测模型
@@ -93,13 +97,13 @@ PP-YOLO 由 [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection)
在服务端主体检测任务中,为了保证检测效果,我们使用 ResNet50vd-DCN 作为检测模型的骨干网络,使用配置文件 [ppyolov2_r50vd_dcn_365e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml),更换为自定义的主体检测数据集,进行训练,最终得到检测模型。
-
+
## 3. 模型训练
本节主要介绍怎样基于 PaddleDetection,基于自己的数据集,训练主体检测模型。
-
+
### 3.1 环境准备
@@ -116,7 +120,7 @@ pip install -r requirements.txt
更多安装教程,请参考: [安装文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md)
-
+
### 3.2 数据准备
@@ -128,7 +132,7 @@ pip install -r requirements.txt
[{u'id': 1, u'name': u'foreground', u'supercategory': u'foreground'}]
```
-
+
### 3.3 配置文件改动和说明
@@ -154,7 +158,7 @@ ppyolov2_reader.yml:主要说明数据读取器配置,如 batch size,并
此外,也可以根据实际情况,修改上述文件,比如,如果显存溢出,可以将 batch size 和学习率等比缩小等。
-
+
### 3.4 启动训练
@@ -198,7 +202,7 @@ python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppy
注意:如果遇到 "`Out of memory error`" 问题, 尝试在 `ppyolov2_reader.yml` 文件中调小 `batch_size`,同时等比例调小学习率。
-
+
### 3.5 模型预测与调试
@@ -211,9 +215,11 @@ python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer
`--draw_threshold` 是个可选参数. 根据 [NMS](https://ieeexplore.ieee.org/document/1699659) 的计算,不同阈值会产生不同的结果 `keep_top_k` 表示设置输出目标的最大数量,默认值为 100,用户可以根据自己的实际情况进行设定。
-
+
+## 4. 模型推理部署
-### 3.6 模型导出与预测部署。
+
+### 4.1 推理模型准备
执行导出模型脚本:
@@ -225,15 +231,21 @@ python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
注意: `PaddleDetection` 导出的 inference 模型的文件格式为 `model.xxx`,这里如果希望与 PaddleClas 的 inference 模型文件格式保持一致,需要将其 `model.xxx` 文件修改为 `inference.xxx` 文件,用于后续主体检测的预测部署。
-For more model-export tutorials, see [EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md)
+For more model-export tutorials, see [EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/EXPORT_MODEL.md)
Finally, the directory `inference/ppyolov2_r50vd_dcn_365e_coco` contains `inference.pdiparams`, `inference.pdiparams.info`, and `inference.pdmodel`, where `inference.pdiparams` is the saved weights file of the inference model and `inference.pdmodel` is the saved structure file of the inference model.
+
+### 4.2 Inference with the Python prediction engine
After exporting the model, for the mainbody detection and recognition task you can simply change the detection-model path to this inference model directory to run prediction.
Taking product recognition as an example, in its config file [inference_product.yaml](../../../deploy/configs/inference_product.yaml) change the `Global.det_inference_model_dir` field to the exported mainbody-detection inference model directory, then follow the [image recognition quick start](../quick_start/quick_start_recognition.md) to complete product detection and recognition.
+
+### 4.3 Other inference methods
+For other inference methods, such as C++ inference deployment and PaddleServing deployment, see [detection model inference and deployment](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/README.md).
+
### FAQ
diff --git a/docs/zh_CN/image_recognition_pipeline/vector_search.md b/docs/zh_CN/image_recognition_pipeline/vector_search.md
index 6cf4d207ddfa5f3cade2ac727b12df2038f3943c..be0bf785c9b4844a9e6d2ae744ceb37c5ddbfed7 100644
--- a/docs/zh_CN/image_recognition_pipeline/vector_search.md
+++ b/docs/zh_CN/image_recognition_pipeline/vector_search.md
@@ -1,5 +1,21 @@
# Vector search
+## Contents
+
+- [1. Application scenarios of vector search](#1)
+- [2. Vector search algorithms](#2)
+ - [2.1 HNSW](#2.1)
+ - [2.2 IVF](#2.2)
+ - [2.3 FLAT](#2.3)
+- [3. Installing the search library](#3)
+- [4. Usage and configuration](#4)
+ - [4.1 Building the gallery and config parameters](#4.1)
+ - [4.2 Search config parameters](#4.2)
+
+
+
+## 1. Application scenarios of vector search
+
Vector search is widely used in image recognition and image retrieval. Its main goal is, for a given query vector, to compute the feature similarity or distance against all candidate vectors in an established gallery and obtain a similarity ranking. In the image recognition system, we use [Faiss](https://github.com/facebookresearch/faiss) to support this part; for details, see the [Faiss official site](https://github.com/facebookresearch/faiss). `Faiss` has the following main advantages
- Good compatibility: supports Windows, Linux, and macOS
@@ -20,17 +36,33 @@
--------------------------
-## Contents
+
+## 2. Vector search algorithms
+
+The search module in `PaddleClas` currently supports three algorithms: **HNSW32**, **IVF**, and **FLAT**, each suited to different scenarios. `HNSW32` is the default method and strikes a good balance between search accuracy and speed; for details on the algorithms, see the [official documentation](https://github.com/facebookresearch/faiss/wiki).
+
+
+### 2.1 HNSW
+
+This is a graph-based index method. As shown in the figure below, the index is built in layers, so search accuracy is high and search is fast, but the feature gallery only supports adding image features, not deleting them. Graph-based algorithms perform very well in vector-search benchmarks; if you care about search efficiency and can tolerate some extra memory cost, graph-based search is recommended in most scenarios. HNSW is a typical and widely used graph algorithm, and many distributed search engines have adapted it for high-concurrency, large-scale online queries. This is the default method.
+
+

+
+
+
+### 2.2 IVF
-- [1. Installing the search library](#1)
-- [2. Vector search algorithms](#2)
-- [3. Usage and configuration](#3)
- - [3.1 Building the gallery and config parameters](#3.1)
- - [3.2 Search config parameters](#3.2)
+An inverted-file index method. It is fast, with slightly lower accuracy. The feature gallery supports both adding and deleting image features. IVF uses the inverted-index idea to store the vectors under each cluster center; for each query it first finds the nearest few centers and then searches only the vectors under them, which greatly improves efficiency by shrinking the search range.
-
+
+### 2.3 FLAT
-## 1. Installing the search library
+A brute-force search algorithm. It has the highest accuracy, but search becomes slow when the data volume is large. The feature gallery supports both adding and deleting image features.
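+
+The three algorithms correspond directly to Faiss index types. Below is a minimal, hedged sketch of how each could be built with `faiss` (random features and illustrative sizes; in PaddleClas this is driven through `python/build_gallery.py`):
+
+```python
+import faiss
+import numpy as np
+
+d = 512                                    # feature dimension (embedding_size)
+feats = np.random.rand(10000, d).astype("float32")
+
+# HNSW32: graph-based index, high accuracy, add-only (the default)
+hnsw = faiss.IndexHNSWFlat(d, 32, faiss.METRIC_INNER_PRODUCT)
+hnsw.add(feats)
+
+# IVF: inverted-file index, fast, supports add/remove, needs training
+quantizer = faiss.IndexFlatIP(d)
+ivf = faiss.IndexIVFFlat(quantizer, d, 100, faiss.METRIC_INNER_PRODUCT)
+ivf.train(feats)
+ivf.add(feats)
+
+# FLAT: exact brute-force search
+flat = faiss.IndexFlatIP(d)
+flat.add(feats)
+
+scores, ids = hnsw.search(feats[:1], 5)    # top-5 neighbours of one query
+```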
+
+
+
+
+## 3. Installing the search library
`Faiss` can be installed as follows:
@@ -40,27 +72,16 @@ pip install faiss-cpu==1.7.1post2
If it cannot be imported properly when used, `uninstall` and then re-`install` it; this is especially common on Windows.
-
-
-## 2. Vector search algorithms
-
-The search module in `PaddleClas` currently supports the following three algorithms
-
-- **HNSW32**: a graph-based index method. High search accuracy and fast speed, but the feature gallery only supports adding image features, not deleting them. (default method)
-- **IVF**: an inverted-file index method. Fast, with slightly lower accuracy. The feature gallery supports adding and deleting image features.
-- **FLAT**: brute-force search. Highest accuracy, but slow when the data volume is large. The feature gallery supports adding and deleting image features.
-
-Each algorithm suits different scenarios. `HNSW32` is the default method and strikes a good balance between search accuracy and speed; for details on the algorithms, see the [official documentation](https://github.com/facebookresearch/faiss/wiki).
-
+
-## 3. Usage and configuration
+## 4. Usage and configuration
-The search-related config files are located under `deploy/configs/`, where `build_*.yaml` are the config files for building the feature gallery and `inference_*.yaml` are the inference config files for search or classification.
+The search-related config files are located under `deploy/configs/`, where `inference_*.yaml` are the inference config files for search or classification and also serve as the config files for building the feature gallery.
-
+
-### 3.1 Building the gallery and config parameters
+### 4.1 Building the gallery and config parameters
The gallery is built as follows:
@@ -68,14 +89,14 @@ pip install faiss-cpu==1.7.1post2
# enter the deploy directory
cd deploy
# change the yaml file below to the specific yaml file you need
-python python/build_gallery.py -c configs/build_***.yaml
+python python/build_gallery.py -c configs/inference_***.yaml
```
The gallery-building configuration in the `yaml` file is shown below; please adjust it to your actual situation when running. Building the gallery extracts features from the images under `image_root` according to the image list in `data_file` and stores them under `index_dir` for later retrieval.
The `data_file` file stores the paths and labels of the image files, one per line, in the format `image_path label`, separated by the `delimiter` parameter in the `yaml` file (see the example after the parameter list below).
-For the specific feature-extraction model parameters, see the `yaml` file.
+For the specific feature-extraction model parameters, see the `yaml` file. Note that the configuration below only lists the parts related to building the index gallery.
```yaml
# indexing engine config
@@ -88,6 +109,7 @@ IndexProcess:
delimiter: "\t"
dist_type: "IP"
embedding_size: 512
+ batch_size: 32
```
- **index_method**: the search algorithm to use. Three are currently supported: HNSW32, IVF, and Flat
@@ -98,23 +120,29 @@ IndexProcess:
- **delimiter**: the separator of each line in **data_file**
- **dist_type**: the similarity metric used during feature matching, e.g. `IP` (inner product) or `L2` (Euclidean distance)
- **embedding_size**: the feature dimension
+- **batch_size**: the `batch_size` used for feature extraction when building the gallery
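+
+With the default `delimiter` of `"\t"`, a `data_file` might look like the following (tab-separated; the paths and labels are purely illustrative):
+
+```
+gallery/milk/001.jpg	milk
+gallery/milk/002.jpg	milk
+gallery/juice/001.jpg	orange_juice
+```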
-
+
+
+### 4.2 Search config parameters
-### 3.2 Search config parameters
For how search is integrated into the overall `PP-ShiTu` pipeline, see the `PP-ShiTu image recognition system` section of the [README](../../../README_ch.md). For concrete usage of search, see the [recognition quick start](../quick_start/quick_start_recognition.md).
The search-related part of the configuration is shown below; for the full search config files, see `deploy/configs/inference_*.yaml`.
+Note: this part only lists the parameters related to offline search.
+
```yaml
IndexProcess:
index_dir: "./recognition_demo_data_v1.1/gallery_logo/index/"
return_k: 5
score_thres: 0.5
+ hamming_radius: 100
```
Compared with the gallery-building config, the new parameters are mainly the following:
- `return_k`: the number of top-`k` results to return
- `score_thres`: the matching threshold for search
+- `hamming_radius`: the Hamming-distance radius. This parameter takes effect only when a binary-feature model is used and `dist_type` is set to `hamming`. For how to use binary-feature models, see [deep hashing](./deep_hashing.md)
diff --git a/docs/zh_CN/inference_deployment/export_model.md b/docs/zh_CN/inference_deployment/export_model.md
index 1d8decb2837c0f68f71a6b022b05e574ce3ef83b..5e7d204c5f3e9755d2c97428c040fe7c2aa328e2 100644
--- a/docs/zh_CN/inference_deployment/export_model.md
+++ b/docs/zh_CN/inference_deployment/export_model.md
@@ -17,7 +17,7 @@ PaddlePaddle 支持导出 inference 模型用于部署推理场景,相比于
## 1. Environment preparation
-First, please refer to [Install PaddlePaddle](../installation/install_paddle.md) and [Install PaddleClas](../installation/install_paddleclas.md) to configure the running environment.
+First, please refer to [Environment preparation](../installation/install_paddleclas.md) to configure the running environment.
## 2. Exporting classification models
diff --git a/docs/zh_CN/inference_deployment/python_deploy.md b/docs/zh_CN/inference_deployment/python_deploy.md
index 39843df12d17265fc586b160003e3361edb8a14a..9d4f254fdde8400b369dc54a4437dcc5f6929126 100644
--- a/docs/zh_CN/inference_deployment/python_deploy.md
+++ b/docs/zh_CN/inference_deployment/python_deploy.md
@@ -2,14 +2,15 @@
---
-First, please refer to [Install PaddlePaddle](../installation/install_paddle.md) and [Install PaddleClas](../installation/install_paddleclas.md) to configure the running environment.
+First, please refer to [Environment preparation](../installation/install_paddleclas.md) to configure the running environment.
## Contents
-- [1. Image classification inference](#1)
-- [2. Mainbody detection model inference](#2)
-- [3. Feature extraction model inference](#3)
-- [4. Chaining mainbody detection, feature extraction and vector search](#4)
+- [1. Image classification model inference](#1)
+- [2. PP-ShiTu model inference](#2)
+ - [2.1 Mainbody detection model inference](#2.1)
+ - [2.2 Feature extraction model inference](#2.2)
+ - [2.3 PP-ShiTu pipeline inference](#2.3)
## 1. Image classification inference
@@ -42,7 +43,12 @@ python python/predict_cls.py -c configs/inference_cls.yaml
* To speed up model evaluation, it is recommended to enable TensorRT acceleration when evaluating on GPU and MKL-DNN acceleration when evaluating on CPU, as sketched below.
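
For example (a hedged sketch; `Global.use_tensorrt` and `Global.enable_mkldnn` are assumed to be available switches in your `deploy/configs/inference_*.yaml`):

```shell
# GPU evaluation with TensorRT acceleration
python python/predict_cls.py -c configs/inference_cls.yaml -o Global.use_gpu=True -o Global.use_tensorrt=True
# CPU evaluation with MKL-DNN acceleration
python python/predict_cls.py -c configs/inference_cls.yaml -o Global.use_gpu=False -o Global.enable_mkldnn=True
```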
-## 2. Mainbody detection model inference
+## 2. PP-ShiTu model inference
+
+The whole PP-ShiTu pipeline consists of three parts: mainbody detection, feature extraction, and vector search. The mainbody detection and feature extraction models can also be run on their own: see [2.1](#2.1) for standalone mainbody detection inference, [2.2](#2.2) for standalone feature-extraction inference, and [2.3](#2.3) for inference with the full PP-ShiTu pipeline.
+
+
+### 2.1 Mainbody detection model inference
Enter the `deploy` directory of PaddleClas:
@@ -70,8 +76,8 @@ python python/predict_det.py -c configs/inference_det.yaml
* `Global.use_gpu`: whether to use GPU for prediction; `True` by default.
-
-## 3. Feature extraction model inference
+
+### 2.2 Feature extraction model inference
The following introduces feature-extraction model inference, taking product feature extraction as an example. First, enter the `deploy` directory of PaddleClas:
@@ -90,7 +96,7 @@ tar -xf ./models/product_ResNet50_vd_aliproduct_v1.0_infer.tar -C ./models/
The prediction command above produces a 512-dimensional feature vector, which is printed directly in the command line.
-
-## 4. Chaining mainbody detection, feature extraction and vector search
+
+### 2.3 PP-ShiTu pipeline inference
For chained prediction with mainbody detection, feature extraction, and vector search, see the image recognition [quick start](../quick_start/quick_start_recognition.md).
diff --git a/docs/zh_CN/inference_deployment/whl_deploy.md b/docs/zh_CN/inference_deployment/whl_deploy.md
index 14582ace5ce13636c7c14e7fdb9ba9ad2ebbfe90..e6ad70904853d17f89974ff62b812a3420d21a2b 100644
--- a/docs/zh_CN/inference_deployment/whl_deploy.md
+++ b/docs/zh_CN/inference_deployment/whl_deploy.md
@@ -18,7 +18,7 @@ PaddleClas 支持 Python Whl 包方式进行预测,目前 Whl 包方式仅支
- [4.6 Prediction on data in `NumPy.ndarray` format](#4.6)
- [4.7 Saving prediction results](#4.7)
- [4.8 Specifying the label name](#4.8)
-
+
## 1. Installing paddleclas
@@ -212,14 +212,14 @@ print(next(result))
```python
from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50', save_dir='./output_pre_label/')
-infer_imgs = 'docs/images/whl/' # it can be a folder path containing all the images you want to predict
+infer_imgs = 'docs/images/' # it can be a folder path containing all the images you want to predict
result=clas.predict(infer_imgs)
print(next(result))
```
* CLI
```bash
-paddleclas --model_name='ResNet50' --infer_imgs='docs/images/whl/' --save_dir='./output_pre_label/'
+paddleclas --model_name='ResNet50' --infer_imgs='docs/images/' --save_dir='./output_pre_label/'
```
diff --git a/docs/zh_CN/installation/install_paddle.md b/docs/zh_CN/installation/install_paddle.md
deleted file mode 100644
index 995d28797c3078956af5571ef11506c2028481e4..0000000000000000000000000000000000000000
--- a/docs/zh_CN/installation/install_paddle.md
+++ /dev/null
@@ -1,101 +0,0 @@
-# Installing PaddlePaddle
-
----
-## Contents
-
-- [1. Requirements](#1)
-- [2. (Recommended) Using a Docker environment](#2)
-- [3. Installing PaddlePaddle via pip](#3)
-- [4. Verifying the installation](#4)
-
-Currently, **PaddleClas** requires **PaddlePaddle** version `>=2.0`. It is recommended to run PaddleClas with the Docker image we provide; for usage tutorials on Docker and nvidia-docker, see [this link](https://www.runoob.com/Docker/Docker-tutorial.html). If you do not use Docker, you can skip section [2. (Recommended) Using a Docker environment](#2) and start directly from [3. Installing PaddlePaddle via pip](#3).
-
-
-
-## 1. Requirements
-
-**Version requirements**:
-- python 3.x
-- CUDA >= 10.1 (if `paddlepaddle-gpu` is used)
-- cuDNN >= 7.6.4 (if `paddlepaddle-gpu` is used)
-- nccl >= 2.1.2 (if distributed training/evaluation is used)
-- gcc >= 8.2
-
-**Recommendations**:
-* For CUDA version 10.1, GPU driver version `>= 418.39`;
-* For CUDA version 10.2, GPU driver version `>= 440.33`;
-* For more CUDA versions and the required GPU driver versions, see [this link](https://docs.nvidia.com/deploy/cuda-compatibility/index.html).
-
-
-
-## 2. (Recommended) Using a Docker environment
-
-* Switch to the working directory
-
-```shell
-cd /home/Projects
-```
-
-* Create a docker container
-
-The following command creates a Docker container named ppcls and maps the current working directory to the `/paddle` directory inside the container.
-
-```shell
-# for GPU users
-sudo nvidia-docker run --name ppcls -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0-gpu-cuda10.2-cudnn7 /bin/bash
-
-# for CPU users
-sudo docker run --name ppcls -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0 /bin/bash
-```
-
-**Note**:
-* The first time you use this image, the command above will download it automatically; the download takes some time, please be patient;
-* The command above creates a Docker container named ppcls; when using this container again later, there is no need to run the command again;
-* The parameter `--shm-size=8G` sets the container's shared memory to 8 G; if your machine allows, it is recommended to set it larger, e.g. `64G`;
-* You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to get an image suited to your machine;
-* Exiting/entering the docker container:
- * After entering the Docker container, press `Ctrl + P + Q` to exit the current container without stopping it;
- * To enter the container again, use the following command:
-
- ```shell
- sudo docker exec -it ppcls /bin/bash
- ```
-
-
-
-## 3. Installing PaddlePaddle via pip
-
-Run the following command to install the latest version of PaddlePaddle via pip:
-
-```bash
-# for CPU users
-pip install paddlepaddle --upgrade -i https://mirror.baidu.com/pypi/simple
-
-# for GPU users
-pip install paddlepaddle-gpu --upgrade -i https://mirror.baidu.com/pypi/simple
-```
-
-**Note:**
-* If you install the CPU version of PaddlePaddle first and later want to switch to the GPU version, uninstall the CPU version with pip before installing the GPU version; otherwise PaddlePaddle conflicts are likely.
-* You can also compile and install PaddlePaddle from source; see the instructions in the [PaddlePaddle installation documentation](http://www.paddlepaddle.org.cn/install/quick).
-
-
-## 4. Verifying the installation
-
-Run the following to verify that PaddlePaddle is installed successfully.
-
-```python
-import paddle
-paddle.utils.run_check()
-```
-
-Check the PaddlePaddle version with the following command:
-
-```bash
-python -c "import paddle; print(paddle.__version__)"
-```
-
-**Note**:
-- A PaddlePaddle built from source has version number `0.0.0`; make sure to build from PaddlePaddle 2.0 or later source code;
-- PaddleClas relies on PaddlePaddle's high-performance distributed training capability; if you build from source, enable the compile option `WITH_DISTRIBUTE=ON`. For details, see the [compile options table](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#bianyixuanxiangbiao);
-- When running in Docker, to ensure the container has enough shared memory for Paddle's data-loading acceleration, set the parameter `--shm-size=8g` when creating the container, or a larger value if conditions allow.
diff --git a/docs/zh_CN/installation/install_paddleclas.md b/docs/zh_CN/installation/install_paddleclas.md
index 0f70bf2364589dbe85bf09128fc034d9d250d22b..e02acc6fdae4f211c07232489d07b31bd187da1d 100644
--- a/docs/zh_CN/installation/install_paddleclas.md
+++ b/docs/zh_CN/installation/install_paddleclas.md
@@ -1,29 +1,94 @@
-# Installing PaddleClas
+# Environment preparation

---
## Contents

-* [1. Cloning PaddleClas](#1)
-* [2. Installing Python dependencies](#2)
+- [1. Installing PaddlePaddle](#1)
+ - [1.1 Using the official Paddle image](#1.1)
+ - [1.2 Installing paddle in an existing environment](#1.2)
+ - [1.3 Verifying the installation](#1.3)
+- [2. Cloning PaddleClas](#2)
+- [3. Installing Python dependencies](#3)
+### 1. Installing PaddlePaddle
+Currently, **PaddleClas** requires **PaddlePaddle** version `>=2.3`.
+It is recommended to run PaddleClas with the official Paddle Docker image; for usage tutorials on Docker and nvidia-docker, see [this link](https://www.runoob.com/Docker/Docker-tutorial.html).
-## 1. Cloning PaddleClas
+
+
+#### 1.1 (Recommended) Using a Docker environment
+
+* Switch to the working directory; for example, if the working directory is `/home/Projects`, run:
+
+```shell
+cd /home/Projects
+```
+
+* Create a docker container
+
+The following command creates a Docker container named ppcls and maps the current working directory to the `/paddle` directory inside the container.
+
+```shell
+# for GPU users
+sudo nvidia-docker run --name ppcls -v $PWD:/paddle --shm-size=8G --network=host -it registry.baidubce.com/paddlepaddle/paddle:2.3.0-gpu-cuda10.2-cudnn7 /bin/bash
+
+# for CPU users
+sudo docker run --name ppcls -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.3.0 /bin/bash
+```
+
+**Note**:
+* The first time you use this image, the command above will download it automatically; the download takes some time, please be patient;
+* The command above creates a Docker container named ppcls; when using this container again later, there is no need to run the command again;
+* The parameter `--shm-size=8G` sets the container's shared memory to 8 G; if your machine allows, it is recommended to set it larger, e.g. `64G`;
+* You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) and pick the image you need manually;
+* Exiting/entering the docker container:
+ * After entering the Docker container, press `Ctrl + P + Q` to exit the current container without stopping it;
+ * To enter the container again, use the following command:
+
+ ```shell
+ sudo docker exec -it ppcls /bin/bash
+ ```
+
+#### 1.2 Installing paddle in an existing environment
+You can also install paddle directly with pip or conda; for details, see the [quick install](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/docker/linux-docker.html) section of the official documentation.
+
+#### 1.3 Verifying the installation
+Run the following to verify that PaddlePaddle is installed successfully.
+```python
+import paddle
+paddle.utils.run_check()
+```
+Check the PaddlePaddle version with the following command:
+
+```bash
+python -c "import paddle; print(paddle.__version__)"
+```
+
+**Note**:
+- A PaddlePaddle built from source has version number `0.0.0`; make sure to build from PaddlePaddle 2.3 or later source code;
+- PaddleClas relies on PaddlePaddle's high-performance distributed training capability; if you build from source, enable the compile option `WITH_DISTRIBUTE=ON`. For details, see the [compile options table](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#bianyixuanxiangbiao);
+- When running in Docker, to ensure the container has enough shared memory for Paddle's data-loading acceleration, set the parameter `--shm-size=8g` when creating the container, or a larger value if conditions allow.
+
+
+
+
+### 2. Cloning PaddleClas

Download from GitHub:
```shell
-git clone https://github.com/PaddlePaddle/PaddleClas.git -b release/2.3
+git clone https://github.com/PaddlePaddle/PaddleClas.git -b release/2.4
```
If access to GitHub is slow, you can download from Gitee instead, with the following command:
```shell
-git clone https://gitee.com/paddlepaddle/PaddleClas.git -b release/2.3
+git clone https://gitee.com/paddlepaddle/PaddleClas.git -b release/2.4
```
-
+
-## 2. Installing Python dependencies
+### 3. Installing Python dependencies

The Python dependencies of PaddleClas are listed in `requirements.txt` and can be installed with the following command:
diff --git a/docs/zh_CN/models/PP-HGNet.md b/docs/zh_CN/models/PP-HGNet.md
index d4b4a975d105f632a46c75a78b89089bdb1590e0..1150c87584319024767af1d3564f135d5391d83d 100644
--- a/docs/zh_CN/models/PP-HGNet.md
+++ b/docs/zh_CN/models/PP-HGNet.md
@@ -1,20 +1,43 @@
# PP-HGNet series
---
-## Contents
-
-* [1. Overview](#1)
-* [2. Architecture](#2)
-* [3. Experimental results](#3)
+- [1. Model introduction](#1)
+  - [1.1 Overview](#1.1)
+  - [1.2 Model details](#1.2)
+  - [1.3 Experimental results](#1.3)
+- [2. Quick start](#2)
+  - [2.1 Installing paddleclas](#2.1)
+  - [2.2 Prediction](#2.2)
+- [3. Model training, evaluation and prediction](#3)
+  - [3.1 Environment setup](#3.1)
+  - [3.2 Data preparation](#3.2)
+  - [3.3 Model training](#3.3)
+  - [3.4 Model evaluation](#3.4)
+  - [3.5 Model prediction](#3.5)
+- [4. Model inference and deployment](#4)
+  - [4.1 Preparing the inference model](#4.1)
+    - [4.1.1 Exporting an inference model from trained weights](#4.1.1)
+    - [4.1.2 Downloading an inference model directly](#4.1.2)
+  - [4.2 Inference with the Python prediction engine](#4.2)
+    - [4.2.1 Predicting a single image](#4.2.1)
+    - [4.2.2 Batch prediction on a folder](#4.2.2)
+  - [4.3 Inference with the C++ prediction engine](#4.3)
+  - [4.4 Serving deployment](#4.4)
+  - [4.5 On-device deployment](#4.5)
+  - [4.6 Paddle2ONNX model conversion and prediction](#4.6)
-## 1. Overview
+## 1. Model introduction
+
+
+
+### 1.1 Overview
-PP-HGNet (High Performance GPU Net) is a high-performance backbone network developed by the Baidu PaddlePaddle vision team for GPU platforms. Built on VOVNet, it uses a learnable downsampling layer (LDS Layer) and combines the strengths of models such as ResNet_vd and PPLCNet. On GPU, it achieves higher accuracy than other SOTA models at the same speed: 3.8 percentage points above ResNet34-D and 2.4 percentage points above ResNet50-D, and 4.7 percentage points above ResNet50-D when Baidu's self-developed SSLD distillation strategy is applied. At the same accuracy, its inference speed also far exceeds that of mainstream VisionTransformers.
+PP-HGNet (High Performance GPU Net) is a high-performance backbone network developed by the Baidu PaddlePaddle vision team for GPU platforms. Built on VOVNet, it uses a learnable downsampling layer (LDS Layer) and combines the strengths of models such as ResNet_vd and PPLCNet. On GPU, it achieves higher accuracy than other SOTA models at the same speed: 3.8 percentage points above ResNet34-D and 2.4 percentage points above ResNet50-D, and 4.7 percentage points above ResNet50-D when Baidu's self-developed SSLD distillation strategy is applied. At the same accuracy, its inference speed also far exceeds that of mainstream VisionTransformers.
-
+
-## 2. Architecture
+### 1.2 Model details
Targeting GPU devices, the PP-HGNet authors analyzed and summarized today's GPU-friendly networks and used standard 3x3 convolutions (which have the highest compute density) as much as possible. Taking VOVNet as the baseline model, they fused the main improvements that benefit GPU inference, obtaining a backbone network that, at the same speed, greatly outperforms other CNN and VisionTransformer models in accuracy.
@@ -26,14 +49,29 @@ PP-HGNet 骨干网络的整体结构如下:

-
+
+
+### 1.3 Experimental results
+
+The accuracy and speed metrics of the currently released PP-HGNet models, together with links to pretrained weights, are as follows:
+
+| Model | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) | Pretrained model | Inference model |
+|:--: |:--: |:--: |:--: | :--: |:--: |
+| PPHGNet_tiny | 79.83 | 95.04 | 1.77 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_infer.tar) |
+| PPHGNet_tiny_ssld | 81.95 | 96.12 | 1.77 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_tiny_ssld_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_tiny_ssld_infer.tar) |
+| PPHGNet_small | 81.51| 95.82 | 2.52 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_infer.tar) |
+| PPHGNet_small_ssld | 83.82| 96.81 | 2.52 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_small_ssld_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_ssld_infer.tar) |
+| PPHGNet_base_ssld | 85.00| 97.35 | 5.97 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPHGNet_base_ssld_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_base_ssld_infer.tar) |
-## 3. Experimental results
+**Notes:**
+
+* 1. `_ssld` denotes a model trained with `SSLD distillation`. For details on `SSLD distillation`, see [SSLD distillation](../advanced_tutorials/knowledge_distillation.md).
+* 2. More PP-HGNet model metrics and weights are coming soon.
The comparison between PP-HGNet and other models is shown below. The test machine is an NVIDIA® Tesla® V100 with the TensorRT engine enabled and FP32 precision. At the same speed, PP-HGNet surpasses other SOTA CNN models in accuracy, and compared with the SwinTransformer model it is more than twice as fast while being more accurate.
| Model | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
-|-------|---------------|---------------|-------------|
+|:--: |:--: |:--: |:--: |
| ResNet34 | 74.57 | 92.14 | 1.97 |
| ResNet34_vd | 75.98 | 92.98 | 2.00 |
| EfficientNetB0 | 77.38 | 93.31 | 1.96 |
@@ -46,6 +84,304 @@ PP-HGNet 与其他模型的比较如下,其中测试机器为 NVIDIA® Tesla®
| SwinTransformer_tiny | 81.2 | 95.5 | 6.59 |
| PPHGNet_small | 81.51| 95.82 | 2.52 |
| PPHGNet_small_ssld | 83.82| 96.81 | 2.52 |
+| Res2Net200_vd_26w_4s_ssld| 85.13 | 97.42 | 11.45 |
+| ResNeXt101_32x48d_wsl | 85.37 | 97.69 | 55.07 |
+| SwinTransformer_base | 85.2 | 97.5 | 13.53 |
+| PPHGNet_base_ssld | 85.00| 97.35 | 5.97 |
+
+
+
+
+## 2. Quick start
+
+
+
+### 2.1 Installing paddleclas
+
+Quickly install paddlepaddle and paddleclas with the following command:
+
+```
+pip3 install paddlepaddle paddleclas
+```
+
+
+### 2.2 Prediction
+
+* Quick prediction on the command line, using the PPHGNet_small weights
+
+```bash
+paddleclas --model_name=PPHGNet_small --infer_imgs="docs/images/inference_deployment/whl_demo.jpg"
+```
+
+The result is as follows:
+```
+>>> result
+class_ids: [8, 7, 86, 82, 81], scores: [0.71479, 0.08682, 0.00806, 0.0023, 0.00121], label_names: ['hen', 'cock', 'partridge', 'ruffed grouse, partridge, Bonasa umbellus', 'ptarmigan'], filename: docs/images/inference_deployment/whl_demo.jpg
+Predict complete!
+```
+
+**Note**: To switch to another scale of PPHGNet, just replace `model_name`; for example, to use `PPHGNet_tiny`, simply change `--model_name=PPHGNet_small` to `--model_name=PPHGNet_tiny`.
+
+
+* Prediction in Python code
+```python
+from paddleclas import PaddleClas
+clas = PaddleClas(model_name='PPHGNet_small')
+infer_imgs = 'docs/images/inference_deployment/whl_demo.jpg'
+result = clas.predict(infer_imgs)
+print(next(result))
+```
+
+**Note**: `PaddleClas.predict()` returns an iterable object (a `generator`), so it must be consumed with the `next()` function or a `for` loop.
+Each call runs one prediction of `batch_size` images and yields the results. An example of the returned result is as follows:
+
+```
+>>> result
+[{'class_ids': [8, 7, 86, 82, 81], 'scores': [0.71479, 0.08682, 0.00806, 0.0023, 0.00121], 'label_names': ['hen', 'cock', 'partridge', 'ruffed grouse, partridge, Bonasa umbellus', 'ptarmigan'], 'filename': 'docs/images/inference_deployment/whl_demo.jpg'}]
+```
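+
+For example, iterating over a whole folder of images (a small usage sketch; the folder path is illustrative):
+
+```python
+from paddleclas import PaddleClas
+
+clas = PaddleClas(model_name='PPHGNet_small')
+# predict() yields one list of result dicts per batch
+for batch in clas.predict('docs/images/'):
+    for res in batch:
+        print(res['filename'], res['label_names'][0], res['scores'][0])
+```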
+
+
+
+
+## 3. Model training, evaluation and prediction
+
+
+
+### 3.1 Environment setup
+
+* Installation: first refer to [Environment preparation](../installation/install_paddleclas.md) to configure the PaddleClas running environment.
+
+
+
+### 3.2 Data preparation
+
+Please prepare the ImageNet-1k data from the [ImageNet website](https://www.image-net.org/).
+
+
+Enter the PaddleClas directory.
+
+```
+cd path_to_PaddleClas
+```
+
+Enter the `dataset/` directory, name the downloaded data `ILSVRC2012`, and place it there. The `ILSVRC2012` directory contains the following data:
+
+```
+├── train
+│ ├── n01440764
+│ │ ├── n01440764_10026.JPEG
+│ │ ├── n01440764_10027.JPEG
+├── train_list.txt
+...
+├── val
+│ ├── ILSVRC2012_val_00000001.JPEG
+│ ├── ILSVRC2012_val_00000002.JPEG
+├── val_list.txt
+```
+
+`train/` and `val/` are the training set and validation set, respectively; `train_list.txt` and `val_list.txt` are their label files.
+
+**Note:**
+
+* For the format of `train_list.txt` and `val_list.txt`, see [PaddleClas classification dataset format](../data_preparation/classification_dataset.md#1-数据集格式说明).
+
+
+
+
+### 3.3 Model training
+
+
+The PPHGNet_small training configuration is provided in `ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml`; training can be started with the following script:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3 -m paddle.distributed.launch \
+ --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml
+```
+
+
+**Note:**
+
+* The model with the best accuracy so far is saved at `output/PPHGNet_small/best_model.pdparams`
+
+
+
+### 3.4 Model evaluation
+
+After training, the model metrics can be evaluated with the following command.
+
+```bash
+python3 tools/eval.py \
+ -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml \
+ -o Global.pretrained_model=output/PPHGNet_small/best_model
+```
+
+Here, `-o Global.pretrained_model="output/PPHGNet_small/best_model"` specifies the path of the current best weights; to evaluate other weights, just replace the path accordingly.
+
+
+
+### 3.5 Model prediction
+
+After training completes, the trained weights can be loaded for model prediction. A complete example is provided in `tools/infer.py`; just run the following command:
+
+```shell
+python3 tools/infer.py \
+ -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml \
+ -o Global.pretrained_model=output/PPHGNet_small/best_model
+```
+
+The output is as follows:
+
+```
+[{'class_ids': [8, 7, 86, 82, 81], 'scores': [0.71479, 0.08682, 0.00806, 0.0023, 0.00121], 'file_name': 'docs/images/inference_deployment/whl_demo.jpg', 'label_names': ['hen', 'cock', 'partridge', 'ruffed grouse, partridge, Bonasa umbellus', 'ptarmigan']}]
+```
+
+**Notes:**
+
+* Here, `-o Global.pretrained_model="output/PPHGNet_small/best_model"` specifies the path of the current best weights; to use other weights, just replace the path accordingly.
+
+* By default, the image `docs/images/inference_deployment/whl_demo.jpg` is predicted; you can predict other images by adding the field `-o Infer.infer_imgs=xxx`.
+
+* Top-5 values are output by default; to output Top-k values, specify `-o Infer.PostProcess.topk=k`, where `k` is the value you want, as in the example below.
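+
+For example, predicting a different image with Top-3 output (the image path here is only an illustration):
+
+```shell
+python3 tools/infer.py \
+    -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml \
+    -o Global.pretrained_model=output/PPHGNet_small/best_model \
+    -o Infer.infer_imgs=deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg \
+    -o Infer.PostProcess.topk=3
+```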
+
+
+
+
+
+## 4. Model inference and deployment
+
+
+
+### 4.1 Preparing the inference model
+
+Paddle Inference is PaddlePaddle's native inference library, targeting server and cloud scenarios and providing high-performance inference. Compared with predicting directly from the trained model, Paddle Inference can accelerate prediction with MKLDNN, CUDNN, and TensorRT, achieving better inference performance. For more about the Paddle Inference engine, see the [Paddle Inference tutorial](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html).
+
+When running inference with Paddle Inference, the loaded model must be in the inference format. This tutorial provides two ways to obtain an inference model; if you want results identical to this document, choose [downloading the inference model directly](#4.1.2).
+
+
+
+
+#### 4.1.1 Exporting an inference model from trained weights
+
+Here we provide a script that converts the weights and model; running it produces the corresponding inference model:
+
+```bash
+python3 tools/export_model.py \
+ -c ppcls/configs/ImageNet/PPHGNet/PPHGNet_small.yaml \
+ -o Global.pretrained_model=output/PPHGNet_small/best_model \
+ -o Global.save_inference_dir=deploy/models/PPHGNet_small_infer
+```
+After the script finishes, a `PPHGNet_small_infer` folder is generated under `deploy/models/`, and the `models` folder should have the following file structure:
+
+```
+├── PPHGNet_small_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+
+#### 4.1.2 Downloading an inference model directly
+
+[Section 4.1.1](#4.1.1) describes how to export an inference model; here we also provide an inference model for this scenario that can be downloaded and tried directly.
+
+```
+cd deploy/models
+# download the inference model and extract it
+wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPHGNet_small_infer.tar && tar -xf PPHGNet_small_infer.tar
+```
+
+After extraction, the `models` folder should have the following file structure:
+
+```
+├── PPHGNet_small_infer
+│ ├── inference.pdiparams
+│ ├── inference.pdiparams.info
+│ └── inference.pdmodel
+```
+
+
+
+### 4.2 Inference with the Python prediction engine
+
+
+
+
+#### 4.2.1 Predicting a single image
+
+Return to the `deploy` directory:
+
+```
+cd ../
+```
+
+Run the following command to classify the image `./images/ImageNet/ILSVRC2012_val_00000010.jpeg`.
+
+```shell
+# use the following command to predict with GPU
+python3 python/predict_cls.py -c configs/inference_cls.yaml -o Global.inference_model_dir=models/PPHGNet_small_infer
+# use the following command to predict with CPU
+python3 python/predict_cls.py -c configs/inference_cls.yaml -o Global.inference_model_dir=models/PPHGNet_small_infer -o Global.use_gpu=False
+```
+
+The output is as follows.
+
+```
+ILSVRC2012_val_00000010.jpeg: class id(s): [332, 153, 283, 338, 204], score(s): [0.50, 0.05, 0.02, 0.01, 0.01], label_name(s): ['Angora, Angora rabbit', 'Maltese dog, Maltese terrier, Maltese', 'Persian cat', 'guinea pig, Cavia cobaya', 'Lhasa, Lhasa apso']
+```
+
+
+
+#### 4.2.2 Batch prediction on a folder
+
+To predict the images in a folder, you can modify the `Global.infer_imgs` field in the config file directly, or override the corresponding configuration with the `-o` option as below.
+
+```shell
+# use the following command to predict with GPU; to predict with CPU, append -o Global.use_gpu=False
+python3 python/predict_cls.py -c configs/inference_cls.yaml -o Global.inference_model_dir=models/PPHGNet_small_infer -o Global.infer_imgs=images/ImageNet/
+```
+
+The classification results for all images in the folder are printed in the terminal, as shown below.
+
+```
+ILSVRC2012_val_00000010.jpeg: class id(s): [332, 153, 283, 338, 204], score(s): [0.50, 0.05, 0.02, 0.01, 0.01], label_name(s): ['Angora, Angora rabbit', 'Maltese dog, Maltese terrier, Maltese', 'Persian cat', 'guinea pig, Cavia cobaya', 'Lhasa, Lhasa apso']
+ILSVRC2012_val_00010010.jpeg: class id(s): [626, 622, 531, 487, 633], score(s): [0.68, 0.02, 0.02, 0.02, 0.02], label_name(s): ['lighter, light, igniter, ignitor', 'lens cap, lens cover', 'digital watch', 'cellular telephone, cellular phone, cellphone, cell, mobile phone', "loupe, jeweler's loupe"]
+ILSVRC2012_val_00020010.jpeg: class id(s): [178, 211, 171, 246, 741], score(s): [0.82, 0.00, 0.00, 0.00, 0.00], label_name(s): ['Weimaraner', 'vizsla, Hungarian pointer', 'Italian greyhound', 'Great Dane', 'prayer rug, prayer mat']
+ILSVRC2012_val_00030010.jpeg: class id(s): [80, 83, 136, 23, 93], score(s): [0.84, 0.00, 0.00, 0.00, 0.00], label_name(s): ['black grouse', 'prairie chicken, prairie grouse, prairie fowl', 'European gallinule, Porphyrio porphyrio', 'vulture', 'hornbill']
+```
+
+
+
+
+### 4.3 Inference with the C++ prediction engine
+
+PaddleClas provides an example of inference with the C++ prediction engine; see [server-side C++ prediction](../inference_deployment/cpp_deploy.md) for the corresponding inference deployment. On Windows, see the [Visual Studio 2019 Community CMake build guide](../inference_deployment/cpp_deploy_on_windows.md) to build the prediction library and run model prediction.
+
+
+
+### 4.4 Serving deployment
+
+Paddle Serving provides a high-performance, flexible, and easy-to-use industrial-grade online inference service. It supports multiple protocols such as RESTful, gRPC, and bRPC, and offers inference solutions for a variety of heterogeneous hardware and operating-system environments. For more about Paddle Serving, see the [Paddle Serving repository](https://github.com/PaddlePaddle/Serving).
+
+PaddleClas provides an example of serving deployment based on Paddle Serving; see [model serving deployment](../inference_deployment/paddle_serving_deploy.md) for the corresponding deployment work.
+
+
+
+### 4.5 On-device deployment
+
+Paddle Lite is a high-performance, lightweight, flexible, and easily extensible deep-learning inference framework, positioned to support multiple hardware platforms including mobile, embedded, and server. For more about Paddle Lite, see the [Paddle Lite repository](https://github.com/PaddlePaddle/Paddle-Lite).
+
+PaddleClas provides an example of on-device deployment based on Paddle Lite; see [on-device deployment](../inference_deployment/paddle_lite_deploy.md) for the corresponding deployment work.
+
+
+
+### 4.6 Paddle2ONNX model conversion and prediction
+
+Paddle2ONNX converts PaddlePaddle model formats to the ONNX model format. Through ONNX, Paddle models can be deployed to many inference engines, including TensorRT/OpenVINO/MNN/TNN/NCNN, as well as other inference engines or hardware that support the open ONNX format. For more about Paddle2ONNX, see the [Paddle2ONNX repository](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of converting an inference model to ONNX with Paddle2ONNX and running inference; see [Paddle2ONNX model conversion and prediction](../../../deploy/paddle2onnx/readme.md) for the corresponding deployment work, and the command sketch below.
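+
+A hedged sketch of the conversion command (the paths follow the directory layout above; verify the options against the Paddle2ONNX documentation for your version):
+
+```shell
+paddle2onnx --model_dir=./deploy/models/PPHGNet_small_infer \
+    --model_filename=inference.pdmodel \
+    --params_filename=inference.pdiparams \
+    --save_file=./PPHGNet_small.onnx \
+    --opset_version=10
+```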
-More introductions to PP-HGNet and its performance on downstream tasks are coming soon.
diff --git a/docs/zh_CN/models/PP-LCNet.md b/docs/zh_CN/models/PP-LCNet.md
index 7fea973d4634228fefcccff0e7e856546b3b9652..2c9627cfb195a7b384c262c0b4076bdf15bd3de3 100644
--- a/docs/zh_CN/models/PP-LCNet.md
+++ b/docs/zh_CN/models/PP-LCNet.md
@@ -3,54 +3,76 @@
## Contents
-- [1. Abstract](#1)
-- [2. Introduction](#2)
-- [3. Method](#3)
- - [3.1 A better activation function](#3.1)
- - [3.2 Adding SE modules at appropriate positions](#3.2)
- - [3.3 Adding larger convolution kernels at appropriate positions](#3.3)
- - [3.4 A larger 1x1 convolution layer after GAP](#3.4)
-- [4. Experiments](#4)
- - [4.1 Image classification](#4.1)
- - [4.2 Object detection](#4.2)
- - [4.3 Semantic segmentation](#4.3)
-- [5. Inference speed on V100 GPU](#5)
-- [6. Inference speed on SD855](#6)
-- [7. Summary](#7)
-- [8. Citation](#8)
+- [1. Model introduction](#1)
+  - [1.1 Overview](#1.1)
+  - [1.2 Model details](#1.2)
+    - [1.2.1 A better activation function](#1.2.1)
+    - [1.2.2 Adding SE modules at appropriate positions](#1.2.2)
+    - [1.2.3 Adding larger convolution kernels at appropriate positions](#1.2.3)
+    - [1.2.4 A larger 1x1 convolution layer after GAP](#1.2.4)
+  - [1.3 Experimental results](#1.3)
+  - [1.4 Benchmark](#1.4)
+    - [1.4.1 Inference speed on Intel Xeon Gold 6148](#1.4.1)
+    - [1.4.2 Inference speed on V100 GPU](#1.4.2)
+    - [1.4.3 Inference speed on SD855](#1.4.3)
+- [2. Quick start](#2)
+  - [2.1 Installing paddleclas](#2.1)
+  - [2.2 Prediction](#2.2)
+- [3. Model training, evaluation and prediction](#3)
+  - [3.1 Environment setup](#3.1)
+  - [3.2 Data preparation](#3.2)
+  - [3.3 Model training](#3.3)
+  - [3.4 Model evaluation](#3.4)
+  - [3.5 Model prediction](#3.5)
+- [4. Model inference and deployment](#4)
+  - [4.1 Preparing the inference model](#4.1)
+    - [4.1.1 Exporting an inference model from trained weights](#4.1.1)
+    - [4.1.2 Downloading an inference model directly](#4.1.2)
+  - [4.2 Inference with the Python prediction engine](#4.2)
+    - [4.2.1 Predicting a single image](#4.2.1)
+    - [4.2.2 Batch prediction on a folder](#4.2.2)
+  - [4.3 Inference with the C++ prediction engine](#4.3)
+  - [4.4 Serving deployment](#4.4)
+  - [4.5 On-device deployment](#4.5)
+  - [4.6 Paddle2ONNX model conversion and prediction](#4.6)
+- [5. Citation](#5)
+
+
-## 1. Abstract
+## 1. Model introduction
-In computer vision, the quality of the backbone network directly affects the results of the whole vision task. In previous work, researchers generally took FLOPs or Params as the optimization target, but in real industrial scenarios inference speed is the key measure of a model, and speed and accuracy are hard to achieve together. Considering the many Intel CPU-based applications in industry, this work aims to make the backbone network fit Intel CPUs better, yielding a faster, more accurate lightweight backbone that also improves downstream vision tasks such as object detection and semantic segmentation.
+### 1.1 Overview
-
-## 2. Introduction
+In computer vision, the quality of the backbone network directly affects the results of the whole vision task. In previous work, researchers generally took FLOPs or Params as the optimization target, but in real industrial scenarios inference speed is the key measure of a model, and speed and accuracy are hard to achieve together. Considering the many Intel CPU-based applications in industry, this work aims to make the backbone network fit Intel CPUs better, yielding a faster, more accurate lightweight backbone that also improves downstream vision tasks such as object detection and semantic segmentation.
In recent years many lightweight backbone networks have appeared; in the last two years in particular, NAS-searched networks have emerged one after another. These networks either emphasize advantages in FLOPs or Params, or emphasize inference speed on ARM devices; few are optimized specifically for Intel CPUs, so their inference speed on Intel CPUs is not ideal. Based on this, we designed PP-LCNet, a backbone tailored to Intel CPU devices and their MKLDNN acceleration library. Compared with other lightweight SOTA models, this backbone further improves accuracy without increasing inference time, ultimately surpassing existing SOTA models by a large margin. The comparison with other models is shown in the figure below.

-
-## 3. Method
+
+
+### 1.2 Model details
The overall network structure is shown in the figure below.

Through extensive experiments we found that on Intel CPU devices, especially with the MKLDNN acceleration library enabled, many seemingly inexpensive operations actually increase latency, such as elementwise-add operations and split-concat structures. We therefore built our BaseNet (similar to MobileNetV1) from blocks that are as simple and as fast as possible. On top of BaseNet, we experimentally summarized four strategies that add almost no latency yet improve model accuracy; combining these four strategies gives PP-LCNet. The four strategies are introduced one by one below:
-
-### 3.1 A better activation function
+
+
+#### 1.2.1 A better activation function
Since convolutional neural networks adopted the ReLU activation function, network performance has improved greatly, and in recent years variants of ReLU such as Leaky-ReLU, P-ReLU, and ELU have appeared. In 2017, the Google Brain team obtained the swish activation function by search; it performs very well on lightweight networks. In 2019, the authors of MobileNetV3 further optimized it into H-Swish, which removes the exponential operation, runs faster, and leaves network accuracy almost unaffected. Our experiments also show that this activation function performs excellently on lightweight networks, so PP-LCNet adopts it, as sketched below.
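
As a quick reference, H-Swish replaces the sigmoid in swish with a piecewise-linear approximation (a minimal sketch; Paddle also provides this as `paddle.nn.Hardswish`):

```python
import paddle
import paddle.nn.functional as F

def hard_swish(x):
    # swish is x * sigmoid(x); H-Swish approximates sigmoid(x) with
    # relu6(x + 3) / 6, removing the exponential at almost no accuracy cost
    return x * F.relu6(x + 3.0) / 6.0

x = paddle.linspace(-6.0, 6.0, 13)
print(hard_swish(x))
```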
-
-### 3.2 Adding SE modules at appropriate positions
+
+
+#### 1.2.2 Adding SE modules at appropriate positions
The SE module, proposed by SENet, is a channel attention mechanism that effectively improves model accuracy. On Intel CPUs, however, it also adds considerable latency, so balancing accuracy and speed is the problem we need to solve. Although NAS-searched networks such as MobileNetV3 searched over SE-module positions, no general conclusion was drawn. Our experiments show that the closer the SE module is to the tail of the network, the larger the accuracy gain. The table below shows some of our experimental results:
| SE Location | Top-1 Acc(\%) | Latency(ms) |
-|-------------------|---------------|-------------|
+|:--:|:--:|:--:|
| 1100000000000 | 61.73 | 2.06 |
| 0000001100000 | 62.17 | 2.03 |
| 0000000000011 | 63.14 | 2.05 |
@@ -59,13 +81,14 @@ SE 模块是 SENet 提出的一种通道注意力机制,可以有效提升模
In the end, PP-LCNet places its SE modules as in the third row of the table.
-
-### 3.3 Adding larger convolution kernels at appropriate positions
+
+
+#### 1.2.3 Adding larger convolution kernels at appropriate positions
In the MixNet paper, the authors analyzed the effect of kernel size on model performance and concluded that, within a certain range, larger kernels improve performance, but beyond that range they hurt it; they therefore combined kernels into the split-concat MixConv. Although this combination improves accuracy, it is unfriendly to inference. We experimentally summarized where larger kernels help: similar to the SE-module placement, larger kernels are more effective in the middle and rear of the network. The table below shows the effect of 5x5 kernel placement on accuracy:
| large-kernel Location | Top-1 Acc(\%) | Latency(ms) |
-|-------------------|---------------|-------------|
+|:--:|:--:|:--:|
| 1111111111111 | 63.22 | 2.08 |
| 1111111000000 | 62.70 | 2.07 |
| 0000001111111 | 63.14 | 2.05 |
@@ -73,48 +96,51 @@ SE 模块是 SENet 提出的一种通道注意力机制,可以有效提升模
Experiments show that placing larger kernels only in the middle and rear of the network matches the accuracy of placing them everywhere while achieving faster inference. PP-LCNet finally adopts the scheme in the third row of the table.
-
-### 3.4 A larger 1x1 convolution layer after GAP
+
+
+#### 1.2.4 A larger 1x1 convolution layer after GAP
Since GoogLeNet, the classification layer has usually followed GAP (Global-Average-Pooling) directly, but in lightweight networks this means the features extracted after GAP receive no further fusion or processing. Using a larger 1x1 convolution layer (equivalent to an FC layer) after GAP lets the features be fused first and then classified, which greatly improves accuracy without affecting inference speed. A minimal sketch of such a head follows.
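
A minimal sketch of such a head (the channel numbers are illustrative, not PP-LCNet's exact configuration):

```python
import paddle.nn as nn

class LiteHead(nn.Layer):
    # GAP -> larger 1x1 conv (an FC that fuses features) -> classifier
    def __init__(self, in_ch=512, fuse_ch=1280, class_num=1000):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2D(1)
        self.fuse = nn.Conv2D(in_ch, fuse_ch, kernel_size=1)
        self.act = nn.Hardswish()
        self.fc = nn.Linear(fuse_ch, class_num)

    def forward(self, x):                   # x: [N, in_ch, H, W]
        x = self.act(self.fuse(self.gap(x)))
        return self.fc(x.flatten(1))        # logits: [N, class_num]
```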
With the four improvements above, BaseNet becomes PP-LCNet. The table below further shows the effect of each strategy on the result:
| Activation | SE-block | Large-kernel | last-1x1-conv | Top-1 Acc(\%) | Latency(ms) |
-|------------|----------|--------------|---------------|---------------|-------------|
+|:--:|:--:|:--:|:--:|:--:|:--:|
| 0 | 1 | 1 | 1 | 61.93 | 1.94 |
| 1 | 0 | 1 | 1 | 62.51 | 1.87 |
| 1 | 1 | 0 | 1 | 62.44 | 2.01 |
| 1 | 1 | 1 | 0 | 59.91 | 1.85 |
| 1 | 1 | 1 | 1 | 63.14 | 2.05 |
-
-## 4. Experiments
+
+
+### 1.3 Experimental results
-
-### 4.1 Image classification
+
+
+#### 1.3.1 Image classification
For image classification we chose the ImageNet dataset. Compared with current mainstream lightweight networks, PP-LCNet achieves faster inference at the same accuracy. With Baidu's self-developed SSLD distillation strategy, accuracy improves further, and the Top-1 Acc on ImageNet exceeds 80% at an inference time of about 5 ms on an Intel CPU.
-| Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
-|-------|-----------|----------|---------------|---------------|-------------|
-| PPLCNet_x0_25 | 1.5 | 18 | 51.86 | 75.65 | 1.74 |
-| PPLCNet_x0_35 | 1.6 | 29 | 58.09 | 80.83 | 1.92 |
-| PPLCNet_x0_5 | 1.9 | 47 | 63.14 | 84.66 | 2.05 |
-| PPLCNet_x0_75 | 2.4 | 99 | 68.18 | 88.30 | 2.29 |
-| PPLCNet_x1_0 | 3.0 | 161 | 71.32 | 90.03 | 2.46 |
-| PPLCNet_x1_5 | 4.5 | 342 | 73.71 | 91.53 | 3.19 |
-| PPLCNet_x2_0 | 6.5 | 590 | 75.18 | 92.27 | 4.27 |
-| PPLCNet_x2_5 | 9.0 | 906 | 76.60 | 93.00 | 5.39 |
-| PPLCNet_x0_5_ssld | 1.9 | 47 | 66.10 | 86.46 | 2.05 |
-| PPLCNet_x1_0_ssld | 3.0 | 161 | 74.39 | 92.09 | 2.46 |
-| PPLCNet_x2_5_ssld | 9.0 | 906 | 80.82 | 95.33 | 5.39 |
+| Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) | Pretrained model | Inference model |
+|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+| PPLCNet_x0_25 | 1.5 | 18 | 51.86 | 75.65 | 1.74 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_25_infer.tar) |
+| PPLCNet_x0_35 | 1.6 | 29 | 58.09 | 80.83 | 1.92 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_35_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_35_infer.tar) |
+| PPLCNet_x0_5 | 1.9 | 47 | 63.14 | 84.66 | 2.05 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_5_infer.tar) |
+| PPLCNet_x0_75 | 2.4 | 99 | 68.18 | 88.30 | 2.29 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_75_infer.tar) |
+| PPLCNet_x1_0 | 3.0 | 161 | 71.32 | 90.03 | 2.46 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_0_infer.tar) |
+| PPLCNet_x1_5 | 4.5 | 342 | 73.71 | 91.53 | 3.19 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_5_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_5_infer.tar) |
+| PPLCNet_x2_0 | 6.5 | 590 | 75.18 | 92.27 | 4.27 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_0_infer.tar) |
+| PPLCNet_x2_5 | 9.0 | 906 | 76.60 | 93.00 | 5.39 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_5_infer.tar) |
+| PPLCNet_x0_5_ssld | 1.9 | 47 | 66.10 | 86.46 | 2.05 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_ssld_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x0_5_ssld_infer.tar) |
+| PPLCNet_x1_0_ssld | 3.0 | 161 | 74.39 | 92.09 | 2.46 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_ssld_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x1_0_ssld_infer.tar) |
+| PPLCNet_x2_5_ssld | 9.0 | 906 | 80.82 | 95.33 | 5.39 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_ssld_pretrained.pdparams) | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/PPLCNet_x2_5_ssld_infer.tar) |
Here, `_ssld` denotes a model trained with `SSLD distillation`. For details on `SSLD distillation`, see [SSLD distillation](../advanced_tutorials/knowledge_distillation.md).

Performance comparison with other lightweight networks:
| Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
-|-------|-----------|----------|---------------|---------------|-------------|
+|:--:|:--:|:--:|:--:|:--:|:--:|
| MobileNetV2_x0_25 | 1.5 | 34 | 53.21 | 76.52 | 2.47 |
| MobileNetV3_small_x0_35 | 1.7 | 15 | 53.03 | 76.37 | 3.02 |
| ShuffleNetV2_x0_33 | 0.6 | 24 | 53.73 | 77.05 | 4.30 |
@@ -128,50 +154,75 @@ BaseNet 经过以上四个方面的改进,得到了 PP-LCNet。下表进一步
| MobileNetV3_small_x1_25 | 3.6 | 100 | 70.67 | 89.51 | 3.95 |
| PPLCNet_x1_0 | 3.0 | 161 | 71.32 | 90.03 | 2.46 |
-
-### 4.2 Object detection
+
+
+#### 1.3.2 Object detection
For object detection we chose Baidu's self-developed PicoDet, which targets lightweight object-detection scenarios. The table below compares results on the COCO dataset with PP-LCNet and MobileNetV3 as backbones; PP-LCNet's advantage is very clear in both accuracy and speed.
| Backbone | mAP(%) | Latency(ms) |
-|-------|-----------|----------|
+|:--:|:--:|:--:|
| MobileNetV3_large_x0_35 | 19.2 | 8.1 |
PPLCNet_x0_5