update en docs

14997e9e · HydrogenSulfate · 2b446440 · 14997e9e · 14997e9e · 14997e9e

Showing 205 additions and 103 deletions

README_ch.md README_ch.md +2 -1

README_en.md README_en.md +7 -0

docs/en/PPShiTu/PPShiTuV2_introduction.md docs/en/PPShiTu/PPShiTuV2_introduction.md +196 -102

--- a/README_ch.md
+++ b/README_ch.md
@@ -22,7 +22,7 @@

 ## Recent Updates

- 🔥️ Released [PP-ShiTuV2](./docs/zh_CN/PPShiTu/PPShiTuV2_introduction.md), with recall1 accuracy improved by 8 points, covering 20+ recognition scenarios, plus a new [index management tool](./deploy/shitu_index_manager/) and a brand-new [Android Demo](./docs/zh_CN/quick_start/quick_start_recognition.md) experience.
+- 🔥️ Released [PP-ShiTuV2](./docs/zh_CN/PPShiTu/PPShiTuV2_introduction.md), with recall1 accuracy improved by 8 points, covering [20+ recognition scenarios](./docs/zh_CN/introduction/ppshitu_application_scenarios.md), plus a new [index management tool](./deploy/shitu_index_manager/) and a brand-new [Android Demo](./docs/zh_CN/quick_start/quick_start_recognition.md) experience.
 - 2022.9.4 Added the [fresh produce self-checkout example library](https://aistudio.baidu.com/aistudio/projectdetail/4486158); you can try it on AI Studio.
 - 2022.6.15 Released the [PULC ultra-lightweight practical image classification solution](docs/zh_CN/PULC/PULC_train.md): 3 ms CPU inference, accuracy comparable to SwinTransformer, covering nine common tasks across person, vehicle and OCR scenarios.
 - 2022.5.23 Added the [personnel entry and exit management example library](https://aistudio.baidu.com/aistudio/projectdetail/4094475); you can try it on AI Studio.
@@ -100,6 +100,7 @@ PP-ShiTu image recognition quick experience: [click here](./docs/zh_CN/quick_start/quick
 - Advanced algorithms
  - [Backbone networks and pretrained model library](docs/zh_CN/algorithm_introduction/ImageNet_models.md)
  - [Metric learning](docs/zh_CN/algorithm_introduction/metric_learning.md)
+    - [ReID](./docs/zh_CN/algorithm_introduction/reid.md)
  - [Model compression](docs/zh_CN/algorithm_introduction/model_prune_quantization.md)
  - [Model distillation](docs/zh_CN/algorithm_introduction/knowledge_distillation.md)
  - [Data augmentation](docs/zh_CN/advanced_tutorials/DataAugmentation.md)

--- a/README_en.md
+++ b/README_en.md
@@ -54,6 +54,11 @@ Based on th algorithms above, PaddleClas release PP-ShiTu image recognition syst
 ## Quick Start
 Quick experience of PP-ShiTu image recognition system：[Link](./docs/en/quick_start/quick_start_recognition_en.md)

+<div align="center">
+<img src="./docs/images/quick_start/android_demo/PPShiTu_qrcode.png"  width = "40%" />
+<p>PP-ShiTuV2 Android Demo</p>
+</div>
+
 Quick experience of **P**ractical **U**ltra **L**ight-weight image **C**lassification models：[Link](docs/en/PULC/PULC_quickstart_en.md)

 ## Tutorials
@@ -61,6 +66,8 @@ Quick experience of **P**ractical **U**ltra **L**ight-weight image **C**lassific
 - [Install Paddle](./docs/en/installation/install_paddle_en.md)
 - [Install PaddleClas Environment](./docs/en/installation/install_paddleclas_en.md)
 - [PP-ShiTuV2 Image Recognition Systems Introduction](./docs/en/PPShiTu/PPShiTuV2_introduction.md)
+  - [Image Recognition Quick Start](docs/en/quick_start/quick_start_recognition_en.md)
+  - [20+ application scenarios](docs/zh_CN/introduction/ppshitu_application_scenarios.md)
  - Submodule Introduction and Model Training
    - [Mainbody Detection](./docs/zh_CN/image_recognition_pipeline/mainbody_detection.md)
    - [Feature Extraction](./docs/en/image_recognition_pipeline/feature_extraction_en.md)

--- a/docs/en/PPShiTu/PPShiTuV2_introduction.md
+++ b/docs/en/PPShiTu/PPShiTuV2_introduction.md
 ## PP-ShiTuV2 Image Recognition System

-## Content
-
- [PP-ShiTuV2 Introduction](#pp-shituv2-introduction)
-  - [Dataset](#dataset)
-  - [Model Training](#model-training)
-  - [Model Evaluation](#model-evaluation)
-  - [Model Inference](#model-inference)
-  - [Model Deployment](#model-deployment)
- [Module introduction](#module-introduction)
-  - [Mainbody Detection](#mainbody-detection)
-  - [Feature Extraction](#feature-extraction)
-    - [Dataset](#dataset-1)
-    - [Backbone](#backbone)
-    - [Network Structure](#network-structure)
-    - [Data Augmentation](#data-augmentation)
+## Table of contents
+
+- [1. Introduction of PP-ShiTuV2 model and application scenarios](#1-introduction-of-pp-shituv2-model-and-application-scenarios)
+- [2. Quick experience](#2-quick-experience)
+  - [2.1 Quick experience of PP-ShiTu android demo](#21-quick-experience-of-pp-shitu-android-demo)
+  - [2.2 Quick experience of command line code](#22-quick-experience-of-command-line-code)
+- [3 Module introduction and training](#3-module-introduction-and-training)
+  - [3.1 Mainbody detection](#31-mainbody-detection)
+  - [3.2 Feature Extraction](#32-feature-extraction)
+  - [3.3 Vector Search](#33-vector-search)
+- [4. Inference Deployment](#4-inference-deployment)
+  - [4.1 Inference model preparation](#41-inference-model-preparation)
+    - [4.1.1 Export the inference model from pretrained model](#411-export-the-inference-model-from-pretrained-model)
+    - [4.1.2 Download the inference model directly](#412-download-the-inference-model-directly)
+  - [4.2 Test data preparation](#42-test-data-preparation)
+  - [4.3 Inference based on Python inference engine](#43-inference-based-on-python-inference-engine)
+    - [4.3.1 single image prediction](#431-single-image-prediction)
+    - [4.3.2 multi images prediction](#432-multi-images-prediction)
+  - [4.3 Inference based on C++ inference engine](#43-inference-based-on-c-inference-engine)
+  - [4.4 Serving deployment](#44-serving-deployment)
+  - [4.5 Lite deployment](#45-lite-deployment)
+  - [4.6 Paddle2ONNX](#46-paddle2onnx)
 - [references](#references)

-## PP-ShiTuV2 Introduction
+## 1. Introduction of PP-ShiTuV2 model and application scenarios

-PP-ShiTuV2 is a practical lightweight general image recognition system based on PP-ShiTuV1. Compared with PP-ShiTuV1, it has higher recognition accuracy, stronger generalization ability and similar inference speed<sup>*</sup >. The system is mainly optimized for training data set and feature extraction, with a better backbone, loss function and training strategy. The retrieval performance of PP-ShiTuV2 in multiple practical application scenarios is significantly improved.
+PP-ShiTuV2 is a practical lightweight general image recognition system that improves on PP-ShiTuV1. It is composed of three modules: mainbody detection, feature extraction and vector search. Compared with PP-ShiTuV1, PP-ShiTuV2 has higher recognition accuracy, stronger generalization and similar inference speed <sup>*</sup>. The system is mainly optimized in its training dataset and feature extraction, with a better backbone network, loss function and training strategy, which significantly improves the retrieval performance of PP-ShiTuV2 in multiple practical application scenarios.

 <div align="center">
 <img src="../../images/structure.jpg" />
 </div>

-### Dataset
+The following table compares the key metrics of PP-ShiTuV2 with those of PP-ShiTuV1.

-We remove some uncommon datasets add more common datasets in training stage. For more details, please refer to [PP-ShiTuV2 dataset](../image_recognition_pipeline/feature_extraction.md#4-实验部分).
+| model      | storage (mainbody detection + feature extraction) | product recall@1 |
+| :--------- | :------------------------------------------------ | :--------------- |
+| PP-ShiTuV1 | 64 MB (30 MB + 34 MB)                             | 66.8%            |
+| PP-ShiTuV2 | 49 MB (30 MB + 19 MB)                             | 73.8%            |

-The following takes the dataset of [PP-ShiTuV2](../image_recognition_pipeline/feature_extraction.md#4-实验部分) as an example to introduce the training, evaluation and inference process of the PP-ShiTuV2 model.
+**Note:**
+- For an introduction to the recall and mAP metrics, please refer to [Retrieval Metric](../algorithm_introduction/reid.md).
+- Latency is measured on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz, with the MKLDNN acceleration strategy enabled and 10 threads.
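
As a rough illustration of the recall@1 metric reported in the table above, the sketch below counts a query as a hit when its top-ranked gallery item carries the same label as the query. The function name `recall_at_1` and the toy labels are illustrative assumptions and are not part of PaddleClas.

```python
def recall_at_1(query_labels, top1_labels):
    """Fraction of queries whose top-1 retrieved item has the query's label."""
    if len(query_labels) != len(top1_labels):
        raise ValueError("one top-1 result is required per query")
    if not query_labels:
        return 0.0
    hits = sum(q == r for q, r in zip(query_labels, top1_labels))
    return hits / len(query_labels)


# Toy example (hypothetical labels): 3 of 4 queries retrieve the right class.
queries = ["cola", "water", "tea", "juice"]
top1 = ["cola", "water", "milk", "juice"]
print(recall_at_1(queries, top1))
```

On a real benchmark, `top1_labels` would come from searching the gallery index with each query feature.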

-### Model Training
+## 2. Quick experience

-Download the 17 datasets in [PP-ShiTuV2 dataset](../image_recognition_pipeline/feature_extraction.md#4-实验部分) and merge them manually, then generate the annotation text file `train_reg_all_data_v2.txt`, and finally place them in `dataset` directory.
+### 2.1 Quick experience of PP-ShiTu android demo

-The merged 17 datasets structure is as follows:
+You can download and install the APP by scanning the QR code below or by [clicking this link](https://paddle-imagenet-models-name.bj.bcebos.com/demos/PP-ShiTu.apk).

-```python
-dataset/
-├── Aliproduct/ # Aliproduct dataset folder
-├── SOP/ # SOPt dataset folder
-├── ...
-├── Products-10k/ # Products-10k dataset folder
-├── ...
-└── train_reg_all_data_v2.txt # Annotation text file
-```
-The content of the generated `train_reg_all_data_v2.txt` is as follows:
+<div align=center><img src="../../images/quick_start/android_demo/PPShiTu_qrcode.png" height="45%" width="45%"/></div>

-```log
-...
-Aliproduct/train/50029/1766228.jpg 50029
-Aliproduct/train/50029/1764348.jpg 50029
-...
-Products-10k/train/88823.jpg 186440
-Products-10k/train/88824.jpg 186440
-...
-```
+Then save the following demo picture to your phone:

-Then run the following command to train:
+<div align=center><img src="../../images/recognition/drink_data_demo/test_images/nongfu_spring.jpeg" width=30% height=30% /></div>

-```shell
-# Use GPU 0 for single-card training
-export CUDA_VISIBLE_DEVICES=0
-python3.7 tools/train.py \
-c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml
-
-# Use 8 GPUs for distributed training
-export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
-python3.7 -m paddle.distributed.launch tools/train.py \
-c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml
-```
-**Note:** `eval_during_train` will be enabled by default during training. After each `eval_interval` epoch, the model will be evaluated on the data set specified by `Eval` in the configuration file (the default is Aliproduct) and calculated for reference. index.
+Open the installed APP, tap the "**file recognition**" button at the bottom, and select the image you just saved. You will get the following recognition result:
+
+<div align=center><img src="../../images/quick_start/android_demo/android_nongfu_spring.JPG" width=30% height=30%/></div>
+
+### 2.2 Quick experience of command line code

-### Model Evaluation
+- First, run the commands below to install paddlepaddle and faiss:
+  ```shell
+  # If your machine has CUDA 9 or CUDA 10 installed, run the following command
+  python3.7 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple

-Reference [Model Evaluation](../image_recognition_pipeline/feature_extraction_en.md#43-model-evaluation)
+  # If your machine only has a CPU, run the following command
+  python3.7 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

-### Model Inference
+  # install faiss database
+  python3.7 -m pip install faiss-cpu==1.7.1post2
+  ```

-Refer to [Python Model Reasoning](../quick_start/quick_start_recognition.md#22-Image Recognition Experience) and [C++ Model Reasoning](../../../deploy/cpp_shitu/readme_en.md)
+- Then run the commands below to install the paddleclas package:
+  ```shell
+  # Go to the root directory of PaddleClas
+  cd PaddleClas

-### Model Deployment
+  # install paddleclas
+  python3.7 setup.py install
+  ```

-Reference [Model Deployment](../inference_deployment/recognition_serving_deploy_en.md#32-service-deployment-and-request)
+- Finally, run the following commands to download and decompress the demo data and then start image recognition:

-## Module introduction
+  ```shell
+  # Download and unzip the demo data
+  wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar

-### Mainbody Detection
+  # Execute the identification command
+  paddleclas \
+  --model_name=PP-ShiTuV2 \
+  --infer_imgs=./drink_dataset_v2.0/test_images/100.jpeg \
+  --index_dir=./drink_dataset_v2.0/index/ \
+  --data_file=./drink_dataset_v2.0/gallery/drink_label.txt
+  ```

-The main body detection model uses `PicoDet-LCNet_x2_5`, for details refer to: [picodet_lcnet_x2_5_640_mainbody](../image_recognition_pipeline/mainbody_detection.md).
+## 3 Module introduction and training

-### Feature Extraction
+### 3.1 Mainbody detection

-#### Dataset
+Mainbody detection is a widely used detection technique. It detects the coordinates of one or more objects in an image and then crops the corresponding regions for recognition. Mainbody detection is the preprocessing step of the recognition task: recognizing the cropped regions rather than the whole image removes complex backgrounds and effectively improves recognition accuracy.

-On the basis of the training data set used in PP-ShiTuV1, we removed the iCartoonFace data set, and added more widely used data sets, such as bird400, Cars, Products-10k, fruits- 262.
+Taking into account detection speed, model size, detection accuracy and other factors, the lightweight model `PicoDet-LCNet_x2_5` developed by PaddleDetection was selected as the mainbody detection model of PP-ShiTuV2.

-#### Backbone
+For details on the dataset, training, evaluation, inference, etc. of the mainbody detection model, please refer to the document: [picodet_lcnet_x2_5_640_mainbody](../../en/image_recognition_pipeline/mainbody_detection_en.md).
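
The cropping step described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the PaddleClas implementation: the helper `crop_by_bbox` is hypothetical, and the box layout `[x1, y1, x2, y2]` (top-left and bottom-right corners in pixels) is an assumption.

```python
def crop_by_bbox(image, bbox):
    """Crop a row-major image (list of pixel rows) to bbox = [x1, y1, x2, y2].

    Assumption: the box holds pixel coordinates of the top-left and
    bottom-right corners, with the right and bottom edges exclusive.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # Clamp the box to the image so out-of-range detections do not fail.
    x1, y1 = max(0, bbox[0]), max(0, bbox[1])
    x2, y2 = min(width, bbox[2]), min(height, bbox[3])
    return [row[x1:x2] for row in image[y1:y2]]


# Toy 4x5 "image" whose pixel values encode (row, column).
image = [[(r, c) for c in range(5)] for r in range(4)]
crop = crop_by_bbox(image, [1, 1, 4, 3])
print(len(crop), len(crop[0]))  # rows and columns of the cropped region
```

Each cropped region would then be passed to the feature extraction model instead of the full image.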

-We replaced the backbone network from `PPLCNet_x2_5` to [`PPLCNetV2_base`](../models/PP-LCNetV2.md). Compared with `PPLCNet_x2_5`, `PPLCNetV2_base` basically maintains a higher classification accuracy and reduces the 40% of inference time <sup>*</sup>.
+### 3.2 Feature Extraction
+
+Feature extraction is a key part of image recognition. It converts the input image into a fixed-dimensional feature vector for subsequent [vector search](../../en/image_recognition_pipeline/vector_search_en.md). Taking into account the speed, model size and feature extraction performance of the candidate models, [`PPLCNetV2_base`](../../en/models/PP-LCNet_en.md), developed by PaddleClas, was selected as the feature extraction network. Compared with `PPLCNet_x2_5` used by PP-ShiTuV1, `PPLCNetV2_base` largely maintains high classification accuracy while reducing inference time by 40%<sup>*</sup>.

 **Note:** <sup>*</sup>The inference environment is based on Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz hardware platform, OpenVINO inference platform.

-#### Network Structure
+During our experiments, we found that a few modifications to `PPLCNetV2_base` achieve higher performance in recognition tasks while keeping the speed basically unchanged: removing the `ReLU` and `FC` layers at the end of `PPLCNetV2_base`, and changing the stride of the last stage (RepDepthwiseSeparable) to 1.
+
+For details about the dataset, training, evaluation, inference, etc. of the feature extraction model, please refer to the document: [PPLCNetV2_base_ShiTu](../../en/image_recognition_pipeline/feature_extraction_en.md).
+
+### 3.3 Vector Search
+
+Vector search is widely used in image recognition. For a given query vector, its main goal is to compute the similarity or distance to the feature vectors in an established vector database and return the candidate vectors ranked by similarity.
+
+In the PP-ShiTuV2 recognition system, we use the open-source vector search library [Faiss](https://github.com/facebookresearch/faiss), which offers good adaptability, easy installation, a rich set of algorithms, and support for both CPU and GPU.
+
+For the installation and use of the Faiss vector search tool in the PP-ShiTuV2 system, please refer to the document: [vector search](../../en/image_recognition_pipeline/vector_search_en.md).
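
To make the ranking idea concrete, here is a minimal brute-force sketch that scores a query against every gallery vector with cosine similarity and returns the best matches. It deliberately does not use Faiss. The functions `cosine_similarity` and `search`, the labels, and the 3-dimensional vectors are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, gallery, top_k=1):
    """Return the top_k (label, score) pairs, best match first."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in gallery]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical gallery of (label, feature vector) pairs.
gallery = [
    ("drink_a", [1.0, 0.0, 0.0]),
    ("drink_b", [0.0, 1.0, 0.0]),
    ("drink_c", [0.6, 0.8, 0.0]),
]
results = search([0.9, 0.1, 0.0], gallery, top_k=2)
print(results)
```

A dedicated library such as Faiss replaces this linear scan with index structures that stay fast on large galleries.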
+
+## 4. Inference Deployment
+
+### 4.1 Inference model preparation
+Paddle Inference is the native inference library of Paddle. It runs on servers and in the cloud to provide high-performance inference. Compared with making predictions directly from pre-trained models, Paddle Inference can use MKLDNN, CUDNN, and TensorRT to accelerate prediction and achieve better inference performance. For more information about the Paddle Inference engine, please refer to the [Paddle Inference official website tutorial](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html).
+
+When using Paddle Inference, the model that is loaded must be an inference model. This document provides two ways to obtain one. If you want to reproduce the results shown in this document, use the method in [Download the inference model directly](#412-download-the-inference-model-directly).
+
+#### 4.1.1 Export the inference model from pretrained model
+- For the mainbody detection model, please refer to the document [Mainbody Detection Inference Model Preparation](../../en/image_recognition_pipeline/mainbody_detection_en.md), or download it as described in [4.1.2](#412-download-the-inference-model-directly).
+
+- To export the weights of the feature extraction model, you can refer to the following commands:
+  ```shell
+  python3.7 tools/export_model.py \
+  -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml \
+  -o Global.pretrained_model="https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/PPShiTuV2/general_PPLCNetV2_base_pretrained_v1.0.pdparams" \
+  -o Global.save_inference_dir=deploy/models/GeneralRecognitionV2_PPLCNetV2_base
+  ```
+  After executing the script, the `GeneralRecognitionV2_PPLCNetV2_base` folder will be generated under `deploy/models/` with the following file structure:
+
+  ```log
+  deploy/models/
+  ├── GeneralRecognitionV2_PPLCNetV2_base
+  │   ├── inference.pdiparams
+  │   ├── inference.pdiparams.info
+  │   └── inference.pdmodel
+  ```
+
+#### 4.1.2 Download the inference model directly
+
+[Section 4.1.1](#411-export-the-inference-model-from-pretrained-model) describes how to export the inference model. We also provide already-exported inference models, which you can download to the specified location and decompress with the following commands.
+
+```shell
+cd deploy/models
+
+# Download the mainbody detection inference model and unzip it
+wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar && tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
+
+# Download the feature extraction inference model and unzip it
+wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.tar && tar -xf general_PPLCNetV2_base_pretrained_v1.
+```

-We adjust the `PPLCNetV2_base` structure, and added more general and effective optimizations for retrieval tasks such as pedestrian re-detection, landmark retrieval, and face recognition. It mainly includes the following points:
+### 4.2 Test data preparation

-1. `PPLCNetV2_base` structure adjustment: The experiment found that [`ReLU`](../../../ppcls/arch/backbone/legendary_models/pp_lcnet_v2.py#L322) at the end of the network has a great impact on the retrieval performance, [`FC`](../../../ppcls/arch/backbone/legendary_models/pp_lcnet_v2.py#L325) also causes a slight drop in retrieval performance, so we removed `ReLU` and `FC` at the end of BackBone.
+After preparing the mainbody detection and feature extraction models, you also need to prepare the test data as input. You can run the following commands to download and decompress the test data.

-2. `last stride=1`: No downsampling is performed at last stage, so as to increase the semantic information of the final output feature map, without having much more computational cost.
+```shell
+# return to ./deploy
+cd ../
+
+# Download the test data drink_dataset_v2.0 and unzip it
+wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar
+```

-3. `BN Neck`: Add a `BatchNorm1D` layer after `BackBone` to normalize each dimension of the feature vector, bringing faster convergence.
+### 4.3 Inference based on Python inference engine

-    | Model                                            | training data      | recall@1%(mAP%) |
-    | :----------------------------------------------- | :----------------- | :-------------- |
-    | GeneralRecognition_PPLCNet_x2_5                  | PP-ShiTuV1 dataset | 65.9(54.3)      |
-    | GeneralRecognitionV2_PPLCNetV2_base(TripletLoss) | PP-ShiTuV1 dataset | 72.3(60.5)      |
+#### 4.3.1 single image prediction

-4. `TripletAngularMarginLoss`: We improved on the original `TripletLoss` (difficult triplet loss), changed the optimization objective from L2 Euclidean space to cosine space, and added an additional space between anchor and positive/negtive The hard distance constraint makes the training and testing goals closer and improves the generalization ability of the model.
+Then execute the following command to identify the single image `./drink_dataset_v2.0/test_images/100.jpeg`.

-    | Model | training data | recall@1%(mAP%) |
-    | :---- | :------------ |: -------------- |
-    | GeneralRecognitionV2_PPLCNetV2_base(TripletLoss) | PP-ShiTuV2 dataset | 71.9(60.2) |
-    | GeneralRecognitionV2_PPLCNetV2_base(TripletAngularMarginLoss) | PP-ShiTuV2 dataset | 73.7(61.0) |
+```shell
+# Execute the following command to predict with GPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/100.jpeg"
+
+# Execute the following command to predict with CPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/100.jpeg" -o Global.use_gpu=False
+```
+
+The final output is as follows.
+
+```log
+[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
+```
+
+#### 4.3.2 multi images prediction
+
+If you want to predict all the images in a folder, you can either modify the `Global.infer_imgs` field in the configuration file or override it with the `-o` option, as in the commands below.
+
+```shell
+# Use the command below to predict with GPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images"
+# Use the following command to predict with CPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images" -o Global.use_gpu=False
+```
+
+The terminal will output the recognition results of all images in the folder, as shown below.
+
+```log
+...
+[{'bbox': [0, 0, 600, 600], 'rec_docs': '红牛-强化型', 'rec_scores': 0.74081033}]
+Inference: 120.39852142333984 ms per batch image
+[{'bbox': [0, 0, 514, 436], 'rec_docs': '康师傅矿物质水', 'rec_scores': 0.6918598}]
+Inference: 32.045602798461914 ms per batch image
+[{'bbox': [138, 40, 573, 1198], 'rec_docs': '乐虎功能饮料', 'rec_scores': 0.68214047}]
+Inference: 113.41428756713867 ms per batch image
+[{'bbox': [328, 7, 467, 272], 'rec_docs': '脉动', 'rec_scores': 0.60406065}]
+Inference: 122.04337120056152 ms per batch image
+[{'bbox': [242, 82, 498, 726], 'rec_docs': '味全_每日C', 'rec_scores': 0.5428652}]
+Inference: 37.95266151428223 ms per batch image
+[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
+...
+```

-#### Data Augmentation
+Here, `bbox` is the bounding box of the detected mainbody, `rec_docs` is the category in the index database that is most similar to the detected object, and `rec_scores` is the corresponding similarity score.
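
If you capture the terminal output, each result line can be post-processed with a short script. The sketch below assumes that a result line is a valid Python literal, which is how it appears in the log above, and that `bbox` is ordered `[x1, y1, x2, y2]`. The helper `parse_results` and the 0.65 threshold are hypothetical.

```python
import ast

# One result line copied from the log above.
line = (
    "[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, "
    "{'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, "
    "{'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]"
)


def parse_results(text, min_score=0.0):
    """Parse one printed result list and keep entries scoring at least min_score."""
    results = ast.literal_eval(text)
    return [r for r in results if r["rec_scores"] >= min_score]


kept = parse_results(line, min_score=0.65)  # 0.65 is an arbitrary example threshold
for item in kept:
    x1, y1, x2, y2 = item["bbox"]  # assumed corner order: x1, y1, x2, y2
    print(item["rec_docs"], item["rec_scores"], (x2 - x1, y2 - y1))
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text read from a log.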

-The target object may rotate to a certain extent and may not maintain an upright state when the actual camera is shot, so we add [Random Rotation](../../../ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L117) in the data augmentation to make retrieval more robust in real scenes.
+### 4.3 Inference based on C++ inference engine
+PaddleClas provides an example of inference based on the C++ prediction engine; you can refer to [Server-side C++ prediction](../../../deploy/cpp_shitu/readme_en.md) to complete the corresponding inference deployment. If you are using the Windows platform, you can refer to the [Visual Studio 2019 Community CMake Compilation Guide](../inference_deployment/python_deploy_en.md) to compile the prediction library and run model prediction.

-Combining the above strategies, the final experimental results on multiple data sets are as follows:
+### 4.4 Serving deployment
+Paddle Serving provides high-performance, flexible and easy-to-use industrial-grade online inference services. Paddle Serving supports RESTful, gRPC, bRPC and other protocols, and provides inference solutions in a variety of heterogeneous hardware and operating system environments. For more introduction to Paddle Serving, please refer to [Paddle Serving Code Repository](https://github.com/PaddlePaddle/Serving).

-  | Model                               | product<sup>*</sup> |
-  | :---------------------------------- | :------------------ |
-  | -                                   | recall@1%(mAP%)     |
-  | GeneralRecognition_PPLCNet_x2_5     | 65.9(54.3)          |
-  | GeneralRecognitionV2_PPLCNetV2_base | 73.7(61.0)          |
+PaddleClas provides an example of model serving deployment based on Paddle Serving. You can refer to [Model serving deployment](../inference_deployment/recognition_serving_deploy_en.md) to complete the corresponding deployment.

-  | Models                              | Aliproduct      | VeRI-Wild       | LogoDet-3k      | iCartoonFace    | SOP             | Inshop           |
-  | :---------------------------------- | :-------------- | :-------------- | :-------------- | :-------------- | :-------------- | :--------------- |
-  | -                                   | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@ 1%(mAP%) |
-  | GeneralRecognition_PPLCNet_x2_5     | 83.9(83.2)      | 88.7(60.1)      | 86.1(73.6)      | 84.1(72.3)      | 79.7(58.6)      | 89.1(69.4)       |
-  | GeneralRecognitionV2_PPLCNetV2_base | 84.2(83.3)      | 87.8(68.8)      | 88.0(63.2)      | 53.6(27.5)      | 77.6(55.3)      | 90.8(74.3)       |
+### 4.5 Lite deployment
+Paddle Lite is a high-performance, lightweight, flexible and easily extensible deep learning inference framework, positioned to support multiple hardware platforms including mobile, embedded and server. For more introduction to Paddle Lite, please refer to [Paddle Lite Code Repository](https://github.com/PaddlePaddle/Paddle-Lite).

-  | model                               | gldv2           | imdb_face       | iNat            | instre          | sketch          | sop<sup>*</sup>  |
-  | :---------------------------------- | :-------------- | :-------------- | :-------------- | :-------------- | :-------------- | :--------------- |
-  | -                                   | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@ 1%(mAP%) |
-  | GeneralRecognition_PPLCNet_x2_5     | 98.2(91.6)      | 28.8(8.42)      | 12.6(6.1)       | 72.0(50.4)      | 27.9(9.5)       | 97.6(90.3)       |
-  | GeneralRecognitionV2_PPLCNetV2_base | 98.1(90.5)      | 35.9(11.2)      | 38.6(23.9)      | 87.7(71.4)      | 39.3(15.6)      | 98.3(90.9)       |
+### 4.6 Paddle2ONNX
+Paddle2ONNX supports converting PaddlePaddle model format to ONNX model format. The deployment of Paddle models to various inference engines can be completed through ONNX, including TensorRT/OpenVINO/MNN/TNN/NCNN, and other inference engines or hardware that support the ONNX open source format. For more introduction to Paddle2ONNX, please refer to [Paddle2ONNX Code Repository](https://github.com/PaddlePaddle/Paddle2ONNX).

-**Note:** The product dataset is made to verify the generalization performance of PP-ShiTu, and all the data are not present in the training and testing sets. The data contains 7 categories ( cosmetics, landmarks, wine, watches, cars, sports shoes, beverages) and 250 sub-categories. When testing, use the labels of 250 small classes for testing; the sop dataset comes from [GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval](https://arxiv.org/abs/2111.13122), which can be regarded as " SOP" dataset.
+PaddleClas provides an example of converting an inference model to an ONNX model and making inference prediction based on Paddle2ONNX. You can refer to [Paddle2ONNX Model Conversion and Prediction](../../../deploy/paddle2onnx/readme_en.md) to complete the corresponding deployment work.

 ## references
 1. Schall, Konstantin, et al. "GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval." International Conference on Multimedia Modeling. Springer, Cham, 2022.