diff --git a/README_ch.md b/README_ch.md
index 6ba1b44ebbe4c3c8c9a7d36d992d60b316ce06c9..a6bbe5ac98a46f7bd8482faad4a24a43925166ec 100644
--- a/README_ch.md
+++ b/README_ch.md
@@ -22,7 +22,7 @@
## 近期更新
-- 🔥️ 发布[PP-ShiTuV2](./docs/zh_CN/PPShiTu/PPShiTuV2_introduction.md),recall1精度提升10个点,覆盖20+识别场景,新增库管理工具,Android Demo全新体验。
+- 🔥️ 发布[PP-ShiTuV2](./docs/zh_CN/PPShiTu/PPShiTuV2_introduction.md),recall1精度提升8个点,覆盖[20+识别场景](./docs/zh_CN/introduction/ppshitu_application_scenarios.md),新增[库管理工具](./deploy/shitu_index_manager/),[Android Demo](./docs/zh_CN/quick_start/quick_start_recognition.md)全新体验。
- 2022.9.4 新增[生鲜产品自主结算范例库](https://aistudio.baidu.com/aistudio/projectdetail/4486158),具体内容可以在AI Studio上体验。
- 2022.6.15 发布[PULC超轻量图像分类实用方案](docs/zh_CN/PULC/PULC_train.md),CPU推理3ms,精度比肩SwinTransformer,覆盖人、车、OCR场景九大常见任务。
- 2022.5.23 新增[人员出入管理范例库](https://aistudio.baidu.com/aistudio/projectdetail/4094475),具体内容可以在 AI Studio 上体验。
@@ -100,6 +100,7 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
- 前沿算法
- [骨干网络和预训练模型库](docs/zh_CN/algorithm_introduction/ImageNet_models.md)
- [度量学习](docs/zh_CN/algorithm_introduction/metric_learning.md)
+ - [ReID](./docs/zh_CN/algorithm_introduction/reid.md)
- [模型压缩](docs/zh_CN/algorithm_introduction/model_prune_quantization.md)
- [模型蒸馏](docs/zh_CN/algorithm_introduction/knowledge_distillation.md)
- [数据增强](docs/zh_CN/advanced_tutorials/DataAugmentation.md)
@@ -110,6 +111,7 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
- [图像分类精选问题](docs/zh_CN/faq_series/faq_selected_30.md)
- [图像分类FAQ第一季](docs/zh_CN/faq_series/faq_2020_s1.md)
- [图像分类FAQ第二季](docs/zh_CN/faq_series/faq_2021_s1.md)
+ - [图像分类FAQ第三季](docs/zh_CN/faq_series/faq_2022_s1.md)
- [社区贡献指南](./docs/zh_CN/advanced_tutorials/how_to_contribute.md)
- [许可证书](#许可证书)
- [贡献代码](#贡献代码)
@@ -123,7 +125,7 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
-PP-ShiTuV2是一个实用的轻量级通用图像识别系统,主要由主体检测、特征学习和向量检索三个模块组成。该系统从骨干网络选择和调整、损失函数的选择、数据增强、学习率变换策略、正则化参数选择、预训练模型使用以及模型裁剪量化多个方面,采用多种策略,对各个模块的模型进行优化,PP-ShiTuV2相比V1而已,Recall1提升10+个点。更多细节请参考[PP-ShiTuV2详细介绍](./docs/zh_CN/PPShiTu/PPShiTuV2_introduction.md)。
+PP-ShiTuV2是一个实用的轻量级通用图像识别系统,主要由主体检测、特征学习和向量检索三个模块组成。该系统从骨干网络选择和调整、损失函数的选择、数据增强、学习率变换策略、正则化参数选择、预训练模型使用以及模型裁剪量化多个方面,采用多种策略,对各个模块的模型进行优化,PP-ShiTuV2相比V1,Recall1提升近8个点。更多细节请参考[PP-ShiTuV2详细介绍](./docs/zh_CN/PPShiTu/PPShiTuV2_introduction.md)。
diff --git a/README_en.md b/README_en.md
index 4bf960e57f2e56972f889c4bcf6a6d715b903477..cddae2891927d593c997547f28f51551a6c48cdd 100644
--- a/README_en.md
+++ b/README_en.md
@@ -7,20 +7,22 @@
PaddleClas is an image classification and image recognition toolset for industry and academia, helping users train better computer vision models and apply them in real scenarios.
-
-
-PULC demo images
+
+
PP-ShiTuV2 demo images
-
+
-
+
-PP-ShiTu demo images
+PULC demo images
+
+
**Recent updates**
+- 🔥️ Release [PP-ShiTuV2](./docs/en/PPShiTu/PPShiTuV2_introduction.md): recall@1 is improved by nearly 8 points, 20+ recognition scenarios are covered, and the new [index management tool](./deploy/shitu_index_manager) and [Android Demo](./docs/en/quick_start/quick_start_recognition_en.md) offer a better experience.
- 2022.6.15 Release [**P**ractical **U**ltra **L**ight-weight image **C**lassification solutions](./docs/en/PULC/PULC_quickstart_en.md). PULC models inference within 3ms on CPU devices, with accuracy on par with SwinTransformer. We also release 9 practical classification models covering pedestrian, vehicle and OCR scenario.
- 2022.4.21 Added the related [code](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files) of the CVPR2022 oral paper [MixFormer](https://arxiv.org/pdf/2204.02557.pdf).
@@ -52,12 +54,31 @@ Based on th algorithms above, PaddleClas release PP-ShiTu image recognition syst
## Quick Start
Quick experience of PP-ShiTu image recognition system:[Link](./docs/en/quick_start/quick_start_recognition_en.md)
+
+
+
PP-ShiTuV2 Android Demo
+
+
Quick experience of **P**ractical **U**ltra **L**ight-weight image **C**lassification models:[Link](docs/en/PULC/PULC_quickstart_en.md)
## Tutorials
- [Install Paddle](./docs/en/installation/install_paddle_en.md)
- [Install PaddleClas Environment](./docs/en/installation/install_paddleclas_en.md)
+- [PP-ShiTuV2 Image Recognition System Introduction](./docs/en/PPShiTu/PPShiTuV2_introduction.md)
+ - [Image Recognition Quick Start](docs/en/quick_start/quick_start_recognition_en.md)
+ - [20+ application scenarios](docs/zh_CN/introduction/ppshitu_application_scenarios.md)
+ - Submodule Introduction and Model Training
+ - [Mainbody Detection](./docs/zh_CN/image_recognition_pipeline/mainbody_detection.md)
+ - [Feature Extraction](./docs/en/image_recognition_pipeline/feature_extraction_en.md)
+ - [Vector Search](./docs/en/image_recognition_pipeline/vector_search_en.md)
+ - [Hash Encoding](./docs/zh_CN/image_recognition_pipeline/deep_hashing.md)
+  - Pipeline Inference and Deployment
+ - [Python Inference](docs/en/inference_deployment/python_deploy_en.md)
+ - [C++ Inference](deploy/cpp_shitu/readme_en.md)
+ - [Serving Deployment](docs/en/inference_deployment/recognition_serving_deploy_en.md)
+ - [Lite Deployment](docs/en/inference_deployment/paddle_lite_deploy_en.md)
+ - [Shitu Gallery Manager Tool](docs/zh_CN/inference_deployment/shitu_gallery_manager.md)
- [Practical Ultra Light-weight image Classification solutions](./docs/en/PULC/PULC_train_en.md)
- [PULC Quick Start](docs/en/PULC/PULC_quickstart_en.md)
- [PULC Model Zoo](docs/en/PULC/PULC_model_list_en.md)
@@ -108,41 +129,55 @@ PULC models inference within 3ms on CPU devices, with accuracy comparable with S
-Image recognition can be divided into three steps:
-- (1)Identify region proposal for target objects through a detection model;
-- (2)Extract features for each region proposal;
-- (3)Search features in the retrieval database and output results;
-
+PP-ShiTuV2 is a practical lightweight general image recognition system, mainly composed of three modules: a mainbody detection model, a feature extraction model and a vector search tool. The system adopts a variety of optimization strategies covering the backbone network, loss function, data augmentation, hyperparameters, pre-trained models, and model pruning and quantization. Compared with V1, the Recall@1 of PP-ShiTuV2 is improved by nearly 8 points. For more details, please refer to [PP-ShiTuV2 introduction](./docs/en/PPShiTu/PPShiTuV2_introduction.md).
For a new unknown category, there is no need to retrain the model, just prepare images of new category, extract features and update retrieval database and the category can be recognised.
-
-## PULC demo images
+
+## PP-ShiTuV2 demo images
+
+- Drinks recognition
+
-
+
-
-## Image Recognition Demo images [more](https://github.com/PaddlePaddle/PaddleClas/tree/release/2.2/docs/images/recognition/more_demo_images)
+
- Product recognition
+
+
- Cartoon character recognition
+
+
- Logo recognition
+
+
+
- Car recognition
+
+
+
+## PULC demo images
+
+
+
+
+
## License
PaddleClas is released under the Apache 2.0 license Apache 2.0 license
diff --git a/docs/en/PPShiTu/PPShiTuV2_introduction.md b/docs/en/PPShiTu/PPShiTuV2_introduction.md
new file mode 100644
index 0000000000000000000000000000000000000000..bae44aea1aa4b3ebaa3f35408786d94ee91dc9d8
--- /dev/null
+++ b/docs/en/PPShiTu/PPShiTuV2_introduction.md
@@ -0,0 +1,250 @@
+## PP-ShiTuV2 Image Recognition System
+
+## Table of contents
+
+- [1. Introduction of PP-ShiTuV2 model and application scenarios](#1-introduction-of-pp-shituv2-model-and-application-scenarios)
+- [2. Quick experience](#2-quick-experience)
+ - [2.1 Quick experience of PP-ShiTu android demo](#21-quick-experience-of-pp-shitu-android-demo)
+ - [2.2 Quick experience of command line code](#22-quick-experience-of-command-line-code)
+- [3. Module introduction and training](#3-module-introduction-and-training)
+ - [3.1 Mainbody detection](#31-mainbody-detection)
+ - [3.2 Feature Extraction](#32-feature-extraction)
+ - [3.3 Vector Search](#33-vector-search)
+- [4. Inference Deployment](#4-inference-deployment)
+ - [4.1 Inference model preparation](#41-inference-model-preparation)
+ - [4.1.1 Export the inference model from pretrained model](#411-export-the-inference-model-from-pretrained-model)
+ - [4.1.2 Download the inference model directly](#412-download-the-inference-model-directly)
+ - [4.2 Test data preparation](#42-test-data-preparation)
+ - [4.3 Inference based on Python inference engine](#43-inference-based-on-python-inference-engine)
+ - [4.3.1 single image prediction](#431-single-image-prediction)
+ - [4.3.2 multi images prediction](#432-multi-images-prediction)
+  - [4.4 Inference based on C++ inference engine](#44-inference-based-on-c-inference-engine)
+  - [4.5 Serving deployment](#45-serving-deployment)
+  - [4.6 Lite deployment](#46-lite-deployment)
+  - [4.7 Paddle2ONNX](#47-paddle2onnx)
+- [References](#references)
+
+## 1. Introduction of PP-ShiTuV2 model and application scenarios
+
+PP-ShiTuV2 is a practical lightweight general image recognition system improved upon PP-ShiTuV1. It is composed of three modules: mainbody detection, feature extraction and vector search. Compared with PP-ShiTuV1, PP-ShiTuV2 has higher recognition accuracy, stronger generalization and similar inference speed*. The improvements mainly lie in the training dataset and in the feature extraction module, with a better backbone network, loss function and training strategy, which significantly improve the retrieval performance of PP-ShiTuV2 in multiple practical application scenarios.
+
+
+
+
+
+The following table lists the main metrics of PP-ShiTuV2 in comparison with PP-ShiTuV1.
+
+| model | storage (mainbody detection + feature extraction) | product recall@1 |
+| :--------- | :------------------------------------------------ | :--------------- |
+| PP-ShiTuV1 | 64(30+34)MB | 66.8% |
+| PP-ShiTuV2 | 49(30+19)MB | 73.8% |
+
+**Note:**
+- For an introduction to the recall and mAP metrics, please refer to [Retrieval Metrics](../algorithm_introduction/reid.md).
+- Latency is measured on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with the MKLDNN acceleration strategy enabled and 10 threads.
+
+## 2. Quick experience
+
+### 2.1 Quick experience of PP-ShiTu android demo
+
+You can download and install the APP by scanning the QR code or by [clicking this link](https://paddle-imagenet-models-name.bj.bcebos.com/demos/PP-ShiTu.apk)
+
+
+
+Then save the following demo pictures to your phone:
+
+
+
+Open the installed APP, click the "**file recognition**" button below, select the above saved image, and you can get the following recognition results:
+
+
+
+### 2.2 Quick experience of command line code
+
+- First follow the commands below to install paddlepaddle and faiss
+ ```shell
+ # If your machine is installed with CUDA9 or CUDA10, please run the following command to install
+ python3.7 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
+
+  # If your machine is CPU-only, please run the following command to install
+  python3.7 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+
+  # install the faiss vector search library
+  python3.7 -m pip install faiss-cpu==1.7.1post2
+ ```
+
+- Then follow the command below to install the paddleclas whl package
+ ```shell
+ # Go to the root directory of PaddleClas
+ cd PaddleClas
+
+ # install paddleclas
+ python3.7 setup.py install
+ ```
+
+- Then execute the following commands to download and decompress the demo data, and finally run the recognition command for a quick start (a Python equivalent is sketched right after this block)
+
+ ```shell
+ # Download and unzip the demo data
+ wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar
+
+ # Execute the identification command
+ paddleclas \
+ --model_name=PP-ShiTuV2 \
+ --infer_imgs=./drink_dataset_v2.0/test_images/100.jpeg \
+ --index_dir=./drink_dataset_v2.0/index/ \
+ --data_file=./drink_dataset_v2.0/gallery/drink_label.txt
+ ```
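+
+  The demo can also be driven from Python instead of the command line. The sketch below assumes the CLI flags above map to same-named keyword arguments of the `paddleclas.PaddleClas` class; the argument names are an assumption, so check them against your installed version.
+
+  ```python
+  from paddleclas import PaddleClas
+
+  # Assumed keyword arguments mirroring the CLI flags above (hypothetical mapping).
+  clas = PaddleClas(model_name="PP-ShiTuV2",
+                    index_dir="./drink_dataset_v2.0/index/",
+                    data_file="./drink_dataset_v2.0/gallery/drink_label.txt")
+
+  # predict() returns a generator of per-image results.
+  for result in clas.predict("./drink_dataset_v2.0/test_images/100.jpeg"):
+      print(result)
+  ```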
+
+## 3. Module introduction and training
+
+### 3.1 Mainbody detection
+
+Mainbody detection is a widely used detection technology. It refers to detecting the coordinate position of one or more objects in the image, and then cropping the corresponding region for identification. Mainbody detection is the preceding step of the recognition task: recognizing the input image after mainbody detection removes complex backgrounds and effectively improves the recognition accuracy.
+
+Taking into account factors such as detection speed, model size and detection accuracy, the lightweight model `PicoDet-LCNet_x2_5` developed by PaddleDetection was finally selected as the mainbody detection model of PP-ShiTuV2.
+
+For details on the dataset, training, evaluation, inference, etc. of the mainbody detection model, please refer to the document: [picodet_lcnet_x2_5_640_mainbody](../../en/image_recognition_pipeline/mainbody_detection_en.md).
+
+### 3.2 Feature Extraction
+
+Feature extraction is a key part of image recognition. It is designed to convert the input image into a fixed-dimensional feature vector for subsequent [vector search](../../en/image_recognition_pipeline/vector_search_en.md) . Taking into account the speed of the feature extraction model, model size, feature extraction performance and other factors, the [`PPLCNetV2_base`](../../en/models/PP-LCNet_en.md) developed by PaddleClas was finally selected as the feature extraction network. Compared with `PPLCNet_x2_5` used by PP-ShiTuV1, `PPLCNetV2_base` basically maintains high classification accuracy and reduces inference time by 40%*.
+
+**Note:** *The inference environment is based on Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz hardware platform, OpenVINO inference platform.
+
+During the experiments, we found that appropriate improvements to `PPLCNetV2_base` achieve higher performance in recognition tasks while keeping the speed basically unchanged, including: removing `ReLU` and `FC` at the end of `PPLCNetV2_base` and changing the stride of the last stage (RepDepthwiseSeparable) to 1.
+
+For details about the dataset, training, evaluation, inference, etc. of the feature extraction model, please refer to the document: [PPLCNetV2_base_ShiTu](../../en/image_recognition_pipeline/feature_extraction_en.md).
+
+### 3.3 Vector Search
+
+Vector search technology is widely used in image recognition. Its main goal is: for a given query vector, calculate the similarity or distance between it and the feature vectors in an established vector database, and return the candidate vectors ranked by similarity.
+
+In the PP-ShiTuV2 recognition system, we use the [Faiss](https://github.com/facebookresearch/faiss) open-source vector search library, which has the advantages of good adaptability, easy installation, a rich set of algorithms, and support for both CPU and GPU.
+
+For the installation and use of the Faiss vector search tool in the PP-ShiTuV2 system, please refer to the document [vector search](../../en/image_recognition_pipeline/vector_search_en.md). A minimal standalone example follows.
+
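+Independent of PaddleClas, the core Faiss workflow is small enough to show in full. The following self-contained sketch uses random vectors in place of real PP-ShiTuV2 features (the 512-d dimension is illustrative) and an exact inner-product index, so that L2-normalized vectors are ranked by cosine similarity:
+
+```python
+import faiss
+import numpy as np
+
+dim = 512  # illustrative feature dimension
+gallery = np.random.rand(1000, dim).astype("float32")
+faiss.normalize_L2(gallery)  # normalize so inner product equals cosine similarity
+
+index = faiss.IndexFlatIP(dim)  # exact inner-product (cosine) search
+index.add(gallery)
+
+query = np.random.rand(1, dim).astype("float32")
+faiss.normalize_L2(query)
+scores, ids = index.search(query, 5)  # top-5 most similar gallery vectors
+print(ids[0], scores[0])
+```
+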
+## 4. Inference Deployment
+
+### 4.1 Inference model preparation
+Paddle Inference is the native inference library of Paddle, providing high-performance inference capabilities on servers and in the cloud. Compared with making predictions directly from the pre-trained model, Paddle Inference can use MKLDNN, CUDNN and TensorRT for acceleration to achieve better inference performance. For more on the Paddle Inference engine, please refer to the [Paddle Inference official website tutorial](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html).
+
+When using Paddle Inference for model inference, the loaded model must be in the inference format. This tutorial provides two methods to obtain the inference model. If you want to get exactly the same result as in this document, please [download the inference model directly](#412-download-the-inference-model-directly).
+
+#### 4.1.1 Export the inference model from pretrained model
+- Please refer to the document [Mainbody Detection Inference Model Preparation](../../en/image_recognition_pipeline/mainbody_detection_en.md), or refer to [4.1.2](#412-download-the-inference-model-directly)
+
+- To export the weights of the feature extraction model, you can refer to the following commands:
+ ```shell
+ python3.7 tools/export_model.py \
+ -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml \
+ -o Global.pretrained_model="https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/PPShiTuV2/general_PPLCNetV2_base_pretrained_v1.0.pdparams" \
+  -o Global.save_inference_dir=deploy/models/GeneralRecognitionV2_PPLCNetV2_base
+ ```
+ After executing the script, the `GeneralRecognitionV2_PPLCNetV2_base` folder will be generated under `deploy/models/` with the following file structure:
+
+ ```log
+ deploy/models/
+ ├── GeneralRecognitionV2_PPLCNetV2_base
+ │ ├── inference.pdiparams
+ │ ├── inference.pdiparams.info
+ │ └── inference.pdmodel
+ ```
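+
+  As a sanity check, the exported files can be loaded directly with the Paddle Inference Python API. This is a minimal sketch (the paths assume the export command above was used as-is):
+
+  ```python
+  from paddle.inference import Config, create_predictor
+
+  # Point Config at the exported model structure and parameter files.
+  config = Config(
+      "deploy/models/GeneralRecognitionV2_PPLCNetV2_base/inference.pdmodel",
+      "deploy/models/GeneralRecognitionV2_PPLCNetV2_base/inference.pdiparams",
+  )
+  config.disable_gpu()  # CPU inference; use enable_use_gpu(...) for GPU
+  predictor = create_predictor(config)
+  print(predictor.get_input_names())
+  ```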
+
+#### 4.1.2 Download the inference model directly
+
+[Section 4.1.1](#411-export-the-inference-model-from-pretrained-model) describes how to export the inference model; here we provide the exported inference models directly. You can download them to the specified location and decompress them with the following commands for a quick experience.
+
+```shell
+cd deploy/models
+
+# Download the mainbody detection inference model and unzip it
+wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar && tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
+
+# Download the feature extraction inference model and unzip it
+wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.tar && tar -xf general_PPLCNetV2_base_pretrained_v1.0_infer.tar
+```
+
+### 4.2 Test data preparation
+
+After preparing the mainbody detection and feature extraction models, you also need to prepare the test data as input. You can run the following commands to download and decompress the test data.
+
+```shell
+# return to ./deploy
+cd ../
+
+# Download the test data drink_dataset_v2.0 and unzip it
+wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar
+```
+
+### 4.3 Inference based on Python inference engine
+
+#### 4.3.1 single image prediction
+
+Execute the following command to recognize the single image `./drink_dataset_v2.0/test_images/100.jpeg`.
+
+```shell
+# Execute the following command to predict with GPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/100.jpeg"
+
+# Execute the following command to predict with CPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/100.jpeg" -o Global.use_gpu=False
+```
+
+The final output is as follows.
+
+```log
+[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
+```
+
+#### 4.3.2 multi images prediction
+
+If you want to predict the images in a folder, you can directly modify the `Global.infer_imgs` field in the configuration file, or modify the corresponding configuration through the `-o` parameter as follows.
+
+```shell
+# Use the command below to predict with GPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images"
+# Use the following command to predict with CPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images" -o Global.use_gpu=False
+```
+
+The terminal will output the recognition results of all images in the folder, as shown below.
+
+```log
+...
+[{'bbox': [0, 0, 600, 600], 'rec_docs': '红牛-强化型', 'rec_scores': 0.74081033}]
+Inference: 120.39852142333984 ms per batch image
+[{'bbox': [0, 0, 514, 436], 'rec_docs': '康师傅矿物质水', 'rec_scores': 0.6918598}]
+Inference: 32.045602798461914 ms per batch image
+[{'bbox': [138, 40, 573, 1198], 'rec_docs': '乐虎功能饮料', 'rec_scores': 0.68214047}]
+Inference: 113.41428756713867 ms per batch image
+[{'bbox': [328, 7, 467, 272], 'rec_docs': '脉动', 'rec_scores': 0.60406065}]
+Inference: 122.04337120056152 ms per batch image
+[{'bbox': [242, 82, 498, 726], 'rec_docs': '味全_每日C', 'rec_scores': 0.5428652}]
+Inference: 37.95266151428223 ms per batch image
+[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
+...
+```
+
+Here `bbox` is the bounding box of the detected mainbody, `rec_docs` is the label in the index database most similar to the detected object, and `rec_scores` is the corresponding similarity. A small post-processing example follows.
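+
+Since each line of the log is a plain Python list of dicts with exactly these keys, post-processing is straightforward. A small illustrative filter (the threshold value is made up):
+
+```python
+# Keep only results whose similarity exceeds a chosen threshold.
+results = [
+    {"bbox": [437, 71, 660, 728], "rec_docs": "元气森林", "rec_scores": 0.7740249},
+    {"bbox": [221, 72, 449, 701], "rec_docs": "元气森林", "rec_scores": 0.6950992},
+    {"bbox": [794, 104, 979, 652], "rec_docs": "元气森林", "rec_scores": 0.6305153},
+]
+THRESHOLD = 0.65  # illustrative cutoff
+for r in (r for r in results if r["rec_scores"] >= THRESHOLD):
+    print(f'{r["rec_docs"]}: score={r["rec_scores"]:.4f}, bbox={r["bbox"]}')
+```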
+
+### 4.4 Inference based on C++ inference engine
+PaddleClas provides an example of inference based on the C++ prediction engine. You can refer to [Server-side C++ prediction](../../../deploy/cpp_shitu/readme_en.md) to complete the corresponding inference deployment. If you are using the Windows platform, you can refer to the [Visual Studio 2019 Community CMake Compilation Guide](../inference_deployment/python_deploy_en.md) to complete the corresponding prediction library compilation and model prediction.
+
+### 4.5 Serving deployment
+Paddle Serving provides high-performance, flexible and easy-to-use industrial-grade online inference services. Paddle Serving supports RESTful, gRPC, bRPC and other protocols, and provides inference solutions in a variety of heterogeneous hardware and operating system environments. For more introduction to Paddle Serving, please refer to [Paddle Serving Code Repository](https://github.com/PaddlePaddle/Serving).
+
+PaddleClas provides an example of model serving deployment based on Paddle Serving. You can refer to [Model serving deployment](../inference_deployment/recognition_serving_deploy_en.md) to complete the corresponding deployment.
+
+### 4.6 Lite deployment
+Paddle Lite is a high-performance, lightweight, flexible and easily extensible deep learning inference framework, positioned to support multiple hardware platforms including mobile, embedded and server. For more introduction to Paddle Lite, please refer to [Paddle Lite Code Repository](https://github.com/PaddlePaddle/Paddle-Lite).
+
+### 4.7 Paddle2ONNX
+Paddle2ONNX supports converting PaddlePaddle model format to ONNX model format. The deployment of Paddle models to various inference engines can be completed through ONNX, including TensorRT/OpenVINO/MNN/TNN/NCNN, and other inference engines or hardware that support the ONNX open source format. For more introduction to Paddle2ONNX, please refer to [Paddle2ONNX Code Repository](https://github.com/PaddlePaddle/Paddle2ONNX).
+
+PaddleClas provides an example of converting an inference model to an ONNX model and making inference prediction based on Paddle2ONNX. You can refer to [Paddle2ONNX Model Conversion and Prediction](../../../deploy/paddle2onnx/readme_en.md) to complete the corresponding deployment work.
+
+## References
+1. Schall, Konstantin, et al. "GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval." International Conference on Multimedia Modeling. Springer, Cham, 2022.
+2. Luo, Hao, et al. "A strong baseline and batch normalization neck for deep person re-identification." IEEE Transactions on Multimedia 22.10 (2019): 2597-2609.
diff --git a/docs/en/image_recognition_pipeline/feature_extraction_en.md b/docs/en/image_recognition_pipeline/feature_extraction_en.md
index f86562a37416c406497cb3723d50dc02332e4e51..26216543733529f749fe1732fbb12585ffdcc874 100644
--- a/docs/en/image_recognition_pipeline/feature_extraction_en.md
+++ b/docs/en/image_recognition_pipeline/feature_extraction_en.md
@@ -12,12 +12,15 @@
- [4.4 Model Inference](#4.4)
-## 1.Introduction
+
+## 1. Abstract
Feature extraction plays a key role in image recognition, which serves to transform the input image into a fixed dimensional feature vector for subsequent [vector search](./vector_search_en.md). Good features boast great similarity preservation, i.e., in the feature space, pairs of images with high similarity should have higher feature similarity (closer together), and pairs of images with low similarity should have less feature similarity (further apart). [Deep Metric Learning](../algorithm_introduction/metric_learning_en.md) is applied to explore how to obtain features with high representational power through deep learning.
-## 2.Network Structure
+
+## 2. Introduction
+
In order to customize the image recognition task flexibly, the whole network is divided into Backbone, Neck, Head, and Loss. The figure below illustrates the overall structure:
@@ -31,152 +34,239 @@ Functions of the above modules :
- **Loss**: Specifies the Loss function to be used. It is designed as a combined form to facilitate the combination of Classification Loss and Pair_wise Loss.
-## 3.General Recognition Models
-
-In PP-Shitu, we have [PP_LCNet_x2_5](../models/PP-LCNet.md) as the backbone network, Linear Layer for Neck, [ArcMargin](../../../ppcls/arch/gears/arcmargin.py) for Head, and CELoss for Loss. See the details in [General Recognition_configuration files](../../../ppcls/configs/GeneralRecognition/). The involved training data covers the following seven public datasets:
-
-| Datasets | Data Size | Class Number | Scenarios | URL |
-| ------------ | --------- | ------------ | ------------------ | ------------------------------------------------------------ |
-| Aliproduct | 2498771 | 50030 | Commodities | [URL](https://retailvisionworkshop.github.io/recognition_challenge_2020/) |
-| GLDv2 | 1580470 | 81313 | Landmarks | [URL](https://github.com/cvdfoundation/google-landmark) |
-| VeRI-Wild | 277797 | 30671 | Vehicle | [URL](https://github.com/PKU-IMRE/VERI-Wild) |
-| LogoDet-3K | 155427 | 3000 | Logo | [URL](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
-| iCartoonFace | 389678 | 5013 | Cartoon Characters | [URL](http://challenge.ai.iqiyi.com/detail?raceId=5def69ace9fcf68aef76a75d) |
-| SOP | 59551 | 11318 | Commodities | [URL](https://cvgl.stanford.edu/projects/lifted_struct/) |
-| Inshop | 25882 | 3997 | Commodities | [URL](http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html) |
-| **Total** | **5M** | **185K** | ---- | ---- |
-
-The results are shown in the table below:
-
-| Model | Aliproduct | VeRI-Wild | LogoDet-3K | iCartoonFace | SOP | Inshop | Latency(ms) |
-| ------------- | ---------- | --------- | ---------- | ------------ | ----- | ------ | ----------- |
-| PP-LCNet-2.5x | 0.839 | 0.888 | 0.861 | 0.841 | 0.793 | 0.892 | 5.0 |
-
-- Evaluation metric: `Recall@1`
-- CPU of the speed evaluation machine: `Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz`.
-- Evaluation conditions for the speed metric: MKLDNN enabled, number of threads set to 10
-- Address of the pre-training model: [General recognition pre-training model](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0.pdparams)
-
-
-## 4.Customized Feature Extraction
-
-Customized feature extraction refers to retraining the feature extraction model based on one's own task. It consists of four main steps: 1) data preparation, 2) model training, 3) model evaluation, and 4) model inference.
-
-
-### 4.1 Data Preparation
-To start with, customize your dataset based on the task (See [Format description](../data_preparation/recognition_dataset_en.md#1) for the dataset format). Before initiating the model training, modify the data-related content in the configuration files, including the address of the dataset and the class number. The corresponding locations in configuration files are shown below:
+## 3. Methods
-```
- Head:
- name: ArcMargin
- embedding_size: 512
- class_num: 185341 #Number of class
-```
+### 3.1 Backbone
-```
-Train:
- dataset:
- name: ImageNetDataset
- image_root: ./dataset/ #The directory where the train dataset is located
- cls_label_path: ./dataset/train_reg_all_data.txt #The address of label file for train dataset
-```
+The Backbone part adopts [PP-LCNetV2_base](../models/PP-LCNetV2.md), which is based on `PPLCNet_V1` and includes several optimizations such as the Rep strategy, PW convolution, Shortcut, improved activation functions and an improved SE module. After these optimizations, the final classification accuracy is similar to `PPLCNet_x2_5`, while the inference latency is reduced by 40%*. During the experiments, we made appropriate improvements to `PPLCNetV2_base` so that it achieves higher performance in recognition tasks while keeping the speed basically unchanged, including: removing `ReLU` and `FC` at the end of `PPLCNetV2_base` and changing the stride of the last stage (RepDepthwiseSeparable) to 1.
-```
- Query:
- dataset:
- name: VeriWild
- image_root: ./dataset/Aliproduct/. #The directory where the query dataset is located
- cls_label_path: ./dataset/Aliproduct/val_list.txt. #The address of label file for query dataset
-```
+**Note:** *The inference environment is based on Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz hardware platform, OpenVINO inference platform.
-```
- Gallery:
- dataset:
- name: VeriWild
- image_root: ./dataset/Aliproduct/ #The directory where the gallery dataset is located
- cls_label_path: ./dataset/Aliproduct/val_list.txt. #The address of label file for gallery dataset
-```
+### 3.2 Neck
-
-### 4.2 Model Training
+We use [BN Neck](../../../ppcls/arch/gears/bnneck.py) to standardize each dimension of the features extracted by the Backbone, reducing the difficulty of optimizing the metric learning loss and the identification loss simultaneously.
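+
+The idea can be sketched in a few lines of Paddle code. This is a schematic illustration of a BN Neck, not the exact PaddleClas implementation (see `ppcls/arch/gears/bnneck.py` for that):
+
+```python
+import paddle
+import paddle.nn as nn
+
+class BNNeck(nn.Layer):
+    """Schematic BN Neck: standardize each feature dimension with BatchNorm.
+
+    The metric-learning loss is typically computed on the pre-BN feature and
+    the classification loss on the post-BN feature.
+    """
+    def __init__(self, feat_dim):
+        super().__init__()
+        # bias is disabled so BN only re-centers and re-scales each dimension
+        self.bn = nn.BatchNorm1D(feat_dim, bias_attr=False)
+
+    def forward(self, feat):
+        return self.bn(feat)
+
+neck = BNNeck(512)
+normed = neck(paddle.randn([8, 512]))  # batch of backbone features
+```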
-- Single machine single card training
+### 3.3 Head
-```
-export CUDA_VISIBLE_DEVICES=0
-python tools/train.py -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
-```
+We use [FC Layer](../../../ppcls/arch/gears/fc.py) as the classification head to convert features into logits for classification loss.
-- Single machine multi card training
+### 3.4 Loss
-```
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python -m paddle.distributed.launch \
- --gpus="0,1,2,3" tools/train.py \
- -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml
-```
+We use [Cross entropy loss](../../../ppcls/loss/celoss.py) and [TripletAngularMarginLoss](../../../ppcls/loss/tripletangularmarginloss.py), and we improved the original TripletLoss (TriHard Loss) by moving the optimization objective from L2 Euclidean space to cosine space and adding hard distance constraints between the anchor and the positive/negative samples, which improves the generalization ability of the model. For the detailed configuration, see [GeneralRecognitionV2_PPLCNetV2_base.yaml](../../../ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L63-77).
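+
+To make the cosine-space idea concrete, here is a schematic sketch of such a loss with hard mining; the margin values are illustrative and the code is a simplification, not the PaddleClas `TripletAngularMarginLoss` implementation:
+
+```python
+import paddle
+import paddle.nn.functional as F
+
+def cosine_triplet_with_margins(feats, labels, margin=0.5, delta_p=0.75, delta_n=0.45):
+    """Cosine-space triplet loss with hard mining (illustrative margins)."""
+    feats = F.normalize(feats, axis=1)                 # move to cosine space
+    sim = paddle.matmul(feats, feats, transpose_y=True)
+    same = (labels.unsqueeze(1) == labels.unsqueeze(0)).astype("float32")
+
+    sp = (sim * same + (1 - same) * 2.0).min(axis=1)   # hardest positive similarity
+    sn = (sim * (1 - same) - same * 2.0).max(axis=1)   # hardest negative similarity
+
+    relative = F.relu(margin + sn - sp)                # ranking constraint
+    absolute = F.relu(delta_p - sp) + F.relu(sn - delta_n)  # hard distance constraints
+    return (relative + absolute).mean()
+```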
-**Note:** The configuration file adopts `online evaluation` by default, if you want to speed up the training and remove `online evaluation`, just add `-o eval_during_train=False` after the above command. After training, the final model files `latest`, `best_model` and the training log file `train.log` will be generated under the directory output. Among them, `best_model` is utilized to store the best model under the current evaluation metrics while`latest` is adopted to store the latest generated model, making it convenient to resume the training from where it was interrupted.
+### 3.5 Data Augmentation
-- Resumption of Training:
+We consider that an object may be rotated to a certain extent and cannot always maintain an upright state in real scenes, so we add an appropriate [random rotation](../../../ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L117) to the data augmentation to improve retrieval performance in real scenes.
-```
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python -m paddle.distributed.launch \
- --gpus="0,1,2,3" tools/train.py \
- -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
- -o Global.checkpoint="output/RecModel/latest"
-```
+
-
-### 4.3 Model Evaluation
+## 4. Experiments
+
+We reasonably expanded and optimized the original training data, and finally used a combination of the following 17 public datasets:
+
+| Dataset | Data Amount | Number of Categories | Scenario | Dataset Address |
+| :--------------------- | :---------: | :------------------: | :-----------: | :-------------------------------------------------------------------------------------: |
+| Aliproduct | 2498771 | 50030 | Commodities | [Address](https://retailvisionworkshop.github.io/recognition_challenge_2020/) |
+| GLDv2 | 1580470 | 81313 | Landmarks | [Address](https://github.com/cvdfoundation/google-landmark) |
+| VeRI-Wild | 277797 | 30671 | Vehicles | [Address](https://github.com/PKU-IMRE/VERI-Wild) |
+| LogoDet-3K | 155427 | 3000 | Logos | [Address](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
+| SOP | 59551 | 11318 | Commodities | [Address](https://cvgl.stanford.edu/projects/lifted_struct/) |
+| Inshop | 25882 | 3997 | Commodities | [Address](http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html) |
+| bird400 | 58388 | 400 | Birds | [Address](https://www.kaggle.com/datasets/gpiosenka/100-bird-species) |
+| 104flows | 12753 | 104 | Flowers | [Address](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/) |
+| Cars | 58315 | 112 | Vehicles | [Address](https://ai.stanford.edu/~jkrause/cars/car_dataset.html) |
+| Fashion Product Images | 44441 | 47 | Products | [Address](https://www.kaggle.com/datasets/paramaggarwal/fashion-product-images-dataset) |
+| flowerrecognition | 24123 | 59 | Flowers | [Address](https://www.kaggle.com/datasets/aymenktari/flowerrecognition) |
+| food-101 | 101000 | 101 | Food | [Address](https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/) |
+| fruits-262 | 225639 | 262 | Fruits | [Address](https://www.kaggle.com/datasets/aelchimminut/fruits262) |
+| inaturalist | 265213 | 1010 | Nature | [Address](https://github.com/visipedia/inat_comp/tree/master/2017) |
+| indoor-scenes | 15588 | 67 | Indoor scenes | [Address](https://www.kaggle.com/datasets/itsahmad/indoor-scenes-cvpr-2019) |
+| Products-10k | 141931 | 9691 | Products | [Address](https://products-10k.github.io/) |
+| CompCars | 16016 | 431 | Vehicles | [Address](http://ai.stanford.edu/~jkrause/cars/car_dataset.html) |
+| **Total** | **6M** | **192K** | - | - |
+
+The final model accuracy metrics are shown in the following table:
+
+| Model | Latency (ms) | Storage (MB) | product* | | Aliproduct | | VeRI-Wild | | LogoDet-3k | | iCartoonFace | | SOP | | Inshop | | gldv2 | | imdb_face | | iNat | | instre | | sketch | | sop | |
+| :--------------------- | :--- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- | :------- | :--- |
+| | | | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP | recall@1 | mAP |
+| PP-ShiTuV1_general_rec | 5.0 | 34 | 65.9 | 54.3 | 83.9 | 83.2 | 88.7 | 60.1 | 86.1 | 73.6 | | 50.4 | 27.9 | 9.5 | 97.6 | 90.3 |
+| PP-ShiTuV2_general_rec | 6.1 | 19 | 73.7 | 61.0 | 84.2 | 83.3 | 87.8 | 68.8 | 88.0 | 63.2 | 53.6 | 27.5 | | 71.4 | 39.3 | 15.6 | 98.3 | 90.9 |
+
+*The product dataset was built to verify the generalization performance of PP-ShiTu; none of its data appears in the training set. It contains 7 major categories (cosmetics, landmarks, wine, watches, cars, sports shoes, beverages) and 250 subcategories. When testing, the labels of the 250 subcategories are used. The sop dataset comes from [GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval](https://arxiv.org/abs/2111.13122) and can be regarded as a subset of the "SOP" dataset.
+* Pre-trained model address: [general_PPLCNetV2_base_pretrained_v1.0.pdparams](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/PPShiTuV2/general_PPLCNetV2_base_pretrained_v1.0.pdparams)
+* The evaluation metrics used are `Recall@1` and `mAP`.
+* The CPU of the speed test machine is `Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz`.
+* The evaluation conditions for the speed metric are: MKLDNN enabled, 10 threads.
+
+
+
+## 5. Custom Feature Extraction
+
+Custom feature extraction refers to retraining the feature extraction model according to your own task.
+
+Based on the `GeneralRecognitionV2_PPLCNetV2_base.yaml` configuration file, the following describes the four main steps: 1) data preparation; 2) model training; 3) model evaluation; 4) model inference
+
+
+
+### 5.1 Data Preparation
+
+First you need to customize your own dataset based on the task. Please refer to [Dataset Format Description](../data_preparation/recognition_dataset_en.md) for the dataset format and file structure.
+
+After the preparation is complete, you need to modify the data-related configuration in the configuration file, mainly including the dataset paths and the number of categories, as shown below:
+
+- Modify the number of classes:
+ ```yaml
+ Head:
+ name: FC
+ embedding_size: *feat_dim
+ class_num: 192612 # This is the number of classes
+ weight_attr:
+ initializer:
+ name: Normal
+ std: 0.001
+ bias_attr: False
+ ```
+- Modify the training dataset configuration:
+ ```yaml
+ Train:
+ dataset:
+ name: ImageNetDataset
+ image_root: ./dataset/ # Here is the directory where the train dataset is located
+ cls_label_path: ./dataset/train_reg_all_data_v2.txt # Here is the path of the label file corresponding to the train dataset
+ relabel: True
+ ```
+- Modify the query data configuration in the evaluation dataset:
+ ```yaml
+ Query:
+ dataset:
+ name: VeriWild
+ image_root: ./dataset/Aliproduct/ # Here is the directory where the query dataset is located
+ cls_label_path: ./dataset/Aliproduct/val_list.txt # Here is the path of the label file corresponding to the query dataset
+ ```
+- Modify the gallery data configuration in the evaluation dataset:
+ ```yaml
+ Gallery:
+ dataset:
+ name: VeriWild
+ image_root: ./dataset/Aliproduct/ # This is the directory where the gallery dataset is located
+ cls_label_path: ./dataset/Aliproduct/val_list.txt # Here is the path of the label file corresponding to the gallery dataset
+ ```
+
+
+
+### 5.2 Model Training
+
+Model training covers both starting training from scratch and resuming training from a checkpoint.
+
+- Single machine and single card training
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0
+ python3.7 tools/train.py \
+ -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml
+ ```
+- Single machine multi-card training
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0,1,2,3
+ python3.7 -m paddle.distributed.launch --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml
+ ```
+**Notice:**
+The online evaluation method is used by default in the configuration file. If you want to speed up the training, you can turn off the online evaluation function, just add `-o Global.eval_during_train=False` after the above scripts.
+
+After training, the final model files `latest.pdparams`, `best_model.pdparams` and the training log file `train.log` will be generated in the output directory. Among them, `best_model` saves the best model under the current evaluation metrics, and `latest` saves the most recently generated model, which makes it convenient to resume training from a checkpoint when the training task is interrupted. Training can be resumed by adding `-o Global.checkpoint="path_to_resume_checkpoint"` to the end of the above training scripts, as shown below.
+
+- Single machine and single card checkpoint recovery training
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0
+ python3.7 tools/train.py \
+ -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml \
+ -o Global.checkpoint="output/RecModel/latest"
+ ```
+- Single-machine multi-card checkpoint recovery training
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0,1,2,3
+ python3.7 -m paddle.distributed.launch --gpus="0,1,2,3" \
+ tools/train.py \
+ -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml \
+ -o Global.checkpoint="output/RecModel/latest"
+ ```
+
+
+
+### 5.3 Model Evaluation
+
+In addition to the online evaluation of the model during training, the evaluation program can also be started manually to obtain the specified model's accuracy metrics.
- Single Card Evaluation
-
-```
-export CUDA_VISIBLE_DEVICES=0
-python tools/eval.py \
--c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
--o Global.pretrained_model="output/RecModel/best_model"
-```
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0
+ python3.7 tools/eval.py \
+ -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml \
+ -o Global.pretrained_model="output/RecModel/best_model"
+ ```
- Multi Card Evaluation
+ ```shell
+ export CUDA_VISIBLE_DEVICES=0,1,2,3
+ python3.7 -m paddle.distributed.launch --gpus="0,1,2,3" \
+ tools/eval.py \
+ -c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml \
+ -o Global.pretrained_model="output/RecModel/best_model"
+ ```
+**Note:** Multi-card evaluation is recommended: with multi-card parallel computing it can quickly obtain the metrics over all the data, which speeds up the evaluation.
-```
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-python -m paddle.distributed.launch \
- --gpus="0,1,2,3" tools/eval.py \
- -c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
- -o Global.pretrained_model="output/RecModel/best_model"
-```
-
-**Recommendation:** It is suggested to employ multi-card evaluation, which can quickly obtain the feature set of the overall dataset using multi-card parallel computing, accelerating the evaluation process.
+
-
-### 4.4 Model Inference
+### 5.4 Model Inference
-Two steps are included in the inference: 1)exporting the inference model; 2)obtaining the feature vector.
+The inference process consists of two steps: 1) exporting the inference model; 2) running inference to obtain feature vectors.
-#### 4.4.1 Export Inference Model
+#### 5.4.1 Export inference model
-```
-python tools/export_model.py \
--c ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml \
+First, you need to convert the `*.pdparams` model file into inference format. The conversion script is as follows.
+```shell
+python3.7 tools/export_model.py \
+-c ./ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml \
-o Global.pretrained_model="output/RecModel/best_model"
```
+The generated inference model is located in the `PaddleClas/inference` directory by default and contains three files: `inference.pdmodel`, `inference.pdiparams` and `inference.pdiparams.info`.
+`inference.pdmodel` stores the structure of the inference model, while `inference.pdiparams` and `inference.pdiparams.info` store its parameters.
-The generated inference models are under the directory `inference`, which comprises three files, namely, `inference.pdmodel`、`inference.pdiparams`、`inference.pdiparams.info`. Among them, `inference.pdmodel` serves to store the structure of inference model while `inference.pdiparams` and `inference.pdiparams.info` are mobilized to store model-related parameters.
+#### 5.4.2 Get feature vector
-#### 4.4.2 Obtain Feature Vector
+Use the inference model converted in the previous step to transform the input image into the corresponding feature vector. The inference script is as follows.
-```
+```shell
cd deploy
-python python/predict_rec.py \
+python3.7 python/predict_rec.py \
-c configs/inference_rec.yaml \
-o Global.rec_inference_model_dir="../inference"
```
+The resulting feature output format is as follows:
+
+```log
+wangzai.jpg: [-7.82453567e-02 2.55877394e-02 -3.66694555e-02 1.34572461e-02
+ 4.39076796e-02 -2.34078392e-02 -9.49947070e-03 1.28221214e-02
+ 5.53947650e-02 1.01355985e-02 -1.06436480e-02 4.97181974e-02
+ -2.21862812e-02 -1.75557341e-02 1.55848479e-02 -3.33278324e-03
+ ...
+ -3.40284109e-02 8.35561901e-02 2.10910216e-02 -3.27066667e-02]
+```
+
+In most cases, just getting the features may not meet your requirements. To go further with the image recognition task, refer to the document [Vector Search](./vector_search_en.md).
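+
+For example, comparing two such feature vectors reduces to a cosine similarity. A numpy sketch with made-up vectors:
+
+```python
+import numpy as np
+
+# Two illustrative 512-d feature vectors as produced by predict_rec.py.
+feat_a = np.random.rand(512).astype("float32")
+feat_b = np.random.rand(512).astype("float32")
+
+cos_sim = float(np.dot(feat_a, feat_b) /
+                (np.linalg.norm(feat_a) * np.linalg.norm(feat_b)))
+print(f"cosine similarity: {cos_sim:.4f}")
+```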
+
+
+
+## 6. Summary
+
+As a key part of image recognition, the feature extraction module has many possible improvements in the network structure and the loss function. Different tasks have their own characteristics, such as person re-identification, commodity recognition and face recognition. According to these characteristics, the academic community has proposed various methods such as PCB, MGN, ArcFace, CircleLoss and TripletLoss, which all focus on the ultimate goal of increasing the inter-class gap while reducing the intra-class gap, so as to make retrieval models robust enough in most scenes.
+
+
-The output format of the obtained features is shown in the figure below:![img](../../images/feature_extraction_output.png)
+## 7. References
-In practical use, however, business operations require more than simply obtaining features. To further perform image recognition by feature retrieval, please refer to the document [vector search](./vector_search_en.md).
+1. [PP-LCNet: A Lightweight CPU Convolutional Neural Network](https://arxiv.org/pdf/2109.15099.pdf)
+2. [Bag of Tricks and A Strong Baseline for Deep Person Re-identification](https://openaccess.thecvf.com/content_CVPRW_2019/papers/TRMTMCT/Luo_Bag_of_Tricks_and_a_Strong_Baseline_for_Deep_Person_CVPRW_2019_paper.pdf)
diff --git a/docs/en/quick_start/quick_start_recognition_en.md b/docs/en/quick_start/quick_start_recognition_en.md
index 61c6f2309770e1b888712cc7919d93c9fcdf26b8..670ad03e80d8dd69ae2f283704ba9e7bd04444a6 100644
--- a/docs/en/quick_start/quick_start_recognition_en.md
+++ b/docs/en/quick_start/quick_start_recognition_en.md
@@ -1,296 +1,395 @@
# Quick Start of Recognition
-This tutorial contains 3 parts: Environment Preparation, Image Recognition Experience, and Unknown Category Image Recognition Experience.
+This document contains 2 parts: the PP-ShiTu android demo quick start and the PP-ShiTu PC demo quick start.
-If the image category already exists in the image index database, then you can take a reference to chapter [Image Recognition Experience](#2),to complete the progress of image recognition;If you wish to recognize unknow category image, which is not included in the index database,you can take a reference to chapter [Unknown Category Image Recognition Experience](#3),to complete the process of creating an index to recognize it。
+If the image category already exists in the image index library, you can directly refer to the [Image recognition experience](#22-image-recognition-experience) chapter to complete the image recognition process; if you want to recognize images of unknown categories, i.e. the image category did not exist in the index library before, you can refer to the [Image of Unknown categories recognition experience](#23-image-of-unknown-categories-recognition-experience) chapter to complete the process of indexing and recognition.
## Catalogue
-* [1. Enviroment Preparation](#1)
-* [2. Image Recognition Experience](#2)
- * [2.1 Download and Unzip the Inference Model and Demo Data](#2.1)
- * [2.2 Product Recognition and Retrieval](#2.2)
- * [2.2.1 Single Image Recognition](#2.2.1)
- * [2.2.2 Folder-based Batch Recognition](#2.2.2)
-* [3. Unknown Category Image Recognition Experience](#3)
- * [3.1 Prepare for the new images and labels](#3.1)
- * [3.2 Build a new Index Library](#3.2)
- * [3.3 Recognize the Unknown Category Images](#3.3)
+- [1. PP-ShiTu android demo for quick start](#1-pp-shitu-android-demo-for-quick-start)
+ - [1.1 Install PP-ShiTu android demo](#11-install-pp-shitu-android-demo)
+ - [1.2 Feature Experience](#12-feature-experience)
+ - [1.2.1 Image Retrieval](#121-image-retrieval)
+ - [1.2.2 Update Index](#122-update-index)
+ - [1.2.3 Save Index](#123-save-index)
+ - [1.2.4 Initialize Index](#124-initialize-index)
+ - [1.2.5 Preview Index](#125-preview-index)
+ - [1.3 Feature Details](#13-feature-details)
+ - [1.3.1 Image Retrieval](#131-image-retrieval)
+ - [1.3.2 Update Index](#132-update-index)
+ - [1.3.3 Save Index](#133-save-index)
+ - [1.3.4 Initialize Index](#134-initialize-index)
+ - [1.3.5 Preview Index](#135-preview-index)
+- [2. PP-ShiTu PC demo for quick start](#2-pp-shitu-pc-demo-for-quick-start)
+ - [2.1 Environment configuration](#21-environment-configuration)
+ - [2.2 Image recognition experience](#22-image-recognition-experience)
+ - [2.2.1 Download and unzip the inference model and demo data](#221-download-and-unzip-the-inference-model-and-demo-data)
+ - [2.2.2 Drink recognition and retrieval](#222-drink-recognition-and-retrieval)
+ - [2.2.2.1 single image recognition](#2221-single-image-recognition)
+ - [2.2.2.2 Folder-based batch recognition](#2222-folder-based-batch-recognition)
+ - [2.3 Image of Unknown categories recognition experience](#23-image-of-unknown-categories-recognition-experience)
+ - [2.3.1 Prepare new data and labels](#231-prepare-new-data-and-labels)
+ - [2.3.2 Create a new index database](#232-create-a-new-index-database)
+ - [2.3.3 Image recognition based on the new index database](#233-image-recognition-based-on-the-new-index-database)
+ - [2.4 List of server recognition models](#24-list-of-server-recognition-models)
+
-
-## 1. Enviroment Preparation
+## 1. PP-ShiTu android demo for quick start
-* Installation:Please take a reference to [Quick Installation ](../installation/)to configure the PaddleClas environment.
+
-* Using the following command to enter Folder `deploy`. All content and commands in this section need to be run in folder `deploy`.
+### 1.1 Install PP-ShiTu android demo
- ```
- cd deploy
- ```
+You can download and install the APP by scanning the QR code or by [clicking this link](https://paddle-imagenet-models-name.bj.bcebos.com/demos/PP-ShiTu.apk)
-
-## 2. Image Recognition Experience
+
-The detection model with the recognition inference model for the 4 directions (Logo, Cartoon Face, Vehicle, Product), the address for downloading the test data and the address of the corresponding configuration file are as follows.
+
-| Models Introduction | Recommended Scenarios | inference Model | Predict Config File | Config File to Build Index Database |
-| ------------ | ------------- | -------- | ------- | -------- |
-| Generic mainbody detection model | General Scenarios |[Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar) | - | - |
-| Logo Recognition Model | Logo Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/logo_rec_ResNet50_Logo3K_v1.0_infer.tar) | [inference_logo.yaml](../../../deploy/configs/inference_logo.yaml) | [build_logo.yaml](../../../deploy/configs/build_logo.yaml) |
-| Cartoon Face Recognition Model| Cartoon Face Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/cartoon_rec_ResNet50_iCartoon_v1.0_infer.tar) | [inference_cartoon.yaml](../../../deploy/configs/inference_cartoon.yaml) | [build_cartoon.yaml](../../../deploy/configs/build_cartoon.yaml) |
-| Vehicle Fine-Grained Classfication Model | Vehicle Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/vehicle_cls_ResNet50_CompCars_v1.0_infer.tar) | [inference_vehicle.yaml](../../../deploy/configs/inference_vehicle.yaml) | [build_vehicle.yaml](../../../deploy/configs/build_vehicle.yaml) |
-| Product Recignition Model | Product Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar) | [inference_product.yaml](../../../deploy/configs/inference_product.yaml) | [build_product.yaml](../../../deploy/configs/build_product.yaml) |
-| Vehicle ReID Model | Vehicle ReID Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/vehicle_reid_ResNet50_VERIWild_v1.0_infer.tar) | - | - |
+### 1.2 Feature Experience
+At present, the PP-ShiTu android demo has basic features such as image retrieval, adding images to the index database, saving the index database, initializing the index database, and previewing the index database. Next, we will introduce how to experience these features.
-| Models Introduction | Recommended Scenarios | inference Model | Predict Config File | Config File to Build Index Database |
-| ------------ | ------------- | -------- | ------- | -------- |
-| Lightweight generic mainbody detection model | General Scenarios |[Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar) | - | - |
-| Lightweight generic recognition model | General Scenarios | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar) | [inference_product.yaml](../../../deploy/configs/inference_product.yaml) | [build_product.yaml](../../../deploy/configs/build_product.yaml) |
+#### 1.2.1 Image Retrieval
+Click the "photo recognition" button below or the "file recognition" button, you can take an image or select an image, then wait a few seconds, main object in the image will be marked and the predicted class and inference time will be shown below the image.
+Take the following image as an example:
-Demo data in this tutorial can be downloaded here: [download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.1.tar).
+
+The retrieval results obtained are visualized as follows:
-**Attention**
-1. If you do not have wget installed on Windows, you can download the model by copying the link into your browser and unzipping it in the appropriate folder; for Linux or macOS users, you can right-click and copy the download link to download it via the `wget` command.
-2. If you want to install `wget` on macOS, you can run the following command.
-3. The predict config file of the lightweight generic recognition model and the config file to build index database are used for the config of product recognition model of server-side. You can modify the path of the model to complete the index building and prediction.
+
-```shell
-# install homebrew
-ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)";
-# install wget
-brew install wget
-```
+#### 1.2.2 Update Index
+Click the "photo upload" button above or the "file upload" button , you can take an image or select an image and enter the class name of the uploaded image (such as `keyboard`), click the "OK" button, then the feature vector and classname corresponding to the image will be added to the index database.
-3. If you want to isntall `wget` on Windows, you can refer to [link](https://www.cnblogs.com/jeshy/p/10518062.html). If you want to install `tar` on Windows, you can refer to [link](https://www.cnblogs.com/chooperman/p/14190107.html).
+#### 1.2.3 Save Index
+Click the "save index" button above , you can save the current index database as `latest`.
+#### 1.2.4 Initialize Index
+Click the "initialize index" button above to initialize the current library to `original`.
-* You can download and unzip the data and models by following the command below
+#### 1.2.5 Preview Index
+Click the "class preview" button to view it in the pop-up window.
-```shell
-mkdir models
-cd models
-# Download and unzip the inference model
-wget {Models download link} && tar -xf {Name of the tar archive}
-cd ..
+
-# Download the demo data and unzip
-wget {Data download link} && tar -xf {Name of the tar archive}
-```
+### 1.3 Feature Details
+
+#### 1.3.1 Image Retrieval
+After the image to be retrieved is selected, mainbody detection is first performed by the detection model to obtain the bounding box of the object in the image; the image is then cropped and fed into the feature extraction model to obtain the corresponding feature vector; the vector is retrieved against the index database, and the final search result is returned and displayed, as sketched below.
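+
+Schematically, the flow looks like the following Python-style pseudocode; the function and object names are hypothetical stand-ins for the demo's internal components:
+
+```python
+def retrieve(image, detector, extractor, index, labels, top_k=5):
+    """Hypothetical sketch of the demo's retrieval flow."""
+    results = []
+    for box in detector.detect(image):           # 1. mainbody detection
+        crop = image.crop(box)                   # 2. crop the detected object
+        feat = extractor.extract(crop)           # 3. feature extraction
+        scores, ids = index.search(feat, top_k)  # 4. vector search in the index
+        results.append((box, [labels[i] for i in ids], scores))
+    return results
+```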
+
+#### 1.3.2 Update Index
+After an image is selected for indexing, mainbody detection is first performed by the detection model to obtain the bounding box of the object in the image. The image is then cropped and fed into the feature extraction model to obtain the corresponding feature vector, which is added to the index database.
+
+#### 1.3.3 Save Index
+Save the current index database under the name `latest` and automatically switch to it. The saving logic is similar to "Save As" in common software: if the current index database is already `latest`, it will be overwritten in place; otherwise the program switches to `latest` after saving.
+
+#### 1.3.4 Initialize Index
+When the index database is initialized, the search index automatically switches to `original.index` and `original.txt`, and `latest.index` and `latest.txt` are deleted automatically (if they exist).
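+
+As a rough shell analogy of this step (this is not the app's actual implementation, just the file-level effect described above):
+
+```shell
+# remove the saved index files, if they exist
+rm -f latest.index latest.txt
+# the app then switches its active index back to original.index / original.txt
+```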
+
+#### 1.3.5 Preview Index
+One can preview the index database according to the instructions in [Function Experience - Preview Index](#125-preview-index).
+
+
+## 2. PP-ShiTu PC demo for quick start
+
+
+
+### 2.1 Environment configuration
+
+* Installation: Please refer to the document [Environment Preparation](../installation/install_paddleclas.md) to configure the PaddleClas operating environment.
+
+* Go to the `deploy` directory. All the content and scripts in this section need to be run under `deploy`; you can enter it with the following command.
+
+ ```shell
+ cd deploy
+ ```
+
-
-### 2.1 Download and Unzip the Inference Model and Demo Data
+### 2.2 Image recognition experience
-Take the product recognition as an example, download the detection model, recognition model and product recognition demo data with the following commands.
+The lightweight general object detection model, the lightweight general recognition model and the corresponding configuration files are listed in the following table.
+
+
+
+| Model Introduction | Recommended Scenarios | Inference Model | Prediction Config File |
+| ------------------------------------------ | --------------------- | ------------------ | ------------------------------------------------------------------------ |
+| Lightweight General Mainbody Detection Model | General Scene | [tar format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar) \| [zip format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.zip) | - |
+| Lightweight General Recognition Model | General Scene | [tar format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.tar) \| [zip format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.zip) | [inference_general.yaml](../../../deploy/configs/inference_general.yaml) |
+
+Note: Since some decompression software has problems decompressing the above `tar` files, it is recommended that users who do not work on the command line download the `zip` files and decompress them instead; the `tar` files can be decompressed with the command `tar -xf xxx.tar`.
+
+The demo data used in this chapter can be downloaded here: [drink_dataset_v2.0.tar (drink data)](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar).
+
+The following takes **drink_dataset_v2.0.tar** as an example to introduce the PP-ShiTu quick start process on the PC. Users can also download and decompress the data of other scenarios to try them: [22 scenarios data download](../../zh_CN/introduction/ppshitu_application_scenarios.md#22-下载解压场景库数据).
+
+If you want to experience the server-side object detection model and the recognition models for each scenario, refer to [2.4 List of server recognition models](#24-list-of-server-recognition-models).
+
+**Notice**
+
+- If `wget` is not installed in the Windows environment, you can install the `wget` and `tar` commands by following the steps below, or you can copy the link into your browser to download the model, then decompress it and place it in the corresponding directory.
+- If `wget` is not installed in the macOS environment, you can run the following commands to install it.
+ ```shell
+ # install homebrew
+ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)";
+ # install wget
+ brew install wget
+ ```
+- If you want to install `wget` in the Windows environment, you can refer to: [link](https://www.cnblogs.com/jeshy/p/10518062.html); if you want to install the `tar` command in the Windows environment, you can refer to: [link](https://www.cnblogs.com/chooperman/p/14190107.html).
+
+
+
+#### 2.2.1 Download and unzip the inference model and demo data
+
+Download the demo dataset and the lightweight mainbody detection and recognition models with the following commands.
```shell
mkdir models
cd models
-# Download the generic detection inference model and unzip it
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar && tar -xf ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar
-# Download and unpack the inference model
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar && tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar
-cd ..
-
-# Download the demo data and unzip it
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.1.tar && tar -xf recognition_demo_data_en_v1.1.tar
+# Download the mainbody detection inference model and unzip it
+wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar && tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
+# Download the feature extraction inference model and unzip it
+wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.tar && tar -xf general_PPLCNetV2_base_pretrained_v1.0_infer.tar
+
+cd ../
+# Download demo data and unzip it
+wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar
```
-Once unpacked, the `recognition_demo_data_v1.1` folder should have the following file structure.
+After decompression, the `drink_dataset_v2.0/` folder is structured as follows:
-```
-├── recognition_demo_data_v1.1
-│ ├── gallery_cartoon
-│ ├── gallery_logo
-│ ├── gallery_product
-│ ├── gallery_vehicle
-│ ├── test_cartoon
-│ ├── test_logo
-│ ├── test_product
-│ └── test_vehicle
+```log
+├── drink_dataset_v2.0/
+│ ├── gallery/
+│ ├── index/
+│ ├── index_all/
+│ └── test_images/
├── ...
```
-here, original images to build index are in folder `gallery_xxx`, test images are in folder `test_xxx`. You can also access specific folder for more details.
+The `gallery` folder stores the original images used to build the index database, the `index` folder is the index database built from those images, and the `test_images` folder stores the test images used for querying.
-The `models` folder should have the following file structure.
+The `models` folder should be structured as follows:
-```
-├── product_ResNet50_vd_aliproduct_v1.0_infer
+```log
+├── general_PPLCNetV2_base_pretrained_v1.0_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
-├── ppyolov2_r50vd_dcn_mainbody_v1.0_infer
+├── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
```
-**Attention**
-If you want to use the lightweight generic recognition model, you need to re-extract the features of the demo data and re-build the index. The way is as follows:
+**Notice**
+
+If the general feature extraction model is changed, the index of the demo data must be rebuilt, as follows:
```shell
-python3.7 python/build_gallery.py -c configs/build_product.yaml -o Global.rec_inference_model_dir=./models/general_PPLCNet_x2_5_lite_v1.0_infer
+python3.7 python/build_gallery.py \
+-c configs/inference_general.yaml \
+-o Global.rec_inference_model_dir=./models/general_PPLCNetV2_base_pretrained_v1.0_infer
```
-
-### 2.2 Product Recognition and Retrieval
-
-Take the product recognition demo as an example to show the recognition and retrieval process (if you wish to try other scenarios of recognition and retrieval, replace the corresponding configuration file after downloading and unzipping the corresponding demo data and model to complete the prediction).
-
-**Note:** `faiss` is used as search library. The installation method is as follows:
+
-```
-pip install faiss-cpu==1.7.1post2
-```
+#### 2.2.2 Drink recognition and retrieval
-If error happens when using `import faiss`, please uninstall `faiss` and reinstall it, especially on `Windows`.
+Take the drink recognition demo as an example to show the recognition and retrieval process.
-
+Note that this section uses `faiss` as the retrieval tool; the installation command is as follows:
-#### 2.2.1 Single Image Recognition
+```shell
+python3.7 -m pip install faiss-cpu==1.7.1post2
+```
-Run the following command to identify and retrieve the image `./recognition_demo_data_v1.1/test_product/daoxiangcunjinzhubing_6.jpg` for recognition and retrieval
+If `faiss` cannot be imported, try reinstalling it, especially on Windows.
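+
+A minimal recovery sequence, assuming the pinned version above:
+
+```shell
+# remove the broken installation
+python3.7 -m pip uninstall -y faiss-cpu
+# reinstall the pinned version
+python3.7 -m pip install faiss-cpu==1.7.1post2
+# verify that the import now works
+python3.7 -c "import faiss"
+```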
-```shell
-# use the following command to predict using GPU.
-python3.7 python/predict_system.py -c configs/inference_product.yaml
-# use the following command to predict using CPU
-python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.use_gpu=False
-```
+
+##### 2.2.2.1 Single image recognition
-The image to be retrieved is shown below.
+Run the following command to recognize the image `./drink_dataset_v2.0/test_images/100.jpeg`.
-![](../../images/recognition/product_demo/query/daoxiangcunjinzhubing_6.jpg)
+The image to be retrieved is shown below.
+![](../../images/recognition/drink_data_demo/test_images/100.jpeg)
-The final output is shown below.
+```shell
+# Use the command below to predict with the GPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml
-```
-[{'bbox': [287, 129, 497, 326], 'rec_docs': 'Daoxaingcun Golden Piggie Cake', 'rec_scores': 0.8309420347213745}, {'bbox': [99, 242, 313, 426], 'rec_docs': 'Daoxaingcun Golden Piggie Cake', 'rec_scores': 0.7245651483535767}]
+# Use the command below to predict with the CPU
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.use_gpu=False
```
+The final output is as follows.
-where bbox indicates the location of the detected object, rec_docs indicates the labels corresponding to the label in the index dabase that are most similar to the detected object, and rec_scores indicates the corresponding confidence.
+```log
+[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
+```
+Where `bbox` represents the location of the detected object, `rec_docs` represents the label in the index database most similar to the detected object, and `rec_scores` represents the corresponding similarity.
-The detection result is also saved in the folder `output`, for this image, the visualization result is as follows.
+The visualized recognition results are saved in the `output` folder by default. For this image, the visualization is shown below.
-![](../../images/recognition/product_demo/result/daoxiangcunjinzhubing_6_en.jpg)
+![](../../images/recognition/drink_data_demo/output/100.jpeg)
+
-
-#### 2.2.2 Folder-based Batch Recognition
+##### 2.2.2.2 Folder-based batch recognition
-If you want to predict the images in the folder, you can directly modify the `Global.infer_imgs` field in the configuration file, or you can also modify the corresponding configuration through the following `-o` parameter.
+If you want to predict multiple images in a folder, you can modify the `Global.infer_imgs` field in the configuration file, or override it with the `-o` parameter as shown below.
```shell
-# using the following command to predict using GPU, you can append `-o Global.use_gpu=False` to predict using CPU.
-python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/"
+# Use the following command to predict with the GPU; to predict with the CPU, append -o Global.use_gpu=False
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/"
```
+The recognition results of all images in the folder will be output in the terminal, as shown below.
-The results on the screen are shown as following.
-
-```
+```log
...
-[{'bbox': [37, 29, 123, 89], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.6163763999938965}, {'bbox': [153, 96, 235, 175], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5279821157455444}]
-[{'bbox': [735, 562, 1133, 851], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5588355660438538}]
-[{'bbox': [124, 50, 230, 129], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.6980369687080383}]
-[{'bbox': [0, 0, 275, 183], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5818190574645996}]
-[{'bbox': [400, 1179, 905, 1537], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.9814301133155823}, {'bbox': [295, 713, 820, 1046], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.9496176242828369}, {'bbox': [153, 236, 694, 614], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.8395382761955261}]
-[{'bbox': [544, 4, 1482, 932], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5143815279006958}]
+[{'bbox': [0, 0, 600, 600], 'rec_docs': '红牛-强化型', 'rec_scores': 0.74081033}]
+Inference: 120.39852142333984 ms per batch image
+[{'bbox': [0, 0, 514, 436], 'rec_docs': '康师傅矿物质水', 'rec_scores': 0.6918598}]
+Inference: 32.045602798461914 ms per batch image
+[{'bbox': [138, 40, 573, 1198], 'rec_docs': '乐虎功能饮料', 'rec_scores': 0.68214047}]
+Inference: 113.41428756713867 ms per batch image
+[{'bbox': [328, 7, 467, 272], 'rec_docs': '脉动', 'rec_scores': 0.60406065}]
+Inference: 122.04337120056152 ms per batch image
+[{'bbox': [242, 82, 498, 726], 'rec_docs': '味全_每日C', 'rec_scores': 0.5428652}]
+Inference: 37.95266151428223 ms per batch image
+[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
...
```
-All the visualization results are also saved in folder `output`.
+Visualizations of recognition results for all images are also saved in the `output` folder.
+
+Furthermore, you can change the path of the recognition inference model by modifying the `Global.rec_inference_model_dir` field, and change the path of the index database by modifying the `IndexProcess.index_dir` field.
+
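+For example, the following sketch (paths taken from the demo above; substitute your own model and index) overrides both fields on the command line:
+
+```shell
+python3.7 python/predict_system.py -c configs/inference_general.yaml \
+-o Global.rec_inference_model_dir="./models/general_PPLCNetV2_base_pretrained_v1.0_infer" \
+-o IndexProcess.index_dir="./drink_dataset_v2.0/index"
+```
+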
-Furthermore, the recognition inference model path can be changed by modifying the `Global.rec_inference_model_dir` field, and the path of the index to the index databass can be changed by modifying the `IndexProcess.index_dir` field.
+### 2.3 Unknown category image recognition experience
+Now let us try to recognize the unseen image `./drink_dataset_v2.0/test_images/mosilian.jpeg`.
-
-## 3. Recognize Images of Unknown Category
+The image to be retrieved is shown below.
-To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows:
+![](../../images/recognition/drink_data_demo/test_images/mosilian.jpeg)
+
+Run the following recognition command:
```shell
-python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/anmuxi.jpg"
+# Use the following command to predict with the GPU; to predict with the CPU, append -o Global.use_gpu=False
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/mosilian.jpeg"
```
-The image to be retrieved is shown below.
+It can be found that the output result is empty.
+
+Since the default index database does not contain information about this unknown category, the recognition result here is empty or incorrect. In this case, we can recognize images of unknown categories by building a new index database.
+
+When the images in the index database cannot cover the scene we actually want to recognize, i.e. when predicting an image of an unknown category, we need to add at least one similar image of that category to the index database. This process does not require re-training the model. Taking `mosilian.jpeg` as an example, just follow the steps below to build a new index database.
+
+
+
+#### 2.3.1 Prepare new data and labels
-![](../../images/recognition/product_demo/query/anmuxi.jpg)
+First, copy the image(s) belonging to the unknown category (except the query image itself) into the folder of original images used by the index database. Here, all the image data has already been placed in the folder `drink_dataset_v2.0/gallery/`.
-The output is empty.
+Then we need to edit the text file that records the image paths and label information. Here, the updated label file has already been placed at `drink_dataset_v2.0/gallery/drink_label_all.txt`. Comparing it with the original `drink_dataset_v2.0/gallery/drink_label.txt` label file, you can see that the index images of the Bright (光明) and Sanyuan (三元) milk series have been added.
-Since the index infomation is not included in the corresponding index databse, the recognition result is empty or not proper. At this time, we can complete the image recognition of unknown categories by constructing a new index database.
+In each line of the file, the first field is the relative path of the image and the second field is the label corresponding to the image, separated by a `\t` character (Note: some editors automatically convert `tab` to `space`, which will cause a file parsing error).
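+
+A minimal sketch of the expected format (the image paths below are hypothetical; the two fields are separated by a tab):
+
+```log
+gallery/mosilian/1.jpg	光明_莫斯利安
+gallery/mosilian/2.jpg	光明_莫斯利安
+```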
-When the index database cannot cover the scenes we actually recognise, i.e. when predicting images of unknown categories, we need to add similar images of the corresponding categories to the index databasey, thus completing the recognition of images of unknown categories ,which does not require retraining.
+
-
-### 3.1 Prepare for the new images and labels
+#### 2.3.2 Create a new index database
-First, you need to copy the images which are similar with the image to retrieval to the original images for the index database. The command is as follows.
+Build a new index database `index_all` with the following command.
```shell
-cp -r ../docs/images/recognition/product_demo/gallery/anmuxi ./recognition_demo_data_/gallery_product/gallery/
+python3.7 python/build_gallery.py -c configs/inference_general.yaml -o IndexProcess.data_file="./drink_dataset_v2.0/gallery/drink_label_all.txt" -o IndexProcess.index_dir="./drink_dataset_v2.0/index_all"
```
-Then you need to create a new label file which records the image path and label information. Use the following command to create a new file based on the original one.
+The new index database is saved in the folder `./drink_dataset_v2.0/index_all`. For specific instructions on the `yaml` configuration file, please refer to the [Vector Search Documentation](../image_recognition_pipeline/vector_search.md).
+
+
+
+#### 2.3.3 Image recognition based on the new index database
+
+To re-recognize the `mosilian.jpeg` image using the new index database, run the following command.
```shell
-# copy the file
-cp recognition_demo_data_v1.1/gallery_product/data_file.txt recognition_demo_data_v1.1/gallery_product/data_file_update.txt
+# Run the following command to predict with the GPU; to predict with the CPU, append -o Global.use_gpu=False
+python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/mosilian.jpeg" -o IndexProcess.index_dir="./drink_dataset_v2.0/index_all"
```
-Then add some new lines into the new label file, which is shown as follows.
+The output is as follows.
-```
-gallery/anmuxi/001.jpg Anmuxi Ambrosial Yogurt
-gallery/anmuxi/002.jpg Anmuxi Ambrosial Yogurt
-gallery/anmuxi/003.jpg Anmuxi Ambrosial Yogurt
-gallery/anmuxi/004.jpg Anmuxi Ambrosial Yogurt
-gallery/anmuxi/005.jpg Anmuxi Ambrosial Yogurt
-gallery/anmuxi/006.jpg Anmuxi Ambrosial Yogurt
+```log
+[{'bbox': [290, 297, 564, 919], 'rec_docs': 'Bright_Mosleyan', 'rec_scores': 0.59137374}]
```
-Each line can be splited into two fields. The first field denotes the relative image path, and the second field denotes its label. The `delimiter` is `tab` here.
+The final recognition result is `光明_莫斯利安` (Bright_Mosleyan), which is correct. The visualization of the recognition result is shown below.
+![](../../images/recognition/drink_data_demo/output/mosilian.jpeg)
-
-### 3.2 Build a new Index Base Library
-Use the following command to build the index to accelerate the retrieval process after recognition.
+
-```shell
-python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess.data_file="./recognition_demo_data_v1.1/gallery_product/data_file_update.txt" -o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
-```
+### 2.4 List of server recognition models
-Finally, the new index information is stored in the folder`./recognition_demo_data_v1.1/gallery_product/index_update`. Use the new index database for the above index.
+At present, we recommend using the lightweight general object detection model and the lightweight general recognition model in [2.2 Image recognition experience](#22-image-recognition-experience) to get better test results. However, if you want to try the server-side general recognition model, the server-side general object detection model and the other scenario-specific recognition models, the test data download paths and the corresponding configuration file paths are as follows.
+| Model Introduction | Recommended Scenarios | Inference Model | Prediction Config File |
+| --------------------------------- | --------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
+| General Mainbody Detection Model | General Scene | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar) | - |
+| Logo Recognition Model | Logo Scene | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/logo_rec_ResNet50_Logo3K_v1.0_infer.tar) | [inference_logo.yaml](../../../deploy/configs/inference_logo.yaml) |
+| Anime Character Recognition Model | Anime Character Scene | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/cartoon_rec_ResNet50_iCartoon_v1.0_infer.tar) | [inference_cartoon.yaml](../../../deploy/configs/inference_cartoon.yaml) |
+| Vehicle Fine-Grained Classification Model | Vehicle Scene | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/vehicle_cls_ResNet50_CompCars_v1.0_infer.tar) | [inference_vehicle.yaml](../../../deploy/configs/inference_vehicle.yaml) |
+| Product Recognition Model | Product Scene | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar) | [inference_product.yaml](../../../deploy/configs/inference_product.yaml) |
+| Vehicle ReID Model | Vehicle ReID Scene | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/vehicle_reid_ResNet50_VERIWild_v1.0_infer.tar) | [inference_vehicle.yaml](../../../deploy/configs/inference_vehicle.yaml) |
-
-### 3.3 Recognize the Unknown Category Images
+The above models can be downloaded to the `deploy/models` folder with the following commands for use in recognition tasks.
+```shell
+cd ./deploy
+mkdir -p models
-To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows.
+cd ./models
+# Download the generic object detection model for server and unzip it
+wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar && tar -xf ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar
+# Download the generic recognition model and unzip it
+wget {recognition model download link} && tar -xf {name of the tar archive}
+```
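+
+For example, to fetch the Logo recognition model from the table above:
+
+```shell
+wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/logo_rec_ResNet50_Logo3K_v1.0_infer.tar && tar -xf logo_rec_ResNet50_Logo3K_v1.0_infer.tar
+```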
+
+Then use the following commands to download the test data for the other recognition scenarios:
```shell
-# using the following command to predict using GPU, you can append `-o Global.use_gpu=False` to predict using CPU.
-python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/anmuxi.jpg" -o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
+# Go back to the deploy directory
+cd ..
+# Download test data and unzip
+wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.1.tar && tar -xf recognition_demo_data_en_v1.1.tar
```
-The output is as follows:
+After decompression, the `recognition_demo_data_v1.1` folder should have the following file structure:
-```
-[{'bbox': [243, 80, 523, 522], 'rec_docs': 'Anmuxi Ambrosial Yogurt', 'rec_scores': 0.5570770502090454}]
+```log
+├── recognition_demo_data_v1.1
+│ ├── gallery_cartoon
+│ ├── gallery_logo
+│ ├── gallery_product
+│ ├── gallery_vehicle
+│ ├── test_cartoon
+│ ├── test_logo
+│ ├── test_product
+│ └── test_vehicle
+├── ...
```
-The final recognition result is `Anmuxi Ambrosial Yogurt`, which is corrrect, the visualization result is as follows.
+After downloading the models and test data according to the above steps, you can rebuild the index database and test the relevant recognition models.
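+
+As a hedged sketch for the product scenario (the label file name `data_file.txt` under `gallery_product` is assumed from the demo data; adjust it if your copy differs):
+
+```shell
+# build a new index with the server-side product recognition model
+python3.7 python/build_gallery.py -c configs/inference_product.yaml \
+-o Global.rec_inference_model_dir="./models/product_ResNet50_vd_aliproduct_v1.0_infer" \
+-o IndexProcess.data_file="./recognition_demo_data_v1.1/gallery_product/data_file.txt" \
+-o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
+# then run recognition on the product test images against the new index
+python3.7 python/predict_system.py -c configs/inference_product.yaml \
+-o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/" \
+-o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
+```
+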
-![](../../images/recognition/product_demo/result/anmuxi_en.jpg)
-
+* For more on object detection, please refer to the [Object Detection Tutorial Document](../image_recognition_pipeline/mainbody_detection.md); for feature extraction, see the [Feature Extraction Tutorial Document](../image_recognition_pipeline/feature_extraction.md); for vector search, see the [Vector Search Tutorial Document](../image_recognition_pipeline/vector_search.md).