Commit 7569a626 authored by HydrogenSulfate

update en docs

Parent 3254addf
@@ -22,7 +22,7 @@
## Recent updates
- 🔥️ Release [PP-ShiTuV2](./docs/zh_CN/PPShiTu/PPShiTuV2_introduction.md): recall@1 improved by 8 points, covering 20+ recognition scenarios, with a new [index management tool](./deploy/shitu_index_manager/) and a brand-new [Android Demo](./docs/zh_CN/quick_start/quick_start_recognition.md) experience.
- 2022.9.4 Added the [fresh produce self-checkout demo](https://aistudio.baidu.com/aistudio/projectdetail/4486158), which can be experienced on AI Studio.
- 2022.6.15 Released the [PULC practical ultra-lightweight image classification solution](docs/zh_CN/PULC/PULC_train.md): 3 ms CPU inference with accuracy on par with SwinTransformer, covering nine common tasks in pedestrian, vehicle, and OCR scenarios.
- 2022.5.23 Added the [personnel entry/exit management demo](https://aistudio.baidu.com/aistudio/projectdetail/4094475), which can be experienced on AI Studio.
...
@@ -22,7 +22,7 @@ PULC demo images
**Recent updates**
- 🔥️ Release [PP-ShiTuV2](./docs/en/PPShiTu/PPShiTuV2_introduction.md): recall@1 is improved by nearly 8 points, covering 20+ recognition scenarios, with an [index management tool](./deploy/shitu_index_manager) and an [Android Demo](./docs/en/quick_start/quick_start_recognition_en.md) for a better experience.
- 2022.6.15 Release [**P**ractical **U**ltra **L**ight-weight image **C**lassification solutions](./docs/en/PULC/PULC_quickstart_en.md). PULC models infer within 3 ms on CPU devices, with accuracy on par with SwinTransformer. We also release 9 practical classification models covering pedestrian, vehicle, and OCR scenarios.
- 2022.4.21 Added the related [code](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files) of the CVPR2022 oral paper [MixFormer](https://arxiv.org/pdf/2204.02557.pdf).
...
@@ -113,9 +113,9 @@
3. `BN Neck`: Add a `BatchNorm1D` layer after `BackBone` to normalize each dimension of the feature vector, bringing faster convergence.
| Model                                            | training data      | recall@1%(mAP%) |
| :----------------------------------------------- | :----------------- | :-------------- |
| GeneralRecognition_PPLCNet_x2_5                  | PP-ShiTuV1 dataset | 65.9(54.3)      |
| GeneralRecognitionV2_PPLCNetV2_base(TripletLoss) | PP-ShiTuV1 dataset | 72.3(60.5)      |
4. `TripletAngularMarginLoss`: We improved on the original `TripletLoss` (hard triplet loss) by changing the optimization objective from L2 Euclidean space to cosine space, and by adding an additional hard distance constraint between the anchor and the positive/negative samples, which makes the training and testing objectives more consistent and improves the generalization ability of the model.
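Schematically, a cosine-space triplet objective with an extra hard constraint can be written as below (an illustrative form only, with margin m, thresholds δ_p/δ_n, and weight λ as assumed symbols; not necessarily the exact implementation in PaddleClas):

```latex
% Illustrative cosine-space triplet loss with an additional hard constraint term.
% f_a, f_p, f_n are the normalized features of the anchor, positive, and negative samples.
L = \max\bigl(0,\; m + \cos(f_a, f_n) - \cos(f_a, f_p)\bigr)
  + \lambda \Bigl[ \max\bigl(0,\; \delta_p - \cos(f_a, f_p)\bigr) + \max\bigl(0,\; \cos(f_a, f_n) - \delta_n\bigr) \Bigr]
```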
@@ -127,26 +127,26 @@
#### Data Augmentation
The target object may be rotated to some extent and may not stay upright when photographed by a real camera, so we add [Random Rotation](../../../ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L117) to the data augmentation to make retrieval more robust in real scenes.
Combining the above strategies, the final experimental results on multiple datasets are as follows:
| Model                               | product<sup>*</sup> |
| :---------------------------------- | :------------------ |
| -                                   | recall@1%(mAP%)     |
| GeneralRecognition_PPLCNet_x2_5     | 65.9(54.3)          |
| GeneralRecognitionV2_PPLCNetV2_base | 73.7(61.0)          |
| Models                              | Aliproduct      | VeRI-Wild       | LogoDet-3k      | iCartoonFace    | SOP             | Inshop          |
| :---------------------------------- | :-------------- | :-------------- | :-------------- | :-------------- | :-------------- | :-------------- |
| -                                   | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) |
| GeneralRecognition_PPLCNet_x2_5     | 83.9(83.2)      | 88.7(60.1)      | 86.1(73.6)      | 84.1(72.3)      | 79.7(58.6)      | 89.1(69.4)      |
| GeneralRecognitionV2_PPLCNetV2_base | 84.2(83.3)      | 87.8(68.8)      | 88.0(63.2)      | 53.6(27.5)      | 77.6(55.3)      | 90.8(74.3)      |
| model                               | gldv2           | imdb_face       | iNat            | instre          | sketch          | sop<sup>*</sup> |
| :---------------------------------- | :-------------- | :-------------- | :-------------- | :-------------- | :-------------- | :-------------- |
| -                                   | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) | recall@1%(mAP%) |
| GeneralRecognition_PPLCNet_x2_5     | 98.2(91.6)      | 28.8(8.42)      | 12.6(6.1)       | 72.0(50.4)      | 27.9(9.5)       | 97.6(90.3)      |
| GeneralRecognitionV2_PPLCNetV2_base | 98.1(90.5)      | 35.9(11.2)      | 38.6(23.9)      | 87.7(71.4)      | 39.3(15.6)      | 98.3(90.9)      |
**Note:** The product dataset was built to verify the generalization performance of PP-ShiTu, and none of its data appears in the training or testing sets. It contains 7 categories (cosmetics, landmarks, wine, watches, cars, sports shoes, beverages) and 250 sub-categories. During testing, the labels of the 250 sub-categories are used. The sop dataset comes from [GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval](https://arxiv.org/abs/2111.13122) and can be regarded as the "SOP" dataset.
...
# Quick Start of Recognition
This document contains two parts: the PP-ShiTu Android demo quick start and the PP-ShiTu PC demo quick start.
If the image category already exists in the image index library, you can directly refer to the [Image recognition experience](#22-image-recognition-experience) chapter to complete the image recognition process; if you want to recognize images of unknown classes, that is, categories that did not previously exist in the index library, you can refer to the [Image of Unknown categories recognition experience](#23-image-of-unknown-categories-recognition-experience) chapter to complete the process of indexing and recognition.
## Catalogue
- [1. PP-ShiTu android demo for quick start](#1-pp-shitu-android-demo-for-quick-start)
  - [1.1 Install PP-ShiTu android demo](#11-install-pp-shitu-android-demo)
  - [1.2 Feature Experience](#12-feature-experience)
    - [1.2.1 Image Retrieval](#121-image-retrieval)
    - [1.2.2 Update Index](#122-update-index)
    - [1.2.3 Save Index](#123-save-index)
    - [1.2.4 Initialize Index](#124-initialize-index)
    - [1.2.5 Preview Index](#125-preview-index)
  - [1.3 Feature Details](#13-feature-details)
    - [1.3.1 Image Retrieval](#131-image-retrieval)
    - [1.3.2 Update Index](#132-update-index)
    - [1.3.3 Save Index](#133-save-index)
    - [1.3.4 Initialize Index](#134-initialize-index)
    - [1.3.5 Preview Index](#135-preview-index)
- [2. PP-ShiTu PC demo for quick start](#2-pp-shitu-pc-demo-for-quick-start)
  - [2.1 Environment configuration](#21-environment-configuration)
  - [2.2 Image recognition experience](#22-image-recognition-experience)
    - [2.2.1 Download and unzip the inference model and demo data](#221-download-and-unzip-the-inference-model-and-demo-data)
    - [2.2.2 Drink recognition and retrieval](#222-drink-recognition-and-retrieval)
      - [2.2.2.1 Single image recognition](#2221-single-image-recognition)
      - [2.2.2.2 Folder-based batch recognition](#2222-folder-based-batch-recognition)
  - [2.3 Image of Unknown categories recognition experience](#23-image-of-unknown-categories-recognition-experience)
    - [2.3.1 Prepare new data and labels](#231-prepare-new-data-and-labels)
    - [2.3.2 Create a new index database](#232-create-a-new-index-database)
    - [2.3.3 Image recognition based on the new index database](#233-image-recognition-based-on-the-new-index-database)
  - [2.4 List of server recognition models](#24-list-of-server-recognition-models)
<a name="PP-ShiTu android quick start"></a>
<a name="1"></a> ## 1. PP-ShiTu android demo for quick start
<a name="install"></a>
### 1.1 Install PP-ShiTu android demo
You can download and install the APP by scanning the QR code below or by [clicking this link](https://paddle-imagenet-models-name.bj.bcebos.com/demos/PP-ShiTu.apk):

<div align=center><img src="../../images/quick_start/android_demo/PPShiTu_qrcode.png" height="400" width="400"/></div>
<a name="Feature Experience"></a>
### 1.2 Feature Experience

At present, the PP-ShiTu android demo has basic features such as image retrieval, adding images to the index database, saving the index database, initializing the index database, and previewing the index database. Next, we will introduce how to experience these features.

#### 1.2.1 Image Retrieval
Click the "photo recognition" button below <img src="../../images/quick_start/android_demo/paizhaoshibie_100.png" width="25" height="25"/> or the "file recognition" button below <img src="../../images/quick_start/android_demo/bendishibie_100.png" width="25" height="25"/> to take a photo or select an image. After a few seconds, the main object in the image will be marked, and the predicted class and the inference time will be shown below the image.
Take the following image as an example:
<img src="../../images/recognition/drink_data_demo/test_images/nongfu_spring.jpeg" width="400" height="600"/>
The retrieval results obtained are visualized as follows:
<img src="../../images/quick_start/android_demo/android_nongfu_spring.JPG" width="400" height="800"/>
#### 1.2.2 Update Index
Click the "photo upload" button above <img src="../../images/quick_start/android_demo/paizhaoshangchuan_100.png" width="25" height="25"/> or the "file upload" button above <img src="../../images/quick_start/android_demo/bendishangchuan_100.png" width="25" height="25"/> to take a photo or select an image, enter the class name of the uploaded image (such as `keyboard`), and click the "OK" button; the feature vector and class name of the image will then be added to the index database.

#### 1.2.3 Save Index
Click the "save index" button above <img src="../../images/quick_start/android_demo/baocunxiugai_100.png" width="25" height="25"/>, you can save the current index database as `latest`.
#### 1.2.4 Initialize Index
Click the "initialize index" button above <img src="../../images/quick_start/android_demo/reset_100.png" width="25" height="25"/> to initialize the current library to `original`.
#### 1.2.5 Preview Index
Click the "class preview" button <img src="../../images/quick_start/android_demo/leibichaxun_100.png" width="25" height="25"/> to view the contents of the index database in a pop-up window.
<a name="Feature introduction"></a>
### 1.3 Feature Details

#### 1.3.1 Image Retrieval
After an image is selected for retrieval, mainbody detection is first performed with the detection model to obtain the bounding box of the object in the image. The image is then cropped to this box and fed into the feature extraction model to obtain its feature vector, which is searched in the index database; the final retrieval result is returned and displayed.
#### 1.3.2 Update Index
After an image is selected for uploading, mainbody detection is first performed with the detection model to obtain the bounding box of the object in the image. The image is then cropped to this box and fed into the feature extraction model to obtain its feature vector, which is added to the index database.
#### 1.3.3 Save Index
Saves the current in-app index database under the name `latest` and automatically switches to it. The saving logic is similar to "Save As" in common software: if the current index database is already `latest`, it is overwritten in place; otherwise, the app switches to `latest`.
#### 1.3.4 Initialize Index
Initializing the index database automatically switches the search index to `original.index` and `original.txt`, and automatically deletes `latest.index` and `latest.txt` if they exist.
#### 1.3.5 Preview Index
You can preview the index according to the instructions in [Feature Experience - Preview Index](#125-preview-index).
## 2. PP-ShiTu PC demo for quick start
<a name="Environment Configuration"></a>
### 2.1 Environment configuration
* Installation: Please refer to the document [Environment Preparation](../installation/install_paddleclas.md) to configure the PaddleClas operating environment.
* Go to the `deploy` directory. All content and scripts in this section must be run from the `deploy` directory; you can enter it with the following command.
```shell
cd deploy
```
<a name="Image Recognition Experience"></a>
<a name="2.1"></a> ### 2.2 Image recognition experience
### 2.1 Download and Unzip the Inference Model and Demo Data
Take the product recognition as an example, download the detection model, recognition model and product recognition demo data with the following commands. The lightweight general object detection model, lightweight general recognition model and configuration file are available in following table.
<a name="Lightweight General object Detection Model and Lightweight General Recognition Model"></a>
| Model Introduction                           | Recommended Scenarios | Inference Model | Prediction Config File |
| -------------------------------------------- | --------------------- | --------------- | ---------------------- |
| Lightweight General MainBody Detection Model | General Scene         | [tar format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar) \| [zip format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.zip) | - |
| Lightweight General Recognition Model        | General Scene         | [tar format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.tar) \| [zip format download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.zip) | [inference_general.yaml](../../../deploy/configs/inference_general.yaml) |
Note: Since some decompression software has problems decompressing the above `tar` files, it is recommended that users who do not work on the command line download and decompress the `zip` files instead; `tar` files can be decompressed with the command `tar -xf xxx.tar`.
The demo data for this chapter can be downloaded here: [drink_dataset_v2.0.tar (drink data)](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar).
The following takes **drink_dataset_v2.0.tar** as an example to introduce the PP-ShiTu quick start process on the PC. You can also download and decompress the data of other scenarios to try them: [22 scenarios data download](../../zh_CN/introduction/ppshitu_application_scenarios.md#22-下载解压场景库数据).
If you want to experience the server-side object detection and recognition models for each scenario, you can refer to [2.4 List of server recognition models](#24-list-of-server-recognition-models).
**Notice**
- If `wget` is not installed in the Windows environment, you can install the `wget` and `tar` commands according to the following steps, or you can copy the download link into your browser, download the model, decompress it, and place it in the corresponding directory.
- If `wget` is not installed in the macOS environment, you can run the following commands to install it.
```shell
# install homebrew
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)";
# install wget
brew install wget
```
- To install `wget` in the Windows environment, you can refer to this [link](https://www.cnblogs.com/jeshy/p/10518062.html); to install the `tar` command in the Windows environment, you can refer to this [link](https://www.cnblogs.com/chooperman/p/14190107.html).
<a name="2.2.1"></a>
#### 2.2.1 Download and unzip the inference model and demo data
Download the demo dataset and the lightweight mainbody detection and recognition models. The commands are as follows.
```shell
mkdir models
cd models
# Download the mainbody detection inference model and unzip it
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar && tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
# Download the feature extraction inference model and unzip it
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/PP-ShiTuV2/general_PPLCNetV2_base_pretrained_v1.0_infer.tar && tar -xf general_PPLCNetV2_base_pretrained_v1.0_infer.tar
cd ../

# Download demo data and unzip it
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar
```
After decompression, the `drink_dataset_v2.0/` folder is structured as follows:
```log
├── drink_dataset_v2.0/
│   ├── gallery/
│   ├── index/
│   ├── index_all/
│   └── test_images/
├── ...
```
The `gallery` folder stores the original images used to build the index database, `index` is the index database built from those images, and the `test_images` folder stores the images used for querying.
The `models` folder should be structured as follows:
```log
├── general_PPLCNetV2_base_pretrained_v1.0_infer
│   ├── inference.pdiparams
│   ├── inference.pdiparams.info
│   └── inference.pdmodel
├── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer
│   ├── inference.pdiparams
│   ├── inference.pdiparams.info
│   └── inference.pdmodel
```
**Notice**

If the general feature extraction model is changed, the index for the demo data must be rebuilt as follows:

```shell
python3.7 python/build_gallery.py \
-c configs/inference_general.yaml \
-o Global.rec_inference_model_dir=./models/general_PPLCNetV2_base_pretrained_v1.0_infer
```
<a name="2.2"></a> <a name="Drink Recognition and Retrieval"></a>
### 2.2 Product Recognition and Retrieval
Take the product recognition demo as an example to show the recognition and retrieval process (if you wish to try other scenarios of recognition and retrieval, replace the corresponding configuration file after downloading and unzipping the corresponding demo data and model to complete the prediction).
**Note:** `faiss` is used as search library. The installation method is as follows:
``` #### 2.2.2 Drink recognition and retrieval
pip install faiss-cpu==1.7.1post2
```
If error happens when using `import faiss`, please uninstall `faiss` and reinstall it, especially on `Windows`. Take the drink recognition demo as an example to show the recognition and retrieval process.
<a name="2.2.1"></a> Note that this section will uses `faiss` as the retrieval tool, and the installation script is as follows:
#### 2.2.1 Single Image Recognition ```python
python3.7 -m pip install faiss-cpu==1.7.1post2
```
Run the following command to identify and retrieve the image `./recognition_demo_data_v1.1/test_product/daoxiangcunjinzhubing_6.jpg` for recognition and retrieval If `faiss` cannot be importted, try reinstall it, especially for windows users.
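For example, a clean reinstall of the pinned version shown above usually resolves the import error (a minimal sketch using standard `pip` commands):

```shell
# Uninstall the existing faiss-cpu package, then reinstall the pinned version
python3.7 -m pip uninstall -y faiss-cpu
python3.7 -m pip install faiss-cpu==1.7.1post2
```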
<a name="single image recognition"></a>
##### 2.2.2.1 Single image recognition

Run the following command to recognize the image `./drink_dataset_v2.0/test_images/100.jpeg`.

The image to be retrieved is shown below.

![](../../images/recognition/drink_data_demo/test_images/100.jpeg)
```shell
# Use the following command to predict with the GPU
python3.7 python/predict_system.py -c configs/inference_general.yaml

# Use the following command to predict with the CPU
python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.use_gpu=False
```
The final output is as follows.
```log
[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
```
Where `bbox` represents the location of the detected object, `rec_docs` represents the most similar category to the detection box in the index database, and `rec_scores` represents the corresponding similarity.
The visualization results of the recognition are saved in the `output` folder by default. For this image, the visualization is shown below.

![](../../images/recognition/drink_data_demo/output/100.jpeg)
<a name="Folder-based batch recognition"></a>
<a name="2.2.2"></a> ##### 2.2.2.2 Folder-based batch recognition
#### 2.2.2 Folder-based Batch Recognition
If you want to predict the images in the folder, you can directly modify the `Global.infer_imgs` field in the configuration file, or you can also modify the corresponding configuration through the following `-o` parameter. If you want to use multi images in the folder for prediction, you can modify the `Global.infer_imgs` field in the configuration file, or you can modify the corresponding configuration through the `-o` parameter below.
```shell
# Predict with the GPU; to predict with the CPU, append -o Global.use_gpu=False
python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/"
```
The recognition results of all images in the folder will be output in the terminal, as shown below.
```log
...
[{'bbox': [0, 0, 600, 600], 'rec_docs': '红牛-强化型', 'rec_scores': 0.74081033}]
Inference: 120.39852142333984 ms per batch image
[{'bbox': [0, 0, 514, 436], 'rec_docs': '康师傅矿物质水', 'rec_scores': 0.6918598}]
Inference: 32.045602798461914 ms per batch image
[{'bbox': [138, 40, 573, 1198], 'rec_docs': '乐虎功能饮料', 'rec_scores': 0.68214047}]
Inference: 113.41428756713867 ms per batch image
[{'bbox': [328, 7, 467, 272], 'rec_docs': '脉动', 'rec_scores': 0.60406065}]
Inference: 122.04337120056152 ms per batch image
[{'bbox': [242, 82, 498, 726], 'rec_docs': '味全_每日C', 'rec_scores': 0.5428652}]
Inference: 37.95266151428223 ms per batch image
[{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}]
...
```
Visualizations of the recognition results for all images are also saved in the `output` folder.
Furthermore, you can change the path of the recognition inference model by modifying the `Global.rec_inference_model_dir` field, and change the path of the index database by modifying the `IndexProcess.index_dir` field.
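For example, both fields can be overridden on the command line (a minimal sketch; the model and index paths below are the ones downloaded and built elsewhere in this tutorial):

```shell
# Override the recognition model directory and the index directory in one command
python3.7 python/predict_system.py -c configs/inference_general.yaml \
    -o Global.rec_inference_model_dir="./models/general_PPLCNetV2_base_pretrained_v1.0_infer" \
    -o IndexProcess.index_dir="./drink_dataset_v2.0/index_all"
```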
<a name="Image of Unknown categories recognition experience"></a>
### 2.3 Image of Unknown categories recognition experience

Now we try to recognize the unseen image `./drink_dataset_v2.0/test_images/mosilian.jpeg`.

The image to be retrieved is shown below.

![](../../images/recognition/drink_data_demo/test_images/mosilian.jpeg)

Run the following recognition command.
```shell
# Predict with the GPU; to predict with the CPU, append -o Global.use_gpu=False
python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/mosilian.jpeg"
```
You will find that the output result is empty.

Since the default index database does not contain information about the unknown category, the recognition result here is wrong. In this case, we can recognize images of unknown classes by building a new index database.

When the images in the index database cannot cover the scene we actually want to recognize, i.e. when recognizing an image of an unknown category, we need to add at least one similar image of the unknown category to the index database. This process does not require retraining the model. Taking `mosilian.jpeg` as an example, just follow the steps below to build a new index database.
<a name="Prepare new data and labels"></a>
#### 2.3.1 Prepare new data and labels
First, copy the image(s) belonging to the unknown category (except the query image) to the original image folder of the index database. Here, all the image data has already been placed in the folder `drink_dataset_v2.0/gallery/`.

Then we need to edit the text file that records the image paths and label information. The updated label file has already been placed at `drink_dataset_v2.0/gallery/drink_label_all.txt`. Comparing it with the original `drink_dataset_v2.0/gallery/drink_label.txt` label file, you can find that the index images of the Bright (光明) and Sanyuan (三元) milk series have been added.

In each line of the file, the first field is the relative path of the image and the second field is the label corresponding to the image, separated by a `\t` character (note: some editors automatically convert `tab` to spaces, which will cause a file parsing error).
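For reference, the format looks like the following excerpt (hypothetical lines for illustration only; the real entries live in `drink_label_all.txt`, with the two fields separated by a tab):

```log
gallery/mosilian/1.jpg	光明_莫斯利安
gallery/mosilian/2.jpg	光明_莫斯利安
```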
<a name="Create a new index database"></a>
#### 2.3.2 Create a new index database

Build a new index database `index_all` with the following command.
```shell
python3.7 python/build_gallery.py -c configs/inference_general.yaml -o IndexProcess.data_file="./drink_dataset_v2.0/gallery/drink_label_all.txt" -o IndexProcess.index_dir="./drink_dataset_v2.0/index_all"
```
The new index database is saved in the folder `./drink_dataset_v2.0/index_all`. For specific instructions on the `yaml` configuration fields, please refer to the [Vector Search Documentation](../image_recognition_pipeline/vector_search.md).
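For reference, the two `IndexProcess` fields overridden above correspond to entries like the following in `configs/inference_general.yaml` (a hypothetical excerpt showing only these two keys; the actual file contains more options):

```yaml
IndexProcess:
  # label file mapping gallery image paths to class names
  data_file: "./drink_dataset_v2.0/gallery/drink_label_all.txt"
  # directory where the built index is saved and loaded from
  index_dir: "./drink_dataset_v2.0/index_all"
```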
<a name="Image recognition based on the new index database"></a>
#### 2.3.3 Image recognition based on the new index database
To re-recognize the `mosilian.jpeg` image using the new index database, run the following command.

```shell
# Predict with the GPU; to predict with the CPU, append -o Global.use_gpu=False
python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.infer_imgs="./drink_dataset_v2.0/test_images/mosilian.jpeg" -o IndexProcess.index_dir="./drink_dataset_v2.0/index_all"
```
The output is as follows.

```log
[{'bbox': [290, 297, 564, 919], 'rec_docs': 'Bright_Mosleyan', 'rec_scores': 0.59137374}]
```
The final recognition result is `Bright_Mosleyan` (光明_莫斯利安), which is correct, and the visualization of the result is shown below.
![](../../images/recognition/drink_data_demo/output/mosilian.jpeg)
<a name="3.2"></a>
### 3.2 Build a new Index Base Library
Use the following command to build the index to accelerate the retrieval process after recognition. <a name="5"></a>
```shell ### 2.4 List of server recognition models
python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess.data_file="./recognition_demo_data_v1.1/gallery_product/data_file_update.txt" -o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
```
Finally, the new index information is stored in the folder`./recognition_demo_data_v1.1/gallery_product/index_update`. Use the new index database for the above index. At present, we recommend to use model in [Lightweight General Object Detection Model and Lightweight General Recognition Model](#22-image-recognition-experience) to get better test results. However, if you want to experience the general recognition model, general object detection model and other recognition model for server, the test data download path, and the corresponding configuration file path are as follows.
| Model Introduction                        | Recommended Scenarios | Inference Model | Prediction Config File |
| ----------------------------------------- | --------------------- | --------------- | ---------------------- |
| General MainBody Detection Model          | General Scene         | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar) | - |
| Logo Recognition Model                    | Logo Scene            | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/logo_rec_ResNet50_Logo3K_v1.0_infer.tar) | [inference_logo.yaml](../../../deploy/configs/inference_logo.yaml) |
| Anime Character Recognition Model         | Anime Character Scene | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/cartoon_rec_ResNet50_iCartoon_v1.0_infer.tar) | [inference_cartoon.yaml](../../../deploy/configs/inference_cartoon.yaml) |
| Vehicle Fine-Grained Classification Model | Vehicle Scene         | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/vehicle_cls_ResNet50_CompCars_v1.0_infer.tar) | [inference_vehicle.yaml](../../../deploy/configs/inference_vehicle.yaml) |
| Product Recognition Model                 | Product Scene         | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar) | [inference_product.yaml](../../../deploy/configs/inference_product.yaml) |
| Vehicle ReID Model                        | Vehicle ReID Scene    | [Model download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/vehicle_reid_ResNet50_VERIWild_v1.0_infer.tar) | [inference_vehicle.yaml](../../../deploy/configs/inference_vehicle.yaml) |
<a name="3.3"></a> The above models can be downloaded to the `deploy/models` folder by the following script for use in recognition tasks
### 3.3 Recognize the Unknown Category Images ```shell
cd ./deploy
mkdir -p models
To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows. cd ./models
# Download the generic object detection model for server and unzip it
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar && tar -xf ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar
# Download the generic recognition model and unzip it
wget {recognize model download link path} && tar -xf {name of compressed package}
```
Then use the following commands to download the test data for the other recognition scenarios:

```shell
# Go back to the deploy directory
cd ..
# Download the test data and unzip it
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.1.tar && tar -xf recognition_demo_data_en_v1.1.tar
```
After decompression, the `recognition_demo_data_v1.1` folder should have the following file structure:
```log
├── recognition_demo_data_v1.1
│   ├── gallery_cartoon
│   ├── gallery_logo
│   ├── gallery_product
│   ├── gallery_vehicle
│   ├── test_cartoon
│   ├── test_logo
│   ├── test_product
│   └── test_vehicle
├── ...
```
After downloading the models and test data according to the above steps, you can rebuild the index database and test the corresponding recognition models.
* For more on object detection, please refer to the [Object Detection Tutorial](../image_recognition_pipeline/mainbody_detection.md); for feature extraction, the [Feature Extraction Tutorial](../image_recognition_pipeline/feature_extraction.md); for vector search, the [Vector Search Tutorial](../image_recognition_pipeline/vector_search.md).