Commit f23d4368, authored by dongshuilong, committed by Tingquan Gao

update catlogue and link

Parent a1481422
......@@ -4,7 +4,23 @@ The mainbody detection technology is currently a widely used detection technolog
This tutorial will introduce the technology from three aspects, namely, the datasets, model selection and model training.
## Dataset
## Catalogue
- [1. Dataset](#1)
- [2. Model Selection](#2)
- [2.1 Lightweight Mainbody Detection Model](#2.1)
- [2.2 Server-side Mainbody Detection Model](#2.2)
- [3. Model Training](#3)
- [3.1 Prepare For the Environment](#3.1)
- [3.2 Prepare For the Dataset](#3.2)
- [3.3 Configuration Files](#3.3)
- [3.4 Begin the Training Process](#3.4)
- [3.5 Model Prediction](#3.5)
- [3.6 Model Export and Inference Deployment](#3.6)
<a name="1"></a>
## 1. Dataset
The datasets we used for mainbody detection tasks are shown in the following table.
......@@ -18,7 +34,9 @@ The datasets we used for mainbody detection tasks are shown in the following tab
In the actual training process, all datasets are mixed together. Categories of all the labeled boxes are modified as `foreground`, and the detection model we trained only contains one category (`foreground`).
## Model Selection
<a name="2"></a>
## 2. Model Selection
There are a wide variety of object detection methods, such as the commonly used two-stage detectors (FasterRCNN series, etc.), single-stage detectors (YOLO, SSD, etc.), anchor-free detectors (FCOS, etc.) and so on. PaddleDetection has its self-developed PP-YOLO models for server-side scenarios and PicoDet models for end-side scenarios (CPU and mobile), which all take the lead in the area.
......@@ -34,7 +52,9 @@ Notes:
- CPU of the speed evaluation machine: `Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz`. The reported speed is measured with mkldnn enabled and the number of threads set to 10.
- Mainbody detection has a time-consuming preprocessing procedure, averaging about 40 to 55 ms per image on the above machine. Therefore, it is not included in the inference time.
### Lightweight Mainbody Detection Model
<a name="2.1"></a>
### 2.1 Lightweight Mainbody Detection Model
PicoDet, introduced by [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), is an object detection algorithm designed for CPU and mobile-side scenarios. It integrates the following optimization algorithms.
......@@ -48,7 +68,9 @@ For more details of optimized PicoDet and benchmark, you can refer to [Tutorial
To balance the detection speed and effects in lightweight mainbody detection tasks, we adopt PPLCNet_x2_5 as the backbone of the model and revise the image scale for training and inference to 640x640, with the rest configured the same as [picodet_m_shufflenetv2_416_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/picodet/picodet_m_shufflenetv2_416_coco.yml). The final detection model is obtained after the training of customized mainbody detection datasets.
### Server-side Mainbody Detection Model
<a name="2.2"></a>
### 2.2 Server-side Mainbody Detection Model
PP-YOLO is proposed by [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection). It greatly optimizes the YOLOv3 model from multiple perspectives such as backbone, data augmentation, regularization strategy, loss function, and post-processing. It reaches the state of the art in the trade-off between speed and precision. The optimization strategy is as follows.
......@@ -67,11 +89,15 @@ For more information about PP-YOLO, you can refer to [PP-YOLO tutorial](https://
In the mainbody detection task, we use `ResNet50vd-DCN` as our backbone for better performance. The config file is [ppyolov2_r50vd_dcn_365e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml), in which the dataset path is modified to the customized mainbody detection dataset. The final detection model can be downloaded [here](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar).
## Model Training
<a name="3"></a>
## 3. Model Training
This section mainly talks about how to train your own mainbody detection model using PaddleDetection on your own datasets.
### Prepare For the Environment
<a name="3.1"></a>
### 3.1 Prepare For the Environment
Download PaddleDetection and install requirements.
......@@ -86,7 +112,9 @@ pip install -r requirements.txt
For more installation tutorials, please refer to [Installation Tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL.md)
### Prepare For the Dataset
<a name="3.2"></a>
### 3.2 Prepare For the Dataset
A customized dataset should first be converted to COCO format. Please refer to [Customized Dataset Tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/static/docs/tutorials/Custom_DataSet.md) to build your own datasets in COCO format.
......@@ -96,11 +124,13 @@ In mainbody detection task, all the objects belong to foregroud. Therefore, `cat
[{u'id': 1, u'name': u'foreground', u'supercategory': u'foreground'}]
```
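The single-category convention above can be sketched in a few lines of Python. This is only an illustration, assuming the annotations are already loaded as a COCO-style dictionary; the helper name `to_single_foreground` and the sample data are hypothetical and are not part of PaddleDetection.

```python
import copy

# The single category used for mainbody detection, as shown above.
FOREGROUND = {"id": 1, "name": "foreground", "supercategory": "foreground"}


def to_single_foreground(coco):
    """Return a copy of a COCO-style dict with every box relabeled as foreground."""
    out = copy.deepcopy(coco)
    out["categories"] = [dict(FOREGROUND)]
    for ann in out.get("annotations", []):
        ann["category_id"] = FOREGROUND["id"]
    return out


# Hypothetical two-box annotation file with two original categories.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 17, "bbox": [10, 20, 100, 80]},
        {"id": 2, "image_id": 1, "category_id": 3, "bbox": [200, 50, 60, 60]},
    ],
    "categories": [{"id": 3, "name": "car"}, {"id": 17, "name": "cat"}],
}

converted = to_single_foreground(sample)
print(converted["categories"])
```

The boxes themselves are left untouched; only the labels collapse into one class.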
### Configuration Files
<a name="3.3"></a>
### 3.3 Configuration Files
We use `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` to train the model; more details are as follows.
[![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/docs/images/det/PaddleDetection_config.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/images/det/PaddleDetection_config.png)
[![img](../../images/det/PaddleDetection_config.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/images/det/PaddleDetection_config.png)
......@@ -122,7 +152,9 @@ In mainbody detection task, you need to modify `num_classes` in `datasets/coco_d
In addition, the above files can also be modified according to real situations, for example, if the video memory is overflowing, the batch size and learning rate can be reduced in equal proportion.
### Begin the Training Process
<a name="3.4"></a>
### 3.4 Begin the Training Process
PaddleDetection supports several ways of running the training process.
......@@ -162,7 +194,9 @@ python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppy
Note: If `Out of memory error` occurs, you can try to decrease `batch_size` in `ppyolov2_reader.yml` while reducing learning rate in equal proportion.
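Reducing the learning rate "in equal proportion" can be read as the linear scaling rule. The sketch below assumes that reading; the function name `scale_lr` and the numbers are illustrative placeholders, not values taken from `ppyolov2_reader.yml`.

```python
def scale_lr(base_lr, base_batch_size, new_batch_size):
    """Scale the learning rate by the same factor as the batch size."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Hypothetical example: halving the batch size halves the learning rate.
new_lr = scale_lr(base_lr=0.005, base_batch_size=12, new_batch_size=6)
print(new_lr)
```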
### Model Prediction
<a name="3.5"></a>
### 3.5 Model Prediction
Use the following command to finish the prediction process.
......@@ -173,7 +207,9 @@ python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer
`--draw_threshold` is an optional parameter. According to NMS calculation, different thresholds will produce different results. `keep_top_k` indicates the maximum number of output targets, with a default value of 100 that can be modified according to your actual situation.
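To make the effect of the two parameters concrete, here is a small standalone sketch. It assumes detections are plain dictionaries with a `score` field, and the function `filter_detections` is hypothetical: it only mimics the idea of capping the output count and hiding low-score boxes, not PaddleDetection's actual post-processing code.

```python
def filter_detections(detections, draw_threshold=0.5, keep_top_k=100):
    """Keep at most keep_top_k highest-scoring boxes, then drop those below the threshold."""
    ranked = sorted(detections, key=lambda d: d["score"], reverse=True)
    kept = ranked[:keep_top_k]
    return [d for d in kept if d["score"] >= draw_threshold]


# Hypothetical detector output for one image.
dets = [
    {"bbox": [0, 0, 10, 10], "score": 0.9},
    {"bbox": [5, 5, 20, 20], "score": 0.2},
    {"bbox": [30, 30, 50, 50], "score": 0.6},
    {"bbox": [40, 10, 60, 30], "score": 0.4},
]

visible = filter_detections(dets, draw_threshold=0.5)
print([d["score"] for d in visible])
```

A lower threshold shows more boxes; a smaller `keep_top_k` limits how many can ever be returned.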
### Model Export and Inference Deployment
<a name="3.6"></a>
### 3.6 Model Export and Inference Deployment
Use the following to export the inference model:
......@@ -191,7 +227,7 @@ The final directory contains `inference/ppyolov2_r50vd_dcn_365e_coco`, `inferen
After exporting the model, the path of the detection model can be changed to the inference model path to complete the prediction task.
Taking product recognition as an example, set the field `Global.det_inference_model_dir` in its config file [inference_product.yaml](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/configs/inference_product.yaml) to the directory of the exported inference model, and then complete product detection and recognition by following [Quick Start for Image Recognition](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN_tmp/tutorials/quick_start_recognition.md).
Taking product recognition as an example, set the field `Global.det_inference_model_dir` in its config file [inference_product.yaml](../../../deploy/configs/inference_product.yaml) to the directory of the exported inference model, and then complete product detection and recognition by following [Quick Start for Image Recognition](./quick_start/quick_start_recognition_en.md).
## FAQ
......
......@@ -9,7 +9,7 @@ Vector search finds wide applications in image recognition and image retrieval.
It is worth noting that the current version of `PaddleClas` **only uses CPU for vector retrieval** for the moment in pursuit of better adaptability.
[![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/docs/images/structure.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/images/structure.jpg)
[![img](../../images/structure.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/images/structure.jpg)
As shown in the figure above, two parts constitute the vector search in the whole `PP-ShiTu` system.
......@@ -20,15 +20,15 @@ This document mainly introduces the installation of the search module in PaddleC
------
## Contents
- [1. Installation of the Search Library](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/image_recognition_pipeline/vector_search.md#1)
- [2. Search Algorithms](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/image_recognition_pipeline/vector_search.md#2)
- [3. Introduction of and Configuration Files](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/image_recognition_pipeline/vector_search.md#3)
- [3.1 Parameters of Library Building and Configuration Files](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/image_recognition_pipeline/vector_search.md#3.1)
- [3.2 Parameters of Search Configuration Files](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/image_recognition_pipeline/vector_search.md#3.2)
## Catalogue
- [1. Installation of the Search Library](#1)
- [2. Search Algorithms](#2)
- [3. Introduction of Configuration Files](#3)
- [3.1 Parameters of Library Building and Configuration Files](#3.1)
- [3.2 Parameters of Search Configuration Files](#3.2)
<a name="1"></a>
## 1. Installation of the Search Library
......@@ -40,6 +40,8 @@ pip install faiss-cpu==1.7.1post2
If the above cannot be properly used, please `uninstall` and then `install` again, especially when you are using Windows.
<a name="2"></a>
## 2. Search Algorithms
Currently, the search module in `PaddleClas` supports the following three search algorithms:
......@@ -50,10 +52,14 @@ Currently, the search module in `PaddleClas` supports the following three search
Each search algorithm suits different scenarios. `HNSW32`, the default method, strikes a balance between accuracy and speed; see the [official document](https://github.com/facebookresearch/faiss/wiki) for a detailed introduction.
<a name="3"></a>
## 3. Introduction of Configuration Files
Configuration files involving the search module are under `deploy/configs/`, where `build_*.yaml` is related to building the feature library, and `inference_*.yaml` is the inference file for retrieval or classification.
<a name="3.1"></a>
### 3.1 Parameters of Library Building and Configuration Files
The building of the library is detailed as follows:
......@@ -93,11 +99,11 @@ IndexProcess:
- **dist_type**: the method of similarity calculation adopted in feature matching. For example, Inner Product(`IP`) and Euclidean distance(`L2`).
- **embedding_size**: feature dimensionality
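The difference between the two `dist_type` options can be illustrated with a brute-force search in plain Python. This is a sketch for intuition only: the function `search` and the toy vectors are hypothetical, and the real search module relies on faiss indexes rather than this loop.

```python
def search(gallery, query, dist_type="IP", top_k=1):
    """Return indices of the top_k gallery vectors most similar to the query."""
    if any(len(vec) != len(query) for vec in gallery):
        raise ValueError("all vectors must share the same embedding_size")
    if dist_type == "IP":
        # Inner product: a larger value means more similar.
        scores = [sum(g * q for g, q in zip(vec, query)) for vec in gallery]
        order = sorted(range(len(gallery)), key=lambda i: -scores[i])
    elif dist_type == "L2":
        # Squared Euclidean distance: a smaller value means more similar.
        scores = [sum((g - q) ** 2 for g, q in zip(vec, query)) for vec in gallery]
        order = sorted(range(len(gallery)), key=lambda i: scores[i])
    else:
        raise ValueError("dist_type must be 'IP' or 'L2'")
    return order[:top_k]


# Toy gallery with embedding_size = 2; the first vector is not normalized.
gallery = [[3.0, 0.0], [0.6, 0.8]]
query = [1.0, 0.0]

print(search(gallery, query, dist_type="IP"))
print(search(gallery, query, dist_type="L2"))
```

With unnormalized features the two metrics can rank the gallery differently, which is one reason features are often normalized before an inner-product search.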
<a name="3.2"></a>
### 3.2 Parameters of Search Configuration Files
To integrate the search into the overall `PP-ShiTu` process, please refer to `The Introduction of PP-ShiTu Image Recognition System` in [README](https://github.com/PaddlePaddle/PaddleClas/blob/develop/README_ch.md). Please check the [Quick Start for Image Recognition](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/quick_start/quick_start_recognition.md) for the specific operation of the search.
To integrate the search into the overall `PP-ShiTu` process, please refer to `The Introduction of PP-ShiTu Image Recognition System` in [README](../../../README_en.md). Please check the [Quick Start for Image Recognition](../quick_start/quick_start_recognition_en.md) for the specific operation of the search.
The search part is configured as follows. Please refer to `deploy/configs/inference_*.yaml` for the complete version.
......
......@@ -12,26 +12,26 @@ For an image to be queried, the image recognition process in PaddleClas is divid
The feature gallery is built in advance using the labeled image datasets. The complete image recognition system is shown in the figure below.
[![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/docs/images/structure.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/images/structure.jpg)
![img](../../images/structure.jpg)
To experience the whole image recognition system, or learn how to build a feature gallery, please refer to [Quick Start of Image Recognition](. /quick_start/quick_start_recognition.md), which explains the overall application process. The following parts expound on the training part of the above three steps.
To experience the whole image recognition system, or learn how to build a feature gallery, please refer to [Quick Start of Image Recognition](../quick_start/quick_start_recognition_en.md), which explains the overall application process. The following parts expound on the training part of the above three steps.
Please first refer to the [Installation Guide](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/installation/install_paddleclas.md) to configure the runtime environment.
Please first refer to the [Installation Guide](../installation/install_paddleclas_en.md) to configure the runtime environment.
## Contents
- [1. Mainbody Detection](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#1)
- [2. Feature Model Training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2)
- [2.1. Data Preparation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2.1)
- [2. 2 Single GPU-based Training and Evaluation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2.2)
- [2.2.1 Model Training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2.2.2)
- [2.2.2 Resume Training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2.2.2)
- [2.2.3 Model Evaluation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2.2.3)
- [2.3 Export Inference Model](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2.3)
- [3. Vector Search](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#3)
- [4. Basic Knowledge](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#4)
## Catalogue
- [1. Mainbody Detection](#1)
- [2. Feature Model Training](#2)
- [2.1 Data Preparation](#2.1)
- [2.2 Single GPU-based Training and Evaluation](#2.2)
- [2.2.1 Model Training](#2.2.1)
- [2.2.2 Resume Training](#2.2.2)
- [2.2.3 Model Evaluation](#2.2.3)
- [2.3 Export Inference Model](#2.3)
- [3. Vector Search](#3)
- [4. Basic Knowledge](#4)
<a name="1"></a>
## 1. Mainbody Detection
......@@ -45,11 +45,11 @@ For more information about the training method of mainbody detection, please ref
For more information on the introduction and download of the model provided in PaddleClas for body detection, please refer to: [PaddleDetection Tutorial](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/image_recognition_pipeline/mainbody_detection.md).
<a name="2"></a>
## 2. Feature Model Training
<a name="2.1"></a>
### 2.1 Data Preparation
......@@ -123,7 +123,7 @@ The format of testing set is the same as the one of training set.
**Note**
- When the gallery dataset and query dataset are the same, in order to remove the first data retrieved (the retrieved images themselves do not need to be evaluated), each data needs to correspond to a unique id for subsequent evaluation of metrics such as mAP, recall@1, etc. Please refer to [Introduction to image retrieval datasets](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#图像检索数据集介绍) for the analysis of gallery datasets and query datasets, and [Image retrieval evaluation metrics](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#图像检索评价指标) for the evaluation of mAP, recall@1, etc.
- When the gallery dataset and query dataset are the same, in order to remove the first data retrieved (the retrieved images themselves do not need to be evaluated), each data needs to correspond to a unique id for subsequent evaluation of metrics such as mAP, recall@1, etc. Please refer to [Introduction to image retrieval datasets](#Introduction to Image Retrieval Datasets) for the analysis of gallery datasets and query datasets, and [Image retrieval evaluation metrics](#Image Retrieval Evaluation Metrics) for the evaluation of mAP, recall@1, etc.
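The role of the unique id can be seen in a small recall@1 sketch. It assumes each query already has a ranked list of gallery ids (most similar first); the function `recall_at_1` and the toy data are hypothetical and do not reproduce the PaddleClas metric implementation.

```python
def recall_at_1(ids, labels, ranked_neighbors):
    """Recall@1 when the gallery and query sets are the same images.

    ranked_neighbors[i] lists gallery ids from most to least similar to query i.
    The query's own id is skipped so an image never matches itself.
    """
    label_of = dict(zip(ids, labels))
    hits = 0
    for query_id, neighbors in zip(ids, ranked_neighbors):
        candidates = [n for n in neighbors if n != query_id]
        if candidates and label_of[candidates[0]] == label_of[query_id]:
            hits += 1
    return hits / len(ids)


# Toy data: four images, two classes, each ranked list starts with the image itself.
ids = [0, 1, 2, 3]
labels = ["a", "a", "b", "b"]
ranked = [
    [0, 1, 2, 3],
    [1, 2, 0, 3],
    [2, 3, 0, 1],
    [3, 2, 1, 0],
]

print(recall_at_1(ids, labels, ranked))
```

Without the id check, every query would trivially retrieve itself and recall@1 would always be 1.0.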
Back to `PaddleClas` root directory.
......@@ -132,13 +132,15 @@ Back to `PaddleClas` root directory.
cd ../../
```
<a name="2.2"></a>
### 2.2 Single GPU-based Training and Evaluation
For training and evaluation on a single GPU, the `tools/train.py` and `tools/eval.py` scripts are recommended.
#### 2.2.1 Model Training
Once you have prepared the configuration file, you can start training the image retrieval task in the following way. PaddleClas trains image retrieval models with metric learning; see [metric learning](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/docs/en/tutorials/getting_started_retrieval_en.md#Metric-Learning) for more explanations.
Once you have prepared the configuration file, you can start training the image retrieval task in the following way. PaddleClas trains image retrieval models with metric learning; see [metric learning](#metric learning) for more explanations.
```shell
# Single GPU
......@@ -156,7 +158,7 @@ python3 -m paddle.distributed.launch tools/train.py \
`-c` is used to specify the path to the configuration file, and `-o` is used to specify the parameters that need to be modified or added, where `-o Arch.Backbone.pretrained=True` indicates that the Backbone part uses the pre-trained model. In addition, `Arch.Backbone.pretrained` can also specify the address of a specific model weight file, which needs to be replaced with the path to your own pre-trained model weight file when using it. `-o Global.device=gpu` indicates that the GPU is used for training. If you want to use a CPU for training, you need to set `Global.device` to `cpu`.
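Conceptually, each `-o` argument is a dotted key path plus a value that is written into the loaded configuration. The sketch below shows one way such an override could be applied to a nested dictionary; `apply_override` is a hypothetical helper written for illustration and is not the parser PaddleClas uses.

```python
import ast


def apply_override(config, override):
    """Apply one 'A.B.C=value' override to a nested dict in place."""
    key_path, _, raw = override.partition("=")
    try:
        # Interpret literals such as True, 0.5 or 100.
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Anything else (for example 'gpu' or a file path) stays a string.
        value = raw
    keys = key_path.split(".")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config


# Hypothetical configuration fragment.
cfg = {
    "Arch": {"Backbone": {"name": "MobileNetV1", "pretrained": False}},
    "Global": {"device": "cpu"},
}

apply_override(cfg, "Arch.Backbone.pretrained=True")
apply_override(cfg, "Global.device=gpu")
print(cfg)
```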
For more detailed training configuration, you can also modify the corresponding configuration file of the model directly. Refer to the [configuration document](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/docs/en/tutorials/config_description_en.md) for specific configuration parameters.
For more detailed training configuration, you can also modify the corresponding configuration file of the model directly. Refer to the [configuration document](config_description_en.md) for specific configuration parameters.
Run the above commands and check the output log. An example is as follows:
......@@ -172,7 +174,7 @@ Run the above commands to check the output log, an example is as follows:
The Backbone here is MobileNetV1. To use another backbone, override the parameter `Arch.Backbone.name`, for example by adding `-o Arch.Backbone.name={other Backbone}` to the command. In addition, because the input dimension of the `Neck` section differs between models, replacing the Backbone may also require overriding the Neck input size in the same way.
In the Training Loss section, [CELoss](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/ppcls/loss/celoss.py) and [TripletLossV2](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/ppcls/loss/triplet.py) are used here with the following configuration files:
In the Training Loss section, [CELoss](../../../ppcls/loss/celoss.py) and [TripletLossV2](../../../ppcls/loss/triplet.py) are used here with the following configuration files:
```
Loss:
......@@ -184,7 +186,7 @@ Loss:
margin: 0.5
```
The final total Loss is a weighted sum of all Losses, where `weight` defines the weight of a particular Loss in the final total. To use other Losses, change the Loss field in the configuration file. For the currently supported Losses, please refer to [Loss](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/ppcls/loss).
The final total Loss is a weighted sum of all Losses, where `weight` defines the weight of a particular Loss in the final total. To use other Losses, change the Loss field in the configuration file. For the currently supported Losses, please refer to [Loss](../../../ppcls/loss).
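As a quick illustration of the weighted sum, the sketch below combines two per-batch loss values. The function `total_loss`, the loss values and the weights are made up for the example; the actual weights come from the `Loss` section of the configuration file.

```python
def total_loss(loss_values, weights):
    """Combine individual losses into one scalar as a weighted sum."""
    return sum(weights[name] * value for name, value in loss_values.items())


# Hypothetical per-batch loss values and weights.
loss_values = {"CELoss": 2.0, "TripletLossV2": 0.5}
weights = {"CELoss": 1.0, "TripletLossV2": 0.5}

combined = total_loss(loss_values, weights)
print(combined)
```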
#### 2.2.2 Resume Training
......@@ -246,13 +248,15 @@ Some of the configurable evaluation parameters are introduced as follows.
- `Arch.name`: the name of the model
- `Global.pretrained_model`: path to the pre-trained model file of the model to be evaluated. Unlike `Global.Backbone.pretrained`, this pre-trained model holds the weights of the whole model instead of the Backbone only, because the weights of the whole model need to be loaded for model evaluation.
- `Metric.Eval`: the metrics to be evaluated; by default recall@1, recall@5 and mAP. To skip a metric, remove the corresponding entry from the configuration file. To add an evaluation metric, refer to the [Metric](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/ppcls/metric/metrics.py) section and add the relevant metric to `Metric.Eval` in the configuration file.
- `Metric.Eval`: the metrics to be evaluated; by default recall@1, recall@5 and mAP. To skip a metric, remove the corresponding entry from the configuration file. To add an evaluation metric, refer to the [Metric](../../../ppcls/metric/metrics.py) section and add the relevant metric to `Metric.Eval` in the configuration file.
**Note:**
- When loading the model to be evaluated, the path to the model file needs to be specified, but it is not necessary to include the file suffix, PaddleClas will automatically complete the `.pdparams` suffix, e.g. [2.2.2 Resume Training](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/docs/en/tutorials/getting_started_retrieval_en.md#Resume-Training).
- When loading the model to be evaluated, the path to the model file needs to be specified, but it is not necessary to include the file suffix, PaddleClas will automatically complete the `.pdparams` suffix, e.g. [2.2.2 Resume Training](#2.2.2).
- Metric learning models are generally not evaluated with TopkAcc.
<a name="2.3"></a>
### 2.3 Export Inference Model
PaddlePaddle supports exporting a trained model as an inference model, which can then be used for prediction with the inference engine.
......@@ -264,9 +268,11 @@ python3 tools/export_model.py \
-o Global.save_inference_dir=./inference
```
`Global.pretrained_model` is used to specify the model file path, which still does not need to contain the model file suffix (e.g. [2.2.2 Model Recovery Training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/recognition.md#2.2.2)). When executed, it will generate the `./inference` directory, which contains the `inference.pdiparams`, `inference.pdiparams.info`, and `inference.pdmodel` files. `Global.save_inference_dir` allows you to specify the path to export the inference model. The inference model saved here is truncated at the embedding feature level, i.e. the final output of the model is n-dimensional embedding features.
`Global.pretrained_model` is used to specify the model file path, which still does not need to contain the model file suffix (e.g. [2.2.2 Model Recovery Training](#2.2.2)). When executed, it will generate the `./inference` directory, which contains the `inference.pdiparams`, `inference.pdiparams.info`, and `inference.pdmodel` files. `Global.save_inference_dir` allows you to specify the path to export the inference model. The inference model saved here is truncated at the embedding feature level, i.e. the final output of the model is n-dimensional embedding features.
The above command will generate the model structure file (`inference.pdmodel`) and the model weights file (`inference.pdiparams`), which can then be used for inference using the inference engine. The process of inference using the inference model can be found in [Predictive inference based on the Python prediction engine](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/docs/en/tutorials/@shengyu).
The above command will generate the model structure file (`inference.pdmodel`) and the model weights file (`inference.pdiparams`), which can then be used for inference using the inference engine. The process of inference using the inference model can be found in [Predictive inference based on the Python prediction engine](../inference_deployment/python_deploy_en.md).
<a name="3"></a>
## 3. Vector Search
......@@ -295,19 +301,27 @@ pip install faiss-cpu==1.7.1post2
If the above cannot be properly referenced, please `uninstall` and then `install` again, especially when you are using Windows.
<a name="4"></a>
## 4. Basic Knowledge
Given a query image containing a specific instance (e.g. a specific target, scene, item, etc.), image retrieval finds the images in a database that contain the same instance. Unlike image classification, image retrieval solves an open set problem where the training set may not contain the class of the image being recognised. The overall process of image retrieval is as follows: first, the images are represented as suitable feature vectors; second, a nearest neighbour search is performed on these feature vectors using Euclidean or cosine distances to find similar images in the gallery; finally, some post-processing techniques can be used to fine-tune the retrieval results and determine information such as the category of the image being recognised. Therefore, the performance of an image retrieval algorithm depends mainly on the quality of the feature vectors corresponding to the images.
<a name="metric learning"></a>
- Metric Learning
Metric learning studies how to learn a distance function on a particular task so that the distance function can help nearest-neighbour based algorithms (kNN, k-means, etc.) to achieve better performance. Deep Metric Learning is a method of metric learning that aims to learn a mapping from the original features to a low-dimensional dense vector space (embedding space) such that, under commonly used distance functions (Euclidean distance, cosine distance, etc.), similar objects are close together in the embedding space while objects of different classes are far apart. Deep metric learning has achieved very successful applications in the field of computer vision, such as face recognition, commodity recognition, image retrieval, pedestrian re-identification, etc. See [HERE](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/algorithm_introduction/metric_learning.md) for detailed information.
Metric learning studies how to learn a distance function on a particular task so that the distance function can help nearest-neighbour based algorithms (kNN, k-means, etc.) to achieve better performance. Deep Metric Learning is a method of metric learning that aims to learn a mapping from the original features to a low-dimensional dense vector space (embedding space) such that, under commonly used distance functions (Euclidean distance, cosine distance, etc.), similar objects are close together in the embedding space while objects of different classes are far apart. Deep metric learning has achieved very successful applications in the field of computer vision, such as face recognition, commodity recognition, image retrieval, pedestrian re-identification, etc. See [HERE](../algorithm_introduction/metric_learning_em.md) for detailed information.
<a name="Introduction to Image Retrieval Datasets"></a>
- Introduction to Image Retrieval Datasets
- Training Dataset: used to train the model so that it can learn the image features of the collection.
- Gallery Dataset: used to provide the gallery data for the image retrieval task. The gallery dataset can be the same as the training set or the test set, or different.
- Test Set (Query Dataset): used to test the quality of the model. Usually features are extracted from each test image and matched against the features of the gallery data to obtain recognition results, and the metrics of the whole test set are then calculated from those results.
<a name="Image Retrieval Evaluation Metrics"></a>
- Image Retrieval Evaluation Metrics
- recall: indicates the number of predicted positive cases with positive labels / the number of cases with positive labels
......
......@@ -22,19 +22,23 @@
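For model download and running the pipeline, see [Quick Start for Image Recognition](../quick_start/quick_start_recognition.md).
After downloading the model, prepare the corresponding data, i.e. the data of the specific application you are migrating to. Decide the amount of data according to your actual situation, but do not use too little, as that hurts accuracy. Split the prepared data into two parts: 1) gallery images, 2) test images. The gallery does not need many images, but each category should include images of its objects from different angles. Use your own judgment according to the actual situation.
After downloading the model, prepare the corresponding data, i.e. the data of the specific application you are migrating to. Decide the amount of data according to your actual situation, but do not use too little, as that hurts accuracy. Split the prepared data into two parts: 1) gallery images, 2) test images. The gallery does not need many images, but each category should include images of its objects from different angles. At least 5 images per category are recommended; adjust this according to your actual situation.
You can use [labelme](https://github.com/wkentaro/labelme) as the annotation tool. When annotating, mark the bounding box of the object to be recognized. Note that only the **gallery images** need to be annotated.
We recommend preparing about 30 images per category in total, with at least 5 of them used as gallery images and the rest as test images.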
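The recommended split (about 30 images per category, at least 5 of them for the gallery) can be sketched as follows. The function `split_gallery_test`, the fixed seed and the file names are hypothetical, chosen only to illustrate the split.

```python
import random


def split_gallery_test(images_by_class, gallery_per_class=5, seed=0):
    """Split each category's images into gallery images and test images."""
    rng = random.Random(seed)
    gallery, test = {}, {}
    for label, paths in images_by_class.items():
        if len(paths) <= gallery_per_class:
            raise ValueError(f"category {label!r} has too few images to split")
        shuffled = sorted(paths)
        rng.shuffle(shuffled)
        gallery[label] = shuffled[:gallery_per_class]
        test[label] = shuffled[gallery_per_class:]
    return gallery, test


# Hypothetical category with 30 images.
images = {"cola": [f"cola_{i:02d}.jpg" for i in range(30)]}
gallery_set, test_set = split_gallery_test(images)
print(len(gallery_set["cola"]), len(test_set["cola"]))
```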
<a name="1.2 检索库更新"></a>
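### 1.2 Build the Retrieval Gallery
For data added to the gallery, prepare images of each category from as many angles as possible to enrich the category information. Each image should contain only that category, with as little and as simple a background as possible. For example, annotate bounding boxes on the data to be added and crop out the bbox images as the new images to add, or use a detector to annotate and filter the data, in order to improve the image quality of the gallery.
For data added to the gallery, prepare images of each category from as many angles as possible to enrich the category information. Each image should contain only that category, with as little and as simple a background as possible. That is, crop the bbox images out of the data to be added according to the annotated bounding boxes and use them as the new images to add, in order to improve the image quality of the gallery.
After collecting the images, see section `3.2 Build a New Index Library` of [Quick Start for Image Recognition](../quick_start/quick_start_recognition.md) for how to organize the data and build the gallery.
### 1.3 Accuracy Test
Use the test images to run a simple accuracy test of the whole pipeline. If a category is recognized incorrectly, adjust the gallery: add images similar to the misrecognized test images to the gallery, and iterate. After these adjustments, you can measure the accuracy of the whole pipeline. If the accuracy meets your needs, you can keep using it. If it falls short of expectations, tune the model as described in the sections below.
Use the test images to run a simple accuracy test of the whole pipeline. If a category is recognized incorrectly, adjust the gallery: add images similar to the misrecognized test images (annotated and cropped so that the object has no background) to the gallery, and iterate. After these adjustments, you can measure the accuracy of the whole pipeline. If the accuracy meets your needs, you can keep using it. If it falls short of expectations, tune the model as described in the sections below.
## 2 Model Tuning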
......@@ -44,6 +48,8 @@
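The detection model in `PP-ShiTu` uses the `PicoDet` algorithm; see [this document](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet) for details of the algorithm. For training and tuning the detection model, see [this document](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/README_cn.md).
To train the model, prepare and annotate your own data. We recommend at least 200 annotated images per category, with the annotated images and ground-truth files converted to COCO format so that PaddleDetection can be used for training. For the pretrained weights and related configuration files of mainbody detection, see the [mainbody detection document](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/application/mainbody_detection). Load the mainbody detection pretrained weights when training.
### 2.2 Recognition Model Tuning
If the official model does not meet your accuracy requirements, refer to this section to tune the model.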
......@@ -64,3 +70,17 @@
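- Switch to a smaller model: in general, the smaller the model, the faster the prediction.
- Model pruning and quantization: see the [Model Compression](./model_prune_quantization.md) document; for changes to the compression configuration files, see the [slim configuration files](../../../ppcls/configs/slim/).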
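# Meeting Minutes
1. Schedule sheet: everyone updates it proactively every two weeks, covering the plan for the next two weeks and the completion status of the previous plan. @all
2. One livestreamed lesson on January 19. @水龙
3. Four papers planned for next year. @all
4. Keep a record of the pitfalls encountered when reproducing papers. @崔程
5. Paddle promotional article, to be written as soon as possible. @all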