Measuring the distance between data points is common practice in machine learning. For structured, numeric data, Euclidean Distance, Inner Product, or Cosine Similarity can be computed directly. The same operations are hard to apply to unstructured data, for example when judging how well a video matches a piece of music. Although the differing data formats rule out applying these vector operations directly, prior knowledge tells us that ED(laugh_video, laugh_music) < ED(laugh_video, blue_music). How can this "distance" be characterized effectively? This is exactly the focus of Metric Learning.
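For reference, the three measures named above can be sketched for plain numeric vectors as follows. This is a minimal illustration using only the Python standard library; the function names are chosen here and are not part of PaddleClas.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


u = [1.0, 0.0]
v = [0.0, 1.0]
print(euclidean_distance(u, v), inner_product(u, v), cosine_similarity(u, v))
```

Euclidean Distance is smaller for more similar vectors, whereas Inner Product and Cosine Similarity are larger.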
Metric Learning, also known as Distance Metric Learning, uses machine learning to automatically construct a task-specific metric function from training data. As shown in the figure below, its goal is to learn a transformation function L (either linear or nonlinear) that maps data points from the original vector space to a new one in which similar points are closer together and dissimilar points are further apart, making the metric better suited to the task. Deep Metric Learning fits this transformation function with a deep neural network. [![example](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/ml_illustration.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/ml_illustration.jpg)
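To make the effect of such a transformation concrete, the sketch below applies a linear map L before measuring Euclidean distance. The matrix and the three points are hand-picked for illustration (nothing here is learned): the second coordinate is treated as task-irrelevant noise, and L shrinks it.

```python
import math


def apply_linear(L, x):
    """Multiply the matrix L (a list of rows) by the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in L]


def distance(x, y):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def transformed_distance(L, x, y):
    """Euclidean distance measured after mapping both points through L."""
    return distance(apply_linear(L, x), apply_linear(L, y))


# Hypothetical data: a and b are "similar", a and c are "dissimilar".
a, b, c = [0.0, 0.0], [0.2, 3.0], [1.0, 0.1]
L = [[1.0, 0.0],
     [0.0, 0.1]]  # hand-picked: down-weights the noisy second coordinate

print(distance(a, b) < distance(a, c))                                # original space
print(transformed_distance(L, a, b) < transformed_distance(L, a, c))  # new space
```

In the original space the dissimilar point c is closer to a than the similar point b; after the transformation the order is reversed, which is the behavior Metric Learning tries to obtain from training data.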
## Applications
Metric Learning is widely applied in practice, for example in Face Recognition, Person ReID, Image Retrieval, and Fine-grained classification. With the growing prevalence of deep learning in industrial practice, Deep Metric Learning (DML) has become the main research direction.
Normally, DML consists of three parts: a feature extraction network that maps inputs to embeddings, a sampling strategy that combines the samples in a mini-batch into multiple sub-sets, and a loss function that computes the loss on each sub-set. Please refer to the figure below: [![image](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/ml_pipeline.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/ml_pipeline.jpg)
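As an illustration of the sampling step, the sketch below splits a mini-batch into positive pairs (same label) and negative pairs (different labels). Exhaustive pairing is only one simple strategy, assumed here for clarity; practical samplers usually select a subset of these pairs.

```python
from itertools import combinations


def sample_pairs(labels):
    """Split all index pairs of a mini-batch into positive and negative pairs."""
    positives, negatives = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            positives.append((i, j))
        else:
            negatives.append((i, j))
    return positives, negatives


# Hypothetical mini-batch of four samples.
positives, negatives = sample_pairs(["cat", "dog", "cat", "bird"])
print(positives)
print(negatives)
```

The loss function is then evaluated on each of these sub-sets of the mini-batch.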
## Algorithms
Two learning paradigms are adopted in Metric Learning:
### 1. Classification based:
This refers to methods based on classification labels. They learn an effective feature representation by classifying each sample into the correct category, and the explicit label of each sample is required in the loss calculation during training. Common algorithms include [L2-Softmax](https://arxiv.org/abs/1703.09507), [Large-margin Softmax](https://arxiv.org/abs/1612.02295), [Angular Softmax](https://arxiv.org/pdf/1704.08063.pdf), [NormFace](https://arxiv.org/abs/1704.06369), [AM-Softmax](https://arxiv.org/abs/1801.05599), [CosFace](https://arxiv.org/abs/1801.09414), [ArcFace](https://arxiv.org/abs/1801.07698), etc. These methods are also called proxy-based, because what they optimize is essentially the similarity between a sample and a set of proxies.
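The proxy view can be sketched as follows: each class has one proxy vector, the logits are scaled cosine similarities between the sample embedding and the proxies, and a margin is subtracted from the target-class cosine before cross-entropy is applied. This is a simplified additive-cosine-margin sketch in the spirit of AM-Softmax and CosFace, not the PaddleClas implementation; the `scale` and `margin` values are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def margin_logits(embedding, proxies, target, scale=30.0, margin=0.35):
    """Scaled cosine logits with the margin subtracted for the target class."""
    logits = []
    for k, proxy in enumerate(proxies):
        cos = cosine(embedding, proxy)
        if k == target:
            cos -= margin  # makes the target class harder to satisfy
        logits.append(scale * cos)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


proxies = [[1.0, 0.0], [0.0, 1.0]]  # one hypothetical proxy per class
logits = margin_logits([1.0, 0.0], proxies, target=0)
print(logits, cross_entropy(logits, target=0))
```

Because the margin lowers the target logit, the loss stays larger than with plain softmax, which pushes embeddings closer to their own proxy.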
### 2. Pairwise based:
This refers to the learning paradigm based on paired samples. It takes sample pairs as input and obtains an effective feature representation by directly learning the similarity between these pairs. Common algorithms include [Contrastive loss](http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf), [Triplet loss](https://arxiv.org/abs/1503.03832), [Lifted-Structure loss](https://arxiv.org/abs/1511.06452), [N-pair loss](https://), [Multi-Similarity loss](https://arxiv.org/pdf/1904.06627.pdf), etc.
[CircleLoss](https://arxiv.org/abs/2002.10857), released in 2020, unifies the two learning paradigms from a fresh perspective and has prompted researchers and practitioners to reflect further on Metric Learning.
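As one concrete instance of the pairwise paradigm, a triplet loss on squared Euclidean distances can be sketched as below. The margin value and the sample vectors are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther from the anchor than the positive by the margin."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


anchor = [0.0, 0.0]
print(triplet_loss(anchor, positive=[0.1, 0.0], negative=[1.0, 0.0]))  # satisfied triplet
print(triplet_loss(anchor, positive=[1.0, 0.0], negative=[0.1, 0.0]))  # violated triplet
```

No class label enters the loss directly; only the relative distances within each triplet matter.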
Feature extraction plays a key role in image recognition: it transforms the input image into a fixed-dimensional feature vector for subsequent [vector search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/image_recognition_pipeline/vector_search.md). Good features preserve similarity, i.e., in the feature space, pairs of images with high similarity should have high feature similarity (closer together), and pairs of images with low similarity should have low feature similarity (further apart). [Deep Metric Learning](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/algorithm_introduction/metric_learning.md) studies how to obtain features with high representational power through deep learning.
## 2. Network Structure
In order to customize the image recognition task flexibly, the whole network is divided into Backbone, Neck, Head, and Loss. The figure below illustrates the overall structure:
- **Backbone**: Specifies the backbone network to be used. Note that the ImageNet-based pre-trained models provided by PaddleClas have a last layer with 1000 outputs, which needs to be customized according to the required feature dimension.
- **Neck**: Used for feature augmentation and feature dimension transformation. It can be a simple Linear Layer for feature dimension transformation, or a more complex FPN structure for feature augmentation.
- **Head**: Used to transform features into logits. In addition to the common FC Layer, modules such as cosmargin, arcmargin, and circlemargin are available.
- **Loss**: Specifies the loss function to be used. It is designed as a combined form so that Classification Loss and Pairwise Loss can easily be combined.
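The division of labor between the four parts can be sketched with plain Python callables. Everything below is a toy stand-in: the class name, the lambda "layers", and the loss weights are hypothetical and only show how the parts compose, not how PaddleClas implements them.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


@dataclass
class RecognitionModel:
    backbone: Callable[[Sequence[float]], Vector]  # image -> raw features
    neck: Callable[[Vector], Vector]               # raw features -> embedding
    head: Callable[[Vector], Vector]               # embedding -> logits

    def forward(self, image: Sequence[float]) -> Tuple[Vector, Vector]:
        features = self.neck(self.backbone(image))
        logits = self.head(features)
        return features, logits


def combined_loss(weighted_losses):
    """Weighted sum of individual loss values, e.g. classification plus pairwise."""
    return sum(weight * value for weight, value in weighted_losses)


# Toy stand-ins for real network layers.
model = RecognitionModel(
    backbone=lambda image: [sum(image), max(image)],
    neck=lambda feats: [2.0 * f for f in feats],
    head=lambda feats: [feats[0] - feats[1], feats[1] - feats[0]],
)
features, logits = model.forward([0.5, 1.5])
print(features, logits, combined_loss([(1.0, 0.5), (0.5, 2.0)]))
```

The embedding returned by the Neck is what would be used for retrieval, while the logits from the Head are only needed to compute the training loss.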
## 3. General Recognition Models
In PP-ShiTu, [PP_LCNet_x2_5](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/models/PP-LCNet.md) serves as the backbone network, a Linear Layer as the Neck, [ArcMargin](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/arch/gears/arcmargin.py) as the Head, and CELoss as the Loss. See the details in the [General Recognition configuration files](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/GeneralRecognition/). The training data covers the following seven public datasets:
| Datasets | Data Size | Class Number | Scenarios | URL |
- CPU of the speed evaluation machine: `Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz`.
- Evaluation conditions for the speed metric: MKLDNN enabled, number of threads set to 10.
- Address of the pre-trained model: [General recognition pre-trained model](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0.pdparams)
# 4. Customized Feature Extraction
Customized feature extraction refers to retraining the feature extraction model based on one's own task. It consists of four main steps: 1) data preparation, 2) model training, 3) model evaluation, and 4) model inference.
## 4.1 Data Preparation
To start with, customize your dataset based on the task (See [Format description](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/data_preparation/recognition_dataset.md#数据集格式说明) for the dataset format). Before initiating the model training, modify the data-related content in the configuration files, including the address of the dataset and the class number. The corresponding locations in configuration files are shown below:
```
Head:
  name: ArcMargin
  embedding_size: 512
  class_num: 185341    # Number of classes
```
```
Train:
  dataset:
    name: ImageNetDataset
    image_root: ./dataset/    # Directory where the training dataset is located
    cls_label_path: ./dataset/train_reg_all_data.txt    # Path of the label file for the training dataset
```
```
Query:
  dataset:
    name: VeriWild
    image_root: ./dataset/Aliproduct/    # Directory where the query dataset is located
    cls_label_path: ./dataset/Aliproduct/val_list.txt    # Path of the label file for the query dataset
```
```
Gallery:
  dataset:
    name: VeriWild
    image_root: ./dataset/Aliproduct/    # Directory where the gallery dataset is located
    cls_label_path: ./dataset/Aliproduct/val_list.txt    # Path of the label file for the gallery dataset
```
**Note:** The configuration file enables `online evaluation` by default. To speed up training by disabling `online evaluation`, add `-o eval_during_train=False` after the above command. After training, the final model files `latest` and `best_model` and the training log file `train.log` are generated under the directory `output`. `best_model` stores the best model under the current evaluation metrics, while `latest` stores the most recently generated model, making it convenient to resume training from where it was interrupted.
**Recommendation:** It is suggested to employ multi-card evaluation, which can quickly obtain the feature set of the overall dataset using multi-card parallel computing, accelerating the evaluation process.
## 4.4 Model Inference
Inference involves two steps: 1) exporting the inference model; 2) obtaining the feature vector.
The generated inference model is saved under the directory `inference`, which contains three files: `inference.pdmodel`, `inference.pdiparams`, and `inference.pdiparams.info`. `inference.pdmodel` stores the structure of the inference model, while `inference.pdiparams` and `inference.pdiparams.info` store model-related parameters.
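Before moving on, it can be handy to confirm that the export produced all three files. The helper below is a small convenience sketch and not part of PaddleClas; the function name is hypothetical.

```python
from pathlib import Path

EXPECTED_FILES = (
    "inference.pdmodel",         # model structure
    "inference.pdiparams",       # model parameters
    "inference.pdiparams.info",  # parameter metadata
)


def missing_inference_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    directory = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


print(missing_inference_files("./inference"))
```

An empty list means the directory contains all three exported files.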
### 4.4.2 Obtain Feature Vector
```
cd deploy
python python/predict_rec.py \
-c configs/inference_rec.yaml \
-o Global.rec_inference_model_dir="../inference"
```
The output format of the obtained features is shown in the figure below: [![img](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/feature_extraction_output.png)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/feature_extraction_output.png)
In practical use, however, business operations require more than simply obtaining features. To further perform image recognition by feature retrieval, please refer to the document [vector search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/image_recognition_pipeline/vector_search.md).
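The idea behind feature retrieval can be sketched as a brute-force nearest-neighbor search: the query feature is compared with every gallery feature by cosine similarity and the best matches are returned. This is only an illustration with made-up two-dimensional features; the vector search module referenced above relies on a dedicated index rather than this loop.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def search(query, gallery, top_k=1):
    """Return the top_k (label, cosine similarity) pairs from a label -> feature dict."""
    q = l2_normalize(query)
    scored = [
        (label, sum(a * b for a, b in zip(q, l2_normalize(feature))))
        for label, feature in gallery.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical gallery with one feature vector per label.
gallery = {"cola": [1.0, 0.0], "juice": [0.0, 1.0]}
print(search([0.9, 0.1], gallery))
```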
- [2. Installation of Serving](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#2)
- [3. Service Deployment for Image Classification](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3)
  - [3.1 Model Transformation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3.1)
  - [3.2 Service Deployment and Request](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3.2)
- [4. Service Deployment for Image Recognition](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4)
  - [4.1 Model Transformation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4.1)
  - [4.2 Service Deployment and Request](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4.2)
[Paddle Serving](https://github.com/PaddlePaddle/Serving) is designed to make it easy for deep learning developers to deploy online prediction services. It supports one-click deployment of industrial-grade services, highly concurrent and efficient communication between client and server, and multiple programming languages for client development.
This section uses HTTP deployment of a prediction service as an example to describe how to deploy model services in PaddleClas with PaddleServing. Currently, only deployment on the Linux platform is supported; Windows is not supported.
## 2. Installation of Serving
It is officially recommended to use Docker for the installation and environment deployment of Serving. First, pull the Docker image and create a Serving-based container.
```
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
nvidia-docker exec -it test bash
```
Once inside the container, install the Serving-related Python packages.
```
pip3 install paddle-serving-client==0.7.0
pip3 install paddle-serving-server==0.7.0 # CPU
pip3 install paddle-serving-app==0.7.0
pip3 install paddle-serving-server-gpu==0.7.0.post102 #GPU with CUDA10.2 + TensorRT6
# For other GPU environments, confirm the environment before choosing which one to install
pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
- Speed up the installation process by replacing the source with `-i https://pypi.tuna.tsinghua.edu.cn/simple`.
- For other environment configuration and installation, please refer to [Install Paddle Serving using docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
- To deploy CPU services, please install the CPU version of serving-server with the following command.
```
pip install paddle-serving-server
```
## 3. Service Deployment for Image Classification
### 3.1 Model Transformation
When adopting PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. The following part takes the classic ResNet50_vd model as an example to introduce the deployment of image classification service.
- Enter the working directory:
```
cd deploy/paddleserving
```
- Download the inference model of ResNet50_vd:
```
# Download and decompress the ResNet50_vd model
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
```
- Convert the downloaded inference model into a format that is readily deployable by Server with the help of paddle_serving_client.
After the transformation, `ResNet50_vd_serving` and `ResNet50_vd_client` will be added to the current folder in the following format:
```
|- ResNet50_vd_server/
  |- __model__
  |- __params__
  |- serving_server_conf.prototxt
  |- serving_server_conf.stream.prototxt
|- ResNet50_vd_client
  |- serving_client_conf.prototxt
  |- serving_client_conf.stream.prototxt
```
Having obtained the model file, modify the alias name in `serving_server_conf.prototxt` under directory `ResNet50_vd_server` by changing `alias_name` in `fetch_var` to `prediction`.
**Notes**: Serving supports renaming inputs and outputs to stay compatible with the deployment of different models. In this way, modifying the `alias_name` in the configuration file is the only step needed to complete inference and deployment for different models. The modified `serving_server_conf.prototxt` is shown below:
```
feed_var {
  name: "inputs"
  alias_name: "inputs"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "prediction"
  is_lod_tensor: true
  fetch_type: 1
  shape: -1
}
```
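If the renaming needs to be repeated for several models, it can be scripted instead of edited by hand. The sketch below rewrites `alias_name` only inside `fetch_var` blocks of the configuration text. It is a hypothetical helper, not a Serving API, and it assumes that `fetch_var` blocks contain no nested braces, as in the file shown above.

```python
import re

FETCH_BLOCK = re.compile(r"fetch_var\s*\{[^}]*\}")
ALIAS_LINE = re.compile(r'alias_name:\s*"[^"]*"')


def set_fetch_alias(prototxt_text, new_alias):
    """Return the config text with alias_name replaced inside every fetch_var block."""
    def rewrite(match):
        # Only the matched fetch_var block is touched; feed_var stays unchanged.
        return ALIAS_LINE.sub(lambda _: f'alias_name: "{new_alias}"', match.group(0))
    return FETCH_BLOCK.sub(rewrite, prototxt_text)


SAMPLE = '''feed_var {
  name: "inputs"
  alias_name: "inputs"
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "save_infer_model/scale_0.tmp_1"
}
'''
print(set_fetch_alias(SAMPLE, "prediction"))
```

To apply it to a real file, read `serving_server_conf.prototxt`, pass its content through `set_fetch_alias`, and write the result back.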
### 3.2 Service Deployment and Request
The `paddleserving` directory contains the code to start the pipeline service and send prediction requests, including:
```
__init__.py
config.yml # Configuration file for starting the service
pipeline_http_client.py # Script for sending pipeline prediction requests by http
pipeline_rpc_client.py # Script for sending pipeline prediction requests by rpc
classification_web_service.py # Script for starting the pipeline server
```
- Start the service:
```
# Start the service and the run log is saved in log.txt
python3 classification_web_service.py &>log.txt &
```
Once the service is successfully started, a log similar to the following will be printed in log.txt: [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/start_server.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/start_server.png)
- Send request:
```
# Send service request
python3 pipeline_http_client.py
```
Once the request has run successfully, the prediction results will be printed in the command-line window, as in the following example: [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/results.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/results.png)
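For readers who want to issue the HTTP request from their own code, the sketch below builds a JSON body with a base64-encoded image and posts it with the standard library. The `key`/`value` layout and the field name `image` are assumptions about what the pipeline service expects, and the URL must be supplied by the caller; `pipeline_http_client.py` and `config.yml` in the repository remain the authoritative reference.

```python
import base64
import json
import urllib.request


def build_request_body(image_bytes):
    """Encode raw image bytes as base64 and wrap them in a key/value JSON body."""
    encoded = base64.b64encode(image_bytes).decode("utf8")
    # Assumed layout: parallel "key" and "value" lists with a single "image" entry.
    return json.dumps({"key": ["image"], "value": [encoded]})


def send_request(url, image_bytes):
    """POST the image to the service at url and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=build_request_body(image_bytes).encode("utf8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf8"))


print(build_request_body(b"fake image bytes"))
```

`send_request` is not called here because it needs a running service; pass it the address configured for the pipeline service together with the bytes of an image file.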
## 4. Service Deployment for Image Recognition
When using PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. The following part, exemplified by the ultra-lightweight model for image recognition in PP-ShiTu, details the deployment of image recognition service.
### 4.1 Model Transformation
- Download inference models for general detection and general recognition
```
cd deploy
# Download and decompress general recognition models
```
After the transformation, `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` will be added to the current folder. Modify the alias name in `serving_server_conf.prototxt` under the directory `general_PPLCNet_x2_5_lite_v1.0_serving/` by changing `alias_name` to `features` in `fetch_var`. The modified `serving_server_conf.prototxt` is similar to the following:
```
feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "features"
  is_lod_tensor: true
  fetch_type: 1
  shape: -1
}
```
- Convert the inference model for detection into a Serving model:
After the transformation, `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` and `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` will be added to the current folder.
**Note:** The alias name in the `serving_server_conf.prototxt` under the directory `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` requires no modification.
- Download and decompress the constructed search library index
```
cd ../
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar
```
### 4.2 Service Deployment and Request
**Note:** Since the recognition service involves multiple models, PipeLine is adopted for better performance. This deployment method does not support the Windows platform for now.
- Enter the working directory
```
cd ./deploy/paddleserving/recognition
```
The working directory contains the code to start the pipeline service and send prediction requests, including:
```
__init__.py
config.yml # Configuration file for starting the service
pipeline_http_client.py # Script for sending pipeline prediction requests by http
pipeline_rpc_client.py # Script for sending pipeline prediction requests by rpc
recognition_web_service.py # Script for starting the pipeline server
```
- Start the service:
```
# Start the service and the run log is saved in log.txt
python3 recognition_web_service.py &>log.txt &
```
Once the service is successfully started, a log similar to the following will be printed in log.txt: [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/start_server_shitu.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/start_server_shitu.png)
- Send request:
```
python3 pipeline_http_client.py
```
Once the request has run successfully, the prediction results will be printed in the command-line window, as in the following example: [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/results_shitu.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/results_shitu.png)
## 5. FAQ
**Q1**: After sending a request, no result is returned or the output reports a decoding error.
**A1**: Turn off the proxy before starting the service and sending requests. Try the following commands:
```
unset https_proxy
unset http_proxy
```
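When the request is sent from a Python script rather than from the shell, the same effect can be obtained inside the process. The helper below is a hypothetical sketch: it removes the two lowercase variables named above from the environment of the current process (and of processes it starts afterwards) and does not change the parent shell.

```python
import os

PROXY_VARIABLES = ("http_proxy", "https_proxy")


def clear_proxy_settings(environ=os.environ):
    """Remove the proxy variables from environ and return the names that were set."""
    removed = []
    for name in PROXY_VARIABLES:
        if environ.pop(name, None) is not None:
            removed.append(name)
    return removed


print(clear_proxy_settings())
```

Call it before any HTTP client is created so that the client does not pick up the proxy settings.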
For more types of service deployment, such as `RPC prediction services`, refer to the [examples in the official Serving GitHub repository](https://github.com/PaddlePaddle/Serving/tree/v0.7.0/examples).