diff --git a/docs/en/algorithm_introduction/metric_learning_en.md b/docs/en/algorithm_introduction/metric_learning_en.md index f37d24b55f1baea4cde21c6fc4add9ce0ad5ecc3..8eaa91036f037d1855f197a3fea41f7e8a54853f 100644 --- a/docs/en/algorithm_introduction/metric_learning_en.md +++ b/docs/en/algorithm_introduction/metric_learning_en.md @@ -1,26 +1,38 @@ # Metric Learning -## Introduction +## Contents +- [1.Introduction](#1) +- [2.Applications](#2) +- [3.Algorithms](#3) + - [3.1 Classification based](#3.1) + - [3.2 Pairwise based](#3.2) + + +## 1.Introduction Measuring the distance between data is a common practice in machine learning. Generally speaking, Euclidean Distance, Inner Product, or Cosine Similarity are all available to calculate measurable data. However, the same operation can hardly be replicated on unstructured data, such as calculating the compatibility between a video and a piece of music. Despite the difficulty in performing the aforementioned vector operation directly due to varied data formats, priori knowledge tells that ED(laugh_video, laugh_music) < ED(laugh_video, blue_music). And how to effectively characterize this "distance"? This is exactly the focus of Metric Learning. -Metric learning, known as Distance Metric Learning, is to automatically construct a task-specific metric function based on training data in the form of machine learning. As shown in the figure below, the goal of Metric learning is to learn a transformation function (either linear or nonlinear) L that maps data points from the original vector space to a new one in which similar points are closer together and non-similar points are further apart, making the metric more task-appropriate. And Deep Metric Learning fits the transformation function by adopting a deep neural network. 
[![example](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/ml_illustration.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/ml_illustration.jpg) +Metric learning, known as Distance Metric Learning, is to automatically construct a task-specific metric function based on training data in the form of machine learning. As shown in the figure below, the goal of Metric learning is to learn a transformation function (either linear or nonlinear) L that maps data points from the original vector space to a new one in which similar points are closer together and non-similar points are further apart, making the metric more task-appropriate. And Deep Metric Learning fits the transformation function by adopting a deep neural network. ![example](../../images/ml_illustration.jpg) -## Applications + +## 2.Applications Metric Learning technologies are widely applied in real life, such as Face Recognition, Person ReID, Image Retrieval, Fine-grained classification, etc. With the growing prevalence of deep learning in industrial practice, Deep Metric Learning (DML) emerges as the current research direction. -Normally, DML consists of three parts: a feature extraction network for map embedding, a sampling strategy to combine samples in a mini-batch into multiple sub-sets, and a loss function to compute the loss on each sub-set. Please refer to the figure below: [![image](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/ml_pipeline.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/ml_pipeline.jpg) +Normally, DML consists of three parts: a feature extraction network for map embedding, a sampling strategy to combine samples in a mini-batch into multiple sub-sets, and a loss function to compute the loss on each sub-set. Please refer to the figure below: ![image](../../images/ml_pipeline.jpg) -## Algorithms + +## 3.Algorithms Two learning paradigms are adopted in Metric Learning: -### 1. 
Classification based: + +### 3.1 Classification based: This refers to methods based on classification labels. They learn the effective feature representation by classifying each sample into the correct category and require the participation of the explicit labels of each sample in the Loss calculation during the learning process. Common algorithms include [L2-Softmax](https://arxiv.org/abs/1703.09507), [Large-margin Softmax](https://arxiv.org/abs/1612.02295), [Angular Softmax]( https://arxiv.org/pdf/1704.08063.pdf), [NormFace](https://arxiv.org/abs/1704.06369), [AM-Softmax](https://arxiv.org/abs/1801.05599), [CosFace](https://arxiv.org/abs/1801.09414), [ArcFace](https://arxiv.org/abs/1801.07698), etc. These methods are also called proxy-based, because what they optimize is essentially the similarity between a sample and a set of proxies. -### 2. Pairwise based: + +### 3.2 Pairwise based: This refers to the learning paradigm based on paired samples. It takes sample pairs as input and obtains an effective feature representation by directly learning the similarity between these pairs. Common algorithms include [Contrastive loss](http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf), [ Triplet loss](https://arxiv.org/abs/1503.03832), [Lifted-Structure loss](https://arxiv.org/abs/1511.06452), [N-pair loss](https://), [Multi-Similarity loss](https://arxiv.org/pdf/1904.06627.pdf), etc. 
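Note on the pairwise paradigm described in `metric_learning_en.md`: the idea of pulling similar pairs together and pushing dissimilar pairs apart can be sketched with a triplet loss in plain Python. This is a minimal illustration only. The names `euclidean` and `triplet_loss`, the margin of 0.2, and the toy vectors are assumptions for the example and are not PaddleClas APIs. Published variants differ on details such as squared versus unsquared distance; this sketch uses the unsquared Euclidean distance.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


# Toy 2-D embeddings (made-up values, not model output).
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
print(triplet_loss(anchor, positive, negative))  # well-separated triplet
print(triplet_loss(anchor, negative, positive))  # roles swapped: violating triplet
```

In the first call the negative is already much farther from the anchor than the positive, so nothing is penalized; swapping the roles produces a positive loss that a training step would try to reduce.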
diff --git a/docs/en/data_preparation/classification_dataset_en.md b/docs/en/data_preparation/classification_dataset_en.md
index 7b78f6a96f78cbf8146f3ad07b2f0c3ac062f3d8..0127171d0256f512ec3a00fe81e7aee12256c542 100644
--- a/docs/en/data_preparation/classification_dataset_en.md
+++ b/docs/en/data_preparation/classification_dataset_en.md
@@ -6,7 +6,7 @@ This document elaborates on the dataset format adopted by PaddleClas for image c
 ## Contents
-- [Dataset Format](#1)
+- [1.Dataset Format](#1)
 - [Common Datasets for Image Classification](#2)
     - [2.1 ImageNet1k](#2.1)
     - [2.2 Flowers102](#2.2)
@@ -16,7 +16,7 @@ This document elaborates on the dataset format adopted by PaddleClas for image c
-## 1 Dataset Format
+## 1. Dataset Format
 PaddleClas adopts `txt` files to assign the training and test sets. Taking the `ImageNet1k` dataset as an example, where `train_list.txt` and `val_list.txt` have the following formats:
@@ -34,7 +34,7 @@ val/ILSVRC2012_val_00000001.JPEG 65
-## 2 Common Datasets for Image Classification
+## 2. Common Datasets for Image Classification
 Here we present a compilation of commonly used image classification datasets, which is continuously updated and expects your supplement.
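Note on the list format touched by `classification_dataset_en.md`: the hunk context shows an entry such as `val/ILSVRC2012_val_00000001.JPEG 65`, that is, an image path followed by a class id. A reader for such a file might look like the sketch below. The function name `parse_label_list` and the single-space delimiter are assumptions made for illustration; this is not the loader PaddleClas itself uses.

```python
def parse_label_list(text, delimiter=" "):
    """Parse lines of the form '<image path><delimiter><class id>'."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split from the right so a path that contains the delimiter survives.
        path, label = line.rsplit(delimiter, 1)
        samples.append((path, int(label)))
    return samples


example = "val/ILSVRC2012_val_00000001.JPEG 65\n"
print(parse_label_list(example))
```

Splitting from the right is a deliberate choice here: the class id is always the last field, while the path is the part more likely to contain unexpected characters.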
diff --git a/docs/en/data_preparation/recognition_dataset_en.md b/docs/en/data_preparation/recognition_dataset_en.md index bd6b7743c5b54510f0da43fa620cd20db45e6499..bd06a7a87445146fa1ec3d85b225763c1cb89a30 100644 --- a/docs/en/data_preparation/recognition_dataset_en.md +++ b/docs/en/data_preparation/recognition_dataset_en.md @@ -6,8 +6,8 @@ This document elaborates on the dataset format adopted by PaddleClas for image r ## Contents -- [Dataset Format](#1) -- [Common Datasets for Image Recognition](#2) +- [1.Dataset Format](#1) +- [2.Common Datasets for Image Recognition](#2) - [2.1 General Datasets](#2.1) - [2.2 Vertical Datasets](#2.2) - [2.2.1 Animation Character Recognition](#2.2.1) @@ -17,7 +17,7 @@ This document elaborates on the dataset format adopted by PaddleClas for image r -## 1 Dataset Format +## 1.Dataset Format The dataset for the vector search, unlike those for classification tasks, is divided into the following three parts: @@ -57,7 +57,7 @@ Each row of data is separated by "space", and the three columns of data stand fo -## 2. Common Datasets for Image Recognition +## 2.Common Datasets for Image Recognition Here we present a compilation of commonly used image recognition datasets, which is continuously updated and expects your supplement. diff --git a/docs/en/image_recognition_pipeline/feature_extraction_en.md b/docs/en/image_recognition_pipeline/feature_extraction_en.md index c96f8a6920bed23b5a60a6ac7952191e4c1176e9..3411a56908516722199743cca403d6b889b26b1a 100644 --- a/docs/en/image_recognition_pipeline/feature_extraction_en.md +++ b/docs/en/image_recognition_pipeline/feature_extraction_en.md @@ -1,14 +1,27 @@ # Feature Extraction -## 1. Introduction +## Content -Feature extraction plays a key role in image recognition, which serves to transform the input image into a fixed dimensional feature vector for subsequent [vector search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/image_recognition_pipeline/vector_search.md). 
Good features boast great similarity preservation, i.e., in the feature space, pairs of images with high similarity should have higher feature similarity (closer together), and pairs of images with low similarity should have less feature similarity (further apart). [Deep Metric Learning](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/algorithm_introduction/metric_learning.md) is applied to explore how to obtain features with high representational power through deep learning. +- [1.Introduction](#1) +- [2.Network Structure](#2) +- [3.General Recognition Models](#3) +- [4.Customized Feature Extraction](#4) + - [4.1 Data Preparation](#4.1) + - [4.2 Model Training](#4.2) + - [4.3 Model Evaluation](#4.3) + - [4.4 Model Inference](#4.4) -## 2. Network Structure + +## 1.Introduction + +Feature extraction plays a key role in image recognition, which serves to transform the input image into a fixed dimensional feature vector for subsequent [vector search](./vector_search_en.md). Good features boast great similarity preservation, i.e., in the feature space, pairs of images with high similarity should have higher feature similarity (closer together), and pairs of images with low similarity should have less feature similarity (further apart). [Deep Metric Learning](../algorithm_introduction/metric_learning_en.md) is applied to explore how to obtain features with high representational power through deep learning. + + +## 2.Network Structure In order to customize the image recognition task flexibly, the whole network is divided into Backbone, Neck, Head, and Loss. 
The figure below illustrates the overall structure: -[![img](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/feature_extraction_framework.png)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/feature_extraction_framework.png) +![img](../../images/feature_extraction_framework.png) Functions of the above modules : @@ -17,9 +30,10 @@ Functions of the above modules : - **Head**: Used to transform features into logits. In addition to the common Fc Layer, cosmargin, arcmargin, circlemargin and other modules are all available choices. - **Loss**: Specifies the Loss function to be used. It is designed as a combined form to facilitate the combination of Classification Loss and Pair_wise Loss. -## 3. General Recognition Models + +## 3.General Recognition Models -In PP-Shitu, we have [PP_LCNet_x2_5](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/models/PP-LCNet.md) as the backbone network, Linear Layer for Neck, [ArcMargin](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/arch/gears/arcmargin.py) for Head, and CELoss for Loss. See the details in [General Recognition_configuration files](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/GeneralRecognition/). The involved training data covers the following seven public datasets: +In PP-Shitu, we have [PP_LCNet_x2_5](../models/PP-LCNet.md) as the backbone network, Linear Layer for Neck, [ArcMargin](../../../ppcls/arch/gears/arcmargin.py) for Head, and CELoss for Loss. See the details in [General Recognition_configuration files](../../../ppcls/configs/GeneralRecognition/). 
The involved training data covers the following seven public datasets: | Datasets | Data Size | Class Number | Scenarios | URL | | ------------ | --------- | ------------ | ------------------ | ------------------------------------------------------------ | @@ -43,13 +57,15 @@ The results are shown in the table below: - Evaluation conditions for the speed metric: MKLDNN enabled, number of threads set to 10 - Address of the pre-training model: [General recognition pre-training model](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0.pdparams) -# 4. Customized Feature Extraction + +# 4.Customized Feature Extraction Customized feature extraction refers to retraining the feature extraction model based on one's own task. It consists of four main steps: 1) data preparation, 2) model training, 3) model evaluation, and 4) model inference. + ## 4.1 Data Preparation -To start with, customize your dataset based on the task (See [Format description](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/data_preparation/recognition_dataset.md#数据集格式说明) for the dataset format). Before initiating the model training, modify the data-related content in the configuration files, including the address of the dataset and the class number. The corresponding locations in configuration files are shown below: +To start with, customize your dataset based on the task (See [Format description](../data_preparation/recognition_dataset_en.md#1) for the dataset format). Before initiating the model training, modify the data-related content in the configuration files, including the address of the dataset and the class number. The corresponding locations in configuration files are shown below: ``` Head: @@ -82,6 +98,7 @@ Train: cls_label_path: ./dataset/Aliproduct/val_list.txt. 
#The address of label file for gallery dataset ``` + ## 4.2 Model Training - Single machine single card training @@ -112,6 +129,7 @@ python -m paddle.distributed.launch \ -o Global.checkpoint="output/RecModel/latest" ``` + ## 4.3 Model Evaluation - Single Card Evaluation @@ -135,6 +153,7 @@ python -m paddle.distributed.launch \ **Recommendation:** It is suggested to employ multi-card evaluation, which can quickly obtain the feature set of the overall dataset using multi-card parallel computing, accelerating the evaluation process. + ## 4.4 Model Inference Two steps are included in the inference: 1)exporting the inference model; 2)obtaining the feature vector. @@ -158,6 +177,6 @@ python python/predict_rec.py \ -o Global.rec_inference_model_dir="../inference" ``` -The output format of the obtained features is shown in the figure below:[![img](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/feature_extraction_output.png)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/feature_extraction_output.png) +The output format of the obtained features is shown in the figure below:![img](../../images/feature_extraction_output.png) -In practical use, however, business operations require more than simply obtaining features. To further perform image recognition by feature retrieval, please refer to the document [vector search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/image_recognition_pipeline/vector_search.md). +In practical use, however, business operations require more than simply obtaining features. To further perform image recognition by feature retrieval, please refer to the document [vector search](./vector_search_en.md). 
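Note on the similarity-preservation property described in `feature_extraction_en.md`: once feature vectors are available, "closer together" is often measured with cosine similarity. The sketch below does a brute-force lookup over a tiny gallery. The names `l2_normalize`, `cosine_similarity`, and `nearest`, and the toy gallery labels and vectors, are hypothetical. The retrieval method PaddleClas actually uses is covered in its vector search document and is not reproduced here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of unit vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def nearest(query, gallery):
    """Return (label, score) of the gallery entry most similar to `query`."""
    scored = ((label, cosine_similarity(query, feat)) for label, feat in gallery.items())
    return max(scored, key=lambda item: item[1])


# Toy gallery with made-up 2-D features; real features are much longer.
gallery = {"cola": [1.0, 0.0], "tea": [0.0, 1.0]}
label, score = nearest([0.9, 0.1], gallery)
print(label, round(score, 3))
```

A linear scan like this is only practical for small galleries; it is meant to show what "feature similarity" means rather than how a production index is built.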
diff --git a/docs/en/inference_deployment/paddle_serving_deploy_en.md b/docs/en/inference_deployment/paddle_serving_deploy_en.md index 96dd0fd80639123f072d41d00f02e09e121cf55d..6e0364db49ccb8ac6cfd06262903a8cf819da402 100644 --- a/docs/en/inference_deployment/paddle_serving_deploy_en.md +++ b/docs/en/inference_deployment/paddle_serving_deploy_en.md @@ -1,23 +1,25 @@ # Model Service Deployment -- [1. Introduction](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#1) -- [2. Installation of Serving ](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#2) -- [3. Service Deployment for Image Classification](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3) - - [3.1 Model Transformation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3.1) - - [3.2 Service Deployment and Request](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3.2) -- [4. Service Deployment for Image Recognition](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4) - - [4.1 Model Transformation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4.1) - - [4.2 Service Deployment and Request](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4.2) -- [5. FAQ](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#5) - - - +## Content + +- [1. Introduction](#1) +- [2. Installation of Serving](#2) +- [3. Service Deployment for Image Classification](#3) + - [3.1 Model Transformation](#3.1) + - [3.2 Service Deployment and Request](#3.2) +- [4. 
Service Deployment for Image Recognition](#4) + - [4.1 Model Transformation](#4.1) + - [4.2 Service Deployment and Request](#4.2) +- [5. FAQ](#5) + + ## 1. Introduction [Paddle Serving](https://github.com/PaddlePaddle/Serving) is designed to provide easy deployment of on-line prediction services for deep learning developers, it supports one-click deployment of industrial-grade services, highly concurrent and efficient communication between client and server, and multiple programming languages for client development. This section, exemplified by HTTP deployment of prediction service, describes how to deploy model services in PaddleClas with PaddleServing. Currently, only deployment on Linux platform is supported. Windows platform is not supported. + ## 2. Installation of Serving It is officially recommended to use docker for the installation and environment deployment of Serving. First, pull the docker and create a Serving-based one. @@ -41,19 +43,17 @@ pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + Tens ``` - Speed up the installation process by replacing the source with `-i https://pypi.tuna.tsinghua.edu.cn/simple`. -- For other environment configuration and installation, please refer to [Install Paddle Serving using docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md) +- For other environment configuration and installation, please refer to [Install Paddle Serving using docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_EN.md) - To deploy CPU services, please install the CPU version of serving-server with the following command. ``` pip install paddle-serving-server ``` - - + ## 3. Service Deployment for Image Classification - - + ### 3.1 Model Transformation When adopting PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. 
The following part takes the classic ResNet50_vd model as an example to introduce the deployment of image classification service. @@ -118,8 +118,7 @@ fetch_var { } ``` - - + ### 3.2 Service Deployment and Request Paddleserving's directory contains the code to start the pipeline service and send prediction requests, including: @@ -139,7 +138,7 @@ classification_web_service.py # Script for starting the pipeline server python3 classification_web_service.py &>log.txt & ``` -Once the service is successfully started, a log will be printed in log.txt similar to the following [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/start_server.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/start_server.png) +Once the service is successfully started, a log will be printed in log.txt similar to the following ![img](../imgs/start_server.png) - Send request: @@ -148,14 +147,16 @@ Once the service is successfully started, a log will be printed in log.txt simil python3 pipeline_http_client.py ``` -Once the service is successfully started, the prediction results will be printed in the cmd window, see the following example:[![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/results.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/results.png) - +Once the service is successfully started, the prediction results will be printed in the cmd window, see the following example:![img](../imgs/results.png) + ## 4. Service Deployment for Image Recognition When using PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. The following part, exemplified by the ultra-lightweight model for image recognition in PP-ShiTu, details the deployment of image recognition service. 
+ + ## 4.1 Model Transformation - Download inference models for general detection and general recognition @@ -225,8 +226,7 @@ cd ../ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar ``` - - + ## 4.2 Service Deployment and Request **Note:** Since the recognition service involves multiple models, PipeLine is adopted for better performance. This deployment method does not support the windows platform for now. @@ -254,7 +254,7 @@ recognition_web_service.py # Script for starting the pipeline server python3 recognition_web_service.py &>log.txt & ``` -Once the service is successfully started, a log will be printed in log.txt similar to the following [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/start_server_shitu.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/start_server_shitu.png) +Once the service is successfully started, a log will be printed in log.txt similar to the following ![img](../imgs/start_server_shitu.png) - Send request: @@ -262,10 +262,10 @@ Once the service is successfully started, a log will be printed in log.txt simil python3 pipeline_http_client.py ``` -Once the service is successfully started, the prediction results will be printed in the cmd window, see the following example: [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/results_shitu.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/results_shitu.png) - +Once the service is successfully started, the prediction results will be printed in the cmd window, see the following example: ![img](../imgs/results_shitu.png) + ## 5.FAQ **Q1**: After sending a request, no result is returned or the output is prompted with a decoding error.