update en doc

175f682c · stephon · sibo2rr · 61f2dc10 · 175f682c · 175f682c
5 changed files
--- a/docs/en/algorithm_introduction/metric_learning_en.md
+++ b/docs/en/algorithm_introduction/metric_learning_en.md
 # Metric Learning

-## Introduction
+## Contents
+- [1.Introduction](#1)
+- [2.Applications](#2)
+- [3.Algorithms](#3)
+    - [3.1 Classification based](#3.1)
+    - [3.2 Pairwise based](#3.2)
+
+<a name="1"></a>
+## 1.Introduction

 Measuring the distance between data points is a common practice in machine learning. Generally speaking, Euclidean Distance, Inner Product, or Cosine Similarity can be computed directly on data that is already in vector form. However, the same operations can hardly be applied to unstructured data, such as calculating the compatibility between a video and a piece of music. Although such vector operations cannot be performed directly because the data formats differ, prior knowledge tells us that ED(laugh_video, laugh_music) < ED(laugh_video, blue_music). How can this "distance" be characterized effectively? This is exactly the focus of Metric Learning.

-Metric learning, known as Distance Metric Learning, is to automatically construct a task-specific metric function based on training data in the form of machine learning. As shown in the figure below, the goal of Metric learning is to learn a transformation function (either linear or nonlinear) L that maps data points from the original vector space to a new one in which similar points are closer together and non-similar points are further apart, making the metric more task-appropriate. And Deep Metric Learning fits the transformation function by adopting a deep neural network. [![example](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/ml_illustration.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/ml_illustration.jpg)
+Metric Learning, also known as Distance Metric Learning, automatically constructs a task-specific metric function from training data. As shown in the figure below, the goal of Metric Learning is to learn a transformation function L (either linear or nonlinear) that maps data points from the original vector space to a new one in which similar points are closer together and dissimilar points are farther apart, making the metric more task-appropriate. Deep Metric Learning fits the transformation function with a deep neural network. ![example](../../images/ml_illustration.jpg)
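
To make the idea of a learned transformation concrete, here is a minimal sketch in plain Python. The points and the matrix `L` are hypothetical toy values, not taken from PaddleClas: the second coordinate is treated as task-irrelevant noise, and a hand-picked linear `L` shrinks it. In practice `L` would be learned from training data rather than chosen by hand.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def apply_linear(L, x):
    """Apply a linear transformation (a list of matrix rows) to vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in L]


# Hypothetical toy data: coordinate 0 carries the task signal, coordinate 1 is noise.
anchor = [1.0, 0.0]
similar = [1.0, 5.0]      # same signal as the anchor, very different noise
dissimilar = [3.0, 0.0]   # different signal, same noise

# Hand-picked transformation that down-weights the noisy coordinate.
L = [[1.0, 0.0],
     [0.0, 0.1]]

raw_sim = euclidean(anchor, similar)
raw_dis = euclidean(anchor, dissimilar)
new_sim = euclidean(apply_linear(L, anchor), apply_linear(L, similar))
new_dis = euclidean(apply_linear(L, anchor), apply_linear(L, dissimilar))

print("original space:   ", raw_sim, raw_dis)
print("transformed space:", new_sim, new_dis)
```

In the original space the similar point is farther from the anchor than the dissimilar one; after applying `L` the order is reversed, which is the behavior a learned metric aims for.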

-## Applications
+<a name="2"></a>
+## 2.Applications

 Metric Learning technologies are widely applied in real life, such as Face Recognition, Person ReID, Image Retrieval, Fine-grained classification, etc. With the growing prevalence of deep learning in industrial practice, Deep Metric Learning (DML) emerges as the current research direction.

-Normally, DML consists of three parts: a feature extraction network for map embedding, a sampling strategy to combine samples in a mini-batch into multiple sub-sets, and  a loss function to compute the loss on each sub-set. Please refer to the figure below: [![image](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/ml_pipeline.jpg)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/ml_pipeline.jpg)
+Normally, DML consists of three parts: a feature extraction network that maps inputs to embeddings, a sampling strategy that combines the samples in a mini-batch into multiple subsets, and a loss function that computes the loss on each subset. Please refer to the figure below: ![image](../../images/ml_pipeline.jpg)

-## Algorithms
+<a name="3"></a>
+## 3.Algorithms

 Two learning paradigms are adopted in Metric Learning:

-### 1. Classification based:
+<a name="3.1"></a>
+### 3.1 Classification based:

 This refers to methods based on classification labels. They learn an effective feature representation by classifying each sample into the correct category, and they require the explicit label of each sample in the loss calculation during training. Common algorithms include [L2-Softmax](https://arxiv.org/abs/1703.09507), [Large-margin Softmax](https://arxiv.org/abs/1612.02295), [Angular Softmax](https://arxiv.org/pdf/1704.08063.pdf), [NormFace](https://arxiv.org/abs/1704.06369), [AM-Softmax](https://arxiv.org/abs/1801.05599), [CosFace](https://arxiv.org/abs/1801.09414), [ArcFace](https://arxiv.org/abs/1801.07698), etc. These methods are also called proxy-based, because what they optimize is essentially the similarity between a sample and a set of proxies.
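
As an illustration of the proxy-based view, the sketch below computes ArcFace-style logits in plain Python: each class weight vector acts as a proxy, the cosine between the normalized feature and each proxy is scaled, and an additive angular margin is applied to the target class only. The function names, the toy vectors, and the `scale` and `margin` values are assumptions chosen for readability; this is not the PaddleClas `ArcMargin` implementation, and it omits the special handling that real implementations apply when the angle plus the margin exceeds π.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def arc_margin_logits(feature, proxies, label, scale=10.0, margin=0.5):
    """Scaled cosine logits with an additive angular margin on the target class."""
    f = l2_normalize(feature)
    logits = []
    for j, proxy in enumerate(proxies):
        w = l2_normalize(proxy)
        cos_theta = sum(a * b for a, b in zip(f, w))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if j == label:
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample, computed stably."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


# Hypothetical toy example: two classes, feature close to the class-0 proxy.
feature = [1.0, 0.2]
proxies = [[1.0, 0.0], [0.0, 1.0]]

loss_with_margin = cross_entropy(arc_margin_logits(feature, proxies, 0), 0)
loss_no_margin = cross_entropy(arc_margin_logits(feature, proxies, 0, margin=0.0), 0)
print(loss_with_margin, loss_no_margin)
```

The margin makes the target logit smaller, so the same sample incurs a larger loss and the network is pushed to pull features closer to their own proxy.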

-### 2. Pairwise based:
+<a name="3.2"></a>
+### 3.2 Pairwise based:

 This refers to the learning paradigm based on paired samples. It takes sample pairs as input and obtains an effective feature representation by directly learning the similarity between these pairs. Common algorithms include [Contrastive loss](http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf), [Triplet loss](https://arxiv.org/abs/1503.03832), [Lifted-Structure loss](https://arxiv.org/abs/1511.06452), [N-pair loss](https://), [Multi-Similarity loss](https://arxiv.org/pdf/1904.06627.pdf), etc.
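
Two of the pairwise losses listed above can be sketched in a few lines of plain Python. The formulas follow the commonly cited definitions (squared distances in the triplet loss, a margin on dissimilar pairs in the contrastive loss); the function names, margins, and toy embeddings are illustrative assumptions rather than PaddleClas code.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(anchor, positive)^2 - d(anchor, negative)^2 + margin."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def contrastive_loss(a, b, is_similar, margin=2.0):
    """Pull similar pairs together; push dissimilar pairs at least `margin` apart."""
    dist = math.sqrt(squared_distance(a, b))
    if is_similar:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2


# Hypothetical 2-D embeddings.
anchor, near, far = [0.0, 0.0], [1.0, 0.0], [3.0, 0.0]

easy_triplet = triplet_loss(anchor, near, far)   # positive is already closer
hard_triplet = triplet_loss(anchor, far, near)   # positive is farther than negative
print(easy_triplet, hard_triplet)
print(contrastive_loss(anchor, near, True), contrastive_loss(anchor, far, False))
```

A triplet that is already well separated contributes zero loss, which is why the sampling strategy that selects informative pairs or triplets matters in practice.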


--- a/docs/en/data_preparation/classification_dataset_en.md
+++ b/docs/en/data_preparation/classification_dataset_en.md
@@ -6,7 +6,7 @@ This document elaborates on the dataset format adopted by PaddleClas for image c

 ## Contents

- [Dataset Format](#1)
+- [1.Dataset Format](#1)
 - [Common Datasets for Image Classification](#2)
  - [2.1 ImageNet1k](#2.1)
  - [2.2 Flowers102](#2.2)
@@ -16,7 +16,7 @@ This document elaborates on the dataset format adopted by PaddleClas for image c


 <a name="1"></a>
-## 1 Dataset Format
+## 1. Dataset Format

 PaddleClas uses `txt` files to specify the training and test sets. Taking the `ImageNet1k` dataset as an example, `train_list.txt` and `val_list.txt` have the following formats:

@@ -34,7 +34,7 @@ val/ILSVRC2012_val_00000001.JPEG 65


 <a name="2"></a>
-## 2 Common Datasets for Image Classification
+## 2. Common Datasets for Image Classification

 Here we present a compilation of commonly used image classification datasets. The list is updated continuously, and contributions are welcome.


--- a/docs/en/data_preparation/recognition_dataset_en.md
+++ b/docs/en/data_preparation/recognition_dataset_en.md
@@ -6,8 +6,8 @@ This document elaborates on the dataset format adopted by PaddleClas for image r

 ## Contents

- [Dataset Format](#1)
- [Common Datasets for Image Recognition](#2)
+- [1.Dataset Format](#1)
+- [2.Common Datasets for Image Recognition](#2)
  - [2.1 General Datasets](#2.1)
  - [2.2 Vertical Datasets](#2.2)
    - [2.2.1 Animation Character Recognition](#2.2.1)
@@ -17,7 +17,7 @@ This document elaborates on the dataset format adopted by PaddleClas for image r


 <a name="1"></a>
-## 1 Dataset Format
+## 1.Dataset Format

 The dataset for the vector search, unlike those for classification tasks, is divided into the following three parts:

@@ -57,7 +57,7 @@ Each row of data is separated by "space", and the three columns of data stand fo


 <a name="2"></a>
-## 2. Common Datasets for Image Recognition
+## 2.Common Datasets for Image Recognition

 Here we present a compilation of commonly used image recognition datasets. The list is updated continuously, and contributions are welcome.


--- a/docs/en/image_recognition_pipeline/feature_extraction_en.md
+++ b/docs/en/image_recognition_pipeline/feature_extraction_en.md
 # Feature Extraction

-## 1. Introduction
+## Contents

-Feature extraction plays a key role in image recognition, which serves to transform the input image into a fixed dimensional feature vector for subsequent [vector search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/image_recognition_pipeline/vector_search.md). Good features boast great similarity preservation, i.e., in the feature space, pairs of images with high similarity should have higher feature similarity (closer together), and pairs of images with low similarity should have less feature similarity (further apart). [Deep Metric Learning](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/algorithm_introduction/metric_learning.md) is applied to explore how to obtain features with high representational power through deep learning.
+- [1.Introduction](#1)
+- [2.Network Structure](#2)
+- [3.General Recognition Models](#3)
+- [4.Customized Feature Extraction](#4)
+    - [4.1 Data Preparation](#4.1)
+    - [4.2 Model Training](#4.2)
+    - [4.3 Model Evaluation](#4.3)
+    - [4.4 Model Inference](#4.4)

-## 2. Network Structure
+<a name="1"></a>
+## 1.Introduction
+
+Feature extraction plays a key role in image recognition: it transforms the input image into a fixed-dimensional feature vector for subsequent [vector search](./vector_search_en.md). Good features preserve similarity, i.e., in the feature space, pairs of images with high similarity should have higher feature similarity (closer together), and pairs of images with low similarity should have lower feature similarity (farther apart). [Deep Metric Learning](../algorithm_introduction/metric_learning_en.md) is applied to explore how to obtain features with high representational power through deep learning.
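
The similarity-preservation property can be illustrated with a small retrieval sketch: given a query feature and a gallery of features, rank the gallery by cosine similarity. The feature values and names such as `cat_1` are hypothetical, and a real pipeline would use a vector search library rather than this brute-force loop.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def rank_gallery(query, gallery):
    """Return gallery keys sorted by descending cosine similarity to the query."""
    q = l2_normalize(query)
    scores = {}
    for name, feature in gallery.items():
        g = l2_normalize(feature)
        scores[name] = sum(a * b for a, b in zip(q, g))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 3-D features; real features have a much higher dimension.
gallery = {
    "cat_1": [0.9, 0.1, 0.0],
    "cat_2": [0.8, 0.3, 0.1],
    "truck_1": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

ranking = rank_gallery(query, gallery)
print(ranking)
```

With features that preserve similarity, gallery images of the same object as the query appear at the top of the ranking.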
+
+<a name="2"></a>
+## 2.Network Structure

 In order to customize the image recognition task flexibly, the whole network is divided into Backbone, Neck, Head, and Loss. The figure below illustrates the overall structure:

-[![img](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/feature_extraction_framework.png)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/feature_extraction_framework.png)
+![img](../../images/feature_extraction_framework.png)

 Functions of the above modules :

@@ -17,9 +30,10 @@ Functions of the above modules :
 - **Head**: Used to transform features into logits. In addition to the common Fc Layer, cosmargin, arcmargin, circlemargin and other modules are all available choices.
 - **Loss**: Specifies the Loss function to be used. It is designed as a combined form to facilitate the combination of Classification Loss and Pair_wise Loss.
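
The division of labor between the modules can be sketched with toy stand-ins in plain Python. Everything here is hypothetical: the "backbone" merely flattens a tiny image, the weights are hand-written, and the loss values are made up. The sketch only shows how data flows from Backbone to Neck to Head and how a combined loss is formed as a weighted sum.

```python
def linear(weights, x):
    """Apply a weight matrix (list of rows) to vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def backbone(image):
    # Toy stand-in for a CNN: flatten a 2-D "image" into raw features.
    return [pixel for row in image for pixel in row]


def neck(raw_features, weights):
    # Linear layer that projects raw features to the embedding dimension.
    return linear(weights, raw_features)


def head(embedding, class_weights):
    # Plain fully connected layer that produces one logit per class.
    return linear(class_weights, embedding)


def combined_loss(terms):
    """Weighted sum of (weight, loss_value) pairs."""
    return sum(weight * value for weight, value in terms)


image = [[0.0, 1.0], [1.0, 0.0]]
neck_weights = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]   # 4 -> 2
head_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]           # 2 -> 3 classes

embedding = neck(backbone(image), neck_weights)
logits = head(embedding, head_weights)

# Hypothetical loss values: one classification term and one pairwise term.
total = combined_loss([(1.0, 0.80), (0.5, 0.25)])
print(embedding, logits, total)
```

At inference time only the embedding is needed for retrieval; the Head and Loss are used during training.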

-## 3. General Recognition Models
+<a name="3"></a>
+## 3.General Recognition Models

-In PP-Shitu, we have [PP_LCNet_x2_5](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/models/PP-LCNet.md) as the backbone network, Linear Layer for Neck, [ArcMargin](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/arch/gears/arcmargin.py) for Head, and CELoss for Loss. See the details in  [General Recognition_configuration files](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/GeneralRecognition/). The involved training data covers the following seven public datasets:
+In PP-ShiTu, we have [PP_LCNet_x2_5](../models/PP-LCNet.md) as the backbone network, Linear Layer for Neck, [ArcMargin](../../../ppcls/arch/gears/arcmargin.py) for Head, and CELoss for Loss. See the details in [General Recognition configuration files](../../../ppcls/configs/GeneralRecognition/). The involved training data covers the following seven public datasets:

 | Datasets     | Data Size | Class Number | Scenarios          | URL                                                          |
 | ------------ | --------- | ------------ | ------------------ | ------------------------------------------------------------ |
@@ -43,13 +57,15 @@ The results are shown in the table below:
 - Evaluation conditions for the speed metric: MKLDNN enabled, number of threads set to 10
 - Address of the pre-training model: [General recognition pre-training model](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0.pdparams)

-# 4. Customized Feature Extraction
+<a name="4"></a>
+# 4.Customized Feature Extraction

 Customized feature extraction refers to retraining the feature extraction model based on one's own task. It consists of four main steps: 1) data preparation, 2) model training, 3) model evaluation, and 4) model inference.

+<a name="4.1"></a>
 ## 4.1 Data Preparation

-To start with, customize your dataset based on the task (See [Format description](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/data_preparation/recognition_dataset.md#数据集格式说明) for the dataset format). Before initiating the model training, modify the data-related content in the configuration files, including the address of the dataset and the class number. The corresponding locations in configuration files are shown below:
+To start with, customize your dataset based on the task (See [Format description](../data_preparation/recognition_dataset_en.md#1) for the dataset format). Before initiating the model training, modify the data-related content in the configuration files, including the address of the dataset and the class number. The corresponding locations in configuration files are shown below:

 ```
 Head:
@@ -82,6 +98,7 @@ Train:
        cls_label_path: ./dataset/Aliproduct/val_list.txt   # The address of the label file for the gallery dataset
 ```

+<a name="4.2"></a>
 ## 4.2 Model Training

 - Single machine single card training
@@ -112,6 +129,7 @@ python -m paddle.distributed.launch \
    -o Global.checkpoint="output/RecModel/latest"
 ```

+<a name="4.3"></a>
 ## 4.3 Model Evaluation

 - Single Card Evaluation
@@ -135,6 +153,7 @@ python -m paddle.distributed.launch \

 **Recommendation:** It is suggested to employ multi-card evaluation, which can quickly obtain the feature set of the overall dataset using multi-card parallel computing, accelerating the evaluation process.

+<a name="4.4"></a>
 ## 4.4 Model Inference

 Inference consists of two steps: 1) exporting the inference model; 2) obtaining the feature vector.
@@ -158,6 +177,6 @@ python python/predict_rec.py \
 -o Global.rec_inference_model_dir="../inference"
 ```

-The output format of the obtained features is shown in the figure below:[![img](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/feature_extraction_output.png)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/feature_extraction_output.png)
+The output format of the obtained features is shown in the figure below:![img](../../images/feature_extraction_output.png)

-In practical use, however, business operations require more than simply obtaining features. To further perform image recognition by feature retrieval, please refer to the document [vector search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/image_recognition_pipeline/vector_search.md).
+In practical use, however, business operations require more than simply obtaining features. To further perform image recognition by feature retrieval, please refer to the document [vector search](./vector_search_en.md).
--- a/docs/en/inference_deployment/paddle_serving_deploy_en.md
+++ b/docs/en/inference_deployment/paddle_serving_deploy_en.md
 # Model Service Deployment

- [1. Introduction](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#1)
- [2. Installation of Serving ](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#2)
- [3. Service Deployment for Image Classification](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3)
-  - [3.1 Model Transformation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3.1)
-  - [3.2 Service Deployment and Request](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#3.2)
- [4. Service Deployment for Image Recognition](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4)
-  - [4.1 Model Transformation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4.1)
-  - [4.2 Service Deployment and Request](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#4.2)
- [5. FAQ](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_serving_deploy.md#5)
-
-
-
+## Contents
+
+- [1. Introduction](#1)
+- [2. Installation of Serving](#2)
+- [3. Service Deployment for Image Classification](#3)
+  - [3.1 Model Transformation](#3.1)
+  - [3.2 Service Deployment and Request](#3.2)
+- [4. Service Deployment for Image Recognition](#4)
+  - [4.1 Model Transformation](#4.1)
+  - [4.2 Service Deployment and Request](#4.2)
+- [5. FAQ](#5)
+
+<a name="1"></a>
 ## 1. Introduction

 [Paddle Serving](https://github.com/PaddlePaddle/Serving) is designed to make it easy for deep learning developers to deploy online prediction services. It supports one-click deployment of industrial-grade services, highly concurrent and efficient communication between client and server, and multiple programming languages for client development.

 This section, exemplified by HTTP deployment of prediction service, describes how to deploy model services in PaddleClas with PaddleServing. Currently, only deployment on Linux platform is supported. Windows platform is not supported.

+<a name="2"></a>
 ## 2. Installation of Serving

 It is officially recommended to use Docker for the installation and environment deployment of Serving. First, pull the Docker image and create a Serving-based container.
@@ -41,19 +43,17 @@ pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + Tens
 ```

 - Speed up the installation process by replacing the source with `-i https://pypi.tuna.tsinghua.edu.cn/simple`.
- For other environment configuration and installation, please refer to [Install Paddle Serving using docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
+- For other environment configuration and installation, please refer to [Install Paddle Serving using docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_EN.md)
 - To deploy CPU services, please install the CPU version of serving-server with the following command.

 ```
 pip install paddle-serving-server
 ```

-
-
+<a name="3"></a>
 ## 3. Service Deployment for Image Classification

-
-
+<a name="3.1"></a>
 ### 3.1 Model Transformation

 When adopting PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. The following part takes the classic ResNet50_vd model as an example to introduce the deployment of image classification service.
@@ -118,8 +118,7 @@ fetch_var {
 }
 ```

-
-
+<a name="3.2"></a>
 ### 3.2 Service Deployment and Request

 The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
@@ -139,7 +138,7 @@ classification_web_service.py    # Script for starting the pipeline server
 python3 classification_web_service.py &>log.txt &
 ```

-Once the service is successfully started, a log will be printed in log.txt similar to the following [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/start_server.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/start_server.png)
+Once the service has started successfully, a log similar to the following is printed in log.txt: ![img](../../../deploy/paddleserving/imgs/start_server.png)

 - Send request:

@@ -148,14 +147,16 @@ Once the service is successfully started, a log will be printed in log.txt simil
 python3 pipeline_http_client.py
 ```

-Once the service is successfully started, the prediction results will be printed in the cmd window, see the following example:[![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/results.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/results.png)
-
+Once the request has been sent successfully, the prediction results are printed in the terminal; see the following example: ![img](../../../deploy/paddleserving/imgs/results.png)
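
For readers who want to see what such an HTTP request might look like, the sketch below builds and sends one using only the Python standard library. The endpoint URL, the port, and the `key`/`value` JSON layout are assumptions made for illustration; the `pipeline_http_client.py` script and the service configuration in the repository remain the authoritative reference.

```python
import base64
import json
import urllib.request

# Assumed endpoint; check the service configuration for the real host, port, and route.
DEFAULT_URL = "http://127.0.0.1:18080/imagenet/prediction"


def build_payload(image_bytes):
    """Encode raw image bytes as base64 inside a JSON body (assumed schema)."""
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({"key": ["image"], "value": [encoded]}).encode("utf8")


def send_request(image_path, url=DEFAULT_URL, timeout=10):
    """POST one image to the prediction service and return the decoded JSON reply."""
    with open(image_path, "rb") as image_file:
        body = build_payload(image_file.read())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf8"))
```

With the service running, a call such as `send_request("daisy.jpg")` (a hypothetical local image) would return the parsed response; nothing is sent until `send_request` is called.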


+<a name="4"></a>
 ## 4. Service Deployment for Image Recognition

 When using PaddleServing for service deployment, the saved inference model needs to be converted to a Serving model. The following part, exemplified by the ultra-lightweight model for image recognition in PP-ShiTu, details the deployment of image recognition service.

+
+<a name="4.1"></a>
 ### 4.1 Model Transformation

 - Download inference models for general detection and general recognition
@@ -225,8 +226,7 @@ cd ../
 wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar
 ```

-
-
+<a name="4.2"></a>
 ### 4.2 Service Deployment and Request

 **Note:** Since the recognition service involves multiple models, PipeLine is adopted for better performance. This deployment method does not support the windows platform for now.
@@ -254,7 +254,7 @@ recognition_web_service.py    # Script for starting the pipeline server
 python3 recognition_web_service.py &>log.txt &
 ```

-Once the service is successfully started, a log will be printed in log.txt similar to the following  [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/start_server_shitu.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/start_server_shitu.png)
+Once the service has started successfully, a log similar to the following is printed in log.txt: ![img](../../../deploy/paddleserving/imgs/start_server_shitu.png)

 - Send request:

@@ -262,10 +262,10 @@ Once the service is successfully started, a log will be printed in log.txt simil
 python3 pipeline_http_client.py
 ```

-Once the service is successfully started, the prediction results will be printed in the cmd window, see the following example: [![img](https://github.com/PaddlePaddle/PaddleClas/raw/develop/deploy/paddleserving/imgs/results_shitu.png)](https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/paddleserving/imgs/results_shitu.png)
-
+Once the request has been sent successfully, the prediction results are printed in the terminal; see the following example: ![img](../../../deploy/paddleserving/imgs/results_shitu.png)


+<a name="5"></a>
 ## 5.FAQ

 **Q1**: After sending a request, no result is returned or a decoding error is reported.