Merge pull request #906 from cuicheng01/develop

Update configs and README

4a61a3d1 · Wei Shengyu · GitHub · dc0d8bf6 · 4fb05ee4 · 4a61a3d1
12 changed files
--- a/README.md
+++ b/README.md
@@ -55,6 +55,7 @@ The Res2Net200_vd pretrained model reaches a Top-1 accuracy of up to 85.1%.
    - Image classification
        - [ImageNet classification task](./docs/zh_CN/tutorials/quick_start_professional.md)
    - [Feature learning](./docs/zh_CN/application/feature_learning.md)
+        - [Feature learning](./docs/zh_CN/application/feature_learning.md)
        - [Product recognition](./docs/zh_CN/application/product_recognition.md)
        - [Vehicle recognition](./docs/zh_CN/application/vehicle_recognition.md)
        - [Logo recognition](./docs/zh_CN/application/logo_recognition.md)
@@ -62,7 +63,7 @@ The Res2Net200_vd pretrained model reaches a Top-1 accuracy of up to 85.1%.
    - [Vector search](./deploy/vector_search/README.md)
 - Model training/evaluation
    - [Image classification task](./docs/zh_CN/tutorials/getting_started.md)
-    - [Feature learning task](./docs/zh_CN/application/feature_learning.md)
+    - [Feature learning task](./docs/zh_CN/tutorials/getting_started_retrieval.md)
 - Model inference (currently only image classification is supported; image recognition is being updated)
    - [Inference with the Python prediction engine](./docs/zh_CN/tutorials/getting_started.md)
    - [Inference with the C++ prediction engine](./deploy/cpp_infer/readme.md)

--- a/docs/en/advanced_tutorials/distillation/distillation_en.md
+++ b/docs/en/advanced_tutorials/distillation/distillation_en.md
@@ -106,6 +106,14 @@ Finetuning is carried out on ImageNet1k dataset to restore distribution between

 * For image classification tasks, the model accuracy can be further improved when the test scale is 1.15 times that of training[5]. For the 82.99% ResNet50_vd pretrained model, it comes to 83.7% using 320x320 for the evaluation. We use the Fix strategy to finetune the model with the training scale set as 320x320. During the process, the preprocessing pipeline is the same for both training and test. All the weights except the fully connected layer are frozen. Finally, the top-1 accuracy comes to **84.0%**.

+### Some phenomena during the experiment
+
+In the prediction process, the mean and variance of batch norm are obtained by loading the pretrained model (test mode). In the training process, the batch norm statistics are computed from the current batch (train mode) and combined with the saved historical values through a moving average. In the distillation task, we found that guiding the student model with the teacher model in train mode, where the teacher's bn statistics change in real time, gives a better student model than distillation in test mode. The following is a set of experimental results. Therefore, in this distillation scheme, we use train mode to get the soft label of the teacher model.
+
+|Teacher Model | Teacher Top1 | Student Model | Student Top1|
+|- |:-: |:-: | :-: |
+| ResNet50_vd | 82.35% | MobileNetV3_large_x1_0 | 76.00% |
+| ResNet50_vd | 82.35% | MobileNetV3_large_x1_0 | 75.84% |
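
The difference between the two batch norm modes described above can be illustrated with a minimal pure-Python sketch. `ToyBatchNorm` is a hypothetical toy class, not PaddleClas or Paddle code, and the moving-average convention (`running = momentum * running + (1 - momentum) * batch`) and the default `momentum=0.9` are assumptions made for the illustration.

```python
class ToyBatchNorm:
    """1-D batch norm over a list of floats, without scale and shift (toy)."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum  # assumed moving-average factor
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0
        self.training = True  # train mode by default

    def __call__(self, batch):
        if self.training:
            # Train mode: normalize with the statistics of the current batch
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # and fold them into the saved history with a moving average.
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            # Test mode: normalize with the saved statistics only.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


bn = ToyBatchNorm()
batch = [1.0, 2.0, 3.0]
train_out = bn(batch)   # uses this batch's mean and variance
bn.training = False
test_out = bn(batch)    # uses the running mean and variance
```

The same input produces different outputs in the two modes, which is why the teacher's soft labels depend on the mode chosen during distillation.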

 ## Application of the distillation model

@@ -113,7 +121,7 @@ Finetuning is carried out on ImageNet1k dataset to restore distribution between

 * Adjust the learning rate of the middle layer. The middle layer feature map of the model obtained by distillation is more refined. Therefore, when the distillation model is used as the pretrained model in other tasks, if the same learning rate as before is adopted, it is easy to destroy the features. If the learning rate of the overall model training is reduced, it will bring about the problem of slow convergence. Therefore, we use the strategy of adjusting the learning rate of the middle layer. Specifically:
     * For ResNet50_vd, we set up a learning rate list. The three conv2d convolution parameters before the residual block have a uniform learning rate multiple, and the four residual block conv2d have their own learning rate parameters, respectively. 5 values need to be set in the list. In our experiments, when finetuning a classification model by transfer learning, the learning rate list `[0.1,0.1,0.2,0.2,0.3]` performs better in most tasks, while in object detection tasks, `[0.05, 0.05, 0.05, 0.1, 0.15]` can bring greater accuracy gains.
-    * For MobileNetV3_large_1x0, because it contains 15 blocks, we set each 3 blocks to share a learning rate, so 5 learning rate values are required. We find that in classification and detection tasks, the learning rate list with `[0.25, 0.25, 0.5, 0.5, 0.75]` performs better in most tasks.
+    * For MobileNetV3_large_x1_0, because it contains 15 blocks, we set each 3 blocks to share a learning rate, so 5 learning rate values are required. We find that in classification and detection tasks, the learning rate list with `[0.25, 0.25, 0.5, 0.5, 0.75]` performs better in most tasks.
 * Appropriate l2 decay. Different l2 decay values are set for different models during training. In order to prevent overfitting, l2 decay is often set larger for large models. L2 decay is set as `1e-4` for ResNet50, and `1e-5 ~ 4e-5` for MobileNet series models. L2 decay also needs to be adjusted when applied in other tasks. Taking Faster_RCNN_MobileNetV3_FPN as an example, we found that only modifying l2 decay can bring up to 0.5% accuracy (mAP) improvement on the COCO2017 dataset.


@@ -167,54 +175,52 @@ This section will introduce the SSLD distillation experiments in detail based on



-#### Distill ResNet50_vd using ResNeXt101_32x16d_wsl
+#### Distill MobileNetV3_small_x1_0 using MobileNetV3_large_x1_0

-Configuration of distilling `ResNet50_vd` using `ResNeXt101_32x16d_wsl` is as follows.
+An example of SSLD distillation is provided here. The configuration file of `MobileNetV3_large_x1_0` distilling `MobileNetV3_small_x1_0` is provided in `ppcls/configs/ImageNet/Distillation/mv3_large_x1_0_distill_mv3_small_x1_0.yaml`, and the user can directly replace the path of the configuration file in `tools/train.sh` to use it.

-```yaml
-ARCHITECTURE:
-    name: 'ResNeXt101_32x16d_wsl_distill_ResNet50_vd'
-pretrained_model: "./pretrained/ResNeXt101_32x16d_wsl_pretrained/"
-# pretrained_model:
-#     - "./pretrained/ResNeXt101_32x16d_wsl_pretrained/"
-#     - "./pretrained/ResNet50_vd_pretrained/"
-use_distillation: True
-```
-
-#### Distill MobileNetV3_large_x1_0 using ResNet50_vd_ssld
-
-The detailed configuration is as follows.
+Configuration of distilling `MobileNetV3_small_x1_0` using `MobileNetV3_large_x1_0` is as follows.

 ```yaml
-ARCHITECTURE:
-    name: 'ResNet50_vd_distill_MobileNetV3_large_x1_0'
-pretrained_model: "./pretrained/ResNet50_vd_ssld_pretrained/"
-# pretrained_model:
-#     - "./pretrained/ResNet50_vd_ssld_pretrained/"
-#     - "./pretrained/ResNet50_vd_pretrained/"
-use_distillation: True
+Arch:
+  name: "DistillationModel"
+  # if not null, its lengths should be same as models
+  pretrained_list:
+  # if not null, its lengths should be same as models
+  freeze_params_list:
+  - True
+  - False
+  models:
+    - Teacher:
+        name: MobileNetV3_large_x1_0
+        pretrained: True
+        use_ssld: True
+    - Student:
+        name: MobileNetV3_small_x1_0
+        pretrained: False
+
+  infer_model_name: "Student"
 ```

+In the configuration file, `freeze_params_list` specifies whether the parameters of each model are frozen, and `models` specifies the teacher model and the student model. The teacher model needs to load a pretrained model. Users can change the models here directly.
+
 ### Begin to train the network

 If everything is ready, users can begin to train the network using the following command.

 ```bash
-export PYTHONPATH=path_to_PaddleClas:$PYTHONPATH

 python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
-    --log_dir=R50_vd_distill_MV3_large_x1_0 \
+    --log_dir=mv3_large_x1_0_distill_mv3_small_x1_0 \
    tools/train.py \
-        -c ./configs/Distillation/R50_vd_distill_MV3_large_x1_0.yaml
+        -c ./ppcls/configs/ImageNet/Distillation/mv3_large_x1_0_distill_mv3_small_x1_0.yaml
 ```

 ### Note

 * Before using SSLD, users need to train a teacher model on the target dataset firstly. The teacher model is used to guide the training of the student model.

-* When using SSLD, users need to set `use_distillation` in the configuration file to` True`. In addition, because the student model learns soft-label with knowledge information, you need to turn off the `label_smoothing` option.
-
 * If the student model is not loaded with a pretrained model, the other hyperparameters of the training can refer to the hyperparameters trained by the student model on ImageNet-1k. If the student model is loaded with the pre-trained model, the learning rate can be adjusted to `1/100~1/10` of the standard learning rate.

 * In the process of SSLD distillation, the student model only learns the soft label, which makes the training process more difficult. It is recommended that the value of `l2_decay` can be decreased appropriately to obtain higher accuracy of the validation set.
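
The grouping rule for the MobileNetV3_large_x1_0 learning rate list described earlier in this file (15 blocks, every 3 blocks sharing one of 5 values) can be sketched as below. `block_lr_multipliers` is a hypothetical helper written for this illustration, not a PaddleClas function; it only shows how a 5-value list maps onto block indices.

```python
def block_lr_multipliers(num_blocks, lr_list):
    """Map each block index to a learning-rate multiplier.

    Consecutive blocks are split into equal groups, and every group
    shares one value from lr_list.
    """
    if num_blocks % len(lr_list) != 0:
        raise ValueError("num_blocks must be divisible by len(lr_list)")
    group_size = num_blocks // len(lr_list)
    return [lr_list[i // group_size] for i in range(num_blocks)]


# Values quoted in the document for MobileNetV3_large_x1_0 (15 blocks).
mv3_mults = block_lr_multipliers(15, [0.25, 0.25, 0.5, 0.5, 0.75])
print(mv3_mults)
```

Each entry of `mv3_mults` would scale the base learning rate of the corresponding block, so early blocks move more slowly than later ones.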

--- a/docs/en/advanced_tutorials/image_augmentation/ImageAugment_en.md
+++ b/docs/en/advanced_tutorials/image_augmentation/ImageAugment_en.md
@@ -69,10 +69,6 @@ Unlike conventional artificially designed image augmentation methods, AutoAugmen
 In PaddleClas, `AutoAugment` is used as follows.

 ```python
-from ppcls.data.imaug import DecodeImage
-from ppcls.data.imaug import ResizeImage
-from ppcls.data.imaug import ImageNetPolicy
-from ppcls.data.imaug import transform

 size = 224

@@ -107,10 +103,6 @@ In `RandAugment`, the author proposes a random augmentation method. Instead of u
 In PaddleClas, `RandAugment` is used as follows.

 ```python
-from ppcls.data.imaug import DecodeImage
-from ppcls.data.imaug import ResizeImage
-from ppcls.data.imaug import RandAugment
-from ppcls.data.imaug import transform

 size = 224

@@ -153,10 +145,6 @@ Cutout is a kind of dropout, but occludes input image rather than feature map. I
 In PaddleClas, `Cutout` is used as follows.

 ```python
-from ppcls.data.imaug import DecodeImage
-from ppcls.data.imaug import ResizeImage
-from ppcls.data.imaug import Cutout
-from ppcls.data.imaug import transform

 size = 224

@@ -188,11 +176,6 @@ RandomErasing is similar to the Cutout. It is also to solve the problem of poor
 In PaddleClas, `RandomErasing` is used as follows.

 ```python
-from ppcls.data.imaug import DecodeImage
-from ppcls.data.imaug import ResizeImage
-from ppcls.data.imaug import ToCHWImage
-from ppcls.data.imaug import RandomErasing
-from ppcls.data.imaug import transform

 size = 224

@@ -229,11 +212,6 @@ Images are divided into some patches for `HideAndSeek` and masks are generated w
 In PaddleClas, `HideAndSeek` is used as follows.

 ```python
-from ppcls.data.imaug import DecodeImage
-from ppcls.data.imaug import ResizeImage
-from ppcls.data.imaug import ToCHWImage
-from ppcls.data.imaug import HideAndSeek
-from ppcls.data.imaug import transform

 size = 224

@@ -283,11 +261,6 @@ It shows that the second method is better.
 The usage of `GridMask` in PaddleClas is shown below.

 ```python
-from data.imaug import DecodeImage
-from data.imaug import ResizeImage
-from data.imaug import ToCHWImage
-from data.imaug import GridMask
-from data.imaug import transform

 size = 224

@@ -329,11 +302,6 @@ Mixup is the first solution for image aliasing, it is easy to realize and perfor
 The usage of `Mixup` in PaddleClas is shown below.

 ```python
-from ppcls.data.imaug import DecodeImage
-from ppcls.data.imaug import ResizeImage
-from ppcls.data.imaug import ToCHWImage
-from ppcls.data.imaug import transform
-from ppcls.data.imaug import MixupOperator

 size = 224

@@ -373,11 +341,6 @@ Cutmix randomly cuts out an `ROI` from one image, and then covered onto the corr


 ```python
-rom ppcls.data.imaug import DecodeImage
-from ppcls.data.imaug import ResizeImage
-from ppcls.data.imaug import ToCHWImage
-from ppcls.data.imaug import transform
-from ppcls.data.imaug import CutmixOperator

 size = 224

@@ -444,10 +407,9 @@ Configuration of `RandAugment` is shown as follows. `Num_layers`(default as 2) a


 ```yaml
-    transforms:
+      transform_ops:
        - DecodeImage:
            to_rgb: True
-            to_np: False
            channel_first: False
        - RandCropImage:
            size: 224
@@ -457,11 +419,10 @@ Configuration of `RandAugment` is shown as follows. `Num_layers`(default as 2) a
            num_layers: 2
            magnitude: 5
        - NormalizeImage:
-            scale: 1./255.
+            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
-        - ToCHWImage:
 ```

 ### Cutout
@@ -469,24 +430,22 @@ Configuration of `RandAugment` is shown as follows. `Num_layers`(default as 2) a
 Configuration of `Cutout` is shown as follows. `n_holes`(default as 1) and `length`(default as 112) are two hyperparameters.

 ```yaml
-    transforms:
+      transform_ops:
        - DecodeImage:
            to_rgb: True
-            to_np: False
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
-            scale: 1./255.
+            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - Cutout:
            n_holes: 1
            length: 112
-        - ToCHWImage:
 ```
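
To make the two `Cutout` hyperparameters concrete, here is a minimal pure-Python sketch of the idea on a 2-D list. It is an illustration under stated assumptions, not the PaddleClas `Cutout` operator: the hole is centered on a random pixel, clipped at the image border, and filled with a constant, and the function name and `rng` argument are hypothetical.

```python
import random


def cutout(image, n_holes=1, length=112, fill=0, rng=random):
    """Return a copy of a 2-D image with n_holes square regions filled."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # do not modify the input
    for _ in range(n_holes):
        # Pick the hole center, then clip the square to the image border.
        cy, cx = rng.randrange(height), rng.randrange(width)
        y1, y2 = max(0, cy - length // 2), min(height, cy + length // 2)
        x1, x2 = max(0, cx - length // 2), min(width, cx + length // 2)
        for y in range(y1, y2):
            for x in range(x1, x2):
                out[y][x] = fill
    return out


image = [[1] * 8 for _ in range(8)]
masked = cutout(image, n_holes=1, length=4, rng=random.Random(0))
```

With `length=4` on an 8x8 image, the hole covers between 4 and 16 pixels depending on how much of it is clipped at the border.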

 ### Mixup
@@ -495,42 +454,39 @@ Configuration of `Cutout` is shown as follows. `n_holes`(default as 1) and `n_ho
 Configuration of `Mixup` is shown as follows. `alpha`(default as 0.2) is the hyperparameter that users need to care about. What's more, `use_mix` needs to be set as `True` in the root of the configuration.

 ```yaml
-    transforms:
+      transform_ops:
        - DecodeImage:
            to_rgb: True
-            to_np: False
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
-            scale: 1./255.
+            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
-        - ToCHWImage:
-    mix:
+      batch_transform_ops:
        - MixupOperator:
            alpha: 0.2
 ```

-## 启动命令
+## Start training

-Users can use the following command to start the training process, which can also be referred to `tools/run.sh`.
+Users can use the following command to start the training process; `tools/train.sh` can also be used as a reference.

 ```bash
-export PYTHONPATH=path_to_PaddleClas:$PYTHONPATH
-
-python -m paddle.distributed.launch \
+python3 -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
+    --log_dir=ResNet50_Cutout \
    tools/train.py \
-        -c ./configs/DataAugment/ResNet50_Cutout.yaml
+        -c ./ppcls/configs/ImageNet/DataAugment/ResNet50_Cutout.yaml
 ```

 ## Note

-* When using augmentation methods based on image aliasing, users need to set `use_mix` in the configuration file as `True`. In addition, because the label needs to be aliased when the image is aliased, the accuracy of the training data cannot be calculated. The training accuracy rate was not printed during the training process.
+* When using augmentation methods based on image aliasing, the label is aliased together with the image, so the accuracy on the training data cannot be calculated and is not printed during the training process.

 * The training data is more difficult with data augmentation, so the training loss may be larger, the training set accuracy is relatively low, but it has better generalization ability, so the validation set accuracy is relatively higher.
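
The note about aliased labels can be seen in a minimal Mixup sketch. This is an illustration under stated assumptions, not the PaddleClas `MixupOperator`: it mixes a single pair of samples stored as plain lists, draws the weight from a Beta(`alpha`, `alpha`) distribution, and the function name and `rng` argument are hypothetical.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels with one shared weight."""
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return mixed_x, mixed_y, lam


rng = random.Random(0)
mixed_x, mixed_y, lam = mixup(
    [0.0, 0.0], [1.0, 0.0],   # sample 1 and its one-hot label
    [1.0, 1.0], [0.0, 1.0],   # sample 2 and its one-hot label
    rng=rng,
)
```

Because `mixed_y` is a soft distribution over two classes rather than a single class index, there is no single "correct" class to compare a prediction against, which matches the note that training accuracy is not reported.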


--- a/docs/zh_CN/advanced_tutorials/distillation/distillation.md
+++ b/docs/zh_CN/advanced_tutorials/distillation/distillation.md
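@@ -113,7 +113,7 @@ The flowchart of SSLD is shown in the figure below.
 * For image classification tasks, when the test scale is about 1.15 times the training scale, the accuracy can often be further improved without retraining the model[5]. For the 82.99% ResNet50_vd, testing at a scale of 320x320 reaches 83.7%. We further apply the Fix strategy: train at the 320x320 scale with the same data preprocessing as used in prediction, while freezing all parameters except the FC layer. The final accuracy at the 320x320 prediction scale reaches **84.0%**.


-### 3.4 Some issues during the experiments
+### 3.5 Some issues during the experiments

 * In the prediction process, the mean and variance of batch norm are obtained by loading the pretrained model (test mode). In the training process, the batch norm statistics are computed from the current batch (train mode) and combined with the saved history through a moving average. In the distillation task, we found that guiding the student model in train mode, where the teacher model's bn changes in real time, gives a better student model than distillation in test mode. A set of experimental results is given below. Therefore, in this distillation scheme we always use train mode to obtain the soft label of the teacher model.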


--- a/docs/zh_CN/tutorials/getting_started.md
+++ b/docs/zh_CN/tutorials/getting_started.md
@@ -82,11 +82,11 @@ python3 tools/train.py \
    -o Global.device=gpu
 ```

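-The configuration file does not need any modification; just set the `checkpoints` parameter when resuming training. It specifies the path of the checkpoint weights to load, and using it loads the saved checkpoint weights together with the learning rate, optimizer state, and other information.
+The configuration file does not need any modification; just set the `Global.checkpoints` parameter when resuming training. It specifies the path of the checkpoint weights to load, and using it loads the saved checkpoint weights together with the learning rate, optimizer state, and other information.

 **Note**:

-* The `-o Global.checkpoints` parameter does not need to include the suffix of the checkpoint weight file. The training command above generates checkpoint weight files like those shown below during training. To resume training from checkpoint `5`, set `Global.checkpoints` to `"../output/MobileNetV3_large_x1_0/epoch_5"`; PaddleClas fills in the suffix automatically.
+* The `-o Global.checkpoints` parameter does not need to include the suffix of the checkpoint weight file. The training command above generates checkpoint weight files like those shown below during training. To resume training from checkpoint `5`, set `Global.checkpoints` to `"../output/MobileNetV3_large_x1_0/epoch_5"`; PaddleClas fills in the suffix automatically. The file structure under the output directory is as follows: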

    ```shell
    output
@@ -117,7 +117,7 @@ python3 tools/eval.py \

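 Some of the configurable evaluation parameters are described below:
 * `Arch.name`: model name
-* `Global.pretrained_model`: path of the model file to evaluate
+* `Global.pretrained_model`: path of the pretrained model file to evaluate

 **Note:** When loading the model to evaluate, specify the path of the model file without the file suffix; PaddleClas automatically appends the `.pdparams` suffix, as in [1.3 Resuming training](#1.3).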
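
The suffix rule in the note above can be sketched as follows. Both helpers are hypothetical and written only for this illustration; they are not PaddleClas functions. The only fact taken from the document is that the path is given without a suffix and `.pdparams` is appended when the weights are loaded.

```python
def checkpoint_prefix(path, suffix=".pdparams"):
    """Return the suffix-free prefix to pass on the command line."""
    return path[: -len(suffix)] if path.endswith(suffix) else path


def params_file(path, suffix=".pdparams"):
    """Return the weights file that would be read for a given prefix."""
    return checkpoint_prefix(path, suffix) + suffix


prefix = checkpoint_prefix("./output/MobileNetV3_large_x1_0/epoch_5.pdparams")
weights = params_file("./output/MobileNetV3_large_x1_0/epoch_5")
```

Here `prefix` is the value one would pass as `Global.checkpoints` or `Global.pretrained_model`, and `weights` is the file the note says is resolved from it.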

@@ -175,11 +175,10 @@ python3 -m paddle.distributed.launch \
    tools/train.py \
        -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
        -o Global.checkpoints="./output/MobileNetV3_large_x1_0/epoch_5" \
-        -o Optimizer.lr.last_epoch=5 \
        -o Global.device=gpu
 ```

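-The configuration file does not need any modification; just set the `Global.checkpoints` and `Optimizer.lr.last_epoch` parameters during training. This parameter specifies the path of the checkpoint weights to load, and using it loads the saved model weights together with the learning rate, optimizer state, and other information. See [1.3 Resuming training](#1.3) for details.
+The configuration file does not need any modification; just set the `Global.checkpoints` parameter during training. This parameter specifies the path of the checkpoint weights to load, and using it loads the saved model weights together with the learning rate, optimizer state, and other information. See [1.3 Resuming training](#1.3) for details.


 ### 2.4 Model evaluation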
@@ -230,11 +229,6 @@ python3 tools/export_model.py \

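 Here, `Global.pretrained_model` specifies the model file path, which again does not need to include the model file suffix (as in [1.3 Resuming training](#1.3)).

-**Note**:
-1. `--output_path` specifies the folder for the exported inference model. With `--output_path=./inference`, the files `inference.pdiparams`, `inference.pdmodel`, and `inference.pdiparams.info` are generated in the `inference` folder.
-2. The `--img_size` parameter sets the `shape` of the model's input image. The default is `224`, meaning an image size of `224*224`; adjust it as needed. For `Transformer` series models such as `DeiT_***_384` and `ViT_***_384`, mind the model's input size and set `img_size=384`.
-
-
 The command above generates the model structure file (`inference.pdmodel`) and the model weights file (`inference.pdiparams`), which can then be used for inference with the prediction engine:

 Go to the deploy directory: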

--- a/docs/zh_CN/tutorials/quick_start_professional.md
+++ b/docs/zh_CN/tutorials/quick_start_professional.md
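 # Get started with PaddleClas in 30 minutes (advanced edition)

-This is a quick-start tutorial for professional users of PaddleClas on the linux operating system. Based on the CIFAR-100 and NUS-WIDE-SCENE datasets, it gives a quick tour of single-label and multi-label training with different models, loading different pretrained models, the SSLD knowledge distillation scheme, and the effect of data augmentation. Please refer to the [installation guide](install.md) beforehand to set up the environment and clone the PaddleClas code.
+This is a quick-start tutorial for professional users of PaddleClas on the linux operating system. Based on the CIFAR-100 dataset, it gives a quick tour of training different models, loading different pretrained models, the SSLD knowledge distillation scheme, and the effect of data augmentation. Please refer to the [installation guide](install.md) beforehand to set up the environment and clone the PaddleClas code.


 ## 1. Data and model preparation
@@ -125,7 +125,7 @@ python3 -m paddle.distributed.launch \
 ## 4. Knowledge distillation


-PaddleClas includes the self-developed SSLD knowledge distillation scheme; for details, see the [knowledge distillation chapter](../advanced_tutorials/distillation/distillation.md)This section uses knowledge distillation to train the MobileNetV3_large_x1_0 model, with the ResNet50_vd model trained in `Section 2.1.2` as the teacher model. First, save the ResNet50_vd model trained in `Section 2.1.2` to the specified directory with the following script.
+PaddleClas includes the self-developed SSLD knowledge distillation scheme; for details, see the [knowledge distillation chapter](../advanced_tutorials/distillation/distillation.md). This section uses knowledge distillation to train the MobileNetV3_large_x1_0 model, with the ResNet50_vd model trained in `Section 2.1.2` as the teacher model. First, save the ResNet50_vd model trained in `Section 2.1.2` to the specified directory with the following script.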

 ```shell
 mkdir pretrained 

--- a/ppcls/configs/Cartoonface/ResNet50_icartoon.yaml
+++ b/ppcls/configs/Cartoonface/ResNet50_icartoon.yaml
@@ -84,7 +84,7 @@ DataLoader:
        shuffle: True
    loader:
        num_workers: 6
-        use_shared_memory: False
+        use_shared_memory: True
  
  Eval:
    Query:

--- a/ppcls/configs/Logo/ResNet50_ReID.yaml
+++ b/ppcls/configs/Logo/ResNet50_ReID.yaml
@@ -92,7 +92,7 @@ DataLoader:

    loader:
        num_workers: 6
-        use_shared_memory: False
+        use_shared_memory: True
  Eval:
    Query:
      dataset:

--- a/ppcls/configs/Products/ResNet50_vd_Aliproduct.yaml
+++ b/ppcls/configs/Products/ResNet50_vd_Aliproduct.yaml
@@ -66,7 +66,7 @@ DataLoader:
        - DecodeImage:
            to_rgb: True
            channel_first: False
-        - ResizeImage:
+        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1

--- a/ppcls/configs/Products/ResNet50_vd_SOP.yaml
+++ b/ppcls/configs/Products/ResNet50_vd_SOP.yaml
@@ -69,7 +69,7 @@ Optimizer:
 DataLoader:
  Train:
    dataset:
-      name: ImageNetDataset
+      name: VeriWild
      image_root: ./dataset/Stanford_Online_Products/
      cls_label_path: ./dataset/Stanford_Online_Products/train_list.txt
      transform_ops:
@@ -104,7 +104,7 @@ DataLoader:
  Eval:
    Query:
      dataset: 
-        name: ImageNetDataset
+        name: VeriWild
        image_root: ./dataset/Stanford_Online_Products/
        cls_label_path: ./dataset/Stanford_Online_Products/test_list.txt
        transform_ops:
@@ -129,7 +129,7 @@ DataLoader:

    Gallery:
      dataset: 
-        name: ImageNetDataset
+        name: VeriWild
        image_root: ./dataset/Stanford_Online_Products/
        cls_label_path: ./dataset/Stanford_Online_Products/test_list.txt
        transform_ops:

--- a/ppcls/configs/Vehicle/ResNet50_ReID.yaml
+++ b/ppcls/configs/Vehicle/ResNet50_ReID.yaml
@@ -100,7 +100,7 @@ DataLoader:
        shuffle: True
    loader:
        num_workers: 6
-        use_shared_memory: False
+        use_shared_memory: True
  Eval:
    Query:
      dataset: 
@@ -125,7 +125,7 @@ DataLoader:
        shuffle: False
      loader:
        num_workers: 6
-        use_shared_memory: False
+        use_shared_memory: True

    Gallery:
      dataset: 
@@ -150,7 +150,7 @@ DataLoader:
        shuffle: False
      loader:
        num_workers: 6
-        use_shared_memory: False
+        use_shared_memory: True

 Metric:
  Eval:

--- a/ppcls/configs/quick_start/ResNet50_vd_finetune_retrieval.yaml
+++ b/ppcls/configs/quick_start/ResNet50_vd_finetune_retrieval.yaml
-Global:
-  checkpoints: null
-  pretrained_model: null
-  output_dir: "./output/"
-  device: "gpu"
-  class_num: 102
-  save_interval: 1
-  eval_mode: "retrieval"
-  eval_during_train: True
-  eval_interval: 1
-  epochs: 20
-  print_batch_step: 10
-  use_visualdl: False
-  image_shape: [3, 224, 224]
-
-  #inference related
-  save_inference_dir: "./inference"
-
-Arch:
-  name: "RecModel"
-  infer_output_key:  "features"
-  infer_add_softmax: "false"
-  Backbone:
-    name: "ResNet50_vd"
-    pretrained: False
-  BackboneStopLayer: 
-    name: "flatten_0"
-    output_dim: 2048
-  Head:
-    name: "FC"
-    class_num: 102
-    embedding_size: 2048
-
-Loss:
-  Train:
-    - CELoss:
-        weight: 1.0
-  Eval:
-    - CELoss:
-        weight: 1.0
-
-Optimizer:
-  name: Momentum
-  momentum: 0.9
-  lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
-
-DataLoader:
-  Train:
-    dataset:
-        name: ImageNetDataset
-        image_root:  "./dataset/flowers102/"
-        cls_label_path:  "./dataset/flowers102/train_list.txt"
-        transform_ops:
-          - RandCropImage:
-              size: 224
-          - RandFlipImage:
-              flip_code: 1
-          - NormalizeImage:
-              scale: 0.00392157
-              mean: [0.485, 0.456, 0.406]
-              std: [0.229, 0.224, 0.225]
-              order: ''
-    sampler:
-        name: DistributedBatchSampler
-        batch_size: 256
-        drop_last: False
-        shuffle: True
-    loader:
-        num_workers: 6
-        use_shared_memory: False
-  
-  Eval:
-    Query:
-      dataset: 
-          name: ImageNetDataset
-          image_root: "./dataset/flowers102/"
-          cls_label_path: "./dataset/flowers102/val_list.txt"
-          transform_ops:
-            - ResizeImage:
-                resize_short: 256
-            - CropImage:
-                size: 224
-            - NormalizeImage:
-                scale: 0.00392157
-                mean: [0.485, 0.456, 0.406]
-                std: [0.229, 0.224, 0.225]
-                order: ''
-      sampler:
-          name: DistributedBatchSampler
-          batch_size: 512
-          drop_last: False
-          shuffle: False
-      loader:
-          num_workers: 6
-          use_shared_memory: True
-
-    Gallery:
-      dataset: 
-          name: ImageNetDataset
-          image_root: "./dataset/flowers102/"
-          cls_label_path: "./dataset/flowers102/train_list.txt"
-          transform_ops:
-            - ResizeImage:
-                resize_short: 256
-            - CropImage:
-                size: 224
-            - NormalizeImage:
-                scale: 0.00392157
-                mean: [0.485, 0.456, 0.406]
-                std: [0.229, 0.224, 0.225]
-                order: ''
-      sampler:
-          name: DistributedBatchSampler
-          batch_size: 512
-          drop_last: False
-          shuffle: False
-      loader:
-          num_workers: 6
-          use_shared_memory: True
-
-Metric:
-    Train:
-    - TopkAcc:
-        topk: [1, 5]
-    Eval:
-    - Recallk:
-        topk: [1, 10]
-
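
One detail of the removed config is easier to see with a small sketch: it pairs `epochs: 20` with a `Piecewise` schedule whose first boundary is epoch 30. The function below is a hypothetical stand-in for a piecewise schedule, not the PaddleClas implementation, and it assumes that `decay_epochs` are epoch indices and that the value switches exactly at each boundary.

```python
from bisect import bisect_right


def piecewise_lr(epoch, decay_epochs, values):
    """Return the learning rate for an epoch under a piecewise schedule."""
    if len(values) != len(decay_epochs) + 1:
        raise ValueError("values needs one more entry than decay_epochs")
    # Count how many boundaries have been reached; that is the value index.
    return values[bisect_right(decay_epochs, epoch)]


decay_epochs = [30, 60, 90]
values = [0.1, 0.01, 0.001, 0.0001]
lrs_during_run = [piecewise_lr(e, decay_epochs, values) for e in range(20)]
```

Under these assumptions every entry of `lrs_during_run` is `0.1`, so within the 20 configured epochs the schedule would never reach its first decay step.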