diff --git a/deploy/paddleserving/README.md b/deploy/paddleserving/README.md index 75eb3e35b8ffa03bc6ae69db42fffb33bdccaf14..bb34b12989a56944bb5a3b890dc122cd4beba24f 100644 --- a/deploy/paddleserving/README.md +++ b/deploy/paddleserving/README.md @@ -4,9 +4,9 @@ PaddleClas provides two service deployment methods: - Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../../deploy/hubserving/readme_en.md) -- Based on **PaddleServing**: Code path is "`./deploy/paddleserving`". Please follow this tutorial. +- Based on **PaddleServing**: Code path is "`./deploy/paddleserving`". If you prefer the retrieval-based image recognition service, please refer to the [tutorial](./recognition/README.md); if you'd like the image classification service, please follow this tutorial. -# Service deployment based on PaddleServing +# Image Classification Service deployment based on PaddleServing This document will introduce how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the ResNet50_vd model as a pipeline online service. @@ -131,7 +131,7 @@ fetch_var { config.yml # configuration file of starting the service pipeline_http_client.py # script to send pipeline prediction request by http pipeline_rpc_client.py # script to send pipeline prediction request by rpc - resnet50_web_service.py # start the script of the pipeline server + classification_web_service.py # start the script of the pipeline server ``` 2. Run the following command to start the service. @@ -147,7 +147,7 @@ fetch_var { python3 pipeline_http_client.py ``` After successfully running, the predicted result of the model will be printed in the cmd window. An example of the result is: - ![](./imgs/results.png) + ![](./imgs/results.png) Adjust the number of concurrency in config.yml to get the largest QPS. 
diff --git a/deploy/paddleserving/README_CN.md b/deploy/paddleserving/README_CN.md index 3394ae5b5a75c774858fb50e429d083f8a19fc07..02ee2093d901251a20cdf67261b0fb882d2736fd 100644 --- a/deploy/paddleserving/README_CN.md +++ b/deploy/paddleserving/README_CN.md @@ -4,9 +4,9 @@ PaddleClas提供2种服务部署方式: - 基于PaddleHub Serving的部署:代码路径为"`./deploy/hubserving`",使用方法参考[文档](../../deploy/hubserving/readme.md); -- 基于PaddleServing的部署:代码路径为"`./deploy/paddleserving`",按照本教程使用。 +- 基于PaddleServing的部署:代码路径为"`./deploy/paddleserving`", 基于检索方式的图像识别服务参考[文档](./recognition/README_CN.md), 图像分类服务按照本教程使用。 -# 基于PaddleServing的服务部署 +# 基于PaddleServing的图像分类服务部署 本文档以经典的ResNet50_vd模型为例,介绍如何使用[PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md)工具部署PaddleClas 动态图模型的pipeline在线服务。 @@ -127,7 +127,7 @@ fetch_var { config.yml # 启动服务的配置文件 pipeline_http_client.py # http方式发送pipeline预测请求的脚本 pipeline_rpc_client.py # rpc方式发送pipeline预测请求的脚本 - resnet50_web_service.py # 启动pipeline服务端的脚本 + classification_web_service.py # 启动pipeline服务端的脚本 ``` 2. 
启动服务可运行如下命令: diff --git a/deploy/paddleserving/imgs/results_recog.png b/deploy/paddleserving/imgs/results_recog.png new file mode 100644 index 0000000000000000000000000000000000000000..37393d5d64e84de469d78dcc9fad88aa771f57f8 Binary files /dev/null and b/deploy/paddleserving/imgs/results_recog.png differ diff --git a/deploy/paddleserving/imgs/start_server_recog.png b/deploy/paddleserving/imgs/start_server_recog.png new file mode 100644 index 0000000000000000000000000000000000000000..d4344a1e6bdab7ccc4c3c31bc16d1e3186b9b806 Binary files /dev/null and b/deploy/paddleserving/imgs/start_server_recog.png differ diff --git a/deploy/paddleserving/recognition/README.md b/deploy/paddleserving/recognition/README.md new file mode 100644 index 0000000000000000000000000000000000000000..0ece4fbd469840b6f2d29f455cdc7b0dc826739e --- /dev/null +++ b/deploy/paddleserving/recognition/README.md @@ -0,0 +1,176 @@ +# Product Recognition Service deployment based on PaddleServing + +(English|[简体中文](./README_CN.md)) + +This document will introduce how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the product recognition model based on retrieval method as a pipeline online service. + +Some Key Features of Paddle Serving: +- Integrate with Paddle training pipeline seamlessly, most paddle models can be deployed with one line command. +- Industrial serving features supported, such as models management, online loading, online A/B testing etc. +- Highly concurrent and efficient communication between clients and servers supported. + +The introduction and tutorial of Paddle Serving service deployment framework reference [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md). 
+ +## Contents +- [Environmental preparation](#environmental-preparation) +- [Model conversion](#model-conversion) +- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment) +- [FAQ](#faq) + + +## Environmental preparation + +PaddleClas operating environment and PaddleServing operating environment are needed. + +1. Please prepare PaddleClas operating environment reference [link](../../docs/zh_CN/tutorials/install.md). + Download the corresponding paddle whl package according to the environment, it is recommended to install version 2.1.0. + +2. The steps of PaddleServing operating environment prepare are as follows: + + Install serving which used to start the service + ``` + pip3 install paddle-serving-server==0.6.1 # for CPU + pip3 install paddle-serving-server-gpu==0.6.1 # for GPU + # Other GPU environments need to confirm the environment and then choose to execute the following commands + pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6 + pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7 + ``` + +3. Install the client to send requests to the service + In [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md) find the client installation package corresponding to the python version. + The python3.7 version is recommended here: + + ``` + wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl + pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl + ``` + +4. Install serving-app + ``` + pip3 install paddle-serving-app==0.6.1 + ``` + + **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md). + + + +## Model conversion +When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy. 
+The following assumes that the current working directory is the PaddleClas root directory + +Firstly, download the inference model of ResNet50_vd +``` +cd deploy +# Download and unzip the ResNet50_vd model +wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar +cd models +tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar +``` + +Then, you can use the installed paddle_serving_client tool to convert the inference model to a serving model. +``` +# Product recognition model conversion +python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \ + --model_filename inference.pdmodel \ + --params_filename inference.pdiparams \ + --serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \ + --serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/ +``` + +After the ResNet50_vd inference model is converted, there will be additional folders of `product_ResNet50_vd_aliproduct_v1.0_serving` and `product_ResNet50_vd_aliproduct_v1.0_client` in the current folder, with the following format: +``` +|- product_ResNet50_vd_aliproduct_v1.0_serving/ + |- __model__ + |- __params__ + |- serving_server_conf.prototxt + |- serving_server_conf.stream.prototxt + +|- product_ResNet50_vd_aliproduct_v1.0_client + |- serving_client_conf.prototxt + |- serving_client_conf.stream.prototxt +``` + +Once you have the model file for deployment, you need to change the alias name in `serving_server_conf.prototxt`: change `alias_name` in `fetch_var` to `features`. +The modified serving_server_conf.prototxt file is as follows: +``` +feed_var { + name: "x" + alias_name: "x" + is_lod_tensor: false + feed_type: 1 + shape: 3 + shape: 224 + shape: 224 +} +fetch_var { + name: "save_infer_model/scale_0.tmp_1" + alias_name: "features" + is_lod_tensor: true + fetch_type: 1 + shape: -1 +} +``` + +Next, download and unpack the built index of product gallery +``` +cd ../ +wget 
https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar +``` + + + +## Paddle Serving pipeline deployment + +1. Download the PaddleClas code, if you have already downloaded it, you can skip this step. + ``` + git clone https://github.com/PaddlePaddle/PaddleClas + + # Enter the working directory + cd PaddleClas/deploy/paddleserving/recognition + ``` + + The paddleserving directory contains the code to start the pipeline service and send prediction requests, including: + ``` + __init__.py + config.yml # configuration file of starting the service + pipeline_http_client.py # script to send pipeline prediction request by http + pipeline_rpc_client.py # script to send pipeline prediction request by rpc + recognition_web_service.py # start the script of the pipeline server + ``` + +2. Run the following command to start the service. + ``` + # Start the service and save the running log in log.txt + python3 recognition_web_service.py &>log.txt & + ``` + After the service is successfully started, a log similar to the following will be printed in log.txt + ![](../imgs/start_server_recog.png) + +3. Send service request + ``` + python3 pipeline_http_client.py + ``` + After successfully running, the predicted result of the model will be printed in the cmd window. An example of the result is: + ![](../imgs/results_recog.png) + + Adjust the number of concurrency in config.yml to get the largest QPS. + + ``` + op: + concurrency: 8 + ... + ``` + + Multiple service requests can be sent at the same time if necessary. + + The predicted performance data will be automatically written into the `PipelineServingLogs/pipeline.tracer` file. + + +## FAQ +**Q1**: No result return after sending the request. + +**A1**: Do not set the proxy when starting the service and sending the request. You can close the proxy before starting the service and before sending the request. 
The command to close the proxy is: +``` +unset https_proxy +unset http_proxy +``` diff --git a/deploy/paddleserving/recognition/README_CN.md b/deploy/paddleserving/recognition/README_CN.md new file mode 100644 index 0000000000000000000000000000000000000000..58efd6ba7f283fa183fed49a99d84d154cace969 --- /dev/null +++ b/deploy/paddleserving/recognition/README_CN.md @@ -0,0 +1,172 @@ +# 基于PaddleServing的商品识别服务部署 + +([English](./README.md)|简体中文) + +本文以商品识别为例,介绍如何使用[PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md)工具部署PaddleClas动态图模型的pipeline在线服务。 + +相比较于hubserving部署,PaddleServing具备以下优点: +- 支持客户端和服务端之间高并发和高效通信 +- 支持 工业级的服务能力 例如模型管理,在线加载,在线A/B测试等 +- 支持 多种编程语言 开发客户端,例如C++, Python和Java + +更多有关PaddleServing服务化部署框架介绍和使用教程参考[文档](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md)。 + +## 目录 +- [环境准备](#环境准备) +- [模型转换](#模型转换) +- [Paddle Serving pipeline部署](#部署) +- [FAQ](#FAQ) + + +## 环境准备 + +需要准备PaddleClas的运行环境和PaddleServing的运行环境。 + +- 准备PaddleClas的[运行环境](../../docs/zh_CN/tutorials/install.md), 根据环境下载对应的paddle whl包,推荐安装2.1.0版本 + +- 准备PaddleServing的运行环境,步骤如下 + +1. 安装serving,用于启动服务 + ``` + pip3 install paddle-serving-server==0.6.1 # for CPU + pip3 install paddle-serving-server-gpu==0.6.1 # for GPU + # 其他GPU环境需要确认环境再选择执行如下命令 + pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6 + pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7 + ``` + +2. 安装client,用于向服务发送请求 + 在[下载链接](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md)中找到对应python版本的client安装包,这里推荐python3.7版本: + + ``` + wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl + pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl + ``` + +3. 
安装serving-app + ``` + pip3 install paddle-serving-app==0.6.1 + ``` + **Note:** 如果要安装最新版本的PaddleServing参考[链接](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md)。 + + +## 模型转换 + +使用PaddleServing做服务化部署时,需要将保存的inference模型转换为serving易于部署的模型。 +以下内容假定当前工作目录为PaddleClas根目录。 + +首先,下载商品识别的inference模型 +``` +cd deploy + +# 下载并解压商品识别模型 +wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar +cd models +tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar +``` + +接下来,用安装的paddle_serving_client把下载的inference模型转换成易于server部署的模型格式。 + +``` +# 转换商品识别模型 +python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \ + --model_filename inference.pdmodel \ + --params_filename inference.pdiparams \ + --serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \ + --serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/ +``` +商品识别推理模型转换完成后,会在当前文件夹多出`product_ResNet50_vd_aliproduct_v1.0_serving` 和`product_ResNet50_vd_aliproduct_v1.0_client`的文件夹,具备如下格式: +``` +|- product_ResNet50_vd_aliproduct_v1.0_serving/ + |- __model__ + |- __params__ + |- serving_server_conf.prototxt + |- serving_server_conf.stream.prototxt + +|- product_ResNet50_vd_aliproduct_v1.0_client + |- serving_client_conf.prototxt + |- serving_client_conf.stream.prototxt + +``` +得到模型文件之后,需要修改serving_server_conf.prototxt中的alias名字: 将`fetch_var`中的`alias_name`改为`features`, +修改后的serving_server_conf.prototxt内容如下: +``` +feed_var { + name: "x" + alias_name: "x" + is_lod_tensor: false + feed_type: 1 + shape: 3 + shape: 224 + shape: 224 +} +fetch_var { + name: "save_infer_model/scale_0.tmp_1" + alias_name: "features" + is_lod_tensor: true + fetch_type: 1 + shape: -1 +} +``` + +接下来,下载并解压已经构建后的商品库index +``` +cd ../ +wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar +``` + + + +## Paddle Serving 
pipeline部署 + +1. 下载PaddleClas代码,若已下载可跳过此步骤 + ``` + git clone https://github.com/PaddlePaddle/PaddleClas + + # 进入到工作目录 + cd PaddleClas/deploy/paddleserving/recognition + ``` + paddleserving目录包含启动pipeline服务和发送预测请求的代码,包括: + ``` + __init__.py + config.yml # 启动服务的配置文件 + pipeline_http_client.py # http方式发送pipeline预测请求的脚本 + pipeline_rpc_client.py # rpc方式发送pipeline预测请求的脚本 + recognition_web_service.py # 启动pipeline服务端的脚本 + ``` + +2. 启动服务可运行如下命令: + ``` + # 启动服务,运行日志保存在log.txt + python3 recognition_web_service.py &>log.txt & + ``` + 成功启动服务后,log.txt中会打印类似如下日志 + ![](../imgs/start_server_recog.png) + +3. 发送服务请求: + ``` + python3 pipeline_http_client.py + ``` + 成功运行后,模型预测的结果会打印在cmd窗口中,结果示例为: + ![](../imgs/results_recog.png) + + 调整 config.yml 中的并发个数可以获得最大的QPS + ``` + op: + #并发数,is_thread_op=True时,为线程并发;否则为进程并发 + concurrency: 8 + ... + ``` + 有需要的话可以同时发送多个服务请求 + + 预测性能数据会被自动写入 `PipelineServingLogs/pipeline.tracer` 文件中。 + + +## FAQ +**Q1**: 发送请求后没有结果返回或者提示输出解码报错 + +**A1**: 启动服务和发送请求时不要设置代理,可以在启动服务前和发送请求前关闭代理,关闭代理的命令是: +``` +unset https_proxy +unset http_proxy +``` diff --git a/deploy/paddleserving/recognition/__init__.py b/deploy/paddleserving/recognition/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/deploy/paddleserving/recognition/config.yml b/deploy/paddleserving/recognition/config.yml new file mode 100644 index 0000000000000000000000000000000000000000..f67ee5521b5135b28f0b945cf800f477e18cc787 --- /dev/null +++ b/deploy/paddleserving/recognition/config.yml @@ -0,0 +1,43 @@ +#worker_num, 最大并发数。当build_dag_each_worker=True时, 框架会创建worker_num个进程,每个进程内构建grpcSever和DAG +##当build_dag_each_worker=False时,框架会设置主线程grpc线程池的max_workers=worker_num +worker_num: 1 + +#http端口, rpc_port和http_port不允许同时为空。当rpc_port可用且http_port为空时,不自动生成http_port +http_port: 18081 +rpc_port: 9994 + +dag: + #op资源类型, True, 为线程模型;False,为进程模型 + is_thread_op: False +op: + rec: + #并发数,is_thread_op=True时,为线程并发;否则为进程并发 + concurrency: 1 + + 
#当op配置没有server_endpoints时,从local_service_conf读取本地服务配置 + local_service_conf: + + #uci模型路径 + model_config: ../../models/product_ResNet50_vd_aliproduct_v1.0_serving + + #计算硬件类型: 空缺时由devices决定(CPU/GPU),0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu + device_type: 1 + + #计算硬件ID,当devices为""或不写时为CPU预测;当devices为"0", "0,1,2"时为GPU预测,表示使用的GPU卡 + devices: "0" # "0,1" + + #client类型,包括brpc, grpc和local_predictor.local_predictor不启动Serving服务,进程内预测 + client_type: local_predictor + + #Fetch结果列表,以client_config中fetch_var的alias_name为准 + fetch_list: ["features"] + + det: + concurrency: 1 + local_service_conf: + client_type: local_predictor + device_type: 1 + devices: '0' + fetch_list: + - save_infer_model/scale_0.tmp_1 + model_config: ../../models/ppyolov2_r50vd_dcn_mainbody_v1.0_serving/ \ No newline at end of file diff --git a/deploy/paddleserving/recognition/daoxiangcunjinzhubing_6.jpg b/deploy/paddleserving/recognition/daoxiangcunjinzhubing_6.jpg new file mode 100644 index 0000000000000000000000000000000000000000..fc64a9531db0829d42b51e888361fa697afd080f Binary files /dev/null and b/deploy/paddleserving/recognition/daoxiangcunjinzhubing_6.jpg differ diff --git a/deploy/paddleserving/recognition/label_list.txt b/deploy/paddleserving/recognition/label_list.txt new file mode 100644 index 0000000000000000000000000000000000000000..35e26a622d788cfd6ece09ad4712f61936fcb5c5 --- /dev/null +++ b/deploy/paddleserving/recognition/label_list.txt @@ -0,0 +1,2 @@ +foreground +background \ No newline at end of file diff --git a/deploy/paddleserving/recognition/pipeline_http_client.py b/deploy/paddleserving/recognition/pipeline_http_client.py new file mode 100644 index 0000000000000000000000000000000000000000..aa0cb5429b7a5ef06567a1655d6f5d89f489ecea --- /dev/null +++ b/deploy/paddleserving/recognition/pipeline_http_client.py @@ -0,0 +1,21 @@ +import requests +import json +import base64 +import os + +imgpath = "daoxiangcunjinzhubing_6.jpg" + +def cv2_to_base64(image): + return 
base64.b64encode(image).decode('utf8') + +if __name__ == "__main__": + url = "http://127.0.0.1:18081/recognition/prediction" + + with open(os.path.join(".", imgpath), 'rb') as file: + image_data1 = file.read() + image = cv2_to_base64(image_data1) + data = {"key": ["image"], "value": [image]} + + for i in range(1): + r = requests.post(url=url, data=json.dumps(data)) + print(r.json()) diff --git a/deploy/paddleserving/recognition/pipeline_rpc_client.py b/deploy/paddleserving/recognition/pipeline_rpc_client.py new file mode 100644 index 0000000000000000000000000000000000000000..8a3257dc0af899b8fa55b667c15b346bcdc75009 --- /dev/null +++ b/deploy/paddleserving/recognition/pipeline_rpc_client.py @@ -0,0 +1,34 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+try: + from paddle_serving_server_gpu.pipeline import PipelineClient +except ImportError: + from paddle_serving_server.pipeline import PipelineClient +import base64 + +client = PipelineClient() +client.connect(['127.0.0.1:9994']) +imgpath = "daoxiangcunjinzhubing_6.jpg" + +def cv2_to_base64(image): + return base64.b64encode(image).decode('utf8') + +if __name__ == "__main__": + with open(imgpath, 'rb') as file: + image_data = file.read() + image = cv2_to_base64(image_data) + + for i in range(1): + ret = client.predict(feed_dict={"image": image}, fetch=["result"]) + print(ret) diff --git a/deploy/paddleserving/recognition/recognition_web_service.py b/deploy/paddleserving/recognition/recognition_web_service.py new file mode 100644 index 0000000000000000000000000000000000000000..88daf96e6ba2ea9b5bf030f2a2bd83c6d645bd5e --- /dev/null +++ b/deploy/paddleserving/recognition/recognition_web_service.py @@ -0,0 +1,198 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+from paddle_serving_server.web_service import WebService, Op +import logging +import numpy as np +import sys +import cv2 +from paddle_serving_app.reader import * +import base64 +import os +import faiss +import pickle +import json + +class DetOp(Op): + def init_op(self): + self.img_preprocess = Sequential([ + BGR2RGB(), Div(255.0), + Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False), + Resize((640, 640)), Transpose((2, 0, 1)) + ]) + + self.img_postprocess = RCNNPostprocess("label_list.txt", "output") + self.threshold = 0.2 + self.max_det_results = 5 + + def generate_scale(self, im): + """ + Args: + im (np.ndarray): image (np.ndarray) + Returns: + im_scale_x: the resize ratio of X + im_scale_y: the resize ratio of Y + """ + target_size = [640, 640] + origin_shape = im.shape[:2] + resize_h, resize_w = target_size + im_scale_y = resize_h / float(origin_shape[0]) + im_scale_x = resize_w / float(origin_shape[1]) + return im_scale_y, im_scale_x + + def preprocess(self, input_dicts, data_id, log_id): + (_, input_dict), = input_dicts.items() + imgs = [] + raw_imgs = [] + for key in input_dict.keys(): + data = base64.b64decode(input_dict[key].encode('utf8')) + raw_imgs.append(data) + data = np.fromstring(data, np.uint8) + raw_im = cv2.imdecode(data, cv2.IMREAD_COLOR) + + im_scale_y, im_scale_x = self.generate_scale(raw_im) + im = self.img_preprocess(raw_im) + + imgs.append({ + "image": im[np.newaxis, :], + "im_shape": np.array(list(im.shape[1:])).reshape(-1)[np.newaxis,:], + "scale_factor": np.array([im_scale_y, im_scale_x]).astype('float32'), + }) + self.raw_img = raw_imgs + + feed_dict = { + "image": np.concatenate([x["image"] for x in imgs], axis=0), + "im_shape": np.concatenate([x["im_shape"] for x in imgs], axis=0), + "scale_factor": np.concatenate([x["scale_factor"] for x in imgs], axis=0) + } + return feed_dict, False, None, "" + + def postprocess(self, input_dicts, fetch_dict, log_id): + boxes = self.img_postprocess(fetch_dict, visualize=False) + 
boxes.sort(key = lambda x: x["score"], reverse = True) + boxes = filter(lambda x: x["score"] >= self.threshold, boxes[:self.max_det_results]) + boxes = list(boxes) + for i in range(len(boxes)): + boxes[i]["bbox"][2] += boxes[i]["bbox"][0] - 1 + boxes[i]["bbox"][3] += boxes[i]["bbox"][1] - 1 + result = json.dumps(boxes) + res_dict = {"bbox_result": result, "image": self.raw_img} + return res_dict, None, "" + +class RecOp(Op): + def init_op(self): + self.seq = Sequential([ + BGR2RGB(), Resize((224, 224)), + Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], + False), Transpose((2, 0, 1)) + ]) + + index_dir = "../../recognition_demo_data_v1.1/gallery_product/index" + assert os.path.exists(os.path.join( + index_dir, "vector.index")), "vector.index not found ..." + assert os.path.exists(os.path.join( + index_dir, "id_map.pkl")), "id_map.pkl not found ... " + + self.searcher = faiss.read_index( + os.path.join(index_dir, "vector.index")) + + with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd: + self.id_map = pickle.load(fd) + + self.rec_nms_thresold = 0.05 + self.rec_score_thres = 0.5 + self.feature_normalize = True + self.return_k = 1 + + def preprocess(self, input_dicts, data_id, log_id): + (_, input_dict), = input_dicts.items() + raw_img = input_dict["image"][0] + data = np.frombuffer(raw_img, np.uint8) + origin_img = cv2.imdecode(data, cv2.IMREAD_COLOR) + dt_boxes = input_dict["bbox_result"] + boxes = json.loads(dt_boxes) + boxes.append({"category_id": 0, + "score": 1.0, + "bbox": [0, 0, origin_img.shape[1], origin_img.shape[0]] + }) + self.det_boxes = boxes + + #construct batch images for rec + imgs = [] + for box in boxes: + box = [int(x) for x in box["bbox"]] + im = origin_img[box[1]: box[3], box[0]: box[2]].copy() + img = self.seq(im) + imgs.append(img[np.newaxis, :].copy()) + + input_imgs = np.concatenate(imgs, axis=0) + return {"x": input_imgs}, False, None, "" + + def nms_to_rec_results(self, results, thresh = 0.1): + filtered_results = 
[] + x1 = np.array([r["bbox"][0] for r in results]).astype("float32") + y1 = np.array([r["bbox"][1] for r in results]).astype("float32") + x2 = np.array([r["bbox"][2] for r in results]).astype("float32") + y2 = np.array([r["bbox"][3] for r in results]).astype("float32") + scores = np.array([r["rec_scores"] for r in results]) + + areas = (x2 - x1 + 1) * (y2 - y1 + 1) + order = scores.argsort()[::-1] + while order.size > 0: + i = order[0] + xx1 = np.maximum(x1[i], x1[order[1:]]) + yy1 = np.maximum(y1[i], y1[order[1:]]) + xx2 = np.minimum(x2[i], x2[order[1:]]) + yy2 = np.minimum(y2[i], y2[order[1:]]) + + w = np.maximum(0.0, xx2 - xx1 + 1) + h = np.maximum(0.0, yy2 - yy1 + 1) + inter = w * h + ovr = inter / (areas[i] + areas[order[1:]] - inter) + inds = np.where(ovr <= thresh)[0] + order = order[inds + 1] + filtered_results.append(results[i]) + return filtered_results + + def postprocess(self, input_dicts, fetch_dict, log_id): + batch_features = fetch_dict["features"] + + if self.feature_normalize: + feas_norm = np.sqrt( + np.sum(np.square(batch_features), axis=1, keepdims=True)) + batch_features = np.divide(batch_features, feas_norm) + + scores, docs = self.searcher.search(batch_features, self.return_k) + + results = [] + for i in range(scores.shape[0]): + pred = {} + if scores[i][0] >= self.rec_score_thres: + pred["bbox"] = [int(x) for x in self.det_boxes[i]["bbox"]] + pred["rec_docs"] = self.id_map[docs[i][0]].split()[1] + pred["rec_scores"] = scores[i][0] + results.append(pred) + + #do nms + results = self.nms_to_rec_results(results, self.rec_nms_thresold) + return {"result": str(results)}, None, "" + +class RecognitionService(WebService): + def get_pipeline_response(self, read_op): + det_op = DetOp(name="det", input_ops=[read_op]) + rec_op = RecOp(name="rec", input_ops=[det_op]) + return rec_op + +product_recog_service = RecognitionService(name="recognition") +product_recog_service.prepare_pipeline_config("config.yml") +product_recog_service.run_service() diff 
--git a/deploy/python/preprocess.py b/deploy/python/preprocess.py index aaa0bef3559bbd76d90d6a95389dcc4178d69b2e..5d7fc929675267100f324d014c1e3a4c630b975f 100644 --- a/deploy/python/preprocess.py +++ b/deploy/python/preprocess.py @@ -78,6 +78,9 @@ class UnifiedResize(object): if backend.lower() == "cv2": if isinstance(interpolation, str): interpolation = _cv2_interp_from_str[interpolation.lower()] + # compatible with opencv < version 4.4.0 + elif not interpolation: + interpolation = cv2.INTER_LINEAR self.resize_func = partial(cv2.resize, interpolation=interpolation) elif backend.lower() == "pil": if isinstance(interpolation, str): diff --git a/docs/en/tutorials/getting_started_en.md b/docs/en/tutorials/getting_started_en.md index 1903a04ef883c8f13239abf47899199d866f0dcd..fe8dca132ca1fe296c8663ed8d233a47acba7313 100644 --- a/docs/en/tutorials/getting_started_en.md +++ b/docs/en/tutorials/getting_started_en.md @@ -14,13 +14,13 @@ After preparing the configuration file, The training process can be started in t ``` python tools/train.py \ - -c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o pretrained_model="" \ - -o use_gpu=False + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Arch.pretrained=False \ + -o Global.device=gpu ``` -Among them, `-c` is used to specify the path of the configuration file, `-o` is used to specify the parameters needed to be modified or added, `-o pretrained_model=""` means to not using pre-trained models. -`-o use_gpu=True` means to use GPU for training. If you want to use the CPU for training, you need to set `use_gpu` to `False`. +Among them, `-c` is used to specify the path of the configuration file, `-o` is used to specify the parameters needed to be modified or added, `-o Arch.pretrained=False` means to not using pre-trained models. +`-o Global.device=gpu` means to use GPU for training. If you want to use the CPU for training, you need to set `Global.device` to `cpu`. 
Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to [Configuration Document](config_description_en.md). @@ -54,12 +54,12 @@ After configuring the configuration file, you can finetune it by loading the pre ``` python tools/train.py \ - -c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o pretrained_model="./pretrained/MobileNetV3_large_x1_0_pretrained" \ - -o use_gpu=True + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Arch.pretrained=True \ + -o Global.device=gpu ``` -Among them, `-o pretrained_model` is used to set the address to load the pretrained weights. When using it, you need to replace it with your own pretrained weights' path, or you can modify the path directly in the configuration file. +Among them, `-o Arch.pretrained` is used to set the address to load the pretrained weights. When using it, you need to replace it with your own pretrained weights' path, or you can modify the path directly in the configuration file. You can also set it into `True` to use pretrained weights that trained in ImageNet1k. We also provide a lot of pre-trained models trained on the ImageNet-1k dataset. For the model list and download address, please refer to the [model library overview](../models/models_intro_en.md). @@ -69,28 +69,26 @@ If the training process is terminated for some reasons, you can also load the ch ``` python tools/train.py \ - -c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o checkpoints="./output/MobileNetV3_large_x1_0/5/ppcls" \ - -o last_epoch=5 \ - -o use_gpu=True + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Global.checkpoints="./output/MobileNetV3_large_x1_0/epoch_5" \ + -o Global.device=gpu ``` -The configuration file does not need to be modified. You only need to add the `checkpoints` parameter during training, which represents the path of the checkpoints. 
The parameter weights, learning rate, optimizer and other information will be loaded using this parameter. +The configuration file does not need to be modified. You only need to add the `Global.checkpoints` parameter during training, which represents the path of the checkpoints. The parameter weights, learning rate, optimizer and other information will be loaded using this parameter. **Note**: -* The parameter `-o last_epoch=5` means to record the number of the last training epoch as `5`, that is, the number of this training epoch starts from `6`, , and the parameter defaults to `-1`, which means the number of this training epoch starts from `0`. -* The `-o checkpoints` parameter does not need to include the suffix of the checkpoints. The above training command will generate the checkpoints as shown below during the training process. If you want to continue training from the epoch `5`, Just set the `checkpoints` to `./output/MobileNetV3_large_x1_0_gpupaddle/5/ppcls`, PaddleClas will automatically fill in the `pdopt` and `pdparams` suffixes. +* The `-o Global.checkpoints` parameter does not need to include the suffix of the checkpoints. The above training command will generate the checkpoints as shown below during the training process. If you want to continue training from the epoch `5`, Just set the `Global.checkpoints` to `../output/MobileNetV3_large_x1_0/epoch_5`, PaddleClas will automatically fill in the `pdopt` and `pdparams` suffixes. ```shell - output/ - └── MobileNetV3_large_x1_0 - ├── 0 - │ ├── ppcls.pdopt - │ └── ppcls.pdparams - ├── 1 - │ ├── ppcls.pdopt - │ └── ppcls.pdparams + output + ├── MobileNetV3_large_x1_0 + │ ├── best_model.pdopt + │ ├── best_model.pdparams + │ ├── best_model.pdstates + │ ├── epoch_1.pdopt + │ ├── epoch_1.pdparams + │ ├── epoch_1.pdstates . . . @@ -103,18 +101,15 @@ The model evaluation process can be started as follows. 
```bash python tools/eval.py \ - -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o pretrained_model="./output/MobileNetV3_large_x1_0/best_model/ppcls"\ - -o load_static_weights=False + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model ``` -The above command will use `./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml` as the configuration file to evaluate the model `./output/MobileNetV3_large_x1_0/best_model/ppcls`. You can also set the evaluation by changing the parameters in the configuration file, or you can update the configuration with the `-o` parameter, as shown above. +The above command will use `./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml` as the configuration file to evaluate the model `./output/MobileNetV3_large_x1_0/best_model`. You can also set the evaluation by changing the parameters in the configuration file, or you can update the configuration with the `-o` parameter, as shown above. Some of the configurable evaluation parameters are described as follows: -* `ARCHITECTURE.name`: Model name -* `pretrained_model`: The path of the model file to be evaluated -* `load_static_weights`: Whether the model to be evaluated is a static graph model - +* `Arch.name`: Model name +* `Global.pretrained_model`: The path of the model file to be evaluated **Note:** If the model is a dygraph type, you only need to specify the prefix of the model file when loading the model, instead of specifying the suffix, such as [1.3 Resume Training](#13-resume-training). @@ -125,26 +120,15 @@ If you want to run PaddleClas on Linux with GPU, it is highly recommended to use ### 2.1 Model training -After preparing the configuration file, The training process can be started in the following way. 
`paddle.distributed.launch` specifies the GPU running card number by setting `selected_gpus`: +After preparing the configuration file, The training process can be started in the following way. `paddle.distributed.launch` specifies the GPU running card number by setting `gpus`: ```bash export CUDA_VISIBLE_DEVICES=0,1,2,3 -python -m paddle.distributed.launch \ - --selected_gpus="0,1,2,3" \ +python3 -m paddle.distributed.launch \ + --gpus="0,1,2,3" \ tools/train.py \ - -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml -``` - -The configuration can be updated by adding the `-o` parameter. - -```bash -python -m paddle.distributed.launch \ - --selected_gpus="0,1,2,3" \ - tools/train.py \ - -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o pretrained_model="" \ - -o use_gpu=True + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml ``` The format of output log information is the same as above, see [1.1 Model training](#11-model-training) for details. @@ -156,14 +140,14 @@ After configuring the configuration file, you can finetune it by loading the pre ``` export CUDA_VISIBLE_DEVICES=0,1,2,3 -python -m paddle.distributed.launch \ - --selected_gpus="0,1,2,3" \ +python3 -m paddle.distributed.launch \ + --gpus="0,1,2,3" \ tools/train.py \ - -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o pretrained_model="./pretrained/MobileNetV3_large_x1_0_pretrained" + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Arch.pretrained=True ``` -Among them, `pretrained_model` is used to set the address to load the pretrained weights. When using it, you need to replace it with your own pretrained weights' path, or you can modify the path directly in the configuration file. +Among them, `Arch.pretrained` is set to `True` or `False`. It also can be used to set the address to load the pretrained weights. 
When using it, you need to replace it with your own pretrained weights' path, or you can modify the path directly in the configuration file. There contains a lot of examples of model finetuning in [Quick Start](./quick_start_en.md). You can refer to this tutorial to finetune the model on a specific dataset. @@ -175,26 +159,26 @@ If the training process is terminated for some reasons, you can also load the ch ``` export CUDA_VISIBLE_DEVICES=0,1,2,3 -python -m paddle.distributed.launch \ - --selected_gpus="0,1,2,3" \ +python3 -m paddle.distributed.launch \ + --gpus="0,1,2,3" \ tools/train.py \ - -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o checkpoints="./output/MobileNetV3_large_x1_0/5/ppcls" \ - -o last_epoch=5 \ - -o use_gpu=True + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Global.checkpoints="./output/MobileNetV3_large_x1_0/epoch_5" \ + -o Global.device=gpu ``` -The configuration file does not need to be modified. You only need to add the `checkpoints` parameter during training, which represents the path of the checkpoints. The parameter weights, learning rate, optimizer and other information will be loaded using this parameter. About `last_epoch` parameter, please refer [1.3 Resume training](#13-resume-training) for details. +The configuration file does not need to be modified. You only need to add the `Global.checkpoints` parameter during training, which represents the path of the checkpoints. The parameter weights, learning rate, optimizer and other information will be loaded using this parameter as described in [1.3 Resume training](#13-resume-training). ### 2.4 Model evaluation The model evaluation process can be started as follows. 
```bash -python tools/eval.py \ - -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ - -o pretrained_model="./output/MobileNetV3_large_x1_0/best_model/ppcls"\ - -o load_static_weights=False +export CUDA_VISIBLE_DEVICES=0,1,2,3 +python3 -m paddle.distributed.launch \ + tools/eval.py \ + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model ``` About parameter description, see [1.4 Model evaluation](#14-model-evaluation) for details. @@ -204,30 +188,16 @@ About parameter description, see [1.4 Model evaluation](#14-model-evaluation) fo After the training is completed, you can predict by using the pre-trained model obtained by the training, as follows: ```python -python tools/infer/infer.py \ - -i image path \ - --model MobileNetV3_large_x1_0 \ - --pretrained_model "./output/MobileNetV3_large_x1_0/best_model/ppcls" \ - --use_gpu True \ - --load_static_weights False +python3 tools/infer.py \ + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Infer.infer_imgs=dataset/flowers102/jpg/image_00001.jpg \ + -o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model ``` Among them: -+ `image_file`(i): The path of the image file to be predicted, such as `./test.jpeg`; -+ `model`: Model name, such as `MobileNetV3_large_x1_0`; -+ `pretrained_model`: Weight file path, such as `./pretrained/MobileNetV3_large_x1_0_pretrained/`; -+ `use_gpu`: Whether to use the GPU, default by `True`; -+ `load_static_weights`: Whether to load the pre-trained model obtained from static image training, default by `False`; -+ `resize_short`: The length of the shortest side of the image that be scaled proportionally, default by `256`; -+ `resize`: The side length of the image that be center cropped from resize_shorted image, default by `224`; -+ `pre_label_image`: Whether to pre-label the image data, default value: `False`; -+ `pre_label_out_idr`: The output path of pre-labeled image 
data. When `pre_label_image=True`, a lot of subfolders will be generated under the path, each subfolder represent a category, which stores all the images predicted by the model to belong to the category. - -**Note**: If you want to use `Transformer series models`, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of model, and need to set `resize_short=384`, `resize=384`. - -About more detailed infomation, you can refer to [infer.py](../../../tools/infer/infer.py). ++ `Infer.infer_imgs`: The path of the image file or folder to be predicted; ++ `Global.pretrained_model`: Weight file path, such as `./output/MobileNetV3_large_x1_0/best_model`; - ## 4. Use the inference model to predict PaddlePaddle supports inference using prediction engines, which will be introduced next. @@ -235,41 +205,38 @@ PaddlePaddle supports inference using prediction engines, which will be introduc Firstly, you should export inference model using `tools/export_model.py`. ```bash -python tools/export_model.py \ - --model MobileNetV3_large_x1_0 \ - --pretrained_model ./output/MobileNetV3_large_x1_0/best_model/ppcls \ - --output_path ./inference \ - --class_dim 1000 +python3 tools/export_model.py \ + -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \ + -o Global.pretrained_model=output/MobileNetV3_large_x1_0/best_model ``` -Among them, the `--model` parameter is used to specify the model name, `--pretrained_model` parameter is used to specify the model file path, the path does not need to include the model file suffix name, and `--output_path` is used to specify the storage path of the converted model, class_dim means number of class for the model, default as 1000. - -**Note**: -1. If `--output_path=./inference`, then three files will be generated in the folder `inference`, they are `inference.pdiparams`, `inference.pdmodel` and `inference.pdiparams.info`. -2. 
You can specify the `shape` of the model input image by setting the parameter `--img_size`, the default is `224`, which means the shape of input image is `224*224`. If you want to use `Transformer series models`, such as `DeiT_***_384`, `ViT_***_384`, you need to set `--img_size=384`. +Among them, `Global.pretrained_model` parameter is used to specify the model file path that does not need to include the file suffix name. The above command will generate the model structure file (`inference.pdmodel`) and the model weight file (`inference.pdiparams`), and then the inference engine can be used for inference: +Go to the deploy directory: + +``` +cd deploy +``` + +Using inference engine to inference. Because the mapping file of ImageNet1k dataset is used by default, we should set `PostProcess.Topk.class_id_map_file` into `None`. + ```bash -python tools/infer/predict.py \ - --image_file image path \ - --model_file "./inference/inference.pdmodel" \ - --params_file "./inference/inference.pdiparams" \ - --use_gpu=True \ - --use_tensorrt=False +python3 python/predict_cls.py \ + -c configs/inference_cls.yaml \ + -o Global.infer_imgs=../dataset/flowers102/jpg/image_00001.jpg \ + -o Global.inference_model_dir=../inference/ \ + -o PostProcess.Topk.class_id_map_file=None ``` Among them: -+ `image_file`: The path of the image file to be predicted, such as `./test.jpeg`; -+ `model_file`: Model file path, such as `./MobileNetV3_large_x1_0/inference.pdmodel`; -+ `params_file`: Weight file path, such as `./MobileNetV3_large_x1_0/inference.pdiparams`; -+ `use_tensorrt`: Whether to use the TesorRT, default by `True`; -+ `use_gpu`: Whether to use the GPU, default by `True` -+ `enable_mkldnn`: Wheter to use `MKL-DNN`, default by `False`. When both `use_gpu` and `enable_mkldnn` are set to `True`, GPU is used to run and `enable_mkldnn` will be ignored. 
-+ `resize_short`: The length of the shortest side of the image that be scaled proportionally, default by `256`; -+ `resize`: The side length of the image that be center cropped from resize_shorted image, default by `224`; -+ `enable_calc_topk`: Whether to calculate top-k accuracy of the predction, default by `False`. Top-k accuracy will be printed out when set as `True`. -+ `gt_label_path`: Image name and label file, used when `enable_calc_topk` is `True` to get image list and labels. ++ `Global.infer_imgs`: The path of the image file to be predicted; ++ `Global.inference_model_dir`: Model structure file path, such as `../inference/inference.pdmodel`; ++ `Global.use_tensorrt`: Whether to use the TensorRT, default by `False`; ++ `Global.use_gpu`: Whether to use the GPU, default by `True` ++ `Global.enable_mkldnn`: Whether to use `MKL-DNN`, default by `False`. It is valid when `Global.use_gpu` is `False`. ++ `Global.use_fp16`: Whether to enable FP16, default by `False`; **Note**: If you want to use `Transformer series models`, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of model, and need to set `resize_short=384`, `resize=384`. -If you want to evaluate the speed of the model, it is recommended to enable TensorRT to accelerate. +If you want to evaluate the speed of the model, it is recommended to enable TensorRT to accelerate for GPU, and MKL-DNN for CPU. 
diff --git a/docs/en/tutorials/getting_started_retrieval_en.md b/docs/en/tutorials/getting_started_retrieval_en.md index eea6c1667036ab6eb8c554b6959d8d1cc669e86a..f548572d9f17a58695404ccb53839926cdfcf3eb 100644 --- a/docs/en/tutorials/getting_started_retrieval_en.md +++ b/docs/en/tutorials/getting_started_retrieval_en.md @@ -120,7 +120,7 @@ python3 tools/train.py \ `-c` is used to specify the path to the configuration file, and `-o` is used to specify the parameters that need to be modified or added, where `-o Arch.Backbone.pretrained=True` indicates that the Backbone part uses the pre-trained model, in addition, `Arch.Backbone.pretrained` can also specify backbone.`pretrained` can also specify the address of a specific model weight file, which needs to be replaced with the path to your own pre-trained model weight file when using it. `-o Global.device=gpu` indicates that the GPU is used for training. If you want to use a CPU for training, you need to set `Global.device` to `cpu`. -For more detailed training configuration, you can also modify the corresponding configuration file of the model directly. Refer to the [configuration document](config_en.md) for specific configuration parameters. +For more detailed training configuration, you can also modify the corresponding configuration file of the model directly. Refer to the [configuration document](config_description_en.md) for specific configuration parameters. 
Run the above commands to check the output log, an example is as follows: diff --git a/docs/images/wx_group.png b/docs/images/wx_group.png index 4a410ffc8850a03ffd2e7a62a9a7a41948781b9d..94f87f6f0b6178c5d491cbc0ccd2d0efb6702e61 100644 Binary files a/docs/images/wx_group.png and b/docs/images/wx_group.png differ diff --git a/docs/zh_CN/tutorials/getting_started_retrieval.md b/docs/zh_CN/tutorials/getting_started_retrieval.md index 06dcc11c77af238aec346c4432cec570e2ac0c4f..a0695d88c1cd2f9a1ef1bc93cabb17276eddf5a9 100644 --- a/docs/zh_CN/tutorials/getting_started_retrieval.md +++ b/docs/zh_CN/tutorials/getting_started_retrieval.md @@ -117,7 +117,7 @@ python3 tools/train.py \ 其中,`-c`用于指定配置文件的路径,`-o`用于指定需要修改或者添加的参数,其中`-o Arch.Backbone.pretrained=True`表示Backbone部分使用预训练模型,此外,`Arch.Backbone.pretrained`也可以指定具体的模型权重文件的地址,使用时需要换成自己的预训练模型权重文件的路径。`-o Global.device=gpu`表示使用GPU进行训练。如果希望使用CPU进行训练,则需要将`Global.device`设置为`cpu`。 -更详细的训练配置,也可以直接修改模型对应的配置文件。具体配置参数参考[配置文档](config.md)。 +更详细的训练配置,也可以直接修改模型对应的配置文件。具体配置参数参考[配置文档](config_description.md)。 运行上述命令,可以看到输出日志,示例如下: @@ -245,4 +245,4 @@ python3 tools/export_model.py \ - 平均检索精度(mAP) - AP: AP指的是不同召回率上的正确率的平均值 - - mAP: 测试集中所有图片对应的AP的的平均值 \ No newline at end of file + - mAP: 测试集中所有图片对应的AP的的平均值 diff --git a/ppcls/arch/backbone/model_zoo/googlenet.py b/ppcls/arch/backbone/model_zoo/googlenet.py index 00b7feeb9207aace64d84f71f77bf2cbe2be6af8..22528427ea3b9afa38856d632fbc08901f3c1009 100644 --- a/ppcls/arch/backbone/model_zoo/googlenet.py +++ b/ppcls/arch/backbone/model_zoo/googlenet.py @@ -131,7 +131,7 @@ class GoogLeNetDY(nn.Layer): self._ince5b = Inception( 832, 832, 384, 192, 384, 48, 128, 128, name="ince5b") - self._pool_5 = AvgPool2D(kernel_size=7, stride=7) + self._pool_5 = AdaptiveAvgPool2D(1) self._drop = Dropout(p=0.4, mode="downscale_in_infer") self._fc_out = Linear( diff --git a/ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml b/ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml new file 
mode 100644 index 0000000000000000000000000000000000000000..967673f2af6df839fc68d628a42d87e7f2c991d0 --- /dev/null +++ b/ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml @@ -0,0 +1,148 @@ +# global configs +Global: + checkpoints: null + pretrained_model: null + output_dir: ./output/ + device: gpu + save_interval: 1 + eval_during_train: True + eval_interval: 1 + epochs: 100 + print_batch_step: 10 + use_visualdl: False + # used for static mode and model export + image_shape: [3, 224, 224] + save_inference_dir: ./inference + eval_mode: retrieval + use_dali: False + to_static: False + +# model architecture +Arch: + name: RecModel + infer_output_key: features + infer_add_softmax: False + + Backbone: + name: PPLCNet_x2_5 + pretrained: True + use_ssld: True + BackboneStopLayer: + name: flatten_0 + Neck: + name: FC + embedding_size: 1280 + class_num: 512 + Head: + name: ArcMargin + embedding_size: 512 + class_num: 185341 + margin: 0.2 + scale: 30 + +# loss function config for traing/eval process +Loss: + Train: + - CELoss: + weight: 1.0 + Eval: + - CELoss: + weight: 1.0 + +Optimizer: + name: Momentum + momentum: 0.9 + lr: + name: Cosine + learning_rate: 0.04 + warmup_epoch: 5 + regularizer: + name: 'L2' + coeff: 0.00001 + + +# data loader for train and eval +DataLoader: + Train: + dataset: + name: ImageNetDataset + image_root: ./dataset/ + cls_label_path: ./dataset/train_reg_all_data.txt + transform_ops: + - DecodeImage: + to_rgb: True + channel_first: False + - RandCropImage: + size: 224 + - RandFlipImage: + flip_code: 1 + - NormalizeImage: + scale: 1.0/255.0 + mean: [0.485, 0.456, 0.406] + std: [0.229, 0.224, 0.225] + order: '' + + sampler: + name: DistributedBatchSampler + batch_size: 256 + drop_last: False + shuffle: True + loader: + num_workers: 4 + use_shared_memory: True + + Eval: + Query: + dataset: + name: VeriWild + image_root: ./dataset/Aliproduct/ + cls_label_path: ./dataset/Aliproduct/val_list.txt + transform_ops: + - DecodeImage: + 
to_rgb: True + channel_first: False + - ResizeImage: + size: 224 + - NormalizeImage: + scale: 0.00392157 + mean: [0.485, 0.456, 0.406] + std: [0.229, 0.224, 0.225] + order: '' + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: False + shuffle: False + loader: + num_workers: 4 + use_shared_memory: True + + Gallery: + dataset: + name: VeriWild + image_root: ./dataset/Aliproduct/ + cls_label_path: ./dataset/Aliproduct/val_list.txt + transform_ops: + - DecodeImage: + to_rgb: True + channel_first: False + - ResizeImage: + size: 224 + - NormalizeImage: + scale: 0.00392157 + mean: [0.485, 0.456, 0.406] + std: [0.229, 0.224, 0.225] + order: '' + sampler: + name: DistributedBatchSampler + batch_size: 64 + drop_last: False + shuffle: False + loader: + num_workers: 4 + use_shared_memory: True + +Metric: + Eval: + - Recallk: + topk: [1, 5] diff --git a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window12_384.yaml b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window12_384.yaml index af54e4aa753cba8d0215d7292c6cff752553a04f..5d976c0b83c266b6f9ccb91f5ac640a096bbd301 100644 --- a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window12_384.yaml +++ b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window12_384.yaml @@ -61,6 +61,8 @@ DataLoader: channel_first: False - RandCropImage: size: 384 + interpolation: bicubic + backend: pil - RandFlipImage: flip_code: 1 - TimmAutoAugment: @@ -109,6 +111,8 @@ DataLoader: channel_first: False - ResizeImage: resize_short: 438 + interpolation: bicubic + backend: pil - CropImage: size: 384 - NormalizeImage: @@ -134,6 +138,8 @@ Infer: channel_first: False - ResizeImage: resize_short: 438 + interpolation: bicubic + backend: pil - CropImage: size: 384 - NormalizeImage: diff --git a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window7_224.yaml 
b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window7_224.yaml index 4b9baa1b642c371f7e8019f19adb8e3ba51005e9..efbd427ad56802ba7a7a3478a1dd4e6c22ce3c1e 100644 --- a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window7_224.yaml +++ b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_base_patch4_window7_224.yaml @@ -61,6 +61,8 @@ DataLoader: channel_first: False - RandCropImage: size: 224 + interpolation: bicubic + backend: pil - RandFlipImage: flip_code: 1 - TimmAutoAugment: @@ -109,6 +111,8 @@ DataLoader: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: @@ -134,6 +138,8 @@ Infer: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: diff --git a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window12_384.yaml b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window12_384.yaml index 58c9667e78d6892afbc1a524fd8127d0b3b29815..6c3abe6fff9932f86accd0a52650f37442a5fd47 100644 --- a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window12_384.yaml +++ b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window12_384.yaml @@ -61,6 +61,8 @@ DataLoader: channel_first: False - RandCropImage: size: 384 + interpolation: bicubic + backend: pil - RandFlipImage: flip_code: 1 - TimmAutoAugment: @@ -109,6 +111,8 @@ DataLoader: channel_first: False - ResizeImage: resize_short: 438 + interpolation: bicubic + backend: pil - CropImage: size: 384 - NormalizeImage: @@ -134,6 +138,8 @@ Infer: channel_first: False - ResizeImage: resize_short: 438 + interpolation: bicubic + backend: pil - CropImage: size: 384 - NormalizeImage: diff --git a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window7_224.yaml 
b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window7_224.yaml index 16f5a7dce143b207d9e8e671d91f8464aa8e21d4..dd2b2acd71f2427bc667d59663d2400800d610f9 100644 --- a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window7_224.yaml +++ b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_large_patch4_window7_224.yaml @@ -61,6 +61,8 @@ DataLoader: channel_first: False - RandCropImage: size: 224 + interpolation: bicubic + backend: pil - RandFlipImage: flip_code: 1 - TimmAutoAugment: @@ -109,6 +111,8 @@ DataLoader: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: @@ -134,6 +138,8 @@ Infer: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: diff --git a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_small_patch4_window7_224.yaml b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_small_patch4_window7_224.yaml index 88fc3da419770f8e4bb439e09170bf68fc991b14..34a80d8341d2b07f6bd6806fde3e1f58dbc307e5 100644 --- a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_small_patch4_window7_224.yaml +++ b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_small_patch4_window7_224.yaml @@ -61,6 +61,8 @@ DataLoader: channel_first: False - RandCropImage: size: 224 + interpolation: bicubic + backend: pil - RandFlipImage: flip_code: 1 - TimmAutoAugment: @@ -109,6 +111,8 @@ DataLoader: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: @@ -134,6 +138,8 @@ Infer: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: diff --git a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_tiny_patch4_window7_224.yaml 
b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_tiny_patch4_window7_224.yaml index ed9b4d505f06a1c794ca0d82151caba33c184518..d921593853d1bb658cc3b3d8aec35e0decd0f833 100644 --- a/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_tiny_patch4_window7_224.yaml +++ b/ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_tiny_patch4_window7_224.yaml @@ -61,6 +61,8 @@ DataLoader: channel_first: False - RandCropImage: size: 224 + interpolation: bicubic + backend: pil - RandFlipImage: flip_code: 1 - TimmAutoAugment: @@ -109,6 +111,8 @@ DataLoader: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: @@ -134,6 +138,8 @@ Infer: channel_first: False - ResizeImage: resize_short: 256 + interpolation: bicubic + backend: pil - CropImage: size: 224 - NormalizeImage: diff --git a/ppcls/configs/Logo/ResNet50_ReID.yaml b/ppcls/configs/Logo/ResNet50_ReID.yaml index 90ec5caad21f7ff4b203dcd97ddd8e4b69ee0f29..fa52193fd6652cc2c7968b271d6fc12f363547bf 100644 --- a/ppcls/configs/Logo/ResNet50_ReID.yaml +++ b/ppcls/configs/Logo/ResNet50_ReID.yaml @@ -54,7 +54,7 @@ Optimizer: momentum: 0.9 lr: name: Cosine - learning_rate: 0.01 + learning_rate: 0.04 regularizer: name: 'L2' coeff: 0.0001 @@ -84,10 +84,10 @@ DataLoader: - RandomErasing: EPSILON: 0.5 sampler: - name: DistributedRandomIdentitySampler + name: PKSampler batch_size: 128 - num_instances: 2 - drop_last: False + sample_per_id: 2 + drop_last: True loader: num_workers: 6 @@ -97,7 +97,7 @@ DataLoader: dataset: name: LogoDataset image_root: "dataset/LogoDet-3K-crop/val/" - cls_label_path: "dataset/LogoDet-3K-crop/LogoDet-3K+query.txt" + cls_label_path: "dataset/LogoDet-3K-crop/LogoDet-3K+val.txt" transform_ops: - DecodeImage: to_rgb: True @@ -122,7 +122,7 @@ DataLoader: dataset: name: LogoDataset image_root: "dataset/LogoDet-3K-crop/train/" - cls_label_path: "dataset/LogoDet-3K-crop/LogoDet-3K+gallery.txt" + cls_label_path: 
"dataset/LogoDet-3K-crop/LogoDet-3K+train.txt" transform_ops: - DecodeImage: to_rgb: True diff --git a/ppcls/configs/Products/ResNet50_vd_Inshop.yaml b/ppcls/configs/Products/ResNet50_vd_Inshop.yaml index b29a3a3f0e53099b06cd4aa2994d3bbd209a0467..2571ea483167323407da66ddcb75b38fcd32ab5c 100644 --- a/ppcls/configs/Products/ResNet50_vd_Inshop.yaml +++ b/ppcls/configs/Products/ResNet50_vd_Inshop.yaml @@ -54,7 +54,7 @@ Optimizer: momentum: 0.9 lr: name: MultiStepDecay - learning_rate: 0.01 + learning_rate: 0.04 milestones: [30, 60, 70, 80, 90, 100] gamma: 0.5 verbose: False @@ -90,10 +90,10 @@ DataLoader: r1: 0.3 mean: [0., 0., 0.] sampler: - name: DistributedRandomIdentitySampler + name: PKSampler batch_size: 64 - num_instances: 2 - drop_last: False + sample_per_id: 2 + drop_last: True shuffle: True loader: num_workers: 4 diff --git a/ppcls/configs/Vehicle/ResNet50_ReID.yaml b/ppcls/configs/Vehicle/ResNet50_ReID.yaml index ffe98396629617064060a952dfc9b1043254d67c..6aebcbf0d85e379dcd9c383199721f839b8a7a13 100644 --- a/ppcls/configs/Vehicle/ResNet50_ReID.yaml +++ b/ppcls/configs/Vehicle/ResNet50_ReID.yaml @@ -53,7 +53,7 @@ Optimizer: momentum: 0.9 lr: name: Cosine - learning_rate: 0.01 + learning_rate: 0.04 regularizer: name: 'L2' coeff: 0.0005 @@ -88,10 +88,10 @@ DataLoader: mean: [0., 0., 0.] 
sampler: - name: DistributedRandomIdentitySampler + name: PKSampler batch_size: 128 - num_instances: 2 - drop_last: False + sample_per_id: 2 + drop_last: True shuffle: True loader: num_workers: 6 diff --git a/ppcls/data/__init__.py b/ppcls/data/__init__.py index b442aa883dcec50b37fc3abd393d2af5f0011298..fd41ea3ca0d763e41798050562db9e1244d12085 100644 --- a/ppcls/data/__init__.py +++ b/ppcls/data/__init__.py @@ -26,9 +26,12 @@ from ppcls.data.dataloader.common_dataset import create_operators from ppcls.data.dataloader.vehicle_dataset import CompCars, VeriWild from ppcls.data.dataloader.logo_dataset import LogoDataset from ppcls.data.dataloader.icartoon_dataset import ICartoonDataset +from ppcls.data.dataloader.mix_dataset import MixDataset # sampler from ppcls.data.dataloader.DistributedRandomIdentitySampler import DistributedRandomIdentitySampler +from ppcls.data.dataloader.pk_sampler import PKSampler +from ppcls.data.dataloader.mix_sampler import MixSampler from ppcls.data import preprocess from ppcls.data.preprocess import transform diff --git a/ppcls/data/dataloader/__init__.py b/ppcls/data/dataloader/__init__.py index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..8f81921018634419ba92d77fae92a57112bcdc54 100644 --- a/ppcls/data/dataloader/__init__.py +++ b/ppcls/data/dataloader/__init__.py @@ -0,0 +1,9 @@ +from ppcls.data.dataloader.imagenet_dataset import ImageNetDataset +from ppcls.data.dataloader.multilabel_dataset import MultiLabelDataset +from ppcls.data.dataloader.common_dataset import create_operators +from ppcls.data.dataloader.vehicle_dataset import CompCars, VeriWild +from ppcls.data.dataloader.logo_dataset import LogoDataset +from ppcls.data.dataloader.icartoon_dataset import ICartoonDataset +from ppcls.data.dataloader.mix_dataset import MixDataset +from ppcls.data.dataloader.mix_sampler import MixSampler +from ppcls.data.dataloader.pk_sampler import PKSampler diff --git a/ppcls/data/dataloader/mix_dataset.py b/ppcls/data/dataloader/mix_dataset.py new 
file mode 100644 index 0000000000000000000000000000000000000000..cbf4b4028d27cf3ebfeab9dc89ac5414dbd4786e --- /dev/null +++ b/ppcls/data/dataloader/mix_dataset.py @@ -0,0 +1,49 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import print_function + +import numpy as np +import os + +from paddle.io import Dataset +from .. import dataloader + + +class MixDataset(Dataset): + def __init__(self, datasets_config): + super().__init__() + self.dataset_list = [] + start_idx = 0 + end_idx = 0 + for config_i in datasets_config: + dataset_name = config_i.pop('name') + dataset = getattr(dataloader, dataset_name)(**config_i) + end_idx += len(dataset) + self.dataset_list.append([end_idx, start_idx, dataset]) + start_idx = end_idx + + self.length = end_idx + + def __getitem__(self, idx): + for dataset_i in self.dataset_list: + if dataset_i[0] > idx: + dataset_i_idx = idx - dataset_i[1] + return dataset_i[2][dataset_i_idx] + + def __len__(self): + return self.length + + def get_dataset_list(self): + return self.dataset_list diff --git a/ppcls/data/dataloader/mix_sampler.py b/ppcls/data/dataloader/mix_sampler.py new file mode 100644 index 0000000000000000000000000000000000000000..2df3109cece3e6532ac54eb8f1d9e6498a1f33a7 --- /dev/null +++ b/ppcls/data/dataloader/mix_sampler.py @@ -0,0 +1,79 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. 
class MixSampler(DistributedBatchSampler):
    """Batch sampler that draws each batch from several sub-datasets.

    Wraps a :class:`MixDataset` and builds one sub-sampler per sub-dataset
    (per entry of ``sample_configs``).  Every yielded batch is the
    concatenation of one sub-batch from each sub-sampler, with sub-dataset
    local indices shifted by that dataset's global start offset.

    Args:
        dataset (MixDataset): the combined dataset to sample from.
        batch_size (int): total batch size; split across sub-samplers
            according to each config's ``ratio``.
        sample_configs (list[dict]): per sub-dataset sampler config with
            ``name`` (sampler class) and ``ratio`` (fraction of the batch),
            plus the sampler's own kwargs.  NOTE: dicts are mutated (pop).
        iter_per_epoch (int): base number of iterations per epoch.
    """

    def __init__(self, dataset, batch_size, sample_configs, iter_per_epoch):
        super().__init__(dataset, batch_size)
        assert isinstance(dataset,
                          MixDataset), "MixSampler only support MixDataset"
        self.sampler_list = []
        self.batch_size = batch_size
        # Global start offset of each sub-dataset, used to shift local
        # indices produced by each sub-sampler.
        self.start_list = []
        # NOTE(review): length starts at iter_per_epoch and then also
        # accumulates len(dataset_i) * ratio_i below — the two look like
        # alternative definitions of "iterations per epoch"; confirm the
        # addition is intended rather than an either/or.
        self.length = iter_per_epoch
        dataset_list = dataset.get_dataset_list()
        batch_size_left = self.batch_size
        self.iter_list = []
        for i, config_i in enumerate(sample_configs):
            self.start_list.append(dataset_list[i][1])
            sample_method = config_i.pop("name")
            ratio_i = config_i.pop("ratio")
            if i < len(sample_configs) - 1:
                # Proportional share of the batch for this sub-dataset.
                batch_size_i = int(self.batch_size * ratio_i)
                batch_size_left -= batch_size_i
            else:
                # Last sub-dataset absorbs the rounding remainder so the
                # sub-batches always sum to batch_size.
                batch_size_i = batch_size_left
            assert batch_size_i <= len(dataset_list[i][2])
            config_i["batch_size"] = batch_size_i
            if sample_method == "DistributedBatchSampler":
                sampler_i = DistributedBatchSampler(dataset_list[i][2],
                                                    **config_i)
            else:
                # Any other sampler class exported by ppcls.data.dataloader
                # (e.g. PKSampler).
                sampler_i = getattr(dataloader, sample_method)(
                    dataset_list[i][2], **config_i)
            self.sampler_list.append(sampler_i)
            # Keep a live iterator per sub-sampler; restarted on exhaustion
            # inside __iter__.
            self.iter_list.append(iter(sampler_i))
            self.length += len(dataset_list[i][2]) * ratio_i
        # Number of full batches yielded so far in the current epoch.
        self.iter_counter = 0

    def __iter__(self):
        while self.iter_counter < self.length:
            batch = []
            for i, iter_i in enumerate(self.iter_list):
                batch_i = next(iter_i, None)
                if batch_i is None:
                    # Sub-sampler exhausted: restart it so shorter datasets
                    # cycle while longer ones keep going.
                    iter_i = iter(self.sampler_list[i])
                    self.iter_list[i] = iter_i
                    batch_i = next(iter_i, None)
                    assert batch_i is not None, "dataset {} return None".format(
                        i)
                # Shift local indices into the MixDataset global index space.
                batch += [idx + self.start_list[i] for idx in batch_i]
            if len(batch) == self.batch_size:
                self.iter_counter += 1
                yield batch
            else:
                # Incomplete batch (some sub-sampler returned a short
                # sub-batch); drop it and try again.
                logger.info("Some dataset reaches end")
        self.iter_counter = 0

    def __len__(self):
        return self.length
+ Args: + dataset (paddle.io.Dataset): list of (img_path, pid, cam_id). + sample_per_id(int): number of instances per identity in a batch. + batch_size (int): number of examples in a batch. + shuffle(bool): whether to shuffle indices order before generating + batch indices. Default False. + """ + + def __init__(self, + dataset, + batch_size, + sample_per_id, + shuffle=True, + drop_last=True, + sample_method="sample_avg_prob"): + super().__init__( + dataset, batch_size, shuffle=shuffle, drop_last=drop_last) + assert batch_size % sample_per_id == 0, \ + "PKSampler configs error, Sample_per_id must be a divisor of batch_size." + assert hasattr(self.dataset, + "labels"), "Dataset must have labels attribute." + self.sample_per_label = sample_per_id + self.label_dict = defaultdict(list) + self.sample_method = sample_method + for idx, label in enumerate(self.dataset.labels): + self.label_dict[label].append(idx) + self.label_list = list(self.label_dict) + assert len(self.label_list) * self.sample_per_label > self.batch_size, \ + "batch size should be smaller than " + if self.sample_method == "id_avg_prob": + self.prob_list = np.array([1 / len(self.label_list)] * + len(self.label_list)) + elif self.sample_method == "sample_avg_prob": + counter = [] + for label_i in self.label_list: + counter.append(len(self.label_dict[label_i])) + self.prob_list = np.array(counter) / sum(counter) + else: + logger.error( + "PKSampler only support id_avg_prob and sample_avg_prob sample method, " + "but receive {}.".format(self.sample_method)) + if sum(np.abs(self.prob_list - 1) > 0.00000001): + self.prob_list[-1] = 1 - sum(self.prob_list[:-1]) + if self.prob_list[-1] > 1 or self.prob_list[-1] < 0: + logger.error("PKSampler prob list error") + else: + logger.info( + "PKSampler: sum of prob list not equal to 1, change the last prob" + ) + + def __iter__(self): + label_per_batch = self.batch_size // self.sample_per_label + if self.shuffle: + 
np.random.RandomState(self.epoch).shuffle(self.label_list) + for i in range(len(self)): + batch_index = [] + batch_label_list = np.random.choice( + self.label_list, + size=label_per_batch, + replace=False, + p=self.prob_list) + for label_i in batch_label_list: + label_i_indexes = self.label_dict[label_i] + if self.sample_per_label <= len(label_i_indexes): + batch_index.extend( + np.random.choice( + label_i_indexes, + size=self.sample_per_label, + replace=False)) + else: + batch_index.extend( + np.random.choice( + label_i_indexes, + size=self.sample_per_label, + replace=True)) + if not self.drop_last or len(batch_index) == self.batch_size: + yield batch_index diff --git a/ppcls/data/preprocess/ops/operators.py b/ppcls/data/preprocess/ops/operators.py index 4418f529356d6dbf1b1c3b33761371b4dbcde3b4..e46823d2a751789d50c98e056fa34b496dee156e 100644 --- a/ppcls/data/preprocess/ops/operators.py +++ b/ppcls/data/preprocess/ops/operators.py @@ -59,6 +59,9 @@ class UnifiedResize(object): if backend.lower() == "cv2": if isinstance(interpolation, str): interpolation = _cv2_interp_from_str[interpolation.lower()] + # compatible with opencv < version 4.4.0 + elif not interpolation: + interpolation = cv2.INTER_LINEAR self.resize_func = partial(cv2.resize, interpolation=interpolation) elif backend.lower() == "pil": if isinstance(interpolation, str): diff --git a/ppcls/engine/evaluation/classification.py b/ppcls/engine/evaluation/classification.py index 005d740d38da871755c8b507b5ed3412c4f2eb94..9335e3079013c3707a89bcbb37a0aa2d512dac2d 100644 --- a/ppcls/engine/evaluation/classification.py +++ b/ppcls/engine/evaluation/classification.py @@ -22,7 +22,7 @@ from ppcls.utils.misc import AverageMeter from ppcls.utils import logger -def classification_eval(evaler, epoch_id=0): +def classification_eval(engine, epoch_id=0): output_info = dict() time_info = { "batch_cost": AverageMeter( @@ -30,21 +30,19 @@ def classification_eval(evaler, epoch_id=0): "reader_cost": AverageMeter( 
"reader_cost", ".5f", postfix=" s,"), } - print_batch_step = evaler.config["Global"]["print_batch_step"] + print_batch_step = engine.config["Global"]["print_batch_step"] metric_key = None tic = time.time() - eval_dataloader = evaler.eval_dataloader if evaler.use_dali else evaler.eval_dataloader( - ) - max_iter = len(evaler.eval_dataloader) - 1 if platform.system( - ) == "Windows" else len(evaler.eval_dataloader) - for iter_id, batch in enumerate(eval_dataloader): + max_iter = len(engine.eval_dataloader) - 1 if platform.system( + ) == "Windows" else len(engine.eval_dataloader) + for iter_id, batch in enumerate(engine.eval_dataloader): if iter_id >= max_iter: break if iter_id == 5: for key in time_info: time_info[key].reset() - if evaler.use_dali: + if engine.use_dali: batch = [ paddle.to_tensor(batch[0]['data']), paddle.to_tensor(batch[0]['label']) @@ -55,17 +53,17 @@ def classification_eval(evaler, epoch_id=0): if not evaler.config["Global"].get("use_multilabel", False): batch[1] = batch[1].reshape([-1, 1]).astype("int64") # image input - out = evaler.model(batch[0]) + out = engine.model(batch[0]) # calc loss - if evaler.eval_loss_func is not None: - loss_dict = evaler.eval_loss_func(out, batch[1]) + if engine.eval_loss_func is not None: + loss_dict = engine.eval_loss_func(out, batch[1]) for key in loss_dict: if key not in output_info: output_info[key] = AverageMeter(key, '7.5f') output_info[key].update(loss_dict[key].numpy()[0], batch_size) # calc metric - if evaler.eval_metric_func is not None: - metric_dict = evaler.eval_metric_func(out, batch[1]) + if engine.eval_metric_func is not None: + metric_dict = engine.eval_metric_func(out, batch[1]) if paddle.distributed.get_world_size() > 1: for key in metric_dict: paddle.distributed.all_reduce( @@ -98,18 +96,18 @@ def classification_eval(evaler, epoch_id=0): ]) logger.info("[Eval][Epoch {}][Iter: {}/{}]{}, {}, {}".format( epoch_id, iter_id, - len(evaler.eval_dataloader), metric_msg, time_msg, ips_msg)) + 
len(engine.eval_dataloader), metric_msg, time_msg, ips_msg)) tic = time.time() - if evaler.use_dali: - evaler.eval_dataloader.reset() + if engine.use_dali: + engine.eval_dataloader.reset() metric_msg = ", ".join([ "{}: {:.5f}".format(key, output_info[key].avg) for key in output_info ]) logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg)) # do not try to save best eval.model - if evaler.eval_metric_func is None: + if engine.eval_metric_func is None: return -1 # return 1st metric in the dict return output_info[metric_key].avg diff --git a/ppcls/engine/evaluation/retrieval.py b/ppcls/engine/evaluation/retrieval.py index bb6d08d319d1c8d9d2f5f98aa0af51cd5b4e3721..bae77743d4f72d6bc92e13fae8e17583454159d7 100644 --- a/ppcls/engine/evaluation/retrieval.py +++ b/ppcls/engine/evaluation/retrieval.py @@ -20,21 +20,21 @@ import paddle from ppcls.utils import logger -def retrieval_eval(evaler, epoch_id=0): - evaler.model.eval() +def retrieval_eval(engine, epoch_id=0): + engine.model.eval() # step1. build gallery - if evaler.gallery_query_dataloader is not None: + if engine.gallery_query_dataloader is not None: gallery_feas, gallery_img_id, gallery_unique_id = cal_feature( - evaler, name='gallery_query') + engine, name='gallery_query') query_feas, query_img_id, query_query_id = gallery_feas, gallery_img_id, gallery_unique_id else: gallery_feas, gallery_img_id, gallery_unique_id = cal_feature( - evaler, name='gallery') + engine, name='gallery') query_feas, query_img_id, query_query_id = cal_feature( - evaler, name='query') + engine, name='query') # step2. 
do evaluation - sim_block_size = evaler.config["Global"].get("sim_block_size", 64) + sim_block_size = engine.config["Global"].get("sim_block_size", 64) sections = [sim_block_size] * (len(query_feas) // sim_block_size) if len(query_feas) % sim_block_size: sections.append(len(query_feas) % sim_block_size) @@ -45,7 +45,7 @@ def retrieval_eval(evaler, epoch_id=0): image_id_blocks = paddle.split(query_img_id, num_or_sections=sections) metric_key = None - if evaler.eval_loss_func is None: + if engine.eval_loss_func is None: metric_dict = {metric_key: 0.} else: metric_dict = dict() @@ -65,7 +65,7 @@ def retrieval_eval(evaler, epoch_id=0): else: keep_mask = None - metric_tmp = evaler.eval_metric_func(similarity_matrix, + metric_tmp = engine.eval_metric_func(similarity_matrix, image_id_blocks[block_idx], gallery_img_id, keep_mask) @@ -88,32 +88,31 @@ def retrieval_eval(evaler, epoch_id=0): return metric_dict[metric_key] -def cal_feature(evaler, name='gallery'): +def cal_feature(engine, name='gallery'): all_feas = None all_image_id = None all_unique_id = None has_unique_id = False if name == 'gallery': - dataloader = evaler.gallery_dataloader + dataloader = engine.gallery_dataloader elif name == 'query': - dataloader = evaler.query_dataloader + dataloader = engine.query_dataloader elif name == 'gallery_query': - dataloader = evaler.gallery_query_dataloader + dataloader = engine.gallery_query_dataloader else: raise RuntimeError("Only support gallery or query dataset") max_iter = len(dataloader) - 1 if platform.system() == "Windows" else len( dataloader) - dataloader_tmp = dataloader if evaler.use_dali else dataloader() - for idx, batch in enumerate(dataloader_tmp): # load is very time-consuming + for idx, batch in enumerate(dataloader): # load is very time-consuming if idx >= max_iter: break - if idx % evaler.config["Global"]["print_batch_step"] == 0: + if idx % engine.config["Global"]["print_batch_step"] == 0: logger.info( f"{name} feature calculation process: 
[{idx}/{len(dataloader)}]" ) - if evaler.use_dali: + if engine.use_dali: batch = [ paddle.to_tensor(batch[0]['data']), paddle.to_tensor(batch[0]['label']) @@ -123,20 +122,20 @@ def cal_feature(evaler, name='gallery'): if len(batch) == 3: has_unique_id = True batch[2] = batch[2].reshape([-1, 1]).astype("int64") - out = evaler.model(batch[0], batch[1]) + out = engine.model(batch[0], batch[1]) batch_feas = out["features"] # do norm - if evaler.config["Global"].get("feature_normalize", True): + if engine.config["Global"].get("feature_normalize", True): feas_norm = paddle.sqrt( paddle.sum(paddle.square(batch_feas), axis=1, keepdim=True)) batch_feas = paddle.divide(batch_feas, feas_norm) # do binarize - if evaler.config["Global"].get("feature_binarize") == "round": + if engine.config["Global"].get("feature_binarize") == "round": batch_feas = paddle.round(batch_feas).astype("float32") * 2.0 - 1.0 - if evaler.config["Global"].get("feature_binarize") == "sign": + if engine.config["Global"].get("feature_binarize") == "sign": batch_feas = paddle.sign(batch_feas).astype("float32") if all_feas is None: @@ -150,8 +149,8 @@ def cal_feature(evaler, name='gallery'): if has_unique_id: all_unique_id = paddle.concat([all_unique_id, batch[2]]) - if evaler.use_dali: - dataloader_tmp.reset() + if engine.use_dali: + dataloader.reset() if paddle.distributed.get_world_size() > 1: feat_list = [] diff --git a/ppcls/engine/train/train.py b/ppcls/engine/train/train.py index e158548347630ca52d0ad12b38289c69206ca51b..7a70eeeb006611ec5eb58a5d12e0a6d5d8466aa0 100644 --- a/ppcls/engine/train/train.py +++ b/ppcls/engine/train/train.py @@ -18,63 +18,61 @@ import paddle from ppcls.engine.train.utils import update_loss, update_metric, log_info -def train_epoch(trainer, epoch_id, print_batch_step): +def train_epoch(engine, epoch_id, print_batch_step): tic = time.time() - - train_dataloader = trainer.train_dataloader if trainer.use_dali else trainer.train_dataloader( - ) - for iter_id, batch in 
enumerate(train_dataloader): - if iter_id >= trainer.max_iter: + for iter_id, batch in enumerate(engine.train_dataloader): + if iter_id >= engine.max_iter: break if iter_id == 5: - for key in trainer.time_info: - trainer.time_info[key].reset() - trainer.time_info["reader_cost"].update(time.time() - tic) - if trainer.use_dali: + for key in engine.time_info: + engine.time_info[key].reset() + engine.time_info["reader_cost"].update(time.time() - tic) + if engine.use_dali: batch = [ paddle.to_tensor(batch[0]['data']), paddle.to_tensor(batch[0]['label']) ] batch_size = batch[0].shape[0] - if not trainer.config["Global"].get("use_multilabel", False): + if not engine.config["Global"].get("use_multilabel", False): batch[1] = batch[1].reshape([-1, 1]).astype("int64") - trainer.global_step += 1 + engine.global_step += 1 + # image input - if trainer.amp: + if engine.amp: with paddle.amp.auto_cast(custom_black_list={ "flatten_contiguous_range", "greater_than" }): - out = forward(trainer, batch) - loss_dict = trainer.train_loss_func(out, batch[1]) + out = forward(engine, batch) + loss_dict = engine.train_loss_func(out, batch[1]) else: - out = forward(trainer, batch) + out = forward(engine, batch) # calc loss - if trainer.config["DataLoader"]["Train"]["dataset"].get( + if engine.config["DataLoader"]["Train"]["dataset"].get( "batch_transform_ops", None): - loss_dict = trainer.train_loss_func(out, batch[1:]) + loss_dict = engine.train_loss_func(out, batch[1:]) else: - loss_dict = trainer.train_loss_func(out, batch[1]) + loss_dict = engine.train_loss_func(out, batch[1]) # step opt and lr - if trainer.amp: - scaled = trainer.scaler.scale(loss_dict["loss"]) + if engine.amp: + scaled = engine.scaler.scale(loss_dict["loss"]) scaled.backward() - trainer.scaler.minimize(trainer.optimizer, scaled) + engine.scaler.minimize(engine.optimizer, scaled) else: loss_dict["loss"].backward() - trainer.optimizer.step() - trainer.optimizer.clear_grad() - trainer.lr_sch.step() + engine.optimizer.step() 
+ engine.optimizer.clear_grad() + engine.lr_sch.step() # below code just for logging # update metric_for_logger - update_metric(trainer, out, batch, batch_size) + update_metric(engine, out, batch, batch_size) # update_loss_for_logger - update_loss(trainer, loss_dict, batch_size) - trainer.time_info["batch_cost"].update(time.time() - tic) + update_loss(engine, loss_dict, batch_size) + engine.time_info["batch_cost"].update(time.time() - tic) if iter_id % print_batch_step == 0: - log_info(trainer, batch_size, epoch_id, iter_id) + log_info(engine, batch_size, epoch_id, iter_id) tic = time.time() diff --git a/ppcls/optimizer/learning_rate.py b/ppcls/optimizer/learning_rate.py index ea938b123ea559c2466359dea5f72461747eecec..b59387dd935c805078ffdb435788373e07743807 100644 --- a/ppcls/optimizer/learning_rate.py +++ b/ppcls/optimizer/learning_rate.py @@ -11,12 +11,15 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. + from __future__ import (absolute_import, division, print_function, unicode_literals) from paddle.optimizer import lr from paddle.optimizer.lr import LRScheduler +from ppcls.utils import logger + class Linear(object): """ @@ -41,7 +44,11 @@ class Linear(object): warmup_start_lr=0.0, last_epoch=-1, **kwargs): - super(Linear, self).__init__() + super().__init__() + if warmup_epoch >= epochs: + msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}." + logger.warning(msg) + warmup_epoch = epochs self.learning_rate = learning_rate self.steps = (epochs - warmup_epoch) * step_each_epoch self.end_lr = end_lr @@ -56,7 +63,8 @@ class Linear(object): decay_steps=self.steps, end_lr=self.end_lr, power=self.power, - last_epoch=self.last_epoch) + last_epoch=self. 
+ last_epoch) if self.steps > 0 else self.learning_rate if self.warmup_steps > 0: learning_rate = lr.LinearWarmup( learning_rate=learning_rate, @@ -90,7 +98,11 @@ class Cosine(object): warmup_start_lr=0.0, last_epoch=-1, **kwargs): - super(Cosine, self).__init__() + super().__init__() + if warmup_epoch >= epochs: + msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}." + logger.warning(msg) + warmup_epoch = epochs self.learning_rate = learning_rate self.T_max = (epochs - warmup_epoch) * step_each_epoch self.eta_min = eta_min @@ -103,7 +115,8 @@ class Cosine(object): learning_rate=self.learning_rate, T_max=self.T_max, eta_min=self.eta_min, - last_epoch=self.last_epoch) + last_epoch=self. + last_epoch) if self.T_max > 0 else self.learning_rate if self.warmup_steps > 0: learning_rate = lr.LinearWarmup( learning_rate=learning_rate, @@ -132,12 +145,17 @@ class Step(object): learning_rate, step_size, step_each_epoch, + epochs, gamma, warmup_epoch=0, warmup_start_lr=0.0, last_epoch=-1, **kwargs): - super(Step, self).__init__() + super().__init__() + if warmup_epoch >= epochs: + msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}." + logger.warning(msg) + warmup_epoch = epochs self.step_size = step_each_epoch * step_size self.learning_rate = learning_rate self.gamma = gamma @@ -177,11 +195,16 @@ class Piecewise(object): step_each_epoch, decay_epochs, values, + epochs, warmup_epoch=0, warmup_start_lr=0.0, last_epoch=-1, **kwargs): - super(Piecewise, self).__init__() + super().__init__() + if warmup_epoch >= epochs: + msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}." 
+ logger.warning(msg) + warmup_epoch = epochs self.boundaries = [step_each_epoch * e for e in decay_epochs] self.values = values self.last_epoch = last_epoch @@ -294,8 +317,7 @@ class MultiStepDecay(LRScheduler): raise ValueError('gamma should be < 1.0.') self.milestones = [x * step_each_epoch for x in milestones] self.gamma = gamma - super(MultiStepDecay, self).__init__(learning_rate, last_epoch, - verbose) + super().__init__(learning_rate, last_epoch, verbose) def get_lr(self): for i in range(len(self.milestones)):