This tutorial will introduce the detailed steps of deploying the PaddleClas classification model on the server side. The deployment of the recognition model will be supported in the near future. Please look forward to it.
---
## Contents
- [1. Prepare the environment](#1)
    - [1.1 Compile OpenCV](#1.1)
    - [1.2 Compile or download the Paddle Inference Library](#1.2)
        - [1.2.1 Compile from the source code](#1.2.1)
        - [1.2.2 Direct download and installation](#1.2.2)
- [2. Compile](#2)
    - [2.1 Compile PaddleClas C++ inference demo](#2.1)
    - [2.2 Compile config lib and cls lib](#2.2)
- [3. Run](#3)
    - [3.1 Prepare inference model](#3.1)
    - [3.2 Run demo](#3.2)
<aname="1"></a>
## 1. Prepare the environment
### Environment
- Linux, docker is recommended.
- Windows, compilation based on `Visual Studio 2019 Community` is supported. In addition, you can refer to [How to use PaddleDetection to make a complete project](https://zhuanlan.zhihu.com/p/145446681) to compile by generating the `sln solution`.
- This document mainly introduces the compilation and inference of PaddleClas using C++ in Linux environment.
- If you need to use the Inference Library in Windows environment, please refer to [The compilation tutorial in Windows](./docs/windows_vs2019_build.md) for detailed information.
<aname="1.1"></a>
### 1.1 Compile opencv
* First of all, you need to download the OpenCV source code package for the Linux environment from the OpenCV official website or GitHub. Taking OpenCV 3.4.7 as an example, the download and uncompress commands are as follows.
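A minimal sketch of the download and extraction step, assuming the source tarball is fetched from the OpenCV GitHub archive (any mirror providing the 3.4.7 source also works):
```shell
# download the OpenCV 3.4.7 source tarball (GitHub archive used here as an example mirror)
wget https://github.com/opencv/opencv/archive/3.4.7.tar.gz
# uncompress; this produces the opencv-3.4.7/ folder in the current directory
tar -xf 3.4.7.tar.gz
```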
Finally, you can see the folder of `opencv-3.4.7/` in the current directory.
* To compile OpenCV, first set the OpenCV source path (`root_path`) and installation path (`install_path`). Here, `root_path` is the downloaded OpenCV source code path, and `install_path` is the installation path of OpenCV. In this example, the OpenCV source is `./opencv-3.4.7`.
```shell
cd ./opencv-3.4.7
export root_path=$PWD
export install_path=${root_path}/opencv3
```
* After entering the opencv source code path, you can compile it in the following way.
```shell
rm -rf build
mkdir build
cd build
cmake .. \
-DCMAKE_INSTALL_PREFIX=${install_path} \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DWITH_IPP=OFF \
-DBUILD_IPP_IW=OFF \
-DWITH_LAPACK=OFF \
-DWITH_EIGEN=OFF \
-DCMAKE_INSTALL_LIBDIR=lib64 \
-DWITH_ZLIB=ON \
-DBUILD_ZLIB=ON \
-DWITH_JPEG=ON \
-DBUILD_JPEG=ON \
-DWITH_PNG=ON \
-DBUILD_PNG=ON \
-DWITH_TIFF=ON \
-DBUILD_TIFF=ON
make -j
make install
```
* After `make install` is completed, the OpenCV header files and library files will be generated in this folder for later compilation of the PaddleClas source code.
Taking OpenCV 3.4.7 as an example, the final file structure under the OpenCV installation path is as follows. **NOTICE**: The following file structure may be different for different versions of OpenCV.
```
opencv3/
|-- bin
|-- include
|-- lib64
|-- share
```
<aname="1.2"></a>
### 1.2 Compile or download the Paddle Inference Library
* There are 2 ways to obtain the Paddle Inference Library, described in detail below.
<aname="1.2.1"></a>
#### 1.2.1 Compile from the source code
* If you want to get the latest Paddle Inference Library features, you can download the latest code from Paddle GitHub repository and compile the inference library from the source code.
* You can refer to [Paddle Inference Library](https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/05_inference_deployment/inference/build_and_install_lib_en.html#build-from-source-code) to get the Paddle source code from GitHub and then compile it to generate the latest inference library. The method of using git to access the code is as follows.
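A minimal sketch of fetching the Paddle source with git (the branch name is an example; switch to the release branch you need):
```shell
# clone the PaddlePaddle source code
git clone https://github.com/PaddlePaddle/Paddle.git
cd Paddle
# switch to the develop branch (or a release branch such as release/2.1)
git checkout develop
```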
* After entering the Paddle directory, the compilation method is as follows.
```shell
rm -rf build
mkdir build
cd build
cmake .. \
-DWITH_CONTRIB=OFF \
-DWITH_MKL=ON \
-DWITH_MKLDNN=ON \
-DWITH_TESTING=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DWITH_INFERENCE_API_TEST=OFF \
-DON_INFER=ON \
-DWITH_PYTHON=ON
make -j
make inference_lib_dist
```
For more compilation parameter options, please refer to the official website of the Paddle C++ inference library: [https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/05_inference_deployment/inference/build_and_install_lib_en.html#build-from-source-code](https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/05_inference_deployment/inference/build_and_install_lib_en.html#build-from-source-code).
* After the compilation process, you can see the following files in the folder of `build/paddle_inference_install_dir/`.
```
build/paddle_inference_install_dir/
|-- CMakeCache.txt
|-- paddle
|-- third_party
|-- version.txt
```
Among them, `paddle` is the Paddle library required for C++ prediction later, and `version.txt` contains the version information of the current inference library.
<aname="1.2.2"></a>
#### 1.2.2 Direct download and installation
* Linux inference libraries for different CUDA versions (based on GCC 4.8.2) are provided on the
[Paddle Inference Library official website](https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/05_inference_deployment/inference/build_and_install_lib_en.html). You can view and select the appropriate version of the inference library on the official website.
* Please select the `develop` version.
* After downloading, use the following method to uncompress.
```
tar -xf paddle_inference.tgz
```
Finally you can see the following files in the folder of `paddle_inference/`.
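The layout is expected to be similar to the compiled library above; the sketch below shows the typical contents (file names may differ slightly between versions):
```
paddle_inference/
|-- paddle
|-- third_party
|-- version.txt
```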
<aname="2"></a>
## 2. Compile
<aname="2.1"></a>
### 2.1 Compile PaddleClas C++ inference demo
* The compilation commands are as follows. The paths of the Paddle C++ inference library, OpenCV and other dependencies need to be replaced with the actual paths on your own machine.
```shell
sh tools/build.sh
```
Specifically, the content in `tools/build.sh` is as follows.
```shell
OPENCV_DIR=your_opencv_dir
LIB_DIR=your_paddle_inference_dir
CUDA_LIB_DIR=your_cuda_lib_dir
CUDNN_LIB_DIR=your_cudnn_lib_dir
TENSORRT_DIR=your_tensorrt_lib_dir
BUILD_DIR=build
rm -rf ${BUILD_DIR}
mkdir ${BUILD_DIR}
cd ${BUILD_DIR}
cmake .. \
-DPADDLE_LIB=${LIB_DIR} \
-DWITH_MKL=ON \
-DDEMO_NAME=clas_system \
-DWITH_GPU=OFF \
-DWITH_STATIC_LIB=OFF \
-DWITH_TENSORRT=OFF \
-DTENSORRT_DIR=${TENSORRT_DIR} \
-DOPENCV_DIR=${OPENCV_DIR} \
-DCUDNN_LIB=${CUDNN_LIB_DIR} \
-DCUDA_LIB=${CUDA_LIB_DIR}
make -j
```
Among the parameters in the above command:
* `OPENCV_DIR` is the OpenCV installation path;
* `LIB_DIR` is the downloaded Paddle Inference Library path (`paddle_inference` folder) or the compiled Paddle Inference Library path (`build/paddle_inference_install_dir` folder);
* `CUDA_LIB_DIR` is the CUDA library file path; in docker, it is `/usr/local/cuda/lib64`;
* `CUDNN_LIB_DIR` is the cuDNN library file path; in docker, it is `/usr/lib/x86_64-linux-gnu/`;
* `TENSORRT_DIR` is the TensorRT library file path; in docker, it is `/usr/local/TensorRT6-cuda10.0-cudnn7/`. TensorRT is only needed when GPU is enabled.
After the compilation is completed, an executable file named `clas_system` will be generated in the `build` folder.
<aname="2.2"></a>
### 2.2 Compile config lib and cls lib
In addition to compiling the demo directly, you can also compile only config lib and cls lib by running the following command:
```shell
sh tools/build_lib.sh
```
The contents of the above command are as follows:
```shell
OpenCV_DIR=path/to/opencv
PADDLE_LIB_DIR=path/to/paddle
BUILD_DIR=./lib/build
rm -rf ${BUILD_DIR}
mkdir ${BUILD_DIR}
cd ${BUILD_DIR}
cmake .. \
-DOpenCV_DIR=${OpenCV_DIR} \
-DPADDLE_LIB=${PADDLE_LIB_DIR} \
-DCMP_STATIC=ON
make
```
The specific description of each compilation option is as follows:
* `OpenCV_DIR`: The directory of the OpenCV compilation library. In this example, it is `opencv-3.4.7/opencv3/share/OpenCV`. Note that there needs to be an `OpenCVConfig.cmake` file under this directory;
* `PADDLE_LIB`: The directory of the Paddle inference library, which generally is the downloaded and decompressed `paddle_inference` path or the compiled `build/paddle_inference_install_dir` path. Note that there should be two subdirectories, `paddle` and `third_party`, in this directory;
* `CMP_STATIC`: Whether to compile config lib and cls lib into static link libraries (`.a`). The default is `ON`. If you need to compile them into dynamic link libraries (`.so`), please set it to `OFF`.
After executing the above commands, the dynamic link libraries (`libcls.so` and `libconfig.so`) or static link libraries (`libcls.a` and `libconfig.a`) of config lib and cls lib will be generated in the directory. In [2.1 Compile PaddleClas C++ inference demo](#2.1), you can set the compilation options `CLS_LIB` and `CONFIG_LIB` to the paths of the existing link libraries of cls lib and config lib, which can also be used for development.
<aname="3"></a>
## 3. Run the demo
<aname="3.1"></a>
### 3.1 Prepare the inference model
* You can refer to [the model export script](../../tools/export_model.py) to export the inference model. After the model is exported, assuming it is placed in the `inference` directory, the directory structure is as follows.
```
inference/
|--cls_infer.pdmodel
|--cls_infer.pdiparams
```
**NOTICE**: The `cls_infer.pdmodel` file stores the model structure information and the `cls_infer.pdiparams` file stores the model parameter information. The paths of the two files need to correspond to the parameters `cls_model_path` and `cls_params_path` in the configuration file `tools/config.txt`.
<aname="3.2"></a>
### 3.2 Run demo
First, please modify `tools/config.txt` and `tools/run.sh`.
* Some key fields in `tools/config.txt` are as follows.
    * use_gpu: Whether to use GPU.
    * gpu_id: GPU id.
    * gpu_mem: GPU memory.
    * cpu_math_library_num_threads: Number of threads for math library acceleration.
    * use_mkldnn: Whether to use MKL-DNN.
    * use_tensorrt: Whether to use TensorRT.
    * use_fp16: Whether to use Float16 (half precision); it only takes effect when use_tensorrt is set to 1.
    * cls_model_path: Model path of the inference model.
    * cls_params_path: Params path of the inference model.
    * resize_short_size: Short side length of the image after resize.
    * crop_size: Image size after center crop.
* You can modify `tools/run.sh` (`./build/clas_system ./tools/config.txt ./docs/imgs/ILSVRC2012_val_00000666.JPEG`):
    * ./build/clas_system: the path of the compiled executable file;
    * ./tools/config.txt: the path of the config file;
    * ./docs/imgs/ILSVRC2012_val_00000666.JPEG: the path of the image file to be predicted.
* Then execute the following command to complete the classification of an image.
```shell
sh tools/run.sh
```
* The prediction results will be shown on the screen as follows.
* In the above results, `class id` represents the id of the category with the highest confidence, and `score` represents the probability that the image belongs to that category.
PaddleClas has been tested on Windows with `Visual Studio 2019 Community`. Microsoft has supported managing `CMake` cross-platform build projects directly since `Visual Studio 2017`, but stable and complete support was not provided until `Visual Studio 2019`, so if you want to use CMake to manage the project build, we recommend `Visual Studio 2019`. If you want to compile by generating an `sln solution`, you can refer to this document: [https://zhuanlan.zhihu.com/p/145446681](https://zhuanlan.zhihu.com/p/145446681).
## Prerequisites
* Visual Studio 2019
* CUDA 9.0 / CUDA 10.0, cudnn 7.6+ (only required when using the GPU version of the inference library)
* CMake 3.0+
Please make sure the above basic software is correctly installed and configured on your system, where:
* When installing `Visual Studio 2019`, the `Desktop development with C++` workload needs to be selected;
* CUDA needs to be installed correctly and its system environment variables set;
* CMake needs to be installed correctly and its path added to the system environment variables.
The following tests are based on the `Visual Studio 2019 Community` edition.
**All the examples below use `D:\projects` as the working directory.**
### Step 1: Download the PaddlePaddle C++ inference library paddle_inference_install_dir
PaddlePaddle provides different precompiled versions of the C++ inference library for different `CPU` and `CUDA` versions. Please download according to your actual situation: [C++ inference library download list](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/guides/05_inference_deployment/inference/windows_cpp_inference.html).
PaddlePaddle supports exporting an inference model for deployment. Compared with the model saved during training, the inference model files persistently store both the network weights and the network structure, and PaddlePaddle supports loading the inference model with a faster prediction engine for deployment.
---
## Contents
- [1. Environment preparation](#1)
- [2. Export classification model](#2)
- [3. Export mainbody detection model](#3)
- [4. Export recognition model](#4)
- [5. Parameter description](#5)
<aname="1"></a>
## 1. Environmental preparation
First, refer to the [Installing PaddlePaddle](../installation/install_paddle_en.md) and the [Installing PaddleClas](../installation/install_paddleclas_en.md) to prepare environment.
<aname="2"></a>
## 2. Export classification model
Change the working directory to PaddleClas:
```shell
cd /path/to/PaddleClas
```
Taking the classification model ResNet50_vd as an example, download the pre-trained model:
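A hedged example of the download step; the URL below follows the naming convention of the PaddleClas pre-trained model server and should be verified against the model zoo:
```shell
# download the ResNet50_vd pre-trained weights (URL assumed from the PaddleClas model zoo naming convention)
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_pretrained.pdparams
```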
The above model weights are trained with the ResNet50_vd model on the ImageNet1k dataset, and the training configuration file is `ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml`. To export the inference model, just run the following command:
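A hedged sketch of the export command, using `tools/export_model.py` with the training config; the weight path and output directory are examples:
```shell
python3 tools/export_model.py \
    -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml \
    -o Global.pretrained_model=./ResNet50_vd_pretrained \
    -o Global.save_inference_dir=./inference
```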
For the recognition model, the model weights file is trained by ResNet50_vd on the AliProduct dataset, and the training configuration file is `ppcls/configs/Products/ResNet50_vd_Aliproduct.yaml`. The export command is similar to the one above.
Note that the inference model exported for recognition is truncated at the embedding layer, so the output of the model is an n-dimensional embedding feature.
<aname="5"></a>
## 5. Parameter description
In the above model export command, the configuration file used must be the same as the training configuration file. The following fields in the configuration file are used to configure the exported model parameters:
* `Global.image_shape`: To specify the input data size of the model, which does not contain the batch dimension;
* `Global.save_inference_dir`: To specify the directory where the exported inference model files are saved;
* `Global.pretrained_model`: To specify the path of the model weight file saved during training. This path does not need to contain the suffix `.pdparams` of the model weight file.
The export command will generate the following three files:
* `inference.pdmodel`: Stores the model network structure information;
* `inference.pdiparams`: Stores the model network weight information;
* `inference.pdiparams.info`: Stores the parameter information of the model, which can be ignored for the classification and recognition models.
The exported inference model is deployed using a prediction engine. You can refer to the following docs according to different deployment modes / platforms:
* [Python inference](./python_deploy.md)
* [C++ inference](./cpp_deploy.md) (Only supports classification)
* [Python Whl inference](./whl_deploy.md) (Only supports classification)
* [PaddleHub Serving inference](./paddle_hub_serving_deploy.md) (Only supports classification)
PaddleClas supports rapid service deployment through PaddleHub. At present, it supports the deployment of image classification; the deployment of image recognition will be supported in the future.
---
## Contents
- [1. Introduction](#1)
- [2. Prepare the environment](#2)
- [3. Download inference model](#3)
- [4. Install Service Module](#4)
- [5. Start service](#5)
    - [5.1 Start with command line parameters](#5.1)
    - [5.2 Start with configuration file](#5.2)
- [6. Send prediction requests](#6)
- [7. User defined service module modification](#7)
The HubServing service pack contains the following files; the directory is as follows:
```
hubserving/clas/
└─ __init__.py Empty file, required
└─ config.json Configuration file, optional, passed in as a parameter when using configuration to start the service
└─ module.py Main module file, required, contains the complete logic of the service
└─ params.py Parameter file, required, including parameters such as model path, pre- and post-processing parameters
```
<a name="3"></a>
## 3. Download inference model
Before installing the service module, you need to prepare the inference model and put it in the correct path. The default model path is:
* Model structure file: `PaddleClas/inference/inference.pdmodel`
* Model parameters file: `PaddleClas/inference/inference.pdiparams`
**Notice**:
* The model file path can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`.
* It should be noted that the prefix of model structure file and model parameters file must be `inference`.
* More models provided by PaddleClas can be obtained from the [model library](../../docs/en/models/models_intro_en.md). You can also use models trained by yourself.
<aname="4"></a>
## 4. Install Service Module
* On Linux platform, the examples are as follows.
```shell
cd PaddleClas/deploy
hub install hubserving/clas/
```
* On Windows platform, the examples are as follows.
```shell
cd PaddleClas\deploy
hub install hubserving\clas\
```
<aname="5"></a>
## 5. Start service
<aname="5.1"></a>
### 5.1 Start with command line parameters
This method only supports CPU. The command is as follows:
```shell
hub serving start --modules Module1==Version1 \
                  --port XXXX \
                  --use_multiprocess \
                  --workers \
```
**parameters:**
|parameters|usage|
|-|-|
|--modules/-m|PaddleHub Serving pre-installed model, listed in the form of multiple Module==Version key-value pairs<br>*`When Version is not specified, the latest version is selected by default`*|
|--port/-p|Service port, default is 8866|
|--use_multiprocess|Enable concurrent mode, the default is single-process mode, this mode is recommended for multi-core CPU machines<br>*`Windows operating system only supports single-process mode`*|
|--workers|The number of concurrent tasks specified in concurrent mode, the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores|
For example, start service:
```shell
hub serving start -m clas_system
```
This completes the deployment of a service API, using the default port number 8866.
<aname="5.2"></a>
### 5.2 Start with configuration file
This method supports both CPU and GPU. The command is as follows:
```shell
hub serving start --config/-c config.json
```
Wherein, the format of `config.json` is as follows:
```json
{
"modules_info":{
"clas_system":{
"init_args":{
"version":"1.0.0",
"use_gpu":true,
"enable_mkldnn":false
},
"predict_args":{
}
}
},
"port":8866,
"use_multiprocess":false,
"workers":2
}
```
- The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. Among them:
  - when `use_gpu` is `true`, the GPU is used to start the service;
  - when `enable_mkldnn` is `true`, MKL-DNN is used to accelerate inference.
- The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.
**Note:**
- When using the configuration file to start the service, other parameters will be ignored.
- If you use GPU prediction (that is, `use_gpu` is set to `true`), you need to set the environment variable CUDA_VISIBLE_DEVICES before starting the service, such as: ```export CUDA_VISIBLE_DEVICES=0```, otherwise you do not need to set it.
- **`use_gpu` and `use_multiprocess` cannot be `true` at the same time.**
- **When both `use_gpu` and `enable_mkldnn` are set to `true`, GPU is used and `enable_mkldnn` will be ignored.**
For example, use GPU card No. 3 to start the service:
```shell
cd PaddleClas/deploy
export CUDA_VISIBLE_DEVICES=3
hub serving start -c hubserving/clas/config.json
```
<aname="6"></a>
## 6. Send prediction requests
After the service starts, you can use the following command to send a prediction request and obtain the prediction result:
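A hedged sketch of sending a request with the test script referenced at the end of this document; the argument placeholders below follow the parameter list that follows and should be checked against `hubserving/test_hubserving.py`:
```shell
cd PaddleClas/deploy
# server_url is the service address, e.g. http://127.0.0.1:8866/predict/clas_system
# image_path is a single image or an image directory
python3 hubserving/test_hubserving.py server_url image_path
```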
- **image_path**: Test image path; it can be a single image path or an image directory path.
- **batch_size**: [**Optional**] Batch size; default is `1`.
- **resize_short**: [**Optional**] In preprocessing, resize by the short side; default is `256`.
- **crop_size**: [**Optional**] In preprocessing, center crop size; default is `224`.
- **normalize**: [**Optional**] In preprocessing, whether to normalize; default is `True`.
- **to_chw**: [**Optional**] In preprocessing, whether to transpose to `CHW`; default is `True`.
**Notice**:
If you want to use Transformer series models, such as `DeiT_***_384` and `ViT_***_384`, please pay attention to the input size of the model; you need to set `--resize_short=384` and `--crop_size=384`.
The returned result is a list, including the top-k classification results, the corresponding scores and the time cost of prediction, with details as follows.
```
list: The returned results
└─ list: The result of first picture
└─ list: The top-k classification results, sorted in descending order of score
└─ list: The scores corresponding to the top-k classification results, sorted in descending order of score
└─ float: The time cost of predicting the picture, unit second
```
**Note:** If you need to add, delete or modify the returned fields, you can modify the corresponding module. For the details, refer to the user-defined modification service module in the next section.
<aname="7"></a>
## 7. User defined service module modification
If you need to modify the service logic, the following steps are generally required:
1. Stop service
```shell
hub serving stop --port/-p XXXX
```
2. Modify the code in the corresponding files, such as `module.py` and `params.py`, according to actual needs. You need to re-install (`hub install hubserving/clas/`) and re-deploy after modifying `module.py`.
After modifying and installing, and before deploying, you can use `python hubserving/clas/module.py` to test the installed service module.
For example, if you need to replace the model used by the deployed service, you need to modify the model path parameters `cfg.model_file` and `cfg.params_file` in `params.py`. Of course, other related parameters may need to be modified at the same time. Please modify and debug according to the actual situation.
3. Uninstall old service module
```shell
hub uninstall clas_system
```
4. Install modified service module
```shell
hub install hubserving/clas/
```
5. Restart service
```shell
hub serving start -m clas_system
```
**Note**:
Common parameters can be modified in params.py:
* Directory of model files(include model structure file and model parameters file):
```python
"inference_model_dir":
```
* The number of Top-k results returned during post-processing:
```python
'topk':
```
* Mapping file corresponding to label and class ID during post-processing:
```python
'class_id_map_file':
```
In order to avoid unnecessary delay and to be able to predict in batches, the preprocessing (including resize, crop and so on) is completed on the client side, so modify [test_hubserving.py](../../deploy/hubserving/test_hubserving.py#L35-L52) if necessary.
This tutorial will introduce how to use [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) to deploy PaddleClas models on mobile phones.
Paddle-Lite is a lightweight inference engine for PaddlePaddle. It provides efficient inference capabilities for mobile phones and IoTs, and extensively integrates cross-platform hardware to provide lightweight deployment solutions for mobile-side deployment issues.
If you only want to test speed, please refer to [The tutorial of Paddle-Lite mobile-side benchmark test](../../docs/zh_CN/extension/paddle_mobile_inference.md).
---
## Contents
- [1. Preparation](#1)
    - [1.1 Build Paddle-Lite library](#1.1)
    - [1.2 Download inference library for Android or iOS](#1.2)
- [2. Start running](#2)
    - [2.1 Inference Model Optimization](#2.1)
        - [2.1.1 [RECOMMEND] Use pip to install Paddle-Lite and optimize model](#2.1.1)
        - [2.1.2 Compile Paddle-Lite to generate opt tool](#2.1.2)
        - [2.1.3 Demo of getting the optimized model](#2.1.3)
    - [2.2 Run optimized model on Phone](#2.2)
- [3. FAQ](#3)
<aname="1"></a>
## 1. Preparation
Paddle-Lite currently supports the following platforms:
- Computer (for compiling Paddle-Lite)
- Mobile phone (arm7 or arm8)
<aname="1.1"></a>
### 1.1 Prepare cross-compilation environment
The cross-compilation environment is used to compile the C++ demos of Paddle-Lite and PaddleClas.
For the detailed compilation directions of different development environments, please refer to the corresponding [document](https://paddle-lite.readthedocs.io/zh/latest/source_compile/compile_env.html).
<aname="1.2"></a>
## 1.2 Download inference library for Android or iOS
1. If you download the inference library from [Paddle-Lite official document](https://paddle-lite.readthedocs.io/zh/latest/quick_start/release_lib.html#android-toolchain-gcc), please choose `with_extra=ON` , `with_cv=ON` .
2. It is recommended to build the inference library from the [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) develop branch if you want to deploy a [quantized](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/quantization/README_en.md) model to mobile phones. Please refer to the [link](https://paddle-lite.readthedocs.io/zh/latest/user_guides/Compile/Android.html#id2) for more detailed information about compiling.
The structure of the inference library is as follows:
```
inference_lite_lib.android.armv8/
|-- cxx C++ inference library and header files
| |-- include C++ header files
| | |-- paddle_api.h
| | |-- paddle_image_preprocess.h
| | |-- paddle_lite_factory_helper.h
| | |-- paddle_place.h
| | |-- paddle_use_kernels.h
| | |-- paddle_use_ops.h
| | `-- paddle_use_passes.h
| `-- lib C++ inference library
| |-- libpaddle_api_light_bundled.a C++ static library
| `-- libpaddle_light_api_shared.so C++ dynamic library
|-- java Java inference library
| |-- jar
| | `-- PaddlePredictor.jar
| |-- so
| | `-- libpaddle_lite_jni.so
| `-- src
|-- demo C++ and java demos
| |-- cxx C++ demos
| `-- java Java demos
```
<aname="2"></a>
## 2. Start running
<aname="2.1"></a>
## 2.1 Inference Model Optimization
Paddle-Lite provides a variety of strategies to automatically optimize the original training model, including quantization, sub-graph fusion, hybrid scheduling, Kernel optimization and so on. In order to make the optimization process more convenient and easy to use, Paddle-Lite provides `opt` tool to automatically complete the optimization steps and output a lightweight, optimal executable model.
**NOTE**: If you have already got the `.nb` file, you can skip this step.
<aname="2.1.1"></a>
### 2.1.1 [RECOMMEND] Use `pip` to install Paddle-Lite and optimize model
* Use pip to install Paddle-Lite. The following command uses `pip3.7` .
```shell
pip install paddlelite==2.8
```
**Note**: The version of the `paddlelite` wheel must match that of the inference lib.
* Use `paddle_lite_opt` to optimize the inference model. The parameters of `paddle_lite_opt` are as follows:
| Parameter | Introduction |
| --- | --- |
| --model_dir | Path to the PaddlePaddle model (non-combined) to be optimized. |
| --model_file | Path to the network structure file of the PaddlePaddle model (combined) to be optimized. |
| --param_file | Path to the network weight file of the PaddlePaddle model (combined) to be optimized. |
| --optimize_out_type | Type of output model, `protobuf` by default. Supports `protobuf` and `naive_buffer`. Compared with `protobuf`, `naive_buffer` gives a more lightweight serialization/deserialization model. If you need to predict on the mobile side, please set it to `naive_buffer`. |
| --optimize_out | Path of the output model; there is no need to add the `.nb` suffix. |
| --valid_targets | The executable backend of the model, `arm` by default. Supports one or more of `x86`, `arm`, `opencl`, `npu`, `xpu`. If more than one is set, please separate the options by spaces, and the `opt` tool will choose the best way automatically. If you need to support Huawei NPU (DaVinci core carried by the Kirin 810/990 SoC), please set it to `npu arm`. |
| --record_tailoring_info | Whether to enable `Cut the Library Files According To the Model`, `false` by default. If you need to record kernel and OP info of the optimized model, please set it to `true`. |
In addition, you can run `paddle_lite_opt` to get more detailed information about how to use.
<aname="2.1.2"></a>
### 2.1.2 Compile Paddle-Lite to generate `opt` tool
Optimizing the model requires Paddle-Lite's `opt` executable file, which can be obtained by compiling Paddle-Lite. The steps are as follows:
```shell
# get the Paddle-Lite source code (skip this step if you already have it)
git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
git checkout develop
# build the opt tool (the build target follows the Paddle-Lite build scripts)
./lite/tools/build.sh build_optimize_tool
```
After the compilation is complete, the `opt` file is located under `build.opt/lite/api/`.
The `opt` tool is used in the same way as `paddle_lite_opt`; please refer to [2.1.1](#2.1.1).
<aname="2.1.3"></a>
### 2.1.3 Demo of get the optimized model
Taking the `MobileNetV3_large_x1_0` model of PaddleClas as an example, we will introduce how to use `paddle_lite_opt` to complete the conversion from the pre-trained model to the inference model, and then to the Paddle-Lite optimized model.
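A hedged sketch of the three steps (pre-trained weights → inference model → `.nb` model); the download URL and output paths are assumptions following PaddleClas conventions:
```shell
# 1. download the MobileNetV3_large_x1_0 pre-trained weights (URL assumed from the PaddleClas model zoo)
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x1_0_pretrained.pdparams
# 2. export the inference model (inference.pdmodel / inference.pdiparams)
python3 tools/export_model.py \
    -c ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_large_x1_0.yaml \
    -o Global.pretrained_model=./MobileNetV3_large_x1_0_pretrained \
    -o Global.save_inference_dir=./MobileNetV3_large_x1_0_infer
# 3. convert to a Paddle-Lite optimized model with paddle_lite_opt
paddle_lite_opt \
    --model_file=./MobileNetV3_large_x1_0_infer/inference.pdmodel \
    --param_file=./MobileNetV3_large_x1_0_infer/inference.pdiparams \
    --optimize_out=./MobileNetV3_large_x1_0 \
    --optimize_out_type=naive_buffer \
    --valid_targets=arm
```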
When the above commands are completed, there will be a `MobileNetV3_large_x1_0.nb` file in the current directory, which is the converted model file.
<a name="2.2"></a>
### 2.2 Run optimized model on Phone
1. Prepare an Android phone with `arm8`. If the compiled inference library and `opt` file are `armv7`, you need an `arm7` phone and modify `ARM_ABI = arm7` in the Makefile.
2. Install the ADB tool on the computer.
* Install ADB for MAC
It is recommended to use Homebrew for installation.
```shell
brew cask install android-platform-tools
```
* Install ADB for Linux
```shell
sudo apt update
sudo apt install -y wget adb
```
* Install ADB for Windows
To install ADB on Windows, you need to download it from Google's Android platform site: [Download Link](https://developer.android.com/studio).
3. First, make sure the phone is connected to the computer, turn on the `USB debugging` option of the phone, and select the `file transfer` mode. Verify whether ADB is installed successfully as follows:
```shell
$ adb devices
List of devices attached
744be294 device
```
If there is `device` output like the above, it means the installation was successful.
4. Prepare optimized model, inference library files, test image and dictionary file used.
```shell
cd PaddleClas_root_path
cd deploy/lite/
# prepare.sh will put the inference library files, the test image and the dictionary files in demo/cxx/clas
sh prepare.sh /{lite inference library path}/inference_lite_lib.android.armv8
# enter the working directory of lite demo
cd /{lite inference library path}/inference_lite_lib.android.armv8/
cd demo/cxx/clas/
# copy the C++ inference dynamic library file (i.e. .so) to the debug folder (path relative to demo/cxx/clas/)
cp ../../../cxx/lib/libpaddle_light_api_shared.so ./debug/
```
`prepare.sh` takes `PaddleClas/deploy/lite/imgs/tabby_cat.jpg` as the test image and copies it to the `demo/cxx/clas/debug/` directory.
You should put the model optimized by `paddle_lite_opt` under the `demo/cxx/clas/debug/` directory. In this example, use the `MobileNetV3_large_x1_0.nb` model file generated in [2.1.3](#2.1.3).
The structure of the clas demo is as follows after the above command is completed:
```
demo/cxx/clas/
|-- debug/
| |--MobileNetV3_large_x1_0.nb class model
| |--tabby_cat.jpg test image
| |--imagenet1k_label_list.txt dictionary file
| |--libpaddle_light_api_shared.so C++ .so file
| |--config.txt config file
|-- config.txt config file
|-- image_classfication.cpp source code
|-- Makefile compile file
```
**NOTE**:
* `imagenet1k_label_list.txt` is the category mapping file of the `ImageNet1k` dataset. If you use custom categories, you need to replace this category mapping file.
* `config.txt` contains the hyperparameters, as follows:
```shell
clas_model_file ./MobileNetV3_large_x1_0.nb # path of model file
label_path ./imagenet1k_label_list.txt # path of category mapping file
resize_short_size 256 # the short side length after resize
crop_size 224 # side length used for inference after cropping
visualize 0 # whether to visualize. If you set it to 1, an image file named 'clas_result.png' will be generated in the current directory.
```
5. Run Model on Phone
```shell
# run compile to get the executable file 'clas_system'
make -j
# move the compiled executable file to the debug folder
mv clas_system ./debug/
```
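Once the executable is in `debug/`, a hedged sketch of pushing the folder to the phone and running the demo over adb (the on-device paths are examples):
```shell
# push the debug folder (model, config, test image, executable, .so) to the phone
adb push debug /data/local/tmp/
adb shell
cd /data/local/tmp/debug
export LD_LIBRARY_PATH=/data/local/tmp/debug:$LD_LIBRARY_PATH
# run the demo: arguments are the config file and the test image
./clas_system ./config.txt ./tabby_cat.jpg
```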
<a name="3"></a>
## 3. FAQ
Q1: If I want to change the model, do I need to go through the whole process again?
A1: If you have completed the above steps, you only need to replace the `.nb` model file after replacing the model. At the same time, you may need to modify the path of the `.nb` file in the config file and change the category mapping file to be compatible with the new model.
Q2: How do I change the test image?
A2: Replace the test image under the debug folder with the image you want to test, and then push it to the phone again.
The inference model (the model saved by `paddle.jit.save`) is generally a solidified model saved after the model training is completed, and is mostly used for prediction in deployment.
...
...
The model saved during the training process is the checkpoints model, which saves the parameters of the model.
Compared with the checkpoints model, the inference model will additionally save the structural information of the model. Therefore, it is easier to deploy because the model structure and model parameters are already solidified in the inference model file, and is suitable for integration with actual systems.
Next, we first introduce how to convert a trained model into an inference model. Then we introduce mainbody detection and feature extraction based on the inference model,
followed by a recognition pipeline consisting of mainbody detection, feature extraction and vector search. Finally, we introduce classification based on the inference model.
Please refer to the document [install paddle](../installation/install_paddle_en.md) and [install paddleclas](../installation/install_paddleclas_en.md) to prepare the environment.
---
## Contents
- [1. Image classification inference](#1)
- [2. Mainbody detection model inference](#2)
- [3. Feature Extraction model inference](#3)
- [4. Concatenation of mainbody detection, feature extraction and vector search](#4)
<aname="1"></a>
## 1. Image classification inference
<aname="CONVERT"></a>
## CONVERT TRAINING MODEL TO INFERENCE MODEL
<aname="Convert_feature_extraction"></a>
### Convert feature extraction model to inference model
First, please enter the root folder of PaddleClas and download the product feature extraction model:
First, please refer to the document [export model](./export_model_en.md) to prepare the inference model files. All the commands should be run under the `deploy` folder of PaddleClas:
The above model is trained on AliProduct with ResNet50_vd as the backbone. To convert the trained model into an inference model, just run the following command:
```shell
cd deploy
# -c: set the training yml configuration file
# -o: set optional parameters
# Global.pretrained_model: the path of the trained model weights to be converted, without the file suffix .pdmodel, .pdopt or .pdparams
# Global.save_inference_dir: the directory where the converted model will be saved
```
For classification model inference, you can execute the following commands:
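A hedged example using the Python prediction script under `deploy` (the script name is assumed from the PaddleClas `deploy/python` directory):
```shell
python3 python/predict_cls.py -c configs/inference_cls.yaml
```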
When converting to an inference model, the configuration file used is the same as the configuration file used during training. In addition, you also need to set the `Global.pretrained_model` parameter in the configuration file.
After the conversion is successful, there are three files in the model save directory:
### Convert classification model to inference model
In the configuration file `configs/inference_cls.yaml`, the following fields are used to configure prediction parameters:
* `Global.infer_imgs`: The path of the image to be predicted;
* `Global.inference_model_dir`: The directory of the inference model files, which should contain `inference.pdmodel` and `inference.pdiparams`;
* `Global.use_tensorrt`: Whether to use `TensorRT`, `False` by default;
* `Global.use_gpu`: Whether to use GPU, `True` by default;
* `Global.enable_mkldnn`: Whether to use `MKL-DNN`, `False` by default. Valid only when `use_gpu` is `False`;
* `Global.use_fp16`: Whether to use `FP16`, `False` by default;
* `PreProcess`: To configure the preprocessing of the image to be predicted;
* `PostProcess`: To configure the postprocessing of prediction results;
* `PostProcess.Topk.class_id_map_file`: The path of the file mapping class id to label; ImageNet1k (`./utils/imagenet1k_label_list.txt`) by default.
The model is trained on ImageNet with ResNet50_vd as the backbone, using the config file `ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml`.
The model can be converted to an inference model in the same way as the feature extraction model, as follows:
```shell
# -c: set the training yml configuration file
# -o: set optional parameters
# Global.pretrained_model: the path of the trained model weights to be converted, without the file suffix .pdmodel, .pdopt or .pdparams
# Global.save_inference_dir: the directory where the converted model will be saved
```
**Notice**:
* If you use VisionTransformer series models, such as `DeiT_***_384` and `ViT_***_384`, please pay attention to the input size of the model; you need to specify `PreProcess.resize_short=384` and `PreProcess.resize=384`.
* If you want to improve the evaluation speed, it is recommended to enable TensorRT when using GPU, and MKL-DNN when using CPU.
The following will introduce the feature extraction model inference. First, please refer to the document [export model](./export_model_en.md) to prepare the inference model files. All the commands should be run under the `deploy` folder of PaddleClas:
```shell
cd deploy
```
For feature extraction model inference, you can execute the following commands:
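A hedged example, assuming the recognition prediction script and config follow the same layout as the classification ones (`deploy/python/predict_rec.py`, `configs/inference_rec.yaml`):
```shell
python3 python/predict_rec.py -c configs/inference_rec.yaml
```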
PaddleClas supports prediction with a Python whl package. At present, the whl package only supports image classification, and does not support mainbody detection, feature extraction or vector search.
---
## Contents
- [1. Installation](#1)
- [2. Quick Start](#2)
- [3. Definition of Parameters](#3)
- [4. Usage](#4)
    - [4.1 View help information](#4.1)
    - [4.2 Prediction using inference model provided by PaddleClas](#4.2)
    - [4.3 Prediction using local model files](#4.3)
    - [4.4 Prediction by batch](#4.4)
    - [4.5 Prediction of Internet image](#4.5)
    - [4.6 Prediction of `NumPy.array` format image](#4.6)
    - [4.7 Save the prediction result(s)](#4.7)
    - [4.8 Specify the mapping between class id and label name](#4.8)
<aname="1"></a>
## 1. Installation
* installing from pypi
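A hedged install command (pin the version that matches your PaddlePaddle installation if needed):
```shell
pip3 install paddleclas
```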
...
...
* Build and install locally:
```shell
python3 setup.py bdist_wheel
pip3 install dist/*
```
<aname="2"></a>
## 2. Quick Start
* Use the `ResNet50` model provided by PaddleClas and the following image (`'docs/images/whl/demo.jpg'`) as an example.
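A hedged command-line example of this quick start (the flag names follow the whl package's CLI; verify them with `paddleclas -h`):
```shell
paddleclas --model_name=ResNet50 --infer_imgs="docs/images/whl/demo.jpg"
```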
PaddleClas provides two ways to use: in Python code and by command line.
<a name="4.1"></a>
### 4.1 View help information
```shell
paddleclas -h
```
<aname="4.2"></a>
### 4.2 Prediction using inference model provide by PaddleClas
You can use the inference model provided by PaddleClas to predict, and only need to specify `model_name`. In this case, PaddleClas will automatically download files of specified model and save them in the directory `~/.paddleclas/`.
<a name="4.3"></a>
### 4.3 Prediction using local model files
You can use local model files trained by yourself to predict; you only need to specify `inference_model_dir`. Note that the directory must contain `inference.pdmodel` and `inference.pdiparams` files.
<a name="4.5"></a>
### 4.5 Prediction of Internet image
You can predict an Internet image; you only need to specify the URL of the image by `infer_imgs`. In this case, the image file will be downloaded and saved in the directory `~/.paddleclas/images/`.
<a name="4.6"></a>
### 4.6 Prediction of `NumPy.array` format image
In Python code, you can predict a `NumPy.array` format image; you only need to pass the image data via `infer_imgs`. Note that the image data must have 3 channels.
<a name="4.8"></a>
### 4.8 Specify the mapping between class id and label name
You can specify the mapping between class id and label name; you only need to use `class_id_map_file` to specify the mapping file. PaddleClas uses the ImageNet1k mapping by default.
- [2. (Recommended) Prepare a docker environment](#2)
- [3. Install PaddlePaddle using pip](#3)
- [4. Verify installation](#4)
At present, **PaddleClas** requires **PaddlePaddle** version **>=2.0**. Docker is recommended to run PaddleClas; for more detailed information about docker and nvidia-docker, you can refer to the [tutorial](https://docs.docker.com/get-started/). If you do not want to use docker, you can skip section [2. (Recommended) Prepare a docker environment](#2) and go to section [3. Install PaddlePaddle using pip](#3).
## 1. Environment requirements
- python 3.x
- cuda >= 10.1 (necessary if paddlepaddle-gpu is used)
- cudnn >= 7.6.4 (necessary if paddlepaddle-gpu is used)
- nccl >= 2.1.2 (necessary if distributed training/eval is used)
- gcc >= 8.2
**Recommendations**:
* When CUDA version is 10.1, the driver version `>= 418.39`;
* When CUDA version is 10.2, the driver version `>= 440.33`;
* For more CUDA versions and specific driver versions, please refer to [link](https://docs.nvidia.com/deploy/cuda-compatibility/index.html).
<aname="2"></a>
## 2. (Recommended) Prepare a docker environment
* Switch to the working directory
```shell
cd /home/Projects
```
* Create docker container
The following command will create a docker container named ppcls and map the current working directory to the `/paddle` directory in the container.
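A hedged example of the container creation command; the image tag is an assumption, so pick a matching CPU or GPU image from DockerHub as noted below:
```shell
# CPU image example; replace the tag with a GPU image (and use nvidia-docker) if you need GPU support
sudo docker run --name ppcls -v $PWD:/paddle --shm-size=8g --network=host -it paddlepaddle/paddle:2.1.0 /bin/bash
```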
* The first time you use this docker image, it will be downloaded automatically. Please be patient;
* The above command will create a docker container named ppcls, and there is no need to run the command again when using the container again;
* The parameter `--shm-size=8g` will set the shared memory of the container to 8g. If conditions permit, it is recommended to set this parameter to a larger value, such as `64g`;
* You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to obtain the image adapted to your machine;
* Exit / Enter the docker container:
* After entering the docker container, you can exit the current container by pressing `Ctrl + P + Q` without closing the container;
* To re-enter the container, use the following command:
```shell
sudo docker exec -it ppcls /bin/bash
```
<a name="3"></a>
## 3. Install PaddlePaddle using pip
If you want to use PaddlePaddle on GPU, you can use the following command to install PaddlePaddle.
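A hedged example (the index URL is the Baidu mirror commonly used in the PaddlePaddle docs; the plain PyPI index also works):
```shell
# for the CPU-only version, install the paddlepaddle package instead
pip3 install paddlepaddle-gpu --upgrade -i https://mirror.baidu.com/pypi/simple
```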
* If you have already installed CPU version of PaddlePaddle and want to use GPU version now, you should uninstall CPU version of PaddlePaddle and then install GPU version to avoid package confusion.
* You can also compile PaddlePaddle from source code; please refer to the [PaddlePaddle Installation tutorial](http://www.paddlepaddle.org.cn/install/quick) for more compilation options.
* Make sure the compiled source code is later than PaddlePaddle 2.0.
* Specify **WITH_DISTRIBUTE=ON** when compiling. Please refer to the [Instruction](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#id3) for more details.
* When running in docker, in order to ensure that the container has enough shared memory for the dataloader acceleration of Paddle, please set the parameter `--shm-size=8g` when creating the docker container; if conditions permit, you can set it to a larger value.
VisualDL, a visualization analysis tool of PaddlePaddle, provides a variety of charts to show the trends of parameters, and visualizes model structures, data samples, histograms of tensors, PR curves, ROC curves and high-dimensional data distributions. It enables users to understand the training process and the model structure more clearly and intuitively so as to optimize models efficiently. For more information, please refer to [VisualDL](https://github.com/PaddlePaddle/VisualDL/).
<a name='2'></a>
## 2. Use VisualDL in PaddleClas
Now PaddleClas supports using VisualDL to visualize the changes of learning rate, loss and accuracy during training.
<a name='2.1'></a>
### 2.1 Set config and start training
You only need to set the field `Global.use_visualdl` to `True` in train config:
```yaml
# config.yaml
Global:
...
  use_visualdl: True
...
```
PaddleClas will save the VisualDL logs to subdirectory `vdl/` under the output directory specified by `Global.output_dir`. And then you just need to start training normally:
```shell
python3 tools/train.py -c config.yaml
```
<a name='2.2'></a>
### 2.2 Start VisualDL
After starting the training program, you can start the VisualDL service in a new terminal session:
```shell
visualdl --logdir ./output/vdl/
```
In the above command, `--logdir` specifies the directory of the VisualDL logs produced during training. VisualDL will traverse the subdirectories of the specified directory to visualize all the experimental results. You can also use the following parameters to set the IP address and port number of the VisualDL service:
* `--host`: IP address, default is 127.0.0.1
* `--port`: port, default is 8040
For more information about the command, please refer to [VisualDL](https://github.com/PaddlePaddle/VisualDL/blob/develop/README.md#2-launch-panel).
Then you can open the address `127.0.0.1:8040` in the browser and view the training process:
* [2.4 Model Optimization and Speed Evaluation](#2.4)
<a name='1'></a>
## 1. Introduction
[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) is a lightweight inference engine that is fully functional, easy to use and performs well. Its lightweight design is reflected in using fewer bits to represent the weights and activations of the neural network, which can greatly reduce the size of the model, solve the problem of limited storage space on mobile devices, and achieve inference speed that is better than other frameworks on the whole.
In [PaddleClas](https://github.com/PaddlePaddle/PaddleClas), we use Paddle-Lite to [evaluate the performance on mobile devices](../models/Mobile.md). In this section we use the `MobileNetV1` model trained on the `ImageNet1k` dataset as an example to introduce how to use `Paddle-Lite` to evaluate the model speed on a mobile terminal (evaluated on SD855).
<a name='2'></a>
## 2. Evaluation Steps
<a name='2.1'></a>
### 2.1 Export the Inference Model
* First, you should transform the model saved during training into an inference model, which can be exported by `tools/export_model.py`. The specific way to transform it is as follows.
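A hedged sketch of this export step for the static-graph `tools/export_model.py` used by this benchmark doc; the flag names and pretrained-weight path are assumptions and should be checked against the script:
```shell
# export MobileNetV1 to an inference model under inference/MobileNetV1 (flag names are assumptions)
python tools/export_model.py \
    --model=MobileNetV1 \
    --pretrained_model=pretrained/MobileNetV1_pretrained/ \
    --output_path=inference/MobileNetV1
```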
Finally, the `model` and `params` files will be saved in `inference/MobileNetV1`.
<a name='2.2'></a>
### 2.2 Download Benchmark Binary File
* Use the adb (Android Debug Bridge) tool to connect the Android phone and the PC, then develop and debug. After installing adb and ensuring that the PC and the phone are successfully connected, use the following command to view the ARM version of the phone and select the pre-compiled library based on ARM version.
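For reference, the ARM ABI of the connected phone can be queried with adb as follows:
```shell
# prints the ABI, e.g. arm64-v8a or armeabi-v7a
adb shell getprop ro.product.cpu.abi
```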
After the PC and mobile phone are successfully connected, use the following command to start the model evaluation.
```
sh deploy/lite/benchmark/benchmark.sh ./benchmark_bin_v8 ./inference result_armv8.txt true
```
Where `./benchmark_bin_v8` is the path of the benchmark binary file, `./inference` is the path of all the models that need to be evaluated, `result_armv8.txt` is the result file, and the final parameter `true` means that the model will be optimized before evaluation. Eventually, the evaluation result file of `result_armv8.txt` will be saved in the current folder. The specific performances are as follows.
```
PaddleLite Benchmark
Threads=1 Warmup=10 Repeats=30
MobileNetV1 min = 30.89100 max = 30.73600 average = 30.79750
Threads=2 Warmup=10 Repeats=30
MobileNetV1 min = 18.26600 max = 18.14000 average = 18.21637
Threads=4 Warmup=10 Repeats=30
MobileNetV1 min = 10.03200 max = 9.94300 average = 9.97627
```
The above shows the model inference latency (in ms) under different numbers of threads. Taking a single thread as an example, the average latency of MobileNetV1 on SD855 is 30.79750 ms.
<a name='2.4'></a>
### 2.4 Model Optimization and Speed Evaluation
* In the section above, we mentioned that the model will be optimized before evaluation. Here you can first optimize the model, and then directly load the optimized model for speed evaluation.
* Paddle-Lite
Paddle-Lite provides multiple strategies to automatically optimize the original training model, including quantization, subgraph fusion, hybrid scheduling, kernel optimization and so on. In order to make the optimization more convenient and easy to use, Paddle-Lite provides the `opt` tool to automatically complete the optimization steps and output a lightweight, optimal and executable model, which can be downloaded from the [Paddle-Lite Model Optimization Page](https://paddle-lite.readthedocs.io/zh/latest/user_guides/model_optimize_tool.html). Here we take `macOS` as our development environment, download the [opt_mac](https://paddlelite-data.bj.bcebos.com/model_optimize_tool/opt_mac) model optimization tool, and use the following commands to optimize the model.
```shell
model_file="../MobileNetV1/model"
param_file="../MobileNetV1/params"
opt_models_dir="./opt_models"
mkdir ${opt_models_dir}
./opt_mac --model_file=${model_file} \
    --param_file=${param_file} \
    --valid_targets=arm \
    --optimize_out_type=naive_buffer \
    --prefer_int8_kernel=false \
    --optimize_out=${opt_models_dir}/MobileNetV1
```
Here `model_file` and `param_file` are the exported model structure file and model weights file respectively. After the transformation succeeds, `MobileNetV1.nb` will be saved in `opt_models`.
Use the benchmark_bin file to load the optimized model for evaluation. The commands are as follows.
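A hedged example, following the same pattern as the earlier benchmark command but pointing at the optimized models and dropping the on-the-fly optimization flag (an assumption based on that pattern):
```shell
sh deploy/lite/benchmark/benchmark.sh ./benchmark_bin_v8 ./opt_models result_armv8.txt
```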
[The NVIDIA Data Loading Library](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html) (DALI) is a library for data loading and pre-processing that accelerates deep learning applications. It can be used to build the Dataloader of Paddle.
Since deep learning relies on a large amount of data in the training stage, and this data needs to be loaded and preprocessed, these operations are usually executed on the CPU, which limits further improvement of the training speed; especially when the batch_size is large, data loading becomes the bottleneck. DALI can use the GPU to accelerate these operations, thereby further improving the training speed.
## Installing DALI
DALI only supports Linux x64 with CUDA 10.2 or later.
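A hedged install command for the CUDA 10.2 build, following NVIDIA's published install instructions (adjust the package name for other CUDA versions):
```shell
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda102
```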
For more information about installing DALI, please refer to [DALI](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html).
## Using DALI
PaddleClas supports training with DALI in static graph mode. Since DALI only supports GPU training, `CUDA_VISIBLE_DEVICES` needs to be set, and since DALI occupies GPU memory, some GPU memory needs to be reserved for DALI. To train with DALI, just set the field `use_dali: True` in the training config, or start the training by the following command:
```shell
# set the GPUs that can be seen
export CUDA_VISIBLE_DEVICES="0"
# set the GPU memory fraction used for neural network training, generally 0.8 or 0.7; the remaining GPU memory is reserved for DALI
export FLAGS_fraction_of_gpu_memory_to_use=0.8
# start training with DALI enabled (the config path below is an example)
python3 tools/static/train.py -c configs/ResNet/ResNet50.yaml -o use_dali=True
```
Transfer learning is an important part of machine learning, which is widely used in various fields such as text and images. Here we mainly introduce transfer learning in the field of image classification, which is often called domain transfer, for example, migrating an ImageNet classification model to a specified image classification task such as flower classification.
---
## Contents
* [1. Hyperparameter search](#1)
    * [1.1 Grid search](#1.1)
    * [1.2 Bayesian search](#1.2)
* [2. Large-scale image classification](#2)
* [3. Reference](#3)
<a name='1'></a>
## 1. Hyperparameter search
ImageNet is a widely used dataset for image classification, and a series of empirical hyperparameters have been summarized for it, with which high accuracy can be obtained. However, when applied to a specific dataset, these hyperparameters may not be optimal. There are two commonly used hyperparameter search methods that can help us obtain better model hyperparameters.
<a name='1.1'></a>
### 1.1 Grid search
Grid search, also called exhaustive search, determines the optimal value by finding the best solution among all solutions in the search space. The method is simple and effective, but when the search space is large, it consumes huge computing resources.
<a name='1.2'></a>
### 1.2 Bayesian search
Bayesian search, also called Bayesian optimization, is realized by randomly selecting a group of hyperparameters in the search space. A Gaussian process is used to update the hyperparameters and compute their expected mean and variance according to the performance of the previous hyperparameters. The larger the expected mean, the greater the probability of being close to the optimal solution; the larger the expected variance, the greater the uncertainty. Usually, a hyperparameter point with a large expected mean is called an `exploitation` point, and a hyperparameter point with a large variance is called an `exploration` point. An acquisition function is defined to balance the expected mean and variance, and the currently selected hyperparameter point is viewed as the optimal position with maximum probability.
Based on the above two search schemes, we carried out experiments with a fixed scheme and the two search schemes on 8 open source datasets. Following the experimental scheme in [1], we search for 4 hyperparameters; the search space and the experimental results are as follows:
Grid search takes 196 trials, while Bayesian search takes about 10 times fewer. The baseline is trained with the fixed scheme using the ImageNet1k pretrained model based on ResNet50_vd. The experiments are shown below.
- The above experiments verify that, compared to grid search, Bayesian search only reduces the accuracy by 0% to 0.4% while reducing the number of searches by about 10 times.
- The search space can be expanded easily when using Bayesian search.
<a name='2'></a>
## 2. Large-scale image classification
In practical applications, due to the lack of training data, the classification model trained on the ImageNet1k dataset is often used as the pretrained model for other image classification tasks. In order to further help solve practical problems, based on ResNet50_vd, Baidu open sourced a self-developed large-scale classification pretrained model, in which the training data contains 100,000 categories and 43 million images. The pretrained model can be downloaded here: [**download link**](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_10w_pretrained.pdparams)
We conducted transfer learning experiments on 6 self-collected datasets, using a fixed set of parameters and a grid search method, in which the number of training epochs was set to 20, the ResNet50_vd model was selected, and the ImageNet pretrained accuracy was 79.12%. The comparison of the dataset statistics and model accuracy is as follows:
| Dataset | Statistics | **Pretrained model on ImageNet <br />Top-1(fixed)/Top-1(search)** | **Pretrained model on large-scale dataset<br />Top-1(fixed)/Top-1(search)** |
| ------- | ---------- | ----------------------------------------------------------------- | ---------------------------------------------------------------------------- |
- The above experiments verified that, with fixed parameters, compared with the pretrained model on ImageNet, using the large-scale classification model as a pretrained model can improve model performance on a new dataset in most cases. Parameter search can further improve the model performance.
<a name='3'></a>
## 3. Reference
[1] Kornblith, Simon, Jonathon Shlens, and Quoc V. Le. "Do better imagenet models transfer better?." *Proceedings of the IEEE conference on computer vision and pattern recognition*. 2019.
[2] Kolesnikov, Alexander, et al. "Large Scale Learning of General Visual Representations for Transfer." *arXiv preprint arXiv:1912.11370* (2019).