diff --git a/docs/en/extension/multi_machine_training.md b/docs/en/extension/multi_machine_training_en.md
similarity index 100%
rename from docs/en/extension/multi_machine_training.md
rename to docs/en/extension/multi_machine_training_en.md
diff --git a/docs/en/extension/paddle_hub.md b/docs/en/extension/paddle_hub_en.md
similarity index 100%
rename from docs/en/extension/paddle_hub.md
rename to docs/en/extension/paddle_hub_en.md
diff --git a/docs/en/extension/paddle_mobile_inference.md b/docs/en/extension/paddle_mobile_inference.md
deleted file mode 100644
index 2d7a2968d68cd269e6cfdba5b20598d62d201c17..0000000000000000000000000000000000000000
--- a/docs/en/extension/paddle_mobile_inference.md
+++ /dev/null
@@ -1,9 +0,0 @@
-# Paddle-Lite
-
-[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) is an open-source deep learning framework designed by PaddlePaddle to make it easy to perform inference on mobile, embeded, and IoT devices.
-Light Weight is reflected in the use of fewer bits to represent the weight and activation of the neural network,
-which can greatly reduce the size of the model,
-solve the problem of limited storage space of the terminal device,
-and the inference performance is overall better than other frame.
-[PaddleClas](https://github.com/PaddlePaddle/PaddleClas) has used Paddle-Lite to evaluate [the performance of the mobile model](../models/Mobile.md).
-For more detail of process, please refer to [Paddle-Lite documentations](https://paddle-lite.readthedocs.io/zh/latest/).
diff --git a/docs/en/extension/paddle_quantization.md b/docs/en/extension/paddle_quantization_en.md
similarity index 100%
rename from docs/en/extension/paddle_quantization.md
rename to docs/en/extension/paddle_quantization_en.md
diff --git a/docs/en/extension/paddle_serving_en.md b/docs/en/extension/paddle_serving_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..fc64a45d520bb6823059cdac527474f3b6961cdd
--- /dev/null
+++ b/docs/en/extension/paddle_serving_en.md
@@ -0,0 +1,64 @@
+# Model Service Deployment
+
+## I. Overview
+[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning researchers easily deploy online inference services. It supports one-click industrial-grade deployment, high concurrency and efficient communication between client and server, and clients in multiple programming languages.
+
+This document takes HTTP inference service deployment as an example to introduce how to use Paddle Serving to deploy model services in PaddleClas.
+
+## II. Serving Installation
+
+The Serving official website recommends using Docker to install and deploy the Serving environment. First, pull the Docker image and create a Serving-based Docker container:
+
+```shell
+nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
+nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
+nvidia-docker exec -it test bash
+```
+
+Inside the container, install the packages required by Serving:
+
+```shell
+pip install paddlepaddle-gpu
+pip install paddle-serving-client
+pip install paddle-serving-server-gpu
+```
+
+* If the installation is too slow, you can append `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip commands to speed it up.
+
+* If you want to deploy a CPU service, install the CPU version of Serving instead:
+
+```shell
+pip install paddle-serving-server
+```
+
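+As a quick sanity check, you can confirm that the installed packages import correctly inside the container. This is an optional, hedged sketch; the module names simply mirror the pip packages above (use `paddle_serving_server` in place of `paddle_serving_server_gpu` for the CPU build):
+
+```shell
+# Verify that PaddlePaddle and the Serving client/server packages import cleanly.
+python -c "import paddle; print(paddle.__version__)"
+python -c "import paddle_serving_client"
+python -c "import paddle_serving_server_gpu"  # or paddle_serving_server for CPU
+```
+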
+## III. Export Model
+
+Export the Serving model using `tools/export_serving_model.py`. Taking ResNet50_vd as an example, the command is as follows:
+
+```shell
+python tools/export_serving_model.py -m ResNet50_vd -p ./pretrained/ResNet50_vd_pretrained/ -o serving
+```
+
+Finally, the client configuration and the model parameter and structure files will be saved in `ppcls_client_conf` and `ppcls_model` respectively.
+
+## IV. Service Deployment and Request
+
+* Use the following command to start the Serving service:
+
+```shell
+python tools/serving/image_service_gpu.py serving/ppcls_model workdir 9292
+```
+
+Here `serving/ppcls_model` is the path of the Serving model saved in the previous step, `workdir` is the working directory, and `9292` is the port of the service.
+
+* Use the following script to send an identification request to the service and get the result:
+
+```shell
+python tools/serving/image_http_client.py 9292 ./docs/images/logo.png
+```
+
+Here `9292` is the port the request is sent to, which must match the port the service was started on, and `./docs/images/logo.png` is the test image. The top-1 label and its probability are returned.
+
+* For more Serving deployment examples, such as the RPC inference service, please refer to the Serving official website: [https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet)
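+
+* For reference, the HTTP request sent by `image_http_client.py` can also be issued by hand. The snippet below is a hypothetical sketch, not the official client: the endpoint path (`/image/prediction`) and the JSON key (`image`) are assumptions modeled on common Paddle Serving HTTP examples, so check `tools/serving/image_service_gpu.py` for the actual request format before relying on them.
+
+```shell
+# Base64-encode the test image and POST it to the service started above.
+# The endpoint path and payload key are assumptions; adjust them to match
+# the actual service script.
+IMG=$(base64 -w 0 ./docs/images/logo.png)
+curl -H "Content-Type: application/json" -X POST \
+     -d "{\"image\": \"$IMG\"}" \
+     http://127.0.0.1:9292/image/prediction
+```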