diff --git a/doc/ENCRYPTION.md b/doc/ENCRYPTION.md
new file mode 100644
index 0000000000000000000000000000000000000000..1e6a53aa386bf672d5f87647cb1682531ea3d62c
--- /dev/null
+++ b/doc/ENCRYPTION.md
@@ -0,0 +1,52 @@
+# MODEL ENCRYPTION INFERENCE
+
+([简体中文](ENCRYPTION_CN.md)|English)
+
+Paddle Serving provides model encryption inference. This document describes the details.
+
+## Principle
+
+We use a symmetric encryption algorithm to encrypt the model. A symmetric algorithm uses the same key for encryption and decryption; it is computationally cheap, fast, and the most commonly used kind of encryption.
+
+### Get an Encrypted Model
+
+A normal model and its parameters can be regarded as a string of bytes. By applying the encryption algorithm to them, with your key as the parameter, the normal model and parameters become encrypted ones.
+
+We provide a simple demo that encrypts a model. See [python/examples/encryption/encrypt.py](../python/examples/encryption/encrypt.py).
+
+
+### Start the Encryption Service
+
+Suppose you already have an encrypted model (under `encrypt_server/`). You can start the encryption model service by adding the extra command line parameter `--use_encryption_model`.
+
+CPU Service
+```
+python -m paddle_serving_server.serve --model encrypt_server/ --port 9300 --use_encryption_model
+```
+GPU Service
+```
+python -m paddle_serving_server_gpu.serve --model encrypt_server/ --port 9300 --use_encryption_model --gpu_ids 0
+```
+
+At this point, the server does not really start; it waits for the key.
+
+### Client Encryption Inference
+
+First of all, you must have the key that was used to encrypt the model.
+
+Then you can configure your client with the key. When you connect to the server, the key is sent to the server and the server keeps it.
+
+Once the server gets the key, it uses the key to parse the model and starts the model prediction service.
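+
+The whole flow relies on ordinary symmetric encryption: the key that encrypted the model files is the same key the client later sends to the server for decryption. As a minimal, self-contained illustration of this idea, here is a sketch in Python using the third-party `cryptography` package. It is not the Paddle Serving encryption tool (that is the `encrypt.py` example referenced above), and the file names used below are placeholders.
+
+```python
+# Illustration only: generic symmetric encryption of a model file.
+# This is not the Paddle Serving encryption tool; the paths are placeholders.
+import os
+from cryptography.fernet import Fernet
+
+# 1. Generate a key and save it; the client must hold the same key later.
+key = Fernet.generate_key()
+with open("key", "wb") as f:
+    f.write(key)
+
+# 2. Read the plain model file as bytes and encrypt it with the key.
+with open("__model__", "rb") as f:
+    plain_model = f.read()
+encrypted_model = Fernet(key).encrypt(plain_model)
+os.makedirs("encrypt_server", exist_ok=True)
+with open("encrypt_server/__model__", "wb") as f:
+    f.write(encrypted_model)
+
+# 3. Whoever holds the key can recover the original bytes.
+assert Fernet(key).decrypt(encrypted_model) == plain_model
+```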
+
+
+### Example of Model Encryption Inference
+For an example of model encryption inference, see [`/python/examples/encryption/`](../python/examples/encryption/).
+
+
+### Other Details
+Encryption-related interfaces of PaddlePaddle:
+
+[Python encryption method](https://github.com/HexToString/Serving/blob/develop/python/paddle_serving_app/local_predict.py)
+
+[C++ encryption method](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/python_infer_cn.html#analysispre)
+
diff --git a/doc/ENCRYPTION_CN.md b/doc/ENCRYPTION_CN.md
new file mode 100644
index 0000000000000000000000000000000000000000..5ca304d00d198ba2c6df1c7cfbff7315ba46fe15
--- /dev/null
+++ b/doc/ENCRYPTION_CN.md
@@ -0,0 +1,52 @@
+# 加密模型预测
+
+(简体中文|[English](ENCRYPTION.md))
+
+Paddle Serving提供了模型加密预测功能,本文档介绍了详细用法。
+
+## 原理
+
+采用对称加密算法对模型进行加密。对称加密算法采用同一密钥进行加解密,它计算量小,速度快,是最常用的加密方法。
+
+### 获得加密模型
+
+普通的模型和参数可以理解为一个字符串,通过对其使用加密算法(参数是您的密钥),普通模型和参数就变成了加密的模型和参数。
+
+我们提供了一个简单的演示来加密模型。请参阅[`python/examples/encryption/encrypt.py`](../python/examples/encryption/encrypt.py)。
+
+
+### 启动加密服务
+
+假设您已经有一个加密好的模型(在`encrypt_server/`路径下),可以通过添加一个额外的命令行参数 `--use_encryption_model`来启动加密模型服务。
+
+CPU Service
+```
+python -m paddle_serving_server.serve --model encrypt_server/ --port 9300 --use_encryption_model
+```
+GPU Service
+```
+python -m paddle_serving_server_gpu.serve --model encrypt_server/ --port 9300 --use_encryption_model --gpu_ids 0
+```
+
+此时,服务器不会真正启动,而是等待密钥。
+
+### 客户端加密预测
+
+首先,您必须拥有模型加密过程中使用的密钥。
+
+然后您可以用这个密钥配置客户端。当客户端连接服务器时,这个密钥会发送到服务器,服务器会保留它。
+
+一旦服务器获得密钥,它就使用该密钥解析模型并启动模型预测服务。
+
+
+### 模型加密推理示例
+模型加密推理示例请参见[`/python/examples/encryption/`](../python/examples/encryption/)。
+
+
+### 其他详细信息
+飞桨官方网站的加密方法接口:
+
+[Python加密方法接口](https://github.com/HexToString/Serving/blob/develop/python/paddle_serving_app/local_predict.py)
+
+[C++加密方法接口](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/python_infer_cn.html#analysispre)
+
diff --git a/doc/FAQ.md b/doc/FAQ.md
index e102857750e1d50731654a8f384bb9f7c1b21d88..b7b5ab98cfd101b6d15a24458abb9d4e7d91c109 100644
--- a/doc/FAQ.md
+++ b/doc/FAQ.md
@@ -86,6 +86,26 @@ pip3 install --upgrade pip
 pip3 install --upgrade setuptools
 ```
+#### Q: 运行过程中报错,信息如下:
+```
+Traceback (most recent call last):
+  File "../../deploy/serving/test_client.py", line 18, in <module>
+    from paddle_serving_app.reader import *
+  File "/usr/local/python2.7.15/lib/python2.7/site-packages/paddle_serving_app/reader/__init__.py", line 15, in <module>
+    from .image_reader import ImageReader, File2Image, URL2Image, Sequential, Normalize, Base64ToImage
+  File "/usr/local/python2.7.15/lib/python2.7/site-packages/paddle_serving_app/reader/image_reader.py", line 24, in <module>
+    from shapely.geometry import Polygon
+ImportError: No module named shapely.geometry
+```
+**A:** 有两种方法:第一种是通过pip/pip3安装shapely,第二种是通过pip/pip3安装所有依赖组件。
+```
+方法1:
+pip install shapely==1.7.0
+
+方法2:
+pip install -r python/requirements.txt
+```
+
 ## 编译问题
 
 #### Q: 如何使用自己编译的Paddle Serving进行预测?
diff --git a/doc/GRPC_IMPL_CN.md b/doc/GRPC_IMPL_CN.md
index 9e7ecd268fe0900c1085479c1f96fa083629758c..7cfa9d86f7c92d0f33f2984c116993544159f7e8 100644
--- a/doc/GRPC_IMPL_CN.md
+++ b/doc/GRPC_IMPL_CN.md
@@ -24,13 +24,12 @@
 #### 1.1 服务端对比
 
-* gRPC Server 端 `load_model_config` 函数添加 `client_config_path` 参数:
+* 由于 gRPC Server 端实际包含了 brpc-Client 端,brpc-Client 的初始化过程是在 gRPC Server 端完成的,因此 gRPC Server 端的 `load_model_config` 函数添加了 `client_config_path` 参数,用于指定 brpc-Client 初始化过程中的传输数据格式配置文件路径(`client_config_path` 参数未指定时默认为 None,此时 `client_config_path` 在 `load_model_config` 函数中默认为 `/serving_server_conf.prototxt`,即 brpc-Client 与 brpc-Server 的传输数据格式配置文件相同)
 ```
 def load_model_config(self, server_config_paths, client_config_path=None)
 ```
 在一些例子中 bRPC Server 端与 bRPC Client 端的配置文件可能不同(如 在cube local 中,Client 端的数据先交给 cube,经过 cube 处理后再交给预测库),此时 gRPC Server 端需要手动设置 gRPC Client 端的配置`client_config_path`。
-**`client_config_path` 默认为 `/serving_server_conf.prototxt`。**
 
 #### 1.2 客服端对比
 
@@ -47,13 +46,15 @@
 * gRPC Client 端 `predict` 函数添加 `asyn` 和 `is_python` 参数:
 ```
- def predict(self, feed, fetch, need_variant_tag=False, asyn=False, is_python=True)
+ def predict(self, feed, fetch, batch=True, need_variant_tag=False, asyn=False, is_python=True, log_id=0)
 ```
 1. `asyn` 为异步调用选项。当 `asyn=True` 时为异步调用,返回 `MultiLangPredictFuture` 对象,通过 `MultiLangPredictFuture.result()` 阻塞获取预测值;当 `asyn=Fasle` 为同步调用。
 2. `is_python` 为 proto 格式选项。当 `is_python=True` 时,基于 numpy bytes 格式进行数据传输,目前只适用于 Python;当 `is_python=False` 时,以普通数据格式传输,更加通用。使用 numpy bytes 格式传输耗时比普通数据格式小很多(详见 [#654](https://github.com/PaddlePaddle/Serving/pull/654))。
+3. `batch` 为数据是否需要进行增维处理的选项。当 `batch=True` 时,feed 数据不需要额外处理,维持原有维度;当 `batch=False` 时,会对数据进行增维处理。例如:feed.shape 原始为 [2,2],当 `batch=False` 时,会将 feed reshape 为 [1,2,2]。
+
 #### 1.3 其他
 
 * 异常处理:当 gRPC Server 端的 bRPC Client 预测失败(返回 `None`)时,gRPC Client 端同样返回None。其他 gRPC 异常会在 Client 内部捕获,并在返回的 fetch_map 中添加一个 "status_code" 字段来区分是否预测正常(参考 timeout 样例)。
@@ -74,7 +75,7 @@
 
 ## 2.示例:线性回归预测服务
 
-以下是采用gRPC实现的关于线性回归预测的一个示例,具体代码详见此[链接](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/grpc_impl_example/fit_a_line)
+以下是采用gRPC实现的关于线性回归预测的一个示例,具体代码详见此[链接](../python/examples/grpc_impl_example/fit_a_line)
 
 #### 获取数据
 ```shell
@@ -134,4 +135,4 @@ python test_list_input_client.py
 
 ## 3.更多示例
 
-详见[`python/examples/grpc_impl_example`](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/grpc_impl_example)下的示例文件。
+详见[`python/examples/grpc_impl_example`](../python/examples/grpc_impl_example)下的示例文件。
diff --git a/doc/JAVA_SDK.md b/doc/JAVA_SDK.md
index 4880e74bfee123b432b6b583a239d2d2ccbb45ac..01da7156a6d1a803bd06171664fe9e2c4e977d83 100644
--- a/doc/JAVA_SDK.md
+++ b/doc/JAVA_SDK.md
@@ -20,6 +20,7 @@ The following table shows compatibilities between Paddle Serving Server and Java
 | :---------------------------: | :--------------: |
 | 0.3.2 | 0.0.1 |
 
+1. Directly use the provided Java SDK as the client for prediction.
 ### Install Java SDK
 
 You can download jar and install it to the local Maven repository:
@@ -29,14 +30,6 @@ wget https://paddle-serving.bj.bcebos.com/jar/paddle-serving-sdk-java-0.0.1.jar
 mvn install:install-file -Dfile=$PWD/paddle-serving-sdk-java-0.0.1.jar -DgroupId=io.paddle.serving.client -DartifactId=paddle-serving-sdk-java -Dversion=0.0.1 -Dpackaging=jar
 ```
-Or compile from the source code and install it to the local Maven repository:
-
-```shell
-cd Serving/java
-mvn compile
-mvn install
-```
-
 ### Maven configure
 
 ```text
@@ -47,63 +40,8 @@ mvn install
 ```
 
+2. Use the SDK after compiling it from the source code.
See the [document](../java/README.md) for the detailed steps.
-## Example
-
-Here we will show how to use Java SDK for Boston house price prediction. Please refer to [examples](../java/examples) folder for more examples.
-
-### Get model
-
-```shell
-wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
-tar -xzf uci_housing.tar.gz
-```
+3. For examples of using the Java client, see the [document](../java/README.md).
-### Start Python Server
-
-```shell
-python -m paddle_serving_server.serve --model uci_housing_model --port 9393 --use_multilang
-```
-
-#### Client side code example
-
-```java
-import io.paddle.serving.client.*;
-import org.nd4j.linalg.api.ndarray.INDArray;
-import org.nd4j.linalg.factory.Nd4j;
-import java.util.*;
-
-public class PaddleServingClientExample {
-    public static void main( String[] args ) {
-        float[] data = {0.0137f, -0.1136f, 0.2553f, -0.0692f,
-            0.0582f, -0.0727f, -0.1583f, -0.0584f,
-            0.6283f, 0.4919f, 0.1856f, 0.0795f, -0.0332f};
-        INDArray npdata = Nd4j.createFromArray(data);
-        HashMap feed_data
-            = new HashMap() {{
-                put("x", npdata);
-            }};
-        List fetch = Arrays.asList("price");
-
-        Client client = new Client();
-        String target = "localhost:9393";
-        boolean succ = client.connect(target);
-        if (succ != true) {
-            System.out.println("connect failed.");
-            return ;
-        }
-
-        Map fetch_map = client.predict(feed_data, fetch);
-        if (fetch_map == null) {
-            System.out.println("predict failed.");
-            return ;
-        }
-
-        for (Map.Entry e : fetch_map.entrySet()) {
-            System.out.println("Key = " + e.getKey() + ", Value = " + e.getValue());
-        }
-        return ;
-    }
-}
-```
diff --git a/doc/JAVA_SDK_CN.md b/doc/JAVA_SDK_CN.md
index f624a4403371f5b284f34cbf310fef64d59602d9..7033b96078e1143567ccb19f14b80fc2b126a45d 100644
--- a/doc/JAVA_SDK_CN.md
+++ b/doc/JAVA_SDK_CN.md
@@ -19,6 +19,7 @@ Paddle Serving 提供了 Java SDK,支持 Client 端用 Java 语言进行预测
 | :---------------------------: | :--------------: |
 | 0.3.2 | 0.0.1 |
 
+1. 直接使用提供的Java SDK作为Client进行预测
 ### 安装
 
 您可以直接下载 jar,安装到本地 Maven 库:
@@ -28,14 +29,6 @@ wget https://paddle-serving.bj.bcebos.com/jar/paddle-serving-sdk-java-0.0.1.jar
 mvn install:install-file -Dfile=$PWD/paddle-serving-sdk-java-0.0.1.jar -DgroupId=io.paddle.serving.client -DartifactId=paddle-serving-sdk-java -Dversion=0.0.1 -Dpackaging=jar
 ```
-或者从源码进行编译,安装到本地 Maven 库:
-
-```shell
-cd Serving/java
-mvn compile
-mvn install
-```
-
 ### Maven 配置
 
 ```text
@@ -46,64 +39,7 @@ mvn install
 ```
 
+2. 从源码进行编译后使用,详细步骤见[文档](../java/README.md)。
+3. 相关使用示例见[文档](../java/README.md)。
- -## 使用样例 - -这里将展示如何使用 Java SDK 进行房价预测,更多例子详见 [examples](../java/examples) 文件夹。 - -### 获取房价预测模型 - -```shell -wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz -tar -xzf uci_housing.tar.gz -``` - -### 启动 Python 端 Server - -```shell -python -m paddle_serving_server.serve --model uci_housing_model --port 9393 --use_multilang -``` - -### Client 端代码示例 - -```java -import io.paddle.serving.client.*; -import org.nd4j.linalg.api.ndarray.INDArray; -import org.nd4j.linalg.factory.Nd4j; -import java.util.*; - -public class PaddleServingClientExample { - public static void main( String[] args ) { - float[] data = {0.0137f, -0.1136f, 0.2553f, -0.0692f, - 0.0582f, -0.0727f, -0.1583f, -0.0584f, - 0.6283f, 0.4919f, 0.1856f, 0.0795f, -0.0332f}; - INDArray npdata = Nd4j.createFromArray(data); - HashMap feed_data - = new HashMap() {{ - put("x", npdata); - }}; - List fetch = Arrays.asList("price"); - - Client client = new Client(); - String target = "localhost:9393"; - boolean succ = client.connect(target); - if (succ != true) { - System.out.println("connect failed."); - return ; - } - - Map fetch_map = client.predict(feed_data, fetch); - if (fetch_map == null) { - System.out.println("predict failed."); - return ; - } - - for (Map.Entry e : fetch_map.entrySet()) { - System.out.println("Key = " + e.getKey() + ", Value = " + e.getValue()); - } - return ; - } -} -``` diff --git a/doc/LATEST_PACKAGES.md b/doc/LATEST_PACKAGES.md index 63f9f4e6394796382eac5893196d73a40eae847d..1ce1e2c569b3864b0bdc6f84629de7e6e99df584 100644 --- a/doc/LATEST_PACKAGES.md +++ b/doc/LATEST_PACKAGES.md @@ -22,6 +22,8 @@ https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post10- https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post101-py3-none-any.whl #cuda10.2 with TensorRT 7 https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post102-py3-none-any.whl +#cuda11.0 with TensorRT 7 (beta) +https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post11-py3-none-any.whl ``` ### Python 2 ``` @@ -33,17 +35,24 @@ https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post10- https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post101-py2-none-any.whl #cuda10.2 with TensorRT 7 https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post102-py2-none-any.whl +#cuda11.0 with TensorRT 7 (beta) +https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post11-py2-none-any.whl ``` ## Client -### Python 3.7 -``` -https://paddle-serving.bj.bcebos.com/whl/paddle_serving_client-0.0.0-cp37-none-any.whl -``` + ### Python 3.6 ``` https://paddle-serving.bj.bcebos.com/whl/paddle_serving_client-0.0.0-cp36-none-any.whl ``` +### Python 3.8 +``` +https://paddle-serving.bj.bcebos.com/whl/paddle_serving_client-0.0.0-cp38-none-any.whl +``` +### Python 3.7 +``` +https://paddle-serving.bj.bcebos.com/whl/paddle_serving_client-0.0.0-cp37-none-any.whl +``` ### Python 3.5 ``` https://paddle-serving.bj.bcebos.com/whl/paddle_serving_client-0.0.0-cp35-none-any.whl @@ -53,6 +62,7 @@ https://paddle-serving.bj.bcebos.com/whl/paddle_serving_client-0.0.0-cp35-none-a https://paddle-serving.bj.bcebos.com/whl/paddle_serving_client-0.0.0-cp27-none-any.whl ``` + ## App ### Python 3 ``` diff --git a/python/examples/grpc_impl_example/imdb/README.md b/python/examples/grpc_impl_example/imdb/README.md new file mode 100644 index 
0000000000000000000000000000000000000000..73636f3caa9aeea375f56714f57e26a6f31e990c
--- /dev/null
+++ b/python/examples/grpc_impl_example/imdb/README.md
@@ -0,0 +1,25 @@
+## IMDB comment sentiment inference service
+
+([简体中文](./README_CN.md)|English)
+
+### Get model files and sample data
+
+```
+sh get_data.sh
+```
+The downloaded package contains the CNN, LSTM and BOW model configurations along with their test_data and train_data.
+
+### Start RPC inference service
+
+```
+python -m paddle_serving_server.serve --model imdb_cnn_model/ --thread 10 --port 9393 --use_multilang
+```
+### RPC Inference
+
+The `paddlepaddle` package is used in `test_client.py`, and you may need to install it first (`pip install paddlepaddle`).
+
+```
+head test_data/part-0 | python test_client.py
+```
+
+This prints the prediction results for the first 10 test cases.
diff --git a/python/examples/grpc_impl_example/imdb/README_CN.md b/python/examples/grpc_impl_example/imdb/README_CN.md
new file mode 100644
index 0000000000000000000000000000000000000000..327b1c5541ad53f14b8518037de39e572c31e67c
--- /dev/null
+++ b/python/examples/grpc_impl_example/imdb/README_CN.md
@@ -0,0 +1,24 @@
+## IMDB评论情绪预测服务
+
+(简体中文|[English](./README.md))
+
+### 获取模型文件和样例数据
+
+```
+sh get_data.sh
+```
+脚本会下载和解压出cnn、lstm和bow三种模型的配置文件以及test_data和train_data。
+
+### 启动RPC预测服务
+
+```
+python -m paddle_serving_server.serve --model imdb_cnn_model/ --thread 10 --port 9393 --use_multilang
+```
+### 执行预测
+
+`test_client.py`中使用了`paddlepaddle`包,需要先安装(`pip install paddlepaddle`)。
+
+```
+head test_data/part-0 | python test_client.py
+```
+预测test_data/part-0的前十个样例。
diff --git a/python/examples/imdb/test_ensemble_client.py b/python/examples/grpc_impl_example/imdb/test_client.py
similarity index 63%
rename from python/examples/imdb/test_ensemble_client.py
rename to python/examples/grpc_impl_example/imdb/test_client.py
index eb1e29ddd6d5a02854e4859a35474306c1c4d073..bddc4d501d346c4cfbb33d743d53e2e0eb3b6b10 100644
--- a/python/examples/imdb/test_ensemble_client.py
+++ b/python/examples/grpc_impl_example/imdb/test_client.py
@@ -12,14 +12,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # pylint: disable=doc-string-missing
-
-from paddle_serving_client import Client
-from imdb_reader import IMDBDataset
+from paddle_serving_client import MultiLangClient as Client
+from paddle_serving_app.reader.imdb_reader import IMDBDataset
+import sys
+import numpy as np
 
 client = Client()
-# If you have more than one model, make sure that the input
-# and output of more than one model are the same.
-client.load_client_config('imdb_bow_client_conf/serving_client_conf.prototxt') client.connect(["127.0.0.1:9393"]) # you can define any english sentence or dataset here @@ -28,11 +26,17 @@ client.connect(["127.0.0.1:9393"]) imdb_dataset = IMDBDataset() imdb_dataset.load_resource('imdb.vocab') -for i in range(3): - line = 'i am very sad | 0' +for line in sys.stdin: word_ids, label = imdb_dataset.get_words_and_label(line) - feed = {"words": word_ids} + word_len = len(word_ids) + feed = { + "words": np.array(word_ids).reshape(word_len, 1), + "words.lod": [0, word_len] + } fetch = ["prediction"] - fetch_maps = client.predict(feed=feed, fetch=fetch) - for model, fetch_map in fetch_maps.items(): - print("step: {}, model: {}, res: {}".format(i, model, fetch_map)) + fetch_map = client.predict(feed=feed, fetch=fetch, batch=True) + if fetch_map["serving_status_code"] == 0: + print(fetch_map) + else: + print(fetch_map["serving_status_code"]) + #print("{} {}".format(fetch_map["prediction"][0], label[0])) diff --git a/python/examples/grpc_impl_example/imdb/test_multilang_ensemble_client.py b/python/examples/grpc_impl_example/imdb/test_multilang_ensemble_client.py deleted file mode 100644 index 43034e49bde4a477c160c5a0d158ea541d633a4d..0000000000000000000000000000000000000000 --- a/python/examples/grpc_impl_example/imdb/test_multilang_ensemble_client.py +++ /dev/null @@ -1,39 +0,0 @@ -# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# pylint: disable=doc-string-missing - -from paddle_serving_client import MultiLangClient -from imdb_reader import IMDBDataset - -client = MultiLangClient() -# If you have more than one model, make sure that the input -# and output of more than one model are the same. -client.connect(["127.0.0.1:9393"]) - -# you can define any english sentence or dataset here -# This example reuses imdb reader in training, you -# can define your own data preprocessing easily. -imdb_dataset = IMDBDataset() -imdb_dataset.load_resource('imdb.vocab') - -for i in range(3): - line = 'i am very sad | 0' - word_ids, label = imdb_dataset.get_words_and_label(line) - feed = {"words": word_ids} - fetch = ["prediction"] - fetch_maps = client.predict(feed=feed, fetch=fetch) - for model, fetch_map in fetch_maps.items(): - if model == "serving_status_code": - continue - print("step: {}, model: {}, res: {}".format(i, model, fetch_map)) diff --git a/python/examples/grpc_impl_example/imdb/test_multilang_ensemble_server.py b/python/examples/grpc_impl_example/imdb/test_multilang_ensemble_server.py deleted file mode 100644 index 053aa06f0219de231415ba178135782334e56c1f..0000000000000000000000000000000000000000 --- a/python/examples/grpc_impl_example/imdb/test_multilang_ensemble_server.py +++ /dev/null @@ -1,40 +0,0 @@ -# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# pylint: disable=doc-string-missing - -from paddle_serving_server import OpMaker -from paddle_serving_server import OpGraphMaker -from paddle_serving_server import MultiLangServer - -op_maker = OpMaker() -read_op = op_maker.create('general_reader') -cnn_infer_op = op_maker.create( - 'general_infer', engine_name='cnn', inputs=[read_op]) -bow_infer_op = op_maker.create( - 'general_infer', engine_name='bow', inputs=[read_op]) -response_op = op_maker.create( - 'general_response', inputs=[cnn_infer_op, bow_infer_op]) - -op_graph_maker = OpGraphMaker() -op_graph_maker.add_op(read_op) -op_graph_maker.add_op(cnn_infer_op) -op_graph_maker.add_op(bow_infer_op) -op_graph_maker.add_op(response_op) - -server = MultiLangServer() -server.set_op_graph(op_graph_maker.get_op_graph()) -model_config = {cnn_infer_op: 'imdb_cnn_model', bow_infer_op: 'imdb_bow_model'} -server.load_model_config(model_config) -server.prepare_server(workdir="work_dir1", port=9393, device="cpu") -server.run_server() diff --git a/python/examples/imdb/test_ensemble_server.py b/python/examples/imdb/test_ensemble_server.py deleted file mode 100644 index 464288a0a167d8487f787d12c4b44a138da86f88..0000000000000000000000000000000000000000 --- a/python/examples/imdb/test_ensemble_server.py +++ /dev/null @@ -1,40 +0,0 @@ -# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# pylint: disable=doc-string-missing - -from paddle_serving_server import OpMaker -from paddle_serving_server import OpGraphMaker -from paddle_serving_server import Server - -op_maker = OpMaker() -read_op = op_maker.create('general_reader') -cnn_infer_op = op_maker.create( - 'general_infer', engine_name='cnn', inputs=[read_op]) -bow_infer_op = op_maker.create( - 'general_infer', engine_name='bow', inputs=[read_op]) -response_op = op_maker.create( - 'general_response', inputs=[cnn_infer_op, bow_infer_op]) - -op_graph_maker = OpGraphMaker() -op_graph_maker.add_op(read_op) -op_graph_maker.add_op(cnn_infer_op) -op_graph_maker.add_op(bow_infer_op) -op_graph_maker.add_op(response_op) - -server = Server() -server.set_op_graph(op_graph_maker.get_op_graph()) -model_config = {cnn_infer_op: 'imdb_cnn_model', bow_infer_op: 'imdb_bow_model'} -server.load_model_config(model_config) -server.prepare_server(workdir="work_dir1", port=9393, device="cpu") -server.run_server() diff --git a/python/paddle_serving_server/__init__.py b/python/paddle_serving_server/__init__.py index 5ef3cf751925072594f8cb749fb926e233f4c0e9..4d0832b67d81b25d6552774d785b399304b4f66a 100644 --- a/python/paddle_serving_server/__init__.py +++ b/python/paddle_serving_server/__init__.py @@ -537,26 +537,37 @@ class MultiLangServerServiceServicer(multi_lang_general_model_service_pb2_grpc. fetch_names = list(request.fetch_var_names) is_python = request.is_python log_id = request.log_id - feed_dict = {} - feed_inst = request.insts[0] - for idx, name in enumerate(feed_names): - var = feed_inst.tensor_array[idx] - v_type = self.feed_types_[name] - data = None - if is_python: - if v_type == 0: # int64 - data = np.frombuffer(var.data, dtype="int64") - elif v_type == 1: # float32 - data = np.frombuffer(var.data, dtype="float32") - elif v_type == 2: # int32 - data = np.frombuffer(var.data, dtype="int32") + feed_batch = [] + for feed_inst in request.insts: + feed_dict = {} + for idx, name in enumerate(feed_names): + var = feed_inst.tensor_array[idx] + v_type = self.feed_types_[name] + data = None + if is_python: + if v_type == 0: # int64 + data = np.frombuffer(var.data, dtype="int64") + elif v_type == 1: # float32 + data = np.frombuffer(var.data, dtype="float32") + elif v_type == 2: # int32 + data = np.frombuffer(var.data, dtype="int32") + else: + raise Exception("error type.") else: - raise Exception("error type.") - data.shape = list(feed_inst.tensor_array[idx].shape) - feed_dict[name] = data - if len(var.lod) > 0: - feed_dict["{}.lod".format()] = var.lod - return feed_dict, fetch_names, is_python, log_id + if v_type == 0: # int64 + data = np.array(list(var.int64_data), dtype="int64") + elif v_type == 1: # float32 + data = np.array(list(var.float_data), dtype="float32") + elif v_type == 2: # int32 + data = np.array(list(var.int_data), dtype="int32") + else: + raise Exception("error type.") + data.shape = list(feed_inst.tensor_array[idx].shape) + feed_dict[name] = data + if len(var.lod) > 0: + feed_dict["{}.lod".format(name)] = var.lod + feed_batch.append(feed_dict) + return feed_batch, fetch_names, is_python, log_id def _pack_inference_response(self, ret, fetch_names, is_python): resp = multi_lang_general_model_service_pb2.InferenceResponse() @@ -608,10 +619,10 @@ class MultiLangServerServiceServicer(multi_lang_general_model_service_pb2_grpc. 
return resp def Inference(self, request, context): - feed_dict, fetch_names, is_python, log_id = \ + feed_batch, fetch_names, is_python, log_id = \ self._unpack_inference_request(request) ret = self.bclient_.predict( - feed=feed_dict, + feed=feed_batch, fetch=fetch_names, batch=True, need_variant_tag=True, @@ -649,6 +660,9 @@ class MultiLangServer(object): "max_body_size is less than default value, will use default value in service." ) + def use_encryption_model(self, flag=False): + self.encryption_model = flag + def set_port(self, port): self.gport_ = port diff --git a/python/paddle_serving_server_gpu/__init__.py b/python/paddle_serving_server_gpu/__init__.py index ffbb888ecf8bee8342476dcd21fe9f04abf27acf..d53c67797f66e89e1f6c78d3aeef79d9d8603fc7 100644 --- a/python/paddle_serving_server_gpu/__init__.py +++ b/python/paddle_serving_server_gpu/__init__.py @@ -244,6 +244,9 @@ class Server(object): "max_body_size is less than default value, will use default value in service." ) + def use_encryption_model(self, flag=False): + self.encryption_model = flag + def set_port(self, port): self.port = port @@ -690,6 +693,8 @@ class MultiLangServerServiceServicer(multi_lang_general_model_service_pb2_grpc. raise Exception("error type.") data.shape = list(feed_inst.tensor_array[idx].shape) feed_dict[name] = data + if len(var.lod) > 0: + feed_dict["{}.lod".format(name)] = var.lod feed_batch.append(feed_dict) return feed_batch, fetch_names, is_python, log_id @@ -744,11 +749,12 @@ class MultiLangServerServiceServicer(multi_lang_general_model_service_pb2_grpc. return resp def Inference(self, request, context): - feed_dict, fetch_names, is_python, log_id \ + feed_batch, fetch_names, is_python, log_id \ = self._unpack_inference_request(request) ret = self.bclient_.predict( - feed=feed_dict, + feed=feed_batch, fetch=fetch_names, + batch=True, need_variant_tag=True, log_id=log_id) return self._pack_inference_response(ret, fetch_names, is_python) @@ -787,6 +793,9 @@ class MultiLangServer(object): "max_body_size is less than default value, will use default value in service." ) + def use_encryption_model(self, flag=False): + self.encryption_model = flag + def set_port(self, port): self.gport_ = port @@ -824,6 +833,7 @@ class MultiLangServer(object): workdir=None, port=9292, device="cpu", + use_encryption_model=False, cube_conf=None): if not self._port_is_available(port): raise SystemExit("Prot {} is already used".format(port)) @@ -838,6 +848,7 @@ class MultiLangServer(object): workdir=workdir, port=self.port_list_[0], device=device, + use_encryption_model=use_encryption_model, cube_conf=cube_conf) self.set_port(port)