Merge pull request #1828 from rainyfly/add_clas_modules

Add clas modules

77 changed files
--- a/modules/image/classification/esnet_x0_25_imagenet/README.md
+++ b/modules/image/classification/esnet_x0_25_imagenet/README.md
+# esnet_x0_25_imagenet
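+|Model name|esnet_x0_25_imagenet|
+| :--- | :---: |
+|Category|Image - Image Classification|
+|Network|ESNet|
+|Dataset|ImageNet-2012|
+|Fine-tuning supported|No|
+|Model size|10 MB|
+|Latest update date|2022-04-02|
+|Metric|Acc|
+## I. Basic Information
+- ### Model Introduction
+  - ESNet (Enhanced ShuffleNet) is a lightweight network developed by Baidu. Building on ShuffleNetV2, it combines the strengths of MobileNetV3, GhostNet, and PPLCNet into a network that is faster and more accurate on ARM devices. Because of this performance, [PP-PicoDet](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet) from PaddleDetection uses it as the backbone; paired with a stronger object detection algorithm, the resulting model set a new SOTA for object detection on ARM devices. This module is the ESNet model with the scale parameter set to x0.25.
+## II. Installation
+- ### 1. Environment Dependencies  
+  - paddlepaddle >= 1.6.2  
+  - paddlehub >= 1.6.0  | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+- ### 2. Installation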
+  - ```shell
+    $ hub install esnet_x0_25_imagenet
+    ```
+  - If you run into problems during installation, see: [Windows Quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux Quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS Quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+## III. Module API Prediction
+- ### 1. Command Line Prediction
+  - ```shell
+    $ hub run esnet_x0_25_imagenet --input_path "/PATH/TO/IMAGE"
+    ```
+  - This invokes the classification model from the command line. For more, see [PaddleHub Command Line Instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+- ### 2. Prediction Code Example
+  - ```python
+    import paddlehub as hub
+    import cv2
+    classifier = hub.Module(name="esnet_x0_25_imagenet")
+    result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')])
+    # or
+    # result = classifier.classification(paths=['/PATH/TO/IMAGE'])
+    ```
+- ### 3. API
+  - ```python
+    def classification(images=None,
+                       paths=None,
+                       batch_size=1,
+                       use_gpu=False,
+                       top_k=1):
+    ```
+    - Classification API.
+    - **Parameters**
+      - images (list\[numpy.ndarray\]): image data; each image has shape \[H, W, C\] in BGR color space; <br/>
+      - paths (list\[str\]): image paths; <br/>
+      - batch\_size (int): batch size;<br/>
+      - use\_gpu (bool): whether to use the GPU; **set the CUDA_VISIBLE_DEVICES environment variable before using the GPU** <br/>
+      - top\_k (int): number of top predictions to return.
+    - **Return**
+      - res (list\[dict\]): classification results. Each element is a dict with the keys 'class_ids' (class indices), 'scores' (confidence scores), and 'label_names' (class names).
+## IV. Server Deployment
+- PaddleHub Serving can deploy an online image recognition service.
+- ### Step 1: Start PaddleHub Serving
+  - Run the startup command:
+  - ```shell
+    $ hub serving start -m esnet_x0_25_imagenet
+    ```
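+  - This deploys an online image recognition service; the default port is 8866.
+  - **NOTE:** To predict on a GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise no setting is needed.
+- ### Step 2: Send a Prediction Request
+  - With the server configured, the following few lines of code send a prediction request and fetch the result: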
+  - ```python
+    import requests
+    import json
+    import cv2
+    import base64
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+    # Send the HTTP request
+    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+    headers = {"Content-type": "application/json"}
+    url = "http://127.0.0.1:8866/predict/esnet_x0_25_imagenet"
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    # Print the prediction results
+    print(r.json()["results"])
+    ```
+## V. Release Note
+* 1.0.0
+  Initial release
+  - ```shell
+    $ hub install esnet_x0_25_imagenet==1.0.0
+    ```
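Note on the serving example in this README: the request body is a JSON object whose `images` field carries base64-encoded image bytes. The sketch below shows only that encoding round trip with the standard library. It assumes placeholder bytes stand in for the output of `cv2.imencode`, and `build_payload` / `decode_payload` are illustrative names, not PaddleHub APIs.

```python
import base64
import json


def build_payload(encoded_images):
    """Wrap already-encoded image bytes in the JSON body shape used by the README example."""
    return json.dumps({"images": [base64.b64encode(b).decode("utf8") for b in encoded_images]})


def decode_payload(body):
    """Reverse of build_payload: recover the raw bytes of each image."""
    return [base64.b64decode(s) for s in json.loads(body)["images"]]


# Placeholder bytes (assumption); in the README these come from cv2.imencode('.jpg', image)[1].
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
body = build_payload([fake_jpeg])
restored = decode_payload(body)
```

On the server side, `serving_method` in `module.py` performs the matching decode through `base64_to_cv2` before calling `classification`.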
--- a/modules/image/classification/esnet_x0_25_imagenet/model.py
+++ b/modules/image/classification/esnet_x0_25_imagenet/model.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import math
+from typing import Any
+from typing import Callable
+from typing import Dict
+from typing import List
+from typing import Tuple
+from typing import Union
+import paddle
+import paddle.nn as nn
+from paddle import concat
+from paddle import ParamAttr
+from paddle import reshape
+from paddle import split
+from paddle import transpose
+from paddle.nn import AdaptiveAvgPool2D
+from paddle.nn import BatchNorm
+from paddle.nn import Conv2D
+from paddle.nn import Dropout
+from paddle.nn import Linear
+from paddle.nn import MaxPool2D
+from paddle.nn.initializer import KaimingNormal
+from paddle.regularizer import L2Decay
+MODEL_STAGES_PATTERN = {"ESNet": ["blocks[2]", "blocks[9]", "blocks[12]"]}
+class Identity(nn.Layer):
+    def __init__(self):
+        super(Identity, self).__init__()
+    def forward(self, inputs):
+        return inputs
+class TheseusLayer(nn.Layer):
+    def __init__(self, *args, **kwargs):
+        super(TheseusLayer, self).__init__()
+        self.res_dict = {}
+        self.res_name = self.full_name()
+        self.pruner = None
+        self.quanter = None
+    def _return_dict_hook(self, layer, input, output):
+        res_dict = {"output": output}
+        # 'list' is needed to avoid error raised by popping self.res_dict
+        for res_key in list(self.res_dict):
+            # clear the res_dict because the forward process may change according to input
+            res_dict[res_key] = self.res_dict.pop(res_key)
+        return res_dict
+    def init_res(self, stages_pattern, return_patterns=None, return_stages=None):
+        if return_patterns and return_stages:
+            msg = f"The 'return_patterns' would be ignored when 'return_stages' is set."
+            return_stages = None
+        if return_stages is True:
+            return_patterns = stages_pattern
+        # return_stages is int or bool
+        if type(return_stages) is int:
+            return_stages = [return_stages]
+        if isinstance(return_stages, list):
+            if max(return_stages) >= len(stages_pattern) or min(return_stages) < 0:
+                msg = f"The 'return_stages' set error. Illegal value(s) have been ignored. The stages' pattern list is {stages_pattern}."
+                return_stages = [val for val in return_stages if val >= 0 and val < len(stages_pattern)]
+            return_patterns = [stages_pattern[i] for i in return_stages]
+        if return_patterns:
+            self.update_res(return_patterns)
+    def replace_sub(self, *args, **kwargs) -> None:
+        msg = "The function 'replace_sub()' is deprecated, please use 'upgrade_sublayer()' instead."
+        raise DeprecationWarning(msg)
+    def upgrade_sublayer(self, layer_name_pattern: Union[str, List[str]],
+                         handle_func: Callable[[nn.Layer, str], nn.Layer]) -> List[str]:
+        """use 'handle_func' to modify the sub-layer(s) specified by 'layer_name_pattern'.
+        Args:
+            layer_name_pattern (Union[str, List[str]]): The name of layer to be modified by 'handle_func'.
+            handle_func (Callable[[nn.Layer, str], nn.Layer]): The function to modify target layer specified by 'layer_name_pattern'. The formal params are the layer(nn.Layer) and pattern(str) that is (a member of) layer_name_pattern (when layer_name_pattern is List type). And the return is the layer processed.
+        Returns:
+            List[str]: The patterns whose target layers were found and processed by 'handle_func()'.
+        Examples:
+            from paddle import nn
+            import paddleclas
+            def rep_func(layer: nn.Layer, pattern: str):
+                new_layer = nn.Conv2D(
+                    in_channels=layer._in_channels,
+                    out_channels=layer._out_channels,
+                    kernel_size=5,
+                    padding=2
+                )
+                return new_layer
+            net = paddleclas.MobileNetV1()
+            res = net.upgrade_sublayer(layer_name_pattern=["blocks[11].depthwise_conv.conv", "blocks[12].depthwise_conv.conv"], handle_func=rep_func)
+            print(res)
+            # ['blocks[11].depthwise_conv.conv', 'blocks[12].depthwise_conv.conv']
+        """
+        if not isinstance(layer_name_pattern, list):
+            layer_name_pattern = [layer_name_pattern]
+        hit_layer_pattern_list = []
+        for pattern in layer_name_pattern:
+            # parse pattern to find target layer and its parent
+            layer_list = parse_pattern_str(pattern=pattern, parent_layer=self)
+            if not layer_list:
+                continue
+            sub_layer_parent = layer_list[-2]["layer"] if len(layer_list) > 1 else self
+            sub_layer = layer_list[-1]["layer"]
+            sub_layer_name = layer_list[-1]["name"]
+            sub_layer_index = layer_list[-1]["index"]
+            new_sub_layer = handle_func(sub_layer, pattern)
+            if sub_layer_index:
+                getattr(sub_layer_parent, sub_layer_name)[sub_layer_index] = new_sub_layer
+            else:
+                setattr(sub_layer_parent, sub_layer_name, new_sub_layer)
+            hit_layer_pattern_list.append(pattern)
+        return hit_layer_pattern_list
+    def stop_after(self, stop_layer_name: str) -> bool:
+        """stop forward and backward after 'stop_layer_name'.
+        Args:
+            stop_layer_name (str): The name of the layer after which forward and backward are stopped.
+        Returns:
+            bool: 'True' if successful, 'False' otherwise.
+        """
+        layer_list = parse_pattern_str(stop_layer_name, self)
+        if not layer_list:
+            return False
+        parent_layer = self
+        for layer_dict in layer_list:
+            name, index = layer_dict["name"], layer_dict["index"]
+            if not set_identity(parent_layer, name, index):
+                msg = f"Failed to set the layers that after stop_layer_name('{stop_layer_name}') to IdentityLayer. The error layer's name is '{name}'."
+                return False
+            parent_layer = layer_dict["layer"]
+        return True
+    def update_res(self, return_patterns: Union[str, List[str]]) -> List[str]:
+        """update the result(s) to be returned.
+        Args:
+            return_patterns (Union[str, List[str]]): The name of layer to return output.
+        Returns:
+            List[str]: The patterns whose layers have been set successfully.
+        """
+        # clear res_dict that could have been set
+        self.res_dict = {}
+        class Handler(object):
+            def __init__(self, res_dict):
+                # res_dict is a reference
+                self.res_dict = res_dict
+            def __call__(self, layer, pattern):
+                layer.res_dict = self.res_dict
+                layer.res_name = pattern
+                if hasattr(layer, "hook_remove_helper"):
+                    layer.hook_remove_helper.remove()
+                layer.hook_remove_helper = layer.register_forward_post_hook(save_sub_res_hook)
+                return layer
+        handle_func = Handler(self.res_dict)
+        hit_layer_pattern_list = self.upgrade_sublayer(return_patterns, handle_func=handle_func)
+        if hasattr(self, "hook_remove_helper"):
+            self.hook_remove_helper.remove()
+        self.hook_remove_helper = self.register_forward_post_hook(self._return_dict_hook)
+        return hit_layer_pattern_list
+def save_sub_res_hook(layer, input, output):
+    layer.res_dict[layer.res_name] = output
+def set_identity(parent_layer: nn.Layer, layer_name: str, layer_index: str = None) -> bool:
+    """set the layer specified by layer_name and layer_index to Identity.
+    Args:
+        parent_layer (nn.Layer): The parent layer of target layer specified by layer_name and layer_index.
+        layer_name (str): The name of target layer to be set to Identity.
+        layer_index (str, optional): The index of target layer to be set to Identity in parent_layer. Defaults to None.
+    Returns:
+        bool: True if successful, False otherwise.
+    """
+    stop_after = False
+    for sub_layer_name in parent_layer._sub_layers:
+        if stop_after:
+            parent_layer._sub_layers[sub_layer_name] = Identity()
+            continue
+        if sub_layer_name == layer_name:
+            stop_after = True
+    if layer_index and stop_after:
+        stop_after = False
+        for sub_layer_index in parent_layer._sub_layers[layer_name]._sub_layers:
+            if stop_after:
+                parent_layer._sub_layers[layer_name][sub_layer_index] = Identity()
+                continue
+            if layer_index == sub_layer_index:
+                stop_after = True
+    return stop_after
+def parse_pattern_str(pattern: str, parent_layer: nn.Layer) -> Union[None, List[Dict[str, Union[nn.Layer, str, None]]]]:
+    """parse the string type pattern.
+    Args:
+        pattern (str): The pattern to describe layer.
+        parent_layer (nn.Layer): The root layer relative to the pattern.
+    Returns:
+        Union[None, List[Dict[str, Union[nn.Layer, str, None]]]]: None if failed. If successful, the members are layers parsed in order:
+                                                                [
+                                                                    {"layer": first layer, "name": first layer's name parsed, "index": first layer's index parsed if exist},
+                                                                    {"layer": second layer, "name": second layer's name parsed, "index": second layer's index parsed if exist},
+                                                                    ...
+                                                                ]
+    """
+    pattern_list = pattern.split(".")
+    if not pattern_list:
+        msg = f"The pattern('{pattern}') is illegal. Please check and retry."
+        return None
+    layer_list = []
+    while len(pattern_list) > 0:
+        if '[' in pattern_list[0]:
+            target_layer_name = pattern_list[0].split('[')[0]
+            target_layer_index = pattern_list[0].split('[')[1].split(']')[0]
+        else:
+            target_layer_name = pattern_list[0]
+            target_layer_index = None
+        target_layer = getattr(parent_layer, target_layer_name, None)
+        if target_layer is None:
+            msg = f"Not found layer named('{target_layer_name}') specified in pattern('{pattern}')."
+            return None
+        if target_layer_index and target_layer:
+            if int(target_layer_index) < 0 or int(target_layer_index) >= len(target_layer):
+                msg = f"Not found layer by index('{target_layer_index}') specified in pattern('{pattern}'). The index should be < {len(target_layer)} and >= 0."
+                return None
+            target_layer = target_layer[target_layer_index]
+        layer_list.append({"layer": target_layer, "name": target_layer_name, "index": target_layer_index})
+        pattern_list = pattern_list[1:]
+        parent_layer = target_layer
+    return layer_list
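Note on `parse_pattern_str`: the string handling can be followed without Paddle. The sketch below reproduces only the splitting step (name plus optional bracketed index per dotted segment) and skips the `getattr` lookup and bounds check; `split_pattern` is a hypothetical helper name used for illustration.

```python
from typing import List, Optional, Tuple


def split_pattern(pattern: str) -> List[Tuple[str, Optional[str]]]:
    """Split a dotted layer pattern into (name, index) pairs; the index stays a string."""
    parts = []
    for segment in pattern.split("."):
        if "[" in segment:
            name = segment.split("[")[0]
            index = segment.split("[")[1].split("]")[0]
        else:
            name, index = segment, None
        parts.append((name, index))
    return parts


parsed = split_pattern("blocks[11].depthwise_conv.conv")
# parsed == [("blocks", "11"), ("depthwise_conv", None), ("conv", None)]
```

As in the PR code, the index is kept as a string, which is why `upgrade_sublayer` can test it with a plain truthiness check: `"0"` is a non-empty string and therefore truthy.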
+def channel_shuffle(x, groups):
+    batch_size, num_channels, height, width = x.shape[0:4]
+    channels_per_group = num_channels // groups
+    x = reshape(x=x, shape=[batch_size, groups, channels_per_group, height, width])
+    x = transpose(x=x, perm=[0, 2, 1, 3, 4])
+    x = reshape(x=x, shape=[batch_size, num_channels, height, width])
+    return x
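Note on `channel_shuffle`: the reshape, transpose, reshape sequence interleaves channels across groups. A minimal pure-Python sketch of the resulting channel order, tracking channel indices only and ignoring the batch and spatial axes (`shuffle_channel_indices` is an illustrative name):

```python
def shuffle_channel_indices(num_channels, groups):
    """Return the channel order produced by reshape -> transpose -> reshape."""
    per_group = num_channels // groups
    # Reshape: lay the channel indices out as [groups, per_group].
    grid = [list(range(g * per_group, (g + 1) * per_group)) for g in range(groups)]
    # Transpose to [per_group, groups], then flatten back to one axis.
    return [grid[g][c] for c in range(per_group) for g in range(groups)]


order = shuffle_channel_indices(6, 2)
# order == [0, 3, 1, 4, 2, 5]
```

With two groups, channels from the first and second half alternate, which is how `ESBlock1` mixes the untouched branch `x1` with the processed branch `x3`.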
+def make_divisible(v, divisor=8, min_value=None):
+    if min_value is None:
+        min_value = divisor
+    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
+    if new_v < 0.9 * v:
+        new_v += divisor
+    return new_v
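Note on `make_divisible`: for this module (`scale=0.25`) the rounding can be checked by hand. The sketch below copies the function from this file and applies it to the three scaled stage widths used in `ESNet.__init__`.

```python
def make_divisible(v, divisor=8, min_value=None):
    """Round v to the nearest multiple of divisor, never dropping below 90% of v."""
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


scale = 0.25
stage_channels = [make_divisible(c * scale) for c in (116, 232, 464)]
# stage_channels == [32, 56, 120]
```

So for the x0.25 variant the `stage_out_channels` list evaluates to `[-1, 24, 32, 56, 120, 1024]`.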
+class ConvBNLayer(TheseusLayer):
+    def __init__(self, in_channels, out_channels, kernel_size, stride=1, groups=1, if_act=True):
+        super().__init__()
+        self.conv = Conv2D(in_channels=in_channels,
+                           out_channels=out_channels,
+                           kernel_size=kernel_size,
+                           stride=stride,
+                           padding=(kernel_size - 1) // 2,
+                           groups=groups,
+                           weight_attr=ParamAttr(initializer=KaimingNormal()),
+                           bias_attr=False)
+        self.bn = BatchNorm(out_channels,
+                            param_attr=ParamAttr(regularizer=L2Decay(0.0)),
+                            bias_attr=ParamAttr(regularizer=L2Decay(0.0)))
+        self.if_act = if_act
+        self.hardswish = nn.Hardswish()
+    def forward(self, x):
+        x = self.conv(x)
+        x = self.bn(x)
+        if self.if_act:
+            x = self.hardswish(x)
+        return x
+class SEModule(TheseusLayer):
+    def __init__(self, channel, reduction=4):
+        super().__init__()
+        self.avg_pool = AdaptiveAvgPool2D(1)
+        self.conv1 = Conv2D(in_channels=channel, out_channels=channel // reduction, kernel_size=1, stride=1, padding=0)
+        self.relu = nn.ReLU()
+        self.conv2 = Conv2D(in_channels=channel // reduction, out_channels=channel, kernel_size=1, stride=1, padding=0)
+        self.hardsigmoid = nn.Hardsigmoid()
+    def forward(self, x):
+        identity = x
+        x = self.avg_pool(x)
+        x = self.conv1(x)
+        x = self.relu(x)
+        x = self.conv2(x)
+        x = self.hardsigmoid(x)
+        x = paddle.multiply(x=identity, y=x)
+        return x
+class ESBlock1(TheseusLayer):
+    def __init__(self, in_channels, out_channels):
+        super().__init__()
+        self.pw_1_1 = ConvBNLayer(in_channels=in_channels // 2, out_channels=out_channels // 2, kernel_size=1, stride=1)
+        self.dw_1 = ConvBNLayer(in_channels=out_channels // 2,
+                                out_channels=out_channels // 2,
+                                kernel_size=3,
+                                stride=1,
+                                groups=out_channels // 2,
+                                if_act=False)
+        self.se = SEModule(out_channels)
+        self.pw_1_2 = ConvBNLayer(in_channels=out_channels, out_channels=out_channels // 2, kernel_size=1, stride=1)
+    def forward(self, x):
+        x1, x2 = split(x, num_or_sections=[x.shape[1] // 2, x.shape[1] // 2], axis=1)
+        x2 = self.pw_1_1(x2)
+        x3 = self.dw_1(x2)
+        x3 = concat([x2, x3], axis=1)
+        x3 = self.se(x3)
+        x3 = self.pw_1_2(x3)
+        x = concat([x1, x3], axis=1)
+        return channel_shuffle(x, 2)
+class ESBlock2(TheseusLayer):
+    def __init__(self, in_channels, out_channels):
+        super().__init__()
+        # branch1
+        self.dw_1 = ConvBNLayer(in_channels=in_channels,
+                                out_channels=in_channels,
+                                kernel_size=3,
+                                stride=2,
+                                groups=in_channels,
+                                if_act=False)
+        self.pw_1 = ConvBNLayer(in_channels=in_channels, out_channels=out_channels // 2, kernel_size=1, stride=1)
+        # branch2
+        self.pw_2_1 = ConvBNLayer(in_channels=in_channels, out_channels=out_channels // 2, kernel_size=1)
+        self.dw_2 = ConvBNLayer(in_channels=out_channels // 2,
+                                out_channels=out_channels // 2,
+                                kernel_size=3,
+                                stride=2,
+                                groups=out_channels // 2,
+                                if_act=False)
+        self.se = SEModule(out_channels // 2)
+        self.pw_2_2 = ConvBNLayer(in_channels=out_channels // 2, out_channels=out_channels // 2, kernel_size=1)
+        self.concat_dw = ConvBNLayer(in_channels=out_channels,
+                                     out_channels=out_channels,
+                                     kernel_size=3,
+                                     groups=out_channels)
+        self.concat_pw = ConvBNLayer(in_channels=out_channels, out_channels=out_channels, kernel_size=1)
+    def forward(self, x):
+        x1 = self.dw_1(x)
+        x1 = self.pw_1(x1)
+        x2 = self.pw_2_1(x)
+        x2 = self.dw_2(x2)
+        x2 = self.se(x2)
+        x2 = self.pw_2_2(x2)
+        x = concat([x1, x2], axis=1)
+        x = self.concat_dw(x)
+        x = self.concat_pw(x)
+        return x
+class ESNet(TheseusLayer):
+    def __init__(self,
+                 stages_pattern,
+                 class_num=1000,
+                 scale=1.0,
+                 dropout_prob=0.2,
+                 class_expand=1280,
+                 return_patterns=None,
+                 return_stages=None):
+        super().__init__()
+        self.scale = scale
+        self.class_num = class_num
+        self.class_expand = class_expand
+        stage_repeats = [3, 7, 3]
+        stage_out_channels = [
+            -1, 24, make_divisible(116 * scale),
+            make_divisible(232 * scale),
+            make_divisible(464 * scale), 1024
+        ]
+        self.conv1 = ConvBNLayer(in_channels=3, out_channels=stage_out_channels[1], kernel_size=3, stride=2)
+        self.max_pool = MaxPool2D(kernel_size=3, stride=2, padding=1)
+        block_list = []
+        for stage_id, num_repeat in enumerate(stage_repeats):
+            for i in range(num_repeat):
+                if i == 0:
+                    block = ESBlock2(in_channels=stage_out_channels[stage_id + 1],
+                                     out_channels=stage_out_channels[stage_id + 2])
+                else:
+                    block = ESBlock1(in_channels=stage_out_channels[stage_id + 2],
+                                     out_channels=stage_out_channels[stage_id + 2])
+                block_list.append(block)
+        self.blocks = nn.Sequential(*block_list)
+        self.conv2 = ConvBNLayer(in_channels=stage_out_channels[-2], out_channels=stage_out_channels[-1], kernel_size=1)
+        self.avg_pool = AdaptiveAvgPool2D(1)
+        self.last_conv = Conv2D(in_channels=stage_out_channels[-1],
+                                out_channels=self.class_expand,
+                                kernel_size=1,
+                                stride=1,
+                                padding=0,
+                                bias_attr=False)
+        self.hardswish = nn.Hardswish()
+        self.dropout = Dropout(p=dropout_prob, mode="downscale_in_infer")
+        self.flatten = nn.Flatten(start_axis=1, stop_axis=-1)
+        self.fc = Linear(self.class_expand, self.class_num)
+        super().init_res(stages_pattern, return_patterns=return_patterns, return_stages=return_stages)
+    def forward(self, x):
+        x = self.conv1(x)
+        x = self.max_pool(x)
+        x = self.blocks(x)
+        x = self.conv2(x)
+        x = self.avg_pool(x)
+        x = self.last_conv(x)
+        x = self.hardswish(x)
+        x = self.dropout(x)
+        x = self.flatten(x)
+        x = self.fc(x)
+        return x
+def ESNet_x0_25(pretrained=False, use_ssld=False, **kwargs):
+    """
+    ESNet_x0_25
+    Args:
+        pretrained: bool=False or str. If `True` load pretrained parameters, `False` otherwise.
+                    If str, means the path of the pretrained model.
+        use_ssld: bool=False. Whether using distillation pretrained model when pretrained=True.
+    Returns:
+        model: nn.Layer. Specific `ESNet_x0_25` model depends on args.
+    """
+    model = ESNet(scale=0.25, stages_pattern=MODEL_STAGES_PATTERN["ESNet"], **kwargs)
+    return model
--- a/modules/image/classification/esnet_x0_25_imagenet/module.py
+++ b/modules/image/classification/esnet_x0_25_imagenet/module.py
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse
+import copy
+import os
+import cv2
+import numpy as np
+import paddle
+from skimage.io import imread
+from skimage.transform import rescale
+from skimage.transform import resize
+import paddlehub as hub
+from .model import ESNet_x0_25
+from .processor import base64_to_cv2
+from .processor import create_operators
+from .processor import Topk
+from .utils import get_config
+from paddlehub.module.module import moduleinfo
+from paddlehub.module.module import runnable
+from paddlehub.module.module import serving
+@moduleinfo(name="esnet_x0_25_imagenet",
+            type="cv/classification",
+            author="paddlepaddle",
+            author_email="",
+            summary="",
+            version="1.0.0")
+class Esnet_x0_25_Imagenet:
+    def __init__(self):
+        self.config = get_config(os.path.join(self.directory, 'ESNet_x0_25.yaml'), show=False)
+        self.label_path = os.path.join(self.directory, 'imagenet1k_label_list.txt')
+        self.pretrain_path = os.path.join(self.directory, 'ESNet_x0_25_pretrained.pdparams')
+        self.config['Infer']['PostProcess']['class_id_map_file'] = self.label_path
+        self.model = ESNet_x0_25()
+        param_state_dict = paddle.load(self.pretrain_path)
+        self.model.set_dict(param_state_dict)
+        self.preprocess_funcs = create_operators(self.config["Infer"]["transforms"])
+    def classification(self,
+                       images: list = None,
+                       paths: list = None,
+                       batch_size: int = 1,
+                       use_gpu: bool = False,
+                       top_k: int = 1):
+        '''
+        Args:
+            images (list[numpy.ndarray]): data of images, shape of each is [H, W, C], color space must be BGR.
+            paths (list[str]): The paths of images.
+            batch_size (int): batch size.
+            use_gpu (bool): Whether to use gpu.
+            top_k (int): Return top k results.
+        Returns:
+            res (list[dict]): The classification results, each result dict contains key 'class_ids', 'scores' and 'label_names'.
+        '''
+        postprocess_func = Topk(top_k, self.label_path)
+        inputs = []
+        results = []
+        paddle.disable_static()
+        place = 'gpu:0' if use_gpu else 'cpu'
+        place = paddle.set_device(place)
+        if images is None and paths is None:
+            print('No image provided. Please input an image or an image path.')
+            return
+        if images is not None:
+            for image in images:
+                image = image[:, :, ::-1]
+                inputs.append(image)
+        if paths is not None:
+            for path in paths:
+                image = cv2.imread(path)[:, :, ::-1]
+                inputs.append(image)
+        batch_data = []
+        for idx, imagedata in enumerate(inputs):
+            for process in self.preprocess_funcs:
+                imagedata = process(imagedata)
+            batch_data.append(imagedata)
+            if len(batch_data) >= batch_size or idx == len(inputs) - 1:
+                batch_tensor = paddle.to_tensor(batch_data)
+                out = self.model(batch_tensor)
+                if isinstance(out, list):
+                    out = out[0]
+                if isinstance(out, dict) and "logits" in out:
+                    out = out["logits"]
+                if isinstance(out, dict) and "output" in out:
+                    out = out["output"]
+                result = postprocess_func(out)
+                results.extend(result)
+                batch_data.clear()
+        return results
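Note on the batching loop in `classification`: a batch is flushed when it reaches `batch_size` or when the last input has been appended, so a trailing partial batch is still processed. A minimal sketch of the same control flow on plain values, with the preprocessing and model call left out (`iter_batches` is an illustrative name):

```python
def iter_batches(items, batch_size):
    """Yield consecutive batches; the final batch may be smaller than batch_size."""
    batch = []
    for idx, item in enumerate(items):
        batch.append(item)
        if len(batch) >= batch_size or idx == len(items) - 1:
            yield list(batch)  # copy, because the buffer is reused
            batch.clear()


batches = list(iter_batches(["a", "b", "c", "d", "e"], 2))
# batches == [["a", "b"], ["c", "d"], ["e"]]
```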
+    @runnable
+    def run_cmd(self, argvs: list):
+        """
+        Run as a command.
+        """
+        self.parser = argparse.ArgumentParser(description="Run the {} module.".format(self.name),
+                                              prog='hub run {}'.format(self.name),
+                                              usage='%(prog)s',
+                                              add_help=True)
+        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
+        self.arg_config_group = self.parser.add_argument_group(
+            title="Config options", description="Run configuration for controlling module behavior, not required.")
+        self.add_module_config_arg()
+        self.add_module_input_arg()
+        self.args = self.parser.parse_args(argvs)
+        results = self.classification(paths=[self.args.input_path],
+                                      use_gpu=self.args.use_gpu,
+                                      batch_size=self.args.batch_size,
+                                      top_k=self.args.top_k)
+        return results
+    @serving
+    def serving_method(self, images, **kwargs):
+        """
+        Run as a service.
+        """
+        images_decode = [base64_to_cv2(image) for image in images]
+        results = self.classification(images=images_decode, **kwargs)
+        return results
+    def add_module_config_arg(self):
+        """
+        Add the command config options.
+        """
+        self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
+        self.arg_config_group.add_argument('--batch_size', type=int, default=1, help='batch size')
+        self.arg_config_group.add_argument('--top_k', type=int, default=1, help='Return top k results.')
+    def add_module_input_arg(self):
+        """
+        Add the command input options.
+        """
+        self.arg_input_group.add_argument('--input_path', type=str, help="path to input image.")
--- a/modules/image/classification/esnet_x0_25_imagenet/processor.py
+++ b/modules/image/classification/esnet_x0_25_imagenet/processor.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+import base64
+import inspect
+import math
+import os
+import random
+import sys
+from functools import partial
+import cv2
+import numpy as np
+import paddle
+import paddle.nn.functional as F
+import six
+from paddle.vision.transforms import ColorJitter as RawColorJitter
+from PIL import Image
+def create_operators(params, class_num=None):
+    """
+    create operators based on the config
+    Args:
+        params(list): a dict list, used to create some operators
+    """
+    assert isinstance(params, list), ('operator config should be a list')
+    ops = []
+    current_module = sys.modules[__name__]
+    for operator in params:
+        assert isinstance(operator, dict) and len(operator) == 1, "yaml format error"
+        op_name = list(operator)[0]
+        param = {} if operator[op_name] is None else operator[op_name]
+        op_func = getattr(current_module, op_name)
+        if "class_num" in inspect.getfullargspec(op_func).args:
+            param.update({"class_num": class_num})
+        op = op_func(**param)
+        ops.append(op)
+    return ops
+class UnifiedResize(object):
+    def __init__(self, interpolation=None, backend="cv2"):
+        _cv2_interp_from_str = {
+            'nearest': cv2.INTER_NEAREST,
+            'bilinear': cv2.INTER_LINEAR,
+            'area': cv2.INTER_AREA,
+            'bicubic': cv2.INTER_CUBIC,
+            'lanczos': cv2.INTER_LANCZOS4
+        }
+        _pil_interp_from_str = {
+            'nearest': Image.NEAREST,
+            'bilinear': Image.BILINEAR,
+            'bicubic': Image.BICUBIC,
+            'box': Image.BOX,
+            'lanczos': Image.LANCZOS,
+            'hamming': Image.HAMMING
+        }
+        def _pil_resize(src, size, resample):
+            pil_img = Image.fromarray(src)
+            pil_img = pil_img.resize(size, resample)
+            return np.asarray(pil_img)
+        if backend.lower() == "cv2":
+            if isinstance(interpolation, str):
+                interpolation = _cv2_interp_from_str[interpolation.lower()]
+            # compatible with opencv < version 4.4.0
+            elif interpolation is None:
+                interpolation = cv2.INTER_LINEAR
+            self.resize_func = partial(cv2.resize, interpolation=interpolation)
+        elif backend.lower() == "pil":
+            if isinstance(interpolation, str):
+                interpolation = _pil_interp_from_str[interpolation.lower()]
+            self.resize_func = partial(_pil_resize, resample=interpolation)
+        else:
+            self.resize_func = cv2.resize
+    def __call__(self, src, size):
+        return self.resize_func(src, size)
+class OperatorParamError(ValueError):
+    """ OperatorParamError
+    """
+    pass
+class DecodeImage(object):
+    """ decode image """
+    def __init__(self, to_rgb=True, to_np=False, channel_first=False):
+        self.to_rgb = to_rgb
+        self.to_np = to_np  # to numpy
+        self.channel_first = channel_first  # only enabled when to_np is True
+    def __call__(self, img):
+        if six.PY2:
+            assert type(img) is str and len(img) > 0, "invalid input 'img' in DecodeImage"
+        else:
+            assert type(img) is bytes and len(img) > 0, "invalid input 'img' in DecodeImage"
+        data = np.frombuffer(img, dtype='uint8')
+        img = cv2.imdecode(data, 1)
+        if self.to_rgb:
+            assert img.shape[2] == 3, 'invalid shape of image[%s]' % (img.shape)
+            img = img[:, :, ::-1]
+        if self.channel_first:
+            img = img.transpose((2, 0, 1))
+        return img
+class ResizeImage(object):
+    """ resize image """
+    def __init__(self, size=None, resize_short=None, interpolation=None, backend="cv2"):
+        if resize_short is not None and resize_short > 0:
+            self.resize_short = resize_short
+            self.w = None
+            self.h = None
+        elif size is not None:
+            self.resize_short = None
+            self.w = size if type(size) is int else size[0]
+            self.h = size if type(size) is int else size[1]
+        else:
+            raise OperatorParamError("invalid params for ResizeImage: "
+                                     "both 'size' and 'resize_short' are None")
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        img_h, img_w = img.shape[:2]
+        if self.resize_short is not None:
+            percent = float(self.resize_short) / min(img_w, img_h)
+            w = int(round(img_w * percent))
+            h = int(round(img_h * percent))
+        else:
+            w = self.w
+            h = self.h
+        return self._resize_func(img, (w, h))
+class CropImage(object):
+    """ crop image """
+    def __init__(self, size):
+        if type(size) is int:
+            self.size = (size, size)
+        else:
+            self.size = size  # (w, h), unpacked as `w, h` in __call__
+    def __call__(self, img):
+        w, h = self.size
+        img_h, img_w = img.shape[:2]
+        w_start = (img_w - w) // 2
+        h_start = (img_h - h) // 2
+        w_end = w_start + w
+        h_end = h_start + h
+        return img[h_start:h_end, w_start:w_end, :]
+class RandCropImage(object):
+    """ random crop image """
+    def __init__(self, size, scale=None, ratio=None, interpolation=None, backend="cv2"):
+        if type(size) is int:
+            self.size = (size, size)  # (w, h), passed as-is to the resize function
+        else:
+            self.size = size
+        self.scale = [0.08, 1.0] if scale is None else scale
+        self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        size = self.size
+        scale = self.scale
+        ratio = self.ratio
+        aspect_ratio = math.sqrt(random.uniform(*ratio))
+        w = 1. * aspect_ratio
+        h = 1. / aspect_ratio
+        img_h, img_w = img.shape[:2]
+        bound = min((float(img_w) / img_h) / (w**2), (float(img_h) / img_w) / (h**2))
+        scale_max = min(scale[1], bound)
+        scale_min = min(scale[0], bound)
+        target_area = img_w * img_h * random.uniform(scale_min, scale_max)
+        target_size = math.sqrt(target_area)
+        w = int(target_size * w)
+        h = int(target_size * h)
+        i = random.randint(0, img_w - w)
+        j = random.randint(0, img_h - h)
+        img = img[j:j + h, i:i + w, :]
+        return self._resize_func(img, size)
+class RandFlipImage(object):
+    """ random flip image
+        flip_code:
+            1: Flipped Horizontally
+            0: Flipped Vertically
+            -1: Flipped Horizontally & Vertically
+    """
+    def __init__(self, flip_code=1):
+        assert flip_code in [-1, 0, 1], "flip_code should be a value in [-1, 0, 1]"
+        self.flip_code = flip_code
+    def __call__(self, img):
+        if random.randint(0, 1) == 1:
+            return cv2.flip(img, self.flip_code)
+        else:
+            return img
+class NormalizeImage(object):
+    """ normalize image, e.g. subtract mean and divide by std
+    """
+    def __init__(self, scale=None, mean=None, std=None, order='chw', output_fp16=False, channel_num=3):
+        if isinstance(scale, str):
+            scale = eval(scale)
+        assert channel_num in [3, 4], "channel number of input image should be set to 3 or 4."
+        self.channel_num = channel_num
+        self.output_dtype = 'float16' if output_fp16 else 'float32'
+        self.scale = np.float32(scale if scale is not None else 1.0 / 255.0)
+        self.order = order
+        mean = mean if mean is not None else [0.485, 0.456, 0.406]
+        std = std if std is not None else [0.229, 0.224, 0.225]
+        shape = (3, 1, 1) if self.order == 'chw' else (1, 1, 3)
+        self.mean = np.array(mean).reshape(shape).astype('float32')
+        self.std = np.array(std).reshape(shape).astype('float32')
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        assert isinstance(img, np.ndarray), "invalid input 'img' in NormalizeImage"
+        img = (img.astype('float32') * self.scale - self.mean) / self.std
+        if self.channel_num == 4:
+            img_h = img.shape[1] if self.order == 'chw' else img.shape[0]
+            img_w = img.shape[2] if self.order == 'chw' else img.shape[1]
+            pad_zeros = np.zeros((1, img_h, img_w)) if self.order == 'chw' else np.zeros((img_h, img_w, 1))
+            img = (np.concatenate((img, pad_zeros), axis=0) if self.order == 'chw' else np.concatenate(
+                (img, pad_zeros), axis=2))
+        return img.astype(self.output_dtype)
+class ToCHWImage(object):
+    """ convert hwc image to chw image
+    """
+    def __init__(self):
+        pass
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        return img.transpose((2, 0, 1))
+class ColorJitter(RawColorJitter):
+    """ColorJitter.
+    """
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+    def __call__(self, img):
+        if not isinstance(img, Image.Image):
+            img = np.ascontiguousarray(img)
+            img = Image.fromarray(img)
+        img = super()._apply_image(img)
+        if isinstance(img, Image.Image):
+            img = np.asarray(img)
+        return img
+def base64_to_cv2(b64str):
+    data = base64.b64decode(b64str.encode('utf8'))
+    data = np.frombuffer(data, np.uint8)  # np.fromstring is deprecated for binary input
+    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
+    return data
+class Topk(object):
+    def __init__(self, topk=1, class_id_map_file=None):
+        assert isinstance(topk, (int, ))
+        self.class_id_map = self.parse_class_id_map(class_id_map_file)
+        self.topk = topk
+    def parse_class_id_map(self, class_id_map_file):
+        if class_id_map_file is None:
+            return None
+        if not os.path.exists(class_id_map_file):
+            print(
+                "Warning: to use your own label_dict, please provide a valid path!\nOtherwise label_names will be empty!"
+            )
+            return None
+        try:
+            class_id_map = {}
+            with open(class_id_map_file, "r") as fin:
+                lines = fin.readlines()
+                for line in lines:
+                    partition = line.split("\n")[0].partition(" ")
+                    class_id_map[int(partition[0])] = str(partition[-1])
+        except Exception as ex:
+            print(ex)
+            class_id_map = None
+        return class_id_map
+    def __call__(self, x, file_names=None, multilabel=False):
+        assert isinstance(x, paddle.Tensor)
+        if file_names is not None:
+            assert x.shape[0] == len(file_names)
+        x = F.softmax(x, axis=-1) if not multilabel else F.sigmoid(x)
+        x = x.numpy()
+        y = []
+        for idx, probs in enumerate(x):
+            index = probs.argsort(axis=0)[-self.topk:][::-1].astype("int32") if not multilabel else np.where(
+                probs >= 0.5)[0].astype("int32")
+            clas_id_list = []
+            score_list = []
+            label_name_list = []
+            for i in index:
+                clas_id_list.append(i.item())
+                score_list.append(probs[i].item())
+                if self.class_id_map is not None:
+                    label_name_list.append(self.class_id_map[i.item()])
+            result = {
+                "class_ids": clas_id_list,
+                "scores": np.around(score_list, decimals=5).tolist(),
+            }
+            if file_names is not None:
+                result["file_name"] = file_names[idx]
+            if label_name_list is not None:
+                result["label_names"] = label_name_list
+            y.append(result)
+        return y
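The single-label path of `Topk.__call__` above (softmax, sort by probability, keep the top k, attach label names when a class-id map is loaded) can be sketched without Paddle or NumPy. This is a minimal illustration, not the module's implementation: `softmax` and `topk_single_label` are hypothetical helper names, and the logits and label map are made-up inputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk_single_label(logits, k=1, class_id_map=None):
    """Return the k most probable class ids with their scores and label names."""
    probs = softmax(logits)
    # Sort class indices by descending probability and keep the first k.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return {
        "class_ids": order,
        "scores": [round(probs[i], 5) for i in order],
        # As in the diff above, label_names stays an empty list when no map is given.
        "label_names": [class_id_map[i] for i in order] if class_id_map else [],
    }


# Made-up logits for three classes and a made-up label map.
example = topk_single_label([2.0, 0.5, 1.0], k=2, class_id_map={0: "cat", 1: "dog", 2: "bird"})
print(example["class_ids"], example["label_names"])
```

The multilabel branch differs only in the selection rule: it applies a sigmoid and keeps every class whose probability is at least 0.5 instead of sorting.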
--- a/modules/image/classification/esnet_x0_25_imagenet/utils.py
+++ b/modules/image/classification/esnet_x0_25_imagenet/utils.py
+# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse
+import copy
+import os
+import yaml
+__all__ = ['get_config']
+class AttrDict(dict):
+    def __getattr__(self, key):
+        return self[key]
+    def __setattr__(self, key, value):
+        if key in self.__dict__:
+            self.__dict__[key] = value
+        else:
+            self[key] = value
+    def __deepcopy__(self, content):
+        return copy.deepcopy(dict(self))
+def create_attr_dict(yaml_config):
+    from ast import literal_eval
+    for key, value in yaml_config.items():
+        if type(value) is dict:
+            yaml_config[key] = value = AttrDict(value)
+        if isinstance(value, str):
+            try:
+                value = literal_eval(value)
+            except BaseException:
+                pass
+        if isinstance(value, AttrDict):
+            create_attr_dict(yaml_config[key])
+        else:
+            yaml_config[key] = value
+def parse_config(cfg_file):
+    """Load a config file into AttrDict"""
+    with open(cfg_file, 'r') as fopen:
+        yaml_config = AttrDict(yaml.load(fopen, Loader=yaml.SafeLoader))
+    create_attr_dict(yaml_config)
+    return yaml_config
+def override(dl, ks, v):
+    """
+    Recursively replace a value inside a nested dict or list
+    Args:
+        dl(dict or list): dict or list to be replaced
+        ks(list): list of keys
+        v(str): value to be replaced
+    """
+    def str2num(v):
+        try:
+            return eval(v)
+        except Exception:
+            return v
+    assert isinstance(dl, (list, dict)), ("{} should be a list or a dict".format(dl))
+    assert len(ks) > 0, ('length of keys should be larger than 0')
+    if isinstance(dl, list):
+        k = str2num(ks[0])
+        if len(ks) == 1:
+            assert k < len(dl), ('index({}) out of range({})'.format(k, dl))
+            dl[k] = str2num(v)
+        else:
+            override(dl[k], ks[1:], v)
+    else:
+        if len(ks) == 1:
+            # assert ks[0] in dl, ('{} is not exist in {}'.format(ks[0], dl))
+            if ks[0] not in dl:
+                print('A new field ({}) detected!'.format(ks[0]))
+            dl[ks[0]] = str2num(v)
+        else:
+            override(dl[ks[0]], ks[1:], v)
+def override_config(config, options=None):
+    """
+    Recursively override the config
+    Args:
+        config(dict): dict to be replaced
+        options(list): list of pairs(key0.key1.idx.key2=value)
+            such as: [
+                'topk=2',
+                'VALID.transforms.1.ResizeImage.resize_short=300'
+            ]
+    Returns:
+        config(dict): replaced config
+    """
+    if options is not None:
+        for opt in options:
+            assert isinstance(opt, str), ("option({}) should be a str".format(opt))
+            assert "=" in opt, ("option({}) should contain a = "
+                                "to distinguish between key and value".format(opt))
+            pair = opt.split('=')
+            assert len(pair) == 2, ("there can be only one = in the option")
+            key, value = pair
+            keys = key.split('.')
+            override(config, keys, value)
+    return config
+def get_config(fname, overrides=None, show=False):
+    """
+    Read config from file
+    """
+    assert os.path.exists(fname), ('config file({}) does not exist'.format(fname))
+    config = parse_config(fname)
+    override_config(config, overrides)
+    return config
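The `key0.key1.idx.key2=value` override format handled by `override_config` above can be illustrated with a small standalone sketch. It walks the dotted path through nested dicts and lists and assigns the parsed value at the end. The names `to_value` and `apply_override` are hypothetical, the sample config is invented, and the sketch swaps `eval` for `ast.literal_eval`, which is an assumption of this example rather than what utils.py does.

```python
from ast import literal_eval


def to_value(text):
    """Parse numbers, lists, bools, etc. where possible; otherwise keep the string."""
    try:
        return literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_override(config, option):
    """Apply one 'key0.key1.idx.key2=value' option to a nested dict/list in place."""
    key_path, _, raw_value = option.partition("=")
    keys = key_path.split(".")
    node = config
    for key in keys[:-1]:
        # Lists are indexed by integer position, dicts by string key.
        node = node[int(key)] if isinstance(node, list) else node[key]
    last = keys[-1]
    if isinstance(node, list):
        node[int(last)] = to_value(raw_value)
    else:
        node[last] = to_value(raw_value)
    return config


# Invented config shaped like the docstring example above.
cfg = {
    "topk": 1,
    "VALID": {"transforms": [{"DecodeImage": None}, {"ResizeImage": {"resize_short": 256}}]},
}
apply_override(cfg, "topk=2")
apply_override(cfg, "VALID.transforms.1.ResizeImage.resize_short=300")
print(cfg["topk"], cfg["VALID"]["transforms"][1]["ResizeImage"]["resize_short"])
```

Using `literal_eval` restricts values to Python literals, so an override string cannot execute arbitrary code; values that are not literals simply stay strings.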
--- a/modules/image/classification/esnet_x0_5_imagenet/README.md
+++ b/modules/image/classification/esnet_x0_5_imagenet/README.md
+# esnet_x0_5_imagenet
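+|Module Name|esnet_x0_5_imagenet|
+| :--- | :---: |
+|Category|Image - image classification|
+|Network|ESNet|
+|Dataset|ImageNet-2012|
+|Fine-tuning supported|No|
+|Module Size|12 MB|
+|Latest update date|2022-04-02|
+|Data indicators|Acc|
+## I. Basic Information
+- ### Module Introduction
+  - ESNet (Enhanced ShuffleNet) is a lightweight network developed by Baidu. It builds on ShuffleNetV2 and incorporates the strengths of MobileNetV3, GhostNet, and PPLCNet, resulting in a network that is both faster and more accurate on ARM devices. Because of this performance, [PP-PicoDet](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet), released by PaddleDetection, uses it as its backbone; paired with a stronger object detection algorithm, it set a new SOTA for object detection models on ARM devices. This module is the ESNet model with the scale parameter set to x0.5.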
+## II. Installation
+- ### 1. Environmental Dependence  
+  - paddlepaddle >= 1.6.2  
+  - paddlehub >= 1.6.0  | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+- ### 2. Installation
+  - ```shell
+    $ hub install esnet_x0_5_imagenet
+    ```
+  - If you run into problems during installation, see: [Windows quick start](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+ | [Linux quick start](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quick start](../../../../docs/docs_ch/get_start/mac_quickstart.md)
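+## III. Module API Prediction
+- ### 1. Command Line Prediction
+  - ```shell
+    $ hub run esnet_x0_5_imagenet --input_path "/PATH/TO/IMAGE"
+    ```
+  - This runs the classification module from the command line. For more, see [PaddleHub command line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+- ### 2. Prediction Code Example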
+  - ```python
+    import paddlehub as hub
+    import cv2
+    classifier = hub.Module(name="esnet_x0_5_imagenet")
+    result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')])
+    # or
+    # result = classifier.classification(paths=['/PATH/TO/IMAGE'])
+    ```
+- ### 3. API
+  - ```python
+    def classification(images=None,
+                       paths=None,
+                       batch_size=1,
+                       use_gpu=False,
+                       top_k=1):
+    ```
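+    - Classification API.
+    - **Parameters**
+      - images (list\[numpy.ndarray\]): image data; each image has shape \[H, W, C\] and uses the BGR color space; <br/>
+      - paths (list\[str\]): image paths; <br/>
+      - batch\_size (int): batch size;<br/>
+      - use\_gpu (bool): whether to use the GPU; **if using the GPU, set the CUDA_VISIBLE_DEVICES environment variable first** <br/>
+      - top\_k (int): return the top k prediction results.
+    - **Return**
+      - res (list\[dict\]): classification results; each element of the list is a dict whose keys include 'class_ids' (class indices), 'scores' (confidence), and 'label_names' (class names)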
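+## IV. Server Deployment
+- PaddleHub Serving can deploy an online image recognition service.
+- ### Step 1: Start PaddleHub Serving
+  - Run the startup command:
+  - ```shell
+    $ hub serving start -m esnet_x0_5_imagenet
+    ```
+  - This deploys the online image recognition service; the default port is 8866.
+  - **NOTE:** To predict on a GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+- ### Step 2: Send a prediction request
+  - Once the server is configured, the following few lines of code send a prediction request and retrieve the result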
+  - ```python
+    import requests
+    import json
+    import cv2
+    import base64
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+    # Send the HTTP request
+    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+    headers = {"Content-type": "application/json"}
+    url = "http://127.0.0.1:8866/predict/esnet_x0_5_imagenet"
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    # Print the prediction results
+    print(r.json()["results"])
+    ```
+## V. Release Note
+* 1.0.0
+  First release
+  - ```shell
+    $ hub install esnet_x0_5_imagenet==1.0.0
+    ```
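The serving request in the README sends each image as a base64 string inside a JSON body, and the module decodes it with `base64_to_cv2` in processor.py. The round trip can be sketched with the standard library alone. In this sketch the image is replaced by placeholder bytes, and `bytes_to_base64` and `base64_to_bytes` are hypothetical stand-ins for the cv2-based helpers; no server is contacted.

```python
import base64
import json


def bytes_to_base64(raw: bytes) -> str:
    """Client side: encode raw (e.g. JPEG-encoded) bytes as a UTF-8 base64 string."""
    return base64.b64encode(raw).decode("utf8")


def base64_to_bytes(b64str: str) -> bytes:
    """Server side: recover the raw bytes before handing them to an image decoder."""
    return base64.b64decode(b64str.encode("utf8"))


# Placeholder bytes standing in for the output of cv2.imencode('.jpg', image).
fake_jpeg = b"\xff\xd8\xff\xe0 placeholder image bytes"

# JSON body in the same shape as the README example: {"images": [<base64 string>]}.
payload = json.dumps({"images": [bytes_to_base64(fake_jpeg)]})

# What the receiving side would do with the body.
received = json.loads(payload)["images"][0]
decoded = base64_to_bytes(received)
print(decoded == fake_jpeg)
```

Base64 is needed here because JSON cannot carry raw binary data; the encoded string is roughly a third larger than the original bytes.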
--- a/modules/image/classification/esnet_x0_5_imagenet/model.py
+++ b/modules/image/classification/esnet_x0_5_imagenet/model.py
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import math
+from typing import Any
+from typing import Callable
+from typing import Dict
+from typing import List
+from typing import Tuple
+from typing import Union
+import paddle
+import paddle.nn as nn
+from paddle import concat
+from paddle import ParamAttr
+from paddle import reshape
+from paddle import split
+from paddle import transpose
+from paddle.nn import AdaptiveAvgPool2D
+from paddle.nn import BatchNorm
+from paddle.nn import Conv2D
+from paddle.nn import Dropout
+from paddle.nn import Linear
+from paddle.nn import MaxPool2D
+from paddle.nn.initializer import KaimingNormal
+from paddle.regularizer import L2Decay
+MODEL_STAGES_PATTERN = {"ESNet": ["blocks[2]", "blocks[9]", "blocks[12]"]}
+class Identity(nn.Layer):
+    def __init__(self):
+        super(Identity, self).__init__()
+    def forward(self, inputs):
+        return inputs
+class TheseusLayer(nn.Layer):
+    def __init__(self, *args, **kwargs):
+        super(TheseusLayer, self).__init__()
+        self.res_dict = {}
+        self.res_name = self.full_name()
+        self.pruner = None
+        self.quanter = None
+    def _return_dict_hook(self, layer, input, output):
+        res_dict = {"output": output}
+        # 'list' is needed to avoid error raised by popping self.res_dict
+        for res_key in list(self.res_dict):
+            # clear the res_dict because the forward process may change according to input
+            res_dict[res_key] = self.res_dict.pop(res_key)
+        return res_dict
+    def init_res(self, stages_pattern, return_patterns=None, return_stages=None):
+        if return_patterns and return_stages:
+            msg = f"The 'return_patterns' would be ignored when 'return_stages' is set."
+            return_stages = None
+        if return_stages is True:
+            return_patterns = stages_pattern
+        # return_stages is int or bool
+        if type(return_stages) is int:
+            return_stages = [return_stages]
+        if isinstance(return_stages, list):
+            if max(return_stages) > len(stages_pattern) or min(return_stages) < 0:
+                msg = f"The 'return_stages' set error. Illegal value(s) have been ignored. The stages' pattern list is {stages_pattern}."
+                return_stages = [val for val in return_stages if val >= 0 and val < len(stages_pattern)]
+            return_patterns = [stages_pattern[i] for i in return_stages]
+        if return_patterns:
+            self.update_res(return_patterns)
+    def replace_sub(self, *args, **kwargs) -> None:
+        msg = "The function 'replace_sub()' is deprecated, please use 'upgrade_sublayer()' instead."
+        raise DeprecationWarning(msg)
+    def upgrade_sublayer(self, layer_name_pattern: Union[str, List[str]],
+                         handle_func: Callable[[nn.Layer, str], nn.Layer]) -> List[str]:
+        """use 'handle_func' to modify the sub-layer(s) specified by 'layer_name_pattern'.
+        Args:
+            layer_name_pattern (Union[str, List[str]]): The name of layer to be modified by 'handle_func'.
+            handle_func (Callable[[nn.Layer, str], nn.Layer]): The function to modify the target layer specified by 'layer_name_pattern'. Its parameters are the layer (nn.Layer) and the pattern (str), which is layer_name_pattern itself or one of its members when layer_name_pattern is a list. It returns the processed layer.
+        Returns:
+            List[str]: The patterns that matched a sub-layer and were handled by 'handle_func()'.
+        Examples:
+            from paddle import nn
+            import paddleclas
+            def rep_func(layer: nn.Layer, pattern: str):
+                new_layer = nn.Conv2D(
+                    in_channels=layer._in_channels,
+                    out_channels=layer._out_channels,
+                    kernel_size=5,
+                    padding=2
+                )
+                return new_layer
+            net = paddleclas.MobileNetV1()
+            res = net.upgrade_sublayer(layer_name_pattern=["blocks[11].depthwise_conv.conv", "blocks[12].depthwise_conv.conv"], handle_func=rep_func)
+            print(res)
+            # ['blocks[11].depthwise_conv.conv', 'blocks[12].depthwise_conv.conv']
+        """
+        if not isinstance(layer_name_pattern, list):
+            layer_name_pattern = [layer_name_pattern]
+        hit_layer_pattern_list = []
+        for pattern in layer_name_pattern:
+            # parse pattern to find target layer and its parent
+            layer_list = parse_pattern_str(pattern=pattern, parent_layer=self)
+            if not layer_list:
+                continue
+            sub_layer_parent = layer_list[-2]["layer"] if len(layer_list) > 1 else self
+            sub_layer = layer_list[-1]["layer"]
+            sub_layer_name = layer_list[-1]["name"]
+            sub_layer_index = layer_list[-1]["index"]
+            new_sub_layer = handle_func(sub_layer, pattern)
+            if sub_layer_index:
+                getattr(sub_layer_parent, sub_layer_name)[sub_layer_index] = new_sub_layer
+            else:
+                setattr(sub_layer_parent, sub_layer_name, new_sub_layer)
+            hit_layer_pattern_list.append(pattern)
+        return hit_layer_pattern_list
+    def stop_after(self, stop_layer_name: str) -> bool:
+        """stop forward and backward after 'stop_layer_name'.
+        Args:
+            stop_layer_name (str): The name of layer that stop forward and backward after this layer.
+        Returns:
+            bool: 'True' if successful, 'False' otherwise.
+        """
+        layer_list = parse_pattern_str(stop_layer_name, self)
+        if not layer_list:
+            return False
+        parent_layer = self
+        for layer_dict in layer_list:
+            name, index = layer_dict["name"], layer_dict["index"]
+            if not set_identity(parent_layer, name, index):
+                msg = f"Failed to set the layers that after stop_layer_name('{stop_layer_name}') to IdentityLayer. The error layer's name is '{name}'."
+                return False
+            parent_layer = layer_dict["layer"]
+        return True
+    def update_res(self, return_patterns: Union[str, List[str]]) -> List[str]:
+        """update the result(s) to be returned.
+        Args:
+            return_patterns (Union[str, List[str]]): The name of layer to return output.
+        Returns:
+            List[str]: The patterns (str) whose layers have been set successfully.
+        """
+        # clear res_dict that could have been set
+        self.res_dict = {}
+        class Handler(object):
+            def __init__(self, res_dict):
+                # res_dict is a reference
+                self.res_dict = res_dict
+            def __call__(self, layer, pattern):
+                layer.res_dict = self.res_dict
+                layer.res_name = pattern
+                if hasattr(layer, "hook_remove_helper"):
+                    layer.hook_remove_helper.remove()
+                layer.hook_remove_helper = layer.register_forward_post_hook(save_sub_res_hook)
+                return layer
+        handle_func = Handler(self.res_dict)
+        hit_layer_pattern_list = self.upgrade_sublayer(return_patterns, handle_func=handle_func)
+        if hasattr(self, "hook_remove_helper"):
+            self.hook_remove_helper.remove()
+        self.hook_remove_helper = self.register_forward_post_hook(self._return_dict_hook)
+        return hit_layer_pattern_list
+def save_sub_res_hook(layer, input, output):
+    layer.res_dict[layer.res_name] = output
+def set_identity(parent_layer: nn.Layer, layer_name: str, layer_index: str = None) -> bool:
+    """set the layers that follow the one specified by layer_name and layer_index to Identity.
+    Args:
+        parent_layer (nn.Layer): The parent layer of the target layer specified by layer_name and layer_index.
+        layer_name (str): The name of the target layer; sub-layers after it are set to Identity.
+        layer_index (str, optional): The index of the target layer in parent_layer; sub-layers after it are set to Identity. Defaults to None.
+    Returns:
+        bool: True if successful, False otherwise.
+    """
+    stop_after = False
+    for sub_layer_name in parent_layer._sub_layers:
+        if stop_after:
+            parent_layer._sub_layers[sub_layer_name] = Identity()
+            continue
+        if sub_layer_name == layer_name:
+            stop_after = True
+    if layer_index and stop_after:
+        stop_after = False
+        for sub_layer_index in parent_layer._sub_layers[layer_name]._sub_layers:
+            if stop_after:
+                parent_layer._sub_layers[layer_name][sub_layer_index] = Identity()
+                continue
+            if layer_index == sub_layer_index:
+                stop_after = True
+    return stop_after
+def parse_pattern_str(pattern: str, parent_layer: nn.Layer) -> Union[None, List[Dict[str, Union[nn.Layer, str, None]]]]:
+    """parse the string type pattern.
+    Args:
+        pattern (str): The pattern that describes the layer.
+        parent_layer (nn.Layer): The root layer relative to the pattern.
+    Returns:
+        Union[None, List[Dict[str, Union[nn.Layer, str, None]]]]: None if failed. If successful, the members are layers parsed in order:
+                                                                [
+                                                                    {"layer": first layer, "name": first layer's name parsed, "index": first layer's index parsed if exist},
+                                                                    {"layer": second layer, "name": second layer's name parsed, "index": second layer's index parsed if exist},
+                                                                    ...
+                                                                ]
+    """
+    pattern_list = pattern.split(".")
+    if not pattern_list:
+        msg = f"The pattern('{pattern}') is illegal. Please check and retry."
+        return None
+    layer_list = []
+    while len(pattern_list) > 0:
+        if '[' in pattern_list[0]:
+            target_layer_name = pattern_list[0].split('[')[0]
+            target_layer_index = pattern_list[0].split('[')[1].split(']')[0]
+        else:
+            target_layer_name = pattern_list[0]
+            target_layer_index = None
+        target_layer = getattr(parent_layer, target_layer_name, None)
+        if target_layer is None:
+            msg = f"Not found layer named('{target_layer_name}') specified in pattern('{pattern}')."
+            return None
+        if target_layer_index and target_layer:
+            if int(target_layer_index) < 0 or int(target_layer_index) >= len(target_layer):
+                msg = f"Not found layer by index('{target_layer_index}') specified in pattern('{pattern}'). The index should be < {len(target_layer)} and >= 0."
+                return None
+            target_layer = target_layer[target_layer_index]
+        layer_list.append({"layer": target_layer, "name": target_layer_name, "index": target_layer_index})
+        pattern_list = pattern_list[1:]
+        parent_layer = target_layer
+    return layer_list
+def channel_shuffle(x, groups):
+    batch_size, num_channels, height, width = x.shape[0:4]
+    channels_per_group = num_channels // groups
+    x = reshape(x=x, shape=[batch_size, groups, channels_per_group, height, width])
+    x = transpose(x=x, perm=[0, 2, 1, 3, 4])
+    x = reshape(x=x, shape=[batch_size, num_channels, height, width])
+    return x
+def make_divisible(v, divisor=8, min_value=None):
+    if min_value is None:
+        min_value = divisor
+    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
+    if new_v < 0.9 * v:
+        new_v += divisor
+    return new_v
+class ConvBNLayer(TheseusLayer):
+    def __init__(self, in_channels, out_channels, kernel_size, stride=1, groups=1, if_act=True):
+        super().__init__()
+        self.conv = Conv2D(in_channels=in_channels,
+                           out_channels=out_channels,
+                           kernel_size=kernel_size,
+                           stride=stride,
+                           padding=(kernel_size - 1) // 2,
+                           groups=groups,
+                           weight_attr=ParamAttr(initializer=KaimingNormal()),
+                           bias_attr=False)
+        self.bn = BatchNorm(out_channels,
+                            param_attr=ParamAttr(regularizer=L2Decay(0.0)),
+                            bias_attr=ParamAttr(regularizer=L2Decay(0.0)))
+        self.if_act = if_act
+        self.hardswish = nn.Hardswish()
+    def forward(self, x):
+        x = self.conv(x)
+        x = self.bn(x)
+        if self.if_act:
+            x = self.hardswish(x)
+        return x
+class SEModule(TheseusLayer):
+    def __init__(self, channel, reduction=4):
+        super().__init__()
+        self.avg_pool = AdaptiveAvgPool2D(1)
+        self.conv1 = Conv2D(in_channels=channel, out_channels=channel // reduction, kernel_size=1, stride=1, padding=0)
+        self.relu = nn.ReLU()
+        self.conv2 = Conv2D(in_channels=channel // reduction, out_channels=channel, kernel_size=1, stride=1, padding=0)
+        self.hardsigmoid = nn.Hardsigmoid()
+    def forward(self, x):
+        identity = x
+        x = self.avg_pool(x)
+        x = self.conv1(x)
+        x = self.relu(x)
+        x = self.conv2(x)
+        x = self.hardsigmoid(x)
+        x = paddle.multiply(x=identity, y=x)
+        return x
+class ESBlock1(TheseusLayer):
+    def __init__(self, in_channels, out_channels):
+        super().__init__()
+        self.pw_1_1 = ConvBNLayer(in_channels=in_channels // 2, out_channels=out_channels // 2, kernel_size=1, stride=1)
+        self.dw_1 = ConvBNLayer(in_channels=out_channels // 2,
+                                out_channels=out_channels // 2,
+                                kernel_size=3,
+                                stride=1,
+                                groups=out_channels // 2,
+                                if_act=False)
+        self.se = SEModule(out_channels)
+        self.pw_1_2 = ConvBNLayer(in_channels=out_channels, out_channels=out_channels // 2, kernel_size=1, stride=1)
+    def forward(self, x):
+        x1, x2 = split(x, num_or_sections=[x.shape[1] // 2, x.shape[1] // 2], axis=1)
+        x2 = self.pw_1_1(x2)
+        x3 = self.dw_1(x2)
+        x3 = concat([x2, x3], axis=1)
+        x3 = self.se(x3)
+        x3 = self.pw_1_2(x3)
+        x = concat([x1, x3], axis=1)
+        return channel_shuffle(x, 2)
+class ESBlock2(TheseusLayer):
+    def __init__(self, in_channels, out_channels):
+        super().__init__()
+        # branch1
+        self.dw_1 = ConvBNLayer(in_channels=in_channels,
+                                out_channels=in_channels,
+                                kernel_size=3,
+                                stride=2,
+                                groups=in_channels,
+                                if_act=False)
+        self.pw_1 = ConvBNLayer(in_channels=in_channels, out_channels=out_channels // 2, kernel_size=1, stride=1)
+        # branch2
+        self.pw_2_1 = ConvBNLayer(in_channels=in_channels, out_channels=out_channels // 2, kernel_size=1)
+        self.dw_2 = ConvBNLayer(in_channels=out_channels // 2,
+                                out_channels=out_channels // 2,
+                                kernel_size=3,
+                                stride=2,
+                                groups=out_channels // 2,
+                                if_act=False)
+        self.se = SEModule(out_channels // 2)
+        self.pw_2_2 = ConvBNLayer(in_channels=out_channels // 2, out_channels=out_channels // 2, kernel_size=1)
+        self.concat_dw = ConvBNLayer(in_channels=out_channels,
+                                     out_channels=out_channels,
+                                     kernel_size=3,
+                                     groups=out_channels)
+        self.concat_pw = ConvBNLayer(in_channels=out_channels, out_channels=out_channels, kernel_size=1)
+    def forward(self, x):
+        x1 = self.dw_1(x)
+        x1 = self.pw_1(x1)
+        x2 = self.pw_2_1(x)
+        x2 = self.dw_2(x2)
+        x2 = self.se(x2)
+        x2 = self.pw_2_2(x2)
+        x = concat([x1, x2], axis=1)
+        x = self.concat_dw(x)
+        x = self.concat_pw(x)
+        return x
+class ESNet(TheseusLayer):
+    def __init__(self,
+                 stages_pattern,
+                 class_num=1000,
+                 scale=1.0,
+                 dropout_prob=0.2,
+                 class_expand=1280,
+                 return_patterns=None,
+                 return_stages=None):
+        super().__init__()
+        self.scale = scale
+        self.class_num = class_num
+        self.class_expand = class_expand
+        stage_repeats = [3, 7, 3]
+        stage_out_channels = [
+            -1, 24, make_divisible(116 * scale),
+            make_divisible(232 * scale),
+            make_divisible(464 * scale), 1024
+        ]
+        self.conv1 = ConvBNLayer(in_channels=3, out_channels=stage_out_channels[1], kernel_size=3, stride=2)
+        self.max_pool = MaxPool2D(kernel_size=3, stride=2, padding=1)
+        block_list = []
+        for stage_id, num_repeat in enumerate(stage_repeats):
+            for i in range(num_repeat):
+                if i == 0:
+                    block = ESBlock2(in_channels=stage_out_channels[stage_id + 1],
+                                     out_channels=stage_out_channels[stage_id + 2])
+                else:
+                    block = ESBlock1(in_channels=stage_out_channels[stage_id + 2],
+                                     out_channels=stage_out_channels[stage_id + 2])
+                block_list.append(block)
+        self.blocks = nn.Sequential(*block_list)
+        self.conv2 = ConvBNLayer(in_channels=stage_out_channels[-2], out_channels=stage_out_channels[-1], kernel_size=1)
+        self.avg_pool = AdaptiveAvgPool2D(1)
+        self.last_conv = Conv2D(in_channels=stage_out_channels[-1],
+                                out_channels=self.class_expand,
+                                kernel_size=1,
+                                stride=1,
+                                padding=0,
+                                bias_attr=False)
+        self.hardswish = nn.Hardswish()
+        self.dropout = Dropout(p=dropout_prob, mode="downscale_in_infer")
+        self.flatten = nn.Flatten(start_axis=1, stop_axis=-1)
+        self.fc = Linear(self.class_expand, self.class_num)
+        super().init_res(stages_pattern, return_patterns=return_patterns, return_stages=return_stages)
+    def forward(self, x):
+        x = self.conv1(x)
+        x = self.max_pool(x)
+        x = self.blocks(x)
+        x = self.conv2(x)
+        x = self.avg_pool(x)
+        x = self.last_conv(x)
+        x = self.hardswish(x)
+        x = self.dropout(x)
+        x = self.flatten(x)
+        x = self.fc(x)
+        return x
+def ESNet_x0_5(pretrained=False, use_ssld=False, **kwargs):
+    """
+    ESNet_x0_5
+    Args:
+        pretrained: bool=False or str. If `True` load pretrained parameters, `False` otherwise.
+                    If str, means the path of the pretrained model.
+        use_ssld: bool=False. Whether using distillation pretrained model when pretrained=True.
+    Returns:
+        model: nn.Layer. Specific `ESNet_x0_5` model depends on args.
+    """
+    model = ESNet(scale=0.5, stages_pattern=MODEL_STAGES_PATTERN["ESNet"], **kwargs)
+    return model
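
To make the channel arithmetic in `model.py` concrete, the sketch below re-runs `make_divisible` (copied from the diff above) on the three scaled stage widths and traces where each channel ends up after `channel_shuffle`. The helper names `scaled_stage_widths` and `shuffled_channel_order` are illustrative and not part of the PR, and the shuffle is shown on plain channel indices rather than on a Paddle tensor.

```python
def make_divisible(v, divisor=8, min_value=None):
    # Same rounding rule as in model.py: snap to a multiple of `divisor`,
    # but never drop more than 10% below the requested value.
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


def scaled_stage_widths(scale):
    # Hypothetical helper: the three scaled entries of stage_out_channels.
    return [make_divisible(base * scale) for base in (116, 232, 464)]


def shuffled_channel_order(num_channels, groups):
    # Index-only view of channel_shuffle: reshape to [groups, per_group],
    # swap the two axes, then flatten again.
    per_group = num_channels // groups
    return [g * per_group + c for c in range(per_group) for g in range(groups)]


print(scaled_stage_widths(0.5))       # [56, 120, 232]
print(scaled_stage_widths(0.25))      # [32, 56, 120]
print(shuffled_channel_order(6, 2))   # [0, 3, 1, 4, 2, 5]
```

With two groups, the shuffle interleaves the untouched half (`x1`) with the processed half (`x3`) from `ESBlock1`, so the next block's split sees channels from both branches.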
--- a/modules/image/classification/esnet_x0_5_imagenet/module.py
+++ b/modules/image/classification/esnet_x0_5_imagenet/module.py
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse
+import copy
+import os
+import cv2
+import numpy as np
+import paddle
+from skimage.io import imread
+from skimage.transform import rescale
+from skimage.transform import resize
+import paddlehub as hub
+from .model import ESNet_x0_5
+from .processor import base64_to_cv2
+from .processor import create_operators
+from .processor import Topk
+from .utils import get_config
+from paddlehub.module.module import moduleinfo
+from paddlehub.module.module import runnable
+from paddlehub.module.module import serving
+@moduleinfo(name="esnet_x0_5_imagenet",
+            type="cv/classification",
+            author="paddlepaddle",
+            author_email="",
+            summary="",
+            version="1.0.0")
+class Esnet_x0_5_Imagenet:
+    def __init__(self):
+        self.config = get_config(os.path.join(self.directory, 'ESNet_x0_5.yaml'), show=False)
+        self.label_path = os.path.join(self.directory, 'imagenet1k_label_list.txt')
+        self.pretrain_path = os.path.join(self.directory, 'ESNet_x0_5_pretrained.pdparams')
+        self.config['Infer']['PostProcess']['class_id_map_file'] = self.label_path
+        self.model = ESNet_x0_5()
+        param_state_dict = paddle.load(self.pretrain_path)
+        self.model.set_dict(param_state_dict)
+        self.preprocess_funcs = create_operators(self.config["Infer"]["transforms"])
+    def classification(self,
+                       images: list = None,
+                       paths: list = None,
+                       batch_size: int = 1,
+                       use_gpu: bool = False,
+                       top_k: int = 1):
+        '''
+        Args:
+            images (list[numpy.ndarray]): data of images, shape of each is [H, W, C], color space must be BGR.
+            paths (list[str]): The paths of images.
+            batch_size (int): batch size.
+            use_gpu (bool): Whether to use gpu.
+            top_k (int): Return top k results.
+        Returns:
+            res (list[dict]): The classification results, each result dict contains key 'class_ids', 'scores' and 'label_names'.
+        '''
+        postprocess_func = Topk(top_k, self.label_path)
+        inputs = []
+        results = []
+        paddle.disable_static()
+        place = 'gpu:0' if use_gpu else 'cpu'
+        place = paddle.set_device(place)
+        if images is None and paths is None:
+            print('No image provided. Please input an image or an image path.')
+            return
+        if images is not None:
+            for image in images:
+                image = image[:, :, ::-1]
+                inputs.append(image)
+        if paths is not None:
+            for path in paths:
+                image = cv2.imread(path)[:, :, ::-1]
+                inputs.append(image)
+        batch_data = []
+        for idx, imagedata in enumerate(inputs):
+            for process in self.preprocess_funcs:
+                imagedata = process(imagedata)
+            batch_data.append(imagedata)
+            if len(batch_data) >= batch_size or idx == len(inputs) - 1:
+                batch_tensor = paddle.to_tensor(batch_data)
+                out = self.model(batch_tensor)
+                if isinstance(out, list):
+                    out = out[0]
+                if isinstance(out, dict) and "logits" in out:
+                    out = out["logits"]
+                if isinstance(out, dict) and "output" in out:
+                    out = out["output"]
+                result = postprocess_func(out)
+                results.extend(result)
+                batch_data.clear()
+        return results
+    @runnable
+    def run_cmd(self, argvs: list):
+        """
+        Run as a command.
+        """
+        self.parser = argparse.ArgumentParser(description="Run the {} module.".format(self.name),
+                                              prog='hub run {}'.format(self.name),
+                                              usage='%(prog)s',
+                                              add_help=True)
+        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
+        self.arg_config_group = self.parser.add_argument_group(
+            title="Config options", description="Run configuration for controlling module behavior, not required.")
+        self.add_module_config_arg()
+        self.add_module_input_arg()
+        self.args = self.parser.parse_args(argvs)
+        results = self.classification(paths=[self.args.input_path],
+                                      use_gpu=self.args.use_gpu,
+                                      batch_size=self.args.batch_size,
+                                      top_k=self.args.top_k)
+        return results
+    @serving
+    def serving_method(self, images, **kwargs):
+        """
+        Run as a service.
+        """
+        images_decode = [base64_to_cv2(image) for image in images]
+        results = self.classification(images=images_decode, **kwargs)
+        return results
+    def add_module_config_arg(self):
+        """
+        Add the command config options.
+        """
+        self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
+        self.arg_config_group.add_argument('--batch_size', type=int, default=1, help='batch size')
+        self.arg_config_group.add_argument('--top_k', type=int, default=1, help='Return top k results.')
+    def add_module_input_arg(self):
+        """
+        Add the command input options.
+        """
+        self.arg_input_group.add_argument('--input_path', type=str, help="path to input image.")
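
The batching rule inside `classification` (flush when the batch is full or when the last input has been reached) can be isolated as a small generator. This is a minimal sketch of that rule only; `iter_batches` is a hypothetical name, and the preprocessing, model call and post-processing from the method are left out.

```python
def iter_batches(items, batch_size):
    # Mirrors the flush condition in classification():
    # len(batch_data) >= batch_size or idx == len(inputs) - 1
    batch = []
    for idx, item in enumerate(items):
        batch.append(item)
        if len(batch) >= batch_size or idx == len(items) - 1:
            yield list(batch)  # copy, because the buffer is reused
            batch.clear()


batches = list(iter_batches([0, 1, 2, 3, 4], batch_size=2))
print(batches)  # [[0, 1], [2, 3], [4]]
```

The final batch may be smaller than `batch_size`, which is why the method checks the index of the last input in addition to the batch length.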
--- a/modules/image/classification/esnet_x0_5_imagenet/processor.py
+++ b/modules/image/classification/esnet_x0_5_imagenet/processor.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+import base64
+import inspect
+import math
+import os
+import random
+import sys
+from functools import partial
+import cv2
+import numpy as np
+import paddle
+import paddle.nn.functional as F
+import six
+from paddle.vision.transforms import ColorJitter as RawColorJitter
+from PIL import Image
+def create_operators(params, class_num=None):
+    """
+    create operators based on the config
+    Args:
+        params(list): a dict list, used to create some operators
+    """
+    assert isinstance(params, list), ('operator config should be a list')
+    ops = []
+    current_module = sys.modules[__name__]
+    for operator in params:
+        assert isinstance(operator, dict) and len(operator) == 1, "yaml format error"
+        op_name = list(operator)[0]
+        param = {} if operator[op_name] is None else operator[op_name]
+        op_func = getattr(current_module, op_name)
+        if "class_num" in inspect.getfullargspec(op_func).args:
+            param.update({"class_num": class_num})
+        op = op_func(**param)
+        ops.append(op)
+    return ops
+class UnifiedResize(object):
+    def __init__(self, interpolation=None, backend="cv2"):
+        _cv2_interp_from_str = {
+            'nearest': cv2.INTER_NEAREST,
+            'bilinear': cv2.INTER_LINEAR,
+            'area': cv2.INTER_AREA,
+            'bicubic': cv2.INTER_CUBIC,
+            'lanczos': cv2.INTER_LANCZOS4
+        }
+        _pil_interp_from_str = {
+            'nearest': Image.NEAREST,
+            'bilinear': Image.BILINEAR,
+            'bicubic': Image.BICUBIC,
+            'box': Image.BOX,
+            'lanczos': Image.LANCZOS,
+            'hamming': Image.HAMMING
+        }
+        def _pil_resize(src, size, resample):
+            pil_img = Image.fromarray(src)
+            pil_img = pil_img.resize(size, resample)
+            return np.asarray(pil_img)
+        if backend.lower() == "cv2":
+            if isinstance(interpolation, str):
+                interpolation = _cv2_interp_from_str[interpolation.lower()]
+            # compatible with opencv < version 4.4.0
+            elif interpolation is None:
+                interpolation = cv2.INTER_LINEAR
+            self.resize_func = partial(cv2.resize, interpolation=interpolation)
+        elif backend.lower() == "pil":
+            if isinstance(interpolation, str):
+                interpolation = _pil_interp_from_str[interpolation.lower()]
+            self.resize_func = partial(_pil_resize, resample=interpolation)
+        else:
+            self.resize_func = cv2.resize
+    def __call__(self, src, size):
+        return self.resize_func(src, size)
+class OperatorParamError(ValueError):
+    """ OperatorParamError
+    """
+    pass
+class DecodeImage(object):
+    """ decode image """
+    def __init__(self, to_rgb=True, to_np=False, channel_first=False):
+        self.to_rgb = to_rgb
+        self.to_np = to_np  # to numpy
+        self.channel_first = channel_first  # only enabled when to_np is True
+    def __call__(self, img):
+        if six.PY2:
+            assert type(img) is str and len(img) > 0, "invalid input 'img' in DecodeImage"
+        else:
+            assert type(img) is bytes and len(img) > 0, "invalid input 'img' in DecodeImage"
+        data = np.frombuffer(img, dtype='uint8')
+        img = cv2.imdecode(data, 1)
+        if self.to_rgb:
+            assert img.shape[2] == 3, 'invalid shape of image[%s]' % (img.shape)
+            img = img[:, :, ::-1]
+        if self.channel_first:
+            img = img.transpose((2, 0, 1))
+        return img
+class ResizeImage(object):
+    """ resize image """
+    def __init__(self, size=None, resize_short=None, interpolation=None, backend="cv2"):
+        if resize_short is not None and resize_short > 0:
+            self.resize_short = resize_short
+            self.w = None
+            self.h = None
+        elif size is not None:
+            self.resize_short = None
+            self.w = size if type(size) is int else size[0]
+            self.h = size if type(size) is int else size[1]
+        else:
+            raise OperatorParamError("invalid params for ResizeImage: "
+                                     "both 'size' and 'resize_short' are None")
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        img_h, img_w = img.shape[:2]
+        if self.resize_short is not None:
+            percent = float(self.resize_short) / min(img_w, img_h)
+            w = int(round(img_w * percent))
+            h = int(round(img_h * percent))
+        else:
+            w = self.w
+            h = self.h
+        return self._resize_func(img, (w, h))
+class CropImage(object):
+    """ crop image """
+    def __init__(self, size):
+        if type(size) is int:
+            self.size = (size, size)
+        else:
+            self.size = size  # (w, h), matching the unpacking in __call__
+    def __call__(self, img):
+        w, h = self.size
+        img_h, img_w = img.shape[:2]
+        w_start = (img_w - w) // 2
+        h_start = (img_h - h) // 2
+        w_end = w_start + w
+        h_end = h_start + h
+        return img[h_start:h_end, w_start:w_end, :]
+class RandCropImage(object):
+    """ random crop image """
+    def __init__(self, size, scale=None, ratio=None, interpolation=None, backend="cv2"):
+        if type(size) is int:
+            self.size = (size, size)  # (h, w)
+        else:
+            self.size = size
+        self.scale = [0.08, 1.0] if scale is None else scale
+        self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        size = self.size
+        scale = self.scale
+        ratio = self.ratio
+        aspect_ratio = math.sqrt(random.uniform(*ratio))
+        w = 1. * aspect_ratio
+        h = 1. / aspect_ratio
+        img_h, img_w = img.shape[:2]
+        bound = min((float(img_w) / img_h) / (w**2), (float(img_h) / img_w) / (h**2))
+        scale_max = min(scale[1], bound)
+        scale_min = min(scale[0], bound)
+        target_area = img_w * img_h * random.uniform(scale_min, scale_max)
+        target_size = math.sqrt(target_area)
+        w = int(target_size * w)
+        h = int(target_size * h)
+        i = random.randint(0, img_w - w)
+        j = random.randint(0, img_h - h)
+        img = img[j:j + h, i:i + w, :]
+        return self._resize_func(img, size)
+class RandFlipImage(object):
+    """ random flip image
+        flip_code:
+            1: Flipped Horizontally
+            0: Flipped Vertically
+            -1: Flipped Horizontally & Vertically
+    """
+    def __init__(self, flip_code=1):
+        assert flip_code in [-1, 0, 1], "flip_code should be a value in [-1, 0, 1]"
+        self.flip_code = flip_code
+    def __call__(self, img):
+        if random.randint(0, 1) == 1:
+            return cv2.flip(img, self.flip_code)
+        else:
+            return img
+class NormalizeImage(object):
+    """ normalize image, e.g. subtract mean and divide by std
+    """
+    def __init__(self, scale=None, mean=None, std=None, order='chw', output_fp16=False, channel_num=3):
+        if isinstance(scale, str):
+            scale = eval(scale)
+        assert channel_num in [3, 4], "channel number of input image should be set to 3 or 4."
+        self.channel_num = channel_num
+        self.output_dtype = 'float16' if output_fp16 else 'float32'
+        self.scale = np.float32(scale if scale is not None else 1.0 / 255.0)
+        self.order = order
+        mean = mean if mean is not None else [0.485, 0.456, 0.406]
+        std = std if std is not None else [0.229, 0.224, 0.225]
+        shape = (3, 1, 1) if self.order == 'chw' else (1, 1, 3)
+        self.mean = np.array(mean).reshape(shape).astype('float32')
+        self.std = np.array(std).reshape(shape).astype('float32')
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        assert isinstance(img, np.ndarray), "invalid input 'img' in NormalizeImage"
+        img = (img.astype('float32') * self.scale - self.mean) / self.std
+        if self.channel_num == 4:
+            img_h = img.shape[1] if self.order == 'chw' else img.shape[0]
+            img_w = img.shape[2] if self.order == 'chw' else img.shape[1]
+            pad_zeros = np.zeros((1, img_h, img_w)) if self.order == 'chw' else np.zeros((img_h, img_w, 1))
+            img = (np.concatenate((img, pad_zeros), axis=0) if self.order == 'chw' else np.concatenate(
+                (img, pad_zeros), axis=2))
+        return img.astype(self.output_dtype)
+class ToCHWImage(object):
+    """ convert hwc image to chw image
+    """
+    def __init__(self):
+        pass
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        return img.transpose((2, 0, 1))
+class ColorJitter(RawColorJitter):
+    """ColorJitter.
+    """
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+    def __call__(self, img):
+        if not isinstance(img, Image.Image):
+            img = np.ascontiguousarray(img)
+            img = Image.fromarray(img)
+        img = super()._apply_image(img)
+        if isinstance(img, Image.Image):
+            img = np.asarray(img)
+        return img
+def base64_to_cv2(b64str):
+    data = base64.b64decode(b64str.encode('utf8'))
+    data = np.frombuffer(data, np.uint8)  # np.fromstring is deprecated for binary data
+    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
+    return data
+class Topk(object):
+    def __init__(self, topk=1, class_id_map_file=None):
+        assert isinstance(topk, (int, ))
+        self.class_id_map = self.parse_class_id_map(class_id_map_file)
+        self.topk = topk
+    def parse_class_id_map(self, class_id_map_file):
+        if class_id_map_file is None:
+            return None
+        if not os.path.exists(class_id_map_file):
+            print(
+                "Warning: If you want to use your own label_dict, please provide a valid path!\nOtherwise label_names will be empty!"
+            )
+            return None
+        try:
+            class_id_map = {}
+            with open(class_id_map_file, "r") as fin:
+                lines = fin.readlines()
+                for line in lines:
+                    partition = line.split("\n")[0].partition(" ")
+                    class_id_map[int(partition[0])] = str(partition[-1])
+        except Exception as ex:
+            print(ex)
+            class_id_map = None
+        return class_id_map
+    def __call__(self, x, file_names=None, multilabel=False):
+        assert isinstance(x, paddle.Tensor)
+        if file_names is not None:
+            assert x.shape[0] == len(file_names)
+        x = F.softmax(x, axis=-1) if not multilabel else F.sigmoid(x)
+        x = x.numpy()
+        y = []
+        for idx, probs in enumerate(x):
+            index = probs.argsort(axis=0)[-self.topk:][::-1].astype("int32") if not multilabel else np.where(
+                probs >= 0.5)[0].astype("int32")
+            clas_id_list = []
+            score_list = []
+            label_name_list = []
+            for i in index:
+                clas_id_list.append(i.item())
+                score_list.append(probs[i].item())
+                if self.class_id_map is not None:
+                    label_name_list.append(self.class_id_map[i.item()])
+            result = {
+                "class_ids": clas_id_list,
+                "scores": np.around(score_list, decimals=5).tolist(),
+            }
+            if file_names is not None:
+                result["file_name"] = file_names[idx]
+            if label_name_list is not None:
+                result["label_names"] = label_name_list
+            y.append(result)
+        return y
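
The single-label path of `Topk.__call__` boils down to a softmax followed by picking the `k` highest-scoring indices. The sketch below reproduces that logic for one sample using only the standard library, so it can be read without Paddle or NumPy installed. The function names and the three-class label map are made up for illustration, and ties may be ordered differently than with `argsort`.

```python
import math


def softmax(logits):
    # Subtract the maximum first for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=1, class_id_map=None):
    probs = softmax(logits)
    # Indices of the k largest probabilities, best first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    labels = [class_id_map[i] for i in order] if class_id_map is not None else []
    return {
        "class_ids": order,
        "scores": [round(probs[i], 5) for i in order],
        "label_names": labels,
    }


# Hypothetical label map and logits, for illustration only.
result = top_k([1.0, 3.0, 2.0], k=2, class_id_map={0: "cat", 1: "dog", 2: "bird"})
print(result["class_ids"], result["label_names"])  # [1, 2] ['dog', 'bird']
```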
--- a/modules/image/classification/esnet_x0_5_imagenet/utils.py
+++ b/modules/image/classification/esnet_x0_5_imagenet/utils.py
+# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse
+import copy
+import os
+import yaml
+__all__ = ['get_config']
+class AttrDict(dict):
+    def __getattr__(self, key):
+        return self[key]
+    def __setattr__(self, key, value):
+        if key in self.__dict__:
+            self.__dict__[key] = value
+        else:
+            self[key] = value
+    def __deepcopy__(self, content):
+        return copy.deepcopy(dict(self))
+def create_attr_dict(yaml_config):
+    from ast import literal_eval
+    for key, value in yaml_config.items():
+        if type(value) is dict:
+            yaml_config[key] = value = AttrDict(value)
+        if isinstance(value, str):
+            try:
+                value = literal_eval(value)
+            except BaseException:
+                pass
+        if isinstance(value, AttrDict):
+            create_attr_dict(yaml_config[key])
+        else:
+            yaml_config[key] = value
+def parse_config(cfg_file):
+    """Load a config file into AttrDict"""
+    with open(cfg_file, 'r') as fopen:
+        yaml_config = AttrDict(yaml.load(fopen, Loader=yaml.SafeLoader))
+    create_attr_dict(yaml_config)
+    return yaml_config
+def override(dl, ks, v):
+    """
+    Recursively replace a value in a dict or list
+    Args:
+        dl(dict or list): dict or list to be replaced
+        ks(list): list of keys
+        v(str): value to be replaced
+    """
+    def str2num(v):
+        try:
+            return eval(v)
+        except Exception:
+            return v
+    assert isinstance(dl, (list, dict)), ("{} should be a list or a dict".format(dl))
+    assert len(ks) > 0, ('length of keys should be larger than 0')
+    if isinstance(dl, list):
+        k = str2num(ks[0])
+        if len(ks) == 1:
+            assert k < len(dl), ('index({}) out of range({})'.format(k, dl))
+            dl[k] = str2num(v)
+        else:
+            override(dl[k], ks[1:], v)
+    else:
+        if len(ks) == 1:
+            # assert ks[0] in dl, ('{} is not exist in {}'.format(ks[0], dl))
+            if not ks[0] in dl:
+                print('A new field ({}) detected!'.format(ks[0]))
+            dl[ks[0]] = str2num(v)
+        else:
+            override(dl[ks[0]], ks[1:], v)
+def override_config(config, options=None):
+    """
+    Recursively override the config
+    Args:
+        config(dict): dict to be replaced
+        options(list): list of pairs(key0.key1.idx.key2=value)
+            such as: [
+                'topk=2',
+                'VALID.transforms.1.ResizeImage.resize_short=300'
+            ]
+    Returns:
+        config(dict): replaced config
+    """
+    if options is not None:
+        for opt in options:
+            assert isinstance(opt, str), ("option({}) should be a str".format(opt))
+            assert "=" in opt, ("option({}) should contain a = "
+                                "to distinguish between key and value".format(opt))
+            pair = opt.split('=')
+            assert len(pair) == 2, ("there can be only one = in the option")
+            key, value = pair
+            keys = key.split('.')
+            override(config, keys, value)
+    return config
+def get_config(fname, overrides=None, show=False):
+    """
+    Read config from file
+    """
+    assert os.path.exists(fname), ('config file({}) does not exist'.format(fname))
+    config = parse_config(fname)
+    override_config(config, overrides)
+    return config
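
The dotted-key syntax accepted by `override_config` (`key0.key1.idx.key2=value`) is easier to follow on a small config. The sketch below is a simplified stand-in rather than the PR's code: `apply_override` and `parse_value` are hypothetical names, it uses `ast.literal_eval` where `utils.py` uses `eval`, and the sample config is made up for illustration.

```python
from ast import literal_eval


def parse_value(text):
    # Turn "300" into 300 and "1" into 1; leave plain strings untouched.
    try:
        return literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_override(config, option):
    key, value = option.split("=", 1)
    keys = key.split(".")
    node = config
    # Walk down to the container that holds the final key.
    for k in keys[:-1]:
        node = node[parse_value(k)] if isinstance(node, list) else node[k]
    last = keys[-1]
    if isinstance(node, list):
        node[parse_value(last)] = parse_value(value)
    else:
        node[last] = parse_value(value)
    return config


# Hypothetical config shaped like the YAML consumed by get_config().
config = {"Infer": {"transforms": [{"DecodeImage": None},
                                   {"ResizeImage": {"resize_short": 256}}]}}
apply_override(config, "Infer.transforms.1.ResizeImage.resize_short=300")
print(config["Infer"]["transforms"][1])  # {'ResizeImage': {'resize_short': 300}}
```

List segments such as `1` are converted to integer indices, while dict segments are used as string keys, which matches how `override` distinguishes the two container types.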
--- a/modules/image/classification/levit_128_imagenet/README.md
+++ b/modules/image/classification/levit_128_imagenet/README.md
+# levit_128_imagenet
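+|Module Name|levit_128_imagenet|
+| :--- | :---: |
+|Category|Image - Image Classification|
+|Network|LeViT|
+|Dataset|ImageNet-2012|
+|Fine-tuning supported|No|
+|Module Size|54 MB|
+|Latest update date|2022-04-02|
+|Data indicators|Acc|
+## I. Basic Information
+- ### Module Introduction
+  - LeViT is a hybrid neural network for image classification that is designed for fast inference. Its design takes into account how the model performs on different hardware platforms, so it better reflects the real-world scenarios of common applications. Through extensive experiments, the authors found a better way to combine convolutional neural networks with the Transformer architecture, and proposed an attention-based method for integrating positional information encoding in the Transformer. The model structure of this module is configured as LeViT128. For details, see the [paper](https://arxiv.org/abs/2104.01136).
+## II. Installation
+- ### 1. Environmental Dependence
+  - paddlepaddle >= 1.6.2  
+  - paddlehub >= 1.6.0  | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+- ### 2. Installation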
+  - ```shell
+    $ hub install levit_128_imagenet
+    ```
+  - If you run into problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+## III. Module API Prediction
+- ### 1. Command Line Prediction
+  - ```shell
+    $ hub run levit_128_imagenet --input_path "/PATH/TO/IMAGE"
+    ```
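+  - This invokes the classification module from the command line. For more information, see [PaddleHub command line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+- ### 2. Prediction Code Example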
+  - ```python
+    import paddlehub as hub
+    import cv2
+    classifier = hub.Module(name="levit_128_imagenet")
+    result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')])
+    # or
+    # result = classifier.classification(paths=['/PATH/TO/IMAGE'])
+    ```
+- ### 3. API
+  - ```python
+    def classification(images=None,
+                       paths=None,
+                       batch_size=1,
+                       use_gpu=False,
+                       top_k=1):
+    ```
+    - Classification API.
+    - **Parameters**
+      - images (list\[numpy.ndarray\]): image data; each image has shape \[H, W, C\] and is in BGR color space; <br/>
+      - paths (list\[str\]): image paths; <br/>
+      - batch\_size (int): batch size; <br/>
+      - use\_gpu (bool): whether to use GPU; **if using GPU, set the CUDA_VISIBLE_DEVICES environment variable first** <br/>
+      - top\_k (int): return the top k prediction results.
+    - **Return**
+      - res (list\[dict\]): classification results; each element of the list is a dict whose keys include 'class_ids' (class index), 'scores' (confidence) and 'label_names' (class name).
+## IV. Server Deployment
+- PaddleHub Serving can deploy an online image recognition service.
+- ### Step 1: Start PaddleHub Serving
+  - Run the startup command:
+  - ```shell
+    $ hub serving start -m levit_128_imagenet
+    ```
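+  - This deploys an online image recognition service; the default port is 8866.
+  - **NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+- ### Step 2: Send a prediction request
+  - With the server configured, the following few lines of code send a prediction request and fetch the result: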
+  - ```python
+    import requests
+    import json
+    import cv2
+    import base64
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+    # Send the HTTP request
+    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+    headers = {"Content-type": "application/json"}
+    url = "http://127.0.0.1:8866/predict/levit_128_imagenet"
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    # Print the prediction results
+    print(r.json()["results"])
+    ```
+## V. Release Note
+* 1.0.0
+  Initial release
+  - ```shell
+    $ hub install levit_128_imagenet==1.0.0
+    ```
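The README above describes the return value of `classification` as a list of dicts with the keys 'class_ids', 'scores' and 'label_names'. A minimal sketch of consuming that structure is given below. The helper name `format_results` and the sample values are hypothetical, and the sketch assumes each key maps to a list with one entry per returned class, which the README does not state explicitly.

```python
def format_results(results):
    """Flatten classification results into (image_index, label, class_id, score) rows."""
    rows = []
    for image_index, res in enumerate(results):
        # Assumption: the three lists are aligned and have one entry per returned class.
        triples = zip(res['label_names'], res['class_ids'], res['scores'])
        for label, class_id, score in triples:
            rows.append((image_index, label, class_id, round(float(score), 4)))
    return rows


# Hypothetical result for a single image with top_k=2 (the values are made up).
sample = [{'class_ids': [285, 281], 'scores': [0.71, 0.12], 'label_names': ['Egyptian cat', 'tabby']}]
for row in format_results(sample):
    print(row)
```

The same helper should apply to the list returned under `r.json()["results"]` by the serving endpoint, provided the service returns the structure unchanged.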
--- a/modules/image/classification/levit_128_imagenet/model.py
+++ b/modules/image/classification/levit_128_imagenet/model.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# Code was based on https://github.com/facebookresearch/LeViT
+import itertools
+import math
+import warnings
+import paddle
+import paddle.nn as nn
+import paddle.nn.functional as F
+from paddle.nn.initializer import Constant
+from paddle.nn.initializer import TruncatedNormal
+from paddle.regularizer import L2Decay
+from .vision_transformer import Identity
+from .vision_transformer import ones_
+from .vision_transformer import trunc_normal_
+from .vision_transformer import zeros_
+def cal_attention_biases(attention_biases, attention_bias_idxs):
+    gather_list = []
+    attention_bias_t = paddle.transpose(attention_biases, (1, 0))
+    nums = attention_bias_idxs.shape[0]
+    for idx in range(nums):
+        gather = paddle.gather(attention_bias_t, attention_bias_idxs[idx])
+        gather_list.append(gather)
+    shape0, shape1 = attention_bias_idxs.shape
+    gather = paddle.concat(gather_list)
+    return paddle.transpose(gather, (1, 0)).reshape((0, shape0, shape1))
+class Conv2d_BN(nn.Sequential):
+    def __init__(self, a, b, ks=1, stride=1, pad=0, dilation=1, groups=1, bn_weight_init=1, resolution=-10000):
+        super().__init__()
+        self.add_sublayer('c', nn.Conv2D(a, b, ks, stride, pad, dilation, groups, bias_attr=False))
+        bn = nn.BatchNorm2D(b)
+        ones_(bn.weight)
+        zeros_(bn.bias)
+        self.add_sublayer('bn', bn)
+class Linear_BN(nn.Sequential):
+    def __init__(self, a, b, bn_weight_init=1):
+        super().__init__()
+        self.add_sublayer('c', nn.Linear(a, b, bias_attr=False))
+        bn = nn.BatchNorm1D(b)
+        if bn_weight_init == 0:
+            zeros_(bn.weight)
+        else:
+            ones_(bn.weight)
+        zeros_(bn.bias)
+        self.add_sublayer('bn', bn)
+    def forward(self, x):
+        l, bn = self._sub_layers.values()
+        x = l(x)
+        return paddle.reshape(bn(x.flatten(0, 1)), x.shape)
+class BN_Linear(nn.Sequential):
+    def __init__(self, a, b, bias=True, std=0.02):
+        super().__init__()
+        self.add_sublayer('bn', nn.BatchNorm1D(a))
+        l = nn.Linear(a, b, bias_attr=bias)
+        trunc_normal_(l.weight)
+        if bias:
+            zeros_(l.bias)
+        self.add_sublayer('l', l)
+def b16(n, activation, resolution=224):
+    return nn.Sequential(Conv2d_BN(3, n // 8, 3, 2, 1, resolution=resolution), activation(),
+                         Conv2d_BN(n // 8, n // 4, 3, 2, 1, resolution=resolution // 2), activation(),
+                         Conv2d_BN(n // 4, n // 2, 3, 2, 1, resolution=resolution // 4), activation(),
+                         Conv2d_BN(n // 2, n, 3, 2, 1, resolution=resolution // 8))
+class Residual(nn.Layer):
+    def __init__(self, m, drop):
+        super().__init__()
+        self.m = m
+        self.drop = drop
+    def forward(self, x):
+        if self.training and self.drop > 0:
+            y = paddle.rand(shape=[x.shape[0], 1, 1]).__ge__(self.drop).astype("float32")
+            y = y.divide(paddle.full_like(y, 1 - self.drop))
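+            # Stochastic depth: scale the residual branch by the per-sample keep mask instead of adding the mask itself.
+            return paddle.add(x, self.m(x) * y)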
+        else:
+            return paddle.add(x, self.m(x))
+class Attention(nn.Layer):
+    def __init__(self, dim, key_dim, num_heads=8, attn_ratio=4, activation=None, resolution=14):
+        super().__init__()
+        self.num_heads = num_heads
+        self.scale = key_dim**-0.5
+        self.key_dim = key_dim
+        self.nh_kd = nh_kd = key_dim * num_heads
+        self.d = int(attn_ratio * key_dim)
+        self.dh = int(attn_ratio * key_dim) * num_heads
+        self.attn_ratio = attn_ratio
+        self.h = self.dh + nh_kd * 2
+        self.qkv = Linear_BN(dim, self.h)
+        self.proj = nn.Sequential(activation(), Linear_BN(self.dh, dim, bn_weight_init=0))
+        points = list(itertools.product(range(resolution), range(resolution)))
+        N = len(points)
+        attention_offsets = {}
+        idxs = []
+        for p1 in points:
+            for p2 in points:
+                offset = (abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
+                if offset not in attention_offsets:
+                    attention_offsets[offset] = len(attention_offsets)
+                idxs.append(attention_offsets[offset])
+        self.attention_biases = self.create_parameter(shape=(num_heads, len(attention_offsets)),
+                                                      default_initializer=zeros_,
+                                                      attr=paddle.ParamAttr(regularizer=L2Decay(0.0)))
+        tensor_idxs = paddle.to_tensor(idxs, dtype='int64')
+        self.register_buffer('attention_bias_idxs', paddle.reshape(tensor_idxs, [N, N]))
+    @paddle.no_grad()
+    def train(self, mode=True):
+        if mode:
+            super().train()
+        else:
+            super().eval()
+        if mode and hasattr(self, 'ab'):
+            del self.ab
+        else:
+            self.ab = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+    def forward(self, x):
+        # Always take the recompute branch below, so the cached `self.ab` is never read.
+        self.training = True
+        B, N, C = x.shape
+        qkv = self.qkv(x)
+        qkv = paddle.reshape(qkv, [B, N, self.num_heads, self.h // self.num_heads])
+        q, k, v = paddle.split(qkv, [self.key_dim, self.key_dim, self.d], axis=3)
+        q = paddle.transpose(q, perm=[0, 2, 1, 3])
+        k = paddle.transpose(k, perm=[0, 2, 1, 3])
+        v = paddle.transpose(v, perm=[0, 2, 1, 3])
+        k_transpose = paddle.transpose(k, perm=[0, 1, 3, 2])
+        if self.training:
+            attention_biases = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+        else:
+            attention_biases = self.ab
+        attn = (paddle.matmul(q, k_transpose) * self.scale + attention_biases)
+        attn = F.softmax(attn)
+        x = paddle.transpose(paddle.matmul(attn, v), perm=[0, 2, 1, 3])
+        x = paddle.reshape(x, [B, N, self.dh])
+        x = self.proj(x)
+        return x
+class Subsample(nn.Layer):
+    def __init__(self, stride, resolution):
+        super().__init__()
+        self.stride = stride
+        self.resolution = resolution
+    def forward(self, x):
+        B, N, C = x.shape
+        x = paddle.reshape(x, [B, self.resolution, self.resolution, C])
+        end1, end2 = x.shape[1], x.shape[2]
+        x = x[:, 0:end1:self.stride, 0:end2:self.stride]
+        x = paddle.reshape(x, [B, -1, C])
+        return x
+class AttentionSubsample(nn.Layer):
+    def __init__(self,
+                 in_dim,
+                 out_dim,
+                 key_dim,
+                 num_heads=8,
+                 attn_ratio=2,
+                 activation=None,
+                 stride=2,
+                 resolution=14,
+                 resolution_=7):
+        super().__init__()
+        self.num_heads = num_heads
+        self.scale = key_dim**-0.5
+        self.key_dim = key_dim
+        self.nh_kd = nh_kd = key_dim * num_heads
+        self.d = int(attn_ratio * key_dim)
+        self.dh = int(attn_ratio * key_dim) * self.num_heads
+        self.attn_ratio = attn_ratio
+        self.resolution_ = resolution_
+        self.resolution_2 = resolution_**2
+        self.training = True
+        h = self.dh + nh_kd
+        self.kv = Linear_BN(in_dim, h)
+        self.q = nn.Sequential(Subsample(stride, resolution), Linear_BN(in_dim, nh_kd))
+        self.proj = nn.Sequential(activation(), Linear_BN(self.dh, out_dim))
+        self.stride = stride
+        self.resolution = resolution
+        points = list(itertools.product(range(resolution), range(resolution)))
+        points_ = list(itertools.product(range(resolution_), range(resolution_)))
+        N = len(points)
+        N_ = len(points_)
+        attention_offsets = {}
+        idxs = []
+        i = 0
+        j = 0
+        for p1 in points_:
+            i += 1
+            for p2 in points:
+                j += 1
+                size = 1
+                offset = (abs(p1[0] * stride - p2[0] + (size - 1) / 2), abs(p1[1] * stride - p2[1] + (size - 1) / 2))
+                if offset not in attention_offsets:
+                    attention_offsets[offset] = len(attention_offsets)
+                idxs.append(attention_offsets[offset])
+        self.attention_biases = self.create_parameter(shape=(num_heads, len(attention_offsets)),
+                                                      default_initializer=zeros_,
+                                                      attr=paddle.ParamAttr(regularizer=L2Decay(0.0)))
+        tensor_idxs_ = paddle.to_tensor(idxs, dtype='int64')
+        self.register_buffer('attention_bias_idxs', paddle.reshape(tensor_idxs_, [N_, N]))
+    @paddle.no_grad()
+    def train(self, mode=True):
+        if mode:
+            super().train()
+        else:
+            super().eval()
+        if mode and hasattr(self, 'ab'):
+            del self.ab
+        else:
+            self.ab = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+    def forward(self, x):
+        # Always take the recompute branch below, so the cached `self.ab` is never read.
+        self.training = True
+        B, N, C = x.shape
+        kv = self.kv(x)
+        kv = paddle.reshape(kv, [B, N, self.num_heads, -1])
+        k, v = paddle.split(kv, [self.key_dim, self.d], axis=3)
+        k = paddle.transpose(k, perm=[0, 2, 1, 3])  # BHNC
+        v = paddle.transpose(v, perm=[0, 2, 1, 3])
+        q = paddle.reshape(self.q(x), [B, self.resolution_2, self.num_heads, self.key_dim])
+        q = paddle.transpose(q, perm=[0, 2, 1, 3])
+        if self.training:
+            attention_biases = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+        else:
+            attention_biases = self.ab
+        attn = (paddle.matmul(q, paddle.transpose(k, perm=[0, 1, 3, 2]))) * self.scale + attention_biases
+        attn = F.softmax(attn)
+        x = paddle.reshape(paddle.transpose(paddle.matmul(attn, v), perm=[0, 2, 1, 3]), [B, -1, self.dh])
+        x = self.proj(x)
+        return x
+class LeViT(nn.Layer):
+    """ Vision Transformer with support for patch or hybrid CNN input stage
+    """
+    def __init__(self,
+                 img_size=224,
+                 patch_size=16,
+                 in_chans=3,
+                 class_num=1000,
+                 embed_dim=[192],
+                 key_dim=[64],
+                 depth=[12],
+                 num_heads=[3],
+                 attn_ratio=[2],
+                 mlp_ratio=[2],
+                 hybrid_backbone=None,
+                 down_ops=[],
+                 attention_activation=nn.Hardswish,
+                 mlp_activation=nn.Hardswish,
+                 distillation=True,
+                 drop_path=0):
+        super().__init__()
+        self.class_num = class_num
+        self.num_features = embed_dim[-1]
+        self.embed_dim = embed_dim
+        self.distillation = distillation
+        self.patch_embed = hybrid_backbone
+        self.blocks = []
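+        # Copy before appending the sentinel so the caller's list (or the shared default) is not mutated.
+        down_ops = list(down_ops) + [['']]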
+        resolution = img_size // patch_size
+        for i, (ed, kd, dpth, nh, ar, mr,
+                do) in enumerate(zip(embed_dim, key_dim, depth, num_heads, attn_ratio, mlp_ratio, down_ops)):
+            for _ in range(dpth):
+                self.blocks.append(
+                    Residual(
+                        Attention(
+                            ed,
+                            kd,
+                            nh,
+                            attn_ratio=ar,
+                            activation=attention_activation,
+                            resolution=resolution,
+                        ), drop_path))
+                if mr > 0:
+                    h = int(ed * mr)
+                    self.blocks.append(
+                        Residual(
+                            nn.Sequential(
+                                Linear_BN(ed, h),
+                                mlp_activation(),
+                                Linear_BN(h, ed, bn_weight_init=0),
+                            ), drop_path))
+            if do[0] == 'Subsample':
+                #('Subsample',key_dim, num_heads, attn_ratio, mlp_ratio, stride)
+                resolution_ = (resolution - 1) // do[5] + 1
+                self.blocks.append(
+                    AttentionSubsample(*embed_dim[i:i + 2],
+                                       key_dim=do[1],
+                                       num_heads=do[2],
+                                       attn_ratio=do[3],
+                                       activation=attention_activation,
+                                       stride=do[5],
+                                       resolution=resolution,
+                                       resolution_=resolution_))
+                resolution = resolution_
+                if do[4] > 0:  # mlp_ratio
+                    h = int(embed_dim[i + 1] * do[4])
+                    self.blocks.append(
+                        Residual(
+                            nn.Sequential(
+                                Linear_BN(embed_dim[i + 1], h),
+                                mlp_activation(),
+                                Linear_BN(h, embed_dim[i + 1], bn_weight_init=0),
+                            ), drop_path))
+        self.blocks = nn.Sequential(*self.blocks)
+        # Classifier head
+        self.head = BN_Linear(embed_dim[-1], class_num) if class_num > 0 else Identity()
+        if distillation:
+            self.head_dist = BN_Linear(embed_dim[-1], class_num) if class_num > 0 else Identity()
+    def forward(self, x):
+        x = self.patch_embed(x)
+        x = x.flatten(2)
+        x = paddle.transpose(x, perm=[0, 2, 1])
+        x = self.blocks(x)
+        x = x.mean(1)
+        x = paddle.reshape(x, [-1, self.embed_dim[-1]])
+        if self.distillation:
+            x = self.head(x), self.head_dist(x)
+            if not self.training:
+                x = (x[0] + x[1]) / 2
+        else:
+            x = self.head(x)
+        return x
+def model_factory(C, D, X, N, drop_path, class_num, distillation):
+    embed_dim = [int(x) for x in C.split('_')]
+    num_heads = [int(x) for x in N.split('_')]
+    depth = [int(x) for x in X.split('_')]
+    act = nn.Hardswish
+    model = LeViT(
+        patch_size=16,
+        embed_dim=embed_dim,
+        num_heads=num_heads,
+        key_dim=[D] * 3,
+        depth=depth,
+        attn_ratio=[2, 2, 2],
+        mlp_ratio=[2, 2, 2],
+        down_ops=[
+            #('Subsample',key_dim, num_heads, attn_ratio, mlp_ratio, stride)
+            ['Subsample', D, embed_dim[0] // D, 4, 2, 2],
+            ['Subsample', D, embed_dim[1] // D, 4, 2, 2],
+        ],
+        attention_activation=act,
+        mlp_activation=act,
+        hybrid_backbone=b16(embed_dim[0], activation=act),
+        class_num=class_num,
+        drop_path=drop_path,
+        distillation=distillation)
+    return model
+specification = {
+    'LeViT_128S': {
+        'C': '128_256_384',
+        'D': 16,
+        'N': '4_6_8',
+        'X': '2_3_4',
+        'drop_path': 0
+    },
+    'LeViT_128': {
+        'C': '128_256_384',
+        'D': 16,
+        'N': '4_8_12',
+        'X': '4_4_4',
+        'drop_path': 0
+    },
+    'LeViT_192': {
+        'C': '192_288_384',
+        'D': 32,
+        'N': '3_5_6',
+        'X': '4_4_4',
+        'drop_path': 0
+    },
+    'LeViT_256': {
+        'C': '256_384_512',
+        'D': 32,
+        'N': '4_6_8',
+        'X': '4_4_4',
+        'drop_path': 0
+    },
+    'LeViT_384': {
+        'C': '384_512_768',
+        'D': 32,
+        'N': '6_9_12',
+        'X': '4_4_4',
+        'drop_path': 0.1
+    },
+}
+def LeViT_128(**kwargs):
+    model = model_factory(**specification['LeViT_128'], class_num=1000, distillation=False)
+    return model
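Reviewer note on `Attention.__init__` in model.py above: the nested loops over `points` assign one column of `attention_biases` to each distinct absolute offset between two grid positions, and `attention_bias_idxs` stores the N x N lookup into those columns. The sketch below reproduces only that indexing step in plain Python so it can be inspected without Paddle. The function name `build_bias_index` is hypothetical and not part of the PR.

```python
import itertools


def build_bias_index(resolution):
    """Map every (query, key) pair on a resolution x resolution grid to the id of its absolute offset."""
    points = list(itertools.product(range(resolution), range(resolution)))
    offsets = {}  # (|d0|, |d1|) -> column index in the learnable bias table
    idxs = []
    for p1 in points:
        row = []
        for p2 in points:
            offset = (abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
            if offset not in offsets:
                offsets[offset] = len(offsets)
            row.append(offsets[offset])
        idxs.append(row)
    return offsets, idxs


offsets, idxs = build_bias_index(3)
print(len(offsets))             # number of distinct offsets (columns of the bias table)
print(len(idxs), len(idxs[0]))  # N x N lookup table, with N = resolution ** 2
```

Because the offsets are absolute differences, the lookup table is symmetric, and the number of learnable bias columns per head grows as `resolution ** 2` rather than `N ** 2`.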
--- a/modules/image/classification/levit_128_imagenet/module.py
+++ b/modules/image/classification/levit_128_imagenet/module.py
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse
+import copy
+import os
+import cv2
+import numpy as np
+import paddle
+from skimage.io import imread
+from skimage.transform import rescale
+from skimage.transform import resize
+import paddlehub as hub
+from .model import LeViT_128
+from .processor import base64_to_cv2
+from .processor import create_operators
+from .processor import Topk
+from .utils import get_config
+from paddlehub.module.module import moduleinfo
+from paddlehub.module.module import runnable
+from paddlehub.module.module import serving
+@moduleinfo(name="levit_128_imagenet",
+            type="cv/classification",
+            author="paddlepaddle",
+            author_email="",
+            summary="",
+            version="1.0.0")
+class LeViT_128_ImageNet:
+    def __init__(self):
+        self.config = get_config(os.path.join(self.directory, 'LeViT_128.yaml'), show=False)
+        self.label_path = os.path.join(self.directory, 'imagenet1k_label_list.txt')
+        self.pretrain_path = os.path.join(self.directory, 'LeViT_128_pretrained.pdparams')
+        self.config['Infer']['PostProcess']['class_id_map_file'] = self.label_path
+        self.model = LeViT_128()
+        param_state_dict = paddle.load(self.pretrain_path)
+        self.model.set_dict(param_state_dict)
+        self.preprocess_funcs = create_operators(self.config["Infer"]["transforms"])
+    def classification(self,
+                       images: list = None,
+                       paths: list = None,
+                       batch_size: int = 1,
+                       use_gpu: bool = False,
+                       top_k: int = 1):
+        '''
+        Args:
+            images (list[numpy.ndarray]): data of images, shape of each is [H, W, C], color space must be BGR.
+            paths (list[str]): The paths of images.
+            batch_size (int): batch size.
+            use_gpu (bool): Whether to use gpu.
+            top_k (int): Return top k results.
+        Returns:
+            res (list[dict]): The classification results, each result dict contains key 'class_ids', 'scores' and 'label_names'.
+        '''
+        postprocess_func = Topk(top_k, self.label_path)
+        inputs = []
+        results = []
+        paddle.disable_static()
+        place = 'gpu:0' if use_gpu else 'cpu'
+        place = paddle.set_device(place)
+        if images is None and paths is None:
+            print('No image provided. Please input an image or an image path.')
+            return
+        if images is not None:
+            for image in images:
+                image = image[:, :, ::-1]
+                inputs.append(image)
+        if paths is not None:
+            for path in paths:
+                image = cv2.imread(path)[:, :, ::-1]
+                inputs.append(image)
+        batch_data = []
+        for idx, imagedata in enumerate(inputs):
+            for process in self.preprocess_funcs:
+                imagedata = process(imagedata)
+            batch_data.append(imagedata)
+            if len(batch_data) >= batch_size or idx == len(inputs) - 1:
+                batch_tensor = paddle.to_tensor(batch_data)
+                out = self.model(batch_tensor)
+                if isinstance(out, list):
+                    out = out[0]
+                if isinstance(out, dict) and "logits" in out:
+                    out = out["logits"]
+                if isinstance(out, dict) and "output" in out:
+                    out = out["output"]
+                result = postprocess_func(out)
+                results.extend(result)
+                batch_data.clear()
+        return results
+    @runnable
+    def run_cmd(self, argvs: list):
+        """
+        Run as a command.
+        """
+        self.parser = argparse.ArgumentParser(description="Run the {} module.".format(self.name),
+                                              prog='hub run {}'.format(self.name),
+                                              usage='%(prog)s',
+                                              add_help=True)
+        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
+        self.arg_config_group = self.parser.add_argument_group(
+            title="Config options", description="Run configuration for controlling module behavior, not required.")
+        self.add_module_config_arg()
+        self.add_module_input_arg()
+        self.args = self.parser.parse_args(argvs)
+        results = self.classification(paths=[self.args.input_path],
+                                      use_gpu=self.args.use_gpu,
+                                      batch_size=self.args.batch_size,
+                                      top_k=self.args.top_k)
+        return results
+    @serving
+    def serving_method(self, images, **kwargs):
+        """
+        Run as a service.
+        """
+        images_decode = [base64_to_cv2(image) for image in images]
+        results = self.classification(images=images_decode, **kwargs)
+        return results
+    def add_module_config_arg(self):
+        """
+        Add the command config options.
+        """
+        self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
+        self.arg_config_group.add_argument('--batch_size', type=int, default=1, help='batch size')
+        self.arg_config_group.add_argument('--top_k', type=int, default=1, help='Return top k results.')
+    def add_module_input_arg(self):
+        """
+        Add the command input options.
+        """
+        self.arg_input_group.add_argument('--input_path', type=str, help="path to input image.")
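Reviewer note on the batching loop in `classification` above: inputs are accumulated until `batch_size` is reached, and the final partial batch is flushed when the last index is seen. A minimal sketch of the same control flow, isolated from preprocessing and inference, is shown below. The generator name `iter_batches` is hypothetical and not part of the PR.

```python
def iter_batches(items, batch_size):
    """Yield lists of at most batch_size items, flushing the remainder at the end."""
    batch = []
    for idx, item in enumerate(items):
        batch.append(item)
        # Same condition as the module: a full batch, or the last input.
        if len(batch) >= batch_size or idx == len(items) - 1:
            yield batch
            batch = []  # start a fresh list so yielded batches are not aliased


batches = list(iter_batches(['a', 'b', 'c', 'd', 'e'], batch_size=2))
print(batches)
```

An empty input yields no batches at all, which matches the module returning an empty `results` list when both `images` and `paths` are empty lists.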
--- a/modules/image/classification/levit_128_imagenet/processor.py
+++ b/modules/image/classification/levit_128_imagenet/processor.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+import base64
+import inspect
+import math
+import os
+import random
+import sys
+from functools import partial
+import cv2
+import numpy as np
+import paddle
+import paddle.nn.functional as F
+import six
+from paddle.vision.transforms import ColorJitter as RawColorJitter
+from PIL import Image
+def create_operators(params, class_num=None):
+    """
+    create operators based on the config
+    Args:
+        params(list): a dict list, used to create some operators
+    """
+    assert isinstance(params, list), ('operator config should be a list')
+    ops = []
+    current_module = sys.modules[__name__]
+    for operator in params:
+        assert isinstance(operator, dict) and len(operator) == 1, "yaml format error"
+        op_name = list(operator)[0]
+        param = {} if operator[op_name] is None else operator[op_name]
+        op_func = getattr(current_module, op_name)
+        if "class_num" in inspect.getfullargspec(op_func).args:
+            param.update({"class_num": class_num})
+        op = op_func(**param)
+        ops.append(op)
+    return ops
+class UnifiedResize(object):
+    def __init__(self, interpolation=None, backend="cv2"):
+        _cv2_interp_from_str = {
+            'nearest': cv2.INTER_NEAREST,
+            'bilinear': cv2.INTER_LINEAR,
+            'area': cv2.INTER_AREA,
+            'bicubic': cv2.INTER_CUBIC,
+            'lanczos': cv2.INTER_LANCZOS4
+        }
+        _pil_interp_from_str = {
+            'nearest': Image.NEAREST,
+            'bilinear': Image.BILINEAR,
+            'bicubic': Image.BICUBIC,
+            'box': Image.BOX,
+            'lanczos': Image.LANCZOS,
+            'hamming': Image.HAMMING
+        }
+        def _pil_resize(src, size, resample):
+            pil_img = Image.fromarray(src)
+            pil_img = pil_img.resize(size, resample)
+            return np.asarray(pil_img)
+        if backend.lower() == "cv2":
+            if isinstance(interpolation, str):
+                interpolation = _cv2_interp_from_str[interpolation.lower()]
+            # compatible with opencv < version 4.4.0
+            elif interpolation is None:
+                interpolation = cv2.INTER_LINEAR
+            self.resize_func = partial(cv2.resize, interpolation=interpolation)
+        elif backend.lower() == "pil":
+            if isinstance(interpolation, str):
+                interpolation = _pil_interp_from_str[interpolation.lower()]
+            self.resize_func = partial(_pil_resize, resample=interpolation)
+        else:
+            self.resize_func = cv2.resize
+    def __call__(self, src, size):
+        return self.resize_func(src, size)
+class OperatorParamError(ValueError):
+    """ OperatorParamError
+    """
+    pass
+class DecodeImage(object):
+    """ decode image """
+    def __init__(self, to_rgb=True, to_np=False, channel_first=False):
+        self.to_rgb = to_rgb
+        self.to_np = to_np  # to numpy
+        self.channel_first = channel_first  # only enabled when to_np is True
+    def __call__(self, img):
+        if six.PY2:
+            assert type(img) is str and len(img) > 0, "invalid input 'img' in DecodeImage"
+        else:
+            assert type(img) is bytes and len(img) > 0, "invalid input 'img' in DecodeImage"
+        data = np.frombuffer(img, dtype='uint8')
+        img = cv2.imdecode(data, 1)
+        if self.to_rgb:
+            assert img.shape[2] == 3, 'invalid shape of image[%s]' % (img.shape, )
+            img = img[:, :, ::-1]
+        if self.channel_first:
+            img = img.transpose((2, 0, 1))
+        return img
+class ResizeImage(object):
+    """ resize image """
+    def __init__(self, size=None, resize_short=None, interpolation=None, backend="cv2"):
+        if resize_short is not None and resize_short > 0:
+            self.resize_short = resize_short
+            self.w = None
+            self.h = None
+        elif size is not None:
+            self.resize_short = None
+            self.w = size if type(size) is int else size[0]
+            self.h = size if type(size) is int else size[1]
+        else:
+            raise OperatorParamError("invalid params for ResizeImage: both 'size' and 'resize_short' are None")
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        img_h, img_w = img.shape[:2]
+        if self.resize_short is not None:
+            percent = float(self.resize_short) / min(img_w, img_h)
+            w = int(round(img_w * percent))
+            h = int(round(img_h * percent))
+        else:
+            w = self.w
+            h = self.h
+        return self._resize_func(img, (w, h))
+class CropImage(object):
+    """ crop image """
+    def __init__(self, size):
+        if type(size) is int:
+            self.size = (size, size)
+        else:
+            self.size = size  # (w, h), matching the unpacking in __call__
+    def __call__(self, img):
+        w, h = self.size
+        img_h, img_w = img.shape[:2]
+        w_start = (img_w - w) // 2
+        h_start = (img_h - h) // 2
+        w_end = w_start + w
+        h_end = h_start + h
+        return img[h_start:h_end, w_start:w_end, :]
+class RandCropImage(object):
+    """ random crop image """
+    def __init__(self, size, scale=None, ratio=None, interpolation=None, backend="cv2"):
+        if type(size) is int:
+            self.size = (size, size)  # (w, h), passed to the resize function as the target size
+        else:
+            self.size = size
+        self.scale = [0.08, 1.0] if scale is None else scale
+        self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        size = self.size
+        scale = self.scale
+        ratio = self.ratio
+        aspect_ratio = math.sqrt(random.uniform(*ratio))
+        w = 1. * aspect_ratio
+        h = 1. / aspect_ratio
+        img_h, img_w = img.shape[:2]
+        bound = min((float(img_w) / img_h) / (w**2), (float(img_h) / img_w) / (h**2))
+        scale_max = min(scale[1], bound)
+        scale_min = min(scale[0], bound)
+        target_area = img_w * img_h * random.uniform(scale_min, scale_max)
+        target_size = math.sqrt(target_area)
+        w = int(target_size * w)
+        h = int(target_size * h)
+        i = random.randint(0, img_w - w)
+        j = random.randint(0, img_h - h)
+        img = img[j:j + h, i:i + w, :]
+        return self._resize_func(img, size)
+class RandFlipImage(object):
+    """ random flip image
+        flip_code:
+            1: Flipped Horizontally
+            0: Flipped Vertically
+            -1: Flipped Horizontally & Vertically
+    """
+    def __init__(self, flip_code=1):
+        assert flip_code in [-1, 0, 1], "flip_code should be a value in [-1, 0, 1]"
+        self.flip_code = flip_code
+    def __call__(self, img):
+        if random.randint(0, 1) == 1:
+            return cv2.flip(img, self.flip_code)
+        else:
+            return img
+class NormalizeImage(object):
+    """ normalize image such as subtract mean, divide std
+    """
+    def __init__(self, scale=None, mean=None, std=None, order='chw', output_fp16=False, channel_num=3):
+        if isinstance(scale, str):
+            scale = eval(scale)
+        assert channel_num in [3, 4], "channel number of input image should be set to 3 or 4."
+        self.channel_num = channel_num
+        self.output_dtype = 'float16' if output_fp16 else 'float32'
+        self.scale = np.float32(scale if scale is not None else 1.0 / 255.0)
+        self.order = order
+        mean = mean if mean is not None else [0.485, 0.456, 0.406]
+        std = std if std is not None else [0.229, 0.224, 0.225]
+        shape = (3, 1, 1) if self.order == 'chw' else (1, 1, 3)
+        self.mean = np.array(mean).reshape(shape).astype('float32')
+        self.std = np.array(std).reshape(shape).astype('float32')
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        assert isinstance(img, np.ndarray), "invalid input 'img' in NormalizeImage"
+        img = (img.astype('float32') * self.scale - self.mean) / self.std
+        if self.channel_num == 4:
+            img_h = img.shape[1] if self.order == 'chw' else img.shape[0]
+            img_w = img.shape[2] if self.order == 'chw' else img.shape[1]
+            pad_zeros = np.zeros((1, img_h, img_w)) if self.order == 'chw' else np.zeros((img_h, img_w, 1))
+            img = (np.concatenate((img, pad_zeros), axis=0) if self.order == 'chw' else np.concatenate(
+                (img, pad_zeros), axis=2))
+        return img.astype(self.output_dtype)
+class ToCHWImage(object):
+    """ convert hwc image to chw image
+    """
+    def __init__(self):
+        pass
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        return img.transpose((2, 0, 1))
+class ColorJitter(RawColorJitter):
+    """ColorJitter.
+    """
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+    def __call__(self, img):
+        if not isinstance(img, Image.Image):
+            img = np.ascontiguousarray(img)
+            img = Image.fromarray(img)
+        img = super()._apply_image(img)
+        if isinstance(img, Image.Image):
+            img = np.asarray(img)
+        return img
+def base64_to_cv2(b64str):
+    data = base64.b64decode(b64str.encode('utf8'))
+    data = np.frombuffer(data, np.uint8)  # np.fromstring is deprecated for binary input
+    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
+    return data
+class Topk(object):
+    def __init__(self, topk=1, class_id_map_file=None):
+        assert isinstance(topk, (int, ))
+        self.class_id_map = self.parse_class_id_map(class_id_map_file)
+        self.topk = topk
+    def parse_class_id_map(self, class_id_map_file):
+        if class_id_map_file is None:
+            return None
+        if not os.path.exists(class_id_map_file):
+            print(
+                "Warning: To use your own label_dict, please provide a valid path!\nOtherwise label_names will be empty!"
+            )
+            return None
+        try:
+            class_id_map = {}
+            with open(class_id_map_file, "r") as fin:
+                lines = fin.readlines()
+                for line in lines:
+                    partition = line.split("\n")[0].partition(" ")
+                    class_id_map[int(partition[0])] = str(partition[-1])
+        except Exception as ex:
+            print(ex)
+            class_id_map = None
+        return class_id_map
+    def __call__(self, x, file_names=None, multilabel=False):
+        assert isinstance(x, paddle.Tensor)
+        if file_names is not None:
+            assert x.shape[0] == len(file_names)
+        x = F.softmax(x, axis=-1) if not multilabel else F.sigmoid(x)
+        x = x.numpy()
+        y = []
+        for idx, probs in enumerate(x):
+            index = probs.argsort(axis=0)[-self.topk:][::-1].astype("int32") if not multilabel else np.where(
+                probs >= 0.5)[0].astype("int32")
+            clas_id_list = []
+            score_list = []
+            label_name_list = []
+            for i in index:
+                clas_id_list.append(i.item())
+                score_list.append(probs[i].item())
+                if self.class_id_map is not None:
+                    label_name_list.append(self.class_id_map[i.item()])
+            result = {
+                "class_ids": clas_id_list,
+                "scores": np.around(score_list, decimals=5).tolist(),
+            }
+            if file_names is not None:
+                result["file_name"] = file_names[idx]
+            if label_name_list is not None:
+                result["label_names"] = label_name_list
+            y.append(result)
+        return y
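Note on `Topk.__call__` above: for the single-label case it applies softmax to each row of logits, takes the `topk` highest-scoring class ids in descending order, and rounds the scores to five decimals. A minimal sketch of that logic in plain Python follows, assuming list inputs instead of a `paddle.Tensor`; `topk_postprocess` and the three-class label map are hypothetical names used only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk_postprocess(batch_logits, topk=1, class_id_map=None):
    """Return one result dict per row of logits, highest score first."""
    results = []
    for logits in batch_logits:
        probs = softmax(logits)
        # Class ids sorted by descending probability, truncated to topk.
        index = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:topk]
        result = {
            "class_ids": index,
            "scores": [round(probs[i], 5) for i in index],
        }
        if class_id_map is not None:
            result["label_names"] = [class_id_map[i] for i in index]
        results.append(result)
    return results


# Hypothetical three-class example (not the ImageNet label list).
demo = topk_postprocess([[1.0, 3.0, 2.0]], topk=2,
                        class_id_map={0: "cat", 1: "dog", 2: "bird"})
```

Tie-breaking between equal scores may differ from the `argsort`-based version in the diff; the sketch only mirrors the overall shape of the returned dicts.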
--- a/modules/image/classification/levit_128_imagenet/utils.py
+++ b/modules/image/classification/levit_128_imagenet/utils.py
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse
+import copy
+import os
+import yaml
+__all__ = ['get_config']
+class AttrDict(dict):
+    def __getattr__(self, key):
+        return self[key]
+    def __setattr__(self, key, value):
+        if key in self.__dict__:
+            self.__dict__[key] = value
+        else:
+            self[key] = value
+    def __deepcopy__(self, content):
+        return copy.deepcopy(dict(self))
+def create_attr_dict(yaml_config):
+    from ast import literal_eval
+    for key, value in yaml_config.items():
+        if type(value) is dict:
+            yaml_config[key] = value = AttrDict(value)
+        if isinstance(value, str):
+            try:
+                value = literal_eval(value)
+            except BaseException:
+                pass
+        if isinstance(value, AttrDict):
+            create_attr_dict(yaml_config[key])
+        else:
+            yaml_config[key] = value
+def parse_config(cfg_file):
+    """Load a config file into AttrDict"""
+    with open(cfg_file, 'r') as fopen:
+        yaml_config = AttrDict(yaml.load(fopen, Loader=yaml.SafeLoader))
+    create_attr_dict(yaml_config)
+    return yaml_config
+def override(dl, ks, v):
+    """
+    Recursively replace a value in a dict or list
+    Args:
+        dl(dict or list): dict or list to be replaced
+        ks(list): list of keys
+        v(str): value to be replaced
+    """
+    def str2num(v):
+        try:
+            return eval(v)
+        except Exception:
+            return v
+    assert isinstance(dl, (list, dict)), ("{} should be a list or a dict".format(dl))
+    assert len(ks) > 0, ('length of keys should be larger than 0')
+    if isinstance(dl, list):
+        k = str2num(ks[0])
+        if len(ks) == 1:
+            assert k < len(dl), ('index({}) out of range({})'.format(k, dl))
+            dl[k] = str2num(v)
+        else:
+            override(dl[k], ks[1:], v)
+    else:
+        if len(ks) == 1:
+            # assert ks[0] in dl, ('{} does not exist in {}'.format(ks[0], dl))
+            if ks[0] not in dl:
+                print('A new field ({}) detected!'.format(ks[0]))
+            dl[ks[0]] = str2num(v)
+        else:
+            override(dl[ks[0]], ks[1:], v)
+def override_config(config, options=None):
+    """
+    Recursively override the config
+    Args:
+        config(dict): dict to be replaced
+        options(list): list of pairs(key0.key1.idx.key2=value)
+            such as: [
+                'topk=2',
+                'VALID.transforms.1.ResizeImage.resize_short=300'
+            ]
+    Returns:
+        config(dict): replaced config
+    """
+    if options is not None:
+        for opt in options:
+            assert isinstance(opt, str), ("option({}) should be a str".format(opt))
+            assert "=" in opt, ("option({}) should contain a = "
+                                "to distinguish between key and value".format(opt))
+            pair = opt.split('=')
+            assert len(pair) == 2, ("there can be only one = in the option")
+            key, value = pair
+            keys = key.split('.')
+            override(config, keys, value)
+    return config
+def get_config(fname, overrides=None, show=False):
+    """
+    Read config from file
+    """
+    assert os.path.exists(fname), ('config file({}) does not exist'.format(fname))
+    config = parse_config(fname)
+    override_config(config, overrides)
+    return config
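Note on `override_config` above: each option has the form `key0.key1.idx.key2=value`, where the dotted path walks through nested dicts and lists and the value is converted from its string form. The sketch below shows the same idea under two stated assumptions: it uses `ast.literal_eval` instead of `eval` for the conversion, and it splits on the first `=` only, so it is not a drop-in copy of the code in the diff. `parse_value`, `apply_override`, and the sample config are hypothetical.

```python
from ast import literal_eval


def parse_value(text):
    """Turn '300' into 300 and 'True' into True; leave other strings unchanged."""
    try:
        return literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_override(config, option):
    """Apply one 'key0.key1.idx=value' option to a nested dict/list in place."""
    key, sep, value = option.partition("=")
    if not sep:
        raise ValueError("option {!r} should contain '='".format(option))
    keys = key.split(".")
    node = config
    # Walk down to the container that holds the final key.
    for k in keys[:-1]:
        node = node[int(k)] if isinstance(node, list) else node[k]
    last = keys[-1]
    if isinstance(node, list):
        node[int(last)] = parse_value(value)
    else:
        node[last] = parse_value(value)
    return config


# Hypothetical config, shaped like the docstring example above.
config = {
    "topk": 1,
    "VALID": {"transforms": [{"DecodeImage": {}}, {"ResizeImage": {"resize_short": 256}}]},
}
apply_override(config, "topk=2")
apply_override(config, "VALID.transforms.1.ResizeImage.resize_short=300")
```

Using `literal_eval` restricts values to Python literals, which avoids executing arbitrary expressions passed on the command line.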
--- a/modules/image/classification/levit_128s_imagenet/README.md
+++ b/modules/image/classification/levit_128s_imagenet/README.md
+# levit_128s_imagenet
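+|Model name|levit_128s_imagenet|
+| :--- | :---: |
+|Category|Image - Image classification|
+|Network|LeViT|
+|Dataset|ImageNet-2012|
+|Fine-tuning supported|No|
+|Model size|45 MB|
+|Latest update date|2022-04-02|
+|Data indicators|Acc|
+## I. Basic Information
+- ### Model introduction
+  - LeViT is a hybrid neural network for fast-inference image classification. Its design takes into account how the model performs on different hardware platforms, so it better reflects real-world deployment scenarios. Through extensive experiments, the authors found a better way to combine convolutional neural networks with Transformers and proposed an attention-based method for integrating positional information encoding in the Transformer. This module uses the LeViT128s model configuration; for details, see the [paper](https://arxiv.org/abs/2104.01136).
+## II. Installation
+- ### 1. Environment dependencies
+  - paddlepaddle >= 1.6.2
+  - paddlehub >= 1.6.0  | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+- ### 2. Installation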
+  - ```shell
+    $ hub install levit_128s_imagenet
+    ```
+  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+## III. Module API Prediction
+- ### 1. Command line prediction
+  - ```shell
+    $ hub run levit_128s_imagenet --input_path "/PATH/TO/IMAGE"
+    ```
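+  - This calls the classification model from the command line. For more information, see [PaddleHub command line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+- ### 2. Prediction code example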
+  - ```python
+    import paddlehub as hub
+    import cv2
+    classifier = hub.Module(name="levit_128s_imagenet")
+    result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')])
+    # or
+    # result = classifier.classification(paths=['/PATH/TO/IMAGE'])
+    ```
+- ### 3. API
+  - ```python
+    def classification(images=None,
+                       paths=None,
+                       batch_size=1,
+                       use_gpu=False,
+                       top_k=1):
+    ```
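+    - Classification API.
+    - **Parameters**
+      - images (list\[numpy.ndarray\]): image data; each image has shape \[H, W, C\] in BGR color space; <br/>
+      - paths (list\[str\]): image paths; <br/>
+      - batch\_size (int): batch size; <br/>
+      - use\_gpu (bool): whether to use the GPU; **if so, set the CUDA_VISIBLE_DEVICES environment variable first** <br/>
+      - top\_k (int): return the top k predictions.
+    - **Return**
+      - res (list\[dict\]): classification results; each element is a dict with the keys 'class_ids' (class indices), 'scores' (confidence scores), and 'label_names' (class names).
+## IV. Server Deployment
+- PaddleHub Serving can deploy an online image recognition service.
+- ### Step 1: Start PaddleHub Serving
+  - Run the startup command: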
+  - ```shell
+    $ hub serving start -m levit_128s_imagenet
+    ```
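+  - This deploys an online image recognition service; the default port is 8866.
+  - **NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+- ### Step 2: Send a prediction request
+  - With the server configured, the following few lines of code send a prediction request and fetch the result: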
+  - ```python
+    import requests
+    import json
+    import cv2
+    import base64
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+    # Send the HTTP request
+    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+    headers = {"Content-type": "application/json"}
+    url = "http://127.0.0.1:8866/predict/levit_128s_imagenet"
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    # Print the prediction results
+    print(r.json()["results"])
+    ```
+## V. Release Note
+* 1.0.0
+  First release
+  - ```shell
+    $ hub install levit_128s_imagenet==1.0.0
+    ```
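Note on the serving example in the README above: the request body is JSON whose `images` field holds base64-encoded image bytes, and the server side decodes them again (see `base64_to_cv2` in `processor.py`). The sketch below shows only that text-safe round trip, using placeholder bytes rather than a real `cv2.imencode` result so that it runs without OpenCV; `bytes_to_base64` and `base64_to_bytes` are illustrative helper names, not PaddleHub APIs.

```python
import base64
import json


def bytes_to_base64(raw):
    """Encode raw bytes as a UTF-8 base64 string that is safe to embed in JSON."""
    return base64.b64encode(raw).decode("utf8")


def base64_to_bytes(b64str):
    """Inverse of bytes_to_base64."""
    return base64.b64decode(b64str.encode("utf8"))


# Placeholder bytes standing in for an encoded JPEG; not a real image.
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])

# Client side: build the JSON request body.
body = json.dumps({"images": [bytes_to_base64(fake_jpeg)]})

# Server side: parse the body and recover the original bytes.
received = json.loads(body)["images"][0]
decoded = base64_to_bytes(received)
```

In the real client the bytes come from `cv2.imencode('.jpg', image)[1]`, and the server passes the decoded buffer to `cv2.imdecode`.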
--- a/modules/image/classification/levit_128s_imagenet/model.py
+++ b/modules/image/classification/levit_128s_imagenet/model.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# Code was based on https://github.com/facebookresearch/LeViT
+import itertools
+import math
+import warnings
+import paddle
+import paddle.nn as nn
+import paddle.nn.functional as F
+from paddle.nn.initializer import Constant
+from paddle.nn.initializer import TruncatedNormal
+from paddle.regularizer import L2Decay
+from .vision_transformer import Identity
+from .vision_transformer import ones_
+from .vision_transformer import trunc_normal_
+from .vision_transformer import zeros_
+def cal_attention_biases(attention_biases, attention_bias_idxs):
+    gather_list = []
+    attention_bias_t = paddle.transpose(attention_biases, (1, 0))
+    nums = attention_bias_idxs.shape[0]
+    for idx in range(nums):
+        gather = paddle.gather(attention_bias_t, attention_bias_idxs[idx])
+        gather_list.append(gather)
+    shape0, shape1 = attention_bias_idxs.shape
+    gather = paddle.concat(gather_list)
+    return paddle.transpose(gather, (1, 0)).reshape((0, shape0, shape1))
+class Conv2d_BN(nn.Sequential):
+    def __init__(self, a, b, ks=1, stride=1, pad=0, dilation=1, groups=1, bn_weight_init=1, resolution=-10000):
+        super().__init__()
+        self.add_sublayer('c', nn.Conv2D(a, b, ks, stride, pad, dilation, groups, bias_attr=False))
+        bn = nn.BatchNorm2D(b)
+        ones_(bn.weight)
+        zeros_(bn.bias)
+        self.add_sublayer('bn', bn)
+class Linear_BN(nn.Sequential):
+    def __init__(self, a, b, bn_weight_init=1):
+        super().__init__()
+        self.add_sublayer('c', nn.Linear(a, b, bias_attr=False))
+        bn = nn.BatchNorm1D(b)
+        if bn_weight_init == 0:
+            zeros_(bn.weight)
+        else:
+            ones_(bn.weight)
+        zeros_(bn.bias)
+        self.add_sublayer('bn', bn)
+    def forward(self, x):
+        l, bn = self._sub_layers.values()
+        x = l(x)
+        return paddle.reshape(bn(x.flatten(0, 1)), x.shape)
+class BN_Linear(nn.Sequential):
+    def __init__(self, a, b, bias=True, std=0.02):
+        super().__init__()
+        self.add_sublayer('bn', nn.BatchNorm1D(a))
+        l = nn.Linear(a, b, bias_attr=bias)
+        trunc_normal_(l.weight)
+        if bias:
+            zeros_(l.bias)
+        self.add_sublayer('l', l)
+def b16(n, activation, resolution=224):
+    return nn.Sequential(Conv2d_BN(3, n // 8, 3, 2, 1, resolution=resolution), activation(),
+                         Conv2d_BN(n // 8, n // 4, 3, 2, 1, resolution=resolution // 2), activation(),
+                         Conv2d_BN(n // 4, n // 2, 3, 2, 1, resolution=resolution // 4), activation(),
+                         Conv2d_BN(n // 2, n, 3, 2, 1, resolution=resolution // 8))
+class Residual(nn.Layer):
+    def __init__(self, m, drop):
+        super().__init__()
+        self.m = m
+        self.drop = drop
+    def forward(self, x):
+        if self.training and self.drop > 0:
+            # Stochastic depth: keep each sample's residual branch with probability 1 - drop.
+            y = paddle.rand(shape=[x.shape[0], 1, 1]).__ge__(self.drop).astype("float32")
+            y = y.divide(paddle.full_like(y, 1 - self.drop))
+            return paddle.add(x, self.m(x) * y)
+        else:
+            return paddle.add(x, self.m(x))
+class Attention(nn.Layer):
+    def __init__(self, dim, key_dim, num_heads=8, attn_ratio=4, activation=None, resolution=14):
+        super().__init__()
+        self.num_heads = num_heads
+        self.scale = key_dim**-0.5
+        self.key_dim = key_dim
+        self.nh_kd = nh_kd = key_dim * num_heads
+        self.d = int(attn_ratio * key_dim)
+        self.dh = int(attn_ratio * key_dim) * num_heads
+        self.attn_ratio = attn_ratio
+        self.h = self.dh + nh_kd * 2
+        self.qkv = Linear_BN(dim, self.h)
+        self.proj = nn.Sequential(activation(), Linear_BN(self.dh, dim, bn_weight_init=0))
+        points = list(itertools.product(range(resolution), range(resolution)))
+        N = len(points)
+        attention_offsets = {}
+        idxs = []
+        for p1 in points:
+            for p2 in points:
+                offset = (abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
+                if offset not in attention_offsets:
+                    attention_offsets[offset] = len(attention_offsets)
+                idxs.append(attention_offsets[offset])
+        self.attention_biases = self.create_parameter(shape=(num_heads, len(attention_offsets)),
+                                                      default_initializer=zeros_,
+                                                      attr=paddle.ParamAttr(regularizer=L2Decay(0.0)))
+        tensor_idxs = paddle.to_tensor(idxs, dtype='int64')
+        self.register_buffer('attention_bias_idxs', paddle.reshape(tensor_idxs, [N, N]))
+    @paddle.no_grad()
+    def train(self, mode=True):
+        if mode:
+            super().train()
+        else:
+            super().eval()
+        if mode and hasattr(self, 'ab'):
+            del self.ab
+        else:
+            self.ab = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+    def forward(self, x):
+        self.training = True
+        B, N, C = x.shape
+        qkv = self.qkv(x)
+        qkv = paddle.reshape(qkv, [B, N, self.num_heads, self.h // self.num_heads])
+        q, k, v = paddle.split(qkv, [self.key_dim, self.key_dim, self.d], axis=3)
+        q = paddle.transpose(q, perm=[0, 2, 1, 3])
+        k = paddle.transpose(k, perm=[0, 2, 1, 3])
+        v = paddle.transpose(v, perm=[0, 2, 1, 3])
+        k_transpose = paddle.transpose(k, perm=[0, 1, 3, 2])
+        if self.training:
+            attention_biases = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+        else:
+            attention_biases = self.ab
+        attn = (paddle.matmul(q, k_transpose) * self.scale + attention_biases)
+        attn = F.softmax(attn)
+        x = paddle.transpose(paddle.matmul(attn, v), perm=[0, 2, 1, 3])
+        x = paddle.reshape(x, [B, N, self.dh])
+        x = self.proj(x)
+        return x
+class Subsample(nn.Layer):
+    def __init__(self, stride, resolution):
+        super().__init__()
+        self.stride = stride
+        self.resolution = resolution
+    def forward(self, x):
+        B, N, C = x.shape
+        x = paddle.reshape(x, [B, self.resolution, self.resolution, C])
+        end1, end2 = x.shape[1], x.shape[2]
+        x = x[:, 0:end1:self.stride, 0:end2:self.stride]
+        x = paddle.reshape(x, [B, -1, C])
+        return x
+class AttentionSubsample(nn.Layer):
+    def __init__(self,
+                 in_dim,
+                 out_dim,
+                 key_dim,
+                 num_heads=8,
+                 attn_ratio=2,
+                 activation=None,
+                 stride=2,
+                 resolution=14,
+                 resolution_=7):
+        super().__init__()
+        self.num_heads = num_heads
+        self.scale = key_dim**-0.5
+        self.key_dim = key_dim
+        self.nh_kd = nh_kd = key_dim * num_heads
+        self.d = int(attn_ratio * key_dim)
+        self.dh = int(attn_ratio * key_dim) * self.num_heads
+        self.attn_ratio = attn_ratio
+        self.resolution_ = resolution_
+        self.resolution_2 = resolution_**2
+        self.training = True
+        h = self.dh + nh_kd
+        self.kv = Linear_BN(in_dim, h)
+        self.q = nn.Sequential(Subsample(stride, resolution), Linear_BN(in_dim, nh_kd))
+        self.proj = nn.Sequential(activation(), Linear_BN(self.dh, out_dim))
+        self.stride = stride
+        self.resolution = resolution
+        points = list(itertools.product(range(resolution), range(resolution)))
+        points_ = list(itertools.product(range(resolution_), range(resolution_)))
+        N = len(points)
+        N_ = len(points_)
+        attention_offsets = {}
+        idxs = []
+        i = 0
+        j = 0
+        for p1 in points_:
+            i += 1
+            for p2 in points:
+                j += 1
+                size = 1
+                offset = (abs(p1[0] * stride - p2[0] + (size - 1) / 2), abs(p1[1] * stride - p2[1] + (size - 1) / 2))
+                if offset not in attention_offsets:
+                    attention_offsets[offset] = len(attention_offsets)
+                idxs.append(attention_offsets[offset])
+        self.attention_biases = self.create_parameter(shape=(num_heads, len(attention_offsets)),
+                                                      default_initializer=zeros_,
+                                                      attr=paddle.ParamAttr(regularizer=L2Decay(0.0)))
+        tensor_idxs_ = paddle.to_tensor(idxs, dtype='int64')
+        self.register_buffer('attention_bias_idxs', paddle.reshape(tensor_idxs_, [N_, N]))
+    @paddle.no_grad()
+    def train(self, mode=True):
+        if mode:
+            super().train()
+        else:
+            super().eval()
+        if mode and hasattr(self, 'ab'):
+            del self.ab
+        else:
+            self.ab = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+    def forward(self, x):
+        self.training = True
+        B, N, C = x.shape
+        kv = self.kv(x)
+        kv = paddle.reshape(kv, [B, N, self.num_heads, -1])
+        k, v = paddle.split(kv, [self.key_dim, self.d], axis=3)
+        k = paddle.transpose(k, perm=[0, 2, 1, 3])  # BHNC
+        v = paddle.transpose(v, perm=[0, 2, 1, 3])
+        q = paddle.reshape(self.q(x), [B, self.resolution_2, self.num_heads, self.key_dim])
+        q = paddle.transpose(q, perm=[0, 2, 1, 3])
+        if self.training:
+            attention_biases = cal_attention_biases(self.attention_biases, self.attention_bias_idxs)
+        else:
+            attention_biases = self.ab
+        attn = (paddle.matmul(q, paddle.transpose(k, perm=[0, 1, 3, 2]))) * self.scale + attention_biases
+        attn = F.softmax(attn)
+        x = paddle.reshape(paddle.transpose(paddle.matmul(attn, v), perm=[0, 2, 1, 3]), [B, -1, self.dh])
+        x = self.proj(x)
+        return x
+class LeViT(nn.Layer):
+    """ Vision Transformer with support for patch or hybrid CNN input stage
+    """
+    def __init__(self,
+                 img_size=224,
+                 patch_size=16,
+                 in_chans=3,
+                 class_num=1000,
+                 embed_dim=[192],
+                 key_dim=[64],
+                 depth=[12],
+                 num_heads=[3],
+                 attn_ratio=[2],
+                 mlp_ratio=[2],
+                 hybrid_backbone=None,
+                 down_ops=[],
+                 attention_activation=nn.Hardswish,
+                 mlp_activation=nn.Hardswish,
+                 distillation=True,
+                 drop_path=0):
+        super().__init__()
+        self.class_num = class_num
+        self.num_features = embed_dim[-1]
+        self.embed_dim = embed_dim
+        self.distillation = distillation
+        self.patch_embed = hybrid_backbone
+        self.blocks = []
+        down_ops = list(down_ops) + [['']]  # copy so the caller's list and the mutable default are not modified
+        resolution = img_size // patch_size
+        for i, (ed, kd, dpth, nh, ar, mr,
+                do) in enumerate(zip(embed_dim, key_dim, depth, num_heads, attn_ratio, mlp_ratio, down_ops)):
+            for _ in range(dpth):
+                self.blocks.append(
+                    Residual(
+                        Attention(
+                            ed,
+                            kd,
+                            nh,
+                            attn_ratio=ar,
+                            activation=attention_activation,
+                            resolution=resolution,
+                        ), drop_path))
+                if mr > 0:
+                    h = int(ed * mr)
+                    self.blocks.append(
+                        Residual(
+                            nn.Sequential(
+                                Linear_BN(ed, h),
+                                mlp_activation(),
+                                Linear_BN(h, ed, bn_weight_init=0),
+                            ), drop_path))
+            if do[0] == 'Subsample':
+                #('Subsample',key_dim, num_heads, attn_ratio, mlp_ratio, stride)
+                resolution_ = (resolution - 1) // do[5] + 1
+                self.blocks.append(
+                    AttentionSubsample(*embed_dim[i:i + 2],
+                                       key_dim=do[1],
+                                       num_heads=do[2],
+                                       attn_ratio=do[3],
+                                       activation=attention_activation,
+                                       stride=do[5],
+                                       resolution=resolution,
+                                       resolution_=resolution_))
+                resolution = resolution_
+                if do[4] > 0:  # mlp_ratio
+                    h = int(embed_dim[i + 1] * do[4])
+                    self.blocks.append(
+                        Residual(
+                            nn.Sequential(
+                                Linear_BN(embed_dim[i + 1], h),
+                                mlp_activation(),
+                                Linear_BN(h, embed_dim[i + 1], bn_weight_init=0),
+                            ), drop_path))
+        self.blocks = nn.Sequential(*self.blocks)
+        # Classifier head
+        self.head = BN_Linear(embed_dim[-1], class_num) if class_num > 0 else Identity()
+        if distillation:
+            self.head_dist = BN_Linear(embed_dim[-1], class_num) if class_num > 0 else Identity()
+    def forward(self, x):
+        x = self.patch_embed(x)
+        x = x.flatten(2)
+        x = paddle.transpose(x, perm=[0, 2, 1])
+        x = self.blocks(x)
+        x = x.mean(1)
+        x = paddle.reshape(x, [-1, self.embed_dim[-1]])
+        if self.distillation:
+            x = self.head(x), self.head_dist(x)
+            if not self.training:
+                x = (x[0] + x[1]) / 2
+        else:
+            x = self.head(x)
+        return x
+def model_factory(C, D, X, N, drop_path, class_num, distillation):
+    embed_dim = [int(x) for x in C.split('_')]
+    num_heads = [int(x) for x in N.split('_')]
+    depth = [int(x) for x in X.split('_')]
+    act = nn.Hardswish
+    model = LeViT(
+        patch_size=16,
+        embed_dim=embed_dim,
+        num_heads=num_heads,
+        key_dim=[D] * 3,
+        depth=depth,
+        attn_ratio=[2, 2, 2],
+        mlp_ratio=[2, 2, 2],
+        down_ops=[
+            #('Subsample',key_dim, num_heads, attn_ratio, mlp_ratio, stride)
+            ['Subsample', D, embed_dim[0] // D, 4, 2, 2],
+            ['Subsample', D, embed_dim[1] // D, 4, 2, 2],
+        ],
+        attention_activation=act,
+        mlp_activation=act,
+        hybrid_backbone=b16(embed_dim[0], activation=act),
+        class_num=class_num,
+        drop_path=drop_path,
+        distillation=distillation)
+    return model
+specification = {
+    'LeViT_128S': {
+        'C': '128_256_384',
+        'D': 16,
+        'N': '4_6_8',
+        'X': '2_3_4',
+        'drop_path': 0
+    },
+    'LeViT_128': {
+        'C': '128_256_384',
+        'D': 16,
+        'N': '4_8_12',
+        'X': '4_4_4',
+        'drop_path': 0
+    },
+    'LeViT_192': {
+        'C': '192_288_384',
+        'D': 32,
+        'N': '3_5_6',
+        'X': '4_4_4',
+        'drop_path': 0
+    },
+    'LeViT_256': {
+        'C': '256_384_512',
+        'D': 32,
+        'N': '4_6_8',
+        'X': '4_4_4',
+        'drop_path': 0
+    },
+    'LeViT_384': {
+        'C': '384_512_768',
+        'D': 32,
+        'N': '6_9_12',
+        'X': '4_4_4',
+        'drop_path': 0.1
+    },
+}
+def LeViT_128S(**kwargs):
+    model = model_factory(**specification['LeViT_128S'], class_num=1000, distillation=False)
+    return model
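Note on `Attention.__init__` above: the learned `attention_biases` parameter holds one column per distinct absolute offset between two grid positions, and `attention_bias_idxs` is an N x N table that maps every (query, key) pair to its column. The sketch below rebuilds only that index table in plain Python for a small grid; `build_bias_index` is a hypothetical name, and the Paddle parameter and buffer registration are omitted.

```python
import itertools


def build_bias_index(resolution):
    """Map every pair of grid points to the id of their absolute (dy, dx) offset."""
    points = list(itertools.product(range(resolution), range(resolution)))
    offsets = {}  # (dy, dx) -> column id, assigned in first-seen order
    idxs = []     # N x N table, N = resolution ** 2
    for p1 in points:
        row = []
        for p2 in points:
            offset = (abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
            if offset not in offsets:
                offsets[offset] = len(offsets)
            row.append(offsets[offset])
        idxs.append(row)
    return offsets, idxs


# A 3 x 3 grid has 9 positions and 9 distinct absolute offsets.
offsets, idxs = build_bias_index(3)
```

Because the offsets are absolute values, the table is symmetric, and a `resolution` of 14 (224 // 16) yields 196 distinct offsets, which matches the second dimension of `attention_biases` for the first stage.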
--- a/modules/image/classification/levit_128s_imagenet/module.py
+++ b/modules/image/classification/levit_128s_imagenet/module.py
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse
+import copy
+import os
+import cv2
+import numpy as np
+import paddle
+from skimage.io import imread
+from skimage.transform import rescale
+from skimage.transform import resize
+import paddlehub as hub
+from .model import LeViT_128S
+from .processor import base64_to_cv2
+from .processor import create_operators
+from .processor import Topk
+from .utils import get_config
+from paddlehub.module.module import moduleinfo
+from paddlehub.module.module import runnable
+from paddlehub.module.module import serving
+@moduleinfo(name="levit_128s_imagenet",
+            type="cv/classification",
+            author="paddlepaddle",
+            author_email="",
+            summary="",
+            version="1.0.0")
+class LeViT_128S_ImageNet:
+    def __init__(self):
+        self.config = get_config(os.path.join(self.directory, 'LeViT_128S.yaml'), show=False)
+        self.label_path = os.path.join(self.directory, 'imagenet1k_label_list.txt')
+        self.pretrain_path = os.path.join(self.directory, 'LeViT_128S_pretrained.pdparams')
+        self.config['Infer']['PostProcess']['class_id_map_file'] = self.label_path
+        self.model = LeViT_128S()
+        param_state_dict = paddle.load(self.pretrain_path)
+        self.model.set_dict(param_state_dict)
+        self.model.eval()  # inference-only module: use BatchNorm running statistics
+        self.preprocess_funcs = create_operators(self.config["Infer"]["transforms"])
+    def classification(self,
+                       images: list = None,
+                       paths: list = None,
+                       batch_size: int = 1,
+                       use_gpu: bool = False,
+                       top_k: int = 1):
+        '''
+        Args:
+            images (list[numpy.ndarray]): data of images, shape of each is [H, W, C], color space must be BGR.
+            paths (list[str]): The paths of images.
+            batch_size (int): batch size.
+            use_gpu (bool): Whether to use gpu.
+            top_k (int): Return top k results.
+        Returns:
+            res (list[dict]): The classification results, each result dict contains key 'class_ids', 'scores' and 'label_names'.
+        '''
+        postprocess_func = Topk(top_k, self.label_path)
+        inputs = []
+        results = []
+        paddle.disable_static()
+        place = 'gpu:0' if use_gpu else 'cpu'
+        place = paddle.set_device(place)
+        if images is None and paths is None:
+            print('No image provided. Please input an image or an image path.')
+            return
+        if images is not None:
+            for image in images:
+                image = image[:, :, ::-1]
+                inputs.append(image)
+        if paths is not None:
+            for path in paths:
+                image = cv2.imread(path)[:, :, ::-1]
+                inputs.append(image)
+        batch_data = []
+        for idx, imagedata in enumerate(inputs):
+            for process in self.preprocess_funcs:
+                imagedata = process(imagedata)
+            batch_data.append(imagedata)
+            if len(batch_data) >= batch_size or idx == len(inputs) - 1:
+                batch_tensor = paddle.to_tensor(batch_data)
+                out = self.model(batch_tensor)
+                if isinstance(out, list):
+                    out = out[0]
+                if isinstance(out, dict) and "logits" in out:
+                    out = out["logits"]
+                if isinstance(out, dict) and "output" in out:
+                    out = out["output"]
+                result = postprocess_func(out)
+                results.extend(result)
+                batch_data.clear()
+        return results
+    @runnable
+    def run_cmd(self, argvs: list):
+        """
+        Run as a command.
+        """
+        self.parser = argparse.ArgumentParser(description="Run the {} module.".format(self.name),
+                                              prog='hub run {}'.format(self.name),
+                                              usage='%(prog)s',
+                                              add_help=True)
+        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
+        self.arg_config_group = self.parser.add_argument_group(
+            title="Config options", description="Run configuration for controlling module behavior, not required.")
+        self.add_module_config_arg()
+        self.add_module_input_arg()
+        self.args = self.parser.parse_args(argvs)
+        results = self.classification(paths=[self.args.input_path],
+                                      use_gpu=self.args.use_gpu,
+                                      batch_size=self.args.batch_size,
+                                      top_k=self.args.top_k)
+        return results
+    @serving
+    def serving_method(self, images, **kwargs):
+        """
+        Run as a service.
+        """
+        images_decode = [base64_to_cv2(image) for image in images]
+        results = self.classification(images=images_decode, **kwargs)
+        return results
+    def add_module_config_arg(self):
+        """
+        Add the command config options.
+        """
+        self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
+        self.arg_config_group.add_argument('--batch_size', type=int, default=1, help='batch size')
+        self.arg_config_group.add_argument('--top_k', type=int, default=1, help='Return top k results.')
+    def add_module_input_arg(self):
+        """
+        Add the command input options.
+        """
+        self.arg_input_group.add_argument('--input_path', type=str, help="path to input image.")
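Review note on the batching loop in `classification()`: inputs are accumulated into `batch_data` and flushed when the batch is full or the last input has been reached, so the final batch may be smaller than `batch_size`. A minimal, framework-free sketch of that flush condition follows; `iter_batches` is a hypothetical helper written for illustration and is not part of this PR.

```python
from typing import Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: List[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches; the final batch may be smaller."""
    batch: List[T] = []
    for idx, item in enumerate(items):
        batch.append(item)
        # Same flush condition as in classification(): the batch is full,
        # or this is the last input.
        if len(batch) >= batch_size or idx == len(items) - 1:
            yield batch
            # Start a fresh list instead of clearing, so the batch that was
            # just yielded is not mutated afterwards.
            batch = []


batches = list(iter_batches(["a", "b", "c", "d", "e"], batch_size=2))
print(batches)  # [['a', 'b'], ['c', 'd'], ['e']]
```

With five inputs and `batch_size=2`, the model would be called three times, the last time on a single image.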
--- a/modules/image/classification/levit_128s_imagenet/processor.py
+++ b/modules/image/classification/levit_128s_imagenet/processor.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+import base64
+import inspect
+import math
+import os
+import random
+import sys
+from functools import partial
+import cv2
+import numpy as np
+import paddle
+import paddle.nn.functional as F
+import six
+from paddle.vision.transforms import ColorJitter as RawColorJitter
+from PIL import Image
+def create_operators(params, class_num=None):
+    """
+    create operators based on the config
+    Args:
+        params(list): a dict list, used to create some operators
+    """
+    assert isinstance(params, list), ('operator config should be a list')
+    ops = []
+    current_module = sys.modules[__name__]
+    for operator in params:
+        assert isinstance(operator, dict) and len(operator) == 1, "yaml format error"
+        op_name = list(operator)[0]
+        param = {} if operator[op_name] is None else operator[op_name]
+        op_func = getattr(current_module, op_name)
+        if "class_num" in inspect.getfullargspec(op_func).args:
+            param.update({"class_num": class_num})
+        op = op_func(**param)
+        ops.append(op)
+    return ops
+class UnifiedResize(object):
+    def __init__(self, interpolation=None, backend="cv2"):
+        _cv2_interp_from_str = {
+            'nearest': cv2.INTER_NEAREST,
+            'bilinear': cv2.INTER_LINEAR,
+            'area': cv2.INTER_AREA,
+            'bicubic': cv2.INTER_CUBIC,
+            'lanczos': cv2.INTER_LANCZOS4
+        }
+        _pil_interp_from_str = {
+            'nearest': Image.NEAREST,
+            'bilinear': Image.BILINEAR,
+            'bicubic': Image.BICUBIC,
+            'box': Image.BOX,
+            'lanczos': Image.LANCZOS,
+            'hamming': Image.HAMMING
+        }
+        def _pil_resize(src, size, resample):
+            pil_img = Image.fromarray(src)
+            pil_img = pil_img.resize(size, resample)
+            return np.asarray(pil_img)
+        if backend.lower() == "cv2":
+            if isinstance(interpolation, str):
+                interpolation = _cv2_interp_from_str[interpolation.lower()]
+            # compatible with opencv < version 4.4.0
+            elif interpolation is None:
+                interpolation = cv2.INTER_LINEAR
+            self.resize_func = partial(cv2.resize, interpolation=interpolation)
+        elif backend.lower() == "pil":
+            if isinstance(interpolation, str):
+                interpolation = _pil_interp_from_str[interpolation.lower()]
+            self.resize_func = partial(_pil_resize, resample=interpolation)
+        else:
+            self.resize_func = cv2.resize
+    def __call__(self, src, size):
+        return self.resize_func(src, size)
+class OperatorParamError(ValueError):
+    """ OperatorParamError
+    """
+    pass
+class DecodeImage(object):
+    """ decode image """
+    def __init__(self, to_rgb=True, to_np=False, channel_first=False):
+        self.to_rgb = to_rgb
+        self.to_np = to_np  # to numpy
+        self.channel_first = channel_first  # only enabled when to_np is True
+    def __call__(self, img):
+        if six.PY2:
+            assert type(img) is str and len(img) > 0, "invalid input 'img' in DecodeImage"
+        else:
+            assert type(img) is bytes and len(img) > 0, "invalid input 'img' in DecodeImage"
+        data = np.frombuffer(img, dtype='uint8')
+        img = cv2.imdecode(data, 1)
+        if self.to_rgb:
+            assert img.shape[2] == 3, 'invalid shape of image[%s]' % (img.shape)
+            img = img[:, :, ::-1]
+        if self.channel_first:
+            img = img.transpose((2, 0, 1))
+        return img
+class ResizeImage(object):
+    """ resize image """
+    def __init__(self, size=None, resize_short=None, interpolation=None, backend="cv2"):
+        if resize_short is not None and resize_short > 0:
+            self.resize_short = resize_short
+            self.w = None
+            self.h = None
+        elif size is not None:
+            self.resize_short = None
+            self.w = size if type(size) is int else size[0]
+            self.h = size if type(size) is int else size[1]
+        else:
+            raise OperatorParamError("invalid params for ResizeImage: "
+                                     "both 'size' and 'resize_short' are None")
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        img_h, img_w = img.shape[:2]
+        if self.resize_short is not None:
+            percent = float(self.resize_short) / min(img_w, img_h)
+            w = int(round(img_w * percent))
+            h = int(round(img_h * percent))
+        else:
+            w = self.w
+            h = self.h
+        return self._resize_func(img, (w, h))
+class CropImage(object):
+    """ crop image """
+    def __init__(self, size):
+        if type(size) is int:
+            self.size = (size, size)
+        else:
+            self.size = size  # (w, h), matching the unpacking order in __call__
+    def __call__(self, img):
+        w, h = self.size
+        img_h, img_w = img.shape[:2]
+        w_start = (img_w - w) // 2
+        h_start = (img_h - h) // 2
+        w_end = w_start + w
+        h_end = h_start + h
+        return img[h_start:h_end, w_start:w_end, :]
+class RandCropImage(object):
+    """ random crop image """
+    def __init__(self, size, scale=None, ratio=None, interpolation=None, backend="cv2"):
+        if type(size) is int:
+            self.size = (size, size)  # (w, h), as passed to the resize function
+        else:
+            self.size = size
+        self.scale = [0.08, 1.0] if scale is None else scale
+        self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio
+        self._resize_func = UnifiedResize(interpolation=interpolation, backend=backend)
+    def __call__(self, img):
+        size = self.size
+        scale = self.scale
+        ratio = self.ratio
+        aspect_ratio = math.sqrt(random.uniform(*ratio))
+        w = 1. * aspect_ratio
+        h = 1. / aspect_ratio
+        img_h, img_w = img.shape[:2]
+        bound = min((float(img_w) / img_h) / (w**2), (float(img_h) / img_w) / (h**2))
+        scale_max = min(scale[1], bound)
+        scale_min = min(scale[0], bound)
+        target_area = img_w * img_h * random.uniform(scale_min, scale_max)
+        target_size = math.sqrt(target_area)
+        w = int(target_size * w)
+        h = int(target_size * h)
+        i = random.randint(0, img_w - w)
+        j = random.randint(0, img_h - h)
+        img = img[j:j + h, i:i + w, :]
+        return self._resize_func(img, size)
+class RandFlipImage(object):
+    """ random flip image
+        flip_code:
+            1: Flipped Horizontally
+            0: Flipped Vertically
+            -1: Flipped Horizontally & Vertically
+    """
+    def __init__(self, flip_code=1):
+        assert flip_code in [-1, 0, 1], "flip_code should be a value in [-1, 0, 1]"
+        self.flip_code = flip_code
+    def __call__(self, img):
+        if random.randint(0, 1) == 1:
+            return cv2.flip(img, self.flip_code)
+        else:
+            return img
+class NormalizeImage(object):
+    """ normalize image, e.g. subtract mean and divide by std
+    """
+    def __init__(self, scale=None, mean=None, std=None, order='chw', output_fp16=False, channel_num=3):
+        if isinstance(scale, str):
+            scale = eval(scale)
+        assert channel_num in [3, 4], "channel number of input image should be set to 3 or 4."
+        self.channel_num = channel_num
+        self.output_dtype = 'float16' if output_fp16 else 'float32'
+        self.scale = np.float32(scale if scale is not None else 1.0 / 255.0)
+        self.order = order
+        mean = mean if mean is not None else [0.485, 0.456, 0.406]
+        std = std if std is not None else [0.229, 0.224, 0.225]
+        shape = (3, 1, 1) if self.order == 'chw' else (1, 1, 3)
+        self.mean = np.array(mean).reshape(shape).astype('float32')
+        self.std = np.array(std).reshape(shape).astype('float32')
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        assert isinstance(img, np.ndarray), "invalid input 'img' in NormalizeImage"
+        img = (img.astype('float32') * self.scale - self.mean) / self.std
+        if self.channel_num == 4:
+            img_h = img.shape[1] if self.order == 'chw' else img.shape[0]
+            img_w = img.shape[2] if self.order == 'chw' else img.shape[1]
+            pad_zeros = np.zeros((1, img_h, img_w)) if self.order == 'chw' else np.zeros((img_h, img_w, 1))
+            img = (np.concatenate((img, pad_zeros), axis=0) if self.order == 'chw' else np.concatenate(
+                (img, pad_zeros), axis=2))
+        return img.astype(self.output_dtype)
+class ToCHWImage(object):
+    """ convert hwc image to chw image
+    """
+    def __init__(self):
+        pass
+    def __call__(self, img):
+        from PIL import Image
+        if isinstance(img, Image.Image):
+            img = np.array(img)
+        return img.transpose((2, 0, 1))
+class ColorJitter(RawColorJitter):
+    """ColorJitter.
+    """
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+    def __call__(self, img):
+        if not isinstance(img, Image.Image):
+            img = np.ascontiguousarray(img)
+            img = Image.fromarray(img)
+        img = super()._apply_image(img)
+        if isinstance(img, Image.Image):
+            img = np.asarray(img)
+        return img
+def base64_to_cv2(b64str):
+    data = base64.b64decode(b64str.encode('utf8'))
+    data = np.frombuffer(data, np.uint8)
+    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
+    return data
+class Topk(object):
+    def __init__(self, topk=1, class_id_map_file=None):
+        assert isinstance(topk, (int, ))
+        self.class_id_map = self.parse_class_id_map(class_id_map_file)
+        self.topk = topk
+    def parse_class_id_map(self, class_id_map_file):
+        if class_id_map_file is None:
+            return None
+        if not os.path.exists(class_id_map_file):
+            print(
+                "Warning: To use your own label_dict, please provide a valid path!\nOtherwise label_names will be empty!"
+            )
+            return None
+        try:
+            class_id_map = {}
+            with open(class_id_map_file, "r") as fin:
+                lines = fin.readlines()
+                for line in lines:
+                    partition = line.split("\n")[0].partition(" ")
+                    class_id_map[int(partition[0])] = str(partition[-1])
+        except Exception as ex:
+            print(ex)
+            class_id_map = None
+        return class_id_map
+    def __call__(self, x, file_names=None, multilabel=False):
+        assert isinstance(x, paddle.Tensor)
+        if file_names is not None:
+            assert x.shape[0] == len(file_names)
+        x = F.softmax(x, axis=-1) if not multilabel else F.sigmoid(x)
+        x = x.numpy()
+        y = []
+        for idx, probs in enumerate(x):
+            index = probs.argsort(axis=0)[-self.topk:][::-1].astype("int32") if not multilabel else np.where(
+                probs >= 0.5)[0].astype("int32")
+            clas_id_list = []
+            score_list = []
+            label_name_list = []
+            for i in index:
+                clas_id_list.append(i.item())
+                score_list.append(probs[i].item())
+                if self.class_id_map is not None:
+                    label_name_list.append(self.class_id_map[i.item()])
+            result = {
+                "class_ids": clas_id_list,
+                "scores": np.around(score_list, decimals=5).tolist(),
+            }
+            if file_names is not None:
+                result["file_name"] = file_names[idx]
+            if label_name_list is not None:
+                result["label_names"] = label_name_list
+            y.append(result)
+        return y
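Review note on `Topk.__call__`: in the single-label case it applies softmax to the logits, sorts the probabilities, keeps the `topk` highest indices in descending order, and maps them to label names when a class-id map is available. The sketch below illustrates that logic in pure Python under stated assumptions: `softmax` and `top_k` are hypothetical stand-ins for `F.softmax` and the NumPy `argsort` call, they handle one sample rather than a batch, and `label_names` is omitted (instead of being an empty list) when no map is given.

```python
import math
from typing import Dict, List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a single list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: List[float], k: int = 1,
          class_id_map: Optional[Dict[int, str]] = None) -> dict:
    """Return the k most probable class ids, highest probability first."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    result = {
        "class_ids": order,
        "scores": [round(probs[i], 5) for i in order],
    }
    if class_id_map is not None:
        result["label_names"] = [class_id_map[i] for i in order]
    return result


# Hypothetical three-class label map, for illustration only.
labels = {0: "cat", 1: "dog", 2: "fish"}
res = top_k([1.0, 3.0, 2.0], k=2, class_id_map=labels)
print(res["class_ids"], res["label_names"])  # [1, 2] ['dog', 'fish']
```

The returned dict uses the same keys (`class_ids`, `scores`, `label_names`) as the results documented in the `classification` docstring.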
--- a/modules/image/classification/levit_128s_imagenet/utils.py
+++ b/modules/image/classification/levit_128s_imagenet/utils.py
--- a/modules/image/classification/levit_192_imagenet/README.md
+++ b/modules/image/classification/levit_192_imagenet/README.md
--- a/modules/image/classification/levit_192_imagenet/model.py
+++ b/modules/image/classification/levit_192_imagenet/model.py
--- a/modules/image/classification/levit_192_imagenet/module.py
+++ b/modules/image/classification/levit_192_imagenet/module.py
--- a/modules/image/classification/levit_192_imagenet/processor.py
+++ b/modules/image/classification/levit_192_imagenet/processor.py
--- a/modules/image/classification/levit_192_imagenet/utils.py
+++ b/modules/image/classification/levit_192_imagenet/utils.py
--- a/modules/image/classification/levit_256_imagenet/README.md
+++ b/modules/image/classification/levit_256_imagenet/README.md
--- a/modules/image/classification/levit_256_imagenet/model.py
+++ b/modules/image/classification/levit_256_imagenet/model.py
--- a/modules/image/classification/levit_256_imagenet/module.py
+++ b/modules/image/classification/levit_256_imagenet/module.py
--- a/modules/image/classification/levit_256_imagenet/processor.py
+++ b/modules/image/classification/levit_256_imagenet/processor.py
--- a/modules/image/classification/levit_256_imagenet/utils.py
+++ b/modules/image/classification/levit_256_imagenet/utils.py
--- a/modules/image/classification/levit_384_imagenet/README.md
+++ b/modules/image/classification/levit_384_imagenet/README.md
--- a/modules/image/classification/levit_384_imagenet/model.py
+++ b/modules/image/classification/levit_384_imagenet/model.py
--- a/modules/image/classification/levit_384_imagenet/module.py
+++ b/modules/image/classification/levit_384_imagenet/module.py
--- a/modules/image/classification/levit_384_imagenet/processor.py
+++ b/modules/image/classification/levit_384_imagenet/processor.py
--- a/modules/image/classification/levit_384_imagenet/utils.py
+++ b/modules/image/classification/levit_384_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x0_25_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x0_25_imagenet/README.md
--- a/modules/image/classification/pplcnet_x0_25_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x0_25_imagenet/model.py
--- a/modules/image/classification/pplcnet_x0_25_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x0_25_imagenet/module.py
--- a/modules/image/classification/pplcnet_x0_25_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x0_25_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x0_25_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x0_25_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x0_35_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x0_35_imagenet/README.md
--- a/modules/image/classification/pplcnet_x0_35_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x0_35_imagenet/model.py
--- a/modules/image/classification/pplcnet_x0_35_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x0_35_imagenet/module.py
--- a/modules/image/classification/pplcnet_x0_35_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x0_35_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x0_35_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x0_35_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x0_5_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x0_5_imagenet/README.md
--- a/modules/image/classification/pplcnet_x0_5_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x0_5_imagenet/model.py
--- a/modules/image/classification/pplcnet_x0_5_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x0_5_imagenet/module.py
--- a/modules/image/classification/pplcnet_x0_5_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x0_5_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x0_5_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x0_5_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x0_75_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x0_75_imagenet/README.md
--- a/modules/image/classification/pplcnet_x0_75_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x0_75_imagenet/model.py
--- a/modules/image/classification/pplcnet_x0_75_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x0_75_imagenet/module.py
--- a/modules/image/classification/pplcnet_x0_75_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x0_75_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x0_75_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x0_75_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x1_0_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x1_0_imagenet/README.md
--- a/modules/image/classification/pplcnet_x1_0_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x1_0_imagenet/model.py
--- a/modules/image/classification/pplcnet_x1_0_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x1_0_imagenet/module.py
--- a/modules/image/classification/pplcnet_x1_0_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x1_0_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x1_0_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x1_0_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x1_5_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x1_5_imagenet/README.md
--- a/modules/image/classification/pplcnet_x1_5_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x1_5_imagenet/model.py
--- a/modules/image/classification/pplcnet_x1_5_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x1_5_imagenet/module.py
--- a/modules/image/classification/pplcnet_x1_5_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x1_5_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x1_5_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x1_5_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x2_0_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x2_0_imagenet/README.md
--- a/modules/image/classification/pplcnet_x2_0_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x2_0_imagenet/model.py
--- a/modules/image/classification/pplcnet_x2_0_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x2_0_imagenet/module.py
--- a/modules/image/classification/pplcnet_x2_0_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x2_0_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x2_0_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x2_0_imagenet/utils.py
--- a/modules/image/classification/pplcnet_x2_5_imagenet/README.md
+++ b/modules/image/classification/pplcnet_x2_5_imagenet/README.md
--- a/modules/image/classification/pplcnet_x2_5_imagenet/model.py
+++ b/modules/image/classification/pplcnet_x2_5_imagenet/model.py
--- a/modules/image/classification/pplcnet_x2_5_imagenet/module.py
+++ b/modules/image/classification/pplcnet_x2_5_imagenet/module.py
--- a/modules/image/classification/pplcnet_x2_5_imagenet/processor.py
+++ b/modules/image/classification/pplcnet_x2_5_imagenet/processor.py
--- a/modules/image/classification/pplcnet_x2_5_imagenet/utils.py
+++ b/modules/image/classification/pplcnet_x2_5_imagenet/utils.py
--- a/modules/text/text_generation/ernie_tiny/README.md
+++ b/modules/text/text_generation/ernie_tiny/README.md
--- a/modules/text/text_generation/ernie_tiny/README_en.md
+++ b/modules/text/text_generation/ernie_tiny/README_en.md