Unverified commit 595f4534, authored by H haoyuying, committed by GitHub

Add more semantic segmentation models

Parent 5e9c173b
# PaddleHub Image Segmentation
## Model Prediction
To run prediction with one of the pretrained models we provide, use a script like the following:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
    model = hub.Module(name='bisenetv2_cityscapes')
    img = cv2.imread("/PATH/TO/IMAGE")
    model.predict(images=[img], visualization=True)
```
## How to Start Fine-tuning
This example shows how to fine-tune a pretrained model with PaddleHub and use it for prediction.
After installing PaddlePaddle and PaddleHub, run `python train.py` to start fine-tuning the bisenetv2_cityscapes model on a dataset such as OpticDiscSeg.
## Code Steps
Fine-tuning with the PaddleHub Fine-tune API takes four steps.
### Step1: Define the data preprocessing pipeline
```python
from paddlehub.vision.segmentation_transforms import Compose, Resize, Normalize
transform = Compose([Resize(target_size=(512, 512)), Normalize()])
```
The `segmentation_transforms` module provides a rich set of preprocessing operations for image segmentation data; swap in the operations your task needs.
### Step2: Download and load the dataset
```python
from paddlehub.datasets import OpticDiscSeg
train_reader = OpticDiscSeg(transform, mode='train')
```
* `transform`: the data preprocessing pipeline.
* `mode`: the dataset split; one of `train`, `test`, or `val`. Defaults to `train`.
The dataset preparation code is in [opticdiscseg.py](../../paddlehub/datasets/opticdiscseg.py). `hub.datasets.OpticDiscSeg()` automatically downloads the dataset and extracts it into `$HOME/.paddlehub/dataset` under the user's home directory.
### Step3: Load the pretrained model
```python
model = hub.Module(name='bisenetv2_cityscapes', num_classes=2, pretrained=None)
```
* `name`: the name of the pretrained model.
* `num_classes`: the number of segmentation classes.
* `pretrained`: the path of your own trained parameters; if `None`, the default pretrained parameters are loaded.
### Step4: Choose the optimization strategy and runtime configuration
```python
scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.01, decay_steps=1000, power=0.9, end_lr=0.0001)
optimizer = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_ocr', use_gpu=True)
```
#### Optimization strategy
Paddle 2.0 provides a variety of optimizers, such as `SGD`, `Adam`, and `Adamax`. For `Adam`:
* `learning_rate`: the global learning rate.
* `parameters`: the model parameters to optimize.
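The `PolynomialDecay` scheduler in the snippet above decays the learning rate from `learning_rate` down to `end_lr` over `decay_steps` steps. A pure-Python sketch of the formula, as an illustration only and assuming the scheduler's default `cycle=False` behavior:

```python
def poly_lr(step, lr0=0.01, decay_steps=1000, power=0.9, end_lr=0.0001):
    """Polynomial decay: interpolates from lr0 to end_lr over decay_steps."""
    # Fraction of the decay horizon remaining; clamps to 0 after decay_steps.
    frac = 1.0 - min(step, decay_steps) / decay_steps
    return (lr0 - end_lr) * frac ** power + end_lr

print(poly_lr(0))     # full learning rate at step 0
print(poly_lr(1000))  # decayed to end_lr
```

With `power=0.9` the curve is close to linear but drops more steeply near the end of the decay horizon.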
#### Runtime configuration
`Trainer` drives the fine-tuning run and accepts the following parameters:
* `model`: the model to optimize;
* `optimizer`: the optimizer to use;
* `use_gpu`: whether to use the GPU; defaults to False;
* `use_vdl`: whether to visualize the training process with VisualDL;
* `checkpoint_dir`: the directory in which model parameters are saved;
* `compare_metrics`: the metric used to select the best model.
`trainer.train` drives the actual training loop and accepts the following parameters:
* `train_dataset`: the dataset used for training;
* `epochs`: the number of training epochs;
* `batch_size`: the training batch size; when using a GPU, adjust it to fit your memory;
* `num_workers`: the number of data-loading workers; defaults to 0;
* `eval_dataset`: the validation dataset;
* `log_interval`: how often to print logs, measured in training steps;
* `save_interval`: how often to save the model, measured in training epochs.
## Model Prediction
After fine-tuning, the model that performs best on the validation set is saved under `${CHECKPOINT_DIR}/best_model`, where `${CHECKPOINT_DIR}` is the checkpoint directory chosen for fine-tuning.
We use that model for prediction. The predict.py script looks like this:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
    model = hub.Module(name='bisenetv2_cityscapes', pretrained='/PATH/TO/CHECKPOINT')
    img = cv2.imread("/PATH/TO/IMAGE")
    model.predict(images=[img], visualization=True)
```
Once the parameters are configured, run `python predict.py`.
**Args**
* `images`: original image paths or images in BGR format;
* `visualization`: whether to visualize the result; defaults to True;
* `save_path`: the directory in which results are saved; defaults to 'seg_result'.
**NOTE:** For prediction, the module, checkpoint_dir, and dataset must be the same as those used for fine-tuning.
## Service Deployment
PaddleHub Serving can deploy an online image segmentation service.
### Step1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m bisenetv2_cityscapes
```
This deploys an image segmentation API service, listening on port 8866 by default.
**NOTE:** When predicting on a GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise no setup is needed.
### Step2: Send a prediction request
With the server configured, the following few lines of code send a prediction request and retrieve the result:
```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    # tobytes() replaces the deprecated tostring()
    return base64.b64encode(data.tobytes()).decode('utf8')

def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    # frombuffer() replaces the deprecated fromstring()
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data

# Send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/bisenetv2_cityscapes"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask = base64_to_cv2(r.json()["results"][0])
```
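A note on the helpers above: `cv2_to_base64` and `base64_to_cv2` must be exact inverses, since the server decodes what the client encodes. A stdlib-only sketch of that round trip, with dummy bytes standing in for real JPEG data:

```python
import base64

payload = bytes(range(256))                         # stand-in for cv2.imencode output
encoded = base64.b64encode(payload).decode('utf8')  # what the client puts in the JSON body
decoded = base64.b64decode(encoded.encode('utf8'))  # what the server recovers
assert decoded == payload                           # lossless round trip
```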
### Source Code
https://github.com/PaddlePaddle/PaddleSeg
### Dependencies
paddlepaddle >= 2.0.0
paddlehub >= 2.0.0
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
def SyncBatchNorm(*args, **kwargs):
"""In cpu environment nn.SyncBatchNorm does not have kernel so use nn.BatchNorm2D instead"""
if paddle.get_device() == 'cpu' or os.environ.get('PADDLESEG_EXPORT_STAGE'):
return nn.BatchNorm2D(*args, **kwargs)
else:
return nn.SyncBatchNorm(*args, **kwargs)
class ConvBNReLU(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self._conv = nn.Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
x = F.relu(x)
return x
class ConvBN(nn.Layer):
"""Basic conv bn layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self._conv = nn.Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
return x
class ConvReLUPool(nn.Layer):
"""Basic conv bn pool layer."""
def __init__(self, in_channels: int, out_channels: int):
super().__init__()
self.conv = nn.Conv2D(in_channels, out_channels, kernel_size=3, stride=1, padding=1, dilation=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.conv(x)
x = F.relu(x)
x = F.max_pool2d(x, kernel_size=2, stride=2)
return x
class SeparableConvBNReLU(nn.Layer):
"""Basic separable conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=in_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
self.piontwise_conv = ConvBNReLU(in_channels, out_channels, kernel_size=1, groups=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
x = self.piontwise_conv(x)
return x
class DepthwiseConvBN(nn.Layer):
"""Basic depthwise conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=out_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
return x
class AuxLayer(nn.Layer):
"""
The auxiliary layer implementation for auxiliary loss.
Args:
in_channels (int): The number of input channels.
inter_channels (int): The intermediate channels.
out_channels (int): The number of output channels, and usually it is num_classes.
dropout_prob (float, optional): The drop rate. Default: 0.1.
"""
def __init__(self, in_channels: int, inter_channels: int, out_channels: int, dropout_prob: float = 0.1):
super().__init__()
self.conv_bn_relu = ConvBNReLU(in_channels=in_channels, out_channels=inter_channels, kernel_size=3, padding=1)
self.dropout = nn.Dropout(p=dropout_prob)
self.conv = nn.Conv2D(in_channels=inter_channels, out_channels=out_channels, kernel_size=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.conv_bn_relu(x)
x = self.dropout(x)
x = self.conv(x)
return x
class Activation(nn.Layer):
"""
The wrapper of activations.
Args:
act (str, optional): The activation name in lowercase. It must be one of ['elu', 'gelu',
'hardshrink', 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid',
'softmax', 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax',
'hsigmoid']. Default: None, means identical transformation.
Returns:
A callable object of Activation.
Raises:
KeyError: When parameter `act` is not in the optional range.
Examples:
from paddleseg.models.common.activation import Activation
relu = Activation("relu")
print(relu)
# <class 'paddle.nn.layer.activation.ReLU'>
sigmoid = Activation("sigmoid")
print(sigmoid)
# <class 'paddle.nn.layer.activation.Sigmoid'>
not_exit_one = Activation("not_exit_one")
# KeyError: "not_exit_one does not exist in the current dict_keys(['elu', 'gelu', 'hardshrink',
# 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid', 'softmax',
# 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax', 'hsigmoid'])"
"""
def __init__(self, act: str = None):
super(Activation, self).__init__()
self._act = act
upper_act_names = nn.layer.activation.__dict__.keys()
lower_act_names = [act.lower() for act in upper_act_names]
act_dict = dict(zip(lower_act_names, upper_act_names))
if act is not None:
if act in act_dict.keys():
act_name = act_dict[act]
self.act_func = getattr(nn.layer.activation, act_name)()
else:
raise KeyError("{} does not exist in the current {}".format(act, act_dict.keys()))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
if self._act is not None:
return self.act_func(x)
else:
return x
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Union, List, Tuple
import numpy as np
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddlehub.module.module import moduleinfo
import paddlehub.vision.segmentation_transforms as T
from paddlehub.module.cv_module import ImageSegmentationModule
import bisenet_cityscapes.layers as layers
@moduleinfo(
name="bisenetv2_cityscapes",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="Bisenet is a segmentation model trained by Cityscapes.",
version="1.0.0",
meta=ImageSegmentationModule)
class BiSeNetV2(nn.Layer):
"""
The BiSeNet V2 implementation based on PaddlePaddle.
The original article refers to
Yu, Changqian, et al. "BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation"
(https://arxiv.org/abs/2004.02147)
Args:
num_classes (int): The unique number of target classes, default is 19.
lambd (float, optional): A factor for controlling the size of semantic branch channels. Default: 0.25.
align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even,
e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False.
pretrained (str, optional): The path or url of pretrained model. Default: None.
"""
def __init__(self, num_classes: int = 19, lambd: float = 0.25, align_corners: bool = False, pretrained: str = None):
super(BiSeNetV2, self).__init__()
C1, C2, C3 = 64, 64, 128
db_channels = (C1, C2, C3)
C1, C3, C4, C5 = int(C1 * lambd), int(C3 * lambd), 64, 128
sb_channels = (C1, C3, C4, C5)
mid_channels = 128
self.db = DetailBranch(db_channels)
self.sb = SemanticBranch(sb_channels)
self.bga = BGA(mid_channels, align_corners)
self.aux_head1 = SegHead(C1, C1, num_classes)
self.aux_head2 = SegHead(C3, C3, num_classes)
self.aux_head3 = SegHead(C4, C4, num_classes)
self.aux_head4 = SegHead(C5, C5, num_classes)
self.head = SegHead(mid_channels, mid_channels, num_classes)
self.align_corners = align_corners
self.transforms = T.Compose([T.Normalize()])
if pretrained is not None:
model_dict = paddle.load(pretrained)
self.set_dict(model_dict)
print("load custom parameters success")
else:
checkpoint = os.path.join(self.directory, 'bisenet_model.pdparams')
model_dict = paddle.load(checkpoint)
self.set_dict(model_dict)
print("load pretrained parameters success")
def transform(self, img: Union[np.ndarray, str]) -> np.ndarray:
return self.transforms(img)
def forward(self, x: paddle.Tensor) -> List[paddle.Tensor]:
dfm = self.db(x)
feat1, feat2, feat3, feat4, sfm = self.sb(x)
logit = self.head(self.bga(dfm, sfm))
if not self.training:
logit_list = [logit]
else:
logit1 = self.aux_head1(feat1)
logit2 = self.aux_head2(feat2)
logit3 = self.aux_head3(feat3)
logit4 = self.aux_head4(feat4)
logit_list = [logit, logit1, logit2, logit3, logit4]
logit_list = [
F.interpolate(logit, paddle.shape(x)[2:], mode='bilinear', align_corners=self.align_corners)
for logit in logit_list
]
return logit_list
class StemBlock(nn.Layer):
def __init__(self, in_dim: int, out_dim: int):
super(StemBlock, self).__init__()
self.conv = layers.ConvBNReLU(in_dim, out_dim, 3, stride=2)
self.left = nn.Sequential(
layers.ConvBNReLU(out_dim, out_dim // 2, 1), layers.ConvBNReLU(out_dim // 2, out_dim, 3, stride=2))
self.right = nn.MaxPool2D(kernel_size=3, stride=2, padding=1)
self.fuse = layers.ConvBNReLU(out_dim * 2, out_dim, 3)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.conv(x)
left = self.left(x)
right = self.right(x)
concat = paddle.concat([left, right], axis=1)
return self.fuse(concat)
class ContextEmbeddingBlock(nn.Layer):
def __init__(self, in_dim: int, out_dim: int):
super(ContextEmbeddingBlock, self).__init__()
self.gap = nn.AdaptiveAvgPool2D(1)
self.bn = layers.SyncBatchNorm(in_dim)
self.conv_1x1 = layers.ConvBNReLU(in_dim, out_dim, 1)
self.conv_3x3 = nn.Conv2D(out_dim, out_dim, 3, 1, 1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
gap = self.gap(x)
bn = self.bn(gap)
conv1 = self.conv_1x1(bn) + x
return self.conv_3x3(conv1)
class GatherAndExpansionLayer1(nn.Layer):
"""Gather And Expansion Layer with stride 1"""
def __init__(self, in_dim: int, out_dim: int, expand: int):
super().__init__()
expand_dim = expand * in_dim
self.conv = nn.Sequential(
layers.ConvBNReLU(in_dim, in_dim, 3), layers.DepthwiseConvBN(in_dim, expand_dim, 3),
layers.ConvBN(expand_dim, out_dim, 1))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
return F.relu(self.conv(x) + x)
class GatherAndExpansionLayer2(nn.Layer):
"""Gather And Expansion Layer with stride 2"""
def __init__(self, in_dim: int, out_dim: int, expand: int):
super().__init__()
expand_dim = expand * in_dim
self.branch_1 = nn.Sequential(
layers.ConvBNReLU(in_dim, in_dim, 3), layers.DepthwiseConvBN(in_dim, expand_dim, 3, stride=2),
layers.DepthwiseConvBN(expand_dim, expand_dim, 3), layers.ConvBN(expand_dim, out_dim, 1))
self.branch_2 = nn.Sequential(
layers.DepthwiseConvBN(in_dim, in_dim, 3, stride=2), layers.ConvBN(in_dim, out_dim, 1))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
return F.relu(self.branch_1(x) + self.branch_2(x))
class DetailBranch(nn.Layer):
"""The detail branch of BiSeNet, which has wide channels but shallow layers."""
def __init__(self, in_channels: Tuple[int, int, int]):
super().__init__()
C1, C2, C3 = in_channels
self.convs = nn.Sequential(
# stage 1
layers.ConvBNReLU(3, C1, 3, stride=2),
layers.ConvBNReLU(C1, C1, 3),
# stage 2
layers.ConvBNReLU(C1, C2, 3, stride=2),
layers.ConvBNReLU(C2, C2, 3),
layers.ConvBNReLU(C2, C2, 3),
# stage 3
layers.ConvBNReLU(C2, C3, 3, stride=2),
layers.ConvBNReLU(C3, C3, 3),
layers.ConvBNReLU(C3, C3, 3),
)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
return self.convs(x)
class SemanticBranch(nn.Layer):
"""The semantic branch of BiSeNet, which has narrow channels but deep layers."""
def __init__(self, in_channels: Tuple[int, int, int, int]):
super().__init__()
C1, C3, C4, C5 = in_channels
self.stem = StemBlock(3, C1)
self.stage3 = nn.Sequential(GatherAndExpansionLayer2(C1, C3, 6), GatherAndExpansionLayer1(C3, C3, 6))
self.stage4 = nn.Sequential(GatherAndExpansionLayer2(C3, C4, 6), GatherAndExpansionLayer1(C4, C4, 6))
self.stage5_4 = nn.Sequential(
GatherAndExpansionLayer2(C4, C5, 6), GatherAndExpansionLayer1(C5, C5, 6), GatherAndExpansionLayer1(
C5, C5, 6), GatherAndExpansionLayer1(C5, C5, 6))
self.ce = ContextEmbeddingBlock(C5, C5)
def forward(self, x: paddle.Tensor) -> Tuple[paddle.Tensor, ...]:
stage2 = self.stem(x)
stage3 = self.stage3(stage2)
stage4 = self.stage4(stage3)
stage5_4 = self.stage5_4(stage4)
fm = self.ce(stage5_4)
return stage2, stage3, stage4, stage5_4, fm
class BGA(nn.Layer):
"""The Bilateral Guided Aggregation Layer, used to fuse the semantic features and spatial features."""
def __init__(self, out_dim: int, align_corners: bool):
super().__init__()
self.align_corners = align_corners
self.db_branch_keep = nn.Sequential(layers.DepthwiseConvBN(out_dim, out_dim, 3), nn.Conv2D(out_dim, out_dim, 1))
self.db_branch_down = nn.Sequential(
layers.ConvBN(out_dim, out_dim, 3, stride=2), nn.AvgPool2D(kernel_size=3, stride=2, padding=1))
self.sb_branch_keep = nn.Sequential(
layers.DepthwiseConvBN(out_dim, out_dim, 3), nn.Conv2D(out_dim, out_dim, 1),
layers.Activation(act='sigmoid'))
self.sb_branch_up = layers.ConvBN(out_dim, out_dim, 3)
self.conv = layers.ConvBN(out_dim, out_dim, 3)
def forward(self, dfm: paddle.Tensor, sfm: paddle.Tensor) -> paddle.Tensor:
db_feat_keep = self.db_branch_keep(dfm)
db_feat_down = self.db_branch_down(dfm)
sb_feat_keep = self.sb_branch_keep(sfm)
sb_feat_up = self.sb_branch_up(sfm)
sb_feat_up = F.interpolate(
sb_feat_up, paddle.shape(db_feat_keep)[2:], mode='bilinear', align_corners=self.align_corners)
sb_feat_up = F.sigmoid(sb_feat_up)
db_feat = db_feat_keep * sb_feat_up
sb_feat = db_feat_down * sb_feat_keep
sb_feat = F.interpolate(sb_feat, paddle.shape(db_feat)[2:], mode='bilinear', align_corners=self.align_corners)
return self.conv(db_feat + sb_feat)
class SegHead(nn.Layer):
def __init__(self, in_dim: int, mid_dim: int, num_classes: int):
super().__init__()
self.conv_3x3 = nn.Sequential(layers.ConvBNReLU(in_dim, mid_dim, 3), nn.Dropout(0.1))
self.conv_1x1 = nn.Conv2D(mid_dim, num_classes, 1, 1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
conv1 = self.conv_3x3(x)
conv2 = self.conv_1x1(conv1)
return conv2
# PaddleHub Image Segmentation
## Model Prediction
To run prediction with one of the pretrained models we provide, use a script like the following:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
    model = hub.Module(name='deeplabv3p_resnet50_cityscapes')
    img = cv2.imread("/PATH/TO/IMAGE")
    model.predict(images=[img], visualization=True)
```
## How to Start Fine-tuning
After installing PaddlePaddle and PaddleHub, run `python train.py` to start fine-tuning the deeplabv3p_resnet50_cityscapes model on a dataset such as OpticDiscSeg.
## Code Steps
Fine-tuning with the PaddleHub Fine-tune API takes four steps.
### Step1: Define the data preprocessing pipeline
```python
from paddlehub.vision.segmentation_transforms import Compose, Resize, Normalize
transform = Compose([Resize(target_size=(512, 512)), Normalize()])
```
The `segmentation_transforms` module provides a rich set of preprocessing operations for image segmentation data; swap in the operations your task needs.
### Step2: Download and load the dataset
```python
from paddlehub.datasets import OpticDiscSeg
train_reader = OpticDiscSeg(transform, mode='train')
```
* `transform`: the data preprocessing pipeline.
* `mode`: the dataset split; one of `train`, `test`, or `val`. Defaults to `train`.
The dataset preparation code is in [opticdiscseg.py](../../paddlehub/datasets/opticdiscseg.py). `hub.datasets.OpticDiscSeg()` automatically downloads the dataset and extracts it into `$HOME/.paddlehub/dataset` under the user's home directory.
### Step3: Load the pretrained model
```python
model = hub.Module(name='deeplabv3p_resnet50_cityscapes', num_classes=2, pretrained=None)
```
* `name`: the name of the pretrained model.
* `num_classes`: the number of segmentation classes.
* `pretrained`: the path of your own trained parameters; if `None`, the default pretrained parameters are loaded.
### Step4: Choose the optimization strategy and runtime configuration
```python
scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.01, decay_steps=1000, power=0.9, end_lr=0.0001)
optimizer = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_ocr', use_gpu=True)
```
#### Optimization strategy
Paddle 2.0 provides a variety of optimizers, such as `SGD`, `Adam`, and `Adamax`. For `Adam`:
* `learning_rate`: the global learning rate.
* `parameters`: the model parameters to optimize.
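As above, the `PolynomialDecay` scheduler decays the learning rate from `learning_rate` down to `end_lr` over `decay_steps` steps. A pure-Python sketch of the formula, as an illustration only and assuming the scheduler's default `cycle=False` behavior:

```python
def poly_lr(step, lr0=0.01, decay_steps=1000, power=0.9, end_lr=0.0001):
    """Polynomial decay: interpolates from lr0 to end_lr over decay_steps."""
    # Fraction of the decay horizon remaining; clamps to 0 after decay_steps.
    frac = 1.0 - min(step, decay_steps) / decay_steps
    return (lr0 - end_lr) * frac ** power + end_lr

print(poly_lr(0))     # full learning rate at step 0
print(poly_lr(1000))  # decayed to end_lr
```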
#### Runtime configuration
`Trainer` drives the fine-tuning run and accepts the following parameters:
* `model`: the model to optimize;
* `optimizer`: the optimizer to use;
* `use_gpu`: whether to use the GPU; defaults to False;
* `use_vdl`: whether to visualize the training process with VisualDL;
* `checkpoint_dir`: the directory in which model parameters are saved;
* `compare_metrics`: the metric used to select the best model.
`trainer.train` drives the actual training loop and accepts the following parameters:
* `train_dataset`: the dataset used for training;
* `epochs`: the number of training epochs;
* `batch_size`: the training batch size; when using a GPU, adjust it to fit your memory;
* `num_workers`: the number of data-loading workers; defaults to 0;
* `eval_dataset`: the validation dataset;
* `log_interval`: how often to print logs, measured in training steps;
* `save_interval`: how often to save the model, measured in training epochs.
## Model Prediction
After fine-tuning, the model that performs best on the validation set is saved under `${CHECKPOINT_DIR}/best_model`, where `${CHECKPOINT_DIR}` is the checkpoint directory chosen for fine-tuning.
We use that model for prediction. The predict.py script looks like this:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
    model = hub.Module(name='deeplabv3p_resnet50_cityscapes', pretrained='/PATH/TO/CHECKPOINT')
    img = cv2.imread("/PATH/TO/IMAGE")
    model.predict(images=[img], visualization=True)
```
Once the parameters are configured, run `python predict.py`.
**Args**
* `images`: original image paths or images in BGR format;
* `visualization`: whether to visualize the result; defaults to True;
* `save_path`: the directory in which results are saved; defaults to 'seg_result'.
**NOTE:** For prediction, the module, checkpoint_dir, and dataset must be the same as those used for fine-tuning.
## Service Deployment
PaddleHub Serving can deploy an online image segmentation service.
### Step1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m deeplabv3p_resnet50_cityscapes
```
This deploys an image segmentation API service, listening on port 8866 by default.
**NOTE:** When predicting on a GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise no setup is needed.
### Step2: Send a prediction request
With the server configured, the following few lines of code send a prediction request and retrieve the result:
```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    # tobytes() replaces the deprecated tostring()
    return base64.b64encode(data.tobytes()).decode('utf8')

def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    # frombuffer() replaces the deprecated fromstring()
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data

# Send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/deeplabv3p_resnet50_cityscapes"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask = base64_to_cv2(r.json()["results"][0])
```
### Source Code
https://github.com/PaddlePaddle/PaddleSeg
### Dependencies
paddlepaddle >= 2.0.0
paddlehub >= 2.0.0
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.nn import Conv2D, AvgPool2D
def SyncBatchNorm(*args, **kwargs):
"""In cpu environment nn.SyncBatchNorm does not have kernel so use nn.BatchNorm2D instead"""
if paddle.get_device() == 'cpu':
return nn.BatchNorm2D(*args, **kwargs)
else:
return nn.SyncBatchNorm(*args, **kwargs)
class ConvBNLayer(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self,
in_channels: int,
out_channels: int,
kernel_size: int,
stride: int = 1,
dilation: int = 1,
groups: int = 1,
is_vd_mode: bool = False,
act: str = None,
name: str = None):
super(ConvBNLayer, self).__init__()
self.is_vd_mode = is_vd_mode
self._pool2d_avg = AvgPool2D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
self._conv = Conv2D(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=kernel_size,
stride=stride,
padding=(kernel_size - 1) // 2 if dilation == 1 else 0,
dilation=dilation,
groups=groups,
bias_attr=False)
self._batch_norm = SyncBatchNorm(out_channels)
self._act_op = Activation(act=act)
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
if self.is_vd_mode:
inputs = self._pool2d_avg(inputs)
y = self._conv(inputs)
y = self._batch_norm(y)
y = self._act_op(y)
return y
class BottleneckBlock(nn.Layer):
"""Residual bottleneck block"""
def __init__(self,
in_channels: int,
out_channels: int,
stride: int,
shortcut: bool = True,
if_first: bool = False,
dilation: int = 1,
name: str = None):
super(BottleneckBlock, self).__init__()
self.conv0 = ConvBNLayer(
in_channels=in_channels, out_channels=out_channels, kernel_size=1, act='relu', name=name + "_branch2a")
self.dilation = dilation
self.conv1 = ConvBNLayer(
in_channels=out_channels,
out_channels=out_channels,
kernel_size=3,
stride=stride,
act='relu',
dilation=dilation,
name=name + "_branch2b")
self.conv2 = ConvBNLayer(
in_channels=out_channels, out_channels=out_channels * 4, kernel_size=1, act=None, name=name + "_branch2c")
if not shortcut:
self.short = ConvBNLayer(
in_channels=in_channels,
out_channels=out_channels * 4,
kernel_size=1,
stride=1,
is_vd_mode=False if if_first or stride == 1 else True,
name=name + "_branch1")
self.shortcut = shortcut
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
y = self.conv0(inputs)
if self.dilation > 1:
padding = self.dilation
y = F.pad(y, [padding, padding, padding, padding])
conv1 = self.conv1(y)
conv2 = self.conv2(conv1)
if self.shortcut:
short = inputs
else:
short = self.short(inputs)
y = paddle.add(x=short, y=conv2)
y = F.relu(y)
return y
class SeparableConvBNReLU(nn.Layer):
"""Depthwise Separable Convolution."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(SeparableConvBNReLU, self).__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=in_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
self.piontwise_conv = ConvBNReLU(in_channels, out_channels, kernel_size=1, groups=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
x = self.piontwise_conv(x)
return x
class ConvBN(nn.Layer):
"""Basic conv bn layer"""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBN, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
return x
class ConvBNReLU(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBNReLU, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
x = F.relu(x)
return x
class Activation(nn.Layer):
"""
The wrapper of activations.
Args:
act (str, optional): The activation name in lowercase. It must be one of ['elu', 'gelu',
'hardshrink', 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid',
'softmax', 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax',
'hsigmoid']. Default: None, means identical transformation.
Returns:
A callable object of Activation.
Raises:
KeyError: When parameter `act` is not in the optional range.
Examples:
from paddleseg.models.common.activation import Activation
relu = Activation("relu")
print(relu)
# <class 'paddle.nn.layer.activation.ReLU'>
sigmoid = Activation("sigmoid")
print(sigmoid)
# <class 'paddle.nn.layer.activation.Sigmoid'>
not_exit_one = Activation("not_exit_one")
# KeyError: "not_exit_one does not exist in the current dict_keys(['elu', 'gelu', 'hardshrink',
# 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid', 'softmax',
# 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax', 'hsigmoid'])"
"""
def __init__(self, act: str = None):
super(Activation, self).__init__()
self._act = act
upper_act_names = nn.layer.activation.__dict__.keys()
lower_act_names = [act.lower() for act in upper_act_names]
act_dict = dict(zip(lower_act_names, upper_act_names))
if act is not None:
if act in act_dict.keys():
act_name = act_dict[act]
self.act_func = getattr(nn.layer.activation, act_name)()
else:
raise KeyError("{} does not exist in the current {}".format(act, act_dict.keys()))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
if self._act is not None:
return self.act_func(x)
else:
return x
class ASPPModule(nn.Layer):
"""
Atrous Spatial Pyramid Pooling.
Args:
aspp_ratios (tuple): The dilation rates used in the ASPP module.
in_channels (int): The number of input channels.
out_channels (int): The number of output channels.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
use_sep_conv (bool, optional): If using separable conv in ASPP module. Default: False.
image_pooling (bool, optional): If augmented with image-level features. Default: False
"""
def __init__(self,
aspp_ratios: tuple,
in_channels: int,
out_channels: int,
align_corners: bool,
use_sep_conv: bool = False,
image_pooling: bool = False):
super().__init__()
self.align_corners = align_corners
self.aspp_blocks = nn.LayerList()
for ratio in aspp_ratios:
if use_sep_conv and ratio > 1:
conv_func = SeparableConvBNReLU
else:
conv_func = ConvBNReLU
block = conv_func(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=1 if ratio == 1 else 3,
dilation=ratio,
padding=0 if ratio == 1 else ratio)
self.aspp_blocks.append(block)
out_size = len(self.aspp_blocks)
if image_pooling:
self.global_avg_pool = nn.Sequential(
nn.AdaptiveAvgPool2D(output_size=(1, 1)),
ConvBNReLU(in_channels, out_channels, kernel_size=1, bias_attr=False))
out_size += 1
self.image_pooling = image_pooling
self.conv_bn_relu = ConvBNReLU(in_channels=out_channels * out_size, out_channels=out_channels, kernel_size=1)
self.dropout = nn.Dropout(p=0.1) # drop rate
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
outputs = []
for block in self.aspp_blocks:
y = block(x)
y = F.interpolate(y, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(y)
if self.image_pooling:
img_avg = self.global_avg_pool(x)
img_avg = F.interpolate(img_avg, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(img_avg)
x = paddle.concat(outputs, axis=1)
x = self.conv_bn_relu(x)
x = self.dropout(x)
return x
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Union, List, Tuple
import paddle
from paddle import nn
import paddle.nn.functional as F
import numpy as np
from paddlehub.module.module import moduleinfo
import paddlehub.vision.segmentation_transforms as T
from paddlehub.module.cv_module import ImageSegmentationModule
from deeplabv3p_resnet50_cityscapes.resnet import ResNet50_vd
import deeplabv3p_resnet50_cityscapes.layers as L
@moduleinfo(
name="deeplabv3p_resnet50_cityscapes",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="DeepLabV3PResnet50 is a segmentation model.",
version="1.0.0",
meta=ImageSegmentationModule)
class DeepLabV3PResnet50(nn.Layer):
"""
The DeepLabV3PResnet50 implementation based on PaddlePaddle.
The original article refers to
    Liang-Chieh Chen, et al. "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation"
(https://arxiv.org/abs/1802.02611)
Args:
num_classes (int): the unique number of target classes.
backbone_indices (tuple): two values in the tuple indicate the indices of output of backbone.
the first index will be taken as a low-level feature in Decoder component;
the second one will be taken as input of ASPP component.
            Usually the backbone consists of four downsampling stages, each returning an output,
            so we set the default to (0, 3), which means taking the feature map of the first
            stage in the backbone as the low-level feature used in Decoder, and the feature map of the fourth
            stage as the input of ASPP.
        aspp_ratios (tuple): The dilation rates used in the ASPP module.
            If output_stride=16, aspp_ratios should be set to (1, 6, 12, 18).
            If output_stride=8, aspp_ratios is (1, 12, 24, 36).
aspp_out_channels (int): the output channels of ASPP module.
align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even,
e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False.
pretrained (str): the path of pretrained model. Default to None.
"""
def __init__(self,
num_classes: int = 19,
backbone_indices: Tuple[int] = (0, 3),
aspp_ratios: Tuple[int] = (1, 12, 24, 36),
aspp_out_channels: int = 256,
align_corners=False,
pretrained: str = None):
super(DeepLabV3PResnet50, self).__init__()
self.backbone = ResNet50_vd()
backbone_channels = [self.backbone.feat_channels[i] for i in backbone_indices]
self.head = DeepLabV3PHead(num_classes, backbone_indices, backbone_channels, aspp_ratios, aspp_out_channels,
align_corners)
self.align_corners = align_corners
self.transforms = T.Compose([T.Normalize()])
if pretrained is not None:
model_dict = paddle.load(pretrained)
self.set_dict(model_dict)
print("load custom parameters success")
else:
checkpoint = os.path.join(self.directory, 'model.pdparams')
model_dict = paddle.load(checkpoint)
self.set_dict(model_dict)
print("load pretrained parameters success")
def transform(self, img: Union[np.ndarray, str]) -> Union[np.ndarray, str]:
return self.transforms(img)
def forward(self, x: paddle.Tensor) -> List[paddle.Tensor]:
feat_list = self.backbone(x)
logit_list = self.head(feat_list)
return [
F.interpolate(logit, x.shape[2:], mode='bilinear', align_corners=self.align_corners) for logit in logit_list
]
class DeepLabV3PHead(nn.Layer):
"""
The DeepLabV3PHead implementation based on PaddlePaddle.
Args:
num_classes (int): The unique number of target classes.
backbone_indices (tuple): Two values in the tuple indicate the indices of output of backbone.
the first index will be taken as a low-level feature in Decoder component;
the second one will be taken as input of ASPP component.
            Usually the backbone consists of four downsampling stages, each returning an output.
            If we set backbone_indices to (0, 3), it means taking the feature map of the first
            stage in the backbone as the low-level feature used in Decoder, and the feature map of the fourth
            stage as the input of ASPP.
backbone_channels (tuple): The same length with "backbone_indices". It indicates the channels of corresponding index.
        aspp_ratios (tuple): The dilation rates used in the ASPP module.
aspp_out_channels (int): The output channels of ASPP module.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
"""
    def __init__(self, num_classes: int, backbone_indices: Tuple[int], backbone_channels: Tuple[int],
                 aspp_ratios: Tuple[int], aspp_out_channels: int, align_corners: bool):
super().__init__()
self.aspp = L.ASPPModule(
aspp_ratios, backbone_channels[1], aspp_out_channels, align_corners, use_sep_conv=True, image_pooling=True)
self.decoder = Decoder(num_classes, backbone_channels[0], align_corners)
self.backbone_indices = backbone_indices
def forward(self, feat_list: List[paddle.Tensor]) -> List[paddle.Tensor]:
logit_list = []
low_level_feat = feat_list[self.backbone_indices[0]]
x = feat_list[self.backbone_indices[1]]
x = self.aspp(x)
logit = self.decoder(x, low_level_feat)
logit_list.append(logit)
return logit_list
class Decoder(nn.Layer):
"""
Decoder module of DeepLabV3P model
Args:
num_classes (int): The number of classes.
in_channels (int): The number of input channels in decoder module.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
"""
def __init__(self, num_classes: int, in_channels: int, align_corners: bool):
super(Decoder, self).__init__()
self.conv_bn_relu1 = L.ConvBNReLU(in_channels=in_channels, out_channels=48, kernel_size=1)
self.conv_bn_relu2 = L.SeparableConvBNReLU(in_channels=304, out_channels=256, kernel_size=3, padding=1)
self.conv_bn_relu3 = L.SeparableConvBNReLU(in_channels=256, out_channels=256, kernel_size=3, padding=1)
self.conv = nn.Conv2D(in_channels=256, out_channels=num_classes, kernel_size=1)
self.align_corners = align_corners
def forward(self, x: paddle.Tensor, low_level_feat: paddle.Tensor) -> paddle.Tensor:
low_level_feat = self.conv_bn_relu1(low_level_feat)
x = F.interpolate(x, low_level_feat.shape[2:], mode='bilinear', align_corners=self.align_corners)
x = paddle.concat([x, low_level_feat], axis=1)
x = self.conv_bn_relu2(x)
x = self.conv_bn_relu3(x)
x = self.conv(x)
return x
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import Union, List, Tuple
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
import deeplabv3p_resnet50_cityscapes.layers as L
class BasicBlock(nn.Layer):
def __init__(self,
in_channels: int,
out_channels: int,
stride: int,
shortcut: bool = True,
if_first: bool = False,
name: str = None):
super(BasicBlock, self).__init__()
self.stride = stride
self.conv0 = L.ConvBNLayer(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=3,
stride=stride,
act='relu',
name=name + "_branch2a")
self.conv1 = L.ConvBNLayer(
in_channels=out_channels, out_channels=out_channels, kernel_size=3, act=None, name=name + "_branch2b")
if not shortcut:
self.short = L.ConvBNLayer(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=1,
stride=1,
is_vd_mode=False if if_first else True,
name=name + "_branch1")
self.shortcut = shortcut
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
y = self.conv0(inputs)
conv1 = self.conv1(y)
if self.shortcut:
short = inputs
else:
short = self.short(inputs)
        y = paddle.add(x=short, y=conv1)
        y = F.relu(y)
return y
class ResNet50_vd(nn.Layer):
def __init__(self, multi_grid: Tuple[int] = (1, 2, 4)):
super(ResNet50_vd, self).__init__()
depth = [3, 4, 6, 3]
num_channels = [64, 256, 512, 1024]
num_filters = [64, 128, 256, 512]
self.feat_channels = [c * 4 for c in num_filters]
dilation_dict = {2: 2, 3: 4}
self.conv1_1 = L.ConvBNLayer(
in_channels=3, out_channels=32, kernel_size=3, stride=2, act='relu', name="conv1_1")
self.conv1_2 = L.ConvBNLayer(
in_channels=32, out_channels=32, kernel_size=3, stride=1, act='relu', name="conv1_2")
self.conv1_3 = L.ConvBNLayer(
in_channels=32, out_channels=64, kernel_size=3, stride=1, act='relu', name="conv1_3")
self.pool2d_max = nn.MaxPool2D(kernel_size=3, stride=2, padding=1)
self.stage_list = []
for block in range(len(depth)):
shortcut = False
block_list = []
for i in range(depth[block]):
conv_name = "res" + str(block + 2) + chr(97 + i)
dilation_rate = dilation_dict[block] if dilation_dict and block in dilation_dict else 1
if block == 3:
dilation_rate = dilation_rate * multi_grid[i]
bottleneck_block = self.add_sublayer(
'bb_%d_%d' % (block, i),
L.BottleneckBlock(
in_channels=num_channels[block] if i == 0 else num_filters[block] * 4,
out_channels=num_filters[block],
stride=2 if i == 0 and block != 0 and dilation_rate == 1 else 1,
shortcut=shortcut,
if_first=block == i == 0,
name=conv_name,
dilation=dilation_rate))
block_list.append(bottleneck_block)
shortcut = True
self.stage_list.append(block_list)
    def forward(self, inputs: paddle.Tensor) -> List[paddle.Tensor]:
y = self.conv1_1(inputs)
y = self.conv1_2(y)
y = self.conv1_3(y)
y = self.pool2d_max(y)
feat_list = []
for stage in self.stage_list:
for block in stage:
y = block(y)
feat_list.append(y)
return feat_list
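The stride/dilation bookkeeping in `ResNet50_vd.__init__` is compact and easy to misread. The sketch below reproduces the same loop (illustration only, not part of the module): stage 2 uses dilation 2, stage 3 applies the multi-grid rates (4, 8, 16), and neither stage strides, which yields an output stride of 8.

```python
# Reproduce the stride/dilation assignment from ResNet50_vd.__init__:
# with dilation_dict = {2: 2, 3: 4}, the last two stages trade striding
# for dilation.
depth = [3, 4, 6, 3]
dilation_dict = {2: 2, 3: 4}
multi_grid = (1, 2, 4)

layout = []
for block in range(len(depth)):
    for i in range(depth[block]):
        dilation_rate = dilation_dict.get(block, 1)
        if block == 3:
            dilation_rate *= multi_grid[i]
        stride = 2 if i == 0 and block != 0 and dilation_rate == 1 else 1
        layout.append((block, i, stride, dilation_rate))

# Stage 3 applies the multi-grid rates and never strides.
assert [d for b, i, s, d in layout if b == 3] == [4, 8, 16]
assert all(s == 1 for b, i, s, d in layout if b in (2, 3))
```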
# PaddleHub Image Segmentation
## Model Prediction
To run prediction with the pretrained model we provide, use the following script:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fastscnn_cityscapes')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
## Getting Started with Fine-tuning
After installing PaddlePaddle and PaddleHub, run `python train.py` to start fine-tuning the fastscnn_cityscapes model on datasets such as OpticDiscSeg.
## Code Walkthrough
Fine-tuning with the PaddleHub Fine-tune API takes four steps.
### Step 1: Define the data preprocessing pipeline
```python
from paddlehub.vision.segmentation_transforms import Compose, Resize, Normalize
transform = Compose([Resize(target_size=(512, 512)), Normalize()])
```
The `segmentation_transforms` module provides a rich set of preprocessing and augmentation operations for image segmentation data; swap in whichever operations your task needs.
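As a rough sketch of what `Normalize()` does to each pixel, assuming the commonly used defaults mean=0.5 and std=0.5 (check the PaddleHub source for the actual defaults):

```python
# A sketch of per-pixel normalization, assuming mean=0.5 and std=0.5
# (hypothetical defaults for illustration): the pixel is scaled to
# [0, 1], then standardized.
def normalize_pixel(value, mean=0.5, std=0.5):
    return (value / 255.0 - mean) / std

assert normalize_pixel(0) == -1.0
assert normalize_pixel(255) == 1.0
```

With these defaults, pixel values are mapped from [0, 255] to [-1, 1].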
### Step 2: Download and load the dataset
```python
from paddlehub.datasets import OpticDiscSeg
train_reader = OpticDiscSeg(transform=transform, mode='train')
```
* `transform`: the data preprocessing pipeline.
* `mode`: the dataset split to load; one of `train`, `test`, `val`. Default: `train`.
The dataset preparation code can be found in [opticdiscseg.py](../../paddlehub/datasets/opticdiscseg.py). `hub.datasets.OpticDiscSeg()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user directory.
### Step 3: Load the pretrained model
```python
model = hub.Module(name='fastscnn_cityscapes', num_classes=2, pretrained=None)
```
* `name`: the name of the pretrained model.
* `num_classes`: the number of segmentation classes.
* `pretrained`: the path to your own trained weights; if None, the model's default pretrained parameters are loaded.
### Step 4: Choose an optimization strategy and runtime configuration
```python
from paddlehub.finetune.trainer import Trainer

scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.01, decay_steps=1000, power=0.9, end_lr=0.0001)
optimizer = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_ocr', use_gpu=True)
```
#### Optimization strategy
Paddle 2.0 offers a variety of optimizers, such as `SGD`, `Adam`, and `Adamax`. For `Adam`:
* `learning_rate`: the global learning rate.
* `parameters`: the model parameters to optimize.
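For intuition, the polynomial schedule configured above can be sketched in plain Python. This mirrors the documented behavior of `PolynomialDecay` with `cycle=False`; treat it as an illustration, not the authoritative implementation:

```python
# Hedged sketch of the PolynomialDecay schedule configured above
# (cycle=False); see the Paddle docs for the authoritative formula.
def poly_lr(step, base_lr=0.01, decay_steps=1000, power=0.9, end_lr=0.0001):
    step = min(step, decay_steps)
    return (base_lr - end_lr) * (1 - step / decay_steps) ** power + end_lr

assert abs(poly_lr(0) - 0.01) < 1e-12  # starts at the base rate
assert poly_lr(1000) == 0.0001         # reaches end_lr at decay_steps
assert poly_lr(2000) == 0.0001         # then stays clamped
```

The learning rate decays from 0.01 toward `end_lr=0.0001` over `decay_steps=1000` steps and stays there.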
#### Runtime configuration
`Trainer` drives the fine-tuning process and accepts the following parameters:
* `model`: the model to be optimized;
* `optimizer`: the optimizer to use;
* `use_gpu`: whether to use the GPU, default False;
* `use_vdl`: whether to visualize the training process with VisualDL;
* `checkpoint_dir`: the directory where model parameters are saved;
* `compare_metrics`: the metric used to select the best model.
`trainer.train` controls the training loop itself and accepts the following parameters:
* `train_dataset`: the dataset used for training;
* `epochs`: the number of training epochs;
* `batch_size`: the training batch size; when using a GPU, adjust it according to the available memory;
* `num_workers`: the number of data-loading workers, default 0;
* `eval_dataset`: the validation dataset;
* `log_interval`: the logging interval, measured in training steps;
* `save_interval`: the checkpoint-saving interval, measured in epochs.
## Model Prediction
After fine-tuning, the model that performed best on the validation set is saved under `${CHECKPOINT_DIR}/best_model`, where `${CHECKPOINT_DIR}` is the checkpoint directory chosen for fine-tuning.
We use this model for prediction. The predict.py script looks like this:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fastscnn_cityscapes', pretrained='/PATH/TO/CHECKPOINT')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
Once the parameters are configured, run `python predict.py`.
**Args**
* `images`: image paths or BGR-format images;
* `visualization`: whether to visualize the results, default True;
* `save_path`: the directory where results are saved, default 'seg_result'.
**NOTE:** at prediction time, the module, checkpoint_dir, and dataset must match those used for fine-tuning.
## Serving Deployment
PaddleHub Serving can deploy an online image segmentation service.
### Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m fastscnn_cityscapes
```
This completes the deployment of an image segmentation service API, listening on the default port 8866.
**NOTE:** to predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise no setting is needed.
### Step 2: Send a prediction request
With the server configured, the few lines of code below send a prediction request and retrieve the result.
```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/fastscnn_cityscapes"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask = base64_to_cv2(r.json()["results"][0])
```
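The `cv2_to_base64`/`base64_to_cv2` helpers above wrap an OpenCV encode/decode around a base64 transport. The base64 leg on its own round-trips any byte buffer, which can be sanity-checked without OpenCV (a standalone sketch, not part of the serving API):

```python
import base64

def bytes_to_base64(data: bytes) -> str:
    # Same transport encoding as cv2_to_base64, minus the JPEG step.
    return base64.b64encode(data).decode('utf8')

def base64_to_bytes(b64str: str) -> bytes:
    return base64.b64decode(b64str.encode('utf8'))

payload = b'\x89 fake image bytes'
assert base64_to_bytes(bytes_to_base64(payload)) == payload
```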
### Source Code
https://github.com/PaddlePaddle/PaddleSeg
### Dependencies
paddlepaddle >= 2.0.0
paddlehub >= 2.0.0
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Tuple
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
def SyncBatchNorm(*args, **kwargs):
"""In cpu environment nn.SyncBatchNorm does not have kernel so use nn.BatchNorm2D instead"""
if paddle.get_device() == 'cpu' or os.environ.get('PADDLESEG_EXPORT_STAGE'):
return nn.BatchNorm2D(*args, **kwargs)
else:
return nn.SyncBatchNorm(*args, **kwargs)
class ConvBNReLU(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self._conv = nn.Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
x = F.relu(x)
return x
class ConvBN(nn.Layer):
"""Basic conv bn layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self._conv = nn.Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
return x
class ConvReLUPool(nn.Layer):
"""Basic conv bn pool layer."""
def __init__(self, in_channels: int, out_channels: int):
super().__init__()
self.conv = nn.Conv2D(in_channels, out_channels, kernel_size=3, stride=1, padding=1, dilation=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.conv(x)
x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2, stride=2)
return x
class SeparableConvBNReLU(nn.Layer):
"""Basic separable conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=in_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
        # Note: the 'piontwise' spelling is kept as-is for checkpoint key compatibility.
        self.piontwise_conv = ConvBNReLU(in_channels, out_channels, kernel_size=1, groups=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
x = self.piontwise_conv(x)
return x
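A rough parameter-count sketch of why `SeparableConvBNReLU` is cheaper than a standard convolution (bias terms ignored; the 48 -> 64 figures below are just an illustrative channel configuration):

```python
def standard_conv_params(c_in, c_out, k):
    # One k x k filter per (input channel, output channel) pair.
    return c_in * c_out * k * k

def separable_conv_params(c_in, c_out, k):
    depthwise = c_in * k * k   # groups == c_in: one k x k filter per channel
    pointwise = c_in * c_out   # 1x1 projection to c_out channels
    return depthwise + pointwise

# Illustrative 3x3, 48 -> 64 configuration:
assert standard_conv_params(48, 64, 3) == 27648
assert separable_conv_params(48, 64, 3) == 3504
```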
class DepthwiseConvBN(nn.Layer):
"""Basic depthwise conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs):
super().__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=out_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
return x
class AuxLayer(nn.Layer):
"""
The auxiliary layer implementation for auxiliary loss.
Args:
in_channels (int): The number of input channels.
inter_channels (int): The intermediate channels.
out_channels (int): The number of output channels, and usually it is num_classes.
dropout_prob (float, optional): The drop rate. Default: 0.1.
"""
def __init__(self, in_channels: int, inter_channels: int, out_channels: int, dropout_prob: float = 0.1):
super().__init__()
self.conv_bn_relu = ConvBNReLU(in_channels=in_channels, out_channels=inter_channels, kernel_size=3, padding=1)
self.dropout = nn.Dropout(p=dropout_prob)
self.conv = nn.Conv2D(in_channels=inter_channels, out_channels=out_channels, kernel_size=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.conv_bn_relu(x)
x = self.dropout(x)
x = self.conv(x)
return x
class Activation(nn.Layer):
"""
The wrapper of activations.
Args:
act (str, optional): The activation name in lowercase. It must be one of ['elu', 'gelu',
'hardshrink', 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid',
'softmax', 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax',
'hsigmoid']. Default: None, means identical transformation.
Returns:
A callable object of Activation.
Raises:
KeyError: When parameter `act` is not in the optional range.
Examples:
from paddleseg.models.common.activation import Activation
relu = Activation("relu")
print(relu)
# <class 'paddle.nn.layer.activation.ReLU'>
sigmoid = Activation("sigmoid")
print(sigmoid)
# <class 'paddle.nn.layer.activation.Sigmoid'>
not_exit_one = Activation("not_exit_one")
# KeyError: "not_exit_one does not exist in the current dict_keys(['elu', 'gelu', 'hardshrink',
# 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid', 'softmax',
# 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax', 'hsigmoid'])"
"""
def __init__(self, act: str = None):
super(Activation, self).__init__()
self._act = act
upper_act_names = nn.layer.activation.__dict__.keys()
lower_act_names = [act.lower() for act in upper_act_names]
act_dict = dict(zip(lower_act_names, upper_act_names))
if act is not None:
if act in act_dict.keys():
act_name = act_dict[act]
self.act_func = eval("nn.layer.activation.{}()".format(act_name))
else:
raise KeyError("{} does not exist in the current {}".format(act, act_dict.keys()))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
if self._act is not None:
return self.act_func(x)
else:
return x
class PPModule(nn.Layer):
"""
Pyramid pooling module originally in PSPNet.
Args:
        in_channels (int): The number of input channels to the pyramid pooling module.
out_channels (int): The number of output channels after pyramid pooling module.
bin_sizes (tuple, optional): The out size of pooled feature maps. Default: (1, 2, 3, 6).
dim_reduction (bool, optional): A bool value represents if reducing dimension after pooling. Default: True.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
"""
def __init__(self, in_channels: int, out_channels: int, bin_sizes: Tuple, dim_reduction: bool, align_corners: bool):
super().__init__()
self.bin_sizes = bin_sizes
inter_channels = in_channels
if dim_reduction:
inter_channels = in_channels // len(bin_sizes)
        # We use the dimension reduction after pooling mentioned in the original implementation.
self.stages = nn.LayerList([self._make_stage(in_channels, inter_channels, size) for size in bin_sizes])
self.conv_bn_relu2 = ConvBNReLU(
in_channels=in_channels + inter_channels * len(bin_sizes),
out_channels=out_channels,
kernel_size=3,
padding=1)
self.align_corners = align_corners
def _make_stage(self, in_channels: int, out_channels: int, size: int):
"""
Create one pooling layer.
        In our implementation, we adopt the same dimension reduction as the original paper, which might be
        slightly different from other implementations.
        After pooling, the channels are reduced to 1/len(bin_sizes) immediately, while some other
        implementations keep the number of channels unchanged.
Args:
            in_channels (int): The number of input channels to the pyramid pooling module.
            out_channels (int): The number of output channels of the pooling branch.
size (int): The out size of the pooled layer.
        Returns:
            nn.Sequential: One pooling branch, an adaptive average pool followed by a 1x1 ConvBNReLU.
"""
prior = nn.AdaptiveAvgPool2D(output_size=(size, size))
conv = ConvBNReLU(in_channels=in_channels, out_channels=out_channels, kernel_size=1)
return nn.Sequential(prior, conv)
def forward(self, input: paddle.Tensor) -> paddle.Tensor:
cat_layers = []
for stage in self.stages:
x = stage(input)
x = F.interpolate(x, paddle.shape(input)[2:], mode='bilinear', align_corners=self.align_corners)
cat_layers.append(x)
cat_layers = [input] + cat_layers[::-1]
cat = paddle.concat(cat_layers, axis=1)
out = self.conv_bn_relu2(cat)
return out
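Channel bookkeeping for `PPModule` with `dim_reduction=True`: each pooled branch carries `in_channels // len(bin_sizes)` channels, and `conv_bn_relu2` sees the input concatenated with all branches. A small sketch with an illustrative configuration (128 channels, the default bin sizes):

```python
# Illustrative configuration: 128 input channels, default bin sizes.
in_channels, bin_sizes = 128, (1, 2, 3, 6)
inter_channels = in_channels // len(bin_sizes)          # dim_reduction=True
concat_channels = in_channels + inter_channels * len(bin_sizes)

assert inter_channels == 32    # each pooled branch carries 32 channels
assert concat_channels == 256  # what conv_bn_relu2 receives
```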
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Callable, Union, Tuple
import paddle.nn as nn
import paddle.nn.functional as F
import paddle
import numpy as np
from paddlehub.module.module import moduleinfo
import paddlehub.vision.segmentation_transforms as T
from paddlehub.module.cv_module import ImageSegmentationModule
import fastscnn_cityscapes.layers as layers
@moduleinfo(
name="fastscnn_cityscapes",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="fastscnn_cityscapes is a segmentation model.",
version="1.0.0",
meta=ImageSegmentationModule)
class FastSCNN(nn.Layer):
"""
The FastSCNN implementation based on PaddlePaddle.
As mentioned in the original paper, FastSCNN is a real-time segmentation algorithm (123.5fps)
even for high resolution images (1024x2048).
The original article refers to
Poudel, Rudra PK, et al. "Fast-scnn: Fast semantic segmentation network"
(https://arxiv.org/pdf/1902.04502.pdf).
Args:
num_classes (int): The unique number of target classes, default is 19.
        align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
            is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False.
pretrained (str, optional): The path or url of pretrained model. Default: None.
"""
def __init__(self, num_classes: int = 19, align_corners: bool = False, pretrained: str = None):
super(FastSCNN, self).__init__()
self.learning_to_downsample = LearningToDownsample(32, 48, 64)
self.global_feature_extractor = GlobalFeatureExtractor(
in_channels=64,
block_channels=[64, 96, 128],
out_channels=128,
expansion=6,
num_blocks=[3, 3, 3],
align_corners=True)
self.feature_fusion = FeatureFusionModule(64, 128, 128, align_corners)
self.classifier = Classifier(128, num_classes)
self.align_corners = align_corners
self.transforms = T.Compose([T.Normalize()])
if pretrained is not None:
model_dict = paddle.load(pretrained)
self.set_dict(model_dict)
print("load custom parameters success")
else:
checkpoint = os.path.join(self.directory, 'fastscnn_model.pdparams')
model_dict = paddle.load(checkpoint)
self.set_dict(model_dict)
print("load pretrained parameters success")
def transform(self, img: Union[np.ndarray, str]) -> Union[np.ndarray, str]:
return self.transforms(img)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
logit_list = []
input_size = paddle.shape(x)[2:]
higher_res_features = self.learning_to_downsample(x)
x = self.global_feature_extractor(higher_res_features)
x = self.feature_fusion(higher_res_features, x)
logit = self.classifier(x)
logit = F.interpolate(logit, input_size, mode='bilinear', align_corners=self.align_corners)
logit_list.append(logit)
return logit_list
class LearningToDownsample(nn.Layer):
"""
Learning to downsample module.
This module consists of three downsampling blocks (one conv and two separable conv)
Args:
dw_channels1 (int, optional): The input channels of the first sep conv. Default: 32.
dw_channels2 (int, optional): The input channels of the second sep conv. Default: 48.
out_channels (int, optional): The output channels of LearningToDownsample module. Default: 64.
"""
def __init__(self, dw_channels1: int = 32, dw_channels2: int = 48, out_channels: int = 64):
super(LearningToDownsample, self).__init__()
self.conv_bn_relu = layers.ConvBNReLU(in_channels=3, out_channels=dw_channels1, kernel_size=3, stride=2)
self.dsconv_bn_relu1 = layers.SeparableConvBNReLU(
in_channels=dw_channels1, out_channels=dw_channels2, kernel_size=3, stride=2, padding=1)
self.dsconv_bn_relu2 = layers.SeparableConvBNReLU(
in_channels=dw_channels2, out_channels=out_channels, kernel_size=3, stride=2, padding=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.conv_bn_relu(x)
x = self.dsconv_bn_relu1(x)
x = self.dsconv_bn_relu2(x)
return x
class GlobalFeatureExtractor(nn.Layer):
"""
Global feature extractor module.
This module consists of three InvertedBottleneck blocks (like inverted residual introduced by MobileNetV2) and
a PPModule (introduced by PSPNet).
Args:
in_channels (int): The number of input channels to the module.
block_channels (tuple): A tuple represents output channels of each bottleneck block.
        out_channels (int): The number of output channels of the module.
expansion (int): The expansion factor in bottleneck.
num_blocks (tuple): It indicates the repeat time of each bottleneck.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
"""
    def __init__(self, in_channels: int, block_channels: Tuple[int], out_channels: int, expansion: int,
                 num_blocks: Tuple[int], align_corners: bool):
super(GlobalFeatureExtractor, self).__init__()
self.bottleneck1 = self._make_layer(InvertedBottleneck, in_channels, block_channels[0], num_blocks[0],
expansion, 2)
self.bottleneck2 = self._make_layer(InvertedBottleneck, block_channels[0], block_channels[1], num_blocks[1],
expansion, 2)
self.bottleneck3 = self._make_layer(InvertedBottleneck, block_channels[1], block_channels[2], num_blocks[2],
expansion, 1)
self.ppm = layers.PPModule(
block_channels[2], out_channels, bin_sizes=(1, 2, 3, 6), dim_reduction=True, align_corners=align_corners)
def _make_layer(self,
block: Callable,
in_channels: int,
out_channels: int,
blocks: int,
expansion: int = 6,
stride: int = 1):
layers = []
layers.append(block(in_channels, out_channels, expansion, stride))
for _ in range(1, blocks):
layers.append(block(out_channels, out_channels, expansion, 1))
return nn.Sequential(*layers)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.bottleneck1(x)
x = self.bottleneck2(x)
x = self.bottleneck3(x)
x = self.ppm(x)
return x
class InvertedBottleneck(nn.Layer):
"""
Single Inverted bottleneck implementation.
Args:
in_channels (int): The number of input channels to bottleneck block.
out_channels (int): The number of output channels of bottleneck block.
        expansion (int, optional): The expansion factor in bottleneck. Default: 6.
        stride (int, optional): The stride used in depth-wise conv. Default: 2.
"""
def __init__(self, in_channels: int, out_channels: int, expansion: int = 6, stride: int = 2):
super().__init__()
self.use_shortcut = stride == 1 and in_channels == out_channels
expand_channels = in_channels * expansion
self.block = nn.Sequential(
# pw
layers.ConvBNReLU(in_channels=in_channels, out_channels=expand_channels, kernel_size=1, bias_attr=False),
# dw
layers.ConvBNReLU(
in_channels=expand_channels,
out_channels=expand_channels,
kernel_size=3,
stride=stride,
padding=1,
groups=expand_channels,
bias_attr=False),
# pw-linear
layers.ConvBN(in_channels=expand_channels, out_channels=out_channels, kernel_size=1, bias_attr=False))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
out = self.block(x)
if self.use_shortcut:
out = x + out
return out
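The shortcut rule in `InvertedBottleneck` only permits the residual connection when shapes match (stride 1 and equal channel counts), and the hidden width is `in_channels * expansion`. A small sketch of that bookkeeping (illustration only):

```python
def bottleneck_config(in_channels, out_channels, expansion=6, stride=2):
    # Mirrors the bookkeeping in InvertedBottleneck.__init__ above.
    return {
        "expand_channels": in_channels * expansion,
        "use_shortcut": stride == 1 and in_channels == out_channels,
    }

assert bottleneck_config(64, 64, stride=1)["use_shortcut"] is True
assert bottleneck_config(64, 96)["expand_channels"] == 384
assert bottleneck_config(64, 96)["use_shortcut"] is False
```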
class FeatureFusionModule(nn.Layer):
"""
Feature Fusion Module Implementation.
This module fuses high-resolution feature and low-resolution feature.
Args:
high_in_channels (int): The channels of high-resolution feature (output of LearningToDownsample).
low_in_channels (int): The channels of low-resolution feature (output of GlobalFeatureExtractor).
out_channels (int): The output channels of this module.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
"""
def __init__(self, high_in_channels: int, low_in_channels: int, out_channels: int, align_corners: bool):
super().__init__()
# Only depth-wise conv
self.dwconv = layers.ConvBNReLU(
in_channels=low_in_channels,
out_channels=out_channels,
kernel_size=3,
padding=1,
            groups=low_in_channels,  # depth-wise: one filter per input channel
bias_attr=False)
self.conv_low_res = layers.ConvBN(out_channels, out_channels, 1)
self.conv_high_res = layers.ConvBN(high_in_channels, out_channels, 1)
self.align_corners = align_corners
    def forward(self, high_res_input: paddle.Tensor, low_res_input: paddle.Tensor) -> paddle.Tensor:
low_res_input = F.interpolate(
low_res_input, paddle.shape(high_res_input)[2:], mode='bilinear', align_corners=self.align_corners)
low_res_input = self.dwconv(low_res_input)
low_res_input = self.conv_low_res(low_res_input)
high_res_input = self.conv_high_res(high_res_input)
x = high_res_input + low_res_input
return F.relu(x)
class Classifier(nn.Layer):
"""
The Classifier module implementation.
This module consists of two depth-wise conv and one conv.
Args:
input_channels (int): The input channels to this module.
num_classes (int): The unique number of target classes.
"""
def __init__(self, input_channels: int, num_classes: int):
super().__init__()
self.dsconv1 = layers.SeparableConvBNReLU(
in_channels=input_channels, out_channels=input_channels, kernel_size=3, padding=1)
self.dsconv2 = layers.SeparableConvBNReLU(
in_channels=input_channels, out_channels=input_channels, kernel_size=3, padding=1)
self.conv = nn.Conv2D(in_channels=input_channels, out_channels=num_classes, kernel_size=1)
self.dropout = nn.Dropout(p=0.1) # dropout_prob
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.dsconv1(x)
x = self.dsconv2(x)
x = self.dropout(x)
x = self.conv(x)
return x
# PaddleHub Image Segmentation
## Model Prediction
To run prediction with the pretrained model we provide, use the following script:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fcn_hrnetw18_cityscapes')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
## Getting Started with Fine-tuning
After installing PaddlePaddle and PaddleHub, run `python train.py` to start fine-tuning the fcn_hrnetw18_cityscapes model on datasets such as OpticDiscSeg.
## Code Walkthrough
Fine-tuning with the PaddleHub Fine-tune API takes four steps.
### Step 1: Define the data preprocessing pipeline
```python
from paddlehub.vision.segmentation_transforms import Compose, Resize, Normalize
transform = Compose([Resize(target_size=(512, 512)), Normalize()])
```
The `segmentation_transforms` module provides a rich set of preprocessing and augmentation operations for image segmentation data; swap in whichever operations your task needs.
### Step 2: Download and load the dataset
```python
from paddlehub.datasets import OpticDiscSeg
train_reader = OpticDiscSeg(transform=transform, mode='train')
```
* `transform`: the data preprocessing pipeline.
* `mode`: the dataset split to load; one of `train`, `test`, `val`. Default: `train`.
The dataset preparation code can be found in [opticdiscseg.py](../../paddlehub/datasets/opticdiscseg.py). `hub.datasets.OpticDiscSeg()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user directory.
### Step 3: Load the pretrained model
```python
model = hub.Module(name='fcn_hrnetw18_cityscapes', num_classes=2, pretrained=None)
```
* `name`: 选择预训练模型的名字。
* `num_classes`: 分割模型的类别数目。
* `pretrained`: 是否加载自己训练的模型,若为None,则加载提供的模型默认参数。
### Step4: 选择优化策略和运行配置
```python
scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.01, decay_steps=1000, power=0.9, end_lr=0.0001)
optimizer = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_ocr', use_gpu=True)
```
#### 优化策略
Paddle2.0提供了多种优化器选择,如`SGD`, `Adam`, `Adamax`等,其中`Adam`:
* `learning_rate`: 全局学习率。
* `parameters`: 待优化模型参数。
#### 运行配置
`Trainer` 主要控制Fine-tune的训练,包含以下可控制的参数:
* `model`: 被优化模型;
* `optimizer`: 优化器选择;
* `use_gpu`: 是否使用gpu,默认为False;
* `use_vdl`: 是否使用vdl可视化训练过程;
* `checkpoint_dir`: 保存模型参数的地址;
* `compare_metrics`: 保存最优模型的衡量指标;
`trainer.train` 主要控制具体的训练过程,包含以下可控制的参数:
* `train_dataset`: 训练时所用的数据集;
* `epochs`: 训练轮数;
* `batch_size`: 训练的批大小,如果使用GPU,请根据实际情况调整batch_size;
* `num_workers`: worker的数量,默认为0;
* `eval_dataset`: 验证集;
* `log_interval`: 打印日志的间隔, 单位为执行批训练的次数。
* `save_interval`: 保存模型的间隔频次,单位为执行训练的轮数。
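上述参数中 `log_interval` 与 `save_interval` 的计量单位不同(分别为批次数和轮数),下面用一段纯 Python 示意代码演示二者的换算关系。其中数据集大小等数值均为假设的示例值,并非真实配置:

```python
import math

# 以下数值均为假设的示例值,仅用于演示 log_interval / save_interval 的计量单位
num_samples = 267    # 训练集样本数(假设值)
batch_size = 4
epochs = 10
log_interval = 10    # 每执行 10 次批训练打印一次日志
save_interval = 1    # 每训练 1 轮保存一次模型

steps_per_epoch = math.ceil(num_samples / batch_size)  # 每轮的批训练次数
logs_per_epoch = steps_per_epoch // log_interval       # 每轮打印日志的次数
num_saves = epochs // save_interval                    # 整个训练过程中保存模型的次数
print(steps_per_epoch, logs_per_epoch, num_saves)      # 67 6 10

# 对应的训练调用示意(train_reader 来自 Step2):
# trainer.train(train_reader, epochs=epochs, batch_size=batch_size,
#               log_interval=log_interval, save_interval=save_interval)
```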
## 模型预测
当完成Fine-tune后,Fine-tune过程在验证集上表现最优的模型会被保存在`${CHECKPOINT_DIR}/best_model`目录下,其中`${CHECKPOINT_DIR}`目录为Fine-tune时所选择的保存checkpoint的目录。
我们使用该模型来进行预测。predict.py脚本如下:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fcn_hrnetw18_cityscapes', pretrained='/PATH/TO/CHECKPOINT')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
参数配置正确后,请执行脚本`python predict.py`
**Args**
* `images`:原始图像路径或BGR格式图片;
* `visualization`: 是否可视化,默认为True;
* `save_path`: 保存结果的路径,默认保存路径为'seg_result'。
**NOTE:** 进行预测时,所选择的module、checkpoint_dir、dataset必须和Fine-tune所用的一样。
## 服务部署
PaddleHub Serving可以部署一个在线图像分割服务。
### Step1: 启动PaddleHub Serving
运行启动命令:
```shell
$ hub serving start -m fcn_hrnetw18_cityscapes
```
这样就完成了一个图像分割服务化API的部署,默认端口号为8866。
**NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。
### Step2: 发送预测请求
配置好服务端后,以下数行代码即可发送预测请求,获取预测结果。
```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# 发送HTTP请求
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/fcn_hrnetw18_cityscapes"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask = base64_to_cv2(r.json()["results"][0])
```
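上面的 `cv2_to_base64` / `base64_to_cv2` 依赖 cv2 的 JPEG 编解码,其核心只是字节串与 base64 字符串之间的往返转换。下面是一段不依赖 cv2 的示意代码,用原始字节代替 JPEG 数据验证这一往返的一致性(`np.frombuffer` / `tobytes` 是 `np.fromstring` / `tostring` 的非废弃替代):

```python
import base64
import numpy as np

# 仅演示 base64 往返:用原始字节代替 cv2.imencode 产生的 JPEG 字节(示意)
def bytes_to_base64(data: bytes) -> str:
    return base64.b64encode(data).decode('utf8')

def base64_to_bytes(b64str: str) -> bytes:
    return base64.b64decode(b64str.encode('utf8'))

raw = np.arange(12, dtype=np.uint8)
restored = np.frombuffer(base64_to_bytes(bytes_to_base64(raw.tobytes())), dtype=np.uint8)
print(restored.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```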
### 查看代码
https://github.com/PaddlePaddle/PaddleSeg
### 依赖
paddlepaddle >= 2.0.0
paddlehub >= 2.0.0
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import Tuple
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.nn import Conv2D, AvgPool2D
def SyncBatchNorm(*args, **kwargs):
"""In cpu environment nn.SyncBatchNorm does not have kernel so use nn.BatchNorm2D instead"""
if paddle.get_device() == 'cpu':
return nn.BatchNorm2D(*args, **kwargs)
else:
return nn.SyncBatchNorm(*args, **kwargs)
class ConvBNLayer(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self,
in_channels: int,
out_channels: int,
kernel_size: int,
stride: int = 1,
dilation: int = 1,
groups: int = 1,
is_vd_mode: bool = False,
act: str = None,
name: str = None):
super(ConvBNLayer, self).__init__()
self.is_vd_mode = is_vd_mode
self._pool2d_avg = AvgPool2D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
self._conv = Conv2D(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=kernel_size,
stride=stride,
padding=(kernel_size - 1) // 2 if dilation == 1 else 0,
dilation=dilation,
groups=groups,
bias_attr=False)
self._batch_norm = SyncBatchNorm(out_channels)
self._act_op = Activation(act=act)
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
if self.is_vd_mode:
inputs = self._pool2d_avg(inputs)
y = self._conv(inputs)
y = self._batch_norm(y)
y = self._act_op(y)
return y
class BottleneckBlock(nn.Layer):
"""Residual bottleneck block"""
def __init__(self,
in_channels: int,
out_channels: int,
stride: int,
shortcut: bool = True,
if_first: bool = False,
dilation: int = 1,
name: str = None):
super(BottleneckBlock, self).__init__()
self.conv0 = ConvBNLayer(
in_channels=in_channels, out_channels=out_channels, kernel_size=1, act='relu', name=name + "_branch2a")
self.dilation = dilation
self.conv1 = ConvBNLayer(
in_channels=out_channels,
out_channels=out_channels,
kernel_size=3,
stride=stride,
act='relu',
dilation=dilation,
name=name + "_branch2b")
self.conv2 = ConvBNLayer(
in_channels=out_channels, out_channels=out_channels * 4, kernel_size=1, act=None, name=name + "_branch2c")
if not shortcut:
self.short = ConvBNLayer(
in_channels=in_channels,
out_channels=out_channels * 4,
kernel_size=1,
stride=1,
is_vd_mode=False if if_first or stride == 1 else True,
name=name + "_branch1")
self.shortcut = shortcut
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
y = self.conv0(inputs)
if self.dilation > 1:
padding = self.dilation
y = F.pad(y, [padding, padding, padding, padding])
conv1 = self.conv1(y)
conv2 = self.conv2(conv1)
if self.shortcut:
short = inputs
else:
short = self.short(inputs)
y = paddle.add(x=short, y=conv2)
y = F.relu(y)
return y
class SeparableConvBNReLU(nn.Layer):
"""Depthwise Separable Convolution."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(SeparableConvBNReLU, self).__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=in_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
self.piontwise_conv = ConvBNReLU(in_channels, out_channels, kernel_size=1, groups=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
x = self.piontwise_conv(x)
return x
class ConvBN(nn.Layer):
"""Basic conv bn layer"""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBN, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
return x
class ConvBNReLU(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBNReLU, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
x = F.relu(x)
return x
class Activation(nn.Layer):
"""
The wrapper of activations.
Args:
act (str, optional): The activation name in lowercase. It must be one of ['elu', 'gelu',
'hardshrink', 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid',
'softmax', 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax',
'hsigmoid']. Default: None, means identical transformation.
Returns:
A callable object of Activation.
Raises:
KeyError: When parameter `act` is not in the optional range.
Examples:
from paddleseg.models.common.activation import Activation
relu = Activation("relu")
print(relu)
# <class 'paddle.nn.layer.activation.ReLU'>
sigmoid = Activation("sigmoid")
print(sigmoid)
# <class 'paddle.nn.layer.activation.Sigmoid'>
not_exist_one = Activation("not_exist_one")
# KeyError: "not_exist_one does not exist in the current dict_keys(['elu', 'gelu', 'hardshrink',
# 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid', 'softmax',
# 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax', 'hsigmoid'])"
"""
def __init__(self, act: str = None):
super(Activation, self).__init__()
self._act = act
upper_act_names = nn.layer.activation.__dict__.keys()
lower_act_names = [act.lower() for act in upper_act_names]
act_dict = dict(zip(lower_act_names, upper_act_names))
if act is not None:
if act in act_dict.keys():
act_name = act_dict[act]
self.act_func = getattr(nn.layer.activation, act_name)()
else:
raise KeyError("{} does not exist in the current {}".format(act, act_dict.keys()))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
if self._act is not None:
return self.act_func(x)
else:
return x
class ASPPModule(nn.Layer):
"""
Atrous Spatial Pyramid Pooling.
Args:
aspp_ratios (tuple): The dilation rates used in the ASPP module.
in_channels (int): The number of input channels.
out_channels (int): The number of output channels.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
use_sep_conv (bool, optional): Whether to use separable conv in the ASPP module. Default: False.
image_pooling (bool, optional): Whether to augment with image-level features. Default: False.
"""
def __init__(self,
aspp_ratios: Tuple[int],
in_channels: int,
out_channels: int,
align_corners: bool,
use_sep_conv: bool = False,
image_pooling: bool = False):
super().__init__()
self.align_corners = align_corners
self.aspp_blocks = nn.LayerList()
for ratio in aspp_ratios:
if use_sep_conv and ratio > 1:
conv_func = SeparableConvBNReLU
else:
conv_func = ConvBNReLU
block = conv_func(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=1 if ratio == 1 else 3,
dilation=ratio,
padding=0 if ratio == 1 else ratio)
self.aspp_blocks.append(block)
out_size = len(self.aspp_blocks)
if image_pooling:
self.global_avg_pool = nn.Sequential(
nn.AdaptiveAvgPool2D(output_size=(1, 1)),
ConvBNReLU(in_channels, out_channels, kernel_size=1, bias_attr=False))
out_size += 1
self.image_pooling = image_pooling
self.conv_bn_relu = ConvBNReLU(in_channels=out_channels * out_size, out_channels=out_channels, kernel_size=1)
self.dropout = nn.Dropout(p=0.1) # drop rate
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
outputs = []
for block in self.aspp_blocks:
y = block(x)
y = F.interpolate(y, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(y)
if self.image_pooling:
img_avg = self.global_avg_pool(x)
img_avg = F.interpolate(img_avg, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(img_avg)
x = paddle.concat(outputs, axis=1)
x = self.conv_bn_relu(x)
x = self.dropout(x)
return x
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Union, List, Tuple
import paddle
from paddle import nn
import paddle.nn.functional as F
import numpy as np
from paddlehub.module.module import moduleinfo
import paddlehub.vision.segmentation_transforms as T
from paddlehub.module.cv_module import ImageSegmentationModule
from fcn_hrnetw18_cityscapes.hrnet import HRNet_W18
import fcn_hrnetw18_cityscapes.layers as layers
@moduleinfo(
name="fcn_hrnetw18_cityscapes",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="Fcn_hrnetw18 is a segmentation model.",
version="1.0.0",
meta=ImageSegmentationModule)
class FCN(nn.Layer):
"""
A simple implementation for FCN based on PaddlePaddle.
The original article refers to
Evan Shelhamer, et al. "Fully Convolutional Networks for Semantic Segmentation"
(https://arxiv.org/abs/1411.4038).
Args:
num_classes (int): The unique number of target classes.
backbone_indices (tuple, optional): The values in the tuple indicate the indices of output of backbone.
Default: (-1, ).
channels (int, optional): The channels between conv layer and the last layer of FCNHead.
If None, it will be the number of channels of input features. Default: None.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False.
pretrained (str, optional): The path or url of pretrained model. Default: None
"""
def __init__(self,
num_classes: int = 19,
backbone_indices: Tuple[int] = (-1, ),
channels: int = None,
align_corners: bool = False,
pretrained: str = None):
super(FCN, self).__init__()
self.backbone = HRNet_W18()
backbone_channels = [self.backbone.feat_channels[i] for i in backbone_indices]
self.head = FCNHead(num_classes, backbone_indices, backbone_channels, channels)
self.align_corners = align_corners
self.transforms = T.Compose([T.Normalize()])
if pretrained is not None:
model_dict = paddle.load(pretrained)
self.set_dict(model_dict)
print("load custom parameters success")
else:
checkpoint = os.path.join(self.directory, 'model.pdparams')
model_dict = paddle.load(checkpoint)
self.set_dict(model_dict)
print("load pretrained parameters success")
def transform(self, img: Union[np.ndarray, str]) -> Union[np.ndarray, str]:
return self.transforms(img)
def forward(self, x: paddle.Tensor) -> List[paddle.Tensor]:
feat_list = self.backbone(x)
logit_list = self.head(feat_list)
return [
F.interpolate(logit, paddle.shape(x)[2:], mode='bilinear', align_corners=self.align_corners)
for logit in logit_list
]
class FCNHead(nn.Layer):
"""
A simple implementation for FCNHead based on PaddlePaddle
Args:
num_classes (int): The unique number of target classes.
backbone_indices (tuple, optional): The values in the tuple indicate the indices of output of backbone.
Default: (-1, ).
backbone_channels (tuple): The values of backbone channels.
Default: (270, ).
channels (int, optional): The channels between conv layer and the last layer of FCNHead.
If None, it will be the number of channels of input features. Default: None.
pretrained (str, optional): The path of pretrained model. Default: None
"""
def __init__(self,
num_classes: int,
backbone_indices: Tuple[int] = (-1, ),
backbone_channels: Tuple[int] = (270, ),
channels: int = None):
super(FCNHead, self).__init__()
self.num_classes = num_classes
self.backbone_indices = backbone_indices
if channels is None:
channels = backbone_channels[0]
self.conv_1 = layers.ConvBNReLU(
in_channels=backbone_channels[0], out_channels=channels, kernel_size=1, padding='same', stride=1)
self.cls = nn.Conv2D(in_channels=channels, out_channels=self.num_classes, kernel_size=1, stride=1, padding=0)
def forward(self, feat_list: List[paddle.Tensor]) -> List[paddle.Tensor]:
logit_list = []
x = feat_list[self.backbone_indices[0]]
x = self.conv_1(x)
logit = self.cls(x)
logit_list.append(logit)
return logit_list
# PaddleHub 图像分割
## 模型预测
若想使用我们提供的预训练模型进行预测,可使用如下脚本:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fcn_hrnetw18_voc')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
## 如何开始Fine-tune
本示例将展示如何使用PaddleHub对预训练模型进行finetune并完成预测任务。
在完成安装PaddlePaddle与PaddleHub后,通过执行`python train.py`即可开始使用fcn_hrnetw18_voc模型对OpticDiscSeg等数据集进行Fine-tune。
## 代码步骤
使用PaddleHub Fine-tune API进行Fine-tune可以分为4个步骤。
### Step1: 定义数据预处理方式
```python
from paddlehub.vision.segmentation_transforms import Compose, Resize, Normalize
transform = Compose([Resize(target_size=(512, 512)), Normalize()])
```
`segmentation_transforms` 数据增强模块定义了丰富的针对图像分割数据的预处理方式,用户可按照需求替换自己需要的数据预处理方式。
### Step2: 下载数据集并使用
```python
from paddlehub.datasets import OpticDiscSeg
train_reader = OpticDiscSeg(transform, mode='train')
```
* `transform`: 数据预处理方式。
* `mode`: 选择数据模式,可选项有 `train`, `test`, `val`, 默认为`train`。
数据集的准备代码可以参考 [opticdiscseg.py](../../paddlehub/datasets/opticdiscseg.py)。`hub.datasets.OpticDiscSeg()`会自动从网络下载数据集并解压到用户目录下`$HOME/.paddlehub/dataset`目录。
### Step3: 加载预训练模型
```python
model = hub.Module(name='fcn_hrnetw18_voc', num_classes=2, pretrained=None)
```
* `name`: 选择预训练模型的名字。
* `num_classes`: 分割模型的类别数目。
* `pretrained`: 是否加载自己训练的模型,若为None,则加载提供的模型默认参数。
### Step4: 选择优化策略和运行配置
```python
scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.01, decay_steps=1000, power=0.9, end_lr=0.0001)
optimizer = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_ocr', use_gpu=True)
```
#### 优化策略
Paddle2.0提供了多种优化器选择,如`SGD`, `Adam`, `Adamax`等,其中`Adam`:
* `learning_rate`: 全局学习率。
* `parameters`: 待优化模型参数。
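`PolynomialDecay` 调度器会使学习率按多项式规律从初始值衰减到 `end_lr`。下面用纯 Python 按该公式示意计算几个步数对应的学习率(对应 Paddle 文档中 `cycle=False` 的默认行为,数值仅作演示):

```python
# 按 PolynomialDecay(cycle=False)的公式示意计算学习率,参数与上文配置一致
def poly_decay(step, learning_rate=0.01, decay_steps=1000, power=0.9, end_lr=0.0001):
    step = min(step, decay_steps)  # 超过 decay_steps 后保持 end_lr
    return (learning_rate - end_lr) * (1 - step / decay_steps) ** power + end_lr

print(round(poly_decay(0), 6))     # 0.01
print(round(poly_decay(1000), 6))  # 0.0001
```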
#### 运行配置
`Trainer` 主要控制Fine-tune的训练,包含以下可控制的参数:
* `model`: 被优化模型;
* `optimizer`: 优化器选择;
* `use_gpu`: 是否使用gpu,默认为False;
* `use_vdl`: 是否使用vdl可视化训练过程;
* `checkpoint_dir`: 保存模型参数的地址;
* `compare_metrics`: 保存最优模型的衡量指标;
`trainer.train` 主要控制具体的训练过程,包含以下可控制的参数:
* `train_dataset`: 训练时所用的数据集;
* `epochs`: 训练轮数;
* `batch_size`: 训练的批大小,如果使用GPU,请根据实际情况调整batch_size;
* `num_workers`: worker的数量,默认为0;
* `eval_dataset`: 验证集;
* `log_interval`: 打印日志的间隔, 单位为执行批训练的次数。
* `save_interval`: 保存模型的间隔频次,单位为执行训练的轮数。
## 模型预测
当完成Fine-tune后,Fine-tune过程在验证集上表现最优的模型会被保存在`${CHECKPOINT_DIR}/best_model`目录下,其中`${CHECKPOINT_DIR}`目录为Fine-tune时所选择的保存checkpoint的目录。
我们使用该模型来进行预测。predict.py脚本如下:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fcn_hrnetw18_voc', pretrained='/PATH/TO/CHECKPOINT')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
参数配置正确后,请执行脚本`python predict.py`
**Args**
* `images`:原始图像路径或BGR格式图片;
* `visualization`: 是否可视化,默认为True;
* `save_path`: 保存结果的路径,默认保存路径为'seg_result'。
**NOTE:** 进行预测时,所选择的module、checkpoint_dir、dataset必须和Fine-tune所用的一样。
## 服务部署
PaddleHub Serving可以部署一个在线图像分割服务。
### Step1: 启动PaddleHub Serving
运行启动命令:
```shell
$ hub serving start -m fcn_hrnetw18_voc
```
这样就完成了一个图像分割服务化API的部署,默认端口号为8866。
**NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。
### Step2: 发送预测请求
配置好服务端后,以下数行代码即可发送预测请求,获取预测结果。
```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# 发送HTTP请求
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/fcn_hrnetw18_voc"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask = base64_to_cv2(r.json()["results"][0])
```
### 查看代码
https://github.com/PaddlePaddle/PaddleSeg
### 依赖
paddlepaddle >= 2.0.0
paddlehub >= 2.0.0
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import Tuple
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.nn import Conv2D, AvgPool2D
def SyncBatchNorm(*args, **kwargs):
"""In cpu environment nn.SyncBatchNorm does not have kernel so use nn.BatchNorm2D instead"""
if paddle.get_device() == 'cpu':
return nn.BatchNorm2D(*args, **kwargs)
else:
return nn.SyncBatchNorm(*args, **kwargs)
class ConvBNLayer(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self,
in_channels: int,
out_channels: int,
kernel_size: int,
stride: int = 1,
dilation: int = 1,
groups: int = 1,
is_vd_mode: bool = False,
act: str = None,
name: str = None):
super(ConvBNLayer, self).__init__()
self.is_vd_mode = is_vd_mode
self._pool2d_avg = AvgPool2D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
self._conv = Conv2D(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=kernel_size,
stride=stride,
padding=(kernel_size - 1) // 2 if dilation == 1 else 0,
dilation=dilation,
groups=groups,
bias_attr=False)
self._batch_norm = SyncBatchNorm(out_channels)
self._act_op = Activation(act=act)
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
if self.is_vd_mode:
inputs = self._pool2d_avg(inputs)
y = self._conv(inputs)
y = self._batch_norm(y)
y = self._act_op(y)
return y
class BottleneckBlock(nn.Layer):
"""Residual bottleneck block"""
def __init__(self,
in_channels: int,
out_channels: int,
stride: int,
shortcut: bool = True,
if_first: bool = False,
dilation: int = 1,
name: str = None):
super(BottleneckBlock, self).__init__()
self.conv0 = ConvBNLayer(
in_channels=in_channels, out_channels=out_channels, kernel_size=1, act='relu', name=name + "_branch2a")
self.dilation = dilation
self.conv1 = ConvBNLayer(
in_channels=out_channels,
out_channels=out_channels,
kernel_size=3,
stride=stride,
act='relu',
dilation=dilation,
name=name + "_branch2b")
self.conv2 = ConvBNLayer(
in_channels=out_channels, out_channels=out_channels * 4, kernel_size=1, act=None, name=name + "_branch2c")
if not shortcut:
self.short = ConvBNLayer(
in_channels=in_channels,
out_channels=out_channels * 4,
kernel_size=1,
stride=1,
is_vd_mode=False if if_first or stride == 1 else True,
name=name + "_branch1")
self.shortcut = shortcut
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
y = self.conv0(inputs)
if self.dilation > 1:
padding = self.dilation
y = F.pad(y, [padding, padding, padding, padding])
conv1 = self.conv1(y)
conv2 = self.conv2(conv1)
if self.shortcut:
short = inputs
else:
short = self.short(inputs)
y = paddle.add(x=short, y=conv2)
y = F.relu(y)
return y
class SeparableConvBNReLU(nn.Layer):
"""Depthwise Separable Convolution."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(SeparableConvBNReLU, self).__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=in_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
self.piontwise_conv = ConvBNReLU(in_channels, out_channels, kernel_size=1, groups=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
x = self.piontwise_conv(x)
return x
class ConvBN(nn.Layer):
"""Basic conv bn layer"""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBN, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
return x
class ConvBNReLU(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBNReLU, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
x = F.relu(x)
return x
class Activation(nn.Layer):
"""
The wrapper of activations.
Args:
act (str, optional): The activation name in lowercase. It must be one of ['elu', 'gelu',
'hardshrink', 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid',
'softmax', 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax',
'hsigmoid']. Default: None, means identical transformation.
Returns:
A callable object of Activation.
Raises:
KeyError: When parameter `act` is not in the optional range.
Examples:
from paddleseg.models.common.activation import Activation
relu = Activation("relu")
print(relu)
# <class 'paddle.nn.layer.activation.ReLU'>
sigmoid = Activation("sigmoid")
print(sigmoid)
# <class 'paddle.nn.layer.activation.Sigmoid'>
not_exist_one = Activation("not_exist_one")
# KeyError: "not_exist_one does not exist in the current dict_keys(['elu', 'gelu', 'hardshrink',
# 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid', 'softmax',
# 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax', 'hsigmoid'])"
"""
def __init__(self, act: str = None):
super(Activation, self).__init__()
self._act = act
upper_act_names = nn.layer.activation.__dict__.keys()
lower_act_names = [act.lower() for act in upper_act_names]
act_dict = dict(zip(lower_act_names, upper_act_names))
if act is not None:
if act in act_dict.keys():
act_name = act_dict[act]
self.act_func = getattr(nn.layer.activation, act_name)()
else:
raise KeyError("{} does not exist in the current {}".format(act, act_dict.keys()))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
if self._act is not None:
return self.act_func(x)
else:
return x
class ASPPModule(nn.Layer):
"""
Atrous Spatial Pyramid Pooling.
Args:
aspp_ratios (tuple): The dilation rates used in the ASPP module.
in_channels (int): The number of input channels.
out_channels (int): The number of output channels.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
use_sep_conv (bool, optional): Whether to use separable conv in the ASPP module. Default: False.
image_pooling (bool, optional): Whether to augment with image-level features. Default: False.
"""
def __init__(self,
aspp_ratios: Tuple[int],
in_channels: int,
out_channels: int,
align_corners: bool,
use_sep_conv: bool = False,
image_pooling: bool = False):
super().__init__()
self.align_corners = align_corners
self.aspp_blocks = nn.LayerList()
for ratio in aspp_ratios:
if use_sep_conv and ratio > 1:
conv_func = SeparableConvBNReLU
else:
conv_func = ConvBNReLU
block = conv_func(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=1 if ratio == 1 else 3,
dilation=ratio,
padding=0 if ratio == 1 else ratio)
self.aspp_blocks.append(block)
out_size = len(self.aspp_blocks)
if image_pooling:
self.global_avg_pool = nn.Sequential(
nn.AdaptiveAvgPool2D(output_size=(1, 1)),
ConvBNReLU(in_channels, out_channels, kernel_size=1, bias_attr=False))
out_size += 1
self.image_pooling = image_pooling
self.conv_bn_relu = ConvBNReLU(in_channels=out_channels * out_size, out_channels=out_channels, kernel_size=1)
self.dropout = nn.Dropout(p=0.1) # drop rate
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
outputs = []
for block in self.aspp_blocks:
y = block(x)
y = F.interpolate(y, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(y)
if self.image_pooling:
img_avg = self.global_avg_pool(x)
img_avg = F.interpolate(img_avg, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(img_avg)
x = paddle.concat(outputs, axis=1)
x = self.conv_bn_relu(x)
x = self.dropout(x)
return x
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Union, List, Tuple
import paddle
from paddle import nn
import paddle.nn.functional as F
import numpy as np
from paddlehub.module.module import moduleinfo
import paddlehub.vision.segmentation_transforms as T
from paddlehub.module.cv_module import ImageSegmentationModule
from fcn_hrnetw18_voc.hrnet import HRNet_W18
import fcn_hrnetw18_voc.layers as layers
@moduleinfo(
name="fcn_hrnetw18_voc",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="Fcn_hrnetw18 is a segmentation model.",
version="1.0.0",
meta=ImageSegmentationModule)
class FCN(nn.Layer):
"""
A simple implementation for FCN based on PaddlePaddle.
The original article refers to
Evan Shelhamer, et al. "Fully Convolutional Networks for Semantic Segmentation"
(https://arxiv.org/abs/1411.4038).
Args:
num_classes (int): The unique number of target classes.
backbone_indices (tuple, optional): The values in the tuple indicate the indices of output of backbone.
Default: (-1, ).
channels (int, optional): The channels between conv layer and the last layer of FCNHead.
If None, it will be the number of channels of input features. Default: None.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False.
pretrained (str, optional): The path or url of pretrained model. Default: None
"""
def __init__(self,
num_classes: int = 21,
backbone_indices: Tuple[int] = (-1, ),
channels: int = None,
align_corners: bool = False,
pretrained: str = None):
super(FCN, self).__init__()
self.backbone = HRNet_W18()
backbone_channels = [self.backbone.feat_channels[i] for i in backbone_indices]
self.head = FCNHead(num_classes, backbone_indices, backbone_channels, channels)
self.align_corners = align_corners
self.transforms = T.Compose([T.Normalize()])
if pretrained is not None:
model_dict = paddle.load(pretrained)
self.set_dict(model_dict)
print("load custom parameters success")
else:
checkpoint = os.path.join(self.directory, 'model.pdparams')
model_dict = paddle.load(checkpoint)
self.set_dict(model_dict)
print("load pretrained parameters success")
def transform(self, img: Union[np.ndarray, str]) -> Union[np.ndarray, str]:
return self.transforms(img)
def forward(self, x: paddle.Tensor) -> List[paddle.Tensor]:
feat_list = self.backbone(x)
logit_list = self.head(feat_list)
return [
F.interpolate(logit, paddle.shape(x)[2:], mode='bilinear', align_corners=self.align_corners)
for logit in logit_list
]
class FCNHead(nn.Layer):
"""
A simple implementation for FCNHead based on PaddlePaddle
Args:
num_classes (int): The unique number of target classes.
backbone_indices (tuple, optional): The values in the tuple indicate the indices of output of backbone.
Default: (-1, ).
backbone_channels (tuple): The values of backbone channels.
Default: (270, ).
channels (int, optional): The channels between conv layer and the last layer of FCNHead.
If None, it will be the number of channels of input features. Default: None.
pretrained (str, optional): The path of pretrained model. Default: None
"""
def __init__(self,
num_classes: int,
backbone_indices: Tuple[int] = (-1, ),
backbone_channels: Tuple[int] = (270, ),
channels: int = None):
super(FCNHead, self).__init__()
self.num_classes = num_classes
self.backbone_indices = backbone_indices
if channels is None:
channels = backbone_channels[0]
self.conv_1 = layers.ConvBNReLU(
in_channels=backbone_channels[0], out_channels=channels, kernel_size=1, padding='same', stride=1)
self.cls = nn.Conv2D(in_channels=channels, out_channels=self.num_classes, kernel_size=1, stride=1, padding=0)
def forward(self, feat_list: nn.Layer) -> List[paddle.Tensor]:
logit_list = []
x = feat_list[self.backbone_indices[0]]
x = self.conv_1(x)
logit = self.cls(x)
logit_list.append(logit)
return logit_list
# PaddleHub Image Segmentation
## Model Prediction
To run prediction with the provided pretrained model, use the following script:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fcn_hrnetw48_cityscapes')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
## How to Start Fine-tuning
After installing PaddlePaddle and PaddleHub, run `python train.py` to fine-tune the fcn_hrnetw48_cityscapes model on datasets such as OpticDiscSeg.
## Code Walkthrough
Fine-tuning with the PaddleHub Fine-tune API takes four steps.
### Step 1: Define the Data Preprocessing Pipeline
```python
from paddlehub.vision.segmentation_transforms import Compose, Resize, Normalize
transform = Compose([Resize(target_size=(512, 512)), Normalize()])
```
The `segmentation_transforms` module provides a rich set of preprocessing operations for image segmentation data; replace them with whatever transforms your task requires.
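Conceptually, `Compose` just chains a list of callables, feeding each transform's output into the next. A minimal pure-Python sketch of that pattern (the `scale` and `shift` transforms here are hypothetical stand-ins, not PaddleHub APIs):

```python
class Compose:
    """Chain transforms: each callable's output feeds the next one."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, data):
        for t in self.transforms:
            data = t(data)
        return data

# Hypothetical stand-in transforms for illustration only.
scale = lambda x: x * 2   # e.g. a resize/normalize step
shift = lambda x: x + 1   # e.g. a mean-subtraction step

pipeline = Compose([scale, shift])
print(pipeline(3))  # -> 7: (3 * 2) + 1
```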
### Step 2: Download and Load the Dataset
```python
from paddlehub.datasets import OpticDiscSeg
train_reader = OpticDiscSeg(transform, mode='train')
```
* `transform`: the data preprocessing pipeline.
* `mode`: which dataset split to load; one of `train`, `test`, or `val`. Default: `train`.
The dataset preparation code can be found in [opticdiscseg.py](../../paddlehub/datasets/opticdiscseg.py). `hub.datasets.OpticDiscSeg()` automatically downloads the dataset and extracts it to `$HOME/.paddlehub/dataset` in the user's home directory.
### Step 3: Load the Pretrained Model
```python
model = hub.Module(name='fcn_hrnetw48_cityscapes', num_classes=2, pretrained=None)
```
* `name`: name of the pretrained model.
* `num_classes`: number of segmentation classes.
* `pretrained`: path to your own trained weights; if None, the default pretrained parameters shipped with the model are loaded.
### Step 4: Choose the Optimization Strategy and Runtime Configuration
```python
scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.01, decay_steps=1000, power=0.9, end_lr=0.0001)
optimizer = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_ocr', use_gpu=True)
```
#### Optimization Strategy
Paddle 2.0 offers a variety of optimizers, such as `SGD`, `Adam`, and `Adamax`. For `Adam`:
* `learning_rate`: global learning rate.
* `parameters`: model parameters to optimize.
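With `cycle=False` (the default), the `PolynomialDecay` schedule configured above follows, to a close approximation, the curve sketched below; the helper function name is ours, not a Paddle API:

```python
def polynomial_decay(step, learning_rate=0.01, decay_steps=1000,
                     power=0.9, end_lr=0.0001):
    """Sketch of the polynomial decay schedule used above (cycle=False)."""
    # The decay fraction is clamped so the LR bottoms out at end_lr.
    frac = min(step, decay_steps) / decay_steps
    return (learning_rate - end_lr) * (1 - frac) ** power + end_lr

print(polynomial_decay(0))     # 0.01: starts at the base learning rate
print(polynomial_decay(1000))  # 0.0001: settles at end_lr after decay_steps
```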
#### Runtime Configuration
`Trainer` controls the fine-tuning run and accepts the following parameters:
* `model`: the model to be optimized;
* `optimizer`: the optimizer to use;
* `use_gpu`: whether to use the GPU. Default: False;
* `use_vdl`: whether to use VisualDL to visualize the training process;
* `checkpoint_dir`: directory in which model parameters are saved;
* `compare_metrics`: metric used to select the best model;
`trainer.train` controls the actual training loop and accepts the following parameters:
* `train_dataset`: dataset used for training;
* `epochs`: number of training epochs;
* `batch_size`: training batch size; when using a GPU, adjust it to the available memory;
* `num_workers`: number of data-loading workers. Default: 0;
* `eval_dataset`: validation dataset;
* `log_interval`: interval between log messages, in training steps (batches);
* `save_interval`: interval between model checkpoints, in epochs.
## Model Prediction
When fine-tuning completes, the model that performed best on the validation set is saved under `${CHECKPOINT_DIR}/best_model`, where `${CHECKPOINT_DIR}` is the checkpoint directory chosen during fine-tuning.
We use this model for prediction. The predict.py script is as follows:
```python
import paddle
import cv2
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='fcn_hrnetw48_cityscapes', pretrained='/PATH/TO/CHECKPOINT')
img = cv2.imread("/PATH/TO/IMAGE")
model.predict(images=[img], visualization=True)
```
After the parameters are configured correctly, run `python predict.py`.
**Args**
* `images`: original image paths or images in BGR format;
* `visualization`: whether to visualize the results. Default: True;
* `save_path`: path for saving results. Default: 'seg_result'.
**NOTE:** At prediction time, the module, checkpoint_dir, and dataset must be the same as those used for fine-tuning.
## Serving Deployment
PaddleHub Serving can deploy an online image segmentation service.
### Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m fcn_hrnetw48_cityscapes
```
This deploys an image segmentation API service; the default port is 8866.
**NOTE:** To run prediction on a GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
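For example, assuming the service should use GPU 0 (the device index here is an illustrative assumption; adjust to your machine), the variable can be exported before launching:

```shell
# Expose only GPU 0 to the serving process (pick the index for your machine).
export CUDA_VISIBLE_DEVICES=0
hub serving start -m fcn_hrnetw48_cityscapes
```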
### Step 2: Send a Prediction Request
With the server configured, the few lines of code below send a prediction request and retrieve the result:
```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/fcn_hrnetw48_cityscapes"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask = base64_to_cv2(r.json()["results"][0])
```
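The `cv2_to_base64` / `base64_to_cv2` helpers above just shuttle the encoded image bytes through base64 so they can travel inside a JSON payload. The round trip itself can be sanity-checked with numpy alone; the uint8 buffer below is a stand-in for real JPEG bytes:

```python
import base64
import numpy as np

# Stand-in for JPEG-encoded image bytes: any uint8 buffer round-trips the same way.
raw = np.arange(16, dtype=np.uint8)

# Encode: bytes -> base64 text, safe to embed in a JSON request body.
b64 = base64.b64encode(raw.tobytes()).decode('utf8')

# Decode: base64 text -> bytes -> uint8 array, mirroring base64_to_cv2.
restored = np.frombuffer(base64.b64decode(b64.encode('utf8')), np.uint8)

assert np.array_equal(raw, restored)
```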
### View the Code
https://github.com/PaddlePaddle/PaddleSeg
### Dependencies
paddlepaddle >= 2.0.0
paddlehub >= 2.0.0
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import Tuple
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.nn import Conv2D, AvgPool2D
def SyncBatchNorm(*args, **kwargs):
"""In cpu environment nn.SyncBatchNorm does not have kernel so use nn.BatchNorm2D instead"""
if paddle.get_device() == 'cpu':
return nn.BatchNorm2D(*args, **kwargs)
else:
return nn.SyncBatchNorm(*args, **kwargs)
class ConvBNLayer(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self,
in_channels: int,
out_channels: int,
kernel_size: int,
stride: int = 1,
dilation: int = 1,
groups: int = 1,
is_vd_mode: bool = False,
act: str = None,
name: str = None):
super(ConvBNLayer, self).__init__()
self.is_vd_mode = is_vd_mode
self._pool2d_avg = AvgPool2D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
self._conv = Conv2D(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=kernel_size,
stride=stride,
padding=(kernel_size - 1) // 2 if dilation == 1 else 0,
dilation=dilation,
groups=groups,
bias_attr=False)
self._batch_norm = SyncBatchNorm(out_channels)
self._act_op = Activation(act=act)
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
if self.is_vd_mode:
inputs = self._pool2d_avg(inputs)
y = self._conv(inputs)
y = self._batch_norm(y)
y = self._act_op(y)
return y
class BottleneckBlock(nn.Layer):
"""Residual bottleneck block"""
def __init__(self,
in_channels: int,
out_channels: int,
stride: int,
shortcut: bool = True,
if_first: bool = False,
dilation: int = 1,
name: str = None):
super(BottleneckBlock, self).__init__()
self.conv0 = ConvBNLayer(
in_channels=in_channels, out_channels=out_channels, kernel_size=1, act='relu', name=name + "_branch2a")
self.dilation = dilation
self.conv1 = ConvBNLayer(
in_channels=out_channels,
out_channels=out_channels,
kernel_size=3,
stride=stride,
act='relu',
dilation=dilation,
name=name + "_branch2b")
self.conv2 = ConvBNLayer(
in_channels=out_channels, out_channels=out_channels * 4, kernel_size=1, act=None, name=name + "_branch2c")
if not shortcut:
self.short = ConvBNLayer(
in_channels=in_channels,
out_channels=out_channels * 4,
kernel_size=1,
stride=1,
is_vd_mode=False if if_first or stride == 1 else True,
name=name + "_branch1")
self.shortcut = shortcut
def forward(self, inputs: paddle.Tensor) -> paddle.Tensor:
y = self.conv0(inputs)
if self.dilation > 1:
padding = self.dilation
y = F.pad(y, [padding, padding, padding, padding])
conv1 = self.conv1(y)
conv2 = self.conv2(conv1)
if self.shortcut:
short = inputs
else:
short = self.short(inputs)
y = paddle.add(x=short, y=conv2)
y = F.relu(y)
return y
class SeparableConvBNReLU(nn.Layer):
"""Depthwise Separable Convolution."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(SeparableConvBNReLU, self).__init__()
self.depthwise_conv = ConvBN(
in_channels,
out_channels=in_channels,
kernel_size=kernel_size,
padding=padding,
groups=in_channels,
**kwargs)
self.pointwise_conv = ConvBNReLU(in_channels, out_channels, kernel_size=1, groups=1)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self.depthwise_conv(x)
x = self.pointwise_conv(x)
return x
class ConvBN(nn.Layer):
"""Basic conv bn layer"""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBN, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
return x
class ConvBNReLU(nn.Layer):
"""Basic conv bn relu layer."""
def __init__(self, in_channels: int, out_channels: int, kernel_size: int, padding: str = 'same', **kwargs: dict):
super(ConvBNReLU, self).__init__()
self._conv = Conv2D(in_channels, out_channels, kernel_size, padding=padding, **kwargs)
self._batch_norm = SyncBatchNorm(out_channels)
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
x = self._conv(x)
x = self._batch_norm(x)
x = F.relu(x)
return x
class Activation(nn.Layer):
"""
The wrapper of activations.
Args:
act (str, optional): The activation name in lowercase. It must be one of ['elu', 'gelu',
'hardshrink', 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid',
'softmax', 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax',
'hsigmoid']. Default: None, means identical transformation.
Returns:
A callable object of Activation.
Raises:
KeyError: When parameter `act` is not in the optional range.
Examples:
from paddleseg.models.common.activation import Activation
relu = Activation("relu")
print(relu)
# <class 'paddle.nn.layer.activation.ReLU'>
sigmoid = Activation("sigmoid")
print(sigmoid)
# <class 'paddle.nn.layer.activation.Sigmoid'>
not_exit_one = Activation("not_exit_one")
# KeyError: "not_exit_one does not exist in the current dict_keys(['elu', 'gelu', 'hardshrink',
# 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid', 'softmax',
# 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax', 'hsigmoid'])"
"""
def __init__(self, act: str = None):
super(Activation, self).__init__()
self._act = act
upper_act_names = nn.layer.activation.__dict__.keys()
lower_act_names = [act.lower() for act in upper_act_names]
act_dict = dict(zip(lower_act_names, upper_act_names))
if act is not None:
if act in act_dict.keys():
act_name = act_dict[act]
self.act_func = eval("nn.layer.activation.{}()".format(act_name))
else:
raise KeyError("{} does not exist in the current {}".format(act, act_dict.keys()))
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
if self._act is not None:
return self.act_func(x)
else:
return x
class ASPPModule(nn.Layer):
"""
Atrous Spatial Pyramid Pooling.
Args:
aspp_ratios (tuple): The dilation rates used in the ASPP module.
in_channels (int): The number of input channels.
out_channels (int): The number of output channels.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.
use_sep_conv (bool, optional): Whether to use separable conv in the ASPP module. Default: False.
image_pooling (bool, optional): Whether to augment with image-level features. Default: False.
"""
def __init__(self,
aspp_ratios: Tuple[int],
in_channels: int,
out_channels: int,
align_corners: bool,
use_sep_conv: bool = False,
image_pooling: bool = False):
super().__init__()
self.align_corners = align_corners
self.aspp_blocks = nn.LayerList()
for ratio in aspp_ratios:
if use_sep_conv and ratio > 1:
conv_func = SeparableConvBNReLU
else:
conv_func = ConvBNReLU
block = conv_func(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=1 if ratio == 1 else 3,
dilation=ratio,
padding=0 if ratio == 1 else ratio)
self.aspp_blocks.append(block)
out_size = len(self.aspp_blocks)
if image_pooling:
self.global_avg_pool = nn.Sequential(
nn.AdaptiveAvgPool2D(output_size=(1, 1)),
ConvBNReLU(in_channels, out_channels, kernel_size=1, bias_attr=False))
out_size += 1
self.image_pooling = image_pooling
self.conv_bn_relu = ConvBNReLU(in_channels=out_channels * out_size, out_channels=out_channels, kernel_size=1)
self.dropout = nn.Dropout(p=0.1) # drop rate
def forward(self, x: paddle.Tensor) -> paddle.Tensor:
outputs = []
for block in self.aspp_blocks:
y = block(x)
y = F.interpolate(y, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(y)
if self.image_pooling:
img_avg = self.global_avg_pool(x)
img_avg = F.interpolate(img_avg, x.shape[2:], mode='bilinear', align_corners=self.align_corners)
outputs.append(img_avg)
x = paddle.concat(outputs, axis=1)
x = self.conv_bn_relu(x)
x = self.dropout(x)
return x
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Union, List, Tuple
import paddle
from paddle import nn
import paddle.nn.functional as F
import numpy as np
from paddlehub.module.module import moduleinfo
import paddlehub.vision.segmentation_transforms as T
from paddlehub.module.cv_module import ImageSegmentationModule
from fcn_hrnetw48_cityscapes.hrnet import HRNet_W48
import fcn_hrnetw48_cityscapes.layers as layers
@moduleinfo(
name="fcn_hrnetw48_cityscapes",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="Fcn_hrnetw48 is a segmentation model.",
version="1.0.0",
meta=ImageSegmentationModule)
class FCN(nn.Layer):
"""
A simple implementation for FCN based on PaddlePaddle.
The original article refers to
Evan Shelhamer, et, al. "Fully Convolutional Networks for Semantic Segmentation"
(https://arxiv.org/abs/1411.4038).
Args:
num_classes (int): The unique number of target classes.
backbone_indices (tuple, optional): The values in the tuple indicate the indices of output of backbone.
Default: (-1, ).
channels (int, optional): The channels between conv layer and the last layer of FCNHead.
If None, it will be the number of channels of input features. Default: None.
align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature
is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False.
pretrained (str, optional): The path or url of pretrained model. Default: None
"""
def __init__(self,
num_classes: int = 19,
backbone_indices: Tuple[int] = (-1, ),
channels: int = None,
align_corners: bool = False,
pretrained: str = None):
super(FCN, self).__init__()
self.backbone = HRNet_W48()
backbone_channels = [self.backbone.feat_channels[i] for i in backbone_indices]
self.head = FCNHead(num_classes, backbone_indices, backbone_channels, channels)
self.align_corners = align_corners
self.transforms = T.Compose([T.Normalize()])
if pretrained is not None:
model_dict = paddle.load(pretrained)
self.set_dict(model_dict)
print("load custom parameters success")
else:
checkpoint = os.path.join(self.directory, 'model.pdparams')
model_dict = paddle.load(checkpoint)
self.set_dict(model_dict)
print("load pretrained parameters success")
def transform(self, img: Union[np.ndarray, str]) -> Union[np.ndarray, str]:
return self.transforms(img)
def forward(self, x: paddle.Tensor) -> List[paddle.Tensor]:
feat_list = self.backbone(x)
logit_list = self.head(feat_list)
return [
F.interpolate(logit, paddle.shape(x)[2:], mode='bilinear', align_corners=self.align_corners)
for logit in logit_list
]
class FCNHead(nn.Layer):
"""
A simple implementation for FCNHead based on PaddlePaddle
Args:
num_classes (int): The unique number of target classes.
backbone_indices (tuple, optional): The values in the tuple indicate the indices of output of backbone.
Default: (-1, ).
backbone_channels (tuple): The values of backbone channels.
Default: (270, ).
channels (int, optional): The channels between conv layer and the last layer of FCNHead.
If None, it will be the number of channels of input features. Default: None.
pretrained (str, optional): The path of pretrained model. Default: None
"""
def __init__(self,
num_classes: int,
backbone_indices: Tuple[int] = (-1, ),
backbone_channels: Tuple[int] = (270, ),
channels: int = None):
super(FCNHead, self).__init__()
self.num_classes = num_classes
self.backbone_indices = backbone_indices
if channels is None:
channels = backbone_channels[0]
self.conv_1 = layers.ConvBNReLU(
in_channels=backbone_channels[0], out_channels=channels, kernel_size=1, padding='same', stride=1)
self.cls = nn.Conv2D(in_channels=channels, out_channels=self.num_classes, kernel_size=1, stride=1, padding=0)
def forward(self, feat_list: nn.Layer) -> List[paddle.Tensor]:
logit_list = []
x = feat_list[self.backbone_indices[0]]
x = self.conv_1(x)
logit = self.cls(x)
logit_list.append(logit)
return logit_list