Merge branch 'develop' of https://github.com/PaddlePaddle/models into cloudtest2

05403680 · guosheng · 2c52abfa · b8cbdd33 · 05403680 · 05403680
407 changed files
--- a/.gitmodules
+++ b/.gitmodules
+[submodule "fluid/SimNet"]
+	path = fluid/SimNet
+	url = https://github.com/baidu/AnyQ.git
+[submodule "fluid/LAC"]
+	path = fluid/LAC
+	url = https://github.com/baidu/lac
+[submodule "fluid/Senta"]
+	path = fluid/Senta
+	url = https://github.com/baidu/Senta
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ PaddlePaddle provides a rich set of computational units to enable users to adopt

 - [fluid models](fluid): use PaddlePaddle's Fluid APIs. We especially recommend users to use Fluid models.

- [v2 models](v2): use PaddlePaddle's v2 APIs.
+- [legacy models](legacy): use PaddlePaddle's v2 APIs.


 ## License

--- a/fluid/DeepASR/model_utils/model.py
+++ b/fluid/DeepASR/model_utils/model.py
@@ -2,7 +2,6 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function

-import paddle.v2 as paddle
 import paddle.fluid as fluid



--- a/LAC @ 66660503
+++ b/LAC @ 66660503
+Subproject commit 66660503bb6e8f34adc4715ccf42cad77ed46ded
--- a/fluid/README.cn.rst
+++ b/fluid/README.cn.rst
@@ -49,14 +49,46 @@ Network,ICNet)进行语义分割，相比其他分割算法，ICNet兼顾了准

 -  `ICNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/icnet>`__

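+Image Generation
+----------------
+
+Image generation means producing a target image from an input vector, which can be random noise or a user-specified condition vector. Typical applications include handwriting generation, face synthesis, style transfer, and image inpainting. Current image generation tasks rely mainly on generative adversarial networks (GANs).
+A GAN consists of two sub-networks: a generator and a discriminator. The generator takes random noise or a condition vector as input and outputs the target image. The discriminator is a classifier that takes an image as input and outputs whether that image is real. During training, the generator and the discriminator improve by continually competing against each other.
+
+For the image generation task, we show how to generate handwritten digits with DCGAN and ConditionalGAN, and we also introduce CycleGAN for style transfer.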
+
+- `DCGAN & ConditionalGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/c_gan>`__
+- `CycleGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/cycle_gan>`__
+
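 Scene Text Recognition
 ----------------------

 Many scene images contain rich text that plays an important role in understanding the image and greatly helps people perceive and interpret its content. Scene text recognition converts image information into a character sequence under conditions such as complex backgrounds, low resolution, varied fonts, and arbitrary layout. It can be viewed as a special kind of translation, from image input to natural-language output. Advances in scene text recognition have also enabled new applications, such as automatically reading the text on street signs to give street-view applications more accurate address information.

-For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, removing the need for hand-crafted features and character segmentation, and using automatically learned image features to perform end-to-end unconstrained character localization and recognition. Currently the CRNN-CTC model is covered; a sequence-to-sequence model based on the attention mechanism will be introduced later.
+For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, removing the need for hand-crafted features and character segmentation, and using automatically learned image features to perform character recognition. Currently the CRNN-CTC model and a sequence-to-sequence model based on the attention mechanism are covered.

 -  `CRNN-CTC model <https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition>`__
+-  `Attention model <https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition>`__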
+
+
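+Metric Learning
+---------------
+
+
+Metric learning, also called distance metric learning or similarity learning, learns distances between objects so that the relationships and comparisons between them can be analyzed. It is widely used in practice: it can assist classification and clustering, and it is common in image retrieval, face recognition, and related areas. Previously, each task required choosing suitable features and hand-building a distance function, whereas metric learning learns a task-specific distance function on its own. Combining metric learning with deep learning has achieved good performance in face recognition/verification, person re-identification (human Re-ID), and image retrieval. In this task we mainly introduce Fluid-based deep metric learning models, including loss functions such as triplet and quadruplet losses.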
+
+- `Metric Learning <https://github.com/PaddlePaddle/models/tree/develop/fluid/metric_learning>`__
+
+
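+Video Classification
+--------------------
+
+Video classification is the foundation of video understanding. Unlike image classification, the object being classified is no longer a still image but a video made up of many frames and carrying audio and motion information. Understanding video therefore requires more context: beyond knowing what each frame is and contains, the model must combine different frames to capture how they relate. Video classification methods are mainly based on convolutional neural networks, recurrent neural networks, or a combination of the two. In this task we introduce Fluid-based video classification models. The Temporal Segment Network (TSN) model is currently included, and more models will be added over time.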
+
+
+- `TSN <https://github.com/PaddlePaddle/models/tree/develop/fluid/video_classification>`__
+
+

 Speech Recognition
 ------------------
@@ -124,6 +156,15 @@ DQN 及其变体，并测试了它们在 Atari 游戏中的表现。

 - `Senta <https://github.com/baidu/Senta/blob/master/README.md>`__

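+Semantic Matching
+-----------------
+
+In many natural language processing scenarios, the semantic similarity between two texts needs to be measured; such tasks are commonly called semantic matching. Examples include ranking search results by the similarity between a query and candidate documents, computing text-to-text similarity for deduplication, and matching candidate answers to questions in automatic question answering.
+
+The DAM (Deep Attention Matching Network) released in this example is work by Baidu's Natural Language Processing Department published at ACL 2018, used for response selection in multi-turn dialogue for retrieval-based chatbots. Inspired by the Transformer, DAM's network structure is based entirely on the attention mechanism: it uses stacked self-attention to learn semantic representations of the response and the context at different granularities, then uses cross-attention to capture the relevance between response and context. It outperforms other models on two large-scale multi-turn dialogue datasets.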
+
+- `Deep Attention Matching Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/deep_attention_matching_net>`__ 
+
 AnyQ
 ----

@@ -135,3 +176,12 @@ SimNet是百度自然语言处理部于2013年自主研发的语义匹配框架

 -  `SimNet in PaddlePaddle
   Fluid <https://github.com/baidu/AnyQ/blob/master/tools/simnet/train/paddle/README.md>`__
+   
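+Machine Reading Comprehension
+-----------------------------
+
+Machine reading comprehension (MRC) is one of the core tasks in natural language processing (NLP). Its ultimate goal is for machines to read text as humans do, distill the information in it, and answer related questions. Deep learning has been widely adopted in NLP in recent years and has greatly improved machine reading comprehension. However, current MRC research uses manually constructed datasets and answers relatively simple questions, which differs markedly from the data humans handle, so large-scale real-world training data is urgently needed to advance MRC further.
+
+The Baidu reading comprehension dataset is a real-world dataset open-sourced by Baidu's Natural Language Processing Department. All questions and source documents come from real data (Baidu search engine data and the Baidu Zhidao Q&A community), and the answers are written by humans. Each question corresponds to multiple answers. The dataset contains 200k questions, 1000k source documents, and 420k answers, and is currently the largest Chinese MRC dataset. Baidu has also open-sourced a corresponding reading comprehension model, called DuReader, which adopts the commonly used layered network architecture, captures the interaction between question and document through a bidirectional attention mechanism to produce a query-aware document representation, and finally predicts the answer span from that representation using a pointer network.
+
+-  `DuReader in PaddlePaddle Fluid <https://github.com/PaddlePaddle/models/blob/develop/fluid/machine_reading_comprehension/README.md>`__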
--- a/fluid/README.md
+++ b/fluid/README.md
@@ -28,8 +28,11 @@ Fluid模型配置和参数文件的工具。

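 Detecting faces in open environments, especially small, blurry, and partially occluded ones, is also a challenging task. We also show how to train Baidu's in-house PyramidBox face detection model on [WIDER FACE](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace) data. In March 2018 the algorithm took [first place](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html) in multiple WIDER FACE evaluations.

+Faster RCNN is a typical two-stage object detector. Compared with traditional region-proposal methods, the RPN in Faster RCNN shares convolutional layer parameters, which greatly improves the efficiency of region extraction and yields high-quality candidate regions.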
+
 -  [Single Shot MultiBox Detector](https://github.com/PaddlePaddle/models/blob/develop/fluid/object_detection/README_cn.md)
 -  [Face Detector: PyramidBox](https://github.com/PaddlePaddle/models/tree/develop/fluid/face_detection/README_cn.md)
+-  [Faster RCNN](https://github.com/PaddlePaddle/models/tree/develop/fluid/faster_rcnn/README_cn.md)

 Image Semantic Segmentation
 ---------------------------
@@ -41,14 +44,45 @@ Network,ICNet)进行语义分割，相比其他分割算法，ICNet兼顾了准

 -  [ICNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/icnet)

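+Image Generation
+----------------
+
+Image generation means producing a target image from an input vector, which can be random noise or a user-specified condition vector. Typical applications include handwriting generation, face synthesis, style transfer, and image inpainting. Current image generation tasks rely mainly on generative adversarial networks (GANs).
+A GAN consists of two sub-networks: a generator and a discriminator. The generator takes random noise or a condition vector as input and outputs the target image. The discriminator is a classifier that takes an image as input and outputs whether that image is real. During training, the generator and the discriminator improve by continually competing against each other.
+
+For the image generation task, we show how to generate handwritten digits with DCGAN and ConditionalGAN, and we also introduce CycleGAN for style transfer.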
+
+- [DCGAN & ConditionalGAN](https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/c_gan)
+- [CycleGAN](https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/cycle_gan)
+
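 Scene Text Recognition
 ----------------------

 Many scene images contain rich text that plays an important role in understanding the image and greatly helps people perceive and interpret its content. Scene text recognition converts image information into a character sequence under conditions such as complex backgrounds, low resolution, varied fonts, and arbitrary layout. It can be viewed as a special kind of translation, from image input to natural-language output. Advances in scene text recognition have also enabled new applications, such as automatically reading the text on street signs to give street-view applications more accurate address information.

-For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, removing the need for hand-crafted features and character segmentation, and using automatically learned image features to perform end-to-end unconstrained character localization and recognition. Currently the CRNN-CTC model is covered; a sequence-to-sequence model based on the attention mechanism will be introduced later.
+For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, removing the need for hand-crafted features and character segmentation, and using automatically learned image features to perform character recognition. Currently the CRNN-CTC model and a sequence-to-sequence model based on the attention mechanism are covered.
+
+-  [CRNN-CTC model](https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition)
+-  [Attention model](https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition)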
+
+
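+Metric Learning
+---------------
+
+
+Metric learning, also called distance metric learning or similarity learning, learns distances between objects so that the relationships and comparisons between them can be analyzed. It is widely used in practice: it can assist classification and clustering, and it is common in image retrieval, face recognition, and related areas. Previously, each task required choosing suitable features and hand-building a distance function, whereas metric learning learns a task-specific distance function on its own. Combining metric learning with deep learning has achieved good performance in face recognition/verification, person re-identification (human Re-ID), and image retrieval. In this task we mainly introduce Fluid-based deep metric learning models, including loss functions such as triplet and quadruplet losses.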
+
+- [Metric Learning](https://github.com/PaddlePaddle/models/tree/develop/fluid/metric_learning)
+
+
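+Video Classification
+--------------------
+
+Video classification is the foundation of video understanding. Unlike image classification, the object being classified is no longer a still image but a video made up of many frames and carrying audio and motion information. Understanding video therefore requires more context: beyond knowing what each frame is and contains, the model must combine different frames to capture how they relate. Video classification methods are mainly based on convolutional neural networks, recurrent neural networks, or a combination of the two. In this task we introduce Fluid-based video classification models. The Temporal Segment Network (TSN) model is currently included, and more models will be added over time.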
+
+
+- [TSN](https://github.com/PaddlePaddle/models/tree/develop/fluid/video_classification)

-  [CRNN-CTC模](https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition)

 Speech Recognition
 ------------------
@@ -94,6 +128,15 @@ Machine Translation, NMT)等阶段。在 NMT 成熟后，机器翻译才真正

 - [Senta](https://github.com/baidu/Senta/blob/master/README.md)

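+Semantic Matching
+-----------------
+
+In many natural language processing scenarios, the semantic similarity between two texts needs to be measured; such tasks are commonly called semantic matching. Examples include ranking search results by the similarity between a query and candidate documents, computing text-to-text similarity for deduplication, and matching candidate answers to questions in automatic question answering.
+
+The DAM (Deep Attention Matching Network) released in this example is work by Baidu's Natural Language Processing Department published at ACL 2018, used for response selection in multi-turn dialogue for retrieval-based chatbots. Inspired by the Transformer, DAM's network structure is based entirely on the attention mechanism: it uses stacked self-attention to learn semantic representations of the response and the context at different granularities, then uses cross-attention to capture the relevance between response and context. It outperforms other models on two large-scale multi-turn dialogue datasets.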
+
+- [Deep Attention Matching Network](https://github.com/PaddlePaddle/models/tree/develop/fluid/deep_attention_matching_net)
+
 AnyQ
 ----

@@ -102,3 +145,12 @@ AnyQ
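 SimNet is a semantic matching framework developed in-house by Baidu's Natural Language Processing Department in 2013. It is widely used across Baidu products and mainly includes core network structures such as BOW, CNN, RNN, and MM-DNN. Mainstream academic semantic matching models such as MatchPyramid, MV-LSTM, and K-NRM are also integrated on top of the framework. Models built with SimNet can be easily added to the AnyQ system to strengthen its semantic matching capability.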

 -  [SimNet in PaddlePaddle Fluid](https://github.com/baidu/AnyQ/blob/master/tools/simnet/train/paddle/README.md)
+
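+Machine Reading Comprehension
+-----------------------------
+
+Machine reading comprehension (MRC) is one of the core tasks in natural language processing (NLP). Its ultimate goal is for machines to read text as humans do, distill the information in it, and answer related questions. Deep learning has been widely adopted in NLP in recent years and has greatly improved machine reading comprehension. However, current MRC research uses manually constructed datasets and answers relatively simple questions, which differs markedly from the data humans handle, so large-scale real-world training data is urgently needed to advance MRC further.
+
+The Baidu reading comprehension dataset is a real-world dataset open-sourced by Baidu's Natural Language Processing Department. All questions and source documents come from real data (Baidu search engine data and the Baidu Zhidao Q&A community), and the answers are written by humans. Each question corresponds to multiple answers. The dataset contains 200k questions, 1000k source documents, and 420k answers, and is currently the largest Chinese MRC dataset. Baidu has also open-sourced a corresponding reading comprehension model, called DuReader, which adopts the commonly used layered network architecture, captures the interaction between question and document through a bidirectional attention mechanism to produce a query-aware document representation, and finally predicts the answer span from that representation using a pointer network.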
+
+-  [DuReader in PaddlePaddle Fluid](https://github.com/PaddlePaddle/models/blob/develop/fluid/machine_reading_comprehension/README.md)
--- a/Senta @ 870651e2
+++ b/Senta @ 870651e2
+Subproject commit 870651e257750f2c237f0b0bc9a27e5d062d1909
--- a/SimNet @ 4dbe7f7b
+++ b/SimNet @ 4dbe7f7b
+Subproject commit 4dbe7f7b0e76c188eb7f448d104f0165f0a12229
--- a/fluid/adversarial/tutorials/mnist_model.py
+++ b/fluid/adversarial/tutorials/mnist_model.py
 """
 CNN on mnist data using fluid api of paddlepaddle
 """
-import paddle.v2 as paddle
+import paddle
 import paddle.fluid as fluid



--- a/fluid/adversarial/tutorials/mnist_tutorial_bim.py
+++ b/fluid/adversarial/tutorials/mnist_tutorial_bim.py
@@ -8,7 +8,7 @@ sys.path.append("..")

 import matplotlib.pyplot as plt
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle

 from advbox.adversary import Adversary
 from advbox.attacks.gradient_method import BIM

--- a/fluid/adversarial/tutorials/mnist_tutorial_deepfool.py
+++ b/fluid/adversarial/tutorials/mnist_tutorial_deepfool.py
@@ -8,7 +8,7 @@ sys.path.append("..")

 import matplotlib.pyplot as plt
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle

 from advbox.adversary import Adversary
 from advbox.attacks.deepfool import DeepFoolAttack

--- a/fluid/adversarial/tutorials/mnist_tutorial_fgsm.py
+++ b/fluid/adversarial/tutorials/mnist_tutorial_fgsm.py
@@ -8,7 +8,7 @@ sys.path.append("..")
 import matplotlib.pyplot as plt
 import numpy as np
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle

 from advbox.adversary import Adversary
 from advbox.attacks.gradient_method import FGSM

--- a/fluid/adversarial/tutorials/mnist_tutorial_ilcm.py
+++ b/fluid/adversarial/tutorials/mnist_tutorial_ilcm.py
@@ -7,7 +7,7 @@ sys.path.append("..")

 import matplotlib.pyplot as plt
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle

 from advbox.adversary import Adversary
 from advbox.attacks.gradient_method import ILCM

--- a/fluid/adversarial/tutorials/mnist_tutorial_jsma.py
+++ b/fluid/adversarial/tutorials/mnist_tutorial_jsma.py
@@ -7,7 +7,7 @@ sys.path.append("..")

 import matplotlib.pyplot as plt
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle

 from advbox.adversary import Adversary
 from advbox.attacks.saliency import JSMA

--- a/fluid/adversarial/tutorials/mnist_tutorial_lbfgs.py
+++ b/fluid/adversarial/tutorials/mnist_tutorial_lbfgs.py
@@ -7,7 +7,7 @@ sys.path.append("..")

 import matplotlib.pyplot as plt
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle

 from advbox.adversary import Adversary
 from advbox.attacks.lbfgs import LBFGS

--- a/fluid/adversarial/tutorials/mnist_tutorial_mifgsm.py
+++ b/fluid/adversarial/tutorials/mnist_tutorial_mifgsm.py
@@ -9,7 +9,7 @@ sys.path.append("..")
 import matplotlib.pyplot as plt
 import numpy as np
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle

 from advbox.adversary import Adversary
 from advbox.attacks.gradient_method import MIFGSM

--- a/fluid/deep_attention_matching_net/model.py
+++ b/fluid/deep_attention_matching_net/model.py
-import cPickle as pickle
+import six
 import numpy as np
 import paddle.fluid as fluid
 import utils.layers as layers
@@ -29,7 +29,7 @@ class Net(object):
        mask_cache = dict() if self.use_mask_cache else None

        turns_data = []
-        for i in xrange(self._max_turn_num):
+        for i in six.moves.xrange(self._max_turn_num):
            turn = fluid.layers.data(
                name="turn_%d" % i,
                shape=[self._max_turn_len, 1],
@@ -37,7 +37,7 @@ class Net(object):
            turns_data.append(turn)

        turns_mask = []
-        for i in xrange(self._max_turn_num):
+        for i in six.moves.xrange(self._max_turn_num):
            turn_mask = fluid.layers.data(
                name="turn_mask_%d" % i,
                shape=[self._max_turn_len, 1],
@@ -64,7 +64,7 @@ class Net(object):
        Hr = response_emb
        Hr_stack = [Hr]

-        for index in range(self._stack_num):
+        for index in six.moves.xrange(self._stack_num):
            Hr = layers.block(
                name="response_self_stack" + str(index),
                query=Hr,
@@ -78,7 +78,7 @@ class Net(object):

        # context part
        sim_turns = []
-        for t in xrange(self._max_turn_num):
+        for t in six.moves.xrange(self._max_turn_num):
            Hu = fluid.layers.embedding(
                input=turns_data[t],
                size=[self._vocab_size + 1, self._emb_size],
@@ -88,7 +88,7 @@ class Net(object):
                    initializer=fluid.initializer.Normal(scale=0.1)))
            Hu_stack = [Hu]

-            for index in range(self._stack_num):
+            for index in six.moves.xrange(self._stack_num):
                # share parameters
                Hu = layers.block(
                    name="turn_self_stack" + str(index),
@@ -104,7 +104,7 @@ class Net(object):
            # cross attention 
            r_a_t_stack = []
            t_a_r_stack = []
-            for index in range(self._stack_num + 1):
+            for index in six.moves.xrange(self._stack_num + 1):
                t_a_r = layers.block(
                    name="t_attend_r_" + str(index),
                    query=Hu_stack[index],
@@ -134,7 +134,7 @@ class Net(object):
                t_a_r = fluid.layers.stack(t_a_r_stack, axis=1)
                r_a_t = fluid.layers.stack(r_a_t_stack, axis=1)
            else:
-                for index in xrange(len(t_a_r_stack)):
+                for index in six.moves.xrange(len(t_a_r_stack)):
                    t_a_r_stack[index] = fluid.layers.unsqueeze(
                        input=t_a_r_stack[index], axes=[1])
                    r_a_t_stack[index] = fluid.layers.unsqueeze(
@@ -151,7 +151,7 @@ class Net(object):
        if self.use_stack_op:
            sim = fluid.layers.stack(sim_turns, axis=2)
        else:
-            for index in xrange(len(sim_turns)):
+            for index in six.moves.xrange(len(sim_turns)):
                sim_turns[index] = fluid.layers.unsqueeze(
                    input=sim_turns[index], axes=[2])
            # sim shape: [batch_size, 2*(stack_num+1), max_turn_num, max_turn_len, max_turn_len]
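The loops in this file move from `xrange`/`range` to `six.moves.xrange` so that the same code iterates lazily on both Python 2 and Python 3. A minimal dependency-free sketch of the same idea follows; the fallback assignment and the `turn_names` helper are illustrative assumptions, not part of the commit.

```python
# Hypothetical stand-in for six.moves.xrange (not taken from the commit):
# on Python 3 the name `xrange` no longer exists, and `range` is already lazy.
try:
    xrange  # Python 2: built-in lazy range
except NameError:
    xrange = range  # Python 3


def turn_names(max_turn_num):
    """Build per-turn input names such as turn_0, turn_1, ... (illustrative)."""
    return ["turn_%d" % i for i in xrange(max_turn_num)]


print(turn_names(3))
```

Using `six.moves.xrange`, as the diff does, avoids repeating this shim in every module at the cost of one extra dependency.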

--- a/fluid/deep_attention_matching_net/test_and_evaluate.py
+++ b/fluid/deep_attention_matching_net/test_and_evaluate.py
 import os
+import six
 import numpy as np
 import time
 import argparse
@@ -6,8 +7,12 @@ import multiprocessing
 import paddle
 import paddle.fluid as fluid
 import utils.reader as reader
-import cPickle as pickle
-from utils.util import print_arguments
+from utils.util import print_arguments, mkdir
+
+try:
+    import cPickle as pickle  #python 2
+except ImportError as e:
+    import pickle  #python 3

 from model import Net

@@ -107,7 +112,7 @@ def parse_args():

 def test(args):
    if not os.path.exists(args.save_path):
-        raise ValueError("Invalid save path %s" % args.save_path)
+        mkdir(args.save_path)
    if not os.path.exists(args.model_path):
        raise ValueError("Invalid model init path %s" % args.model_path)
    # data data_config
@@ -158,7 +163,11 @@ def test(args):
        use_cuda=args.use_cuda, main_program=test_program)

    print("start loading data ...")
-    train_data, val_data, test_data = pickle.load(open(args.data_path, 'rb'))
+    with open(args.data_path, 'rb') as f:
+        if six.PY2:
+            train_data, val_data, test_data = pickle.load(f)
+        else:
+            train_data, val_data, test_data = pickle.load(f, encoding="bytes")
    print("finish loading data ...")

    if args.ext_eval:
@@ -178,9 +187,9 @@ def test(args):
    score_path = os.path.join(args.save_path, 'score.txt')
    score_file = open(score_path, 'w')

-    for it in xrange(test_batch_num // dev_count):
+    for it in six.moves.xrange(test_batch_num // dev_count):
        feed_list = []
-        for dev in xrange(dev_count):
+        for dev in six.moves.xrange(dev_count):
            index = it * dev_count + dev
            feed_dict = reader.make_one_batch_input(test_batches, index)
            feed_list.append(feed_dict)
@@ -190,9 +199,9 @@ def test(args):
        scores = np.array(predicts[0])
        print("step = %d" % it)

-        for dev in xrange(dev_count):
+        for dev in six.moves.xrange(dev_count):
            index = it * dev_count + dev
-            for i in xrange(args.batch_size):
+            for i in six.moves.xrange(args.batch_size):
                score_file.write(
                    str(scores[args.batch_size * dev + i][0]) + '\t' + str(
                        test_batches["label"][index][i]) + '\n')

--- a/fluid/deep_attention_matching_net/train_and_evaluate.py
+++ b/fluid/deep_attention_matching_net/train_and_evaluate.py
 import os
+import six
 import numpy as np
 import time
 import argparse
@@ -6,9 +7,13 @@ import multiprocessing
 import paddle
 import paddle.fluid as fluid
 import utils.reader as reader
-import cPickle as pickle
 from utils.util import print_arguments

+try:
+    import cPickle as pickle  #python 2
+except ImportError as e:
+    import pickle  #python 3
+
 from model import Net


@@ -164,35 +169,45 @@ def train(args):

    if args.word_emb_init is not None:
        print("start loading word embedding init ...")
-        word_emb = np.array(pickle.load(open(args.word_emb_init, 'rb'))).astype(
-            'float32')
+        if six.PY2:
+            word_emb = np.array(pickle.load(open(args.word_emb_init,
+                                                 'rb'))).astype('float32')
+        else:
+            word_emb = np.array(
+                pickle.load(
+                    open(args.word_emb_init, 'rb'), encoding="bytes")).astype(
+                        'float32')
        dam.set_word_embedding(word_emb, place)
        print("finish init word embedding  ...")

    print("start loading data ...")
-    train_data, val_data, test_data = pickle.load(open(args.data_path, 'rb'))
+    with open(args.data_path, 'rb') as f:
+        if six.PY2:
+            train_data, val_data, test_data = pickle.load(f)
+        else:
+            train_data, val_data, test_data = pickle.load(f, encoding="bytes")
    print("finish loading data ...")

    val_batches = reader.build_batches(val_data, data_conf)

-    batch_num = len(train_data['y']) / args.batch_size
+    batch_num = len(train_data[six.b('y')]) // args.batch_size
    val_batch_num = len(val_batches["response"])

-    print_step = max(1, batch_num / (dev_count * 100))
-    save_step = max(1, batch_num / (dev_count * 10))
+    print_step = max(1, batch_num // (dev_count * 100))
+    save_step = max(1, batch_num // (dev_count * 10))

    print("begin model training ...")
    print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time())))

    step = 0
-    for epoch in xrange(args.num_scan_data):
+    for epoch in six.moves.xrange(args.num_scan_data):
        shuffle_train = reader.unison_shuffle(train_data)
        train_batches = reader.build_batches(shuffle_train, data_conf)

        ave_cost = 0.0
-        for it in xrange(batch_num // dev_count):
+        for it in six.moves.xrange(batch_num // dev_count):
            feed_list = []
-            for dev in xrange(dev_count):
+            for dev in six.moves.xrange(dev_count):
                index = it * dev_count + dev
                feed_dict = reader.make_one_batch_input(train_batches, index)
                feed_list.append(feed_dict)
@@ -215,9 +230,9 @@ def train(args):

                score_path = os.path.join(args.save_path, 'score.' + str(step))
                score_file = open(score_path, 'w')
-                for it in xrange(val_batch_num // dev_count):
+                for it in six.moves.xrange(val_batch_num // dev_count):
                    feed_list = []
-                    for dev in xrange(dev_count):
+                    for dev in six.moves.xrange(dev_count):
                        val_index = it * dev_count + dev
                        feed_dict = reader.make_one_batch_input(val_batches,
                                                                val_index)
@@ -227,9 +242,9 @@ def train(args):
                                            fetch_list=[logits.name])

                    scores = np.array(predicts[0])
-                    for dev in xrange(dev_count):
+                    for dev in six.moves.xrange(dev_count):
                        val_index = it * dev_count + dev
-                        for i in xrange(args.batch_size):
+                        for i in six.moves.xrange(args.batch_size):
                            score_file.write(
                                str(scores[args.batch_size * dev + i][0]) + '\t'
                                + str(val_batches["label"][val_index][
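The hunks above load Python 2 pickles with `encoding="bytes"` on Python 3 and then index the result with `six.b('y')`. The sketch below illustrates why the byte-string key becomes necessary. The `PY2_PICKLE` payload and the `load_py2_pickle` helper are hypothetical, hand-written for illustration, and `sys.version_info` stands in for `six.PY2`.

```python
import pickle
import sys

# Bytes that Python 2 would produce for pickle.dumps({'y': [1, 0]}) with
# protocol 0 (written out by hand here; an assumption for illustration).
PY2_PICKLE = b"(dp0\nS'y'\np1\n(lp2\nI1\naI0\nas."


def load_py2_pickle(raw):
    """Load a Python 2 pickle on either interpreter."""
    if sys.version_info[0] == 2:
        return pickle.loads(raw)
    # On Python 3, encoding="bytes" leaves Python 2 str objects as bytes
    # instead of trying to decode them as ASCII text.
    return pickle.loads(raw, encoding="bytes")


data = load_py2_pickle(PY2_PICKLE)
labels = data[b"y"]  # b"y" is what six.b('y') evaluates to on Python 3
print(labels)
```

Because the dictionary keys come back as bytes on Python 3, every lookup that used `data['y']` has to change as well, which is what the `six.b(...)` edits in `utils/reader.py` do.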

--- a/fluid/deep_attention_matching_net/utils/douban_evaluation.py
+++ b/fluid/deep_attention_matching_net/utils/douban_evaluation.py
 import sys
+import six
 import numpy as np
 from sklearn.metrics import average_precision_score

@@ -7,7 +8,7 @@ def mean_average_precision(sort_data):
    #to do
    count_1 = 0
    sum_precision = 0
-    for index in range(len(sort_data)):
+    for index in six.moves.xrange(len(sort_data)):
        if sort_data[index][1] == 1:
            count_1 += 1
            sum_precision += 1.0 * count_1 / (index + 1)

--- a/fluid/deep_attention_matching_net/utils/evaluation.py
+++ b/fluid/deep_attention_matching_net/utils/evaluation.py
 import sys
+import six


 def get_p_at_n_in_m(data, n, m, ind):
@@ -30,9 +31,9 @@ def evaluate(file_path):
    p_at_2_in_10 = 0.0
    p_at_5_in_10 = 0.0

-    length = len(data) / 10
+    length = len(data) // 10

-    for i in xrange(0, length):
+    for i in six.moves.xrange(0, length):
        ind = i * 10
        assert data[ind][1] == 1
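The change from `/` to `//` matters because the result is used as a loop bound: on Python 3, `/` between integers yields a float, which `range`-style functions reject. A minimal sketch, with `group_count` as a hypothetical helper name:

```python
def group_count(num_rows, group_size=10):
    """Number of complete groups of `group_size` rows (floor division)."""
    return num_rows // group_size


n = group_count(25)
# The result is an int on both Python 2 and Python 3, so it is a valid bound.
for i in range(0, n):
    start = i * 10
    print("group", i, "starts at row", start)
```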


--- a/fluid/deep_attention_matching_net/utils/reader.py
+++ b/fluid/deep_attention_matching_net/utils/reader.py
-import cPickle as pickle
+import six
 import numpy as np

+try:
+    import cPickle as pickle  #python 2
+except ImportError as e:
+    import pickle  #python 3
+

 def unison_shuffle(data, seed=None):
    if seed is not None:
        np.random.seed(seed)

-    y = np.array(data['y'])
-    c = np.array(data['c'])
-    r = np.array(data['r'])
+    y = np.array(data[six.b('y')])
+    c = np.array(data[six.b('c')])
+    r = np.array(data[six.b('r')])

    assert len(y) == len(c) == len(r)
    p = np.random.permutation(len(y))
-    shuffle_data = {'y': y[p], 'c': c[p], 'r': r[p]}
+    shuffle_data = {six.b('y'): y[p], six.b('c'): c[p], six.b('r'): r[p]}
    return shuffle_data


@@ -65,9 +70,9 @@ def produce_one_sample(data,
       max_turn_len=50
       return y, nor_turns_nor_c, nor_r, turn_len, term_len, r_len
    '''
-    c = data['c'][index]
-    r = data['r'][index][:]
-    y = data['y'][index]
+    c = data[six.b('c')][index]
+    r = data[six.b('r')][index][:]
+    y = data[six.b('y')][index]

    turns = split_c(c, split_id)
    #normalize turns_c length, nor_turns length is max_turn_num
@@ -101,7 +106,7 @@ def build_one_batch(data,

    _label = []

-    for i in range(conf['batch_size']):
+    for i in six.moves.xrange(conf['batch_size']):
        index = batch_index * conf['batch_size'] + i
        y, nor_turns_nor_c, nor_r, turn_len, term_len, r_len = produce_one_sample(
            data, index, conf['_EOS_'], conf['max_turn_num'],
@@ -145,8 +150,8 @@ def build_batches(data, conf, turn_cut_type='tail', term_cut_type='tail'):

    _label_batches = []

-    batch_len = len(data['y']) / conf['batch_size']
-    for batch_index in range(batch_len):
+    batch_len = len(data[six.b('y')]) // conf['batch_size']
+    for batch_index in six.moves.range(batch_len):
        _turns, _tt_turns_len, _every_turn_len, _response, _response_len, _label = build_one_batch(
            data, batch_index, conf, turn_cut_type='tail', term_cut_type='tail')

@@ -192,8 +197,10 @@ def make_one_batch_input(data_batches, index):
    max_turn_num = turns.shape[1]
    max_turn_len = turns.shape[2]

-    turns_list = [turns[:, i, :] for i in xrange(max_turn_num)]
-    every_turn_len_list = [every_turn_len[:, i] for i in xrange(max_turn_num)]
+    turns_list = [turns[:, i, :] for i in six.moves.xrange(max_turn_num)]
+    every_turn_len_list = [
+        every_turn_len[:, i] for i in six.moves.xrange(max_turn_num)
+    ]

    feed_dict = {}
    for i, turn in enumerate(turns_list):
@@ -204,7 +211,7 @@ def make_one_batch_input(data_batches, index):
    for i, turn_len in enumerate(every_turn_len_list):
        feed_dict["turn_mask_%d" % i] = np.ones(
            (batch_size, max_turn_len, 1)).astype("float32")
-        for row in xrange(batch_size):
+        for row in six.moves.xrange(batch_size):
            feed_dict["turn_mask_%d" % i][row, turn_len[row]:, 0] = 0

    feed_dict["response"] = response
@@ -212,7 +219,7 @@ def make_one_batch_input(data_batches, index):

    feed_dict["response_mask"] = np.ones(
        (batch_size, max_turn_len, 1)).astype("float32")
-    for row in xrange(batch_size):
+    for row in six.moves.xrange(batch_size):
        feed_dict["response_mask"][row, response_len[row]:, 0] = 0

    feed_dict["label"] = np.array([data_batches["label"][index]]).reshape(
@@ -228,14 +235,14 @@ if __name__ == '__main__':
        "max_turn_len": 50,
        "_EOS_": 28270,
    }
-    train, val, test = pickle.load(open('../data/ubuntu/data_small.pkl', 'rb'))
+    with open('../ubuntu/data/data_small.pkl', 'rb') as f:
+        if six.PY2:
+            train, val, test = pickle.load(f)
+        else:
+            train, val, test = pickle.load(f, encoding="bytes")
    print('load data success')

    train_batches = build_batches(train, conf)
    val_batches = build_batches(val, conf)
    test_batches = build_batches(test, conf)
    print('build batches success')
-
-    pickle.dump([train_batches, val_batches, test_batches],
-                open('../data/ubuntu/data_small_xxx.pkl', 'wb'))
-    print('dump success')
--- a/fluid/deep_attention_matching_net/utils/util.py
+++ b/fluid/deep_attention_matching_net/utils/util.py
+import six
+import os
+
+
 def print_arguments(args):
    print('-----------  Configuration Arguments -----------')
-    for arg, value in sorted(vars(args).iteritems()):
+    for arg, value in sorted(six.iteritems(vars(args))):
        print('%s: %s' % (arg, value))
    print('------------------------------------------------')


+def mkdir(path):
+    if not os.path.isdir(path):
+        mkdir(os.path.split(path)[0])
+    else:
+        return
+    os.mkdir(path)
+
+
 def pos_encoding_init():
    pass
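The recursive `mkdir` helper added above walks up the path until it finds an existing directory. As far as can be told from the hunk, a relative single-component path such as `output` would not terminate, since `os.path.split("output")[0]` is `""` and `os.path.isdir("")` is false. A hedged alternative sketch that works on Python 2 and 3 is shown below; `mkdir_p` is a hypothetical name, not part of the commit.

```python
import errno
import os
import tempfile


def mkdir_p(path):
    """Create `path` and any missing parents; do nothing if it already exists."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Ignore "already exists" only when the existing entry is a directory.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise


base = tempfile.mkdtemp()
target = os.path.join(base, "a", "b")
mkdir_p(target)
mkdir_p(target)  # second call is a no-op
print(os.path.isdir(target))
```

On Python 3 only, `os.makedirs(path, exist_ok=True)` covers the same case in one call.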


--- a/fluid/deeplabv3+/.gitignore
+++ b/fluid/deeplabv3+/.gitignore
+deeplabv3plus_xception65_initialize.params
+deeplabv3plus.params
+deeplabv3plus.tar.gz
--- a/fluid/deeplabv3+/README.md
+++ b/fluid/deeplabv3+/README.md
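-DeepLab: running the example programs in this directory requires the latest PaddlePaddle develop version. If your installed PaddlePaddle version is lower than this, please update it following the instructions in the [installation documentation](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html).
+DeepLab: running the example programs in this directory requires PaddlePaddle Fluid v1.0.0 or later. If your installed PaddlePaddle version is lower than this, please update it following the instructions in the installation documentation. If you use a GPU, the program requires cuDNN v7.


 ## Code Structure
@@ -41,10 +41,12 @@ data/cityscape/
 To train the model from scratch, download our initialization model: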
 ```
 wget http://paddlemodels.cdn.bcebos.com/deeplab/deeplabv3plus_xception65_initialize.tar.gz
+tar -xf deeplabv3plus_xception65_initialize.tar.gz && rm deeplabv3plus_xception65_initialize.tar.gz
 ```
 To fine-tune from the final trained model or use it directly for prediction, download our final model:
 ```
 wget http://paddlemodels.cdn.bcebos.com/deeplab/deeplabv3plus.tar.gz
+tar -xf deeplabv3plus.tar.gz && rm deeplabv3plus.tar.gz
 ```


@@ -70,11 +72,11 @@ python train.py --help
 ```
 python ./train.py \
    --batch_size=8 \
-    --parallel=true
+    --parallel=true \
    --train_crop_size=769 \
    --total_step=90000 \
-    --init_weights_path=$INIT_WEIGHTS_PATH \
-    --save_weights_path=$SAVE_WEIGHTS_PATH \
+    --init_weights_path=deeplabv3plus_xception65_initialize.params \
+    --save_weights_path=output \
    --dataset_path=$DATASET_PATH
 ```

@@ -82,11 +84,10 @@ python ./train.py \
 Run the following command to evaluate on the `Cityscape` test dataset:
 ```
 python ./eval.py \
-    --init_weights_path=$INIT_WEIGHTS_PATH \
+    --init_weights=deeplabv3plus.params \
    --dataset_path=$DATASET_PATH
 ```
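-The model file must be specified with the `--model_path` option.
-The evaluation metric output by the test script is [mean IoU]().
+The model file must be specified with the `--model_path` option. The evaluation metric output by the test script is mean IoU.


 ## Experimental Results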

--- a/fluid/deeplabv3+/eval.py
+++ b/fluid/deeplabv3+/eval.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import os
 os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = '0.98'

@@ -91,7 +94,7 @@ exe = fluid.Executor(place)
 exe.run(sp)

 if args.init_weights_path:
-    print "load from:", args.init_weights_path
+    print("load from:", args.init_weights_path)
    load_model()

 dataset = CityscapeDataset(args.dataset_path, 'val')
@@ -118,7 +121,7 @@ for i, imgs, labels, names in batches:
    mp = (wrong + right) != 0
    miou2 = np.mean((right[mp] * 1.0 / (right[mp] + wrong[mp])))
    if args.verbose:
-        print 'step: %s, mIoU: %s' % (i + 1, miou2)
+        print('step: %s, mIoU: %s' % (i + 1, miou2))
    else:
-        print '\rstep: %s, mIoU: %s' % (i + 1, miou2),
+        print('\rstep: %s, mIoU: %s' % (i + 1, miou2))
        sys.stdout.flush()
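In the non-verbose branch, the Python 2 statement ended with a trailing comma, which suppressed the newline so that `\r` could overwrite the same console line. The converted `print(...)` call emits a newline each time. A minimal sketch of one way to keep the in-place behaviour is below, assuming Python 3 or the `print_function` import this diff adds; `report_progress` is a hypothetical helper.

```python
import sys


def report_progress(step, miou, stream=sys.stdout):
    """Rewrite the current console line instead of appending a new one."""
    # end='' suppresses the newline, so the leading '\r' of the next call
    # moves the cursor back and overwrites this text.
    print('\rstep: %s, mIoU: %s' % (step, miou), end='', file=stream)
    stream.flush()


for step, miou in enumerate([0.41, 0.47, 0.52], start=1):
    report_progress(step, miou)
print()  # finish the line once the loop is done
```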
--- a/fluid/deeplabv3+/models.py
+++ b/fluid/deeplabv3+/models.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import paddle
 import paddle.fluid as fluid

@@ -50,7 +53,7 @@ def append_op_result(result, name):

 def conv(*args, **kargs):
    kargs['param_attr'] = name_scope + 'weights'
-    if kargs.has_key('bias_attr') and kargs['bias_attr']:
+    if 'bias_attr' in kargs and kargs['bias_attr']:
        kargs['bias_attr'] = name_scope + 'biases'
    else:
        kargs['bias_attr'] = False
@@ -62,7 +65,7 @@ def group_norm(input, G, eps=1e-5, param_attr=None, bias_attr=None):

    N, C, H, W = input.shape
    if C % G != 0:
-        print "group can not divide channle:", C, G
+        print("group can not divide channle:", C, G)
        for d in range(10):
            for t in [d, -d]:
                if G + t <= 0: continue
@@ -70,7 +73,7 @@ def group_norm(input, G, eps=1e-5, param_attr=None, bias_attr=None):
                    G = G + t
                    break
            if C % G == 0:
-                print "use group size:", G
+                print("use group size:", G)
                break
    assert C % G == 0
    param_shape = (G, )
@@ -139,7 +142,7 @@ def seq_conv(input, channel, stride, filter, dilation=1, act=None):
            filter,
            stride,
            groups=input.shape[1],
-            padding=(filter / 2) * dilation,
+            padding=(filter // 2) * dilation,
            dilation=dilation)
        input = bn(input)
        if act: input = act(input)
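Review note on the `filter // 2` change above: once `from __future__ import division` is in effect, `/` returns a float even for integer operands, so the padding passed to the convolution would no longer be an integer. The snippet below is a minimal sketch of that arithmetic; `same_padding` is a hypothetical helper name used only for illustration and does not exist in the repository.

```python
from __future__ import division  # makes `/` true division on Python 2 as well


def same_padding(filter_size, dilation=1):
    """Padding that preserves spatial size for an odd kernel at stride 1."""
    # Floor division keeps the result an int; `filter_size / 2` would be a float.
    return (filter_size // 2) * dilation


if __name__ == "__main__":
    print(same_padding(3))              # padding for a plain 3x3 kernel
    print(same_padding(3, dilation=2))  # padding for a dilated 3x3 kernel
    print(type(3 / 2).__name__, type(3 // 2).__name__)
```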

--- a/fluid/deeplabv3+/reader.py
+++ b/fluid/deeplabv3+/reader.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import cv2
 import numpy as np
+import os
+import six

 default_config = {
    "shuffle": True,
@@ -30,7 +35,7 @@ def slice_with_pad(a, s, value=0):
                pr = 0
            pads.append([pl, pr])
            slices.append([l, r])
-    slices = map(lambda x: slice(x[0], x[1], 1), slices)
+    slices = tuple(slice(x[0], x[1], 1) for x in slices)
    a = a[slices]
    a = np.pad(a, pad_width=pads, mode='constant', constant_values=value)
    return a
@@ -38,11 +43,17 @@ def slice_with_pad(a, s, value=0):

 class CityscapeDataset:
    def __init__(self, dataset_dir, subset='train', config=default_config):
-        import commands
-        label_dirname = dataset_dir + 'gtFine/' + subset
-        label_files = commands.getoutput(
-            "find %s -type f | grep labelTrainIds | sort" %
-            label_dirname).splitlines()
+        label_dirname = os.path.join(dataset_dir, 'gtFine/' + subset)
+        if six.PY2:
+            import commands
+            label_files = commands.getoutput(
+                "find %s -type f | grep labelTrainIds | sort" %
+                label_dirname).splitlines()
+        else:
+            import subprocess
+            label_files = subprocess.getstatusoutput(
+                "find %s -type f | grep labelTrainIds | sort" %
+                label_dirname)[-1].splitlines()
        self.label_files = label_files
        self.label_dirname = label_dirname
        self.index = 0
@@ -50,7 +61,7 @@ class CityscapeDataset:
        self.dataset_dir = dataset_dir
        self.config = config
        self.reset()
-        print "total number", len(label_files)
+        print("total number", len(label_files))

    def reset(self, shuffle=False):
        self.index = 0
@@ -66,13 +77,14 @@ class CityscapeDataset:
        shape = self.config["crop_size"]
        while True:
            ln = self.label_files[self.index]
-            img_name = self.dataset_dir + 'leftImg8bit/' + self.subset + ln[len(
-                self.label_dirname):]
+            img_name = os.path.join(
+                self.dataset_dir,
+                'leftImg8bit/' + self.subset + ln[len(self.label_dirname):])
            img_name = img_name.replace('gtFine_labelTrainIds', 'leftImg8bit')
            label = cv2.imread(ln)
            img = cv2.imread(img_name)
            if img is None:
-                print "load img failed:", img_name
+                print("load img failed:", img_name)
                self.next_img()
            else:
                break
@@ -128,5 +140,7 @@ class CityscapeDataset:
            from prefetch_generator import BackgroundGenerator
            batches = BackgroundGenerator(batches, 100)
        except:
-            print "You can install 'prefetch_generator' for acceleration of data reading."
+            print(
+                "You can install 'prefetch_generator' for acceleration of data reading."
+            )
        return batches
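Review note on the label file discovery in `CityscapeDataset.__init__`: the change keeps the `find | grep | sort` shell pipeline and only switches between `commands` and `subprocess`. A pure-Python listing would avoid the Python 2/3 branch and the shell entirely. The sketch below is an assumption about how that could look, not what this commit does; `list_label_files` is a hypothetical name. It matches the pattern against the full path, as `grep` does on `find` output, and uses Python's default string ordering, which can differ from a locale-aware `sort`.

```python
import os


def list_label_files(label_dirname, pattern="labelTrainIds"):
    """Return sorted file paths under label_dirname that contain pattern."""
    matches = []
    for root, _dirs, files in os.walk(label_dirname):
        for name in files:
            path = os.path.join(root, name)
            # grep in the original pipeline sees the whole path, so match on it
            if pattern in path:
                matches.append(path)
    return sorted(matches)
```

Called as `list_label_files(os.path.join(dataset_dir, 'gtFine', subset))`, it would yield the same kind of list that the diff stores in `self.label_files`.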
--- a/fluid/deeplabv3+/train.py
+++ b/fluid/deeplabv3+/train.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import os
 os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = '0.98'

@@ -126,13 +129,12 @@ exe = fluid.Executor(place)
 exe.run(sp)

 if args.init_weights_path:
-    print "load from:", args.init_weights_path
+    print("load from:", args.init_weights_path)
    load_model()

 dataset = CityscapeDataset(args.dataset_path, 'train')

 if args.parallel:
-    print "Using ParallelExecutor."
    exe_p = fluid.ParallelExecutor(
        use_cuda=True, loss_name=loss_mean.name, main_program=tp)

@@ -149,9 +151,9 @@ for i, imgs, labels, names in batches:
                             'label': labels},
                       fetch_list=[pred, loss_mean])
    if i % 100 == 0:
-        print "Model is saved to", args.save_weights_path
+        print("Model is saved to", args.save_weights_path)
        save_model()
-    print "step %s, loss: %s" % (i, np.mean(retv[1]))
+    print("step %s, loss: %s" % (i, np.mean(retv[1])))

-print "Training done. Model is saved to", args.save_weights_path
+print("Training done. Model is saved to", args.save_weights_path)
 save_model()
--- a/fluid/face_detection/.gitignore
+++ b/fluid/face_detection/.gitignore
@@ -10,3 +10,4 @@ output*
 pred
 eval_tools
 box*
+PyramidBox_WiderFace*
--- a/fluid/face_detection/data_util.py
+++ b/fluid/face_detection/data_util.py
@@ -9,6 +9,7 @@ import time
 import numpy as np
 import threading
 import multiprocessing
+import traceback
 try:
    import queue
 except ImportError:
@@ -71,6 +72,7 @@ class GeneratorEnqueuer(object):
                        try:
                            task()
                        except Exception:
+                            traceback.print_exc()
                            self._stop_event.set()
                            break
            else:
@@ -78,6 +80,7 @@ class GeneratorEnqueuer(object):
                    try:
                        task()
                    except Exception:
+                        traceback.print_exc()
                        self._stop_event.set()
                        break


--- a/fluid/face_detection/pyramidbox.py
+++ b/fluid/face_detection/pyramidbox.py
@@ -427,6 +427,7 @@ class PyramidBox(object):
            overlap_threshold=0.35,
            neg_overlap=0.35)
        loss = fluid.layers.reduce_sum(loss)
+        loss.persistable = True
        return loss

    def train(self):

--- a/fluid/face_detection/reader.py
+++ b/fluid/face_detection/reader.py
@@ -250,6 +250,10 @@ def train_generator(settings, file_list, batch_size, shuffle=True):
                    ymin = float(temp_info_box[1])
                    w = float(temp_info_box[2])
                    h = float(temp_info_box[3])
+
+                    # Filter out wrong labels
+                    if w < 0 or h < 0:
+                        continue
                    xmax = xmin + w
                    ymax = ymin + h

@@ -294,7 +298,7 @@ def train(settings,
                        generator_output = enqueuer.queue.get()
                        break
                    else:
-                        time.sleep(0.02)
+                        time.sleep(0.01)
                yield generator_output
                generator_output = None
        finally:

--- a/fluid/face_detection/train.py
+++ b/fluid/face_detection/train.py
@@ -167,7 +167,7 @@ def train(args, config, train_params, train_file_list):
            shutil.rmtree(model_path)

        print('save models to %s' % (model_path))
-        fluid.io.save_persistables(exe, model_path)
+        fluid.io.save_persistables(exe, model_path, main_program=program)

    train_py_reader.start()
    try:
@@ -189,13 +189,13 @@ def train(args, config, train_params, train_file_list):
                fetch_vars = [np.mean(np.array(v)) for v in fetch_vars]
                if batch_id % 10 == 0:
                    if not args.use_pyramidbox:
-                        print("Pass {0}, batch {1}, loss {2}, time {3}".format(
+                        print("Pass {:d}, batch {:d}, loss {:.6f}, time {:.5f}".format(
                            pass_id, batch_id, fetch_vars[0],
                            start_time - prev_start_time))
                    else:
-                        print("Pass {0}, batch {1}, face loss {2}, " \
-                              "head loss {3}, " \
-                              "time {4}".format(pass_id,
+                        print("Pass {:d}, batch {:d}, face loss {:.6f}, " \
+                              "head loss {:.6f}, " \
+                              "time {:.5f}".format(pass_id,
                               batch_id, fetch_vars[0], fetch_vars[1],
                               start_time - prev_start_time))
            if pass_id % 1 == 0 or pass_id == epoc_num - 1:

--- a/fluid/face_detection/widerface_eval.py
+++ b/fluid/face_detection/widerface_eval.py
@@ -82,9 +82,6 @@ def save_widerface_bboxes(image_path, bboxes_scores, output_dir):
    image_name = image_path.split('/')[-1]
    image_class = image_path.split('/')[-2]

-    image_name = image_name.encode('utf-8')
-    image_class = image_class.encode('utf-8')
-
    odir = os.path.join(output_dir, image_class)
    if not os.path.exists(odir):
        os.makedirs(odir)

--- a/fluid/faster_rcnn/README.md
+++ b/fluid/faster_rcnn/README.md
@@ -43,7 +43,7 @@ After data preparation, one can start the training step by:

    python train.py \
       --max_size=1333 \
-       --scales=800 \
+       --scales=[800] \
       --batch_size=8 \
       --model_save_dir=output/

@@ -57,6 +57,22 @@ After data preparation, one can start the training step by:
    sh ./pretrained/download.sh

 Set `pretrained_model` to load pre-trained model. In addition, this parameter is used to load trained model when finetuning as well.
+Please make sure that `pretrained_model` is downloaded and loaded correctly; otherwise the loss may become NaN during training.
+
+**Install the [cocoapi](https://github.com/cocodataset/cocoapi):**
+
+To train the model, [cocoapi](https://github.com/cocodataset/cocoapi) is needed. Install the cocoapi:
+
+    # COCOAPI=/path/to/clone/cocoapi
+    git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
+    cd $COCOAPI/PythonAPI
+    # if cython is not installed
+    pip install Cython
+    # Install into global site-packages
+    make install
+    # Alternatively, if you do not have permissions or prefer
+    # not to install the COCO API into global site-packages
+    python2 setup.py install --user

 **data reader introduction:**

@@ -103,18 +119,7 @@ Finetuning is to finetune model weights in a specific task by loading pretrained

 ## Evaluation

-Evaluation is to evaluate the performance of a trained model. This sample provides `eval_coco_map.py` which uses a COCO-specific mAP metric defined by [COCO committee](http://cocodataset.org/#detections-eval). To use `eval_coco_map.py` , [cocoapi](https://github.com/cocodataset/cocoapi) is needed. Install the cocoapi:
-
-    # COCOAPI=/path/to/clone/cocoapi
-    git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
-    cd $COCOAPI/PythonAPI
-    # if cython is not installed
-    pip install Cython
-    # Install into global site-packages
-    make install
-    # Alternatively, if you do not have permissions or prefer
-    # not to install the COCO API into global site-packages
-    python2 setup.py install --user
+Evaluation measures the performance of a trained model. This sample provides `eval_coco_map.py`, which uses the COCO-specific mAP metric defined by the [COCO committee](http://cocodataset.org/#detections-eval).

 `eval_coco_map.py` is the main executor for evaluation; one can start the evaluation step by:

@@ -136,7 +141,7 @@ Faster RCNN mAP
 | Detectron                 | 8            |    180000        | 0.315 |
 | Fluid minibatch padding | 8            |    180000        | 0.314 |
 | Fluid all padding         | 8            |    180000        | 0.308 |
-| Fluid no padding         |6            |    240000        | 0.317 |
+| Fluid no padding         | 8            |    180000        | 0.316 |

 * Fluid all padding: Each image is padded to 1333\*1333.
 * Fluid minibatch padding: Images in one batch are padded to the same size. This method is the same as in Detectron.

--- a/fluid/faster_rcnn/README_cn.md
+++ b/fluid/faster_rcnn/README_cn.md
@@ -42,7 +42,7 @@ Faster RCNN object detection model

    python train.py \
       --max_size=1333 \
-       --scales=800 \
+       --scales=[800] \
       --batch_size=8 \
       --model_save_dir=output/ \
       --pretrained_model=${path_to_pretrain_model}
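@@ -57,6 +57,22 @@ Faster RCNN object detection model
    sh ./pretrained/download.sh

 Set `pretrained_model` to load the pre-trained model. The same setting is used to load a trained model when fine-tuning.
+Before training, make sure the pre-trained model has been downloaded and loaded correctly; otherwise the loss may become NaN during training.
+
+**Install [cocoapi](https://github.com/cocodataset/cocoapi):**
+
+Before training, first download [cocoapi](https://github.com/cocodataset/cocoapi):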
+
+    # COCOAPI=/path/to/clone/cocoapi
+    git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
+    cd $COCOAPI/PythonAPI
+    # if cython is not installed
+    pip install Cython
+    # Install into global site-packages
+    make install
+    # Alternatively, if you do not have permissions or prefer
+    # not to install the COCO API into global site-packages
+    python2 setup.py install --user

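 **Data reader:** The data reader is defined in reader.py. Every image is resized proportionally so that its shorter side equals `scales`; if the longer side then exceeds `max_size`, the image is resized again so that its longer side equals `max_size`. During training, images are flipped horizontally. Images in the same batch can be padded to the same size.

@@ -87,18 +103,7 @@ Faster RCNN training loss

 ## Evaluation

-Evaluation measures the performance metrics of a trained model. This example uses the [official COCO evaluation](http://cocodataset.org/#detections-eval); [cocoapi](https://github.com/cocodataset/cocoapi) must be downloaded first: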
-
-    # COCOAPI=/path/to/clone/cocoapi
-    git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
-    cd $COCOAPI/PythonAPI
-    # if cython is not installed
-    pip install Cython
-    # Install into global site-packages
-    make install
-    # Alternatively, if you do not have permissions or prefer
-    # not to install the COCO API into global site-packages
-    python2 setup.py install --user
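+Evaluation measures the performance metrics of a trained model. This example uses the [official COCO evaluation](http://cocodataset.org/#detections-eval).

 `eval_coco_map.py` is the main program of the evaluation module; an example invocation follows: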

@@ -120,7 +125,7 @@ Faster RCNN mAP
 | Detectron                 | 8            |    180000        | 0.315 |
 | Fluid minibatch padding | 8            |    180000        | 0.314 |
 | Fluid all padding         | 8            |    180000        | 0.308 |
-| Fluid no padding            |6            |    240000        | 0.317 |
+| Fluid no padding          | 8            |    180000        | 0.316 |

 * Fluid all padding: Each image is padded to 1333\*1333.
 * Fluid minibatch padding: Images in one batch are padded to the same size. This is the same approach Detectron uses.

--- a/fluid/faster_rcnn/config.py
+++ b/fluid/faster_rcnn/config.py
+#  Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#    http://www.apache.org/licenses/LICENSE-2.0
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License. 
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+from edict import AttrDict
+import six
+import numpy as np
+
+_C = AttrDict()
+cfg = _C
+
+#
+# Training options
+#
+_C.TRAIN = AttrDict()
+
+# scales an image's shortest side
+_C.TRAIN.scales = [800]
+
+# max size of longest side
+_C.TRAIN.max_size = 1333
+
+# images per GPU in minibatch
+_C.TRAIN.im_per_batch = 1
+
+# roi minibatch size per image
+_C.TRAIN.batch_size_per_im = 512
+
+# target fraction of foreground roi minibatch 
+_C.TRAIN.fg_fractrion = 0.25
+
+# overlap threshold for a foreground roi
+_C.TRAIN.fg_thresh = 0.5
+
+# overlap threshold for a background roi
+_C.TRAIN.bg_thresh_hi = 0.5
+_C.TRAIN.bg_thresh_lo = 0.0
+
+# If False, only resize image and not pad, image shape is different between
+# GPUs in one mini-batch. If True, image shape is the same in one mini-batch.
+_C.TRAIN.padding_minibatch = False
+
+# Snapshot period
+_C.TRAIN.snapshot_iter = 10000
+
+# number of RPN proposals to keep before NMS
+_C.TRAIN.rpn_pre_nms_top_n = 12000
+
+# number of RPN proposals to keep after NMS
+_C.TRAIN.rpn_post_nms_top_n = 2000
+
+# NMS threshold used on RPN proposals
+_C.TRAIN.rpn_nms_thresh = 0.7
+
+# min size in RPN proposals
+_C.TRAIN.rpn_min_size = 0.0
+
+# eta for adaptive NMS in RPN
+_C.TRAIN.rpn_eta = 1.0
+
+# number of RPN examples per image
+_C.TRAIN.rpn_batch_size_per_im = 256
+
+# remove anchors out of the image
+_C.TRAIN.rpn_straddle_thresh = 0.
+
+# target fraction of foreground examples per RPN minibatch
+_C.TRAIN.rpn_fg_fraction = 0.5
+
+# min overlap between anchor and gt box for a positive example
+_C.TRAIN.rpn_positive_overlap = 0.7
+
+# max overlap between anchor and gt box for a negative example
+_C.TRAIN.rpn_negative_overlap = 0.3
+
+# stopgrad at a specified stage
+_C.TRAIN.freeze_at = 2
+
+# min area of ground truth box
+_C.TRAIN.gt_min_area = -1
+
+#
+# Inference options
+#
+_C.TEST = AttrDict()
+
+# scales an image's shortest side
+_C.TEST.scales = [800]
+
+# max size of longest side
+_C.TEST.max_size = 1333
+
+# eta for adaptive NMS in RPN
+_C.TEST.rpn_eta = 1.0
+
+# min score threshold to infer
+_C.TEST.score_thresh = 0.05
+
+# overlap threshold used for NMS
+_C.TEST.nms_thresh = 0.5
+
+# number of RPN proposals to keep before NMS
+_C.TEST.rpn_pre_nms_top_n = 6000
+
+# number of RPN proposals to keep after NMS
+_C.TEST.rpn_post_nms_top_n = 1000
+
+# min size in RPN proposals
+_C.TEST.rpn_min_size = 0.0
+
+# max number of detections
+_C.TEST.detectiions_per_im = 100
+
+# NMS threshold used on RPN proposals
+_C.TEST.rpn_nms_thresh = 0.7
+
+#
+# Model options
+#
+
+# weight for bbox regression targets
+_C.bbox_reg_weights = [0.1, 0.1, 0.2, 0.2]
+
+# RPN anchor sizes
+_C.anchor_sizes = [32, 64, 128, 256, 512]
+
+# RPN anchor ratio
+_C.aspect_ratio = [0.5, 1, 2]
+
+# variance of anchors
+_C.variances = [1., 1., 1., 1.]
+
+# stride of feature map
+_C.rpn_stride = [16.0, 16.0]
+
+# Use roi pool or roi align, 'RoIPool' or 'RoIAlign'
+_C.roi_func = 'RoIAlign'
+
+# sampling ratio for roi align
+_C.sampling_ratio = 0
+
+# pooled width and pooled height 
+_C.roi_resolution = 14
+
+# spatial scale 
+_C.spatial_scale = 1. / 16.
+
+#
+# SOLVER options
+#
+
+# base learning rate from which the final learning rate is derived
+_C.learning_rate = 0.01
+
+# maximum number of iterations
+_C.max_iter = 180000
+
+# warm up to learning rate 
+_C.warm_up_iter = 500
+_C.warm_up_factor = 1. / 3.
+
+# lr steps_with_decay
+_C.lr_steps = [120000, 160000]
+_C.lr_gamma = 0.1
+
+# L2 regularization hyperparameter
+_C.weight_decay = 0.0001
+
+# momentum with SGD
+_C.momentum = 0.9
+
+#
+# ENV options
+#
+
+# support both CPU and GPU
+_C.use_gpu = True
+
+# whether to run in parallel
+_C.parallel = True
+
+# Class number
+_C.class_num = 81
+
+# support pyreader
+_C.use_pyreader = True
+
+# pixel mean values
+_C.pixel_means = [102.9801, 115.9465, 122.7717]
+
+# clip box to prevent overflowing
+_C.bbox_clip = np.log(1000. / 16.)
+
+# dataset path
+_C.train_file_list = 'annotations/instances_train2017.json'
+_C.train_data_dir = 'train2017'
+_C.val_file_list = 'annotations/instances_val2017.json'
+_C.val_data_dir = 'val2017'
+
+
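+def merge_cfg_from_args(args, mode):
+    """Merge config keys, values in args into the global config."""
+    import ast  # local import keeps this hunk self-contained
+    if mode == 'train':
+        sub_d = _C.TRAIN
+    else:
+        sub_d = _C.TEST
+    for k, v in sorted(six.iteritems(vars(args))):
+        try:
+            # literal_eval parses "[800]" or "0.5" but never executes code
+            value = ast.literal_eval(v)
+        except (ValueError, SyntaxError, TypeError):
+            # non-string values and plain strings are kept unchanged
+            value = v
+        if k in sub_d:
+            sub_d[k] = value
+        else:
+            _C[k] = value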
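Review note on `merge_cfg_from_args`: a command-line value is routed into `cfg.TRAIN` or `cfg.TEST` when the key already exists there, and onto the top-level config otherwise. The sketch below shows that routing in isolation. It rests on some assumptions: `AttrDict` here is a minimal stand-in for `edict.AttrDict`, `merge_args` is a hypothetical name, and string values are parsed with `ast.literal_eval`, a choice made for this sketch.

```python
import argparse
import ast


class AttrDict(dict):
    """Minimal stand-in for edict.AttrDict: keys are readable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


def merge_args(cfg, args, mode):
    """Copy parsed arguments into cfg, preferring the mode-specific sub-config."""
    sub_d = cfg.TRAIN if mode == 'train' else cfg.TEST
    for k, v in sorted(vars(args).items()):
        try:
            value = ast.literal_eval(v)  # "[600, 800]" becomes a list
        except (ValueError, SyntaxError, TypeError):
            value = v                    # non-strings and plain strings stay as-is
        if k in sub_d:
            sub_d[k] = value             # key known to TRAIN/TEST: override there
        else:
            cfg[k] = value               # otherwise store at the top level


cfg = AttrDict(TRAIN=AttrDict(scales=[800], max_size=1333),
               TEST=AttrDict(scales=[800]))
args = argparse.Namespace(scales='[600, 800]', max_size=1000,
                          model_save_dir='output/')
merge_args(cfg, args, 'train')
print(cfg.TRAIN.scales, cfg.TRAIN.max_size, cfg.model_save_dir)
```

In this run `cfg.TEST.scales` is left untouched, because only the sub-config selected by `mode` is updated.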
--- a/fluid/faster_rcnn/data_utils.py
+++ b/fluid/faster_rcnn/data_utils.py
@@ -27,21 +27,27 @@ from __future__ import unicode_literals

 import cv2
 import numpy as np
+from config import cfg


-def get_image_blob(roidb, settings):
+def get_image_blob(roidb, mode):
    """Builds an input blob from the images in the roidb at the specified
    scales.
    """
-    scale_ind = np.random.randint(0, high=len(settings.scales))
+    if mode == 'train':
+        scales = cfg.TRAIN.scales
+        scale_ind = np.random.randint(0, high=len(scales))
+        target_size = scales[scale_ind]
+        max_size = cfg.TRAIN.max_size
+    else:
+        target_size = cfg.TEST.scales[0]
+        max_size = cfg.TEST.max_size
    im = cv2.imread(roidb['image'])
    assert im is not None, \
        'Failed to read image \'{}\''.format(roidb['image'])
    if roidb['flipped']:
        im = im[:, ::-1, :]
-    target_size = settings.scales[scale_ind]
-    im, im_scale = prep_im_for_blob(im, settings.mean_value, target_size,
-                                    settings.max_size)
+    im, im_scale = prep_im_for_blob(im, cfg.pixel_means, target_size, max_size)

    return im, im_scale


--- a/fluid/faster_rcnn/edict.py
+++ b/fluid/faster_rcnn/edict.py
+# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+
+class AttrDict(dict):
+    def __init__(self, *args, **kwargs):
+        super(AttrDict, self).__init__(*args, **kwargs)
+
+    def __getattr__(self, name):
+        if name in self.__dict__:
+            return self.__dict__[name]
+        elif name in self:
+            return self[name]
+        else:
+            raise AttributeError(name)
+
+    def __setattr__(self, name, value):
+        if name in self.__dict__:
+            self.__dict__[name] = value
+        else:
+            self[name] = value
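Review note on `AttrDict`: the class lets nested configuration be written and read with attribute syntax while remaining an ordinary `dict`. The example below copies the class from this diff and exercises it; the keys used (`TRAIN`, `max_size`, `use_gpu`) mirror `config.py`, and the rest is illustrative only.

```python
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name in self.__dict__:
            return self.__dict__[name]
        elif name in self:
            return self[name]
        else:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name in self.__dict__:
            self.__dict__[name] = value
        else:
            self[name] = value  # attribute writes become dict entries


cfg = AttrDict()
cfg.TRAIN = AttrDict()        # stored as cfg['TRAIN']
cfg.TRAIN.max_size = 1333     # stored as cfg['TRAIN']['max_size']
cfg['use_gpu'] = True         # item syntax stays available

print(cfg.TRAIN.max_size, cfg.use_gpu, 'TRAIN' in cfg)
print(hasattr(cfg, 'missing'))  # a missing key raises AttributeError internally
```

Because a missing key raises `AttributeError` rather than `KeyError`, `hasattr` and `getattr(cfg, name, default)` behave as they do on regular objects.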
--- a/fluid/faster_rcnn/eval_coco_map.py
+++ b/fluid/faster_rcnn/eval_coco_map.py
@@ -29,18 +29,20 @@ import models.resnet as resnet
 import json
 from pycocotools.coco import COCO
 from pycocotools.cocoeval import COCOeval, Params
+from config import cfg


-def eval(cfg):
-
+def eval():
    if '2014' in cfg.dataset:
        test_list = 'annotations/instances_val2014.json'
    elif '2017' in cfg.dataset:
        test_list = 'annotations/instances_val2017.json'

-    image_shape = [3, cfg.max_size, cfg.max_size]
+    image_shape = [3, cfg.TEST.max_size, cfg.TEST.max_size]
    class_nums = cfg.class_num
-    batch_size = cfg.batch_size
+    devices = os.getenv("CUDA_VISIBLE_DEVICES") or ""
+    devices_num = len(devices.split(","))
+    total_batch_size = devices_num * cfg.TRAIN.im_per_batch
    cocoGt = COCO(os.path.join(cfg.data_dir, test_list))
    numId_to_catId_map = {i + 1: v for i, v in enumerate(cocoGt.getCatIds())}
    category_ids = cocoGt.getCatIds()
@@ -51,7 +53,6 @@ def eval(cfg):
    label_list[0] = ['background']

    model = model_builder.FasterRCNN(
-        cfg=cfg,
        add_conv_body_func=resnet.add_ResNet50_conv4_body,
        add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
        use_pyreader=False,
@@ -66,7 +67,7 @@ def eval(cfg):
            return os.path.exists(os.path.join(cfg.pretrained_model, var.name))
        fluid.io.load_vars(exe, cfg.pretrained_model, predicate=if_exist)
    # yapf: enable
-    test_reader = reader.test(cfg, batch_size)
+    test_reader = reader.test(total_batch_size)
    feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())

    dts_res = []
@@ -80,11 +81,11 @@ def eval(cfg):
            fetch_list=[v.name for v in fetch_list],
            feed=feeder.feed(batch_data),
            return_numpy=False)
-        new_lod, nmsed_out = get_nmsed_box(cfg, rpn_rois_v, confs_v, locs_v,
+        new_lod, nmsed_out = get_nmsed_box(rpn_rois_v, confs_v, locs_v,
                                           class_nums, im_info,
                                           numId_to_catId_map)

-        dts_res += get_dt_res(batch_size, new_lod, nmsed_out, batch_data)
+        dts_res += get_dt_res(total_batch_size, new_lod, nmsed_out, batch_data)
        end = time.time()
        print('batch id: {}, time: {}'.format(batch_id, end - start))
    with open("detection_result.json", 'w') as outfile:
@@ -100,6 +101,4 @@ def eval(cfg):
 if __name__ == '__main__':
    args = parse_args()
    print_arguments(args)
-
-    data_args = reader.Settings(args)
-    eval(data_args)
+    eval()
--- a/fluid/faster_rcnn/eval_helper.py
+++ b/fluid/faster_rcnn/eval_helper.py
@@ -20,6 +20,7 @@ import box_utils
 from PIL import Image
 from PIL import ImageDraw
 from PIL import ImageFont
+from config import cfg


 def box_decoder(target_box, prior_box, prior_box_var):
@@ -31,10 +32,8 @@ def box_decoder(target_box, prior_box, prior_box_var):
    prior_box_loc[:, 3] = (prior_box[:, 3] + prior_box[:, 1]) / 2
    pred_bbox = np.zeros_like(target_box, dtype=np.float32)
    for i in range(prior_box.shape[0]):
-        dw = np.minimum(prior_box_var[2] * target_box[i, 2::4],
-                        np.log(1000. / 16.))
-        dh = np.minimum(prior_box_var[3] * target_box[i, 3::4],
-                        np.log(1000. / 16.))
+        dw = np.minimum(prior_box_var[2] * target_box[i, 2::4], cfg.bbox_clip)
+        dh = np.minimum(prior_box_var[3] * target_box[i, 3::4], cfg.bbox_clip)
        pred_bbox[i, 0::4] = prior_box_var[0] * target_box[
            i, 0::4] * prior_box_loc[i, 0] + prior_box_loc[i, 2]
        pred_bbox[i, 1::4] = prior_box_var[1] * target_box[
@@ -67,11 +66,11 @@ def clip_tiled_boxes(boxes, im_shape):
    return boxes


-def get_nmsed_box(args, rpn_rois, confs, locs, class_nums, im_info,
+def get_nmsed_box(rpn_rois, confs, locs, class_nums, im_info,
                  numId_to_catId_map):
    lod = rpn_rois.lod()[0]
    rpn_rois_v = np.array(rpn_rois)
-    variance_v = np.array([0.1, 0.1, 0.2, 0.2])
+    variance_v = np.array(cfg.bbox_reg_weights)
    confs_v = np.array(confs)
    locs_v = np.array(locs)
    rois = box_decoder(locs_v, rpn_rois_v, variance_v)
@@ -89,12 +88,12 @@ def get_nmsed_box(args, rpn_rois, confs, locs, class_nums, im_info,
        cls_boxes = [[] for _ in range(class_nums)]
        scores_n = confs_v[start:end, :]
        for j in range(1, class_nums):
-            inds = np.where(scores_n[:, j] > args.score_threshold)[0]
+            inds = np.where(scores_n[:, j] > cfg.TEST.score_thresh)[0]
            scores_j = scores_n[inds, j]
            rois_j = rois_n[inds, j * 4:(j + 1) * 4]
            dets_j = np.hstack((rois_j, scores_j[:, np.newaxis])).astype(
                np.float32, copy=False)
-            keep = box_utils.nms(dets_j, args.nms_threshold)
+            keep = box_utils.nms(dets_j, cfg.TEST.nms_thresh)
            nms_dets = dets_j[keep, :]
            #add labels
            cat_id = numId_to_catId_map[j]
@@ -105,8 +104,8 @@ def get_nmsed_box(args, rpn_rois, confs, locs, class_nums, im_info,
    # Limit to max_per_image detections **over all classes**
        image_scores = np.hstack(
            [cls_boxes[j][:, -2] for j in range(1, class_nums)])
-        if len(image_scores) > 100:
-            image_thresh = np.sort(image_scores)[-100]
+        if len(image_scores) > cfg.TEST.detectiions_per_im:
+            image_thresh = np.sort(image_scores)[-cfg.TEST.detectiions_per_im]
            for j in range(1, class_nums):
                keep = np.where(cls_boxes[j][:, -2] >= image_thresh)[0]
                cls_boxes[j] = cls_boxes[j][keep, :]
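Review note on the `cfg.TEST.detectiions_per_im` cap above: the code takes the N-th highest score across all classes as a threshold and keeps every detection at or above it. The pure-Python sketch below restates that selection without NumPy. `cap_detections` is a hypothetical name, and plain score lists stand in for the per-class box arrays.

```python
def cap_detections(scores_per_class, max_per_image):
    """Keep detections whose score reaches the max_per_image-th highest score."""
    all_scores = sorted(s for scores in scores_per_class for s in scores)
    if len(all_scores) <= max_per_image:
        # Nothing to trim: return copies of the input lists.
        return [list(scores) for scores in scores_per_class]
    # Same idea as np.sort(image_scores)[-N] in the diff.
    thresh = all_scores[-max_per_image]
    return [[s for s in scores if s >= thresh] for scores in scores_per_class]


kept = cap_detections([[0.9, 0.2], [0.8, 0.1, 0.5]], max_per_image=3)
print(kept)
```

Since the comparison is `>=`, detections tied at the threshold score are all kept, so the result can exceed `max_per_image` when scores repeat.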

--- a/fluid/faster_rcnn/image/mAP.jpg
+++ b/fluid/faster_rcnn/image/mAP.jpg
--- a/fluid/faster_rcnn/image/train_loss.jpg
+++ b/fluid/faster_rcnn/image/train_loss.jpg
--- a/fluid/faster_rcnn/infer.py
+++ b/fluid/faster_rcnn/infer.py
+import os
+import time
+import numpy as np
+from eval_helper import get_nmsed_box
+from eval_helper import get_dt_res
+from eval_helper import draw_bounding_box_on_image
+import paddle
+import paddle.fluid as fluid
+import reader
+from utility import print_arguments, parse_args
+import models.model_builder as model_builder
+import models.resnet as resnet
+import json
+from pycocotools.coco import COCO
+from pycocotools.cocoeval import COCOeval, Params
+from config import cfg
+
+
+def infer():
+
+    if '2014' in cfg.dataset:
+        test_list = 'annotations/instances_val2014.json'
+    elif '2017' in cfg.dataset:
+        test_list = 'annotations/instances_val2017.json'
+
+    cocoGt = COCO(os.path.join(cfg.data_dir, test_list))
+    numId_to_catId_map = {i + 1: v for i, v in enumerate(cocoGt.getCatIds())}
+    category_ids = cocoGt.getCatIds()
+    label_list = {
+        item['id']: item['name']
+        for item in cocoGt.loadCats(category_ids)
+    }
+    label_list[0] = ['background']
+    image_shape = [3, cfg.TEST.max_size, cfg.TEST.max_size]
+    class_nums = cfg.class_num
+
+    model = model_builder.FasterRCNN(
+        add_conv_body_func=resnet.add_ResNet50_conv4_body,
+        add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
+        use_pyreader=False,
+        is_train=False)
+    model.build_model(image_shape)
+    rpn_rois, confs, locs = model.eval_out()
+    place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
+    exe = fluid.Executor(place)
+    # yapf: disable
+    if cfg.pretrained_model:
+        def if_exist(var):
+            return os.path.exists(os.path.join(cfg.pretrained_model, var.name))
+        fluid.io.load_vars(exe, cfg.pretrained_model, predicate=if_exist)
+    # yapf: enable
+    infer_reader = reader.infer()
+    feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())
+
+    dts_res = []
+    fetch_list = [rpn_rois, confs, locs]
+    data = next(infer_reader())
+    im_info = [data[0][1]]
+    rpn_rois_v, confs_v, locs_v = exe.run(
+        fetch_list=[v.name for v in fetch_list],
+        feed=feeder.feed(data),
+        return_numpy=False)
+    new_lod, nmsed_out = get_nmsed_box(rpn_rois_v, confs_v, locs_v, class_nums,
+                                       im_info, numId_to_catId_map)
+    path = os.path.join(cfg.image_path, cfg.image_name)
+    draw_bounding_box_on_image(path, nmsed_out, cfg.draw_threshold, label_list)
+
+
+if __name__ == '__main__':
+    args = parse_args()
+    print_arguments(args)
+    infer()
--- a/fluid/faster_rcnn/models/model_builder.py
+++ b/fluid/faster_rcnn/models/model_builder.py
@@ -17,11 +17,11 @@ from paddle.fluid.param_attr import ParamAttr
 from paddle.fluid.initializer import Constant
 from paddle.fluid.initializer import Normal
 from paddle.fluid.regularizer import L2Decay
+from config import cfg


 class FasterRCNN(object):
    def __init__(self,
-                 cfg=None,
                 add_conv_body_func=None,
                 add_roi_box_head_func=None,
                 is_train=True,
@@ -29,7 +29,6 @@ class FasterRCNN(object):
                 use_random=True):
        self.add_conv_body_func = add_conv_body_func
        self.add_roi_box_head_func = add_roi_box_head_func
-        self.cfg = cfg
        self.is_train = is_train
        self.use_pyreader = use_pyreader
        self.use_random = use_random
@@ -111,10 +110,10 @@ class FasterRCNN(object):
                name="conv_rpn_b", learning_rate=2., regularizer=L2Decay(0.)))
        self.anchor, self.var = fluid.layers.anchor_generator(
            input=rpn_conv,
-            anchor_sizes=self.cfg.anchor_sizes,
-            aspect_ratios=self.cfg.aspect_ratios,
-            variance=self.cfg.variance,
-            stride=[16.0, 16.0])
+            anchor_sizes=cfg.anchor_sizes,
+            aspect_ratios=cfg.aspect_ratio,
+            variance=cfg.variances,
+            stride=cfg.rpn_stride)
        num_anchor = self.anchor.shape[2]
        # Proposal classification scores
        self.rpn_cls_score = fluid.layers.conv2d(
@@ -152,8 +151,12 @@ class FasterRCNN(object):
        rpn_cls_score_prob = fluid.layers.sigmoid(
            self.rpn_cls_score, name='rpn_cls_score_prob')

-        pre_nms_top_n = 12000 if self.is_train else 6000
-        post_nms_top_n = 2000 if self.is_train else 1000
+        param_obj = cfg.TRAIN if self.is_train else cfg.TEST
+        pre_nms_top_n = param_obj.rpn_pre_nms_top_n
+        post_nms_top_n = param_obj.rpn_post_nms_top_n
+        nms_thresh = param_obj.rpn_nms_thresh
+        min_size = param_obj.rpn_min_size
+        eta = param_obj.rpn_eta
        rpn_rois, rpn_roi_probs = fluid.layers.generate_proposals(
            scores=rpn_cls_score_prob,
            bbox_deltas=self.rpn_bbox_pred,
@@ -162,9 +165,9 @@ class FasterRCNN(object):
            variances=self.var,
            pre_nms_top_n=pre_nms_top_n,
            post_nms_top_n=post_nms_top_n,
-            nms_thresh=0.7,
-            min_size=0.0,
-            eta=1.0)
+            nms_thresh=nms_thresh,
+            min_size=min_size,
+            eta=eta)
        self.rpn_rois = rpn_rois
        if self.is_train:
            outs = fluid.layers.generate_proposal_labels(
@@ -173,13 +176,13 @@ class FasterRCNN(object):
                is_crowd=self.is_crowd,
                gt_boxes=self.gt_box,
                im_info=self.im_info,
-                batch_size_per_im=self.cfg.batch_size_per_im,
-                fg_fraction=0.25,
-                fg_thresh=0.5,
-                bg_thresh_hi=0.5,
-                bg_thresh_lo=0.0,
-                bbox_reg_weights=[0.1, 0.1, 0.2, 0.2],
-                class_nums=self.cfg.class_num,
+                batch_size_per_im=cfg.TRAIN.batch_size_per_im,
+                fg_fraction=cfg.TRAIN.fg_fractrion,
+                fg_thresh=cfg.TRAIN.fg_thresh,
+                bg_thresh_hi=cfg.TRAIN.bg_thresh_hi,
+                bg_thresh_lo=cfg.TRAIN.bg_thresh_lo,
+                bbox_reg_weights=cfg.bbox_reg_weights,
+                class_nums=cfg.class_num,
                use_random=self.use_random)

            self.rois = outs[0]
@@ -193,15 +196,24 @@ class FasterRCNN(object):
            pool_rois = self.rois
        else:
            pool_rois = self.rpn_rois
-        pool = fluid.layers.roi_pool(
-            input=roi_input,
-            rois=pool_rois,
-            pooled_height=14,
-            pooled_width=14,
-            spatial_scale=0.0625)
+        if cfg.roi_func == 'RoIPool':
+            pool = fluid.layers.roi_pool(
+                input=roi_input,
+                rois=pool_rois,
+                pooled_height=cfg.roi_resolution,
+                pooled_width=cfg.roi_resolution,
+                spatial_scale=cfg.spatial_scale)
+        elif cfg.roi_func == 'RoIAlign':
+            pool = fluid.layers.roi_align(
+                input=roi_input,
+                rois=pool_rois,
+                pooled_height=cfg.roi_resolution,
+                pooled_width=cfg.roi_resolution,
+                spatial_scale=cfg.spatial_scale,
+                sampling_ratio=cfg.sampling_ratio)
        rcnn_out = self.add_roi_box_head_func(pool)
        self.cls_score = fluid.layers.fc(input=rcnn_out,
-                                         size=self.cfg.class_num,
+                                         size=cfg.class_num,
                                         act=None,
                                         name='cls_score',
                                         param_attr=ParamAttr(
@@ -213,7 +225,7 @@ class FasterRCNN(object):
                                             learning_rate=2.,
                                             regularizer=L2Decay(0.)))
        self.bbox_pred = fluid.layers.fc(input=rcnn_out,
-                                         size=4 * self.cfg.class_num,
+                                         size=4 * cfg.class_num,
                                         act=None,
                                         name='bbox_pred',
                                         param_attr=ParamAttr(
@@ -257,8 +269,7 @@ class FasterRCNN(object):
            x=rpn_cls_score_reshape, shape=(0, -1, 1))
        rpn_bbox_pred_reshape = fluid.layers.reshape(
            x=rpn_bbox_pred_reshape, shape=(0, -1, 4))
-
-        score_pred, loc_pred, score_tgt, loc_tgt = \
+        score_pred, loc_pred, score_tgt, loc_tgt, bbox_weight = \
            fluid.layers.rpn_target_assign(
                bbox_pred=rpn_bbox_pred_reshape,
                cls_logits=rpn_cls_score_reshape,
@@ -267,11 +278,11 @@ class FasterRCNN(object):
                gt_boxes=self.gt_box,
                is_crowd=self.is_crowd,
                im_info=self.im_info,
-                rpn_batch_size_per_im=256,
-                rpn_straddle_thresh=0.0,
-                rpn_fg_fraction=0.5,
-                rpn_positive_overlap=0.7,
-                rpn_negative_overlap=0.3,
+                rpn_batch_size_per_im=cfg.TRAIN.rpn_batch_size_per_im,
+                rpn_straddle_thresh=cfg.TRAIN.rpn_straddle_thresh,
+                rpn_fg_fraction=cfg.TRAIN.rpn_fg_fraction,
+                rpn_positive_overlap=cfg.TRAIN.rpn_positive_overlap,
+                rpn_negative_overlap=cfg.TRAIN.rpn_negative_overlap,
                use_random=self.use_random)
        score_tgt = fluid.layers.cast(x=score_tgt, dtype='float32')
        rpn_cls_loss = fluid.layers.sigmoid_cross_entropy_with_logits(
@@ -279,7 +290,12 @@ class FasterRCNN(object):
        rpn_cls_loss = fluid.layers.reduce_mean(
            rpn_cls_loss, name='loss_rpn_cls')

-        rpn_reg_loss = fluid.layers.smooth_l1(x=loc_pred, y=loc_tgt, sigma=3.0)
+        rpn_reg_loss = fluid.layers.smooth_l1(
+            x=loc_pred,
+            y=loc_tgt,
+            sigma=3.0,
+            inside_weight=bbox_weight,
+            outside_weight=bbox_weight)
        rpn_reg_loss = fluid.layers.reduce_sum(
            rpn_reg_loss, name='loss_rpn_bbox')
        score_shape = fluid.layers.shape(score_tgt)

--- a/fluid/faster_rcnn/models/resnet.py
+++ b/fluid/faster_rcnn/models/resnet.py
@@ -16,6 +16,7 @@ import paddle.fluid as fluid
 from paddle.fluid.param_attr import ParamAttr
 from paddle.fluid.initializer import Constant
 from paddle.fluid.regularizer import L2Decay
+from config import cfg


 def conv_bn_layer(input,
@@ -88,8 +89,7 @@ def conv_affine_layer(input,
        default_initializer=Constant(0.))
    bias.stop_gradient = True

-    elt_mul = fluid.layers.elementwise_mul(x=conv, y=scale, axis=1)
-    out = fluid.layers.elementwise_add(x=elt_mul, y=bias, axis=1)
+    out = fluid.layers.affine_channel(x=conv, scale=scale, bias=bias)
    if act == 'relu':
        out = fluid.layers.relu(x=out)
    return out
@@ -137,7 +137,7 @@ ResNet_cfg = {
 }


-def add_ResNet50_conv4_body(body_input, freeze_at=2):
+def add_ResNet50_conv4_body(body_input):
    stages, block_func = ResNet_cfg[50]
    stages = stages[0:3]
    conv1 = conv_affine_layer(
@@ -149,13 +149,13 @@ def add_ResNet50_conv4_body(body_input, freeze_at=2):
        pool_stride=2,
        pool_padding=1)
    res2 = layer_warp(block_func, pool1, 64, stages[0], 1, name="res2")
-    if freeze_at == 2:
+    if cfg.TRAIN.freeze_at == 2:
        res2.stop_gradient = True
    res3 = layer_warp(block_func, res2, 128, stages[1], 2, name="res3")
-    if freeze_at == 3:
+    if cfg.TRAIN.freeze_at == 3:
        res3.stop_gradient = True
    res4 = layer_warp(block_func, res3, 256, stages[2], 2, name="res4")
-    if freeze_at == 4:
+    if cfg.TRAIN.freeze_at == 4:
        res4.stop_gradient = True
    return res4


--- a/fluid/faster_rcnn/pretrained/download.sh
+++ b/fluid/faster_rcnn/pretrained/download.sh
-DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+DIR="$(cd "$(dirname "$0")" && pwd -P)"
 cd "$DIR"

 # Download the data.

--- a/fluid/faster_rcnn/profile.py
+++ b/fluid/faster_rcnn/profile.py
@@ -26,19 +26,18 @@ import paddle.fluid.profiler as profiler
 import models.model_builder as model_builder
 import models.resnet as resnet
 from learning_rate import exponential_with_warmup_decay
+from config import cfg


-def train(cfg):
-    batch_size = cfg.batch_size
+def train():
    learning_rate = cfg.learning_rate
-    image_shape = [3, cfg.max_size, cfg.max_size]
+    image_shape = [3, cfg.TRAIN.max_size, cfg.TRAIN.max_size]
    num_iterations = cfg.max_iter

    devices = os.getenv("CUDA_VISIBLE_DEVICES") or ""
    devices_num = len(devices.split(","))
-
+    total_batch_size = devices_num * cfg.TRAIN.im_per_batch
    model = model_builder.FasterRCNN(
-        cfg=cfg,
        add_conv_body_func=resnet.add_ResNet50_conv4_body,
        add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
        use_pyreader=cfg.use_pyreader,
@@ -51,8 +50,10 @@ def train(cfg):
    rpn_reg_loss.persistable = True
    loss = loss_cls + loss_bbox + rpn_cls_loss + rpn_reg_loss

-    boundaries = [120000, 160000]
-    values = [learning_rate, learning_rate * 0.1, learning_rate * 0.01]
+    boundaries = cfg.lr_steps
+    gamma = cfg.lr_gamma
+    step_num = len(cfg.lr_steps)
+    values = [learning_rate * (gamma**i) for i in range(step_num + 1)]

    optimizer = fluid.optimizer.Momentum(
        learning_rate=exponential_with_warmup_decay(
@@ -82,22 +83,16 @@ def train(cfg):
        train_exe = fluid.ParallelExecutor(
            use_cuda=bool(cfg.use_gpu), loss_name=loss.name)

-    assert cfg.batch_size % devices_num == 0, \
-        "batch_size = %d, devices_num = %d" %(cfg.batch_size, devices_num)
-
-    batch_size_per_dev = cfg.batch_size / devices_num
    if cfg.use_pyreader:
        train_reader = reader.train(
-            cfg,
-            batch_size=batch_size_per_dev,
-            total_batch_size=cfg.batch_size,
-            padding_total=cfg.padding_minibatch,
+            batch_size=cfg.TRAIN.im_per_batch,
+            total_batch_size=total_batch_size,
+            padding_total=cfg.TRAIN.padding_minibatch,
            shuffle=False)
        py_reader = model.py_reader
        py_reader.decorate_paddle_reader(train_reader)
    else:
-        train_reader = reader.train(
-            cfg, batch_size=cfg.batch_size, shuffle=False)
+        train_reader = reader.train(batch_size=total_batch_size, shuffle=False)
        feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())

    fetch_list = [loss, loss_cls, loss_bbox, rpn_cls_loss, rpn_reg_loss]
@@ -109,7 +104,7 @@ def train(cfg):

        for batch_id in range(iterations):
            start_time = time.time()
-            data = train_reader().next()
+            data = next(train_reader())
            end_time = time.time()
            reader_time.append(end_time - start_time)
            start_time = time.time()
@@ -163,8 +158,7 @@ def train(cfg):
    run_func(2)
    # profiling
    start = time.time()
-    use_profile = False
-    if use_profile:
+    if cfg.use_profile:
        with profiler.profiler('GPU', 'total', '/tmp/profile_file'):
            reader_time, run_time, total_images = run_func(num_iterations)
    else:
@@ -181,6 +175,4 @@ def train(cfg):
 if __name__ == '__main__':
    args = parse_args()
    print_arguments(args)
-
-    data_args = reader.Settings(args)
-    train(data_args)
+    train()
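Note on the batch-size change in profile.py: the script now derives the total batch size from the visible device count instead of asserting divisibility. A minimal sketch of that arithmetic, assuming only what the hunk shows; the helper names `count_devices` and `total_batch_size` are hypothetical and not part of the repository:

```python
def count_devices(cuda_visible_devices):
    """Mirror `len(devices.split(","))` from the script.

    `cuda_visible_devices` stands in for os.getenv("CUDA_VISIBLE_DEVICES").
    """
    devices = cuda_visible_devices or ""
    # "".split(",") returns [""], so an unset variable counts as one device.
    return len(devices.split(","))


def total_batch_size(cuda_visible_devices, im_per_batch):
    """Images consumed per step across all visible devices."""
    return count_devices(cuda_visible_devices) * im_per_batch
```

With this formulation the total is a multiple of the device count by construction, which is presumably why the old `assert cfg.batch_size % devices_num == 0` could be dropped.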
--- a/fluid/faster_rcnn/reader.py
+++ b/fluid/faster_rcnn/reader.py
@@ -26,58 +26,45 @@ from collections import deque

 from roidbs import JsonDataset
 import data_utils
+from config import cfg


-class Settings(object):
-    def __init__(self, args=None):
-        for arg, value in sorted(six.iteritems(vars(args))):
-            setattr(self, arg, value)
-
-        if 'coco2014' in args.dataset:
-            self.class_nums = 81
-            self.train_file_list = 'annotations/instances_train2014.json'
-            self.train_data_dir = 'train2014'
-            self.val_file_list = 'annotations/instances_val2014.json'
-            self.val_data_dir = 'val2014'
-        elif 'coco2017' in args.dataset:
-            self.class_nums = 81
-            self.train_file_list = 'annotations/instances_train2017.json'
-            self.train_data_dir = 'train2017'
-            self.val_file_list = 'annotations/instances_val2017.json'
-            self.val_data_dir = 'val2017'
-        else:
-            raise NotImplementedError('Dataset {} not supported'.format(
-                self.dataset))
-        self.mean_value = np.array(self.mean_value)[
-            np.newaxis, np.newaxis, :].astype('float32')
-
-
-def coco(settings,
-         mode,
+def coco(mode,
         batch_size=None,
         total_batch_size=None,
         padding_total=False,
         shuffle=False):
+    if 'coco2014' in cfg.dataset:
+        cfg.train_file_list = 'annotations/instances_train2014.json'
+        cfg.train_data_dir = 'train2014'
+        cfg.val_file_list = 'annotations/instances_val2014.json'
+        cfg.val_data_dir = 'val2014'
+    elif 'coco2017' in cfg.dataset:
+        cfg.train_file_list = 'annotations/instances_train2017.json'
+        cfg.train_data_dir = 'train2017'
+        cfg.val_file_list = 'annotations/instances_val2017.json'
+        cfg.val_data_dir = 'val2017'
+    else:
+        raise NotImplementedError('Dataset {} not supported'.format(
+            cfg.dataset))
+    cfg.mean_value = np.array(cfg.pixel_means)[np.newaxis,
+                                               np.newaxis, :].astype('float32')
    total_batch_size = total_batch_size if total_batch_size else batch_size
    if mode != 'infer':
        assert total_batch_size % batch_size == 0
    if mode == 'train':
-        settings.train_file_list = os.path.join(settings.data_dir,
-                                                settings.train_file_list)
-        settings.train_data_dir = os.path.join(settings.data_dir,
-                                               settings.train_data_dir)
+        cfg.train_file_list = os.path.join(cfg.data_dir, cfg.train_file_list)
+        cfg.train_data_dir = os.path.join(cfg.data_dir, cfg.train_data_dir)
    elif mode == 'test' or mode == 'infer':
-        settings.val_file_list = os.path.join(settings.data_dir,
-                                              settings.val_file_list)
-        settings.val_data_dir = os.path.join(settings.data_dir,
-                                             settings.val_data_dir)
-    json_dataset = JsonDataset(settings, train=(mode == 'train'))
+        cfg.val_file_list = os.path.join(cfg.data_dir, cfg.val_file_list)
+        cfg.val_data_dir = os.path.join(cfg.data_dir, cfg.val_data_dir)
+    json_dataset = JsonDataset(train=(mode == 'train'))
    roidbs = json_dataset.get_roidb()

-    print("{} on {} with {} roidbs".format(mode, settings.dataset, len(roidbs)))
+    print("{} on {} with {} roidbs".format(mode, cfg.dataset, len(roidbs)))

    def roidb_reader(roidb, mode):
-        im, im_scales = data_utils.get_image_blob(roidb, settings)
+        im, im_scales = data_utils.get_image_blob(roidb, mode)
        im_id = roidb['id']
        im_height = np.round(roidb['height'] * im_scales)
        im_width = np.round(roidb['width'] * im_scales)
@@ -150,7 +137,7 @@ def coco(settings,

        else:
            for roidb in roidbs:
-                if settings.image_name not in roidb['image']:
+                if cfg.image_name not in roidb['image']:
                    continue
                im, im_info, im_id = roidb_reader(roidb, mode)
                batch_out = [(im, im_info, im_id)]
@@ -159,23 +146,14 @@ def coco(settings,
    return reader


-def train(settings,
-          batch_size,
-          total_batch_size=None,
-          padding_total=False,
-          shuffle=True):
+def train(batch_size, total_batch_size=None, padding_total=False, shuffle=True):
    return coco(
-        settings,
-        'train',
-        batch_size,
-        total_batch_size,
-        padding_total,
-        shuffle=shuffle)
+        'train', batch_size, total_batch_size, padding_total, shuffle=shuffle)


-def test(settings, batch_size, total_batch_size=None, padding_total=False):
-    return coco(settings, 'test', batch_size, total_batch_size, shuffle=False)
+def test(batch_size, total_batch_size=None, padding_total=False):
+    return coco('test', batch_size, total_batch_size, padding_total, shuffle=False)


-def infer(settings):
-    return coco(settings, 'infer')
+def infer():
+    return coco('infer')
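Note on the dataset branches in coco(): the two COCO branches differ only in the year suffix, and they write the resolved paths back into the shared `cfg` on every call. A side-effect-free alternative could look like the sketch below. The function name `resolve_coco_paths` is hypothetical; the relative paths are the ones in the hunk above.

```python
import os


def resolve_coco_paths(dataset, data_dir, mode):
    """Return (annotation_file, image_dir) for a COCO dataset name.

    Illustrative only: it follows the 'coco2014' / 'coco2017' substring
    checks in coco() but returns the paths instead of mutating cfg.
    """
    for year in ('2014', '2017'):
        if 'coco' + year in dataset:
            # 'test' and 'infer' both read the validation split, as in coco().
            split = ('train' if mode == 'train' else 'val') + year
            file_list = os.path.join(data_dir, 'annotations',
                                     'instances_{}.json'.format(split))
            image_dir = os.path.join(data_dir, split)
            return file_list, image_dir
    raise NotImplementedError('Dataset {} not supported'.format(dataset))
```

Returning the paths keeps repeated calls (for example one reader for train and one for test in the same process) from depending on the order in which they touch `cfg`.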
--- a/fluid/faster_rcnn/roidbs.py
+++ b/fluid/faster_rcnn/roidbs.py
@@ -26,7 +26,6 @@ from __future__ import print_function
 from __future__ import unicode_literals

 import copy
-import cPickle as pickle
 import logging
 import numpy as np
 import os
@@ -37,6 +36,7 @@ import matplotlib
 matplotlib.use('Agg')
 from pycocotools.coco import COCO
 import box_utils
+from config import cfg

 logger = logging.getLogger(__name__)

@@ -44,16 +44,16 @@ logger = logging.getLogger(__name__)
 class JsonDataset(object):
    """A class representing a COCO json dataset."""

-    def __init__(self, args, train=False):
-        print('Creating: {}'.format(args.dataset))
-        self.name = args.dataset
+    def __init__(self, train=False):
+        print('Creating: {}'.format(cfg.dataset))
+        self.name = cfg.dataset
        self.is_train = train
        if self.is_train:
-            data_dir = args.train_data_dir
-            file_list = args.train_file_list
+            data_dir = cfg.train_data_dir
+            file_list = cfg.train_file_list
        else:
-            data_dir = args.val_data_dir
-            file_list = args.val_file_list
+            data_dir = cfg.val_data_dir
+            file_list = cfg.val_file_list
        self.image_directory = data_dir
        self.COCO = COCO(file_list)
        # Set up dataset classes
@@ -91,7 +91,6 @@ class JsonDataset(object):
            end_time = time.time()
            print('_add_gt_annotations took {:.3f}s'.format(end_time -
                                                            start_time))
-
            print('Appending horizontally-flipped training examples...')
            self._extend_with_flipped_entries(roidb)
        print('Loaded dataset: {:s}'.format(self.name))
@@ -130,7 +129,7 @@ class JsonDataset(object):
        width = entry['width']
        height = entry['height']
        for obj in objs:
-            if obj['area'] < -1:  #cfg.TRAIN.GT_MIN_AREA:
+            if obj['area'] < cfg.TRAIN.gt_min_area:
                continue
            if 'ignore' in obj and obj['ignore'] == 1:
                continue

--- a/fluid/faster_rcnn/train.py
+++ b/fluid/faster_rcnn/train.py
@@ -28,11 +28,12 @@ import reader
 import models.model_builder as model_builder
 import models.resnet as resnet
 from learning_rate import exponential_with_warmup_decay
+from config import cfg


-def train(cfg):
+def train():
    learning_rate = cfg.learning_rate
-    image_shape = [3, cfg.max_size, cfg.max_size]
+    image_shape = [3, cfg.TRAIN.max_size, cfg.TRAIN.max_size]

    if cfg.debug:
        fluid.default_startup_program().random_seed = 1000
@@ -43,9 +44,9 @@ def train(cfg):

    devices = os.getenv("CUDA_VISIBLE_DEVICES") or ""
    devices_num = len(devices.split(","))
+    total_batch_size = devices_num * cfg.TRAIN.im_per_batch

    model = model_builder.FasterRCNN(
-        cfg=cfg,
        add_conv_body_func=resnet.add_ResNet50_conv4_body,
        add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
        use_pyreader=cfg.use_pyreader,
@@ -58,18 +59,20 @@ def train(cfg):
    rpn_reg_loss.persistable = True
    loss = loss_cls + loss_bbox + rpn_cls_loss + rpn_reg_loss

-    boundaries = [120000, 160000]
-    values = [learning_rate, learning_rate * 0.1, learning_rate * 0.01]
+    boundaries = cfg.lr_steps
+    gamma = cfg.lr_gamma
+    step_num = len(cfg.lr_steps)
+    values = [learning_rate * (gamma**i) for i in range(step_num + 1)]

    optimizer = fluid.optimizer.Momentum(
        learning_rate=exponential_with_warmup_decay(
            learning_rate=learning_rate,
            boundaries=boundaries,
            values=values,
-            warmup_iter=500,
-            warmup_factor=1.0 / 3.0),
-        regularization=fluid.regularizer.L2Decay(0.0001),
-        momentum=0.9)
+            warmup_iter=cfg.warm_up_iter,
+            warmup_factor=cfg.warm_up_factor),
+        regularization=fluid.regularizer.L2Decay(cfg.weight_decay),
+        momentum=cfg.momentum)
    optimizer.minimize(loss)

    fluid.memory_optimize(fluid.default_main_program())
@@ -89,20 +92,16 @@ def train(cfg):
        train_exe = fluid.ParallelExecutor(
            use_cuda=bool(cfg.use_gpu), loss_name=loss.name)

-    assert cfg.batch_size % devices_num == 0
-    batch_size_per_dev = cfg.batch_size / devices_num
    if cfg.use_pyreader:
        train_reader = reader.train(
-            cfg,
-            batch_size=batch_size_per_dev,
-            total_batch_size=cfg.batch_size,
-            padding_total=cfg.padding_minibatch,
+            batch_size=cfg.TRAIN.im_per_batch,
+            total_batch_size=total_batch_size,
+            padding_total=cfg.TRAIN.padding_minibatch,
            shuffle=True)
        py_reader = model.py_reader
        py_reader.decorate_paddle_reader(train_reader)
    else:
-        train_reader = reader.train(
-            cfg, batch_size=cfg.batch_size, shuffle=True)
+        train_reader = reader.train(batch_size=total_batch_size, shuffle=True)
        feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())

    def save_model(postfix):
@@ -133,7 +132,7 @@ def train(cfg):
                    smoothed_loss.get_median_value(
                    ), start_time - prev_start_time))
                sys.stdout.flush()
-                if (iter_id + 1) % cfg.snapshot_stride == 0:
+                if (iter_id + 1) % cfg.TRAIN.snapshot_iter == 0:
                    save_model("model_iter{}".format(iter_id))
        except fluid.core.EOFException:
            py_reader.reset()
@@ -159,7 +158,7 @@ def train(cfg):
                iter_id, lr[0],
                smoothed_loss.get_median_value(), start_time - prev_start_time))
            sys.stdout.flush()
-            if (iter_id + 1) % cfg.snapshot_stride == 0:
+            if (iter_id + 1) % cfg.TRAIN.snapshot_iter == 0:
                save_model("model_iter{}".format(iter_id))
            if (iter_id + 1) == cfg.max_iter:
                break
@@ -175,6 +174,4 @@ def train(cfg):
 if __name__ == '__main__':
    args = parse_args()
    print_arguments(args)
-
-    data_args = reader.Settings(args)
-    train(data_args)
+    train()
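Note on the learning-rate change in train.py: the hard-coded `values` list is replaced by a geometric schedule built from `cfg.lr_steps` and `cfg.lr_gamma`. A small sketch of that computation follows. The function name `piecewise_values` is hypothetical, and the gamma of 0.1 in the usage note is an assumption chosen because it reproduces the removed literals; the default of `cfg.lr_gamma` is not visible in this diff.

```python
def piecewise_values(base_lr, gamma, lr_steps):
    """One learning rate per interval delimited by lr_steps.

    N boundaries split training into N + 1 intervals, hence the + 1.
    """
    return [base_lr * (gamma ** i) for i in range(len(lr_steps) + 1)]
```

For example, `piecewise_values(0.01, 0.1, [120000, 160000])` yields three rates, approximately 0.01, 0.001 and 0.0001, which matches the removed `[learning_rate, learning_rate * 0.1, learning_rate * 0.01]` up to floating-point rounding.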
--- a/fluid/faster_rcnn/utility.py
+++ b/fluid/faster_rcnn/utility.py
@@ -18,7 +18,7 @@ Contains common utility functions.
 from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
-
+import sys
 import distutils.util
 import numpy as np
 import six
@@ -26,6 +26,7 @@ from collections import deque
 from paddle.fluid import core
 import argparse
 import functools
+from config import *


 def print_arguments(args):
@@ -96,31 +97,33 @@ def parse_args():
    add_arg('model_save_dir',   str,    'output',     "The path to save model.")
    add_arg('pretrained_model', str,    'imagenet_resnet50_fusebn', "The init model path.")
    add_arg('dataset',          str,   'coco2017',  "coco2014, coco2017.")
-    add_arg('data_dir',         str,   'data/COCO17',        "The data root path.")
    add_arg('class_num',        int,   81,          "Class number.")
+    add_arg('data_dir',         str,   'data/COCO17',        "The data root path.")
    add_arg('use_pyreader',     bool,   True,           "Use pyreader.")
+    add_arg('use_profile',         bool,   False,       "Whether to use the profiler.")
    add_arg('padding_minibatch',bool,   False,
        "If False, only resize image and not pad, image shape is different between"
        " GPUs in one mini-batch. If True, image shape is the same in one mini-batch.")
    #SOLVER
    add_arg('learning_rate',    float,  0.01,     "Learning rate.")
    add_arg('max_iter',         int,    180000,   "Iter number.")
-    add_arg('log_window',       int,    1,        "Log smooth window, set 1 for debug, set 20 for train.")
-    add_arg('snapshot_stride',  int,    10000,    "save model every snapshot stride.")
+    add_arg('log_window',       int,    20,        "Log smooth window, set 1 for debug, set 20 for train.")
+    # FAST RCNN
    # RPN
    add_arg('anchor_sizes',     int,    [32,64,128,256,512],  "The size of anchors.")
    add_arg('aspect_ratios',    float,  [0.5,1.0,2.0],    "The ratio of anchors.")
    add_arg('variance',         float,  [1.,1.,1.,1.],    "The variance of anchors.")
-    add_arg('rpn_stride',       float,  16.,    "Stride of the feature map that RPN is attached.")
-    # FAST RCNN
+    add_arg('rpn_stride',       float,  [16.,16.],    "Stride of the feature map that RPN is attached.")
+    add_arg('rpn_nms_thresh',    float,   0.7,          "NMS threshold used on RPN proposals")
    # TRAIN TEST INFER
-    add_arg('batch_size',       int,   1,        "Minibatch size.")
+    add_arg('im_per_batch',       int,   1,        "Number of images per device in one minibatch.")
    add_arg('max_size',         int,   1333,    "The resized image height.")
    add_arg('scales', int,  [800],    "The resized image height.")
    add_arg('batch_size_per_im',int,    512,    "fast rcnn head batch size")
-    add_arg('mean_value',     float,   [102.9801, 115.9465, 122.7717], "pixel mean")
-    add_arg('nms_threshold',    float, 0.5,    "NMS threshold.")
-    add_arg('score_threshold',    float, 0.05,    "score threshold for NMS.")
+    add_arg('pixel_means',     float,   [102.9801, 115.9465, 122.7717], "pixel mean")
+    add_arg('nms_thresh',    float, 0.5,    "NMS threshold.")
+    add_arg('score_thresh',    float, 0.05,    "score threshold for NMS.")
+    add_arg('snapshot_stride',  int,    10000,    "Save a model snapshot every snapshot_stride iterations.")
    add_arg('debug',            bool,   False,   "Debug mode")
    # SINGLE EVAL AND DRAW
    add_arg('draw_threshold',  float, 0.8,    "Confidence threshold to draw bbox.")
@@ -128,4 +131,9 @@ def parse_args():
    add_arg('image_name',        str,    '',       "The single image used to inference and visualize.")
    # yapf: enable
    args = parser.parse_args()
+    file_name = sys.argv[0]
+    if 'train' in file_name or 'profile' in file_name:
+        merge_cfg_from_args(args, 'train')
+    else:
+        merge_cfg_from_args(args, 'test')
    return args
--- a/fluid/gan/c_gan/c_gan.py
+++ b/fluid/gan/c_gan/c_gan.py
@@ -12,8 +12,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import sys
 import os
+import six
 import argparse
 import functools
 import matplotlib
@@ -102,7 +106,7 @@ def train(args):
            noise_data = np.random.uniform(
                low=-1.0, high=1.0,
                size=[args.batch_size, NOISE_SIZE]).astype('float32')
-            real_image = np.array(map(lambda x: x[0], data)).reshape(
+            real_image = np.array(list(map(lambda x: x[0], data))).reshape(
                -1, 784).astype('float32')
            conditions_data = np.array([x[1] for x in data]).reshape(
                [-1, 1]).astype("float32")
@@ -138,7 +142,7 @@ def train(args):

            d_loss_np = [d_loss_1[0][0], d_loss_2[0][0]]

-            for _ in xrange(NUM_TRAIN_TIMES_OF_DG):
+            for _ in six.moves.xrange(NUM_TRAIN_TIMES_OF_DG):
                noise_data = np.random.uniform(
                    low=-1.0, high=1.0,
                    size=[args.batch_size, NOISE_SIZE]).astype('float32')
@@ -159,8 +163,9 @@ def train(args):
                total_images = np.concatenate([real_image, generated_images])
                fig = plot(total_images)
                msg = "Epoch ID={0}\n Batch ID={1}\n D-Loss={2}\n DG-Loss={3}\n gen={4}".format(
-                    pass_id, batch_id, d_loss_np, dg_loss_np,
-                    check(generated_images))
+                    pass_id, batch_id,
+                    np.sum(d_loss_np),
+                    np.sum(dg_loss_np), check(generated_images))
                print(msg)
                plt.title(msg)
                plt.savefig(

--- a/fluid/gan/c_gan/dc_gan.py
+++ b/fluid/gan/c_gan/dc_gan.py
@@ -12,11 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import sys
 import os
 import argparse
 import functools
 import matplotlib
+import six
 import numpy as np
 import paddle
 import paddle.fluid as fluid
@@ -98,7 +102,7 @@ def train(args):
            noise_data = np.random.uniform(
                low=-1.0, high=1.0,
                size=[args.batch_size, NOISE_SIZE]).astype('float32')
-            real_image = np.array(map(lambda x: x[0], data)).reshape(
+            real_image = np.array(list(map(lambda x: x[0], data))).reshape(
                -1, 784).astype('float32')
            real_labels = np.ones(
                shape=[real_image.shape[0], 1], dtype='float32')
@@ -128,7 +132,7 @@ def train(args):

            d_loss_np = [d_loss_1[0][0], d_loss_2[0][0]]

-            for _ in xrange(NUM_TRAIN_TIMES_OF_DG):
+            for _ in six.moves.xrange(NUM_TRAIN_TIMES_OF_DG):
                noise_data = np.random.uniform(
                    low=-1.0, high=1.0,
                    size=[args.batch_size, NOISE_SIZE]).astype('float32')
@@ -146,7 +150,8 @@ def train(args):
                fig = plot(total_images)
                msg = "Epoch ID={0} Batch ID={1} D-Loss={2} DG-Loss={3}\n gen={4}".format(
                    pass_id, batch_id,
-                    np.sum(d_loss_np), dg_loss_np, check(generated_images))
+                    np.sum(d_loss_np),
+                    np.sum(dg_loss_np), check(generated_images))
                print(msg)
                plt.title(msg)
                plt.savefig(

--- a/fluid/gan/c_gan/network.py
+++ b/fluid/gan/c_gan/network.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import paddle
 import paddle.fluid as fluid
 from utility import get_parent_function_name
@@ -98,19 +101,19 @@ def D_cond(image, y):
    h2 = bn(fc(h1, dfc_dim), act='leaky_relu')
    h2 = fluid.layers.concat([h2, y], 1)

-    h3 = fc(h2, 1)
+    h3 = fc(h2, 1, act='sigmoid')
    return h3


 def G_cond(z, y):
    s_h, s_w = output_height, output_width
-    s_h2, s_h4 = int(s_h / 2), int(s_h / 4)
-    s_w2, s_w4 = int(s_w / 2), int(s_w / 4)
+    s_h2, s_h4 = int(s_h // 2), int(s_h // 4)
+    s_w2, s_w4 = int(s_w // 2), int(s_w // 4)

    yb = fluid.layers.reshape(y, [-1, y_dim, 1, 1])  #NCHW

    z = fluid.layers.concat([z, y], 1)
-    h0 = bn(fc(z, gfc_dim / 2), act='relu')
+    h0 = bn(fc(z, gfc_dim // 2), act='relu')
    h0 = fluid.layers.concat([h0, y], 1)

    h1 = bn(fc(h0, gf_dim * 2 * s_h4 * s_w4), act='relu')
@@ -128,14 +131,14 @@ def D(x):
    x = conv(x, df_dim, act='leaky_relu')
    x = bn(conv(x, df_dim * 2), act='leaky_relu')
    x = bn(fc(x, dfc_dim), act='leaky_relu')
-    x = fc(x, 1, act=None)
+    x = fc(x, 1, act='sigmoid')
    return x


 def G(x):
    x = bn(fc(x, gfc_dim))
-    x = bn(fc(x, gf_dim * 2 * img_dim / 4 * img_dim / 4))
-    x = fluid.layers.reshape(x, [-1, gf_dim * 2, img_dim / 4, img_dim / 4])
+    x = bn(fc(x, gf_dim * 2 * (img_dim // 4) * (img_dim // 4)))
+    x = fluid.layers.reshape(x, [-1, gf_dim * 2, img_dim // 4, img_dim // 4])
    x = deconv(x, gf_dim * 2, act='relu', output_size=[14, 14])
    x = deconv(x, 1, filter_size=5, padding=2, act='tanh', output_size=[28, 28])
    x = fluid.layers.reshape(x, shape=[-1, 28 * 28])
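Note on the `/` to `//` changes in network.py: with `from __future__ import division` (or on Python 3), `/` returns a float, which is not a valid layer size or shape entry, so the switch to `//` is needed. The sketch below illustrates the difference and also shows that a chained `c * d // 4 * d // 4` only agrees with `c * (d // 4) * (d // 4)` when `d` is divisible by 4. The value 128 is a hypothetical stand-in for `gf_dim * 2`; 28 matches the output size used in `G`.

```python
img_dim = 28      # matches the 28 x 28 output size used in G
channels = 128    # hypothetical stand-in for gf_dim * 2

true_div = img_dim / 4     # a float under Python 3 division semantics
floor_div = img_dim // 4   # an int, usable as a layer size or shape entry


def chained(c, d):
    # Evaluated strictly left to right: ((c * d) // 4 * d) // 4
    return c * d // 4 * d // 4


def grouped(c, d):
    # Floors each spatial dimension first, as a reshape target would
    return c * (d // 4) * (d // 4)
```

For `img_dim = 28` both forms give the same size, so the choice only matters if the image size is later changed to something not divisible by 4.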

--- a/fluid/gan/c_gan/utility.py
+++ b/fluid/gan/c_gan/utility.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import math
 import distutils.util
 import numpy as np
 import inspect
 import matplotlib
+import six
 matplotlib.use('agg')
 import matplotlib.pyplot as plt
 import matplotlib.gridspec as gridspec
@@ -54,7 +58,7 @@ def print_arguments(args):
    :type args: argparse.Namespace
    """
    print("-----------  Configuration Arguments -----------")
-    for arg, value in sorted(vars(args).iteritems()):
+    for arg, value in sorted(six.iteritems(vars(args))):
        print("%s: %s" % (arg, value))
    print("------------------------------------------------")
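
Note on the `iteritems` change: `dict.iteritems()` does not exist on Python 3, and `six.iteritems` dispatches to whichever method is available. For a handful of command-line arguments, plain `.items()` would also work on both versions without the extra dependency. A minimal stdlib-only sketch, where `format_arguments` and the example namespace are hypothetical names used only for illustration:

```python
import argparse


def format_arguments(args):
    """Return 'name: value' lines sorted by argument name."""
    # .items() exists on Python 2 and 3; on Python 2 it builds a list,
    # which is harmless for a small set of command-line arguments.
    return ["%s: %s" % (arg, value)
            for arg, value in sorted(vars(args).items())]


# Hypothetical arguments, for illustration only.
example = argparse.Namespace(epoch=2, batch_size=1)
lines = format_arguments(example)
```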


--- a/fluid/gan/cycle_gan/data_reader.py
+++ b/fluid/gan/cycle_gan/data_reader.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import os
 from PIL import Image
 import numpy as np
-from itertools import izip

 A_LIST_FILE = "./data/horse2zebra/trainA.txt"
 B_LIST_FILE = "./data/horse2zebra/trainB.txt"
@@ -70,11 +72,3 @@ def b_test_reader():
    Reader of images with B style for test.
    """
    return reader_creater(B_TEST_LIST_FILE, cycle=False, return_name=True)
-
-
-if __name__ == "__main__":
-    for A, B in izip(a_test_reader()(), a_test_reader()()):
-        print A[0].shape
-        print A[1]
-        print B[0].shape
-        print B[1]
--- a/fluid/gan/cycle_gan/train.py
+++ b/fluid/gan/cycle_gan/train.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import data_reader
 import os
 import random
@@ -9,7 +12,6 @@ import paddle.fluid as fluid
 import numpy as np
 from paddle.fluid import core
 from trainer import *
-from itertools import izip
 from scipy.misc import imsave
 import paddle.fluid.profiler as profiler
 from utility import add_arguments, print_arguments, ImagePool
@@ -66,7 +68,7 @@ def train(args):
        if not os.path.exists(out_path):
            os.makedirs(out_path)
        i = 0
-        for data_A, data_B in izip(A_test_reader(), B_test_reader()):
+        for data_A, data_B in zip(A_test_reader(), B_test_reader()):
            A_name = data_A[1]
            B_name = data_B[1]
            tensor_A = core.LoDTensor()
@@ -114,7 +116,7 @@ def train(args):
            exe, out_path + "/d_a", main_program=d_A_trainer.program)
        fluid.io.save_persistables(
            exe, out_path + "/d_b", main_program=d_B_trainer.program)
-        print "saved checkpoint to [%s]" % out_path
+        print("saved checkpoint to {}".format(out_path))
        sys.stdout.flush()

    def init_model():
@@ -128,7 +130,7 @@ def train(args):
            exe, args.init_model + "/d_a", main_program=d_A_trainer.program)
        fluid.io.load_persistables(
            exe, args.init_model + "/d_b", main_program=d_B_trainer.program)
-        print "Load model from [%s]" % args.init_model
+        print("Load model from {}".format(args.init_model))

    if args.init_model:
        init_model()
@@ -136,8 +138,8 @@ def train(args):
    for epoch in range(args.epoch):
        batch_id = 0
        for i in range(max_images_num):
-            data_A = A_reader.next()
-            data_B = B_reader.next()
+            data_A = next(A_reader)
+            data_B = next(B_reader)
            tensor_A = core.LoDTensor()
            tensor_B = core.LoDTensor()
            tensor_A.set(data_A, place)
@@ -174,9 +176,9 @@ def train(args):
                feed={"input_A": tensor_A,
                      "fake_pool_A": fake_pool_A})

-            print "epoch[%d]; batch[%d]; g_A_loss: %s; d_B_loss: %s; g_B_loss: %s; d_A_loss: %s;" % (
+            print("epoch{}; batch{}; g_A_loss: {}; d_B_loss: {}; g_B_loss: {}; d_A_loss: {};".format(
                epoch, batch_id, g_A_loss[0], d_B_loss[0], g_B_loss[0],
-                d_A_loss[0])
+                d_A_loss[0]))
            sys.stdout.flush()
            batch_id += 1
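
Note on the iterator changes in this file: the built-in `next(it)` works on Python 2.6+ and Python 3, whereas the `it.next()` method exists only on Python 2. `itertools.izip` is gone on Python 3, where the built-in `zip` is already lazy. The sketch below uses a hypothetical `reader` generator as a stand-in for the real data readers.

```python
def reader():
    """Hypothetical stand-in for a data reader: yields (data, name) pairs."""
    for i in range(3):
        yield [i], "img_%d.jpg" % i


it = reader()
first = next(it)  # portable spelling; it.next() is Python 2 only

# zip stops at the shorter input. On Python 3 it is lazy, like izip was;
# on Python 2 it builds the full list of pairs first.
pairs = [(a[1], b[1]) for a, b in zip(reader(), reader())]
```

On Python 2, `zip` materializes both readers in memory, which may matter for large test sets.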


--- a/fluid/gan/cycle_gan/trainer.py
+++ b/fluid/gan/cycle_gan/trainer.py
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 from model import *
 import paddle.fluid as fluid


--- a/fluid/gan/cycle_gan/utility.py
+++ b/fluid/gan/cycle_gan/utility.py
@@ -17,6 +17,7 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 import distutils.util
+import six
 import random
 import glob
 import numpy as np
@@ -39,7 +40,7 @@ def print_arguments(args):
    :type args: argparse.Namespace
    """
    print("-----------  Configuration Arguments -----------")
-    for arg, value in sorted(vars(args).iteritems()):
+    for arg, value in sorted(six.iteritems(vars(args))):
        print("%s: %s" % (arg, value))
    print("------------------------------------------------")


--- a/fluid/icnet/infer.py
+++ b/fluid/icnet/infer.py
@@ -8,7 +8,7 @@ import os
 import cv2

 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle
 from icnet import icnet
 from utils import add_arguments, print_arguments, get_feeder_data
 from paddle.fluid.layers.learning_rate_scheduler import _decay_step_counter
@@ -111,10 +111,10 @@ def infer(args):
    for line in open(args.images_list):
        image_file = args.images_path + "/" + line.strip()
        filename = os.path.basename(image_file)
-        image = paddle.image.load_image(
+        image = paddle.dataset.image.load_image(
            image_file, is_color=True).astype("float32")
        image -= IMG_MEAN
-        img = paddle.image.to_chw(image)[np.newaxis, :]
+        img = paddle.dataset.image.to_chw(image)[np.newaxis, :]
        image_t = fluid.core.LoDTensor()
        image_t.set(img, place)
        result = exe.run(inference_program,

--- a/fluid/icnet/train.py
+++ b/fluid/icnet/train.py
@@ -17,6 +17,7 @@ from paddle.fluid.initializer import init_on_cpu

 if 'ce_mode' in os.environ:
    np.random.seed(10)
+    fluid.default_startup_program().random_seed = 90

 parser = argparse.ArgumentParser(description=__doc__)
 add_arg = functools.partial(add_arguments, argparser=parser)
@@ -91,9 +92,6 @@ def train(args):
        place = fluid.CUDAPlace(0)
    exe = fluid.Executor(place)

-    if 'ce_mode' in os.environ:
-        fluid.default_startup_program().random_seed = 90
-
    exe.run(fluid.default_startup_program())

    if args.init_model is not None:
@@ -126,9 +124,10 @@ def train(args):
            sub124_loss += results[3]
            # training log
            if iter_id % LOG_PERIOD == 0:
-                print("Iter[%d]; train loss: %.3f; sub4_loss: %.3f; sub24_loss: %.3f; sub124_loss: %.3f" % (
-                    iter_id, t_loss / LOG_PERIOD, sub4_loss / LOG_PERIOD,
-                    sub24_loss / LOG_PERIOD, sub124_loss / LOG_PERIOD))
+                print(
+                    "Iter[%d]; train loss: %.3f; sub4_loss: %.3f; sub24_loss: %.3f; sub124_loss: %.3f"
+                    % (iter_id, t_loss / LOG_PERIOD, sub4_loss / LOG_PERIOD,
+                       sub24_loss / LOG_PERIOD, sub124_loss / LOG_PERIOD))
                print("kpis	train_cost	%f" % (t_loss / LOG_PERIOD))

                t_loss = 0.

--- a/fluid/image_classification/README_cn.md
+++ b/fluid/image_classification/README_cn.md
@@ -14,7 +14,7 @@

 ## Installation

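-Running the sample code in this directory requires PaddlePaddle Fluid v0.13.0 or later. If the PaddlePaddle version in your environment is older, please update PaddlePaddle by following the instructions in the [installation document](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html).
+Running the sample code in this directory requires PaddlePaddle Fluid v0.13.0 or later. If the PaddlePaddle version in your environment is older, please update PaddlePaddle by following the instructions in the installation document.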

 ## Data preparation


--- a/fluid/image_classification/_ce.py
+++ b/fluid/image_classification/_ce.py
@@ -19,7 +19,7 @@ test_acc_top1_kpi = AccKpi(
 test_acc_top5_kpi = AccKpi(
    'test_acc_top5', 0.02, 0, actived=True, desc='TOP5 ACC')
 test_cost_kpi = CostKpi('test_cost', 0.02, 0, actived=True, desc='train cost')
-train_speed_kpi = AccKpi(
+train_speed_kpi = DurationKpi(
    'train_speed',
    0.05,
    0,
@@ -38,7 +38,7 @@ test_acc_top5_card4_kpi = AccKpi(
    'test_acc_top5_card4', 0.02, 0, actived=True, desc='TOP5 ACC')
 test_cost_card4_kpi = CostKpi(
    'test_cost_card4', 0.02, 0, actived=True, desc='train cost')
-train_speed_card4_kpi = AccKpi(
+train_speed_card4_kpi = DurationKpi(
    'train_speed_card4',
    0.05,
    0,

--- a/fluid/image_classification/caffe2fluid/README.md
+++ b/fluid/image_classification/caffe2fluid/README.md
@@ -19,7 +19,7 @@ This tool is used to convert a Caffe model to a Fluid model

    - Download one from github directly
        ```
-        cd proto/ && wget https://github.com/ethereon/caffe-tensorflow/blob/master/kaffe/caffe/caffepb.py
+        cd proto/ && wget https://raw.githubusercontent.com/ethereon/caffe-tensorflow/master/kaffe/caffe/caffepb.py
        ```

 2. Convert the Caffe model to Fluid model

--- a/fluid/image_classification/caffe2fluid/examples/mnist/evaluate.py
+++ b/fluid/image_classification/caffe2fluid/examples/mnist/evaluate.py
@@ -8,7 +8,7 @@ import sys
 import os
 import numpy as np
 import paddle.fluid as fluid
-import paddle.v2 as paddle
+import paddle


 def test_model(exe, test_program, fetch_list, test_reader, feeder):

--- a/fluid/language_model/train.py
+++ b/fluid/language_model/train.py
@@ -21,6 +21,11 @@ def parse_args():
        action='store_true',
        help='If set, run \
        the task with continuous evaluation logs.')
+    parser.add_argument(
+        '--num_devices',
+        type=int,
+        default=1,
+        help='Number of GPU devices')
    args = parser.parse_args()
    return args

@@ -132,7 +137,7 @@ def train(train_reader,
                "src_wordseq": lod_src_wordseq,
                "dst_wordseq": lod_dst_wordseq
            },
-                                         fetch_list=fetch_list)
+                fetch_list=fetch_list)
            avg_ppl = np.exp(ret_avg_cost[0])
            newest_ppl = np.mean(avg_ppl)
            if i % 100 == 0:
@@ -153,7 +158,7 @@ def train(train_reader,
                print("kpis	imikolov_20_avg_ppl	%s" % newest_ppl)
            else:
                print("kpis	imikolov_20_pass_duration_card%s	%s" % \
-                                (gpu_num, total_time / epoch_idx))
+                      (gpu_num, total_time / epoch_idx))
                print("kpis	imikolov_20_avg_ppl_card%s	%s" %
                      (gpu_num, newest_ppl))
        save_dir = "%s/epoch_%d" % (model_dir, epoch_idx)
@@ -165,13 +170,13 @@ def train(train_reader,
    print("finish training")


-def get_cards(enable_ce):
-    if enable_ce:
+def get_cards(args):
+    if args.enable_ce:
        cards = os.environ.get('CUDA_VISIBLE_DEVICES')
        num = len(cards.split(","))
        return num
    else:
-        return fluid.core.get_cuda_device_count()
+        return args.num_devices


 def train_net():
@@ -179,7 +184,7 @@ def train_net():
    batch_size = 20
    args = parse_args()
    vocab, train_reader, test_reader = utils.prepare_data(
-        batch_size=batch_size * get_cards(args.enable_ce), buffer_size=1000, \
+        batch_size=batch_size * get_cards(args), buffer_size=1000, \
        word_freq_threshold=0, enable_ce = args.enable_ce)
    train(
        train_reader=train_reader,
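
Note on `get_cards`: with `--enable_ce` set, the function reads `CUDA_VISIBLE_DEVICES`, and `os.environ.get` returns `None` when that variable is unset, so `cards.split(",")` would raise `AttributeError`. The sketch below shows one possible guard. The `environ` parameter and the fallback to `--num_devices` are assumptions added for illustration and testability; they are not part of this commit.

```python
import argparse
import os


def get_cards(args, environ=None):
    """Return the number of devices used to scale the batch size."""
    # `environ` is a hypothetical parameter so the lookup can be tested.
    environ = os.environ if environ is None else environ
    if args.enable_ce:
        cards = environ.get('CUDA_VISIBLE_DEVICES')
        if not cards:
            # Assumption: fall back to --num_devices rather than fail on None.
            return args.num_devices
        return len(cards.split(","))
    return args.num_devices


args = argparse.Namespace(enable_ce=True, num_devices=1)
num_cards = get_cards(args, {'CUDA_VISIBLE_DEVICES': '0,1,2,3'})
```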

--- a/fluid/machine_reading_comprehension/README.md
+++ b/fluid/machine_reading_comprehension/README.md
+# Abstract
+DuReader is an end-to-end neural network model for machine reading comprehension style question answering, which aims to answer questions from given passages. We first match the question and passages with a bidirectional attention flow network to obtain the question-aware passage representation. Then we employ a pointer network to locate the positions of answers in the passages. Our experimental evaluations show that the DuReader model achieves state-of-the-art results on the DuReader dataset.
+# Dataset
+The DuReader dataset is a new large-scale, real-world, human-sourced MRC dataset in Chinese. DuReader focuses on real-world open-domain question answering. Its advantages over existing datasets can be summarized as follows:
+ - Real question
+ - Real article
+ - Real answer
+ - Real application scenario
+ - Rich annotation
+
+# Network
+The DuReader model is inspired by three classic reading comprehension models ([BiDAF](https://arxiv.org/abs/1611.01603), [Match-LSTM](https://arxiv.org/abs/1608.07905), [R-NET](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf)).
+
+The DuReader model is a hierarchical multi-stage process consisting of five layers:
+
+- **Word Embedding Layer** maps each word to a vector using a pre-trained word embedding model.
+- **Encoding Layer** extracts context information for each position in the question and passages with a bi-directional LSTM network.
+- **Attention Flow Layer** couples the query and context vectors and produces a set of query-aware feature vectors for each word in the context. Please refer to [BiDAF](https://arxiv.org/abs/1611.01603) for more details.
+- **Fusion Layer** employs a layer of bi-directional LSTM to capture the interaction among context words independent of the query.
+- **Decode Layer** employs an answer pointer network with attention pooling of the question to locate the positions of answers in the passages. Please refer to [Match-LSTM](https://arxiv.org/abs/1608.07905) and [R-NET](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf) for more details.
+
+## How to Run
+### Download the Dataset
+To download the DuReader dataset:
+```
+cd data && bash download.sh
+```
+For more details about the DuReader dataset, please refer to the [DuReader Dataset Homepage](https://ai.baidu.com//broad/subordinate?dataset=dureader).
+
+### Download Thirdparty Dependencies
+We use Bleu and Rouge as evaluation metrics. Computing them relies on the scoring scripts from [coco-caption](https://github.com/tylin/coco-caption). To download them, run:
+
+```
+cd utils && bash download_thirdparty.sh
+```
+### Environment Requirements
+So far we have only tested on PaddlePaddle v1.0. To install PaddlePaddle, or for more details about it, see the [PaddlePaddle Homepage](http://paddlepaddle.org).
+
+### Preparation
+Before training the model, we have to make sure that the data is ready. For preparation, we will check the data files, make directories and extract a vocabulary for later use. You can run the following command to do this with a specified task name:
+
+```
+sh run.sh --prepare
+```
+You can specify the files for train/dev/test by setting `--trainset`/`--devset`/`--testset`.
+### Training
+To train the model, pass `--train`. You can also set hyper-parameters such as the learning rate with `--learning_rate NUM`. For example, to train the model for 10 passes, run:
+
+```
+sh run.sh --train --pass_num 10
+```
+
+The training process includes an evaluation on the dev set after each training epoch. By default, the model with the highest Bleu-4 score on the dev set will be saved.
+
+### Evaluation
+To run a single evaluation on the dev set with an already trained model, use the following command:
+
+```
+sh run.sh --evaluate  --load_dir models/1
+```
+
+### Prediction
+You can also predict answers for the samples in some files using the following command:
+
+```
+sh run.sh --predict --load_dir models/1 --testset ../data/preprocessed/testset/search.dev.json
+```
+
+By default, the results are saved in the `../data/results/` folder. You can change this by specifying `--result_dir DIR_PATH`.
--- a/fluid/machine_reading_comprehension/args.py
+++ b/fluid/machine_reading_comprehension/args.py
+#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import argparse
+import distutils.util
+
+
+def parse_args():
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        '--prepare',
+        action='store_true',
+        help='create the directories, prepare the vocabulary and embeddings')
+    parser.add_argument('--train', action='store_true', help='train the model')
+    parser.add_argument(
+        '--evaluate', action='store_true', help='evaluate the model on dev set')
+    parser.add_argument(
+        '--predict',
+        action='store_true',
+        help='predict the answers for test set with trained model')
+    parser.add_argument(
+        "--embed_size",
+        type=int,
+        default=300,
+        help="The dimension of embedding table. (default: %(default)d)")
+    parser.add_argument(
+        "--hidden_size",
+        type=int,
+        default=300,
+        help="The size of rnn hidden unit. (default: %(default)d)")
+    parser.add_argument(
+        "--batch_size",
+        type=int,
+        default=32,
+        help="The sequence number of a mini-batch data. (default: %(default)d)")
+    parser.add_argument(
+        "--pass_num",
+        type=int,
+        default=5,
+        help="The pass number to train. (default: %(default)d)")
+    parser.add_argument(
+        "--learning_rate",
+        type=float,
+        default=0.001,
+        help="Learning rate used to train the model. (default: %(default)f)")
+    parser.add_argument(
+        "--weight_decay",
+        type=float,
+        default=0.0001,
+        help="Weight decay. (default: %(default)f)")
+    parser.add_argument(
+        "--use_gpu",
+        type=distutils.util.strtobool,
+        default=True,
+        help="Whether to use gpu. (default: %(default)d)")
+    parser.add_argument(
+        "--save_dir",
+        type=str,
+        default="model",
+        help="Specify the path to save trained models.")
+    parser.add_argument(
+        "--load_dir",
+        type=str,
+        default="",
+        help="Specify the path to load trained models.")
+    parser.add_argument(
+        "--save_interval",
+        type=int,
+        default=1,
+        help="Save the trained model every n passes. "
+        "(default: %(default)d)")
+    parser.add_argument(
+        "--log_interval",
+        type=int,
+        default=50,
+        help="Log the train loss every n batches. "
+        "(default: %(default)d)")
+    parser.add_argument(
+        "--dev_interval",
+        type=int,
+        default=1000,
+        help="Calculate the dev loss every n batches. "
+        "(default: %(default)d)")
+    parser.add_argument('--optim', default='adam', help='optimizer type')
+    parser.add_argument('--trainset', nargs='+', help='train dataset')
+    parser.add_argument('--devset', nargs='+', help='dev dataset')
+    parser.add_argument('--testset', nargs='+', help='test dataset')
+    parser.add_argument('--vocab_dir', help='vocabulary directory')
+    parser.add_argument('--max_p_num', type=int, default=5)
+    parser.add_argument('--max_a_len', type=int, default=200)
+    parser.add_argument('--max_p_len', type=int, default=500)
+    parser.add_argument('--max_q_len', type=int, default=9)
+    parser.add_argument('--doc_num', type=int, default=5)
+    parser.add_argument('--para_print', action='store_true')
+    parser.add_argument('--drop_rate', type=float, default=0.0)
+    parser.add_argument('--random_seed', type=int, default=123)
+    parser.add_argument(
+        '--log_path',
+        help='path of the log file. If not set, logs are printed to console')
+    parser.add_argument(
+        '--result_dir',
+        default='../data/results/',
+        help='the dir to output the results')
+    parser.add_argument(
+        '--result_name',
+        default='test_result',
+        help='the file name of the results')
+    args = parser.parse_args()
+    return args
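
Note on the argument parser above: mode flags such as `--train` use `action='store_true'`, while `--use_gpu` parses a truth string through `distutils.util.strtobool`. `distutils` was removed from the standard library in Python 3.12, so the sketch below swaps in a small hand-written converter. `str2bool` and `build_parser` are hypothetical names, and only a few of the options are reproduced.

```python
import argparse


def str2bool(text):
    """Hypothetical replacement for distutils.util.strtobool; returns a bool."""
    value = text.strip().lower()
    if value in ("y", "yes", "t", "true", "on", "1"):
        return True
    if value in ("n", "no", "f", "false", "off", "0"):
        return False
    raise argparse.ArgumentTypeError("invalid truth value %r" % text)


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--train', action='store_true', help='train the model')
    parser.add_argument('--use_gpu', type=str2bool, default=True)
    parser.add_argument('--pass_num', type=int, default=5)
    # nargs='+' collects one or more file paths into a list.
    parser.add_argument('--trainset', nargs='+', help='train dataset')
    return parser


args = build_parser().parse_args(
    ['--train', '--use_gpu', 'false', '--pass_num', '10',
     '--trainset', 'a.json', 'b.json'])
```

Here `args.use_gpu` is a real `bool`, whereas `strtobool` returns `1` or `0`.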
--- a/fluid/machine_reading_comprehension/data/download.sh
+++ b/fluid/machine_reading_comprehension/data/download.sh
+#!/bin/bash
+# ==============================================================================
+# Copyright 2017 Baidu.com, Inc. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+
+
+if [[ -d preprocessed ]] && [[ -d raw ]]; then
+    echo "data exist"
+    exit 0
+else
+    wget -c --no-check-certificate http://dureader.gz.bcebos.com/dureader_preprocessed.zip 
+fi
+
+if md5sum --status -c md5sum.txt; then
+    unzip dureader_preprocessed.zip
+else
+    echo "download data error!" >&2
+    exit 1
+fi
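
Note on the checksum step: `md5sum --status -c md5sum.txt` compares the archive against the digest recorded in `md5sum.txt`. On platforms without `md5sum`, the same check can be sketched in Python. `md5_of_stream` and `parse_md5_line` are hypothetical helpers, and the parsing assumes the `<digest>  <filename>` layout used by the file added below.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large archives need not fit in memory."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        digest.update(chunk)
    return digest.hexdigest()


def parse_md5_line(line):
    """Split one '<digest>  <filename>' line into its two fields."""
    digest, filename = line.strip().split(None, 1)
    return digest.lower(), filename


expected, name = parse_md5_line(
    "7a4c28026f7dc94e8135d17203c63664  dureader_preprocessed.zip")
# In real use the stream would be open(name, 'rb'); a small in-memory
# payload stands in here.
actual = md5_of_stream(io.BytesIO(b"example payload"))
matches = (actual == expected)
```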
--- a/fluid/machine_reading_comprehension/data/md5sum.txt
+++ b/fluid/machine_reading_comprehension/data/md5sum.txt
+7a4c28026f7dc94e8135d17203c63664  dureader_preprocessed.zip
--- a/fluid/machine_reading_comprehension/dataset.py
+++ b/fluid/machine_reading_comprehension/dataset.py
+# -*- coding:utf8 -*-
+# ==============================================================================
+# Copyright 2017 Baidu.com, Inc. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""
+This module implements data processing strategies.
+"""
+
+import os
+import json
+import logging
+import numpy as np
+from collections import Counter
+
+
+class BRCDataset(object):
+    """
+    This class implements the APIs for loading and using the Baidu reading comprehension dataset
+    """
+
+    def __init__(self,
+                 max_p_num,
+                 max_p_len,
+                 max_q_len,
+                 train_files=[],
+                 dev_files=[],
+                 test_files=[]):
+        self.logger = logging.getLogger("brc")
+        self.max_p_num = max_p_num
+        self.max_p_len = max_p_len
+        self.max_q_len = max_q_len
+
+        self.train_set, self.dev_set, self.test_set = [], [], []
+        if train_files:
+            for train_file in train_files:
+                self.train_set += self._load_dataset(train_file, train=True)
+            self.logger.info('Train set size: {} questions.'.format(
+                len(self.train_set)))
+
+        if dev_files:
+            for dev_file in dev_files:
+                self.dev_set += self._load_dataset(dev_file)
+            self.logger.info('Dev set size: {} questions.'.format(
+                len(self.dev_set)))
+
+        if test_files:
+            for test_file in test_files:
+                self.test_set += self._load_dataset(test_file)
+            self.logger.info('Test set size: {} questions.'.format(
+                len(self.test_set)))
+
+    def _load_dataset(self, data_path, train=False):
+        """
+        Loads the dataset
+        Args:
+            data_path: the data file to load
+            train: whether the file is loaded as training data
+        """
+        with open(data_path) as fin:
+            data_set = []
+            for lidx, line in enumerate(fin):
+                sample = json.loads(line.strip())
+                if train:
+                    if len(sample['answer_spans']) == 0:
+                        continue
+                    if sample['answer_spans'][0][1] >= self.max_p_len:
+                        continue
+
+                if 'answer_docs' in sample:
+                    sample['answer_passages'] = sample['answer_docs']
+
+                sample['question_tokens'] = sample['segmented_question']
+
+                sample['passages'] = []
+                for d_idx, doc in enumerate(sample['documents']):
+                    if train:
+                        most_related_para = doc['most_related_para']
+                        sample['passages'].append({
+                            'passage_tokens':
+                            doc['segmented_paragraphs'][most_related_para],
+                            'is_selected': doc['is_selected']
+                        })
+                    else:
+                        para_infos = []
+                        for para_tokens in doc['segmented_paragraphs']:
+                            question_tokens = sample['segmented_question']
+                            common_with_question = Counter(
+                                para_tokens) & Counter(question_tokens)
+                            correct_preds = sum(common_with_question.values())
+                            if correct_preds == 0:
+                                recall_wrt_question = 0
+                            else:
+                                recall_wrt_question = float(
+                                    correct_preds) / len(question_tokens)
+                            para_infos.append((para_tokens, recall_wrt_question,
+                                               len(para_tokens)))
+                        para_infos.sort(key=lambda x: (-x[1], x[2]))
+                        fake_passage_tokens = []
+                        for para_info in para_infos[:1]:
+                            fake_passage_tokens += para_info[0]
+                        sample['passages'].append({
+                            'passage_tokens': fake_passage_tokens
+                        })
+                data_set.append(sample)
+        return data_set
+
+    def _one_mini_batch(self, data, indices, pad_id):
+        """
+        Get one mini batch
+        Args:
+            data: all data
+            indices: the indices of the samples to be selected
+            pad_id: the token id used to pad shorter sequences
+        Returns:
+            one batch of data
+        """
+        batch_data = {
+            'raw_data': [data[i] for i in indices],
+            'question_token_ids': [],
+            'question_length': [],
+            'passage_token_ids': [],
+            'passage_length': [],
+            'start_id': [],
+            'end_id': []
+        }
+        max_passage_num = max(
+            [len(sample['passages']) for sample in batch_data['raw_data']])
+        #max_passage_num = min(self.max_p_num, max_passage_num)
+        max_passage_num = self.max_p_num
+        for sidx, sample in enumerate(batch_data['raw_data']):
+            for pidx in range(max_passage_num):
+                if pidx < len(sample['passages']):
+                    batch_data['question_token_ids'].append(sample[
+                        'question_token_ids'])
+                    batch_data['question_length'].append(
+                        len(sample['question_token_ids']))
+                    passage_token_ids = sample['passages'][pidx][
+                        'passage_token_ids']
+                    batch_data['passage_token_ids'].append(passage_token_ids)
+                    batch_data['passage_length'].append(
+                        min(len(passage_token_ids), self.max_p_len))
+                else:
+                    batch_data['question_token_ids'].append([])
+                    batch_data['question_length'].append(0)
+                    batch_data['passage_token_ids'].append([])
+                    batch_data['passage_length'].append(0)
+        batch_data, padded_p_len, padded_q_len = self._dynamic_padding(
+            batch_data, pad_id)
+        for sample in batch_data['raw_data']:
+            if 'answer_passages' in sample and len(sample['answer_passages']):
+                gold_passage_offset = padded_p_len * sample['answer_passages'][
+                    0]
+                batch_data['start_id'].append(gold_passage_offset + sample[
+                    'answer_spans'][0][0])
+                batch_data['end_id'].append(gold_passage_offset + sample[
+                    'answer_spans'][0][1])
+            else:
+                # fake span for some samples, only valid for testing
+                batch_data['start_id'].append(0)
+                batch_data['end_id'].append(0)
+        return batch_data
+
+    def _dynamic_padding(self, batch_data, pad_id):
+        """
+        Dynamically pads the batch_data with pad_id
+        """
+        pad_p_len = min(self.max_p_len, max(batch_data['passage_length']))
+        pad_q_len = min(self.max_q_len, max(batch_data['question_length']))
+        batch_data['passage_token_ids'] = [
+            (ids + [pad_id] * (pad_p_len - len(ids)))[:pad_p_len]
+            for ids in batch_data['passage_token_ids']
+        ]
+        batch_data['question_token_ids'] = [
+            (ids + [pad_id] * (pad_q_len - len(ids)))[:pad_q_len]
+            for ids in batch_data['question_token_ids']
+        ]
+        return batch_data, pad_p_len, pad_q_len
+
+    def word_iter(self, set_name=None):
+        """
+        Iterates over all the words in the dataset
+        Args:
+            set_name: if it is set, then the specific set will be used
+        Returns:
+            a generator
+        """
+        if set_name is None:
+            data_set = self.train_set + self.dev_set + self.test_set
+        elif set_name == 'train':
+            data_set = self.train_set
+        elif set_name == 'dev':
+            data_set = self.dev_set
+        elif set_name == 'test':
+            data_set = self.test_set
+        else:
+            raise NotImplementedError('No data set named as {}'.format(
+                set_name))
+        if data_set is not None:
+            for sample in data_set:
+                for token in sample['question_tokens']:
+                    yield token
+                for passage in sample['passages']:
+                    for token in passage['passage_tokens']:
+                        yield token
+
+    def convert_to_ids(self, vocab):
+        """
+        Convert the question and passage in the original dataset to ids
+        Args:
+            vocab: the vocabulary on this dataset
+        """
+        for data_set in [self.train_set, self.dev_set, self.test_set]:
+            if data_set is None:
+                continue
+            for sample in data_set:
+                sample['question_token_ids'] = vocab.convert_to_ids(sample[
+                    'question_tokens'])
+                for passage in sample['passages']:
+                    passage['passage_token_ids'] = vocab.convert_to_ids(passage[
+                        'passage_tokens'])
+
+    def gen_mini_batches(self, set_name, batch_size, pad_id, shuffle=True):
+        """
+        Generate data batches for a specific dataset (train/dev/test)
+        Args:
+            set_name: train/dev/test to indicate the set
+            batch_size: number of samples in one batch
+            pad_id: pad id
+            shuffle: if set to be true, the data is shuffled.
+        Returns:
+            a generator for all batches
+        """
+        if set_name == 'train':
+            data = self.train_set
+        elif set_name == 'dev':
+            data = self.dev_set
+        elif set_name == 'test':
+            data = self.test_set
+        else:
+            raise NotImplementedError('No data set named as {}'.format(
+                set_name))
+        data_size = len(data)
+        indices = np.arange(data_size)
+        if shuffle:
+            np.random.shuffle(indices)
+        for batch_start in np.arange(0, data_size, batch_size):
+            batch_indices = indices[batch_start:batch_start + batch_size]
+            yield self._one_mini_batch(data, batch_indices, pad_id)
--- a/fluid/machine_reading_comprehension/rc_model.py
+++ b/fluid/machine_reading_comprehension/rc_model.py
+#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import paddle.fluid.layers as layers
+import paddle.fluid as fluid
+import numpy as np
+
+
+def dropout(input, args):
+    if args.drop_rate:
+        return layers.dropout(
+            input,
+            dropout_prob=args.drop_rate,
+            seed=args.random_seed,
+            is_test=False)
+    else:
+        return input
+
+
+def bi_lstm_encoder(input_seq, gate_size, para_name, args):
+    # A bidirectional LSTM encoder.
+    # The linear transformations for the input gate, output gate, forget gate
+    # and cell activation vectors must be done outside of dynamic_lstm,
+    # so the projection output size is 4 times gate_size.
+
+    input_forward_proj = layers.fc(
+        input=input_seq,
+        param_attr=fluid.ParamAttr(name=para_name + '_fw_gate_w'),
+        size=gate_size * 4,
+        act=None,
+        bias_attr=False)
+    input_reversed_proj = layers.fc(
+        input=input_seq,
+        param_attr=fluid.ParamAttr(name=para_name + '_bw_gate_w'),
+        size=gate_size * 4,
+        act=None,
+        bias_attr=False)
+    forward, _ = layers.dynamic_lstm(
+        input=input_forward_proj,
+        size=gate_size * 4,
+        use_peepholes=False,
+        param_attr=fluid.ParamAttr(name=para_name + '_fw_lstm_w'),
+        bias_attr=fluid.ParamAttr(name=para_name + '_fw_lstm_b'))
+    reversed, _ = layers.dynamic_lstm(
+        input=input_reversed_proj,
+        param_attr=fluid.ParamAttr(name=para_name + '_bw_lstm_w'),
+        bias_attr=fluid.ParamAttr(name=para_name + '_bw_lstm_b'),
+        size=gate_size * 4,
+        is_reverse=True,
+        use_peepholes=False)
+
+    encoder_out = layers.concat(input=[forward, reversed], axis=1)
+    return encoder_out
+
+
+def encoder(input_name, para_name, shape, hidden_size, args):
+    input_ids = layers.data(
+        name=input_name, shape=[1], dtype='int64', lod_level=1)
+    input_embedding = layers.embedding(
+        input=input_ids,
+        size=shape,
+        dtype='float32',
+        is_sparse=True,
+        param_attr=fluid.ParamAttr(name='embedding_para'))
+
+    encoder_out = bi_lstm_encoder(
+        input_seq=input_embedding,
+        gate_size=hidden_size,
+        para_name=para_name,
+        args=args)
+    return dropout(encoder_out, args)
+
+
+def attn_flow(q_enc, p_enc, p_ids_name, args):
+    tag = p_ids_name + "::"
+    drnn = layers.DynamicRNN()
+    with drnn.block():
+        h_cur = drnn.step_input(p_enc)
+        u_all = drnn.static_input(q_enc)
+        h_expd = layers.sequence_expand(x=h_cur, y=u_all)
+        s_t_mul = layers.elementwise_mul(x=u_all, y=h_expd, axis=0)
+        s_t_sum = layers.reduce_sum(input=s_t_mul, dim=1, keep_dim=True)
+        s_t_re = layers.reshape(s_t_sum, shape=[-1, 0])
+        s_t = layers.sequence_softmax(input=s_t_re)
+        u_expr = layers.elementwise_mul(x=u_all, y=s_t, axis=0)
+        u_expr = layers.sequence_pool(input=u_expr, pool_type='sum')
+
+        b_t = layers.sequence_pool(input=s_t_sum, pool_type='max')
+        drnn.output(u_expr, b_t)
+    U_expr, b = drnn()
+    b_norm = layers.sequence_softmax(input=b)
+    h_expr = layers.elementwise_mul(x=p_enc, y=b_norm, axis=0)
+    h_expr = layers.sequence_pool(input=h_expr, pool_type='sum')
+
+    H_expr = layers.sequence_expand(x=h_expr, y=p_enc)
+    H_expr = layers.lod_reset(x=H_expr, y=p_enc)
+    h_u = layers.elementwise_mul(x=p_enc, y=U_expr, axis=0)
+    h_h = layers.elementwise_mul(x=p_enc, y=H_expr, axis=0)
+
+    g = layers.concat(input=[p_enc, U_expr, h_u, h_h], axis=1)
+    return dropout(g, args)
+
+
+def lstm_step(x_t, hidden_t_prev, cell_t_prev, size, para_name, args):
+    def linear(inputs, para_name, args):
+        return layers.fc(input=inputs,
+                         size=size,
+                         param_attr=fluid.ParamAttr(name=para_name + '_w'),
+                         bias_attr=fluid.ParamAttr(name=para_name + '_b'))
+
+    input_cat = layers.concat([hidden_t_prev, x_t], axis=1)
+    forget_gate = layers.sigmoid(x=linear(input_cat, para_name + '_lstm_f',
+                                          args))
+    input_gate = layers.sigmoid(x=linear(input_cat, para_name + '_lstm_i',
+                                         args))
+    output_gate = layers.sigmoid(x=linear(input_cat, para_name + '_lstm_o',
+                                          args))
+    cell_tilde = layers.tanh(x=linear(input_cat, para_name + '_lstm_c', args))
+
+    cell_t = layers.sums(input=[
+        layers.elementwise_mul(
+            x=forget_gate, y=cell_t_prev), layers.elementwise_mul(
+                x=input_gate, y=cell_tilde)
+    ])
+
+    hidden_t = layers.elementwise_mul(x=output_gate, y=layers.tanh(x=cell_t))
+
+    return hidden_t, cell_t
+
+
+# Pointer network decoder
+def point_network_decoder(p_vec, q_vec, hidden_size, args):
+    tag = 'pn_decoder:'
+    init_random = fluid.initializer.Normal(loc=0.0, scale=1.0)
+
+    random_attn = layers.create_parameter(
+        shape=[1, hidden_size],
+        dtype='float32',
+        default_initializer=init_random)
+    random_attn = layers.fc(
+        input=random_attn,
+        size=hidden_size,
+        act=None,
+        param_attr=fluid.ParamAttr(name=tag + 'random_attn_fc_w'),
+        bias_attr=fluid.ParamAttr(name=tag + 'random_attn_fc_b'))
+    random_attn = layers.reshape(random_attn, shape=[-1])
+    U = layers.fc(input=q_vec,
+                  param_attr=fluid.ParamAttr(name=tag + 'q_vec_fc_w'),
+                  bias_attr=False,
+                  size=hidden_size,
+                  act=None) + random_attn
+    U = layers.tanh(U)
+
+    logits = layers.fc(input=U,
+                       param_attr=fluid.ParamAttr(name=tag + 'logits_fc_w'),
+                       bias_attr=fluid.ParamAttr(name=tag + 'logits_fc_b'),
+                       size=1,
+                       act=None)
+    scores = layers.sequence_softmax(input=logits)
+    pooled_vec = layers.elementwise_mul(x=q_vec, y=scores, axis=0)
+    pooled_vec = layers.sequence_pool(input=pooled_vec, pool_type='sum')
+
+    init_state = layers.fc(
+        input=pooled_vec,
+        param_attr=fluid.ParamAttr(name=tag + 'init_state_fc_w'),
+        bias_attr=fluid.ParamAttr(name=tag + 'init_state_fc_b'),
+        size=hidden_size,
+        act=None)
+
+    def custom_dynamic_rnn(p_vec, init_state, hidden_size, para_name, args):
+        tag = para_name + "custom_dynamic_rnn:"
+
+        def static_rnn(step,
+                       p_vec=p_vec,
+                       init_state=None,
+                       para_name='',
+                       args=args):
+            tag = para_name + "static_rnn:"
+            ctx = layers.fc(
+                input=p_vec,
+                param_attr=fluid.ParamAttr(name=tag + 'context_fc_w'),
+                bias_attr=fluid.ParamAttr(name=tag + 'context_fc_b'),
+                size=hidden_size,
+                act=None)
+
+            beta = []
+            c_prev = init_state
+            m_prev = init_state
+            for i in range(step):
+                m_prev0 = layers.fc(
+                    input=m_prev,
+                    size=hidden_size,
+                    act=None,
+                    param_attr=fluid.ParamAttr(name=tag + 'm_prev0_fc_w'),
+                    bias_attr=fluid.ParamAttr(name=tag + 'm_prev0_fc_b'))
+                m_prev1 = layers.sequence_expand(x=m_prev0, y=ctx)
+
+                Fk = ctx + m_prev1
+                Fk = layers.tanh(Fk)
+                logits = layers.fc(
+                    input=Fk,
+                    size=1,
+                    act=None,
+                    param_attr=fluid.ParamAttr(name=tag + 'logits_fc_w'),
+                    bias_attr=fluid.ParamAttr(name=tag + 'logits_fc_b'))
+
+                scores = layers.sequence_softmax(input=logits)
+                attn_ctx = layers.elementwise_mul(x=p_vec, y=scores, axis=0)
+                attn_ctx = layers.sequence_pool(input=attn_ctx, pool_type='sum')
+
+                hidden_t, cell_t = lstm_step(
+                    attn_ctx,
+                    hidden_t_prev=m_prev,
+                    cell_t_prev=c_prev,
+                    size=hidden_size,
+                    para_name=tag,
+                    args=args)
+                m_prev = hidden_t
+                c_prev = cell_t
+                beta.append(scores)
+            return beta
+
+        return static_rnn(
+            2, p_vec=p_vec, init_state=init_state, para_name=para_name)
+
+    fw_outputs = custom_dynamic_rnn(p_vec, init_state, hidden_size, tag + "fw:",
+                                    args)
+    bw_outputs = custom_dynamic_rnn(p_vec, init_state, hidden_size, tag + "bw:",
+                                    args)
+
+    start_prob = layers.elementwise_add(
+        x=fw_outputs[0], y=bw_outputs[1], axis=0) / 2
+    end_prob = layers.elementwise_add(
+        x=fw_outputs[1], y=bw_outputs[0], axis=0) / 2
+
+    return start_prob, end_prob
+
+
+def fusion(g, args):
+    m = bi_lstm_encoder(
+        input_seq=g, gate_size=args.hidden_size, para_name='fusion', args=args)
+    return dropout(m, args)
+
+
+def rc_model(hidden_size, vocab, args):
+    emb_shape = [vocab.size(), vocab.embed_dim]
+    # stage 1: encode
+    p_ids_names = []
+    q_ids_names = []
+    ms = []
+    gs = []
+    qs = []
+    for i in range(args.doc_num):
+        p_ids_name = "pids_%d" % i
+        p_ids_names.append(p_ids_name)
+        p_enc_i = encoder(p_ids_name, 'p_enc', emb_shape, hidden_size, args)
+
+        q_ids_name = "qids_%d" % i
+        q_ids_names.append(q_ids_name)
+        q_enc_i = encoder(q_ids_name, 'q_enc', emb_shape, hidden_size, args)
+
+        # stage 2: match
+        g_i = attn_flow(q_enc_i, p_enc_i, p_ids_name, args)
+        # stage 3: fusion
+        m_i = fusion(g_i, args)
+        ms.append(m_i)
+        gs.append(g_i)
+        qs.append(q_enc_i)
+    m = layers.sequence_concat(input=ms)
+    g = layers.sequence_concat(input=gs)
+    q_vec = layers.sequence_concat(input=qs)
+
+    # stage 4: decode
+    start_probs, end_probs = point_network_decoder(
+        p_vec=m, q_vec=q_vec, hidden_size=hidden_size, args=args)
+
+    start_labels = layers.data(
+        name="start_lables", shape=[1], dtype='float32', lod_level=1)
+    end_labels = layers.data(
+        name="end_lables", shape=[1], dtype='float32', lod_level=1)
+
+    cost0 = layers.sequence_pool(
+        layers.cross_entropy(
+            input=start_probs, label=start_labels, soft_label=True),
+        'sum')
+    cost1 = layers.sequence_pool(
+        layers.cross_entropy(
+            input=end_probs, label=end_labels, soft_label=True),
+        'sum')
+
+    cost0 = layers.mean(cost0)
+    cost1 = layers.mean(cost1)
+    cost = cost0 + cost1
+    cost.persistable = True
+
+    feeding_list = q_ids_names + ["start_lables", "end_lables"] + p_ids_names
+    return cost, start_probs, end_probs, feeding_list
--- a/fluid/machine_reading_comprehension/run.py
+++ b/fluid/machine_reading_comprehension/run.py
--- a/fluid/machine_reading_comprehension/run.sh
+++ b/fluid/machine_reading_comprehension/run.sh
+export CUDA_VISIBLE_DEVICES=1
+python run.py   \
+--trainset 'data/preprocessed/trainset/search.train.json' \
+           'data/preprocessed/trainset/zhidao.train.json' \
+--devset 'data/preprocessed/devset/search.dev.json' \
+         'data/preprocessed/devset/zhidao.dev.json' \
+--testset 'data/preprocessed/testset/search.test.json' \
+          'data/preprocessed/testset/zhidao.test.json' \
+--vocab_dir 'data/vocab' \
+--use_gpu true \
+--save_dir ./models \
+--pass_num 10 \
+--learning_rate 0.001 \
+--batch_size 8 \
+--embed_size 300 \
+--hidden_size 150 \
+--max_p_num 5 \
+--max_p_len 500 \
+--max_q_len 60 \
+--max_a_len 200 \
+--drop_rate 0.2 "$@"
--- a/fluid/machine_reading_comprehension/utils/__init__.py
+++ b/fluid/machine_reading_comprehension/utils/__init__.py
+# coding:utf8
+# ==============================================================================
+# Copyright 2017 Baidu.com, Inc. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""
+This package implements some utility functions shared by the PaddlePaddle
+and TensorFlow model implementations.
+
+Authors: liuyuan(liuyuan04@baidu.com)
+Date:    2017/10/06 18:23:06
+"""
+
+from .dureader_eval import compute_bleu_rouge
+from .dureader_eval import normalize
+from .preprocess import find_fake_answer
+from .preprocess import find_best_question_match
+
+__all__ = [
+    'compute_bleu_rouge',
+    'normalize',
+    'find_fake_answer',
+    'find_best_question_match',
+]
--- a/fluid/machine_reading_comprehension/utils/download_thirdparty.sh
+++ b/fluid/machine_reading_comprehension/utils/download_thirdparty.sh
+#!/bin/bash
+# ==============================================================================
+# Copyright 2017 Baidu.com, Inc. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+
+# We use BLEU and ROUGE as evaluation metrics. Their calculation
+# relies on the scoring scripts under "https://github.com/tylin/coco-caption"
+
+bleu_base_url='https://raw.githubusercontent.com/tylin/coco-caption/master/pycocoevalcap/bleu'
+bleu_files=("LICENSE" "__init__.py" "bleu.py" "bleu_scorer.py")
+
+rouge_base_url="https://raw.githubusercontent.com/tylin/coco-caption/master/pycocoevalcap/rouge"
+rouge_files=("__init__.py" "rouge.py")
+
+download() {
+    local metric=$1; shift;
+    local base_url=$1; shift;
+    local fnames=("$@");
+
+    mkdir -p "${metric}"
+    for fname in "${fnames[@]}";
+    do
+        printf "downloading: %s\n" "${base_url}/${fname}"
+        wget --no-check-certificate "${base_url}/${fname}" -O "${metric}/${fname}"
+    done
+}
+
+# prepare rouge
+download "rouge_metric" ${rouge_base_url} ${rouge_files[@]}
+
+# prepare bleu
+download "bleu_metric" ${bleu_base_url} ${bleu_files[@]}
+
+# convert python 2.x source code to python 3.x
+2to3 -w "../utils/bleu_metric/bleu_scorer.py"
+2to3 -w "../utils/bleu_metric/bleu.py"
--- a/fluid/machine_reading_comprehension/utils/dureader_eval.py
+++ b/fluid/machine_reading_comprehension/utils/dureader_eval.py
+# -*- coding:utf8 -*-
+# ==============================================================================
+# Copyright 2017 Baidu.com, Inc. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""
+This module computes evaluation metrics for DuReader dataset.
+"""
+
+import argparse
+import json
+import sys
+import zipfile
+
+from collections import Counter
+from .bleu_metric.bleu import Bleu
+from .rouge_metric.rouge import Rouge
+
+EMPTY = ''
+YESNO_LABELS = set(['Yes', 'No', 'Depends'])
+
+
+def normalize(s):
+    """
+    Normalize strings to space joined chars.
+
+    Args:
+        s: a list of strings.
+
+    Returns:
+        A list of normalized strings.
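+
+    Example (illustrative; the input strings are made up):
+        >>> normalize(['ab c', ' d '])
+        ['a b c', 'd']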
+    """
+    if not s:
+        return s
+    normalized = []
+    for ss in s:
+        tokens = [c for c in list(ss) if len(c.strip()) != 0]
+        normalized.append(' '.join(tokens))
+    return normalized
+
+
+def data_check(obj, task):
+    """
+    Check data.
+
+    Raises:
+        Raises AssertionError when data is not legal.
+    """
+    assert 'question_id' in obj, "Missing 'question_id' field."
+    assert 'question_type' in obj, \
+            "Missing 'question_type' field. question_id: {}".format(obj['question_id'])
+
+    assert 'yesno_answers' in obj, \
+            "Missing 'yesno_answers' field. question_id: {}".format(obj['question_id'])
+    assert isinstance(obj['yesno_answers'], list), \
+            r"""'yesno_answers' field must be a list, if the 'question_type' is not
+            'YES_NO', then this field should be an empty list.
+            question_id: {}""".format(obj['question_id'])
+
+    assert 'entity_answers' in obj, \
+            "Missing 'entity_answers' field. question_id: {}".format(obj['question_id'])
+    assert isinstance(obj['entity_answers'], list) \
+            and len(obj['entity_answers']) > 0, \
+            r"""'entity_answers' field must be a list, and has at least one element,
+            which can be an empty list. question_id: {}""".format(obj['question_id'])
+
+
+def read_file(file_name, task, is_ref=False):
+    """
+    Read predicted answers or reference answers from file.
+
+    Args:
+        file_name: the name of the file containing predict result or reference
+                   result.
+
+    Returns:
+        A dictionary mapping question_id to the result information. The result
+        information itself is also a dictionary that has four keys:
+        - question_type: type of the query.
+        - yesno_answers: A list of yesno answers corresponding to 'answers'.
+        - answers: A list of predicted answers.
+        - entity_answers: A list, each element is also a list containing the entities
+                    tagged out from the corresponding answer string.
+    """
+
+    def _open(file_name, mode, zip_obj=None):
+        if zip_obj is not None:
+            return zip_obj.open(file_name, mode)
+        return open(file_name, mode)
+
+    results = {}
+    keys = ['answers', 'yesno_answers', 'entity_answers', 'question_type']
+    if is_ref:
+        keys += ['source']
+
+    zf = zipfile.ZipFile(file_name, 'r') if file_name.endswith('.zip') else None
+    file_list = [file_name] if zf is None else zf.namelist()
+
+    for fn in file_list:
+        for line in _open(fn, 'r', zip_obj=zf):
+            try:
+                obj = json.loads(line.strip())
+            except ValueError:
+                raise ValueError("Every line of data should be legal json")
+            data_check(obj, task)
+            qid = obj['question_id']
+            assert qid not in results, "Duplicate question_id: {}".format(qid)
+            results[qid] = {}
+            for k in keys:
+                results[qid][k] = obj[k]
+    return results
+
+
+def compute_bleu_rouge(pred_dict, ref_dict, bleu_order=4):
+    """
+    Compute bleu and rouge scores.
+    """
+    assert set(pred_dict.keys()) == set(ref_dict.keys()), \
+            "missing keys: {}".format(set(ref_dict.keys()) - set(pred_dict.keys()))
+    scores = {}
+    bleu_scores, _ = Bleu(bleu_order).compute_score(ref_dict, pred_dict)
+    for i, bleu_score in enumerate(bleu_scores):
+        scores['Bleu-%d' % (i + 1)] = bleu_score
+    rouge_score, _ = Rouge().compute_score(ref_dict, pred_dict)
+    scores['Rouge-L'] = rouge_score
+    return scores
+
+
+def local_prf(pred_list, ref_list):
+    """
+    Compute local precision recall and f1-score,
+    given only one prediction list and one reference list
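+
+    Example (illustrative; the token lists are made up). One of the two
+    predicted tokens also appears in the reference:
+        >>> local_prf(['a', 'b'], ['b', 'c'])
+        (0.5, 0.5, 0.5)
+        >>> local_prf(['a'], ['b'])
+        (0, 0, 0)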
+    """
+    common = Counter(pred_list) & Counter(ref_list)
+    num_same = sum(common.values())
+    if num_same == 0:
+        return 0, 0, 0
+    p = 1.0 * num_same / len(pred_list)
+    r = 1.0 * num_same / len(ref_list)
+    f1 = (2 * p * r) / (p + r)
+    return p, r, f1
+
+
+def compute_prf(pred_dict, ref_dict):
+    """
+    Compute precision recall and f1-score.
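+
+    Example (illustrative; ids and entities are made up). Each prediction
+    holds exactly one entity list, each reference one or more:
+        >>> scores = compute_prf({'q1': [['a', 'b']]}, {'q1': [['b', 'c']]})
+        >>> scores == {'Precision': 0.5, 'Recall': 0.5, 'F1': 0.5}
+        True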
+    """
+    pred_question_ids = set(pred_dict.keys())
+    ref_question_ids = set(ref_dict.keys())
+    correct_preds, total_correct, total_preds = 0, 0, 0
+    for question_id in ref_question_ids:
+        pred_entity_list = pred_dict.get(question_id, [[]])
+        assert len(pred_entity_list) == 1, \
+            'the number of entity list for question_id {} is not 1.'.format(question_id)
+        pred_entity_list = pred_entity_list[0]
+        all_ref_entity_lists = ref_dict[question_id]
+        best_local_f1 = 0
+        best_ref_entity_list = None
+        for ref_entity_list in all_ref_entity_lists:
+            local_f1 = local_prf(pred_entity_list, ref_entity_list)[2]
+            if local_f1 > best_local_f1:
+                best_ref_entity_list = ref_entity_list
+                best_local_f1 = local_f1
+        if best_ref_entity_list is None:
+            if len(all_ref_entity_lists) > 0:
+                best_ref_entity_list = sorted(
+                    all_ref_entity_lists, key=lambda x: len(x))[0]
+            else:
+                best_ref_entity_list = []
+        gold_entities = set(best_ref_entity_list)
+        pred_entities = set(pred_entity_list)
+        correct_preds += len(gold_entities & pred_entities)
+        total_preds += len(pred_entities)
+        total_correct += len(gold_entities)
+    p = float(correct_preds) / total_preds if correct_preds > 0 else 0
+    r = float(correct_preds) / total_correct if correct_preds > 0 else 0
+    f1 = 2 * p * r / (p + r) if correct_preds > 0 else 0
+    return {'Precision': p, 'Recall': r, 'F1': f1}
+
+
+def prepare_prf(pred_dict, ref_dict):
+    """
+    Prepares data for calculation of prf scores.
+    """
+    preds = {k: v['entity_answers'] for k, v in pred_dict.items()}
+    refs = {k: v['entity_answers'] for k, v in ref_dict.items()}
+    return preds, refs
+
+
+def filter_dict(result_dict, key_tag):
+    """
+    Return the subset of result_dict whose keys end with 'key_tag'.
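+
+    Example (illustrative; the keys are made up):
+        >>> filter_dict({'1_Yes': ['a'], '1_No': ['b']}, 'Yes')
+        {'1_Yes': ['a']}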
+    """
+    filtered = {}
+    for k, v in result_dict.items():
+        if k.endswith(key_tag):
+            filtered[k] = v
+    return filtered
+
+
+def get_metrics(pred_result, ref_result, task, source):
+    """
+    Computes metrics.
+    """
+    metrics = {}
+
+    ref_result_filtered = {}
+    pred_result_filtered = {}
+    if source == 'both':
+        ref_result_filtered = ref_result
+        pred_result_filtered = pred_result
+    else:
+        for question_id, info in ref_result.items():
+            if info['source'] == source:
+                ref_result_filtered[question_id] = info
+                if question_id in pred_result:
+                    pred_result_filtered[question_id] = pred_result[question_id]
+
+    if task == 'main' or task == 'all' \
+            or task == 'description':
+        pred_dict, ref_dict = prepare_bleu(pred_result_filtered,
+                                           ref_result_filtered, task)
+        metrics = compute_bleu_rouge(pred_dict, ref_dict)
+    elif task == 'yesno':
+        pred_dict, ref_dict = prepare_bleu(pred_result_filtered,
+                                           ref_result_filtered, task)
+        keys = ['Yes', 'No', 'Depends']
+        preds = [filter_dict(pred_dict, k) for k in keys]
+        refs = [filter_dict(ref_dict, k) for k in keys]
+
+        metrics = compute_bleu_rouge(pred_dict, ref_dict)
+
+        for k, pred, ref in zip(keys, preds, refs):
+            m = compute_bleu_rouge(pred, ref)
+            k_metric = [(k + '|' + key, v) for key, v in m.items()]
+            metrics.update(k_metric)
+
+    elif task == 'entity':
+        pred_dict, ref_dict = prepare_prf(pred_result_filtered,
+                                          ref_result_filtered)
+        pred_dict_bleu, ref_dict_bleu = prepare_bleu(pred_result_filtered,
+                                                     ref_result_filtered, task)
+        metrics = compute_prf(pred_dict, ref_dict)
+        metrics.update(compute_bleu_rouge(pred_dict_bleu, ref_dict_bleu))
+    else:
+        raise ValueError("Illegal task name: {}".format(task))
+
+    return metrics
+
+
+def prepare_bleu(pred_result, ref_result, task):
+    """
+    Prepares data for calculation of bleu and rouge scores.
+    """
+    pred_list, ref_list = [], []
+    qids = ref_result.keys()
+    for qid in qids:
+        if task == 'main':
+            pred, ref = get_main_result(qid, pred_result, ref_result)
+        elif task == 'yesno':
+            pred, ref = get_yesno_result(qid, pred_result, ref_result)
+        elif task == 'all':
+            pred, ref = get_all_result(qid, pred_result, ref_result)
+        elif task == 'entity':
+            pred, ref = get_entity_result(qid, pred_result, ref_result)
+        elif task == 'description':
+            pred, ref = get_desc_result(qid, pred_result, ref_result)
+        else:
+            raise ValueError("Illegal task name: {}".format(task))
+        if pred and ref:
+            pred_list += pred
+            ref_list += ref
+    pred_dict = dict(pred_list)
+    ref_dict = dict(ref_list)
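+    # Iterate over a copy of the items: entries are deleted inside the loop,
+    # which would otherwise raise a RuntimeError on Python 3.
+    for qid, ans in list(ref_dict.items()):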
+        ref_dict[qid] = normalize(ref_dict[qid])
+        pred_dict[qid] = normalize(pred_dict.get(qid, [EMPTY]))
+        if not ans or ans == [EMPTY]:
+            del ref_dict[qid]
+            del pred_dict[qid]
+
+    for k, v in pred_dict.items():
+        assert len(v) == 1, \
+            "There should be only one predict answer. question_id: {}".format(k)
+    return pred_dict, ref_dict
+
+
+def get_main_result(qid, pred_result, ref_result):
+    """
+    Prepare answers for task 'main'.
+
+    Args:
+        qid: question_id.
+        pred_result: A dict including all question_id's result information read
+                     from args.pred_file.
+        ref_result: A dict including all question_id's result information read
+                    from args.ref_file.
+    Returns:
+        Two lists, the first one contains predict result, the second
+        one contains reference result of the same question_id. Each list has
+        elements of tuple (question_id, answers), 'answers' is a list of strings.
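+
+    Example (illustrative; ids and answers are made up). Only the first
+    predicted answer is kept:
+        >>> get_main_result('q1', {'q1': {'answers': ['x', 'y']}},
+        ...                 {'q1': {'answers': ['z']}})
+        ([('q1', ['x'])], [('q1', ['z'])])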
+    """
+    ref_ans = ref_result[qid]['answers']
+    if not ref_ans:
+        ref_ans = [EMPTY]
+    pred_ans = pred_result.get(qid, {}).get('answers', [])[:1]
+    if not pred_ans:
+        pred_ans = [EMPTY]
+
+    return [(qid, pred_ans)], [(qid, ref_ans)]
+
+
+def get_entity_result(qid, pred_result, ref_result):
+    """
+    Prepare answers for task 'entity'.
+
+    Args:
+        qid: question_id.
+        pred_result: A dict including all question_id's result information read
+                     from args.pred_file.
+        ref_result: A dict including all question_id's result information read
+                    from args.ref_file.
+    Returns:
+        Two lists, the first one contains predict result, the second
+        one contains reference result of the same question_id. Each list has
+        elements of tuple (question_id, answers), 'answers' is a list of strings.
+    """
+    if ref_result[qid]['question_type'] != 'ENTITY':
+        return None, None
+    return get_main_result(qid, pred_result, ref_result)
+
+
+def get_desc_result(qid, pred_result, ref_result):
+    """
+    Prepare answers for task 'description'.
+
+    Args:
+        qid: question_id.
+        pred_result: A dict including all question_id's result information read
+                     from args.pred_file.
+        ref_result: A dict including all question_id's result information read
+                    from args.ref_file.
+    Returns:
+        Two lists, the first one contains predict result, the second
+        one contains reference result of the same question_id. Each list has
+        elements of tuple (question_id, answers), 'answers' is a list of strings.
+    """
+    if ref_result[qid]['question_type'] != 'DESCRIPTION':
+        return None, None
+    return get_main_result(qid, pred_result, ref_result)
+
+
+def get_yesno_result(qid, pred_result, ref_result):
+    """
+    Prepare answers for task 'yesno'.
+
+    Args:
+        qid: question_id.
+        pred_result: A dict including all question_id's result information read
+                     from args.pred_file.
+        ref_result: A dict including all question_id's result information read
+                    from args.ref_file.
+    Returns:
+        Two lists, the first one contains predict result, the second
+        one contains reference result of the same question_id. Each list has
+        elements of tuple (question_id, answers), 'answers' is a list of strings.
+    """
+
+    def _uniq(li, is_ref):
+        uniq_li = []
+        left = []
+        keys = set()
+        for k, v in li:
+            if k not in keys:
+                uniq_li.append((k, v))
+                keys.add(k)
+            else:
+                left.append((k, v))
+
+        if is_ref:
+            dict_li = dict(uniq_li)
+            for k, v in left:
+                dict_li[k] += v
+            uniq_li = [(k, v) for k, v in dict_li.items()]
+        return uniq_li
+
+    def _expand_result(uniq_li):
+        expanded = uniq_li[:]
+        keys = set([x[0] for x in uniq_li])
+        for k in YESNO_LABELS - keys:
+            expanded.append((k, [EMPTY]))
+        return expanded
+
+    def _get_yesno_ans(qid, result_dict, is_ref=False):
+        if qid not in result_dict:
+            return [(str(qid) + '_' + k, v) for k, v in _expand_result([])]
+        yesno_answers = result_dict[qid]['yesno_answers']
+        answers = result_dict[qid]['answers']
+        lbl_ans = _uniq([(k, [v]) for k, v in zip(yesno_answers, answers)],
+                        is_ref)
+        ret = [(str(qid) + '_' + k, v) for k, v in _expand_result(lbl_ans)]
+        return ret
+
+    if ref_result[qid]['question_type'] != 'YES_NO':
+        return None, None
+
+    ref_ans = _get_yesno_ans(qid, ref_result, is_ref=True)
+    pred_ans = _get_yesno_ans(qid, pred_result)
+    return pred_ans, ref_ans
+
+
+def get_all_result(qid, pred_result, ref_result):
+    """
+    Prepare answers for task 'all'.
+
+    Args:
+        qid: question_id.
+        pred_result: A dict including all question_id's result information read
+                     from args.pred_file.
+        ref_result: A dict including all question_id's result information read
+                    from args.ref_file.
+    Returns:
+        Two lists, the first one contains predict result, the second
+        one contains reference result of the same question_id. Each list has
+        elements of tuple (question_id, answers), 'answers' is a list of strings.
+    """
+    if ref_result[qid]['question_type'] == 'YES_NO':
+        return get_yesno_result(qid, pred_result, ref_result)
+    return get_main_result(qid, pred_result, ref_result)
+
+
+def format_metrics(metrics, task, err_msg):
+    """
+    Format metrics. The 'errorMsg' field reports any error that occurred
+    during evaluation.
+
+    Args:
+        metrics: A dict object containing metrics for different tasks.
+        task: Task name.
+        err_msg: Exception raised during evaluation.
+    Returns:
+        Formatted result.
+    """
+    result = {}
+    sources = ["both", "search", "zhidao"]
+    if err_msg is not None:
+        return {'errorMsg': str(err_msg), 'errorCode': 1, 'data': []}
+    data = []
+    if task != 'all' and task != 'main':
+        sources = ["both"]
+
+    if task == 'entity':
+        metric_names = ["Bleu-4", "Rouge-L"]
+        metric_names_prf = ["F1", "Precision", "Recall"]
+        for name in metric_names + metric_names_prf:
+            for src in sources:
+                obj = {
+                    "name": name,
+                    "value": round(metrics[src].get(name, 0) * 100, 2),
+                    "type": src,
+                }
+                data.append(obj)
+    elif task == 'yesno':
+        metric_names = ["Bleu-4", "Rouge-L"]
+        details = ["Yes", "No", "Depends"]
+        src = sources[0]
+        for name in metric_names:
+            obj = {
+                "name": name,
+                "value": round(metrics[src].get(name, 0) * 100, 2),
+                "type": 'All',
+            }
+            data.append(obj)
+            for d in details:
+                obj = {
+                    "name": name,
+                    "value": \
+                        round(metrics[src].get(d + '|' + name, 0) * 100, 2),
+                    "type": d,
+                    }
+                data.append(obj)
+    else:
+        metric_names = ["Bleu-4", "Rouge-L"]
+        for name in metric_names:
+            for src in sources:
+                obj = {
+                    "name": name,
+                    "value": \
+                        round(metrics[src].get(name, 0) * 100, 2),
+                    "type": src,
+                    }
+                data.append(obj)
+
+    result["data"] = data
+    result["errorCode"] = 0
+    result["errorMsg"] = "success"
+
+    return result
+
+
+def main(args):
+    """
+    Do evaluation.
+    """
+    err = None
+    metrics = {}
+    try:
+        pred_result = read_file(args.pred_file, args.task)
+        ref_result = read_file(args.ref_file, args.task, is_ref=True)
+        sources = ['both', 'search', 'zhidao']
+        if args.task not in set(['main', 'all']):
+            sources = sources[:1]
+        for source in sources:
+            metrics[source] = get_metrics(pred_result, ref_result, args.task,
+                                          source)
+    except ValueError as ve:
+        err = ve
+    except AssertionError as ae:
+        err = ae
+
+    print(json.dumps(
+        format_metrics(metrics, args.task, err), ensure_ascii=False).encode(
+            'utf8'))
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    parser.add_argument('pred_file', help='predict file')
+    parser.add_argument('ref_file', help='reference file')
+    parser.add_argument(
+        'task', help='task name: Main|Yes_No|All|Entity|Description')
+
+    args = parser.parse_args()
+    args.task = args.task.lower().replace('_', '')
+    main(args)
--- a/fluid/machine_reading_comprehension/utils/get_vocab.py
+++ b/fluid/machine_reading_comprehension/utils/get_vocab.py
+# -*- coding:utf8 -*-
+# ==============================================================================
+# Copyright 2017 Baidu.com, Inc. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""
+Utility function to generate vocabulary file.
+"""
+
+import argparse
+import sys
+import json
+
+from itertools import chain
+
+
+def get_vocab(files, vocab_file):
+    """
+    Builds vocabulary file from field 'segmented_paragraphs'
+    and 'segmented_question'.
+
+    Args:
+        files: A list of file names.
+        vocab_file: The file that stores the vocabulary.
+    """
+    vocab = {}
+    for f in files:
+        with open(f, 'r') as fin:
+            for line in fin:
+                obj = json.loads(line.strip())
+                paras = [
+                    chain(*d['segmented_paragraphs']) for d in obj['documents']
+                ]
+                doc_tokens = chain(*paras)
+                question_tokens = obj['segmented_question']
+                for t in list(doc_tokens) + question_tokens:
+                    vocab[t] = vocab.get(t, 0) + 1
+    # output
+    sorted_vocab = sorted(
+        [(v, c) for v, c in vocab.items()], key=lambda x: x[1], reverse=True)
+    with open(vocab_file, 'w') as outf:
+        for w, c in sorted_vocab:
+            print >> outf, '{}\t{}'.format(w.encode('utf8'), c)
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        '--files',
+        nargs='+',
+        required=True,
+        help='file list to count vocab from.')
+    parser.add_argument(
+        '--vocab', required=True, help='file to store counted vocab.')
+    args = parser.parse_args()
+    get_vocab(args.files, args.vocab)
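
Review note on get_vocab.py: `print >> outf` is Python 2 only syntax. Below is a minimal sketch of the same counting logic that runs on Python 3. It assumes each input line is a JSON record with `documents[*].segmented_paragraphs` and `segmented_question`, as in the diff above. `count_tokens` and the sample record are hypothetical names made up for illustration and are not part of this PR.

```python
import json
from collections import Counter
from itertools import chain


def count_tokens(lines):
    """Count token frequencies over JSON-lines records (hypothetical helper)."""
    vocab = Counter()
    for line in lines:
        obj = json.loads(line)
        for doc in obj['documents']:
            # segmented_paragraphs is a list of token lists; flatten it.
            vocab.update(chain.from_iterable(doc['segmented_paragraphs']))
        vocab.update(obj['segmented_question'])
    return vocab


# Made-up sample record, only to exercise the function.
sample = json.dumps({
    'documents': [{'segmented_paragraphs': [['a', 'b'], ['a']]}],
    'segmented_question': ['a', 'c'],
})
vocab = count_tokens([sample])

# Same "token<TAB>count" layout as the script, most frequent first.
for word, count in vocab.most_common():
    print('{}\t{}'.format(word, count))
```

`Counter.most_common()` replaces the explicit `sorted(..., reverse=True)` call. Whether the tie order matters for downstream vocabulary loading is not something the diff settles.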
--- a/fluid/machine_reading_comprehension/utils/marco_tokenize_data.py
+++ b/fluid/machine_reading_comprehension/utils/marco_tokenize_data.py
+#coding=utf8
+
+import os, sys, json
+import nltk
+
+
+def _nltk_tokenize(sequence):
+    tokens = nltk.word_tokenize(sequence)
+
+    cur_char_offset = 0
+    token_offsets = []
+    token_words = []
+    for token in tokens:
+        cur_char_offset = sequence.find(token, cur_char_offset)
+        token_offsets.append(
+            [cur_char_offset, cur_char_offset + len(token) - 1])
+        token_words.append(token)
+    return token_offsets, token_words
+
+
+def segment(input_js):
+    _, input_js['segmented_question'] = _nltk_tokenize(input_js['question'])
+    for doc_id, doc in enumerate(input_js['documents']):
+        doc['segmented_title'] = []
+        doc['segmented_paragraphs'] = []
+        for para_id, para in enumerate(doc['paragraphs']):
+            _, seg_para = _nltk_tokenize(para)
+            doc['segmented_paragraphs'].append(seg_para)
+    if 'answers' in input_js:
+        input_js['segmented_answers'] = []
+        for answer_id, answer in enumerate(input_js['answers']):
+            _, seg_answer = _nltk_tokenize(answer)
+            input_js['segmented_answers'].append(seg_answer)
+
+
+if __name__ == '__main__':
+    if len(sys.argv) != 2:
+        print('Usage: marco_tokenize_data.py <input_path>')
+        exit()
+
+    nltk.download('punkt')
+
+    for line in open(sys.argv[1]):
+        dureader_js = json.loads(line.strip())
+        segment(dureader_js)
+        print(json.dumps(dureader_js))
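
Review note on `_nltk_tokenize`: the offset loop restarts each search at the start of the previous match, so a repeated token can resolve to the same position twice. The callers in this file discard the offsets, so the output is unaffected. If the offsets are ever used, a minimal sketch of a cursor that advances past each match could look like the following. `token_offsets` is a hypothetical helper that takes an already tokenized list, so it does not depend on nltk.

```python
def token_offsets(sequence, tokens):
    """Return inclusive [start, end] character offsets for each token.

    Hypothetical helper: a token that does not occur verbatim in the
    remaining text gets [-1, -1] and the cursor stays where it was.
    """
    offsets = []
    cursor = 0
    for token in tokens:
        start = sequence.find(token, cursor)
        if start == -1:
            offsets.append([-1, -1])
            continue
        end = start + len(token) - 1
        offsets.append([start, end])
        cursor = end + 1  # continue searching after this token
    return offsets


offsets = token_offsets('no no', ['no', 'no'])
print(offsets)  # [[0, 1], [3, 4]]
```

Handling the not-found case explicitly also avoids passing a negative start index to `str.find`, which Python interprets as an offset from the end of the string.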
--- a/fluid/machine_reading_comprehension/utils/marcov1_to_dureader.py
+++ b/fluid/machine_reading_comprehension/utils/marcov1_to_dureader.py
+#coding=utf8
+
+import sys
+import json
+import pandas as pd
+
+
+def trans(input_js):
+    output_js = {}
+    output_js['question'] = input_js['query']
+    output_js['question_type'] = input_js['query_type']
+    output_js['question_id'] = input_js['query_id']
+    output_js['fact_or_opinion'] = ""
+    output_js['documents'] = []
+    for para_id, para in enumerate(input_js['passages']):
+        doc = {}
+        doc['title'] = ""
+        if 'is_selected' in para:
+            doc['is_selected'] = para['is_selected'] != 0
+        doc['paragraphs'] = [para['passage_text']]
+        output_js['documents'].append(doc)
+
+    if 'answers' in input_js:
+        output_js['answers'] = input_js['answers']
+    return output_js
+
+
+if __name__ == '__main__':
+    if len(sys.argv) != 2:
+        print('Usage: marcov1_to_dureader.py <input_path>')
+        exit()
+
+    df = pd.read_json(sys.argv[1])
+    for row in df.iterrows():
+        marco_js = json.loads(row[1].to_json())
+        dureader_js = trans(marco_js)
+        print(json.dumps(dureader_js))
--- a/fluid/machine_reading_comprehension/utils/marcov2_to_v1_tojsonl.py
+++ b/fluid/machine_reading_comprehension/utils/marcov2_to_v1_tojsonl.py
+import sys
+import json
+import pandas as pd
+
+if __name__ == '__main__':
+    if len(sys.argv) != 3:
+        print('Usage: marcov2_to_v1_tojsonl.py <input_path> <output_path>')
+        exit()
+    infile = sys.argv[1]
+    outfile = sys.argv[2]
+    df = pd.read_json(infile)
+    with open(outfile, 'w') as f:
+        for row in df.iterrows():
+            f.write(row[1].to_json() + '\n')
--- a/fluid/machine_reading_comprehension/utils/preprocess.py
+++ b/fluid/machine_reading_comprehension/utils/preprocess.py
+###############################################################################
+# ==============================================================================
+# Copyright 2017 Baidu.com, Inc. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""
+This module finds the most related paragraph of each document according to recall.
+"""
+
+import sys
+if sys.version[0] == '2':
+    reload(sys)
+    sys.setdefaultencoding("utf-8")
+import json
+from collections import Counter
+
+
+def precision_recall_f1(prediction, ground_truth):
+    """
+    This function calculates and returns the precision, recall and f1-score
+    Args:
+        prediction: prediction string or list to be matched
+        ground_truth: golden string or list reference
+    Returns:
+        floats of (p, r, f1)
+    Raises:
+        None
+    """
+    if not isinstance(prediction, list):
+        prediction_tokens = prediction.split()
+    else:
+        prediction_tokens = prediction
+    if not isinstance(ground_truth, list):
+        ground_truth_tokens = ground_truth.split()
+    else:
+        ground_truth_tokens = ground_truth
+    common = Counter(prediction_tokens) & Counter(ground_truth_tokens)
+    num_same = sum(common.values())
+    if num_same == 0:
+        return 0, 0, 0
+    p = 1.0 * num_same / len(prediction_tokens)
+    r = 1.0 * num_same / len(ground_truth_tokens)
+    f1 = (2 * p * r) / (p + r)
+    return p, r, f1
+
+
+def recall(prediction, ground_truth):
+    """
+    This function calculates and returns the recall
+    Args:
+        prediction: prediction string or list to be matched
+        ground_truth: golden string or list reference
+    Returns:
+        floats of recall
+    Raises:
+        None
+    """
+    return precision_recall_f1(prediction, ground_truth)[1]
+
+
+def f1_score(prediction, ground_truth):
+    """
+    This function calculates and returns the f1-score
+    Args:
+        prediction: prediction string or list to be matched
+        ground_truth: golden string or list reference
+    Returns:
+        floats of f1
+    Raises:
+        None
+    """
+    return precision_recall_f1(prediction, ground_truth)[2]
+
+
+def metric_max_over_ground_truths(metric_fn, prediction, ground_truths):
+    """
+    This function scores the prediction against each ground truth and returns
+    the maximum score
+    Args:
+        metric_fn: metric function which scores a prediction against a single ground truth.
+        prediction: prediction string or list to be matched
+        ground_truths: list of golden strings or token lists
+    Returns:
+        the maximum score returned by metric_fn over all ground truths
+    Raises:
+        None
+    """
+    scores_for_ground_truths = []
+    for ground_truth in ground_truths:
+        score = metric_fn(prediction, ground_truth)
+        scores_for_ground_truths.append(score)
+    return max(scores_for_ground_truths)
+
+
+def find_best_question_match(doc, question, with_score=False):
+    """
+    For each document, find the paragraph that best matches the question.
+    Args:
+        doc: The document object.
+        question: The question tokens.
+        with_score: If True, the match score is returned along with the
+            paragraph index.
+    Returns:
+        The index of the best match paragraph, if with_score=False,
+        otherwise returns a tuple of the index of the best match paragraph
+        and the match score of that paragraph.
+    """
+    most_related_para = -1
+    max_related_score = 0
+    most_related_para_len = 0
+    for p_idx, para_tokens in enumerate(doc['segmented_paragraphs']):
+        if len(question) > 0:
+            related_score = metric_max_over_ground_truths(recall, para_tokens,
+                                                          question)
+        else:
+            related_score = 0
+
+        if related_score > max_related_score \
+                or (related_score == max_related_score \
+                and len(para_tokens) < most_related_para_len):
+            most_related_para = p_idx
+            max_related_score = related_score
+            most_related_para_len = len(para_tokens)
+    if most_related_para == -1:
+        most_related_para = 0
+    if with_score:
+        return most_related_para, max_related_score
+    return most_related_para
+
+
+def find_fake_answer(sample):
+    """
+    For each document, finds the most related paragraph based on recall,
+    then finds a span that maximizes the f1_score compared with the gold answers
+    and uses this span as a fake answer span
+    Args:
+        sample: a sample in the dataset
+    Returns:
+        None
+    Raises:
+        None
+    """
+    for doc in sample['documents']:
+        most_related_para = -1
+        most_related_para_len = 999999
+        max_related_score = 0
+        for p_idx, para_tokens in enumerate(doc['segmented_paragraphs']):
+            if len(sample['segmented_answers']) > 0:
+                related_score = metric_max_over_ground_truths(
+                    recall, para_tokens, sample['segmented_answers'])
+            else:
+                continue
+            if related_score > max_related_score \
+                    or (related_score == max_related_score
+                        and len(para_tokens) < most_related_para_len):
+                most_related_para = p_idx
+                most_related_para_len = len(para_tokens)
+                max_related_score = related_score
+        doc['most_related_para'] = most_related_para
+
+    sample['answer_docs'] = []
+    sample['answer_spans'] = []
+    sample['fake_answers'] = []
+    sample['match_scores'] = []
+
+    best_match_score = 0
+    best_match_d_idx, best_match_span = -1, [-1, -1]
+    best_fake_answer = None
+    answer_tokens = set()
+    for segmented_answer in sample['segmented_answers']:
+        answer_tokens = answer_tokens | set(
+            [token for token in segmented_answer])
+    for d_idx, doc in enumerate(sample['documents']):
+        if not doc['is_selected']:
+            continue
+        if doc['most_related_para'] == -1:
+            doc['most_related_para'] = 0
+        most_related_para_tokens = doc['segmented_paragraphs'][doc[
+            'most_related_para']][:1000]
+        for start_tidx in range(len(most_related_para_tokens)):
+            if most_related_para_tokens[start_tidx] not in answer_tokens:
+                continue
+            for end_tidx in range(
+                    len(most_related_para_tokens) - 1, start_tidx - 1, -1):
+                span_tokens = most_related_para_tokens[start_tidx:end_tidx + 1]
+                if len(sample['segmented_answers']) > 0:
+                    match_score = metric_max_over_ground_truths(
+                        f1_score, span_tokens, sample['segmented_answers'])
+                else:
+                    match_score = 0
+                if match_score == 0:
+                    break
+                if match_score > best_match_score:
+                    best_match_d_idx = d_idx
+                    best_match_span = [start_tidx, end_tidx]
+                    best_match_score = match_score
+                    best_fake_answer = ''.join(span_tokens)
+    if best_match_score > 0:
+        sample['answer_docs'].append(best_match_d_idx)
+        sample['answer_spans'].append(best_match_span)
+        sample['fake_answers'].append(best_fake_answer)
+        sample['match_scores'].append(best_match_score)
+
+
+if __name__ == '__main__':
+    for line in sys.stdin:
+        sample = json.loads(line)
+        find_fake_answer(sample)
+        # the 'encoding' keyword of json.dumps exists only in Python 2
+        print(json.dumps(sample, ensure_ascii=False))
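
Review note on preprocess.py: a small worked example of the token-overlap metric that drives both the paragraph selection (recall) and the fake-answer span search (F1). This is a simplified restatement that accepts token lists only. `overlap_prf` and the sample tokens are hypothetical and chosen for illustration.

```python
from collections import Counter


def overlap_prf(prediction_tokens, ground_truth_tokens):
    """Precision, recall and F1 from multiset token overlap (illustrative)."""
    # Counter & Counter keeps each token with the minimum of its two counts.
    common = Counter(prediction_tokens) & Counter(ground_truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0, 0.0, 0.0
    p = num_same / len(prediction_tokens)
    r = num_same / len(ground_truth_tokens)
    return p, r, 2 * p * r / (p + r)


# 'the' occurs twice in the prediction but only once in the reference,
# so it contributes a single match: 2 shared tokens in total.
p, r, f1 = overlap_prf(['the', 'cat', 'sat', 'the'], ['the', 'cat'])
print(p, r, round(f1, 4))  # 0.5 1.0 0.6667
```

Because recall ignores prediction length, a long paragraph that contains every answer token scores 1.0. This is presumably why the selection loop breaks ties in favour of the shorter paragraph.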
--- a/fluid/machine_reading_comprehension/utils/run_marco2dureader_preprocess.sh
+++ b/fluid/machine_reading_comprehension/utils/run_marco2dureader_preprocess.sh
+#!/bin/bash
+
+input_file=$1
+output_file=$2
+
+# convert the data from MARCO V2 (json) format to MARCO V1 (jsonl) format.
+# the script was forked from the MARCO repo.
+# the MARCO V1 format is much easier to explore.
+python3 marcov2_to_v1_tojsonl.py $input_file $input_file.marcov1
+
+# convert the data from MARCO V1 format to DuReader format. 
+python3 marcov1_to_dureader.py $input_file.marcov1 >$input_file.dureader_raw
+
+# tokenize the data. 
+python3 marco_tokenize_data.py $input_file.dureader_raw >$input_file.segmented
+
+# find fake answers (indicating the start and end positions of answers in the document) for the train and dev sets.
+# note that this should not be applied to the test set, since it has no ground truth.
+python preprocess.py <$input_file.segmented >$output_file
+
+# remove the temporary data files.
+rm -rf $input_file.dureader_raw $input_file.segmented
--- a/fluid/machine_reading_comprehension/vocab.py
+++ b/fluid/machine_reading_comprehension/vocab.py
--- a/fluid/metric_learning/losses/datareader.py
+++ b/fluid/metric_learning/losses/datareader.py
 import os
 import math
 import random
-import cPickle
 import functools
 import numpy as np
-#import paddle.v2 as paddle
 import paddle
 from PIL import Image, ImageEnhance

@@ -45,9 +43,9 @@ for i, item in enumerate(test_list):
        test_data[label] = []
    test_data[label].append(path)

-print "train_data size:", len(train_data)
-print "test_data size:", len(test_data)
-print "test_data image number:", len(test_image_list)
+print("train_data size:", len(train_data))
+print("test_data size:", len(test_data))
+print("test_data image number:", len(test_image_list))
 random.shuffle(test_image_list)


@@ -214,11 +212,11 @@ def eml_iterator(data,
                 color_jitter=False,
                 rotate=False):
    def reader():
-        labs = data.keys()
+        labs = list(data.keys())
        lab_num = len(labs)
-        ind = range(0, lab_num)
+        ind = list(range(0, lab_num))
        assert batch_size % samples_each_class == 0, "batch_size % samples_each_class != 0"
-        num_class = batch_size/samples_each_class
+        num_class = batch_size // samples_each_class
        for i in range(iter_size):
            random.shuffle(ind)
            for n in range(num_class):
@@ -245,9 +243,9 @@ def quadruplet_iterator(data,
                        color_jitter=False,
                        rotate=False):
    def reader():
-        labs = data.keys()
+        labs = list(data.keys())
        lab_num = len(labs)
-        ind = range(0, lab_num)
+        ind = list(range(0, lab_num))
        for i in range(iter_size):
            random.shuffle(ind)
            ind_sample = ind[:class_num]
@@ -255,7 +253,7 @@ def quadruplet_iterator(data,
            for ind_i in ind_sample:
                lab = labs[ind_i]
                data_list = data[lab]
-                data_ind = range(0, len(data_list))
+                data_ind = list(range(0, len(data_list)))
                random.shuffle(data_ind)
                anchor_ind = data_ind[:samples_each_class]

@@ -277,15 +275,15 @@ def triplet_iterator(data,
                     color_jitter=False,
                     rotate=False):
    def reader():
-        labs = data.keys()
+        labs = list(data.keys())
        lab_num = len(labs)
-        ind = range(0, lab_num)
+        ind = list(range(0, lab_num))
        for i in range(iter_size):
            random.shuffle(ind)
            ind_pos, ind_neg = ind[:2]
            lab_pos = labs[ind_pos]
            pos_data_list = data[lab_pos]
-            data_ind = range(0, len(pos_data_list))
+            data_ind = list(range(0, len(pos_data_list)))
            random.shuffle(data_ind)
            anchor_ind, pos_ind = data_ind[:2]

@@ -346,7 +344,7 @@ def quadruplet_train(class_num, samples_each_class):
            
 def triplet_train(batch_size):
    assert(batch_size % 3 == 0)
-    return triplet_iterator(train_data, 'train', batch_size, iter_size = batch_size/3 * 100, \
+    return triplet_iterator(train_data, 'train', batch_size, iter_size = batch_size//3 * 100, \
                           shuffle=True, color_jitter=False, rotate=False)

 def test():
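
Review note on datareader.py: the hunks above are Python 3 porting fixes. A minimal sketch of why each change is needed, using a made-up `data` dict that only mirrors the label-to-paths shape used by the iterators:

```python
import random

# Made-up stand-in for train_data: label -> list of image paths.
data = {'cat': ['c1.jpg', 'c2.jpg'], 'dog': ['d1.jpg'], 'bird': ['b1.jpg']}

# In Python 3, dict.keys() returns a view that cannot be indexed,
# so it is materialised before labs[ind_i] style lookups.
labs = list(data.keys())

# range() is lazy and immutable in Python 3; random.shuffle needs a
# mutable sequence, hence the list() wrapper.
ind = list(range(len(labs)))
random.seed(0)
random.shuffle(ind)

# '/' is true division in Python 3 and yields a float, which range()
# rejects; '//' keeps the integer result.
batch_size, samples_each_class = 6, 2
num_class = batch_size // samples_each_class

picked = [labs[i] for i in ind[:num_class]]
print(num_class, picked)
```

The same three patterns recur in `eml_iterator`, `quadruplet_iterator`, and `triplet_iterator`.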

--- a/fluid/metric_learning/losses/emlloss.py
+++ b/fluid/metric_learning/losses/emlloss.py
--- a/fluid/metric_learning/losses/metrics.py
+++ b/fluid/metric_learning/losses/metrics.py
--- a/fluid/metric_learning/losses/quadrupletloss.py
+++ b/fluid/metric_learning/losses/quadrupletloss.py
--- a/fluid/metric_learning/losses/tripletloss.py
+++ b/fluid/metric_learning/losses/tripletloss.py
--- a/fluid/metric_learning/models/resnet.py
+++ b/fluid/metric_learning/models/resnet.py
--- a/fluid/metric_learning/models/se_resnext.py
+++ b/fluid/metric_learning/models/se_resnext.py
--- a/fluid/metric_learning/train.py
+++ b/fluid/metric_learning/train.py
--- a/fluid/metric_learning/utility.py
+++ b/fluid/metric_learning/utility.py
--- a/fluid/neural_machine_translation/rnn_search/README.md
+++ b/fluid/neural_machine_translation/rnn_search/README.md
--- a/fluid/neural_machine_translation/rnn_search/_ce.py
+++ b/fluid/neural_machine_translation/rnn_search/_ce.py
--- a/fluid/neural_machine_translation/rnn_search/args.py
+++ b/fluid/neural_machine_translation/rnn_search/args.py
--- a/fluid/neural_machine_translation/rnn_search/attention_model.py
+++ b/fluid/neural_machine_translation/rnn_search/attention_model.py
--- a/fluid/neural_machine_translation/rnn_search/images/bi_rnn.png
+++ b/fluid/neural_machine_translation/rnn_search/images/bi_rnn.png
--- a/fluid/neural_machine_translation/rnn_search/images/decoder_attention.png
+++ b/fluid/neural_machine_translation/rnn_search/images/decoder_attention.png
--- a/fluid/neural_machine_translation/rnn_search/images/encoder_attention.png
+++ b/fluid/neural_machine_translation/rnn_search/images/encoder_attention.png
--- a/fluid/neural_machine_translation/transformer/README_cn.md
+++ b/fluid/neural_machine_translation/transformer/README_cn.md
--- a/fluid/neural_machine_translation/transformer/gen_data.sh
+++ b/fluid/neural_machine_translation/transformer/gen_data.sh
--- a/fluid/neural_machine_translation/transformer/infer.py
+++ b/fluid/neural_machine_translation/transformer/infer.py
--- a/fluid/neural_machine_translation/transformer/model.py
+++ b/fluid/neural_machine_translation/transformer/model.py
--- a/fluid/neural_machine_translation/transformer/reader.py
+++ b/fluid/neural_machine_translation/transformer/reader.py
--- a/fluid/neural_machine_translation/transformer/train.py
+++ b/fluid/neural_machine_translation/transformer/train.py
--- a/fluid/neural_machine_translation/transformer/util.py
+++ b/fluid/neural_machine_translation/transformer/util.py
--- a/fluid/object_detection/.gitignore
+++ b/fluid/object_detection/.gitignore
--- a/fluid/object_detection/train.py
+++ b/fluid/object_detection/train.py
--- a/fluid/ocr_recognition/attention_model.py
+++ b/fluid/ocr_recognition/attention_model.py
--- a/fluid/ocr_recognition/eval.py
+++ b/fluid/ocr_recognition/eval.py
--- a/fluid/ocr_recognition/infer.py
+++ b/fluid/ocr_recognition/infer.py
--- a/fluid/policy_gradient/brain.py
+++ b/fluid/policy_gradient/brain.py
--- a/fluid/recommendation/gru4rec/README.md
+++ b/fluid/recommendation/gru4rec/README.md
--- a/fluid/recommendation/gru4rec/convert_format.py
+++ b/fluid/recommendation/gru4rec/convert_format.py
--- a/fluid/recommendation/gru4rec/infer.py
+++ b/fluid/recommendation/gru4rec/infer.py
--- a/fluid/recommendation/gru4rec/small_test.txt
+++ b/fluid/recommendation/gru4rec/small_test.txt
--- a/fluid/recommendation/gru4rec/small_train.txt
+++ b/fluid/recommendation/gru4rec/small_train.txt
--- a/fluid/recommendation/gru4rec/train.py
+++ b/fluid/recommendation/gru4rec/train.py
--- a/fluid/recommendation/gru4rec/utils.py
+++ b/fluid/recommendation/gru4rec/utils.py
--- a/fluid/text_classification/train.py
+++ b/fluid/text_classification/train.py
--- a/fluid/text_matching_on_quora/README.md
+++ b/fluid/text_matching_on_quora/README.md
--- a/fluid/text_matching_on_quora/cdssm_base.log
+++ b/fluid/text_matching_on_quora/cdssm_base.log
--- a/fluid/text_matching_on_quora/configs/__init__.py
+++ b/fluid/text_matching_on_quora/configs/__init__.py
--- a/fluid/text_matching_on_quora/configs/basic_config.py
+++ b/fluid/text_matching_on_quora/configs/basic_config.py
--- a/fluid/text_matching_on_quora/configs/cdssm.py
+++ b/fluid/text_matching_on_quora/configs/cdssm.py
--- a/fluid/text_matching_on_quora/configs/dec_att.py
+++ b/fluid/text_matching_on_quora/configs/dec_att.py
--- a/fluid/text_matching_on_quora/configs/infer_sent.py
+++ b/fluid/text_matching_on_quora/configs/infer_sent.py
--- a/fluid/text_matching_on_quora/configs/sse.py
+++ b/fluid/text_matching_on_quora/configs/sse.py
--- a/fluid/text_matching_on_quora/data/prepare_quora_data.sh
+++ b/fluid/text_matching_on_quora/data/prepare_quora_data.sh
--- a/fluid/text_matching_on_quora/imgs/README.md
+++ b/fluid/text_matching_on_quora/imgs/README.md
--- a/fluid/text_matching_on_quora/imgs/models_test_acc.png
+++ b/fluid/text_matching_on_quora/imgs/models_test_acc.png
--- a/fluid/text_matching_on_quora/metric.py
+++ b/fluid/text_matching_on_quora/metric.py
--- a/fluid/text_matching_on_quora/models/__init__.py
+++ b/fluid/text_matching_on_quora/models/__init__.py
--- a/fluid/text_matching_on_quora/models/cdssm.py
+++ b/fluid/text_matching_on_quora/models/cdssm.py
--- a/fluid/text_matching_on_quora/models/dec_att.py
+++ b/fluid/text_matching_on_quora/models/dec_att.py
--- a/fluid/text_matching_on_quora/models/infer_sent.py
+++ b/fluid/text_matching_on_quora/models/infer_sent.py
--- a/fluid/text_matching_on_quora/models/match_layers.py
+++ b/fluid/text_matching_on_quora/models/match_layers.py
--- a/fluid/text_matching_on_quora/models/my_layers.py
+++ b/fluid/text_matching_on_quora/models/my_layers.py
--- a/fluid/text_matching_on_quora/models/pwim.py
+++ b/fluid/text_matching_on_quora/models/pwim.py
--- a/fluid/text_matching_on_quora/models/sse.py
+++ b/fluid/text_matching_on_quora/models/sse.py
--- a/fluid/text_matching_on_quora/models/test.py
+++ b/fluid/text_matching_on_quora/models/test.py
--- a/fluid/text_matching_on_quora/pretrained_word2vec.py
+++ b/fluid/text_matching_on_quora/pretrained_word2vec.py
--- a/fluid/text_matching_on_quora/quora_question_pairs.py
+++ b/fluid/text_matching_on_quora/quora_question_pairs.py
--- a/fluid/text_matching_on_quora/train_and_evaluate.py
+++ b/fluid/text_matching_on_quora/train_and_evaluate.py
--- a/fluid/text_matching_on_quora/utils.py
+++ b/fluid/text_matching_on_quora/utils.py
--- a/fluid/video_classification/eval.py
+++ b/fluid/video_classification/eval.py
--- a/fluid/video_classification/infer.py
+++ b/fluid/video_classification/infer.py
--- a/fluid/video_classification/reader.py
+++ b/fluid/video_classification/reader.py
--- a/fluid/video_classification/resnet.py
+++ b/fluid/video_classification/resnet.py
--- a/fluid/video_classification/train.py
+++ b/fluid/video_classification/train.py
--- a/fluid/video_classification/utility.py
+++ b/fluid/video_classification/utility.py
--- a/v2/README.cn.md
+++ b/v2/README.cn.md
--- a/v2/README.md
+++ b/v2/README.md
--- a/v2/conv_seq2seq/README.md
+++ b/v2/conv_seq2seq/README.md
--- a/v2/conv_seq2seq/beamsearch.py
+++ b/v2/conv_seq2seq/beamsearch.py
--- a/v2/conv_seq2seq/download.sh
+++ b/v2/conv_seq2seq/download.sh
--- a/v2/conv_seq2seq/infer.py
+++ b/v2/conv_seq2seq/infer.py
--- a/v2/conv_seq2seq/model.py
+++ b/v2/conv_seq2seq/model.py
--- a/v2/conv_seq2seq/preprocess.py
+++ b/v2/conv_seq2seq/preprocess.py
--- a/v2/conv_seq2seq/reader.py
+++ b/v2/conv_seq2seq/reader.py
--- a/v2/conv_seq2seq/train.py
+++ b/v2/conv_seq2seq/train.py
--- a/v2/ctr/README.cn.md
+++ b/v2/ctr/README.cn.md
--- a/v2/ctr/README.md
+++ b/v2/ctr/README.md
--- a/v2/ctr/avazu_data_processer.py
+++ b/v2/ctr/avazu_data_processer.py
--- a/v2/ctr/dataset.md
+++ b/v2/ctr/dataset.md
--- a/v2/ctr/images/lr_vs_dnn.jpg
+++ b/v2/ctr/images/lr_vs_dnn.jpg
--- a/v2/ctr/images/wide_deep.png
+++ b/v2/ctr/images/wide_deep.png
--- a/v2/ctr/infer.py
+++ b/v2/ctr/infer.py
--- a/v2/ctr/network_conf.py
+++ b/v2/ctr/network_conf.py
--- a/v2/ctr/reader.py
+++ b/v2/ctr/reader.py
--- a/v2/ctr/train.py
+++ b/v2/ctr/train.py
--- a/v2/ctr/utils.py
+++ b/v2/ctr/utils.py
--- a/v2/deep_fm/README.cn.md
+++ b/v2/deep_fm/README.cn.md
--- a/v2/deep_fm/README.md
+++ b/v2/deep_fm/README.md
--- a/v2/deep_fm/data/download.sh
+++ b/v2/deep_fm/data/download.sh
--- a/v2/deep_fm/infer.py
+++ b/v2/deep_fm/infer.py
--- a/v2/deep_fm/network_conf.py
+++ b/v2/deep_fm/network_conf.py
--- a/v2/deep_fm/preprocess.py
+++ b/v2/deep_fm/preprocess.py
--- a/v2/deep_fm/reader.py
+++ b/v2/deep_fm/reader.py
--- a/v2/deep_fm/train.py
+++ b/v2/deep_fm/train.py
--- a/v2/dssm/README.cn.md
+++ b/v2/dssm/README.cn.md
--- a/v2/dssm/README.md
+++ b/v2/dssm/README.md
--- a/v2/dssm/data/classification/test.txt
+++ b/v2/dssm/data/classification/test.txt
--- a/v2/dssm/data/classification/train.txt
+++ b/v2/dssm/data/classification/train.txt
--- a/v2/dssm/data/rank/test.txt
+++ b/v2/dssm/data/rank/test.txt
--- a/v2/dssm/data/rank/train.txt
+++ b/v2/dssm/data/rank/train.txt
--- a/v2/dssm/data/vocab.txt
+++ b/v2/dssm/data/vocab.txt
--- a/v2/dssm/images/dssm.jpg
+++ b/v2/dssm/images/dssm.jpg
--- a/v2/dssm/images/dssm.png
+++ b/v2/dssm/images/dssm.png
--- a/v2/dssm/images/dssm2.jpg
+++ b/v2/dssm/images/dssm2.jpg
--- a/v2/dssm/images/dssm2.png
+++ b/v2/dssm/images/dssm2.png
--- a/v2/dssm/images/dssm3.jpg
+++ b/v2/dssm/images/dssm3.jpg
--- a/v2/dssm/infer.py
+++ b/v2/dssm/infer.py
--- a/v2/dssm/network_conf.py
+++ b/v2/dssm/network_conf.py
--- a/v2/dssm/reader.py
+++ b/v2/dssm/reader.py
--- a/v2/dssm/train.py
+++ b/v2/dssm/train.py
--- a/v2/dssm/utils.py
+++ b/v2/dssm/utils.py
--- a/v2/generate_chinese_poetry/README.md
+++ b/v2/generate_chinese_poetry/README.md
--- a/v2/generate_chinese_poetry/README_en.md
+++ b/v2/generate_chinese_poetry/README_en.md
--- a/v2/generate_chinese_poetry/data/download.sh
+++ b/v2/generate_chinese_poetry/data/download.sh
--- a/v2/generate_chinese_poetry/generate.py
+++ b/v2/generate_chinese_poetry/generate.py
--- a/v2/generate_chinese_poetry/network_conf.py
+++ b/v2/generate_chinese_poetry/network_conf.py
--- a/v2/generate_chinese_poetry/preprocess.py
+++ b/v2/generate_chinese_poetry/preprocess.py
--- a/v2/generate_chinese_poetry/reader.py
+++ b/v2/generate_chinese_poetry/reader.py
--- a/v2/generate_chinese_poetry/train.py
+++ b/v2/generate_chinese_poetry/train.py
--- a/v2/generate_chinese_poetry/utils.py
+++ b/v2/generate_chinese_poetry/utils.py
--- a/v2/generate_sequence_by_rnn_lm/.gitignore
+++ b/v2/generate_sequence_by_rnn_lm/.gitignore
--- a/v2/generate_sequence_by_rnn_lm/README.md
+++ b/v2/generate_sequence_by_rnn_lm/README.md
--- a/v2/generate_sequence_by_rnn_lm/beam_search.py
+++ b/v2/generate_sequence_by_rnn_lm/beam_search.py
--- a/v2/generate_sequence_by_rnn_lm/config.py
+++ b/v2/generate_sequence_by_rnn_lm/config.py
--- a/v2/generate_sequence_by_rnn_lm/data/train_data_examples.txt
+++ b/v2/generate_sequence_by_rnn_lm/data/train_data_examples.txt
--- a/v2/generate_sequence_by_rnn_lm/generate.py
+++ b/v2/generate_sequence_by_rnn_lm/generate.py
--- a/v2/generate_sequence_by_rnn_lm/images/ngram.png
+++ b/v2/generate_sequence_by_rnn_lm/images/ngram.png
--- a/v2/generate_sequence_by_rnn_lm/images/rnn.png
+++ b/v2/generate_sequence_by_rnn_lm/images/rnn.png
--- a/v2/generate_sequence_by_rnn_lm/network_conf.py
+++ b/v2/generate_sequence_by_rnn_lm/network_conf.py
--- a/v2/generate_sequence_by_rnn_lm/reader.py
+++ b/v2/generate_sequence_by_rnn_lm/reader.py
--- a/v2/generate_sequence_by_rnn_lm/train.py
+++ b/v2/generate_sequence_by_rnn_lm/train.py
--- a/v2/generate_sequence_by_rnn_lm/utils.py
+++ b/v2/generate_sequence_by_rnn_lm/utils.py
--- a/v2/globally_normalized_reader/.gitignore
+++ b/v2/globally_normalized_reader/.gitignore
--- a/v2/globally_normalized_reader/README.cn.md
+++ b/v2/globally_normalized_reader/README.cn.md
--- a/v2/globally_normalized_reader/README.md
+++ b/v2/globally_normalized_reader/README.md
--- a/v2/globally_normalized_reader/basic_modules.py
+++ b/v2/globally_normalized_reader/basic_modules.py
--- a/v2/globally_normalized_reader/beam_decoding.py
+++ b/v2/globally_normalized_reader/beam_decoding.py
--- a/v2/globally_normalized_reader/config.py
+++ b/v2/globally_normalized_reader/config.py
--- a/v2/globally_normalized_reader/data/download.sh
+++ b/v2/globally_normalized_reader/data/download.sh
--- a/v2/globally_normalized_reader/evaluate.py
+++ b/v2/globally_normalized_reader/evaluate.py
--- a/v2/globally_normalized_reader/featurize.py
+++ b/v2/globally_normalized_reader/featurize.py
--- a/v2/globally_normalized_reader/infer.py
+++ b/v2/globally_normalized_reader/infer.py
--- a/v2/globally_normalized_reader/model.py
+++ b/v2/globally_normalized_reader/model.py
--- a/v2/globally_normalized_reader/reader.py
+++ b/v2/globally_normalized_reader/reader.py
--- a/v2/globally_normalized_reader/train.py
+++ b/v2/globally_normalized_reader/train.py
--- a/v2/globally_normalized_reader/vocab.py
+++ b/v2/globally_normalized_reader/vocab.py
--- a/v2/hsigmoid/.gitignore
+++ b/v2/hsigmoid/.gitignore
--- a/v2/hsigmoid/README.md
+++ b/v2/hsigmoid/README.md
--- a/v2/hsigmoid/images/binary_tree.png
+++ b/v2/hsigmoid/images/binary_tree.png
--- a/v2/hsigmoid/images/network_conf.png
+++ b/v2/hsigmoid/images/network_conf.png
--- a/v2/hsigmoid/images/path_to_1.png
+++ b/v2/hsigmoid/images/path_to_1.png
--- a/v2/hsigmoid/infer.py
+++ b/v2/hsigmoid/infer.py
--- a/v2/hsigmoid/network_conf.py
+++ b/v2/hsigmoid/network_conf.py
--- a/v2/hsigmoid/train.py
+++ b/v2/hsigmoid/train.py
--- a/v2/image_classification/README.md
+++ b/v2/image_classification/README.md
--- a/v2/image_classification/alexnet.py
+++ b/v2/image_classification/alexnet.py
--- a/v2/image_classification/caffe2paddle/README.md
+++ b/v2/image_classification/caffe2paddle/README.md
--- a/v2/image_classification/caffe2paddle/caffe2paddle.py
+++ b/v2/image_classification/caffe2paddle/caffe2paddle.py
--- a/v2/image_classification/googlenet.py
+++ b/v2/image_classification/googlenet.py
--- a/v2/image_classification/inception_resnet_v2.py
+++ b/v2/image_classification/inception_resnet_v2.py
--- a/v2/image_classification/inception_v4.py
+++ b/v2/image_classification/inception_v4.py
--- a/v2/image_classification/infer.py
+++ b/v2/image_classification/infer.py
--- a/v2/image_classification/models/model_download.sh
+++ b/v2/image_classification/models/model_download.sh
--- a/v2/image_classification/reader.py
+++ b/v2/image_classification/reader.py
--- a/v2/image_classification/resnet.py
+++ b/v2/image_classification/resnet.py
--- a/v2/image_classification/tf2paddle/README.md
+++ b/v2/image_classification/tf2paddle/README.md
--- a/v2/image_classification/tf2paddle/tf2paddle.py
+++ b/v2/image_classification/tf2paddle/tf2paddle.py
--- a/v2/image_classification/train.py
+++ b/v2/image_classification/train.py
--- a/v2/image_classification/vgg.py
+++ b/v2/image_classification/vgg.py
--- a/v2/image_classification/xception.py
+++ b/v2/image_classification/xception.py
--- a/v2/ltr/README.md
+++ b/v2/ltr/README.md
--- a/v2/ltr/README_en.md
+++ b/v2/ltr/README_en.md
--- a/v2/ltr/images/LambdaRank_EN.png
+++ b/v2/ltr/images/LambdaRank_EN.png
--- a/v2/ltr/images/lambdarank.jpg
+++ b/v2/ltr/images/lambdarank.jpg
--- a/v2/ltr/images/learning_to_rank.jpg
+++ b/v2/ltr/images/learning_to_rank.jpg
--- a/v2/ltr/images/ranknet.jpg
+++ b/v2/ltr/images/ranknet.jpg
--- a/v2/ltr/images/ranknet_en.png
+++ b/v2/ltr/images/ranknet_en.png
--- a/v2/ltr/images/search_engine_example.png
+++ b/v2/ltr/images/search_engine_example.png
--- a/v2/ltr/infer.py
+++ b/v2/ltr/infer.py
--- a/v2/ltr/lambda_rank.py
+++ b/v2/ltr/lambda_rank.py
--- a/v2/ltr/ranknet.py
+++ b/v2/ltr/ranknet.py
--- a/v2/ltr/train.py
+++ b/v2/ltr/train.py
--- a/v2/mt_with_external_memory/README.md
+++ b/v2/mt_with_external_memory/README.md
--- a/v2/mt_with_external_memory/data_utils.py
+++ b/v2/mt_with_external_memory/data_utils.py
--- a/v2/mt_with_external_memory/external_memory.py
+++ b/v2/mt_with_external_memory/external_memory.py
--- a/v2/mt_with_external_memory/image/lstm_c_state.png
+++ b/v2/mt_with_external_memory/image/lstm_c_state.png
--- a/v2/mt_with_external_memory/image/memory_enhanced_decoder.png
+++ b/v2/mt_with_external_memory/image/memory_enhanced_decoder.png
--- a/v2/mt_with_external_memory/image/neural_turing_machine_arch.png
+++ b/v2/mt_with_external_memory/image/neural_turing_machine_arch.png
--- a/v2/mt_with_external_memory/image/turing_machine_cartoon.gif
+++ b/v2/mt_with_external_memory/image/turing_machine_cartoon.gif
--- a/v2/mt_with_external_memory/infer.py
+++ b/v2/mt_with_external_memory/infer.py
--- a/v2/mt_with_external_memory/model.py
+++ b/v2/mt_with_external_memory/model.py
--- a/v2/mt_with_external_memory/train.py
+++ b/v2/mt_with_external_memory/train.py
--- a/v2/nce_cost/.gitignore
+++ b/v2/nce_cost/.gitignore
--- a/v2/nce_cost/README.md
+++ b/v2/nce_cost/README.md
--- a/v2/nce_cost/images/network_conf.png
+++ b/v2/nce_cost/images/network_conf.png
--- a/v2/nce_cost/infer.py
+++ b/v2/nce_cost/infer.py
--- a/v2/nce_cost/network_conf.py
+++ b/v2/nce_cost/network_conf.py
--- a/v2/nce_cost/train.py
+++ b/v2/nce_cost/train.py
--- a/v2/nested_sequence/README.md
+++ b/v2/nested_sequence/README.md
--- a/v2/nested_sequence/README_en.md
+++ b/v2/nested_sequence/README_en.md
--- a/v2/nested_sequence/text_classification/.gitignore
+++ b/v2/nested_sequence/text_classification/.gitignore
--- a/v2/nested_sequence/text_classification/README.md
+++ b/v2/nested_sequence/text_classification/README.md
--- a/v2/nested_sequence/text_classification/README_en.md
+++ b/v2/nested_sequence/text_classification/README_en.md
--- a/v2/nested_sequence/text_classification/config.py
+++ b/v2/nested_sequence/text_classification/config.py
--- a/v2/nested_sequence/text_classification/data/infer.txt
+++ b/v2/nested_sequence/text_classification/data/infer.txt
--- a/v2/nested_sequence/text_classification/data/test_data/test.txt
+++ b/v2/nested_sequence/text_classification/data/test_data/test.txt
--- a/v2/nested_sequence/text_classification/data/train_data/train.txt
+++ b/v2/nested_sequence/text_classification/data/train_data/train.txt
--- a/v2/nested_sequence/text_classification/images/model.jpg
+++ b/v2/nested_sequence/text_classification/images/model.jpg
--- a/v2/nested_sequence/text_classification/infer.py
+++ b/v2/nested_sequence/text_classification/infer.py
--- a/v2/nested_sequence/text_classification/network_conf.py
+++ b/v2/nested_sequence/text_classification/network_conf.py
--- a/v2/nested_sequence/text_classification/reader.py
+++ b/v2/nested_sequence/text_classification/reader.py
--- a/v2/nested_sequence/text_classification/requirements.txt
+++ b/v2/nested_sequence/text_classification/requirements.txt
--- a/v2/nested_sequence/text_classification/train.py
+++ b/v2/nested_sequence/text_classification/train.py
--- a/v2/nested_sequence/text_classification/utils.py
+++ b/v2/nested_sequence/text_classification/utils.py
--- a/v2/neural_qa/.gitignore
+++ b/v2/neural_qa/.gitignore
--- a/v2/neural_qa/README.md
+++ b/v2/neural_qa/README.md
--- a/v2/neural_qa/config.py
+++ b/v2/neural_qa/config.py
--- a/v2/neural_qa/infer.py
+++ b/v2/neural_qa/infer.py
--- a/v2/neural_qa/network.py
+++ b/v2/neural_qa/network.py
--- a/v2/neural_qa/pre-trained-models/download-models.sh
+++ b/v2/neural_qa/pre-trained-models/download-models.sh
--- a/v2/neural_qa/pre-trained-models/neural_seq_qa.pre-trained-models.2017-10-27.tar.gz.md5
+++ b/v2/neural_qa/pre-trained-models/neural_seq_qa.pre-trained-models.2017-10-27.tar.gz.md5
--- a/v2/neural_qa/reader.py
+++ b/v2/neural_qa/reader.py
--- a/v2/neural_qa/test/test_reader.py
+++ b/v2/neural_qa/test/test_reader.py
--- a/v2/neural_qa/test/trn_data.gz
+++ b/v2/neural_qa/test/trn_data.gz
--- a/v2/neural_qa/train.py
+++ b/v2/neural_qa/train.py
--- a/v2/neural_qa/utils.py
+++ b/v2/neural_qa/utils.py
--- a/v2/neural_qa/val_and_test.py
+++ b/v2/neural_qa/val_and_test.py
--- a/v2/nmt_without_attention/README.cn.md
+++ b/v2/nmt_without_attention/README.cn.md
--- a/v2/nmt_without_attention/README.md
+++ b/v2/nmt_without_attention/README.md
--- a/v2/nmt_without_attention/generate.py
+++ b/v2/nmt_without_attention/generate.py
--- a/v2/nmt_without_attention/images/bidirectional-encoder.png
+++ b/v2/nmt_without_attention/images/bidirectional-encoder.png
--- a/v2/nmt_without_attention/images/encoder-decoder.png
+++ b/v2/nmt_without_attention/images/encoder-decoder.png
--- a/v2/nmt_without_attention/images/gru.png
+++ b/v2/nmt_without_attention/images/gru.png
--- a/v2/nmt_without_attention/network_conf.py
+++ b/v2/nmt_without_attention/network_conf.py
--- a/v2/nmt_without_attention/train.py
+++ b/v2/nmt_without_attention/train.py
--- a/v2/scene_text_recognition/README.md
+++ b/v2/scene_text_recognition/README.md
--- a/v2/scene_text_recognition/config.py
+++ b/v2/scene_text_recognition/config.py
--- a/v2/scene_text_recognition/decoder.py
+++ b/v2/scene_text_recognition/decoder.py
--- a/v2/scene_text_recognition/images/503.jpg
+++ b/v2/scene_text_recognition/images/503.jpg
--- a/v2/scene_text_recognition/images/504.jpg
+++ b/v2/scene_text_recognition/images/504.jpg
--- a/v2/scene_text_recognition/images/505.jpg
+++ b/v2/scene_text_recognition/images/505.jpg
--- a/v2/scene_text_recognition/images/ctc.png
+++ b/v2/scene_text_recognition/images/ctc.png
--- a/v2/scene_text_recognition/images/feature_vector.png
+++ b/v2/scene_text_recognition/images/feature_vector.png
--- a/v2/scene_text_recognition/images/transcription.png
+++ b/v2/scene_text_recognition/images/transcription.png
--- a/v2/scene_text_recognition/infer.py
+++ b/v2/scene_text_recognition/infer.py
--- a/v2/scene_text_recognition/network_conf.py
+++ b/v2/scene_text_recognition/network_conf.py
--- a/v2/scene_text_recognition/reader.py
+++ b/v2/scene_text_recognition/reader.py
--- a/v2/scene_text_recognition/requirements.txt
+++ b/v2/scene_text_recognition/requirements.txt
--- a/v2/scene_text_recognition/train.py
+++ b/v2/scene_text_recognition/train.py
--- a/v2/scene_text_recognition/utils.py
+++ b/v2/scene_text_recognition/utils.py
--- a/v2/scheduled_sampling/README.md
+++ b/v2/scheduled_sampling/README.md
--- a/v2/scheduled_sampling/README_en.md
+++ b/v2/scheduled_sampling/README_en.md
--- a/v2/scheduled_sampling/generate.py
+++ b/v2/scheduled_sampling/generate.py
--- a/v2/scheduled_sampling/images/Scheduled_Sampling.jpg
+++ b/v2/scheduled_sampling/images/Scheduled_Sampling.jpg
--- a/v2/scheduled_sampling/images/decay.jpg
+++ b/v2/scheduled_sampling/images/decay.jpg
--- a/v2/scheduled_sampling/network_conf.py
+++ b/v2/scheduled_sampling/network_conf.py
--- a/v2/scheduled_sampling/reader.py
+++ b/v2/scheduled_sampling/reader.py
--- a/v2/scheduled_sampling/train.py
+++ b/v2/scheduled_sampling/train.py
--- a/v2/scheduled_sampling/utils.py
+++ b/v2/scheduled_sampling/utils.py
--- a/v2/sequence_tagging_for_ner/.gitignore
+++ b/v2/sequence_tagging_for_ner/.gitignore
--- a/v2/sequence_tagging_for_ner/README.md
+++ b/v2/sequence_tagging_for_ner/README.md
--- a/v2/sequence_tagging_for_ner/data/download.sh
+++ b/v2/sequence_tagging_for_ner/data/download.sh
--- a/v2/sequence_tagging_for_ner/data/target.txt
+++ b/v2/sequence_tagging_for_ner/data/target.txt
--- a/v2/sequence_tagging_for_ner/data/test
+++ b/v2/sequence_tagging_for_ner/data/test
--- a/v2/sequence_tagging_for_ner/data/train
+++ b/v2/sequence_tagging_for_ner/data/train
--- a/v2/sequence_tagging_for_ner/data/vocab.txt
+++ b/v2/sequence_tagging_for_ner/data/vocab.txt
--- a/v2/sequence_tagging_for_ner/images/BIO tag example.png
+++ b/v2/sequence_tagging_for_ner/images/BIO tag example.png
--- a/v2/sequence_tagging_for_ner/images/ner_label_ins.png
+++ b/v2/sequence_tagging_for_ner/images/ner_label_ins.png
--- a/v2/sequence_tagging_for_ner/images/ner_model_en.png
+++ b/v2/sequence_tagging_for_ner/images/ner_model_en.png
--- a/v2/sequence_tagging_for_ner/images/ner_network.png
+++ b/v2/sequence_tagging_for_ner/images/ner_network.png
--- a/v2/sequence_tagging_for_ner/infer.py
+++ b/v2/sequence_tagging_for_ner/infer.py
--- a/v2/sequence_tagging_for_ner/network_conf.py
+++ b/v2/sequence_tagging_for_ner/network_conf.py
--- a/v2/sequence_tagging_for_ner/reader.py
+++ b/v2/sequence_tagging_for_ner/reader.py
--- a/v2/sequence_tagging_for_ner/train.py
+++ b/v2/sequence_tagging_for_ner/train.py
--- a/v2/sequence_tagging_for_ner/utils.py
+++ b/v2/sequence_tagging_for_ner/utils.py
--- a/v2/ssd/README.cn.md
+++ b/v2/ssd/README.cn.md
--- a/v2/ssd/README.md
+++ b/v2/ssd/README.md
--- a/v2/ssd/config/__init__.py
+++ b/v2/ssd/config/__init__.py
--- a/v2/ssd/config/pascal_voc_conf.py
+++ b/v2/ssd/config/pascal_voc_conf.py
--- a/v2/ssd/data/label_list
+++ b/v2/ssd/data/label_list
--- a/v2/ssd/data/prepare_voc_data.py
+++ b/v2/ssd/data/prepare_voc_data.py
--- a/v2/ssd/data_provider.py
+++ b/v2/ssd/data_provider.py
--- a/v2/ssd/eval.py
+++ b/v2/ssd/eval.py
--- a/v2/ssd/image_util.py
+++ b/v2/ssd/image_util.py
--- a/v2/ssd/images/SSD300x300_map.png
+++ b/v2/ssd/images/SSD300x300_map.png
--- a/v2/ssd/images/ssd_network.png
+++ b/v2/ssd/images/ssd_network.png
--- a/v2/ssd/images/vis_1.jpg
+++ b/v2/ssd/images/vis_1.jpg
--- a/v2/ssd/images/vis_2.jpg
+++ b/v2/ssd/images/vis_2.jpg
--- a/v2/ssd/images/vis_3.jpg
+++ b/v2/ssd/images/vis_3.jpg
--- a/v2/ssd/images/vis_4.jpg
+++ b/v2/ssd/images/vis_4.jpg
--- a/v2/ssd/infer.py
+++ b/v2/ssd/infer.py
--- a/v2/ssd/train.py
+++ b/v2/ssd/train.py
--- a/v2/ssd/vgg_ssd_net.py
+++ b/v2/ssd/vgg_ssd_net.py
--- a/v2/ssd/visual.py
+++ b/v2/ssd/visual.py
--- a/v2/text_classification/.gitignore
+++ b/v2/text_classification/.gitignore
--- a/v2/text_classification/README.md
+++ b/v2/text_classification/README.md
--- a/v2/text_classification/images/cnn_net.png
+++ b/v2/text_classification/images/cnn_net.png
--- a/v2/text_classification/images/dnn_net.png
+++ b/v2/text_classification/images/dnn_net.png
--- a/v2/text_classification/infer.py
+++ b/v2/text_classification/infer.py
--- a/v2/text_classification/network_conf.py
+++ b/v2/text_classification/network_conf.py
--- a/v2/text_classification/reader.py
+++ b/v2/text_classification/reader.py
--- a/v2/text_classification/run.sh
+++ b/v2/text_classification/run.sh
--- a/v2/text_classification/train.py
+++ b/v2/text_classification/train.py
--- a/v2/text_classification/utils.py
+++ b/v2/text_classification/utils.py
--- a/v2/youtube_recall/README.cn.md
+++ b/v2/youtube_recall/README.cn.md
--- a/v2/youtube_recall/README.md
+++ b/v2/youtube_recall/README.md
--- a/v2/youtube_recall/data/data.tar
+++ b/v2/youtube_recall/data/data.tar
--- a/v2/youtube_recall/data_processor.py
+++ b/v2/youtube_recall/data_processor.py
--- a/v2/youtube_recall/images/model_network.png
+++ b/v2/youtube_recall/images/model_network.png
--- a/v2/youtube_recall/images/recommendation_system.png
+++ b/v2/youtube_recall/images/recommendation_system.png
--- a/v2/youtube_recall/infer.py
+++ b/v2/youtube_recall/infer.py
--- a/v2/youtube_recall/infer_user.py
+++ b/v2/youtube_recall/infer_user.py
--- a/v2/youtube_recall/item_vector.py
+++ b/v2/youtube_recall/item_vector.py
--- a/v2/youtube_recall/network_conf.py
+++ b/v2/youtube_recall/network_conf.py
--- a/v2/youtube_recall/reader.py
+++ b/v2/youtube_recall/reader.py
--- a/v2/youtube_recall/train.py
+++ b/v2/youtube_recall/train.py
--- a/v2/youtube_recall/user_vector.py
+++ b/v2/youtube_recall/user_vector.py
--- a/v2/youtube_recall/utils.py
+++ b/v2/youtube_recall/utils.py