Merge pull request #8 from PaddlePaddle/develop

update

baaa6166 · zhengya01 · GitHub · e401522d · e68358de · baaa6166
51 changed files
--- a/README.md
+++ b/README.md
@@ -13,6 +13,57 @@ PaddlePaddle provides a rich set of computation units so that users can adopt a modular

 - [fluid models](fluid): use the PaddlePaddle Fluid APIs; we especially recommend the Fluid models.

+## PaddleCV
+Model|Description|Strengths|Reference paper
+--|:--:|:--:|:--:
+[AlexNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Classic image classification model|First to successfully apply ReLU, Dropout, and LRN in a CNN, with GPU-accelerated computation|[ImageNet Classification with Deep Convolutional Neural Networks](https://www.researchgate.net/publication/267960550_ImageNet_Classification_with_Deep_Convolutional_Neural_Networks)
+[VGG](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Classic image classification model|Builds on AlexNet with small 3*3 convolution kernels and a deeper network; generalizes well|[Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/pdf/1409.1556.pdf)
+[GoogleNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Classic image classification model|Increases network depth and width without increasing the computational load, giving better performance|[Going deeper with convolutions](https://ieeexplore.ieee.org/document/7298594)
+[ResNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Residual network|Introduces a residual structure that solves the drop in accuracy as networks get deeper|[Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
+[Inception-v4](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Classic image classification model|A deeper and wider Inception architecture|[Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning](http://arxiv.org/abs/1602.07261)
+[MobileNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Lightweight network model|An efficient model designed for mobile and embedded devices|[MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications](https://arxiv.org/abs/1704.04861)
+[DPN](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Image classification model|Combines the DenseNet and ResNeXt architectures, improving image classification results|[Dual Path Networks](https://arxiv.org/abs/1707.01629)
+[SE-ResNeXt](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models)|Image classification model|Adds SE blocks to ResNeXt, improving model accuracy|[Squeeze-and-Excitation Networks](https://arxiv.org/abs/1709.01507)
+[SSD](https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/object_detection/README_cn.md)|Single-stage object detector|Detects objects of matching scale on feature maps of different scales; can be plugged into any standard convolutional network|[SSD: Single Shot MultiBox Detector](https://arxiv.org/abs/1512.02325)
+[Face Detector: PyramidBox](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/face_detection/README_cn.md)|SSD-based single-stage face detector|Uses contextual information to detect hard faces; strong representational capacity and robustness|[PyramidBox: A Context-assisted Single Shot Face Detector](https://arxiv.org/pdf/1803.07737.pdf)
+[Faster RCNN](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/faster_rcnn/README_cn.md)|Typical two-stage object detector|Generates region proposals with a convolutional network that shares its convolutional layers with the detection network, yielding fewer, higher-quality proposals|[Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](https://arxiv.org/abs/1506.01497)
+[ICNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/icnet)|Real-time image semantic segmentation model|Considers both speed and accuracy, balancing accuracy on high-resolution images against the efficiency of a low-complexity network|[ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545)
+[DCGAN](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/gan/c_gan)|Image generation model|Deep convolutional generative adversarial network; combines GANs with convolutional networks to address unstable GAN training|[Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/pdf/1511.06434.pdf)
+[ConditionalGAN](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/gan/c_gan)|Image generation model|Conditional generative adversarial network; a GAN with conditional constraints that uses extra information to condition the model and guide data generation|[Conditional Generative Adversarial Nets](https://arxiv.org/abs/1411.1784)
+[CycleGAN](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/gan/cycle_gan)|Image translation model|Automatically converts images of one class into another; can be used for style transfer|[Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593)
+[CRNN-CTC model](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/ocr_recognition)|Scene text recognition model|Uses a CTC model to recognize a single line of English characters in an image|[Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks](https://www.researchgate.net/publication/221346365_Connectionist_temporal_classification_Labelling_unsegmented_sequence_data_with_recurrent_neural_'networks)
+[Attention model](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/ocr_recognition)|Scene text recognition model|Uses attention to recognize a single line of English characters in an image|[Recurrent Models of Visual Attention](https://arxiv.org/abs/1406.6247)
+[Metric Learning](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/metric_learning)|Metric learning model|Analyzes association and comparison relations between objects; applicable to auxiliary classification and clustering, and widely used in image retrieval and face recognition|-
+[TSN](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/video_classification)|Video classification model|Models long-range temporal structure, combining sparse temporal sampling with video-level supervision so that learning over the whole video is effective and efficient|[Temporal Segment Networks: Towards Good Practices for Deep Action Recognition](https://arxiv.org/abs/1608.00859)
+[caffe2fluid](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/caffe2fluid)|Tool that converts Caffe models into Paddle Fluid configuration and model files|-|-
+
+## PaddleNLP
+Model|Description|Strengths|Reference paper
+--|:--:|:--:|:--:
+[Transformer](https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/neural_machine_translation/transformer/README_cn.md)|Machine translation model|Based on self-attention; low computational complexity, high parallelism, learns long-range dependencies easily, and translates better|[Attention Is All You Need](https://arxiv.org/abs/1706.03762)
+[LAC](https://github.com/baidu/lac/blob/master/README.md)|Joint lexical analysis model|Performs Chinese word segmentation, part-of-speech tagging, and named entity recognition as a whole|[Chinese Lexical Analysis with Deep Bi-GRU-CRF Network](https://arxiv.org/abs/1807.01882)
+[Senta](https://github.com/baidu/Senta/blob/master/README.md)|Collection of sentiment analysis models|The sentiment analysis models from the Baidu AI Open Platform|-
+[DAM](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleNLP/deep_attention_matching_net)|Semantic matching model|Work by Baidu's NLP department published at ACL 2018; selects responses in multi-turn dialogue for retrieval-based chatbots|[Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network](http://aclweb.org/anthology/P18-1103)
+[SimNet](https://github.com/baidu/AnyQ/blob/master/tools/simnet/train/paddle/README.md)|Semantic matching framework|Models built with SimNet can be added to the AnyQ system easily, strengthening its semantic matching|-
+[DuReader](https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/machine_reading_comprehension/README.md)|Reading comprehension model|Machine reading comprehension model on Baidu's MRC dataset|-
+[Bi-GRU-CRF](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleNLP/sequence_tagging_for_ner/README.md)|Named entity recognition|Named entity recognition model that combines a CRF with a bidirectional GRU|-
+
+## PaddleRec
+Model|Description|Strengths|Reference paper
+--|:--:|:--:|:--:
+[TagSpace](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/tagspace)|Model for learning embedding representations of text and tags|Used for industrial-scale tag recommendation, for example tag recommendation for feed news|[#TagSpace: Semantic embeddings from hashtags](https://www.bibsonomy.org/bibtex/0ed4314916f8e7c90d066db45c293462)
+[GRU4Rec](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/gru4rec)|Personalized recommendation model|First to apply an RNN (GRU) to session-based recommendation; clearly outperforms traditional KNN and matrix factorization|[Session-based Recommendations with Recurrent Neural Networks](https://arxiv.org/abs/1511.06939)
+[SSR](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/ssr)|Sequential semantic retrieval recommendation model|Follows the idea of the reference paper, predicting user behavior at multiple time granularities|[Multi-Rate Deep Learning for Temporal Recommendation](https://dl.acm.org/citation.cfm?id=2914726)
+[DeepCTR](https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleRec/ctr/README.cn.md)|Click-through rate prediction model|Implements only the DNN part of the model described in the DeepFM paper; DeepFM will be provided in other examples|[DeepFM: A Factorization-Machine based Neural Network for CTR Prediction](https://arxiv.org/abs/1703.04247)
+[Multiview-Simnet](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/multiview_simnet)|Personalized recommendation model|Multi-view approach that merges multiple feature views of users and items into one unified model|[A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems](http://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/frp1159-songA.pdf)
+
+## Other Models
+Model|Description|Strengths|Reference paper
+--|:--:|:--:|:--:
+[DeepASR](https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepASR/README_cn.md)|Speech recognition system|Uses the Fluid framework to configure and train the acoustic model for speech recognition, and integrates the Kaldi decoder|-
+[DQN](https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepQNetwork/README_cn.md)|Deep Q-network|Value-based reinforcement learning algorithm; the first model to successfully combine deep learning with reinforcement learning|[Human-level control through deep reinforcement learning](https://www.nature.com/articles/nature14236)
+[DoubleDQN](https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepQNetwork/README_cn.md)|DQN variant|Applies the Double Q idea to DQN to address the over-optimization (overestimation) problem|[Deep Reinforcement Learning with Double Q-Learning](https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12389)
+[DuelingDQN](https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepQNetwork/README_cn.md)|DQN variant|Improves on the DQN model, raising its performance|[Dueling Network Architectures for Deep Reinforcement Learning](http://proceedings.mlr.press/v48/wangf16.html)

 ## License
 This tutorial is contributed by [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) and licensed under the [Apache-2.0 license](LICENSE).

--- a/fluid/PaddleCV/caffe2fluid/kaffe/paddle/network.py
+++ b/fluid/PaddleCV/caffe2fluid/kaffe/paddle/network.py
@@ -440,7 +440,8 @@ class Network(object):

        if need_transpose:
            order = range(dims)
-            order.remove(axis).append(axis)
+            order.remove(axis)
+            order.append(axis)
            input = fluid.layers.transpose(
                input,
                perm=order,
@@ -525,11 +526,21 @@ class Network(object):
        scale_shape = input.shape[axis:axis + num_axes]
        param_attr = fluid.ParamAttr(name=prefix + 'scale')
        scale_param = fluid.layers.create_parameter(
-            shape=scale_shape, dtype=input.dtype, name=name, attr=param_attr)
+            shape=scale_shape,
+            dtype=input.dtype,
+            name=name,
+            attr=param_attr,
+            is_bias=True,
+            default_initializer=fluid.initializer.Constant(value=1.0))

        offset_attr = fluid.ParamAttr(name=prefix + 'offset')
        offset_param = fluid.layers.create_parameter(
-            shape=scale_shape, dtype=input.dtype, name=name, attr=offset_attr)
+            shape=scale_shape,
+            dtype=input.dtype,
+            name=name,
+            attr=offset_attr,
+            is_bias=True,
+            default_initializer=fluid.initializer.Constant(value=0.0))

        output = fluid.layers.elementwise_mul(
            input,
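
The first hunk in `network.py` splits a chained call because `list.remove` mutates the list in place and returns `None`, so `order.remove(axis).append(axis)` fails with `AttributeError`. A minimal sketch of the corrected permutation logic follows. The helper name `move_axis_to_end` is hypothetical, and `list(range(dims))` is an assumption for Python 3, where a `range` object has no `remove` method.

```python
def move_axis_to_end(dims, axis):
    """Return a transpose permutation that moves `axis` to the last position."""
    order = list(range(dims))  # a real list, so remove/append also work on Python 3
    order.remove(axis)         # mutates in place and returns None
    order.append(axis)         # hence a separate statement, not a chained call
    return order


perm = move_axis_to_end(4, 1)
```

The resulting list can be passed as the `perm` argument of a transpose, as the hunk does with `fluid.layers.transpose`.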

--- a/fluid/PaddleCV/deeplabv3+/README.md
+++ b/fluid/PaddleCV/deeplabv3+/README.md
@@ -76,7 +76,7 @@ python ./train.py \
    --train_crop_size=769 \
    --total_step=90000 \
    --init_weights_path=deeplabv3plus_xception65_initialize.params \
-    --save_weights_path=output \
+    --save_weights_path=output/ \
    --dataset_path=$DATASET_PATH
 ```


--- a/fluid/PaddleCV/face_detection/README_cn.md
+++ b/fluid/PaddleCV/face_detection/README_cn.md
@@ -99,7 +99,7 @@ python -u train.py --batch_size=16 --pretrained_model=vgg_ilsvrc_16_fc_reduced

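 Data augmentation used for model training:

-**Data augmentation**: Data reading is defined in `reader.py`, and all images are resized to 640x640. During training the images are also augmented with random perturbation, flipping, cropping, and so on, similar to the augmentation in the [SSD object detection algorithm](https://github.com/PaddlePaddle/models/blob/develop/fluid/object_detection/README_cn.md#%E8%AE%AD%E7%BB%83-pascal-voc-%E6%95%B0%E6%8D%AE%E9%9B%86); in addition, the Data-anchor-sampling mentioned above is applied:
+**Data augmentation**: Data reading is defined in `reader.py`, and all images are resized to 640x640. During training the images are also augmented with random perturbation, flipping, cropping, and so on, similar to the augmentation in the [SSD object detection algorithm](https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/object_detection/README.md); in addition, the Data-anchor-sampling mentioned above is applied:

  **Scale transformation (Data-anchor-sampling)**: randomly rescales the image to a scale within a certain range, which greatly increases the scale variation of faces. Concretely, from the height and width of a randomly selected face, compute $v=\sqrt{width * height}$ and determine which interval of the scales $[16, 32, 64, 128, 256, 512]$ contains $v$. Suppose $v=45$; then $32<v<64$ is selected, and one value from $[16, 32, 64]$ is chosen with uniform probability. If $64$ is chosen, the face's target scale is then drawn from $[64 / 2, min(v * 2, 64 * 2)]$.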
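
The Data-anchor-sampling description above can be sketched in a few lines of Python. This is an illustration of the text only, not the code in `reader.py`: the function names are hypothetical, the handling of a face whose size falls exactly on an anchor scale or outside the listed range is an assumption, and the final draw is assumed to be uniform.

```python
import math
import random

ANCHOR_SCALES = [16, 32, 64, 128, 256, 512]


def candidate_scales(v):
    """Anchor scales up to and including the upper bound of v's interval."""
    for i, scale in enumerate(ANCHOR_SCALES):
        if v <= scale:
            return ANCHOR_SCALES[:i + 1]
    return list(ANCHOR_SCALES)  # v is larger than every anchor scale


def sample_target_face_size(width, height, rng=random):
    """Draw a target face size following the description above."""
    v = math.sqrt(width * height)
    chosen = rng.choice(candidate_scales(v))     # uniform over the candidates
    low = chosen / 2.0
    high = max(low, min(v * 2.0, chosen * 2.0))  # guard for very small faces
    return rng.uniform(low, high)


rng = random.Random(0)  # seeded only to make the illustration repeatable
sizes = [sample_target_face_size(45, 45, rng) for _ in range(1000)]
```

For a 45x45 face the candidates are `[16, 32, 64]`, so every sampled size lies between 8 and 90; the image would then be rescaled by the ratio of the sampled size to `v`.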


--- a/fluid/PaddleCV/faster_rcnn/README.md
+++ b/fluid/PaddleCV/faster_rcnn/README.md
@@ -38,18 +38,6 @@ Train the model on [MS-COCO dataset](http://cocodataset.org/#download), download

 ## Training

-After data preparation, one can start the training step by:
-
-    python train.py \
-       --model_save_dir=output/ \
-       --pretrained_model=${path_to_pretrain_model}
-       --data_dir=${path_to_data}
-
- Set ```export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7``` to specify 8 GPUs for training.
- For more help on arguments:
-
-    python train.py --help
-
 **Download the pre-trained model:** This sample provides a ResNet-50 pre-trained model converted from Caffe, with the parameters of the batch normalization layers fused. Download the pre-trained model with:

    sh ./pretrained/download.sh
@@ -72,6 +60,18 @@ To train the model, [cocoapi](https://github.com/cocodataset/cocoapi) is needed.
    # not to install the COCO API into global site-packages
    python2 setup.py install --user

+After data preparation, one can start the training step by:
+
+    python train.py \
+       --model_save_dir=output/ \
+       --pretrained_model=${path_to_pretrain_model}
+       --data_dir=${path_to_data}
+
+- Set ```export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7``` to specify 8 GPUs for training.
+- For more help on arguments:
+
+    python train.py --help
+
 **data reader introduction:**

 * Data reader is defined in `reader.py`.
@@ -128,7 +128,7 @@ Inference is used to get prediction score or image features based on trained mod
    python infer.py \
       --dataset=coco2017 \
        --pretrained_model=${path_to_pretrain_model}  \
-        --image_path=data/COCO17/val2017/  \
+        --image_path=dataset/coco/val2017/  \
        --image_name=000000000139.jpg \
        --draw_threshold=0.6


--- a/fluid/PaddleCV/faster_rcnn/README_cn.md
+++ b/fluid/PaddleCV/faster_rcnn/README_cn.md
@@ -37,18 +37,6 @@ Faster RCNN object detection model

 ## Training

-After data preparation, training can be started as follows:
-
-    python train.py \
-       --model_save_dir=output/ \
-       --pretrained_model=${path_to_pretrain_model}
-       --data_dir=${path_to_data}
-
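- Set export CUDA\_VISIBLE\_DEVICES=0,1,2,3,4,5,6,7 to train on 8 GPUs.
- For the optional arguments, see:
-
-    python train.py --help
-
 **Download the pre-trained model:** This example provides a ResNet-50 pre-trained model converted from Caffe, with the parameters of the batch normalization layers fused. Download the pre-trained model with: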

    sh ./pretrained/download.sh
@@ -71,6 +59,18 @@ Faster RCNN object detection model
    # not to install the COCO API into global site-packages
    python2 setup.py install --user

+After data preparation, training can be started as follows:
+
+    python train.py \
+       --model_save_dir=output/ \
+       --pretrained_model=${path_to_pretrain_model}
+       --data_dir=${path_to_data}
+
+- Set export CUDA\_VISIBLE\_DEVICES=0,1,2,3,4,5,6,7 to train on 8 GPUs.
+- For the optional arguments, see:
+
+    python train.py --help
+
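 **Data reader:** The data reader is defined in reader.py. Every image is resized proportionally so that its short side equals `scales`; if the long side then exceeds `max_size`, the image is resized proportionally again so that the long side equals `max_size`. During training, images are flipped horizontally. Images within the same batch can be padded to a common size.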
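
The resizing rule in the data reader paragraph above can be sketched as a single scale computation. This is a minimal illustration of the description, not the code in `reader.py`; the function name `compute_im_scale` and the example values 800 and 1333 are assumptions chosen for the demonstration.

```python
def compute_im_scale(height, width, target_size, max_size):
    """Scale factor that brings the short side to target_size, capped by max_size."""
    short_side = min(height, width)
    long_side = max(height, width)
    im_scale = float(target_size) / short_side
    # If the long side would overshoot max_size, rescale by the long side instead.
    if im_scale * long_side > max_size:
        im_scale = float(max_size) / long_side
    return im_scale


# Short side reaches 800 without the long side exceeding 1333.
scale_a = compute_im_scale(480, 640, target_size=800, max_size=1333)
# A very wide image: the long side is capped at 1333 instead.
scale_b = compute_im_scale(500, 1500, target_size=800, max_size=1333)
```

The aspect ratio is preserved in both cases because a single factor is applied to height and width.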

 **Model configuration:**
@@ -124,7 +124,7 @@ Faster RCNN object detection model
    python infer.py \
       --dataset=coco2017 \
        --pretrained_model=${path_to_pretrain_model}  \
-        --image_path=data/COCO17/val2017/  \
+        --image_path=dataset/coco/val2017/  \
        --image_name=000000000139.jpg \
        --draw_threshold=0.6


--- a/fluid/PaddleCV/faster_rcnn/data_utils.py
+++ b/fluid/PaddleCV/faster_rcnn/data_utils.py
@@ -28,6 +28,7 @@ from __future__ import unicode_literals
 import cv2
 import numpy as np
 from config import cfg
+import os


 def get_image_blob(roidb, mode):
@@ -43,8 +44,11 @@ def get_image_blob(roidb, mode):
        target_size = cfg.TEST.scales[0]
        max_size = cfg.TEST.max_size
    im = cv2.imread(roidb['image'])
-    assert im is not None, \
-        'Failed to read image \'{}\''.format(roidb['image'])
+    try:
+        assert im is not None
+    except AssertionError as e:
+        print('Failed to read image \'{}\''.format(roidb['image']))
+        os._exit(0)
    if roidb['flipped']:
        im = im[:, ::-1, :]
    im, im_scale = prep_im_for_blob(im, cfg.pixel_means, target_size, max_size)
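
The hunk above replaces the `assert` with a printed message followed by `os._exit(0)`, which ends the process immediately with exit status 0 and skips normal interpreter cleanup. If a caller needs to tell a failed read apart from a successful run, one alternative is to raise an exception. The sketch below illustrates that alternative and is not the PR's code; `ensure_image_loaded` is a hypothetical helper, and `None` stands in for what `cv2.imread` returns when it cannot read a file.

```python
def ensure_image_loaded(im, path):
    """Raise IOError if an image failed to load (im is None)."""
    if im is None:
        # Unlike `assert`, this check still runs under `python -O`.
        raise IOError("Failed to read image '{}'".format(path))
    return im


try:
    ensure_image_loaded(None, 'missing.jpg')
    message = None
except IOError as e:
    message = str(e)
```

A training script can catch the exception at the top level and exit with a non-zero status so that wrappers and CI jobs notice the failure.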

--- a/fluid/PaddleCV/faster_rcnn/utility.py
+++ b/fluid/PaddleCV/faster_rcnn/utility.py
@@ -98,7 +98,7 @@ def parse_args():
    add_arg('pretrained_model', str,    'imagenet_resnet50_fusebn', "The init model path.")
    add_arg('dataset',          str,   'coco2017',  "coco2014, coco2017.")
    add_arg('class_num',        int,   81,          "Class number.")
-    add_arg('data_dir',         str,   'data/COCO17',        "The data root path.")
+    add_arg('data_dir',         str,   'dataset/coco',        "The data root path.")
    add_arg('use_pyreader',     bool,   True,           "Use pyreader.")
    add_arg('use_profile',         bool,   False,       "Whether use profiler.")
    add_arg('padding_minibatch',bool,   False,
@@ -127,7 +127,7 @@ def parse_args():
    add_arg('debug',            bool,   False,   "Debug mode")
    # SINGLE EVAL AND DRAW
    add_arg('draw_threshold',  float, 0.8,    "Confidence threshold to draw bbox.")
-    add_arg('image_path',       str,   'data/COCO17/val2017',  "The image path used to inference and visualize.")
+    add_arg('image_path',       str,   'dataset/coco/val2017',  "The image path used to inference and visualize.")
    add_arg('image_name',        str,    '',       "The single image used to inference and visualize.")
    # ce
    parser.add_argument(

--- a/fluid/PaddleCV/gan/c_gan/.run_ce.sh
+++ b/fluid/PaddleCV/gan/c_gan/.run_ce.sh
@@ -3,7 +3,7 @@
 # This file is only used for continuous evaluation.
 export FLAGS_cudnn_deterministic=True
 export ce_mode=1
-(CUDA_VISIBLE_DEVICES=6 python c_gan.py --batch_size=121 --epoch=1 --run_ce=True --use_gpu=True & \
-CUDA_VISIBLE_DEVICES=7 python dc_gan.py --batch_size=121 --epoch=1 --run_ce=True --use_gpu=True) | python _ce.py
+(CUDA_VISIBLE_DEVICES=2 python c_gan.py --batch_size=121 --epoch=1 --run_ce=True --use_gpu=True & \
+CUDA_VISIBLE_DEVICES=3 python dc_gan.py --batch_size=121 --epoch=1 --run_ce=True --use_gpu=True) | python _ce.py


--- a/fluid/PaddleCV/image_classification/README.md
+++ b/fluid/PaddleCV/image_classification/README.md
@@ -6,6 +6,7 @@ Image classification, which is an important field of computer vision, is to clas
 - [Installation](#installation)
 - [Data preparation](#data-preparation)
 - [Training a model with flexible parameters](#training-a-model)
+- [Using Mixed-Precision Training](#using-mixed-precision-training)
 - [Finetuning](#finetuning)
 - [Evaluation](#evaluation)
 - [Inference](#inference)
@@ -112,6 +113,13 @@ The error rate curves of AlexNet, ResNet50 and SE-ResNeXt-50 are shown in the fi
 Training and validation Curves
 </p>

+
+## Using Mixed-Precision Training
+
+You may add `--fp16 1` to enable mixed-precision training: the training process uses float16, while the output model (the "master" parameters) is saved as float32. You may also need to pass `--scale_loss` to overcome accuracy issues; `--scale_loss 8.0` usually suffices.
+
+Note that `--fp16` currently cannot be used together with `--with_mem_opt`, so pass `--with_mem_opt 0` to disable the memory optimization pass.
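
To see why a loss scale can help in float16, the sketch below rounds a tiny gradient value to half precision with and without scaling. It uses only the standard library (`struct` format `'e'`, available from Python 3.6) and does not call Paddle; the gradient value `2e-8` and the helper names are assumptions for illustration, and how this PR unscales gradients is not shown in the visible part of the diff.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]


def fp16_grad(true_grad, scale_loss=1.0):
    """Gradient as seen after a float16 round trip, then unscaled in full precision."""
    scaled = to_fp16(true_grad * scale_loss)  # scaling the loss scales the gradient
    return scaled / scale_loss                # unscale before the parameter update


tiny_grad = 2e-8                               # below half the smallest fp16 subnormal
unscaled = fp16_grad(tiny_grad)                # underflows to zero
scaled = fp16_grad(tiny_grad, scale_loss=8.0)  # survives the round trip
```

Without scaling the value is flushed to zero; with a scale of 8.0 it stays non-zero, at the cost of some rounding error.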
+
 ## Finetuning

 Finetuning is to finetune model weights in a specific task by loading pretrained weights. After initializing ```path_to_pretrain_model```, one can finetune a model as:

--- a/fluid/PaddleCV/image_classification/README_cn.md
+++ b/fluid/PaddleCV/image_classification/README_cn.md
@@ -109,6 +109,11 @@ End pass 9, train_loss 3.3745200634, train_acc1 0.303871691227, train_acc5 0.545
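 Error rate curves on the training and validation sets
 </p>

+## Mixed-Precision Training
+
+Pass `--fp16 1` to enable mixed-precision training; training then uses float16 data and outputs float32 model parameters (the "master" parameters). You may also need to pass `--scale_loss` to address the accuracy issues of fp16 training; `--scale_loss 8.0` usually suffices.
+
+Note that mixed-precision training currently cannot be used together with memory optimization, so pass `--with_mem_opt 0` to disable memory optimization.

 ## Finetuning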


--- a/fluid/PaddleCV/image_classification/eval.py
+++ b/fluid/PaddleCV/image_classification/eval.py
@@ -49,7 +49,7 @@ def eval(args):
    # model definition
    model = models.__dict__[model_name]()

-    if model_name is "GoogleNet":
+    if model_name == "GoogleNet":
        out0, out1, out2 = model.net(input=image, class_dim=class_dim)
        cost0 = fluid.layers.cross_entropy(input=out0, label=label)
        cost1 = fluid.layers.cross_entropy(input=out1, label=label)
@@ -71,8 +71,10 @@ def eval(args):

    test_program = fluid.default_main_program().clone(for_test=True)

+    fetch_list = [avg_cost.name, acc_top1.name, acc_top5.name]
    if with_memory_optimization:
-        fluid.memory_optimize(fluid.default_main_program())
+        fluid.memory_optimize(
+            fluid.default_main_program(), skip_opt_set=set(fetch_list))

    place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
@@ -88,8 +90,6 @@ def eval(args):
    val_reader = paddle.batch(reader.val(), batch_size=args.batch_size)
    feeder = fluid.DataFeeder(place=place, feed_list=[image, label])

-    fetch_list = [avg_cost.name, acc_top1.name, acc_top5.name]
-
    test_info = [[], [], []]
    cnt = 0
    for batch_id, data in enumerate(val_reader()):
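
The change from `is` to `==` in `eval.py` matters because `is` compares object identity, while `==` compares values. A string that arrives from the command line or is built at run time can equal `"GoogleNet"` without being the same object. A minimal sketch, with a hypothetical helper name:

```python
def uses_googlenet_branch(model_name):
    """Value comparison: true for any string equal to "GoogleNet"."""
    return model_name == "GoogleNet"


expected = "GoogleNet"
built = "".join(["Google", "Net"])  # equal value, constructed at run time
same_value = built == expected      # compares contents
same_object = built is expected     # compares identity; implementation-dependent
```

Whether `same_object` is true depends on interpreter details such as string interning, which is why identity checks are unreliable for selecting a model by name.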

--- a/fluid/PaddleCV/image_classification/infer.py
+++ b/fluid/PaddleCV/image_classification/infer.py
@@ -11,7 +11,6 @@ import models
 import reader
 import argparse
 import functools
-from models.learning_rate import cosine_decay
 from utility import add_arguments, print_arguments
 import math

@@ -44,7 +43,6 @@ def infer(args):

    # model definition
    model = models.__dict__[model_name]()
-
    if model_name is "GoogleNet":
        out, _, _ = model.net(input=image, class_dim=class_dim)
    else:
@@ -52,8 +50,10 @@ def infer(args):

    test_program = fluid.default_main_program().clone(for_test=True)

+    fetch_list = [out.name]
    if with_memory_optimization:
-        fluid.memory_optimize(fluid.default_main_program())
+        fluid.memory_optimize(
+            fluid.default_main_program(), skip_opt_set=set(fetch_list))

    place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
@@ -70,8 +70,6 @@ def infer(args):
    test_reader = paddle.batch(reader.test(), batch_size=test_batch_size)
    feeder = fluid.DataFeeder(place=place, feed_list=[image])

-    fetch_list = [out.name]
-
    TOPK = 1
    for batch_id, data in enumerate(test_reader()):
        result = exe.run(test_program,

--- a/fluid/PaddleCV/image_classification/models/alexnet.py
+++ b/fluid/PaddleCV/image_classification/models/alexnet.py
@@ -142,7 +142,6 @@ class AlexNet():
        out = fluid.layers.fc(
            input=fc7,
            size=class_dim,
-            act='softmax',
            bias_attr=fluid.param_attr.ParamAttr(
                initializer=fluid.initializer.Uniform(-stdv, stdv)),
            param_attr=fluid.param_attr.ParamAttr(

--- a/fluid/PaddleCV/image_classification/models/dpn.py
+++ b/fluid/PaddleCV/image_classification/models/dpn.py
@@ -94,7 +94,6 @@ class DPN(object):
            initializer=fluid.initializer.Uniform(-stdv, stdv))
        fc6 = fluid.layers.fc(input=pool5,
                              size=class_dim,
-                              act='softmax',
                              param_attr=param_attr)

        return fc6

--- a/fluid/PaddleCV/image_classification/models/inception_v4.py
+++ b/fluid/PaddleCV/image_classification/models/inception_v4.py
@@ -47,7 +47,6 @@ class InceptionV4():
        out = fluid.layers.fc(
            input=drop,
            size=class_dim,
-            act='softmax',
            param_attr=fluid.param_attr.ParamAttr(
                initializer=fluid.initializer.Uniform(-stdv, stdv)))
        return out

--- a/fluid/PaddleCV/image_classification/models/mobilenet.py
+++ b/fluid/PaddleCV/image_classification/models/mobilenet.py
@@ -120,7 +120,6 @@ class MobileNet():

        output = fluid.layers.fc(input=input,
                                 size=class_dim,
-                                 act='softmax',
                                 param_attr=ParamAttr(initializer=MSRA()))
        return output


--- a/fluid/PaddleCV/image_classification/models/mobilenet_v2.py
+++ b/fluid/PaddleCV/image_classification/models/mobilenet_v2.py
@@ -73,7 +73,6 @@ class MobileNetV2():

        output = fluid.layers.fc(input=input,
                                 size=class_dim,
-                                 act='softmax',
                                 param_attr=ParamAttr(initializer=MSRA()))
        return output


--- a/fluid/PaddleCV/image_classification/models/resnet.py
+++ b/fluid/PaddleCV/image_classification/models/resnet.py
@@ -60,7 +60,6 @@ class ResNet():
        stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
        out = fluid.layers.fc(input=pool,
                              size=class_dim,
-                              act='softmax',
                              param_attr=fluid.param_attr.ParamAttr(
                                  initializer=fluid.initializer.Uniform(-stdv,
                                                                        stdv)))

--- a/fluid/PaddleCV/image_classification/models/resnet_dist.py
+++ b/fluid/PaddleCV/image_classification/models/resnet_dist.py
@@ -62,7 +62,6 @@ class DistResNet():
        stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
        out = fluid.layers.fc(input=pool,
                              size=class_dim,
-                              act='softmax',
                              param_attr=fluid.param_attr.ParamAttr(
                                  initializer=fluid.initializer.Uniform(-stdv,
                                                                        stdv),

--- a/fluid/PaddleCV/image_classification/models/se_resnext.py
+++ b/fluid/PaddleCV/image_classification/models/se_resnext.py
@@ -110,7 +110,6 @@ class SE_ResNeXt():
        stdv = 1.0 / math.sqrt(drop.shape[1] * 1.0)
        out = fluid.layers.fc(input=drop,
                              size=class_dim,
-                              act='softmax',
                              param_attr=fluid.param_attr.ParamAttr(
                                  initializer=fluid.initializer.Uniform(-stdv,
                                                                        stdv)))

--- a/fluid/PaddleCV/image_classification/models/shufflenet_v2.py
+++ b/fluid/PaddleCV/image_classification/models/shufflenet_v2.py
@@ -93,7 +93,6 @@ class ShuffleNetV2():

        output = fluid.layers.fc(input=pool_last,
                                 size=class_dim,
-                                 act='softmax',
                                 param_attr=ParamAttr(initializer=MSRA()))
        return output


--- a/fluid/PaddleCV/image_classification/models/vgg.py
+++ b/fluid/PaddleCV/image_classification/models/vgg.py
@@ -64,7 +64,6 @@ class VGGNet():
        out = fluid.layers.fc(
            input=fc2,
            size=class_dim,
-            act='softmax',
            param_attr=fluid.param_attr.ParamAttr(
                initializer=fluid.initializer.Normal(scale=0.005)),
            bias_attr=fluid.param_attr.ParamAttr(

--- a/fluid/PaddleCV/image_classification/models_name/alexnet.py
+++ b/fluid/PaddleCV/image_classification/models_name/alexnet.py
@@ -159,7 +159,6 @@ class AlexNet():
        out = fluid.layers.fc(
            input=fc7,
            size=class_dim,
-            act='softmax',
            bias_attr=fluid.param_attr.ParamAttr(
                initializer=fluid.initializer.Uniform(-stdv, stdv),
                name=layer_name[7] + "_offset"),

--- a/fluid/PaddleCV/image_classification/models_name/dpn.py
+++ b/fluid/PaddleCV/image_classification/models_name/dpn.py
@@ -122,7 +122,6 @@ class DPN(object):
            initializer=fluid.initializer.Uniform(-stdv, stdv))
        fc6 = fluid.layers.fc(input=pool5,
                              size=class_dim,
-                              act='softmax',
                              param_attr=param_attr,
                              name="fc6")


--- a/fluid/PaddleCV/image_classification/models_name/inception_v4.py
+++ b/fluid/PaddleCV/image_classification/models_name/inception_v4.py
@@ -48,7 +48,6 @@ class InceptionV4():
        out = fluid.layers.fc(
            input=drop,
            size=class_dim,
-            act='softmax',
            param_attr=ParamAttr(
                initializer=fluid.initializer.Uniform(-stdv, stdv),
                name="final_fc_weights"),

--- a/fluid/PaddleCV/image_classification/models_name/mobilenet.py
+++ b/fluid/PaddleCV/image_classification/models_name/mobilenet.py
@@ -130,7 +130,6 @@ class MobileNet():

        output = fluid.layers.fc(input=input,
                                 size=class_dim,
-                                 act='softmax',
                                 param_attr=ParamAttr(
                                     initializer=MSRA(), name="fc7_weights"),
                                 bias_attr=ParamAttr(name="fc7_offset"))

--- a/fluid/PaddleCV/image_classification/models_name/mobilenet_v2.py
+++ b/fluid/PaddleCV/image_classification/models_name/mobilenet_v2.py
@@ -80,7 +80,6 @@ class MobileNetV2():

        output = fluid.layers.fc(input=input,
                                 size=class_dim,
-                                 act='softmax',
                                 param_attr=ParamAttr(name='fc10_weights'),
                                 bias_attr=ParamAttr(name='fc10_offset'))
        return output

--- a/fluid/PaddleCV/image_classification/models_name/resnet.py
+++ b/fluid/PaddleCV/image_classification/models_name/resnet.py
@@ -74,7 +74,6 @@ class ResNet():
        stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
        out = fluid.layers.fc(input=pool,
                              size=class_dim,
-                              act='softmax',
                              param_attr=fluid.param_attr.ParamAttr(
                                  initializer=fluid.initializer.Uniform(-stdv,
                                                                        stdv)))

--- a/fluid/PaddleCV/image_classification/models_name/se_resnext.py
+++ b/fluid/PaddleCV/image_classification/models_name/se_resnext.py
@@ -123,7 +123,6 @@ class SE_ResNeXt():
        out = fluid.layers.fc(
            input=drop,
            size=class_dim,
-            act='softmax',
            param_attr=ParamAttr(
                initializer=fluid.initializer.Uniform(-stdv, stdv),
                name='fc6_weights'),

--- a/fluid/PaddleCV/image_classification/models_name/shufflenet_v2.py
+++ b/fluid/PaddleCV/image_classification/models_name/shufflenet_v2.py
@@ -97,7 +97,6 @@ class ShuffleNetV2():

        output = fluid.layers.fc(input=pool_last,
                                 size=class_dim,
-                                 act='softmax',
                                 param_attr=ParamAttr(
                                     initializer=MSRA(), name='fc6_weights'),
                                 bias_attr=ParamAttr(name='fc6_offset'))

--- a/fluid/PaddleCV/image_classification/models_name/vgg.py
+++ b/fluid/PaddleCV/image_classification/models_name/vgg.py
@@ -61,7 +61,6 @@ class VGGNet():
        out = fluid.layers.fc(
            input=fc2,
            size=class_dim,
-            act='softmax',
            param_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_weights"),
            bias_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_offset"))


--- a/fluid/PaddleCV/image_classification/train.py
+++ b/fluid/PaddleCV/image_classification/train.py
@@ -17,6 +17,7 @@ import functools
 import subprocess
 import utils
 from utils.learning_rate import cosine_decay
+from utils.fp16_utils import create_master_params_grads, master_param_to_train_param
 from utility import add_arguments, print_arguments
 import models
 import models_name
@@ -40,7 +41,9 @@ add_arg('model',            str,   "SE_ResNeXt50_32x4d", "Set the network to use
 add_arg('enable_ce',        bool,  False,                "If set True, enable continuous evaluation job.")
 add_arg('data_dir',         str,   "./data/ILSVRC2012",  "The ImageNet dataset root dir.")
 add_arg('model_category',   str,   "models",             "Whether to use models_name or not, valid value:'models','models_name'" )
-# yapf: enabl
+add_arg('fp16',             bool,  False,                "Enable half precision training with fp16." )
+add_arg('scale_loss',       float, 1.0,                  "Scale loss for fp16." )
+# yapf: enable


 def set_models(model):
@@ -145,12 +148,15 @@ def net_config(image, label, model, args):
        acc_top1 = fluid.layers.accuracy(input=out0, label=label, k=1)
        acc_top5 = fluid.layers.accuracy(input=out0, label=label, k=5)
    else:
-        out = model.net(input=image, class_dim=class_dim)
-        cost = fluid.layers.cross_entropy(input=out, label=label)
+        out = model.net(input=image, class_dim=class_dim)
+        cost, pred = fluid.layers.softmax_with_cross_entropy(out, label, return_softmax=True)
+        if args.scale_loss > 1:
+            avg_cost = fluid.layers.mean(x=cost) * float(args.scale_loss)
+        else:
+            avg_cost = fluid.layers.mean(x=cost)

-        avg_cost = fluid.layers.mean(x=cost)
-        acc_top1 = fluid.layers.accuracy(input=out, label=label, k=1)
-        acc_top5 = fluid.layers.accuracy(input=out, label=label, k=5)
+        acc_top1 = fluid.layers.accuracy(input=pred, label=label, k=1)
+        acc_top5 = fluid.layers.accuracy(input=pred, label=label, k=5)

    return avg_cost, acc_top1, acc_top5

@@ -171,6 +177,8 @@ def build_program(is_train, main_prog, startup_prog, args):
            use_double_buffer=True)
        with fluid.unique_name.guard():
            image, label = fluid.layers.read_file(py_reader)
+            if args.fp16:
+                image = fluid.layers.cast(image, "float16")
            avg_cost, acc_top1, acc_top5 = net_config(image, label, model, args)
            avg_cost.persistable = True
            acc_top1.persistable = True
@@ -184,7 +192,15 @@ def build_program(is_train, main_prog, startup_prog, args):
                params["learning_strategy"]["name"] = args.lr_strategy

                optimizer = optimizer_setting(params)
-                optimizer.minimize(avg_cost)
+
+                if args.fp16:
+                    params_grads = optimizer.backward(avg_cost)
+                    master_params_grads = create_master_params_grads(
+                        params_grads, main_prog, startup_prog, args.scale_loss)
+                    optimizer.apply_gradients(master_params_grads)
+                    master_param_to_train_param(master_params_grads, params_grads, main_prog)
+                else:
+                    optimizer.minimize(avg_cost)

    return py_reader, avg_cost, acc_top1, acc_top5
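
The hunks above drop `act='softmax'` from the model heads and compute the loss from raw logits with `fluid.layers.softmax_with_cross_entropy`. One common reason to fuse the two steps is numerical stability, which matters more once activations are float16. The block below is a minimal pure-Python sketch of that idea under stated assumptions, not Paddle's implementation; the names `naive_cross_entropy` and `fused_softmax_cross_entropy` are hypothetical.

```python
import math


def naive_cross_entropy(logits, label):
    """Softmax first, then -log(p[label]); exp() can overflow on large logits."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return -math.log(exps[label] / total)


def fused_softmax_cross_entropy(logits, label):
    """Log-sum-exp form: subtract the max logit before exponentiating."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    loss = log_sum_exp - logits[label]
    # Softmax probabilities, analogous to what return_softmax=True exposes.
    probs = [math.exp(z - log_sum_exp) for z in logits]
    return loss, probs


if __name__ == "__main__":
    small = [2.0, 1.0, 0.1]
    loss, probs = fused_softmax_cross_entropy(small, 0)
    print(loss, naive_cross_entropy(small, 0))  # both forms agree on small logits

    large = [1000.0, 0.0]
    print(fused_softmax_cross_entropy(large, 1)[0])  # stays finite
    try:
        naive_cross_entropy(large, 1)
    except OverflowError:
        print("naive version overflowed")
```

Because the fused form also yields the probabilities, the accuracy ops in the diff can take `pred` instead of `out`.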


--- a/fluid/PaddleCV/image_classification/utils/__init__.py
+++ b/fluid/PaddleCV/image_classification/utils/__init__.py
 from .learning_rate import cosine_decay, lr_warmup
+from .fp16_utils import create_master_params_grads, master_param_to_train_param
--- a/fluid/PaddleCV/image_classification/utils/fp16_utils.py
+++ b/fluid/PaddleCV/image_classification/utils/fp16_utils.py
+from __future__ import print_function
+import paddle
+import paddle.fluid as fluid
+
+def cast_fp16_to_fp32(i, o, prog):
+    prog.global_block().append_op(
+        type="cast",
+        inputs={"X": i},
+        outputs={"Out": o},
+        attrs={
+            "in_dtype": fluid.core.VarDesc.VarType.FP16,
+            "out_dtype": fluid.core.VarDesc.VarType.FP32
+        }
+    )
+
+def cast_fp32_to_fp16(i, o, prog):
+    prog.global_block().append_op(
+        type="cast",
+        inputs={"X": i},
+        outputs={"Out": o},
+        attrs={
+            "in_dtype": fluid.core.VarDesc.VarType.FP32,
+            "out_dtype": fluid.core.VarDesc.VarType.FP16
+        }
+    )
+
+def copy_to_master_param(p, block):
+    v = block.vars.get(p.name, None)
+    if v is None:
+        raise ValueError("no param name %s found!" % p.name)
+    new_p = fluid.framework.Parameter(
+        block=block,
+        shape=v.shape,
+        dtype=fluid.core.VarDesc.VarType.FP32,
+        type=v.type,
+        lod_level=v.lod_level,
+        stop_gradient=p.stop_gradient,
+        trainable=p.trainable,
+        optimize_attr=p.optimize_attr,
+        regularizer=p.regularizer,
+        gradient_clip_attr=p.gradient_clip_attr,
+        error_clip=p.error_clip,
+        name=v.name + ".master")
+    return new_p
+
+def create_master_params_grads(params_grads, main_prog, startup_prog, scale_loss):
+    master_params_grads = []
+    tmp_role = main_prog._current_role
+    OpRole = fluid.core.op_proto_and_checker_maker.OpRole
+    main_prog._current_role = OpRole.Backward
+    for p, g in params_grads:
+        # create master parameters
+        master_param = copy_to_master_param(p, main_prog.global_block())
+        startup_master_param = startup_prog.global_block()._clone_variable(master_param)
+        startup_p = startup_prog.global_block().var(p.name)
+        cast_fp16_to_fp32(startup_p, startup_master_param, startup_prog)
+        # cast fp16 gradients to fp32 before apply gradients
+        if g.name.startswith("batch_norm"):
+            if scale_loss > 1:
+                scaled_g = g / float(scale_loss)
+            else:
+                scaled_g = g
+            master_params_grads.append([p, scaled_g])
+            continue
+        master_grad = fluid.layers.cast(g, "float32")
+        if scale_loss > 1:
+            master_grad = master_grad / float(scale_loss)
+        master_params_grads.append([master_param, master_grad])
+    main_prog._current_role = tmp_role
+    return master_params_grads
+
+def master_param_to_train_param(master_params_grads, params_grads, main_prog):
+    for idx, m_p_g in enumerate(master_params_grads):
+        train_p, _ = params_grads[idx]
+        if train_p.name.startswith("batch_norm"):
+            continue
+        with main_prog._optimized_guard([m_p_g[0], m_p_g[1]]):
+            cast_fp32_to_fp16(m_p_g[0], train_p, main_prog)
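
The helpers above combine two ideas: scaling the loss so that small float16 gradients do not underflow, and keeping a higher-precision master copy of each parameter for the optimizer step. The following is a minimal sketch of both effects in plain Python, assuming a Python float stands in for the float32 master copy and using `struct`'s half-precision format to emulate float16 rounding. The function names and the scale of 128 are hypothetical (the diff's `scale_loss` defaults to 1.0), and nothing here calls Paddle.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def fp16_only_step(weight, grad, lr):
    """Keep the weight in fp16 and apply the update directly."""
    return to_fp16(weight - lr * to_fp16(grad))


def master_weight_step(master, grad, lr, scale_loss):
    """Update a higher-precision master weight from a loss-scaled fp16 gradient."""
    scaled_grad = to_fp16(grad * scale_loss)  # what the fp16 backward pass would hold
    master_grad = scaled_grad / scale_loss    # unscale after casting up
    master = master - lr * master_grad        # optimizer updates the master copy
    return master, to_fp16(master)            # fp16 copy for the next forward pass


if __name__ == "__main__":
    # A tiny gradient vanishes in fp16 unless it is scaled first.
    print(to_fp16(1e-8), to_fp16(1e-8 * 128.0) / 128.0)

    # Small updates are rounded away when the weight itself lives in fp16.
    w16, master = 1.0, 1.0
    for _ in range(10):
        w16 = fp16_only_step(w16, 0.1, 1e-3)
        master, w16_from_master = master_weight_step(master, 0.1, 1e-3, 128.0)
    print(w16, master, w16_from_master)
```

In the sketch the fp16-only weight never moves because each update is smaller than half the spacing between float16 values near 1.0, while the master copy accumulates the updates and only the cast-back copy is rounded.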
--- a/fluid/PaddleNLP/deep_attention_matching_net/train_and_evaluate.py
+++ b/fluid/PaddleNLP/deep_attention_matching_net/train_and_evaluate.py
@@ -248,8 +248,9 @@ def train(args):

    print("device count %d" % dev_count)
    print("theoretical memory usage: ")
-    print(fluid.contrib.memory_usage(
-        program=train_program, batch_size=args.batch_size))
+    print(
+        fluid.contrib.memory_usage(
+            program=train_program, batch_size=args.batch_size))

    exe = fluid.Executor(place)
    exe.run(train_startup)
@@ -318,8 +319,9 @@ def train(args):
            if (args.save_path is not None) and (step % save_step == 0):
                save_path = os.path.join(args.save_path, "step_" + str(step))
                print("Save model at step %d ... " % step)
-                print(time.strftime('%Y-%m-%d %H:%M:%S',
-                                    time.localtime(time.time())))
+                print(
+                    time.strftime('%Y-%m-%d %H:%M:%S',
+                                  time.localtime(time.time())))
                fluid.io.save_persistables(exe, save_path, train_program)

                score_path = os.path.join(args.save_path, 'score.' + str(step))
@@ -358,8 +360,9 @@ def train(args):
                    save_path = os.path.join(args.save_path,
                                             "step_" + str(step))
                    print("Save model at step %d ... " % step)
-                    print(time.strftime('%Y-%m-%d %H:%M:%S',
-                                        time.localtime(time.time())))
+                    print(
+                        time.strftime('%Y-%m-%d %H:%M:%S',
+                                      time.localtime(time.time())))
                    fluid.io.save_persistables(exe, save_path, train_program)

                    score_path = os.path.join(args.save_path,
@@ -389,9 +392,11 @@ def train(args):
            global_step, last_cost = train_with_pyreader(global_step)
        else:
            global_step, last_cost = train_with_feed(global_step)
-        train_time += time.time() - begin_time
+
+        pass_time_cost = time.time() - begin_time
+        train_time += pass_time_cost
        print("Pass {0}, pass_time_cost {1}"
-          .format(epoch, "%2.2f sec" % (time.time() - begin_time)))
+              .format(epoch, "%2.2f sec" % pass_time_cost))
    # For internal continuous evaluation
    if "CE_MODE_X" in os.environ:
        print("kpis	train_cost	%f" % last_cost)

--- a/fluid/PaddleNLP/text_classification/README.md
+++ b/fluid/PaddleNLP/text_classification/README.md
@@ -14,7 +14,7 @@

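 ## Introduction and model details

-The PaddlePaddle v2 [text classification](https://github.com/PaddlePaddle/models/blob/develop/text/README.md) example covers the text classification task in detail, so it is not repeated here.
+The PaddlePaddle v2 [text classification](https://github.com/PaddlePaddle/models/blob/develop/legacy/text_classification/README.md) example covers the text classification task in detail, so it is not repeated here.
 For models, we use four common text classification models: bow, cnn, lstm, and gru.

 ## Training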

--- a/fluid/PaddleNLP/text_classification/async_executor/README.md
+++ b/fluid/PaddleNLP/text_classification/async_executor/README.md
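+# Text classification
+
+The directory layout of this example is summarized below:
+
+```text
+.
+|-- README.md               # README
+|-- data_generator          # IMDB dataset generation tools
+|   |-- IMDB.py             # Extends data_generator.py with IMDB-specific processing
+|   |-- build_raw_data.py   # Preprocesses IMDB data; its output is read by splitfile.py. Format: word word ... | label
+|   |-- data_generator.py   # Data generation framework that accompanies AsyncExecutor
+|   `-- splitfile.py        # Splits the file produced by build_raw_data.py; its output is read by IMDB.py
+|-- data_generator.sh       # Entry point for the IMDB dataset generation tools
+|-- data_reader.py          # Data reading utility used by the inference script
+|-- infer.py                # Inference script
+`-- train.py                # Training script
+```
+
+## Introduction
+
+This directory contains scripts that train a text classification task with fluid.AsyncExecutor. The network definitions are reused from nets.py in the parent directory.
+
+## Training
+
+1. Run `sh data_generator.sh` to download the IMDB dataset and convert it into training data that AsyncExecutor can read.
+2. Run `python train.py bow` to start training.
+    ```bash
+    python train.py bow    # bow selects the network; it can be replaced with cnn, lstm, or gru
+    ```
+
+3. (Optional) To use a custom network, add it to [nets.py](../nets.py) and set the corresponding parameters in [train.py](./train.py).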
+    ```python
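+    def train(network,                # network definition, e.g. bow_net
+              dict_dim,               # vocabulary size
+              lr,                     # learning rate
+              save_dirname,           # directory for saved models
+              training_data_dirname,  # directory with the training data
+              pass_num,               # number of training passes
+              thread_num,             # number of AsyncExecutor threads
+              batch_size):            # number of samples per batch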
+    ```
+
+## Example training output
+
+```text
+pass_id: 0 pass_time_cost 4.723438
+pass_id: 1 pass_time_cost 3.867186
+pass_id: 2 pass_time_cost 4.490111
+pass_id: 3 pass_time_cost 4.573296
+pass_id: 4 pass_time_cost 4.180547
+pass_id: 5 pass_time_cost 4.214476
+pass_id: 6 pass_time_cost 4.520387
+pass_id: 7 pass_time_cost 4.149485
+pass_id: 8 pass_time_cost 3.821354
+pass_id: 9 pass_time_cost 5.136178
+pass_id: 10 pass_time_cost 4.137318
+pass_id: 11 pass_time_cost 3.943429
+pass_id: 12 pass_time_cost 3.766478
+pass_id: 13 pass_time_cost 4.235983
+pass_id: 14 pass_time_cost 4.796462
+pass_id: 15 pass_time_cost 4.668116
+pass_id: 16 pass_time_cost 4.373798
+pass_id: 17 pass_time_cost 4.298131
+pass_id: 18 pass_time_cost 4.260021
+pass_id: 19 pass_time_cost 4.244411
+pass_id: 20 pass_time_cost 3.705138
+pass_id: 21 pass_time_cost 3.728070
+pass_id: 22 pass_time_cost 3.817919
+pass_id: 23 pass_time_cost 4.698598
+pass_id: 24 pass_time_cost 4.859262
+pass_id: 25 pass_time_cost 5.725732
+pass_id: 26 pass_time_cost 5.102599
+pass_id: 27 pass_time_cost 3.876582
+pass_id: 28 pass_time_cost 4.762538
+pass_id: 29 pass_time_cost 3.797759
+```
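+Unlike fluid.Executor, AsyncExecutor does not print the accuracy at the end of each pass. To monitor training, set the `debug` argument of fluid.AsyncExecutor.run() to True, so that the fetch variables passed to it are printed at the end of each pass:
+
+```python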
+async_executor.run(
+    main_program,
+    dataset,
+    filelist,
+    thread_num,
+    [acc],
+    debug=True)
+```
+
+## Inference
+
+1. Run `python infer.py bow_model` to start inference.
+    ```bash
+    python infer.py bow_model     # bow_model specifies the model to load
+    ```
+
+## Example inference output
+```text
+model_path: bow_model/epoch0.model, avg_acc: 0.882600
+model_path: bow_model/epoch1.model, avg_acc: 0.887920
+model_path: bow_model/epoch2.model, avg_acc: 0.886920
+model_path: bow_model/epoch3.model, avg_acc: 0.884720
+model_path: bow_model/epoch4.model, avg_acc: 0.879760
+model_path: bow_model/epoch5.model, avg_acc: 0.876920
+model_path: bow_model/epoch6.model, avg_acc: 0.874160
+model_path: bow_model/epoch7.model, avg_acc: 0.872000
+model_path: bow_model/epoch8.model, avg_acc: 0.870360
+model_path: bow_model/epoch9.model, avg_acc: 0.868480
+model_path: bow_model/epoch10.model, avg_acc: 0.867240
+model_path: bow_model/epoch11.model, avg_acc: 0.866200
+model_path: bow_model/epoch12.model, avg_acc: 0.865560
+model_path: bow_model/epoch13.model, avg_acc: 0.865160
+model_path: bow_model/epoch14.model, avg_acc: 0.864480
+model_path: bow_model/epoch15.model, avg_acc: 0.864240
+model_path: bow_model/epoch16.model, avg_acc: 0.863800
+model_path: bow_model/epoch17.model, avg_acc: 0.863520
+model_path: bow_model/epoch18.model, avg_acc: 0.862760
+model_path: bow_model/epoch19.model, avg_acc: 0.862680
+model_path: bow_model/epoch20.model, avg_acc: 0.862240
+model_path: bow_model/epoch21.model, avg_acc: 0.862280
+model_path: bow_model/epoch22.model, avg_acc: 0.862080
+model_path: bow_model/epoch23.model, avg_acc: 0.861560
+model_path: bow_model/epoch24.model, avg_acc: 0.861280
+model_path: bow_model/epoch25.model, avg_acc: 0.861160
+model_path: bow_model/epoch26.model, avg_acc: 0.861080
+model_path: bow_model/epoch27.model, avg_acc: 0.860920
+model_path: bow_model/epoch28.model, avg_acc: 0.860800
+model_path: bow_model/epoch29.model, avg_acc: 0.860760
+```
+Note: acc keeps declining because of overfitting; please ignore this.
--- a/fluid/PaddleNLP/text_classification/async_executor/data_generator.sh
+++ b/fluid/PaddleNLP/text_classification/async_executor/data_generator.sh
+#!/usr/bin/env bash
+
+# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+pushd .
+cd ./data_generator
+
+# Download the IMDB archive once; skip the download if it is already present.
+if [ ! -f aclImdb_v1.tar.gz ]; then
+    wget "http://ai.stanford.edu/%7Eamaas/data/sentiment/aclImdb_v1.tar.gz"
+fi
+tar zxvf aclImdb_v1.tar.gz
+
+mkdir train_data
+python build_raw_data.py train | python splitfile.py 12 train_data
+
+mkdir test_data
+python build_raw_data.py test | python splitfile.py 12 test_data
+
+python IMDB.py train_data
+python IMDB.py test_data
+
+mv ./output_dataset/train_data ../
+mv ./output_dataset/test_data ../
+cp aclImdb/imdb.vocab ../
+
+rm -rf ./output_dataset
+rm -rf train_data
+rm -rf test_data
+rm -rf aclImdb
+popd
--- a/fluid/PaddleNLP/text_classification/async_executor/data_generator/IMDB.py
+++ b/fluid/PaddleNLP/text_classification/async_executor/data_generator/IMDB.py
+#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import re
+import os, sys
+sys.path.append(os.path.abspath(os.path.join('..')))
+from data_generator import MultiSlotDataGenerator
+
+
+class IMDbDataGenerator(MultiSlotDataGenerator):
+    def load_resource(self, dictfile):
+        self._vocab = {}
+        wid = 0
+        with open(dictfile) as f:
+            for line in f:
+                self._vocab[line.strip()] = wid
+                wid += 1
+        self._unk_id = len(self._vocab)
+        self._pattern = re.compile(r'(;|,|\.|\?|!|\s|\(|\))')
+
+    def process(self, line):
+        send = '|'.join(line.split('|')[:-1]).lower().replace("<br />",
+                                                              " ").strip()
+        label = [int(line.split('|')[-1])]
+
+        words = [x for x in self._pattern.split(send) if x and x != " "]
+        feas = [
+            self._vocab[x] if x in self._vocab else self._unk_id for x in words
+        ]
+
+        return ("words", feas), ("label", label)
+
+
+imdb = IMDbDataGenerator()
+imdb.load_resource("aclImdb/imdb.vocab")
+
+# data from files
+file_names = os.listdir(sys.argv[1])
+filelist = []
+for i in range(0, len(file_names)):
+    filelist.append(os.path.join(sys.argv[1], file_names[i]))
+
+line_limit = 2500
+process_num = 24
+imdb.run_from_files(
+    filelist=filelist,
+    line_limit=line_limit,
+    process_num=process_num,
+    output_dir=('output_dataset/%s' % (sys.argv[1])))
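
To make the `process()` step above concrete, here is a small standalone sketch of the same tokenize-and-look-up logic, applied to one line in the `word word ... | label` format. It reuses the regular expression from the diff but does not depend on `MultiSlotDataGenerator`; the function name `encode_line` and the three-word vocabulary are hypothetical.

```python
import re

# Same separator pattern as in IMDbDataGenerator.load_resource().
PATTERN = re.compile(r'(;|,|\.|\?|!|\s|\(|\))')


def encode_line(line, vocab):
    """Turn 'text | label' into (words, word ids, [label])."""
    unk_id = len(vocab)  # out-of-vocabulary words share one extra id
    text = '|'.join(line.split('|')[:-1]).lower().replace("<br />", " ").strip()
    label = [int(line.split('|')[-1])]
    # The capturing group keeps separators, so punctuation survives as tokens;
    # only empty strings and single spaces are dropped.
    words = [x for x in PATTERN.split(text) if x and x != " "]
    ids = [vocab.get(x, unk_id) for x in words]
    return words, ids, label


if __name__ == "__main__":
    vocab = {"great": 0, "movie": 1, "really": 2}
    print(encode_line("Great movie, really! | 1\n", vocab))
    # (['great', 'movie', ',', 'really', '!'], [0, 1, 3, 2, 3], [1])
```

Punctuation marks become tokens of their own, and any token missing from the vocabulary maps to the id `len(vocab)`.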
--- a/fluid/PaddleNLP/text_classification/async_executor/data_generator/data_generator.py
+++ b/fluid/PaddleNLP/text_classification/async_executor/data_generator/data_generator.py
--- a/fluid/PaddleNLP/text_classification/async_executor/data_generator/splitfile.py
+++ b/fluid/PaddleNLP/text_classification/async_executor/data_generator/splitfile.py
+#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+Split file into parts
+"""
+import sys
+import os
+block = int(sys.argv[1])
+datadir = sys.argv[2]
+file_list = []
+for i in range(block):
+    file_list.append(open(datadir + "/part-" + str(i), "w"))
+id_ = 0
+for line in sys.stdin:
+    file_list[id_ % block].write(line)
+    id_ += 1
+for f in file_list:
+    f.close()
--- a/fluid/PaddleNLP/text_classification/async_executor/data_reader.py
+++ b/fluid/PaddleNLP/text_classification/async_executor/data_reader.py
+#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os
+import paddle
+
+
+def parse_fields(fields):
+    words_width = int(fields[0])
+    words = fields[1:1 + words_width]
+    label = fields[-1]
+
+    return words, label
+
+
+def imdb_data_feed_reader(data_dir, batch_size, buf_size):
+    """ 
+    Data feed reader for IMDB dataset.
+    This data set has been converted from original format to a format suitable
+    for AsyncExecutor
+    See data.proto for data format
+    """
+
+    def reader():
+        for file in os.listdir(data_dir):
+            if file.endswith('.proto'):
+                continue
+
+            with open(os.path.join(data_dir, file), 'r') as f:
+                for line in f:
+                    fields = line.split(' ')
+                    words, label = parse_fields(fields)
+                    yield words, label
+
+    test_reader = paddle.batch(
+        paddle.reader.shuffle(
+            reader, buf_size=buf_size), batch_size=batch_size)
+    return test_reader
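
`parse_fields` above implies a line layout in which the first field is the number of word ids, followed by the ids themselves, with the label as the last field. The sketch below parses one such line under that assumption; the name `parse_multislot_line` and the sample line are hypothetical, and converting the fields to `int` is a choice of this sketch (the reader in the diff yields them as strings).

```python
def parse_multislot_line(line):
    """Parse '<n_words> id_1 ... id_n <n_labels> label' into (ids, label)."""
    fields = line.split()  # split() also drops the trailing newline
    n_words = int(fields[0])
    words = [int(tok) for tok in fields[1:1 + n_words]]
    label = int(fields[-1])
    return words, label


if __name__ == "__main__":
    print(parse_multislot_line("3 12 7 45 1 0\n"))  # ([12, 7, 45], 0)
```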
--- a/fluid/PaddleNLP/text_classification/async_executor/infer.py
+++ b/fluid/PaddleNLP/text_classification/async_executor/infer.py
+#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import time
+import unittest
+import contextlib
+import numpy as np
+
+import paddle
+import paddle.fluid as fluid
+
+import data_reader
+
+
+def infer(test_reader, use_cuda, model_path=None):
+    """
+    inference function
+    """
+    if model_path is None:
+        print(str(model_path) + " cannot be found")
+        return
+
+    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
+    exe = fluid.Executor(place)
+
+    inference_scope = fluid.core.Scope()
+    with fluid.scope_guard(inference_scope):
+        [inference_program, feed_target_names,
+         fetch_targets] = fluid.io.load_inference_model(model_path, exe)
+
+        total_acc = 0.0
+        total_count = 0
+        for data in test_reader():
+            acc = exe.run(inference_program,
+                          feed=utils.data2tensor(data, place),
+                          fetch_list=fetch_targets,
+                          return_numpy=True)
+            total_acc += acc[0] * len(data)
+            total_count += len(data)
+
+        avg_acc = total_acc / total_count
+        print("model_path: %s, avg_acc: %f" % (model_path, avg_acc))
+
+
+if __name__ == "__main__":
+    if __package__ is None:
+        from os import sys, path
+        sys.path.append(
+            os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
+    import utils
+
+    batch_size = 128
+    model_path = sys.argv[1]
+    test_data_dirname = 'test_data'
+
+    if len(sys.argv) == 3:
+        test_data_dirname = sys.argv[2]
+
+    test_reader = data_reader.imdb_data_feed_reader(
+        test_data_dirname, batch_size, buf_size=500000)
+
+    models = os.listdir(model_path)
+    for i in range(0, len(models)):
+        epoch_path = "epoch" + str(i) + ".model"
+        epoch_path = os.path.join(model_path, epoch_path)
+        infer(test_reader, use_cuda=False, model_path=epoch_path)
--- a/fluid/PaddleNLP/text_classification/async_executor/train.py
+++ b/fluid/PaddleNLP/text_classification/async_executor/train.py
+#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import time
+import multiprocessing
+
+import paddle
+import paddle.fluid as fluid
+
+
+def train(network, dict_dim, lr, save_dirname, training_data_dirname, pass_num,
+          thread_num, batch_size):
+    file_names = os.listdir(training_data_dirname)
+    filelist = []
+    for i in range(0, len(file_names)):
+        if file_names[i] == 'data_feed.proto':
+            continue
+        filelist.append(os.path.join(training_data_dirname, file_names[i]))
+
+    dataset = fluid.DataFeedDesc(
+        os.path.join(training_data_dirname, 'data_feed.proto'))
+    dataset.set_batch_size(
+        batch_size)  # datafeed should be assigned a batch size
+    dataset.set_use_slots(['words', 'label'])
+
+    data = fluid.layers.data(
+        name="words", shape=[1], dtype="int64", lod_level=1)
+    label = fluid.layers.data(name="label", shape=[1], dtype="int64")
+
+    avg_cost, acc, prediction = network(data, label, dict_dim)
+    optimizer = fluid.optimizer.Adagrad(learning_rate=lr)
+    opt_ops, weight_and_grad = optimizer.minimize(avg_cost)
+
+    startup_program = fluid.default_startup_program()
+    main_program = fluid.default_main_program()
+
+    place = fluid.CPUPlace()
+    executor = fluid.Executor(place)
+    executor.run(startup_program)
+
+    async_executor = fluid.AsyncExecutor(place)
+    for i in range(pass_num):
+        pass_start = time.time()
+        async_executor.run(main_program,
+                           dataset,
+                           filelist,
+                           thread_num, [acc],
+                           debug=False)
+        print('pass_id: %u pass_time_cost %f' % (i, time.time() - pass_start))
+        fluid.io.save_inference_model('%s/epoch%d.model' % (save_dirname, i),
+                                      [data.name, label.name], [acc], executor)
+
+
+if __name__ == "__main__":
+    if __package__ is None:
+        from os import sys, path
+        sys.path.append(
+            os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
+
+    from nets import bow_net, cnn_net, lstm_net, gru_net
+    from utils import load_vocab
+
+    batch_size = 4
+    lr = 0.002
+    pass_num = 30
+    save_dirname = ""
+    thread_num = multiprocessing.cpu_count()
+
+    if sys.argv[1] == "bow":
+        network = bow_net
+        batch_size = 128
+        save_dirname = "bow_model"
+    elif sys.argv[1] == "cnn":
+        network = cnn_net
+        lr = 0.01
+        save_dirname = "cnn_model"
+    elif sys.argv[1] == "lstm":
+        network = lstm_net
+        lr = 0.05
+        save_dirname = "lstm_model"
+    elif sys.argv[1] == "gru":
+        network = gru_net
+        batch_size = 128
+        lr = 0.05
+        save_dirname = "gru_model"
+
+    training_data_dirname = 'train_data/'
+    if len(sys.argv) >= 3:
+        training_data_dirname = sys.argv[2]
+
+    if len(sys.argv) >= 4:
+        # never use more threads than there are CPU cores
+        thread_num = min(thread_num, int(sys.argv[3]))
+
+    vocab = load_vocab('imdb.vocab')
+    dict_dim = len(vocab)
+
+    train(network, dict_dim, lr, save_dirname, training_data_dirname, pass_num,
+          thread_num, batch_size)
--- a/fluid/PaddleRec/ctr/data/download.sh
+++ b/fluid/PaddleRec/ctr/data/download.sh
 #!/bin/bash

-wget --no-check-certificate https://s3-eu-west-1.amazonaws.com/criteo-labs/dac.tar.gz
+wget --no-check-certificate https://s3-eu-west-1.amazonaws.com/kaggle-display-advertising-challenge-dataset/dac.tar.gz
 tar zxf dac.tar.gz
 rm -f dac.tar.gz


--- a/fluid/PaddleRec/word2vec/README.cn.md
+++ b/fluid/PaddleRec/word2vec/README.cn.md
@@ -25,7 +25,14 @@ cd data && ./download.sh && cd ..
 ```bash
 python preprocess.py --data_path ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled --dict_path data/1-billion_dict
 ```
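-If you want to use one of our supported third-party vocabularies, set --other_dict_path to the directory where the vocabulary is stored and pass --with_other_dict to use it.
+If you want to use a custom dictionary in the following format:
+```bash
+<UNK>
+a
+b
+c
+```
+set --other_dict_path to the directory where the dictionary is stored and pass --with_other_dict to use it.

 ## Training
 The command-line options for training can be listed with `python train.py -h`.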
@@ -40,6 +47,14 @@ python train.py \
        --with_hs --with_nce --is_local \
        2>&1 | tee train.log
 ```
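+If you want to use a custom dictionary in the following format:
+```bash
+<UNK>
+a
+b
+c
+```
+set --other_dict_path to the directory where the dictionary is stored and pass --with_other_dict to use it.

 ### Distributed training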


--- a/fluid/PaddleRec/word2vec/README.md
+++ b/fluid/PaddleRec/word2vec/README.md
@@ -29,9 +29,16 @@ This model implement a skip-gram model of word2vector.
 Preprocess the training data to generate a word dict.

 ```bash
-python preprocess.py --data_path ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled --is_local --dict_path data/1-billion_dict
+python preprocess.py --data_path ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled --dict_path data/1-billion_dict
 ```
-if you would like to use our supported third party vocab, please set --other_dict_path as the directory of where you
+If you would like to use your own vocab, follow the format below:
+```bash
+<UNK>
+a
+b
+c
+```
+Then set --other_dict_path to the directory where you
 save the vocab you will use and pass the --with_other_dict flag to use it.

 ## Train
@@ -47,7 +54,8 @@ python train.py \
        --with_hs --with_nce --is_local \
        2>&1 | tee train.log
 ```
-
+If you would like to use our supported third-party vocab, set --other_dict_path to the directory where you
+save the vocab you will use and pass the --with_other_dict flag to use it.

 ### Distributed Train
 Run a 2 pserver 2 trainer distribute training on a single machine.

--- a/fluid/PaddleRec/word2vec/preprocess.py
+++ b/fluid/PaddleRec/word2vec/preprocess.py
@@ -27,12 +27,6 @@ def parse_args():
        type=int,
        default=5,
        help="If the word count is less than freq, it will be removed from dict")
-    parser.add_argument(
-        '--is_local',
-        action='store_true',
-        required=False,
-        default=False,
-        help='Local train or not, (default: False)')

    parser.add_argument(
        '--with_other_dict',
@@ -203,26 +197,27 @@ def preprocess(args):
            for line in f:
                word_count[native_to_unicode(line.strip())] = 1

-    if args.is_local:
-        for i in range(1, 100):
-            with io.open(
-                    args.data_path + "/news.en-000{:0>2d}-of-00100".format(i),
-                    encoding='utf-8') as f:
-                for line in f:
+    for i in range(1, 100):
+        with io.open(
+                args.data_path + "/news.en-000{:0>2d}-of-00100".format(i),
+                encoding='utf-8') as f:
+            for line in f:
+                if args.with_other_dict:
                    line = strip_lines(line)
                    words = line.split()
-                    if args.with_other_dict:
-                        for item in words:
-                            if item in word_count:
-                                word_count[item] = word_count[item] + 1
-                            else:
-                                word_count[native_to_unicode('<UNK>')] += 1
-                    else:
-                        for item in words:
-                            if item in word_count:
-                                word_count[item] = word_count[item] + 1
-                            else:
-                                word_count[item] = 1
+                    for item in words:
+                        if item in word_count:
+                            word_count[item] = word_count[item] + 1
+                        else:
+                            word_count[native_to_unicode('<UNK>')] += 1
+                else:
+                    line = text_strip(line)
+                    words = line.split()
+                    for item in words:
+                        if item in word_count:
+                            word_count[item] = word_count[item] + 1
+                        else:
+                            word_count[item] = 1
    item_to_remove = []
    for item in word_count:
        if word_count[item] <= args.freq:
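
The counting logic in this hunk has two modes: with a custom dictionary, every out-of-vocabulary word is counted under `<UNK>`; without one, the vocabulary grows as new words appear. The sketch below isolates that difference, assuming whitespace tokenization only (the `strip_lines` and `text_strip` cleaning steps are not reproduced); the function name `count_words` is hypothetical.

```python
def count_words(lines, custom_vocab=None):
    """Count word frequencies, optionally against a fixed vocabulary."""
    if custom_vocab is not None:
        # Each dictionary entry starts at 1, as in the diff.
        counts = {w: 1 for w in custom_vocab}
        for line in lines:
            for w in line.split():
                if w in counts:
                    counts[w] += 1
                else:
                    # Raises KeyError unless the dictionary lists <UNK>.
                    counts['<UNK>'] += 1
    else:
        counts = {}
        for line in lines:
            for w in line.split():
                counts[w] = counts.get(w, 0) + 1
    return counts


if __name__ == "__main__":
    corpus = ["a b z", "a q"]
    print(count_words(corpus, ["<UNK>", "a", "b", "c"]))
    print(count_words(corpus))
```

This is why the custom dictionary format shown in the README starts with an `<UNK>` entry: without it, the first unknown word would fail the lookup.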

--- a/fluid/PaddleRec/word2vec/reader.py
+++ b/fluid/PaddleRec/word2vec/reader.py
@@ -105,7 +105,7 @@ class Word2VecReader(object):

        return set(targets)

-    def train(self, with_hs):
+    def train(self, with_hs, with_other_dict):
        def _reader():
            for file in self.filelist:
                with io.open(
@@ -116,7 +116,11 @@ class Word2VecReader(object):
                    count = 1
                    for line in f:
                        if self.trainer_id == count % self.trainer_num:
-                            line = preprocess.strip_lines(line, self.word_count)
+                            if with_other_dict:
+                                line = preprocess.strip_lines(line,
+                                                              self.word_count)
+                            else:
+                                line = preprocess.text_strip(line)
                            word_ids = [
                                self.word_to_id_[word] for word in line.split()
                                if word in self.word_to_id_
@@ -140,7 +144,11 @@ class Word2VecReader(object):
                    count = 1
                    for line in f:
                        if self.trainer_id == count % self.trainer_num:
-                            line = preprocess.strip_lines(line, self.word_count)
+                            if with_other_dict:
+                                line = preprocess.strip_lines(line,
+                                                              self.word_count)
+                            else:
+                                line = preprocess.text_strip(line)
                            word_ids = [
                                self.word_to_id_[word] for word in line.split()
                                if word in self.word_to_id_
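
Review note: both reader hunks keep the check `self.trainer_id == count % self.trainer_num`, which gives each trainer a subset of the input lines. The sketch below shows that sharding rule in isolation. It assumes `count` starts at 1 and is incremented once per line, which is not visible in these hunks; `shard_lines` is a hypothetical name, not part of `reader.py`.

```python
def shard_lines(lines, trainer_id, trainer_num):
    """Yield only the lines assigned to one trainer.

    Assumption: the counter starts at 1 and advances by one per line,
    mirroring `count = 1` in the diff.
    """
    for count, line in enumerate(lines, start=1):
        if trainer_id == count % trainer_num:
            yield line
```

Under this rule a single trainer (`trainer_id=0`, `trainer_num=1`) receives every line, since `count % 1` is always 0.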

--- a/fluid/PaddleRec/word2vec/train.py
+++ b/fluid/PaddleRec/word2vec/train.py
@@ -116,6 +116,13 @@ def parse_args():
        default=False,
        help='Do inference every 100 batches , (default: False)')

+    parser.add_argument(
+        '--with_other_dict',
+        action='store_true',
+        required=False,
+        default=False,
+        help='Whether to use another dictionary (default: False)')
+
    parser.add_argument(
        '--rank_num',
        type=int,
@@ -161,8 +168,8 @@ def train_loop(args, train_program, reader, py_reader, loss, trainer_id):
    py_reader.decorate_tensor_provider(
        convert_python_to_tensor(args.batch_size,
                                 reader.train((args.with_hs or (
-                                     not args.with_nce))), (args.with_hs or (
-                                         not args.with_nce))))
+                                     not args.with_nce)), args.with_other_dict),
+                                 (args.with_hs or (not args.with_nce))))

    place = fluid.CPUPlace()

@@ -261,7 +268,7 @@ def train(args):
            args.dict_path, args.train_data_path, filelist, 0, 1)
    else:
        trainer_id = int(os.environ["PADDLE_TRAINER_ID"])
-        trainers = int(os.environ["PADDLE_TRAINERS"])
+        trainer_num = int(os.environ["PADDLE_TRAINERS"])
        word2vec_reader = reader.Word2VecReader(args.dict_path,
                                                args.train_data_path, filelist,
                                                trainer_id, trainer_num)