Commit f359889d authored by wangmeng28

Merge remote-tracking branch 'upstream/develop' into inception_v4

#!/usr/bin/env bash
set -e
readonly VERSION="3.9"
readonly VERSION="3.8"
version=$(clang-format -version)
......
......@@ -16,11 +16,12 @@ addons:
- python
- python-pip
- python2.7-dev
- clang-format-3.8
ssh_known_hosts: 52.76.173.135
before_install:
- if [[ "$JOB" == "PRE_COMMIT" ]]; then sudo ln -s /usr/bin/clang-format-3.8 /usr/bin/clang-format; fi
- sudo pip install -U virtualenv pre-commit pip
- docker pull paddlepaddle/paddle:latest
script:
- exit_code=0
......
......@@ -98,13 +98,15 @@ PaddlePaddle provides a rich set of computational units that help build models in a modular
Compared with text, images convey information more vividly, are easier to understand, and carry more artistic expression, making them an important medium for transferring and exchanging information. Image classification distinguishes images of different categories according to their semantic content. It is a fundamental problem in computer vision and the basis of other high-level vision tasks such as object detection, image segmentation, object tracking, and behavior analysis, with wide applications in many fields, e.g., face recognition and intelligent video analysis in security, traffic scene recognition in transportation, content-based image retrieval and automatic album categorization on the web, and image recognition in medicine.
In the image classification task, we show how to train the AlexNet, VGG, GoogLeNet, ResNet, Inception-v4, and Inception-ResNet-v2 models. We also provide model conversion tools that convert model files trained with Caffe or TensorFlow into PaddlePaddle model files.
- 11.1 [Convert Caffe model files to PaddlePaddle model files](https://github.com/PaddlePaddle/models/tree/develop/image_classification/caffe2paddle)
- 11.2 [Convert TensorFlow model files to PaddlePaddle model files](https://github.com/PaddlePaddle/models/tree/develop/image_classification/tf2paddle)
- 11.3 [AlexNet](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 11.4 [VGG](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 11.5 [Residual Network](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 11.6 [Inception-v4](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 11.7 [Inception-ResNet-v2](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
## 12. Object Detection
......
......@@ -72,12 +72,14 @@ As an example for sequence-to-sequence learning, we take the machine translation
## 9. Image classification
For the image classification example, we show you how to train AlexNet, VGG, GoogLeNet, ResNet, Inception-v4, and Inception-ResNet-v2 models in PaddlePaddle. We also provide model conversion tools that convert model files trained with Caffe or TensorFlow into PaddlePaddle model files.
- 9.1 [Convert Caffe model file to PaddlePaddle model file](https://github.com/PaddlePaddle/models/tree/develop/image_classification/caffe2paddle)
- 9.2 [Convert TensorFlow model file to PaddlePaddle model file](https://github.com/PaddlePaddle/models/tree/develop/image_classification/tf2paddle)
- 9.3 [AlexNet](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 9.4 [VGG](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 9.5 [Residual Network](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 9.6 [Inception-v4](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
- 9.7 [Inception-ResNet-v2](https://github.com/PaddlePaddle/models/tree/develop/image_classification)
This tutorial is contributed by [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) and licensed under the [Apache-2.0 license](LICENSE).
......@@ -146,8 +146,8 @@ The Wide & Deep Learning Model\[[3](#参考文献)\] can serve as a relatively mature
Figure 2. Wide & Deep Model
</p>
The Wide part on one side of the model accommodates large-scale sparse features and provides a degree of memorization for specific information (such as IDs),
while the Deep part on the other side learns implicit relationships between features and generalizes better given the same number of features.
### 编写模型输入
......
......@@ -120,7 +120,7 @@ The model structure is as follows:
Figure 2. Wide & Deep Model
</p>
The Wide part at the top of the model can accommodate large-scale sparse features and provides some memorization of specific information (such as IDs), while the Deep part at the bottom of the model learns the implicit relationships between features.
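As an illustrative sketch of this two-part structure (the layer sizes and the `sparse_input`/`dense_input` names are assumptions for illustration, not the example's actual code):

```python
# Wide part: a single linear layer over large-scale sparse features
# (memorization). Deep part: stacked fc layers over dense features
# (generalization). Their outputs are merged for the final CTR estimate.
wide = paddle.layer.fc(
    input=sparse_input, size=1, act=paddle.activation.Linear())
deep = dense_input
for width in [256, 128, 64]:
    deep = paddle.layer.fc(
        input=deep, size=width, act=paddle.activation.Relu())
ctr_out = paddle.layer.fc(
    input=[wide, deep], size=1, act=paddle.activation.Sigmoid())
```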
### Model Input
......
ctr/images/wide_deep.png: binary image updated (139.6 KB -> 150.6 KB)
......@@ -216,49 +216,49 @@ Pairwise Rank reuses the DNN structure above, computing the similarity of one source against two targets
### Regression data format
```
# 3 fields each line:
# - source word list
# - target word list
# - target
<word list> \t <word list> \t <float>
```
For example:
```
苹果 六 袋 苹果 6s 0.1
新手 汽车 驾驶 驾校 培训 0.9
```
### Classification data format
```
# 3 fields each line:
# - source word list
# - target word list
# - target
<word list> \t <word list> \t <label>
```
For example:
```
苹果 六 袋 苹果 6s 0
新手 汽车 驾驶 驾校 培训 1
```
### Ranking data format
```
# 4 fields each line:
# - source word list
# - target1 word list
# - target2 word list
# - label
<word list> \t <word list> \t <word list> \t <label>
```
For example:
```
苹果 六 袋 苹果 6s 新手 汽车 驾驶 1
新手 汽车 驾驶 驾校 培训 苹果 6s 1
```
## Run Training
......
......@@ -190,52 +190,52 @@ Below is a simple example for the data in `./data`
### Regression data format
```
# 3 fields each line:
# - source word list
# - target word list
# - target
<word list> \t <word list> \t <float>
```
An example of this format is as follows.
```
Six bags of apples Apple 6s 0.1
The new driver The driving school 0.9
```
### Classification data format
```
# 3 fields each line:
# - source word list
# - target word list
# - target
<word list> \t <word list> \t <label>
```
An example of this format is as follows.
```
Six bags of apples Apple 6s 0
The new driver The driving school 1
```
### Ranking data format
```
# 4 fields each line:
# - source word list
# - target1 word list
# - target2 word list
# - label
<word list> \t <word list> \t <word list> \t <label>
```
An example of this format is as follows.
```
Six bags of apples Apple 6s The new driver 1
The new driver The driving school Apple 6s 1
```
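To make the three layouts concrete, a minimal line parser might look like the sketch below (illustrative only; this is not the project's actual reader code):

```python
# Sketch: parse one tab-separated line of the regression, classification,
# or ranking format into word lists plus a numeric label.
def parse_line(line, task):
    fields = line.rstrip("\n").split("\t")
    if task == "ranking":
        source, target1, target2, label = fields
        return source.split(), target1.split(), target2.split(), int(label)
    source, target, label = fields
    score = float(label) if task == "regression" else int(label)
    return source.split(), target.split(), score
```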
## Training
......
Image Classification
=======================
This tutorial shows how to classify images in PaddlePaddle with the AlexNet, VGG, GoogLeNet, ResNet, Inception-v4, and Inception-ResNet-v2 models. For a description of the image classification problem and an introduction to these models, see the [PaddlePaddle book](https://github.com/PaddlePaddle/book/tree/develop/03.image_classification).
## Training the Model
......@@ -11,6 +11,8 @@
```python
import gzip
import argparse
import paddle.v2.dataset.flowers as flowers
import paddle.v2 as paddle
import reader
......@@ -19,6 +21,7 @@ import resnet
import alexnet
import googlenet
import inception_v4
import inception_resnet_v2
# PaddlePaddle init
......@@ -30,6 +33,7 @@ paddle.init(use_gpu=False, trainer_count=1)
Set the algorithm parameters (such as the data dimensions, the number of classes, and the batch size), and define the data input layer `image` and the class label `lbl`:
```python
# Use 3 * 331 * 331 or 3 * 299 * 299 for DATA_DIM in Inception-ResNet-v2.
DATA_DIM = 3 * 224 * 224
CLASS_DIM = 102
BATCH_SIZE = 128
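# For reference, the input layers elided by this diff would look roughly
# like this in the paddle.v2 API; a sketch, not the file's exact code.
image = paddle.layer.data(
    name="image", type=paddle.data_type.dense_vector(DATA_DIM))
lbl = paddle.layer.data(
    name="label", type=paddle.data_type.integer_value(CLASS_DIM))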
......@@ -42,7 +46,7 @@ lbl = paddle.layer.data(
### Obtaining the Model
Here you can pick one of the AlexNet, VGG, GoogLeNet, ResNet, Inception-v4, and Inception-ResNet-v2 models for image classification. Calling the corresponding function returns the network's final Softmax layer.
1. Using the AlexNet model
......@@ -89,12 +93,24 @@ out = resnet.resnet_imagenet(image, class_dim=CLASS_DIM)
5. Using the Inception-v4 model
The Inception-v4 model can be obtained with the code below; the model input size used in this example is `3 * 224 * 224`:
```python
out = inception_v4.inception_v4(image, class_dim=CLASS_DIM)
```
6. Using the Inception-ResNet-v2 model
The provided Inception-ResNet-v2 model supports two input sizes, `3 * 331 * 331` and `3 * 299 * 299`, and the dropout probability can be set as desired. It can be used with the following code:
```python
out = inception_resnet_v2.inception_resnet_v2(
    image, class_dim=CLASS_DIM, dropout_rate=0.5, data_dim=DATA_DIM)
```
Note that because its input size differs from the other models, when using Inception-ResNet-v2 together with the provided `reader.py`, first change the size arguments of `paddle.image.simple_transform` in `reader.py` accordingly; a sketch of the adjustment follows.
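As an illustration only (the actual `reader.py` call and its current sizes are not shown in this diff; the resize size of 320 is an assumption):

```python
# Sketch: reader.py transform sized for Inception-ResNet-v2's
# 3 * 299 * 299 input instead of the 224 x 224 used by the other models.
img = paddle.image.simple_transform(img, 320, 299, is_train)
```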
### Defining the Loss Function
```python
......@@ -182,7 +198,7 @@ def event_handler(event):
### Defining the Training Method
For AlexNet, VGG, ResNet, Inception-v4, and Inception-ResNet-v2, the training method can be defined as follows:
```python
# Create trainer
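# Sketch of the elided remainder (assumed paddle.v2 API): build an SGD
# trainer from the cost, parameters, and optimizer defined earlier in
# train.py; names here follow the surrounding text, not the exact file.
trainer = paddle.trainer.SGD(
    cost=cost, parameters=parameters, update_equation=optimizer)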
......
import paddle.v2 as paddle
def conv_bn_layer(input,
ch_out,
filter_size,
stride,
padding=0,
active_type=paddle.activation.Relu(),
ch_in=None):
"""layer wrapper assembling convolution and batchnorm layer"""
tmp = paddle.layer.img_conv(
input=input,
filter_size=filter_size,
num_channels=ch_in,
num_filters=ch_out,
stride=stride,
padding=padding,
act=paddle.activation.Linear(),
bias_attr=False)
return paddle.layer.batch_norm(input=tmp, epsilon=0.001, act=active_type)
def sequential_block(input, *layers):
"""helper function for sequential layers"""
for layer in layers:
layer_func, layer_conf = layer
input = layer_func(input, **layer_conf)
return input
def mixed_5b_block(input):
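    """Inception-A block (mixed_5b): four parallel branches concatenated into 320 channels."""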
branch0 = conv_bn_layer(
input, ch_in=192, ch_out=96, filter_size=1, stride=1)
branch1 = sequential_block(input, (conv_bn_layer, {
"ch_in": 192,
"ch_out": 48,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 48,
"ch_out": 64,
"filter_size": 5,
"stride": 1,
"padding": 2
}))
branch2 = sequential_block(input, (conv_bn_layer, {
"ch_in": 192,
"ch_out": 64,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 64,
"ch_out": 96,
"filter_size": 3,
"stride": 1,
"padding": 1
}), (conv_bn_layer, {
"ch_in": 96,
"ch_out": 96,
"filter_size": 3,
"stride": 1,
"padding": 1
}))
branch3 = sequential_block(
input,
(paddle.layer.img_pool, {
"pool_size": 3,
"stride": 1,
"padding": 1,
"pool_type": paddle.pooling.Avg(),
"exclude_mode": False
}),
(conv_bn_layer, {
"ch_in": 192,
"ch_out": 64,
"filter_size": 1,
"stride": 1
}), )
out = paddle.layer.concat(input=[branch0, branch1, branch2, branch3])
return out
def block35(input, scale=1.0):
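    """Inception-ResNet-A block (35x35 grid): branches are concatenated, projected back to 320 channels, scaled by `scale`, and added to the input."""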
branch0 = conv_bn_layer(
input, ch_in=320, ch_out=32, filter_size=1, stride=1)
branch1 = sequential_block(input, (conv_bn_layer, {
"ch_in": 320,
"ch_out": 32,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 32,
"ch_out": 32,
"filter_size": 3,
"stride": 1,
"padding": 1
}))
branch2 = sequential_block(input, (conv_bn_layer, {
"ch_in": 320,
"ch_out": 32,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 32,
"ch_out": 48,
"filter_size": 3,
"stride": 1,
"padding": 1
}), (conv_bn_layer, {
"ch_in": 48,
"ch_out": 64,
"filter_size": 3,
"stride": 1,
"padding": 1
}))
out = paddle.layer.concat(input=[branch0, branch1, branch2])
out = paddle.layer.img_conv(
input=out,
filter_size=1,
num_channels=128,
num_filters=320,
stride=1,
padding=0,
act=paddle.activation.Linear(),
bias_attr=None)
out = paddle.layer.slope_intercept(out, slope=scale, intercept=0.0)
out = paddle.layer.addto(input=[input, out], act=paddle.activation.Relu())
return out
def mixed_6a_block(input):
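    """Reduction-A block (mixed_6a): downsamples 35x35 -> 17x17 and widens 320 -> 1088 channels (384 + 384 + 320)."""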
branch0 = conv_bn_layer(
input, ch_in=320, ch_out=384, filter_size=3, stride=2)
branch1 = sequential_block(input, (conv_bn_layer, {
"ch_in": 320,
"ch_out": 256,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 256,
"ch_out": 256,
"filter_size": 3,
"stride": 1,
"padding": 1
}), (conv_bn_layer, {
"ch_in": 256,
"ch_out": 384,
"filter_size": 3,
"stride": 2
}))
branch2 = paddle.layer.img_pool(
input,
num_channels=320,
pool_size=3,
stride=2,
pool_type=paddle.pooling.Max())
out = paddle.layer.concat(input=[branch0, branch1, branch2])
return out
def block17(input, scale=1.0):
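    """Inception-ResNet-B block (17x17 grid): branches are projected back to 1088 channels, scaled by `scale`, and added to the input."""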
branch0 = conv_bn_layer(
input, ch_in=1088, ch_out=192, filter_size=1, stride=1)
branch1 = sequential_block(input, (conv_bn_layer, {
"ch_in": 1088,
"ch_out": 128,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 128,
"ch_out": 160,
"filter_size": [7, 1],
"stride": 1,
"padding": [3, 0]
}), (conv_bn_layer, {
"ch_in": 160,
"ch_out": 192,
"filter_size": [1, 7],
"stride": 1,
"padding": [0, 3]
}))
out = paddle.layer.concat(input=[branch0, branch1])
out = paddle.layer.img_conv(
input=out,
filter_size=1,
num_channels=384,
num_filters=1088,
stride=1,
padding=0,
act=paddle.activation.Linear(),
bias_attr=None)
out = paddle.layer.slope_intercept(out, slope=scale, intercept=0.0)
out = paddle.layer.addto(input=[input, out], act=paddle.activation.Relu())
return out
def mixed_7a_block(input):
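    """Reduction-B block (mixed_7a): downsamples 17x17 -> 8x8 and widens 1088 -> 2080 channels (384 + 288 + 320 + 1088)."""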
branch0 = sequential_block(
input,
(conv_bn_layer, {
"ch_in": 1088,
"ch_out": 256,
"filter_size": 1,
"stride": 1
}),
(conv_bn_layer, {
"ch_in": 256,
"ch_out": 384,
"filter_size": 3,
"stride": 2
}), )
branch1 = sequential_block(
input,
(conv_bn_layer, {
"ch_in": 1088,
"ch_out": 256,
"filter_size": 1,
"stride": 1
}),
(conv_bn_layer, {
"ch_in": 256,
"ch_out": 288,
"filter_size": 3,
"stride": 2
}), )
branch2 = sequential_block(input, (conv_bn_layer, {
"ch_in": 1088,
"ch_out": 256,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 256,
"ch_out": 288,
"filter_size": 3,
"stride": 1,
"padding": 1
}), (conv_bn_layer, {
"ch_in": 288,
"ch_out": 320,
"filter_size": 3,
"stride": 2
}))
branch3 = paddle.layer.img_pool(
input,
num_channels=1088,
pool_size=3,
stride=2,
pool_type=paddle.pooling.Max())
out = paddle.layer.concat(input=[branch0, branch1, branch2, branch3])
return out
def block8(input, scale=1.0, no_relu=False):
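    """Inception-ResNet-C block (8x8 grid): branches are projected back to 2080 channels; no_relu=True keeps the merged output linear."""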
branch0 = conv_bn_layer(
input, ch_in=2080, ch_out=192, filter_size=1, stride=1)
branch1 = sequential_block(input, (conv_bn_layer, {
"ch_in": 2080,
"ch_out": 192,
"filter_size": 1,
"stride": 1
}), (conv_bn_layer, {
"ch_in": 192,
"ch_out": 224,
"filter_size": [3, 1],
"stride": 1,
"padding": [1, 0]
}), (conv_bn_layer, {
"ch_in": 224,
"ch_out": 256,
"filter_size": [1, 3],
"stride": 1,
"padding": [0, 1]
}))
out = paddle.layer.concat(input=[branch0, branch1])
out = paddle.layer.img_conv(
input=out,
filter_size=1,
num_channels=448,
num_filters=2080,
stride=1,
padding=0,
act=paddle.activation.Linear(),
bias_attr=None)
out = paddle.layer.slope_intercept(out, slope=scale, intercept=0.0)
out = paddle.layer.addto(
input=[input, out],
act=paddle.activation.Linear() if no_relu else paddle.activation.Relu())
return out
def inception_resnet_v2(input,
class_dim,
dropout_rate=0.5,
data_dim=3 * 331 * 331):
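    """Assemble the Inception-ResNet-v2 network up to the final Softmax.

    data_dim must be 3 * 331 * 331 or 3 * 299 * 299; it only determines
    the size of the final average-pooling window (9 or 8).
    """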
conv2d_1a = conv_bn_layer(
input, ch_in=3, ch_out=32, filter_size=3, stride=2)
conv2d_2a = conv_bn_layer(
conv2d_1a, ch_in=32, ch_out=32, filter_size=3, stride=1)
conv2d_2b = conv_bn_layer(
conv2d_2a, ch_in=32, ch_out=64, filter_size=3, stride=1, padding=1)
maxpool_3a = paddle.layer.img_pool(
input=conv2d_2b, pool_size=3, stride=2, pool_type=paddle.pooling.Max())
conv2d_3b = conv_bn_layer(
maxpool_3a, ch_in=64, ch_out=80, filter_size=1, stride=1)
conv2d_4a = conv_bn_layer(
conv2d_3b, ch_in=80, ch_out=192, filter_size=3, stride=1)
maxpool_5a = paddle.layer.img_pool(
input=conv2d_4a, pool_size=3, stride=2, pool_type=paddle.pooling.Max())
mixed_5b = mixed_5b_block(maxpool_5a)
repeat = sequential_block(mixed_5b, *([(block35, {"scale": 0.17})] * 10))
mixed_6a = mixed_6a_block(repeat)
repeat1 = sequential_block(mixed_6a, *([(block17, {"scale": 0.10})] * 20))
mixed_7a = mixed_7a_block(repeat1)
repeat2 = sequential_block(mixed_7a, *([(block8, {"scale": 0.20})] * 9))
block_8 = block8(repeat2, no_relu=True)
conv2d_7b = conv_bn_layer(
block_8, ch_in=2080, ch_out=1536, filter_size=1, stride=1)
avgpool_1a = paddle.layer.img_pool(
input=conv2d_7b,
pool_size=8 if data_dim == 3 * 299 * 299 else 9,
stride=1,
pool_type=paddle.pooling.Avg(),
exclude_mode=False)
drop_out = paddle.layer.dropout(input=avgpool_1a, dropout_rate=dropout_rate)
out = paddle.layer.fc(
input=drop_out, size=class_dim, act=paddle.activation.Softmax())
return out
import os
import gzip
import argparse
import numpy as np
from PIL import Image
import paddle.v2 as paddle
import reader
import vgg
......@@ -6,14 +11,9 @@ import resnet
import alexnet
import googlenet
import inception_v4
import inception_resnet_v2
DATA_DIM = 3 * 224 * 224 # Use 3 * 331 * 331 or 3 * 299 * 299 for Inception-ResNet-v2.
CLASS_DIM = 102
......@@ -29,7 +29,7 @@ def main():
help='The model for image classification',
choices=[
'alexnet', 'vgg13', 'vgg16', 'vgg19', 'resnet', 'googlenet',
'inception-resnet-v2', 'inception_v4'
])
parser.add_argument(
'params_path', help='The file which stores the parameters')
......@@ -53,6 +53,10 @@ def main():
out = resnet.resnet_imagenet(image, class_dim=CLASS_DIM)
elif args.model == 'googlenet':
out, _, _ = googlenet.googlenet(image, class_dim=CLASS_DIM)
elif args.model == 'inception-resnet-v2':
assert DATA_DIM == 3 * 331 * 331 or DATA_DIM == 3 * 299 * 299
out = inception_resnet_v2.inception_resnet_v2(
image, class_dim=CLASS_DIM, dropout_rate=0.5, data_dim=DATA_DIM)
elif args.model == 'inception_v4':
out = inception_v4.inception_v4(image, class_dim=CLASS_DIM)
......
import gzip
import argparse
import paddle.v2.dataset.flowers as flowers
import paddle.v2 as paddle
import reader
......@@ -7,9 +9,9 @@ import resnet
import alexnet
import googlenet
import inception_v4
import inception_resnet_v2
DATA_DIM = 3 * 224 * 224 # Use 3 * 331 * 331 or 3 * 299 * 299 for Inception-ResNet-v2.
CLASS_DIM = 102
BATCH_SIZE = 128
......@@ -22,7 +24,7 @@ def main():
help='The model for image classification',
choices=[
'alexnet', 'vgg13', 'vgg16', 'vgg19', 'resnet', 'googlenet',
'inception-resnet-v2', 'inception_v4'
])
args = parser.parse_args()
......@@ -56,6 +58,10 @@ def main():
input=out2, label=lbl, coeff=0.3)
paddle.evaluator.classification_error(input=out2, label=lbl)
extra_layers = [loss1, loss2]
elif args.model == 'inception-resnet-v2':
assert DATA_DIM == 3 * 331 * 331 or DATA_DIM == 3 * 299 * 299
out = inception_resnet_v2.inception_resnet_v2(
image, class_dim=CLASS_DIM, dropout_rate=0.5, data_dim=DATA_DIM)
elif args.model == 'inception_v4':
out = inception_v4.inception_v4(image, class_dim=CLASS_DIM)
......
......@@ -129,7 +129,7 @@ Some important parameters of the NCE layer are explained below:
size=dict_size,
input=paddle.layer.trans_full_matrix_projection(
hidden_layer, param_attr=paddle.attr.Param(name="nce_w")),
act=paddle.activation.Softmax(),
bias_attr=paddle.attr.Param(name="nce_b"))
```
The `paddle.layer.mixed` in the snippet above must take PaddlePaddle `paddle.layer.×_projection` layers as its input. `paddle.layer.mixed` sums the results of its `projection` inputs (there can be several) to produce its output. `paddle.layer.trans_full_matrix_projection` transposes the parameter $W$ when computing the matrix multiplication.
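As a small sketch of the summing behavior described above (`layer_a` and `layer_b` are hypothetical input layers, not names from this example):

```python
# Sketch: paddle.layer.mixed sums the results of its projection inputs.
combined = paddle.layer.mixed(
    size=128,
    input=[
        paddle.layer.full_matrix_projection(input=layer_a),
        paddle.layer.full_matrix_projection(input=layer_b),
    ])
```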
......