<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Networks &#8212; PaddlePaddle  documentation</title>
    
    <link rel="stylesheet" href="../../../_static/classic.css" type="text/css" />
    <link rel="stylesheet" href="../../../_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../../../',
        VERSION:     '',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../../../_static/jquery.js"></script>
    <script type="text/javascript" src="../../../_static/underscore.js"></script>
    <script type="text/javascript" src="../../../_static/doctools.js"></script>
    <script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
    <link rel="index" title="Index" href="../../../genindex.html" />
    <link rel="search" title="Search" href="../../../search.html" />
    <link rel="top" title="PaddlePaddle  documentation" href="../../../index.html" />
    <link rel="up" title="Model Config Interface" href="index.html" />
    <link rel="next" title="Evaluators" href="evaluators.html" />
    <link rel="prev" title="Poolings" href="poolings.html" /> 
<script>
var _hmt = _hmt || [];
(function() {
  var hm = document.createElement("script");
  hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
  var s = document.getElementsByTagName("script")[0]; 
  s.parentNode.insertBefore(hm, s);
})();
</script>

  </head>
  <body role="document">
    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../../../genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="../../../py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="evaluators.html" title="Evaluators"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="poolings.html" title="Poolings"
             accesskey="P">previous</a> |</li>
        <li class="nav-item nav-item-0"><a href="../../../index.html">PaddlePaddle  documentation</a> &#187;</li>
          <li class="nav-item nav-item-1"><a href="../../index.html" >User Interface</a> &#187;</li>
          <li class="nav-item nav-item-2"><a href="index.html" accesskey="U">Model Config Interface</a> &#187;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body" role="main">
            
  <div class="section" id="networks">
<h1>Networks<a class="headerlink" href="#networks" title="Permalink to this headline"></a></h1>
<p>The networks module contains building blocks for neural networks, each of which combines multiple layers.</p>
<div class="section" id="nlp">
<h2>NLP<a class="headerlink" href="#nlp" title="Permalink to this headline"></a></h2>
<div class="section" id="sequence-conv-pool">
<h3>sequence_conv_pool<a class="headerlink" href="#sequence-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the output layer (the pooling layer&#8217;s name).</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; the input layer.</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See the
context_projection documentation.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC layer size.</li>
<li><strong>context_start</strong> (<em>int or None</em>) &#8211; context projection start position. See
context_start in the context_projection documentation.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling layer type. See the pooling_layer documentation.</li>
<li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if the user does not care.</li>
<li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; context projection parameter attribute.
None if the user does not care.</li>
<li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; FC layer name. None if the user does not care.</li>
<li><strong>fc_param_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; FC layer parameter attribute. None if the user does not care.</li>
<li><strong>fc_bias_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; FC bias parameter attribute. False for no bias,
None if the user does not care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; FC layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; pooling layer bias attribute. None if the user does not care,
False for no bias.</li>
<li><strong>fc_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; pooling layer extra attribute.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>
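<p>The pipeline described above (context projection, then an FC layer, then pooling over the sequence) can be illustrated with a minimal pure-Python sketch. It does not call PaddlePaddle: the helper names <code>context_project</code>, <code>fc_tanh</code> and <code>max_pool</code> are hypothetical, and zero padding at the sequence boundaries and the choice of max pooling are assumptions made only for illustration.</p>
<div class="highlight-python"><div class="highlight"><pre>import math


def context_project(seq, context_start, context_len):
    """Concatenate context_len neighboring vectors for every position.

    Positions outside the sequence are filled with zeros (an assumption).
    """
    dim = len(seq[0])
    out = []
    for t in range(len(seq)):
        window = []
        for offset in range(context_start, context_start + context_len):
            pos = t + offset
            inside = pos in range(len(seq))
            window.extend(seq[pos] if inside else [0.0] * dim)
        out.append(window)
    return out


def fc_tanh(seq, weights, bias):
    """Apply one fully connected layer with tanh to every time step."""
    return [
        [math.tanh(sum(w * x for w, x in zip(row, vec)) + b)
         for row, b in zip(weights, bias)]
        for vec in seq
    ]


def max_pool(seq):
    """Reduce the sequence to one vector: the maximum of each dimension."""
    return [max(column) for column in zip(*seq)]


# Three time steps with two features each (illustrative values).
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
projected = context_project(text, context_start=-1, context_len=3)
hidden = fc_tanh(projected, weights=[[0.1] * 6, [-0.1] * 6], bias=[0.0, 0.0])
pooled = max_pool(hidden)
</pre></div>
</div>
<p>With <code>context_start=-1</code> and <code>context_len=3</code>, each position sees its left neighbor, itself and its right neighbor, so every projected vector has 3 * 2 = 6 entries, and the pooled output has one entry per hidden unit.</p>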

</div>
<div class="section" id="text-conv-pool">
<h3>text_conv_pool<a class="headerlink" href="#text-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the output layer (the pooling layer&#8217;s name).</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; the input layer.</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See the
context_projection documentation.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC layer size.</li>
<li><strong>context_start</strong> (<em>int or None</em>) &#8211; context projection start position. See
context_start in the context_projection documentation.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling layer type. See the pooling_layer documentation.</li>
<li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if the user does not care.</li>
<li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; context projection parameter attribute.
None if the user does not care.</li>
<li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; FC layer name. None if the user does not care.</li>
<li><strong>fc_param_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; FC layer parameter attribute. None if the user does not care.</li>
<li><strong>fc_bias_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; FC bias parameter attribute. False for no bias,
None if the user does not care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; FC layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute or None</em>) &#8211; pooling layer bias attribute. None if the user does not care,
False for no bias.</li>
<li><strong>fc_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; pooling layer extra attribute.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
</div>
<div class="section" id="images">
<h2>Images<a class="headerlink" href="#images" title="Permalink to this headline"></a></h2>
<div class="section" id="img-conv-bn-pool">
<h3>img_conv_bn_pool<a class="headerlink" href="#img-conv-bn-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Convolution, batch normalization, pooling group.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; the layer&#8217;s input.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see the img_pool_layer documentation.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see the img_pool_layer documentation.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see the batch_norm_layer documentation.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>conv_bias_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>conv_param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>conv_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; see the img_conv_layer documentation.</li>
<li><strong>bn_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see the batch_norm_layer documentation.</li>
<li><strong>bn_bias_attr</strong> &#8211; see the batch_norm_layer documentation.</li>
<li><strong>bn_layer_attr</strong> &#8211; ParameterAttribute.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see the img_pool_layer documentation.</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see the img_pool_layer documentation.</li>
<li><strong>pool_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; see the img_pool_layer documentation.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer groups output</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
<div class="section" id="img-conv-group">
<h3>img_conv_group<a class="headerlink" href="#img-conv-group" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">img_conv_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Image convolution group, used for VGG networks.</p>
<p>TODO(yuyang18): Complete docs</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>conv_batchnorm_drop_rate</strong> &#8211; </li>
<li><strong>input</strong> &#8211; </li>
<li><strong>conv_num_filter</strong> &#8211; </li>
<li><strong>pool_size</strong> &#8211; </li>
<li><strong>num_channels</strong> &#8211; </li>
<li><strong>conv_padding</strong> &#8211; </li>
<li><strong>conv_filter_size</strong> &#8211; </li>
<li><strong>conv_act</strong> &#8211; </li>
<li><strong>conv_with_batchnorm</strong> &#8211; </li>
<li><strong>pool_stride</strong> &#8211; </li>
<li><strong>pool_type</strong> &#8211; </li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"></p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
<div class="section" id="simple-img-conv-pool">
<h3>simple_img_conv_pool<a class="headerlink" href="#simple-img-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple image convolution and pooling group.</p>
<p>Input =&gt; conv =&gt; pooling</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see img_conv_layer for details</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>bias_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; see img_conv_layer for details</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; see img_conv_layer for details</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; see img_conv_layer for details</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; see img_pool_layer for details</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer&#8217;s output</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
<div class="section" id="vgg-16-network">
<h3>vgg_16_network<a class="headerlink" href="#vgg-16-network" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">vgg_16_network</code><span class="sig-paren">(</span><em>input_image</em>, <em>num_channels</em>, <em>num_classes=1000</em><span class="sig-paren">)</span></dt>
<dd><p>The same model as the one published at <a class="reference external" href="https://gist.github.com/ksimonyan/211839e770f7b538e2d8">https://gist.github.com/ksimonyan/211839e770f7b538e2d8</a>.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>num_classes</strong> &#8211; </li>
<li><strong>input_image</strong> (<em>LayerOutput</em>) &#8211; </li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; </li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"></p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
</div>
<div class="section" id="recurrent">
<h2>Recurrent<a class="headerlink" href="#recurrent" title="Permalink to this headline"></a></h2>
<div class="section" id="lstm">
<h3>LSTM<a class="headerlink" href="#lstm" title="Permalink to this headline"></a></h3>
<div class="section" id="lstmemory-unit">
<h4>lstmemory_unit<a class="headerlink" href="#lstmemory-unit" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Defines the calculations that an LSTM unit performs in a single time step.
This function is not itself a recurrent layer, so it cannot be applied
directly to sequence input. It is always used in
recurrent_group (see layers.py for more details) to implement an attention
mechanism.</p>
<p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong>
(<a class="reference external" href="https://arxiv.org/abs/1308.0850">https://arxiv.org/abs/1308.0850</a>)
for more details about LSTM.</p>
<div class="math">
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t \tanh(W_{xc}x_t+W_{hc}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o)\\h_t &amp; = o_t \tanh(c_t)\end{aligned}\end{align} \]</div>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm_step</span> <span class="o">=</span> <span class="n">lstmemory_unit</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
                           <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
                           <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
                           <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">(),</span>
                           <span class="n">state_act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">())</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li>
<li><strong>param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; parameter config. None to use the default.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; LSTM final activation type.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; LSTM gate activation type.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; LSTM state activation type.</li>
<li><strong>mixed_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer.
False means no bias, None means default bias.</li>
<li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li>
<li><strong>mixed_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; lstm layer&#8217;s extra attribute.</li>
<li><strong>get_output_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; get output layer&#8217;s extra attribute.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstmemory unit name.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>
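<p>To make the single-step equations above concrete, here is a minimal pure-Python sketch for the scalar case (input, hidden state and cell state are each one number). It does not use PaddlePaddle; the function name <code>lstm_step</code>, the parameter dictionary and its values are hypothetical and chosen only for illustration.</p>
<div class="highlight-python"><div class="highlight"><pre>import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step for scalar input and state.

    p is a dict of scalar weights and biases keyed by the subscripts
    used in the equations, for example p["W_xi"] or p["b_o"].
    """
    i_t = sigmoid(p["W_xi"] * x_t + p["W_hi"] * h_prev + p["W_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] * x_t + p["W_hf"] * h_prev + p["W_cf"] * c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * math.tanh(p["W_xc"] * x_t + p["W_hc"] * h_prev + p["b_c"])
    # The output gate uses the new cell state c_t, not c_prev.
    o_t = sigmoid(p["W_xo"] * x_t + p["W_ho"] * h_prev + p["W_co"] * c_t + p["b_o"])
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# Hypothetical parameter values chosen only for illustration.
names = ["W_xi", "W_hi", "W_ci", "b_i", "W_xf", "W_hf", "W_cf", "b_f",
         "W_xc", "W_hc", "b_c", "W_xo", "W_ho", "W_co", "b_o"]
params = {name: 0.0 for name in names}
params["W_xc"] = 1.0

# Unroll the step over a short sequence, carrying h and c forward.
h, c = 0.0, 0.0
for x in [0.5, -0.25, 1.0]:
    h, c = lstm_step(x, h, c, params)
</pre></div>
</div>
<p>The loop at the end plays the role that recurrent_group plays in a real configuration: it calls the step once per time step and feeds the previous hidden and cell states back in.</p>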

</div>
<div class="section" id="lstmemory-group">
<h4>lstmemory_group<a class="headerlink" href="#lstmemory-group" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">lstmemory_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>lstmemory_group is a recurrent layer group version of Long Short-Term Memory. It
performs exactly the same calculation as the lstmemory layer (see lstmemory in
layers.py for the maths). Its benefit is that the LSTM memory
cell states and hidden states at every time step are accessible to the
user, which is especially useful in attention models. If you do not need to
access the internal states of the LSTM and only use its outputs,
it is recommended to use lstmemory, which is relatively faster than
lstmemory_group.</p>
<p>NOTE: In PaddlePaddle&#8217;s implementation, the following input-to-hidden
multiplications:
<span class="math">\(W_{xi}x_{t}\)</span> , <span class="math">\(W_{xf}x_{t}\)</span>,
<span class="math">\(W_{xc}x_t\)</span>, <span class="math">\(W_{xo}x_{t}\)</span> are not done in lstmemory_unit to
speed up the calculations. Consequently, an additional mixed_layer with
full_matrix_projection must be included before lstmemory_unit is called.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm_step</span> <span class="o">=</span> <span class="n">lstmemory_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
                            <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
                            <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
                            <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">(),</span>
                            <span class="n">state_act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">())</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory group name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether the LSTM is reversed.</li>
<li><strong>param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; parameter config. None to use the default.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; LSTM final activation type.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; LSTM gate activation type.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; LSTM state activation type.</li>
<li><strong>mixed_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer.
False means no bias, None means default bias.</li>
<li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li>
<li><strong>mixed_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; lstm layer&#8217;s extra attribute.</li>
<li><strong>get_output_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; get output layer&#8217;s extra attribute.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the lstmemory group.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
<div class="section" id="simple-lstm">
<h4>simple_lstm<a class="headerlink" href="#simple-lstm" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple LSTM Cell.</p>
<p>It simply combines a mixed layer that uses full_matrix_projection with an lstmemory
layer. The simple LSTM cell implements the following equations.</p>
<div class="math">
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t \tanh(W_{xc}x_t+W_{hc}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o)\\h_t &amp; = o_t \tanh(c_t)\end{aligned}\end{align} \]</div>
<p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong>
(<a class="reference external" href="http://arxiv.org/abs/1308.0850">link</a>) for an introduction to LSTM.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>mat_param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; mixed layer&#8217;s matrix projection parameter attribute.</li>
<li><strong>bias_param_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None
means default bias.</li>
<li><strong>inner_param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; lstm cell parameter attribute.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li>
<li><strong>mixed_layer_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_cell_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ExtraLayerAttribute" title="paddle.trainer_config_helpers.attrs.ExtraLayerAttribute"><em>ExtraLayerAttribute</em></a>) &#8211; lstm layer&#8217;s extra attribute.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstm layer name.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
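<p>The equations above can be sketched in plain NumPy as a single time step of a peephole LSTM.
This is only an illustration and not PaddlePaddle's implementation: the names lstm_step and
init_params, the parameter dictionary layout, and the element-wise (diagonal) peephole weights
are assumptions made for the example.</p>

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x_t, h_prev, c_prev, p):
    """One peephole LSTM step following the equations above.

    `p` is a hypothetical parameter dict: W_x? has shape (size, input_dim),
    W_h? has shape (size, size), peephole weights W_c? and biases b_? have
    shape (size,). Peepholes are applied element-wise (an assumption).
    """
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["W_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["W_cf"] * c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])
    # The output gate looks at the updated cell state c_t, not c_prev.
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["W_co"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


def init_params(input_dim, size, seed=0):
    """Build small random parameters for the sketch (illustrative only)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "ifco":
        p["W_x" + gate] = rng.normal(scale=0.1, size=(size, input_dim))
        p["W_h" + gate] = rng.normal(scale=0.1, size=(size, size))
        p["b_" + gate] = np.zeros(size)
    for gate in "ifo":  # only the three gates have peephole connections
        p["W_c" + gate] = rng.normal(scale=0.1, size=size)
    return p
```

<p>Calling lstm_step repeatedly over the time steps, feeding back the returned h_t and c_t,
gives the recurrent computation; processing the steps from last to first corresponds to reverse=True.</p>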
</dd></dl>

</div>
<div class="section" id="bidirectional-lstm">
<h4>bidirectional_lstm<a class="headerlink" href="#bidirectional-lstm" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input
sequence in both forward and backward order, and then concatenates the two
outputs to form the final output. Concatenation is not the only way to
combine the two outputs; you can also, for example,
just add them together.</p>
<p>Please refer to <strong>Neural Machine Translation by Jointly Learning to Align
and Translate</strong> for more details about the bidirectional lstm:
<a class="reference external" href="https://arxiv.org/pdf/1409.0473v3.pdf">https://arxiv.org/pdf/1409.0473v3.pdf</a></p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">bi_lstm</span> <span class="o">=</span> <span class="n">bidirectional_lstm</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">input1</span><span class="p">],</span> <span class="n">size</span><span class="o">=</span><span class="mi">512</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are
concatenated and returned.
If set True, the entire output sequences that are
processed in forward and backward directions are
concatenated and returned.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">LayerOutput object according to return_seq.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
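<p>How the two directions are combined, and what return_seq changes, can be sketched as follows.
This is a hedged illustration rather than the library's code: a plain tanh recurrent cell stands in
for the LSTM, the helper names run_rnn and bidirectional are hypothetical, and reading
&#8220;the last time step&#8221; of the backward pass as its final state (input position 0) is an assumption.</p>

```python
import numpy as np


def run_rnn(xs, W, U, reverse=False):
    """Run a plain tanh recurrent cell (a stand-in for the LSTM) over xs.

    xs has shape (T, input_dim). The result has shape (T, size) and stays
    aligned with the input positions even when read in reverse.
    """
    size = U.shape[0]
    h = np.zeros(size)
    outputs = np.zeros((xs.shape[0], size))
    steps = range(xs.shape[0])
    for t in (reversed(steps) if reverse else steps):
        h = np.tanh(W @ xs[t] + U @ h)
        outputs[t] = h
    return outputs


def bidirectional(xs, fwd_params, bwd_params, return_seq=False):
    """Combine a forward and a backward pass by concatenation."""
    fwd = run_rnn(xs, *fwd_params)
    bwd = run_rnn(xs, *bwd_params, reverse=True)
    if return_seq:
        # Whole sequences, concatenated along the feature axis: (T, 2 * size).
        return np.concatenate([fwd, bwd], axis=1)
    # Final state of each direction: the forward pass ends at the last
    # position, the backward pass ends at the first position.
    return np.concatenate([fwd[-1], bwd[0]])
```

<p>Replacing np.concatenate with an element-wise sum would give the &#8220;add them together&#8221;
variant mentioned above, at the cost of requiring both directions to have the same size.</p>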
</dd></dl>

</div>
</div>
<div class="section" id="gru">
<h3>GRU<a class="headerlink" href="#gru" title="Permalink to this headline"></a></h3>
<div class="section" id="gru-unit">
<h4>gru_unit<a class="headerlink" href="#gru-unit" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define the calculations that a gated recurrent unit performs in a single time
step. This function itself is not a recurrent layer, so it cannot be
applied directly to sequence input. It is almost always used inside
a recurrent_group (see layers.py for more details) to implement an attention
mechanism.</p>
<p>Please see grumemory in layers.py for details about the maths.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru output layer.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
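<p>A single-step GRU computation of the kind described above might look like the NumPy sketch below.
The maths is not given on this page, so the sketch assumes the commonly used GRU formulation
(update gate z, reset gate r, candidate state, and h_t = (1 - z) h_{t-1} + z * candidate); the function
name gru_step and the [z, r, candidate] layout of the projected input are likewise assumptions.</p>

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x_proj, h_prev, U_z, U_r, U_c, b):
    """One GRU step on an already projected input.

    x_proj has shape (3 * size,) and is assumed to hold
    [W_z x_t, W_r x_t, W x_t]; the projection itself is computed outside
    this function. b has shape (3 * size,) with the same layout.
    """
    size = h_prev.shape[0]
    xz, xr, xc = x_proj[:size], x_proj[size:2 * size], x_proj[2 * size:]
    bz, br, bc = b[:size], b[size:2 * size], b[2 * size:]
    z = sigmoid(xz + U_z @ h_prev + bz)          # update gate
    r = sigmoid(xr + U_r @ h_prev + br)          # reset gate
    c = np.tanh(xc + U_c @ (r * h_prev) + bc)    # candidate state
    return (1.0 - z) * h_prev + z * c
```

<p>Because the function only needs the previous state and one projected input, it can be called
from any step-by-step loop, which is what makes this form convenient inside an attention decoder.</p>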
</dd></dl>

</div>
<div class="section" id="gru-group">
<h4>gru_group<a class="headerlink" href="#gru-group" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">gru_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>gru_group is a recurrent layer group version of the Gated Recurrent Unit. It
does exactly the same calculation as the grumemory layer. Its main
benefit is that the GRU hidden states are accessible to the user. This is
especially useful in attention models. If you do not need access to
any internal state, but merely use the outputs of a GRU, it is recommended
to use grumemory, which is relatively faster.</p>
<p>Please see grumemory in layers.py for more details about the maths.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gru_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
                <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
                <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
                <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">())</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
<div class="section" id="simple-gru">
<h4>simple_gru<a class="headerlink" href="#simple-gru" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>You may see gru_step_layer and grumemory in layers.py, and gru_unit, gru_group and
simple_gru in networks.py. There are so many interfaces because
there are two ways to implement a recurrent neural network. One way is to
use one complete layer to implement the rnn (including simple rnn, gru and lstm)
over multiple time steps, such as recurrent_layer, lstmemory or grumemory. However,
the multiplication <span class="math">\(W x_t\)</span> is not computed in these layers.
See details in their interfaces in layers.py.
The other way is to use a recurrent group, which can assemble a
series of layers to compute the rnn step by step. This way is flexible for
attention mechanisms or other complex connections.</p>
<ul class="simple">
<li>gru_step_layer: computes only one step of the rnn. It needs a memory as input
and can be used in a recurrent group.</li>
<li>gru_unit: a wrapper of gru_step_layer with memory.</li>
<li>gru_group: a GRU cell implemented by a combination of multiple layers in
a recurrent group.
However, <span class="math">\(W x_t\)</span> is not computed in the group.</li>
<li>grumemory: a GRU cell implemented by one layer, which does the same calculation
as gru_group and is faster than gru_group.</li>
<li>simple_gru: a complete GRU implementation including <span class="math">\(W x_t\)</span> and
gru_group. <span class="math">\(W\)</span> contains <span class="math">\(W_r\)</span>, <span class="math">\(W_z\)</span> and <span class="math">\(W\)</span>; see the
formula in grumemory.</li>
</ul>
<p>In terms of computational speed, grumemory is faster than
gru_group, and gru_group is faster than simple_gru.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">simple_gru</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
</table>
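<p>The split described above, an input projection computed for all time steps followed by a recurrent
scan, can be illustrated with the NumPy sketch below. It is not the library's implementation:
the function name simple_gru_sketch is hypothetical, biases are omitted for brevity, and the
commonly used GRU update rule and the [z, r, candidate] column layout of W are assumptions.</p>

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def simple_gru_sketch(xs, W, U_z, U_r, U_c, reverse=False):
    """xs: (T, input_dim); W: (input_dim, 3 * size); U_*: (size, size)."""
    size = U_z.shape[0]
    # Step 1 (the role of the mixed layer): project every time step at once.
    proj = xs @ W                                   # (T, 3 * size)
    # Step 2 (the recurrent part): scan over time using only the previous state.
    h = np.zeros(size)
    out = np.zeros((xs.shape[0], size))
    order = range(xs.shape[0])
    for t in (reversed(order) if reverse else order):
        xz, xr, xc = np.split(proj[t], 3)
        z = sigmoid(xz + U_z @ h)                   # update gate
        r = sigmoid(xr + U_r @ h)                   # reset gate
        c = np.tanh(xc + U_c @ (r * h))             # candidate state
        h = (1.0 - z) * h + z * c
        out[t] = h
    return out
```

<p>Step 1 is a single matrix product with no dependence between time steps, while step 2 is
inherently sequential; separating them is what allows the projection to be handled outside the recurrent layer.</p>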
</dd></dl>

</div>
</div>
<div class="section" id="simple-attention">
<h3>simple_attention<a class="headerlink" href="#simple-attention" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Calculate and return a context vector using the attention mechanism.
The size of the context vector equals the size of encoded_sequence.</p>
<div class="math">
\[ \begin{align}\begin{aligned}a(s_{i-1},h_{j}) &amp; = v_{a}f(W_{a}s_{i-1} + U_{a}h_{j})\\e_{i,j} &amp; = a(s_{i-1}, h_{j})\\a_{i,j} &amp; = \frac{\exp(e_{i,j})}{\sum_{k=1}^{T_x}{\exp(e_{i,k})}}\\c_{i} &amp; = \sum_{j=1}^{T_{x}}a_{i,j}h_{j}\end{aligned}\end{align} \]</div>
<p>where <span class="math">\(h_{j}\)</span> is the jth element of encoded_sequence,
<span class="math">\(U_{a}h_{j}\)</span> is the jth element of encoded_proj,
<span class="math">\(s_{i-1}\)</span> is decoder_state, and
<span class="math">\(f\)</span> is weight_act, which is set to tanh by default.</p>
<p>Please refer to <strong>Neural Machine Translation by Jointly Learning to
Align and Translate</strong> for more details. The link is as follows:
<a class="reference external" href="https://arxiv.org/abs/1409.0473">https://arxiv.org/abs/1409.0473</a>.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">context</span> <span class="o">=</span> <span class="n">simple_attention</span><span class="p">(</span><span class="n">encoded_sequence</span><span class="o">=</span><span class="n">enc_seq</span><span class="p">,</span>
                           <span class="n">encoded_proj</span><span class="o">=</span><span class="n">enc_proj</span><span class="p">,</span>
                           <span class="n">decoder_state</span><span class="o">=</span><span class="n">decoder_prev</span><span class="p">,)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li>
<li><strong>softmax_param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; parameter attribute of sequence softmax
that is used to produce attention weight</li>
<li><strong>weight_act</strong> (<em>Activation</em>) &#8211; activation of the attention model</li>
<li><strong>encoded_sequence</strong> (<em>LayerOutput</em>) &#8211; output of the encoder</li>
<li><strong>encoded_proj</strong> (<em>LayerOutput</em>) &#8211; the attention weight is computed by a feed-forward neural
network which has two inputs: the decoder&#8217;s hidden state
at the previous time step and the encoder&#8217;s output.
encoded_proj is the output of the feed-forward network for the
encoder&#8217;s output. It is pre-computed outside
simple_attention for speed.</li>
<li><strong>decoder_state</strong> (<em>LayerOutput</em>) &#8211; hidden state of decoder in previous time step</li>
<li><strong>transform_param_attr</strong> (<a class="reference internal" href="attrs.html#paddle.trainer_config_helpers.attrs.ParameterAttribute" title="paddle.trainer_config_helpers.attrs.ParameterAttribute"><em>ParameterAttribute</em></a>) &#8211; parameter attribute of the feed-forward
network that takes decoder_state as inputs to
compute attention weight.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a context vector</p>
</td>
</tr>
</tbody>
</table>
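<p>The attention equations above translate fairly directly into NumPy. The sketch below is only an
illustration under stated assumptions: the function name simple_attention_sketch and the explicit
W_a and v_a arguments are hypothetical (in the real function these weights are learned parameters
configured through transform_param_attr and softmax_param_attr), and f is fixed to tanh.</p>

```python
import numpy as np


def simple_attention_sketch(encoded_sequence, encoded_proj, decoder_state, W_a, v_a):
    """Compute the context vector c_i from the equations above.

    encoded_sequence: (T_x, enc_dim), rows are h_j
    encoded_proj:     (T_x, proj_dim), rows are U_a h_j, pre-computed
    decoder_state:    (dec_dim,), the previous decoder state s_{i-1}
    W_a:              (proj_dim, dec_dim); v_a: (proj_dim,)
    """
    # e_{i,j} = v_a . tanh(W_a s_{i-1} + U_a h_j), evaluated for all j at once.
    scores = np.tanh(encoded_proj + W_a @ decoder_state) @ v_a   # (T_x,)
    # Softmax over source positions; subtracting the max avoids overflow.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # c_i = sum_j a_{i,j} h_j
    context = weights @ encoded_sequence                          # (enc_dim,)
    return context, weights
```

<p>Note that encoded_proj does not depend on the decoder state, which is why it can be computed once
outside the decoding loop and reused at every decoder step.</p>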
</dd></dl>

</div>
</div>
<div class="section" id="miscs">
<h2>Miscs<a class="headerlink" href="#miscs" title="Permalink to this headline"></a></h2>
<div class="section" id="dropout-layer">
<h3>dropout_layer<a class="headerlink" href="#dropout-layer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">dropout_layer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>&#64;TODO(yuyang18): Add comments.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> &#8211; </li>
<li><strong>input</strong> &#8211; </li>
<li><strong>dropout_rate</strong> &#8211; </li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"></p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

</div>
<div class="section" id="outputs">
<h3>outputs<a class="headerlink" href="#outputs" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.trainer_config_helpers.networks.</code><code class="descname">outputs</code><span class="sig-paren">(</span><em>layers</em>, <em>*args</em><span class="sig-paren">)</span></dt>
<dd><p>Declare the outputs of the network. If the user has not defined the inputs of the
network, this method determines the input order by a depth-first traversal.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>layers</strong> (<em>list|tuple|LayerOutput</em>) &#8211; Output layers.</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"></td>
</tr>
</tbody>
</table>
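<p>The idea of deriving an input order by depth-first traversal can be sketched on a toy graph.
Everything here is hypothetical and for illustration only: the function input_order_by_dfs, the
parents dictionary, and the choice to visit feeding layers left to right are assumptions, and the
exact order PaddlePaddle produces is not specified on this page.</p>

```python
def input_order_by_dfs(output_layers, parents):
    """Return input layers in the order a depth-first traversal first meets them.

    `parents` is a hypothetical dict mapping a layer name to the names of the
    layers feeding it; a layer with no parents is treated as a network input.
    """
    order, visited = [], set()

    def visit(layer):
        if layer in visited:
            return
        visited.add(layer)
        feeders = parents.get(layer, [])
        if not feeders:
            order.append(layer)      # reached an input layer
        for feeder in feeders:
            visit(feeder)

    for layer in output_layers:
        visit(layer)
    return order


# Toy network: cost depends on prediction and label; prediction on hidden; hidden on word.
toy_parents = {
    "cost": ["prediction", "label"],
    "prediction": ["hidden"],
    "hidden": ["word"],
}
toy_order = input_order_by_dfs(["cost"], toy_parents)  # ['word', 'label']
```

<p>In this toy graph the traversal reaches word before label because it follows the prediction
branch to its end first; declaring the inputs explicitly avoids relying on such an implicit order.</p>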
</dd></dl>

</div>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
        <div class="sphinxsidebarwrapper">
  <h3><a href="../../../index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Networks</a><ul>
<li><a class="reference internal" href="#nlp">NLP</a><ul>
<li><a class="reference internal" href="#sequence-conv-pool">sequence_conv_pool</a></li>
<li><a class="reference internal" href="#text-conv-pool">text_conv_pool</a></li>
</ul>
</li>
<li><a class="reference internal" href="#images">Images</a><ul>
<li><a class="reference internal" href="#img-conv-bn-pool">img_conv_bn_pool</a></li>
<li><a class="reference internal" href="#img-conv-group">img_conv_group</a></li>
<li><a class="reference internal" href="#simple-img-conv-pool">simple_img_conv_pool</a></li>
<li><a class="reference internal" href="#vgg-16-network">vgg_16_network</a></li>
</ul>
</li>
<li><a class="reference internal" href="#recurrent">Recurrent</a><ul>
<li><a class="reference internal" href="#lstm">LSTM</a><ul>
<li><a class="reference internal" href="#lstmemory-unit">lstmemory_unit</a></li>
<li><a class="reference internal" href="#lstmemory-group">lstmemory_group</a></li>
<li><a class="reference internal" href="#simple-lstm">simple_lstm</a></li>
<li><a class="reference internal" href="#bidirectional-lstm">bidirectional_lstm</a></li>
</ul>
</li>
<li><a class="reference internal" href="#gru">GRU</a><ul>
<li><a class="reference internal" href="#gru-unit">gru_unit</a></li>
<li><a class="reference internal" href="#gru-group">gru_group</a></li>
<li><a class="reference internal" href="#simple-gru">simple_gru</a></li>
</ul>
</li>
<li><a class="reference internal" href="#simple-attention">simple_attention</a></li>
</ul>
</li>
<li><a class="reference internal" href="#miscs">Miscs</a><ul>
<li><a class="reference internal" href="#dropout-layer">dropout_layer</a></li>
<li><a class="reference internal" href="#outputs">outputs</a></li>
</ul>
</li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="poolings.html"
                        title="previous chapter">Poolings</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="evaluators.html"
                        title="next chapter">Evaluators</a></p>
  <div role="note" aria-label="source link">
    <h3>This Page</h3>
    <ul class="this-page-menu">
      <li><a href="../../../_sources/ui/api/trainer_config_helpers/networks.txt"
            rel="nofollow">Show Source</a></li>
    </ul>
   </div>
<div id="searchbox" style="display: none" role="search">
  <h3>Quick search</h3>
    <form class="search" action="../../../search.html" method="get">
      <div><input type="text" name="q" /></div>
      <div><input type="submit" value="Go" /></div>
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../../../genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="../../../py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="evaluators.html" title="Evaluators"
             >next</a> |</li>
        <li class="right" >
          <a href="poolings.html" title="Poolings"
             >previous</a> |</li>
        <li class="nav-item nav-item-0"><a href="../../../index.html">PaddlePaddle  documentation</a> &#187;</li>
          <li class="nav-item nav-item-1"><a href="../../index.html" >User Interface</a> &#187;</li>
          <li class="nav-item nav-item-2"><a href="index.html" >Model Config Interface</a> &#187;</li> 
      </ul>
    </div>
    <div class="footer" role="contentinfo">
        &#169; Copyright 2016, PaddlePaddle developers.
      Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.4.9.
    </div>
  </body>
</html>