Commit 2e94fbcf authored by Travis CI

Deploy to GitHub Pages: 8c71e093

Parent 9bef9f31
...@@ -194,39 +194,39 @@ ...@@ -194,39 +194,39 @@
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p> <dd><p>Text convolution pooling group.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p> <p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See <li><strong>context_start</strong> (<em>int|None</em>) &#8211; context start position. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; padding parameter attribute of context projection layer.
None if user don&#8217;t care.</li> If false, it means padding always be zero.</li>
<li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; pooling layer bias attr. False if no bias.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care.
False if no bias.</li>
<li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -242,39 +242,39 @@ False if no bias.</li> ...@@ -242,39 +242,39 @@ False if no bias.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p> <dd><p>Text convolution pooling group.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p> <p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See <li><strong>context_start</strong> (<em>int|None</em>) &#8211; context start position. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; padding parameter attribute of context projection layer.
None if user don&#8217;t care.</li> If false, it means padding always be zero.</li>
<li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; pooling layer bias attr. False if no bias.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care.
False if no bias.</li>
<li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -294,36 +294,37 @@ False if no bias.</li> ...@@ -294,36 +294,37 @@ False if no bias.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Convolution, batch normalization, pooling group.</p> <dd><p>Convolution, batch normalization, pooling group.</p>
<p>Img input =&gt; Conv =&gt; BN =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see batch_norm_layer&#8217;s document.</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_layer_attr</strong> (<em>ExtraLayerOutput</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerOutput</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>bn_param_attr</strong> (<em>ParameterAttribute.</em>) &#8211; see batch_norm_layer&#8217;s document.</li> <li><strong>bn_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>bn_bias_attr</strong> &#8211; see batch_norm_layer&#8217;s document.</li> <li><strong>bn_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>bn_layer_attr</strong> &#8211; ParameterAttribute.</li> <li><strong>bn_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer groups output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -347,26 +348,26 @@ False if no bias.</li> ...@@ -347,26 +348,26 @@ False if no bias.</li>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>conv_batchnorm_drop_rate</strong> (<em>list</em>) &#8211; if conv_with_batchnorm[i] is true, <li><strong>conv_batchnorm_drop_rate</strong> (<em>list</em>) &#8211; if conv_with_batchnorm[i] is true,
conv_batchnorm_drop_rate[i] represents the drop rate of each batch norm.</li> conv_batchnorm_drop_rate[i] represents the drop rate of each batch norm.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>conv_num_filter</strong> (<em>int</em>) &#8211; output channels num.</li> <li><strong>conv_num_filter</strong> (<em>list|tuple</em>) &#8211; list of output channels num.</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; pooling filter size.</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; pooling filter size.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; input channels num.</li> <li><strong>num_channels</strong> (<em>int</em>) &#8211; input channels num.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; convolution padding size.</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; convolution padding size.</li>
<li><strong>conv_filter_size</strong> (<em>int</em>) &#8211; convolution filter size.</li> <li><strong>conv_filter_size</strong> (<em>int</em>) &#8211; convolution filter size.</li>
<li><strong>conv_act</strong> (<em>BaseActivation</em>) &#8211; activation funciton after convolution.</li> <li><strong>conv_act</strong> (<em>BaseActivation</em>) &#8211; activation funciton after convolution.</li>
<li><strong>conv_with_batchnorm</strong> (<em>list</em>) &#8211; conv_with_batchnorm[i] represents <li><strong>conv_with_batchnorm</strong> (<em>list</em>) &#8211; if conv_with_batchnorm[i] is true,
if there is a batch normalization after each convolution.</li> there is a batch normalization operation after each convolution.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; pooling stride size.</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; pooling stride size.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling type.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling type.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Convolution param attribute. <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; param attribute of convolution layer,
None means default attribute.</li> None means default attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer&#8217;s output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -380,34 +381,34 @@ None means default attribute.</li> ...@@ -380,34 +381,34 @@ None means default attribute.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple image convolution and pooling group.</p> <dd><p>Simple image convolution and pooling group.</p>
<p>Input =&gt; conv =&gt; pooling</p> <p>Img input =&gt; Conv =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see img_conv_layer for details</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li> <li><strong>bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_conv_layer for details</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer&#8217;s output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -432,13 +433,16 @@ None means default attribute.</li> ...@@ -432,13 +433,16 @@ None means default attribute.</li>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>num_classes</strong> &#8211; </li> <li><strong>num_classes</strong> (<em>int</em>) &#8211; number of class.</li>
<li><strong>input_image</strong> (<em>LayerOutput</em>) &#8211; </li> <li><strong>input_image</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; </li> <li><strong>num_channels</strong> (<em>int</em>) &#8211; input channels num.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"></p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -456,9 +460,9 @@ None means default attribute.</li> ...@@ -456,9 +460,9 @@ None means default attribute.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define calculations that a LSTM unit performs during a single time step. <dd><p>lstmemory_unit defines the caculation process of a LSTM unit during a
This function itself is not a recurrent layer, so it can not be single time step. This function is not a recurrent layer, so it can not be
directly used to process sequence inputs. This function is always used in directly used to process sequence input. This function is always used in
recurrent_group (see layers.py for more details) to implement attention recurrent_group (see layers.py for more details) to implement attention
mechanism.</p> mechanism.</p>
<p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong> <p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong>
...@@ -479,21 +483,21 @@ for more details about LSTM. The link goes as follows: ...@@ -479,21 +483,21 @@ for more details about LSTM. The link goes as follows:
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step</li> <li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute, None means default attribute.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activiation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; last activiation type of lstm.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activiation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activiation type of lstm.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activiation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; state activiation type of lstm.</li>
<li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input-to-hidden projection. <li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input to hidden projection.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden <li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden
projection of the LSTM unit, such as dropout, error clipping.</li> projection of the LSTM unit, such as dropout, error clipping.</li>
<li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra attribute of lstm layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -516,9 +520,9 @@ False means no bias, None means default bias.</li> ...@@ -516,9 +520,9 @@ False means no bias, None means default bias.</li>
<dd><p>lstm_group is a recurrent_group version of Long Short Term Memory. It <dd><p>lstm_group is a recurrent_group version of Long Short Term Memory. It
does exactly the same calculation as the lstmemory layer (see lstmemory in does exactly the same calculation as the lstmemory layer (see lstmemory in
layers.py for the maths) does. A promising benefit is that LSTM memory layers.py for the maths) does. A promising benefit is that LSTM memory
cell states, or hidden states in every time step are accessible to the cell states (or hidden states) in every time step are accessible to the
user. This is especially useful in attention model. If you do not need to user. This is especially useful in attention model. If you do not need to
access the internal states of the lstm, but merely use its outputs, access the internal states of the lstm and merely use its outputs,
it is recommended to use the lstmemory, which is relatively faster than it is recommended to use the lstmemory, which is relatively faster than
lstmemory_group.</p> lstmemory_group.</p>
<p>NOTE: In PaddlePaddle&#8217;s implementation, the following input-to-hidden <p>NOTE: In PaddlePaddle&#8217;s implementation, the following input-to-hidden
...@@ -540,18 +544,18 @@ full_matrix_projection must be included before lstmemory_unit is called.</p> ...@@ -540,18 +544,18 @@ full_matrix_projection must be included before lstmemory_unit is called.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the lstmemory group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of lstmemory group.</li>
<li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step</li> <li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; is lstm reversed</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute, None means default attribute.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; last activation type of lstm.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of lstm.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; state activation type of lstm.</li>
<li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input-to-hidden projection. <li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input to hidden projection.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden <li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden
projection of the LSTM unit, such as dropout, error clipping.</li> projection of the LSTM unit, such as dropout, error clipping.</li>
...@@ -576,34 +580,34 @@ projection of the LSTM unit, such as dropout, error clipping.</li> ...@@ -576,34 +580,34 @@ projection of the LSTM unit, such as dropout, error clipping.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple LSTM Cell.</p> <dd><p>Simple LSTM Cell.</p>
<p>It just combine a mixed layer with full_matrix_projection and a lstmemory <p>It just combines a mixed layer with full_matrix_projection and a lstmemory
layer. The simple lstm cell was implemented as follow equations.</p> layer. The simple lstm cell is implemented with the following equations.</p>
<div class="math"> <div class="math">
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{xc}x_t+W_{hc}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div> \[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{xc}x_t+W_{hc}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div>
<p>Please refer <strong>Generating Sequences With Recurrent Neural Networks</strong> if you <p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong> for more
want to know what lstm is. <a class="reference external" href="http://arxiv.org/abs/1308.0850">Link</a> is here.</p> details about lstm. <a class="reference external" href="http://arxiv.org/abs/1308.0850">Link</a> is here.</p>
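<p>The equations above can be sketched as a single time step in plain Python. This is an illustrative sketch, not PaddlePaddle's implementation: the names <code>lstm_step</code>, <code>zero_params</code> and the parameter dictionary layout are hypothetical, and the peephole weights (W_ci, W_cf, W_co) are assumed to be diagonal, so they are stored as vectors and applied element-wise.</p>

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # W is a list of rows; returns the matrix-vector product W v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def zero_params(input_dim, size):
    # Hypothetical helper: builds an all-zero parameter dictionary.
    p = {}
    for g in "ifco":
        p["W_x" + g] = [[0.0] * input_dim for _ in range(size)]
        p["W_h" + g] = [[0.0] * size for _ in range(size)]
        p["b_" + g] = [0.0] * size
    for g in "ifo":
        # Assumption: diagonal peephole weights, kept as vectors.
        p["w_c" + g] = [0.0] * size
    return p


def lstm_step(x_t, h_prev, c_prev, p):
    """Compute (h_t, c_t) from x_t, h_{t-1}, c_{t-1} following the equations."""

    def pre(gate, peep=None):
        # W_x* x_t + W_h* h_{t-1} + b_*, plus an optional peephole term.
        wx = matvec(p["W_x" + gate], x_t)
        wh = matvec(p["W_h" + gate], h_prev)
        out = [a + b + bias for a, b, bias in zip(wx, wh, p["b_" + gate])]
        if peep is not None:
            out = [o + w * c for o, w, c in zip(out, p["w_c" + gate], peep)]
        return out

    i_t = [sigmoid(v) for v in pre("i", c_prev)]   # input gate
    f_t = [sigmoid(v) for v in pre("f", c_prev)]   # forget gate
    g_t = [math.tanh(v) for v in pre("c")]         # candidate cell input
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    o_t = [sigmoid(v) for v in pre("o", c_t)]      # output gate peeks at c_t
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

<p>Note that the output gate uses the new cell state c_t while the input and forget gates use c_{t-1}, which mirrors the equations as written.</p>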
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>mat_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; mixed layer&#8217;s matrix projection parameter attribute.</li> <li><strong>mat_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of matrix projection in mixed layer.</li>
<li><strong>bias_param_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None <li><strong>bias_param_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None
means default bias.</li> means default bias.</li>
<li><strong>inner_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; lstm cell parameter attribute.</li> <li><strong>inner_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of lstm cell.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; last activation type of lstm.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of lstm.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; state activation type of lstm.</li>
<li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra attribute of mixed layer.</li>
<li><strong>lstm_cell_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_cell_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra attribute of lstm.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstm layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -620,8 +624,8 @@ means default bias.</li> ...@@ -620,8 +624,8 @@ means default bias.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input <dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input
sequence both in forward and bardward orders, and then concatenate two sequence both in forward and backward orders, and then concatenate two
outputs form a final output. However, concatenation of two outputs outputs to form a final output. However, concatenation of two outputs
is not the only way to form the final output, you can also, for example, is not the only way to form the final output, you can also, for example,
just add them together.</p> just add them together.</p>
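<p>The forward/backward combination described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the library's code: <code>run_rnn</code> and <code>bidirectional</code> are hypothetical names, the step functions stand in for an LSTM or GRU step, and hidden states are plain lists. It also assumes that, when only the final outputs are wanted, the backward direction contributes the state it reaches after consuming the whole sequence, which sits at input position 0.</p>

```python
def run_rnn(xs, step, h0, reverse=False):
    # Apply `step` along the sequence and store each hidden state at the
    # index of the input that produced it.
    order = reversed(range(len(xs))) if reverse else range(len(xs))
    hs = [None] * len(xs)
    h = h0
    for t in order:
        h = step(xs[t], h)
        hs[t] = h
    return hs


def bidirectional(xs, fw_step, bw_step, h0, return_seq=False):
    fw = run_rnn(xs, fw_step, h0)
    bw = run_rnn(xs, bw_step, h0, reverse=True)
    if return_seq:
        # Concatenate the two directions at every time step.
        return [f + b for f, b in zip(fw, bw)]
    # Concatenate only the final state of each direction.
    return fw[-1] + bw[0]
```

<p>Replacing the list concatenation <code>f + b</code> with an element-wise sum gives the "add them together" variant mentioned in the text.</p>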
<p>Please refer to <strong>Neural Machine Translation by Jointly Learning to Align <p>Please refer to <strong>Neural Machine Translation by Jointly Learning to Align
...@@ -640,15 +644,14 @@ The link goes as follows: ...@@ -640,15 +644,14 @@ The link goes as follows:
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are <li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, the last time step of output are
concatenated and returned. concatenated and returned.
If set True, the entire output sequences that are If set True, the entire output sequences in forward
processed in forward and backward directions are and backward directions are concatenated and returned.</li>
concatenated and returned.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">LayerOutput object accroding to the return_seq.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">LayerOutput object.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -667,9 +670,9 @@ concatenated and returned.</li> ...@@ -667,9 +670,9 @@ concatenated and returned.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define calculations that a gated recurrent unit performs in a single time <dd><p>gru_unit defines the calculation process of a gated recurrent unit during a single
step. This function itself is not a recurrent layer, so it can not be time step. This function is not a recurrent layer, so it can not be
directly used to process sequence inputs. This function is always used in directly used to process sequence input. This function is always used in
the recurrent_group (see layers.py for more details) to implement attention the recurrent_group (see layers.py for more details) to implement attention
mechanism.</p> mechanism.</p>
<p>Please see grumemory in layers.py for the details about the maths.</p> <p>Please see grumemory in layers.py for the details about the maths.</p>
...@@ -678,13 +681,13 @@ mechanism.</p> ...@@ -678,13 +681,13 @@ mechanism.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li> <li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -708,11 +711,11 @@ mechanism.</p> ...@@ -708,11 +711,11 @@ mechanism.</p>
does exactly the same calculation as the grumemory layer does. A promising does exactly the same calculation as the grumemory layer does. A promising
benefit is that gru hidden states are accessible to the user. This is benefit is that gru hidden states are accessible to the user. This is
especially useful in attention model. If you do not need to access especially useful in attention model. If you do not need to access
any internal state, but merely use the outputs of a GRU, it is recommended any internal state and merely use the outputs of a GRU, it is recommended
to use the grumemory, which is relatively faster.</p> to use the grumemory, which is relatively faster.</p>
<p>Please see grumemory in layers.py for more detail about the maths.</p> <p>Please see grumemory in layers.py for more detail about the maths.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gur_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gru_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
<span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span> <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
<span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">())</span> <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">())</span>
...@@ -723,15 +726,16 @@ to use the grumemory, which is relatively faster.</p> ...@@ -723,15 +726,16 @@ to use the grumemory, which is relatively faster.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li> <li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of gru layer,
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -751,11 +755,11 @@ to use the grumemory, which is relatively faster.</p> ...@@ -751,11 +755,11 @@ to use the grumemory, which is relatively faster.</p>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>You maybe see gru_step_layer, grumemory in layers.py, gru_unit, gru_group, <dd><p>You may see gru_step_layer, grumemory in layers.py, gru_unit, gru_group,
simple_gru in network.py. The reason why there are so many interfaces is simple_gru in network.py. The reason why there are so many interfaces is
that we have two ways to implement recurrent neural network. One way is to that we have two ways to implement recurrent neural network. One way is to
use one complete layer to implement rnn (including simple rnn, gru and lstm) use one complete layer to implement rnn (including simple rnn, gru and lstm)
with multiple time steps, such as recurrent_layer, lstmemory, grumemory. But, with multiple time steps, such as recurrent_layer, lstmemory, grumemory. But
the multiplication operation <span class="math">\(W x_t\)</span> is not computed in these layers. the multiplication operation <span class="math">\(W x_t\)</span> is not computed in these layers.
See details in their interfaces in layers.py. See details in their interfaces in layers.py.
The other implementation is to use a recurrent group which can ensemble a The other implementation is to use a recurrent group which can ensemble a
...@@ -785,14 +789,15 @@ gru_group, and gru_group is relatively better than simple_gru.</p> ...@@ -785,14 +789,15 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of gru layer,
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -812,8 +817,8 @@ gru_group, and gru_group is relatively better than simple_gru.</p> ...@@ -812,8 +817,8 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru2</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru2</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>simple_gru2 is the same as simple_gru, but using grumemory instead <dd><p>simple_gru2 is the same as simple_gru, but using grumemory instead.
Please see grumemory in layers.py for more detail about the maths. Please refer to grumemory in layers.py for more detail about the math.
simple_gru2 is faster than simple_gru.</p> simple_gru2 is faster than simple_gru.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">simple_gru2</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">simple_gru2</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span>
...@@ -824,14 +829,15 @@ simple_gru2 is faster than simple_gru.</p> ...@@ -824,14 +829,15 @@ simple_gru2 is faster than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of gru layer,
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -852,7 +858,7 @@ simple_gru2 is faster than simple_gru.</p> ...@@ -852,7 +858,7 @@ simple_gru2 is faster than simple_gru.</p>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_gru is a recurrent unit that iterates over the input <dd><p>A bidirectional_gru is a recurrent unit that iterates over the input
sequence both in forward and bardward orders, and then concatenate two sequence both in forward and backward orders, and then concatenate two
outputs to form a final output. However, concatenation of two outputs outputs to form a final output. However, concatenation of two outputs
is not the only way to form the final output, you can also, for example, is not the only way to form the final output, you can also, for example,
just add them together.</p> just add them together.</p>
...@@ -868,11 +874,10 @@ just add them together.</p> ...@@ -868,11 +874,10 @@ just add them together.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional gru layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional gru layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; gru layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; gru layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are <li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, the last time step of output are
concatenated and returned. concatenated and returned.
If set True, the entire output sequences that are If set True, the entire output sequences in forward
processed in forward and backward directions are and backward directions are concatenated and returned.</li>
concatenated and returned.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -893,7 +898,7 @@ concatenated and returned.</li> ...@@ -893,7 +898,7 @@ concatenated and returned.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Calculate and then return a context vector by attention machanism. <dd><p>Calculate and return a context vector with attention mechanism.
Size of the context vector equals to size of the encoded_sequence.</p> Size of the context vector equals to size of the encoded_sequence.</p>
<div class="math"> <div class="math">
\[ \begin{align}\begin{aligned}a(s_{i-1},h_{j}) &amp; = v_{a}f(W_{a}s_{i-1} + U_{a}h_{j})\\e_{i,j} &amp; = a(s_{i-1}, h_{j})\\a_{i,j} &amp; = \frac{exp(e_{i,j})}{\sum_{k=1}^{T_x}{exp(e_{i,k})}}\\c_{i} &amp; = \sum_{j=1}^{T_{x}}a_{i,j}h_{j}\end{aligned}\end{align} \]</div> \[ \begin{align}\begin{aligned}a(s_{i-1},h_{j}) &amp; = v_{a}f(W_{a}s_{i-1} + U_{a}h_{j})\\e_{i,j} &amp; = a(s_{i-1}, h_{j})\\a_{i,j} &amp; = \frac{exp(e_{i,j})}{\sum_{k=1}^{T_x}{exp(e_{i,k})}}\\c_{i} &amp; = \sum_{j=1}^{T_{x}}a_{i,j}h_{j}\end{aligned}\end{align} \]</div>
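<p>The formulas above can be traced step by step with a small pure-Python sketch. The function name <code>attention_context</code> and the helper <code>matvec</code> are hypothetical, the choice of tanh for f is an assumption, and this is not how simple_attention is implemented internally.</p>

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def attention_context(W_a, U_a, v_a, s_prev, H, f=math.tanh):
    """Return (weights a_{i,j}, context c_i) for one decoder state s_prev.

    H is the list of encoder vectors h_j; f = tanh is an assumed activation.
    """
    Ws = matvec(W_a, s_prev)
    scores = []
    for h in H:
        Uh = matvec(U_a, h)
        hidden = [f(a + b) for a, b in zip(Ws, Uh)]
        # e_{i,j} = v_a . f(W_a s_{i-1} + U_a h_j)
        scores.append(sum(v * x for v, x in zip(v_a, hidden)))
    # Softmax over j; subtracting the max only improves numerical stability.
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # c_i = sum_j a_{i,j} h_j, so it has the same size as each h_j.
    dim = len(H[0])
    context = [sum(w * h[d] for w, h in zip(weights, H)) for d in range(dim)]
    return weights, context


# With all-zero parameters every score is equal, so the weights are uniform.
zeros = [[0.0, 0.0], [0.0, 0.0]]
weights, context = attention_context(
    zeros, zeros, [0.0, 0.0], [1.0, 1.0], [[1.0, 0.0], [3.0, 2.0]])
```

<p>The context vector has the same size as each encoded vector, which matches the statement above about the size of the encoded_sequence.</p>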
...@@ -917,8 +922,8 @@ Align and Translate</strong> for more details. The link is as follows: ...@@ -917,8 +922,8 @@ Align and Translate</strong> for more details. The link is as follows:
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li>
<li><strong>softmax_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of sequence softmax <li><strong>softmax_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of sequence softmax
that is used to produce attention weight</li> that is used to produce attention weight.</li>
<li><strong>weight_act</strong> (<em>Activation</em>) &#8211; activation of the attention model</li> <li><strong>weight_act</strong> (<em>BaseActivation</em>) &#8211; activation of the attention model.</li>
<li><strong>encoded_sequence</strong> (<em>LayerOutput</em>) &#8211; output of the encoder</li> <li><strong>encoded_sequence</strong> (<em>LayerOutput</em>) &#8211; output of the encoder</li>
<li><strong>encoded_proj</strong> (<em>LayerOutput</em>) &#8211; attention weight is computed by a feed forward neural <li><strong>encoded_proj</strong> (<em>LayerOutput</em>) &#8211; attention weight is computed by a feed forward neural
network which has two inputs : decoder&#8217;s hidden state network which has two inputs : decoder&#8217;s hidden state
......
The source diff is too large to display. You can view the blob instead.
...@@ -201,39 +201,39 @@ ...@@ -201,39 +201,39 @@
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p> <dd><p>Text convolution pooling group.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p> <p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
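<p>The pipeline above can be sketched in plain Python as follows. All function names here are hypothetical, zero padding and max pooling are assumptions made for the illustration, and the real layers are configured through context_projection, the FC layer and pooling_layer rather than computed this way.</p>

```python
import math


def context_projection(seq, context_len, context_start):
    """Concatenate context_len neighbouring vectors per position (zero padded)."""
    zero = [0.0] * len(seq[0])
    projected = []
    for t in range(len(seq)):
        window = []
        for k in range(context_len):
            i = t + context_start + k
            window.extend(seq[i] if 0 <= i < len(seq) else zero)
        projected.append(window)
    return projected


def fc(vector, weights, bias):
    """Fully connected layer with tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, vector)) + b)
            for row, b in zip(weights, bias)]


def max_pool(seq):
    """Max over time for every hidden unit (assumed pooling type)."""
    return [max(column) for column in zip(*seq)]


def conv_pool(seq, context_len, context_start, weights, bias):
    """Text input => context projection => FC => pooling => output."""
    projected = context_projection(seq, context_len, context_start)
    hidden = [fc(v, weights, bias) for v in projected]
    return max_pool(hidden)


seq = [[1.0], [2.0], [3.0]]
projected = context_projection(seq, context_len=3, context_start=-1)
pooled = conv_pool(seq, 3, -1, [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0])
```

<p>The output size equals the number of FC units, independent of the sequence length, because pooling collapses the time dimension.</p>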
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See <li><strong>context_start</strong> (<em>int|None</em>) &#8211; context start position. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; padding parameter attribute of context projection layer.
None if user don&#8217;t care.</li> If False, the padding is always zero.</li>
<li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; pooling layer bias attr. False if no bias.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care.
False if no bias.</li>
<li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -249,39 +249,39 @@ False if no bias.</li> ...@@ -249,39 +249,39 @@ False if no bias.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p> <dd><p>Text convolution pooling group.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p> <p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See <li><strong>context_start</strong> (<em>int|None</em>) &#8211; context start position. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; padding parameter attribute of context projection layer.
None if user don&#8217;t care.</li> If False, the padding is always zero.</li>
<li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute|None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; pooling layer bias attr. False if no bias.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh</li>
<li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care.
False if no bias.</li>
<li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -301,36 +301,37 @@ False if no bias.</li> ...@@ -301,36 +301,37 @@ False if no bias.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Convolution, batch normalization, pooling group.</p> <dd><p>Convolution, batch normalization, pooling group.</p>
<p>Img input =&gt; Conv =&gt; BN =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see batch_norm_layer&#8217;s document.</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_layer_attr</strong> (<em>ExtraLayerOutput</em>) &#8211; see img_conv_layer&#8217;s document.</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerOutput</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>bn_param_attr</strong> (<em>ParameterAttribute.</em>) &#8211; see batch_norm_layer&#8217;s document.</li> <li><strong>bn_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>bn_bias_attr</strong> &#8211; see batch_norm_layer&#8217;s document.</li> <li><strong>bn_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>bn_layer_attr</strong> &#8211; ParameterAttribute.</li> <li><strong>bn_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see batch_norm_layer for details.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer&#8217;s document.</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer groups output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -354,26 +355,26 @@ False if no bias.</li> ...@@ -354,26 +355,26 @@ False if no bias.</li>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>conv_batchnorm_drop_rate</strong> (<em>list</em>) &#8211; if conv_with_batchnorm[i] is true, <li><strong>conv_batchnorm_drop_rate</strong> (<em>list</em>) &#8211; if conv_with_batchnorm[i] is true,
conv_batchnorm_drop_rate[i] represents the drop rate of each batch norm.</li> conv_batchnorm_drop_rate[i] represents the drop rate of each batch norm.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>conv_num_filter</strong> (<em>int</em>) &#8211; output channels num.</li> <li><strong>conv_num_filter</strong> (<em>list|tuple</em>) &#8211; list of output channels num.</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; pooling filter size.</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; pooling filter size.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; input channels num.</li> <li><strong>num_channels</strong> (<em>int</em>) &#8211; input channels num.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; convolution padding size.</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; convolution padding size.</li>
<li><strong>conv_filter_size</strong> (<em>int</em>) &#8211; convolution filter size.</li> <li><strong>conv_filter_size</strong> (<em>int</em>) &#8211; convolution filter size.</li>
<li><strong>conv_act</strong> (<em>BaseActivation</em>) &#8211; activation function after convolution.</li> <li><strong>conv_act</strong> (<em>BaseActivation</em>) &#8211; activation function after convolution.</li>
<li><strong>conv_with_batchnorm</strong> (<em>list</em>) &#8211; conv_with_batchnorm[i] represents <li><strong>conv_with_batchnorm</strong> (<em>list</em>) &#8211; if conv_with_batchnorm[i] is true,
if there is a batch normalization after each convolution.</li> there is a batch normalization operation after each convolution.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; pooling stride size.</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; pooling stride size.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling type.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; pooling type.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Convolution param attribute. <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; param attribute of convolution layer,
None means default attribute.</li> None means default attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer&#8217;s output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -387,34 +388,34 @@ None means default attribute.</li> ...@@ -387,34 +388,34 @@ None means default attribute.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple image convolution and pooling group.</p> <dd><p>Simple image convolution and pooling group.</p>
<p>Input =&gt; conv =&gt; pooling</p> <p>Img input =&gt; Conv =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see img_conv_layer for details</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li> <li><strong>bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>conv_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_conv_layer for details</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_conv_layer for details.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details.</li>
<li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer&#8217;s output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -439,13 +440,16 @@ None means default attribute.</li> ...@@ -439,13 +440,16 @@ None means default attribute.</li>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>num_classes</strong> &#8211; </li> <li><strong>num_classes</strong> (<em>int</em>) &#8211; number of classes.</li>
<li><strong>input_image</strong> (<em>LayerOutput</em>) &#8211; </li> <li><strong>input_image</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; </li> <li><strong>num_channels</strong> (<em>int</em>) &#8211; input channels num.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"></p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -463,9 +467,9 @@ None means default attribute.</li> ...@@ -463,9 +467,9 @@ None means default attribute.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define calculations that a LSTM unit performs during a single time step. <dd><p>lstmemory_unit defines the calculation process of an LSTM unit during a
This function itself is not a recurrent layer, so it can not be single time step. This function is not a recurrent layer, so it can not be
directly used to process sequence inputs. This function is always used in directly used to process sequence input. This function is always used in
recurrent_group (see layers.py for more details) to implement attention recurrent_group (see layers.py for more details) to implement attention
mechanism.</p> mechanism.</p>
<p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong> <p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong>
...@@ -486,21 +490,21 @@ for more details about LSTM. The link goes as follows: ...@@ -486,21 +490,21 @@ for more details about LSTM. The link goes as follows:
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step</li> <li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute, None means default attribute.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; last activation type of lstm.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of lstm.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; state activation type of lstm.</li>
<li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input-to-hidden projection. <li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input to hidden projection.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden <li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden
projection of the LSTM unit, such as dropout, error clipping.</li> projection of the LSTM unit, such as dropout, error clipping.</li>
<li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra attribute of lstm layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -523,9 +527,9 @@ False means no bias, None means default bias.</li> ...@@ -523,9 +527,9 @@ False means no bias, None means default bias.</li>
<dd><p>lstmemory_group is a recurrent_group version of Long Short Term Memory. It <dd><p>lstmemory_group is a recurrent_group version of Long Short Term Memory. It
does exactly the same calculation as the lstmemory layer (see lstmemory in does exactly the same calculation as the lstmemory layer (see lstmemory in
layers.py for the maths) does. A promising benefit is that LSTM memory layers.py for the maths) does. A promising benefit is that LSTM memory
cell states, or hidden states in every time step are accessible to the cell states(or hidden states) in every time step are accessible to the
user. This is especially useful in attention model. If you do not need to user. This is especially useful in attention model. If you do not need to
access the internal states of the lstm, but merely use its outputs, access the internal states of the lstm and merely use its outputs,
it is recommended to use the lstmemory, which is relatively faster than it is recommended to use the lstmemory, which is relatively faster than
lstmemory_group.</p> lstmemory_group.</p>
<p>NOTE: In PaddlePaddle&#8217;s implementation, the following input-to-hidden <p>NOTE: In PaddlePaddle&#8217;s implementation, the following input-to-hidden
...@@ -547,18 +551,18 @@ full_matrix_projection must be included before lstmemory_unit is called.</p> ...@@ -547,18 +551,18 @@ full_matrix_projection must be included before lstmemory_unit is called.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the lstmemory group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of lstmemory group.</li>
<li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step</li> <li><strong>out_memory</strong> (<em>LayerOutput | None</em>) &#8211; output of previous time step.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; is lstm reversed</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute, None means default attribute.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; last activation type of lstm.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of lstm.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; state activation type of lstm.</li>
<li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input-to-hidden projection. <li><strong>input_proj_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias attribute for input to hidden projection.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden <li><strong>input_proj_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra layer attribute for input to hidden
projection of the LSTM unit, such as dropout, error clipping.</li> projection of the LSTM unit, such as dropout, error clipping.</li>
...@@ -583,34 +587,34 @@ projection of the LSTM unit, such as dropout, error clipping.</li> ...@@ -583,34 +587,34 @@ projection of the LSTM unit, such as dropout, error clipping.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple LSTM Cell.</p> <dd><p>Simple LSTM Cell.</p>
<p>It just combine a mixed layer with full_matrix_projection and a lstmemory <p>It just combines a mixed layer with full_matrix_projection and a lstmemory
layer. The simple lstm cell was implemented as follow equations.</p> layer. The simple lstm cell was implemented with the following equations.</p>
<div class="math"> <div class="math">
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{xc}x_t+W_{hc}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div> \[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{xc}x_t+W_{hc}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div>
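<p>The equations above can be sketched for a single time step in plain NumPy. This is only an illustration, not PaddlePaddle's implementation: the names <code>lstm_step</code> and <code>make_params</code> are hypothetical, and the peephole weights (W_ci, W_cf, W_co) are assumed to be diagonal, so they are stored as vectors and applied elementwise.</p>

```python
import numpy as np


def sigmoid(z):
    """Logistic function used for the input, forget and output gates."""
    return 1.0 / (1.0 + np.exp(-z))


def make_params(input_size, hidden_size, seed=0):
    """Build a hypothetical parameter dict with small random weights."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifco":  # input gate, forget gate, cell candidate, output gate
        p["W_x" + g] = rng.normal(scale=0.1, size=(hidden_size, input_size))
        p["W_h" + g] = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
        p["b_" + g] = np.zeros(hidden_size)
    for g in "ifo":  # peephole weights, assumed diagonal (stored as vectors)
        p["w_c" + g] = rng.normal(scale=0.1, size=hidden_size)
    return p


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following the five equations above."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["w_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["w_cf"] * c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])
    # The output gate peeks at the updated cell state c_t, not c_{t-1}.
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["w_co"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```

<p>Calling <code>lstm_step</code> in a loop over the time steps of a sequence, feeding each returned <code>h_t</code> and <code>c_t</code> back in, gives the recurrent computation that the lstmemory layer performs after the input projection.</p>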
<p>Please refer <strong>Generating Sequences With Recurrent Neural Networks</strong> if you <p>Please refer to <strong>Generating Sequences With Recurrent Neural Networks</strong> for more
want to know what lstm is. <a class="reference external" href="http://arxiv.org/abs/1308.0850">Link</a> is here.</p> details about lstm. <a class="reference external" href="http://arxiv.org/abs/1308.0850">Link</a> is here.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>mat_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; mixed layer&#8217;s matrix projection parameter attribute.</li> <li><strong>mat_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of matrix projection in mixed layer.</li>
<li><strong>bias_param_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None <li><strong>bias_param_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None
means default bias.</li> means default bias.</li>
<li><strong>inner_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; lstm cell parameter attribute.</li> <li><strong>inner_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of lstm cell.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; last activation type of lstm.</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of lstm.</li>
<li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; state activation type of lstm.</li>
<li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra attribute of mixed layer.</li>
<li><strong>lstm_cell_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_cell_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; extra attribute of lstm.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstm layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">layer&#8217;s output.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -627,8 +631,8 @@ means default bias.</li> ...@@ -627,8 +631,8 @@ means default bias.</li>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input <dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input
sequence both in forward and bardward orders, and then concatenate two sequence both in forward and backward orders, and then concatenate two
outputs form a final output. However, concatenation of two outputs outputs to form a final output. However, concatenation of two outputs
is not the only way to form the final output, you can also, for example, is not the only way to form the final output, you can also, for example,
just add them together.</p> just add them together.</p>
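<p>The idea of running a recurrent step in both directions and then combining the two outputs can be sketched as follows. This is a toy NumPy illustration under stated assumptions, not the bidirectional_lstm implementation: <code>run_direction</code> and <code>bidirectional</code> are hypothetical helper names, and <code>step</code> stands in for any recurrent cell that maps an input and a previous state to a new state.</p>

```python
import numpy as np


def run_direction(xs, step, h0, reverse=False):
    """Apply `step` over the sequence, forward or backward in time.

    Outputs are stored at their original time index, so the forward and
    backward results stay aligned step by step.
    """
    order = range(len(xs) - 1, -1, -1) if reverse else range(len(xs))
    outputs = [None] * len(xs)
    h = h0
    for t in order:
        h = step(xs[t], h)
        outputs[t] = h
    return np.stack(outputs)


def bidirectional(xs, step_fwd, step_bwd, h0, combine="concat"):
    """Combine forward and backward outputs by concatenation or addition."""
    fwd = run_direction(xs, step_fwd, h0)
    bwd = run_direction(xs, step_bwd, h0, reverse=True)
    if combine == "concat":
        return np.concatenate([fwd, bwd], axis=-1)
    if combine == "add":
        return fwd + bwd
    raise ValueError("combine must be 'concat' or 'add'")
```

<p>With concatenation the feature dimension doubles, while addition keeps it unchanged; which one is preferable depends on the layers that consume the output.</p>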
<p>Please refer to <strong>Neural Machine Translation by Jointly Learning to Align <p>Please refer to <strong>Neural Machine Translation by Jointly Learning to Align
...@@ -647,15 +651,14 @@ The link goes as follows: ...@@ -647,15 +651,14 @@ The link goes as follows:
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are <li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, the last time step of output are
concatenated and returned. concatenated and returned.
If set True, the entire output sequences that are If set True, the entire output sequences in forward
processed in forward and backward directions are and backward directions are concatenated and returned.</li>
concatenated and returned.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">LayerOutput object accroding to the return_seq.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">LayerOutput object.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
...@@ -674,9 +677,9 @@ concatenated and returned.</li> ...@@ -674,9 +677,9 @@ concatenated and returned.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define calculations that a gated recurrent unit performs in a single time <dd><p>gru_unit defines the calculation process of a gated recurrent unit during a single
step. This function itself is not a recurrent layer, so it can not be time step. This function is not a recurrent layer, so it can not be
directly used to process sequence inputs. This function is always used in directly used to process sequence input. This function is always used in
the recurrent_group (see layers.py for more details) to implement attention the recurrent_group (see layers.py for more details) to implement attention
mechanism.</p> mechanism.</p>
<p>Please see grumemory in layers.py for the details about the maths.</p> <p>Please see grumemory in layers.py for the details about the maths.</p>
...@@ -685,13 +688,13 @@ mechanism.</p> ...@@ -685,13 +688,13 @@ mechanism.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li> <li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -715,11 +718,11 @@ mechanism.</p> ...@@ -715,11 +718,11 @@ mechanism.</p>
does exactly the same calculation as the grumemory layer does. A promising does exactly the same calculation as the grumemory layer does. A promising
benefit is that gru hidden states are accessible to the user. This is benefit is that gru hidden states are accessible to the user. This is
especially useful in attention model. If you do not need to access especially useful in attention model. If you do not need to access
any internal state, but merely use the outputs of a GRU, it is recommended any internal state and merely use the outputs of a GRU, it is recommended
to use the grumemory, which is relatively faster.</p> to use the grumemory, which is relatively faster.</p>
<p>Please see grumemory in layers.py for more detail about the maths.</p> <p>Please see grumemory in layers.py for more detail about the maths.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gur_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gru_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
<span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span> <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
<span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">())</span> <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">())</span>
...@@ -730,15 +733,16 @@ to use the grumemory, which is relatively faster.</p> ...@@ -730,15 +733,16 @@ to use the grumemory, which is relatively faster.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li> <li><strong>memory_boot</strong> (<em>LayerOutput | None</em>) &#8211; the initialization state of the GRU cell.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of gru layer,
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -758,11 +762,11 @@ to use the grumemory, which is relatively faster.</p> ...@@ -758,11 +762,11 @@ to use the grumemory, which is relatively faster.</p>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>You maybe see gru_step_layer, grumemory in layers.py, gru_unit, gru_group, <dd><p>You may see gru_step_layer, grumemory in layers.py, gru_unit, gru_group,
simple_gru in network.py. The reason why there are so many interfaces is simple_gru in network.py. The reason why there are so many interfaces is
that we have two ways to implement recurrent neural network. One way is to that we have two ways to implement recurrent neural network. One way is to
use one complete layer to implement rnn (including simple rnn, gru and lstm) use one complete layer to implement rnn (including simple rnn, gru and lstm)
with multiple time steps, such as recurrent_layer, lstmemory, grumemory. But, with multiple time steps, such as recurrent_layer, lstmemory, grumemory. But
the multiplication operation <span class="math">\(W x_t\)</span> is not computed in these layers. the multiplication operation <span class="math">\(W x_t\)</span> is not computed in these layers.
See details in their interfaces in layers.py. See details in their interfaces in layers.py.
The other implementation is to use a recurrent group which can ensemble a The other implementation is to use a recurrent group which can ensemble a
...@@ -792,14 +796,15 @@ gru_group, and gru_group is relatively better than simple_gru.</p> ...@@ -792,14 +796,15 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of gru layer,
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -819,8 +824,8 @@ gru_group, and gru_group is relatively better than simple_gru.</p> ...@@ -819,8 +824,8 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru2</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru2</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>simple_gru2 is the same as simple_gru, but using grumemory instead <dd><p>simple_gru2 is the same as simple_gru, but using grumemory instead.
Please see grumemory in layers.py for more detail about the maths. Please refer to grumemory in layers.py for more detail about the math.
simple_gru2 is faster than simple_gru.</p> simple_gru2 is faster than simple_gru.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">simple_gru2</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">simple_gru2</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span>
...@@ -831,14 +836,15 @@ simple_gru2 is faster than simple_gru.</p> ...@@ -831,14 +836,15 @@ simple_gru2 is faster than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; process the input in a reverse order or not.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; activation type of gru</li>
<li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; gate activation type of gru</li>
<li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False|None</em>) &#8211; bias parameter attribute of gru layer,
<li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> False means no bias, None means default bias.</li>
<li><strong>gru_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; Extra attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -859,7 +865,7 @@ simple_gru2 is faster than simple_gru.</p> ...@@ -859,7 +865,7 @@ simple_gru2 is faster than simple_gru.</p>
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_gru is a recurrent unit that iterates over the input <dd><p>A bidirectional_gru is a recurrent unit that iterates over the input
sequence both in forward and bardward orders, and then concatenate two sequence both in forward and backward orders, and then concatenate two
outputs to form a final output. However, concatenation of two outputs outputs to form a final output. However, concatenation of two outputs
is not the only way to form the final output, you can also, for example, is not the only way to form the final output, you can also, for example,
just add them together.</p> just add them together.</p>
...@@ -875,11 +881,10 @@ just add them together.</p> ...@@ -875,11 +881,10 @@ just add them together.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional gru layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional gru layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; gru layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; gru layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are <li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, the last time step of output are
concatenated and returned. concatenated and returned.
If set True, the entire output sequences that are If set True, the entire output sequences in forward
processed in forward and backward directions are and backward directions are concatenated and returned.</li>
concatenated and returned.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -900,7 +905,7 @@ concatenated and returned.</li> ...@@ -900,7 +905,7 @@ concatenated and returned.</li>
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Calculate and then return a context vector by attention machanism. <dd><p>Calculate and return a context vector with attention mechanism.
Size of the context vector equals to size of the encoded_sequence.</p> Size of the context vector equals to size of the encoded_sequence.</p>
<div class="math"> <div class="math">
\[ \begin{align}\begin{aligned}a(s_{i-1},h_{j}) &amp; = v_{a}f(W_{a}s_{i-1} + U_{a}h_{j})\\e_{i,j} &amp; = a(s_{i-1}, h_{j})\\a_{i,j} &amp; = \frac{exp(e_{i,j})}{\sum_{k=1}^{T_x}{exp(e_{i,k})}}\\c_{i} &amp; = \sum_{j=1}^{T_{x}}a_{i,j}h_{j}\end{aligned}\end{align} \]</div> \[ \begin{align}\begin{aligned}a(s_{i-1},h_{j}) &amp; = v_{a}f(W_{a}s_{i-1} + U_{a}h_{j})\\e_{i,j} &amp; = a(s_{i-1}, h_{j})\\a_{i,j} &amp; = \frac{exp(e_{i,j})}{\sum_{k=1}^{T_x}{exp(e_{i,k})}}\\c_{i} &amp; = \sum_{j=1}^{T_{x}}a_{i,j}h_{j}\end{aligned}\end{align} \]</div>
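<p>The equations above can be sketched in plain NumPy. This is only an illustrative sketch, not the PaddlePaddle implementation of simple_attention: the function name <code>additive_attention</code>, the argument names, the weight shapes, and the choice of tanh for the activation f are all assumptions made for the example.</p>

```python
import numpy as np


def additive_attention(decoder_state, encoded_sequence, W_a, U_a, v_a, f=np.tanh):
    """Hypothetical helper: compute the context vector c_i and weights a_{i,j}.

    decoder_state:    s_{i-1}, shape (dec_dim,)
    encoded_sequence: h_1..h_{T_x}, shape (T_x, enc_dim)
    W_a: shape (attn_dim, dec_dim); U_a: shape (attn_dim, enc_dim)
    v_a: shape (attn_dim,)
    f:   activation (tanh is an assumption, not taken from the docs)
    """
    # U_a h_j for every encoder step j -> (T_x, attn_dim)
    encoded_proj = encoded_sequence @ U_a.T
    # W_a s_{i-1}, broadcast over all encoder steps -> (attn_dim,)
    state_proj = W_a @ decoder_state
    # e_{i,j} = v_a . f(W_a s_{i-1} + U_a h_j) -> (T_x,)
    scores = f(encoded_proj + state_proj) @ v_a
    # a_{i,j}: softmax over the sequence; subtracting the max is only for
    # numerical stability and does not change the result
    exp_scores = np.exp(scores - scores.max())
    weights = exp_scores / exp_scores.sum()
    # c_i = sum_j a_{i,j} h_j -> (enc_dim,)
    context = weights @ encoded_sequence
    return context, weights


rng = np.random.default_rng(0)
T_x, enc_dim, dec_dim, attn_dim = 5, 8, 6, 4
context, weights = additive_attention(
    decoder_state=rng.normal(size=dec_dim),
    encoded_sequence=rng.normal(size=(T_x, enc_dim)),
    W_a=rng.normal(size=(attn_dim, dec_dim)),
    U_a=rng.normal(size=(attn_dim, enc_dim)),
    v_a=rng.normal(size=attn_dim),
)
print(context.shape, weights.sum())
```

<p>The context vector has the same width as each row of the encoded sequence, matching the size statement in the description, and the attention weights sum to one over the sequence.</p>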
...@@ -924,8 +929,8 @@ Align and Translate</strong> for more details. The link is as follows: ...@@ -924,8 +929,8 @@ Align and Translate</strong> for more details. The link is as follows:
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li>
<li><strong>softmax_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of sequence softmax <li><strong>softmax_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of sequence softmax
that is used to produce attention weight</li> that is used to produce attention weight.</li>
<li><strong>weight_act</strong> (<em>Activation</em>) &#8211; activation of the attention model</li> <li><strong>weight_act</strong> (<em>BaseActivation</em>) &#8211; activation of the attention model.</li>
<li><strong>encoded_sequence</strong> (<em>LayerOutput</em>) &#8211; output of the encoder</li> <li><strong>encoded_sequence</strong> (<em>LayerOutput</em>) &#8211; output of the encoder</li>
<li><strong>encoded_proj</strong> (<em>LayerOutput</em>) &#8211; attention weight is computed by a feed forward neural <li><strong>encoded_proj</strong> (<em>LayerOutput</em>) &#8211; attention weight is computed by a feed forward neural
network which has two inputs : decoder&#8217;s hidden state network which has two inputs : decoder&#8217;s hidden state
......