Commit 6cdb184c, authored by Travis CI

Deploy to GitHub Pages: fe84517b

Parent 26354f19
......@@ -223,14 +223,15 @@
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; The input layer. Could be a list/tuple of input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
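The bias_attr rule described in this hunk can be sketched in plain Python. This is only an illustration of the documented behaviour, not Paddle's actual implementation: the `ParameterAttribute` class below is a stand-in, `resolve_bias` is a hypothetical helper, and treating `None` as "use a default bias" follows the earlier wording of the docstring and is an assumption.

```python
class ParameterAttribute(object):
    """Stand-in for paddle.v2.attr.ParameterAttribute (illustration only)."""

    def __init__(self, initial_mean=0.0, initial_std=0.0):
        self.initial_mean = initial_mean
        self.initial_std = initial_std


def resolve_bias(bias_attr):
    """Hypothetical helper: map a bias_attr value to a bias setting.

    Returns a ParameterAttribute when a bias should be created,
    or None when the layer should have no bias.
    """
    if isinstance(bias_attr, ParameterAttribute):
        return bias_attr  # a user-supplied configuration is used as-is
    if bias_attr is True or bias_attr is None:
        # True: zero-initialized bias. None: assumed to mean "default bias".
        return ParameterAttribute(initial_mean=0.0, initial_std=0.0)
    return None  # False, or any other non-ParameterAttribute value: no bias


custom = ParameterAttribute(initial_mean=0.5, initial_std=0.1)
print(resolve_bias(custom) is custom)
print(resolve_bias(True).initial_mean)
print(resolve_bias(False))
```

The identity check `bias_attr is True` is deliberate in this sketch, so that truthy values such as `1` fall through to the "no bias" branch like any other non-attribute value.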
......@@ -264,7 +265,7 @@ specified, selective_fc acts exactly like fc.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; The input layer.</li>
<li><strong>select</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The select layer. The output of the select layer should be a
sparse binary matrix, which is treated as the mask of selective fc.
......@@ -272,9 +273,10 @@ If is None, acts exactly like fc.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -424,7 +426,7 @@ the right size (which is the end of array) to the left.</li>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer a.</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer b.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; layer&#8217;s extra attribute.</li>
......@@ -478,7 +480,7 @@ rest channels will be processed by rest group of filters.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Layer Input.</li>
<li><strong>filter_size</strong> (<em>int|tuple|list</em>) &#8211; The x dimension of a filter kernel, or a tuple giving
both image dimensions.</li>
......@@ -497,8 +499,10 @@ image dimension</li>
<li><strong>dilation</strong> (<em>int|tuple|list</em>) &#8211; The x dimension of the dilation, or a tuple giving both
image dimensions.</li>
<li><strong>dilation_y</strong> (<em>int</em>) &#8211; The y dimension of the dilation.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Convolution bias attribute. None means default bias.
False means no bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; The number of input channels. If None, it is set
automatically from the previous layer&#8217;s output.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Convolution param attribute. None means default attribute</li>
......@@ -569,15 +573,15 @@ parameter attribute is set by this parameter.</li>
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">row_conv</code></dt>
<dd><p>Row convolution is also called lookahead convolution. It was first
introduced in paper of <a class="reference external" href="https://arxiv.org/pdf/1512.02595v1.pdf">Deep Speech 2: End-toEnd Speech Recognition
introduced in paper of <a class="reference external" href="https://arxiv.org/pdf/1512.02595v1.pdf">Deep Speech 2: End-to-End Speech Recognition
in English and Mandarin</a> .</p>
<p>A bidirectional RNN learns a representation for a sequence by
performing a forward and a backward pass through the entire sequence.
However, unlike unidirectional RNNs, bidirectional RNNs are challenging
to deploy in an online and low-latency setting. The lookahead convolution
incorporates information from future subsequences in a computationally
efficient manner to improve unidirectional recurrent neural networks.</p>
<p>The connection of row convolution is different form the 1D sequence
efficient manner to improve unidirectional RNNs.</p>
<p>The connection of row convolution is different from the 1D sequence
convolution. Assume that the future context length is k, that is to say,
the output at timestep t is computed from the input features from the t-th
timestep to the (t+k+1)-th timestep. Assume that the hidden dim of input
......@@ -603,7 +607,7 @@ number plus one equals context_len.</p>
plus one.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is linear activation.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute. If None, the parameter will be
initialized smartly. It&#8217;s better set it by yourself.</li>
initialized smartly. It&#8217;s better to set it by yourself.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
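The lookahead computation described for row_conv can be sketched in plain Python. This is a minimal sketch under stated assumptions, not Paddle's implementation: `row_conv_sketch` is a hypothetical name, each feature dimension is assumed to have its own weight per context step, the window is read as the current timestep plus the following context_len - 1 timesteps, and frames past the end of the sequence are treated as zero. No activation is applied, matching the documented linear default.

```python
def row_conv_sketch(x, w):
    """Lookahead (row) convolution over one sequence.

    x: list of T frames, each a list of D features.
    w: list of context_len rows, each a list of D weights.
    out[t][d] = sum over j of w[j][d] * x[t + j][d], for t + j < T.
    """
    num_steps, dim = len(x), len(x[0])
    out = [[0.0] * dim for _ in range(num_steps)]
    for t in range(num_steps):
        for j in range(len(w)):
            if t + j >= num_steps:
                break  # future frames beyond the sequence contribute nothing
            for d in range(dim):
                out[t][d] += w[j][d] * x[t + j][d]
    return out


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # T = 3, D = 2
weights = [[1.0, 1.0], [0.5, 0.0]]             # context_len = 2
result = row_conv_sketch(frames, weights)
print(result)  # [[2.5, 2.0], [5.5, 4.0], [5.0, 6.0]]
```

Because each output frame only looks a fixed number of steps ahead, the layer can run on streaming input with a bounded delay, which is the motivation the description gives for using it with unidirectional RNNs.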
......@@ -706,7 +710,7 @@ The details please refer to
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; layer&#8217;s input.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; number of input channel.</li>
<li><strong>pool_type</strong> &#8211; Pooling type. MaxPooling or AveragePooling. Default is MaxPooling.</li>
......@@ -771,7 +775,7 @@ s = input.size / num_channels
<li><strong>num_channels</strong> (<em>int|None</em>) &#8211; The channel number of the input layer. If None, it is set
automatically from the previous layer&#8217;s output.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; The group number of input layer.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer, which can not specify.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer attribute.</li>
</ul>
</td>
......@@ -807,7 +811,7 @@ The details please refer to
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; layer&#8217;s input.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; Normalize in number of <span class="math">\(size\)</span> feature maps.</li>
<li><strong>scale</strong> (<em>float</em>) &#8211; The hyper-parameter.</li>
......@@ -855,7 +859,7 @@ y_i &amp;\gets \gamma \hat{x_i} + \beta \qquad &amp;//\ scale\ and\ shift\end{sp
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The batch normalization input. It should preferably use a linear
activation, because batch normalization applies an activation itself.</li>
<li><strong>batch_norm_type</strong> (<em>None|string</em><em>, </em><em>None</em><em> or </em><em>&quot;batch_norm&quot;</em><em> or </em><em>&quot;cudnn_batch_norm&quot;</em>) &#8211; We have batch_norm and cudnn_batch_norm. batch_norm
......@@ -872,7 +876,7 @@ normalization will normalize input near zero.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; The number of image channels, or the number of filters of
the previous layer. If None, it is obtained automatically from the
layer&#8217;s input.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; <span class="math">\(\beta\)</span>, better be zero when initialize. So the
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; <span class="math">\(\beta\)</span>, which is best initialized to zero, so
initial_std=0, initial_mean=0 is best practice.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; <span class="math">\(\gamma\)</span>, which is best initialized to one, so
initial_std=0, initial_mean=1 is best practice.</li>
......@@ -923,7 +927,7 @@ and <span class="math">\(out\)</span> is a (batchSize x dataDim) output vector.<
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -953,7 +957,7 @@ factors which dimensions equal to the channel&#8217;s number.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li>
</ul>
......@@ -993,7 +997,7 @@ and the size of <span class="math">\(out\)</span> is a (batchSize x dataDim) .</
</tr>
<tr class="field-even field"><th class="field-name">type input:</th><td class="field-body">paddle.v2.config_base.Layer</td>
</tr>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">Layer name.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">basestring</td>
</tr>
......@@ -1038,9 +1042,12 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input Layer</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; bias attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the layer</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Layer Attribute.</li>
</ul>
</td>
......@@ -1088,8 +1095,10 @@ more details about LSTM.</p>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type, paddle.v2.activation.Tanh by default. <span class="math">\(h_t\)</span></li>
<li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.</li>
<li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; state activation type, paddle.v2.activation.Tanh by default.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Parameter Attribute.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer attribute</li>
</ul>
......@@ -1156,8 +1165,10 @@ affects the <span class="math">\({\tilde{h_t}}\)</span>.</li>
<li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.
This activation affects the <span class="math">\(z_t\)</span> and <span class="math">\(r_t\)</span>. It is the
<span class="math">\(\sigma\)</span> in the above formula.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Parameter Attribute.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer attribute</li>
</ul>
......@@ -1329,7 +1340,7 @@ output is <span class="math">\(o_t\)</span>, whose name is &#8216;state&#8217; a
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer&#8217;s name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; Layer&#8217;s size. NOTE: lstm layer&#8217;s size, should be equal to
<code class="code docutils literal"><span class="pre">input.size/4</span></code>, and should be equal to
<code class="code docutils literal"><span class="pre">state.size</span></code>.</li>
......@@ -1340,7 +1351,10 @@ output is <span class="math">\(o_t\)</span>, whose name is &#8216;state&#8217; a
be sigmoid only.</li>
<li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; State Activation Type. Default is sigmoid, and should
be sigmoid only.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Bias Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; layer&#8217;s extra attribute.</li>
</ul>
</td>
......@@ -1370,9 +1384,12 @@ be sigmoid only.</li>
<li><strong>output_mem</strong> &#8211; </li>
<li><strong>size</strong> &#8211; </li>
<li><strong>act</strong> &#8211; </li>
<li><strong>name</strong> &#8211; </li>
<li><strong>name</strong> &#8211; The name of this layer. It is optional.</li>
<li><strong>gate_act</strong> &#8211; </li>
<li><strong>bias_attr</strong> &#8211; </li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>param_attr</strong> &#8211; the parameter_attribute for transforming the output_mem
from previous step.</li>
<li><strong>layer_attr</strong> &#8211; </li>
......@@ -1486,7 +1503,7 @@ the output from input.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer&#8217;s name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer from which to get an output. This layer should contain
multiple outputs.</li>
<li><strong>arg_name</strong> (<em>basestring</em>) &#8211; Output name from input.</li>
......@@ -1542,9 +1559,10 @@ Each inputs is a projection or operator.</p>
<li><strong>input</strong> &#8211; inputs layer. It is an optional parameter. If set,
then this function will just return layer&#8217;s name.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; The extra layer config. Default is None.</li>
</ul>
</td>
......@@ -1571,7 +1589,7 @@ default Bias.</li>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Name of this embedding layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer for this embedding. NOTE: must be Index Data.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The embedding dimension.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None</em>) &#8211; The embedding parameter attribute. See paddle.v2.attr.ParameterAttribute
......@@ -1967,12 +1985,15 @@ of stride is -1.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>agg_level</strong> (<em>AggregateLevel</em>) &#8211; AggregateLevel.TO_NO_SEQUENCE or
AggregateLevel.TO_SEQUENCE</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li>
<li><strong>pooling_type</strong> (<em>BasePoolingType|None</em>) &#8211; Type of pooling, MaxPooling(default), AvgPooling,
SumPooling, SquareRootNPooling.</li>
<li><strong>stride</strong> (<em>Int</em>) &#8211; The step size between successive pooling regions.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias parameter attribute. False if no bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; The Extra Attributes for layer, such as dropout.</li>
</ul>
</td>
......@@ -2008,7 +2029,7 @@ of stride is -1.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>agg_level</strong> &#8211; Aggregated level</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>stride</strong> (<em>Int</em>) &#8211; The step size between successive pooling regions.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
......@@ -2046,7 +2067,7 @@ of stride is -1.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>agg_level</strong> &#8211; aggregation level</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>stride</strong> (<em>Int</em>) &#8211; The step size between successive pooling regions.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
......@@ -2080,7 +2101,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>list|tuple|collections.Sequence</em>) &#8211; input layers or projections</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
......@@ -2124,14 +2145,15 @@ processed in one batch.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input sequence layer</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input sequence layer</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
</ul>
</td>
</tr>
......@@ -2176,7 +2198,7 @@ will be sliced for multiple times.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input for this layer, it should be a sequence.</li>
<li><strong>starts</strong> (<em>paddle.v2.config_base.Layer|None</em>) &#8211; start indices to slice the input sequence.</li>
<li><strong>ends</strong> (<em>paddle.v2.config_base.Layer|None</em>) &#8211; end indices to slice the input sequence.</li>
......@@ -2218,7 +2240,7 @@ beam training.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; A nested sequence.</li>
<li><strong>selected_indices</strong> &#8211; a set of sequence indices in the nested sequence.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
</ul>
</td>
</tr>
......@@ -2278,7 +2300,7 @@ convolution neural network, and before recurrent neural network.</p>
<li><strong>stride_y</strong> (<em>int</em>) &#8211; The stride size in vertical direction.</li>
<li><strong>padding_x</strong> (<em>int</em>) &#8211; The padding size in horizontal direction.</li>
<li><strong>padding_y</strong> (<em>int</em>) &#8211; The padding size in vertical direction.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer, which can not specify.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -2332,9 +2354,11 @@ sequence is one) to sequence data.&#8221;</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer</li>
<li><strong>expand_as</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Expand as this layer&#8217;s sequence info.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If set to
True, the bias is initialized to zero. If set to False or to any other value
that is not a paddle.v2.attr.ParameterAttribute, no bias is defined.</li>
<li><strong>expand_level</strong> (<em>ExpandLevel</em>) &#8211; whether input layer is timestep(default) or sequence.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
......@@ -2378,7 +2402,7 @@ bias.</li>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer</li>
<li><strong>num_repeats</strong> (<em>int</em>) &#8211; Repeat the input so many times</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>as_row_vector</strong> (<em>bool</em>) &#8211; True for treating input as row vector and repeating
in the column direction. This is equivalent to applying
concat() to num_repeats copies of the same input.
......@@ -2423,7 +2447,7 @@ usually used when the input sample is some image or feature map.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>height</strong> (<em>int</em>) &#8211; The height of the sample matrix</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2459,12 +2483,13 @@ output sequence has T*M/N instances, the dimension of each instance is N.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>reshape_size</strong> (<em>int</em>) &#8211; the size of reshaped sequence.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or to any value that is not a paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
</ul>
</td>
</tr>
......@@ -2513,12 +2538,14 @@ Please refer to dropout for details.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; Input layers. It could be a paddle.v2.config_base.Layer or list/tuple of
paddle.v2.config_base.Layer.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type, default is tanh.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|bool</em>) &#8211; Bias attribute. If False, means no bias. None is default
bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or to any value that is not a paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer attribute.</li>
</ul>
</td>
......@@ -2581,7 +2608,7 @@ processed in one batch.</p>
<li><strong>weights</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The weight layer.</li>
<li><strong>vectors</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The vector layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; the dimension of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -2620,7 +2647,7 @@ which is used in NEURAL TURING MACHINE.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>list|tuple</em>) &#8211; Input layer.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Weight layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2693,7 +2720,7 @@ and <span class="math">\(y\)</span> is a output vector.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Weight layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2732,7 +2759,7 @@ processed in one batch.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Weight layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2768,7 +2795,7 @@ ight)</p>
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The Layer Name.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">basestring</td>
</tr>
......@@ -2813,7 +2840,7 @@ element-wise. There is no activation and weight.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>slope</strong> (<em>float.</em>) &#8211; the scale factor.</li>
<li><strong>intercept</strong> (<em>float.</em>) &#8211; the offset.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
......@@ -2860,15 +2887,16 @@ For example, each sample:</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer a.</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer b.</li>
<li><strong>size</strong> (<em>int.</em>) &#8211; the layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or to any value that is not a paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -2907,7 +2935,7 @@ processed in one batch.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer a</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer b</li>
<li><strong>scale</strong> (<em>float</em>) &#8211; scale for cosine value. default is 5.</li>
......@@ -2946,7 +2974,7 @@ processed in one batch.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2982,10 +3010,13 @@ bias are trainable.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The input layer.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The parameter attribute of scaling.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The parameter attribute of shifting.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or to any value that is not a paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
</ul>
</td>
</tr>
......@@ -3020,7 +3051,7 @@ The result is stored in output.ids.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -3053,7 +3084,7 @@ Sampling one id for one sample.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -3097,7 +3128,7 @@ For each index i from 0 to batchSize -1, the output is the i-th row of the
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>list of paddle.v2.config_base.Layer</em>) &#8211; Input layers.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -3167,7 +3198,7 @@ in width dimension.</p>
<li><strong>pad_h</strong> (<em>list|None</em>) &#8211; padding size in height dimension.</li>
<li><strong>pad_w</strong> (<em>list|None</em>) &#8211; padding size in width dimension.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
</ul>
</td>
</tr>
......@@ -3203,7 +3234,7 @@ in width dimension.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The first input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float.</em>) &#8211; The cost is multiplied with coeff.
The coefficient affects the gradient in the backward.</li>
<li><strong>weight</strong> (<em>LayerOutput</em>) &#8211; The cost of each sample is multiplied with each weight.
......@@ -3243,7 +3274,7 @@ Input should be a vector of positive numbers, without normalization.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The first input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float.</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>softmax_selfnorm_alpha</strong> (<em>float.</em>) &#8211; The scale factor affects the cost.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
......@@ -3279,7 +3310,7 @@ Input should be a vector of positive numbers, without normalization.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The first input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3328,7 +3359,7 @@ ight <a href="#id2"><span class="problematic" id="id3">|</span></a>leq delta</p>
</tr>
<tr class="field-even field"><th class="field-name">type input:</th><td class="field-body">paddle.v2.config_base.Layer.</td>
</tr>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layers. It is not necessary.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">None|basestring.</td>
</tr>
......@@ -3387,7 +3418,7 @@ a true binary class label :math:<a href="#id6"><span class="problematic" id="id7
</tr>
<tr class="field-even field"><th class="field-name">type input:</th><td class="field-body">paddle.v2.config_base.Layer.</td>
</tr>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layers. It is not necessary.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">None|basestring.</td>
</tr>
......@@ -3433,7 +3464,7 @@ a true binary class label :math:<a href="#id6"><span class="problematic" id="id7
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Samples of the same query should be loaded as sequence.</li>
<li><strong>score</strong> &#8211; The 2nd input. Score of each sample.</li>
<li><strong>NDCG_num</strong> (<em>int</em>) &#8211; The size of NDCG (Normalized Discounted Cumulative Gain),
e.g., 5 for NDCG&#64;5. It must be less than for equal to the
e.g., 5 for NDCG&#64;5. It must be less than or equal to the
minimum size of lists.</li>
<li><strong>max_sort_size</strong> (<em>int</em>) &#8211; The size of partial sorting in calculating gradient.
If max_sort_size = -1, then for each list, the
......@@ -3442,7 +3473,7 @@ In other cases, max_sort_size must be greater than or
equal to NDCG_num. And if max_sort_size is greater
than the size of a list, the algorithm will sort the
entire list to get the gradient.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
</td>
......@@ -3471,7 +3502,7 @@ entire list of get gradient.</li>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Network prediction.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Data label.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The weight affects the cost, namely the scale of cost.
......@@ -3530,7 +3561,7 @@ Their dimension is one.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Label is 1 or 0, means positive order and reverse order.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The weight affects the cost, namely the scale of cost.
It is an optional argument.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3563,7 +3594,7 @@ It is an optional argument.</li>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The first input layer.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
</td>
......@@ -3603,7 +3634,7 @@ field model.</p>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The third layer is &#8220;weight&#8221; of each sample, which is an
optional argument.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter attribute. None means default attribute</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
</ul>
......@@ -3644,7 +3675,7 @@ decoding or 0 for correct decoding.</p>
<li><strong>size</strong> (<em>int</em>) &#8211; size of this layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em><em> or </em><em>None</em>) &#8211; None or ground-truth label.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter attribute. None means default attribute</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -3694,7 +3725,7 @@ should also be num_classes + 1.</p>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The data layer of label with variable length.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; category numbers + 1.</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>norm_by_times</strong> (<em>bool</em>) &#8211; Whether to normalize by times. False by default.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
</ul>
......@@ -3754,7 +3785,7 @@ should be consistent as that used in your labels.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The data layer of label with variable length.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; category numbers + 1.</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer, which can not specify.</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>blank</strong> (<em>int</em>) &#8211; the &#8216;blank&#8217; label used in ctc</li>
<li><strong>norm_by_times</strong> (<em>bool</em>) &#8211; Whether to normalize by times. False by default.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Extra Layer config.</li>
......@@ -3791,8 +3822,8 @@ A fast and simple algorithm for training neural probabilistic language models.</
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple|collections.Sequence</em>) &#8211; input layers. It could be a paddle.v2.config_base.Layer of list/tuple of paddle.v2.config_base.Layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple|collections.Sequence</em>) &#8211; The input layers. It could be a paddle.v2.config_base.Layer or a list/tuple of paddle.v2.config_base.Layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; label layer</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; weight layer, can be None(default)</li>
<li><strong>num_classes</strong> (<em>int</em>) &#8211; number of classes.</li>
......@@ -3802,7 +3833,10 @@ A fast and simple algorithm for training neural probabilistic language models.</
<li><strong>neg_distribution</strong> (<em>list|tuple|collections.Sequence|None</em>) &#8211; The distribution for generating the random negative labels.
A uniform distribution will be used if not provided.
If not None, its length must be equal to num_classes.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias parameter attribute. True if no bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or to any value that is not a paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
</td>
......@@ -3841,9 +3875,11 @@ Hierarchical Probabilistic Neural Network Language Model.&#8221;</p>
paddle.v2.config_base.Layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Label layer.</li>
<li><strong>num_classes</strong> (<em>int|None</em>) &#8211; number of classes.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Bias attribute. None means default bias.
False means no bias.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or to any value that is not a paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None</em>) &#8211; Parameter Attribute. None means default parameter.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3885,7 +3921,7 @@ size of input and label are equal. The formula is as follows,</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3913,7 +3949,7 @@ size of input and label are equal. The formula is as follows,</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input_loc</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer</em>) &#8211; The input predict locations.</li>
<li><strong>input_conf</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer</em>) &#8211; The input priorbox confidence.</li>
<li><strong>priorbox</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input priorbox location and the variance.</li>
......@@ -3955,7 +3991,7 @@ It is used by recurrent layer group.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>eos_id</strong> (<em>int</em>) &#8211; end id of sequence</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
......@@ -3981,19 +4017,25 @@ It is used by recurrent layer group.</p>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">dropout</code></dt>
<dd><p>&#64;TODO(yuyang18): Add comments.</p>
<dd><p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">dropout</span> <span class="o">=</span> <span class="n">dropout</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">dropout_rate</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> &#8211; </li>
<li><strong>input</strong> &#8211; </li>
<li><strong>dropout_rate</strong> &#8211; </li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>dropout_rate</strong> (<em>float</em>) &#8211; The probability of dropout.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"></p>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">paddle.v2.config_base.Layer object.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p>
</td>
</tr>
</tbody>
......@@ -4027,7 +4069,7 @@ a_i * z_i &amp;\quad \mathrm{otherwise}\end{split}\]</div>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>partial_sum</strong> (<em>int</em>) &#8211; <p>this parameter makes a group of inputs share the same weight.</p>
<ul>
......@@ -4060,7 +4102,7 @@ a_i * z_i &amp;\quad \mathrm{otherwise}\end{split}\]</div>
<dd><p>The gated unit layer implements a simple gating mechanism over the input.
The input <span class="math">\(X\)</span> is first projected into a new space <span class="math">\(X'\)</span>, and
it is also used to produce a gate weight <span class="math">\(\sigma\)</span>. Element-wise
prodict between <a href="#id12"><span class="problematic" id="id13">:match:`X&#8217;`</span></a> and <span class="math">\(\sigma\)</span> is finally returned.</p>
product between <a href="#id11"><span class="problematic" id="id12">:match:`X&#8217;`</span></a> and <span class="math">\(\sigma\)</span> is finally returned.</p>
<dl class="docutils">
<dt>Reference:</dt>
<dd>Language Modeling with Gated Convolutional Networks
......@@ -4077,7 +4119,7 @@ prodict between <a href="#id12"><span class="problematic" id="id13">:match:`X&#8
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input for this layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; output size of the gated unit.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type of the projected input.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>gate_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Attributes to tune the gate output, for example, error
clipping threshold, dropout and so on. See paddle.v2.attr.ExtraAttribute for
more details.</li>
......@@ -4124,7 +4166,7 @@ no valid bounding box.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input_loc</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer.</em>) &#8211; The input predict locations.</li>
<li><strong>input_conf</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer.</em>) &#8211; The input priorbox confidence.</li>
<li><strong>priorbox</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input priorbox location and the variance.</li>
......
The source diff is too large to display. You can view the blob instead.
......@@ -321,3 +321,55 @@ pip uninstall py_paddle paddle
Then install the Paddle Python environment by running the following in the build directory:
pip install python/dist/paddle*.whl && pip install ../paddle/dist/py_paddle*.whl
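16. What is the format of parameters stored by PaddlePaddle, and how are they converted to and from plain text?
------------------------------------------------------------------------------------------------------------------------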
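A model parameter file saved by PaddlePaddle consists of two parts: a 16-byte header and the network parameters. In the header, bytes 1-4 hold the PaddlePaddle version information and should simply be filled with 0; bytes 5-8 hold the number of bytes occupied by each parameter value, which is 4 when the saved parameters are of type float and 8 when they are of type double; bytes 9-16 hold the total number of saved parameter values.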
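To restore model parameters saved by PaddlePaddle to plain text, load the network parameters into a :code:`numpy.array` of the corresponding data type, skipping the header of the parameter file. Unless PaddlePaddle was explicitly compiled with double precision, it computes in float precision by default, and the saved parameters are also of type float. In that case, :code:`dtype=float32` is generally used with :code:`numpy.array`. For example: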
.. code-block:: python

    import numpy as np

    def read_parameter(fname, width):
        # open in binary mode so the raw bytes are read unchanged
        with open(fname, "rb") as f:
            s = f.read()
        # skip the 16-byte header
        vec = np.frombuffer(s[16:], dtype=np.float32)
        # width is the size of the corresponding layer
        np.savetxt(fname + ".csv", vec.reshape(width, -1),
                   fmt="%.6f", delimiter=",")
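The header layout described above can also be checked before the values are loaded. The following is a minimal sketch, not part of PaddlePaddle: the helper names ``read_header`` and ``dtype_for`` are hypothetical, and the little-endian byte order in the format string ``"<iiq"`` is an assumption about the machine that wrote the file.

```python
import struct

import numpy as np


def read_header(fname):
    """Return (version, value_size, value_count) from the 16-byte header."""
    with open(fname, "rb") as f:
        raw = f.read(16)
    if len(raw) != 16:
        raise ValueError("file is shorter than the 16-byte header")
    # "<iiq": int32 version, int32 bytes per value, int64 value count.
    # Little-endian byte order is an assumption, not stated in the FAQ.
    return struct.unpack("<iiq", raw)


def dtype_for(value_size):
    """Map bytes per value to a NumPy dtype: 4 -> float32, 8 -> float64."""
    sizes = {4: np.float32, 8: np.float64}
    if value_size not in sizes:
        raise ValueError("unexpected value size: %d" % value_size)
    return sizes[value_size]
```

Reading the value size from the header, instead of hard-coding ``float32``, lets the same loader handle files written by a double-precision build.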
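To convert plain-text parameters into model parameters that PaddlePaddle can load, first construct the header and then write the network parameters. The following code converts a randomly generated matrix into model parameters that PaddlePaddle can load.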
.. code-block:: python

    import struct
    import numpy as np

    def gen_rand_param(param_file, width, height, need_trans):
        np.random.seed()
        # header: version 0, 4 bytes per float32 value, 8-byte value count
        header = struct.pack("iiq", 0, 4, height * width)
        param = np.float32(np.random.rand(height, width))
        with open(param_file, "wb") as fparam:
            fparam.write(header + param.tobytes())
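The same header-then-values layout applies when the values come from a text file rather than a random generator. The sketch below is illustrative only: ``csv_to_param`` is a hypothetical helper, the input is assumed to be comma-separated float values, and the little-endian header byte order is an assumption.

```python
import struct

import numpy as np


def csv_to_param(csv_file, param_file):
    """Write the values of a comma-separated text file as a parameter file."""
    # ndmin=2 keeps a single-row file two-dimensional
    values = np.loadtxt(csv_file, delimiter=",", dtype=np.float32, ndmin=2)
    # header: version 0, 4 bytes per float32 value, total number of values
    header = struct.pack("<iiq", 0, 4, values.size)
    with open(param_file, "wb") as f:
        # values are written row by row in the machine's native byte order
        f.write(header + values.tobytes())
    return values.shape
```

The returned shape is only a convenience for the caller; the file itself stores a flat sequence of values, so the layer width still has to be known when the file is read back.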
17. How to load pre-trained parameters
----------------------------------------
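* For a layer that loads pre-trained parameters, set its parameter attribute :code:`is_static=True` so that the layer's parameters stay unchanged during training. Taking the embedding layer as an example: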
  .. code-block:: python

      emb_para = paddle.attr.Param(name='emb', is_static=True)
      paddle.layer.embedding(size=word_dim, input=x, param_attr=emb_para)
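* Load the pre-trained parameters from the model file into a :code:`numpy.array`, and after creating parameters, use :code:`parameters.set()` to load them. The first 16 bytes of a model parameter file saved by PaddlePaddle are the header, so reading into a :code:`numpy.array` must start at the 17th byte. Taking the embedding layer as an example: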
  .. code-block:: python

      import numpy as np

      def load_parameter(file_name, h, w):
          with open(file_name, 'rb') as f:
              f.read(16)  # skip header.
              return np.fromfile(f, dtype=np.float32).reshape(h, w)

      parameters = paddle.parameters.create(my_cost)
      parameters.set('emb', load_parameter(emb_param_file, 30000, 256))
......@@ -230,14 +230,15 @@
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; The input layer. It can also be a list or tuple of input layers.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -271,7 +272,7 @@ specified, selective_fc acts exactly like fc.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; The input layer.</li>
<li><strong>select</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The select layer. The output of the select layer should be a
sparse binary matrix, which is treated as the mask of selective fc.
......@@ -279,9 +280,10 @@ If is None, acts exactly like fc.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -431,7 +433,7 @@ the right size (which is the end of array) to the left.</li>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer a.</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer b.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; layer&#8217;s extra attribute.</li>
......@@ -485,7 +487,7 @@ rest channels will be processed by rest group of filters.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Layer Input.</li>
<li><strong>filter_size</strong> (<em>int|tuple|list</em>) &#8211; The x dimension of a filter kernel, or a tuple giving
both image dimensions.</li>
......@@ -504,8 +506,10 @@ image dimension</li>
<li><strong>dilation</strong> (<em>int|tuple|list</em>) &#8211; The x dimension of the dilation, or a tuple giving both
image dimensions.</li>
<li><strong>dilation_y</strong> (<em>int</em>) &#8211; The y dimension of the dilation.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Convolution bias attribute. None means default bias.
False means no bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; The number of input channels. If None, it is set
automatically from the previous output.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Convolution param attribute. None means default attribute</li>
......@@ -576,15 +580,15 @@ parameter attribute is set by this parameter.</li>
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">row_conv</code></dt>
<dd><p>Row convolution is also called lookahead convolution. It was first
introduced in paper of <a class="reference external" href="https://arxiv.org/pdf/1512.02595v1.pdf">Deep Speech 2: End-toEnd Speech Recognition
introduced in paper of <a class="reference external" href="https://arxiv.org/pdf/1512.02595v1.pdf">Deep Speech 2: End-to-End Speech Recognition
in English and Mandarin</a> .</p>
<p>A bidirectional RNN learns a representation for a sequence by
performing a forward and a backward pass through the entire sequence.
However, unlike unidirectional RNNs, bidirectional RNNs are challenging
to deploy in an online and low-latency setting. The lookahead convolution
incorporates information from future subsequences in a computationally
efficient manner to improve unidirectional recurrent neural networks.</p>
<p>The connection of row convolution is different form the 1D sequence
efficient manner to improve unidirectional RNNs.</p>
<p>The connection of row convolution is different from the 1D sequence
convolution. Assume that the future context length is k; that is,
the output at timestep t is computed from the input features from the t-th
timestep to the (t+k+1)-th timestep. Assume that the hidden dim of input
......@@ -610,7 +614,7 @@ number plus one equals context_len.</p>
plus one.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is linear activation.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute. If None, the parameter will be
initialized smartly. It&#8217;s better set it by yourself.</li>
initialized smartly. It&#8217;s better to set it by yourself.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -713,7 +717,7 @@ The details please refer to
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; layer&#8217;s input.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; number of input channel.</li>
<li><strong>pool_type</strong> &#8211; Pooling type. MaxPooling or AveragePooling. Default is MaxPooling.</li>
......@@ -778,7 +782,7 @@ s = input.size / num_channels
<li><strong>num_channels</strong> (<em>int|None</em>) &#8211; The channel number of the input layer. If None, it is set
automatically from the previous output.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; The group number of input layer.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer, which can not specify.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer attribute.</li>
</ul>
</td>
......@@ -814,7 +818,7 @@ The details please refer to
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; layer&#8217;s input.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; Normalize in number of <span class="math">\(size\)</span> feature maps.</li>
<li><strong>scale</strong> (<em>float</em>) &#8211; The hyper-parameter.</li>
......@@ -862,7 +866,7 @@ y_i &amp;\gets \gamma \hat{x_i} + \beta \qquad &amp;//\ scale\ and\ shift\end{sp
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Batch normalization input. A linear activation is preferable,
because there is already an activation inside batch_normalization.</li>
<li><strong>batch_norm_type</strong> (<em>None|string</em><em>, </em><em>None</em><em> or </em><em>&quot;batch_norm&quot;</em><em> or </em><em>&quot;cudnn_batch_norm&quot;</em>) &#8211; We have batch_norm and cudnn_batch_norm. batch_norm
......@@ -879,7 +883,7 @@ normalization will normalize input near zero.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; The number of image channels, or the previous layer&#8217;s number of
filters. If None, it is obtained automatically from the layer&#8217;s
input.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; <span class="math">\(\beta\)</span>, better be zero when initialize. So the
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; <span class="math">\(\beta\)</span>, which is best initialized to zero, so
initial_std=0, initial_mean=0 is the best practice.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; <span class="math">\(\gamma\)</span>, which is best initialized to one, so
initial_std=0, initial_mean=1 is the best practice.</li>
......@@ -930,7 +934,7 @@ and <span class="math">\(out\)</span> is a (batchSize x dataDim) output vector.<
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -960,7 +964,7 @@ factors which dimensions equal to the channel&#8217;s number.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li>
</ul>
......@@ -1000,7 +1004,7 @@ and the size of <span class="math">\(out\)</span> is a (batchSize x dataDim) .</
</tr>
<tr class="field-even field"><th class="field-name">type input:</th><td class="field-body">paddle.v2.config_base.Layer</td>
</tr>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">Layer name.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">basestring</td>
</tr>
......@@ -1045,9 +1049,12 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input Layer</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; bias attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the layer</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Layer Attribute.</li>
</ul>
</td>
......@@ -1095,8 +1102,10 @@ more details about LSTM.</p>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type, paddle.v2.activation.Tanh by default. <span class="math">\(h_t\)</span></li>
<li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.</li>
<li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; state activation type, paddle.v2.activation.Tanh by default.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Parameter Attribute.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer attribute</li>
</ul>
......@@ -1163,8 +1172,10 @@ affects the <span class="math">\({\tilde{h_t}}\)</span>.</li>
<li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.
This activation affects the <span class="math">\(z_t\)</span> and <span class="math">\(r_t\)</span>. It is the
<span class="math">\(\sigma\)</span> in the above formula.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Parameter Attribute.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer attribute</li>
</ul>
......@@ -1336,7 +1347,7 @@ output is <span class="math">\(o_t\)</span>, whose name is &#8216;state&#8217; a
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer&#8217;s name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; Layer&#8217;s size. NOTE: lstm layer&#8217;s size, should be equal to
<code class="code docutils literal"><span class="pre">input.size/4</span></code>, and should be equal to
<code class="code docutils literal"><span class="pre">state.size</span></code>.</li>
......@@ -1347,7 +1358,10 @@ output is <span class="math">\(o_t\)</span>, whose name is &#8216;state&#8217; a
be sigmoid only.</li>
<li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; State Activation Type. Default is sigmoid, and should
be sigmoid only.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Bias Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; layer&#8217;s extra attribute.</li>
</ul>
</td>
......@@ -1377,9 +1391,12 @@ be sigmoid only.</li>
<li><strong>output_mem</strong> &#8211; </li>
<li><strong>size</strong> &#8211; </li>
<li><strong>act</strong> &#8211; </li>
<li><strong>name</strong> &#8211; </li>
<li><strong>name</strong> &#8211; The name of this layer. It is optional.</li>
<li><strong>gate_act</strong> &#8211; </li>
<li><strong>bias_attr</strong> &#8211; </li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>param_attr</strong> &#8211; the parameter_attribute for transforming the output_mem
from previous step.</li>
<li><strong>layer_attr</strong> &#8211; </li>
......@@ -1493,7 +1510,7 @@ the output from input.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer&#8217;s name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input of the get_output layer. This layer should contain
multiple outputs.</li>
<li><strong>arg_name</strong> (<em>basestring</em>) &#8211; Output name from input.</li>
......@@ -1549,9 +1566,10 @@ Each inputs is a projection or operator.</p>
<li><strong>input</strong> &#8211; inputs layer. It is an optional parameter. If set,
then this function will just return layer&#8217;s name.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; The extra layer config. Default is None.</li>
</ul>
</td>
......@@ -1578,7 +1596,7 @@ default Bias.</li>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Name of this embedding layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer for this embedding. NOTE: must be Index Data.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The embedding dimension.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None</em>) &#8211; The embedding parameter attribute. See paddle.v2.attr.ParameterAttribute
......@@ -1974,12 +1992,15 @@ of stride is -1.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>agg_level</strong> (<em>AggregateLevel</em>) &#8211; AggregateLevel.TO_NO_SEQUENCE or
AggregateLevel.TO_SEQUENCE</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li>
<li><strong>pooling_type</strong> (<em>BasePoolingType|None</em>) &#8211; Type of pooling, MaxPooling(default), AvgPooling,
SumPooling, SquareRootNPooling.</li>
<li><strong>stride</strong> (<em>Int</em>) &#8211; The step size between successive pooling regions.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias parameter attribute. False if no bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; The Extra Attributes for layer, such as dropout.</li>
</ul>
</td>
......@@ -2015,7 +2036,7 @@ of stride is -1.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>agg_level</strong> &#8211; Aggregated level</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>stride</strong> (<em>Int</em>) &#8211; The step size between successive pooling regions.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
......@@ -2053,7 +2074,7 @@ of stride is -1.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>agg_level</strong> &#8211; aggregation level</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>stride</strong> (<em>Int</em>) &#8211; The step size between successive pooling regions.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
......@@ -2087,7 +2108,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>list|tuple|collections.Sequence</em>) &#8211; input layers or projections</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
......@@ -2131,14 +2152,15 @@ processed in one batch.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input sequence layer</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input sequence layer</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The bias attribute. If this parameter is
False or is not of type paddle.v2.attr.ParameterAttribute,
no bias is defined. If it is
True, the bias is initialized to zero.</li>
</ul>
</td>
</tr>
......@@ -2183,7 +2205,7 @@ will be sliced for multiple times.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input for this layer, it should be a sequence.</li>
<li><strong>starts</strong> (<em>paddle.v2.config_base.Layer|None</em>) &#8211; start indices to slice the input sequence.</li>
<li><strong>ends</strong> (<em>paddle.v2.config_base.Layer|None</em>) &#8211; end indices to slice the input sequence.</li>
......@@ -2225,7 +2247,7 @@ beam training.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; A nested sequence.</li>
<li><strong>selected_indices</strong> &#8211; a set of sequence indices in the nested sequence.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
</ul>
</td>
</tr>
......@@ -2285,7 +2307,7 @@ convolution neural network, and before recurrent neural network.</p>
<li><strong>stride_y</strong> (<em>int</em>) &#8211; The stride size in vertical direction.</li>
<li><strong>padding_x</strong> (<em>int</em>) &#8211; The padding size in horizontal direction.</li>
<li><strong>padding_y</strong> (<em>int</em>) &#8211; The padding size in vertical direction.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer, which can not specify.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -2339,9 +2361,11 @@ sequence is one) to sequence data.&#8221;</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer</li>
<li><strong>expand_as</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Expand as this layer&#8217;s sequence info.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or something not type of paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>expand_level</strong> (<em>ExpandLevel</em>) &#8211; whether input layer is timestep(default) or sequence.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
......@@ -2385,7 +2409,7 @@ bias.</li>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer</li>
<li><strong>num_repeats</strong> (<em>int</em>) &#8211; Repeat the input so many times</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>as_row_vector</strong> (<em>bool</em>) &#8211; True for treating input as row vector and repeating
in the column direction. This is equivalent to applying
concat() to num_repeats copies of the same input.
......@@ -2430,7 +2454,7 @@ usually used when the input sample is some image or feature map.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>height</strong> (<em>int</em>) &#8211; The height of the sample matrix</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2466,12 +2490,13 @@ output sequence has T*M/N instances, the dimension of each instance is N.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>reshape_size</strong> (<em>int</em>) &#8211; the size of reshaped sequence.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or something not type of paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
</ul>
</td>
</tr>
......@@ -2520,12 +2545,14 @@ Please refer to dropout for details.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; Input layers. It could be a paddle.v2.config_base.Layer or list/tuple of
paddle.v2.config_base.Layer.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type, default is tanh.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|bool</em>) &#8211; Bias attribute. If False, means no bias. None is default
bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or something not type of paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer attribute.</li>
</ul>
</td>
......@@ -2588,7 +2615,7 @@ processed in one batch.</p>
<li><strong>weights</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The weight layer.</li>
<li><strong>vectors</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The vector layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; the dimension of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -2627,7 +2654,7 @@ which is used in NEURAL TURING MACHINE.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>list|tuple</em>) &#8211; Input layer.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Weight layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2700,7 +2727,7 @@ and <span class="math">\(y\)</span> is a output vector.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Weight layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2739,7 +2766,7 @@ processed in one batch.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Weight layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2775,7 +2802,7 @@ ight)</p>
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The Layer Name.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">basestring</td>
</tr>
......@@ -2820,7 +2847,7 @@ element-wise. There is no activation and weight.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>slope</strong> (<em>float.</em>) &#8211; the scale factor.</li>
<li><strong>intercept</strong> (<em>float.</em>) &#8211; the offset.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
......@@ -2867,15 +2894,16 @@ For example, each sample:</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer a.</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer b.</li>
<li><strong>size</strong> (<em>int.</em>) &#8211; the layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or something not type of paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -2914,7 +2942,7 @@ processed in one batch.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer a</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer b</li>
<li><strong>scale</strong> (<em>float</em>) &#8211; scale for cosine value. default is 5.</li>
......@@ -2953,7 +2981,7 @@ processed in one batch.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -2989,10 +3017,13 @@ bias are trainable.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The input layer.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The parameter attribute of scaling.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The parameter attribute of shifting.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or something not type of paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
</ul>
</td>
</tr>
......@@ -3027,7 +3058,7 @@ The result is stored in output.ids.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -3060,7 +3091,7 @@ Sampling one id for one sample.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -3104,7 +3135,7 @@ For each index i from 0 to batchSize -1, the output is the i-th row of the
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>list of paddle.v2.config_base.Layer</em>) &#8211; Input layers.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
</ul>
</td>
......@@ -3174,7 +3205,7 @@ in width dimension.</p>
<li><strong>pad_h</strong> (<em>list|None</em>) &#8211; padding size in height dimension.</li>
<li><strong>pad_w</strong> (<em>list|None</em>) &#8211; padding size in width dimension.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
</ul>
</td>
</tr>
......@@ -3210,7 +3241,7 @@ in width dimension.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The first input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float.</em>) &#8211; The cost is multiplied with coeff.
The coefficient affects the gradient in the backward.</li>
<li><strong>weight</strong> (<em>LayerOutput</em>) &#8211; The cost of each sample is multiplied with each weight.
......@@ -3250,7 +3281,7 @@ Input should be a vector of positive numbers, without normalization.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The first input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float.</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>softmax_selfnorm_alpha</strong> (<em>float.</em>) &#8211; The scale factor affects the cost.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
......@@ -3286,7 +3317,7 @@ Input should be a vector of positive numbers, without normalization.</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The first input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3335,7 +3366,7 @@ ight <a href="#id2"><span class="problematic" id="id3">|</span></a>leq delta</p>
</tr>
<tr class="field-even field"><th class="field-name">type input:</th><td class="field-body">paddle.v2.config_base.Layer.</td>
</tr>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layers. It is not necessary.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">None|basestring.</td>
</tr>
......@@ -3394,7 +3425,7 @@ a true binary class label :math:<a href="#id6"><span class="problematic" id="id7
</tr>
<tr class="field-even field"><th class="field-name">type input:</th><td class="field-body">paddle.v2.config_base.Layer.</td>
</tr>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layers. It is not necessary.</td>
<tr class="field-odd field"><th class="field-name">param name:</th><td class="field-body">The name of this layer. It is optional.</td>
</tr>
<tr class="field-even field"><th class="field-name">type name:</th><td class="field-body">None|basestring.</td>
</tr>
......@@ -3440,7 +3471,7 @@ a true binary class label :math:<a href="#id6"><span class="problematic" id="id7
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Samples of the same query should be loaded as sequence.</li>
<li><strong>score</strong> &#8211; The 2nd input. Score of each sample.</li>
<li><strong>NDCG_num</strong> (<em>int</em>) &#8211; The size of NDCG (Normalized Discounted Cumulative Gain),
e.g., 5 for NDCG&#64;5. It must be less than for equal to the
e.g., 5 for NDCG&#64;5. It must be less than or equal to the
minimum size of lists.</li>
<li><strong>max_sort_size</strong> (<em>int</em>) &#8211; The size of partial sorting in calculating gradient.
If max_sort_size = -1, then for each list, the
......@@ -3449,7 +3480,7 @@ In other cases, max_sort_size must be greater than or
equal to NDCG_num. And if max_sort_size is greater
than the size of a list, the algorithm will sort the
entire list to get the gradient.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
</td>
......@@ -3478,7 +3509,7 @@ entire list of get gradient.</li>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Network prediction.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Data label.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The weight affects the cost, namely the scale of cost.
......@@ -3537,7 +3568,7 @@ Their dimension is one.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Label is 1 or 0, means positive order and reverse order.</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The weight affects the cost, namely the scale of cost.
It is an optional argument.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3570,7 +3601,7 @@ It is an optional argument.</li>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; The first input layer.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring.</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
</td>
......@@ -3610,7 +3641,7 @@ field model.</p>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The third layer is &#8220;weight&#8221; of each sample, which is an
optional argument.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter attribute. None means default attribute</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
......@@ -3651,7 +3682,7 @@ decoding or 0 for correct decoding.</p>
<li><strong>size</strong> (<em>int</em>) &#8211; size of this layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em><em> or </em><em>None</em>) &#8211; None or ground-truth label.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter attribute. None means default attribute</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
</td>
......@@ -3701,7 +3732,7 @@ should also be num_classes + 1.</p>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The data layer of label with variable length.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; category numbers + 1.</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>norm_by_times</strong> (<em>bool</em>) &#8211; Whether to normalize by times. False by default.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
</ul>
......@@ -3761,7 +3792,7 @@ should be consistent as that used in your labels.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The data layer of label with variable length.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; category numbers + 1.</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer, which can not specify.</li>
<li><strong>name</strong> (<em>basestring|None</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>blank</strong> (<em>int</em>) &#8211; the &#8216;blank&#8217; label used in ctc</li>
<li><strong>norm_by_times</strong> (<em>bool</em>) &#8211; Whether to normalize by times. False by default.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttributeNone</em>) &#8211; Extra Layer config.</li>
......@@ -3798,8 +3829,8 @@ A fast and simple algorithm for training neural probabilistic language models.</
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple|collections.Sequence</em>) &#8211; input layers. It could be a paddle.v2.config_base.Layer of list/tuple of paddle.v2.config_base.Layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple|collections.Sequence</em>) &#8211; The input layers. It could be a paddle.v2.config_base.Layer or a list/tuple of paddle.v2.config_base.Layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; label layer</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; weight layer, can be None(default)</li>
<li><strong>num_classes</strong> (<em>int</em>) &#8211; number of classes.</li>
......@@ -3809,7 +3840,10 @@ A fast and simple algorithm for training neural probabilistic language models.</
<li><strong>neg_distribution</strong> (<em>list|tuple|collections.Sequence|None</em>) &#8211; The distribution for generating the random negative labels.
A uniform distribution will be used if not provided.
If not None, its length must be equal to num_classes.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias parameter attribute. True if no bias.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or something not type of paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
</td>
......@@ -3848,9 +3882,11 @@ Hierarchical Probabilistic Neural Network Language Model.&#8221;</p>
paddle.v2.config_base.Layer.</li>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Label layer.</li>
<li><strong>num_classes</strong> (<em>int|None</em>) &#8211; number of classes.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; layer name</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Bias attribute. None means default bias.
False means no bias.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Bool|Any</em>) &#8211; The Bias Attribute. If the parameter is set to
False or something not type of paddle.v2.attr.ParameterAttribute,
no bias is defined. If the parameter is set to
True, the bias is initialized to zero.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None</em>) &#8211; Parameter Attribute. None means default parameter.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3892,7 +3928,7 @@ size of input and label are equal. The formula is as follows,</p>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>label</strong> &#8211; The input label.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layers. It is not necessary.</li>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>coeff</strong> (<em>float</em>) &#8211; The coefficient affects the gradient in the backward.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
......@@ -3920,7 +3956,7 @@ size of input and label are equal. The formula is as follows,</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input_loc</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer</em>) &#8211; The input predict locations.</li>
<li><strong>input_conf</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer</em>) &#8211; The input priorbox confidence.</li>
<li><strong>priorbox</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input priorbox location and the variance.</li>
......@@ -3962,7 +3998,7 @@ It is used by recurrent layer group.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer name.</li>
<li><strong>eos_id</strong> (<em>int</em>) &#8211; end id of sequence</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
......@@ -3988,19 +4024,25 @@ It is used by recurrent layer group.</p>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">dropout</code></dt>
<dd><p>&#64;TODO(yuyang18): Add comments.</p>
<dd><p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">dropout</span> <span class="o">=</span> <span class="n">dropout</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">dropout_rate</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> &#8211; </li>
<li><strong>input</strong> &#8211; </li>
<li><strong>dropout_rate</strong> &#8211; </li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>dropout_rate</strong> (<em>float</em>) &#8211; The probability of dropout.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"></p>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">paddle.v2.config_base.Layer object.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p>
</td>
</tr>
</tbody>
......@@ -4034,7 +4076,7 @@ a_i * z_i &amp;\quad \mathrm{otherwise}\end{split}\]</div>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input layer.</li>
<li><strong>partial_sum</strong> (<em>int</em>) &#8211; <p>this parameter makes a group of inputs share a same weight.</p>
<ul>
......@@ -4067,7 +4109,7 @@ a_i * z_i &amp;\quad \mathrm{otherwise}\end{split}\]</div>
<dd><p>The gated unit layer implements a simple gating mechanism over the input.
The input <span class="math">\(X\)</span> is first projected into a new space <span class="math">\(X'\)</span>, and
it is also used to produce a gate weight <span class="math">\(\sigma\)</span>. Element-wise
prodict between <a href="#id12"><span class="problematic" id="id13">:match:`X&#8217;`</span></a> and <span class="math">\(\sigma\)</span> is finally returned.</p>
product between <span class="math">\(X'\)</span> and <span class="math">\(\sigma\)</span> is finally returned.</p>
<dl class="docutils">
<dt>Reference:</dt>
<dd>Language Modeling with Gated Convolutional Networks
......@@ -4084,7 +4126,7 @@ prodict between <a href="#id12"><span class="problematic" id="id13">:match:`X&#8
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input for this layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; output size of the gated unit.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type of the projected input.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of this layer.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>gate_attr</strong> (<em>paddle.v2.attr.ExtraAttribute|None</em>) &#8211; Attributes to tune the gate output, for example, error
clipping threshold, dropout and so on. See paddle.v2.attr.ExtraAttribute for
more details.</li>
......@@ -4131,7 +4173,7 @@ no valid bounding box.</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The name of this layer. It is optional.</li>
<li><strong>input_loc</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer.</em>) &#8211; The input predict locations.</li>
<li><strong>input_conf</strong> (<em>paddle.v2.config_base.Layer | List of paddle.v2.config_base.Layer.</em>) &#8211; The input priorbox confidence.</li>
<li><strong>priorbox</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; The input priorbox location and the variance.</li>
......
......@@ -186,42 +186,44 @@
<div itemprop="articleBody">
<div class="section" id="faq">
<h1><a class="toc-backref" href="#id9">FAQ</a><a class="headerlink" href="#faq" title="Permalink to this headline"></a></h1>
<h1><a class="toc-backref" href="#id11">FAQ</a><a class="headerlink" href="#faq" title="Permalink to this headline"></a></h1>
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="simple">
<li><a class="reference internal" href="#faq" id="id9">FAQ</a><ul>
<li><a class="reference internal" href="#id1" id="id10">1. How to reduce memory usage</a><ul>
<li><a class="reference internal" href="#dataprovider" id="id11">Reducing DataProvider buffer pool memory</a></li>
<li><a class="reference internal" href="#id2" id="id12">Neuron activation memory</a></li>
<li><a class="reference internal" href="#id3" id="id13">Parameter memory</a></li>
<li><a class="reference internal" href="#faq" id="id11">FAQ</a><ul>
<li><a class="reference internal" href="#id1" id="id12">1. How to reduce memory usage</a><ul>
<li><a class="reference internal" href="#dataprovider" id="id13">Reducing DataProvider buffer pool memory</a></li>
<li><a class="reference internal" href="#id2" id="id14">Neuron activation memory</a></li>
<li><a class="reference internal" href="#id3" id="id15">Parameter memory</a></li>
</ul>
</li>
<li><a class="reference internal" href="#paddlepaddle" id="id14">2. How to speed up PaddlePaddle training</a><ul>
<li><a class="reference internal" href="#id4" id="id15">Reducing data loading time</a></li>
<li><a class="reference internal" href="#id5" id="id16">Speeding up training</a></li>
<li><a class="reference internal" href="#id6" id="id17">Using more computing resources</a></li>
<li><a class="reference internal" href="#paddlepaddle" id="id16">2. How to speed up PaddlePaddle training</a><ul>
<li><a class="reference internal" href="#id4" id="id17">Reducing data loading time</a></li>
<li><a class="reference internal" href="#id5" id="id18">Speeding up training</a></li>
<li><a class="reference internal" href="#id6" id="id19">Using more computing resources</a></li>
</ul>
</li>
<li><a class="reference internal" href="#illegal-instruction" id="id18">3. Encountering an “illegal instruction” (“非法指令”) error</a></li>
<li><a class="reference internal" href="#sgd" id="id19">4. How to choose the learning rate for SGD</a></li>
<li><a class="reference internal" href="#id7" id="id20">5. How to initialize parameters</a></li>
<li><a class="reference internal" href="#id8" id="id21">6. How to share parameters</a></li>
<li><a class="reference internal" href="#cp27mu-linux-x86-64-whl-is-not-a-supported-wheel-on-this-platform" id="id22">7. *-cp27mu-linux_x86_64.whl is not a supported wheel on this platform.</a></li>
<li><a class="reference internal" href="#python" id="id23">8. All Python-related unit tests fail</a></li>
<li><a class="reference internal" href="#docker-gpu-cuda-driver-version-is-insufficient" id="id24">9. &#8220;CUDA driver version is insufficient&#8221; when running the Docker GPU image</a></li>
<li><a class="reference internal" href="#cmake-pythonlibspythoninterp" id="id25">10. Building from source with CMake: the PythonLibs and PythonInterp versions found do not match</a></li>
<li><a class="reference internal" href="#cmake-paddle0-0-0" id="id26">11. Building from source with CMake: the Paddle version is 0.0.0</a></li>
<li><a class="reference internal" href="#a-protocol-message-was-rejected-because-it-was-too-big" id="id27">12. A protocol message was rejected because it was too big</a></li>
<li><a class="reference internal" href="#gpu" id="id28">13. How to specify GPU devices</a></li>
<li><a class="reference internal" href="#floating-point-exception" id="id29">14. What to do when training exits with <code class="code docutils literal"><span class="pre">Floating</span> <span class="pre">point</span> <span class="pre">exception</span></code>?</a></li>
<li><a class="reference internal" href="#import-paddle-v2-as-paddle-importerror-no-module-named-v2" id="id30">15. After building and installing, import paddle.v2 as paddle raises ImportError: No module named v2</a></li>
<li><a class="reference internal" href="#illegal-instruction" id="id20">3. Encountering an “illegal instruction” (“非法指令”) error</a></li>
<li><a class="reference internal" href="#sgd" id="id21">4. How to choose the learning rate for SGD</a></li>
<li><a class="reference internal" href="#id7" id="id22">5. How to initialize parameters</a></li>
<li><a class="reference internal" href="#id8" id="id23">6. How to share parameters</a></li>
<li><a class="reference internal" href="#cp27mu-linux-x86-64-whl-is-not-a-supported-wheel-on-this-platform" id="id24">7. *-cp27mu-linux_x86_64.whl is not a supported wheel on this platform.</a></li>
<li><a class="reference internal" href="#python" id="id25">8. All Python-related unit tests fail</a></li>
<li><a class="reference internal" href="#docker-gpu-cuda-driver-version-is-insufficient" id="id26">9. &#8220;CUDA driver version is insufficient&#8221; when running the Docker GPU image</a></li>
<li><a class="reference internal" href="#cmake-pythonlibspythoninterp" id="id27">10. Building from source with CMake: the PythonLibs and PythonInterp versions found do not match</a></li>
<li><a class="reference internal" href="#cmake-paddle0-0-0" id="id28">11. Building from source with CMake: the Paddle version is 0.0.0</a></li>
<li><a class="reference internal" href="#a-protocol-message-was-rejected-because-it-was-too-big" id="id29">12. A protocol message was rejected because it was too big</a></li>
<li><a class="reference internal" href="#gpu" id="id30">13. How to specify GPU devices</a></li>
<li><a class="reference internal" href="#floating-point-exception" id="id31">14. What to do when training exits with <code class="code docutils literal"><span class="pre">Floating</span> <span class="pre">point</span> <span class="pre">exception</span></code>?</a></li>
<li><a class="reference internal" href="#import-paddle-v2-as-paddle-importerror-no-module-named-v2" id="id32">15. After building and installing, import paddle.v2 as paddle raises ImportError: No module named v2</a></li>
<li><a class="reference internal" href="#id9" id="id33">16. What format does PaddlePaddle store parameters in, and how to convert them to and from plain text</a></li>
<li><a class="reference internal" href="#id10" id="id34">17. How to load pre-trained parameters</a></li>
</ul>
</li>
</ul>
</div>
<div class="section" id="id1">
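<h2><a class="toc-backref" href="#id10">1. How to reduce memory usage</a><a class="headerlink" href="#id1" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id12">1. How to reduce memory usage</a><a class="headerlink" href="#id1" title="Permalink to this headline"></a></h2>
<p>Training a neural network is very demanding on both host memory and GPU memory; it often consumes tens of GB of host memory and several GB of GPU memory.
PaddlePaddle's memory usage falls into the following categories:</p>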
<ul class="simple">
......@@ -232,7 +234,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
</ul>
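<p>Here, miscellaneous memory refers to memory used by PaddlePaddle itself, including string allocations, temporary variables and so on. It is not considered further here.</p>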
<div class="section" id="dataprovider">
<h3><a class="toc-backref" href="#id11">Reducing DataProvider buffer pool memory</a><a class="headerlink" href="#dataprovider" title="Permalink to this headline"></a></h3>
<h3><a class="toc-backref" href="#id13">Reducing DataProvider buffer pool memory</a><a class="headerlink" href="#dataprovider" title="Permalink to this headline"></a></h3>
<p>PyDataProvider loads data asynchronously and shuffles by randomly selecting data directly in memory. That is:</p>
<img src="../_images/graphviz-9be6aad37f57c60f4b971dde0ef44ce27179cf9a.png" alt="digraph {
rankdir=LR;
......@@ -252,7 +254,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
<p>This can greatly reduce memory usage and may also speed up training. For details, see <a class="reference internal" href="../api/v1/data_provider/pydataprovider2_cn.html#api-pydataprovider2"><span class="std std-ref">Using PyDataProvider2</span></a>.</p>
</div>
<div class="section" id="id2">
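<h3><a class="toc-backref" href="#id12">Neuron activation memory</a><a class="headerlink" href="#id2" title="Permalink to this headline"></a></h3>
<h3><a class="toc-backref" href="#id14">Neuron activation memory</a><a class="headerlink" href="#id2" title="Permalink to this headline"></a></h3>
<p>During training, a neural network temporarily stores some data for every activation, such as the neuron activation values.
In the backward pass, this data is used to update the parameters. The memory it occupies depends mainly on two parameters:
the batch size and the length of each sequence. So, in effect, it also scales with each mini-batch's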
......@@ -265,7 +267,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
</ul>
</div>
<div class="section" id="id3">
<h3><a class="toc-backref" href="#id13">Parameter memory</a><a class="headerlink" href="#id3" title="Permalink to this headline"></a></h3>
<h3><a class="toc-backref" href="#id15">Parameter memory</a><a class="headerlink" href="#id3" title="Permalink to this headline"></a></h3>
<p>PaddlePaddle supports a large number of optimization algorithms (Optimizers), and different optimizers need different amounts of memory.
For example, the <code class="code docutils literal"><span class="pre">adadelta</span></code> algorithm needs about 5 times as much memory as the weight parameters themselves. For instance, if the saved model directory
is <code class="code docutils literal"><span class="pre">100M</span></code> in size, this optimizer needs at least <code class="code docutils literal"><span class="pre">500M</span></code> of memory.</p>
......@@ -273,7 +275,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
</div>
</div>
<div class="section" id="paddlepaddle">
<h2><a class="toc-backref" href="#id14">2. How to speed up PaddlePaddle training</a><a class="headerlink" href="#paddlepaddle" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id16">2. How to speed up PaddlePaddle training</a><a class="headerlink" href="#paddlepaddle" title="Permalink to this headline"></a></h2>
<p>Speeding up PaddlePaddle training can be approached from the following aspects:</p>
<ul class="simple">
<li>Reducing data loading time</li>
......@@ -281,7 +283,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
<li>Using distributed training to harness more computing resources</li>
</ul>
<div class="section" id="id4">
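<h3><a class="toc-backref" href="#id15">Reducing data loading time</a><a class="headerlink" href="#id4" title="Permalink to this headline"></a></h3>
<h3><a class="toc-backref" href="#id17">Reducing data loading time</a><a class="headerlink" href="#id4" title="Permalink to this headline"></a></h3>
<p>When using <code class="code docutils literal"><span class="pre">pydataprovider</span></code>, you can reduce the size of the buffer pool and enable in-memory caching, which can greatly speed up data loading.
Shrinking the <code class="code docutils literal"><span class="pre">DataProvider</span></code> buffer pool works on the same principle as shrinking it to reduce memory usage, as described earlier.</p>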
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="nd">@provider</span><span class="p">(</span><span class="n">min_pool_size</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="o">...</span><span class="p">)</span>
......@@ -295,7 +297,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
<p>In addition, the <code class="code docutils literal"><span class="pre">&#64;provider</span></code> interface has a <code class="code docutils literal"><span class="pre">cache</span></code> parameter that controls the caching method. Setting it to <code class="code docutils literal"><span class="pre">CacheType.CACHE_PASS_IN_MEM</span></code> caches the data generated during the first <code class="code docutils literal"><span class="pre">pass</span></code> (one pass means going through all the training data once) in memory. In every later <code class="code docutils literal"><span class="pre">pass</span></code>, data is no longer read from the <code class="code docutils literal"><span class="pre">python</span></code> side but directly from the in-memory cache. This also greatly reduces data loading time.</p>
</div>
<div class="section" id="id5">
<h3><a class="toc-backref" href="#id16">Speeding up training</a><a class="headerlink" href="#id5" title="Permalink to this headline"></a></h3>
<h3><a class="toc-backref" href="#id18">Speeding up training</a><a class="headerlink" href="#id5" title="Permalink to this headline"></a></h3>
<p>PaddlePaddle supports sparse training. Sparse training requires the training features to be one of <code class="code docutils literal"><span class="pre">sparse_binary_vector</span></code>, <code class="code docutils literal"><span class="pre">sparse_vector</span></code>, or <code class="code docutils literal"><span class="pre">integer_value</span></code>. In addition, the Layer that interacts with this training data must set its Parameter to sparse update mode, that is, set <code class="code docutils literal"><span class="pre">sparse_update=True</span></code>.</p>
<p>Here we take training a language model with a simple <code class="code docutils literal"><span class="pre">word2vec</span></code> as an example. The usage is as follows:</p>
<p>Use the two words before and the two words after a word to predict the word in the middle. The DataProvider for this task is:</p>
......@@ -328,7 +330,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
</div>
</div>
<div class="section" id="id6">
<h3><a class="toc-backref" href="#id17">Using more computing resources</a><a class="headerlink" href="#id6" title="Permalink to this headline"></a></h3>
<h3><a class="toc-backref" href="#id19">Using more computing resources</a><a class="headerlink" href="#id6" title="Permalink to this headline"></a></h3>
<p>More computing resources can be used in the following ways:</p>
<ul class="simple">
<li>Single-machine CPU training<ul>
......@@ -348,17 +350,17 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
</div>
</div>
<div class="section" id="illegal-instruction">
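<h2><a class="toc-backref" href="#id18">3. Encountering an “illegal instruction” (“非法指令”) error</a><a class="headerlink" href="#illegal-instruction" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id20">3. Encountering an “illegal instruction” (“非法指令”) error</a><a class="headerlink" href="#illegal-instruction" title="Permalink to this headline"></a></h2>
<p>PaddlePaddle uses AVX SIMD instructions to improve CPU execution efficiency, so using a binary release that does not match your CPU (for example, an AVX build on a CPU without AVX support) can cause this error. Please choose the correct version.</p>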
</div>
<div class="section" id="sgd">
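<h2><a class="toc-backref" href="#id19">4. How to choose the learning rate for SGD</a><a class="headerlink" href="#sgd" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id21">4. How to choose the learning rate for SGD</a><a class="headerlink" href="#sgd" title="Permalink to this headline"></a></h2>
<p>When training with sgd/async_sgd, an important question is how to choose the right learning_rate. If learning_rate is too large, training may not converge; if it is too small, convergence may be slow and training takes too long.</p>
<p>The usual approach is to start with a relatively large learning_rate. If training does not converge, reduce the learning rate by a factor of 10 and try again, until training converges. How do you tell that training is not converging? You can estimate the minimum cost, cost0, that the model could reach with a constant output.</p>
<p>If the cost during training is clearly higher than this constant-output cost, we can conclude that training is not converging. For example, suppose we have a three-class classification problem with multi-class-cross-entropy as the cost, and the proportions of classes 0, 1, 2 in the data are <code class="code docutils literal"><span class="pre">0.2,</span> <span class="pre">0.5,</span> <span class="pre">0.3</span></code>. Then the minimum cost achievable with a constant output is <code class="code docutils literal"><span class="pre">-(0.2*log(0.2)+0.5*log(0.5)+0.3*log(0.3))=1.03</span></code>. If the cost is still greater than this number after training one pass (or earlier), training can be considered not converging and the learning rate should be lowered.</p>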
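<p>The constant-output baseline from the three-class example can be checked with a few lines of Python. This is only a sketch: the function names <code class="code docutils literal"><span class="pre">constant_output_cost</span></code> and <code class="code docutils literal"><span class="pre">looks_unconverged</span></code> are hypothetical helpers, not part of PaddlePaddle, and the natural logarithm is assumed.</p>

```python
import math


def constant_output_cost(class_ratios):
    """Cross-entropy of the best constant prediction.

    For a constant output, the lowest achievable cross-entropy is the
    entropy of the class distribution (natural logarithm assumed).
    """
    return -sum(p * math.log(p) for p in class_ratios if p > 0)


def looks_unconverged(train_cost, class_ratios):
    """True if the training cost is still above the constant-output baseline."""
    return train_cost > constant_output_cost(class_ratios)


baseline = constant_output_cost([0.2, 0.5, 0.3])
print("%.2f" % baseline)  # prints 1.03
```

<p>If the cost reported after the first pass is still above this baseline, that is a hint to lower the learning rate by a factor of 10 and retry.</p>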
</div>
<div class="section" id="id7">
<h2><a class="toc-backref" href="#id20">5. How to initialize parameters</a><a class="headerlink" href="#id7" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id22">5. How to initialize parameters</a><a class="headerlink" href="#id7" title="Permalink to this headline"></a></h2>
<p>By default, PaddlePaddle initializes parameters with mean 0 and standard deviation <span class="math">\(\frac{1}{\sqrt{d}}\)</span>, where <span class="math">\(d\)</span> is the width of the parameter matrix. This initialization generally does not give bad results. If you want to customize the initialization, PaddlePaddle currently provides two ways to initialize parameters:</p>
<ul class="simple">
<li>Gaussian distribution. Set <code class="code docutils literal"><span class="pre">param_attr</span></code> to <code class="code docutils literal"><span class="pre">param_attr=ParamAttr(initial_mean=0.0,</span> <span class="pre">initial_std=1.0)</span></code></li>
......@@ -372,7 +374,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
<p>The code above initializes all biases to 1.0 and initializes the parameters from a uniform distribution over <code class="code docutils literal"><span class="pre">[-1.0,</span> <span class="pre">1.0]</span></code>.</p>
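<p>To get a feel for what these distributions produce, the following NumPy sketch draws a matrix with the default scheme (mean 0, standard deviation 1/sqrt(d)) and one with the uniform scheme. It only illustrates the distributions described above. It is not PaddlePaddle's internal initializer, and the function names and matrix sizes are assumptions chosen for the example.</p>

```python
import numpy as np


def default_init(height, width, rng):
    """Gaussian init with mean 0 and std 1/sqrt(d), d = width of the matrix."""
    std = 1.0 / np.sqrt(width)
    return rng.normal(loc=0.0, scale=std, size=(height, width)).astype(np.float32)


def uniform_init(height, width, low, high, rng):
    """Uniform init over the interval [low, high)."""
    return rng.uniform(low, high, size=(height, width)).astype(np.float32)


rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable
w = default_init(256, 1024, rng)
u = uniform_init(256, 1024, -1.0, 1.0, rng)

# The empirical std of w should be close to 1/sqrt(1024) = 0.03125.
print(w.std(), u.min(), u.max())
```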
</div>
<div class="section" id="id8">
<h2><a class="toc-backref" href="#id21">6. How to share parameters</a><a class="headerlink" href="#id8" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id23">6. How to share parameters</a><a class="headerlink" href="#id8" title="Permalink to this headline"></a></h2>
<p>PaddlePaddle uses a parameter's <code class="code docutils literal"><span class="pre">name</span></code> as its ID; parameters with the same name are shared. To set a parameter's name, use <code class="code docutils literal"><span class="pre">ParamAttr(name=&quot;YOUR_PARAM_NAME&quot;)</span></code>. A more convenient way is to have the parameters that should be shared use the same <code class="code docutils literal"><span class="pre">ParamAttr</span></code> object.</p>
<p>A configuration example of parameter sharing in a simple fully connected network:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">paddle.trainer_config_helpers</span> <span class="k">import</span> <span class="o">*</span>
......@@ -409,7 +411,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
<p>Here <code class="code docutils literal"><span class="pre">hidden_a</span></code> and <code class="code docutils literal"><span class="pre">hidden_b</span></code> use the same parameter and bias, and the two inputs of the softmax layer also use the same parameter, <code class="code docutils literal"><span class="pre">softmax_param</span></code>.</p>
</div>
<div class="section" id="cp27mu-linux-x86-64-whl-is-not-a-supported-wheel-on-this-platform">
<h2><a class="toc-backref" href="#id22">7. *-cp27mu-linux_x86_64.whl is not a supported wheel on this platform.</a><a class="headerlink" href="#cp27mu-linux-x86-64-whl-is-not-a-supported-wheel-on-this-platform" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id24">7. *-cp27mu-linux_x86_64.whl is not a supported wheel on this platform.</a><a class="headerlink" href="#cp27mu-linux-x86-64-whl-is-not-a-supported-wheel-on-this-platform" title="Permalink to this headline"></a></h2>
<p>The main cause of this problem is that the <code class="code docutils literal"><span class="pre">wheel</span></code> package used when building the wheel is the latest version,
while the <code class="code docutils literal"><span class="pre">pip</span></code> package on the system is rather old. The fix is to update <code class="code docutils literal"><span class="pre">pip</span></code> and rebuild PaddlePaddle.
To update <code class="code docutils literal"><span class="pre">pip</span></code>:</p>
......@@ -418,7 +420,7 @@ PaddlePaddle的内存占用主要分为如下几个方面:</p>
</div>
</div>
<div class="section" id="python">
<h2><a class="toc-backref" href="#id23">8. All Python-related unit tests fail</a><a class="headerlink" href="#python" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id25">8. All Python-related unit tests fail</a><a class="headerlink" href="#python" title="Permalink to this headline"></a></h2>
<p>If all Python-related unit tests fail, as shown below:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="m">24</span> - test_PyDataProvider <span class="o">(</span>Failed<span class="o">)</span>
<span class="m">26</span> - test_RecurrentGradientMachine <span class="o">(</span>Failed<span class="o">)</span>
......@@ -449,7 +451,7 @@ Please uninstall paddle package before start unittest. Try to <span class="s1">&
</ul>
</div>
<div class="section" id="docker-gpu-cuda-driver-version-is-insufficient">
<h2><a class="toc-backref" href="#id24">9. &#8220;CUDA driver version is insufficient&#8221; when running the Docker GPU image</a><a class="headerlink" href="#docker-gpu-cuda-driver-version-is-insufficient" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id26">9. &#8220;CUDA driver version is insufficient&#8221; when running the Docker GPU image</a><a class="headerlink" href="#docker-gpu-cuda-driver-version-is-insufficient" title="Permalink to this headline"></a></h2>
<p>When using the PaddlePaddle GPU Docker image, users often see <cite>Cuda Error: CUDA driver version is insufficient for CUDA runtime version</cite>. The cause is that the CUDA-related drivers and libraries on the host have not been mapped into the container.
The fix is:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>$ <span class="nb">export</span> <span class="nv">CUDA_SO</span><span class="o">=</span><span class="s2">&quot;</span><span class="k">$(</span><span class="se">\l</span>s /usr/lib64/libcuda* <span class="p">|</span> xargs -I<span class="o">{}</span> <span class="nb">echo</span> <span class="s1">&#39;-v {}:{}&#39;</span><span class="k">)</span><span class="s2"> </span><span class="k">$(</span><span class="se">\l</span>s /usr/lib64/libnvidia* <span class="p">|</span> xargs -I<span class="o">{}</span> <span class="nb">echo</span> <span class="s1">&#39;-v {}:{}&#39;</span><span class="k">)</span><span class="s2">&quot;</span>
......@@ -460,7 +462,7 @@ $ docker run <span class="si">${</span><span class="nv">CUDA_SO</span><span clas
<p>For more on installing and using Docker, see the <a class="reference external" href="http://www.paddlepaddle.org/doc_cn/build_and_install/install/docker_install.html">PaddlePaddle Docker documentation</a>.</p>
</div>
<div class="section" id="cmake-pythonlibspythoninterp">
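<h2><a class="toc-backref" href="#id25">10. Building from source with CMake: the PythonLibs and PythonInterp versions found do not match</a><a class="headerlink" href="#cmake-pythonlibspythoninterp" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id27">10. Building from source with CMake: the PythonLibs and PythonInterp versions found do not match</a><a class="headerlink" href="#cmake-pythonlibspythoninterp" title="Permalink to this headline"></a></h2>
<p>This is caused by a flaw in CMake's current logic for finding Python. If several Python versions are installed on the system, the Python library and the Python interpreter that CMake finds may have different versions, which makes the PaddlePaddle build fail. The correct fix is
to force a specific Python version, as follows:</p>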
<blockquote>
......@@ -471,7 +473,7 @@ $ docker run <span class="si">${</span><span class="nv">CUDA_SO</span><span clas
<p>You need to specify the Python paths on your machine: <code class="docutils literal"><span class="pre">&lt;exc_path&gt;</span></code>, <code class="docutils literal"><span class="pre">&lt;lib_path&gt;</span></code>, <code class="docutils literal"><span class="pre">&lt;inc_path&gt;</span></code></p>
</div>
<div class="section" id="cmake-paddle0-0-0">
<h2><a class="toc-backref" href="#id26">11. Building from source with CMake: the Paddle version is 0.0.0</a><a class="headerlink" href="#cmake-paddle0-0-0" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id28">11. Building from source with CMake: the Paddle version is 0.0.0</a><a class="headerlink" href="#cmake-paddle0-0-0" title="Permalink to this headline"></a></h2>
<p>If running <code class="code docutils literal"><span class="pre">paddle</span> <span class="pre">version</span></code> prints <code class="code docutils literal"><span class="pre">PaddlePaddle</span> <span class="pre">0.0.0</span></code>, or running <code class="code docutils literal"><span class="pre">cmake</span> <span class="pre">..</span></code> prints</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>CMake Warning at cmake/version.cmake:20 <span class="o">(</span>message<span class="o">)</span>:
Cannot add paddle version from git tag
......@@ -480,7 +482,7 @@ $ docker run <span class="si">${</span><span class="nv">CUDA_SO</span><span clas
<p>then you need to fetch all remote branches to your machine with <code class="code docutils literal"><span class="pre">git</span> <span class="pre">fetch</span> <span class="pre">upstream</span></code>, and then rerun cmake.</p>
</div>
<div class="section" id="a-protocol-message-was-rejected-because-it-was-too-big">
<h2><a class="toc-backref" href="#id27">12. A protocol message was rejected because it was too big</a><a class="headerlink" href="#a-protocol-message-was-rejected-because-it-was-too-big" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id29">12. A protocol message was rejected because it was too big</a><a class="headerlink" href="#a-protocol-message-was-rejected-because-it-was-too-big" title="Permalink to this headline"></a></h2>
<p>If the following error appears when training NLP-related models:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="o">[</span>libprotobuf ERROR google/protobuf/io/coded_stream.cc:171<span class="o">]</span> A protocol message was rejected because it was too big <span class="o">(</span>more than <span class="m">67108864</span> bytes<span class="o">)</span>. To increase the limit <span class="o">(</span>or to disable these warnings<span class="o">)</span>, see CodedInputStream::SetTotalBytesLimit<span class="o">()</span> in google/protobuf/io/coded_stream.h.
F1205 <span class="m">14</span>:59:50.295174 <span class="m">14703</span> TrainerConfigHelper.cpp:59<span class="o">]</span> Check failed: m-&gt;conf.ParseFromString<span class="o">(</span>configProtoStr<span class="o">)</span>
......@@ -511,7 +513,7 @@ F1205 <span class="m">14</span>:59:50.295174 <span class="m">14703</span> Traine
<p>For the complete source code, see the <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/tree/develop/demo/seqToseq">seqToseq</a> example.</p>
</div>
<div class="section" id="gpu">
<h2><a class="toc-backref" href="#id28">13. How to specify GPU devices</a><a class="headerlink" href="#gpu" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id30">13. How to specify GPU devices</a><a class="headerlink" href="#gpu" title="Permalink to this headline"></a></h2>
<p>For example, suppose the machine has 4 GPUs, numbered from 0, and you want to use GPUs 2 and 3:</p>
<ul class="simple">
<li>Method 1: use the <a class="reference external" href="http://www.acceleware.com/blog/cudavisibledevices-masking-gpus">CUDA_VISIBLE_DEVICES</a> environment variable to select specific GPUs.</li>
......@@ -527,7 +529,7 @@ F1205 <span class="m">14</span>:59:50.295174 <span class="m">14703</span> Traine
</div>
</div>
<div class="section" id="floating-point-exception">
<h2><a class="toc-backref" href="#id29">14. What to do when training exits with <code class="code docutils literal"><span class="pre">Floating</span> <span class="pre">point</span> <span class="pre">exception</span></code>?</a><a class="headerlink" href="#floating-point-exception" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id31">14. What to do when training exits with <code class="code docutils literal"><span class="pre">Floating</span> <span class="pre">point</span> <span class="pre">exception</span></code>?</a><a class="headerlink" href="#floating-point-exception" title="Permalink to this headline"></a></h2>
<p>The Paddle binary traps floating-point exceptions at runtime. As soon as one occurs (that is, NaN or Inf appears during training), it exits immediately. Floating-point exceptions are usually caused by floating-point overflow, division by zero and similar problems.
There are two main causes:</p>
<ul class="simple">
......@@ -538,12 +540,57 @@ F1205 <span class="m">14</span>:59:50.295174 <span class="m">14703</span> Traine
<p>The main fix is to reduce the learning rate or to normalize the data.</p>
</div>
<div class="section" id="import-paddle-v2-as-paddle-importerror-no-module-named-v2">
<h2><a class="toc-backref" href="#id30">15. After building and installing, import paddle.v2 as paddle raises ImportError: No module named v2</a><a class="headerlink" href="#import-paddle-v2-as-paddle-importerror-no-module-named-v2" title="Permalink to this headline"></a></h2>
<h2><a class="toc-backref" href="#id32">15. After building and installing, import paddle.v2 as paddle raises ImportError: No module named v2</a><a class="headerlink" href="#import-paddle-v2-as-paddle-importerror-no-module-named-v2" title="Permalink to this headline"></a></h2>
<p>First check whether paddle v1 was installed before. If so, uninstall it first:</p>
<p>pip uninstall py_paddle paddle</p>
<p>Then install paddle's Python environment by running the following in the build directory:</p>
<p>pip install python/dist/paddle*.whl &amp;&amp; pip install ../paddle/dist/py_paddle*.whl</p>
</div>
<div class="section" id="id9">
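<h2><a class="toc-backref" href="#id33">16. What format does PaddlePaddle store parameters in, and how to convert them to and from plain text</a><a class="headerlink" href="#id9" title="Permalink to this headline"></a></h2>
<p>A model parameter file saved by PaddlePaddle consists of two parts: a 16-byte header and the network parameters. In the header, bytes 1-4 hold the PaddlePaddle version information and should simply be filled with 0; bytes 5-8 hold the number of bytes per parameter, which is 4 when the saved network parameters are of type float and 8 when they are double; bytes 9-16 hold the total number of saved parameters.</p>
<p>To convert model parameters saved by PaddlePaddle back to plain text, you can load the network parameters into a <code class="code docutils literal"><span class="pre">numpy.array</span></code> of the corresponding data type, skipping the header of the PaddlePaddle parameter file. If double precision was not specified when PaddlePaddle was built, it computes in float precision by default and the saved parameters are also of type float. In this case, <code class="code docutils literal"><span class="pre">dtype=float32</span></code> is generally used with <code class="code docutils literal"><span class="pre">numpy.array</span></code>. For example:</p>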
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>

<span class="k">def</span> <span class="nf">read_parameter</span><span class="p">(</span><span class="n">fname</span><span class="p">,</span> <span class="n">width</span><span class="p">):</span>
    <span class="c1"># read the raw bytes in binary mode</span>
    <span class="n">s</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">fname</span><span class="p">,</span> <span class="s2">&quot;rb&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
    <span class="c1"># skip the 16-byte header</span>
    <span class="n">vec</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">frombuffer</span><span class="p">(</span><span class="n">s</span><span class="p">[</span><span class="mi">16</span><span class="p">:],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
    <span class="c1"># width is the size of the corresponding layer</span>
    <span class="n">np</span><span class="o">.</span><span class="n">savetxt</span><span class="p">(</span><span class="n">fname</span> <span class="o">+</span> <span class="s2">&quot;.csv&quot;</span><span class="p">,</span> <span class="n">vec</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">width</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">),</span>
               <span class="n">fmt</span><span class="o">=</span><span class="s2">&quot;</span><span class="si">%.6f</span><span class="s2">&quot;</span><span class="p">,</span> <span class="n">delimiter</span><span class="o">=</span><span class="s2">&quot;,&quot;</span><span class="p">)</span>
</pre></div>
</div>
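<p>To convert plain-text parameters into model parameters that PaddlePaddle can load, first build the header and then write the network parameters. The code below converts a randomly generated matrix into model parameters that PaddlePaddle can load.</p>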
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">struct</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>

<span class="k">def</span> <span class="nf">gen_rand_param</span><span class="p">(</span><span class="n">param_file</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">height</span><span class="p">,</span> <span class="n">need_trans</span><span class="p">):</span>
    <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">()</span>
    <span class="c1"># header: version (0), bytes per value (4 for float), number of values</span>
    <span class="n">header</span> <span class="o">=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s2">&quot;iil&quot;</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="n">height</span> <span class="o">*</span> <span class="n">width</span><span class="p">)</span>
    <span class="n">param</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">))</span>
    <span class="c1"># write in binary mode so the bytes are stored unchanged</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">param_file</span><span class="p">,</span> <span class="s2">&quot;wb&quot;</span><span class="p">)</span> <span class="k">as</span> <span class="n">fparam</span><span class="p">:</span>
        <span class="n">fparam</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">header</span> <span class="o">+</span> <span class="n">param</span><span class="o">.</span><span class="n">tobytes</span><span class="p">())</span>
</pre></div>
</div>
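<p>As a sanity check on the file layout described above, the following minimal sketch writes a known matrix and reads it back. The helper names <code>write_param</code>, <code>read_param</code> and <code>round_trip_ok</code> are hypothetical and not part of PaddlePaddle, and the header layout (a 32-bit version, a 32-bit value size, a 64-bit element count) is an assumption inferred from the values packed in the example above.</p>

```python
import os
import struct
import tempfile

import numpy as np

# Assumed header layout, inferred from the values packed in the example above:
# int32 version (0), uint32 bytes per value (4), uint64 element count.
# "=" selects standard sizes, so the header is 16 bytes on every platform.
HEADER_FORMAT = "=iIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def write_param(path, matrix):
    """Write a float32 matrix as header + raw values (hypothetical helper)."""
    data = np.ascontiguousarray(matrix, dtype=np.float32)
    header = struct.pack(HEADER_FORMAT, 0, 4, data.size)
    with open(path, "wb") as f:
        f.write(header + data.tobytes())


def read_param(path, height, width):
    """Read the values back, skipping the header (hypothetical helper)."""
    with open(path, "rb") as f:
        f.seek(HEADER_SIZE)
        return np.fromfile(f, dtype=np.float32).reshape(height, width)


def round_trip_ok(height=3, width=5):
    """Return True if a matrix survives a write/read cycle unchanged."""
    original = np.arange(height * width, dtype=np.float32).reshape(height, width)
    fd, path = tempfile.mkstemp(suffix=".param")
    os.close(fd)
    try:
        write_param(path, original)
        # File size must equal the header plus 4 bytes per value.
        size_ok = os.path.getsize(path) == HEADER_SIZE + original.nbytes
        restored = read_param(path, height, width)
        return size_ok and np.array_equal(original, restored)
    finally:
        os.remove(path)
```

<p>Calling <code>round_trip_ok()</code> before handing a generated file to PaddlePaddle is a cheap way to catch a text-mode write or a header of the wrong length.</p>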
</div>
<div class="section" id="id10">
<h2><a class="toc-backref" href="#id34">17. How to load pre-trained parameters</a><a class="headerlink" href="#id10" title="Permalink to this headline"></a></h2>
<ul class="simple">
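<li>For each layer that loads pre-trained parameters, set the parameter attribute <code class="code docutils literal"><span class="pre">is_static=True</span></code> so that the layer's parameters stay fixed during training. Taking the embedding layer as an example:</li>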
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">emb_para</span> <span class="o">=</span> <span class="n">paddle</span><span class="o">.</span><span class="n">attr</span><span class="o">.</span><span class="n">Param</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s1">&#39;emb&#39;</span><span class="p">,</span> <span class="n">is_static</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">paddle</span><span class="o">.</span><span class="n">layer</span><span class="o">.</span><span class="n">embedding</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="n">word_dim</span><span class="p">,</span> <span class="nb">input</span><span class="o">=</span><span class="n">x</span><span class="p">,</span> <span class="n">param_attr</span><span class="o">=</span><span class="n">emb_para</span><span class="p">)</span>
</pre></div>
</div>
<ul class="simple">
<li>Load the pre-trained parameters from the model file into a <code class="code docutils literal"><span class="pre">numpy.array</span></code>. After creating the parameters, call <code class="code docutils literal"><span class="pre">parameters.set()</span></code> to load them. The first 16 bytes of a parameter file saved by PaddlePaddle are header information, so reading into a <code class="code docutils literal"><span class="pre">numpy.array</span></code> must start at the 17th byte. Taking the embedding layer as an example:</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">load_parameter</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="n">h</span><span class="p">,</span> <span class="n">w</span><span class="p">):</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="s1">&#39;rb&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span>  <span class="c1"># skip the 16-byte header</span>
        <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">fromfile</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span>

<span class="n">parameters</span> <span class="o">=</span> <span class="n">paddle</span><span class="o">.</span><span class="n">parameters</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">my_cost</span><span class="p">)</span>
<span class="n">parameters</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="s1">&#39;emb&#39;</span><span class="p">,</span> <span class="n">load_parameter</span><span class="p">(</span><span class="n">emb_param_file</span><span class="p">,</span> <span class="mi">30000</span><span class="p">,</span> <span class="mi">256</span><span class="p">))</span>
</pre></div>
</div>
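<p>Skipping the header blindly hides shape mistakes until <code>reshape</code> fails or, worse, succeeds with the wrong data. The sketch below is one possible stricter variant that parses the header and compares it with the expected shape first. The function name <code>load_parameter_checked</code> is hypothetical, and the header field layout (32-bit version, 32-bit value size, 64-bit element count) is an assumption rather than something this page documents.</p>

```python
import struct

import numpy as np

# Assumed 16-byte header: int32 version, uint32 bytes per value,
# uint64 element count. This layout is an assumption, not documented here.
HEADER_FORMAT = "=iIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def load_parameter_checked(file_name, h, w):
    """Load an h-by-w float32 parameter, validating the header first.

    Hypothetical helper; raises ValueError when the file does not match.
    """
    with open(file_name, "rb") as f:
        raw = f.read()
    if len(raw) < HEADER_SIZE:
        raise ValueError("file is shorter than the 16-byte header")

    version, value_size, count = struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])
    if value_size != 4:
        raise ValueError("expected 4-byte float values, got %d" % value_size)
    if count != h * w:
        raise ValueError("header lists %d values, expected %d" % (count, h * w))

    # Interpret everything after the header as float32 values.
    values = np.frombuffer(raw, dtype=np.float32, offset=HEADER_SIZE)
    if values.size != count:
        raise ValueError("file holds %d values, header lists %d" % (values.size, count))
    # Copy so the result is writable and independent of the raw bytes.
    return values.reshape(h, w).copy()
```

<p>The returned array can be passed to <code>parameters.set()</code> exactly as in the example above, for instance <code>parameters.set('emb', load_parameter_checked(emb_param_file, 30000, 256))</code>.</p>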
</div>
</div>