<trclass="field-even field"><thclass="field-name">Returns:</th><tdclass="field-body"><pclass="first">The hidden state of GRU. The shape is (T times D), and lod is the same with the input.</p>
<trclass="field-even field"><thclass="field-name">Returns:</th><tdclass="field-body"><pclass="first">The hidden state of GRU. The shape is <spanclass="math">\((T \times D)\)</span>, and lod is the same with the input.</p>
Returns:
    The tensor variable storing the element-wise squared error difference of input and label.
<codeclass="xref py py-attr docutils literal"><spanclass="pre">begin_norm_axis</span><spanclass="pre">...</span><spanclass="pre">rank(input)</span></code> and calculate the moment statistics
along these dimensions for each feature vector <spanclass="math">\(a\)</span> with size
<spanclass="math">\(H\)</span>, then normalize each feature vector using the corresponding
statistics. After that, apply learnable gain and bias on the normalized
tensor to scale and shift if <codeclass="xref py py-attr docutils literal"><spanclass="pre">scale</span></code> and <codeclass="xref py py-attr docutils literal"><spanclass="pre">shift</span></code> are set.</p>
Refer to Layer Normalization: https://arxiv.org/pdf/1607.06450v1.pdf
The formula is as follows:

\[
\begin{aligned}
\mu &= \frac{1}{H}\sum_{i=1}^{H} a_i \\
\sigma &= \sqrt{\frac{1}{H}\sum_{i=1}^{H}(a_i - \mu)^2} \\
h &= f\Big(\frac{g}{\sigma}(a - \mu) + b\Big)
\end{aligned}
\]
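A NumPy sketch of this formula (illustrative only, not the actual Paddle kernel): epsilon is folded into the variance as the parameter list below describes, and \(f\) is taken as the identity.

```python
import numpy as np

def layer_norm_ref(a, g, b, epsilon=1e-5):
    """Reference layer norm over the last axis: a is (N, H); g, b are (H,)."""
    mu = a.mean(axis=-1, keepdims=True)                 # \mu
    var = ((a - mu) ** 2).mean(axis=-1, keepdims=True)  # \sigma^2
    return g * (a - mu) / np.sqrt(var + epsilon) + b    # h, with f = identity
```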
Parameters:
    * input (Variable) – The input tensor variable.
    * scale (bool) – Whether to learn the adaptive gain \(g\) after normalization.
    * shift (bool) – Whether to learn the adaptive bias \(b\) after normalization.
    * begin_norm_axis (int) – The normalization will be performed along dimensions from begin_norm_axis to rank(input).
    * epsilon (float) – The small value added to the variance to prevent division by zero.
    * param_attr (ParamAttr|None) – The parameter attribute for the learnable gain \(g\).
    * bias_attr (ParamAttr|None) – The parameter attribute for the learnable bias \(b\).
    * act (str) – Activation to be applied to the output of layer normalization.
<trclass="field-even field"><thclass="field-name">Returns:</th><tdclass="field-body"><pclass="first">A tensor variable with the same shape as the input.</p>
<trclass="field-even field"><thclass="field-name">Returns:</th><tdclass="field-body"><pclass="first">The hidden state of GRU. The shape is (T times D), and lod is the same with the input.</p>
<trclass="field-even field"><thclass="field-name">Returns:</th><tdclass="field-body"><pclass="first">The hidden state of GRU. The shape is <spanclass="math">\((T \times D)\)</span>, and lod is the same with the input.</p>
<dt>The tensor variable storing the element-wise squared error</dt>
<dd><pclass="first last">difference of input and label.</p>
</dd>
</dl>
</p>
<trclass="field-even field"><thclass="field-name">Returns:</th><tdclass="field-body"><pclass="first">The tensor variable storing the element-wise squared error difference of input and label.</p>
<codeclass="xref py py-attr docutils literal"><spanclass="pre">begin_norm_axis</span><spanclass="pre">...</span><spanclass="pre">rank(input)</span></code> and calculate the moment statistics
along these dimensions for each feature vector <spanclass="math">\(a\)</span> with size
<spanclass="math">\(H\)</span>, then normalize each feature vector using the corresponding
statistics. After that, apply learnable gain and bias on the normalized
tensor to scale and shift if <codeclass="xref py py-attr docutils literal"><spanclass="pre">scale</span></code> and <codeclass="xref py py-attr docutils literal"><spanclass="pre">shift</span></code> are set.</p>
<p>Refer to <aclass="reference external"href="https://arxiv.org/pdf/1607.06450v1.pdf">Layer Normalization</a></p>
<p>The formula is as follows:</p>
<divclass="math">
\[ \begin{align}\begin{aligned}\mu & = \frac{1}{H}\sum_{i=1}^{H} a_i\\\sigma & = \sqrt{\frac{1}{H}\sum_{i=1}^{H}(a_i - \mu)^2}\\h & = f(\frac{g}{\sigma}(a - \mu) + b)\end{aligned}\end{align} \]</div>
<li><strong>input</strong> (<em>Variable</em>) – The input tensor variable.</li>
<li><strong>scale</strong> (<em>bool</em>) – Whether to learn the adaptive gain <spanclass="math">\(g\)</span> after
normalization.</li>
<li><strong>shift</strong> (<em>bool</em>) – Whether to learn the adaptive bias <spanclass="math">\(b\)</span> after
normalization.</li>
<li><strong>begin_norm_axis</strong> (<em>bool</em>) – The normalization will be performed along
dimensions from <codeclass="xref py py-attr docutils literal"><spanclass="pre">begin_norm_axis</span></code> to <codeclass="xref py py-attr docutils literal"><spanclass="pre">rank(input)</span></code>.</li>
<li><strong>epsilon</strong> (<em>float</em>) – The small value added to the variance to prevent
division by zero.</li>
<li><strong>param_attr</strong> (<em>ParamAttr|None</em>) – The parameter attribute for the learnable
gain <spanclass="math">\(g\)</span>.</li>
<li><strong>bias_attr</strong> (<em>ParamAttr|None</em>) – The parameter attribute for the learnable
bias <spanclass="math">\(b\)</span>.</li>
<li><strong>act</strong> (<em>str</em>) – Activation to be applied to the output of layer normalizaiton.</li>
</ul>
</td>
</tr>
<trclass="field-even field"><thclass="field-name">Returns:</th><tdclass="field-body"><pclass="first">A tensor variable with the same shape as the input.</p>
<trclass="field-even field"><thclass="field-name">返回:</th><tdclass="field-body"><pclass="first">The hidden state of GRU. The shape is (T times D), and lod is the same with the input.</p>
<trclass="field-even field"><thclass="field-name">返回:</th><tdclass="field-body"><pclass="first">The hidden state of GRU. The shape is <spanclass="math">\((T \times D)\)</span>, and lod is the same with the input.</p>
<dt>The tensor variable storing the element-wise squared error</dt>
<dd><pclass="first last">difference of input and label.</p>
</dd>
</dl>
</p>
<trclass="field-even field"><thclass="field-name">返回:</th><tdclass="field-body"><pclass="first">The tensor variable storing the element-wise squared error difference of input and label.</p>
<codeclass="xref py py-attr docutils literal"><spanclass="pre">begin_norm_axis</span><spanclass="pre">...</span><spanclass="pre">rank(input)</span></code> and calculate the moment statistics
along these dimensions for each feature vector <spanclass="math">\(a\)</span> with size
<spanclass="math">\(H\)</span>, then normalize each feature vector using the corresponding
statistics. After that, apply learnable gain and bias on the normalized
tensor to scale and shift if <codeclass="xref py py-attr docutils literal"><spanclass="pre">scale</span></code> and <codeclass="xref py py-attr docutils literal"><spanclass="pre">shift</span></code> are set.</p>
<p>Refer to <aclass="reference external"href="https://arxiv.org/pdf/1607.06450v1.pdf">Layer Normalization</a></p>
<p>The formula is as follows:</p>
<divclass="math">
\[ \begin{align}\begin{aligned}\mu & = \frac{1}{H}\sum_{i=1}^{H} a_i\\\sigma & = \sqrt{\frac{1}{H}\sum_{i=1}^{H}(a_i - \mu)^2}\\h & = f(\frac{g}{\sigma}(a - \mu) + b)\end{aligned}\end{align} \]</div>
<li><strong>input</strong> (<em>Variable</em>) – The input tensor variable.</li>
<li><strong>scale</strong> (<em>bool</em>) – Whether to learn the adaptive gain <spanclass="math">\(g\)</span> after
normalization.</li>
<li><strong>shift</strong> (<em>bool</em>) – Whether to learn the adaptive bias <spanclass="math">\(b\)</span> after
normalization.</li>
<li><strong>begin_norm_axis</strong> (<em>bool</em>) – The normalization will be performed along
dimensions from <codeclass="xref py py-attr docutils literal"><spanclass="pre">begin_norm_axis</span></code> to <codeclass="xref py py-attr docutils literal"><spanclass="pre">rank(input)</span></code>.</li>
<li><strong>epsilon</strong> (<em>float</em>) – The small value added to the variance to prevent
division by zero.</li>
<li><strong>param_attr</strong> (<em>ParamAttr|None</em>) – The parameter attribute for the learnable
gain <spanclass="math">\(g\)</span>.</li>
<li><strong>bias_attr</strong> (<em>ParamAttr|None</em>) – The parameter attribute for the learnable
bias <spanclass="math">\(b\)</span>.</li>
<li><strong>act</strong> (<em>str</em>) – Activation to be applied to the output of layer normalizaiton.</li>
</ul>
</td>
</tr>
<trclass="field-even field"><thclass="field-name">返回:</th><tdclass="field-body"><pclass="first">A tensor variable with the same shape as the input.</p>