Commit bebdad9c, authored by Travis CI

Deploy to GitHub Pages: 89bbc4f6

Parent baaae3cb
......@@ -2020,16 +2020,17 @@ explain how sequence_expand works:</p>
<dd><p>Lstm unit layer. The equation of a lstm step is:</p>
<blockquote>
<div><div class="math">
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{x_i}x_{t} + W_{h_i}h_{t-1} + W_{c_i}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{x_f}x_{t} + W_{h_f}h_{t-1} + W_{c_f}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{x_c}x_t+W_{h_c}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{x_o}x_{t} + W_{h_o}h_{t-1} + W_{c_o}c_t + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div>
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{x_i}x_{t} + W_{h_i}h_{t-1} + b_i)\\f_t &amp; = \sigma(W_{x_f}x_{t} + W_{h_f}h_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{x_c}x_t + W_{h_c}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{x_o}x_{t} + W_{h_o}h_{t-1} + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div>
</div></blockquote>
<p>The inputs of lstm unit includes <span class="math">\(x_t\)</span>, <span class="math">\(h_{t-1}\)</span> and
<span class="math">\(c_{t-1}\)</span>. The implementation separates the linear transformation
and non-linear transformation apart. Here, we take <span class="math">\(i_t\)</span> as an
example. The linear transformation is applied by calling a <cite>fc</cite> layer and
the equation is:</p>
<p>The inputs of the lstm unit include <span class="math">\(x_t\)</span>, <span class="math">\(h_{t-1}\)</span> and
<span class="math">\(c_{t-1}\)</span>. The 2nd dimensions of <span class="math">\(h_{t-1}\)</span> and <span class="math">\(c_{t-1}\)</span>
must be the same. The implementation separates the linear transformation from
the non-linear transformation. Here, we take <span class="math">\(i_t\)</span> as an example.
The linear transformation is applied by calling a <cite>fc</cite> layer and the
equation is:</p>
<blockquote>
<div><div class="math">
\[L_{i_t} = W_{x_i}x_{t} + W_{h_i}h_{t-1} + W_{c_i}c_{t-1} + b_i\]</div>
\[L_{i_t} = W_{x_i}x_{t} + W_{h_i}h_{t-1} + b_i\]</div>
</div></blockquote>
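<p>The split between the linear and non-linear parts can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the Paddle implementation: the function name <cite>lstm_step</cite>, the packing of all four gate weights into one matrix <cite>W</cite> of shape (N + S) x 4S, the gate order inside that matrix, and the way <cite>forget_bias</cite> is added to the forget-gate pre-activation are all hypothetical choices made for the example.</p>

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b, forget_bias=0.0):
    """One LSTM step. x_t is (M, N); h_prev and c_prev are (M, S).

    W is (N + S, 4 * S) and b is (4 * S,). Both are hypothetical packed
    parameters used only for this sketch.
    """
    # Linear part (the role played by the fc layer): a single affine map of
    # [x_t, h_prev] yields the pre-activations of all gates, shape (M, 4 * S).
    concat = np.concatenate([x_t, h_prev], axis=1)
    linear = concat @ W + b

    # Assumed column order: input gate, forget gate, output gate, candidate.
    i_lin, f_lin, o_lin, g_lin = np.split(linear, 4, axis=1)

    # Non-linear part: gate activations and the state update.
    i_t = sigmoid(i_lin)
    f_t = sigmoid(f_lin + forget_bias)  # assumption: bias added before sigmoid
    o_t = sigmoid(o_lin)
    c_t = f_t * c_prev + i_t * np.tanh(g_lin)
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


# Usage with batch size M = 4, input size N = 10, lstm size S = 30.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 10))
h0 = np.zeros((4, 30))
c0 = np.zeros((4, 30))
W = rng.standard_normal((40, 120)) * 0.1
b = np.zeros(120)
h1, c1 = lstm_step(x, h0, c0, W, b)
```

<p>With <cite>forget_bias</cite> left at 0, the sketch reduces to the step equations above without the peephole terms.</p>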
<p>The non-linear transformation is applied by calling <cite>lstm_unit_op</cite> and the
equation is:</p>
......@@ -2043,9 +2044,12 @@ equation is:</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>x_t</strong> (<em>Variable</em>) &#8211; The input value of current step.</li>
<li><strong>hidden_t_prev</strong> (<em>Variable</em>) &#8211; The hidden value of lstm unit.</li>
<li><strong>cell_t_prev</strong> (<em>Variable</em>) &#8211; The cell value of lstm unit.</li>
<li><strong>x_t</strong> (<em>Variable</em>) &#8211; The input value of the current step, a 2-D tensor with shape
M x N, where M is the batch size and N is the input size.</li>
<li><strong>hidden_t_prev</strong> (<em>Variable</em>) &#8211; The hidden value of the lstm unit, a 2-D tensor
with shape M x S, where M is the batch size and S is the size of the lstm unit.</li>
<li><strong>cell_t_prev</strong> (<em>Variable</em>) &#8211; The cell value of the lstm unit, a 2-D tensor with
shape M x S, where M is the batch size and S is the size of the lstm unit.</li>
<li><strong>forget_bias</strong> (<em>float</em>) &#8211; The forget bias of lstm unit.</li>
<li><strong>param_attr</strong> (<em>ParamAttr</em>) &#8211; The attributes of parameter weights, used to set
initializer, name etc.</li>
......@@ -2060,14 +2064,14 @@ bias weights will be created and be set to default value.</li>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first">tuple</p>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Raises:</th><td class="field-body"><p class="first last"><code class="xref py py-exc docutils literal"><span class="pre">ValueError</span></code> &#8211; The ranks of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> not be 2 or the 1st dimensions of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> not be the same.</p>
<tr class="field-even field"><th class="field-name">Raises:</th><td class="field-body"><p class="first last"><code class="xref py py-exc docutils literal"><span class="pre">ValueError</span></code> &#8211; If the ranks of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> are not all 2, if the 1st dimensions of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> are not the same, or if the 2nd dimensions of <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> are not the same.</p>
</td>
</tr>
</tbody>
</table>
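<p>The three shape conditions listed under Raises can be written as a small standalone check. This is a hedged sketch that works on plain shape tuples. The helper name <cite>check_lstm_unit_shapes</cite> and its error messages are hypothetical and are not taken from the Paddle source.</p>

```python
def check_lstm_unit_shapes(x_shape, h_shape, c_shape):
    """Validate (M, N), (M, S), (M, S) shapes and return the lstm size S."""
    # All three inputs must be 2-D tensors.
    if len(x_shape) != 2 or len(h_shape) != 2 or len(c_shape) != 2:
        raise ValueError(
            "x_t, hidden_t_prev and cell_t_prev must all have rank 2.")
    # The batch size (1st dimension) must agree across all three inputs.
    if x_shape[0] != h_shape[0] or x_shape[0] != c_shape[0]:
        raise ValueError(
            "The 1st dimensions of x_t, hidden_t_prev and cell_t_prev "
            "must be the same.")
    # The lstm size (2nd dimension) must agree between hidden and cell.
    if h_shape[1] != c_shape[1]:
        raise ValueError(
            "The 2nd dimensions of hidden_t_prev and cell_t_prev "
            "must be the same.")
    return c_shape[1]


# Hidden and cell states of size 30 pass the check; a hidden size of 20
# paired with a cell size of 30 would raise ValueError.
size = check_lstm_unit_shapes((4, 10), (4, 30), (4, 30))
```

<p>The last condition is the one this commit adds, which is consistent with the example changing the size of <cite>prev_hidden</cite> from 20 to 30 so that it matches <cite>prev_cell</cite>.</p>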
<p class="rubric">Examples</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">x_t</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">x_t_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>
<span class="n">prev_hidden</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prev_hidden_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">20</span><span class="p">)</span>
<span class="n">prev_hidden</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prev_hidden_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span>
<span class="n">prev_cell</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prev_cell_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span>
<span class="n">hidden_value</span><span class="p">,</span> <span class="n">cell_value</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">lstm_unit</span><span class="p">(</span><span class="n">x_t</span><span class="o">=</span><span class="n">x_t</span><span class="p">,</span>
<span class="n">hidden_t_prev</span><span class="o">=</span><span class="n">prev_hidden</span><span class="p">,</span>
......
......@@ -2033,16 +2033,17 @@ explain how sequence_expand works:</p>
<dd><p>Lstm unit layer. The equation of a lstm step is:</p>
<blockquote>
<div><div class="math">
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{x_i}x_{t} + W_{h_i}h_{t-1} + W_{c_i}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{x_f}x_{t} + W_{h_f}h_{t-1} + W_{c_f}c_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{x_c}x_t+W_{h_c}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{x_o}x_{t} + W_{h_o}h_{t-1} + W_{c_o}c_t + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div>
\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{x_i}x_{t} + W_{h_i}h_{t-1} + b_i)\\f_t &amp; = \sigma(W_{x_f}x_{t} + W_{h_f}h_{t-1} + b_f)\\c_t &amp; = f_tc_{t-1} + i_t tanh (W_{x_c}x_t + W_{h_c}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{x_o}x_{t} + W_{h_o}h_{t-1} + b_o)\\h_t &amp; = o_t tanh(c_t)\end{aligned}\end{align} \]</div>
</div></blockquote>
<p>The inputs of lstm unit includes <span class="math">\(x_t\)</span>, <span class="math">\(h_{t-1}\)</span> and
<span class="math">\(c_{t-1}\)</span>. The implementation separates the linear transformation
and non-linear transformation apart. Here, we take <span class="math">\(i_t\)</span> as an
example. The linear transformation is applied by calling a <cite>fc</cite> layer and
the equation is:</p>
<p>The inputs of the lstm unit include <span class="math">\(x_t\)</span>, <span class="math">\(h_{t-1}\)</span> and
<span class="math">\(c_{t-1}\)</span>. The 2nd dimensions of <span class="math">\(h_{t-1}\)</span> and <span class="math">\(c_{t-1}\)</span>
must be the same. The implementation separates the linear transformation from
the non-linear transformation. Here, we take <span class="math">\(i_t\)</span> as an example.
The linear transformation is applied by calling a <cite>fc</cite> layer and the
equation is:</p>
<blockquote>
<div><div class="math">
\[L_{i_t} = W_{x_i}x_{t} + W_{h_i}h_{t-1} + W_{c_i}c_{t-1} + b_i\]</div>
\[L_{i_t} = W_{x_i}x_{t} + W_{h_i}h_{t-1} + b_i\]</div>
</div></blockquote>
<p>The non-linear transformation is applied by calling <cite>lstm_unit_op</cite> and the
equation is:</p>
......@@ -2056,9 +2057,12 @@ equation is:</p>
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>x_t</strong> (<em>Variable</em>) &#8211; The input value of current step.</li>
<li><strong>hidden_t_prev</strong> (<em>Variable</em>) &#8211; The hidden value of lstm unit.</li>
<li><strong>cell_t_prev</strong> (<em>Variable</em>) &#8211; The cell value of lstm unit.</li>
<li><strong>x_t</strong> (<em>Variable</em>) &#8211; The input value of the current step, a 2-D tensor with shape
M x N, where M is the batch size and N is the input size.</li>
<li><strong>hidden_t_prev</strong> (<em>Variable</em>) &#8211; The hidden value of the lstm unit, a 2-D tensor
with shape M x S, where M is the batch size and S is the size of the lstm unit.</li>
<li><strong>cell_t_prev</strong> (<em>Variable</em>) &#8211; The cell value of the lstm unit, a 2-D tensor with
shape M x S, where M is the batch size and S is the size of the lstm unit.</li>
<li><strong>forget_bias</strong> (<em>float</em>) &#8211; The forget bias of lstm unit.</li>
<li><strong>param_attr</strong> (<em>ParamAttr</em>) &#8211; The attributes of parameter weights, used to set
initializer, name etc.</li>
......@@ -2073,14 +2077,14 @@ bias weights will be created and be set to default value.</li>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first">tuple</p>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Raises:</th><td class="field-body"><p class="first last"><code class="xref py py-exc docutils literal"><span class="pre">ValueError</span></code> &#8211; The ranks of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> not be 2 or the 1st dimensions of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> not be the same.</p>
<tr class="field-even field"><th class="field-name">Raises:</th><td class="field-body"><p class="first last"><code class="xref py py-exc docutils literal"><span class="pre">ValueError</span></code> &#8211; If the ranks of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> are not all 2, if the 1st dimensions of <strong>x_t</strong>, <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> are not the same, or if the 2nd dimensions of <strong>hidden_t_prev</strong> and <strong>cell_t_prev</strong> are not the same.</p>
</td>
</tr>
</tbody>
</table>
<p class="rubric">Examples</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">x_t</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">x_t_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>
<span class="n">prev_hidden</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prev_hidden_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">20</span><span class="p">)</span>
<span class="n">prev_hidden</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prev_hidden_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span>
<span class="n">prev_cell</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prev_cell_data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span>
<span class="n">hidden_value</span><span class="p">,</span> <span class="n">cell_value</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">lstm_unit</span><span class="p">(</span><span class="n">x_t</span><span class="o">=</span><span class="n">x_t</span><span class="p">,</span>
<span class="n">hidden_t_prev</span><span class="o">=</span><span class="n">prev_hidden</span><span class="p">,</span>
......