Deploy to GitHub Pages: 41b83884

864af933 · Travis CI · c3eeadef · 864af933 · 864af933 · 864af933
5 changed files
--- a/develop/doc/api/v2/fluid/layers.html
+++ b/develop/doc/api/v2/fluid/layers.html
@@ -350,7 +350,104 @@ constructor.</p>
 <dl class="function">
 <dt>
 <code class="descclassname">paddle.v2.fluid.layers.</code><code class="descname">dynamic_lstm</code><span class="sig-paren">(</span><em>input</em>, <em>size</em>, <em>param_attr=None</em>, <em>bias_attr=None</em>, <em>use_peepholes=True</em>, <em>is_reverse=False</em>, <em>gate_activation='sigmoid'</em>, <em>cell_activation='tanh'</em>, <em>candidate_activation='tanh'</em>, <em>dtype='float32'</em><span class="sig-paren">)</span></dt>
-<dd></dd></dl>
+<dd><p><strong>Dynamic LSTM Layer</strong></p>
+<p>The default implementation uses diagonal/peephole connections
+(<a class="reference external" href="https://arxiv.org/pdf/1402.1128.pdf">https://arxiv.org/pdf/1402.1128.pdf</a>). The formulas are as follows:</p>
+<div class="math">
+\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{ix}x_{t} + W_{ih}h_{t-1} + W_{ic}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{fx}x_{t} + W_{fh}h_{t-1} + W_{fc}c_{t-1} + b_f)\\\tilde{c_t} &amp; = act_g(W_{cx}x_t + W_{ch}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{ox}x_{t} + W_{oh}h_{t-1} + W_{oc}c_t + b_o)\\c_t &amp; = f_t \odot c_{t-1} + i_t \odot \tilde{c_t}\\h_t &amp; = o_t \odot act_h(c_t)\end{aligned}\end{align} \]</div>
+<p>where the <span class="math">\(W\)</span> terms denote weight matrices (e.g. <span class="math">\(W_{ix}\)</span> is
+the matrix of weights connecting the input to the input gate), and <span class="math">\(W_{ic}, W_{fc}, W_{oc}\)</span> are diagonal weight matrices for peephole connections. In
+our implementation, we use vectors to represent these diagonal weight
+matrices. The <span class="math">\(b\)</span> terms denote bias vectors (<span class="math">\(b_i\)</span> is the input
+gate bias vector), <span class="math">\(\sigma\)</span> is a nonlinear activation, such as
+the logistic sigmoid function, and <span class="math">\(i, f, o\)</span> and <span class="math">\(c\)</span> are the input
+gate, forget gate, output gate, and cell activation vectors, respectively,
+all of which have the same size as the cell output activation vector <span class="math">\(h\)</span>.</p>
+<p><span class="math">\(\odot\)</span> denotes the element-wise product of vectors. <span class="math">\(act_g\)</span>
+and <span class="math">\(act_h\)</span> are the cell input and cell output activation functions;
+<cite>tanh</cite> is usually used for both. <span class="math">\(\tilde{c_t}\)</span> is also called the
+candidate hidden state, which is computed from the current input and
+the previous hidden state.</p>
+<p>Set <cite>use_peepholes</cite> to <cite>False</cite> to disable peephole connections. The formulas
+for that case are omitted here; please refer to the paper
+<a class="reference external" href="http://www.bioinf.jku.at/publications/older/2604.pdf">http://www.bioinf.jku.at/publications/older/2604.pdf</a> for details.</p>
+<p>Note that the <span class="math">\(W_{ix}x_{t}, W_{fx}x_{t}, W_{cx}x_{t}, W_{ox}x_{t}\)</span>
+operations on the input <span class="math">\(x_{t}\)</span> are NOT included in this operator.
+Users can apply a fully-connected layer before the LSTM layer to compute them.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>input</strong> (<em>Variable</em>) &#8211; The input of the dynamic_lstm layer, which supports
+variable-length input sequences. The underlying
+tensor in this Variable is a matrix with shape
+(T x 4D), where T is the total number of time steps in this
+mini-batch and D is the hidden size.</li>
+<li><strong>size</strong> (<em>int</em>) &#8211; 4 * hidden size.</li>
+<li><strong>param_attr</strong> (<em>ParamAttr</em>) &#8211; <p>The parameter attribute for the learnable
+hidden-hidden weights.</p>
+<ul>
+<li>The shape is (D x 4D), where D is the hidden
+size.</li>
+<li>Weights = {<span class="math">\(W_{ch}, W_{ih}, W_{fh}, W_{oh}\)</span>}</li>
+</ul>
+</li>
+<li><strong>bias_attr</strong> (<em>ParamAttr</em>) &#8211; <p>The bias attribute for the learnable bias
+weights. It always contains the input-hidden bias weights and,
+when <cite>use_peepholes</cite> is <cite>True</cite>, also the peephole
+connection weights.</p>
+<ol class="arabic">
+<li><cite>use_peepholes = False</cite></li>
+</ol>
+<blockquote>
+<div><ul>
+<li>The shape is (1 x 4D).</li>
+<li>Biases = {<span class="math">\(b_c, b_i, b_f, b_o\)</span>}.</li>
+</ul>
+</div></blockquote>
+<ol class="arabic" start="2">
+<li><cite>use_peepholes = True</cite></li>
+</ol>
+<blockquote>
+<div><ul>
+<li>The shape is (1 x 7D).</li>
+<li>Biases = {<span class="math">\(b_c, b_i, b_f, b_o, W_{ic}, W_{fc}, W_{oc}\)</span>}.</li>
+</ul>
+</div></blockquote>
+</li>
+<li><strong>use_peepholes</strong> (<em>bool</em>) &#8211; Whether to enable diagonal/peephole connections,
+default <cite>True</cite>.</li>
+<li><strong>is_reverse</strong> (<em>bool</em>) &#8211; Whether to compute reversed LSTM, default <cite>False</cite>.</li>
+<li><strong>gate_activation</strong> (<em>str</em>) &#8211; The activation for input gate, forget gate and
+output gate. Choices = [&#8220;sigmoid&#8221;, &#8220;tanh&#8221;, &#8220;relu&#8221;,
+&#8220;identity&#8221;], default &#8220;sigmoid&#8221;.</li>
+<li><strong>cell_activation</strong> (<em>str</em>) &#8211; The activation for cell output. Choices = [&#8220;sigmoid&#8221;,
+&#8220;tanh&#8221;, &#8220;relu&#8221;, &#8220;identity&#8221;], default &#8220;tanh&#8221;.</li>
+<li><strong>candidate_activation</strong> (<em>str</em>) &#8211; The activation for candidate hidden state.
+Choices = [&#8220;sigmoid&#8221;, &#8220;tanh&#8221;, &#8220;relu&#8221;, &#8220;identity&#8221;],
+default &#8220;tanh&#8221;.</li>
+<li><strong>dtype</strong> (<em>str</em>) &#8211; Data type. Choices = [&#8220;float32&#8221;, &#8220;float64&#8221;], default &#8220;float32&#8221;.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The hidden state and the cell state of the LSTM. Both have shape (T x D), and their LoD is the same as that of the <cite>input</cite>.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">tuple</p>
+</td>
+</tr>
+</tbody>
+</table>
+<p class="rubric">Examples</p>
+<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">hidden_dim</span> <span class="o">=</span> <span class="mi">512</span>
+<span class="n">forward_proj</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">input_seq</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">hidden_dim</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span>
+                               <span class="n">act</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">bias_attr</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
+<span class="n">forward</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">dynamic_lstm</span><span class="p">(</span>
+    <span class="nb">input</span><span class="o">=</span><span class="n">forward_proj</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">hidden_dim</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span> <span class="n">use_peepholes</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
+</pre></div>
+</div>
+</dd></dl>
 </div>
 <div class="section" id="data">
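
Note on the `dynamic_lstm` formulas documented above: the following is a minimal NumPy sketch of one time step for a single sequence, not the operator's actual kernel. It assumes the input projection has already been computed by a preceding fully-connected layer (as the docstring requires), and it assumes that the in-memory gate order matches the order listed in the docs, i.e. `{c, i, f, o}` for weights and biases followed by `{W_ic, W_fc, W_oc}` for peepholes. The function name `lstm_step` and the `h_prev @ weight` multiplication convention are illustrative choices, and the activations are fixed to the defaults (`sigmoid`, `tanh`).

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_proj, h_prev, c_prev, weight, bias, use_peepholes=True):
    """One LSTM time step following the documented formulas (illustrative only).

    x_proj : (4D,) pre-computed input projection, assumed order {c, i, f, o}
    h_prev : (D,)  previous hidden state h_{t-1}
    c_prev : (D,)  previous cell state c_{t-1}
    weight : (D, 4D) hidden-hidden weights, assumed order {W_ch, W_ih, W_fh, W_oh}
    bias   : (1, 4D) without peepholes, (1, 7D) with peepholes
    """
    D = h_prev.shape[0]
    b = bias.reshape(-1)

    # Input projection + recurrent projection + gate biases.
    gates = x_proj + h_prev @ weight + b[:4 * D]
    g_c, g_i, g_f, g_o = np.split(gates, 4)

    if use_peepholes:
        # Diagonal peephole matrices are stored as vectors.
        w_ic, w_fc, w_oc = np.split(b[4 * D:], 3)
        g_i = g_i + w_ic * c_prev
        g_f = g_f + w_fc * c_prev

    i = sigmoid(g_i)
    f = sigmoid(g_f)
    c_tilde = np.tanh(g_c)           # act_g, candidate state
    c = f * c_prev + i * c_tilde     # new cell state c_t

    if use_peepholes:
        g_o = g_o + w_oc * c         # the output gate peeks at c_t, not c_{t-1}
    o = sigmoid(g_o)
    h = o * np.tanh(c)               # act_h
    return h, c
```

With `use_peepholes=False` the same function accepts a `(1, 4D)` bias and skips the three peephole terms, which mirrors the two bias shapes listed for `bias_attr`.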

--- a/develop/doc/operators.json
+++ b/develop/doc/operators.json
@@ -286,7 +286,7 @@
   "intermediate" : 0
 }, { 
   "name" : "C0",
-   "comment" : "(Tensor, optional) the initial cell state is an optional input. This is a tensor with shape (N x D), where N is the batch size. `H0` and `C0` can be NULL but only at the same time",
+   "comment" : "(Tensor, optional) the initial cell state is an optional input. This is a tensor with shape (N x D), where N is the batch size. `H0` and `C0` can be NULL but only at the same time.",
   "duplicable" : 0,
   "intermediate" : 0
 }, { 
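
Note on the `C0` comment above: the constraint that `H0` and `C0` may be omitted only together can be expressed as a small check. The sketch below is hypothetical and not taken from the operator source; the helper name `resolve_initial_states` is made up for illustration, and falling back to zero-filled states when both are omitted is an assumption (a common convention) rather than documented behavior.

```python
import numpy as np


def resolve_initial_states(h0, c0, batch_size, hidden_size, dtype="float32"):
    """Validate optional initial states of shape (N x D) (illustrative only)."""
    # H0 and C0 must be given together or omitted together.
    if (h0 is None) != (c0 is None):
        raise ValueError("H0 and C0 must both be provided or both be omitted")

    expected = (batch_size, hidden_size)
    if h0 is None:
        # Assumption: default to zero states when neither is supplied.
        return np.zeros(expected, dtype=dtype), np.zeros(expected, dtype=dtype)

    if h0.shape != expected or c0.shape != expected:
        raise ValueError("H0 and C0 must have shape (N x D) = %s" % (expected,))
    return h0, c0
```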

--- a/develop/doc/searchindex.js
+++ b/develop/doc/searchindex.js
--- a/develop/doc_cn/api/v2/fluid/layers.html
+++ b/develop/doc_cn/api/v2/fluid/layers.html
@@ -369,7 +369,104 @@ constructor.</p>
 <dl class="function">
 <dt>
 <code class="descclassname">paddle.v2.fluid.layers.</code><code class="descname">dynamic_lstm</code><span class="sig-paren">(</span><em>input</em>, <em>size</em>, <em>param_attr=None</em>, <em>bias_attr=None</em>, <em>use_peepholes=True</em>, <em>is_reverse=False</em>, <em>gate_activation='sigmoid'</em>, <em>cell_activation='tanh'</em>, <em>candidate_activation='tanh'</em>, <em>dtype='float32'</em><span class="sig-paren">)</span></dt>
-<dd></dd></dl>
+<dd><p><strong>Dynamic LSTM Layer</strong></p>
+<p>The default implementation uses diagonal/peephole connections
+(<a class="reference external" href="https://arxiv.org/pdf/1402.1128.pdf">https://arxiv.org/pdf/1402.1128.pdf</a>). The formulas are as follows:</p>
+<div class="math">
+\[ \begin{align}\begin{aligned}i_t &amp; = \sigma(W_{ix}x_{t} + W_{ih}h_{t-1} + W_{ic}c_{t-1} + b_i)\\f_t &amp; = \sigma(W_{fx}x_{t} + W_{fh}h_{t-1} + W_{fc}c_{t-1} + b_f)\\\tilde{c_t} &amp; = act_g(W_{cx}x_t + W_{ch}h_{t-1} + b_c)\\o_t &amp; = \sigma(W_{ox}x_{t} + W_{oh}h_{t-1} + W_{oc}c_t + b_o)\\c_t &amp; = f_t \odot c_{t-1} + i_t \odot \tilde{c_t}\\h_t &amp; = o_t \odot act_h(c_t)\end{aligned}\end{align} \]</div>
+<p>where the <span class="math">\(W\)</span> terms denote weight matrices (e.g. <span class="math">\(W_{ix}\)</span> is
+the matrix of weights connecting the input to the input gate), and <span class="math">\(W_{ic}, W_{fc}, W_{oc}\)</span> are diagonal weight matrices for peephole connections. In
+our implementation, we use vectors to represent these diagonal weight
+matrices. The <span class="math">\(b\)</span> terms denote bias vectors (<span class="math">\(b_i\)</span> is the input
+gate bias vector), <span class="math">\(\sigma\)</span> is a nonlinear activation, such as
+the logistic sigmoid function, and <span class="math">\(i, f, o\)</span> and <span class="math">\(c\)</span> are the input
+gate, forget gate, output gate, and cell activation vectors, respectively,
+all of which have the same size as the cell output activation vector <span class="math">\(h\)</span>.</p>
+<p><span class="math">\(\odot\)</span> denotes the element-wise product of vectors. <span class="math">\(act_g\)</span>
+and <span class="math">\(act_h\)</span> are the cell input and cell output activation functions;
+<cite>tanh</cite> is usually used for both. <span class="math">\(\tilde{c_t}\)</span> is also called the
+candidate hidden state, which is computed from the current input and
+the previous hidden state.</p>
+<p>Set <cite>use_peepholes</cite> to <cite>False</cite> to disable peephole connections. The formulas
+for that case are omitted here; please refer to the paper
+<a class="reference external" href="http://www.bioinf.jku.at/publications/older/2604.pdf">http://www.bioinf.jku.at/publications/older/2604.pdf</a> for details.</p>
+<p>Note that the <span class="math">\(W_{ix}x_{t}, W_{fx}x_{t}, W_{cx}x_{t}, W_{ox}x_{t}\)</span>
+operations on the input <span class="math">\(x_{t}\)</span> are NOT included in this operator.
+Users can apply a fully-connected layer before the LSTM layer to compute them.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>input</strong> (<em>Variable</em>) &#8211; The input of the dynamic_lstm layer, which supports
+variable-length input sequences. The underlying
+tensor in this Variable is a matrix with shape
+(T x 4D), where T is the total number of time steps in this
+mini-batch and D is the hidden size.</li>
+<li><strong>size</strong> (<em>int</em>) &#8211; 4 * hidden size.</li>
+<li><strong>param_attr</strong> (<em>ParamAttr</em>) &#8211; <p>The parameter attribute for the learnable
+hidden-hidden weights.</p>
+<ul>
+<li>The shape is (D x 4D), where D is the hidden
+size.</li>
+<li>Weights = {<span class="math">\(W_{ch}, W_{ih}, W_{fh}, W_{oh}\)</span>}</li>
+</ul>
+</li>
+<li><strong>bias_attr</strong> (<em>ParamAttr</em>) &#8211; <p>The bias attribute for the learnable bias
+weights. It always contains the input-hidden bias weights and,
+when <cite>use_peepholes</cite> is <cite>True</cite>, also the peephole
+connection weights.</p>
+<ol class="arabic">
+<li><cite>use_peepholes = False</cite></li>
+</ol>
+<blockquote>
+<div><ul>
+<li>The shape is (1 x 4D).</li>
+<li>Biases = {<span class="math">\(b_c, b_i, b_f, b_o\)</span>}.</li>
+</ul>
+</div></blockquote>
+<ol class="arabic" start="2">
+<li><cite>use_peepholes = True</cite></li>
+</ol>
+<blockquote>
+<div><ul>
+<li>The shape is (1 x 7D).</li>
+<li>Biases = {<span class="math">\(b_c, b_i, b_f, b_o, W_{ic}, W_{fc}, W_{oc}\)</span>}.</li>
+</ul>
+</div></blockquote>
+</li>
+<li><strong>use_peepholes</strong> (<em>bool</em>) &#8211; Whether to enable diagonal/peephole connections,
+default <cite>True</cite>.</li>
+<li><strong>is_reverse</strong> (<em>bool</em>) &#8211; Whether to compute reversed LSTM, default <cite>False</cite>.</li>
+<li><strong>gate_activation</strong> (<em>str</em>) &#8211; The activation for input gate, forget gate and
+output gate. Choices = [&#8220;sigmoid&#8221;, &#8220;tanh&#8221;, &#8220;relu&#8221;,
+&#8220;identity&#8221;], default &#8220;sigmoid&#8221;.</li>
+<li><strong>cell_activation</strong> (<em>str</em>) &#8211; The activation for cell output. Choices = [&#8220;sigmoid&#8221;,
+&#8220;tanh&#8221;, &#8220;relu&#8221;, &#8220;identity&#8221;], default &#8220;tanh&#8221;.</li>
+<li><strong>candidate_activation</strong> (<em>str</em>) &#8211; The activation for candidate hidden state.
+Choices = [&#8220;sigmoid&#8221;, &#8220;tanh&#8221;, &#8220;relu&#8221;, &#8220;identity&#8221;],
+default &#8220;tanh&#8221;.</li>
+<li><strong>dtype</strong> (<em>str</em>) &#8211; Data type. Choices = [&#8220;float32&#8221;, &#8220;float64&#8221;], default &#8220;float32&#8221;.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The hidden state and the cell state of the LSTM. Both have shape (T x D), and their LoD is the same as that of the <cite>input</cite>.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">tuple</p>
+</td>
+</tr>
+</tbody>
+</table>
+<p class="rubric">Examples</p>
+<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">hidden_dim</span> <span class="o">=</span> <span class="mi">512</span>
+<span class="n">forward_proj</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">input_seq</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">hidden_dim</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span>
+                               <span class="n">act</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">bias_attr</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
+<span class="n">forward</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">dynamic_lstm</span><span class="p">(</span>
+    <span class="nb">input</span><span class="o">=</span><span class="n">forward_proj</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">hidden_dim</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span> <span class="n">use_peepholes</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
+</pre></div>
+</div>
+</dd></dl>
 </div>
 <div class="section" id="data">
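
Note on the `(T x 4D)` input shape documented above: T counts the time steps of all sequences in the mini-batch, so variable-length sequences are stacked along the first axis rather than padded. The sketch below illustrates that layout in plain NumPy under the assumption that sequence boundaries are tracked as cumulative offsets; the helper names `pack_sequences` and `unpack_sequence` are hypothetical and are not part of the Paddle API.

```python
import numpy as np


def pack_sequences(seqs):
    """Stack variable-length sequences into one (T x 4D) matrix.

    seqs : list of arrays, each of shape (length_k, 4D)
    Returns the packed matrix and the cumulative offsets marking where
    each sequence starts and ends (illustrative boundary bookkeeping).
    """
    lengths = [len(s) for s in seqs]
    offsets = np.concatenate([[0], np.cumsum(lengths)])
    packed = np.concatenate(seqs, axis=0)
    return packed, offsets


def unpack_sequence(packed, offsets, k):
    """Recover the k-th sequence from the packed matrix."""
    return packed[offsets[k]:offsets[k + 1]]
```

Because the returned hidden and cell states have shape `(T x D)` with the same sequence boundaries as the input, the same offsets could be used to slice per-sequence outputs.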

--- a/develop/doc_cn/searchindex.js
+++ b/develop/doc_cn/searchindex.js