Commit 9a0b4d3c authored by Travis CI

Deploy to GitHub Pages: f04f4f9a

Parent e1b45851
......@@ -1304,19 +1304,12 @@ bias weights will be created and be set to default value.</li>
sequence. The dimension of each time-step should be 1. Thus, the shape of
input Tensor can be either [N, 1] or [N], where N is the sum of the length
of all sequences.</p>
<dl class="docutils">
<dt>The algorithm works as follows:</dt>
<dd><dl class="first last docutils">
<dt>for i-th sequence in a mini-batch:</dt>
<dd><dl class="first last docutils">
<dt>$$Out(X[lod[i]:lod[i+1]], :) =</dt>
<dd>frac{exp(X[lod[i]:lod[i+1], :])}
{sum(exp(X[lod[i]:lod[i+1], :]))}$$</dd>
</dl>
</dd>
</dl>
</dd>
</dl>
<p>The algorithm works as follows:</p>
<blockquote>
<div>for i-th sequence in a mini-batch:</div></blockquote>
<p>$$
Out(X[lod[i]:lod[i+1]], :) = \frac{\exp(X[lod[i]:lod[i+1], :])} {\sum(\exp(X[lod[i]:lod[i+1], :]))}
$$</p>
<p>For example, for a mini-batch of 3 sequences with variable-length,
each containing 2, 3, 2 time-steps, the lod of which is [0, 2, 5, 7],
then softmax will be computed among X[0:2, :], X[2:5, :], X[5:7, :]
......
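The per-sequence softmax described above can be sketched in plain NumPy. This is a hypothetical standalone illustration of the formula, not PaddlePaddle's actual kernel: for each sequence slice `X[lod[i]:lod[i+1]]`, the exponentials are normalized within that slice only, so each segment of the output sums to 1.

```python
import numpy as np

def sequence_softmax(x, lod):
    """Apply softmax independently over each slice x[lod[i]:lod[i+1]].

    x   : 1-D array of shape [N], where N = lod[-1] is the total length
          of all sequences in the mini-batch.
    lod : offsets delimiting the sequences, e.g. [0, 2, 5, 7].
    """
    out = np.empty_like(x, dtype=np.float64)
    for i in range(len(lod) - 1):
        seg = x[lod[i]:lod[i + 1]]
        e = np.exp(seg - seg.max())        # subtract max for numerical stability
        out[lod[i]:lod[i + 1]] = e / e.sum()
    return out

# Mini-batch of 3 sequences with 2, 3, 2 time-steps; lod = [0, 2, 5, 7], N = 7
lod = [0, 2, 5, 7]
x = np.arange(7, dtype=np.float64)
y = sequence_softmax(x, lod)
# Softmax is computed among x[0:2], x[2:5], x[5:7], so each slice of y sums to 1.
```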
......@@ -2545,7 +2545,7 @@
"attrs" : [ ]
},{
"type" : "sequence_softmax",
"comment" : "\nSequence Softmax Operator.\n\nSequenceSoftmaxOp computes the softmax activation among all time-steps for each\nsequence. The dimension of each time-step should be 1. Thus, the shape of\ninput Tensor can be either [N, 1] or [N], where N is the sum of the length\nof all sequences.\n\nThe algorithm works as follows:\n for i-th sequence in a mini-batch:\n $$Out(X[lod[i]:lod[i+1]], :) =\n \\frac{\\exp(X[lod[i]:lod[i+1], :])}\n {\\sum(\\exp(X[lod[i]:lod[i+1], :]))}$$\n\nFor example, for a mini-batch of 3 sequences with variable-length,\neach containing 2, 3, 2 time-steps, the lod of which is [0, 2, 5, 7],\nthen softmax will be computed among X[0:2, :], X[2:5, :], X[5:7, :]\nand N turns out to be 7.\n\n",
"comment" : "\nSequence Softmax Operator.\n\nSequenceSoftmaxOp computes the softmax activation among all time-steps for each\nsequence. The dimension of each time-step should be 1. Thus, the shape of\ninput Tensor can be either [N, 1] or [N], where N is the sum of the length\nof all sequences.\n\nThe algorithm works as follows:\n\n for i-th sequence in a mini-batch:\n\n$$\nOut(X[lod[i]:lod[i+1]], :) = \\\n\\frac{\\exp(X[lod[i]:lod[i+1], :])} \\\n{\\sum(\\exp(X[lod[i]:lod[i+1], :]))}\n$$\n\nFor example, for a mini-batch of 3 sequences with variable-length,\neach containing 2, 3, 2 time-steps, the lod of which is [0, 2, 5, 7],\nthen softmax will be computed among X[0:2, :], X[2:5, :], X[5:7, :]\nand N turns out to be 7.\n\n",
"inputs" : [
{
"name" : "X",
......
......@@ -1317,19 +1317,12 @@ bias weights will be created and be set to default value.</li>
sequence. The dimension of each time-step should be 1. Thus, the shape of
input Tensor can be either [N, 1] or [N], where N is the sum of the length
of all sequences.</p>
<dl class="docutils">
<dt>The algorithm works as follows:</dt>
<dd><dl class="first last docutils">
<dt>for i-th sequence in a mini-batch:</dt>
<dd><dl class="first last docutils">
<dt>$$Out(X[lod[i]:lod[i+1]], :) =</dt>
<dd>frac{exp(X[lod[i]:lod[i+1], :])}
{sum(exp(X[lod[i]:lod[i+1], :]))}$$</dd>
</dl>
</dd>
</dl>
</dd>
</dl>
<p>The algorithm works as follows:</p>
<blockquote>
<div>for i-th sequence in a mini-batch:</div></blockquote>
<p>$$
Out(X[lod[i]:lod[i+1]], :) = \frac{\exp(X[lod[i]:lod[i+1], :])} {\sum(\exp(X[lod[i]:lod[i+1], :]))}
$$</p>
<p>For example, for a mini-batch of 3 sequences with variable-length,
each containing 2, 3, 2 time-steps, the lod of which is [0, 2, 5, 7],
then softmax will be computed among X[0:2, :], X[2:5, :], X[5:7, :]
......