Commit eb48eee8 authored by 绝不原创的飞龙

2024-02-05 13:15:51

Parent 609f7f27
This diff is collapsed.
......@@ -72,8 +72,7 @@
prefs: []
type: TYPE_NORMAL
- en: '**Don’t use linear layers that are too large.** A linear layer `nn.Linear(m,
n)` uses $O(nm)$ memory: that is to say, the memory requirements of the weights
scale quadratically
with the number of features. It is very easy to [blow through your memory](https://github.com/pytorch/pytorch/issues/958)
this way (and remember that you will need at least twice the size of the weights,
......
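- en: 'A minimal sketch of the scaling above (sizes are illustrative): counting the
parameters of a large `nn.Linear` shows how quickly the weights alone consume memory.'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch.nn as nn

  # nn.Linear(m, n) stores an n-by-m weight matrix, so memory grows as O(nm).
  m, n = 10_000, 10_000
  layer = nn.Linear(m, n)
  n_params = sum(p.numel() for p in layer.parameters())
  print(f"{n_params:,} parameters = {n_params * 4 / 2**30:.2f} GiB in float32")
  # Training needs at least twice this: gradients mirror the weights.
  ```
prefs: []
type: TYPE_PRE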
This diff is collapsed.
......@@ -186,17 +186,15 @@
prefs: []
type: TYPE_TB
- en: '| [`arange`](generated/torch.arange.html#torch.arange "torch.arange") | Returns
a 1-D tensor of size $\left\lceil \frac{\text{end} - \text{start}}{\text{step}} \right\rceil$
with values from the interval `[start, end)` taken with common difference `step`
beginning from start. |'
prefs: []
type: TYPE_TB
- en: '| [`range`](generated/torch.range.html#torch.range "torch.range") | Returns
a 1-D tensor of size $\left\lfloor \frac{\text{end} - \text{start}}{\text{step}} \right\rfloor + 1$
with values from `start` to `end` with step `step`. |'
prefs: []
type: TYPE_TB
......@@ -552,16 +550,12 @@
prefs: []
type: TYPE_TB
- en: '| [`rand`](generated/torch.rand.html#torch.rand "torch.rand") | Returns a tensor
filled with random numbers from a uniform distribution on the interval $[0, 1)$ |'
prefs: []
type: TYPE_TB
- en: '| [`rand_like`](generated/torch.rand_like.html#torch.rand_like "torch.rand_like")
| Returns a tensor with the same size as `input` that is filled with random numbers
from a uniform distribution on the interval $[0, 1)$. |'
prefs: []
type: TYPE_TB
- en: '| [`randint`](generated/torch.randint.html#torch.randint "torch.randint") |
......@@ -817,9 +811,7 @@
prefs: []
type: TYPE_TB
- en: '| [`atan2`](generated/torch.atan2.html#torch.atan2 "torch.atan2") | Element-wise
arctangent of $\text{input}_{i} / \text{other}_{i}$
with consideration of the quadrant. |'
prefs: []
type: TYPE_TB
......@@ -974,9 +966,7 @@
prefs: []
type: TYPE_TB
- en: '| [`gradient`](generated/torch.gradient.html#torch.gradient "torch.gradient")
| Estimates the gradient of a function $g : \mathbb{R}^n \rightarrow \mathbb{R}$
in one or more dimensions using the [second-order accurate central differences
method](https://www.ams.org/journals/mcom/1988-51-184/S0025-5718-1988-0935077-0/S0025-5718-1988-0935077-0.pdf)
and either first or second order estimates at the boundaries. |'
......@@ -1465,9 +1455,8 @@
element-wise minimum of `input` and `other`. |'
prefs: []
type: TYPE_TB
- en: '| [`ne`](generated/torch.ne.html#torch.ne "torch.ne") | Computes $\text{input} \neq \text{other}$ element-wise. |'
prefs: []
type: TYPE_TB
- en: '| [`not_equal`](generated/torch.not_equal.html#torch.not_equal "torch.not_equal")
......@@ -1947,9 +1936,7 @@
type: TYPE_TB
- en: '| [`svd_lowrank`](generated/torch.svd_lowrank.html#torch.svd_lowrank "torch.svd_lowrank")
| Return the singular value decomposition `(U, S, V)` of a matrix, batches of
matrices, or a sparse matrix $A$ such that $A \approx U diag(S) V^T$.
|'
prefs: []
type: TYPE_TB
......
......@@ -498,11 +498,8 @@
prefs: []
type: TYPE_TB
- en: '| [`nn.Softplus`](generated/torch.nn.Softplus.html#torch.nn.Softplus "torch.nn.Softplus")
| Applies the Softplus function $\text{Softplus}(x) = \frac{1}{\beta} * \log(1 + \exp(\beta * x))$
element-wise. |'
prefs: []
type: TYPE_TB
......@@ -527,10 +524,7 @@
prefs: []
type: TYPE_TB
- en: '| [`nn.GLU`](generated/torch.nn.GLU.html#torch.nn.GLU "torch.nn.GLU") | Applies
the gated linear unit function ${GLU}(a, b)= a \otimes \sigma(b)$
where $a$
is the first half of the input matrices and $b$ is the second
half. |'
......@@ -558,9 +552,7 @@
prefs: []
type: TYPE_TB
- en: '| [`nn.LogSoftmax`](generated/torch.nn.LogSoftmax.html#torch.nn.LogSoftmax
"torch.nn.LogSoftmax") | Applies the <math><semantics><mrow><mi>log</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mtext>Softmax</mtext><mo stretchy="false">(</mo><mi>x</mi><mo
stretchy="false">)</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\log(\text{Softmax}(x))</annotation></semantics></math>log(Softmax(x))
"torch.nn.LogSoftmax") | Applies the $\log(\text{Softmax}(x))$log(Softmax(x))
function to an n-dimensional input Tensor. |'
prefs: []
type: TYPE_TB
......@@ -893,9 +885,7 @@
type: TYPE_TB
- en: '| [`nn.MultiLabelSoftMarginLoss`](generated/torch.nn.MultiLabelSoftMarginLoss.html#torch.nn.MultiLabelSoftMarginLoss
"torch.nn.MultiLabelSoftMarginLoss") | Creates a criterion that optimizes a multi-label
one-versus-all loss based on max-entropy, between input $x$ and target $y$ of size $(N, C)$. |'
prefs: []
type: TYPE_TB
- en: '| [`nn.CosineEmbeddingLoss`](generated/torch.nn.CosineEmbeddingLoss.html#torch.nn.CosineEmbeddingLoss
......@@ -909,9 +899,7 @@
"torch.nn.MultiMarginLoss") | Creates a criterion that optimizes a multi-class
classification hinge loss (margin-based loss) between input $x$ (a 2D mini-batch
Tensor) and output $y$
(which is a 1D tensor of target class indices, $0 \leq y \leq \text{x.size}(1)-1$):
|'
prefs: []
type: TYPE_TB
......
......@@ -204,12 +204,8 @@
prefs: []
type: TYPE_TB
- en: '| [`relu6`](generated/torch.nn.functional.relu6.html#torch.nn.functional.relu6
"torch.nn.functional.relu6") | Applies the element-wise function <math><semantics><mrow><mtext>ReLU6</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>min</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mi>max</mi><mo>⁡</mo><mo stretchy="false">(</mo><mn>0</mn><mo
separator="true">,</mo><mi>x</mi><mo stretchy="false">)</mo><mo separator="true">,</mo><mn>6</mn><mo
stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\text{ReLU6}(x)
= \min(\max(0,x), 6)</annotation></semantics></math>ReLU6(x)=min(max(0,x),6).
"torch.nn.functional.relu6") | Applies the element-wise function $\text{ReLU6}(x)
= \min(\max(0,x), 6)$ReLU6(x)=min(max(0,x),6).
|'
prefs: []
type: TYPE_TB
......@@ -223,39 +219,22 @@
prefs: []
type: TYPE_TB
- en: '| [`selu`](generated/torch.nn.functional.selu.html#torch.nn.functional.selu
"torch.nn.functional.selu") | Applies element-wise, <math><semantics><mrow><mtext>SELU</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>s</mi><mi>c</mi><mi>a</mi><mi>l</mi><mi>e</mi><mo>∗</mo><mo
stretchy="false">(</mo><mi>max</mi><mo>⁡</mo><mo stretchy="false">(</mo><mn>0</mn><mo
separator="true">,</mo><mi>x</mi><mo stretchy="false">)</mo><mo>+</mo><mi>min</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mi>α</mi><mo>∗</mo><mo
stretchy="false">(</mo><mi>exp</mi><mo>⁡</mo><mo stretchy="false">(</mo><mi>x</mi><mo
stretchy="false">)</mo><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo><mo stretchy="false">)</mo><mo
stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\text{SELU}(x)
= scale * (\max(0,x) + \min(0, \alpha * (\exp(x) - 1)))</annotation></semantics></math>SELU(x)=scale∗(max(0,x)+min(0,α∗(exp(x)−1))),
"torch.nn.functional.selu") | Applies element-wise, $\text{SELU}(x)
= scale * (\max(0,x) + \min(0, \alpha * (\exp(x) - 1)))$SELU(x)=scale∗(max(0,x)+min(0,α∗(exp(x)−1))),
with $\alpha=1.6732632423543772848170429916717$α=1.6732632423543772848170429916717
and $scale=1.0507009873554804934193349852946$scale=1.0507009873554804934193349852946.
|'
prefs: []
type: TYPE_TB
- en: '| [`celu`](generated/torch.nn.functional.celu.html#torch.nn.functional.celu
"torch.nn.functional.celu") | Applies element-wise, <math><semantics><mrow><mtext>CELU</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>max</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mi>x</mi><mo stretchy="false">)</mo><mo>+</mo><mi>min</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mi>α</mi><mo>∗</mo><mo
stretchy="false">(</mo><mi>exp</mi><mo>⁡</mo><mo stretchy="false">(</mo><mi>x</mi><mi
mathvariant="normal">/</mi><mi>α</mi><mo stretchy="false">)</mo><mo>−</mo><mn>1</mn><mo
stretchy="false">)</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\text{CELU}(x)
= \max(0,x) + \min(0, \alpha * (\exp(x/\alpha) - 1))</annotation></semantics></math>CELU(x)=max(0,x)+min(0,α∗(exp(x/α)−1)).
"torch.nn.functional.celu") | Applies element-wise, $\text{CELU}(x)
= \max(0,x) + \min(0, \alpha * (\exp(x/\alpha) - 1))$CELU(x)=max(0,x)+min(0,α∗(exp(x/α)−1)).
|'
prefs: []
type: TYPE_TB
- en: '| [`leaky_relu`](generated/torch.nn.functional.leaky_relu.html#torch.nn.functional.leaky_relu
"torch.nn.functional.leaky_relu") | Applies element-wise, <math><semantics><mrow><mtext>LeakyReLU</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>max</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mi>x</mi><mo stretchy="false">)</mo><mo>+</mo><mtext>negative_slope</mtext><mo>∗</mo><mi>min</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation
encoding="application/x-tex">\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope}
* \min(0, x)</annotation></semantics></math>LeakyReLU(x)=max(0,x)+negative_slope∗min(0,x)
"torch.nn.functional.leaky_relu") | Applies element-wise, $\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope}
* \min(0, x)$LeakyReLU(x)=max(0,x)+negative_slope∗min(0,x)
|'
prefs: []
type: TYPE_TB
......@@ -265,11 +244,7 @@
prefs: []
type: TYPE_TB
- en: '| [`prelu`](generated/torch.nn.functional.prelu.html#torch.nn.functional.prelu
"torch.nn.functional.prelu") | Applies element-wise the function <math><semantics><mrow><mtext>PReLU</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>max</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mi>x</mi><mo stretchy="false">)</mo><mo>+</mo><mtext>weight</mtext><mo>∗</mo><mi>min</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation
encoding="application/x-tex">\text{PReLU}(x) = \max(0,x) + \text{weight} * \min(0,x)</annotation></semantics></math>PReLU(x)=max(0,x)+weight∗min(0,x)
"torch.nn.functional.prelu") | Applies element-wise the function $\text{PReLU}(x) = \max(0,x) + \text{weight} * \min(0,x)$PReLU(x)=max(0,x)+weight∗min(0,x)
where weight is a learnable parameter. |'
prefs: []
type: TYPE_TB
......@@ -288,20 +263,13 @@
type: TYPE_TB
- en: '| [`gelu`](generated/torch.nn.functional.gelu.html#torch.nn.functional.gelu
"torch.nn.functional.gelu") | When the approximate argument is ''none'', it applies
element-wise the function $\text{GELU}(x) = x * \Phi(x)$
|'
prefs: []
type: TYPE_TB
- en: '| [`logsigmoid`](generated/torch.nn.functional.logsigmoid.html#torch.nn.functional.logsigmoid
"torch.nn.functional.logsigmoid") | Applies element-wise <math><semantics><mrow><mtext>LogSigmoid</mtext><mo
stretchy="false">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mo>=</mo><mi>log</mi><mo>⁡</mo><mrow><mo
fence="true">(</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><mi>exp</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mo>−</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow></mfrac><mo
fence="true">)</mo></mrow></mrow><annotation encoding="application/x-tex">\text{LogSigmoid}(x_i)
= \log \left(\frac{1}{1 + \exp(-x_i)}\right)</annotation></semantics></math>LogSigmoid(xi​)=log(1+exp(−xi​)1​)
"torch.nn.functional.logsigmoid") | Applies element-wise $\text{LogSigmoid}(x_i)
= \log \left(\frac{1}{1 + \exp(-x_i)}\right)$LogSigmoid(xi​)=log(1+exp(−xi​)1​)
|'
prefs: []
type: TYPE_TB
......@@ -311,27 +279,18 @@
prefs: []
type: TYPE_TB
- en: '| [`tanhshrink`](generated/torch.nn.functional.tanhshrink.html#torch.nn.functional.tanhshrink
"torch.nn.functional.tanhshrink") | Applies element-wise, <math><semantics><mrow><mtext>Tanhshrink</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>x</mi><mo>−</mo><mtext>Tanh</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation
encoding="application/x-tex">\text{Tanhshrink}(x) = x - \text{Tanh}(x)</annotation></semantics></math>Tanhshrink(x)=x−Tanh(x)
"torch.nn.functional.tanhshrink") | Applies element-wise, $\text{Tanhshrink}(x) = x - \text{Tanh}(x)$Tanhshrink(x)=x−Tanh(x)
|'
prefs: []
type: TYPE_TB
- en: '| [`softsign`](generated/torch.nn.functional.softsign.html#torch.nn.functional.softsign
"torch.nn.functional.softsign") | Applies element-wise, the function <math><semantics><mrow><mtext>SoftSign</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mi>x</mi><mrow><mn>1</mn><mo>+</mo><mi
mathvariant="normal">∣</mi><mi>x</mi><mi mathvariant="normal">∣</mi></mrow></mfrac></mrow><annotation
encoding="application/x-tex">\text{SoftSign}(x) = \frac{x}{1 + &#124;x&#124;}</annotation></semantics></math>SoftSign(x)=1+∣x∣x​
"torch.nn.functional.softsign") | Applies element-wise, the function $\text{SoftSign}(x) = \frac{x}{1 + &#124;x&#124;}$SoftSign(x)=1+∣x∣x​
|'
prefs: []
type: TYPE_TB
- en: '| [`softplus`](generated/torch.nn.functional.softplus.html#torch.nn.functional.softplus
"torch.nn.functional.softplus") | Applies element-wise, the function <math><semantics><mrow><mtext>Softplus</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mi>β</mi></mfrac><mo>∗</mo><mi>log</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mn>1</mn><mo>+</mo><mi>exp</mi><mo>⁡</mo><mo stretchy="false">(</mo><mi>β</mi><mo>∗</mo><mi>x</mi><mo
stretchy="false">)</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\text{Softplus}(x)
= \frac{1}{\beta} * \log(1 + \exp(\beta * x))</annotation></semantics></math>Softplus(x)=β1​∗log(1+exp(β∗x)).
"torch.nn.functional.softplus") | Applies element-wise, the function $\text{Softplus}(x)
= \frac{1}{\beta} * \log(1 + \exp(\beta * x))$Softplus(x)=β1​∗log(1+exp(β∗x)).
|'
prefs: []
type: TYPE_TB
......@@ -360,23 +319,13 @@
prefs: []
type: TYPE_TB
- en: '| [`tanh`](generated/torch.nn.functional.tanh.html#torch.nn.functional.tanh
"torch.nn.functional.tanh") | Applies element-wise, <math><semantics><mrow><mtext>Tanh</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>tanh</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mrow><mi>exp</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>−</mo><mi>exp</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mo>−</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><mrow><mi>exp</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>+</mo><mi>exp</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mo>−</mo><mi>x</mi><mo stretchy="false">)</mo></mrow></mfrac></mrow><annotation
encoding="application/x-tex">\text{Tanh}(x) = \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x)
+ \exp(-x)}</annotation></semantics></math>Tanh(x)=tanh(x)=exp(x)+exp(−x)exp(x)−exp(−x)​
"torch.nn.functional.tanh") | Applies element-wise, $\text{Tanh}(x) = \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x)
+ \exp(-x)}$Tanh(x)=tanh(x)=exp(x)+exp(−x)exp(x)−exp(−x)​
|'
prefs: []
type: TYPE_TB
- en: '| [`sigmoid`](generated/torch.nn.functional.sigmoid.html#torch.nn.functional.sigmoid
"torch.nn.functional.sigmoid") | Applies the element-wise function <math><semantics><mrow><mtext>Sigmoid</mtext><mo
stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><mi>exp</mi><mo>⁡</mo><mo
stretchy="false">(</mo><mo>−</mo><mi>x</mi><mo stretchy="false">)</mo></mrow></mfrac></mrow><annotation
encoding="application/x-tex">\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}</annotation></semantics></math>Sigmoid(x)=1+exp(−x)1​
"torch.nn.functional.sigmoid") | Applies the element-wise function $\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}$Sigmoid(x)=1+exp(−x)1​
|'
prefs: []
type: TYPE_TB
......@@ -624,26 +573,17 @@
type: TYPE_NORMAL
- en: '| [`pixel_shuffle`](generated/torch.nn.functional.pixel_shuffle.html#torch.nn.functional.pixel_shuffle
"torch.nn.functional.pixel_shuffle") | Rearranges elements in a tensor of shape
$(*, C \times r^2, H, W)$
to a tensor of shape $(*, C, H \times r, W \times r)$,
where r is the `upscale_factor`. |'
prefs: []
type: TYPE_TB
- en: '| [`pixel_unshuffle`](generated/torch.nn.functional.pixel_unshuffle.html#torch.nn.functional.pixel_unshuffle
"torch.nn.functional.pixel_unshuffle") | Reverses the [`PixelShuffle`](generated/torch.nn.PixelShuffle.html#torch.nn.PixelShuffle
"torch.nn.PixelShuffle") operation by rearranging elements in a tensor of shape
$(*, C, H \times r, W \times r)$ to a tensor
of shape $(*, C \times r^2, H, W)$,
where r is the `downscale_factor`. |'
prefs: []
type: TYPE_TB
......
......@@ -626,19 +626,14 @@
prefs: []
type: TYPE_TB
- en: '| [`Tensor.bernoulli`](generated/torch.Tensor.bernoulli.html#torch.Tensor.bernoulli
"torch.Tensor.bernoulli") | Returns a result tensor where each <math><semantics><mrow><mtext
mathvariant="monospace">result[i]</mtext></mrow><annotation encoding="application/x-tex">\texttt{result[i]}</annotation></semantics></math>result[i]
is independently sampled from <math><semantics><mrow><mtext>Bernoulli</mtext><mo
stretchy="false">(</mo><mtext mathvariant="monospace">self[i]</mtext><mo stretchy="false">)</mo></mrow><annotation
encoding="application/x-tex">\text{Bernoulli}(\texttt{self[i]})</annotation></semantics></math>Bernoulli(self[i]).
"torch.Tensor.bernoulli") | Returns a result tensor where each $\texttt{result[i]}$result[i]
is independently sampled from $\text{Bernoulli}(\texttt{self[i]})$Bernoulli(self[i]).
|'
prefs: []
type: TYPE_TB
- en: '| [`Tensor.bernoulli_`](generated/torch.Tensor.bernoulli_.html#torch.Tensor.bernoulli_
"torch.Tensor.bernoulli_") | Fills each location of `self` with an independent
sample from $\text{Bernoulli}(\texttt{p})$.
|'
prefs: []
type: TYPE_TB
......
......@@ -19,12 +19,8 @@
and the pathwise derivative estimator. REINFORCE is commonly seen as the basis
for policy gradient methods in reinforcement learning, and the pathwise derivative
estimator is commonly seen in the reparameterization trick in variational autoencoders.
Whilst the score function only requires the value of samples $f(x)$, the pathwise
derivative requires the derivative $f'(x)$.
The next sections discuss these two in a reinforcement learning example. For more
details see [Gradient Estimation Using Stochastic Computation Graphs](https://arxiv.org/abs/1506.05254)
.
......@@ -38,19 +34,13 @@
parameters, we only need `sample()` and `log_prob()` to implement REINFORCE:'
prefs: []
type: TYPE_NORMAL
- en: $\Delta\theta = \alpha r \frac{\partial\log p(a|\pi^\theta(s))}{\partial\theta}$
prefs: []
type: TYPE_NORMAL
- en: where $\theta$ are the parameters, $\alpha$ is the learning rate, $r$
is the reward and $p(a|\pi^\theta(s))$
is the probability of taking action $a$ in state $s$ given policy $\pi^\theta$.
prefs: []
type: TYPE_NORMAL
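- en: 'As a minimal, hypothetical sketch of this update rule (toy state, stubbed
reward; `policy_net` is an illustrative stand-in), `sample()` and `log_prob()` give
a surrogate loss whose gradient matches the expression above:'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch
  from torch.distributions import Categorical

  policy_net = torch.nn.Linear(4, 2)        # maps state features to action logits
  optimizer = torch.optim.SGD(policy_net.parameters(), lr=1e-2)

  state = torch.randn(4)                    # toy state s
  m = Categorical(logits=policy_net(state))
  action = m.sample()                       # a ~ p(a | pi^theta(s))
  reward = torch.tensor(1.0)                # r, stubbed in place of an environment

  loss = -reward * m.log_prob(action)       # minimizing this applies the REINFORCE update
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()
  ```
prefs: []
type: TYPE_PRE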
......@@ -371,24 +361,14 @@
is defined below
prefs: []
type: TYPE_NORMAL
- en: $p_{F}(x; \theta) = \exp(\langle t(x), \theta\rangle - F(\theta) + k(x))$
prefs: []
type: TYPE_NORMAL
- en: where $\theta$ denotes the natural parameters, $t(x)$ denotes the sufficient
statistic, $F(\theta)$ is the log normalizer function for a given family and $k(x)$
is the carrier measure.
prefs: []
type: TYPE_NORMAL
......@@ -667,10 +647,8 @@
"torch.multinomial") samples from.
prefs: []
type: TYPE_NORMAL
- en: Samples are integers from $\{0, \ldots, K-1\}$ where K is `probs.size(-1)`.
prefs: []
type: TYPE_NORMAL
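- en: 'For instance (hypothetical probabilities):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch
  from torch.distributions import Categorical

  m = Categorical(torch.tensor([0.1, 0.2, 0.3, 0.4]))  # K = 4
  m.sample()  # returns an integer index in {0, 1, 2, 3}
  ```
prefs: []
type: TYPE_PRE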
- en: If probs is 1-dimensional with length-K, each element is the relative probability
......@@ -1231,28 +1209,19 @@
of Bernoulli trials.
prefs: []
type: TYPE_NORMAL
- en: $P(X=k) = (1-p)^{k} p, k = 0, 1, ...$
prefs: []
type: TYPE_NORMAL
- en: Note
prefs: []
type: TYPE_NORMAL
- en: '[`torch.distributions.geometric.Geometric()`](#torch.distributions.geometric.Geometric
"torch.distributions.geometric.Geometric") <math><semantics><mrow><mo stretchy="false">(</mo><mi>k</mi><mo>+</mo><mn>1</mn><mo
stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(k+1)</annotation></semantics></math>(k+1)-th
trial is the first success hence draws samples in <math><semantics><mrow><mo stretchy="false">{</mo><mn>0</mn><mo
separator="true">,</mo><mn>1</mn><mo separator="true">,</mo><mo>…</mo><mo stretchy="false">}</mo></mrow><annotation
encoding="application/x-tex">\{0, 1, \ldots\}</annotation></semantics></math>{0,1,…},
"torch.distributions.geometric.Geometric") $(k+1)$(k+1)-th
trial is the first success hence draws samples in $\{0, 1, \ldots\}${0,1,…},
whereas [`torch.Tensor.geometric_()`](generated/torch.Tensor.geometric_.html#torch.Tensor.geometric_
"torch.Tensor.geometric_") k-th trial is the first success hence draws samples
in <math><semantics><mrow><mo stretchy="false">{</mo><mn>1</mn><mo separator="true">,</mo><mn>2</mn><mo
separator="true">,</mo><mo>…</mo><mo stretchy="false">}</mo></mrow><annotation
encoding="application/x-tex">\{1, 2, \ldots\}</annotation></semantics></math>{1,2,…}.'
in $\{1, 2, \ldots\}${1,2,…}.'
prefs: []
type: TYPE_NORMAL
- en: 'Example:'
......@@ -1719,9 +1688,7 @@
- en: 'LKJ distribution for lower Cholesky factor of correlation matrices. The distribution
is controlled by `concentration` parameter $\eta$ to make the
probability of the correlation matrix $M$ generated from
a Cholesky factor proportional to $\det(M)^{\eta - 1}$.
Because of that, when `concentration == 1`, we have a uniform distribution over
Cholesky factors of correlation matrices:'
prefs: []
......@@ -2209,9 +2176,7 @@
a positive definite covariance matrix $\mathbf{\Sigma}$
or a positive definite precision matrix $\mathbf{\Sigma}^{-1}$
or a lower-triangular matrix $\mathbf{L}$ with
positive-valued diagonal entries, such that $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$.
This triangular matrix can be obtained via e.g. Cholesky decomposition of the
covariance.
prefs: []
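- en: 'A small sketch of the Cholesky parameterization (illustrative 2-D values):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch
  from torch.distributions import MultivariateNormal

  # Lower-triangular L with positive diagonal; Sigma = L @ L.T
  L = torch.tensor([[1.0, 0.0],
                    [0.5, 1.0]])
  mvn = MultivariateNormal(loc=torch.zeros(2), scale_tril=L)
  mvn.sample()
  ```
prefs: []
type: TYPE_PRE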
......@@ -2623,11 +2588,7 @@
- en: Samples are nonnegative integers, with a pmf given by
prefs: []
type: TYPE_NORMAL
- en: $\mathrm{rate}^k \frac{e^{-\mathrm{rate}}}{k!}$
prefs: []
type: TYPE_NORMAL
......@@ -3250,9 +3211,7 @@
type: TYPE_NORMAL
- en: Creates a Wishart distribution parameterized by a symmetric positive definite
matrix $\Sigma$, or its Cholesky
decomposition $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$
prefs: []
type: TYPE_NORMAL
- en: Example
......@@ -3369,18 +3328,11 @@
- en: '[PRE525]'
prefs: []
type: TYPE_PRE
- en: Compute Kullback-Leibler divergence $KL(p \| q)$ between two distributions.
prefs: []
type: TYPE_NORMAL
- en: $KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx$
prefs: []
type: TYPE_NORMAL
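- en: 'For example (two hypothetical normals):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch
  from torch.distributions import Normal, kl_divergence

  p = Normal(torch.tensor(0.0), torch.tensor(1.0))
  q = Normal(torch.tensor(1.0), torch.tensor(2.0))
  kl_divergence(p, q)  # closed-form KL(p || q) between two Normals
  ```
prefs: []
type: TYPE_PRE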
- en: Parameters
......@@ -3808,9 +3760,8 @@
- en: '[PRE530]'
prefs: []
type: TYPE_PRE
- en: Transform via the mapping $y = |x|$.
prefs: []
type: TYPE_NORMAL
- en: '[PRE531]'
......@@ -3877,9 +3828,7 @@
- en: '[PRE535]'
prefs: []
type: TYPE_PRE
- en: 'Transforms an unconstrained real vector $x$ with length $D*(D-1)/2$
into the Cholesky factor of a D-dimension correlation matrix. This Cholesky factor
is a lower triangular matrix with positive diagonals and unit Euclidean norm for
each row. The transform is processed as follows:'
......@@ -3904,18 +3853,11 @@
triangular part, we apply a *signed* version of class [`StickBreakingTransform`](#torch.distributions.transforms.StickBreakingTransform
"torch.distributions.transforms.StickBreakingTransform") to transform $X_i$Xi​ into a unit
Euclidean length vector using the following steps: - Scales into the interval
<math><semantics><mrow><mo stretchy="false">(</mo><mo>−</mo><mn>1</mn><mo separator="true">,</mo><mn>1</mn><mo
stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(-1, 1)</annotation></semantics></math>(−1,1)
domain: <math><semantics><mrow><msub><mi>r</mi><mi>i</mi></msub><mo>=</mo><mi>tanh</mi><mo>⁡</mo><mo
stretchy="false">(</mo><msub><mi>X</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow><annotation
encoding="application/x-tex">r_i = \tanh(X_i)</annotation></semantics></math>ri​=tanh(Xi​).
$(-1, 1)$(−1,1)
domain: $r_i = \tanh(X_i)$ri​=tanh(Xi​).
- Transforms into an unsigned domain: $z_i = r_i^2$zi​=ri2​.
- Applies <math><semantics><mrow><msub><mi>s</mi><mi>i</mi></msub><mo>=</mo><mi>S</mi><mi>t</mi><mi>i</mi><mi>c</mi><mi>k</mi><mi>B</mi><mi>r</mi><mi>e</mi><mi>a</mi><mi>k</mi><mi>i</mi><mi>n</mi><mi>g</mi><mi>T</mi><mi>r</mi><mi>a</mi><mi>n</mi><mi>s</mi><mi>f</mi><mi>o</mi><mi>r</mi><mi>m</mi><mo
stretchy="false">(</mo><msub><mi>z</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow><annotation
encoding="application/x-tex">s_i = StickBreakingTransform(z_i)</annotation></semantics></math>si​=StickBreakingTransform(zi​).
- Transforms back into signed domain: <math><semantics><mrow><msub><mi>y</mi><mi>i</mi></msub><mo>=</mo><mi>s</mi><mi>i</mi><mi>g</mi><mi>n</mi><mo
stretchy="false">(</mo><msub><mi>r</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mo>∗</mo><msqrt><msub><mi>s</mi><mi>i</mi></msub></msqrt></mrow><annotation
encoding="application/x-tex">y_i = sign(r_i) * \sqrt{s_i}</annotation></semantics></math>yi​=sign(ri​)∗si​​.'
- Applies $s_i = StickBreakingTransform(z_i)$si​=StickBreakingTransform(zi​).
- Transforms back into signed domain: $y_i = sign(r_i) * \sqrt{s_i}$yi​=sign(ri​)∗si​​.'
prefs:
- PREF_BQ
- PREF_OL
......@@ -3943,9 +3885,7 @@
- en: '[PRE538]'
prefs: []
type: TYPE_PRE
- en: Transform via the mapping $y = \exp(x)$.
prefs: []
type: TYPE_NORMAL
- en: '[PRE539]'
......@@ -4018,30 +3958,22 @@
- en: '[PRE544]'
prefs: []
type: TYPE_PRE
- en: Transform via the mapping $y = \frac{1}{1 + \exp(-x)}$ and $x = \text{logit}(y)$.
prefs: []
type: TYPE_NORMAL
- en: '[PRE545]'
prefs: []
type: TYPE_PRE
- en: Transform via the mapping $\text{Softplus}(x) = \log(1 + \exp(x))$.
The implementation reverts to the linear function when $x > 20$.
prefs: []
type: TYPE_NORMAL
- en: '[PRE546]'
prefs: []
type: TYPE_PRE
- en: Transform via the mapping $y = \tanh(x)$.
prefs: []
type: TYPE_NORMAL
- en: It is equivalent to `` ComposeTransform([AffineTransform(0., 2.), SigmoidTransform(),
......@@ -4055,9 +3987,7 @@
- en: '[PRE547]'
prefs: []
type: TYPE_PRE
- en: Transform from unconstrained space to the simplex via $y = \exp(x)$
then normalizing.
prefs: []
type: TYPE_NORMAL
......
This diff is collapsed.
......@@ -88,9 +88,7 @@
- en: Fill the input Tensor with values drawn from the uniform distribution.
prefs: []
type: TYPE_NORMAL
- en: $\mathcal{U}(a, b)$.
prefs: []
type: TYPE_NORMAL
- en: Parameters
......@@ -135,9 +133,7 @@
- en: Fill the input Tensor with values drawn from the normal distribution.
prefs: []
type: TYPE_NORMAL
- en: $\mathcal{N}(\text{mean}, \text{std}^2)$.
prefs: []
type: TYPE_NORMAL
- en: Parameters
......@@ -317,15 +313,12 @@
type: TYPE_NORMAL
- en: The method is described in Understanding the difficulty of training deep feedforward
neural networks - Glorot, X. & Bengio, Y. (2010). The resulting tensor will have
values sampled from $\mathcal{U}(-a, a)$ where
prefs: []
type: TYPE_NORMAL
- en: $a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}$
prefs: []
type: TYPE_NORMAL
- en: Also known as Glorot initialization.
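- en: 'A typical call (the `gain` value is illustrative):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch
  import torch.nn as nn

  w = torch.empty(3, 5)
  nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
  ```
prefs: []
type: TYPE_PRE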
......@@ -370,15 +363,12 @@
type: TYPE_NORMAL
- en: The method is described in Understanding the difficulty of training deep feedforward
neural networks - Glorot, X. & Bengio, Y. (2010). The resulting tensor will have
values sampled from $\mathcal{N}(0, \text{std}^2)$ where
prefs: []
type: TYPE_NORMAL
- en: $\text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}$
prefs: []
type: TYPE_NORMAL
- en: Also known as Glorot initialization.
......@@ -423,10 +413,8 @@
type: TYPE_NORMAL
- en: 'The method is described in Delving deep into rectifiers: Surpassing human-level
performance on ImageNet classification - He, K. et al. (2015). The resulting tensor
will have values sampled from $\mathcal{U}(-\text{bound}, \text{bound})$ where'
prefs: []
type: TYPE_NORMAL
- en: $\text{bound} = \text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}$
......@@ -483,10 +471,8 @@
type: TYPE_NORMAL
- en: 'The method is described in Delving deep into rectifiers: Surpassing human-level
performance on ImageNet classification - He, K. et al. (2015). The resulting tensor
will have values sampled from $\mathcal{N}(0, \text{std}^2)$ where'
prefs: []
type: TYPE_NORMAL
- en: $\text{std} = \frac{\text{gain}}{\sqrt{\text{fan\_mode}}}$
......@@ -541,12 +527,9 @@
- en: Fill the input Tensor with values drawn from a truncated normal distribution.
prefs: []
type: TYPE_NORMAL
- en: The values are effectively drawn from the normal distribution $\mathcal{N}(\text{mean},
\text{std}^2)$ with values outside $[a, b]$
redrawn until they are within the bounds. The method used for generating the random
values works best when $a \leq \text{mean} \leq b$.
prefs: []
......@@ -638,10 +621,8 @@
- en: Fill the 2D input Tensor as a sparse matrix.
prefs: []
type: TYPE_NORMAL
- en: The non-zero elements will be drawn from the normal distribution $\mathcal{N}(0, 0.01)$, as described in Deep learning
via Hessian-free optimization - Martens, J. (2010).
prefs: []
type: TYPE_NORMAL
......
......@@ -502,10 +502,8 @@
to the weights:'
prefs: []
type: TYPE_NORMAL
- en: $W^\textrm{EMA}_{t+1} = \alpha W^\textrm{EMA}_{t} + (1 - \alpha) W^\textrm{model}_t$
prefs: []
type: TYPE_NORMAL
- en: where alpha is the EMA decay.
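- en: 'A sketch using the averaging utilities (decay value illustrative; assumes a
recent PyTorch that provides `get_ema_multi_avg_fn`):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch
  from torch.optim.swa_utils import AveragedModel, get_ema_multi_avg_fn

  model = torch.nn.Linear(10, 2)
  ema_model = AveragedModel(model, multi_avg_fn=get_ema_multi_avg_fn(0.999))

  # ... after each optimizer step:
  ema_model.update_parameters(model)
  ```
prefs: []
type: TYPE_PRE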
......
......@@ -12,9 +12,7 @@
numbers frequently occur in mathematics and engineering, especially in topics
like signal processing. Traditionally many users and libraries (e.g., TorchAudio)
have handled complex numbers by representing the data in float tensors with shape
$(..., 2)$
where the last dimension contains the real and imaginary values.
prefs: []
type: TYPE_NORMAL
......@@ -71,9 +69,7 @@
- PREF_H2
type: TYPE_NORMAL
- en: Users who previously worked around the lack of complex tensors with real tensors
of shape $(..., 2)$
can easily switch to using complex tensors in their code using [`torch.view_as_complex()`](generated/torch.view_as_complex.html#torch.view_as_complex
"torch.view_as_complex") and [`torch.view_as_real()`](generated/torch.view_as_real.html#torch.view_as_real
"torch.view_as_real"). Note that these functions don’t perform any copy and return
......
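- en: 'A minimal sketch of the round trip (shapes illustrative; both calls return
views, not copies):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch

  x = torch.randn(4, 2)          # legacy real layout (..., 2)
  z = torch.view_as_complex(x)   # complex view sharing x's memory
  x2 = torch.view_as_real(z)     # back to the (..., 2) real view
  ```
prefs: []
type: TYPE_PRE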
......@@ -418,9 +418,7 @@
prefs: []
type: TYPE_TB
- en: '| [`atan2`](generated/torch.atan2.html#torch.atan2 "torch.atan2") | Element-wise
arctangent of $\text{input}_{i} / \text{other}_{i}$
with consideration of the quadrant. |'
prefs: []
type: TYPE_TB
......@@ -510,9 +508,8 @@
equality |'
prefs: []
type: TYPE_TB
- en: '| [`ne`](generated/torch.ne.html#torch.ne "torch.ne") | Computes $\text{input} \neq \text{other}$ element-wise. |'
prefs: []
type: TYPE_TB
- en: '| [`le`](generated/torch.le.html#torch.le "torch.le") | Computes $\text{input} \leq \text{other}$
......
......@@ -249,12 +249,10 @@
and `torch.bfloat16` and e = 8 for `torch.int8`.
prefs: []
type: TYPE_NORMAL
- en: $M_{dense} = r \times c \times e \\ M_{sparse} = M_{specified} + M_{metadata} = r \times \frac{c}{2} \times e + r \times \frac{c}{2} \times 2 = \frac{rce}{2} + rc = rce(\frac{1}{2} + \frac{1}{e})$
prefs: []
type: TYPE_NORMAL
- en: Using these calculations, we can determine the total memory footprint for both
......@@ -265,9 +263,8 @@
only on the bitwidth of the tensor datatype.
prefs: []
type: TYPE_NORMAL
- en: $C = \frac{M_{sparse}}{M_{dense}} = \frac{1}{2} + \frac{1}{e}$
prefs: []
type: TYPE_NORMAL
- en: By using this formula, we find that the compression ratio is 56.25% for `torch.float16`
......
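- en: 'A quick sanity check of that ratio (e is the element bitwidth, per the formula
above):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  # C = 1/2 + 1/e
  for name, e in [("float16/bfloat16", 16), ("int8", 8)]:
      print(name, 0.5 + 1.0 / e)   # 16-bit -> 0.5625 (56.25%), int8 -> 0.625
  ```
prefs: []
type: TYPE_PRE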
......@@ -16,12 +16,9 @@
they are considered close if
prefs: []
type: TYPE_NORMAL
- en: $\lvert \text{actual} - \text{expected} \rvert \le \texttt{atol} + \texttt{rtol} \cdot \lvert \text{expected} \rvert$
prefs: []
type: TYPE_NORMAL
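- en: 'For instance (tolerances illustrative):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch

  actual = torch.tensor([1.0000, 2.0001])
  expected = torch.tensor([1.0, 2.0])
  # passes: |2.0001 - 2| = 1e-4 <= atol + rtol * |2| = 1e-5 + 2e-3
  torch.testing.assert_close(actual, expected, rtol=1e-3, atol=1e-5)
  ```
prefs: []
type: TYPE_PRE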
- en: Non-finite values (`-inf` and `inf`) are considered close if and only if
......
......@@ -293,19 +293,11 @@
- en: 'Shape:'
prefs: []
type: TYPE_NORMAL
- en: 'img_tensor: Default is $(3, H, W)$.
You can use `torchvision.utils.make_grid()` to convert a batch of tensors into
3xHxW format or call `add_images` and let us do the job. Tensor with $(1, H, W)$,
$(H, W)$, $(H, W, 3)$ is also suitable as long as the corresponding
`dataformats` argument is passed, e.g. `CHW`, `HWC`, `HW`.'
prefs: []
type: TYPE_NORMAL
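- en: 'A short usage sketch (log directory and tag are illustrative):'
prefs: []
type: TYPE_NORMAL
- en: |-
  ```py
  import torch
  from torch.utils.tensorboard import SummaryWriter

  writer = SummaryWriter()
  img = torch.rand(3, 64, 64)                 # CHW, the default dataformats
  writer.add_image("example/random", img, global_step=0)
  writer.close()
  ```
prefs: []
type: TYPE_PRE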
......@@ -364,10 +356,8 @@
- en: 'Shape:'
prefs: []
type: TYPE_NORMAL
- en: 'img_tensor: Default is $(N, 3, H, W)$. If `dataformats` is specified,
other shape will be accepted. e.g. NCHW or NHWC.'
prefs: []
type: TYPE_NORMAL
......@@ -466,10 +456,7 @@
- en: 'Shape:'
prefs: []
type: TYPE_NORMAL
- en: 'vid_tensor: $(N, T, C, H, W)$.
The values should lie in [0, 255] for type uint8 or [0, 1] for type float.'
prefs: []
type: TYPE_NORMAL
......@@ -511,9 +498,7 @@
- en: 'Shape:'
prefs: []
type: TYPE_NORMAL
- en: 'snd_tensor: $(1, L)$. The
values should lie between [-1, 1].'
prefs: []
type: TYPE_NORMAL
......@@ -624,15 +609,12 @@
- en: 'Shape:'
prefs: []
type: TYPE_NORMAL
- en: 'mat: $(N, D)$,
where N is number of data and D is feature dimension'
prefs: []
type: TYPE_NORMAL
- en: 'label_img: $(N, C, H, W)$'
prefs: []
type: TYPE_NORMAL
- en: 'Examples:'
......@@ -779,21 +761,15 @@
- en: 'Shape:'
prefs: []
type: TYPE_NORMAL
- en: 'vertices: $(B, N, 3)$.
(batch, number_of_vertices, channels)'
prefs: []
type: TYPE_NORMAL
- en: 'colors: $(B, N, 3)$.
The values should lie in [0, 255] for type uint8 or [0, 1] for type float.'
prefs: []
type: TYPE_NORMAL
- en: 'faces: $(B, N, 3)$.
The values should lie in [0, number_of_vertices] for type uint8.'
prefs: []
type: TYPE_NORMAL
......