From 5891000d633df769e65155688738e4405089d7f2 Mon Sep 17 00:00:00 2001 From: chen Date: Thu, 28 Mar 2019 20:58:20 +0800 Subject: [PATCH] traspose finish --- docs/1.0/nn.md | 70 ++++++++++++++++++++++++-------------------------- 1 file changed, 34 insertions(+), 36 deletions(-) diff --git a/docs/1.0/nn.md b/docs/1.0/nn.md index 07c33a97..64678d50 100644 --- a/docs/1.0/nn.md +++ b/docs/1.0/nn.md @@ -928,7 +928,7 @@ class torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding= * `stride` 参数控制了互相关操作(伪卷积)的步长,参数的数据类型一般是单个数字或者一个只有一个元素的元组。 -* `padding` 参数控制了要在一维卷积核的输入信号的两边填补的0的个数。 +* `padding` 参数控制了要在一维卷积核的输入信号的各维度各边上要补齐0的层数。 * `dilation` 参数控制了卷积核中各元素之间的距离;这也被称为多孔算法(à trous algorithm)。这个概念有点难解释,这个链接[link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md)用可视化的方法很好地解释了`dilation`的作用。 @@ -959,7 +959,7 @@ Conv1d的参数: * **out_channels** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")) – 输出通道个数 * **kernel_size** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")) – 卷积核大小 * **stride** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 卷积操作的步长。 默认: 1 -* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 输入数据各边要补齐0的个数。 默认: 0 +* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 输入数据各维度各边上要补齐0的层数。 默认: 0 * **dilation** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in 
Python v3.7)")_,_ _optional_) – 卷积核各元素之间的距离。 默认: 1 * **groups** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")_,_ _optional_) – 输入通道与输出通道之间相互隔离的连接的个数。 默认:1 * **bias** ([_bool_](https://docs.python.org/3/library/functions.html#bool "(in Python v3.7)")_,_ _optional_) – 如果被置为 `True`,向输出增加一个偏差量,此偏差是可学习参数。 默认:`True` @@ -1006,7 +1006,7 @@ class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding= * `stride` 参数控制了互相关操作(伪卷积)的步长,参数的数据类型一般是单个数字或者一个只有一个元素的元组。 -* `padding` 参数控制了要在二维卷积核的输入信号的上下左右各边填补的0的个数。 +* `padding` 参数控制了要在二维卷积核的输入信号的各维度各边上要补齐0的层数。 * `dilation` 参数控制了卷积核中各元素之间的距离;这也被称为多孔算法(à trous algorithm)。这个概念有点难解释,这个链接[link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md)用可视化的方法很好地解释了`dilation`的作用。 @@ -1043,7 +1043,7 @@ Conv2d的参数: * **out_channels** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")) – 输出通道个数 * **kernel_size** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")) – 卷积核大小 * **stride** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) –卷积操作的步长。 默认: 1 -* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 输入数据各边要补齐0的个数。 默认: 0 +* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 输入数据各维度各边上要补齐0的层数。 默认: 0 * **dilation** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) 
–卷积核各元素之间的距离。 默认: 1 * **groups** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")_,_ _optional_) – 输入通道与输出通道之间相互隔离的连接的个数。 默认:1 * **bias** ([_bool_](https://docs.python.org/3/library/functions.html#bool "(in Python v3.7)")_,_ _optional_) – 如果被置为 `True`,向输出增加一个偏差量,此偏差是可学习参数。 默认:`True` @@ -1097,7 +1097,7 @@ class torch.nn.Conv3d(in_channels, out_channels, kernel_size, stride=1, padding= * `stride` 参数控制了互相关操作(伪卷积)的步长。 -* `padding` 参数控制了要在三维卷积核的输入信号的各个维度的两边填补0的个数。 +* `padding` 参数控制了要在三维卷积核的输入信号的各维度各边上要补齐0的层数。 * `dilation` 参数控制了卷积核中各元素之间的距离;这也被称为多孔算法(à trous algorithm)。这个概念有点难解释,这个链接[link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md)用可视化的方法很好地解释了`dilation`的作用。 @@ -1135,7 +1135,7 @@ Parameters: * **out_channels** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")) – 卷积操作输出通道的个数 * **kernel_size** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")) – 卷积核大小 * **stride** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 卷积操作的步长。 默认: 1 -* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 输入数据各边要补齐0的个数。 默认: 0 +* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 输入数据各维度各边上要补齐0的层数。 默认: 0 * **dilation** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 卷积核各元素之间的距离。 默认: 1 * **groups**
([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")_,_ _optional_) – 输入通道与输出通道之间相互隔离的连接的个数。 默认:1 * **bias** ([_bool_](https://docs.python.org/3/library/functions.html#bool "(in Python v3.7)")_,_ _optional_) – 如果被置为 `True`,向输出增加一个偏差量,此偏差是可学习参数。 默认:`True` @@ -1180,68 +1180,66 @@ Shape: class torch.nn.ConvTranspose1d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1) ``` -利用指定大小的一维转置卷积核对输入的多通道一维输入信号进行转置卷积操作的卷积层。 +利用指定大小的一维转置卷积核对多通道一维输入信号进行转置卷积(此操作实际上也是互相关操作,cross-correlation)的模块。 -Applies a 1D transposed convolution operator over an input image composed of several input planes. +该模块可以看作是Conv1d相对于其输入的梯度(the gradient of Conv1d with respect to its input),转置卷积又被称为小数步长卷积或是反卷积(尽管这并不是真正意义上的反卷积)。 +* `stride` 控制了转置卷积操作的步长。 +* `padding` 控制了要在输入的各维度的各边上补齐0的层数。与Conv1d不同的是,此padding参数与实际补齐0的层数的关系为`层数 = kernel_size - 1 - padding`,详情请见下面的note。 -This module can be seen as the gradient of Conv1d with respect to its input. It is also known as a fractionally-strided convolution or a deconvolution (although it is not an actual deconvolution operation). + * `output_padding` 控制了转置卷积操作输出的各维度的长度增量。注意此参数并不是在转置卷积的输出上补0,而是直接把输出的对应维度增大相应的大小。更多的详情请见下面的note。 -* `stride` controls the stride for the cross-correlation. - -* `padding` controls the amount of implicit zero-paddings on both sides for `kernel_size - 1 - padding` number of points. See note below for details. +* `dilation` 控制了卷积核中各点之间的空间距离;这也被称为多孔算法(à trous algorithm)。这个概念有点难解释,这个链接[link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md)用可视化的方法很好地解释了dilation的作用。 - * `output_padding` controls the additional size added to one side of the output shape. See note below for details. -* `dilation` controls the spacing between the kernel points; also known as the à trous algorithm.
It is harder to describe, but this [link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md) has a nice visualization of what `dilation` does. - -* `groups` controls the connections between inputs and outputs. `in_channels` and `out_channels` must both be divisible by `groups`. For example, +* `groups` 控制了输入输出之间的连接(connections)的数量。`in_channels` 和 `out_channels` 必须能被 `groups` 整除。例如, - > * At groups=1, all inputs are convolved to all outputs. - > * At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated. - > * At groups= `in_channels`, each input channel is convolved with its own set of filters (of size ![](img/648a514da1dace3deacf3f078287e157.jpg)). + > * 当 groups=1, 所有输入通道都会参与所有输出通道的卷积操作。 + + > * 当 groups=2, 此模块等价于两个并列的卷积层:输入通道被分为两半,两个卷积层各自处理一半的输入通道并各自产生一半的输出通道,最后两者的输出会被拼接(concatenate)在一起,作为此模块的输出。 + + > * 当 groups= `in_channels`, 每个输入通道都会由其单独的一组卷积核(filters)进行卷积,这组卷积核的个数是![](img/648a514da1dace3deacf3f078287e157.jpg)。 Note -Depending of the size of your kernel, several (of the last) columns of the input might be lost, because it is a valid [cross-correlation](https://en.wikipedia.org/wiki/Cross-correlation), and not a full [cross-correlation](https://en.wikipedia.org/wiki/Cross-correlation). It is up to the user to add proper padding. +取决于卷积核的大小,输入数据中最后的某些列可能会丢失(例如当剩余的列数不足以再进行一次卷积、又没有padding时,这些列就不会参与计算),这是因为PyTorch中的互相关操作[cross-correlation](https://en.wikipedia.org/wiki/Cross-correlation)是“有效”互相关(valid cross-correlation),而不是“完全”互相关(full cross-correlation)。因此,用户需要自行选择合适的padding参数。 Note -The `padding` argument effectively adds `kernel_size - 1 - padding` amount of zero padding to both sizes of the input.
This is set so that when a [`Conv1d`](#torch.nn.Conv1d "torch.nn.Conv1d") and a [`ConvTranspose1d`](#torch.nn.ConvTranspose1d "torch.nn.ConvTranspose1d") are initialized with same parameters, they are inverses of each other in regard to the input and output shapes. However, when `stride > 1`, [`Conv1d`](#torch.nn.Conv1d "torch.nn.Conv1d") maps multiple input shapes to the same output shape. `output_padding` is provided to resolve this ambiguity by effectively increasing the calculated output shape on one side. Note that `output_padding` is only used to find output shape, but does not actually add zero-padding to output. +`padding` 参数控制了要在输入的各维度各边上补齐0的层数。与Conv1d不同的是,在转置卷积中实际补齐0的层数为`kernel_size - 1 - padding`。这样设置是为了让使用相同参数构建的[`Conv1d`](#torch.nn.Conv1d "torch.nn.Conv1d")和[`ConvTranspose1d`](#torch.nn.ConvTranspose1d "torch.nn.ConvTranspose1d")模块在输入输出大小上正好互逆:Conv1d的输出大小恰好是其对应的ConvTranspose1d模块的输入大小,反之亦然。然而,当`stride > 1`时,[`Conv1d`](#torch.nn.Conv1d "torch.nn.Conv1d")会把多个不同的输入大小映射到同一个输出大小。`output_padding` 参数正是用来消除这种歧义的:它可以增加转置卷积输出某一边的大小,从而在多个可能的输出大小中选定一个。但注意,此参数只参与输出大小的计算,并不会真的在输出上补0。 Note -In some circumstances when using the CUDA backend with CuDNN, this operator may select a nondeterministic algorithm to increase performance. If this is undesirable, you can try to make the operation deterministic (potentially at a performance cost) by setting `torch.backends.cudnn.deterministic = True`. Please see the notes on [Reproducibility](notes/randomness.html) for background.
+当程序的运行环境是使用了CuDNN的CUDA环境的时候,一些非确定性的算法(nondeterministic algorithm)可能会被采用以提高整个计算的性能。如果不想使用这些非确定性的算法,你可以通过设置`torch.backends.cudnn.deterministic = True`来让整个计算过程保持确定性(可能会损失一定的计算性能)。相关背景信息请参阅[Reproducibility](notes/randomness.html)。 Parameters: -* **in_channels** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")) – Number of channels in the input image -* **out_channels** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")) – Number of channels produced by the convolution -* **kernel_size** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")) – Size of the convolving kernel -* **stride** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – Stride of the convolution. Default: 1 -* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – `kernel_size - 1 - padding` zero-padding will be added to both sides of the input. Default: 0 -* **output_padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – Additional size added to one side of the output shape. Default: 0 -* **groups** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")_,_ _optional_) – Number of blocked connections from input channels to output channels. Default: 1 -* **bias** ([_bool_](https://docs.python.org/3/library/functions.html#bool "(in Python v3.7)")_,_ _optional_) – If `True`, adds a learnable bias to the output.
Default: `True` -* **dilation** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – Spacing between kernel elements. Default: 1 +* **in_channels** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")) – 输入通道的个数 +* **out_channels** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")) – 卷积操作输出通道的个数 +* **kernel_size** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")) – 卷积核大小 +* **stride** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 卷积操作的步长。 默认: 1 +* **padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – `kernel_size - 1 - padding` 层 0 会被补齐到输入数据的各边上。 默认: 0 +* **output_padding** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 输出的各维度要增加的大小。默认:0 +* **groups** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)")_,_ _optional_) – 输入通道与输出通道之间相互隔离的连接的个数。 默认:1 +* **bias** ([_bool_](https://docs.python.org/3/library/functions.html#bool "(in Python v3.7)")_,_ _optional_) – 如果被置为 `True`,向输出增加一个偏差量,此偏差是可学习参数。 默认:`True` +* **dilation** ([_int_](https://docs.python.org/3/library/functions.html#int "(in Python v3.7)") _or_ [_tuple_](https://docs.python.org/3/library/stdtypes.html#tuple "(in Python v3.7)")_,_ _optional_) – 卷积核各元素之间的距离。 默认: 1 ```py Shape: ``` -* Input: ![](img/7db3e5e5d600c81e77756d5eee050505.jpg) +* 输入: 
![](img/7db3e5e5d600c81e77756d5eee050505.jpg) -* Output: ![](img/3423094375906aa21d1b2e095e95c230.jpg) where +* 输出: ![](img/3423094375906aa21d1b2e095e95c230.jpg) 其中, ![](img/c37a7e44707d3c08522f44ab4e4d6841.jpg) | Variables: | -* **weight** ([_Tensor_](tensors.html#torch.Tensor "torch.Tensor")) – the learnable weights of the module of shape (in_channels, out_channels, kernel_size[0], kernel_size[1]). The values of these weights are sampled from ![](img/3d305f1c240ff844b6cb2c1c6660e0af.jpg) where ![](img/69aab1ce658aabc9a2d986ae8281e2ad.jpg) -* **bias** ([_Tensor_](tensors.html#torch.Tensor "torch.Tensor")) – the learnable bias of the module of shape (out_channels). If `bias` is `True`, then the values of these weights are sampled from ![](img/3d305f1c240ff844b6cb2c1c6660e0af.jpg) where ![](img/69aab1ce658aabc9a2d986ae8281e2ad.jpg) +* **weight** ([_Tensor_](tensors.html#torch.Tensor "torch.Tensor")) – 模块中的一个大小为 (in_channels, out_channels, kernel_size[0])的权重张量,这些权重可训练学习(learnable)。这些权重的初始值的采样空间是![](img/3d305f1c240ff844b6cb2c1c6660e0af.jpg),其中 ![](img/69aab1ce658aabc9a2d986ae8281e2ad.jpg)。 +* **bias** ([_Tensor_](tensors.html#torch.Tensor "torch.Tensor")) – 模块的偏差项,大小为 (out_channels), 如果构造函数中的 `bias` 被置为 `True`,那么这些权重的初始值的采样空间是 ![](img/3d305f1c240ff844b6cb2c1c6660e0af.jpg) ,其中 ![](img/69aab1ce658aabc9a2d986ae8281e2ad.jpg)。 -- GitLab
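上文的 note 提到,`stride > 1` 时 Conv1d 会把多个输入大小映射到同一个输出大小,而 `output_padding` 用于在转置卷积中消除这种歧义。下面是一个纯 Python 的简单草稿(并非 PyTorch 的实现,只是按照上文 Shape 部分的长度公式写出,数值仅为示例),用来验证这一形状关系:

```py
# 按照文档中的公式计算 Conv1d / ConvTranspose1d 的输出长度(纯 Python 示例)。

def conv1d_out_len(l_in, kernel_size, stride=1, padding=0, dilation=1):
    # Conv1d 输出长度:floor((L_in + 2*padding - dilation*(k-1) - 1) / stride) + 1
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

def conv_transpose1d_out_len(l_in, kernel_size, stride=1, padding=0,
                             output_padding=0, dilation=1):
    # ConvTranspose1d 输出长度:(L_in - 1)*stride - 2*padding
    #                         + dilation*(k-1) + output_padding + 1
    return ((l_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

# stride > 1 时,两个不同的输入长度映射到同一个输出长度:
assert conv1d_out_len(50, kernel_size=3, stride=2) == 24
assert conv1d_out_len(49, kernel_size=3, stride=2) == 24

# 同参数的转置卷积默认恢复到其中一个长度;output_padding=1 则选定另一个:
assert conv_transpose1d_out_len(24, kernel_size=3, stride=2) == 49
assert conv_transpose1d_out_len(24, kernel_size=3, stride=2, output_padding=1) == 50
```

可以看到,`output_padding` 只是在输出大小的计算中加上一个增量,从而在多个可能的输出大小之间做出选择,并不会在输出张量上补0。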