OpenDocCN / pytorch-doc-zh
Commit 5df01738
Authored Feb 22, 2019 by wizardforcel
2019-02-22 13:48:41
Parent: 48de6ef7
Showing 33 changed files with 637 additions and 637 deletions
docs/0.4/1.md +2 -2
docs/0.4/10.md +213 -213
docs/0.4/11.md +23 -23
docs/0.4/12.md +8 -8
docs/0.4/13.md +4 -4
docs/0.4/14.md +19 -19
docs/0.4/15.md +1 -1
docs/0.4/16.md +57 -57
docs/0.4/17.md +132 -132
docs/0.4/18.md +2 -2
docs/0.4/19.md +36 -36
docs/0.4/2.md +5 -5
docs/0.4/20.md +24 -24
docs/0.4/22.md +3 -3
docs/0.4/23.md +2 -2
docs/0.4/24.md +2 -2
docs/0.4/25.md +1 -1
docs/0.4/26.md +4 -4
docs/0.4/27.md +11 -11
docs/0.4/28.md +1 -1
docs/0.4/29.md +2 -2
docs/0.4/3.md +8 -8
docs/0.4/30.md +7 -7
docs/0.4/33.md +2 -2
docs/0.4/34.md +12 -12
docs/0.4/35.md +17 -17
docs/0.4/36.md +12 -12
docs/0.4/37.md +2 -2
docs/0.4/4.md +5 -5
docs/0.4/5.md +3 -3
docs/0.4/6.md +1 -1
docs/0.4/7.md +4 -4
docs/0.4/8.md +12 -12
docs/0.4/1.md
View file @ 5df01738
...
...
@@ -21,7 +21,7 @@
If any input to an operation has `requires_grad` set, its output will also require gradients; conversely, the output will not require gradients only if none of the inputs do. Backward computation is never performed in a subgraph where no variable requires gradients.
```py
>>> x = Variable(torch.randn(5, 5))
>>> y = Variable(torch.randn(5, 5))
>>> z = Variable(torch.randn(5, 5), requires_grad=True)
```
...
...
@@ -37,7 +37,7 @@ True
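For example, to fine-tune a pretrained `CNN`, it is enough to switch the `requires_grad` flags in the frozen part of the model; no intermediate buffers are saved until the computation reaches the last layer, where the affine transform uses weights that require gradients, so the network output requires them as well.
```py
model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
```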
...
...
docs/0.4/10.md
View file @ 5df01738
This diff is collapsed.
docs/0.4/11.md
View file @ 5df01738
...
...
@@ -21,7 +21,7 @@ Torch定义了七种CPU张量类型和八种GPU张量类型:
A tensor can be constructed from a Python `list` or sequence:
```py
>>> torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
 1  2  3
 4  5  6
```
...
...
@@ -30,7 +30,7 @@ Torch定义了七种CPU张量类型和八种GPU张量类型:
An empty tensor can be constructed by specifying its size:
```py
>>> torch.IntTensor(2, 4).zero_()
 0  0  0  0
 0  0  0  0
```
...
...
@@ -39,7 +39,7 @@ Torch定义了七种CPU张量类型和八种GPU张量类型:
The contents of a tensor can be accessed and modified using Python's indexing and slicing notation:
```py
>>> x = torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
>>> print(x[1][2])
6.0
```
...
...
@@ -54,7 +54,7 @@ Torch定义了七种CPU张量类型和八种GPU张量类型:
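> Note: Methods that mutate a tensor are marked with an underscore suffix. For example, `torch.FloatTensor.abs_()` computes the absolute value in place and returns the modified tensor, while `torch.FloatTensor.abs()` computes the result in a new tensor.
```py
class torch.Tensor
class torch.Tensor(*sizes)
class torch.Tensor(size)
```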
...
...
@@ -319,7 +319,7 @@ class torch.Tensor(storage)
Returns the size in bytes of an individual element. Example:
```py
>>> torch.FloatTensor().element_size()
4
>>> torch.ByteTensor().element_size()
```
...
...
@@ -356,7 +356,7 @@ class torch.Tensor(storage)
Example:
```py
>>> x = torch.Tensor([[1], [2], [3]])
>>> x.size()
torch.Size([3, 1])
```
...
...
@@ -372,7 +372,7 @@ torch.Size([3, 1])
Expands this tensor to the size of the specified tensor. This is equivalent to:
```py
self.expand(tensor.size())
```
...
...
@@ -480,7 +480,7 @@ self.expand(tensor.size())
Example:
```py
>>> x = torch.Tensor([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
>>> t = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> index = torch.LongTensor([0, 2, 1])
```
...
...
@@ -504,7 +504,7 @@ self.expand(tensor.size())
Example:
```py
>>> x = torch.Tensor(3, 3)
>>> t = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> index = torch.LongTensor([0, 2, 1])
```
...
...
@@ -528,7 +528,7 @@ self.expand(tensor.size())
Example:
```py
>>> x = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> index = torch.LongTensor([0, 2])
>>> x.index_fill_(0, index, -1)
```
...
...
@@ -623,7 +623,7 @@ self.expand(tensor.size())
Applies `callable` to each element of this tensor and the given tensor, and stores the results in this tensor. The `callable` should have the following signature:
```py
def callable(a, b) -> number
```
...
...
@@ -703,7 +703,7 @@ def callable(a, b) -> number
Example:
```py
>>> x = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> x.narrow(0, 0, 2)
 1  2  3
```
...
...
@@ -782,7 +782,7 @@ def callable(a, b) -> number
Example:
```py
>>> x = torch.randn(2, 3, 5)
>>> x.size()
torch.Size([2, 3, 5])
```
...
...
@@ -864,7 +864,7 @@ torch.Size([5, 2, 3])
Example:
```py
>>> x = torch.Tensor([1, 2, 3])
>>> x.repeat(4, 2)
 1  2  3  1  2  3
```
...
...
@@ -886,7 +886,7 @@ torch.Size([4, 2, 3])
Example:
```py
>>> x = torch.Tensor([[1, 2], [3, 4], [5, 6]])
>>> x.resize_(2, 2)
>>> x
```
...
...
@@ -899,7 +899,7 @@ torch.Size([4, 2, 3])
Resizes this tensor to be the same size as the specified tensor. This is equivalent to:
```py
self.resize_(tensor.size())
```
...
...
@@ -934,7 +934,7 @@ self.resize_(tensor.size())
Example:
```py
>>> x = torch.rand(2, 5)
>>> x
```
...
...
@@ -1025,7 +1025,7 @@ self.resize_(tensor.size())
Example:
```py
>>> torch.Tensor(3, 4, 5).size()
torch.Size([3, 4, 5])
```
...
...
@@ -1066,7 +1066,7 @@ torch.Size([3, 4, 5])
Returns this tensor's offset in the underlying storage, measured in number of storage elements. Example:
```py
>>> x = torch.Tensor([1, 2, 3, 4, 5])
>>> x.storage_offset()
0
```
...
...
@@ -1185,7 +1185,7 @@ torch.Size([3, 4, 5])
Casts this tensor to the type of the given tensor. This is a no-op if the tensor is already of the correct type. This is equivalent to:
```py
self.type(tensor.type())
```
...
...
@@ -1207,7 +1207,7 @@ self.type(tensor.type())
Example:
```py
>>> x = torch.arange(1, 8)
>>> x
```
...
...
@@ -1260,7 +1260,7 @@ self.type(tensor.type())
Example:
```py
>>> x = torch.randn(4, 4)
>>> x.size()
torch.Size([4, 4])
```
...
...
@@ -1276,7 +1276,7 @@ torch.Size([2, 8])
Returns this tensor viewed as the same size as the given tensor. This is equivalent to:
```py
self.view(tensor.size())
```
...
...
docs/0.4/12.md
View file @ 5df01738
...
...
@@ -25,7 +25,7 @@
Usage:
```py
>>> x = torch.Tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
>>> print(x.type())
torch.FloatTensor
```
...
...
@@ -43,7 +43,7 @@ torch.FloatTensor
Via a string:
```py
>>> torch.device('cuda:0')
device(type='cuda', index=0)
```
...
...
@@ -56,7 +56,7 @@ device(type='cuda')
Via a string and a device ordinal:
```py
>>> torch.device('cuda', 0)
device(type='cuda', index=0)
```
...
...
@@ -67,13 +67,13 @@ device(type='cpu', index=0)
> **Note**
> A `torch.device` argument in a function can generally be replaced with a string. This allows for fast prototyping of code.
>
> ```py
> >>> # Example of a function that takes in a torch.device
> >>> cuda1 = torch.device('cuda:1')
> >>> torch.randn((2,3), device=cuda1)
> ```
>
> ```py
> >>> # You can substitute the torch.device with a string
> >>> torch.randn((2,3), device='cuda:1')
> ```
...
...
@@ -83,7 +83,7 @@ device(type='cpu', index=0)
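> **Note**
> For legacy reasons, a device can be constructed from a single device ordinal, which is treated as a `cuda` device. This matches `Tensor.get_device()`, which returns an ordinal for `cuda` tensors and is not supported for `cpu` tensors.
>
> ```py
> >>> torch.device(1)
> device(type='cuda', index=1)
> ```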
...
...
@@ -93,7 +93,7 @@ device(type='cpu', index=0)
> **Note**
> Methods that take a device generally accept a (properly formatted) string or a (legacy) integer device ordinal, i.e. the following are all equivalent:
>
> ```py
> >>> torch.randn((2,3), device=torch.device('cuda:1'))
> >>> torch.randn((2,3), device='cuda:1')
> >>> torch.randn((2,3), device=1)  # legacy
> ```
...
...
@@ -107,7 +107,7 @@ device(type='cpu', index=0)
Example:
```py
>>> x = torch.Tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
>>> x.stride()
(5, 1)
```
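The `(5, 1)` result in this example is what a row-major (contiguous) layout gives for a 2x5 tensor: moving one row skips 5 elements, moving one column skips 1. The following is a minimal pure-Python sketch of that arithmetic, assuming a contiguous layout; `contiguous_strides` is a hypothetical helper written for illustration and is not part of PyTorch.

```python
def contiguous_strides(shape):
    """Return row-major strides (in elements) for a contiguous array of `shape`."""
    strides = []
    step = 1
    # Walk the dimensions from last to first: the last dimension is adjacent in
    # memory, and each earlier dimension jumps over everything that follows it.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


print(contiguous_strides((2, 5)))  # (5, 1)
```

Tensors produced by transposing or slicing are generally not contiguous, so this sketch would not describe their strides.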
...
...
docs/0.4/13.md
View file @ 5df01738
...
...
@@ -8,7 +8,7 @@ torch支持COO(rdinate)格式的稀疏张量,可以有效地存储和处
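A sparse tensor is represented as a pair of dense tensors: a tensor of values and a 2D tensor of indices. A sparse tensor can be constructed by providing these two tensors together with the size of the sparse tensor (which cannot be inferred from them!). Suppose we want to define a sparse tensor with the entry 3 at location (0, 2), the entry 4 at location (1, 0), and the entry 5 at location (1, 2). We would then write:
```py
>>> i = torch.LongTensor([[0, 1, 1], [2, 0, 2]])
>>> v = torch.FloatTensor([3, 4, 5])
```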
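To make the index layout concrete, here is a minimal pure-Python sketch that expands the same indices and values into a dense 2x3 matrix. It assumes the 2D case only, and `coo_to_dense` is a hypothetical helper for illustration, not a PyTorch function.

```python
def coo_to_dense(indices, values, size):
    """Expand 2D COO data into a dense list-of-lists matrix."""
    rows, cols = size
    dense = [[0.0] * cols for _ in range(rows)]
    # indices[0] holds the row coordinates and indices[1] the column coordinates,
    # so entry k lives at (indices[0][k], indices[1][k]).
    for r, c, v in zip(indices[0], indices[1], values):
        dense[r][c] += v  # accumulate, so repeated coordinates add up
    return dense


i = [[0, 1, 1], [2, 0, 2]]
v = [3.0, 4.0, 5.0]
dense = coo_to_dense(i, v, (2, 3))
print(dense)  # [[0.0, 0.0, 3.0], [4.0, 0.0, 5.0]]
```

Note that each column of `i` is one coordinate, which is why the size has to be passed separately: nothing in `i` or `v` says how many all-zero rows or columns exist.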
...
...
@@ -20,7 +20,7 @@ torch支持COO(rdinate)格式的稀疏张量,可以有效地存储和处
Note that the input to LongTensor is NOT a list of index tuples. If you want to write your indices this way, you should transpose them before passing them to the sparse constructor:
```py
>>> i = torch.LongTensor([[0, 2], [1, 0], [1, 2]])
>>> v = torch.FloatTensor([3, 4, 5])
>>> torch.sparse.FloatTensor(i.t(), v, torch.Size([2, 3])).to_dense()
```
...
...
@@ -31,7 +31,7 @@ torch支持COO(rdinate)格式的稀疏张量,可以有效地存储和处
You can also construct hybrid sparse tensors, where only the first n dimensions are sparse and the remaining dimensions are dense.
```py
>>> i = torch.LongTensor([[2, 4]])
>>> v = torch.FloatTensor([[1, 3], [5, 7]])
>>> torch.sparse.FloatTensor(i, v).to_dense()
```
...
...
@@ -45,7 +45,7 @@ torch支持COO(rdinate)格式的稀疏张量,可以有效地存储和处
An empty sparse tensor can be constructed by specifying its size:
```py
print(torch.sparse.FloatTensor(2, 3))
# FloatTensor of size 2x3 with indices:
# [torch.LongTensor with no dimension]
```
...
...
docs/0.4/14.md
View file @ 5df01738
...
...
@@ -14,25 +14,25 @@
[CUDA semantics](http://pytorch.org/docs/master/notes/cuda.html#cuda-semantics) has more details about working with CUDA.
```py
torch.cuda.current_blas_handle()
```
Returns a cublasHandle_t pointer to the current cuBLAS handle.
```py
torch.cuda.current_device()
```
Returns the index of the currently selected device.
```py
torch.cuda.current_stream()
```
Returns the currently selected `Stream`.
```py
class torch.cuda.device(idx)
```
...
...
@@ -42,13 +42,13 @@ class torch.cuda.device(idx)
* idx (int) – device index to select. It is a no-op if this argument is negative.
```py
torch.cuda.device_count()
```
Returns the number of GPUs available.
```py
class torch.cuda.device_of(obj)
```
...
...
@@ -60,13 +60,13 @@ class torch.cuda.device_of(obj)
* obj (Tensor or Storage) – object allocated on the selected device.
```py
torch.cuda.is_available()
```
Returns a bool indicating whether CUDA is currently available.
```py
torch.cuda.set_device(device)
```
...
...
@@ -78,7 +78,7 @@ torch.cuda.set_device(device)
* device (int) – selected device. This function is a no-op if this argument is negative.
```py
torch.cuda.stream(stream)
```
...
...
@@ -90,7 +90,7 @@ torch.cuda.stream(stream)
* stream (Stream) – selected stream. This manager is a no-op if it is `None`.
```py
torch.cuda.synchronize()
```
...
...
@@ -98,7 +98,7 @@ torch.cuda.synchronize()
### Communication collectives
```py
torch.cuda.comm.broadcast(tensor, devices)
```
...
...
@@ -111,7 +111,7 @@ torch.cuda.comm.broadcast(tensor, devices)
Returns: A tuple containing copies of the tensor, placed on the devices corresponding to the given indices.
```py
torch.cuda.comm.reduce_add(inputs, destination=None)
```
...
...
@@ -126,7 +126,7 @@ torch.cuda.comm.reduce_add(inputs, destination=None)
Returns: A tensor containing the elementwise sum of all inputs, placed on the `destination` device.
```py
torch.cuda.comm.scatter(tensor, devices, chunk_sizes=None, dim=0, streams=None)
```
...
...
@@ -141,7 +141,7 @@ torch.cuda.comm.scatter(tensor, devices, chunk_sizes=None, dim=0, streams=None)
Returns: A tuple containing chunks of `tensor`, spread across the given `devices`.
```py
torch.cuda.comm.gather(tensors, dim=0, destination=None)
```
...
...
@@ -159,7 +159,7 @@ torch.cuda.comm.gather(tensors, dim=0, destination=None)
## Streams and events
```py
class torch.cuda.Stream
```
...
...
@@ -200,7 +200,7 @@ CUDA流的包装。
All future work submitted to this stream will wait until all kernels submitted to the given stream at the time of the call have completed.
```py
class torch.cuda.Event(enable_timing=False, blocking=False, interprocess=False, _handle=None)
```
...
...
@@ -240,7 +240,7 @@ CUDA事件的包装。
## NVIDIA Tools Extension (NVTX)
```py
torch.cuda.nvtx.mark(msg)
```
...
...
@@ -248,7 +248,7 @@ CUDA事件的包装。
* msg (string) – ASCII message to associate with the event.
```py
torch.cuda.nvtx.range_push(msg)
```
...
...
@@ -256,7 +256,7 @@ CUDA事件的包装。
* msg (string) – ASCII message to associate with the range.
```py
torch.cuda.nvtx.range_pop()
```
...
...
docs/0.4/15.md
View file @ 5df01738
...
...
@@ -6,7 +6,7 @@
A `torch.Storage` is a contiguous, one-dimensional array of a single data type. Every `torch.Tensor` has a corresponding storage of the same data type.
```py
class torch.FloatStorage
```
...
...
docs/0.4/16.md
View file @ 5df01738
...
...
@@ -31,7 +31,7 @@
`Modules` can also contain other modules, allowing them to be nested in a tree structure. You can assign submodules as regular attributes:
```py
import torch.nn as nn
import torch.nn.functional as F
```
...
...
@@ -52,7 +52,7 @@ class Model(nn.Module):
Adds a child module to the current module. The module can be accessed as an attribute using the given name. Example:
```py
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
```
...
...
@@ -65,7 +65,7 @@ print(model.conv)
Output:
```py
Conv2d(10, 20, kernel_size=(4, 4), stride=(1, 1))
```
...
...
@@ -73,7 +73,7 @@ Conv2d(10, 20, kernel_size=(4, 4), stride=(1, 1))
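Applies `fn` recursively to every submodule (as returned by `.children()`) as well as to the module itself. Typical use includes initializing the parameters of a model (see also `torch-nn-init`). Example:
```py
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
```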
...
...
@@ -150,7 +150,7 @@ Sequential (
> NOTE: Duplicate modules are returned only once. In the following example, `l` will be returned only once.
```py
>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
```
...
...
@@ -168,7 +168,7 @@ Sequential (
Example:
```py
>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
```
...
...
@@ -180,7 +180,7 @@ Sequential (
> Note: Duplicate modules are returned only once. In the following example, `l` will be returned only once.
>
> ```py
> >>> l = nn.Linear(2, 2)
> >>> net = nn.Sequential(l, l)
> >>> for idx, m in enumerate(net.named_modules()):
> ```
...
...
@@ -196,7 +196,7 @@ Sequential (
>
> Returns an iterator over module parameters, yielding both the name of the parameter and the parameter itself. Example:
>
> ```py
> >>> for name, param in self.named_parameters():
> >>>     if name in ['bias']:
> >>>         print(param.size())
> ```
...
...
@@ -208,7 +208,7 @@ Sequential (
Example:
```py
for param in model.parameters():
    print(type(param.data), param.size())
```
...
...
@@ -222,7 +222,7 @@ for param in model.parameters():
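The hook will be called every time the gradients with respect to the module inputs are computed. The hook should have the following signature:
```py
hook(module, grad_input, grad_output) -> Variable or None
```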
...
...
@@ -240,7 +240,7 @@ hook(module, grad_input, grad_output) -> Variable or None
Example:
```py
self.register_buffer('running_mean', torch.zeros(num_features))
```
...
...
@@ -248,7 +248,7 @@ self.register_buffer('running_mean', torch.zeros(num_features))
Registers a `forward hook` on the module. The `hook` will be called every time `forward()` computes an output. It should have the following signature:
```py
hook(module, input, output) -> None
```
...
...
@@ -268,7 +268,7 @@ hook(module, input, output) -> None
Example:
```py
module.state_dict().keys()
# ['bias', 'weight']
```
...
...
@@ -289,7 +289,7 @@ module.state_dict().keys()
To make this easier to understand, here is a small example:
```py
# Example of using Sequential
model = nn.Sequential(
```
...
...
@@ -319,7 +319,7 @@ model = nn.Sequential(OrderedDict([
Example:
```py
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
```
...
...
@@ -360,7 +360,7 @@ ParameterList可以像普通Python列表一样进行索引,但是它包含的
Example:
```py
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
```
...
...
@@ -428,7 +428,7 @@ $$L_{out}=floor((L_{in}+2_padding-dilation_(kernerl_size-1)-1)/stride+1)$$
**example:**
```py
>>> m = nn.Conv1d(16, 33, 3, stride=2)
>>> input = autograd.Variable(torch.randn(20, 16, 50))
>>> output = m(input)
```
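The hunk header for this example carries the output-length formula, L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1)/stride + 1). The sketch below evaluates it in plain Python for the `nn.Conv1d(16, 33, 3, stride=2)` call with an input length of 50. `conv1d_out_length` is a hypothetical helper for illustration, and its defaults (`stride=1`, `padding=0`, `dilation=1`) are assumed to mirror those of `nn.Conv1d`.

```python
import math


def conv1d_out_length(l_in, kernel_size, stride=1, padding=0, dilation=1):
    """Evaluate L_out = floor((L_in + 2*padding - dilation*(kernel_size-1) - 1)/stride + 1)."""
    return math.floor(
        (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1
    )


# kernel_size=3, stride=2, input length 50, as in the example above
print(conv1d_out_length(50, kernel_size=3, stride=2))  # 24
```

Under these assumptions the output of `m(input)` would have size (20, 33, 24): the batch size is unchanged, the channel count becomes `out_channels`, and only the last dimension follows the formula.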
...
...
@@ -471,7 +471,7 @@ bias(`tensor`) - 卷积的偏置系数,大小是(`out_channel`)
Examples:
```py
>>> # With square kernels and equal stride
>>> m = nn.Conv2d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
```
...
...
@@ -515,7 +515,7 @@ $$out(N_i, C_{out_j})=bias(C_{out_j})+\sum^{C_{in}-1}_{k=0}weight(C_{out_j},k)\b
Examples:
```py
>>> # With square kernels and equal stride
>>> m = nn.Conv3d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
```
...
...
@@ -588,7 +588,7 @@ Examples:
**Example**
```py
>>> # With square kernels and equal stride
>>> m = nn.ConvTranspose2d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
```
...
...
@@ -646,7 +646,7 @@ torch.Size([1, 16, 12, 12])
**Example**
```py
>>> # With square kernels and equal stride
>>> m = nn.ConvTranspose3d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
```
...
...
@@ -683,7 +683,7 @@ $$L_{out}=floor((L_{in} + 2_padding - dilation_(kernel_size - 1) - 1)/stride + 1
**example:**
```py
>>> # pool of size=3, stride=2
>>> m = nn.MaxPool1d(3, stride=2)
>>> input = autograd.Variable(torch.randn(20, 16, 50))
```
...
...
@@ -720,7 +720,7 @@ $$W_{out}=floor((W_{in} + 2_padding[1] - dilation[1]_(kernel_size[1] - 1) - 1)/s
**example:**
```py
>>> # pool of square window of size=3, stride=2
>>> m = nn.MaxPool2d(3, stride=2)
>>> # pool of non-square window
```
...
...
@@ -763,7 +763,7 @@ $$W_{out}=floor((W_{in} + 2_padding[2] - dilation[2]_(kernel_size[2] - 1) - 1)/s
**example:**
```py
>>> # pool of square window of size=3, stride=2
>>> m = nn.MaxPool3d(3, stride=2)
>>> # pool of non-square window
```
...
...
@@ -796,7 +796,7 @@ $$H_{out}=(H_{in}-1)_stride[0]-2_padding[0]+kernel_size[0]$$
**Example:**
```py
>>> pool = nn.MaxPool1d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool1d(2, stride=2)
>>> input = Variable(torch.Tensor([[[1, 2, 3, 4, 5, 6, 7, 8]]]))
```
...
...
@@ -852,7 +852,7 @@ $$W_{out}=(W_{in}-1)_stride[1]-2_padding[1]+kernel_size[1]$$
**Example:**
```py
>>> pool = nn.MaxPool2d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool2d(2, stride=2)
>>> input = Variable(torch.Tensor([[[[1, 2, 3, 4],
```
...
...
@@ -910,7 +910,7 @@ H_{out}=(H_{in}-1)_stride[1]-2_padding[0]+kernel_size[1]\ W_{out}=(W_{in}-1)_str
**Example:**
```py
>>> # pool of square window of size=3, stride=2
>>> pool = nn.MaxPool3d(3, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool3d(3, stride=2)
```
...
...
@@ -942,7 +942,7 @@ $$L_{out}=floor((L_{in}+2*padding-kernel_size)/stride+1)$$
**Example:**
```py
>>> # pool with window of size=3, stride=2
>>> m = nn.AvgPool1d(3, stride=2)
>>> m(Variable(torch.Tensor([[[1, 2, 3, 4, 5, 6, 7]]])))
```
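To see what a window of size 3 with stride 2 does to this seven-element input, the following pure-Python sketch averages each window by hand. It assumes no padding, and `avg_pool1d` is a hypothetical stand-in for illustration, not the PyTorch implementation.

```python
def avg_pool1d(values, kernel_size, stride):
    """Average each window of `kernel_size` elements, moving `stride` at a time."""
    out = []
    # The last valid window starts at len(values) - kernel_size.
    for start in range(0, len(values) - kernel_size + 1, stride):
        window = values[start:start + kernel_size]
        out.append(sum(window) / kernel_size)
    return out


print(avg_pool1d([1, 2, 3, 4, 5, 6, 7], kernel_size=3, stride=2))  # [2.0, 4.0, 6.0]
```

The three windows are [1, 2, 3], [3, 4, 5] and [5, 6, 7], which agrees with the length formula in the hunk header: floor((7 + 0 - 3)/2 + 1) = 3.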
...
...
@@ -977,7 +977,7 @@ W_{out}=floor((W_{in}+2*padding[1]-kernel_size[1])/stride[1]+1) \end{aligned} $$
**Example:**
```py
>>> # pool of square window of size=3, stride=2
>>> m = nn.AvgPool2d(3, stride=2)
>>> # pool of non-square window
```
...
...
@@ -1006,7 +1006,7 @@ W_{out}=floor((W_{in}+2*padding[2]-kernel_size[2])/stride[2]+1)
**Example:**
```py
>>> # pool of square window of size=3, stride=2
>>> m = nn.AvgPool3d(3, stride=2)
>>> # pool of non-square window
```
...
...
@@ -1028,7 +1028,7 @@ W_{out}=floor((W_{in}+2*padding[2]-kernel_size[2])/stride[2]+1)
**Example:**
```py
>>> # pool of square window of size=3, and target output size 13x12
>>> m = nn.FractionalMaxPool2d(3, output_size=(13, 12))
>>> # pool of square window and target output size being half of input image size
```
...
...
@@ -1064,7 +1064,7 @@ $$f(x)=pow(sum(X,p),1/p)$$
**Example:**
```py
>>> # power-2 pool of square window of size=3, stride=2
>>> m = nn.LPPool2d(2, 3, stride=2)
>>> # pool of non-square window of power 1.2
```
...
...
@@ -1084,7 +1084,7 @@ $$f(x)=pow(sum(X,p),1/p)$$
**Example:**
```py
>>> # target output size of 5
>>> m = nn.AdaptiveMaxPool1d(5)
>>> input = autograd.Variable(torch.randn(1, 64, 8))
```
...
...
@@ -1102,7 +1102,7 @@ $$f(x)=pow(sum(X,p),1/p)$$
**Example:**
```py
>>> # target output size of 5x7
>>> m = nn.AdaptiveMaxPool2d((5, 7))
>>> input = autograd.Variable(torch.randn(1, 64, 8, 9))
```
...
...
@@ -1122,7 +1122,7 @@ $$f(x)=pow(sum(X,p),1/p)$$
**Example:**
```py
>>> # target output size of 5
>>> m = nn.AdaptiveAvgPool1d(5)
>>> input = autograd.Variable(torch.randn(1, 64, 8))
```
...
...
@@ -1139,7 +1139,7 @@ $$f(x)=pow(sum(X,p),1/p)$$
**Example:**
```py
>>> # target output size of 5x7
>>> m = nn.AdaptiveAvgPool2d((5, 7))
>>> input = autograd.Variable(torch.randn(1, 64, 8, 9))
```
...
...
@@ -1164,7 +1164,7 @@ shape:
Example:
```py
>>> m = nn.ReLU()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1184,7 +1184,7 @@ shape:
Example:
```py
>>> m = nn.ReLU6()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1202,7 +1202,7 @@ shape:
Example:
```py
>>> m = nn.ELU()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1227,7 +1227,7 @@ shape:
Example:
```py
>>> m = nn.PReLU()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1250,7 +1250,7 @@ shape:
Example:
```py
>>> m = nn.LeakyReLU(0.1)
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1276,7 +1276,7 @@ shape:
Example:
```py
>>> m = nn.Threshold(0.1, 20)
>>> input = Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1304,7 +1304,7 @@ shape:
Example:
```py
>>> m = nn.Hardtanh()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1324,7 +1324,7 @@ shape:
Example:
```py
>>> m = nn.Sigmoid()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1344,7 +1344,7 @@ shape:
Example:
```py
>>> m = nn.Tanh()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1362,7 +1362,7 @@ shape:
Example:
```py
>>> m = nn.LogSigmoid()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1391,7 +1391,7 @@ shape:
Example:
```py
>>> m = nn.Softplus()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1415,7 +1415,7 @@ shape:
Example:
```py
>>> m = nn.Softshrink()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1433,7 +1433,7 @@ shape:
Example:
```py
>>> m = nn.Softsign()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1453,7 +1453,7 @@ shape:
Example:
```py
>>> m = nn.Tanhshrink()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
```
...
...
@@ -1473,7 +1473,7 @@ shape:
Example:
```py
>>> m = nn.Softmin()
>>> input = autograd.Variable(torch.randn(2, 3))
>>> print(input)
```
...
...
@@ -1497,7 +1497,7 @@ shape:
Example:
```py
>>> m = nn.Softmax()
>>> input = autograd.Variable(torch.randn(2, 3))
>>> print(input)
```
...
...
@@ -1517,7 +1517,7 @@ shape:
Example:
```py
>>> m = nn.LogSoftmax()
>>> input = autograd.Variable(torch.randn(2, 3))
>>> print(input)
```
...
...
@@ -1552,7 +1552,7 @@ $$ y = \frac{x - mean[x]}{ \sqrt{Var[x]} + \epsilon} * gamma + beta $$
**Example**
```py
>>> # With Learnable Parameters
>>> m = nn.BatchNorm1d(100)
>>> # Without Learnable Parameters
```
...
...
@@ -1589,7 +1589,7 @@ $$ y = \frac{x - mean[x]}{ \sqrt{Var[x]} + \epsilon} * gamma + beta $$
**Example**
```py
>>> # With Learnable Parameters
>>> m = nn.BatchNorm2d(100)
>>> # Without Learnable Parameters
```
...
...
@@ -1626,7 +1626,7 @@ $$ y = \frac{x - mean[x]}{ \sqrt{Var[x]} + \epsilon} * gamma + beta $$
**Example**
```py
>>> # With Learnable Parameters
>>> m = nn.BatchNorm3d(100)
>>> # Without Learnable Parameters
```
...
...
@@ -1685,7 +1685,7 @@ $$ y = \frac{x - mean[x]}{ \sqrt{Var[x]} + \epsilon} * gamma + beta $$
Example:
```py
rnn = nn.RNN(10, 20, 2)
input = Variable(torch.randn(5, 3, 10))
h0 = Variable(torch.randn(2, 3, 20))
```
...
...
docs/0.4/17.md
View file @ 5df01738
This diff is collapsed.
docs/0.4/18.md
View file @ 5df01738
...
...
@@ -117,7 +117,7 @@ grad_outputs应该是output 包含每个输出的预先计算的梯度的长度
The `hook` will be called every time `gradients` are computed. The `hook` should have the following signature:
```py
hook(grad) -> Variable or None
```
...
...
@@ -127,7 +127,7 @@ hook(grad) -> Variable or None
Example:
```py
>>> v = Variable(torch.Tensor([0, 0, 0]), requires_grad=True)
>>> h = v.register_hook(lambda grad: grad * 2)  # double the gradient
>>> v.backward(torch.Tensor([1, 1, 1]))
```
...
...
docs/0.4/19.md
View file @ 5df01738
...
...
@@ -25,7 +25,7 @@
Example:
```py
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = optim.Adam([var1, var2], lr=0.0001)
```
...
...
@@ -40,7 +40,7 @@ optimizer = optim.Adam([var1, var2], lr = 0.0001)
For example, this is very useful when you want to specify per-layer learning rates:
```py
optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3}
```
...
...
@@ -59,7 +59,7 @@ optim.SGD([
Example:
```py
for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
```
...
...
@@ -74,7 +74,7 @@ for input, target in dataset:
Example:
```py
for input, target in dataset:
    def closure():
        optimizer.zero_grad()
```
...
...
@@ -87,7 +87,7 @@ for input, target in dataset:
#### Algorithms
```py
class torch.optim.Optimizer(params, defaults)
```
...
...
@@ -98,7 +98,7 @@ class torch.optim.Optimizer(params, defaults)
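1. params (iterable) – an iterable of `Variable`s or `dict`s. Specifies which variables should be optimized.
2. defaults (dict) – a dict containing default values of the optimization options (used when a parameter group does not specify them).
```py
load_state_dict(state_dict)
```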
...
...
@@ -108,7 +108,7 @@ load_state_dict(state_dict)
1. state_dict (dict) – the `optimizer` state. Should be an object returned from a call to `state_dict()`.
```py
state_dict()
```
...
...
@@ -119,7 +119,7 @@ state_dict()
1. state – a `dict` holding the current `optimization` state. Its content differs between optimizer classes.
2. param_groups – a `dict` containing all parameter groups.
```py
step(closure)
```
...
...
@@ -131,7 +131,7 @@ step(closure)
Clears the gradients of all optimized `Variable`s.
```py
class torch.optim.Adadelta(params, lr=1.0, rho=0.9, eps=1e-06, weight_decay=0)
```
...
...
@@ -147,7 +147,7 @@ class torch.optim.Adadelta(params, lr=1.0, rho=0.9, eps=1e-06, weight_decay=0)
4. lr (float, optional) – coefficient that scales delta before it is applied to the parameters (default: 1.0)
5. weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
```py
step(closure)
```
...
...
@@ -157,7 +157,7 @@ step(closure)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.Adagrad(params, lr=0.01, lr_decay=0, weight_decay=0)
```
...
...
@@ -172,7 +172,7 @@ class torch.optim.Adagrad(params, lr=0.01, lr_decay=0, weight_decay=0)
3. lr_decay (float, optional) – learning rate decay (default: 0)
4. weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
```py
step(closure)
```
...
...
@@ -182,7 +182,7 @@ step(closure)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
```
...
...
@@ -198,7 +198,7 @@ class torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_d
4. eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
5. weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
```py
step(closure)
```
...
...
@@ -208,7 +208,7 @@ step(closure)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.Adamax(params, lr=0.002, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
```
...
...
@@ -224,7 +224,7 @@ class torch.optim.Adamax(params, lr=0.002, betas=(0.9, 0.999), eps=1e-08, weight
4. eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
5. weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
```py
step(closure=None)
```
...
...
@@ -234,7 +234,7 @@ step(closure=None)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.ASGD(params, lr=0.01, lambd=0.0001, alpha=0.75, t0=1000000.0, weight_decay=0)
```
...
...
@@ -251,7 +251,7 @@ class torch.optim.ASGD(params, lr=0.01, lambd=0.0001, alpha=0.75, t0=1000000.0,
5. t0 (float, optional) – point at which to start averaging (default: 1e6)
6. weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
```py
step(closure)
```
...
...
@@ -261,7 +261,7 @@ step(closure)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.LBFGS(params, lr=1, max_iter=20, max_eval=None, tolerance_grad=1e-05, tolerance_change=1e-09, history_size=100, line_search_fn=None)
```
...
...
@@ -280,7 +280,7 @@ class torch.optim.LBFGS(params, lr=1, max_iter=20, max_eval=None, tolerance_grad
5. tolerance_change (float) – termination tolerance on function value/parameter changes (default: 1e-9)
6. history_size (int) – update history size (default: 100)
```py
step(closure)
```
...
...
@@ -290,7 +290,7 @@ step(closure)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.RMSprop(params, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0, momentum=0, centered=False)
```
...
...
@@ -310,7 +310,7 @@ class torch.optim.RMSprop(params, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0
6. centered (bool, optional) – if True, compute the centered RMSProp, in which the gradient is normalized by an estimate of its variance
7. weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
```py
step(closure)
```
...
...
@@ -320,7 +320,7 @@ step(closure)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.Rprop(params, lr=0.01, etas=(0.5, 1.2), step_sizes=(1e-06, 50))
```
...
...
@@ -333,7 +333,7 @@ class torch.optim.Rprop(params, lr=0.01, etas=(0.5, 1.2), step_sizes=(1e-06, 50)
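3. etas (Tuple[float, float], optional) – pair of (etaminus, etaplus), the multiplicative decrease and increase factors (default: (0.5, 1.2))
4. step_sizes (Tuple[float, float], optional) – a pair of minimal and maximal allowed step sizes (default: (1e-6, 50))
```py
step(closure)
```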
...
...
@@ -343,7 +343,7 @@ step(closure)
1. closure (callable, optional) – a closure that reevaluates the model and returns the loss.
```py
class torch.optim.SGD(params, lr=, momentum=0, dampening=0, weight_decay=0, nesterov=False)
```
...
...
@@ -362,7 +362,7 @@ Nesterov动量基于[On the importance of initialization and momentum in deep le
Example:
```py
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
```
...
...
@@ -373,7 +373,7 @@ Nesterov动量基于[On the importance of initialization and momentum in deep le
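> The implementation of SGD with momentum/Nesterov differs subtly from that of Sutskever et al. and from the implementations in some other frameworks. Considering the specific case of momentum, the update can be written as v = ρ∗v + g, p = p − lr∗v, where p, g, v and ρ denote the parameters, gradient, velocity and momentum respectively. This is in contrast to Sutskever et al. and other frameworks, which employ an update of the form v = ρ∗v + lr∗g, p = p − v. The Nesterov version is modified analogously.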
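The practical difference between the two update forms in this note can be checked numerically. The sketch below applies both to a single scalar parameter in plain Python; the function names, the gradient values and the learning-rate sequences are made up for illustration and do not come from the PyTorch source.

```python
def momentum_sgd(p, grads, lrs, rho):
    """Form described in the note: v = rho*v + g, then p = p - lr*v."""
    v = 0.0
    for g, lr in zip(grads, lrs):
        v = rho * v + g
        p = p - lr * v
    return p


def momentum_sgd_sutskever(p, grads, lrs, rho):
    """Sutskever et al. form: v = rho*v + lr*g, then p = p - v."""
    v = 0.0
    for g, lr in zip(grads, lrs):
        v = rho * v + lr * g
        p = p - v
    return p


grads = [1.0, 1.0]
# Constant learning rate: the two forms agree.
same_a = momentum_sgd(1.0, grads, [0.1, 0.1], rho=0.9)
same_b = momentum_sgd_sutskever(1.0, grads, [0.1, 0.1], rho=0.9)
# Learning rate reduced after the first step: the results diverge.
diff_a = momentum_sgd(1.0, grads, [0.1, 0.01], rho=0.9)
diff_b = momentum_sgd_sutskever(1.0, grads, [0.1, 0.01], rho=0.9)
print(same_a, same_b)
print(diff_a, diff_b)
```

With a fixed learning rate the Sutskever velocity is just `lr` times the other velocity, so the parameters match. Once the learning rate changes, the first form rescales the whole accumulated velocity by the new rate, while the second keeps past contributions at the rate they were added with.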
```py
step(closure)
```
...
...
@@ -387,7 +387,7 @@ step(closure)
`torch.optim.lr_scheduler` provides several methods to adjust the learning rate based on the number of epochs. `torch.optim.lr_scheduler.ReduceLROnPlateau` allows dynamic learning rate reduction based on some validation measurements.
```py
class torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda, last_epoch=-1)
```
...
...
@@ -401,7 +401,7 @@ class torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda, last_epoch=-1)
Example:
```py
>>> # Assuming optimizer has two groups.
>>> lambda1 = lambda epoch: epoch // 30
>>> lambda2 = lambda epoch: 0.95 ** epoch
```
...
...
@@ -412,7 +412,7 @@ class torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda, last_epoch=-1)
>>>     validate(...)
```
```py
class torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1)
```
...
...
@@ -425,7 +425,7 @@ class torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoc
Example:
```py
>>> # Assuming optimizer uses lr = 0.05 for all groups
>>> # lr = 0.05     if epoch < 30
>>> # lr = 0.005    if 30 <= epoch < 60
```
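The schedule spelled out in these comments amounts to multiplying the base rate by `gamma` once per completed `step_size` epochs. A minimal pure-Python sketch of that arithmetic follows, taking the initial rate of 0.05 implied by the `epoch < 30` comment and a `step_size` of 30 implied by the epoch boundaries; `step_lr` is a hypothetical helper for illustration, not the scheduler class itself.

```python
def step_lr(base_lr, epoch, step_size, gamma=0.1):
    """Decay `base_lr` by a factor of `gamma` every `step_size` epochs."""
    return base_lr * gamma ** (epoch // step_size)


# Print the rate just before and just after each boundary.
for epoch in (0, 29, 30, 59, 60):
    print(epoch, step_lr(0.05, epoch, step_size=30))
```

The integer division is what keeps the rate flat inside each 30-epoch block, so epochs 0 through 29 all see the base rate.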
...
...
@@ -438,7 +438,7 @@ class torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoc
>>>     validate(...)
```
```py
class torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, last_epoch=-1)
```
...
...
@@ -453,7 +453,7 @@ class torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, las
Example:
```py
>>> # Assuming optimizer uses lr = 0.05 for all groups
>>> # lr = 0.05     if epoch < 30
>>> # lr = 0.005    if 30 <= epoch < 80
```
...
...
@@ -465,7 +465,7 @@ class torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, las
>>>     validate(...)
```
```py
class torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma, last_epoch=-1)
```
```
...
...
@@ -475,7 +475,7 @@ class torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma, last_epoch=-1)
2. gamma (float) – multiplicative factor of learning rate decay.
3. last_epoch (int) – the index of the last epoch. Default: -1.
```py
class torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=False, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)
```
...
...
@@ -492,7 +492,7 @@ class torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0
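9. min_lr (float or list) – A scalar or a list of scalars. A lower bound on the learning rate of all parameter groups or of each group respectively. Default: 0.
10. eps (float) – Minimal decay applied to lr. If the difference between the new and old lr is smaller than eps, the update is ignored. Default: 1e-8.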
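How `patience`, `factor`, `min_lr` and `eps` interact can be sketched in plain Python. This is a simplified illustration, not the library implementation: it assumes `mode='min'` with a relative threshold, ignores `cooldown`, and tracks a single learning rate. The class name `PlateauTracker` is hypothetical.

```python
class PlateauTracker:
    """Simplified plateau logic for mode='min' (hypothetical, for illustration)."""

    def __init__(self, lr, factor=0.1, patience=10, threshold=1e-4,
                 min_lr=0.0, eps=1e-8):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.threshold = threshold
        self.min_lr = min_lr
        self.eps = eps
        self.best = float("inf")
        self.num_bad_epochs = 0

    def step(self, metric):
        # An epoch counts as an improvement only if it beats the best value
        # by more than the relative threshold.
        if metric < self.best * (1.0 - self.threshold):
            self.best = metric
            self.num_bad_epochs = 0
        else:
            self.num_bad_epochs += 1

        # After more than `patience` bad epochs, shrink the rate.
        if self.num_bad_epochs > self.patience:
            new_lr = max(self.lr * self.factor, self.min_lr)
            # Updates smaller than eps are ignored.
            if self.lr - new_lr > self.eps:
                self.lr = new_lr
            self.num_bad_epochs = 0
        return self.lr


tracker = PlateauTracker(lr=0.1, patience=2)
history = [tracker.step(loss) for loss in [1.0, 0.9, 0.9, 0.9, 0.9, 0.9]]
```

With `patience=2`, the rate is reduced on the third consecutive epoch that fails to improve on the best loss.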
```
```
py
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
>>> scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min')
>>> for epoch in range(10):
...
...
docs/0.4/2.md
浏览文件 @
5df01738
...
...
@@ -19,7 +19,7 @@
For example:
```
```
py
>>> x=torch.FloatTensor(5,7,3)
>>> y=torch.FloatTensor(5,7,3)
# same shapes are always broadcastable (i.e. the above rules always hold)
...
...
@@ -51,7 +51,7 @@
For example:
```
```
py
# can line up trailing dimensions to make reading easier
>>> x=torch.FloatTensor(5,1,4,1)
>>> y=torch.FloatTensor(  3,1,1)
...
...
@@ -76,7 +76,7 @@ RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at no
For example:
```
```
py
>>> x=torch.FloatTensor(5,3,4,1)
>>> y=torch.FloatTensor(3,1,1)
>>> (x.add_(y)).size()
...
...
@@ -97,7 +97,7 @@ RuntimeError: The expanded size of the tensor (1) must match the existing size (
For example:
```
```
py
>>> torch.add(torch.ones(4,1), torch.randn(4))
```
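The shape that results from broadcasting two tensors can be worked out without PyTorch. The sketch below assumes the usual rule: align shapes at the trailing dimension, treat a missing dimension as 1, and require each pair of sizes to be equal or to contain a 1. The function name `broadcast_shape` is hypothetical.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError (hypothetical helper)."""
    result = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError("shapes %r and %r are not broadcastable" % (a, b))
        # A size of 1 stretches to match the other size.
        result.append(y if x == 1 else x)
    return tuple(reversed(result))


same = broadcast_shape((5, 7, 3), (5, 7, 3))
aligned = broadcast_shape((5, 1, 4, 1), (3, 1, 1))
column_plus_row = broadcast_shape((4, 1), (4,))
```

The last case shows why adding a `(4, 1)` tensor to a `(4,)` tensor yields a `(4, 4)` result rather than an element-wise sum of four values.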
...
...
@@ -105,7 +105,7 @@ RuntimeError: The expanded size of the tensor (1) must match the existing size (
For example:
```
```
py
>>> torch.utils.backcompat.broadcast_warning.enabled=True
>>> torch.add(torch.ones(4,1), torch.ones(4))
__main__:1: UserWarning: self and other do not have the same shape, but are broadcastable, and have the same number of elements.
...
...
docs/0.4/20.md
浏览文件 @
5df01738
...
...
@@ -2,7 +2,7 @@
# torch.nn.init
```
```
py
torch.nn.init.calculate_gain(nonlinearity, param=None)
```
...
...
@@ -24,11 +24,11 @@ torch.nn.init.calculate_gain(nonlinearity,param=None)
Example:
```
```
py
gain = nn.init.calculate_gain('leaky_relu')
```
```
```
py
torch.nn.init.uniform(tensor, a=0, b=1)
```
...
...
@@ -42,7 +42,7 @@ torch.nn.init.uniform(tensor, a=0, b=1)[source]
Example:
```
```
py
w = torch.Tensor(3, 5)
print(nn.init.uniform(w))
# Output:
...
...
@@ -52,7 +52,7 @@ print nn.init.uniform(w)
# [torch.FloatTensor of size 3x5]
```
```
```
py
torch.nn.init.normal(tensor, mean=0, std=1)
```
...
...
@@ -66,12 +66,12 @@ torch.nn.init.normal(tensor, mean=0, std=1)
Example:
```
```
py
w = torch.Tensor(3, 5)
print(torch.nn.init.normal(w))
```
```
```
py
torch.nn.init.constant(tensor, val)
```
...
...
@@ -84,12 +84,12 @@ torch.nn.init.constant(tensor, val)
Example:
```
```
py
w = torch.Tensor(3, 5)
# val is a required argument; 0.3 is an arbitrary illustrative value
print(torch.nn.init.constant(w, 0.3))
```
```
```
py
torch.nn.init.eye(tensor)
```
...
...
@@ -101,12 +101,12 @@ torch.nn.init.eye(tensor)
Example:
```
```
py
w = torch.Tensor(3, 5)
print(torch.nn.init.eye(w))
```
```
```
py
torch.nn.init.dirac(tensor)
```
...
...
@@ -118,12 +118,12 @@ torch.nn.init.dirac(tensor)
Example:
```
```
py
w = torch.Tensor(3, 16, 5, 5)
print(torch.nn.init.dirac(w))
```
```
```
py
torch.nn.init.xavier_uniform(tensor, gain=1)
```
...
...
@@ -136,12 +136,12 @@ torch.nn.init.xavier_uniform(tensor, gain=1)
Example:
```
```
py
w = torch.Tensor(3, 5)
print(torch.nn.init.xavier_uniform(w, gain=nn.init.calculate_gain('relu')))
```
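The sampling range behind Xavier (Glorot) uniform initialization can be computed by hand. A minimal sketch, assuming the standard formula `bound = gain * sqrt(6 / (fan_in + fan_out))`, a ReLU gain of `sqrt(2)`, and that a 3x5 weight has `fan_in=5` and `fan_out=3`; the helper name `xavier_uniform_bound` is hypothetical.

```python
import math


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Half-width of the uniform range U(-bound, bound) (hypothetical helper)."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


relu_gain = math.sqrt(2.0)  # assumed gain for ReLU
bound = xavier_uniform_bound(fan_in=5, fan_out=3, gain=relu_gain)
```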
```
```
py
torch.nn.init.xavier_normal(tensor, gain=1)
```
...
...
@@ -154,12 +154,12 @@ torch.nn.init.xavier_normal(tensor, gain=1)
Example:
```
```
py
>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_normal(w)
```
```
```
py
torch.nn.init.kaiming_uniform(tensor, a=0, mode='fan_in')
```
...
...
@@ -173,12 +173,12 @@ torch.nn.init.kaiming_uniform(tensor, a=0, mode='fan_in')
Example:
```
```
py
w = torch.Tensor(3, 5)
torch.nn.init.kaiming_uniform(w, mode='fan_in')
```
```
```
py
torch.nn.init.kaiming_normal(tensor, a=0, mode='fan_in')
```
...
...
@@ -190,12 +190,12 @@ torch.nn.init.kaiming_normal(tensor, a=0, mode='fan_in')
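2. a - the negative slope of the rectifier used after this layer (0 for ReLU by default)
3. mode - either "fan_in" (default) or "fan_out". Choosing "fan_in" preserves the magnitude of the variance of the weights in the forward pass; choosing "fan_out" preserves the magnitudes in the backward pass.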
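The effect of `a` and `mode` on the standard deviation can be sketched numerically. This assumes the usual He initialization formula `std = sqrt(2 / ((1 + a^2) * fan))`, where `fan` is the fan-in or fan-out selected by `mode`; the helper `kaiming_normal_std` is hypothetical, and the fan values correspond to an assumed 3x5 weight.

```python
import math


def kaiming_normal_std(fan, a=0.0):
    """Standard deviation used for He normal initialization (hypothetical helper)."""
    gain = math.sqrt(2.0 / (1.0 + a ** 2))  # a is the negative slope of the rectifier
    return gain / math.sqrt(fan)


std_fan_in = kaiming_normal_std(fan=5)   # mode='fan_in' for a 3x5 weight
std_fan_out = kaiming_normal_std(fan=3)  # mode='fan_out' for the same weight
```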
```
```
py
w = torch.Tensor(3, 5)
print(torch.nn.init.kaiming_normal(w, mode='fan_out'))
```
```
```
py
torch.nn.init.orthogonal(tensor, gain=1)
```
...
...
@@ -208,12 +208,12 @@ torch.nn.init.orthogonal(tensor, gain=1)
Example:
```
```
py
w = torch.Tensor(3, 5)
print(torch.nn.init.orthogonal(w))
```
```
```
py
torch.nn.init.sparse(tensor, sparsity, std=0.01)
```
...
...
@@ -226,7 +226,7 @@ torch.nn.init.sparse(tensor, sparsity, std=0.01)
3. std - the standard deviation of the normal distribution used to generate the non-zero values

Example:
```
```
py
w = torch.Tensor(3, 5)
print(torch.nn.init.sparse(w, sparsity=0.1))
```
...
...
docs/0.4/22.md
浏览文件 @
5df01738
...
...
@@ -20,19 +20,19 @@
## Strategy management
```
```
py
torch.multiprocessing.get_all_sharing_strategies()
```
Returns the set of sharing strategies supported on the current system.
```
```
py
torch.multiprocessing.get_sharing_strategy()
```
Returns the current strategy for sharing CPU tensors.
```
```
py
torch.multiprocessing.set_sharing_strategy(new_strategy)
```
...
...
docs/0.4/23.md
浏览文件 @
5df01738
...
...
@@ -76,7 +76,7 @@ Rank是分配给分布式组中每个进程的唯一标识符。它们总是连
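Alternatively, the address has to be a valid IP multicast address, in which case ranks can be assigned automatically. Multicast initialization also supports a group_name argument, which allows the same address to be used for multiple jobs as long as they use different group names.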
```
```
py
import torch.distributed as dist
# Use address of one of the machines
...
...
@@ -95,7 +95,7 @@ dist.init_process_group(init_method='tcp://[ff15:1e18:5d4c:4cf0:d02d:b659:53ba:b
This method assumes that the file system supports locking using fcntl; most local file systems and NFS support it.
```
```
py
import torch.distributed as dist
# Rank will be assigned automatically if unspecified
...
...
docs/0.4/24.md
浏览文件 @
5df01738
...
...
@@ -4,13 +4,13 @@
Run it on the command line with
```
```
py
python -m torch.utils.bottleneck /path/to/source/script.py [args]
```
where `[args]` are any number of arguments to `script.py`. You can also run the following for more usage instructions.
```
```
py
python -m torch.utils.bottleneck -h
```
...
...
docs/0.4/25.md
浏览文件 @
5df01738
...
...
@@ -46,7 +46,7 @@
Example:
```
```
py
>>> model = nn.Sequential(...)
>>> input_var = checkpoint_sequential(model, chunks, input_var)
```
...
...
docs/0.4/26.md
浏览文件 @
5df01738
...
...
@@ -12,7 +12,7 @@
Example
```
```
py
>>> from setuptools import setup
>>> from torch.utils.cpp_extension import BuildExtension, CppExtension
>>> setup(
...
...
@@ -34,7 +34,7 @@
Example
```
```
py
>>> from setuptools import setup
>>> from torch.utils.cpp_extension import BuildExtension, CppExtension
>>> setup(
...
...
@@ -85,7 +85,7 @@
Example
```
```
py
>>> from torch.utils.cpp_extension import load
>>> module = load(
        name='extension',
...
...
@@ -103,7 +103,7 @@
For example:
```
```
py
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension
...
...
docs/0.4/27.md
浏览文件 @
5df01738
...
...
@@ -2,7 +2,7 @@
## torch.utils.data
```
```
py
class torch.utils.data.Dataset
```
...
...
@@ -10,7 +10,7 @@ class torch.utils.data.Dataset
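All other datasets should subclass it. All subclasses should override `__len__`, which provides the size of the dataset, and `__getitem__`, which supports integer indexing in the range from 0 to len(self) exclusive.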
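The two-method protocol can be illustrated without PyTorch. The sketch below is duck-typed and deliberately does not import `torch`; in real code the class would subclass `torch.utils.data.Dataset`. The class name `SquaresDataset` and its contents are hypothetical.

```python
class SquaresDataset:
    """Toy dataset whose i-th sample is the pair (i, i * i) (hypothetical example)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Size of the dataset.
        return self.n

    def __getitem__(self, index):
        # Integer indexing in the range 0 <= index < len(self).
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index, index * index


dataset = SquaresDataset(5)
samples = [dataset[i] for i in range(len(dataset))]
```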
```
```
py
class torch.utils.data.TensorDataset(data_tensor, target_tensor)
```
...
...
@@ -25,7 +25,7 @@ class torch.utils.data.TensorDataset(data_tensor, target_tensor)
Example:
```
```
py
x = torch.linspace(1, 10, 10)  # x data (torch tensor)
y = torch.linspace(10, 1, 10)  # y data (torch tensor)
...
...
@@ -33,7 +33,7 @@ y = torch.linspace(10, 1, 10) # y data (torch tensor)
torch_dataset = torch.utils.data.TensorDataset(data_tensor=x, target_tensor=y)
```
```
```
py
class torch.utils.data.ConcatDataset(datasets)
```
...
...
@@ -44,7 +44,7 @@ class torch.utils.data.ConcatDataset(datasets)
* datasets (parameter): list of datasets to be concatenated
* datasets (type): iterable
```
```
py
class torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, num_workers=0, collate_fn=<function default_collate>, pin_memory=False, drop_last=False)
```
...
...
@@ -62,7 +62,7 @@ class torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=
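8. pin_memory (bool, optional) – If True, the data loader will copy tensors into CUDA pinned memory before returning them.
9. drop_last (bool, optional) – Set to True to drop the last incomplete batch if the dataset size is not divisible by batch_size. If False and the dataset size is not divisible by batch_size, the last batch will be smaller. (Default: False)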
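The effect of `drop_last` on the number of batches per epoch is simple arithmetic. A minimal sketch; the helper `num_batches` is hypothetical and only mirrors the rule stated in the parameter description.

```python
import math


def num_batches(dataset_size, batch_size, drop_last=False):
    """Number of batches one pass over the data yields (hypothetical helper)."""
    if drop_last:
        # The trailing incomplete batch is discarded.
        return dataset_size // batch_size
    # The trailing incomplete batch is kept and is simply smaller.
    return math.ceil(dataset_size / batch_size)


kept = num_batches(10, 3)
dropped = num_batches(10, 3, drop_last=True)
```

With 10 samples and a batch size of 3, keeping the last batch gives four batches (the final one holds a single sample), while dropping it gives three.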
```
```
py
class torch.utils.data.sampler.Sampler(data_source)
```
...
...
@@ -70,7 +70,7 @@ class torch.utils.data.sampler.Sampler(data_source)
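Every Sampler subclass has to provide an `__iter__` method, which gives a way to iterate over indices of dataset elements, and a `__len__` method that returns the length of the returned iterator.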
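A sampler only has to produce indices. The following duck-typed sketch does not import `torch`; in real code the class would subclass `torch.utils.data.sampler.Sampler`. The name `ReverseSampler` is hypothetical.

```python
class ReverseSampler:
    """Yields dataset indices from last to first (hypothetical example)."""

    def __init__(self, data_source):
        self.data_source = data_source

    def __iter__(self):
        # Iterate over indices, not over the elements themselves.
        return iter(range(len(self.data_source) - 1, -1, -1))

    def __len__(self):
        return len(self.data_source)


sampler = ReverseSampler(["a", "b", "c", "d"])
order = list(sampler)
```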
```
```
py
class torch.utils.data.sampler.SequentialSampler(data_source)
```
...
...
@@ -80,7 +80,7 @@ class torch.utils.data.sampler.SequentialSampler(data_source)
* `data_source (Dataset)` – dataset to sample from.
```
```
py
class torch.utils.data.sampler.RandomSampler(data_source)
```
...
...
@@ -88,7 +88,7 @@ class torch.utils.data.sampler.RandomSampler(data_source)
Parameters: - `data_source (Dataset)` – dataset to sample from.
```
```
py
class torch.utils.data.sampler.SubsetRandomSampler(indices)
```
...
...
@@ -96,7 +96,7 @@ class torch.utils.data.sampler.SubsetRandomSampler(indices)
Parameters: - `indices (list)` – a list of indices
```
```
py
class torch.utils.data.sampler.WeightedRandomSampler(weights, num_samples, replacement=True)
```
...
...
@@ -107,7 +107,7 @@ class torch.utils.data.sampler.WeightedRandomSampler(weights, num_samples, repla
* `weights (list)` – a list of weights, which need not sum to 1
* `num_samples (int)` – number of samples to draw
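Weighted sampling with replacement can be sketched with the standard library alone. This illustrates the idea only and is not how PyTorch implements it; the helper `weighted_sample` and the `seed` parameter are hypothetical.

```python
import random


def weighted_sample(weights, num_samples, seed=None):
    """Draw `num_samples` indices with probability proportional to `weights` (hypothetical helper)."""
    rng = random.Random(seed)
    # random.choices normalizes the weights internally, so they need not sum to 1.
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)


indices = weighted_sample([0.0, 1.0, 3.0], num_samples=8, seed=0)
```

An index with weight 0 is never drawn, and an index with three times the weight is drawn about three times as often on average.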
```
```
py
class torch.utils.data.distributed.DistributedSampler(dataset, num_replicas=None, rank=None)
```
...
...
docs/0.4/28.md
浏览文件 @
5df01738
...
...
@@ -2,7 +2,7 @@
# torch.utils.ffi
```
```
py
torch.utils.ffi.create_extension(name, headers, sources, verbose=True, with_cuda=False, package=False, relative_to='.', **kwargs)
```
...
...
docs/0.4/29.md
浏览文件 @
5df01738
...
...
@@ -2,7 +2,7 @@
# torch.utils.model_zoo
```
```
py
torch.utils.model_zoo.load_url(url, model_dir=None)
```
...
...
@@ -20,7 +20,7 @@ torch.utils.model_zoo.load_url(url, model_dir=None)
For example:
```
```
py
>>> state_dict = torch.utils.model_zoo.load_url('https://s3.amazonaws.com/pytorch/models/resnet18-5c106cde.pth')
```
...
...
docs/0.4/3.md
浏览文件 @
5df01738
...
...
@@ -20,7 +20,7 @@
Below is a small example showcasing this:
```
```
py
cuda = torch.device("cuda")     # Default CUDA device
cuda0 = torch.device("cuda:0")
cuda2 = torch.device("cuda:2")  # GPU 2
...
...
@@ -63,7 +63,7 @@ CUDA 流是属于特定设备的线性执行序列。您通常不需要明确创
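Operations inside each stream are serialized in the order they are created, but operations from different streams can execute concurrently in any relative order, unless explicit synchronization functions (such as `synchronize()` or `wait_stream()`) are used. For example, the following code is incorrect: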
```
```
py
cuda = torch.device("cuda")
s = torch.cuda.Stream()  # Create a new stream.
A = torch.empty((100, 100), device=cuda).normal_(0.0, 1.0)
...
...
@@ -87,7 +87,7 @@ with torch.cuda.stream(s):
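The first step is to determine whether the GPU should be used or not. A common pattern is to use Python's `argparse` module to read in user arguments, and to have a flag that can be used to disable CUDA, in combination with `is_available()`. In the following, `args.device` results in a `torch.device` object that can be used to move tensors to CPU or CUDA.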
```
```
py
import argparse
import torch
...
...
@@ -103,14 +103,14 @@ else:
Now that we have `args.device`, we can use it to create a tensor on the desired device.
```
```
py
x = torch.empty((8, 42), device=args.device)
net = Network().to(device=args.device)
```
This can be used in a number of cases to produce device-agnostic code. Below is an example using a `dataloader`:
```
```
py
cuda0 = torch.device('cuda:0')  # CUDA GPU 0
for i, x in enumerate(train_loader):
    x = x.to(cuda0)
...
...
@@ -118,7 +118,7 @@ for i, x in enumerate(train_loader):
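When working with multiple GPUs on a system, you can use the `CUDA_VISIBLE_DEVICES` environment flag to manage which GPUs are available to PyTorch. As mentioned above, to manually control which GPU a tensor is created on, the best practice is to use a `torch.cuda.device` context manager.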
```
```
py
print("Outside device is 0")  # On device 0
with torch.cuda.device(1):
    print("Inside device is 1")  # On device 1
...
...
@@ -129,7 +129,7 @@ print("外部的设备仍是0") # 设备0
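This is the recommended practice when authoring modules in which new tensors need to be created internally during the forward pass.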
```
```
py
cuda = torch.device("cuda")
x_cpu = torch.empty(2)
y_gpu = torch.empty(2, device=cuda)
...
...
@@ -153,7 +153,7 @@ print(y_cpu_long)
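If you want to create a tensor of the same type and size as another tensor, and fill it with either ones or zeros, `ones_like()` or `zeros_like()` are provided as convenient helper functions (which also preserve the `torch.device` and `torch.dtype` of the tensor).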
```
```
py
x_cpu = torch.empty(2, 3)
x_gpu = torch.empty(2, 3)
...
...
docs/0.4/30.md
浏览文件 @
5df01738
...
...
@@ -6,7 +6,7 @@
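Here is a simple script which exports a pretrained `AlexNet`, as defined in `torchvision`, into `ONNX`. It runs a single round of inference and then saves the resulting traced model to `alexnet.proto`: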
```
```
py
from torch.autograd import Variable
import torch.onnx
import torchvision
...
...
@@ -18,7 +18,7 @@ torch.onnx.export(model, dummy_input, "alexnet.proto", verbose=True)
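The resulting `alexnet.proto` is a binary `protobuf` file which contains both the network structure and the parameters of the model you exported (in this case, `AlexNet`). The keyword argument `verbose=True` causes the exporter to print out a human-readable representation of the network: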
```
```
py
# All parameters are encoded explicitly as inputs. By convention,
# learned parameters (ala nn.Module.state_dict) are first, and the
# actual inputs are last.
...
...
@@ -50,13 +50,13 @@ graph(%1 : Float(64, 3, 11, 11)
You can also verify the `protobuf` using the [onnx](https://github.com/onnx/onnx/) library. You can install `onnx` with `conda`:
```
```
py
conda install -c conda-forge onnx
```
Then, you can run:
```
```
py
import onnx
# Load the ONNX model
...
...
@@ -75,13 +75,13 @@ onnx.helper.printable_graph(model.graph)
* 2. You need `onnx-caffe2`, a pure-`Python` library which provides a `Caffe2` backend for `ONNX`. You can install `onnx-caffe2` with `pip`:
```
```
py
pip install onnx-caffe2
```
Once these are installed, you can use the backend for `Caffe2`:
```
```
py
# ...continuing from above
import onnx_caffe2.backend as backend
import numpy as np
...
...
@@ -114,7 +114,7 @@ print(outputs[0])
### torch.onnx functions
```
```
py
torch.onnx.export(model, args, f, export_params=True, verbose=False, training=False)
```
...
...
docs/0.4/33.md
浏览文件 @
5df01738
...
...
@@ -6,13 +6,13 @@
The following returns the name of the package used to load images.
```
```
py
torchvision.get_image_backend()
```
Specifies the package used to load images.
```
```
py
torchvision.set_image_backend(backend)
```
...
...
docs/0.4/34.md
浏览文件 @
5df01738
...
...
@@ -18,7 +18,7 @@
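All datasets are subclasses of `torch.utils.data.Dataset`, i.e. they have `__getitem__` and `__len__` methods implemented. Hence, they can all be passed to a `torch.utils.data.DataLoader`, which can load multiple samples in parallel using `torch.multiprocessing` workers. For example: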
```
```
py
imagenet_data = torchvision.datasets.ImageFolder('path/to/imagenet_root/')
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
...
...
@@ -30,7 +30,7 @@ data_loader = torch.utils.data.DataLoader(imagenet_data,
#### MNIST
```
```
py
dset.MNIST(root, train=True, transform=None, target_transform=None, download=False)
```
...
...
@@ -46,7 +46,7 @@ dset.MNIST(root, train=True, transform=None, target_transform=None, download=Fal
Requires the [COCO API](https://github.com/pdollar/coco/tree/master/PythonAPI) to be installed.
```
```
py
dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [transform, target_transform])
```
...
...
@@ -59,7 +59,7 @@ dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [
Example:
```
```
py
import torchvision.datasets as dset
import torchvision.transforms as transforms
cap = dset.CocoCaptions(root='dir where images are',
...
...
@@ -75,7 +75,7 @@ print(target)
Output:
```
```
py
Number of samples: 82783
Image Size: (3L, 427L, 640L)
[u'A plane emitting smoke stream flying over a mountain.',
...
...
@@ -89,7 +89,7 @@ u'A mountain view with a plume of smoke in the background']
Detection:
```
```
py
dset.CocoDetection(root="dir where images are", annFile="json annotation file", [transform, target_transform])
```
...
...
@@ -104,7 +104,7 @@ dset.CocoDetection(root="dir where images are", annFile="json annotation file",
#### LSUN
```
```
py
dset.LSUN(db_path, classes='train', [transform, target_transform])
```
...
...
@@ -119,7 +119,7 @@ dset.LSUN(db_path, classes='train', [transform, target_transform])
A generic data loader where the images are arranged in this way:
```
```
py
root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png
...
...
@@ -143,7 +143,7 @@ dset.ImageFolder(root="root folder path", [transform, target_transform])
#### CIFAR
```
```
py
dset.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)
dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=False)
...
...
@@ -159,7 +159,7 @@ dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=
#### STL10
```
```
py
dset.STL10(root, split='train', transform=None, target_transform=None, download=False)
```
...
...
@@ -173,7 +173,7 @@ dset.STL10(root, split='train', transform=None, target_transform=None, download=
#### SVHN
```
```
py
class torchvision.datasets.SVHN(root, split='train', transform=None, target_transform=None, download=False)
```
...
...
@@ -187,7 +187,7 @@ class torchvision.datasets.SVHN(root, split='train', transform=None, target_tran
#### PhotoTour
```
```
py
class torchvision.datasets.PhotoTour(root, name, train=True, transform=None, download=False)
```
...
...
docs/0.4/35.md
浏览文件 @
5df01738
...
...
@@ -12,7 +12,7 @@
You can construct a model with random weights by calling its constructor:
```
```
py
import torchvision.models as models
resnet18 = models.resnet18()
alexnet = models.alexnet()
...
...
@@ -22,7 +22,7 @@ densenet = models.densenet_161()
We provide pre-trained models for the ResNet variants and AlexNet, using the PyTorch `torch.utils.model_zoo`. These can be constructed by passing `pretrained=True`:
```
```
py
import torchvision.models as models
resnet18 = models.resnet18(pretrained=True)
alexnet = models.alexnet(pretrained=True)
...
...
@@ -30,7 +30,7 @@ alexnet = models.alexnet(pretrained=True)
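All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. You can use the following transform to normalize: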
```
```
py
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
```
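What this normalization does to a single RGB pixel can be worked out by hand. The sketch below applies `(value - mean) / std` per channel in plain Python, without `torchvision`; the helper `normalize_pixel` is hypothetical, and pixel values are assumed to already be in the range [0, 1].

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Normalize one (r, g, b) pixel channel by channel (hypothetical helper)."""
    return [(value - mean) / std for value, mean, std in zip(rgb, MEAN, STD)]


mean_pixel = normalize_pixel(MEAN)              # a pixel equal to the mean maps to zero
white_pixel = normalize_pixel([1.0, 1.0, 1.0])  # the brightest possible pixel
```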
...
...
@@ -56,7 +56,7 @@ normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],std=[0.229, 0.224, 0
| Densenet-201 | 22.80 | 6.43 |
| Densenet-161 | 22.35 | 6.20 |
```
```
py
torchvision.models.alexnet(pretrained=False, **kwargs)
```
...
...
@@ -64,19 +64,19 @@ AlexNet 模型结构 paper地址
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.resnet18(pretrained=False, **kwargs)
```
Constructs a ResNet-18 model. Parameters: pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.resnet34(pretrained=False, **kwargs)
```
Constructs a ResNet-34 model. Parameters: pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.resnet50(pretrained=False, **kwargs)
```
...
...
@@ -84,7 +84,7 @@ torchvision.models.resnet50(pretrained=False, ** kwargs)
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.resnet101(pretrained=False, **kwargs)
```
...
...
@@ -92,7 +92,7 @@ torchvision.models.resnet101(pretrained=False, ** kwargs)
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.resnet152(pretrained=False, **kwargs)
```
...
...
@@ -100,7 +100,7 @@ torchvision.models.resnet152(pretrained=False, ** kwargs)
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.vgg11(pretrained=False, **kwargs)
```
...
...
@@ -108,13 +108,13 @@ VGG 11-layer model (configuration “A”) -
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.vgg11_bn(**kwargs)
```
VGG 11-layer model (configuration “A”) with batch normalization
```
```
py
torchvision.models.vgg13(pretrained=False, **kwargs)
```
...
...
@@ -122,13 +122,13 @@ VGG 13-layer model (configuration “B”)
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.vgg13_bn(**kwargs)
```
VGG 13-layer model (configuration “B”) with batch normalization
```
```
py
torchvision.models.vgg16(pretrained=False, **kwargs)
```
...
...
@@ -136,13 +136,13 @@ VGG 16-layer model (configuration “D”)
Parameters: pretrained (bool) – If True, returns a model pre-trained on ImageNet
```
```
py
torchvision.models.vgg16_bn(**kwargs)
```
VGG 16-layer model (configuration “D”) with batch normalization
```
```
py
torchvision.models.vgg19(pretrained=False, **kwargs)
```
...
...
@@ -150,7 +150,7 @@ VGG 19-layer model (configuration “E”)
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
```
```
py
torchvision.models.vgg19_bn(**kwargs)
```
...
...
docs/0.4/36.md
浏览文件 @
5df01738
...
...
@@ -11,7 +11,7 @@
Transforms are common image transformations. They can be chained together using `Compose`.
```
```
py
class torchvision.transforms.Compose(transforms)
```
...
...
@@ -19,7 +19,7 @@ class torchvision.transforms.Compose(transforms)
transforms: a list of transforms to compose. Example:
```
```
py
transforms.Compose([
    transforms.CenterCrop(10),
    transforms.ToTensor(),
...
...
@@ -30,7 +30,7 @@ transforms.Compose([
* * *
```
```
py
class torchvision.transforms.Scale(size, interpolation=2)
```
...
...
@@ -41,31 +41,31 @@ class torchvision.transforms.Scale(size, interpolation=2)
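1. size (sequence or int) - Desired output size. If size is a sequence like (w, h), the output size will be matched to it. If size is an int, the smaller edge of the image will be matched to this number. For example, if `height>width`, then the image will be rescaled to `(size*height/width, size)`.
2. interpolation (int, optional) - Desired interpolation. Default is `PIL.Image.BILINEAR`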
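The size rule for an integer `size` can be sketched as arithmetic on the image dimensions. This is an illustration under assumptions: the shorter edge is matched to `size`, the aspect ratio is kept, and the result is returned as `(new_width, new_height)` truncated to integers. The helper `scaled_size` is hypothetical and is not the library code.

```python
def scaled_size(width, height, size):
    """Return (new_width, new_height) so that the shorter edge equals `size` (hypothetical helper)."""
    if width <= height:
        # Portrait or square: width is the shorter edge.
        return size, int(size * height / width)
    # Landscape: height is the shorter edge.
    return int(size * width / height), size


portrait = scaled_size(400, 800, 100)
landscape = scaled_size(800, 400, 100)
```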
```
```
py
class torchvision.transforms.CenterCrop(size)
```
Crops the given PIL.Image at the center to the given size. size can be a tuple (target_height, target_width). size can also be an integer, in which case the crop is square.
```
```
py
class torchvision.transforms.RandomCrop(size, padding=0)
```
Crops the given PIL.Image at a randomly chosen location. size can be a tuple or an integer.
```
```
py
class torchvision.transforms.RandomHorizontalFlip
```
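Horizontally flips the given PIL.Image at random with a probability of 0.5, i.e. the image is flipped half of the time and left unchanged otherwise.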
```
```
py
class torchvision.transforms.RandomSizedCrop(size, interpolation=2)
```
Crops the given PIL.Image at random, then resizes the crop to the given size.
```
```
py
class torchvision.transforms.Pad(padding, fill=0)
```
...
...
@@ -75,7 +75,7 @@ class torchvision.transforms.Pad(padding, fill=0)
* * *
```
```
py
class torchvision.transforms.Normalize(mean, std)
```
...
...
@@ -92,7 +92,7 @@ class torchvision.transforms.Normalize(mean, std)
* * *
```
```
py
class torchvision.transforms.ToTensor
```
...
...
@@ -104,7 +104,7 @@ class torchvision.transforms.ToTensor
2. Returns: the converted image.
3. Return type: Tensor
```
```
py
class torchvision.transforms.ToPILImage
```
...
...
@@ -120,7 +120,7 @@ class torchvision.transforms.ToPILImage
* * *
```
```
py
class torchvision.transforms.Lambda(lambd)
```
...
...
docs/0.4/37.md
浏览文件 @
5df01738
...
...
@@ -2,7 +2,7 @@
# torchvision.utils
```
```
py
torchvision.utils.make_grid(tensor, nrow=8, padding=2, normalize=False, range=None, scale_each=False, pad_value=0)
```
...
...
@@ -21,7 +21,7 @@ torchvision.utils.make_grid(tensor, nrow=8, padding=2, normalize=False, range=No
See the example below:
```
```
py
torchvision.utils.save_image(tensor, filename, nrow=8, padding=2, normalize=False, range=None, scale_each=False, pad_value=0)
```
...
...
docs/0.4/4.md
浏览文件 @
5df01738
...
...
@@ -19,7 +19,7 @@
Below you can find the code for the `Linear` function from `torch.nn`, with additional comments:
```
```
py
# Inherit from Function
class Linear(Function):
...
...
@@ -57,13 +57,13 @@ class Linear(Function):
Now, to make it easier to use these custom operations, we recommend aliasing their `apply` method:
```
```
py
linear = LinearFunction.apply
```
Here is an additional example of a function that is parametrized by non-Variable arguments:
```
```
py
class MulConstant(Function):
    @staticmethod
    def forward(ctx, tensor, constant):
...
...
@@ -81,7 +81,7 @@ class MulConstant(Function):
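You probably want to check whether the `backward` method you just implemented actually computes the gradients correctly. You can do so by comparing against numerical approximations obtained with small finite differences (`Finite Difference`).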
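The idea behind such a check can be shown on a scalar function without PyTorch. The sketch below compares a hand-written derivative against a central finite difference; it illustrates the principle only and is not how `gradcheck` is implemented. The names `numerical_grad`, `cube` and `cube_backward` are hypothetical.

```python
def numerical_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx at x (hypothetical helper)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def cube(x):
    return x ** 3


def cube_backward(x):
    # Hand-written analytic derivative that we want to verify.
    return 3.0 * x ** 2


x = 1.5
error = abs(numerical_grad(cube, x) - cube_backward(x))
gradient_ok = error < 1e-6
```

If the analytic derivative were wrong, for example `2.0 * x ** 2`, the error would be of order one and the check would fail.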
```
```
py
from torch.autograd import gradcheck
# gradcheck takes a tuple of tensors as input, check if your gradient
...
...
@@ -107,7 +107,7 @@ print(test)
This is how a `Linear` module can be implemented:
```
```
py
class Linear(nn.Module):
    def __init__(self, input_features, output_features, bias=True):
        super(Linear, self).__init__()
...
...
docs/0.4/5.md
浏览文件 @
5df01738
...
...
@@ -10,7 +10,7 @@
Sometimes it can be non-obvious when differentiable variables occur. Consider the following training loop (abridged from the [source](https://discuss.pytorch.org/t/high-memory-usage-while-training/162)):
```
```
py
total_loss = 0
for i in range(10000):
    optimizer.zero_grad()
...
...
@@ -29,7 +29,7 @@ for i in range(10000):
The scopes of local variables can be larger than you expect. For example:
```
```
py
for i in range(5):
    intermediate = f(input[i])
    result += g(intermediate)
...
...
@@ -61,7 +61,7 @@ PyTorch 使用缓存内存分配器来加速内存分配。因此,`nvidia-smi`
sequences. For example, you can write:
```
```
py
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class MyModule(nn.Module):
...
...
docs/0.4/6.md
浏览文件 @
5df01738
...
...
@@ -53,7 +53,7 @@
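A concrete Hogwild implementation can be found in the [examples repository](https://github.com/pytorch/examples/tree/master/mnist_hogwild), but to showcase the overall structure of the code, there is also a minimal example below: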
```
```
py
import torch.multiprocessing as mp
from model import MyModel
...
...
docs/0.4/7.md
浏览文件 @
5df01738
...
...
@@ -10,26 +10,26 @@
The first (recommended) approach saves and loads only the model parameters:
```
```
py
torch.save(the_model.state_dict(), PATH)
```
Then later:
```
```
py
the_model = TheModelClass(*args, **kwargs)
the_model.load_state_dict(torch.load(PATH))
```
The second approach saves and loads the entire model:
```
```
py
torch.save(the_model, PATH)
```
Then later:
```
```
py
the_model = torch.load(PATH)
```
...
...
docs/0.4/8.md
浏览文件 @
5df01738
This diff is collapsed.