Commit b0a9b468, authored by 绝不原创的飞龙

2024-02-04 16:18:17

Parent aed98b51
......@@ -17,9 +17,9 @@ PyTorch has two [primitives for working with data](https://pytorch.org/docs/st
```py
import torch
from torch import nn
from torch.utils.data import [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import [ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")
from torchvision.transforms import ToTensor
```
PyTorch offers domain-specific libraries such as [TorchText](https://pytorch.org/text/stable/index.html), [TorchVision](https://pytorch.org/vision/stable/index.html), and [TorchAudio](https://pytorch.org/audio/stable/index.html), all of which include datasets. In this tutorial, we will use a TorchVision dataset.
......@@ -28,19 +28,19 @@ PyTorch offers domain-specific libraries such as [TorchText](https://pytorch.org/text/st
```py
# Download training data from open datasets.
[training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")(
training_data = datasets.FashionMNIST(
root="data",
train=True,
download=True,
transform=[ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
transform=ToTensor(),
)
# Download test data from open datasets.
[test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")(
test_data = datasets.FashionMNIST(
root="data",
train=False,
download=True,
transform=[ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
transform=ToTensor(),
)
```
......@@ -94,11 +94,11 @@ Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/
batch_size = 64
# Create data loaders.
[train_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=batch_size)
[test_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=batch_size)
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)
for [X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y in [test_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"):
print(f"Shape of X [N, C, H, W]: {[X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").shape}")
for X, y in test_dataloader:
print(f"Shape of X [N, C, H, W]: {X.shape}")
print(f"Shape of y: {y.shape} {y.dtype}")
break
```
......@@ -120,32 +120,32 @@ Shape of y: torch.Size([64]) torch.int64
# Get cpu, gpu or mps device for training.
device = (
"cuda"
if [torch.cuda.is_available](https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available "torch.cuda.is_available")()
if torch.cuda.is_available()
else "mps"
if [torch.backends.mps.is_available](https://pytorch.org/docs/stable/backends.html#torch.backends.mps.is_available "torch.backends.mps.is_available")()
if torch.backends.mps.is_available()
else "cpu"
)
print(f"Using {device} device")
# Define model
class NeuralNetwork([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class NeuralNetwork(nn.Module):
def __init__(self):
super().__init__()
self.flatten = [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten")()
self.linear_relu_stack = [nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential")(
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(28*28, 512),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(512, 512),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(512, 10)
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(28*28, 512),
nn.ReLU(),
nn.Linear(512, 512),
nn.ReLU(),
nn.Linear(512, 10)
)
def forward(self, [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")):
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.flatten([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
logits = self.linear_relu_stack([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
def forward(self, x):
x = self.flatten(x)
logits = self.linear_relu_stack(x)
return logits
model = [NeuralNetwork](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")().to(device)
model = NeuralNetwork().to(device)
print(model)
```
......@@ -172,47 +172,47 @@ NeuralNetwork(
To train a model, we need a [loss function](https://pytorch.org/docs/stable/nn.html#loss-functions) and an [optimizer](https://pytorch.org/docs/stable/optim.html).
```py
[loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss") = [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")()
[optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD") = [torch.optim.SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")([model.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
```
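In a single training loop, the model makes predictions on the training dataset (fed to it in batches) and backpropagates the prediction error to adjust the model's parameters.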
```py
def train(dataloader, model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss"), [optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")):
def train(dataloader, model, loss_fn, optimizer):
size = len(dataloader.dataset)
[model.train](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train "torch.nn.Module.train")()
for batch, ([X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y) in enumerate(dataloader):
[X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y = [X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").to(device), y.to(device)
model.train()
for batch, (X, y) in enumerate(dataloader):
X, y = X.to(device), y.to(device)
# Compute prediction error
[pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = model([X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
loss = [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")([pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y)
pred = model(X)
loss = loss_fn(pred, y)
# Backpropagation
loss.backward()
[optimizer.step](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.step "torch.optim.SGD.step")()
[optimizer.zero_grad](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.zero_grad "torch.optim.SGD.zero_grad")()
optimizer.step()
optimizer.zero_grad()
if batch % 100 == 0:
loss, current = loss.item(), (batch + 1) * len([X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
loss, current = loss.item(), (batch + 1) * len(X)
print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
```
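As a framework-free illustration of what each batch iteration does (predict, measure the error, compute gradients, update the parameters), here is a minimal sketch in plain Python. It fits a two-parameter linear model rather than a neural network; the data, the learning rate, and the names `batches` and `sgd_fit` are assumptions made for illustration and are not part of the tutorial.

```python
def batches(xs, ys, batch_size):
    """Yield consecutive (x, y) mini-batches."""
    for start in range(0, len(xs), batch_size):
        yield xs[start:start + batch_size], ys[start:start + batch_size]


def sgd_fit(xs, ys, lr=0.05, epochs=500, batch_size=2):
    """Fit y ~ w * x + b with mini-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for bx, by in batches(xs, ys, batch_size):
            n = len(bx)
            # Forward pass: prediction error for each sample in the batch.
            errors = [(w * x + b) - y for x, y in zip(bx, by)]
            # Backward pass: gradients of the mean squared error.
            grad_w = sum(2 * e * x for e, x in zip(errors, bx)) / n
            grad_b = sum(2 * e for e in errors) / n
            # Optimizer step: move the parameters against the gradient.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1
w, b = sgd_fit(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

In the PyTorch version, `loss.backward()` computes the gradients automatically and `optimizer.step()` applies the update, so the manual gradient formulas above are not needed.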
We also check the model's performance against the test dataset to ensure it is learning.
```py
def test(dataloader, model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")):
def test(dataloader, model, loss_fn):
size = len(dataloader.dataset)
num_batches = len(dataloader)
[model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval "torch.nn.Module.eval")()
model.eval()
test_loss, correct = 0, 0
with [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html#torch.no_grad "torch.no_grad")():
for [X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y in dataloader:
[X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y = [X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").to(device), y.to(device)
[pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = model([X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
test_loss += [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")([pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y).item()
correct += ([pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").argmax(1) == y).type([torch.float](https://pytorch.org/docs/stable/tensor_attributes.html#torch.dtype "torch.dtype")).sum().item()
with torch.no_grad():
for X, y in dataloader:
X, y = X.to(device), y.to(device)
pred = model(X)
test_loss += loss_fn(pred, y).item()
correct += (pred.argmax(1) == y).type(torch.float).sum().item()
test_loss /= num_batches
correct /= size
print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
......@@ -224,8 +224,8 @@ def test(dataloader, model, [loss_fn](https://pytorch.org/docs/stable/generated/
epochs = 5
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
train([train_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss"), [optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD"))
test([test_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss"))
train(train_dataloader, model, loss_fn, optimizer)
test(test_dataloader, model, loss_fn)
print("Done!")
```
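The accuracy printed by `test` comes from comparing the highest-scoring class of each prediction with the true label. A minimal plain-Python sketch of that calculation follows; the function name `accuracy` and the scores are made up for illustration.

```python
def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class index equals the label."""
    correct = 0
    for row, label in zip(logits, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        correct += predicted == label
    return correct / len(labels)


logits = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3], [0.0, 0.1, 0.9], [0.4, 0.3, 0.2]]
labels = [1, 0, 2, 1]
print(f"Accuracy: {100 * accuracy(logits, labels):.1f}%")  # → Accuracy: 75.0%
```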
......@@ -317,7 +317,7 @@ Done!
A common way to save a model is to serialize the internal state dictionary (containing the model parameters).
```py
[torch.save](https://pytorch.org/docs/stable/generated/torch.save.html#torch.save "torch.save")([model.state_dict](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.state_dict "torch.nn.Module.state_dict")(), "model.pth")
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")
```
......@@ -330,8 +330,8 @@ Saved PyTorch Model State to model.pth
The process for loading a model includes re-creating the model structure and loading the state dictionary into it.
```py
model = [NeuralNetwork](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")().to(device)
[model.load_state_dict](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.load_state_dict "torch.nn.Module.load_state_dict")([torch.load](https://pytorch.org/docs/stable/generated/torch.load.html#torch.load "torch.load")("model.pth"))
model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))
```
```py
......@@ -354,12 +354,12 @@ classes = [
"Ankle boot",
]
[model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval "torch.nn.Module.eval")()
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), y = [test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")[0][0], [test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")[0][1]
with [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html#torch.no_grad "torch.no_grad")():
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").to(device)
[pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = model([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
predicted, actual = classes[[pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[0].argmax(0)], classes[y]
model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
x = x.to(device)
pred = model(x)
predicted, actual = classes[pred[0].argmax(0)], classes[y]
print(f'Predicted: "{predicted}", Actual: "{actual}"')
```
......
......@@ -27,7 +27,7 @@ import numpy as np
```py
data = [[1, 2],[3, 4]]
[x_data](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.tensor](https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor "torch.tensor")(data)
x_data = torch.tensor(data)
```
**From a NumPy array**
......@@ -36,7 +36,7 @@ data = [[1, 2],[3, 4]]
```py
np_array = np.array(data)
[x_np](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.from_numpy](https://pytorch.org/docs/stable/generated/torch.from_numpy.html#torch.from_numpy "torch.from_numpy")(np_array)
x_np = torch.from_numpy(np_array)
```
**From another tensor:**
......@@ -44,11 +44,11 @@ np_array = np.array(data)
The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.
```py
[x_ones](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.ones_like](https://pytorch.org/docs/stable/generated/torch.ones_like.html#torch.ones_like "torch.ones_like")([x_data](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")) # retains the properties of x_data
print(f"Ones Tensor: \n {[x_ones](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")} \n")
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")
[x_rand](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand_like](https://pytorch.org/docs/stable/generated/torch.rand_like.html#torch.rand_like "torch.rand_like")([x_data](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), dtype=[torch.float](https://pytorch.org/docs/stable/tensor_attributes.html#torch.dtype "torch.dtype")) # overrides the datatype of x_data
print(f"Random Tensor: \n {[x_rand](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")} \n")
x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")
```
```py
......@@ -67,13 +67,13 @@ Random Tensor:
```py
shape = (2,3,)
[rand_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(shape)
[ones_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.ones](https://pytorch.org/docs/stable/generated/torch.ones.html#torch.ones "torch.ones")(shape)
[zeros_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.zeros](https://pytorch.org/docs/stable/generated/torch.zeros.html#torch.zeros "torch.zeros")(shape)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
print(f"Random Tensor: \n {[rand_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")} \n")
print(f"Ones Tensor: \n {[ones_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")} \n")
print(f"Zeros Tensor: \n {[zeros_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")
```
```py
......@@ -97,11 +97,11 @@ Zeros Tensor:
Tensor attributes describe their shape, datatype, and the device on which they are stored.
```py
[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(3,4)
tensor = torch.rand(3,4)
print(f"Shape of tensor: {[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").shape}")
print(f"Datatype of tensor: {[tensor.dtype](https://pytorch.org/docs/stable/tensor_attributes.html#torch.dtype "torch.dtype")}")
print(f"Device tensor is stored on: {[tensor.device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device")}")
print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
```
```py
......@@ -122,8 +122,8 @@ Device tensor is stored on: cpu
```py
# We move our tensor to the GPU if available
if [torch.cuda.is_available](https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available "torch.cuda.is_available")():
[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").to("cuda")
if torch.cuda.is_available():
tensor = tensor.to("cuda")
```
Try out some of the operations from the list. If you are familiar with the NumPy API, you will find the Tensor API easy to use.
......@@ -131,12 +131,12 @@ if [torch.cuda.is_available](https://pytorch.org/docs/stable/generated/torch.cud
**Standard NumPy-like indexing and slicing:**
```py
[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.ones](https://pytorch.org/docs/stable/generated/torch.ones.html#torch.ones "torch.ones")(4, 4)
print(f"First row: {[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[0]}")
print(f"First column: {[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[:, 0]}")
print(f"Last column: {[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[..., -1]}")
[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[:,1] = 0
print([tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
tensor = torch.ones(4, 4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)
```
```py
......@@ -152,8 +152,8 @@ tensor([[1., 0., 1., 1.],
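**Joining tensors** You can use `torch.cat` to concatenate a sequence of tensors along a given dimension. See also [torch.stack](https://pytorch.org/docs/stable/generated/torch.stack.html), another tensor joining operator that is subtly different from `torch.cat`.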
```py
[t1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.cat](https://pytorch.org/docs/stable/generated/torch.cat.html#torch.cat "torch.cat")([[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")], dim=1)
print([t1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)
```
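The difference between concatenating and stacking is that concatenation joins along an existing dimension, while stacking creates a new one. A minimal sketch of that idea, using nested Python lists as stand-ins for tensors (the `shape` helper is an illustrative assumption, not a PyTorch function):

```python
def shape(nested):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# Concatenate along the existing dimension 1: rows get longer.
cat_dim1 = [row_a + row_b for row_a, row_b in zip(a, b)]
# Stack along a new leading dimension: the inputs become entries of a new axis.
stacked = [a, b]

print(shape(cat_dim1))  # → (2, 4)
print(shape(stacked))   # → (2, 2, 2)
```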
```py
......@@ -168,18 +168,18 @@ tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
```py
# This computes the matrix multiplication between two tensors. y1, y2, y3 will have the same value
# ``tensor.T`` returns the transpose of a tensor
[y1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") @ [tensor.T](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
[y2](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").matmul([tensor.T](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)
[y3](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand_like](https://pytorch.org/docs/stable/generated/torch.rand_like.html#torch.rand_like "torch.rand_like")([y1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul "torch.matmul")([tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [tensor.T](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), out=[y3](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
y3 = torch.rand_like(y1)
torch.matmul(tensor, tensor.T, out=y3)
# This computes the element-wise product. z1, z2, z3 will have the same value
[z1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") * [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
[z2](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").mul([tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
z1 = tensor * tensor
z2 = tensor.mul(tensor)
[z3](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand_like](https://pytorch.org/docs/stable/generated/torch.rand_like.html#torch.rand_like "torch.rand_like")([tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[torch.mul](https://pytorch.org/docs/stable/generated/torch.mul.html#torch.mul "torch.mul")([tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), out=[z3](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)
```
```py
......@@ -192,8 +192,8 @@ tensor([[1., 0., 1., 1.],
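**Single-element tensors** If you have a one-element tensor, for example by aggregating all values of a tensor into one value, you can convert it to a Python numerical value using `item()`: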
```py
[agg](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").sum()
agg_item = [agg](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").item()
agg = tensor.sum()
agg_item = agg.item()
print(agg_item, type(agg_item))
```
......@@ -204,9 +204,9 @@ print(agg_item, type(agg_item))
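**In-place operations** Operations that store the result into the operand are called in-place operations. They are denoted by a `_` suffix. For example, `x.copy_(y)` and `x.t_()` will change `x`.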
```py
print(f"{[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")} \n")
[tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").add_(5)
print([tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print(f"{tensor} \n")
tensor.add_(5)
print(tensor)
```
```py
......@@ -234,9 +234,9 @@ Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing
### Tensor to NumPy array
```py
[t](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.ones](https://pytorch.org/docs/stable/generated/torch.ones.html#torch.ones "torch.ones")(5)
print(f"t: {[t](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
n = [t](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").numpy()
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")
```
......@@ -248,8 +248,8 @@ n: [1\. 1\. 1\. 1\. 1.]
A change in the tensor is reflected in the NumPy array.
```py
[t](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").add_(1)
print(f"t: {[t](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")
```
......@@ -262,14 +262,14 @@ n: [2\. 2\. 2\. 2\. 2.]
```py
n = np.ones(5)
[t](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.from_numpy](https://pytorch.org/docs/stable/generated/torch.from_numpy.html#torch.from_numpy "torch.from_numpy")(n)
t = torch.from_numpy(n)
```
Changes in the NumPy array are reflected in the tensor.
```py
np.add(n, 1, out=n)
print(f"t: {[t](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
print(f"t: {t}")
print(f"n: {n}")
```
......
......@@ -28,23 +28,23 @@ PyTorch domain libraries provide a number of pre-loaded datasets (such as FashionMNIST) that
```py
import torch
from torch.utils.data import [Dataset](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset "torch.utils.data.Dataset")
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import [ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt
[training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")(
training_data = datasets.FashionMNIST(
root="data",
train=True,
download=True,
transform=[ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")()
transform=ToTensor()
)
[test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")(
test_data = datasets.FashionMNIST(
root="data",
train=False,
download=True,
transform=[ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")()
transform=ToTensor()
)
```
......@@ -112,12 +112,12 @@ labels_map = {
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
for i in range(1, cols * rows + 1):
sample_idx = [torch.randint](https://pytorch.org/docs/stable/generated/torch.randint.html#torch.randint "torch.randint")(len([training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")), size=(1,)).item()
[img](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")[sample_idx]
sample_idx = torch.randint(len(training_data), size=(1,)).item()
img, label = training_data[sample_idx]
figure.add_subplot(rows, cols, i)
plt.title(labels_map[[label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")])
plt.title(labels_map[label])
plt.axis("off")
plt.imshow([img](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").squeeze(), cmap="gray")
plt.imshow(img.squeeze(), cmap="gray")
plt.show()
```
......@@ -134,9 +134,9 @@ plt.show()
```py
import os
import pandas as pd
from torchvision.io import [read_image](https://pytorch.org/vision/stable/generated/torchvision.io.read_image.html#torchvision.io.read_image "torchvision.io.read_image")
from torchvision.io import read_image
class CustomImageDataset([Dataset](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset "torch.utils.data.Dataset")):
class CustomImageDataset(Dataset):
def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
self.img_labels = pd.read_csv(annotations_file)
self.img_dir = img_dir
......@@ -148,13 +148,13 @@ class CustomImageDataset([Dataset](https://pytorch.org/docs/stable/data.html#tor
def __getitem__(self, idx):
img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
image = [read_image](https://pytorch.org/vision/stable/generated/torchvision.io.read_image.html#torchvision.io.read_image "torchvision.io.read_image")(img_path)
[label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.img_labels.iloc[idx, 1]
image = read_image(img_path)
label = self.img_labels.iloc[idx, 1]
if self.transform:
image = self.transform(image)
if self.target_transform:
[label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.target_transform([label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
return image, [label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
label = self.target_transform(label)
return image, label
```
### `__init__`
......@@ -196,13 +196,13 @@ def __len__(self):
```py
def __getitem__(self, idx):
img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
image = [read_image](https://pytorch.org/vision/stable/generated/torchvision.io.read_image.html#torchvision.io.read_image "torchvision.io.read_image")(img_path)
[label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.img_labels.iloc[idx, 1]
image = read_image(img_path)
label = self.img_labels.iloc[idx, 1]
if self.transform:
image = self.transform(image)
if self.target_transform:
[label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.target_transform([label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
return image, [label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
label = self.target_transform(label)
return image, label
```
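The contract of a map-style dataset comes down to two methods, `__len__` and `__getitem__`, plus optional transforms applied on access. A minimal sketch without PyTorch or image files follows; `InMemoryDataset` and its sample data are hypothetical stand-ins for `CustomImageDataset`.

```python
class InMemoryDataset:
    """Map-style dataset over in-memory (sample, label) pairs."""

    def __init__(self, samples, labels, transform=None, target_transform=None):
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = samples
        self.labels = labels
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.samples)

    def __getitem__(self, idx):
        # Look up one sample, then apply the optional transforms.
        sample, label = self.samples[idx], self.labels[idx]
        if self.transform:
            sample = self.transform(sample)
        if self.target_transform:
            label = self.target_transform(label)
        return sample, label


dataset = InMemoryDataset(
    samples=[[0, 255], [128, 64]],
    labels=["boot", "shirt"],
    transform=lambda pixels: [p / 255 for p in pixels],  # scale to [0, 1]
)
print(len(dataset), dataset[0])
```

Any object that implements these two methods can be indexed sample by sample, which is what a loader relies on when it assembles batches.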
* * *
......@@ -214,10 +214,10 @@ def __getitem__(self, idx):
`DataLoader` is an iterable that abstracts this complexity for us in an easy-to-use API.
```py
from torch.utils.data import [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")
from torch.utils.data import DataLoader
[train_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=64, shuffle=True)
[test_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=64, shuffle=True)
train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)
```
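To make the batching and shuffling concrete, here is a minimal, framework-free sketch of the core idea. It is not how `DataLoader` is implemented; `iterate_minibatches` and `toy_dataset` are hypothetical names introduced only for illustration, and features such as multiprocessing and tensor collation are left out.

```python
import random


def iterate_minibatches(dataset, batch_size, shuffle=True, seed=None):
    """Yield lists of samples, at most batch_size at a time (hypothetical helper)."""
    indices = list(range(len(dataset)))
    if shuffle:
        # Randomize the sample order before batching.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]


# Toy (feature, label) pairs standing in for FashionMNIST samples.
toy_dataset = [(f"image_{i}", i % 10) for i in range(150)]

batches = list(iterate_minibatches(toy_dataset, batch_size=64, seed=0))
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)  # the last batch is smaller because 150 is not a multiple of 64
```

Every sample appears exactly once per pass, and only the final batch may be short.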
## Iterating through the DataLoader
......@@ -226,14 +226,14 @@ from torch.utils.data import [DataLoader](https://pytorch.org/docs/stable/data.h
```py
# Display image and label.
[train_features](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [train_labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = next(iter([train_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")))
print(f"Feature batch shape: {[train_features](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").size()}")
print(f"Labels batch shape: {[train_labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").size()}")
[img](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [train_features](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[0].squeeze()
[label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [train_labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[0]
plt.imshow([img](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), cmap="gray")
train_features, train_labels = next(iter(train_dataloader))
print(f"Feature batch shape: {train_features.size()}")
print(f"Labels batch shape: {train_labels.size()}")
img = train_features[0].squeeze()
label = train_labels[0]
plt.imshow(img, cmap="gray")
plt.show()
print(f"Label: {[label](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
print(f"Label: {label}")
```
![Data tutorial](img/984f7e1474d00727ca26fcbc11a91b69.png)
......
......@@ -17,14 +17,14 @@ FashionMNIST 的特征以 PIL 图像格式呈现,标签为整数。对于训
```py
import torch
from torchvision import datasets
from torchvision.transforms import [ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor"), [Lambda](https://pytorch.org/vision/stable/generated/torchvision.transforms.Lambda.html#torchvision.transforms.Lambda "torchvision.transforms.Lambda")
from torchvision.transforms import ToTensor, Lambda
[ds](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")(
ds = datasets.FashionMNIST(
root="data",
train=True,
download=True,
transform=[ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
[target_transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Lambda.html#torchvision.transforms.Lambda "torchvision.transforms.Lambda")=[Lambda](https://pytorch.org/vision/stable/generated/torchvision.transforms.Lambda.html#torchvision.transforms.Lambda "torchvision.transforms.Lambda")(lambda y: [torch.zeros](https://pytorch.org/docs/stable/generated/torch.zeros.html#torch.zeros "torch.zeros")(10, dtype=[torch.float](https://pytorch.org/docs/stable/tensor_attributes.html#torch.dtype "torch.dtype")).scatter_(0, [torch.tensor](https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor "torch.tensor")(y), value=1))
transform=ToTensor(),
target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)
```
......@@ -80,8 +80,8 @@ Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/
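The Lambda transform applies any user-defined lambda function. Here, we define a function that turns an integer label into a one-hot encoded tensor. It first creates a zero tensor of size 10 (the number of labels in the dataset) and then calls [scatter_](https://pytorch.org/docs/stable/generated/torch.Tensor.scatter_.html), which assigns `value=1` at the index given by the label `y`.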
```py
[target_transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Lambda.html#torchvision.transforms.Lambda "torchvision.transforms.Lambda") = [Lambda](https://pytorch.org/vision/stable/generated/torchvision.transforms.Lambda.html#torchvision.transforms.Lambda "torchvision.transforms.Lambda")(lambda y: [torch.zeros](https://pytorch.org/docs/stable/generated/torch.zeros.html#torch.zeros "torch.zeros")(
10, dtype=[torch.float](https://pytorch.org/docs/stable/tensor_attributes.html#torch.dtype "torch.dtype")).scatter_(dim=0, index=[torch.tensor](https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor "torch.tensor")(y), value=1))
target_transform = Lambda(lambda y: torch.zeros(
10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1))
```
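The effect of this transform can be sketched without PyTorch. The snippet below uses plain Python lists instead of tensors, and `one_hot` is a hypothetical helper name, so it only illustrates the result of the zero-then-scatter pattern, not the tensor API itself.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1.0 at position `label`."""
    encoded = [0.0] * num_classes   # plays the role of torch.zeros(10)
    encoded[label] = 1.0            # plays the role of scatter_(..., value=1)
    return encoded


encoded_label = one_hot(3)
print(encoded_label)
```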
* * *
......
......@@ -27,9 +27,9 @@ from torchvision import datasets, transforms
```py
device = (
"cuda"
if [torch.cuda.is_available](https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available "torch.cuda.is_available")()
if torch.cuda.is_available()
else "mps"
if [torch.backends.mps.is_available](https://pytorch.org/docs/stable/backends.html#torch.backends.mps.is_available "torch.backends.mps.is_available")()
if torch.backends.mps.is_available()
else "cpu"
)
print(f"Using {device} device")
......@@ -44,28 +44,28 @@ Using cuda device
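We define our neural network by subclassing `nn.Module` and initialize the network's layers in `__init__`. Every `nn.Module` subclass implements the operations on input data in its `forward` method.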
```py
class NeuralNetwork([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class NeuralNetwork(nn.Module):
def __init__(self):
super().__init__()
self.[flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten") = [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten")()
self.linear_relu_stack = [nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential")(
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(28*28, 512),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(512, 512),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(512, 10),
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(28*28, 512),
nn.ReLU(),
nn.Linear(512, 512),
nn.ReLU(),
nn.Linear(512, 10),
)
def forward(self, x):
x = self.[flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten")(x)
[logits](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.linear_relu_stack(x)
return [logits](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
x = self.flatten(x)
logits = self.linear_relu_stack(x)
return logits
```
We create an instance of `NeuralNetwork`, move it to `device`, and print its structure.
```py
model = [NeuralNetwork](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")().to(device)
model = NeuralNetwork().to(device)
print(model)
```
......@@ -87,11 +87,11 @@ NeuralNetwork(
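Calling the model on the input returns a 2-dimensional tensor in which dim=0 indexes the samples in the batch, each with 10 raw predicted values (one per class), and dim=1 indexes the individual values of each output. We get the prediction probabilities by passing it through an instance of the `nn.Softmax` module.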
```py
[X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(1, 28, 28, device=device)
[logits](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = model([X](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[pred_probab](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax "torch.nn.Softmax")(dim=1)([logits](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[y_pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [pred_probab](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").argmax(1)
print(f"Predicted class: {[y_pred](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
```
```py
......@@ -105,8 +105,8 @@ Predicted class: tensor([7], device='cuda:0')
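Let's break down the layers in the FashionMNIST model. To illustrate, we take a sample minibatch of 3 images of size 28x28 and see what happens to it as we pass it through the network.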
```py
[input_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(3,28,28)
print([input_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").size())
input_image = torch.rand(3,28,28)
print(input_image.size())
```
```py
......@@ -118,9 +118,9 @@ torch.Size([3, 28, 28])
We initialize the [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html) layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values (the minibatch dimension, at dim=0, is preserved).
```py
[flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten") = [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten")()
[flat_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten")([input_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print([flat_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").size())
flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())
```
```py
......@@ -132,9 +132,9 @@ torch.Size([3, 784])
The [linear layer](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html) is a module that applies a linear transformation to the input using its stored weights and biases.
```py
[layer1](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear") = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(in_features=28*28, out_features=20)
[hidden1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [layer1](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")([flat_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print([hidden1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").size())
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())
```
```py
......@@ -148,9 +148,9 @@ torch.Size([3, 20])
In this model, we use [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html) between our linear layers, but other activation functions can also introduce non-linearity into the model.
```py
print(f"Before ReLU: {[hidden1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}\n\n")
[hidden1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")()([hidden1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print(f"After ReLU: {[hidden1](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")
```
```py
......@@ -180,14 +180,14 @@ After ReLU: tensor([[0.4158, 0.0000, 0.0000, 0.3960, 0.1476, 0.0000, 0.0000, 0.2
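[nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html) is an ordered container of modules. The data is passed through all the modules in the order in which they are defined. You can use sequential containers to put together a quick network such as `seq_modules`.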
```py
[seq_modules](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential") = [nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential")(
[flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten"),
[layer1](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear"),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(20, 10)
seq_modules = nn.Sequential(
flatten,
layer1,
nn.ReLU(),
nn.Linear(20, 10)
)
[input_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(3,28,28)
[logits](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [seq_modules](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential")([input_image](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
input_image = torch.rand(3,28,28)
logits = seq_modules(input_image)
```
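The "ordered container" idea amounts to applying each step to the output of the previous one. The following is a rough, framework-free sketch of that behavior; `TinySequential` and the three toy steps are hypothetical names for illustration and carry no learnable parameters.

```python
class TinySequential:
    """Apply a fixed sequence of callables in the order they were given."""

    def __init__(self, *steps):
        self.steps = steps

    def __call__(self, x):
        for step in self.steps:
            x = step(x)  # the output of one step is the input of the next
        return x


def flatten(image):
    # Turn a 2D list of rows into one flat list of values.
    return [pixel for row in image for pixel in row]


def relu(values):
    # Replace negative values with zero.
    return [max(0.0, v) for v in values]


def total(values):
    # A stand-in for a final layer: reduce to a single output value.
    return [sum(values)]


pipeline = TinySequential(flatten, relu, total)
result = pipeline([[1.0, -2.0], [-3.0, 4.0]])
print(result)
```

Changing the order of the steps changes the result, which is why the definition order matters.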
### nn.Softmax
......@@ -195,8 +195,8 @@ After ReLU: tensor([[0.4158, 0.0000, 0.0000, 0.3960, 0.1476, 0.0000, 0.0000, 0.2
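The last linear layer of the neural network returns logits, raw values in \([-\infty, \infty]\), which are passed to the [nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html) module. The logits are scaled to values in [0, 1] that represent the model's predicted probability for each class. The `dim` parameter indicates the dimension along which the values must sum to 1.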
```py
[softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax "torch.nn.Softmax") = [nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax "torch.nn.Softmax")(dim=1)
[pred_probab](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax "torch.nn.Softmax")([logits](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
```
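As a sanity check on what softmax computes, here is a minimal sketch for a single row of logits using only the standard library. The function name `softmax` and the sample logits are illustrative assumptions; subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Map a list of raw scores to probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))
```

The largest logit receives the largest probability, so taking the argmax of the probabilities gives the same class as taking the argmax of the logits.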
## Model Parameters
......@@ -208,8 +208,8 @@ After ReLU: tensor([[0.4158, 0.0000, 0.0000, 0.3960, 0.1476, 0.0000, 0.0000, 0.2
```py
print(f"Model structure: {model}\n\n")
for name, [param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter") in [model.named_parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.named_parameters "torch.nn.Module.named_parameters")():
print(f"Layer: {name} | Size: {[param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter").size()} | Values : {[param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter")[:2]} \n")
for name, param in model.named_parameters():
print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")
```
```py
......
......@@ -17,12 +17,12 @@
```py
import torch
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.ones](https://pytorch.org/docs/stable/generated/torch.ones.html#torch.ones "torch.ones")(5) # input tensor
[y](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.zeros](https://pytorch.org/docs/stable/generated/torch.zeros.html#torch.zeros "torch.zeros")(3) # expected output
[w](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.randn](https://pytorch.org/docs/stable/generated/torch.randn.html#torch.randn "torch.randn")(5, 3, requires_grad=True)
[b](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.randn](https://pytorch.org/docs/stable/generated/torch.randn.html#torch.randn "torch.randn")(3, requires_grad=True)
[z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul "torch.matmul")([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [w](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))+[b](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
[loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.nn.functional.binary_cross_entropy_with_logits](https://pytorch.org/docs/stable/generated/torch.nn.functional.binary_cross_entropy_with_logits.html#torch.nn.functional.binary_cross_entropy_with_logits "torch.nn.functional.binary_cross_entropy_with_logits")([z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [y](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
x = torch.ones(5) # input tensor
y = torch.zeros(3) # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
```
## Tensors, Functions and Computational Graph
......@@ -40,8 +40,8 @@ import torch
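The functions that we apply to tensors to construct the computational graph are in fact objects of class `Function`. Such an object knows how to compute the function in the *forward* direction, and also how to compute its derivative during the *backward propagation* step. A reference to the backward propagation function is stored in the `grad_fn` property of a tensor. You can find more information about `Function` [in the documentation](https://pytorch.org/docs/stable/autograd.html#function).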
```py
print(f"Gradient function for z = {[z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").grad_fn}")
print(f"Gradient function for loss = {[loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").grad_fn}")
print(f"Gradient function for z = {z.grad_fn}")
print(f"Gradient function for loss = {loss.grad_fn}")
```
```py
......@@ -54,9 +54,9 @@ Gradient function for loss = <BinaryCrossEntropyWithLogitsBackward0 object at 0x
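To optimize the weights of the parameters in the neural network, we need to compute the derivatives of the loss function with respect to the parameters; that is, we need \(\frac{\partial loss}{\partial w}\) and \(\frac{\partial loss}{\partial b}\) under some fixed values of `x` and `y`. To compute these derivatives, we call `loss.backward()` and then retrieve the values from `w.grad` and `b.grad`: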
```py
[loss.backward](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html#torch.Tensor.backward "torch.Tensor.backward")()
print([w.grad](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print([b.grad](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
loss.backward()
print(w.grad)
print(b.grad)
```
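For this particular loss the gradient can also be worked out by hand, which gives a way to sanity-check what `backward()` returns. The sketch below is framework-free and rests on stated assumptions: the loss is the mean binary cross-entropy with logits, so \(\frac{\partial loss}{\partial b_j} = (\sigma(z_j) - y_j)/n\), and the values of `w` and `b` are fixed illustrative numbers rather than the random ones used above. It compares the analytic gradient against a central finite difference.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def bce_with_logits(z, y):
    # Mean over elements of the numerically stable form:
    # max(z, 0) - z * y + log(1 + exp(-|z|))
    terms = [max(v, 0.0) - v * t + math.log1p(math.exp(-abs(v))) for v, t in zip(z, y)]
    return sum(terms) / len(terms)


def forward(x, w, b):
    # z_j = sum_i x_i * w_ij + b_j
    return [sum(x[i] * w[i][j] for i in range(len(x))) + b[j] for j in range(len(b))]


x = [1.0] * 5                                               # input, as in the example above
y = [0.0] * 3                                               # expected output
w = [[0.1 * (i - j) for j in range(3)] for i in range(5)]   # illustrative fixed weights
b = [0.5, -0.25, 0.0]                                       # illustrative fixed bias

z = forward(x, w, b)
analytic_grad_b = [(sigmoid(v) - t) / len(z) for v, t in zip(z, y)]

eps = 1e-6
numeric_grad_b = []
for j in range(len(b)):
    b_plus = list(b)
    b_plus[j] += eps
    b_minus = list(b)
    b_minus[j] -= eps
    diff = bce_with_logits(forward(x, w, b_plus), y) - bce_with_logits(forward(x, w, b_minus), y)
    numeric_grad_b.append(diff / (2 * eps))

print(analytic_grad_b)
print(numeric_grad_b)
```

Under the same assumptions, \(\frac{\partial loss}{\partial w_{ij}} = x_i \cdot \frac{\partial loss}{\partial b_j}\), so with an all-ones `x` every row of the weight gradient equals the bias gradient.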
```py
......@@ -79,12 +79,12 @@ tensor([0.3313, 0.0626, 0.2530])
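By default, all tensors with `requires_grad=True` track their computational history and support gradient computation. However, there are cases when we do not need this, for example when we have trained the model and just want to apply it to some input data, i.e. we only want to do *forward* computations through the network. We can stop tracking computations by surrounding the computation code with a `torch.no_grad()` block: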
```py
[z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul "torch.matmul")([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [w](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))+[b](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
print([z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").requires_grad)
z = torch.matmul(x, w)+b
print(z.requires_grad)
with [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html#torch.no_grad "torch.no_grad")():
[z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul "torch.matmul")([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [w](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))+[b](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
print([z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").requires_grad)
with torch.no_grad():
z = torch.matmul(x, w)+b
print(z.requires_grad)
```
```py
......@@ -95,9 +95,9 @@ False
Another way to achieve the same result is to use the `detach()` method on the tensor:
```py
[z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul "torch.matmul")([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [w](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))+[b](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
[z_det](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [z](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").detach()
print([z_det](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").requires_grad)
z = torch.matmul(x, w)+b
z_det = z.detach()
print(z_det.requires_grad)
```
```py
......@@ -143,15 +143,15 @@ False
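Instead of computing the Jacobian matrix itself, PyTorch allows you to compute the **Jacobian Product** \(v^T\cdot J\) for a given input vector \(v=(v_1 \dots v_m)\). This is achieved by calling `backward` with \(v\) as an argument. The size of \(v\) should be the same as the size of the original tensor with respect to which we want to compute the product: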
```py
[inp](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.eye](https://pytorch.org/docs/stable/generated/torch.eye.html#torch.eye "torch.eye")(4, 5, requires_grad=True)
[out](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = ([inp](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")+1).pow(2).t()
[out.backward](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html#torch.Tensor.backward "torch.Tensor.backward")([torch.ones_like](https://pytorch.org/docs/stable/generated/torch.ones_like.html#torch.ones_like "torch.ones_like")([out](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")), retain_graph=True)
print(f"First call\n{[inp.grad](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
[out.backward](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html#torch.Tensor.backward "torch.Tensor.backward")([torch.ones_like](https://pytorch.org/docs/stable/generated/torch.ones_like.html#torch.ones_like "torch.ones_like")([out](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")), retain_graph=True)
print(f"\nSecond call\n{[inp.grad](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
[inp.grad](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").zero_()
[out.backward](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html#torch.Tensor.backward "torch.Tensor.backward")([torch.ones_like](https://pytorch.org/docs/stable/generated/torch.ones_like.html#torch.ones_like "torch.ones_like")([out](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")), retain_graph=True)
print(f"\nCall after zeroing gradients\n{[inp.grad](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")}")
inp = torch.eye(4, 5, requires_grad=True)
out = (inp+1).pow(2).t()
out.backward(torch.ones_like(out), retain_graph=True)
print(f"First call\n{inp.grad}")
out.backward(torch.ones_like(out), retain_graph=True)
print(f"\nSecond call\n{inp.grad}")
inp.grad.zero_()
out.backward(torch.ones_like(out), retain_graph=True)
print(f"\nCall after zeroing gradients\n{inp.grad}")
```
```py
......
......@@ -17,37 +17,37 @@
```py
import torch
from torch import nn
from torch.utils.data import [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import [ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")
from torchvision.transforms import ToTensor
[training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")(
training_data = datasets.FashionMNIST(
root="data",
train=True,
download=True,
transform=[ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")()
transform=ToTensor()
)
[test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")(
test_data = datasets.FashionMNIST(
root="data",
train=False,
download=True,
transform=[ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")()
transform=ToTensor()
)
[train_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([training_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=64)
[test_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([test_data](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=64)
train_dataloader = DataLoader(training_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
class NeuralNetwork([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class NeuralNetwork(nn.Module):
def __init__(self):
super().__init__()
self.flatten = [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten "torch.nn.Flatten")()
self.linear_relu_stack = [nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential")(
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(28*28, 512),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(512, 512),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(512, 10),
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(28*28, 512),
nn.ReLU(),
nn.Linear(512, 512),
nn.ReLU(),
nn.Linear(512, 10),
)
def forward(self, x):
......@@ -55,7 +55,7 @@ class NeuralNetwork([nn.Module](https://pytorch.org/docs/stable/generated/torch.
logits = self.linear_relu_stack(x)
return logits
model = [NeuralNetwork](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")()
model = NeuralNetwork()
```
```py
......@@ -141,7 +141,7 @@ epochs = 5
```py
# Initialize the loss function
[loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss") = [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")()
loss_fn = nn.CrossEntropyLoss()
```
### Optimizer
......@@ -151,7 +151,7 @@ epochs = 5
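We initialize the optimizer by registering the model's parameters that need to be trained and passing in the learning rate hyperparameter.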
```py
[optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD") = [torch.optim.SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")([model.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")(), lr=learning_rate)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
```
Inside the training loop, optimization happens in three steps:
......@@ -167,40 +167,40 @@ epochs = 5
We define `train_loop`, which loops over our optimization code, and `test_loop`, which evaluates the model's performance against our test data.
```py
def train_loop(dataloader, model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss"), [optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")):
def train_loop(dataloader, model, loss_fn, optimizer):
size = len(dataloader.dataset)
# Set the model to training mode - important for batch normalization and dropout layers
# Unnecessary in this situation but added for best practices
[model.train](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train "torch.nn.Module.train")()
model.train()
for batch, (X, y) in enumerate(dataloader):
# Compute prediction and loss
pred = model(X)
loss = [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")(pred, y)
loss = loss_fn(pred, y)
# Backpropagation
loss.backward()
[optimizer.step](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.step "torch.optim.SGD.step")()
[optimizer.zero_grad](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.zero_grad "torch.optim.SGD.zero_grad")()
optimizer.step()
optimizer.zero_grad()
if batch % 100 == 0:
loss, current = loss.item(), batch * batch_size + len(X)
print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
def test_loop(dataloader, model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")):
def test_loop(dataloader, model, loss_fn):
# Set the model to evaluation mode - important for batch normalization and dropout layers
# Unnecessary in this situation but added for best practices
[model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval "torch.nn.Module.eval")()
model.eval()
size = len(dataloader.dataset)
num_batches = len(dataloader)
test_loss, correct = 0, 0
# Evaluating the model with torch.no_grad() ensures that no gradients are computed during test mode
# also serves to reduce unnecessary gradient computations and memory usage for tensors with requires_grad=True
with [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html#torch.no_grad "torch.no_grad")():
with torch.no_grad():
for X, y in dataloader:
pred = model(X)
test_loss += [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")(pred, y).item()
correct += (pred.argmax(1) == y).type([torch.float](https://pytorch.org/docs/stable/tensor_attributes.html#torch.dtype "torch.dtype")).sum().item()
test_loss += loss_fn(pred, y).item()
correct += (pred.argmax(1) == y).type(torch.float).sum().item()
test_loss /= num_batches
correct /= size
......@@ -210,14 +210,14 @@ def test_loop(dataloader, model, [loss_fn](https://pytorch.org/docs/stable/gener
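We initialize the loss function and optimizer, and pass them to `train_loop` and `test_loop`. Feel free to increase the number of epochs to track how the model's performance improves.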
```py
[loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss") = [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")()
[optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD") = [torch.optim.SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")([model.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")(), lr=learning_rate)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
epochs = 10
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
train_loop([train_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss"), [optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD"))
test_loop([test_dataloader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), model, [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss"))
train_loop(train_dataloader, model, loss_fn, optimizer)
test_loop(test_dataloader, model, loss_fn)
print("Done!")
```
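The bookkeeping at the end of `test_loop` can be traced by hand. The following is a minimal plain-Python sketch, not part of the tutorial's code: the scores, labels, and per-batch losses are made-up values, and `argmax` is a hypothetical helper standing in for `pred.argmax(1)`.

```python
# Plain-Python sketch of the bookkeeping in test_loop (no PyTorch needed).
# All numbers below are invented for illustration, not real model output.

def argmax(row):
    """Index of the largest score, like pred.argmax(1) for a single row."""
    return max(range(len(row)), key=lambda i: row[i])

# Each hypothetical batch: (per-sample class scores, true labels, mean batch loss)
batches = [
    ([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1]], [0, 1], 0.40),
    ([[0.3, 0.2, 0.9], [1.1, 0.4, 0.2]], [2, 1], 0.80),
    ([[0.5, 2.2, 0.1]], [1], 0.30),  # the last batch may be smaller
]

size = sum(len(labels) for _, labels, _ in batches)  # plays the role of len(dataloader.dataset)
num_batches = len(batches)                           # plays the role of len(dataloader)

test_loss, correct = 0.0, 0
for scores, labels, batch_loss in batches:
    test_loss += batch_loss
    correct += sum(argmax(s) == y for s, y in zip(scores, labels))

test_loss /= num_batches   # average of the per-batch losses
accuracy = correct / size  # fraction of correctly classified samples

print(f"Accuracy: {100 * accuracy:.1f}%, Avg loss: {test_loss:.6f}")
```

Because the summed loss is divided by `num_batches`, a smaller final batch counts as much as a full one, whereas accuracy is divided by the number of samples.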
......
......@@ -20,8 +20,8 @@ import torchvision.models as models
PyTorch models store their learned parameters in an internal state dictionary called `state_dict`. These can be persisted with the `torch.save` method:
```py
model = [models.vgg16](https://pytorch.org/vision/stable/models/generated/torchvision.models.vgg16.html#torchvision.models.vgg16 "torchvision.models.vgg16")(weights='IMAGENET1K_V1')
[torch.save](https://pytorch.org/docs/stable/generated/torch.save.html#torch.save "torch.save")([model.state_dict](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.state_dict "torch.nn.Module.state_dict")(), 'model_weights.pth')
model = models.vgg16(weights='IMAGENET1K_V1')
torch.save(model.state_dict(), 'model_weights.pth')
```
```py
......@@ -71,9 +71,9 @@ Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /var/li
To load model weights, you first need to create an instance of the same model, and then load the parameters with the `load_state_dict()` method.
```py
model = [models.vgg16](https://pytorch.org/vision/stable/models/generated/torchvision.models.vgg16.html#torchvision.models.vgg16 "torchvision.models.vgg16")() # we do not specify ``weights``, i.e. create untrained model
[model.load_state_dict](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.load_state_dict "torch.nn.Module.load_state_dict")([torch.load](https://pytorch.org/docs/stable/generated/torch.load.html#torch.load "torch.load")('model_weights.pth'))
[model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval "torch.nn.Module.eval")()
model = models.vgg16() # we do not specify ``weights``, i.e. create untrained model
model.load_state_dict(torch.load('model_weights.pth'))
model.eval()
```
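The save-and-restore cycle can be tried without downloading VGG16. The sketch below assumes a tiny `nn.Linear` as a stand-in model and an in-memory `io.BytesIO` buffer in place of `model_weights.pth`; both are choices made for this illustration, not part of the tutorial.

```python
import io

import torch
from torch import nn

# A tiny stand-in model (assumption); the tutorial itself uses VGG16.
torch.manual_seed(0)
source = nn.Linear(4, 2)

# Save only the learned parameters. The buffer stands in for a file on disk.
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# Re-create the same architecture, then load the saved parameters into it.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(buffer))
restored.eval()  # switch to evaluation mode before inference

# Every tensor in the restored state_dict matches the one that was saved.
for name, tensor in source.state_dict().items():
    assert torch.equal(tensor, restored.state_dict()[name])
print(list(restored.state_dict().keys()))
```

The freshly constructed `restored` layer starts with different random weights; only after `load_state_dict()` does it match `source`.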
```py
......@@ -133,13 +133,13 @@ VGG(
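When loading model weights, we needed to instantiate the model class first, because the class defines the structure of the network. We may want to save the structure of this class together with the model, in which case we can pass `model` (rather than `model.state_dict()`) to the saving function: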
```py
[torch.save](https://pytorch.org/docs/stable/generated/torch.save.html#torch.save "torch.save")(model, 'model.pth')
torch.save(model, 'model.pth')
```
We can then load the model like this:
```py
model = [torch.load](https://pytorch.org/docs/stable/generated/torch.load.html#torch.load "torch.load")('model.pth')
model = torch.load('model.pth')
```
Note
......
......@@ -25,38 +25,38 @@
```py
import torch
class TinyModel([torch.nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class TinyModel(torch.nn.Module):
def __init__(self):
super([TinyModel](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module"), self).__init__()
super(TinyModel, self).__init__()
self.linear1 = [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(100, 200)
self.activation = [torch.nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")()
self.linear2 = [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(200, 10)
self.softmax = [torch.nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax "torch.nn.Softmax")()
self.linear1 = torch.nn.Linear(100, 200)
self.activation = torch.nn.ReLU()
self.linear2 = torch.nn.Linear(200, 10)
self.softmax = torch.nn.Softmax()
def forward(self, [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")):
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.linear1([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.activation([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.linear2([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.softmax([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
return [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
def forward(self, x):
x = self.linear1(x)
x = self.activation(x)
x = self.linear2(x)
x = self.softmax(x)
return x
tinymodel = [TinyModel](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")()
tinymodel = TinyModel()
print('The model:')
print(tinymodel)
print('\n\nJust one layer:')
print([tinymodel.linear2](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear"))
print(tinymodel.linear2)
print('\n\nModel params:')
for [param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter") in [tinymodel.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")():
print([param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter"))
for param in tinymodel.parameters():
print(param)
print('\n\nLayer params:')
for [param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter") in [tinymodel.linear2.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")():
print([param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter"))
for param in tinymodel.linear2.parameters():
print(param)
```
```py
......@@ -145,18 +145,18 @@ tensor([ 0.0385, -0.0116, 0.0703, 0.0407, -0.0346, -0.0178, 0.0308, -0.0502,
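The most basic type of neural network layer is a *linear* or *fully connected* layer. This is a layer in which every input influences every output to a degree specified by the layer's weights. If a model has *m* inputs and *n* outputs, the weights form a matrix with *m* x *n* entries (PyTorch stores it with shape *n* x *m*, one row per output). For example: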
```py
[lin](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear") = [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(3, 2)
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(1, 3)
lin = torch.nn.Linear(3, 2)
x = torch.rand(1, 3)
print('Input:')
print([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print(x)
print('\n\nWeight and Bias parameters:')
for [param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter") in [lin.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")():
print([param](https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html#torch.nn.parameter.Parameter "torch.nn.parameter.Parameter"))
for param in lin.parameters():
print(param)
[y](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [lin](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
y = lin(x)
print('\n\nOutput:')
print([y](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print(y)
```
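To make the arithmetic concrete, here is a plain-Python sketch of what a linear layer computes, `y = x @ W.T + b`. The `linear` helper and all numbers are hypothetical and chosen for readability; they are not the random values printed by the tutorial.

```python
# Plain-Python sketch of a linear layer: y = x @ W.T + b.
# The helper name and the values are illustrative assumptions.

def linear(x, weight, bias):
    """x has in_features values; weight has out_features rows of in_features values."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]

x = [1.0, 2.0, 3.0]            # 3 inputs, like one row of torch.rand(1, 3)
weight = [[0.5, -1.0, 0.25],   # shape (2, 3): one row per output
          [1.0, 0.0, -0.5]]
bias = [0.1, -0.2]

y = linear(x, weight, bias)
print(y)                       # two outputs, one per weight row
```

Each output is a weighted sum of all three inputs plus its own bias, which is why `Linear(3, 2)` holds 3 * 2 weights and 2 biases.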
```py
......@@ -189,32 +189,32 @@ tensor([[ 0.8814, -0.1492]], grad_fn=<AddmmBackward0>)
```py
import torch.nn.functional as F
class LeNet([torch.nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class LeNet(torch.nn.Module):
def __init__(self):
super([LeNet](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module"), self).__init__()
super(LeNet, self).__init__()
# 1 input image channel (black & white), 6 output channels, 5x5 square convolution
# kernel
self.conv1 = [torch.nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(1, 6, 5)
self.conv2 = [torch.nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(6, 16, 3)
self.conv1 = torch.nn.Conv2d(1, 6, 5)
self.conv2 = torch.nn.Conv2d(6, 16, 3)
# an affine operation: y = Wx + b
self.fc1 = [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(16 * 6 * 6, 120) # 6*6 from image dimension
self.fc2 = [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(120, 84)
self.fc3 = [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(84, 10)
self.fc1 = torch.nn.Linear(16 * 6 * 6, 120) # 6*6 from image dimension
self.fc2 = torch.nn.Linear(120, 84)
self.fc3 = torch.nn.Linear(84, 10)
def forward(self, [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")):
def forward(self, x):
# Max pooling over a (2, 2) window
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = F.max_pool2d(F.relu(self.conv1([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))), (2, 2))
x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
# If the window is square, you can specify a single number
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = F.max_pool2d(F.relu(self.conv2([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))), 2)
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").view(-1, self.num_flat_features([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")))
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = F.relu(self.fc1([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")))
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = F.relu(self.fc2([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")))
[x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = self.fc3([x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
return [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
def num_flat_features(self, [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")):
size = [x](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").size()[1:] # all dimensions except the batch dimension
x = F.max_pool2d(F.relu(self.conv2(x)), 2)
x = x.view(-1, self.num_flat_features(x))
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
def num_flat_features(self, x):
size = x.size()[1:] # all dimensions except the batch dimension
num_features = 1
for s in size:
num_features *= s
......@@ -244,20 +244,20 @@ class LeNet([torch.nn.Module](https://pytorch.org/docs/stable/generated/torch.nn
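The internal structure of RNN layers, and of their variants LSTM (long short-term memory) and GRU (gated recurrent unit), is moderately complex and beyond the scope of this video, but we'll show you what one looks like in action with an LSTM-based part-of-speech tagger (a type of classifier that tells you whether a word is a noun, verb, etc.):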
```py
class LSTMTagger([torch.nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class LSTMTagger(torch.nn.Module):
def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
super([LSTMTagger](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module"), self).__init__()
super(LSTMTagger, self).__init__()
self.hidden_dim = hidden_dim
self.word_embeddings = [torch.nn.Embedding](https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html#torch.nn.Embedding "torch.nn.Embedding")(vocab_size, embedding_dim)
self.word_embeddings = torch.nn.Embedding(vocab_size, embedding_dim)
# The LSTM takes word embeddings as inputs, and outputs hidden states
# with dimensionality hidden_dim.
self.lstm = [torch.nn.LSTM](https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html#torch.nn.LSTM "torch.nn.LSTM")(embedding_dim, hidden_dim)
self.lstm = torch.nn.LSTM(embedding_dim, hidden_dim)
# The linear layer that maps from hidden state space to tag space
self.hidden2tag = [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(hidden_dim, tagset_size)
self.hidden2tag = torch.nn.Linear(hidden_dim, tagset_size)
def forward(self, sentence):
embeds = self.word_embeddings(sentence)
......@@ -294,11 +294,11 @@ class LSTMTagger([torch.nn.Module](https://pytorch.org/docs/stable/generated/tor
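**Max pooling** (and its twin, min pooling) reduces a tensor by combining groups of cells and assigning the maximum (or, for min pooling, the minimum) value of the input cells to a single output cell (we saw this earlier). For example: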
```py
[my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(1, 6, 6)
print([my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
my_tensor = torch.rand(1, 6, 6)
print(my_tensor)
[maxpool_layer](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d "torch.nn.MaxPool2d") = [torch.nn.MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d "torch.nn.MaxPool2d")(3)
print([maxpool_layer](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d "torch.nn.MaxPool2d")([my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")))
maxpool_layer = torch.nn.MaxPool2d(3)
print(maxpool_layer(my_tensor))
```
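The reduction performed by `MaxPool2d(3)` on a 6x6 input can be reproduced by hand. This is a minimal plain-Python sketch under the assumption that the stride equals the window size (the default when only the kernel size is given); `max_pool2d` here is a hypothetical helper, not the PyTorch function.

```python
# Plain-Python sketch of 2-D max pooling with a square window and a stride
# equal to the window size. Illustrative only, not PyTorch's implementation.

def max_pool2d(grid, k):
    rows, cols = len(grid), len(grid[0])
    return [
        [max(grid[r + dr][c + dc] for dr in range(k) for dc in range(k))
         for c in range(0, cols - k + 1, k)]
        for r in range(0, rows - k + 1, k)
    ]

# A 6x6 grid whose cell value is 10 * row + column, so maxima are easy to spot.
grid = [[10 * r + c for c in range(6)] for r in range(6)]
pooled = max_pool2d(grid, 3)
print(pooled)  # [[22, 25], [52, 55]]
```

Each 3x3 window collapses into its largest value, so the 6x6 grid becomes 2x2, matching the shape of the tutorial's output.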
```py
......@@ -317,16 +317,16 @@ tensor([[[0.7950, 0.9876],
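**Normalization layers** re-center and normalize the output of one layer before feeding it to another. Centering and scaling the intermediate tensors has a number of beneficial effects, such as letting you use higher learning rates without exploding or vanishing gradients.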
```py
[my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(1, 4, 4) * 20 + 5
print([my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
my_tensor = torch.rand(1, 4, 4) * 20 + 5
print(my_tensor)
print([my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").mean())
print(my_tensor.mean())
[norm_layer](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html#torch.nn.BatchNorm1d "torch.nn.BatchNorm1d") = [torch.nn.BatchNorm1d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html#torch.nn.BatchNorm1d "torch.nn.BatchNorm1d")(4)
[normed_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [norm_layer](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html#torch.nn.BatchNorm1d "torch.nn.BatchNorm1d")([my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print([normed_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
norm_layer = torch.nn.BatchNorm1d(4)
normed_tensor = norm_layer(my_tensor)
print(normed_tensor)
print([normed_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").mean())
print(normed_tensor.mean())
```
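The re-centering and rescaling can be checked on a single channel by hand. The sketch below is plain Python and assumes what a batch-norm layer does in training mode with its default scale of 1 and shift of 0: subtract the mean, then divide by the square root of the biased variance plus a small `eps`. The `normalize` helper and the sample values are made up for illustration.

```python
import math

def normalize(values, eps=1e-5):
    """Re-center to zero mean and rescale to roughly unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)  # biased variance
    return [(v - mean) / math.sqrt(var + eps) for v in values]

# One "channel" of invented activations in the range produced by rand * 20 + 5.
channel = [7.0, 12.5, 21.0, 16.5]
normed = normalize(channel)

print(sum(channel) / len(channel))  # mean before normalization
print(sum(normed) / len(normed))    # mean after normalization, close to zero
```

The mean moves from about 14 to approximately zero, which mirrors the tiny mean printed for `normed_tensor` in the tutorial output.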
```py
......@@ -352,11 +352,11 @@ tensor(3.3528e-08, grad_fn=<MeanBackward0>)
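Dropout layers work by randomly setting parts of the input tensor to zero during training; dropout layers are always turned off for inference. This forces the model to learn against this masked or reduced dataset. For example: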
```py
[my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(1, 4, 4)
my_tensor = torch.rand(1, 4, 4)
[dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout "torch.nn.Dropout") = [torch.nn.Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout "torch.nn.Dropout")(p=0.4)
print([dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout "torch.nn.Dropout")([my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")))
print([dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout "torch.nn.Dropout")([my_tensor](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")))
dropout = torch.nn.Dropout(p=0.4)
print(dropout(my_tensor))
print(dropout(my_tensor))
```
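The values that survive dropout are also rescaled. As a rough plain-Python sketch (the `dropout` helper, the seed, and the inputs are assumptions for illustration), each element is zeroed with probability `p` and the remaining ones are multiplied by `1 / (1 - p)` so that the expected sum stays the same:

```python
import random

def dropout(values, p, rng):
    """Zero each value with probability p; scale the survivors by 1 / (1 - p)."""
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
values = [0.2, 0.4, 0.6, 0.8]

first = dropout(values, p=0.4, rng=rng)
second = dropout(values, p=0.4, rng=rng)
print(first)
print(second)  # each call draws a fresh random mask
```

With `p=0.4`, every surviving entry is divided by 0.6, which is why nonzero values in the tutorial output can exceed 1 even though the input came from `torch.rand`.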
```py
......
......@@ -51,7 +51,7 @@ import matplotlib.pyplot as plt
import numpy as np
# PyTorch TensorBoard support
from torch.utils.tensorboard import [SummaryWriter](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter "torch.utils.tensorboard.writer.SummaryWriter")
from torch.utils.tensorboard import SummaryWriter
# In case you are using an environment that has TensorFlow installed,
# such as Google Colab, uncomment the following code to avoid
......@@ -68,26 +68,26 @@ from torch.utils.tensorboard import [SummaryWriter](https://pytorch.org/docs/sta
```py
# Gather datasets and prepare them for consumption
[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose") = [transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")(
[[transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
[transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")((0.5,), (0.5,))])
transform = transforms.Compose(
[transforms.ToTensor(),
transforms.Normalize((0.5,), (0.5,))])
# Store separate training and validation splits in ./data
[training_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [torchvision.datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")('./data',
training_set = torchvision.datasets.FashionMNIST('./data',
download=True,
train=True,
[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")=[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose"))
[validation_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [torchvision.datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")('./data',
transform=transform)
validation_set = torchvision.datasets.FashionMNIST('./data',
download=True,
train=False,
[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")=[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose"))
transform=transform)
[training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([training_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"),
training_loader = torch.utils.data.DataLoader(training_set,
batch_size=4,
shuffle=True,
num_workers=2)
[validation_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([validation_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"),
validation_loader = torch.utils.data.DataLoader(validation_set,
batch_size=4,
shuffle=False,
num_workers=2)
......@@ -108,12 +108,12 @@ def matplotlib_imshow(img, one_channel=False):
plt.imshow(np.transpose(npimg, (1, 2, 0)))
# Extract a batch of 4 images
dataiter = iter([training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"))
[images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = next(dataiter)
dataiter = iter(training_loader)
images, labels = next(dataiter)
# Create a grid from the images and show them
[img_grid](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torchvision.utils.make_grid](https://pytorch.org/vision/stable/generated/torchvision.utils.make_grid.html#torchvision.utils.make_grid "torchvision.utils.make_grid")([images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
matplotlib_imshow([img_grid](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), one_channel=True)
img_grid = torchvision.utils.make_grid(images)
matplotlib_imshow(img_grid, one_channel=True)
```
![tensorboardyt tutorial](img/8498a1fd8664fde87cab20edccaf4cb9.png)
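The datasets above were built with `Normalize((0.5,), (0.5,))`, so the tensors in `images` no longer lie in the range [0, 1]. The following plain-Python sketch shows the per-pixel arithmetic and its inverse; `normalize` and `unnormalize` are hypothetical helpers written for this illustration, not torchvision functions.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Per-pixel arithmetic of Normalize((0.5,), (0.5,))."""
    return (pixel - mean) / std

def unnormalize(value, mean=0.5, std=0.5):
    """Inverse mapping, useful before displaying an image."""
    return value * std + mean

pixels = [0.0, 0.25, 0.5, 1.0]  # ToTensor() yields values in [0, 1]
normalized = [normalize(p) for p in pixels]
restored = [unnormalize(v) for v in normalized]

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```

The transform maps [0, 1] onto [-1, 1], so an image has to be mapped back before it is drawn with its original brightness.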
......@@ -164,11 +164,11 @@ Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMN
```py
# Default log_dir argument is "runs" - but it's good to be specific
# torch.utils.tensorboard.SummaryWriter is imported above
[writer](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter "torch.utils.tensorboard.writer.SummaryWriter") = [SummaryWriter](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter "torch.utils.tensorboard.writer.SummaryWriter")('runs/fashion_mnist_experiment_1')
writer = SummaryWriter('runs/fashion_mnist_experiment_1')
# Write image data to TensorBoard log dir
[writer.add_image](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_image "torch.utils.tensorboard.writer.SummaryWriter.add_image")('Four Fashion-MNIST Images', [img_grid](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[writer.flush](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.flush "torch.utils.tensorboard.writer.SummaryWriter.flush")()
writer.add_image('Four Fashion-MNIST Images', img_grid)
writer.flush()
# To view, start TensorBoard on the command line with:
# tensorboard --logdir=runs
......@@ -184,73 +184,73 @@ TensorBoard is useful for tracking the progress and efficacy of your training. Below,
Let's define a model to categorize our image tiles, along with an optimizer and a loss function for training:
```py
class Net([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class Net(nn.Module):
def __init__(self):
super([Net](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module"), self).__init__()
self.conv1 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(1, 6, 5)
self.pool = [nn.MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d "torch.nn.MaxPool2d")(2, 2)
self.conv2 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(6, 16, 5)
self.fc1 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(16 * 4 * 4, 120)
self.fc2 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(120, 84)
self.fc3 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(84, 10)
super(Net, self).__init__()
self.conv1 = nn.Conv2d(1, 6, 5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(6, 16, 5)
self.fc1 = nn.Linear(16 * 4 * 4, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, 10)
def forward(self, x):
x = self.pool([F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.conv1(x)))
x = self.pool([F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.conv2(x)))
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
x = x.view(-1, 16 * 4 * 4)
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.fc1(x))
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.fc2(x))
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
net = [Net](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")()
[criterion](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss") = [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")()
[optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD") = [optim.SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")([net.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")(), lr=0.001, momentum=0.9)
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
```
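The `16 * 4 * 4` input size of `fc1` follows from the layer shapes. As a small plain-Python check (the `conv_out` helper is hypothetical and assumes square inputs, no padding, and no dilation), we can trace the side length of a 28x28 Fashion-MNIST image through the network:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window on a square input."""
    return (size + 2 * padding - kernel) // stride + 1

side = 28                                   # Fashion-MNIST images are 28x28
side = conv_out(side, kernel=5)             # conv1, 5x5 kernel: 24
side = conv_out(side, kernel=2, stride=2)   # pool, 2x2 with stride 2: 12
side = conv_out(side, kernel=5)             # conv2, 5x5 kernel: 8
side = conv_out(side, kernel=2, stride=2)   # pool again: 4

flat_features = 16 * side * side            # conv2 has 16 output channels
print(side, flat_features)                  # 4 256
```

Changing a kernel size or the input resolution changes this number, and `fc1` as well as the `x.view(-1, 16 * 4 * 4)` call would need to be updated to match.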
Now let's train for a single epoch, and evaluate the training versus validation set losses every 1000 batches:
```py
print(len([validation_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")))
print(len(validation_loader))
for epoch in range(1): # loop over the dataset multiple times
running_loss = 0.0
for i, data in enumerate([training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), 0):
for i, data in enumerate(training_loader, 0):
# basic training loop
[inputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = data
[optimizer.zero_grad](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.zero_grad "torch.optim.SGD.zero_grad")()
[outputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = net([inputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [criterion](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")([outputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[loss.backward](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html#torch.Tensor.backward "torch.Tensor.backward")()
[optimizer.step](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.step "torch.optim.SGD.step")()
running_loss += [loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").item()
inputs, labels = data
optimizer.zero_grad()
outputs = net(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
running_loss += loss.item()
if i % 1000 == 999: # Every 1000 mini-batches...
print('Batch {}'.format(i + 1))
# Check against the validation set
running_vloss = 0.0
# In evaluation mode some model-specific operations can be omitted, e.g. dropout layers
[net.train](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train "torch.nn.Module.train")(False) # Switching to evaluation mode, eg. turning off regularisation
for j, vdata in enumerate([validation_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), 0):
[vinputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [vlabels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = vdata
[voutputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = net([vinputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [criterion](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")([voutputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [vlabels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
running_vloss += [vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").item()
[net.train](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train "torch.nn.Module.train")(True) # Switching back to training mode, eg. turning on regularisation
net.train(False) # Switching to evaluation mode, eg. turning off regularisation
for j, vdata in enumerate(validation_loader, 0):
vinputs, vlabels = vdata
voutputs = net(vinputs)
vloss = criterion(voutputs, vlabels)
running_vloss += vloss.item()
net.train(True) # Switching back to training mode, eg. turning on regularisation
avg_loss = running_loss / 1000
avg_vloss = running_vloss / len([validation_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"))
avg_vloss = running_vloss / len(validation_loader)
# Log the running loss averaged per batch
[writer.add_scalars](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_scalars "torch.utils.tensorboard.writer.SummaryWriter.add_scalars")('Training vs. Validation Loss',
writer.add_scalars('Training vs. Validation Loss',
{ 'Training' : avg_loss, 'Validation' : avg_vloss },
epoch * len([training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")) + i)
epoch * len(training_loader) + i)
running_loss = 0.0
print('Finished Training')
[writer.flush](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.flush "torch.utils.tensorboard.writer.SummaryWriter.flush")()
writer.flush()
```
```py
......@@ -281,13 +281,13 @@ TensorBoard can also be used to examine the data flow within your model. To do this, use
```py
# Again, grab a single mini-batch of images
dataiter = iter([training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"))
[images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = next(dataiter)
dataiter = iter(training_loader)
images, labels = next(dataiter)
# add_graph() will trace the sample input through your model,
# and render it as a graph.
[writer.add_graph](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_graph "torch.utils.tensorboard.writer.SummaryWriter.add_graph")(net, [images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[writer.flush](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.flush "torch.utils.tensorboard.writer.SummaryWriter.flush")()
writer.add_graph(net, images)
writer.flush()
```
When you switch over to TensorBoard, you should see a GRAPHS tab. Double-click the "NET" node to see the layers and data flow within your model.
......@@ -300,25 +300,25 @@ dataiter = iter([training_loader](https://pytorch.org/docs/stable/data.html#torc
```py
# Select a random subset of data and corresponding labels
def select_n_random(data, [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), n=100):
assert len(data) == len([labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
def select_n_random(data, labels, n=100):
assert len(data) == len(labels)
perm = [torch.randperm](https://pytorch.org/docs/stable/generated/torch.randperm.html#torch.randperm "torch.randperm")(len(data))
return data[perm][:n], [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[perm][:n]
perm = torch.randperm(len(data))
return data[perm][:n], labels[perm][:n]
# Extract a random subset of data
[images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = select_n_random([training_set.data](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [training_set.targets](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
images, labels = select_n_random(training_set.data, training_set.targets)
# get the class labels for each image
class_labels = [classes[label] for label in [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")]
class_labels = [classes[label] for label in labels]
# log embeddings
[features](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").view(-1, 28 * 28)
[writer.add_embedding](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_embedding "torch.utils.tensorboard.writer.SummaryWriter.add_embedding")([features](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"),
features = images.view(-1, 28 * 28)
writer.add_embedding(features,
metadata=class_labels,
label_img=[images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").unsqueeze(1))
[writer.flush](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.flush "torch.utils.tensorboard.writer.SummaryWriter.flush")()
[writer.close](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.close "torch.utils.tensorboard.writer.SummaryWriter.close")()
label_img=images.unsqueeze(1))
writer.flush()
writer.close()
```
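Now, if you switch to TensorBoard and select the PROJECTOR tab, you should see a 3D representation of the projection. You can rotate and zoom the model. Examine it at large and small scales, and see whether you can spot patterns in the projected data and the clustering of labels.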
......
......@@ -48,28 +48,28 @@ import torchvision
import torchvision.transforms as transforms
# PyTorch TensorBoard support
from torch.utils.tensorboard import [SummaryWriter](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter "torch.utils.tensorboard.writer.SummaryWriter")
from torch.utils.tensorboard import SummaryWriter
from datetime import datetime
[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose") = [transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")(
[[transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
[transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")((0.5,), (0.5,))])
transform = transforms.Compose(
[transforms.ToTensor(),
transforms.Normalize((0.5,), (0.5,))])
# Create datasets for training & validation, download if necessary
[training_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [torchvision.datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")('./data', train=True, [transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")=[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose"), download=True)
[validation_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST") = [torchvision.datasets.FashionMNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST")('./data', train=False, [transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")=[transform](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose"), download=True)
training_set = torchvision.datasets.FashionMNIST('./data', train=True, transform=transform, download=True)
validation_set = torchvision.datasets.FashionMNIST('./data', train=False, transform=transform, download=True)
# Create data loaders for our datasets; shuffle for training, not for validation
[training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([training_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=4, shuffle=True)
[validation_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")([validation_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"), batch_size=4, shuffle=False)
training_loader = torch.utils.data.DataLoader(training_set, batch_size=4, shuffle=True)
validation_loader = torch.utils.data.DataLoader(validation_set, batch_size=4, shuffle=False)
# Class labels
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')
# Report split sizes
print('Training set has {} instances'.format(len([training_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"))))
print('Validation set has {} instances'.format(len([validation_set](https://pytorch.org/vision/stable/generated/torchvision.datasets.FashionMNIST.html#torchvision.datasets.FashionMNIST "torchvision.datasets.FashionMNIST"))))
print('Training set has {} instances'.format(len(training_set)))
print('Validation set has {} instances'.format(len(validation_set)))
```
```py
......@@ -134,13 +134,13 @@ def matplotlib_imshow(img, one_channel=False):
else:
plt.imshow(np.transpose(npimg, (1, 2, 0)))
dataiter = iter([training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"))
[images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = next(dataiter)
dataiter = iter(training_loader)
images, labels = next(dataiter)
# Create a grid from the images and show them
[img_grid](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torchvision.utils.make_grid](https://pytorch.org/vision/stable/generated/torchvision.utils.make_grid.html#torchvision.utils.make_grid "torchvision.utils.make_grid")([images](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
matplotlib_imshow([img_grid](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), one_channel=True)
print(' '.join(classes[[labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")[j]] for j in range(4)))
img_grid = torchvision.utils.make_grid(images)
matplotlib_imshow(img_grid, one_channel=True)
print(' '.join(classes[labels[j]] for j in range(4)))
```
![trainingyt](img/c62745d33703f5977e18e6e3956d7fe6.png)
......@@ -158,26 +158,26 @@ import torch.nn as nn
import torch.nn.functional as F
# PyTorch models inherit from torch.nn.Module
class GarmentClassifier([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class GarmentClassifier(nn.Module):
def __init__(self):
super([GarmentClassifier](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module"), self).__init__()
self.conv1 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(1, 6, 5)
self.pool = [nn.MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d "torch.nn.MaxPool2d")(2, 2)
self.conv2 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(6, 16, 5)
self.fc1 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(16 * 4 * 4, 120)
self.fc2 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(120, 84)
self.fc3 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(84, 10)
super(GarmentClassifier, self).__init__()
self.conv1 = nn.Conv2d(1, 6, 5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(6, 16, 5)
self.fc1 = nn.Linear(16 * 4 * 4, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, 10)
def forward(self, x):
x = self.pool([F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.conv1(x)))
x = self.pool([F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.conv2(x)))
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
x = x.view(-1, 16 * 4 * 4)
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.fc1(x))
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.fc2(x))
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
model = [GarmentClassifier](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")()
model = GarmentClassifier()
```
## Loss Function
......@@ -185,19 +185,19 @@ model = [GarmentClassifier](https://pytorch.org/docs/stable/generated/torch.nn.M
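For this example, we'll use cross-entropy loss. For demonstration purposes, we'll create a batch of dummy output and label values, run them through the loss function, and examine the result.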
```py
[loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss") = [torch.nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")()
loss_fn = torch.nn.CrossEntropyLoss()
# NB: Loss functions expect data in batches, so we're creating batches of 4
# Represents the model's confidence in each of the 10 classes for a given input
[dummy_outputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.rand](https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand "torch.rand")(4, 10)
dummy_outputs = torch.rand(4, 10)
# Represents the correct class among the 10 being tested
[dummy_labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [torch.tensor](https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor "torch.tensor")([1, 5, 3, 7])
dummy_labels = torch.tensor([1, 5, 3, 7])
print([dummy_outputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print([dummy_labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print(dummy_outputs)
print(dummy_labels)
[loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")([dummy_outputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [dummy_labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
print('Total loss for this batch: {}'.format([loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").item()))
loss = loss_fn(dummy_outputs, dummy_labels)
print('Total loss for this batch: {}'.format(loss.item()))
```
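To make the number printed by the loss function less opaque, here is a minimal sketch of the arithmetic behind cross-entropy, written in plain Python rather than PyTorch. It assumes the loss is averaged over the batch, which is the default reduction of `torch.nn.CrossEntropyLoss`; the helper name `cross_entropy` and the sample values are illustrative only.

```python
import math

def cross_entropy(logits, targets):
    """Mean cross-entropy over a batch of raw class scores (logits)."""
    total = 0.0
    for row, target in zip(logits, targets):
        # Log-sum-exp, with the row maximum subtracted for numerical stability
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        # Negative log-probability assigned to the correct class
        total += log_sum_exp - row[target]
    return total / len(logits)

labels = [1, 5, 3, 7]

# Identical scores for all 10 classes: the model is maximally unsure
uniform_logits = [[0.0] * 10 for _ in range(4)]
uniform_loss = cross_entropy(uniform_logits, labels)

# A large score on the correct class: the model is confident and right
confident_logits = [[0.0] * 10 for _ in range(4)]
for row, target in zip(confident_logits, labels):
    row[target] = 10.0
confident_loss = cross_entropy(confident_logits, labels)

print(uniform_loss, confident_loss)
```

With ten equally scored classes the loss equals ln(10), roughly 2.30, so a loss near that value for near-random scores indicates a model that has not yet learned anything.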
```py
......@@ -227,7 +227,7 @@ Total loss for this batch: 2.428950071334839
```py
# Optimizers specified in the torch.optim package
[optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD") = [torch.optim.SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")([model.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")(), lr=0.001, momentum=0.9)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
```
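As a rough illustration of what the `momentum` argument does, the sketch below applies the SGD-with-momentum update to a single scalar parameter in plain Python. It assumes the library defaults for everything not passed above (no dampening, no Nesterov momentum, no weight decay); the function name `sgd_momentum_step` and the toy objective are hypothetical.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    # The velocity accumulates a decaying sum of past gradients
    velocity = momentum * velocity + grad
    # The parameter moves against the accumulated direction
    param = param - lr * velocity
    return param, velocity

# Toy objective f(x) = x**2, whose gradient is 2 * x
x, v = 1.0, 0.0
for _ in range(200):
    x, v = sgd_momentum_step(x, 2 * x, v, lr=0.01, momentum=0.9)

final_x = x
print(final_x)
```

Because the velocity keeps part of the previous steps, consecutive gradients pointing the same way build up speed, while gradients that flip sign partly cancel.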
## The Training Loop
......@@ -258,29 +258,29 @@ def train_one_epoch(epoch_index, tb_writer):
# Here, we use enumerate(training_loader) instead of
# iter(training_loader) so that we can track the batch
# index and do some intra-epoch reporting
for i, data in enumerate([training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")):
for i, data in enumerate(training_loader):
# Every data instance is an input + label pair
inputs, [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = data
inputs, labels = data
# Zero your gradients for every batch!
[optimizer.zero_grad](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.zero_grad "torch.optim.SGD.zero_grad")()
optimizer.zero_grad()
# Make predictions for this batch
outputs = model(inputs)
# Compute the loss and its gradients
[loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")(outputs, [labels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[loss.backward](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html#torch.Tensor.backward "torch.Tensor.backward")()
loss = loss_fn(outputs, labels)
loss.backward()
# Adjust learning weights
[optimizer.step](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.step "torch.optim.SGD.step")()
optimizer.step()
# Gather data and report
running_loss += [loss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor").item()
running_loss += loss.item()
if i % 1000 == 999:
last_loss = running_loss / 1000 # loss per batch
print(' batch {} loss: {}'.format(i + 1, last_loss))
tb_x = epoch_index * len([training_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")) + i + 1
tb_x = epoch_index * len(training_loader) + i + 1
tb_writer.add_scalar('Loss/train', last_loss, tb_x)
running_loss = 0.
......@@ -300,48 +300,48 @@ def train_one_epoch(epoch_index, tb_writer):
```py
# Initializing in a separate cell so we can easily add more epochs to the same run
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
[writer](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter "torch.utils.tensorboard.writer.SummaryWriter") = [SummaryWriter](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter "torch.utils.tensorboard.writer.SummaryWriter")('runs/fashion_trainer_{}'.format(timestamp))
writer = SummaryWriter('runs/fashion_trainer_{}'.format(timestamp))
epoch_number = 0
EPOCHS = 5
[best_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = 1_000_000.
best_vloss = 1_000_000.
for epoch in range(EPOCHS):
print('EPOCH {}:'.format(epoch_number + 1))
# Make sure gradient tracking is on, and do a pass over the data
[model.train](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train "torch.nn.Module.train")(True)
avg_loss = train_one_epoch(epoch_number, [writer](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter "torch.utils.tensorboard.writer.SummaryWriter"))
model.train(True)
avg_loss = train_one_epoch(epoch_number, writer)
[running_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = 0.0
running_vloss = 0.0
# Set the model to evaluation mode, disabling dropout and using population
# statistics for batch normalization.
[model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval "torch.nn.Module.eval")()
model.eval()
# Disable gradient computation and reduce memory consumption.
with [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html#torch.no_grad "torch.no_grad")():
for i, vdata in enumerate([validation_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")):
[vinputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [vlabels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = vdata
[voutputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = model([vinputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [loss_fn](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss "torch.nn.CrossEntropyLoss")([voutputs](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"), [vlabels](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"))
[running_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") += [vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
with torch.no_grad():
for i, vdata in enumerate(validation_loader):
vinputs, vlabels = vdata
voutputs = model(vinputs)
vloss = loss_fn(voutputs, vlabels)
running_vloss += vloss
[avg_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [running_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") / (i + 1)
print('LOSS train {} valid {}'.format(avg_loss, [avg_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")))
avg_vloss = running_vloss / (i + 1)
print('LOSS train {} valid {}'.format(avg_loss, avg_vloss))
# Log the running loss averaged per batch
# for both training and validation
[writer.add_scalars](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_scalars "torch.utils.tensorboard.writer.SummaryWriter.add_scalars")('Training vs. Validation Loss',
{ 'Training' : avg_loss, 'Validation' : [avg_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") },
writer.add_scalars('Training vs. Validation Loss',
{ 'Training' : avg_loss, 'Validation' : avg_vloss },
epoch_number + 1)
[writer.flush](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.flush "torch.utils.tensorboard.writer.SummaryWriter.flush")()
writer.flush()
# Track best performance, and save the model's state
if [avg_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") < [best_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor"):
[best_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor") = [avg_vloss](https://pytorch.org/docs/stable/tensors.html#torch.Tensor "torch.Tensor")
if avg_vloss < best_vloss:
best_vloss = avg_vloss
model_path = 'model_{}_{}'.format(timestamp, epoch_number)
[torch.save](https://pytorch.org/docs/stable/generated/torch.save.html#torch.save "torch.save")([model.state_dict](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.state_dict "torch.nn.Module.state_dict")(), model_path)
torch.save(model.state_dict(), model_path)
epoch_number += 1
```
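The checkpointing rule in the loop above (save only when the validation loss improves) can be isolated in a few lines of plain Python. This is a sketch under stated assumptions: the per-epoch losses are made-up values, `track_best` is a hypothetical helper, and `float('inf')` stands in for the large starting constant used above.

```python
def track_best(val_losses):
    """Return the (epoch, loss) pairs at which a checkpoint would be saved."""
    best = float('inf')
    saved = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: this is where the model state would be saved
            best = loss
            saved.append((epoch, loss))
    return saved

# Hypothetical per-epoch validation losses
history = [0.45, 0.38, 0.40, 0.35, 0.36]
checkpoints = track_best(history)
print(checkpoints)  # [(0, 0.45), (1, 0.38), (3, 0.35)]
```

Epochs 2 and 4 are skipped because their validation loss is worse than the best seen so far, which is a simple guard against keeping an overfitted model.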
......@@ -437,7 +437,7 @@ LOSS train 0.27903492261294194 valid 0.31206756830215454
To load a saved version of the model:
```py
saved_model = [GarmentClassifier](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")()
saved_model = GarmentClassifier()
saved_model.load_state_dict(torch.load(PATH))
```
......
......@@ -106,7 +106,7 @@ from matplotlib.colors import LinearSegmentedColormap
Now we'll use the TorchVision model library to download a pretrained ResNet. Since we're not training, we'll place it in evaluation mode for now.
```py
model = [models.resnet18](https://pytorch.org/vision/stable/models/generated/torchvision.models.resnet18.html#torchvision.models.resnet18 "torchvision.models.resnet18")(weights='IMAGENET1K_V1')
model = models.resnet18(weights='IMAGENET1K_V1')
model = model.eval()
```
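Evaluation mode matters for a ResNet because it contains batch-normalization layers, which behave differently when training and when evaluating. The following plain-Python sketch shows the idea for a list of scalars; `TinyBatchNorm` is a hypothetical stand-in without the learnable scale and shift, not the library implementation.

```python
import math

class TinyBatchNorm:
    """Batch normalization over a list of scalars (no learnable scale/shift)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0
        self.training = True

    def __call__(self, batch):
        if self.training:
            n = len(batch)
            mean = sum(batch) / n
            # Biased variance normalizes the current batch
            var = sum((x - mean) ** 2 for x in batch) / n
            # Unbiased variance feeds the running estimate
            unbiased = var * n / (n - 1)
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * unbiased
        else:
            # Evaluation mode: use the stored statistics and leave them unchanged
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]

bn = TinyBatchNorm()
batch = [10.0, 12.0, 14.0, 16.0]

train_out = bn(batch)   # normalized with this batch's own statistics
bn.training = False
eval_out = bn(batch)    # normalized with the running statistics
print(train_out, eval_out)
```

In evaluation mode the output for a given input no longer depends on which other samples share the batch, which is what makes single-image inference reproducible.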
......@@ -123,14 +123,14 @@ plt.show()
```py
# model expects 224x224 3-color image
transform = [transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")([
[transforms.Resize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Resize.html#torchvision.transforms.Resize "torchvision.transforms.Resize")(224),
[transforms.CenterCrop](https://pytorch.org/vision/stable/generated/torchvision.transforms.CenterCrop.html#torchvision.transforms.CenterCrop "torchvision.transforms.CenterCrop")(224),
[transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")()
transform = transforms.Compose([
transforms.Resize(224),
transforms.CenterCrop(224),
transforms.ToTensor()
])
# standard ImageNet normalization
transform_normalize = [transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")(
transform_normalize = transforms.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]
)
......@@ -148,8 +148,8 @@ with open(labels_path) as json_data:
```py
output = model(input_img)
output = [F.softmax](https://pytorch.org/docs/stable/generated/torch.nn.functional.softmax.html#torch.nn.functional.softmax "torch.nn.functional.softmax")(output, dim=1)
prediction_score, pred_label_idx = [torch.topk](https://pytorch.org/docs/stable/generated/torch.topk.html#torch.topk "torch.topk")(output, 1)
output = F.softmax(output, dim=1)
prediction_score, pred_label_idx = torch.topk(output, 1)
pred_label_idx.squeeze_()
predicted_label = idx_to_labels[str(pred_label_idx.item())][1]
print('Predicted:', predicted_label, '(', prediction_score.squeeze().item(), ')')
......@@ -280,8 +280,8 @@ for img in imgs:
input_img = input_img.unsqueeze(0) # the model requires a dummy batch dimension
output = model(input_img)
output = [F.softmax](https://pytorch.org/docs/stable/generated/torch.nn.functional.softmax.html#torch.nn.functional.softmax "torch.nn.functional.softmax")(output, dim=1)
prediction_score, pred_label_idx = [torch.topk](https://pytorch.org/docs/stable/generated/torch.topk.html#torch.topk "torch.topk")(output, 1)
output = F.softmax(output, dim=1)
prediction_score, pred_label_idx = torch.topk(output, 1)
pred_label_idx.squeeze_()
predicted_label = idx_to_labels[str(pred_label_idx.item())][1]
print('Predicted:', predicted_label, '/', pred_label_idx.item(), ' (', prediction_score.squeeze().item(), ')')
......@@ -317,11 +317,11 @@ def full_img_transform(input):
i = i.unsqueeze(0)
return i
input_imgs = [torch.cat](https://pytorch.org/docs/stable/generated/torch.cat.html#torch.cat "torch.cat")(list(map(lambda i: full_img_transform(i), imgs)), 0)
input_imgs = torch.cat(list(map(lambda i: full_img_transform(i), imgs)), 0)
visualizer = AttributionVisualizer(
models=[model],
score_func=lambda o: [torch.nn.functional.softmax](https://pytorch.org/docs/stable/generated/torch.nn.functional.softmax.html#torch.nn.functional.softmax "torch.nn.functional.softmax")(o, 1),
score_func=lambda o: torch.nn.functional.softmax(o, 1),
classes=list(map(lambda k: idx_to_labels[k][1], idx_to_labels.keys())),
features=[
ImageFeature(
......
......@@ -59,7 +59,7 @@ epsilons = [0, .05, .1, .15, .2, .25, .3]
pretrained_model = "data/lenet_mnist_model.pth"
use_cuda=True
# Set random seed for reproducibility
[torch.manual_seed](https://pytorch.org/docs/stable/generated/torch.manual_seed.html#torch.manual_seed "torch.manual_seed")(42)
torch.manual_seed(42)
```
```py
......@@ -72,51 +72,51 @@ use_cuda=True
```py
# LeNet Model definition
class Net([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class Net(nn.Module):
def __init__(self):
super([Net](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module"), self).__init__()
self.conv1 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(1, 32, 3, 1)
self.conv2 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(32, 64, 3, 1)
self.dropout1 = [nn.Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout "torch.nn.Dropout")(0.25)
self.dropout2 = [nn.Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout "torch.nn.Dropout")(0.5)
self.fc1 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(9216, 128)
self.fc2 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(128, 10)
super(Net, self).__init__()
self.conv1 = nn.Conv2d(1, 32, 3, 1)
self.conv2 = nn.Conv2d(32, 64, 3, 1)
self.dropout1 = nn.Dropout(0.25)
self.dropout2 = nn.Dropout(0.5)
self.fc1 = nn.Linear(9216, 128)
self.fc2 = nn.Linear(128, 10)
def forward(self, x):
x = self.conv1(x)
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(x)
x = F.relu(x)
x = self.conv2(x)
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(x)
x = [F.max_pool2d](https://pytorch.org/docs/stable/generated/torch.nn.functional.max_pool2d.html#torch.nn.functional.max_pool2d "torch.nn.functional.max_pool2d")(x, 2)
x = F.relu(x)
x = F.max_pool2d(x, 2)
x = self.dropout1(x)
x = [torch.flatten](https://pytorch.org/docs/stable/generated/torch.flatten.html#torch.flatten "torch.flatten")(x, 1)
x = torch.flatten(x, 1)
x = self.fc1(x)
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(x)
x = F.relu(x)
x = self.dropout2(x)
x = self.fc2(x)
output = [F.log_softmax](https://pytorch.org/docs/stable/generated/torch.nn.functional.log_softmax.html#torch.nn.functional.log_softmax "torch.nn.functional.log_softmax")(x, dim=1)
output = F.log_softmax(x, dim=1)
return output
# MNIST Test dataset and dataloader declaration
[test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")(
[datasets.MNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST "torchvision.datasets.MNIST")('../data', train=False, download=True, transform=[transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")([
[transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
[transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")((0.1307,), (0.3081,)),
test_loader = torch.utils.data.DataLoader(
datasets.MNIST('../data', train=False, download=True, transform=transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,)),
])),
batch_size=1, shuffle=True)
# Define what device we are using
print("CUDA Available: ",[torch.cuda.is_available](https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available "torch.cuda.is_available")())
[device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device") = [torch.device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device")("cuda" if use_cuda and [torch.cuda.is_available](https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available "torch.cuda.is_available")() else "cpu")
print("CUDA Available: ",torch.cuda.is_available())
device = torch.device("cuda" if use_cuda and torch.cuda.is_available() else "cpu")
# Initialize the network
model = [Net](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")().to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
model = Net().to(device)
# Load the pretrained model
[model.load_state_dict](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.load_state_dict "torch.nn.Module.load_state_dict")([torch.load](https://pytorch.org/docs/stable/generated/torch.load.html#torch.load "torch.load")(pretrained_model, map_location=[device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device")))
model.load_state_dict(torch.load(pretrained_model, map_location=device))
# Set the model in evaluation mode. In this case this is for the Dropout layers
[model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval "torch.nn.Module.eval")()
model.eval()
```
```py
......@@ -176,7 +176,7 @@ def fgsm_attack(image, epsilon, data_grad):
# Create the perturbed image by adjusting each pixel of the input image
perturbed_image = image + epsilon*sign_data_grad
# Adding clipping to maintain [0,1] range
perturbed_image = [torch.clamp](https://pytorch.org/docs/stable/generated/torch.clamp.html#torch.clamp "torch.clamp")(perturbed_image, 0, 1)
perturbed_image = torch.clamp(perturbed_image, 0, 1)
# Return the perturbed image
return perturbed_image
......@@ -194,9 +194,9 @@ def denorm(batch, mean=[0.1307], std=[0.3081]):
torch.Tensor: batch of tensors without normalization applied to them.
"""
if isinstance(mean, list):
mean = [torch.tensor](https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor "torch.tensor")(mean).to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
mean = torch.tensor(mean).to(device)
if isinstance(std, list):
std = [torch.tensor](https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor "torch.tensor")(std).to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
std = torch.tensor(std).to(device)
return batch * std.view(1, -1, 1, 1) + mean.view(1, -1, 1, 1)
```
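As a quick sanity check on `denorm`, the sketch below confirms that denormalizing a normalized batch recovers the original pixels. It is a minimal illustration, not the tutorial's code: `normalize` is a hypothetical stand-in for `transforms.Normalize((0.1307,), (0.3081,))` so that the example needs only `torch`, and everything runs on the CPU rather than on `device`.

```python
import torch

# MNIST statistics used by the tutorial's Normalize transform.
MEAN, STD = 0.1307, 0.3081

def normalize(batch, mean=MEAN, std=STD):
    # Hypothetical stand-in for transforms.Normalize((mean,), (std,)).
    return (batch - mean) / std

def denorm(batch, mean=MEAN, std=STD):
    # Same arithmetic as the tutorial's denorm, for one channel on the CPU.
    mean_t = torch.tensor([mean])
    std_t = torch.tensor([std])
    return batch * std_t.view(1, -1, 1, 1) + mean_t.view(1, -1, 1, 1)

pixels = torch.rand(2, 1, 28, 28)  # values in [0, 1), like ToTensor output
restored = denorm(normalize(pixels))

# The round trip should reproduce the input up to floating-point error.
round_trip_ok = torch.allclose(restored, pixels, atol=1e-6)
print(round_trip_ok)
```

This round trip is why the attack can work in pixel space: the perturbation and the clamp to [0, 1] are applied to denormalized data, and normalization is reapplied before the model sees the result.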
......@@ -206,17 +206,17 @@ def denorm(batch, mean=[0.1307], std=[0.3081]):
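Finally, the central result of this tutorial comes from the `test` function. Each call to this function performs a full test pass over the MNIST test set and reports the final accuracy. Note, however, that the function also takes an *epsilon* input. This is because `test` reports the accuracy of a model that is under attack from an adversary of strength \(\epsilon\). More specifically, for each sample in the test set, the function computes the gradient of the loss with respect to the input data (\(data\_grad\)), creates a perturbed image with `fgsm_attack` (\(perturbed\_data\)), and then checks whether the perturbed example is adversarial. In addition to measuring the model's accuracy, the function saves and returns some successful adversarial examples for later visualization.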
```py
def test( model, [device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"), [test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), epsilon ):
def test( model, device, test_loader, epsilon ):
# Accuracy counter
correct = 0
adv_examples = []
# Loop over all examples in test set
for data, target in [test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"):
for data, target in test_loader:
# Send the data and label to the device
data, target = data.to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device")), target.to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
data, target = data.to(device), target.to(device)
# Set requires_grad attribute of tensor. Important for Attack
data.requires_grad = True
......@@ -230,10 +230,10 @@ def test( model, [device](https://pytorch.org/docs/stable/tensor_attributes.html
continue
# Calculate the loss
loss = [F.nll_loss](https://pytorch.org/docs/stable/generated/torch.nn.functional.nll_loss.html#torch.nn.functional.nll_loss "torch.nn.functional.nll_loss")(output, target)
loss = F.nll_loss(output, target)
# Zero all existing gradients
[model.zero_grad](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.zero_grad "torch.nn.Module.zero_grad")()
model.zero_grad()
# Calculate gradients of model in backward pass
loss.backward()
......@@ -248,7 +248,7 @@ def test( model, [device](https://pytorch.org/docs/stable/tensor_attributes.html
perturbed_data = fgsm_attack(data_denorm, epsilon, data_grad)
# Reapply normalization
perturbed_data_normalized = [transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")((0.1307,), (0.3081,))(perturbed_data)
perturbed_data_normalized = transforms.Normalize((0.1307,), (0.3081,))(perturbed_data)
# Re-classify the perturbed image
output = model(perturbed_data_normalized)
......@@ -268,8 +268,8 @@ def test( model, [device](https://pytorch.org/docs/stable/tensor_attributes.html
adv_examples.append( (init_pred.item(), final_pred.item(), adv_ex) )
# Calculate final accuracy for this epsilon
final_acc = correct/float(len([test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")))
print(f"Epsilon: {epsilon}\tTest Accuracy = {correct} / {len([test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"))} = {final_acc}")
final_acc = correct/float(len(test_loader))
print(f"Epsilon: {epsilon}\tTest Accuracy = {correct} / {len(test_loader)} = {final_acc}")
# Return the accuracy and an adversarial example
return final_acc, adv_examples
......@@ -285,7 +285,7 @@ examples = []
# Run test for each epsilon
for eps in epsilons:
acc, ex = test(model, [device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"), [test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"), eps)
acc, ex = test(model, device, test_loader, eps)
accuracies.append(acc)
examples.append(ex)
```
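The per-sample procedure inside `test` can be condensed into a standalone sketch. It assumes a tiny `torch.nn.Linear` model (named `toy_model` here, purely hypothetical) in place of the pretrained MNIST network, and a random 4-value "image" instead of real data, so it only illustrates the mechanics: enable gradients on the input, backpropagate the loss, then step along the sign of the input gradient.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(image, epsilon, data_grad):
    # Step each pixel by epsilon in the direction that increases the loss.
    perturbed = image + epsilon * data_grad.sign()
    # Keep pixel values in the valid [0, 1] range.
    return torch.clamp(perturbed, 0, 1)

torch.manual_seed(0)
toy_model = torch.nn.Linear(4, 3)  # hypothetical stand-in for the classifier
data = torch.rand(1, 4)            # one "image" with values in [0, 1)
target = torch.tensor([1])

data.requires_grad = True          # needed so backward() fills data.grad
loss = F.nll_loss(F.log_softmax(toy_model(data), dim=1), target)
toy_model.zero_grad()
loss.backward()

epsilon = 0.1
perturbed = fgsm_attack(data.detach(), epsilon, data.grad)
with torch.no_grad():
    adv_loss = F.nll_loss(F.log_softmax(toy_model(perturbed), dim=1), target)

# No pixel moves by more than epsilon.
max_change = (perturbed - data.detach()).abs().max().item()
print(f"loss {loss.item():.4f} -> {adv_loss.item():.4f}, max change {max_change:.4f}")
```

Because no pixel changes by more than `epsilon`, sweeping over larger values in `epsilons` trades a more visible perturbation for a stronger attack.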
......
......@@ -46,20 +46,20 @@ opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib.request.install_opener(opener)
[device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device") = [torch.device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device")("cuda" if [torch.cuda.is_available](https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available "torch.cuda.is_available")() else "cpu")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Training dataset
[train_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")(
[datasets.MNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST "torchvision.datasets.MNIST")(root='.', train=True, download=True,
transform=[transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")([
[transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
[transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")((0.1307,), (0.3081,))
train_loader = torch.utils.data.DataLoader(
datasets.MNIST(root='.', train=True, download=True,
transform=transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,))
])), batch_size=64, shuffle=True, num_workers=4)
# Test dataset
[test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader") = [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")(
[datasets.MNIST](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST "torchvision.datasets.MNIST")(root='.', train=False, transform=[transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")([
[transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
[transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")((0.1307,), (0.3081,))
test_loader = torch.utils.data.DataLoader(
datasets.MNIST(root='.', train=False, transform=transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,))
])), batch_size=64, shuffle=True, num_workers=4)
```
......@@ -110,35 +110,35 @@ Extracting ./MNIST/raw/t10k-labels-idx1-ubyte.gz to ./MNIST/raw
We need a recent version of PyTorch that includes the `affine_grid` and `grid_sample` functions.
```py
class Net([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")):
class Net(nn.Module):
def __init__(self):
super([Net](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module"), self).__init__()
self.conv1 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(1, 10, kernel_size=5)
self.conv2 = [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(10, 20, kernel_size=5)
self.conv2_drop = [nn.Dropout2d](https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html#torch.nn.Dropout2d "torch.nn.Dropout2d")()
self.fc1 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(320, 50)
self.fc2 = [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(50, 10)
super(Net, self).__init__()
self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
self.conv2_drop = nn.Dropout2d()
self.fc1 = nn.Linear(320, 50)
self.fc2 = nn.Linear(50, 10)
# Spatial transformer localization-network
self.localization = [nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential")(
[nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(1, 8, kernel_size=7),
[nn.MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d "torch.nn.MaxPool2d")(2, stride=2),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(True),
[nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d "torch.nn.Conv2d")(8, 10, kernel_size=5),
[nn.MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d "torch.nn.MaxPool2d")(2, stride=2),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(True)
self.localization = nn.Sequential(
nn.Conv2d(1, 8, kernel_size=7),
nn.MaxPool2d(2, stride=2),
nn.ReLU(True),
nn.Conv2d(8, 10, kernel_size=5),
nn.MaxPool2d(2, stride=2),
nn.ReLU(True)
)
# Regressor for the 3 * 2 affine matrix
self.fc_loc = [nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential "torch.nn.Sequential")(
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(10 * 3 * 3, 32),
[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU "torch.nn.ReLU")(True),
[nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear "torch.nn.Linear")(32, 3 * 2)
self.fc_loc = nn.Sequential(
nn.Linear(10 * 3 * 3, 32),
nn.ReLU(True),
nn.Linear(32, 3 * 2)
)
# Initialize the weights/bias with identity transformation
self.fc_loc[2].weight.data.zero_()
self.fc_loc[2].bias.data.copy_([torch.tensor](https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor "torch.tensor")([1, 0, 0, 0, 1, 0], dtype=[torch.float](https://pytorch.org/docs/stable/tensor_attributes.html#torch.dtype "torch.dtype")))
self.fc_loc[2].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))
# Spatial transformer network forward function
def stn(self, x):
......@@ -147,8 +147,8 @@ class Net([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.
theta = self.fc_loc(xs)
theta = theta.view(-1, 2, 3)
grid = [F.affine_grid](https://pytorch.org/docs/stable/generated/torch.nn.functional.affine_grid.html#torch.nn.functional.affine_grid "torch.nn.functional.affine_grid")(theta, x.size())
x = [F.grid_sample](https://pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html#torch.nn.functional.grid_sample "torch.nn.functional.grid_sample")(x, grid)
grid = F.affine_grid(theta, x.size())
x = F.grid_sample(x, grid)
return x
......@@ -157,15 +157,15 @@ class Net([nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.
x = self.stn(x)
# Perform the usual forward pass
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")([F.max_pool2d](https://pytorch.org/docs/stable/generated/torch.nn.functional.max_pool2d.html#torch.nn.functional.max_pool2d "torch.nn.functional.max_pool2d")(self.conv1(x), 2))
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")([F.max_pool2d](https://pytorch.org/docs/stable/generated/torch.nn.functional.max_pool2d.html#torch.nn.functional.max_pool2d "torch.nn.functional.max_pool2d")(self.conv2_drop(self.conv2(x)), 2))
x = F.relu(F.max_pool2d(self.conv1(x), 2))
x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
x = x.view(-1, 320)
x = [F.relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu "torch.nn.functional.relu")(self.fc1(x))
x = [F.dropout](https://pytorch.org/docs/stable/generated/torch.nn.functional.dropout.html#torch.nn.functional.dropout "torch.nn.functional.dropout")(x, training=self.training)
x = F.relu(self.fc1(x))
x = F.dropout(x, training=self.training)
x = self.fc2(x)
return [F.log_softmax](https://pytorch.org/docs/stable/generated/torch.nn.functional.log_softmax.html#torch.nn.functional.log_softmax "torch.nn.functional.log_softmax")(x, dim=1)
return F.log_softmax(x, dim=1)
model = [Net](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module "torch.nn.Module")().to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
model = Net().to(device)
```
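The identity initialization of `fc_loc[2]` means that, before any training, the STN should pass images through essentially unchanged. The sketch below checks this in isolation, without the `Net` class. It passes `align_corners=False` explicitly to both calls, which is an assumption of this sketch; the tutorial code relies on the library default.

```python
import torch
import torch.nn.functional as F

# Identity affine parameters, matching the bias copied into fc_loc[2].
theta = torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float).view(-1, 2, 3)

x = torch.rand(1, 1, 28, 28)  # one MNIST-sized image with random content

# Build the sampling grid for the identity transform, then resample the input.
grid = F.affine_grid(theta, x.size(), align_corners=False)
resampled = F.grid_sample(x, grid, align_corners=False)

# With an identity theta the resampled image should match the input
# up to floating-point interpolation error.
identity_ok = torch.allclose(resampled, x, atol=1e-4)
print(identity_ok)
```

Starting from the identity gives the classifier ordinary inputs at the beginning of training, and the localization network then learns to move `theta` away from it.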
## Training the model
......@@ -173,45 +173,45 @@ model = [Net](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#tor
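Now, let's use the SGD algorithm to train the model. The network learns the classification task in a supervised way. At the same time, the model learns the STN automatically, in an end-to-end fashion.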
```py
[optimizer](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD") = [optim.SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD "torch.optim.SGD")([model.parameters](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters "torch.nn.Module.parameters")(), lr=0.01)
optimizer = optim.SGD(model.parameters(), lr=0.01)
def train(epoch):
[model.train](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train "torch.nn.Module.train")()
for batch_idx, (data, target) in enumerate([train_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")):
data, target = data.to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device")), target.to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
model.train()
for batch_idx, (data, target) in enumerate(train_loader):
data, target = data.to(device), target.to(device)
[optimizer.zero_grad](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.zero_grad "torch.optim.SGD.zero_grad")()
optimizer.zero_grad()
output = model(data)
loss = [F.nll_loss](https://pytorch.org/docs/stable/generated/torch.nn.functional.nll_loss.html#torch.nn.functional.nll_loss "torch.nn.functional.nll_loss")(output, target)
loss = F.nll_loss(output, target)
loss.backward()
[optimizer.step](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD.step "torch.optim.SGD.step")()
optimizer.step()
if batch_idx % 500 == 0:
print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
epoch, batch_idx * len(data), len([train_loader.dataset](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST "torchvision.datasets.MNIST")),
100. * batch_idx / len([train_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")), loss.item()))
epoch, batch_idx * len(data), len(train_loader.dataset),
100. * batch_idx / len(train_loader), loss.item()))
#
# A simple test procedure to measure the STN performances on MNIST.
#
def test():
with [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html#torch.no_grad "torch.no_grad")():
[model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval "torch.nn.Module.eval")()
with torch.no_grad():
model.eval()
test_loss = 0
correct = 0
for data, target in [test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader"):
data, target = data.to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device")), target.to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
for data, target in test_loader:
data, target = data.to(device), target.to(device)
output = model(data)
# sum up batch loss
test_loss += [F.nll_loss](https://pytorch.org/docs/stable/generated/torch.nn.functional.nll_loss.html#torch.nn.functional.nll_loss "torch.nn.functional.nll_loss")(output, target, size_average=False).item()
test_loss += F.nll_loss(output, target, size_average=False).item()
# get the index of the max log-probability
pred = output.max(1, keepdim=True)[1]
correct += pred.eq(target.view_as(pred)).sum().item()
test_loss /= len([test_loader.dataset](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST "torchvision.datasets.MNIST"))
test_loss /= len(test_loader.dataset)
print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'
.format(test_loss, correct, len([test_loader.dataset](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST "torchvision.datasets.MNIST")),
100. * correct / len([test_loader.dataset](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST "torchvision.datasets.MNIST"))))
.format(test_loss, correct, len(test_loader.dataset),
100. * correct / len(test_loader.dataset)))
```
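The `test` function sums the per-batch loss by passing `size_average=False` to `F.nll_loss`. Newer PyTorch releases deprecate that argument in favor of `reduction='sum'`, so a present-day version of the same bookkeeping might look like the sketch below. The random `log_probs` and `target` tensors are placeholders for real model output and labels.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Fake model output for a batch of 8 samples over 10 classes.
log_probs = F.log_softmax(torch.randn(8, 10), dim=1)
target = torch.randint(0, 10, (8,))

# Summed batch loss, written with the `reduction` argument.
summed = F.nll_loss(log_probs, target, reduction='sum').item()
# The default reduction is the mean over the batch.
mean = F.nll_loss(log_probs, target).item()

# Index of the max log-probability, as in the tutorial's accuracy count.
pred = log_probs.max(1, keepdim=True)[1]
correct = pred.eq(target.view_as(pred)).sum().item()
print(summed, mean * len(target), correct)
```

Summing per batch and dividing by `len(test_loader.dataset)` at the end gives a per-sample average that stays correct even when the last batch is smaller than the others.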
## Visualizing the STN results
......@@ -235,18 +235,18 @@ def convert_image_np(inp):
# the corresponding transformed batch using STN.
def visualize_stn():
with [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html#torch.no_grad "torch.no_grad")():
with torch.no_grad():
# Get a batch of test data
data = next(iter([test_loader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader "torch.utils.data.DataLoader")))[0].to([device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device "torch.device"))
data = next(iter(test_loader))[0].to(device)
input_tensor = data.cpu()
transformed_input_tensor = model.stn(data).cpu()
in_grid = convert_image_np(
[torchvision.utils.make_grid](https://pytorch.org/vision/stable/generated/torchvision.utils.make_grid.html#torchvision.utils.make_grid "torchvision.utils.make_grid")(input_tensor))
torchvision.utils.make_grid(input_tensor))
out_grid = convert_image_np(
[torchvision.utils.make_grid](https://pytorch.org/vision/stable/generated/torchvision.utils.make_grid.html#torchvision.utils.make_grid "torchvision.utils.make_grid")(transformed_input_tensor))
torchvision.utils.make_grid(transformed_input_tensor))
# Plot the results side-by-side
f, axarr = plt.subplots(1, 2)
......
......@@ -77,10 +77,10 @@ import torchvision.transforms as transforms
from PIL import Image
def transform_image(image_bytes):
my_transforms = [transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")([[transforms.Resize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Resize.html#torchvision.transforms.Resize "torchvision.transforms.Resize")(255),
[transforms.CenterCrop](https://pytorch.org/vision/stable/generated/torchvision.transforms.CenterCrop.html#torchvision.transforms.CenterCrop "torchvision.transforms.CenterCrop")(224),
[transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
[transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")(
my_transforms = transforms.Compose([transforms.Resize(255),
transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize(
[0.485, 0.456, 0.406],
[0.229, 0.224, 0.225])])
image = Image.open(io.BytesIO(image_bytes))
......@@ -104,7 +104,7 @@ with open("../_static/img/sample_file.jpeg", 'rb') as f:
from torchvision import models
# Make sure to set `weights` as `'IMAGENET1K_V1'` to use the pretrained weights:
model = [models.densenet121](https://pytorch.org/vision/stable/models/generated/torchvision.models.densenet121.html#torchvision.models.densenet121 "torchvision.models.densenet121")(weights='IMAGENET1K_V1')
model = models.densenet121(weights='IMAGENET1K_V1')
# Since we are using our model only for inference, switch to `eval` mode:
model.eval()
......@@ -175,14 +175,14 @@ def predict():
>
> app = Flask(__name__)
> imagenet_class_index = json.load(open('<PATH/TO/.json/FILE>/imagenet_class_index.json'))
> model = [models.densenet121](https://pytorch.org/vision/stable/models/generated/torchvision.models.densenet121.html#torchvision.models.densenet121 "torchvision.models.densenet121")(weights='IMAGENET1K_V1')
> model = models.densenet121(weights='IMAGENET1K_V1')
> model.eval()
>
> def transform_image(image_bytes):
> my_transforms = [transforms.Compose](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html#torchvision.transforms.Compose "torchvision.transforms.Compose")([[transforms.Resize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Resize.html#torchvision.transforms.Resize "torchvision.transforms.Resize")(255),
> [transforms.CenterCrop](https://pytorch.org/vision/stable/generated/torchvision.transforms.CenterCrop.html#torchvision.transforms.CenterCrop "torchvision.transforms.CenterCrop")(224),
> [transforms.ToTensor](https://pytorch.org/vision/stable/generated/torchvision.transforms.ToTensor.html#torchvision.transforms.ToTensor "torchvision.transforms.ToTensor")(),
> [transforms.Normalize](https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize "torchvision.transforms.Normalize")(
> my_transforms = transforms.Compose([transforms.Resize(255),
> transforms.CenterCrop(224),
> transforms.ToTensor(),
> transforms.Normalize(
> [0.485, 0.456, 0.406],
> [0.229, 0.224, 0.225])])
> image = Image.open(io.BytesIO(image_bytes))
......