Commit 9c177544, author: Aston Zhang

revise fine tuning and cifar

Parent 59f1b733
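@@ -28,29 +28,29 @@
We first download the data to `../data`. Extracting the archive yields two folders, `hotdog/train` and `hotdog/test`. Each contains two class folders, `hotdog` and `not-hotdog`, which hold the corresponding image files.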
```{.python .input n=4}
 %matplotlib inline
 import sys
 sys.path.insert(0, '..')
 import zipfile
 import gluonbook as gb
-from mxnet import nd, image, gluon, init
+from mxnet import nd, gluon, init
 from mxnet.gluon import data as gdata, loss as gloss, model_zoo, utils as gutils
 from mxnet.gluon.data.vision import transforms
 data_dir = '../data/'
 base_url = 'https://apache-mxnet.s3-accelerate.amazonaws.com/'
-fname = gluon.utils.download(
+fname = gutils.download(
     base_url+'gluon/dataset/hotdog.zip',
     path=data_dir, sha1_hash='fba480ffa8aa7e0febbb511d181409f899b9baa5')
-with zipfile.ZipFile(fname, 'r') as f:
-    f.extractall(data_dir)
+with zipfile.ZipFile(fname, 'r') as z:
+    z.extractall(data_dir)
```
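The extraction step can be tried without downloading anything. The following is a minimal sketch that uses only the standard library: it builds a tiny stand-in archive whose layout mirrors the folder structure described above, then extracts it. The helper `extract_archive` and the placeholder file names are assumptions made for illustration; they are not part of the book's code or of the real dataset.

```python
import os
import tempfile
import zipfile


def extract_archive(zip_path, target_dir):
    """Extract every member of zip_path into target_dir; return member names."""
    with zipfile.ZipFile(zip_path, 'r') as z:
        z.extractall(target_dir)
        return z.namelist()


# Build a tiny stand-in archive (hypothetical file names, not the real data).
work_dir = tempfile.mkdtemp()
archive = os.path.join(work_dir, 'hotdog.zip')
with zipfile.ZipFile(archive, 'w') as z:
    for split in ('train', 'test'):
        for label in ('hotdog', 'not-hotdog'):
            z.writestr('hotdog/%s/%s/0.png' % (split, label), b'placeholder')

names = extract_archive(archive, work_dir)
# Inspect the extracted layout: two splits, each with two class folders.
splits = sorted(os.listdir(os.path.join(work_dir, 'hotdog')))
classes = sorted(os.listdir(os.path.join(work_dir, 'hotdog', 'train')))
print(splits)
print(classes)
```

Opening the archive in a `with` block, as the commit does, closes the file handle even if extraction fails partway.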
We use the `ImageFolderDataset` class to read the data. It treats each class folder as one class and reads all the images under it.
```{.python .input n=6}
-train_imgs = gluon.data.vision.ImageFolderDataset(data_dir+'/hotdog/train')
-test_imgs = gluon.data.vision.ImageFolderDataset(data_dir+'/hotdog/test')
+train_imgs = gdata.vision.ImageFolderDataset(data_dir+'/hotdog/train')
+test_imgs = gdata.vision.ImageFolderDataset(data_dir+'/hotdog/test')
```
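To make the folder-per-class convention concrete, here is a plain-Python sketch of the idea. It is not Gluon's implementation: `list_image_folder` is a hypothetical helper, and it assumes that class indices follow the sorted order of the folder names. Empty placeholder files stand in for images.

```python
import os
import tempfile


def list_image_folder(root):
    """Return (class_names, items), where items are (path, label) pairs."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)))
    items = []
    for label, name in enumerate(class_names):
        folder = os.path.join(root, name)
        for fname in sorted(os.listdir(folder)):
            items.append((os.path.join(folder, fname), label))
    return class_names, items


# Create a throwaway directory tree with empty placeholder files.
root = tempfile.mkdtemp()
for name, count in (('hotdog', 2), ('not-hotdog', 3)):
    os.makedirs(os.path.join(root, name))
    for i in range(count):
        open(os.path.join(root, name, '%d.png' % i), 'wb').close()

class_names, items = list_image_folder(root)
labels = [label for _, label in items]
print(class_names, labels)
```

Because the items are listed one class after another, the first entries all belong to one class and the last entries to the other.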
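Below we plot the first 8 positive examples and the last 8 negative examples. As you can see, they differ in appearance as well as in height and width.
@@ -113,15 +113,15 @@ finetune_net.output.initialize(init.Xavier())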
```{.python .input n=12}
 def train(net, learning_rate, batch_size=128, epochs=5):
-    train_data = gluon.data.DataLoader(
+    train_data = gdata.DataLoader(
         train_imgs.transform_first(train_augs), batch_size, shuffle=True)
-    test_data = gluon.data.DataLoader(
+    test_data = gdata.DataLoader(
         test_imgs.transform_first(test_augs), batch_size)
     ctx = gb.try_all_gpus()
     net.collect_params().reset_ctx(ctx)
     net.hybridize()
-    loss = gluon.loss.SoftmaxCrossEntropyLoss()
+    loss = gloss.SoftmaxCrossEntropyLoss()
     trainer = gluon.Trainer(net.collect_params(), 'sgd', {
         'learning_rate': learning_rate, 'wd': 0.001})
     gb.train(train_data, test_data, net, loss, trainer, ctx, epochs)
@@ -136,7 +136,7 @@ train(finetune_net, 0.01)
For comparison, we train the same model with all parameters initialized to random values. We use a larger learning rate to speed up convergence.
```{.python .input n=14}
-scratch_net = gluon.model_zoo.vision.resnet18_v2(classes=2)
+scratch_net = model_zoo.vision.resnet18_v2(classes=2)
 scratch_net.initialize(init=init.Xavier())
 train(scratch_net, 0.1)
```
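The `train` function passes a `learning_rate` and a weight decay `wd` of 0.001 to an SGD trainer. The fine-tuned network is trained with a learning rate of 0.01 and the randomly initialized one with 0.1. The sketch below shows how these two numbers enter a single update, assuming the plain rule `w <- w - lr * (grad + wd * w)`. It ignores momentum, gradient rescaling, and mini-batching, and `sgd_step` is a hypothetical helper rather than an MXNet function.

```python
def sgd_step(weights, grads, learning_rate, wd=0.0):
    """One SGD step with L2 weight decay: w <- w - lr * (g + wd * w)."""
    return [w - learning_rate * (g + wd * w) for w, g in zip(weights, grads)]


weights = [1.0, -2.0]
grads = [0.5, 0.5]

# Same gradients, two learning rates: 0.01 (fine-tuning) and 0.1 (scratch).
small = sgd_step(weights, grads, learning_rate=0.01, wd=0.001)
large = sgd_step(weights, grads, learning_rate=0.1, wd=0.001)
print(small)
print(large)
```

With identical gradients, the step taken at 0.1 is ten times the step taken at 0.01. A small step keeps pretrained weights close to their starting values, while a large step moves random weights faster.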
@@ -53,6 +53,7 @@ import datetime
 import gluonbook as gb
 from mxnet import autograd, gluon, init, nd
 from mxnet.gluon import data as gdata, nn, loss as gloss
+from mxnet.gluon.data.vision import transforms
 import numpy as np
 import os
 import pandas as pd
@@ -64,9 +65,9 @@ import shutil
 demo = True
 if demo:
     import zipfile
-    for fin in ['train_tiny.zip', 'test_tiny.zip', 'trainLabels.csv.zip']:
-        with zipfile.ZipFile('../data/kaggle_cifar10/' + fin, 'r') as zin:
-            zin.extractall('../data/kaggle_cifar10/')
+    for f in ['train_tiny.zip', 'test_tiny.zip', 'trainLabels.csv.zip']:
+        with zipfile.ZipFile('../data/kaggle_cifar10/' + f, 'r') as z:
+            z.extractall('../data/kaggle_cifar10/')
```
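### Organizing the Dataset
@@ -150,31 +151,31 @@ reorg_cifar10_data(data_dir, label_file, train_dir, test_dir, input_dir,
To avoid overfitting, we use `transforms` here to augment the dataset. For example, adding `transforms.RandomFlipLeftRight()` randomly mirrors each image horizontally. We also use `transforms.Normalize()` to [standardize](../chapter_supervised-learning/kaggle-gluon-kfold.md) each of the three RGB channels of the color images. Below we list all the operations that might be used. Whether to call each one depends on your needs, and all of their parameters are adjustable.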
```{.python .input n=4}
-transform_train = gdata.vision.transforms.Compose([
-    # gdata.vision.transforms.CenterCrop(32),
-    # gdata.vision.transforms.RandomFlipTopBottom(),
-    # gdata.vision.transforms.RandomColorJitter(brightness=0.0, contrast=0.0,
-    #                                           saturation=0.0, hue=0.0),
-    # gdata.vision.transforms.RandomLighting(0.0),
-    # gdata.vision.transforms.Cast('float32'),
-    # gdata.vision.transforms.Resize(32),
+transform_train = transforms.Compose([
+    transforms.CenterCrop(32),
+    transforms.RandomFlipTopBottom(),
+    transforms.RandomColorJitter(brightness=0.0, contrast=0.0,
+                                 saturation=0.0, hue=0.0),
+    transforms.RandomLighting(0.0),
+    transforms.Cast('float32'),
+    transforms.Resize(32),
     # Randomly crop according to scale and ratio, then resize to a 32 x 32 square.
-    gdata.vision.transforms.RandomResizedCrop(32, scale=(0.08, 1.0),
+    transforms.RandomResizedCrop(32, scale=(0.08, 1.0),
                                  ratio=(3.0/4.0, 4.0/3.0)),
     # Randomly flip the image left to right.
-    gdata.vision.transforms.RandomFlipLeftRight(),
+    transforms.RandomFlipLeftRight(),
     # Scale pixel values into (0, 1) and change the layout from "height * width * channel" to "channel * height * width".
-    gdata.vision.transforms.ToTensor(),
+    transforms.ToTensor(),
     # Standardize each channel of the image.
-    gdata.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
+    transforms.Normalize([0.4914, 0.4822, 0.4465],
                          [0.2023, 0.1994, 0.2010])
 ])
 # At test time, no augmentation other than standardization is applied to the images.
-transform_test = gdata.vision.transforms.Compose([
-    gdata.vision.transforms.ToTensor(),
-    gdata.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
+transform_test = transforms.Compose([
+    transforms.ToTensor(),
+    transforms.Normalize([0.4914, 0.4822, 0.4465],
                          [0.2023, 0.1994, 0.2010])
 ])
```
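The last two steps of both pipelines, `ToTensor` and `Normalize`, are simple arithmetic. The following pure-Python sketch illustrates them on nested lists, assuming 0 to 255 pixel values in "height * width * channel" layout. The helpers `to_tensor` and `normalize` are hypothetical stand-ins written for illustration and do not reproduce the MXNet implementations. The mean and standard deviation values are the ones used in the code above.

```python
def to_tensor(image_hwc):
    """Scale 0-255 pixel values to [0, 1] and reorder H*W*C into C*H*W."""
    height, width = len(image_hwc), len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [[[image_hwc[y][x][c] / 255.0 for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


def normalize(image_chw, mean, std):
    """Standardize each channel: (value - mean[c]) / std[c]."""
    return [[[(v - mean[c]) / std[c] for v in row] for row in channel]
            for c, channel in enumerate(image_chw)]


mean = [0.4914, 0.4822, 0.4465]
std = [0.2023, 0.1994, 0.2010]

# A 1 x 2 RGB image: one black pixel and one white pixel.
image = [[[0, 0, 0], [255, 255, 255]]]
tensor = to_tensor(image)
out = normalize(tensor, mean, std)
# Shape is now (channels, height, width).
print(len(out), len(out[0]), len(out[0][0]))
print(out[0][0])
```

After standardization, a black pixel maps to a negative value and a white pixel to a positive one in every channel, so the inputs are roughly centered around zero.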