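" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Implementing an Image Classification Application\n", "## Overview\n", "The following hands-on example walks you through the basic features of MindSpore. For most users, completing the whole example takes 20 to 30 minutes.\n", "\n", "This example implements a simple image classification function. The overall process is as follows:\n", "\n", "1. Process the required dataset. The MNIST dataset is used here.\n", "\n", "2. Define a network. The LeNet network is used here.\n", "\n", "3. Define the loss function and optimizer.\n", "\n", "4. Load the dataset and train the network. After training, check the results and save the model file.\n", "\n", "5. Load the saved model and perform inference.\n", "\n", "6. Validate the model: load the test dataset and the trained model, and verify the accuracy of the results." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "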
Note: the complete, runnable sample code is available at https://gitee.com/mindspore/docs/blob/master/tutorials/tutorial_code/lenet.py
Training dataset: {"http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz", "http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz"}
Test dataset: {"http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz", "http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz"}
We use the following code to query the Jupyter working directory.
Place the training dataset in Jupyter working directory + \MNIST_Data\train\. The train folder should then contain two files: train-images-idx3-ubyte and train-labels-idx1-ubyte.
Place the test dataset in Jupyter working directory + \MNIST_Data\test\. The test folder should then contain two files: t10k-images-idx3-ubyte and t10k-labels-idx1-ubyte.
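The directory layout described above can be checked with a few lines of standard-library Python. This is only a minimal sketch: `check_mnist_layout` is a hypothetical helper that is not part of the tutorial code, and it assumes the `.gz` archives have already been decompressed into the `MNIST_Data` folder named in the text.

```python
import os

# Expected files, taken from the directory layout described above.
EXPECTED_FILES = {
    "train": ["train-images-idx3-ubyte", "train-labels-idx1-ubyte"],
    "test": ["t10k-images-idx3-ubyte", "t10k-labels-idx1-ubyte"],
}

def check_mnist_layout(root):
    """Return the expected dataset files that are missing under root (hypothetical helper)."""
    missing = []
    for subdir, names in EXPECTED_FILES.items():
        for name in names:
            path = os.path.join(root, subdir, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing

# The Jupyter working directory is the current working directory of the kernel.
work_dir = os.getcwd()
print("Jupyter working directory:", work_dir)

missing_files = check_mnist_layout(os.path.join(work_dir, "MNIST_Data"))
print("Missing files:", missing_files if missing_files else "none")
```

An empty result means that both folders contain the two files the tutorial expects.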
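1. Define the dataset.
2. Define the parameters required for data augmentation and processing.
3. Generate the corresponding data augmentation operations from those parameters.
4. Use the map() function to apply the data operations to the dataset.
5. Process the generated dataset.
batch_size: the number of samples in each batch. Here each batch contains 32 samples.
repeat_size: the number of times the dataset is replicated.\n", "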
The structure is shown in the following figure:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### LeNet5 Structure Diagram" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"LeNet5\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before building LeNet5, we need to initialize the fully connected layers and the convolution layers.\n", "\n", "TruncatedNormal: a parameter initialization method. MindSpore supports several parameter initialization methods, such as TruncatedNormal, Normal, and Uniform. For details, see the description of the mindspore.common.initializer module in the MindSpore API.\n", "\n", "The initialization sample code is as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mindspore.nn as nn\n", "from mindspore.common.initializer import TruncatedNormal\n", "\n", "# Initialize 2D convolution function\n", "def conv(in_channels, out_channels, kernel_size, stride=1, padding=0):\n", "    \"\"\"Conv layer weight initial.\"\"\"\n", "    weight = weight_variable()\n", "    return nn.Conv2d(in_channels, out_channels,\n", "                     kernel_size=kernel_size, stride=stride, padding=padding,\n", "                     weight_init=weight, has_bias=False, pad_mode=\"valid\")\n", "\n", "# Initialize full connection layer\n", "def fc_with_initialize(input_channels, out_channels):\n", "    \"\"\"Fc layer weight initial.\"\"\"\n", "    weight = weight_variable()\n", "    bias = weight_variable()\n", "    return nn.Dense(input_channels, out_channels, weight, bias)\n", "\n", "# Set truncated normal distribution\n", "def weight_variable():\n", "    \"\"\"Weight initial.\"\"\"\n", "    return TruncatedNormal(0.02)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To define a neural network with MindSpore, inherit from mindspore.nn.cell.Cell. Cell is the base class of all neural networks (Conv2d and so on).\n", "\n", "Each layer of the network must be defined in advance in the \\_\\_init\\_\\_() method, and the forward pass is then built by defining the construct() method. Following the LeNet5 structure, the layers are defined as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class LeNet5(nn.Cell):\n", "    \"\"\"Lenet network structure.\"\"\"\n", "    # define the operator required\n", "    def __init__(self):\n", "        super(LeNet5, self).__init__()\n", "        self.batch_size = 32 # 32 pictures in each group\n", "        self.conv1 = conv(1, 6, 5) # Convolution 
layer 1: 1 input channel, 6 output channels, 5 * 5 convolution kernel\n", "        self.conv2 = conv(6, 16, 5) # Convolution layer 2: 6 input channels, 16 output channels, 5 * 5 convolution kernel\n", "        self.fc1 = fc_with_initialize(16 * 5 * 5, 120)\n", "        self.fc2 = fc_with_initialize(120, 84)\n", "        self.fc3 = fc_with_initialize(84, 10)\n", "        self.relu = nn.ReLU()\n", "        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n", "        self.flatten = nn.Flatten()\n", "\n", "    # use the preceding operators to construct networks\n", "    def construct(self, x):\n", "        x = self.conv1(x) # Convolution layer: 1*32*32-->6*28*28\n", "        x = self.relu(x) # Activation layer, shape unchanged\n", "        x = self.max_pool2d(x) # Pooling layer: 6*28*28-->6*14*14\n", "        x = self.conv2(x) # Convolution layer: 6*14*14-->16*10*10\n", "        x = self.relu(x) # Activation layer, shape unchanged\n", "        x = self.max_pool2d(x) # Pooling layer: 16*10*10-->16*5*5\n", "        x = self.flatten(x) # Flatten 16*5*5 into a vector of 400 values\n", "        x = self.fc1(x) # Fully connected layer: 400-->120\n", "        x = self.relu(x) # Activation layer\n", "        x = self.fc2(x) # Fully connected layer: 120-->84\n", "        x = self.relu(x) # Activation layer\n", "        x = self.fc3(x) # Fully connected layer: 84-->10\n", "        return x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After the network is built, we print all the parameters of LeNet5 to inspect them." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "network = LeNet5()\n", "print(network)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "param = network.trainable_params()\n", "param" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Building the Training Network and Training" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the neural network is built, we can start building the training network. The model training function is Model.train(), and its main parameters are:\n", "
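1. The number of epochs, epoch size (each epoch iterates over 1875 batches of images);
2. The dataset, ds_train;
3. The callbacks, which can include ModelCheckpoint, LossMonitor, SummaryStep, and other Callback objects that monitor the model (ckpoint_cb in this example);
4. The underlying data channel, dataset_sink_mode. This parameter defaults to True and must be set to False here, because the feature is available only on the Ascend AI processor.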
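The figure of 1875 batches per epoch follows from the dataset size and the batch size. The arithmetic below is a small sketch: the MNIST training set contains 60000 images, and the other values (batch_size of 32, repeat_size and epoch_size of 1, a checkpoint every 125 steps) are the ones used in this tutorial.

```python
# Values used in this tutorial; MNIST provides 60000 training images.
num_train_images = 60000
batch_size = 32
repeat_size = 1
epoch_size = 1
save_checkpoint_steps = 125

# Number of batches (steps) processed in one epoch.
steps_per_epoch = num_train_images * repeat_size // batch_size
# Total number of training steps over all epochs.
total_steps = steps_per_epoch * epoch_size
# Number of times the step-based checkpoint interval is reached.
checkpoint_triggers = total_steps // save_checkpoint_steps

print("steps per epoch:", steps_per_epoch)
print("total steps:", total_steps)
print("checkpoint triggers:", checkpoint_triggers)
```

With these settings the checkpoint interval is reached 15 times in one epoch, which stays within the keep_checkpoint_max value of 16 used later in the tutorial.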
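Loss function: also called the objective function, it measures how far the predicted values are from the actual values. Deep learning reduces the value of the loss function through repeated iteration. A well-chosen loss function can effectively improve the performance of the model.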
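To make the idea of a loss function concrete, the sketch below computes a softmax cross-entropy loss with integer (sparse) labels and mean reduction in plain Python. It only illustrates the general formula; it is not MindSpore's implementation, and the function names are chosen for this example.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_softmax_cross_entropy(batch_logits, labels):
    """Mean of -log(probability assigned to the true class) over the batch."""
    losses = []
    for logits, label in zip(batch_logits, labels):
        probs = softmax(logits)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)

# A model that cannot tell the 10 digits apart gives every class the same score.
undecided = [[0.0] * 10]
# A model that strongly favors the correct class (index 3).
confident = [[0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]

print(sparse_softmax_cross_entropy(undecided, [3]))
print(sparse_softmax_cross_entropy(confident, [3]))
```

For ten classes with equal scores the loss equals ln(10), roughly 2.30, which is in the same range as the loss values seen at the very start of training. The loss shrinks as the score of the correct class grows.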
Optimizer: used to minimize the loss function, which improves the model during training.\n", "
Define the loss function.\n", "
The loss functions supported by MindSpore include SoftmaxCrossEntropyWithLogits, L1Loss, and MSELoss. The SoftmaxCrossEntropyWithLogits loss function is used here." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "import os\n", "\n", "os.system('del/f/s/q *.ckpt *.meta') # Clean up files from earlier runs (Windows command)\n", "lr = 0.01 # learning rate\n", "momentum = 0.9 # momentum coefficient\n", "\n", "# create the network\n", "network = LeNet5()\n", "\n", "# define the optimizer\n", "net_opt = nn.Momentum(network.trainable_params(), lr, momentum)\n", "\n", "\n", "# define the loss function\n", "net_loss = SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True, reduction='mean')\n", "# define the model: group layers into an object with training and evaluation features\n", "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()} )\n", "\n", "\n", "epoch_size = 1\n", "mnist_path = \"./MNIST_Data\"\n", "\n", "# save a checkpoint every 125 steps and keep at most 16 checkpoint files\n", "config_ck = CheckpointConfig(save_checkpoint_steps=125, keep_checkpoint_max=16)\n", "\n", "# save the network model and parameters for subsequent fine-tuning\n", "ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_lenet\", config=config_ck)\n", "# step_loss dictionary for saving loss value and step number information\n", "step_loss = {\"step\": [], \"loss_value\": []}\n", "# save the steps and loss value\n", "step_loss_info = Step_loss_info()\n", "\n", "repeat_size = 1\n", "train_net(model, epoch_size, mnist_path, repeat_size, ckpoint_cb, step_loss_info)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After training, multiple model files are generated in the Jupyter working path. The file names follow the pattern checkpoint_{network name}-{epoch number}_{step number}.ckpt." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Viewing How the Loss Value Changes with the Training Steps" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "steps = step_loss[\"step\"]\n", "loss_value = step_loss[\"loss_value\"]\n", "steps = list(map(int, steps))\n", "loss_value = list(map(float, loss_value))\n", "plt.plot(steps, 
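loss_value, color=\"red\")\n", "plt.xlabel(\"Steps\")\n", "plt.ylabel(\"Loss_value\")\n", "plt.title(\"Loss function value change chart\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The chart above shows roughly three stages:\n", "\n", "Stage 1: at the start of training, the loss value fluctuates around 2.2 and training brings little visible benefit.\n", "\n", "Stage 2: at some point in training, the loss value drops quickly and the benefit of training increases sharply.\n", "\n", "Stage 3: after the loss value has converged to a small value, it oscillates within a narrow range and cannot approach 0. Further training brings no clear benefit, so training ends here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Validating Model Accuracy on Test Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main steps for building the test network are: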
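dataset_sink_mode is the dataset sink mode. It is supported only on the Ascend AI processor platform, so it is set to False here.
2. Extract the image data.
3. Use the model.predict() function to predict the digit that each image represents. Note that predict returns one value for each of the digits 0-9, which this tutorial treats as the probability of that digit.
4. Call plot_pie() to display the predicted value for each digit. Digits with a negative value are dropped.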
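The last two steps can be sketched without MindSpore or matplotlib. The helpers below are hypothetical and only illustrate the idea described above: pick the digit with the highest value, drop the digits whose value is not positive, and normalize the rest into pie-chart fractions. The scores are made-up values, not real model output, and the actual plot_pie() in the tutorial may differ in detail.

```python
def predicted_digit(scores):
    """Return the digit (index) with the highest score."""
    return max(range(len(scores)), key=lambda digit: scores[digit])

def pie_fractions(scores):
    """Drop non-positive scores and normalize the rest to sum to 1 (hypothetical helper)."""
    kept = {digit: s for digit, s in enumerate(scores) if s > 0}
    if not kept:
        return {}  # nothing to draw when no score is positive
    total = sum(kept.values())
    return {digit: s / total for digit, s in kept.items()}

# Made-up scores for the digits 0-9, standing in for one row of predict() output.
scores = [-1.5, 0.5, -0.2, 4.0, -3.0, 1.5, -0.7, -2.0, 0.0, -1.0]

print("predicted digit:", predicted_digit(scores))
print("pie fractions:", pie_fractions(scores))
```

Only the digits 1, 3, and 5 keep a slice in this example, and the largest slice belongs to the predicted digit.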