{
 "cells": [
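  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# <center>Getting Started: Handwritten Digit Classification Tutorial</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Implementing an Image Classification Application\n",
    "## Overview\n",
    "This tutorial walks through a practical example to introduce the basic features of MindSpore. Most users can complete the whole example in 20 to 30 minutes.\n",
    "\n",
    "The example implements a simple image classification task. The overall workflow is as follows:\n",
    "\n",
    "1. Prepare the required dataset. The MNIST dataset is used here.\n",
    "\n",
    "2. Define a network. The LeNet network is used here.\n",
    "\n",
    "3. Define the loss function and the optimizer.\n",
    "\n",
    "4. Load the dataset and train the network. After training, check the results and save the model file.\n",
    "\n",
    "5. Load the saved model and run inference.\n",
    "\n",
    "6. Validate the model: load the test dataset and the trained model, then verify the accuracy."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note:<br/>The complete runnable sample code is available at https://gitee.com/mindspore/docs/blob/master/tutorials/tutorial_code/lenet.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Downloading the Dataset for Training"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Method 1:\n",
    "Download the files from the following URLs, decompress them, and place them in the Jupyter working directory:<br/>Training dataset: {\"http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\", \"http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\"}\n",
    "<br/>Test dataset: {\"http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\", \"http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\"}<br/>Use the following code to find the Jupyter working directory."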
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "os.getcwd()"
   ]
  },
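  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Place the training dataset in the \\MNIST_Data\\train\\ folder under the Jupyter working directory. The train folder should then contain two files, train-images-idx3-ubyte and train-labels-idx1-ubyte. <br/>Place the test dataset in the \\MNIST_Data\\test\\ folder under the Jupyter working directory. The test folder should then contain two files, t10k-images-idx3-ubyte and t10k-labels-idx1-ubyte."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Method 2:\n",
    "Run the code below directly. It downloads and decompresses the dataset automatically; depending on network conditions, the whole process may take a few minutes."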
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Network request module, data download module, decompression module\n",
    "import os\n",
    "import urllib.request\n",
    "from urllib.parse import urlparse\n",
    "import gzip\n",
    "\n",
    "def unzipfile(gzip_path):\n",
    "    \"\"\"unzip dataset file\n",
    "    Args:\n",
    "        gzip_path: dataset file path\n",
    "    \"\"\"\n",
    "    # Context managers close both files even if decompression fails\n",
    "    with gzip.open(gzip_path, 'rb') as gz_file, open(gzip_path.replace('.gz', ''), 'wb') as open_file:\n",
    "        open_file.write(gz_file.read())\n",
    "\n",
    "def download_dataset():\n",
    "    \"\"\"Download the dataset from http://yann.lecun.com/exdb/mnist/.\"\"\"\n",
    "    print(\"******Downloading the MNIST dataset******\")\n",
    "    train_path = \"./MNIST_Data/train/\"\n",
    "    test_path = \"./MNIST_Data/test/\"\n",
    "    # Create each directory independently so that neither is skipped when only one already exists\n",
    "    os.makedirs(train_path, exist_ok=True)\n",
    "    os.makedirs(test_path, exist_ok=True)\n",
    "    train_url = {\"http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\", \"http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\"}\n",
    "    test_url = {\"http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\", \"http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\"}\n",
    "\n",
    "    for url in train_url:\n",
    "        url_parse = urlparse(url)\n",
    "        # split the file name from url\n",
    "        file_name = os.path.join(train_path, url_parse.path.split('/')[-1])\n",
    "        # skip files that were already downloaded and decompressed\n",
    "        if not os.path.exists(file_name.replace('.gz', '')):\n",
    "            urllib.request.urlretrieve(url, file_name)\n",
    "            unzipfile(file_name)\n",
    "            os.remove(file_name)\n",
    "\n",
    "    for url in test_url:\n",
    "        url_parse = urlparse(url)\n",
    "        # split the file name from url\n",
    "        file_name = os.path.join(test_path, url_parse.path.split('/')[-1])\n",
    "        # skip files that were already downloaded and decompressed\n",
    "        if not os.path.exists(file_name.replace('.gz', '')):\n",
    "            urllib.request.urlretrieve(url, file_name)\n",
    "            unzipfile(file_name)\n",
    "            os.remove(file_name)\n",
    "\n",
    "download_dataset()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This completes downloading and decompressing the dataset."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Processing the MNIST Dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will later train a convolutional neural network, LeNet, on this dataset, and LeNet places requirements on the format of its training data. So we first inspect the data in the dataset, and then build a matching conversion function that turns the data into the form required for training."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "LeNet is not described further here. For details about the LeNet network, see http://yann.lecun.com/exdb/lenet/ ."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Viewing the Raw Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mindspore import context\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib\n",
    "import numpy as np\n",
    "import mindspore.dataset as ds\n",
    "\n",
    "context.set_context(mode=context.GRAPH_MODE, device_target=\"CPU\") # Windows version, set to use CPU for graph calculation\n",
    "train_data_path = \"./MNIST_Data/train\"\n",
    "test_data_path = \"./MNIST_Data/test\"\n",
    "mnist_ds = ds.MnistDataset(train_data_path) # Load training dataset\n",
    "print('The type of mnist_ds:', type(mnist_ds))\n",
    "print(\"Number of pictures contained in the mnist_ds:\",mnist_ds.get_dataset_size()) # 60000 pictures in total\n",
    "\n",
    "dic_ds = mnist_ds.create_dict_iterator() # Convert dataset to dictionary type\n",
    "item = dic_ds.get_next()\n",
    "img = item[\"image\"]\n",
    "label = item[\"label\"]\n",
    "\n",
    "print(\"The item of mnist_ds:\", item.keys()) # Take a single data to view the data structure, including two keys, image and label\n",
    "print(\"Tensor of image in item:\", img.shape) # View the tensor of image (28,28,1)\n",
    "print(\"The label of item:\", label)\n",
    "\n",
    "plt.imshow(np.squeeze(img))\n",
    "plt.title(\"number:%s\"% item[\"label\"])\n",
    "plt.show()"
   ]
  },
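  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From the output above, the training files train-images-idx3-ubyte and train-labels-idx1-ubyte correspond to 60,000 images and 60,000 digit labels. After loading, create_dict_iterator() converts the dataset into dictionary form. Taking one item shows a dictionary with the keys image and label: image is a tensor of shape (height 28, width 28, channel 1), and label is the digit shown in that image."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data Processing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Datasets are crucial for training; a good dataset can effectively improve training accuracy and efficiency. Before loading a dataset, we usually process it in some way.\n",
    "#### Defining the Dataset and Data Operations\n",
    "We define a function create_dataset() to create the dataset. In this function, we define the data augmentation and processing operations to perform:\n",
    "<br/>1. Define the dataset.\n",
    "<br/>2. Define the parameters needed for data augmentation and processing.\n",
    "<br/>3. Generate the corresponding data augmentation operations from those parameters.\n",
    "<br/>4. Use the map() function to apply the data operations to the dataset.\n",
    "<br/>5. Process the generated dataset."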
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Data processing module\n",
    "import mindspore.dataset.transforms.vision.c_transforms as CV\n",
    "import mindspore.dataset.transforms.c_transforms as C\n",
    "from mindspore.dataset.transforms.vision import Inter\n",
    "from mindspore.common import dtype as mstype\n",
    "\n",
    "\n",
    "def create_dataset(data_path, batch_size=32, repeat_size=1,\n",
    "                   num_parallel_workers=1):\n",
    "    \"\"\" create dataset for train or test\n",
    "    Args:\n",
    "        data_path: Data path\n",
    "        batch_size: The number of data records in each group\n",
    "        repeat_size: The number of replicated data records\n",
    "        num_parallel_workers: The number of parallel workers\n",
    "    \"\"\"\n",
    "    # define dataset\n",
    "    mnist_ds = ds.MnistDataset(data_path)\n",
    "\n",
    "    # Define the parameters needed for resizing, rescaling and normalization\n",
    "    resize_height, resize_width = 32, 32\n",
    "    rescale = 1.0 / 255.0\n",
    "    shift = 0.0\n",
    "    rescale_nml = 1 / 0.3081\n",
    "    shift_nml = -1 * 0.1307 / 0.3081\n",
    "\n",
    "    # Generate the corresponding data operations from the parameters\n",
    "    resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)  # Resize images to (32, 32) by bilinear interpolation\n",
    "    rescale_nml_op = CV.Rescale(rescale_nml, shift_nml) # normalize images\n",
    "    rescale_op = CV.Rescale(rescale, shift) # rescale images\n",
    "    hwc2chw_op = CV.HWC2CHW() # change shape from (height, width, channel) to (channel, height, width) to fit network.\n",
    "    type_cast_op = C.TypeCast(mstype.int32) # change data type of label to int32 to fit network\n",
    "\n",
    "    # Use map() to apply the operations to the dataset\n",
    "    mnist_ds = mnist_ds.map(input_columns=\"label\", operations=type_cast_op, num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(input_columns=\"image\", operations=resize_op, num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(input_columns=\"image\", operations=rescale_op, num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(input_columns=\"image\", operations=rescale_nml_op, num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(input_columns=\"image\", operations=hwc2chw_op, num_parallel_workers=num_parallel_workers)\n",
    "    # Process the generated dataset\n",
    "    buffer_size = 10000\n",
    "    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)  # 10000 as in LeNet train script\n",
    "    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)\n",
    "    mnist_ds = mnist_ds.repeat(repeat_size)\n",
    "\n",
    "    return mnist_ds\n"
   ]
  },
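  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Where:<br/>\n",
    "batch_size: the number of data records in each group (batch); each group currently contains 32 records.\n",
    "<br/>repeat_size: the number of times the dataset is replicated.\n",
    "<br/>The shuffle and batch operations are performed first, followed by repeat, which ensures that data is not repeated within one epoch."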
   ]
  },
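  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of what the two Rescale steps in create_dataset() compute, the sketch below reproduces the same arithmetic with plain NumPy instead of MindSpore. It assumes that each Rescale(rescale, shift) step computes output = input * rescale + shift, and the sample pixel values are made up for the example. Under that assumption, the two steps together are equivalent to (pixel / 255 - 0.1307) / 0.3081, that is, subtracting a mean of 0.1307 and dividing by a standard deviation of 0.3081."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Hypothetical sample of raw 8-bit pixel values (not taken from MNIST)\n",
    "pixels = np.array([0.0, 128.0, 255.0], dtype=np.float32)\n",
    "\n",
    "# Constants copied from create_dataset()\n",
    "rescale, shift = 1.0 / 255.0, 0.0\n",
    "rescale_nml, shift_nml = 1 / 0.3081, -1 * 0.1307 / 0.3081\n",
    "\n",
    "# Two-step form: scale to [0, 1], then normalize (assumed formula: x * rescale + shift)\n",
    "two_step = (pixels * rescale + shift) * rescale_nml + shift_nml\n",
    "\n",
    "# One-step form: subtract the mean, then divide by the standard deviation\n",
    "one_step = (pixels / 255.0 - 0.1307) / 0.3081\n",
    "\n",
    "print(two_step)\n",
    "print(np.allclose(two_step, one_step, atol=1e-5))"
   ]
  },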
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let's look at the dataset that will be used for training."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, check how many groups of data the dataset contains."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "datas = create_dataset(train_data_path) # Process the train dataset\n",
    "print('Number of groups in the dataset:', datas.get_dataset_size()) # Number of query dataset groups"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, take one group of data and look at the keys it contains, the tensor shape of the image data, and the values of the labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "data = datas.create_dict_iterator().get_next() # Take a set of datasets\n",
    "print(data.keys())\n",
    "images = data[\"image\"] # Take out the image data in this dataset\n",
    "labels = data[\"label\"] # Take out the label (subscript) of this data set\n",
    "print('Tensor of image:', images.shape) # Query the tensor of images in each dataset (32,1,32,32)\n",
    "print('labels:', labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, view the images and the values of their corresponding labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "count = 1\n",
    "for i in images:\n",
    "    plt.subplot(4, 8, count) \n",
    "    plt.imshow(np.squeeze(i))\n",
    "    plt.title('num:%s'%labels[count-1])\n",
    "    plt.xticks([])\n",
    "    count += 1\n",
    "    plt.axis(\"off\")\n",
    "plt.show() # Print a total of 32 pictures in the group"
   ]
  },
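  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three queries above show the transformed data: the dataset is divided into 1875 groups, each group contains 32 images, and each image is 32×32 pixels. With the data fully prepared, we can move on to training."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Building the Neural Network"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Handwritten character recognition is commonly approached with a convolutional neural network (CNN). The classic example is the LeNet5 architecture created by Yann LeCun in 1998,<br/>which consists of:<br/>1. Input layer;<br/>2. Convolutional layer C1;<br/>3. Pooling layer S2;<br/>4. Convolutional layer C3;<br/>5. Pooling layer S4;<br/>6. Fully connected layer F6;<br/>7. Fully connected layer;<br/>8. Fully connected layer OUTPUT.<br/>The structure is shown in the figure below:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### LeNet5 Structure Diagram"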
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://img-blog.csdnimg.cn/20190305161316701.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L21tbV9qc3c=,size_16,color_FFFFFF,t_70\" alt=\"LeNet5\">"
   ]
  },
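  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before building LeNet5, we need to initialize the fully connected layers and the convolutional layers.\n",
    "\n",
    "TruncatedNormal: a parameter initialization method. MindSpore supports several parameter initialization methods, including TruncatedNormal, Normal, and Uniform. For details, see the description of the mindspore.common.initializer module in the MindSpore API.\n",
    "\n",
    "Sample initialization code is as follows:"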
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import mindspore.nn as nn\n",
    "from mindspore.common.initializer import TruncatedNormal\n",
    "\n",
    "# Initialize 2D convolution function\n",
    "def conv(in_channels, out_channels, kernel_size, stride=1, padding=0):\n",
    "    \"\"\"Conv layer weight initial.\"\"\"\n",
    "    weight = weight_variable()\n",
    "    return nn.Conv2d(in_channels, out_channels,\n",
    "                     kernel_size=kernel_size, stride=stride, padding=padding,\n",
    "                     weight_init=weight, has_bias=False, pad_mode=\"valid\")\n",
    "\n",
    "# Initialize full connection layer\n",
    "def fc_with_initialize(input_channels, out_channels):\n",
    "    \"\"\"Fc layer weight initial.\"\"\"\n",
    "    weight = weight_variable()\n",
    "    bias = weight_variable()\n",
    "    return nn.Dense(input_channels, out_channels, weight, bias)\n",
    "\n",
    "# Set truncated normal distribution\n",
    "def weight_variable():\n",
    "    \"\"\"Weight initial.\"\"\"\n",
    "    return TruncatedNormal(0.02)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To define a neural network with MindSpore, inherit from mindspore.nn.cell.Cell. Cell is the base class of all neural networks (such as Conv2d).\n",
    "\n",
    "Each layer of the network must be defined in advance in the \\_\\_init\\_\\_() method, and the forward construction of the network is then completed by defining the construct() method. Following the LeNet5 structure, the layers are defined as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class LeNet5(nn.Cell):\n",
    "    \"\"\"Lenet network structure.\"\"\"\n",
    "    # define the operator required\n",
    "    def __init__(self):\n",
    "        super(LeNet5, self).__init__()\n",
    "        self.batch_size = 32 # 32 pictures in each group\n",
    "        self.conv1 = conv(1, 6, 5) # Convolution layer 1: 1 input channel, 6 output channels, 5*5 kernel\n",
    "        self.conv2 = conv(6, 16, 5) # Convolution layer 2: 6 input channels, 16 output channels, 5*5 kernel\n",
    "        self.fc1 = fc_with_initialize(16 * 5 * 5, 120)\n",
    "        self.fc2 = fc_with_initialize(120, 84)\n",
    "        self.fc3 = fc_with_initialize(84, 10)\n",
    "        self.relu = nn.ReLU()\n",
    "        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n",
    "        self.flatten = nn.Flatten()\n",
    "\n",
    "    # use the preceding operators to construct networks\n",
    "    def construct(self, x):\n",
    "        x = self.conv1(x) # Convolution: 1*32*32-->6*28*28\n",
    "        x = self.relu(x) # ReLU activation (shape unchanged)\n",
    "        x = self.max_pool2d(x) # Pooling: 6*28*28-->6*14*14\n",
    "        x = self.conv2(x) # Convolution: 6*14*14-->16*10*10\n",
    "        x = self.relu(x) # ReLU activation (shape unchanged)\n",
    "        x = self.max_pool2d(x) # Pooling: 16*10*10-->16*5*5\n",
    "        x = self.flatten(x) # Flatten: 16*5*5-->400\n",
    "        x = self.fc1(x) # Fully connected: 400-->120\n",
    "        x = self.relu(x) # ReLU activation\n",
    "        x = self.fc2(x) # Fully connected: 120-->84\n",
    "        x = self.relu(x) # ReLU activation\n",
    "        x = self.fc3(x) # Fully connected: 84-->10\n",
    "        return x"
   ]
  },
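  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The input size of fc1, 16 * 5 * 5, follows from how each layer shrinks the feature map. The sketch below works through that arithmetic in plain Python. It assumes the usual output-size formula for an unpadded window, (size - kernel) // stride + 1, which matches the \"valid\" padding used in conv() above; the helper name window_out is made up for this illustration and is not part of MindSpore."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def window_out(size, kernel, stride=1):\n",
    "    \"\"\"Output size of an unpadded convolution or pooling window (hypothetical helper).\"\"\"\n",
    "    return (size - kernel) // stride + 1\n",
    "\n",
    "size = 32                                    # images are 32x32 after the Resize step\n",
    "size = window_out(size, kernel=5)            # conv1: 32 -> 28\n",
    "size = window_out(size, kernel=2, stride=2)  # max pool: 28 -> 14\n",
    "size = window_out(size, kernel=5)            # conv2: 14 -> 10\n",
    "size = window_out(size, kernel=2, stride=2)  # max pool: 10 -> 5\n",
    "flattened = 16 * size * size                 # conv2 produces 16 channels\n",
    "print(size, flattened)"
   ]
  },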
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the network is built, let's print out the overall parameters of LeNet5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "network = LeNet5()\n",
    "print(network)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "param = network.trainable_params()\n",
    "param"
   ]
  },
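  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Building the Training Network and Training"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the neural network built, we can set up training. The model training function is Model.train(), and its main parameters are:\n",
    "<br/>1. epoch size, the number of epochs (each epoch iterates over all 1875 groups of images);\n",
    "<br/>2. ds_train, the dataset;\n",
    "<br/>3. callbacks, the callback functions, including ModelCheckpoint (ckpoint_cb), LossMonitor, SummaryStep, and Callback, which monitor the model during training;\n",
    "<br/>4. dataset_sink_mode, the underlying data channel. It defaults to True and must be set to False here, because this feature is only available on Ascend AI processors."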
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Training and testing related modules\n",
    "import argparse\n",
    "from mindspore import Tensor\n",
    "from mindspore.train.serialization import load_checkpoint, load_param_into_net\n",
    "from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor,SummaryStep,Callback\n",
    "from mindspore.train import Model\n",
    "from mindspore.nn.metrics import Accuracy\n",
    "from mindspore.nn.loss import SoftmaxCrossEntropyWithLogits\n",
    "\n",
    "def train_net(model, epoch_size, mnist_path, repeat_size, ckpoint_cb, step_loss_info):\n",
    "    \"\"\"Define the training method.\"\"\"\n",
    "    print(\"============== Starting Training ==============\")\n",
    "    # load training dataset\n",
    "    ds_train = create_dataset(os.path.join(mnist_path, \"train\"), 32, repeat_size)\n",
    "    model.train(epoch_size, ds_train, callbacks=[ckpoint_cb, LossMonitor(), step_loss_info], dataset_sink_mode=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
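    "We define a custom class Step_loss_info() that stores the step number and the corresponding loss value of every training step. It inherits from the Callback class, which makes it easy to customize what happens during training. After training, the recorded data can be plotted to see how the loss changes."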
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Custom callback function\n",
    "class Step_loss_info(Callback):\n",
    "    def step_end(self, run_context):\n",
    "        cb_params = run_context.original_args()\n",
    "        # append the loss value and step number to the global step_loss dictionary\n",
    "        step_loss[\"loss_value\"].append(str(cb_params.net_outputs))\n",
    "        step_loss[\"step\"].append(str(cb_params.cur_step_num))"
   ]
  },
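  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Defining the Loss Function and Optimizer\n",
    "Basic concepts:\n",
    "Before defining them, here is a brief introduction to the concepts of loss function and optimizer.\n",
    "<br/>Loss function: also called the objective function, it measures how far the predicted values are from the actual values. Deep learning reduces the value of the loss function through repeated iteration. A well-chosen loss function can effectively improve model performance.\n",
    "<br/>Optimizer: minimizes the loss function, thereby improving the model during training.\n",
    "<br/>Once the loss function is defined, its gradient with respect to the weights can be computed. The gradient tells the optimizer in which direction to adjust the weights to improve model performance.\n",
    "<br/>Defining the loss function.\n",
    "<br/>Loss functions supported by MindSpore include SoftmaxCrossEntropyWithLogits, L1Loss, and MSELoss. SoftmaxCrossEntropyWithLogits is used here."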
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.system('del/f/s/q *.ckpt *.meta') # Clean up files left by earlier runs (Windows command)\n",
    "lr = 0.01 # learning rate\n",
    "momentum = 0.9 # momentum coefficient\n",
    "\n",
    "# create the network\n",
    "network = LeNet5()\n",
    "\n",
    "# define the optimizer\n",
    "net_opt = nn.Momentum(network.trainable_params(), lr, momentum)\n",
    "\n",
    "\n",
    "# define the loss function\n",
    "net_loss = SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True, reduction='mean')\n",
    "# define the model: group the network, loss and optimizer into an object with training and evaluation features\n",
    "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()} )\n",
    "\n",
    "\n",
    "epoch_size = 1\n",
    "mnist_path = \"./MNIST_Data\"\n",
    "\n",
    "# save a checkpoint every 125 steps and keep at most 16 checkpoint files\n",
    "config_ck = CheckpointConfig(save_checkpoint_steps=125, keep_checkpoint_max=16)\n",
    "# save the network model and parameters for subsequent fine-tuning\n",
    "ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_lenet\", config=config_ck)\n",
    "# step_loss dictionary for saving loss values and step numbers\n",
    "step_loss = {\"step\": [], \"loss_value\": []}\n",
    "# callback that records the step number and loss value of each step\n",
    "step_loss_info = Step_loss_info()\n",
    "repeat_size = 1\n",
    "train_net(model, epoch_size, mnist_path, repeat_size, ckpoint_cb, step_loss_info)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After training, several model files are generated in the Jupyter working path. Their names follow the pattern checkpoint_{network name}-{epoch number}_{step number}.ckpt ."
   ]
  },
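  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The number of checkpoint files can be estimated from values already used in this tutorial. The sketch below is plain Python, not MindSpore code. It assumes 60000 training images, a batch size of 32 with the remainder dropped, one epoch, and a checkpoint every 125 steps, and it builds the expected file names from the naming pattern above. The variable names are made up for the illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_images = 60000   # size of the MNIST training set\n",
    "batch_size = 32      # as passed to create_dataset()\n",
    "save_every = 125     # save_checkpoint_steps in CheckpointConfig\n",
    "epoch = 1            # epoch_size is 1 in this tutorial\n",
    "\n",
    "# drop_remainder=True discards any partial batch, so integer division applies\n",
    "steps_per_epoch = num_images // batch_size\n",
    "saved_steps = list(range(save_every, steps_per_epoch + 1, save_every))\n",
    "names = [\"checkpoint_lenet-{}_{}.ckpt\".format(epoch, s) for s in saved_steps]\n",
    "\n",
    "print(steps_per_epoch, len(names))\n",
    "print(names[0], names[-1])"
   ]
  },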
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Viewing How the Loss Changes with the Number of Training Steps"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "steps = step_loss[\"step\"]\n",
    "loss_value = step_loss[\"loss_value\"]\n",
    "steps = list(map(int, steps))\n",
    "loss_value = list(map(float, loss_value))\n",
    "plt.plot(steps, loss_value, color=\"red\")\n",
    "plt.xlabel(\"Steps\")\n",
    "plt.ylabel(\"Loss_value\")\n",
    "plt.title(\"Loss function value change chart\")\n",
    "plt.show()"
   ]
  },
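  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The chart above shows roughly three phases:\n",
    "\n",
    "Phase 1: at the start of training, the loss fluctuates around 2.2, and training brings little visible gain.\n",
    "\n",
    "Phase 2: at some point during training, the loss drops rapidly, and the gain from training increases sharply.\n",
    "\n",
    "Phase 3: after the loss converges to a small value, it oscillates within a narrow range and cannot approach 0. Further training brings no obvious gain, and training ends here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Validating Model Accuracy with Test Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Building the test network mainly involves:<br/>1. Loading the parameters param from the model .ckpt file;<br/>2. Loading the parameters param into the LeNet5 network;<br/>3. Loading the test dataset;<br/>4. Calling model.eval() with the test dataset ds_eval to obtain the accuracy of the model checkpoint_lenet-1_1875.ckpt.<br/>dataset_sink_mode is the dataset sink mode, which is supported only on the Ascend AI processor platform, so it is set to False here."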
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_net(network, model, mnist_path):\n",
    "    \"\"\"Define the evaluation method.\"\"\"\n",
    "    print(\"============== Starting Testing ==============\")\n",
    "    # load the saved model for evaluation\n",
    "    param_dict = load_checkpoint(\"checkpoint_lenet-1_1875.ckpt\")\n",
    "    # load parameter to the network\n",
    "    load_param_into_net(network, param_dict)\n",
    "    # load testing dataset\n",
    "    ds_eval = create_dataset(os.path.join(mnist_path, \"test\"))\n",
    "    acc = model.eval(ds_eval, dataset_sink_mode=False)\n",
    "    print(\"============== Accuracy:{} ==============\".format(acc))\n",
    "\n",
    "test_net(network, model, mnist_path)"
   ]
  },
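  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model produced after 1875 training steps reaches an accuracy above 95%, which is a good result.\n",
    "We can also look at how the accuracy changes as the number of training steps increases."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The acc_model_info() function loads the model saved at every 125 steps, calls model.eval() to measure its accuracy, and returns the results as a list of step numbers and a list of accuracies, as follows:"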
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def acc_model_info(network, model, mnist_path, model_numbers):\n",
    "    \"\"\"Define the plot info method\"\"\"\n",
    "    step_list = []\n",
    "    acc_list = []\n",
    "    for i in range(1, model_numbers+1):\n",
    "        # load the saved model for evaluation\n",
    "        param_dict = load_checkpoint(\"checkpoint_lenet-1_{}.ckpt\".format(str(i*125)))\n",
    "        # load parameter to the network\n",
    "        load_param_into_net(network, param_dict)\n",
    "        # load testing dataset\n",
    "        ds_eval = create_dataset(os.path.join(mnist_path, \"test\"))\n",
    "        acc = model.eval(ds_eval, dataset_sink_mode=False)\n",
    "        acc_list.append(acc['Accuracy'])\n",
    "        step_list.append(i*125)\n",
    "    return step_list,acc_list\n",
    "\n",
    "# Draw line chart according to training steps and model accuracy\n",
    "l1,l2 = acc_model_info(network, model, mnist_path, 15)\n",
    "plt.xlabel(\"Model of Steps\")\n",
    "plt.ylabel(\"Model accuracy\")\n",
    "plt.title(\"Model accuracy variation chart\")\n",
    "plt.plot(l1, l2, 'red')\n",
    "plt.show()"
   ]
  },
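  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The chart shows that the accuracy of the trained model changes in three phases: 1. a slow rise, 2. a rapid rise, 3. a slow rise toward some value below 1, around which it then oscillates. This indicates that more training correlates positively with model accuracy, but once the accuracy reaches a certain level, the gain from further training decreases."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Applying the Model for Prediction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's apply the generated model to classify a single image or a single group of images. The steps are as follows:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. Convert the data to be tested into the data format that LeNet5 expects.\n",
    "<br/>2. Extract the image data.\n",
    "<br/>3. Use model.predict() to predict the digit for each image. Note that predict() returns one value for each digit 0-9 per image. Because the network defined above has no softmax layer, these are raw scores that can be negative rather than normalized probabilities.\n",
    "<br/>4. Call plot_pie() to display the value of each predicted digit. Digits with negative values are removed."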
   ]
  },
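  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because plot_pie() simply drops negative values, the plotted shares are not true probabilities. One common alternative, which this tutorial does not use, is to pass the values through a softmax so that they are non-negative and sum to 1. The sketch below shows that conversion with NumPy only; the scores array is a made-up example, not real model output, and the softmax helper is written here for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def softmax(scores):\n",
    "    \"\"\"Convert raw scores to probabilities along the last axis.\"\"\"\n",
    "    shifted = scores - np.max(scores, axis=-1, keepdims=True)  # subtract the max to avoid overflow in exp\n",
    "    exp = np.exp(shifted)\n",
    "    return exp / np.sum(exp, axis=-1, keepdims=True)\n",
    "\n",
    "# Hypothetical raw scores for the digits 0-9 of one image\n",
    "scores = np.array([-2.1, 0.3, -0.5, 1.2, -3.0, 0.1, -1.4, 9.8, 0.4, 2.2])\n",
    "probs = softmax(scores)\n",
    "\n",
    "print(np.argmax(probs))  # index of the most likely digit\n",
    "print(probs.round(4))"
   ]
  },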
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load the test dataset, call create_dataset() to convert it into the required format, and select one group of 32 images for prediction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds_test = create_dataset(test_data_path).create_dict_iterator()\n",
    "data = ds_test.get_next()\n",
    "images = data[\"image\"]\n",
    "labels = data[\"label\"] # Ground-truth labels, used to judge whether each prediction is correct\n",
    "\n",
    "output = model.predict(Tensor(data['image']))\n",
    "# predict() returns one raw score per digit 0-9 for each picture\n",
    "prb = output.asnumpy()\n",
    "pred = np.argmax(output.asnumpy(), axis=1)\n",
    "err_num = []\n",
    "index = 1\n",
    "for i in range(len(labels)):\n",
    "    plt.subplot(4, 8, i+1)\n",
    "    color = 'blue' if pred[i] == labels[i] else 'red'\n",
    "    plt.title(\"pre:{}\".format(pred[i]), color=color)\n",
    "    plt.imshow(np.squeeze(images[i]))\n",
    "    plt.axis(\"off\")\n",
    "    if color == 'red':\n",
    "        index = 0\n",
    "        # Print out the wrong data identified by the current group\n",
    "        print(\"Row {}, column {} is incorrectly identified as {}, the correct value should be {}\".format(int(i/8)+1, i%8+1, pred[i], labels[i]), '\\n')\n",
    "if index:\n",
    "    print(\"All the figures in this group are predicted correctly!\")\n",
    "print(pred, \"<--Predicted figures\") # Print the digit predicted for each picture in the group\n",
    "print(labels, \"<--The right number\") # Print the true label of each picture in the group\n",
    "plt.show()"
   ]
  },
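  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Build a pie chart function for analyzing the prediction output.\n",
    "\n",
    "Note: prb is the variable from the previous code block that stores the predicted value of each digit for the images in this group."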
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define the pie drawing function of probability analysis\n",
    "def plot_pie(prbs):\n",
    "    dict1 = {}\n",
    "    # Remove the negative number and build the dictionary dict1. The key is the number and the value is the probability value\n",
    "    for i in range(10):\n",
    "        if prbs[i] > 0:\n",
    "            dict1[str(i)] = prbs[i]\n",
    "\n",
    "    label_list = list(dict1.keys())    # Label of each part\n",
    "    size = list(dict1.values())    # Size of each part\n",
    "    colors = [\"red\", \"green\", \"pink\", \"blue\", \"purple\", \"orange\", \"gray\"] # Color palette for the pie slices\n",
    "    color = colors[: len(size)] # Color of each part\n",
    "    plt.pie(size, colors=color, labels=label_list, labeldistance=1.1, autopct=\"%1.1f%%\", shadow=False, startangle=90, pctdistance=0.6)\n",
    "    plt.axis(\"equal\")    # Set the scale size of x-axis and y-axis to be equal\n",
    "    plt.legend()\n",
    "    plt.title(\"Image classification\")\n",
    "    plt.show()\n",
    "    \n",
    "    \n",
    "for i in range(2):\n",
    "    print(\"Figure {} probability of corresponding numbers [0-9]:\\n\".format(i+1), prb[i])\n",
    "    plot_pie(prb[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This completes the full walkthrough of training a handwritten digit classifier."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:root] *",
   "language": "python",
   "name": "conda-root-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}