diff --git a/demo/quant/quant_aware/image_classification_training_aware_quantization_quick_start.ipynb b/demo/quant/quant_aware/image_classification_training_aware_quantization_quick_start.ipynb new file mode 100755 index 0000000000000000000000000000000000000000..495e3e1655074cffbb5a390c335ed47399d0968b --- /dev/null +++ b/demo/quant/quant_aware/image_classification_training_aware_quantization_quick_start.ipynb @@ -0,0 +1,342 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 图像分类模型量化训练-快速开始" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "该教程以图像分类模型MobileNetV1为例,说明如何快速使用PaddleSlim的[量化训练接口](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/docs/api/quantization_api.md)。 该示例包含以下步骤:\n", + "\n", + "1. 导入依赖\n", + "2. 构建模型\n", + "3. 训练模型\n", + "4. 量化\n", + "5. 训练和测试量化后的模型\n", + "6. 保存量化后的模型" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. 导入依赖\n", + "PaddleSlim依赖Paddle1.7版本,请确认已正确安装Paddle,然后按以下方式导入Paddle和PaddleSlim:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import paddle\n", + "import paddle.fluid as fluid\n", + "import paddleslim as slim\n", + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. 构建网络\n", + "该章节构造一个用于对MNIST数据进行分类的分类模型,选用`MobileNetV1`,并将输入大小设置为`[1, 28, 28]`,输出类别数为10。 为了方便展示示例,我们在`paddleslim.models`下预定义了用于构建分类模型的方法,执行以下代码构建分类模型:\n", + "\n", + ">注意:paddleslim.models下的API并非PaddleSlim常规API,是为了简化示例而封装预定义的一系列方法,比如:模型结构的定义、Program的构建等。" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "exe, train_program, val_program, inputs, outputs = \\\n", + " slim.models.image_classification(\"MobileNet\", [1, 28, 28], 10, use_gpu=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. 训练模型\n", + "该章节介绍了如何定义输入数据和如何训练和测试分类模型。先训练分类模型的原因是量化训练过程是在训练好的模型上进行的,也就是说是在训练好的模型的基础上加入量化反量化op之后,用小学习率进行参数微调。\n", + "\n", + "### 3.1 定义输入数据\n", + "\n", + "为了快速执行该示例,我们选取简单的MNIST数据,Paddle框架的`paddle.dataset.mnist`包定义了MNIST数据的下载和读取。\n", + "代码如下:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "import paddle.dataset.mnist as reader\n", + "train_reader = paddle.batch(\n", + " reader.train(), batch_size=128, drop_last=True)\n", + "test_reader = paddle.batch(\n", + " reader.train(), batch_size=128, drop_last=True)\n", + "train_feeder = fluid.DataFeeder(inputs, fluid.CPUPlace())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2 训练和测试\n", + "先定义训练和测试函数,正常训练和量化训练时只需要调用函数即可。在训练函数中执行了一个epoch的训练,因为MNIST数据集数据较少,一个epoch就可将top1精度训练到95%以上。" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "def train(prog):\n", + " iter = 0\n", + " for data in train_reader():\n", + " acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs)\n", + " if iter % 100 == 0:\n", + " print('train iter={}, top1={}, top5={}, loss={}'.format(iter, acc1.mean(), acc5.mean(), loss.mean()))\n", + " iter += 1\n", + " \n", + "def test(prog):\n", + " iter = 0\n", + " res = [[], []]\n", + " for data in train_reader():\n", + " acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs)\n", + " if iter % 100 == 0:\n", + " print('test iter={}, top1={}, top5={}, loss={}'.format(iter, acc1.mean(), acc5.mean(), loss.mean()))\n", + " res[0].append(acc1.mean())\n", + " res[1].append(acc5.mean())\n", + " iter += 1\n", + " print('final test result top1={}, top5={}'.format(np.array(res[0]).mean(), np.array(res[1]).mean()))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "调用``train``函数训练分类网络,``train_program``是在第2步:构建网络中定义的。" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "train iter=0, top1=0.1171875, top5=0.546875, loss=2.79680204391\n", + "train iter=100, top1=0.9296875, top5=1.0, loss=0.305284500122\n", + "train iter=200, top1=0.9609375, top5=0.9921875, loss=0.158525630832\n", + "train iter=300, top1=0.9609375, top5=0.9921875, loss=0.146427512169\n", + "train iter=400, top1=0.9609375, top5=1.0, loss=0.179066047072\n" + ] + } + ], + "source": [ + "train(train_program)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "调用``test``函数测试分类网络,``val_program``是在第2步:构建网络中定义的。" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "test iter=0, top1=0.96875, top5=1.0, loss=0.0801232308149\n", + "test iter=100, top1=0.9609375, top5=1.0, loss=0.104892581701\n", + "test iter=200, top1=0.96875, top5=1.0, loss=0.156774014235\n", + "test iter=300, top1=0.984375, top5=1.0, loss=0.0931615754962\n", + "test iter=400, top1=0.9453125, top5=1.0, loss=0.184863254428\n", + "final test result top1=0.970469415188, top5=0.999282181263\n" + ] + } + ], + "source": [ + "test(val_program)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. 量化\n", + "\n", + "按照[默认配置](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/#_1)在``train_program``和``val_program``中加入量化和反量化op." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2020-02-06 09:08:49,489-INFO: quant_aware config {'moving_rate': 0.9, 'weight_quantize_type': 'channel_wise_abs_max', 'is_full_quantize': False, 'dtype': 'int8', 'weight_bits': 8, 'window_size': 10000, 'activation_bits': 8, 'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul'], 'not_quant_pattern': ['skip_quant'], 'activation_quantize_type': 'moving_average_abs_max', 'for_tensorrt': False}\n", + "2020-02-06 09:08:50,943-INFO: quant_aware config {'moving_rate': 0.9, 'weight_quantize_type': 'channel_wise_abs_max', 'is_full_quantize': False, 'dtype': 'int8', 'weight_bits': 8, 'window_size': 10000, 'activation_bits': 8, 'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul'], 'not_quant_pattern': ['skip_quant'], 'activation_quantize_type': 'moving_average_abs_max', 'for_tensorrt': False}\n" + ] + } + ], + "source": [ + "quant_program = slim.quant.quant_aware(train_program, exe.place, for_test=False)\n", + "val_quant_program = slim.quant.quant_aware(val_program, exe.place, for_test=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. 训练和测试量化后的模型\n", + "微调量化后的模型,训练一个epoch后测试。" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "train iter=0, top1=0.953125, top5=1.0, loss=0.184170544147\n", + "train iter=100, top1=0.96875, top5=1.0, loss=0.0945074558258\n", + "train iter=200, top1=0.9765625, top5=1.0, loss=0.0915599390864\n", + "train iter=300, top1=0.9765625, top5=1.0, loss=0.0562560297549\n", + "train iter=400, top1=0.9609375, top5=1.0, loss=0.094195574522\n" + ] + } + ], + "source": [ + "train(quant_program)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "测试量化后的模型,和``3.2 训练和测试``中得到的测试结果相比,精度相近,达到了无损量化。" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "test iter=0, top1=0.984375, top5=1.0, loss=0.0542894415557\n", + "test iter=100, top1=0.9609375, top5=1.0, loss=0.0662319809198\n", + "test iter=200, top1=0.9609375, top5=1.0, loss=0.0832970961928\n", + "test iter=300, top1=0.9921875, top5=1.0, loss=0.0262515246868\n", + "test iter=400, top1=0.96875, top5=1.0, loss=0.123742781579\n", + "final test result top1=0.984057843685, top5=0.999799668789\n" + ] + } + ], + "source": [ + "test(val_quant_program)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 6. 保存量化后的模型\n", + "\n", + "在``4. 量化``中使用接口``slim.quant.quant_aware``接口得到的模型只适合训练时使用,为了得到最终使用时的模型,需要使用[slim.quant.convert](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/#convert)接口,然后使用[fluid.io.save_inference_model](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api_cn/io_cn/save_inference_model_cn.html#save-inference-model)保存模型。``float_prog``的参数数据类型是float32,但是数据范围是int8, 保存之后可使用fluid或者paddle-lite加载使用,paddle-lite在使用时,会先将类型转换为int8。``int8_prog``的参数数据类型是int8, 保存后可看到量化后模型大小,不可加载使用。" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2020-02-06 09:09:27,529-INFO: convert config {'moving_rate': 0.9, 'weight_quantize_type': 'channel_wise_abs_max', 'is_full_quantize': False, 'dtype': 'int8', 'weight_bits': 8, 'window_size': 10000, 'activation_bits': 8, 'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul'], 'not_quant_pattern': ['skip_quant'], 'activation_quantize_type': 'moving_average_abs_max', 'for_tensorrt': False}\n" + ] + }, + { + "data": { + "text/plain": [ + "[u'save_infer_model/scale_0',\n", + " u'save_infer_model/scale_1',\n", + " u'save_infer_model/scale_2']" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float_prog, int8_prog = slim.quant.convert(val_quant_program, exe.place, save_int8=True)\n", + "target_vars = [float_prog.global_block().var(name) for name in outputs]\n", + "fluid.io.save_inference_model(dirname='./inference_model/float',\n", + " feeded_var_names=[var.name for var in inputs],\n", + " target_vars=target_vars,\n", + " executor=exe,\n", + " main_program=float_prog)\n", + "fluid.io.save_inference_model(dirname='./inference_model/int8',\n", + " feeded_var_names=[var.name for var in inputs],\n", + " target_vars=target_vars,\n", + " executor=exe,\n", + " main_program=int8_prog)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 2", + "language": "python", + "name": "python2" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/demo/quant/quant_post/image_classification_post_training_quantization_quick_start.ipynb b/demo/quant/quant_post/image_classification_post_training_quantization_quick_start.ipynb new file mode 100755 index 0000000000000000000000000000000000000000..23b813e6c2525341579d26c13908a89ed96c1017 --- /dev/null +++ b/demo/quant/quant_post/image_classification_post_training_quantization_quick_start.ipynb @@ -0,0 +1,311 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 图像分类模型离线量化-快速开始" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "该教程以图像分类模型MobileNetV1为例,说明如何快速使用PaddleSlim的[离线量化接口](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/docs/api/quantization_api.md)。 该示例包含以下步骤:\n", + "\n", + "1. 导入依赖\n", + "2. 构建模型\n", + "3. 训练模型\n", + "4. 离线量化" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. 导入依赖\n", + "PaddleSlim依赖Paddle1.7版本,请确认已正确安装Paddle,然后按以下方式导入Paddle和PaddleSlim:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import paddle\n", + "import paddle.fluid as fluid\n", + "import paddleslim as slim\n", + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. 构建网络\n", + "该章节构造一个用于对MNIST数据进行分类的分类模型,选用`MobileNetV1`,并将输入大小设置为`[1, 28, 28]`,输出类别数为10。 为了方便展示示例,我们在`paddleslim.models`下预定义了用于构建分类模型的方法,执行以下代码构建分类模型:\n", + "\n", + ">注意:paddleslim.models下的API并非PaddleSlim常规API,是为了简化示例而封装预定义的一系列方法,比如:模型结构的定义、Program的构建等。" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "exe, train_program, val_program, inputs, outputs = \\\n", + " slim.models.image_classification(\"MobileNet\", [1, 28, 28], 10, use_gpu=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. 训练模型\n", + "该章节介绍了如何定义输入数据和如何训练和测试分类模型。先训练分类模型的原因是离线量化需要一个训练好的模型。\n", + "\n", + "### 3.1 定义输入数据\n", + "\n", + "为了快速执行该示例,我们选取简单的MNIST数据,Paddle框架的`paddle.dataset.mnist`包定义了MNIST数据的下载和读取。\n", + "代码如下:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "import paddle.dataset.mnist as reader\n", + "train_reader = paddle.batch(\n", + " reader.train(), batch_size=128, drop_last=True)\n", + "test_reader = paddle.batch(\n", + " reader.train(), batch_size=128, drop_last=True)\n", + "train_feeder = fluid.DataFeeder(inputs, fluid.CPUPlace())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2 训练和测试\n", + "先定义训练和测试函数。在训练函数中执行了一个epoch的训练,因为MNIST数据集数据较少,一个epoch就可将top1精度训练到95%以上。\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "def train(prog):\n", + " iter = 0\n", + " for data in train_reader():\n", + " acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs)\n", + " if iter % 100 == 0:\n", + " print('train', acc1.mean(), acc5.mean(), loss.mean())\n", + " iter += 1\n", + " \n", + "def test(prog, outputs=outputs):\n", + " iter = 0\n", + " res = [[], []]\n", + " for data in train_reader():\n", + " acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs)\n", + " if iter % 100 == 0:\n", + " print('test', acc1.mean(), acc5.mean(), loss.mean())\n", + " res[0].append(acc1.mean())\n", + " res[1].append(acc5.mean())\n", + " iter += 1\n", + " print('final test result', np.array(res[0]).mean(), np.array(res[1]).mean())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "调用``train``函数训练分类网络,``train_program``是在第2步:构建网络中定义的。" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "('train', 0.0625, 0.5234375, 2.6373053)\n", + "('train', 0.9375, 0.9921875, 0.20106347)\n", + "('train', 0.953125, 1.0, 0.13234669)\n", + "('train', 0.96875, 0.9921875, 0.18056682)\n", + "('train', 0.9453125, 1.0, 0.15847622)\n" + ] + } + ], + "source": [ + "train(train_program)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "调用``test``函数测试分类网络,``val_program``是在第2步:构建网络中定义的。" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "('test', 0.9609375, 0.9921875, 0.12996897)\n", + "('test', 0.9609375, 1.0, 0.094265014)\n", + "('test', 0.9453125, 1.0, 0.10511534)\n", + "('test', 0.9765625, 1.0, 0.11341806)\n", + "('test', 0.953125, 1.0, 0.17046008)\n", + "('final test result', 0.9647603, 0.99943244)\n" + ] + } + ], + "source": [ + "test(val_program)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "保存inference model,将训练好的分类模型保存在``'./inference_model'``下,后续进行离线量化时将加载保存在此处的模型。" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[u'save_infer_model/scale_0',\n", + " u'save_infer_model/scale_1',\n", + " u'save_infer_model/scale_2']" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "target_vars = [val_program.global_block().var(name) for name in outputs]\n", + "fluid.io.save_inference_model(dirname='./inference_model',\n", + " feeded_var_names=[var.name for var in inputs],\n", + " target_vars=target_vars,\n", + " executor=exe,\n", + " main_program=val_program)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. 离线量化\n", + "\n", + "调用离线量化接口,加载文件夹``'./inference_model'``训练好的分类模型,并使用10个batch的数据进行参数校正。此过程无需训练,只需跑前向过程来计算量化所需参数。离线量化后的模型保存在文件夹``'./quant_post_model'``下。" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2020-02-06 09:32:42,944-INFO: run batch: 0\n", + "2020-02-06 09:32:42,944-INFO: run batch: 0\n", + "2020-02-06 09:32:43,233-INFO: run batch: 5\n", + "2020-02-06 09:32:43,233-INFO: run batch: 5\n", + "2020-02-06 09:32:43,362-INFO: all run batch: 10\n", + "2020-02-06 09:32:43,362-INFO: all run batch: 10\n", + "2020-02-06 09:32:43,365-INFO: calculate scale factor ...\n", + "2020-02-06 09:32:43,365-INFO: calculate scale factor ...\n", + "2020-02-06 09:32:54,841-INFO: update the program ...\n", + "2020-02-06 09:32:54,841-INFO: update the program ...\n" + ] + } + ], + "source": [ + "slim.quant.quant_post(\n", + " executor=exe,\n", + " model_dir='./inference_model',\n", + " quantize_model_path='./quant_post_model',\n", + " sample_generator=reader.test(),\n", + " batch_nums=10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "加载保存在文件夹``'./quant_post_model'``下的量化后的模型进行测试,可看到精度和``3.2 训练和测试``中得到的测试精度相近,因此离线量化过程对于此分类模型几乎无损。" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "('test', 0.9765625, 0.9921875, 0.11411239)\n", + "('test', 0.953125, 1.0, 0.111179784)\n", + "('test', 0.953125, 1.0, 0.101078615)\n", + "('test', 0.96875, 1.0, 0.0993958)\n", + "('test', 0.9609375, 1.0, 0.16066414)\n", + "('final test result', 0.9643096, 0.99931556)\n" + ] + } + ], + "source": [ + "quant_post_prog, feed_target_names, fetch_targets = fluid.io.load_inference_model(\n", + " dirname='./quant_post_model',\n", + " executor=exe)\n", + "test(quant_post_prog, fetch_targets)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 2", + "language": "python", + "name": "python2" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/docs/zh_cn/quick_start/quant_aware_tutorial.md b/docs/zh_cn/quick_start/quant_aware_tutorial.md new file mode 100644 index 0000000000000000000000000000000000000000..802f0b739e0168517556e899d557da87291eccfd --- /dev/null +++ b/docs/zh_cn/quick_start/quant_aware_tutorial.md @@ -0,0 +1,140 @@ +# 图像分类模型量化训练-快速开始 + +该教程以图像分类模型MobileNetV1为例,说明如何快速使用PaddleSlim的[量化训练接口](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/docs/api/quantization_api.md)。 该示例包含以下步骤: + +1. 导入依赖 +2. 构建模型 +3. 训练模型 +4. 量化 +5. 训练和测试量化后的模型 +6. 保存量化后的模型 + +## 1. 导入依赖 +PaddleSlim依赖Paddle1.7版本,请确认已正确安装Paddle,然后按以下方式导入Paddle和PaddleSlim: + + +```python +import paddle +import paddle.fluid as fluid +import paddleslim as slim +import numpy as np +``` + +## 2. 构建网络 +该章节构造一个用于对MNIST数据进行分类的分类模型,选用`MobileNetV1`,并将输入大小设置为`[1, 28, 28]`,输出类别数为10。 为了方便展示示例,我们在`paddleslim.models`下预定义了用于构建分类模型的方法,执行以下代码构建分类模型: + +>注意:paddleslim.models下的API并非PaddleSlim常规API,是为了简化示例而封装预定义的一系列方法,比如:模型结构的定义、Program的构建等。 + + +```python +exe, train_program, val_program, inputs, outputs = \ + slim.models.image_classification("MobileNet", [1, 28, 28], 10, use_gpu=True) +``` + +## 3. 训练模型 +该章节介绍了如何定义输入数据和如何训练和测试分类模型。先训练分类模型的原因是量化训练过程是在训练好的模型上进行的,也就是说是在训练好的模型的基础上加入量化反量化op之后,用小学习率进行参数微调。 + +### 3.1 定义输入数据 + +为了快速执行该示例,我们选取简单的MNIST数据,Paddle框架的`paddle.dataset.mnist`包定义了MNIST数据的下载和读取。 +代码如下: + + +```python +import paddle.dataset.mnist as reader +train_reader = paddle.batch( + reader.train(), batch_size=128, drop_last=True) +test_reader = paddle.batch( + reader.train(), batch_size=128, drop_last=True) +train_feeder = fluid.DataFeeder(inputs, fluid.CPUPlace()) +``` + +### 3.2 训练和测试 +先定义训练和测试函数,正常训练和量化训练时只需要调用函数即可。在训练函数中执行了一个epoch的训练,因为MNIST数据集数据较少,一个epoch就可将top1精度训练到95%以上。 + + +```python +def train(prog): + iter = 0 + for data in train_reader(): + acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs) + if iter % 100 == 0: + print('train iter={}, top1={}, top5={}, loss={}'.format(iter, acc1.mean(), acc5.mean(), loss.mean())) + iter += 1 + +def test(prog): + iter = 0 + res = [[], []] + for data in train_reader(): + acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs) + if iter % 100 == 0: + print('test iter={}, top1={}, top5={}, loss={}'.format(iter, acc1.mean(), acc5.mean(), loss.mean())) + res[0].append(acc1.mean()) + res[1].append(acc5.mean()) + iter += 1 + print('final test result top1={}, top5={}'.format(np.array(res[0]).mean(), np.array(res[1]).mean())) +``` + +调用``train``函数训练分类网络,``train_program``是在第2步:构建网络中定义的。 + + +```python +train(train_program) +``` + + +调用``test``函数测试分类网络,``val_program``是在第2步:构建网络中定义的。 + + +```python +test(val_program) +``` + + +## 4. 量化 + +按照[默认配置](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/#_1)在``train_program``和``val_program``中加入量化和反量化op. + + +```python +quant_program = slim.quant.quant_aware(train_program, exe.place, for_test=False) +val_quant_program = slim.quant.quant_aware(val_program, exe.place, for_test=True) +``` + + +## 5. 训练和测试量化后的模型 +微调量化后的模型,训练一个epoch后测试。 + + +```python +train(quant_program) +``` + + +测试量化后的模型,和``3.2 训练和测试``中得到的测试结果相比,精度相近,达到了无损量化。 + + +```python +test(val_quant_program) +``` + + +## 6. 保存量化后的模型 + +在``4. 量化``中使用接口``slim.quant.quant_aware``接口得到的模型只适合训练时使用,为了得到最终使用时的模型,需要使用[slim.quant.convert](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/#convert)接口,然后使用[fluid.io.save_inference_model](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api_cn/io_cn/save_inference_model_cn.html#save-inference-model)保存模型。``float_prog``的参数数据类型是float32,但是数据范围是int8, 保存之后可使用fluid或者paddle-lite加载使用,paddle-lite在使用时,会先将类型转换为int8。``int8_prog``的参数数据类型是int8, 保存后可看到量化后模型大小,不可加载使用。 + + +```python +float_prog, int8_prog = slim.quant.convert(val_quant_program, exe.place, save_int8=True) +target_vars = [float_prog.global_block().var(name) for name in outputs] +fluid.io.save_inference_model(dirname='./inference_model/float', + feeded_var_names=[var.name for var in inputs], + target_vars=target_vars, + executor=exe, + main_program=float_prog) +fluid.io.save_inference_model(dirname='./inference_model/int8', + feeded_var_names=[var.name for var in inputs], + target_vars=target_vars, + executor=exe, + main_program=int8_prog) +``` diff --git a/docs/zh_cn/quick_start/quant_post_tutorial.md b/docs/zh_cn/quick_start/quant_post_tutorial.md new file mode 100755 index 0000000000000000000000000000000000000000..84ec2cc076c6c756de6db9039ce261af480e161d --- /dev/null +++ b/docs/zh_cn/quick_start/quant_post_tutorial.md @@ -0,0 +1,128 @@ + # 图像分类模型离线量化-快速开始 + +该教程以图像分类模型MobileNetV1为例,说明如何快速使用PaddleSlim的[离线量化接口](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/docs/api/quantization_api.md)。 该示例包含以下步骤: + +1. 导入依赖 +2. 构建模型 +3. 训练模型 +4. 离线量化 + +## 1. 导入依赖 +PaddleSlim依赖Paddle1.7版本,请确认已正确安装Paddle,然后按以下方式导入Paddle和PaddleSlim: + + +```python +import paddle +import paddle.fluid as fluid +import paddleslim as slim +import numpy as np +``` + +## 2. 构建网络 +该章节构造一个用于对MNIST数据进行分类的分类模型,选用`MobileNetV1`,并将输入大小设置为`[1, 28, 28]`,输出类别数为10。 为了方便展示示例,我们在`paddleslim.models`下预定义了用于构建分类模型的方法,执行以下代码构建分类模型: + +>注意:paddleslim.models下的API并非PaddleSlim常规API,是为了简化示例而封装预定义的一系列方法,比如:模型结构的定义、Program的构建等。 + + +```python +exe, train_program, val_program, inputs, outputs = \ + slim.models.image_classification("MobileNet", [1, 28, 28], 10, use_gpu=True) +``` + +## 3. 训练模型 +该章节介绍了如何定义输入数据和如何训练和测试分类模型。先训练分类模型的原因是离线量化需要一个训练好的模型。 + +### 3.1 定义输入数据 + +为了快速执行该示例,我们选取简单的MNIST数据,Paddle框架的`paddle.dataset.mnist`包定义了MNIST数据的下载和读取。 +代码如下: + + +```python +import paddle.dataset.mnist as reader +train_reader = paddle.batch( + reader.train(), batch_size=128, drop_last=True) +test_reader = paddle.batch( + reader.train(), batch_size=128, drop_last=True) +train_feeder = fluid.DataFeeder(inputs, fluid.CPUPlace()) +``` + +### 3.2 训练和测试 +先定义训练和测试函数。在训练函数中执行了一个epoch的训练,因为MNIST数据集数据较少,一个epoch就可将top1精度训练到95%以上。 + + + +```python +def train(prog): + iter = 0 + for data in train_reader(): + acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs) + if iter % 100 == 0: + print('train', acc1.mean(), acc5.mean(), loss.mean()) + iter += 1 + +def test(prog, outputs=outputs): + iter = 0 + res = [[], []] + for data in train_reader(): + acc1, acc5, loss = exe.run(prog, feed=train_feeder.feed(data), fetch_list=outputs) + if iter % 100 == 0: + print('test', acc1.mean(), acc5.mean(), loss.mean()) + res[0].append(acc1.mean()) + res[1].append(acc5.mean()) + iter += 1 + print('final test result', np.array(res[0]).mean(), np.array(res[1]).mean()) +``` + +调用``train``函数训练分类网络,``train_program``是在第2步:构建网络中定义的。 + + +```python +train(train_program) +``` + + +调用``test``函数测试分类网络,``val_program``是在第2步:构建网络中定义的。 + + +```python +test(val_program) +``` + + +保存inference model,将训练好的分类模型保存在``'./inference_model'``下,后续进行离线量化时将加载保存在此处的模型。 + + +```python +target_vars = [val_program.global_block().var(name) for name in outputs] +fluid.io.save_inference_model(dirname='./inference_model', + feeded_var_names=[var.name for var in inputs], + target_vars=target_vars, + executor=exe, + main_program=val_program) +``` + +## 4. 离线量化 + +调用离线量化接口,加载文件夹``'./inference_model'``训练好的分类模型,并使用10个batch的数据进行参数校正。此过程无需训练,只需跑前向过程来计算量化所需参数。离线量化后的模型保存在文件夹``'./quant_post_model'``下。 + + +```python +slim.quant.quant_post( + executor=exe, + model_dir='./inference_model', + quantize_model_path='./quant_post_model', + sample_generator=reader.test(), + batch_nums=10) +``` + + +加载保存在文件夹``'./quant_post_model'``下的量化后的模型进行测试,可看到精度和``3.2 训练和测试``中得到的测试精度相近,因此离线量化过程对于此分类模型几乎无损。 + + +```python +quant_post_prog, feed_target_names, fetch_targets = fluid.io.load_inference_model( + dirname='./quant_post_model', + executor=exe) +test(quant_post_prog, fetch_targets) +```