The second is the FSP-based distillation method (reference paper: [A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning](http://openaccess.thecvf.com/content_cvpr_2017/papers/Yim_A_Gift_From_CVPR_2017_paper.pdf)).
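For intuition, the FSP (Flow of Solution Procedure) matrix in that paper is the channel-wise Gram matrix between the feature maps of two layers, and the distillation loss is the squared distance between the teacher's and the student's FSP matrices. Below is a minimal NumPy sketch of this idea; the shapes and variable names are illustrative, not PaddleSlim's API:

```python
import numpy as np

def fsp_matrix(feat_a, feat_b):
    # feat_a: (c1, h, w) and feat_b: (c2, h, w) are feature maps from two
    # layers of the same network, with matching spatial size.
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)
    # Inner product over spatial positions, averaged: shape (c1, c2).
    return a @ b.T / (h * w)

# The distillation loss is the mean squared distance between the
# teacher's and the student's FSP matrices (shapes must match).
rng = np.random.default_rng(0)
teacher_fsp = fsp_matrix(rng.random((64, 32, 32)), rng.random((128, 32, 32)))
student_fsp = fsp_matrix(rng.random((64, 32, 32)), rng.random((128, 32, 32)))
loss = np.mean((teacher_fsp - student_fsp) ** 2)
```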
# Training-aware Quantization of image classification model - quick start
This tutorial shows how to do training-aware quantization using the [API](https://paddlepaddle.github.io/PaddleSlim/api_en/paddleslim.quant.html#paddleslim.quant.quanter.quant_aware) in PaddleSlim. We use MobileNetV1 to train an image classification model as an example. The tutorial contains the following sections:
1. Necessary imports
2. Model architecture
...
...
## 4. Quantization
We call the ``quant_aware`` API to add quantization and dequantization operators to ``train_program`` and ``val_program`` according to the [default configuration](https://paddlepaddle.github.io/PaddleSlim/api_cn/quantization_api.html#id2).
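A minimal sketch of the calls, assuming ``train_program``, ``val_program``, and ``place`` come from the earlier setup steps of this tutorial:

```python
import paddleslim as slim

# for_test=False inserts trainable fake-quantize/dequantize ops for training;
# for_test=True builds the evaluation counterpart of the same graph.
quant_train_program = slim.quant.quant_aware(train_program, place, for_test=False)
quant_val_program = slim.quant.quant_aware(val_program, place, for_test=True)
```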
## 6. Save model after quantization
The model obtained in ``4. Quantization`` after calling the ``slim.quant.quant_aware`` API is only suitable for training. To get the inference model, we should use the [slim.quant.convert](https://paddlepaddle.github.io/PaddleSlim/api_en/paddleslim.quant.html#paddleslim.quant.quanter.convert) API to change the model architecture and use [fluid.io.save_inference_model](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api_cn/io_cn/save_inference_model_cn.html#save-inference-model) to save the model. ``float_prog``'s parameters are of float32 dtype but within int8's range, so it can be used in ``fluid`` or ``paddle-lite``; ``paddle-lite`` will change the parameters' dtype from float32 to int8 when it loads the inference model. ``int8_prog``'s parameters are of int8 dtype, and saving it lets us measure the model size after quantization; ``int8_prog`` cannot be used in ``fluid`` or ``paddle-lite``.
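A minimal sketch of these two steps; ``image`` and ``out`` are the input/output variables assumed from the model-architecture section, and the save path is illustrative:

```python
import paddle.fluid as fluid

# Freeze the quantized evaluation program for inference. With save_int8=True,
# convert also returns a program whose weights are stored as int8, which is
# useful for measuring the on-disk model size after quantization.
float_prog, int8_prog = slim.quant.convert(quant_val_program, place, save_int8=True)

# Save the float inference model (usable in fluid or paddle-lite).
fluid.io.save_inference_model(dirname='./quant_model/float',
                              feeded_var_names=[image.name],
                              target_vars=[out],
                              executor=exe,
                              main_program=float_prog)
```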
# Post-training Quantization of image classification model - quick start
This tutorial shows how to do post-training quantization using the [API](https://paddlepaddle.github.io/PaddleSlim/api_en/paddleslim.quant.html#paddleslim.quant.quanter.quant_post) in PaddleSlim. We use MobileNetV1 to train an image classification model as an example. The tutorial contains the following sections:
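At its core, post-training quantization is a single call that calibrates a saved float32 inference model on sample data and writes out the quantized model. A minimal sketch, where ``./inference_model`` and ``sample_generator`` are assumptions: the directory holding the saved float model and a reader yielding calibration samples:

```python
import paddle.fluid as fluid
import paddleslim as slim

place = fluid.CPUPlace()
exe = fluid.Executor(place)

# Run a few calibration batches through the saved float model, compute
# quantization scales, and save the quantized model to quantize_model_path.
slim.quant.quant_post(executor=exe,
                      model_dir='./inference_model',
                      quantize_model_path='./quant_post_model',
                      sample_generator=sample_generator,
                      batch_nums=10)
```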