[doc develop] update the doc of dygraph qat (#695)

* update the doc of dygraph qat, test=develop, test=document_fix * up, test=develop, test=document_fix

44114b96 · cc · GitHub · 09267010
5 changed files
--- a/docs/zh_cn/api_cn/dygraph/quanter/qat.rst
+++ b/docs/zh_cn/api_cn/dygraph/quanter/qat.rst
@@ -36,31 +36,31 @@ QAT

 .. code-block:: python

-{
-    # weight preprocessing method; defaults to None, meaning no preprocessing. Set to `"PACT"` to use the `PACT` method
-    'weight_preprocess_type': None,
+    {
+        # weight preprocessing method; defaults to None, meaning no preprocessing. Set to `"PACT"` to use the `PACT` method
+        'weight_preprocess_type': None,

-    # activation preprocessing method; defaults to None, meaning no preprocessing
-    'activation_preprocess_type': None,
+        # activation preprocessing method; defaults to None, meaning no preprocessing
+        'activation_preprocess_type': None,

-    # weight quantization method; defaults to 'channel_wise_abs_max'; 'channel_wise_abs_max' is also supported
-    'weight_quantize_type': 'channel_wise_abs_max',
+        # weight quantization method; defaults to 'channel_wise_abs_max'; 'channel_wise_abs_max' is also supported
+        'weight_quantize_type': 'channel_wise_abs_max',

-    # activation quantization method; defaults to 'moving_average_abs_max'; 'abs_max' is also supported
-    'activation_quantize_type': 'moving_average_abs_max',
+        # activation quantization method; defaults to 'moving_average_abs_max'; 'abs_max' is also supported
+        'activation_quantize_type': 'moving_average_abs_max',

-    # number of bits for weight quantization; defaults to 8
-    'weight_bits': 8,
+        # number of bits for weight quantization; defaults to 8
+        'weight_bits': 8,

-    # number of bits for activation quantization; defaults to 8
-    'activation_bits': 8,
+        # number of bits for activation quantization; defaults to 8
+        'activation_bits': 8,

-    # moving-average hyperparameter of 'moving_average_abs_max'; defaults to 0.9
-    'moving_rate': 0.9,
+        # moving-average hyperparameter of 'moving_average_abs_max'; defaults to 0.9
+        'moving_rate': 0.9,

-    # operator (layer) types to quantize
-    'quantizable_layer_type': ['Conv2D', 'Linear'],
-}
+        # operator (layer) types to quantize
+        'quantizable_layer_type': ['Conv2D', 'Linear'],
+    }
 ..

 
@@ -95,6 +95,8 @@ QAT

  Exports the specified dygraph (dynamic graph) quantized model as a static graph inference model for inference deployment.
   
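+   The quantized inference model can be opened in `netron` for visual inspection. Like an ordinary FP32 inference model, it can be loaded for inference with PaddleLite and PaddleInference; see the `Inference Deployment` (`推理部署`) chapter for details.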
+   
  **Parameters:**
   
  - **model(paddle.nn.Layer)** - The quantized model to export once quantization-aware training has finished; this model is produced by the `quantize` interface.

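Reviewer note: the configuration dictionary documented in qat.rst above lists a default for every key, so a caller normally overrides only one or two of them. The following is a minimal sketch of that pattern, assuming a hypothetical helper `build_quant_config` (it is not part of PaddleSlim). It merges user overrides into the documented defaults and rejects unknown keys.

```python
import copy

# Defaults copied from the documented QAT config dictionary.
DEFAULT_QUANT_CONFIG = {
    'weight_preprocess_type': None,
    'activation_preprocess_type': None,
    'weight_quantize_type': 'channel_wise_abs_max',
    'activation_quantize_type': 'moving_average_abs_max',
    'weight_bits': 8,
    'activation_bits': 8,
    'moving_rate': 0.9,
    'quantizable_layer_type': ['Conv2D', 'Linear'],
}


def build_quant_config(overrides=None):
    """Hypothetical helper: return the defaults with user overrides applied."""
    # Deep copy so callers cannot mutate the shared default list.
    config = copy.deepcopy(DEFAULT_QUANT_CONFIG)
    for key, value in (overrides or {}).items():
        if key not in DEFAULT_QUANT_CONFIG:
            raise KeyError('unknown quantization config key: %r' % key)
        config[key] = value
    return config


# Enable PACT preprocessing for weights and keep every other default.
pact_config = build_quant_config({'weight_preprocess_type': 'PACT'})
```

Rejecting unknown keys is a design choice of this sketch: it surfaces misspelled key names early instead of silently ignoring them.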
--- a/docs/zh_cn/api_cn/static/quant/quantization_api.rst
+++ b/docs/zh_cn/api_cn/static/quant/quantization_api.rst
@@ -251,8 +251,8 @@ convert

 **Returns**

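- **program (fluid.Program)** - The frozen program, which can be used to save an inference model. Its parameters are of type ``float32``, but their values lie within the range representable by int8.
- **int8_program (fluid.Program)** - The frozen program, which can be used to save an inference model. Its parameters are of type ``int8``. This value is not returned when ``save_int8`` is False.
+- **program (fluid.Program)** - The frozen program, which can be used to save an inference model. Its parameters are of type ``float32``, but their values lie within the range representable by int8. This model is used for inference deployment.
+- **int8_program (fluid.Program)** - The frozen program, which can be used to save an inference model. Its parameters are of type ``int8``. This value is not returned when ``save_int8`` is False. This model cannot be used for inference deployment.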

 .. note::


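Reviewer note: the `convert` return values above describe parameters that are stored as ``float32`` yet take only values representable by int8. The following is a minimal sketch of how that can arise, assuming the generic symmetric `abs_max` scheme; the function name `fake_quantize_abs_max` is hypothetical, and this is not necessarily what the Paddle kernels compute.

```python
def fake_quantize_abs_max(values, bits=8):
    """Quantize floats to signed integers, then map them back to floats.

    Returns (dequantized_floats, integer_codes, scale). The dequantized
    floats take at most 2 * qmax + 1 distinct values, so they stay
    floating point while being representable by `bits`-bit integers.
    """
    qmax = 2 ** (bits - 1) - 1              # 127 for 8 bits
    scale = max(abs(v) for v in values)     # abs_max over the whole tensor
    if scale == 0:
        return list(values), [0] * len(values), scale
    codes = [round(v / scale * qmax) for v in values]
    dequantized = [q * scale / qmax for q in codes]
    return dequantized, codes, scale


weights = [0.5, -1.0, 0.25]
dequantized, codes, scale = fake_quantize_abs_max(weights)
```

In this sketch `dequantized` plays the role of the float32 parameters and `codes` plays the role of the int8 parameters.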
--- a/docs/zh_cn/quick_start/dygraph/dygraph_quant_aware_training_tutorial.md
+++ b/docs/zh_cn/quick_start/dygraph/dygraph_quant_aware_training_tutorial.md
@@ -6,6 +6,8 @@ PaddleSlim uses a simulated-quantization training scheme; simulated quantization generally first needs to

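 This tutorial uses the image classification model MobileNetV1 as an example to show how to get started quickly with the [PaddleSlim model quantization API]().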

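+> Note: dygraph quantization-aware training does not yet support models with control-flow logic. If a Warning appears during quantization-aware training, the static graph quantization-aware training feature is recommended instead.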
+
 This example consists of the following steps:

 1. Import dependencies
@@ -104,4 +106,6 @@ quanter.save_quantized_model(
    input_spec=inputs)
 ```

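-After export, the exported quantized inference model can be found under the `path` directory
+After export, the exported quantized inference model can be found under the `path` directory.
+
+The quantized inference model can be opened in `netron` for visual inspection. Like an ordinary FP32 inference model, it can be loaded for inference with PaddleLite and PaddleInference; see the `Inference Deployment` (`推理部署`) chapter for details.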
--- a/docs/zh_cn/quick_start/static/quant_aware_tutorial.md
+++ b/docs/zh_cn/quick_start/static/quant_aware_tutorial.md
@@ -156,7 +156,7 @@ test(val_quant_program)

 ## 6. Save the quantized model

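-The model obtained in ``4. Quantization`` with the ``slim.quant.quant_aware`` interface is only suitable for training. To get the model for final use, call the [slim.quant.convert](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/static/quant/quantization_api.html#convert) interface and then save the model with [fluid.io.save_inference_model](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/static/save_inference_model_cn.html#save-inference-model). The parameters of ``float_prog`` have data type float32 but values within the int8 range; once saved, the model can be loaded with fluid or paddle-lite, and paddle-lite first converts the type to int8 when using it. The parameters of ``int8_prog`` have data type int8; saving it shows the size of the quantized model, but it cannot be loaded for use.
+The model obtained in ``4. Quantization`` with the ``slim.quant.quant_aware`` interface is only suitable for training. To get the model for final use, call the [slim.quant.convert](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/static/quant/quantization_api.html#convert) interface and then save the model with [fluid.io.save_inference_model](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/static/save_inference_model_cn.html#save-inference-model). The parameters of ``float_prog`` have data type float32 but values within the int8 range; once saved, the model can be loaded and run with the Paddle executor, the PaddleInference predictor, and the Paddle-Lite predictor. The parameters of ``int8_prog`` have data type int8; saving it shows that the quantized model is smaller, but **this model cannot be used for inference deployment**.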


 ```python

--- a/docs/zh_cn/tutorials/quant/dygraph/quant_aware_training_tutorial.md
+++ b/docs/zh_cn/tutorials/quant/dygraph/quant_aware_training_tutorial.md
@@ -2,6 +2,7 @@

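 Online quantization models the effect of fixed-point quantization on the model during training: quantization nodes are inserted into the model's computation graph so that training accounts for the effect of quantization on accuracy, which reduces the quantization loss.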

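+> Note: dygraph quantization-aware training does not yet support models with control-flow logic. If a Warning appears during quantization-aware training, the static graph quantization-aware training feature is recommended instead.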

 PaddleSlim provides two quantization methods: `QAT quantization-aware training` and `PACT-improved quantization-aware training`.

@@ -64,6 +65,8 @@ quanter.save_quantized_model(
  input_spec=[paddle.static.InputSpec()])
 ```

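+The quantized inference model can be opened in `netron` for visual inspection. Like an ordinary FP32 inference model, it can be loaded for inference with PaddleLite and PaddleInference; see the `Inference Deployment` (`推理部署`) chapter for details.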
+
 ## PACT online quantization

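 The PACT method improves on ordinary online quantization. For quantization-sensitive models such as MobileNetV3, PACT can generally reduce the accuracy loss of the quantized model.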