Unverified commit 35431ce1, authored by cc, committed by GitHub

post_quant_static: remove the cache params, add optimize_model params (#471)

* post_quant_static: remove the cache params, add optimize_model params, test=develop
Co-authored-by: Bai Yifan <me@ethanbai.com>
Parent 13017580
@@ -97,7 +97,7 @@ quant_post_dynamic
quant_post_static
---------------
- .. py:function:: paddleslim.quant.quant_post_static(executor,model_dir, quantize_model_path, batch_generator=None, sample_generator=None, model_filename=None, params_filename=None, save_model_filename='__model__', save_params_filename='__params__', batch_size=16, batch_nums=None, scope=None, algo='KL', quantizable_op_type=["conv2d","depthwise_conv2d","mul"], is_full_quantize=False, weight_bits=8, activation_bits=8, activation_quantize_type='range_abs_max', weight_quantize_type='channel_wise_abs_max', is_use_cache_file=False, cache_dir="./temp_post_training")
+ .. py:function:: paddleslim.quant.quant_post_static(executor,model_dir, quantize_model_path, batch_generator=None, sample_generator=None, model_filename=None, params_filename=None, save_model_filename='__model__', save_params_filename='__params__', batch_size=16, batch_nums=None, scope=None, algo='KL', quantizable_op_type=["conv2d","depthwise_conv2d","mul"], is_full_quantize=False, weight_bits=8, activation_bits=8, activation_quantize_type='range_abs_max', weight_quantize_type='channel_wise_abs_max', optimize_model=False)
`Source code <https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/quant/quanter.py>`_
@@ -146,9 +146,7 @@ quant_post_static
- **activation_bits(int)** - Number of bits used to quantize activations. Default: 8.
- **weight_quantize_type(str)** - Quantization type for weights, either `abs_max` or `channel_wise_abs_max`; `channel_wise_abs_max` usually gives higher accuracy for the quantized model.
- **activation_quantize_type(str)** - Quantization type for activations, either `range_abs_max` or `moving_average_abs_max`. This setting does not affect the algorithm used to compute the scales; it only determines which operator is used when the model is saved.
- - **is_use_cache_file(bool)** - Whether to store intermediate results on disk. If False, intermediate results are kept in memory. Default: False.
- - **cache_dir(str)** - If ``'is_use_cache_file'`` is True, intermediate results are stored under the path set here. Default: ``./temp_post_training``.
+ - **optimize_model(bool)** - Whether to run fuse passes on the model before quantization. This may only be set to True when the executor runs on CPU; the pass then fuses `conv2d/depthwise_conv2d/conv2d_transpose + batch_norm` (see the usage sketch after this hunk).

**Returns**

None.
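For context (this example is not part of the patch), a minimal sketch of calling the updated API with the new flag; the model directory and the calibration reader below are placeholders, and a Paddle 2.x static-graph setup is assumed:

```python
import numpy as np
import paddle
import paddleslim

paddle.enable_static()

def sample_generator():
    # Placeholder calibration reader: yields one fake image-shaped sample per step.
    for _ in range(32):
        yield [np.random.random((3, 224, 224)).astype('float32')]

place = paddle.CPUPlace()               # optimize_model=True requires a CPU executor
exe = paddle.static.Executor(place)

paddleslim.quant.quant_post_static(
    executor=exe,
    model_dir='./mobilenet_v1_infer',   # hypothetical inference-model directory
    quantize_model_path='./quant_model',
    sample_generator=sample_generator,
    batch_size=16,
    batch_nums=10,
    optimize_model=True)                # fuse conv + batch_norm before quantizing
```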
@@ -316,7 +314,7 @@ convert
    'mul', 'conv2d', 'pool2d', 'depthwise_conv2d', 'elementwise_add',
    'leaky_relu'
]
- TRANSFORM_PASS_OP_TYPES = ['conv2d', 'depthwise_conv2d', 'mul']
+ TRANSFORM_PASS_OP_TYPES = ['conv2d', 'depthwise_conv2d', 'mul', 'conv2d_transpose']
QUANT_DEQUANT_PASS_OP_TYPES = [
    "pool2d", "elementwise_add", "concat", "softmax", "argmax", "transpose",
...
@@ -321,6 +321,7 @@ def quant_post_static(
        activation_bits=8,
        activation_quantize_type='range_abs_max',
        weight_quantize_type='channel_wise_abs_max',
+       optimize_model=False,
        is_use_cache_file=False,
        cache_dir="./temp_post_training"):
    """
@@ -377,9 +378,11 @@ def quant_post_static(
            the model accuracy is usually higher when using 'channel_wise_abs_max'.
        is_full_quantize(bool): if True, apply quantization to all supported quantizable op type.
            If False, only apply quantization to the input quantizable_op_type. Default is False.
-        is_use_cache_file(bool): If False, all temp data will be saved in memory. If True,
-            all temp data will be saved to disk. Default: False.
-        cache_dir(str): When 'is_use_cache_file' is True, temp data will be saved in 'cache_dir'. Default is './temp_post_training'.
+        optimize_model(bool, optional): If True, apply some optimization passes to the model
+            before quantization. For now the executor place must be CPU, and the supported
+            optimization is fusing batch_norm into the preceding convs. Default is False.
+        is_use_cache_file(bool): This param is deprecated.
+        cache_dir(str): This param is deprecated.
    Returns:
        None
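As an aside on what "fusing batch_norm into convs" means numerically, here is an illustrative sketch only (not the pass Paddle actually runs): an inference-time batch norm that follows a conv can be folded into the conv's weights and bias.

```python
import numpy as np

def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=1e-5):
    # Illustrative BN folding: weight has shape (out_channels, in_ch, kh, kw);
    # gamma/beta/mean/var are per-output-channel BN parameters and statistics.
    scale = gamma / np.sqrt(var + eps)                   # per-channel rescaling factor
    fused_weight = weight * scale.reshape(-1, 1, 1, 1)   # scale each output channel
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias
```

After folding, conv(x, fused_weight) + fused_bias reproduces batch_norm(conv(x, weight) + bias) at inference time, which is why the pass is applied before post-training quantization rather than during training.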
@@ -401,8 +404,7 @@ def quant_post_static(
        activation_bits=activation_bits,
        activation_quantize_type=activation_quantize_type,
        weight_quantize_type=weight_quantize_type,
-       is_use_cache_file=is_use_cache_file,
-       cache_dir=cache_dir)
+       optimize_model=optimize_model)
    post_training_quantization.quantize()
    post_training_quantization.save_quantized_model(
        quantize_model_path,
...