Unverified commit 35431ce1, authored by cc, committed by GitHub

post_quant_static: remove the cache params, add optimize_model params (#471)

* post_quant_static: remove the cache params, add optimize_model params, test=develop
Co-authored-by: Bai Yifan <me@ethanbai.com>
Parent 13017580
@@ -97,7 +97,7 @@ quant_post_dynamic
quant_post_static
---------------
- .. py:function:: paddleslim.quant.quant_post_static(executor,model_dir, quantize_model_path, batch_generator=None, sample_generator=None, model_filename=None, params_filename=None, save_model_filename='__model__', save_params_filename='__params__', batch_size=16, batch_nums=None, scope=None, algo='KL', quantizable_op_type=["conv2d","depthwise_conv2d","mul"], is_full_quantize=False, weight_bits=8, activation_bits=8, activation_quantize_type='range_abs_max', weight_quantize_type='channel_wise_abs_max', is_use_cache_file=False, cache_dir="./temp_post_training")
+ .. py:function:: paddleslim.quant.quant_post_static(executor,model_dir, quantize_model_path, batch_generator=None, sample_generator=None, model_filename=None, params_filename=None, save_model_filename='__model__', save_params_filename='__params__', batch_size=16, batch_nums=None, scope=None, algo='KL', quantizable_op_type=["conv2d","depthwise_conv2d","mul"], is_full_quantize=False, weight_bits=8, activation_bits=8, activation_quantize_type='range_abs_max', weight_quantize_type='channel_wise_abs_max', optimize_model=False)
`Source code <https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/quant/quanter.py>`_
@@ -146,9 +146,7 @@ quant_post_static
- **activation_bits(int)** - Number of bits used to quantize activations. Default: 8.
- **weight_quantize_type(str)** - Quantization type for weights, either `abs_max` or `channel_wise_abs_max`; `channel_wise_abs_max` usually gives higher accuracy for the quantized model.
- **activation_quantize_type(str)** - Quantization type for activations, either `range_abs_max` or `moving_average_abs_max`. This setting does not affect the algorithm used to compute the scales; it only determines which operator is used when the model is saved.
- - **is_use_cache_file(bool)** - Whether to store intermediate results on disk. If False, intermediate results are kept in memory. Default: False.
- - **cache_dir(str)** - If ``'is_use_cache_file'`` is True, intermediate results are stored under the path set here. Default: ``./temp_post_training``.
+ - **optimize_model(bool)** - Whether to run fuse passes on the model before quantization. This may only be set to True when the executor runs on CPU; the pass then fuses `conv2d/depthwise_conv2d/conv2d_transpose + batch_norm` (see the usage sketch after this hunk).

**Returns**

None.
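For context (this example is not part of the patch), a minimal sketch of calling the updated API with the new flag; the model directory and the calibration reader below are placeholders, and a Paddle 2.x static-graph setup is assumed:

```python
import numpy as np
import paddle
import paddleslim

paddle.enable_static()

def sample_generator():
    # Placeholder calibration reader: yields one fake image-shaped sample per step.
    for _ in range(32):
        yield [np.random.random((3, 224, 224)).astype('float32')]

place = paddle.CPUPlace()               # optimize_model=True requires a CPU executor
exe = paddle.static.Executor(place)

paddleslim.quant.quant_post_static(
    executor=exe,
    model_dir='./mobilenet_v1_infer',   # hypothetical inference-model directory
    quantize_model_path='./quant_model',
    sample_generator=sample_generator,
    batch_size=16,
    batch_nums=10,
    optimize_model=True)                # fuse conv + batch_norm before quantizing
```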
@@ -316,7 +314,7 @@ convert
    'mul', 'conv2d', 'pool2d', 'depthwise_conv2d', 'elementwise_add',
    'leaky_relu'
]
- TRANSFORM_PASS_OP_TYPES = ['conv2d', 'depthwise_conv2d', 'mul']
+ TRANSFORM_PASS_OP_TYPES = ['conv2d', 'depthwise_conv2d', 'mul', 'conv2d_transpose']
QUANT_DEQUANT_PASS_OP_TYPES = [
    "pool2d", "elementwise_add", "concat", "softmax", "argmax", "transpose",
...
@@ -321,6 +321,7 @@ def quant_post_static(
        activation_bits=8,
        activation_quantize_type='range_abs_max',
        weight_quantize_type='channel_wise_abs_max',
+       optimize_model=False,
        is_use_cache_file=False,
        cache_dir="./temp_post_training"):
    """
@@ -377,9 +378,11 @@ def quant_post_static(
            the model accuracy is usually higher when using 'channel_wise_abs_max'.
        is_full_quantize(bool): if True, apply quantization to all supported quantizable op type.
            If False, only apply quantization to the input quantizable_op_type. Default is False.
-        is_use_cache_file(bool): If False, all temp data will be saved in memory. If True,
-            all temp data will be saved to disk. Default: False.
-        cache_dir(str): When 'is_use_cache_file' is True, temp data will be saved in 'cache_dir'. Default is './temp_post_training'.
+        optimize_model(bool, optional): If True, apply some optimization passes to the model
+            before quantization. For now the executor place must be CPU, and the supported
+            optimization is fusing batch_norm into the preceding convs. Default is False.
+        is_use_cache_file(bool): This param is deprecated.
+        cache_dir(str): This param is deprecated.
    Returns:
        None
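As an aside on what "fusing batch_norm into convs" means numerically, here is an illustrative sketch only (not the pass Paddle actually runs): an inference-time batch norm that follows a conv can be folded into the conv's weights and bias.

```python
import numpy as np

def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=1e-5):
    # Illustrative BN folding: weight has shape (out_channels, in_ch, kh, kw);
    # gamma/beta/mean/var are per-output-channel BN parameters and statistics.
    scale = gamma / np.sqrt(var + eps)                   # per-channel rescaling factor
    fused_weight = weight * scale.reshape(-1, 1, 1, 1)   # scale each output channel
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias
```

After folding, conv(x, fused_weight) + fused_bias reproduces batch_norm(conv(x, weight) + bias) at inference time, which is why the pass is applied before post-training quantization rather than during training.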
@@ -401,8 +404,7 @@ def quant_post_static(
        activation_bits=activation_bits,
        activation_quantize_type=activation_quantize_type,
        weight_quantize_type=weight_quantize_type,
-       is_use_cache_file=is_use_cache_file,
-       cache_dir=cache_dir)
+       optimize_model=optimize_model)
    post_training_quantization.quantize()
    post_training_quantization.save_quantized_model(
        quantize_model_path,
...