Commit edfde3ca authored by 卢旭辉

Merge branch 'mix' into 'master'

Support mixing quantization-aware training and post-quantization ranges

See merge request deep-computing/mace!1251
@@ -6,9 +6,9 @@ MACE supports two kinds of quantization mechanisms, i.e.,
* **Quantization-aware training (Recommend)**
  After pre-training model using float point, insert simulated quantization operations into the model. Fine tune the new model.
-  Refer to `Tensorflow quantization-aware training <https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize>`__.
+  Refer to `Tensorflow quantization-aware training <https://github.com/tensorflow/tensorflow/tree/r1.15/tensorflow/contrib/quantize>`__.
-* **Post training quantization**
+* **Post-training quantization**
  After pre-training model using float point, estimate output range of each activation layer using sample inputs.
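The "simulated quantization operations" mentioned above are fake-quant nodes that quantize and immediately dequantize each tensor, so the fine-tuning step sees quantization error. A minimal sketch of that round trip, assuming the common asymmetric 8-bit scheme (not the exact TensorFlow/MACE implementation):

```python
import numpy as np

def fake_quantize(x, minval, maxval, num_bits=8):
    # Map the float range [minval, maxval] onto [0, 2^bits - 1], round,
    # then map back to float so downstream layers see the rounding error.
    qmax = 2 ** num_bits - 1
    scale = (maxval - minval) / qmax
    zero_point = int(round(-minval / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale
```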
@@ -28,7 +28,7 @@ models, e.g., MobileNet. The only thing you need to make it run using MACE is to
2. `quantize`: set `quantize` to be 1.

-Post training quantization
+Post-training quantization
---------------------------
MACE supports post-training quantization if you want to take a chance to quantize model directly without fine tuning.
This method requires developer to calculate tensor range of each activation layer statistically using sample inputs.
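Estimating a tensor range "statistically using sample inputs" means running representative inputs through the float model and tracking each activation's observed minimum and maximum. A hedged sketch, where `run_float_model` is a hypothetical callable (not a MACE API) returning a dict of named activations for one input:

```python
def collect_activation_ranges(run_float_model, sample_inputs):
    # Accumulate the observed (min, max) of every activation tensor
    # over all sample inputs.
    ranges = {}
    for sample in sample_inputs:
        for name, tensor in run_float_model(sample).items():
            lo, hi = float(tensor.min()), float(tensor.max())
            if name in ranges:
                prev_lo, prev_hi = ranges[name]
                lo, hi = min(lo, prev_lo), max(hi, prev_hi)
            ranges[name] = (lo, hi)
    return ranges
```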
@@ -84,6 +84,16 @@ MACE provides tools to do statistics with following steps(using `inception-v3` f
`quantize` to `1` and `quantize_range_file` to the overall_range file path in yaml config).

+Mixing usage
+---------------------------
+As `quantization-aware training` is still evolving, there are some operations that are not supported,
+which leaves some activation layers without tensor range. In this case, `post-training quantization`
+can be used to calculate these missing ranges. To mix the usage, just get a `quantization-aware training`
+model and then go through all the steps of `post-training quantization`. MACE will use the tensor ranges
+from the `overall_range` file of `post-training quantization` if the ranges are missing from the
+`quantization-aware training` model.
+
Supported devices
-----------------
MACE supports running quantized models on ARM CPU and other acceleration devices, e.g., Qualcomm Hexagon DSP, MediaTek APU.
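The precedence rule in the new "Mixing usage" section reduces to: ranges coming from quantization-aware training win, and the `overall_range` file only fills in the tensors that have no range yet. A minimal sketch of that merge, with plain dicts standing in for MACE's internal quantize info:

```python
def merge_ranges(qat_ranges, post_training_ranges):
    # Start from the post-training statistics, then overwrite with
    # ranges learned during quantization-aware training.
    merged = dict(post_training_ranges)
    merged.update(qat_ranges)
    return merged

# 'conv1' keeps its QAT range; 'new_op' falls back to the overall_range file.
qat = {"conv1": (-1.0, 1.0)}
post = {"conv1": (-0.9, 1.1), "new_op": (0.0, 6.0)}
assert merge_ranges(qat, post) == {"conv1": (-1.0, 1.0), "new_op": (0.0, 6.0)}
```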
@@ -1758,20 +1758,14 @@ class Transformer(base_converter.ConverterInterface):
        quantize_info.zero_point = info.zero_point

    def transform_fake_quantize(self):
-        if not self._option.quantize:
-            return False
        # Quantize info from fixpoint fine tune
        print("Transform fake quantize")
-        range_file = self._option.quantize_range_file
-        if range_file:
-            return
        net = self._model
        for op in net.op:
            if op.type == 'FakeQuantWithMinMaxVars' or \
                    op.type == 'FakeQuantWithMinMaxArgs':
-                if op.input[0] not in self._consts:
+                if self._option.quantize and op.input[0] not in self._consts:
                    producer_op = self._producer[op.input[0]]
                    minval = ConverterUtil.get_arg(op, 'min').f
                    maxval = ConverterUtil.get_arg(op, 'max').f
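The minval/maxval read from each FakeQuant op are later turned into a scale and zero point for 8-bit quantization. How MACE does this exactly is outside this hunk; the standard asymmetric conversion looks roughly like this:

```python
def range_to_scale_zero_point(minval, maxval, num_bits=8):
    # Extend the range to include zero so the zero point is representable;
    # real implementations may additionally nudge the range slightly.
    minval, maxval = min(minval, 0.0), max(maxval, 0.0)
    qmax = 2 ** num_bits - 1
    scale = (maxval - minval) / qmax
    zero_point = int(round(-minval / scale)) if scale > 0 else 0
    return scale, zero_point
```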
@@ -1842,6 +1836,7 @@ class Transformer(base_converter.ConverterInterface):
        range_file = self._option.quantize_range_file
        if range_file:
            print("Add quantize tensor range")
+            post_quantize_info = {}
            with open(range_file) as f:
                for line in f:
                    tensor_name, minmax = line.split("@@")[:2]
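The `split("@@")` above implies that the `overall_range` file holds one record per line, with the tensor name and its range separated by `@@`. A hedged reader for such a file; the comma between the min and max values is an assumption of this sketch:

```python
def load_overall_range(path):
    # Parse lines of the form "tensor_name@@min,max" into a dict.
    ranges = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            tensor_name, minmax = line.split("@@")[:2]
            min_val, max_val = (float(v) for v in minmax.split(",")[:2])
            ranges[tensor_name] = (min_val, max_val)
    return ranges
```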
@@ -1856,17 +1851,21 @@ class Transformer(base_converter.ConverterInterface):
                    activation_info.maxval = max_val
                    activation_info.scale = scale
                    activation_info.zero_point = zero
-                    self._quantize_activation_info[tensor_name] = activation_info  # noqa
+                    if tensor_name not in self._quantize_activation_info:
+                        post_quantize_info[tensor_name] = activation_info

            for op in self._model.op:
                if op.name.find(MaceKeyword.mace_output_node_name) >= 0:
                    continue
                for output in op.output:
-                    mace_check(output in self._quantize_activation_info,
-                               "%s does not have quantize activation info"
-                               % op)
-                    op.quantize_info.extend([
-                        self._quantize_activation_info[output]])
+                    # Prefer quantize info from quantization-aware training
+                    if output not in self._quantize_activation_info:
+                        mace_check(output in post_quantize_info,
+                                   "%s does not have quantize activation info"
+                                   % op)
+                        op.quantize_info.extend([post_quantize_info[output]])
+                        self._quantize_activation_info[output] = \
+                            post_quantize_info[output]

        if not self._option.quantize:
            return False
@@ -1979,6 +1978,7 @@ class Transformer(base_converter.ConverterInterface):
                maxval = producer_op0.quantize_info[0].maxval \
                    - producer_op1.quantize_info[0].minval
            else:
+                print(op)
                mace_check(False, "Quantized Elementwise only support:"
                           " SUM and SUB without ranges now.")
            quantize_info = \
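For reference, this branch derives the output range of a quantized Elementwise op from its producers' ranges by interval arithmetic: for SUM the bounds simply add, and for SUB (as the visible `maxval` line shows) the maximum is `max0 - min1`, with the minimum correspondingly `min0 - max1`. A small illustrative check:

```python
def elementwise_output_range(range0, range1, op_type):
    # Interval arithmetic for a + b and a - b given input value ranges.
    min0, max0 = range0
    min1, max1 = range1
    if op_type == "SUM":
        return min0 + min1, max0 + max1
    if op_type == "SUB":
        return min0 - max1, max0 - min1
    raise ValueError("only SUM and SUB are handled without explicit ranges")

# Example: a in [-1, 2] and b in [0, 3]  =>  a - b in [-4, 2].
assert elementwise_output_range((-1, 2), (0, 3), "SUB") == (-4, 2)
```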