Created by: Sand3r-
PR introduces changes that enable int8 FC to correctly perform computations in ernie topology.
Terminology used in description:
nhwc-like or nchw-like means any format that is a descendant of these formats, such as ncw or nwc for 3-dim data blobs, nhwc or nchw for 4-dim ones, and so on.
Description:
Background:
The nhwc data layout has been used in all int8 computations so far, and the quantize op had it hardcoded to reorder to an nhwc-like format during quantization.
Problem:
Unfortunately, because of how this format is laid out in memory, it could not be used correctly when the fully-connected op used the in_num_col_dims parameter to merge some of the leading dimensions, since that merge assumes an nchw-like format.
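The problem can be reproduced without any framework code. The sketch below is a pure-Python illustration, not Paddle code: the helper names (linearize, flatten_rows) and the tensor shape are made up for the example. It lays out the same logical (n, c, w) tensor in ncw and nwc memory order, then reinterprets each buffer as an [N*C, W] matrix, which is what merging the first two dimensions amounts to.

```python
from itertools import product

def linearize(shape, order, value_at):
    """Return the flat memory buffer of a logical (n, c, w) tensor.

    `order` lists the logical axes from slowest to fastest varying,
    e.g. (0, 1, 2) is ncw and (0, 2, 1) is nwc.
    """
    dims = [shape[axis] for axis in order]
    buf = []
    for idx in product(*(range(d) for d in dims)):
        logical = [0] * len(shape)
        for axis, i in zip(order, idx):
            logical[axis] = i
        buf.append(value_at(*logical))
    return buf

def flatten_rows(buf, rows, cols):
    """Reinterpret a flat buffer as a rows x cols matrix without reordering."""
    return [buf[r * cols:(r + 1) * cols] for r in range(rows)]

N, C, W = 2, 3, 4

def value_at(n, c, w):
    # Encode the logical index into the value so rows are easy to read.
    return 100 * n + 10 * c + w

ncw = linearize((N, C, W), (0, 1, 2), value_at)
nwc = linearize((N, C, W), (0, 2, 1), value_at)

# What merging the first two dims should produce: one row per (n, c) pair.
expected = [[value_at(n, c, w) for w in range(W)]
            for n in range(N) for c in range(C)]

rows_ncw = flatten_rows(ncw, N * C, W)
rows_nwc = flatten_rows(nwc, N * C, W)

print(rows_ncw == expected)  # ncw buffer flattens to the expected rows
print(rows_nwc[0])           # nwc buffer mixes channels into the first row
```

With the ncw buffer every row holds the W values of one (n, c) pair, while the nwc buffer interleaves channels, so the FC would multiply the wrong elements.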
Solution:
To counter this, the PR makes it possible to choose which format (or, in fact, layout) quantize reorders to during quantization. The possible options are NCHW or NHWC, selected by setting its output_format parameter.
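As a rough illustration of what such a parameter controls, here is a toy stand-in for a quantize op. It is not the actual operator: the function signature, the NHWC default, and the simple round-and-clip scheme are assumptions made for this sketch, and only the output_format name and its two values come from the description above.

```python
def quantize(data, scale, output_format="NHWC"):
    """Toy quantize: scale to int8 range and emit the chosen memory order.

    `data` is a nested list indexed as data[n][c][h][w] (logical NCHW).
    Returns a flat list of ints laid out as requested by `output_format`.
    """
    if output_format not in ("NCHW", "NHWC"):
        raise ValueError(f"unsupported output_format: {output_format}")

    N, C = len(data), len(data[0])
    H, W = len(data[0][0]), len(data[0][0][0])

    def q(x):
        # Round to nearest integer and clip to the signed 8-bit range.
        return max(-128, min(127, round(x * scale)))

    if output_format == "NCHW":
        return [q(data[n][c][h][w])
                for n in range(N) for c in range(C)
                for h in range(H) for w in range(W)]
    # NHWC: channels become the fastest-varying dimension.
    return [q(data[n][c][h][w])
            for n in range(N) for h in range(H)
            for w in range(W) for c in range(C)]

# One image, two channels, 1x2 spatial size.
sample = [[[[1.0, 2.0]], [[3.0, 4.0]]]]
print(quantize(sample, 1.0, output_format="NCHW"))
print(quantize(sample, 1.0, output_format="NHWC"))
```

The quantized values are identical in both cases; only their order in memory differs, which is what matters to an FC that flattens leading dimensions.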
To make further use of this new feature, cpu_quantize_pass now accepts an optional attribute data_layout, which sets the desired data format for all quantize ops that will be added to the graph.