fuse batch normalization
Created by: luotao1
Motivation
The batch normalization that follows a convolution or fully connected layer can be folded into that layer. Doing so gives a forward acceleration (about 30% on MobileNet) during inference.
Implementation
There are two simple examples:
- conv without bias:
  `conv -> batch_norm -> any_other_op`
  should be changed to
  `conv -> elementwise_add (bias) -> any_other_op`
- conv with bias:
  `conv -> elementwise_add (bias) -> batch_norm -> any_other_op`
  should be changed to
  `conv -> elementwise_add (bias) -> any_other_op`
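Both rewrites are valid because the `batch_norm` statistics can be folded into the preceding `conv`'s weight and into a per-channel bias handled by `elementwise_add`. Below is a minimal NumPy sketch of the parameter folding; `fold_bn_into_conv` and its argument layout are illustrative, not PaddlePaddle APIs.

```python
import numpy as np

def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch_norm parameters into a conv's weight and bias.

    weight: conv filter of shape (out_channels, in_channels, kh, kw)
    bias:   conv bias of shape (out_channels,), or None for a conv without bias
    gamma, beta, mean, var: batch_norm scale, shift, running mean and variance,
                            each of shape (out_channels,)
    Returns (fused_weight, fused_bias) such that
        batch_norm(conv(x)) == conv'(x) + fused_bias   # per-channel elementwise_add
    """
    if bias is None:
        bias = np.zeros_like(mean)
    scale = gamma / np.sqrt(var + eps)            # per-output-channel scale
    fused_weight = weight * scale.reshape(-1, 1, 1, 1)
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias
```

For a conv with bias, the existing bias is passed in and simply replaced by `fused_bias`; for a conv without bias, `fused_bias` becomes the input of the newly inserted `elementwise_add`.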
Thus, there are three to four stages when fusing batch normalization:
- insert an `elementwise_add` op whose input is the output of `conv` (this stage is only needed for a conv without bias);
- fuse the `batch_norm`'s parameters into `conv` and `elementwise_add`;
- remove the `batch_norm` op and those of its variables that are not used by any other op;
- adjust the input of `any_other_op` to be the output of `elementwise_add`, and remove unused variables again.
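To make the stages concrete, here is a sketch of the whole pass over a toy list-of-ops program; the `Op` class, the input-slot conventions, and `rewire` are illustrative stand-ins rather than fluid's real `Program`/`Block` API, and only the conv-without-bias pattern is handled to keep it short (it reuses `fold_bn_into_conv` from the sketch above).

```python
class Op(object):
    """A toy op: a type plus ordered lists of input/output variable names."""
    def __init__(self, type, inputs, outputs):
        self.type, self.inputs, self.outputs = type, inputs, outputs

def rewire(ops, old, new):
    """Stage 4: any op that read `old` now reads `new` instead."""
    for op in ops:
        op.inputs = [new if name == old else name for name in op.inputs]

def fuse_batch_norm(ops, params):
    """ops: toy ops in topological order; params: dict of name -> numpy array.

    Toy conventions: conv2d inputs are [input, weight]; batch_norm inputs are
    [input, gamma, beta, mean, var].
    """
    fused, i = [], 0
    while i < len(ops):
        op = ops[i]
        if op.type == "conv2d" and i + 1 < len(ops) and ops[i + 1].type == "batch_norm":
            bn = ops[i + 1]
            # Stage 1: insert an elementwise_add behind conv (conv without bias).
            bias_name = op.outputs[0] + ".fused_bias"
            add = Op("elementwise_add",
                     inputs=[op.outputs[0], bias_name],
                     outputs=[op.outputs[0] + ".add_out"])
            # Stage 2: fold batch_norm parameters into the conv weight and the bias.
            w, b = fold_bn_into_conv(params[op.inputs[1]], None,
                                     params[bn.inputs[1]], params[bn.inputs[2]],
                                     params[bn.inputs[3]], params[bn.inputs[4]])
            params[op.inputs[1]], params[bias_name] = w, b
            # Stage 3: drop the batch_norm op; its now-unused variables (gamma,
            # beta, mean, var and bn's output) would be removed afterwards.
            # Stage 4: downstream ops read elementwise_add's output instead.
            rewire(ops[i + 2:], old=bn.outputs[0], new=add.outputs[0])
            fused += [op, add]
            i += 2
        else:
            fused.append(op)
            i += 1
    return fused
```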
V2 implementation
See #6704 by @NHZlX and a demo in the mobile repo.
fluid implementation
We plan to use an inference transpiler to implement the batch normalization fusion. Before building this transpiler, we need to implement:
- an `insert_op` method for stage 1: #9747
- a `remove_op` method for stage 3: #9384 #9600 #9816
- a `remove_var` method for stage 4: #9607
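Once these methods and the transpiler are in place, the expected end-to-end usage would look roughly like the sketch below. `InferenceTranspiler` and its `transpile` signature describe the planned interface and are assumptions here, and the model path and input shape are placeholders.

```python
import numpy as np
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)

# Load a saved inference model (the directory is a placeholder).
inference_program, feed_names, fetch_targets = fluid.io.load_inference_model(
    "models/mobilenet", exe)

# Assumed planned interface: rewrite the program in place, fusing
# conv + batch_norm via the insert_op / remove_op / remove_var methods above.
t = fluid.InferenceTranspiler()
t.transpile(inference_program, place)

# Run the fused program as usual.
image = np.random.random((1, 3, 224, 224)).astype("float32")
results = exe.run(inference_program,
                  feed={feed_names[0]: image},
                  fetch_list=fetch_targets)
```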