Add fusion_conv_add_relu_int8_op with unit tests and a faster int8 GEMM. (!1320) · Merge request · PaddlePaddle / Paddle-Lite

Created by: wzzju

Makes gemm_int8 faster: the maximum speedup is about 2x (int8 vs. float). Adds gemm_with_bias and gemm_with_relu_bias. Using gemm_with_relu_bias, this merge request implements fusion_conv_add_relu_int8_op, whose inputs and outputs are both int8_t.
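For readers unfamiliar with the fused pattern, here is a minimal, unoptimized reference sketch of what a GEMM with bias and ReLU over int8 data computes. It is not the code from this merge request: the function name `gemm_with_relu_bias_ref`, the row-major layout, the per-row int32 bias, and the single float `scale` used to requantize the int32 accumulator back to int8_t are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference kernel (illustration only, not the MR's implementation).
// Computes C = requantize(relu(A * B + bias)) with
//   A: m x k, row-major int8
//   B: k x n, row-major int8
//   bias: one int32 value per row of A (assumed: one per output channel)
//   scale: assumed float factor mapping the int32 accumulator to the int8 range
std::vector<int8_t> gemm_with_relu_bias_ref(const std::vector<int8_t>& a,
                                            const std::vector<int8_t>& b,
                                            const std::vector<int32_t>& bias,
                                            int m, int n, int k, float scale) {
  assert(a.size() == static_cast<std::size_t>(m) * k);
  assert(b.size() == static_cast<std::size_t>(k) * n);
  assert(bias.size() == static_cast<std::size_t>(m));

  std::vector<int8_t> c(static_cast<std::size_t>(m) * n);
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      // Accumulate in int32 so int8 products cannot overflow.
      int32_t acc = bias[i];
      for (int p = 0; p < k; ++p) {
        acc += static_cast<int32_t>(a[i * k + p]) *
               static_cast<int32_t>(b[p * n + j]);
      }
      // Fused ReLU: clamp negatives to zero before requantizing.
      acc = std::max<int32_t>(acc, 0);
      // Requantize and saturate at the int8 upper bound.
      float scaled = std::round(static_cast<float>(acc) * scale);
      c[i * n + j] = static_cast<int8_t>(std::min(scaled, 127.0f));
    }
  }
  return c;
}
```

Fusing the bias add and ReLU into the GEMM epilogue means the int32 accumulator is consumed while still in registers, so the output can be written directly as int8_t without a separate pass over an intermediate buffer.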