Created by: luotao1
Speed up the im2col function:
- Reduce the amount of arithmetic. For example,
int im_col_idx = w * stride[1] - padding[1] + w_offset * dilation[1];
used to need channels_col * col_height * col_width * (2 mul + 1 sum + 1 minus) operations;
it now needs channels_col * col_height * col_width * (1 sum) + channels_col * (1 sum + 2 minus).
- Use
flag = im_row_idx < 0 || im_row_idx >= im_height || im_col_idx < 0 || im_col_idx >= im_width;
to increase the hit rate of the if condition.
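
For reviewers who want to see the idea in isolation, here is a minimal standalone sketch of the two points above. It is not the code in this PR: `Im2ColParams`, `OutSize`, `Im2ColNaive` and `Im2ColHoisted` are hypothetical names, the layout is assumed to be a single CHW image, and the exact operation count of the hoisted version may differ from the numbers quoted above. The naive version recomputes the full index expression in the innermost loop; the hoisted version computes the per-channel start once and only adds the stride inside the loops, with one combined out-of-bounds flag.

```cpp
#include <cassert>
#include <vector>

// Hypothetical parameter bundle for this sketch; not the Paddle API.
struct Im2ColParams {
  int channels, im_height, im_width;
  int filter_height, filter_width;
  int stride[2], padding[2], dilation[2];  // [0] = height, [1] = width
};

// Standard convolution output size along one dimension.
inline int OutSize(int in, int filter, int stride, int padding, int dilation) {
  return (in + 2 * padding - (dilation * (filter - 1) + 1)) / stride + 1;
}

// Baseline: every element recomputes both image indices from scratch.
std::vector<float> Im2ColNaive(const std::vector<float>& im,
                               const Im2ColParams& p) {
  assert(static_cast<int>(im.size()) == p.channels * p.im_height * p.im_width);
  const int col_height = OutSize(p.im_height, p.filter_height, p.stride[0],
                                 p.padding[0], p.dilation[0]);
  const int col_width = OutSize(p.im_width, p.filter_width, p.stride[1],
                                p.padding[1], p.dilation[1]);
  const int channels_col = p.channels * p.filter_height * p.filter_width;
  std::vector<float> col(channels_col * col_height * col_width);
  for (int c = 0; c < channels_col; ++c) {
    const int w_offset = c % p.filter_width;
    const int h_offset = (c / p.filter_width) % p.filter_height;
    const int c_im = c / (p.filter_width * p.filter_height);
    for (int h = 0; h < col_height; ++h) {
      for (int w = 0; w < col_width; ++w) {
        // 2 mul + 1 sum + 1 minus per element for each index.
        const int im_row_idx =
            h * p.stride[0] - p.padding[0] + h_offset * p.dilation[0];
        const int im_col_idx =
            w * p.stride[1] - p.padding[1] + w_offset * p.dilation[1];
        const int col_idx = (c * col_height + h) * col_width + w;
        const int im_idx =
            (c_im * p.im_height + im_row_idx) * p.im_width + im_col_idx;
        col[col_idx] = (im_row_idx < 0 || im_row_idx >= p.im_height ||
                        im_col_idx < 0 || im_col_idx >= p.im_width)
                           ? 0.0f
                           : im[im_idx];
      }
    }
  }
  return col;
}

// Hoisted: start offsets are computed once per column channel, and the
// inner loops only add the stride (1 sum per element).
std::vector<float> Im2ColHoisted(const std::vector<float>& im,
                                 const Im2ColParams& p) {
  assert(static_cast<int>(im.size()) == p.channels * p.im_height * p.im_width);
  const int col_height = OutSize(p.im_height, p.filter_height, p.stride[0],
                                 p.padding[0], p.dilation[0]);
  const int col_width = OutSize(p.im_width, p.filter_width, p.stride[1],
                                p.padding[1], p.dilation[1]);
  const int channels_col = p.channels * p.filter_height * p.filter_width;
  std::vector<float> col(channels_col * col_height * col_width);
  for (int c = 0; c < channels_col; ++c) {
    const int w_offset = c % p.filter_width;
    const int h_offset = (c / p.filter_width) % p.filter_height;
    const int c_im = c / (p.filter_width * p.filter_height);
    const int row_start = h_offset * p.dilation[0] - p.padding[0];
    const int col_start = w_offset * p.dilation[1] - p.padding[1];
    int im_row_idx = row_start;
    for (int h = 0; h < col_height; ++h, im_row_idx += p.stride[0]) {
      int im_col_idx = col_start;
      for (int w = 0; w < col_width; ++w, im_col_idx += p.stride[1]) {
        // Single combined flag: true means the element lies in the padding.
        const bool flag = im_row_idx < 0 || im_row_idx >= p.im_height ||
                          im_col_idx < 0 || im_col_idx >= p.im_width;
        const int col_idx = (c * col_height + h) * col_width + w;
        const int im_idx =
            (c_im * p.im_height + im_row_idx) * p.im_width + im_col_idx;
        // im_idx is only dereferenced when flag is false.
        col[col_idx] = flag ? 0.0f : im[im_idx];
      }
    }
  }
  return col;
}
```

Both functions should return identical buffers for the same input, so comparing their outputs is a quick sanity check when experimenting with this kind of loop rewrite.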
#10685 (closed) OCR CRNN_CTC model:
- conv2d time: 4.83 ms (before) -> 4.41 ms (after), speedup: 8.7%
- whole inference time: 58.6 ms (before) -> 53.2 ms (after), speedup: 9.2%