Created by: yiicy
Add sgemv transA support on ARMv7 and ARMv8
Add sgemm c4 support on ARMv7 and ARMv8
Fix LRN param mismatch with Paddle
On the Kirin 990, sgemm c4 reaches roughly 85% of peak performance. It does not yet support transA or transB. This change prepares for c4 Winograd support.
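
For readers checking the sgemv transA path, a naive reference can help when comparing results. The sketch below assumes A is stored row-major with M rows and N columns, and that transA means computing y = A^T * x. The name `sgemv_trans_ref` and its signature are hypothetical and not taken from this PR; the actual ARM kernels are not shown here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference helper for illustration only (not part of the PR).
// Assumes A is row-major with M rows and N columns.
// Computes y = A^T * x, so x has M elements and y has N elements.
std::vector<float> sgemv_trans_ref(const std::vector<float>& A,
                                   const std::vector<float>& x,
                                   std::size_t M, std::size_t N) {
  std::vector<float> y(N, 0.f);
  for (std::size_t i = 0; i < M; ++i) {
    const float xi = x[i];
    const float* row = A.data() + i * N;  // start of row i of A
    // Row i of A contributes x[i] * A[i][j] to every output y[j].
    for (std::size_t j = 0; j < N; ++j) {
      y[j] += row[j] * xi;
    }
  }
  return y;
}
```

Walking A row by row keeps memory access contiguous even though the product uses the transpose, which is one common reason to treat transA as its own code path rather than materializing A^T first.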