Created by: chenjiaoAngel
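Benchmark case: in = [1, 256, 10, 10], out = [1, 512, 10, 10], ksize = [3, 3], stride = [1, 1], pad = [1, 1, 1, 1], group = 1
conv3x3s1_direct: about 14 ms
gemm: about 7 ms
Add a check: if cin * cout < 4 * hin * win, select the conv3x3s1_direct implementation; otherwise, select the gemm implementation. This is consistent with the FP32 implementation.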
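The selection rule can be sketched as follows. This is a minimal illustration rather than the actual kernel dispatch code: the function name `select_conv3x3s1_impl` and the returned strings are hypothetical, and it assumes the layer has already been identified as a 3x3, stride-1 convolution.

```python
def select_conv3x3s1_impl(cin, cout, hin, win):
    """Pick an implementation for a 3x3, stride-1 convolution.

    Hypothetical helper: it only mirrors the size check described in this
    change and does not call any real kernel.
    """
    # Direct convolution is chosen when the channel product is small
    # relative to the input spatial size; otherwise gemm is chosen.
    if cin * cout < 4 * hin * win:
        return "conv3x3s1_direct"
    return "gemm"


# Benchmark case from this change: in = [1, 256, 10, 10], out = [1, 512, 10, 10].
# cin * cout = 131072 and 4 * hin * win = 400, so the check is false.
print(select_conv3x3s1_impl(cin=256, cout=512, hin=10, win=10))  # gemm
```

For the benchmark case the check selects gemm, which matches the timings: gemm (about 7 ms) is roughly twice as fast as conv3x3s1_direct (about 14 ms).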