Support attention models, refactor sgemm and depthwise conv3x3, implement winograd and depthwise conv5x5 v8 versions (!1493) · Merge request · PaddlePaddle / Paddle-Lite

Created by: hjchen2

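This PR mainly does the following:

1. Supports the OCR attention model.
2. Fixes the Feed and Fetch op implementations to support multiple inputs and outputs.
3. Refactors float sgemm and optimizes data packing. GFLOPS for typically sized matrix multiplications improves by 5%-10%, and the OCR detection model runs about 40% faster.
4. Adds an arm64 implementation of winograd, which makes the OCR detection model 50% faster on iOS.
5. Reimplements depthwise conv3x3, fixing a bug in the original version on arm64 and giving a sizable speedup on v8.
6. Adds an arm64 implementation of depthwise conv5x5.
7. Optimizes the op fusion implementation so that prediction with fused ops is faster than without fusion.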
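
When reviewing a rewritten kernel such as the depthwise conv3x3 in item 5, a plain scalar reference is useful for checking numerical results. The sketch below is not the code from this PR. It assumes stride 1, zero padding of 1, a CHW input layout, and weights stored as C x 3 x 3. The function name `DepthwiseConv3x3Ref` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference for depthwise 3x3 convolution.
// Assumptions (not taken from the PR): stride 1, zero padding 1,
// input laid out as CHW, weights laid out as C x 3 x 3.
std::vector<float> DepthwiseConv3x3Ref(const std::vector<float>& input,
                                       const std::vector<float>& weights,
                                       int channels, int height, int width) {
  const std::size_t plane = static_cast<std::size_t>(height) * width;
  assert(input.size() == static_cast<std::size_t>(channels) * plane);
  assert(weights.size() == static_cast<std::size_t>(channels) * 9);

  std::vector<float> output(input.size(), 0.0f);
  for (int c = 0; c < channels; ++c) {
    // Each channel is convolved only with its own 3x3 filter.
    const float* in = input.data() + c * plane;
    const float* w = weights.data() + static_cast<std::size_t>(c) * 9;
    float* out = output.data() + c * plane;
    for (int y = 0; y < height; ++y) {
      for (int x = 0; x < width; ++x) {
        float acc = 0.0f;
        for (int ky = 0; ky < 3; ++ky) {
          for (int kx = 0; kx < 3; ++kx) {
            const int iy = y + ky - 1;
            const int ix = x + kx - 1;
            // Taps that fall outside the image read as zero padding.
            if (iy < 0 || iy >= height || ix < 0 || ix >= width) continue;
            acc += in[iy * width + ix] * w[ky * 3 + kx];
          }
        }
        out[y * width + x] = acc;
      }
    }
  }
  return output;
}
```

An optimized arm64 kernel can be compared against this reference element by element, using a small tolerance because vectorized code may accumulate in a different order.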