use the gemm replace batch gemm in the op matmul (!22849) · Merge request · PaddlePaddle / Paddle

use the gemm replace batch gemm in the op matmul !22849

Created by: wawltor

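Optimize the performance of matmul. If the first input X of matmul is a 3D tensor and the input Y is a 1D or 2D tensor, the batch size of X can be folded into the second dimension of X. In that case the kernel selects gemm instead of batch gemm, which gives a speedup.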
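
The folding described above can be illustrated with a small pure-Python sketch. It is not the Paddle kernel: the helper names `gemm`, `batched_matmul`, and `folded_matmul` are hypothetical, and the sketch assumes X has shape [B, M, K] with no transpose and Y is a 2D matrix of shape [K, N] shared by every batch entry. It only checks that a single gemm over the folded [B*M, K] matrix gives the same result as one gemm per batch entry.

```python
def gemm(a, b):
    """Plain 2D matrix product: a is [M, K], b is [K, N]."""
    inner, n_cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(n_cols)]
        for row in a
    ]


def batched_matmul(x, y):
    """Reference path: one gemm per batch entry, with y shared across the batch."""
    return [gemm(x_b, y) for x_b in x]


def folded_matmul(x, y):
    """Fold the batch dimension into the rows, run one gemm, then unfold."""
    batch, rows = len(x), len(x[0])
    flat = [row for x_b in x for row in x_b]  # [B, M, K] -> [B*M, K]
    out = gemm(flat, y)                       # single gemm, result is [B*M, N]
    return [out[b * rows:(b + 1) * rows] for b in range(batch)]  # -> [B, M, N]


# Illustrative data only: B=3, M=2, K=2, N=3.
x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
y = [[1, 0, 2], [0, 1, 3]]

assert folded_matmul(x, y) == batched_matmul(x, y)
```

The two paths agree because Y does not depend on the batch index, so every row of X is multiplied by the same matrix regardless of which batch entry it came from. In a contiguous row-major layout the fold is only a change of shape metadata, which is presumably why a single large gemm can replace many small ones here.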