Need to support non-contiguous access of C when using Eigen to compute gemm
Created by: Xreki
Paddle has a build option `USE_EIGEN_FOR_BLAS`, which makes Paddle use Eigen instead of OpenBLAS to compute BLAS functions such as `gemm`. On some platforms, such as `armeabi-v7a` on Android, Eigen is faster than OpenBLAS.
However, the current implementation of `EigenBlasGemm` does not support non-contiguous input and output:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/function/EigenGemm.cpp#L40-L62
```cpp
Eigen::array<int, 2> sizeA;
if (transA) {
  sizeA[0] = K;
  sizeA[1] = M;
  CHECK_EQ(M, lda);
} else {
  sizeA[0] = M;
  sizeA[1] = K;
  CHECK_EQ(K, lda);
}
Eigen::array<int, 2> sizeB;
if (transB) {
  sizeB[0] = N;
  sizeB[1] = K;
  CHECK_EQ(K, ldb);
} else {
  sizeB[0] = K;
  sizeB[1] = N;
  CHECK_EQ(N, ldb);
}
Eigen::array<int, 2> sizeC;
sizeC[0] = M;
sizeC[1] = N;
CHECK_EQ(N, ldc);
```
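These checks exist because `EigenBlasGemm` maps each operand as a dense Eigen tensor, and a dense map can only describe a matrix whose leading dimension equals its column count. A minimal sketch of that assumption (illustrative, not the exact Paddle code):

```cpp
#include <unsupported/Eigen/CXX11/Tensor>

// A dense, row-major TensorMap addresses element (i, j) at C[i * N + j],
// i.e. it implicitly assumes the row stride (the leading dimension ldc)
// equals the number of mapped columns N. With padded rows (ldc > N) the
// map would read and write the wrong memory, hence CHECK_EQ(N, ldc).
void mapOutput(float* C, int M, int N) {
  Eigen::TensorMap<Eigen::Tensor<float, 2, Eigen::RowMajor>> c(C, M, N);
  c.setZero();  // touches exactly M * N contiguous floats starting at C
}
```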
However, the computation of GRU needs non-contiguous access to `C`:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/function/GruFunctor.h#L34
```cpp
BlasGemm<Device, T>::compute(false,               // transA
                             false,               // transB
                             batchSize,           // M
                             2 * frameSize,       // N
                             frameSize,           // K
                             1,                   // alpha
                             value.prevOutValue,  // A
                             frameSize,           // lda
                             value.gateWeight,    // B
                             frameSize * 2,       // ldb
                             1,                   // beta
                             value.gateValue,     // C
                             frameSize * 3);      // ldc
```
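The output `value.gateValue` is laid out with `3 * frameSize` values per row (presumably the GRU's gate buffers stored side by side), while this gemm writes only the first `2 * frameSize` columns of each row. A minimal sketch of the resulting addressing, using a hypothetical helper:

```cpp
// Hypothetical helper: row-major addressing with an explicit leading
// dimension ldc. When ldc > N, consecutive rows of the M x N output are
// separated by a gap of (ldc - N) elements, so C is not one dense block.
inline float* elementAt(float* C, int i, int j, int ldc) {
  return C + i * ldc + j;
}
// In this call, N = 2 * frameSize and ldc = 3 * frameSize, so the last
// frameSize entries of every row are left untouched by the gemm.
```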
In the call above, `N` is `frameSize * 2` but `ldc` is `frameSize * 3`, so networks with GRU layers fail with the following error:
```
I1128 12:51:08.703012 29921 Util.cpp:166] commandline: --use_gpu=False
F1128 12:51:08.752616 29921 EigenGemm.cpp:62] Check failed: N == ldc (400 vs. 600)
*** Check failure stack trace: ***
    @       0x81c44d  google::LogMessage::Fail()
    @       0x81fefc  google::LogMessage::SendToLog()
    @       0x81bf73  google::LogMessage::Flush()
    @       0x82140e  google::LogMessageFatal::~LogMessageFatal()
    @       0x5e3437  paddle::EigenBlasGemm<>::compute()
    @       0x4976fb  paddle::GruCompute::forward<>()
    @       0x5836ae  paddle::GatedRecurrentLayer::forwardBatch()
    @       0x584732  paddle::GatedRecurrentLayer::forward()
    @       0x4c8b3d  paddle::NeuralNetwork::forward()
    @       0x60ae06  paddle_gradient_machine_forward
    @       0x42450e  infer()
    @       0x413bf5  main
    @   0x318ae1ecdd  (unknown)
    @       0x42314d  (unknown)
Aborted
```
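One way to lift the restriction would be to map each operand with an explicit leading dimension instead of assuming dense storage. The sketch below uses `Eigen::Map` with `Eigen::OuterStride` rather than the tensor-contraction path that `EigenBlasGemm` currently uses, and it omits the `transA`/`transB` cases, so it only illustrates the idea and is not a drop-in patch:

```cpp
#include <Eigen/Dense>

using Matrix = Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;
using Stride = Eigen::OuterStride<>;

// Illustrative strided gemm: C = alpha * A * B + beta * C, where each
// operand may have a leading dimension larger than its column count,
// i.e. rows need not be contiguous in memory.
void stridedGemm(int M, int N, int K, float alpha,
                 const float* A, int lda,
                 const float* B, int ldb,
                 float beta, float* C, int ldc) {
  Eigen::Map<const Matrix, Eigen::Unaligned, Stride> a(A, M, K, Stride(lda));
  Eigen::Map<const Matrix, Eigen::Unaligned, Stride> b(B, K, N, Stride(ldb));
  Eigen::Map<Matrix, Eigen::Unaligned, Stride> c(C, M, N, Stride(ldc));
  if (beta == 0) {
    c.noalias() = alpha * a * b;   // do not read possibly uninitialized C
  } else {
    c *= beta;
    c.noalias() += alpha * a * b;
  }
}
```

With such a mapping, the GRU call above could pass `ldc = frameSize * 3` and write only the first `2 * frameSize` columns of each row of `value.gateValue`.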