- Committed by Markus Kliegl (16489827)

  * initial matmul operator

  Similar to np.matmul, but also has transpose_X and transpose_Y flags, and only supports tensors of rank 1 to 3 inclusive. On GPU, it uses cublas?gemmStridedBatched. On CPU, it uses cblas_?gemm_batch if available via MKL; otherwise, for now, it falls back to a simple serial implementation that loops over the batch dimension.
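The serial fallback described in the commit message can be illustrated with a short NumPy sketch. This is not the operator's actual code: the function name `matmul_sketch`, the lowercase flag names, the handling of rank-1 operands (promoted as in np.matmul, with the transpose flags ignored for them) and the broadcasting of a rank-2 operand against a rank-3 one are all assumptions made for illustration.

```python
import numpy as np


def matmul_sketch(x, y, transpose_x=False, transpose_y=False):
    """Hypothetical serial batched matmul for operands of rank 1 to 3."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    for t in (x, y):
        if not 1 <= t.ndim <= 3:
            raise ValueError("only ranks 1 to 3 are supported")

    # Assumption: rank-1 operands are promoted as in np.matmul and the
    # transpose flags are ignored for them.
    x_was_vec = x.ndim == 1
    y_was_vec = y.ndim == 1
    if x_was_vec:
        x = x[np.newaxis, :]          # (K,) -> (1, K)
    elif transpose_x:
        x = np.swapaxes(x, -1, -2)    # transpose the last two dimensions
    if y_was_vec:
        y = y[:, np.newaxis]          # (K,) -> (K, 1)
    elif transpose_y:
        y = np.swapaxes(y, -1, -2)

    if x.ndim == 3 and y.ndim == 3 and x.shape[0] != y.shape[0]:
        raise ValueError("batch sizes differ")
    if x.shape[-1] != y.shape[-2]:
        raise ValueError("inner dimensions differ")

    batched = x.ndim == 3 or y.ndim == 3
    batch = x.shape[0] if x.ndim == 3 else (y.shape[0] if y.ndim == 3 else 1)

    out = np.empty((batch, x.shape[-2], y.shape[-1]))
    # Serial loop over the batch dimension: one plain matrix product per entry.
    for b in range(batch):
        xb = x[b] if x.ndim == 3 else x
        yb = y[b] if y.ndim == 3 else y
        out[b] = xb @ yb

    if not batched:
        out = out[0]
    # Drop the dimensions that were added for rank-1 operands.
    if x_was_vec:
        out = np.squeeze(out, axis=-2)
    if y_was_vec:
        out = np.squeeze(out, axis=-1)
    return out
```

For example, with `a` of shape (4, 2, 3) and `b` of shape (4, 5, 3), `matmul_sketch(a, b, transpose_y=True)` returns an array of shape (4, 2, 5). A batched GEMM routine would replace the Python loop with a single library call.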
