Forked from PaddlePaddle / Paddle
* Fix softmax cuda bug
* Refine multihead log and softmax logic
* Align block to 32