-
由 zhaoyuchen2018 提交于
* Refine elementwise kernel. Add a simple cuda kernel if grad x and y both exist Use 2D block cuda kernel to do broadcast. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
792443ef