use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output (#42851)
* use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output
* add flags to control compute type
* default to false
* add unit test
* default to true
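
A minimal sketch of why an fp32 compute type can matter when a GEMM has fp16 inputs and outputs. The commit message does not state its motivation, so treating accumulation accuracy as the reason is an assumption. The sketch is plain Python, not cuBLAS: it emulates rounding with the standard `struct` module, and the helper names (`to_fp16`, `to_fp32`, `dot_fp16_inputs`) are hypothetical. Real cuBLAS kernels accumulate in a different order and on different hardware, so this only illustrates the rounding effect on a single dot product, which is the inner operation of each GEMM output element.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_fp16_inputs(a, b, accumulate):
    """Dot product of fp16 inputs.

    `accumulate` rounds the running sum after every step, emulating the
    precision of the accumulator (an illustrative stand-in for the GEMM
    compute type, not the actual cuBLAS algorithm).
    """
    acc = 0.0
    for x, y in zip(a, b):
        acc = accumulate(acc + to_fp16(x) * to_fp16(y))
    # The result is stored as fp16 in both cases.
    return to_fp16(acc)


k = 4096
a = [1.0] * k
b = [1.0] * k

fp16_result = dot_fp16_inputs(a, b, to_fp16)  # fp16 accumulator
fp32_result = dot_fp16_inputs(a, b, to_fp32)  # fp32 accumulator

print(fp16_result)  # 2048.0
print(fp32_result)  # 4096.0
```

With an fp16 accumulator the sum stalls at 2048, because the spacing between adjacent fp16 values there is 2 and adding 1 rounds back to 2048. With an fp32 accumulator the sum reaches 4096, which is still representable in the fp16 output.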