* use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output * add flags to control compute type * default to false * add unit test * default to true
拖放文件到此处或点击上传