* Polish code and memory usage for fused_gate_attention.
* Fix wrong reduce_dims in fused_gate_attention when computing the gradient of nonbatched_bias.