Committed by Shang Zhizhou
09b18a49
* implement MHA in the same order as training
* fix fp16 compile issue on older architectures
Co-authored-by: Nzlsh80826 <rewang@nvidia.com>