[Paddle-TRT] Implement MHA fp16 order same as training (#32629) (#32785)
* implement MHA order same as training
* fix fp16 compile issue on old architecture
Co-authored-by: Nzlsh80826 <rewang@nvidia.com>