[cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer...
[cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer generation performance (#42311)

* Add fused_multi_transformer op to optimize transformer generation performance (#41814)
* fix fused_multi_transformer compile failed in cuda arch < sm53 (#42315)
* fix ci timeout