    [cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer... · 50bfe420
    WangXi committed
    [cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer generation performance (#42311)
    
    * Add fused_multi_transformer op to optimize transformer generation performance (#41814)
    
    * Fix fused_multi_transformer compilation failure on CUDA arch < sm53 (#42315)
    
    * Fix CI timeout
fused_transformer.py 35.1 KB