Support head_dim = 96 in fused_multi_transformer for PLATO-XL (#43120)
* Support head_dim = 96 in fused_multi_transformer in PLATO-XL
* add notes