Unverified commit feff99f5, authored by Yuang Liu, committed by GitHub

update flash attn select (#54630) (#54716)

Parent 570daa19
@@ -81,7 +81,7 @@ def _math_attention(
 def _select_sdp_cuda(head_dim):
-    if head_dim < 128:
+    if head_dim <= 128:
         return "flash_attn"
     else:
         return "mem_efficient"
...