Unverified commit dca0b784, authored by Jeff Rasley, committed by GitHub

Fix datatype issue with sparse attention softmax (#363)

Fixes a datatype issue in softmax: the number of blocks passed to the Triton kernel was a torch.Tensor but should have been a Python integer. In some environments (e.g., conda), Triton did not know how to serialize the input and crashed in our tests. Switching to the datatype Triton expects appears to have resolved the issue.
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Parent 093f09ff
@@ -234,7 +234,7 @@ class Softmax:
             bench: optional: set if you want to do benchmarking
         """
-        self.num_blocks = layout.sum()
+        self.num_blocks = layout.sum().item()
         self.spdims = layout.shape
         self.layout = layout
         self.block = block
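
The one-line change above can be illustrated with a minimal sketch. It assumes `layout` is an integer PyTorch tensor of 0/1 block flags; the shape and values here are hypothetical, chosen only for illustration and not taken from the repository. `layout.sum()` returns a zero-dimensional `torch.Tensor`, while `.item()` converts it to a plain Python `int`, which is the type the commit message says Triton expects.

```python
import torch

# Hypothetical block-sparse layout (an assumption for illustration):
# 2 heads, 4x4 blocks each, lower-triangular pattern of active blocks.
layout = torch.tril(torch.ones(2, 4, 4, dtype=torch.int64))

# Before the fix: the sum is a 0-dim torch.Tensor, not a Python number.
num_blocks_tensor = layout.sum()

# After the fix: .item() extracts the value as a plain Python int.
num_blocks = layout.sum().item()

print(type(num_blocks_tensor).__name__)  # Tensor
print(type(num_blocks).__name__)         # int
print(num_blocks)                        # 20 active blocks in this layout
```

Converting once in the constructor means every later use of `num_blocks` sees a plain integer, rather than relying on each call site to convert the tensor itself.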