optimize logsumexp in small data scale (#52952)
* optimize logsumexp in small data scale * fix * fix * add #pragma once * swith to use aligned_vector and support arbitrarily shape * fix store * fix store * refine for special cases * try * fix * update * fix * fix all_reduce * try * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug
Showing
想要评论请 注册 或 登录