    optimize logsumexp in small data scale (#52952) · 93e1bb98
    Asthestarsfalll committed
    * optimize logsumexp in small data scale
    
    * fix
    
    * fix
    
    * add #pragma once
    
    * switch to aligned_vector and support arbitrary shapes
    
    * fix store
    
    * fix store
    
    * refine for special cases
    
    * try
    
    * fix
    
    * update
    
    * fix
    
    * fix all_reduce
    
    * try
    
    * fix rocm bug
    
    * fix rocm bug
    
    * fix rocm bug
    
    * fix rocm bug
    
    * fix rocm bug
    
    * fix rocm bug
    
    * fix rocm bug
    
    * fix rocm bug
logsumexp_kernel.cu 6.4 KB