    [bf16] add bf16 kernel: scale gather sum (#39683) · 6d26b332
    Committed by zhangbo9674
    * add scale gather sum
    
    * refine CUDA_ATOMIC_WRAPPER ADD for bf16
    
    * add gather unittest
    
    * solve conflict
    
    * add scale unittest
    
    * add sum unittest
    
    * solve conflict
    
    * refine gather unittest
    
    * refine unittest
selected_rows_functor.cu 21.7 KB