    [bf16] add bf16 kernel: layer_norm p_norm reduce_sum (#39843) · ce8ed978
    Committed by zhangbo9674
    * add layer norm
    
    * add p norm
    
    * add reduce sum
    
    * refine layer norm register bf16 for cudnn811
    
    * add bf16 cast for hip
    
    * add unittest
    
    * refine rocm
    
    * refine layer_norm unittest
    
    * refine reduce op
    
    * refine unittest
    
    * enhance atol for reduce unittest
cast_op.cu 1.6 KB