1. 04 Mar, 2022 13 commits
  2. 03 Mar, 2022 11 commits
  3. 02 Mar, 2022 11 commits
  4. 01 Mar, 2022 5 commits
    • [Phi] rm reduce infershape (#39820) · 09039636
      chentianyu03 committed
      * modify infershape utils and rm reduce infershape
      
      * merge develop
      
      * fix infermeta bug
      
      * add IsForInferShape func in ArgumentMappingContext
      
      * add reduce_mean infermeta
      
      * modify annotation
      
      * add default dims
    • [phi] transfer the selu_op and pass the CI (#39819) · 197da15a
      xiongkun committed
      * transfer the selu_op and pass the CI
      
      * add sig files
      
      * fix code
      
      * fix by code review
      
      * remove TODO
      
      * change the include position
      
      * change the head position
    • Add function description for Kernel Primitive API (#39884) · 255bf609
      niuliling123 committed
      * Add function description for Kernel Primitive API
      1. Set cumsum and sort shared memory size = 1024
      2. sort and cumsum API limitation: blockDim.x must not exceed 512 (blockDim.x <= 512)
    • [bf16] add bf16 kernel: layer_norm p_norm reduce_sum (#39843) · ce8ed978
      zhangbo9674 committed
      * add layer norm
      
      * add p norm
      
      * add reduce sum
      
      * refine layer norm register bf16 for cudnn811
      
      * add bf16 cast for hip
      
      * add unittest
      
      * refine rocm
      
      * refine layer_norm unittest
      
      * refine reduce op
      
      * refine unittest
      
      * enhance atol for reduce unittest
    • [bf16] add bf16 kernel: scale gather sum (#39683) · 6d26b332
      zhangbo9674 committed
      * add scale gather sum
      
      * refine CUDA_ATOMIC_WRAPPER ADD for bf16
      
      * add gather unittest
      
      * solve conflict
      
      * add scale unittest
      
      * add sum unittest
      
      * solve conflict
      
      * refine gather unittest
      
      * refine unittest