1. 07 1月, 2021 1 次提交
    • F
      [Cherry-pick] Layer norm fp16 and Nvidia optimize (#29169 #29434 #29522 #29576) (#30110) · 44b81e63
      furnace 提交于
      * Layer norm fp16 (#29169)
      
      * add fp16 for layer_norm op
      
      * revert layernorm api
      
      * fix forward
      
      * fix forward
      
      * fix backward for layernorm with fp16
      
      * fix unit test for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
      Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com>
      
      * fix layer_norm accuracy (#29434)
      
      * Layernorm opt (#29522)
      
      * layernorm fw opt
      
      * layernorm bw opt
      
      * fix typo, test=develop
      
      * remove const dim3 for windows CI compatibility
      
      * merge develop
      Co-authored-by: Nzlsh80826 <zlsh80826@gmail.com>
      
      * Fix compile problem when cuda_arch < 6000 (#29576)
      
      * fix compile problem when cuda_arch < 6000
      
      * refine code
      
      * refine code
      Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com>
      Co-authored-by: Nzlsh80826 <zlsh80826@gmail.com>
      44b81e63
  2. 14 5月, 2020 1 次提交
  3. 20 4月, 2020 1 次提交
  4. 06 1月, 2020 1 次提交
  5. 05 9月, 2018 1 次提交
  6. 08 8月, 2018 1 次提交
  7. 12 2月, 2018 1 次提交
  8. 10 2月, 2018 2 次提交
  9. 03 2月, 2018 2 次提交