1. 25 Feb, 2022 1 commit
    • [Phi] Support cudnn kernel moving & move softmax kernels (#39547) · 8895379a
      Chen Weihang committed
      * support cudnn kernel moving
      
      * polish cmake rules
      
      * add unittest for coverage
      
      * remove orig kernel
      
      * remove softmax cudnn kernel
      
      * fix softmax test failed
      
      * fix npu func error
      
      * resolve conflict
      
      * rename gpu dnn kernels
      
      * fix name rule error
      
      * fix compile error
      
      * update fp16 namespace
  2. 20 Feb, 2022 1 commit
  3. 11 Feb, 2022 1 commit
  4. 09 Feb, 2022 1 commit
  5. 20 Dec, 2021 1 commit
    • optimize softmax with cross entropy soft label (#32387) · f8955602
      Feng Xing committed
      Optimizes softmax_with_cross_entropy with soft labels. This PR optimizes two kernels:
          "SoftmaxWithCrossEntropySoftLabel": computes log_softmax, then computes the loss.
          "CrossEntropySoftLabel": computes the loss with softmax as input.
      The optimizations use the following techniques:
          read data into a buffer with vectorized loads
          compute max and sum within a warp
          fix the loop size with a macro
      Performance (computation time):
          softmax_with_cross_entropy_0 (forward): -40.1%
          softmax_with_cross_entropy_0 (backward): -41%
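The "compute log_softmax and then compute loss" path described in this commit can be illustrated with a minimal CPU sketch. This is not the CUDA kernel from the PR: the function names `log_softmax` and `soft_label_cross_entropy` are hypothetical, and the row-wise max and sum that the PR computes inside a warp are done here with plain Python loops.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    row_max = max(logits)                    # subtract the max so exp() cannot overflow
    shifted = [x - row_max for x in logits]
    log_sum = math.log(sum(math.exp(s) for s in shifted))
    return [s - log_sum for s in shifted]

def soft_label_cross_entropy(logits, soft_labels):
    """Loss for one row: -sum_j soft_labels[j] * log_softmax(logits)[j]."""
    log_probs = log_softmax(logits)
    return -sum(p * lp for p, lp in zip(soft_labels, log_probs))

# Example row: three classes, label mass spread over all of them.
loss = soft_label_cross_entropy([1.0, 2.0, 3.0], [0.1, 0.2, 0.7])
```

Working in log space avoids taking the logarithm of a softmax value that has underflowed to zero, which is one plausible reason to fuse the two steps.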
  6. 03 Dec, 2021 1 commit
  7. 01 Nov, 2021 1 commit
  8. 11 Sep, 2021 1 commit
  9. 10 Sep, 2021 2 commits
  10. 05 Jun, 2021 1 commit
  11. 21 May, 2021 1 commit
  12. 06 May, 2021 1 commit
  13. 02 Apr, 2021 1 commit
  14. 16 Mar, 2021 1 commit
  15. 11 Mar, 2021 1 commit
  16. 10 Mar, 2021 1 commit
  17. 03 Mar, 2021 1 commit
  18. 25 Feb, 2021 1 commit
  19. 23 Feb, 2021 2 commits
  20. 16 Nov, 2020 1 commit
    • Fix gradients with ignore_idx in softmax_with_cross_entropy (#28622) · 110febdc
      Guo Sheng committed
      * Fix gradients with ignore_idx in softmax_with_cross_entropy.
      test=develop
      
      * Fix gradients with ignore_idx in softmax_with_cross_entropy on cpu.
      Remove softmax_with_cross_entropy from op_threshold_white_list.
      test=develop
      
      * Fix test_softmax_cross_entropy_op.py.
      test=develop
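The behavior this fix targets can be sketched as follows: with hard labels, the gradient of the loss with respect to the logits is `softmax(logits) - one_hot(label)`, and rows whose label equals the ignore index should contribute a zero gradient. This is an illustrative CPU sketch, not the patched kernel; the helper names are hypothetical, the default of `-100` is an assumption, and the upstream loss gradient is taken to be 1.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one row."""
    row_max = max(logits)
    exps = [math.exp(x - row_max) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_label_grad(logits_rows, labels, ignore_index=-100):
    """Gradient w.r.t. logits for each row, assuming an upstream gradient of 1."""
    grads = []
    for logits, label in zip(logits_rows, labels):
        if label == ignore_index:
            # Ignored samples must not push any gradient into the logits.
            grads.append([0.0] * len(logits))
            continue
        probs = softmax(logits)
        grads.append([p - (1.0 if j == label else 0.0) for j, p in enumerate(probs)])
    return grads

# Second row is ignored, so its gradient is all zeros.
grads = hard_label_grad([[0.0, 0.0], [1.0, 2.0]], [1, -100])
```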
  21. 12 Oct, 2020 1 commit
  22. 16 Jul, 2020 1 commit
  23. 11 Jul, 2020 1 commit
  24. 23 Feb, 2020 1 commit
  25. 20 Dec, 2019 1 commit
  26. 03 Dec, 2019 1 commit
  27. 18 Nov, 2019 1 commit
  28. 09 May, 2019 1 commit
  29. 07 May, 2019 1 commit
    • Softmax_cross_entropy op add axis (#16806) · a71d8fdb
      Kaipeng Deng committed
      * add attr axis infershape. test=develop
      
      * add CUDA kernel. test=develop
      
      * fix unittest. test=develop
      
      * fix unittest for soft_label. test=develop
      
      * fix fp16 unittest. test=develop
      
      * remove comment code. test=develop
      
      * refine test for axis. test=develop
      
      * add python api. test=develop
      
      * fix doc. test=develop
      
      * fix fp16 unittest. test=develop
      
      * fix ngraph test. test=develop
      
      * fix ENFORCE for test_imperative_transformer. test=develop
      
      * fit for ngraph test. test=develop
      
      * fix after rebase develop. test=develop
      
      * fix doc. test=develop
      
      * fix API.spec. test=develop
      
      * fix test_layers. test=develop
      
      * fix format. test=develop
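One common way to support an `axis` attribute is to view the tensor as `[outer, dim, inner]`, where `dim` is the size of the chosen axis, and normalize over `dim` for every `(outer, inner)` pair. The sketch below shows that idea on a row-major flat list; it is an assumption about the general technique, not the code from this PR, and `softmax_along_axis` is a hypothetical name.

```python
import math

def softmax_along_axis(data, shape, axis):
    """Softmax over `axis` of a row-major flat list with the given shape."""
    axis = axis % len(shape)                 # allow negative axes such as -1
    outer = math.prod(shape[:axis])          # product of dims before the axis
    dim = shape[axis]                        # size of the axis being normalized
    inner = math.prod(shape[axis + 1:])      # product of dims after the axis
    out = [0.0] * len(data)
    for o in range(outer):
        for i in range(inner):
            # Flat indices of the `dim` elements that share this (o, i) pair.
            idx = [o * dim * inner + k * inner + i for k in range(dim)]
            m = max(data[j] for j in idx)
            exps = [math.exp(data[j] - m) for j in idx]
            total = sum(exps)
            for j, e in zip(idx, exps):
                out[j] = e / total
    return out

# A 2x2 tensor [[0, 1], [0, 1]] normalized over rows (axis=1) and columns (axis=0).
by_row = softmax_along_axis([0.0, 1.0, 0.0, 1.0], [2, 2], axis=1)
by_col = softmax_along_axis([0.0, 1.0, 0.0, 1.0], [2, 2], axis=0)
```

With `axis=-1` this reduces to the usual per-row softmax, since `inner` is then 1.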
  30. 06 May, 2019 1 commit
  31. 21 Apr, 2019 1 commit
    • Refine model gpu memory (#16993) · 1202d3fc
      Zeng Jinle committed
      * speedup gc and inplace softmax_with_cross_entropy_grad
      test=develop
      
      * refine models gpu mem
      Merge skip vars and warning messages of mem opt
      remove relu mem opt
      test=develop
      
      * follow comments
      test=develop
  32. 11 Apr, 2019 1 commit
  33. 03 Apr, 2019 1 commit
  34. 02 Apr, 2019 1 commit
  35. 19 Mar, 2019 1 commit
  36. 17 Mar, 2019 1 commit
  37. 10 Jan, 2019 1 commit
    • [Feature] support mix precision training for resnet (#14899) · fd854183
      Wu Yi committed
      * clip softmax for fp16
      
      * updates
      
      * fuse xent support fp16 test=develop
      
      * wip
      
      * wip
      
      * add simple row reduce
      
      * wip fp16 accurate softmax
      
      * add accurate softmax kernel for fp16 test=develop
      
      * update test=develop
      
      * fix cpu build test=develop
      
      * update api.spec test=develop
      
      * follow comments test=develop
      
      * fix build test=develop
      
      * fix trt build test=develop
      
      * fix inference build test=develop
      
      * fix merge test=develop
      
      * update test=develop
      
      * try fix build test=develop
      
      * fix build test=develop
      
      * rename real_exp test=develop
      
      * fortest
      
      * remove hacky kernels test=develop
      
      * clean up test=develop
  38. 11 Dec, 2018 1 commit