1. 25 Feb, 2021 1 commit
  2. 18 Jan, 2021 1 commit
  3. 13 Jan, 2021 1 commit
  4. 12 Jan, 2021 2 commits
  5. 11 Jan, 2021 2 commits
    • [cherry-pick] Elementwise add grad GPU kernel optimization (#30276) · e59524f8
      wangchaochaohu committed
      * elementwise_add_grad Op optimization  (#29575)
      
      * optimize for long width for elementwise (#29602)
      
      * refine (#29622)
      
      * delete the code for fp16 optimization because it is not faster than common template code (#29715)
      
      * fix the shape choose of vectorize for cuda
      
      * optimization for fp16 elementwise add (#29744)
      
      * Fix the compiler error for half type (#29799)
      
      * refine the compiler error for half2 operation (#29816)
      
      * fix the compiler error when gcc4 cuda9.0 (#29997)
    • add aarch64 and sunway kunlun lib (#30027) (#30237) · eacbd488
      QingshuChen committed
      * add aarch64 and sunway kunlun lib
      
      * minor
      
      * optimize elementwise_add for kunlun
      
      * update kunlun dependence
      
      * minor
      
      * minor
  6. 07 Jan, 2021 1 commit
    • [cherry pick] Some optimizations of elementwise_add, gelu and dropout for AMP (#30152) · 07f68fad
      Leo Chen committed
      * Improve performance of elementwise_add grad op (#29187)
      
      * pass stop_gradient for cast op
      
      * improve performance of elementwise_add grad
      
      * use tensor copy async
      
      * dygraph branch
      
      * fix dygraph branch
      
      * add ut
      
      * make gelu fp16 computing more robust (#29484)
      
      * Add fast path for dropout when p == 0  (#29553)
      
      * add fast path for p == 0 in dropout
      
      * add ut
  7. 29 Dec, 2020 1 commit
    • [Cherry-pick] Complex network execute support (#29905) · 91ebc460
      Chen Weihang committed
      * [Complex] Add support for complex grad accumulated (#29889)
      
      * add support for complex grad accumulated
      
      * add unittest for coverage
      
      * update test dtype
      
      * remove useless blank line
      
      * [Complex] Handle complex to real after type promotion (#29855)
      
      * try to add fwd op input dtypes
      
      * refactor base impl
      
      * return tmp_ins after dygraph prepare data
      
      * fix typo found in debug
      
      * polish comment & add complex net test
      
      * revert detail change
      
      * fix unittest failed
      
      * add complex kernel condition control
      
      * fix xpu test failed & polish comment
      
      * polish details by review comments
      
      * Complex op test (#29753)
      
      * delete no need to calculate inputs in dygraph op_test
      
      * delete no need to calculate inputs in dygraph op_test
      
      * change grad elementwise_mul for complex types (#29757)
      
      * add conj op for complex types
      
      * add conj for complex types
      
      * add more test case
      
      * add conj_op test
      
      * modify conj api and impl
      
      * add complex type for fill_constant_op xpu
      
      * add setConstant for complex type
      
      * remove complex conj test file
      
      * user define grad for test_conj_op
      
      * add test case for static mode of conj api
      
      * modify conj doc
      
      * change input args name to x
      
      * remove useless codes
      
      * conj support real types
      
      * add conj test case for real number
      
      * delete no need to calculate inputs in dygraph op_test
      
      * delete no need to calculate inputs in dygraph op_test
      
      * modify grad of mul for complex types
      
      * fix the grads of inputs args order not match bug
      
      * change the grad of div when complex types (#29804)
      
      * change the grad of div when complex types
      
      * fix the grads of inputs args order not match bug
      Co-authored-by: chentianyu03 <chentianyu03@baidu.com>
  8. 08 Dec, 2020 1 commit
  9. 04 Dec, 2020 1 commit
  10. 01 Dec, 2020 1 commit
  11. 27 Nov, 2020 1 commit
  12. 26 Nov, 2020 1 commit
  13. 25 Nov, 2020 2 commits
  14. 20 Nov, 2020 1 commit
  15. 19 Oct, 2020 1 commit
  16. 16 Oct, 2020 1 commit
    • Fix xpu enforce (#27978) · d330cf66
      Jack Zhou committed
      * test=kunlun;
      
      Add elementwise XPU op kernels for the KUNLUN core (common broadcast is not yet supported), including:
      
          * elementwise_div op
          * elementwise_max op
          * elementwise_mul op (with grad op)
          * elementwise_sub op (with grad op)
      
      * 0.05->0.01
      
      * add xpu error message description;test=kunlun
  17. 14 Oct, 2020 1 commit
  18. 27 Sep, 2020 1 commit
  19. 24 Sep, 2020 1 commit
    • use iwyu clean include (#27267) · df43905f
      wanghuancoder committed
      * use iwyu clean include, test=develop, test=win
      
      * compilation error, test=develop
      
      * fix compilation error2, test=develop
      
      * fix compilation error3, test=develop
      
      * fix compilation error4, test=develop
      
      * fix compilation error5, test=develop
      
      * fix compilation error6, test=develop
      
      * fix compilation error7, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error10, test=develop
      
      * fix compilation error11, test=develop
  20. 21 Sep, 2020 1 commit
  21. 17 Sep, 2020 1 commit
  22. 16 Sep, 2020 1 commit
  23. 10 Sep, 2020 2 commits
  24. 04 Sep, 2020 1 commit
  25. 28 Aug, 2020 1 commit
  26. 27 Aug, 2020 1 commit
  27. 24 Aug, 2020 1 commit
  28. 22 Aug, 2020 1 commit
  29. 13 Aug, 2020 1 commit
    • [OpDevOptimize] Add common infershape functions (#26096) · ffe52b44
      Leo Chen committed
      * add unchanged infershape function
      
      * add broadcast infershape function
      
      * fix bug
      
      * rename infershape functions
      
      * add UnaryOpUnchangedInferShapeCheckAxis
      
      * add error message
      
      * add test for common infer shape functions
      
      * don't update existing ops

      * don't update op_desc.h
      
      * add more test
      
      * add error check, refine error message
  30. 12 Aug, 2020 1 commit
  31. 08 Aug, 2020 1 commit
  32. 05 Aug, 2020 1 commit
  33. 18 Jun, 2020 1 commit
  34. 16 Jun, 2020 1 commit
  35. 03 Jun, 2020 1 commit
  36. 27 May, 2020 1 commit