  1. 12 Jan, 2021 (1 commit)
    • [Cherry-pick] Complex grad for matmul, kron and type promotion (#30304) · 7346edc2
      Committed by chentianyu03
      * complex gradient matmul  (#29966)
      
      * dot op support complex types
      
      * matmul support complex types
      
      * add test case
      
      * matmul broadcast gradient support complex
      
      * move conjFunctor to complex_functor.h
      
      * change the kron gradient when complex types (#29995)
      
      * type promotion for grad (#30177)
      
      * type promotion for grad
      
      * add type promotion for div op
  2. 11 Jan, 2021 (2 commits)
    • [cherry-pick] Elementwise add grad GPU kernel optimization (#30276) · e59524f8
      Committed by wangchaochaohu
      * elementwise_add_grad Op optimization  (#29575)
      
      * optimize for long width for elementwise (#29602)
      
      * refine (#29622)
      
      * delete the code for fp16 optimization because it is not faster than common template code (#29715)
      
      * fix the shape choose of vectorize for cuda
      
      * optimization for fp16 elementwise add (#29744)
      
      * Fix the compiler error for half type (#29799)
      
      * refine the compiler error for half2 operation (#29816)
      
      * fix the compiler error when gcc4 cuda9.0 (#29997)
    • add aarch64 and sunway kunlun lib (#30027) (#30237) · eacbd488
      Committed by QingshuChen
      * add aarch64 and sunway kunlun lib
      
      * minor
      
      * optimize elementwise_add for kunlun
      
      * update kunlun dependence
      
      * minor
      
      * minor
  3. 07 Jan, 2021 (1 commit)
    • [cherry pick] Some optimizations of elementwise_add, gelu and dropout for AMP (#30152) · 07f68fad
      Committed by Leo Chen
      * Improve performance of elementwise_add grad op (#29187)
      
      * pass stop_gradient for cast op
      
      * improve performance of elementwise_add grad
      
      * use tensor copy async
      
      * dygraph branch
      
      * fix dygraph branch
      
      * add ut
      
      * make gelu fp16 computing more robust (#29484)
      
      * Add fast path for dropout when p == 0  (#29553)
      
      * add fast path for p == 0 in dropout
      
      * add ut
  4. 29 Dec, 2020 (1 commit)
    • [Cherry-pick] Complex network execute support (#29905) · 91ebc460
      Committed by Chen Weihang
      * [Complex] Add support for complex grad accumulated (#29889)
      
      * add support for complex grad accumulated
      
      * add unittest for coverage
      
      * update test dtype
      
      * remove useless blank line
      
      * [Complex] Handle complex to real after type promotion (#29855)
      
      * try to add fwd op input dtypes
      
      * refactor base impl
      
      * return tmp_ins after dygraph prepare data
      
      * fix typo found in debug
      
      * polish comment & add complex net test
      
      * revert detail change
      
      * fix unittest failed
      
      * add complex kernel condition control
      
      * fix xpu test failed & polish comment
      
      * polish details by review comments
      
      * Complex op test (#29753)
      
      * delete no need to calculate inputs in dygraph op_test
      
      * delete no need to calculate inputs in dygraph op_test
      
      * change grad elementwise_mul for complex types (#29757)
      
      * add conj op for complex types
      
      * add conj for complex types
      
      * add more test case
      
      * add conj_op test
      
      * modify conj api and impl
      
      * add complex type for fill_constant_op xpu
      
      * add setConstant for complex type
      
      * remove complex conj test file
      
      * user define grad for test_conj_op
      
      * add test case for static mode of conj api
      
      * modify conj doc
      
      * change input args name to x
      
      * remove useless codes
      
      * conj support real types
      
      * add conj test case for real number
      
      * delete no need to calculate inputs in dygraph op_test
      
      * delete no need to calculate inputs in dygraph op_test
      
      * modify grad of mul for complex types
      
      * fix the grads of inputs args order not match bug
      
      * change the grad of div when complex types (#29804)
      
      * change the grad of div when complex types
      
      * fix the grads of inputs args order not match bug
      Co-authored-by: chentianyu03 <chentianyu03@baidu.com>
  5. 08 Dec, 2020 (1 commit)
  6. 04 Dec, 2020 (1 commit)
  7. 01 Dec, 2020 (1 commit)
  8. 27 Nov, 2020 (1 commit)
  9. 26 Nov, 2020 (1 commit)
  10. 25 Nov, 2020 (2 commits)
  11. 20 Nov, 2020 (1 commit)
  12. 19 Oct, 2020 (1 commit)
  13. 16 Oct, 2020 (1 commit)
    • Fix xpu enforce (#27978) · d330cf66
      Committed by Jack Zhou
      * test=kunlun;
      
      Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast):
      
          * elementwise_div op
          * elementwise_max op
          * elementwise_mul op (with grad op)
          * elementwise_sub op (with grad op)
      
      * 0.05->0.01
      
      * add xpu error message description;test=kunlun
  14. 14 Oct, 2020 (1 commit)
  15. 27 Sep, 2020 (1 commit)
  16. 24 Sep, 2020 (1 commit)
    • use iwyu clean include (#27267) · df43905f
      Committed by wanghuancoder
      * use iwyu clean include, test=develop, test=win
      
      * compilation error, test=develop
      
      * fix compilation error2, test=develop
      
      * fix compilation error3, test=develop
      
      * fix compilation error4, test=develop
      
      * fix compilation error5, test=develop
      
      * fix compilation error6, test=develop
      
      * fix compilation error7, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error10, test=develop
      
      * fix compilation error11, test=develop
  17. 21 Sep, 2020 (1 commit)
  18. 17 Sep, 2020 (1 commit)
  19. 16 Sep, 2020 (1 commit)
  20. 10 Sep, 2020 (2 commits)
  21. 04 Sep, 2020 (1 commit)
  22. 28 Aug, 2020 (1 commit)
  23. 27 Aug, 2020 (1 commit)
  24. 24 Aug, 2020 (1 commit)
  25. 22 Aug, 2020 (1 commit)
  26. 13 Aug, 2020 (1 commit)
    • [OpDevOptimize] Add common infershape functions (#26096) · ffe52b44
      Committed by Leo Chen
      * add unchanged infershape function
      
      * add broadcast infershape function
      
      * fix bug
      
      * rename infershape functions
      
      * add UnaryOpUnchangedInferShapeCheckAxis
      
      * add error message
      
      * add test for common infer shape functions
      
      * don't update existing ops
      
      * don't update op_desc.h
      
      * add more test
      
      * add error check, refine error message
  27. 12 Aug, 2020 (1 commit)
  28. 08 Aug, 2020 (1 commit)
  29. 05 Aug, 2020 (1 commit)
  30. 18 Jun, 2020 (1 commit)
  31. 16 Jun, 2020 (1 commit)
  32. 03 Jun, 2020 (1 commit)
  33. 27 May, 2020 (1 commit)
  34. 22 May, 2020 (1 commit)
  35. 18 May, 2020 (1 commit)
  36. 15 May, 2020 (1 commit)
  37. 12 May, 2020 (1 commit)