1. 08 Sep, 2021 2 commits
    • Add FP16 PRelu (#35532) · 4e62af80
      cc committed
    • merge CMakeList.txt manual (#35378) · c4a3e8b4
      feng_shuai committed
      * merge CMakeList.txt manual
      
      * add platform for changethreadnum
      
      * repair some bugs according to make error
      
      * do nothing just flush CI
      
      * forget change thread num
      
      * add inplace_atol param for check_output_with_place
      
      * Windows
      
      * std::min and std::max should be changed because of Windows
  2. 01 Sep, 2021 1 commit
  3. 27 Aug, 2021 1 commit
    • Add unpool2d op & Expose max_unpool2d API (#35056) · ceee71a0
      xiaoting committed
      * add maxunpool2d op, test=develop
      
      * fix typo, test=develop
      
      * fix unpool unittest, test=develop
      
      * fix unpool code-example, test=develop
      
      * fix for unpool_op_unittest,test=develop
      
      * fix example code, test=develop
      
      * add noqa:F401, test=develop
      
      * fix coverage, test=develop
      
      * fix unittest for unpool, test=develop
      
      * rename unpool2d to unpool, test=develop
      
      * rename unpool2d to unpool, test=develop
  4. 17 Aug, 2021 1 commit
    • Align CTC grad scale same with ESPNet (#34729) · 10f9644c
      Hui Zhang committed
      * dygraph support more ctc grad scale
      
      * scale for 1.x
      
      * fix unittest
      
      * fix unittest
      
      * format code
      
      * fix unittest
      
      * fix log info
      
      * unittest cov
      
      * fix format;notest,test=cpu,coverage
      
      * skip ctc_loss egs;test=cpu
      
      * warpctc grad cov;test=coverage
      
      * add dygraph test;test=coverage
      
      * format;test=cpu,coverage
      
      * format;test=cpu
      
      * add api compat;test=cpu
      
      * add cpu test
      
      * rename
      
      * rename
      
      * fix
      
      * fix test
      
      * format
      
      * eigen cpu
      
      * eigen gpu grad pass
      
      * cuda gpu pass
      
      * format
      
      * fix ci
  5. 12 Aug, 2021 1 commit
    • Fix safety-bug of functional.linear (#34696) · 0e28c8bb
      zhulei committed
      * Fix safety-bug of functional.linear
      
      * Fix safety-bug of functional.linear
      
      * Fix safety-bug of functional.linear
      
      * Fix safety-bug of functional.linear
  6. 11 Aug, 2021 1 commit
  7. 09 Aug, 2021 1 commit
  8. 22 Jul, 2021 2 commits
  9. 09 Jul, 2021 1 commit
  10. 07 Jul, 2021 1 commit
  11. 06 Jul, 2021 1 commit
  12. 05 Jul, 2021 1 commit
  13. 22 Jun, 2021 1 commit
  14. 21 Jun, 2021 1 commit
    • Add AXPY oneDNN handler (#33632) · 773aabc7
      lidanqing committed
      * Add oneDNN AXPY handler.
      
      * Add fallback for small tensors.
      
      * Fix ifdefs
      
      * Remove unnecessary namespace prefixes and add missing headers.
      
      * Guard handler_axpy with proper ifdefs.
      
      * Compilation of this function is possible only when Paddle is built with neither CUDA nor HIP.
      
      * Move AXPY handler code to separate files.
      
      * Use oneDNN AXPY handler in SGD op.
      
      * Use axpy handler only when Paddle is built with oneDNN.
      
      * Add test for SUM BF16 with big rows.
      
      * Fix SFINAE rules for elementwise_add_to.
      
      * Add test case for SGD with big rows.
      
      * update
      
      * update
      Co-authored-by: Adam Osewski <adam.osewski@intel.com>
  15. 05 Jun, 2021 1 commit
  16. 02 Jun, 2021 1 commit
  17. 01 Jun, 2021 1 commit
  18. 26 May, 2021 2 commits
    • modify matmul Op to complex template types (#33130) · 6c07cd7e
      chentianyu03 committed
      * modify matmul Op to complex template types
      
      * remove complex64/128 header file
    • optimize OP's compilation time (#32617) · 78ecb668
      wuhuanzhou committed
      * optimize OP's compilation time, test=develop
      
      * add more op and run ci test, test=develop
      
      * CUDA Kernel register in cc file, test=develop
      
      * fix macros, test=develop
      
      * fix undefined symbol error, test=develop
      
      * fix compilation error and undefined symbol, test=develop
      
      * fix compilation error on Windows, test=develop
      
      * fix compilation error on Windows, test=develop
  19. 25 May, 2021 1 commit
    • modify Ops to complex template (#33041) · 5fa44c34
      chentianyu03 committed
      * modify conj, real, imag OP to complex template
      
      * replace with complex template to dot Op
      
      * replace with complex template to Abs Op
      
      * add support for complex64 and complex128
  20. 20 May, 2021 1 commit
    • Add complex template type (#32857) · 738bf20e
      chentianyu03 committed
      * add complex template file
      
      * add numtraits for complex template
      
      * add complex template type register
      
      * modify specify template of complex
      
      * modify specify template of complex
      
      * modify specify template of complex
      
      * modify specify template of complex
      
      * make TensorCheckerVisitor support complex type
      
      * fix operator= error
      
      * add complex template
      
      * add complex template type
      
      * add complex template type to pyarray transform
      
      * add complex template type to pyarray transform
      
      * remove complex type for dlpack register
      
      * set dlpack support complex type
      
      * set dlpack support complex type
      
      * set dlpack support complex type
      
      * remove explicit for complex constructor
      
      * add complex unit test file
  21. 12 May, 2021 1 commit
  22. 06 May, 2021 2 commits
  23. 27 Apr, 2021 1 commit
  24. 19 Apr, 2021 1 commit
  25. 14 Apr, 2021 1 commit
    • fix matrix_inverse_op with rocm (#32128) · 995b5f2c
      zhulei committed
      * fix matrix_inverse_op with rocm
      
      * fix matrix_inverse_op with rocm
      
      * fix matrix_inverse_op with rocm
      
      * fix matrix_inverse_op with rocm
  26. 13 Apr, 2021 1 commit
  27. 09 Apr, 2021 2 commits
    • make high precision for avg_pool and adaptive_avg_pool when data_type is float16 (#31887) · ec2ffb68
      niuliling123 committed
      * make high precision for avg_pool
    • [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
      Leo Chen committed
      * [feature] support npu allocator (#30840)
      
      [feature] support npu allocator
      
      * [feature] support npu operator (#30951)
      
      [feature] support npu operator
      
      * [feature] support npu allocator, part 2 (#30972)
      
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, bug failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinilize
      
      * add system allocator test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
      
      * support parsing ascend rank table file (#31000)
      
      support parsing ascend rank table file
      
      * Fix reshape on GE graph. (#31084)
      
      Fix reshape on GE graph
      
      * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
      
      * add npu sub op
      
      * fix typo
      
      * rename test
      
      * fix bug
      
      * fix bug
      
      * add fp16 kernel
      
      * fix typo
      
      * support sub grad op
      
      * support elementwise_sub_grad op
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * Fix compilation problem (#31100)
      
      Fix compilation problem (#31100)
      
      * fix compile
      
      * fix code style
      
      * remove const_cast
      
      * support adding correct npu op in pybind.h (#31143)
      
      * support adding correct npu op in pybind.h
      
      * refine code
      
      * [NPU] Support executor with NPU (#31057)
      
      * [NPU] Support executor with NPU
      
      * Fix code according to reviews
      
      * Fix code
      
      * Add unittest for sub op npu
      
      * refactor npu device manager (#31154)
      
      refactor npu device manager (#31154)
      
      * fix selected npus
      
      * fix compile
      
      * fix reading flags from env
      
      * format
      Co-authored-by: xiayanming <41795079@qq.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
  28. 07 Apr, 2021 1 commit
  29. 02 Apr, 2021 1 commit
  30. 01 Apr, 2021 2 commits
  31. 31 Mar, 2021 1 commit
  32. 19 Mar, 2021 1 commit
  33. 08 Mar, 2021 1 commit
  34. 05 Mar, 2021 1 commit