1. 13 8月, 2021 1 次提交
    • T
      New Einsum API (#33821) · 8c8667f0
      Tongxin Bai 提交于
      * OP dot: refactor CPU kernels and get better loop performance.
      
      * Minor fix on code format.
      
      * Fixed minor errors.
      
      * Add new API: einsum
      
      * Update the Einsum unit test.
      
      One case failed with matmul_v2, where the dtype is int64:
      
      a = np.arange(2 * 3 * 1).reshape(2, 3, 1)
      b = np.arange(1)
      paddle.einsum("...i, ...i", a, b)
      
      * Test cases in test_einsum test floating point dtypes only.
      
      As of now Paddle only supports float/double dtypes in matmul, which is
      one of building blocks of this Einsum implementation. We decide not to
      test einsum against other dtypes.
      
      * Polish format.
      
      * More formatting.
      
      * Format...
      
      * Einsum: improve test coverage.
      
      * Einsum: bug fixes and more testcases for testing error messages
      
      * Einsum: fix format..
      
      * Einsum: fixed typo and format.
      
      * Einsum: format again...
      
      * Einsum: applied suggested changes.
      
      * Einsum API: improve API documentation.
      
      * Einsum API: apply suggested changes.
      
      * Einsum API: Add dygraph only note.
      
      * Einsum API: Add dygraph only note.
      
      * Einsum API: fixed unittest.
      8c8667f0
  2. 28 7月, 2021 1 次提交
  3. 19 7月, 2021 1 次提交
    • C
      Add Cuda event and stream API (#32460) · 9c7f6af5
      chentianyu03 提交于
      * add cuda event and stream api
      
      * add cuda event and stream api
      
      * add get_current_stream api
      
      * add get_current_stream api
      
      * init streams
      
      * modify get_current_stream
      
      * modify get_cuttent_stream
      
      * add synchronize func
      
      * add current_stream doc and test file
      
      * move get_current_stream into CUDA macro
      
      * move CudaEvent into CUDA macro
      
      * move _get_current_stream and _device_synchronize into cuda macro
      
      * modify the macro of cuda stream and event
      
      * add test case for synchronize
      
      * add paddle.devices.cuda module
      
      * event and stream support hip
      
      * add doc for stream and event class
      
      * move cuda stream and event into single pybind
      
      * add cuda_streams_py.cc to cmakelist
      
      * add _device_synchronize and _get_current_stream to core module
      
      * add test case for cudastream and cudaevent
      
      * move __all__ in streams.py
      
      * fix test fail
      
      * add cuda to devices __all__
      
      * fix current_stream doc writing error
      
      * move devices to device direction, and merge device.py into __init__.py
      
      * add required:gpu to sample codes
      
      * remove cuda direction from device/__init__.py
      9c7f6af5
  4. 12 7月, 2021 1 次提交
  5. 23 6月, 2021 1 次提交
  6. 22 6月, 2021 1 次提交
    • Z
      [API/OP]Add a new API paddle.diagonal (#33586) · ad106290
      zhangbo9674 提交于
      * new api diagonal, test=develop
      
      * add new api diagonal, test=develop
      
      * new api diagonal, test=develop
      
      * add new api paddle.diagonal, test=develop
      
      * use framework::stride replace ComputeDimStride
      
      * replace cudaMalloc/cudaMemcpy by TensorFormVector in cudaKernel and cudaGradKernel
      
      * perfect funciton: when attr(offset) is exceed attr(axis1) or attr(axis2), set the diagonal dim is 0
      
      * fix RP-Mac-CI bug: replace framework::stride() by ComputDimStride.
      
      * perfect code-block
      
      * perfect code of python API diagonal
      
      * api supports dtype of float16 and bool
      
      * api supports dtype of float16 and bool
      
      * modify unittest code
      
      * modify unittest code
      
      * perfect dtype describe
      
      * perfect code-block
      ad106290
  7. 21 6月, 2021 1 次提交
  8. 17 6月, 2021 1 次提交
  9. 16 6月, 2021 2 次提交
  10. 15 6月, 2021 1 次提交
    • Z
      Add digamma_op and unittest (#33278) · 02a6d49a
      zyfncg 提交于
      * Add digamma_op and unittest
      
      * add digamma_op api
      
      * remove special DigammaCudaKernel and correct some docs
      
      * remove unused headers
      
      * fix api doc error
      02a6d49a
  11. 11 6月, 2021 2 次提交
  12. 09 6月, 2021 2 次提交
  13. 27 5月, 2021 1 次提交
  14. 07 5月, 2021 1 次提交
    • Z
      remove packages in __all__ (#32759) · a77ade0e
      zhiboniu 提交于
      * [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596) (#32610)
      
      * [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
      
      * remove packages in __all__
      
      * create new public api level paddle.callbacks;paddle.hub;paddle.utils.unique_name
      Co-authored-by: NZhong Hui <zhonghui.net@gmail.com>
      a77ade0e
  15. 27 4月, 2021 2 次提交
  16. 25 4月, 2021 1 次提交
  17. 24 4月, 2021 1 次提交
  18. 22 4月, 2021 2 次提交
  19. 14 4月, 2021 1 次提交
  20. 09 4月, 2021 1 次提交
    • L
      [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
      Leo Chen 提交于
      * [feature] support npu allocator (#30840)
      
      [feature] support npu allocator
      
      * [feature] support npu operator (#30951)
      
      [feature] support npu operator
      
      * [feature] support npu allocator, part 2 (#30972)
      
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, bug failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinilize
      
      * add system allocatot test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
      
      * support parsing ascend rank table file (#31000)
      
      support parsing ascend rank table file
      
      * Fix reshape on GE graph. (#31084)
      
      Fix reshape on GE graph
      
      * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
      
      * add npu sub op
      
      * fix typo
      
      * rename test
      
      * fix bug
      
      * fix bug
      
      * add fp16 kernel
      
      * fix typo
      
      * support sub grad op
      
      * support elementwise_sub_grad op
      Co-authored-by: Nfrankwhzhang <frankwhzhang@126.com>
      
      * Fix compilation problem (#31100)
      
      Fix compilation problem (#31100)
      
      * fix compile
      
      * fix code stype
      
      * remove const_cast
      
      * support adding correct npu op in pybind.h (#31143)
      
      * support adding correct npu op in pybind.h
      
      * refine code
      
      * [NPU] Support executor with NPU (#31057)
      
      * [NPU] Support executor with NPU
      
      * Fix code according to reviews
      
      * Fix code
      
      * Add unittest for sub op npu
      
      * refactor npu device manager (#31154)
      
      refactor npu device manager (#31154)
      
      * fix selected npus
      
      * fix compile
      
      * fix reading flags from env
      
      * format
      Co-authored-by: Nxiayanming <41795079@qq.com>
      Co-authored-by: Ngongweibao <weibao.gong@gmail.com>
      Co-authored-by: Nfrankwhzhang <frankwhzhang@126.com>
      Co-authored-by: Nliym27 <33742067+liym27@users.noreply.github.com>
      ccf5709d
  21. 01 4月, 2021 1 次提交
    • C
      add custom init grad for backward function (#31540) · 83b953f5
      chentianyu03 提交于
      * add custom init grad for backward function
      
      * add custom init grad for backward function
      
      * handle when the grad_tensor is none
      
      * handle when the grad_tensor is none
      
      * fix the args type error on windows platform
      
      * modify the args order and doc
      
      * format code
      
      * add grad_tensor to xpu
      
      * modify the grad_tensor type check
      
      * add paddle.backward api to support multi tensors gradient compute
      
      * add paddle.backward api to support multi tensors gradient compute
      
      * add paddle.atuograd module and backward api
      
      * change tensor.backward func args
      
      * modify tensor backward api
      
      * remove create_graph intputs args
      
      * add doc and examplex code for backward api
      
      * when have the same tensor, throw error
      
      * modify test Init func args
      
      * modify the execute.Init func args in test files
      
      * add paddle.autograd package in setup.py.in
      
      * modify error msg, remove _run_backward method in class Tensor
      
      * add test cases for backward api
      83b953f5
  22. 15 1月, 2021 1 次提交
    • P
      Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103) · 13d75736
      pangyoki 提交于
      * add view strategy on squeeze,unsqueeze,reshape,flatten
      
      * add squeeze unittest
      
      * add unittests
      
      * use View strategy as name rather than Reuse Allacation
      
      * fix view api doc
      
      * fix format
      
      * use core.ops when input of reshape2 is Tensor
      
      * fix test_cross_entropy_loss error because of reshape2
      
      * fix test_cross_entropy_loss error because of reshape2
      
      * add inplace strategy
      
      * add elementwise_add sub
      
      * let backward op not use inplace
      
      * grad op do not use inplace
      
      * fix memory increase error and add leaf error message
      
      * delete selected_rows
      
      * change op_function
      
      * little change
      
      * solve HandleViewBetweenInputAndOutput
      
      * add unittest and leaf error message
      
      * merge view error
      
      * optimize op_function_generator format and support sum inplace op
      
      * fix format of basic_engine
      
      * fix format for framework
      
      * little change of variable wrapper
      
      * add reshape, squeeze, unsqueeze, scatter api
      
      * add relu elu tanh softmax inplace api
      
      * fix test_squeeze_op unittest
      
      * fix test_relu_op unittest
      
      * fix comment problems
      
      * delete sample code of inplace api
      
      * add reference of grad_pending_nodes in basic_engine
      
      * fix unittest name
      
      * add inplace apis into wlist
      
      * fix error message
      
      * add PADDLE_ENFORCE for set grad op twice
      
      * fix head file error
      13d75736
  23. 07 1月, 2021 1 次提交
  24. 17 12月, 2020 2 次提交
    • C
      add conj op for complex types (#29527) · 71063b81
      chentianyu03 提交于
      * add conj op for complex types
      
      * add conj for complex types
      
      * add more test case
      
      * add conj_op test
      
      * modify conj api and impl
      
      * add complex type for fill_constant_op xpu
      
      * add setConstant for complex type
      
      * remove complex conj test file
      
      * user define grad for test_conj_op
      
      * add test case for static mode of conj api
      
      * modify conj doc
      
      * change input args name to x
      
      * remove useless codes
      
      * conj support real types
      
      * add conj test case for real number
      71063b81
    • C
      [Complex] Add real & imag op and api for complex tensor (#29672) · 6cfa59de
      Chen Weihang 提交于
      * add complex real op & api & unittest
      
      * add imag op & api & unittest
      
      * refactor op impl
      
      * revert simplify writing due to complile failed
      
      * polish details
      
      * polish grad op code
      6cfa59de
  25. 09 12月, 2020 2 次提交
  26. 07 12月, 2020 1 次提交
  27. 01 12月, 2020 1 次提交
  28. 30 11月, 2020 1 次提交
  29. 27 11月, 2020 1 次提交
  30. 26 11月, 2020 2 次提交
  31. 19 11月, 2020 1 次提交
  32. 13 11月, 2020 1 次提交
    • C
      Add ONNX Exporter (#27831) · c545b9b6
      channings 提交于
      * add onnx export module, test=develop
      
      * add unit test for paddle.onnx.export
      
      * adjust api & doc
      
      * fix some typo
      c545b9b6