  1. 18 Feb, 2022 2 commits
    • [MLU]add matmul and matmul_v2 op (#39539) · 229ec32a
      Committed by qipengh
      * [MLU]add matmul and matmul_v2 op
      
      * [MLU] fix data_type and del matmul
      
      * [MLU] fix compile error
      
      * [MLU] fix ci_check error
    • [Bug Fix]Fix gradient accumulator (#39577) · a7cbd3ef
      Committed by Jiabin Yang
      * merge legacy to fluid
      
      * Remove legacy code
      
      * Remove legacy code
      
      * Remove DataType test
      
      * Using Tensor directly instead of using EagerTensor
      
      * support gradient_accumulation
      
      * make test_imperative_lod_tensor_to_selected_rows longer
      
      * make test_imperative_lod_tensor_to_selected_rows longer
      
      * refine code
      
      * Rename all EagerTensor to Tensor
      
      * Rename some EagerTensor to Tensor
      
      * rename EagerTensor to EagerVariable
      
      * add more test
      
      * fix different device gradient_accumulator bug
      
      * merge develop
      
      * remove useless tests
  2. 16 Feb, 2022 1 commit
  3. 15 Feb, 2022 3 commits
    • [PluggableDevice] Add custom runtime support (#38740) · 3e7825f3
      Committed by ronnywang
      * [CustomRuntime] Add DeviceManager
      
      * [CustomRuntime] Add DeviceInterface
      
      * [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager
      
      * [CustomRuntime] Add plug-in device
      
      * [CustomRuntime] Memory module support PluggableDevice
      
      * [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option
      
      * update
      
      * [API] update API doc based on comments, test=develop
      Co-authored-by: qili93 <qili93@qq.com>
    • [Eager] Support SellectedRows MergeAdd case (#39449) · 6549a041
      Committed by Weilong Wu
      * Refactor SelectedRows MergeAdd func by using template
      
      * Add GetInnerMutable func instead of modify GetInnerMutableTensor
      
      * Updated PADDLE_ENFORCE statement
      
      * Remove useless PADDLE_ENFORCE statement
      
      * Polish Code
    • [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Committed by Aurelius84
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass the compile. the next step is remove VarType in Pten
      
      * fix all and remove VarType from pten. success in linux. Next task is other platform
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
      * compile npu successfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: xiongkun <xiongkun03@baidu.com>
  4. 11 Feb, 2022 1 commit
  5. 09 Feb, 2022 1 commit
    • Replace EagerTensor with Tensor (#39376) · 945a3ce9
      Committed by Jiabin Yang
      * merge legacy to fluid
      
      * Remove legacy code
      
      * Remove legacy code
      
      * Remove DataType test
      
      * Using Tensor directly instead of using EagerTensor
      
      * support gradient_accumulation
      
      * make test_imperative_lod_tensor_to_selected_rows longer
      
      * make test_imperative_lod_tensor_to_selected_rows longer
  6. 28 Jan, 2022 1 commit
  7. 26 Jan, 2022 2 commits
    • [Eager] Support imperative selected_rows_to_lod_tensor and the opposite case (#39223) · 787980b1
      Committed by Weilong Wu
      * Added selected_rows and rw_lock to pten
      
      * Renamed the unit test target to fix CI
      
      * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
      
      * Remove rw_lock.h,rw_lock_test.cc in fluid
      
      * Use pten::RWLock and pten::AutoRDLock, fix CI
      
      * Use pten::SelectedRows
      
      * Use pten::SelectedRows
      
      * Fix to pass NPU CI
      
      * Selected_Rows inherits from TensorBase
      
      * Use pten::SelectedRows, to pass NPU CI
      
      * To fix NPU CI
      
      * To fix NPU CI again
      
      * Use paddle/pten/core/enforce and polish code
      
      * Support imperative selected_rows_to_lod_tensor
      
      * Polish code
    • fix gradient accumulator bug. test=kunlun (#39127) · b1a458ac
      Committed by houj04
      * fix gradient accumulator bug. test=kunlun
      
      * fix typo. test=kunlun
      
      * fix typo. test=kunlun
      
      * fix unit tests. test=kunlun
      
      * using TensorCopySync. test=kunlun
      
      * only fix for xpu place. test=kunlun
  8. 25 Jan, 2022 1 commit
    • [Move selected_rows PR #3] Change the relationship of [include/Cmake]. (#39128) · 2bafd338
      Committed by Weilong Wu
      * Added selected_rows and rw_lock to pten
      
      * Renamed the unit test target to fix CI
      
      * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
      
      * Remove rw_lock.h,rw_lock_test.cc in fluid
      
      * Use pten::RWLock and pten::AutoRDLock, fix CI
      
      * Use pten::SelectedRows
      
      * Use pten::SelectedRows
      
      * Fix to pass NPU CI
      
      * Use pten::SelectedRows, to pass NPU CI
      
      * To fix NPU CI
      
      * To fix NPU CI again
  9. 21 Jan, 2022 1 commit
  10. 17 Jan, 2022 1 commit
    • [Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5
      Committed by Wilber
      * add pten::Place data structure.
      
      * update ci problem
      
      * fix ci problem
      
      * update
      
      * using platform::Place=pten::Place
      
      * remove BOOST_GET_CONST for CPUPlace and GPUPlace
      
      * compile pass 25%.
      
      * compile pass 45%
      
      * compile pass 60%
      
      * remove boost_get for xpu npu mlu and ipu
      
      * compile pass on cpu and gpu.
      
      * fix compile problem
      
      * fix compile error.
      
      * update
      
      * fix ci problem
      
      * update
      
      * ci approve
      
      * fix ci problem
      
      * fix ci eager test problem
      
      * remove BOOST_GET_CONST
      
      * fix npu compile
  11. 20 Dec, 2021 1 commit
  12. 09 Dec, 2021 1 commit
  13. 27 Nov, 2021 1 commit
    • [NPU] reorganization for device API abstraction (#37110) · 72241a6a
      Committed by Aganlengzi
      * [NPU] reorganization for device API abstraction
      
      * [NPU] delete old files
      
      * [NPU] fix npu_collective_helper
      
      * [NPU] fix collective_helper
      
      * [NPU] fix ut
      
      * [NPU] mod memory allocation and hccl_helper
      
      * [NPU] fix place_type
      
      * [NPU] split enfoce.h
      
      * move acl* call into npu_info
      
      * merge conflict
      
      * fix merge
      
      * merge conflict
      
      * merge conflict
  14. 18 Oct, 2021 1 commit
  15. 10 Sep, 2021 1 commit
  16. 12 Aug, 2021 1 commit
  17. 26 May, 2021 1 commit
  18. 12 May, 2021 1 commit
  19. 14 Apr, 2021 1 commit
  20. 09 Apr, 2021 1 commit
    • [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
      Committed by Leo Chen
      * [feature] support npu allocator (#30840)
      
      [feature] support npu allocator
      
      * [feature] support npu operator (#30951)
      
      [feature] support npu operator
      
      * [feature] support npu allocator, part 2 (#30972)
      
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, but failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinilize
      
      * add system allocator test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
      
      * support parsing ascend rank table file (#31000)
      
      support parsing ascend rank table file
      
      * Fix reshape on GE graph. (#31084)
      
      Fix reshape on GE graph
      
      * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
      
      * add npu sub op
      
      * fix typo
      
      * rename test
      
      * fix bug
      
      * fix bug
      
      * add fp16 kernel
      
      * fix typo
      
      * support sub grad op
      
      * support elementwise_sub_grad op
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * Fix compilation problem (#31100)
      
      Fix compilation problem (#31100)
      
      * fix compile
      
      * fix code style
      
      * remove const_cast
      
      * support adding correct npu op in pybind.h (#31143)
      
      * support adding correct npu op in pybind.h
      
      * refine code
      
      * [NPU] Support executor with NPU (#31057)
      
      * [NPU] Support executor with NPU
      
      * Fix code according to reviews
      
      * Fix code
      
      * Add unittest for sub op npu
      
      * refactor npu device manager (#31154)
      
      refactor npu device manager (#31154)
      
      * fix selected npus
      
      * fix compile
      
      * fix reading flags from env
      
      * format
      Co-authored-by: xiayanming <41795079@qq.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
  21. 01 Apr, 2021 1 commit
    • Refactor and simplify hook design & add Tensor.register_hook API (#31775) · dbeb3ea4
      Committed by Chen Weihang
      * refactor and simplify hook design
      
      * fix reducer add hook error
      
      * add Tensor.register_hook basic impl
      
      * refine prepare data impl
      
      * revert prepare data change
      
      * support register_hook for Tensor
      
      * add hook test in model
      
      * polish tests and doc example
      
      * fix double grad test failed
      
      * remove reduce hook func
      
      * fix set empty error
      
      * polish code by comments
      
      * change reduce_hook to mutable_hook
      
      * remove useless tmp_ins
      
      * fix shape code format error
      
      * fix shape code format error
  22. 26 Mar, 2021 1 commit
  23. 22 Feb, 2021 1 commit
  24. 05 Jan, 2021 1 commit
    • support dygraph in xpu place (#30051) · 297fff1a
      Committed by hong
      * support dygraph in xpu place; test=develop
      
      * fix cpu/gpu compile error; test=develop
      
      * fix compile error; test=develop
      
      * fix xpu compile error; testd=develop
  25. 25 Dec, 2020 1 commit
  26. 01 Dec, 2020 1 commit
  27. 18 Nov, 2020 1 commit
  28. 25 Sep, 2020 1 commit
  29. 21 Aug, 2020 1 commit
    • support Baidu Kunlun AI Accelerator (#25959) · 138ecf24
      Committed by QingshuChen
      * support Baidu AI Accelerator
        * test=kunlun
      
      * minor
       * test=kunlun
      
      * support xpu op in separate file
       * test=kunlun
      
      * update XPU error message and remove duplicated code
      
       * test=kunlun
      
      * minor
       * test=kunlun
      
      * minor
       * test=kunlun
  30. 03 Jun, 2020 1 commit
  31. 20 Mar, 2020 1 commit
    • Add dygraph double grad implementation (#22939) · a31d7328
      Committed by Zeng Jinle
      * add double grad implementation for dygraph, test=develop
      
      * polish code, add uts, test=develop
      
      * fix place bug, test=develop
      
      * polish codes, add more uts for coverages, test=develop
      
      * add no_grad_set, test=develop
      
      * add star gan ut, test=develop
      
      * follow comments, test=develop
  32. 09 Mar, 2020 1 commit
  33. 03 Dec, 2019 1 commit
    • support SelectedRows in dygraph, test=develop (#21078) · 6ebf0f47
      Committed by zhongpu
      * support SelectedRows in dygraph, test=develop
      
      * fix bug of _grad_ivar interface, test=develop
      
      * add optest for support selectedrows, test=develop
      
      * fix bug for gradient_accumulator in GPU mode, test=develop
      
      * fix error when Selectedrows addto LodTensor in sorted_gradient mode in dygraph, test=develop
      
      * refine and simplify gradient accumulator code, test=develop
      
      * add optest, test=develop
      
      * add optest and simplify code, test=develop
      
      * fix bug for test_imperative_selected_rows, test=develop
      
      * add optest for Coverage, test=develop
      
      * fix gradient interface and simplify code, test=develop
      
      * update api for gradient, test=develop
      
      * fix ShareDim's bug in DygraphExecutionContext class, test=develop
      
      * add optest, test=develop
  34. 07 Oct, 2019 1 commit
    • Fix/auto prune error on leaf (#20056) · 7a9bd0c5
      Committed by Jiabin Yang
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, Add Variable api and refine dygraph related API
      
      * test=develop, Add Variable api and refine dygraph related API
      
      * test=develop, refine test for new api and error info
      
      * test=develop, refine error info and test_layers
      
      * test=develop, add API.spec
      
      * test=devleop, fix to_string python2 and python3 compat error and refine doc
      
      * test=devleop, add API spec
      
      * test=devleop, update API spec
      
      * test=devleop, update API spec
      
      * test=develop, invoke ci
      
      * test=develop, fix example code
      
      * test=develop, update API spec
      
      * test=develop, fix auto_prune_error_on_leaf
      
      * test=develop, fix auto prune error on loss stop_gradient
      
      * test=develop, remove useless error check
      
      * test=develop, add more ut for sorted gradient
  35. 21 Sep, 2019 1 commit
    • Feature/auto prune in dygraph (#19757) · 45425411
      Committed by Jiabin Yang
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ
      
      * test=develop, fix ut failed on parallel se-resnext
      
      * test=develop, change one more PADDLE_ENFORCE
      
      * support auto prune in dygraph mode
      
      * test=develop, support auto prune
      
      * test=develop, merge develop conflict
      
      * test=develop, fix test_layer and test_tracer ut
      
      * test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs
  36. 05 Sep, 2019 1 commit
    • Refactor dygraph (#19107) · e9233d1c
      Committed by Jiabin Yang
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ
      
      * test=develop, fix ut failed on parallel se-resnext
      
      * test=develop, change one more PADDLE_ENFORCE