1. 26 May, 2021 1 commit
  2. 12 May, 2021 1 commit
  3. 14 Apr, 2021 1 commit
  4. 09 Apr, 2021 1 commit
    • [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
      Leo Chen committed
      * [feature] support npu allocator (#30840)
      
      [feature] support npu allocator
      
      * [feature] support npu operator (#30951)
      
      [feature] support npu operator
      
      * [feature] support npu allocator, part 2 (#30972)
      
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, but failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinalize
      
      * add system allocator test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
      
      * support parsing ascend rank table file (#31000)
      
      support parsing ascend rank table file
      
      * Fix reshape on GE graph. (#31084)
      
      Fix reshape on GE graph
      
      * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
      
      * add npu sub op
      
      * fix typo
      
      * rename test
      
      * fix bug
      
      * fix bug
      
      * add fp16 kernel
      
      * fix typo
      
      * support sub grad op
      
      * support elementwise_sub_grad op
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * Fix compilation problem (#31100)
      
      Fix compilation problem (#31100)
      
      * fix compile
      
      * fix code style
      
      * remove const_cast
      
      * support adding correct npu op in pybind.h (#31143)
      
      * support adding correct npu op in pybind.h
      
      * refine code
      
      * [NPU] Support executor with NPU (#31057)
      
      * [NPU] Support executor with NPU
      
      * Fix code according to reviews
      
      * Fix code
      
      * Add unittest for sub op npu
      
      * refactor npu device manager (#31154)
      
      refactor npu device manager (#31154)
      
      * fix selected npus
      
      * fix compile
      
      * fix reading flags from env
      
      * format
      Co-authored-by: xiayanming <41795079@qq.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
  5. 01 Apr, 2021 1 commit
    • Refactor and simplify hook design & add Tensor.register_hook API (#31775) · dbeb3ea4
      Chen Weihang committed
      * refactor and simplify hook design
      
      * fix reducer add hook error
      
      * add Tensor.register_hook basic impl
      
      * refine prepare data impl
      
      * revert prepare data change
      
      * support register_hook for Tensor
      
      * add hook test in model
      
      * polish tests and doc example
      
      * fix double grad test failed
      
      * remove reduce hook func
      
      * fix set empty error
      
      * polish code by comments
      
      * change reduce_hook to mutable_hook
      
      * remove useless tmp_ins
      
      * fix shape code format error
      
      * fix shape code format error
  6. 26 Mar, 2021 1 commit
  7. 22 Feb, 2021 1 commit
  8. 05 Jan, 2021 1 commit
    • support dygraph in xpu place (#30051) · 297fff1a
      hong committed
      * support dygraph in xpu place; test=develop
      
      * fix cpu/gpu compile error; test=develop
      
      * fix compile error; test=develop
      
      * fix xpu compile error; test=develop
  9. 25 Dec, 2020 1 commit
  10. 01 Dec, 2020 1 commit
  11. 18 Nov, 2020 1 commit
  12. 25 Sep, 2020 1 commit
  13. 21 Aug, 2020 1 commit
    • support Baidu Kunlun AI Accelerator (#25959) · 138ecf24
      QingshuChen committed
      * support Baidu AI Accelerator
        * test=kunlun
      
      * minor
        * test=kunlun
      
      * support xpu op in separate file
        * test=kunlun
      
      * update XPU error message and remove duplicated code
        * test=kunlun
      
      * minor
        * test=kunlun
      
      * minor
        * test=kunlun
  14. 03 Jun, 2020 1 commit
  15. 20 Mar, 2020 1 commit
    • Add dygraph double grad implementation (#22939) · a31d7328
      Zeng Jinle committed
      * add double grad implementation for dygraph, test=develop
      
      * polish code, add uts, test=develop
      
      * fix place bug, test=develop
      
      * polish codes, add more uts for coverages, test=develop
      
      * add no_grad_set, test=develop
      
      * add star gan ut, test=develop
      
      * follow comments, test=develop
  16. 09 Mar, 2020 1 commit
  17. 03 Dec, 2019 1 commit
    • support SelectedRows in dygraph, test=develop (#21078) · 6ebf0f47
      zhongpu committed
      * support SelectedRows in dygraph, test=develop
      
      * fix bug of _grad_ivar interface, test=develop
      
      * add optest for SelectedRows support, test=develop
      
      * fix bug for gradient_accumulator in GPU mode, test=develop
      
      * fix error when SelectedRows is added to LoDTensor in sorted_gradient mode in dygraph, test=develop
      
      * refine and simplify gradient accumulator code, test=develop
      
      * add optest, test=develop
      
      * add optest and simplify code, test=develop
      
      * fix bug for test_imperative_selected_rows, test=develop
      
      * add optest for Coverage, test=develop
      
      * fix gradient interface and simplify code, test=develop
      
      * update api for gradient, test=develop
      
      * fix ShareDim's bug in DygraphExecutionContext class, test=develop
      
      * add optest, test=develop
  18. 07 Oct, 2019 1 commit
    • Fix/auto prune error on leaf (#20056) · 7a9bd0c5
      Jiabin Yang committed
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, Add Variable api and refine dygraph related API
      
      * test=develop, Add Variable api and refine dygraph related API
      
      * test=develop, refine test for new api and error info
      
      * test=develop, refine error info and test_layers
      
      * test=develop, add API.spec
      
      * test=develop, fix to_string python2 and python3 compat error and refine doc
      
      * test=develop, add API spec
      
      * test=develop, update API spec
      
      * test=develop, update API spec
      
      * test=develop, invoke ci
      
      * test=develop, fix example code
      
      * test=develop, update API spec
      
      * test=develop, fix auto_prune_error_on_leaf
      
      * test=develop, fix auto prune error on loss stop_gradient
      
      * test=develop, remove useless error check
      
      * test=develop, add more ut for sorted gradient
  19. 21 Sep, 2019 1 commit
    • Feature/auto prune in dygraph (#19757) · 45425411
      Jiabin Yang committed
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ
      
      * test=develop, fix ut failed on parallel se-resnext
      
      * test=develop, change one more PADDLE_ENFORCE
      
      * support auto prune in dygraph mode
      
      * test=develop, support auto prune
      
      * test=develop, merge develop conflict
      
      * test=develop, fix test_layer and test_tracer ut
      
      * test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs
  20. 05 Sep, 2019 1 commit
    • Refactor dygraph (#19107) · e9233d1c
      Jiabin Yang committed
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ
      
      * test=develop, fix ut failed on parallel se-resnext
      
      * test=develop, change one more PADDLE_ENFORCE