1. 20 Dec, 2021 1 commit
  2. 03 Dec, 2021 1 commit
  3. 09 Apr, 2021 1 commit
    • L
      [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
      Leo Chen committed
      * [feature] support npu allocator (#30840)
      
      [feature] support npu allocator
      
      * [feature] support npu operator (#30951)
      
      [feature] support npu operator
      
      * [feature] support npu allocator, part 2 (#30972)
      
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, but failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinalize
      
      * add system allocator test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
      
      * support parsing ascend rank table file (#31000)
      
      support parsing ascend rank table file
      
      * Fix reshape on GE graph. (#31084)
      
      Fix reshape on GE graph
      
      * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
      
      * add npu sub op
      
      * fix typo
      
      * rename test
      
      * fix bug
      
      * fix bug
      
      * add fp16 kernel
      
      * fix typo
      
      * support sub grad op
      
      * support elementwise_sub_grad op
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * Fix compilation problem (#31100)
      
      Fix compilation problem (#31100)
      
      * fix compile
      
      * fix code style
      
      * remove const_cast
      
      * support adding correct npu op in pybind.h (#31143)
      
      * support adding correct npu op in pybind.h
      
      * refine code
      
      * [NPU] Support executor with NPU (#31057)
      
      * [NPU] Support executor with NPU
      
      * Fix code according to reviews
      
      * Fix code
      
      * Add unittest for sub op npu
      
      * refactor npu device manager (#31154)
      
      refactor npu device manager (#31154)
      
      * fix selected npus
      
      * fix compile
      
      * fix reading flags from env
      
      * format
      Co-authored-by: xiayanming <41795079@qq.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
  4. 01 Feb, 2021 1 commit
  5. 12 Feb, 2018 1 commit
  6. 10 Feb, 2018 2 commits
  7. 26 Oct, 2017 1 commit
    • Y
      Feature/save op (#5090) · efc2464f
      Yu Yang committed
      * Init
      
      * Stash
      
      * Polish SaveLoadOp
      
      * Fix CI
      
      * Polish code
      
      * Save GPU Tensor
      
      * Stash
      
      * Fix CI
  8. 10 Oct, 2017 1 commit
  9. 05 Oct, 2017 2 commits
    • Y
      Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU · 4558807c
      Yi Wang committed
    • Y
      Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` · 84500f94
      Yu Yang committed
      Done with the following shell commands:
      
      ```bash
      sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
      sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
      ```
  10. 28 Jul, 2017 3 commits
  11. 22 Jul, 2017 1 commit
  12. 21 Jul, 2017 2 commits
  13. 19 Jul, 2017 3 commits
    • L
      Add memcpy · e53a48b4
      liaogang committed
    • F
      Simplify Tensor implementation · 55d30172
      fengjiayi committed
      ATTENTION: some interfaces changed:
      1. void Tensor::set_dims(const DDim& dims) ==> void Tensor::Resize(const DDim& dims)
      2. void Tensor::ShareDataFrom(const Tensor& src) ==> void Tensor::ShareDataWith(const Tensor& src)
      3. DDim Tensor::dims() const ==> const DDim& Tensor::dims() const
    • L
      Add memcpy · 028f3dc4
      liaogang committed
  14. 06 Jul, 2017 1 commit
  15. 28 Jun, 2017 1 commit
  16. 27 Jun, 2017 1 commit
  17. 26 Jun, 2017 2 commits
  18. 25 May, 2017 1 commit
  19. 09 Dec, 2016 1 commit
  20. 29 Aug, 2016 1 commit