1. 09 Apr, 2021 1 commit
    • [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
      Leo Chen committed
      * [feature] support npu allocator (#30840)
      
      [feature] support npu allocator
      
      * [feature] support npu operator (#30951)
      
      [feature] support npu operator
      
      * [feature] support npu allocator, part 2 (#30972)
      
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, but failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinalize
      
      * add system allocator test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
      
      * support parsing ascend rank table file (#31000)
      
      support parsing ascend rank table file
      
      * Fix reshape on GE graph. (#31084)
      
      Fix reshape on GE graph
      
      * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
      
      * add npu sub op
      
      * fix typo
      
      * rename test
      
      * fix bug
      
      * fix bug
      
      * add fp16 kernel
      
      * fix typo
      
      * support sub grad op
      
      * support elementwise_sub_grad op
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * Fix compilation problem (#31100)
      
      Fix compilation problem (#31100)
      
      * fix compile
      
      * fix code style
      
      * remove const_cast
      
      * support adding correct npu op in pybind.h (#31143)
      
      * support adding correct npu op in pybind.h
      
      * refine code
      
      * [NPU] Support executor with NPU (#31057)
      
      * [NPU] Support executor with NPU
      
      * Fix code according to reviews
      
      * Fix code
      
      * Add unittest for sub op npu
      
      * refactor npu device manager (#31154)
      
      refactor npu device manager (#31154)
      
      * fix selected npus
      
      * fix compile
      
      * fix reading flags from env
      
      * format
      Co-authored-by: xiayanming <41795079@qq.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
  2. 26 Feb, 2021 1 commit
  3. 24 Sep, 2020 1 commit
    • use iwyu clean include (#27267) · df43905f
      wanghuancoder committed
      * use iwyu clean include, test=develop, test=win
      
      * compilation error, test=develop
      
      * fix compilation error2, test=develop
      
      * fix compilation error3, test=develop
      
      * fix compilation error4, test=develop
      
      * fix compilation error5, test=develop
      
      * fix compilation error6, test=develop
      
      * fix compilation error7, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error10, test=develop
      
      * fix compilation error11, test=develop
  4. 08 Sep, 2020 1 commit
    • Error msg/polish tensor error msg (#26976) · 13804ed8
      WeiXin committed
      * polish one line error message in tensor.cc
      
      * polish error messages in tensor.cc,tensor.h tensor_impl.h
      
      * polish error messages in tensor.cc tensor.h tensor_impl.h
      
      * polish error messages in tensor.cc,tensor.h tensor_impl.h
      
      * polish error messages in tensor.cc tensor.h tensor_impl.h tensor_test.cc
      
      * polish error messages in tensor.cc tensor.h tensor_impl.h
  5. 08 Nov, 2019 1 commit
  6. 18 Oct, 2019 1 commit
  7. 27 Sep, 2019 1 commit
    • Paddle error message stack shaping and optimization (#19895) · b9163350
      Chen Weihang committed
      * shape and optimize paddle error message stack, test=develop
      
      * limit exception type & add unittest, test=develop
      
      * fix multi-platform problem, test=develop
      
      * fix related unittest failures, test=develop
      
      * add doc & fix unittest errors, test=develop
      
      * fix function name error, test=develop
      
      * update tensor test exception msg compare, test=develop
      
      * remove unittest on win32, the dir format is different, test=develop
      
      * remove useless package, test=develop
      
      * add paddle enforce handler unittest, test=develop
      
      * add exception checkout, test=develop
      
      * fix coverage failed, test=develop
      
      * fix op registry test failed, test=develop
      
      * refactor whole pr, test=develop
      
      * remove test in CMakelist, test=develop
      
      * fix coverage, test=develop
  8. 09 Sep, 2019 1 commit
  9. 18 Jul, 2019 1 commit
    • Feature/auto_growth_allocator (#18561) · ae58afc5
      Zeng Jinle committed
      * feature/auto_growth_allocator, test=develop
      
      * add unittest of AlignedAllocator, test=develop
      
      * try to turn on auto_growth to test on CI, test=develop
      
      * fix segmentation fault in mixed_vector.h, test=develop
      
      * add unittests, test=develop
  10. 18 Dec, 2018 1 commit
    • add ir memory optimize. (#14530) · 7cd24b13
      dzhwinter committed
      * follow comments. test=develop
      
      * Fix typo
      
      * fix compile error. test=develop
      
      * merge develop branch. test=develop
      
      * Remove set_equal
      
      * Polish code
      
      * Delete unused functions
      
      test=develop
      
      * polish code. test=develop
      
      * follow comment
      
      * polish code.
      
      * fix windows compile error. test=develop
      
      * fix op handle.
      
      * rerun ci. test=develop
      
      * rerun ci. test=develop
      
      * rerun macci. test=develop
      
      * polish code. test=develop
      
      * rewrite sort code. test=develop
      
      * remove unused code. test=develop
      
      * fix tests. test=develop
      
      * fix conflict. test=develop
      
      * follow comment. test=develop
      
      * merge develop branch. test=develop
      
      * fix tests. test=develop
      
      * remove ToTypeIndex. test=develop
      
      * rerun ci. test=develop
  11. 30 Oct, 2018 1 commit
  12. 01 Aug, 2018 1 commit
  13. 07 Jun, 2018 1 commit
    • Mkldnn layout (#11040) · 3ff9ba0e
      mozga-intel committed
      * Add MKLDNN layout support in Paddle
      
      Add MKLDNN layout in Paddle so that an MKLDNN-friendly memory layout
      can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW
      was hardcoded in all MKLDNN OP kernels. As a result, a
      non-optimized execution path was selected in the MKLDNN primitive,
      which hurt performance.
      Besides the framework change, three MKLDNN OP kernels were updated
      to use the new MKLDNN layout: conv, pool2d and batch_norm.
      Other MKLDNN OP kernels also need to be updated in a similar way to
      achieve the best performance.
      
      * Add MKLDNN layout support in activation OP
      
      * Don't populate layout from input to output when kMKLDNN in
      
      * Refine pool mkldnn op kernel
      
      * MKLDNN layout
      
      * Remove the inheritance from tensor file
      
      * MKLDNN layout: refactoring
      
      * Remove additional #define to register new operator
      
      * Prepare mkldnn tests to work with layout
  14. 12 Feb, 2018 1 commit
  15. 10 Feb, 2018 2 commits
  16. 21 Jan, 2018 1 commit
    • "fix decode bug" (#7711) · e983cc90
      dzhwinter committed
      * "fix decode bug"
      
      * "follow comment"
      
      * "fix error"
      
      * "fix hook bug"
      
      * fix based comment
      
      * fix copyright
      
      * fix based on comment
  17. 15 Jan, 2018 1 commit
    • Feature/hooks (#7513) · b9b75377
      dzhwinter committed
      * add copyright hook
      
      * add copyright hook
      
      * refine copyright hook
      
      * "test copyright hook"
      
      * fix check style
      
      * fix ci
  18. 28 Dec, 2017 1 commit
    • Implement selectedrows serialize and deserialize (#7042) · 2cdef424
      Yancey committed
      * implement selectedrows serialize and deserialize
      
      * make serialize/deserialize as global function
      
      * recover send_imp.cc
      
      * delete unused brackets
      
      * fix compile error
      
      * serialize version in LodTensor and SelectedRows
      
      * fix ci
      
      * fix ci
  19. 25 Dec, 2017 2 commits
  20. 26 Nov, 2017 1 commit
    • Feature/copytensor (#5455) · 45062fe5
      dzhwinter committed
      * "make global tensor function independently"
      
      * "replace functor"
      
      * "fix inline template error"
      
      * "fix tensor array with CopyFrom"
      
      * "fix other case use CopyFrom"
      
      * "move the op interface hardly"
      
      * "fix operators"
      
      * "fix typo"
      
      * "delete dynamic recurrent rnn and fix gru_unit in debugmode"
      
      * "fix unique_ptr copy"
      
      * "fix cuda copy"
      
      * "fix namespace error"
      
      * "removed nccl python test"
      
      * "fix include error"
      
      * "fix typo"
      
      * fix copy util test
  21. 20 Oct, 2017 1 commit
  22. 12 Oct, 2017 1 commit
  23. 10 Oct, 2017 1 commit
  24. 05 Oct, 2017 2 commits
    • Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU · 4558807c
      Yi Wang committed
    • Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` · 84500f94
      Yu Yang committed
      By shell command
      
      ```bash
      sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
      sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
      ```
  25. 15 Sep, 2017 1 commit
  26. 08 Sep, 2017 1 commit
  27. 07 Sep, 2017 2 commits
  28. 06 Sep, 2017 1 commit
  29. 05 Sep, 2017 1 commit
    • WIP · e76fa85c
      fengjiayi committed
  30. 09 Aug, 2017 1 commit
  31. 08 Aug, 2017 1 commit
    • fix some enforce (#3301) · 2af35002
      Yan Chunwei committed
      * fix some enforce
      
      * remove compatible_type to avoid compile error
      
      * remove shared_ptr
      
      * fix tensor error msg
  32. 28 Jul, 2017 1 commit
  33. 26 Jul, 2017 1 commit
  34. 25 Jul, 2017 2 commits
  35. 19 Jul, 2017 1 commit
    • Simplify Tensor implementation · 55d30172
      fengjiayi committed
      ATTENTION: some interfaces changed:
      1. void Tensor::set_dims(const DDim& dims) ==> void Tensor::Resize(const DDim& dims)
      2. void Tensor::ShareDataFrom(const Tensor& src) ==> void Tensor::ShareDataWith(const Tensor& src)
      3. DDim Tensor::dims() const ==> const DDim& Tensor::dims() const
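
The three interface changes listed in commit 55d30172 can be illustrated with a small stand-in. This is only a sketch: the `Tensor` class and the `DDim` alias below are simplified assumptions made up for illustration, not Paddle's real definitions, and the `mutable_data()` and `data()` helpers are hypothetical, with signatures chosen for brevity.

```cpp
// Illustrative stand-ins only: these are NOT Paddle's real classes.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Assumption: a plain vector stands in for the framework's DDim type.
using DDim = std::vector<int64_t>;

class Tensor {
 public:
  // New name for what used to be set_dims(const DDim&).
  void Resize(const DDim& dims) { dims_ = dims; }

  // New name for what used to be ShareDataFrom(const Tensor&).
  // In this sketch it aliases src's buffer and dims without copying data.
  void ShareDataWith(const Tensor& src) {
    holder_ = src.holder_;
    dims_ = src.dims_;
  }

  // Now returns a const reference instead of a DDim by value.
  const DDim& dims() const { return dims_; }

  // Hypothetical helper: allocate float storage for the current dims.
  float* mutable_data() {
    int64_t n = 1;
    for (int64_t d : dims_) {
      assert(d >= 0);
      n *= d;
    }
    holder_ = std::make_shared<std::vector<float>>(static_cast<std::size_t>(n));
    return holder_->data();
  }

  // Hypothetical helper: read-only access, nullptr if nothing is allocated.
  const float* data() const { return holder_ ? holder_->data() : nullptr; }

 private:
  std::shared_ptr<std::vector<float>> holder_;  // shared buffer
  DDim dims_;
};
```

Under these assumptions, a caller that previously wrote `t.set_dims(d)` and `t.ShareDataFrom(src)` would write `t.Resize(d)` and `t.ShareDataWith(src)`, and code that stored the result of `dims()` in a `DDim` keeps compiling because the returned reference is copied.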