1. 09 Apr, 2021 1 commit
    • L
      [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
      Leo Chen committed
      * [feature] support npu allocator (#30840)
      
      [feature] support npu allocator
      
      * [feature] support npu operator (#30951)
      
      [feature] support npu operator
      
      * [feature] support npu allocator, part 2 (#30972)
      
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, but failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinalize
      
      * add system allocator test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
      
      * support parsing ascend rank table file (#31000)
      
      support parsing ascend rank table file
      
      * Fix reshape on GE graph. (#31084)
      
      Fix reshape on GE graph
      
      * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
      
      * add npu sub op
      
      * fix typo
      
      * rename test
      
      * fix bug
      
      * fix bug
      
      * add fp16 kernel
      
      * fix typo
      
      * support sub grad op
      
      * support elementwise_sub_grad op
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * Fix compilation problem (#31100)
      
      Fix compilation problem (#31100)
      
      * fix compile
      
      * fix code style
      
      * remove const_cast
      
      * support adding correct npu op in pybind.h (#31143)
      
      * support adding correct npu op in pybind.h
      
      * refine code
      
      * [NPU] Support executor with NPU (#31057)
      
      * [NPU] Support executor with NPU
      
      * Fix code according to reviews
      
      * Fix code
      
      * Add unittest for sub op npu
      
      * refactor npu device manager (#31154)
      
      refactor npu device manager (#31154)
      
      * fix selected npus
      
      * fix compile
      
      * fix reading flags from env
      
      * format
      Co-authored-by: xiayanming <41795079@qq.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
  2. 07 Apr, 2021 1 commit
    • Z
      【NPU】Merge ascend GE&distributed code by 0208 from ascendrc (#31957) · 8c7c53b3
      zhang wenhui committed
      * Ascend rc (#30483)
      
      * Fix compilation on CANN20.1 and older (#30494)

      Fix compilation on CANN20.1 and older
      
      * Add distribution supported (#30578)
      
      Add distribution supported
      
      * Build parser for Hcom* operators (#30627)

      Build parser for Hcom* operators
      
      * Pass device_ids info from launch to trainer. (#30632)
      
      Pass device_ids info from launch to trainer
      
      * Add Hccl program group (#30642)
      
      Add Hccl program group
      
      * Add startup bash files of test_ascend_group. (#30645)
      
      Add startup bash files of test_ascend_group
      
      * cleanup (#30646)
      
      cleanup test_ascend_group.py
      
      * [Feature] Build parser to support distributed training (#30658)
      
      [Feature] Build parser to support distributed training
      
      * fix compilation on ascend-20.1 (#30722)
      
      fix compilation on ascend-20.1
      
      * Dev/fix ascend string (#30749)
      
      Dev/fix ascend string
      
      * code style (#30781)
      
      code style
      
      * Merge ascend_optimizer and ascend_parser. (#30776)
      
      Merge ascend_optimizer and ascend_parser.
      
      * Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug  (#30797)
      
      Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug
      
      * Add paddle ascend distribution training supported (#30796)
      
      Add paddle ascend distribution training supported
      
      * pass cxx_flags to gloo cmake (#30857)
      
      * Destroy session first. (#30954)
      
      Destroy session first.
      
      * merge
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style, test=develop
      
      * fix, test=develop
      
      * fix
      
      * fix log fatal, test=develop
      
      * fix enforce style, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix rccl, test=develop
      
      * fix test, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix node_num, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      Co-authored-by: hutuxian <hutuxian2011@sina.cn>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: Void Main <voidmain1313113@gmail.com>
      Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
      Co-authored-by: dingsiyu <18369187719@163.com>
      Co-authored-by: OleNet <olenet@126.com>
  3. 31 Mar, 2021 1 commit
    • W
      update cmake minimum version to 3.15 (#31807) · 3a95a0bc
      wuhuanzhou committed
      * update cmake minimum version to 3.15, test=develop
      
      * fix compilation error on Windows, test=develop
      
      * fix compilation error on Windows, test=develop
      
      * fix compilation error on Windows, test=develop
  4. 23 Mar, 2021 2 commits
  5. 17 Mar, 2021 1 commit
  6. 16 Mar, 2021 1 commit
    • W
      Optimize compilation with Ninja (#31449) · 41e9ecfd
      wuhuanzhou committed
      * Optimize compilation with Ninja, notest, test=windows_ci, test=windows_op
      
      * no cache on windows ci, notest, test=windows_ci, test=windows_op
      
      * delete /Zc:inline compiled in NVCC, notest, test=windows_ci, test=windows_op
      
      * fix test_warpctc_op, notest, test=windows_ci
      
      * remove test code, test=develop
  7. 22 Feb, 2021 1 commit
  8. 09 Feb, 2021 1 commit
  9. 21 Jan, 2021 1 commit
  10. 20 Jan, 2021 1 commit
    • W
      optimize unity build (#30195) · 7e671c07
      wuhuanzhou committed
      * optimize unity build, test=develop
      
      * fix code style error, test=develop
      
      * fix code style error and test /MP settings, test=develop
  11. 18 Jan, 2021 1 commit
  12. 14 Jan, 2021 1 commit
  13. 12 Jan, 2021 1 commit
  14. 28 Dec, 2020 1 commit
  15. 26 Dec, 2020 1 commit
  16. 24 Dec, 2020 1 commit
  17. 21 Dec, 2020 1 commit
    • L
      Optimize compilation time with Unity Build (#29733) · 2e5b4a21
      LoveAn committed
      * Test compilation time with less parallel count, notest, test=windows_ci
      
      * optimize rules of Unity Build, notest, test=windows_ci, test=windows_op
      
      * limit parallel counts used only on GPU, test=develop
      
      * remove limit of argument /m:8 on Windows, test=develop
  18. 17 Dec, 2020 1 commit
  19. 16 Dec, 2020 1 commit
    • Y
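      Add ROCm platform support code (#29342) · 76738504
      Y_Xuan committed
      * Add ROCm platform support code

      * Fix some issues

      * Resolve some ambiguities and add comments

      * Fix code format

      * Code changes after resolving conflicts

      * Modify operators.cmake

      * Fix format

      * Fix errors

      * Unify interfaces

      * Update date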
  20. 14 Dec, 2020 1 commit
    • G
      Fix precision problem (#29567) · 08f24a31
      GeminiCarrie committed
      * Fix a bug when running on an operating system without "bash."
      
      * add execution condition
      
      * for ci-coverage
      
      * get cpu information to check the precision problem
      
      * Update compilation environment for musl version
      
      * update dependencies
      
      * remove test code
      
      check cpu info
      
      remove test code
      
      review
      
      * update alpine and third_party dependencies
      
      * add newline for ci Code format
  21. 07 Dec, 2020 1 commit
    • L
      Compiling operator libraries with Unity build (#29130) · 671555ed
      LoveAn committed
      * Compiling operator libraries with Unity Build on Windows CPU.
      
      * Compiling operator libraries with Unity Build on Windows GPU, no_test, test=windows_ci
      
      * Add option in windows ci script, no_test, test=windows_ci
      
      * Optimize parallel compiling, test=develop
      
      * remove limit of parallel compile and skip some ops in UB, test=develop
      
      * remove changes of header file, test=develop
      
      * remove changes of header file, test=develop
      
      * fix test_eye_op unittest failed, test=develop
      
      * Compiling operator libraries with Unity Build on Linux, test=develop
      
      * set default WITH_UNITY_BUILD=OFF, test=develop
      
      * Move unity build rules into a single file and add comment, test=develop
      
      * optimize parallel compilation, test=develop
      
      * fix undefined reference error on coverage ci, test=develop
  22. 03 Dec, 2020 1 commit
  23. 01 Dec, 2020 1 commit
    • S
      add compile option WITH_TENSORRT (#29208) · fc80d2e0
      Shang Zhizhou committed
      * add compile option WITH_TENSORRT
      
      * add WITH_TENSORRT to ci paddle_build.sh
      
      * add WITH_TENSORRT to paddle_build.sh
      
      * change FATAL to WARNING when TensorRT is not found and WITH_TENSORRT=ON, just to pass ci-py3 temporarily
  24. 30 Nov, 2020 1 commit
  25. 27 Nov, 2020 1 commit
  26. 17 Nov, 2020 1 commit
  27. 16 Nov, 2020 1 commit
  28. 03 Nov, 2020 1 commit
  29. 26 Oct, 2020 1 commit
  30. 21 Oct, 2020 1 commit
  31. 14 Oct, 2020 1 commit
  32. 12 Oct, 2020 1 commit
  33. 23 Sep, 2020 1 commit
  34. 21 Aug, 2020 1 commit
    • Q
      support Baidu Kunlun AI Accelerator (#25959) · 138ecf24
      QingshuChen committed
      * support Baidu AI Accelerator
        * test=kunlun
      
      * minor
       * test=kunlun
      
      * support xpu op in separate file
       * test=kunlun
      
      * update XPU error message and remove duplicated code
      
       * test=kunlun
      
      * minor
       * test=kunlun
      
      * minor
       * test=kunlun
  35. 24 Jul, 2020 1 commit
  36. 15 Jul, 2020 1 commit
  37. 09 Jul, 2020 1 commit
  38. 29 Jun, 2020 1 commit
  39. 23 Jun, 2020 1 commit