Commits · 3218075db4c40c7d5131876ab7055d7c2fc39562 · 机器未来 / Paddle

15 Sep, 2021 · 1 commit
- Add paddle.cuda.device.stream_guard API (#35623) · 3218075d
  Committed by Siming Dai on Sep 15, 2021
```
Add paddle.cuda.device.stream_guard API 
```
20 Jul, 2021 · 1 commit
- fix cuda_stream missing mkldnn depending error (#34260) · c963a21d
  Committed by chentianyu03 on Jul 20, 2021
19 Jul, 2021 · 1 commit

Add Cuda event and stream API (#32460) · 9c7f6af5

Committed by chentianyu03 on Jul 19, 2021

* add cuda event and stream api

* add cuda event and stream api

* add get_current_stream api

* add get_current_stream api

* init streams

* modify get_current_stream

* modify get_cuttent_stream

* add synchronize func

* add current_stream doc and test file

* move get_current_stream into CUDA macro

* move CudaEvent into CUDA macro

* move _get_current_stream and _device_synchronize into cuda macro

* modify the macro of cuda stream and event

* add test case for synchronize

* add paddle.devices.cuda module

* event and stream support hip

* add doc for stream and event class

* move cuda stream and event into single pybind

* add cuda_streams_py.cc to cmakelist

* add _device_synchronize and _get_current_stream to core module

* add test case for cudastream and cudaevent

* move __all__ in streams.py

* fix test fail

* add cuda to devices __all__

* fix current_stream doc writing error

* move devices to device direction, and merge device.py into __init__.py

* add required:gpu to sample codes

* remove cuda direction from device/__init__.py


12 Apr, 2021 · 1 commit
- follow comments to refine PR 32144 (#32174) · af374ae6
  Committed by Leo Chen on Apr 12, 2021
09 Apr, 2021 · 1 commit

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, bug failed to exit

* call aclrtResetDevice before exit

* fix aclFinilize

* add system allocatot test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code stype

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>


28 Jan, 2021 · 1 commit
- [ROCM] update fluid platform for rocm35 (part1), test=develop (#30639) · f89da4ab
  Committed by Qi Li on Jan 28, 2021
```
* [ROCM] update fluid platform for rocm35 (part1), test=develop

* address review comments, test=develop
```
24 Sep, 2020 · 1 commit

use iwyu clean include (#27267) · df43905f

Committed by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop


11 May, 2020 · 1 commit

Add macro BOOST_GET to enrich the error information of boost :: get (#24175) · aa0f254f

Committed by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop


23 Apr, 2020 · 1 commit
- declare the stream::Priority as enum class, test=develop (#24013) · 34d7d6ae
  Committed by 石晓伟 on Apr 23, 2020
22 Apr, 2020 · 1 commit
- add boost dependency to cuda_stream (#24032) · db6d8673
  Committed by 石晓伟 on Apr 22, 2020
20 Apr, 2020 · 1 commit

Optimize the error messages of paddle CUDA API (#23816) · 78170037

Committed by Zhou Wei on Apr 20, 2020

* Optimize the error messages of paddle CUDA API, test=develop

* fix the error messages of paddle CUDA API, test=develop

* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop

* remove build_ex_string,test=develop

* merge conflict,test=develop


17 Apr, 2020 · 1 commit

DeviceContext Split, test=develop (#23737) · 2d01cc85

Committed by 石晓伟 on Apr 17, 2020

* supports thread-binding stream, test=develop

* avoid using thread_local variables in dtor, test=develop

* modify the stream priority enum, test=develop


01 Apr, 2020 · 1 commit
- reverts the commit 23177, test=develop (#23363) · 5c59d213
  Committed by 石晓伟 on Apr 01, 2020
30 Mar, 2020 · 1 commit
- supports thread-binding stream, test=develop (#23177) · 75ebb48a
  Committed by 石晓伟 on Mar 30, 2020
