Commits · 3c1dc6f6d728541a61365ef09c208a49e29bc22c · PaddlePaddle / Paddle

24 Jan, 2022 1 commit

[PTEN] Move dynload from fluid to pten. (#39120) · 3c1dc6f6

Committed by Wilber on Jan 24, 2022

* move dynload from fluid to pten.

* fix ci compile

* fix windows ci compile.

* update

* update

* fix compile error


08 Nov, 2021 1 commit

Use cuda virtual memory management and merge blocks (#36189) · a1ec1d5a

Committed by wanghuancoder on Nov 08, 2021

* Use cuda virtual memory management and merge blocks, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* window dll, test=develop

* fix cuda error of CUDA_ERROR_NOT_INITIALIZED, test=develop

* use autogrowthv2 for system allocator, test=develop

* remove ~CUDAVirtualMemAllocator(), test=develop

* refine, test=develop

* fix cuda error of CUDA_ERROR_NOT_INITIALIZED, test=develop

* fix cuda error of CUDA_ERROR_NOT_INITIALIZED, test=develop

* fix bug, test=develop

* revert system allocator, test =develop

* revert multiprocessing, test=develop

* fix AutoGrowthBestFitAllocatorV2 mutxt, test=develop

* catch cudaErrorInitializationError when create allocator, test=develop

* fix cuMemSetAccess use, test=develop

* refine cuda api use, test=develop

* refine, test=develop

* for test, test=develop

* for test, test=develop

* switch to v2, test=develop

* refine virtual allocator, test=develop

* Record cuMemCreate and cuMemRelease, test=develop

* refine, test=develop

* avoid out of bounds, test=develop

* rename allocator, test=develop

* refine, test=develop

* use PADDLE_ENFORCE_CUDA_SUCCESS, test=develop

* for test,test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop


09 Jul, 2020 1 commit
- remove WITH_DSO compile option (#25444) · 172d4ecb
  Committed by Chen Weihang on Jul 09, 2020
03 Jan, 2020 1 commit

Add the first implememtation of fusion_group op (#19621) · d4832077

Committed by Yiqun Liu on Jan 03, 2020

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Refine the calling of PADDLE_ENFORCE.
test=develop


05 Sep, 2019 1 commit

Integrate NVRTC to support compiling CUDA kernel at runtime (#19422) · 42b5bec6

Committed by Yiqun Liu on Sep 05, 2019

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop


20 Jun, 2018 1 commit
- enable dynamic load mklml lib on fluid · f503f129
  Committed by tensor-tang on Jun 20, 2018
08 Apr, 2018 1 commit
- Fix cpplint errors with paddle/fluid/platform/dynload (#9715) · e185502e
  Committed by Yi Wang on Apr 07, 2018
```
* Update source files.

* Update headers

* Update

* Update

* Update

* Update

* Fix a CMake dependency
```
12 Feb, 2018 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
10 Feb, 2018 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
26 Dec, 2017 1 commit
- unify the indentation of license · 761b3297
  Committed by Luo Tao on Dec 26, 2017
15 Dec, 2017 1 commit
- Fix compile on CUDA9.1 & MacOS (#6642) · d5cab4f0
  Committed by Yu Yang on Dec 15, 2017
24 Oct, 2017 1 commit

Feature/nccl dso (#5001) · 43c6ff21

Committed by Yu Yang on Oct 23, 2017

* "add nccl enforce"

* Dev

* Update comment

* Add nccl test

* Follow comments


20 Aug, 2017 2 commits
- Make OpInfoMap as a class · 7f6b5044
  Committed by Yu Yang on Aug 20, 2017
```
* Add Get/Has methods to OpInfoMap
* Add PADDLE_ENFORCE for OpInfo to get field.
```
- Extract OpInfo into a library · 59b3df31
  Committed by Yu Yang on Aug 20, 2017
```
Fix cycle dependencies, Fix #3583.
```
16 Aug, 2017 1 commit
- Remove std::shared_ptr in Python & C++ · f15e0830
  Committed by Yu Yang on Aug 16, 2017
```
* Also simplify pybind implementation by using OperatorBase as holder
  type.
```
01 Aug, 2017 1 commit
- Follow comments and merge develop · e2fd2bd0
  Committed by Yu Yang on Aug 01, 2017
26 Jul, 2017 2 commits
- Update Interface · b1b13f8f
  Committed by Yu Yang on Jul 26, 2017
- Update Backward · ecf23ce5
  Committed by Yu Yang on Jul 26, 2017
25 Jul, 2017 1 commit
- ENH: Refine Tensor and And CopyFrom · de8a8fee
  Committed by liaogang on Jul 25, 2017
17 Jul, 2017 2 commits
- Refine CMake dependencies graph · 38310f93
  Committed by Yu Yang on Jul 17, 2017
- Add enforce switch for convient develop (#2850) · cdec5634
  Committed by Yan Chunwei on Jul 17, 2017
```
* add NDEBUG switch to PADDLE_ENFORCE
```
11 Jul, 2017 2 commits
- "support net_proto header" · 18e65b0c
  Committed by dongzhihong on Jul 11, 2017
- "move opContext to DeviceContext" · bc021d77
  Committed by dongzhihong on Jul 11, 2017
06 Jul, 2017 2 commits
- FIX: explicit construct pool element · a669bf48
  Committed by liaogang on Jul 06, 2017
- ENH: add memory unit test · 74691789
  Committed by liaogang on Jul 06, 2017
05 Jul, 2017 1 commit
- FIX: Buddy Allocator Free with Merge feature · ada1c20b
  Committed by liaogang on Jul 05, 2017
04 Jul, 2017 4 commits
- ENH: Add paddle_memory for external usage · 4dc3c9e0
  Committed by liaogang on Jul 04, 2017
- ENH: Add buddy allocator Free · 0ba63475
  Committed by liaogang on Jul 04, 2017
- ENH: code style · ff363894
  Committed by liaogang on Jul 04, 2017
- ENH: add buddy alloctor Free · 4e1617d0
  Committed by liaogang on Jul 04, 2017
03 Jul, 2017 1 commit
- ENH: Add Alloc for buddy Allocator · bbd3eab7
  Committed by liaogang on Jul 03, 2017
```
* Free will be added soon
```
28 Jun, 2017 2 commits
- FIX: Pass CI · 3e9aa7fd
  Committed by liaogang on Jun 28, 2017
- Add buddy_allocator.cc and system_allocator.cc · 3e087f76
  Committed by Yi Wang on Jun 27, 2017

PaddlePaddle / Paddle: synced successfully more than 1 year ago
