Commits · 0d2235aadf87a22773d6ffe8322126715f42d3aa · Crayon鑫 / Paddle

25 Dec, 2017 1 commit
- GPUPlace to CUDAPlace (#6960) · 0d2235aa
  Committed by dzhwinter on Dec 25, 2017
24 Dec, 2017 3 commits

- "remove hash combine" · a521ace6
  Committed by dzhwinter on Dec 24, 2017
- refine OpKernelType (#6879) · 37e96264
  Committed by QI JUN on Dec 24, 2017
  ```
  * refine OpKernelKey
  * refine codes
  * fix code style
  * follow comments
  ```

- Feature/operator run place (#6783) · 735eba29
  Committed by dzhwinter on Dec 24, 2017
  * "change operator interface"
  * "move devicepool to device_context"
  * "fix operator test"
  * "fix op_registry Run interface"
  * "net op passed. Need to fix nccl multi-Context"
  * "add nccl group function"
  * "add nccl group function"
  * "fix gpu count exceed 32 error"
  * "fix recurrent op, nccl op"
  * "change the other operators interface with Place"
  * "fix typo"
  * "fix pybind"
  * "fix device in python side"
  * "fix pybind failed"
  * "add init for test"
  * "fix CI"

22 Dec, 2017 1 commit

- "remove GPU Sync Interface" (#6793) · abde3130
  Committed by dzhwinter on Dec 22, 2017
  * "remove GPU Sync Interface"
  * "fix typo"
  * "fix type cast error"
  * "fix related Copy with stream"
  * "fix failed tests with DevicePool"
  * "fix stupid removed position error"

21 Dec, 2017 1 commit
- "small fix of Place" (#6766) · ad2ab952
  Committed by dzhwinter on Dec 21, 2017
18 Dec, 2017 2 commits
- Refine CUDA profiler and delete the test file. · 521db98b
  Committed by dangqingqing on Dec 18, 2017
- add more place test and rename Cudnn to CUDNN (#6621) · 93a2d9c5
  Committed by QI JUN on Dec 18, 2017
  ```
  * add more place_test and rename Cudnn to CUDNN
  * fix ci
  ```
15 Dec, 2017 5 commits
- Simplify system_allocator and fix GPU_INFO (#6653) · 1b0c7d7c
  Committed by Yu Yang on Dec 15, 2017
- Fix compile on CUDA9.1 & MacOS (#6642) · d5cab4f0
  Committed by Yu Yang on Dec 15, 2017
- fix place_test on MKLDNNPlace · bf269d67
  Committed by tensor-tang on Dec 15, 2017
- fix conflict of Place · a92f057e
  Committed by tensor-tang on Dec 15, 2017
- fix undefined issue when with_gpu · f2712105
  Committed by tensor-tang on Dec 15, 2017
14 Dec, 2017 2 commits
- add MKLDNNPlace · e0c33176
  Committed by tensor-tang on Dec 14, 2017
- "derived cudnnDevice context" (#6585) · 0e9b393b
  Committed by dzhwinter on Dec 14, 2017
  ```
  * "derived cudnnDevice context"
  * "leave remove cudnn handle from CUDADeviceContext"
  * "fix math function error"
  ```
12 Dec, 2017 1 commit

- Refine device context (#6433) · 61ec0b95
  Committed by QI JUN on Dec 12, 2017

  The main changes are:

  - take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
  - remove the `eigen_device` interface in base class `DeviceContext`
  - remove the `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
  - remove unused `platform::EigenDeviceConverter`
  - rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
  - rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
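
The first change listed for 61ec0b95 (templating math functors and OpKernel on `DeviceContext` rather than `Place`) can be pictured with a minimal sketch. Everything here is an assumption made for illustration: `CPUDeviceContext`, `ScaleFunctor` and `ScaleKernel` are simplified stand-ins, not the classes touched by the commit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a device context; not Paddle's real class.
struct CPUDeviceContext {};

// A math functor keyed on the device context type instead of a Place type.
// The primary template is only declared, so using an unsupported context
// fails at compile time.
template <typename DeviceContext, typename T>
struct ScaleFunctor;

// CPU specialization: multiplies every element by `factor` in place.
template <typename T>
struct ScaleFunctor<CPUDeviceContext, T> {
  void operator()(const CPUDeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    assert(data != nullptr);
    (void)ctx;  // a real context would supply streams, handles, and so on
    for (T& v : *data) v *= factor;
  }
};

// The kernel takes the same template parameter and forwards the context,
// so no separate lookup from place to device is needed.
template <typename DeviceContext, typename T>
struct ScaleKernel {
  void Compute(const DeviceContext& ctx, T factor,
               std::vector<T>* data) const {
    ScaleFunctor<DeviceContext, T>()(ctx, factor, data);
  }
};
```

In this sketch a caller would write `ScaleKernel<CPUDeviceContext, float>().Compute(ctx, 2.0f, &values)`; a CUDA variant would add its own specialization of the functor for a CUDA context type.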

07 Dec, 2017 2 commits
- Remove DeviceContext::Finish · 6b9567e0
  Committed by Yang Yu on Dec 07, 2017
- Add HasCUDNN to detect if CUDNN is installed or not (#6349) · f291abfc
  Committed by Yu Yang on Dec 07, 2017
  ```
  * Add HasCUDNN to detect if CUDNN is installed or not
  * Fix CI
  ```
05 Dec, 2017 1 commit
- fix bug in gpu default memory allocating policy (#6268) · 96a5f96c
  Committed by QI JUN on Dec 05, 2017
01 Dec, 2017 3 commits
- change GPU memory allocating policy (#6159) · d066b07f
  Committed by QI JUN on Dec 01, 2017
  ```
  * change GPU memory allocating policy
  * fix potential overflow bug
  ```
- code refine (#6164) · e50f3570
  Committed by chengduo on Dec 01, 2017
- Fix the performance problem of enforce (#6085) · 8ac02279
  Committed by Yu Yang on Dec 01, 2017
  ```
  * Fix performance problem of enforce
  * Fix missing `;` in code
  * Fix CI
  ```
29 Nov, 2017 1 commit
- Fix compile on cudnn7 (#5982) · 4ecbab42
  Committed by 武毅 on Nov 29, 2017
  ```
  * fix compile on cudnn7
  * update
  * update
  * make silent
  ```
28 Nov, 2017 2 commits
- Refine paddle/v2/fluid/profiler.py. · 5e7e90ce
  Committed by dangqingqing on Nov 28, 2017
- Refine paddle/v2/fluid/profiler.py. · 696b0253
  Committed by dangqingqing on Nov 28, 2017
27 Nov, 2017 3 commits
- Add cuda profiler tools and expose it in Python. · 623f62a7
  Committed by dangqingqing on Nov 27, 2017
- Add cuda profiler tools. · 6cf2dcbc
  Committed by dangqingqing on Nov 27, 2017
- Conv cudnn 3d (#5783) · a06bec12
  Committed by 武毅 on Nov 27, 2017
  ```
  * conv cudnn 3d
  * update test case
  * update
  * update
  * follow comments and remove groups from helper
  * update
  * refine
  * update
  * follow comments2
  * update
  * fix compile
  ```
24 Nov, 2017 1 commit

- Make enforce target (#5889) · c9172c1c
  Committed by Qiao Longfei on Nov 24, 2017
  * make enforce a target that depends on nccl when GPU is enabled
  * add some more dependencies

23 Nov, 2017 1 commit
- Feature/support int64 for sum (#5832) · c077a6d5
  Committed by Yu Yang on Nov 23, 2017
  ```
  * Support int64 for sum op
  * Refine code
  ```
16 Nov, 2017 1 commit
- "fix accuracy kernel bug" (#5673) · e97b8987
  Committed by dzhwinter on Nov 15, 2017
  ```
  * "fix accuracy kernel bug"
  * "relaunch ci"
  ```
15 Nov, 2017 1 commit
- fix data layout · 74912c7d
  Committed by chengduoZH on Nov 15, 2017
13 Nov, 2017 3 commits
- add cudnn_pool3d unit test · ec1e2fc9
  Committed by chengduoZH on Nov 13, 2017
- add cudnn 3d unit test · a93a59ec
  Committed by chengduoZH on Nov 13, 2017
- Fix GPU Compile on Linux · 17405027
  Committed by Yang Yu on Nov 13, 2017
11 Nov, 2017 2 commits

- Use G++ to compile some cu operators. · f5e36765
  Committed by dangqingqing on Nov 11, 2017
- Fix a deadlock bug for dyload/nccl.h when nccl lib cannot be loaded (#5533) · 2378679a
  Committed by emailweixu on Nov 10, 2017

  It is caused by a bug of std::call_once described in https://stackoverflow.com/questions/41717579/stdcall-once-hangs-on-second-call-after-callable-threw-on-first-call. It is likely caused by a deeper bug of pthread_once, which is discussed in https://patchwork.ozlabs.org/patch/482350/
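
The message of 2378679a attributes the hang to an exception escaping the callable passed to `std::call_once`. One way to sidestep that class of problem, shown here only as a sketch and not necessarily what the commit does, is to make sure the one-time initializer never throws and to record the failure instead. `LoadLibraryOrThrow`, `LibraryAvailable` and `g_attempts` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical loader standing in for opening a shared library that is
// missing on this machine; it always fails in this sketch.
inline void LoadLibraryOrThrow() {
  throw std::runtime_error("library not found");
}

// Counts how many times the loader body actually ran (illustration only).
int g_attempts = 0;

// One-time initialization through a function-local static, which C++11
// guarantees to be thread-safe. The exception is caught inside the
// initializer, so nothing propagates out of the once-only section and
// later calls simply observe the recorded result.
inline bool LibraryAvailable() {
  static const bool loaded = []() noexcept {
    ++g_attempts;
    try {
      LoadLibraryOrThrow();
      return true;
    } catch (const std::exception&) {
      return false;
    }
  }();
  return loaded;
}
```

With this shape, a second call to `LibraryAvailable()` returns the cached `false` immediately rather than re-entering the initializer.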

08 Nov, 2017 2 commits
- CompareOp's kernel device type is decided by input tensor place · 3187451a
  Committed by Yang Yu on Nov 07, 2017
  ```
  CompareOp can run on CPU even when other operators are running on GPU, since
  operations like comparing control flags should be performed only on CPU
  ```
- Check errors for the cuda kernel calls. (#5436) · 58db07b7
  Committed by qingqing01 on Nov 08, 2017
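
The rule described in 3187451a (a compare kernel follows the place of its input tensor rather than the place the rest of the program runs on) can be sketched as below. The types `Place`, `Tensor` and `KernelType` and both functions are simplified assumptions for illustration, not the framework's real definitions.

```cpp
#include <cassert>

// Hypothetical, simplified types; the real framework types are richer.
enum class Place { kCPU, kCUDA };

struct Tensor {
  Place place;  // where the tensor's memory lives
};

struct KernelType {
  Place place;  // where the selected kernel will run
};

// Default rule in this sketch: the kernel runs wherever the executor runs.
inline KernelType DefaultKernelType(Place executor_place) {
  return KernelType{executor_place};
}

// Compare-style rule: ignore the executor's place and run where the input
// already lives, so a CPU-resident control flag is compared on the CPU.
inline KernelType CompareKernelType(const Tensor& x, Place /*executor_place*/) {
  return KernelType{x.place};
}
```

Under these assumptions, a flag tensor held on the CPU yields a CPU kernel even when the executor place is CUDA, while ordinary operators keep following the executor.
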
31 Oct, 2017 1 commit
- remove unused code (#5219) · afd1e844
  Committed by QI JUN on Oct 30, 2017
  ```
  * remove unused code
  * fix cmake file
  * fix build error
  ```

Crayon鑫 / Paddle is in sync with the fork's source project
