Commits · ee3483b0d9bd27be44c2c85d346a032e44d93ca7 · s920243400 / PaddleDetection

05 Dec, 2017 1 commit
- Q
  
  fix bug in gpu default memory allocating policy (#6268) · 96a5f96c
  Committed by QI JUN on Dec 05, 2017
  
  96a5f96c
01 Dec, 2017 3 commits
- Q
  change GPU memory allocating policy (#6159) · d066b07f
  Committed by QI JUN on Dec 01, 2017
```
* change GPU memory allocating policy

* fix potential overflow bug
```
  d066b07f
- C
  
  code refine (#6164) · e50f3570
  Committed by chengduo on Dec 01, 2017
  
  e50f3570
- Y
  Fix the performance problem of enforce (#6085) · 8ac02279
  Committed by Yu Yang on Dec 01, 2017
```
* Fix performance problem of enforce

* Fix missing `;` in code

* Fix CI
```
  8ac02279
29 Nov, 2017 1 commit
- 武
  Fix compile on cudnn7 (#5982) · 4ecbab42
  Committed by 武毅 on Nov 29, 2017
```
* fix compile on cudnn7

* update

* update

* make silent
```
  4ecbab42
28 Nov, 2017 2 commits
- D
  
  Refine paddle/v2/fluid/profiler.py. · 5e7e90ce
  Committed by dangqingqing on Nov 28, 2017
  
  5e7e90ce
- D
  
  Refine paddle/v2/fluid/profiler.py. · 696b0253
  Committed by dangqingqing on Nov 28, 2017
  
  696b0253
27 Nov, 2017 3 commits
- D
  
  Add cuda profiler tools and expose it in Python. · 623f62a7
  Committed by dangqingqing on Nov 27, 2017
  
  623f62a7
- D
  
  Add cuda profiler tools. · 6cf2dcbc
  Committed by dangqingqing on Nov 27, 2017
  
  6cf2dcbc
- 武
  Conv cudnn 3d (#5783) · a06bec12
  Committed by 武毅 on Nov 27, 2017
```
* conv cudnn 3d

* update test case

* update

* update

* follow comments and remove groups from helper

* update

* refine

* update

* follow comments2

* update

* fix compile
```
  a06bec12
24 Nov, 2017 1 commit

Committed by Qiao Longfei on Nov 24, 2017

* make enforce a target and dependent on nccl when gpu is enabled

* add some more dependency

c9172c1c

23 Nov, 2017 1 commit
- Y
  Feature/support int64 for sum (#5832) · c077a6d5
  Committed by Yu Yang on Nov 23, 2017
```
* Support int64 for sum op

* Refine code
```
  c077a6d5
16 Nov, 2017 1 commit
- D
  "fix accuracy kernel bug" (#5673) · e97b8987
  Committed by dzhwinter on Nov 15, 2017
```
* "fix accuracy kernel bug"

* "relaunch ci"
```
  e97b8987
15 Nov, 2017 1 commit
- C
  
  fix data layout · 74912c7d
  Committed by chengduoZH on Nov 15, 2017
  
  74912c7d
13 Nov, 2017 3 commits
- C
  
  add cudnn_pool3d unit test · ec1e2fc9
  Committed by chengduoZH on Nov 13, 2017
  
  ec1e2fc9
- C
  
  add cudnn 3d unit test · a93a59ec
  Committed by chengduoZH on Nov 13, 2017
  
  a93a59ec
- Y
  
  Fix GPU Compile on Linux · 17405027
  Committed by Yang Yu on Nov 13, 2017
  
  17405027
11 Nov, 2017 2 commits

D

Use G++ to compile some cu operators. · f5e36765
Committed by dangqingqing on Nov 11, 2017

f5e36765

Fix a dead lock bug for dyload/nccl.h when nccl lib cannot be loaded (#5533) · 2378679a

Committed by emailweixu on Nov 10, 2017

It is caused by a bug in std::call_once described in https://stackoverflow.com/questions/41717579/stdcall-once-hangs-on-second-call-after-callable-threw-on-first-call. It is likely caused by a deeper bug in pthread_once, which is discussed in https://patchwork.ozlabs.org/patch/482350/

2378679a
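
One way to sidestep the hang described in the commit above is to make sure no exception ever escapes the callable passed to `std::call_once`. The sketch below is an assumption about how such a guard could look, not the code from #5533; `LazyLibrary` and `EnsureLoaded` are hypothetical names introduced only for illustration.

```cpp
#include <exception>
#include <mutex>
#include <stdexcept>
#include <string>

// Hypothetical loader state (illustrative, not Paddle's actual type).
struct LazyLibrary {
  std::once_flag flag;
  bool loaded = false;
  std::string error;  // Message of the failure, if loading failed.
};

// Runs `load` at most once. Any exception is caught inside the callable, so
// std::call_once always returns normally and the buggy "exceptional" path of
// the underlying once primitive is never exercised.
template <typename LoadFn>
bool EnsureLoaded(LazyLibrary& lib, LoadFn load) {
  std::call_once(lib.flag, [&] {
    try {
      load();
      lib.loaded = true;
    } catch (const std::exception& e) {
      lib.error = e.what();
    } catch (...) {
      lib.error = "unknown error";
    }
  });
  return lib.loaded;
}
```

With this shape, a failed load is recorded once and every later call returns `false` immediately instead of blocking; the caller can inspect `error` and decide whether to throw outside the once-guarded region. The trade-off is that a failed load is not retried.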

08 Nov, 2017 2 commits
- Y
  CompareOp's kernel device type is decided by input tensor place · 3187451a
  Committed by Yang Yu on Nov 07, 2017
```
CompareOp can run on CPU even when other operators are running on GPU, since
operations like comparing control flags should be performed only on CPU
```
  3187451a
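
The placement rule in the commit above can be sketched as a small selection function. This is a simplified illustration under stated assumptions, not Paddle's kernel-dispatch code: `Place`, `Tensor`, and `SelectKernelPlace` are hypothetical, and the operator names checked are only examples of compare-style ops.

```cpp
#include <string>

// Hypothetical, minimal stand-ins for a device place and a tensor.
enum class Place { kCPU, kCUDA };

struct Tensor {
  Place place;  // Where the tensor's data currently lives.
};

// Most operators run wherever the executor runs. Compare-style operators
// instead follow their input tensor, so a control flag that lives on the CPU
// is compared on the CPU even when the executor targets the GPU.
Place SelectKernelPlace(const std::string& op_type, const Tensor& input,
                        Place executor_place) {
  const bool is_compare = (op_type == "less_than" || op_type == "equal");
  return is_compare ? input.place : executor_place;
}
```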
- Q
  
  Check errors for the cuda kernel calls. (#5436) · 58db07b7
  Committed by qingqing01 on Nov 08, 2017
  
  58db07b7
31 Oct, 2017 1 commit
- Q
  remove unused code (#5219) · afd1e844
  Committed by QI JUN on Oct 30, 2017
```
* remove unused code

* fix cmake file

* fix build error
```
  afd1e844
26 Oct, 2017 1 commit

Cudnn batch norm op (#5067) · 56b723c4

Committed by Qiao Longfei on Oct 25, 2017

* init cudnn batch norm op

* rename batch_norm_cudnn_op.cc to batch_norm_op.cu

* correct name style

* add ExtractNCWHD, simplify code

* fix ExtractNCWHD

* use CUDNN_ENFORCE instead of PADDLE_ENFORCE

56b723c4

25 Oct, 2017 1 commit
- D
  
  checkin nccl operator · 0990c87b
  Committed by Dong Zhihong on Oct 24, 2017
  
  0990c87b
24 Oct, 2017 2 commits
- Y
  
  Use external project for NCCL (#5028) · 94e741d6
  Committed by Yu Yang on Oct 23, 2017
  
  94e741d6
- Y
  Feature/nccl dso (#5001) · 43c6ff21
  Committed by Yu Yang on Oct 23, 2017
```
* "add nccl enforce"

* Dev

* Update comment

* Add nccl test

* Follow comments
```
  43c6ff21
18 Oct, 2017 1 commit

MatMul operator (#4856) · 16489827

Committed by Markus Kliegl on Oct 17, 2017

* initial matmul operator

Similar to np.matmul, but also has transpose_X and transpose_Y flags,
and only supports tensors from rank 1 to 3 inclusive.

For GPU, uses cublas?gemmStridedBatched. For CPU, uses
cblas_?gemm_batch if available via MKL; otherwise a simple serial
implementation that loops over the batch dimension is employed for now.

16489827
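
The serial fallback mentioned in the commit message (a loop over the batch dimension) can be illustrated roughly as follows. This is a minimal sketch assuming dense row-major `float` buffers; `BatchedMatMul` and its parameter names are hypothetical and do not reproduce the operator's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes out[b] = op(x[b]) * op(y[b]) for every batch b, where op() is an
// optional transpose. After the optional transposes, op(x[b]) is m x k and
// op(y[b]) is k x n. All matrices are stored row-major.
std::vector<float> BatchedMatMul(const std::vector<float>& x,
                                 const std::vector<float>& y,
                                 std::size_t batch, std::size_t m,
                                 std::size_t k, std::size_t n,
                                 bool transpose_x, bool transpose_y) {
  assert(x.size() == batch * m * k);
  assert(y.size() == batch * k * n);
  std::vector<float> out(batch * m * n, 0.0f);
  for (std::size_t b = 0; b < batch; ++b) {  // Serial loop over the batch.
    const float* xb = x.data() + b * m * k;
    const float* yb = y.data() + b * k * n;
    float* ob = out.data() + b * m * n;
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        float acc = 0.0f;
        for (std::size_t p = 0; p < k; ++p) {
          // x is stored as k x m when transposed, m x k otherwise.
          const float a = transpose_x ? xb[p * m + i] : xb[i * k + p];
          // y is stored as n x k when transposed, k x n otherwise.
          const float c = transpose_y ? yb[j * k + p] : yb[p * n + j];
          acc += a * c;
        }
        ob[i * n + j] = acc;
      }
    }
  }
  return out;
}
```

A rank-2 input corresponds to `batch == 1`. The batched BLAS routines named in the commit message would replace the outer loop with a single library call.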

16 Oct, 2017 1 commit
- D
  
  "fix enforce error" · d8aebaf5
  Committed by Dong Zhihong on Oct 15, 2017
  
  d8aebaf5
15 Oct, 2017 1 commit
- D
  
  "add enforce check" · 54d3dbd8
  Committed by Dong Zhihong on Oct 14, 2017
  
  54d3dbd8
14 Oct, 2017 1 commit
- D
  
  "nccl add interface" · d1443104
  Committed by Dong Zhihong on Oct 13, 2017
  
  d1443104
12 Oct, 2017 1 commit

武

Cudnn conv op (#4195) · a3ccbdb3

Committed by 武毅 on Oct 12, 2017

* add cudnn_conv_op

* WIP

* update

* update

* fix grad check

* use platform::memory

* add group support for cudnn

* update

* follow comments

* fix onlycpu build

* update cuda define

* follow comments

* follow comments

* merge with updates

* fix compile error

* follow comments

* follow comments

a3ccbdb3

10 Oct, 2017 2 commits
- L
  
  remove unused PADDLE_ONLY_CPU comment · 871a3f6e
  Committed by Luo Tao on Oct 10, 2017
  
  871a3f6e
- Y
  
  clean up for review · e5155713
  Committed by Yang Yang on Oct 09, 2017
  
  e5155713
07 Oct, 2017 1 commit
- Q
  
  fix executor gpu unittest · 1f5192a2
  Committed by qijun on Oct 06, 2017
  
  1f5192a2
05 Oct, 2017 4 commits
- Y
  
  Rename platform::GetDeviceCount into platform::GetCUDADeviceCount · 2b204f04
  Committed by Yi Wang on Oct 04, 2017
  
  2b204f04
- Q
  
  fix gpu build error · fe10e86d
  Committed by qijun on Oct 04, 2017
  
  fe10e86d
- Y
  
  Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU · 4558807c
  Committed by Yi Wang on Oct 04, 2017
  
  4558807c
- Y
  Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` · 84500f94
  Committed by Yu Yang on Oct 04, 2017
By shell command:

```bash
sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
```
  84500f94
04 Oct, 2017 2 commits
- Q
  
  remove device context manager · 39505151
  Committed by qijun on Oct 03, 2017
  
  39505151
- Q
  
  refine codes · 6c4d1f55
  Committed by qijun on Oct 03, 2017
  
  6c4d1f55
