Commits · 81f22bb2f67a065aaabd771ec521bbf97fa0775e · PaddlePaddle / PaddleDetection

07 Jun, 2018 2 commits

split reduce op into multiple libraries, accelerate the compiling (#11029) · d48172f2

Committed by dzhwinter on Jun 07, 2018

* "split into multiple .ccl"

* "refine file structure"

* "refine files"

* "remove the cmakelist"

* "fix typo"

* "fix typo"

* fix ci

d48172f2

Mkldnn layout (#11040) · 3ff9ba0e

Committed by mozga-intel on Jun 07, 2018

* Add MKLDNN layout support in Paddle

Add an MKLDNN layout to Paddle so that an MKLDNN-friendly memory layout
can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW
was hardcoded in all MKLDNN op kernels. As a result, a
non-optimized execution path was selected in the MKLDNN primitive, which
hurt performance.
Besides the framework change, three MKLDNN OP kernels were updated
to use the new MKLDNN layout: conv, pool2d, and batch_norm.
Other MKLDNN OP kernels also need to be updated in a similar way to
achieve the best performance.

* Add MKLDNN layout support in activation OP

* Don't populate layout from input to output when kMKLDNN in

* Refine pool mkldnn op kernel

* MKLDNN layout

* Remove the inheritance from tensor file

* MKLDNN layout: refactoring

* Remove additional #define to register new operator

* Prepare mkldnn tests to work with layout

3ff9ba0e
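
The Mkldnn layout commit above says that hardcoding NCHW kept MKLDNN primitives on a non-optimized path. The sketch below is only an illustration of why a layout choice matters for memory access: it compares flat offsets in plain NCHW order with a channel-blocked order. The block size of 8 and both function names are assumptions made for this example; the commit message does not name a specific blocked format.

```python
# Illustrative only: flat memory offsets for a 4-D activation tensor.
# The block size of 8 is an assumption; the commit names no specific format.


def offset_nchw(n, c, h, w, C, H, W):
    """Offset of element (n, c, h, w) in plain NCHW order."""
    return ((n * C + c) * H + h) * W + w


def offset_blocked(n, c, h, w, C, H, W, block=8):
    """Offset in a channel-blocked order: outer channel block, H, W, inner channel."""
    if C % block != 0:
        raise ValueError("this sketch assumes C is a multiple of the block size")
    outer, inner = divmod(c, block)
    return (((n * (C // block) + outer) * H + h) * W + w) * block + inner


C, H, W = 16, 2, 2
# Distance in memory between channel 0 and channel 1 of the same pixel.
stride_nchw = offset_nchw(0, 1, 0, 0, C, H, W) - offset_nchw(0, 0, 0, 0, C, H, W)
stride_blocked = offset_blocked(0, 1, 0, 0, C, H, W) - offset_blocked(0, 0, 0, 0, C, H, W)
print(stride_nchw, stride_blocked)
```

In this toy case neighbouring channels of one pixel sit `H * W` elements apart in NCHW but next to each other in the blocked order, which is the kind of property a vectorized kernel can exploit.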

18 Apr, 2018 1 commit
- remove REGISTER_OP and REGISTER_OP_EX · 68d96385
  Committed by Yang Yang on Apr 17, 2018
  
  68d96385
17 Apr, 2018 1 commit
- first commit · dafe06af
  Committed by Yang Yang on Apr 13, 2018
  
  dafe06af
12 Feb, 2018 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
  
  24509f4a
10 Feb, 2018 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
  
  fc374821
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
  
  90648f33
09 Feb, 2018 1 commit
- cumsum operator (#8288) · 725e6448
  Committed by emailweixu on Feb 09, 2018
  
  725e6448
17 Jan, 2018 1 commit
- change DEVICE_TYPE in op_registry to LIBRARY_TYPE (#7588) · 6f71f89d
  Committed by Qiao Longfei on Jan 17, 2018
  
  6f71f89d
03 Jan, 2018 1 commit
- add more comments in CMakelists.txt of operator · 2d2b6332
  Committed by Luo Tao on Jan 03, 2018
  
  2d2b6332
27 Dec, 2017 1 commit

"refine kernel registrar" (#6998) · 35c1683e

Committed by dzhwinter on Dec 27, 2017

* "refine kernel registrar"

* "refine registrar with multikey"

* "fix register"

* "refine multikernel register"

* "fix CI"

* "fix CI"

* "fix registry"

* "switch GPU to CUDA"

* "add register macro test case"

* "fix CI"

35c1683e

25 Dec, 2017 1 commit
- GPUPlace to CUDAPlace (#6960) · 0d2235aa
  Committed by dzhwinter on Dec 25, 2017
  
  0d2235aa
24 Dec, 2017 1 commit
- rm unused RegisterOp method in OpRegistry · 6b99402d
  Committed by qiaolongfei on Dec 24, 2017
  
  6b99402d
22 Dec, 2017 1 commit

Enforce drop_empty_grad=false when the input of an op is duplicable. · 0bfa1f7c

Committed by xuwei06 on Dec 01, 2017

For an input argument that holds a list of variables, drop_empty_grad is not allowed because it makes the correspondence between a variable and its gradient ambiguous. Use REGISTER_OP_EX to register the op, or call InputGrad(?,false) in GradOpDescMaker.

0bfa1f7c
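
The ambiguity described in the commit above can be seen in a small sketch. Everything here is hypothetical and written only for illustration: `input_grad`, the `@GRAD` suffix, and the `@EMPTY@` placeholder are stand-ins, not a copy of the framework's `InputGrad` implementation.

```python
# Hypothetical sketch of why dropping empty gradients from a list is ambiguous.
EMPTY = "@EMPTY@"  # placeholder name for "this variable has no gradient"


def input_grad(names, no_grad, drop_empty_grad):
    """Build gradient names for a list-valued (duplicable) input."""
    grads = [EMPTY if name in no_grad else name + "@GRAD" for name in names]
    if drop_empty_grad:
        grads = [g for g in grads if g != EMPTY]
    return grads


inputs = ["x0", "x1", "x2"]
kept = input_grad(inputs, no_grad={"x1"}, drop_empty_grad=False)
dropped = input_grad(inputs, no_grad={"x1"}, drop_empty_grad=True)

# Pairing by position only works when the placeholder is kept.
paired_kept = list(zip(inputs, kept))
paired_dropped = list(zip(inputs, dropped))
print(paired_kept)
print(paired_dropped)
```

With the placeholder kept, each variable lines up with its own slot. Once the empty entry is dropped, `x1` is paired with the gradient of `x2`, and nothing in the shortened list says which input was skipped.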

21 Dec, 2017 1 commit
- Rename XXDescBind --> XXDesc (#6797) · 09189732
  Committed by Yu Yang on Dec 21, 2017
```
* Rename XXDescBind --> XXDesc

* Fix Compile
```
  09189732
20 Dec, 2017 1 commit
- Move framework.proto to proto namespace (#6718) · e445b3ff
  Committed by Yu Yang on Dec 20, 2017
```
* Move framework.proto to proto namespace

* Fix compile

* Fix compile

* Fix Compile
```
  e445b3ff
12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95

08 Nov, 2017 1 commit

Polish OpWithKernel · bbdac7f7

Committed by Yu Yang on Nov 07, 2017

* Change `IndicateDataType` to `GetKernelType` to make it easier to
  understand.
* Change `OpKernelKey` to `OpKernelType`
* Let operator developers customize which kernel the operator will
  use at runtime.

bbdac7f7
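
The last point of the commit above, letting an operator decide which kernel runs, can be sketched as a lookup keyed by a kernel type that the operator itself computes. This is a rough Python analogy, not the C++ API: the class names, the `get_kernel_type` method, and the `(data_type, place)` key fields are all assumptions chosen for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for the key that the commit calls OpKernelType.
OpKernelType = namedtuple("OpKernelType", ["data_type", "place"])


class OperatorWithKernel:
    kernels = {}  # maps OpKernelType -> callable taking the input list

    def get_kernel_type(self, inputs, place):
        # Default policy: key on the first input's type and the requested place.
        return OpKernelType(type(inputs[0]).__name__, place)

    def run(self, inputs, place):
        key = self.get_kernel_type(inputs, place)
        if key not in self.kernels:
            raise KeyError("no kernel registered for {}".format(key))
        return self.kernels[key](inputs)


class SumOp(OperatorWithKernel):
    kernels = {
        OpKernelType("float", "CPU"): sum,
        OpKernelType("int", "CPU"): sum,
    }


class CpuOnlySumOp(SumOp):
    def get_kernel_type(self, inputs, place):
        # Operator-specific override: always select the CPU kernel.
        return super().get_kernel_type(inputs, place)._replace(place="CPU")


print(SumOp().run([1, 2, 3], "CPU"))
print(CpuOnlySumOp().run([1.0, 2.0], "CUDA"))
```

The design point is that the dispatch logic lives in one overridable method, so an operator without a kernel for the requested place can redirect the lookup instead of failing.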

01 Nov, 2017 1 commit

Feature/executor use program bind (#5196) · 1363ddb6

Committed by Yu Yang on Oct 31, 2017

* Init commit

* Make executor use ProgramDescBind

* Change Attribute from BlockDesc to BlockDescBind

* Since we will get the program desc in RNN, just BlockDesc is not
  enough.

1363ddb6

29 Oct, 2017 2 commits
- Cast Operator (#5149) · b84e8226
  Committed by Yu Yang on Oct 28, 2017
```
* Cast Operator

Cast input variable to other data type

* Fix compile error

* Add cast op

* Follow comments
```
  b84e8226
- Extract InferShape to many cc files (#5174) · 8f6c0a0f
  Committed by Yu Yang on Oct 28, 2017
```
* Shrink Operator.h

* Fix CI compile
```
  8f6c0a0f
24 Oct, 2017 1 commit
- "add register gpu macro" · 423d7438
  Committed by Dong Zhihong on Oct 23, 2017
  
  423d7438
19 Oct, 2017 2 commits

Add glog as dependencies of ops (#4908) · e9249d16

Committed by Yu Yang on Oct 18, 2017

* Add glog as dependencies of ops

* Using VLOG to log some information is helpful when debugging Paddle

* Fix Unittests

e9249d16

Change ProgramDesc not a global variable (#4879) · e747623e

Committed by Yu Yang on Oct 18, 2017

* Change ProgramDesc not a global variable

* Polish code style

* Correctly implement BlockDesc destructor

* Unify program as parameter name

e747623e

18 Oct, 2017 1 commit
- Remove private data members in OpRegister (#4871) · 5d67677c
  Committed by Yu Yang on Oct 17, 2017
  
  5d67677c
17 Oct, 2017 1 commit
- remove unused C++ class OpRegistrar · eb27c735
  Committed by qijun on Oct 16, 2017
  
  eb27c735
13 Oct, 2017 1 commit

Add no_grad_vars for grad_op_maker (#4770) · a36d2416

Committed by Yu Yang on Oct 12, 2017

* Add no_grad_vars for grad_op_maker

* Add unittest

* Fix unittest

* Fix unittest

* Follow comment

a36d2416

10 Oct, 2017 1 commit
- Fix bug of forward default attribute not passed to backward · c464ec21
  Committed by Yu Yang on Oct 09, 2017
  
  c464ec21
06 Oct, 2017 2 commits
- clean code · e043b386
  Committed by qiaolongfei on Oct 05, 2017
  
  e043b386
- add python unit test · 352af966
  Committed by qiaolongfei on Oct 05, 2017
  
  352af966
05 Oct, 2017 5 commits
- Follow comments · ebbbaee0
  Committed by Yu Yang on Oct 04, 2017
  
  ebbbaee0
- Fix CI Test · c4effc7d
  Committed by Yu Yang on Oct 04, 2017
  
  c4effc7d
- tmp work · 5917e09c
  Committed by qiaolongfei on Oct 04, 2017
  
  5917e09c
- Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU · 4558807c
  Committed by Yi Wang on Oct 04, 2017
  
  4558807c
- Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` · 84500f94
  Committed by Yu Yang on Oct 04, 2017

  By shell command:

  ```bash
  sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
  sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
  ```

  84500f94
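
The two `sed` substitutions in the commit above flip the sense of each guard, because a CPU-only check becomes a "not GPU" check and vice versa. A minimal Python sketch of the same two literal replacements, applied in the same order, makes the effect visible. The `rewrite` helper and the sample source text are hypothetical and exist only for this illustration.

```python
# The two literal substitutions from the commit message, in the same order.
RULES = [
    ("ifdef PADDLE_ONLY_CPU", "ifndef PADDLE_WITH_GPU"),
    ("ifndef PADDLE_ONLY_CPU", "ifdef PADDLE_WITH_GPU"),
]


def rewrite(source):
    """Apply each rule to the whole text, like running the sed commands in turn."""
    for old, new in RULES:
        source = source.replace(old, new)
    return source


# Hypothetical input; UseCpu and UseGpu are made-up function names.
sample = (
    "#ifdef PADDLE_ONLY_CPU\n"
    "UseCpu();\n"
    "#endif\n"
    "#ifndef PADDLE_ONLY_CPU\n"
    "UseGpu();\n"
    "#endif\n"
)
result = rewrite(sample)
print(result)
```

Since `ifndef PADDLE_ONLY_CPU` does not contain the text `ifdef PADDLE_ONLY_CPU`, the first rule cannot touch lines meant for the second, and running the rewrite twice leaves the text unchanged.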
04 Oct, 2017 3 commits
- add shape_inference_map · ab9545aa
  Committed by qiaolongfei on Oct 04, 2017
  
  ab9545aa
- Update · e47770bd
  Committed by fengjiayi on Oct 03, 2017
  
  e47770bd
- Add `CreateBackwardOp` function · ff7fdb7d
  Committed by fengjiayi on Oct 03, 2017
  
  ff7fdb7d
03 Oct, 2017 2 commits
- Complete Register Gradient in compile time · 46c551b2
  Committed by Yu Yang on Oct 02, 2017
  
  46c551b2
- Make compile pass · 578a357b
  Committed by Yu Yang on Oct 02, 2017
  
  578a357b

PaddlePaddle / PaddleDetection synced successfully about 1 year ago