Commits · dfbd1cc3c180480fd6d387630945a2df6ed8f000 · wmsofts / Paddle

11 Sep, 2018 · 1 commit
- Fuse MKLDNN's Conv + ReLU · 5d34ef61
  Committed by Michal Gallus on Sep 04, 2018
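
As a rough illustration of what fusing Conv + ReLU means at the graph level, here is a toy sketch in Python. It does not use Paddle's pass infrastructure: the op dictionaries, the function name `fuse_conv_relu`, and the `fuse_relu` attribute name are all assumptions made for this example.

```python
def fuse_conv_relu(ops):
    """Fold a relu that directly follows a conv2d into that conv2d.

    Each op is a hypothetical dict with "type", "input", "output" and "attrs".
    A pair is fused only when the relu is the sole consumer of the conv output.
    """
    fused = []
    i = 0
    while i < len(ops):
        op = ops[i]
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        consumers = sum(1 for other in ops if other["input"] == op["output"])
        if (op["type"] == "conv2d" and nxt is not None
                and nxt["type"] == "relu"
                and nxt["input"] == op["output"]
                and consumers == 1):
            merged = dict(op)
            merged["output"] = nxt["output"]          # conv now writes relu's output
            merged["attrs"] = dict(op["attrs"], fuse_relu=True)
            fused.append(merged)
            i += 2                                    # skip the absorbed relu
        else:
            fused.append(op)
            i += 1
    return fused
```

The point of such a fusion is that the activation is applied inside the convolution kernel, so the intermediate tensor between the two ops never has to be written out and read back.
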
10 Sep, 2018 · 1 commit
- Reusing converted weights · 1658958f
  Committed by Krzysztof Binias on Sep 10, 2018
21 Aug, 2018 · 1 commit

Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669) · cd32ddac

Committed by Michał Gallus on Aug 21, 2018

* Fuse Convolution and Eltwise Add into Conv+Bias

* Reduce bias branching at conv_mkldnn_op

* Add MKLDNN build checks for Conv Bias

* Conv-bias: check if bias input exists before assignment

* Conv-bias: Remove Bias dim check from infershape

It was causing the conv3d test to crash upon calling HasInput(Bias)
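
The reason a convolution followed by a per-channel elementwise add can be folded into a single Conv+Bias is that both compute the same values. The sketch below checks this on a tiny 1-D case in plain Python; `conv1d` and `add_per_channel` are hypothetical helpers written for this example, not Paddle or MKLDNN functions.

```python
def conv1d(x, w, bias=None):
    """Valid 1-D cross-correlation.

    x: [in_channels][length], w: [out_channels][in_channels][kernel].
    If bias is given, bias[oc] is the starting value of every output in channel oc.
    """
    length, k = len(x[0]), len(w[0][0])
    out = []
    for oc, filt in enumerate(w):
        row = []
        for pos in range(length - k + 1):
            acc = bias[oc] if bias is not None else 0
            for ic, taps in enumerate(filt):
                for j, tap in enumerate(taps):
                    acc += tap * x[ic][pos + j]
            row.append(acc)
        out.append(row)
    return out


def add_per_channel(y, b):
    """Separate elementwise add: b[oc] is added to every value of channel oc."""
    return [[v + b[oc] for v in row] for oc, row in enumerate(y)]


x = [[1, 2, 3, 4], [0, 1, 0, 1]]
w = [[[1, 0], [2, 1]], [[-1, 1], [0, 3]]]
b = [10, -5]

unfused = add_per_channel(conv1d(x, w), b)   # conv, then a separate add op
fused = conv1d(x, w, bias=b)                 # single conv with bias
```

With integer inputs the two results are identical; with floating-point data they may differ by rounding because the additions happen in a different order.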

11 Jun, 2018 · 2 commits
- add inplace attribute to op_proto_maker (#10665) · bfa3fd6f
  Committed by dzhwinter on Jun 11, 2018
```
* "add inplace attribute"

* "register inplace attribute"

* "change se-next model for memory-reuse"

* "fix typo"

* repick

* fix merge conflict

* "fix stupid error"
```
- MKLDNN layout: Support for convolution operator · 9908d3cf
  Committed by mozga-intel on Jun 10, 2018
07 Jun, 2018 · 1 commit

Mkldnn layout (#11040) · 3ff9ba0e

Committed by mozga-intel on Jun 07, 2018

* Add MKLDNN layout support in Paddle

Add MKLDNN layout support to Paddle so that MKLDNN-enabled OP kernels can
use MKLDNN-friendly memory layouts. Before this commit, NCHW was
hard-coded in all MKLDNN OP kernels. As a result, the MKLDNN primitives
selected a non-optimized execution path, which hurt performance.
Besides the framework change, three MKLDNN OP kernels were updated
to use the new MKLDNN layout: conv, pool2d, and batch_norm.
Other MKLDNN OP kernels need to be updated in a similar way to
achieve the best performance.

* Add MKLDNN layout support in activation OP

* Don't populate layout from input to output when kMKLDNN in

* Refine pool mkldnn op kernel

* MKLDNN layout

* Remove the inheritance from tensor file

* MKLDNN layout: refactoring

* Remove additional #define to register new operator

* Prepare mkldnn tests to work with layout
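
To make the layout discussion above more concrete, the sketch below compares where one element lives in plain NCHW memory and in a channel-blocked layout of the kind MKLDNN prefers (often written nChw8c, with channels grouped in blocks of 8). The commit message does not say which blocked format is selected, so the block size of 8 and both helper names are assumptions for illustration only.

```python
def offset_nchw(n, c, h, w, shape):
    """Flat offset of element (n, c, h, w) in a dense NCHW buffer."""
    _, C, H, W = shape
    return ((n * C + c) * H + h) * W + w


def offset_blocked(n, c, h, w, shape, block=8):
    """Flat offset in a channel-blocked layout [N][C/block][H][W][block].

    Assumes the channel count is a multiple of the block size.
    """
    _, C, H, W = shape
    assert C % block == 0, "this sketch does not handle channel padding"
    outer, inner = divmod(c, block)
    return (((n * (C // block) + outer) * H + h) * W + w) * block + inner


shape = (1, 16, 2, 2)
# In NCHW, neighbouring channels at the same pixel are H*W elements apart.
stride_nchw = offset_nchw(0, 1, 0, 0, shape) - offset_nchw(0, 0, 0, 0, shape)
# In the blocked layout they are adjacent, which suits vectorised kernels.
stride_blocked = offset_blocked(0, 1, 0, 0, shape) - offset_blocked(0, 0, 0, 0, shape)
```

Both functions address the same set of elements; only the order in memory changes, which is why a tensor has to carry its layout along so that kernels know when a reorder is needed.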

08 May, 2018 · 1 commit

Clean OpProtoAndCheckerMaker · 0e78cb69

Committed by Yu Yang on May 08, 2018

Do not use ctor

* Reduce lines of code.
* We can use virtual functions for Maker now.
* The implementation does not care what the maker holds, so it is easier to
refactor later.

19 Apr, 2018 · 1 commit
- add semicolon to op registry (#10034) · e04c43d5
  Committed by Yang Yang(Tony) on Apr 18, 2018
```
* script to add semicolon

* fix typo
```
17 Apr, 2018 · 1 commit
- script to fix all · ce7c2e86
  Committed by Yang Yang on Apr 16, 2018
04 Apr, 2018 · 1 commit
- Update · 54316bdd
  Committed by Yi Wang on Apr 03, 2018
16 Mar, 2018 · 2 commits
- Add profiling event in feed, fetch and load op. · 371c53f8
  Committed by Liu Yiqun on Mar 16, 2018
- add conv2d fp16 support · e4de5dc3
  Committed by Kexin Zhao on Mar 15, 2018
07 Mar, 2018 · 1 commit

MKLDNN conv2d kernel added (#8451) · 8c71adaa

Committed by pzelazko-intel on Mar 07, 2018

* MKLDNN conv2 OP kernel added

* TODOs added

* mkldnn conv2d OP refactor

* CanCUDNNBeUsed and CanMKLDNNBeUsed moved

28 Feb, 2018 · 1 commit
- follow comments · a779b424
  Committed by chengduoZH on Feb 27, 2018
27 Feb, 2018 · 1 commit
- fix conv_op bug · b5c92092
  Committed by chengduoZH on Feb 27, 2018
16 Feb, 2018 · 2 commits
- change outputsize func name · cb06337f
  Committed by Yang Yang on Feb 16, 2018
- pass test_recognize_digits · 1d9fd1c0
  Committed by Yang Yang on Feb 16, 2018
12 Feb, 2018 · 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
10 Feb, 2018 · 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
02 Feb, 2018 · 1 commit
- rename op to depthwise_conv2d, more efficient · 2ffa3a8b
  Committed by xzl on Feb 02, 2018
23 Jan, 2018 · 1 commit
- ../../../../../paddle/api · 06db7038
  Committed by xzl on Jan 23, 2018
22 Jan, 2018 · 1 commit
- add depthwise conv forward · 3772d27d
  Committed by zlx on Jan 22, 2018
17 Jan, 2018 · 3 commits
- refine code · c9641a03
  Committed by chengduoZH on Jan 17, 2018
- follow comments and refine python doc · ed7e74ab
  Committed by chengduoZH on Jan 17, 2018
- follow comments · 24f528a1
  Committed by chengduoZH on Jan 17, 2018
15 Jan, 2018 · 2 commits
- set use_cudnn as default · 251c6032
  Committed by chengduoZH on Jan 15, 2018
- fix conv, pool, conv_trans to decide use cudnn or not · 79aa5122
  Committed by chengduoZH on Jan 15, 2018
14 Jan, 2018 · 1 commit

"cudnn operators change to cudnn kernel" (#6660) · 5ad1aef0

Committed by dzhwinter on Jan 14, 2018

* "unified operators"

* "add CUDNN register"

* "add use cudnn attribute"

* "add attribute"

* "test conv transpose op"

* "remove duplicated attr"

* "fix op test"

* "add attribute to set cudnn"

* "add more log"

* "need layout op register support"

* "add more log"

* "change GetExpectedKernelType "

* "fix Get attr in conv_op"

* "fix CI"

* "fix tests"

* "removed kernel priority fallback"

* "fix CI"

* "fix stack pointer bug"

* "refine buggy interface"

* "add const cast to save life"

* "fix get_output_with_grad"

* "fix op test with dataformat"

* "fix pooling"

* "fix pooling test"

* "fix CI"

* "fix with_gpu error"

* "add transform needed functional check"

* "fix unpack list error"

* "comment out parallel.do temporary"

* "fix CI"

* "fix compile doc error"

* "make threshold larger"
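
The bullets above mention a `use_cudnn` attribute, a change to `GetExpectedKernelType`, and the removal of a kernel priority fallback. One plausible reading is that the kernel library is chosen explicitly from the op's attribute and the place it runs on. The following Python sketch illustrates that idea only; the function name, the string values, and the `cudnn_available` flag are hypothetical and do not mirror Paddle's C++ code.

```python
def expected_kernel_library(place, attrs, cudnn_available=True):
    """Pick a kernel library from explicit inputs instead of a priority list.

    place: "cpu" or "gpu" (hypothetical string encoding of the device).
    attrs: op attributes; only "use_cudnn" is consulted here.
    """
    wants_cudnn = attrs.get("use_cudnn", False)
    if place == "gpu" and wants_cudnn and cudnn_available:
        return "cudnn"
    return "plain"
```

Under this reading, the same operator type can be registered once with several kernels, and the attribute decides which one runs, rather than a separate cuDNN-specific operator existing alongside the plain one.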

09 Jan, 2018 · 1 commit

Port WarpCTC Operator (#5107) · b5fda272

Committed by Yiqun Liu on Jan 09, 2018

* Add Seq2BatchFunctor, which will be used in WarpCTCOp.

* Implement WrapCTCFunctor and WrapCTCKernel.

* Add unittest of warpctc_op.

* Modify the check_output interface in python unittest framework to allow checking a subset of outputs.

* Use absolute offset lod in warpctc_op and related functors.

* Refine the comments of warpctc_op.

* The new python unittest supports checking a subset of the outputs, so revoke the previous change.

* Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.

* Update to the newest codes.

* Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and remove the computation of dimensions out of the functors.
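
The transform described above, from variable-length sequences stored back to back to a dense tensor of shape [max_sequence_length, num_sequences, sequence_width], can be sketched in a few lines of Python. This is an illustration of the data movement only, assuming an absolute-offset LoD such as `[0, 2, 5]` and at least one non-empty sequence; `pad_lod_sequences` is a hypothetical name, not the functor from the commit.

```python
def pad_lod_sequences(rows, lod, pad_value=0.0):
    """Pad packed sequences into a [max_len][num_seq][width] nested list.

    rows: all time steps of all sequences, concatenated: [total_steps][width].
    lod:  absolute offsets; sequence i occupies rows[lod[i]:lod[i + 1]].
    """
    lengths = [lod[i + 1] - lod[i] for i in range(len(lod) - 1)]
    max_len = max(lengths)
    width = len(rows[0])
    padded = [[[pad_value] * width for _ in lengths] for _ in range(max_len)]
    for seq, start in enumerate(lod[:-1]):
        for t in range(lengths[seq]):
            padded[t][seq] = list(rows[start + t])   # time-major placement
    return padded


rows = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]
lod = [0, 2, 5]          # two sequences, of lengths 2 and 3
padded = pad_lod_sequences(rows, lod)
```

The time-major result places step `t` of every sequence side by side, with shorter sequences filled by `pad_value` at the tail.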

04 Jan, 2018 · 1 commit
- Correctly handle image operators · 040dc59b
  Committed by Yang Yu on Jan 04, 2018
27 Dec, 2017 · 1 commit
- move ENFORCE position · a04f30e7
  Committed by fengjiayi on Dec 27, 2017
26 Dec, 2017 · 1 commit
- unify the indentation of license · 761b3297
  Committed by Luo Tao on Dec 26, 2017
20 Dec, 2017 · 1 commit
- Move framework.proto to proto namespace (#6718) · e445b3ff
  Committed by Yu Yang on Dec 20, 2017
```
* Move framework.proto to proto namespace

* Fix compile

* Fix compile

* Fix Compile
```
12 Dec, 2017 · 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

27 Nov, 2017 · 1 commit
- fix conv and conv_trans op doc · 9abc0e04
  Committed by chengduoZH on Nov 27, 2017
17 Nov, 2017 · 1 commit
- add double type kernel · c359e39b
  Committed by chengduoZH on Nov 17, 2017
15 Nov, 2017 · 1 commit
- follow comments · 356d6954
  Committed by chengduoZH on Nov 14, 2017
08 Nov, 2017 · 2 commits
- fix conv2d doc · b6f9ba48
  Committed by chengduoZH on Nov 08, 2017
- add dilation for im2col · 97e9dd72
  Committed by chengduoZH on Nov 08, 2017
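
For context on what dilation changes in a convolution, the standard output-size formula can be written as below. This is the textbook formula, assumed rather than confirmed to match the code in this commit; the function name is hypothetical.

```python
def conv_output_size(input_size, filter_size, dilation, padding, stride):
    """Output length along one spatial dimension of a dilated convolution."""
    # A dilated filter spans dilation * (filter_size - 1) + 1 input positions.
    effective_filter = dilation * (filter_size - 1) + 1
    return (input_size + 2 * padding - effective_filter) // stride + 1
```

For example, a 3-wide filter over 7 inputs yields 5 outputs with `dilation=1` but only 3 with `dilation=2`, because the dilated filter covers 5 input positions.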

wmsofts / Paddle is in sync with the upstream project it was forked from