Commits · 5ad1aef051349a73b00b8d611f0ae2508f02490b · Crayon鑫 / Paddle

14 Jan, 2018 · 1 commit

"cudnn operators change to cudnn kernel" (#6660) · 5ad1aef0

Committed by dzhwinter on Jan 14, 2018

* "unified operators"

* "add CUDNN register"

* "add use cudnn attribute"

* "add attribute"

* "test conv transpose op"

* "remove duplicated attr"

* "fix op test"

* "add attribute to set cudnn"

* "add more log"

* "need layout op register support"

* "add more log"

* "change GetExpectedKernelType"

* "fix Get attr in conv_op"

* "fix CI"

* "fix tests"

* "removed kernel priority fallback"

* "fix CI"

* "fix stack pointer bug"

* "refine buggy interface"

* "add const cast to save life"

* "fix get_output_with_grad"

* "fix op test with dataformat"

* "fix pooling"

* "fix pooling test"

* "fix CI"

* "fix with_gpu error"

* "add transform needed functional check"

* "fix unpack list error"

* "comment out parallel.do temporarily"

* "fix CI"

* "fix compile doc error"

* "make threshold larger"

10 Jan, 2018 · 3 commits
- "fix CI" · a6edc038
  Committed by dzhwinter on Jan 09, 2018
- "add flags" · f0316bdb
  Committed by dzhwinter on Jan 09, 2018
- Make init device on all gpu by default (#7345) · 5f985000
  Committed by dzhwinter on Jan 10, 2018
```
* "init use all default devices"

* "fix init test"
```
09 Jan, 2018 · 1 commit

Port WarpCTC Operator (#5107) · b5fda272

Committed by Yiqun Liu on Jan 09, 2018

* Add Seq2BatchFunctor, which will be used in WarpCTCOp.

* Implement WarpCTCFunctor and WarpCTCKernel.

* Add unittest of warpctc_op.

* Modify the check_output interface in the Python unittest framework to allow checking a subset of outputs.

* Use absolute offset lod in warpctc_op and related functors.

* Refine the comments of warpctc_op.

* The new Python unittest framework supports checking a subset of the outputs, so revert the previous change.

* Rename the functor that transforms a LoDTensor into a Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.

* Update to the newest code.

* Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
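
The padding step described in this commit (packed LoDTensor rows with absolute offsets turned into a Tensor with shape [max_sequence_length, num_sequences, sequence_width]) can be illustrated with a small Python sketch. This is not Paddle's PaddingLoDTensorFunctor: the function name `pad_lod_sequences`, the `pad_value` parameter, and the plain-list representation are hypothetical and chosen only for illustration.

```python
def pad_lod_sequences(rows, lod, pad_value=0.0):
    """Pad packed variable-length sequences into a time-major layout.

    rows: packed rows, each a list of `sequence_width` values
          (hypothetical stand-in for the LoDTensor data).
    lod:  absolute offsets, e.g. [0, 2, 5] means sequence 0 is rows[0:2]
          and sequence 1 is rows[2:5].
    Returns a nested list shaped
    [max_sequence_length][num_sequences][sequence_width].
    """
    num_sequences = len(lod) - 1
    lengths = [lod[i + 1] - lod[i] for i in range(num_sequences)]
    max_len = max(lengths) if lengths else 0
    width = len(rows[0]) if rows else 0

    # Start from a fully padded buffer, then copy each real time step in.
    padded = [
        [[pad_value] * width for _ in range(num_sequences)]
        for _ in range(max_len)
    ]
    for seq in range(num_sequences):
        for step in range(lengths[seq]):
            padded[step][seq] = list(rows[lod[seq] + step])
    return padded


# Two sequences of lengths 2 and 3, each row of width 2.
packed = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]
result = pad_lod_sequences(packed, [0, 2, 5])
```

Here `result[t][s]` holds step `t` of sequence `s`; steps past the end of a shorter sequence keep the padding value.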

08 Jan, 2018 · 5 commits

fix compile error in profiler.cc · 01ee42b1
Committed by Luo Tao on Jan 08, 2018

Refine get_places · 63ff0b4b
Committed by Yang Yu on Jan 08, 2018

cpu gpu transform function (#7191) · 0f353ab4

Committed by Qiao Longfei on Jan 08, 2018

* add rename guard

* add device_data_transform

* add device_data_transform_test

* modify GetExpectedKernelType

* update operator.run

* support test test_label_semantic_roles

* optimize code

* optimize code

* rename GetActualKernelType to GetExpectedKernelType

* fix chunk_eval_op and device_data_transform_test

* add is_same_place to place

* optimize code, refine rename_guard

* refine rename guard, add GetKernelTypeForVar

* optimize code

* add some log

* rename guard

* use sub scope to create var

* fix compile

* add IsInitialized for Tensor

* add VarIsTensor

* fix op_registry_test

* test

* tmp disable priority

* restore switch_kernel.md

* code clean

Remove unused included header gflags · ea0280b4
Committed by Yibing Liu on Jan 08, 2018

Remove the redundant switch case statement · d09503b2
Committed by Yibing Liu on Jan 08, 2018

06 Jan, 2018 · 1 commit
- Fix profiler place bug · 7a4f3be9
  Committed by Yibing Liu on Jan 06, 2018
05 Jan, 2018 · 7 commits
- Fix bad_alloc bug & refine code in profiler · df3b250c
  Committed by Yibing Liu on Jan 05, 2018
- Make time calc funcs return ms instead of us · 5a0a4617
  Committed by Yibing Liu on Jan 05, 2018
- fix typos · d7e56847
  Committed by Yibing Liu on Jan 05, 2018
- Enable sorting the profiling result by different keys · 0aa03a82
  Committed by Yibing Liu on Jan 05, 2018
- follow comments, use unique_ptr and remove unused file · 9c7cea81
  Committed by tensor-tang on Jan 05, 2018
- Format profiling report · 2d94eca8
  Committed by Yibing Liu on Jan 05, 2018
- Confirm the contents in profiling report · 0f441075
  Committed by Yibing Liu on Jan 05, 2018
04 Jan, 2018 · 1 commit
- "remove cudnn devicecontext" (#7207) · a4024a5f
  Committed by dzhwinter on Jan 04, 2018
```
* "remove cudnndevicecontext"

* "remove unused init code"

* "fix hash functions"
```
03 Jan, 2018 · 4 commits
- fix typo · b0ba2b06
  Committed by tensor-tang on Jan 03, 2018
- fix mkldnn deps · 31fda46c
  Committed by tensor-tang on Jan 03, 2018
- add mkldnn_helper · 03091ccb
  Committed by tensor-tang on Jan 03, 2018
- add MKLDNNDeviceContext · 72652845
  Committed by tensor-tang on Jan 03, 2018
29 Dec, 2017 · 1 commit
- Refine code struct. · 0a5fbb06
  Committed by dangqingqing on Dec 29, 2017
28 Dec, 2017 · 2 commits
- fix some warning · 717e1252
  Committed by Luo Tao on Dec 28, 2017
- Fix compile · 003917d8
  Committed by Yang Yu on Dec 28, 2017
27 Dec, 2017 · 5 commits
- Rename API of DeviceContext (#7055) · 15e8c80e
  Committed by Yu Yang on Dec 27, 2017
```
* Rename API of DeviceContext

Make them as usual names.

* Rename API of DeviceContext

Make them as usual names.

* Fix compile

* Fix compile

* Fix compile

* Fix compile

* Fix compile
```
- Add API for HasNAN HasInf · 15309fde
  Committed by Yang Yu on Dec 27, 2017
- Rename API of DeviceContext · 8b877dd7
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- Rename API of DeviceContext · a5e1cf5a
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- Rename API of DeviceContext · fd2bf550
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
26 Dec, 2017 · 2 commits
- Add the parsing part for the profiling tool · 55b17c11
  Committed by Yibing Liu on Dec 26, 2017
- unify the indentation of license · 761b3297
  Committed by Luo Tao on Dec 26, 2017
25 Dec, 2017 · 3 commits
- remove unused place (#6972) · efd37269
  Committed by QI JUN on Dec 25, 2017
```
* remove unused place

* fix ci
```
- Optimize adam_op · 1fdf8853
  Committed by Yang Yu on Dec 25, 2017
- GPUPlace to CUDAPlace (#6960) · 0d2235aa
  Committed by dzhwinter on Dec 25, 2017
24 Dec, 2017 · 3 commits

"remove hash combine" · a521ace6
Committed by dzhwinter on Dec 24, 2017

refine OpKernelType (#6879) · 37e96264
Committed by QI JUN on Dec 24, 2017
```
* refine OpKernelKey

* refine codes

* fix code style

* follow comments
```

Feature/operator run place (#6783) · 735eba29

Committed by dzhwinter on Dec 24, 2017

* "change operator interface"

* "move devicepool to device_context"

* "fix operator test"

* "fix op_registry Run interface"

* "net op passed. Need to fix nccl multi-Context"

* "add nccl group function"

* "add nccl group function"

* "fix gpu count exceed 32 error"

* "fix recurrent op, nccl op"

* "change the other operators interface with Place"

* "fix typo"

* "fix pybind"

* "fix device in python side"

* "fix pybind failed"

* "add init for test"

* "fix CI"

22 Dec, 2017 · 1 commit

"remove GPU Sync Interface" (#6793) · abde3130

Committed by dzhwinter on Dec 22, 2017

* "remove GPU Sync Interface"

* "fix typo"

* "fix type cast error"

* "fix related Copy with stream"

* "fix failed tests with DevicePool"

* "fix stupid removed position error"
