Commits · 0a44cd91df5db634c1467bc7fabf39855cc35c64 · csdn_franckjun / Paddle

14 Jan, 2018 · 1 commit

"cudnn operators change to cudnn kernel" (#6660) · 5ad1aef0

Committed by dzhwinter on Jan 14, 2018

* "unified operators"

* "add CUDNN register"

* "add use cudnn attribute"

* "add attribute"

* "test conv transpose op"

* "remove duplicated attr"

* "fix op test"

* "add attribute to set cudnn"

* "add more log"

* "need layout op register support"

* "add more log"

* "change GetExpectedKernelType "

* "fix Get attr in conv_op"

* "fix CI"

* "fix tests"

* "removed kernel priority fallback"

* "fix CI"

* "fix stack pointer bug"

* "refine buggy interface"

* "add const cast to save life"

* "fix get_output_with_grad"

* "fix op test with dataformat"

* "fix pooling"

* "fix pooling test"

* "fix CI"

* "fix with_gpu error"

* "add transform needed functional check"

* "fix unpack list error"

* "comment out parallel.do temporarily"

* "fix CI"

* "fix compile doc error"

* "make threshold larger"


09 Jan, 2018 · 2 commits

Port WarpCTC Operator (#5107) · b5fda272

Committed by Yiqun Liu on Jan 09, 2018

* Add Seq2BatchFunctor, which will be used in WarpCTCOp.

* Implement WarpCTCFunctor and WarpCTCKernel.

* Add unittest of warpctc_op.

* Modify the check_output interface in the Python unittest framework to allow checking a subset of outputs.

* Use absolute offset lod in warpctc_op and related functors.

* Refine the comments of warpctc_op.

* The new python unittest supports checking a subset of the outputs, so revoke the previous change.

* Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.

* Update to the newest codes.

* Rename PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
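
The padding transform named in the bullets above (variable-length sequences to a tensor of shape [max_sequence_length, num_sequences, sequence_width], driven by absolute LoD offsets) can be illustrated with a small pure-Python sketch. This is not Paddle's `PaddingLoDTensorFunctor`; the function name `pad_lod_rows`, the nested-list representation, and the pad value of 0.0 are assumptions made for illustration only.

```python
def pad_lod_rows(rows, lod, pad_value=0.0):
    """Pad concatenated sequences into [max_len, num_seqs, width].

    rows: list of rows (each a list of floats), all sequences concatenated.
    lod:  absolute offsets, e.g. [0, 2, 5] means rows[0:2] and rows[2:5].
    (Hypothetical helper; not part of Paddle.)
    """
    if len(lod) < 2 or lod[0] != 0 or lod[-1] != len(rows):
        raise ValueError("lod must start at 0 and end at len(rows)")
    num_seqs = len(lod) - 1
    width = len(rows[0]) if rows else 0
    max_len = max(lod[i + 1] - lod[i] for i in range(num_seqs))
    # Time-major layout: padded[t][s] is step t of sequence s.
    padded = [[[pad_value] * width for _ in range(num_seqs)]
              for _ in range(max_len)]
    for s in range(num_seqs):
        start, end = lod[s], lod[s + 1]
        for t in range(end - start):
            padded[t][s] = list(rows[start + t])
    return padded


# Two sequences of lengths 2 and 3, each row of width 1.
rows = [[1.0], [2.0], [3.0], [4.0], [5.0]]
padded = pad_lod_rows(rows, [0, 2, 5])
```

Here the shorter sequence is filled with the pad value at its missing time steps, so every time step holds one row per sequence.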


Rename CopyFrom to Copy for tensors (#7292) · ce6dad3b

Committed by Yu Yang on Jan 09, 2018

* Rename Tensor::CopyFrom to Tensor::Copy

* Fix CI

* Fix compile

02 Jan, 2018 · 2 commits

Feature/transform (#7111) · 899a79cc

Committed by dzhwinter on Jan 02, 2018

* "fix data transform"

* "data transformer"

* "add device pool"

* "add test"

* "fix ci"

* "fix datalayout implementation "

* "fix based on comment"

fix compile (#7125) · 105ee86d

Committed by QI JUN on Jan 02, 2018

29 Dec, 2017 · 3 commits

- move cos_sim_functor to math · 24cf2fcd
  Committed by chengduoZH on Dec 29, 2017
- scatter optimizers · 1039c1e3
  Committed by typhoonzero on Dec 29, 2017
- wip · 641b4c0f
  Committed by typhoonzero on Dec 29, 2017

28 Dec, 2017 · 5 commits

- Delete the old activation type for LSTM and GRU operator · 23b53c48
  Committed by guosheng on Dec 28, 2017
- Refine the activation type in the GRU operator related · f74dff97
  Committed by guosheng on Dec 28, 2017
- modify fun name · 95aec835
  Committed by sweetsky0901 on Dec 28, 2017
- Implement selectedrows serialize and deserialize (#7042) · 2cdef424
  Committed by Yancey on Dec 28, 2017

  * implement selectedrows serialize and deserialize
  * make serialize/deserialize as global function
  * recover send_imp.cc
  * delete unused brackets
  * fix compile error
  * serialize version in LoDTensor and SelectedRows
  * fix ci
  * fix ci

- for xxYY to xx_yy · 1a685144
  Committed by sweetsky0901 on Dec 28, 2017

27 Dec, 2017 · 3 commits

- wip · 74b12288
  Committed by typhoonzero on Dec 27, 2017
- WIP: adding generic scatter functors · d48a0e4e
  Committed by typhoonzero on Dec 27, 2017
- Update the CUDA kernel. · 19367389
  Committed by qingqing01 on Dec 26, 2017

26 Dec, 2017 · 3 commits

- Resume CPU implementation. · 41372ded
  Committed by qingqing01 on Dec 26, 2017
- Optimize the rowwise add function. · 32d881be
  Committed by qingqing01 on Dec 26, 2017
- unify the indentation of license · 761b3297
  Committed by Luo Tao on Dec 26, 2017

25 Dec, 2017 · 4 commits

- Fix the clang format. · a8e18549
  Committed by dangqingqing on Dec 25, 2017
- Refine how the activation type is obtained in the LSTM operator to speed it up. · d760b6a5
  Committed by qingqing01 on Dec 25, 2017
- remove unused place (#6972) · efd37269
  Committed by QI JUN on Dec 25, 2017

  * remove unused place
  * fix ci

- GPUPlace to CUDAPlace (#6960) · 0d2235aa
  Committed by dzhwinter on Dec 25, 2017

24 Dec, 2017 · 1 commit

- fix math_function warning · 682eee40
  Committed by qiaolongfei on Dec 24, 2017

21 Dec, 2017 · 1 commit

Speed up ColwiseSum in CPU (#6834) · 7e214b49

Committed by Yu Yang on Dec 21, 2017

* Remove unnecessary reshape in ColwiseSum

Speed up 12s -> 10s.

* Hand write ColwiseAdd in CPU
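
For readers unfamiliar with the operation, a column-wise sum reduces a [rows, cols] matrix to a vector of length cols. The sketch below shows the hand-written loop form in pure Python. `colwise_sum` is a hypothetical name, not the Paddle functor, and the sketch makes no claim about the timings quoted in the commit message.

```python
def colwise_sum(matrix):
    """Return the per-column sums of a rectangular list-of-lists matrix.

    Hypothetical illustration; not Paddle's ColwiseSum implementation.
    """
    if not matrix:
        return []
    cols = len(matrix[0])
    out = [0.0] * cols
    for row in matrix:
        if len(row) != cols:
            raise ValueError("matrix must be rectangular")
        # Accumulate one row at a time, visiting elements in row-major order.
        for j, value in enumerate(row):
            out[j] += value
    return out


sums = colwise_sum([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

Accumulating directly into the output vector avoids building any intermediate reshaped copy of the input.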


20 Dec, 2017 · 1 commit

- revert im2col · cb3a74e4
  Committed by chengduoZH on Dec 20, 2017

19 Dec, 2017 · 3 commits

- refine im2col · 7b0744ed
  Committed by chengduoZH on Dec 19, 2017
- refine · f1ab13bd
  Committed by chengduoZH on Dec 19, 2017
- refine im2col · 293b292e
  Committed by chengduoZH on Dec 19, 2017

18 Dec, 2017 · 1 commit

- add more place test and rename Cudnn to CUDNN (#6621) · 93a2d9c5
  Committed by QI JUN on Dec 18, 2017

  * add more place_test and rename Cudnn to CUDNN
  * fix ci

15 Dec, 2017 · 1 commit

- fix undefined issue when with_gpu · f2712105
  Committed by tensor-tang on Dec 15, 2017

14 Dec, 2017 · 1 commit

"derived cudnnDevice context" (#6585) · 0e9b393b

Committed by dzhwinter on Dec 14, 2017

* "derived cudnnDevice context"

* "leave remove cudnn handle from CUDADeviceContext"

* "fix math function error"


12 Dec, 2017 · 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`


11 Dec, 2017 · 1 commit

- add detection_output op · 65b641bf
  Committed by sweetsky0901 on Dec 11, 2017

12 Dec, 2017 · 1 commit

- unify MKL macro definition · 69b44f2f
  Committed by tensor-tang on Dec 12, 2017

09 Dec, 2017 · 1 commit

- test detection_output cpu and gpu ok, but doc will be modified · fe177b62
  Committed by sweetsky0901 on Dec 09, 2017

08 Dec, 2017 · 1 commit

- add detection_output code only · ca535d18
  Committed by sweetsky0901 on Dec 08, 2017

03 Dec, 2017 · 1 commit

- Make lstm_op follow google code style. · e5b51c4d
  Committed by qingqing01 on Dec 03, 2017

29 Nov, 2017 · 2 commits

- format .. · 4ffb73fd
  Committed by sweetsky0901 on Nov 29, 2017
- format code · 3206094b
  Committed by sweetsky0901 on Nov 29, 2017

csdn_franckjun / Paddle · in sync with the fork source project
