Commits · 5ad1aef051349a73b00b8d611f0ae2508f02490b · 机器未来 / Paddle

14 Jan, 2018 1 commit

"cudnn operators change to cudnn kernel" (#6660) · 5ad1aef0

Committed by dzhwinter on Jan 14, 2018

* "unified operators"

* "add CUDNN register"

* "add use cudnn attribute"

* "add attribute"

* "test conv tranpose op"

* "remove duplicated attr"

* "fix op test"

* "add attribute to set cudnn"

* "add more log"

* "need layout op register support"

* "add more log"

* "change GetExpectedKernelType "

* "fix Get attr in conv_op"

* "fix CI"

* "fix tests"

* "removed kernel priority fallback"

* "fix CI"

* "fix stack pointer bug"

* "refine buggy interface"

* "add const cast to save life"

* "fix get_output_with_grad"

* "fix op test with dataformat"

* ""fix pooling

* "fix pooling test"

* "fix CI"

* "fix with_gpu error"

* "add transform needed functional check"

* "fix unpack list error"

* "comment out parallel.do temporary"

* "fix CI"

* "fix compile doc error"

* "make threshold larger"

12 Jan, 2018 1 commit

Add get lod for debug (#7375) · 23df6c44

Committed by Qiao Longfei on Jan 12, 2018

* add GetLoD for debug

* add LoDToString

* optimize if

* typo

* add lod_tensor to operator's dependency

10 Jan, 2018 3 commits
- reorganize data transform related code (#7391) · 377424bf
  Committed by Qiao Longfei on Jan 10, 2018
```
* init data_type_transform

* split data_layout_transform

* tmp rm data_transform_test

* change device_data_transform to data_device_transform

* clean code

* clean code
```
- "fix CI" · a6edc038
  Committed by dzhwinter on Jan 09, 2018
- "add flags" · f0316bdb
  Committed by dzhwinter on Jan 09, 2018

09 Jan, 2018 1 commit
- fix GetDims bug · 8b1a81a9
  Committed by qiaolongfei on Jan 09, 2018

08 Jan, 2018 5 commits

fix priority · 0b52cc88
Committed by qiaolongfei on Jan 08, 2018

add back priority · ca90356b
Committed by qiaolongfei on Jan 08, 2018

Feature/add shared layout (#7233) · e94db381
Committed by dzhwinter on Jan 08, 2018
```
* "reuse ShareLoD with no regret"

* "removed base class shareLayout"

* "fix CI"
```
cpu gpu transform function (#7191) · 0f353ab4

Committed by Qiao Longfei on Jan 08, 2018

* add rename guard

* add device_data_transform

* add device_data_transform_test

* modify GetExpectedKernelType

* update operator.run

* support test test_label_semantic_roles

* optimize code

* optimize code

* rename GetActualKernelType to GetExpectedKernelType

* fix chunk_eval_op and device_data_transform_test

* add is_same_place to place

* optimize code, refine rename_guard

* refine rename guard, add GetKernelTypeForVar

* optimize code

* add some log

* rename guard

* use sub scope to create var

* fix compile

* add IsInitialized for Tensor

* add VarIsTensor

* fix op_registry_test

* test

* tmp disable priority

* restore switch_kernel.md

* code clean

Show argument dimensions with operator::DebugStringEx (#7268) · 8814bec0
Committed by emailweixu on Jan 07, 2018
```
This can make it easier to locate error.
```
05 Jan, 2018 1 commit

Feature/use cudnn (#7141) · 5593858d

Committed by dzhwinter on Jan 05, 2018

* "add c++ side kernel selection"

* "add multiple kernel op test"

* "kernel selection only support cudnn"

* "better formatter"

* "small fix with UseCPU"

* "depends on change interface Get(Place, Library)"

* "fix CI"

* "fix python cudnn test"

* "leave the register cudnn op to another PR"

* "fix CI"

* "use all kernel by default"

* "fix CI"

04 Jan, 2018 1 commit
- clean up · 97dc451f
  Committed by Yang Yang on Jan 04, 2018

02 Jan, 2018 1 commit

Feature/transform (#7111) · 899a79cc

Committed by dzhwinter on Jan 02, 2018

* "fix data transform"

* "data transformer"

* "add device pool"

* "add test"

* "fix ci"

* "fix datalayout implementation "

* "fix based on comment"

29 Dec, 2017 1 commit
- add helper function to get appropriate DeviceContext (#7066) · 5036cf03
  Committed by QI JUN on Dec 29, 2017
```
* add helper function to get appropriate DeviceContext
```
27 Dec, 2017 6 commits
- Rename API of DeviceContext (#7055) · 15e8c80e
  Committed by Yu Yang on Dec 27, 2017
```
* Rename API of DeviceContext

Make them as usual names.

* Rename API of DeviceContext

Make them as usual names.

* Fix compile

* Fix compile

* Fix compile

* Fix compile

* Fix compile
```
- cache memory in local scope (#7058) · 7aed7eb5
  Committed by QI JUN on Dec 27, 2017
```
* add KernelTypeToString interface

* cache memory in local scope

* fix typo

* refine trans logic
```
- Rename API of DeviceContext · 8b877dd7
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- Rename API of DeviceContext · a5e1cf5a
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- Rename API of DeviceContext · fd2bf550
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- add memory switch mechanism in operator kernel switch (#6991) · 94096ae5
  Committed by QI JUN on Dec 27, 2017
```
* add memory switch mechanism in operator kernel switch
```
25 Dec, 2017 2 commits

Impl kernel hint (#6883) · af0c4c45

Committed by Qiao Longfei on Dec 25, 2017

* init kernel hint

* fix typo

* rm unused code

* add include in op_kernel.h

* restore op_kernel since it will be moved to op_kernel_type

* change force_cpu to use_cpu

* fix compilation

add op_kernel_type_test · 313afc9c
Committed by qiaolongfei on Dec 25, 2017

24 Dec, 2017 2 commits

refine OpKernelType (#6879) · 37e96264
Committed by QI JUN on Dec 24, 2017
```
* refine OpKernelKey

* refine codes

* fix code style

* follow comments
```
Feature/operator run place (#6783) · 735eba29

Committed by dzhwinter on Dec 24, 2017

* "change operator interface"

* "move devicepool to device_context"

* "fix operator test"

* "fix op_registry Run interface"

* "net op passed. Need to fix nccl multi-Context"

* "add nccl group function"

* "add nccl group function"

* "fix gpu count exceed 32 error"

* "fix recurrent op, nccl op"

* "change the other operators interface with Place"

* "fix typo"

* "fix pybind"

* "fix device in python side"

* "fix pybind failed"

* "add init for test"

* "fix CI"

21 Dec, 2017 1 commit
- pass forward runtime · f899150e
  Committed by Yang Yang on Dec 21, 2017

20 Dec, 2017 1 commit
- Move framework.proto to proto namespace (#6718) · e445b3ff
  Committed by Yu Yang on Dec 20, 2017
```
* Move framework.proto to proto namespace

* Fix compile

* Fix compile

* Fix Compile
```
12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are as follows:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
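
As a rough illustration of the first item in the list above (this is not code from the commit), templating a math functor and a kernel on a device context type rather than on a place type might look like the sketch below. `CPUDeviceContext`, `CUDADeviceContext`, `ScaleFunctor`, and `ScaleKernel` are simplified, hypothetical stand-ins here and do not reproduce Paddle's actual classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified device contexts (assumptions for illustration only).
struct CPUDeviceContext {
  static std::string Name() { return "cpu"; }
};
struct CUDADeviceContext {
  static std::string Name() { return "cuda"; }
};

// A math functor parameterized on the device context type, not on a place type.
// The context is passed in so a real implementation could reach streams or handles.
template <typename DeviceContext, typename T>
struct ScaleFunctor {
  void operator()(const DeviceContext& ctx, T factor, std::vector<T>* data) const {
    (void)ctx;  // unused in this host-only sketch
    for (T& v : *data) v *= factor;
  }
};

// A kernel that takes the same template parameter and forwards it to the functor.
template <typename DeviceContext, typename T>
struct ScaleKernel {
  std::string Device() const { return DeviceContext::Name(); }
  void Compute(const DeviceContext& ctx, T factor, std::vector<T>* data) const {
    ScaleFunctor<DeviceContext, T>()(ctx, factor, data);
  }
};
```

One possible reading of the design choice: because the kernel already names the context type, it can pass the context straight to its functors without a separate conversion from a place to a device handle.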

05 Dec, 2017 1 commit
- Remove the cuda stream synchronization between each operator. · 4e451a34
  Committed by dangqingqing on Dec 05, 2017

16 Nov, 2017 1 commit

feature/while_grad_op (#5554) · 18f0c40a

Committed by Yang Yang(Tony) on Nov 16, 2017

* first commit

* Python API for while op

* Python Unittest for simple while_op forward

* fix out to be list

* Fix UT

* VarType

* Fix several bugs

* Fix bug

* Fix bug

* Fix Bug

* Fix bug

* Fix unittest

* Remove debug log

* Add comments

* add PADDLE_ENFORCE

* while_grad_op first commit

* Add `BlockDescBind::FindRecursiveOrCreateVar()` and fix bugs

* not sure how to setdim of while outputs

* push for test

* add executor vlog

* fix bug of while_op cond

* Several enhancements to the code

1. Backward always infers shape and var type, because RENAME variables
are created when the backward operator is created, but their shapes
and var types are not otherwise inferred.
2. Never use SomePtr-> directly, since any pointer returned from a
function could be nullptr. Add `detail::Ref` to cast a pointer to a
reference safely.
3. Enhance error messages for backward.
4. Infer the data type of variables in `sum` and `tensor_write`.

* Fix bugs of while_op gradient

* Fix several bugs of while_op grad

* fix fill zeros like

* fix 3 >= 3

* fix place holder shouldn't be null

* fail on sum op

* Fix SumOp of TensorList

* clean up

* pass while test

* fix test_array_write_read

* pass sum op

* Support int/int64 for fill_constant_batch_size_like

* Fix compile
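
The numbered notes in this commit message mention a `detail::Ref` helper that casts a pointer to a reference safely. A minimal sketch of such a helper follows, assuming it only checks for null and throws on failure. The signature, the `what` parameter, and the exception type are assumptions for illustration; the real helper is not shown in this log and may differ.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical null-checked pointer-to-reference cast, in the spirit of the
// `detail::Ref` named in the commit message. Throws instead of dereferencing null.
template <typename T>
T& Ref(T* ptr, const std::string& what = "pointer") {
  if (ptr == nullptr) {
    throw std::runtime_error(what + " must not be null");
  }
  return *ptr;
}
```

Call sites can then write `Ref(scope.FindVar(name)).Get()` style code (names hypothetical) rather than dereferencing a possibly null return value directly.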

08 Nov, 2017 2 commits

Polish OpWithKernel · bbdac7f7

Committed by Yu Yang on Nov 07, 2017

* Change `IndicateDataType` to `GetKernelType` to make it easier to
  understand.
* Change `OpKernelKey` to `OpKernelType`
* Let operator developers customize which kernel the operator will
  use at runtime.
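
To make the last item more concrete, here is a hedged sketch of how an overridable `GetKernelType` could drive kernel lookup. Only the names `OpKernelType` and `GetKernelType` come from the commit message. The fields, the `Register` and `Run` methods, and `AccuracyLikeOp` are assumptions for illustration and do not reproduce Paddle's real interfaces.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <utility>

enum class DataType { kFloat32, kInt64 };
enum class Place { kCPU, kCUDA };

// Hypothetical key that identifies one registered kernel.
struct OpKernelType {
  DataType data_type;
  Place place;
  bool operator<(const OpKernelType& o) const {
    return std::tie(data_type, place) < std::tie(o.data_type, o.place);
  }
};

class OperatorWithKernel {
 public:
  using Kernel = std::function<std::string()>;
  virtual ~OperatorWithKernel() = default;

  void Register(OpKernelType type, Kernel kernel) { kernels_[type] = std::move(kernel); }

  // Looks up the kernel chosen by GetKernelType and runs it.
  std::string Run(Place place) const { return kernels_.at(GetKernelType(place))(); }

 protected:
  // Default choice; operator developers may override this to pick another kernel.
  virtual OpKernelType GetKernelType(Place place) const {
    return {DataType::kFloat32, place};
  }

 private:
  std::map<OpKernelType, Kernel> kernels_;
};

// An operator that always asks for its integer kernel.
class AccuracyLikeOp : public OperatorWithKernel {
 protected:
  OpKernelType GetKernelType(Place place) const override {
    return {DataType::kInt64, place};
  }
};
```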

Check errors for the cuda kernel calls. (#5436) · 58db07b7
Committed by qingqing01 on Nov 08, 2017

07 Nov, 2017 1 commit

Add unittest, backward of array read/write op (#5409) · 6cde889b

Committed by Yu Yang on Nov 06, 2017

* Use stable_sort in lod_rank_table

It is easier to debug and test when using `stable_sort`, and the time
complexity is unchanged.

* Add LoDTensorArray

* Stash

* Better debug message for IsInitialized

* Stash

* Better debug message for IsInitialized

* Complete array read/write op unittests

* Add unittest, Gradient of array read/write

* Follow comments
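
The first item of this commit switches `lod_rank_table` to `stable_sort`. The sketch below shows why a stable sort is easier to test: items with equal keys keep their input order, so the result is deterministic. `RankByLengthDesc` is a hypothetical helper, and ranking sequences by descending length is an assumption made for the example, not something this log states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (original index, sequence length)
using RankItem = std::pair<std::size_t, std::size_t>;

// Ranks sequences by length, longest first. std::stable_sort keeps the
// original relative order of sequences that have the same length.
std::vector<RankItem> RankByLengthDesc(const std::vector<std::size_t>& lengths) {
  std::vector<RankItem> items;
  items.reserve(lengths.size());
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    items.emplace_back(i, lengths[i]);
  }
  std::stable_sort(items.begin(), items.end(),
                   [](const RankItem& a, const RankItem& b) {
                     return a.second > b.second;
                   });
  return items;
}
```

With lengths `{2, 3, 2, 3}` the indices come out in the order 1, 3, 0, 2. An unstable sort would be free to swap the tied entries.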

02 Nov, 2017 1 commit

Rewrite StaticRNN with Executor (#5224) · 0a32e74d

Committed by Yu Yang on Nov 01, 2017

* Init commit

* Make executor use ProgramDescBind

* Change Attribute from BlockDesc to BlockDescBind

* Since we will get the program desc in RNN, just BlockDesc is not
  enough.

* Add DeviceContext to Executor API

* Rewrite RNN

* Pass Python

* AddBiasOp does not care num_flatten_dims

* Stash

* Fix MacOS Compile

* Pass RNN forward

* add python test

* refactor test

* Make compile pass

* add gradopmaker

* First draft done

* Polish code

* add grad op maker and grad infershape

* Polish code

* Fix backward.cc bug

* Fix infershape

* Rename function

* add backward test

* simplify recurrent test

* Update

* Pass unittest

* Add comments & refine test

* Add comments

* refactor test

* Complete Unittest

* fix StepScopes enforce

* Remove unused unittest

* no type error

* Update

* Make RNN Pass unittest

01 Nov, 2017 1 commit
- add shareLod (#5259) · ee11f006
  Committed by Qiao Longfei on Nov 01, 2017
```
* add shareLod

* fix sequence_conv grad infershape
```
31 Oct, 2017 1 commit
- add gpu kernel by copying inputs/outputs between cpu and gpu. · 86fd6b63
  Committed by caoying03 on Oct 29, 2017

29 Oct, 2017 2 commits
- Polish Accuracy Op (#5191) · 46a13e37
  Committed by Yu Yang on Oct 28, 2017
```
* Accuracy does not support float/double, only support integers
* Polish error message when an operator does not support some device.
```
- Extract InferShape to many cc files (#5174) · 8f6c0a0f
  Committed by Yu Yang on Oct 28, 2017
```
* Shrink Operator.h

* Fix CI compile
```
27 Oct, 2017 1 commit

add sparse support for sum op (#5093) · 7f8574c0

Committed by QI JUN on Oct 26, 2017

* add sparse support for sum op

* typo fix

* fix gpu build error

* fix unittest error

* typo fix

* infer var type and shape in op_test

* follow comments

* fix build error

* bypass some unittests depend on NetOp

21 Oct, 2017 1 commit
- Global function, op_support_gpu (#4980) · 86437a8d
  Committed by Yu Yang on Oct 20, 2017