Commits · fd2bf55016e6de50bbc436476050f1c442cb654c · Crayon鑫 / Paddle

27 Dec, 2017 1 commit
- Rename API of DeviceContext · fd2bf550
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
  fd2bf550
25 Dec, 2017 2 commits

Committed by Qiao Longfei on Dec 25, 2017

* init kernel hint

* fix typo

* rm unused code

* add include in op_kernel.h

* restore op_kernel since it will be moved to op_kernel_type

* change force_cpu to use_cpu

* fix compilation

af0c4c45

add op_kernel_type_test · 313afc9c
Committed by qiaolongfei on Dec 25, 2017

313afc9c

24 Dec, 2017 2 commits

refine OpKernelType (#6879) · 37e96264
Committed by QI JUN on Dec 24, 2017
```
* refine OpKernelKey

* refine codes

* fix code style

* follow comments
```
37e96264

Feature/operator run place (#6783) · 735eba29

Committed by dzhwinter on Dec 24, 2017

* "change operator interface"

* "move devicepool to device_context"

* "fix operator test"

* "fix op_registry Run interface"

* "net op passed. Need to fix nccl multi-Context"

* "add nccl group function"

* "add nccl group function"

* "fix gpu count exceed 32 error"

* "fix recurrent op, nccl op"

* "change the other operators interface with Place"

* "fix typo"

* "fix pybind"

* "fix device in python side"

* "fix pybind failed"

* "add init for test"

* "fix CI"

735eba29

20 Dec, 2017 1 commit
- Move framework.proto to proto namespace (#6718) · e445b3ff
  Committed by Yu Yang on Dec 20, 2017
```
* Move framework.proto to proto namespace

* Fix compile

* Fix compile

* Fix Compile
```
  e445b3ff
12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95
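
The first item in the commit above (templating math functors and kernels on `DeviceContext` rather than on `Place`) can be illustrated with a minimal sketch. `CPUDeviceContext`, `ScaleFunctor`, and `Scale` below are hypothetical stand-ins written for this illustration; they are not Paddle's actual classes or signatures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a device context (not Paddle's real class).
// A context can carry per-device state; here it only counts launches.
struct CPUDeviceContext {
  mutable int launches = 0;
};

// A math functor templated on the context type instead of a Place tag.
// Only specializations are defined, one per supported context.
template <typename DeviceContext, typename T>
struct ScaleFunctor;

// CPU specialization: multiplies every element by `factor` in place.
template <typename T>
struct ScaleFunctor<CPUDeviceContext, T> {
  void operator()(const CPUDeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    assert(data != nullptr);
    ++ctx.launches;
    for (T& v : *data) v *= factor;
  }
};

// Kernel-like helper: the context type selects the functor specialization,
// and the same context object is passed through to it.
template <typename DeviceContext, typename T>
std::vector<T> Scale(const DeviceContext& ctx, T factor, std::vector<T> data) {
  ScaleFunctor<DeviceContext, T>()(ctx, factor, &data);
  return data;
}
```

With this shape, calling `Scale(ctx, 2.0f, values)` on a `CPUDeviceContext` picks the CPU specialization at compile time, and a GPU context type would add its own specialization without changing the caller.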

05 Dec, 2017 1 commit
- Remove the cuda stream synchronization between each operator. · 4e451a34
  Committed by dangqingqing on Dec 05, 2017
  
  4e451a34
16 Nov, 2017 1 commit

feature/while_grad_op (#5554) · 18f0c40a

Committed by Yang Yang(Tony) on Nov 16, 2017

* first commit

* Python API for while op

* Python Unittest for simple while_op forward

* fix out to be list

* Fix UT

* VarType

* Fix several bugs

* Fix bug

* Fix bug

* Fix Bug

* Fix bug

* Fix unittest

* Remove debug log

* Add comments

* add PADDLE_ENFORCE

* while_grad_op first commit

* Add `BlockDescBind::FindRecursiveOrCreateVar()` and fix bugs

* not sure how to setdim of while outputs

* push for test

* add executor vlog

* fix bug of while_op cond

* Several code enhancements

1. Backward always infers shapes and var types, because RENAME
variables are created when the backward operators are created, but their
shapes and var types were not inferred.
2. Never use SomePtr-> directly, since any pointer returned from a
function could be nullptr. Add `detail::Ref` to cast a pointer to a
reference safely.
3. Enhance error message for backward.
4. Infer data type of variable in `sum` and `tensor_write`

* Fix bugs of while_op gradient

* Fix several bugs of while_op grad

* fix fill zeros like

* fix 3 >= 3

* fix place holder shouldn't be null

* fail on sum op

* Fix SumOp of TensorList

* clean up

* pass while test

* fix test_array_write_read

* pass sum op

* Support int/int64 for fill_constant_batch_size_like

* Fix compile

18f0c40a
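
The second enhancement in the commit above mentions `detail::Ref` as a safe pointer-to-reference cast. A minimal sketch of that idea follows. The signature, the exception type, and the helper names `Variable`, `Scope`, `FindVar`, and `RankOf` are assumptions made for this illustration; the actual helper in Paddle may differ.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

namespace detail {
// Hypothetical checked cast: returns *ptr, or throws if ptr is null.
// `what` only labels the error message.
template <typename T>
T& Ref(T* ptr, const std::string& what = "pointer") {
  if (ptr == nullptr) {
    throw std::invalid_argument(what + " must not be null");
  }
  return *ptr;
}
}  // namespace detail

// Illustrative types, not Paddle's real Variable or Scope.
struct Variable {
  int rank = 0;
};
using Scope = std::map<std::string, Variable>;

// A lookup that returns nullptr when the name is unknown.
Variable* FindVar(Scope* scope, const std::string& name) {
  assert(scope != nullptr);
  auto it = scope->find(name);
  return it == scope->end() ? nullptr : &it->second;
}

// Instead of FindVar(scope, name)->rank, go through Ref so that a failed
// lookup becomes a clear error rather than a null dereference.
int RankOf(Scope* scope, const std::string& name) {
  return detail::Ref(FindVar(scope, name), name).rank;
}
```

The design choice is to move the null check to a single helper, so call sites stay as short as a direct `->` access while failures carry a readable message.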

08 Nov, 2017 2 commits

Polish OpWithKernel · bbdac7f7

Committed by Yu Yang on Nov 07, 2017

* Change `IndicateDataType` to `GetKernelType` to make it easier to
  understand.
* Change `OpKernelKey` to `OpKernelType`
* Let operator developers customize which kernel the operator will
  use at runtime.

bbdac7f7
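
The last item in the commit above, letting an operator choose its kernel at runtime through `GetKernelType`, can be sketched roughly as follows. This is a simplified illustration under assumptions: the fields of `OpKernelType`, the `Run` signature, and the `CpuOnlyOp` and `MakeDemoKernels` names are hypothetical and do not reproduce Paddle's real interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

enum class DataType { kFloat32, kInt64 };
enum class Place { kCPU, kCUDA };

// Hypothetical, simplified kernel key: data type plus place.
struct OpKernelType {
  DataType data_type;
  Place place;
  bool operator==(const OpKernelType& o) const {
    return data_type == o.data_type && place == o.place;
  }
};

struct OpKernelTypeHash {
  std::size_t operator()(const OpKernelType& k) const {
    return static_cast<std::size_t>(k.data_type) * 31u +
           static_cast<std::size_t>(k.place);
  }
};

// A "kernel" here just returns its own name.
using Kernel = std::function<std::string()>;
using KernelMap = std::unordered_map<OpKernelType, Kernel, OpKernelTypeHash>;

class OperatorWithKernel {
 public:
  virtual ~OperatorWithKernel() = default;

  // Looks up the kernel under the key chosen by GetKernelType and runs it.
  std::string Run(const KernelMap& kernels, DataType dtype, Place place) const {
    auto it = kernels.find(GetKernelType(dtype, place));
    assert(it != kernels.end() && "no kernel registered for this key");
    return it->second();
  }

 protected:
  // Default policy: use the input data type and the requested place.
  virtual OpKernelType GetKernelType(DataType dtype, Place place) const {
    return OpKernelType{dtype, place};
  }
};

// An operator that overrides the policy and always picks its CPU kernel.
class CpuOnlyOp : public OperatorWithKernel {
 protected:
  OpKernelType GetKernelType(DataType dtype, Place) const override {
    return OpKernelType{dtype, Place::kCPU};
  }
};

KernelMap MakeDemoKernels() {
  KernelMap kernels;
  kernels[OpKernelType{DataType::kFloat32, Place::kCPU}] = [] {
    return std::string("cpu_float32");
  };
  kernels[OpKernelType{DataType::kFloat32, Place::kCUDA}] = [] {
    return std::string("cuda_float32");
  };
  return kernels;
}
```

In this sketch the base operator dispatches to the kernel matching the requested place, while `CpuOnlyOp` returns a CPU key even when a CUDA place is requested.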

Check errors for the cuda kernel calls. (#5436) · 58db07b7
Committed by qingqing01 on Nov 08, 2017

58db07b7

07 Nov, 2017 1 commit

Add unittest, backward of array read/write op (#5409) · 6cde889b

Committed by Yu Yang on Nov 06, 2017

* Use stable_sort in lod_rank_table

Using `stable_sort` makes debugging and testing easier, and the time
complexity is unchanged.

* Add LoDTensorArray

* Stash

* Better debug message for IsInitialized

* Stash

* Better debug message for IsInitialized

* Complete array read/write op unittests

* Add unittest, Gradient of array read/write

* Follow comments

6cde889b
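
The `stable_sort` note in the commit above can be illustrated with a small sketch. It assumes, for illustration only, that a rank table orders sequences by length in descending order; `RankItem` and `BuildRankTable` are hypothetical names, not the code from `lod_rank_table`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (original sequence index, sequence length)
using RankItem = std::pair<std::size_t, std::size_t>;

// Orders sequences by length, longest first. std::stable_sort keeps the
// original index order among sequences of equal length, so the result is
// deterministic and easy to predict in tests.
std::vector<RankItem> BuildRankTable(const std::vector<std::size_t>& lengths) {
  std::vector<RankItem> items;
  items.reserve(lengths.size());
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    items.emplace_back(i, lengths[i]);
  }
  auto longer_first = [](const RankItem& a, const RankItem& b) {
    return a.second > b.second;
  };
  std::stable_sort(items.begin(), items.end(), longer_first);
  assert(std::is_sorted(items.begin(), items.end(), longer_first));
  return items;
}
```

For lengths `{2, 3, 2, 3}` the indices come out as 1, 3, 0, 2: ties keep their input order, which plain `std::sort` does not guarantee.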

02 Nov, 2017 1 commit

Rewrite StaticRNN with Executor (#5224) · 0a32e74d

Committed by Yu Yang on Nov 01, 2017

* Init commit

* Make executor use ProgramDescBind

* Change Attribute from BlockDesc to BlockDescBind

* Since we will get the program desc in RNN, just BlockDesc is not
  enough.

* Add DeviceContext to Executor API

* Rewrite RNN

* Pass Python

* AddBiasOp does not care num_flatten_dims

* Stash

* Fix MacOS Compile

* Pass RNN forward

* add python test

* refactor test

* Make compile pass

* add gradopmaker

* First draft done

* Polish code

* add grad op maker and grad infershape

* Polish code

* Fix backward.cc bug

* Fix infershape

* Rename function

* add backward test

* simplify recurrent test

* Update

* Pass unittest

* Add comments & refine test

* Add comments

* refactor test

* Complete Unittest

* fix StepScopes enforce

* Remove unused unittest

* no type error

* Update

* Make RNN Pass unittest

0a32e74d

01 Nov, 2017 1 commit
- add shareLod (#5259) · ee11f006
  Committed by Qiao Longfei on Nov 01, 2017
```
* add shareLod

* fix sequence_conv grad infershape
```
  ee11f006
31 Oct, 2017 1 commit
- add gpu kernel by copying inputs/outputs between cpu and gpu. · 86fd6b63
  Committed by caoying03 on Oct 29, 2017
  
  86fd6b63
29 Oct, 2017 2 commits
- Polish Accuracy Op (#5191) · 46a13e37
  Committed by Yu Yang on Oct 28, 2017
```
* Accuracy does not support float/double; it only supports integers
* Polish error message when an operator does not support some device.
```
  46a13e37
- Extract InferShape to many cc files (#5174) · 8f6c0a0f
  Committed by Yu Yang on Oct 28, 2017
```
* Shrink Operator.h

* Fix CI compile
```
  8f6c0a0f
27 Oct, 2017 1 commit

add sparse support for sum op (#5093) · 7f8574c0

Committed by QI JUN on Oct 26, 2017

* add sparse support for sum op

* typo fix

* fix gpu build error

* fix unittest error

* typo fix

* infer var type and shape in op_test

* follow comments

* fix build error

* bypass some unittests that depend on NetOp

7f8574c0

21 Oct, 2017 1 commit
- Global function, op_support_gpu (#4980) · 86437a8d
  Committed by Yu Yang on Oct 20, 2017
  
  86437a8d
07 Oct, 2017 1 commit
- merge InferShapeContext and ExecutionContext · a0767228
  Committed by qiaolongfei on Oct 07, 2017
  
  a0767228
05 Oct, 2017 2 commits

Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU · 4558807c
Committed by Yi Wang on Oct 04, 2017

4558807c

Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` · 84500f94

Committed by Yu Yang on Oct 04, 2017

Done with the following shell commands:

```bash
sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
```

84500f94

01 Oct, 2017 1 commit
- add some check to operator.run (#4544) · 87efa600
  Committed by Qiao Longfei on Sep 30, 2017
```
* fix cond_op_test and add some check to operator.run

* tmp

* optimize kernel check
```
  87efa600
29 Sep, 2017 1 commit
- move EigenDeviceConverter to device_context.h · 7a6fcc7d
  Committed by qijun on Sep 28, 2017
  
  7a6fcc7d
27 Sep, 2017 2 commits

fix atomic issue when cpu only · e0b17754
Committed by tensor-tang on Sep 27, 2017

e0b17754

Refactoring InferShape (#3946) · 9a9d50a6

Committed by Qiao Longfei on Sep 26, 2017

* init Infershape

* add static InferShape interface

* refactor add-op infershape

* add AttrReader

* add all maker's infershape

* add all InferShape

* add python infer api

* add VarDesc interface

* add python VarDesc and OpDesc interface

* update python code

* use infershape function to do shape inference

* clean code

* do not use pointer

* refine code of op_proto_maker

* add get_dims to VarDesc

* refine the code

* remove the dependency from operator to op registry

* remove OpProtoAndCheckerMaker from operator

* restore complete_add_op

* add shape_infer_impl.h

* code optimization

* remove const return value

* add fake BlockDesc class

* optimize code

* remove infer function in op_info

* move InferShapeContextImpl to operator.h

* optimize the interface of InferShapeContextBase

* add temporary interface of new infershape

* change add_op, clip_op, conv2d_op and activation_op

* change all operators InferShape

* fix SetDim

* update cos_sim_op

* update crop_op

* update lookup_table_op

* allocate tensor when calling GetDim in InferShapeContext

* update modified_huber_loss_op

* update rowwise_add_op

* update mean_op

* update sequence_avg_pool_op

* typo

* remove old InferShape interface

* can compile

* fix or unit test

* clean code

* clean code

* remove const before InferShapeContext

* change InferenceContextBase to pointer

* rename RunTime to Runtime, code clean

9a9d50a6

21 Sep, 2017 3 commits
- Remove LoDTensor in some operators' InferShape and refine ShareLoD function. · 36aeb30d
  Committed by dangqingqing on Sep 21, 2017
  
  36aeb30d
- make RecurrentOp's backward work · 6a0c3428
  Committed by superjom on Sep 20, 2017
  
  6a0c3428
- move OpProtoAndCheckerMaker from operator to op_proto_maker · a7a66b80
  Committed by qiaolongfei on Sep 19, 2017
  
  a7a66b80
20 Sep, 2017 1 commit
- move OpProtoAndCheckerMaker from operator to op_proto_maker · 98ef17ed
  Committed by qiaolongfei on Sep 19, 2017
  
  98ef17ed
19 Sep, 2017 1 commit

Remove lazy-initialization in device_context · 81d56ca8

Committed by Yu Yang on Sep 18, 2017

* Also use `const DeviceContext&` all the time, to prevent `const_cast`

Fix #4169
Fix #3468
Fix #3475

81d56ca8

14 Sep, 2017 3 commits
- Fix specialization of template member functions in the non-template class in GCC 5.0. · 74f460fd
  Committed by dangqingqing on Sep 14, 2017
  
  74f460fd
- fix relu functor and revert some codes · 0957fa7b
  Committed by qijun on Sep 14, 2017
  
  0957fa7b
- Replace LoDTensor in elementwise_mul_op, pad_op and recurrent_op_utils. · cb284283
  Committed by dangqingqing on Sep 14, 2017
  
  cb284283
13 Sep, 2017 2 commits
- Using LoDTensor instead of Tensor in every operator. · f2992063
  Committed by dangqingqing on Sep 13, 2017
  
  f2992063
- move EigenDeviceConverter to device_context.h · 3c49e7b1
  Committed by qijun on Sep 13, 2017
  
  3c49e7b1
08 Sep, 2017 1 commit
- follow comments · f50e36e2
  Committed by qijun on Sep 08, 2017
  
  f50e36e2
04 Sep, 2017 3 commits
- Fix CI Test · 13b43279
  Committed by Yu Yang on Sep 03, 2017
  
  13b43279
- Add GenerateTemporaryNames/CheckAllInputOutputSet · 7d5bdbbf
  Committed by Yu Yang on Sep 03, 2017
  
  7d5bdbbf
- Simple Implementation · d7a1e40e
  Committed by Yu Yang on Sep 03, 2017
  
  d7a1e40e

Crayon鑫 / Paddle is up to date with the fork's source project
