Commits · 2a54ddd26710858d732b1743df1aa500e1efef94 · PaddlePaddle / Paddle

08 Jan, 2018 3 commits

Feature/add shared layout (#7233) · e94db381
Committed by dzhwinter on Jan 08, 2018
```
* "reuse ShareLoD with no regret"

* "removed base class shareLayout"

* "fix CI"
```

cpu gpu transform function (#7191) · 0f353ab4

Committed by Qiao Longfei on Jan 08, 2018

* add rename guard

* add device_data_transform

* add device_data_transform_test

* modify GetExpectedKernelType

* update operator.run

* support test test_label_semantic_roles

* optimize code

* optimize code

* rename GetActualKernelType to GetExpectedKernelType

* fix chunk_eval_op and device_data_transform_test

* add is_same_place to place

* optimize code, refine rename_guard

* refine rename guard, add GetKernelTypeForVar

* optimize code

* add some log

* rename guard

* use sub scope to create var

* fix compile

* add IsInitialized for Tensor

* add VarIsTensor

* fix op_registry_test

* test

* tmp disable priority

* restore switch_kernel.md

* code clean

Show argument dimensions with operator::DebugStringEx (#7268) · 8814bec0
Committed by emailweixu on Jan 07, 2018
```
This makes it easier to locate errors.
```

05 Jan, 2018 1 commit

Feature/use cudnn (#7141) · 5593858d

Committed by dzhwinter on Jan 05, 2018

* "add c++ side kernel selection"

* "add multiple kernel op test"

* "kernel selection only support cudnn"

* "better formatter"

* "small fix with UseCPU"

* "depends on change interface Get(Place, Library)"

* "fix CI"

* "fix python cudnn test"

* "leave the register cudnn op to another PR"

* "fix CI"

* "use all kernel by default"

* "fix CI"

04 Jan, 2018 1 commit
- clean up · 97dc451f
  Committed by Yang Yang on Jan 04, 2018

02 Jan, 2018 1 commit

Feature/transform (#7111) · 899a79cc

Committed by dzhwinter on Jan 02, 2018

* "fix data transform"

* "data transformer"

* "add device pool"

* "add test"

* "fix ci"

* "fix datalayout implementation "

* "fix based on comment"

29 Dec, 2017 1 commit
- add helper function to get appropriate DeviceContext (#7066) · 5036cf03
  Committed by QI JUN on Dec 29, 2017
```
* add helper function to get appropriate DeviceContext
```
27 Dec, 2017 6 commits
- Rename API of DeviceContext (#7055) · 15e8c80e
  Committed by Yu Yang on Dec 27, 2017
```
* Rename API of DeviceContext

Make them as usual names.

* Rename API of DeviceContext

Make them as usual names.

* Fix compile

* Fix compile

* Fix compile

* Fix compile

* Fix compile
```
- cache memory in local scope (#7058) · 7aed7eb5
  Committed by QI JUN on Dec 27, 2017
```
* add KernelTypeToString interface

* cache memory in local scope

* fix typo

* refine trans logic
```
- Rename API of DeviceContext · 8b877dd7
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- Rename API of DeviceContext · a5e1cf5a
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- Rename API of DeviceContext · fd2bf550
  Committed by Yang Yu on Dec 27, 2017
```
Make them as usual names.
```
- add memory switch mechanism in operator kernel switch (#6991) · 94096ae5
  Committed by QI JUN on Dec 27, 2017
```
* add memory switch mechanism in operator kernel switch
```
25 Dec, 2017 2 commits

Impl kernel hint (#6883) · af0c4c45

Committed by Qiao Longfei on Dec 25, 2017

* init kernel hint

* fix typo

* rm unused code

* add include in op_kernel.h

* restore op_kernel since it will be moved to op_kernel_type

* change force_cpu to use_cpu

* fix compilation

add op_kernel_type_test · 313afc9c
Committed by qiaolongfei on Dec 25, 2017

24 Dec, 2017 2 commits

refine OpKernelType (#6879) · 37e96264
Committed by QI JUN on Dec 24, 2017
```
* refine OpKernelKey

* refine codes

* fix code style

* follow comments
```

Feature/operator run place (#6783) · 735eba29

Committed by dzhwinter on Dec 24, 2017

* "change operator interface"

* "move devicepool to device_context"

* "fix operator test"

* "fix op_registry Run interface"

* "net op passed. Need to fix nccl multi-Context"

* "add nccl group function"

* "add nccl group function"

* "fix gpu count exceed 32 error"

* "fix recurrent op, nccl op"

* "change the other operators interface with Place"

* "fix typo"

* "fix pybind"

* "fix device in python side"

* "fix pybind failed"

* "add init for test"

* "fix CI"

21 Dec, 2017 1 commit
- pass forward runtime · f899150e
  Committed by Yang Yang on Dec 21, 2017

20 Dec, 2017 1 commit
- Move framework.proto to proto namespace (#6718) · e445b3ff
  Committed by Yu Yang on Dec 20, 2017
```
* Move framework.proto to proto namespace

* Fix compile

* Fix compile

* Fix Compile
```
12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
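
The first change in the list above can be illustrated with a minimal, self-contained sketch. The types below (`CPUPlace`, `CPUDeviceContext`, `ScaleFunctor`, `ScaleOnCpu`) are simplified stand-ins written for this illustration and are not Paddle's actual declarations. The sketch only shows the general idea: a functor templated on a device context can use state owned by that context directly, rather than starting from a `Place` and looking a context up.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration only; not Paddle's real classes.
struct CPUPlace {};

class CPUDeviceContext {
 public:
  CPUPlace GetPlace() const { return CPUPlace{}; }
  // A context can carry per-device state (streams, handles, and so on).
  // A simple counter stands in for such state here.
  void RecordLaunch() { ++launches_; }
  int launches() const { return launches_; }

 private:
  int launches_ = 0;
};

// Functor templated on the device context rather than on the place.
// Specializations for other contexts could be added alongside this one.
template <typename DeviceContext, typename T>
struct ScaleFunctor;

template <typename T>
struct ScaleFunctor<CPUDeviceContext, T> {
  void operator()(CPUDeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    ctx.RecordLaunch();
    for (T& v : *data) v *= factor;
  }
};

// Small helper that runs the functor on a fresh CPU context.
inline std::vector<float> ScaleOnCpu(std::vector<float> data, float factor) {
  CPUDeviceContext ctx;
  ScaleFunctor<CPUDeviceContext, float>()(ctx, factor, &data);
  return data;
}
```

Calling `ScaleOnCpu({1.0f, 2.0f, 3.0f}, 2.0f)` doubles each element; a GPU variant would add a specialization for a CUDA context type instead of branching on a place at run time.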

05 Dec, 2017 1 commit
- Remove the cuda stream synchronization between each operator. · 4e451a34
  Committed by dangqingqing on Dec 05, 2017

16 Nov, 2017 1 commit

feature/while_grad_op (#5554) · 18f0c40a

Committed by Yang Yang(Tony) on Nov 16, 2017

* first commit

* Python API for while op

* Python Unittest for simple while_op forward

* fix out to be list

* Fix UT

* VarType

* Fix several bugs

* Fix bug

* Fix bug

* Fix Bug

* Fix bug

* Fix unittest

* Remove debug log

* Add comments

* add PADDLE_ENFORCE

* while_grad_op first commit

* Add `BlockDescBind::FindRecursiveOrCreateVar()` and fix bugs

* not sure how to setdim of while outputs

* push for test

* add executor vlog

* fix bug of while_op cond

* Several enhancement for code

1. Backward always infers shape & var type, since RENAME variables are
created when creating backward operators, but their shape & var types
are not inferred.
2. Never use SomePtr-> directly, since any pointer could be nullptr if
it is a function return value. Add `detail::Ref` to cast a pointer to a
reference safely.
3. Enhance error messages for backward.
4. Infer data type of variable in `sum` and `tensor_write`

* Fix bugs of while_op gradient

* Fix several bugs of while_op grad

* fix fill zeros like

* fix 3 >= 3

* fix place holder shouldn't be null

* fail on sum op

* Fix SumOp of TensorList

* clean up

* pass while test

* fix test_array_write_read

* pass sum op

* Support int/int64 for fill_constant_batch_size_like

* Fix compile
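
The second enhancement in this commit message (casting a pointer to a reference safely) can be sketched as follows. This is an illustration under stated assumptions, not the code added by the commit: the namespace `sketch`, the use of `std::invalid_argument`, and the `FindValue` helper are all invented here, and the real `detail::Ref` may differ in signature and error handling.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Converts a pointer to a reference, failing loudly on nullptr instead of
// dereferencing it. Throwing std::invalid_argument is a choice made for
// this sketch only.
template <typename T>
T& Ref(T* ptr, const std::string& what = "pointer") {
  if (ptr == nullptr) {
    throw std::invalid_argument(what + " must not be null");
  }
  return *ptr;
}

}  // namespace sketch

// Hypothetical lookup that may return nullptr, as a function result might.
inline int* FindValue(int* storage, bool found) {
  return found ? storage : nullptr;
}
```

With a helper like this, `sketch::Ref(FindValue(&x, true))` yields a usable `int&`, while a failed lookup raises an error at the call site instead of crashing later on `->` or `*`.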

08 Nov, 2017 2 commits

Polish OpWithKernel · bbdac7f7

Committed by Yu Yang on Nov 07, 2017

* Change `IndicateDataType` to `GetKernelType`, which is easier to
  understand.
* Change `OpKernelKey` to `OpKernelType`
* Let operator developers customize which kernel the operator will
  use at runtime.

Check errors for the cuda kernel calls. (#5436) · 58db07b7
Committed by qingqing01 on Nov 08, 2017

07 Nov, 2017 1 commit

Add unittest, backward of array read/write op (#5409) · 6cde889b

Committed by Yu Yang on Nov 06, 2017

* Use stable_sort in lod_rank_table

It is easier to debug and test when using `stable_sort`, and the time
complexity is unchanged.

* Add LoDTensorArray

* Stash

* Better debug message for IsInitialized

* Stash

* Better debug message for IsInitialized

* Complete array read/write op unittests

* Add unittest, Gradient of array read/write

* Follow comments
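
The `stable_sort` note in this commit message can be illustrated with a small sketch. The item layout (original index paired with sequence length) and the function name `RankByLengthDesc` are assumptions made for this example, not the actual `lod_rank_table` code. It shows why a stable sort helps testing: items with equal lengths keep their original relative order, so the result is deterministic.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Each item is (original index, sequence length); this layout is an
// assumption made for the sketch.
using RankItem = std::pair<std::size_t, std::size_t>;

// Ranks sequences by length in descending order. Ties keep their original
// index order because std::stable_sort preserves the order of equal items.
inline std::vector<RankItem> RankByLengthDesc(
    const std::vector<std::size_t>& lengths) {
  std::vector<RankItem> items;
  items.reserve(lengths.size());
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    items.emplace_back(i, lengths[i]);
  }
  std::stable_sort(items.begin(), items.end(),
                   [](const RankItem& a, const RankItem& b) {
                     return a.second > b.second;
                   });
  return items;
}
```

For lengths `{3, 5, 3, 5, 1}` the two sequences of length 5 come first in index order (1, then 3), followed by the two of length 3 (0, then 2), then index 4. With `std::sort` the order inside each tie group would be unspecified.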

02 Nov, 2017 1 commit

Rewrite StaticRNN with Executor (#5224) · 0a32e74d

Committed by Yu Yang on Nov 01, 2017

* Init commit

* Make executor use ProgramDescBind

* Change Attribute from BlockDesc to BlockDescBind

* Since we will get the program desc in RNN, just BlockDesc is not
  enough.

* Add DeviceContext to Executor API

* Rewrite RNN

* Pass Python

* AddBiasOp does not care num_flatten_dims

* Stash

* Fix MacOS Compile

* Pass RNN forward

* add python test

* refactor test

* Make compile pass

* add gradopmaker

* First draft done

* Polish code

* add grad op maker and grad infershape

* Polish code

* Fix backward.cc bug

* Fix infershape

* Rename function

* add backward test

* simplify recurrent test

* Update

* Pass unittest

* Add comments & refine test

* Add comments

* refactor test

* Complete Unittest

* fix StepScopes enforce

* Remove unused unittest

* no type error

* Update

* Make RNN Pass unittest

01 Nov, 2017 1 commit
- add shareLod (#5259) · ee11f006
  Committed by Qiao Longfei on Nov 01, 2017
```
* add shareLod

* fix sequence_conv grad infershape
```
31 Oct, 2017 1 commit
- add gpu kernel by copying inputs/outputs between cpu and gpu. · 86fd6b63
  Committed by caoying03 on Oct 29, 2017

29 Oct, 2017 2 commits
- Polish Accuracy Op (#5191) · 46a13e37
  Committed by Yu Yang on Oct 28, 2017
```
* Accuracy does not support float/double; it only supports integers
* Polish error message when an operator does not support some device.
```
- Extract InferShape to many cc files (#5174) · 8f6c0a0f
  Committed by Yu Yang on Oct 28, 2017
```
* Shrink Operator.h

* Fix CI compile
```
27 Oct, 2017 1 commit

add sparse support for sum op (#5093) · 7f8574c0

Committed by QI JUN on Oct 26, 2017

* add sparse support for sum op

* typo fix

* fix gpu build error

* fix unittest error

* typo fix

* infer var type and shape in op_test

* follow comments

* fix build error

* bypass some unittests depend on NetOp

21 Oct, 2017 1 commit
- Global function, op_support_gpu (#4980) · 86437a8d
  Committed by Yu Yang on Oct 20, 2017

07 Oct, 2017 1 commit
- merge InferShapeContext and ExecutionContext · a0767228
  Committed by qiaolongfei on Oct 07, 2017

05 Oct, 2017 2 commits

Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU · 4558807c
Committed by Yi Wang on Oct 04, 2017

Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` · 84500f94

Committed by Yu Yang on Oct 04, 2017

Done with the following shell commands:

```bash
sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
```

01 Oct, 2017 1 commit
- add some check to operator.run (#4544) · 87efa600
  Committed by Qiao Longfei on Sep 30, 2017
```
* fix cond_op_test and add some check to operator.run

* tmp

* optimize kernel check
```
29 Sep, 2017 1 commit
- move EigenDeviceConverter to device_context.h · 7a6fcc7d
  Committed by qijun on Sep 28, 2017

27 Sep, 2017 2 commits

fix atomic issue when cpu only · e0b17754
Committed by tensor-tang on Sep 27, 2017

Refactoring InferShape (#3946) · 9a9d50a6

Committed by Qiao Longfei on Sep 26, 2017

* init Infershape

* add static InferShape interface

* refactor add-op infershape

* add AttrReader

* add all maker's infershape

* add all InferShape

* add python infer api

* add VarDesc interface

* add python VarDesc and OpDesc interface

* update python code

* use infershape function to do shape inference

* clean code

* do not use pointer

* refine code of op_proto_maker

* add get_dims to VarDesc

* refine the code

* remove the dependency from operator to op registry

* remove OpProtoAndCheckerMaker from operator

* restore complete_add_op

* add shape_infer_impl.h

* code optimization

* remove const return value

* add fake BlockDesc class

* optimize code

* remove infer function in op_info

* move InferShapeContextImpl to operator.h

* optimize the interface of InferShapeContextBase

* add temporary interface of new infershape

* change add_op, clip_op, conv2d_op and activation_op

* change all operators InferShape

* fix SetDim

* update cos_sim_op

* update crop_op

* update lookup_table_op

* allocate tensor when call GetDim in InferShapeContext

* update modified_huber_loss_op

* update rowwise_add_op

* update mean_op

* update sequence_avg_pool_op

* typo

* remove old InferShape interface

* can compile

* fix or unit test

* clean code

* clean code

* remove const before InferShapeContext

* change InferenceContextBase to pointer

* rename RunTime to Runtime, code clean

21 Sep, 2017 1 commit
- Remove LoDTensor in some operators' InferShape and refine ShareLoD function. · 36aeb30d
  Committed by dangqingqing on Sep 21, 2017
