Commits · 9812bb8b483269d4f23ed2181a949963215a6458 · PaddlePaddle / Paddle

07 Jun, 2018 1 commit

Committed by mozga-intel on Jun 07, 2018

* Add MKLDNN layout support in Paddle

Add an MKLDNN layout to Paddle so that MKLDNN-friendly memory layouts
can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW
was hardcoded in all MKLDNN OP kernels. As a result, a
non-optimized execution path was selected in the MKLDNN primitives,
which hurt performance.
Besides the framework change, three MKLDNN OP kernels were updated
to use the new MKLDNN layout: conv, pool2d, and batch_norm.
Other MKLDNN OP kernels need to be updated in a similar way to
achieve the best performance.

* Add MKLDNN layout support in activation OP

* Don't populate layout from input to output when kMKLDNN in

* Refine pool mkldnn op kernel

* MKLDNN layout

* Remove the inheritance from tensor file

* MKLDNN layout: refactoring

* Remove additional #define to register new operator

* Prepare mkldnn tests to work with layout

3ff9ba0e

06 Jun, 2018 1 commit
- Extract method from tensor_impl.h to tensor.cc · fc9f2d28
  Committed by yuyang18 on Jun 06, 2018
18 Apr, 2018 2 commits
- remove not used code · ff0d9341
  Committed by typhoonzero on Apr 18, 2018
- update by comments · 788636f0
  Committed by typhoonzero on Apr 18, 2018
16 Apr, 2018 2 commits
- wip split byref op · 04c559e3
  Committed by typhoonzero on Apr 16, 2018
- add sharable tensor · f86d35a2
  Committed by typhoonzero on Apr 16, 2018
26 Mar, 2018 3 commits
- add unit test · 158d6c4d
  Committed by chengduoZH on Mar 26, 2018
- add CUDAPinnedPlace · 18eb7730
  Committed by chengduoZH on Mar 26, 2018
- replace use_pinned with is_pinned · 39004080
  Committed by chengduoZH on Mar 26, 2018
20 Mar, 2018 1 commit
- add use_pinned · eaa90d38
  Committed by chengduoZH on Mar 20, 2018
13 Feb, 2018 1 commit
- remove 'friend lod_tensor in tensor' · ed5dc3d4
  Committed by fengjiayi on Feb 13, 2018
12 Feb, 2018 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
10 Feb, 2018 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
09 Feb, 2018 1 commit
- Polish code and add comments · 02d494c3
  Committed by Yu Yang on Feb 09, 2018
08 Feb, 2018 1 commit
- Rewrite mixed_vector.h · ef1aba39
  Committed by Yu Yang on Feb 08, 2018
31 Jan, 2018 1 commit

Fix/lod (#7714) · ae7d1c1f

Committed by dzhwinter on Jan 31, 2018

* "Need to re-design LoD "

* "add lod design"

* "fix lod gpu ptr pointer"

* "removed commented code"

* "fix CI"

* "remove set lod in pybind"

* "fix style check"

* "fix CI"

* "fix long type template error"

* "pybind reorder to use Place"

* "fix ci"

* "fix ci"

* fix ci

* "sperate as a new file"

* "fix CI"

* "fix ci"

* small fix

* "add test"

* "fix adam op"

* "fix lstmp op"

* "fix adam op"

* "follow comments"

* "fix ci"

08 Jan, 2018 1 commit

cpu gpu transform function (#7191) · 0f353ab4

Committed by Qiao Longfei on Jan 08, 2018

* add rename guard

* add device_data_transform

* add device_data_transform_test

* modify GetExpectedKernelType

* update operator.run

* support test test_label_semantic_roles

* optimize code

* optimize code

* rename GetActualKernelType to GetExpectedKernelType

* fix chunk_eval_op and device_data_transform_test

* add is_same_place to place

* optimize code, refine rename_guard

* refine rename guard, add GetKernelTypeForVar

* optimize code

* add some log

* rename guard

* use sub scope to create var

* fix compile

* add IsInitialized for Tensor

* add VarIsTensor

* fix op_registry_test

* test

* tmp disable priority

* restore switch_kernel.md

* code clean

27 Dec, 2017 1 commit
- Fix/transform (#7079) · c31cbae5
  Committed by dzhwinter on Dec 27, 2017
```
* "fix data transform"

* "split into next PR"
```
25 Dec, 2017 1 commit

"add data layout" (#6955) · 7777c811

Committed by dzhwinter on Dec 25, 2017

* "add data layout"

* "need kernel registry support"

* "fix data layout"

* "reorder include headers"

* "change enum to enum class"

* "fix CI"

21 Dec, 2017 1 commit
- pass forward runtime · f899150e
  Committed by Yang Yang on Dec 21, 2017
26 Nov, 2017 1 commit

Feature/copytensor (#5455) · 45062fe5

Committed by dzhwinter on Nov 26, 2017

* "make global tensor function independently"

* "replace functor"

* "fix inline template error"

* "fix tensor array with CopyFrom"

* "fix other case use CopyFrom"

* "move the op interface hardly"

* "fix operators"

* "fix typo"

* "delete dynamic recurrent rnn and fix gru_unit in debugmode"

* "fix unique_ptr copy"

* "fix cuda copy"

* "fix namespace error"

* "removed nccl python test"

* "fix include error"

* "fix typo"

* fix copy util test

02 Nov, 2017 1 commit

Rewrite StaticRNN with Executor (#5224) · 0a32e74d

Committed by Yu Yang on Nov 01, 2017

* Init commit

* Make executor use ProgramDescBind

* Change Attribute from BlockDesc to BlockDescBind

* Since we will get the program desc in RNN, just BlockDesc is not
  enough.

* Add DeviceContext to Executor API

* Rewrite RNN

* Pass Python

* AddBiasOp does not care num_flatten_dims

* Stash

* Fix MacOS Compile

* Pass RNN forward

* add python test

* refactor test

* Make compile pass

* add gradopmaker

* First draft done

* Polish code

* add grad op maker and grad infershape

* Polish code

* Fix backward.cc bug

* Fix infershape

* Rename function

* add backward test

* simplify recurrent test

* Update

* Pass unittest

* Add comments & refine test

* Add comments

* refactor test

* Complete Unittest

* fix StepScopes enforce

* Remove unused unittest

* no type error

* Update

* Make RNN Pass unittest

30 Oct, 2017 1 commit

03 image classification (#5192) · 0049ce04

Committed by Qiao Longfei on Oct 30, 2017

* add batch_norm_layer

* add img_conv_group layer and test

* add check to Tensor.type()

* forward can run

* with backward

* change label data time from int32 to int64

* refine code

* follow comment

26 Oct, 2017 1 commit

Feature/save op (#5090) · efc2464f

Committed by Yu Yang on Oct 25, 2017

* Init

* Stash

* Polish SaveLoadOp

* Fix CI

* Polish code

* Save GPU Tensor

* Stash

* Fix CI

25 Oct, 2017 1 commit

"Serialize LoDTensor, Save/Restore model" (#4602) · fd2eb550

Committed by dzhwinter on Oct 24, 2017

* "add model format design doc"

* "add restore function"

* "add parse protobuf"

* "move necessary information to saver.proto"

* "format code"

* "add gpu option"

* "add lod info"

* "add saveop python test wrapper"

* "checkpoint reuse save operator"

* "rewrite model format design doc"

* "async support needed"

* "fix run once"

* "fix doc based on comments"

* "refine based on comments"

* "fix based comments"

* "remove persistable flag from framework.proto"

* "add IndicateDataType to restore op"

* "add save test"

* "modify save restore code"

* "modified the restore logic"

* rm checkpoint_op.cc

* rm test_checkpoint_op.py

* "get inputs outputs name from execution context"

* Saving each variable to an independent file

* Fix bugs

* Rewrite save_restore_op_test with new Python framework

* Move `SaveOp` and `RestoreOp` from OpWithKernel to OpBase

* Refine unit test of SaveOp and RestoreOp

* fix compile error

20 Oct, 2017 1 commit

Remove template parameter for Tensor methods (#4937) · c532b967

Committed by Yu Yang on Oct 19, 2017

* Remove template parameter for Tensor methods

* Also check the type is correct when data()
* Simplify holder_

* Fix accuracy_op

* Register Code

17 Oct, 2017 2 commits

add forward computation of crf operator. · cc220eec
Committed by caoying03 on Oct 12, 2017

Rewrite feed/fetch op (#4815) · 4df6cf4d

Committed by Yu Yang on Oct 16, 2017

* Feed/Fetch op is just a plain operator, not an OpWithKernel
* Do not register OpInfoMaker since Feed/Fetch will never be
  configured by users
* Feed/Fetch op has an empty gradient
* Feed/Fetch op does not hard-code `feed_variable` and `fetch_variable` as
  its input and output; they are plain Operator inputs/outputs

12 Oct, 2017 1 commit

Unify CUDA stream in Tensor CopyFrom interface (#4692) · 2603cb7e

Committed by QI JUN on Oct 11, 2017

* init

* unify CopyFrom interface

* fix gpu build error

* fix bug in tensor_py.h

* refine code comments and add TODO list

* fix conflicts in FeedOp and FetchOp

10 Oct, 2017 1 commit
- Adding implementation for copying a vector to a tensor (#4635) · 383faaf7
  Committed by Abhinav Arora on Oct 09, 2017
```
* Adding implementation for copying a vector to tensor
* Changing Tensor test to access gpu memory indirectly
```
28 Sep, 2017 1 commit
- Add Skeleton of Double support · 3a5693e0
  Committed by Yu Yang on Sep 27, 2017
23 Sep, 2017 2 commits

Remove `numel` field in tensor · a0ce05df

Committed by Yu Yang on Sep 22, 2017

* It duplicates `dim_`. We can compute `numel` from `dim_`
  every time; the cost is small.
* `numel` is not initialized by the constructor. Also, `numel` is hard to
  keep in sync with `dim_`.

So just remove it.

Change namespace of pybind.cc to pybind · f0cd5142
Committed by Yu Yang on Sep 22, 2017

15 Sep, 2017 1 commit
- modified codes · 39d79e64
  Committed by zchen0211 on Sep 14, 2017
12 Sep, 2017 1 commit
- Call Tensor::numel() everywhere. · ad64ca5d
  Committed by qingqing01 on Sep 12, 2017
08 Sep, 2017 1 commit
- Fix CI test · d8921e9d
  Committed by Yu Yang on Sep 07, 2017
07 Sep, 2017 1 commit
- Add function to get element count from tensor. · a2a69f2a
  Committed by qingqing01 on Sep 07, 2017
06 Sep, 2017 1 commit
- tensor element size support · adfef243
  Committed by Zhuoyuan on Sep 05, 2017
05 Sep, 2017 1 commit
- WIP · e76fa85c
  Committed by fengjiayi on Sep 04, 2017

PaddlePaddle / Paddle synced successfully about 1 year ago
