Commits · 446198dab69e599d2aa7e09bc20b889cd29e0b4f · PaddlePaddle / Paddle

31 Jan, 2018 · 1 commit

Committed by dzhwinter on Jan 31, 2018

* "Need to re-design LoD "

* "add lod design"

* "fix lod gpu ptr pointer"

* "removed commented code"

* "fix CI"

* "remove set lod in pybind"

* "fix style check"

* "fix CI"

* "fix long type template error"

* "pybind reorder to use Place"

* "fix ci"

* "fix ci"

* fix ci

* "sperate as a new file"

* "fix CI"

* "fix ci"

* small fix

* "add test"

* "fix adam op"

* "fix lstmp op"

* "fix adam op"

* "follow comments"

* "fix ci"

ae7d1c1f

30 Jan, 2018 · 2 commits

make inference_lib_dist · 9b5d41b6

Committed by Luo Tao on Jan 30, 2018

9b5d41b6

Correct deps of threadpool (#7955) · 97014750

Committed by Yi Wang on Jan 29, 2018

* refine channel test

* follow comments

* Add dependency enforce to threadpool

* Revert changes to channel_test.cc

* Revert changes to channel_test.cc

* Add #include "paddle/framework/macros.h"

97014750

29 Jan, 2018 · 1 commit

Rewrite class Channel to implement buffered and unbuffered channels (#7915) · d082f3a9

Committed by Yi Wang on Jan 28, 2018

* Remove IsBounded as buffered channels have to be bounded

* Add derived classes Buffered and UnBuffered"

* Implement buffered and unbuffered channels

* Correct the syntax of Channel::Receive

* clang-format

* clang-format 3.8

* clang 3.8

d082f3a9

26 Jan, 2018 · 1 commit

New Run() method for framework::Executor (#7807) · 788f5c6d

Committed by kexinzhao on Jan 25, 2018

* initial commit

* add new executor run function

* fix bug

* fix multiple definition of feed_fetch_method issue

* fix cmake

* fix tensor copy error

* refine executor code

* add comments

* temporary modification

* address comments

* fix bug

788f5c6d

22 Jan, 2018 · 1 commit

Fix the cmake dependences. · dd5e8d6c

Committed by dangqingqing on Jan 22, 2018

dd5e8d6c

21 Jan, 2018 · 1 commit

Data type transform (#7653) · 85671b8a

Committed by Qiao Longfei on Jan 21, 2018

* init complete data layout transform

* can compile

* test passed

* optimize code

* fix while_grad_op first step loss lod problem

* optimize in out ptr for transform

* add check

* update copyright

* clean code

* add NeedTransformLayout

* add comment

* change the interface of data_type_transform

* init data_type_transform_test

* complete data_type_transform_test

* add TransDataType to data_transform

85671b8a

20 Jan, 2018 · 1 commit

Add cmake for extern project of boost. · 564c6abd

Committed by dangqingqing on Jan 20, 2018

564c6abd

19 Jan, 2018 · 1 commit

complete data layout transform (#7440) · 0071b5f7

Committed by Qiao Longfei on Jan 19, 2018

* add data layout transform and optimize the implementation of data_transform

0071b5f7

17 Jan, 2018 · 1 commit

add missing framework.pb.h and fix string install typo · c96b7e80

Committed by Luo Tao on Jan 17, 2018

c96b7e80

16 Jan, 2018 · 2 commits

Refine profiler and expose to Python. · d2a70243

Committed by dangqingqing on Jan 16, 2018

d2a70243

add paddle INSTALL for fluid api · 2be7cf90

Committed by Luo Tao on Jan 16, 2018

2be7cf90

12 Jan, 2018 · 1 commit

Add get lod for debug (#7375) · 23df6c44

Committed by Qiao Longfei on Jan 12, 2018

* add GetLoD for debug

* add LoDToString

* optimize if

* typo

* add lod_tensor to operator's dependency

23df6c44

10 Jan, 2018 · 1 commit

reorganize data transform related code (#7391) · 377424bf

Committed by Qiao Longfei on Jan 10, 2018

* init data_type_transform

* split data_layout_transform

* tmp rm data_transform_test

* change device_data_transform to data_device_transform

* clean code

* clean code

377424bf

08 Jan, 2018 · 1 commit

cpu gpu transform function (#7191) · 0f353ab4

Committed by Qiao Longfei on Jan 08, 2018

* add rename guard

* add device_data_transform

* add device_data_transform_test

* modify GetExpectedKernelType

* update operator.run

* support test test_label_semantic_roles

* optimize code

* optimize code

* rename GetActualKernelType to GetExpectedKernelType

* fix chunk_eval_op and device_data_transform_test

* add is_same_place to place

* optimize code, refine rename_guard

* refine rename guard, add GetKernelTypeForVar

* optimize code

* add some log

* rename guard

* use sub scope to create var

* fix compile

* add IsInitialized for Tensor

* add VarIsTensor

* fix op_registry_test

* test

* tmp disable priority

* restore switch_kernel.md

* code clean

0f353ab4

05 Jan, 2018 · 2 commits

Add COWPtr and its unittest · 0cfb5465

Committed by Yang Yu on Jan 05, 2018

It will be used for the LoD information in LoDTensor, since LoD is a
copy-on-write field.

Copying LoD information between operators is quite slow. For ResNet it
costs roughly 10% of the total time (data reading included).

0cfb5465
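
The copy-on-write idea this commit describes can be sketched as follows. This is a minimal, single-threaded illustration and not Paddle's actual `COWPtr`: the names `CowPtrSketch`, `MutableData`, `SharesWith`, and `LoDSketch` are hypothetical, and the representation of LoD as nested offset vectors is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical copy-on-write handle (illustration only, not thread-safe).
template <typename T>
class CowPtrSketch {
 public:
  explicit CowPtrSketch(T value)
      : data_(std::make_shared<T>(std::move(value))) {
    assert(data_ != nullptr);
  }

  // Read access never copies; copied handles share one payload.
  const T& Data() const { return *data_; }

  // Write access clones the payload first if another handle still shares it.
  T* MutableData() {
    if (data_.use_count() > 1) {
      data_ = std::make_shared<T>(*data_);
    }
    return data_.get();
  }

  // True while both handles point at the same payload (no copy made yet).
  bool SharesWith(const CowPtrSketch& other) const {
    return data_ == other.data_;
  }

 private:
  std::shared_ptr<T> data_;
};

// Assumed LoD-like value: one vector of sequence offsets per level.
using LoDSketch = std::vector<std::vector<std::size_t>>;
```

Copying a `CowPtrSketch<LoDSketch>` between operators then costs one reference-count increment; the nested vectors are duplicated only when some holder calls `MutableData()` while the payload is still shared.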

Feature/use cudnn (#7141) · 5593858d

Committed by dzhwinter on Jan 05, 2018

* "add c++ side kernel selection"

* "add multiple kernel op test"

* "kernel selection only support cudnn"

* "better formatter"

* "small fix with UseCPU"

* "depends on change interface Get(Place, Library)"

* "fix CI"

* "fix python cudnn test"

* "leave the register cudnn op to another PR"

* "fix CI"

* "use all kernel by default"

* "fix CI"

5593858d

04 Jan, 2018 · 1 commit

Update cmake of scope · e138bcf4

Committed by Yang Yu on Jan 04, 2018

e138bcf4

03 Jan, 2018 · 1 commit

fix shape_inference deps · 0a8775cc

Committed by tensor-tang on Jan 03, 2018

0a8775cc

02 Jan, 2018 · 1 commit

Feature/transform (#7111) · 899a79cc

Committed by dzhwinter on Jan 02, 2018

* "fix data transform"

* "data transformer"

* "add device pool"

* "add test"

* "fix ci"

* "fix datalayout implementation "

* "fix based on comment"

899a79cc

28 Dec, 2017 · 4 commits

Update tensor_util · 3158b4b3

Committed by Yang Yu on Dec 28, 2017

3158b4b3

Implement selectedrows serialize and deserialize (#7042) · 2cdef424

Committed by Yancey on Dec 28, 2017

* implement selectedrows serialize and deserialize

* make serialize/deserialize as global function

* recover send_imp.cc

* delete unused brackets

* fix compile error

* serialize version in LodTensor and SelecetedRows

* fix ci

* fix ci

2cdef424

Fix compile · a9a44e01

Committed by Yang Yu on Dec 28, 2017

a9a44e01

Fix compile · 878d2e91

Committed by Yang Yu on Dec 28, 2017

878d2e91

27 Dec, 2017 · 2 commits

"refine kernel registrar" (#6998) · 35c1683e

Committed by dzhwinter on Dec 27, 2017

* "refine kernel registrar"

* "refine registrar with multikey"

* "fix register"

* "refine multikernel register"

* "fix CI"

* "fix CI"

* "fix registry"

* "swtich GPU to CUDA"

* "add register macro test case"

* "fix CI"

35c1683e

add memory switch mechanism in operator kernel switch (#6991) · 94096ae5

Committed by QI JUN on Dec 27, 2017

* add memory switch mechanism in operator kernel switch

94096ae5

26 Dec, 2017 · 2 commits

Add data transform fn (#6953) · f97f69fe

Committed by Qiao Longfei on Dec 26, 2017

* init data_transform

* complete DataTransform

* fix build error

* add data_transform_test

* add a register test for data_transform_fn

* use function to simulate registration macro

* add register macro

* update test

* clean code

* restore unrelated code

* update data transform test

* generate unique name for REGISTER_DATA_TRANSFORM_FN

* add const

* follow comment

* update KernelTypePair hash function

f97f69fe

"fix threadpool style" (#7017) · 80dafdf5

Committed by dzhwinter on Dec 26, 2017

* "fix threadpool style"

* "remove header"

80dafdf5

25 Dec, 2017 · 2 commits

Implement a simple threadpool (#6684) · 127bc2e0

Committed by Yancey on Dec 25, 2017

* implement a simple threadpool

* unlock before cv.notify

* add done function

* add lock with GetAvailable function

* delete done_

* using call_once in GetInstance

* update by comment

* update comment

* enhance unit test for multi threads task

127bc2e0

add op_kernel_type_test · 313afc9c

Committed by qiaolongfei on Dec 25, 2017

313afc9c

24 Dec, 2017 · 1 commit

Feature/operator run place (#6783) · 735eba29

Committed by dzhwinter on Dec 24, 2017

* "change operator interface"

* "move devicepool to device_context"

* "fix operator test"

* "fix op_registry Run interface"

* "net op passed. Need to fix nccl multi-Context"

* "add nccl group function"

* "add nccl group function"

* "fix gpu count exceed 32 error"

* "fix recurrent op, nccl op"

* "change the other operators interface with Place"

* "fix typo"

* "fix pybind"

* "fix device in python side"

* "fix pybind failed"

* "add init for test"

* "fix CI"

735eba29

18 Dec, 2017 · 1 commit

Feature/global context (#6537) · 24fda392

Committed by dzhwinter on Dec 18, 2017

* "add DeviceContextPool"

* "add devicecontextpool in pybind"

* "add comments in python side "

* "fix static link error"

* "fix CI error"

* "add executor.py"

* "fix CI error"

* "add with gpu macro"

* "remove comment out codes"

* "add TODO items"

* "update init devices"

24fda392

26 Nov, 2017 · 1 commit

Feature/copytensor (#5455) · 45062fe5

Committed by dzhwinter on Nov 26, 2017

* "make global tensor function independently"

* "replace functor"

* "fix inline template error"

* "fix tensor array with CopyFrom"

* "fix other case use CopyFrom"

* "move the op interface hardly"

* "fix operators"

* "fix typo"

* "delete dynamic recurrent rnn and fix gru_unit in debugmode"

* "fix unique_ptr copy"

* "fix cuda copy"

* "fix namespace error"

* "removed nccl python test"

* "fix include error"

* "fix typo"

* fix copy util test

45062fe5

15 Nov, 2017 · 1 commit

fix gitignore (#5657) · 5f9f990e

Committed by QI JUN on Nov 14, 2017

* fix gitignore

* refine cmake file

5f9f990e

04 Nov, 2017 · 1 commit

Add LoDRankTable (#5349) · 74849158

Committed by Yu Yang on Nov 03, 2017

* Add LoDRankTable

LoD Rank Table stores the `level` of `lod`, ordered by sequence length
in descending order. It is useful when implementing dynamic RNN and is
shared by the dynamic RNN memory, dynamic RNN slice input, and dynamic
RNN slice output operators.

* Add InferVarType

74849158
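
The ordering described for LoDRankTable above can be sketched as follows. This is a minimal illustration, not the code from this commit: `RankItemSketch` and `BuildRankTableSketch` are hypothetical names, and the example assumes one LoD level is given as a vector of non-decreasing offsets, so that sequence `i` has length `offsets[i + 1] - offsets[i]`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rank-table entry.
struct RankItemSketch {
  std::size_t index;   // position of the sequence in the original batch
  std::size_t length;  // number of elements in that sequence
};

// Builds a rank table from one LoD level given as offsets, e.g. {0, 2, 5, 6}.
// Entries are sorted by length in descending order; sequences of equal length
// keep their original relative order because the sort is stable.
inline std::vector<RankItemSketch> BuildRankTableSketch(
    const std::vector<std::size_t>& offsets) {
  std::vector<RankItemSketch> items;
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    assert(offsets[i] <= offsets[i + 1]);  // offsets must be non-decreasing
    items.push_back({i, offsets[i + 1] - offsets[i]});
  }
  std::stable_sort(items.begin(), items.end(),
                   [](const RankItemSketch& a, const RankItemSketch& b) {
                     return a.length > b.length;
                   });
  return items;
}
```

With the longest sequences first, the number of sequences still active at time step `t` is simply the count of leading entries whose length exceeds `t`, which is one reason such a table is convenient for slicing dynamic RNN inputs.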

31 Oct, 2017 · 1 commit

Refine activation function pointer for LSTM operator. · 1c8a0c4b

Committed by dangqingqing on Oct 31, 2017

1c8a0c4b

29 Oct, 2017 · 1 commit

Extract InferShape to many cc files (#5174) · 8f6c0a0f

Committed by Yu Yang on Oct 28, 2017

* Shrink Operator.h

* Fix CI compile

8f6c0a0f

28 Oct, 2017 · 1 commit

Add debug logs in scope, meta_cache and memory (#5170) · 2a5edec0

Committed by Yu Yang on Oct 27, 2017

* Add debug logs in scope, meta_cache and memory

* Add missing deps

2a5edec0

27 Oct, 2017 · 2 commits

add sparse support for sum op (#5093) · 7f8574c0

Committed by QI JUN on Oct 26, 2017

* add sparse support for sum op

* typo fix

* fix gpu build error

* fix unittest error

* typo fix

* infer var type and shape in op_test

* follow comments

* fix build error

* bypass some unittests depend on NetOp

7f8574c0

Gradient check use graph (#5027) · be00b0c4

Committed by Yu Yang on Oct 26, 2017

* Simplize Gradient Check

* Stash

* Extract apply_backward_pass to backward.py

Rename apply_backward_pass to append_backward_ops

* Use graph API to check gradient

* Fix ci

* Fix CI

* Fix backward for double precision

* Stash

* Fix CI

* Fix ci

* Ignore GRU test

* Ignore xe op

* Fix CI

* Fix softmax with xe gradient

The correct equation should be IG = OG * (d_softmax_with_xe())

* Fix typo

* Fix merge error

* Disable LRN

be00b0c4
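
For the "Fix softmax with xe gradient" item in the commit above, the chain rule it refers to can be written out as below. This is a sketch under assumptions: IG and OG are read as the gradient with respect to the input logits and the incoming gradient of the loss output, labels are hard (a single class index `y`), and `d_softmax_with_xe()` is taken to be the derivative of the fused softmax plus cross-entropy loss.

```latex
p_i = \frac{e^{x_i}}{\sum_j e^{x_j}}, \qquad
\ell = -\log p_y, \qquad
\frac{\partial \ell}{\partial x_i} = p_i - \mathbb{1}[i = y]

\mathrm{IG}_i = \mathrm{OG} \cdot \frac{\partial \ell}{\partial x_i}
             = \mathrm{OG} \cdot \left(p_i - \mathbb{1}[i = y]\right)
```

Under this reading, the input gradient is the local derivative scaled by the upstream gradient, so it reduces to the local derivative alone only when OG equals 1.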

PaddlePaddle / Paddle · last synced successfully more than a year ago