Commits · 75618e4b5404044ea5446009969449bcc2d20376 · BaiXuePrincess / Paddle

26 Jun, 2018 1 commit

MKLDNN elementwise_add with default broadcast operations (#11544) · e26f51ce

Committed by Tomasz Patejko on Jun 26, 2018

* elementwise_add with bcast: Brian's implementation added, with default bcasts

* elementwise_add with bcast: GetExpectedKernelType added to elementwise_op

* elementwise_add with bcast: use_mkldnn attribute added

* elementwise_add with bcast: changes after review and some formatting

* elementwise_add with bcast: changes after style check

* elementwise_add with bcast: changes after style check cont.

* elementwise_add with bcast: MKLDNN unittests added

* elementwise_add with bcast: original unittests with use_mkldnn flag

* elementwise_add with bcast: handling of MKLDNN format corrected

* elementwise_add with bcast: setting MKLDNN format turned into lambda

* elementwise_add with bcast: MKLDNN format setting turned into separate function

* elementwise_add with bcast: condition for choosing MKLDNN simplified

* elementwise_add with bcast: fix for MKLDNN format set incorrectly in bcasts

* elementwise_add with bcast: changes in unittests for broadcasts

* elementwise_add with bcast: fixes in unittests regarding dimensions

* elementwise_add with bcast: bring back correct format setting in mklml grad path

* elementwise_add with bcast: fixed compilation error
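
The broadcast rule behind this commit can be illustrated with a small sketch. It assumes the commonly documented `elementwise_add` convention in which Y's shape must match a contiguous slice of X's shape starting at `axis`, with `axis=-1` meaning Y is aligned to X's trailing dimensions; the commit message itself does not spell this out. The helper name `padded_y_shape` is hypothetical and is not part of Paddle.

```python
def padded_y_shape(x_shape, y_shape, axis=-1):
    """Pad y_shape with 1s so that it lines up with x_shape.

    Simplified assumption: y_shape must equal x_shape[axis:axis + len(y_shape)].
    """
    x_shape, y_shape = tuple(x_shape), tuple(y_shape)
    if axis == -1:
        # Default: align Y with the trailing dimensions of X.
        axis = len(x_shape) - len(y_shape)
    if axis < 0 or axis + len(y_shape) > len(x_shape):
        raise ValueError("Y does not fit inside X at this axis")
    if x_shape[axis:axis + len(y_shape)] != y_shape:
        raise ValueError("y_shape must match a contiguous slice of x_shape")
    leading = (1,) * axis
    trailing = (1,) * (len(x_shape) - axis - len(y_shape))
    return leading + y_shape + trailing


# Y of shape (3, 4) placed at axis 1 of X with shape (2, 3, 4, 5)
print(padded_y_shape((2, 3, 4, 5), (3, 4), axis=1))
```

Once Y is viewed with the padded shape, the addition is an ordinary dimension-by-dimension broadcast: every size-1 dimension of Y is repeated to match X.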


25 Jun, 2018 1 commit
- fix sparse paraexe dist train · 254154a9
  Committed by yi.wu on Jun 25, 2018
22 Jun, 2018 4 commits
- add blocks attr type in proto · 8cb494f7
  Committed by Yancey1989 on Jun 22, 2018
- use optimize block list instead of first optimize block · 56a903d3
  Committed by Yancey1989 on Jun 22, 2018
- enhance ParallelExecutor stable (#11637) · da556ed6
  Committed by chengduo on Jun 22, 2018
- add print lod_tensor int64 option (#11644) · 073af623
  Committed by Kexin Zhao on Jun 21, 2018
21 Jun, 2018 6 commits
- Enhance Parallel Executor stable (#11634) · 7d26dd81
  Committed by chengduo on Jun 21, 2018
  ```
  * Fix Parallel Exe(VarHandle's version)

  * Fix broadcast

  * enhance ParallelExecutor stable
  ```
- fix mac compile · 964f515e
  Committed by fengjiayi on Jun 21, 2018
- Add No Mutex · c99fca5f
  Committed by chengduoZH on Jun 21, 2018
- Merge pull request #11608 from panyx0718/doc · 4b446ac3
  Committed by Xin Pan on Jun 21, 2018
  ```
  small thread-safety fix and doc improvements.
  ```
- Fix broadcast · 13de7238
  Committed by chengduoZH on Jun 21, 2018
- Fix Parallel Exe(VarHandle's version) · 28a86aeb
  Committed by chengduoZH on Jun 21, 2018
20 Jun, 2018 3 commits
- small thread-safety fix and doc improvements. · df31926f
  Committed by Xin Pan on Jun 20, 2018
- move dist codes from operators/detail to operators/distributed · 1ef6cdb6
  Committed by Yancey1989 on Jun 20, 2018
- fix compile warning · 7e6518e8
  Committed by Yancey1989 on Jun 20, 2018
19 Jun, 2018 1 commit

Fix decay bug (#11520) · a29cb4be

Committed by Qiyang Min on Jun 19, 2018

* Add sub_blocks of lr_decay_op to pserver_prog after distribute_transpiler

* Remove unused logs and logic

* 1. Add ops to new block (considering the nested block condition)
  2. Follow the original hierarchy of blocks
  3. Change the function's name and remove debug lines

16 Jun, 2018 2 commits
- update comment · 2b1ecdf5
  Committed by qiaolongfei on Jun 16, 2018
- add keep_kids flag for executor · daa0fbd5
  Committed by qiaolongfei on Jun 16, 2018
15 Jun, 2018 1 commit

Modify Pybind LoDTensor API according to length-based LoD (#11106) · 417fcf4f

Committed by Kexin Zhao on Jun 15, 2018

* add lod_tensor util and modify pybind

* refine pybind LoDTensor API and modify LoDTensor and DataFeeder test

* fix test error

* fix detection map op test

* fix reorder_lod_tensor test

* fix seq_concat_op

* fix chunk eval op test

* fix target assign op

* fix warp ctc op

* address comments step 1: reverse reset_lod op

* step 2: modify op test

* add warning message

* remove has_valid_lod

* add back has_valid_lod

* address comments

* add exception catching trial
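
As a rough illustration of what "length-based LoD" means, the sketch below converts between a length-based nested list and an offset-based one. It assumes the offset-based form stores cumulative start positions per level, beginning at 0; the function names are hypothetical and are not the Pybind API touched by this commit.

```python
def lengths_to_offsets(length_lod):
    """Convert each level from sequence lengths to cumulative offsets."""
    offset_lod = []
    for level in length_lod:
        offsets = [0]
        for n in level:
            # Each offset is the running total of the lengths seen so far.
            offsets.append(offsets[-1] + n)
        offset_lod.append(offsets)
    return offset_lod


def offsets_to_lengths(offset_lod):
    """Inverse conversion: differences of neighbouring offsets."""
    return [[b - a for a, b in zip(level, level[1:])] for level in offset_lod]


# Two sequences of lengths 2 and 3 in a single-level LoD.
print(lengths_to_offsets([[2, 3]]))
```

The length-based form is usually easier to write by hand, since each entry is simply the size of one sequence, while the offset-based form gives direct slice boundaries.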


14 Jun, 2018 3 commits

initial with only 1 mkl/openblas threads for each pthreads · 3e58df20

Committed by tensor-tang on Jun 14, 2018

Fix NCCLBcast hang up bug in Parallel Executor (#11377) · 046bb5c8

Committed by Qiyang Min on Jun 13, 2018

* 1. Create buddy allocator in each place before NcclBcast the variables
  2. Check the memory usage of ALL gpus rather than the first one

* 1. Make NCCLGroupGuard guard only the ncclBcast part, which avoids ncclGroupEnd blocking the exception throwing
  2. NOTE the usage of NCCLGroupGuard

* Remove the memory usage check of gpus

* Fix code style

Dynamic Graph first prototype (#11415) · d827c6e8

Committed by Yang Yang(Tony) on Jun 13, 2018

13 Jun, 2018 3 commits
- add row_size for selected rows in DebugStringEx · 7ebef493
  Committed by qiaolongfei on Jun 13, 2018
- fix concurrency_test build error on mac · 82416f18
  Committed by qiaolongfei on Jun 13, 2018
- fix build on mac · 9ebbfa6b
  Committed by qiaolongfei on Jun 13, 2018
12 Jun, 2018 4 commits
- update by comment · f52d78d1
  Committed by Yancey1989 on Jun 12, 2018
- throw warning if try to use mkldnn while not compiled · 6602db5b
  Committed by tensor-tang on Jun 12, 2018
- use get_appropriate_dev to schedule rpc op · 6d752baf
  Committed by Yancey1989 on Jun 12, 2018
- Trainer send term signal (#11220) · 34865f2d
  Committed by Wu Yi on Jun 12, 2018
  ```
  * wip

  * use executor.complete to end trainer

  * fix build

  * fix build with distribute off

  * fix typo

  * fix cmake typo

  * fix build
  ```
11 Jun, 2018 11 commits
- fix build problem · 83a577e8
  Committed by qiaolongfei on Jun 11, 2018
- add inplace attribute to op_proto_maker (#10665) · bfa3fd6f
  Committed by dzhwinter on Jun 11, 2018
  ```
  * "add inplace attribute"

  * "register inplace attribute"

  * "change se-next model for memory-reuse"

  * "fix typo"

  * repick

  * fix merge conflict

  * "fix stupid error"
  ```
- polish (#11363) · 9087c668
  Committed by gongweibao on Jun 11, 2018
- replace use_event with use_cuda, because use_event means the program running... · aadaadf7
  Committed by chengduoZH on Jun 11, 2018
  ```
  replace use_event with use_cuda, because use_event means the program running with CUDA, so use_cuda may be more intuitive.
  ```
- Clean `sendop` `recv` operator. (#11309) · 627d7a64
  Committed by gongweibao on Jun 11, 2018
- follow comments · 961fbce8
  Committed by chengduoZH on Jun 11, 2018
- Add cpu test for parallel_executor_crf executor_fetch_feed, and enable these tests · 7b723839
  Committed by chengduoZH on Jun 11, 2018
- fix allReduce bug · d24e046c
  Committed by chengduoZH on Jun 11, 2018
- add cpu test · a57e8a43
  Committed by chengduoZH on Jun 11, 2018
- add more debug string · 0485405b
  Committed by qiaolongfei on Jun 11, 2018
- Add comments to a singleton. (#11333) · 062d5a56
  Committed by gongweibao on Jun 11, 2018

BaiXuePrincess / Paddle is in sync with its fork source