提交 · a873fa84ceca411a5a776ff8ae303f8be24df95a · BaiXuePrincess / Paddle

02 7月, 2019 3 次提交

supports collective training with programs (#18392) · a873fa84

由 Yi Liu 提交于 7月 02, 2019

1. Since allreduce op has 4 reduce types, We split these four reduce types into four ops
2. We also refined the collective op code, e.g. we separated the collective op kernel into CPUKernel and CUDAKernel, and remove the device specified DeviceContext parameter in template as we already knew the target DeviceContext
3. We remove the newly added Collective op role to reduce the complexity of program and graph analysis

a873fa84

T
fix the api.spec file does not get the class comment problem (#18439) · 85b49d84
由 tianshuo78520a 提交于 7月 02, 2019
```
* fix the api.spec file does not get the class comment problem

* cat new.spec

* check api.spec

* test=develop
```
85b49d84
C
Add find_no_grad_vars in backward.py (#17942) · e0d8c6ac
由 chengduo 提交于 7月 02, 2019
```
* add not_been_used_vars to no_grad_set
test=develop
```
e0d8c6ac

01 7月, 2019 4 次提交

Make roi_perspective_transform op return mask and transform matrix (#18371) · 449c7a9f

由 LielinJiang 提交于 7月 01, 2019

* modify roi_perspective_transform_op to output mask and transform matrix

* modify comment

* modify comment

* modify API.spec

* update API.spec

* remove no use header, test=develop

* resolve conflict

449c7a9f

Fix Pooling output scale (#18186) · 7023a86c

由 Michał Gallus 提交于 7月 01, 2019

* Int8: Fix Pooling output scale

test=develop

* Update scales quantization for certain operators

These include: concat, transpose, pool and reshape. test=develop

* Move concat minimum scale finding to quantizer

test=develop

7023a86c

Fix bug in quantize kernel which cause crash in vgg16/19 model (#17964) · 4bc2987d

由 Brian Liu 提交于 7月 01, 2019

* Fix bug in quantize kernel which cause crash in vgg16/19 model

test=develop

* refine the code to reduce verbose code; test=develop

* remove useless code; test=develop

4bc2987d

X

add "import paddle.fluid as fluid" to examples lack of it · 47e2ef38
由 xsrobin 提交于 7月 01, 2019

47e2ef38

30 6月, 2019 1 次提交
- H
  update api format (#18413) · 8a39e5c1
  由 hutuxian 提交于 6月 30, 2019
```
* update api format
test=develop

* update API.spec
test=develop
```
  8a39e5c1
29 6月, 2019 1 次提交

fix data feed ptr error (#18419) · 93a2b317

由 jiaqi 提交于 6月 29, 2019

fix data feed ptr runtime error, pipeline trainer will core in some cases, so set it nullptr as default value.

93a2b317

28 6月, 2019 5 次提交

J
init custom black white list (#18377) · 2b4ef509
由 Jie Fang 提交于 6月 28, 2019
```
test=develop
```
2b4ef509

Fix potential mkldnn concat/pool/conv kernel issues (#18393) · 681d3553

由 Leo Zhao 提交于 6月 28, 2019

1. some key generation method is not aligned with PR#17965
2. enlarge ptr lifetime to avoid memory release if SetBlob fails
   otherwise it will get core dump.

test=develop

681d3553

Z
Add a unittest to inplace elementwise_add (#18385) · f5641000
由 Zeng Jinle 提交于 6月 28, 2019
```
* add_elementwise_add_inplace_test,test=develop

* rename file, test=develop
```
f5641000

Fix/program doc (#17908) · 43f64a17

由 Jiabin Yang 提交于 6月 28, 2019

* test=develop, add some comments for Program.clone

* test=develop, add API.spec

* test=develop, refine comments

* refine Program doc and clone doc

* test=develop, refine doc

43f64a17

Add is_compiled_with_cuda (#18356) · 871cc15e

由 chengduo 提交于 6月 28, 2019

*  add cuda_is_available
test=develop

* Fix api.spec
test=develop

* fix api doc
test=develop

871cc15e

27 6月, 2019 10 次提交

L
Fix dygraph show style (#18297) · fd6631ef
由 lujun 提交于 6月 27, 2019
```
Fix dygraph show style for FluidDoc.
```
fd6631ef
H
add dependecy of collective_helper (#18365) · 9931bc64
由 HaoRen 提交于 6月 27, 2019
```
* add dependecy of collective_helper

* test=develop
fix dependecy of collective_helper
```
9931bc64
翟

Remove all the code, API and doc of MKL-DNN INT8v1 (#18347) · 19da59ed
由翟飞跃提交于 6月 27, 2019

19da59ed

Fix Bug-prone code of PE (#18354) · 8ed33bf9

由 chengduo 提交于 6月 27, 2019

* update pe reduce config
test=develop

*  drop the local_exe_scopes of the previous parallel_executor
test=develop

8ed33bf9

T
fix communicator with pyreader (#18350) · 999d9a59
由 tangwei12 提交于 6月 27, 2019
```
* add is_runnning in communicator, test=develop
```
999d9a59
M
Reset DeviceContext after quantization warmup (#18182) · 84096932
由 Michał Gallus 提交于 6月 27, 2019
```
test=develop
```
84096932

supports collective communicated training (#18175) · b7128bac

由 HaoRen 提交于 6月 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runable with variables, add more unittest for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runable with variables, add more unittest for use_program_cache
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* fix comment
test=develop

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* marcro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dumple  py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O

b7128bac

S
add int8 mkldnn prior_box (#17242) · 9252e8fa
由 Sylwester Fraczek 提交于 6月 27, 2019
```
add prior_box quantization code

add scale algo rules for prior box

test=develop
```
9252e8fa

some fixes for int8 mobilenet_ssd tester (#18112) · 5fd68ac1

由 lidanqing 提交于 6月 27, 2019

* some fixes for int8 mobilenet_ssd tester
test=develop

* change wrong data file name
test=develop

* change test images bin file from 200 images to 100 images

* change directory existence to file existence during downloading
test=develop

* reuse download_data
test=develop

* run full dataset when iterations=0
test=develop

5fd68ac1

[MKL-DNN] Extending reusing to Elementwise_add_mkldnn op (#18146) · c2efdfd5

由 Jacek Czaja 提交于 6月 27, 2019

* - Reusing of reuder used in elementwise_add_mkldnn

- Added MKL-DNN sum prim reusing

test=develop

- Compilation fixes

test=develop

- Yet another compilation fix

test=develop

- Yet another compilation fix

test=develo

- Yet another linking fix

test=develop

- Final compilation fix

test=develop

- lint fixes

test=develop

- Lint fixes

test=develop

* - Fixes after review

test=develop

c2efdfd5

26 6月, 2019 5 次提交
- Q
  Simplify multi_box_head API in detection.py and remove assign op. (#18310) · 9047ac68
  由 qingqing01 提交于 6月 26, 2019
```
* Simplify multi_box_head API in detection.py and remove assign op.
```
  9047ac68
- Z
  Refine CUDAPlace error message. (#18343) · 5826b72e
  由 Zeng Jinle 提交于 6月 26, 2019
```
* refine cuda place error msg, test=develop

* use LOG(ERROR)+exit(-1), test=develop
```
  5826b72e
- T
  remove unused jemalloc option (#18314) · 3c9755bb
  由 Tao Luo 提交于 6月 26, 2019
```
test=develop
```
  3c9755bb
- Y
  Update lamb optimizer (#18333) · 23941e43
  由 Yibing Liu 提交于 6月 26, 2019
```
* Update lamb optimizer

test=develop, test=document_preview

* Regenerate api spec

test=develop, test=document_preview
```
  23941e43
- C
  update reduce config (#18334) · 135a59ed
  由 chengduo 提交于 6月 26, 2019
```
test=develop
```
  135a59ed
25 6月, 2019 5 次提交

T
fix softrelu doc (#18324) · 81ec5382
由 tensor-tang 提交于 6月 25, 2019
```
* fix softrelu doc

test=develop

* update API doc

test=develop
```
81ec5382

Sequence mask support tensor (#18249) · df2eee71

由 Hongyu Liu 提交于 6月 25, 2019

* sequnce mask support max length tensor input; test=develop

* add rnn_impl.py; test=develop

* add basic gru lstm unittest; test=develop

* fix api spec; test=develop

* fix sequence_mask op bug;
test=develop
test=document_preview

* change +-*x to elmentwise_op; test=develop

* add mkl flag; test=develop

* fix rnn impl bug; test=develop

* update api spec; test=develop

* fix doc bug; test=develop

* fix lstm bugs; test=develop

df2eee71

optimize communicator merge sparse gradient test=develop (#18159) · 0e08e91c

由 Qiao Longfei 提交于 6月 25, 2019

* optimize communicator merge sparse gradient test=develop

* revert multithread selected rows merge add test=develop

* follow comment test=develop

0e08e91c

C
Fix default value of fluid.memory_optimize (#18295) · e06c69c7
由 chengduo 提交于 6月 25, 2019
```
* fix default value of fluid.memory_optimize
test=develop

* fix api.spec
test=develop
```
e06c69c7
Z
fix split and sampled softmax (#18280) · 6978b2e4
由 Zhaolong Xing 提交于 6月 25, 2019
```
test=develop
```
6978b2e4

24 6月, 2019 5 次提交
- Y
  Fix the bug of sequence_unpad op (#18290) · f57ee369
  由 Yibing Liu 提交于 6月 24, 2019
```
* Use TensorCopySync for sequence_unpad op

test=develop

* Fix the tensor memory alloc bug

test=develop
```
  f57ee369
- C
  Clean build strategy (#18148) · 5489216e
  由 chengduo 提交于 6月 24, 2019
```
* clean build_strategy
test=develop

* DataBalanceOpHandle has been removed
test=develop

* debug

* update build_strategy.
test=develop
```
  5489216e
- C
  update alloc_continuous_space_for_grad_pass (#18287) · 14e1e165
  由 chengduo 提交于 6月 24, 2019
```
test=develop
```
  14e1e165
- L
  add Dygraph api to api.spec (#18235) · 7e61baaa
  由 lujun 提交于 6月 24, 2019
```
add Dygraph api to api.spec
```
  7e61baaa
- L
  improve doc of lstm, sequence_enumerate, softmax_with_cross_entropy, space_to_depth APIs (#18261) · a736c03b
  由 liuwei1031 提交于 6月 24, 2019
```
* improve doc of lstm, sequence_enumerate, softmax_with_cross_entropy, space_to_depth APIs, test=develop

* update API.spec, test=develop
```
  a736c03b
22 6月, 2019 1 次提交
- F
  fix double buffer example (#18169) · fdf798f9
  由 flame 提交于 6月 22, 2019
```
test=develop
test=document_preview
```
  fdf798f9

BaiXuePrincess / Paddle 与 Fork 源项目一致

BaiXuePrincess / Paddle
与 Fork 源项目一致