Commits · a873fa84ceca411a5a776ff8ae303f8be24df95a · Crayon鑫 / Paddle

02 Jul, 2019 · 4 commits

supports collective training with programs (#18392) · a873fa84

Committed by Yi Liu on Jul 02, 2019

1. Since the allreduce op has four reduce types, we split them into four separate ops.
2. We also refined the collective op code: for example, we separated the collective op kernel into a CPUKernel and a CUDAKernel, and removed the device-specific DeviceContext template parameter, since the target DeviceContext is already known.
3. We removed the newly added Collective op role to reduce the complexity of program and graph analysis.
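
As a rough illustration of item 1, the sketch below simulates the split in plain Python: instead of one allreduce with a reduce-type switch, each reduce type gets its own function. The commit message does not list the four types; sum, prod, max, and min are assumed here. The function names and the list-of-lists model of "ranks" are illustrative stand-ins, not Paddle's implementation.

```python
from functools import reduce
import operator


def _allreduce(rank_buffers, combine):
    """Combine buffers elementwise across ranks; every rank gets the same result."""
    reduced = [reduce(combine, column) for column in zip(*rank_buffers)]
    return [list(reduced) for _ in rank_buffers]


# One function per (assumed) reduce type, instead of a single op with a type attribute.
def allreduce_sum(rank_buffers):
    return _allreduce(rank_buffers, operator.add)


def allreduce_prod(rank_buffers):
    return _allreduce(rank_buffers, operator.mul)


def allreduce_max(rank_buffers):
    return _allreduce(rank_buffers, max)


def allreduce_min(rank_buffers):
    return _allreduce(rank_buffers, min)


if __name__ == "__main__":
    # Two simulated ranks, each holding a three-element buffer.
    buffers = [[1, 5, 2], [4, 3, 6]]
    print(allreduce_sum(buffers))
    print(allreduce_max(buffers))
```

Splitting by type means each function (or op) carries no runtime branch on the reduce type, which is one plausible reading of why the commit made the change.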

fix the api.spec file does not get the class comment problem (#18439) · 85b49d84
Committed by tianshuo78520a on Jul 02, 2019
```
* fix the api.spec file does not get the class comment problem

* cat new.spec

* check api.spec

* test=develop
```
make fleet support mpi job submit directly (#18441) · 357311fd
Committed by guru4elephant on Jul 02, 2019
```
make fleet support mpi job submit directly.
```
Add find_no_grad_vars in backward.py (#17942) · e0d8c6ac
Committed by chengduo on Jul 02, 2019
```
* add not_been_used_vars to no_grad_set
test=develop
```

01 Jul, 2019 · 9 commits
- Make roi_perspective_transform op return mask and transform matrix (#18371) · 449c7a9f
  Committed by LielinJiang on Jul 01, 2019
```
* modify roi_perspective_transform_op to output mask and transform matrix

* modify comment

* modify comment

* modify API.spec

* update API.spec

* remove no use header, test=develop

* resolve conflict
```
- update README_cn.md with latest version · 99659a96
  Committed by XiaoguangHu on Jul 01, 2019
```
update README_cn.md with latest version
```
- Update README.md with latest version · 2246f7c1
  Committed by XiaoguangHu on Jul 01, 2019
```
update README.md with latest release & install version
```
- fix mac ci random fail (#18430) · a3bc804f
  Committed by tensor-tang on Jul 01, 2019
```
* fix mac ci random fail
* use platform instead
```
- Fix Pooling output scale (#18186) · 7023a86c
  Committed by Michał Gallus on Jul 01, 2019
```
* Int8: Fix Pooling output scale

test=develop

* Update scales quantization for certain operators

These include: concat, transpose, pool and reshape. test=develop

* Move concat minimum scale finding to quantizer

test=develop
```
- Fix bug in quantize kernel which cause crash in vgg16/19 model (#17964) · 4bc2987d
  Committed by Brian Liu on Jul 01, 2019
```
* Fix bug in quantize kernel which cause crash in vgg16/19 model

test=develop

* refine the code to reduce verbose code; test=develop

* remove useless code; test=develop
```
- replace mnist dataset url, test=develop (#18429) · dd3f9d19
  Committed by xiaoting on Jul 01, 2019
```
replace mnist dataset url
```
- add "import paddle.fluid as fluid" to examples that lack it · 47e2ef38
  Committed by xsrobin on Jul 01, 2019
  
- test=develop (#18426) · 92ecb305
  Committed by tianshuo78520a on Jul 01, 2019
  
30 Jun, 2019 · 1 commit
- update api format (#18413) · 8a39e5c1
  Committed by hutuxian on Jun 30, 2019
```
* update api format
test=develop

* update API.spec
test=develop
```
29 Jun, 2019 · 3 commits
- fix data feed ptr error (#18419) · 93a2b317
  Committed by jiaqi on Jun 29, 2019
```
Fix a data feed pointer runtime error: the pipeline trainer could core dump in some cases, so the pointer now defaults to nullptr.
```
- update pslib library path (#18415) · ef81ff74
  Committed by guru4elephant on Jun 29, 2019
```
change url of pslib.tar.gz
```
- fix py-cpuinfo mac random fail (#18383) · ce7a024c
  Committed by tensor-tang on Jun 29, 2019
```
* fix py-cpuinfo mac random fail
* differentiate version on windows
```
28 Jun, 2019 · 10 commits
- init custom black white list (#18377) · 2b4ef509
  Committed by Jie Fang on Jun 28, 2019
```
test=develop
```
- fix ci document_preview job error (#18399) · b9630799
  Committed by tianshuo78520a on Jun 28, 2019
```
* test=develop

* test=develop
```
- add MultiSlotStringDataGenerator for speedup of string based user inp… (#18390) · e83f902b
  Committed by guru4elephant on Jun 28, 2019
```
* add MultiSlotStringDataGenerator for speedup of string based user input data
```
- Fix potential mkldnn concat/pool/conv kernel issues (#18393) · 681d3553
  Committed by Leo Zhao on Jun 28, 2019
```
1. Some key generation methods are not aligned with PR#17965.
2. Extend the ptr lifetime to avoid releasing memory if SetBlob fails;
   otherwise it will core dump.

test=develop
```
- Fix mac build nproc command not found (#18362) · 052b0448
  Committed by tianshuo78520a on Jun 28, 2019
```
* change nproc 8
```
- Add a unittest to inplace elementwise_add (#18385) · f5641000
  Committed by Zeng Jinle on Jun 28, 2019
```
* add_elementwise_add_inplace_test,test=develop

* rename file, test=develop
```
- Fix/program doc (#17908) · 43f64a17
  Committed by Jiabin Yang on Jun 28, 2019
```
* test=develop, add some comments for Program.clone

* test=develop, add API.spec

* test=develop, refine comments

* refine Program doc and clone doc

* test=develop, refine doc
```
- test=develop, fix multigpu hang on latest docker (#18379) · af874a1f
  Committed by Jiabin Yang on Jun 28, 2019
  
- Add is_compiled_with_cuda (#18356) · 871cc15e
  Committed by chengduo on Jun 28, 2019
```
*  add cuda_is_available
test=develop

* Fix api.spec
test=develop

* fix api doc
test=develop
```
- Call the test_slim_int8_* tests through absolute path (#18386) · 8ed819d8
  Committed by Wojciech Uss on Jun 28, 2019
```
test=develop
```
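
For context on #18356 in the list above, here is a minimal sketch of how a script might use the new check to choose a device. It assumes the check is exposed as `fluid.is_compiled_with_cuda()` (the commit title names the function but not its module path), and `pick_device` is a hypothetical helper. The fallback branch exists only so the snippet also runs where Paddle is not installed.

```python
def pick_device():
    """Return "gpu" if the installed Paddle build has CUDA support, else "cpu"."""
    try:
        import paddle.fluid as fluid  # assumed location of the check
        return "gpu" if fluid.is_compiled_with_cuda() else "cpu"
    except (ImportError, AttributeError):
        # Paddle is missing, or this build lacks the function: stay on CPU.
        return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

A check like this reports how the package was built, not whether a usable GPU is present at runtime, so a script may still need to handle device initialization failures separately.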
27 Jun, 2019 · 12 commits

Fix dygraph show style (#18297) · fd6631ef
Committed by lujun on Jun 27, 2019
```
Fix dygraph show style for FluidDoc.
```
add dependency of collective_helper (#18365) · 9931bc64
Committed by HaoRen on Jun 27, 2019
```
* add dependency of collective_helper

* test=develop
fix dependency of collective_helper
```
Remove all the code, API and doc of MKL-DNN INT8v1 (#18347) · 19da59ed
Committed by 翟飞跃 on Jun 27, 2019

Fix Bug-prone code of PE (#18354) · 8ed33bf9

Committed by chengduo on Jun 27, 2019

* update pe reduce config
test=develop

*  drop the local_exe_scopes of the previous parallel_executor
test=develop

fix communicator with pyreader (#18350) · 999d9a59
Committed by tangwei12 on Jun 27, 2019
```
* add is_runnning in communicator, test=develop
```
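add combine_avx_noavx build to dockerfile · cff2c2d8

Committed by tianshuo78520a on Jun 27, 2019

A Dockerfile needs to be generated for the avx_noavx build.
After building the whl with the combine_avx_noavx option, the image could not be built because no Dockerfile had been generated, so an option to generate the Dockerfile is needed.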

add WITH_COVERAGE option, default OFF (#17872) · 27fb9cad

Committed by kh2se2013 on Jun 27, 2019

* add WITH_COVERAGE option, default OFF

test=develop

* add coverage for python sdk

test=develop

* fix code style

* fix COVERAGE_FILE path

test=develop

* remove coverage package

test=develop

* test = develop, run coverage as module

Reset DeviceContext after quantization warmup (#18182) · 84096932
Committed by Michał Gallus on Jun 27, 2019
```
test=develop
```
supports collective communicated training (#18175) · b7128bac

Committed by HaoRen on Jun 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_variables
test=develop

* supports collective training in executor

* make fetch_list runnable with variables, add more unittest for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plug in more strategies

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* macro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dump on py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O

add int8 mkldnn prior_box (#17242) · 9252e8fa
Committed by Sylwester Fraczek on Jun 27, 2019
```
add prior_box quantization code

add scale algo rules for prior box

test=develop
```
some fixes for int8 mobilenet_ssd tester (#18112) · 5fd68ac1

Committed by lidanqing on Jun 27, 2019

* some fixes for int8 mobilenet_ssd tester
test=develop

* change wrong data file name
test=develop

* change test images bin file from 200 images to 100 images

* change directory existence to file existence during downloading
test=develop

* reuse download_data
test=develop

* run full dataset when iterations=0
test=develop
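
The "run full dataset when iterations=0" rule above can be sketched as follows. This is an assumed reading of the commit message, not the tester's actual code: `batches_to_run` and its parameters are hypothetical names, and capping at the dataset size is an added assumption.

```python
def batches_to_run(total_batches, iterations):
    """Number of batches to evaluate; iterations == 0 means the whole dataset."""
    if iterations < 0:
        raise ValueError("iterations must be non-negative")
    if iterations == 0:
        return total_batches
    # Assumption: never request more batches than the dataset provides.
    return min(iterations, total_batches)


if __name__ == "__main__":
    print(batches_to_run(100, 0))
    print(batches_to_run(100, 10))
```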

[MKL-DNN] Extending reusing to Elementwise_add_mkldnn op (#18146) · c2efdfd5

Committed by Jacek Czaja on Jun 27, 2019

* - Reusing of reorder used in elementwise_add_mkldnn

- Added MKL-DNN sum prim reusing

test=develop

- Compilation fixes

test=develop

- Yet another compilation fix

test=develop

- Yet another compilation fix

test=develo

- Yet another linking fix

test=develop

- Final compilation fix

test=develop

- lint fixes

test=develop

- Lint fixes

test=develop

* - Fixes after review

test=develop

26 Jun, 2019 · 1 commit
- Simplify multi_box_head API in detection.py and remove assign op. (#18310) · 9047ac68
  Committed by qingqing01 on Jun 26, 2019
```
* Simplify multi_box_head API in detection.py and remove assign op.
```