Commits · b7128bac5f12138062ec2518a0f856915c752a69 · Crayon鑫 / Paddle

27 Jun, 2019 1 commit

supports collective communicated training (#18175) · b7128bac

Committed by HaoRen on Jun 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_variables
test=develop

* supports collective training in executor

* make fetch_list runnable with variables, add more unittest for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* macro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dump py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O
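
The "scale loss grad in collective sgd transpiler" item above can be illustrated with a small simulation. This is a minimal sketch, not the transpiler's actual code: it assumes the scale factor is 1/nranks and that gradients are combined with a sum-allreduce, neither of which the commit message spells out. The helper names `allreduce_sum` and `scale_loss_grad` are hypothetical. Because backpropagation is linear in the initial loss gradient, scaling that gradient scales every parameter gradient by the same factor, so the sketch scales the parameter gradients directly.

```python
from typing import List


def allreduce_sum(per_rank_grads: List[List[float]]) -> List[float]:
    """Simulate a sum-allreduce: every rank receives the element-wise sum."""
    return [sum(values) for values in zip(*per_rank_grads)]


def scale_loss_grad(grad: List[float], nranks: int) -> List[float]:
    """Scale one rank's gradient by 1/nranks (assumed factor) before communication."""
    return [g / nranks for g in grad]


# Hypothetical gradients computed independently on 4 ranks.
local_grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
nranks = len(local_grads)

# Scaling first means the sum-allreduce yields the mean gradient across ranks.
scaled = [scale_loss_grad(g, nranks) for g in local_grads]
averaged = allreduce_sum(scaled)
print(averaged)
```

Scaling before the allreduce keeps the communication step a plain sum, which is the reduction collective libraries typically provide.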

25 Jun, 2019 1 commit
- Fix default value of fluid.memory_optimize (#18295) · e06c69c7
  Committed by chengduo on Jun 25, 2019
```
* fix default value of fluid.memory_optimize
test=develop

* fix api.spec
test=develop
```
31 May, 2019 1 commit
- fix document of python api get_startup_program() (#17764) · 659b72a9
  Committed by tangwei12 on May 31, 2019
```
* add example to get_startup_program()
* fix example to get_startup_program()
```
30 May, 2019 1 commit
- fix distributed_transpiler.py api test=develop (#17668) · ac92e4c0
  Committed by yaoxuefeng on May 30, 2019
29 May, 2019 2 commits
- fix 2dconn test=develop (#17681) · 0d561ef4
  Committed by gongweibao on May 29, 2019
- fix doc in transpiler, test=develop (#17313) · 0d3c48e0
  Committed by tangwei12 on May 29, 2019
```
* fix doc in transpiler, test=develop
```
27 May, 2019 1 commit
- Add multi-ncclcomm and 2D ncclallreduce support. (#17263) · 65bbf950
  Committed by gongweibao on May 27, 2019
24 May, 2019 1 commit

[MKL-DNN] Add Fully Connected Op for inference only (#15226) · 0c39b97b

Committed by Michał Gallus on May 24, 2019

* fuse mul and elementwise add to fc

* Reimplement the FC forward operator

* Fix FC MKLDNN integration by transposing weights

* Add FC MKLDNN Pass

test=develop

* FC MKLDNN Pass: change memcpy to std::copy

* Fix MKLDNN FC handling of mismatch input and weights dims

* Lower tolerance for MKL-DNN in resnet50 test

test=develop

* Adjust FC to support MKLDNN Op placement

test=develop

* Adjust Placement Op to set use_mkldnn attribute for graph

test=develop

* MKLDNN FC: fix weights format so that gemm version is called

test=develop

* FC MKLDNN: Remove tolerance decrease from tester_helper

* FC MKL-DNN: Refactor the code, change input reorder to weight reorder

* MKL-DNN FC: Introduce operator caching

test=develop

* FC MKL-DNN: Fix the tensor type in ExpectedKernelType

test=develop

* FC MKL-DNN: fix style changes

test=develop

* FC MKL-DNN: fallback to native on non-supported dim sizes

test=develop

* FC MKLDNN: fix CMake paths

test=develop

* FC MKLDNN: Refine placement pass graph mkldnn attribute

test=develop

* Fix Transpiler error for fuse_conv_eltwise

test=develop

* Fix missing STL includes in files

test=develop

* FC MKL-DNN: Enable new output size computation

Also, refine pass to comply with newest interface.
test=develop

* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled

* FC MKL-DNN: Allow Weights to use oi or io format

* FC MKL-DNN: Adjust UT to work with correct dims

test=develop

* Enable MKL DEBUG for resnet50 analyzer

test=develop

* FC MKL-DNN: Improve Hashing function

test=develop

* FC MKL-DNN: Fix shape for fc weights in transpiler

* FC MKL-DNN: Update input pointer in re-used fc primitive

* Add log for not handling fc fuse for unsupported dims

test=develop

* FC MKL-DNN: Move transpose from pass to Op Kernel

test=develop

* FC MKL-DNN: Disable transpose in unit test

test=develop

* FC MKL-DNN: Remove fc_mkldnn_pass from default list

* Correct Flag for fake data analyzer tests

test=develop

* FC MKL-DNN: Add comment about fc mkldnn pass disablement

test=develop

* FC MKL-DNN: Disable fc in int8 tests

test=develop
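
The first item of this commit, "fuse mul and elementwise add to fc", relies on a fully connected op computing the same result as a matrix multiply followed by a bias add. The following is a minimal pure-Python sketch of that equivalence, not the MKL-DNN kernel or the fuse pass itself. The functions `mul`, `elementwise_add` and `fc` are hypothetical stand-ins for the operators named in the commit message, and the data is invented for illustration.

```python
from typing import List

Matrix = List[List[float]]


def mul(x: Matrix, w: Matrix) -> Matrix:
    """Plain matrix multiply: (batch x in) times (in x out)."""
    columns = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in x]


def elementwise_add(y: Matrix, bias: List[float]) -> Matrix:
    """Add the bias vector to every row of y."""
    return [[value + b for value, b in zip(row, bias)] for row in y]


def fc(x: Matrix, w: Matrix, bias: List[float]) -> Matrix:
    """Fused version: accumulate the dot product starting from the bias."""
    out = []
    for row in x:
        out_row = []
        for j, b in enumerate(bias):
            acc = b
            for k, x_value in enumerate(row):
                acc += x_value * w[k][j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical input (2 x 2), weights (2 x 3) and bias (3).
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
bias = [0.5, -0.5, 1.0]

unfused = elementwise_add(mul(x, w), bias)
fused = fc(x, w, bias)
print(unfused == fused)
```

The fused form avoids materializing the intermediate matrix product, which is one common motivation for such a fusion.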

23 May, 2019 2 commits
- fix distribute doc test=develop (#17318) · 92e7d5d7
  Committed by Qiao Longfei on May 23, 2019
```
* fix distribute doc
```
- Async exe support communicator (#17386) · 58f7695a
  Committed by Qiao Longfei on May 23, 2019
```
Async exe support communicator
```
20 May, 2019 1 commit
- improve the doc of paddle.fluid.memory_optimize, test=develop (#17473) · f82e4d75
  Committed by liuwei1031 on May 20, 2019
```
* improve the doc of paddle.fluid.memory_optimize, test=develop

* fix typo, test=develop
```
16 May, 2019 1 commit

improve the API Sample of DataFeeder, memory_optimize and release_memory (#17374) · 6a53fa95

Committed by liuwei1031 on May 16, 2019

* improve the API Sample of DataFeeder, memory_optimize and release_memory, test=develop

* update API.spec, test=develop, test=document_preview

* tweak the code format of feed API, test=develop

* update API.spec, test=develop

* improve doc for DataFeeder and default_main_program, test=develop

26 Apr, 2019 1 commit
- truncated_gaussian_random supported in distributed training, test=develop (#17091) · 7330cd63
  Committed by tangwei12 on Apr 26, 2019
25 Apr, 2019 1 commit
- Fleet unify distributed training (#16791) · 1a4a51db
  Committed by tangwei12 on Apr 25, 2019
```
* implement distributed transpiler with fleet
```
27 Mar, 2019 1 commit
- fix pylint · d640c6cf
  Committed by Qiao Longfei on Mar 27, 2019
25 Mar, 2019 1 commit
- fix trainer_id · 542b52fa
  Committed by Qiao Longfei on Mar 25, 2019
23 Mar, 2019 1 commit
- update transpiler and listen and serv op · de65398c
  Committed by Qiao Longfei on Mar 23, 2019
04 Mar, 2019 2 commits
- polish · 8e094f71
  Committed by Xin Pan on Feb 27, 2019
```
test=develop
```
- add deprecation warning. · 9f3a3252
  Committed by Xin Pan on Feb 27, 2019
```
test=develop
```
27 Feb, 2019 2 commits
- polish · 0c277ac6
  Committed by Xin Pan on Feb 27, 2019
```
test=develop
```
- add deprecation warning. · 840cf780
  Committed by Xin Pan on Feb 27, 2019
```
test=develop
```
20 Feb, 2019 1 commit
- fix params with only 1 dim (#15828) · 971f3bc9
  Committed by tangwei12 on Feb 20, 2019
```
* fix params with only 1 dim
* test=develop
```
14 Feb, 2019 2 commits
- update. test=develop · 84f067be
  Committed by dzhwinter on Feb 14, 2019
```
test=develop
```
- add details. test=develop · d453b0dc
  Committed by dzhwinter on Feb 14, 2019
08 Feb, 2019 2 commits
- parameter recv can run · 8bda4ab2
  Committed by Qiao Longfei on Feb 08, 2019
- complete recv op · fbd186bd
  Committed by Qiao Longfei on Feb 08, 2019
06 Feb, 2019 1 commit
- complete parameter_send · 4356f186
  Committed by Qiao Longfei on Feb 06, 2019
31 Jan, 2019 1 commit
- follow comments. test=develop · 0a63234c
  Committed by dzhwinter on Jan 31, 2019
30 Jan, 2019 2 commits

rerun ci. test=develop · 8b97a3a4

Committed by dzhwinter on Jan 30, 2019

transpiler.py code clean (#15555) · 90df7ff3

Committed by tangwei12 on Jan 30, 2019

* move var strusted to vars_distributed.py, add optimizer's block name, test=develop

* rename optimizer's seems complex, revert it, test=develop

* replace * with details, test=develop

29 Jan, 2019 1 commit
- add flag. test=develop · a26a6bc7
  Committed by dzhwinter on Jan 29, 2019
25 Jan, 2019 1 commit
- Add GetVariableNoBarrier on brpc. (#15488) · fe8f28c9
  Committed by gongweibao on Jan 25, 2019
24 Jan, 2019 1 commit
- fix tangwei merge issue test=develop (#15506) · 22db82c0
  Committed by Wu Yi on Jan 24, 2019
23 Jan, 2019 1 commit
- checkpoint at distributed training (#14854) · 8b50ad80
  Committed by tangwei12 on Jan 23, 2019
```
checkpoint for distributed training.
```
08 Jan, 2019 1 commit
- fix style test=develop · 810439a9
  Committed by Qiao Longfei on Jan 08, 2019
28 Dec, 2018 1 commit
- fix dist sparse l2 decay · 49cce3fd
  Committed by Qiao Longfei on Dec 28, 2018
```
test=develop
```
27 Dec, 2018 1 commit
- en api improve format Dec 27 · 66ea7184
  Committed by haowang101779990 on Dec 26, 2018
```
test=develop
```
26 Dec, 2018 1 commit
- add scope_pool · 3e917a93
  Committed by sneaxiy on Dec 26, 2018
```
add module cleanup
test=develop
```
18 Dec, 2018 2 commits

add test transpiler dist test, test=develop · b2f789c6

Committed by JiabinYang on Dec 18, 2018

add ir memory optimize. (#14530) · 7cd24b13

Committed by dzhwinter on Dec 18, 2018

* follow comments. test=develop

* Fix typo

* fix compile error. test=develop

* merge develop branch. test=develop

* Remove set_equal

* Polish code

* Delete unused functions

test=develop

* polish code. test=develop

* follow comment

* polish code.

* fix windows compile error. test=develop

* fix op handle.

* rerun ci. test=develop

* rerun ci. test=develop

* rerun macci. test=develop

* polish code. test=develop

* rewrite sort code. test=develop

* remove unused code. test=develop

* fix tests. test=develop

* fix conflict. test=develop

* follow comment. test=develop

* merge develop branch. test=develop

* fix tests. test=develop

* remove ToTypeIndex. test=develop

* rerun ci. test=develop

Crayon鑫 / Paddle: in sync with the fork source project