Commits · 5618f14047250f1325e7d544b4c147bf0a98c5a8 · 机器未来 / Paddle

21 Jan, 2021 2 commits
- Add Hccl program group (#30642) · e4287ca6
  Committed by gongweibao on Jan 21, 2021
- Add distribution supported (#30578) · f9c97dd7
  Committed by gongweibao on Jan 21, 2021
24 Dec, 2020 1 commit

[Feature] one ps (3/4) (#29604) · 032414ca

Committed by tangwei12 on Dec 24, 2020

* oneps (3/4)
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: chengmo <chengmo@baidu.com>

16 Dec, 2020 1 commit
- fix gen_nccl_id_op_helper compile failed, test=develop (#29614) · 613c46bc
  Committed by WangXi on Dec 16, 2020
14 Dec, 2020 1 commit
- gen nccl id use socket (#29431) · 467c7169
  Committed by WangXi on Dec 14, 2020
27 Aug, 2020 1 commit
- [api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552) · 1c681383
  Committed by lilong12 on Aug 27, 2020
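05 Feb, 2020 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

Adds a WITH_NCCL option to the cmake options to explicitly control whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is switched OFF when WITH_GPU is OFF.

Adds the PADDLE_WITH_NCCL definition.

Single-machine, single-GPU setups can disable NCCL compilation. Multi-GPU setups need NCCL enabled, which is the default; if NCCL is disabled, only a single GPU can be used.
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>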
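
As a rough illustration of the option rules in this commit message, the Python sketch below models how the effective settings could be resolved. It is not Paddle's CMake code: the function `resolve_build_flags` and its dictionary keys are hypothetical names, and tying the `PADDLE_WITH_NCCL` definition to the effective `WITH_NCCL` value is an assumption.

```python
def resolve_build_flags(with_gpu: bool, with_nccl: bool = True) -> dict:
    """Model the WITH_NCCL rules from the commit message (illustrative only)."""
    # From the commit message: WITH_NCCL defaults to ON, but is turned OFF
    # whenever WITH_GPU is OFF.
    effective_nccl = with_nccl and with_gpu

    # Assumption: the PADDLE_WITH_NCCL definition is added only when the
    # effective WITH_NCCL value is ON.
    defines = ["PADDLE_WITH_NCCL"] if effective_nccl else []

    # From the commit message: with NCCL disabled, only a single GPU is usable.
    return {
        "WITH_GPU": with_gpu,
        "WITH_NCCL": effective_nccl,
        "defines": defines,
        "multi_gpu_supported": effective_nccl,
    }


if __name__ == "__main__":
    # Compare a CPU-only build with a GPU build that keeps the default.
    print(resolve_build_flags(with_gpu=False))
    print(resolve_build_flags(with_gpu=True))
```

Under this model, a GPU build with `with_nccl=False` still compiles, but reports `multi_gpu_supported` as false, matching the single-GPU restriction described above.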

25 Dec, 2019 1 commit
- remove patch command and file of cares to Improved quality of Paddle Repo (#21776) · a01663ca
  Committed by zhouwei25 on Dec 25, 2019
03 Dec, 2019 1 commit
- remove unused snappy/snappystream depends in distributed codes (#21484) · 70eb3976
  Committed by Tao Luo on Dec 03, 2019
```
test=develop
```
27 Jun, 2019 1 commit

supports collective communicated training (#18175) · b7128bac

Committed by HaoRen on Jun 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runnable with variables, add more unittest for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plug in more strategies

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* macro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dump py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O

23 Mar, 2019 1 commit
- update transpiler and listen and serv op · de65398c
  Committed by Qiao Longfei on Mar 23, 2019
06 Mar, 2019 1 commit
- can run · 255b36da
  Committed by Qiao Longfei on Mar 06, 2019
07 Feb, 2019 1 commit
- init parameter recv · a0585d08
  Committed by Qiao Longfei on Feb 07, 2019
28 Jan, 2019 1 commit
- code can compile · 657a4f94
  Committed by Qiao Longfei on Jan 28, 2019
26 Dec, 2018 1 commit
- fix ci error. test=develop · 3ea2f415
  Committed by dzhwinter on Dec 26, 2018
14 Dec, 2018 1 commit
- Add brpc serialization support. (#11430) · 0b1c7d83
  Committed by gongweibao on Dec 14, 2018
19 Nov, 2018 1 commit

fix dist deps (#14471) · d7bd0361

Committed by Wu Yi on Nov 19, 2018

* fix dist deps test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

16 Nov, 2018 1 commit

Refine operator cmake (#14413) · a2d9b344

Committed by Wu Yi on Nov 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

机器未来 / Paddle is in sync with its fork source project
