Commits · 8c7c53b3d5237bcdbcb42e492ec51bc581223549 · Crayon鑫 / Paddle

07 Apr, 2021 1 commit

【NPU】Merge ascend GE&distributed code by 0208 from ascendrc (#31957) · 8c7c53b3

Committed by zhang wenhui on Apr 07, 2021

* Ascend rc (#30483)

* Fix compilation on CANN20.1 and older (#30494)

Fix compilation on CANN20.1 and older

* Add distribution supported (#30578)

Add distribution supported

* Build parser for Hcom* operators (#30627)

Build parser for Hcom* operators

* Pass device_ids info from launch to trainer. (#30632)

Pass device_ids info from launch to trainer

* Add Hccl program group (#30642)

Add Hccl program group

* Add startup bash files of test_ascend_group. (#30645)

Add startup bash files of test_ascend_group

* cleanup (#30646)

cleanup test_ascend_group.py

* [Feature] Build parser to support distributed training (#30658)

[Feature] Build parser to support distributed training

* fix compilation on ascend-20.1 (#30722)

fix compilation on ascend-20.1

* Dev/fix ascend string (#30749)

Dev/fix ascend string

* code style (#30781)

code style

* Merge ascend_optimizer and ascend_parser. (#30776)

Merge ascend_optimizer and ascend_parser.

* Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug  (#30797)

Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug

* Add paddle ascend distribution training supported (#30796)

Add paddle ascend distribution training supported

* pass cxx_flags to gloo cmake (#30857)

* Destroy session first. (#30954)

Destroy session first.

* merge

* fix, test=develop

* fix, test=develop

* fix style, test=develop

* fix, test=develop

* fix

* fix log fatal, test=develop

* fix enforce style, test=develop

* fix, test=develop

* fix, test=develop

* fix rccl, test=develop

* fix test, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix node_num, test=develop

* fix ids str, test=develop

* fix ids str, test=develop

* fix ids str, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix style code, test=develop

* fix style code, test=develop

* fix style code, test=develop

* fix style code, test=develop
Co-authored-by: hutuxian <hutuxian2011@sina.cn>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: Void Main <voidmain1313113@gmail.com>
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
Co-authored-by: dingsiyu <18369187719@163.com>
Co-authored-by: OleNet <olenet@126.com>

8c7c53b3

24 Feb, 2021 1 commit

[ROCM] update fluid collective op for rocm, test=develop (#31075) · ee76ea72

Committed by Qi Li on Feb 24, 2021

ee76ea72

05 Feb, 2021 1 commit

[Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858) · 4a8b8b45

Committed by liuyuhui on Feb 05, 2021

4a8b8b45

03 Feb, 2021 2 commits

fix WITH_XPU_BKCL in CMakeLists.txt (#30854) · 2cb55eff

Committed by liuyuhui on Feb 03, 2021

2cb55eff

【kunlun】dygraph supports multi xpu card training (#30671) · b1026f64

Committed by WangXi on Feb 03, 2021

b1026f64

19 Jan, 2021 1 commit

[Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer (#30455) · 572c466d

Committed by WangXi on Jan 19, 2021

572c466d

24 Dec, 2020 1 commit

[Feature] one ps (3/4) (#29604) · 032414ca

Committed by tangwei12 on Dec 24, 2020

* oneps (3/4)
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: chengmo <chengmo@baidu.com>

032414ca

16 Dec, 2020 1 commit

fix gen_nccl_id_op_helper compile failed, test=develop (#29614) · 613c46bc

Committed by WangXi on Dec 16, 2020

613c46bc

14 Dec, 2020 1 commit

gen nccl id use socket (#29431) · 467c7169

Committed by WangXi on Dec 14, 2020

467c7169

27 Aug, 2020 1 commit

[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552) · 1c681383

Committed by lilong12 on Aug 27, 2020

add collective op for cpu using gloo and paddle.distributed.* apis

1c681383

05 Feb, 2020 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

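Added a WITH_NCCL option to cmake that explicitly specifies whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

Single-machine, single-card setups can disable NCCL compilation. Multi-card setups need NCCL enabled, which is the default; if NCCL is disabled, only a single card can be used.
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>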

7bc4b095
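
The WITH_NCCL rules described in commit 7bc4b095 above can be modeled as a small truth table. The sketch below is only an illustration in Python, not Paddle's actual cmake logic: the function names `resolve_with_nccl` and `usable_cards` are hypothetical, and the code assumes nothing beyond what the commit message states (default ON, forced OFF without GPU, single card only when NCCL is off).

```python
def resolve_with_nccl(with_gpu: bool, with_nccl: bool = True) -> bool:
    """Effective WITH_NCCL value (hypothetical model of the commit message).

    WITH_NCCL defaults to ON, but is forced OFF whenever WITH_GPU is OFF.
    """
    return with_nccl and with_gpu


def usable_cards(with_gpu: bool, with_nccl: bool, cards_present: int) -> int:
    """Number of cards a build could train on under the stated rules.

    Multi-card training needs NCCL; without it at most one card is usable.
    """
    if not with_gpu:
        return 0
    if resolve_with_nccl(with_gpu, with_nccl):
        return cards_present
    return min(cards_present, 1)


if __name__ == "__main__":
    for gpu in (True, False):
        for nccl in (True, False):
            print(gpu, nccl, resolve_with_nccl(gpu, nccl), usable_cards(gpu, nccl, 8))
```

In the real build these rules live in the cmake configuration, and the resulting PADDLE_WITH_NCCL definition is what the C++ sources would test against.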

25 Dec, 2019 1 commit

remove patch command and file of cares to Improved quality of Paddle Repo (#21776) · a01663ca

Committed by zhouwei25 on Dec 25, 2019

a01663ca

03 Dec, 2019 1 commit

remove unused snappy/snappystream depends in distributed codes (#21484) · 70eb3976

Committed by Tao Luo on Dec 03, 2019

test=develop

70eb3976

27 Jun, 2019 1 commit

supports collective communicated training (#18175) · b7128bac

Committed by HaoRen on Jun 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_variables
test=develop

* supports collective training in executor

* make fetch_list runnable with variables, add more unittest for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* fix prepare context redundant code problem, optimize executor by caching create_variables
test=develop

* supports collective training in executor

* make fetch_list runnable with variables, add more unittest for use_program_cache
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* fix comment
test=develop

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* macro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dumple  py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O

b7128bac

23 Mar, 2019 1 commit

update transpiler and listen and serv op · de65398c

Committed by Qiao Longfei on Mar 23, 2019

de65398c

06 Mar, 2019 1 commit

can run · 255b36da

Committed by Qiao Longfei on Mar 06, 2019

255b36da

07 Feb, 2019 1 commit

init parameter recv · a0585d08

Committed by Qiao Longfei on Feb 07, 2019

a0585d08

28 Jan, 2019 1 commit

code can compile · 657a4f94

Committed by Qiao Longfei on Jan 28, 2019

657a4f94

26 Dec, 2018 1 commit

fix ci error. test=develop · 3ea2f415

Committed by dzhwinter on Dec 26, 2018

3ea2f415

14 Dec, 2018 1 commit

Add brpc serialization support. (#11430) · 0b1c7d83

Committed by gongweibao on Dec 14, 2018

0b1c7d83

19 Nov, 2018 1 commit

fix dist deps (#14471) · d7bd0361

Committed by Wu Yi on Nov 19, 2018

* fix dist deps test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

d7bd0361

16 Nov, 2018 1 commit

Refine operator cmake (#14413) · a2d9b344

Committed by Wu Yi on Nov 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

a2d9b344

Crayon鑫 / Paddle is in sync with the fork's source project
