Commits · 7a92e74bff15c836b38ba8215e1103513e5b31f9 · Crayon鑫 / Paddle

06 Sep, 2022 1 commit
- Completes basic dtypes for collective api in eager mode (#45574) · 7a92e74b
  Committed by Wen Sun on Sep 06, 2022
  
  7a92e74b
17 Aug, 2022 1 commit

[CodeStyle][NPU] use np.testing.assert_allclose instead of... · 2de0d676

Committed by Nyakku Shigure on Aug 17, 2022

[CodeStyle][NPU] use np.testing.assert_allclose instead of self.assertTrue(np.allclose(...)) (part 1) (#44988)

* autofix

* try resolve precision issues

* revert some changes

* clean some `err_msg`

* 0.0001 -> 1e-4

* update commented assert code

* try to fix some shape errors

* `numpy` -> `np`

* empty commit, trigger kunlun ci, test=kunlun

* empty commit, retrigger kunlun ci, test=kunlun

* empty commit, trigger kunlun ci, try fix npu memcpy_h2d, test=kunlun

* try fix npu import error, test=kunlun

2de0d676
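
The assertion swap named in this commit title can be illustrated with a minimal sketch. The arrays and tolerances below are made-up values, not taken from the Paddle tests, and `passes_with_defaults` is a hypothetical helper. The two calls do not share default tolerances (`np.allclose` uses `rtol=1e-5, atol=1e-8`; `np.testing.assert_allclose` uses `rtol=1e-7, atol=0`), which may be why the message mentions precision issues.

```python
import numpy as np

actual = np.array([1.0, 2.0, 3.0])
expected = np.array([1.0, 2.0, 3.0001])

# Old pattern: np.allclose only yields a bool, so wrapping it in
# self.assertTrue(...) reports nothing about which elements differ.
old_ok = np.allclose(actual, expected, rtol=1e-4)

# New pattern: raises AssertionError with a mismatch report on failure.
np.testing.assert_allclose(actual, expected, rtol=1e-4)


def passes_with_defaults(a, b):
    """Return True if assert_allclose accepts a and b with its default tolerances."""
    try:
        np.testing.assert_allclose(a, b)
        return True
    except AssertionError:
        return False


# A difference that np.allclose accepts by default but assert_allclose rejects.
tight = np.array([1.0, 2.0, 3.0 + 3e-6])
```

When migrating a test, passing the tolerance explicitly keeps the comparison equivalent to the old one.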

01 Aug, 2022 1 commit
- fix all_gather_object with various length, test=allcases (#44718) · e48cb42b
  Committed by LiYuRio on Aug 01, 2022
  
  e48cb42b
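
Gathering Python objects whose serialized sizes differ is commonly handled by padding every buffer to the longest one. The single-process sketch below only illustrates that general idea with NumPy; it is an assumption about the approach, not the code from this commit, and `simulate_all_gather_object` is a hypothetical name.

```python
import pickle

import numpy as np


def simulate_all_gather_object(objects_per_rank):
    """Simulate gathering one picklable object per rank in a single process."""
    # Each "rank" serializes its object into a byte buffer.
    buffers = [
        np.frombuffer(pickle.dumps(obj), dtype=np.uint8)
        for obj in objects_per_rank
    ]
    # Stand-in for a first collective that exchanges buffer lengths.
    lengths = [buf.size for buf in buffers]
    max_len = max(lengths)
    # Pad to a common length so every rank contributes the same shape.
    padded = [np.pad(buf, (0, max_len - buf.size)) for buf in buffers]
    # Stand-in for the all_gather of equally shaped tensors.
    gathered = np.stack(padded)
    # Trim each row back to its true length before deserializing.
    return [
        pickle.loads(gathered[i, : lengths[i]].tobytes())
        for i in range(len(lengths))
    ]
```

For example, `simulate_all_gather_object([{"a": 1}, [1, 2, 3], "x" * 100])` returns the three objects unchanged even though their pickled sizes differ.
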
28 Jul, 2022 1 commit
- Complete the dtypes for all_gather, add all_gather_object api (#44417) · d4cf02bc
  Committed by LiYuRio on Jul 28, 2022
  
  d4cf02bc
23 Jun, 2022 1 commit

Fix several unit tests and increase the unit tests stability (#43670) · c41c5e63

Committed by zlsh80826 on Jun 23, 2022

* Reduce gather op unit tests size and increase the timeout

* Add NVIDIA_TF32_OVERRIDE for multi-processes environment

* Remove record test for device event ut

c41c5e63
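
The `NVIDIA_TF32_OVERRIDE` item in this commit concerns tests that spawn several processes. A minimal sketch of handing the variable to a child process follows. The helpers `build_worker_env` and `launch_worker` are hypothetical and not taken from the Paddle test suite, and the commit message does not say whether the real tests force the value or only forward it. Setting `NVIDIA_TF32_OVERRIDE=0` asks NVIDIA libraries not to use TF32 math.

```python
import os
import subprocess
import sys


def build_worker_env(base_env=None):
    """Copy an environment and make sure NVIDIA_TF32_OVERRIDE is set in it."""
    env = dict(os.environ if base_env is None else base_env)
    # setdefault keeps a value that the caller has already chosen.
    env.setdefault("NVIDIA_TF32_OVERRIDE", "0")
    return env


def launch_worker(code):
    """Run a Python snippet in a child process that sees the prepared environment."""
    return subprocess.run(
        [sys.executable, "-c", code],
        env=build_worker_env(),
        capture_output=True,
        text=True,
        check=True,
    )


result = launch_worker("import os; print(os.environ['NVIDIA_TF32_OVERRIDE'])")
```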

21 Jun, 2022 1 commit
- Pass NVIDIA_TF32_OVERRIDE to internal (#43646) · 7307e955
  Committed by gongweibao on Jun 21, 2022
  
  7307e955
15 Jun, 2022 1 commit
- place all save/load path into temporary directory (#43451) · 0c51f241
  Committed by zhaoyingli on Jun 15, 2022
```
* use tempfile to place temporary files

* update

* revert test_communicator

* fix test_dist_base
```
  0c51f241
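
Keeping save/load paths inside a temporary directory is usually done with the standard `tempfile` module. The sketch below shows one way to do it in a `unittest` test case; the class name, file name, and payload are illustrative assumptions, not code from this commit.

```python
import os
import tempfile
import unittest


class SaveLoadTest(unittest.TestCase):
    def setUp(self):
        # A fresh directory per test, so nothing is written into the source tree.
        self.temp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        # Removes the directory and everything saved inside it.
        self.temp_dir.cleanup()

    def test_round_trip(self):
        path = os.path.join(self.temp_dir.name, "checkpoint.bin")
        payload = b"example bytes"
        with open(path, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:
            self.assertEqual(f.read(), payload)
```

Because each test owns its own directory, tests running in parallel do not overwrite one another's files.
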
05 Jun, 2022 1 commit

【code format check upgrade】 step2：yapf (#42944) · a072fca8

Committed by Sing_chan on Jun 05, 2022

* use yapf to format all python file

* yapf exclude two unittests file for they rely on writing and reading file, and format will break them

* disable diff_py_file because too many diff files cause command following failed

a072fca8

31 May, 2022 1 commit
- [Eager] fix collective_global_gather (#43090) · ae45d981
  Committed by Weilong Wu on May 31, 2022
```
* [Eager] fix collective_global_gather

* fix eager_ode = 1
```
  ae45d981
28 May, 2022 1 commit
- [Bug Fix]Fix global_scatter/global_gather in ProcessGroup (#43027) · 8cc2e28c
  Committed by ShenLiang on May 28, 2022
```
* fix alltoall

* rename utest
```
  8cc2e28c
13 Sep, 2021 1 commit
- upload global scatter and global gather operators related files (#35546) · ecfe8375
  Committed by 李季 on Sep 13, 2021
```
* upload global scatter and global gather operators related files
```
  ecfe8375
21 Jun, 2021 1 commit
- Del six.PY code2 (#33607) · 0f7187af
  Committed by tianshuo78520a on Jun 21, 2021
```
* del py2 code2

* fix test timeout
```
  0f7187af
09 Jun, 2021 1 commit
- [HybridParallel] update collective split to use c_embedding and mp_allreduce (#33411) · 42c1297e
  Committed by WangXi on Jun 09, 2021
  
  42c1297e
26 May, 2021 1 commit
- [Tensor Parallelism] split fix bug (#33015) · 20b9be65
  Committed by JZ-LIANG on May 26, 2021
  
  20b9be65
27 Apr, 2021 1 commit
- add alltoall api (#32507) · db41b742
  Committed by lilong12 on Apr 27, 2021
```
* add alltoall api, test=develop
```
  db41b742
26 Apr, 2021 1 commit
- add send/recv api (#32504) · c47bafc6
  Committed by lilong12 on Apr 26, 2021
```
* add sendrecv, test=develop
```
  c47bafc6
21 Apr, 2021 1 commit
- [Kunlun]add collective ops for multi XPU cards training and add Kunlun multi XPU cards CI (#32302) · 2194ad15
  Committed by liuyuhui on Apr 21, 2021
  
  2194ad15
31 Dec, 2020 3 commits
- update, test=develop (#30047) · 9e51e383
  Committed by lilong12 on Dec 31, 2020
  
  9e51e383
- Disable gloo by default (#29805) · b0bd93de
  Committed by lilong12 on Dec 31, 2020
```
* update, test=develop
```
  b0bd93de
- add the paddle.distributed.split api (#29970) · 2bc5121d
  Committed by lilong12 on Dec 31, 2020
```
* add distributed.split, test=develop
```
  2bc5121d
21 Oct, 2020 1 commit
- modify ut cmakefile (#28140) · 4873c20d
  Committed by lilong12 on Oct 21, 2020
```
* modify ut cmakefile, test=develop
```
  4873c20d
29 Sep, 2020 1 commit
- Initialize gloo for low level collective apis (#27672) · bbc2add7
  Committed by lilong12 on Sep 29, 2020
```
* add gloo initializer, test=develop
```
  bbc2add7
28 Sep, 2020 2 commits
- Revert "Initialize gloo for low level collective apis (#27356)", test=document_fix (#27665) · 36c04102
  Committed by lilong12 on Sep 28, 2020
  
  36c04102
- Initialize gloo for low level collective apis (#27356) · fa73e4a2
  Committed by lilong12 on Sep 28, 2020
```
* add gloo initializer, test=develop
```
  fa73e4a2
28 Aug, 2020 1 commit
- update copyright year, test=document_fix (#26586) · f1ae017f
  Committed by lilong12 on Aug 28, 2020
  
  f1ae017f
27 Aug, 2020 1 commit
- [api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552) · 1c681383
  Committed by lilong12 on Aug 27, 2020
```
add collective op for cpu using gloo and paddle.distributed.* apis
```
  1c681383
21 Aug, 2020 1 commit
- Add collective ops (reduce) (#26340) · e92f770c
  Committed by lilong12 on Aug 21, 2020
  
  e92f770c
24 Nov, 2019 1 commit
- adapt test_collective_base.py for only two GPU cards available. (#21307) · f1b09ba3
  Committed by Yi Liu on Nov 24, 2019
```
* adapt test_collective_base.py for only two GPU cards available.
test=develop

* fix bug of issue #21259
test=develop
```
  f1b09ba3
27 Jun, 2019 1 commit

supports collective communicated training (#18175) · b7128bac

Committed by HaoRen on Jun 27, 2019

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runable with variables, add more unittest for use_program_cache
test=develop

* fix comment
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* supports collective training in executor

* make fetch_list runable with variables, add more unittest for use_program_cache
test=develop

* use unique name for nccl_id

* supports output to stream in program_to_code

* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code

* set op role in collective training

* add collective op role

* fix comment
test=develop

* remove orig file

* add build optimizer by strategy

* add collective strategy

* refine collective strategy

* add multi-process role maker

* refine strategy building factory so that we can easily plugin more strategy

* scale loss grad in collective sgd transpiler

* add support for distributed fc

* code format

* revert some features for dist fc

* add support for distributed fc training

* test=develop
add collective op unittest standard

* test=develop
remove the test_collective directory

* test=develop
remove the test_collective directory

* remove slicegather test

* code format for reducescatter

* update attr of shard_index_op

* Modify macro nccl_helper

* remove test without distribute

* macro collective_helper

* marcro update

* test=develop
update support python3.5

* test=develop change gpu memory use to 0.1 when test

* test=develop
update ut equal func

* test=develop
set flags to 1.5

* test=develop fix pickle dumple  py35

* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
modify read input from file to read from memory
remove origin_program in framework and add i/o in c_sync_calc_stream

* test=develop update unittest sync operator I/O

b7128bac

Crayon鑫 / Paddle is in sync with its fork source project