Commits · b8d106e1c57ff6a06d91b5b5c1232cb54b6e47b7 · PaddlePaddle / Paddle

20 Jul, 2022 1 commit
- 【GPUPS】Adam accessor (#43919) · b8d106e1
  danleifeng authored Jul 20, 2022
```
* add adam/sharedadam optimizer for gpups; edit optimizer struct; test=develop
```
  b8d106e1
05 Jun, 2022 1 commit

【code format check upgrade】 step2：yapf (#42944) · a072fca8

Sing_chan authored Jun 05, 2022

* use yapf to format all python files

* exclude two unittest files from yapf, because they rely on writing and reading files and formatting would break them

* disable diff_py_file because too many diff files cause the following command to fail

a072fca8

02 Jun, 2022 1 commit

add federated learning parameter server(fl-ps) mode (#42682) · d999049f

ziyoujiyi authored Jun 02, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .

* fl-ps v1.0

* .

* support N + N mode

* .

* .

* .

* .

* delete print

* .

* .

* .

* .

d999049f

19 May, 2022 1 commit
- 【GPUPS】add ctr_dymf_accessor for pscore (#42827) · 148582fe
  danleifeng authored May 19, 2022
  
  148582fe
12 May, 2022 1 commit
- Fix some typos in paddle/. (#42408) · 2012672c
  Shuangchi He authored May 12, 2022
  
  2012672c
19 Apr, 2022 1 commit

double accessor and show_scale (#41943) · 8113c913

wangguanqun authored Apr 19, 2022

* double accessor and show_scale

* double accessor and show_scale

* rename

* fix bug in pslib config

* add unittest

8113c913

13 Apr, 2022 1 commit

the one ps proto (#41659) · b12af9e1

wangguanqun authored Apr 13, 2022

* the one ps proto

* the one ps proto

* fix

* fix

* fix

* fix windows ci

* fix windows ci

* add dependency

* add dependency

b12af9e1

31 Mar, 2022 1 commit

fix load bug and add distributed strategy from pslib (#40883) · 47383dca

wangguanqun authored Mar 31, 2022

* fix load bug and add distributed strategy from pslib

* add unittest

* use cvm config

* trainer and worker config

* add unittest

* add unittest

* add test

* code style

47383dca

17 Jan, 2022 1 commit
- Add NoReduce mode for ParallelExecutor (#38969) · e50d883e
  sneaxiy authored Jan 17, 2022
```
* add no reduce mode for pe

* add NoReduce ut
```
  e50d883e
09 Dec, 2021 1 commit
- default accessor and multi table config (#37714) · a9e0d28c
  wangguanqun authored Dec 09, 2021
```
* default accessor and multi table config

* add unittest

* add unittest

* delete print
```
  a9e0d28c
06 Dec, 2021 1 commit
- heter for collective (#37613) · 1bdb8578
  kuizhiqing authored Dec 06, 2021
  
  1bdb8578
30 Nov, 2021 1 commit
- pscore global shuffle&default accessor config (#37626) · 1514eec6
  zhaocaibei123 authored Nov 30, 2021
  
  1514eec6
26 Nov, 2021 1 commit
- upgrade async distributed training in pscore (#37515) · 74605fc2
  zhaocaibei123 authored Nov 26, 2021
```
* test

* test

* rm test

* update

* update

* update

* add unittest

* update

* update save
```
  74605fc2
24 Nov, 2021 1 commit
- Adapt auto search (#37490) · 025053b4
  zhaoyingli authored Nov 24, 2021
```
* adapt auto search

* adapt auto search

* fix matmulv2 compatible

* del debug
```
  025053b4
08 Sep, 2021 1 commit

[Auto Parallel] Integrate all modules (#35483) · 12155358

Yulong Ao authored Sep 08, 2021

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* add dist

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update

* update

* delete unused proto

* restore op_desc

* restore type_defs

* update var_desc

* remove dimss_mapping for proto_pybind

* update interface.py

* update framework.py

* update

* update

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* [WIP] Add the auto completion feature and related codes

* [WIP] Improve the auto completion and related codes

* [WIP] Make the auto completion to support data-parallel

* [WIP] Make the completion support mp and dp+mp

* [WIP] Refactor auto completion unit test for MLP

* [WIP] Refactor the implementation of DistributedOperatorImpl

* [WIP] Improve dims_mapping update rule and fix a bug

* [WIP] Support auto completion for one transformer decoder layer

* [WIP] Add a minor change

* [WIP] Fix a bug within the unit test

* Shard XShape tensor, add embedding completion and refactor code

* Add the distributed_operators dir to setup.py.in

* Improve the completion process and add the unittest for gpt

* fix process_mesh ut

* fix process_mesh ut

* update

* update, test=develop

* Add support for automatically completing distributed attrs of special ops

* update

* update

* update

* fix doc sample codes, test=develop

* improve coverage, test=develop

* add static_mode check, test=develop

* Model the cluster for cost model and physical mapping

* update, test=develop

* add set_placement, test=develop

* Add the check to make sure the candidate tensors' size is greater than zero

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update, test=develop

* Auto mark dist attrs annotated by user

* update ndarray to nested list, test=develop

* update, test=develop

* Add auto-completion module for auto-parallel (based on PR#33804)

* Remove unnecessary files

* Remove unrelated files for the auto completion pr

* Update the unit test to improve the coverage

* Modify codes based on reviews

* Minor changes for CI

* Improve some codes based on new comments

* Fix bugs caused by shallow copy in attributes.py
* Improve amend_distributed_attr_for_program in context.py
* Other changes for weihang's comments

* support shard reader

* support shard reader

* add parallel mode

* update process mesh

* add method to compute comm_group

* implement dist_embedding forward func

* implement dist matmul forward func

* implement dist reshape forward func

* add transpiler framework

* add transpiler forward

* implement transpiler forward

* implement transpiler backward & update

* add process

* add unittest

* chmod

* chmod

* chmod

* update unittest

* add unittest for gpt

* remove unused print

* rename transpiler --> partitioner

* rename transpiler --> partitioner

* chmod

* chmod

* bug fixed

* remove amp function

* update case for dp mode

* update case for dp mode

* [Auto Parallel] Integrate all parts with the newest code

* Integrate all parts of auto parallel and improve codes

* Integrate all parts by AutoParallelizer
* Add unit test for AutoParallelizer
* Improve auto completion module for pipeline parallel
* Add support for matmul_v2 in dist_matmul
* Correct the typo "stratergy" to "strategy"

* Modify distributed_strategy.proto to conform to the main stream

* Restore parts of distributed_strategy to conform to the develop branch

Co-authored-by: sandyhouse <lilong12@baidu.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>

12155358

20 Aug, 2021 1 commit
- [hybrid performance] Grad fuse for gradient merge under pipeline mode (#35004) · 4d9b2d6d
  Yuang Liu authored Aug 20, 2021
  
  4d9b2d6d
18 Aug, 2021 1 commit
- [Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16... · a9673b44
  WangXi authored Aug 18, 2021
```
[Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (#34965)
```
  a9673b44
30 Jul, 2021 1 commit
- add trainer desc config to distributed strategy (#34457) · e6aacd1e
  wangguanqun authored Jul 30, 2021
```
* add trainer desc config to distributed strategy

* code style modified
```
  e6aacd1e
08 Jul, 2021 1 commit
- Distributed Automatic SParsity with Fleet (#33558) · 86cb3fb8
  Ming-Xu Huang authored Jul 08, 2021
  
  86cb3fb8
01 Jul, 2021 1 commit
- gradient scale (#33862) · 57aabbab
  Yuang Liu authored Jul 01, 2021
  
  57aabbab
21 Jun, 2021 1 commit
- add sync calc stream and add ut for fuse on gpu (#33580) · e0e0c0fa
  Yuang Liu authored Jun 21, 2021
  
  e0e0c0fa
10 Jun, 2021 1 commit
- dp c_allreduce_sum_fusion op (#33169) · 003b4616
  Baibaifan authored Jun 10, 2021
  
  003b4616
09 Jun, 2021 1 commit
- cache core.globals() to speed up dynamic graph (#32098) · b4954ce4
  wanghuancoder authored Jun 09, 2021
```
* modify API nn.Bilinear's doc, test=develop
```
  b4954ce4
07 Jun, 2021 1 commit
- fix too-many-format-args (#33353) · 599e9e48
  zhangchunle authored Jun 07, 2021
  
  599e9e48
26 May, 2021 1 commit
- [Tensor Parallelism] split fix bug (#33015) · 20b9be65
  JZ-LIANG authored May 26, 2021
  
  20b9be65
17 May, 2021 1 commit
- [HybridParallel]Fix precision problem of model parallel (#32897) · c809530e
  ShenLiang authored May 17, 2021
```
* fix precision of mp

* fix bug of seed

* fix dp

* print group
```
  c809530e
11 May, 2021 1 commit
- Support control flow in DataParallel (#32826) · 298f210d
  ShenLiang authored May 11, 2021
```
* fix find_unused_parameters default value
```
  298f210d
08 May, 2021 1 commit
- Add raw program meta optimizer (#32597) · c1c18b08
  lilong12 authored May 08, 2021
```
* add raw program, test=develop
```
  c1c18b08
06 May, 2021 1 commit
- update 2.0 public api in distributed (#32695) · 70eb435c
  zhiboniu authored May 06, 2021
  
  70eb435c
25 Apr, 2021 1 commit
- Fix the bug in mp (#31996) · 976fe6f9
  lilong12 authored Apr 25, 2021
```
* update
```
  976fe6f9
20 Apr, 2021 1 commit
- [Sharding]: update config DOC (#32299) · e3489013
  JZ-LIANG authored Apr 20, 2021
```
* sharding: update config DOC

* update pipeline config

* sharding update doc
```
  e3489013
17 Apr, 2021 1 commit
- [Hybrid Parallel] Add model parallel support in dygraph (#32248) · 66d46221
  ShenLiang authored Apr 17, 2021
```
* add model parallel support in dygraph
```
  66d46221
01 Apr, 2021 1 commit
- Support control flow in DataParallel (#31625) · 8460698b
  ShenLiang authored Apr 01, 2021
```
* support control flow

* support sync_parameters_buffers

* fix the bug of sparse embedding
```
  8460698b
24 Feb, 2021 1 commit
- align the default value of some configuration for fleet to that of single cards (#30740) · dc8dfba3
  lilong12 authored Feb 24, 2021
```
* update, test=develop
```
  dc8dfba3
01 Feb, 2021 1 commit
- Fleet distributed strategy support pure fp16 (#30754) · 31ed9c9e
  WangXi authored Feb 01, 2021
  
  31ed9c9e
12 Jan, 2021 1 commit
- Recompute Offload (#30233) · 75936d83
  JZ-LIANG authored Jan 12, 2021
  
  75936d83
09 Dec, 2020 1 commit
- Rebuild group automatically in dynamic graph distributed (#29255) · 2ef9e0e2
  ShenLiang authored Dec 09, 2020
```
* add tensor_indices in AssignGroupBySize

* add rebuild group in reducer
```
  2ef9e0e2
01 Dec, 2020 2 commits
- Change the api of DataParallel and Fleet (#29224) · 46b73e6c
  ShenLiang authored Dec 01, 2020
  
  46b73e6c
- test=develop, fix doc (#29200) · cc9c6196
  123malin authored Dec 01, 2020
```
* fix fleet api doc
```
  cc9c6196
26 Nov, 2020 1 commit

[sharding] doc, api, bug fixed (#28983) · 0dadacc4

JZ-LIANG authored Nov 26, 2020

* add lars to fleet meta optimizer

* add lamb to proto

* add lamb to fleet meta optimizer

* fixed syntax bug

* fixed syntax bug

* fixed syntax error in lamb, add config setter of lamb in distributed_strategy

* trigger unittest to rerun

* add new unittest func for lamb

* revise unittest for lars and lamb

* revise dgc meta unittest

* revise lars document in distribute_strategy

* revise lars lamb document in distributed_strategy.py

* revise lars lamb document in distributed_strategy.py

* add weight decay exclude logic to lars

* restore optimizer.py

* restore optimizer.py as develop except lars

* add epsilon and exclude fn to distributed_strategy

* add lars epsilon

* revise unittest for fleet lars and lamb

* revise lars lamb unittest for CI coverage

* revise lars argument api

* revise lars argument api

* revise lars argument api

* revise api doc of lars

* fix op role

* add sharding save and add_sync_comm_for_test function

* add comm_analyse to utils

* revise sharding_utils

* add sharding saving unittest

* revise sharding utils for unittest

* revise sharding en doc

* update sharding utils api

* add doc for sharding

* fixed bug in sharding var size count

* update varsize count in sharding

* fix sharding num_nccl_comm

* Revert "fix sharding num_nccl_comm"

This reverts commit d51587c15e9323acf226ddd36154275f0d1daf76.

0dadacc4

PaddlePaddle / Paddle synced successfully about 1 year ago