Commits · 6eeb16b8944955e572b5cdc0af450adfc5cd37a1 · PaddlePaddle / Paddle

09 Dec, 2021 1 commit
- default accessor and multi table config (#37714) · a9e0d28c
  Authored by wangguanqun on Dec 09, 2021
```
* default accessor and multi table config

* add unittest

* add unittest

* delete print
```
06 Dec, 2021 1 commit
- heter for collective (#37613) · 1bdb8578
  Authored by kuizhiqing on Dec 06, 2021
  
30 Nov, 2021 1 commit
- pscore global shuffle&default accessor config (#37626) · 1514eec6
  Authored by zhaocaibei123 on Nov 30, 2021
  
26 Nov, 2021 1 commit
- upgrade async distributed training in pscore (#37515) · 74605fc2
  Authored by zhaocaibei123 on Nov 26, 2021
```
* test

* test

* rm test

* update

* update

* update

* add unittest

* update

* update save
```
24 Nov, 2021 1 commit
- Adapt auto search (#37490) · 025053b4
  Authored by zhaoyingli on Nov 24, 2021
```
* adapt auto search

* adapt auto search

* fix matmulv2 compatible

* del debug
```
08 Sep, 2021 1 commit

[Auto Parallel] Integrate all modules (#35483) · 12155358

Authored by Yulong Ao on Sep 08, 2021

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* add dist

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update

* update

* delete unused proto

* restore op_desc

* restore type_defs

* update var_desc

* remove dims_mapping for proto_pybind

* update interface.py

* update framework.py

* update

* update

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* [WIP] Add the auto completion feature and related codes

* [WIP] Improve the auto completion and related codes

* [WIP] Make the auto completion to support data-parallel

* [WIP] Make the completion support mp and dp+mp

* [WIP] Refactor auto completion unit test for MLP

* [WIP] Refactor the implementation of DistributedOperatorImpl

* [WIP] Improve dims_mapping update rule and fix a bug

* [WIP] Support auto completion for one transformer decoder layer

* [WIP] Add a minor change

* [WIP] Fix a bug within the unit test

* Shard XShape tensor, add embedding completion and refactor code

* Add the distributed_operators dir to setup.py.in

* Improve the completion process and add the unittest for gpt

* fix process_mesh ut

* fix process_mesh ut

* update

* update, test=develop

* Add support for automatically completing distributed attrs of special ops

* update

* update

* update

* fix doc sample codes, test=develop

* improve coverage, test=develop

* add static_mode check, test=develop

* Model the cluster for cost model and physical mapping

* update, test=develop

* add set_placement, test=develop

* Add the check to make sure the candidate tensors' size is greater than zero

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update, test=develop

* Auto mark dist attrs annotated by user

* update ndarray to nested list, test=develop

* update, test=develop

* Add auto-completion module for auto-parallel (based on PR#33804)

* Remove unnecessary files

* Remove unrelated files for the auto completion pr

* Update the unit test to improve the coverage

* Modify codes based on reviews

* Minor changes for CI

* Improve some codes based on new comments

* Fix bugs caused by shallow copy in attributes.py
* Improve amend_distributed_attr_for_program in context.py
* Other changes for weihang's comments

* support shard reader

* support shard reader

* add parallel mode

* update process mesh

* add method to compute comm_group

* implement dist_embedding forward func

* implement dist matmul forward func

* implement dist reshape forward func

* add transpiler framework

* add transpiler forward

* implement transpiler forward

* implement transpiler backward & update

* add process

* add unittest

* chmod

* chmod

* chmod

* update unittest

* add unittest for gpt

* remove unused print

* rename transpiler --> partitioner

* rename transpiler --> partitioner

* chmod

* chmod

* bug fixed

* remove amp function

* update case for dp mode

* update case for dp mode

* [Auto Parallel] Integrate all parts with the newest code

* Integrate all parts of auto parallel and improve codes

* Integrate all parts by AutoParallelizer
* Add unit test for AutoParallelizer
* Improve auto completion module for pipeline parallel
* Add support for matmul_v2 in dist_matmul
* Correct the typo "stratergy" to "strategy"

* Modify distributed_strategy.proto to conform the main stream

* Restore parts of distributed_strategy to conform the develop branch
Co-authored-by: sandyhouse <lilong12@baidu.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>

20 Aug, 2021 1 commit
- [hybrid performance] Grad fuse for gradient merge under pipeline mode (#35004) · 4d9b2d6d
  Authored by Yuang Liu on Aug 20, 2021
  
18 Aug, 2021 1 commit
- [Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16... · a9673b44
  Authored by WangXi on Aug 18, 2021
```
[Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (#34965)
```
30 Jul, 2021 1 commit
- add trainer desc config to distributed strategy (#34457) · e6aacd1e
  Authored by wangguanqun on Jul 30, 2021
```
* add trainer desc config to distributed strategy

* code style modified
```
08 Jul, 2021 1 commit
- Distributed Automatic SParsity with Fleet (#33558) · 86cb3fb8
  Authored by Ming-Xu Huang on Jul 08, 2021
  
01 Jul, 2021 1 commit
- gradient scale (#33862) · 57aabbab
  Authored by Yuang Liu on Jul 01, 2021
  
21 Jun, 2021 1 commit
- add sync calc stream and add ut for fuse on gpu (#33580) · e0e0c0fa
  Authored by Yuang Liu on Jun 21, 2021
  
10 Jun, 2021 1 commit
- dp c_allreduce_sum_fusion op (#33169) · 003b4616
  Authored by Baibaifan on Jun 10, 2021
  
09 Jun, 2021 1 commit
- cache core.globals() to speed up dynamic graph (#32098) · b4954ce4
  Authored by wanghuancoder on Jun 09, 2021
```
* modify API nn.Bilinear's doc, test=develop
```
07 Jun, 2021 1 commit
- fix too-many-format-args (#33353) · 599e9e48
  Authored by zhangchunle on Jun 07, 2021
  
26 May, 2021 1 commit
- [Tensor Parallelism] split fix bug (#33015) · 20b9be65
  Authored by JZ-LIANG on May 26, 2021
  
17 May, 2021 1 commit
- [HybridParallel]Fix precision problem of model parallel (#32897) · c809530e
  Authored by ShenLiang on May 17, 2021
```
* fix precision of mp

* fix bug of seed

* fix dp

* print group
```
11 May, 2021 1 commit
- Support control flow in DataParallel (#32826) · 298f210d
  Authored by ShenLiang on May 11, 2021
```
* fix find_unused_parameters default value
```
08 May, 2021 1 commit
- Add raw program meta optimizer (#32597) · c1c18b08
  Authored by lilong12 on May 08, 2021
```
* add raw program, test=develop
```
06 May, 2021 1 commit
- update 2.0 public api in distributed (#32695) · 70eb435c
  Authored by zhiboniu on May 06, 2021
  
25 Apr, 2021 1 commit
- Fix the bug in mp (#31996) · 976fe6f9
  Authored by lilong12 on Apr 25, 2021
```
* update
```
20 Apr, 2021 1 commit
- [Sharding]: update config DOC (#32299) · e3489013
  Authored by JZ-LIANG on Apr 20, 2021
```
* sharding: update config DOC

* update pipeline config

* sharding update doc
```
17 Apr, 2021 1 commit
- [Hybrid Parallel] Add model parallel support in dygraph (#32248) · 66d46221
  Authored by ShenLiang on Apr 17, 2021
```
* add model parallel support in dygraph
```
01 Apr, 2021 1 commit
- Support control flow in DataParallel (#31625) · 8460698b
  Authored by ShenLiang on Apr 01, 2021
```
* support control flow

* support sync_parameters_buffers

* fix the bug of sparse embedding
```
24 Feb, 2021 1 commit
- align the default value of some configuration for fleet to that of single cards (#30740) · dc8dfba3
  Authored by lilong12 on Feb 24, 2021
```
* update, test=develop
```
01 Feb, 2021 1 commit
- Fleet distributed strategy support pure fp16 (#30754) · 31ed9c9e
  Authored by WangXi on Feb 01, 2021
  
12 Jan, 2021 1 commit
- Recompute Offload (#30233) · 75936d83
  Authored by JZ-LIANG on Jan 12, 2021
  
09 Dec, 2020 1 commit
- Rebuild group automatically in dynamic graph distributed (#29255) · 2ef9e0e2
  Authored by ShenLiang on Dec 09, 2020
```
* add tensor_indices in AssignGroupBySize

* add rebuild group in reducer
```
01 Dec, 2020 2 commits
- Change the api of DataParallel and Fleet (#29224) · 46b73e6c
  Authored by ShenLiang on Dec 01, 2020
- test=develop, fix doc (#29200) · cc9c6196
  Authored by 123malin on Dec 01, 2020
```
* fix fleet api doc
```
26 Nov, 2020 1 commit

[sharding] doc, api, bug fixed (#28983) · 0dadacc4

Authored by JZ-LIANG on Nov 26, 2020

* add lars to fleet meta optimizer

* add lamb to proto

* add lamb to fleet meta optimizer

* fixed syntax bug

* fixed syntax bug

* fixed syntax error in lamb, add config setter of lamb in distributed_strategy

* trigger unittest to rerun

* add new unittest func for lamb

* revise unittest for lars and lamb

* revise dgc meta unittest

* revise lars document in distribute_strategy

* revise lars lamb document in distributed_strategy.py

* revise lars lamb document in distributed_strategy.py

* add weight decay exclude logic to lars

* restore optimizer.py

* restore optimizer.py as develop except lars

* add epsilon and exclude fn to distributed_strategy

* add lars epsilon

* revise unittest for fleet lars and lamb

* revise lars lamb unittest for CI coverage

* revise lars argument api

* revise lars argument api

* revise lars argument api

* revise api doc of lars

* fix op role

* add sharding save and add_sync_comm_for_test function

* add comm_analyse to utils

* revise sharding_utils

* add sharding saving unittest

* revise sharding utils for unittest

* revise sharding en doc

* update sharding utils api

* add doc for sharding

* fixed bug in sharding var size count

* update varsize count in sharding

* fix sharding num_nccl_comm

* Revert "fix sharding num_nccl_comm"

This reverts commit d51587c15e9323acf226ddd36154275f0d1daf76.

24 Nov, 2020 1 commit

Upgrade string literals to raw string (#28989) · 3815d7aa

Authored by Leo Chen on Nov 24, 2020

* upgrade comment string to raw string

* fix string in

* fix string with ' '

* revert update on comments

* upgrade only necessary

* fix sample code checker

* fix comments with '''

26 Oct, 2020 1 commit
- add sharding strategy in fleet(#27900) · 81244fbf
  Authored by mapingshuo on Oct 26, 2020
```
* add sharding
```
22 Oct, 2020 1 commit
- refine auto strategy, test=document_fix (#28211) · 11acbfae
  Authored by WangXi on Oct 22, 2020
  
12 Oct, 2020 1 commit
- fleet combine amp dgc recompute meta optimizer (#27643) · 0a1862d1
  Authored by WangXi on Oct 12, 2020
  
28 Sep, 2020 1 commit
- Get final strategy (#27602) · 4e8f18ab
  Authored by Dong Daxiang on Sep 28, 2020
```
* add get final strategy for user to print final strategy
```
25 Sep, 2020 1 commit
- fleet2.0 add fp16 grad compression (#27480) · e550fc02
  Authored by WangXi on Sep 25, 2020
  
16 Sep, 2020 1 commit
- add adaptivelsgd in meta_optimizer (#27289) · 54b81fa3
  Authored by ShenLiang on Sep 16, 2020
```
* add adaptivelsgd

* Todo fix the code to avoid the conflict.
```
14 Sep, 2020 1 commit
- remove auto mode from localsgd optimizer (#27237) · 2b6a5793
  Authored by ShenLiang on Sep 14, 2020
```
* rm auto from localsgd
```
09 Sep, 2020 1 commit
- 【paddle.fleet】refine launch and distributed repr string for print (#27093) · f7d08b7d
  Authored by Dong Daxiang on Sep 09, 2020
```
* refine launch and distributed repr string for print
```

PaddlePaddle / Paddle synced successfully about 1 year ago
