Commits · f6db9806061ec1d5522d378d41762c1cf4b3294f · Crayon鑫 / Paddle

Sep 16, 2021 · 2 commits
- [hybrid] Fix mp multi gradient clip prob (#35713) · a4eadd15
  Committed by Yuang Liu on Sep 16, 2021
- [hybrid] remove scale op in insert_scale_loss_grad_ops (#35775) · 02b0be08
  Committed by WangXi on Sep 16, 2021
Sep 15, 2021 · 2 commits
- fix bugs of PR 35401 (#35746) · 09eaa7d7
  Committed by Haohongxiang on Sep 15, 2021
- [hybrid] out data parallel as optimizer sharding parallel (#35593) · 78465703
  Committed by WangXi on Sep 15, 2021
Sep 14, 2021 · 2 commits

- Add solutions to PyLayer which is unsupported in DataParallel (#35401) · d483b8c0
  Committed by Haohongxiang on Sep 14, 2021

* Add solutions to PyLayer which is unsupported in DataParallel

* modify note format for parallel.py

* modify docs of dataparallel

* add docs of dp with pylayer

* modify docs format

* modify example format

* change example of dp with pylayer

* add unittest for dp with pylayer

* modify ut

* merge latest codes

* update

* modify for CI-Coverage

* modify text-indent


- Fix RawProgramOptimizer bug (#35704) · 0f741880
  Committed by Zeng Jinle on Sep 14, 2021
```
* fix raw optimizer gm

* update

* update ut
```

Sep 13, 2021 · 2 commits
- [HybridParallel]Fix scaler bug in pipeline_parallel/model_parallel (#35556) · 2bb44317
  Committed by ShenLiang on Sep 13, 2021
```
* support grad group

* fix single card condition
```
- support hybrid parallel inference helper class (#35576) · dc3c845a
  Committed by Guoxia Wang on Sep 13, 2021
```
* support hybrid parallel inference helper class
```
Sep 10, 2021 · 2 commits
- [Dygraph 4D Parallel] Sharding Support MP-PP-DP Parallelism (#35580) · 2c922d63
  Committed by JZ-LIANG on Sep 10, 2021
```
* sharding support dp

* sharding support mp

* sharding support pp
```
- fix bug of recompute in hybridparallel (#35588) · d53e567a
  Committed by ShenLiang on Sep 10, 2021
Sep 08, 2021 · 2 commits

- [Auto Parallel] Integrate all modules (#35483) · 12155358
  Committed by Yulong Ao on Sep 08, 2021

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* add dist

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update, test=develop

* update

* update

* update

* update

* update

* update, test=develop

* update, test=develop

* update

* update

* delete unused proto

* restore op_desc

* restore type_defs

* update var_desc

* remove dimss_mapping for proto_pybind

* update interface.py

* update framework.py

* update

* update

* add auto_parallel dir

* mv to paddle.distributed

* add shard_xx api

* add distributed attrs for var

* add ut, test=develop

* [WIP] Add the auto completion feature and related codes

* [WIP] Improve the auto completion and related codes

* [WIP] Make the auto completion to support data-parallel

* [WIP] Make the completion support mp and dp+mp

* [WIP] Refactor auto completion unit test for MLP

* [WIP] Refactor the implementation of DistributedOperatorImpl

* [WIP] Improve dims_mapping update rule and fix a bug

* [WIP] Support auto completion for one transformer decoder layer

* [WIP] Add a minor change

* [WIP] Fix a bug within the unit test

* Shard XShape tensor, add embedding completion and refactor code

* Add the distributed_operators dir to setup.py.in

* Improve the completion process and add the unittest for gpt

* fix process_mesh ut

* fix process_mesh ut

* update

* update, test=develop

* Add support for automatically completing distributed attrs of special ops

* update

* update

* update

* fix doc sample codes, test=develop

* improve coverage, test=develop

* add static_mode check, test=develop

* Model the cluster for cost model and physical mapping

* update, test=develop

* add set_placement, test=develop

* Add the check to make sure the candidate tensors' size is greater than zero

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update doc, test=develop

* update, test=develop

* Auto mark dist attrs annotated by user

* update ndarray to nested list, test=develop

* update, test=develop

* Add auto-completion module for auto-parallel (based on PR#33804)

* Remove unnecessary files

* Remove unrelated files for the auto completion pr

* Update the unit test to improve the coverage

* Modify codes based on reviews

* Minor changes for CI

* Improve some codes based on new comments

* Fix bugs caused by shallow copy in attributes.py
* Improve amend_distributed_attr_for_program in context.py
* Other changes for weihang's comments

* support shard reader

* support shard reader

* add parallel mode

* update process mesh

* add method to compute comm_group

* implement dist_embedding forward func

* implement dist matmul forward func

* implement dist reshape forward func

* add transpiler framework

* add transpiler forward

* implement transpiler forward

* implement transpiler backward & update

* add process

* add unittest

* chmod

* chmod

* chmod

* update unittest

* add unittest for gpt

* remove unused print

* rename transpiler --> partitioner

* rename transpiler --> partitioner

* chmod

* chmod

* bug fixed

* remove amp function

* update case for dp mode

* update case for dp mode

* [Auto Parallel] Integrate all parts with the newest code

* Integrate all parts of auto parallel and improve codes

* Integrate all parts by AutoParallelizer
* Add unit test for AutoParallelizer
* Improve auto completion module for pipeline parallel
* Add support for matmul_v2 in dist_matmul
* Correct the typo "stratergy" to "strategy"

* Modify distributed_strategy.proto to conform to the main stream

* Restore parts of distributed_strategy to conform to the develop branch

Co-authored-by: sandyhouse <lilong12@baidu.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>


- Enable program passes on Fleet APIs (#34955) · 5f369881
  Committed by Zeng Jinle on Sep 08, 2021

* add fleet api for program pass

* turn on apply pass for CI test

* fix disable fuse_all_optimizer bug

* try to test ci

* fix CI

* fill unspecified op role

* fix fuse_allreduce

* add ut to improve coverage

* remove useless change

* improve c++ coverage

* follow some comments

* test ir pass pipeline

* update doc

* reduce ut time again


Sep 01, 2021 · 2 commits
- [HybridParallel]Support finetune model for PipelineParallel (#35287) · 264ff9ef
  Committed by ShenLiang on Sep 01, 2021
```
* add cache for send_recv

* add eval_batch for pipeline

* add eval batch for pipelineparallel

* add style code
```
- bugfix for mp accuracy (#35326) · 7f17f9a0
  Committed by JZ-LIANG on Sep 01, 2021
Aug 25, 2021 · 1 commit
- [hybrid npu] fix npu found_finite in hybrid (#35134) · f609ca37
  Committed by WangXi on Aug 25, 2021
Aug 20, 2021 · 1 commit
- [hybrid performance] Grad fuse for gradient merge under pipeline mode (#35004) · 4d9b2d6d
  Committed by Yuang Liu on Aug 20, 2021
Aug 18, 2021 · 2 commits
- [Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16... · a9673b44
  Committed by WangXi on Aug 18, 2021
```
[Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (#34965)
```
- [CPU-PSLIB] Add consistency inspection of use_var_list and data_generator... · 209075a4
  Committed by Fan Zhang on Aug 18, 2021
```
[CPU-PSLIB] Add consistency inspection of use_var_list and data_generator data, test=develop (#34463)
```
Aug 17, 2021 · 1 commit
- [NPU]Adamw skip update for npu (#34897) · b4474fb4
  Committed by Roc on Aug 17, 2021
Aug 13, 2021 · 1 commit
- [Bug-Fix]fix bug of py36 import utils (#34873) · 507ea06f
  Committed by ShenLiang on Aug 13, 2021
```
* fix bug of py36 import
```
Aug 12, 2021 · 1 commit
- [HybridParallel]Add Recompute for PipeLineParallel (#34607) · 589d13c5
  Committed by ShenLiang on Aug 12, 2021
```
* add recompute for pp

* add recompute offload

* add recompute partition
```
Aug 11, 2021 · 3 commits
- [hybrid] pp+dp support fp16 allreduce (#34762) · 4d7af372
  Committed by WangXi on Aug 11, 2021
- [HybridParallel] Support save/load for PipeLineParallel (#34768) · 88f2f4a4
  Committed by ShenLiang on Aug 11, 2021
```
* add save/load for pipelineparallel

* add save/load
```
- Optimize fused allreduce in raw program (#34509) · 4d2994cb
  Committed by Yuang Liu on Aug 11, 2021
Aug 10, 2021 · 2 commits
- [hybrid] refine sharding code (#34678) · a1603797
  Committed by WangXi on Aug 10, 2021
- kill all procs on exiting (#34741) · 84eb6757
  Committed by kuizhiqing on Aug 10, 2021
Aug 09, 2021 · 1 commit
- Recompute: fix bug with transformer attention mask (#34664) · 0dff82c2
  Committed by JZ-LIANG on Aug 09, 2021
Aug 06, 2021 · 2 commits
- fix bug of inplace (#34665) · fa16c21f
  Committed by ShenLiang on Aug 06, 2021
- del wait in sharding for npu (#34637) · ce733495
  Committed by Baibaifan on Aug 06, 2021
Aug 05, 2021 · 2 commits
- rm detach (#34644) · 6c8a10a2
  Committed by ShenLiang on Aug 05, 2021
- [HybridParallel]Fix bug of p2p for partial_send/recv (#34615) · 4cc3d9a2
  Committed by ShenLiang on Aug 05, 2021
```
* fix bug of p2p for partial

* fix error
```
Aug 04, 2021 · 1 commit
- Elastic as module (#34572) · 1f76a2f7
  Committed by kuizhiqing on Aug 04, 2021
Aug 03, 2021 · 2 commits
- [hybrid] remove the using of global ring in hybrid parallel (#34525) · 56b7ebbc
  Committed by WangXi on Aug 03, 2021
- [HybridParallel] Support segment for PipelineParallel (#34529) · 9b6c7eb9
  Committed by ShenLiang on Aug 03, 2021
```
* add layer segment

* add segment for transformer

* add utest
```
Aug 02, 2021 · 2 commits
- [HybridParallel]Support 1f1b for PipelineParallel (#34483) · 9e0bb91c
  Committed by ShenLiang on Aug 02, 2021
```
* support 1f1b for pipeline

* add utest

* add send_partial/recv_partial

* support amp for pp

* fix logger
```
- [NPU] fix npu pipeline comm init (#34466) · 41e2d413
  Committed by WangXi on Aug 02, 2021
Jul 30, 2021 · 3 commits
- add trainer desc config to distributed strategy (#34457) · e6aacd1e
  Committed by wangguanqun on Jul 30, 2021
```
* add trainer desc config to distributed strategy

* code style modified
```
- fix force kill for elastic (#34488) · ba19398e
  Committed by kuizhiqing on Jul 30, 2021
```
* fix force kill for elastic
```
- all reduce fusion for sharding, test=develop (#34480) · 423ea978
  Committed by Yuang Liu on Jul 30, 2021
Jul 29, 2021 · 1 commit
- fix the allreduce fused bug, test=develop (#34446) · b56dbe08
  Committed by Yuang Liu on Jul 29, 2021

Crayon鑫 / Paddle is in sync with the fork's source project