Commits · cfee9c13dcb0c983dd5857913cf8c15696b26f67 · PaddlePaddle / Paddle

04 Nov, 2022 1 commit
- [cherry-pick2.4]for CodeStyle (#47608) · cfee9c13
  Committed by Ligoml on Nov 04, 2022
```
* only run pre-commit

* only run pre-commit
```
03 Nov, 2022 1 commit
- support unbalanced data for pipeline (#47199) (#47569) · d4bf8b1a
  Committed by ShenLiang on Nov 03, 2022
```
* add unbalanced data

* fix utest
```
01 Nov, 2022 1 commit
- add missing scale parameter (#47522) · 5ffd4afe
  Committed by sneaxiy on Nov 01, 2022
29 Oct, 2022 1 commit
- [Cherry-pick][Release/2.4]Add fused_allreduce_gradients_with_group for PPFleetX (#47458) · df64e790
  Committed by sneaxiy on Oct 29, 2022
```
* reformat hybrid_parallel_util.py by black

* add fused_allreduce_gradients_with_group

* add scale

* fix ci
```
24 Oct, 2022 3 commits

Fix virtualpp with mp/recompute bugs (#47242) (#47249) · 9780eb72
Committed by Yuang Liu on Oct 24, 2022

Support BF16 training for sharding (#46846) (#47246) · 5c85f1a7

Committed by Ghost Screaming on Oct 24, 2022

* Fix bug of reduce_sum op: when input.numel() > INT32_MAX, its result is wrong.

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* Support bfloat16 type for reducer and sharding.

* Fix some bug.

* Polish code.

* Polish code.

* Add bfloat16 datatype in fill_grad kernels.

Co-authored-by: sneaxiy <sneaxiy@126.com>

fix send for old dygraph mode by passing use_calc_stream to the send op (#47110) (#47201) · 82f1e1b7
Committed by Roc on Oct 24, 2022

21 Oct, 2022 1 commit
- support qat in sharding stage2 (#47169) (#47240) · 281891c5
  Committed by Haohongxiang on Oct 21, 2022
19 Oct, 2022 1 commit

Add enable_partial_send_recv switch in pipeline_configs (#46992) (#47083) · 1d015f12

Committed by Ghost Screaming on Oct 19, 2022

* Fix bug of reduce_sum op: when input.numel() > INT32_MAX, its result is wrong.

* Support an allow_partial switch, which can be configured in pipeline_configs. If the tensors sent from different hosts are not the same, they should not be sent partially and then concatenated into a whole tensor.

* Change name allow_partial to enable_partial_send_recv.

* Add global variable _enable_partial_send_recv
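
The reasoning behind the switch can be illustrated with a small, framework-free sketch. It is not Paddle's implementation: `partial_send`, `partial_recv_concat`, and `transfer` are hypothetical helpers, tensors are modelled as plain Python lists, and the sketch assumes every sender rank contributes one equal-sized slice that the receivers concatenate in rank order.

```python
from typing import List


def partial_send(tensor: List[float], rank: int, world: int) -> List[float]:
    """Return the slice that `rank` would send when partial send is enabled (hypothetical helper)."""
    assert len(tensor) % world == 0, "sketch assumes an evenly divisible length"
    chunk = len(tensor) // world
    return tensor[rank * chunk:(rank + 1) * chunk]


def partial_recv_concat(slices: List[List[float]]) -> List[float]:
    """Concatenate the slices received from every sender rank, in rank order."""
    rebuilt: List[float] = []
    for piece in slices:
        rebuilt.extend(piece)
    return rebuilt


def transfer(tensors_per_rank: List[List[float]],
             enable_partial_send_recv: bool) -> List[List[float]]:
    """Simulate what each receiver ends up with, with or without partial send/recv."""
    world = len(tensors_per_rank)
    if enable_partial_send_recv:
        # Each rank sends only its own slice; receivers rebuild one shared tensor.
        slices = [partial_send(t, r, world) for r, t in enumerate(tensors_per_rank)]
        rebuilt = partial_recv_concat(slices)
        return [list(rebuilt) for _ in range(world)]
    # Full send: every receiver gets its peer's whole tensor unchanged.
    return [list(t) for t in tensors_per_rank]


# Identical tensors on every sender: the partial path reconstructs them exactly.
same = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
same_partial = transfer(same, enable_partial_send_recv=True)

# Different tensors per sender: the partial path mixes slices from different tensors.
diff = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
diff_partial = transfer(diff, enable_partial_send_recv=True)
diff_full = transfer(diff, enable_partial_send_recv=False)
```

With identical inputs the partial path is lossless and sends less data per rank. With differing inputs the rebuilt tensor matches none of the originals, which is the case where the commit message says partial sending should be turned off.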

18 Oct, 2022 2 commits

Cherry pick for sharding (#47061) · 5b642140

Committed by Yuang Liu on Oct 18, 2022

* [dygraph sharding] Overlap the reduce and the calculation for sharding stage 2. (#46495)

* [dygraph sharding stage 2] sharding broadcast overlap (#46656)

* Multi groups for broadcast of sharding stage 2 (#46894)

[cherry-pick] Fix perf issues of mp/pp/fuse in eager mode (#47071) · b84edd90

Committed by Haohongxiang on Oct 18, 2022

* [Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (#46116)

* [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780)

* update

17 Oct, 2022 1 commit

[Cherry-pick] Collective communication APIs (#46922) · 5fba2a98

Committed by Wen Sun on Oct 17, 2022

* Support both use_calc_stream and sync_op in send recv APIs (#46023)

* Support both use_calc_stream and sync_op in allgather API (#46295)

* Support both use_calc_stream and sync_op in collective communication API (#46761)

* Move group and all reduce from collective to communication (#45848)

* Completes bfloat16 dtype for collective api in eager mode (#45844)

* Fix collective APIs cannot be recognized when building docs (#46962)

Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>

11 Oct, 2022 1 commit

Cherry pick for dygraph pp (#46876) · 9cc3f69f

Committed by Yuang Liu on Oct 11, 2022

* bug fix for virtual pipeline parallel (#45922)

* dont wait for send op under dygraph pp (#46209)

* [interleave pp] sync recv for 1f1b (#46399)

* [dygraph pp] all sync for allgather partial (#46483)

27 Sep, 2022 1 commit
- change use_calc_stream to sync_op (#46182) (#46493) · 8089a1fb
  Committed by LiYuRio on Sep 27, 2022
22 Sep, 2022 2 commits
- logger manager (#45909) (#46087) · 7eb046c7
  Committed by Roc on Sep 22, 2022
```
Unify the logger manager in FleetAPI.
Hide APIs under distributed/utils that users don't need.
```
- [Dygraph] Fix bugs of mp in eager mode (#46303) (#46396) · 372505be
  Committed by Haohongxiang on Sep 22, 2022
```
* fix bugs of mp

* fix bugs of mp

* update

* update

* fix bug
```
20 Sep, 2022 2 commits

[PolishComments] Polish some code comments (#46032) (#46261) · 42e56f65
Committed by HongyuJia on Sep 20, 2022
```
* polish code comments

* polish data_device_transform.cc
```

[Cherry-Pick][AutoParallel] change import way and fix strategy (#46270) · c43ebfcf

Committed by zhaoyingli on Sep 20, 2022

* [Auto Parallel] Change the import way of Auto Parallel (#46115)

* fix strategy (#46256)

* [Auto Parallel] performance improvement for Sharding-DP hybrid parallelism (#46180)

* remove no need grad allreduce communication when sharding-dp

* remove no need grad allreduce communication when sharding-dp

* bugfix

* bugfix

* bugfix

Co-authored-by: Yulong Ao <aoyulong@baidu.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>

19 Sep, 2022 3 commits
- Recompute unify incubate (#46073) (#46210) · 4bced24a
  Committed by wuhuachaocoding on Sep 19, 2022
- refactor mp. (#45803) (#46121) · e5dc9d61
  Committed by wuhuachaocoding on Sep 19, 2022
```
* refactor mp.

* update setup.py.

* update mp_layers.py for compatibility.

* add documents for mp_layers.py

* update init.py

* update collective.py.

* update.

* update mp_ops.py

* update.

* update code style.

* update code style.
```
- rename fleetx, develop=document_fix (#46141) · 7a6db0a3
  Committed by ShenLiang on Sep 19, 2022
09 Sep, 2022 1 commit
- fix dygraph pp + mp nan after async send/recv (#45869) · 5d7e1c91
  Committed by Yuang Liu on Sep 09, 2022
07 Sep, 2022 2 commits
- [dygraph hybrid pp for interleave] Save/Load for interleaved pipeline. (#45797) · a9cc0274
  Committed by Yuang Liu on Sep 07, 2022
- [Auto Parallel] Support Iterable dataset for auto parallel (#45518) · b77fa1d9
  Committed by caozhou on Sep 07, 2022
```
* support iterable dataset for auto parallel

* add split_data proto

* fix unittest bug

* fix recompute bug

* update cmake
```
06 Sep, 2022 1 commit
- [dygraph hybrid pp for interleave] The interleave scheduler for pipeline parallel (#45497) · 72b5b5bf
  Committed by Yuang Liu on Sep 06, 2022
02 Sep, 2022 1 commit
- update some input for pp and moe about recompute. (#45628) · 4c780311
  Committed by wuhuachaocoding on Sep 02, 2022
01 Sep, 2022 1 commit

ps optimizer default config (#45563) · ae217373

Committed by wangguanqun on Sep 01, 2022

* config

* fix unittest

* zero init & cache & patch config

* add barrier to save and load

* add unittest

26 Aug, 2022 3 commits
- [dygraph hybrid pp for interleave] Virtual pipeline layer forward function (#45444) · 81eaa97d
  Committed by Yuang Liu on Aug 26, 2022
- [Eager] delete final state pre-name (#45306) · 126940b3
  Committed by wanghuancoder on Aug 26, 2022
- [dygraph hybrid pp for interleave] Virtual pp stage layer split (#45402) · 04c15e79
  Committed by Yuang Liu on Aug 26, 2022
23 Aug, 2022 2 commits
- [AutoParallel] Add Quant Pass (#44877) · 61bc016c
  Committed by zhaoyingli on Aug 23, 2022
```
* add quant pass
```
- [FleetExecutor] Using program to be the only interface of TaskNode (#43869) · 9ccdb5fa
  Committed by LiYuRio on Aug 23, 2022
16 Aug, 2022 1 commit
- [Fleet] Reconstruct of Fleet API in Dygraph Mode (#44922) · c17e6af8
  Committed by Haohongxiang on Aug 16, 2022
```
* reconstruct_of_fleet_api

* update
```
15 Aug, 2022 1 commit

refactor fleet. (#44833) · 8636d2a2

Committed by wuhuachaocoding on Aug 15, 2022

* refactor fleet.

* refactor fleet.py.

* update fleet/__init__.py.

* update fleet.py

* update code style.

* update fleet

* update fleet

* update fleet

* update fleet

* update model.py

* update fleet.

* update __init__.py

* update fleet.

* update fleet.

* update fleet

* update fleet

* update fleet

* update fleet.

* update optimizer.py

* update optimizer

* update fleet.py

* update scaler.py

* update setup.py.in

13 Aug, 2022 1 commit

fl-ps: support split sparse params in local & remote (#44864) · 3f5c405f

Committed by ziyoujiyi on Aug 13, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .

* fl-ps v1.0

* .

* support N + N mode

* .

* .

* .

* .

* delete print

* .

* .

* .

* .

* fix bug

* .

* .

* fl-ps with coordinator ready

* merge dev

* update message parse only

* update fl client scheduler

* fix bug

* update multithreads sync

* fix ci errors

* update role_maker.py

* update role_maker.py

* fix ci error: windows py import error

* fix ci error: windows py import error

* fix windows ci pylib import error

* add dump fields & params

* try to fix windows import fleet error

* fix ps FLAGS error

* fix logging risk

* fix logging possible risk

* write trainer_desc file

* support split sparse params in local & remote

* fix import paddle.fluid.core.PSGPU

* fix import paddle.fluid.core.PSGPU

* add remote_sparse & local_sparse config

* fix unittest

* fix test_dist_fleet_geo table error

* fix PADDLE_ENFORCE error

* fix other's pr conflict

12 Aug, 2022 1 commit
- change default log level (#45093) · 34234282
  Committed by hong on Aug 12, 2022
10 Aug, 2022 1 commit
- [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute (#44737) · 81d6fa6c
  Committed by Aurelius84 on Aug 10, 2022
```
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute

* add unittest for inference predictor
```
09 Aug, 2022 2 commits

refine save/load interface for distributed cpups (#44862) · 7b29c89b

Committed by zhaocaibei123 on Aug 09, 2022

* save load

* save load

* add unittest

* first commit

* second commit

* third commit

* remove SaveLocalFS in memory sparse table

* save dense param

* update

* push slot

* fix push show clk: int -> float

* add unittest

* fix sample

* unittest

* add AsExtra for op

* unittest

* modify fs.py

* modify fs.py

* fix some bugs

* add dataset hdfs config

* local change

* dataset use different hadoop ugi/fs_name

* add

* fix conflict

* fix

* remove logs

* code style

* fix

* code style

* code style

* fix

* code style

* save_dense_param

* fix

* fix

* fix

* fix

* change momentum in dense optimizer

* fix

* fix

* change fluid => paddle.static

* remove some unused code

Co-authored-by: esythan <esythan@126.com>

[model parallel] enable mp to use fused linear (#44968) · e84250e8
Committed by Yuang Liu on Aug 09, 2022

08 Aug, 2022 1 commit
- fix_bugs_of_sharding (#44982) · ffd8adca
  Committed by Haohongxiang on Aug 08, 2022

PaddlePaddle / Paddle: last synced successfully more than 1 year ago
