Commits · 2992f78795b9279cfa35eddb7e38cc730e1ab100 · PaddlePaddle / Paddle

18 May, 2023 1 commit
- [cherry-pick 2.5][Zero-Dim] update 0D tensor API en doc (#53855) · 2992f787
  Authored by zhouweiwei2014 on May 18, 2023
```
* [Zero-Dim] update 0d tensor api en doc, test=document_fix

* [BUG] fix windows kernel dispatch of _lzcnt bug (#53728)
```
  2992f787
09 May, 2023 4 commits

[Cherry-pick 2.5][Zero-Dim] paddle.to_tensor support 0D (#53599) · 2aefc45b

Authored by zqw_1997 on May 09, 2023

* fix doc errors, test=allcase

* conflict

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* fix doc errors, test=allcase

* fix the to_tensor error

2aefc45b

[Zero-Dim] Support p_norm/reduce_sum_p output 0D (#53421) (#53618) · 3ffe8f36
Authored by zhouweiwei2014 on May 09, 2023

3ffe8f36

[cherry-pick 2.5][Zero-Dim] support paddle.sum/mean/loss api output 0D (#53601) · b6e23774

Authored by zhouweiwei2014 on May 09, 2023

* [Zero-Dim] make functools.reduce safer with an initial value, to support empty lists (#53182)

* [Zero-Dim] support 0d tensor for shape and squeeze onednn kernel (#52832)

* support 0d tensor for shape and squeeze onednn kernel

* set python api for shape op ut

* [Zero-Dim] distributed scatter/all_to_all support input 0D tensor (#53186)

* [Zero-Dim] Support paddle.sum/mean/loss api output 0D,test=allcase (#52739)

* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily (#53382)

* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily

* Add unittest

* [CINN Support 0D-Tensor] CINN hack squeeze2 with trick temporarily (#53454)

* fix test_autograd_dynamic (#53473)
Co-authored-by: zhwesky2010 <zhouwei25@baidu.com>

---------
Co-authored-by: YangQun <qun.yang@intel.com>
Co-authored-by: HongyuJia <jiahongyu@baidu.com>
Co-authored-by: HydrogenSulfate <490868991@qq.com>

b6e23774
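
The `functools.reduce` item in the commit above touches a common pitfall when computing an element count from a shape: without an initial value, `reduce` raises `TypeError` on an empty sequence, and an empty shape `[]` is exactly what a 0-D tensor has. A minimal plain-Python sketch of the idea follows; the helper name `numel` is hypothetical and this is not Paddle's code.

```python
from functools import reduce
import operator


def numel(shape):
    """Element count for a shape; an empty shape (0-D) holds one element."""
    # The third argument is the initial value; it makes reduce safe on [].
    return reduce(operator.mul, shape, 1)


print(numel([2, 3, 4]))
print(numel([]))

try:
    reduce(operator.mul, [])  # no initial value, empty input
except TypeError as exc:
    print("without an initial value:", exc)
```

Using `1` as the initial value is natural here because it is the identity for multiplication, so non-empty shapes are unaffected.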

[Cherry-pick] zero-dim: support 0-D for getitem/setitem (#53441) · 767e7b3f

Authored by JYChen on May 09, 2023

* support 0-D output and 0-D as indice in __getitem__

* fix tests

* fix inference and UT

* add unittest for setitem

* fix xpu test

* fix xpu 0-d

* fix right value is 0d and index is List/Tensor

* Hack__getitem__ from 0-d to 1-d with FLAGS_set_to_1d

* change PHI_DECLARE_xxx to DECLARE_xxx since the change not merged to 2.5

* hack 1-D tensor to Scalar

* throw warning at __getitem__, not slice_utils

767e7b3f
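
To make the 0-D indexing behaviour described above concrete, here is a shape-only sketch in plain Python. It assumes that each integer index removes one dimension, so indexing every dimension with an integer yields a 0-D result (shape `[]`), and that the `FLAGS_set_to_1d` hack pads such a result back to shape `[1]`. That reading of the flag is inferred from the commit message only; the function `getitem_shape` and its `set_to_1d` parameter are hypothetical.

```python
def getitem_shape(shape, index, set_to_1d=False):
    """Result shape of tensor[index] for a tuple of ints and slices."""
    out = []
    for dim, idx in zip(shape, index):
        if isinstance(idx, slice):
            # A slice keeps the dimension, with the sliced length.
            out.append(len(range(*idx.indices(dim))))
        # An integer index contributes no output dimension.
    out.extend(shape[len(index):])  # trailing dimensions are untouched
    if not out and set_to_1d:
        return [1]  # assumed legacy behaviour: pad a 0-D result to [1]
    return out


print(getitem_shape([3, 4], (1, 2)))
print(getitem_shape([3, 4], (1, 2), set_to_1d=True))
print(getitem_shape([3, 4], (slice(0, 2), 1)))
```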

06 May, 2023 1 commit
- [cherry-pick]add flash randomness control and add scaled_dot_product_attention (#53518) · 1d23e0bb
  Authored by zhangkaihuo on May 06, 2023
```
att, cherry-pick: #52902 #53113
```
  1d23e0bb
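
For readers unfamiliar with the operation named in this commit, the textbook definition of scaled dot-product attention is softmax(QKᵀ / √d) · V. The sketch below implements that formula on nested Python lists. It is only an illustration of the math, not Paddle's fused kernel, and it omits masking and dropout (the part the randomness control concerns).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v):
    """q: [n, d], k: [m, d], v: [m, dv] as nested lists; returns [n, dv]."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        # Scaled similarity of this query with every key.
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


q = [[1.0, 0.0]]
k = [[1.0, 0.0], [1.0, 0.0]]
v = [[1.0], [3.0]]
print(scaled_dot_product_attention(q, k, v))
```

With two identical keys the attention weights are equal, so the output is the plain average of the two value rows.
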
27 April, 2023 1 commit

[cherry-pick2.5] [Zero-Dim] Support... · b6996598

Authored by zhouweiwei2014 on April 27, 2023

[cherry-pick2.5] [Zero-Dim] Support all/any/min/max/prod/logsumexp/amax/amin/some loss output 0D (#53192)

b6996598
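
The "output 0D" change for reductions can be pictured through shapes alone: reducing over every axis without `keepdim` leaves no dimensions, so the result has shape `[]` rather than `[1]`. The following is a shape-only sketch under that assumption; `reduce_shape` is a hypothetical helper, not a Paddle function.

```python
def reduce_shape(shape, axis=None, keepdim=False):
    """Output shape of a reduction such as sum/max over the given axes."""
    ndim = len(shape)
    if axis is None:
        axes = set(range(ndim))  # reduce over everything
    else:
        wanted = [axis] if isinstance(axis, int) else axis
        axes = {a % ndim for a in wanted}  # normalise negative axes
    if keepdim:
        return [1 if i in axes else d for i, d in enumerate(shape)]
    return [d for i, d in enumerate(shape) if i not in axes]


print(reduce_shape([2, 3]))                # full reduction
print(reduce_shape([2, 3], axis=0))        # partial reduction
print(reduce_shape([2, 3], keepdim=True))  # rank preserved
```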

24 April, 2023 2 commits

Revert "Cherry pick getitem/setitem 0d (#53125)" (#53265) · 50f61213
Authored by JYChen on April 24, 2023
```
This reverts commit a79c04f3.
```
50f61213

[CherryPick] [BugFix] wrong match between depend and c_allreduce_sum (#53271) · bfd1dd77

Authored by kangguangli on April 24, 2023

* fix bug: wrong match between depend and c_allreduce_sum

(cherry picked from commit 327da8035bdfee3ec2f016e8cda29ec8ee89bc95)

* fix codestyle

(cherry picked from commit bdb1483081adc41aa47d3f7df257f63f1cff399b)

* fix bug

(cherry picked from commit 373ba5253c45ac019ffaa8d69d4ce9e02cb9ae79)

* add c_sync_calc_stream back

(cherry picked from commit 9933d7533ae1f307b76f24a33bf0c59e4c8e8f01)

* fix

(cherry picked from commit abc9a31beaa326f6a566c08749419bb33e209672)

* revert

(cherry picked from commit 07bc98dbf7c9df43910fa6e86a6a2698731dffb2)

* use flag to control

(cherry picked from commit 8e5682a4b99759cbe35a49f3f8c9db735dc8fee4)

* fix for code coverage

(cherry picked from commit fe7e61bdef24fbc43e2f4e1cb67f68963c957cf1)

bfd1dd77

23 April, 2023 1 commit

Cherry pick getitem/setitem 0d (#53125) · a79c04f3

Authored by JYChen on April 23, 2023

* support 0-D output and 0-D as indice in __getitem__

* fix tests

* fix inference and UT

* add unittest for setitem

* fix xpu test

* fix xpu 0-d

a79c04f3

20 April, 2023 1 commit
- [Perf] fix static graph performance issue in amp mode with multicard (#52724) (#53115) · 3603b9b1
  Authored by kangguangli on April 20, 2023
```
* fix

* fix

* fix

* fix

* fix

* fix fuse group order

(cherry picked from commit 38ec37cd)
```
  3603b9b1
17 April, 2023 5 commits
- [Fused] controlled randomness for fused dropout add (#52903) · e36f80c6
  Authored by Chitsing KUI on April 17, 2023
```
* add random control for fused dropout add

* add __init__
```
  e36f80c6
- rem cncl keyword in py (#52939) · ea04bef8
  Authored by Kim Yann on April 17, 2023
  
  ea04bef8
- remove hccl in .py files (#52934) · 27a601e8
  Authored by 张春乔 on April 17, 2023
```
* remove hccl in .py files

* remove ascend in setup.py.in

* remove ascend in setup.py
```
  27a601e8
- [Dygraph] Support delaying div loss by accumulate_steps in PipelineLayer (#52848) · 0abdcff6
  Authored by Haohongxiang on April 17, 2023
  
  0abdcff6
- [Auto Parallel]Add o2 tune of rule based tuner (#52928) · 118a7415
  Authored by caozhou on April 17, 2023
```
* add o2 tune

* add unittest

* fix error

* set unittest timeout
```
  118a7415
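
Regarding the PipelineLayer commit listed above (#52848): the title suggests that the division of the loss by `accumulate_steps` can be postponed, that is, applied once after the micro-batch losses are accumulated instead of on each micro-batch loss. That interpretation is an assumption drawn from the title alone. The sketch below only shows that the two orderings give the same averaged loss up to floating-point rounding; the function names are hypothetical.

```python
import math


def eager_division(micro_losses):
    """Divide every micro-batch loss by accumulate_steps, then sum."""
    accumulate_steps = len(micro_losses)
    return sum(loss / accumulate_steps for loss in micro_losses)


def delayed_division(micro_losses):
    """Sum the raw micro-batch losses, then divide once at the end."""
    accumulate_steps = len(micro_losses)
    return sum(micro_losses) / accumulate_steps


losses = [0.9, 1.1, 1.3, 0.7]
print(eager_division(losses), delayed_division(losses))
print(math.isclose(eager_division(losses), delayed_division(losses)))
```
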
14 April, 2023 2 commits

1. modify set_value op, use Scalars to represent attr `values`, instead of a... · dd2a749a

Authored by Feiyu Chan on April 14, 2023

1. modify set_value op, use Scalars to represent attr `values`, instead of a bunch of attributes of various types; (#52408)

2. add a program converter, with the set_value op as an example, which converts `paddle::framework::ProgramDesc` between the old and new formats (the differences are mainly operators whose definitions received incompatible updates);
3. the program version and operator version map are now always saved when serializing `paddle::framework::ProgramDesc`, to identify the version;
4. provide an option `legacy_format=false` in serialization of `paddle::framework::ProgramDesc`, which decides whether to convert the ProgramDesc back to a legacy format that paddle 2.4.2 or earlier versions can load and execute;
5. deserialization of `paddle::framework::ProgramDesc` now automatically detects whether the bytes it receives are in legacy format (that is, they contain any of the operators that have been incompatibly updated and have any attribute of type `Scalar`) and converts them to the new format. If you want a faithful deserialization without the automatic conversion, you can use protobuf's deserialization instead. Though not recommended, this can be useful for testing.

dd2a749a
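
The converter idea in the commit above can be illustrated with a toy model. The sketch below represents an operator as a plain dict and converts a `set_value`-like op between a "legacy" layout (several typed value lists) and a "new" layout (one `values` list). Everything here is hypothetical: the dict layout, the attribute names in `TYPED_VALUE_ATTRS`, and the function names are stand-ins for illustration and do not reproduce `paddle::framework::ProgramDesc` or its real attributes.

```python
# Hypothetical typed attribute names standing in for the "bunch of attributes".
TYPED_VALUE_ATTRS = ("bool_values", "int64_values", "fp32_values")
TYPE_TO_ATTR = {bool: "bool_values", int: "int64_values", float: "fp32_values"}


def is_legacy(op):
    """Treat a set_value op as legacy when it has no unified `values` attr."""
    return op["type"] == "set_value" and "values" not in op["attrs"]


def to_new_format(op):
    """Merge whichever typed list is non-empty into a single `values` attr."""
    if not is_legacy(op):
        return op
    attrs = dict(op["attrs"])
    values = []
    for name in TYPED_VALUE_ATTRS:
        values = attrs.pop(name, []) or values
    attrs["values"] = values
    return {"type": op["type"], "attrs": attrs}


def to_legacy_format(op):
    """Split `values` back into the typed list matching its element type."""
    if op["type"] != "set_value" or "values" not in op["attrs"]:
        return op
    attrs = dict(op["attrs"])
    values = attrs.pop("values")
    for name in TYPED_VALUE_ATTRS:
        attrs[name] = []
    if values:
        attrs[TYPE_TO_ATTR[type(values[0])]] = values
    return {"type": op["type"], "attrs": attrs}


new_op = {"type": "set_value", "attrs": {"values": [1.5, 2.5], "axes": [0]}}
legacy_op = to_legacy_format(new_op)
print(is_legacy(legacy_op), to_new_format(legacy_op) == new_op)
```

The round trip at the end mirrors the intent described in the commit: save in a legacy layout for older readers, and detect and upgrade that layout automatically on load.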

[CustomDevice] add model parallel support for custom device (#52872) · f8d09011
Authored by ronnywang on April 14, 2023

f8d09011

13 April, 2023 1 commit

[Auto Parallel] Add auto parallel tuner options in launch (#52053) · a67d3bb7

Authored by TaoTao Li on April 13, 2023

* add auto parallel tuner options in launch

* add ut for launch in auto_parallel tuner

fix code format

* fix ci-coverage

a67d3bb7

12 April, 2023 4 commits
- fix bug of mp (#52789) · 3ece0ece
  Authored by ShenLiang on April 12, 2023
  
  3ece0ece
- [Auto Parallel] Move some changes or bug fixes from 2.4 to develop (#52721) · cbdba509
  Authored by Yulong Ao on April 12, 2023
```
* [Auto Parallel] Speedup the completion process

* [Auto Parallel] Skip the property of dist_context when deepcopying

* [Auto Parallel] Remove the unnecessary print

* [Auto Parallel] Move some changes from 2.4 branch to develop

* Update engine.py

* [Auto Parallel] Fix a bug
```
  cbdba509
- remove *hccl*.cc (#52798) · 2131ee5c
  Authored by 张春乔 on April 12, 2023
```
* remove c_comm_init_hccl_op.cc and c_gen_hccl_id_op.cc

* remove gen_hccl_id_op.cc
```
  2131ee5c
- [Auto Parallel]Add the single-node topology detection (#52723) · 05fd6d10
  Authored by CHANGer on April 12, 2023
  
  05fd6d10
11 April, 2023 3 commits
- fix save inf (#52632) · 5ab79273
  Authored by wangxiaoning on April 11, 2023
  
  5ab79273
- mp sync params & grads & opt states. (#51428) · 6b74cf76
  Authored by wuhuachaocoding on April 11, 2023
  
  6b74cf76
- fix_mac_m1_error (#52720) · f80a0fe9
  Authored by risemeup1 on April 11, 2023
  
  f80a0fe9
10 April, 2023 1 commit
- [Auto Parallel] Randomness Control for Distributed Training (#52554) · 03afb41c
  Authored by JZ-LIANG on April 10, 2023
```
* unique id for mesh

* rng ctrl

* support dropout

* register op

* adopt for recompute

* update unitest

* support pp
```
  03afb41c
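
The commit does not spell out its seeding scheme, so the following is only a generic sketch of what "randomness control" for dropout in distributed training usually has to achieve: ranks that hold replicated data should draw the same mask, while ranks that hold different shards should draw different ones. The functions `derive_seed` and `dropout_mask`, their parameters, and the mixing constants are all hypothetical.

```python
import random


def derive_seed(base_seed, mesh_id, rank_in_mesh, sharded):
    """Same seed on every rank for replicated data, distinct seeds for shards."""
    offset = rank_in_mesh if sharded else 0
    # Arbitrary mixing constants; any deterministic combination would do.
    return (base_seed * 1000003 + mesh_id * 10007 + offset) % (2 ** 32)


def dropout_mask(seed, n, p=0.5):
    """Keep-mask of length n drawn from a generator private to this seed."""
    rng = random.Random(seed)
    return [rng.random() >= p for _ in range(n)]


# Replicated tensor: every rank derives the same seed, hence the same mask.
replicated = [dropout_mask(derive_seed(42, 0, r, sharded=False), 8) for r in range(2)]
print(replicated[0] == replicated[1])

# Sharded tensor: each rank derives its own seed.
print(derive_seed(42, 0, 0, sharded=True), derive_seed(42, 0, 1, sharded=True))
```

Using a private `random.Random` instance per seed, rather than the global generator, keeps the mask reproducible regardless of what other code has drawn in the meantime.
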
09 April, 2023 2 commits
- fix fused_dropout_add bug (#52644) · 5df1296d
  Authored by Chitsing KUI on April 09, 2023
  
  5df1296d
- [BugFix] Fix random seed bug in hybridparallel (#52656) · 61ca8b39
  Authored by ShenLiang on April 08, 2023
```
* add seed control

* fix bug
```
  61ca8b39
07 April, 2023 3 commits
- [Executor] remove run_program branch (#52471) · 39278731
  Authored by kangguangli on April 07, 2023
```
* remove run_program

* remove FLAGS_USE_STANDALONE_EXECUTOR
```
  39278731
- add distributed p_send/p_recv/reduce_scatter operator (#51858) · 2b12a117
  Authored by TaoTao Li on April 07, 2023
```
fix merge conflicts
```
  2b12a117
- fix mkdir (#52570) · 41226d55
  Authored by Roc on April 07, 2023
```
* fix mkdir

* update
```
  41226d55
06 April, 2023 2 commits
- [CodeStyle][B017] catch more specific exceptions in unittests (#52553) · 9dbfadab
  Authored by Nyakku Shigure on April 06, 2023
  
  9dbfadab
- rem is_compiled_with_npu (#52385) · 7976e2a3
  Authored by Kim Yann on April 06, 2023
```
* rem is_compiled_with_npu

* rem npu related code

* make lint happy

* rem test

* remove some tests

* Update grad_scaler.py

* fix an error
```
  7976e2a3
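
The B017 commit listed above (#52553) refers to the flake8-bugbear rule that flags `assertRaises(Exception)`: such a check passes for any failure at all, including ones unrelated to what the test is meant to verify. The sketch below shows the narrower style on a made-up function; `parse_ratio` and the test class are hypothetical and unrelated to Paddle's test suite.

```python
import unittest


def parse_ratio(text):
    """Parse 'a/b' into a float; raises on bad input or zero denominator."""
    num, den = text.split("/")
    return int(num) / int(den)


class TestParseRatio(unittest.TestCase):
    def test_zero_denominator(self):
        # Too broad (what B017 flags): `with self.assertRaises(Exception):`
        # would also pass if parse_ratio failed for an unrelated reason.
        with self.assertRaises(ZeroDivisionError):
            parse_ratio("1/0")

    def test_non_numeric(self):
        with self.assertRaises(ValueError):
            parse_ratio("a/2")
```

Saved as a module, these tests can be run with `python -m unittest`.
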
04 April, 2023 2 commits
- bugfix on dist.alltoall_single (#52495) · e6e62342
  Authored by Tian on April 04, 2023
  
  e6e62342
- relocate debugger.py (#52048) · 076bc5d6
  Authored by LoneRanger on April 04, 2023
```
* relocate debugger.py

* fix bug

* fix bug

* fix bug

* fix bug
```
  076bc5d6
03 April, 2023 1 commit

rem is_compiled_with_mlu (#52378) · 4b28f4ff

Authored by Kim Yann on April 03, 2023

* rem is_compiled_with_mlu

* fix some mlu_place and mlu_device_coount

* make lint happy

4b28f4ff

31 March, 2023 2 commits

gather with doc (#52105) · 77d24854

Authored by zhenhailiu on March 31, 2023

* gather with doc

* resolve comment

* polish

* polish

* code style

* polish doc

* add_test

* polish

* polish

* add test check

* add test check

* polish

* polish

* polish

* polish

* fix_time_out

* polish

* fix timeout

* fix_timeout

* polish

* polish

* polish

* polish

* polish

77d24854

[CodeStyle][UP030][UP031][UP032] using f-string (#52062) · 40e4f5a5

Authored by 张春乔 on March 31, 2023

* autofix
Co-authored-by: Liyulingyue <83450930+Liyulingyue@users.noreply.github.com>

* revert changes in python/paddle/distributed/fleet/utils/hybrid_parallel_util.py

* empty commit, trigger ci

* fix test_slice

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

40e4f5a5
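
UP030, UP031 and UP032 are pyupgrade-style lint rules (as numbered in Ruff) that modernise string formatting. The short sketch below shows the three older spellings each rule targets next to the f-string they are rewritten to. The variable names are made up for illustration, and the mapping of codes to spellings reflects my understanding of those rules rather than anything stated in the commit.

```python
name, count = "test_slice", 3

old_percent = "%s ran %d times" % (name, count)        # UP031: printf-style %
old_indexed = "{0} ran {1} times".format(name, count)  # UP030: explicit indices
old_format = "{} ran {} times".format(name, count)     # UP032: str.format call
new_fstring = f"{name} ran {count} times"

# All four spellings produce the same text.
assert old_percent == old_indexed == old_format == new_fstring
print(new_fstring)
```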

30 March, 2023 1 commit

Fix bug of c_softmax_with_cross_entropy_op_xpu_op (#52296) · 8ef97088

Authored by Ghost Screaming on March 30, 2023

* Support ignore_index for c_softmax_with_cross_entropy_op.

* Polish code. Remove useless comments and add Testcase.

* Polish code for TestCase.

* Polish code.

* Polish code style.

* Polish code.

* Change loss calculation formula and ignore_index dtype.

* Polish TestCase.

* Fix bug of c_softmax_with_cross_entropy_op_xpu_op. Attribute 'ignore_index' dtype is int64_t.

8ef97088
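
As background for the `ignore_index` support mentioned in this commit, the sketch below shows the usual meaning of that attribute in a softmax cross-entropy loss: samples whose label equals `ignore_index` contribute a loss of zero. It is a single-process, plain-Python illustration with a hypothetical function name and an arbitrary default of `-100`; the real operator works on tensors whose class dimension is split across devices, and its exact formula is not shown on this page.

```python
import math


def cross_entropy(logits, labels, ignore_index=-100):
    """Per-sample softmax cross entropy; ignored labels contribute 0.0."""
    losses = []
    for row, label in zip(logits, labels):
        if label == ignore_index:
            losses.append(0.0)
            continue
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_sum_exp - row[label])
    return losses


print(cross_entropy([[0.0, 0.0], [1.0, 2.0]], [0, -100]))
```

The first sample has two equal logits, so its loss is ln 2; the second is skipped because its label matches `ignore_index`.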

PaddlePaddle / Paddle synced successfully about 2 years ago