Commits · v2.5.0-rc1 · PaddlePaddle / Paddle

19 May, 2023 · 1 commit
- [cherrypick][inference][trt]remove trt sparse weights flags (#53562) (#53850) · 5a69ddb9
  Committed by Zhang Jun on May 19, 2023
```
* remove kSPARSE_WEIGHTS

* remove kFASTER_DYNAMIC_SHAPES_0805 and add 'TrtMajorVersion' function
```
18 May, 2023 · 1 commit
- [cherry-pick 2.5][Zero-Dim] update 0D tensor API en doc (#53855) · 2992f787
  Committed by zhouweiwei2014 on May 18, 2023
```
* [Zero-Dim] update 0d tensor api en doc, test=document_fix

* [BUG] fix windows kernel dispatch of _lzcnt bug (#53728)
```
16 May, 2023 · 3 commits
- [AMP] Allow to switch whether to use promote strategy to choose kernel for O2... · d2f09015
  Committed by Yiqun Liu on May 16, 2023
```
[AMP] Allow to switch whether to use promote strategy to choose kernel for O2 training. (#53742) (#53841)

Pcard-70458

cherry-pick #53742

Chinese documentation: PaddlePaddle/docs#5882
```
- [PHI]Add Filter for get_kernel_signatures.py (#53760) (#53809) · 5d10e910
  Committed by YuanRisheng on May 16, 2023
```
* delete log

* filter some kernel signature
```
- [AMP]fix embedding model weight type mismatch error (#53770) (#53827) · 4a08f7ec
  Committed by shaojie_wang on May 15, 2023
```
Pcard-70458
cherry-pick: #53770
```
15 May, 2023 · 1 commit
- fix dtype missmatch error (#53712) (#53764) · e6464f33
  Committed by Zhang Ting on May 15, 2023
```
Pcard-70458
cherry-pick #53712
```
13 May, 2023 · 1 commit

[cherrypick][inference Zero-Dim] Support 0-Dim Tensor in Paddle-TensorRT (#53752) · 20fbafe6

Committed by Zhang Jun on May 13, 2023

* scale, square, sum, swish trt op converter support zero dim (#53660)

* [Paddle-Inference] Support trt 0dims of expand_as_v2 and mish. (#53627)

* support_expand_mish

* add unitest for reshpe 0 dims (#53685)

* Add trt pow converter. (#53462)

* Add trt pow converter.

* update to use AddConstantLayer

* add dims=0 ut

* [inference Zero-Dim]add equal, elementwise_op trt 0d (#53704)

* [inference Zero-Dim]prelu trt converter support zero dim tensor (#53634)

* prelu op trt converter support zero dim

* [Inference Zero-Dim] Support trt 0dim of gelu, hard_swish, hard_sigmoid and leaky_relu (#53714)

* support_act
* delete_silu

* [inference zero dim] softmax, stack op trt converter support zero dim (#53729)

* softmax support

* support stack

* remove unused code

* update

---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
Co-authored-by: xiaoxiaohehe001 <49090790+xiaoxiaohehe001@users.noreply.github.com>
Co-authored-by: zhoutianzi666 <39978853+zhoutianzi666@users.noreply.github.com>
Co-authored-by: Wilber <jiweibo@baidu.com>

12 May, 2023 · 3 commits
- fix docs error of index_put API (#53747) · 286fd577
  Committed by 傅剑寒 on May 12, 2023
```
This PR fix docs error of index_put , related dev PR is #53727
```
- fix hessian's docstring (#53740) · b9bb6fe7
  Committed by HydrogenSulfate on May 12, 2023
- Add datatype for index_put in ops.yaml (#53715) · 799e4347
  Committed by 傅剑寒 on May 12, 2023
```
This PR add data_type for selecting which arg's datatype to instantiate template type T for index_put kernel
Related PR #53652
```
11 May, 2023 · 5 commits

[cherry-pick] revise 'Examples' of LBFGS to create right docs(cn), test=docs_preview (#53698) · 2a696fb8
Committed by lijialin03 on May 11, 2023

Fix div error when dtype is int64 in static mode (#53705) (#53713) · feee67ca
Committed by WangZhen on May 11, 2023
```
* Fix div error when dtype is int64 in static mode

* Fix out dtype
```
[cherry-pick]fix windows static_assert error (#53694) · 42ca5d61
Committed by limingshu on May 11, 2023
```
Fix static_assert bug in Windows CUDA 11.6 compilation. This may be the bug of msvc.
```
【cherry-pick】【BugFix】fix err of api `to_tensor`, which caused by numpy version... · 73e6bbba

Committed by feifei-111 on May 11, 2023

【cherry-pick】【BugFix】fix err of api `to_tensor`, which caused by numpy version update (#53534) (#53624)

* 【BugFix】fix err of api `to_tensor`, which caused by numpy version update (#53534)

* fix

* update code

* pre-commit

* remove scale check (0-D tensor is usable)

* fix data dtype err

* fix numpy default dtype diff

* fix data dtype

* fix data dtype

* update

* fix coverage

* fix old test which is not correct when 0-D tensor is usable

[Cherrypick] up index warning level (#53692) · 9fbae766
Committed by JYChen on May 11, 2023
```
* up warning level

* numpy still vlog-0
```
10 May, 2023 · 9 commits

[cherry pick] add index_put api (#53652) · 4d16cd63
Committed by 傅剑寒 on May 10, 2023
```
This PR add index_put api for paddle
```
[cherry-pick] Fix the index calculation in cross_entroy_kernel. (#53659) (#53666) · 1ab562ca
Committed by Yiqun Liu on May 10, 2023
```
cherry-pick #53659
```
[Cherry-Pick] Fix bug in log_softmax kernel when lastdim is larger than 100000 (#53657) · a7cad386
Committed by Zhang Zheng on May 10, 2023
```
Fix bug in log_softmax kernel when lastdim is larger than 100000

There is an unexpected log in the calculation

Cherry-Pick: #53654
```
fix error sample code in static.nn.loss.nce (#53588) (#53630) · b0c55c28
Committed by RedContritio on May 10, 2023

revert argsort to fix OOM bug (#53647) · 6707142a
Committed by Qi Shao on May 10, 2023
```
Revert argsort to the version without full sort algorithm implemented
```
[cherry-pick 2.5] Broadcast && Dropout_nd Performance Optimization into Release/2.5 (#53623) · f9ea2301

Committed by Bo Zhang on May 10, 2023

* Support different dtypes of inputs for broadcast for dropout optimization  (#52093)

* change judgement for DropoutGradGPUKernelDriver

* add UnrollerWithoutVecSize and after this Loaddata to be refined

* pass unittest

* use same unroller with XPU

* BroadcastWithInt64Index

* BroadcastDataLoader template partial specialization

* fix compile errs in ROCms

* PR comment

* dropout_nd_optimization (#51479)

* with printf

* add DropOutNdForwardKernel

* PR comment

* Dropout optimize & clean broadcast inT and ElementwiseType (#52969)

* change judgement for DropoutGradGPUKernelDriver

* add UnrollerWithoutVecSize and after this Loaddata to be refined

* pass unittest

* use same unroller with XPU

* BroadcastWithInt64Index

* BroadcastDataLoader template partial specialization

* fix compile errs in ROCms

* clean ElementwiseT and InT for BroadcastKernel

* default axis and clean inT

* remove redundant fast divmod computation

* optimize drop_nd & drop_nd_grad

* optimize BroadcastDataLoader bf16 fp16

* rm InT etc. after merge develop

* delete constexpr for windows ci

* fix conflict

* fix conflic with develop

* fix conflic

* new clean

* clean

* Fix xpu2 kp compile error (#53548)

* fix conflict

* conflict

[Cherry-pick 2.5][Zero-Dim] paddle.static.data, squeeze, unbind, unstack,... · fecea4c5

Committed by zqw_1997 on May 10, 2023

[Cherry-pick 2.5][Zero-Dim]  paddle.static.data, squeeze, unbind, unstack, gather_nd and einsum support 0D (#53602)

* add test cases, test=allcase

* fix test cases, test=allcase

* fix test cases, test=allcase

* assert_allclose, test=allcase

* 1e-5 to 1e-4, test=allcase

* change rtol from 1e-4 to 1e-3, test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* fix test cases, test=allcase

* fix test cases, test=allcase

* modify the test_squeeze to not use Tensor type axis, test=allcase

* add grad check for unbind and unstack, test=allcase

* check for squeeze axis tensor type, test=allcase

* fix bug, test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

[Zero-Dim] add 0D Tensor UT case for XPU (#53611) · 3a247cba
Committed by zhouweiwei2014 on May 10, 2023

add and open 0D test pnorm and cond (#53616) · 7edcd05c
Committed by GGBond8488 on May 10, 2023

09 May, 2023 · 11 commits

[AMP] fix static promote (#53439) (#53641) · c27e6d2f

Committed by niuliling123 on May 09, 2023

fix static promote
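Move the operators that were placed in unsupported_list because of performance problems into the black list, so that in O2 mode the weights stay fp32 in only 3 scenarios.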

add compare accuracy api (#53646) · 22678065
Committed by zhangkaihuo on May 09, 2023
```
cherry-pick #53430
```
[Cherry-pick 2.5][Zero-Dim] paddle.to_tensor support 0D (#53599) · 2aefc45b

Committed by zqw_1997 on May 09, 2023

* fix doc erros, test=allcase

* conflict

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* fix doc erros, test=allcase

* fix the to_tensor error

[Zero-Dim] Support p_norm/reduce_sum_p output 0D (#53421) (#53618) · 3ffe8f36
Committed by zhouweiwei2014 on May 09, 2023

Cherry pick fused linear (#53621) · f21b6f08
Committed by limingshu on May 09, 2023
```
Cherry pick fused linear
```
[cherry-pick 2.5][inference Zero-Dim] trt support 0 dims (#53497) · 77eeb226

Committed by Zhang Jun on May 09, 2023

* [inference][trt]trt support 0 dims (#53383)

* trt support 0 dim
* update activation ut
* fix trt Unary operation do not support 0d when TRT < 8.6
* Update op_teller.cc
* update unary ut
* add rsqrt to unary_list
* move rsqrt to act_list

【cherry-pick】Op test add complex support (#53604) · c8504d86

Committed by GGBond8488 on May 09, 2023

* add complex support for  optest

* add complex grad test

* append one

* move some debug info

* move some debug info

* move some debug info

* move some debug info

* add more complex test

* Fix naming ambiguity

* Revert "add more complex test"

This reverts commit dbcb0516b8e53ba42e2d6089878a39b395345969.

* change backward gradient, add TODO

[cherry-pick 2.5][Zero-Dim] support paddle.sum/mean/loss api output 0D (#53601) · b6e23774

Committed by zhouweiwei2014 on May 09, 2023

* [Zero-Dim] fix functool.reduce more safe with intial value, to support empty list (#53182)

* [Zero-Dim] support 0d tensor for shape and squeeze onednn kernel (#52832)

* support 0d tensor for shape and squeeze onednn kernel

* set python api for shape op ut

* [Zero-Dim] distributed scatter/all_to_all support input 0D tensor (#53186)

* [Zero-Dim] Support paddle.sum/mean/loss api output 0D,test=allcase (#52739)

* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily (#53382)

* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily

* Add unittest

* [CINN Support 0D-Tensor] CINN hack squeeze2 with trick temporarily (#53454)

* fix test_autograd_dynamic (#53473)
Co-authored-by: zhwesky2010 <zhouwei25@baidu.com>

---------
Co-authored-by: YangQun <qun.yang@intel.com>
Co-authored-by: HongyuJia <jiahongyu@baidu.com>
Co-authored-by: HydrogenSulfate <490868991@qq.com>
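
The first item of the commit above (#53182) mentions giving `functools.reduce` an initial value so that an empty list is handled. The sketch below shows that standard-library behaviour in isolation. It is not code from the Paddle repository, and `numel` is a hypothetical helper name chosen for illustration: a 0-D tensor has the empty shape `[]`, and reducing it with an initial value of 1 yields one element instead of raising.

```python
from functools import reduce
import operator


def numel(shape):
    """Element count implied by a shape (hypothetical helper, not a Paddle API).

    An empty shape stands for a 0-D tensor, which holds exactly one element.
    """
    # The initial value 1 keeps reduce() well defined for an empty list.
    return reduce(operator.mul, shape, 1)


print(numel([2, 3]))  # product of the dimensions
print(numel([]))      # empty shape: falls back to the initial value

# Without an initial value, reduce() raises TypeError on an empty list.
try:
    reduce(operator.mul, [])
except TypeError as exc:
    print("no initial value:", exc)
```

Passing the multiplicative identity as the initial value is the usual way to make a product over dimensions total, which is why it fits shape arithmetic once 0-D tensors are allowed.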

describe -> description in PR template (#53600) · f6cf7329
Committed by jzhang533 on May 09, 2023

[Cherry-pick] zero-dim: support 0-D for getitem/setitem (#53441) · 767e7b3f

Committed by JYChen on May 09, 2023

* support 0-D output and 0-D as indice in __getitem__

* fix tests

* fix inference and UT

* add unittest for setitem

* fix xpu test

* fix xpu 0-d

* fix right value is 0d and index is List/Tensor

* Hack__getitem__ from 0-d to 1-d with FLAGS_set_to_1d

* change PHI_DECLARE_xxx to DECLARE_xxx since the change not merged to 2.5

* hack 1-D tensor to Scalar

* throw warning at __getitem__, not slice_utils

fix eval branch of prim vjp of batch_norm in amp mode (#53594) · 95a7bcf9
Committed by cyber-pioneer on May 09, 2023

08 May, 2023 · 5 commits

[Paddle-TRT] The Graph uses OpConverterType for op converter (#53214) (#53585) · 2cf4a04a
Committed by zhoutianzi666 on May 08, 2023
```
* add ```converter_type``` for op converter
```
[Cherry-Pick] Fix the calculation of y_grad in divide_backward (#53584) · e63fb1e6

Committed by Zhang Zheng on May 08, 2023

Cherry-Pick: #53582
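Change: for the division out = x / y, the backward formula for y is changed from dy = -dout * out / y to dy = -dout * ((x / y) / y)
Reason: when the forward result is used as the backward input, it already carries some precision loss from the cast at low precision, so recomputing the quotient gives a more accurate result
Impact: the change makes the result more accurate, with a negligible effect on performance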
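
The rounding effect behind this change can be reproduced outside Paddle. The following is a minimal sketch, not the kernel from this commit. It assumes the saved forward result `out` is stored in half precision while all other arithmetic runs in Python's double precision, and the names `to_fp16`, `dy_from_saved_out` and `dy_recomputed` are hypothetical.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def dy_from_saved_out(x: float, y: float, dout: float) -> float:
    """Old formula: reuse the forward result, which was cast to low precision."""
    out = to_fp16(x / y)  # assumption: the forward output was stored as fp16
    return -dout * out / y


def dy_recomputed(x: float, y: float, dout: float) -> float:
    """New formula: recompute x / y instead of reading the rounded result."""
    return -dout * ((x / y) / y)


x, y, dout = 1.0, 3.0, 1.0
exact = -dout * x / (y * y)  # analytic gradient: -dout * x / y**2

err_saved = abs(dy_from_saved_out(x, y, dout) - exact)
err_recomputed = abs(dy_recomputed(x, y, dout) - exact)

print("error with saved fp16 out:", err_saved)
print("error with recomputed x/y:", err_recomputed)
```

In this setup the only difference between the two paths is the half-precision rounding of `out`, so the recomputed gradient ends up closer to the analytic value at the cost of one extra division.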

[cherry-pick] Fix core dumped in training when check_nan_inf=1 (#53423) · d5c3f032
Committed by niuliling123 on May 08, 2023
```
Fix the precision-check bug in the optimizer
```
Cherry-pick #53432 and #53556 (#53576) · 6583c390

Committed by Yiqun Liu on May 08, 2023

* Add fused_gate_attention API. (#53432)
* Add PADDLE_THROW in take_along_axis kernel when the datatype of index is wrong. (#53556)

[Cherry-pick]Cherry pick 0d output (#53538) · 2d02b0c1

Committed by GGBond8488 on May 08, 2023

* add 0D output support for inalg.slogdet,test=allcase

* fix zerom dime test error test=allcase

* fix test error test=allcase

* add static backward test, test=allcase

* support_0D_output_for_matrix_rank_multi_dot, test=allcase

* add 0D output test for matrox_rank and mutli_dot test=allcase

* fix assert error ,test=allcase

* fix test error, test=allcase

* fix other test error, test=allcase

* fix other test error, test=allcase

* fix test error, test=allcase

* fix matrix_rank and multi dot test err test=allcase

* fix test error test=allcase

* fix test zero dim test, test=allcase

* add static backward test for multi_dot, test=allcase

* add tol 2d broadcast test case, test=allcase

* fix test error test=allcase

* fix test error test=allcase

* test=allcase

* support_0d_output_for_linalg.norm

* fix test error test=allcase

* fix 0D test

* fix test error test=allcase

* fix test error test=allcase

* fix tets,test=allcase

* fix error,test=allcase

* fix errors ,test=allcase

* add static backward , test=allcase

* add static backwward test, test=allcase

* slogdet_support_0D_output

* add new case

* fix tests, test=allcase

* cherry-pick

* cherry-pick

* fix trace gpu kernel 0d error, test=allcase

* fix windows error, test=allcase

* add matrixrank cherry-pick
