Commits · d611e48c90d1a9145f97956ca2e5faea7a4a16bd · PaddlePaddle / Paddle

28 Apr, 2023 · 21 commits

Dropout optimize & clean broadcast inT and ElementwiseType (#52969) · d611e48c

Committed by Bo Zhang on Apr 28, 2023

* change judgement for DropoutGradGPUKernelDriver

* add UnrollerWithoutVecSize and after this Loaddata to be refined

* pass unittest

* use same unroller with XPU

* BroadcastWithInt64Index

* BroadcastDataLoader template partial specialization

* fix compile errs in ROCms

* clean ElementwiseT and InT for BroadcastKernel

* default axis and clean inT

* remove redundant fast divmod computation

* optimize drop_nd & drop_nd_grad

* optimize BroadcastDataLoader bf16 fp16

* rm InT etc. after merge develop

* delete constexpr for windows ci

* fix conflict

* fix conflict with develop

* fix conflict

* new clean

* clean

d611e48c
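
The bullets above mention broadcast indexing and divmod computation. As a rough illustration only, the sketch below maps a flat index in a broadcast output back to an offset in a row-major input, using one `divmod` per dimension. The function name `broadcast_offset` is hypothetical; this is plain Python for explanation, not the GPU kernel code changed in this commit.

```python
def broadcast_offset(flat_idx, out_shape, in_shape):
    """Offset into a row-major input for a flat index into the broadcast output."""
    # Left-pad the input shape with 1s so both shapes have the same rank.
    padded = (1,) * (len(out_shape) - len(in_shape)) + tuple(in_shape)
    offset, stride = 0, 1
    # Walk from the innermost dimension outwards, peeling off one coordinate
    # per dimension with divmod.
    for out_dim, in_dim in zip(reversed(out_shape), reversed(padded)):
        flat_idx, coord = divmod(flat_idx, out_dim)
        if in_dim != 1:
            # A size-1 input dimension is broadcast: it contributes nothing.
            offset += coord * stride
        stride *= in_dim
    return offset
```

For example, with an output of shape `(2, 3)` and an input of shape `(3,)`, every output row reads the same three input elements, so flat index 4 (row 1, column 1) maps to input offset 1.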

[test]mv fluid op cinn to test/cpp/fluid/cinn (#53443) · a53ee944
Committed by gouzil on Apr 28, 2023

a53ee944

【0D output】add_0D_output_support (#52857) · ef6e8d09

Committed by GGBond8488 on Apr 28, 2023

* add 0d support for dist, trace, paddle.linalg.cond test=allcase

* add_0d_output_support_for_det

* test=allcase

* support_0d_output_for_linalg.norm

* support linalg.norm 0d output, test=allcase

* fix 0D test

* fix zero dim test, test=allcase

* fix 0D test

* fix tests, test=allcase

* fix error, test=allcase

* fix errors, test=allcase

* add static backward, test=allcase

* add static backward test, test=allcase

* fix pr-ci-build error;test=document_fix (#53060)

* [Cherry-Pick] Unique support float16&bfloat16 (#53023)

unique now supports the float16 and bfloat16 data types, and the related unit tests have been improved.

* slogdet_support_0D_output

* add new case

* fix tests, test=allcase

* fix p_norm related test, test=allcase

* fix some err, test=allcase

* test=allcase

* move out trace

* open some case, test=allcase

* fix norm all case, test=allcase

* fix some test error, test=allcase

* fix typo, test=allcase

* fix test err, test=allcase

* test=allcase

* test

* fix test error, test=allcase

* fix test error, test=allcase

* fallback norm, test=allcase

---------
Co-authored-by: tianshuo78520a <707759223@qq.com>
Co-authored-by: Zhang Zheng <32410583+ZzSean@users.noreply.github.com>

ef6e8d09

[Zero-Dim] Support output 0D for squeeze, unbind, unstack. (#52843) · 6adfcdf6

Committed by zqw_1997 on Apr 28, 2023

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* test=allcase

* fix test cases, test=allcase

* fix test cases, test=allcase

* modify the test_squeeze to not use Tensor type axis, test=allcase

* add grad check for unbind and unstack, test=allcase

* check for squeeze axis tensor type, test=allcase

* fix bug, test=allcase

6adfcdf6
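
As a rough illustration of what "0D output" means for these ops, the sketch below computes result shapes only, in plain Python. The helper names `squeeze_shape` and `unbind_shapes` are hypothetical and are not Paddle APIs. The point is that removing the last remaining axis yields the empty shape `()`, that is, a 0-D tensor rather than one of shape `(1,)`.

```python
def squeeze_shape(shape, axis=None):
    """Shape after dropping size-1 axes (all of them when axis is None)."""
    ndim = len(shape)
    if ndim == 0:
        return ()
    if axis is None:
        return tuple(d for d in shape if d != 1)
    requested = [axis] if isinstance(axis, int) else list(axis)
    axes = {a % ndim for a in requested}  # normalise negative axes
    # In this sketch a requested axis whose size is not 1 is left in place.
    return tuple(d for i, d in enumerate(shape) if not (i in axes and d == 1))


def unbind_shapes(shape, axis=0):
    """Shapes of the pieces obtained by splitting along `axis` and dropping it."""
    axis %= len(shape)  # requires at least a 1-D input
    piece = tuple(d for i, d in enumerate(shape) if i != axis)
    return [piece] * shape[axis]
```

Under these assumptions, squeezing shape `(1,)` gives `()`, and unbinding a 1-D input of length 4 gives four pieces of shape `()`.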

replace varbase to relevant name or notes as per the context (#53431) · 96180fff
Committed by Meteor Liu on Apr 28, 2023

96180fff
Support static graph code generation for op edit_distance (#53297) · 396fe483
Committed by huangjiyi on Apr 28, 2023
```
* update

* fix bug

* support parsing fixed kernel data_type

* update op_compat

* update
```
396fe483
Support static graph code-gen for unpool (#52947) · 005fee12
Committed by Sanbu on Apr 28, 2023

005fee12

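【Hackathon 4th No.12】Add Cauchy API to Paddle (#52999) · 896c9315

Committed by megemini on Apr 28, 2023

* 【Hackathon 4th No.12】Add Cauchy API to Paddle

* [Change] Revise the initialization method and type checks

* [Change] Move the test cases to the new directory

* [Change] Adapt to the 0D behavior of to_tensor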

896c9315
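
For readers unfamiliar with the distribution this commit adds, here is a minimal standard-library sketch of the Cauchy density, cumulative distribution, and inverse-CDF sampling. The function names and the `loc`/`scale` parameter names are assumptions made for this sketch; it does not reproduce the Paddle implementation.

```python
import math
import random


def cauchy_pdf(x, loc=0.0, scale=1.0):
    """Density: 1 / (pi * scale * (1 + ((x - loc) / scale)^2))."""
    z = (x - loc) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))


def cauchy_cdf(x, loc=0.0, scale=1.0):
    """Cumulative distribution: atan((x - loc) / scale) / pi + 1/2."""
    return math.atan((x - loc) / scale) / math.pi + 0.5


def cauchy_sample(loc=0.0, scale=1.0, rng=random):
    """Inverse-CDF sampling: x = loc + scale * tan(pi * (u - 1/2)), u ~ U[0, 1)."""
    u = rng.random()
    return loc + scale * math.tan(math.pi * (u - 0.5))
```

Because the Cauchy distribution has no finite mean or variance, sample averages do not settle down as the sample grows; checks against the density or CDF are more reliable than checks against moments.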

【Hackathon 4th No.11】Add Geometric Distribution API to Paddle (#51224) · bca1b0c6
Committed by dasen on Apr 28, 2023

bca1b0c6
[Bug fixes] Fix bugs in some sparse test (#53428) · 8f5eae47
Committed by Zhan Rongrui on Apr 28, 2023

8f5eae47
remove is_npu_pinned_place (#53391) · 4ccbcce5
Committed by jjyaoao on Apr 28, 2023

4ccbcce5
【Hackathon No.52】Add float16 data type support for the Paddle dist operator (#50915) · 9c406531
Committed by iSerendipity on Apr 28, 2023

9c406531
[inference][trt]trt support 0 dims (#53383) · 64adfe7a
Committed by Zhang Jun on Apr 28, 2023
```
* trt support 0 dim

* trt support 0 dim

* update activation ut
```
64adfe7a

[KUNLUN] fix pp send /recv on xpu (#53427) · 6d9bbee3

Committed by Roc on Apr 28, 2023

Synchronize at the first recv operator.
If all send and recv operators are wrapped with group start and end, the received tensor will be incomplete.

6d9bbee3

fix: fix deformable_conv_grad op test (#53415) · 207e0f33
Committed by wangshengxiang on Apr 28, 2023

207e0f33
[XPU][BUG] Add cumsum grad kernel to xpu2 op list (#53386) · 1c1b487c
Committed by lj970926 on Apr 28, 2023
```
* clang format

* add cumsum_grad op to xpu2_op_list
```
1c1b487c
Add broadcast_tensors tests (#52961) · 98fc4277
Committed by co63oc on Apr 28, 2023

98fc4277
【Hackathon No.55】add fmin BF16 test (#53100) · 8163faaa
Committed by superwinner1 on Apr 28, 2023
```
* 'fmin'

* 'fix'

* 'fix'
```
8163faaa

Change Py3 use Ubuntu20 Docker (#52523) · 4001f7ae

Committed by tianshuo78520a on Apr 28, 2023

* test py3.8

* fix

* test gcc12

* test gcc12

* test gcc12

* test py3.8

* test py3

* fix

* fix

* add ubuntu20

* add ubuntu20

* add ubuntu20

* test py3

* fix error

* fix error

* fix error

* fix error

* fix error

* update py version

* fix

* test docker

* test docker

* test docker

* fix pip version, use pip3.9

* log error

* log add

* test ci

* add test log

* del

* fix

* test ci

* test ci

* update dockerfile

* update ci_dockerfile

* update ci_dockerfile

* update ci_dockerfile

* fix

* add git

* fix

4001f7ae

fix c_softmax deterministic (#53419) · f1e3575e
Committed by sneaxiy on Apr 28, 2023

f1e3575e

【Prim】comp_elementwise_double_grad (first part) (#53385) · 05499c71

Committed by xiaoguoguo626807 on Apr 28, 2023

* add mul double grad

* add sub_double_grad

* add add sub high test

* add multiply test

* modify other unsqueeze

* delete api.yaml

* only for make ci run

* modify unsqueeze

* modify unsqueeze

* tmp

* modify operants gen

05499c71

27 Apr, 2023 · 19 commits

Support different dtypes of inputs for broadcast for dropout optimization (#52093) · 3474e09c

Committed by Bo Zhang on Apr 27, 2023

* change judgement for DropoutGradGPUKernelDriver

* add UnrollerWithoutVecSize and after this Loaddata to be refined

* pass unittest

* use same unroller with XPU

* BroadcastWithInt64Index

* BroadcastDataLoader template partial specialization

* fix compile errs in ROCms

* PR comment

3474e09c

[phi] Move sequence_pool to phi - Step 3: sequence_pool_grad_op (#52680) · fe053396

Committed by gouzil on Apr 27, 2023

* [phi] move sequence_pool kernel to phi

* mv kernels impl

* fix parameter error

* clean include

* fix compat filename

* [phi] move fluid sequence_pool_grad to phi

* [phi][compat] sig rm GradVarName

* [phi] fix sequence_pool out type

* [phi] rm impl, add const string

* [phi] fix const str

* fix sequence_pooling cmake

* [phi] mv sequence_pooling_test

* [phi] fix grad sig

* [phi] fix sequence_pool is_test error

* [phi] fix sequence_pooling gpu include

* [phi] mv to impl

* [phi] fix SequencePoolFunctor cu include

* [phi] modify out max_index int32_t

* [phi] add pooltype mapping determine

* [phi] fix sequence_pool_sig

* [phi] fix sequence_pool_sig sum

* [phi] try ci

* [phi] fix max_index optional

fe053396

scale trt converter support int64 (#53388) · 182b6f83
Committed by Yuanle Liu on Apr 27, 2023

182b6f83
xpu quant weight only (#53306) · 1c97aa69
Committed by zhupengyang on Apr 27, 2023

1c97aa69
[Dy2St]Get grad names when call append backward to fix high order gradient (#53250) · 2d17df97
Committed by WangZhen on Apr 27, 2023
2d17df97
Update inference approve list (#53399) · a3a91682
Committed by WangZhen on Apr 27, 2023
```
* Update slim approve list

* Fix id, test=document_fix
```
a3a91682
set sync_param default true (#53335) · 421f56a8
Committed by wuhuachaocoding on Apr 27, 2023

421f56a8
[XPU] c_sync_calc_stream support more types (#53389) · 9c1eb98a
Committed by houj04 on Apr 27, 2023

9c1eb98a

[static op generation] triangular_solve (#53328) · 18968e7e

Committed by gouzil on Apr 27, 2023

* [static op generation] triangular_solve

* [phi] mv triangular_solve_grad to static_backward

* [phi] fix import

* [phi] mv to ops.yaml, backward.yaml

* fix forward attr

* [phi] fix triangular_solve_grad args

18968e7e

[CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily (#53382) · 9ab14865
Committed by HongyuJia on Apr 27, 2023
```
* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily

* Add unittest
```
9ab14865
【Hackathon No.91】register_hook for static mode (#52948) · db30aa1d
Committed by yangguohao on Apr 27, 2023

db30aa1d
autogen code support for max_pool[2,3]_with_index op (#53359) · cf6cbc34
Committed by Wang Xin on Apr 27, 2023

cf6cbc34

【PaddlePaddle Hackathon 4】: Support the float16 data type for the maxout operator (#50976) · 8bfd978f

Committed by NetPunk on Apr 27, 2023

* support fp16 for maxout op

* format code

* change api

* add test for static float16

* format code

* formatting code

* atol alignment

* experiment-1

* experiment-2

* experiment-3

* format code

8bfd978f

Move fused feedforward (#53166) · 25b4ba7f

Committed by Sonder on Apr 27, 2023

* trans fused_feedforward Compute function to phi

* add register info

* remove maxfunctor

* move fused feedforward to phi

* remove sig file

* remove fluid include

* add include

* add include

* add sig file

* add output register info

* fix sig file

* Update fused_feedforward_sig.cc

* fix grad kernel

* update output register info

* fix

* open fused_feedforward static build

* add optional and fix code style

* fix output info for fused attention

* add optional param

* merge

25b4ba7f

[AMP] support OD level and skip dynamic loss scaling for bf16 (#53289) · 18e9dcdc
Committed by Zhang Ting on Apr 27, 2023
```
* support OD level and skip dynamic loss scaling for bf16
```
18e9dcdc
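
For context on why dynamic loss scaling can be skipped for bf16: bfloat16 keeps the same 8-bit exponent as float32, so small gradients are far less likely to underflow than in float16, which has a 5-bit exponent. The sketch below shows the general idea of dynamic loss scaling in plain Python. The class name, the parameter names, and the `make_scaler` helper are all hypothetical; this is not Paddle's scaler and not the logic added in this commit.

```python
import math


class DynamicLossScaler:
    """Toy dynamic loss scaler: shrink on overflow, grow after a clean streak."""

    def __init__(self, init_scale=2.0 ** 16, growth=2.0, backoff=0.5,
                 growth_interval=1000):
        self.scale = init_scale
        self.growth = growth
        self.backoff = backoff
        self.growth_interval = growth_interval
        self.good_steps = 0

    def update(self, grads):
        """Return True if the optimizer step should be applied."""
        if any(not math.isfinite(g) for g in grads):
            # Overflow (inf or nan): skip the step and reduce the scale.
            self.scale *= self.backoff
            self.good_steps = 0
            return False
        self.good_steps += 1
        if self.good_steps >= self.growth_interval:
            # Enough consecutive finite steps: try a larger scale again.
            self.scale *= self.growth
            self.good_steps = 0
        return True


def make_scaler(dtype):
    """Hypothetical helper: no scaler for bfloat16, a dynamic one otherwise."""
    return None if dtype == "bfloat16" else DynamicLossScaler()
```

In a training loop built on this sketch, the loss would be multiplied by `scale` before the backward pass and the gradients divided by it before the optimizer step; with `make_scaler("bfloat16")` returning `None`, both steps are simply skipped.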

[Fix CppExtension Unittest] Change CUDAExtension to CppExtension if necessary (#53352) · 3278dec7

Committed by HongyuJia on Apr 27, 2023

* [Fix CppExtension Unittest] Change CUDAExtension to CppExtension if necessary

* Temporarily test cpp_extension under GPU

* Split mixed_extension unittest

3278dec7

[CustomOP Unittest] XPU unittest only keep forward test (#53021) · 89d1dd2e
Committed by HongyuJia on Apr 27, 2023

89d1dd2e

Add jacobian and hessian (#53331) · e8d296ef

Committed by HydrogenSulfate on Apr 27, 2023

* add jacobian and hessian in paddle.autograd

* disable unittest 'func_multi_input' because of a bug in the high-order gradient of multiply

* add dimension checks

* add support for 0-D tensor

* change return type from Jacobian to Hessian in hessian function

* refine Jacobian _flatten function for single xs

* refine support for 0-D tensor

* 1. add 'func_multi_input' unittest now that the multiply_grad_kernel bug is fixed.
2. support non-inplace math operations by overriding magic methods.

* add unittest for math operations and raise an error when a 0-D tensor is indexed

* add ndim check on ys and xs according to is_batched, and add one unittest

* refine docstring of jacobian and hessian

* move paddle.incubate.autograd.Jacobian/Hessian to paddle.incubate.autograd.functional.Jacobian/Hessian

* remove single_input unittest case because numerical differentiation is wrong

* remove 3 unittests because the numerical (reference) result is wrong

* 1. rename autodiff.py to autograd.py
2. increase TIMEOUT to 100

* cancel modification for functional Jacobian/Hessian

* 1. use tuple as return type instead of list
2. refine docstring

* add more unittest cases to improve coverage

* remove 2 unittests of Hessian because the numerical result is wrong

* remove 1 unittest of Hessian because the numerical result is wrong

* remove 1 unittest of Hessian because the numerical result is wrong

* change unit test to shape check

* correct doc and replace incubate API with stable API in _grad

e8d296ef
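
Several bullets above refer to numerical reference results for Jacobian and Hessian tests. As a hedged illustration of how such a reference might be computed, the sketch below uses central finite differences in plain Python. The function names are hypothetical, and this is neither the `paddle.autograd` implementation nor the test code from this commit.

```python
def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f: list[float] -> list[float]."""
    m = len(f(x))
    jac = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return jac


def numerical_hessian(g, x, eps=1e-4):
    """Hessian of a scalar function g, taken as the Jacobian of its gradient."""
    def grad(point):
        # The gradient is the single row of the Jacobian of [g].
        return numerical_jacobian(lambda q: [g(q)], point, eps)[0]
    return numerical_jacobian(grad, x, eps)
```

Finite differences are sensitive to the step size: too large a step adds truncation error and too small a step amplifies rounding error, which is one plausible way for a numerical reference to disagree with an analytic gradient.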

【prim】Concat bug (#53350) · 6768c6ec
Committed by xiaoguoguo626807 on Apr 27, 2023
```
* modify concat_grad add sum comp rule

* modify opcompat
```
6768c6ec

PaddlePaddle / Paddle · synced successfully about 1 year ago