Commits · 4ccbcce576241d836b98ce138be006b437c1d108 · PaddlePaddle / Paddle

Apr 28, 2023 · 11 commits
- remove is_npu_pinned_place (#53391) · 4ccbcce5
  Committed by jjyaoao on Apr 28, 2023
  
  4ccbcce5
- 【Hackathon No.52】Add float16 data type support for the Paddle dist operator (#50915) · 9c406531
  Committed by iSerendipity on Apr 28, 2023
  
  9c406531
- [inference][trt]trt support 0 dims (#53383) · 64adfe7a
  Committed by Zhang Jun on Apr 28, 2023
```
* trt support 0 dim

* trt support 0 dim

* update activation ut
```
  64adfe7a
- [KUNLUN] fix pp send /recv on xpu (#53427) · 6d9bbee3
  Committed by Roc on Apr 28, 2023
```
To make it synchronized at the first recv operator.
If wrapping all send and recv operators with group start and end, the received tensor will not be complete.
```
  6d9bbee3
- fix: fix deformable_conv_grad op test (#53415) · 207e0f33
  Committed by wangshengxiang on Apr 28, 2023
  
  207e0f33
- [XPU][BUG] Add cumsum grad kernel to xpu2 op list (#53386) · 1c1b487c
  Committed by lj970926 on Apr 28, 2023
```
* clang format

* add cumsum_grad op to xpu2_op_list
```
  1c1b487c
- Add broadcast_tensors tests (#52961) · 98fc4277
  Committed by co63oc on Apr 28, 2023
  
  98fc4277
- 【Hackathon No.55】add fmin BF16 test (#53100) · 8163faaa
  Committed by superwinner1 on Apr 28, 2023
```
* 'fmin'

* 'fix'

* 'fix'
```
  8163faaa
- Change Py3 use Ubuntu20 Docker (#52523) · 4001f7ae
  Committed by tianshuo78520a on Apr 28, 2023
```
* test py3.8

* fix

* test gcc12

* test gcc12

* test gcc12

* test py3.8

* test py3

* fix

* fix

* add ubuntu20

* add ubuntu20

* add ubuntu20

* test py3

* fix error

* fix error

* fix error

* fix error

* fix error

* update py version

* fix

* test docker

* test docker

* test docker

* fix pip version,use pip3.9

* log error

* log add

* test ci

* add test log

* del

* fix

* test ci

* test ci

* update dockerfile

* update ci_dockerfile

* update ci_dockerfile

* update ci_dockerfile

* fix

* add git

* fix
```
  4001f7ae
- fix c_softmax deterministic (#53419) · f1e3575e
  Committed by sneaxiy on Apr 28, 2023
  
  f1e3575e
- 【Prim】comp_elementwise_double_grad (first part) (#53385) · 05499c71
  Committed by xiaoguoguo626807 on Apr 28, 2023
```
* add mul double grad

* add sub_double_grad

* add add sub high test

* add multiply test

* modify other unsqueeze

* delete api.yaml

* only for make ci run

* modify unsqueeze

* modify unsqueeze

* tmp

* modify operants gen
```
  05499c71
Apr 27, 2023 · 29 commits

Support different dtypes of inputs for broadcast for dropout optimization (#52093) · 3474e09c

Committed by Bo Zhang on Apr 27, 2023

* change judgement for DropoutGradGPUKernelDriver

* add UnrollerWithoutVecSize and after this Loaddata to be refined

* pass unittest

* use same unroller with XPU

* BroadcastWithInt64Index

* BroadcastDataLoader template partial specialization

* fix compile errs in ROCms

* PR comment

3474e09c

[phi] Move sequence_pool to phi - Step 3: sequence_pool_grad_op (#52680) · fe053396

Committed by gouzil on Apr 27, 2023

* [phi] move sequence_pool kernel to phi

* mv kernels impl

* fix parameter error

* clean include

* fix compat filename

* [phi] move fluid sequence_pool_grad to phi

* [phi][compat] sig rm GradVarName

* [phi] fix sequence_pool out type

* [phi] rm impl, add const string

* [phi] fix const str

* fix sequence_pooling cmake

* [phi] mv sequence_pooling_test

* [phi] fix grad sig

* [phi] fix sequence_pool is_test error

* [phi] fix sequence_pooling gpu include

* [phi] mv to impl

* [phi] fix SequencePoolFunctor cu include

* [phi] modify out max_index int32_t

* [phi] add pooltype mapping determine

* [phi] fix sequence_pool_sig

* [phi] fix sequence_pool_sig sum

* [phi] try ci

* [phi] fix max_index optional

fe053396

scale trt converter support int64 (#53388) · 182b6f83
Committed by Yuanle Liu on Apr 27, 2023

182b6f83
xpu quant weight only (#53306) · 1c97aa69
Committed by zhupengyang on Apr 27, 2023

1c97aa69
[Dy2St]Get grad names when call append backward to fix high order gradient (#53250) · 2d17df97
Committed by WangZhen on Apr 27, 2023
```
[Dy2St]Get grad names when call append backward to fix high order gradient (#53250)
```
2d17df97
Update inference approve list (#53399) · a3a91682
Committed by WangZhen on Apr 27, 2023
```
* Update slim approve list

* Fix id, test=document_fix
```
a3a91682
set sync_param default true (#53335) · 421f56a8
Committed by wuhuachaocoding on Apr 27, 2023

421f56a8
[XPU] c_sync_calc_stream support more types (#53389) · 9c1eb98a
Committed by houj04 on Apr 27, 2023

9c1eb98a

[static op generation] triangular_solve (#53328) · 18968e7e

Committed by gouzil on Apr 27, 2023

* [static op generation] triangular_solve

* [phi] mv triangular_solve_grad to static_backward

* [phi] fix import

* [phi] mv to ops.yaml, backward.yaml

* fix forward attr

* [phi] fix triangular_solve_grad args

18968e7e

[CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily (#53382) · 9ab14865
Committed by HongyuJia on Apr 27, 2023
```
* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily

* Add unittest
```
9ab14865
【Hackathon No.91】register_hook for static mode (#52948) · db30aa1d
Committed by yangguohao on Apr 27, 2023

db30aa1d
autogen code support for max_pool[2,3]_with_index op (#53359) · cf6cbc34
Committed by Wang Xin on Apr 27, 2023

cf6cbc34

【PaddlePaddle Hackathon 4】: Support the float16 data type for the maxout operator (#50976) · 8bfd978f

Committed by NetPunk on Apr 27, 2023

* support fp16 for maxout op

* format code

* change api

* add test for static float16

* format code

* formatting code

* atol alignment

* experiment—1

* experiment-2

* experiment-3

* format code

8bfd978f

Move fused feedforward (#53166) · 25b4ba7f

Committed by Sonder on Apr 27, 2023

* trans fused_feedward Compute function to phi

* add register info

* remove maxfunctor

* move fused feedward to phi

* remove sig file

* remove fliud include

* add include

* add include

* add sig file

* add output register info

* fix sig file

* Update fused_feedforward_sig.cc

* fix grad kernel

* update output register info

* fix

* open fused_feedforward static build

* add optional and fix code style

* fix output info for fused attention

* add optional param

* merge

25b4ba7f

[AMP] support OD level and skip dynamic loss scaling for bf16 (#53289) · 18e9dcdc
Committed by Zhang Ting on Apr 27, 2023
```
* support OD level and skip dynamic loss scaling for bf16
```
18e9dcdc

[Fix CppExtension Unittest] Change CUDAExtension to CppExtension if necessary (#53352) · 3278dec7

Committed by HongyuJia on Apr 27, 2023

* [Fix CppExtension Unittest] Change CUDAExtension to CppExtension if necessary

* Temporarily test cpp_extension under GPU

* Split mixed_extension unittest

3278dec7

[CustomOP Unittest] XPU unittest only keep forward test (#53021) · 89d1dd2e
Committed by HongyuJia on Apr 27, 2023

89d1dd2e

Add jacobian and hessian (#53331) · e8d296ef

Committed by HydrogenSulfate on Apr 27, 2023

* add jacobian and hessian in paddle.autograd

* disable unitest 'func_multi_input' for bug in high-order gradient of multiply

* add dimension checks

* add support for 0-D tensor

* change return type from Jacobian to Hessian in hessian function

* refine Jacobian _flatten function for single xs

* refine support for 0-D tensor

* 1. add 'func_multi_input' unitest for multiply_grad_kernel bug fixed already.
2. support non-inplace math operation via magical method overwriting.

* add unitest for math operation and raise error when 0-D tensor is indexed

* add ndim check on ys and xs according to is_batched, and add one unitest

* refine docstring of jacobian and hessian

* move paddle.incubate.autograd.Jacobian/Hessian to paddle.incubate.autograd.functional.Jacobian/Hessian

* remove single_input unitest case because numerical differentiation is wrong

* remove 3 unitest for numerical result(reference result) is wrong

* 1. rename autodiff.py to autograd.py
2. increase TIMEOUT to 100

* cancel modification for functional Jacobian/Hessian

* 1. use tuple as return type instead of list
2. refine docstring

* add more unitest case to improve coverage

* remove 2 unitest of Hessian for numerical result is wrong

* remove 1 unitest of Hessian for numerical result is wrong

* remove 1 unitest of Hessian for numerical result is wrong

* change unit test to shape check

* correct doc and replace incubate API to stable API in _grad

e8d296ef

【prim】Concat bug (#53350) · 6768c6ec
Committed by xiaoguoguo626807 on Apr 27, 2023
```
* modify concat_grad add sum comp rule

* modify opcompat
```
6768c6ec
updata Adamw.py (#52984) · c0ee14f6
Committed by hua-zi on Apr 27, 2023
```
* updata Adamw.py

out.backward()  -> loss.backward()

* Update adamw.py
```
c0ee14f6
refine SynchronizeAllDevice (#53370) · 35af5818
Committed by wanghuancoder on Apr 27, 2023

35af5818
Hack__getitem__ from 0-d to 1-d with FLAGS_set_to_1d (#53358) · 1bd468e2
Committed by JYChen on Apr 27, 2023

1bd468e2
revert pr https://github.com/PaddlePaddle/Paddle/pull/46779 (#53373) · 2c12abd7
Committed by Wilber on Apr 27, 2023

2c12abd7
fix softmax assert error (#53360) · c50f5fa4
Committed by engineer1109 on Apr 27, 2023

c50f5fa4

remove some [-Wunused-parameter] warning (#53365) · 0fac3281

Committed by Galaxy1458 on Apr 27, 2023

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

0fac3281

[XPU] remove scale_loss in parallel.py (#53337) · 2e1ac529
Committed by houj04 on Apr 27, 2023
```
* [XPU] remove scale_loss in parallel.py

* [XPU] throw Unimplemented when using Reducer
```
2e1ac529
Register fluid xpu kerenls to phi [part 2] (#53188) · eee9c788
Committed by huangjiyi on Apr 27, 2023
```
* update

* fix bug
```
eee9c788
update cmake3.16 to 3.18 (#53288) · 166964b1
Committed by risemeup1 on Apr 27, 2023
```
* update cmake3.16 to 3.18

* test

* Update Dockerfile.ubuntu
```
166964b1
【Hackathon No.55】add fmax BF16 test (#51925) · 8a6ad6e5
Committed by superwinner1 on Apr 27, 2023

8a6ad6e5

PaddlePaddle / Paddle · synced successfully about 1 year ago
