Commits · 296b3ff04562691467f053845e6febfc8277309b · PaddlePaddle / Paddle

01 Mar, 2023 14 commits

add topk prim backward (#50679) · 296b3ff0

Committed by zqw_1997 on Mar 01, 2023

* tmp gather vjp

* support gather

* remove useless code

* fix compiling error

* fix ut

* add eager test

* add eager test

* add seed

* small change

* fix cpu error

* fix transpose op compat

* remove tensor index case

* fix prim_cinn

* small commit

* add cumsum prim backward

* small commit

* skip aixs=None test case

* fix op generante eror

* fix static test error

* remove unused code

* fix static test error

* small commit

* skip cpu float16 test case

* skip eager cpu cumsum float16 test case

* add eager and static UT

* fix ut

* add composite backward rule

* fix error

* fix type error and format error

* add try cpu+float16 test

* fix test bugs

* remove test for cpu+float16 and make y[0] be the grad arg

* add cinn test

* fix UT

* fix the wrong dim of v in test cases

* change y[0] to y[1] for grad in UT

* reshape flatten out

* Disable cinn single test

* use scatter_nd_add

* modify the reshape part of topk_grad

* delete useless build file

* to make the syntax right

* modify bug

* try use of put_along_axis

* remove cinn test

* reformat todo

* add silu composite rule

* fix code style.

* add cinn test

* fix composite grad maker code gen

* add prim in cumsum op test

* remove old test

* fix typro

* pass the static test

* fix typro

* modify optest and delete old test files

* remove normal test_top_k_op test

* fix typro

* pass axis=None test case

* buffer comment

* for debug

* add silu fp16 unit test.

* add static guard

* remove forward prim test

* remove same name axis

* modify the test_top_v2_op.py to pass all local tests

* delete the useless testcase

* fix mistake

* add more testcases to test dtype16 and dtype32

---------
Co-authored-by: JiabinYang <360788950@qq.com>
Co-authored-by: GGBond8488 <857631483@qq.com>
Co-authored-by: zxcd <228587199@qq.com>
Co-authored-by: Charles-hit <wanghao107@baidu.com>

296b3ff0

[Zero-Dim] Add Expand/Expand_as/Top_k for XPU to support Zero Dim Input. (#50947) · 226b4a95

Committed by yunyaoXYY on Mar 01, 2023

* Add unitest from shilong

* Add kernel code from shilong

* fix codestyle

* add broadcast_shape test

* fix unitest

* fix unitests

* fix unitest

* add 0D grad support

* add 0D grad support

* add 0D grad support

* fix 0D tensor

* fix 0D

* fix xpu 0D

* fix expand kernel

* fix xpu expand

* Fix 0D kernel

* fix 0D

* fix 0D

* fix 0D

* fix 0D

* fix XPU top_k

* cancel the modify of xpu

* add XPU 0D tensor

* fix 0D

226b4a95

fix the backward bug of cumsum (#50997) · 934934d8
Committed by wawltor on Mar 01, 2023

934934d8

[xpu] fix bugs of split/embedding_with_wltwise_add/beam_search_decode kernel (#51052) · 753fa844
Committed by mayang002 on Mar 01, 2023

753fa844
rename distributed_fused_lamb attr ring_id->ring_ids (#51000) · a348a423
Committed by TaoTao Li on Mar 01, 2023

a348a423

fix zero bug of case18: paddle.logsumexp (#51034) · 2f900965
Committed by chenxiao120660 on Mar 01, 2023
```
* fix bug of logsumexp

* fix bug for logsumexp

* fix bug for logsumexp
```
2f900965

add op map (#51026) · 83f61bd5
Committed by cyber-pioneer on Mar 01, 2023

83f61bd5

[XPU] Fix xpu_fuse_pass error caused by weight sharing by other operators. (#51039) · 1054b23e
Committed by csy0225 on Mar 01, 2023

1054b23e

fix cumsum prim op maker type error (#51014) · add510b9
Committed by GGBond8488 on Mar 01, 2023

add510b9

[XPU] delete op device (#51029) · c9309942
Committed by zhupengyang on Mar 01, 2023

c9309942

Add multiprecision for rms op (#50132) · 48060b2e
Committed by niuliling123 on Mar 01, 2023

48060b2e

[XPU] Add kernels for VITDET (#50992) · 798b527c

Committed by duanyanhui on Mar 01, 2023

* add support of int64 add for xpu

* add transpose support for int64

* add randperm kernel

* fix randperm

* add distribute_fpn_proposal kernel

* fix comment

* add reduce_sum_int32

798b527c

fix custom plugin include headers error (#51013) · a548e70c
Committed by engineer1109 on Mar 01, 2023

a548e70c

fix gcc12 error (#51037) · ed511175
Committed by risemeup1 on Mar 01, 2023

ed511175

28 Feb, 2023 20 commits

Add gru qat int8 test (#50846) · a0562813

Committed by joanna.wozna.intel on Feb 28, 2023

* Add gru qat int8 test

* Change place of model downloading

* Update paddle/fluid/inference/tests/api/CMakeLists.txt
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Correct flags names and add description

---------
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

a0562813

[Fix Compile Error] Fix windows+Python3.10 compile error of ssize_t (#50994) · 4557e7e8
Committed by HongyuJia on Feb 28, 2023

4557e7e8

【Hackathon No.69】[PHI decoupling] move device_wrapper from fluid to phi (#50749) · 7b6d5ac0
Committed by gouzil on Feb 28, 2023
```
* [phi] move device_wrapper from fluid to phi

* [phi] fix ‘PADDLE_ENFORCE_XDNN_SUCCESS’ was not declared in this scope
```
7b6d5ac0

Fix some typos (#50914) · 5d8fe822
Committed by iLeGend on Feb 28, 2023

5d8fe822

Rewrite mkldnn fc rnn fuse pass tester (#50265) · eb22391c
Committed by Hulek on Feb 28, 2023
```
* Added file

* Tests separated and rewritten, fixed fc_lstm_fuse_pass

* Resolve conflicts
```
eb22391c

[Tensor Operants & Prim-Relevant] Tensor API support default value (#50928) · 2e6e188a
Committed by HongyuJia on Feb 28, 2023

2e6e188a

[Extension Operants] Extension supports tensor operants (#50869) · 539293e2
Committed by HongyuJia on Feb 28, 2023
```
* [Extension Operants] Extension supports tensor operants

* Polish fluid init_tensor_operants
```
539293e2

【prim】Matmul double grad composite api (#50452) · a0c473f4

Committed by xiaoguoguo626807 on Feb 28, 2023

* modify name

* merge develop

* original code

* build modify

* success 2*2

* fused dim=1 failed

* success

* modify static

* success for static except dim=1

* delete log

* tmp modify

* success

* success

* add fp1664

* delete fp16 cpu test

* stop windows test

* review modify

* modify tanh test

* modify tanh

* fix_conflixt

* modift static prim

* fix_conflict

* Update test_static_prim.cc

* update

* bug fix

a0c473f4

[C++ API GetAllocator] Add C++ `GetAllocator` interface (#50813) · 74446b37
Committed by HongyuJia on Feb 28, 2023
```
* [C++ API GetAllocator] Add C++ GetAllocator interface

* move api to accurate directory
```
74446b37

add cumsum prim backward (#50565) · ca2b6095

Committed by GGBond8488 on Feb 28, 2023

* add cumsum prim backward

* skip aixs=None test case

* fix op generante eror

* fix static test error

* remove unused code

* fix static test error

* skip cpu float16 test case

* skip eager cpu cumsum float16 test case

* add cinn test

* reshape flatten out

* Disable cinn single test

* remove cinn test

* reformat todo

* add prim in cumsum op test

* remove old test

* fix typro

* fix typro

* fix typro

* pass axis=None test case

* remove forward prim test

* remove same name axis

ca2b6095

[XPU] support convert fp16 model (#50790) · f265a313
Committed by zhupengyang on Feb 28, 2023

f265a313

xpu gaussian_random support fp16 (#50881) · 569b018e
Committed by shentanyue on Feb 28, 2023

569b018e

fix bug in fused_gemm_epilogue_op.cc (#50980) · 064a5434
Committed by yuehuayingxueluo on Feb 28, 2023

064a5434

[fp16] suppot fp16 on nn.Dropout2D (#50904) · bf05168c

Committed by 张春乔 on Feb 28, 2023

* add unittest for nn.DropOut2D

* add fp16

* add fp16 in docs of temporal_shift_op.cc

* Update test_dropout_op.py

bf05168c

forbid tensorrt_engine op's output is a persistable var (#50932) · bbf2bc2b
Committed by zhoutianzi666 on Feb 28, 2023
```
* forbid tensorrt_engine op's output is a persistable var
```
bbf2bc2b

xpu-paddlepaddle-57 [Task] adamw lr_radio support (#50979) · dda74715
Committed by taixiurong on Feb 28, 2023

dda74715

fix gflags from environment not activated (#50864) · a8fff38f
Committed by Yuanle Liu on Feb 28, 2023

a8fff38f

fix concat axis bug (#50951) · 75a2f9d5
Committed by wenbin on Feb 28, 2023
```
* fix concat bug

* recommit for ci
```
75a2f9d5

Count the number of 0 in the output Tensor (#50981) · 6c471ed0
Committed by niuliling123 on Feb 28, 2023

6c471ed0

【Prim】Reshape, transpose, cast vjp (#50778) · ab1b6303

Committed by Jiabin Yang on Feb 28, 2023

* support transpose and reshape

* support reshpe, transpose, cast vjp

* merge develop

* recover unused file

* remove prim base

* support problem

* remove additional status settting

* remove additional status settting

* fix ut

* fix ut

* fix ut

* fix no grad branch

* add more test

* disable fp16 in cpu

* fix test

ab1b6303

27 Feb, 2023 6 commits

[CINN] fix cinn cache key should save var name bug (#50955) · f78b4079
Committed by jiangcheng on Feb 27, 2023

f78b4079

Add inferface of get registered phi kernels (#50814) · 0f8c304a

Committed by zyfncg on Feb 27, 2023

* add inferface of get registered phi kernels

* change KernelType to KernelKey

* add test

* refactor code

0f8c304a

[XPU] add fp16 support for shape and lookup_table_v2 op. (#50773) · d2a0577a

Committed by houj04 on Feb 27, 2023

* [XPU] add fp16 support for shape op.

* [XPU] add fp16 support for lookup_table_v2 op.

* update approval list: add qingshu's id.

d2a0577a

handle trt engine deserialization failure and rebuild (#50775) · 377cbcea
Committed by Zhang Jun on Feb 27, 2023

377cbcea

【Hackathon No.68】Remove utils in phi (#50833) · 6c181d1d

Committed by 张春乔 on Feb 27, 2023

* remove utils

* remove utils

* remove utils

* remove utils

* Update get_data_from_tensor.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* remove utils

* Update rnn_functor.h

* remove utils

* remove utils

* remove utils

* remove utils

* remove utils

* Update rnn_functor.h

* Update unsqueeze_op.h

* Update utils.h

* roll back

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* add TensorToVector

* roll back

* Update tensor_utils.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update tensor_utils.h

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* TensorCopySync to phi::Copy

* fix codestyle

* rnn_kernel.cc: add ;

* replace all GetDataFromTensor with phi::GetVectorFromTensor

* delete include of util.h

6c181d1d

[TRT] Add sm version check for TensorRT flash attention and cross attention pass/plugin (#50830) · 38dad3b9
Committed by Wang Bojun on Feb 27, 2023
```
* add sm version check

* use GetGPUComputeCapability
```
38dad3b9

PaddlePaddle / Paddle synced successfully about 1 year ago
