Commits · 798b527c0ccfc0035538778ecc016ea9f9efe586 · PaddlePaddle / Paddle

Mar 01, 2023 · 3 commits
- [XPU] Add kernels for VITDET (#50992) · 798b527c
  Committed by duanyanhui on Mar 01, 2023
```
* add support of int64 add for xpu

* add transpose support for int64

* add randperm kernel

* fix randperm

* add distribute_fpn_proposal kernel

* fix comment

* add reduce_sum_int32
```
- fix custom plugin include headers error (#51013) · a548e70c
  Committed by engineer1109 on Mar 01, 2023
  
- fix gcc12 error (#51037) · ed511175
  Committed by risemeup1 on Mar 01, 2023
  
Feb 28, 2023 · 20 commits

Add gru qat int8 test (#50846) · a0562813

Committed by joanna.wozna.intel on Feb 28, 2023

* Add gru qat int8 test

* Change place of model downloading

* Update paddle/fluid/inference/tests/api/CMakeLists.txt
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Correct flags names and add description

---------
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

[Fix Compile Error] Fix windows+Python3.10 compile error of ssize_t (#50994) · 4557e7e8
Committed by HongyuJia on Feb 28, 2023

【Hackathon No.69】[PHI decoupling] move device_wrapper from fluid to phi (#50749) · 7b6d5ac0
Committed by gouzil on Feb 28, 2023
```
* [phi] move device_wrapper from fluid to phi

* [phi] fix ‘PADDLE_ENFORCE_XDNN_SUCCESS’ was not declared in this scope
```

Fix some typos (#50914) · 5d8fe822
Committed by iLeGend on Feb 28, 2023

Rewrite mkldnn fc rnn fuse pass tester (#50265) · eb22391c
Committed by Hulek on Feb 28, 2023
```
* Added file

* Tests separated and rewritten, fixed fc_lstm_fuse_pass

* Resolve conflicts
```

[Tensor Operants & Prim-Relevant] Tensor API support default value (#50928) · 2e6e188a
Committed by HongyuJia on Feb 28, 2023

[Extension Operants] Extension supports tensor operants (#50869) · 539293e2
Committed by HongyuJia on Feb 28, 2023
```
* [Extension Operants] Extension supports tensor operants

* Polish fluid init_tensor_operants
```

【prim】Matmul double grad composite api (#50452) · a0c473f4

Committed by xiaoguoguo626807 on Feb 28, 2023

* modify name

* merge develop

* original code

* build modify

* success 2*2

* fused dim=1 failed

* success

* modify static

* success for static except dim=1

* delete log

* tmp modify

* success

* success

* add fp1664

* delete fp16 cpu test

* stop windows test

* review modify

* modify tanh test

* modify tanh

* fix_conflixt

* modift static prim

* fix_conflict

* Update test_static_prim.cc

* update

* bug fix

[C++ API GetAllocator] Add C++ `GetAllocator` interface (#50813) · 74446b37
Committed by HongyuJia on Feb 28, 2023
```
* [C++ API GetAllocator] Add C++ GetAllocator interface

* move api to accurate directory
```

add cumsum prim backward (#50565) · ca2b6095

Committed by GGBond8488 on Feb 28, 2023

* add cumsum prim backward

* skip aixs=None test case

* fix op generante eror

* fix static test error

* remove unused code

* fix static test error

* skip cpu float16 test case

* skip eager cpu cumsum float16 test case

* add cinn test

* reshape flatten out

* Disable cinn single test

* remove cinn test

* reformat todo

* add prim in cumsum op test

* remove old test

* fix typro

* fix typro

* fix typro

* pass axis=None test case

* remove forward prim test

* remove same name axis

[XPU] support convert fp16 model (#50790) · f265a313
Committed by zhupengyang on Feb 28, 2023


xpu gaussian_random support fp16 (#50881) · 569b018e
Committed by shentanyue on Feb 28, 2023


fix bug in fused_gemm_epilogue_op.cc (#50980) · 064a5434
Committed by yuehuayingxueluo on Feb 28, 2023

[fp16] suppot fp16 on nn.Dropout2D (#50904) · bf05168c

Committed by 张春乔 on Feb 28, 2023

* add unittest for nn.DropOut2D

* add fp16

* add fp16 in docs of temporal_shift_op.cc

* Update test_dropout_op.py

forbid tensorrt_engine op's output is a persistable var (#50932) · bbf2bc2b
Committed by zhoutianzi666 on Feb 28, 2023
```
* forbid tensorrt_engine op's output is a persistable var
```

xpu-paddlepaddle-57 [Task] adamw lr_radio support (#50979) · dda74715
Committed by taixiurong on Feb 28, 2023


fix gflags from environment not activated (#50864) · a8fff38f
Committed by Yuanle Liu on Feb 28, 2023

fix concat axis bug (#50951) · 75a2f9d5
Committed by wenbin on Feb 28, 2023
```
* fix concat bug

* recommit for ci
```

Count the number of 0 in the output Tensor (#50981) · 6c471ed0
Committed by niuliling123 on Feb 28, 2023


【Prim】Reshape, transpose, cast vjp (#50778) · ab1b6303

Committed by Jiabin Yang on Feb 28, 2023

* support transpose and reshape

* support reshpe, transpose, cast vjp

* merge develop

* recover unused file

* remove prim base

* support problem

* remove additional status settting

* remove additional status settting

* fix ut

* fix ut

* fix ut

* fix no grad branch

* add more test

* disable fp16 in cpu

* fix test


Feb 27, 2023 · 17 commits

[CINN] fix cinn cache key should save var name bug (#50955) · f78b4079
Committed by jiangcheng on Feb 27, 2023


Add inferface of get registered phi kernels (#50814) · 0f8c304a

Committed by zyfncg on Feb 27, 2023

* add inferface of get registered phi kernels

* change KernelType to KernelKey

* add test

* refactor code


[XPU] add fp16 support for shape and lookup_table_v2 op. (#50773) · d2a0577a

Committed by houj04 on Feb 27, 2023

* [XPU] add fp16 support for shape op.

* [XPU] add fp16 support for lookup_table_v2 op.

* update approval list: add qingshu's id.

handle trt engine deserialization failure and rebuild (#50775) · 377cbcea
Committed by Zhang Jun on Feb 27, 2023

【Hackathon No.68】Remove utils in phi (#50833) · 6c181d1d

Committed by 张春乔 on Feb 27, 2023

* remove utils

* remove utils

* remove utils

* remove utils

* Update get_data_from_tensor.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* remove utils

* Update rnn_functor.h

* remove utils

* remove utils

* remove utils

* remove utils

* remove utils

* Update rnn_functor.h

* Update unsqueeze_op.h

* Update utils.h

* roll back

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* add TensorToVector

* roll back

* Update tensor_utils.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update tensor_utils.h

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* TensorCopySync to phi::Copy

* fix codestyle

* rnn_kernel.cc: add ;

* replace all GetDataFromTensor with phi::GetVectorFromTensor

* delete include of util.h

[TRT] Add sm version check for TensorRT flash attention and cross attention pass/plugin (#50830) · 38dad3b9
Committed by Wang Bojun on Feb 27, 2023
```
* add sm version check

* use GetGPUComputeCapability
```
[Tensor Operants & Prim] Tensor pow API uses elementwise_pow (#50886) · 8a097399
Committed by HongyuJia on Feb 27, 2023
```
* [Tensor Operants & Prim] Tensor pow API uses elementwise_pow

* unittest change to fill_constant+elementwise_pow
```
[Error Msg] Polish error message when GPU kernel not found (#50880) · 3e9ffaef
Committed by HongyuJia on Feb 27, 2023
```
* [Error Msg] Polish error message when GPU kernel not found

* Only test in GPU environment
```
Reduce redundant cpu computation in slice compute (#50348) · 8aec0580
Committed by Bo Zhang on Feb 27, 2023
```
* conflict

* add UpdateSliceAttrs
```

change message info (#50546) · 097402d9
Committed by gaoziyuan on Feb 27, 2023


revert operator.cc (#50895) · ec814cf5
Committed by csy0225 on Feb 27, 2023

[kunlun] support reduce_scatter (#50792) · 6786c012
Committed by jameszhang on Feb 27, 2023
```
* [kunlun] support reduce_scatter

* uncomment unittest

* update xccl to 1.0.10
```

Add PADDLE_THROW in ToCudaDataType and polish codes. (#50922) · 2eeaaa7d
Committed by Yiqun Liu on Feb 27, 2023

revert reshape 0 represent copy and support perm < 0 for paddle.transpose (#50720) · 3669868d
Committed by zhouweiwei2014 on Feb 27, 2023


[IR] Type system stage2: add class Type, type uniquer utils, class IRContext (#50412) · a5827f0e

Committed by zhangbo9674 on Feb 27, 2023

* add TypeUniquer and IrContext

* refine include code

* add Type, TypeBase

* add built-in type

* add bulit-in Float32Type

* refine ut

* refine code

* refine code

* delete type_base

* rename ImplType to StorageType

* rename ImplType to StorageType

* add macros util for register type

* add macros util for register type

* refine name

* refine name

* change storage manager

* add multi_thread for ir_ctx

* rwlock_2_spinlock, add REGISTER_TYPE_2_IRCONTEXT

* DECLARE_TYPE_UTILITY_FUNCTOR

* refine ircontext singleton

* del destructor for ParametricStorageManager

* refine code

* Add necessary logs for debugging

* refine ir_context instance

* refine type get interface

* refine code by comment

xpu: bind op scatter_nd_add. add data type for transpose2, clip & assign_value (#50825) · 0d12afea
Committed by wangshengxiang on Feb 27, 2023
```
* [XPU] bind op scatter_nd_add

* [XPU] add more data type for op: clip, transpose2 & assign_value
```

[Bfloat16]register bfloat16 datatype for squared l2 norm (#50908) · 3c121040

Committed by shaojie_wang on Feb 26, 2023

* register bfloat16 datatype for squared l2 norm

* register bfloat16 datatype for softmax with upper triangular mask

* register bfloat16 for tril triu cuda kernel


PaddlePaddle / Paddle · synced successfully more than 1 year ago
