Commits · 2f2bf4e8872356c772baf665dc933114cbaced6b · PaddlePaddle / Paddle

06 Mar, 2023 5 commits

[AMP OP&Test] add bf16 fp16 type support for interpolate (#51153) · 2f2bf4e8
Committed by 傅剑寒 on Mar 06, 2023
```
* add bf16 fp16 type support for interpolate

* add bf16 fp16 support for interpolate in phi on cpu
```

[AMP OP&Test] Fix scale kernel and perfect unit test (#50998) · 02f66747
Committed by 201716010711 on Mar 05, 2023


Add multiprecision for adadelta op (#50131) · a8a2b7f4
Committed by niuliling123 on Mar 06, 2023
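
The multi-precision commits in this log (adadelta, adagrad, rms) concern optimizers that work with low-precision parameters. The following is only a conceptual sketch of why a higher-precision master copy can help, not Paddle's implementation: it uses a plain additive update instead of the real optimizer rules, a Python float as a stand-in for the master weight, and the hypothetical helper names `to_fp16` and `train`.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def train(steps: int, update: float, use_master: bool) -> float:
    """Apply `steps` additive updates and return the final fp16 parameter.

    Hypothetical toy loop: with use_master=True the update is accumulated
    in a higher-precision copy and only the result is rounded to fp16.
    """
    param_fp16 = to_fp16(1.0)  # low-precision parameter seen by the model
    master = 1.0               # assumed higher-precision master copy
    for _ in range(steps):
        if use_master:
            master += update
            param_fp16 = to_fp16(master)
        else:
            # The update is smaller than half the fp16 spacing near 1.0,
            # so it is rounded away on every step.
            param_fp16 = to_fp16(param_fp16 + update)
    return param_fp16


fp16_only = train(100, 1e-4, use_master=False)
with_master = train(100, 1e-4, use_master=True)
print(fp16_only, with_master)
```

In this toy setting the fp16-only parameter never moves, while the run with a master copy ends close to the expected value of 1.01.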


[phi decoupling] decouple dependency to device_context in phi (Part 1) (#50865) · a1006b2b

Committed by Huang Jiyi on Mar 06, 2023

* move DeviceContextPool to phi

* add EmplaceExternalContextFunc

* update namespace

* update cmake

* fix bugs and create context_pool_impl.h

* replace platform::is_xxx_place

* fix bugs

* update generator

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix enforce usage

* Revert "fix enforce usage"

This reverts commit 5f521f08a69713cee506e64a00ec6d9fba709e27.

* fix bugs

* rm XPUDeviceContext and CustomDeviceContext

* fix bugs

* fix context init bug

* fix bugs after merge

* fix bugs

* fix name

* fix mutable_data

* update and fix bugs

* fix bugs

* update

* fix bugs

* fix name

* fix bugs

* merge

* fix bugs

* create context_pool in phi/backends

* create context_pool in phi/backends

* fix bugs

* fix xpu bugs

* fix rocm bugs

* fix bugs

* fix bugs

* fix bugs

* fix xpu bugs

* update

* update

* fix bugs

* fix bugs


oneDNN kernels code cleanup (#50743) · e2054925

Committed by Sławomir Siwek on Mar 06, 2023

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetitive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* increase coverage

* add onednn tests to ctest

* remove fusion logic from base matmuls


03 Mar, 2023 4 commits
- 【Hackathon No.70】[PHI decoupling] move jit kernels from fluid to phi (#50911) · 2d36c9a9
  Committed by gouzil on Mar 03, 2023
```
* [phi] move jit kernels from fluid to phi

* [phi] fix paddle::phi err

* [phi] fix windows 'posix_memalign': identifier not found

* [phi] fix windows 'posix_memalign_free': identifier not found

* [phi] fix readme directory structure, fc_functor  paddle::platform
```
- [PHI Decoupling]Remove memory header (Part2) (#50870) · 558068cc
  Committed by YuanRisheng on Mar 03, 2023
```
* decouple memory copy

* fix ci bugs

* fix ci compile bugs

* fix rocm compile

* fix ci bugs
```
- Fix batch_norm momentum (#51120) · d9fb639c
  Committed by zhangkaihuo on Mar 03, 2023
- Add multi_precision for adagrad op (#50078) · 4779c2c1
  Committed by niuliling123 on Mar 03, 2023
02 Mar, 2023 6 commits
- New executor static build for fluid kernel (#50670) · bf50784c
  Committed by Ruibiao Chen on Mar 02, 2023
```
* Check structured kernel for new executor static build

* Update code

* Ready for resnet50

* Move transfer_dtype to phi

* Ready for transformer

* Fix CI errors

* Fix layer_norm InferMeta

* Remove layer_norm infermeta fix
```
- Cache for cublaslt descriptor (#50931) · 819f8939
  Committed by limingshu on Mar 02, 2023
```
* first commit

* finish base work

* modification for good

* fix for cache setting and gather the algo and desc as one data for cache storage

* fix for cache setting and gather the algo and desc as one data for cache storage

* install pre-commit check
```
- fix zero bug of case21: paddle.mode (#51091) · 25d3ed65
  Committed by chenxiao120660 on Mar 02, 2023
- fix divide zero bug for paddle.all (#51088) · 2bcd3935
  Committed by ahahahahahaha on Mar 02, 2023
- [XPU] add smallest mode for top_k (#51053) · 0fd6e2a1
  Committed by wangshengxiang on Mar 02, 2023
- [AMP OP&Test] register fp16 and bf16 kernel for uniform_random (#50993) · 72f34450
  Committed by Leo Chen on Mar 02, 2023
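
The commit notes above mention gathering "the algo and desc as one data for cache storage". A minimal sketch of that idea, assuming a cache keyed by problem shape and dtype, might look like the following. The class `DescAlgoCache`, the key layout, and the `fake_build` function are hypothetical stand-ins for illustration and do not reflect the actual C++ code in the PR.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[int, int, int, str]   # assumed key: (m, n, k, dtype)
Entry = Tuple[str, str]           # (descriptor, algorithm) stored together


class DescAlgoCache:
    """Cache that stores descriptor and algorithm as one entry per key."""

    def __init__(self, build: Callable[[Key], Entry]) -> None:
        self._build = build
        self._entries: Dict[Key, Entry] = {}
        self.misses = 0

    def get(self, key: Key) -> Entry:
        entry = self._entries.get(key)
        if entry is None:
            # Expensive setup happens only on the first request for a key.
            self.misses += 1
            entry = self._build(key)
            self._entries[key] = entry
        return entry


def fake_build(key: Key) -> Entry:
    """Hypothetical placeholder for descriptor creation and algo search."""
    m, n, k, dtype = key
    return (f"desc[{m}x{k} @ {k}x{n}, {dtype}]", "algo-0")


cache = DescAlgoCache(fake_build)
first = cache.get((64, 128, 32, "float16"))
second = cache.get((64, 128, 32, "float16"))  # served from the cache
other = cache.get((64, 128, 64, "float16"))   # different k, new entry
print(cache.misses)
```

Keeping both values in a single entry means one lookup returns everything needed for a repeated matmul shape.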
```
* register fp16 and bf16 kernel for uniform_random

* fix compile

* support selected_rows

* add ut

* revert cpu

* fp16 test skip cpu
```
01 Mar, 2023 7 commits

Integration flash attention (#49869) · 61611786

Committed by Chitsing KUI on Mar 01, 2023

* flash attn

* seed

* almost

* softmax

* fix workspace

* add unitest; linux only

* fix setup

* fix datatype include

* fix setup typo

* fix def scope

* new error api

* use paddle fork

* fix attr bug; complete ut

* update flash hash

* fix rng reset

* fix offset

* fix comments


[Zero-Dim] Add Expand/Expand_as/Top_k for XPU to support Zero Dim Input. (#50947) · 226b4a95

Committed by yunyaoXYY on Mar 01, 2023

* Add unitest from shilong

* Add kernel code from shilong

* fix codestyle

* add broadcast_shape test

* fix unitest

* fix unitests

* fix unitest

* add 0D grad support

* add 0D grad support

* add 0D grad support

* fix 0D tensor

* fix 0D

* fix xpu 0D

* fix expand kernel

* fix xpu expand

* Fix 0D kernel

* fix 0D

* fix 0D

* fix 0D

* fix 0D

* fix XPU top_k

* cancel the modify of xpu

* add XPU 0D tensor

* fix 0D


fix the backward bug of cumsum (#50997) · 934934d8
Committed by wawltor on Mar 01, 2023


[xpu] fix bugs of split/embedding_with_wltwise_add/beam_search_decode kernel (#51052) · 753fa844
Committed by mayang002 on Mar 01, 2023

fix zero bug of case18: paddle.logsumexp (#51034) · 2f900965
Committed by chenxiao120660 on Mar 01, 2023
```
* fix bug of logsumexp

* fix bug for logsumexp

* fix bug for logsumexp
```

Add multiprecision for rms op (#50132) · 48060b2e
Committed by niuliling123 on Mar 01, 2023


[XPU] Add kernels for VITDET (#50992) · 798b527c

Committed by duanyanhui on Mar 01, 2023

* add support of int64 add for xpu

* add transpose support for int64

* add randperm kernel

* fix randperm

* add distribute_fpn_proposal kernel

* fix comment

* add reduce_sum_int32


28 Feb, 2023 4 commits
- 【Hackathon No.69】[PHI decoupling] move device_wrapper from fluid to phi (#50749) · 7b6d5ac0
  Committed by gouzil on Feb 28, 2023
```
* [phi] move device_wrapper from fluid to phi

* [phi] fix ‘PADDLE_ENFORCE_XDNN_SUCCESS’ was not declared in this scope
```
- [XPU] support convert fp16 model (#50790) · f265a313
  Committed by zhupengyang on Feb 28, 2023
- xpu gaussian_random support fp16 (#50881) · 569b018e
  Committed by shentanyue on Feb 28, 2023
- xpu-paddlepaddle-57 [Task] adamw lr_radio support (#50979) · dda74715
  Committed by taixiurong on Feb 28, 2023
27 Feb, 2023 7 commits

[XPU] add fp16 support for shape and lookup_table_v2 op. (#50773) · d2a0577a

Committed by houj04 on Feb 27, 2023

* [XPU] add fp16 support for shape op.

* [XPU] add fp16 support for lookup_table_v2 op.

* update approval list: add qingshu's id.


【Hackathon No.68】Remove utils in phi (#50833) · 6c181d1d

Committed by 张春乔 on Feb 27, 2023

* remove utils

* remove utils

* remove utils

* remove utils

* Update get_data_from_tensor.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* remove utils

* Update rnn_functor.h

* remove utils

* remove utils

* remove utils

* remove utils

* remove utils

* Update rnn_functor.h

* Update unsqueeze_op.h

* Update utils.h

* roll back

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* add TensorToVector

* roll back

* Update tensor_utils.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update tensor_utils.h

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* TensorCopySync to phi::Copy

* fix codestyle

* rnn_kernel.cc: add ;

* replace all GetDataFromTensor with phi::GetVectorFromTensor

* delete include of util.h

Reduce redundant cpu computation in slice compute (#50348) · 8aec0580
Committed by Bo Zhang on Feb 27, 2023
```
* conflict

* add UpdateSliceAttrs
```

Add PADDLE_THROW in ToCudaDataType and polish codes. (#50922) · 2eeaaa7d
Committed by Yiqun Liu on Feb 27, 2023

revert reshape 0 represent copy and support perm < 0 for paddle.transpose (#50720) · 3669868d
Committed by zhouweiwei2014 on Feb 27, 2023
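
The commit title above says `paddle.transpose` supports `perm < 0`. As a rough illustration of what negative axes conventionally mean (counting from the last dimension), here is a pure-Python sketch. The helper names `normalize_perm` and `transposed_shape` are hypothetical, and the wrap-around rule is an assumption based on common array-library conventions rather than on the PR's code.

```python
from typing import List, Sequence, Tuple


def normalize_perm(perm: Sequence[int], ndim: int) -> List[int]:
    """Map negative axes to their non-negative equivalents and validate."""
    if len(perm) != ndim:
        raise ValueError("perm must list every axis exactly once")
    normalized = [axis + ndim if axis < 0 else axis for axis in perm]
    if sorted(normalized) != list(range(ndim)):
        raise ValueError(f"invalid permutation: {list(perm)}")
    return normalized


def transposed_shape(shape: Sequence[int], perm: Sequence[int]) -> Tuple[int, ...]:
    """Return the shape a tensor would have after applying perm."""
    axes = normalize_perm(perm, len(shape))
    return tuple(shape[axis] for axis in axes)


shape = transposed_shape((2, 3, 4), [-1, 0, 1])
print(shape)
```

Here `-1` refers to the last axis, so the permutation `[-1, 0, 1]` is treated the same as `[2, 0, 1]`.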

xpu: bind op scatter_nd_add. add data type for transpose2, clip & assign_value (#50825) · 0d12afea
Committed by wangshengxiang on Feb 27, 2023
```
* [XPU] bind op scatter_nd_add

* [XPU] add more data type for op: clip, transpose2 & assign_value
```

[Bfloat16]register bfloat16 datatype for squared l2 norm (#50908) · 3c121040

Committed by shaojie_wang on Feb 26, 2023

* register bfloat16 datatype for squared l2 norm

* register bfloat16 datatype for softmax with upper triangular mask

* register bfloat16 for tril triu cuda kernel


26 Feb, 2023 1 commit

Matmul performance optimization with cuBlasLt (#46431) · d4217fc6

Committed by limingshu on Feb 26, 2023


* implement of matmul using cublasLt instead of cublas

* Update matmul_kernel_impl_via_blasLt.h

---------
Co-authored-by: zhangbopd <1299246947@qq.com>
Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>


25 Feb, 2023 1 commit
- Rename elementwise_heaviside to heaviside (#50821) · 8129c22e
  Committed by zyfncg on Feb 25, 2023
```
* rename elementwise_heaviside to heaviside

* delete __init__.py

* fix bug
```
24 Feb, 2023 5 commits


[Zero-Dim] Support 0D Tensor input for topk/broadcast_to/expand/expand_as/broadcast_shape (#50536) · 5041158f
Committed by yunyaoXYY on Feb 24, 2023


Fix KP operator Kernel selection error (#50178) · 6ef3f2ce
Committed by niuliling123 on Feb 24, 2023


supplement header file's code (#50826) · 92cae577
Committed by YuanRisheng on Feb 24, 2023


【prim】Slice grad (#50771) · f6dea800

Committed by xiaoguoguo626807 on Feb 24, 2023

* support prim test in OpTest

* fix cmake

* fix op test

* fix test_input_spec

* disable cinn in reduce_sum unit test

* add bfloat16 dtype for sum

* add approve rules

* polish code

* add clear jit program function

* convert grad out from tensor to numpy

* remove unnecessary code

* add only_prim flag

* fix flag

* fix op test

* add attr

* fix optest comp inplace error

* fix op test

* fix op test with guard

* add initialization of check_comp flag

* fix comp inplace error in op test

* rename check_comp with check_prim and add bfloat16 dtype convert

* rename comp_op_type to prim_op_type

* rename comp to prim

* remove useless code

* skip ci check for only prim

* add no_grad_vars and grad_outputs in prim test

* fix var_dict

* fix op test for only_prim

* fix dy2static bugs

* polish some code

* temp

* modify op test

* except cinn test

* modify bfp16

* modify pad grad

* add pad_grad dtype

* start cinn part

---------
Co-authored-by: Charles-hit <wanghao107@baidu.com>

[XPU] add expand_grad, isnan, meshgrid kernels (#50774) · 7271de88
Committed by ronnywang on Feb 24, 2023
```
* [XPU] add expand_grad, isnan, meshgrid kernels

* update
```

PaddlePaddle / Paddle synced successfully about 1 year ago
