Commits · 3be6791f4009e571e0d7fc89ac5dc3445d24c687 · PaddlePaddle / Paddle

08 Mar, 2023 · 7 commits
- Y
  [AMP OP&Test] Mean fp/bf 16 support (#51114) · 3be6791f
  Committed by YuhangLi on Mar 08, 2023
```
* mean fp16

* fp16

* [AMP OP&Test] mean append bf/fp16

* means append more bf16 uts

* format class name

* fix ci

* fix for windows

* fix issue

* fix redundancy

* fix redund

* fix elewise_max ut bf16 numeric delta

* remove use func
```
  3be6791f
- A
  
  add output defs for nonzero and nms (#51325) · 262358e8
  Committed by Ainavo on Mar 08, 2023
  
  262358e8
- H
  
  Add output defs for some kernels (#51333) · 35d31e9a
  Committed by Huang Jiyi on Mar 08, 2023
  
  35d31e9a
- F
  
  fix wrong backward grad of amin/amax (#51301) · 3239a7b3
  Committed by FlyingQianMM on Mar 08, 2023
  
  3239a7b3
- R
  
  w/o pre-commit (#51315) · 37dbbbd1
  Committed by Ryan on Mar 08, 2023
  
  37dbbbd1
- H
  Add output defs for logical_xxx kernel (#51331) · af3a0675
  Committed by Huang Jiyi on Mar 08, 2023
```
* add output defs

* add output defs for kps
```
  af3a0675
- N
  
  Add mult_precision param for adamax op (#49705) · 151ec311
  Committed by niuliling123 on Mar 08, 2023
  
  151ec311
07 Mar, 2023 · 2 commits
- R
  
  Add output defs for topk kernel (#51233) · b5232bf4
  Committed by Ruibiao Chen on Mar 07, 2023
  
  b5232bf4
- C
  [OpTest]add only_check_prim parameter in check grad (#51210) · 017452e9
  Committed by Charles-hit on Mar 07, 2023
```
* support elementwise_pow bfloat16

* add only_check_prim parameters in check_grad

* modify unit test

* fix floor test

* fix sigmoid bfloat16 test
```
  017452e9
06 Mar, 2023 · 5 commits

[AMP OP&Test] add bf16 fp16 type support for interpolate (#51153) · 2f2bf4e8
Committed by 傅剑寒 on Mar 06, 2023
```
* add bf16 fp16 type support for interpolate

* add bf16 fp16 support for interpolate in phi on cpu
```
2f2bf4e8

[AMP OP&Test] Fix scale kernel and perfect unit test (#50998) · 02f66747
Committed by 201716010711 on Mar 05, 2023

02f66747

Add multiprecision for adadelta op (#50131) · a8a2b7f4
Committed by niuliling123 on Mar 06, 2023

a8a2b7f4

[phi decoupling] decouple dependency to device_context in phi (Part 1) (#50865) · a1006b2b

Committed by Huang Jiyi on Mar 06, 2023

* move DeviceContextPool to phi

* add EmplaceExternalContextFunc

* update namespace

* update cmake

* fix bugs and create context_pool_impl.h

* replace platform::is_xxx_place

* fix bugs

* update generator

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix enforce usage

* Revert "fix enforce usage"

This reverts commit 5f521f08a69713cee506e64a00ec6d9fba709e27.

* fix bugs

* rm XPUDeviceContext and CustomDeviceContext

* fix bugs

* fix context init bug

* fix bugs after merge

* fix bugs

* fix name

* fix mutable_data

* update and fix bugs

* fix bugs

* update

* fix bugs

* fix name

* fix bugs

* merge

* fix bugs

* create context_pool in phi/backends

* create context_pool in phi/backends

* fix bugs

* fix xpu bugs

* fix rocm bugs

* fix bugs

* fix bugs

* fix bugs

* fix xpu bugs

* update

* update

* fix bugs

* fix bugs

a1006b2b

oneDNN kernels code cleanup (#50743) · e2054925

Committed by Sławomir Siwek on Mar 06, 2023

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetitive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* increase coverage

* add onednn tests to ctest

* remove fusion logic from base matmuls

e2054925

03 Mar, 2023 · 4 commits
- G
  【Hackathon No.70】[PHI decoupling] move jit kernels from fluid to phi (#50911) · 2d36c9a9
  Committed by gouzil on Mar 03, 2023
```
* [phi] move jit kernels from fluid to phi

* [phi] fix paddle::phi err

* [phi] fix windows 'posix_memalign': identifier not found

* [phi] fix windows 'posix_memalign_free': identifier not found

* [phi] fix readme directory structure, fc_functor  paddle::platform
```
  2d36c9a9
- Y
  [PHI Decoupling]Remove memory header (Part2) (#50870) · 558068cc
  Committed by YuanRisheng on Mar 03, 2023
```
* decouple memory copy

* fix ci bugs

* fix ci compile bugs

* fix rocm compile

* fix ci bugs
```
  558068cc
- Z
  
  Fix batch_norm momentum (#51120) · d9fb639c
  Committed by zhangkaihuo on Mar 03, 2023
  
  d9fb639c
- N
  
  Add multi_precision for adagrad op (#50078) · 4779c2c1
  Committed by niuliling123 on Mar 03, 2023
  
  4779c2c1
02 Mar, 2023 · 6 commits
- R
  New executor static build for fluid kernel (#50670) · bf50784c
  Committed by Ruibiao Chen on Mar 02, 2023
```
* Check structured kernel for new executor static build

* Update code

* Ready for resnet50

* Move transfer_dtype to phi

* Ready for transformer

* Fix CI errors

* Fix layer_norm InferMeta

* Remove layer_norm infermeta fix
```
  bf50784c
- L
  Cache for cublaslt descriptor (#50931) · 819f8939
  Committed by limingshu on Mar 02, 2023
```
* first commit

* finish base work

* modification for good

* fix for cache setting and gather the algo and desc as one data for cache storage

* fix for cache setting and gather the algo and desc as one data for cache storage

* install pre-commit check
```
  819f8939
- C
  
  fix zero bug of case21: paddle.mode (#51091) · 25d3ed65
  Committed by chenxiao120660 on Mar 02, 2023
  
  25d3ed65
- A
  
  fix divide zero bug for paddle.all (#51088) · 2bcd3935
  Committed by ahahahahahaha on Mar 02, 2023
  
  2bcd3935
- W
  
  [XPU] add smallest mode for top_k (#51053) · 0fd6e2a1
  Committed by wangshengxiang on Mar 02, 2023
  
  0fd6e2a1
- L
  [AMP OP&Test] register fp16 and bf16 kernel for uniform_random (#50993) · 72f34450
  Committed by Leo Chen on Mar 02, 2023
```
* register fp16 and bf16 kernel for uniform_random

* fix compile

* support selected_rows

* add ut

* revert cpu

* fp16 test skip cpu
```
  72f34450
01 Mar, 2023 · 7 commits

Integration flash attention (#49869) · 61611786

Committed by Chitsing KUI on Mar 01, 2023

* flash attn

* seed

* almost

* softmax

* fix workspace

* add unitest; linux only

* fix setup

* fix datatype include

* fix setup typo

* fix def scope

* new error api

* use paddle fork

* fix attr bug; complete ut

* update flash hash

* fix rng reset

* fix offset

* fix comments

61611786

[Zero-Dim] Add Expand/Expand_as/Top_k for XPU to support Zero Dim Input. (#50947) · 226b4a95

Committed by yunyaoXYY on Mar 01, 2023

* Add unitest from shilong

* Add kernel code from shilong

* fix codestyle

* add broadcast_shape test

* fix unitest

* fix unitests

* fix unitest

* add 0D grad support

* add 0D grad support

* add 0D grad support

* fix 0D tensor

* fix 0D

* fix xpu 0D

* fix expand kernel

* fix xpu expand

* Fix 0D kernel

* fix 0D

* fix 0D

* fix 0D

* fix 0D

* fix XPU top_k

* cancel the modify of xpu

* add XPU 0D tensor

* fix 0D

226b4a95

fix the backward bug of cumsum (#50997) · 934934d8
Committed by wawltor on Mar 01, 2023

934934d8

[xpu] fix bugs of split/embedding_with_wltwise_add/beam_search_decode kernel (#51052) · 753fa844
Committed by mayang002 on Mar 01, 2023

753fa844

fix zero bug of case18: paddle.logsumexp (#51034) · 2f900965
Committed by chenxiao120660 on Mar 01, 2023
```
* fix bug of logsumexp

* fix bug for logsumexp

* fix bug for logsumexp
```
2f900965

Add multiprecision for rms op (#50132) · 48060b2e
Committed by niuliling123 on Mar 01, 2023

48060b2e

[XPU] Add kernels for VITDET (#50992) · 798b527c

Committed by duanyanhui on Mar 01, 2023

* add support of int64 add for xpu

* add transpose support for int64

* add randperm kernel

* fix randperm

* add distribute_fpn_proposal kernel

* fix comment

* add reduce_sum_int32

798b527c

28 Feb, 2023 · 4 commits
- G
  【Hackathon No.69】[PHI decoupling] move device_wrapper from fluid to phi (#50749) · 7b6d5ac0
  Committed by gouzil on Feb 28, 2023
```
* [phi] move device_wrapper from fluid to phi

* [phi] fix ‘PADDLE_ENFORCE_XDNN_SUCCESS’ was not declared in this scope
```
  7b6d5ac0
- Z
  
  [XPU] support convert fp16 model (#50790) · f265a313
  Committed by zhupengyang on Feb 28, 2023
  
  f265a313
- S
  
  xpu gaussian_random support fp16 (#50881) · 569b018e
  Committed by shentanyue on Feb 28, 2023
  
  569b018e
- T
  
  xpu-paddlepaddle-57 [Task] adamw lr_radio support (#50979) · dda74715
  Committed by taixiurong on Feb 28, 2023
  
  dda74715
27 Feb, 2023 · 5 commits

[XPU] add fp16 support for shape and lookup_table_v2 op. (#50773) · d2a0577a

Committed by houj04 on Feb 27, 2023

* [XPU] add fp16 support for shape op.

* [XPU] add fp16 support for lookup_table_v2 op.

* update approval list: add qingshu's id.

d2a0577a

【Hackathon No.68】Remove utils in phi (#50833) · 6c181d1d

Committed by 张春乔 on Feb 27, 2023

* remove utils

* remove utils

* remove utils

* remove utils

* Update get_data_from_tensor.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* remove utils

* Update rnn_functor.h

* remove utils

* remove utils

* remove utils

* remove utils

* remove utils

* Update rnn_functor.h

* Update unsqueeze_op.h

* Update utils.h

* roll back

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* add TensorToVector

* roll back

* Update tensor_utils.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update tensor_utils.h

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* TensorCopySync to phi::Copy

* fix codestyle

* rnn_kernel.cc: add ;

* replace all GetDataFromTensor with phi::GetVectorFromTensor

* delete include of util.h

6c181d1d

Reduce redundant cpu computation in slice compute (#50348) · 8aec0580
Committed by Bo Zhang on Feb 27, 2023
```
* conflict

* add UpdateSliceAttrs
```
8aec0580

Add PADDLE_THROW in ToCudaDataType and polish codes. (#50922) · 2eeaaa7d
Committed by Yiqun Liu on Feb 27, 2023

2eeaaa7d

revert reshape 0 represent copy and support perm < 0 for paddle.transpose (#50720) · 3669868d
Committed by zhouweiwei2014 on Feb 27, 2023

3669868d

PaddlePaddle / Paddle: synced successfully almost 2 years ago
