Commits · f71c805e7fe67283039bf7c15565ad3b9bd48b92 · PaddlePaddle / Paddle

May 19, 2023 · 2 commits

[XPU] fix fallback (#53801) · 4b85e5db
Committed by wz1qqx on May 19, 2023

Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e

Committed by limingshu on May 19, 2023

* Reorganize the forward codes of flash-attention.

* Fix forward.

* Remove some unused code.

* Simplify codes and fix backward.

* Change all LOG(INFO) to VLOG and fix the backward.

* Add scale for AF2 flash_attn; many thanks to xreki and shaojie for debugging this code.

* decrease the effect of debug print on performance

* Unify the initialization of flashattn arguments.

* Rewrite the reshape of temp_mask and temp_bias.

* API support use_flash_attn.

* Fix compiling error on CI.

* Try to crop the flash-attention lib.

* Correct the condition of whether can use flash-attn.

* Remove the softmax_out argument.

* Remove is_causal.

* Polish codes.

* Fix qkv_transpose_out's shape and scaling of Q * K.

* Update commit of flash-attention.

---------
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

May 18, 2023 · 2 commits
- move fusion_group kernel to phi (#53781) · 26da689d
  Committed by huangjiyi on May 18, 2023
- Add segment_pool tests (#53785) · 0bed2203
  Committed by co63oc on May 18, 2023
May 15, 2023 · 2 commits
- remove some [-Wunsed-parameter] warning (#53689) · 3e1fffea
  Committed by Galaxy1458 on May 15, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop
```
- [XPU][PHI] bind index_sample_grad xpu kernel (#53753) · 81056073
  Committed by RuohengMa on May 15, 2023
May 12, 2023 · 2 commits
- [CustomDevice] add inference MP support, PART0 (#53719) · d03bbefa
  Committed by ronnywang on May 12, 2023
```
* [CustomDevice] add inference MP support, PART0

* update
```
- [PHI] update xpu api version; bind reduce_any_bool xpu kernel; remove unnecessary header (#53716) · 0603777b
  Committed by RuohengMa on May 12, 2023
May 11, 2023 · 4 commits
- [XPU][PHI Kernels] add pad op for xpu (#53684) · 6f28eb70
  Committed by lijin23 on May 11, 2023
```
* add pad op for xpu

* add pad op for xpu

* add pad op for xpu
```
- remove a part of npu (#53677) · 314d0418
  Committed by jjyaoao on May 11, 2023
- [XPU] add depthwise_conv2d_transpose (#53680) · 08b6f5d6
  Committed by SaltFish11 on May 11, 2023
```
* add_depthwise_conv2d_transpose

* Update test_depthwise_conv2d_transpose_op_xpu.py

Remove print statements
```
- Retire Ascend- and Cambricon-related code: NPU-related code removal, part 2 (#53568) · 0d45ac73
  Committed by 张春乔 on May 11, 2023
May 10, 2023 · 1 commit

[XPU]Conv transpose fp16 && fix unittest (#53626) · 38d664b7

Committed by wz1qqx on May 10, 2023

* fix as review, add fp16 conv2d_transpose

* fix unittest of bn and reduce_mean

* fix bn unittest

* fix ci

* fix ci

May 9, 2023 · 3 commits
- remove some [-Wunused-parameter]warning (#53617) · bafc3469
  Committed by Galaxy1458 on May 9, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop
```
- [PHI kernels] Bind XPU kernels (#53336) · 7e9c87c5
  Committed by RuohengMa on May 9, 2023
```
* bind sparse_coo_tensor, reduce_max/max_int32, range/arange_int32, equal_bool, scatter_grad_float32, nearest_interp_int64 kernels

* add more unit tests; modify compilation logic of xpu sparse kernels
```
- remove some [-Wunused-parameter]warning and WITH_DISTRIBUTE flag (#53532) · 727fa27d
  Committed by Galaxy1458 on May 9, 2023
```
* test,test=develop

* test,test=develop
```
May 8, 2023 · 2 commits
- [XPU] Optimize fp16 xpu models (#53523) · 0a59825e
  Committed by wz1qqx on May 8, 2023
- Fix typos, test=document_fix (#53540) · acefdeb7
  Committed by co63oc on May 8, 2023
May 6, 2023 · 1 commit
- XPU Support external stream (#53334) · 99399f32
  Committed by csy0225 on May 6, 2023
April 28, 2023 · 1 commit
- [XPU][BUG] Add cumsum grad kernel to xpu2 op list (#53386) · 1c1b487c
  Committed by lj970926 on April 28, 2023
```
* clang format

* add cumsum_grad op to xpu2_op_list
```
April 27, 2023 · 1 commit
- [XPU] c_sync_calc_stream support more types (#53389) · 9c1eb98a
  Committed by houj04 on April 27, 2023
April 26, 2023 · 1 commit
- Optimize prompt information (#53291) · 3ec12c2b
  Committed by risemeup1 on April 26, 2023
```
* Optimize prompt information

* add_information

* add_information
```
April 25, 2023 · 1 commit

[PHI]Add flags macro for PHI (#52991) · 22e96bde

Committed by YuanRisheng on April 25, 2023

* add flags for phi

* fix compile bugs

* fix ci bugs

* fix inference bugs

* fix cinn bugs

* fix cinn bugs

* polish code according to review comments

* fix ci bugs

* fix ci bugs

April 24, 2023 · 1 commit

remove some [-Wunused-parameter] (#53185) · 834eb2ba

Committed by Galaxy1458 on April 24, 2023

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test ,test=develop

April 20, 2023 · 2 commits
- [XPU] update numel/size op registration (#53094) · 4212d9ad
  Committed by houj04 on April 20, 2023
```
* [XPU] add numel op

* [XPU] update numel/size op registration
```
- remove ASCEND* keyword (#53046) · 7fa415ca
  Committed by Wang Xin on April 20, 2023
```
* remove ASCEND* keyword

* update docstring

* bug fixed

* bug fixed
```
April 19, 2023 · 1 commit
- [XPU] add numel op (#53041) · 4812d8e4
  Committed by houj04 on April 19, 2023
April 17, 2023 · 1 commit
- remove hccl in some .cc files (#52942) · 514d83de
  Committed by 张春乔 on April 17, 2023
April 14, 2023 · 2 commits

[Dcu]: Add rocsparse_spmm for dcu. (#52200) · 281ea2f4
Committed by umiswing on April 14, 2023

[Zero-Dim] support 0-D tensor for reduce/reshape/stack/prelu/expand_v2/gaussion onednn kernels (#52185) · 6f41e177

Committed by YangQun on April 14, 2023

* support 0-D tensor for reduce/reshape/stack/prelu/expand_v2/gaussion ops

* fix gaussian random mkldnn op ut

April 13, 2023 · 3 commits

delete WITH_ASCEND_CL (#52825) · 4a374c60
Committed by jjyaoao on April 13, 2023
```
* delete WITH_ASCEND_CL

* delete NPU/ and WITH_MLU
```

[enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26

Committed by HongyuJia on April 13, 2023

* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h

* Add logging.h for profiler.cc

* Add logging.h for gloo_utils.h

* Add logging.h for addmm_kernel_impl.h

* Add logging.h for addmm_grad_kernel_impl.h

* Add logging.h for p_send_kernel.cu

* Add logging.h for determinant_grad_kernel_impl.h

* Add logging.h for p_recv_kernel.cu

* Add logging.h for elementwise_grad_base.h

* Add logging.h for transfer_layout_kernel.cc

* Add logging.h for eigvals_kernel.cc and index_select_impl.h

* Add logging.h for all files in kernel directory

* Add logging.h for xpu_info.cc

* Add logging.h for xpu

[XPU] Fix instance_norm、conv2d_xpu、inplace optimizer bugs. (#52627) · fa8abeec
Committed by csy0225 on April 13, 2023

April 10, 2023 · 2 commits
- [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc (#52573) · 3c0b1795
  Committed by HongyuJia on April 10, 2023
```
* [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc

* Add gflags.h for other files

* Add gflags.h for other files

* Add gflags.h for blas_impl.hip.h

* Add gflags.h for miopen_helper.h
```
- support custom device on macos (#52620) · 575cafb4
  Committed by lishicheng1996 on April 10, 2023
April 9, 2023 · 1 commit
- [PHI CAPI] support complex dtype kernel (#52414) · b60f48ce
  Committed by ronnywang on April 9, 2023
```
* [PHI CAPI] support complex dtype kernel

* update
```
April 7, 2023 · 1 commit
- clean up WITH_MLU (#52546) · e75c01f9
  Committed by Wang Xin on April 7, 2023
April 6, 2023 · 1 commit

Remove oneDNN-specific attributes from matmul (#49444) · 4d97b25d

Committed by Sławomir Siwek on April 6, 2023

* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces

* clean attrs in python tests

* delete checkpoint and restore matmul version

* remove unused code

* matmul and reshape/transpose fuses migrated

* split MatmulOneDNN headers

* fuse activation and eltwise_add

* add fuse_activation

* matmul_transpose_reshape/reshape_transpose_matmul

* matmul + elementwise_add (fused)

* activation temporary modification

* restore matmul(v1) version 0

* merge newest develop

* remove dependency from other PR

* revert pbtxt

* remove placeholders from matmul_v2

* add description in OPMaker

* remove matmul_v2_op.h and all dependencies

* remove dims changing in base op

* add possibility to fuse already fused_matmul

* restart broken CI

* Empty-Commit

* revert matmul_utils.h

* codestyle

* adjust imports

* add pbtxt file

* 100% matmul unit tests coverage

* trigger CI with minimal changes to develop

* adjust changes to develop

* add fused_matmul op

* inherit base ops

* add "v2"

* move OPMaker

* Gradually add fused_matmul files

* second batch of fused_matmul changes

* split infershapes of matmul_v2 and fused_matmul

* merge code from other PR

* 2023

* inherit fused_matmul from matmul_v2

* Update paddle/phi/backends/onednn/onednn_reuse.h
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* resolve conflicts

* codestyle

* simplify isgemmlinear

* 2023

* remove import

* reuse methods

* matmul_v2_mkldnn cleanup

* simplify ExecuteMatMulV1Grad

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetitive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* reduce number of modified files

* adjust ExecuteMatmul

* add scales for ut

* dates

* limit number of modified files

* fluid imports

* remove alpha

* codestyle

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

April 3, 2023 · 2 commits
- remove WITH_ASCEND_CL PADDLE_WITH_ASCEND_CL WITH_ASCEND_CXX11 (#52448) · 0b60f28c
  Committed by engineer1109 on April 3, 2023
- Fix gcc12 error when compiling using gcc12 and cuda12 (#50817) · 2f850990
  Committed by risemeup1 on April 3, 2023
```
* fix_gcc12_error

* fix_gcc12_error

* fix gcc12_error

* fix_gcc12_error
```

PaddlePaddle / Paddle · synced successfully about 1 year ago
