Commits · f2c96bc264854a3176890c51187f94ddad3ee44b · PaddlePaddle / Paddle

29 Mar, 2023 · 1 commit
- Fix generate_kernels.py in CUDA 12.0 (#52232) · f2c96bc2
  Committed by sneaxiy on Mar 29, 2023
```
* fix generate_kernels.py in CUDA 12.0

* fix attrs bug
```
24 Mar, 2023 · 1 commit

Memory Efficient Attention (#51867) · e5ad3859

Committed by ZhangDY-6483 on Mar 24, 2023

* first version, notest

* return final rst, notest

* use infinity() instead of max

* ut structure

* start up of ut

* generate lse

* update

* add dependency

* reconstruct cmake

* move file

* add memory efficient attention and fix blasimpl

* update

* update cmake

* add namespace

* update cmake

* use .cu

* update for pad3d

* bug fix

* bug fix

* update

* bug fix

* update enforce

* add test case

* merge the lse pad

* fix kernel_fn of backward

* fix PADDLE_ENFORCE_EQ and phi_api

* fix PADDLE_ENFORCE

* fix PADDLE_ENFORCE

* rerun coverage

* fix memory efficient attention test

* rerun ci

* add cuda version condition

* add cuda version condition

* delete WIP test

* replace PADDLE_ENFORCE

* edit the namespace of datatype in multiple.cc

* rerun

* rerun

---------
Co-authored-by: Nliuyuang <liuyuang@baidu.com>

22 Mar, 2023 · 2 commits

add fused dropout add (#51752) · 6ba0507d

Committed by ShenLiang on Mar 22, 2023

Add fused_linear_param_grad_add_kernel (#51805) · f59c5d8b

Committed by sneaxiy on Mar 22, 2023

* add fused_linear_param_grad_add_kernel

* fix compile error

* remove flag

* fix ci compile error

* fix ci compile error

* revert pylayer revision

* fix ci ut

* improve performance

16 Mar, 2023 · 1 commit

Update from_blob API (#51646) · c07c7712

Committed by Huang Jiyi on Mar 16, 2023

* remove contexts in tensor_utils

* update from_blob

* update from_blob

* update from_blob

* fix bug

* fix bug

15 Mar, 2023 · 1 commit
- [cutlass] Fix make (#51718) · c665400b
  Committed by umiswing on Mar 15, 2023

13 Mar, 2023 · 1 commit

[Paddle Inference ]use python to generate cutlass code (#50603) · 4e9e23cb

Committed by zhoutianzi666 on Mar 13, 2023

* use python to generate cutlass code

* refine CommonConvKernelPart1, CommonConvKernelPart2

* remove useless code in generate_cutlass_code.sh

* add more config in conv2d_residual

* CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2

* add group conv support in util.cu

* remove .sh

* refine name

* make name good

* add fuse_alpha

* make code easy to understand

* mot fopen generate in py

* use python script to generate conv2d,group=1 cutlass code

* use const &

* use const & && use python script to generate conv2d/group=1 code

10 Mar, 2023 · 1 commit

[New features]Add function node in phi_kernel for MKLDNN (#51073) · a0a6dc6a

Committed by HappyHeavyRain on Mar 10, 2023

* Add function node in phi_kernel for MKLDNN

* fix the bug in 'BuildInferVarKernelContext'

* add infer_varkernel_utils.cc

* fix the bug: the first two parameters of 'BuildInferVarKernelContext' can't be template variable

* change the code according to first review

* change the code according to first review

* change the mode of paddle_build.sh

* change 'infer_var_kernel_fn_' to 'get_kerneltype_forvar_fn_'

* add the error information

* fix NotFound information warning

* fix NotFound information warning

* fix NotFound information warning

09 Mar, 2023 · 1 commit

Add comm context manager, add phi broadcast op (#51072) · c191b707

Committed by TaoTao Li on Mar 09, 2023

* add comm context for device context

* add broadcast phi operator kernel and api

* add broadcast support dtype, update ut

* fix broadcast bfloat16 type

* fix ut

* update test_collective_broadcast_api timeout to 300

01 Mar, 2023 · 1 commit

Integration flash attention (#49869) · 61611786

Committed by Chitsing KUI on Mar 01, 2023

* flash attn

* seed

* almost

* softmax

* fix workspace

* add unittest; linux only

* fix setup

* fix datatype include

* fix setup typo

* fix def scope

* new error api

* use paddle fork

* fix attr bug; complete ut

* update flash hash

* fix rng reset

* fix offset

* fix comments

16 Feb, 2023 · 1 commit

[phi decoupling] remove variable.h in phi (#50407) · 905cefd4

Committed by Huang Jiyi on Feb 16, 2023

* move variable_utils from phi_api_utils to fluid

* fix comment

* update include

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* update

* update

* fix CI-Windows-OpenBLAS

* fix bugs

* fix bugs

* fix bugs

* update include

* move variable_utils to phi_utils

* fix namespace

03 Jan, 2023 · 1 commit

[Paddle Inference] Implement conv2d_fusion NHWC format using cutlass (#47989) · c123dd1e

Committed by zhoutianzi666 on Jan 03, 2023

* Implement conv2d_fusion NHWC format using CUTLASS
* Add unit testing for CUTLASS Conv in inference
* Add experimental API for CUTLASS.

23 Dec, 2022 · 1 commit
- add rnn-t loss and api (#49199) · c088f9ec
  Committed by Hui Zhang on Dec 23, 2022
```
* add warp transducer code
```
22 Dec, 2022 · 1 commit
- [Paddle Inference] Add moe phi kernel (#48703) · def2a87f
  Committed by xiaoxiaohehe001 on Dec 22, 2022

19 Dec, 2022 · 1 commit
- [PHI decoupling] move gather_scatter_kernel from fluid to phi (#49132) · 0b79129d
  Committed by huangjiyi on Dec 19, 2022
```
* move gather_scatter_kernel from fluid to phi

* mv gather_scatter_kernel to gather_scatter_functor
```
17 Dec, 2022 · 1 commit
- refactor: rename xccl files (#49127) · d4f43ad4
  Committed by Wen Sun on Dec 17, 2022

16 Dec, 2022 · 1 commit
- refactor: rename files (#49117) · 40f3f4f0
  Committed by Wen Sun on Dec 16, 2022

06 Dec, 2022 · 1 commit

Clear extra input (Bias, ResidualData) in OpMaker of conv2d (#47579) · 0a2dfa38

Committed by zyfncg on Dec 06, 2022

* delete Bias and ResidualData in OpMaker of conv2d

* delete extra input of conv3d

* refactor pass of conv_bias_fusion

* fix mkldnn dependency

* fix mkldnn compile

* fix test_conv_bias_mkldnn_fuse_pass

* polish some code

* remove useless log

* fix analyzer_vit_ocr_tester

* fix conv_activation_mkldnn_fuse_pass

* fix test_analyzer_ocr

* add fused_conv_sig

* fix performance regression

* fix performance regression

05 Dec, 2022 · 1 commit
- move device_memory_aligment from fluid to phi (#48694) · 796499fd
  Committed by huangjiyi on Dec 05, 2022

18 Nov, 2022 · 1 commit

CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b

Committed by Tian Zheng on Nov 18, 2022

* Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation

* Fix macro

* Add implementation for conv_kernel and conv_grad_kernel

* Modification after rebase onto latest develop

* Modify plan cache to comply with the API of phi::autotune

* Refactor to reduce duplicate code

* Review fix:
- move functions in  conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernelk.cu
- add const specifier for input tensor
- add logging when plans fail to execute
- move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h

* move plan building outside of cache

* Fix ROCM build

31 Oct, 2022 · 1 commit
- [CustomDevice] GetCCLComm add custom device support (#47168) · 34d13d6a
  Committed by ronnywang on Oct 31, 2022
```
* [CustomDevice] GetCCLComm add custom device support

* update

* update

* update
```
20 Oct, 2022 · 1 commit
- Add infer prune function (#47046) · af9486fc
  Committed by JingZhuangzhuang on Oct 20, 2022
```
* Add infer prune function

* Update phi.cmake

* Update operators.cmake

* add fusion op
```
19 Sep, 2022 · 1 commit
- [Sparse] Add infer meta (#46016) · 4b95f85e
  Committed by zhangkaihuo on Sep 19, 2022
```
* sparse infer_meta
```
09 Sep, 2022 · 1 commit
- [Phi] Add fusion kernel dir and migrate fused_softmax_mask op (#45802) · 2b4f44d5
  Committed by Chen Weihang on Sep 09, 2022
```
* add fusion dir and fuse_softmax_mask kernel

* remove fusion kernel dir

* migrate infershape

* fix code error
```
06 Sep, 2022 · 1 commit

[PHI]Add TensorArray for PHI (#45479) · 68f99b78

Committed by YuanRisheng on Sep 06, 2022

* add tensor array

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* update by comment

* update code

02 Sep, 2022 · 1 commit
- [XPU]Migrate Adam XPU kernel into Phi (#45572) · cbabbe2e
  Committed by Aurelius84 on Sep 02, 2022
```
* [XPU]Migrate Adam XPU kernel into Phi

* test=kunlun
```
30 Aug, 2022 · 1 commit

[phi] Transfer coalesce_tensor to phi (#45478) · cf9d651b

Committed by HongyuJia on Aug 30, 2022

* add coalesce_tensor kernel

* polish coalesce_tensor kernel

* add sig and InferMeta

* add testcase

* add legacy_api.yaml

* fix infermeta

* fix yaml

* fix kernel implementation

* add compile dependency of phi/kernels

* fix MetaConfig

* add python api

* add and fix testcase

* rnn.py add import

* change _C_ops.coalesce_tensor

* remove useless comments

* add SetBackend

* restore XPU kernel temporarily

* fix code according to PR comments

26 Aug, 2022 · 1 commit

Transfer transfer_layout from fluid to phi (#45261) · 985f2a4a

Committed by kangguangli on Aug 26, 2022

* remove fluid kernel and activate phi kernel

* fix parameter error

* transfer mkldnn part

* modify header file path

* fix compile error

* transfer special case

* fix lod setting and special case for layout setting

* add testcase and refine code

12 Aug, 2022 · 1 commit
- fix nccl comm in sync_bn (#45100) · 1e965756
  Committed by LiYuRio on Aug 12, 2022

05 Aug, 2022 · 2 commits

[MKLDNN]Move mkldnn activation kernel to phi (#44365) · 2dfa88d2

Committed by YuanRisheng on Aug 05, 2022

* move mkldnn activation kernel

* fix compile bugs

* fix compile bugs

* deal with conflict

* fix compile bugs

* fix windows compile bugs

* mkldnn unittest fix

* change mutable to alloc

* fix unittest bugs

* modify code according to comment

move fft kernels to phi (#44714) · 153f1138

Committed by Feiyu Chan on Aug 05, 2022

* move fft kernels to phi, done with cufft, pocketfft, mkl_cdft, hipfft
* make stft_op use fft from phi/kernels/funcs, clean code

03 Aug, 2022 · 1 commit
- transfer op multiclass_nms3 to phi (#44765) · 15ce2c1b
  Committed by zhiboniu on Aug 03, 2022
```
* add cmake enforce

* transfer multiclass_nms3  to phi
```
01 Aug, 2022 · 1 commit
- Revert for cmake static library errors on XPU KP #44762 · f15d930a
  Committed by zhiboniu on Aug 01, 2022

29 Jul, 2022 · 1 commit
- phi_multiclass_nms3 (#44613) · a9919903
  Committed by zhiboniu on Jul 29, 2022

19 Jul, 2022 · 1 commit

compile phi/backends into one static library (#44373) · 1047cb17

Committed by Leo Chen on Jul 19, 2022

* compile into one static library

* fix xpu compile

* fix xpu compile

* fix inference compile

* fix inference compile

* add custom test

* revert one file

16 Jul, 2022 · 1 commit

[Phi] Migrate solve kernel to phi (#44363) · c0a7830f

Committed by Weilong Wu on Jul 16, 2022

* draft version

* draft version

* draft version

* migrate solve kernel to phi

* polish

* polish

* remove useless header file, fix a bug in grad_kernel_impl

* add header file in need

14 Jul, 2022 · 1 commit

[Phi]Improve the mechanism for mkldnn kernel in PHI (#43941) · e9b4d0be

Committed by YuanRisheng on Jul 14, 2022

* adapt mkldnn kernel in PHI

* fix ci compile bugs

* fix compile bugs

* fix compile bugs

* fix compile bugs

* fix compile bugs

* delete comment

* fix compile bugs in windows-inference

* delete code for coverage

* modify code by review

* modify code by review

* add todo

* fix compile bugs

* fix compile bugs

* fix compile bugs

* fix unittest bugs

29 Jun, 2022 · 1 commit
- add kernel_decalre for xpu kp kernels (#43920) · 6132476d
  Committed by Leo Chen on Jun 29, 2022

24 Jun, 2022 · 1 commit

[Phi]Change Copy from Kernel to basic component utils (#43622) · 2739bd73

Committed by YuanRisheng on Jun 24, 2022

* perfect copy

* deal with conflict

* deal with conflict

* fix compile bugs

* fix unittest bugs

* change code format

* deal with conflict

* modify code by review

* fix ce bugs

* fix ce bugs

* add lo

* perfect code format

* deal with conflicts

23 Jun, 2022 · 1 commit
- clear cmake files of phi (#43769) · 295f289a
  Committed by Leo Chen on Jun 23, 2022

PaddlePaddle / Paddle · synced successfully about 1 year ago