Commits · cd0f1523202eeacbc04ff3759645d5e34eab5852 · PaddlePaddle / Paddle

21 Jul, 2023 1 commit

[clang-tidy] enable modernize-use-override (#55491) · cd0f1523

Committed by Ruibin Cheung on Jul 21, 2023

cd0f1523

19 Apr, 2023 1 commit

Support Linear operation in cuBlaslt and plug into attn_gemm and fusedLinear backward op (#52028) · f6f18835

Committed by limingshu on Apr 19, 2023

* first commit

* restructure C++ interface to separate linear from matmulwithcublaslt

* finish building in cublaslt impl

* fix code bugs

* fix host cost

* add some changes

f6f18835

23 Mar, 2023 1 commit

Remove fluid deps in fused_linear_param_grad_add_kernel.cu (#51975) · 5da1a27b

Committed by sneaxiy on Mar 23, 2023

* remove fluid deps in fused_linear_param_grad_add_kernel

* fix compile error

* fix ut error

* follow comments

5da1a27b

20 Mar, 2023 1 commit

Support Linear operation in cuBlaslt and plug into attn_gemm and fusedLinear forward op (#51124) · 2dfc3fa8

Committed by limingshu on Mar 20, 2023

* optimization for fused linear op

* fix code format

* optimization for linear fused forward

* merge with develop

* fix bugs for gemm_epilogue

* package cublaslt epilogue types in an enum

* final fix before code reviewing

* fix missed fusedType typo

* fix code according to review suggestions

* fix windows ci error

* change location of MatmulPlanner

* add some changes for compiler error fix


2dfc3fa8

28 Feb, 2023 1 commit

fix bug in fused_gemm_epilogue_op.cc (#50980) · 064a5434

Committed by yuehuayingxueluo on Feb 28, 2023

064a5434

26 Feb, 2023 1 commit

Enable matmul + bias fusion in fused_gat_attention. (#50755) · 57f6a469

Committed by Yiqun Liu on Feb 26, 2023

* Enable matmul + bias fusion in fused_gat_attention.

* Add a variable to control whether to use fused matmul + bias.

57f6a469

04 Jan, 2023 1 commit

[Unify KernelKey] change OpKernelType->KernelKey (#49138) · 4383494f

Committed by HongyuJia on Jan 04, 2023

* execute use kernel_key first

* change OpKernelType->KernelKey

* fix py3 compile error, remove redundant header files

* fix build_strategy_test

* fix DataType::RAW

* fix custom_type test: operator_test.cc

* fix transform place

* fix backends_are_same_class

* try fix place TransDataDevice

* support all KernelKey

* fix TransformData

* fix place_are_same_class

* fix merge

* fix test_params_no_grad

* fix specific place of GetExpectedKernelType

* fix specific place of GetExpectedKernelType

* fix GetKernelTypeForVar

* fix dtype error

* fix fetch_v2

* change GetKernelTypeForVar

* fix interpreter

* fix typo error

* polish codes

* polish codes

* polish codes

* fix conflict

4383494f

14 Dec, 2022 1 commit

Fix nullptr to TestFuseGemmEpilogueReluBWDFP* (#48997) · e61df289

Committed by Ming-Xu Huang on Dec 14, 2022

e61df289

07 Dec, 2022 1 commit

[phi::DenseTensor] Replace Tensor with phi::DenseTensor (#48682) · 65420271

Committed by 张春乔 on Dec 07, 2022

65420271

17 Oct, 2022 1 commit

[PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398

Committed by YuanRisheng on Oct 17, 2022

* namespace modify

* update by comment

ec749398

28 Sep, 2022 1 commit

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

Committed by Chen Weihang on Sep 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

e12a905e

15 Sep, 2022 1 commit

[CodeStyle] trim trailing whitespace in .h, .cc, .cu, etc. (#46006) · 8dde7aea

Committed by Nyakku Shigure on Sep 15, 2022

8dde7aea

08 Sep, 2022 1 commit

xpu-paddlepaddle-40 [Task] fused_gemm_epilogue supports XPU (#45706) · 7085cb97

Committed by taixiurong on Sep 08, 2022

* add gemm_epilogue

* xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support test=kunlun

7085cb97

26 Jun, 2022 1 commit

format all files in fluid using new config (#43776) · 576236a0

Committed by Sing_chan on Jun 26, 2022

576236a0

05 Jun, 2022 1 commit

[code format check upgrade] step2: clang-format (#42840) · a3730dc8

Committed by Sing_chan on Jun 05, 2022

a3730dc8

01 Jun, 2022 1 commit

Make fuse_gemm_epilogue support transpose_x and transpose_y (#40558) · 048b0013

Committed by sneaxiy on Jun 01, 2022

* support weight transpose

* add ut

* add template

* fix transpose error

* fix transpose_comment

* add api tests

* add skipif

* add doc

048b0013
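
As a rough illustration of what the transpose_x / transpose_y support in the commit above means for the GEMM part of the op, here is a minimal pure-Python sketch. The function names `transpose` and `matmul_with_transpose` are hypothetical and are not taken from the Paddle sources; the real kernel passes the flags on to cuBLASLt rather than materializing a transposed copy.

```python
def transpose(m):
    """Return the transpose of a 2-D matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul_with_transpose(x, y, transpose_x=False, transpose_y=False):
    """Compute op(x) @ op(y), where op() optionally transposes its operand.

    Hypothetical reference helper: it only mirrors the semantics implied
    by the commit title, not the actual Paddle implementation.
    """
    a = transpose(x) if transpose_x else x
    b = transpose(y) if transpose_y else y
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


if __name__ == "__main__":
    x = [[1, 2], [3, 4]]
    y = [[5, 6], [7, 8]]
    print(matmul_with_transpose(x, y))
    print(matmul_with_transpose(x, y, transpose_y=True))
```

A weight stored as [out_features, in_features] can then be used directly with transpose_y=True instead of being transposed ahead of time.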

07 Mar, 2022 1 commit

cuBlasLt Epilogue To Fuse Linear + ReLU|GeLU (#39437) · 2a3d9eca

Committed by Ming-Xu Huang on Mar 07, 2022

* Added cuBlasLtHandle_t to device context.

* Added fused_gemm_epilogue op.

1. Added fused_gemm_epilogue op to leverage cuBlasLt Epilogue.
2. Supports fusing Act(X*Y + bias); X's dims must be >= 2 and Y's dims should be 2.
3. Act currently only supports ReLU (GeLU will be added in the future).
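
The computation described in the list above, Act(X*Y + bias) with X of rank >= 2 and Y of rank 2, can be sketched in plain Python as follows. This is only a reference for the math, not the cuBLASLt-backed kernel: the names `flatten_to_2d` and `fused_gemm_epilogue_reference` are hypothetical, the "none" activation is included as an assumption, and the result is returned as a flattened 2-D matrix rather than being reshaped back to X's leading dimensions.

```python
def flatten_to_2d(x):
    """Collapse all leading dimensions of a nested list into rows."""
    if not isinstance(x[0], list):
        return [x]
    rows = []
    for item in x:
        rows.extend(flatten_to_2d(item))
    return rows


# Only ReLU is named in the commit at this stage; "none" is an assumption.
ACTIVATIONS = {
    "none": lambda v: v,
    "relu": lambda v: max(v, 0.0),
}


def fused_gemm_epilogue_reference(x, y, bias, activation="none"):
    """Reference semantics: act(x @ y + bias), computed row by row.

    x: nested list of rank >= 2 whose last dimension is K
    y: K x N matrix (rank 2), bias: list of length N
    """
    act = ACTIVATIONS[activation]
    rows = flatten_to_2d(x)
    k, n = len(y), len(y[0])
    out = []
    for row in rows:
        if len(row) != k:
            raise ValueError("last dim of x must equal first dim of y")
        out.append([
            act(sum(row[i] * y[i][j] for i in range(k)) + bias[j])
            for j in range(n)
        ])
    return out


if __name__ == "__main__":
    x = [[1.0, -2.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(fused_gemm_epilogue_reference(x, identity, [0.5, 0.5], "relu"))
```

The point of the fused op is that the bias add and activation happen in the GEMM epilogue instead of as separate kernels; the sketch only shows what result the fusion has to reproduce.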

* Added UT to fused_gemm_epilogue op.

* Added LinearAct Pattern

1. Added LinearAct into graph_pattern_detector.* to define (2.)'s pattern.
2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
3. act currently only supports ReLU (GeLU will be supported in the future).
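
To make the LinearAct pattern above concrete, here is a small sketch of how a chain act(elementwise_add(matmul_v2(x, w), bias)) might be located in a flat list of ops. It is an assumption-laden toy: the dict-based op representation and the function `find_linear_act` are hypothetical and bear no relation to the real graph_pattern_detector API, and the op type `elementwise_add` is used for what the commit message calls element_add.

```python
def find_linear_act(ops, activations=("relu",)):
    """Return (matmul_out, add_out, act_out) for every matched chain.

    ops: list of dicts with keys "type", "inputs" (list of variable
    names) and "output" (a single variable name). Hypothetical format.
    """
    # Map each variable name to the op that produces it.
    producer = {op["output"]: op for op in ops}
    matches = []
    for act in ops:
        if act["type"] not in activations:
            continue
        add = producer.get(act["inputs"][0])
        if add is None or add["type"] != "elementwise_add":
            continue
        matmul = producer.get(add["inputs"][0])
        if matmul is None or matmul["type"] != "matmul_v2":
            continue
        matches.append((matmul["output"], add["output"], act["output"]))
    return matches


if __name__ == "__main__":
    graph = [
        {"type": "matmul_v2", "inputs": ["x", "w"], "output": "mm"},
        {"type": "elementwise_add", "inputs": ["mm", "bias"], "output": "lin"},
        {"type": "relu", "inputs": ["lin"], "output": "y"},
    ]
    print(find_linear_act(graph))
```

A real pass would also have to check that the intermediate variables have no other consumers before replacing the three ops with one fused op; that check is omitted here.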

* Added FuseGemmEpiloguePass

1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU} fusion (GeLU will be supported in the future).
2. Only supports matmul_v2 from nn.Linear.

* Added pybind to BuildStrategy.fuse_gemm_epilogue_.

* Added UT for fuse_gemm_epilogue_pass.

* GeLU support and EpilogueSingleton

1. Added GeLU support to fused_gemm_epilogue op.
2. Added EpilogueSingleton to cache auxiliary pointer.
3. Added related UTs.

* Rename cublaslt_epilogue_op to gemm_epilogue_op.*.

* Added both train and infer pattern to LinearAct.

1. Added support of fwd graph with grad_ops linking to LinearAct.
2. Added related changes to fuse_gemm_epilogue_pass for the above modification.

* Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.

* Added identity activation support to gemm_epilogue_op.

* Added Linear Fusion (matmul_v2 + ele_add)

1. Added matmul_v2 + ele_add pattern to LinearActPattern.
2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.

* Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*

* Add fused_gemm_epilogue_grad op.

1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.

* Add UTs to fused_gemm_epilogue_grad_op.

* Change attribute name in fused_gemm_epilogue_grad_op for clarity.

* Allow DX and DBias to be dispensable in fused_gemm_epilogue_grad op.

* Added ElementwiseAdd+Matmul+Act graph pattern detection.

* Fuse backward of Linear( Act(x))

1. Added backward fusion pass to Linear( Act(x)).
2. Added backward fusion pass to Linear(x).

* Added UTs to backward fusion of Linear(Act(x)).

* Complete document of arguments to fused_gemm_epilogue_op.

* Made arguments of some functions pass by reference.

* Modify code with review comments.

1. Made arguments of some functions pass by reference.
2. Removed redundant code.
3. Followed Google code style to change code.

* Made 'const' code style consistent

* Fixed random seed of python UTs.

* Set compile constraints for cuBlasLt

1. Require CUDA 11.6+
2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.

* Code Review from Paddle

1. Changed argument name is_first_gemm to without_x_gradient for clarity.
2. Applied PADDLE_THROW in fused_gemm_epilogue_op.

* Remove EpilogueSingleton

1. Applied ReserveSpace to replace Epilogue for passing auxiliary pointers between FWD and BWD.
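
The idea of handing auxiliary data from the forward op to the backward op through a ReserveSpace output can be illustrated with ReLU. The sketch below assumes the auxiliary data is simply a mask of which pre-activation values were positive; the function names are hypothetical, and the actual buffer layout used by cuBLASLt and Paddle is not shown in this log.

```python
def relu_forward_with_reserve(pre_activation):
    """Apply ReLU and also return the data the backward pass will need.

    reserve_space is an assumed stand-in for the ReserveSpace tensor:
    one boolean per element, True where the input was positive.
    """
    out = [max(v, 0.0) for v in pre_activation]
    reserve_space = [v > 0.0 for v in pre_activation]
    return out, reserve_space


def relu_backward_from_reserve(dout, reserve_space):
    """Gate the upstream gradient with the mask saved in the forward pass."""
    return [g if keep else 0.0 for g, keep in zip(dout, reserve_space)]


if __name__ == "__main__":
    out, reserve = relu_forward_with_reserve([-1.0, 2.0, 0.0, 3.5])
    print(out, reserve)
    print(relu_backward_from_reserve([1.0, 1.0, 1.0, 2.0], reserve))
```

Because the mask travels as an explicit output and input, the backward op does not need a process-wide cache such as the removed EpilogueSingleton.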

* Fix a logical error and enhance UTs.

1. Added act op count checking in UTs.
2. Fix issue in fusing backward of ReLU(Linear(X)).
3. TODO: solve GELU fusion issues.

* Fix Linear and GeLU fusion issues.

1. Modified graph_pattern_detector to fit both linear with gelu and linear with relu.
2. Modified data range in UTs to allow negative values.

* Removed fused_gemm_epilogue_op.h.

* Rename namespace pten to phi.

* Rename name of arguments in fused_gemm_epilogue_op

1. bias -> Bias.
2. out -> Out.
3. reserve_space -> ReserveSpace.

* Change EpiloguePassActivationCache to a local variable.

1. Removed singleton in EpiloguePassActivationCache.
2. Made EpiloguePassActivationCache an argument to each pass function.

2a3d9eca

PaddlePaddle / Paddle synced successfully about 1 year ago
