Commits · fa051eecb107ed4072c5c34fb3abd47f049d9fd5 · BaiXuePrincess / Paddle

29 Nov, 2022 1 commit
- [PHI decoupling] Move MKLDNN code (#48352) · fa051eec
  Authored by Sławomir Siwek on Nov 29, 2022

28 Nov, 2022 4 commits
- fix: multihead matmul biasqk broadcast support for [1,1,seq,seq] shape (#47975) · 11b9d85f
  Authored by Wang Bojun on Nov 28, 2022
```
* add trt support
```
- [PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3
  Authored by huangjiyi on Nov 28, 2022
```
* decouple cudnn_desc.h from fluid

* move cudnn_desc.h from fluid to phi

* fix bugs

* decouple cudnn_helper.h from fluid

* fix bugs

* move cudnn_helper.h from fluid to phi

* add fluid cudnn_helper.h

* move miopen_desc.h from fluid to phi

* move miopen_helper.h from fluid to phi

* fix bugs

* move gpu_dnn.h from fluid to phi

* fix bugs

* update copyright year

* simplify gpu_dnn.h in fluid

* fix bugs

* fix xpu build bug

* fix compile bug

* fix bug
```
- replace LoDTensor with phi::DenseTensor in fluid\operators\*\ except sequence_ops (#48418) · 30a31a53
  Authored by 张春乔 on Nov 28, 2022
- Use phi layernorm (#48276) · 86d92092
  Authored by MarDino on Nov 28, 2022

23 Nov, 2022 1 commit
- Use cublaslt in multi transformer FFN (#48052) · b07e6b45
  Authored by MarDino on Nov 23, 2022
```
* use fused mlp in multi transformer
* Restruct code
* use cublaslt to fuse ffn
* fix conflict
```
22 Nov, 2022 2 commits

CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203) · df4dfda0
Authored by Tian Zheng on Nov 22, 2022
```
* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
```

[PHI decoupling] remove "gpu_device_function.h" in fluid. (#48117) · 4da1a0fe

Authored by huangjiyi on Nov 22, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* rm dependence to "gpu_device_function.h" in fluid

* rm gpu_device_function.h etc in fluid

* fix rocm-compile bugs

* fix cuda_helper_test.cu bugs

21 Nov, 2022 1 commit

mma qk tensor_core (#48087) · d79eda71

Authored by lzy on Nov 21, 2022

* use mma for QK dot computing in fused_multi_transformer.
* Update fused_multi_transformer_op.cu.h

18 Nov, 2022 3 commits

Fused QKVBiasAdd and Transpose with Split Q, KV (#47680) · d595928e

Authored by MarDino on Nov 18, 2022

* fused qkvBiasAdd and transpose with split qkv

* fix typo

* fix format

* fix name

* add annotation

* fix comment

Optimize FusedBiasAddGelu Kernel (#47679) · b0e28540

Authored by MarDino on Nov 18, 2022

* Add quick gelu and fused bias add kernel

* fix annotation

* remove useless code

* add fast gelu option and set it in multi transformer op

* add flag to restrict if use fast gelu approximate

* fix flags conflict

* fix use tanh function instead

* add cudart version limit

* use phi fast tanh func

* fix comment

[PHI decoupling] remove "gpu_primitives.h" in fluid (#48063) · 9918bf9c
Authored by Wang Xin on Nov 18, 2022
```
* remove "gpu_primitives.h" in fluid namespace

* fix PR-CI-GpuPS fail

* fix PR-CI-GpuPS fail
```

17 Nov, 2022 2 commits
- [PHI]Standardise some C++ API (Part5) (#47860) · f3650201
  Authored by YuanRisheng on Nov 17, 2022
```
* standard api

* fix xpu bugs
```
- xpu-paddlepaddle-41 [Task] ffn and attention test=kunlun (#46658) · 071708fa
  Authored by taixiurong on Nov 17, 2022

15 Nov, 2022 1 commit

mkldnn directory cleanup (#47779) · 8a339d24

Authored by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

09 Nov, 2022 2 commits

clean repetitious GetKernelTypeForVar (#47763) · c551e55d
Authored by HongyuJia on Nov 09, 2022

Final changes to introduce mem_desc to be hold in Tensor (#46768) · 14f261ad

Authored by Jacek Czaja on Nov 09, 2022

* first commit

- more fixes

- compilation fix

- compilation fix

- fix

- another fix

- yet another fix

- Fix

- fix to fused ops

- compilation fix

- compilation fix

- another compilation fix

- another fix

- fix

- fix

- fix

- fix

- yet another fix

- fix

- fix

- cosmetic fix

- lint

- Revert some changes (to be brought back later)

- fix to build

- Added prototype of slice

- fix

compilation fix

- compilation fix

- fix

- fix

- Fix

- fix

 fix
	modified:   cmake/flags.cmake

* lint

* rerun of CI

* - Fix

* - lint

* - lint2

07 Nov, 2022 1 commit
- Refactor collective communication all_gather, all_reduce, broadcast & barrier C++ API (#47481) · e1a1c354
  Authored by Wen Sun on Nov 07, 2022

01 Nov, 2022 1 commit

Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9

Authored by Chen Weihang on Oct 31, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

31 Oct, 2022 2 commits

optimize: vit 384 (#47432) · 520adc0e

Authored by feng_shuai on Oct 31, 2022

* optimize: vit 384

* fix:bug

* fix:bug

* fix:support rocm compile

* refactor:name

* fix:support rocm

* fix:__HIP_NO_HALF_CONVERSIONS__

* optimize: delete scalar

* fix:rocm can't support

* fix:ernie error

fix typos for `True` and `False` (#47477) · f5912d0c
Authored by Nyakku Shigure on Oct 31, 2022
```
* fix typo `Fasle`/`Flase` -> `False`

* fix typo `Ture` -> `True`
```

27 Oct, 2022 1 commit
- Add launch_bounds (#47285) · 13181fd9
  Authored by Shijie on Oct 27, 2022

26 Oct, 2022 2 commits
- clean mkldnn headerfile (#47362) · 436115cf
  Authored by HongyuJia on Oct 26, 2022
- Refine the memory usage of fused_attention and fused_feedforward ops (#47236) · 6ef5d343
  Authored by sneaxiy on Oct 26, 2022
```
* fix fused_attention fused_feedforward

* fix ci

* fix ci

* fix ci PADDLE_GET_CONST

* fix ci ut
```
25 Oct, 2022 1 commit
- clean fusion_conv_inception headerfile (#47320) · c1077ae8
  Authored by HongyuJia on Oct 25, 2022

24 Oct, 2022 1 commit
- Move the header file of conv cudnn and miopen to phi directory. (#47248) · 31f57f29
  Authored by Yiqun Liu on Oct 24, 2022

17 Oct, 2022 1 commit
- [PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
  Authored by YuanRisheng on Oct 17, 2022
```
* namespace modify

* update by comment
```
13 Oct, 2022 1 commit

[Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (#46606) · ef1c8759

Authored by HongyuJia on Oct 13, 2022

* remove PADDLE_WITH_MKLDNN, test white_list=abs

* fix unique_ptr

* fix op.Type()

* remove TODO in kernel_dispatch.h

* remove IndicateVarDataType function, update white_list

* remove mkldnn hard code

* add comments

* fix ==

* update mkldnn_op_list

* delete hard code of OPs

* update mkldnn_op_list

* update mkldnn_op_list, remove interp

* add error check for ExecutionContext

* update mkldnn_op_list, remove transpose2_grad

* remove interpolate mkldnn

* remove fill_constant mkldnn

* opt HasAttr in DygraphExecutionContext

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_black_list

* update mkldnn_op_list, add assert error op

* solve cudnn related op

* fix error

* add mkldnn fallback in phi_utils.cc

* remove mkldnn fallback in phi_utils.cc

* opt code implementation

* polish Copyright License

11 Oct, 2022 1 commit
- Remove LoDTensor using in fluid (Part 1) (#46663) · 940d8f25
  Authored by Chen Weihang on Oct 11, 2022
```
* remove using lodtensor part1

* polish history code format
```
10 Oct, 2022 2 commits

make fused_multi_transformer support dynamically set the cache_kvs' shape and... · 9ea279a4

Authored by carryyu on Oct 10, 2022

make fused_multi_transformer support dynamically set the cache_kvs' shape and support input prefix_caches. (#46777)

* make fused_multi_transformer support dynamically set the cache_kvs' shape and support input prefix_caches.

delete_multi_gru_headerfile (#46689) · 749da9a9
Authored by HongyuJia on Oct 10, 2022

09 Oct, 2022 1 commit
- [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780) · 078e8c78
  Authored by Haohongxiang on Oct 09, 2022

30 Sep, 2022 1 commit

support pure bfloat16 for more ops (#46364) · b7b231a6

Authored by sneaxiy on Sep 30, 2022

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* add bfloat16 to selu_grad to pass CI

* fix selu grad compilation error

28 Sep, 2022 1 commit

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

Authored by Chen Weihang on Sep 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

21 Sep, 2022 1 commit
- refine mkldnn code · 4b8d4ade
  Authored by jiahongyu on Sep 20, 2022

18 Sep, 2022 1 commit
- Add INT8 support for fused_multi_transformer_op (#45284) · 3d7e2118
  Authored by RichardWooSJTU on Sep 18, 2022

15 Sep, 2022 1 commit
- [CodeStyle] trim trailing whitespace in .h, .cc, .cu, etc. (#46006) · 8dde7aea
  Authored by Nyakku Shigure on Sep 15, 2022

09 Sep, 2022 2 commits
- convfusion_cache (#45902) · 3bad26ec
  Authored by xiaoxiaohehe001 on Sep 09, 2022
- fix fused_gemm_epilogue compile error (#45899) · 7d000112
  Authored by sneaxiy on Sep 09, 2022

08 Sep, 2022 1 commit
- xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support for xpu (#45706) · 7085cb97
  Authored by taixiurong on Sep 08, 2022
```
* add gemm_epilogue

* xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support test=kunlun
```

BaiXuePrincess / Paddle is up to date with its fork source project