Commits · 138bdf40e5e9b9830cb73730c53678748591d6a2 · PaddlePaddle / Paddle

29 Aug, 2023 1 commit

[clang-tidy] No.26,27 enable misc-unused-using-decls,misc-unused-alias-decls (#56485) · 138bdf40

Authored by cyberslack_lee on Aug 29, 2023
```
* fix

* fix
```
25 Aug, 2023 1 commit

New ir support fuse bn add act (#56247) · d3f4596a

Authored by hong on Aug 25, 2023

* support new ir load combine

* update

* polish code

* remove print

* update

* update

* update

* polish code

* fix bug

* polish code

* fix compile bug

* fix bug

* revert code

* remove useless code

* polish code


15 Aug, 2023 1 commit

[Paddle Inference] Add masked multihead attention kernel and export API. (#55344) · 989c5e87

Authored by xiaoxiaohehe001 on Aug 15, 2023

* support_mmha
* add_python_api
* add_api_doc
* fix_doc_error
* fix_infermeta
* add_infermeta
* add_bf16_cuda_check
* add_bf16_check
* fix_ci_windows
* fix_ci_windows_kernel_register
* fix_test_mmha
* add_cumoffsets
* remove_bias
* delete_mmha_reshape_input_output
* rename_delete_hfile
* remove_fluid

---------
Co-authored-by: yangjianfengo1 <yangjianfeng01@baidu.com>


14 Aug, 2023 1 commit

[clang-tidy] No.31 enable modernize-use-bool-literals (#56216) · 2c307457

Authored by cyberslack_lee on Aug 14, 2023

09 Aug, 2023 1 commit

[oneDNN]rename macro to PADDLE_WITH_DNNL (#52208) · 6ff4c130

Authored by Xinyu Chen on Aug 09, 2023
```
* onednn: rename macro to PADDLE_WITH_DNNL

* onednn: rename macro to CINN_WITH_DNNL
```
07 Aug, 2023 1 commit

[clang-tidy] enable modernize-use-equals-default (#55983) · 30a02d27

Authored by Ruibin Cheung on Aug 07, 2023

04 Aug, 2023 1 commit

[clang-tidy] enable modernize-use-emplace (#55799) · 469a0392

Authored by Ruibin Cheung on Aug 04, 2023

* [clang-tidy] enable modernize-use-emplace

* Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into modernize_use_emplace


03 Aug, 2023 1 commit

[clang-tidy] [No.4] enable `modernize-loop-convert` (#55704) · 81ccd99e

Authored by Wang Xin on Aug 03, 2023

21 Jul, 2023 1 commit

[clang-tidy] enable modernize-use-override (#55491) · cd0f1523

Authored by Ruibin Cheung on Jul 21, 2023

13 Jul, 2023 1 commit

Add matmul_int8 op (#55228) · 27cc0df5

Authored by RichardWooSJTU on Jul 13, 2023
```
* add matmul int8
```
12 Jul, 2023 2 commits

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Authored by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fix sgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>


[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

Authored by Wang Xin on Jul 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tidy-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug


29 Jun, 2023 1 commit

Fix compiling on XPU related to MPTypeTrait. (#54924) · 7353e9e9

Authored by Yiqun Liu on Jun 29, 2023
```
* Fix compiling on XPU related to MPTypeTrait.

* Unify the use of MPTypeTrait.

* Fix compiling error.
```
20 Jun, 2023 1 commit

static graph autogen code support for matmul op (#54338) · ad80fbfe

Authored by Wang Xin on Jun 20, 2023

* static graph autogen code support for matmul op

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug


12 Jun, 2023 1 commit

fix gcc12 error (#54535) · 89bcf894

Authored by risemeup1 on Jun 12, 2023

10 Jun, 2023 1 commit

Fix bugs in fused_linear_epilogue (#54512) · 0a704e14

Authored by limingshu on Jun 10, 2023

08 Jun, 2023 1 commit

fuse vit attention for faster-rcnn on BML (#54139) · fc880209

Authored by cmeng on Jun 08, 2023

05 Jun, 2023 1 commit

Fix some compile errors with C++17 (#54282) · 68d81d0e

Authored by huangjiyi on Jun 05, 2023

01 Jun, 2023 2 commits

Support static graph code generation for conv2d, conv3d, depthwise_conv2d (#54201) · f3eccb3f

Authored by huangjiyi on Jun 01, 2023

* update

* update cmake

* update

* update

* update

* update

* Revert "update cmake"

This reverts commit 1e1dc1b2bc9967b725201272607f939260070fd4.

* update

* update

* update

* update


mv all unittests test (#53235) · b0e86d55

Authored by tianshuo78520a on Jun 01, 2023

* mv all unittests test

* fix error

* fix error

* fix

* fix

* del unittests

* fix paddle_build.sh

* fix

* fix test

* fix add test

* fix

* fix

* fix

* merge develop

* fix

* fix

* fix

* fix

* fix

* merge develop

* fix test_async_read_write

* fix test_async_read_write

* merge develop

* fix

* fix import legacy_test

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix bug

* fix

* fix coverage test bug

* fix

* fix

* fix

* fix

* fix

* fix code style

* fix code

* fix code

* fix

* fix

* fix

* del test_sequence_enumerate_op.py

* fix


24 May, 2023 1 commit

Try to increase the repeat of autotune and fix the setting of allow_tf32_cublas. (#53622) · f4abe34b

Authored by Yiqun Liu on May 24, 2023

* Try to increase the repeat of autotune and fix the setting of allow_tf32_cublas.

* Change the repeat of cublaslt to 10.

* Use FLAGS_cublaslt_exhaustive_search_times as repeats.

* Fix compiling error on CI.

* Polish the key and simplify codes.


23 May, 2023 2 commits

fix typos(#53967) · c36a000d

Authored by cyberslack_lee on May 23, 2023

move fusion_group infershape to phi (#53934) · 3dc99088

Authored by huangjiyi on May 23, 2023
```
* update

* update

* update

* set out dtype
```
19 May, 2023 1 commit

Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e

Authored by limingshu on May 19, 2023

* Reorganize the forward codes of flash-attention.

* Fix forward.

* Remove some unused codes.

* Simplify codes and fix backward.

* Change all LOG(INFO) to VLOG and fix the backward.

* add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes

* decrease the effect of debug print on performance

* Unify the initialize of flashattn arguments.

* Rewrite the reshape of temp_mask and temp_bias.

* API support use_flash_attn.

* Fix compiling error on CI.

* Try to crop the flash-attention lib.

* Correct the condition of whether can use flash-attn.

* Remove the softmax_out argument.

* Remove is_causal.

* Polish codes.

* Fix qkv_transpose_out's shape and scaling of Q * K.

* Update commit of flash-attention.

---------
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>


18 May, 2023 2 commits

Fused elementwises kernels and ops (#51427) · fb4a6ecf

Authored by Hulek on May 18, 2023

* Fused elementwises kernels and ops

* change fuse pass name

* adjust .pbtxt files

* adjust quantization attributes

* add missing arguments and fix others, review fixed

* simplify fused kernel registration

* fix elementwise unit tests

* reuse one fused elementwise op

* adjust proto

* Add supported datatypes

* Change 'Scale' to 'scale' in tests, change some tests to onednn

* Revert breaking changes

* Fix unit tests

* Delete obsolete test cases

* Delete commented out code

* Fix codestyle

* delete temporary condition

* fix conflicts and delete duplicate fusing

* Fix code after merge

* Move tests to new directory

* fix tests volatility

* Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py

* Update CMakeLists.txt add mkldnn op test

---------
Co-authored-by: Silv3S <slawomir.siwek@intel.com>


move fusion_group kernel to phi (#53781) · 26da689d

Authored by huangjiyi on May 18, 2023

16 May, 2023 2 commits

remove some [-Wunused-parameter] warning and fix a file to pass cpplint (#53814) · 10a38b4e

Authored by Galaxy1458 on May 16, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop
```

Move fused batchnorm to Phi (#53476) · 5e5481d8

Authored by Sonder on May 16, 2023

* trans fused batch norm Compute function

* trans batch norm register info to phi

* trans fused batch norm grad Compute

* trans batch norm grad register info

* add sig file

* update sig file

* Update fused_bn_activation_kernel.cu

* Update fused_bn_activation_grad_kernel.cu

* fix

* Rename fused_bn_activation_kernel_grad.cu to fused_bn_activation_kernel.cu

* fix

* fix

* fix CudnnDataType error

* fix

* fix include

* update

* add #if

* add fused bn act to cmakelist.txt

* update  cmakelist

* fix #ifdef error

* add timeout set

* add env set

* fix

* fix

* Update fused_bn_activation_sig.cc


15 May, 2023 1 commit

remove some [-Wunused-paramter]warning (#53681) · 96188fc1

Authored by Galaxy1458 on May 15, 2023

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop


11 May, 2023 1 commit

[XPU] update log for bkcl function calls. (#53609) · d67d74cc

Authored by houj04 on May 11, 2023
```
* [XPU] update log for bkcl function calls.

* minor update

* revert unnecessary modifications.
```
09 May, 2023 1 commit

remove some [-Wunused-parameter]warning (#53617) · bafc3469

Authored by Galaxy1458 on May 09, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop
```
05 May, 2023 1 commit

[test]mv fluid op fused to test/cpp/fluid/fused (#53434) · 903c5638

Authored by gouzil on May 05, 2023

28 Apr, 2023 1 commit

Dropout optimize & clean broadcast inT and ElementwiseType (#52969) · d611e48c

Authored by Bo Zhang on Apr 28, 2023

* change judgement for DropoutGradGPUKernelDriver

* add UnrollerWithoutVecSize and after this Loaddata to be refined

* pass unittest

* use same unroller with XPU

* BroadcastWithInt64Index

* BroadcastDataLoader template partial specialization

* fix compile errs in ROCms

* clean ElementwiseT and InT for BroadcastKernel

* default axis and clean inT

* remove redundant fast divmod computation

* optimize drop_nd & drop_nd_grad

* optimize BroadcastDataLoader bf16 fp16

* rm InT etc. after merge develop

* delete constexpr for windows ci

* fix conflict

* fix conflict with develop

* fix conflict

* new clean

* clean


27 Apr, 2023 2 commits

Move fused feedforward (#53166) · 25b4ba7f

Authored by Sonder on Apr 27, 2023

* trans fused_feedward Compute function to phi

* add register info

* remove maxfunctor

* move fused feedward to phi

* remove sig file

* remove fluid include

* add include

* add include

* add sig file

* add output register info

* fix sig file

* Update fused_feedforward_sig.cc

* fix grad kernel

* update output register info

* fix

* open fused_feedforward static build

* add optional and fix code style

* fix output info for fused attention

* add optional param

* merge


Register fluid xpu kerenls to phi [part 2] (#53188) · eee9c788

Authored by huangjiyi on Apr 27, 2023
```
* update

* fix bug
```

26 Apr, 2023 1 commit

Register fluid xpu kerenls to phi [part 3] (#53189) · 37489df5

Authored by huangjiyi on Apr 26, 2023
```
* update

* update
```
25 Apr, 2023 1 commit

[PHI]Add flags macro for PHI (#52991) · 22e96bde

Authored by YuanRisheng on Apr 25, 2023

* add flags for phi

* fix compile bugs

* fix ci bugs

* fix inference bugs

* fix cinn' bugs

* fix cinn bugs

* perfect code according comment

* fix ci bugs

* fix ci bugs


24 Apr, 2023 1 commit

Move fused feedforward xpu (#53196) · 83c2e682

Authored by Sonder on Apr 24, 2023

* add sig file

* trans fused feedforward compute function to phi

* remove fluid include

* delete old register info

* fix build error

* trans fused feedforward grad xpu to phi


19 Apr, 2023 2 commits

Move fused_attention op to phi [Migrate XPU OpKernel] [ test=kunlun ] (#53011) · 7b56bd25

Authored by Sonder on Apr 19, 2023

* trans fused attention to phi

* add optional param

* trans fused_attention_grad to phi

* add fused attention grad register info

* fix include

* test=kunlun

* add fused attention to static build list

* add remove

* update remove


Register fluid kerenls to phi [part 11] (#53035) · abc44b40

Authored by huangjiyi on Apr 19, 2023
```
* update

* fix bug

* fix bug

* fix bug

* fix bug
```

PaddlePaddle / Paddle: synced successfully about 1 year ago
