提交 · 138bdf40e5e9b9830cb73730c53678748591d6a2 · PaddlePaddle / Paddle

29 8月, 2023 1 次提交
- C
  [clang-tidy] No.26,27 enable misc-unused-using-decls,misc-unused-alias-decls (#56485) · 138bdf40
  由 cyberslack_lee 提交于 8月 29, 2023
```
* fix

* fix
```
  138bdf40
12 7月, 2023 2 次提交

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

由 YangQun 提交于 7月 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fixsgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: NTomasz Socha <tomasz.socha@intel.com>
Co-authored-by: NChen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: Ntianshuo78520a <707759223@qq.com>

cfa513f7

[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

由 Wang Xin 提交于 7月 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tid-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug

be3a6fa7

08 6月, 2023 1 次提交
- C
  
  fuse vit attention for faster-rcnn on BML (#54139) · fc880209
  由 cmeng 提交于 6月 08, 2023
  
  fc880209
18 5月, 2023 1 次提交

Fused elementwises kernels and ops (#51427) · fb4a6ecf

由 Hulek 提交于 5月 18, 2023

* Fused elementwises kernels and ops

* change fuse pass name

* adjust .pbtxt files

* adjust quantization attributes

* add missing arguments and fix others, review fixed

* simplify fused kernel registration

* fix elementwise unit tests

* reuse one fused elementwise op

* adjust proto

* Add supported datatypes

* Change 'Scale' to 'scale' in tests, change some tests to onednn

* Revert breaking changes

* Fix unit tests

* Delete obsolete test cases

* Delete commented out code

* Fix codestyle

* delete temporary condition

* fix conflicts and delete duplicate fusing

* Fix code after merge

* Move tests to new directory

* fix tests volatility

* Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py

* Update CMakeLists.txt add mkldnn op test

---------
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

fb4a6ecf

13 4月, 2023 1 次提交
- W
  [Paddle-Trt] Replace fc mul matmul matmul_v2 with matrix_multiply (#52222) · ef734e84
  由 Wangzheee 提交于 4月 13, 2023
```
* Paddle-Trt: Replace fc mul matmul matmul_v2 with matrix_multiply
```
  ef734e84
22 3月, 2023 2 次提交

Add fused_feed_forward pass (#50423) · 5dda0ef6

由 Ghost Screaming 提交于 3月 22, 2023

* Add fused_feed_forward pass for semi-automatic static graph training.

* Add fused_feedforward property in parallel_executor.cc

* Polish code.

* Polish fused feed_forward pass code. Support use_dropout1 and
use_dropout2 option.

* Support model parallel in fused_feedforward pass.

5dda0ef6

Extract fused_transpose op dedicated for oneDNN fuse passes (#50021) · 02296977

由 Sławomir Siwek 提交于 3月 22, 2023

* extract common methods to reuse

* add header for transpose ops

* fused_transpose

* Split big function

* transpose2 tests

* fused_transpose

* Apply extra attributes

* add pbtxt file

* update pbtxt

* Merge develop

* add more strict op compats

* code  style

* remove mkldnn_data_type

* unify SetOutMemDescWithReshape2FuseSupport

* adjust quantize-dequantize for transpose

* remove appendact

* transpose2 quantization

* fix int8 tests

* adjust transpose_op to current develop

* delete fusion code from transpose_kernel

* add fused transpose to NHWC unittest

* change order

02296977

16 3月, 2023 1 次提交

split layernorm pass (#51228) · 3f3372b6

由 wenbin 提交于 3月 16, 2023

* split pass

* fix compile

* fix ut

* more time

* modify ut

* reduce dim

* fix compile

* reshape weight

* tensor

* remove enforce

* static shape ut

* batchsize

* reorder pass

* minus test cases

* windows timeout

* windows time out

* remove test for windows

* correct

* sssss

* xxx

3f3372b6

02 3月, 2023 1 次提交
- Z
  Fix performance problem in BF16 models (#50283) · e421c6a6
  由 zyfncg 提交于 3月 02, 2023
```
* fix performance drop in BF16 models

* fix test_cpu_quantize_squash_pass
```
  e421c6a6
23 2月, 2023 1 次提交
- C
  
  [XPU] Migrate xpu_embedding_with_eltwise_add_fuse_pass (#50590) · 8d325d82
  由 csy0225 提交于 2月 23, 2023
  
  8d325d82
16 2月, 2023 2 次提交
- J
  Add matmul_v2 and fused_matmul to the quantization process and adjust Ernie model test (#50354) · 8686a745
  由 joanna.wozna.intel 提交于 2月 16, 2023
```
* Add matmul_v2 to the quantization process and adjust Ernie model test

* Correct cpu_quantize_pass test

* Move op to fuse transformation to placement pass

* Correct test
```
  8686a745
- Z
  
  [XPU] fix dropout pass; add multi_encoder_xpu_fuse_pass & multi_encoder_xpu kernel (#50499) · c8aa6405
  由 zhupengyang 提交于 2月 16, 2023
  
  c8aa6405
08 2月, 2023 2 次提交

fuse quantize+transpose and transpose+dequantize (#49509) · 197a4ffe

由 Paulina Gacek 提交于 2月 08, 2023

* QuantTranpose pattern is being found by pass

* quant + transpose fuse

* code style changes

* UT written, reorder fixed

* Dequantize + transpose2 fuse  added

* pass name changed

* UT added & shift corrected

* got rid of redundancy

* review changes

* AsIntermediate corrected

* compat added

197a4ffe

S
Add bf16 support for fused matmul (#50254) · b47923b4
由 Sławomir Siwek 提交于 2月 08, 2023
```
* add support for bf16 fused_ops

* fused_matmul only
```
b47923b4

21 12月, 2022 1 次提交

Refactor Pass for fused_conv (#48848) · 7f0eb2e3

由 zyfncg 提交于 12月 21, 2022

* refactor conv_activation_mkldnn_fuse_pass

* refactor conv_affine_channel_mkldnn_fuse_pass

* fix conv_activation_mkldnn_fuse_pass

* fix mkldnn unittest

* refactor int8_scale_calculation_mkldnn_pass and params_quantization_mkldnn_pass

* refactor conv_elementwise_add_mkldnn_fuse_pass

* fix quant

* refactor conv_bn_fuse_pass

* fix conv_bn_fuse_pass

* refactor depthwise_conv_bn_fuse_pass

* fix unittest

* fix conv_bn_fuse_pass

* remove redundant conv2d in params_quantization_mkldnn_pass

* fix params_quantization_mkldnn_pass_tester

7f0eb2e3

06 12月, 2022 1 次提交

Clear extra input (Bias, ResidualData) in OpMaker of conv2d (#47579) · 0a2dfa38

由 zyfncg 提交于 12月 06, 2022

* delete Bias and ResidualData in OpMaker of conv2d

* delete extra input of conv3d

* refactor pass of conv_bias_fusion

* fix mkldnn dependency

* fix mkldnn compile

* fix test_conv_bias_mkldnn_fuse_pass

* police some code

* remove useless log

* fix analyzer_vit_ocr_tester

* fix conv_activation_mkldnn_fuse_pass

* fix test_analyzer_ocr

* add fused_conv_sig

* fix performence regression

* fix performance regression

0a2dfa38

05 12月, 2022 1 次提交

Reverse roll fuse (#46914) · feb68dd1

由 Wang Bojun 提交于 12月 05, 2022

* pass

* pass

* draft version

* share mem opt

* remove sharemem

* add pattern for the case with circle_shift=0

* add UT

* pass opt

* test_fix

* code-commit

* code-style

* code style

* code-style

* ut-fix

* op teller refine

* resolve conflict

* adjust position op_teller list and pass order for swin

* ut code style update

* adjust paddle pass order

* refine pass order

* refine pass order

* refine pass order

feb68dd1

01 12月, 2022 1 次提交
- Z
  [Paddle Inference] remove conv_act_set from graph_pattern_detector.cc (#48569) · d3f8ede0
  由 zhoutianzi666 提交于 12月 01, 2022
```
* remove conv_act_set from graph_pattern_detector.cc
```
  d3f8ede0
30 11月, 2022 2 次提交
- Z
  Add fuse_act_add_grad_pass (#48346) · ca552933
  由 zhangbo9674 提交于 11月 30, 2022
```
* add fuse act add grad pass

* polish code

* refine code

* add test

* refine code
```
  ca552933
- R
  Add int8 support in fused_multi_transformer_pass and fuse_multi_transformer_layer_pass (#48209) · 12486712
  由 RichardWooSJTU 提交于 11月 30, 2022
```
* delete unnecessary shape and slice op
Co-authored-by: NYour Name <you@example.com>
```
  12486712
21 11月, 2022 1 次提交

add fc-residual quantization (#46917) · fed0ed34

由 Sylwester Fraczek 提交于 11月 21, 2022

* add fc-residual quantization

* revert removal of check for use_mkldnn

* fix bug

* add disable_logs

* review fix

call twice AreScalesPresntForNodes instead of if-else

* rewrite residual input to output

* revert fc mkldnn taking residual data

* format fix

* fix LoDTensor->DenseTensor

* LoDTensor->DenseTensor

* output->input

* revert changes to unsupported script

revert changes to unsupported script

* remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc

fed0ed34

16 11月, 2022 1 次提交
- P
  Add bf16 data type support to oneDNN bilinear_interp kernel (#46770) · 8e6315e4
  由 Piotr Paturej 提交于 11月 16, 2022
```
* Enable bf16 in oneDNN bilinear_interp kernel

* Fix bilinear_interp_v2 not enabled in models

* Remove unnecessary checks
```
  8e6315e4
15 11月, 2022 1 次提交
- J
  Added optimization pass for oneDNN layernorm kernel (#47782) · 519e7426
  由 jakpiase 提交于 11月 15, 2022
```
* optimization for ln

* fix

* added output to gpd

* added formatting

* fix
```
  519e7426
14 11月, 2022 1 次提交
- Y
  
  fix squueze_transpose (#47911) · f50de679
  由 yeliang2258 提交于 11月 14, 2022
  
  f50de679
07 11月, 2022 1 次提交

suqeeze2 + transpose2 fuse onednn (#47592) · fa874a46

由 Hui Zhang 提交于 11月 07, 2022

* suqeeze2 transpose2 fuse onednn

* format

* fix output shape

* fix conflict

* format

* format

* remove useless

* remove log

* simply pass

* fix comment

* fix

* fix msg

* fix error msg

* format

fa874a46

04 11月, 2022 1 次提交
- J
  Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
  由 jakpiase 提交于 11月 04, 2022
```
* tmp save

* minor chnage

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```
  9e006987
20 10月, 2022 1 次提交
- K
  Add FusedMultiTransformer fuse pass for GPT3 (#45907) · 5a2e5179
  由 Kaipeng Deng 提交于 10月 20, 2022
```
* add fused_multi_transformer_encoder/decoder pass, run GPT-3 success
```
  5a2e5179
18 10月, 2022 1 次提交

Merge layernorm trt fuse (#46320) · 5e9f491e

由 Wang Bojun 提交于 10月 18, 2022

* first version, accuracy corrected

* disable debug print

* use blockReduceSum in phi

* add UT

* add opCompat

* code style

* code refine

* bug fix

* code refine

* test fix

* bugfix

* codesytle fix

* code style

* code-style

* code-style

* code-style

5e9f491e

17 10月, 2022 2 次提交
- W
  Layernorm shift partition enhance (#46816) · 9e08633c
  由 Wang Bojun 提交于 10月 17, 2022
```
* first version of ln_s_p with s>0

* refine and UT

* pass opt draft

* pass opt

* code refine

* code-style

* bug fix

* fix ci test

* code style
```
  9e08633c
- J
  
  fix for conv_bias_mkldnn_pass (#47037) · acbda3e4
  由 jakpiase 提交于 10月 17, 2022
  
  acbda3e4
10 10月, 2022 1 次提交

Add fc residual pattern (#46757) · 0c789ae5

由 Sylwester Fraczek 提交于 10月 10, 2022

* fix fc pattern

remove use_bias
add residual input switch
fix references to pattern

* review fixes

0c789ae5

07 9月, 2022 1 次提交

Layernorm shift partition (#45736) · 960109af

由 wenbin 提交于 9月 07, 2022

* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT

960109af

22 8月, 2022 2 次提交
- J
  Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38
  由 joanna.wozna.intel 提交于 8月 22, 2022
```
* Add int8 support for matmul+elementwiae_add fuse

* Corrections after review and ernie test fix
```
  9e5f3a38
- S
  Extend conv_concat_relu to support all activations (#45089) · d03ef054
  由 Sławomir Siwek 提交于 8月 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
  d03ef054
16 8月, 2022 2 次提交

convert multihead to oss (#45019) · f706d95d

由 feng_shuai 提交于 8月 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

f706d95d

W

fix new quant (#45155) · 2fb65e44
由 Wangzheee 提交于 8月 16, 2022

2fb65e44

04 8月, 2022 1 次提交

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

由 Sławomir Siwek 提交于 8月 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multipe outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: NJacek Czaja <jacek.czaja@intel.com>

0420d514

27 7月, 2022 1 次提交
- P
  fix RemoveIntermediateOut in fuse_elewise_add_act_pass while converting graph to program (#44593) · be132719
  由 pangyoki 提交于 7月 27, 2022
```
* fix RemoveNode in fuse_elewise_add_act_pass

* fix

* change pointer to share_ptr

* fix

* fix

* fix format

* fix

* fix graph_safe_remove_nodes
```
  be132719
19 7月, 2022 1 次提交
- R
  Rename BOOST_GET macros (#44368) · 4b085c57
  由 Ruibiao Chen 提交于 7月 19, 2022
```
* Rename BOOST_GET macros

* Fix conflicts
```
  4b085c57

PaddlePaddle / Paddle 接近 2 年 前同步成功

PaddlePaddle / Paddle
接近 2 年前同步成功