Commits · 3f219160bee15a3afa7107439197361f8266dc57 · PaddlePaddle / Paddle

14 Mar, 2022 1 commit

Add an elementwise + activation fusion pass. (#36541) · 3f219160

Authored by Tomasz Socha on Mar 14, 2022

* Add elementwise add and activation fuse pass

* Fix copy elision

* More flexible pattern detector

* More flexible fusion pass

* Update lists for pass

* Add support for Pow operator

* Add support for more activation types

* Style

* Rename fusion pass

* First version of tests

* Dirty version of pass

* Polished version

* Update pbtxt

* Style

* Update names

* Style

* Use PADDLE_ENFORCE_EQ

* Save error message to variable

* WO for error checks

* CR

* Static style check

* Add missing 'activation_scale' attribute

* Add relu6 and sigmoid activations

* Style

* Fix fuse list formatting

* Sync filenames for fuse pass files

* Fix cmake after move

* Fix registration

* Fix pass name in tests

* Add missing activations to checker

* WIPS

* Working mul op

* Working sub

* Working Add

* Remove pten includes

* Remove some forward declarations

* Remove Includes

* Fixes

* Remove default kernels

* Add check if post_ops attributes are available

* Style

* Code adjustment

* Register default kernels

* We have year 2022 not 2021...
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Fast review fixes
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Review Fix

* Rename one_dnn -> onednn

* Style after review

* Fast and dirty fix for quantization

* Update tests

* Style

* Fix mkldnn_quantizer config

* Add Joanna's suggestion.

* Check if operator is explicitly disabled on OneDNN

* Try to use unregistered attributes

* Style

* Test new framework

* FXI

* FXII

* Update test

* Style
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

07 Mar, 2022 1 commit

cuBlasLt Epilogue To Fuse Linear + ReLU|GeLU (#39437) · 2a3d9eca

Authored by Ming-Xu Huang on Mar 07, 2022

* Added cuBlasLtHandle_t to device context.

* Added fused_gemm_epilogue op.

1. Added fused_gemm_epilogue op to leverage cuBlasLt Epilogue.
2. Support fusion Act(X*Y + bias), X's dims >= 2 and Y's dims should be 2.
3. Act currently supports only ReLU (GeLU will be added in the future).
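
To make the fused computation Act(X*Y + bias) concrete, here is a minimal pure-Python reference sketch. It is not the op's implementation (which runs through the cuBlasLt epilogue on GPU); the function name `fused_gemm_epilogue_reference`, the `ACTIVATIONS` table, and the nested-list representation are assumptions for illustration only, and the sketch handles only 2-D X and Y.

```python
import math


def matmul(x, y):
    """Naive 2-D matrix product over row-major nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def relu(v):
    return max(v, 0.0)


def gelu(v):
    # Exact (erf-based) GeLU; the real kernel may use a different formulation.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


# Hypothetical activation table, keyed by illustrative names.
ACTIVATIONS = {"none": lambda v: v, "relu": relu, "gelu": gelu}


def fused_gemm_epilogue_reference(x, y, bias, activation="none"):
    """Compute Act(X*Y + bias), with bias broadcast across rows."""
    act = ACTIVATIONS[activation]
    return [[act(v + b) for v, b in zip(row, bias)] for row in matmul(x, y)]
```

A reference like this is mainly useful as an oracle in unit tests: the fused op's output can be compared against it within a floating-point tolerance.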

* Added UT to fused_gemm_epilogue op.

* Added LinearAct Pattern

1. Added LinearAct into graph_pattern_detector.* to define (2.)'s
pattern.
2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
3. act currently supports only ReLU (GeLU will be supported in the future).
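
The pattern described above, act(elementwise_add(matmul_v2(x, w), bias)), can be illustrated with a toy matcher over a flat op list. This is only a sketch of the idea: the `Op` tuple and `find_linear_act` are hypothetical names, and Paddle's real graph_pattern_detector works on an IR graph with a different API. The sketch assumes each intermediate result must have exactly one consumer, so that fusing the chain does not remove a value another op still needs.

```python
from collections import namedtuple

# Hypothetical minimal op record: op type, input variable names, output name.
Op = namedtuple("Op", ["type", "inputs", "output"])


def find_linear_act(ops, activations=("relu", "gelu")):
    """Return (matmul, add, act) triples matching act(add(matmul_v2(x, w), bias))."""
    # Map each variable name to the ops that read it.
    consumers = {}
    for op in ops:
        for name in op.inputs:
            consumers.setdefault(name, []).append(op)

    def sole_consumer(name, wanted_types):
        # The variable must feed exactly one op, and that op must be of a wanted type.
        users = consumers.get(name, [])
        if len(users) == 1 and users[0].type in wanted_types:
            return users[0]
        return None

    matches = []
    for op in ops:
        if op.type != "matmul_v2":
            continue
        add = sole_consumer(op.output, ("elementwise_add",))
        if add is None:
            continue
        act = sole_consumer(add.output, activations)
        if act is not None:
            matches.append((op, add, act))
    return matches
```

A fusion pass built on such a matcher would then replace each matched triple with a single fused op that reads the matmul inputs and the bias and writes the activation's output.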

* Added FuseGemmEpiloguePass

1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU}
fusion (GeLU will be supported in the future).
2. Only support matmul_v2 from nn.Linear.

* Added pybind to BuildStrategy.fuse_gemm_epilogue_.

* Added UT for fuse_gemm_epilogue_pass.

* GeLU support and EpilogueSingleton

1. Added GeLU support to fused_gemm_epilogue op.
2. Added EpilogueSingleton to cache auxiliary pointer.
3. Added related UTs.

* Rename cublaslt_epilogue_op to gemm_epilogue_op.*.

* Added both train and infer pattern to LinearAct.

1. Added support of fwd graph with grad_ops linking to LinearAct.
2. Added related changes to fuse_gemm_epilogue_pass for above
modification.

* Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.

* Added identity activation support to gemm_epilogue_op.

* Added Linear Fusion (matmul_v2 + ele_add)

1. Added matmul_v2 + ele_add pattern to LinearActPattern.
2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.

* Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*

* Add fused_gemm_epilogue_grad op.

1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.

* Add UTs to fused_gemm_epilogue_grad_op.

* Change attribute name in fused_gemm_epilogue_grad_op for clarity.

* Allow DX and DBias be dispensable to fused_gemm_epilogue_grad op.

* Added ElementwiseAdd+Matmul+Act graph pattern detection.

* Fuse backward of Linear( Act(x))

1. Added backward fusion pass to Linear( Act(x)).
2. Added backward fusion pass to Linear(x).

* Added UTs to backward fusion of Linear(Act(x)).

* Complete document of arguments to fused_gemm_epilogue_op.

* Made arguments of some functions pass by reference.

* Modify code with review comments.

1. Made arguments of some function pass by reference.
2. Removed redundant code.
3. Followed Google code style to change code.

* Made 'const' code style be consistent

* Fixed random seed of python UTs.

* Set compile constraints for cuBlasLt

1. Require CUDA 11.6+
2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.

* Code Review from Paddle

1. Changed argument name is_first_gemm to without_x_gradient for
clarity.
2. Applied PADDLE_THROW in fused_gemm_epilogue_op.

* Remove EpilogueSingleton

1. Applied ReserveSpace to replace Epilogue for passing auxiliary
pointers between FWD and BWD.

* Fix a logical error and enhance UTs.

1. Added act op count checking in UTs.
2. Fix issue to fuse backward of ReLU(Linear(X)).
3. TODO: solve GELU fusion issues.

* Fix Linear and GeLU fusion issues.

1. Modified graph_pattern_detector to fit both linear with gelu and
relu.
2. Modified data range in UTs to allow negative values.

* Removed fused_gemm_epilogue_op.h.

* Rename namespace pten to phi.

* Rename name of arguments in fused_gemm_epilogue_op

1. bias -> Bias.
2. out -> Out.
3. reserve_space -> ReserveSpace.

* Change EpiloguePassActivationCache as local variable.

1. Removed singleton in EpiloguePassActivationCache.
2. Made EpiloguePassActivationCache as an argument to each pass
functions.

24 Feb, 2022 1 commit

Fix for split op in BF16 inference (#39548) · 75f91ce4

Authored by jakpiase on Feb 24, 2022

* Fix for split bf16 inference

* added test for pass

* changes after review

08 Feb, 2022 1 commit

[Bug fix] Fixed handling of one of the cases in the quantization process (#39342) · e4d475ea

Authored by joanna.wozna.intel on Feb 08, 2022

* Fix quantization next op findings

* Corrections according to the review

05 Jan, 2022 2 commits

Add input data type checking in BF16 placement pass (#38702) · 60c51de5

Authored by joanna.wozna.intel on Jan 05, 2022

Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d

Authored by joanna.wozna.intel on Jan 05, 2022

* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list

20 Dec, 2021 1 commit

add matmul_scale_fuse_pass (#37962) · ce335c23

Authored by heliqi on Dec 20, 2021

* add matmul_scale matmul_v2_scale fuse pass

* add scaletensor judge

* modify var name

* add timeout notest;test=coverag

* fix error commit

* fix use_mkldnn attr

* fix use_mkldnn attr

15 Dec, 2021 1 commit

remove bf16 (#38133) · 49108efa

Authored by wenbin on Dec 15, 2021

* remove bf16

* remove comments

* remove wrong return

* fix UT

14 Dec, 2021 1 commit

add reshape+transpose+matmul_v2 only (#37847) · a922168a

Authored by Sylwester Fraczek on Dec 14, 2021

* reshape+transpose+matmul_v2

* in_name->input_name

* fix pr-ci-static-check

07 Dec, 2021 1 commit

Quantize slice op (#37630) · 2bd0f3c7

Authored by Zuza on Dec 07, 2021

* quantize slice op

* correct test

* fix code formatting

11 Nov, 2021 1 commit

Added softplus + activation oneDNN fuse pass (#36657) · a346c4dc

Authored by jakpiase on Nov 11, 2021

* added softplus + activation fuse pass

* minor change

* implemented reviewer suggestion

* minor fix

* minor fix

* added scale_out parameter

* minor fix

* fix for iScan CI

* conditionally disabled logs

* refactored pass builder

26 Oct, 2021 1 commit

[Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul,... · 93c591e2

Authored by Wangzheee on Oct 26, 2021

[Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul, mul) convert pass, fix (matmul, mul) op_teller (#36652)

* new_Matmul2ToMatmulToMul

* new_Matmul2ToMatmulToMul

* fix paddle_pass_builder

* fix paddle_pass_builder

* fix paddle_pass_builder

* tem

* tem

* Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass

* Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass

* add matmul_broadcast_unitest

* fix op_teller

21 Oct, 2021 1 commit

Added matmul_v2+transpose+reshape fuse pass (#36481) · 856cb9c5

Authored by jakpiase on Oct 21, 2021

* added base changes for matmul_v2+trans+resh fuse pass

* added full matmul_v2+transpose+reshape pass

* removed a file added by mistake

* added reviewers' suggestions

* Changed ops type in checking compatibility version

* Deleted one statement

14 Oct, 2021 1 commit

inference support bert when exists matmul_v2 (#36424) · 3e6d9dbb

Authored by Wilber on Oct 14, 2021

* support bert when exists matmul_v2

* update

13 Oct, 2021 1 commit

[PaddleInference] Pass: add int8 flag for op (#36042) · d7858c99

Authored by Wangzheee on Oct 13, 2021

* add_int_pass

* add_int8_flag_pass

* add_int8_flag_pass

* fix CMakeLists.txt

* fix test_trt_fc_fuse_quant_dequant_pass.py

* fix python/paddle/fluid/tests/unittests/ir/inference/test_trt_fc_fuse_quant_dequant_pass.py

* fix test_trt_fc_fuse_quant_dequant_pass.py

22 Sep, 2021 1 commit

fix: delete_quant_dequant_filter_op_pass, delete_quant_dequant_op_pass (#35879) · 5cda6b2b

Authored by Wangzheee on Sep 22, 2021

06 Sep, 2021 1 commit

Add fusion_lstm INT8 PTQ (#35334) · 7ef04da6

Authored by joanna.wozna.intel on Sep 06, 2021

* Add fusion_lstm INT8 PTQ

* Correct mkldnn_cache_capacity and enable fc_lstm_fuse_pass only for this test

* Change mkldnn_cache_capacity

28 Apr, 2021 1 commit

Nne integration (#32604) · abcb3f54

Authored by denglin-github on Apr 28, 2021

* Add dlnne engine runtime

* Fix log

* Remove <const_cast> and remove unrelated modify with dlnne, +clang-format

* Fix CMakeList format error

* Add copyright message

* Fix dlnne CMakeList.txt

* Add some paddlepaddle_pass to support more networks

* Fix some format bug

* Add delete dropout_op pass

* Fix some format bug

* Fix format bug

30 Mar, 2021 1 commit

[Paddle-TRT] TRT inference support for BERT/Transformer in paddle 2.0 api (#31744) · 14b7e3cf

Authored by Pei Yang on Mar 30, 2021

* support multihead_matmul_fuse_pass_v3

* fix compile problems

* embedding_eltwise_ln pass support lookup_table_v2

* support matmul and matmul_v2 in qkv matmul

26 Mar, 2021 1 commit

delete include framework.pb.h (#31859) · e804f085

Authored by tianshuo78520a on Mar 26, 2021

* delete include framework.pb.h

* fix error

23 Feb, 2021 1 commit

Unification of BF16 enablement process (#31034) · 781df300

Authored by joanna.wozna.intel on Feb 23, 2021

* Unification of bfloat16 enablement process and refactor

* Remove unnecessary function

* Standardize the output name search

03 Feb, 2021 1 commit

Layer normalization fuse pass. (#30721) · 4f066e31

Authored by Adam Osewski on Feb 03, 2021

13 Jan, 2021 1 commit

Added support for inference using quantization aware trained dygraph (#30288) · 7bbf3ac5

Authored by alncat on Jan 13, 2021

* added support for inference using quantization aware trained dygraph

* added support for inference using quantization aware trained dygraph
correct boost get usage

* Delete incorrect warning message (#30196)

* fix warning and no grad

* clean redundant API alias in 2.0 - part 2 (#30013)

* delete paddle.nn.functional.assign

* fix dynamic to static error

* just add the op error message for the matmul xpu (#30246)

 add the op error message for the matmul xpu

* Add Static Variable Clone (#30208)

Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat

* use wget to replace curl to download the lcov file (#30229)

* use wget to replace curl to download the lcov file

* add cache for lcov

* fix test_pool3d_op timeout issue (#30248)

* Fix unittests bugs. (#30250)

* modify error message based on comments (#30189)

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

* Fix bug for 'save multiple method' (#30218)

* Fix bug for 'save multiple method'

* To pass coverage.

* edit code to pass coverage.

* edit code to pass coverage.

* add unittest for coverage.

* change for coverage.

* edit for coverage.

* added support for inference using quantization aware trained dygraph

* Alias from  paddle.fluid.layers.auc to paddle.static.auc (#30206)

* add alias from  fluid.layers.auc to static.auc

* Update __init__.py

* added support for inference using quantization aware trained dygraph
correct boost get usage

* corrected boost get usage

* corrected naming issues and enforcing zero check

* correct paddle enforce message

* added more error checkings

* corrected error report message and optimized code

* corrected findvar usage

* corrected paddle_enforce in scope

* correct error messages

* correct error reporting format
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Bai Yifan <me@ethanbai.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>

29 Dec, 2020 1 commit

map matmul/squeeze2+matmul/reshape2+matmul to mul (#29911) · 6a0102b0

Authored by cc on Dec 29, 2020

* map matmul/squeeze2+matmul/reshape2+matmul to mul

24 Dec, 2020 1 commit

Added fc + activation fuse pass (currently only gelu, sigmoid and tanh are supported) (#29772) · edc06c6a

Authored by jakpiase on Dec 24, 2020

30 Nov, 2020 1 commit

Add quantization of multi_gru op and tests (#28615) · 4fd4095d

Authored by Wojciech Uss on Nov 30, 2020

26 Nov, 2020 1 commit

Fix cpu_bfloat16_pass (#28730) · fddea674

Authored by joanna.wozna.intel on Nov 26, 2020

* Fix cpu_bfloat16_pass

* Add output_format

* Fix incorrect SetOutput

* Change formatting

25 Nov, 2020 1 commit

Add multi_gru_fuse_pass and tests (#28601) · 7b5a8e46

Authored by Wojciech Uss on Nov 25, 2020

* Add multi_gru_fuse_pass and tests

* fix date

* cleaned up headers

24 Nov, 2020 1 commit

Add multi_gru_seq_fuse_pass and tests (#28604) · 991345b3

Authored by Wojciech Uss on Nov 24, 2020

* Add multi_gru_seq_fuse_pass and tests

* fix date

* removed unused functions

27 Oct, 2020 1 commit

add Fuse bn add act pass (#28196) · fdc06f21

Authored by Zhang Ting on Oct 27, 2020

* add fuse_bn_add_act pass

26 Oct, 2020 1 commit

oneDNN BatchNorm + Act fusion pass. (#27912) · 7db747d9

Authored by Adam Osewski on Oct 26, 2020

01 Oct, 2020 1 commit

Added support for quantization of fusion_gru (#27518) · 966447e3

Authored by Wojciech Uss on Oct 01, 2020

24 Sep, 2020 1 commit

use iwyu clean include (#27267) · df43905f

Authored by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

14 Sep, 2020 1 commit

Add bfloat16 passes (#26999) · 1483ea23

Authored by joanna.wozna.intel on Sep 14, 2020

28 Aug, 2020 1 commit

Fix int8 performance drop cpu_quantize_placement_pass (#26715) · eb097d64

Authored by joanna.wozna.intel on Aug 28, 2020

* Fix cpu quantize placement pass

* Include string lib

07 Jul, 2020 1 commit

[Fix BUGs]: fix multihead matmul pass's unstable bug (#25123) · 7b7e6051

Authored by Zhaolong Xing on Jul 07, 2020

* fix multihead matmul's instability
test=develop

* fix multihead matmul bug
test=develop

* fix coverage problem
test=develop

23 Jun, 2020 1 commit

[Paddle-TRT] Better Paddle-TensorRT support for PaddleSlim quant models (#25097) · b2f5a149

Authored by Pei Yang on Jun 23, 2020

* Paddle-TensorRT support slim QAT. test=develop

* add comments. test=develop

* use RenameInput instead of ResetInputs. test=develop

11 May, 2020 1 commit

Add macro BOOST_GET to enrich the error information of boost :: get (#24175) · aa0f254f

Authored by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop

06 May, 2020 1 commit

[Refactoring] Unify op-dequant squashes (#24277) · 356f5ee2

Authored by joanna.wozna.intel on May 06, 2020

30 Apr, 2020 1 commit

[INT8] Add requant-op squash (#24143) · b43b46e6

Authored by joanna.wozna.intel on Apr 30, 2020

PaddlePaddle / Paddle synced successfully about 1 year ago
