提交 · 5134f1100554e5d3074220ac1c05f94cbe9956d0 · BaiXuePrincess / Paddle

27 4月, 2022 1 次提交
- J
  Added missing test for shuffle_channel_mkldnn_detect_pass (#42001) · 5134f110
  由 jakpiase 提交于 4月 27, 2022
```
* added test for shuffle_channel_mkldnn_detect_pass

* added UT using new framework

* CI fix
```
  5134f110
14 4月, 2022 3 次提交

FC+elementwise_add (residual connection) (#41776) · 92d8d0bc

由 Sławomir Siwek 提交于 4月 14, 2022

* Change tensor name to match activation

* declare fc_eltwise_add pass

* merge conv_eltwise refactor PR

* first compilable draft

* unittest feedback tools

* Fuse pass tester

* Move IsReachable() to shared file

* 100% coverage of fuse_pass_tester.cc

* register pass

* Add bias node

* Improve unit tests / remove bias node from pattern

* improve fc_eltwiseadd_unittest

* cancel eltwise_add fuse if act is already fused

* Add elementwise_input scale

* Residual MVP

* Add new FC attrs

* Add more test cases

* Add missing op attrs

* Adapt code to new Elementwise pattern

* reuse existing fcpattern

* improve code style

* remove unused arguments

* fix typo

* remove whitespace

* remove int8 related code

* Remove attributes from base ops

* style

* style check

* Remove input from base op

* Set attribute during fuse

* ut timeout

* download and test model

* DRY

* apply feedback from review

* Style check

* fix typo

* cosmetic changes

* explicitly set residual as output

* VIT-OCR accuracy check

* trigger CI

* remove whitespaces

* fix missing data file

92d8d0bc

add mkldnn int8 pass [step3] (#41599) · 8e2d4d30

由 baoachun 提交于 4月 14, 2022

* add mkldnn int8 pass [step3]

* Add test for compute_propagate_scales_mkldnn_pass

* update pass

* update api comment and python api
Co-authored-by: Nwozna <joanna.wozna@intel.com>

8e2d4d30

Added shuffle_channel BF16/FP32 FWD oneDNN kernel (#39756) · c7623d72

由 jakpiase 提交于 4月 14, 2022

* added shuffle_channel bf16/fp32 fwd kernel

* added missing files

* CI fix

* changed from pten to phi

* tmp save

* added reviewers suggestions

* fix for test

c7623d72

10 4月, 2022 2 次提交
- B
  
  add mkldnn compute_propagate_scales int8 pass (#41592) · c00d869b
  由 baoachun 提交于 4月 10, 2022
  
  c00d869b
- B
  add mkldnn int8 pass [step1] (#41579) · e68da187
  由 baoachun 提交于 4月 10, 2022
```
* add mkldnn int8 pass

* add mkldnn int8 pass

* update pass
```
  e68da187
06 4月, 2022 1 次提交
- A
  [IPU] remove paddle_ipu shared library (#41307) · 229e91bf
  由 Allen Guo 提交于 4月 06, 2022
```
* remove paddle_ipu shared library

* fix unique_name
```
  229e91bf
02 4月, 2022 1 次提交
- W
  [Paddle inference] support new quant_model (#41049) · 1b58ce14
  由 Wangzheee 提交于 4月 02, 2022
```
* paddle inference support new quant_model
```
  1b58ce14
17 3月, 2022 1 次提交
- B
  
  support gpu mixed precision inference (#40531) · 06fee998
  由 baoachun 提交于 3月 17, 2022
  
  06fee998
14 3月, 2022 1 次提交

Add an elementwise + activation fusion pass. (#36541) · 3f219160

由 Tomasz Socha 提交于 3月 14, 2022

* Add elementwise add and activation fuse pass

* Fix copy ellision

* More flexible pattern detector

* More flexible fusion pass

* Update lists for pass

* Add support for Pow operator

* Add support for more activation types

* Style

* Rename fusion pass

* First version of tests

* Dirty version of pass

* Polished version

* Update pbtxt

* Style

* Update names

* Style

* Use PADDLE_ENFORCE_EQ

* Save error message to variable

* WO for error checks

* CR

* Static style check

* Add missing 'activation_scale' attribute

* Add relu6 and sigmoid activations

* Style

* Fix fuse list formating

* Sync filenames for fuse pass files

* Fix cmake after move

* Fix registration

* Fix pass name in tests

* Add missing activations to checker

* WIPS

* Working mul op

* Working sub

* Working Add

* Remove pten includes

* Remove some forward declarations

* Remove Includes

* Fixes

* Remove default kernels

* Add check if post_ops attributes are avaliable

* Style

* Code adjustment

* Register default kernels

* We have year 2022 not 2021...
Co-authored-by: Njakpiase <jakpia21@gmail.com>
Co-authored-by: NSylwester Fraczek <sylwester.fraczek@intel.com>

* Fast review fixes
Co-authored-by: Njakpiase <jakpia21@gmail.com>
Co-authored-by: NSylwester Fraczek <sylwester.fraczek@intel.com>

* Review Fix

* Rename one_dnn -> onednn

* Style after review

* Fast and dirty fix for quantization

* Update tests

* Style

* Fix mkldnn_quantizer config

* Add Joanna's suggestion.

* Check if operator is explicitly disables on OneDNN

* Try to use unregistered attributes

* Style

* Test new framework

* FXI

* FXII

* Update test

* Style
Co-authored-by: Njakpiase <jakpia21@gmail.com>
Co-authored-by: NSylwester Fraczek <sylwester.fraczek@intel.com>

3f219160

07 3月, 2022 1 次提交

cuBlasLt Epilogue To Fuse Linear + ReLU|GeLU (#39437) · 2a3d9eca

由 Ming-Xu Huang 提交于 3月 07, 2022

* Added cuBlasLtHandle_t to device context.

* Added fused_gemm_epilogue op.

1. Added fused_gemm_epilogue op to leverage cuBlastLt Epilogue.
2. Support fusion Act(X*Y + bias), X'dims >=2 and Y'dims shoule be 2.
2. Act currently only be supported ReLU. (Will add GeLU in the future).

* Added UT to fused_gemm_epilogue op.

* Added LinearAct Pattern

1. Added LinearAct into graph_pattern_detector.* to define (2.)'s
pattern.
2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
3. act currently only support ReLU (Will support GeLU in the future).

* Added FuseGemmEpiloguePass

1, Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU}
fusion (GeLU will be supported in the future).
2. Only support matmul_v2 from nn.Linear.

* Added pybind to BuildStrageter.fuse_gemm_epilogue_.

* Added UT for fuse_gemm_epilogue_pass.

* GeLU support and EpilogueSingleton

1. Added GeLU support to fused_gemm_epilogue op.
2. Added EpilogueSingleton to cache auxiliary pointer.
3. Added related UTs.

* Rename cublaslt_epilogue_opto gemm_epilogue_op.*.

* Added both train and infer pattern to LinearAct.

1. Added support of fwd graph with grap_ops linking to LinearAct.
2. Added related changes to fuse_gemm_epilogue_pass for above
modification.

* Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.

* Added identity activation support to gemm_epilogue_op.

* Added Linear Fusion (matmul_v2 + ele_add)

1. Added matmul_v2 + ele_add pattern to LinearActPattern.
2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.

* Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*

* Add fused_gemm_epilogue_grad op.

1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.

* Add UTs to fused_gemm_epilogue_grad_op.

* Change attribute name in fused_gemm_epilogue_grad_op for clearing.

* Allow DX and DBias be dispensable to fused_gemm_epilogue_grad op.

* Added ElementwiseAdd+Matmul+Act graph pattern detection.

* Fuse backward of Linear( Act(x))

1. Added backward fusion pass to Linear( Act(x)).
2. Added backward fusion pass to Linear(x).

* Added UTs to backward fusion of Linear(Act(x)).

* Complete document of arguments to fused_gemm_epilogue_op.

* Made arguments of some functions pass by reference.

* Modify code with review comments.

1. Made arguments of some function pass by reference.
2. Removed redundant code.
3. Followed Google code style to change code.

* Made 'const' code style be consistent

* Fixed random seed of python UTs.

* Set Compiling constrains to cuBlasLt

1. Require CUDA 11.6+
2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.

* Code Reivew from Paddle

1. Changed arguments name is_first_gemm to without_x_gradient for
clearing.
2. Applied PADDLE_THROW in fused_gemm_epilogue_op.

* Remove EpilogueSingleton

1. Applied ReserveSpace to replace Epilogue for passing auxiliary
pointers between FWD and BWD.

* Fix a logical error and enhance UTs.

1. Added act op count checking in UTs.
2. Fix issue to fuse backward or ReLU(Linear(X)).
3. TODO: solve GELU fusion issues.

* Fix Linear and GeLU fusion issues.

1. Modified graph_detech_pattern to fit with both linear wiht gelu or
relu.
2. Modified data range in Uts to allow negative values.

* Removed fused_gemm_epilogue_op.h.

* Rename namespace pten to phi.

* Rename name of arguments in fused_gemm_epilogue_op

1. bias -> Bias.
2. out -> Out.
3. reserve_space -> ReserveSpace.

* Change EpiloguePassActivationCache as local variable.

1. Removed singleton in EpiloguePassActivationCache.
2. Made EpiloguePassActivationCache as an argument to each pass
functions.

2a3d9eca

01 3月, 2022 1 次提交
- W
  remove conv_affine_channel_fuse_pass (#39817) · fc06be9d
  由 wenbin 提交于 3月 01, 2022
```
* remove

* pass

* more pass
```
  fc06be9d
22 2月, 2022 1 次提交
- A
  
  sync recent changes (#39763) · d945e24c
  由 Allen Guo 提交于 2月 22, 2022
  
  d945e24c
15 2月, 2022 1 次提交

[Paddle-Inference] support preln_ernie: add... · 2bc91cc5

由 Wangzheee 提交于 2月 15, 2022

[Paddle-Inference] support preln_ernie: add preln_embedding_eltwise_layernorm_fuse_pass, preln_skip_layernorm_fuse_pass (#39508)

* support preln_ernie

* support preln_ernie

2bc91cc5

09 2月, 2022 1 次提交

[Paddle-Inference] rebuild matmul pass: trt and gpu_cpu (#39369) · db7d129e

由 Wangzheee 提交于 2月 09, 2022

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

db7d129e

26 1月, 2022 1 次提交

[IPU] sync misc changes 01 (#38876) · 4efbebea

由 Allen Guo 提交于 1月 26, 2022

* sync misc changes

* apply comments 01

* fix compile error

* remove is_ipu_place check

* add authors
Co-authored-by: NXiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: NAllen Guo <alleng@graphcore.ai>
Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: NHaicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: NHan Zhao <hanzhao@graphcore.ai>

* sync changes

* restore cmake

* update ir cmake and setup.py

* update inference_lib cmake

* split PR
Co-authored-by: NXiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: NHaicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: NHan Zhao <hanzhao@graphcore.ai>

4efbebea

20 12月, 2021 1 次提交

add matmul_scale_fuse_pass (#37962) · ce335c23

由 heliqi 提交于 12月 20, 2021

* add matmul_scale matmul_v2_scale fuse pass

* add scaletensor judge

* modify var name

* add timeout notest;test=coverag

* fix error commit

* fix use_mkldnn attr

* fix use_mkldnn attr

ce335c23

16 12月, 2021 1 次提交

Add tests for PaddleInference Pass (#37676) · 96597a85

由 yeliang2258 提交于 12月 16, 2021

* add test for conv_elementwise_add2_act_fuse_pass and conv_elementwise_add_act_fuse_pass

* Add conv_eltwiseadd_bn_fuse_pass test and fix test_conv_elementwise_addX_act_fuse_pass

* add tests for conv_act_mkldnn_fuse_pass

* add test for conv_bias_mkldnn_fuse_pass

* update code

* add conv_act_mkldnn_fuse_pass for relu, relu6, swish, leaky_relu

* update test

* update

* update bug

* update

* update pattern_detector

* fix test_conv_eltwiseadd_bn_fuse_pass

* add diff display notest;test=windows_ci_inference

* fix

* remove test_conv_act_mkldnn_fuse_pass.py

* ifix

96597a85

14 12月, 2021 2 次提交

add layer_norm_fuse_pass test case (#37830) · b95c9cf2

由 heliqi 提交于 12月 14, 2021

* add layer_norm_fuse_pass test case

* restore cmakelist code

* Merge branch 'develop' into layer_norm_fuse_pass

* Merge branch 'develop' into layer_norm_fuse_pass

* add bad case test

b95c9cf2

S
add reshape+transpose+matmul_v2 only (#37847) · a922168a
由 Sylwester Fraczek 提交于 12月 14, 2021
```
* reshape+transpose+matmul_v2

* in_name->input_name

* fix pr-ci-static-check
```
a922168a

10 12月, 2021 1 次提交

add fc_elementwise_layernorm_fuse_pass (#37771) · 0127e92d

由 heliqi 提交于 12月 10, 2021

* add fc_elementwise_layernorm_fuse_pass

* fix name conflictn

* rebuild CI

* fix Ran Programs=0 bug

0127e92d

06 12月, 2021 1 次提交
- H
  add test_unsqueeze2_eltwise_fuse_pass (#37647) · 22401426
  由 heliqi 提交于 12月 06, 2021
```
* add test_unsqueeze2_eltwise_fuse_pass

* fix name conflictn

* rebuild CI
```
  22401426
11 11月, 2021 2 次提交

Add test property RUN_TYPE=CINN (#37114) · 7a0cc0a9

由 Huihuang Zheng 提交于 11月 11, 2021

Add test property RUN_TYPE=CINN to CINN unit tests. It will restrict Paddle-CINN CI to run these unit tests only.

7a0cc0a9

Added softplus + activation oneDNN fuse pass (#36657) · a346c4dc

由 jakpiase 提交于 11月 11, 2021

* added softplus + activation fuse plass

* minor change

* implemented reviewer suggestion

* minor fix

* minor fix

* added scale_out parameter

* minor fix

* fix for iScan CI

* conditionally disabled logs

* refactored pass builder

a346c4dc

21 10月, 2021 1 次提交

Added matmul_v2+transpose+reshape fuse pass (#36481) · 856cb9c5

由 jakpiase 提交于 10月 21, 2021

* added base changes for matmul_v2+trans+resh fuse pass

* added full matmul_v2+transpose+reshape pass

* removed a file added by mistake

* added reviewers suggestions

* Changed ops type in checking capatibility version

* Deteled one statement

856cb9c5

20 10月, 2021 1 次提交

Add CINN Compile Option (#36292) · 6524fa8d

由 Huihuang Zheng 提交于 10月 20, 2021

Add CINN compile option in CMake.

Now you can use CINN in Paddle by `-DWITH_CINN=ON` when `cmake`

To test it, you can run `make cinn_lib_test -j` and `ctest -R cinn_lib_test`. 

Note:
1. You should set
```
export runtime_include_dir=${CINN_SOURCE_DIR}/cinn/runtime/cuda 
```
When run test, the `${CINN_SOURCE_DIR}` should be set based on your CINN directory.

2. CINN is under developing now, you may have to change `CINN_GIT_TAG` to the git commit you need.

6524fa8d

15 10月, 2021 1 次提交

Add BuildCinnPass (#36345) · b3f02c57

由 jiangcheng 提交于 10月 15, 2021

* Add CinnSubgraphSearchPass

* solve CI problem of subgraph order not same

* fix some bug by review advices

* ensure the independently of subgraph, that mean the subgraph should not have link to out-graph

* rename cinn_subgraph_search_pass to build_cinn_pass and delete paddle_to_cinn_pass

* add flag to control wheter append build cinn pass

* remove AppendPass at ParallelExecutorPassBuilder

* rename paddle_to_cinn_pass to build_cinn_pass in build_strategy and close test_run_from_cinn

b3f02c57

13 10月, 2021 1 次提交

[PaddleInference] Pass: add int8 flag for op (#36042) · d7858c99

由 Wangzheee 提交于 10月 13, 2021

* add_int_pass

* add_int8_flag_pass

* add_int8_flag_pass

* fix CMakeLists.txt

* fix test_trt_fc_fuse_quant_dequant_pass.py

* fix python/paddle/fluid/tests/unittests/ir/inference/test_trt_fc_fuse_quant_dequant_pass.py

* fix test_trt_fc_fuse_quant_dequant_pass.py

d7858c99

11 10月, 2021 1 次提交

Add use_cinn Flag and RunFromCinn in PE (#36107) · 5690666c

由 Huihuang Zheng 提交于 10月 11, 2021

Add use_cinn flag and use it to control whether we run PaddlePaddle using CINN.

Also add:

Replace PaddlePaddle graph with a CINN graph in a pass
PE Method to feed data and run the graph by CINN

5690666c

18 9月, 2021 1 次提交

Basic PR on Cost Model (#35774) · 5ba9fe6e

由 Huihuang Zheng 提交于 9月 18, 2021

Add basic Cost Model, it uses executor to run program and profile it to get op time.

This is an early basic version, we will add more functions in the future.

5ba9fe6e

17 9月, 2021 1 次提交

GeneratePass for Python Pass (#35708) · f6db9806

由 wuhuanzhou 提交于 9月 17, 2021

#### 背景

#35602 提供Python侧开发子图替换类Pass的方式：

- 利用Paddle Python API或者辅助类型定义子图program用来匹配/替换图；
- Python侧注册Pass时，将注册函数最终转换为protobuf定义的PassDesc数据形式，供C++侧进行解析完成Pass实例注册。

本PR即为根据PassDesc规则描述解析生成Pass实例。

#### 方案设计

##### Pass规则验证

在以往的Pass开发中，会存在随着算子迭代引发的匹配失效或者错误匹配的问题，该问题可以通过扫描算子支持的参数设置及参数类型等来判断是否应该使用该Pass或者给出提示需要修改Pass代码。

当前Pass开发中提供了算子兼容性OpCompatSensiblePass用于解决上述问题。但同时还存在不足：由于以往Pass开发在运行时才能获取到pattern信息，所以需要在执行Pass时才可以判断。

使用PassDesc表示的Pass可以在执行Pass前验证上述问题，这个过程在VerifyDesc中完成。

##### 根据匹配子图构造pattern

GeneratePass对于图匹配和替换使用GraphPatternDecetor完成，构造匹配pattern实际上就是将对应对象成员PDPattern中添加PDNode和边关系。该过程在函数`InitGeneratePattern`中完成，该函数没有作为GeneratePass的成员方法，主要出于后续可能开发新的Decetor考虑，GeneratePass与Decetor的操作是没有关联的。

初始化pattern主要通过遍历匹配子图program的全部算子实现：

1. 添加当前算子对应PDNode及限制条件（算子类型、属性限制等）；
2. 遍历当前算子对应输入并从pattern中尝试获取PDNode：
   - 在pattern中获取到PDNode且为输出节点：表示属于匹配子图的中间节点，将该PDNode设置为中间节点；
   - 在pattern中没有获取到PDNode：添加该输入PDNode并设置作为输入节点；
   - 设置输入到算子的边关系；
3. 遍历当前算子对应输出：
   - 在pattern中获取到PDNode且为输入节点：表示属于匹配子图的中间节点，将该PDNode设置为中间节点；
   - 在pattern中没有获取到PDNode：添加该输入PDNode并设置作为输出节点；
   - 设置算子到输出的边关系；

##### 根据替换子图操作graph

替换子图操作的过程在`GetGenerateRewrite`函数中完成，与`InitGeneratePattern`类似没有作为GeneratePass的成员方法。

生成替换子图操作过程如下：

1. 判断冗余替换子图；
2. 遍历替换子图program的全部算子添加替换子图Node：
   1. 添加当前算子的Node及属性设置；
   2. 遍历当前算子对应输入，添加中间variable节点；
   3. 遍历当前算子对应输出，添加中间variable节点；
   4. 添加输入/输出节点与算子节点的边关系；
3. 删除匹配图中属于中间节点的Node；

##### 优化子图验证

对于替换子图或者替换后的计算图是否可以正确运行等，可以在执行Pass时验证，从而防止在后续执行计算图时出现异常。

当前Pass执行直接修改计算图，验证失败时无法很好的完成还原操作，目前子图验证暂时默认成功，留到后续改进。

f6db9806

17 8月, 2021 1 次提交

Add some passes which can be applied to Program (#34730) · 8046e33d

由 Zeng Jinle 提交于 8月 17, 2021

* add inplace passes and tests

* update

* fix use_cuda undefined
fix compile error of op compat

* add more ut

* fix CPU CI error

* check adam unique

* fix mac/windows ci, improve coverage

* fix ci error

* follow weihang's comment

* fix BlockDesc::MoveFrom

* follow qiuliang's comment

* update

* follow huihuang's comments

8046e33d

06 8月, 2021 1 次提交
- Q
  
  fix npu compile error, test=develop (#34656) · c16421c2
  由 Qi Li 提交于 8月 06, 2021
  
  c16421c2
12 6月, 2021 1 次提交

由 joanna.wozna.intel 提交于 6月 11, 2021

* Small changes related to BF16 fusion_gru and fusion_lstm

* Correct to pass arg by value

* Add conditions to rnn op

* Correct the spelling mistake

* Improving the test with checking activation

* Trigger CI

cd95ea82

03 6月, 2021 1 次提交
- 王
  
  add the fc fuse example for pass enhance, test=develop (#33250) · fc5b3a99
  由王明冬提交于 6月 03, 2021
  
  fc5b3a99
26 5月, 2021 1 次提交

optimize OP's compilation time (#32617) · 78ecb668

由 wuhuanzhou 提交于 5月 26, 2021

* optimize OP's compilation time, test=develop

* add more op and run ci test, test=develop

* CUDA Kernel register in cc file, test=develop

* fix macros, test=develop

* fix undefined symbol error, test=develop

* fix compilation error and undefined symbol, test=develop

* fix compilation error on Windows, test=develop

* fix compilation error on Windows, test=develop

78ecb668

25 5月, 2021 1 次提交
- 王
  
  add the IsLeftDefault definition for pass enhance,test=develop (#33081) · dc72ffa5
  由王明冬提交于 5月 25, 2021
  
  dc72ffa5
21 5月, 2021 1 次提交
- 王
  
  add method for enhance pass,test=develop (#33004) · 79ed7177
  由王明冬提交于 5月 21, 2021
  
  79ed7177
28 4月, 2021 1 次提交

Nne integration (#32604) · abcb3f54

由 denglin-github 提交于 4月 28, 2021

* Add dlnne engine runtime

* Fix log

* Remove <const_cast> and remove unrelated modify with dlnne, +clang-format

* Fix CMakeList format error

* Add copyright message

* Fix dlnne CMakeList.txt

* Add some paddlepaddle_pass to support more networks

* Fix some format bug

* Add delete dropout_op pass

* Fix some format bug

* Fix format bug

abcb3f54

22 2月, 2021 1 次提交
- Q
  
  [ROCM] update fluid framework for rocm (part1), test=develop (#31009) · 8fe09faf
  由 Qi Li 提交于 2月 22, 2021
  
  8fe09faf

BaiXuePrincess / Paddle 与 Fork 源项目一致

BaiXuePrincess / Paddle
与 Fork 源项目一致