提交 · face8f1f1a0cd2696dc05f682ef91278a6d4d034 · BaiXuePrincess / Paddle

21 9月, 2022 1 次提交
- W
  residual_no_bias (#46129) · aa0e84e3
  由 wenbin 提交于 9月 21, 2022
```
* residual_no_bias

* comments

* more ut

* fix input
```
  aa0e84e3
20 9月, 2022 1 次提交
- Z
  [Inference] fix preln_residual_bias_fuse_pass bug in TNT_small model (#46178) · bfee398b
  由 zhoutianzi666 提交于 9月 20, 2022
```
* fix preln_residual_bias_fuse_pass bug in TNT_small model 
```
  bfee398b
19 9月, 2022 1 次提交

Fix wrong eigen header include (#46082) · 59a2a987

由 zyfncg 提交于 9月 19, 2022

* fix wrong eigen header include

* fix complie bug

* fix nan_inf_utils_detail

* fix resource_manager

* fix conv_miopen_helper

59a2a987

13 9月, 2022 1 次提交
- J
  add softmax infer kernel (#45955) · 01888482
  由 JingZhuangzhuang 提交于 9月 13, 2022
```
* add softmax infer kernel
```
  01888482
09 9月, 2022 1 次提交

[new-exe] convert fused_all_reduce_op_handle to program (#45774) · e755c07e

由 Leo Chen 提交于 9月 09, 2022

* add operator<< for BuildStrategy

* add fake_coalesce

* fit allreduce mode for new_exe

* remove dubeg code

* follow comments

e755c07e

07 9月, 2022 1 次提交

Layernorm shift partition (#45736) · 960109af

由 wenbin 提交于 9月 07, 2022

* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT

960109af

06 9月, 2022 1 次提交

[Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is... · ddc244d3

由 zhoutianzi666 提交于 9月 06, 2022

[Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is shared  between ops (#45719)

* fix_old_format

* fix bug in quant_conv2d_dequant

* fix bug in quant_conv2d_dequant

ddc244d3

05 9月, 2022 2 次提交

New format quant model support for MKLDNN (#45416) · 4e4f4586

由 yeliang2258 提交于 9月 05, 2022

* support onnx format quantized model

* update code

* add test

* add test

* fix

* fix test

* fix cmake

* update code

* change scale file path to calibration file path

* update code

* update code

* fix build bug

* fix build bugs

* fix

* fix

4e4f4586

F
fix bugs for vit attention pass (#45721) · b9d66e6b
由 feng_shuai 提交于 9月 05, 2022
```
* fix: vit attention pass

* reflash CI
```
b9d66e6b

31 8月, 2022 1 次提交
- H
  add del dropout op pass to jit pe enigne (#45439) · 46bc06b5
  由 Hui Zhang 提交于 8月 31, 2022
```
* add del dropout op pass to jit pe enigne

* add delete dropout test
```
  46bc06b5
30 8月, 2022 2 次提交

Remove extra attribute in OpMaker (#44310) · fe321f9a

由 zyfncg 提交于 8月 30, 2022

* add runtime config in phi

* add runtime attr for op desc and op

* fix no proto error

* adjust opdesc set_attr impl

* try to remove conv_op extra attrs

* add init runtime attr map

* change extra header path

* fix runtime_attr

* fix trace_op

* fix bug of pass

* fix merge conflict

* fix dygraph attrs

* fix bug of pass

* fix dygraph bug

* fix unittest module

* delete extra attr default

* fix dropout kernel

* polish code

* fix extra output of instance_norm

* fix merge confilct

* fix op_desc bug

* add extra attr in yaml for conv3d_transpose

* don't remove extra input and output

* fix save_inference_model

* fix bug of batch_norm

* revert some change

* polish log

* polish code

* add code comment
Co-authored-by: NChen Weihang <chenweihang@baidu.com>

fe321f9a

Z
[Paddle-TRT] constant-folding (#45494) · 97f43a8e
由 zhoutianzi666 提交于 8月 30, 2022
```
add constant folding pass， for some model，it will get less latency；
```
97f43a8e

29 8月, 2022 1 次提交

[new_exe] Dy2Static support new_executor (#44450) · aba1295b

由 zhangbo9674 提交于 8月 29, 2022

* add interpretercore

* refine backward program id

* add code

* refine program

* refine code

* create forward/backward_program by prog2graph2prog method

* test, do not care

* refine code

* refine code

* refine code

* test, do not care

* add interpretorcore

* add scope

* refine scope create method

* add jit for new_exe

* solve conflict

* delete unused code

* polish code

* polish code

* refine scope in inplace

* refine for datatransfer

* refine _rebuild_from_desc

* refine control eager deletion attr

* refine used_for_jit

* refine jit for infer

* op size0 use ori program

* polish code

* refine jit

* refine run_program_op ut

* refine inplace

* refine control

* refine graph helper

* refine control

* refine inplace

* refine buffer_share_inplace_pass

* polish code

* polish code

* refine usage for compilerProgram

* refine control

* test

* test core cache

* refine code

* refine io.py

* increase test_seq2seq timeout

* refine convert program

* refine interpretercore_cache release

* delete buildinplace

* refine partial_program && io

* refine code for io

* test

* test

* test

aba1295b

24 8月, 2022 1 次提交
- W
  
  conv_eltwiseadd_bn_fuse support fp16 (#45379) · 62b5452d
  由 Wilber 提交于 8月 24, 2022
  
  62b5452d
22 8月, 2022 3 次提交
- J
  Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38
  由 joanna.wozna.intel 提交于 8月 22, 2022
```
* Add int8 support for matmul+elementwiae_add fuse

* Corrections after review and ernie test fix
```
  9e5f3a38
- S
  Extend conv_concat_relu to support all activations (#45089) · d03ef054
  由 Sławomir Siwek 提交于 8月 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
  d03ef054
- Y
  
  remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6
  由 Yuanle Liu 提交于 8月 22, 2022
  
  25d58db6
17 8月, 2022 1 次提交
- F
  
  fix:op version (#45192) · d0cd0a11
  由 feng_shuai 提交于 8月 17, 2022
  
  d0cd0a11
16 8月, 2022 2 次提交

convert multihead to oss (#45019) · f706d95d

由 feng_shuai 提交于 8月 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

f706d95d

W

fix new quant (#45155) · 2fb65e44
由 Wangzheee 提交于 8月 16, 2022

2fb65e44

15 8月, 2022 1 次提交
- Y
  
  fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
  由 Yuanle Liu 提交于 8月 15, 2022
  
  ac0553a0
12 8月, 2022 1 次提交

Offload calculations from matmul op to fuse pass (#44941) · acb78ea2

由 Sławomir Siwek 提交于 8月 12, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* Add int8 support for matmulV2

* restore ut

* adjust old ut

* restore parallel UT ruels

* remove mkldnn code from base ops

* move enforces to pass

* remove duplicated functions

* delete duplicated enforces

* feedback from review

* add comments to variables

* enable eltwise support

* dynamic attribute

* remove fusepass tests from op test

* remove fuse pass cases from op test

* revert introduction of dynamic attributes

* style
Co-authored-by: Nwozna <joanna.wozna@intel.com>

acb78ea2

11 8月, 2022 1 次提交
- W
  
  Change bias to persistable in preln_residual_bias_fuse_pass (#45037) · 26c573de
  由 whs 提交于 8月 11, 2022
  
  26c573de
10 8月, 2022 1 次提交
- A
  [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute (#44737) · 81d6fa6c
  由 Aurelius84 提交于 8月 10, 2022
```
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute

* add unittest for inference predictor
```
  81d6fa6c
09 8月, 2022 2 次提交
- Y
  
  fix mkldnn conv add pass when the dims of res and out are not equel (#45018) · 42c694df
  由 yeliang2258 提交于 8月 09, 2022
  
  42c694df
- Y
  Fix a bug in transpose2 when run native cpu (#44659) · 8185cecd
  由 yeliang2258 提交于 8月 09, 2022
```
* fix a bug in transpose2 about mkldnn

* fix bug
```
  8185cecd
05 8月, 2022 2 次提交

[MKLDNN]Move mkldnn activation kernel to phi (#44365) · 2dfa88d2

由 YuanRisheng 提交于 8月 05, 2022

* move mkldnn activation kernel

* fix compile bugs

* fix compile bugs

* deal with conflict

* fix compile bugs

* fix windows compile bugs

* mkldnn unittest fix

* change mutable to alloc

* fix unittest bugs

* modify code according comment

2dfa88d2

Merge matmul_v1 and matmul_v2 fuse passes (#44870) · d0cf9d9d

由 Sławomir Siwek 提交于 8月 05, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* restore ut

* adjust old ut

* restore parallel UT ruels

* feedback from review

d0cf9d9d

04 8月, 2022 1 次提交

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

由 Sławomir Siwek 提交于 8月 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multipe outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: NJacek Czaja <jacek.czaja@intel.com>

0420d514

03 8月, 2022 1 次提交
- W
  
  fix trt and gpu pass: emb_elt_layn (#44842) · 2ea1c134
  由 Wangzheee 提交于 8月 03, 2022
  
  2ea1c134
02 8月, 2022 1 次提交

Multihead matmul fp16 (#44792) · 0fd8ee63

由 Wilber 提交于 8月 02, 2022

* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error

0fd8ee63

01 8月, 2022 2 次提交
- L
  unify gpu context (#44740) · 86763023
  由 Leo Chen 提交于 8月 01, 2022
```
* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes
```
  86763023
- W
  [Paddle Inference] add varlen_token_prune plugin, pass, convert (#44733) · 24187fcb
  由 Wangzheee 提交于 8月 01, 2022
```
* add varlen_token_prune plugin, pass, convert
```
  24187fcb
27 7月, 2022 1 次提交
- P
  fix RemoveIntermediateOut in fuse_elewise_add_act_pass while converting graph to program (#44593) · be132719
  由 pangyoki 提交于 7月 27, 2022
```
* fix RemoveNode in fuse_elewise_add_act_pass

* fix

* change pointer to share_ptr

* fix

* fix

* fix format

* fix

* fix graph_safe_remove_nodes
```
  be132719
26 7月, 2022 3 次提交
- R
  
  Merge kProgramDescs in GraphToProgram (#44526) · b6e84806
  由 Ruibiao Chen 提交于 7月 26, 2022
  
  b6e84806
- R
  Set more attrs in ReplaceScaleLossGradOp (#44576) · ab198b45
  由 Ruibiao Chen 提交于 7月 26, 2022
```
* Set more attrs in ReplaceScaleLossGradOp

* Fix typos

* Fix CI errors

* Add UT
```
  ab198b45
- R
  
  Remove ControlDepVar in GraphToBlock (#44591) · 8d3672f0
  由 Ruibiao Chen 提交于 7月 26, 2022
  
  8d3672f0
21 7月, 2022 1 次提交
- X
  [Paddle inference] Add conv_fusion_fp16 (#44435) · 37455714
  由 xiaoxiaohehe001 提交于 7月 21, 2022
```
* convfusionfp16

* convfusionfp16

* convfusionfp16
```
  37455714
20 7月, 2022 1 次提交

transfer block_id to CreateVarNode in multi_devices_graph_pass (#44366) · 1882ffd5

由 pangyoki 提交于 7月 20, 2022

* fix CreateVarNode in multi_devices_graph_pass

* Revert "Fix var duplication bug for graph_to_program_pass (#44278)"

This reverts commit a2c4c86b.

1882ffd5

19 7月, 2022 1 次提交
- R
  Rename BOOST_GET macros (#44368) · 4b085c57
  由 Ruibiao Chen 提交于 7月 19, 2022
```
* Rename BOOST_GET macros

* Fix conflicts
```
  4b085c57

BaiXuePrincess / Paddle 与 Fork 源项目一致

BaiXuePrincess / Paddle
与 Fork 源项目一致