Commits · ba4fbe71ee47657bba68637dc1774595f645085c · PaddlePaddle / Paddle

03 Nov, 2022 1 commit

[cherry pick] fix memory copy in prepare_data of FusedMultiTransformer pass (#47308) · ba4fbe71

Committed by Kaipeng Deng on Nov 03, 2022

* fix memory copy in prepare_data. test=develop

* add cache_kv fp16 support. test=develop

* fit for simplify_with_basic_ops_pass. test=develop

ba4fbe71

20 Oct, 2022 3 commits

[cherry pick] Add FusedMultiTransformer fuse pass for GPT3 (#47150) · 396427a7

Committed by Kaipeng Deng on Oct 20, 2022
```
* add fused_attention_pass. test=develop

* support fp16. test=develop

* fix format. test=develop
```
396427a7

[cherry-pick] Fix quantize model deploy bug in MKLDNN (#47119) · c2d344dd

Committed by yeliang2258 on Oct 20, 2022

* Fix quantize model deploy bugs when using MKLDNN (#45920)

* fix immutable op quantize bugs

* fix

* fix build bug

* fix test

* notest,test=inference

* fix ppyoloe acc drop bugs

* fix test

* fix test

* add test

* fix

* fix

* fix test

* fix refined name bug

* fix test

* bias fix

* fix matmul weight dequant bug

* re-ci

* fix tester

* fix test

* fix tester

* update weight dequantize func

* update code

* update test for coverage

* update test

* update cmake

* update cmakelist

* update code

* rerun ci

* remove useless code

* re-ci

* update code

* update code

* fix header

* update code for log

c2d344dd

[Cherry-pick] layernorm shift partition enhance (#47086) · 9ed1454a

Committed by Wang Bojun on Oct 20, 2022
```
* Enhance the layernorm shift partition fuse op when shift size > 0 (roll shifting)
* fix cherry-pick test
```
9ed1454a

19 Oct, 2022 2 commits

Add unsigned int8 scale propagation (#46378) (#47156) · 66dccd7d

Committed by yeliang2258 on Oct 19, 2022

* Add unsigned int8 propagation

* Add or modify unit tests

* Correct concat scale checking

* Apply review suggestions

* Corrections
Co-authored-by: joanna.wozna.intel <joanna.wozna@intel.com>

66dccd7d

[Dy2St]Fix recurrent op eager deletion pass error in dy2st (#47105) (#47134) · 69515e90

Committed by WangZhen on Oct 19, 2022
```
[CherryPick][Dy2St]Fix recurrent op eager deletion pass error in dy2st
```
69515e90

17 Oct, 2022 1 commit

[IPU] paddle-inference support custom-ops (#45235) (#46868) · bd89be12

Committed by Allen Guo on Oct 17, 2022

* paddle-inference support custom-ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

bd89be12

14 Oct, 2022 1 commit

[Paddle-TRT] support new quant format from slim (#46022) (#46979) · b8677c0d

Committed by zhoutianzi666 on Oct 14, 2022

b8677c0d

11 Oct, 2022 1 commit

[cherry-pick] [PHI] relu6_grad kernel (#46501) (#46862) · 2bcbf8b0

Committed by Sławomir Siwek on Oct 11, 2022

* [PHI] Migrate gelu kernels (#45596)

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* gelu fwd

* sort activations

* gelu gradient

* remove unused macros

* merge conflicts

* fix merge conflicts

* remove extra constraint from gelu op

* [PHI] relu6_grad kernel (#46501)

* Relu6

* remove fluid handler

* add individual kernel signature

* coding style

* replace bounded_relu with clip

* whitespace

* code style

2bcbf8b0

20 Sep, 2022 2 commits

[Inference] fix preln_residual_bias_fuse_pass bug in TNT_small model (#46178) (#46260) · c384b00d

Committed by zhoutianzi666 on Sep 20, 2022
```
* fix preln_residual_bias_fuse_pass bug in TNT_small model
```
c384b00d

Fix wrong eigen header include (#46082) (#46202) · ac8cce20

Committed by zyfncg on Sep 20, 2022
```
* fix wrong eigen header include

* fix compile bug

* fix nan_inf_utils_detail

* fix resource_manager

* fix conv_miopen_helper
```
ac8cce20

13 Sep, 2022 1 commit

cherry pick softmax infer kernel (#45957) · 0903020d

Committed by JingZhuangzhuang on Sep 13, 2022

0903020d

07 Sep, 2022 1 commit

Layernorm shift partition (#45736) · 960109af

Committed by wenbin on Sep 07, 2022

* first commit

* convert done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runnable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT

960109af

06 Sep, 2022 1 commit

[Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is... · ddc244d3

Committed by zhoutianzi666 on Sep 06, 2022

[Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is shared between ops (#45719)

* fix_old_format

* fix bug in quant_conv2d_dequant

* fix bug in quant_conv2d_dequant

ddc244d3

05 Sep, 2022 2 commits

New format quant model support for MKLDNN (#45416) · 4e4f4586

Committed by yeliang2258 on Sep 05, 2022

* support onnx format quantized model

* update code

* add test

* add test

* fix

* fix test

* fix cmake

* update code

* change scale file path to calibration file path

* update code

* update code

* fix build bug

* fix build bugs

* fix

* fix

4e4f4586

fix bugs for vit attention pass (#45721) · b9d66e6b

Committed by feng_shuai on Sep 05, 2022
```
* fix: vit attention pass

* refresh CI
```
b9d66e6b

31 Aug, 2022 1 commit

add del dropout op pass to jit pe engine (#45439) · 46bc06b5

Committed by Hui Zhang on Aug 31, 2022
```
* add del dropout op pass to jit pe engine

* add delete dropout test
```
46bc06b5

30 Aug, 2022 2 commits

Remove extra attribute in OpMaker (#44310) · fe321f9a

Committed by zyfncg on Aug 30, 2022

* add runtime config in phi

* add runtime attr for op desc and op

* fix no proto error

* adjust opdesc set_attr impl

* try to remove conv_op extra attrs

* add init runtime attr map

* change extra header path

* fix runtime_attr

* fix trace_op

* fix bug of pass

* fix merge conflict

* fix dygraph attrs

* fix bug of pass

* fix dygraph bug

* fix unittest module

* delete extra attr default

* fix dropout kernel

* polish code

* fix extra output of instance_norm

* fix merge conflict

* fix op_desc bug

* add extra attr in yaml for conv3d_transpose

* don't remove extra input and output

* fix save_inference_model

* fix bug of batch_norm

* revert some change

* polish log

* polish code

* add code comment
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

fe321f9a

[Paddle-TRT] constant-folding (#45494) · 97f43a8e

Committed by zhoutianzi666 on Aug 30, 2022
```
Add constant folding pass; for some models, this reduces latency.
```
97f43a8e

29 Aug, 2022 1 commit

[new_exe] Dy2Static support new_executor (#44450) · aba1295b

Committed by zhangbo9674 on Aug 29, 2022

* add interpretercore

* refine backward program id

* add code

* refine program

* refine code

* create forward/backward_program by prog2graph2prog method

* test, do not care

* refine code

* refine code

* refine code

* test, do not care

* add interpretorcore

* add scope

* refine scope create method

* add jit for new_exe

* solve conflict

* delete unused code

* polish code

* polish code

* refine scope in inplace

* refine for datatransfer

* refine _rebuild_from_desc

* refine control eager deletion attr

* refine used_for_jit

* refine jit for infer

* op size0 use ori program

* polish code

* refine jit

* refine run_program_op ut

* refine inplace

* refine control

* refine graph helper

* refine control

* refine inplace

* refine buffer_share_inplace_pass

* polish code

* polish code

* refine usage for compilerProgram

* refine control

* test

* test core cache

* refine code

* refine io.py

* increase test_seq2seq timeout

* refine convert program

* refine interpretercore_cache release

* delete buildinplace

* refine partial_program && io

* refine code for io

* test

* test

* test

aba1295b

24 Aug, 2022 1 commit

conv_eltwiseadd_bn_fuse support fp16 (#45379) · 62b5452d

Committed by Wilber on Aug 24, 2022

62b5452d

22 Aug, 2022 3 commits

Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38

Committed by joanna.wozna.intel on Aug 22, 2022
```
* Add int8 support for matmul+elementwise_add fuse

* Corrections after review and ernie test fix
```
9e5f3a38

Extend conv_concat_relu to support all activations (#45089) · d03ef054

Committed by Sławomir Siwek on Aug 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
d03ef054

remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6

Committed by Yuanle Liu on Aug 22, 2022

25d58db6

17 Aug, 2022 1 commit

fix:op version (#45192) · d0cd0a11

Committed by feng_shuai on Aug 17, 2022

d0cd0a11

16 Aug, 2022 2 commits

convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss can only be used on T4 and A*

* fix:change api for ROCM CI

f706d95d

fix new quant (#45155) · 2fb65e44

Committed by Wangzheee on Aug 16, 2022

2fb65e44

15 Aug, 2022 1 commit

fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0

Committed by Yuanle Liu on Aug 15, 2022

ac0553a0

12 Aug, 2022 1 commit

Offload calculations from matmul op to fuse pass (#44941) · acb78ea2

Committed by Sławomir Siwek on Aug 12, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* Add int8 support for matmulV2

* restore ut

* adjust old ut

* restore parallel UT rules

* remove mkldnn code from base ops

* move enforces to pass

* remove duplicated functions

* delete duplicated enforces

* feedback from review

* add comments to variables

* enable eltwise support

* dynamic attribute

* remove fusepass tests from op test

* remove fuse pass cases from op test

* revert introduction of dynamic attributes

* style
Co-authored-by: wozna <joanna.wozna@intel.com>

acb78ea2

11 Aug, 2022 1 commit

Change bias to persistable in preln_residual_bias_fuse_pass (#45037) · 26c573de

Committed by whs on Aug 11, 2022

26c573de

10 Aug, 2022 1 commit

[OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute (#44737) · 81d6fa6c

Committed by Aurelius84 on Aug 10, 2022
```
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute

* add unittest for inference predictor
```
81d6fa6c

09 Aug, 2022 2 commits

fix mkldnn conv add pass when the dims of res and out are not equal (#45018) · 42c694df

Committed by yeliang2258 on Aug 09, 2022

42c694df

Fix a bug in transpose2 when run native cpu (#44659) · 8185cecd

Committed by yeliang2258 on Aug 09, 2022
```
* fix a bug in transpose2 about mkldnn

* fix bug
```
8185cecd

05 Aug, 2022 2 commits

[MKLDNN]Move mkldnn activation kernel to phi (#44365) · 2dfa88d2

Committed by YuanRisheng on Aug 05, 2022

* move mkldnn activation kernel

* fix compile bugs

* fix compile bugs

* deal with conflict

* fix compile bugs

* fix windows compile bugs

* mkldnn unittest fix

* change mutable to alloc

* fix unittest bugs

* modify code according to review comments

2dfa88d2

Merge matmul_v1 and matmul_v2 fuse passes (#44870) · d0cf9d9d

Committed by Sławomir Siwek on Aug 05, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* restore ut

* adjust old ut

* restore parallel UT rules

* feedback from review

d0cf9d9d

04 Aug, 2022 1 commit

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

Committed by Sławomir Siwek on Aug 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multiple outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>

0420d514

03 Aug, 2022 1 commit

fix trt and gpu pass: emb_elt_layn (#44842) · 2ea1c134

Committed by Wangzheee on Aug 03, 2022

2ea1c134

02 Aug, 2022 1 commit

Multihead matmul fp16 (#44792) · 0fd8ee63

Committed by Wilber on Aug 02, 2022

* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error

0fd8ee63

01 Aug, 2022 2 commits

unify gpu context (#44740) · 86763023

Committed by Leo Chen on Aug 01, 2022
```
* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes
```
86763023

[Paddle Inference] add varlen_token_prune plugin, pass, convert (#44733) · 24187fcb

Committed by Wangzheee on Aug 01, 2022
```
* add varlen_token_prune plugin, pass, convert
```
24187fcb

PaddlePaddle / Paddle synced successfully more than 1 year ago
