提交 · 65ffc3f5be5251e0954f473efd896a357f706418 · PaddlePaddle / Paddle

30 11月, 2022 2 次提交
- F
  
  feat:add the support for vit_attention_op on gpu (#48515) · e9ca7600
  由 feng_shuai 提交于 11月 30, 2022
  
  e9ca7600
- R
  Add int8 support in fused_multi_transformer_pass and fuse_multi_transformer_layer_pass (#48209) · 12486712
  由 RichardWooSJTU 提交于 11月 30, 2022
```
* delete unnecessary shape and slice op
Co-authored-by: NYour Name <you@example.com>
```
  12486712
23 11月, 2022 1 次提交
- W
  
  add map_depthwise_conv_to_conv pass (#47955) · 3daf5185
  由 Wilber 提交于 11月 23, 2022
  
  3daf5185
21 11月, 2022 2 次提交

add fc-residual quantization (#46917) · fed0ed34

由 Sylwester Fraczek 提交于 11月 21, 2022

* add fc-residual quantization

* revert removal of check for use_mkldnn

* fix bug

* add disable_logs

* review fix

call twice AreScalesPresntForNodes instead of if-else

* rewrite residual input to output

* revert fc mkldnn taking residual data

* format fix

* fix LoDTensor->DenseTensor

* LoDTensor->DenseTensor

* output->input

* revert changes to unsupported script

revert changes to unsupported script

* remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc

fed0ed34

R

delete unnecessary shape and slice op (#48112) · 41483383
由 RichardWooSJTU 提交于 11月 21, 2022

41483383

16 11月, 2022 1 次提交
- P
  Add bf16 data type support to oneDNN bilinear_interp kernel (#46770) · 8e6315e4
  由 Piotr Paturej 提交于 11月 16, 2022
```
* Enable bf16 in oneDNN bilinear_interp kernel

* Fix bilinear_interp_v2 not enabled in models

* Remove unnecessary checks
```
  8e6315e4
15 11月, 2022 1 次提交
- J
  Added optimization pass for oneDNN layernorm kernel (#47782) · 519e7426
  由 jakpiase 提交于 11月 15, 2022
```
* optimization for ln

* fix

* added output to gpd

* added formatting

* fix
```
  519e7426
10 11月, 2022 2 次提交
- Z
  [search && paddle inference]add roformer pass&&plugin novarlen version (#47523) · 0f3fb562
  由 zhangxin81 提交于 11月 10, 2022
```
* add roformer pass&&plugin（novarlen）
```
  0f3fb562
- R
  Fuse multi transformer layer pass (#47541) · 1e3245a8
  由 RichardWooSJTU 提交于 11月 10, 2022
```
* add fuse_multi_transformer_layer_pass
```
  1e3245a8
09 11月, 2022 2 次提交

J

Fix U2++ perf (#47780) · b1fb2360
由 joanna.wozna.intel 提交于 11月 09, 2022

b1fb2360

Enable fc passes (#45704) · 7e914386

由 Paulina Gacek 提交于 11月 09, 2022

* Analysis API interface for disabling fc passes

* Unit tests corrected

* Python API added

* test runs only when PADDLE_WITH_MKLDNN

* Fc op changed to relu in matmul_op_test

* Disable fc passes in tests where acc drops

* code formating

* Unit test for analysisConf added

* Unit test gpu added

* fc passes disabled when iterations=0 in gru test

* style

* passes disabled when fp32 in gru test

* fc passes disabled in lstm test

* Import from inference, not fluid in doc

7e914386

08 11月, 2022 1 次提交
- K
  
  add fuse_multi_transformer passes to fp16. test=develop (#47676) · caca5687
  由 Kaipeng Deng 提交于 11月 08, 2022
  
  caca5687
07 11月, 2022 1 次提交

suqeeze2 + transpose2 fuse onednn (#47592) · fa874a46

由 Hui Zhang 提交于 11月 07, 2022

* suqeeze2 transpose2 fuse onednn

* format

* fix output shape

* fix conflict

* format

* format

* remove useless

* remove log

* simply pass

* fix comment

* fix

* fix msg

* fix error msg

* format

fa874a46

04 11月, 2022 1 次提交
- J
  Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
  由 jakpiase 提交于 11月 04, 2022
```
* tmp save

* minor chnage

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```
  9e006987
03 11月, 2022 1 次提交
- Y
  Fix ComputePropagateScalesMkldnnPass of MKLDNN (#47574) · 5fc92943
  由 yeliang2258 提交于 11月 03, 2022
```
* add constant_folding_pass pass for mkldnn int8

* update UpdateScaleOpInOutScales
```
  5fc92943
26 10月, 2022 2 次提交

Preln_Layernorm_Shift_Partition (#47099) · d17d0cd1

由 wenbin 提交于 10月 26, 2022

* prelnlayernorm_shift

* add ut

* remove paddle_enforce

* remove useless

* add UT

* remove UT

* add UT

* set timeout

d17d0cd1

FC/matmul(v2) + scale fuse pass (#47127) · c1c2be2d

由 Sławomir Siwek 提交于 10月 26, 2022

* fc/matmuls + scale fuse pass

* remove double-extension

* add unit tests

* comments from review

* codestyle

* add pass to int8 list

* new codestyle

* attr name typo

c1c2be2d

20 10月, 2022 2 次提交
- F
  
  fix:use constant_fold for vit pass (#47211) · ac666538
  由 feng_shuai 提交于 10月 20, 2022
  
  ac666538
- K
  Add FusedMultiTransformer fuse pass for GPT3 (#45907) · 5a2e5179
  由 Kaipeng Deng 提交于 10月 20, 2022
```
* add fused_multi_transformer_encoder/decoder pass, run GPT-3 success
```
  5a2e5179
18 10月, 2022 1 次提交

Merge layernorm trt fuse (#46320) · 5e9f491e

由 Wang Bojun 提交于 10月 18, 2022

* first version, accuracy corrected

* disable debug print

* use blockReduceSum in phi

* add UT

* add opCompat

* code style

* code refine

* bug fix

* code refine

* test fix

* bugfix

* codesytle fix

* code style

* code-style

* code-style

* code-style

5e9f491e

17 10月, 2022 2 次提交
- H
  Revert "add common subexpression elimination (#44386)" (#47062) · 7c6835ca
  由 hong 提交于 10月 17, 2022
```
This reverts commit 166ff39a.
```
  7c6835ca
- W
  Layernorm shift partition enhance (#46816) · 9e08633c
  由 Wang Bojun 提交于 10月 17, 2022
```
* first version of ln_s_p with s>0

* refine and UT

* pass opt draft

* pass opt

* code refine

* code-style

* bug fix

* fix ci test

* code style
```
  9e08633c
16 10月, 2022 1 次提交
- Z
  
  add common subexpression elimination (#44386) · 166ff39a
  由 ZeKai Zhou 提交于 10月 16, 2022
  
  166ff39a
10 10月, 2022 1 次提交
- Z
  
  [Paddle-TRT] support new quant format from slim (#46022) · 7987a905
  由 zhoutianzi666 提交于 10月 10, 2022
  
  7987a905
27 9月, 2022 1 次提交
- W
  [Paddle Inference]support n lookup_tables fuse to embeddinglayernorm(3) (#46243) · 4d772144
  由 Wangzheee 提交于 9月 27, 2022
```
* [Paddle Inference]support n lookup_tables fuse to embeddinglayernorm(3)
```
  4d772144
21 9月, 2022 1 次提交
- Z
  [Paddle-TRT] remove trt_reshape2_matmul_fuse_pass (#46090) · c9a7a3bc
  由 zhoutianzi666 提交于 9月 21, 2022
```
* Remove trt_reshape2_matmul_fuse_pass
```
  c9a7a3bc
07 9月, 2022 1 次提交

Layernorm shift partition (#45736) · 960109af

由 wenbin 提交于 9月 07, 2022

* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT

960109af

02 9月, 2022 1 次提交
- S
  
  enable add passes pre-calculating scales/quantizing weights (#44680) · cdb36da4
  由 Sylwester Fraczek 提交于 9月 02, 2022
  
  cdb36da4
30 8月, 2022 1 次提交
- Z
  [Paddle-TRT] constant-folding (#45494) · 97f43a8e
  由 zhoutianzi666 提交于 8月 30, 2022
```
add constant folding pass， for some model，it will get less latency；
```
  97f43a8e
22 8月, 2022 3 次提交
- J
  Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38
  由 joanna.wozna.intel 提交于 8月 22, 2022
```
* Add int8 support for matmul+elementwiae_add fuse

* Corrections after review and ernie test fix
```
  9e5f3a38
- S
  Extend conv_concat_relu to support all activations (#45089) · d03ef054
  由 Sławomir Siwek 提交于 8月 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
  d03ef054
- Y
  
  remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6
  由 Yuanle Liu 提交于 8月 22, 2022
  
  25d58db6
16 8月, 2022 1 次提交

convert multihead to oss (#45019) · f706d95d

由 feng_shuai 提交于 8月 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

f706d95d

15 8月, 2022 1 次提交
- Y
  
  fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
  由 Yuanle Liu 提交于 8月 15, 2022
  
  ac0553a0
14 8月, 2022 1 次提交
- X
  Revert "[Paddle Inference] Support cuda_graph. (#44878)" (#45115) · b0e7681f
  由 xiaoxiaohehe001 提交于 8月 14, 2022
```
This reverts commit 84bf5c31.
```
  b0e7681f
10 8月, 2022 1 次提交
- X
  [Paddle Inference] Support cuda_graph. (#44878) · 84bf5c31
  由 xiaoxiaohehe001 提交于 8月 10, 2022
```
* cuda_graph

* cuda_graph_

* cuda_graph_

* cuda_graph_
```
  84bf5c31
05 8月, 2022 1 次提交

Merge matmul_v1 and matmul_v2 fuse passes (#44870) · d0cf9d9d

由 Sławomir Siwek 提交于 8月 05, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* restore ut

* adjust old ut

* restore parallel UT ruels

* feedback from review

d0cf9d9d

04 8月, 2022 1 次提交

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

由 Sławomir Siwek 提交于 8月 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multipe outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: NJacek Czaja <jacek.czaja@intel.com>

0420d514

02 8月, 2022 1 次提交

Multihead matmul fp16 (#44792) · 0fd8ee63

由 Wilber 提交于 8月 02, 2022

* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error

0fd8ee63

29 7月, 2022 1 次提交
- M
  fused_fc_elementwise_layernorm_op support fp16 (#44710) · 856f741a
  由 ming1753 提交于 7月 29, 2022
```
* fused_fc_elementwise_layernorm support fp16

* fused_fc_elementwise_layernorm support double
```
  856f741a

PaddlePaddle / Paddle 大约 1 年 前同步成功

PaddlePaddle / Paddle
大约 1 年前同步成功