Commits · d9a134c3019cec43e8611a147a10206af3107f1c · PaddlePaddle / Paddle

15 Feb, 2023 1 commit
- prefix (#50381) · d9a134c3
  Committed by Wang Bojun on Feb 15, 2023
13 Jan, 2023 1 commit
- fix fc kernel diff (#49781) · 01c26ab2
  Committed by Yuanle Liu on Jan 13, 2023
```
* fix fc kernel diff

* disable fc_elementwise_layernorm_fuse_pass
```
19 Dec, 2022 1 commit

[cherry-pick][Inference] support mixed precision inference (#49077) · ddcd1b61

Committed by Yuanle Liu on Dec 19, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* [Paddle Inference] Add float_to_half_pass to support  inference with mixed precision (#47993)

* [Inference] optimize some code and fix some bug (#48780)

* clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass

* fix unitest timeout

* [Paddle Inference] clean unused code  (#48392)

* fix

* update

* update
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

10 Nov, 2022 1 commit
- Fuse multi transformer layer pass (#47541) (#47830) · 3a6cc57c
  Committed by RichardWooSJTU on Nov 10, 2022
```
* add fuse_multi_transformer_layer_pass
```
09 Nov, 2022 1 commit
- [cherry-pick] Squeeze2 and transpose2 fuse using oneDNN(#47712) · ea5f44b8
  Committed by Hui Zhang on Nov 09, 2022
```
* squeeze2 + transpose2 fuse onednn cherrypick 2.4

* format

* fix merge
```
08 Nov, 2022 2 commits
- add fuse_multi_transformer passes to fp16. test=develop (#47733) · 34f67a88
  Committed by Kaipeng Deng on Nov 08, 2022
- [CHERRY-PICK] Added caching to oneDNN FC and op+unsqueeze2 and op+reshape2 fuse passes (#47690) · d0e19af3
  Committed by jakpiase on Nov 08, 2022
```
* fc cherrypick

* another files added

* added transpose cherrypick

* reverted somebody's fc changes

* minor fix

* minor fix

* cherry-pick of fc+act changes

* minor fix

* fix
```
03 Nov, 2022 2 commits
- FC/matmul(v2) + scale fuse pass (#47420) · 99c872fa
  Committed by Sławomir Siwek on Nov 03, 2022
- Fix ComputePropagateScalesMkldnnPass of MKLDNN (#47574) (#47639) · 559b9754
  Committed by yeliang2258 on Nov 03, 2022
```
* add constant_folding_pass pass for mkldnn int8

* update UpdateScaleOpInOutScales
```
29 Oct, 2022 1 commit
- [JITLayer]Enable OneDNN on CPU and Fix zero shape (#47428) (#47436) · f4788442
  Committed by Aurelius84 on Oct 29, 2022
```
* [JITLayer]Enable OneDNN on CPU and Fix zero shape
```
28 Oct, 2022 1 commit

[Cherry-pick][JIT] Add Predictor for JITLayer (#47379) (#47419) · c42929c5

Committed by Aurelius84 on Oct 28, 2022

* [JIT] Add Predictor for JITLayer (#47379)

* add predictor_engine

* add predictor_engine

* fix zero shape

* fix lodTensor

* fix unittest

* fix code style

* update CmakeList

* fix new executor

20 Oct, 2022 2 commits

[cherry pick] Add FusedMultiTransformer fuse pass for GPT3 (#47150) · 396427a7
Committed by Kaipeng Deng on Oct 20, 2022
```
* add fused_attention_pass. test=develop

* support fp16. test=develop

* fix format. test=develop
```

[cherry-pick] Fix quantize model deploy bug in MKLDNN (#47119) · c2d344dd

Committed by yeliang2258 on Oct 20, 2022

* Fix quantize model deploy bugs when using MKLDNN (#45920)

* fix immutable op quantize bugs

* fix

* fix build bug

* fix test

* notest,test=inference

* fix ppyoloe acc drop bugs

* fix test

* fix test

* add test

* fix

* fix

* fix test

* fix refined name bug

* fix test

* bias fix

* fix matmul weight dequant bug

* re-ci

* fix tester

* fix test

* fix tester

* update weight dequantize func

* update code

* update test for converage

* update test

* update cmake

* update cmakelist

* update code

* rerun ci

* remove useless code

* re-ci

* update code

* update code

* fix header

* update code for log

18 Oct, 2022 1 commit
- support shape tensor is the input of trt-subgraph (#47066) · 5a44c124
  Committed by zhoutianzi666 on Oct 18, 2022
17 Oct, 2022 1 commit

[IPU] paddle-inference support custom-ops (#45235) (#46868) · bd89be12

Committed by Allen Guo on Oct 17, 2022

* paddle-inference support custom-ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

14 Oct, 2022 4 commits
- cherry-pick 46942 (#47015) · 82db4993
  Committed by Wilber on Oct 14, 2022
- Add bmm convert (#47011) · 8f1ac7cf
  Committed by xiaoxiaohehe001 on Oct 14, 2022
- [cherry-pick 2.4][inference] fix reshape2 opteller (#46871) · 535d7574
  Committed by Zhang Jun on Oct 14, 2022
```
* fix reshape2 opteller;
add elementwise min/max register for tensorrt
```
- [Paddle-TRT] support new quant format from slim (#46022) (#46979) · b8677c0d
  Committed by zhoutianzi666 on Oct 14, 2022
11 Oct, 2022 1 commit
- optimize Paddle-TRT performance (#46684) · d091d1b0
  Committed by Yuanle Liu on Oct 11, 2022
28 Sep, 2022 1 commit
- remove trt_reshape2_matmul_fuse_pass (#46363) · a77a6f6b
  Committed by zhoutianzi666 on Sep 28, 2022
20 Sep, 2022 2 commits
- [Paddle-TRT] Support matmul_v2 in Paddle-TensorRT (#46177) · 654807cd
  Committed by zhoutianzi666 on Sep 20, 2022
```
* Support matmul_v2 in Paddle-TensorRT converter.
```
- Fix wrong eigen header include (#46082) (#46202) · ac8cce20
  Committed by zyfncg on Sep 20, 2022
```
* fix wrong eigen header include

* fix compile bug

* fix nan_inf_utils_detail

* fix resource_manager

* fix conv_miopen_helper
```
15 Sep, 2022 1 commit
- General Plugin Mechanism (#45355) (#46070) · 07933116
  Committed by weishengying on Sep 15, 2022
07 Sep, 2022 1 commit

Layernorm shift partition (#45736) · 960109af

Committed by wenbin on Sep 07, 2022

* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT

06 Sep, 2022 2 commits
- enable memory optimize when fp16. (#45792) · 1967c6a6
  Committed by Wilber on Sep 06, 2022
- [TRT] Add silu converter (#45588) · dd0f9b96
  Committed by LielinJiang on Sep 06, 2022
```
* add silu converter
```
05 Sep, 2022 2 commits

New format quant model support for MKLDNN (#45416) · 4e4f4586

Committed by yeliang2258 on Sep 05, 2022

* support onnx format quantized model

* update code

* add test

* add test

* fix

* fix test

* fix cmake

* update code

* change scale file path to calibration file path

* update code

* update code

* fix build bug

* fix build bugs

* fix

* fix


Update DlNNE engine (#45027) · 638965c5

Committed by denglin-github on Sep 05, 2022

* add config param for enable_dlnne and support calibration mode
* remove useless file
* refine code and add annotation
* refine code of Warnning tips

02 Sep, 2022 1 commit
- enable add passes pre-calculating scales/quantizing weights (#44680) · cdb36da4
  Committed by Sylwester Fraczek on Sep 02, 2022
30 Aug, 2022 1 commit
- [Paddle-TRT] constant-folding (#45494) · 97f43a8e
  Committed by zhoutianzi666 on Aug 30, 2022
```
add constant folding pass; for some models, it yields lower latency
```
29 Aug, 2022 1 commit
- TensorRT Engine context memory bind with predictor id (#45468) · 02621079
  Committed by Yuanle Liu on Aug 29, 2022
22 Aug, 2022 3 commits
- Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38
  Committed by joanna.wozna.intel on Aug 22, 2022
```
* Add int8 support for matmul+elementwise_add fuse

* Corrections after review and ernie test fix
```
- Extend conv_concat_relu to support all activations (#45089) · d03ef054
  Committed by Sławomir Siwek on Aug 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
- remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6
  Committed by Yuanle Liu on Aug 22, 2022
18 Aug, 2022 2 commits

[inference]predictor add GetInputType interface (#45143) · a8ae87f1

Committed by heliqi on Aug 18, 2022

* predictor add GetInputType interface

* predictor change GetInputType to GetInputTypes

* predictor add tester

* predictor add tester

* predictor change GetInputType to GetInputTypes

* predictor change GetInputType to GetInputTypes

* predictor add tester


fix infer tans scope (#45203) · 2d0bb2c3

Committed by JingZhuangzhuang on Aug 18, 2022

* fix infer tans scop

* fix infer trans scope

* fic infer trans scope

* fic infer trans scope
Co-authored-by: dingjiawei <327396238@qq.com>

16 Aug, 2022 2 commits

convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI


memoptim and fp16 mixed precision (#45132) · fa890092
Committed by Wilber on Aug 16, 2022

15 Aug, 2022 1 commit
- fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
  Committed by Yuanle Liu on Aug 15, 2022

PaddlePaddle / Paddle: synced successfully more than 1 year ago