Commits · 07933116a1cab5deaa34d94b0ec9c67e9e9f4c04 · 机器未来 / Paddle

15 Sep, 2022 2 commits
- W
  
  General Plugin Mechanism (#45355) (#46070) · 07933116
  Committed by weishengying on Sep 15, 2022
  
  07933116
- Z
  fix trt multiclass_nms3 (#45166) (#46034) · 61a3e30b
  Committed by Zhang Jun on Sep 15, 2022
```
* Support dynamic shape in multiclass_nms3 Plugin for Paddle-TensorRT.
```
  61a3e30b
14 Sep, 2022 2 commits
- J
  cherry pick delay tensorrt log (#45958) · 2ca65904
  Committed by JingZhuangzhuang on Sep 14, 2022
```
* cherry pick delay tensorrt log
* Update trt_plugin.h
```
  2ca65904
- W
  
  Fix compile (#45996) (#46027) · 9d5003dc
  Committed by wenbin on Sep 14, 2022
  
  9d5003dc
08 Sep, 2022 5 commits
- W
  
  fix select fp16 kernel. (#45882) · c80d01bf
  Committed by Wilber on Sep 08, 2022
  
  c80d01bf
- W
  
  copyright (#45866) · 38325636
  Committed by wenbin on Sep 08, 2022
  
  38325636
- A
  
  [OpAttr]Refine Teller logic if encounter OpDesc with Variable type Attribute (#45874) · 749667e5
  Committed by Aurelius84 on Sep 08, 2022
  
  749667e5
- A
  [OpAttr]Refine Teller logic if encounter OpDesc with Variable type Attribute (#45795) · a642365e
  Committed by Aurelius84 on Sep 08, 2022
```
* [OpAttr]Refine Teller logic if encounter OpDesc with Variable type Attribute

* fix iterator

* fix typo

* fix lambda expr

* fix ptr
```
  a642365e
- W
  
  bug fix (#45853) · b971ba04
  Committed by wenbin on Sep 08, 2022
  
  b971ba04
07 Sep, 2022 2 commits

Optimiza params sync between CPU and GPU. (#45805) · a2b2af90
Committed by Wilber on Sep 07, 2022
```
* enable memory optimize when fp16.

* optimiza params sync between cpu and gpu.
```
a2b2af90

Layernorm shift partition (#45736) · 960109af

Committed by wenbin on Sep 07, 2022

* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT

960109af

06 Sep, 2022 3 commits
- W
  
  enable memory optimize when fp16. (#45792) · 1967c6a6
  Committed by Wilber on Sep 06, 2022
  
  1967c6a6
- L
  [TRT] Add silu converter (#45588) · dd0f9b96
  Committed by LielinJiang on Sep 06, 2022
```
* add silu converter
```
  dd0f9b96
- W
  [Paddle-Inference] remove int8 fallback (#45762) · 31efe00a
  Committed by Wangzheee on Sep 06, 2022
```
* remove int8 fallback
```
  31efe00a
05 Sep, 2022 2 commits

New format quant model support for MKLDNN (#45416) · 4e4f4586

Committed by yeliang2258 on Sep 05, 2022

* support onnx format quantized model

* update code

* add test

* add test

* fix

* fix test

* fix cmake

* update code

* change scale file path to calibration file path

* update code

* update code

* fix build bug

* fix build bugs

* fix

* fix

4e4f4586

Update DlNNE engine (#45027) · 638965c5

Committed by denglin-github on Sep 05, 2022

* add config param for enable_dlnne and support calibration mode
* remove useless file
* refine code and add annotation
* refine code of Warnning tips

638965c5

02 Sep, 2022 2 commits
- S
  
  enable add passes pre-calculating scales/quantizing weights (#44680) · cdb36da4
  Committed by Sylwester Fraczek on Sep 02, 2022
  
  cdb36da4
- F
  padding the length of input for vit_attention (#45506) · f79be656
  Committed by feng_shuai on Sep 02, 2022
```
* vit_384_opt

* just support trt8

* padding + unpadding

* fix:unit test

* refactor:padding

* fix: change the position of round_up

* refactor: delete workspace
```
  f79be656
30 Aug, 2022 1 commit
- Z
  [Paddle-TRT] constant-folding (#45494) · 97f43a8e
  Committed by zhoutianzi666 on Aug 30, 2022
```
add constant folding pass, for some model, it will get less latency;
```
  97f43a8e
29 Aug, 2022 1 commit
- Y
  
  TensorRT Engine context memory bind with predictor id (#45468) · 02621079
  Committed by Yuanle Liu on Aug 29, 2022
  
  02621079
26 Aug, 2022 2 commits
- W
  Layernorm shape bugfix (#45431) · 3ca8cf44
  Committed by Wang Bojun on Aug 26, 2022
```
* fix bug fix

* add shape size check

* polish code

* multi -1 shape fix

* code style improve

* bug fix

* code style fix
```
  3ca8cf44
- W
  
  fix_multihead (#45429) · fa06d9c3
  Committed by Wangzheee on Aug 26, 2022
  
  fa06d9c3
25 Aug, 2022 2 commits
- W
  
  fix params sync multi times problem (#45406) · 20d38664
  Committed by Wilber on Aug 25, 2022
  
  20d38664
- Z
  
  enforce_reshape (#45386) · 0bf40070
  Committed by zhoutianzi666 on Aug 25, 2022
  
  0bf40070
24 Aug, 2022 3 commits
- W
  fix mean/variance shape infer bug during loop call of dynamic trt enqueue (#45387) · 4e3f0b95
  Committed by Wang Bojun on Aug 24, 2022
```
* fix bug fix
```
  4e3f0b95
- Y
  
  fix op_teller with_dynamic_shape judge bug (#45384) · 9e0baf6e
  Committed by Yuanle Liu on Aug 24, 2022
  
  9e0baf6e
- W
  
  fix convert weight failed. (#45346) · 3d514e48
  Committed by Wilber on Aug 24, 2022
  
  3d514e48
22 Aug, 2022 4 commits
- J
  Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38
  Committed by joanna.wozna.intel on Aug 22, 2022
```
* Add int8 support for matmul+elementwiae_add fuse

* Corrections after review and ernie test fix
```
  9e5f3a38
- S
  Extend conv_concat_relu to support all activations (#45089) · d03ef054
  Committed by Sławomir Siwek on Aug 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
  d03ef054
- Z
  
  [Paddle-TRT] support output_padding in conv2d_transpose and conv3d_transpose (#45004) · 25d25b00
  Committed by zhoutianzi666 on Aug 22, 2022
  
  25d25b00
- Y
  
  remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6
  Committed by Yuanle Liu on Aug 22, 2022
  
  25d58db6
19 Aug, 2022 2 commits
- W
  fix layernormTrt meanVar alloc bug (#45255) · 6fb34e74
  Committed by Wang Bojun on Aug 19, 2022
```
* fix layernormTrt meanVar alloc bug
```
  6fb34e74
- W
  Trt groupnorm dynamic plugin (#44911) · 1aa6adb1
  Committed by Wang Bojun on Aug 19, 2022
```
* add group_norm dyanmic plugin
```
  1aa6adb1
18 Aug, 2022 2 commits

[inference]predictor add GetInputType interface (#45143) · a8ae87f1

Committed by heliqi on Aug 18, 2022

* predictor add GetInputType interface

* predictor change GetInputType to GetInputTypes

* predictor add tester

* predictor add tester

* predictor change GetInputType to GetInputTypes

* predictor change GetInputType to GetInputTypes

* predictor add tester

a8ae87f1

fix infer tans scope (#45203) · 2d0bb2c3

Committed by JingZhuangzhuang on Aug 18, 2022

* fix infer tans scop

* fix infer trans scope

* fic infer trans scope

* fic infer trans scope
Co-authored-by: Ndingjiawei <327396238@qq.com>

2d0bb2c3

16 Aug, 2022 2 commits

convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

f706d95d

memoptim and fp16 mixed precision (#45132) · fa890092
Committed by Wilber on Aug 16, 2022

fa890092

15 Aug, 2022 3 commits

fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
Committed by Yuanle Liu on Aug 15, 2022

ac0553a0

Refine TRT unit test (#45102) · 3512bf11

Committed by zlsh80826 on Aug 15, 2022

* Reduce pool2d test configuration

* Reduce depthwise_conv2d test configuration

* Reduce trt_convert_conv2d_fusion test configuration

* Reduce trt_convert_conv2d test configuration

* Reduce trt_convert_conv2d_transpose test configuration

* Reduce trt_convert_hard_swish test configuration

* Enhance trt auto scan test error message and mechanism

* Increase FP16 trt ut tolerance

3512bf11

convert_fp16 support multi block (#45050) · 9aecf286
Committed by Wilber on Aug 15, 2022
```
* convert_fp16 support multi block

* update

* update
```
9aecf286

机器未来 / Paddle is in sync with the fork source project