Commits · 1cfcb71d980075a448981c1be1122d7032cdc39c · PaddlePaddle / Paddle

21 Feb, 2023 · 2 commits

[PHI Decoupling]Remove memory header (Part1) (#50419) · 1cfcb71d

Committed by YuanRisheng on Feb 21, 2023

* decouple_memory

* perfect memory utils

* fix ci bugs

* fix inference bugs

* fix custom test bugs

* fix coverage bugs

* modify code according comment

* modify namespace

* deal with compile bugs

Optimize the ernie inference performance on xpu backend. (#50357) · b39afb13

Committed by csy0225 on Feb 21, 2023

* Optimize the ernie inference performance on xpu

* fix enable runtime cache logic

* when op's input shape has changed, should create a new runtime context

* fix

* set flag when input shape has changed

17 Feb, 2023 · 1 commit

[XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass, generate_sequence_xpu kernel (#50570) · 61469eec

Committed by zhupengyang on Feb 17, 2023

16 Feb, 2023 · 3 commits

Add matmul_v2 and fused_matmul to the quantization process and adjust Ernie model test (#50354) · 8686a745

Committed by joanna.wozna.intel on Feb 16, 2023

* Add matmul_v2 to the quantization process and adjust Ernie model test

* Correct cpu_quantize_pass test

* Move op to fuse transformation to placement pass

* Correct test

[XPU][Fleet] Support multi-card infer for xpu (#50490) · 517d8074

Committed by shentanyue on Feb 16, 2023

* support xpu multi-card infer

* add ut

* clean code

* clean code

* fix

* fix

* fix

* fix

[XPU] fix dropout pass; add multi_encoder_xpu_fuse_pass & multi_encoder_xpu kernel (#50499) · c8aa6405

Committed by zhupengyang on Feb 16, 2023

14 Feb, 2023 · 2 commits

Expand mixed_precision to custom device (#50378) · fcb746cb

Committed by duanyanhui on Feb 14, 2023

* expand mix_precision to custom_device

* fix bug

* fix bug

* fix comment

* fix DEFINE bug

add setvalue trt converter (#50341) · 2548657e

Committed by xjmxyt on Feb 14, 2023

* add cast setvalue op

* add set_value to op teller

* renew test and add description

* add setAxis and add complex test

* change test

11 Feb, 2023 · 1 commit

[TRT] elementwise_add+transpose fusion (#50081) · fd0d4fa4

Committed by Wang Bojun on Feb 11, 2023

* eleadd_trans first version

log fix

* refine code for linear format, add pass check

* linear format refine and ut fix

* fix ut

* windows ut

* windows ut 2

* move tensorMeta and alloc to configure

10 Feb, 2023 · 1 commit

[XPU] add fc_xpu op&pass to optimize ernie model (#50277) · 945f918c

Committed by zhupengyang on Feb 10, 2023

09 Feb, 2023 · 3 commits

[trt][inference]support int64 shapetensor as engine input (#50170) · 14a92c8c

Committed by Zhang Jun on Feb 09, 2023

* update

* support int64 shape tensor as engine input

* add inference_predictor ut

Adjust mkldnn_placement_pass to check library type and data type (#49899) · ebdf3ef9

Committed by joanna.wozna.intel on Feb 09, 2023

* Adjust mkldnn_placement_pass to check library type and data type

* Check if var has inputs

* Remove unrelated test

* Refactor

[TRT] Transpose layernorm fusion with different input format (#50082) · b2bb7ec9

Committed by Wang Bojun on Feb 09, 2023

* trans_layernorm

08 Feb, 2023 · 1 commit

fuse quantize+transpose and transpose+dequantize (#49509) · 197a4ffe

Committed by Paulina Gacek on Feb 08, 2023

* QuantTranspose pattern is being found by pass

* quant + transpose fuse

* code style changes

* UT written, reorder fixed

* Dequantize + transpose2 fuse added

* pass name changed

* UT added & shift corrected

* got rid of redundancy

* review changes

* AsIntermediate corrected

* compat added

06 Feb, 2023 · 1 commit

Disable conv2d_fusion_layout_transfer_pass temporarily (#50232) · 95fed8e8

Committed by Yuanle Liu on Feb 06, 2023

* disable conv2d_fusion_layout_transfer_pass temporarily

* disable conv2d_fusion_layout_transfer_pass temporarily

01 Feb, 2023 · 2 commits

Preln fix (#49802) · e03718f5

Committed by Wang Bojun on Feb 01, 2023

* preln_residual 2 fused_bias_residual

* skip layernorm fix and ut

* code refine

* code style refine

* fix ut

* fix output

* add trt layer fall back info

* refine op teller and ut

* DropoutMaskOut output fix

jit layer support multi thread and fix predictor clone (#50095) · 9fa2eb38

Committed by Hui Zhang on Feb 01, 2023

* jit layer support multi thread

* fix bug

* clone predictor not do graph optimizer

* format

* fix comment and format

* fix override and format

* fix

* fix

31 Jan, 2023 · 2 commits

gn_silu (#49928) · 111075a3

Committed by wenbin on Jan 31, 2023

* gn_silu

* add ut

* set TIMEOUT

* correct comments

* comments

* disable windows ut

* rename parameter

[Paddle Inference] change the default values of some gflags (#50074) · a1f28a48

Committed by Yuanle Liu on Jan 31, 2023

19 Jan, 2023 · 1 commit

[Paddle Inference]Support PaddlePaddle Backend on Triton (#49758) · e3f39833

Committed by heliqi on Jan 19, 2023

* support PaddlePaddle Backend on Triton

* fix test cases

* fix Codestyle

* add test case

* add test case

17 Jan, 2023 · 1 commit

[PHI]Change feed_op to phi kernel (#49116) · f7f1dc03

Committed by YuanRisheng on Jan 17, 2023

* change feed_op to phi kernel

* fix ci bugs

* fix build bugs

* fix ci bugs

* fix compile bugs

* fix ci bugs

* perfect code

* perfect comment code

* fix install bugs

* modify code according comment

* remove visitor in feed_op

* modify according comment

* perfect code according comment

* add infershape

* fix py3 bugs

* fix getexpected kernel type

* fix getexpected kernel type

* fix ci bugs

* add registry for custom device

* fix py3 bugs

* fix floating point error

* fix py3 test bugs

16 Jan, 2023 · 2 commits

[Paddle-TRT] support nhwc (#49633) · e43f7102

Committed by Yuanle Liu on Jan 16, 2023

* add trt_support_nhwc_pass

add gpu_cpu_map_matmul_to_mul_pass to kGpuLowerPrecisionPasses (#49753) · 07514139

Committed by Yuanle Liu on Jan 16, 2023

* add gpu_cpu_map_matmul_to_mul_pass to kGpuLowerPrecisionPasses

* disable fc_elementwise_layernorm_fuse_pass in mixed precision

13 Jan, 2023 · 1 commit

add oss flash fmha and fmhca support (#49438) · a48b8e2c

Committed by Wang Bojun on Jan 13, 2023

* add fmha_flashattention oss plugin

11 Jan, 2023 · 1 commit

fix paddle_infer_contrib include (#49720) · 24f5c46e

Committed by zhangxin81 on Jan 11, 2023

* fix paddle_infer_contrib include

10 Jan, 2023 · 2 commits

add_paddle_test (#49640) · f4d267c2

Committed by xiaoxiaohehe001 on Jan 10, 2023

Add reduce_min prod trt converter (#49615) · 13992de7

Committed by Sanbu on Jan 10, 2023

09 Jan, 2023 · 2 commits

Preln groupnorm (#49463) · 591be3bd

Committed by wenbin on Jan 09, 2023

* skip_groupnorm

* init

* preln

* add ut

* more assert

* set timeout

* fix windows ci issue

Unify the pass of the map class (#49568) · ee49994f

Committed by gem5 on Jan 09, 2023

06 Jan, 2023 · 2 commits

[Inference] fix pass_builder (#49595) · 44cb3da3

Committed by Yuanle Liu on Jan 06, 2023

fix trt engine memory sharing (#49584) · 1e8976e8

Committed by Yuanle Liu on Jan 06, 2023

05 Jan, 2023 · 2 commits

[Inference] inplace all reshape op (#49146) · 017af746

Committed by Wilber on Jan 05, 2023

[Paddle Inference] add unit test for zero_copy_tensor with bool type (#49495) · 8705a79d

Committed by Yuanle Liu on Jan 05, 2023

04 Jan, 2023 · 1 commit

add multi_devices_fused_multi_transformer_encoder_pass and cherry-pick from 48349 (#49383) · 29eec2dd

Committed by lzy on Jan 04, 2023

03 Jan, 2023 · 3 commits

[Paddle Inference] enhance paddle_infer::Tensor data type (#49388) · dc13f7c5

Committed by Yuanle Liu on Jan 03, 2023

[Paddle Inference] Implement conv2d_fusion NHWC format using cutlass (#47989) · c123dd1e

Committed by zhoutianzi666 on Jan 03, 2023

* Implement conv2d_fusion NHWC format using CUTLASS

* Add unit testing for CUTLASS Conv in inference

* Add experimental API for CUTLASS.

Add not_equal trt converter (#49393) · 822ea0f9

Committed by Sanbu on Jan 03, 2023

28 Dec, 2022 · 1 commit

update some trt log (#49330) · 02019804

Committed by Yuanle Liu on Dec 28, 2022

22 Dec, 2022 · 1 commit

Enable identity_scale_op_clean_pass by default (#49227) · 9dac1e71

Committed by gem5 on Dec 22, 2022

21 Dec, 2022 · 1 commit

Refactor Pass for fused_conv (#48848) · 7f0eb2e3

Committed by zyfncg on Dec 21, 2022

* refactor conv_activation_mkldnn_fuse_pass

* refactor conv_affine_channel_mkldnn_fuse_pass

* fix conv_activation_mkldnn_fuse_pass

* fix mkldnn unittest

* refactor int8_scale_calculation_mkldnn_pass and params_quantization_mkldnn_pass

* refactor conv_elementwise_add_mkldnn_fuse_pass

* fix quant

* refactor conv_bn_fuse_pass

* fix conv_bn_fuse_pass

* refactor depthwise_conv_bn_fuse_pass

* fix unittest

* fix conv_bn_fuse_pass

* remove redundant conv2d in params_quantization_mkldnn_pass

* fix params_quantization_mkldnn_pass_tester

PaddlePaddle / Paddle · synced successfully more than 1 year ago
