Commits · 61469eec0bee98e1bd65ba54e99fe39998ded605 · PaddlePaddle / Paddle

17 Feb, 2023 1 commit
- Z
  [XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass,... · 61469eec
  Authored by zhupengyang on Feb 17, 2023
```
[XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass, generate_sequence_xpu kernel (#50570)
```
  61469eec
16 Feb, 2023 5 commits
- J
  Add matmul_v2 and fused_matmul to the quantization process and adjust Ernie model test (#50354) · 8686a745
  Authored by joanna.wozna.intel on Feb 16, 2023
```
* Add matmul_v2 to the quantization process and adjust Ernie model test

* Correct cpu_quantize_pass test

* Move op to fuse transformation to placement pass

* Correct test
```
  8686a745
- T
  
  Export paddle_proto symbols (#50031) · dd1410d7
  Authored by Tomasz Socha on Feb 16, 2023
  
  dd1410d7
- S
  [XPU][Fleet] Support multi-card infer for xpu (#50490) · 517d8074
  Authored by shentanyue on Feb 16, 2023
```
* support xpu multi-card infer

* add ut

* clean code

* clean code

* fix

* fix

* fix

* fix
```
  517d8074
- H
  [Phi decouple] move layer_norm_kernel.cu.h to phi (#50506) · 8910bb4a
  Authored by Huang Jiyi on Feb 16, 2023
```
* move layer_norm_kernel.cu.h to phi

* fix bugs

* fix namespace

* fix bugs

* fix CI-Windwos

* replace mutable_data

* fix bugs

* fix bugs
```
  8910bb4a
- Z
  
  [XPU] fix dropout pass; add multi_encoder_xpu_fuse_pass & multi_encoder_xpu kernel (#50499) · c8aa6405
  Authored by zhupengyang on Feb 16, 2023
  
  c8aa6405
14 Feb, 2023 2 commits

Expand mixed_precision to custom device (#50378) · fcb746cb
Authored by duanyanhui on Feb 14, 2023
```
* expand mix_precision to custom_device

* fix bug

* fix bug

* fix comment

* fix DEFINE bug
```
fcb746cb

add setvalue trt converter (#50341) · 2548657e

Authored by xjmxyt on Feb 14, 2023

* add cast setvalue op

* add set_value to op teller

* renew test and add description

* add setAxis and add complex test

* change test

2548657e

11 Feb, 2023 1 commit

[TRT] elementwise_add+transpose fusion (#50081) · fd0d4fa4

Authored by Wang Bojun on Feb 11, 2023

* eleadd_trans first version

log fix

* refine code for linear format, add pass check

* linear format refine and ut fix

* fix ut

* windows ut

* windows ut 2

* move tensorMeta and alloc to configure

fd0d4fa4

10 Feb, 2023 1 commit
- Z
  
  [XPU] add fc_xpu op&pass to optimize ernie model (#50277) · 945f918c
  Authored by zhupengyang on Feb 10, 2023
  
  945f918c
09 Feb, 2023 4 commits

[trt][inference]support int64 shapetensor as engine input (#50170) · 14a92c8c
Authored by Zhang Jun on Feb 09, 2023
```
* update

* support int64 shape tensor as engine input

* add inference_predictor ut
```
14a92c8c
Adjust mkldnn_placement_pass to check library type and data type (#49899) · ebdf3ef9
Authored by joanna.wozna.intel on Feb 09, 2023
```
* Adjust mkldnn_placement_pass to check library type and data type

* Check if var has inputs

* Remove unrelated test

* Refactor
```
ebdf3ef9

[Paddle-TRT] GroupNorm int8 nchw32 fake kernel (#50146) · d93c63a0

Authored by zhoutianzi666 on Feb 09, 2023

* add fmha_flashattention oss plugin

* add fmhca

* add oss fmhca

* code reconstruct and add ut

* code style refine

* fix ut and enforce check

* refine trt version check

refine compile

fix compile

* fix cross ut

* code refine

* use runtime trt version check

* bug fix and code refine

* compile fix

* merge develop

* add GN QDQ kernel

* support GN int8 fake kernel

* add with_int8

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8  UT

* add verison > 8000  in GN int8  UT

* add some check in .cu

* add stdlib.h in UT

* little change  in .cu

* remove rand_r use rand

* remove use rand

* setAxis(1)

* when int8 is on allow fall back to fp16

---------
Co-authored-by: wwbitejotunn <wang_bojun@outlook.com>

d93c63a0

[TRT] Transpose layernorm fusion with different input format (#50082) · b2bb7ec9
Authored by Wang Bojun on Feb 09, 2023
```
* trans_layernorm
```
b2bb7ec9

08 Feb, 2023 4 commits

fuse quantize+transpose and transpose+dequantize (#49509) · 197a4ffe

Authored by Paulina Gacek on Feb 08, 2023

* QuantTranpose pattern is being found by pass

* quant + transpose fuse

* code style changes

* UT written, reorder fixed

* Dequantize + transpose2 fuse  added

* pass name changed

* UT added & shift corrected

* got rid of redundancy

* review changes

* AsIntermediate corrected

* compat added

197a4ffe

[inference][trt] Disable ShapeTensor for nearest_interp_v2 when trt version < 8.2 (#50258) · fa284076
Authored by Zhang Jun on Feb 08, 2023
```
* update

* update

* format code

* update

* Update test_trt_convert_nearest_interp_v2.py
```
fa284076

Export custom operator-related function symbols (#50238) · f9c801ff
Authored by weishengying on Feb 08, 2023

f9c801ff

[Paddle-TRT] remove engine info from RumImpl process (#50181) · b3888614

Authored by gaoziyuan on Feb 08, 2023

* remove_engine_info

* remove_engine_info

* remove_engine_info

* change trtlayerinformation line to json

---------
Co-authored-by: gaoziyuan <gaoziyuan@baidu.com>

b3888614

07 Feb, 2023 1 commit
- X
  
  [Paddle Inference] Fix range fp32 input below trt 8.4. (#50221) · 766a4ca9
  Authored by xiaoxiaohehe001 on Feb 07, 2023
  
  766a4ca9
06 Feb, 2023 3 commits
- Y
  Disable conv2d_fusion_layout_transfer_pass temporarily (#50232) · 95fed8e8
  Authored by Yuanle Liu on Feb 06, 2023
```
* disable conv2d_fusion_layout_transfer_pass temporarily

* disable conv2d_fusion_layout_transfer_pass temporarily
```
  95fed8e8
- W
  
  fix param (#50226) · 708f6e79
  Authored by wenbin on Feb 06, 2023
  
  708f6e79
- X
  [Paddle Inference] Add Hasattri check of op teller. (#50110) · 4b4d92ea
  Authored by xiaoxiaohehe001 on Feb 06, 2023
```
* add_hasattri_check

* add_hasattri_check
```
  4b4d92ea
01 Feb, 2023 2 commits

Preln fix (#49802) · e03718f5

Authored by Wang Bojun on Feb 01, 2023

* preln_residual 2 fused_bias_residual

* skip layernorm fix and ut

* code refine

* code style refine

* fix ut

* fix output

* add trt layer fall back info

* refine op teller and ut

* DropoutMaskOut output fix

e03718f5

jit layer support multi thread and fix predictor clone (#50095) · 9fa2eb38

Authored by Hui Zhang on Feb 01, 2023

* jit layer support multi thread

* fix bug

* clone prediector not do graph optimizer

* format

* fix comment and format

* fix override and fromat

* fix

* fix

9fa2eb38

31 Jan, 2023 5 commits
- W
  gn_silu (#49928) · 111075a3
  Authored by wenbin on Jan 31, 2023
```
* gn_silu

* add ut

* set TIMEOUT

* correct comments

* comments

* disable windows ut

* rename parameter
```
  111075a3
- W
  Unary (#49914) · 0d9185b9
  Authored by wenbin on Jan 31, 2023
```
* disable integer

* disable integer

* add cast layer
```
  0d9185b9
- Z
  
  [inference][trt] add elementwise input data type check (#49675) · 5822e15c
  Authored by Zhang Jun on Jan 31, 2023
  
  5822e15c
- Y
  
  [Paddle Inference] change the default values of some gflags (#50074) · a1f28a48
  Authored by Yuanle Liu on Jan 31, 2023
  
  a1f28a48
- H
  [Decouple phi] Decouple custom_op in fluid and phi (#49866) · 48b3e869
  Authored by HongyuJia on Jan 31, 2023
```
* decouple phi custom_op

* decouple phi custom_op, remove codes

* delete custom symbol of inference
```
  48b3e869
19 Jan, 2023 1 commit
- H
  [Paddle Inference]Support PaddlePaddle Backend on Triton (#49758) · e3f39833
  Authored by heliqi on Jan 19, 2023
```
* support PaddlePaddle Backend on Triton

* fix test cases

* fix Codestyle

* add test case

* add test case
```
  e3f39833
18 Jan, 2023 1 commit
- W
  fix cast issue (#49909) · 55ccb429
  Authored by wenbin on Jan 18, 2023
```
* fix cast issue

* add ut
```
  55ccb429
17 Jan, 2023 1 commit

[PHI]Change feed_op to phi kernel (#49116) · f7f1dc03

Authored by YuanRisheng on Jan 17, 2023

* change feed_op to phi kernel

* fix ci bugs

* fix build bugs

* fix ci bugs

* fix compile bugs

* fix ci bugs

* perfect code

* perfect comment code

* fix install bugs

* modify code according comment

* remove visitor in feed_op

* modify according comment

* perfect code according comment

* add infershape

* fix py3 bugs

* fix getexpected kernel type

* fix getexpected kernel type

* fix ci bugs

* add registry for custom device

* fix py3 bugs

* fix floating point error

* fix py3 test bugs

f7f1dc03

16 Jan, 2023 3 commits
- Z
  [inference] Use output var name to mark the NVTX flag (#49825) · ea2e2495
  Authored by Zhang Jun on Jan 16, 2023
```
* add outvar name for nvtx mark

* nly network created with kEXPLICIT_BATCH can setsetMaxBatchSize
```
  ea2e2495
- Y
  [Paddle-TRT] support nhwc (#49633) · e43f7102
  Authored by Yuanle Liu on Jan 16, 2023
```
* add trt_support_nhwc_pass
```
  e43f7102
- Y
  add gpu_cpu_map_matmul_to_mul_pass to kGpuLowerPrecisionPasses (#49753) · 07514139
  Authored by Yuanle Liu on Jan 16, 2023
```
* add gpu_cpu_map_matmul_to_mul_pass to kGpuLowerPrecisionPasses

* disable fc_elementwise_layernorm_fuse_pass in mixed precision
```
  07514139
13 Jan, 2023 3 commits

add oss flash fmha and fmhca support (#49438) · a48b8e2c
Authored by Wang Bojun on Jan 13, 2023
```
* add fmha_flashattention oss plugin
```
a48b8e2c

[inference][trt]set output data type of trt network (#49712) · 690d7a69

Authored by Zhang Jun on Jan 13, 2023

* update trt engine to set in/out data type

* update

* Update engine.cc

* Update engine.cc

* update

* set engine output type before freeze the network

* update

* update trt autoscan ut

* update

* update ut

* fix equal bug, update ut

* fix cast and equal ut

* update cast ut using TRT < 8.4

* set datatype from scope

* check output var is nullptr

* Update op_converter.h

* update tensorrt_engine_op_test ut

* update

690d7a69


[Inference] Update exported symbols. (#47538) · 43ec2271
Authored by Wilber on Jan 13, 2023

43ec2271

12 Jan, 2023 2 commits
- X
  
  fix_arg (#49770) · 5d60ff91
  Authored by xiaoxiaohehe001 on Jan 12, 2023
  
  5d60ff91
- W
  more preln_gn patterns (#49728) · adcb0039
  Authored by wenbin on Jan 12, 2023
```
* compile fix

* fix compile

* compile fix

* add more preln
```
  adcb0039

PaddlePaddle / Paddle synced successfully about 1 year ago
