Commits · 8f156fd7f2c4cff08a2b93196d288bfc15514c0d · PaddlePaddle / Paddle

02 Mar, 2023 2 commits
- [Hackathon NO.74] Add grid_sampler operator for Paddle-TRT (#50934) · 8f156fd7
  Committed by gaoziyuan on Mar 02, 2023
- process multiple conv2d_fusion shares weight (#51068) · ae60105d
  Committed by Yuanle Liu on Mar 02, 2023
01 Mar, 2023 1 commit
- [XPU] delete op device (#51029) · c9309942
  Committed by zhupengyang on Mar 01, 2023
28 Feb, 2023 4 commits
- Add gru qat int8 test (#50846) · a0562813
  Committed by joanna.wozna.intel on Feb 28, 2023
```
* Add gru qat int8 test

* Change place of model downloading

* Update paddle/fluid/inference/tests/api/CMakeLists.txt
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Correct flags names and add description

---------
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
```
- [XPU] support convert fp16 model (#50790) · f265a313
  Committed by zhupengyang on Feb 28, 2023
- fix gflags from environment not activated (#50864) · a8fff38f
  Committed by Yuanle Liu on Feb 28, 2023
- fix concat axis bug (#50951) · 75a2f9d5
  Committed by wenbin on Feb 28, 2023
```
* fix concat bug

* recommit for ci
```
27 Feb, 2023 2 commits
- handle trt engine deserialization failure and rebuild (#50775) · 377cbcea
  Committed by Zhang Jun on Feb 27, 2023
- change message info (#50546) · 097402d9
  Committed by gaoziyuan on Feb 27, 2023
24 Feb, 2023 3 commits
- [Paddle-TRT] allow plugin fall back to fp16 when int8 (#50554) · f24eadd9
  Committed by zhoutianzi666 on Feb 24, 2023
```
* allow fall back to fp16 when int8

* refine code

* refine code

* refine code
```
- Fix libpaddle_inference.so symbol conflicts with other .so (gflags) (#50787) · 041ea14c
  Committed by Yuanle Liu on Feb 24, 2023
- [Paddle-TRT] Fix QkvToContextPluginDynamic bug (#50715) · 612d5da0
  Committed by zhoutianzi666 on Feb 24, 2023
```
* fix multihead

* fix multihead
```
23 Feb, 2023 3 commits
- [XPU] Migrate xpu_embedding_with_eltwise_add_fuse_pass (#50590) · 8d325d82
  Committed by csy0225 on Feb 23, 2023
- [phi decoupling] move generator implementation from fluid to phi (#50746) · 4e417409
  Committed by Huang Jiyi on Feb 23, 2023
```
* move fluid generator to phi

* move fluid generator to phi

* update .gitignore

* fix bugs

* fix cannot find "glog/logging.h" in "generator.h"

* fix bugs
```
- [XPU] optimize multi_encoder_xpu_pass (#50759) · 5c9299e5
  Committed by zhupengyang on Feb 23, 2023
22 Feb, 2023 2 commits
- Fix some typos. (#50429) · 93b2bf4b
  Committed by Shuangchi He on Feb 22, 2023
```
* Fix some typos.
Signed-off-by: Yulv-git <yulvchi@qq.com>

* pre-commit
Signed-off-by: Yulv-git <yulvchi@qq.com>

---------
Signed-off-by: Yulv-git <yulvchi@qq.com>
```
- [XPU] link out_max to x_max between xpu_fusion_ops (#50690) · 1fd1c169
  Committed by zhupengyang on Feb 22, 2023
21 Feb, 2023 2 commits

[PHI Decoupling]Remove memory header (Part1) (#50419) · 1cfcb71d

Committed by YuanRisheng on Feb 21, 2023

* decouple_memory

* perfect memory utils

* fix ci bugs

* fix inference bugs

* fix custom test bugs

* fix converage bugs

* modify code according comment

* modify namespace

* deal with compile bugs

Optimize the ernie inference performance on xpu backend. (#50357) · b39afb13

Committed by csy0225 on Feb 21, 2023

* Optimize the ernie inference performance on xpu

* fix enable runtime cache logic

* when op's input shape has changed, should create a new runtime context

* fix

* set flag when input shape has changed

20 Feb, 2023 1 commit
- fix mutable_data() (#50396) · c47f11f5
  Committed by Wang Bojun on Feb 20, 2023
17 Feb, 2023 1 commit
- [XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass,... · 61469eec
  Committed by zhupengyang on Feb 17, 2023
```
[XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass, generate_sequence_xpu kernel (#50570)
```
16 Feb, 2023 5 commits
- Add matmul_v2 and fused_matmul to the quantization process and adjust Ernie model test (#50354) · 8686a745
  Committed by joanna.wozna.intel on Feb 16, 2023
```
* Add matmul_v2 to the quantization process and adjust Ernie model test

* Correct cpu_quantize_pass test

* Move op to fuse transformation to placement pass

* Correct test
```
- Export paddle_proto symbols (#50031) · dd1410d7
  Committed by Tomasz Socha on Feb 16, 2023
- [XPU][Fleet] Support multi-card infer for xpu (#50490) · 517d8074
  Committed by shentanyue on Feb 16, 2023
```
* support xpu multi-card infer

* add ut

* clean code

* clean code

* fix

* fix

* fix

* fix
```
- [Phi decouple] move layer_norm_kernel.cu.h to phi (#50506) · 8910bb4a
  Committed by Huang Jiyi on Feb 16, 2023
```
* move layer_norm_kernel.cu.h to phi

* fix bugs

* fix namespace

* fix bugs

* fix CI-Windwos

* replace mutable_data

* fix bugs

* fix bugs
```
- [XPU] fix dropout pass; add multi_encoder_xpu_fuse_pass & multi_encoder_xpu kernel (#50499) · c8aa6405
  Committed by zhupengyang on Feb 16, 2023
14 Feb, 2023 2 commits
- Expand mixed_precision to custom device (#50378) · fcb746cb
  Committed by duanyanhui on Feb 14, 2023
```
* expand mix_precision to custom_device

* fix bug

* fix bug

* fix comment

* fix DEFINE bug
```

add setvalue trt converter (#50341) · 2548657e

Committed by xjmxyt on Feb 14, 2023

* add cast setvalue op

* add set_value to op teller

* renew test and add description

* add setAxis and add complex test

* change test

11 Feb, 2023 1 commit

[TRT] elementwise_add+transpose fusion (#50081) · fd0d4fa4

Committed by Wang Bojun on Feb 11, 2023

* eleadd_trans first version

log fix

* refine code for linear format, add pass check

* linear format refine and ut fix

* fix ut

* windows ut

* windows ut 2

* move tensorMeta and alloc to configure

10 Feb, 2023 1 commit
- [XPU] add fc_xpu op&pass to optimize ernie model (#50277) · 945f918c
  Committed by zhupengyang on Feb 10, 2023
09 Feb, 2023 4 commits
- [trt][inference]support int64 shapetensor as engine input (#50170) · 14a92c8c
  Committed by Zhang Jun on Feb 09, 2023
```
* update

* support int64 shape tensor as engine input

* add inference_predictor ut
```
- Adjust mkldnn_placement_pass to check library type and data type (#49899) · ebdf3ef9
  Committed by joanna.wozna.intel on Feb 09, 2023
```
* Adjust mkldnn_placement_pass to check library type and data type

* Check if var has inputs

* Remove unrelated test

* Refactor
```

[Paddle-TRT] GroupNorm int8 nchw32 fake kernel (#50146) · d93c63a0

Committed by zhoutianzi666 on Feb 09, 2023

* add fmha_flashattention oss plugin

* add fmhca

* add oss fmhca

* code reconstruct and add ut

* code style refine

* fix ut and enforce check

* refine trt version check

refine compile

fix compile

* fix cross ut

* code refine

* use runtime trt version check

* bug fix and code refine

* compile fix

* merge develop

* add GN QDQ kernel

* support GN int8 fake kernel

* add with_int8

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8  UT

* add verison > 8000  in GN int8  UT

* add some check in .cu

* add stdlib.h in UT

* little change  in .cu

* remove rand_r use rand

* remove use rand

* setAxis(1)

* when int8 is on allow fall back to fp16

---------
Co-authored-by: wwbitejotunn <wang_bojun@outlook.com>

- [TRT] Transpose layernorm fusion with different input format (#50082) · b2bb7ec9
  Committed by Wang Bojun on Feb 09, 2023
```
* trans_layernorm
```

08 Feb, 2023 4 commits

fuse quantize+transpose and transpose+dequantize (#49509) · 197a4ffe

Committed by Paulina Gacek on Feb 08, 2023

* QuantTranpose pattern is being found by pass

* quant + transpose fuse

* code style changes

* UT written, reorder fixed

* Dequantize + transpose2 fuse  added

* pass name changed

* UT added & shift corrected

* got rid of redundancy

* review changes

* AsIntermediate corrected

* compat added

- [inference][trt] Disable ShapeTensor for nearest_interp_v2 when trt version < 8.2 (#50258) · fa284076
  Committed by Zhang Jun on Feb 08, 2023
```
* update

* update

* format code

* update

* Update test_trt_convert_nearest_interp_v2.py
```
- Export custom operator-related function symbols (#50238) · f9c801ff
  Committed by weishengying on Feb 08, 2023

[Paddle-TRT] remove engine info from RumImpl process (#50181) · b3888614

Committed by gaoziyuan on Feb 08, 2023

* remove_engine_info

* remove_engine_info

* remove_engine_info

* change trtlayerinformation line to json

---------
Co-authored-by: gaoziyuan <gaoziyuan@baidu.com>

07 Feb, 2023 1 commit
- [Paddle Inference] Fix range fp32 input below trt 8.4. (#50221) · 766a4ca9
  Committed by xiaoxiaohehe001 on Feb 07, 2023
06 Feb, 2023 1 commit
- Disable conv2d_fusion_layout_transfer_pass temporarily (#50232) · 95fed8e8
  Committed by Yuanle Liu on Feb 06, 2023
```
* disable conv2d_fusion_layout_transfer_pass temporarily

* disable conv2d_fusion_layout_transfer_pass temporarily
```

PaddlePaddle / Paddle synced successfully about 1 year ago
