Commits · 5f9d6d6844f331ae27bb710e4a4b7bac14f3225c · PaddlePaddle / Paddle

24 Aug, 2023 · 2 commits

[NewIR]Add NOT_FOR_INFER to prune Inference Library Size and Split VJP CodeGen... · 5d43f5e4

Committed by Aurelius84 on Aug 24, 2023

[NewIR]Add NOT_FOR_INFER to prune Inference Library Size and Split VJP CodeGen into pd_op_vjp.cc (#56352)

* [NewIR]Prune Inference Library Size and Remove IR Dialect

* remove options

* add NOT_FOR_INFER

* fix pd_vjp.cc

* polish deps

* fix code style

* fix unittest

* fix cmake

* fix inference CI

[XPU] Add embedding plugin (#56488) · 2a5adc5a
Committed by csy0225 on Aug 24, 2023

23 Aug, 2023 · 2 commits

Integrate TRT qdq layers (#54803) · ae84c603

Committed by Leo Chen on Aug 23, 2023

* Integrate quantize/dequantize linear and add config for explicit quantization

* Fix the build error

* Add macro for TRT version < 8.0

* Remove qdq UT from windows

* Fix UT failure

* Check TRT version in qdq UT

* Test tensorrt_explicit_enabled API

* Disable QDQ UT if TRT version < 8.5

* Add quantization postfix into public APIs

* Apply code formatter

* Fix the UT failure for explicit quantization

* Apply code formatter on modified files

* Correct the year in copyright

Add fuse pass to remove duplicated transpose ops (#56326) · b8d7f801
Committed by Travis-Lee on Aug 23, 2023

22 Aug, 2023 · 1 commit
- [TRT] PrelnResidualBiasPluginDynamic Support 4D Inputs (#56304) · 338fb32b
  Committed by chen on Aug 22, 2023

21 Aug, 2023 · 1 commit
- add pad genetic plugin (#56037) · d38dde68
  Committed by chen on Aug 21, 2023

18 Aug, 2023 · 1 commit

[Inference] Make share_external_data supports bf16 and bool; fix while_op... · c65ef07c

Committed by lzy on Aug 18, 2023

[Inference] Make share_external_data supports bf16 and bool; fix while_op cache_inference_while_scope when using fleet_executor. (#56055)

* 1. make share_external_data supports bf16 and bool; 2. don't drop_kids when cache_inference_while_scope

* fix FLAGS_cache_inference_while_scope

* add unitest

* add unitest

* skip unitest when cudnn_version < 8100

* skip test share_external_data_bf16 when CUDA_ARCH < 80

17 Aug, 2023 · 1 commit

Add MarkTrtEngineOutputs API (#56188) · 2abf4326

Committed by ming1753 on Aug 17, 2023

* [paddle-TRT] support mark output

* [fix bug] hook function only call one in different predictor

* add api test

16 Aug, 2023 · 1 commit
- [XPU] Add fast_layernorm_xpu_fuse_pass and fast_layernorm_xpu plugin (#56269) · f16e1869
  Committed by jiangfan06 on Aug 16, 2023

15 Aug, 2023 · 1 commit

[Inference][Trt]fix bilinear_v2 (#56043) · a26a3a60

Committed by bukejiyu on Aug 15, 2023

* fix trt bilinear_interp_v2_op

* add trt 8.0 condition

* add trt 8.0 condition

test bilinear

add trt 8.0 condition

* code style

14 Aug, 2023 · 1 commit
- [clang-tidy] No.31 enable modernize-use-bool-literals (#56216) · 2c307457
  Committed by cyberslack_lee on Aug 14, 2023

10 Aug, 2023 · 1 commit
- [XPU] Add transfilter when conv2d op dilation > 1 (#55978) · 81c56e27
  Committed by csy0225 on Aug 10, 2023

09 Aug, 2023 · 2 commits
- [oneDNN]rename macro to PADDLE_WITH_DNNL (#52208) · 6ff4c130
  Committed by Xinyu Chen on Aug 09, 2023
```
* onednn: rename macro to PADDLE_WITH_DNNL

* onednn: rename macro to CINN_WITH_DNNL
```
- [clang-tidy] fix modernize-make-unique (#55764) · 9f04f2ac
  Committed by Ruibin Cheung on Aug 09, 2023

07 Aug, 2023 · 3 commits
- [Inference] save_optimized_model_pass support tensorrt (#55893) · 6b10c0e5
  Committed by Yuanle Liu on Aug 07, 2023
```
* fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward

* save_optimized_model_pass support tensorrt

* update

* update

* fix compile

* update

* fix ut timeout
```
- [clang-tidy] NO.6 enable `modernize-avoid-c-arrays` step: 2 (#55954) · 5ada98b8
  Committed by gouzil on Aug 07, 2023
- [clang-tidy] enable modernize-use-equals-default (#55983) · 30a02d27
  Committed by Ruibin Cheung on Aug 07, 2023

04 Aug, 2023 · 2 commits
- [clang-tidy] enable modernize-use-emplace (#55799) · 469a0392
  Committed by Ruibin Cheung on Aug 04, 2023
```
* [clang-tidy] enable modernize-use-emplace

* Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into modernize_use_emplace
```
- [clang-tidy] NO.12 enable modernize-use-nullptr check(#55800) · 1e4f627d
  Committed by Zhenghai Zhang on Aug 04, 2023

03 Aug, 2023 · 2 commits
- [clang-tidy] [No.4] enable `modernize-loop-convert` (#55704) · 81ccd99e
  Committed by Wang Xin on Aug 03, 2023
- eliminate small pattern (#55843) · dc4b48f6
  Committed by wz1qqx on Aug 03, 2023

02 Aug, 2023 · 3 commits

[XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
Committed by wz1qqx on Aug 02, 2023

[Inference] Replace groupNorm when data types are bf16 and fp16, and data... · e61d892a

Committed by yangjianfengo1 on Aug 02, 2023

[Inference] Replace groupNorm when data types are bf16 and fp16, and data format is NHWC implementation. (#55399)

* finish

* cpergroup odd

* fix bf16

* single channel

* code style

* jingdu duiqi

* add head_file

* add bf16 head file

* bf16 2

* bf16

* bf16 head

* bf16 compile

* py test

* bf16 compile

* bf16 compile

* unset py test

* nhwc

* test

* mean var

* bf16 success

* su

* ctest success

* use is_same_as

* is_same

* use is_same

* rtol

* gpu_stream

* del sigmod

* fix bfloat16 type

* use cuda_bf16_hpp

* use_cuda_arch

* bfloat162float2

* del inplace_tol

* del max_releative_tol

* temp store

* jingdu duiqi

* temp store

* plugin

* jingdu duiqi

* duiqi

* include cuda.h

* del half

* half single

* ci

* add const

* ci

* cudamemset

* del printf

* fp16 test

* add half compute

* del br16 ci

* del ci

* ci approve

* del fluid include

[XPU] Add gather_squeeze_pass (#55605) · d13a49d6
Committed by jiangfan06 on Aug 02, 2023

01 Aug, 2023 · 1 commit
- [XPU] Add fast_where fusion op and XPU micro kernel (#55628) · 07e788f1
  Committed by hong19860320 on Aug 01, 2023

27 Jul, 2023 · 2 commits
- [Paddle-TRT] add flip op (#55688) · d608170a
  Committed by ming1753 on Jul 27, 2023
```
* [Paddle-TRT] add flip op
```
- paddle-TRT support float64 (#55520) · 8b063030
  Committed by ming1753 on Jul 27, 2023
```
* Paddle-TRT support float64  in/out type, support fill_any_like_op in int64
```
24 Jul, 2023 · 2 commits

[Paddle-TRT] Convert 0D tensor to 1D tensor, increase the shape tensor's... · a3cf25e3

Committed by chen on Jul 24, 2023

[Paddle-TRT] Convert 0D tensor to 1D tensor, increase the shape tensor's number count when collecting shape (#55503)

* make 0-D tensor to 1-D tensor to support Grounding-SAM and add shape check

* recover identity_op_clean_pass.cc


onednn: remove fc_elementwise_add fusion (#55504) · bea1f04c

Committed by Xinyu Chen on Jul 24, 2023

* onednn: remove fc+eltwiseadd fusion pass
* onednn: remove post-sum fusion in fc kernel
* onednn: tests: make unfused add run into f32

21 Jul, 2023 · 3 commits
- [Inference] save_optimized_model_pass support gpu (#55551) · 4b3ac86d
  Committed by Yuanle Liu on Jul 21, 2023
```
* fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward

* save_optimized_model_pass support gpu
```
- [clang-tidy] enable modernize-make-unique (#55506) · 45d49619
  Committed by Ruibin Cheung on Jul 21, 2023
- Bugfix, CUB regression in CUDA 12.2 (#55594) · b2c797ad
  Committed by Jeng Bai-Cheng on Jul 21, 2023
```
Issue #55016
```
20 Jul, 2023 · 2 commits
- Fix UT failure (#55360) · 7eeff7b1
  Committed by Leo Chen on Jul 20, 2023
```
* Fix TRT multihead matmul UT failure
```
- [XPU] fuse cast to conv2d/fc in mixed precision model (#54493) · 4df00939
  Committed by zhupengyang on Jul 20, 2023

19 Jul, 2023 · 2 commits
- add TRT op unbind (#55476) · 4a55f5e7
  Committed by chen on Jul 19, 2023
- Delete repeat ops add gather squeeze unsqueeze (#55371) · 552ed8d8
  Committed by csy0225 on Jul 19, 2023

17 Jul, 2023 · 2 commits
- [Paddle-TRT] Support conv2d op enter into trt when filter is not a persistable tensor (#55246) · 74206917
  Committed by iamsonderr on Jul 17, 2023
```
* support_conv2d

* remove comment

* check code style

* add former Test

* check code style

* add unittest

* fix log

* change unittest

---------
Co-authored-by: zhoutianzi666 <17801055074@163.com>
```
- [Paddle-TRT] add assign op (#55426) · d778737e
  Committed by ming1753 on Jul 17, 2023
```
* [Paddle-TRT] add assign op
```
13 Jul, 2023 · 1 commit
- [BugFix] Replace include dense_tensor.h with forward declare in phi lib (#55396) · 9619443b
  Committed by Yuanle Liu on Jul 13, 2023
```
* copy dense_tensor.h to inference lib

* update

* update
```
12 Jul, 2023 · 1 commit
- [Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
  Committed by Yuanle Liu on Jul 12, 2023
```
* rewrite identity_op_clean_pass

* fix

* adjust identity_op_clean_pass order in gpu passes

* fix ut
```

PaddlePaddle / Paddle synced successfully about 1 year ago
