Commits · 7c8c9b7d87b51838ae6ed379ac9bc3d5685b7bee · PaddlePaddle / Paddle

06 Sep, 2023 1 commit
- [XPU] add squeeze_excitation_block_xpu op&pass to optimize ppocr_v3_det model (#56773) · 7c8c9b7d
  Committed by leolishaohao on Sep 06, 2023
```
* [XPU] add squeeze_excitation_block_xpu op&pass to optimize ppocr_v3_det model test=kunlun

* fix

* fix code style

* remove xpu name
```
  7c8c9b7d
05 Sep, 2023 1 commit
- [XPU] Add element_mul_add_fuse_pass and elementwise_madd_xpu kernel (#56629) · 5efaaaa3
  Committed by jiangfan06 on Sep 05, 2023
  
  5efaaaa3
04 Sep, 2023 2 commits
- multihead_matmul op support codegen and kernel remove to phi (#56846) · 79bfb184
  Committed by Yuanle Liu on Sep 04, 2023
  
  79bfb184
- Modify MarkTrtEngineOutputs API (#56858) · 179d4264
  Committed by ming1753 on Sep 04, 2023
```
* Modify MarkTrtEngineOutputs API
```
  179d4264
01 Sep, 2023 1 commit

[clang-tidy] No.34,36 enable performance-noexcept-move-constructor,modernize-use-transparent-functors (#56261) · 17e4be21

Committed by cyberslack_lee on Sep 01, 2023

* fix

* fix

* CI

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* CI

* fix

* CI

17e4be21

30 Aug, 2023 1 commit

Add paddle custom flags support (#56256) · 2ef4ec71

Committed by huangjiyi on Aug 30, 2023

* update

* replace gflags header

* replace DEFINE_<type> with PD_DEFINE_<type>

* fix bug

* fix bug

* fix bug

* update cmake

* add :: before some paddle namespace

* fix link error

* fix CI-Py3

* allow commandline parse

* fix SetFlagsFromEnv

* fix bug

* fix bug

* fix CI-CINN

* fix CI-Coverage-build

* fix CI-Windows-build

* fix CI-Inference

* fix bug

* fix bug

* fix CI-CINN

* fix inference api test

* fix infer_ut test

* revert infer_ut gflags usage

* update

* fix inference

* remove flags export macro

* revert inference demo_ci gflags usage

* update

* update

* update

* update

* update

* update

* update

* update

* fix bug when turn on WITH_GFLAGS

* turn on WITH_GFLAGS

* fix bug when turn on WITH_GFLAGS

* fix bug when turn on WITH_GFLAGS

* update

* update and add unittest

* add unittest

* fix conflict

* rerun ci

* update

* resolve conflict

2ef4ec71

29 Aug, 2023 3 commits
- [CustomDevice] Not reset pass_builder (#56755) · 220f13bd
  Committed by ronnywang on Aug 29, 2023
  
  220f13bd
- [clang-tidy] No.26,27 enable misc-unused-using-decls,misc-unused-alias-decls (#56485) · 138bdf40
  Committed by cyberslack_lee on Aug 29, 2023
```
* fix

* fix
```
  138bdf40
- [clang-tidy] enable bugprone-unhandled-self-assignment check (#56640) · b185adf8
  Committed by gouzil on Aug 29, 2023
  
  b185adf8
28 Aug, 2023 1 commit
- [clang-tidy] enable bugprone-exception-escape check (#56692) · dcaca0f4
  Committed by gouzil on Aug 28, 2023
  
  dcaca0f4
25 Aug, 2023 1 commit

[Inference] auto mixed precision inference support white list (#56535) · ecff21e7

Committed by Yuanle Liu on Aug 25, 2023

* auto mixed precision inference support white list

* update

* update

* update

* move down identity_op_clean_pass

* fix code style

ecff21e7

24 Aug, 2023 2 commits

[NewIR]Add NOT_FOR_INFER to prune Inference Library Size and Split VJP CodeGen into pd_op_vjp.cc (#56352) · 5d43f5e4

Committed by Aurelius84 on Aug 24, 2023

* [NewIR]Prune Inference Library Size and Remove IR Dialect

* remove options

* add NOT_FOR_INFER

* fix pd_vjp.cc

* polish deps

* fix code style

* fix unittest

* fix cmake

* fix inference CI

5d43f5e4

[XPU] Add embedding plugin (#56488) · 2a5adc5a
Committed by csy0225 on Aug 24, 2023

2a5adc5a

23 Aug, 2023 2 commits

Integrate TRT qdq layers (#54803) · ae84c603

Committed by Leo Chen on Aug 23, 2023

* Integrate quantize/dequantize linear and add config for explicit quantization

* Fix the build error

* Add macro for TRT version < 8.0

* Remove qdq UT from windows

* Fix UT failure

* Check TRT version in qdq UT

* Test tensorrt_explicit_enabled API

* Disable QDQ UT if TRT version < 8.5

* Add quantization postfix into public APIs

* Apply code formatter

* Fix the UT failure for explicit quantization

* Apply code formatter on modified files

* Correct the year in copyright

ae84c603

Add fuse pass to remove duplicated transpose ops (#56326) · b8d7f801
Committed by Travis-Lee on Aug 23, 2023

b8d7f801

22 Aug, 2023 1 commit
- [TRT] PrelnResidualBiasPluginDynamic Support 4D Inputs (#56304) · 338fb32b
  Committed by chen on Aug 22, 2023
  
  338fb32b
21 Aug, 2023 1 commit
- add pad genetic plugin (#56037) · d38dde68
  Committed by chen on Aug 21, 2023
  
  d38dde68
18 Aug, 2023 1 commit

[Inference] Make share_external_data supports bf16 and bool; fix while_op cache_inference_while_scope when using fleet_executor. (#56055) · c65ef07c

Committed by lzy on Aug 18, 2023

* 1. make share_external_data supports bf16 and bool; 2. don't drop_kids when cache_inference_while_scope

* fix FLAGS_cache_inference_while_scope

* add unittest

* add unittest

* skip unitest when cudnn_version < 8100

* skip test share_external_data_bf16 when CUDA_ARCH < 80

c65ef07c

17 Aug, 2023 1 commit

Add MarkTrtEngineOutputs API (#56188) · 2abf4326

Committed by ming1753 on Aug 17, 2023

* [paddle-TRT] support mark output

* [fix bug] hook function only call one in different predictor

* add api test

2abf4326

16 Aug, 2023 1 commit
- [XPU] Add fast_layernorm_xpu_fuse_pass and fast_layernorm_xpu plugin (#56269) · f16e1869
  Committed by jiangfan06 on Aug 16, 2023
  
  f16e1869
15 Aug, 2023 1 commit

[Inference][Trt]fix bilinear_v2 (#56043) · a26a3a60

Committed by bukejiyu on Aug 15, 2023

* fix trt bilinear_interp_v2_op

* add trt 8.0 condition

* add trt 8.0 condition

test bilinear

add trt 8.0 condition

* code style

a26a3a60

14 Aug, 2023 1 commit
- [clang-tidy] No.31 enable modernize-use-bool-literals (#56216) · 2c307457
  Committed by cyberslack_lee on Aug 14, 2023
  
  2c307457
10 Aug, 2023 1 commit
- [XPU] Add transfilter when conv2d op dilation > 1 (#55978) · 81c56e27
  Committed by csy0225 on Aug 10, 2023
  
  81c56e27
09 Aug, 2023 2 commits
- [oneDNN]rename macro to PADDLE_WITH_DNNL (#52208) · 6ff4c130
  Committed by Xinyu Chen on Aug 09, 2023
```
* onednn: rename macro to PADDLE_WITH_DNNL

* onednn: rename macro to CINN_WITH_DNNL
```
  6ff4c130
- [clang-tidy] fix modernize-make-unique (#55764) · 9f04f2ac
  Committed by Ruibin Cheung on Aug 09, 2023
  
  9f04f2ac
07 Aug, 2023 3 commits
- [Inference] save_optimized_model_pass support tensorrt (#55893) · 6b10c0e5
  Committed by Yuanle Liu on Aug 07, 2023
```
* fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward

* save_optimized_model_pass support tensorrt

* update

* update

* fix compile

* update

* fix ut timeout
```
  6b10c0e5
- [clang-tidy] NO.6 enable `modernize-avoid-c-arrays` step: 2 (#55954) · 5ada98b8
  Committed by gouzil on Aug 07, 2023
  
  5ada98b8
- [clang-tidy] enable modernize-use-equals-default (#55983) · 30a02d27
  Committed by Ruibin Cheung on Aug 07, 2023
  
  30a02d27
04 Aug, 2023 2 commits
- [clang-tidy] enable modernize-use-emplace (#55799) · 469a0392
  Committed by Ruibin Cheung on Aug 04, 2023
```
* [clang-tidy] enable modernize-use-emplace

* Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into modernize_use_emplace
```
  469a0392
- [clang-tidy] NO.12 enable modernize-use-nullptr check (#55800) · 1e4f627d
  Committed by Zhenghai Zhang on Aug 04, 2023
  
  1e4f627d
03 Aug, 2023 2 commits
- [clang-tidy] [No.4] enable `modernize-loop-convert` (#55704) · 81ccd99e
  Committed by Wang Xin on Aug 03, 2023
  
  81ccd99e
- eliminate small pattern (#55843) · dc4b48f6
  Committed by wz1qqx on Aug 03, 2023
  
  dc4b48f6
02 Aug, 2023 3 commits

[XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
Committed by wz1qqx on Aug 02, 2023

22c7a6eb

[Inference] Replace groupNorm when data types are bf16 and fp16, and data format is NHWC implementation. (#55399) · e61d892a

Committed by yangjianfengo1 on Aug 02, 2023

* finish

* cpergroup odd

* fix bf16

* single channel

* code style

* align precision

* add head_file

* add bf16 head file

* bf16 2

* bf16

* bf16 head

* bf16 compile

* py test

* bf16 compile

* bf16 compile

* unset py test

* nhwc

* test

* mean var

* bf16 success

* su

* ctest success

* use is_same_as

* is_same

* use is_same

* rtol

* gpu_stream

* del sigmod

* fix bfloat16 type

* use cuda_bf16_hpp

* use_cuda_arch

* bfloat162float2

* del inplace_tol

* del max_releative_tol

* temp store

* align precision

* temp store

* plugin

* align precision

* align

* include cuda.h

* del half

* half single

* ci

* add const

* ci

* cudamemset

* del printf

* fp16 test

* add half compute

* del br16 ci

* del ci

* ci approve

* del fluid include

e61d892a

[XPU] Add gather_squeeze_pass (#55605) · d13a49d6
Committed by jiangfan06 on Aug 02, 2023

d13a49d6

01 Aug, 2023 1 commit
- [XPU] Add fast_where fusion op and XPU micro kernel (#55628) · 07e788f1
  Committed by hong19860320 on Aug 01, 2023
  
  07e788f1
27 Jul, 2023 2 commits
- [Paddle-TRT] add flip op (#55688) · d608170a
  Committed by ming1753 on Jul 27, 2023
```
* [Paddle-TRT] add flip op
```
  d608170a
- paddle-TRT support float64 (#55520) · 8b063030
  Committed by ming1753 on Jul 27, 2023
```
* Paddle-TRT support float64  in/out type, support fill_any_like_op in int64
```
  8b063030
24 Jul, 2023 2 commits

[Paddle-TRT] Convert 0D tensor to 1D tensor, increase the shape tensor's number count when collecting shape (#55503) · a3cf25e3

Committed by chen on Jul 24, 2023

* make 0-D tensor to 1-D tensor to support Grounding-SAM and add shape check

* recover identity_op_clean_pass.cc

a3cf25e3

onednn: remove fc_elementwise_add fusion (#55504) · bea1f04c

Committed by Xinyu Chen on Jul 24, 2023

* onednn: remove fc+eltwiseadd fusion pass
* onednn: remove post-sum fusion in fc kernel
* onednn: tests: make unfused add run into f32

bea1f04c

PaddlePaddle / Paddle synced successfully about 1 year ago