Commits · 6b10c0e5dc83113a1102984f0bfd7edddf121db9 · PaddlePaddle / Paddle

07 Aug, 2023 3 commits
- Y
  [Inference] save_optimized_model_pass support tensorrt (#55893) · 6b10c0e5
  Committed by Yuanle Liu on Aug 07, 2023
```
* fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward

* save_optimized_model_pass support tensorrt

* update

* update

* fix compile

* update

* fix ut timeout
```
  6b10c0e5
- G
  
  [clang-tidy] NO.6 enable `modernize-avoid-c-arrays` step: 2 (#55954) · 5ada98b8
  Committed by gouzil on Aug 07, 2023
  
  5ada98b8
- R
  
  [clang-tidy] enable modernize-use-equals-default (#55983) · 30a02d27
  Committed by Ruibin Cheung on Aug 07, 2023
  
  30a02d27
04 Aug, 2023 2 commits
- R
  [clang-tidy] enable modernize-use-emplace (#55799) · 469a0392
  Committed by Ruibin Cheung on Aug 04, 2023
```
* [clang-tidy] enable modernize-use-emplace

* Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into modernize_use_emplace
```
  469a0392
- Z
  
  [clang-tidy] NO.12 enable modernize-use-nullptr check(#55800) · 1e4f627d
  Committed by Zhenghai Zhang on Aug 04, 2023
  
  1e4f627d
03 Aug, 2023 2 commits
- W
  
  [clang-tidy] [No.4] enable `modernize-loop-convert` (#55704) · 81ccd99e
  Committed by Wang Xin on Aug 03, 2023
  
  81ccd99e
- W
  
  eliminate small pattern (#55843) · dc4b48f6
  Committed by wz1qqx on Aug 03, 2023
  
  dc4b48f6
02 Aug, 2023 3 commits

[XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
Committed by wz1qqx on Aug 02, 2023

22c7a6eb

[Inference] Replace groupNorm when data types are bf16 and fp16, and data format is NHWC implementation. (#55399) · e61d892a

Committed by yangjianfengo1 on Aug 02, 2023

* finish

* cpergroup odd

* fix bf16

* single channel

* code style

* jingdu duiqi

* add head_file

* add bf16 head file

* bf16 2

* bf16

* bf16 head

* bf16 compile

* py test

* bf16 compile

* bf16 compile

* unset py test

* nhwc

* test

* mean var

* bf16 success

* su

* ctest success

* use is_same_as

* is_same

* use is_same

* rtol

* gpu_stream

* del sigmod

* fix bfloat16 type

* use cuda_bf16_hpp

* use_cuda_arch

* bfloat162float2

* del inplace_tol

* del max_releative_tol

* temp store

* jingdu duiqi

* temp store

* plugin

* jingdu duiqi

* duiqi

* include cuda.h

* del half

* half single

* ci

* add const

* ci

* cudamemset

* del printf

* fp16 test

* add half compute

* del br16 ci

* del ci

* ci approve

* del fluid include

e61d892a

[XPU] Add gather_squeeze_pass (#55605) · d13a49d6
Committed by jiangfan06 on Aug 02, 2023

d13a49d6

01 Aug, 2023 1 commit
- H
  
  [XPU] Add fast_where fusion op and XPU micro kernel (#55628) · 07e788f1
  Committed by hong19860320 on Aug 01, 2023
  
  07e788f1
27 Jul, 2023 2 commits
- M
  [Paddle-TRT] add flip op (#55688) · d608170a
  Committed by ming1753 on Jul 27, 2023
```
* [Paddle-TRT] add flip op
```
  d608170a
- M
  paddle-TRT support float64 (#55520) · 8b063030
  Committed by ming1753 on Jul 27, 2023
```
* Paddle-TRT support float64  in/out type, support fill_any_like_op in int64
```
  8b063030
24 Jul, 2023 2 commits

[Paddle-TRT] Convert 0D tensor to 1D tensor, increase the shape tensor's number count when collecting shape (#55503) · a3cf25e3

Committed by chen on Jul 24, 2023

* make 0-D tensor to 1-D tensor to support Grounding-SAM and add shape check

* recover identity_op_clean_pass.cc

a3cf25e3

onednn: remove fc_elementwise_add fusion (#55504) · bea1f04c

Committed by Xinyu Chen on Jul 24, 2023

* onednn: remove fc+eltwiseadd fusion pass
* onednn: remove post-sum fusion in fc kernel
* onednn: tests: make unfused add run into f32

bea1f04c

21 Jul, 2023 3 commits
- Y
  [Inference] save_optimized_model_pass support gpu (#55551) · 4b3ac86d
  Committed by Yuanle Liu on Jul 21, 2023
```
* fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward

* save_optimized_model_pass support gpu
```
  4b3ac86d
- R
  
  [clang-tidy] enable modernize-make-unique (#55506) · 45d49619
  Committed by Ruibin Cheung on Jul 21, 2023
  
  45d49619
- J
  Bugfix, CUB regression in CUDA 12.2 (#55594) · b2c797ad
  Committed by Jeng Bai-Cheng on Jul 21, 2023
```
Issue #55016
```
  b2c797ad
20 Jul, 2023 2 commits
- L
  Fix UT failure (#55360) · 7eeff7b1
  Committed by Leo Chen on Jul 20, 2023
```
* Fix TRT multihead matmul UT failure
```
  7eeff7b1
- Z
  
  [XPU] fuse cast to conv2d/fc in mixed precision model (#54493) · 4df00939
  Committed by zhupengyang on Jul 20, 2023
  
  4df00939
19 Jul, 2023 2 commits
- C
  
  add TRT op unbind (#55476) · 4a55f5e7
  Committed by chen on Jul 19, 2023
  
  4a55f5e7
- C
  
  Delete repeat ops add gather squeeze unsqueeze (#55371) · 552ed8d8
  Committed by csy0225 on Jul 19, 2023
  
  552ed8d8
17 Jul, 2023 2 commits
- I
  [Paddle-TRT] Support conv2d op enter into trt when filter is not a persistable tensor (#55246) · 74206917
  Committed by iamsonderr on Jul 17, 2023
```
* support_conv2d

* remove comment

* check code style

* add former Test

* check code style

* add unittest

* fix log

* change unittest

---------
Co-authored-by: zhoutianzi666 <17801055074@163.com>
```
  74206917
- M
  [Paddle-TRT] add assign op (#55426) · d778737e
  Committed by ming1753 on Jul 17, 2023
```
* [Paddle-TRT] add assign op
```
  d778737e
13 Jul, 2023 1 commit
- Y
  [BugFix] Replace include dense_tensor.h with forward declare in phi lib (#55396) · 9619443b
  Committed by Yuanle Liu on Jul 13, 2023
```
* copy dense_tensor.h to inference lib

* update

* update
```
  9619443b
12 Jul, 2023 3 commits

[Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
Committed by Yuanle Liu on Jul 12, 2023
```
* rewrite identity_op_clean_pass

* fix

* adjust identity_op_clean_pass order in gpu passes

* fix ut
```
2363e623

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Committed by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fixsgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>

cfa513f7

[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

Committed by Wang Xin on Jul 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tid-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug

be3a6fa7

07 Jul, 2023 4 commits
- W
  
  [XPU] Add layernorm fuse pass (#55154) · eb12739e
  Committed by wz1qqx on Jul 07, 2023
  
  eb12739e
- W
  
  [XPU] Eliminate small ops (#55193) · b8f265d2
  Committed by wz1qqx on Jul 07, 2023
  
  b8f265d2
- Y
  rename WITH_INFERENCE_NVTX to WITH_NVTX and fix compile bug (#55219) · 43843192
  Committed by Yuanle Liu on Jul 07, 2023
```
* fix WITH_SHARED_IR option type

* rename WITH_INFERENCE_NVTX to WITH_NVTX and fix compile bug

* update
```
  43843192
- I
  [Paddle Inference] del inplace op in memory_optimize_pass.cc (#55081) · 0685b3ec
  Committed by iamsonderr on Jul 07, 2023
```
* commit

* del inplace op in memory_optimize_pass.cc

* check code style
```
  0685b3ec
06 Jul, 2023 1 commit
- Z
  
  fix lite xpu config (#55188) · b5645956
  Committed by zhupengyang on Jul 06, 2023
  
  b5645956
05 Jul, 2023 1 commit
- W
  
  [XPU] add reduce_max_fuse_pass (#54981) · 54a101d5
  Committed by wz1qqx on Jul 05, 2023
  
  54a101d5
04 Jul, 2023 1 commit
- L
  
  Print info for each layer in TRT inspector to avoid log being too long (#54748) · 54e1455a
  Committed by Leo Chen on Jul 04, 2023
  
  54e1455a
03 Jul, 2023 1 commit
- 周
  [Paddle-TRT] use hook to collect shape in CollectShapeRangeInfo API. (#54841) · 989f3dde
  Committed by 周周周 on Jul 03, 2023
```
* commit

* commit

* commit

* commit

* final commit

* use hook to collect shape and shape value
```
  989f3dde
30 Jun, 2023 1 commit
- M
  
  [XPU] Add conv2d transpose fuse pass (#54904) · 12c15b89
  Committed by mjp9527 on Jun 30, 2023
  
  12c15b89
29 Jun, 2023 3 commits
- 张
  [CodeStyle][CINN] format cpp code via clang-format (#54961) · af127342
  Committed by 张经纬 on Jun 29, 2023
```
* fix clang-format

* 'fix_clang-format'

* fix remaining errors

* format

* empty commit, re-trigger all ci

* empty commit, re-trigger all ci

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
```
  af127342
- W
  
  [XPU]add layer_norm fuse pass (#54930) · b94b3ac0
  Committed by wz1qqx on Jun 28, 2023
  
  b94b3ac0
- W
  
  add lookup_table op for Paddle-TRT (#54882) · 7c89b972
  Committed by Wangzheee on Jun 29, 2023
  
  7c89b972

PaddlePaddle / Paddle synced successfully about 1 year ago
