Commits · 62bff0e0ace6bdd885189e46c4113a3c70fffad1 · PaddlePaddle / Paddle

03 Jan, 2023 1 commit

[Paddle Inference] Implement conv2d_fusion NHWC format using cutlass (#47989) · c123dd1e

Committed by zhoutianzi666 on Jan 03, 2023

* Implement conv2d_fusion NHWC format using CUTLASS
* Add unit testing for CUTLASS Conv in inference
* Add experimental API for CUTLASS.

21 Dec, 2022 1 commit

Refactor Pass for fused_conv (#48848) · 7f0eb2e3

Committed by zyfncg on Dec 21, 2022

* refactor conv_activation_mkldnn_fuse_pass

* refactor conv_affine_channel_mkldnn_fuse_pass

* fix conv_activation_mkldnn_fuse_pass

* fix mkldnn unittest

* refactor int8_scale_calculation_mkldnn_pass and params_quantization_mkldnn_pass

* refactor conv_elementwise_add_mkldnn_fuse_pass

* fix quant

* refactor conv_bn_fuse_pass

* fix conv_bn_fuse_pass

* refactor depthwise_conv_bn_fuse_pass

* fix unittest

* fix conv_bn_fuse_pass

* remove redundant conv2d in params_quantization_mkldnn_pass

* fix params_quantization_mkldnn_pass_tester

14 Dec, 2022 1 commit
- [Paddle Inference] rewrite convert_to_mixed_precision (#48853) · 28ea9aad
  Committed by Yuanle Liu on Dec 14, 2022
09 Dec, 2022 1 commit
- [Inference] optimize some code and fix some bug (#48780) · c0034b5b
  Committed by Yuanle Liu on Dec 09, 2022
```
* clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass

* fix unittest timeout
```
08 Dec, 2022 1 commit
- [Inference] inference add cinn interface (#48741) · 3a387df6
  Committed by Wilber on Dec 08, 2022
06 Dec, 2022 1 commit
- [Paddle Inference] Add float_to_half_pass to support inference with mixed precision (#47993) · c5a45cc6
  Committed by Yuanle Liu on Dec 06, 2022
01 Dec, 2022 1 commit
- [Inference] Optimize memory_optimize pass. (#48476) · aa892113
  Committed by Wilber on Dec 01, 2022
```
* update memory_optimize pass
```
30 Nov, 2022 1 commit
- [Paddle Inference] clean unused code (#48392) · 5de01e8a
  Committed by Yuanle Liu on Nov 30, 2022
14 Nov, 2022 1 commit
- add lite opencl support api (#47112) · 798ab3f9
  Committed by engineer1109 on Nov 14, 2022
01 Nov, 2022 1 commit
- [Lite][XPU] Upgrade lite subgraph api of xpu (#47373) · 8a1124b1
  Committed by shentanyue on Nov 01, 2022
12 Oct, 2022 1 commit
- [Paddle-TRT]support shape tensor is the input of trt-subgraph (#46482) · f2a778c9
  Committed by zhoutianzi666 on Oct 12, 2022
22 Sep, 2022 1 commit
- TensorRT engine context memory sharing (#45842) · 173b39bb
  Committed by Yuanle Liu on Sep 22, 2022
05 Sep, 2022 1 commit

Update DlNNE engine (#45027) · 638965c5

Committed by denglin-github on Sep 05, 2022

* add config param for enable_dlnne and support calibration mode
* remove useless file
* refine code and add annotation
* refine code of Warning tips

05 Aug, 2022 1 commit

update trt workspace size param (#44469) · bdce552b

Committed by Zhang Jun on Aug 05, 2022

* update trt workspace size param

* update

* update

* update

* use int64_t

* use int64_t

* update

* update

08 Jul, 2022 1 commit
- Inference support mixed-precision model [3] (#44057) · 7f958728
  Committed by Wilber on Jul 08, 2022
29 Jun, 2022 1 commit
- inference support mixed-precision model [1]. (#43814) · c7694b82
  Committed by Wilber on Jun 29, 2022
```
* inference add convert to mixed model ability.
```
24 Jun, 2022 1 commit
- revert 40531 (#43807) · 7985407b
  Committed by Wilber on Jun 24, 2022
```
* revert 40531

* update
```
05 Jun, 2022 1 commit
- [code format check upgrade] step2: clang-format (#42840) · a3730dc8
  Committed by Sing_chan on Jun 05, 2022
02 Jun, 2022 1 commit
- [Paddle-Inference] new general transformer inference support (#43077) · 2810dfea
  Committed by Wangzheee on Jun 02, 2022
```
* new general transformer inference support
```
30 May, 2022 1 commit
- [TensorRT] Fix delete fill_constant pass (#43053) · 1448520d
  Committed by shentanyue on May 30, 2022
```
* update lite compile cmake

* Update delete_fill_constant_op_pass.cc

* Update analysis_config.cc
```
14 Apr, 2022 1 commit

add mkldnn int8 pass [step3] (#41599) · 8e2d4d30

Committed by baoachun on Apr 14, 2022

* add mkldnn int8 pass [step3]

* Add test for compute_propagate_scales_mkldnn_pass

* update pass

* update api comment and python api

Co-authored-by: wozna <joanna.wozna@intel.com>

31 Mar, 2022 1 commit

add flatten2,reshape2,squueze2_trt_fuse_pass test cast (#41031) · 7ef69202

Committed by heliqi on Mar 31, 2022

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

17 Mar, 2022 1 commit
- support gpu mixed precision inference (#40531) · 06fee998
  Committed by baoachun on Mar 17, 2022
22 Feb, 2022 1 commit
- [Paddle-Inference] fix pass and convert_op for preln_ernie (#39733) · 574f3402
  Committed by Wangzheee on Feb 22, 2022
```
* fix pass and convert_op for preln_ernie and add preln_ernie's flag in pass
```
11 Feb, 2022 1 commit
- Add TensorRT inspector into Paddle-TRT (#38362) · 69793a27
  Committed by Leo Chen on Feb 11, 2022
13 Jan, 2022 1 commit
- [Paddle-Inference] add Paddle Trt config: with_interleaved (#38884) · dccdc719
  Committed by Wangzheee on Jan 13, 2022
```
* add Paddle Trt config: with_interleaved
```
27 Oct, 2021 1 commit
- enable trt test check and fix trt ut error (3/3) (#36581) · 8c1c72af
  Committed by Wilber on Oct 27, 2021
22 Oct, 2021 1 commit
- support lite xpu choose device id (#36610) · f46311b0
  Committed by Wilber on Oct 22, 2021
14 Oct, 2021 1 commit
- clean inference logs when config.DisableGlogInfo is triggered (#36356) · 7f5128f4
  Committed by Pei Yang on Oct 14, 2021
22 Sep, 2021 1 commit
- [Inference] Support NNAdapter and ascend310 (#35226) · 10e53044
  Committed by JingZhuangzhuang on Sep 22, 2021
14 Sep, 2021 1 commit
- [Inference] Add tuned trt_dynamic_shape mode. (#34806) · 7c96efed
  Committed by Wilber on Sep 14, 2021
30 Apr, 2021 1 commit
- remove check for optim_cache_dir in trt slim int8 (#32676) · c6713bc0
  Committed by Pei Yang on Apr 30, 2021
25 Apr, 2021 2 commits

update lite subgraph api. (#32513) · 92dc9b2b

Committed by Wilber on Apr 25, 2021

Nne integration (#32255) · feb2e476

Committed by denglin-github on Apr 25, 2021

* Add dlnne engine runtime

* Fix log

* Remove <const_cast> and remove unrelated modify with dlnne, +clang-format

* Fix CMakeList format error

* Add copyright message

* Fix dlnne CMakeList.txt

* Add some paddlepaddle_pass to support more networks

* Fix some format bug

02 Mar, 2021 1 commit

support trt serialize when load model from memory (#31342) · 6404c438

Committed by Shang Zhizhou on Mar 02, 2021

* support trt serialize when load model from memory

* delete conv_bn_fuse_pass before tensorrt, with which trt serialize engine id is not stable

* Revert "delete conv_bn_fuse_pass before tensorrt, with which trt serialize engine id is not stable"

performance degradation, fix in the future

This reverts commit fa6cd17e60b15df351efda379ddd00e9e9c1fea9.

* add delete conv_bn

* delete path when delete_cache_files

18 Feb, 2021 1 commit
- add trt transpose and flatten converter (#31022) · 9b54fe41
  Committed by Pei Yang on Feb 18, 2021
25 Jan, 2021 1 commit

add DLA support: C++ && Python api (#30165) · ae0f88a9

Committed by Shang Zhizhou on Jan 25, 2021

* add dla

* add dla done

* add python api

Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>

06 Jan, 2021 1 commit

add inference api: DisableTensorRtOps (#30109) · 05b27695

Committed by Shang Zhizhou on Jan 06, 2021

* snap

* add inference api: DisableTensorRtOPs

* fix code style

* update api to experimental

* update variable name

06 Nov, 2020 1 commit
- Add bfloat16 softmax and gelu (#28394) · 7821759d
  Committed by joanna.wozna.intel on Nov 06, 2020
```
* Add bfloat16 softmax and gelu

* Add pass attr bfloat16_enabled_op_types

* Changes from review
```
03 Nov, 2020 1 commit

Optimize ERNIE model inference performance in TensorRT and support variable-length input (#28367) · ea851796

Committed by Shang Zhizhou on Nov 03, 2020

* fp16 result ok

* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS

* auto detect special slice op converter for ernie with trt oss

* ernie oss only support fp16

* fix special_slice_plugin serialize bug

* matmul in tensorrt ok

* ernie unittest ok

* add matmul tensorrt unittest

* remove demo code
