Commits · 372ac08a171d76c745deaab0feed2d587798f734 · PaddlePaddle / Paddle

23 Mar, 2021 1 commit

fix tensorrt output variable reshape (#31733) · 9d04ef73

Committed by Shang Zhizhou on Mar 23, 2021

* fix tensorrt output variable reshape

* move padding shape x 1 x 1 in ernie to qkv and fc

* update layer name

* fix softmax when input is dynamic, fc not padding any more

* fix varlen

* move fc x_dim assert to op_teller

9d04ef73

22 Mar, 2021 1 commit

[Paddle-TRT] nearest_interp op (#31626) · bfced39e

Committed by zlsh80826 on Mar 22, 2021

* nearest_interp op converter w/ dynamic/static

* fix data_layout include

* add trt nearest unit_test

* add nearest_interp NHWC test

* update trt nearest interp nhwc testcase

* remove asterisk for python2 compatibility

* add empty line to prevent conflict

* change the priority of out_h, out_w

bfced39e

18 Mar, 2021 2 commits
- [Paddle-TRT] gather converter (#31640) · fe241fd0
  Committed by zlsh80826 on Mar 18, 2021
```
* trt gather converter

* add trt gather unit_test
```
  fe241fd0
- [Paddle-TRT] support batch axis concatenation when using dynamic shape (#31627) · 4ea34278
  Committed by zlsh80826 on Mar 18, 2021
```
* support batch axis concatenation when using dynamic shape

* opteller can't return true early, or some test will not be executed
```
  4ea34278
12 Mar, 2021 1 commit
- Trt elementwise plugin serialize (#31587) · 50ac7dbf
  Committed by Shang Zhizhou on Mar 12, 2021
```
* add serialize unittest

* fix element_op trt plugin serialize bug
```
  50ac7dbf
10 Mar, 2021 1 commit
- fix ernie_varlen when cutting head (#31497) · f57739be
  Committed by Shang Zhizhou on Mar 10, 2021
  f57739be
03 Mar, 2021 1 commit
- TRT conv2d converter support SAME padding (#31379) · 32211fe9
  Committed by Pei Yang on Mar 03, 2021
  32211fe9
02 Mar, 2021 3 commits
- change prelu plugin to tensorRT layer (#30210) · 77c44e2f
  Committed by Shang Zhizhou on Mar 02, 2021
  77c44e2f
- add n-d input support for trt scale converter (#31316) · 2e9e3fad
  Committed by Pei Yang on Mar 02, 2021
```
* add n-d input support for trt scale converter

* add flatten for ut

* fix dims
```
  2e9e3fad
- [ROCM] update fluid operators for rocm (part4), test=develop (#31225) · 72d99c5d
  Committed by Qi Li on Mar 02, 2021
  72d99c5d
24 Feb, 2021 1 commit

[Paddle-TRT] support group_norm (#31040) · 00b09e86

Committed by Pei Yang on Feb 24, 2021

* add group norm plugin

* fix compile problems

* move concat axis check to trt op teller

* add nbDims for scale and bias nv dims

* add group norm unit test

* fix unittest

* add trt version restriction for group norm op teller

* fix unittest

00b09e86

22 Feb, 2021 1 commit
- update trt int8 calibrator to IEntropyCalibratorV2 (#31060) · a5c56d83
  Committed by Shang Zhizhou on Feb 22, 2021
```
* update trt int8 calibrator to IEntropyCalibratorV2

* add delete opt_cache for trt_split_converter_test
```
  a5c56d83
19 Feb, 2021 1 commit
- update trt error message when input height or width is -1 (#31019) · 01ccfbcd
  Committed by Wilber on Feb 18, 2021
  01ccfbcd
18 Feb, 2021 1 commit
- add trt transpose and flatten converter (#31022) · 9b54fe41
  Committed by Pei Yang on Feb 18, 2021
  9b54fe41
04 Feb, 2021 2 commits
- fix split trt plugin initialize (#30875) · e6095bc2
  Committed by Shang Zhizhou on Feb 04, 2021
```
* fix split trt plugin initialize

* update
```
  e6095bc2
- use iwyu clean include second time, test=develop (#30829) · 35c5b23f
  Committed by wanghuancoder on Feb 04, 2021
```
* use iwyu clean include second time, test=develop
```
  35c5b23f
02 Feb, 2021 1 commit
- fix trt plugin clone and initialize bugs in TRT7.1+ (#30709) · b9094509
  Committed by Shang Zhizhou on Feb 02, 2021
```
* fix trt plugin clone and initialize bugs

* fix unit test error

* enable trt in ci py3

* update unittest timeout
```
  b9094509
25 Jan, 2021 1 commit

add DLA support: C++&&Python api (#30165) · ae0f88a9

Committed by Shang Zhizhou on Jan 25, 2021

* add dla

* add dla done

* add python api
Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>

ae0f88a9

19 Jan, 2021 1 commit
- unify calling cudaSetDevice (#30470) · 81217a94
  Committed by Leo Chen on Jan 19, 2021
```
* unify calling cudaSetDevice

* fix compile
```
  81217a94
13 Jan, 2021 1 commit

Added support for inference using quantization aware trained dygraph (#30288) · 7bbf3ac5

Committed by alncat on Jan 13, 2021

* added support for inference using quantization aware trained dygraph

* added support for inference using quantization aware trained dygraph
correct boost get usage

* Delete incorrect warning message (#30196)

* fix warning and no grad

* clean redundant API alias in 2.0 - part 2 (#30013)

* delete paddle.nn.functional.assign

* fix dynamic to static error

* just add the op error message for the matmul xpu (#30246)

 add the op error message for the matmul xpu

* Add Static Variable Clone (#30208)

Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat

* use wget to replace curl to download the lcov file (#30229)

* use wget to replace curl to download the lcov file

* add cache for lcov

* fix test_pool3d_op timeout issue (#30248)

* Fix unittests bugs. (#30250)

* modify error message based on comments (#30189)

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

* Fix bug for 'save multiple method' (#30218)

* Fix bug for 'save multiple method'

* To pass coverage.

* edit code to pass coverage.

* edit code to pass coverage.

* add unittest for coverage.

* change for coverage.

* edit for coverage.

* added support for inference using quantization aware trained dygraph

* Alias from  paddle.fluid.layers.auc to paddle.static.auc (#30206)

* add alias from  fluid.layers.auc to static.auc

* Update __init__.py

* added support for inference using quantization aware trained dygraph
correct boost get usage

* corrected boost get usage

* corrected naming issues and enforcing zero check

* correct paddle enforce message

* added more error checkings

* corrected error report message and optimized code

* corrected findvar usage

* corrected paddle_enforce in scope

* correct error messages

* correct error reporting format
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Bai Yifan <me@ethanbai.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>

7bbf3ac5

11 Jan, 2021 1 commit

modify error message based on comments (#30189) · 66dc4ac7

Committed by WeiXin on Jan 11, 2021

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

66dc4ac7

08 Jan, 2021 1 commit
- fix windows compile when WITH_PYTHON=ON and WITH_TENSORRT=ON (#30194) · 01a287bf
  Committed by Wilber on Jan 08, 2021
  01a287bf
08 Dec, 2020 1 commit
- change hard_swish from plugin to layer (#29177) · 2480bdef
  Committed by Pei Yang on Dec 08, 2020
```
* change hard_swish from plugin to layer

* add ut when threshold != scale
```
  2480bdef
07 Dec, 2020 1 commit
- support clip op trt converter (#29411) · f860de4a
  Committed by Pei Yang on Dec 07, 2020
  f860de4a
27 Nov, 2020 1 commit

detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01

Committed by Shang Zhizhou on Nov 27, 2020

* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake

* compile with cuda9

* add some unittest

* notest;test=coverage

* add unittest for trt plugin swish && split

* update ernie unittest

* fix some error message

* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter

* fix compile error when CUDA_ARCH_NAME < Pascal

* fix compile error

* update unittest timeout

* compile with cuda9

* update error msg

* fix code style

* add some comments

* add define IF_CUDA_ARCH_SUPPORT_FP16

* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED

b9e76a01

23 Nov, 2020 1 commit
- change avg pooling and global pooling to trt layer in dynamic shape mode (#28702) · 994673bf
  Committed by Pei Yang on Nov 23, 2020
```
* change avg pooling and global pooling to trt layer

* add support for static shape global pooling

* modify trt errmsg
```
  994673bf
12 Nov, 2020 1 commit
- TRT support for pruned transformer models; fix bug where TensorRT does not support DeletePass (#28517) · 8699f38d
  Committed by Shang Zhizhou on Nov 12, 2020
```
* skip_layernorm_op done

* add unittest

* slice op convertor support trt < 6

* skip_layernorm only work in ernie
```
  8699f38d
03 Nov, 2020 1 commit

Optimize ernie model inference performance in TensorRT, support variable-length input (#28367) · ea851796

Committed by Shang Zhizhou on Nov 03, 2020

* fp16 result ok

* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS

* auto detect special slice op converter for ernie with trt oss

* ernie oss only support fp16

* fix special_slice_plugin serialize bug

* matmul in tensorrt ok

* ernie unittest ok

* add matmul tensorrt unittest

* remove demo code

ea851796

21 Oct, 2020 1 commit
- change avg pooling from trt plugin to trt layer (#28032) · 602d2ce5
  Committed by Pei Yang on Oct 21, 2020
  602d2ce5
13 Oct, 2020 1 commit
- add info log for trt input dynamic shape check (#27796) · bbc837ee
  Committed by Shang Zhizhou on Oct 13, 2020
```
* add info log for trt input dynamic shape check

* fix error msg error
```
  bbc837ee
28 Sep, 2020 1 commit

Add unittests and OP version registry for tensorrt_subgraph_pass (#27544) · ae6e40a7

Committed by Pei Yang on Sep 28, 2020

* add unittests and op version register for tensorrt_subgraph_pass

* rename to test_trt_subgraph_pass.py

* fix softmax converter diff when padding dim=1

ae6e40a7

24 Sep, 2020 1 commit

use iwyu clean include (#27267) · df43905f

Committed by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

df43905f

23 Sep, 2020 1 commit

Polish some lost invalid error message (#27445) · 76506447

Committed by Chen Weihang on Sep 23, 2020

* polish some lost error msg

* add some math file to white list

* polish details based on reviewer comment

76506447

22 Sep, 2020 1 commit
- errmsg refine of trt plugin (#27309) · fda54c02
  Committed by Pei Yang on Sep 22, 2020
  fda54c02
18 Sep, 2020 1 commit
- Optimize emb_eltwise_layernorm_plugin and support fp16 (#27128) · a5ef246c
  Committed by Pei Yang on Sep 18, 2020
  a5ef246c
15 Sep, 2020 2 commits

Optimize slice trt plugin (#26970) · 47fdc60e

Committed by Shang Zhizhou on Sep 15, 2020

* optimize slice TRT plugin

This patch removes an unnecessary barrier on the data transfer of the needed offset,
so the data transfer can overlap with GPU kernel execution.

This patch also fixes the incorrect name of the slice plugin by replacing
"layernorm" with "slice".

test=develop

* add serialize/deserialize to slice plugin

* add static shape slice trt plugin

* fix slice trt op convertor dynamic shape bug

* fix format by clang-format

* fix pylint format error

* fix problems commented by peiyang
Co-authored-by: Ryan Jeng <rjeng@nvidia.com>

47fdc60e

Optimize error report (#27254) · e6e2e537

Committed by Shang Zhizhou on Sep 15, 2020

* optimize error report

* add test case for pad op converter

* fix some spelling mistake commented by peiyang

e6e2e537

14 Sep, 2020 1 commit
- refine error message related to paddle-TRT (#27256) · aae41c6f
  Committed by Pei Yang on Sep 14, 2020
  aae41c6f
02 Sep, 2020 1 commit
- fix pool trt plugin bug (#26463) · 932bbe95
  Committed by Zhaolong Xing on Sep 02, 2020
```
test=develop
```
  932bbe95
01 Sep, 2020 1 commit

[Paddle-TRT] Stack op plugin (#25605) · ad6e3dd6

Committed by zlsh80826 on Sep 01, 2020

* add stack_op to CMakeLists

* add dim=3 support for scale op

* add trt stack op, test=develop

* remove debug message

* add stack plugin serialize

* remove slice, scale op, will add later

* enhance error message

* revise trt ernie test to cover the stack op CI test, test=develop

* add stack op serialization

* fix test shape after adding stack op

* remove slice op, will add after implementing serialization

* roll back to min_graph=5 to avoid using slice op

* fix scale op output layer

* implement stack op createPlugin

* use workspace and move the definition to .cu

* move stack plugin creator definition to .cu, test=develop

ad6e3dd6

PaddlePaddle / Paddle: synced successfully about 1 year ago
