Commits · fabdb43c94c20b9fdf5ce87438f710e680f2588f · 机器未来 / Paddle

13 Apr, 2021 1 commit

extend multiclass_nms unittest timeout threshold (#32214) · cb81826a

Committed by Pei Yang on Apr 13, 2021

* extend multiclass_nms unittest timeout threshold

* adjust timeout to 200s

* temporarily disable multiclass_nms trt op teller

cb81826a

06 Apr, 2021 1 commit
- [PaddleTRT] Yolov3 bugfix (#32064) · b17e36a4
  Committed by zlsh80826 on Apr 06, 2021
```
* fix yolobox teller condition

* fix cuda double free bug
```
  b17e36a4
02 Apr, 2021 2 commits
- update plugin creator name (#32021) · ed49b418
  Committed by Wilber on Apr 02, 2021
  
  ed49b418
- update trt engine addplugin name. (#32018) · d9187869
  Committed by Wilber on Apr 02, 2021
```
* update trt engine addplugin name.

* update
```
  d9187869
01 Apr, 2021 1 commit

[Paddle-TRT] add anchor generator op plugin (#31730) · b807e408

Committed by zlsh80826 on Apr 01, 2021

* add anchor generator op plugin

* add anchor generator unit_test

* remove dbg info

* remove redundant line

* replace assertion with paddle enforce

* dynamic plugin replaces assertion with paddle enforce

* anchor generator support dynamic shape on spatial axis

* anchor generator test with fp16, dynamic shape

* add anchor generator test all

* add back main

* reduce test input size to not exceed the timelimit of ci

* change super to InferencePassTest for python2 compatibility

* reuse paddle operator anchor generator

* move creator construct to header with default

* add cuda ifdef

* reduce line

* change super to InferencePassTest for python2 compatibility

* fix anchor generator fp16 serialize setting

* split unittest from test_all

* restrict anchor generator input format before version 7234

* anchor generator only support greater than trt7.1

* change min_graph_size to 2

* min_graph size to 3 if dynamic shape

* reduce dynamic shape size to avoid trt search tactic too long to exceed time limit

* remove anchor from fetch list

* anchor generator support all trt version

* fix memory not allocated but if serialized

b807e408

30 Mar, 2021 2 commits
- fix batchnorm when input dims < 3 (#31933) · 8084b759
  Committed by Shang Zhizhou on Mar 30, 2021
```
* fix batchnorm when input dims < 3

* add unittest for batchnorm dims = 2
```
  8084b759
- [Paddle-TRT] yolobox (#31755) · 64ee255f
  Committed by zlsh80826 on Mar 30, 2021
```
* yolobox converter and plugin

* yolobox unittest

* add dynamic shape restriction

* fix git merge log
```
  64ee255f
29 Mar, 2021 2 commits

[Paddle-TRT] roi_align_plugin (#31732) · e3a38d79

Committed by zlsh80826 on Mar 29, 2021

* add roi_align_plugin

* add roi align unit_test

* add roi align serialization

* remove roi align static plugin because of batch dim issue

* refine roi align unittest and add fp16/serialization

* add trt roi align condition to op_teller

* refine error message

* remove unnecessary reshape layer

e3a38d79

[Paddle-TRT] trt affine channel converter (#31628) · bfb5cf55

Committed by zlsh80826 on Mar 29, 2021

* trt affine channel converter

* add trt affine channel base test

* add trt affine channel NHWC

* remove asterisk for python2 compatibility

* trt affine channel converter

* add trt affine channel base test

* add trt affine channel NHWC

* remove asterisk for python2 compatibility

* fix rebase

* move LodTensor to Tensor

* add dbg info

* affine channel converter only support NCHW

* scale,bias are parameters, use create_parameters api

* reduce test input size to not exceed the timelimit of ci

* refine affine channel unittest and add serialization/dynamic test

* change super to InferencePassTest for python2 compatibility

* change super to InferencePassTest for python2 compatibility

* fix affine channel fp16 serialize setting

bfb5cf55

26 Mar, 2021 1 commit

[Paddle-TRT] multiclass nms (#31742) · 01aa2526

Committed by zlsh80826 on Mar 26, 2021

* add multiclass_nms

* add multiclass_nms unittest

* add default enable_tensorrt_oss option

* refine multiclass nms unittest and add serialization/dynamic test

* change super to InferencePassTest for python2 compatibility

* refine multiclass nms unittest

* move out dynamic shape test due to ci timelimit

01aa2526

23 Mar, 2021 2 commits

trt plugin upgrade to pluginv2ext (#31670) · f4d9212d

Committed by Wilber on Mar 23, 2021

f4d9212d

fix tensorrt output variable reshape (#31733) · 9d04ef73

Committed by Shang Zhizhou on Mar 23, 2021

* fix tensorrt output variable reshape

* move padding shape x 1 x 1 in ernie to qkv and fc

* update layer name

* fix softmax when input is dynamic, fc not padding any more

* fix varlen

* move fc x_dim assert to op_teller

9d04ef73

22 Mar, 2021 1 commit

[Paddle-TRT] nearest_interp op (#31626) · bfced39e

Committed by zlsh80826 on Mar 22, 2021

* nearest_interp op converter w/ dynamic/static

* fix data_layout include

* add trt nearest unit_test

* add nearest_interp NHWC test

* update trt nearest interp nhwc testcase

* remove asterisk for python2 compatibility

* add empty line to prevent conflict

* nearest_interp op converter w/ dynamic/static

* fix data_layout include

* add trt nearest unit_test

* add nearest_interp NHWC test

* update trt nearest interp nhwc testcase

* remove asterisk for python2 compatibility

* add empty line to prevent conflict

* change the priority of out_h, out_w

bfced39e

18 Mar, 2021 2 commits
- [Paddle-TRT] gather converter (#31640) · fe241fd0
  Committed by zlsh80826 on Mar 18, 2021
```
* trt gather converter

* add trt gather unit_test
```
  fe241fd0
- [Paddle-TRT] support batch axis concatenation when using dynamic shape (#31627) · 4ea34278
  Committed by zlsh80826 on Mar 18, 2021
```
* support batch axis concatenation when using dynamic shape

* opteller can't return true early, or some test will not be executed
```
  4ea34278
12 Mar, 2021 1 commit
- Trt elementwise plugin serialize (#31587) · 50ac7dbf
  Committed by Shang Zhizhou on Mar 12, 2021
```
* add serialize unittest

* fix element_op trt plugin serialize bug
```
  50ac7dbf
10 Mar, 2021 1 commit
- fix ernie_varlen when cutting head (#31497) · f57739be
  Committed by Shang Zhizhou on Mar 10, 2021
  
  f57739be
03 Mar, 2021 1 commit
- TRT conv2d converter support SAME padding (#31379) · 32211fe9
  Committed by Pei Yang on Mar 03, 2021
  
  32211fe9
02 Mar, 2021 3 commits
- change prelu plugin to tensorRT layer (#30210) · 77c44e2f
  Committed by Shang Zhizhou on Mar 02, 2021
  
  77c44e2f
- add n-d input support for trt scale converter (#31316) · 2e9e3fad
  Committed by Pei Yang on Mar 02, 2021
```
* add n-d input support for trt scale converter

* add flatten for ut

* fix dims
```
  2e9e3fad
- [ROCM] update fluid operators for rocm (part4), test=develop (#31225) · 72d99c5d
  Committed by Qi Li on Mar 02, 2021
  
  72d99c5d
24 Feb, 2021 1 commit

[Paddle-TRT] support group_norm (#31040) · 00b09e86

Committed by Pei Yang on Feb 24, 2021

* add group norm plugin

* fix compile problems

* move concat axis check to trt op teller

* add nbDims for scale and bias nv dims

* add group norm unit test

* fix unittest

* add trt version restriction for group norm op teller

* fix unittest

00b09e86

22 Feb, 2021 1 commit
- update trt int8 calibrator to IEntropyCalibratorV2 (#31060) · a5c56d83
  Committed by Shang Zhizhou on Feb 22, 2021
```
* update trt int8 calibrator to IEntropyCalibratorV2

* add delete opt_cache for trt_split_converter_test
```
  a5c56d83
19 Feb, 2021 1 commit
- update trt error message when input height or width is -1 (#31019) · 01ccfbcd
  Committed by Wilber on Feb 18, 2021
  
  01ccfbcd
18 Feb, 2021 1 commit
- add trt transpose and flatten converter (#31022) · 9b54fe41
  Committed by Pei Yang on Feb 18, 2021
  
  9b54fe41
04 Feb, 2021 2 commits
- fix split trt plugin initialize (#30875) · e6095bc2
  Committed by Shang Zhizhou on Feb 04, 2021
```
* fix split trt plugin initialize

* update
```
  e6095bc2
- use iwyu clean include second time, test=develop (#30829) · 35c5b23f
  Committed by wanghuancoder on Feb 04, 2021
```
* use iwyu clean include second time, test=develop
```
  35c5b23f
02 Feb, 2021 1 commit
- fix trt plugin clone and initialize bugs in TRT7.1+ (#30709) · b9094509
  Committed by Shang Zhizhou on Feb 02, 2021
```
* fix trt plugin clone and initialize bugs

* fix unit test error

* enable trt in ci py3

* update unittest timeout
```
  b9094509
25 Jan, 2021 1 commit

add DLA support: C++ && Python API (#30165) · ae0f88a9

Committed by Shang Zhizhou on Jan 25, 2021

* add dla

* add dla done

* add python api
Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>

ae0f88a9

19 Jan, 2021 1 commit
- unify calling cudaSetDevice (#30470) · 81217a94
  Committed by Leo Chen on Jan 19, 2021
```
* unify calling cudaSetDevice

* fix compile
```
  81217a94
13 Jan, 2021 1 commit

Added support for inference using quantization aware trained dygraph (#30288) · 7bbf3ac5

Committed by alncat on Jan 13, 2021

* added support for inference using quantization aware trained dygraph

* added support for inference using quantization aware trained dygraph
correct boost get usage

* Delete incorrect warning message (#30196)

* fix warning and no grad

* clean redundant API alias in 2.0 - part 2 (#30013)

* delete paddle.nn.functional.assign

* fix dynamic to static error

* just add the op error message for the matmul xpu (#30246)

 add the op error message for the matmul xpu

* Add Static Variable Clone (#30208)

Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat

* use wget to replace curl to download the lcov file (#30229)

* use wget to replace curl to download the lcov file

* add cache for lcov

* fix test_pool3d_op timeout issue (#30248)

* Fix unittests bugs. (#30250)

* modify error message based on comments (#30189)

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

* Fix bug for 'save multiple method' (#30218)

* Fix bug for 'save multiple method'

* To pass coverage.

* edit code to pass coverage.

* edit code to pass coverage.

* add unittest for coverage.

* change for coverage.

* edit for coverage.

* added support for inference using quantization aware trained dygraph

* Alias from  paddle.fluid.layers.auc to paddle.static.auc (#30206)

* add alias from  fluid.layers.auc to static.auc

* Update __init__.py

* added support for inference using quantization aware trained dygraph
correct boost get usage

* corrected boost get usage

* corrected naming issues and enforcing zero check

* correct paddle enforce message

* added more error checkings

* corrected error report message and optimized code

* corrected findvar usage

* corrected paddle_enforce in scope

* correct error messages

* correct error reporting format
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Bai Yifan <me@ethanbai.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>

7bbf3ac5

11 Jan, 2021 1 commit

modify error message based on comments (#30189) · 66dc4ac7

Committed by WeiXin on Jan 11, 2021

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

66dc4ac7

08 Jan, 2021 1 commit
- fix windows compile when WITH_PYTHON=ON and WITH_TENSORRT=ON (#30194) · 01a287bf
  Committed by Wilber on Jan 08, 2021
  
  01a287bf
08 Dec, 2020 1 commit
- change hard_swish from plugin to layer (#29177) · 2480bdef
  Committed by Pei Yang on Dec 08, 2020
```
* change hard_swish from plugin to layer

* add ut when threshold != scale
```
  2480bdef
07 Dec, 2020 1 commit
- support clip op trt converter (#29411) · f860de4a
  Committed by Pei Yang on Dec 07, 2020
  
  f860de4a
27 Nov, 2020 1 commit

detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01

Committed by Shang Zhizhou on Nov 27, 2020

* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake

* compile with cuda9

* add some unittest

* notest;test=coverage

* add unittest for trt plugin swish && split

* update ernie unittest

* fix some error message

* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter

* fix compile error when CUDA_ARCH_NAME < Pascal

* fix compile error

* update unittest timeout

* compile with cuda9

* update error msg

* fix code style

* add some comments

* add define IF_CUDA_ARCH_SUPPORT_FP16

* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED

b9e76a01

23 Nov, 2020 1 commit
- change avg pooling and global pooling to trt layer in dynamic shape mode (#28702) · 994673bf
  Committed by Pei Yang on Nov 23, 2020
```
* change avg pooling and global pooling to trt layer

* add support for static shape global pooling

* modify trt errmsg
```
  994673bf
12 Nov, 2020 1 commit
- TRT support for pruned transformer models; fix bug where tensorRT does not support DeletePass (#28517) · 8699f38d
  Committed by Shang Zhizhou on Nov 12, 2020
```
* skip_layernorm_op done

* add unittest

* slice op converter support trt < 6

* skip_layernorm only work in ernie
```
  8699f38d
03 Nov, 2020 1 commit

Optimize ernie model inference performance in TensorRT; support variable-length input (#28367) · ea851796

Committed by Shang Zhizhou on Nov 03, 2020

* fp16 result ok

* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS

* auto detect special slice op converter for ernie with trt oss

* ernie oss only support fp16

* fix special_slice_plugin serialize bug

* matmul in tensorrt ok

* ernie unittest ok

* add matmul tensorrt unittest

* remove demo code

ea851796

21 Oct, 2020 1 commit
- change avg pooling from trt plugin to trt layer (#28032) · 602d2ce5
  Committed by Pei Yang on Oct 21, 2020
  
  602d2ce5

机器未来 / Paddle is in sync with its fork source project
