Commits · fe00d32aa53aab390c366e80aeeb312dd1a97e25 · 机器未来 / Paddle

24 Feb, 2021 · 1 commit
- P
  
  [Paddle-TRT] support group_norm (#31040) (#31188) · fe00d32a
  Committed by Pei Yang on Feb 24, 2021
  
  fe00d32a
23 Feb, 2021 · 2 commits
- P
  
  add trt transpose and flatten converter (#31022) (#31139) · 20e68a22
  Committed by Pei Yang on Feb 23, 2021
  
  20e68a22
- S
  
  update merge pr #31060 (update trt int8 calibrator to IEntropyCalibratorV2) (#31121) · 1d2bd35e
  Committed by Shang Zhizhou on Feb 23, 2021
  
  1d2bd35e
19 Feb, 2021 · 1 commit
- W
  
  cherry-pick pr (#31043) · 656124da
  Committed by Wilber on Feb 19, 2021
  
  656124da
05 Feb, 2021 · 1 commit
- S
  fix trt plugin clone and initialize bugs in TRT7.1+ (#30709) (#30822) · a64bea0c
  Committed by Shang Zhizhou on Feb 05, 2021
```
Co-authored-by: tianshuo78520a <707759223@qq.com>
```
  a64bea0c
02 Feb, 2021 · 1 commit

add DLA support: C++ && Python API (#30165) (#30810) · a8dfff99

Committed by Shang Zhizhou on Feb 02, 2021

* add dla

* add python api
Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>

a8dfff99

14 Jan, 2021 · 1 commit
- A
  
  Added support for inference using quantization aware trained dygraph (#30288) (#30402) · 38faed7f
  Committed by alncat on Jan 14, 2021
  
  38faed7f
12 Jan, 2021 · 1 commit

[2.0 Cherry-pick]fix 2.0 error message (#30332) · df67b317

Committed by swtkiwi on Jan 12, 2021

* fix datanorm error msg (#30294)

* Optimize the error message of framework. (#30134)

* modify error message based on comments (#30189)

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

* fix enforce msg of sum xpu op (#30113)

* enhance error info for py_func (#30138)

* enhance error info for py_func

* update

* fix elugradgrad test fail & error message opt (#30171)

* fix elugradgrad test fail and error message opt

* fix unittest, test=develop

* Update prroi_pool_op.h

fix error message

* opt message,test=develop

* fix ci fail,test=develop

* Refine PADDLE_ENFORCE Error Messages. test=develop (#30149)

Improve some error messages in parallel_executor.cc, conditional_block_op.cc, recurrent_op.cc

* enhance error message, test=develop (#30220)

* fix error message for distribute_fpn_proposals_op (#30116)

* enhance error msgs of fusion_seqpool_cvm_concat_op.cc, test=develop (#30240)

* just add the op error message for the matmul xpu (#30246)

 add the op error message for the matmul xpu

* enhance error message of nll_loss op test=develop (#30125)

* enhance error message of nll_loss op test=develop
Co-authored-by: yaoxuefeng <yaoxuefeng@baidu.com>
Co-authored-by: xiemoyuan <71377852+xiemoyuan@users.noreply.github.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jack Zhou <zhoushunjie@baidu.com>
Co-authored-by: Wilber <jiweibo@baidu.com>
Co-authored-by: Double_V <liuvv0203@163.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: zhang wenhui <frankwhzhang@126.com>
Co-authored-by: wangguanzhong <jerrywgz@126.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: lijianshe02 <48898730+lijianshe02@users.noreply.github.com>

df67b317

11 Jan, 2021 · 1 commit
- W
  
  Cherry-pick 30194 30164 30201(#30202) · 36de178a
  Committed by Wilber on Jan 11, 2021
  
  36de178a
09 Dec, 2020 · 2 commits
- P
  
  support clip op trt converter (#29411) (#29496) · 4d51cd73
  Committed by Pei Yang on Dec 09, 2020
  
  4d51cd73
- P
  
  conflict (#29498) · d5ff367b
  Committed by Pei Yang on Dec 09, 2020
  
  d5ff367b
27 Nov, 2020 · 1 commit

detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01

Committed by Shang Zhizhou on Nov 27, 2020

* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake

* compile with cuda9

* add some unittest

* notest;test=coverage

* add unittest for trt plugin swish && split

* update ernie unittest

* fix some error message

* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter

* fix compile error when CUDA_ARCH_NAME < Pascal

* fix compile error

* update unittest timeout

* compile with cuda9

* update error msg

* fix code style

* add some comments

* add define IF_CUDA_ARCH_SUPPORT_FP16

* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED

b9e76a01

23 Nov, 2020 · 1 commit
- P
  change avg pooling and global pooling to trt layer in dynamic shape mode (#28702) · 994673bf
  Committed by Pei Yang on Nov 23, 2020
```
* change avg pooling and global pooling to trt layer

* add support for static shape global pooling

* modify trt errmsg
```
  994673bf
12 Nov, 2020 · 1 commit
- S
  TRT support for pruned transformer models; fix the bug that TensorRT does not support DeletePass (#28517) · 8699f38d
  Committed by Shang Zhizhou on Nov 12, 2020
```
* skip_layernorm_op done

* add unittest

* slice op converter support trt < 6

* skip_layernorm only work in ernie
```
  8699f38d
03 Nov, 2020 · 1 commit

Optimize inference performance of the ernie model in TensorRT and support variable-length input (#28367) · ea851796

Committed by Shang Zhizhou on Nov 03, 2020

* fp16 result ok

* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS

* auto detect special slice op converter for ernie with trt oss

* ernie oss only support fp16

* fix special_slice_plugin serialize bug

* matmul in tensorrt ok

* ernie unittest ok

* add matmul tensorrt unittest

* remove demo code

ea851796

21 Oct, 2020 · 1 commit
- P
  
  change avg pooling from trt plugin to trt layer (#28032) · 602d2ce5
  Committed by Pei Yang on Oct 21, 2020
  
  602d2ce5
13 Oct, 2020 · 1 commit
- S
  add info log for trt input dynamic shape check (#27796) · bbc837ee
  Committed by Shang Zhizhou on Oct 13, 2020
```
* add info log for trt input dynamic shape check

* fix error msg error
```
  bbc837ee
28 Sep, 2020 · 1 commit

Add unittests and OP version registry for tensorrt_subgraph_pass (#27544) · ae6e40a7

Committed by Pei Yang on Sep 28, 2020

* add unittests and op version register for tensorrt_subgraph_pass

* rename to test_trt_subgraph_pass.py

* fix softmax converter diff when padding dim=1

ae6e40a7

24 Sep, 2020 · 1 commit

use iwyu clean include (#27267) · df43905f

Committed by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

df43905f

23 Sep, 2020 · 1 commit

Polish some lost invalid error message (#27445) · 76506447

Committed by Chen Weihang on Sep 23, 2020

* polish some lost error msg

* add some math file to white list

* polish details based on reviewer comments

76506447

22 Sep, 2020 · 1 commit
- P
  
  errmsg refine of trt plugin (#27309) · fda54c02
  Committed by Pei Yang on Sep 22, 2020
  
  fda54c02
18 Sep, 2020 · 1 commit
- P
  
  Optimize emb_eltwise_layernorm_plugin and support fp16 (#27128) · a5ef246c
  Committed by Pei Yang on Sep 18, 2020
  
  a5ef246c
15 Sep, 2020 · 2 commits

Optimize slice trt plugin (#26970) · 47fdc60e

Committed by Shang Zhizhou on Sep 15, 2020

* optimize slice TRT plugin

This patch removes an unnecessary barrier on the data transfer of the needed offset,
so the data transfer can overlap with GPU kernel execution.

This patch also fixes the incorrect name of the slice plugin: it replaces
"layernorm" with "slice".

test=develop

* add serialize/deserialize to slice plugin

* add static shape slice trt plugin

* fix slice trt op converter dynamic shape bug

* fix format by clang-format

* fix pylint format error

* fix problems commented by peiyang
Co-authored-by: Ryan Jeng <rjeng@nvidia.com>

47fdc60e

Optimize error report (#27254) · e6e2e537

Committed by Shang Zhizhou on Sep 15, 2020

* optimize error report

* add test case for pad op converter

* fix some spelling mistakes commented by peiyang

e6e2e537

14 Sep, 2020 · 1 commit
- P
  
  refine error message related to paddle-TRT (#27256) · aae41c6f
  Committed by Pei Yang on Sep 14, 2020
  
  aae41c6f
02 Sep, 2020 · 1 commit
- Z
  fix pool trt plugin bug (#26463) · 932bbe95
  Committed by Zhaolong Xing on Sep 02, 2020
```
test=develop
```
  932bbe95
01 Sep, 2020 · 1 commit

[Paddle-TRT] Stack op plugin (#25605) · ad6e3dd6

Committed by zlsh80826 on Sep 01, 2020

* add stack_op to CMakeLists

* add dim=3 support for scale op

* add trt stack op, test=develop

* remove debug message

* add stack plugin serialize

* remove slice, scale op, will add later

* enhance error message

* revise trt ernie test to cover the stack op CI test, test=develop

* add stack op serialization

* fix test shape after adding stack op

* remove slice op, will add after implementing serialization

* roll back to min_graph=5 to avoid using slice op

* fix scale op output layer

* implement stack op createPlugin

* use workspace and move the definition to .cu

* move stack plugin creator definition to .cu, test=develop

ad6e3dd6

31 Aug, 2020 · 1 commit
- P
  [Paddle-TRT] TRT dynamic shape support PaddleSlim quant models (#26536) · 78a530c2
  Committed by Pei Yang on Aug 31, 2020
```
* support trt dynamic shape int8

* add unittest

* add support for sigmoid; adapt to trt6+ api
```
  78a530c2
30 Aug, 2020 · 1 commit
- Z
  
  fix a skip_layernorm bug, test=develop (#26800) · ac63c7cd
  Committed by zlsh80826 on Aug 30, 2020
  
  ac63c7cd
28 Aug, 2020 · 1 commit
- P
  
  trt int8 support conv2d_transpose (#26636) · e3f8e5cf
  Committed by Pei Yang on Aug 28, 2020
  
  e3f8e5cf
21 Aug, 2020 · 1 commit
- P
  
  add output scale and trt op teller support for hard_swish and hard_sigmoid (#26499) · 379222c3
  Committed by Pei Yang on Aug 21, 2020
  
  379222c3
19 Aug, 2020 · 1 commit
- Z
  fix dy shape bug in trt7.1 (#26273) · b7a86e92
  Committed by Zhaolong Xing on Aug 19, 2020
```
test=develop
```
  b7a86e92
07 Aug, 2020 · 1 commit
- P
  Fix TRT plugin registry without TRT lib (#25982) · beb0ca5f
  Committed by Pei Yang on Aug 07, 2020
```
* fix trt plugin registry without trt lib

* support trt4

* refine code style
```
  beb0ca5f
05 Aug, 2020 · 1 commit

Fix registering trt plugin (#25744) · b717895f

Committed by Pei Yang on Aug 05, 2020

* develop dynamic shape serialization

* add test param for gelu

* fix bugs

* delete redundant comments

* debug

* fix conflict. test=develop

* fix bug. test=develop

* add trt dynamic shape serialized support

* fix ernie serialized bug
test=develop

* fix codestyle
test=develop

* fix bug
test=develop

* fix bug.test=develop

* modify cmakelist test=develop

* fix bug
test=develop

* fix error message.  test=develop

* fix trt register plugin based on pr#25003

* add trt dynload

* fix deserialization bug of not finding plugin registration

* refine code style

* recover engine key in tensorrt_subgraph_pass

* for ci coverage

* add unittest for deserialization
Co-authored-by: haozech <chenhaoze94@gmail.com>

b717895f

03 Aug, 2020 · 1 commit
- P
  
  add trt int8 support for elementwise_mul and scale (#25676) · 9e9a569d
  Committed by Pei Yang on Aug 03, 2020
  
  9e9a569d
28 Jul, 2020 · 2 commits
- P
  
  add macro check for using TRT api dynamicRangeIsSet() (#25694) · eef98b7f
  Committed by Pei Yang on Jul 28, 2020
  
  eef98b7f
- P
  
  fix trt instance norm plugin on gcc8. test=develop (#25730) · f82baed8
  Committed by Pei Yang on Jul 28, 2020
  
  f82baed8
10 Jul, 2020 · 1 commit

Improve qkv transpose performance (#23919) · fc93266b

Committed by Jeng Bai-Cheng on Jul 10, 2020

Use the vector instruction (LDG.128) to improve the qkv transpose. It
provides a 1.4X speedup at the same GPU base frequency.
test=develop

fc93266b

07 Jul, 2020 · 1 commit

[Fix BUGs]: fix multihead matmul pass's instability bug (#25123) · 7b7e6051

Committed by Zhaolong Xing on Jul 07, 2020

* fix multihead matmul's instability
test=develop

* fix multihead matmul bug
test=develop

* fix coverage problem
test=develop

7b7e6051

23 Jun, 2020 · 1 commit

[Paddle-TRT] Better Paddle-TensorRT support for PaddleSlim quant models (#25097) · b2f5a149

Committed by Pei Yang on Jun 23, 2020

* Paddle-TensorRT support slim QAT. test=develop

* add comments. test=develop

* use RenameInput instead of ResetInputs. test=develop

b2f5a149

机器未来 / Paddle is up to date with the fork source project