Commits · 1def9e05656496c15f24dd134c8f669d23923a8e · 机器未来 / Paddle

27 Nov, 2020 1 commit

detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01

Committed by Shang Zhizhou on Nov 27, 2020

* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake

* compile with cuda9

* add some unittest

* notest;test=coverage

* add unittest for trt plugin swish && split

* update ernie unittest

* fix some error message

* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter

* fix compile error when CUDA_ARCH_NAME < Pascal

* fix compile error

* update unittest timeout

* compile with cuda9

* update error msg

* fix code style

* add some comments

* add define IF_CUDA_ARCH_SUPPORT_FP16

* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED


15 Sep, 2020 1 commit

Optimize slice trt plugin (#26970) · 47fdc60e

Committed by Shang Zhizhou on Sep 15, 2020

* optimize slice TRT plugin

This patch removes the unnecessary barrier on the transfer of the needed
offsets, so the data transfer can overlap with GPU kernel execution.

This patch also fixes the incorrect name of the slice plugin, replacing
"layernorm" with "slice".

test=develop

* add serialize/deserialize to slice plugin

* add static shape slice trt plugin

* fix slice trt op converter dynamic shape bug

* fix format by clang-format

* fix pylint format error

* fix problems commented by peiyang

Co-authored-by: Ryan Jeng <rjeng@nvidia.com>


15 Jun, 2020 1 commit

[Paddle-TRT] slice kernel optimization (#24783) · 49e4ee27

Committed by zlsh80826 on Jun 15, 2020

* parallel move shared data test=develop

* test=develop

23 Apr, 2020 1 commit

[BUG]: Head number can only be > 1 on multihead op (#23974) · 35148d17

Committed by Zhaolong Xing on Apr 23, 2020

* support the head number == 1
test=develop

* fix slice op error.
test=develop

19 Apr, 2020 1 commit

[Ernie TRT]: add slice op and add emb eltwise layernorm fp16 support (#23723) · 133f1fc1

Committed by Zhaolong Xing on Apr 19, 2020

* refine ernie trt dynamic shape support
1. add slice op converter
2. add emb eltwise layernorm fp16 support
test=develop

* fix dynamic shape test ut
test=develop

* fix comments.
test=develop

* fix comments
test=develop

