- 08 Sep, 2021 1 commit
Committed by Shang Zhizhou
* update slice plugin
* add test
* fix code style
* fix trt6
* update test
* fix test
* add timeout
* update trt version
* update cmake
-
- 20 Jul, 2021 1 commit
Committed by zlsh80826
* add trt noexcept definition
* add trt noexcept on trt plugin
* add trt noexcept on trt int8 calibrator
* remove noexcept on base serialize
* add trt noexcept on split plugin
* add trt noexcept on elementwise plugin
* add trt noexcept on prelu plugin
* add trt noexcept on pool plugin
* add trt noexcept on swish plugin
* add trt noexcept on gelu plugin
* add trt noexcept on layer norm plugin
* add trt noexcept on instance norm plugin
* add trt noexcept on emb eltwise layernorm plugin
* add trt noexcept on qkv2context plugin
* add trt noexcept on skip layernorm plugin
* add trt noexcept on slice plugin
* add trt noexcept on hard swish plugin
* add trt noexcept on stack plugin
* add trt noexcept on special slice plugin
* add trt noexcept on anchor generator plugin
* add trt noexcept on yolobox plugin
* add trt noexcept on roi align plugin
* add trt noexcept on gather nd plugin
-
- 12 Jul, 2021 1 commit
Committed by zlsh80826
* add trt LT version helper
* upgrade PluginTensorRT to IPluginV2Ext
* trt plugin factory is not usable in IPluginV2
* upgrade add plugin api to use IPluginV2
* remove IPlugin register and adapt getSerializeSize(), serialize()
* adapt IPluginV2Layer
* downgrade to IPluginV2
* implement elementwise clone
* add gelu plugin creator and fix gelu serialization bug
* add swish plugin creator and fix swish serialization bug
* format
* fix typo
* add elementwise plugin creator and fix serialization
* add base creator class
* add gelu plugin creator
* add hard swish creator and fix serialization
* add instance norm creator and fix serialization
* add layer norm creator and fix serialization
* add pool creator and fix serialization
* add prelu creator and fix serialization
* add slice creator and fix serialization
* add swish creator and fix serialization
* add instance norm op unittest
* remove redundant api
* fix wrong graph size to enable trt
* instance norm function move to cc
* add trt elementwise ut to trigger coverage
* remove opt cache to hit serialization coverage
* remove opt cache to hit serialization coverage
* remove unused code
* remove unused inputs_
* add dbg info
* remove dbg info
* add instance norm serialization
* roll back
* remove comment code
* remove trt plugin registry
* fix prelu dynamic serialization
* add prelu ut and reduce the input size to reduce memory usage
* fix pool dynamic plugin serialization and add ut
* refine pool ut with subtest
* add env for avoiding oom
* reduce test input size & increase pool op ut to 45s
* add the contributor
* remove copyright (will add in contributor)
* remove copyright (will add in contributor)
-
- 28 Jun, 2021 1 commit
Committed by zlsh80826
-
- 24 Jun, 2021 1 commit
Committed by zlsh80826
* add trt LT version helper
* trt8 requires void** to be void* const*
-
- 27 Nov, 2020 1 commit
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
* compile with cuda9
* add some unittest
* notest;test=coverage
* add unittest for trt plugin swish && split
* update ernie unittest
* fix some error message
* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter
* fix compile error when CUDA_ARCH_NAME < Pascal
* fix compile error
* update unittest timeout
* compile with cuda9
* update error msg
* fix code style
* add some comments
* add define IF_CUDA_ARCH_SUPPORT_FP16
* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
- 15 Sep, 2020 1 commit
Committed by Shang Zhizhou
* optimize slice TRT plugin
  This patch removes an unnecessary barrier on the data transfer of the needed offsets, so the data transfer can overlap with GPU kernel execution. It also fixes the incorrect name of the slice plugin by replacing "layernorm" with "slice".
  test=develop
* add serialize/deserialize to slice plugin
* add static shape slice trt plugin
* fix slice trt op converter dynamic shape bug
* fix format by clang-format
* fix pylint format error
* fix problems commented by peiyang
Co-authored-by: NRyan Jeng <rjeng@nvidia.com>
-
- 15 Jun, 2020 1 commit
Committed by zlsh80826
* parallel move shared data test=develop
* test=develop
-
- 23 Apr, 2020 1 commit
Committed by Zhaolong Xing
* support the head number == 1 test=develop
* fix slice op error. test=develop
-
- 19 Apr, 2020 1 commit
Committed by Zhaolong Xing
* refine ernie trt dynamic shape support
  1. add slice op converter
  2. add emb eltwise layernorm fp16 support
  test=develop
* fix dynamic shape test ut test=develop
* fix comments. test=develop
* fix comments test=develop
-