- 11 Oct, 2021 1 commit
-
-
Committed by baoachun
* add skip case in trt converter ut * disable group_norm trt plugin
-
- 30 Sep, 2021 1 commit
-
-
Committed by wenbin
-
- 17 Sep, 2021 2 commits
-
-
Committed by feng_shuai
* broadcast qkv_op * use PADDLE_ENFORCE_GT to replace assert
-
Committed by 津
-
- 08 Sep, 2021 1 commit
-
-
Committed by Shang Zhizhou
* update slice plugin * add test * fix code style * fix trt6 * update test * fix test * add timeout * update trt version * update cmake
-
- 23 Aug, 2021 1 commit
-
-
Committed by wenbin
-
- 20 Jul, 2021 1 commit
-
-
Committed by zlsh80826
* add trt noexcept definition * add trt noexcept on trt plugin * add trt noexcept on trt int8 calibrator * remove noexcept on base serialize * add trt noexcept on split plugin * add trt noexcept on elementwise plugin * add trt noexcept on prelu plugin * add trt noexcept on pool plugin * add trt noexcept on swish plugin * add trt noexcept on gelu plugin * add trt noexcept on layer norm plugin * add trt noexcept on instance norm plugin * add trt noexcept on emb eltwise layernorm plugin * add trt noexcept on qkv2context plugin * add trt noexcept on skip layernorm plugin * add trt noexcept on slice plugin * add trt noexcept on hard swish plugin * add trt noexcept on stack plugin * add trt noexcept on special slice plugin * add trt noexcept on anchor generator plugin * add trt noexcept on yolobox plugin * add trt noexcept on roi align plugin * add trt noexcept on gather nd plugin
-
- 12 Jul, 2021 1 commit
-
-
Committed by zlsh80826
* add trt LT version helper * upgrade PluginTensorRT to IPluginV2Ext * trt plugin factory is not usable in IPluginV2 * upgrade add plugin api to use IPluginV2 * remove IPlugin register and adapt getSerializeSize(), serialize() * adapt IPluginV2Layer * downgrade to IPluginV2 * implement elementwise clone * add gelu plugin creator and fix gelu serialization bug * add swish plugin creator and fix swish serialization bug * format * fix typo * add elementwise plugin creator and fix serialization * add base creator class * add gelu plugin creator * add hard swish creator and fix serialization * add instance norm creator and fix serialization * add layer norm creator and fix serialization * add pool creator and fix serialization * add prelu creator and fix serialization * add slice creator and fix serialization * add swish creator and fix serialization * add instance norm op unittest * remove redundant api * fix wrong graph size to enable trt * instance norm function move to cc * add trt elementwise ut to trigger coverage * remove opt cache to hit serialization coverage * remove opt cache to hit serialization coverage * remove unused code * remove unused inputs_ * add dbg info * remove dbg info * add instance norm serialization * roll back * remove comment code * remove trt plugin registry * fix prelu dynamic serialization * add prelu ut and reduce the input size to reduce memory usage * fix pool dynamic plugin serialization and add ut * refine pool ut with subtest * add env for avoiding oom * reduce test input size & increase pool op ut to 45s * add the contributor * remove copyright (will add in contributor) * remove copyright (will add in contributor)
-
- 09 Jul, 2021 1 commit
-
-
Committed by zlsh80826
-
- 28 Jun, 2021 1 commit
-
-
Committed by zlsh80826
-
- 25 Jun, 2021 1 commit
-
-
Committed by wenbin
* qkv * ci_test
-
- 24 Jun, 2021 1 commit
-
-
Committed by zlsh80826
* add trt LT version helper * trt8 requires void** to be void* const*
-
- 21 Jun, 2021 1 commit
-
-
Committed by Pei Yang
-
- 15 Jun, 2021 1 commit
-
-
Committed by Shang Zhizhou
* 1, remove layernorm dynamic fp16; 2, let reshape out in dynamic shape * remove useless code
-
- 08 Jun, 2021 1 commit
-
-
Committed by Shang Zhizhou
* add dynamic layer_norm plugin * fix bug * fix numpy.allclose * fix format * fix code style * remove shape in dynamic shape * code format * remove layer norm fp16 * fix format
-
- 05 Jun, 2021 1 commit
-
-
Committed by Wilber
-
- 29 Apr, 2021 1 commit
-
-
Committed by zlsh80826
* implement MHA order same as training * fix fp16 compile issue on old architecture * fix format * fix format
-
- 06 Apr, 2021 1 commit
-
-
Committed by zlsh80826
* fix yolobox teller condition * fix cuda double free bug
-
- 02 Apr, 2021 2 commits
- 01 Apr, 2021 1 commit
-
-
Committed by zlsh80826
* add anchor generator op plugin * add anchor generator unit_test * remove dbg info * remove redundant line * replace assertion with paddle enforce * dynamic plugin replaces assertion with paddle enforce * anchor generator support dynamic shape on spatial axis * anchor generator test with fp16, dynamic shape * add anchor generator test all * add back main * reduce test input size to not exceed the timelimit of ci * change super to InferencePassTest for python2 compatibility * reuse paddle operator anchor generator * move creator construct to header with default * add cuda ifdef * reduce line * change super to InferencePassTest for python2 compatibility * fix anchor generator fp16 serialize setting * split unittest from test_all * restrict anchor generator input format before version 7234 * anchor generator only support greater than trt7.1 * change min_graph_size to 2 * min_graph size to 3 if dynamic shape * reduce dynamic shape size to avoid trt search tactic too long to exceed time limit * remove anchor from fetch list * anchor generator support all trt version * fix memory not allocated but if serialized
-
- 30 Mar, 2021 1 commit
-
-
Committed by zlsh80826
* yolobox converter and plugin * yolobox unittest * add dynamic shape restriction * fix git merge log
-
- 29 Mar, 2021 1 commit
-
-
Committed by zlsh80826
* add roi_align_plugin * add roi align unit_test * add roi align serialization * remove roi align static plugin because of batch dim issue * refine roi align unittest and add fp16/serialization * add trt roi align condition to op_teller * refine error message * remove unnecessary reshape layer
-
- 23 Mar, 2021 2 commits
-
-
Committed by Wilber
-
Committed by Shang Zhizhou
* fix tensorrt output variable reshape * move padding shape x 1 x 1 in ernie to qkv and fc * update layer name * fix softmax when input is dynamic, fc not padding any more * fix varlen * move fc x_dim assert to op_teller
-
- 12 Mar, 2021 1 commit
-
-
Committed by Shang Zhizhou
* add serialize unittest * fix element_op trt plugin serialize bug
-
- 10 Mar, 2021 1 commit
-
-
Committed by Shang Zhizhou
-
- 02 Mar, 2021 1 commit
-
-
Committed by Qi Li
-
- 04 Feb, 2021 2 commits
-
-
Committed by Shang Zhizhou
* fix split trt plugin initialize * update
-
Committed by wanghuancoder
* use iwyu clean include second time, test=develop
-
- 02 Feb, 2021 1 commit
-
-
Committed by Shang Zhizhou
* fix trt plugin clone and initialize bugs * fix unit test error * enable trt in ci py3 * update unittest timeout
-
- 27 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake * compile with cuda9 * add some unittest * notest;test=coverage * add unittest for trt plugin swish && split * update ernie unittest * fix some error message * remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter * fix compile error when CUDA_ARCH_NAME < Pascal * fix compile error * update unittest timeout * compile with cuda9 * update error msg * fix code style * add some comments * add define IF_CUDA_ARCH_SUPPORT_FP16 * rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
- 03 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* fp16 result ok * change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS * auto detect special slice op converter for ernie with trt oss * ernie oss only support fp16 * fix special_slice_plugin serialize bug * matmul in tensorrt ok * ernie unittest ok * add matmul tensorrt unittest * remove demo code
-
- 28 Sep, 2020 1 commit
-
-
Committed by Pei Yang
* add unittests and op version register for tensorrt_subgraph_pass * rename to test_trt_subgraph_pass.py * fix softmax converter diff when padding dim=1
-
- 24 Sep, 2020 1 commit
-
-
Committed by wanghuancoder
* use iwyu clean include, test=develop, test=win * compilation error, test=develop * fix compilation error2, test=develop * fix compilation error3, test=develop * fix compilation error4, test=develop * fix compilation error5, test=develop * fix compilation error6, test=develop * fix compilation error7, test=develop * fix compilation error8, test=develop * fix compilation error8, test=develop * fix compilation error10, test=develop * fix compilation error11, test=develop
-
- 23 Sep, 2020 1 commit
-
-
Committed by Chen Weihang
* polish some lost error msg * add some math file to white list * polish detail based on reviewer comment
-
- 22 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-
- 18 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-
- 15 Sep, 2020 1 commit
-
-
Committed by Shang Zhizhou
* optimize slice TRT plugin: this patch removes an unnecessary barrier on the data transfer of the needed offsets, so the transfer can overlap with GPU kernel execution. It also fixes the incorrect name of the slice plugin, replacing "layernorm" with "slice". test=develop * add serialize/deserialize to slice plugin * add static shape slice trt plugin * fix slice trt op converter dynamic shape bug * fix format by clang-format * fix pylint format error * fix problems commented by peiyang Co-authored-by: NRyan Jeng <rjeng@nvidia.com>
-
- 14 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-