- 13 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* TRT support for pruned transformer models; fix the bug that TensorRT does not support DeletePass (#28517)
* skip_layernorm_op done
* add unittest
* slice op converter supports trt < 6
* skip_layernorm only works in ernie
* fix unittest
* fix unittest
-
- 05 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* Fix TRT plugin registry without TRT lib (#25982)
* fix trt plugin registry without trt lib
* support trt4
* refine code style
* pick ea851796 from develop
* cherry-pick develop PR #26273 && #27796
* fix unittest error
* fix unittest error
* remove const_cast
Co-authored-by: Pei Yang <peiyang@baidu.com>
-
- 23 Sep, 2020 1 commit
-
-
Committed by Pei Yang
* optimize slice TRT plugin. This patch removes an unnecessary barrier on the transfer of the needed offsets, so the data transfer can overlap with GPU kernel execution. It also fixes the incorrect name of the slice plugin by replacing "layernorm" with "slice". test=develop
* add serialize/deserialize to slice plugin
* add static shape slice trt plugin
* fix slice trt op converter dynamic shape bug
* fix format by clang-format
* fix pylint format error
* fix problems commented by peiyang
Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
Co-authored-by: Shang Zhizhou <shangzhizhou@baidu.com>
-
- 21 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-
- 18 Sep, 2020 1 commit
-
-
Committed by Pei Yang
* [Paddle-TRT] Stack op plugin (#25605)
* add stack_op to CMakeLists
* add dim=3 support for scale op
* add trt stack op, test=develop
* remove debug message
* add stack plugin serialize
* remove slice, scale op, will add later
* enhance error message
* revise trt ernie test to cover the stack op CI test, test=develop
* add stack op serialization
* fix test shape after adding stack op
* remove slice op, will add after implementing serialization
* roll back to min_graph=5 to avoid using slice op
* fix scale op output layer
* implement stack op createPlugin
* use workspace and move the definition to .cu
* move stack plugin creator definition to .cu, test=develop
* sync ut with develop
Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
-
- 01 Sep, 2020 1 commit
-
-
Committed by Pei Yang
This commit fixes the compile error caused by holding IOptimizationProfile in a unique_ptr. IOptimizationProfile has a protected destructor and its lifetime is controlled by TensorRT internally, so the application should not delete a pointer to it. See the TensorRT documentation: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_builder.html#a9ac47e100454151d8206ac91d543299a test=develop
Co-authored-by: Jeng Bai-Cheng <jeng1220@users.noreply.github.com>
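The ownership problem described in this commit can be illustrated with a minimal sketch. `Profile` and `Builder` below are hypothetical stand-ins, not the TensorRT classes; they only mimic the pattern of an interface with a protected destructor whose objects are owned by the library. The sketch assumes nothing about TensorRT beyond what the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a library-owned interface (NOT the TensorRT class).
class Profile {
 public:
  void SetMinBatch(int n) { min_batch_ = n; }
  int min_batch() const { return min_batch_; }

 protected:
  ~Profile() = default;  // protected: application code cannot destroy a Profile

 private:
  int min_batch_ = 0;
};

// The destructor is inaccessible from outside, so a std::unique_ptr<Profile>
// with the default deleter could not be destroyed and would fail to compile.
static_assert(!std::is_destructible<Profile>::value,
              "Profile must not be destructible by application code");

// Hypothetical owner: creates the concrete objects and frees them itself.
class Builder {
 public:
  // Returns a non-owning pointer; the Builder keeps ownership.
  Profile* CreateProfile() {
    profiles_.push_back(std::unique_ptr<Impl>(new Impl()));
    return profiles_.back().get();
  }
  std::size_t profile_count() const { return profiles_.size(); }

 private:
  // The derived type has a public destructor and may call the protected one.
  struct Impl final : Profile {};
  std::vector<std::unique_ptr<Impl>> profiles_;
};
```

In application code the pointer returned by `CreateProfile()` would be stored as a plain `Profile*` and never deleted, which matches the fix the commit message describes.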
-
- 11 Aug, 2020 1 commit
-
-
Committed by Pei Yang
* add macro check for using TRT api dynamicRangeIsSet() (#25694)
* adjust minimum trt version for hard_sigmoid converter to 5130. test=develop (#24746)
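A version guard of the kind these two changes describe might look like the sketch below. Every `DEMO_` macro is a hypothetical stand-in defined here for illustration, and the encoding (major * 1000 + minor * 100 + patch * 10 + build) is an assumption inferred from the value 5130 reading as version 5.1.3.0; the actual macros used in the code base may differ.

```cpp
#include <cassert>

// Hypothetical stand-ins for version macros that a TensorRT header would supply.
#define DEMO_TRT_MAJOR 5
#define DEMO_TRT_MINOR 1
#define DEMO_TRT_PATCH 3
#define DEMO_TRT_BUILD 0

// Assumed encoding: 5.1.3.0 becomes 5130.
#define DEMO_TRT_VERSION                                             \
  (DEMO_TRT_MAJOR * 1000 + DEMO_TRT_MINOR * 100 + DEMO_TRT_PATCH * 10 + \
   DEMO_TRT_BUILD)
#define DEMO_TRT_VERSION_GE(v) (DEMO_TRT_VERSION >= (v))

// Compiled in only when the library is at least 5.1.3.0.
inline bool HardSigmoidConverterAvailable() {
#if DEMO_TRT_VERSION_GE(5130)
  return true;
#else
  return false;
#endif
}

// Guards a call that is assumed to exist only in a newer release.
inline bool NewerApiAvailable() {
#if DEMO_TRT_VERSION_GE(6000)
  return true;
#else
  return false;
#endif
}
```

Doing the check in the preprocessor means code that references an API missing from an older library version is never compiled, rather than failing at build time.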
-
- 06 Aug, 2020 2 commits
-
-
Committed by Pei Yang
* solve conflict
* fix crash when trt not found in python; update unittest model path
-
Committed by Pei Yang
* fix multihead matmul instability test=develop
* fix multihead matmul bug test=develop
* fix coverage problem test=develop
Co-authored-by: Zhaolong Xing <nhzlx.dragon@gmail.com>
-
- 04 Aug, 2020 1 commit
-
-
Committed by Pei Yang
-
- 06 Jul, 2020 1 commit
-
-
Committed by Zhaolong Xing
test=release/1.8
-
- 01 Jul, 2020 2 commits
- 15 May, 2020 1 commit
-
-
Committed by Pei Yang
test=develop
Co-authored-by: nhzlx <nhzlx.dragon@gmail.com>
-
- 23 Apr, 2020 1 commit
-
-
Committed by Zhaolong Xing
* Cherry-pick: [Ernie TRT]: add slice op and add emb eltwise layernorm fp16 support (#23723)
* refine ernie trt dynamic shape support: 1. add slice op converter 2. add emb eltwise layernorm fp16 support test=develop
* fix dynamic shape test ut test=develop
* fix comments. test=develop
* fix comments test=develop
* cherry-pick [BUG]: Head number can only be > 1 on multihead op (#23974)
* support the head number == 1 test=develop
* fix slice op error. test=develop
* cherry-pick: disable trt test, test=develop (#23984) test=release/2.0-beta
-
- 21 Apr, 2020 1 commit
-
-
Committed by Zhou Wei
* cherry-pick: optimize the error messages of paddle CUDA API
* fix the error messages of paddle CUDA API
* refactor PADDLE_ENFORCE_CUDA_SUCCESS and apply it to curand/cudnn/cublas/NCCL
* remove build_ex_string
-
- 17 Apr, 2020 3 commits
- 12 Apr, 2020 1 commit
-
-
Committed by Zhaolong Xing
* add elementwise pool2d, prelu, shuffle channel test=develop
* add scale and refine concat eltwise converter test=develop
* refine elementwise converter test=develop
* refine ut test and enforce error. test=develop
* modify const cast test=develop
-
- 11 Apr, 2020 1 commit
-
-
Committed by Zhaolong Xing
* refine trt converter logs for act, conv2d, pool2d, fc test=develop
* fix comments test=develop
-
- 10 Apr, 2020 1 commit
-
-
Committed by Pei Yang
-
- 08 Apr, 2020 4 commits
- 01 Apr, 2020 1 commit
-
-
Committed by Zhaolong Xing
test=develop
-
- 26 Mar, 2020 1 commit
-
-
Committed by Zhaolong Xing
* add dynamic plugin support. test=develop
* change emb eltwise layernorm to math function test=develop
* add emb eltwise layernorm test=develop
* can run dynamic shape ernie test=develop
* fix ci test=develop
* add ut for trt ernie dynamic test=develop
* refine dynamic shape c++ interface. test=develop
* fix comments test=develop
* fix comments test=develop
-
- 17 Mar, 2020 1 commit
-
-
Committed by Pei Yang
-
- 09 Mar, 2020 1 commit
-
-
Committed by Zhaolong Xing
* change the ci trt from version 5. to 6.0
* paddle-trt dynamic shape support init
* conv+bias or conv+bn dynamic shape support test=develop
* modify trt engine opconvert test=develop
* fix ci error test=develop
-
- 23 Feb, 2020 1 commit
-
-
Committed by tianshuo78520a
-
- 10 Feb, 2020 1 commit
-
-
Committed by Zhaolong Xing
[Refine Paddle-TRT INT8]: Support PaddleSlim's Resnet50, Mobilenetv1, Yolov3 models for Inference. (#22483)
* add int8 op teller for trt.
* refine trt int8
* add int8 op teller for trt. test=develop
-
- 05 Feb, 2020 1 commit
-
-
Committed by Zhaolong Xing
* add mutex for trt engine test=develop
* add the test for copy_to_cpu test=develop
-
- 14 Jan, 2020 1 commit
-
-
Committed by zhouwei25
faster build by reducing by-products and linked libraries; fix compile warnings under std=c++11 (#22164)
-
- 08 Jan, 2020 1 commit
-
-
Committed by Pei Yang
-
- 07 Jan, 2020 1 commit
-
-
Committed by Pei Yang
* add TRT support for instance_norm op
-
- 06 Jan, 2020 1 commit
-
-
Committed by Pei Yang
* add gelu plugin
* align trt bert with gpu
* add support for fused fc with relu
* add unittest for bert trt
-
- 04 Dec, 2019 1 commit
-
-
Committed by Zhaolong Xing
test=develop
-
- 20 Nov, 2019 1 commit
-
-
Committed by Pei Yang
Added the splitter "__" between the weight name and the suffix number to avoid name conflicts.
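A short sketch shows why a splitter is needed here. `MakeWeightName` is a hypothetical helper written for illustration, and the weight names are made up; the point is only that plain concatenation of a name and a number can map two different pairs to the same string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: joins a weight name and a numeric suffix,
// optionally placing a splitter between them.
inline std::string MakeWeightName(const std::string& name, int suffix,
                                  const std::string& splitter) {
  return name + splitter + std::to_string(suffix);
}
```

Without a splitter, ("fc_1", 12) and ("fc_11", 2) both produce "fc_112". With "__" they become "fc_1__12" and "fc_11__2", which no longer collide.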
-
- 18 Nov, 2019 1 commit
-
-
Committed by Zhaolong Xing
* refine trt int8 for dynamic range set test=develop
* refine trt int8 test=develop
-