- 24 Feb, 2021 1 commit
-
-
Committed by Pei Yang
-
- 23 Feb, 2021 2 commits
-
-
Committed by Pei Yang
-
Committed by Shang Zhizhou
-
- 19 Feb, 2021 1 commit
-
-
Committed by Wilber
-
- 05 Feb, 2021 1 commit
-
-
Committed by Shang Zhizhou
Co-authored-by: tianshuo78520a <707759223@qq.com>
-
- 02 Feb, 2021 1 commit
-
-
Committed by Shang Zhizhou
* add dla
* add python api
Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>
-
- 14 Jan, 2021 1 commit
-
-
Committed by alncat
-
- 12 Jan, 2021 1 commit
-
-
Committed by swtkiwi
* fix datanorm error msg (#30294)
* Optimize the error message of framework. (#30134)
* modify error message based on comments (#30189)
* modify error message based on comments
* edit code according to review.
* Correct spelling according to review.
* fix enforce msg of sum xpu op (#30113)
* enhance error info for py_func (#30138)
* enhance error info for py_func
* update
* fix elugradgrad test fail & error message opt (#30171)
* fix elugradgrad test fail and error message opt
* fix unittest,test=develop
* Update prroi_pool_op.h fix error message
* opt message,test=develop
* fix ci fail,test=develop
* Refine PADDLE_ENFORCE Error Messages. test=develop (#30149) Improve some error messages in parallel_executor.cc, conditional_block_op.cc, recurrent_op.cc
* enhance error message, test=develop (#30220)
* fix error message for distribute_fpn_proposals_op (#30116)
* enhance error msgs of fusion_seqpool_cvm_concat_op.cc, test=develop (#30240)
* just add the op error message for the matmul xpu (#30246) add the op error message for the matmul xpu
* enhance error message of nll_loss op test=develop (#30125)
* enhance error message of nll_loss op test=develop
Co-authored-by: yaoxuefeng <yaoxuefeng@baidu.com>
Co-authored-by: xiemoyuan <71377852+xiemoyuan@users.noreply.github.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jack Zhou <zhoushunjie@baidu.com>
Co-authored-by: Wilber <jiweibo@baidu.com>
Co-authored-by: Double_V <liuvv0203@163.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: zhang wenhui <frankwhzhang@126.com>
Co-authored-by: wangguanzhong <jerrywgz@126.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: lijianshe02 <48898730+lijianshe02@users.noreply.github.com>
-
- 11 Jan, 2021 1 commit
-
-
Committed by Wilber
-
- 09 Dec, 2020 2 commits
- 27 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
* compile with cuda9
* add some unittest
* notest;test=coverage
* add unittest for trt plugin swish && split
* update ernie unittest
* fix some error message
* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter
* fix compile error when CUDA_ARCH_NAME < Pascal
* fix compile error
* update unittest timeout
* compile with cuda9
* update error msg
* fix code style
* add some comments
* add define IF_CUDA_ARCH_SUPPORT_FP16
* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
- 23 Nov, 2020 1 commit
-
-
Committed by Pei Yang
* change avg pooling and global pooling to trt layer
* add support for static shape global pooling
* modify trt errmsg
-
- 12 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* skip_layernorm_op done
* add unittest
* slice op converter support trt < 6
* skip_layernorm only works in ernie
-
- 03 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* fp16 result ok
* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS
* auto detect special slice op converter for ernie with trt oss
* ernie oss only support fp16
* fix special_slice_plugin serialize bug
* matmul in tensorrt ok
* ernie unittest ok
* add matmul tensorrt unittest
* remove demo code
-
- 21 Oct, 2020 1 commit
-
-
Committed by Pei Yang
-
- 13 Oct, 2020 1 commit
-
-
Committed by Shang Zhizhou
* add info log for trt input dynamic shape check
* fix error msg error
-
- 28 Sep, 2020 1 commit
-
-
Committed by Pei Yang
* add unittests and op version register for tensorrt_subgraph_pass
* rename to test_trt_subgraph_pass.py
* fix softmax converter diff when padding dim=1
-
- 24 Sep, 2020 1 commit
-
-
Committed by wanghuancoder
* use iwyu clean include, test=develop, test=win
* compilation error, test=develop
* fix compilation error2, test=develop
* fix compilation error3, test=develop
* fix compilation error4, test=develop
* fix compilation error5, test=develop
* fix compilation error6, test=develop
* fix compilation error7, test=develop
* fix compilation error8, test=develop
* fix compilation error8, test=develop
* fix compilation error10, test=develop
* fix compilation error11, test=develop
-
- 23 Sep, 2020 1 commit
-
-
Committed by Chen Weihang
* polish some lost error msg
* add some math file to white list
* polish detail based on reviewer comment
-
- 22 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-
- 18 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-
- 15 Sep, 2020 2 commits
-
-
Committed by Shang Zhizhou
* optimize slice TRT plugin. This patch removes an unnecessary barrier for the data transfer of the needed offset, so the data transfer can overlap with GPU kernel execution. It also fixes the incorrect name of the slice plugin, replacing "layernorm" with "slice". test=develop
* add serialize/deserialize to slice plugin
* add static shape slice trt plugin
* fix slice trt op converter dynamic shape bug
* fix format by clang-format
* fix pylint format error
* fix problems commented by peiyang
Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
-
Committed by Shang Zhizhou
* optimize error report
* add test case for pad op converter
* fix some spelling mistakes commented by peiyang
-
- 14 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-
- 02 Sep, 2020 1 commit
-
-
Committed by Zhaolong Xing
test=develop
-
- 01 Sep, 2020 1 commit
-
-
Committed by zlsh80826
* add stack_op to CMakeLists
* add dim=3 support for scale op
* add trt stack op, test=develop
* remove debug message
* add stack plugin serialize
* remove slice, scale op, will add later
* enhance error message
* revise trt ernie test to cover the stack op CI test, test=develop
* add stack op serialization
* fix test shape after adding stack op
* remove slice op, will add after implementing serialization
* roll back to min_graph=5 to avoid using slice op
* fix scale op output layer
* implement stack op createPlugin
* use workspace and move the definition to .cu
* move stack plugin creator definition to .cu, test=develop
-
- 31 Aug, 2020 1 commit
-
-
Committed by Pei Yang
* support trt dynamic shape int8
* add unittest
* add support for sigmoid; adapt to trt6+ api
-
- 30 Aug, 2020 1 commit
-
-
Committed by zlsh80826
-
- 28 Aug, 2020 1 commit
-
-
Committed by Pei Yang
-
- 21 Aug, 2020 1 commit
-
-
Committed by Pei Yang
-
- 19 Aug, 2020 1 commit
-
-
Committed by Zhaolong Xing
test=develop
-
- 07 Aug, 2020 1 commit
-
-
Committed by Pei Yang
* fix trt plugin registry without trt lib
* support trt4
* refine code style
-
- 05 Aug, 2020 1 commit
-
-
Committed by Pei Yang
* develop dynamic shape serialization
* add test param for gelu
* fix bugs
* delete redundant comments
* debug
* fix conflict. test=develop
* add trt dynamic shape serialized support
* fix ernie serialized bug test=develop
* fix codestyle test=develop
* fix bug test=develop
* fix bug.test=develop
* modify cmakelist test=develop
* fix bug test=develop
* fix error message. test=develop
* fix trt register plugin based on pr#25003
* add trt dynload
* fix deserialization bug of not finding plugin registration
* refine code style
* recover engine key in tensorrt_subgraph_pass
* for ci coverage
* add unittest for deserialization
Co-authored-by: haozech <chenhaoze94@gmail.com>
-
- 03 Aug, 2020 1 commit
-
-
Committed by Pei Yang
-
- 28 Jul, 2020 2 commits
- 10 Jul, 2020 1 commit
-
-
Committed by Jeng Bai-Cheng
Use the vector instruction (LDG.128) to speed up the qkv transpose. It provides a 1.4x speedup at the same GPU base frequency. test=develop
-
- 07 Jul, 2020 1 commit
-
-
Committed by Zhaolong Xing
* fix multihead matmul's instability test=develop
* fix multihead matmul bug test=develop
* fix coverage problem test=develop
-
- 23 Jun, 2020 1 commit
-
-
Committed by Pei Yang
* Paddle-TensorRT support slim QAT. test=develop
* add comments. test=develop
* use RenameInput instead of ResetInputs. test=develop
-