1. 23 Apr, 2021 1 commit
    • move semantic checks to op_teller (#32279) · 7c38114f
      Committed by wenbin
      * move semantic checks to op_teller
      
      * more ops
      
      * more ops
      
      * revert block related change
      
      * part1
      
      * revert activation
      
      * remove if
      
      * remove const_cast
      
      * resolve conflict
      
      * remove const_cast
      
      * delete useless var
      
      * replace vlog(1) with vlog(3), replace assert with PADDLE_ENFORCE
      
      * down to 19 files
  2. 16 Apr, 2021 1 commit
  3. 02 Apr, 2021 1 commit
  4. 27 Nov, 2020 1 commit
    • detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01
      Committed by Shang Zhizhou
      * remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
      
      * compile with cuda9
      
      * add some unittest
      
      * notest;test=coverage
      
      * add unittest for trt plugin swish && split
      
      * update ernie unittest
      
      * fix some error message
      
      * remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter
      
      * fix compile error when CUDA_ARCH_NAME < Pascal
      
      * fix compile error
      
      * update unittest timeout
      
      * compile with cuda9
      
      * update error msg
      
      * fix code style
      
      * add some comments
      
      * add define IF_CUDA_ARCH_SUPPORT_FP16
      
      * rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
  5. 12 Nov, 2020 1 commit
  6. 03 Nov, 2020 1 commit
    • Optimize ernie model inference performance in TensorRT, support variable-length input (#28367) · ea851796
      Committed by Shang Zhizhou
      * fp16 result ok
      
      * change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS
      
      * auto detect special slice op converter for ernie with trt oss
      
      * ernie oss only support fp16
      
      * fix special_slice_plugin serialize bug
      
      * matmul in tensorrt ok
      
      * ernie unittest ok
      
      * add matmul tensorrt unittest
      
      * remove demo code
  7. 15 Sep, 2020 1 commit
    • Optimize slice trt plugin (#26970) · 47fdc60e
      Committed by Shang Zhizhou
      * optimize slice TRT plugin
      
      This patch removes an unnecessary barrier on the data transfer of the needed offset,
      so the data transfer can overlap with GPU kernel execution.
      
      This patch also fixes the incorrect name of the slice plugin by replacing
      "layernorm" with "slice".
      
      test=develop
      
      * add serialize/deserialize to slice plugin
      
      * add static shape slice trt plugin
      
      * fix slice trt op converter dynamic shape bug
      
      * fix format by clang-format
      
      * fix pylint format error
      
      * fix problems commented by peiyang
      Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
  8. 11 May, 2020 1 commit
    • Add macro BOOST_GET to enrich the error information of boost::get (#24175) · aa0f254f
      Committed by Chen Weihang
      * add new macro BOOST_GET_SAFELY & unittests, test=develop
      
      * add different macro type, test=develop
      
      * fix get macro type in executor, test=develop
      
      * four macro part change backup
      
      * using one macro for all case, test=develop
      
      * revert attribute change, test=develop
      
      * change to three func to solve gcc4.8 bug, test=develop
      
      * polish some details, test=develop
  9. 19 Apr, 2020 1 commit
  10. 26 Mar, 2020 1 commit
    • [Paddle-TRT]: Ernie Dynamic shape support. (#23138) · 430b0099
      Committed by Zhaolong Xing
      * add dynamic plugin support.
      test=develop
      
      * change emb eltwise layernorm to math function
      test=develop
      
      * add emb eltwise layernorm
      test=develop
      
      * can run dynamic shape ernie
      test=develop
      
      * fix ci
      test=develop
      
      * add ut for trt ernie dynamic
      
      test=develop
      
      * refine dynamic shape c++ interface.
      test=develop
      
      * fix comments
      test=develop
      
      * fix comments
      test=develop