1. 25 Apr, 2021 2 commits
  2. 23 Apr, 2021 1 commit
    • move semantic checks to op_teller (#32279) · 7c38114f
      wenbin committed
      * move semantic checks to op_teller
      
      * more ops
      
      * more ops
      
      * revert block related change
      
      * part1
      
      * revert activation
      
      * remove if
      
      * remove const_cast
      
      * resolve conflict
      
      * remove const_cast
      
      * delete useless var
      
      * replace vlog(1) with vlog(3), replace assert with PADDLE_ENFORCE
      
      * down to 19 files
  3. 16 Apr, 2021 1 commit
  4. 02 Apr, 2021 1 commit
  5. 01 Apr, 2021 1 commit
    • [Paddle-TRT] add anchor generator op plugin (#31730) · b807e408
      zlsh80826 committed
      * add anchor generator op plugin
      
      * add anchor generator unit_test
      
      * remove dbg info
      
      * remove redundant line
      
      * replace assertion with paddle enforce
      
      * dynamic plugin replaces assertion with paddle enforce
      
      * anchor generator support dynamic shape on spatial axis
      
      * anchor generator test with fp16, dynamic shape
      
      * add anchor generator test all
      
      * add back main
      
      * reduce test input size to not exceed the timelimit of ci
      
      * change super to InferencePassTest for python2 compatibility
      
      * reuse paddle operator anchor generator
      
      * move creator construct to header with default
      
      * add cuda ifdef
      
      * reduce line
      
      * change super to InferencePassTest for python2 compatibility
      
      * fix anchor generator fp16 serialize setting
      
      * split unittest from test_all
      
      * restrict anchor generator input format before version 7234
      
      * anchor generator only support greater than trt7.1
      
      * change min_graph_size to 2
      
      * min_graph size to 3 if dynamic shape
      
      * reduce dynamic shape size to avoid trt search tactic too long to exceed time limit
      
      * remove anchor from fetch list
      
      * anchor generator support all trt version
      
      * fix memory not allocated but if serialized
  6. 30 Mar, 2021 2 commits
  7. 29 Mar, 2021 2 commits
    • [Paddle-TRT] roi_align_plugin (#31732) · e3a38d79
      zlsh80826 committed
      * add roi_align_plugin
      
      * add roi align unit_test
      
      * add roi align serialization
      
      * remove roi align static plugin because of batch dim issue
      
      * refine roi align unittest and add fp16/serialization
      
      * add trt roi align condition to op_teller
      
      * refine error message
      
      * remove unnecessary reshape layer
    • [Paddle-TRT] trt affine channel converter (#31628) · bfb5cf55
      zlsh80826 committed
      * trt affine channel converter
      
      * add trt affine channel base test
      
      * add trt affine channel NHWC
      
      * remove asterisk for python2 compatibility
      
      * trt affine channel converter
      
      * add trt affine channel base test
      
      * add trt affine channel NHWC
      
      * remove asterisk for python2 compatibility
      
      * fix rebase
      
      * move LodTensor to Tensor
      
      * add dbg info
      
      * affine channel converter only support NCHW
      
      * scale,bias are parameters, use create_parameters api
      
      * reduce test input size to not exceed the timelimit of ci
      
      * refine affine channel unittest and add serialization/dynamic test
      
      * change super to InferencePassTest for python2 compatibility
      
      * change super to InferencePassTest for python2 compatibility
      
      * fix affine channel fp16 serialize setting
  8. 26 Mar, 2021 1 commit
    • [Paddle-TRT] multiclass nms (#31742) · 01aa2526
      zlsh80826 committed
      * add multiclass_nms
      
      * add multiclass_nms unittest
      
      * add default enable_tensorrt_oss option
      
      * refine multiclass nms unittest and add serialization/dynamic test
      
      * change super to InferencePassTest for python2 compatibility
      
      * refine multiclass nms unittest
      
      * move out dynamic shape test due to ci timelimit
  9. 23 Mar, 2021 2 commits
  10. 22 Mar, 2021 1 commit
    • [Paddle-TRT] nearest_interp op (#31626) · bfced39e
      zlsh80826 committed
      * nearest_interp op converter w/ dynamic/static
      
      * fix data_layout include
      
      * add trt nearest unit_test
      
      * add nearest_interp NHWC test
      
      * update trt nearest interp nhwc testcase
      
      * remove asterisk for python2 compatibility
      
      * add empty line to prevent conflict
      
      * nearest_interp op converter w/ dynamic/static
      
      * fix data_layout include
      
      * add trt nearest unit_test
      
      * add nearest_interp NHWC test
      
      * update trt nearest interp nhwc testcase
      
      * remove asterisk for python2 compatibility
      
      * add empty line to prevent conflict
      
      * change the priority of out_h, out_w
  11. 18 Mar, 2021 1 commit
  12. 10 Mar, 2021 1 commit
  13. 03 Mar, 2021 1 commit
  14. 02 Mar, 2021 2 commits
  15. 24 Feb, 2021 1 commit
    • [Paddle-TRT] support group_norm (#31040) · 00b09e86
      Pei Yang committed
      * add group norm plugin
      
      * fix compile problems
      
      * move concat axis check to trt op teller
      
      * add nbDims for scale and bias nv dims
      
      * add group norm unit test
      
      * fix unittest
      
      * add trt version restriction for group norm op teller
      
      * fix unittest
  16. 18 Feb, 2021 1 commit
  17. 04 Feb, 2021 1 commit
  18. 13 Jan, 2021 1 commit
    • Added support for inference using quantization aware trained dygraph (#30288) · 7bbf3ac5
      alncat committed
      * added support for inference using quantization aware trained dygraph
      
      * added support for inference using quantization aware trained dygraph
      correct boost get usage
      
      * Delete incorrect warning message (#30196)
      
      * fix warning and no grad
      
      * clean redundant API alias in 2.0 - part 2 (#30013)
      
      * delete paddle.nn.functional.assign
      
      * fix dynamic to static error
      
      * just add the op error message for the matmul xpu (#30246)
      
       add the op error message for the matmul xpu
      
      * Add Static Variable Clone (#30208)
      
      Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat
      
      * use wget to replace curl to download the lcov file (#30229)
      
      * use wget to replace curl to download the lcov file
      
      * add cache for lcov
      
      * fix test_pool3d_op timeout issue (#30248)
      
      * Fix unittests bugs. (#30250)
      
      * modify error message based on comments (#30189)
      
      * modify error message based on comments
      
      * edit code according to review.
      
      * Correct spelling according to review.
      
      * Fix bug for 'save multiple method' (#30218)
      
      * Fix bug for 'save multiple method'
      
      * To pass coverage.
      
      * edit code to pass coverage.
      
      * edit code to pass coverage.
      
      * add unittest for coverage.
      
      * change for coverage.
      
      * edit for coverage.
      
      * added support for inference using quantization aware trained dygraph
      
      * Alias from  paddle.fluid.layers.auc to paddle.static.auc (#30206)
      
      * add alias from  fluid.layers.auc to static.auc
      
      * Update __init__.py
      
      * added support for inference using quantization aware trained dygraph
      correct boost get usage
      
      * corrected boost get usage
      
      * corrected naming issues and enforcing zero check
      
      * correct paddle enforce message
      
      * added more error checkings
      
      * corrected error report message and optimized code
      
      * corrected findvar usage
      
      * corrected paddle_enforce in scope
      
      * correct error messages
      
      * correct error reporting format
      Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
      Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
      Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
      Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
      Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
      Co-authored-by: Bai Yifan <me@ethanbai.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: WeiXin <weixin10@baidu.com>
      Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>
  19. 11 Jan, 2021 1 commit
  20. 08 Dec, 2020 1 commit
  21. 07 Dec, 2020 1 commit
  22. 27 Nov, 2020 1 commit
    • detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01
      Shang Zhizhou committed
      * remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
      
      * compile with cuda9
      
      * add some unittest
      
      * notest;test=coverage
      
      * add unittest for trt plugin swish && split
      
      * update ernie unittest
      
      * fix some error message
      
      * remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter
      
      * fix compile error when CUDA_ARCH_NAME < Pascal
      
      * fix compile error
      
      * update unittest timeout
      
      * compile with cuda9
      
      * update error msg
      
      * fix code style
      
      * add some comments
      
      * add define IF_CUDA_ARCH_SUPPORT_FP16
      
      * rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
  23. 23 Nov, 2020 1 commit
  24. 12 Nov, 2020 1 commit
  25. 03 Nov, 2020 1 commit
    • Optimize ERNIE model inference performance in TensorRT, support variable-length input (#28367) · ea851796
      Shang Zhizhou committed
      * fp16 result ok
      
      * change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS
      
      * auto detect special slice op converter for ernie with trt oss
      
      * ernie oss only support fp16
      
      * fix special_slice_plugin serialize bug
      
      * matmul in tensorrt ok
      
      * ernie unittest ok
      
      * add matmul tensorrt unittest
      
      * remove demo code
  26. 21 Oct, 2020 1 commit
  27. 13 Oct, 2020 1 commit
  28. 28 Sep, 2020 1 commit
  29. 24 Sep, 2020 1 commit
    • use iwyu clean include (#27267) · df43905f
      wanghuancoder committed
      * use iwyu clean include, test=develop, test=win
      
      * compilation error, test=develop
      
      * fix compilation error2, test=develop
      
      * fix compilation error3, test=develop
      
      * fix compilation error4, test=develop
      
      * fix compilation error5, test=develop
      
      * fix compilation error6, test=develop
      
      * fix compilation error7, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error10, test=develop
      
      * fix compilation error11, test=develop
  30. 18 Sep, 2020 1 commit
  31. 15 Sep, 2020 2 commits
    • Optimize slice trt plugin (#26970) · 47fdc60e
      Shang Zhizhou committed
      * optimize slice TRT plugin
      
      This patch removes an unnecessary barrier for the data transfer of the needed offset,
      so the data transfer can overlap with GPU kernel execution.
      
      This patch also fixes incorrect name of slice plugin. That is, replaces
      "layernorm" with "slice"
      
      test=develop
      
      * add serialize/deserialize to slice plugin
      
      * add static shape slice trt plugin
      
      * fix slice trt op convertor dynamic shape bug
      
      * fix format by clang-format
      
      * fix pylint format error
      
      * fix problems commented by peiyang
      Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
    • Optimize error report (#27254) · e6e2e537
      Shang Zhizhou committed
      * optimize error report
      
      * add test case for pad op converter
      
      * fix some spelling mistake commented by peiyang
  32. 01 Sep, 2020 1 commit
    • [Paddle-TRT] Stack op plugin (#25605) · ad6e3dd6
      zlsh80826 committed
      * add stack_op to CMakeLists
      
      * add dim=3 support for scale op
      
      * add trt stack op, test=develop
      
      * remove debug message
      
      * add stack plugin serialize
      
      * remove slice, scale op, will add later
      
      * enhance error message
      
      * revise trt ernie test to cover the stack op CI test, test=develop
      
      * add stack op serialization
      
      * fix test shape after adding stack op
      
      * remove slice op, will add after implementing serialization
      
      * roll back to min_graph=5 to avoid using slice op
      
      * fix scale op output layer
      
      * implement stack op createPlugin
      
      * use workspace and move the definition to .cu
      
      * move stack plugin creator definition to .cu, test=develop
  33. 28 Aug, 2020 1 commit
  34. 05 Aug, 2020 1 commit
    • Fix registering trt plugin (#25744) · b717895f
      Pei Yang committed
      * develop dynamic shape serialization
      
      * add test param for gelu
      
      * fix bugs
      
      * delete redundant comments
      
      * debug
      
      * fix conflict. test=develop
      
      * fix bug. test=develop
      
      * add trt dynamic shape serialized support
      
      * fix ernie serialized bug
      test=develop
      
      * fix codestyle
      test=develop
      
      * fix bug
      test=develop
      
      * fix bug.test=develop
      
      * modify cmakelist test=develop
      
      * fix bug
      test=develop
      
      * fix error message.  test=develop
      
      * fix trt register plugin based on pr#25003
      
      * add trt dynload
      
      * fix deserialization bug of not finding plugin registration
      
      * refine code style
      
      * recover engine key in tensorrt_subgraph_pass
      
      * for ci coverage
      
      * add unittest for deserialization
      Co-authored-by: haozech <chenhaoze94@gmail.com>