1. 23 Sep, 2020 1 commit
    • Optimize slice trt plugin (#26970) (#27456) · 8e1712a7
      Authored by Pei Yang
      * optimize slice TRT plugin
      
      This patch removes an unnecessary barrier on the transfer of the needed
      offsets, so the data transfer can overlap with GPU kernel execution.
      
      This patch also fixes the incorrect name of the slice plugin by replacing
      "layernorm" with "slice".
      
      test=develop
      
      * add serialize/deserialize to slice plugin
      
      * add static shape slice trt plugin
      
      * fix slice trt op converter dynamic shape bug
      
      * fix format by clang-format
      
      * fix pylint format error
      
      * fix problems commented by peiyang
      Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
      Co-authored-by: Shang Zhizhou <shangzhizhou@baidu.com>
  2. 21 Sep, 2020 1 commit
  3. 18 Sep, 2020 1 commit
    • [cherry-pick][Paddle-TRT] Stack op plugin (#25605) (#27365) · 4283be52
      Authored by Pei Yang
      * [Paddle-TRT] Stack op plugin (#25605)
      
      * add stack_op to CMakeLists
      
      * add dim=3 support for scale op
      
      * add trt stack op, test=develop
      
      * remove debug message
      
      * add stack plugin serialize
      
      * remove slice, scale op, will add later
      
      * enhance error message
      
      * revise trt ernie test to cover the stack op in CI tests, test=develop
      
      * add stack op serialization
      
      * fix test shape after adding stack op
      
      * remove slice op, will add after implementing serialization
      
      * roll back to min_graph=5 to avoid using slice op
      
      * fix scale op output layer
      
      * implement stack op createPlugin
      
      * use workspace and move the definition to .cu
      
      * move stack plugin creator definition to .cu, test=develop
      
      * sync ut with develop
      Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
  4. 11 Aug, 2020 1 commit
  5. 06 Aug, 2020 1 commit
  6. 01 Jul, 2020 1 commit
  7. 15 May, 2020 1 commit
  8. 23 Apr, 2020 1 commit
    • [Cherry-pick]: 23974, 23723, 23984 (#24084) · 26a1def9
      Authored by Zhaolong Xing
      * Cherry-pick: [Ernie TRT]: add slice op and add emb eltwise layernorm fp16 support (#23723)
      
      * refine ernie trt dynamic shape support
      1. add slice op converter
      2. add emb eltwise layernorm fp16 support
      test=develop
      
      * fix dynamic shape test ut
      test=develop
      
      * fix comments.
      test=develop
      
      * fix comments
      test=develop
      
      * cherry-pick [BUG]: Head number can only be > 1 on multihead op (#23974)
      
      * support the head number == 1
      test=develop
      
      * fix slice op error.
      test=develop
      
      * cherry-pick :disable trt test, test=develop (#23984)
      
      test=release/2.0-beta
  9. 17 Apr, 2020 3 commits
  10. 12 Apr, 2020 1 commit
  11. 11 Apr, 2020 1 commit
  12. 10 Apr, 2020 1 commit
  13. 08 Apr, 2020 4 commits
  14. 01 Apr, 2020 1 commit
  15. 26 Mar, 2020 1 commit
    • [Paddle-TRT]: Ernie Dynamic shape support. (#23138) · 430b0099
      Authored by Zhaolong Xing
      * add dynamic plugin support.
      test=develop
      
      * change emb eltwise layernorm to math function
      test=develop
      
      * add emb eltwise layernorm
      test=develop
      
      * can run dynamic shape ernie
      test=develop
      
      * fix ci
      test=develop
      
      * add ut for trt ernie dynamic
      
      test=develop
      
      * refine dynamic shape c++ interface.
      test=develop
      
      * fix comments
      test=develop
      
      * fix comments
      test=develop
  16. 09 Mar, 2020 1 commit
  17. 23 Feb, 2020 1 commit
  18. 10 Feb, 2020 1 commit
  19. 14 Jan, 2020 1 commit
  20. 07 Jan, 2020 1 commit
  21. 06 Jan, 2020 1 commit
  22. 18 Nov, 2019 1 commit
  23. 23 Oct, 2019 1 commit
  24. 21 Sep, 2019 1 commit
  25. 09 Sep, 2019 1 commit
  26. 12 Aug, 2019 1 commit
  27. 31 Jul, 2019 1 commit
    • Trt fp16 support (#18860) · 61238d31
      Authored by Zhaolong Xing
      * Fix Mask rcnn predictor
          1. refine memory optim algorithm to support the model with the block op.
          2. output diff : modify the affine channel fuse
          3. add condition_block_infer op
      add interface for setting trt calib table dir
      test=develop
      
      * add the missing files.
      test=develop
      
      * 1 add trt fp16 support
      test=develop
  28. 24 Jul, 2019 1 commit
    • Update trt5 for paddle-trt (#18645) · 26ae6d49
      Authored by Zhaolong Xing
      * update paddle-trt for:
          1. fix bug: core dump in split plugin when batch > 2.
          2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.)
          3. add new attr to dropout.
          4. shuffle channel, swish, relu6 support
          test=develop
      
      * 1. fix ci
      test=develop
  29. 06 Jun, 2019 1 commit
  30. 25 May, 2019 1 commit
    • TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc
      Authored by Zhaolong Xing
      * fluid int8 train and trt int8 predict align.
      trt int8 predict init
      op converter
      
      * 2. align fluid int8 train and trt int8 inference.
      enhance quant dequant fuse pass
      enhance op converter, trt engine, trt engine op, trt subgraph pass.
      
      * 3. add delete_quant_dequant_pass for trt
      
      test=develop
      
      * 4. add the missing file
      test=develop
      
      * 5. modified the C++ interface but forgot to update the pybind code
      fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
      test=develop
  31. 23 May, 2019 1 commit
  32. 21 May, 2019 1 commit
  33. 20 Mar, 2019 2 commits
  34. 08 Mar, 2019 1 commit