1. 23 Sep, 2020 1 commit
    • Optimize slice trt plugin (#26970) (#27456) · 8e1712a7
      Pei Yang committed
      * optimize slice TRT plugin
      
      This patch removes an unnecessary barrier on the data transfer of the
      required offsets, so the transfer can overlap with GPU kernel execution.
      
      This patch also fixes the incorrect name of the slice plugin by replacing
      "layernorm" with "slice".
      
      test=develop
      
      * add serialize/deserialize to slice plugin
      
      * add static shape slice trt plugin
      
      * fix slice trt op converter dynamic shape bug
      
      * fix format by clang-format
      
      * fix pylint format error
      
      * fix issues raised in peiyang's review comments
      Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
      Co-authored-by: Shang Zhizhou <shangzhizhou@baidu.com>
  2. 21 Sep, 2020 1 commit
  3. 18 Sep, 2020 1 commit
    • [cherry-pick][Paddle-TRT] Stack op plugin (#25605) (#27365) · 4283be52
      Pei Yang committed
      * [Paddle-TRT] Stack op plugin (#25605)
      
      * add stack_op to CMakeLists
      
      * add dim=3 support for scale op
      
      * add trt stack op, test=develop
      
      * remove debug message
      
      * add stack plugin serialize
      
      * remove slice, scale op, will add later
      
      * enhance error message
      
      * revise trt ernie test to cover the stack op in CI testing, test=develop
      
      * add stack op serialization
      
      * fix test shape after adding stack op
      
      * remove slice op, will add after implementing serialization
      
      * roll back to min_graph=5 to avoid using slice op
      
      * fix scale op output layer
      
      * implement stack op createPlugin
      
      * use workspace and move the definition to .cu
      
      * move stack plugin creator definition to .cu, test=develop
      
      * sync ut with develop
      Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
  4. 06 Aug, 2020 2 commits
  5. 06 Jul, 2020 1 commit
  6. 01 Jul, 2020 1 commit
  7. 23 Apr, 2020 1 commit
    • [Cherry-pick]: 23974, 23723, 23984 (#24084) · 26a1def9
      Zhaolong Xing committed
      * Cherry-pick: [Ernie TRT]: add slice op and add emb eltwise layernorm fp16 support (#23723)
      
      * refine ernie trt dynamic shape support
      1. add slice op converter
      2. add emb eltwise layernorm fp16 support
      test=develop
      
      * fix dynamic shape test ut
      test=develop
      
      * fix comments.
      test=develop
      
      * fix comments
      test=develop
      
      * cherry-pick [BUG]: Head number can only be > 1 on multihead op (#23974)
      
      * support the head number == 1
      test=develop
      
      * fix slice op error.
      test=develop
      
      * cherry-pick: disable trt test, test=develop (#23984)
      
      test=release/2.0-beta
  8. 21 Apr, 2020 1 commit
  9. 17 Apr, 2020 2 commits
  10. 12 Apr, 2020 1 commit
  11. 08 Apr, 2020 2 commits
  12. 01 Apr, 2020 1 commit
  13. 26 Mar, 2020 1 commit
    • [Paddle-TRT]: Ernie Dynamic shape support. (#23138) · 430b0099
      Zhaolong Xing committed
      * add dynamic plugin support.
      test=develop
      
      * change emb eltwise layernorm to math function
      test=develop
      
      * add emb eltwise layernorm
      test=develop
      
      * can run dynamic shape ernie
      test=develop
      
      * fix ci
      test=develop
      
      * add ut for trt ernie dynamic
      
      test=develop
      
      * refine dynamic shape c++ interface.
      test=develop
      
      * fix comments
      test=develop
      
      * fix comments
      test=develop
  14. 08 Jan, 2020 1 commit
  15. 07 Jan, 2020 1 commit
  16. 06 Jan, 2020 1 commit
  17. 23 Oct, 2019 1 commit
  18. 05 Sep, 2019 1 commit
  19. 12 Aug, 2019 1 commit
  20. 24 Jul, 2019 1 commit
    • Update trt5 for paddle-trt (#18645) · 26ae6d49
      Zhaolong Xing committed
      * update paddle-trt:
          1. fix bug: core dump in split plugin when batch > 2.
          2. add leaky_relu support for trt5.0 (yolov3 from 65ms to 42ms).
          3. add new attr to dropout.
          4. add shuffle channel, swish, relu6 support.
          test=develop
      
      * 1. fix ci
      test=develop
  21. 08 Mar, 2019 3 commits
  22. 27 Feb, 2019 1 commit
  23. 26 Feb, 2019 1 commit
  24. 22 Feb, 2019 1 commit
    • 5. add static trt load model · 1d5ef7c9
      nhzlx committed
      1). add static trt load model
      2). fix a trt bug that occurs when device_id is not 0
      test=develop
  25. 03 Dec, 2018 1 commit
  26. 21 Nov, 2018 2 commits
  27. 20 Nov, 2018 4 commits
  28. 19 Nov, 2018 4 commits