    Optimize slice trt plugin (#26970) · 47fdc60e
    Committed by Shang Zhizhou
    * optimize slice TRT plugin
    
    This patch removes an unnecessary barrier on the transfer of the required
    offsets, so the data transfer can overlap with GPU kernel execution.
    
    This patch also fixes the incorrect name of the slice plugin by replacing
    "layernorm" with "slice".
    
    test=develop
    
    * add serialize/deserialize to slice plugin
    
    * add static shape slice trt plugin
    
    * fix dynamic shape bug in slice trt op converter
    
    * fix format by clang-format
    
    * fix pylint format error
    
    * address review comments from peiyang
    Co-authored-by: NRyan Jeng <rjeng@nvidia.com>
op_teller.cc 3.9 KB