1. 09 Feb 2023 (2 commits)
    • [Paddle-TRT] GroupNorm int8 nchw32 fake kernel (#50146) · d93c63a0
      Committed by zhoutianzi666
      * add fmha_flashattention oss plugin
      
      * add fmhca
      
      * add oss fmhca
      
      * code reconstruct and add ut
      
      * code style refine
      
      * fix ut and enforce check
      
      * refine trt version check
      
      refine compile
      
      fix compile
      
      * fix cross ut
      
      * code refine
      
      * use runtime trt version check
      
      * bug fix and code refine
      
      * compile fix
      
      * merge develop
      
      * add GN QDQ kernel
      
      * support GN int8 fake kernel
      
      * add with_int8
      
      * add GN int8 fake kernel
      
      * add GN int8 UT
      
      * add version > 8000 in GN int8 UT
      
      * add some checks in .cu
      
      * add stdlib.h in UT
      
      * little change in .cu
      
      * remove rand_r, use rand
      
      * remove use of rand
      
      * setAxis(1)
      
      * when int8 is on, allow fallback to fp16 (see the sketch after this entry)
      
      ---------
      Co-authored-by: wwbitejotunn <wang_bojun@outlook.com>
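      The last two bullets in this entry concern the precision setup: the TensorRT version is checked at runtime rather than at compile time, and when int8 is enabled the build is still allowed to fall back to fp16. Below is a minimal, hypothetical C++ sketch of that pattern using the standard TensorRT API; ConfigurePrecision and with_int8 are illustrative names, not code from this PR.

      #include <NvInfer.h>

      // Sketch only: runtime TRT version gate plus int8-with-fp16-fallback flags.
      bool ConfigurePrecision(nvinfer1::IBuilderConfig* config, bool with_int8) {
        // getInferLibVersion() reports the linked TensorRT library version at
        // runtime (e.g. 8401 for 8.4.1), unlike the compile-time macro
        // NV_TENSORRT_VERSION.
        if (getInferLibVersion() <= 8000) {
          return false;  // mirrors the "version > 8000" guard mentioned in the UT bullet
        }
        if (with_int8) {
          config->setFlag(nvinfer1::BuilderFlag::kINT8);
          // Enabling fp16 as well lets layers without an int8 implementation
          // fall back to fp16 instead of dropping all the way to fp32.
          config->setFlag(nvinfer1::BuilderFlag::kFP16);
        }
        return true;
      }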
    • [TRT] Transpose layernorm fusion with different input format (#50082) · b2bb7ec9
      Committed by Wang Bojun
      * trans_layernorm
  2. 08 Feb 2023 (4 commits)
  3. 07 Feb 2023 (1 commit)
  4. 06 Feb 2023 (3 commits)
  5. 01 Feb 2023 (2 commits)
    • Preln fix (#49802) · e03718f5
      Committed by Wang Bojun
      * preln_residual to fused_bias_residual
      
      * skip layernorm fix and ut
      
      * code refine
      
      * code style refine
      
      * fix ut
      
      * fix output
      
      * add trt layer fall back info
      
      * refine op teller and ut
      
      * DropoutMaskOut output fix
    • jit layer support multi thread and fix predictor clone (#50095) · 9fa2eb38
      Committed by Hui Zhang
      * jit layer support multi thread
      
      * fix bug
      
      * cloned predictor does not run the graph optimizer (see the sketch after this entry)
      
      * format
      
      * fix comment and format
      
      * fix override and format
      
      * fix
      
      * fix
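      The clone bullet above is what makes multi-threaded use practical: the first predictor pays for graph optimization once, and each worker thread then gets a cheap clone. A rough sketch of that usage with the Paddle Inference C++ API follows; the model paths and thread count are placeholders, and this is not code from the PR.

      #include <memory>
      #include <thread>
      #include <vector>

      #include "paddle_inference_api.h"

      int main() {
        paddle_infer::Config config;
        config.SetModel("./model.pdmodel", "./model.pdiparams");  // placeholder paths

        // Creating the first predictor runs the graph optimization passes.
        auto main_predictor = paddle_infer::CreatePredictor(config);

        std::vector<std::thread> workers;
        for (int i = 0; i < 4; ++i) {
          workers.emplace_back([&main_predictor]() {
            // Per the commit, a cloned predictor skips the graph optimizer,
            // so per-thread setup stays cheap.
            auto predictor = main_predictor->Clone();
            // ... feed inputs, call predictor->Run(), fetch outputs ...
          });
        }
        for (auto& t : workers) t.join();
        return 0;
      }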
  6. 31 Jan 2023 (5 commits)
  7. 19 Jan 2023 (1 commit)
  8. 18 Jan 2023 (1 commit)
  9. 17 Jan 2023 (1 commit)
    • [PHI] Change feed_op to phi kernel (#49116) · f7f1dc03
      Committed by YuanRisheng
      * change feed_op to phi kernel (a registration sketch follows this entry)
      
      * fix ci bugs
      
      * fix build bugs
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix ci bugs
      
      * polish code
      
      * polish code comments
      
      * fix install bugs
      
      * modify code according to comments
      
      * remove visitor in feed_op
      
      * modify according to comments
      
      * polish code according to comments
      
      * add infershape
      
      * fix py3 bugs
      
      * fix GetExpectedKernelType
      
      * fix ci bugs
      
      * add registry for custom device
      
      * fix py3 bugs
      
      * fix floating point error
      
      * fix py3 test bugs
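      For background on what "change feed_op to phi kernel" involves: a PHI kernel is a free template function registered per backend with PD_REGISTER_KERNEL, rather than an OpKernel class inside the fluid operator. The sketch below shows only that general registration pattern; FeedLikeKernel and feed_like are made-up names, not the kernel added in this PR.

      #include "paddle/phi/core/dense_tensor.h"
      #include "paddle/phi/core/kernel_registry.h"

      namespace phi {

      // Hypothetical kernel body. In a real kernel the output shape is set by
      // the op's InferMeta (the "add infershape" bullet above), not here.
      template <typename T, typename Context>
      void FeedLikeKernel(const Context& dev_ctx,
                          const DenseTensor& x,
                          DenseTensor* out) {
        dev_ctx.template Alloc<T>(out);
        // ... copy the selected feed item from x into out ...
      }

      }  // namespace phi

      // PD_REGISTER_KERNEL(kernel_name, backend, layout, kernel_fn, dtypes...)
      PD_REGISTER_KERNEL(
          feed_like, CPU, ALL_LAYOUT, phi::FeedLikeKernel, float, double, int) {}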
  10. 16 Jan 2023 (3 commits)
  11. 13 Jan 2023 (3 commits)
  12. 12 Jan 2023 (2 commits)
  13. 11 Jan 2023 (3 commits)
  14. 10 Jan 2023 (7 commits)
  15. 09 Jan 2023 (2 commits)