1. 20 Oct, 2022 1 commit
  2. 19 Oct, 2022 2 commits
  3. 18 Oct, 2022 2 commits
    • Merge layernorm trt fuse (#46320) · 5e9f491e
      Committed by Wang Bojun
      * first version, accuracy corrected
      
      * disable debug print
      
      * use blockReduceSum in phi
      
      * add UT
      
      * add opCompat
      
      * code style
      
      * code refine
      
      * bug fix
      
      * code refine
      
      * test fix
      
      * bugfix
      
      * code style fix
      
      * code style
      
      * code-style
      
      * code-style
      
      * code-style
    • FC + activation fuse passes (#45183) · b7a23adb
      Committed by Sławomir Siwek
      * git
      
      * style
      
      * leave default relu in kernel
      
      * style
      
      * cleanup FCMKLDNN pattern
      
      * merge conflicts
      
      * update develop
      
      * update develop
      
      * add const
      
      * rename to oneDNN and adjust attributes
      
      * whitespace
  4. 17 Oct, 2022 4 commits
  5. 16 Oct, 2022 1 commit
  6. 13 Oct, 2022 2 commits
    • Fix quantize model deploy bugs when using MKLDNN (#45920) · 561fd8c8
      Committed by yeliang2258
      * fix immutable op quantize bugs
      
      * fix
      
      * fix build bug
      
      * fix test
      
      * notest,test=inference
      
      * fix ppyoloe acc drop bugs
      
      * fix test
      
      * fix test
      
      * add test
      
      * fix
      
      * fix
      
      * fix test
      
      * fix refined name bug
      
      * fix test
      
      * bias fix
      
      * fix matmul weight dequant bug
      
      * re-ci
      
      * fix tester
      
      * fix test
      
      * fix tester
      
      * update weight dequantize func
      
      * update code
      
      * update test for coverage
      
      * update test
      
      * update cmake
      
      * update cmakelist
      
      * update code
      
      * rerun ci
      
      * remove useless code
    • Add unsigned int8 scale propagation (#46378) · c72b3bfa
      Committed by joanna.wozna.intel
      * Add unsigned int8 propagation
      
      * Add or modify unit tests
      
      * Correct concat scale checking
      
      * Apply review suggestions
      
      * Corrections
  7. 12 Oct, 2022 1 commit
  8. 11 Oct, 2022 2 commits
  9. 10 Oct, 2022 3 commits
  10. 30 Sep, 2022 2 commits
  11. 29 Sep, 2022 1 commit
  12. 28 Sep, 2022 3 commits
  13. 27 Sep, 2022 1 commit
  14. 22 Sep, 2022 2 commits
  15. 21 Sep, 2022 2 commits
    • Enable PaddleInference to use CINN. (#45009) · 3aa6bd57
      Committed by Zhen Wang
      * use cinn in the paddle inference
      
      * fix some cmake errors
      
      * Avoid division by zero in the arange_kernel.
      
      * Avoid dynamic ops.
      
      * Remove some useless code.
      
      * Use OpTransInfo to encapsulate some codes used in the build_cinn_pass.
    • residual_no_bias (#46129) · aa0e84e3
      Committed by wenbin
      * residual_no_bias
      
      * comments
      
      * more ut
      
      * fix input
  16. 20 Sep, 2022 1 commit
  17. 19 Sep, 2022 1 commit
  18. 13 Sep, 2022 1 commit
  19. 09 Sep, 2022 1 commit
  20. 07 Sep, 2022 1 commit
    • Layernorm shift partition (#45736) · 960109af
      Committed by wenbin
      * first commit
      
      * convert done
      
      * correct format
      
      * layernorm_shift_partition
      
      * correct convert
      
      * redefine plugin
      
      * runnable
      
      * bug fix
      
      * modify ShiftPartitionPattern
      
      * correct
      
      * add UT
      
      * modify ut
      
      * compile
      
      * modify enforce
      
      * modify UT
  21. 06 Sep, 2022 1 commit
  22. 05 Sep, 2022 2 commits
  23. 31 Aug, 2022 1 commit
  24. 30 Aug, 2022 2 commits
    • Remove extra attribute in OpMaker (#44310) · fe321f9a
      Committed by zyfncg
      * add runtime config in phi
      
      * add runtime attr for op desc and op
      
      * fix no proto error
      
      * adjust opdesc set_attr impl
      
      * try to remove conv_op extra attrs
      
      * add init runtime attr map
      
      * change extra header path
      
      * fix runtime_attr
      
      * fix trace_op
      
      * fix bug of pass
      
      * fix merge conflict
      
      * fix dygraph attrs
      
      * fix bug of pass
      
      * fix dygraph bug
      
      * fix unittest module
      
      * delete extra attr default
      
      * fix dropout kernel
      
      * polish code
      
      * fix extra output of instance_norm
      
      * fix merge conflict
      
      * fix op_desc bug
      
      * add extra attr in yaml for conv3d_transpose
      
      * don't remove extra input and output
      
      * fix save_inference_model
      
      * fix bug of batch_norm
      
      * revert some change
      
      * polish log
      
      * polish code
      
      * add code comment
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
    • [Paddle-TRT] constant-folding (#45494) · 97f43a8e
      Committed by zhoutianzi666
      Add a constant folding pass; for some models, it reduces latency.