1. 15 Feb 2023, 1 commit
  2. 13 Jan 2023, 1 commit
  3. 19 Dec 2022, 1 commit
    • [cherry-pick][Inference] support mixed precision inference (#49077) · ddcd1b61
      Committed by Yuanle Liu
      * [Release2.4] Revert python link prs (#48573)
      
      * Revert "Fix mac link python (#48017)"
      
      This reverts commit 3fa7a736.
      
      * Revert "[Cherry-pick] Fix python link error (#47811)"
      
      This reverts commit ff642c68.
      
      * Update config.go
      
      * [Paddle Inference] Add float_to_half_pass to support inference with mixed precision (#47993)
      
      * [Inference] optimize some code and fix some bug (#48780)
      
      * clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass
      
      * fix unittest timeout
      
      * [Paddle Inference] clean unused code (#48392)
      
      * fix
      
      * update
      
      * update
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
  4. 10 Nov 2022, 1 commit
  5. 09 Nov 2022, 1 commit
  6. 08 Nov 2022, 2 commits
  7. 03 Nov 2022, 2 commits
  8. 29 Oct 2022, 1 commit
  9. 28 Oct 2022, 1 commit
  10. 20 Oct 2022, 2 commits
    • [cherry pick] Add FusedMultiTransformer fuse pass for GPT3 (#47150) · 396427a7
      Committed by Kaipeng Deng
      * add fused_attention_pass. test=develop
      
      * support fp16. test=develop
      
      * fix format. test=develop
    • [cherry-pick] Fix quantize model deploy bug in MKLDNN (#47119) · c2d344dd
      Committed by yeliang2258
      * Fix quantize model deploy bugs when using MKLDNN (#45920)
      
      * fix immutable op quantize bugs
      
      * fix
      
      * fix build bug
      
      * fix test
      
      * notest,test=inference
      
      * fix ppyoloe acc drop bugs
      
      * fix test
      
      * fix test
      
      * add test
      
      * fix
      
      * fix
      
      * fix test
      
      * fix refined name bug
      
      * fix test
      
      * bias fix
      
      * fix matmul weight dequant bug
      
      * re-ci
      
      * fix tester
      
      * fix test
      
      * fix tester
      
      * update weight dequantize func
      
      * update code
      
      * update test for coverage
      
      * update test
      
      * update cmake
      
      * update cmakelist
      
      * update code
      
      * rerun ci
      
      * remove useless code
      
      * re-ci
      
      * update code
      
      * update code
      
      * fix header
      
      * update code for log
  11. 18 Oct 2022, 1 commit
  12. 17 Oct 2022, 1 commit
  13. 14 Oct 2022, 4 commits
  14. 11 Oct 2022, 1 commit
  15. 28 Sep 2022, 1 commit
  16. 20 Sep 2022, 2 commits
  17. 15 Sep 2022, 1 commit
  18. 07 Sep 2022, 1 commit
    • Layernorm shift partition (#45736) · 960109af
      Committed by wenbin
      * first commit
      
      * conver done
      
      * correct format
      
      * layernorm_shift_partition
      
      * correct convert
      
      * redefine plugin
      
      * runnable
      
      * bug fix
      
      * modify ShiftPartitionPattern
      
      * correct
      
      * add UT
      
      * modify ut
      
      * compile
      
      * modify enforce
      
      * modify UT
  19. 06 Sep 2022, 2 commits
  20. 05 Sep 2022, 2 commits
    • New format quant model support for MKLDNN (#45416) · 4e4f4586
      Committed by yeliang2258
      * support onnx format quantized model
      
      * update code
      
      * add test
      
      * add test
      
      * fix
      
      * fix test
      
      * fix cmake
      
      * update code
      
      * change scale file path to calibration file path
      
      * update code
      
      * update code
      
      * fix build bug
      
      * fix build bugs
      
      * fix
      
      * fix
    • Update DlNNE engine (#45027) · 638965c5
      Committed by denglin-github
      * add config param for enable_dlnne and support calibration mode
      * remove useless file
      * refine code and add annotation
      * refine code of Warning tips
  21. 02 Sep 2022, 1 commit
  22. 30 Aug 2022, 1 commit
  23. 29 Aug 2022, 1 commit
  24. 22 Aug 2022, 3 commits
  25. 18 Aug 2022, 2 commits
  26. 16 Aug 2022, 2 commits
    • convert multihead to oss (#45019) · f706d95d
      Committed by feng_shuai
      * convert multihead to oss
      
      * fix:bug
      
      * fix:delete const cast
      
      * fix:don't support bias_qk
      
      * add vit pass
      
      * fix:convert bug and add preln_residual_bias
      
      * support length=-1
      
      * add UT for convert
      
      * add no_bias_qk support for gpu_multihead_op
      
      * delete infer_shape depends on bias_qk
      
      * oss just can be used in T4 and A*
      
      * fix:change api for ROCM CI
    • memoptim and fp16 mixed precision (#45132) · fa890092
      Committed by Wilber
  27. 15 Aug 2022, 1 commit