1. 02 Feb, 2023 1 commit
  2. 04 Jan, 2023 1 commit
  3. 22 Dec, 2022 1 commit
  4. 19 Dec, 2022 1 commit
    • [cherry-pick][Inference] support mixed precision inference (#49077) · ddcd1b61
      Yuanle Liu committed
      * [Release2.4] Revert python link prs (#48573)
      
      * Revert "Fix mac link python (#48017)"
      
      This reverts commit 3fa7a736.
      
      * Revert "[Cherry-pick] Fix python link error (#47811)"
      
      This reverts commit ff642c68.
      
      * Update config.go
      
      * [Paddle Inference] Add float_to_half_pass to support inference with mixed precision (#47993)
      
      * [Inference] optimize some code and fix some bug (#48780)
      
      * clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass
      
      * fix unittest timeout

      * [Paddle Inference] clean unused code (#48392)
      
      * fix
      
      * update
      
      * update
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
  5. 29 Nov, 2022 1 commit
    • [cherry-pick] updating mul and matmul with set_mem_desc and fix... · 9e2ba9b9
      yeliang2258 committed
      [cherry-pick] updating mul and matmul with set_mem_desc and fix squeeze_transpose for MKLDNN (#47951)
      
      * Fix slice bugs in MKLDNN when input dims are zeros (#46671)
      
      * fix slice bugs
      
      * fix
      
      * update code
      
      * fix
      
      * update code
      
      * updating mul and matmul with set_mem_desc (#45624)
      
      * - mul & matmul changes
      
      - fix
      
      - bs16 correction of strides
      
      * - cosmetic fixes
      
      * - lint
      
      * - fix
      
      * - fix
      
      * - format -> mem_desc
      
      * - fix
      
      * - fix
      
      * - fix
      
      * - fix
      
      * - fix
      
      * fix squeeze_transpose (#47911)
      Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
  6. 10 Nov, 2022 1 commit
  7. 09 Nov, 2022 1 commit
  8. 08 Nov, 2022 1 commit
  9. 03 Nov, 2022 3 commits
  10. 20 Oct, 2022 3 commits
    • [cherry pick] Add FusedMultiTransformer fuse pass for GPT3 (#47150) · 396427a7
      Kaipeng Deng committed
      * add fused_attention_pass. test=develop
      
      * support fp16. test=develop
      
      * fix format. test=develop
    • [cherry-pick] Fix quantize model deploy bug in MKLDNN (#47119) · c2d344dd
      yeliang2258 committed
      * Fix quantize model deploy bugs when using MKLDNN (#45920)
      
      * fix immutable op quantize bugs
      
      * fix
      
      * fix build bug
      
      * fix test
      
      * notest,test=inference
      
      * fix ppyoloe acc drop bugs
      
      * fix test
      
      * fix test
      
      * add test
      
      * fix
      
      * fix
      
      * fix test
      
      * fix refined name bug
      
      * fix test
      
      * bias fix
      
      * fix matmul weight dequant bug
      
      * re-ci
      
      * fix tester
      
      * fix test
      
      * fix tester
      
      * update weight dequantize func
      
      * update code
      
      * update test for coverage
      
      * update test
      
      * update cmake
      
      * update cmakelist
      
      * update code
      
      * rerun ci
      
      * remove useless code
      
      * re-ci
      
      * update code
      
      * update code
      
      * fix header
      
      * update code for log
    • [Cherry-pick] layernorm shift partition enhance (#47086) · 9ed1454a
      Wang Bojun committed
      * Enhance the layernorm shift partition fuse op when shift size > 0 (roll shifting)
      * fix cherry-pick test
  11. 19 Oct, 2022 2 commits
  12. 17 Oct, 2022 1 commit
  13. 14 Oct, 2022 1 commit
  14. 11 Oct, 2022 1 commit
    • [cherry-pick] [PHI] relu6_grad kernel (#46501) (#46862) · 2bcbf8b0
      Sławomir Siwek committed
      * [PHI] Migrate gelu kernels (#45596)
      
      * gaussian random
      
      * mkldnn to onednn renaming
      
      * fix merge conflicts
      
      * remove fluid code
      
      * onednn renaming
      
      * gelu fwd
      
      * sort activations
      
      * gelu gradient
      
      * remove unused macros
      
      * merge conflicts
      
      * fix merge conflicts
      
      * remove extra constraint from gelu op
      
      * [PHI] relu6_grad kernel (#46501)
      
      * Relu6
      
      * remove fluid handler
      
      * add individual kernel signature
      
      * coding style
      
      * replace bounded_relu with clip
      
      * whitespace
      
      * code style
  15. 20 Sep, 2022 2 commits
  16. 13 Sep, 2022 1 commit
  17. 07 Sep, 2022 1 commit
    • Layernorm shift partition (#45736) · 960109af
      wenbin committed
      * first commit
      
      * convert done
      
      * correct format
      
      * layernorm_shift_partition
      
      * correct convert
      
      * redefine plugin
      
      * runnable
      
      * bug fix
      
      * modify ShiftPartitionPattern
      
      * correct
      
      * add UT
      
      * modify ut
      
      * compile
      
      * modify enforce
      
      * modify UT
  18. 06 Sep, 2022 1 commit
  19. 05 Sep, 2022 2 commits
  20. 31 Aug, 2022 1 commit
  21. 30 Aug, 2022 2 commits
    • Remove extra attribute in OpMaker (#44310) · fe321f9a
      zyfncg committed
      * add runtime config in phi
      
      * add runtime attr for op desc and op
      
      * fix no proto error
      
      * adjust opdesc set_attr impl
      
      * try to remove conv_op extra attrs
      
      * add init runtime attr map
      
      * change extra header path
      
      * fix runtime_attr
      
      * fix trace_op
      
      * fix bug of pass
      
      * fix merge conflict
      
      * fix dygraph attrs
      
      * fix bug of pass
      
      * fix dygraph bug
      
      * fix unittest module
      
      * delete extra attr default
      
      * fix dropout kernel
      
      * polish code
      
      * fix extra output of instance_norm
      
      * fix merge conflict
      
      * fix op_desc bug
      
      * add extra attr in yaml for conv3d_transpose
      
      * don't remove extra input and output
      
      * fix save_inference_model
      
      * fix bug of batch_norm
      
      * revert some change
      
      * polish log
      
      * polish code
      
      * add code comment
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
    • [Paddle-TRT] constant-folding (#45494) · 97f43a8e
      zhoutianzi666 committed
      Add a constant folding pass; for some models it reduces latency.
  22. 29 Aug, 2022 1 commit
    • [new_exe] Dy2Static support new_executor (#44450) · aba1295b
      zhangbo9674 committed
      * add interpretercore
      
      * refine backward program id
      
      * add code
      
      * refine program
      
      * refine code
      
      * create forward/backward_program by prog2graph2prog method
      
      * test, do not care
      
      * refine code
      
      * refine code
      
      * refine code
      
      * test, do not care
      
      * add interpretercore
      
      * add scope
      
      * refine scope create method
      
      * add jit for new_exe
      
      * solve conflict
      
      * delete unused code
      
      * polish code
      
      * polish code
      
      * refine scope in inplace
      
      * refine for datatransfer
      
      * refine _rebuild_from_desc
      
      * refine control eager deletion attr
      
      * refine used_for_jit
      
      * refine jit for infer
      
      * op size0 use ori program
      
      * polish code
      
      * refine jit
      
      * refine run_program_op ut
      
      * refine inplace
      
      * refine control
      
      * refine graph helper
      
      * refine control
      
      * refine inplace
      
      * refine buffer_share_inplace_pass
      
      * polish code
      
      * polish code
      
      * refine usage for compilerProgram
      
      * refine control
      
      * test
      
      * test core cache
      
      * refine code
      
      * refine io.py
      
      * increase test_seq2seq timeout
      
      * refine convert program
      
      * refine interpretercore_cache release
      
      * delete buildinplace
      
      * refine partial_program && io
      
      * refine code for io
      
      * test
      
      * test
      
      * test
  23. 24 Aug, 2022 1 commit
  24. 22 Aug, 2022 3 commits
  25. 17 Aug, 2022 1 commit
  26. 16 Aug, 2022 2 commits
    • convert multihead to oss (#45019) · f706d95d
      feng_shuai committed
      * convert multihead to oss
      
      * fix:bug
      
      * fix:delete const cast
      
      * fix:don't support bias_qk
      
      * add vit pass
      
      * fix:convert bug and add preln_residual_bias
      
      * support length=-1
      
      * add UT for convert
      
      * add no_bias_qk support for gpu_multihead_op
      
      * delete infer_shape depends on bias_qk
      
      * oss just can be used in T4 and A*
      
      * fix:change api for ROCM CI
    • fix new quant (#45155) · 2fb65e44
      Wangzheee committed
  27. 15 Aug, 2022 1 commit
  28. 12 Aug, 2022 1 commit
    • Offload calculations from matmul op to fuse pass (#44941) · acb78ea2
      Sławomir Siwek committed
      * remove v2_transpose_reshape
      
      * matmul_transpose_reshape
      
      * reshape_transpose_matmul
      
      * Add int8 support for matmulV2
      
      * restore ut
      
      * adjust old ut
      
      * restore parallel UT rules
      
      * remove mkldnn code from base ops
      
      * move enforces to pass
      
      * remove duplicated functions
      
      * delete duplicated enforces
      
      * feedback from review
      
      * add comments to variables
      
      * enable eltwise support
      
      * dynamic attribute
      
      * remove fusepass tests from op test
      
      * remove fuse pass cases from op test
      
      * revert introduction of dynamic attributes
      
      * style
      Co-authored-by: wozna <joanna.wozna@intel.com>
  29. 11 Aug, 2022 1 commit