1. 05 Jan, 2023 1 commit
  2. 04 Jan, 2023 3 commits
    • [Inference] Add conv_fusion nhwc impl. (#49047) · 4a8708bb
      Committed by Wilber
    • [Paddle Inference] fix mixed precision diff (#49475) · ac75a9a6
      Committed by Yuanle Liu
    • [Unify KernelKey] change OpKernelType->KernelKey (#49138) · 4383494f
      Committed by HongyuJia
      * execute use kernel_key first
      
      * change OpKernelType->KernelKey
      
      * fix py3 compile error, remove redundant header files
      
      * fix build_strategy_test
      
      * fix DataType::RAW
      
      * fix custom_type test: operator_test.cc
      
      * fix transform place
      
      * fix backends_are_same_class
      
      * try fix place TransDataDevice
      
      * support all KernelKey
      
      * fix TransformData
      
      * fix place_are_same_class
      
      * fix merge
      
      * fix test_params_no_grad
      
      * fix specific place of GetExpectedKernelType
      
      * fix specific place of GetExpectedKernelType
      
      * fix GetKernelTypeForVar
      
      * fix dtype error
      
      * fix fetch_v2
      
      * change GetKernelTypeForVar
      
      * fix interpreter
      
      * fix typo error
      
      * polish codes
      
      * polish codes
      
      * polish codes
      
      * fix conflict
  3. 03 Jan, 2023 1 commit
  4. 29 Dec, 2022 2 commits
  5. 23 Dec, 2022 1 commit
  6. 20 Dec, 2022 1 commit
  7. 19 Dec, 2022 1 commit
  8. 16 Dec, 2022 1 commit
  9. 15 Dec, 2022 2 commits
  10. 14 Dec, 2022 2 commits
  11. 13 Dec, 2022 1 commit
  12. 12 Dec, 2022 1 commit
  13. 09 Dec, 2022 2 commits
  14. 08 Dec, 2022 1 commit
  15. 07 Dec, 2022 1 commit
  16. 06 Dec, 2022 1 commit
    • Clear extra input (Bias, ResidualData) in OpMaker of conv2d (#47579) · 0a2dfa38
      Committed by zyfncg
      * delete Bias and ResidualData in OpMaker of conv2d
      
      * delete extra input of conv3d
      
      * refactor pass of conv_bias_fusion
      
      * fix mkldnn dependency
      
      * fix mkldnn compile
      
      * fix test_conv_bias_mkldnn_fuse_pass
      
      * polish some code
      
      * remove useless log
      
      * fix analyzer_vit_ocr_tester
      
      * fix conv_activation_mkldnn_fuse_pass
      
      * fix test_analyzer_ocr
      
      * add fused_conv_sig
      
      * fix performance regression
      
      * fix performance regression
  17. 05 Dec, 2022 2 commits
  18. 01 Dec, 2022 1 commit
  19. 30 Nov, 2022 4 commits
  20. 29 Nov, 2022 2 commits
    • fix mma_tensorcore (#48386) · bf4d1792
      Committed by lzy
      * fix mma_tensorcore (__CUDA_ARCH__)
      
      * disable tensorcore by default.
      
      Disable tensorcore by default because the __CUDA_ARCH__ check causes undefined behavior in some environments; it can be enabled manually on a machine that supports tensorcore.
    • [PHI decoupling] Move MKLDNN code (#48352) · fa051eec
      Committed by Sławomir Siwek
  21. 28 Nov, 2022 4 commits
  22. 23 Nov, 2022 1 commit
  23. 22 Nov, 2022 2 commits
  24. 21 Nov, 2022 1 commit
    • mma qk tensor_core (#48087) · d79eda71
      Committed by lzy
      * use mma for QK dot computing in fused_multi_transformer.
      * Update fused_multi_transformer_op.cu.h
  25. 18 Nov, 2022 1 commit