1. 19 Dec, 2022 1 commit
  2. 16 Dec, 2022 1 commit
  3. 15 Dec, 2022 2 commits
  4. 14 Dec, 2022 2 commits
  5. 13 Dec, 2022 1 commit
  6. 12 Dec, 2022 1 commit
  7. 09 Dec, 2022 2 commits
  8. 08 Dec, 2022 1 commit
  9. 07 Dec, 2022 1 commit
  10. 06 Dec, 2022 1 commit
    • Z
      Clear extra input (Bias, ResidualData) in OpMaker of conv2d (#47579) · 0a2dfa38
      zyfncg committed
      * delete Bias and ResidualData in OpMaker of conv2d
      
      * delete extra input of conv3d
      
      * refactor pass of conv_bias_fusion
      
      * fix mkldnn dependency
      
      * fix mkldnn compile
      
      * fix test_conv_bias_mkldnn_fuse_pass
      
      * polish some code
      
      * remove useless log
      
      * fix analyzer_vit_ocr_tester
      
      * fix conv_activation_mkldnn_fuse_pass
      
      * fix test_analyzer_ocr
      
      * add fused_conv_sig
      
      * fix performance regression
      
      * fix performance regression
      0a2dfa38
  11. 05 Dec, 2022 2 commits
  12. 01 Dec, 2022 1 commit
  13. 30 Nov, 2022 4 commits
  14. 29 Nov, 2022 2 commits
    • L
      fix mma_tensorcore (#48386) · bf4d1792
      lzy committed
      * fix mma_tensorcore (__CUDA_ARCH__)
      
      * disable tensorcore by default.
      
      Disable tensorcore by default, because the check on __CUDA_ARCH__ causes undefined behavior in some environments. It can be enabled manually on a machine that supports tensorcore.
      bf4d1792
    • S
      [PHI decoupling] Move MKLDNN code (#48352) · fa051eec
      Sławomir Siwek committed
      fa051eec
  15. 28 Nov, 2022 4 commits
  16. 23 Nov, 2022 1 commit
  17. 22 Nov, 2022 2 commits
  18. 21 Nov, 2022 1 commit
    • L
      mma qk tensor_core (#48087) · d79eda71
      lzy committed
      * use mma for QK dot computing in fused_multi_transformer.
      * Update fused_multi_transformer_op.cu.h
      d79eda71
  19. 18 Nov, 2022 3 commits
  20. 17 Nov, 2022 2 commits
  21. 15 Nov, 2022 1 commit
    • S
      mkldnn directory cleanup (#47779) · 8a339d24
      Sławomir Siwek committed
      * cleanup unused code
      
      * unify is_int8 is_bfloat16
      
      * Simplify matmul_v2 FWD kernel
      
      * remove RunKernel methods
      
      * remove import namespace
      
      * remove headers
      
      * clean fluid/phi cross imports
      
      * remove fluid axpy_handler
      
      * delete fluid methods
      
      * activations
      
      * OneDNNMemDesc
      
      * MKLDNNFormatForSize
      
      * MatchShapeToLayout
      
      * MKLDNNMemoryFormat
      
      * MKLDNNFormat
      
      * ReorderMKLDNNHandler
      
      * to_void_cast
      
      * review suggestions
      
      * interpolate
      
      * remove fluid dependency
      8a339d24
  22. 09 Nov, 2022 2 commits
    • H
      clean repetitious GetKernelTypeForVar (#47763) · c551e55d
      HongyuJia committed
      c551e55d
    • J
      Final changes to introduce mem_desc to be held in Tensor (#46768) · 14f261ad
      Jacek Czaja committed
      * first commit
      
      - more fixes
      
      - compilation fix
      
      - compilation fix
      
      - fix
      
      - another fix
      
      - yet another fix
      
      - Fix
      
      - fix to fused ops
      
      - compilation fix
      
      - compilation fix
      
      - another compilation fix
      
      - another fix
      
      - fix
      
      - fix
      
      - fix
      
      - fix
      
      - yet another fix
      
      - fix
      
      - fix
      
      - cosmetic fix
      
      - lint
      
      - Revert some changes (to be brought back later)
      
      - fix to build
      
      - Added prototype of slice
      
      - fix
      
      compilation fix
      
      - compilation fix
      
      - fix
      
      - fix
      
      - Fix
      
      - fix
      
       fix
      	modified:   cmake/flags.cmake
      
      * lint
      
      * rerun of CI
      
      * - Fix
      
      * - lint
      
      * - lint2
      14f261ad
  23. 07 Nov, 2022 1 commit
  24. 01 Nov, 2022 1 commit
    • C
      Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9
      Chen Weihang committed
      * add extra attr property set
      
      * add type_info for all context
      
      * add onednn context to all context
      
      * fix context compile error
      
      * simplify conv kernel args
      
      * pass runtime attr into dev_ctx
      
      * fix macro error
      
      * clear conv_grad_kernel extra args
      
      * merge conv_grad_grad into conv_grad
      
      * clear conv2d_grad_grad extra attrs
      
      * clear yaml and eager extra attr
      
      * fix conv1d error
      
      * change to thread local
      
      * fix npu compile failed
      
      * try to fix windows compile failed
      
      * add conv2d onednn phi kernel
      
      * fix ci bugs (#36)
      
      * fix compile bugs (#38)
      
      * fix extra input transform bug (#39)
      
      * support dynamic created attr (#40)
      
      * reset extra info gen code
      
      * rm conv_grad_grad kernel
      
      * reimpl pass attr adapting
      
      * add int attr support
      
      * remove vector inputnames creating
      
      * fix map at error
      
      * Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
      Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
      
      * remove useless extra attrs
      
      * replace mkldnn_engine by onednn_engine
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
      c923e6c9