1. 11 Nov, 2022 · 2 commits
  2. 10 Nov, 2022 · 1 commit
    • J
      XPU multi-card support eager mode (#47445) · 3b91f8f3
      james committed
      * XPU support eager mode
      
      * add unittest for XPU eager mode
      
      * minor bugfix
      
      * minor bugfix, test=kunlun
      
      * correct copyright info
      
      * 1. remove unused vars/funcs
      2. ProcessGroupBKCL inherits from ProcessGroupStream
      
      * bugfix for fp16 in eager mode multi-card, test=kunlun
      
      * rebase & fix a few issues
      
      * use new processgroup interface, test=kunlun
      
      * fix compile issue, test=kunlun
  3. 09 Nov, 2022 · 1 commit
    • J
      Final changes to introduce mem_desc to be held in Tensor (#46768) · 14f261ad
      Jacek Czaja committed
      * first commit
      
      - more fixes
      
      - compilation fix
      
      - compilation fix
      
      - fix
      
      - another fix
      
      - yet another fix
      
      - Fix
      
      - fix to fused ops
      
      - compilation fix
      
      - compilation fix
      
      - another compilation fix
      
      - another fix
      
      - fix
      
      - fix
      
      - fix
      
      - fix
      
      - yet another fix
      
      - fix
      
      - fix
      
      - cosmetic fix
      
      - lint
      
      - Revert some changes (to be brought back later)
      
      - fix to build
      
      - Added prototype of slice
      
      - fix
      
      - compilation fix
      
      - compilation fix
      
      - fix
      
      - fix
      
      - Fix
      
      - fix
      
      - fix
        modified: cmake/flags.cmake
      
      * lint
      
      * rerun of CI
      
      * - Fix
      
      * - lint
      
      * - lint2
  4. 08 Nov, 2022 · 2 commits
  5. 07 Nov, 2022 · 5 commits
    • H
      squeeze2 + transpose2 fuse onednn (#47592) · fa874a46
      Hui Zhang committed
      * squeeze2 transpose2 fuse onednn
      
      * format
      
      * fix output shape
      
      * fix conflict
      
      * format
      
      * format
      
      * remove useless
      
      * remove log
      
      * simply pass
      
      * fix comment
      
      * fix
      
      * fix msg
      
      * fix error msg
      
      * format
    • Q
      support kldiv_loss/kldiv_loss_grad for kunlun (#47638) · 5f0a8adc
      QingshuChen committed
      * test=kunlun
    • Y
      add roll and roll_grad kernels and strided_slice and strided_slice_grad... · 5a4d2186
      ykkk2333 committed
      add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun (#47368)
      
      * add stat tool
      
      * add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun
    • R
      call InitDevices only once (#47678) · 0cbdcdda
      ronnywang committed
    • H
      [Restore PR] Remove hard code of PADDLE_WITH_CUDA (#47630) · 908a381d
      HongyuJia committed
      * move cudnn hardcode outside GetExpectedKernelType
      
      * add header file
      
      * debug
      
      * update interpreter_util with hardcode
      
      * update interpreter_util headerfile
      
      * solve activation hardcode
      
      * debug with CI
      
      * add mkldnn_op_list header file
      
      * temporarily uncomment mkldnn
      
      * temporarily uncomment mkldnn
      
      * delete sequence_softmax cudnn hardcode
      
      * add hardcode to data_transfer.cc
      
      * update data_transfer headerfile
      
      * try fix segment fault
      
      * update cudnn&miopen_helper
      
      * reset HasAttr of DygraphExctnCtx
      
      * debug, this commit should pass all CI
      
      * debug should pass CI, temporarily disable activation
      
      * debug should pass CI
      
      * fix default_attr=nullptr bug
      
      * clean debug code
      
      * Call SetDnnFallback function in the base class
      
      * activation fallback to plain kernel
      
      * fix default GetExpectedKernelType find wrong kernel
      
      * search cudnn kernel instead of fallback
      
      * fix cudnn_handle bug
      
      * remove tanh use_cudnn
      
      * restore tanh use_cudnn
      
      * debug tanh
      
      * fix tanh bug
      
      * delete activation cudnn kernel
      
      * polish code
  6. 05 Nov, 2022 · 1 commit
  7. 04 Nov, 2022 · 3 commits
  8. 03 Nov, 2022 · 1 commit
  9. 02 Nov, 2022 · 4 commits
  10. 01 Nov, 2022 · 2 commits
    • H
      [Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325) · f9134045
      HongyuJia committed
      * move cudnn hardcode outside GetExpectedKernelType
      
      * add header file
      
      * debug
      
      * update interpreter_util with hardcode
      
      * update interpreter_util headerfile
      
      * solve activation hardcode
      
      * debug with CI
      
      * add mkldnn_op_list header file
      
      * temporarily uncomment mkldnn
      
      * temporarily uncomment mkldnn
      
      * delete sequence_softmax cudnn hardcode
      
      * add hardcode to data_transfer.cc
      
      * update data_transfer headerfile
      
      * try fix segment fault
      
      * update cudnn&miopen_helper
      
      * reset HasAttr of DygraphExctnCtx
      
      * debug, this commit should pass all CI
      
      * debug should pass CI, temporarily disable activation
      
      * debug should pass CI
      
      * fix default_attr=nullptr bug
      
      * clean debug code
    • C
      Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9
      Chen Weihang committed
      * add extra attr property set
      
      * add type_info for all context
      
      * add onednn context to all context
      
      * fix context compile error
      
      * simplify conv kernel args
      
      * pass runtime attr into dev_ctx
      
      * fix macro error
      
      * clear conv_grad_kernel extra args
      
      * merge conv_grad_grad into conv_grad
      
      * clear conv2d_grad_grad extra attrs
      
      * clear yaml and eager extra attr
      
      * fix conv1d error
      
      * change to thread local
      
      * fix npu compile failed
      
      * try to fix windows compile failed
      
      * add conv2d onednn phi kernel
      
      * fix ci bugs (#36)
      
      * fix compile bugs (#38)
      
      * fix extra input transform bug (#39)
      
      * support dynamic created attr (#40)
      
      * reset extra info gen code
      
      * rm conv_grad_grad kernel
      
      * reimpl pass attr adapting
      
      * add int attr support
      
      * remove vector inputnames creating
      
      * fix map at error
      
      * Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
      Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
      
      * remove useless extra attrs
      
      * replace mkldnn_engine by onednn_engine
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
  11. 27 Oct, 2022 · 1 commit
  12. 26 Oct, 2022 · 2 commits
    • S
      FC/matmul(v2) + scale fuse pass (#47127) · c1c2be2d
      Sławomir Siwek committed
      * fc/matmuls + scale fuse pass
      
      * remove double-extension
      
      * add unit tests
      
      * comments from review
      
      * codestyle
      
      * add pass to int8 list
      
      * new codestyle
      
      * attr name typo
    • H
      [MKLDNN] Delete mkldnn hard code of prior_box (#47068) · d78dd7ea
      HongyuJia committed
      * remove prior_box mkldnn hard code
      
      * add header file
      
      * simplify PD_VISIT_TYPE
      
      * decouple dependency between prior_box and density_prior_box
      
      * fix pragma omp parallel error
      
      * bypass #pragma omp_parallel_for error
      
      * polish code
      
      * remove visit_type headerfile
      
      * polish codestyle
      
      * polish codestyle
      
      * try fix CI error
      
      * add testcase, datatype=float64
      
      * reset test_prior_box testcase
      
      * add datacheck to DenseTensor
      
      * update template name
      
      * call prior_box with macro expand
  13. 25 Oct, 2022 · 2 commits
  14. 24 Oct, 2022 · 1 commit
  15. 21 Oct, 2022 · 1 commit
  16. 20 Oct, 2022 · 1 commit
  17. 19 Oct, 2022 · 2 commits
  18. 18 Oct, 2022 · 1 commit
  19. 17 Oct, 2022 · 1 commit
  20. 15 Oct, 2022 · 1 commit
  21. 13 Oct, 2022 · 2 commits
    • L
      add thread name for dataloader (#46990) · 770501b8
      Leo Chen committed
    • H
      [Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (#46606) · ef1c8759
      HongyuJia committed
      * remove PADDLE_WITH_MKLDNN, test white_list=abs
      
      * fix unique_ptr
      
      * fix op.Type()
      
      * remove TODO in kernel_dispatch.h
      
      * remove IndicateVarDataType function, update white_list
      
      * remove mkldnn hard code
      
      * add comments
      
      * fix ==
      
      * update mkldnn_op_list
      
      * delete hard code of OPs
      
      * update mkldnn_op_list
      
      * update mkldnn_op_list, remove interp
      
      * add error check for ExecutionContext
      
      * update mkldnn_op_list, remove transpose2_grad
      
      * remove interpolate mkldnn
      
      * remove fill_constant mkldnn
      
      * opt HasAttr in DygraphExecutionContext
      
      * deprecated commit, test mkldnn_white_list
      
      * deprecated commit, test mkldnn_white_list
      
      * deprecated commit, test mkldnn_black_list
      
      * update mkldnn_op_list, add assert error op
      
      * solve cudnn related op
      
      * fix error
      
      * add mkldnn fallback in phi_utils.cc
      
      * remove mkldnn fallback in phi_utils.cc
      
      * opt code implementation
      
      * polish Copyright License
  22. 11 Oct, 2022 · 2 commits
  23. 10 Oct, 2022 · 1 commit