1. 27 Jan, 2022 4 commits
    • A
      [PTen]Support AllocateFrom in Tensor and Alloc/HostAlloc in Context (#39022) · 5631da9c
      Aurelius84 committed
      * Support allocate_from in Tensor and allocate_data in Context
      
      * fix #ifdef CUDA
      
      * fix cycle depends
      
      * fix test_xxx_dev_api failed
      
      * fix windows compiling error
      
      * fix unittest
      
      * modify into PImpl
      
      * fix selected rows
      
      * add TODO comment
      
      * refine interface according to reviewer
    • Q
      [MLU] add compile ci scripts for MLU, test=mlu_ci (#39122) · 56410b4a
      Qi Li committed
    • A
      [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Aganlengzi committed
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
    • A
      [NPU] fix aarch64 deps (#39257) · 80dfa010
      Aganlengzi committed
  2. 26 Jan, 2022 4 commits
  3. 25 Jan, 2022 5 commits
  4. 24 Jan, 2022 3 commits
  5. 21 Jan, 2022 4 commits
  6. 20 Jan, 2022 2 commits
  7. 19 Jan, 2022 1 commit
  8. 18 Jan, 2022 3 commits
  9. 17 Jan, 2022 4 commits
  10. 15 Jan, 2022 1 commit
  11. 14 Jan, 2022 1 commit
  12. 13 Jan, 2022 2 commits
  13. 12 Jan, 2022 3 commits
  14. 11 Jan, 2022 1 commit
  15. 10 Jan, 2022 2 commits
    • H
      Add gpu kernel for new api : linalg.lstsq (#38621) · 405103d8
      Haohongxiang committed
      * add lstsq gpu kernel
      
      * update
      
      * add docs_en
      
      * modify ut
      
      * fix bugs
      
      * modify example in docs_en
      
      * remove lstsq_op.cu from ROCM cmake
      
      * modify docs_en
      
      * modify docs_en
      
      * modify docs_en
      
      * remove unnecessary TensorCopy
    • L
      Profiler skeleton (#38826) · a8afed69
      liutiexing committed
      * add align for WorkQueue
      
      * add spinlock
      
      * merge develop
      
      * merge
      
      * Add EventsWaiter
      
      * Revert "Add EventsWaiter"
      
      This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
      
      * profiler skeleton
      
      * update
      
      * update
      
      * update
      Co-authored-by: Nliutiexing <liutiexing@google.com>