    [NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d
    Committed by Leo Chen
    * [feature] support npu allocator (#30840)
    
    [feature] support npu allocator
    
    * [feature] support npu operator (#30951)
    
    [feature] support npu operator
    
    * [feature] support npu allocator, part 2 (#30972)
    
    * support npu allocator
    
    * add npu device context
    
    * fix some compile problem
    
    * fix some compile problem
    
    * add npu info
    
    * compile ok
    
    * fix include dir
    
    * support naive_best_fit_allocator
    
    * run ut ok, but failed to exit
    
    * call aclrtResetDevice before exit
    
    * fix aclFinalize
    
    * add system allocator test
    
    * add selected_gpus in gtest
    
    * add tensor_test for npu
    
    * support npu op, initial commit
    
    * add npu stream
    
    * add elementwise_add_op
    
    * compile ok
    
    * fix typo
    
    * fix elementwise_add_op_npu_test
    
    * support op run
    
    * test can run but failed
    
    * change aclopExecuteV2 to aclopCompileAndExecute
    
    * support parsing ascend rank table file (#31000)
    
    support parsing ascend rank table file
    
    * Fix reshape on GE graph. (#31084)
    
    Fix reshape on GE graph
    
    * add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
    
    * add npu sub op
    
    * fix typo
    
    * rename test
    
    * fix bug
    
    * fix bug
    
    * add fp16 kernel
    
    * fix typo
    
    * support sub grad op
    
    * support elementwise_sub_grad op
    Co-authored-by: frankwhzhang <frankwhzhang@126.com>
    
    * Fix compilation problem (#31100)
    
    Fix compilation problem (#31100)
    
    * fix compile
    
    * fix code style
    
    * remove const_cast
    
    * support adding correct npu op in pybind.h (#31143)
    
    * support adding correct npu op in pybind.h
    
    * refine code
    
    * [NPU] Support executor with NPU (#31057)
    
    * [NPU] Support executor with NPU
    
    * Fix code according to reviews
    
    * Fix code
    
    * Add unittest for sub op npu
    
    * refactor npu device manager (#31154)
    
    refactor npu device manager (#31154)
    
    * fix selected npus
    
    * fix compile
    
    * fix reading flags from env
    
    * format
    Co-authored-by: xiayanming <41795079@qq.com>
    Co-authored-by: gongweibao <weibao.gong@gmail.com>
    Co-authored-by: frankwhzhang <frankwhzhang@126.com>
    Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
tensor_py.h 29.1 KB