1. 12 Apr, 2021 1 commit
    • [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994) · 5648bd80
      pangyoki committed
      * enable async copy and add wait before sync operation
      
      * remove unnecessary wait
      
      * add FillNpuTensorWithConstant
      
      * refine
      
      * fix fill_constant
      
      * change TensorFromVector to FillNpuTensorWithConstant
      
      * fix ignored api
      
      * delete extra unittest
      
      * fix little error
      
      * fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu
      
      * change TensorCopySync to TensorCopy
      
      * delete useless Wait and add StreamWait
      
      * fix npu_stream error
      
      * fix check_finite_and_unscale_op_npu TensorCopy
      
      * only save stream wait
      
      * fix NPUDeviceContext in all c++ unittest
      
      * delete wait
      Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com>
  2. 15 Mar, 2021 1 commit
  3. 11 Jul, 2020 1 commit
  4. 23 Mar, 2019 1 commit
    • [Operator] Add range op. (#15431) · 18779b5b
      whs committed
      * Add range op.
      test=develop
      
      * Add more unit tests.
      test=develop
      
      * Fix API.spec
      test=develop
      
      * Fix API.spec
      test=develop
      
      * Fix API.spec
      test=develop