1. 14 Sep, 2021 1 commit
  2. 13 Sep, 2021 2 commits
  3. 08 Sep, 2021 1 commit
  4. 07 Sep, 2021 1 commit
  5. 06 Sep, 2021 1 commit
  6. 03 Sep, 2021 2 commits
  7. 02 Sep, 2021 1 commit
  8. 31 Aug, 2021 1 commit
  9. 27 Aug, 2021 1 commit
    • B
      add elementwise max grad op for npu (#34862) · 5310ceab
      baoachun committed
      * add elementwise max grad op for npu
      
      * add elementwise max grad op for npu
      
      * add elementwise max grad op for npu
      
      * add elementwise max grad op for npu
      
      * add elementwise max grad op for npu
      5310ceab
  10. 26 Aug, 2021 1 commit
    • J
      [oneDNN] disable caching oneDNN primitives in matmul v2, Reduce grad and... · 31f0221f
      Jacek Czaja committed
      [oneDNN] disable caching oneDNN primitives in matmul v2, Reduce grad and elementwise_add grad, expand_v2 (#35132)
      
      * - grad caching disabled of matmul_v1
      
      - compilation fix
      
      - compilation fix
      
      * - reduction removed
      
      * - Matmul v2 disabled caching
      
      * Draft of further changes
      
      * - workaround for reducegrad
      
      * - fixes to UT
      
      * - fix to compilation
      
      * - another fix
      
      * - fix
      31f0221f
  11. 25 Aug, 2021 2 commits
  12. 22 Aug, 2021 1 commit
  13. 16 Aug, 2021 1 commit
    • J
      [oneDNN] Fix to 34554 (same as previous PR but should build with GPU) (#34859) · 9cb65653
      Jacek Czaja committed
      * - Added softmax without caching
      
      * - Binary is no longer manually cached
      
      * - Activation onednn caching removed
      
      * - Removed manual caching of activation
      
      * - modified UT
      
      * - fix
      
      * - fix
      
      * - fixes to building
      
      * - fix
      
      * - fix
      
      * - fix to UT
      
      * - Faulty UT workaround
      
      * - approval workaround
      
      * - Fixes after review
      
      * - compilation fixes
      
      * - more lint fixes
      
      * - more fixes after review
      
      * - fixes after another round of review
      
      * - hopefully compilation fix
      
      - compilation fix
      9cb65653
  14. 12 Aug, 2021 1 commit
  15. 11 Aug, 2021 2 commits
    • J
      [oneDNN] Fix to issue #34554 (#34623) · 0a5c99e8
      Jacek Czaja committed
      * - Added softmax without caching
      
      * - Binary is no longer manually cached
      
      * - Activation onednn caching removed
      
      * - Removed manual caching of activation
      
      * - modified UT
      
      * - fix
      
      * - fix
      
      * - fixes to building
      
      * - fix
      
      * - fix
      
      * - fix to UT
      
      * - Faulty UT workaround
      
      * - approval workaround
      
      * - Fixes after review
      
      * - compilation fixes
      
      * - more lint fixes
      
      * - more fixes after review
      
      * - fixes after another round of review
      0a5c99e8
    • A
      45af4f2a
  16. 09 Aug, 2021 1 commit
  17. 05 Aug, 2021 1 commit
  18. 07 Jul, 2021 1 commit
  19. 05 Jul, 2021 2 commits
  20. 24 Jun, 2021 1 commit
  21. 23 Jun, 2021 1 commit
  22. 12 Jun, 2021 1 commit
  23. 04 Jun, 2021 1 commit
  24. 02 Jun, 2021 2 commits
  25. 26 May, 2021 1 commit
    • L
      [NPU] refine NpuOpRunner (#32869) · 8259d9bf
      Leo Chen committed
      * refine ~npuOpRunner
      
      * implement destructor and forbid copy
      
      * use reference to avoid copy
      
      * use const reference
      
      * relax adam precision
      
      * fix top_k
      8259d9bf
  26. 25 May, 2021 1 commit
    • C
      modify complex template for elementwise ops (#33071) · dbc08d69
      chentianyu03 committed
      * modify complex template for elementwise ops
      
      * modify mul, div grad struct
      
      * add complex template for CudaShuffleDownSync CudaShuffleXorSync funcs and fix the bug when delete cuda<9000
      
      * fix shuffle func args bug
      
      * fix shuffle func args bug
      
      * fix shuffle func args bug
      dbc08d69
  27. 24 May, 2021 1 commit
  28. 20 May, 2021 2 commits
  29. 14 May, 2021 1 commit
  30. 10 May, 2021 1 commit
  31. 22 Apr, 2021 1 commit
  32. 19 Apr, 2021 1 commit
    • L
      [NPU] cherry-pick gc/dataloader/save&load/optimization from ascendrc to develop (#32294) · cbe5c9f8
      Leo Chen committed
      * [NPU] support GarbageCollector for npu (#31874)
      
      * support GarbageCollector for npu
      
      * fix typo
      
      * fix gather_grad
      
      * disable NPUDefaultStreamGarbageCollector on NPU
      
      * [NPU] support npu for memcpy op (#31808)
      
      * support npu for memcpy op
      
      * add ut
      
      * fix ut
      
      * fix typo
      
      * [NPU] fix bug of using temp vector (#31963)
      
      * fix bug when beta1_pow on cpu (#31995)
      
      * [NPU] support npu profiler (#31684)
      
      * support npu profiler
      
      * add python api
      
      * fix bugs
      
      * add wrapper for incomplete type
      
      * update profile proto
      
      * record npu wait
      
      * add xpu placeholder
      
      * fix adam (#32016)
      
      * [NPU] enable async copy and add wait before sync operation (#31956)
      
      * enable async copy and add wait before sync operation
      
      * remove unnecessary wait
      
      * add FillNpuTensorWithConstant
      
      * refine
      
      * fix fill_constant
      
      * make TensorFromVector/TensorToVector sync
      
      * [NPU] Support dataloader on npu place. (#31867)
      
      * [NPU] Wait on NPUPlace (#32086)
      
      * [NPU] fix cast op (#32121)
      
      * fix npu kernel of cast op to handle casting to same dtype
      
      * add comments
      
      * [NPU] support cann 20.3 (#32044)
      
      * fix compile problem on cann 20.3
      
      * fix ut
      
      * fix test_mul
      
      * fix check_finite_and_scale
      
      * fix lookup_table_v2_grad
      
      * fix cmake
      
      * support print op
      
      * [NPU] Support npu save load (#31893)
      
      * support save load for NPU
      
      * add save load npu unittest
      
      * support np.array transform in NPU
      
      * fix errors
      
      * delete dygraph in unittest
      
      * add Wait
      
      * fix unittest
      
      * fix review comment
      
      * fix unittest problem
      
      * fix little problem
      
      * change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance (#32196)
      
      * change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance
      
      * refine code
      
      * fix NPUDeviceContext in all c++ unittest (#32198)
      
      * fix NPUDeviceContext in all c++ unittest
      
      * refine log
      Co-authored-by: pangyoki <pangyoki@126.com>
      
      * [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994)
      
      * enable async copy and add wait before sync operation
      
      * remove unnecessary wait
      
      * add FillNpuTensorWithConstant
      
      * refine
      
      * fix fill_constant
      
      * change TensorFromVector to FillNpuTensorWithConstant
      
      * fix ignored api
      
      * delete extra unittest
      
      * fix little error
      
      * fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu
      
      * change TensorCopySync to TensorCopy
      
      * delete useless Wait and add StreamWait
      
      * fix npu_stream error
      
      * fix check_finite_and_unscale_op_npu TensorCopy
      
      * only save stream wait
      
      * fix NPUDeviceContext in all c++ unittest
      
      * delete wait
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
      
      * delete useless unittest file (#32206)
      
      * Fix op test (#32231)
      
      * fix conditional block (#32243)
      
      * fix adam bug again (#32246)
      
      * fix compile
      
      * fix ut
      
      * fix ut
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
      Co-authored-by: pangyoki <pangyoki@126.com>
      cbe5c9f8
  33. 18 Apr, 2021 1 commit