1. 01 Apr, 2021 1 commit
    • [NPU] support npu profiler (#31684) · 6503ef56
      Leo Chen committed
      * support npu profiler
      
      * add python api
      
      * fix bugs
      
      * add wrapper for incomplete type
      
      * update profile proto
      
      * record npu wait
      
      * add xpu placeholder
  2. 03 Nov, 2020 1 commit
  3. 07 Jul, 2020 1 commit
  4. 03 Jul, 2020 1 commit
  5. 09 Jun, 2020 1 commit
  6. 26 May, 2020 1 commit
  7. 25 May, 2020 1 commit
  8. 11 May, 2020 1 commit
    • Add macro BOOST_GET to enrich the error information of boost::get (#24175) · aa0f254f
      Chen Weihang committed
      * add new macro BOOST_GET_SAFELY & unittests, test=develop
      
      * add different macro type, test=develop
      
      * fix get macro type in executor, test=develop
      
      * four macro part change backup
      
      * using one macro for all case, test=develop
      
      * revert attribute change, test=develop
      
      * change to three func to solve gcc4.8 bug, test=develop
      
      * polish some details, test=develop
  9. 24 Feb, 2020 1 commit
  10. 23 Feb, 2020 1 commit
  11. 09 Jan, 2020 1 commit
  12. 28 Nov, 2019 1 commit
  13. 13 Mar, 2019 1 commit
  14. 11 Mar, 2019 1 commit
  15. 04 Mar, 2019 3 commits
  16. 01 Mar, 2019 1 commit
  17. 24 Feb, 2019 1 commit
  18. 22 Feb, 2019 1 commit
  19. 21 Feb, 2019 1 commit
    • Profiler refine and add CUDA runtime api tracer (#15301) · a83e4704
      Dun committed
      * refine profiler && add runtime tracer
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix bug && test=develop
      
      * add thread id map && test=develop
      
      * test=develop
      
      * testing
      
      * bug fix
      
      * remove cuda event && refine code && test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix windows temp file && test=develop
      
      * test=develop
      
      * fix windows bug && test=develop
      
      * fix start up issue && test=develop
      
      * code polish &&  test=develop
      
      * remove unused code && test=develop
      
      * add some cupti cbid && test=develop
      
      * add FLAGS_multiple_of_cupti_buffer_size && test=develop
      
      * fix compile error && test=develop
      
      * add keyword && test=develop
      
      * fix && test=develop
      
      * code polish && test=develop
  20. 04 Dec, 2018 1 commit
    • test=develop · deb04809
      ZongwuYang committed
      Fix the bug that profiler cannot trace the nccl allreduce operator
  21. 26 Nov, 2018 1 commit
  22. 08 Nov, 2018 1 commit
  23. 13 Aug, 2018 1 commit
  24. 10 Aug, 2018 1 commit
  25. 31 Jul, 2018 1 commit
  26. 30 Jul, 2018 3 commits
  27. 23 Jul, 2018 1 commit
  28. 14 Jun, 2018 1 commit
    • Remove cuptiFinalize. · d2afd210
      Xin Pan committed
      In the CUPTI samples, only cuptiFlush is used.
      I can't find any place that calls cuptiFinalize, and
      this API can error out as not_implemented in some
      CUDA installations.
  29. 08 Jun, 2018 2 commits
  30. 22 May, 2018 1 commit
    • multi-thread handlerequest · b4dd4c04
      Xin Pan committed
          Experiment on vgg flower, 2 trainers, 1ps.
          more trainer could have more speedup.
      
          After:
          Pass = 0, Iters = 327, Speed = (7.52) img/s
          Before:
          Pass = 0, Iters = 385, Speed = (6.77) img/s
  31. 10 Apr, 2018 1 commit
  32. 14 Mar, 2018 1 commit
  33. 08 Mar, 2018 2 commits
  34. 06 Mar, 2018 1 commit