1. 09 Jan, 2020 1 commit
  2. 28 Nov, 2019 1 commit
  3. 13 Mar, 2019 1 commit
  4. 11 Mar, 2019 1 commit
  5. 04 Mar, 2019 3 commits
  6. 01 Mar, 2019 1 commit
  7. 24 Feb, 2019 1 commit
  8. 22 Feb, 2019 1 commit
  9. 21 Feb, 2019 1 commit
    • Profiler refine and add CUDA runtime api tracer (#15301) · a83e4704
      Committed by Dun
      * refine profiler && add runtime tracer
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix bug && test=develop
      
      * add thread id map && test=develop
      
      * test=develop
      
      * testing
      
      * bug fix
      
      * remove cuda event && refine code && test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix windows temp file && test=develop
      
      * test=develop
      
      * fix windows bug && test=develop
      
      * fix start up issue && test=develop
      
      * code polish &&  test=develop
      
      * remove unused code && test=develop
      
      * add some cupti cbid && test=develop
      
      * add FLAGS_multiple_of_cupti_buffer_size && test=develop
      
      * fix compile error && test=develop
      
      * add keyword && test=develop
      
      * fix && test=develop
      
      * code polish && test=develop
  10. 04 Dec, 2018 1 commit
    • test=develop · deb04809
      Committed by ZongwuYang
      Fix the bug that profiler cannot trace the nccl allreduce operator
  11. 26 Nov, 2018 1 commit
  12. 08 Nov, 2018 1 commit
  13. 13 Aug, 2018 1 commit
  14. 10 Aug, 2018 1 commit
  15. 31 Jul, 2018 1 commit
  16. 30 Jul, 2018 3 commits
  17. 23 Jul, 2018 1 commit
  18. 14 Jun, 2018 1 commit
    • Remove cuptiFinalize. · d2afd210
      Committed by Xin Pan
      In cupti samples, only cuptiFlush is used.
      I can't find any places calling cuptiFinalize and
      this API can error out as not_implemented in some
      cuda installation.
  19. 08 Jun, 2018 2 commits
  20. 22 May, 2018 1 commit
    • multi-thread handlerequest · b4dd4c04
      Committed by Xin Pan
          Experiment on vgg flower, 2 trainers, 1ps.
          more trainer could have more speedup.
      
          After:
          Pass = 0, Iters = 327, Speed = (7.52) img/s
          Before:
          Pass = 0, Iters = 385, Speed = (6.77) img/s
  21. 10 Apr, 2018 1 commit
  22. 14 Mar, 2018 1 commit
  23. 08 Mar, 2018 2 commits
  24. 06 Mar, 2018 2 commits
  25. 02 Mar, 2018 1 commit
  26. 01 Mar, 2018 2 commits
  27. 28 Feb, 2018 1 commit
  28. 26 Feb, 2018 2 commits