1. 11 7月, 2019 1 次提交
    • Z
      Feature/buffer_shared_inplace (#17911) · d3003a16
      Zeng Jinle 提交于
      * feature/buffer_shared_inplace, test=develop
      
      * refine code, test=develop
      
      * fix elementwise_add op cpu inplace and sum inplace bug, test=develop
      
      * add unittest and debug log, test=develop
      
      * fix parallel_executor scope bug, polish code, test=develop
      
      * fix sum op, activation op, single_in_place_inference bug, test=develop
      
      * remove kLocalExecScopeName, test=develop
      
      * fix unittest,test=develop
      
      * fix out_var first version bug, test=develop
      
      * follow comments,test=develop
      d3003a16
  2. 28 3月, 2019 1 次提交
  3. 21 2月, 2019 1 次提交
    • D
      Profiler refine and add CUDA runtime api tracer (#15301) · a83e4704
      Dun 提交于
      * refine profiler && add runtime tracer
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix bug && test=develop
      
      * add thread id map && test=develop
      
      * test=develop
      
      * testing
      
      * bug fix
      
      * remove cuda event && refine code && test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix windows temp file && test=develop
      
      * test=develop
      
      * fix windows bug && test=develop
      
      * fix start up issue && test=develop
      
      * code polish &&  test=develop
      
      * remove unused code && test=develop
      
      * add some cupti cbid && test=develop
      
      * add FLAGS_multiple_of_cupti_buffer_size && test=develop
      
      * fix compile error && test=develop
      
      * add keyword && test=develop
      
      * fix && test=develop
      
      * code polish && test=develop
      a83e4704
  4. 19 2月, 2019 1 次提交
  5. 17 1月, 2019 1 次提交
  6. 26 11月, 2018 1 次提交
  7. 22 11月, 2018 1 次提交
  8. 08 11月, 2018 1 次提交
  9. 29 10月, 2018 2 次提交
    • Q
      fix compile, optimize code test=develop · 3d4e0508
      Qiao Longfei 提交于
      3d4e0508
    • W
      [1.1] [project] train imagenet using large batch size (#13766) · 26200f2e
      Wu Yi 提交于
      * fix nccl2 lars dist support
      
      * put lars in momentum op
      
      * add tests lars
      
      * fix ci
      
      * fix cpu kernel
      
      * soft warning
      
      * remove lars in test_recognize_digits.py
      
      * move to another op
      
      * add file
      
      * update api.spec test=develop
      
      * update test=develop
      
      * fix api.spec test=develop
      
      * wip
      
      * wip, finish grad merge ops
      
      * wip, finish graph build
      
      * wip test running
      
      * work on 1 gpu
      
      * workable version
      
      * update
      
      * fix tests
      
      * fuse broadcast op
      
      * fix compile failed
      
      * refine
      
      * add batch merge test mnist
      
      * fix CI test=develop
      
      * fix build
      
      * use independent bn params for batch merge test=develop
      
      * update api.spec
      
      * follow comments and for test
      
      * wip
      
      * refine tests test=develop
      
      * follow comments test=develop
      
      * remove startup bn modify test=develop
      
      * follow comments test=develop
      
      * fix merge test=develop
      26200f2e
  10. 27 10月, 2018 1 次提交
  11. 13 9月, 2018 1 次提交
  12. 12 9月, 2018 1 次提交
  13. 22 6月, 2018 1 次提交
  14. 21 6月, 2018 3 次提交
  15. 09 5月, 2018 2 次提交
  16. 05 5月, 2018 1 次提交
  17. 04 5月, 2018 1 次提交
  18. 02 5月, 2018 1 次提交
  19. 20 4月, 2018 1 次提交
  20. 18 4月, 2018 2 次提交
  21. 17 4月, 2018 1 次提交
  22. 16 4月, 2018 1 次提交
  23. 13 4月, 2018 3 次提交
  24. 12 4月, 2018 1 次提交
  25. 11 4月, 2018 4 次提交