1. 23 Jul, 2019 1 commit
  2. 11 Jul, 2019 1 commit
    • Feature/buffer_shared_inplace (#17911) · d3003a16
      Committed by Zeng Jinle
      * feature/buffer_shared_inplace, test=develop
      
      * refine code, test=develop
      
      * fix elementwise_add op cpu inplace and sum inplace bug, test=develop
      
      * add unittest and debug log, test=develop
      
      * fix parallel_executor scope bug, polish code, test=develop
      
      * fix sum op, activation op, single_in_place_inference bug, test=develop
      
      * remove kLocalExecScopeName, test=develop
      
      * fix unittest,test=develop
      
      * fix out_var first version bug, test=develop
      
      * follow comments,test=develop
  3. 10 Jul, 2019 1 commit
  4. 06 Jun, 2019 1 commit
  5. 08 May, 2019 1 commit
  6. 23 Apr, 2019 1 commit
  7. 21 Apr, 2019 1 commit
    • Refine model gpu memory (#16993) · 1202d3fc
      Committed by Zeng Jinle
      * speedup gc and inplace softmax_with_cross_entropy_grad
      test=develop
      
      * refine models gpu mem
      Merge skip vars and warning messages of mem opt
      remove relu mem opt
      test=develop
      
      * follow comments
      test=develop
  8. 18 Apr, 2019 1 commit
  9. 30 Mar, 2019 1 commit
  10. 28 Mar, 2019 2 commits
  11. 27 Mar, 2019 2 commits
  12. 22 Mar, 2019 1 commit
    • [Speed]Refine ParallelExecutor (#16190) · a6a3b2fb
      Committed by chengduo
      * refine parallelExecutor
      test=develop
      
      * Polish op_handle
      test=develop
      
      * Remove unnecessary op_handle
      test=develop
      
      * Fix Travis CI
      test=develop
      
      * Fix fetch bug
      test=develop
      
      * Remove WaitInputVarGenerated
      
      * Fix OpHandleBase::Run
      test=develop
      
      * debug
      test=develop
      
      * use origin fetch_op_handle
      test=develop
      
      * Revert op_handle_base.cc
      test=develop
      
      * Polish code
      test=develop
      
      * Fix OpHandleBase::Run
      test=develop
      
      * code refine
      
      * test CI and CE
      test=develop
      
      * fix OpHandle::Run
      test=develop
      
      * refine AllReduceOpHandle
      test=develop
      
      * Polish code
      test=develop
  13. 20 Mar, 2019 1 commit
    • Fuse AllReduce (#15921) · f26ba5bd
      Committed by chengduo
      * fuse all_reduce
      test=develop
      
      * add fuse_parameter_groups_size
      test=develop
      
      * Polish code
      test=develop
      
      * Fix travis-ci
      test=develop
      
      * Add SetGroupAccordingToLayers and SetGroupAccordingToGroupSize
      test=develop
      
      * Add SetGroupAccordingToMemorySize
      test=develop
      
      * fix multi_devices_graph
      test=develop
      
      * reset params_grads
      test=develop
      
      * Polish code
      test=develop
  14. 05 Mar, 2019 2 commits
  15. 18 Feb, 2019 1 commit
  16. 14 Feb, 2019 1 commit
  17. 13 Feb, 2019 1 commit
  18. 11 Feb, 2019 1 commit
  19. 31 Jan, 2019 2 commits
  20. 27 Jan, 2019 1 commit
  21. 21 Jan, 2019 2 commits
    • squash commits. test=develop · 8f3b2523
      Committed by dzhwinter
    • Memory optimization of depthwise conv op and group norm op (#15313) · 9f8f0fc2
      Committed by Dun
      * mem opt
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * refine code  test=develop
      
      * refine code  test=develop
      
      * refine code  test=develop
      
      * refine code  test=develop
      
      * refine with cub test=develop
      
      * fix mkldnn test && remove comments && test=develop
      
      * polish code && test=develop
      
      * add only_forward test && test=develop
  22. 17 Jan, 2019 1 commit
  23. 10 Jan, 2019 1 commit
  24. 07 Jan, 2019 1 commit
  25. 18 Dec, 2018 1 commit
    • add ir memory optimize. (#14530) · 7cd24b13
      Committed by dzhwinter
      * follow comments. test=develop
      
      * Fix typo
      
      * fix compile error. test=develop
      
      * merge develop branch. test=develop
      
      * Remove set_equal
      
      * Polish code
      
      * Delete unused functions
      
      test=develop
      
      * polish code. test=develop
      
      * follow comment
      
      * polish code.
      
      * fix windows compile error. test=develop
      
      * fix op handle.
      
      * rerun ci. test=develop
      
      * rerun ci. test=develop
      
      * rerun macci. test=develop
      
      * polish code. test=develop
      
      * rewrite sort code. test=develop
      
      * remove unused code. test=develop
      
      * fix tests. test=develop
      
      * fix conflict. test=develop
      
      * follow comment. test=develop
      
      * merge develop branch. test=develop
      
      * fix tests. test=develop
      
      * remove ToTypeIndex. test=develop
      
      * rerun ci. test=develop
  26. 14 Dec, 2018 1 commit
  27. 07 Dec, 2018 2 commits
  28. 06 Dec, 2018 1 commit
  29. 03 Dec, 2018 1 commit
  30. 29 Nov, 2018 1 commit
  31. 27 Nov, 2018 1 commit
  32. 06 Nov, 2018 1 commit
  33. 30 Oct, 2018 1 commit
  34. 29 Oct, 2018 1 commit
    • [1.1] [project] train imagenet using large batch size (#13766) · 26200f2e
      Committed by Wu Yi
      * fix nccl2 lars dist support
      
      * put lars in momentum op
      
      * add tests lars
      
      * fix ci
      
      * fix cpu kernel
      
      * soft warning
      
      * remove lars in test_recognize_digits.py
      
      * move to another op
      
      * add file
      
      * update api.spec test=develop
      
      * update test=develop
      
      * fix api.spec test=develop
      
      * wip
      
      * wip, finish grad merge ops
      
      * wip, finish graph build
      
      * wip test running
      
      * work on 1 gpu
      
      * workable version
      
      * update
      
      * fix tests
      
      * fuse broadcast op
      
      * fix compile failed
      
      * refine
      
      * add batch merge test mnist
      
      * fix CI test=develop
      
      * fix build
      
      * use independent bn params for batch merge test=develop
      
      * update api.spec
      
      * follow comments and for test
      
      * wip
      
      * refine tests test=develop
      
      * follow comments test=develop
      
      * remove startup bn modify test=develop
      
      * follow comments test=develop
      
      * fix merge test=develop