1. 11 Sep, 2019 1 commit
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Huihuang Zheng committed
      TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, removing it can improve memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix CI for the CPU build.
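The stream-callback idea described in this commit can be illustrated with a toy model. This is only a sketch in plain Python, not Paddle's C++ implementation: `ToyStream` and `ToyAllocation` are hypothetical names, and the "stream" is just a FIFO queue standing in for a CUDA stream. It shows why freeing through a callback is safe: the free runs only after the work queued before it has finished.

```python
from collections import deque


class ToyStream:
    """Hypothetical stand-in for a CUDA stream: a FIFO queue of tasks."""

    def __init__(self):
        self._tasks = deque()

    def enqueue(self, task):
        self._tasks.append(task)

    def add_callback(self, callback):
        # A callback runs only after every task enqueued before it.
        self._tasks.append(callback)

    def synchronize(self):
        # Run all pending tasks in submission order.
        while self._tasks:
            self._tasks.popleft()()


class ToyAllocation:
    """Hypothetical allocation that is released by a stream callback."""

    live_bytes = 0  # bytes currently held, tracked for illustration only

    def __init__(self, size, stream):
        self.size = size
        self.stream = stream
        self.freed = False
        ToyAllocation.live_bytes += size

    def release(self):
        # Defer the actual free until the stream reaches this point.
        self.stream.add_callback(self._free)

    def _free(self):
        if not self.freed:
            self.freed = True
            ToyAllocation.live_bytes -= self.size


stream = ToyStream()
log = []
workspace = ToyAllocation(1024, stream)
stream.enqueue(lambda: log.append("kernel uses workspace"))
workspace.release()  # queued behind the kernel, not executed yet
stream.enqueue(lambda: log.append("next kernel"))
bytes_before_sync = ToyAllocation.live_bytes
stream.synchronize()
bytes_after_sync = ToyAllocation.live_bytes
```

In this model the memory is still held right after `release()` and is returned once the stream has drained, without any process-wide allocator object.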
  2. 03 Sep, 2019 1 commit
  3. 08 Jul, 2019 1 commit
    • add mkldnn shapeblob cache clear strategy (#18513) · fe32879d
      Tao Luo committed
      * add mkldnn shapeblob cache clear strategy
      
      test=develop
      
      * refine with comments
      
      test=develop
      
      * make cache clear strategy safer
      
      test=develop
      
      * add lock for GetShapeBlobSize
      
      test=develop
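A cache keyed by input shape that is cleared under a lock might look like the sketch below. The commit message does not describe the actual clearing rule, so everything here is an assumption for illustration: the class name `ShapeBlobCache`, the `capacity` parameter, and the policy of dropping all cached shapes once the capacity is reached. Only the idea of guarding the size query with a lock comes from the commit message ("add lock for GetShapeBlobSize").

```python
import threading


class ShapeBlobCache:
    """Hypothetical two-level cache: shape key -> {blob name -> blob}."""

    def __init__(self, capacity):
        self._capacity = capacity  # max number of distinct shapes kept
        self._lock = threading.Lock()
        self._shapes = {}

    def set_blob(self, shape_key, name, blob):
        with self._lock:
            is_new_shape = shape_key not in self._shapes
            if is_new_shape and len(self._shapes) >= self._capacity:
                # Assumed policy: drop everything when the cache is full.
                self._shapes.clear()
            self._shapes.setdefault(shape_key, {})[name] = blob

    def get_blob(self, shape_key, name):
        with self._lock:
            return self._shapes.get(shape_key, {}).get(name)

    def get_shape_blob_size(self):
        # Reading the size takes the same lock as the writers.
        with self._lock:
            return len(self._shapes)


cache = ShapeBlobCache(capacity=2)
cache.set_blob("1x3x224x224", "conv_src", "blob-a")
cache.set_blob("1x3x320x320", "conv_src", "blob-b")
size_when_full = cache.get_shape_blob_size()
cache.set_blob("1x3x512x512", "conv_src", "blob-c")  # triggers the clear
size_after_clear = cache.get_shape_blob_size()
```

Bounding the number of cached shapes keeps memory from growing without limit when inputs have many different shapes, at the cost of rebuilding entries after a clear.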
  4. 03 Jul, 2019 1 commit
  5. 02 Jul, 2019 1 commit
  6. 27 Jun, 2019 1 commit
  7. 28 Apr, 2019 1 commit
  8. 21 Apr, 2019 1 commit
    • Refine model gpu memory (#16993) · 1202d3fc
      Zeng Jinle committed
      * speedup gc and inplace softmax_with_cross_entropy_grad
      test=develop
      
      * refine models gpu mem
      Merge skip vars and warning messages of mem opt
      remove relu mem opt
      test=develop
      
      * follow comments
      test=develop
  9. 25 Mar, 2019 1 commit
  10. 21 Mar, 2019 1 commit
  11. 20 Mar, 2019 2 commits
    • N
      07dcf285
    • Collective ops (#15572) · 6382b62f
      Wu Yi committed
      * wip allreduce in op
      
      * wip
      
      * wip
      
      * wip
      
      * wip adding test
      
      * wip for conflict with mp mode
      
      * fix tests test=develop
      
      * fix cpu build test=develop
      
      * fix travis clang format test=develop
      
      * fix cpu build test=develop
      
      * update api.spec test=develop
      
      * delete comment test=develop
      
      * fix cpplint test=develop
      
      * fix test=develop
      
      * follow comment test=develop
      
      * add file test=develop
      
      * fix build test=develop
      
      * update test=develop
      
      * to be compatible with sync_bn, and fix mp mode in develop test=develop
  12. 19 Mar, 2019 1 commit
  13. 16 Mar, 2019 1 commit
  14. 15 Mar, 2019 1 commit
    • Support sync batch norm. (#16121) · 8ad672a2
      qingqing01 committed
      * Support Sync Batch Norm.
      * Note: do not enable it when running on a single device.
      
      Usage:
      
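      import paddle.fluid as fluid

      # tp is the training Program and loss_mean the loss Variable,
      # both defined elsewhere.
      build_strategy = fluid.BuildStrategy()
      build_strategy.sync_batch_norm = True
      binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
          loss_name=loss_mean.name,
          build_strategy=build_strategy)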
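What "sync" changes can be shown with a small numeric sketch. Synchronized batch normalization computes the mean and variance over the whole batch across devices instead of over each device's shard. The code below is plain Python for illustration only: the function names are hypothetical, and combining per-device counts, sums, and sums of squares is one common way to do the reduction, assumed here rather than taken from Paddle's kernels.

```python
def mean_and_var(values):
    """Per-device statistics, as in ordinary batch norm."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def synced_mean_and_var(shards):
    """Global statistics from per-device partial sums (all-reduce style)."""
    count = sum(len(s) for s in shards)
    total = sum(sum(s) for s in shards)
    total_sq = sum(sum(v * v for v in s) for s in shards)
    mean = total / count
    var = total_sq / count - mean ** 2  # E[x^2] - (E[x])^2
    return mean, var


# Two devices, each holding half of the batch for one channel.
shards = [[1.0, 2.0], [3.0, 6.0]]
per_device = [mean_and_var(s) for s in shards]
synced = synced_mean_and_var(shards)
full_batch = mean_and_var(shards[0] + shards[1])
```

The per-device statistics differ from each other, while the synchronized result matches what a single device would compute on the full batch. With only one device there is nothing to combine, which is consistent with the note about not enabling the option on a single device.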
  15. 14 Jan, 2019 1 commit
  16. 11 Jan, 2019 2 commits
    • Revert "Remove workspace_handle in conv_cudnn (#15186)" · 358e657f
      chengduozh committed
      test=develop
      This reverts commit 064512aa.
    • Remove workspace_handle in conv_cudnn (#15186) · 064512aa
      chengduo committed
      * remove workspace_handle in conv2d_cudnn
      test=develop
      
      * remove workspace_handle
      test=develop
      
      * fix bug
      test=develop
      
      * make test_conv2d_op SERIAL
      test=develop
      
      * save memory in conv_cudnn
      test=develop
      
      * enhance thread safety
      test=develop
      
      * enhance temporary allocator
      test=develop
      
      * Add excess fraction
      test=develop
      
      * follow comments
      test=develop
      
      * fix bug and code refine
      test=develop
      
      * fix memory size check
      test=develop
      
      * rename reuse_tmp_allocation_excess_fraction
      test=develop
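The bullet "rename reuse_tmp_allocation_excess_fraction" suggests a threshold that decides when a cached temporary block may be reused for a smaller request. The commit message does not define the semantics, so the sketch below is an assumption: a cached block is reused only if its size is at most `size * (1 + excess_fraction)`. The class `ToyTemporaryAllocator` and its methods are hypothetical and only track block sizes.

```python
class ToyTemporaryAllocator:
    """Hypothetical allocator that reuses idle blocks within a size bound."""

    def __init__(self, excess_fraction):
        self._excess_fraction = excess_fraction
        self._free_blocks = []  # sizes of idle cached blocks
        self.fresh_allocations = 0

    def allocate(self, size):
        # Assumed rule: accept a cached block only if it wastes at most
        # excess_fraction of the requested size.
        limit = size * (1 + self._excess_fraction)
        candidates = [b for b in self._free_blocks if size <= b <= limit]
        if candidates:
            block = min(candidates)  # tightest fit
            self._free_blocks.remove(block)
            return block
        self.fresh_allocations += 1
        return size

    def release(self, block):
        self._free_blocks.append(block)


allocator = ToyTemporaryAllocator(excess_fraction=0.25)
first = allocator.allocate(1000)   # nothing cached, fresh allocation
allocator.release(first)
second = allocator.allocate(900)   # 1000 <= 900 * 1.25, block is reused
allocator.release(second)
third = allocator.allocate(100)    # 1000 > 100 * 1.25, fresh allocation
```

A bound of this kind trades reuse against waste: a larger fraction reuses more often but may hand a large block to a small request.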
  17. 08 Jan, 2019 2 commits
  18. 02 Jan, 2019 1 commit
  19. 29 Dec, 2018 1 commit
  20. 25 Dec, 2018 1 commit
  21. 21 Dec, 2018 1 commit
    • [Feature] Add Temporary Allocator (#14875) · 79bd6dfa
      chengduo committed
      * Add Temporary Allocator
      
      * add Temporary Allocator to DeviceContext
      test=develop
      
      * code refine
      test=develop
      
      * fix mean_iou
      test=develop
      
      * Add DeviceTemporaryAllocator
      test=develop
      
      * fix conv_op bug
      test=develop
      
      * small fix
      test=develop
      
      * code refine
      test=develop
      
      * log refine
      test=develop
      
      * fix unit test
      test=develop
      
      * move double check
      
      * refine concat_and_split
      test=develop
      
      * add limit_of_temporary_allocation
      test=develop
      
      * fix name
      test=develop
  22. 11 Dec, 2018 1 commit
  23. 03 Dec, 2018 1 commit
  24. 22 Nov, 2018 1 commit
    • Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929) · 00b9e9a1
      chengduo committed
      * refine cublas
      test=develop
      
      * code refine
      
      * refine cublas
      
      * add GEMM_EX
      
      * add enable_cublas_tensor_op_math doc and add cublasCall
      test=develop
      
      * fix CublasCall for cuda version
      test=develop
      
      * fix error
      test=develop
      
      * fix GEMM_EX to be compatible with gcc 4.8
      test=develop
      
      * add GEMM_EX
      test=develop
      
      * to be compatible with gcc 4.8
      test=develop
  25. 15 Nov, 2018 1 commit
  26. 08 Nov, 2018 2 commits
  27. 07 Nov, 2018 1 commit
  28. 06 Nov, 2018 1 commit
  29. 31 Oct, 2018 1 commit
  30. 30 Oct, 2018 1 commit
  31. 26 Oct, 2018 1 commit
  32. 25 Oct, 2018 1 commit
  33. 24 Oct, 2018 1 commit
  34. 21 Oct, 2018 1 commit
  35. 15 Oct, 2018 1 commit
  36. 27 Sep, 2018 1 commit