1. 23 Apr, 2020 1 commit
    • [cherry-pick] Thread-local Allocator, test=release/2.0 (#24061) · 597cc058
      石晓伟 committed
      * cherry-pick of DeviceContext Split, test=develop (#23737)
      
      * New feature: thread local allocator, test=develop (#23989)
      
      * add the thread_local_allocator, test=develop
      
      * refactor the thread_local_allocator, test=develop
      
      * provides option setting strategy, test=develop
      
      * add boost dependency to cuda_stream, test=develop
      
      * declare the stream::Priority as enum class, test=develop
      
      * deal with PADDLE_ENFORCE_CUDA_SUCCESS macro in pr #23816
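As a rough illustration of the thread-local allocator idea named in this commit, the sketch below gives each thread its own allocator instance through Python's `threading.local`, so threads never share one pool. `ToyAllocator`, `get_allocator`, and `run_demo` are hypothetical names chosen for illustration; this is not Paddle's C++ implementation.

```python
import threading


class ToyAllocator:
    """Hypothetical per-thread allocator that only tracks bytes handed out."""

    def __init__(self):
        self.allocated = 0

    def allocate(self, size):
        self.allocated += size
        return bytearray(size)


# Each thread sees its own attributes on this object.
_tls = threading.local()


def get_allocator():
    # Lazily create one allocator per calling thread.
    if not hasattr(_tls, "allocator"):
        _tls.allocator = ToyAllocator()
    return _tls.allocator


def worker(sizes, results, index):
    alloc = get_allocator()
    for size in sizes:
        alloc.allocate(size)
    # Keep a reference to the allocator so the caller can compare instances.
    results[index] = (alloc, alloc.allocated)


def run_demo():
    results = [None, None]
    threads = [
        threading.Thread(target=worker, args=([16, 32], results, 0)),
        threading.Thread(target=worker, args=([64], results, 1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_demo()` returns one `(allocator, bytes)` pair per thread; the two allocators are distinct objects and each counts only its own thread's requests.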
  2. 21 Apr, 2020 1 commit
  3. 01 Apr, 2020 1 commit
  4. 31 Mar, 2020 1 commit
  5. 30 Mar, 2020 1 commit
  6. 05 Feb, 2020 1 commit
  7. 08 Jan, 2020 1 commit
  8. 10 Dec, 2019 1 commit
    • MKL-DNN 1.0 Update (#20162) · e81f0228
      Adam committed
      * MKLDNN v1.0 rebase to Paddle 1.6
      test=develop
      
      * Add hacky paddle::string::to_string() implementation
      
      * vectorize<int64_t>() -> vectorize() cleanup
      test=develop
      
      * PADDLE_ENFORCE and void_cast fixes
      test=develop
      
      * Rebase changes
      test=develop
      
      * Cosmetics
      test=develop
      
      * Delete MKL from mkldnn.cmake
      test=develop
      
      * CMake debug commands
      test=develop
      
      * Delete MKLDNN_VERBOSE and rebase fixes
      test=develop
      
      * Rebase fixes
      test=develop
      
      * Temporarily disable int8 resnet101 vgg16 and vgg19 tests
      test=develop
      
      * Add libmkldnn.so.1 to python setup
      test=develop
      
      * Add libmkldnn.so.1 to inference_lib cmake after rebase
      test=develop
      
      * Post rebase fixes + FC int8 changes
      test=develop
      
      * Fix LRN NHWC
      test=develop
      
      * Fix NHWC conv3d
      test=develop
      
      * Windows build fix + next conv3d fix
      test=develop
      
      * Fix conv2d on AVX2 machines
      test=develop
  9. 06 Dec, 2019 1 commit
  10. 29 Nov, 2019 1 commit
  11. 18 Nov, 2019 1 commit
  12. 14 Nov, 2019 1 commit
  13. 24 Sep, 2019 1 commit
  14. 22 Sep, 2019 1 commit
  15. 11 Sep, 2019 1 commit
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Huihuang Zheng committed
      TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, removing it should improve memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix CI in CPU compilation
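The stream-callback idea in this commit message can be pictured with a toy model: memory requested for a stream is released by a callback queued on that same stream, not by a global singleton. `ToyStream` and `ToyStreamAllocator` are hypothetical stand-ins written for this sketch, not Paddle or CUDA APIs; a real implementation would register the callback with the CUDA runtime.

```python
from collections import deque


class ToyStream:
    """Hypothetical stand-in for a CUDA stream: a FIFO queue of work items."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, fn):
        self._queue.append(fn)

    def synchronize(self):
        # Run queued work in submission order.
        while self._queue:
            self._queue.popleft()()


class ToyStreamAllocator:
    """Owns buffers for one stream and frees them via a stream callback."""

    def __init__(self, stream):
        self.stream = stream
        self.live = {}  # buffer id -> buffer
        self._next_id = 0

    def allocate(self, size):
        buf_id = self._next_id
        self._next_id += 1
        self.live[buf_id] = bytearray(size)
        return buf_id

    def release_after_pending_work(self, buf_id):
        # The free is queued behind work already on the stream, so the
        # buffer outlives everything that was enqueued before this call.
        self.stream.enqueue(lambda: self.live.pop(buf_id, None))


def run_demo():
    stream = ToyStream()
    alloc = ToyStreamAllocator(stream)
    log = []
    buf = alloc.allocate(1024)
    stream.enqueue(lambda: log.append(("kernel sees buffer", buf in alloc.live)))
    alloc.release_after_pending_work(buf)
    live_before = len(alloc.live)
    stream.synchronize()
    return log, live_before, len(alloc.live)
```

In `run_demo()`, the simulated kernel still finds its buffer alive because the free callback sits behind it in the queue; once the stream is synchronized, no buffers remain.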
  16. 12 Aug, 2019 1 commit
  17. 11 Jul, 2019 1 commit
  18. 08 Jul, 2019 1 commit
    • add mkldnn shapeblob cache clear strategy (#18513) · fe32879d
      Tao Luo committed
      * add mkldnn shapeblob cache clear strategy
      
      test=develop
      
      * refine with comments
      
      test=develop
      
      * make cache clear strategy safer
      
      test=develop
      
      * add lock for GetShapeBlobSize
      
      test=develop
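The commit message does not say what the clear strategy is, so the following is only a sketch under stated assumptions: a shape-keyed cache that is emptied once an assumed capacity is reached, with the size read under the same lock that guards writes (in the spirit of the "add lock for GetShapeBlobSize" item). `ToyShapeCache`, its `capacity` parameter, and the clear-everything policy are hypothetical and do not describe Paddle's MKL-DNN cache.

```python
import threading


class ToyShapeCache:
    """Hypothetical shape-keyed cache; NOT Paddle's MKL-DNN cache."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed limit on distinct cached shapes
        self._blobs = {}
        self._lock = threading.Lock()

    def size(self):
        # Read the size under the same lock that guards writes.
        with self._lock:
            return len(self._blobs)

    def set(self, shape, blob):
        with self._lock:
            # Assumed policy: drop everything once the limit is reached.
            if shape not in self._blobs and len(self._blobs) >= self.capacity:
                self._blobs.clear()
            self._blobs[shape] = blob

    def get(self, shape):
        with self._lock:
            return self._blobs.get(shape)
```

With a capacity of 2, storing a third distinct shape empties the cache first, so only the newest entry remains afterwards.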
  19. 03 Jul, 2019 1 commit
  20. 02 Jul, 2019 1 commit
  21. 27 Jun, 2019 1 commit
  22. 18 Jun, 2019 1 commit
  23. 10 Jun, 2019 1 commit
  24. 07 Jun, 2019 1 commit
  25. 28 Mar, 2019 1 commit
  26. 25 Mar, 2019 1 commit
  27. 20 Mar, 2019 1 commit
  28. 19 Mar, 2019 1 commit
  29. 16 Mar, 2019 1 commit
  30. 15 Mar, 2019 1 commit
    • Support sync batch norm. (#16121) · 8ad672a2
      qingqing01 committed
      * Support Sync Batch Norm.
      * Note: do not enable it when running on a single device.
      
      Usage:
      
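      import paddle.fluid as fluid

      # `tp` (the training program) and `loss_mean` (the loss variable) are
      # assumed to be defined by the surrounding training script.
      build_strategy = fluid.BuildStrategy()
      build_strategy.sync_batch_norm = True  # enable synchronized batch norm
      binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
              loss_name=loss_mean.name,
              build_strategy=build_strategy)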
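One plausible reading of the single-device note: synchronized batch norm differs from ordinary batch norm only in that statistics are combined across devices. The pure-Python sketch below shows that combination under this assumption, with each device contributing its count, sum, and sum of squares. The helper names are hypothetical and this is not Paddle's kernel.

```python
def local_stats(values):
    """Per-device partial sums: (count, sum, sum of squares)."""
    return len(values), sum(values), sum(v * v for v in values)


def synced_mean_var(per_device_batches):
    """Combine partial sums from every device into one mean and biased variance."""
    count = total = total_sq = 0.0
    for batch in per_device_batches:
        n, s, sq = local_stats(batch)
        count += n
        total += s
        total_sq += sq
    mean = total / count
    var = total_sq / count - mean * mean
    return mean, var
```

For example, `synced_mean_var([[1.0, 3.0], [5.0, 7.0]])` combines two devices into a mean of 4.0 and a variance of 5.0, whereas a single device holding `[1.0, 3.0]` gets exactly its own local statistics (2.0 and 1.0), so synchronizing would change nothing.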
  31. 22 Feb, 2019 1 commit
  32. 19 Feb, 2019 1 commit
  33. 16 Jan, 2019 1 commit
  34. 11 Jan, 2019 3 commits
    • fix thread safety bug · c4eced98
      chengduozh committed
      test=develop
    • Revert "Remove workspace_handle in conv_cudnn (#15186)" · 358e657f
      chengduozh committed
      test=develop
      This reverts commit 064512aa.
    • Remove workspace_handle in conv_cudnn (#15186) · 064512aa
      chengduo committed
      * remove workspace_handle in conv2d_cudnn
      test=develop
      
      * remove workspace_handle
      test=develop
      
      * fix bug
      test=develop
      
      * make test_conv2d_op SERIAL
      test=develop
      
      * save memory in conv_cudnn
      test=develop
      
      * enhance thread safety
      test=develop
      
      * enhance temporary allocator
      test=develop
      
      * Add excess fraction
      test=develop
      
      * follow comments
      test=develop
      
      * fix bug and code refine
      test=develop
      
      * fix memory size check
      test=develop
      
      * rename reuse_tmp_allocation_excess_fraction
      test=develop
  35. 08 Jan, 2019 2 commits
  36. 07 Jan, 2019 1 commit
  37. 02 Jan, 2019 1 commit