1. 09 Feb, 2022 1 commit
  2. 25 Jan, 2022 1 commit
  3. 28 Dec, 2021 1 commit
    • Utilize StreamSafeCUDAAllocator to support fast GC in new executor (#37642) · 0c7153a4
      Committed by From00
      * fix reshape move storage error
      
      * remove needless set type
      
      * alloc tensor by shared storage
      
      * Utilize StreamSafeCUDAAllocator to support fast GC in new executor
      
      * Fix compile error for Windows and ROCm
      
      * Fix compile error for Windows
      
      * Modify UT stream_safe_cuda_alloc_test
      
      * Modify UT stream_safe_cuda_alloc_test
      
      * Rewrite fast GC
      
      * Rewrite fast GC
      
      * Fix compile error for BOOST_GET_CONST
      
      * Fix compile error for BOOST_GET_CONST
      
      * Changes default stream for StreamSafeCUDAAllocator
      
      * Fix a small CI error
      
      * Remove some redundant code
      
      * Fix conflict
      
      * Fix compile error for ROCm
      
      * Fix Windows CI error
      
      * Fix CI error
      
      * Remove some unnecessary code
      
      * Fix CI error
      
      * Add UT for fast GC
      
      * Fix CI error
      
      * add device-agnostic stream class
      
      * add stream.h
      
      * fix ut
      
      * fix cpu compile
      
      * Use RWLock in GetAllocator
      
      * Fix CI error
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
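The "Use RWLock in GetAllocator" item in the commit above suggests a read-mostly lookup guarded by a reader-writer lock. The following is a minimal C++17 sketch of that general pattern, not the code from the commit. `AllocatorRegistry`, `Allocator`, and the integer stream-id key are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for an allocator object (not Paddle's real type).
struct Allocator {
  explicit Allocator(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Hypothetical registry that hands out one allocator per stream id.
class AllocatorRegistry {
 public:
  std::shared_ptr<Allocator> GetAllocator(int stream_id) {
    {
      // Fast path: any number of threads may hold the shared (read) lock.
      std::shared_lock<std::shared_mutex> read_guard(mutex_);
      auto it = allocators_.find(stream_id);
      if (it != allocators_.end()) return it->second;
    }
    // Slow path: take the exclusive (write) lock and look again, because
    // another thread may have created the entry after the read lock was dropped.
    std::unique_lock<std::shared_mutex> write_guard(mutex_);
    auto it = allocators_.find(stream_id);
    if (it == allocators_.end()) {
      auto created =
          std::make_shared<Allocator>("stream-" + std::to_string(stream_id));
      it = allocators_.emplace(stream_id, std::move(created)).first;
    }
    return it->second;
  }

  std::size_t size() const {
    std::shared_lock<std::shared_mutex> read_guard(mutex_);
    return allocators_.size();
  }

 private:
  mutable std::shared_mutex mutex_;
  std::map<int, std::shared_ptr<Allocator>> allocators_;
};
```

Lookups vastly outnumber insertions in a registry like this, so the shared lock keeps the common path from serializing callers, while the second lookup under the exclusive lock prevents two threads from creating the same entry.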
  4. 27 Dec, 2021 1 commit
  5. 17 Dec, 2021 1 commit
  6. 25 Nov, 2021 1 commit
    • Support multi-stream allocation for CUDA place (#37290) · b9c464c3
      Committed by From00
      * Support multi-stream allocation for CUDA place
      
      * Do not notify the retrying from other streams when free CUDA allocation
      
      * Fix compile error for CPU
      
      * Fix compile error for HIP
      
      * Release memory for StreamSafeCUDAAllocaRetry in malloc_test
      
      * Add FLAGS_use_stream_safe_cuda_allocator
      
      * Fix CI error for 'set_tests_properties'
      
      * Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy
      
      * Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock
      
      * FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator
      
      * Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator
      
      * Add UT for alloc interface
      
      * Changes multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
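Two items in the commit above describe how a stream-safe free behaves: an allocation with no recorded streams is deleted directly, and otherwise it is parked with its events until the recording streams have finished. The sketch below illustrates that idea in plain C++ under loud assumptions. `MockEvent` stands in for a GPU event (real code would query something like `cudaEventQuery`), and `StreamSafeAllocatorSketch`, `Pending`, and `ReleaseFinished` are hypothetical names, not the classes from the commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Mock GPU event: `done` becomes true once the stream has passed the event.
struct MockEvent {
  int stream;
  bool done;
};

struct Allocation {
  explicit Allocation(std::size_t n) : size(n) {}
  std::size_t size;
  std::set<int> recorded_streams;  // other streams that used this block
};

class StreamSafeAllocatorSketch {
 public:
  std::unique_ptr<Allocation> Alloc(std::size_t size) {
    ReleaseFinished();  // opportunistically reclaim blocks that are now safe
    ++live_blocks_;
    return std::make_unique<Allocation>(size);
  }

  // Returns one event per recorded stream so a caller can mark them done.
  std::vector<std::shared_ptr<MockEvent>> Free(std::unique_ptr<Allocation> a) {
    std::vector<std::shared_ptr<MockEvent>> events;
    if (a->recorded_streams.empty()) {
      --live_blocks_;  // fast path: nobody else used it, release right away
      return events;
    }
    for (int stream : a->recorded_streams) {
      events.push_back(std::make_shared<MockEvent>(MockEvent{stream, false}));
    }
    // Only freed blocks are tracked; nothing is registered at Alloc time.
    outstanding_.push_back(Pending{std::move(a), events});
    return events;
  }

  // Releases every parked block whose events have all completed.
  void ReleaseFinished() {
    for (auto it = outstanding_.begin(); it != outstanding_.end();) {
      bool all_done = std::all_of(
          it->events.begin(), it->events.end(),
          [](const std::shared_ptr<MockEvent>& e) { return e->done; });
      if (all_done) {
        --live_blocks_;
        it = outstanding_.erase(it);
      } else {
        ++it;
      }
    }
  }

  std::size_t live_blocks() const { return live_blocks_; }
  std::size_t outstanding() const { return outstanding_.size(); }

 private:
  struct Pending {
    std::unique_ptr<Allocation> allocation;
    std::vector<std::shared_ptr<MockEvent>> events;
  };
  std::list<Pending> outstanding_;
  std::size_t live_blocks_ = 0;
};
```

Tracking a block only when it is freed keeps the allocation path free of bookkeeping, which matches the stated intent of the performance items, though the real data structures may differ.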
  7. 04 Feb, 2021 1 commit
  8. 06 Nov, 2020 1 commit
  9. 04 Nov, 2020 1 commit
  10. 24 Sep, 2019 1 commit
  11. 11 Sep, 2019 1 commit
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Committed by Huihuang Zheng
      TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, we can remove it to improve memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix the CI failure in CPU compilation.
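The commit message above says the memory is freed through a stream callback. A minimal sketch of that technique follows, with the GPU stream mocked so it runs with the standard library alone. `MockStream` and `StreamOwnedAllocation` are hypothetical names; with a real CUDA stream the callback would be enqueued through a call such as `cudaStreamAddCallback` or `cudaLaunchHostFunc`, and the memory would come from a device allocator instead of `std::malloc`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <queue>
#include <utility>

// Mock stream: host callbacks run in order when the stream is drained.
class MockStream {
 public:
  void AddCallback(std::function<void()> cb) { callbacks_.push(std::move(cb)); }

  void Synchronize() {
    while (!callbacks_.empty()) {
      callbacks_.front()();
      callbacks_.pop();
    }
  }

 private:
  std::queue<std::function<void()>> callbacks_;
};

// Hypothetical allocation tied to one stream. Destroying the object does not
// free the memory immediately; it enqueues a callback that frees it once the
// stream reaches that point, so work queued earlier can still use the buffer.
class StreamOwnedAllocation {
 public:
  StreamOwnedAllocation(MockStream* stream, std::size_t size, int* freed_count)
      : stream_(stream), ptr_(std::malloc(size)), freed_count_(freed_count) {}

  StreamOwnedAllocation(const StreamOwnedAllocation&) = delete;
  StreamOwnedAllocation& operator=(const StreamOwnedAllocation&) = delete;

  ~StreamOwnedAllocation() {
    void* ptr = ptr_;
    int* counter = freed_count_;  // only used to observe the free in a test
    stream_->AddCallback([ptr, counter] {
      std::free(ptr);
      ++*counter;
    });
  }

  void* ptr() const { return ptr_; }

 private:
  MockStream* stream_;
  void* ptr_;
  int* freed_count_;
};
```

Because the release is ordered by the stream itself, no global object has to track temporary buffers, which is the property the commit message attributes to the replacement.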
  12. 10 Jun, 2019 1 commit
  13. 16 Nov, 2018 1 commit
  14. 14 Nov, 2018 2 commits
  15. 09 Nov, 2018 2 commits
  16. 08 Nov, 2018 1 commit
  17. 19 Oct, 2018 1 commit
  18. 10 Oct, 2018 1 commit
  19. 28 Sep, 2018 2 commits
  20. 08 Aug, 2018 1 commit
  21. 09 Jul, 2018 1 commit
  22. 29 Jun, 2018 1 commit
    • Init allocated memory for unit test (#11657) · d2ad4a5c
      Committed by chengduo
      * memory init
      
      * add env
      
      * refine announce
      
      * Add check for NaN
      
      * Debug
      
      * Add env for cc_test
      
      * Add env for py_test and nv_test
      
      * Remove py_test env
      
      * Add env for py_test
      
      * serial test_recognize_digits
      
      * Test FLAGS_init_allocated_mem function for unit test
      
      * Init allocated mem for op unit test
      
      * Add env for all unit test
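The commit above initializes allocated memory during unit tests, controlled by `FLAGS_init_allocated_mem`. The general debugging technique can be sketched as below: fill every fresh block with a known byte pattern so that reads of uninitialized memory produce reproducible values. This is an illustration only. `g_init_allocated_mem`, `kFillByte`, and `DebugAlloc` are hypothetical, and the fill value is an assumption, not the value used by the commit.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical switch standing in for a flag such as FLAGS_init_allocated_mem.
static bool g_init_allocated_mem = false;

// Assumed fill byte; any fixed, recognizable pattern serves the same purpose.
constexpr unsigned char kFillByte = 0xEF;

// Allocates `size` bytes and, when the switch is on, overwrites them with the
// fill pattern so tests do not depend on whatever the heap happened to hold.
void* DebugAlloc(std::size_t size) {
  void* p = std::malloc(size);
  if (p != nullptr && g_init_allocated_mem) {
    std::memset(p, kFillByte, size);
  }
  return p;
}
```

Keeping the fill behind a switch means production runs skip the extra write, while test runs that enable it see the same contents in every new block.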
  23. 08 Apr, 2018 3 commits
  24. 02 Apr, 2018 1 commit
  25. 28 Mar, 2018 1 commit
  26. 27 Mar, 2018 1 commit
  27. 26 Mar, 2018 3 commits
  28. 20 Mar, 2018 4 commits
  29. 12 Feb, 2018 1 commit
  30. 10 Feb, 2018 1 commit