1. 14 Jul, 2022 1 commit
    • refine allocation cmake (#44241) · dc5a0420
      Committed by Leo Chen
      * build into one static library
      
      * move memory/detail to memory/allocation
      
      * fix bug
      
      * fix profiler
      
      * fix framework_proto
      
      * fix deps
      
      * fix inference compilation
      
      * fix rocm compile
      
      * follow comments
      
      * fix buddy_allocator_test
  2. 14 Jun, 2022 1 commit
  3. 04 Jun, 2022 1 commit
  4. 27 May, 2022 1 commit
  5. 30 Mar, 2022 1 commit
    • Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657) · afe02e9d
      Committed by From00
      * Add new API memory_reserved
      
      * Add memory_allocated, max_memory_reserved and max_memory_allocated
      
      * Fix CI error
      
      * Fix CI error
      
      * Enhance UT
      
      * Add FLAGS_memory_stats_opt
      
      * Add STATS macro functions
      
      * Add StatAllocator
      
      * Fix CI errors
      
      * Add UT
      
      * Fix CI errors
  6. 08 Feb, 2022 1 commit
    • Support allocate CUDA managed memory (#39075) · 42910361
      Committed by From00
      * Rough implementation for experiment
      
      * Support allocate cuda managed memory
      
      * Fix CI error
      
      * Modify UT
      
      * Check whether memory oversubscription is supported
      
      * Fix ROCM Compile error
      
      * Fix ROCM Compile error
      
      * Fix UT cuda_managed_memory_test
      
      * Set UT timeout to 40
      
      * Add UT OOMExceptionTest
      
      * Set UT timeout to 50
  7. 25 Jan, 2022 1 commit
  8. 08 Dec, 2021 1 commit
    • Fix CUDAGraphAllocator bug for StreamSafeCUDAAllocator (#37821) · b4a67491
      Committed by From00
      * Fix CUDAGraph bug for StreamSafeCUDAAllocator
      
      * Add CUDAGraphAllocator check in multi-stream interface
      
      * Set FLAGS_use_stream_safe_cuda_allocator to false by default
      
      * Fix environment error for cmake
      
      * Fix cmake error
      
      * Add UT of GetAllocatorInterfaceTest
      
      * Add UT of CUDAGraphExceptionTest
      
      * Enhance CUDAGraphExceptionTest
  9. 25 Nov, 2021 1 commit
    • Support multi-stream allocation for CUDA place (#37290) · b9c464c3
      Committed by From00
      * Support multi-stream allocation for CUDA place
      
      * Do not notify retries from other streams when freeing a CUDA allocation
      
      * Fix compile error for CPU
      
      * Fix compile error for HIP
      
      * Release memory for StreamSafeCUDAAllocaRetry in malloc_test
      
      * Add FLAGS_use_stream_safe_cuda_allocator
      
      * Fix CI error for 'set_tests_properties'
      
      * Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy
      
      * Performance improvement: insert the allocation pair into outstanding_events_map on free rather than on alloc; replace recursive_mutex with SpinLock
      
      * FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator
      
      * Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator
      
      * Add UT for alloc interface
      
      * Change the multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
  10. 01 Feb, 2021 1 commit
  11. 14 Jan, 2021 1 commit
  12. 14 Jan, 2020 1 commit
  13. 24 Sep, 2019 1 commit
  14. 11 Sep, 2019 1 commit
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Committed by Huihuang Zheng
      TemporaryAllocator is a singleton used for allocating memory for cuDNN. Because it is a singleton, removing it can improve memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, avoiding the singleton.
      
      Also added data_feed_proto to operator to fix CPU compilation in CI.
  15. 11 Mar, 2019 1 commit
  16. 04 Mar, 2019 3 commits
  17. 01 Mar, 2019 1 commit
  18. 02 Feb, 2019 2 commits
  19. 16 Nov, 2018 1 commit
  20. 10 Oct, 2018 1 commit
  21. 28 Sep, 2018 1 commit
  22. 08 Apr, 2018 3 commits
  23. 03 Apr, 2018 2 commits
  24. 30 Mar, 2018 1 commit
  25. 10 Feb, 2018 1 commit
  26. 06 Feb, 2018 1 commit
  27. 30 Jan, 2018 1 commit
  28. 23 Jan, 2018 2 commits
  29. 16 Jan, 2018 1 commit
  30. 24 Nov, 2017 1 commit
  31. 28 Oct, 2017 1 commit
  32. 15 Aug, 2017 1 commit
  33. 04 Aug, 2017 1 commit
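
Commit afe02e9d (#38657) in the list above adds a StatAllocator and counters named memory_allocated and max_memory_allocated. The log does not show how they are implemented, so the following is only a toy sketch of the general idea (a wrapper that tracks current and peak bytes). The class `ToyStatAllocator`, its integer handles, and its methods are hypothetical and are not the project's code.

```python
class ToyStatAllocator:
    """Hypothetical illustration: counts bytes currently allocated and the peak."""

    def __init__(self):
        self._current = 0   # bytes currently handed out
        self._peak = 0      # highest value _current has reached
        self._sizes = {}    # handle -> size of that allocation
        self._next_handle = 0

    def allocate(self, size):
        """Record an allocation of `size` bytes and return an opaque handle."""
        handle = self._next_handle
        self._next_handle += 1
        self._sizes[handle] = size
        self._current += size
        self._peak = max(self._peak, self._current)
        return handle

    def free(self, handle):
        """Release an allocation; the peak is intentionally left unchanged."""
        self._current -= self._sizes.pop(handle)

    def memory_allocated(self):
        return self._current

    def max_memory_allocated(self):
        return self._peak


alloc = ToyStatAllocator()
a = alloc.allocate(1024)
b = alloc.allocate(2048)
alloc.free(a)
print(alloc.memory_allocated(), alloc.max_memory_allocated())
```

After the free, the current counter drops while the peak keeps the high-water mark, which is the distinction the two counter names suggest.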