- 08 Feb, 2022 1 commit
Committed by From00
* Rough implementation for experiment
* Support allocating CUDA managed memory
* Fix CI error
* Modify UT
* Check whether memory oversubscription is supported
* Fix ROCM compile error
* Fix ROCM compile error
* Fix UT cuda_managed_memory_test
* Set UT timeout to 40
* Add UT OOMExceptionTest
* Set UT timeout to 50
-
- 25 Jan, 2022 1 commit
Committed by From00
-
- 08 Dec, 2021 1 commit
Committed by From00
* Fix CUDAGraph bug for StreamSafeCUDAAllocator
* Add CUDAGrapthAllocator check in multi-stream interface
* Set FLAGS_use_stream_safe_cuda_allocator to default to false
* Fix environment error for cmake
* Fix cmake error
* Add UT of GetAllocatorInterfaceTest
* Add UT of CUDAGraphExceptionTest
* Enhance CUDAGraphExceptionTest
-
- 25 Nov, 2021 1 commit
Committed by From00
* Support multi-stream allocation for CUDA place
* Do not notify retrying from other streams when freeing a CUDA allocation
* Fix compile error for CPU
* Fix compile error for HIP
* Release memory for StreamSafeCUDAAllocaRetry in malloc_test
* Add FLAGS_use_stream_safe_cuda_allocator
* Fix CI error for 'set_tests_properties'
* Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy
* Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock
* FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator
* Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator
* Add UT for alloc interface
* Change multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
-
- 01 Feb, 2021 1 commit
Committed by Qi Li
-
- 14 Jan, 2021 1 commit
Committed by QingshuChen
-
- 14 Jan, 2020 1 commit
Committed by zhouwei25
Faster build by reducing by-products and linked libraries; fix compile warnings of std=c++11 (#22164)
-
- 24 Sep, 2019 1 commit
Committed by Zeng Jinle
-
- 11 Sep, 2019 1 commit
Committed by Huihuang Zheng
TemporaryAllocator is a singleton used for allocating memory for cuDNN. Removing the singleton improves memory performance, so TemporaryAllocator is replaced by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to delete the memory allocated for the stream. Also adds data_feed_proto to operator to fix CI in CPU compilation.
-
- 11 Mar, 2019 1 commit
-
- 04 Mar, 2019 3 commits
Committed by chengduo
Add Event for TensorCopy
- 01 Mar, 2019 1 commit
Committed by chengduo
Add Event for TensorCopy
-
- 02 Feb, 2019 2 commits
- 16 Nov, 2018 1 commit
Committed by Yu Yang
test=develop
-
- 10 Oct, 2018 1 commit
Committed by sneaxiy
-
- 28 Sep, 2018 1 commit
Committed by Yu Yang
Use OO style to rewrite memory allocation.
-
- 08 Apr, 2018 3 commits
- 03 Apr, 2018 2 commits
Committed by chengduoZH
-
Committed by chengduoZH
-
- 30 Mar, 2018 1 commit
Committed by chengduoZH
-
- 10 Feb, 2018 1 commit
Committed by Yi Wang
-
- 06 Feb, 2018 1 commit
Committed by Luo Tao
-
- 30 Jan, 2018 1 commit
Committed by Luo Tao
-
- 23 Jan, 2018 2 commits
Committed by dangqingqing
-
Committed by dangqingqing
-
- 16 Jan, 2018 1 commit
Committed by Luo Tao
-
- 24 Nov, 2017 1 commit
Committed by Qiao Longfei
* Make enforce a target that depends on nccl when GPU is enabled
* Add some more dependencies
-
- 28 Oct, 2017 1 commit
Committed by Yu Yang
* Add debug logs in scope, meta_cache and memory
* Add missing deps
-
- 15 Aug, 2017 1 commit
Committed by qijun
-
- 04 Aug, 2017 2 commits
- 29 Jul, 2017 1 commit
Committed by Helin Wang
-
- 27 Jul, 2017 1 commit
Committed by liaogang
-
- 23 Jul, 2017 2 commits