1. 25 Nov, 2021 6 commits
    • Support multi-stream allocation for CUDA place (#37290) · b9c464c3
      From00 committed
      * Support multi-stream allocation for CUDA place
      
      * Do not notify the retrying from other streams when free CUDA allocation
      
      * Fix compile error for CPU
      
      * Fix compile error for HIP
      
      * Release memory for StreamSafeCUDAAllocaRetry in malloc_test
      
      * Add FLAGS_use_stream_safe_cuda_allocator
      
      * Fix CI error for 'set_tests_properties'
      
      * Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy
      
      * Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock
      
      * FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator
      
      * Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator
      
      * Add UT for alloc interface
      
      * Changes multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
    • 50f75fb5
    • Added GradTensorHolder to Eager Dygraph (#37458) · bc9f9f43
      Zhanlue Yang committed
      * Added GradTensorHolder to Eager Dygraph
      
      * Added accumulation codes to Eager Dygraph
      
      * Fix windows-ci issue
      
      * Fix NPU-CI issue
      
      * Fixed CI-Coverage issue
    • Export task node to python (#37509) · 3f815e76
      LiYuRio committed
    • Fix test rnn memory helper op (#37474) · e4791d88
      xiongkun committed
      * clear LoDTensorArray
      
      * fix bugs
      
      * fix
      
      * fix gpu
    • fix_matmul_op_int8_plugin (#37525) · 0fd70d71
      Wangzheee committed
  2. 24 Nov, 2021 15 commits
  3. 23 Nov, 2021 19 commits