1. 01 Sep, 2022 1 commit
  2. 01 Aug, 2022 1 commit
    • unify gpu context (#44740) · 86763023
      Committed by Leo Chen
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes
  3. 26 Jun, 2022 1 commit
  4. 20 Feb, 2022 1 commit
  5. 11 Feb, 2022 1 commit
  6. 03 Dec, 2021 1 commit
  7. 11 Jul, 2020 1 commit
  8. 11 Sep, 2019 1 commit
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Committed by Huihuang Zheng
      TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, deleting it gives better memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix CI in CPU compilation.
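The idea in the commit message above, freeing a temporary buffer through a stream callback instead of through a global singleton, can be sketched roughly as follows. This is a plain-Python simulation, not Paddle's code: `FakeStream`, `StreamScopedAllocator`, and `release_after_stream` are hypothetical names introduced only for illustration, and the "stream" is simply a queue that runs tasks in order.

```python
from collections import deque


class FakeStream:
    """Hypothetical stand-in for a GPU stream: runs queued tasks strictly in order."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, fn):
        # Work and callbacks share one queue, so a callback runs only
        # after everything enqueued before it has finished.
        self._queue.append(fn)

    def synchronize(self):
        while self._queue:
            self._queue.popleft()()


class StreamScopedAllocator:
    """Hypothetical allocator owned by one device context (no global singleton)."""

    def __init__(self, stream):
        self.stream = stream
        self.live_bytes = 0

    def allocate(self, size):
        self.live_bytes += size
        return bytearray(size)

    def release_after_stream(self, buf):
        # Assumption: the release is deferred by registering it as a
        # callback on the stream that used the buffer.
        size = len(buf)

        def _free():
            self.live_bytes -= size

        self.stream.enqueue(_free)


stream = FakeStream()
allocator = StreamScopedAllocator(stream)
log = []

workspace = allocator.allocate(1024)
# A "kernel" that uses the workspace; it records how many bytes are live when it runs.
stream.enqueue(lambda: log.append(("kernel", allocator.live_bytes)))
allocator.release_after_stream(workspace)

bytes_before_sync = allocator.live_bytes
stream.synchronize()
bytes_after_sync = allocator.live_bytes
```

In this sketch the buffer is still counted as live while the queued kernel runs and is released only once the stream reaches the callback, which is the ordering guarantee a stream callback is meant to provide.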
  9. 21 Dec, 2018 1 commit
    • [Feature] Add Temporary Allocator (#14875) · 79bd6dfa
      Committed by chengduo
      * Add Temporary Allocator
      
      * add Temporary Allocator to DeviceContext
      test=develop
      
      * code refine
      test=develop
      
      * fix mean_iou
      test=develop
      
      * Add DeviceTemporaryAllocator
      test=develop
      
      * fix conv_op bug
      test=develop
      
      * small fix
      test=develop
      
      * code refine
      test=develop
      
      * log refine
      test=develop
      
      * fix unit test
      test=develop
      
      * move double check
      
      * refine concat_and_split
      test=develop
      
      * add limit_of_temporary_allocation
      test=develop
      
      * fix name
      test=develop
  10. 14 Jun, 2018 1 commit
    • Add mean IOU op. (#10519) · 6fcdb240
      Committed by whs
      * Add mean_iou op.
      
      * Add unit test for mean_iou op.
      
      * Add optional collections of confusion matrix and mean_iou.
      
      * Fix cuda kernel.
      
      * Refine code.
      1. Merge the GPU computation into two kernels.
      2. Use a wrong array and a correct array instead of a confusion matrix.
      
      * Add python api and fix cuda kernel.
      
      * Fix comments.
      
      * Small fix.
      
      * Small fix.
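The "wrong array and correct array" refinement mentioned in the mean IOU commit can be illustrated with a small sketch. It assumes, since the commit message does not spell it out, that a mismatch is counted against both the predicted class and the labelled class, and that the per-class IoU is `correct / (correct + wrong)`. The function name and signature are hypothetical and are not the operator's actual API.

```python
def mean_iou(predictions, labels, num_classes):
    """Mean IoU from per-class correct/wrong counters (no confusion matrix)."""
    correct = [0] * num_classes
    wrong = [0] * num_classes
    for pred, label in zip(predictions, labels):
        if pred == label:
            correct[label] += 1
        else:
            # Assumption: a mismatch is a false positive for the predicted
            # class and a false negative for the labelled class.
            wrong[pred] += 1
            wrong[label] += 1

    # Average only over classes that appear in predictions or labels.
    ious = [
        correct[c] / (correct[c] + wrong[c])
        for c in range(num_classes)
        if correct[c] + wrong[c] > 0
    ]
    miou = sum(ious) / len(ious) if ious else 0.0
    return miou, wrong, correct


miou, wrong, correct = mean_iou([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)
print(miou, wrong, correct)
```

Under these assumptions, two length-`num_classes` counters carry the same information the IoU needs (true positives, and false positives plus false negatives per class), so the full `num_classes × num_classes` confusion matrix is not required.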