1. 06 Aug, 2020 · 1 commit
  2. 11 Sep, 2019 · 1 commit
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Committed by Huihuang Zheng
      TemporaryAllocator is a singleton used to allocate memory for cuDNN. Since it is a singleton, we can remove it to improve memory performance.

      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, avoiding the singleton.

      Also added data_feed_proto to operator to fix CI for CPU compilation.
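      The stream-callback idea described in this commit can be sketched in plain C++. This is a minimal simulation under stated assumptions, not the code from the commit: `FakeStream`, `StreamScopedAllocator`, `ReleaseWhenStreamDone`, and `RunDemo` are hypothetical names, the "stream" is an ordinary FIFO queue standing in for a CUDA stream, and `malloc`/`free` stand in for device allocation. In real CUDA code the deferred free would typically be registered through a stream callback facility such as `cudaStreamAddCallback`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for a CUDA stream: tasks run in FIFO order
// when Synchronize() is called. Not a real CUDA or Paddle type.
class FakeStream {
 public:
  void Enqueue(std::function<void()> task) { tasks_.push(std::move(task)); }

  void Synchronize() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> tasks_;
};

// Hypothetical allocator bound to one stream instead of a global singleton.
// It must outlive every callback it has enqueued on the stream.
class StreamScopedAllocator {
 public:
  explicit StreamScopedAllocator(FakeStream* stream) : stream_(stream) {
    assert(stream_ != nullptr);
  }

  void* Allocate(std::size_t bytes) {
    void* ptr = std::malloc(bytes);  // stand-in for a device allocation
    if (ptr != nullptr) ++live_;
    return ptr;
  }

  // Defers the free until all work enqueued earlier on the stream has run.
  void ReleaseWhenStreamDone(void* ptr) {
    if (ptr == nullptr) return;
    stream_->Enqueue([this, ptr] {
      std::free(ptr);
      --live_;
    });
  }

  int live() const { return live_; }

 private:
  FakeStream* stream_;
  int live_ = 0;
};

struct DemoResult {
  bool kernel_ran;
  int live_before_sync;
  int live_after_sync;
};

DemoResult RunDemo() {
  FakeStream stream;
  StreamScopedAllocator allocator(&stream);
  DemoResult result{false, 0, 0};

  void* workspace = allocator.Allocate(256);
  // Stand-in for a kernel that uses the workspace.
  stream.Enqueue([&result, workspace] {
    if (workspace != nullptr) {
      static_cast<char*>(workspace)[0] = 1;
      result.kernel_ran = true;
    }
  });
  // The free is queued behind the "kernel", so the buffer stays valid for it.
  allocator.ReleaseWhenStreamDone(workspace);

  result.live_before_sync = allocator.live();
  stream.Synchronize();
  result.live_after_sync = allocator.live();
  return result;
}
```

      Calling `RunDemo()` exercises the ordering: the buffer is still counted as live before the stream is synchronized and is released only after the queued work that uses it has run. Tying the allocator to a stream object rather than to a process-wide instance is what removes the need for a singleton in this sketch.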
  3. 02 Aug, 2019 · 1 commit
    • Fix memory leak in test (#18622) · c2c876f7
      Committed by Krzysztof Binias
      * Fix memory leak in test
      
      test=develop
      
      * Fix memory leak in test
      test=develop
      
      * Fix memory leak in test
      test=develop
      
      * Pull vars out of the loops
      test=develop
  4. 19 Mar, 2019 · 1 commit
  5. 21 Jan, 2019 · 1 commit
  6. 18 Jan, 2019 · 2 commits
  7. 28 Dec, 2018 · 1 commit
  8. 26 Nov, 2018 · 1 commit
  9. 23 Nov, 2018 · 2 commits
    • just for test · 5cd2fc9f
      Committed by JiabinYang
    • Fix cmake for AMDGPU platform (#13801) · 61c5f13f
      Committed by sabreshao
      * HIP cmake.
      Enable whole-archive build for pybind library.

      Disable two warnings.

      Rollback to C++11.

      Link RCCL to work around GPU kernel loading issue.
      
      Update eigen to fix build failure.
      
      Add more include directories.
      
      Fix O3 build failure.
      
      Update eigen.
      
      Fix tensor_util_test segmentation fault issue.

      Add more macro checks in hip.cmake.
      In the future, we may consider refining hip.cmake to inherit all add_definitions() from the parent scope.
      
      Fix rocRAND load.
      
      Update eigen to fix gru_unit_op and reduce_op.
      
      Add HIP support to testing.
      
      Update eigen to support int16 and int8 in arg min and arg max.
      
      * Add rocprim in place of the cub library used by the NVIDIA implementation.
      
      * Reduce build time in rocprim.
      
      * Add rocprim introduction, remove useless cmake code.
      
      * Remove useless flags and format cmake file.
  10. 22 Nov, 2018 · 1 commit
  11. 12 Nov, 2018 · 1 commit
  12. 09 Nov, 2018 · 1 commit
  13. 08 Nov, 2018 · 1 commit
  14. 28 Sep, 2018 · 1 commit
  15. 05 Jul, 2018 · 1 commit
  16. 03 Jul, 2018 · 1 commit
  17. 01 Jul, 2018 · 1 commit
  18. 17 Jun, 2018 · 1 commit
  19. 15 Jun, 2018 · 1 commit
  20. 12 Jun, 2018 · 1 commit
  21. 08 Jun, 2018 · 1 commit
  22. 08 Apr, 2018 · 1 commit
  23. 07 Apr, 2018 · 1 commit
  24. 26 Feb, 2018 · 1 commit
  25. 10 Feb, 2018 · 1 commit
  26. 08 Feb, 2018 · 1 commit
  27. 01 Feb, 2018 · 1 commit
  28. 30 Jan, 2018 · 1 commit
  29. 10 Jan, 2018 · 1 commit
  30. 25 Dec, 2017 · 1 commit
  31. 24 Dec, 2017 · 1 commit
    • Feature/operator run place (#6783) · 735eba29
      Committed by dzhwinter
      * "change operator interface"
      
      * "move devicepool to device_context"
      
      * "fix operator test"
      
      * "fix op_registry Run interface"
      
      * "net op passed. Need to fix nccl multi-Context"
      
      * "add nccl group function"
      
      * "add nccl group function"
      
      * "fix gpu count exceed 32 error"
      
      * "fix recurrent op, nccl op"
      
      * "change the other operators interface with Place"
      
      * "fix typo"
      
      * "fix pybind"
      
      * "fix device in python side"
      
      * "fix pybind failed"
      
      * "add init for test"
      
      * "fix CI"
  32. 01 Dec, 2017 · 1 commit
  33. 30 Nov, 2017 · 1 commit
  34. 31 Oct, 2017 · 1 commit
  35. 29 Jun, 2017 · 1 commit
  36. 28 Jun, 2017 · 1 commit
  37. 04 Jan, 2017 · 2 commits