1. 23 11月, 2018 4 次提交
    • L
      add SetMKLDNNThreadId api · a5c4b463
      luotao1 提交于
      a5c4b463
    • L
      add Set/GetCPUNumThreads api · e21edb26
      luotao1 提交于
      e21edb26
    • S
      Fix cmake for AMDGPU platform (#13801) · 61c5f13f
      sabreshao 提交于
      * HIP cmake.
      Enable whole archieve build for pybind library.
      
      Disable two warning.
      
      Rollback to C++11.
      
      Link RCCL to WA gpu kernel loading issue.
      
      Update eigen to fix build failure.
      
      Add more include directories.
      
      Fix O3 build failure.
      
      Update eigen.
      
      fix tensor_util_test segment fault issue
      
      add more macro check in hip.cmake.
      we may consider refine hip.cmake to inherit all add_definitions() in parrent scope, in the future.
      
      Fix rocRAND load.
      
      Update eigen to fix gru_unit_op and reduce_op.
      
      Add HIP support to testing.
      
      Update eigen to support int16 and int8 in arg min and arg max.
      
      * add rocprim as cub library used by nv implementation
      
      * Reduce build time in rocprim.
      
      * Add rocprim introduction, remove useless cmake code.
      
      * Remove useless flags and format cmake file.
      61c5f13f
    • Q
      CUDA kernel for density_prior_box_op. (#14513) · 36f08eef
      qingqing01 提交于
      * CUDA kernel for density_prior_box_op.
      * Support flatten to 2D.
      36f08eef
  2. 22 11月, 2018 5 次提交
    • C
      Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929) · 00b9e9a1
      chengduo 提交于
      * refine cublase
      test=develop
      
      * code refine
      
      * refine cublas
      
      * add GEMME_EX
      
      * add enable_cublas_tensor_op_math doc and add cublasCall
      test=develop
      
      * fix CublasCall for cuda version
      test=develop
      
      * fix error
      test=develop
      
      * fix GEMM_EX to be compatible with gcc 4.8
      test=develop
      
      * add GEMM_EX
      test=develop
      
      * to compatiable with gcc4.8
      test=develop
      00b9e9a1
    • D
      Group Norm (#13843) · ae7d2286
      Dun 提交于
      Add group normalization operator.
      ae7d2286
    • W
      Windows/online (#14474) · d9a1f3e5
      wopeizl 提交于
      * add recordio support
      
      * disable the openblas multi-thread on windows since no support
      adjust the python script
      
      * code style
      
      * code style
      test=develop
      
      * add create_recordio_file_reader back
      
      * fix code style
      test=develop
      
      * fix the gtest.cmake on windows
      
      * fix cc_test on windows
      
      * fix the win build
      test=develop
      
      * remove fused compile support on windows
      test=develop
      
      * add the jit support
      test=develop
      
      * add the jit support, test=develop
      
      * add the jit support, test=develop
      
      * add the jit back
      fix compile error on windows
      
      * rollback test=develop
      
      * test case fix
      
      * disable DSO by default on windows
      
      * exclude warpctc_op on windows
      
      * exclude the dynload_warpctc out on windows
      test=develop
      
      * fix the scripts error
      test=develop
      
      * disable avx on windows by default
      test=develop
      
      * re-organize the cmake file
      
      * disable mkl on windows by default
      
      * add warp_ctc back
      
      * fix the dependency
      
      * fix the dependency
      
      * fix the build issue on windows
      
      * remove unsupported flag on windows
      
      * code style
      
      * code style
      test=develop
      
      * fix issue
      
      * add profiler, parallel_executor back
      
      * clean up the pre-definitions on windows
      
      * fix build issue
      
      * test=develop
      d9a1f3e5
    • Y
      fix(Cpu): fix cpu compile and unittest · 533c5d58
      Yu Yang 提交于
      test=develop
      533c5d58
    • S
      add dlpack support · 3912545f
      sneaxiy 提交于
      test=develop
      3912545f
  3. 21 11月, 2018 4 次提交
  4. 20 11月, 2018 12 次提交
  5. 19 11月, 2018 11 次提交
  6. 18 11月, 2018 1 次提交
  7. 17 11月, 2018 3 次提交