1. 22 Feb, 2019 2 commits
    • Revert 15770 develop a6910f90 gelu mkl opt (#15872) · ee2321de
      tensor-tang committed
      * Revert "Optimze Gelu with MKL Erf function (#15770)"
      
      This reverts commit 676995c8.
      
      * test=develop
    • Optimze Gelu with MKL Erf function (#15770) · 676995c8
      Yihua Xu committed
      * Optimize for gelu operator
      
      * Set up the low accuracy mode of MKL ERF function.
      
      test=develop
      
      * Only enable MKLML ERF when OS is linux
      
      * Use the special mklml version that includes the vmsErf function to verify the gelu mkl kernel.
      
      test=develop
      
      * Add the CUDA macro to avoid NVCC's compile issue.
      
      test=develop
      
      * Add the TODO comments for mklml library modification.
      
      test=develop
      
      * Clean Code
      
      test=develop
      
      * Add a comment on the macro for the NVCC compiler.
      
      test=develop
  2. 28 Jan, 2019 1 commit
  3. 26 Dec, 2018 1 commit
  4. 19 Dec, 2018 1 commit
  5. 18 Dec, 2018 3 commits
  6. 13 Dec, 2018 1 commit
  7. 05 Dec, 2018 1 commit
  8. 04 Dec, 2018 1 commit
  9. 27 Nov, 2018 2 commits
  10. 26 Nov, 2018 1 commit
  11. 23 Nov, 2018 1 commit
  12. 22 Nov, 2018 3 commits
    • Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929) · 00b9e9a1
      chengduo committed
      * refine cublas
      test=develop
      
      * code refine
      
      * refine cublas
      
      * add GEMM_EX
      
      * add enable_cublas_tensor_op_math doc and add cublasCall
      test=develop
      
      * fix CublasCall for cuda version
      test=develop
      
      * fix error
      test=develop
      
      * fix GEMM_EX to be compatible with gcc 4.8
      test=develop
      
      * add GEMM_EX
      test=develop
      
      * make compatible with gcc 4.8
      test=develop
    • fix unit test cases · 7c8c9dc9
      peizhilin committed
    • Windows/online (#14474) · d9a1f3e5
      wopeizl committed
      * add recordio support
      
      * disable openblas multi-threading on windows since it is not supported
      adjust the python script
      
      * code style
      
      * code style
      test=develop
      
      * add create_recordio_file_reader back
      
      * fix code style
      test=develop
      
      * fix the gtest.cmake on windows
      
      * fix cc_test on windows
      
      * fix the win build
      test=develop
      
      * remove fused compile support on windows
      test=develop
      
      * add the jit support
      test=develop
      
      * add the jit support, test=develop
      
      * add the jit support, test=develop
      
      * add the jit back
      fix compile error on windows
      
      * rollback test=develop
      
      * test case fix
      
      * disable DSO by default on windows
      
      * exclude warpctc_op on windows
      
      * exclude dynload_warpctc on windows
      test=develop
      
      * fix the scripts error
      test=develop
      
      * disable avx on windows by default
      test=develop
      
      * re-organize the cmake file
      
      * disable mkl on windows by default
      
      * add warp_ctc back
      
      * fix the dependency
      
      * fix the dependency
      
      * fix the build issue on windows
      
      * remove unsupported flag on windows
      
      * code style
      
      * code style
      test=develop
      
      * fix issue
      
      * add profiler, parallel_executor back
      
      * clean up the pre-definitions on windows
      
      * fix build issue
      
      * test=develop
  13. 21 Nov, 2018 1 commit
  14. 19 Nov, 2018 1 commit
  15. 16 Nov, 2018 1 commit
    • Add cudnn ctc loss (#12366) · b32c13dc
      Wu Yi committed
      * add cudnn ctc loss
      
      * wip add test test=develop
      
      * wip
      
      * wip
      
      * done test=develop
      
      * move include cudnn test=develop
      
      * test test=develop
      
      * fix build test=develop
      
      * fix build test=develop
      
      * fix build on cudnn5 test=develop
      
      * fix cudnn5 build test=develop
      
      * fix cudnn5 build test=develop
      
      * merge develop softmax functor change test=develop
  16. 13 Nov, 2018 1 commit
  17. 09 Nov, 2018 1 commit
    • Exhaustive search for cuDNN conv. (#14286) · abe20923
      qingqing01 committed
      * exhaustive search for cuDNN conv.
      * Refine code and add unit testing.
      * Fix model load in fluid/inference and unit testing in conv2d
      * Follow comments.
      * Fix compiling test=develop
  18. 08 Nov, 2018 1 commit
  19. 07 Nov, 2018 2 commits
  20. 02 Nov, 2018 1 commit
    • Add affine grid generator op (#12238) · 0c319e0b
      whs committed
      * Add affine grid generator.
      
      * Fix affine grid.
      
      * Add unit test.
      
      * Add CPU kernel and fix unit test.
      
      * Fix CPU kernel.
      
      * Refine code.
      test=develop
      
      * Fix python api.
      test=develop
      
      * Update python api.
      test=develop
      
      * Fix comment.
      test=develop
      
      * Rename affine_grid_generator to affine_grid and enhance unit test.
      test=develop
      
      * Fix unit test.
      test=develop
  21. 29 Oct, 2018 2 commits
  22. 28 Sep, 2018 1 commit
  23. 15 Sep, 2018 1 commit
  24. 05 Sep, 2018 1 commit
  25. 29 Aug, 2018 1 commit
  26. 27 Aug, 2018 4 commits
  27. 26 Aug, 2018 1 commit
  28. 24 Aug, 2018 1 commit
  29. 22 Aug, 2018 1 commit