- 27 Nov, 2018 12 commits
-
-
Committed by Qiao Longfei
-
Committed by Clementine
-
Committed by chengduo
test=develop
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
Committed by tangwei12
* Fix truncated normal.
* Fix.
* Make nce support more distributions.
* Fix API.spec.
* Fix python API.
* Fix. test=develop
* Fix API.spec test=develop
* Fix sampler.
* Fix order of arguments in python API. test=develop
* NCE add selectedrows support
* NCE update weighted sampling
* fix bugs in nce_op, and assign_value_op optimized
* fix bugs in nce_op, revert assign_value_op
* nce_op optimize
* nce_op optimize
* nce_op optimize
* add selectedRows test later test=develop
* add selectedRows supported
* add selectedRows supported test=develop
* add selectedRows supported
* add nce selectedRows supported, test=develop
* add nce selectedRows supported
* add nce selectedRows supported, test=develop
* fix height in nce, test=develop
* add ut
* add ut, test=develop
* make AutoGrownIndex inline test=develop
* fix tiny error, test=develop
-
Committed by Qiao Longfei
test=develop
-
Committed by peizhilin
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
test=develop
-
Committed by dengkaipeng
-
Committed by dengkaipeng
-
- 26 Nov, 2018 4 commits
-
-
Committed by tensor-tang
test=develop
-
Committed by qingqing01
* Transpose-Flatten-Concat fusion operator.
* Add unit testing and fix bug.
-
Committed by tangwei12
* fix mkdir conflict
* fix load/save lookup tables test=develop
* add lookup_table_utils
* fix load optimize vars on pserver
* delete lookup table utils
* fix save and load lookup tables
* fix load optimizer var
* fix load optimizer var, test=develop
* fix python 3 style, test=develop
* move lookup_table_utils to contrib utils
-
Committed by Yiqun Liu
test=develop
-
- 25 Nov, 2018 1 commit
-
-
Committed by gongweibao
-
- 23 Nov, 2018 5 commits
-
-
Committed by luotao1
-
Committed by peizhilin
fix code style
-
Committed by qingqing01
* CUDA kernel for density_prior_box_op.
* Support flatten to 2D.
-
Committed by tensor-tang
test=develop
-
Committed by peizhilin
-
- 22 Nov, 2018 6 commits
-
-
Committed by chengduo
* refine cublas test=develop
* code refine
* refine cublas
* add GEMM_EX
* add enable_cublas_tensor_op_math doc and add cublasCall test=develop
* fix CublasCall for cuda version test=develop
* fix error test=develop
* fix GEMM_EX to be compatible with gcc 4.8 test=develop
* add GEMM_EX test=develop
* to be compatible with gcc4.8 test=develop
-
Committed by peizhilin
-
Committed by tensor-tang
test=develop
-
Committed by tensor-tang
test=develop
-
Committed by Dun
Add group normalization operator.
-
Committed by wopeizl
* add recordio support
* disable the openblas multi-thread on windows since no support adjust the python script
* code style
* code style test=develop
* add create_recordio_file_reader back
* fix code style test=develop
* fix the gtest.cmake on windows
* fix cc_test on windows
* fix the win build test=develop
* remove fused compile support on windows test=develop
* add the jit support test=develop
* add the jit support, test=develop
* add the jit support, test=develop
* add the jit back fix compile error on windows
* rollback test=develop
* test case fix
* disable DSO by default on windows
* exclude warpctc_op on windows
* exclude the dynload_warpctc out on windows test=develop
* fix the scripts error test=develop
* disable avx on windows by default test=develop
* re-organize the cmake file
* disable mkl on windows by default
* add warp_ctc back
* fix the dependency
* fix the dependency
* fix the build issue on windows
* remove unsupported flag on windows
* code style
* code style test=develop
* fix issue
* add profiler, parallel_executor back
* clean up the pre-definitions on windows
* fix build issue
* test=develop
-
- 21 Nov, 2018 4 commits
-
-
Committed by tensor-tang
test=develop
-
Committed by tensor-tang
test=develop
-
Committed by Yu Yang
Some operators depend on cub and xxhash through their headers. The dependency should be declared explicitly rather than declared to pybind. test=develop
-
Committed by Dang Qingqing
test=develop
-
- 20 Nov, 2018 5 commits
-
-
Committed by tensor-tang
test=develop
-
Committed by tensor-tang
-
Committed by Yihua Xu
-
Committed by Yihua Xu
-
- 19 Nov, 2018 3 commits
-
-
Committed by qingqing01
* Modify some infer-shape in compile-time.
-
Committed by Yihua Xu
* Optimize layer_norm operator with AVX intrinsic functions
* Revert the wrong modifications
* Implement the jit kernel for layer_norm operator
* Add math header file to fix the compile issue (test=develop)
* Add math header file to fix the compile issue (test=develop)
* Fixed the intrinsic header file issue (test=develop)
* Fix the conflicts (test=develop)
* Revert for CUDA compiler (test=develop)
* Fixed the cuda dependency (test=develop)
* Fix the macro issues (test=develop)
-
Committed by peizhilin
-