- 15 Jan, 2018 1 commit
-
-
Committed by wanghaoshuang
1. Fix kernel 2. Add more test cases
-
- 13 Jan, 2018 1 commit
-
-
Committed by wanghaoshuang
2. Remove num_seq arguments. 3. Refine CUDA kernel of ScaleLoDTensorFunctor. 4. Change max_relative_error of gradient unit test to 0.007
-
- 11 Jan, 2018 2 commits
-
-
Committed by wanghaoshuang
-
Committed by wanghaoshuang
2. Add check grad test
-
- 09 Jan, 2018 2 commits
-
-
Committed by Yiqun Liu
* Add Seq2BatchFunctor, which will be used in WarpCTCOp.
* Implement WrapCTCFunctor and WrapCTCKernel.
* Add unit test of warpctc_op.
* Modify the check_output interface in the Python unit test framework to allow checking a subset of outputs.
* Use absolute offset lod in warpctc_op and related functors.
* Refine the comments of warpctc_op.
* The new Python unit test supports checking a subset of the outputs, so revoke the previous change.
* Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
* Update to the newest code.
* Rename PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
-
Committed by Yu Yang
* Rename Tensor::CopyFrom to Tensor::Copy * Fix CI * Fix compile
-
- 02 Jan, 2018 2 commits
- 29 Dec, 2017 3 commits
-
-
Committed by chengduoZH
-
Committed by typhoonzero
-
Committed by typhoonzero
-
- 28 Dec, 2017 5 commits
-
-
Committed by guosheng
-
Committed by guosheng
-
Committed by sweetsky0901
-
Committed by Yancey
* implement SelectedRows serialize and deserialize
* make serialize/deserialize global functions
* recover send_imp.cc
* delete unused brackets
* fix compile error
* serialize version in LoDTensor and SelectedRows
* fix ci
* fix ci
-
Committed by sweetsky0901
-
- 27 Dec, 2017 3 commits
-
-
Committed by typhoonzero
-
Committed by typhoonzero
-
Committed by qingqing01
-
- 26 Dec, 2017 3 commits
-
-
Committed by qingqing01
-
Committed by qingqing01
-
Committed by Luo Tao
-
- 25 Dec, 2017 4 commits
-
-
Committed by dangqingqing
-
Committed by qingqing01
-
Committed by QI JUN
* remove unused place * fix ci
-
Committed by dzhwinter
-
- 24 Dec, 2017 1 commit
-
-
Committed by qiaolongfei
-
- 21 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Remove unnecessary reshape in ColwiseSum. Speed up 12s -> 10s. * Hand-write ColwiseAdd on CPU
-
- 20 Dec, 2017 1 commit
-
-
Committed by chengduoZH
-
- 19 Dec, 2017 3 commits
-
-
Committed by chengduoZH
-
Committed by chengduoZH
-
Committed by chengduoZH
-
- 18 Dec, 2017 1 commit
-
-
Committed by QI JUN
* add more place_test and rename Cudnn to CUDNN * fix ci
-
- 15 Dec, 2017 1 commit
-
-
Committed by tensor-tang
-
- 14 Dec, 2017 1 commit
-
-
Committed by dzhwinter
* "derived cudnnDevice context" * "leave remove cudnn handle from CUDADeviceContext" * "fix math function error"
-
- 12 Dec, 2017 1 commit
-
-
Committed by QI JUN
The main changes are:
  - take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
  - remove `eigen_device` interface in base class `DeviceContext`
  - remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
  - remove unused `platform::EigenDeviceConverter`
  - rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
  - rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
-
- 11 Dec, 2017 1 commit
-
-
Committed by sweetsky0901
-
- 12 Dec, 2017 1 commit
-
-
Committed by tensor-tang
-
- 09 Dec, 2017 1 commit
-
-
Committed by sweetsky0901
-
- 08 Dec, 2017 1 commit
-
-
Committed by sweetsky0901
-