- 14 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "unified operators"
* "add CUDNN register"
* "add use cudnn attribute"
* "add attribute"
* "test conv transpose op"
* "remove duplicated attr"
* "fix op test"
* "add attribute to set cudnn"
* "add more log"
* "need layout op register support"
* "add more log"
* "change GetExpectedKernelType"
* "fix Get attr in conv_op"
* "fix CI"
* "fix tests"
* "removed kernel priority fallback"
* "fix CI"
* "fix stack pointer bug"
* "refine buggy interface"
* "add const cast to save life"
* "fix get_output_with_grad"
* "fix op test with dataformat"
* "fix pooling"
* "fix pooling test"
* "fix CI"
* "fix with_gpu error"
* "add transform needed functional check"
* "fix unpack list error"
* "comment out parallel.do temporary"
* "fix CI"
* "fix compile doc error"
* "make threshold larger"
-
- 12 Jan, 2018 1 commit
-
-
Committed by Yan Chunwei
-
- 11 Jan, 2018 1 commit
-
-
Committed by wanghaoshuang
2. Add check grad test
-
- 09 Jan, 2018 1 commit
-
-
Committed by Yiqun Liu
* Add Seq2BatchFunctor, which will be used in WarpCTCOp.
* Implement WrapCTCFunctor and WrapCTCKernel.
* Add unittest of warpctc_op.
* Modify the check_output interface in the python unittest framework to allow checking a subset of outputs.
* Use absolute offset lod in warpctc_op and related functors.
* Refine the comments of warpctc_op.
* The new python unittest supports checking a subset of the outputs, so revoke the previous change.
* Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
* Update to the newest code.
* Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and remove the computation of dimensions out of the functors.
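The padding transform named in this commit turns variable-length sequences into a dense tensor of shape [max_sequence_length, num_sequences, sequence_width]. The following is a minimal Python sketch of that idea only, not PaddlePaddle's implementation: `pad_lod_tensor` is a hypothetical name, the input is assumed to be a flat list of rows plus one level of absolute offsets, and zero is assumed as the padding value.

```python
def pad_lod_tensor(rows, offsets, pad_value=0.0):
    """Pad variable-length sequences into a time-major nested list.

    rows    -- flat list of feature rows, all of the same width
    offsets -- absolute offsets of one LoD level, e.g. [0, 2, 3] means
               sequence 0 is rows[0:2] and sequence 1 is rows[2:3]
    Returns a list shaped [max_sequence_length][num_sequences][sequence_width].
    """
    num_sequences = len(offsets) - 1
    width = len(rows[0]) if rows else 0
    lengths = [offsets[i + 1] - offsets[i] for i in range(num_sequences)]
    max_len = max(lengths, default=0)

    # Start from a tensor filled with the padding value (assumed to be zero).
    padded = [
        [[pad_value] * width for _ in range(num_sequences)]
        for _ in range(max_len)
    ]

    # Copy each sequence into its own column, one time step per row.
    for seq in range(num_sequences):
        start, end = offsets[seq], offsets[seq + 1]
        for step, row in enumerate(rows[start:end]):
            padded[step][seq] = list(row)
    return padded


padded = pad_lod_tensor([[1, 1], [2, 2], [3, 3]], [0, 2, 3])
```

Here the first sequence has two steps and the second has one, so the second column of the last time step holds only padding.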
-
- 03 Jan, 2018 2 commits
- 02 Jan, 2018 4 commits
-
-
Committed by Luo Tao
-
Committed by Luo Tao
-
Committed by sweetsky0901
-
Committed by sweetsky0901
-
- 29 Dec, 2017 1 commit
-
-
Committed by chengduoZH
-
- 27 Dec, 2017 2 commits
- 25 Dec, 2017 1 commit
-
-
Committed by typhoonzero
-
- 19 Dec, 2017 1 commit
-
-
Committed by Yang Yang
-
- 12 Dec, 2017 2 commits
-
-
Committed by sweetsky0901
-
Committed by QI JUN
The main fixes are:
  - take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
  - remove `eigen_device` interface in base class `DeviceContext`
  - remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
  - remove unused `platform::EigenDeviceConverter`
  - rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
  - rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
-
- 01 Dec, 2017 3 commits
-
-
Committed by typhoonzero
-
Committed by Yancey
* fix grpc compile warn
* update
* -Wnon-virtual-dtor -> -Wno-non-virtual-dtor
-
Committed by typhoonzero
-
- 28 Nov, 2017 1 commit
-
-
Committed by 武毅
* WIP send recv op
* WIP send recv
* put grpc impl in details
* put grpc impl in details
* update wip
* update proto
* update proto
* update proto
* clean cmake
* wip on op implementations
* wip on op implementations
* compile ok adding ut
* wip unittest
* add extern cares for linking
* wip add ut
* working version send recv
* revert optimizer.py
* update test cmake
* add libtool to dockerfile
* update cmake dependency
* update cmake depends
* update cmake grpc depends
* fix cmake dependency
* fix compile error
* fix compile
* follow comments
* update
* update copyfrom
-
- 27 Nov, 2017 2 commits
- 26 Nov, 2017 1 commit
-
-
Committed by dzhwinter
* "make global tensor function independently"
* "replace functor"
* "fix inline template error"
* "fix tensor array with CopyFrom"
* "fix other case use CopyFrom"
* "move the op interface hardly"
* "fix operators"
* "fix typo"
* "delete dynamic recurrent rnn and fix gru_unit in debugmode"
* "fix unique_ptr copy"
* "fix cuda copy"
* "fix namespace error"
* "removed nccl python test"
* "fix include error"
* "fix typo"
* fix copy util test
-
- 22 Nov, 2017 1 commit
-
-
Committed by sweetsky0901
-
- 21 Nov, 2017 2 commits
-
-
Committed by sweetsky0901
-
Committed by sweetsky0901
-
- 18 Nov, 2017 1 commit
-
-
Committed by Abhinav Arora
-
- 16 Nov, 2017 1 commit
-
-
Committed by QI JUN
* adam sparse support
* fix gpu build error
* fix ci
* fix ci
* fix adagrad sparse update bug
* fix gpu build error
-
- 13 Nov, 2017 3 commits
-
-
Committed by chengduoZH
-
Committed by Qiao Longfei
* init trieconcat_op
* add basic implementation
* add test
* add more test
* update unit test
* add PackAllSteps test
* fix PackAllSteps
* all test passed
* clean code
* remove state inside helper
* rename prob to score
* optimize RemoveFromEnd
* use destructor to delete BeamNode recursively
* optimize interface
* add comment to interface
* optimize data structure
* use template to define the type of score
* use template parameter for BeamHelper
* change father to parent
* rename TrieConcat to BeamSearchOutConcat
* use LoDTensorArray
* rename BeamSearchOutConcat to BeamSearchDecode
* refine code
* retain all candidate sentences in beam_search_decode_op, do not consider endid
* use unique_ptr
* fix compare bug
* fix lod compile problem
-
Committed by dangqingqing
-
- 11 Nov, 2017 2 commits
-
-
Committed by dangqingqing
-
Committed by wanghaox
-
- 08 Nov, 2017 3 commits
-
-
Committed by Yu Yang
* Add LoDRankTable. LoD Rank Table stores the `level` of `lod`, ordered by sequence length in descending order. It is useful when implementing dynamic RNN and is shared by the dynamic RNN memory, dynamic RNN slice input and dynamic RNN slice output operators.
* Add skeleton for array_to_lod_tensor and lod_tensor_to_array
* Add VarType::LoDTensorArray
* Add PyBind of LoDTensorArray
* Add InferVarType
* Add first unittest
* Add ut
* Add unittest
* Add unittest
* Add unittests
* update
* init
* add infershape for lod_tensor_to_array_op
* complete array_to_lod_tensor_op
* copy data
* clean code
* clean code
* Fix unittest data
* fix bugs
* fix compile error
* Refine TensorToArrayOp
* refactor array_to_lod_tensor
* Unittest
* fix bugs
* Fix unittest
* Fix unittest
* debug
* Debug
* Fix unittest
* clean code
* refactor
* use ostream
* update test
* fix gpu build error
* make gpu test pass
-
Committed by Yu Yang
-
Committed by Yu Yang
* Compare Operator
* Follow comments
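As an illustration of the rank table described in this commit message, the sketch below builds (sequence index, length) pairs from one LoD level and orders them by length, longest first. It is a minimal Python sketch under stated assumptions, not PaddlePaddle's implementation: `build_lod_rank_table` is a hypothetical name, and the LoD level is assumed to be given as cumulative offsets.

```python
def build_lod_rank_table(offsets):
    """Return (sequence_index, length) pairs sorted by length, longest first.

    `offsets` is one LoD level given as cumulative offsets (an assumption),
    e.g. [0, 2, 5, 6] describes three sequences of lengths 2, 3 and 1.
    """
    lengths = [end - start for start, end in zip(offsets, offsets[1:])]
    items = list(enumerate(lengths))
    # sorted() is stable, so sequences of equal length keep their input order.
    return sorted(items, key=lambda item: item[1], reverse=True)


table = build_lod_rank_table([0, 2, 5, 6, 9])
print(table)  # [(1, 3), (3, 3), (0, 2), (2, 1)]
```

A dynamic RNN can walk such a table to know, at each time step, how many sequences are still active, since the longest sequences come first.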
-
- 07 Nov, 2017 1 commit
-
-
Committed by Yu Yang
* Use stable_sort in lod_rank_table. It is easier to debug and test when using `stable_sort`, and the time complexity is not changed.
* Add LoDTensorArray
* Stash
* Better debug message for IsInitialized
* Stash
* Better debug message for IsInitialized
* Complete array read/write op unittests
-
- 04 Nov, 2017 1 commit
-
-
Committed by Yu Yang
* Add LoDRankTable. LoD Rank Table stores the `level` of `lod`, ordered by sequence length in descending order. It is useful when implementing dynamic RNN and is shared by the dynamic RNN memory, dynamic RNN slice input and dynamic RNN slice output operators.
* Add InferVarType
-
- 03 Nov, 2017 1 commit
-
-
Committed by dangqingqing
-