- 10 Jan, 2018 7 commits
-
-
Committed by Yang Yang(Tony)
feature/parallel_gpu
-
Committed by Yang Yu
-
Committed by dzhwinter
* "init use all default devices" * "fix init test"
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
* add lod tensor ToAbsOffset test * add share lod to topk op and softmax op
-
Committed by xuwei06
Added backward.calc_gradient to backpropagate gradient from given targets to inputs.
-
Committed by xuwei06
-
- 09 Jan, 2018 9 commits
-
-
Committed by yangyaming
-
Committed by QI JUN
-
Committed by fengjiayi
-
Committed by Yancey
* test dist word2vec * multiple trainers work
-
Committed by Yiqun Liu
* Add Seq2BatchFunctor, which will be used in WarpCTCOp.
* Implement WrapCTCFunctor and WrapCTCKernel.
* Add unittest of warpctc_op.
* Modify the check_output interface in the python unittest framework to allow checking a subset of outputs.
* Use absolute offset lod in warpctc_op and related functors.
* Refine the comments of warpctc_op.
* The new python unittest supports checking a subset of the outputs, so revoke the previous change.
* Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
* Update to the newest code.
* Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
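The padded layout and the absolute-offset LoD mentioned in this commit message can be illustrated with a minimal Python sketch. This is not the PaddlePaddle implementation: `pad_lod_tensor` and its parameters are hypothetical names, and the sketch assumes a single-level LoD given as absolute offsets, with rows stored as plain Python lists.

```python
def pad_lod_tensor(rows, lod, pad_value=0.0):
    """Hypothetical helper (not a PaddlePaddle API): pad variable-length
    sequences into a dense nested list of shape
    [max_sequence_length, num_sequences, sequence_width].

    rows: list of rows, each a list of length sequence_width
    lod:  absolute offsets, e.g. [0, 2, 5] means sequence 0 is rows[0:2]
          and sequence 1 is rows[2:5]
    """
    num_sequences = len(lod) - 1
    lengths = [lod[i + 1] - lod[i] for i in range(num_sequences)]
    max_len = max(lengths) if lengths else 0
    width = len(rows[0]) if rows else 0

    # Start from a tensor filled with the padding value.
    padded = [[[pad_value] * width for _ in range(num_sequences)]
              for _ in range(max_len)]

    # Copy time step t of each sequence into padded[t][seq].
    for seq in range(num_sequences):
        for t in range(lengths[seq]):
            padded[t][seq] = list(rows[lod[seq] + t])
    return padded


rows = [[1.0], [2.0], [3.0], [4.0], [5.0]]
padded = pad_lod_tensor(rows, lod=[0, 2, 5])
```

Here the first sequence has two steps and the second has three, so the last time step of the first sequence is filled with the padding value.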
-
Committed by fengjiayi
-
Committed by Yu Yang
* Rename Tensor::CopyFrom to Tensor::Copy * Fix CI * Fix compile
-
Committed by Yu Yang
* Remove unused LoDTensor methods * Update
-
Committed by qiaolongfei
-
- 08 Jan, 2018 13 commits
-
-
Committed by qiaolongfei
-
Committed by qiaolongfei
-
Committed by Yancey
* create tensor in recv op * static global function to global function
-
Committed by qiaolongfei
-
Committed by Luo Tao
-
Committed by Yang Yu
-
Committed by dzhwinter
* "reuse ShareLoD with no regret" * "removed base class shareLayout" * "fix CI"
-
Committed by Yang Yu
-
Committed by Qiao Longfei
* add rename guard
* add device_data_transform
* add device_data_transform_test
* modify GetExpectedKernelType
* update operator.run
* support test test_label_semantic_roles
* optimize code
* optimize code
* rename GetActualKernelType to GetExpectedKernelType
* fix chunk_eval_op and device_data_transform_test
* add is_same_place to place
* optimize code, refine rename_guard
* refine rename guard, add GetKernelTypeForVar
* optimize code
* add some log
* rename guard
* use sub scope to create var
* fix compile
* add IsInitialized for Tensor
* add VarIsTensor
* fix op_registry_test
* test
* tmp disable priority
* restore switch_kernel.md
* code clean
-
Committed by Yibing Liu
-
Committed by Yibing Liu
-
Committed by emailweixu
This makes it easier to locate errors.
-
Committed by Siddharth Goyal
-
- 06 Jan, 2018 1 commit
-
-
Committed by Yibing Liu
-
- 05 Jan, 2018 10 commits
-
-
Committed by tensor-tang
-
Committed by Yibing Liu
-
Committed by tensor-tang
-
Committed by guosheng
-
Committed by Yancey
-
Committed by Yibing Liu
-
Committed by Yibing Liu
-
Committed by Yibing Liu
-
Committed by Yang Yu
It will be used for LoD information in LoDTensor, since LoD is a copy-on-write field. Copying LoD information between operators is quite slow: for ResNet it costs roughly 10% of the total time, including reading data.
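The copy-on-write idea referred to here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from this commit: `CowLoD`, `share`, `data`, and `mutable_data` are hypothetical names, and the reference count is maintained by hand rather than by the runtime.

```python
class CowLoD:
    """Hypothetical copy-on-write handle for LoD offsets (illustration only)."""

    def __init__(self, offsets):
        # The payload and its reference count live in one shared box.
        self._box = {"data": list(offsets), "refs": 1}

    def share(self):
        """Return a new handle pointing at the same payload; nothing is copied."""
        other = CowLoD.__new__(CowLoD)
        other._box = self._box
        self._box["refs"] += 1
        return other

    def data(self):
        """Read access: returns the shared payload (read-only by convention)."""
        return self._box["data"]

    def mutable_data(self):
        """Write access: copy the payload first if another handle still uses it."""
        if self._box["refs"] > 1:
            self._box["refs"] -= 1
            self._box = {"data": list(self._box["data"]), "refs": 1}
        return self._box["data"]


a = CowLoD([0, 2, 5])
b = a.share()                   # cheap: both handles see the same list
shared_before_write = b.data() is a.data()
b.mutable_data().append(9)      # first write through b triggers the copy
```

Passing a handle from one operator to the next then costs a pointer copy, and the offsets are duplicated only when some operator actually modifies them.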
-
Committed by dzhwinter
* "add c++ side kernel selection"
* "add multiple kernel op test"
* "kernel selection only support cudnn"
* "better formatter"
* "small fix with UseCPU"
* "depends on change interface Get(Place, Library)"
* "fix CI"
* "fix python cudnn test"
* "leave the register cudnn op to another PR"
* "fix CI"
* "use all kernel by default"
* "fix CI"
-