- 06 Feb, 2018 1 commit
-
-
Committed by Luo Tao
-
- 31 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "Need to re-design LoD"
* "add lod design"
* "fix lod gpu ptr pointer"
* "removed commented code"
* "fix CI"
* "remove set lod in pybind"
* "fix style check"
* "fix CI"
* "fix long type template error"
* "pybind reorder to use Place"
* "fix ci"
* "fix ci"
* fix ci
* "separate as a new file"
* "fix CI"
* "fix ci"
* small fix
* "add test"
* "fix adam op"
* "fix lstmp op"
* "fix adam op"
* "follow comments"
* "fix ci"
-
- 30 Jan, 2018 2 commits
- 29 Jan, 2018 1 commit
-
-
Committed by Yi Wang
* Remove IsBounded as buffered channels have to be bounded
* Add derived classes Buffered and UnBuffered
* Implement buffered and unbuffered channels
* Correct the syntax of Channel::Receive
* clang-format
* clang-format 3.8
* clang 3.8
-
- 26 Jan, 2018 1 commit
-
-
Committed by kexinzhao
* initial commit
* add new executor run function
* fix bug
* fix multiple definition of feed_fetch_method issue
* fix cmake
* fix tensor copy error
* refine executor code
* add comments
* temporary modification
* address comments
* fix bug
-
- 22 Jan, 2018 1 commit
-
-
Committed by dangqingqing
-
- 21 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* init complete data layout transform
* can compile
* test passed
* optimize code
* fix while_grad_op first step loss lod problem
* optimize in out ptr for transform
* add check
* update copyright
* clean code
* add NeedTransformLayout
* add comment
* change the interface of data_type_transform
* init data_type_transform_test
* complete data_type_transform_test
* add TransDataType to data_transform
-
- 20 Jan, 2018 1 commit
-
-
Committed by dangqingqing
-
- 19 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* add data layout transform and optimize the implementation of data_transform
-
- 17 Jan, 2018 1 commit
-
-
Committed by Luo Tao
-
- 16 Jan, 2018 2 commits
-
-
Committed by dangqingqing
-
Committed by Luo Tao
-
- 12 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* add GetLoD for debug
* add LoDToString
* optimize if
* typo
* add lod_tensor to operator's dependency
-
- 10 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* init data_type_transform
* split data_layout_transform
* tmp rm data_transform_test
* change device_data_transform to data_device_transform
* clean code
* clean code
-
- 08 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* add rename guard
* add device_data_transform
* add device_data_transform_test
* modify GetExpectedKernelType
* update operator.run
* support test test_label_semantic_roles
* optimize code
* optimize code
* rename GetActualKernelType to GetExpectedKernelType
* fix chunk_eval_op and device_data_transform_test
* add is_same_place to place
* optimize code, refine rename_guard
* refine rename guard, add GetKernelTypeForVar
* optimize code
* add some log
* rename guard
* use sub scope to create var
* fix compile
* add IsInitialized for Tensor
* add VarIsTensor
* fix op_registry_test
* test
* tmp disable priority
* restore switch_kernel.md
* code clean
-
- 05 Jan, 2018 2 commits
-
-
Committed by Yang Yu
It will be used for LoD information in LoDTensor, since LoD is a copy-on-write field. Copying LoD information between operators is pretty slow: for ResNet it costs roughly 10% of the total run time, including reading data.
-
Committed by dzhwinter
* "add c++ side kernel selection"
* "add multiple kernel op test"
* "kernel selection only support cudnn"
* "better formatter"
* "small fix with UseCPU"
* "depends on change interface Get(Place, Library)"
* "fix CI"
* "fix python cudnn test"
* "leave the register cudnn op to another PR"
* "fix CI"
* "use all kernel by default"
* "fix CI"
-
- 04 Jan, 2018 1 commit
-
-
Committed by Yang Yu
-
- 03 Jan, 2018 1 commit
-
-
Committed by tensor-tang
-
- 02 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "fix data transform"
* "data transformer"
* "add device pool"
* "add test"
* "fix ci"
* "fix datalayout implementation"
* "fix based on comment"
-
- 28 Dec, 2017 4 commits
- 27 Dec, 2017 2 commits
-
-
Committed by dzhwinter
* "refine kernel registrar"
* "refine registrar with multikey"
* "fix register"
* "refine multikernel register"
* "fix CI"
* "fix CI"
* "fix registry"
* "switch GPU to CUDA"
* "add register macro test case"
* "fix CI"
-
Committed by QI JUN
* add memory switch mechanism in operator kernel switch
-
- 26 Dec, 2017 2 commits
-
-
Committed by Qiao Longfei
* init data_transform
* complete DataTransform
* fix build error
* add data_transform_test
* add a register test for data_transform_fn
* use function to simulate registration macro
* add register macro
* update test
* clean code
* restore unrelated code
* update data transform test
* generate unique name for REGISTER_DATA_TRANSFORM_FN
* add const
* follow comment
* update KernelTypePair hash function
-
Committed by dzhwinter
* "fix threadpool style"
* "remove header"
-
- 25 Dec, 2017 2 commits
-
-
Committed by Yancey
* implement a simple threadpool
* unlock before cv.notify
* add done function
* add lock with GetAvailable function
* delete done_
* using call_once in GetInstance
* update by comment
* update comment
* enhance unit test for multi threads task
-
Committed by qiaolongfei
-
- 24 Dec, 2017 1 commit
-
-
Committed by dzhwinter
* "change operator interface"
* "move devicepool to device_context"
* "fix operator test"
* "fix op_registry Run interface"
* "net op passed. Need to fix nccl multi-Context"
* "add nccl group function"
* "add nccl group function"
* "fix gpu count exceed 32 error"
* "fix recurrent op, nccl op"
* "change the other operators interface with Place"
* "fix typo"
* "fix pybind"
* "fix device in python side"
* "fix pybind failed"
* "add init for test"
* "fix CI"
-
- 18 Dec, 2017 1 commit
-
-
Committed by dzhwinter
* "add DeviceContextPool"
* "add devicecontextpool in pybind"
* "add comments in python side"
* "fix static link error"
* "fix CI error"
* "add executor.py"
* "fix CI error"
* "add with gpu macro"
* "remove comment out codes"
* "add TODO items"
* "update init devices"
-
- 26 Nov, 2017 1 commit
-
-
Committed by dzhwinter
* "make global tensor function independently"
* "replace functor"
* "fix inline template error"
* "fix tensor array with CopyFrom"
* "fix other case use CopyFrom"
* "move the op interface hardly"
* "fix operators"
* "fix typo"
* "delete dynamic recurrent rnn and fix gru_unit in debugmode"
* "fix unique_ptr copy"
* "fix cuda copy"
* "fix namespace error"
* "removed nccl python test"
* "fix include error"
* "fix typo"
* fix copy util test
-
- 15 Nov, 2017 1 commit
-
-
Committed by QI JUN
* fix gitignore
* refine cmake file
-
- 04 Nov, 2017 1 commit
-
-
Committed by Yu Yang
* Add LoDRankTable

  LoD Rank Table stores the `level` of `lod`, ordered by sequence length in descending order. It is useful when implementing dynamic RNN and is shared by the dynamic RNN memory, dynamic RNN slice input and dynamic RNN slice output operators.
* Add InferVarType
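
The ordering described in the LoDRankTable commit message can be illustrated with a minimal Python sketch. It is not the code from the commit: it assumes a LoD level is stored as a list of cumulative offsets (so `[0, 3, 4, 8]` describes three sequences of lengths 3, 1 and 4), and the function name `build_lod_rank_table` is hypothetical.

```python
from typing import List, Tuple


def build_lod_rank_table(lod: List[List[int]], level: int) -> List[Tuple[int, int]]:
    """Return (sequence_index, length) pairs, longest sequence first.

    Assumption: each LoD level is a list of cumulative offsets, so the
    length of sequence i is offsets[i + 1] - offsets[i].
    """
    offsets = lod[level]
    items = [(i, offsets[i + 1] - offsets[i]) for i in range(len(offsets) - 1)]
    # sorted() is stable, so sequences of equal length keep their original order.
    return sorted(items, key=lambda item: item[1], reverse=True)


# One LoD level describing three sequences of lengths 3, 1 and 4.
lod = [[0, 3, 4, 8]]
print(build_lod_rank_table(lod, level=0))  # [(2, 4), (0, 3), (1, 1)]
```

Sorting by descending length lets a dynamic RNN process the longest sequences first and shrink the active batch as shorter sequences finish, which is presumably why the table is shared by the memory and slice operators named in the commit message.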
-
- 31 Oct, 2017 1 commit
-
-
Committed by dangqingqing
-
- 29 Oct, 2017 1 commit
-
-
Committed by Yu Yang
* Shrink Operator.h
* Fix CI compile
-
- 28 Oct, 2017 1 commit
-
-
Committed by Yu Yang
* Add debug logs in scope, meta_cache and memory
* Add missing deps
-
- 27 Oct, 2017 1 commit
-
-
Committed by QI JUN
* add sparse support for sum op
* typo fix
* fix gpu build error
* fix unittest error
* typo fix
* infer var type and shape in op_test
* follow comments
* fix build error
* bypass some unittests depend on NetOp
-