- 10 Feb, 2018 2 commits
- 06 Feb, 2018 2 commits
- 02 Feb, 2018 1 commit
-
-
Committed by fengjiayi
-
- 31 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "unify flags"
* "fix init"
-
- 19 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* check if kernel is found for kernel type
* do kernel check before data transform
-
- 14 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "unified operators"
* "add CUDNN register"
* "add use cudnn attribute"
* "add attribute"
* "test conv transpose op"
* "remove duplicated attr"
* "fix op test"
* "add attribute to set cudnn"
* "add more log"
* "need layout op register support"
* "add more log"
* "change GetExpectedKernelType"
* "fix Get attr in conv_op"
* "fix CI"
* "fix tests"
* "removed kernel priority fallback"
* "fix CI"
* "fix stack pointer bug"
* "refine buggy interface"
* "add const cast to save life"
* "fix get_output_with_grad"
* "fix op test with dataformat"
* "fix pooling"
* "fix pooling test"
* "fix CI"
* "fix with_gpu error"
* "add transform needed functional check"
* "fix unpack list error"
* "comment out parallel.do temporarily"
* "fix CI"
* "fix compile doc error"
* "make threshold larger"
-
- 12 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* add GetLoD for debug
* add LoDToString
* optimize if
* typo
* add lod_tensor to operator's dependency
-
- 10 Jan, 2018 3 commits
-
-
Committed by Qiao Longfei
* init data_type_transform
* split data_layout_transform
* tmp rm data_transform_test
* change device_data_transform to data_device_transform
* clean code
* clean code
-
Committed by dzhwinter
-
Committed by dzhwinter
-
- 09 Jan, 2018 1 commit
-
-
Committed by qiaolongfei
-
- 08 Jan, 2018 5 commits
-
-
Committed by qiaolongfei
-
Committed by qiaolongfei
-
Committed by dzhwinter
* "reuse ShareLoD with no regret"
* "removed base class shareLayout"
* "fix CI"
-
Committed by Qiao Longfei
* add rename guard
* add device_data_transform
* add device_data_transform_test
* modify GetExpectedKernelType
* update operator.run
* support test test_label_semantic_roles
* optimize code
* optimize code
* rename GetActualKernelType to GetExpectedKernelType
* fix chunk_eval_op and device_data_transform_test
* add is_same_place to place
* optimize code, refine rename_guard
* refine rename guard, add GetKernelTypeForVar
* optimize code
* add some log
* rename guard
* use sub scope to create var
* fix compile
* add IsInitialized for Tensor
* add VarIsTensor
* fix op_registry_test
* test
* tmp disable priority
* restore switch_kernel.md
* code clean
-
Committed by emailweixu
This makes errors easier to locate.
-
- 05 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "add c++ side kernel selection"
* "add multiple kernel op test"
* "kernel selection only support cudnn"
* "better formatter"
* "small fix with UseCPU"
* "depends on change interface Get(Place, Library)"
* "fix CI"
* "fix python cudnn test"
* "leave the register cudnn op to another PR"
* "fix CI"
* "use all kernel by default"
* "fix CI"
-
- 04 Jan, 2018 1 commit
-
-
Committed by Yang Yang
-
- 02 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "fix data transform"
* "data transformer"
* "add device pool"
* "add test"
* "fix ci"
* "fix datalayout implementation"
* "fix based on comment"
-
- 29 Dec, 2017 1 commit
-
-
Committed by QI JUN
* add helper function to get appropriate DeviceContext
-
- 27 Dec, 2017 6 commits
-
-
Committed by Yu Yang
* Rename API of DeviceContext: give them the usual names.
* Rename API of DeviceContext: give them the usual names.
* Fix compile
* Fix compile
* Fix compile
* Fix compile
* Fix compile
-
Committed by QI JUN
* add KernelTypeToString interface
* cache memory in local scope
* fix typo
* refine trans logic
-
Committed by Yang Yu
Give them the usual names.
-
Committed by Yang Yu
Give them the usual names.
-
Committed by Yang Yu
Give them the usual names.
-
Committed by QI JUN
* add memory switch mechanism in operator kernel switch
-
- 25 Dec, 2017 2 commits
-
-
Committed by Qiao Longfei
* init kernel hint
* fix typo
* rm unused code
* add include in op_kernel.h
* restore op_kernel since it will be moved to op_kernel_type
* change force_cpu to use_cpu
* fix compilation
-
Committed by qiaolongfei
-
- 24 Dec, 2017 2 commits
-
-
Committed by QI JUN
* refine OpKernelKey
* refine codes
* fix code style
* follow comments
-
Committed by dzhwinter
* "change operator interface"
* "move devicepool to device_context"
* "fix operator test"
* "fix op_registry Run interface"
* "net op passed. Need to fix nccl multi-Context"
* "add nccl group function"
* "add nccl group function"
* "fix gpu count exceed 32 error"
* "fix recurrent op, nccl op"
* "change the other operators interface with Place"
* "fix typo"
* "fix pybind"
* "fix device in python side"
* "fix pybind failed"
* "add init for test"
* "fix CI"
-
- 21 Dec, 2017 1 commit
-
-
Committed by Yang Yang
-
- 20 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Move framework.proto to proto namespace
* Fix compile
* Fix compile
* Fix compile
-
- 12 Dec, 2017 1 commit
-
-
Committed by QI JUN
The main fixes are:
- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
-
- 05 Dec, 2017 1 commit
-
-
Committed by dangqingqing
-
- 16 Nov, 2017 1 commit
-
-
Committed by Yang Yang(Tony)
* first commit
* Python API for while op
* Python Unittest for simple while_op forward
* fix out to be list
* Fix UT
* VarType
* Fix several bugs
* Fix bug
* Fix bug
* Fix Bug
* Fix bug
* Fix unittest
* Remove debug log
* Add comments
* add PADDLE_ENFORCE
* while_grad_op first commit
* Add `BlockDescBind::FindRecursiveOrCreateVar()` and fix bugs
* not sure how to setdim of while outputs
* push for test
* add executor vlog
* fix bug of while_op cond
* Several enhancements to the code
  1. Backward always infers shape and var type, since RENAME variables are created when creating a backward operator, but their shapes and var types are not inferred.
  2. Never use `SomePtr->` directly, since any pointer could be nullptr if it is a function return value. Add `detail::Ref` to cast a pointer to a reference safely.
  3. Enhance error messages for backward.
  4. Infer data type of variable in `sum` and `tensor_write`
* Fix bugs of while_op gradient
* Fix several bugs of while_op grad
* fix fill zeros like
* fix 3 >= 3
* fix place holder shouldn't be null
* fail on sum op
* Fix SumOp of TensorList
* clean up
* pass while test
* fix test_array_write_read
* pass sum op
* Support int/int64 for fill_constant_batch_size_like
* Fix compile
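The second enhancement in the commit above (checking a pointer before dereferencing it) can be illustrated with a minimal sketch. This is not the project's actual `detail::Ref`: the signature, the optional message parameter, and the use of `std::invalid_argument` are assumptions made for illustration only.

```cpp
#include <stdexcept>

namespace detail {

// Hypothetical sketch of a null-checked pointer-to-reference cast.
// The real helper may use a different signature and error mechanism.
template <typename T>
T& Ref(T* ptr, const char* what = "pointer is null") {
  if (ptr == nullptr) {
    // Fail loudly here instead of crashing later on a null dereference.
    throw std::invalid_argument(what);
  }
  return *ptr;
}

}  // namespace detail
```

With such a helper, a call like `detail::Ref(scope->FindVar(name)).Get()` (names hypothetical) reports a clear error at the call site when the lookup returns `nullptr`.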
-
- 08 Nov, 2017 2 commits
-
-
Committed by Yu Yang
* Change `IndicateDataType` to `GetKernelType`. Make it easier to understand.
* Change `OpKernelKey` to `OpKernelType`
* Let operator developers customize which kernel the operator will use at runtime.
-
Committed by qingqing01
-
- 07 Nov, 2017 1 commit
-
-
Committed by Yu Yang
* Use stable_sort in lod_rank_table. It is easier to debug and test when using `stable_sort`, and the time complexity is unchanged.
* Add LoDTensorArray
* Stash
* Better debug message for IsInitialized
* Stash
* Better debug message for IsInitialized
* Complete array read/write op unittests
* Add unittest, Gradient of array read/write
* Follow comments
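The reason `stable_sort` is easier to test, as the first item of the commit above notes, can be shown with a small sketch. The `RankItem` struct and `BuildRankTable` function are hypothetical names for illustration and are not taken from the actual `lod_rank_table` code. It assumes the table orders sequences by length, longest first.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical entry type for illustration; not the project's actual struct.
struct RankItem {
  std::size_t index;   // position of the sequence in the input
  std::size_t length;  // length of that sequence
};

// Orders sequences by length, longest first. std::stable_sort keeps entries
// of equal length in their original input order, so the result is fully
// determined by the input and can be asserted on exactly in a unit test.
std::vector<RankItem> BuildRankTable(const std::vector<std::size_t>& lengths) {
  std::vector<RankItem> items;
  items.reserve(lengths.size());
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    items.push_back({i, lengths[i]});
  }
  std::stable_sort(items.begin(), items.end(),
                   [](const RankItem& a, const RankItem& b) {
                     return a.length > b.length;
                   });
  return items;
}
```

For lengths `{2, 3, 2, 3}` the resulting index order is 1, 3, 0, 2. With `std::sort` the relative order of the two length-3 entries (and of the two length-2 entries) would be unspecified. `std::stable_sort` runs in O(n log n) comparisons when it can allocate a temporary buffer.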
-