- 30 Aug, 2019 1 commit
-
-
Committed by Huihuang Zheng
* Support memory eager deletion on recurrent OP (#17710)
  Test PaddingRNN on a V100 GPU device. Test configuration: large model, padding mode (the mode that uses recurrentOp), one GPU.
  GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
  Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
* Fix random test_recurrent_op failure (#18718)
  The change includes 3 things:
  1. Set CPU_NUM to 1 in the tests, because otherwise the ParallelExecutor prints a warning that CPU_NUM is not set and falls back to the default of 1.
  2. The old tests compared two RNNs, a hand-written simple RNN and the same RNN built by Paddle, but initialized the RNN weights separately with numpy random and Paddle random. Fixed by setting the weight and bias values explicitly.
  3. Also set the numpy random seed in the tests. The tolerated difference between the two RNNs can now be smaller (rtol reduced from 0.1 and 0.2 to 0.01).
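The testing idea behind the test_recurrent_op fix (shared, explicitly set weights plus a fixed numpy seed, then a comparison at rtol 0.01) can be illustrated with a minimal NumPy-only sketch. This is not the code from the PR and does not use Paddle: the function names `rnn_loop`, `rnn_precomputed`, and `compare_rnns`, the tensor shapes, and the weight values are all hypothetical, and the second implementation merely stands in for the framework-built RNN.

```python
import numpy as np


def rnn_loop(x, h0, W, U, b):
    """Hand-written reference RNN: h_t = tanh(x_t W + h_{t-1} U + b)."""
    h = h0
    outputs = []
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ W + h @ U + b)
        outputs.append(h)
    return np.stack(outputs)


def rnn_precomputed(x, h0, W, U, b):
    """Same recurrence, with the input projection computed up front.

    Hypothetical stand-in for the RNN built by the framework under test.
    """
    xw = np.einsum("tbi,ih->tbh", x, W) + b
    h = h0
    outputs = np.empty_like(xw)
    for t in range(x.shape[0]):
        h = np.tanh(xw[t] + h @ U)
        outputs[t] = h
    return outputs


def compare_rnns(seed=123, rtol=0.01):
    # Fixed seed: the random input is identical on every test run.
    rng = np.random.RandomState(seed)
    seq_len, batch, in_dim, hidden = 5, 3, 4, 6  # assumed shapes
    x = rng.uniform(-1.0, 1.0, (seq_len, batch, in_dim))
    h0 = np.zeros((batch, hidden))

    # Explicitly set weights and bias, shared by both implementations,
    # instead of letting each side draw its own random initial values.
    W = np.full((in_dim, hidden), 0.1)
    U = np.full((hidden, hidden), 0.05)
    b = np.zeros(hidden)

    out_a = rnn_loop(x, h0, W, U, b)
    out_b = rnn_precomputed(x, h0, W, U, b)
    return out_a, out_b, np.allclose(out_a, out_b, rtol=rtol)
```

Because both implementations receive the same parameters and the same seeded input, any remaining difference comes from floating-point ordering only, which is why a tight tolerance becomes practical.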
-
- 30 May, 2019 1 commit
-
-
Committed by Yiqun Liu
Optimize recurrent_op using Prepare and RunPreparedContext, to avoid creating operators in every iteration. (#17689) test=develop
-
- 19 May, 2019 1 commit
-
-
Committed by Zeng Jinle
-
- 16 May, 2019 1 commit
-
-
Committed by Zeng Jinle
-
- 12 Apr, 2019 1 commit
-
-
Committed by chengduo
* enable recurrent op test=develop
-
- 28 Mar, 2019 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 27 Mar, 2019 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 11 Mar, 2019 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 06 Mar, 2019 2 commits
- 12 Dec, 2018 1 commit
-
-
Committed by Yu Yang
test=develop
-
- 26 Nov, 2018 1 commit
-
-
Committed by minqiyang
test=develop
-
- 08 Nov, 2018 1 commit
-
-
Committed by minqiyang
Fix code to support cpplint syntax check test=develop
-
- 21 Jun, 2018 2 commits
-
-
Committed by tensor-tang
This reverts commit 4d8e8ee2, reversing changes made to d6a9f005.
-
Committed by tensor-tang
-
- 19 Jun, 2018 1 commit
-
-
Committed by mozga-intel
-
- 08 May, 2018 1 commit
-
-
Committed by Yu Yang
Do not use ctor
* Reduce lines of code.
* We can use a virtual function for Maker now.
* The implementation does not care what the maker holds, so it is easier to refactor later.
-
- 25 Apr, 2018 1 commit
-
-
Committed by Abhinav Arora
-
- 15 Feb, 2018 1 commit
-
-
Committed by Yi Wang
* Update tensor_util.h
* Update with moved TensorDesc
* Fix tensur_utils.cu
* Update
* Update
* Update
* Update
* Make tensor_util.cu a symbolic link
-
- 12 Feb, 2018 1 commit
-
-
Committed by qingqing01
-
- 10 Feb, 2018 2 commits
- 09 Feb, 2018 1 commit
-
-
Committed by Yang Yang
-
- 09 Jan, 2018 1 commit
-
-
Committed by Yu Yang
* Rename Tensor::CopyFrom to Tensor::Copy
* Fix CI
* Fix compile
-
- 27 Dec, 2017 4 commits
-
-
Committed by Yu Yang
* Rename API of DeviceContext
  Make them as usual names.
* Rename API of DeviceContext
  Make them as usual names.
* Fix compile
* Fix compile
* Fix compile
* Fix compile
* Fix compile
-
Committed by Yang Yu
Make them as usual names.
-
Committed by Yang Yu
Make them as usual names.
-
Committed by Yang Yu
Make them as usual names.
-
- 26 Dec, 2017 2 commits
- 24 Dec, 2017 1 commit
-
-
Committed by dzhwinter
* "change operator interface"
* "move devicepool to device_context"
* "fix operator test"
* "fix op_registry Run interface"
* "net op passed. Need to fix nccl multi-Context"
* "add nccl group function"
* "add nccl group function"
* "fix gpu count exceed 32 error"
* "fix recurrent op, nccl op"
* "change the other operators interface with Place"
* "fix typo"
* "fix pybind"
* "fix device in python side"
* "fix pybind failed"
* "add init for test"
* "fix CI"
-
- 22 Dec, 2017 1 commit
-
-
Committed by xuwei06
For an input argument with a list of variables, drop_empty_grad is not allowed because it makes the correspondence between a variable and its gradient ambiguous. Use REGISTER_OP_EX to register the op or call InputGrad(?,false) in GradOpDescMaker.
-
- 21 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Rename XXDescBind --> XXDesc
* Fix Compile
-
- 20 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Move framework.proto to proto namespace
* Fix compile
* Fix compile
* Fix Compile
-
- 19 Dec, 2017 2 commits
- 14 Dec, 2017 1 commit
-
-
Committed by fengjiayi
-
- 11 Dec, 2017 1 commit
-
-
Committed by Yiqun Liu
* Fix compiling error of gcc4.9.
* Refine the check of cxx compiler flags in api/CMakeLists.txt.
-
- 04 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Add DataFeeder
  A data feeder for book demos, similar to the v2 API. We can feed data directly from a reader.
* Fix CI
* Add a unit test for while/rnn op forward
* Add a unit test for raw while op backward
* Fix CI
-
- 26 Nov, 2017 1 commit
-
-
Committed by dzhwinter
* "make global tensor function independently"
* "replace functor"
* "fix inline template error"
* "fix tensor array with CopyFrom"
* "fix other case use CopyFrom"
* "move the op interface hardly"
* "fix operators"
* "fix typo"
* "delete dynamic recurrent rnn and fix gru_unit in debugmode"
* "fix unique_ptr copy"
* "fix cuda copy"
* "fix namespace error"
* "removed nccl python test"
* "fix include error"
* "fix typo"
* fix copy util test
-