- 07 11月, 2018 1 次提交
-
-
由 chengduo 提交于
* add fp16 backward support test=develop * add sum_op fp16 test * disable test_dist_save_load test=develop * add check_grad for sum * add unit test for softmax_grad fp16 test=develop * add scale_op unit test * add mul_grad_op unit test for fp16 * add cross_entropy_grad and eman_grad unit test for fp16 test=develop * fix cross_entropy unit test * add pool2d fp16 unit test * refine conv2d fp16 unit test test=develop * refine activation unit test test=develop * fix ci test=develop * follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796 test=develop
-
- 28 10月, 2018 1 次提交
-
-
由 Qiao Longfei 提交于
-
- 27 10月, 2018 3 次提交
-
-
由 Qiao Longfei 提交于
test=develop
-
由 Qiao Longfei 提交于
-
由 Qiao Longfei 提交于
-
- 18 10月, 2018 1 次提交
-
-
由 qingqing01 提交于
-
- 17 10月, 2018 1 次提交
-
-
由 Qiao Longfei 提交于
-
- 08 10月, 2018 1 次提交
-
-
由 qiaolongfei 提交于
-
- 27 9月, 2018 1 次提交
-
-
由 tangwei12 提交于
* add dist ut for text_classification * add dist ut for text_classification * add simnet bow unittest * add dist ut for simnet bow * add trainning data url for simnet bow * add trainning data url for simnet bow * modify simnet test_reader to train reader * add test_dist_ctr * test_dist_ctr can run now * dense update is good * add unit test for selected rows * debug unit test * fix dist sparse update problem * Constant args at init * optimize code * simnet optimize * fix DebugStringEx * optimize sum_op.h * add ScaleOpVarTypeInference * clean code * fix test_dist_transpiler.py * code optimize * modify delta * fix sparse update bug * dist test use one cpu * update some data * remove unused code * add use cuda config * unit test fix * unit test fix * unit test fix * unit test fix * dist_word2vec use CPU * unit test fix * unit test fix * code clean * code clean * merge develop * api spec update * Revert: api spec update * replace simnet data with fake * replace simnet data with fake * update dim * add batch auc * code clean * code clean * modify print to stderr * update simnet delta -> 1e-5 * update RUN_STEP * add use_reader_alloc * add use_reader_alloc * add use_reader_alloc * modify delta * add use_reader_alloc * fix stderr write * python3 compatibility test=develop * python3 compatibility, test=develop * Update dist_text_classification.py * test=develop
-
- 20 9月, 2018 2 次提交
-
- 17 9月, 2018 1 次提交
-
-
由 chengduoZH 提交于
-
- 18 8月, 2018 1 次提交
-
-
由 tangwei12 提交于
-
- 17 8月, 2018 2 次提交
-
-
由 tangwei12 提交于
-
- 16 8月, 2018 2 次提交
- 01 8月, 2018 1 次提交
-
-
由 tangwei12 提交于
-
- 31 7月, 2018 1 次提交
-
-
由 tangwei12 提交于
-
- 09 4月, 2018 1 次提交
-
-
由 Abhinav Arora 提交于
-
- 09 3月, 2018 1 次提交
-
-
由 Yancey 提交于
Fix sparse update memory error for distributed training
-
- 15 2月, 2018 1 次提交
-
-
由 Yi Wang 提交于
* Update tensor_util.h * Update with moved TensorDesc * Fix tensur_utils.cu * Update * Update * Update * Update * Make tensor_util.cu a symbolic link
-
- 12 2月, 2018 1 次提交
-
-
由 qingqing01 提交于
-
- 11 2月, 2018 1 次提交
-
-
由 Yancey 提交于
* dynamic send/recv selected rows * update by comment * fix by comment
-
- 10 2月, 2018 2 次提交
- 30 1月, 2018 1 次提交
-
-
由 Yang Yu 提交于
* Polish sum_op support SelectedRows in_place
-
- 09 1月, 2018 2 次提交
- 28 12月, 2017 2 次提交
- 26 12月, 2017 2 次提交
- 12 12月, 2017 1 次提交
-
-
由 QI JUN 提交于
There are mainly following fixes: - take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place` - remove `eigen_device` interface in base class `DeviceContext` - remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext` - remove unused `platform::EigenDeviceConverter` - rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL` - rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
-
- 04 12月, 2017 1 次提交
-
-
由 Yu Yang 提交于
* Add DataFeeder A v2 API like data feeder for book demos. We can feed data directly from reader. * Fix CI * Add an unittest for while/rnn op forward * Add unittest for raw while op backward * Fix CI
-
- 01 12月, 2017 1 次提交
-
-
由 Yu Yang 提交于
* Fix Proformance problem of enforce * Fix missing `;` in code * Fix CI
-
- 26 11月, 2017 1 次提交
-
-
由 dzhwinter 提交于
* "make global tensor function independently" * "replace functor" * "fix inline template error" * "fix tensor array with CopyFrom" * "fix other case use CopyFrom" * "move the op interface hardly" * "fix operators" * "fix typo" * "delete dynamic recurrent rnn and fix gru_unit in debugmode" * "fix unique_ptr copy" * "fix cuda copy" * "fix namespace error" * "removed nccl python test" * "fix include error" * "fix typo" * fix copy util test
-
- 07 11月, 2017 1 次提交
-
-
由 Yu Yang 提交于
* Use stable_sort in lod_rank_table It is easy to debug and test when use `stable_sort`and the time complexity is not changed. * Add LoDTensorArray * Stash * Better debug message for IsInitialized * Stash * Better debug message for IsInitialized * Complete array read/write op unittests * Add unittest, Gradient of array read/write * Follow comments
-
- 02 11月, 2017 1 次提交
-
-
由 Yu Yang 提交于
* Init commit * Make executor use ProgramDescBind * Change Attribute from BlockDesc to BlockDescBind * Since we will get the program desc in RNN, just BlockDesc is not enough. * Add DeviceContext to Executor API * Rewrite RNN * Pass Python * AddBiasOp does not care num_flatten_dims * Stash * Fix MacOS Compile * Pass RNN forward * add python test * refactor test * Make compile pass * add gradopmaker * First draft done * Polish code * add grad op maker and grad infershape * Polish code * Fix backward.cc bug * Fix infershape * Rename function * add backward test * simplify recurrent test * Update * Pass unittest * Add comments & refine test * Add comments * refactor test * Complete Unittest * fix StepScopes enforce * Remove unused unittest * no type error * Update * Make RNN Pass unittest
-
- 29 10月, 2017 1 次提交
-
-
由 QI JUN 提交于
* add sparse support for sum op * typo fix * fix gpu build error * fix unittest error * typo fix * infer var type and shape in op_test * follow comments * fix build error * bypass some unittests depend on NetOp * support sparse output for lookup table grad op * refine codes * fix gpu build error * fix lookup table grad gpu kernel * fix ci * fix ci * fix ci * fix bug in lookup_table_grad op * fix bug in test_word2vec * register double kernel for some operators * set is_sparse=True in test_word2vec * fix lookup table grad op CUDA kernel bug * disable test_modified_huber_loss_op temporarily * disable test_lstm_unit_op temporarily
-