- 10 Jun, 2021 1 commit
-
-
Committed by Chen Weihang
* support diff dataset tensor place in single process dataloader * fix unittest failed
-
- 19 Apr, 2021 1 commit
-
-
Committed by Leo Chen
* [NPU] support GarbageCollector for npu (#31874)
* support GarbageCollector for npu
* fix typo
* fix gather_grad
* disable NPUDefaultStreamGarbageCollector on NPU
* [NPU] support npu for memcpy op (#31808)
* support npu for memcpy op
* add ut
* fix ut
* fix typo
* [NPU] fix bug of using temp vector (#31963)
* fix bug when beta1_pow on cpu (#31995)
* [NPU] support npu profiler (#31684)
* support npu profiler
* add python api
* fix bugs
* add wrapper for incomplete type
* update profile proto
* record npu wait
* add xpu placeholder
* fix adam (#32016)
* [NPU] enable async copy and add wait before sync operation (#31956)
* enable async copy and add wait before sync operation
* remove unnecessary wait
* add FillNpuTensorWithConstant
* refine
* fix fill_constant
* make TensorFromVector/TensorToVector sync
* [NPU] Support dataloader on npu place. (#31867)
* [NPU] Wait on NPUPlace (#32086)
* [NPU] fix cast op (#32121)
* fix npu kernel of cast op to handle casting to same dtype
* add comments
* [NPU] support cann 20.3 (#32044)
* fix compile problem on cann 20.3
* fix ut
* fix test_mul
* fix check_finite_and_scale
* fix lookup_table_v2_grad
* fix cmake
* support print op
* [NPU] Support npu save load (#31893)
* support save load for NPU
* add save load npu unittest
* support np.array transform in NPU
* fix errors
* delete dygraph in unittest
* add Wait
* fix unittest
* fix review comment
* fix unittest problem
* fix little problem
* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance (#32196)
* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance
* refine code
* fix NPUDeviceContext in all c++ unittest (#32198)
* fix NPUDeviceContext in all c++ unittest
* refine log
Co-authored-by: pangyoki <pangyoki@126.com>
* [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994)
* enable async copy and add wait before sync operation
* remove unnecessary wait
* add FillNpuTensorWithConstant
* refine
* fix fill_constant
* change TensorFromVector to FillNpuTensorWithConstant
* fix ignored api
* delete extra unittest
* fix little error
* fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu
* change TensorCopySync to TensorCopy
* delete useless Wait and add StreamWait
* fix npu_stream error
* fix check_finite_and_unscale_op_npu TensorCopy
* only save stream wait
* fix NPUDeviceContext in all c++ unittest
* delete wait
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
* delete useless unittest file (#32206)
* Fix op test (#32231)
* fix conditional block (#32243)
* fix adam bug again (#32246)
* fix compile
* fix ut
* fix ut
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: pangyoki <pangyoki@126.com>
-
- 03 Mar, 2021 1 commit
-
-
Committed by Qi Li
* [ROCM] update fluid operators for rocm (part3), test=develop * fix clang format error, test=develop
-
- 04 Feb, 2021 1 commit
-
-
Committed by wanghuancoder
* use iwyu clean include second time, test=develop
-
- 20 Nov, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 10 Aug, 2020 1 commit
-
-
Committed by Chen Weihang
* add pin memory control * fix buffered reader init problem * fix unittest error * add unittest for coverage
-
- 29 Jul, 2020 1 commit
-
-
Committed by Chen Weihang
* simplify buffered reader to improve DataLoader performance * fix 22 failed unittests * fix cuda pinned context condition * fix test_reader_reset failed * fix two failed unittests * change unittest place * polish error message * polish cast op GetExpectedKernelType * remove debug info in unittest
-
- 25 May, 2020 1 commit
-
-
Committed by Chen Weihang
* polish reader error message, test=develop * fix detail error, test=develop * reset activation dcudnn change, test=develop
-
- 11 May, 2020 1 commit
-
-
Committed by Chen Weihang
* add new macro BOOST_GET_SAFELY & unittests, test=develop * add different macro type, test=develop * fix get macro type in executor, test=develop * four macro part change backup * using one macro for all case, test=develop * revert attribute change, test=develop * change to three func to solve gcc4.8 bug, test=develop * polish some details, test=develop
-
- 20 Apr, 2020 1 commit
-
-
Committed by Zhou Wei
* Optimize the error messages of paddle CUDA API, test=develop * fix the error messages of paddle CUDA API, test=develop * Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop * remove build_ex_string,test=develop * merge conflict,test=develop
-
- 25 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 14 Oct, 2019 1 commit
-
-
Committed by Zeng Jinle
* refine py_reader exit, test=develop * fix multiprocess_reader exception unittest, test=develop * increase code coverage for legacy fluid.layers.py_reader, test=develop
-
- 14 Aug, 2019 1 commit
-
-
Committed by chengduo
Use CUDAPinnedPlace in buffered_reader
-
- 17 Jun, 2019 1 commit
-
-
Committed by Zeng Jinle
* fix py_reader iterable bug, test=develop * move data from buffered_reader,test=develop
-
- 30 Apr, 2019 1 commit
-
-
Committed by Zeng Jinle
-
- 11 Mar, 2019 1 commit
-
- 04 Mar, 2019 3 commits
-
-
Committed by chengduo
Add Event for TensorCopy
- 01 Mar, 2019 3 commits
-
-
Committed by chengduo
Add Event for TensorCopy
-
Committed by Qiao Longfei
-
Committed by sneaxiy
test=develop
-
- 20 Feb, 2019 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 08 Feb, 2019 1 commit
-
-
Committed by Dun Liang
test=develop
-
- 01 Feb, 2019 1 commit
-
-
Committed by kolinwei
-
- 20 Jan, 2019 1 commit
-
-
Committed by Dun Liang
-
- 12 Jan, 2019 1 commit
-
-
Committed by Dun Liang
-
- 10 Dec, 2018 1 commit
-
-
Committed by Yancey1989
-
- 07 Dec, 2018 1 commit
-
-
Committed by Yancey1989
-
- 06 Dec, 2018 1 commit
-
-
Committed by Yancey1989
-
- 20 Jul, 2018 1 commit
-
-
Committed by fengjiayi
1. Make the feeding thread of py_reader a daemon thread. 2. Update buffer_reader's destructor, fixing a bug. 3. Make the pyreader demo script support CPU environments.
-
- 18 Jul, 2018 1 commit
-
-
Committed by yuyang18
-
- 16 Jul, 2018 1 commit
-
-
Committed by yuyang18
-
- 14 Jul, 2018 1 commit
-
-
Committed by yuyang18
-