- 19 Jan, 2021 1 commit
-
-
Committed by liuyuhui
-
- 18 Jan, 2021 1 commit
-
-
Committed by pangyoki
Cherry-pick PR 30103. Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103) (#30496) * add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allocation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * fix test_cross_entropy_loss error because of reshape2 * add inplace strategy * add elementwise_add sub * let backward op not use inplace * grad op do not use inplace * fix memory increase error and add leaf error message * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput * add unittest and leaf error message * merge view error * optimize op_function_generator format and support sum inplace op * fix format of basic_engine * fix format for framework * little change of variable wrapper * add reshape, squeeze, unsqueeze, scatter api * add relu elu tanh softmax inplace api * fix test_squeeze_op unittest * fix test_relu_op unittest * fix comment problems * delete sample code of inplace api * add reference of grad_pending_nodes in basic_engine * fix unittest name * add inplace apis into wlist * fix error message * add PADDLE_ENFORCE for set grad op twice * fix head file error
-
- 13 Jan, 2021 1 commit
-
-
Committed by tangwei12
Change-Id: I3c788e7576688e63181e7f01562529b85a09cc59
-
- 11 Jan, 2021 1 commit
-
-
Committed by WangXi
* Optimization grad merge performance (#29784) * [fleet] combine amp and gradient merge, test=develop (#30086) * fix assign_op_xpu concat_op_xpu warning (#30120) Co-authored-by: liuyuhui <liuyuhui@baidu.com>
-
- 07 Jan, 2021 1 commit
-
-
Committed by liuyuhui
-
- 29 Dec, 2020 1 commit
-
-
Committed by liuyuhui
* [Kunlun] PR1: Support one Kunlun card training in parallel executor (#29337) * [Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574) * [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29926) * add bkcl.so in whl for kunlun (#29947) * [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29961) Co-authored-by: QingshuChen <qingshu.chen714@gmail.com>
-
- 25 Dec, 2020 2 commits
-
-
Committed by QingshuChen
* feat: support check_nan_inf for kunlun device * support kunlun stack * minor
-
Committed by tangwei12
* add ps table (#29463) * add ps table Change-Id: I468a04bd071d21ff52654926fcf4d5f3da19e178 * add service (#29560) * add service, remove ut on mac * fix heter_profiler & add heter stop method * fix code style * merge pscore Change-Id: Ie7f60d1cdde6755a0c29db26863c6283e9843d57 * fix cmake Change-Id: I6773509a7b4ca79139ecc40b7bf3eb318ceff8bb * fix conflict Change-Id: I35575be0c96a8520f9d756ea7f1ff0b904a165ba * fix conflict Change-Id: Ic926ea0b0d67803226d51241397ba3b510226bfa
-
- 02 Dec, 2020 1 commit
-
-
Committed by Chen Weihang
* hot fix compile failed in gcc4.8 * fix failed unittest
-
- 01 Dec, 2020 1 commit
-
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types * add test cases for complex elementwise, matmul and getitem unittest * add test cases for complex types * add test cases for complex matmul unittest
-
- 26 Nov, 2020 1 commit
-
-
Committed by WangXi
-
- 27 Oct, 2020 1 commit
-
-
Committed by Zhang Ting
* add fuse_bn_add_act pass
-
- 22 Oct, 2020 1 commit
-
-
Committed by Leo Chen
* fix bug of fetch_async_op_handle * revert some changes of test_buffer_shared_memory_reuse_pass * revert some changes of test_buffer_shared_memory_reuse_pass
-
- 27 Sep, 2020 1 commit
-
-
Committed by Leo Chen
* refine broadcast_op_handle * refine some error messages * refine some files * fix bug * fix bug * fix bug * follow comments * follow comments
-
- 24 Sep, 2020 1 commit
-
-
Committed by wanghuancoder
* use iwyu clean include, test=develop, test=win * compilation error, test=develop * fix compilation error2, test=develop * fix compilation error3, test=develop * fix compilation error4, test=develop * fix compilation error5, test=develop * fix compilation error6, test=develop * fix compilation error7, test=develop * fix compilation error8, test=develop * fix compilation error8, test=develop * fix compilation error10, test=develop * fix compilation error11, test=develop
-
- 21 Sep, 2020 2 commits
-
-
Committed by Leo Chen
* support use add instead of sum to do gradient accumulation * add inplace addto pass * add grad_add op and inplace addto pass * remove debug code * code refine * fix bug when several sum ops inserts at same op_idx * fix Flags type * add addto attribute for conv3d * fix ut * code clean * fix type
-
Committed by Leo Chen
* refine error msg in var_handle.h, test=develop * refine all_reduce_op_handle * fix some error msg * refine variable_visitor * refine threaded_ssa_graph_executor * refine inplace related files * refine executor related files * refine fetch_op_handle.cc * fix bug * follow comments
-
- 03 Sep, 2020 2 commits
-
-
Committed by Feiyu Chan
-
Committed by joanna.wozna.intel
-
- 02 Sep, 2020 1 commit
-
-
Committed by wanghuancoder
* optimized transformation from tensor to numpy, test=develop * Modify fetch op handle, from memcpy Sync to memcpy Async, test=develop * modify CUDAPinnedPlace to CPUPlace, test=develop * modify CPUPlace to CUDAPinnedPlace, and set default inplace to false, test=develop * revert fetch_op_handle, add fetch_async_op_handle, test=develop * revert fetch_op_handle, add fetch_async_op_handle, test=develop * fix error msg report, test=develop * fix bug in cpuplace, test=develop * fix bug in unmerge and tensorarray mode, test=develop * fix bug, double copy gpu memory, test=develop * fix chenweihang's review advice, test=develop
-
- 25 Aug, 2020 1 commit
-
-
Committed by wanghuancoder
* optimized transformation from tensor to numpy, test=develop * optimized transformation from tensor to numpy, pass pre-commit, test=develop * modify fetchophandle zerocopy to deepcopy in PE&CUP, test=develop * modify py:array construct, test=develop * fix _fetch_var to use deep copy, test=develop
-
- 07 Aug, 2020 1 commit
-
-
Committed by tangwei12
* fix large scale KV * fix single training using async ssa graph
-
- 30 Jul, 2020 1 commit
-
-
Committed by tangwei12
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957) * Integrated Trainer of Parameter Server
-
- 10 Jul, 2020 1 commit
-
-
Committed by Chen Weihang
* polish pe exception process logic, test=develop * fix unittest, test=develop * add unittests, test=develop
-
- 07 Jul, 2020 1 commit
-
-
Committed by hong
* catch bad alloc exception; test=develop * add unittest; test=develop * move bad alloc catch to the first place; test=develop * polish error message; test=develop * polish error message; test=develop * add mutex header; test=develop
-
- 03 Jun, 2020 1 commit
-
-
Committed by Chen Weihang
* remove REPLACE_ENFORCE_GLOG compile option & add ci rule prohibit LOG(FATAL) using, test=develop * remove ci test case, test=develop * replace all LOG(FATAL) & polish message, test=develop * fix typo, test=develop * polish error info detail, test=develop
-
- 11 May, 2020 1 commit
-
-
Committed by Chen Weihang
* add new macro BOOST_GET_SAFELY & unittests, test=develop * add different macro type, test=develop * fix get macro type in executor, test=develop * four macro part change backup * using one macro for all case, test=develop * revert attribute change, test=develop * change to three func to solve gcc4.8 bug, test=develop * polish some details, test=develop
-
- 23 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 20 Apr, 2020 1 commit
-
-
Committed by Zhou Wei
* Optimize the error messages of paddle CUDA API, test=develop * fix the error messages of paddle CUDA API, test=develop * Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop * remove build_ex_string,test=develop * merge conflict,test=develop
-
- 19 Apr, 2020 1 commit
-
-
Committed by guofei
* Support LoDTensorArray in fetch op test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop
-
- 14 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
* correct reader device index, test=develop * fix async executor scope var initialization, test=develop
-
- 10 Apr, 2020 2 commits
- 09 Apr, 2020 1 commit
-
-
Committed by mozga-intel
* Remove the NGraph engine from PDPD repository 1. Each operator was removed from the operator's directory 2. Each test was removed from the unittest directory 3. The parallel executor support was removed from the PDPD 4. The CMake file was removed from the PDPD 5. The NG flags were removed from the repository test=develop * Remove ngraph from: 1. Cmake file 2. Python file test=develop
-
- 07 Apr, 2020 1 commit
-
-
Committed by qingqing01
* Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO
-
- 05 Apr, 2020 1 commit
-
- 04 Apr, 2020 1 commit
-
-
Committed by Zhen Wang
* solve the conflict of ops with the same name. test=develop
-
- 03 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 01 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 25 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
-