- 15 Sep, 2021 2 commits
-
-
Committed by houj04
-
Committed by Siming Dai
Add paddle.cuda.device.stream_guard API
-
- 14 Sep, 2021 2 commits
-
-
Committed by chenenquan
* Add empty_cache api to release idle gpu memory held by allocator,test=develop * Add empty_cache api to release idle gpu memory held by allocator,test=develop * Add empty_cache api to release idle gpu memory held by allocator,test=develop * Fix test coverage problem for empty_cache * delete redundant check for empty_cache * fix the problem of empty_cache's doc * delete the nvidia-smi comment in doc of empty_cache, test=document_fix
-
Committed by Wilber
-
- 11 Sep, 2021 1 commit
-
-
Committed by 王明冬
-
- 10 Sep, 2021 1 commit
-
-
Committed by Leo Chen
* change metaclass of Layer from pybind11_builtins.pybind11_type to type * fix cast * add ut
-
- 09 Sep, 2021 1 commit
-
-
Committed by 0x45f
* init matrix_rank op, add matrix_rank CPU code and test * add GPU kernel, remove svd_eigen.h * add CPU kernel when tol is tensor * add cpu and gpu code when tol is tensor * fix CI-ROCM error * add matrix_rank API description, fix PR-CI-Py3 error * fix PR-CI-Windows error, add matrix_rank API test * delete useless comments * fix review * add my code in svd_helper.h * update doc comments * remove spaces
-
- 08 Sep, 2021 4 commits
-
-
Committed by xiongkun
* can pass the fake test * add files * modify cmake to pass windows-ci * for ci pass * WITH_GLOO=ON * for pass coverage test * add cpuonly testcase * add * disable nccl when compile with cuda * change python version in cpuonly * add backend argument * add required gpu * add required:gpu
-
Committed by Zeng Jinle
* add fleet api for program pass * turn on apply pass for CI test * fix disable fuse_all_optimizer bug * try to test ci * fix CI * fill unspecified op role * fix fuse_allreduce * add ut to improve coverage * remove useless change * improve c++ coverage * follow some comments * test ir pass pipeline * update doc * reduce ut time again
-
Committed by Leo Chen
* release gil before op run * support npu grad test * fix op_test
-
Committed by WangXi
-
- 07 Sep, 2021 1 commit
-
-
Committed by yaoxuefeng
-
- 06 Sep, 2021 1 commit
-
-
Committed by WeiXin
* support numpy dtype and polish code of list index. * polish code.
-
- 04 Sep, 2021 1 commit
-
-
Committed by Wilber
-
- 02 Sep, 2021 1 commit
-
-
Committed by Baibaifan
-
- 01 Sep, 2021 1 commit
-
-
Committed by zyfncg
* Support getitem by Bool index * delete some debug info of bool index * support the case that the shape of bool index is different from indexed tensor * support setitem by bool index * add the unittest for throwing exception * merge conflict * add check for int tensor when index is bool
-
- 31 Aug, 2021 2 commits
-
-
Committed by Aurelius84
* polish code * fix unittest on windows * refine pybind interface * support statistic MemSize of AllocatorPool * Replace mutex with atomic
-
Committed by Shang Zhizhou
* Revert "Revert "Add copy from tensor (#34406)" (#35173)" This reverts commit 32c1ec42. * add template instantiation
-
- 30 Aug, 2021 1 commit
-
-
Committed by Aurelius84
* Abstract GenerateDeviceEventFlag to shield platforms * Remove get_cuda_flags
-
- 27 Aug, 2021 2 commits
-
-
Committed by Guoxia Wang
* sparse_momentum_op is used to save w@GRAD memory for gather_op when gather from a large parameter
-
Committed by zhangchunle
This reverts commit ac33c0ca.
-
- 26 Aug, 2021 3 commits
-
-
Committed by Siming Dai
* add dlpack api and fix a from_dlpack
-
Committed by WeiXin
* polish code * polish code. * polish code. * polish code. * polish code.
-
Committed by Shang Zhizhou
* add api * temp save * revert * copytocpu async ok * fix style * copy sync ok * fix compile error * fix compile error * api done * update python async api * fix compile * remove async python api; add c++ async unittest * remove python async api * update unittest * update unittest * add C++ unittest for copytensor * add unittest * update namespace utils to class TensorUtils * add unittest * update unittest * update unittest * update code style * update code style * update unittest
-
- 25 Aug, 2021 1 commit
-
-
Committed by Leo Chen
* fix index tensor leak in __setitem__ * fix another usage of PyTuple_Pack * refine code * refine code * handle None index * add Py_DecRef * revert ut * refine code * merge develop * use RAII * follow comments
-
- 24 Aug, 2021 2 commits
-
-
Committed by wanghuancoder
* add fetch, test=develop * fix fetch2op, test=develop * fix fetch2op, test=develop * refine, test=develop * fix fetch ctx, test=develop * add wait, test=develop * rename fetch2 to fetch_v2, test=develop * merge, test=develop
-
Committed by Yulong Ao
* add auto_parallel dir * mv to paddle.distributed * add shard_xx api * add distributed attrs for var * add ut, test=develop * add dist * update * update * update * update * update * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update * update * update * update * update * update, test=develop * update, test=develop * update * update * delete unused proto * restore op_desc * restore type_defs * update var_desc * remove dims_mapping for proto_pybind * update interface.py * update framework.py * update * update * add auto_parallel dir * mv to paddle.distributed * add shard_xx api * add distributed attrs for var * add ut, test=develop * [WIP] Add the auto completion feature and related codes * [WIP] Improve the auto completion and related codes * [WIP] Make the auto completion to support data-parallel * [WIP] Make the completion support mp and dp+mp * [WIP] Refactor auto completion unit test for MLP * [WIP] Refactor the implementation of DistributedOperatorImpl * [WIP] Improve dims_mapping update rule and fix a bug * [WIP] Support auto completion for one transformer decoder layer * [WIP] Add a minor change * [WIP] Fix a bug within the unit test * Shard XShape tensor, add embedding completion and refactor code * Add the distributed_operators dir to setup.py.in * Improve the completion process and add the unittest for gpt * fix process_mesh ut * fix process_mesh ut * update * update, test=develop * Add support for automatically completing distributed attrs of special ops * update * update * update * fix doc sample codes, test=develop * improve coverage, test=develop * add static_mode check, test=develop * Model the cluster for cost model and physical mapping * update, test=develop * add set_placement, test=develop * Add the check to make sure the candidate tensors' size is greater than zero * update doc, test=develop * update doc, test=develop * update doc, test=develop * update doc, test=develop * update, test=develop * Auto mark dist attrs annotated by user * update ndarray to nested list, test=develop * update, test=develop * Add auto-completion module for auto-parallel (based on PR#33804) * Remove unnecessary files * Remove unrelated files for the auto completion pr * Update the unit test to improve the coverage * Modify codes based on reviews * Minor changes for CI * Improve some codes based on new comments * Fix bugs caused by shallow copy in attributes.py * Improve amend_distributed_attr_for_program in context.py * Other changes for weihang's comments Co-authored-by: sandyhouse <lilong12@baidu.com>
-
- 23 Aug, 2021 4 commits
-
-
Committed by Bo Liu
-
Committed by zyfncg
* Support getitem by Bool index * delete some debug info of bool index * support the case that the shape of bool index is different from indexed tensor
-
Committed by seemingwang
-
Committed by zhaoyingli
* adamw support cuda * adamw support cuda
-
- 19 Aug, 2021 1 commit
-
-
Committed by Aurelius84
* add device_context * add gtest for device_event_gpu * Remove duplicate DeviceType * push for test * add unittest * fix macros * fix MSVC using usage
-
- 18 Aug, 2021 2 commits
-
-
Committed by wanghuancoder
* code refactoring, test=develop * refine, test=develop * refine, test=develop * refine, test=develop
-
Committed by Zhanlue Yang
* Add function to disable paddle signal handler Paddle uses google::InstallFaultSignalHandler to handle selected system signals, mainly for debugging and bug-reporting purposes. However, this can conflict with other Python packages that capture similar signals, such as TVM. To resolve this issue, we provide a function to disable the signal handler * Remove signal test from WIN32 platform * Remove redundant return from disable_signal_handler() function * Add detailed messages to en_doc
-
- 17 Aug, 2021 2 commits
-
-
Committed by chentianyu03
* copy boost optional.hpp to paddle * copy boost optional.hpp to paddle * move directions * del fluid/utils * modify .hpp to .h * move directions * modify to paddle::optional * add modification description * format code style for the files in paddle/utils * format code style
-
Committed by Zeng Jinle
* add inplace passes and tests * update * fix use_cuda undefined fix compile error of op compat * add more ut * fix CPU CI error * check adam unique * fix mac/windows ci, improve coverage * fix ci error * follow weihang's comment * fix BlockDesc::MoveFrom * follow qiuliang's comment * update * follow huihuang's comments
-
- 16 Aug, 2021 2 commits
-
-
Committed by zyfncg
Change the invoking method of setitem by Ellipsis and None index from numpy to set_value op (#34911) * Change invoking method of the setitem by Ellipsis and None index from numpy to set_value op * add none_axes into attr of set_value_op in dygraph mode
-
Committed by duanboqiang
* add unique_consecutive_op * add unique_consecutive_op * add unique_consecutive_op * add unique_consecutive_op * add unique_consecutive_op * add unique_consecutive_op * add unique_consecutive_op * add unique_consecutive_op * remove unity build * add unique_consecutive op * add unique_consecutive op * add enable static * add noqa * add space line * add default case. * add comma * add space line * modify unique_consecutive unittest * optimize ut coverage * rebase develop * improve coverage * update en docs * update en docs * update en docs * update en docs * update en docs * update en doc
-
- 13 Aug, 2021 2 commits