- 02 Sep, 2021 1 commit
-
-
Committed by xiongkun
* Add SVD Op and its GPU and CPU kernels
* Remove CUDAPlace in test_svd_op, make the test available in CPU package
* modify the file
* fix windows bug / fix ROCM / fix test timeout
* to pass the CIs
* improve error report
* for code review
* some modification to test_svd_op
* change python code style
* expose the svd interface for documentation
-
- 31 Aug, 2021 1 commit
-
-
Committed by Zhanlue Yang
[Background] Code size growth can be irreversible in the long run, leading to huge release packages that not only hamper user experience but also exceed a hard limit of PyPI. In particular, the NV_FATBIN section takes up 86% of the compiled dylib size, owing to the large number of GPU arches supported. This PR aims to prune NV_FATBIN.
[Solution] In the new release strategy, two types of whl packages are involved:
* Cubin PIP package: maintains a smaller window of GPU arch support, containing sm_60, sm_70, sm_75, sm_80 cubins, covering the Pascal through Ampere arches.
* JIT release package: a backup for the Cubin PIP package, containing compute_35, compute_50, compute_60, compute_70, compute_75, compute_80, with best performance and GPU arch coverage. However, it takes around 10 min to install due to JIT compilation.
[How to use] The new release strategy is disabled by default. To compile the Cubin PIP package, add -DCUBIN_RELEASE_PIP to the cmake command. To compile the JIT release package, add -DJIT_RELEASE_WHL to the cmake command.
-
- 20 Aug, 2021 1 commit
-
-
Committed by Hao Lin
-
- 18 Aug, 2021 1 commit
-
-
Committed by Zhanlue Yang
* Add function to disable paddle signal handler
  Paddle uses google::InstallFailureSignalHandler to handle selected system signals, mainly for debugging and bug reporting. However, this can conflict with other Python packages that capture similar signals, such as tvm. To resolve this issue, we provide a function to disable the signal handler.
* Remove signal test from WIN32 platform
* Remove redundant return from disable_signal_handler() function
* Add detailed messages to en_doc
-
- 16 Aug, 2021 1 commit
-
-
Committed by duanboqiang
* add unique_consecutive_op
* remove unity build
* add unique_consecutive op
* add enable static
* add noqa
* add space line
* add default case.
* add comma
* add space line
* modify unique_consecutive unittest
* optimize ut coverage
* rebase develop
* improve coverage
* update en docs
* update en doc
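For readers unfamiliar with the operation this commit adds, the sketch below shows what "unique consecutive" means on a flat sequence: runs of equal adjacent values collapse to one element, and non-adjacent repeats are kept. It is a plain-Python illustration under that assumption, not the Paddle kernel; the function name and return layout are chosen here for illustration only.

```python
from itertools import groupby


def unique_consecutive(values):
    """Collapse runs of equal adjacent values.

    Returns (unique_values, inverse_indices, run_lengths).
    Hypothetical helper for illustration, not Paddle's implementation.
    """
    out, inverse, counts = [], [], []
    for key, run in groupby(values):
        n = len(list(run))              # length of the current run
        inverse.extend([len(out)] * n)  # every element of the run maps to one output slot
        out.append(key)
        counts.append(n)
    return out, inverse, counts


result = unique_consecutive([1, 1, 2, 2, 3, 1, 1, 2])
```

Unlike a plain "unique", the trailing 1 and 2 survive because they are not adjacent to the earlier runs.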
-
- 13 Aug, 2021 1 commit
-
-
Committed by Tongxin Bai
* OP dot: refactor CPU kernels and get better loop performance.
* Minor fix on code format.
* Fixed minor errors.
* Add new API: einsum
* Update the Einsum unit test. One case failed with matmul_v2, where the dtype is int64:
  a = np.arange(2 * 3 * 1).reshape(2, 3, 1)
  b = np.arange(1)
  paddle.einsum("...i, ...i", a, b)
* Test cases in test_einsum test floating point dtypes only. As of now Paddle only supports float/double dtypes in matmul, which is one of the building blocks of this Einsum implementation. We decided not to test einsum against other dtypes.
* Polish format.
* More formatting.
* Format...
* Einsum: improve test coverage.
* Einsum: bug fixes and more testcases for testing error messages
* Einsum: fix format..
* Einsum: fixed typo and format.
* Einsum: format again...
* Einsum: applied suggested changes.
* Einsum API: improve API documentation.
* Einsum API: apply suggested changes.
* Einsum API: Add dygraph only note.
* Einsum API: fixed unittest.
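As a reading aid for the failing case quoted in this commit, the sketch below spells out what the subscript "...i, ...i" computes for the shapes involved: an inner product over the last axis, broadcast over the leading axes. It uses nested lists instead of arrays and is only an illustration; `batched_inner` is a hypothetical name, not part of any library.

```python
def batched_inner(a, b):
    """Plain-Python reading of "...i, ...i" for a of shape (m, n, k) and b of shape (k,).

    The shared index i is summed away; the leading (m, n) axes are kept.
    """
    return [[sum(x * y for x, y in zip(row, b)) for row in plane] for plane in a]


# Same values as a = arange(2 * 3 * 1).reshape(2, 3, 1) and b = arange(1).
a = [[[0], [1], [2]], [[3], [4], [5]]]
b = [0]
result = batched_inner(a, b)
```

Because b holds a single zero, every entry of the (2, 3) result is zero for these particular inputs.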
-
- 28 Jul, 2021 1 commit
-
-
Committed by zhiboniu
-
- 19 Jul, 2021 1 commit
-
-
Committed by chentianyu03
* add cuda event and stream api
* add get_current_stream api
* init streams
* modify get_current_stream
* add synchronize func
* add current_stream doc and test file
* move get_current_stream into CUDA macro
* move CudaEvent into CUDA macro
* move _get_current_stream and _device_synchronize into cuda macro
* modify the macro of cuda stream and event
* add test case for synchronize
* add paddle.devices.cuda module
* event and stream support hip
* add doc for stream and event class
* move cuda stream and event into single pybind
* add cuda_streams_py.cc to cmakelist
* add _device_synchronize and _get_current_stream to core module
* add test case for cudastream and cudaevent
* move __all__ in streams.py
* fix test fail
* add cuda to devices __all__
* fix current_stream doc writing error
* move devices to device directory, and merge device.py into __init__.py
* add required:gpu to sample codes
* remove cuda directory from device/__init__.py
-
- 12 Jul, 2021 1 commit
-
-
Committed by zhiboniu
-
- 23 Jun, 2021 1 commit
-
-
Committed by Zhanlue Yang
-
- 22 Jun, 2021 1 commit
-
-
Committed by zhangbo9674
* new api diagonal, test=develop
* add new api diagonal, test=develop
* new api diagonal, test=develop
* add new api paddle.diagonal, test=develop
* use framework::stride to replace ComputeDimStride
* replace cudaMalloc/cudaMemcpy by TensorFromVector in cudaKernel and cudaGradKernel
* improve function: when attr(offset) exceeds the extent of attr(axis1) or attr(axis2), set the diagonal dim to 0
* fix RP-Mac-CI bug: replace framework::stride() by ComputeDimStride.
* polish code-block
* polish code of Python API diagonal
* api supports dtype of float16 and bool
* modify unittest code
* polish dtype description
* polish code-block
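The offset rule mentioned in this commit (an out-of-range offset yields a diagonal of length 0) can be illustrated on a 2-D nested list. This is a minimal sketch assuming the usual convention that a positive offset selects diagonals above the main one and a negative offset selects those below; it is not the Paddle kernel, and the function name is illustrative.

```python
def diagonal(matrix, offset=0):
    """Return the diagonal of a 2-D nested list, shifted by `offset`.

    An offset that falls outside the matrix produces an empty list.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    # Starting cell: move right for positive offsets, down for negative ones.
    r0, c0 = (0, offset) if offset >= 0 else (-offset, 0)
    length = max(0, min(rows - r0, cols - c0))
    return [matrix[r0 + k][c0 + k] for k in range(length)]


m = [[0, 1, 2],
     [3, 4, 5]]
main = diagonal(m)          # main diagonal
above = diagonal(m, 1)      # one step above the main diagonal
out_of_range = diagonal(m, 3)
```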
-
- 21 Jun, 2021 1 commit
-
-
Committed by zhiboniu
-
- 17 Jun, 2021 1 commit
-
-
Committed by ronnywang
* add atan2_op
* fix
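For context, the two-argument arctangent returns the angle of a point while taking the signs of both coordinates into account, so all four quadrants are distinguished. The sketch below applies Python's `math.atan2` elementwise to lists; it illustrates the mathematical function only and makes no claim about the argument naming or broadcasting rules of the operator added here.

```python
import math


def atan2_elementwise(ys, xs):
    """Apply the two-argument arctangent to paired elements of two lists."""
    return [math.atan2(y, x) for y, x in zip(ys, xs)]


# One point per quadrant: (x, y) = (1, 1), (-1, 1), (-1, -1), (1, -1).
angles = atan2_elementwise([1.0, 1.0, -1.0, -1.0], [1.0, -1.0, -1.0, 1.0])
```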
-
- 16 Jun, 2021 2 commits
-
-
Committed by Zhou Wei
-
Committed by zhangbo9674
* new api trunc, test=develop
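Truncation rounds toward zero, which differs from floor for negative inputs. The short sketch below shows the difference using the standard library; it illustrates the mathematical behaviour only and is not the operator's implementation.

```python
import math


def trunc_elementwise(values):
    """Drop the fractional part of each value (round toward zero)."""
    return [float(math.trunc(v)) for v in values]


values = [-1.7, -0.2, 0.2, 1.7]
truncated = trunc_elementwise(values)
floored = [float(math.floor(v)) for v in values]  # for comparison
```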
-
- 15 Jun, 2021 1 commit
-
-
Committed by zyfncg
* Add digamma_op and unittest
* add digamma_op api
* remove special DigammaCudaKernel and correct some docs
* remove unused headers
* fix api doc error
-
- 11 Jun, 2021 2 commits
- 09 Jun, 2021 2 commits
- 27 May, 2021 1 commit
-
-
Committed by Qi Li
-
- 07 May, 2021 1 commit
-
-
Committed by zhiboniu
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596) (#32610)
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
* remove packages in __all__
* create new public api level paddle.callbacks; paddle.hub; paddle.utils.unique_name
Co-authored-by: Zhong Hui <zhonghui.net@gmail.com>
-
- 27 Apr, 2021 2 commits
-
-
Committed by zhiboniu
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
-
Committed by zhiboniu
-
- 25 Apr, 2021 1 commit
-
-
Committed by Wenyu
* add Hub Module for easy use of pre-trained models.
* support list, load, help functions.
* support loading models from github, gitee, local
Co-authored-by: LielinJiang <jianglielin@baidu.com>
-
- 24 Apr, 2021 1 commit
-
-
Committed by zhiboniu
-
- 22 Apr, 2021 2 commits
-
-
Committed by Yang Zhang
-
Committed by zhiboniu
-
- 14 Apr, 2021 1 commit
-
-
Committed by Feiyu Chan
* add common dtypes as paddle's dtypes
* import paddle.fluid.core_avx.VarDesc.VarType as paddle.dtype
-
- 09 Apr, 2021 1 commit
-
-
Committed by Leo Chen
* [feature] support npu allocator (#30840)
  [feature] support npu allocator
* [feature] support npu operator (#30951)
  [feature] support npu operator
* [feature] support npu allocator, part 2 (#30972)
* support npu allocator
* add npu device context
* fix some compile problem
* add npu info
* compile ok
* fix include dir
* support naive_best_fit_allocator
* run ut ok, but failed to exit
* call aclrtResetDevice before exit
* fix aclFinalize
* add system allocator test
* add selected_gpus in gtest
* add tensor_test for npu
* support npu op, initial commit
* add npu stream
* add elementwise_add_op
* compile ok
* fix typo
* fix elementwise_add_op_npu_test
* support op run
* test can run but failed
* change aclopExecuteV2 to aclopCompileAndExecute
* support parsing ascend rank table file (#31000)
  support parsing ascend rank table file
* Fix reshape on GE graph. (#31084)
  Fix reshape on GE graph
* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
* add npu sub op
* fix typo
* rename test
* fix bug
* add fp16 kernel
* fix typo
* support sub grad op
* support elementwise_sub_grad op
  Co-authored-by: frankwhzhang <frankwhzhang@126.com>
* Fix compilation problem (#31100)
  Fix compilation problem (#31100)
* fix compile
* fix code style
* remove const_cast
* support adding correct npu op in pybind.h (#31143)
* support adding correct npu op in pybind.h
* refine code
* [NPU] Support executor with NPU (#31057)
* [NPU] Support executor with NPU
* Fix code according to reviews
* Fix code
* Add unittest for sub op npu
* refactor npu device manager (#31154)
  refactor npu device manager (#31154)
* fix selected npus
* fix compile
* fix reading flags from env
* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
-
- 01 Apr, 2021 1 commit
-
-
Committed by chentianyu03
* add custom init grad for backward function
* handle when the grad_tensor is none
* fix the args type error on windows platform
* modify the args order and doc
* format code
* add grad_tensor to xpu
* modify the grad_tensor type check
* add paddle.backward api to support multi tensors gradient compute
* add paddle.autograd module and backward api
* change tensor.backward func args
* modify tensor backward api
* remove create_graph inputs args
* add doc and example code for backward api
* throw error when the same tensor is passed more than once
* modify test Init func args
* modify the execute.Init func args in test files
* add paddle.autograd package in setup.py.in
* modify error msg, remove _run_backward method in class Tensor
* add test cases for backward api
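The idea of a custom initial gradient can be shown with the chain rule on a single function. In the sketch below, the gradient of y = x * x is seeded with a caller-supplied upstream gradient; when none is given, a seed of ones is used, which is assumed here to mirror the default behaviour. The function name and signature are hypothetical and chosen for illustration; this is not the autograd engine.

```python
def backward_square(x, grad_output=None):
    """Gradient of y = x * x with respect to x, scaled by an upstream gradient.

    grad_output plays the role of the custom initial gradient; None means
    "seed with ones" (an assumption made for this sketch).
    """
    if grad_output is None:
        grad_output = [1.0] * len(x)
    if len(grad_output) != len(x):
        raise ValueError("grad_output must have the same length as x")
    # Chain rule: dL/dx = dL/dy * dy/dx = grad_output * 2x
    return [2.0 * xi * gi for xi, gi in zip(x, grad_output)]


x = [1.0, 2.0, 3.0]
default_grad = backward_square(x)
seeded_grad = backward_square(x, grad_output=[0.5, 0.0, 2.0])
```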
-
- 15 Jan, 2021 1 commit
-
-
Committed by pangyoki
* add view strategy on squeeze, unsqueeze, reshape, flatten
* add squeeze unittest
* add unittests
* use View strategy as name rather than Reuse Allocation
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* add inplace strategy
* add elementwise_add sub
* let backward op not use inplace
* grad op do not use inplace
* fix memory increase error and add leaf error message
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
* add unittest and leaf error message
* merge view error
* optimize op_function_generator format and support sum inplace op
* fix format of basic_engine
* fix format for framework
* little change of variable wrapper
* add reshape, squeeze, unsqueeze, scatter api
* add relu elu tanh softmax inplace api
* fix test_squeeze_op unittest
* fix test_relu_op unittest
* fix comment problems
* delete sample code of inplace api
* add reference of grad_pending_nodes in basic_engine
* fix unittest name
* add inplace apis into wlist
* fix error message
* add PADDLE_ENFORCE for set grad op twice
* fix header file error
-
- 07 Jan, 2021 1 commit
-
-
Committed by 123malin
* test=develop, add model_average and lookahead
-
- 17 Dec, 2020 2 commits
-
-
Committed by chentianyu03
* add conj op for complex types
* add conj for complex types
* add more test case
* add conj_op test
* modify conj api and impl
* add complex type for fill_constant_op xpu
* add setConstant for complex type
* remove complex conj test file
* user define grad for test_conj_op
* add test case for static mode of conj api
* modify conj doc
* change input args name to x
* remove useless codes
* conj support real types
* add conj test case for real number
-
Committed by Chen Weihang
* add complex real op & api & unittest
* add imag op & api & unittest
* refactor op impl
* revert simplified writing due to compile failure
* polish details
* polish grad op code
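The conjugate flips the sign of the imaginary part, and for real inputs it is the identity, which is why supporting real types is a natural extension. The sketch below illustrates this with Python's built-in numeric types; it is only a demonstration of the mathematics, not the operator itself.

```python
def conj_elementwise(values):
    """Return the complex conjugate of each value.

    Python ints and floats also provide conjugate(), which returns the
    value unchanged, matching the "real types" case.
    """
    return [v.conjugate() for v in values]


complex_result = conj_elementwise([1 + 2j, 3 - 4j])
real_result = conj_elementwise([1.5, -2])
```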
-
- 09 Dec, 2020 2 commits
-
-
Committed by joejiong
As the title
-
Committed by Wei Shengyu
* remove addcmul
* remove unittest and other related code of addcmul
* fix bug
* fix merge conflict
-
- 07 Dec, 2020 1 commit
-
-
Committed by chentianyu03
* rm complexvariable
* modify test_var_base unittest
* remove duplicated codes
-
- 01 Dec, 2020 1 commit
-
-
Committed by yukavio
-
- 30 Nov, 2020 1 commit
-
-
Committed by LielinJiang
* lazy import for scipy
* rm unused check
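A lazy import defers loading a heavy optional dependency until it is first used, so importing the parent package stays fast and does not fail when the dependency is absent. The sketch below shows one common way to do this with `importlib`; the `LazyModule` class is hypothetical, and the standard-library `json` module stands in for scipy so the example runs anywhere. It is not the code from this commit.

```python
import importlib


class LazyModule:
    """Proxy that imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None  # nothing is imported yet

    def __getattr__(self, attr):
        # Called only for attributes not found on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


lazy_json = LazyModule("json")       # stand-in for a heavy dependency
loaded_before = lazy_json._module is not None
encoded = lazy_json.dumps([1, 2])    # first use triggers the real import
loaded_after = lazy_json._module is not None
```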
-