Commits · 662230487cdaac10891d5e272175a81a4e234cb3 · BaiXuePrincess / Paddle

13 Sep, 2021 1 commit

Committed by Yanxing Shi on Sep 13, 2021

* fix github name

* fix CI error

* fix review and CI error

* fix inf,nan error and modify unittest samples

* add unittest samples

* add unittest samples

* fix unittest error

* test=document_fix

* test=document_fix

* modify doc and add unittest samples

* fix error newline in constant

* modify doc after mentor review

* modify __all__ and doc

* modify doc

66223048

10 Sep, 2021 1 commit

add cumprod op (#35185) · 4e509f46

Committed by hlygit66666 on Sep 10, 2021

* add test_cumprod_op

* Revert "add test_cumprod_op"

This reverts commit c96cf6dff5d09ae7d8cc72c1e8ae4369a153aa19.

* recommit

* add error message

* test input(x) initialize

* test use cpu

* update test code

* add test type

* add test case

* solve ci problem

* add complex case test

* add complex case test

* fix review problem

* fix conflict

* fix some docs

* change test case

* change test case

* fix review problems again

* fix docs

* fix inclusivescan bug
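For orientation, the inclusive scan that a cumulative product performs can be sketched in plain Python. This is not the Paddle kernel added by this commit; the helper name `cumprod_1d` is hypothetical and only illustrates the one-dimensional semantics, including the complex case the tests mention.

```python
from itertools import accumulate
from operator import mul


def cumprod_1d(values):
    """Inclusive cumulative product: out[i] = values[0] * ... * values[i]."""
    return list(accumulate(values, mul))


print(cumprod_1d([1, 2, 3, 4]))  # [1, 2, 6, 24]
print(cumprod_1d([1j, 1j, 1j]))  # complex inputs are handled the same way
```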

4e509f46

02 Sep, 2021 1 commit

Add SVD Op and its GPU and CPU kernel (#34953) · 7e5fb462

Committed by xiongkun on Sep 02, 2021

* Add SVD Op and its GPU and CPU kernel

* Remove CUDAPlace in test_svd_op, make the test available in CPU package

* modify the file

* fix windows bug/ fix ROCM / fix test timeout

* to pass the CIs

* improve error report

* for code review

* some modification to test_svd_op

* change python code style

* expose the svd interface for document

7e5fb462

31 Aug, 2021 1 commit

New whl release strategy with pruned nv_fatbin (#35239) · 2f3b393d

Committed by Zhanlue Yang on Aug 31, 2021

[Background]
Code size tends to grow irreversibly over time, producing huge release packages that
not only hamper the user experience but also exceed a hard limit of PyPI.

The NV_FATBIN section accounts for 86% of the compiled dylib size, owing to the large number of GPU
arches supported.

This PR aims to prune the NV_FATBIN section.

[Solution]
The new release strategy involves two types of whl packages:

Cubin PIP package:
Maintains a smaller window of GPU arch support, containing
sm_60, sm_70, sm_75, sm_80 cubins, covering the Pascal through Ampere arches.

JIT release package:
A backup for the Cubin PIP package, containing compute_35, compute_50, compute_60,
compute_70, compute_75, compute_80, with best performance and GPU arch coverage.

However, it takes around 10 min to install due to the JIT compilation.

[How to use]
The new release strategy is disabled by default.
To compile the Cubin PIP package, add this to cmake: -DCUBIN_RELEASE_PIP
To compile the JIT release package, add this to cmake: -DJIT_RELEASE_WHL
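The difference between the two package types comes down to what is embedded per architecture: `code=sm_XX` asks nvcc for a precompiled cubin, while `code=compute_XX` embeds PTX that the driver JIT-compiles on first load. The sketch below builds the corresponding `-gencode` flags from the arch lists in the message above. The helper `gencode_flags` is hypothetical, and how Paddle's CMake files actually assemble these flags is an assumption not shown in this log.

```python
# Arch lists copied from the commit message above.
CUBIN_ARCHS = [60, 70, 75, 80]
JIT_ARCHS = [35, 50, 60, 70, 75, 80]


def gencode_flags(archs, embed_ptx):
    """Build nvcc -gencode flags (hypothetical helper, not Paddle's CMake logic)."""
    flags = []
    for arch in archs:
        # code=sm_XX embeds a precompiled cubin for that exact arch;
        # code=compute_XX embeds PTX that the driver compiles at load time.
        target = f"compute_{arch}" if embed_ptx else f"sm_{arch}"
        flags.append(f"-gencode arch=compute_{arch},code={target}")
    return flags


print(" ".join(gencode_flags(CUBIN_ARCHS, embed_ptx=False)))
print(" ".join(gencode_flags(JIT_ARCHS, embed_ptx=True)))
```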

2f3b393d

20 Aug, 2021 1 commit

Add paddle.linalg.matrix_power OP (#34667) · e2241a43

Committed by Hao Lin on Aug 20, 2021

e2241a43

18 Aug, 2021 1 commit

Add function to disable paddle signal handler (#34577) · dd533dd3

Committed by Zhanlue Yang on Aug 18, 2021

* Add function to disable paddle signal handler

Paddle uses google::InstallFailureSignalHandler to handle selected system signals,
mainly for debugging and bug reporting purposes.

However, this can conflict with other Python packages that capture similar signals,
such as tvm.

To resolve this issue, we provide a function to disable the signal handler.

* Remove signal test from WIN32 platform

* Remove redundant return from disable_signal_handler() function

* Add detailed messages to en_doc
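The conflict described above can be illustrated with a Python-level analogy: only one handler is active per signal, so whichever package installs its handler last silently displaces the other. This sketch uses only the standard `signal` module. The two handler functions are hypothetical stand-ins, and the sketch does not reproduce how Paddle installs or disables its C++ handler.

```python
import signal


def framework_handler(signum, frame):
    """Stands in for a handler a framework installs at import time."""


def other_package_handler(signum, frame):
    """Stands in for a handler installed by another package."""


def demo(signum=signal.SIGTERM):
    """Show that the last handler installed wins, then restore the original."""
    original = signal.signal(signum, framework_handler)
    displaced = signal.signal(signum, other_package_handler)
    active = signal.getsignal(signum)
    # Restore whatever was there before; None means "not installed from Python".
    signal.signal(signum, signal.SIG_DFL if original is None else original)
    return displaced is framework_handler, active is other_package_handler


print(demo())  # (True, True)
```

Disabling the framework's handler, as this commit allows, removes one side of that race so the other package's handler stays in effect.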

dd533dd3

16 Aug, 2021 1 commit

add unique_consecutive_op (#34334) · 875cfd57

Committed by duanboqiang on Aug 16, 2021

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* add unique_consecutive_op

* remove unity build

* add unique_consecutive op

* add unique_consecutive op

* add enable static

* add noqa

* add space line

* add default case.

* add comma

* add space line

* modify unique_consecutive unittest

* optimize ut coverage

* rebase develop

* improve coverage

* update en docs

* update en docs

* update en docs

* update en docs

* update en docs

* update en doc
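As the op name suggests, only runs of equal adjacent elements are collapsed, so a value may reappear later in the output. A minimal plain-Python sketch of that behavior follows. The function below is a hypothetical reference, and the real op's arguments and return values are not described in this log.

```python
from itertools import groupby


def unique_consecutive(values):
    """Collapse each run of equal adjacent items; return (items, run lengths)."""
    items, counts = [], []
    for key, run in groupby(values):
        items.append(key)
        counts.append(sum(1 for _ in run))
    return items, counts


print(unique_consecutive([1, 1, 2, 2, 3, 1, 1, 2]))
# ([1, 2, 3, 1, 2], [2, 2, 1, 2, 1])
```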

875cfd57

13 Aug, 2021 1 commit

New Einsum API (#33821) · 8c8667f0

Committed by Tongxin Bai on Aug 13, 2021

* OP dot: refactor CPU kernels and get better loop performance.

* Minor fix on code format.

* Fixed minor errors.

* Add new API: einsum

* Update the Einsum unit test.

One case failed with matmul_v2, where the dtype is int64:

a = np.arange(2 * 3 * 1).reshape(2, 3, 1)
b = np.arange(1)
paddle.einsum("...i, ...i", a, b)

* Test cases in test_einsum test floating point dtypes only.

As of now Paddle only supports float/double dtypes in matmul, which is
one of the building blocks of this Einsum implementation. We decided not to
test einsum against other dtypes.

* Polish format.

* More formatting.

* Format...

* Einsum: improve test coverage.

* Einsum: bug fixes and more testcases for testing error messages

* Einsum: fix format..

* Einsum: fixed typo and format.

* Einsum: format again...

* Einsum: applied suggested changes.

* Einsum API: improve API documentation.

* Einsum API: apply suggested changes.

* Einsum API: Add dygraph only note.

* Einsum API: Add dygraph only note.

* Einsum API: fixed unittest.
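For readers unfamiliar with the `"...i, ...i"` equation in the failing int64 case above: it multiplies along the shared last axis, sums it away, and broadcasts over the leading axes. The plain-Python sketch below spells that out for the same shapes. The helper `einsum_last_axis_dot` is hypothetical, covers only a 3-D first operand with a 1-D second operand of matching last-axis length, and is unrelated to Paddle's implementation.

```python
def einsum_last_axis_dot(a, b):
    """Reference for "...i, ...i" with `a` as 3-D nested lists and `b` as a 1-D list.

    The shared last axis is contracted; a's two leading axes are kept.
    """
    return [[sum(x * y for x, y in zip(row, b)) for row in plane] for plane in a]


# Same shapes and values as the failing case: a is 2 x 3 x 1 holding 0..5, b is [0].
a = [[[0], [1], [2]], [[3], [4], [5]]]
b = [0]
print(einsum_last_axis_dot(a, b))  # [[0, 0, 0], [0, 0, 0]]
```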

8c8667f0

28 Jul, 2021 1 commit

reverse paddle.vision.xxx import (#34432) · a83a368f

Committed by zhiboniu on Jul 28, 2021

a83a368f

19 Jul, 2021 1 commit

Add Cuda event and stream API (#32460) · 9c7f6af5

Committed by chentianyu03 on Jul 19, 2021

* add cuda event and stream api

* add cuda event and stream api

* add get_current_stream api

* add get_current_stream api

* init streams

* modify get_current_stream

* modify get_current_stream

* add synchronize func

* add current_stream doc and test file

* move get_current_stream into CUDA macro

* move CudaEvent into CUDA macro

* move _get_current_stream and _device_synchronize into cuda macro

* modify the macro of cuda stream and event

* add test case for synchronize

* add paddle.devices.cuda module

* event and stream support hip

* add doc for stream and event class

* move cuda stream and event into single pybind

* add cuda_streams_py.cc to cmakelist

* add _device_synchronize and _get_current_stream to core module

* add test case for cudastream and cudaevent

* move __all__ in streams.py

* fix test fail

* add cuda to devices __all__

* fix current_stream doc writing error

* move devices to device directory, and merge device.py into __init__.py

* add required:gpu to sample codes

* remove cuda directory from device/__init__.py

9c7f6af5

12 Jul, 2021 1 commit

add paddle/linalg.py to add new linalg apis (#34033) · bfbea8fd

Committed by zhiboniu on Jul 12, 2021

bfbea8fd

23 Jun, 2021 1 commit

Add new operation: BroadcastTensorsOp (#33294) · affddfaa

Committed by Zhanlue Yang on Jun 23, 2021

affddfaa

22 Jun, 2021 1 commit

[API/OP]Add a new API paddle.diagonal (#33586) · ad106290

Committed by zhangbo9674 on Jun 22, 2021

* new api diagonal, test=develop

* add new api diagonal, test=develop

* new api diagonal, test=develop

* add new api paddle.diagonal, test=develop

* use framework::stride replace ComputeDimStride

* replace cudaMalloc/cudaMemcpy by TensorFromVector in cudaKernel and cudaGradKernel

* refine function: when attr(offset) exceeds attr(axis1) or attr(axis2), set the diagonal dim to 0

* fix RP-Mac-CI bug: replace framework::stride() by ComputeDimStride.

* polish code-block

* polish code of python API diagonal

* api supports dtype of float16 and bool

* api supports dtype of float16 and bool

* modify unittest code

* modify unittest code

* polish dtype description

* polish code-block
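One bullet above notes that an out-of-range offset yields a diagonal of length 0. A minimal 2-D sketch of that rule in plain Python is given here, assuming the NumPy-style convention that a positive offset selects diagonals above the main one. The helper `diagonal_2d` is hypothetical and ignores the `axis1`/`axis2` arguments of the real API.

```python
def diagonal_2d(matrix, offset=0):
    """Diagonal of a 2-D nested list; offset > 0 is above the main diagonal."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    if offset >= 0:
        # Clamp to 0 so an offset past the last column gives an empty result.
        length = max(0, min(rows, cols - offset))
        return [matrix[i][i + offset] for i in range(length)]
    length = max(0, min(rows + offset, cols))
    return [matrix[i - offset][i] for i in range(length)]


m = [[1, 2, 3], [4, 5, 6]]
print(diagonal_2d(m))      # [1, 5]
print(diagonal_2d(m, 1))   # [2, 6]
print(diagonal_2d(m, 5))   # []
```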

ad106290

21 Jun, 2021 1 commit

add new api ci check file (#33609) · 50f885fd

Committed by zhiboniu on Jun 21, 2021

50f885fd

17 Jun, 2021 1 commit

Add atan2 op and test (#33067) · 918aeb71

Committed by ronnywang on Jun 16, 2021

* add atan2_op

* fix

918aeb71

16 Jun, 2021 2 commits

Add bitwise_and/or/xor/not OP/API and unittest (#33524) · ecc05377

Committed by Zhou Wei on Jun 16, 2021

ecc05377

[Feature] add paddle.trunc (#33371) · 72d36970

Committed by zhangbo9674 on Jun 16, 2021

* new api trunc, test=develop

72d36970

15 Jun, 2021 1 commit

Add digamma_op and unittest (#33278) · 02a6d49a

Committed by zyfncg on Jun 15, 2021

* Add digamma_op and unittest

* add digamma_op api

* remove special DigammaCudaKernel and correct some docs

* remove unused headers

* fix api doc error

02a6d49a

11 Jun, 2021 2 commits

add expm1_op (#33066) · 5cca9e4c

Committed by ronnywang on Jun 11, 2021

5cca9e4c

update 2.0 public api in all left files (#33313) · 022198c5

Committed by zhiboniu on Jun 11, 2021

* update 2.0 public api in all left files

* reverse device.py all list;
fix some flake8 errors

022198c5

09 Jun, 2021 2 commits

Add API paddle.neg() and paddle.lgamma(), along with some unittests for paddle.neg(). (#33248) · 9cda9ec2

Committed by levi131 on Jun 09, 2021

* add paddle.neg api

* add test for neg

* fix an English grammar error in comment

* add lgamma api

* support api paddle.tensor.neg() and paddle.tensor.lgamma()

* modify test_neg_op.py

9cda9ec2

Add diagflat op, test=develop (#33334) · 32ef95d7

Committed by Li Min on Jun 09, 2021

32ef95d7

27 May, 2021 1 commit

[ROCM] add is_compiled_with_rocm api, test=develop (#33043) · 6a5b7e59

Committed by Qi Li on May 27, 2021

6a5b7e59

07 May, 2021 1 commit

remove packages in __all__ (#32759) · a77ade0e

Committed by zhiboniu on May 07, 2021

* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596) (#32610)

* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.

* remove packages in __all__

* create new public api level paddle.callbacks; paddle.hub; paddle.utils.unique_name
Co-authored-by: Zhong Hui <zhonghui.net@gmail.com>

a77ade0e

27 Apr, 2021 2 commits

update 2.0 public api in paddle.init (#32034) · 125e4816

Committed by zhiboniu on Apr 27, 2021

Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>

125e4816

update 2.0 public api in dataset&framework (#31985) · 9930a582

Committed by zhiboniu on Apr 27, 2021

9930a582

25 Apr, 2021 1 commit

Add hub Module for easy to use pre-trained models. (#31873) · 4e460d7b

Committed by Wenyu on Apr 25, 2021

* add Hub Module for easy to use pre-trained models.
*   support list, load, help functions.
*   support loading models from github, gitee, local
Co-authored-by: LielinJiang <jianglielin@baidu.com>

4e460d7b

24 Apr, 2021 1 commit

add tensor.tolist() support (#32366) · 8beb1707

Committed by zhiboniu on Apr 24, 2021

8beb1707

22 Apr, 2021 2 commits

Add `paddle.set_grad_enabled` (#31794) · f8ca5a9d

Committed by Yang Zhang on Apr 22, 2021

f8ca5a9d

fix type(x)=paddle.VarBase to paddle.Tensor (#32364) · bec4b167

Committed by zhiboniu on Apr 22, 2021

bec4b167

14 Apr, 2021 1 commit

add common dtypes as paddle's dtypes (#32012) · 95939b52

Committed by Feiyu Chan on Apr 14, 2021

* add common dtypes as paddle's dtypes

* import paddle.fluid.core_avx.VarDesc.VarType as paddle.dtype

95939b52

09 Apr, 2021 1 commit

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, but failed to exit

* call aclrtResetDevice before exit

* fix aclFinalize

* add system allocator test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code style

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

ccf5709d

01 Apr, 2021 1 commit

add custom init grad for backward function (#31540) · 83b953f5

Committed by chentianyu03 on Apr 01, 2021

* add custom init grad for backward function

* add custom init grad for backward function

* handle when the grad_tensor is none

* handle when the grad_tensor is none

* fix the args type error on windows platform

* modify the args order and doc

* format code

* add grad_tensor to xpu

* modify the grad_tensor type check

* add paddle.backward api to support multi tensors gradient compute

* add paddle.backward api to support multi tensors gradient compute

* add paddle.autograd module and backward api

* change tensor.backward func args

* modify tensor backward api

* remove create_graph inputs args

* add doc and example code for backward api

* throw an error when the same tensor is given more than once

* modify test Init func args

* modify the execute.Init func args in test files

* add paddle.autograd package in setup.py.in

* modify error msg, remove _run_backward method in class Tensor

* add test cases for backward api
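The idea behind a custom initial gradient is that backpropagation starts from a caller-supplied seed instead of the implicit 1.0, and a multi-tensor backward does this once per output. The toy reverse-mode sketch below illustrates the three behaviors named in the bullets above: a seed gradient, a default when the seed is none, and an error on duplicate tensors. Every name in it (`Scalar`, `backward`) is hypothetical, and it makes no claim about the actual Paddle signatures or engine.

```python
class Scalar:
    """Tiny reverse-mode autodiff node (illustrative only, not Paddle's engine)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # pairs of (parent node, local derivative)

    def __mul__(self, other):
        return Scalar(self.value * other.value,
                      ((self, other.value), (other, self.value)))

    def __add__(self, other):
        return Scalar(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def backward(self, grad=1.0):
        """Propagate the seed `grad` along every path back to the leaves."""
        self.grad += grad
        for parent, local in self.parents:
            parent.backward(grad * local)


def backward(tensors, grads=None):
    """Backpropagate from several outputs, using one seed per output."""
    if len(set(map(id, tensors))) != len(tensors):
        raise ValueError("the same tensor appears more than once")
    if grads is None:
        grads = [None] * len(tensors)
    for tensor, grad in zip(tensors, grads):
        # A missing seed falls back to 1.0.
        tensor.backward(1.0 if grad is None else grad)


x = Scalar(3.0)
y = x * x
y.backward(2.0)  # seed of 2.0 instead of the default 1.0
print(x.grad)    # 12.0, i.e. 2.0 * d(x*x)/dx at x = 3
```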

83b953f5

15 Jan, 2021 1 commit

Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103) · 13d75736

Committed by pangyoki on Jan 15, 2021

* add view strategy on squeeze,unsqueeze,reshape,flatten

* add squeeze unittest

* add unittests

* use View strategy as name rather than Reuse Allocation

* fix view api doc

* fix format

* use core.ops when input of reshape2 is Tensor

* fix test_cross_entropy_loss error because of reshape2

* fix test_cross_entropy_loss error because of reshape2

* add inplace strategy

* add elementwise_add sub

* let backward op not use inplace

* grad op do not use inplace

* fix memory increase error and add leaf error message

* delete selected_rows

* change op_function

* little change

* solve HandleViewBetweenInputAndOutput

* add unittest and leaf error message

* merge view error

* optimize op_function_generator format and support sum inplace op

* fix format of basic_engine

* fix format for framework

* little change of variable wrapper

* add reshape, squeeze, unsqueeze, scatter api

* add relu elu tanh softmax inplace api

* fix test_squeeze_op unittest

* fix test_relu_op unittest

* fix comment problems

* delete sample code of inplace api

* add reference of grad_pending_nodes in basic_engine

* fix unittest name

* add inplace apis into wlist

* fix error message

* add PADDLE_ENFORCE for set grad op twice

* fix head file error

13d75736

07 Jan, 2021 1 commit

Add Lookahead and ModelAverage Optimizer (#30004) · 198fbdfb

Committed by 123malin on Jan 07, 2021

* test=develop, add model_average and lookahead

198fbdfb

17 Dec, 2020 2 commits

add conj op for complex types (#29527) · 71063b81

Committed by chentianyu03 on Dec 17, 2020

* add conj op for complex types

* add conj for complex types

* add more test case

* add conj_op test

* modify conj api and impl

* add complex type for fill_constant_op xpu

* add setConstant for complex type

* remove complex conj test file

* user define grad for test_conj_op

* add test case for static mode of conj api

* modify conj doc

* change input args name to x

* remove useless codes

* conj support real types

* add conj test case for real number

71063b81

[Complex] Add real & imag op and api for complex tensor (#29672) · 6cfa59de

Committed by Chen Weihang on Dec 17, 2020

* add complex real op & api & unittest

* add imag op & api & unittest

* refactor op impl

* revert simplified writing due to compile failure

* polish details

* polish grad op code
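The element-wise meaning of the conj, real and imag ops from the two commits above can be shown with Python's built-in numbers, including the "conj support real types" case where a real input comes back unchanged. The helper names are hypothetical, and this scalar sketch says nothing about the tensor kernels themselves.

```python
def conj(x):
    """Complex conjugate of a Python number; real inputs are returned unchanged."""
    return x.conjugate()


def real_imag(z):
    """Split a number into its real and imaginary parts."""
    return z.real, z.imag


print(conj(1 + 2j))       # (1-2j)
print(conj(3.5))          # 3.5
print(real_imag(1 + 2j))  # (1.0, 2.0)
```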

6cfa59de

09 Dec, 2020 2 commits

Add tangent operator (#29207) · 87e75a77

Committed by joejiong on Dec 09, 2020

As the title

87e75a77

remove addcmul (#28937) · dc8bb76c

Committed by Wei Shengyu on Dec 09, 2020

* remove addcmul

* remove unittest and other related code of addcmul

* fix bug

* fix merge conflict

dc8bb76c

07 Dec, 2020 1 commit

remove complexvariable (#29390) · 64e4e17f

Committed by chentianyu03 on Dec 07, 2020

* rm complexvariable

* modify test_var_base unittest

* remove duplicated codes

64e4e17f

BaiXuePrincess / Paddle is in sync with the fork source project