Commits · 21dc044a540fd10bf5ed5ba89570c049885d0441 · 机器未来 / Paddle

15 Apr, 2021 · 1 commit

【NPU】Cherry-pick ascendrc ops code by 0325 to develop (#32197) · e6bc358d

Committed by zhang wenhui on Apr 15, 2021

* merge 31065

* Fix typo of selected_npus (#31230)

* merge 31249

* [NPU] Support npu op pow and pow grad (#31247)

* [NPU] Support npu op: (1) pow (2) pow_grad

* Support fp16

* Fix pow npu fp16 test (#31256)

* support list of list attribute for NPU (#31299)

* support list of list attribute for NPU

* fix compile problem

* fix reference

* [NPU] Support npu op: (1) slice (2) slice_grad (#31275)

* fix reading flags from env (#31329)

* merge 31347

* [NPU] Support npu op layer_norm and layer_norm_grad (#31310)

* init commit, add layer_norm npu kernel

* fix typo

* add unittest

* add unittest

* fix bug

* fix bug

* refine ut

* [NPU] add npu kernel for equal op (#31393)

* add npu kernel for equal op

* refine code

* add more ut

* update year

* [NPU] Support npu kernel for shape op  (#31427)

* add shape npu

* fix

* fix

* fix endif (#31431)

* Fix pow, use fillD instead of broadcast (#31433)

* Fix pow, refine code (#31440)

* fix cmake of cryptopp to avoid downloading every time (#31451)

* [NPU] squeeze and unsqueeze op for ascend (#31452)
Co-authored-by: root <xiayanming@baidu.com>

* Support npu kernel for gather op (#31458)

* add gather npu op

* code review done

* update python new line

* precommit

* fix review

* del commit

* 【NPU】add scale op for npu (#31499)

* add scale npu

* fix

* fix

* Support TensorFromVector, TensorToVector of bool type (#31518)

* support TensorFromVector, TensorToVector of bool type

* add ut

* fix compile problem

* 【NPU】support npu kernel for fill_constant op (#31521)

* add fill_constant npu

* add fill_constant npu

* fix

* cherry-pick 31422, solve conflict

* 【NPU】Support npu kernel for matmul op (#31544)

* add matmulv2_npu

* add matmul

* add matmul

* [NPU] Support npu op elementwise_mul and elementwise_mul_grad (#31571)

* [NPU] Support npu op elementwise_max (#31574)

* 【NPU】add relu op for  npu (#31515)

* add relu npu

* fixed

* fix

* 【NPU】Support npu kernel for reshape2 op (#31524)

* add reshape2 npu

* add reshape2

* [NPU] Support npu kernel for gather op fix bug (#31541)

* add gather npu op

* code review done

* update python new line

* precommit

* fix review

* del commit

* update gather_grad

* fix bug

* fix bug

* [NPU] Support npu kernel for amp_check_finite_and_unscale_npu op (#31457)

* Support npu kernel for amp_check_finite_and_unscale_npu op

* support EnforceNotMet exception

* fix exception bug

* modify python unittest

* precommit

* update c++ unittest

* fix review

* fix review

* [NPU] accuracy op (#31492)

* accuracy op

* fix license

* fix

* add test and fix bug

* [NPU] add Assign OP (#31561)

* add assign op

* add test assign npu test

* dele if def
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] fix npu op elementwise_mul_grad (#31592)

* 【NPU】Support npu op gelu and gelu_grad (#31530)

* Support npu op gelu and gelu_grad

* Support npu op gelu and gelu_grad

* [NPU] fix assign cmake (#31595)

* fix gather_grad bug (#31607)

* [NPU] add range op (#31560)

* add range op

* fix codestyle; call GetSize directly
Co-authored-by: oyjxer <1728722986@qq.com>

* 【NPU】Support npu op elementwise_div and elementwise_div_grad (#31573)

* Support npu op elementwise_div and elementwise_div_grad

* Support npu op elementwise_div and elementwise_div_grad

* Support npu op elementwise_div and elementwise_div_grad

* [NPU] Support npu op log, log_grad, sqrt, sqrt_grad, square, tanh and tanh_grad (#31600)

* [NPU] Support npu op logicalnot_op (#31534)

* [NPU] Support npu op elementwise_min (#31575)

* [NPU] Support npu op elementwise_pow (#31576)

* [NPU] Support npu op table_lookup_v2 and table_lookup_v2_grad (#31399)

* [npu] support npu kernel `table_lookup_v2`

* clean up

* +python test

* +cmake

* clean up

* remove int8 kernel
+ python unittest for fp16

* clean up

* [NPU] support npu kernel for `less_than` (#31327)

* [npu] support npu kernel for `less than`

* remove int* kernel

* cleanup

* [NPU] Support npu kernel scatter op (#31624)

* Support npu kernel scatter op

* Add more test

* [NPU] fix allocator min chunk size (#31632)

* [NPU] Support NPU kernel cast op (#31635)
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* [NPU] add npu kernel for sgd (#31639)

* 【NPU】Support NPU kernel for reduce_sum op v2 (#31620)

* add reduce_sum

* fix broadcastd

* fix test

* fix

* add unsqueeze in reduce_sum

* add template

* add unittest for keep_dim

* test reduce_all
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* [NPU] add npu kernel for adam (#31644)

* add npu kernel for adam

* refine code

* disable test

* modify atol

* 【NPU】Support npu kernel for mul op (#31584)

* add mul

* add test mul

* [NPU] add npu kernel for softmax_with_cross_entropy (#31656)

* init

* fix bugs

* [NPU] add npu kernel for mean Op (#31562)

* update mean op

* update mean op

* give a better test activation
Co-authored-by: oyjxer <1728722986@qq.com>

* Revert "[NPU] add npu kernel for mean Op (#31562)" (#31665)

This reverts commit 468ac699.

* 【NPU】Add TensorCopy to NPU kernel for reduce_sum op  (#31667)

* update unittest

* add TensorCopy in npu grad kernel

* [NPU] Support npu op `expand` (#31405)

* [npu] support npu kernel  for `expand`

* [NPU] fix shape of dx in mul_grad (#31675)

* fix shape of dx

* refine code

* [NPU] add Increment op (#31563)

* add increment

* fix

* update test increment op inplace

* update increment op

* increment b = 2
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] add NPU add topk  (#31596)

* add topk op

* add cmake

* update topk npu op

* refactor func

* fix test not go npu TopKD bug

* NPUPlace(4) to NPUPlace(0)

* update comment
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] Support NPU kernel sum op (#31671)

* [NPU] npu support `transpose` (#31486)

* cherry-pick 31564, solve conflict

* [NPU] Fix bug: Fix calculation errors of pow grad npu kernel (#31699)

* [NPU] Support testing grad of NPU ops in OpTest (#31697)

* [NPU] Support NPU kernel of stack op (#31711)

* [NPU] Remove redundant ctest of top_k_op_npu_test (#31718)

* [NPU] fix reshape npu op kernel (#31726)

* rename npu op file

* fix reshape

* [NPU] change transpose to transpose2 (#31734)

* change transpose to transpose2

* fix bug

* [NPU] Support  mean npu kernel (#31729)

* [NPU] fix some bugs of npu op (#31739)

* fix softmax

* fix mean

* fix lookup_table_v2

* 【NPU】Fix npu kernel elementwise_div_grad  (#31753)

* [NPU] fix the grad kernel diff bug of gather op (#31757)

* fix gather grad kernel diff

* fix gather grad kernel diff

* fix gather review bug

* 【NPU】Fix reshape test & add grad test (#31776)

* fix

* fix

* [NPU] support fp16 for npu accuracy op (#31797)

* [NPU] support list of tensor input (#31801)

* support list of tensor as npu input

* add comment

* fix typo

* fix typo

* [NPU] add npu kernel for concat op (#31695)

* add npu kernel for concat op

* add npu kernel for concat op

* refine code

* update

* refine concat_grad

* [NPU] Support npu kernel for op elementwise_floordiv (#31822)

* [NPU] fix bug of lookup_table_v2_grad (#31834)

* [NPU] support default stream (#31510)

* [NPU] support mixed precision input for npu layer norm (#31847)

* support mixed precision input for npu layer norm

* fix layer_norm npu kernel
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

* 【NPU】Support npu kernel for update_loss_scaling op (#31830)

* add update_loss_scaling_npu NPU kernel

* change TensorFromVec to Memset

* fix compile problem (#31850)

* [NPU] support npu for conditional_block op (#31854)

* 【NPU】Add int dtype kernel for reshape2 op (#31864)

* fix

* fix

* [NPU] fix some op bugs (#31855)

* fix some op bugs

* fix some bugs

* follow comments

* fix log level

* add ut

* [NPU] support fp16 of input for api pow (#31871)

* [NPU] add npu kernel for truncated_gaussian_random op (#31654)

* init

* add todo

* add npu kernel for truncated_gaussian_random

* add sync

* fix concat_grad

* fix typo

* fix compile

* fix compile

* fix compile

* fix compile

* fix compile

* fix compile

* fix code style

* fix code style

* fix code

* Fix op test (#32231)

* fix conditional block (#32243)

* fix style code
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: Reventon_L <luyuxiang1994@qq.com>
Co-authored-by: root <xiayanming@baidu.com>
Co-authored-by: oyjxer <1728722986@qq.com>
Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
Co-authored-by: OleNet <olenet@126.com>
Co-authored-by: Meiyim <chen_xuyi@outlook.com>
Co-authored-by: oyxuan-11 <963650125@qq.com>
Co-authored-by: pangyoki <pangyoki@126.com>

e6bc358d

14 Apr, 2021 · 2 commits
- support the bool tensor and scalar (#32272) · 7da4455f
  Committed by wawltor on Apr 14, 2021
  
  7da4455f
- fix expand op lack of float16 (#32238) · f4b2ce44
  Committed by Thomas Young on Apr 14, 2021
  
  f4b2ce44
13 Apr, 2021 · 1 commit
- [ROCM] fix depth conv2d in rocm, test=develop (#32170) · 693c7629
  Committed by Qi Li on Apr 13, 2021
  
  693c7629
07 Apr, 2021 · 1 commit
- add uint8 type for flatten op (#32120) · 297290a8
  Committed by danleifeng on Apr 07, 2021
```
* add uint8 type for flatten;test=develop
```
  297290a8
06 Apr, 2021 · 1 commit
- fix test of affine_grid with rocm (#32047) · 78af100c
  Committed by zhulei on Apr 06, 2021
```
* fix test of affine_grid with rocm

* fix test of affine_grid with rocm
```
  78af100c
01 Apr, 2021 · 1 commit
- Support uint8_t for fill_constant_op (#31911) · 980227f9
  Committed by Zhang Zheng on Apr 01, 2021
  
  980227f9
29 Mar, 2021 · 1 commit
- [ROCM] added a cudnn switch of conv2d for rocm platform (#31836) · 123949eb
  Committed by ronnywang on Mar 29, 2021
  
  123949eb
22 Mar, 2021 · 1 commit
- [oneDNN] Initial bf16 amp integration (#31093) · 7ccf6b60
  Committed by arlesniak on Mar 22, 2021
  
  7ccf6b60
17 Mar, 2021 · 1 commit
- support NHWC for temporal_shift op (#31642) · 7f50bb7e
  Committed by Zhang Ting on Mar 17, 2021
  
  7f50bb7e
11 Mar, 2021 · 1 commit
- optimize range op by placing parameters on cpu rather than gpu, test=develop (#30811) · 9ed6c895
  Committed by jiangcheng on Mar 11, 2021
  
  9ed6c895
04 Mar, 2021 · 1 commit

[Dy2stat] Fix Read-Only Attribute as while_loop Output (#31415) · 6bf02a12

Committed by Huihuang Zheng on Mar 04, 2021

Fix Read-Only Attribute as while_loop Output:

Usually, the generated convert_while_loop call looks like:
```
    [a, b, c] = paddle.jit.dy2static.convert_while_loop(
            condition_name, body_name, [a, b, c])
```
where a, b, c are in loop_var_names.

However, if loop_var_names contains an attribute such as foo.x, we cannot
always assign it as an output of convert_while_loop, because foo.x may be a
Python property, which is read-only unless it defines a setter. To handle this
case, we replace the attributes that are outputs of convert_while_loop with
generated variables; then, if we find at runtime that the attribute is not a
property, we assign the generated variable back to it. The created statements
look like:
```
    [a, b, __attribute_variable_1] = paddle.jit.dy2static.convert_while_loop(
            condition_name, body_name, [a, b, foo.x])
    if not isinstance(getattr(type(foo), "x", None), property): foo.x = __attribute_variable_1
```
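
The runtime guard in the last generated statement can be tried outside Paddle. The following is a minimal sketch, not the code Dy2stat emits: the class `Foo` and the helper `assign_if_writable` are hypothetical names introduced only for illustration.

```python
class Foo:
    """Hypothetical class: `x` is a read-only property, `y` a plain attribute."""

    def __init__(self):
        self._x = 1
        self.y = 1

    @property
    def x(self):
        return self._x


def assign_if_writable(obj, name, value):
    """Assign obj.<name> = value unless <name> is a property on the class.

    Properties live on the type, so the lookup goes through type(obj)
    rather than the instance, as in the guard from the commit message.
    """
    if not isinstance(getattr(type(obj), name, None), property):
        setattr(obj, name, value)
        return True
    return False


foo = Foo()
wrote_x = assign_if_writable(foo, "x", 42)  # skipped: x is a property
wrote_y = assign_if_writable(foo, "y", 42)  # assigned: y is an instance attribute
```

Note that this guard skips every property, including one that defines a setter.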

6bf02a12

26 Feb, 2021 · 1 commit
- change np.int to int to fix paddle warning (#31221) · 6fafbdc3
  Committed by pangyoki on Feb 26, 2021
  
  6fafbdc3
24 Feb, 2021 · 1 commit
- Update doc for 2.0 API and some callback (#31180) · 572cc8bd
  Committed by qingqing01 on Feb 24, 2021
```
test=document_fix
```
  572cc8bd
03 Feb, 2021 · 1 commit
- Call new cudnn batch norm API regardless of data type and data layout (#30157) · 666efc23
  Committed by AshburnLee on Feb 03, 2021
  
  666efc23
29 Jan, 2021 · 1 commit
- fix paddle.static.acc and auc sample code bug, test=document_fix (#30715) · 65a9744c
  Committed by Jiaqi Liu on Jan 29, 2021
  
  65a9744c
27 Jan, 2021 · 2 commits

update gather_tree doc (#30693) · a87d78f1

Committed by liu zhengxi on Jan 27, 2021

* update gather_tree doc, test=document_fix

* update sample code, test=document_fix

* remove tensor type, test=document_fix

a87d78f1

upgrade gather_tree to core.ops (#30697) · fef3654b
Committed by liu zhengxi on Jan 27, 2021
```
* upgrade gather_tree to core.ops

* update gather_tree unittests
```
fef3654b

18 Jan, 2021 · 1 commit
- avoid calling cast twice (#30527) · 34bf8dfc
  Committed by Zhang Ting on Jan 18, 2021
  
  34bf8dfc
14 Jan, 2021 · 1 commit
- add auc into 'all' list (#30310) · e395bcd1
  Committed by Jiaqi Liu on Jan 14, 2021
```
* add auc into 'all' list

* alias acc, expose to users

* update sample code
```
  e395bcd1
13 Jan, 2021 · 1 commit

Set expected place in child thread for dataloader to avoid costing cuda memory... · 3d015f1c

Committed by Leo Chen on Jan 13, 2021

Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338)

* set expected place in child thread for dataloader

* set device id when set tensor from numpy

* revert tensor_py change

* add compile guard

* fix ci

* fix bug

3d015f1c

11 Jan, 2021 · 1 commit
- clean redundant API alias in 2.0 - part 2 (#30013) · 6bfdef72
  Committed by XiaoguangHu on Jan 10, 2021
```
* delete paddle.nn.functional.assign

* fix dynamic to static error
```
  6bfdef72
08 Jan, 2021 · 1 commit
- Fix test_slice: avoid unnecessary copying of TensorArray from subblock to parent block (#30168) · b2483d78
  Committed by liym27 on Jan 08, 2021
```
In control flow, don't copy TensorArray from subblock to parent block when TensorArray is created in parent block.
```
  b2483d78
07 Jan, 2021 · 1 commit
- refine the paddle place support using str (#28769) · 7dd551e0
  Committed by wangchaochaohu on Jan 07, 2021
  
  7dd551e0
06 Jan, 2021 · 2 commits

add dispensable input for core.ops.reshape2/expand/slice (#30072) · adac38c5

Committed by Leo Chen on Jan 06, 2021

* add dispensable input 'shape' for core.ops.reshape2

* add dispensable inputs for core.ops.reshape2/expand/slice

* add ut

adac38c5

Fix beam search bug (#29824) · 2e8425b6

Committed by Jiaqi Liu on Jan 06, 2021

* fix beam search bug

* add dygraph unittest

* update dynamic_decode argument doc

* add warning info for state which has no lengths attribute

2e8425b6

04 Jan, 2021 · 1 commit
- Optimize grad merge performance (#29784) · ee16006b
  Committed by WangXi on Jan 04, 2021
  
  ee16006b
28 Dec, 2020 · 1 commit

clean redundant API alias in 2.0 - part 1 (#29928) · 726c78f2

Committed by XiaoguangHu on Dec 28, 2020

* rm check_import_scipy, rm chunk_eval and mean_iou in paddle.metric.__init__.py

* Revert "rm check_import_scipy, rm chunk_eval and mean_iou in paddle.metric.__init__.py"

This reverts commit 179ba8c2b22bc31fe8d8a126e31820792cbd0f4e.

* delete paddle.metric.chunk_eval and paddle.metric.mean_iou

* delete paddle.nn.clip and paddle.nn.clip_by_norm

* delete paddle.nn.functional.activation.hard_sigmoid and paddle.nn.functional.activation.hard_swish

* delete paddle.nn.Pool2D, paddle.nn.BilinearTensorProduct, paddle.nn.RowConv, paddle.nn.functional.row_conv

* fix extension import error

* fix unittest for row_conv and Pool2D

726c78f2

23 Dec, 2020 · 1 commit

heter box (#29734) · 09b6e719

Committed by Thunderbrook on Dec 23, 2020

* add heter box

* add trainer, worker, wrapper...

* format

* for ci

* format

* remove boost get

* boost & copyright

* rename

* rename

* format

* format

* format
Co-authored-by: yaoxuefeng6 <yaoxuefeng@baidu.com>

09b6e719

11 Dec, 2020 · 1 commit
- Add fast path for dropout when p == 0 (#29553) · 0fdd3656
  Committed by Leo Chen on Dec 11, 2020
```
* add fast path for p == 0 in dropout

* add ut
```
  0fdd3656
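
The idea behind a `p == 0` fast path can be illustrated in plain Python. This is a minimal sketch and not Paddle's kernel: it assumes inverted-scaling dropout in training mode, and the function `dropout` below is a hypothetical stand-in operating on a list of floats.

```python
import random


def dropout(values, p, rng=random):
    """Inverted dropout over a list of floats (training mode).

    Each element is zeroed with probability p; survivors are scaled by 1/(1-p).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p == 0.0:
        # Fast path: nothing is dropped and the scale is 1, so skip
        # mask generation entirely and return a copy of the input.
        return list(values)
    if p == 1.0:
        # Everything is dropped; avoids dividing by zero in the scale.
        return [0.0 for _ in values]
    scale = 1.0 / (1.0 - p)
    return [v * scale if rng.random() >= p else 0.0 for v in values]


x = [1.0, 2.0, 3.0]
y = dropout(x, p=0.0)  # takes the fast path, no random numbers drawn
```

With `p == 0` the general branch would still produce the same values, so the fast path only saves the work of drawing random numbers and scaling.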
09 Dec, 2020 · 1 commit
- Add tangent operator (#29207) · 87e75a77
  Committed by joejiong on Dec 09, 2020
```
As the title
```
  87e75a77
07 Dec, 2020 · 1 commit

[paddle v2.0.0rc1: API fixes] assign/conv2d/conv2d_transpose/cast/ParamAttr (#29171) · 2ee7a6b0

Committed by liuyuhui on Dec 07, 2020

* fix DLTP-15151, paddle.ParamAttr API

* fix DLTP-15083/DLTP-15274, paddle.nn.functional.assign paddle.cast API

* fix DLTP-15431/DLTP-15432, paddle.static.nn.conv2d paddle.static.nn.conv2d_transpose API

* fix DLTP-15083, paddle.nn.functional.assign API

* fix DLTP-15431/DLTP-15432, paddle.static.nn.conv2d paddle.static.nn.conv2d_transpose API

* support in_dygraph_mode for cast op, test=develop

* fix bug,test=develop

* fix doc

* fix DLTP-15431/DLTP-15432, paddle.static.nn.conv2d paddle.static.nn.conv2d_transpose API

2ee7a6b0

05 Dec, 2020 · 1 commit

Fix api docs in RNN, Transformer, layer_norm, WeightNormParamAttr (#29235) · 8fc7f1b6

Committed by Guo Sheng on Dec 05, 2020

* Fix api docs in RNN, Transformer, layer_norm, WeightNormParamAttr.
test=develop

* Fix api doc for print in label_smooth.
test=develop

* Update api docs according to review comments.
Add name argument in RNN back.
test=develop

8fc7f1b6

04 Dec, 2020 · 1 commit
- Fix bug: delete wrong check_type of paddle.concat and support LoDTensorArray (#29306) · 5f84d0b3
  Committed by liym27 on Dec 04, 2020
  
  5f84d0b3
02 Dec, 2020 · 2 commits
- Move temporal_shift to paddle.nn.functional (#29261) · b9f1f434
  Committed by LielinJiang on Dec 02, 2020
```
* move temporal_shift to functional
```
  b9f1f434
- Update conv3d API (#29205) · 6a9a62c3
  Committed by mls1999725 on Dec 02, 2020
```
* Update conv3d API

* Update nn.py

* Update nn.py

* Update nn.py

* Update nn.py

* Update nn.py

* Update nn.py

* Update nn.py

* Update nn.py
```
  6a9a62c3
01 Dec, 2020 · 1 commit

Improve performance of elementwise_add grad op (#29187) · 116305ea

Committed by Leo Chen on Dec 01, 2020

* pass stop_gradient for cast op

* improve performance of elementwise_add grad

* use tensor copy async

* dygraph branch

* fix dygraph branch

* add ut

116305ea

30 Nov, 2020 · 3 commits

fix doc of alpha_dropout/dropout/dropout2d/dropout3d/npair_loss (#29136) · b6a26749

Committed by huangjun12 on Nov 30, 2020

* fix en doc, test=document_fix

* add blank after code declare, test=document_fix

* refine doc of dropout, test=document_fix

* refine npair_loss and dropout, test=document_fix

b6a26749

Refine the doc and unit test for Sigmoid and stanh (#29198) · f23665e5
Committed by hong19860320 on Nov 30, 2020

f23665e5

Check whether there is any inplace operation affecting gradient calculation. (#27901) · 865a4598

Committed by liym27 on Nov 30, 2020

* Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable.

* Add a new attribute `_inplace_version` for VarBase.

* Raise exception if an inplace operation can result in incorrect gradient computation.

* Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation.

* For api assign, call _bump_inplace_version() when it's an inplace operation in dynamic mode.

* Use original var_wrapper if the inplace_version is not changed.

* Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performance.
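
The version-counting idea described in this commit message can be sketched in plain Python. This is a toy model under stated assumptions, not Paddle's implementation: `VersionedTensor` and `Snapshot` are hypothetical classes that only loosely mirror the roles of the inplace version counter and the snapshot wrapper named above.

```python
class VersionedTensor:
    """Toy stand-in for a tensor carrying an inplace version counter."""

    def __init__(self, data):
        self.data = list(data)
        self.inplace_version = 0

    def bump_inplace_version(self):
        self.inplace_version += 1

    def add_(self, value):
        """In-place add; bumps the version because the data is overwritten."""
        self.data = [v + value for v in self.data]
        self.bump_inplace_version()
        return self


class Snapshot:
    """Records the version of a tensor when it is saved for backward."""

    def __init__(self, tensor):
        self.tensor = tensor
        self.saved_version = tensor.inplace_version

    def get(self):
        # A version mismatch means the saved value was overwritten, so a
        # gradient computed from it could be silently wrong.
        if self.tensor.inplace_version != self.saved_version:
            raise RuntimeError(
                "tensor was modified in place after being saved: "
                f"saved version {self.saved_version}, "
                f"current version {self.tensor.inplace_version}"
            )
        return self.tensor


t = VersionedTensor([1.0, 2.0])
saved = Snapshot(t)  # forward pass saves t for the backward pass
t.add_(1.0)          # in-place modification bumps the version

try:
    saved.get()
    error = None
except RuntimeError as exc:
    error = str(exc)
```

Comparing two integers at backward time is cheap, which is why a counter is a common way to detect this class of bug without copying the saved data.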

865a4598
