Commits · f9377965c4209cb6941150c55b6129afb58c64fe · PaddlePaddle / Paddle

05 Mar, 2021 1 commit
- [Kunlun]Multi xpu dygraph performance optimization , add distributed.spawn... · 9ebf05b0
  Committed by liuyuhui on Mar 05, 2021
```
[Kunlun]Multi xpu dygraph performance optimization , add distributed.spawn support for multi xpu and some bug-fixes (#31130)
```
  9ebf05b0
04 Mar, 2021 6 commits

[Dy2Stat] Remove gast.Index for compatibility of gast 0.4.0 (#31358) · 522c91ec
Committed by liym27 on Mar 04, 2021

522c91ec

support float16 for temporal_shift op (#31432) · 7d95e598
Committed by Zhang Ting on Mar 04, 2021

7d95e598

improve performance of depthwise_conv2d (#31099) · dcce54ea
Committed by Zhang Ting on Mar 04, 2021
```
* improve performance of depthwise_conv2d

* add unittest
```
dcce54ea

Fix bug for set_value op when input dtype is not float32 (#31411) · 0fff9306
Committed by liym27 on Mar 04, 2021

0fff9306

[Dy2stat] Fix Read-Only Attribute as while_loop Output (#31415) · 6bf02a12

Committed by Huihuang Zheng on Mar 04, 2021

Fix Read-Only Attribute as while_loop Output:

Usually, our convert_while_loop will be like:
```
    [a, b, c] = paddle.jit.dy2static.convert_while_loop(
            condition_name, body_name, [a, b, c])
```
where a, b, c are in loop_var_names.

However, if loop_var_names contains a property such as foo.x, we cannot
assign the attribute as an output of convert_while_loop, because a Python
property is a kind of read-only attribute. To handle this case, we replace
the attributes that are outputs of convert_while_loop with generated
variables; then, if we find at runtime that the attribute is not read-only,
we assign it. The created statements look like:
```
    [a, b, __attribute_variable_1] = paddle.jit.dy2static.convert_while_loop(
            condition_name, body_name, [a, b, foo.x])
    if not isinstance(getattr(type(foo), "x", None), property): foo.x = __attribute_variable_1
```
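The runtime check in the generated statement can be tried in plain Python. The sketch below is only an illustration: the class `Foo` and the helper `assign_if_writable` are hypothetical names that do not appear in the PR, and the check treats every property as read-only, including one that defines a setter.

```python
class Foo:
    """Hypothetical class with one plain attribute and one read-only property."""

    def __init__(self):
        self.y = 0      # ordinary instance attribute, writable
        self._x = 1

    @property
    def x(self):        # property without a setter, so assignment would fail
        return self._x


def assign_if_writable(obj, name, value):
    """Assign obj.<name> = value unless <name> is a property on the class.

    Mirrors the isinstance(getattr(type(obj), name, None), property) test
    from the commit message. Returns True if the assignment happened.
    """
    if not isinstance(getattr(type(obj), name, None), property):
        setattr(obj, name, value)
        return True
    return False


foo = Foo()
assigned_y = assign_if_writable(foo, "y", 10)   # plain attribute: assigned
assigned_x = assign_if_writable(foo, "x", 10)   # property: skipped
```

Note that the attribute name is looked up on `type(foo)` rather than on `foo` itself, because a property lives on the class; looking it up on the instance would return the property's value instead of the property object.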

6bf02a12

Added LSTM BF16 and fixed GRU BF16 (#31234) · 5b4f8aac
Committed by jakpiase on Mar 04, 2021

5b4f8aac

03 Mar, 2021 4 commits
- [ROCM] fix softmax with loss and update python scripts, test=develop (#31373) · db50fb67
  Committed by Qi Li on Mar 03, 2021
  
  db50fb67
- TRT conv2d converter support SAME padding (#31379) · 32211fe9
  Committed by Pei Yang on Mar 03, 2021
  
  32211fe9
- [ROCM] update fluid operators for rocm (part6), test=develop (#31301) · 946dbdae
  Committed by Qi Li on Mar 03, 2021
  
  946dbdae
- Add attrs `deformable_groups` for deformable_conv API (#31335) · 1cbccfa5
  Committed by wangna11BD on Mar 03, 2021
```
* add attrs deformable_groups
```
  1cbccfa5
02 Mar, 2021 2 commits

add n-d input support for trt scale converter (#31316) · 2e9e3fad
Committed by Pei Yang on Mar 02, 2021
```
* add n-d input support for trt scale converter

* add flatten for ut

* fix dims
```
2e9e3fad

lamb_op_xpu;test=kunlun (#31012) · d79fdc3d

Committed by Gradie on Mar 02, 2021

* lamb_op_xpu;test=kunlun

* modify lamb_op_xpu.cc;test=kunlun

* delete atol lamb_op_xpu; test=kunlun

* update xpu.cmake;test=kunlun

* test_error 1e-5,lamb_op_xpu;test=kunlun

* error1e-5,lamb_op_xpu,test=kunlun

* delete atol lamb_xpu;test=kunlun

* modify atol,lamb_op_xpy;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu, XPUOptest;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu,modify xpu_cmake; test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu,modify xpucmake;test=kunlun

d79fdc3d

26 Feb, 2021 2 commits
- [Dy2Stat] Fix eval_if_exist_else_none bug (#31261) · 1dd40870
  Committed by Aurelius84 on Feb 26, 2021
```
* fix eval_if_exist_else_none bug

* fix typo

* fix typo

* fix test_op_num unittest
```
  1dd40870
- xpu support fuse allreduce (#31104) · b8bce682
  Committed by WangXi on Feb 26, 2021
  
  b8bce682
25 Feb, 2021 2 commits
- add int pad support for Pad1D/2D/3D (#31209) · ad50fa71
  Committed by littletomatodonkey on Feb 25, 2021
```
* add int pad support for Pad1D/2D/3D

* fix type

* fix format
```
  ad50fa71
- OneDNN hardswish integration (#30211) · 2f116534
  Committed by jakpiase on Feb 25, 2021
  
  2f116534
24 Feb, 2021 4 commits

[Paddle-TRT] support group_norm (#31040) · 00b09e86

Committed by Pei Yang on Feb 24, 2021

* add group norm plugin

* fix compile problems

* move concat axis check to trt op teller

* add nbDims for scale and bias nv dims

* add group norm unit test

* fix unittest

* add trt version restriction for group norm op teller

* fix unittest

00b09e86

change test_multiprocess_reader_exception cmake (#31174) · c209751c
Committed by Chen Weihang on Feb 24, 2021

c209751c

fix entry (#31079) · ebbdf525

Committed by tangwei12 on Feb 24, 2021

* fix entry

* fix distributed lookup table fuse case

* fix entry bug at first time

* move entry from paddle.fluid -> paddle.distributed

* fix ut with paddle.enable_static()
Co-authored-by: malin10 <malin10@baidu.com>

ebbdf525

add warning message when dtypes of operator are not same (#31136) · 70131b47

Committed by chentianyu03 on Feb 24, 2021

* add error msg when dtypes of operator are not same

* add error msg when dtypes of operator are not same

* change error msg to warning msg when dtypes of operator are not same

* modify test case to fit for python2

70131b47

23 Feb, 2021 2 commits

Optimization of Transformer API (#30957) · edacb629

Committed by xiemoyuan on Feb 23, 2021

* Support 'bool' and 'int' for attention mask.

* Update docs.

* Add unittest for Transformer.

* fix bugs.

edacb629

Save load/save pickle protocol (#31044) · ee1801c1

Committed by WeiXin on Feb 23, 2021

* add default argument  for paddle.save/static.save

* edit documentation of

* Add comments for special processing for protocol=2 and protocol=3.

* Update python/paddle/fluid/io.py
Co-authored-by: lanxianghit <47554610+lanxianghit@users.noreply.github.com>

ee1801c1

22 Feb, 2021 2 commits

[Dy2stat] Refactoring tensor_shape_transformer.py to Fix Change after Assign Bug (#31082) · cf43a321

Committed by Huihuang Zheng on Feb 22, 2021

**Problem**
In our old shape transformer logic, if a user writes:
```
s = tensor.shape
...
y = paddle.some_api(s)
```
Dy2stat will change it to
```
...
y = paddle.some_api(convert_var_shape(tensor))
```
However, this causes a fatal bug if the user changes the shape of `tensor` after the assignment. For example:
```
s = tensor.shape
...
tensor = paddle.some_change_shape_api(tensor)
...
y = paddle.some_api(s)
```
Then Dy2stat produces a wrong result, because the code is translated into:
```
tensor = paddle.some_change_shape_api(tensor)
...
y = paddle.some_api(convert_var_shape(tensor)) # tensor shape has changed; this is no longer the original `s` value
```

**Solution Logic**

This cannot be solved within the old logic, so I refactored the tensor_shape_transformer logic. Now we use `s` to store the shape attribute and generate a variable `s__STATIC_CONVERT_VAR_SHAPE_SUFFIX` to store the result of the static shape API `shape(tensor)`. Given the code:
```
s = tensor.shape
...
y = paddle.some_api(s)
```
Dy2stat will change it to
```
s = tensor.shape
s__STATIC_CONVERT_VAR_SHAPE_SUFFIX = shape(tensor)
...
y = paddle.some_api(choose_shape_attr_or_api(s, s__STATIC_CONVERT_VAR_SHAPE_SUFFIX))
```
In this case, the code stays consistent with the original dygraph meaning, which fixes the change-after-assign bug.
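The change-after-assign problem can be reproduced without Paddle. The sketch below is a plain-Python illustration only: `FakeTensor`, `flatten`, `old_translation`, and `new_translation` are hypothetical stand-ins and are not the transformer's real output.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only carries a shape."""

    def __init__(self, shape):
        self.shape = list(shape)


def flatten(t):
    """Hypothetical shape-changing API: collapse all dims into one."""
    n = 1
    for d in t.shape:
        n *= d
    return FakeTensor([n])


def old_translation(tensor):
    # Old logic: `s = tensor.shape` is dropped and every later use of `s`
    # re-reads the shape, so it sees the shape AFTER the change.
    tensor = flatten(tensor)
    return list(tensor.shape)


def new_translation(tensor):
    # New logic: the shape is stored at the assignment, before the change.
    s = list(tensor.shape)
    tensor = flatten(tensor)
    return s


old_result = old_translation(FakeTensor([2, 3]))
new_result = new_translation(FakeTensor([2, 3]))
```

With a `[2, 3]` input, the old-style translation reports the flattened shape while the new-style one reports the shape the user actually assigned to `s`.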

**Code Key Note**

To help reviewers: the key change of this PR is that `self.name_to_var_shape` changes from "mapping name to shape node" to "mapping name to its STATIC_CONVERT_VAR_SHAPE_SUFFIX name". If a variable name has the SUFFIX, we can then choose between the attribute shape and the shape API. The other changes follow from this key change.

**Consideration**
The concern with this PR is that we store an extra static `shape` API result. Will that harm the speed of Dy2stat? In some cases it will, but we argue that the benefit outweighs the cost.

1. The extra calls to the static `shape` API happen when the coder assigns one shape variable to another. Take the following dygraph code as an example:
```
s1 = tensor.shape
s2 = s1
s3 = s2
...
```
Here we call the extra static `shape` API again and again; however, users seldom write code like this.

2. If the shape variable is used a lot, for example:
```
s = tensor.shape
y1 = paddle.some_api1(s)
y2 = paddle.some_api2(s)
y3 = paddle.some_api3(s)
```
Our old logic creates 3 shape API calls, but the new logic creates just 1. This is a more common user code pattern. In fact, if reviewers look at the current unit tests in this PR, they can see that the op numbers decrease after this PR. So we argue that this PR can also improve speed for this code pattern.
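The 3-versus-1 count in the second point can be mimicked with a call counter. This is a rough sketch under stated assumptions: `ShapeCounter`, `old_style`, and `new_style` are hypothetical names, and a counted Python call merely stands in for one static `shape` op.

```python
class ShapeCounter:
    """Hypothetical stand-in that counts how often a shape op is created."""

    def __init__(self):
        self.calls = 0

    def shape(self, dims):
        self.calls += 1
        return list(dims)


def old_style(counter, dims):
    # Old logic: each of the three uses of `s` becomes its own shape call.
    y1 = counter.shape(dims)
    y2 = counter.shape(dims)
    y3 = counter.shape(dims)
    return y1, y2, y3


def new_style(counter, dims):
    # New logic: one shape call at the assignment, reused by all three uses.
    s = counter.shape(dims)
    return s, s, s


old_counter, new_counter = ShapeCounter(), ShapeCounter()
old_style(old_counter, [2, 3])
new_style(new_counter, [2, 3])
old_calls, new_calls = old_counter.calls, new_counter.calls
```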

cf43a321

fix dist fleet ctr ut (#31087) · 0e4b1542

Committed by tangwei12 on Feb 22, 2021

* fix dist fleet ctr ut

Change-Id: I59bf5123c7bd47bd0e8f1ca2a26295257597c0f5

* fix dist fleet ctr ut

Change-Id: Iafcdd172364be47fe67b753774ce09af050bcbce

* Update CMakeLists.txt

0e4b1542

20 Feb, 2021 5 commits

add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel... · d5323dab

Committed by TTerror on Feb 20, 2021

add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056)

* add squeeze_op/unsqueeze_op on kunlun; fix conv op and parallel executor on kunlun; optimize lookup_table op on kunlun

* update squeeze/unsqueeze op

d5323dab

test=develop, save/load, shrink (#30625) · 16b4260b
Committed by 123malin on Feb 20, 2021
```
* test=develop, save/load, shrink
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
```
16b4260b

export paddle.static.normalize_program method. (#31072) · 4424aac6
Committed by Shibo Tao on Feb 20, 2021
```
* export paddle.static.normalize_program method. test=develop

* fix ut coverage.test=develop
```
4424aac6

[static setitem] Support the index is Tensor; step>1; step<0 .(#30949) · 5b367dab

Committed by liym27 on Feb 20, 2021

* [static setitem] support the index step > 1. tensor_a[::3] = value

* [static setitem] support the index step < 0. Eg: tensor_a[::-3] = value

* [static setitem] support the index is Tensor. eg: tensor_a[tensor_3:0:-1] = value

* Add op version.
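The indexing forms listed above follow ordinary Python slice semantics. The sketch below uses plain Python lists as a stand-in for static-graph tensors, so it shows only which positions each slice touches, not Paddle's implementation; `touched_indices` is a hypothetical helper, and the integer `start` stands in for a Tensor-valued index.

```python
def touched_indices(length, sl):
    """Return the positions a slice selects in a sequence of this length."""
    return list(range(*sl.indices(length)))


# step > 1: every third element, starting from the front
a = list(range(10))
a[::3] = [-1] * len(a[::3])

# step < 0: every third element, starting from the back
b = list(range(10))
b[::-3] = [-1] * len(b[::-3])

# variable start index with a negative step
start = 3
c = list(range(10))
c[start:0:-1] = [-1] * len(c[start:0:-1])

forward = touched_indices(10, slice(None, None, 3))
backward = touched_indices(10, slice(None, None, -3))
from_start = touched_indices(10, slice(start, 0, -1))
```

Python lists require the right-hand side of a stepped slice assignment to have exactly as many elements as the slice selects, which is why the value list is sized with `len(...)` here.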

5b367dab

Fix that convert_var_shape doesn't support slice like [0:], test=develop (#31051) · ef627ac5

Committed by Huihuang Zheng on Feb 20, 2021

As the title says, when a slice_node such as 1:3 is passed as the idx argument of convert_var_shape, it causes a syntax error, because a bare slice cannot be passed as a function argument. This PR fixes it.
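The syntax problem is easy to confirm in plain Python: `1:3` is only valid inside square brackets, while a `slice` object can be passed like any other value. The commit message does not say how the PR resolves this, so the sketch below is just one possible approach; `shape_at` is a hypothetical function and not Paddle's `convert_var_shape`.

```python
def shape_at(shape, idx=None):
    """Hypothetical helper: return the whole shape, one dim, or a sub-range.

    idx may be None, an int, or a slice object.
    """
    if idx is None:
        return list(shape)
    return shape[idx]


# A bare slice in a call is rejected by the parser.
try:
    compile("f(x, 1:3)", "<demo>", "eval")
    bare_slice_is_valid = True
except SyntaxError:
    bare_slice_is_valid = False

shape = [8, 3, 224, 224]
middle = shape_at(shape, slice(1, 3))      # same as shape[1:3]
rest = shape_at(shape, slice(0, None))     # same as shape[0:]
first = shape_at(shape, 0)                 # same as shape[0]
```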

ef627ac5

19 Feb, 2021 4 commits
- Added reshape grad bf16 (#31035) · f7465641
  Committed by Jacek Czaja on Feb 19, 2021
```
* - added Reshape grad bf16

* - Added reshape grad bf16

* - cosmetics in py
```
  f7465641
- Remove scale loss before reduce in dygraph (#30807) · 9401173e
  Committed by ShenLiang on Feb 19, 2021
  
  9401173e
- fix dataloader collate return list mix tensor and numpy array (#30904) · c4ddc3ab
  Committed by Kaipeng Deng on Feb 19, 2021
```
* fix dataloader collate return list mix tensor and numpy array. test=develop
```
  c4ddc3ab
- add offset parameter in roi_align,generate_proposals.etc ops (#30864) · 5b267474
  Committed by Guanghua Yu on Feb 19, 2021
```
* add  parameter in roi_align op
```
  5b267474
18 Feb, 2021 5 commits

add trt transpose and flatten converter (#31022) · 9b54fe41
Committed by Pei Yang on Feb 18, 2021

9b54fe41

Add Conv Transpose BF16 (#30877) · caf9d398

Committed by joanna.wozna.intel on Feb 18, 2021

* Add conv transpose BF16

* Share function GetWeightsTz

* Adjust to review and fix op compatibility

* Add bias to unique handler name

* Remove errors related to paddle enforce

* Add conv2d_transpose to bf16 list and kernel refactor

caf9d398

Refine fake_interface Error Message (#30981) · cbbe1274
Committed by Huihuang Zheng on Feb 18, 2021
```
Refine fake_interface Error Message
```
cbbe1274

Add Support for Tuple in for Loop (#30998) · c1375783

Committed by Huihuang Zheng on Feb 18, 2021

Dy2stat did not previously support a tuple as the iteration variable of a for loop. This PR adds three main cases, where `var | var.numpy()` stands for either `var` or `var.numpy()`:

1). Non-enumerate case: `for var1, var2 in var | var.numpy()` is rewritten as:

    for FOR_ITER_TUPLE_PREFIX_x in var | var.numpy():
        var1 = FOR_ITER_TUPLE_PREFIX_x[0]
        var2 = FOR_ITER_TUPLE_PREFIX_x[1]

2). Enumerate outer tuple case: `for t in enumerate(var | var.numpy())` is rewritten as:

    for FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x in enumerate(var | var.numpy()):
        t = (FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x)

3). Enumerate inner tuple case: `for i, (var1, (var2, var3)) in enumerate(var | var.numpy())` is rewritten as:

    for i, FOR_ITER_TUPLE_PREFIX_x in enumerate(var | var.numpy()):
        var1 = FOR_ITER_TUPLE_PREFIX_x[0]
        var2 = FOR_ITER_TUPLE_PREFIX_x[1][0]
        var3 = FOR_ITER_TUPLE_PREFIX_x[1][1]
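The third rewrite (enumerate with a nested tuple target) can be checked for equivalence in plain Python. This sketch uses an ordinary list of nested tuples as a stand-in for `var` or `var.numpy()`, and `item` plays the role of the generated `FOR_ITER_TUPLE_PREFIX_x` name; it assumes the rewritten loop keeps the `enumerate` call so that `i` is still the index.

```python
# Stand-in data: each element is (var1, (var2, var3)).
data = [(1, (2, 3)), (4, (5, 6))]

# Original form: nested tuple unpacking in the loop target.
original = []
for i, (var1, (var2, var3)) in enumerate(data):
    original.append((i, var1, var2, var3))

# Rewritten form: a single loop variable, unpacked by explicit indexing.
rewritten = []
for i, item in enumerate(data):
    var1 = item[0]
    var2 = item[1][0]
    var3 = item[1][1]
    rewritten.append((i, var1, var2, var3))
```

Replacing the tuple target with a single name plus index expressions leaves only simple assignments in the loop body, and both loops produce the same sequence of values.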

c1375783

Handle missing symlink method on Windows (#31006) · 2497f439
Committed by Wojciech Uss on Feb 17, 2021

2497f439

10 Feb, 2021 1 commit
- delay timeout of unnittest 'test_static_save_load'. (#30975) · 8ab29f4b
  Committed by WeiXin on Feb 10, 2021
  
  8ab29f4b

PaddlePaddle / Paddle synced successfully more than 1 year ago
