Commits · 36de178aaee72e3a7e3baf9d0220bc773383f1f0 · PaddlePaddle / Paddle

11 Jan, 2021 · 7 commits
- Cherry-pick 30194 30164 30201 (#30202) · 36de178a
  Committed by Wilber on Jan 11, 2021
  
  36de178a
- Skip convert tensor shape while using Paddle.shape (#30223) (#30239) · 55604248
  Committed by Aurelius84 on Jan 11, 2021
```
* fix tensor shape bug

* fix op_num

* clean code
```
  55604248
- Quantization supports 2.0 APIs (#30036) (#30257) · 393a91f1
  Committed by guofei on Jan 11, 2021
```
* Quantization supports 2.0 APIs

* Fix the error of save_quantized_model
```
  393a91f1
- [cherry-pick 2.0] optimize gradient merge (#30185) · e283dc6f
  Committed by WangXi on Jan 11, 2021
```
* Optimization grad merge performance (#29784)

* [fleet] combine amp and gradient merge, test=develop (#30086)

* fix assign_op_xpu concat_op_xpu warning (#30120)
Co-authored-by: liuyuhui <liuyuhui@baidu.com>
```
  e283dc6f
- [Cherry-pick] remove distributed prepare context (#30219) (#30256) · 1fa98c5d
  Committed by Chen Weihang on Jan 10, 2021
```
att, cherry-pick of #30219
```
  1fa98c5d
- [cherry-pick] clean redundant API alias in 2.0 - part 2 (#30244) · 70cbde83
  Committed by XiaoguangHu on Jan 10, 2021
```
* fix dynamic to static error

* delete paddle.nn.functional.assign
```
  70cbde83
- add aarch64 and sunway kunlun lib (#30027) (#30237) · eacbd488
  Committed by QingshuChen on Jan 11, 2021
```
* add aarch64 and sunway kunlun lib

* minor

* optimize elementwise_add for kunlun

* update kunlun dependence

* minor

* minor
```
  eacbd488
10 Jan, 2021 · 1 commit
- fix adamw apply gradient (#30130) (#30207) · c4cd99f3
  Committed by WangXi on Jan 10, 2021
  
  c4cd99f3
09 Jan, 2021 · 1 commit
- fix pad (#30231) · 6d1fb79d
  Committed by littletomatodonkey on Jan 09, 2021
  
  6d1fb79d
08 Jan, 2021 · 11 commits

[Cherry-Pick 2.0] In creation.assign, reuse implementation code of... · 8e788e27

Committed by liym27 on Jan 08, 2021

[Cherry-Pick 2.0] In creation.assign, reuse implementation code of layers.tensor.assign to avoid maintaining two copies of the code (#30227) (#30236)

cherry-pick #30227

8e788e27

[cherry-pick] [Dy2Stat] Don't convert to paddle.shape if var_x.shape is not... · 2ba9bdd7

Committed by liym27 on Jan 08, 2021

[cherry-pick] [Dy2Stat] Don't convert to paddle.shape if var_x.shape is not negative #29965 (#30235)

* [Cherry-Pick 2.0] [Dy2Stat] Don't convert to paddle.shape if var_x.shape is not negative (#29965)

1. When x is a Variable, call nn.shape(x) only in the following cases:
 1) The shape of x is used in a control flow condition.
 2) The dim to be used is negative.
2. When x is a Variable, but x.shape or x.shape[idx] doesn't contain a negative value, don't convert to paddle.shape()
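
The rules above can be read as a small decision function. The following is a minimal sketch of one interpretation, not the transformer's actual code: the function name `needs_runtime_shape` and its parameters are hypothetical, and a negative entry in the static shape is assumed to mean "unknown until run time".

```python
from typing import Optional, Sequence


def needs_runtime_shape(static_shape: Sequence[int],
                        idx: Optional[int] = None,
                        in_control_flow_condition: bool = False) -> bool:
    """Decide whether x.shape (or x.shape[idx]) needs a runtime shape op.

    Hypothetical helper: static_shape is the shape known at conversion
    time, where a negative entry is assumed to mean "unknown dimension".
    """
    # Rule 1.1: shapes used in a control flow condition are always
    # resolved at run time.
    if in_control_flow_condition:
        return True
    # Which dims are read: one dim for x.shape[idx], all dims for x.shape.
    dims = list(static_shape) if idx is None else [static_shape[idx]]
    # Rule 1.2 and rule 2: only negative (unknown) dims need the runtime
    # op; fully known dims can stay as plain Python integers.
    return any(d < 0 for d in dims)
```

Under this reading, `x.shape[1]` for a static shape of `[-1, 3, 224]` stays a plain integer, while `x.shape[0]` is resolved at run time.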

* [Cherry-Pick 2.0] [Dy2Stat] Use Paddle2.0 api paddle.tensor.array_* (#30156)

2ba9bdd7

[Cherry-pick] amp related PR cherry pick into Release/2.0 (#30212) · 9f7c66b4

Committed by huangxu96 on Jan 08, 2021

* Optimizer trans momentum (#29597)

* merge amp related function in Momentum from paddle.fluid.contrib.optimizer into paddle.optimizer.

* Add unittest for 2.0  Momentum API.

* fix some bugs in weight_decay.

* add alias for fluid.contrib.mixed_precision (#29562)

* add alias for fluid.contrib.mixed_precision

* add static.amp into setup.py.in (#29621)

* add static.amp into setup.py.in

* add unittest for api

* fix a bug in multi_precision_fp16 unittest. (#29756)

9f7c66b4

[cherry-pick 2.0] Fix bug: In dynamic mode, if start or end is negative,... · 5fe3da39

Committed by liym27 on Jan 08, 2021

[cherry-pick 2.0] Fix bug: In dynamic mode, if start or end is negative, __getitem__ returns wrong result (#30003) (#30146)

1. when slice_item is a slice:
 1) the start of __getitem__ should be std::max(start, 0)
 2) the end of __getitem__ should be std::min(end, dim)
2. when slice_item is an integer, it should be in [-dim_len, dim_len)
3. Fix error message to use accurate data
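
The clamping described above can be sketched in plain Python. This is an illustration under stated assumptions, not the C++ code from the PR: the helper names `normalize_slice` and `check_int_index` are hypothetical, and negative bounds are assumed to count from the end of the axis before clamping, as in ordinary Python slicing.

```python
def normalize_slice(start: int, end: int, dim: int) -> tuple:
    """Clamp a step-1 slice [start:end] for an axis of length dim."""
    # Assumption: negative bounds are offsets from the end of the axis.
    if start < 0:
        start += dim
    if end < 0:
        end += dim
    # Point 1 of the commit message: clamp after the offset is applied.
    start = max(start, 0)
    end = min(end, dim)
    return start, end


def check_int_index(index: int, dim: int) -> int:
    """Validate an integer index against [-dim, dim), return it as >= 0."""
    if not -dim <= index < dim:
        # Point 3: report the actual values in the error message.
        raise IndexError(
            f"index {index} is out of range for an axis of length {dim}")
    return index + dim if index < 0 else index
```

For example, `normalize_slice(-10, 3, 5)` yields `(0, 3)`, which matches what `list(range(5))[-10:3]` selects.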

5fe3da39

[Cherry-Pick 2.0][setitem] Support Tensor setitem in static mode (#29708) (#30104) · f46ddc0e

Committed by liym27 on Jan 08, 2021

1. Type of index: int, slice(step must be 1).

2. Type of value:
 (1) int32, int64, float32, bool;
 (2) numpy.array(int32, int64, float32, bool);<Note: float64 is not supported>
 (3) paddle.Tensor(int32, int64, float32, float64, bool);

f46ddc0e

Fix beam search bug (#29824) (#30140) · b2ca2cad

Committed by Jiaqi Liu on Jan 08, 2021

* fix beam search bug

* add dygraph unittest

* update dynamic_decode argument doc

* add warning info for state which has no lengths attribute

b2ca2cad

[Cherry-pick] [Complex] Simplify prepared op impl to improve performance (#30153) (#30215) · 0e3a1d35

Committed by Chen Weihang on Jan 08, 2021

* simplify prepared op impl to improve performance

* fix kunlun compile error

* continue fix kunlun compile error

* only transform diff place when dtype diff

* fix failed unittests

* remove useless file

* polish impl by review comment

0e3a1d35

[2.0 API CherryPick] LookAhead, ModelAverage, IndexSelect (#30205) · 3ce4d34d

Committed by 123malin on Jan 08, 2021

* Add Lookahead and ModelAverage Optimizer (#30004)

* test=develop, add model_average and lookahead

* Improve Index select cuda kernel (#30139)

* test=develop, add index_select_cuda kernel

3ce4d34d

fix paddle.pow doc, test=document_fix (#30159) (#30213) · 8d3648c8
Committed by LutaoChu on Jan 08, 2021

8d3648c8

fix syncbn convert (#30158) (#30176) · 030d678c
Committed by ceci3 on Jan 08, 2021
```
* fix syncbn convert

* add unittest
```
030d678c

[Cherry-pick] Simplify the options of spawn based on fleetrun (#30144) (#30197) · 39204d56

Committed by Chen Weihang on Jan 07, 2021

* Simplify the options of spawn based on fleetrun (#30144)

* Simplify the options of spawn based on fleetrun

* polish details

* polish doc details

* cleanup enum test=develop (#29294)
Co-authored-by: gongweibao <weibao.gong@gmail.com>

39204d56

07 Jan, 2021 · 8 commits

[cherry pick] Fix bug in paddle.save/load and paddle.static.save/load when saving large files (#30170) · bfb6f613

Committed by WeiXin on Jan 07, 2021

* Support storage of large parameters (#29988)

* Support storage of large parameters

* Reduce the complexity of the unittest

* Reduce the complexity of the unittest,commented out unittest for

* add unittest for static.save/load

* Increase the timeout threshold of 'test_static_save_load'

* Increase the timeout threshold of 'test_static_save_load'

* Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'

* Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'

* Extend the timeout for the (#30151)

bfb6f613

fix error message (#30135) (#30182) · 9f02c284
Committed by ShenLiang on Jan 07, 2021

9f02c284

[cherry pick] Some optimizations of elementwise_add, gelu and dropout for AMP (#30152) · 07f68fad

Committed by Leo Chen on Jan 07, 2021

* Improve performance of elementwise_add grad op (#29187)

* pass stop_gradient for cast op

* improve performance of elementwise_add grad

* use tensor copy async

* dygraph branch

* fix dygraph branch

* add ut

* make gelu fp16 computing more robust (#29484)

* Add fast path for dropout when p == 0  (#29553)

* add fast path for p == 0 in dropout

* add ut
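
The idea behind the `p == 0` fast path can be shown with a small pure-Python sketch. It is illustrative only: the real change lives in Paddle's dropout kernels, and the function below is a hypothetical stand-in that assumes the common "inverted dropout" scaling.

```python
import random


def dropout(values, p, training=True, rng=None):
    """Inverted dropout over a list of floats (illustrative sketch)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    # Fast path: with p == 0 nothing is dropped, so no random mask is
    # generated and no scaling is applied; the input is simply copied.
    if not training or p == 0.0:
        return list(values)
    # With p == 1 everything is dropped; this also avoids dividing by zero.
    if p == 1.0:
        return [0.0 for _ in values]
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    # General path: draw one random number per element and rescale the
    # survivors so that the expected value is unchanged.
    return [v * scale if rng.random() >= p else 0.0 for v in values]
```

The fast path pays off because the general path has to generate and apply a mask for every element even when the mask would keep everything.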

07f68fad

[Cherry-pick] Layer norm fp16 and Nvidia optimize (#29169 #29434 #29522 #29576) (#30110) · 44b81e63

Committed by furnace on Jan 07, 2021

* Layer norm fp16 (#29169)

* add fp16 for layer_norm op

* revert layernorm api

* fix forward

* fix forward

* fix backward for layernorm with fp16

* fix unit test for layernorm with fp16

* fix with_mkldnn compile error for layernorm with fp16

* 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>

* fix with_mkldnn compile error for layernorm with fp16

* fix with_mkldnn compile error for layernorm with fp16
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

* fix layer_norm accuracy (#29434)

* Layernorm opt (#29522)

* layernorm fw opt

* layernorm bw opt

* fix typo, test=develop

* remove const dim3 for windows CI compatibility

* merge develop
Co-authored-by: zlsh80826 <zlsh80826@gmail.com>

* Fix compile problem when cuda_arch < 6000 (#29576)

* fix compile problem when cuda_arch < 6000

* refine code

* refine code
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
Co-authored-by: zlsh80826 <zlsh80826@gmail.com>

44b81e63

add inference api: DisableTensorRtOps (#30109) (#30178) · cb71fea0
Committed by Shang Zhizhou on Jan 07, 2021

cb71fea0

pre padding in dygraph (#30179) · a2b0357d
Committed by tangwei12 on Jan 07, 2021
```
Change-Id: Ia5279b0cbb6a5b3970aff66e9510e0d85efa70ce
```
a2b0357d

fix xpu pe sync, test=notest (#30095) (#30114) · 85545bbc
Committed by liuyuhui on Jan 07, 2021

85545bbc

Cherry pick bn (#30136) · 157ff094

Committed by ceci3 on Jan 07, 2021

* fix bn docs (#30096)

* add attribute for batch_norm (#29950)

* add attribute for batch_norm

157ff094

06 Jan, 2021 · 5 commits

support dygraph in xpu place (#30051) (#30112) · 285f33e5

Committed by hong on Jan 06, 2021

* support dygraph in xpu place; test=develop

* fix cpu/gpu compile error; test=develop

* fix compile error; test=develop

* fix xpu compile error; testd=develop

285f33e5

Cherrypick 30071 (#30074) · 19bec2fe
Committed by gongweibao on Jan 06, 2021
```
* fix log test=release/2.0

* fix ut test=develop
```
19bec2fe

[Cherry-pick]cherry-pick to Release/2.0 (#30076) · 1ad7fcbf

Committed by huangxu96 on Jan 06, 2021

* add fp16 check into max and avg pool (#29479)

* Add ReserveSpace in dygraph batch_norm. (#29221)

* Add ReserveSpace in dygraph batch_norm.

* Add unittest for reservespace

* add float16 into adaptive_avg_pool2d check list. (#29547)

1ad7fcbf

[Cherry-Pick 2.0][Dynamic Inplace] Support ShareInplaceVersionCounterWith for... · 743649b5

Committed by liym27 on Jan 06, 2021

[Cherry-Pick 2.0][Dynamic Inplace] Support ShareInplaceVersionCounterWith for C++ Tensor (#29842) (#30105)

Before this PR, SharePlaceHolderWith shared the Tensor between different C++ Variables, which means sharing the data, shape, and inplace_version_counter_ of the Tensor.
But in some cases, sharing data and inplace_version_counter_ without sharing shape is needed. For example, the inplace op reshape can't share shape.

This PR discards SharePlaceHolderWith and exposes ShareInplaceVersionCounterWith for C++ Tensor.
This reverts commit b10ecd9d.

* Support ShareInplaceVersionCounterWith to share the same inplace version counter for VarBase
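
The distinction the message draws (share data and the version counter, but keep a separate shape) can be modelled in a few lines of Python. This is a conceptual sketch only: the classes and method names below are hypothetical and do not mirror Paddle's C++ Tensor API.

```python
class VersionCounter:
    """Counts in-place modifications of a buffer."""

    def __init__(self):
        self.version = 0

    def bump(self):
        self.version += 1


class Tensor:
    """Toy tensor: a flat data list, a shape, and a version counter."""

    def __init__(self, data, shape):
        self.data = data
        self.shape = tuple(shape)
        self.counter = VersionCounter()

    def share_data_with(self, src):
        # Alias the same buffer object, not a copy of it.
        self.data = src.data

    def share_inplace_version_counter_with(self, src):
        # Alias the same counter so in-place writes through either
        # tensor are visible to both. The shape stays independent.
        self.counter = src.counter


# An in-place reshape: the output aliases the input's buffer and counter
# but carries its own shape.
x = Tensor([1, 2, 3, 4, 5, 6], shape=(2, 3))
y = Tensor([], shape=(3, 2))
y.share_data_with(x)
y.share_inplace_version_counter_with(x)
y.data[0] = 10
y.counter.bump()
```

After the write through `y`, both `x.data[0]` and `x.counter.version` reflect the change, while `x.shape` and `y.shape` still differ.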

743649b5

[Cherry-pick 2.0] Migrate 4 APIs about array to paddle.tensor.* (#29565) (#30101) · 52caf787
Committed by liym27 on Jan 06, 2021
```
4 APIs: array_length, array_read, array_write, create_array, cherry-pick #29565
```
52caf787

05 Jan, 2021 · 7 commits

add topo-aware in heter-ps (#30087) (#30117) · 7fc2ce50
Committed by Thunderbrook on Jan 05, 2021
```
* add topo aware

* resource.h

* topo aware

* format
```
7fc2ce50

[Cherry-pick 2.0] cherry pick 3 PRs about Dynamic-to-Static (#30100) · faeee3c3

Committed by liym27 on Jan 05, 2021

* [cherry-pick 2.0] Fix unittest test_slice (#29740)

Before this commit, test_slice used the old API `dygraph_to_static_func` for Dynamic-to-Static and used Executor explicitly, which is not recommended to users.
After the fix, it uses the recommended API `paddle.jit.to_static` in place of `dygraph_to_static_func`, which won't trigger the random exception on coverage CI.

* [cherry-pick 2.0][Dy2Stat] Support grammar: for ele in var[idx] (#29541)

Support transforming `for ele in var` statements in which var is a slice of a Tensor.

* [cherry-pick 2.0][Dy2Stat] Fix bug for loop: a variable is used and created in loop, but used before created (#29769)

faeee3c3

[cherry-pick 2.0] Support dygraph quant model and avoid the scale to be infinity (#30098) · 3fe71d0a

Committed by cc on Jan 05, 2021

* fix infinite scale values (#29386)

* Support dygraph quant model (#29927)

* Avoid the scale to be infinity in quant2_int8_mkldnn_pass, test=develop
* support quantized model for paddle2.0 dygraph, test=develop
Co-authored-by: Wojciech Uss <wojciech.uss@intel.com>

3fe71d0a

fix test=release/2.0 (#30045) · 6e2066b0
Committed by gongweibao on Jan 05, 2021

6e2066b0

[cherry pick]Set FLAGS_selected_gpus for spawn (#29962) (#30097) · cda7397f

Committed by Chen Weihang on Jan 05, 2021

Set FLAGS_selected_gpus for spawn.

When the child process starts, it inherits the configuration of the main process and sets the FLAGS once, but the environment variable has not been set at that time, so FLAGS_selected_gpus stays the same as in the main process (usually empty). The flags are therefore updated manually here.

Note: a unit test was added and then removed. Its output showed that nvidia-smi on the CI machine reports only two GPUs, and more than two GPUs are needed to test this issue.
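
The ordering problem described above can be reproduced with a toy model. The sketch below is hypothetical throughout: `Flags` and `child_main` are made-up names, and a plain dict stands in for the process environment so that the example has no side effects. It does not use Paddle's real flag machinery.

```python
class Flags:
    """Stand-in for process-wide flags that read the environment once."""

    def __init__(self, env):
        # Snapshot taken at initialization; later changes to env are
        # not picked up automatically.
        self.selected_gpus = env.get("FLAGS_selected_gpus", "")


def child_main(env, inherited_flags, device_id):
    """Model a spawned worker that receives already-initialized flags."""
    # The launcher sets the variable only after the flags were created.
    env["FLAGS_selected_gpus"] = str(device_id)
    stale_value = inherited_flags.selected_gpus
    # The fix in spirit: refresh the flag by hand from the environment.
    inherited_flags.selected_gpus = env["FLAGS_selected_gpus"]
    return stale_value, inherited_flags.selected_gpus


env = {}
flags = Flags(env)
stale, fresh = child_main(env, flags, device_id=1)
```

Here `stale` is the empty value inherited from the parent and `fresh` is the value after the manual update.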

cda7397f

fix large scale memory (#30035) (#30085) · e3975223

Committed by tangwei12 on Jan 05, 2021

* memory holder optimize

Change-Id: Ic91af8ac6f2853336d28a9fbbc5e8d0c57b5d05e

* memory holder optimize

Change-Id: I2fd1c14ecc17f5d5ce88b87890381ea801e6367f

* fix large scale memory holder

Change-Id: Ief0992b02b00220e16c72cc637a56e7b5788140f

* fix large scale memory holder

Change-Id: I910142a3952ead643a5604f8f80955f3e6efe655

e3975223

[cherry-pick] Add mkldnn interpolate op, support manual enable mkldnn interpolate op (#30083) · 9a6926f5
Committed by cc on Jan 05, 2021

9a6926f5

PaddlePaddle / Paddle · synced successfully more than 1 year ago
