Commits · c2a4a50eb29e1c8dab90c0a5403a700bfd1f4cc7 · PaddlePaddle / Paddle

18 Jan, 2021 7 commits

fix cache key for inplaced elementwise ops (#30404) (#30478) · c2a4a50e
Committed by lidanqing on Jan 18, 2021
```
Co-authored-by: Wojciech Uss <wojciech.uss@intel.com>
```

[cherry-pick]Modify the calculation logic of LambOptimizer (#29313) (#30510) · b3fa899b

Committed by guofei on Jan 18, 2021

* Modify the calculation logic of LambOptimizer (#29313)

* Modify the calculation logic of LambOptimizer

* Modify the calculation logic of LambOptimizer

* Modify the calculation logic of LambOptimizer

[cherry-pick] add pad and concat double grad #29549 (#30432) · 5e4d54a1
Committed by ceci3 on Jan 18, 2021
```
* add pad and concat double grad

* resolve conflict
```

[cherry-pick] improve performance of cast and tril op (#30498) · de003cee

Committed by Zhang Ting on Jan 18, 2021

* add fp16 support for tril_triu op (#30186)

* add VecCastCUDAKernel (#30296)
Co-authored-by: furnace <34057289+windstamp@users.noreply.github.com>

test=develop, fix fleet.metric (#30438) (#30473) · 2c3799d1
Committed by 123malin on Jan 18, 2021
```
* test=develop, fix fleet.metrics(mse, rmse, mae)
```

Cherry-pick PR 30103. Add Inplace strategy (Output reuse Input Varbase) in... · 27c2f1ea

Committed by pangyoki on Jan 18, 2021

Cherry-pick PR 30103. Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103) (#30496)

* add view strategy on squeeze,unsqueeze,reshape,flatten

* add squeeze unittest

* add unittests

* use View strategy as name rather than Reuse Allocation

* fix view api doc

* fix format

* use core.ops when input of reshape2 is Tensor

* fix test_cross_entropy_loss error because of reshape2

* fix test_cross_entropy_loss error because of reshape2

* add inplace strategy

* add elementwise_add sub

* let backward op not use inplace

* grad op do not use inplace

* fix memory increase error and add leaf error message

* delete selected_rows

* change op_function

* little change

* solve HandleViewBetweenInputAndOutput

* add unittest and leaf error message

* merge view error

* optimize op_function_generator format and support sum inplace op

* fix format of basic_engine

* fix format for framework

* little change of variable wrapper

* add reshape, squeeze, unsqueeze, scatter api

* add relu elu tanh softmax inplace api

* fix test_squeeze_op unittest

* fix test_relu_op unittest

* fix comment problems

* delete sample code of inplace api

* add reference of grad_pending_nodes in basic_engine

* fix unittest name

* add inplace apis into wlist

* fix error message

* add PADDLE_ENFORCE for set grad op twice

* fix head file error

【Release/2.0】fix compile error in ARM subgraph (#30488) · 3e49fdcc
Committed by Wilber on Jan 18, 2021

15 Jan, 2021 9 commits

Cherry pick 30072 (#30499) · 590e718b

Committed by pangyoki on Jan 15, 2021

* Cherry-pick 30072, add dispensable input for core.ops.reshape2/expand/slice (#30072)

* add dispensable input 'shape' for core.ops.reshape2

* add dispensable inputs for core.ops.reshape2/expand/slice

* add ut

* save reshape update in pr 30180

* save reshape update v2 in pr 30180
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>

Fix float64 bug in layer norm (#30454) · c9d26423
Committed by Yang Zhang on Jan 15, 2021
```
built-in `rsqrt` is shadowed
```

add transpose double grad , cherry-pick from #29600 (#30435) · badc6f22

Committed by lijianshe02 on Jan 15, 2021

* add transpose double grad test=develop (#29600)

* add transpose double grad test=develop

* cherry-pick test=develop

【Cherry pick】 Expose paddle.static.auc, paddle.static.acc to users (#30311) · a64c7c91

Committed by Jiaqi Liu on Jan 15, 2021

* Alias from  paddle.fluid.layers.auc to paddle.static.auc (#30206)

* add alias from  fluid.layers.auc to static.auc

* Update __init__.py

* add auc into all list

* alias acc, expose to users

* add auc into 'all' list (#30310)

* add auc into 'all' list

* alias acc, expose to users

* update sample code

Support double backward rsqrt (#29589) (#30431) · 71ab8ae9
Committed by whs on Jan 15, 2021

【Cherry-Pick】add distributed_infer (#30300) (#30427) · ae75affd

Committed by 123malin on Jan 15, 2021

* test=develop, add distributed_infer (#30300)

* test=develop, add distributed_infer

* test=develop, fix unittest cmakefile conflict

* test=develop, fix test_dist_fleet_base

Cherrypick fix rnn batch size diff (#30462) · e0e98627
Committed by wawltor on Jan 15, 2021
```
* fix the rnn mask memory bug for out of read

* update the code for the rnn
```

[cherry-pick2.0]Enhance installation error message after separating AVX and... · 8ab8c620

Committed by Zhou Wei on Jan 15, 2021

 [cherry-pick2.0]Enhance installation error message after separating AVX and NO_AVX compilation #30442 

cherry-pick #30413
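1. The 30 architecture corresponds to very early GPUs; compilation for this architecture is removed in 2.0 and later.
2. Separate the avx and core_avx compilation, and improve the installation error message.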

fix jetson compile error (#30378) (#30436) · e97d5947
Committed by Shang Zhizhou on Jan 15, 2021

14 Jan, 2021 14 commits

fix flatten api grad (#30426) (#30441) · 8b5307bf
Committed by ShenLiang on Jan 14, 2021

[cherry-pick] correct the allowed dimension size (#30326) (#30433) · 35c8eaf5
Committed by lidanqing on Jan 14, 2021

skip quantizing ops in cpu inference (#30342) (#30405) · 2f16e0c6
Committed by cc on Jan 14, 2021

move 'load_op_library','LayerHelper' to 'paddle/incubate' (#30339) (#30412) · c07027e0
Committed by WeiXin on Jan 14, 2021

[Cherry-pick] Fix prune input bug of jit.save #30425 · 2cdc36f4
Committed by Chen Weihang on Jan 14, 2021
```
[Cherry-pick] Fix prune input bug of jit.save

cherry-pick of #30384
```

optimize memcpy perf for kunlun (#30291) (#30382) · 9de42be2

Committed by QingshuChen on Jan 14, 2021

* optimize memcpy perf for kunlun (#30291)

* optimize memcpy perf for kunlun

* remove useless unittest for kunlun mean

* minor

* fix bug that can't find mkldnn(kunlun) (#30394)

[cherrypick 2.0] add double grad for conv_transpose and depthwise_conv (#30429) · 1552343a

Committed by LielinJiang on Jan 14, 2021

* Add double grad for conv_transpose (#29706)

* add double grad for conv_transpose

* register cudnn conv double grad for depthwise conv (#29807)

[cherry-pick 2.0]enable MakeCipher api for inference (#30389) · ac70275a
Committed by Zhang Jun on Jan 14, 2021

Added support for inference using quantization aware trained dygraph (#30288) (#30402) · 38faed7f
Committed by alncat on Jan 14, 2021

cherry-pick 30354 (#30407) · 5d30d072
Committed by Bai Yifan on Jan 14, 2021

fix bug of celoss when using ignore_index and reduction (#30395) · c22ee575

Committed by chajchaj on Jan 14, 2021

* fix bug of celoss when using ignore_index and reduction (#30180)

* fix bug of using ignore_index and reduction,test=develop

* fix bug of celoss when using ignore_index and reduction, test=develop

* improve performance when ignore_index=-100, test=develop

* add test in test_cross_entropy_loss.py for coverage rate, test=develop

* rm comment in test_cross_entropy_loss.py, test=develop

* del  hard code of "float64" in python/paddle/nn/functional/loss.py, test=develop

* change mask to a more simplified implementation, test=develop

* del comment in python/paddle/nn/functional/loss.py, test=develop

* del hard code and change mask to a more simplified implementation, test=develop

* change mask to a more simplified implementation, test=develop

* change mask to a more simplified implementation, test=develop

fix (#30399) · e1bad4d7
Committed by Chengmo on Jan 14, 2021
```
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
```

fix compile error on ARM (#30390) · 14b60947
Committed by Wilber on Jan 14, 2021

Softmax backward optimize (#30249) (#30400) · 4cc0337f
Committed by GaoWei8 on Jan 14, 2021
```
* softmax backward optimize
```

13 Jan, 2021 9 commits

[cherry-pick] Set expected place in child thread for dataloader #30383 · 9fb5a3e5

Committed by Leo Chen on Jan 13, 2021

* set expected place in child thread for dataloader

* set device id when set tensor from numpy

* revert tensor_py change

* add compile guard

* fix ci

* fix bug

Recompute Offload (#30233) (#30372) · 3fbc3cf4
Committed by JZ-LIANG on Jan 13, 2021

Support unused parameters in dynamic graph distributed (#30224) (#30374) · 020e2431
Committed by ShenLiang on Jan 13, 2021

add amp example document (#30315) · 46a73e64
Committed by huangxu96 on Jan 13, 2021

[Cherry-pick] Remove c++ stacktrace open hint #30341 · 428c884f
Committed by Chen Weihang on Jan 13, 2021
```
[Cherry-pick] Remove c++ stacktrace open hint, cherry-pick of #30325
```

update error information (#30316) · 43636886
Committed by cnn on Jan 13, 2021

split ps with distributed (#30337) · a97ca56a
Committed by tangwei12 on Jan 13, 2021
```
Change-Id: I3c788e7576688e63181e7f01562529b85a09cc59
```

git cherry-pick the commits of operator version registries, test=release/2.0 (#30292) · 5eab1a38

Committed by 石晓伟 on Jan 13, 2021

* Register op version for grid_sampler, test=op_version (#29916)

* add op version for fake_quant and fake_dequant ops, test=op_version (#29923)

* Register op version for print, test=op_version (#29945)

* add gru op_register_version; test=op_version; (#29931)

* Register op version for coalesce_tensor. (#29940)

* register op version for conv2d_transpose, conv3d_transpose and depthwise_conv2d_transpose, test=op_version (#29937)

* add op_register_version for allclose op; test=op_version (#29968)

* register ModifyAttr for instance_norm, test=op_version (#29938)

* add op_version for flip op [test=op_version] (#30019)

* add the op version check for the elementwise ops, test=op_version (#30010)

* add the support the op version check for matmul, test=op_version (#30011)

* Revert "register ModifyAttr for instance_norm, test=op_version (#29938)"

* add REGISTER_OP_VERSION for generate_proposals, roi_align, roi_pool test=op_version (#30034)

* Fix rank_attention op_version, test=op_version (#30006)

* fix rank_attention, test=op_version

* Register op version for linspace,test=op_version (#30025)

* fix op_register_version for compare ops, test=op_version (#30007)
Co-authored-by: zhoushunjie <zhoushunjie@baidu.com>

* register ModifyAttr for instance_norm, test=op_version (#30065)

* register instance norm, test=op_version

* add trace op_register_version and fix version bug; test=op_version (#30000)

* fix a bug in op_version_registry, test=develop, test=op_version (#29994)

* Add version checking, test=op_version (#30129)

* fix a bug in gaussian_random_op version, test=release/2.0
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: cc <52520497+juncaipeng@users.noreply.github.com>
Co-authored-by: Qi Li <qili93@qq.com>
Co-authored-by: Jack Zhou <zhoushunjie@baidu.com>
Co-authored-by: Guo Sheng <whucsgs@163.com>
Co-authored-by: wangxinxin08 <69842442+wangxinxin08@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: FlyingQianMM <245467267@qq.com>
Co-authored-by: ceci3 <ceci3@users.noreply.github.com>
Co-authored-by: hutuxian <hutuxian2011@sina.cn>
Co-authored-by: chalsliu <45041955+chalsliu@users.noreply.github.com>
Co-authored-by: wangguanzhong <jerrywgz@126.com>
Co-authored-by: ShenLiang <shenliang03@baidu.com>
Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
Co-authored-by: channings <chenlingchi@baidu.com>
Co-authored-by: chentianyu03 <chentianyu03@baidu.com>
Co-authored-by: ruri <shipeng1108@163.com>

resolve #30141 (#30145) (#30345) · 0fbfbeac

Committed by Wilber on Jan 13, 2021

fix compile problem on FT
Co-authored-by: houj04 <35131887+houj04@users.noreply.github.com>

12 Jan, 2021 1 commit

[cherry]Add callback after TensorCopy (#30123) (#30268) · 9d0a1eb4

Committed by Leo Chen on Jan 12, 2021

* change to tensor copy sync

* change to tensor copy sync

* make copy_to safe when use TensorCopy

* refine code

* add ut

* add cudapinned garbagecollector

* add testcase: cpu place -> cuda pinned place

PaddlePaddle / Paddle synced successfully about 1 year ago