Commits · ae75affd34b5308f8456bed4ecb94bba6321cb24 · PaddlePaddle / Paddle

15 Jan, 2021 · 4 commits

【Cherry-Pick】add distributed_infer (#30300) (#30427) · ae75affd

Committed by 123malin on Jan 15, 2021

* test=develop, add distributed_infer (#30300)

* test=develop, add distributed_infer

* test=develop, fix unittest cmakefile conflict

* test=develop, fix test_dist_fleet_base

ae75affd

Cherrypick fix rnn batch size diff (#30462) · e0e98627
Committed by wawltor on Jan 15, 2021
```
* fix the rnn mask memory bug for out of read

* update the code for the rnn
```
e0e98627

[cherry-pick2.0]Enhance installation error message after separating AVX and... · 8ab8c620

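Committed by Zhou Wei on Jan 15, 2021

[cherry-pick2.0]Enhance installation error message after separating AVX and NO_AVX compilation #30442

cherry-pick #30413
1. Architecture 30 corresponds to very early GPUs; compilation for it is removed in 2.0 and later.
2. Separate the avx and core_avx builds and improve the installation error message.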

8ab8c620

fix jetson compile error (#30378) (#30436) · e97d5947
Committed by Shang Zhizhou on Jan 15, 2021

e97d5947

14 Jan, 2021 · 14 commits

fix flatten api grad (#30426) (#30441) · 8b5307bf
Committed by ShenLiang on Jan 14, 2021

8b5307bf

[cherry-pick] correct the allowed dimension size (#30326) (#30433) · 35c8eaf5
Committed by lidanqing on Jan 14, 2021

35c8eaf5

skip quantizing ops in cpu inference (#30342) (#30405) · 2f16e0c6
Committed by cc on Jan 14, 2021

2f16e0c6

move 'load_op_library','LayerHelper' to 'paddle/incubate' (#30339) (#30412) · c07027e0
Committed by WeiXin on Jan 14, 2021

c07027e0

[Cherry-pick] Fix prune input bug of jit.save #30425 · 2cdc36f4
Committed by Chen Weihang on Jan 14, 2021
```
[Cherry-pick] Fix prune input bug of jit.save

cherry-pick of #30384
```
2cdc36f4

optimize memcpy perf for kunlun (#30291) (#30382) · 9de42be2

Committed by QingshuChen on Jan 14, 2021

* optimize memcpy perf for kunlun (#30291)

* optimize memcpy perf for kunlun

* remove useless unit test for kunlun mean

* minor

* fix bug that can't find mkldnn(kunlun) (#30394)

9de42be2

[cherrypick 2.0] add double grad for conv_transpose and depthwise_conv (#30429) · 1552343a

Committed by LielinJiang on Jan 14, 2021

* Add double grad for conv_transpose (#29706)

* add double grad for conv_transpose

* register cudnn conv double grad for depthwise conv (#29807)

1552343a

[cherry-pick 2.0]enable MakeCipher api for inference (#30389) · ac70275a
Committed by Zhang Jun on Jan 14, 2021

ac70275a

Added support for inference using quantization aware trained dygraph (#30288) (#30402) · 38faed7f
Committed by alncat on Jan 14, 2021

38faed7f

cherry-pick 30354 (#30407) · 5d30d072
Committed by Bai Yifan on Jan 14, 2021

5d30d072

fix bug of celoss when using ignore_index and reduction (#30395) · c22ee575

Committed by chajchaj on Jan 14, 2021

* fix bug of celoss when using ignore_index and reduction (#30180)

* fix bug of using ignore_index and reduction,test=develop

* fix bug of celoss when using ignore_index and reduction, test=develop

* improve performance when ignore_index=-100, test=develop

* add test in test_cross_entropy_loss.py for coverage rate, test=develop

* rm comment in test_cross_entropy_loss.py, test=develop

* del  hard code of "float64" in python/paddle/nn/functional/loss.py, test=develop

* change mask to a more simplified implementation, test=develop

* del comment in python/paddle/nn/functional/loss.py, test=develop

* del hard code and change mask to a more simplified implementation, test=develop

* change mask to a more simplified implementation, test=develop

* change mask to a more simplified implementation, test=develop

c22ee575
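
As an illustration of the behaviour this fix concerns, here is a minimal pure-Python sketch of cross-entropy with `ignore_index` and `reduction`. It is not Paddle's implementation: the function name, the masking approach, and the convention that `reduction="mean"` divides by the number of non-ignored samples are assumptions made for this example.

```python
import math


def cross_entropy(logits, labels, ignore_index=-100, reduction="mean"):
    """Hypothetical reference cross-entropy over a batch of logit rows."""
    losses = []  # per-sample loss, 0.0 for ignored samples
    mask = []    # 1.0 for samples that count, 0.0 for ignored ones
    for row, label in zip(logits, labels):
        if label == ignore_index:
            losses.append(0.0)
            mask.append(0.0)
            continue
        # Numerically stable log-sum-exp of the row.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        losses.append(log_sum_exp - row[label])
        mask.append(1.0)

    if reduction == "none":
        return losses
    total = sum(losses)
    if reduction == "sum":
        return total
    if reduction == "mean":
        # Assumption: average over the samples that were not ignored.
        count = sum(mask)
        return total / count if count > 0 else 0.0
    raise ValueError("reduction must be 'none', 'sum' or 'mean'")
```

With this convention an ignored sample contributes neither to the numerator nor to the denominator of the mean, so adding ignored samples to a batch leaves the mean loss unchanged.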

fix (#30399) · e1bad4d7
Committed by Chengmo on Jan 14, 2021
```
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
```
e1bad4d7

fix compile error on ARM (#30390) · 14b60947
Committed by Wilber on Jan 14, 2021

14b60947

Softmax backward optimize (#30249) (#30400) · 4cc0337f
Committed by GaoWei8 on Jan 14, 2021
```
* softmax backward optimize
```
4cc0337f

13 Jan, 2021 · 9 commits

[cherry-pick] Set expected place in child thread for dataloader #30383 · 9fb5a3e5

Committed by Leo Chen on Jan 13, 2021

* set expected place in child thread for dataloader

* set device id when set tensor from numpy

* revert tensor_py change

* add compile guard

* fix ci

* fix bug

9fb5a3e5

Recompute Offload (#30233) (#30372) · 3fbc3cf4
Committed by JZ-LIANG on Jan 13, 2021

3fbc3cf4

Support unused parameters in dynamic graph distributed (#30224) (#30374) · 020e2431
Committed by ShenLiang on Jan 13, 2021

020e2431

add amp example document (#30315) · 46a73e64
Committed by huangxu96 on Jan 13, 2021

46a73e64

[Cherry-pick] Remove c++ stacktrace open hint #30341 · 428c884f
Committed by Chen Weihang on Jan 13, 2021
```
[Cherry-pick] Remove c++ stacktrace open hint, cherry-pick of #30325
```
428c884f

update error information (#30316) · 43636886
Committed by cnn on Jan 13, 2021

43636886

split ps with distributed (#30337) · a97ca56a
Committed by tangwei12 on Jan 13, 2021
```
Change-Id: I3c788e7576688e63181e7f01562529b85a09cc59
```
a97ca56a

git cherry-pick the commits of operator version registries, test=release/2.0 (#30292) · 5eab1a38

Committed by 石晓伟 on Jan 13, 2021

* Register op version for grid_sampler, test=op_version (#29916)

* add op version for fake_quant and fake_dequant ops, test=op_version (#29923)

* Register op version for print, test=op_version (#29945)

* add gru op_register_version; test=op_version; (#29931)

* Register op version for coalesce_tensor. (#29940)

* register op version for conv2d_transpose, conv3d_transpose and depthwise_conv2d_transpose, test=op_version (#29937)

* add op_register_version for allclose op; test=op_version (#29968)

* register ModifyAttr for instance_norm, test=op_version (#29938)

* add op_version for flip op [test=op_version] (#30019)

* add the op version check for the elementwise ops, test=op_version (#30010)

* add the support the op version check for matmul, test=op_version (#30011)

* Revert "register ModifyAttr for instance_norm, test=op_version (#29938)"

* add REGISTER_OP_VERSION for generate_proposals, roi_align, roi_pool test=op_version (#30034)

* Fix rank_attention op_version, test=op_version (#30006)

* fix rank_attention, test=op_version

* Register op version for linspace,test=op_version (#30025)

* fix op_register_version for compare ops, test=op_version (#30007)
Co-authored-by: zhoushunjie <zhoushunjie@baidu.com>

* register ModifyAttr for instance_norm, test=op_version (#30065)

* register instance norm, test=op_version

* add trace op_register_version and fix version bug; test=op_version (#30000)

* fix a bug in op_version_registry, test=develop, test=op_version (#29994)

* Add version checking, test=op_version (#30129)

* fix a bug in gaussian_random_op version, test=release/2.0
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: cc <52520497+juncaipeng@users.noreply.github.com>
Co-authored-by: Qi Li <qili93@qq.com>
Co-authored-by: Jack Zhou <zhoushunjie@baidu.com>
Co-authored-by: Guo Sheng <whucsgs@163.com>
Co-authored-by: wangxinxin08 <69842442+wangxinxin08@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: FlyingQianMM <245467267@qq.com>
Co-authored-by: ceci3 <ceci3@users.noreply.github.com>
Co-authored-by: hutuxian <hutuxian2011@sina.cn>
Co-authored-by: chalsliu <45041955+chalsliu@users.noreply.github.com>
Co-authored-by: wangguanzhong <jerrywgz@126.com>
Co-authored-by: ShenLiang <shenliang03@baidu.com>
Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
Co-authored-by: channings <chenlingchi@baidu.com>
Co-authored-by: chentianyu03 <chentianyu03@baidu.com>
Co-authored-by: ruri <shipeng1108@163.com>

5eab1a38

resolve #30141 (#30145) (#30345) · 0fbfbeac

Committed by Wilber on Jan 13, 2021

fix compile problem on FT
Co-authored-by: houj04 <35131887+houj04@users.noreply.github.com>

0fbfbeac

12 Jan, 2021 · 9 commits

[cherry]Add callback after TensorCopy (#30123) (#30268) · 9d0a1eb4

Committed by Leo Chen on Jan 12, 2021

* change to tensor copy sync

* change to tensor copy sync

* make copy_to safe when use TensorCopy

* refine code

* add ut

* add cudapinned garbagecollector

* add testcase: cpu place -> cuda pinned place

9d0a1eb4

【Cherry-Pick】Fix device_context & Save Tensor & Gloo (#30336) · 284bae99

Committed by Chengmo on Jan 12, 2021

* Fix server.h include device_context (#30243)

* fix cmake
Co-authored-by: seiriosPlus <tangwei12@baidu.com>

* 【Paddle.Fleet】Support local save sparse param (#30175)

* add save tensor support
Co-authored-by: seiriosPlus <tangwei12@baidu.com>

* add sparse embedding & load vars for 2.0 & gloo bug fix (#30306)

* add sparse embedding & load vars for 2.0

Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b

* fix hdfs gloo

Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6

* fix gloo hdfs

Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e

* move loadvar/sparse embedding from incubate to static

Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0
Co-authored-by: tangwei12 <tangwei12@baidu.com>

284bae99

[2.0 Cherry-pick]fix 2.0 error message (#30332) · df67b317

Committed by swtkiwi on Jan 12, 2021

* fix datanorm error msg (#30294)

* Optimize the error message of framework. (#30134)

* modify error message based on comments (#30189)

* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.

* fix enforce msg of sum xpu op (#30113)

* enhance error info for py_func (#30138)

* enhance error info for py_func

* update

* fix elugradgrad test fail & error message opt (#30171)

* fix elugradgrad test fail and error message opt

* fix unitest,test=develop

* Update prroi_pool_op.h

fix error message

* opt message,test=develop

* fix ci fail,test=develop

* Refine PADDLE_ENFORCE Error Messages. test=develop (#30149)

Improve some error messages in parallel_executor.cc, conditional_block_op.cc, recurrent_op.cc

* enhance error message, test=develop (#30220)

* fix error message for distribute_fpn_proposals_op (#30116)

* enhance error msgs of fusion_seqpool_cvm_concat_op.cc, test=develop (#30240)

* just add the op error message for the matmul xpu (#30246)

add the op error message for the matmul xpu

* enhance error message of nll_loss op test=develop (#30125)

* enhance error message of nll_loss op test=develop
Co-authored-by: yaoxuefeng <yaoxuefeng@baidu.com>
Co-authored-by: xiemoyuan <71377852+xiemoyuan@users.noreply.github.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jack Zhou <zhoushunjie@baidu.com>
Co-authored-by: Wilber <jiweibo@baidu.com>
Co-authored-by: Double_V <liuvv0203@163.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: zhang wenhui <frankwhzhang@126.com>
Co-authored-by: wangguanzhong <jerrywgz@126.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: lijianshe02 <48898730+lijianshe02@users.noreply.github.com>

df67b317

[cherry-pick] use cuda generator in bernoulli cuda kernel (#30199) #30286 · e7cbc43f
Committed by Leo Chen on Jan 12, 2021
```
[cherry-pick] use cuda generator in bernoulli cuda kernel (#30199)
```
e7cbc43f

cherry pick tensor table (#30221) · 330aea6e
Committed by Chengmo on Jan 12, 2021

330aea6e

[cherry-pick]memory optimization for fuse pattern of elemwise_add + act (#30303) · b207b8a7

Committed by wangchaochaohu on Jan 12, 2021

* reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op (relu Op for example) (#29885)

* register OPMaker and Infer Shape Check for fused_elementwise_add (#30259)

b207b8a7

[Cherry-pick]Fix the accuracy problem of allclose op when using float64 data... · 2db79f0a

Committed by Zhen Wang on Jan 12, 2021

[Cherry-pick]Fix the accuracy problem of allclose op when using float64 data type in static mode.(#29890) (#30313)

* Fix the accuracy problem of allclose op when using float64 data type in static mode.

* Format the code style.

2db79f0a
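
The commit message does not describe the root cause of the float64 accuracy problem. The sketch below only illustrates, in plain Python, why an allclose-style check on float64 data can be sensitive to the precision in which its tolerances are held. It assumes the usual tolerance test `|x - y| <= atol + rtol * |y|`; `to_float32` and this `allclose` are hypothetical helpers written for the example, not Paddle APIs.

```python
import struct


def to_float32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def allclose(xs, ys, rtol=1e-05, atol=1e-08):
    """Check |x - y| <= atol + rtol * |y| for every pair of elements."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(xs, ys))


# A difference that sits just inside the float64 tolerance of 1e-5.
x = [1.0 + 9.9999999e-6]
y = [1.0]

# Tolerance kept in float64 versus the same tolerance rounded to float32.
print(allclose(x, y, rtol=1e-5, atol=0.0))
print(allclose(x, y, rtol=to_float32(1e-5), atol=0.0))
```

The two calls disagree because `1e-5` rounded to float32 is slightly smaller than `1e-5` in float64, which is enough to flip a borderline comparison.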

[Cherry-pick] Complex grad for matmul, kron and type promotion (#30304) · 7346edc2

Committed by chentianyu03 on Jan 12, 2021

* complex gradient matmul  (#29966)

* dot op support complex types

* matmul support complex types

* add test case

* matmul broadcast gradient support complex

* move conjFunctor to complex_functor.h

* change the kron gradient when complex types (#29995)

* type promotion for grad (#30177)

* type promotion for grad

* add type promotion for div op

7346edc2

Delete incorrect warning message (#30196) (#30262) · 501b11de
Committed by LielinJiang on Jan 12, 2021
```
* fix warning and no grad
```
501b11de

11 Jan, 2021 · 4 commits

[Cherry-Pick] Support vector<double> as type of op attribute and op set_value... · d839761e

Committed by liym27 on Jan 11, 2021

[Cherry-Pick] Support vector<double> as type of op attribute and op set_value support vector<double> as value (#30126) (#30305)

Cherry-Pick #30126
1. Support vector<float64> as type of op attribute.
2. op set_value supports float64 numpy.array

d839761e

[cherry-pick] Async drop scope in executor (#29714) #30285 · 93ce7f69
Committed by Leo Chen on Jan 11, 2021
```
[cherry-pick] Async drop scope in executor (#29714)
```
93ce7f69

[Cherry-Pick 2.0] Check the rank of input in kernel of set_value op (#30147) (#30301) · a2bbd06a
Committed by liym27 on Jan 11, 2021
```
cherry-pick #30147: for op set_value, check that the input's rank < 7
```
a2bbd06a

[cherry pick] Add detailed error message for curandStatus_t, cublasStatus_t,... · 04cc659c

Committed by WeiXin on Jan 11, 2021

[cherry pick] Add detailed error message for curandStatus_t, cublasStatus_t, cusolverStatus_t (#30161) (#30280)

Add detailed error messages for curandStatus_t, cublasStatus_t, and cusolverStatus_t.
Original PR: #30161

04cc659c

PaddlePaddle / Paddle synced successfully about 1 year ago