Commits · 640f8cf01c585170744dd48a2c285b37bddd7d3c · 机器未来 / Paddle

31 Dec, 2020 1 commit

[Cherry-pick] Disable gloo by default #29559 #29805 (#29601) · 640f8cf0

Committed by lilong12 on Dec 31, 2020

* update, test=develop (#29559)

* Disable gloo by default (#29805)

* update, test=develop

* update, test=develop

640f8cf0

29 Dec, 2020 4 commits

[Kunlun] 2.0 cherry-pick:Support for Baidu Kunlun XPU multi card training (#29713) · 847aa172

Committed by liuyuhui on Dec 29, 2020

* [Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)

* [Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)

* [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor  (#29926)

* add bkcl.so in whl for kunlun (#29947)

* [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29961)

Co-authored-by: QingshuChen <qingshu.chen714@gmail.com>

847aa172

[Cherry-pick] Complex network execute support (#29905) · 91ebc460

Committed by Chen Weihang on Dec 29, 2020

* [Complex] Add support for complex grad accumulated (#29889)

* add support for complex grad accumulated

* add unittest for coverage

* update test dtype

* remove useless blank line

* [Complex] Handle complex to real after type promotion (#29855)

* try to add fwd op input dtypes

* refactor base impl

* return tmp_ins after dygraph prepare data

* fix typo found in debug

* polish comment & add complex net test

* revert detail change

* fix unittest failed

* add complex kernel condition control

* fix xpu test failed & polish comment

* polish details by review comments

* Complex op test (#29753)

* delete no need to calculate inputs in dygraph op_test

* delete no need to calculate inputs in dygraph op_test

* change grad elementwise_mul for complex types (#29757)

* add conj op for complex types

* add conj for complex types

* add more test case

* add conj_op test

* modify conj api and impl

* add complex type for fill_constant_op xpu

* add setConstant for complex type

* remove complex conj test file

* user define grad for test_conj_op

* add test case for static mode of conj api

* modify conj doc

* change input args name to x

* remove useless codes

* conj support real types

* add conj test case for real number

* delete no need to calculate inputs in dygraph op_test

* delete no need to calculate inputs in dygraph op_test

* modify grad of mul for complex types

* fix the grads of inputs args order not match bug

* change the grad of div when complex types (#29804)

* change the grad of div when complex types

* fix the grads of inputs args order not match bug

Co-authored-by: chentianyu03 <chentianyu03@baidu.com>

91ebc460

Fix Conv2DTanspose bug when padding='same' (#29915) (#29936) · acb29ff8

Committed by LielinJiang on Dec 29, 2020
```
* fix conv_transpose bug when padding=same
```
acb29ff8

[cherry-pick] clean redundant API alias in 2.0 - part 1 #29928 (#29960) · c9c835b5

Committed by XiaoguangHu on Dec 28, 2020

* [cherry-pick] cherry-pick of PR#29928

* delete paddle.metric.chunk_eval and paddle.metric.mean_iou

* delete paddle.nn.clip and paddle.nn.clip_by_norm

* delete paddle.nn.functional.activation.hard_sigmoid and paddle.nn.functional.activation.hard_swish

* [cherry-pick] cherry-pick of PR#29928

* fix extension import error

c9c835b5

28 Dec, 2020 2 commits

[Cherry-Pick 2.0][Dy2Stat] 1. Fix bug of for-range stmts. 2. Support that step... · a8b6dd86

Committed by liym27 on Dec 28, 2020

[Cherry-Pick 2.0][Dy2Stat] 1. Fix bug of for-range stmts. 2. Support that step value is negative in for-range stmts (#29519) (#29874)

1. Fix error in _build_cond_stmt of for-range stmts.

2. Support that step value is negative in for-range stmts

3. Fix code because of the diff between Py2 and Py3

a8b6dd86
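
As a rough illustration of the negative-step case in a8b6dd86: when a `for i in range(start, stop, step)` loop is rewritten as a `while` loop, the loop condition has to depend on the sign of `step`. The sketch below is plain Python and is not Paddle's `_build_cond_stmt`; the helper name `range_as_while` is hypothetical.

```python
def range_as_while(start, stop, step):
    """Emulate `for i in range(start, stop, step)` with a while loop."""
    if step == 0:
        raise ValueError("step must not be zero")
    visited = []
    i = start
    # A positive step counts up toward stop; a negative step counts down,
    # so the comparison direction must flip with the sign of step.
    while (step > 0 and i < stop) or (step < 0 and i > stop):
        visited.append(i)
        i += step
    return visited
```

With a condition fixed to `i < stop`, a call such as `range_as_while(5, 0, -2)` would never enter the loop, which is the kind of mismatch a sign-aware condition avoids.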

[Cherry-pick] Cherry-pick of PR#29579 and PR#29617 (#29904) · 63939597

Committed by Huihuang Zheng on Dec 28, 2020

* [Dy2stat] Enable jit.save to Save Without Running (#29579)

Enable jit.save to Save Without Running.

* Modify CublasHandleHolder to Fix Random Unittest Failure. test=develop (#29617)

Change CublasHandleHolder to use PADDLE_RETRY_CUDA_SUCCESS instead of PADDLE_ENFORCE_CUDA_SUCCESS to fix a random unittest failure. The unittest log showed a CUDA allocation error in this file, which may be due to insufficient GPU resources. A similar failure was fixed this way in the past, so PADDLE_RETRY_CUDA_SUCCESS is applied here.

63939597
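
The difference between failing immediately and retrying, as described in 63939597, can be sketched generically. This is a plain-Python analogy and not the `PADDLE_RETRY_CUDA_SUCCESS` macro itself; `call_with_retry`, its parameters, `TransientError`, and `flaky_create_handle` are hypothetical names, and the retry count and delay are arbitrary.

```python
import time


class TransientError(RuntimeError):
    """Hypothetical stand-in for a recoverable failure, e.g. a busy GPU."""


def call_with_retry(fn, max_attempts=3, delay_s=0.0):
    """Call fn(); on TransientError, retry up to max_attempts times in total."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error, as an enforce would
            time.sleep(delay_s)


# Example: a call that fails twice, then succeeds on the third attempt.
calls = {"n": 0}


def flaky_create_handle():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("allocation failed")
    return "handle"


result = call_with_retry(flaky_create_handle)
```

An enforce-style check corresponds to `max_attempts=1`: the first transient failure is reported straight away.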

25 Dec, 2020 2 commits

feat: support check_nan_inf for kunlun/xpu device (#29694) (#29898) · 41917fb5

Committed by QingshuChen on Dec 25, 2020
```
* feat: support check_nan_inf for kunlun device

* support kunlun stack

* minor
```
41917fb5

2 0 ps core 2 (#29894) · f781ab08

Committed by tangwei12 on Dec 25, 2020

* add ps table (#29463)

* add ps table

Change-Id: I468a04bd071d21ff52654926fcf4d5f3da19e178

* add service (#29560)

* add service, remove ut on mac

* fix heter_profiler & add heter stop method

* fix code style

* merge pscore

Change-Id: Ie7f60d1cdde6755a0c29db26863c6283e9843d57

* fix cmake

Change-Id: I6773509a7b4ca79139ecc40b7bf3eb318ceff8bb

* fix conflit

Change-Id: I35575be0c96a8520f9d756ea7f1ff0b904a165ba

* fix conflit

Change-Id: Ic926ea0b0d67803226d51241397ba3b510226bfa

f781ab08

22 Dec, 2020 2 commits

add nearest_interp_v2 on kunlun (#29725) (#29822) · 0e2f5bb1

Committed by QingshuChen on Dec 22, 2020

* add nearest_interp_v2 on kunlun

* add nearest_interp_v2 on kunlun

Co-authored-by: TTerror <tangzhiyi11@users.noreply.github.com>

0e2f5bb1

[cherry-pick 2.0] gen nccl id socket (#29746) · 0f49e0b7

Committed by WangXi on Dec 22, 2020

* gen nccl id use socket (#29431)

* fix gen_nccl_id_op_helper compile failed, test=develop (#29614)

0f49e0b7

18 Dec, 2020 1 commit

[Cherry-pick] Add complex api conj, real and imag (#29750) · ab5cc042

Committed by Chen Weihang on Dec 18, 2020

* Add complex dtype op (add) test example (#29603)


* add op test case for complex

* polish code details

* add xpu set constant support

* fix argument error

* remove useless pyc file

* [Complex] Add real & imag op and api for complex tensor (#29672)

* add complex real op & api & unittest

* add imag op & api & unittest

* refactor op impl

* revert simplify writing due to compile failed

* polish details

* polish grad op code

* add conj op for complex types (#29527)

* add conj op for complex types

* add conj for complex types

* add more test case

* add conj_op test

* modify conj api and impl

* add complex type for fill_constant_op xpu

* add setConstant for complex type

* remove complex conj test file

* user define grad for test_conj_op

* add test case for static mode of conj api

* modify conj doc

* change input args name to x

* remove useless codes

* conj support real types

* add conj test case for real number

Co-authored-by: chentianyu03 <chentianyu03@baidu.com>

ab5cc042

17 Dec, 2020 4 commits

[cherry-pick]fix matmulv2 bug & add rebuild group & fix bug of download (#29726) · df0430dc

Committed by ShenLiang on Dec 17, 2020

* Fix the download bug in the case of multiple machines (#29551)

* fix the download bug

* add sort for ips

* Fix bug of matmul_v2 for broadcast case (#29599)

* fix bug of matmul_v2 for broadcast

* Rebuild group automatically in dynamic graph distributed (#29255)

* add tensor_indices in AssignGroupBySize

* add rebuild group in reducer

* fix error message of gather nd (#29521)

df0430dc

update activation op on kunlun (#29577) (#29717) · e82efc0c

Committed by TTerror on Dec 17, 2020

* fix expand && concat/transpose to new api

* update xpu_header

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* add nearest_interp on kunlun

* update error message

e82efc0c

remove addcmul (#28937) (#29640) · 6f873c21

Committed by Wei Shengyu on Dec 17, 2020

* remove addcmul

* remove unittest and other related code of addcmul

* fix bug

* fix merge conflict

6f873c21

[Cherry-pick][bug fix] fix train eval set error in static mode (#29540) (#29571) · 61d70277

Committed by Chen Weihang on Dec 16, 2020
```
Fix the failure to set Layer train/eval mode in static mode; for more details, see #29540
```
61d70277

16 Dec, 2020 2 commits

- fix wmt14 doc, remove backward, add bidirect direction in rnn api (#29633) (#29695) · a19e1fe8
  Committed by Jack Zhou on Dec 16, 2020
```
* fix wmt14 doc, remove backward, add bidirect direction in rnn api

* fix rnn unittest

* fix test_rnn_nets_static.py bug
```
  a19e1fe8
- support roi_align & affine_channel for kunlun (#29561) (#29657) · d82b0300
  Committed by QingshuChen on Dec 16, 2020
```
* support roi_align & affine_channel for kunlun

* minor
```
  d82b0300

15 Dec, 2020 1 commit

cherry-pick kunlun PR: 29458, 29539 (#29583) · 03ddf690

Committed by QingshuChen on Dec 15, 2020

* support mobilenet for kunlun (#29458)

* add xpu ops for training transformer in kunlun (#29539)

* 1.fix matmul bug 2. add one hot

* add xpu error msg

Co-authored-by: procr <procrboo@gmail.com>
Co-authored-by: taixiurong <taixiurong@126.com>

03ddf690

09 Dec, 2020 2 commits

- support clip op trt converter (#29411) (#29496) · 4d51cd73
  Committed by Pei Yang on Dec 09, 2020
  
  4d51cd73
- conflict (#29498) · d5ff367b
  Committed by Pei Yang on Dec 09, 2020
  
  d5ff367b

08 Dec, 2020 4 commits

[2.0 rc1/cherrypick] cherry-pick kunlun PR:29234/29229/29293/29367/29280/29448 (#29466) · 6bfc5721

Committed by liuyuhui on Dec 08, 2020

* add deformable_conv op on xpu (#29234)

* rebase develop

* update deformable_conv op on xpu

* update deformable_conv op on xpu

* update kunlun conv2d/softmax/elementwise implemetation (#29229)

* update conv2d & softmax to new xpu api
* test=kunlun

* remove useless comments
* test=kunlun

* remote softmax xpu op
* test=kunlun

* update kunlun softmax
* test=kunlun

* update xpu unitest
* test=kunlun

* fix elementwise_grad bug for kunlun
* test=kunlun

* support global pooling for kunlun (#29293)

* test=kunlun

* update reduce_sum op on xpu (#29367)

* update reduce_sum op on xpu

* update reduce_sum op on xpu

* support running on xpu

* fix expand/uniform_random && concat/transpose to new api on xpu (#29280)

* fix expand && concat/transpose to new api

* update uniform_random_op

* update xpu_header

* 1. fix elementwise ops' bug 2. fix softmax_with_cross_entropy_op 3. add biliner_interp_op (#29448)

Co-authored-by: root <root@bjhw-sys-rpm0223.bjhw.baidu.com>
Co-authored-by: 卖鱼的哲学 <tangzhiyi11@users.noreply.github.com>
Co-authored-by: QingshuChen <qingshu.chen714@gmail.com>
Co-authored-by: taixiurong <taixiurong@126.com>
Co-authored-by: root <root@bjhw-sys-rpm0223.bjhw.baidu.com>

6bfc5721

[Cherry-Pick]Fix bug where embedding can't be processed correctly in reducer (#29490) · 6b9302a2

Committed by ShenLiang on Dec 08, 2020
```
* fix the bug of reducer in embedding
```
6b9302a2

[Cherry-pick] Fix bug in gloo that gloo initialization hangs (#29449) · d8e1e50a

Committed by lilong12 on Dec 08, 2020
```
* update, test=develop (#29331)
```
d8e1e50a

revert cast eigen kernel (#29445) · 14cf420e

Committed by Zhang Ting on Dec 08, 2020

14cf420e

07 Dec, 2020 4 commits

- Fix unittest (#29412) (#29437) · c14d2c6a
  Committed by Shang Zhizhou on Dec 07, 2020
```
* fix tensorrt unittest precision error

* fix unittest precision error. test_trt_subgraph_pass && test_trt_dynamic_shape_transformer_prune
```
  c14d2c6a
- Add deform_conv2d,DeformConv2D (#29364) (#29425) · b776434c
  Committed by Bai Yifan on Dec 07, 2020
```
* add deform_conv2d,DeformConv2D
```
  b776434c
- change shape of output in cross_entropy (#29414) · d094cd02
  Committed by chajchaj on Dec 07, 2020
  
  d094cd02
- remove complexvariable (#29390) (#29417) · 9fec4bce
  Committed by chentianyu03 on Dec 07, 2020
```
* rm complexvariable

* modify test_var_base unittest

* remove duplicated codes
```
  9fec4bce

05 Dec, 2020 2 commits

[cherri-pick] Fix bug: delete wrong check_type of paddle.concat and support... · 2816f590

Committed by liym27 on Dec 05, 2020
```
[cherri-pick] Fix bug: delete wrong check_type of paddle.concat and support LoDTensorArray (#29306) (#29368)
```
2816f590

Release/2.0 rc1 (#29388) · fbb6cd70

Committed by chentianyu03 on Dec 05, 2020

* fix random failed of complex matmul

* Make transpose, trace, kron, reshape, sum op support complex type (#29321)

* add complex64 and complex128 type; add +-*/@ and slice operator for complex types

* add test cases for complex elementwise, matmul and getitem unittest

* add test cases for complex types

* add test cases for complex matmul unittest

* kron, reshape, transpose support complex types

* sum and trace op support complex types

* add test case of sum and trace op

* fix the bug of imag part of complex not initialized

* format file

* format code style

* kron support type promotion; modify test cases

fbb6cd70

04 Dec, 2020 6 commits

[Dy2stat] Reduce Exception Type for Better Error Message (#29268) (#29363) · 981244cf

Committed by Huihuang Zheng on Dec 04, 2020
```
Reduce exception type so that if convert_to_static fails, it reports the right error message.
```
981244cf

[cherry-pick 2.0rc1][inplace] Add ShareHolderWith for class Variable and... · efb5ad62

Committed by liym27 on Dec 04, 2020

[cherry-pick 2.0rc1][inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267) (#29359)

efb5ad62

[cherry-pick 2.0rc1][Dy2Stat] Fix bug: Do not use gast.Subscript to replace... · d10eb700

Committed by liym27 on Dec 04, 2020

[cherry-pick 2.0rc1][Dy2Stat] Fix bug: Do not use gast.Subscript to replace gast.Name in when transforming for_enumerate_loop (#29310) (#29361)

d10eb700

Support type promote for basic math ops (quantum required) (#29265) (#29354) · 0e7539e7

Committed by Chen Weihang on Dec 04, 2020

* basic impl of type promote

* add comment & another testcase

* fix complex bugs & support python op promote type

* fix failed unittests & polish code

* add unittest for coverage

* change to only promote complex type

* polish code details

* polish several comments

0e7539e7

[Cheery-Pick 2.0.0-rc1][Dy2stat] Add a decorator paddle.jit.not_to_static to... · 8e0d688a

Committed by liym27 on Dec 04, 2020

[Cheery-Pick 2.0.0-rc1][Dy2stat] Add a decorator paddle.jit.not_to_static to support that not to convert a function in Dynamic-to-Static. (#29253) (#29340)

Usage scenarios: when a function can already run successfully in static mode, you can decorate it with paddle.jit.not_to_static in the following cases:
  1. An unknown error occurs in the dynamic-to-static conversion process of the function;
  2. In the internal implementation of the function, it has two branches: dynamic branch and static branch;
  3. Users don't want to convert the function in the process of dynamic to static.

8e0d688a
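
The opt-out idea behind the decorator in 8e0d688a can be illustrated with a toy marker decorator. This is a plain-Python analogy under stated assumptions and does not show how `paddle.jit.not_to_static` is implemented: `skip_conversion`, `toy_convert`, and the `_skip_conversion` attribute are hypothetical names invented for this sketch.

```python
def skip_conversion(fn):
    """Hypothetical marker: tag fn so the toy converter leaves it untouched."""
    fn._skip_conversion = True
    return fn


def toy_convert(fn):
    """Toy 'converter': wraps fn unless it was marked to be skipped."""
    if getattr(fn, "_skip_conversion", False):
        return fn  # leave the function exactly as written

    def converted(*args, **kwargs):
        # A real dynamic-to-static converter would rewrite the function here;
        # this sketch only forwards the call.
        return fn(*args, **kwargs)

    converted._converted = True
    return converted


@skip_conversion
def already_static_safe(x):
    return x + 1


def ordinary(x):
    return x * 2
```

Here `toy_convert(already_static_safe)` returns the original function object, while `toy_convert(ordinary)` returns a wrapper, which mirrors the three usage scenarios listed in the commit message: the marked function is simply passed over.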

use has_grad instead of train_mode (#29309) (#29346) · 0a7c7c1c

Committed by Leo Chen on Dec 04, 2020
```
* use has_grad instead of train_mode

* add vlog for debug

* fix ut

* fix ut
```
0a7c7c1c

03 Dec, 2020 3 commits

Move temporal_shift to paddle.nn.functional (#29261) (#29315) · f616daaa

Committed by LielinJiang on Dec 03, 2020
```
* move temporal_shift to functional
```
f616daaa

[cherry-pick]Change the api of DataParallel and Fleet (#29288) · ec57656e

Committed by ShenLiang on Dec 03, 2020
```
* Change the api of DataParallel and Fleet (#29224)
```
ec57656e

[Cherry-pick] Add pure fp16 training with master weights. (#29301) · d8ea8a06

Committed by Zhen Wang on Dec 03, 2020

* Add pure fp16 training with master weights. (#27712)

* add the weight decay func for the momentum op

* Add the multi_precision function in Momentum Optimizer.

* Make sure that the initial value of master weights are same with the fp16 weights.

* add static loss scaling.

* add the rescale_grad function in the pure fp16 training.

* use the original momentum updating method.

* Polish some codes, such as variable names.

* add docstring for apis.

* update the var creation details of _create_master_weight.

* not modify codes about imperative momentum updating.

* Fix the error of test_dist_sparse_tensor_load_momentum UT.

* add unit test for multi precision fp16 training.

* add more unit tests for CI.

* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.

d8ea8a06
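
To see why d8ea8a06 pairs pure fp16 training with master weights, the following minimal sketch compares updating a weight stored only in half precision with updating a single-precision master copy that is cast down afterwards. It is an assumption-laden illustration, not Paddle's Momentum optimizer: it uses plain SGD without momentum, a constant gradient, and the standard-library `struct` module to emulate fp16 and fp32 rounding; the helper names and the values of `lr`, `grad`, and `steps` are hypothetical.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


lr, grad, steps = 1e-4, 1.0, 10

# Pure fp16: the weight itself is stored and updated in half precision.
w_fp16_only = to_fp16(1.0)
for _ in range(steps):
    w_fp16_only = to_fp16(w_fp16_only - lr * grad)

# fp16 with a master weight: update an fp32 copy, then cast down for compute.
master = to_fp32(1.0)  # starts equal to the fp16 weight
for _ in range(steps):
    master = to_fp32(master - lr * grad)
w_fp16_from_master = to_fp16(master)
```

In this setup each update of `1e-4` is smaller than half the fp16 spacing just below 1.0, so the fp16-only weight never moves, whereas the fp32 master copy accumulates the updates and the fp16 weight derived from it does change.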

机器未来 / Paddle is up to date with the upstream project it was forked from
