Commits · 7fc2ce50908da9dd3d3a6bc03880ccaec21c7eab · BaiXuePrincess / Paddle

05 Jan, 2021 6 commits

add topo-aware in heter-ps (#30087) (#30117) · 7fc2ce50
Committed by Thunderbrook on Jan 05, 2021
```
* add topo aware

* resource.h

* topo aware

* format
```
7fc2ce50

[Cherry-pick 2.0] cherry pick 3 PRs about Dynamic-to-Static (#30100) · faeee3c3

Committed by liym27 on Jan 05, 2021

* [cherry-pick 2.0] Fix unitest test_slice (#29740)

Before this commit, test_slice used the old API `dygraph_to_static_func` for Dynamic-to-Static conversion and used Executor explicitly, which is not recommended for users.
After the fix, the recommended API `paddle.jit.to_static` replaces `dygraph_to_static_func`, which no longer triggers the random exception on coverage CI.

* [cherry-pick 2.0][Dy2Stat] Support grammar: for ele in var[idx] (#29541)

Support transforming `for ele in var` statements in which `var` is a slice of a Tensor.

* [cherry-pick 2.0][Dy2Stat] Fix bug for loop: a variable is used and created in loop, but used before created (#29769)

faeee3c3

[cherry-pick 2.0] Support dygraph quant model and avoid the scale to be infinity (#30098) · 3fe71d0a

Committed by cc on Jan 05, 2021

* fix infinite scale values (#29386)

* Support dygraph quant model (#29927)

* Avoid the scale to be infinity in quant2_int8_mkldnn_pass, test=develop
* support quantized model for paddle2.0 dygraph, test=develop
Co-authored-by: Wojciech Uss <wojciech.uss@intel.com>

3fe71d0a

fix test=release/2.0 (#30045) · 6e2066b0
Committed by gongweibao on Jan 05, 2021

6e2066b0

[cherry pick]Set FLAGS_selected_gpus for spawn (#29962) (#30097) · cda7397f

Committed by Chen Weihang on Jan 05, 2021

Set FLAGS_selected_gpus for spawn.

When the child process starts, it inherits the configuration of the main process and sets the FLAGS once. The environment variable has not been set at that point, so FLAGS_selected_gpus stays the same as in the main process (usually empty). The flags are therefore updated manually here.

Note: a unit test was added and then removed. Its output showed that nvidia-smi on the CI machine reports only two cards, and more than two cards are needed to test this issue.

cda7397f
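
The ordering problem described in this commit can be illustrated with a minimal Python sketch. It is not Paddle's implementation: `init_flags_from_env` and `child_entry` are hypothetical helpers, the `flags` dictionary stands in for the framework's flag store, and a plain dictionary plays the role of the process environment.

```python
def init_flags_from_env(flags, environ):
    """Copy matching environment variables into the flag store (hypothetical helper)."""
    for name in list(flags):
        if name in environ:
            flags[name] = environ[name]


def child_entry(flags, environ, device_id):
    """Simulate a spawned child whose flags are initialised before its env var exists."""
    # Step 1: flags are set once from the inherited (still empty) environment.
    init_flags_from_env(flags, environ)
    # Step 2: only now does the child record which device it owns.
    environ["FLAGS_selected_gpus"] = str(device_id)
    stale = flags["FLAGS_selected_gpus"]  # still the main-process value
    # Step 3 (the fix sketched here): refresh the flag manually.
    flags["FLAGS_selected_gpus"] = environ["FLAGS_selected_gpus"]
    return stale, flags["FLAGS_selected_gpus"]


stale, fresh = child_entry({"FLAGS_selected_gpus": ""}, {}, device_id=3)
print(repr(stale), repr(fresh))
```

Without step 3 the child would keep the stale, usually empty, value it inherited from the main process.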

[cherry-pick] Add mkldnn interpolate op, support manual enable mkldnn interpolate op (#30083) · 9a6926f5
Committed by cc on Jan 05, 2021

9a6926f5

04 Jan, 2021 1 commit
- [cherry pick 2.0]support deepcopy for Layer/Tensor/Paramerbase (#29387) (#29873) · c06350c9
  Committed by Zhou Wei on Jan 04, 2021
```
* support deepcopy for Layer/Tensor/Paramerbase

* fix some code
```
  c06350c9
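
What deepcopy support means for a layer that owns parameters can be sketched in plain Python. The `Parameter` and `Layer` classes below are hypothetical stand-ins, not Paddle's classes; only `copy.deepcopy` and the `__deepcopy__` hook are standard Python.

```python
import copy


class Parameter:
    """Hypothetical parameter holding a list of values."""

    def __init__(self, values, trainable=True):
        self.values = list(values)
        self.trainable = trainable

    def __deepcopy__(self, memo):
        # Build an independent copy; nested data is copied through the same memo.
        return Parameter(copy.deepcopy(self.values, memo), self.trainable)


class Layer:
    """Hypothetical layer that references one parameter twice."""

    def __init__(self):
        self.weight = Parameter([1.0, 2.0])
        self.alias = self.weight  # second reference to the same object


layer = Layer()
clone = copy.deepcopy(layer)
clone.weight.values[0] = 99.0  # does not affect the original layer
print(layer.weight.values, clone.weight.values)
```

Because `copy.deepcopy` tracks already-copied objects in `memo`, the two references inside the clone still point at one shared copy.
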
31 Dec, 2020 5 commits
- add the paddle.distributed.split api (#29970) (#30041) · 84c2315a
  Committed by lilong12 on Dec 31, 2020
```
* add distributed.split, test=develop
```
  84c2315a
- fix the bug in pipeline data parallelism (#29731) (#29918) · f0e04e1f
  Committed by lilong12 on Dec 31, 2020
```
* update, test=develop
```
  f0e04e1f
- [Cherry-pick] Disable gloo by default #29559 #29805 (#29601) · 640f8cf0
  Committed by lilong12 on Dec 31, 2020
```
* update, test=develop (#29559)

* Disable gloo by default (#29805)

* update, test=develop

* update, test=develop
```
  640f8cf0
- [cherry-pick] hardsigmoid add attr slope and offset (#29999) (#30032) · 38f83788
  Committed by zhupengyang on Dec 31, 2020
```
test=develop
```
  38f83788
- [cherry-pick] add alias for upsample (#29984) · 5d5faba8
  Committed by xiaoting on Dec 31, 2020
```
* add alias for upsample, test=develop

* add alias for upsample

* fix example
```
  5d5faba8
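
For the hardsigmoid commit in the list above (38f83788), the role of the `slope` and `offset` attributes can be sketched as a scalar function. This is a minimal illustration of the usual hard-sigmoid definition, `clip(slope * x + offset, 0, 1)`, not Paddle's kernel; the default values used here are assumptions.

```python
def hardsigmoid(x, slope=0.1666667, offset=0.5):
    """Piecewise-linear sigmoid approximation: clip(slope * x + offset, 0, 1)."""
    return min(1.0, max(0.0, slope * x + offset))


# A steeper slope saturates sooner; the offset shifts the value at x = 0.
for x in (-10.0, 0.0, 1.0, 10.0):
    print(x, hardsigmoid(x), hardsigmoid(x, slope=0.2, offset=0.5))
```
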
30 Dec, 2020 3 commits
- fix the state_dict bug for the xpu (#30008) · 9859afa9
  Committed by wawltor on Dec 30, 2020
  
  9859afa9
- [cherry-pick] Fix 2.0 bugs (#29992) · faf2bb39
  Committed by Chen Long on Dec 30, 2020
```
* fix doc bugs test=document_fix

* fix code bugs test=document_fix

* fix code bugs test=document_fix

* fix doc bugs test=document_fix

* fix doc bugs test=document_fix

* fix doc bugs test=document_fix
```
  faf2bb39
- Fix rotation bug when use cv2 backend (#29933) (#29982) · d6a4f89a
  Committed by LielinJiang on Dec 30, 2020
```
* fix cv2 rotation
```
  d6a4f89a
29 Dec, 2020 5 commits

[Kunlun] 2.0 cherry-pick:Support for Baidu Kunlun XPU multi card training (#29713) · 847aa172

Committed by liuyuhui on Dec 29, 2020

* [Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)

* [Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)

* [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor  (#29926)

* add bkcl.so in whl for kunlun (#29947)

* [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor  (#29961)
Co-authored-by: QingshuChen <qingshu.chen714@gmail.com>

847aa172

[Cherry-pick] Complex network execute support (#29905) · 91ebc460

Committed by Chen Weihang on Dec 29, 2020

* [Complex] Add support for complex grad accumulated (#29889)

* add support for complex grad accumulated

* add unittest for coverage

* update test dtype

* remove useless blank line

* [Complex] Handle complex to real after type promotion (#29855)

* try to add fwd op input dtypes

* refactor base impl

* return tmp_ins after dygraph prepare data

* fix typo found in debug

* polish comment & add complex net test

* revert detail change

* fix unittest failed

* add complex kernel condition control

* fix xpu test failed & polish comment

* polish details by review comments

* Complex op test (#29753)

* delete no need to calculate inputs in dygraph op_test

* delete no need to calculate inputs in dygraph op_test

* change grad elementwise_mul for complex types (#29757)

* add conj op for complex types

* add conj for complex types

* add more test case

* add conj_op test

* modify conj api and impl

* add complex type for fill_constant_op xpu

* add setConstant for complex type

* remove complex conj test file

* user define grad for test_conj_op

* add test case for static mode of conj api

* modify conj doc

* change input args name to x

* remove useless codes

* conj support real types

* add conj test case for real number

* delete no need to calculate inputs in dygraph op_test

* delete no need to calculate inputs in dygraph op_test

* modify grad of mul for complex types

* fix the grads of inputs args order not match bug

* change the grad of div when complex types (#29804)

* change the grad of div when complex types

* fix the grads of inputs args order not match bug
Co-authored-by: chentianyu03 <chentianyu03@baidu.com>

91ebc460
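
The gradient changes for complex multiplication and division listed in this commit can be illustrated with Python's built-in `complex` type. The sketch below assumes the convention, used by several autodiff frameworks, that the upstream gradient is combined with the complex conjugate of the other operand; the function names are hypothetical and this is not Paddle's kernel code.

```python
def mul_grad(x, y, grad_out):
    """Gradients of out = x * y under the conjugate convention (assumed here)."""
    return grad_out * y.conjugate(), grad_out * x.conjugate()


def div_grad(x, y, grad_out):
    """Gradients of out = x / y under the same assumed convention."""
    out = x / y
    grad_x = grad_out / y.conjugate()
    grad_y = -grad_out * (out / y).conjugate()
    return grad_x, grad_y


gx, gy = mul_grad(1 + 2j, 3 - 1j, 1 + 0j)
print(gx, gy)
```

For real-valued inputs the conjugates have no effect, so both functions reduce to the familiar real-valued product and quotient rules.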

cherry pick heter ps (#29955) · a839ddca
Committed by Thunderbrook on Dec 29, 2020
```
* cherry pick heter ps

* CMakeList
```
a839ddca

Fix Conv2DTranspose bug when padding='same' (#29915) (#29936) · acb29ff8
Committed by LielinJiang on Dec 29, 2020
```
* fix conv_transpose bug when padding=same
```
acb29ff8

[cherry-pick] clean redundant API alias in 2.0 - part 1 #29928 (#29960) · c9c835b5

Committed by XiaoguangHu on Dec 28, 2020

* [cherry-pick] cherry-pick of PR#29928

* delete paddle.metric.chunk_eval and paddle.metric.mean_iou

* delete paddle.nn.clip and paddle.nn.clip_by_norm

* delete paddle.nn.functional.activation.hard_sigmoid and paddle.nn.functional.activation.hard_swish

* [cherry-pick] cherry-pick of PR#29928

* fix extension import error

c9c835b5

28 Dec, 2020 2 commits

[Cherry-Pick 2.0][Dy2Stat] 1. Fix bug of for-range stmts. 2. Support that step... · a8b6dd86

Committed by liym27 on Dec 28, 2020

[Cherry-Pick 2.0][Dy2Stat] 1. Fix bug of for-range stmts. 2. Support that step value is negative in for-range stmts (#29519) (#29874)

1. Fix error in _build_cond_stmt of for-range stmts.

2. Support that step value is negative in for-range stmts

3. Fix code because of the diff between Py2 and Py3

a8b6dd86
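
Why the loop condition must depend on the sign of the step (points 1 and 2 above) can be sketched as follows. This is a hand-written lowering of a for-range loop into a while loop, not the code that Dy2Stat generates; `range_as_while` is a hypothetical name.

```python
def range_as_while(start, end, step=1):
    """Visit the same values as range(start, end, step) using a while loop."""
    if step == 0:
        raise ValueError("step must not be zero")
    visited = []
    i = start
    # A positive step counts up towards `end`; a negative step counts down,
    # so the comparison has to flip.
    while (i < end) if step > 0 else (i > end):
        visited.append(i)
        i += step
    return visited


print(range_as_while(0, 5))
print(range_as_while(5, 0, -2))
```

A condition fixed to `i < end` would never enter the loop for a negative step such as `range(5, 0, -2)`.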

[Cherry-pick] Cherry-pick of PR#29579 and PR#29617 (#29904) · 63939597

Committed by Huihuang Zheng on Dec 28, 2020

* [Dy2stat] Enable jit.save to Save Without Running (#29579)

Enable jit.save to Save Without Running.

* Modify CublasHandleHolder to Fix Random Unittest Failure. test=develop (#29617)

Modify CublasHandleHolder to use PADDLE_RETRY_CUDA_SUCCESS instead of PADDLE_ENFORCE_CUDA_SUCCESS to fix a random unittest failure. The unittest log showed a CUDA allocation error at this file, which may be due to insufficient GPU resources. We fixed a similar failure in the past, so we applied PADDLE_RETRY_CUDA_SUCCESS here.

63939597
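
The difference between failing on the first error and retrying a transient one can be sketched in Python. The real macros are C++ and their retry policy is not described here, so `retry_call`, its retry count, and its delay are assumptions made purely for illustration.

```python
import time


def retry_call(fn, retries=3, delay_s=0.0, retry_on=(RuntimeError,)):
    """Call fn(); on a listed error, wait and try again up to `retries` times."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        try:
            return fn()
        except retry_on as err:
            last_error = err
            time.sleep(delay_s)
    raise last_error  # every attempt failed


attempts = []


def flaky_alloc():
    """Hypothetical resource creation that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient allocation failure")
    return "handle"


result = retry_call(flaky_alloc, retries=3)
print(result, len(attempts))
```

An enforce-style check corresponds to `retries=1`: the first transient failure is raised immediately.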

25 Dec, 2020 4 commits

Update en docs of to_tensor (#29718) (#29901) · be85ecc9
Committed by LielinJiang on Dec 25, 2020
```
* update to_tensor en docs
```
be85ecc9

feat: support check_nan_inf for kunlun/xpu device (#29694) (#29898) · 41917fb5
Committed by QingshuChen on Dec 25, 2020
```
* feat: support check_nan_inf for kunlun device

* support kunlun stack

* minor
```
41917fb5
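
The kind of check that a NaN/Inf guard performs on an operator's output can be sketched as below. This is a device-independent illustration on a plain Python list; `find_nan_inf` is a hypothetical helper and does not reflect how the Kunlun/XPU check is implemented.

```python
import math


def find_nan_inf(values):
    """Return (index, value) pairs for every NaN or infinite entry."""
    return [(i, v) for i, v in enumerate(values)
            if math.isnan(v) or math.isinf(v)]


outputs = [0.5, float("inf"), -1.0, float("nan")]
bad = find_nan_inf(outputs)
print([i for i, _ in bad])
```
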

2 0 ps core 2 (#29894) · f781ab08

Committed by tangwei12 on Dec 25, 2020

* add ps table (#29463)

* add ps table

Change-Id: I468a04bd071d21ff52654926fcf4d5f3da19e178

* add service (#29560)

* add service, remove ut on mac

* fix heter_profiler & add heter stop method

* fix code style

* merge pscore

Change-Id: Ie7f60d1cdde6755a0c29db26863c6283e9843d57

* fix cmake

Change-Id: I6773509a7b4ca79139ecc40b7bf3eb318ceff8bb

* fix conflict

Change-Id: I35575be0c96a8520f9d756ea7f1ff0b904a165ba

* fix conflict

Change-Id: Ic926ea0b0d67803226d51241397ba3b510226bfa

f781ab08

flops fix Cherry pick (#29872) · 4fa94d4a

Committed by yukavio on Dec 25, 2020

* add some feature for paddle.flops (#29572)

* fix flops (#29758)

* fix flops

* fix flops

* fix flops (#29818)

4fa94d4a

22 Dec, 2020 5 commits
- add nearest_interp_v2 on kunlun (#29725) (#29822) · 0e2f5bb1
  Committed by QingshuChen on Dec 22, 2020
```
* add nearest_interp_v2 on kunlun

* add nearest_interp_v2 on kunlun
Co-authored-by: TTerror <tangzhiyi11@users.noreply.github.com>
```
  0e2f5bb1
- Support multi-stream communication for dynamic graph distributed (#29525) (#29821) · f7a598fa
  Committed by ShenLiang on Dec 22, 2020
```
* fix fleet for multi-stream

* fix memcpy for ncclid

* use sync to solve move operation
```
  f7a598fa
- [Cherry-pick] fix isfinite all any ops (#29658) · f7bf2891
  Committed by syyxsxx on Dec 22, 2020
```
* delete reduce_all reduce_any isfinite add all and any

* fix all and any example

* keep_dim to keepdim

* fix example code
```
  f7bf2891
- [cherry-pick 2.0] gen nccl id socket (#29746) · 0f49e0b7
  Committed by WangXi on Dec 22, 2020
```
* gen nccl id use socket (#29431)

* fix gen_nccl_id_op_helper compile failed, test=develop (#29614)
```
  0f49e0b7
- fleet sync build strategy, test=develop (#29732) (#29745) · f8888a07
  Committed by WangXi on Dec 22, 2020
  
  f8888a07
18 Dec, 2020 2 commits

[Cherry-pick] Add complex api conj, real and imag (#29750) · ab5cc042

Committed by Chen Weihang on Dec 18, 2020

* Add complex dtype op (add) test example (#29603)


* add op test case for complex

* polish code details

* add xpu set constant support

* fix argument error

* remove useless pyc file

* [Complex] Add real & imag op and api for complex tensor (#29672)

* add complex real op & api & unittest

* add imag op & api & unittest

* refactor op impl

* revert simplified writing due to compile failure

* polish details

* polish grad op code

* add conj op for complex types (#29527)

* add conj op for complex types

* add conj for complex types

* add more test case

* add conj_op test

* modify conj api and impl

* add complex type for fill_constant_op xpu

* add setConstant for complex type

* remove complex conj test file

* user define grad for test_conj_op

* add test case for static mode of conj api

* modify conj doc

* change input args name to x

* remove useless codes

* conj support real types

* add conj test case for real number
Co-authored-by: chentianyu03 <chentianyu03@baidu.com>

ab5cc042

Update EarlyStopping sample code (#29723) (#29727) · cc2edc5e
Committed by Jiaqi Liu on Dec 18, 2020
```
* update EarlyStopping doc

* update EarlyStopping doc, test=document_fix
```
cc2edc5e

17 Dec, 2020 4 commits

[cherry-pick]fix matmulv2 bug & add rebuild group & fix bug of download (#29726) · df0430dc

Committed by ShenLiang on Dec 17, 2020

* Fix the download bug in the case of multiple machines (#29551)

* fix the download bug
* add sort for ips

* Fix bug of matmul_v2 for broadcast case (#29599)

* fix bug of matmul_v2 for broadcast

* Rebuild group automatically in dynamic graph distributed (#29255)

* add tensor_indices in AssignGroupBySize

* add rebuild group in reducer

* fix error message of gather nd (#29521)

df0430dc
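
The broadcast case mentioned for matmul_v2 concerns the leading (batch) dimensions of the two operands. The sketch below computes the result shape under NumPy-style broadcasting rules, which matmul_v2 is assumed to follow; `matmul_broadcast_shape` is a hypothetical helper and covers only operands with at least two dimensions.

```python
def matmul_broadcast_shape(x_shape, y_shape):
    """Result shape of a batched matmul with NumPy-style batch broadcasting."""
    if len(x_shape) < 2 or len(y_shape) < 2:
        raise ValueError("this sketch only handles operands with ndim >= 2")
    *x_batch, m, k = x_shape
    *y_batch, k2, n = y_shape
    if k != k2:
        raise ValueError(f"inner dimensions differ: {k} vs {k2}")
    # Right-align the batch dimensions, padding the shorter one with 1s.
    rank = max(len(x_batch), len(y_batch))
    x_batch = [1] * (rank - len(x_batch)) + list(x_batch)
    y_batch = [1] * (rank - len(y_batch)) + list(y_batch)
    batch = []
    for a, b in zip(x_batch, y_batch):
        if a != b and a != 1 and b != 1:
            raise ValueError(f"batch dimensions {a} and {b} cannot broadcast")
        batch.append(max(a, b))
    return tuple(batch) + (m, n)


print(matmul_broadcast_shape((2, 1, 3, 4), (5, 4, 6)))
```
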

update activation op on kunlun (#29577) (#29717) · e82efc0c

Committed by TTerror on Dec 17, 2020

* fix expand && concat/transpose to new api

* update xpu_header

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* add nearest_interp on kunlun

* update error message

e82efc0c

remove addcmul (#28937) (#29640) · 6f873c21

Committed by Wei Shengyu on Dec 17, 2020

* remove addcmul

* remove unittest and other related code of addcmul

* fix bug

* fix merge conflict

6f873c21

[Cherry-pick][bug fix] fix train eval set error in static mode (#29540) (#29571) · 61d70277
Committed by Chen Weihang on Dec 16, 2020
```
Fix the failure to set Layer train/eval mode in static mode; for more details, see #29540
```
61d70277

16 Dec, 2020 3 commits
- fix wmt14 doc, remove backward, add bidirect direction in rnn api (#29633) (#29695) · a19e1fe8
  Committed by Jack Zhou on Dec 16, 2020
```
* fix wmt14 doc, remove backward, add bidirect direction in rnn api

* fix rnn unittest

* fix test_rnn_nets_static.py bug
```
  a19e1fe8
- [2.0/cherrypick] cherry-pick Sharding PR:29518 (#29593) · ab04bf01
  Committed by JZ-LIANG on Dec 16, 2020
```
* Sharding add hybrid-dp feature

* update sharding in distributed_strategy

* update sharding unitest

* revise code format for sharding
```
  ab04bf01
- support roi_align & affine_channel for kunlun (#29561) (#29657) · d82b0300
  Committed by QingshuChen on Dec 16, 2020
```
* support roi_align & affine_channel for kunlun

* minor
```
  d82b0300

BaiXuePrincess / Paddle: up to date with the fork source project