Commits · 3f2a665a0a23a0b0e0d472ef0699322e5e80a9a8 · PaddlePaddle / Paddle

Nov 30, 2021 (3 commits)

support data_format='NHWC' for prelu channel mode (#37019) · 3f2a665a
Committed by Guoxia Wang on Nov 30, 2021
```
* support data_format='NHWC' for prelu channel mode
```
3f2a665a

[Auto Parallel] Do the physical mapping between the process graph and the cluster graph (#37094) · b0dff05d

Committed by Yulong Ao on Nov 30, 2021

* [Auto Parallel]  Add the unified cluster representation

* [Auto Parallel] Add the graph class for physical mapping

* [Auto Parallel] Add the simple physical mapper

* Set the timeout of the mapper

* Merge the upstream develop unittests cmake files

* Fix a bug of the process group

* Remove mapper unittest from platforms which is not GPU

* Move the instantiation of process group after resharding

* Add the local id for devices

* Update the rank mapping format

* Add some comments

* Remove the related files about mapping

* Update the unittest for auto mapping

* Remove unused rank_mapping unittest

* Improve the unittest coverage

* Improve the unittest coverage

b0dff05d

Fix test calc gradient (#37672) · a0631364
Committed by xiongkun on Nov 30, 2021
```
* add scope_guard

* 1. fix control flow cases 2. fix calc_gradient
```
a0631364

Nov 29, 2021 (6 commits)

add expand_v2/expand_as_v2 for kunlun (#37592) · dae4e7f2

Committed by TTerror on Nov 29, 2021

* add expand_v2/expand_as_v2 for kunlun

* update expand_as_v2

* update expand_as_v2

* support float16/bool

* update xpu.cmake

dae4e7f2

[AMP] For `amp.decorate()` optimizers set to None is ok (#37541) · 2bb3f0b5

Committed by zhangbo9674 on Nov 29, 2021

* amp.decorate optimizers set to None is ok

* refine unittest

* add unittest and refine example code

* refine unittest

2bb3f0b5


[fleet_executor] Hold the carrier while running for one micro step. (#37605) · 74ca89ef
Committed by Yuang Liu on Nov 29, 2021

74ca89ef

[New features] Support batch_jacobian and batch_hessian (#37547) · 4d24d352

Committed by Weilong Wu on Nov 29, 2021

* native commit for triple grad of sigmoid

* Updated unittests files

* init functional jacobian api

* Updated trible_test func

* Updated gradient_checker & test_script

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* fix dygraph grad to support high differential

* polish API docstring

* Updated gradient checker and some related files

* fix double grad strip error for high differential

* fix double grad strip error for high differential

* Add Sigmoid triple grad tests

* fix dygraph double grad dtype error when calling for high differential scenario

* Updated triple grad tests func

* Use np.random to initialize ddx

* Updated triple_grad_check func

* add todo for gradient checker and refine some comments

* remove additional code

* add test for warning in backward.py

* format python code

* support multi input in triple gradient checker

* Add matmul triple grad kernel

* Updated comments of TODO

* Supported some special tests

* Change code-format to follow CI std

* Updated gradient_checker.py

* Fix conflicts

* Removed unnecessary printing log

* Change code style to follow CI std

* support batch in jacobian and hessian

* add batch jacobian and batch hessian

* Add batch_jacobian test, draft version

* [New features] Add elementwise_mul triple grad kernel (#37152)

* Add elementwise_mul triple grad kernel

* Removed InplaceInferer and polished code

* Add numerical_batch_jacobian,numerical_batch_hessian and tests

* Support batch_jacobian and batch_numerical

* Use pre-commit to check code format

* Update doc, polish code, add unit test

* Reset the TIMEOUT properties of test_jacobian to pass CI
Co-authored-by: levi131 <limaolin01@baidu.com>
Co-authored-by: Jiabin Yang <360788950@qq.com>

4d24d352


fix_InternalStorage (#37568) · d0a89744
Committed by Baibaifan on Nov 29, 2021

d0a89744

[ut] Update skip concept to ignore. (#37635) · ae544242
Committed by Wilber on Nov 29, 2021

ae544242

Nov 27, 2021 (2 commits)

[Auto Parallel] Add the graph class for the process and cluster (#37482) · 48faf638

Committed by Yulong Ao on Nov 27, 2021

* [Auto Parallel]  Add the unified cluster representation

* [Auto Parallel] Add the graph class for physical mapping

* [Auto Parallel] Add the simple physical mapper

* Set the timeout of the mapper

* Merge the upstream develop unittests cmake files

* Fix a bug of the process group

* Remove mapper unittest from platforms which is not GPU

* Move the instantiation of process group after resharding

* Add the local id for devices

* Update the rank mapping format

* Add some comments

* Remove the related files about mapping

* Remove unused rank_mapping unittest

* Improve the unittest coverage

48faf638


fix save inference model conditional op (#37579) · fd41456f
Committed by JingZhuangzhuang on Nov 27, 2021

fd41456f

Nov 26, 2021 (6 commits)

fix data parallel when VOCAB var in program (#37543) · e05540f7
Committed by Steffy-zxf on Nov 26, 2021
```
* fix data parallel when VOCAB var in program
```
e05540f7

upgrade async distributed training in pscore (#37515) · 74605fc2
Committed by zhaocaibei123 on Nov 26, 2021
```
* test

* test

* rm test

* update

* update

* update

* add unittest

* update

* update save
```
74605fc2

Fix bugs when bias add none in static graph for fused_attention op. (#37566) · 097e098d
Committed by Li Min on Nov 26, 2021
```
* Fix bugs when bias is none for static graph for fused_attention op.
```
097e098d

Added interface reset_grad_inplace_version (#37573) · dcb91fd7

Committed by Zhanlue Yang on Nov 26, 2021

reset_inplace_version removes all inplace related records to VarBase/VariableWrapper, the essential purpose of which is to let you use inplace operations as if using its non-inplaced version, which of course will cause unexpected consequences if not used with care.

This is essentially a hack interface to satisfy one specific request

dcb91fd7
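
The commit description above can be illustrated with a toy model. This is only a sketch of the general idea of in-place version tracking, not Paddle's implementation: `ToyTensor`, `ToySavedForBackward`, and their methods are hypothetical names invented for this example, and the real VarBase/VariableWrapper bookkeeping may differ.

```python
class ToyTensor:
    """Hypothetical stand-in for a tensor that records in-place writes."""

    def __init__(self, value):
        self.value = value
        self.inplace_version = 0  # bumped on every in-place modification

    def add_(self, delta):
        # In-place update: mutate the value and record that it happened.
        self.value += delta
        self.inplace_version += 1
        return self

    def reset_inplace_version(self):
        # Forget all in-place records, as if no in-place op had ever run.
        self.inplace_version = 0


class ToySavedForBackward:
    """Snapshot of the version counter taken when forward saves a tensor."""

    def __init__(self, tensor):
        self.tensor = tensor
        self.saved_version = tensor.inplace_version

    def is_unmodified(self):
        # A backward pass would normally refuse to run if this is False.
        return self.tensor.inplace_version == self.saved_version


def demo():
    x = ToyTensor(1.0)
    saved = ToySavedForBackward(x)

    x.add_(2.0)                               # in-place write after saving
    stale_detected = not saved.is_unmodified()

    x.reset_inplace_version()                 # the "hack": erase the record
    passes_after_reset = saved.is_unmodified()

    # The check now passes even though the saved value has changed.
    return stale_detected, passes_after_reset, x.value
```

In this toy, resetting the counter makes the staleness check pass while the underlying value is still the modified one, which is the kind of unexpected consequence the commit message warns about.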

TDM2 (#37044) · 4826167c

Committed by wangzhen38 on Nov 26, 2021

* add tdm sample

* add tdm sample in c++

* update tdm sample

* modify sample count

* fix conflict

* add set_date

* fix cmake error

* fix bug of proto

* update index_dataset proto

* update cmake

* fix error cmake

* fix cmake mkldnn

* fix cmake proto

* update cmake proto

* update cmake

* update rec

* update dataset

* update dataset

* update dataset

* update dataset

* update dataset

* update coverage

* update ci

* goback4

* fix npu ci

* add xxhash dep

4826167c

Fix dropout static when axis != None (#37223) · f25fda37

Committed by smallv0221 on Nov 26, 2021

* fix dropout static when axis != None

* update dropout test

* add dropout test

* fix test

* Update test_dropout_op.py

* Update test_dropout_op.py

* fix testcase

* fix testcase

* Update test_dropout_op.py

* fix testcase

* fix testcase

* optimize perf

* add new test

* fix testcase

f25fda37

Nov 25, 2021 (7 commits)
- [NPU] add int64 support for argsort op (#37434) · 3e088aaf
  Committed by furnace on Nov 25, 2021
```
* [NPU] add int64 support for argsort op

* [NPU] delete debug codes
```
  3e088aaf
- [NPU] add NPU kernel for prior_box op (#37519) · 1127fecb
  Committed by furnace on Nov 25, 2021
```
* [NPU] add NPU kernel for prior_box op

* [NPU] delete debug codes
```
  1127fecb
- Add InternalStorage and add ShardingOptimizerStage2 (#37489) · 5af64631
  Committed by Baibaifan on Nov 25, 2021
  
  5af64631
- add new API paddle.nn.initializer.Dirac (#37389) · bbb9b28a
  Committed by zhouweiwei2014 on Nov 25, 2021
```
* add new API paddle.nn.initializer.Dirac

* fix doc
```
  bbb9b28a
- Export task node to python (#37509) · 3f815e76
  Committed by LiYuRio on Nov 25, 2021
  
  3f815e76
- [PaddlePaddle Hackathon] No. 6: Add ZeroPad2d to Paddle (#37151) · 81861f69
  Committed by Matsumoto GAO on Nov 25, 2021
```
* add zeropad2d v0.1

* add zeropad2d v0.2

* add zeropad2d v0.3

* add zeropad2d v0.3

* add zeropad2d v0.3

* add zeropad2d v0.4

* add zeropad2d v0.5

* add zeropad2d v0.5 codestyle

* add zeropad2d v0.5 codestyle

* add zeropad2d v0.6 functional

* add zeropad2d v0.6 functional

* add zeropad2d v0.6 functional
```
  81861f69
- [new-exec] skip compiled program (#37512) · 171da2ce
  Committed by Leo Chen on Nov 25, 2021
```
* skip compiled program

* fix ut
```
  171da2ce
Nov 24, 2021 (4 commits)

[GpuPs]pybind core (#37287) · d69daed1
Committed by Thunderbrook on Nov 24, 2021
```
* pybind core

* set use psgpu
```
d69daed1

[Paddle-Inference] Matmul_int8_convert: tensor*tensor (#37285) · 16590799

Committed by Wangzheee on Nov 24, 2021

* matmul_convert_int8

* matmul_convert_int8

* matmulconvert_int8

* Matmul_int8_convert: tensor*tensor

* Matmul_int8_convert: tensor*tensor

* Matmul_int8_convert: tensor*tensor

16590799

[Auto Parallel] Add the unified cluster representation (#37091) · db727551
Committed by Yulong Ao on Nov 24, 2021
```
* [Auto Parallel]  Add the unified cluster representation

* Add the local id for devices

* Add some comments
```
db727551

[Dy2stat]support pure fp16 for dy2stat (#36944) · 52edad6a

Committed by 0x45f on Nov 24, 2021

* run dy2stat pure fp16 in Linear model

* no use self._pure_fp16_inputs

* add test and fix Adam error in dy2stat pure fp16 training

* use paddle.optimizer.Adam

* run test in gpu

* change test time for CI

* enlarge atol for test_resnet_pure_fp16

* refine code and enlarge atol

* make custom_white_list and custom_black_list take effect for AMP and pure fp16

* check tracer is not None

* use default atol

* change filter_size

* change atol and add some NOTE

52edad6a

Nov 23, 2021 (7 commits)
- fix inplace bug when the first grad_var(loss_grad) is inplace var (#37420) · ee1e1642
  Committed by pangyoki on Nov 23, 2021
```
* fix inplace bug

* fix custom grad input error

* add unittest

* fix inplace bug
```
  ee1e1642
- Add support bias is none for fused_attention op. (#37411) · 1a8786cf
  Committed by Li Min on Nov 23, 2021
```
Add support for bias is none for fused_attention op.
```
  1a8786cf
- [new-exec] skip compiled program with places > 1 (#37457) · 2dfcdf21
  Committed by Leo Chen on Nov 23, 2021
```
* skip compiled program with places > 1

* fix corner case and add ut
```
  2dfcdf21
- [Paddle Inference] Fix_nearest: align_corners != true (#37368) · bc150edc
  Committed by Wangzheee on Nov 23, 2021
```
* fix_nearest

* fix_nearest

* fix_nearest

* fix_nearest
```
  bc150edc
- [NPU] Added HCCL backend support in dygraph mode (#36285) · 83e55cff
  Committed by ronnywang on Nov 23, 2021
```
* Added HCCL backend support in dynamic graph mode

* fix segmentation fault

* add ut
```
  83e55cff
- Bug fix for snapshotting VariableWrapper with initialized tensor but e… (#37410) · e58ac121
  Committed by Zhanlue Yang on Nov 23, 2021
```
* Bug fix for snapshotting VariableWrapper with initialized tensor but empty allocation

* Added unittest for inplace&clear_gradient
```
  e58ac121
- [NewExe] Support layout/dtype transform by adding transfer_layout/transfer_dtype op (#37299) · 2a1f009e
  Committed by Aurelius84 on Nov 23, 2021
```
* Add transfer_layout/dtype op

* clean useless codes

* fix unused var

* add optest in white.txt

* split into data_transfer.cc

* fix cmake

* modify according reviewer comment

* replace cast_op with transfer_dtype_op
```
  2a1f009e
Nov 22, 2021 (5 commits)

Add isclose op (#37135) · d2200e97

Committed by andyjpaddle on Nov 22, 2021

* add isclose op, test=develop

* add isclose op, test=develop

* add isclose api, test=develop

* rm useless code

* rm useless code

* update python api of isclose

* add some unittest of isclose op, test=develop

d2200e97

[Dy2stat]Allow users to switch eval/train mode when using @to_static to... · eb602398

Committed by 0x45f on Nov 22, 2021

[Dy2stat]Allow users to switch eval/train mode when using @to_static to decorate a function (#37383)

* Allow users to switch eval/train mode when using @to_static to decorate a function

* refine code for train() and eval()

eb602398


elu support alpha < 0 (#37316) · e3503de8
Committed by zhupengyang on Nov 22, 2021

e3503de8

Support zero value in dimension for slice (#37313) · e788c7b5
Committed by zyfncg on Nov 22, 2021
```
* support zero dim for slice op

* support zero dim Tensor in set_value op

* polish some debug log
```
e788c7b5

fix bug of indexing tensor with None (#37400) · de0cb386
Committed by zyfncg on Nov 22, 2021

de0cb386

PaddlePaddle / Paddle · synced successfully about 1 year ago