Commits · adaeee4d3d3834616e121c32c95b09a87f24712d · 机器未来 / Paddle

17 Sep, 2021 13 commits

[AMP] Support pure fp16 training mode for dygraph (#35521) · adaeee4d

Committed by zhangbo9674 on Sep 17, 2021

* add pure fp16 major function in auto_cast & tracer

* support master weight in dygraph for pure fp16

* check mix dtype of fp16&fp32 for check_finite_and_unscale op

* change pure fp16 function name

* refine some bug in auto_cast

* refine auto_cast interface logic

* add param _casted_by_pure_fp16 for class Layer

* support state_dict hook for save model by user appointed dtype in pure_fp16_decorator

* refine pure_fp16_decorator as decorator

* add unittest

* add comment

* add comment

* support recompute

* add comment for auto_cast and decorator

* support to_static_state_dict for paddle.jit.save

* remove the limit on the number of models and optimizers

* add lookup_table in black_list

* fix momentum and layer state_dict

* fix bug in layer state_dict

* fix bug in layer state_dict_helper

* refine unittest

* refine test_momentum_op

* refine interface and some code

* refine amp_decorator interface

* refine pure fp16 interface

* refine master weight interface
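
The "master weight" item above refers to a common mixed-precision technique: keep a higher-precision copy of each parameter, apply optimizer updates to that copy, and cast it to fp16 for computation. The sketch below illustrates the idea in plain Python and is not Paddle's implementation. The helper names (`to_fp16`, `sgd_pure_fp16`, `sgd_with_master_weight`) are hypothetical, and a Python float (fp64) stands in for the fp32 master copy a real framework would use.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def sgd_pure_fp16(weight: float, grad: float, lr: float, steps: int) -> float:
    """Plain SGD where the weight and the update are both stored in fp16."""
    w = to_fp16(weight)
    for _ in range(steps):
        update = to_fp16(lr * grad)
        w = to_fp16(w - update)  # a tiny update can be rounded away here
    return w


def sgd_with_master_weight(weight: float, grad: float, lr: float, steps: int) -> float:
    """SGD that accumulates updates into a higher-precision master copy.

    Assumption: a Python float (fp64) stands in for the fp32 master weight.
    """
    master = float(weight)
    for _ in range(steps):
        master -= lr * grad
    # The fp16 copy is what forward/backward computation would consume.
    return to_fp16(master)


if __name__ == "__main__":
    print(sgd_pure_fp16(1.0, 1.0, 1e-4, 100))
    print(sgd_with_master_weight(1.0, 1.0, 1e-4, 100))
```

With a weight of 1.0 and a per-step update of about 1e-4, the fp16-only loop never moves, because the update is smaller than half the gap between 1.0 and the next fp16 value below it. The master-weight loop ends close to 0.99.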

adaeee4d

temporally disable the warnings (#35560) · 68ae6345
Committed by Leo Chen on Sep 17, 2021
```
* temporally disable the warnings

* disable ut
```
68ae6345
fix unittest (#35808) · fcfb0afe
Committed by Guoxia Wang on Sep 17, 2021

fcfb0afe

Add linalg pinv api (#35804) · 71e01d3f

Committed by andyjpaddle on Sep 17, 2021

* add pinv api, test=develop
* add linalg pinv api, test=develop
* update example code, test=develop

71e01d3f

Support EMA in Paddle2.x and Fleet (#35673) · fb4d5689

Committed by Haohongxiang on Sep 17, 2021

* Support EMA in Paddle2.x and Fleet

* update

* update

* update

* modify ut of ema

* modify docs

* modify bugs

* update

* update

* update

* modify ut

fb4d5689

add inplace op support to prune, scale_op is no longer need in jit.save (#35730) · 21921936

Committed by Haipeng Wang on Sep 17, 2021

* add scale_op in model save step is not necessary, just fix the prune method to support static graph and inplace op

* fix jit.save, no need to add scale_op to each outputvar anymore.
fix prune_with_input, now it supports inplace op

* temporarily disable test_trt_dynamic_shape.TRTDynamicShapeOutOfBound2Test

21921936

[inference]add hard_swish dynamic plugin (#35214) · c59c8e4f
Committed by 津 on Sep 17, 2021

c59c8e4f

Add skip teller (#35807) · 0f74e5e7

Committed by xiaoxiaohehe001 on Sep 17, 2021

* add_skip_layernorm

* add_skip_layernorm

* add_skip_layernorm

* add_skip_layernorm

* add_skip_layernorm

* add_skip_layernorm

* add_skiplayernorm_teller

* add_skip_layernorm

* add_skip_layernorm_teller

* add_skip_layernorm_teller

* add_skip_layernorm

* add_skip_teller

0f74e5e7

expose cuda stream to users (#35813) · 40cfa512
Committed by Leo Chen on Sep 17, 2021
```
* expose cuda stream to users

* add ut
```
40cfa512
[inference]add reduce converter test (#35145) · 05275010
Committed by 津 on Sep 17, 2021
```
* add test

* add test

* add test
```
05275010
leaky_relu test (#35318) · 867f4fa0
Committed by 津 on Sep 17, 2021
```
* add test

* add test

* add test

* add test

* add test
```
867f4fa0

Enhance the equal API: input Y supports int, float, bool, or tensor types (#35695) · 9b2d53fc

Committed by yeliang2258 on Sep 17, 2021

* update equal op, input Y can be float,int,bool or tensor

* update test

* update code style

* update code style

* update doc

* update str check

* remove str

* add type check

9b2d53fc

refine matrix_rank op code and doc (#35722) · 28fffef6
Committed by 0x45f on Sep 17, 2021

28fffef6

16 Sep, 2021 13 commits

[hybrid] Fix mp multi gradient clip prob (#35713) · a4eadd15
Committed by Yuang Liu on Sep 16, 2021

a4eadd15
Add segment apis to paddle.incubate (#35759) · 4b683887
Committed by Zhong Hui on Sep 16, 2021

4b683887
[NPU] add index_select_grad kernel and unit tests (#35594) · 67a094b5
Committed by Aganlengzi on Sep 16, 2021
```
* [NPU] add index_select_grad kernel and unit tests

* dim=0 not need transpose
```
67a094b5
fix dataloader exit terminate error (#34501) · e93c18a3
Committed by Kaipeng Deng on Sep 16, 2021
```
* fix DataLoader exit with SIGABRT/SIGSEGV. test=develop
```
e93c18a3

Support new API linalg.cond in paddle (#35140) · 2df74aa6

Committed by Haohongxiang on Sep 16, 2021

* Support new API linalg.cond in paddle

* check code style

* check code style

* modify codes

* add docs_eng of linalg.cond

* add svd_norm for linalg.cond

* modify docs_en of cond

* add support for empty input in dynamic mode

* modify set_time of unittest

* update

* modify unittest of cond

* update

* remove cond in paddle.__all__

* pull latest codes

* merge latest codes

* update

2df74aa6

Add CPU and GPU eigh op implementation (#34990) · 07d0b834
Committed by crystal on Sep 16, 2021

07d0b834
[paddle-trt] fix gather convert (#35784) · 7546a079
Committed by Wangzheee on Sep 16, 2021
```
* fix gather

* fix
```
7546a079

[Dy2stat]fix no_grad context error in dy2stat (#35725) · 3e897489

Committed by 0x45f on Sep 16, 2021

* fix no_grad context error in dy2stat

* remove useless comments

* fix error by drop_kids in python

* add test and fix review

3e897489

support l2_normalize float16 (#35776) · b666fd3c
Committed by Guoxia Wang on Sep 16, 2021
```
* support fp16 dtype
```
b666fd3c
remove distributed attributes at the last stage for auto parallel (#35605) · a3790606
Committed by lilong12 on Sep 16, 2021
```
* update
```
a3790606

Python support register pass via PassDesc (#35602) · bab39eb2

Committed by wuhuanzhou on Sep 16, 2021

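Main purpose of this PR: for subgraph-replacement scenarios such as fusion, support developing and registering a Pass on the Python side.

Background
A Pass takes a deep learning computation graph (Graph) as input, modifies it according to certain conditions, and outputs the modified Graph.
Writing Pass code in the PaddlePaddle framework currently has the following problems:
Users must hand-write both the condition-matching code for the Graph and the code that modifies the Graph;
Operating on a Graph requires working deep in the low-level framework code, understanding the Graph structure, and knowing how the related Passes are written.
We propose an optimized approach for subgraph-replacement Passes such as fusion that lets users develop and register a Pass on the Python side, improving the secondary-development experience:
Users only describe the subgraph to match and the subgraph to replace it with; code written by the framework generates the matching and replacement logic, so users do not have to match or replace on the Graph themselves;
API-level replacement: users can construct subgraphs with Paddle's Python API, so they can write Paddle Graph Pass code without knowing the Graph structure.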
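
The "describe a pattern and a replacement, let the framework do the matching" idea can be sketched in plain Python. This is only an illustration under simplifying assumptions and is not the PassDesc API added by this PR: the graph is a flat, topologically ordered list of ops, the pattern is a chain of op types, and `Op` and `replace_chain` are hypothetical names. The op type strings are used only as examples.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Op:
    type: str
    inputs: Tuple[str, ...]
    output: str


def replace_chain(ops: List[Op], pattern: List[str], fused_type: str) -> List[Op]:
    """Replace each run of consecutive ops that matches `pattern` with one op.

    A run matches when its op types equal `pattern` and each op's output is
    an input of the next op. Simplification: this sketch does not check
    whether an intermediate output is also consumed outside the run.
    """
    if not pattern:
        raise ValueError("pattern must contain at least one op type")
    result: List[Op] = []
    n = len(pattern)
    i = 0
    while i < len(ops):
        window = ops[i:i + n]
        types_match = [op.type for op in window] == pattern
        chained = all(a.output in b.inputs for a, b in zip(window, window[1:]))
        if types_match and chained:
            produced = {op.output for op in window}
            # External inputs: names consumed by the run but not produced in it.
            external: List[str] = []
            for op in window:
                for name in op.inputs:
                    if name not in produced and name not in external:
                        external.append(name)
            result.append(Op(fused_type, tuple(external), window[-1].output))
            i += n
        else:
            result.append(ops[i])
            i += 1
    return result


graph = [
    Op("matmul", ("x", "w"), "tmp"),
    Op("elementwise_add", ("tmp", "b"), "y"),
    Op("relu", ("y",), "out"),
]
fused = replace_chain(graph, ["matmul", "elementwise_add"], "fc")
```

Here the caller supplies only the pattern (`["matmul", "elementwise_add"]`) and the replacement type (`"fc"`); the matching loop and rewiring live in one reusable function, which is the division of labor the PR description argues for.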

bab39eb2

[hybrid] remove scale op in insert_scale_loss_grad_ops (#35775) · 02b0be08
Committed by WangXi on Sep 16, 2021

02b0be08
Add a new op: paddle.linalg.multi_dot (#35224) · c9f7cff0
Committed by zhangkaihuo on Sep 16, 2021

c9f7cff0

15 Sep, 2021 14 commits

Fix for slice OneDNN kernel in solov2 and ppyolo models (#35706) · 9d996cdd
Committed by jakpiase on Sep 15, 2021
```
* fixed slice error

* added handling of StartsTensor+List and EndsTensor+List

* fix for ppyolo model
```
9d996cdd

clip op extra information when export model. (#35447) · 4d236354

Committed by 王明冬 on Sep 15, 2021

* clip op extra information when export model,test=ocr

* rename clip_extra parameter to kwargs in save_inference_model, test=ocr

4d236354

Change the invoking method of setitem from numpy to set_value op when value isn't tensor (#35701) · 86d4af39

Committed by zyfncg on Sep 15, 2021

* Change the invoking method of setitem from numpy to set_value op when value is not tensor

* fix the check logic for inplace in setitem

* fix the unittest problem caused by setitem doesn't support fp16

* modify some code format in setitem

86d4af39

add dist_attr for dist op and var (#35585) · fc5fb2a1

Committed by zhaoyingli on Sep 15, 2021

* add dist_attr for dist op

* add unittest

* update inputname

* update function name

* add unittest

* update CMakeLists.txt for CI

* fix dis_matmul

* fix compile error

* update matmul to matmul_v2

fc5fb2a1

[NPU] add beam_search npu op (#34860) · 3760be06

Committed by pangyoki on Sep 15, 2021

* add beam_search npu op

* fix CMakeList and add unittest

* fix bug of beam search npu op

* fix unittest

* let input ids become int64

* set output ids to int64_t

* delete check_dygraph

* fix beam_width=1

3760be06

support numpy.ndarray index. (#35748) · 9f588cc2
Committed by WeiXin on Sep 15, 2021
```
* support numpy.ndarray index.

* polish code.
```
9f588cc2
[NPU] fix depthwise_conv2d_grad, test=develop (#35626) · d3e06a51
Committed by Qi Li on Sep 15, 2021
```
* [NPU] fix depthwise_conv2d_grad, test=develop

* remove debug files, test=develop
```
d3e06a51
Add gelu convert test (#35529) · 39dcfc6c
Committed by JingZhuangzhuang on Sep 15, 2021
```
Co-authored-by: xiaoxiaohehe001 <hiteezsf@163.com>
```
39dcfc6c

Add New OP: gumbel_softmax (#35506) · 18eda6c3

Committed by YuanRisheng on Sep 15, 2021

* Add New Op: gumbel_softmax

* Add New Op: gumbel_softmax

* Add New Op: gumbel_softmax (amend)

* add __main__ function in unit test

* fix bugs when test in windows ci

* update en docs

* delete relative error in unit test

* delete relative error in unit test

* set hard=True in unit test

18eda6c3

Add paddle.cuda.device.stream_guard API (#35623) · 3218075d
Committed by Siming Dai on Sep 15, 2021
```
Add paddle.cuda.device.stream_guard API 
```
3218075d
[hybrid] out data parallel as optimizer sharding parallel (#35593) · 78465703
Committed by WangXi on Sep 15, 2021

78465703

[Paddle Inference]Add split op TRT converter unittest. (#35127) · e26a2504

Committed by xiaoxiaohehe001 on Sep 15, 2021

* add_split_op

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

* add_split_teller

e26a2504

[Paddle Inference]Add Transpose op TRT converter unittest (#35138) · 09f920a2

Committed by xiaoxiaohehe001 on Sep 15, 2021

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

* add_transpose_teller

09f920a2

[Paddle Inference]Add scale TRT converter unittest. (#35225) · c563609a

Committed by xiaoxiaohehe001 on Sep 15, 2021

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

* add_scale_teller

c563609a

机器未来 / Paddle (in sync with the fork's source project)
