Commits · 77a8a3944a01ac2cb3a62c99cd7de459872a01b8 · PaddlePaddle / Paddle

23 Aug, 2021 1 commit
- add adamw cuda kernel (#35020) · 77a8a394
  Committed by zhaoyingli on Aug 23, 2021
```
* adamw support cuda

* adamw support cuda
```
22 Aug, 2021 1 commit
- implementation of broadcast add backward by reduce (#34143) · 56c5e210
  Committed by Zhang Zheng on Aug 22, 2021
20 Aug, 2021 10 commits
- Add paddle.linalg.matrix_power OP (#34667) · e2241a43
  Committed by Hao Lin on Aug 20, 2021
- [hybrid performance] Grad fuse for gradient merge under pipeline mode (#35004) · 4d9b2d6d
  Committed by Yuang Liu on Aug 20, 2021
- [npu]Add argsort op (#34865) · 99ffeffe
  Committed by lzzyzlbb on Aug 20, 2021
```
* add rmsprop npu

* add argsort npu

* add argsort npu

* modify according to review

* modify sharedatawith according to review

* modify reshape according to review

* rm dygraph=false
```
- [NPU] Support npu kernel for pad3d op (#34815) · ef517a56
  Committed by Sing_chan on Aug 20, 2021
```
* [NPU] Support npu kernel for pad3d op

* fix for comment of zhouwei25

* fix some bugs according to qili93's comments

* add support and test for paddings in input

* delete VLOG used for debug
```
- use spin lock in auto growth allocator (#34910) · 6bacfb0e
  Committed by wanghuancoder on Aug 20, 2021
```
* use spin lock in auto growth allocator, test=develop

* use pthread spin lock, test=develop

* use lock guard, test=develop

* use malloc spin lock, test=develop

* use lock_guard, test=develop
```
- fix set_lod in data_feed (#35000) · 4416c793
  Committed by wangguanqun on Aug 20, 2021
```
* add trainer desc config to distributed strategy

* code style modified

* data_feed set lod
```
- [NPU] Support npu op depthwise_conv2d (#34853) · 4c115a82
  Committed by zhaoyingli on Aug 20, 2021
```
* add depthwise_conv2d npu

* add some tests

* Delete test_unique_op_npu.py

* delete trans input
```
- [NPU] Support npu op where and where grad (#34587) · d082955e
  Committed by zhaoyingli on Aug 20, 2021
```
* [NPU] Support npu op where and where grad

* fix use const_cast

* delete a test
```
- temporary disable resnet50-quant multi-thread test (#35035) · f927b653
  Committed by Peihan on Aug 20, 2021
- add (N,C,*) input support for GroupNorm (#34773) · 46371515
  Committed by JYChen on Aug 20, 2021
```
* add (N,C,*) input support for GroupNorm

* --amend
```
19 Aug, 2021 6 commits

[NPU] Support npu kernel for sin op (#34844) · 4641e8fc

Committed by JingZhuangzhuang on Aug 19, 2021

* add npu sin op

* [NPU] Support npu kernel for sin op

* modify support npu kernel for sin op

* modify support npu kernel for sin op

* modify nou sin op

* modify npu sin op

* add sin op npu

add resnet50_quant model in PR-CI-INFERENCE (#35012) · 97cae5e8

Committed by Peihan on Aug 19, 2021

* add slim resnet50 quant model in pr-ci-inference

* enable resnet50_quant multi_thread4_trt_int8_bz1

* remove LOG(FATAL)

Add dimension check for inverse to avoid dividing by 0 error when input's... · a2e08657
Committed by Yiqun Liu on Aug 19, 2021
```
Add dimension check for inverse to avoid dividing by 0 error when input's shape is [0, 0, 0]. (#34996)
```
fix batch_norm and instance norm when input is [] (#34107) · ca7f5208
Committed by ceci3 on Aug 19, 2021
```
* fix batch_norm and instance norm when input is []
```

Fix Inference CI CPU/GPU (#34931) · 26213a77

Committed by tianshuo78520a on Aug 19, 2021

* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* fix error

* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* test=gpu-inference

Abstract DeviceEvent to manage cross-platform Event implementation (#34922) · 22da1907

Committed by Aurelius84 on Aug 19, 2021

* add device_context

* add gtest for device_event_gpu

* Remvoe duplicate DeviceType

* push for test

* add unittest

* fix macros

* fix MSVC using usage

18 Aug, 2021 13 commits

[NPU]add rmsprop op (#34864) · 9cbba97b
Committed by lzzyzlbb on Aug 18, 2021
```
* [npu]add rmsprop op
```

Add NPU kernel for norm Op: float16 and float32 (#34609) · 755c8a19

Committed by xiongkun on Aug 18, 2021

* Add NPU kernel for norm Op: float16 and float32

* fix code for code review

* fix for code review

* add type for paddle_throw

* remove unnecessary head file.\nAdd more testcase

* remove a broadcast

fix pad outliers err (#34979) · 248e27b7

Committed by littletomatodonkey on Aug 18, 2021

* fix pad outliers err

* fix pad api input type and doc

* fix example of pad

* add unittest for pad3d

* fix unittest

* fix error format

* fix pad doc

code refactoring for new executor (#34970) · 40d4d834

Committed by wanghuancoder on Aug 18, 2021

* code refactoring, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

add paddle detection model in pr-ci-inference (#34986) · 1b747de7
Committed by Peihan on Aug 18, 2021

[NPU] Add square grad (#34889) · 1b71a718
Committed by Jackwaterveg on Aug 18, 2021
```
* test=develop

* test=develop
```
[NPU] Add leaky Relu (#34894) · 40f62737
Committed by Jackwaterveg on Aug 18, 2021
```
* test=develop

* test=develop
```
[Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16... · a9673b44
Committed by WangXi on Aug 18, 2021
```
[Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (#34965)
```

[CustomOp] Fix ext_tensor.cast failed bug (#34884) · 4d88cdb8

Committed by Chen Weihang on Aug 18, 2021

* fix ext_tensor.cast failed bug

* remove useless deps

* fix windows cmake failed

* try to fix windows make failed

* fix make error on windwos

Add function to disable paddle signal handler (#34577) · dd533dd3

Committed by Zhanlue Yang on Aug 18, 2021

* Add function to disable paddle signal handler

Paddle used google::InstallFaultSignalHandler to handle selected system signals,
mainly for debugging and bug report purposes.

However, this can be conflicted with other python packages whoever captures similar signals.
Such python package involves tvm and more

To resolve this issue, we support a function to disable signal handler

* Remove signal test from WIN32 platform

* Remove redundant return from disable_signal_handler() function

* Add detailed messages to en_doc
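
A minimal sketch of how a user script might call the function this commit describes. The name `disable_signal_handler()` comes from the commit message; exposing it at the top level as `paddle.disable_signal_handler` is an assumption here, and the wrapper `disable_paddle_signal_handler` is a hypothetical helper that simply does nothing when Paddle or the function is unavailable.

```python
import importlib.util


def disable_paddle_signal_handler() -> bool:
    """Disable Paddle's fault signal handler if the API is available.

    Returns True when the handler was disabled, and False when Paddle
    (or the function) is not present in this environment.
    """
    if importlib.util.find_spec("paddle") is None:
        return False  # Paddle is not installed; nothing to disable.

    try:
        import paddle
    except ImportError:
        return False  # Paddle is present but cannot be imported here.

    # Assumption: the function is exposed at the package top level.
    disable = getattr(paddle, "disable_signal_handler", None)
    if disable is None:
        return False  # A Paddle build that predates this function.

    disable()
    return True


if __name__ == "__main__":
    print("disabled:", disable_paddle_signal_handler())
```

A script that also uses a package with its own signal handling, such as tvm, could call this helper once near startup.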

add the safe check for the some ops (#34978) · 12bf046b
Committed by wawltor on Aug 18, 2021

[NPU] add retry on HcclGetRootInfo to fix "bind fail" (#34977) · 52a7b0c4
Committed by Leo Chen on Aug 18, 2021
```
* add retry for HcclGetRootInfo

* refine code

* reduce retry interval
```
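
The retry-with-interval idea named in this commit can be illustrated with a generic sketch. This is not the code from the commit, which is not shown on this page: `call_with_retry`, its parameters, and the `make_flaky_get_root_info` stand-in (which fails with "bind fail" a fixed number of times) are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], attempts: int = 3,
                    interval_s: float = 0.1) -> T:
    """Call fn until it succeeds or the attempts are exhausted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[RuntimeError] = None
    for attempt in range(attempts):
        try:
            return fn()
        except RuntimeError as err:
            last_error = err
            # Sleep only if another attempt will follow.
            if attempt + 1 < attempts:
                time.sleep(interval_s)
    assert last_error is not None
    raise last_error


def make_flaky_get_root_info(failures: int) -> Callable[[], str]:
    """Hypothetical stand-in that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def get_root_info() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("bind fail")
        return "root-info"

    return get_root_info


if __name__ == "__main__":
    print(call_with_retry(make_flaky_get_root_info(2), attempts=3, interval_s=0.01))
```

A shorter `interval_s` trades faster recovery for more frequent calls, which is the kind of tuning the "reduce retry interval" step suggests.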
support class center sample of PartialFC (#34106) · 100db44f
Committed by Guoxia Wang on Aug 18, 2021
```
* support class center sample of PartialFC
```

17 Aug, 2021 9 commits

[NPU]Adamw skip update for npu (#34897) · b4474fb4
Committed by Roc on Aug 17, 2021

[NPU] add where_index op and tests (#34951) · 1ef21855
Committed by Aganlengzi on Aug 17, 2021

Copy boost optional to Paddle (#34780) · 9be41447

Committed by chentianyu03 on Aug 17, 2021

* copy boost optional.hpp to paddle

* copy boost optional.hpp to paddle

* move directions

* del fluid/utils

* modify .hpp to .h

* move directions

* modify to paddle::optional

* add modification description

* format code stype for the files in paddle/utils

* format code stype

[oneDNN ] disabling more ops caching (#34830) · f1c1d9e0

Committed by Jacek Czaja on Aug 17, 2021

* - disabled caching of layer norm

- fix in compilation

- compilation fix

- transpose caching disabled

- compilation fix

- more compilation fixes

- sum caching disabled

- compilation fix

* - LRN with disabled cache

* lint fixes

[bug fix] fix unfold negative_size_param (#34943) · 8ef1bf87
Committed by shangliang Xu on Aug 17, 2021
```
* [bug fix] fix unfold negative_size_param
```
add mkl multi-thread test cases in PR-CI-INFERENCE (#34946) · 9d4f00bc
Committed by Peihan on Aug 17, 2021
```
* add mkl multi-thread test cases

* fix codestyle

* fix codestyle & enable ernie mkl test
```

Align CTC grad scale same with ESPNet (#34729) · 10f9644c

Committed by Hui Zhang on Aug 16, 2021

* dygraph support more ctc grad scale

* scale for 1.x

* fix unitest

* fix unitest

* format code

* fix unittest

* fix log info

* unittest cov

* fix format;notest,test=cpu,coverage

* skip ctc_loss egs;test=cpu

* warpctc grad cov;test=coverage

* add dygraph test;test=coverage

* format;test=cpu,coverage

* format;test=cpu

* add api compat;test=cpu

* add cpu test

* rename

* rename

* fix

* fix test

* format

* eigen cpu

* eigen gpu grad pass

* cuda gpu pass

* format

* fix ci

Add some passes which can be applied to Program (#34730) · 8046e33d

Committed by Zeng Jinle on Aug 17, 2021

* add inplace passes and tests

* update

* fix use_cuda undefined
fix compile error of op compat

* add more ut

* fix CPU CI error

* check adam unique

* fix mac/windows ci, improve coverage

* fix ci error

* follow weihang's comment

* fix BlockDesc::MoveFrom

* follow qiuliang's comment

* update

* follow huihuang's comments

add api fill_diagonal_inplace (#34460) · 5de576b0
Committed by zhiboniu on Aug 17, 2021

PaddlePaddle / Paddle: synced successfully about 1 year ago