Commits · 673bf71968e1623daa862cdf7ea50ccdc66393ff · 机器未来 / Paddle

23 Aug, 2021 13 commits
- [oneDNN] disable caching for interpolate and batch Norm (#35030) · 673bf719
  Committed by Jacek Czaja on Aug 23, 2021
```
* - disabled interpolate onednn

* - compilation fix

* - draft of batch_norm cache disabling

* - fixes to UT
```
- upgrade oneDNN to v2.3.2 (#35040) · a047c139
  Committed by lidanqing on Aug 23, 2021
- support infer_ut on windows nightly build (#35049) · 4f86aae0
  Committed by Peihan on Aug 23, 2021
```
* enable infer_ut on windows

* remove lib calculation & time

* unset http_proxy when download bos file on windows
```
- Refactor the organization of layer_norm cuda impl. (#34883) · 7f5eb533
  Committed by Li Min on Aug 23, 2021
```
Refactor the organization of layer_norm cuda impl so that it can be reused in fused attention op.

    Extract the layer_norm cuda impl from layer_norm_op.cu to layer_norm_kernel.cu.h.
    Define fused/attention_layer_norm.h, which can be used in fused attention op in next PR.
```
- [hybrid performance] optim the grad fuse for pipeline mode by sorting the grad by dtype (#35070) · fad4b3b4
  Committed by Yuang Liu on Aug 23, 2021
- Support getitem by Bool index (#35026) · b6dc16cb
  Committed by zyfncg on Aug 23, 2021
```
* Support getitem by Bool index

* delete some debug info of bool index

* support the case that the shape of bool index is different from indexed tensor
```
- Revert "use spin lock in auto growth allocator (#34910)" (#35069) · 97fef015
  Committed by wanghuancoder on Aug 23, 2021
```
This reverts commit 6bacfb0e.
```
- add beam_search_decode npu op (#34967) · 4ce272ed
  Committed by pangyoki on Aug 23, 2021
- add fill_constant_batch_size_like npu op (#34563) · 7d86737c
  Committed by pangyoki on Aug 23, 2021
- Fix a bug of strided_slice op, about the axes parameter access memory out of bounds (#35062) · aefec228
  Committed by TeslaZhao on Aug 23, 2021
- set node feature (#34994) · c3efabeb
  Committed by seemingwang on Aug 23, 2021
- add adamw cuda kernel (#35020) · 77a8a394
  Committed by zhaoyingli on Aug 23, 2021
```
* adamw support cuda

* adamw support cuda
```
- Add cuda.device_count api (#34811) · cf99c0d5
  Committed by Linjie Chen on Aug 23, 2021
```
* Add cuda device count api

* update code format

* fix unittest error

* update code format

* update comment
```
22 Aug, 2021 1 commit
- implementation of broadcast add backward by reduce (#34143) · 56c5e210
  Committed by Zhang Zheng on Aug 22, 2021
20 Aug, 2021 13 commits
- Add paddle.linalg.matrix_power OP (#34667) · e2241a43
  Committed by Hao Lin on Aug 20, 2021
- [hybrid performance] Grad fuse for gradient merge under pipeline mode (#35004) · 4d9b2d6d
  Committed by Yuang Liu on Aug 20, 2021
- fix model-benchmark build error (#35041) · f6015d0d
  Committed by tianshuo78520a on Aug 20, 2021
- [npu]Add argsort op (#34865) · 99ffeffe
  Committed by lzzyzlbb on Aug 20, 2021
```
* add rmsprop npu

* add argsort npu

* add argsort npu

* modify according to review

* modify sharedatawith according to review

* modify reshape according to review

* rm dygraph=false
```
- [NPU] Support npu kernel for pad3d op (#34815) · ef517a56
  Committed by Sing_chan on Aug 20, 2021
```
* [NPU] Support npu kernel for pad3d op

* fix for comment of zhouwei25

* fix some bugs according to qili93's comments

* add support and test for paddings in input

* delete VLOG used for debug
```
- use spin lock in auto growth allocator (#34910) · 6bacfb0e
  Committed by wanghuancoder on Aug 20, 2021
```
* use spin lock in auto growth allocator, test=develop

* use pthread spin lock, test=develop

* use lock guard, test=develop

* use malloc spin lock, test=develop

* use lock_guard, test=develop
```
- fix set_lod in data_feed (#35000) · 4416c793
  Committed by wangguanqun on Aug 20, 2021
```
* add trainer desc config to distributed strategy

* code style modified

* data_feed set lod
```
- [NPU] Support npu op depthwise_conv2d (#34853) · 4c115a82
  Committed by zhaoyingli on Aug 20, 2021
```
* add depthwise_conv2d npu

* add some tests

* Delete test_unique_op_npu.py

* delete trans input
```
- [NPU] Support npu op where and where grad (#34587) · d082955e
  Committed by zhaoyingli on Aug 20, 2021
```
* [NPU] Support npu op where and where grad

* fix use const_cast

* delete a test
```
- temporary disable resnet50-quant multi-thread test (#35035) · f927b653
  Committed by Peihan on Aug 20, 2021
- add (N,C,*) input support for GroupNorm (#34773) · 46371515
  Committed by JYChen on Aug 20, 2021
```
* add (N,C,*) input support for GroupNorm

* --amend
```
- [bug fix] fix spectral_norm bug (#35005) · 1aa2bde0
  Committed by shangliang Xu on Aug 20, 2021
- Add op benchmark run function log (#35034) · 096b0f2e
  Committed by tianshuo78520a on Aug 20, 2021
```
* Add run function log

* test=document_fix
```
19 Aug, 2021 11 commits
- [NPU] Support npu kernel for sin op (#34844) · 4641e8fc
  Committed by JingZhuangzhuang on Aug 19, 2021
```
* add npu sin op

* [NPU] Support npu kernel for sin op

* modify support npu kernel for sin op

* modify support npu kernel for sin op

* modify npu sin op

* modify npu sin op

* add sin op npu
```
- fix reshape when is a number (#35016) · 866c1ea6
  Committed by parap1uie-s on Aug 19, 2021
- Fix op-benchmark cpu/gpu; test=document_fix (#35027) · ed9a14e4
  Committed by tianshuo78520a on Aug 19, 2021
- remove unused statements in test_dist_base.py (#35017) · ef024c89
  Committed by lilong12 on Aug 19, 2021
- add resnet50_quant model in PR-CI-INFERENCE (#35012) · 97cae5e8
  Committed by Peihan on Aug 19, 2021
```
* add slim resnet50 quant model in pr-ci-inference

* enable resnet50_quant multi_thread4_trt_int8_bz1

* remove LOG(FATAL)
```
- Add dimension check for inverse to avoid dividing by 0 error when input's... · a2e08657
  Committed by Yiqun Liu on Aug 19, 2021
```
Add dimension check for inverse to avoid dividing by 0 error when input's shape is [0, 0, 0]. (#34996)
```
- fix batch_norm and instance norm when input is [] (#34107) · ca7f5208
  Committed by ceci3 on Aug 19, 2021
```
* fix batch_norm and instance norm when input is []
```
- add the auto scan test for TensorRT convert,test=develop (#34980) · 255fc7d8
  Committed by 王明冬 on Aug 19, 2021
- Fix Inference CI CPU/GPU (#34931) · 26213a77
  Committed by tianshuo78520a on Aug 19, 2021
```
* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* fix error

* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* test=gpu-inference
```
- Fix op-benchmark cpu/gpu error (#34997) · c4e05e1c
  Committed by tianshuo78520a on Aug 19, 2021
- Abstract DeviceEvent to manage cross-platform Event implementation (#34922) · 22da1907
  Committed by Aurelius84 on Aug 19, 2021
```
* add device_context

* add gtest for device_event_gpu

* Remove duplicate DeviceType

* push for test

* add unittest

* fix macros

* fix MSVC using usage
```
18 Aug, 2021 2 commits

- [NPU]add rmsprop op (#34864) · 9cbba97b
  Committed by lzzyzlbb on Aug 18, 2021
```
* [npu]add rmsprop op
```

- Add NPU kernel for norm Op: float16 and float32 (#34609) · 755c8a19
  Committed by xiongkun on Aug 18, 2021

* Add NPU kernel for norm Op: float16 and float32

* fix code for code review

* fix for code review

* add type for paddle_throw

* remove unnecessary head file. Add more testcase

* remove a broadcast

