Commits · 7c3567ea1b0202762e6d698897f57b8c17bd9688 · PaddlePaddle / Paddle

Sep 24, 2021 · 4 commits
- fix cusparse compile bug in windows CUDA11.2, test=develop (#35941) · 7c3567ea
  Committed by Liu-xiandong on Sep 24, 2021
- add emb_eltwise_layernorm trt converter test case (#36027) · 0bbaf9bd
  Committed by baoachun on Sep 24, 2021
- add multihead_matmul trt converter test case (#36023) · fcaa64b3
  Committed by baoachun on Sep 24, 2021
```
* add multihead_matmul trt converter test case

* move attribute check to op_teller
```
- add the shape check for the matmul (#35791) · 8e19d1ba
  Committed by wawltor on Sep 24, 2021
```
* add the shape check for the matmul

* remove the test case for the linear
```
Sep 23, 2021 · 7 commits
- Optimize workqueue (#35931) · 4e7bd9c3
  Committed by liutiexing on Sep 23, 2021
```
* add align for WorkQueue

* WorkQueue update

* Revert "WorkQueue update"

This reverts commit 14ce793dbb204f8ddec63c34b3b72a73c7cdb93a.

* optimize WorkQueue
```
- fix ernie-int8 compile error on windows (#35972) · d6d2dafa
  Committed by Peihan on Sep 23, 2021
- fix trt problem (#35938) · 8d0922ed
  Committed by Wilber on Sep 23, 2021
- Replace Eigen with Lapack library for eigvals OP kernel (#35909) · 9b8aafe5
  Committed by From00 on Sep 23, 2021
- Add fused_attention_op: add impl wrappers. (#35903) · 88ea8e6f
  Committed by Li Min on Sep 23, 2021
- add argmax and iou_similarity for kunlun (#35836) · 7bf84e2d
  Committed by TTerror on Sep 23, 2021
```
* add argmax and iou_similarity for kunlun

* add argmax and iou_similarity for kunlun

* add argmax and iou_similarity for kunlun
```
- add pass_desc_py_proto depends, test=develop (#35864) · 1548407d
  Committed by wuhuanzhou on Sep 23, 2021
```
add pass_desc_py_proto depends
```
Sep 22, 2021 · 24 commits
- Fix copy elision warning (#35885) · 47d6bc86
  Committed by Tomasz Socha on Sep 22, 2021
```
* Fix copy elision warning

* Remove redundant code
```
- ResnetUnitOp implemented by cuDNN fused op(backend code) (#35557) · 736a7388
  Committed by Zhang Zheng on Sep 22, 2021
- move variable UPLOAD_TP_FILE to the beginning or it cant be initialized when... · 482f062d
  Committed by Sing_chan on Sep 22, 2021
```
move variable UPLOAD_TP_FILE to the beginning or it cant be initialized when running build-whl task (#35895)
```
- fix adamw DeprecationWarining (#35869) · f67a50bd
  Committed by zhaoyingli on Sep 22, 2021
- [AMP]split minimize and add unscale_ for GradScaler (#35825) · bf6f0e54
  Committed by zhangbo9674 on Sep 22, 2021
```
* split minimize() to step() + update()

* add unscale and step for grad_scaler

* add unittest

* refine code in minimize

* delete step in loss_scaler

* fix example bug

* refine comment

* refine unittest

* add unittest
```
- [NPU] add randperm_op_npu (#35763) · 4f0c3278
  Committed by ronnywang on Sep 22, 2021
```
* add randperm_op_npu

* fix test_set_value_op_npu
```
- op:transpose_op supports bool type (#35886) · 0c6ee945
  Committed by TeslaZhao on Sep 22, 2021
```
* Pass compat of conv_transpose_bias_mkldnn_fuse_pass

* Fix a bug of strided_slice op, about the axes parameter access memory out of bounds

* Fix a bug of transpose op, about accessing memory out of bounds of the perm param

* op:transpose_op supports bool type
```
- Det &Slogdet (#34992) · 9ce45ddd
  Committed by huangxu96 on Sep 22, 2021
```
Add new API: paddle.linalg.det & paddle.linalg.slogdet

API Alias: paddle.det & paddle.slogdet
```
- update paddle2onnx version to 0.8.2 in unittest_py/requirements.txt (#35837) · 00e0e358
  Committed by yeliang2258 on Sep 22, 2021
- support ernie-int8 test and prune op attribute test (#35890) · e8789c11
  Committed by Peihan on Sep 22, 2021
```
* support ernie-int8 test and prune op attribute test

* remove using and use namespace

* remove macro and use shell instead

* Revert "remove macro and use shell instead"

This reverts commit 615964b149d7de7825b341936b42be22a4bc0091.

* fix grammar error

* fix shell error
```
- add no need buffer check, test=develop (#35790) · 7ebbcbbc
  Committed by wanghuancoder on Sep 22, 2021
- refine FLAGS approval (#35904) · 7ba69249
  Committed by Zeng Jinle on Sep 22, 2021
- [Inference] Support NNAdapter and ascend310 (#35226) · 10e53044
  Committed by JingZhuangzhuang on Sep 22, 2021
- fix: delete_quant_dequant_filter_op_pass, delete_quant_dequant_op_pass (#35879) · 5cda6b2b
  Committed by Wangzheee on Sep 22, 2021
- fix conv2d convert test (#35627) · 1238115e
  Committed by JingZhuangzhuang on Sep 21, 2021
```
* support nnadapter and ascend310

* modify code

* add anchor_generator convert test

* add gelu convert test

* add conv2d convert test

* modify anchor_operator convert test

* modify conv2d test

* modify conv2d convert test

* modify conv2d convert test

* modify conv2d convert test

* modify conv2d test

* fix WITH_PYTHON compile error

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file
Co-authored-by: Nxiaoxiaohehe001 <hiteezsf@163.com>
Co-authored-by: Njiweibo <jiweibo@baidu.com>
```
- Add quant2 int8 lstm model test (#35887) · be4d0026
  Committed by joanna.wozna.intel on Sep 22, 2021
- fix feed for new executor (#35803) · 4c2a06df
  Committed by wanghuancoder on Sep 21, 2021
```
* fix feed, test=develop

* delete one test case, test=develop
```
- add timeline(recordevent) for new executor, test=develop (#35831) · 5574c8cf
  Committed by wanghuancoder on Sep 21, 2021
- refine gc for new_executor (#35764) · fab1a029
  Committed by wanghuancoder on Sep 21, 2021
```
* refine gc for new_executor, test=develop

* refine, test=develop

* refine, test=develop

* merge, test=develop
```
- Modify H2D and D2H as kQueue::Sync and Polish Schedule logic (#35866) · fe35496b
  Committed by Aurelius84 on Sep 22, 2021
```
* Modify H2D and D2H as kQueue::Sync

* fix interface error
```
- [2.2]support extern third_party lapack API on Linux/Windows/Mac (#35690) · ae65257d
  Committed by zhouweiwei2014 on Sep 22, 2021
```
* support extern third_party lapack on Linux/Windows/Mac

* fix ci
```
- disable tests for fft on windows with gpu (#35872) · 5af6081a
  Committed by Feiyu Chan on Sep 22, 2021
- fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (#35862) · 12ab017e
  Committed by zhangbo9674 on Sep 22, 2021
- add dilation check for conv (#35838) · 77134300
  Committed by wangguanzhong on Sep 22, 2021
Sep 21, 2021 · 2 commits

- support fp16 (#35888) · 087c23a9
  Committed by Guoxia Wang on Sep 21, 2021

- Reuse OneDNN handler for SGD and SUM for SelectedRows input tensors. (#35510) · 799f3861
  Committed by Adam Osewski on Sep 20, 2021

* Create stateful OneDNNAXPYHandler object.

This makes it possible to call it multiple times without recreating the
oneDNN primitives every time.

* Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.

* OneDNN SGD kernel.

* Update call to use new OneDNNAXPYHandler object api.

* Setup seed in proper place.

* Enable OneDNN kernel only for single case.

* For dense param and sparse grad.

* Small refactor.

* Enable oneDNN by op attr or by cmd line flag.

* Use int64_t type for number of elements.

* Support dense param and grad from OneDNN kernel.

* Enable SGD OneDNN kernel when use MP BF16 optimizer.

* Force non-copyable/movable OneDNNAXPYHandler.

* Reuse OneDNNAXPYHandler for sparse tensors in SUM op.

* Fix SFINAE rules.

* Remove recording event inside AXPY.

* Get rid of internal primitive caching.

* Stop using PP cache mechanism to store mem and primitive obj.
* Handler obj store and reuse needed desc & prim

* Do not derive from MKLDNNHandlerT


Sep 19, 2021 · 2 commits

- Optimization of pool2d grad (#35389) · 86685190
  Committed by limingshu on Sep 19, 2021

* Optimization of pool2d grad, first commit.

* remove useless print codes

* refine codes

* refine codes

* seal more operation into template specialization

* fix template struct error in MaxPool2dGrad.

* Fix header including error

* refine code with comment

* Seal the param-preparation codes into function for common use.

* Seal the param-preparation codes into function for common use.

* Seal the param-preparation into function and make it common for other kernels

* polish code and erase useless template specialization

* Rerun trigger

* rerun trigger


- add hard_sigmoid trt converter test cases (#35876) · 9f88d327
  Committed by baoachun on Sep 19, 2021

Sep 18, 2021 · 1 commit
- increase test_imperative_auto_mixed_precision timePROPERTIES TIMEOUT (#35863) · e7617512
  Committed by zhangbo9674 on Sep 18, 2021

PaddlePaddle / Paddle: last synced successfully more than 1 year ago
