Commits · 6d353aa524770279a9b216e011d6623b7be0ea35 · Crayon鑫 / Paddle

11 Oct, 2021 8 commits
- J
  
  fix for matmul_v2 6D x 2D (#36342) · 339cb191
  Committed by jakpiase on Oct 11, 2021
  
  339cb191
- L
  Add nn.functional.sparse_attention and some test cases, test=develop (#35757) · 85b77232
  Committed by Liu-xiandong on Oct 11, 2021
```
Add paddle.nn.functional.sparse_attention API

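    This PR mainly wraps the sparse_attention functionality at the Python layer; for the main OP code, see #PR35676

    In addition, corresponding unit tests were added for the wrapped Python interface.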
```
  85b77232
- Z
  
  Add more tests and fix bugs for cudnn_norm_conv_test and cudnn_bn_and_relu_test (#36314) · a679fcbb
  Committed by Zhang Zheng on Oct 11, 2021
  
  a679fcbb
- N
  Add functor_primitives.h for kernel primtive api (#36203) · 830debc2
  Committed by niuliling123 on Oct 11, 2021
```
* Add functor_primitives.h for kernel primtive api

* update

* move namespace kps

* subFunctor init_data

* delete InvalidArgumentError
```
  830debc2
- Q
  [NPU] fix matmul_v2 and utils.run_check, test=develop (#36164) · 7850f7ce
  Committed by Qi Li on Oct 11, 2021
```
* [NPU] fix matmul_v2 and utils.run_check, test=develop

* remove debug files, test=develop

* fix install_check, test=develop

* fix doc, test=develop

* fix review comments, test=develop
```
  7850f7ce
- Q
  [NPU] fix set_value, test=develop (#36272) · 83541fd4
  Committed by Qi Li on Oct 11, 2021
```
* [NPU] fix set_value, test=develop

* fix typo, test=develop

* fix typo, test=develop
```
  83541fd4
- Q
  
  [NPU] fix softmax_with_cross_entropy in dygraph, test=develop (#36297) · 11061325
  Committed by Qi Li on Oct 11, 2021
  
  11061325
- X
  
  use unified external error message for cufft api (#36114) · 642aaa2e
  Committed by Xiaoxu Chen on Oct 11, 2021
  
  642aaa2e
09 Oct, 2021 3 commits
- Z
  
  Implement Fused BN + Add + Relu with cudnnFusedOps API. (#35955) · 7e6c0cee
  Committed by Zhang Zheng on Oct 09, 2021
  
  7e6c0cee
- Y
  
  Enhance OpTest for bfloat16. (#36079) · 91119271
  Committed by Yiqun Liu on Oct 09, 2021
  
  91119271
- Z
  
  fill_diagonal op fix border cross caused by offset (#36212) · 62e41150
  Committed by zhiboniu on Oct 09, 2021
  
  62e41150
08 Oct, 2021 5 commits
- J
  Fix for oneDNN conv op (#36284) · 57e8cbec
  Committed by jakpiase on Oct 08, 2021
```
* fix for conv op

* Minor change
```
  57e8cbec
- Z
  Support CUDA Graph on ParallelExecutor (#36250) · f9591bb1
  Committed by Zeng Jinle on Oct 08, 2021
```
* support CUDA Graph on PE

* add ut, fix CI compile

* reduce memory consumption

* fix CUDA 10 CI

* improve coverage

* improve python coverage
```
  f9591bb1
- Q
  [NPU] BatchNorm support layout of NCL and NLC, test=develop (#35668) · 7cb19f57
  Committed by Qi Li on Oct 08, 2021
```
* [NPU] support NCL and NCL for BatchNorm, test=develop

* [NPU] remove debug files, test=develop

* update, test=develop
```
  7cb19f57
- A
  Added oneDNN BF16 relu (#36265) · 1bd9cfef
  Committed by arlesniak on Oct 08, 2021
```
* Added oneDNN BF16 relu

* fixed typo

* refactored test, review fixes
```
  1bd9cfef
- Z
  
  fix cast cuda implementation (#36266) · 9814f895
  Committed by Zeng Jinle on Oct 08, 2021
  
  9814f895
07 Oct, 2021 1 commit

[OneDNN] Conv op refactor. (#36252) · e9288340

Committed by Adam Osewski on Oct 07, 2021

* Remove unused header.

* Use ConvMKLDNNHandlerT for conv2d INT8.

* Use absolute module path to import.

e9288340

05 Oct, 2021 1 commit

Added concat BF16/FP32 BWD OneDNN kernel (#35889) · dc4d5719

Committed by jakpiase on Oct 05, 2021

* tmp

* added concat BF16/FP32 BWD oneDNN kernel

* minor change

* minor change

* fix for CI

* added formatting

* Reverted deleting static keyword

* added reviewers suggestions

* reverted deleting concat bf16 test file

* fixed concat tests

dc4d5719

30 Sep, 2021 1 commit

[NPU] modify transpose2 and index_select_grad kernels for model xlnet (#36214) · a66b9fba

Committed by Aganlengzi on Sep 30, 2021

* [NPU] modify transpose2 and index_select_grad kernels for model xlnet

* add transpose2 int64_t unit test

* add more transpose2 unit tests

* update test_transpose_op_npu.py

a66b9fba

29 Sep, 2021 7 commits
- Z
  [npu] add box coder (#36171) · 83578cfa
  Committed by zhulei on Sep 29, 2021
```
* [npu] add box coder

* [npu] add box coder
```
  83578cfa
- P
  
  fix bug of top_k npu op (#36175) · 2b8fd704
  Committed by pangyoki on Sep 29, 2021
  
  2b8fd704
- Z
  [NPU] Add group norm (#35937) · c79de728
  Committed by zhulei on Sep 29, 2021
```
* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group_norm op
```
  c79de728
- A
  [NPU] mod for model bert (#36165) · 7bddf2e8
  Committed by Aganlengzi on Sep 29, 2021
```
* merge conflict of paddle_gtest_main.cc

* modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt
```
  7bddf2e8
- Y
  
  Implement the grad and enhance the cache of norm_convolution fusion ops. (#36168) · 767050d9
  Committed by Yiqun Liu on Sep 29, 2021
  
  767050d9
- L
  
  Add fused_dropout wrapper to ease use. (#36185) · 092d45c3
  Committed by Li Min on Sep 29, 2021
  
  092d45c3
- R
  
  [ROCM] bugfix for bilinear_interp_v2_grad (#36160) · 5e1d0b5c
  Committed by ronnywang on Sep 29, 2021
  
  5e1d0b5c
28 Sep, 2021 5 commits

L
Add sparse_attention api, test=develop (#35676) · 6b587e93
Committed by Liu-xiandong on Sep 28, 2021
```
Add sparse_attention OPs, python api will be added in next pr
```
6b587e93

add API paddle.linalg.eig (#35674) · bc7e2b92

Committed by Lijunhui on Sep 28, 2021

* Add paddle.linalg.eig op

* remove comments

* remove comments

* extend batch_size to the origin

* add real times complex functor & destroy the backward complex output bug

* terminate output diff when input real tensors

* correct tiny doc errors

* move functions from eig_helper to svd_helper and remove eig_helper

* remove tensor.Resize

* remove no longer used code

* use existing lapack functions

* reply review comments 21/27

* remove .cu as this op is only executed on CPU

* remove const_cast & add const in argument list for read-only references

* fix sample code error in CI

* remove template typename Tbase and more

* remove eig exposure in paddle.*

* add 'name=None' in eig python implementation

* handle the unittest

* try to solve the unittest

* solve CI coverage

* remove no longer used code

* polish API doc and more

* reply review comments

* polish unittest, commit plan B

* polish unittest

bc7e2b92

R

[ROCM] bugfix for arg_min_max (#36098) · 36791fdd
Committed by ronnywang on Sep 28, 2021

36791fdd

[hybrid] seed and dropout op support force-cpu (#35820) · 58c8f6b3

Committed by xiayanming on Sep 28, 2021

* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid

* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid

* [HIP] fix op not support AMD GPU bug

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] fix seed ci failed issue

* add AsExtra for force_cpu of seed op

58c8f6b3

G

fix bug of reduce_sum when src_dtype != dst_dtype and reduce_num == 1 (#36123) · d5268a6e
Committed by Guoxia Wang on Sep 28, 2021

d5268a6e

27 Sep, 2021 3 commits

fix zero tensor for unique, unstack (#36021) · efd35384

Committed by Jiawei Wang on Sep 27, 2021

* fix extra op for expand, expand_as, tile, unstack

* fix unique unstack dim 0

* Update expand_v2_op.cc

* fix unique_op format

efd35384

Lars op optimiztion with cudaLaunchCooperativeKernel method (#35652) · a112ce42

Committed by limingshu on Sep 27, 2021

* A leap of try for cudaLaunchCooperativeKernel

* fix bugs

* Totally replace the lar cuda kernel

* Fix bugs

* fix code according to comments

* fix codes according to  review comments

* adding some function overload

* relocate the power operation.

a112ce42

Added flatten and flatten2 BF16/FP32 FWD/BWD kernels (#35892) · e427a0f1

Committed by jakpiase on Sep 27, 2021

* refactored reshape multiop kernel and added flatten1/2 kernels

* added formatting for flatten tests

* CI fix

* disabled reshape_kernel ops after succesful CI run

* minor fix

e427a0f1

26 Sep, 2021 6 commits
- J
  
  bugfix reshape -1 (#36087) · 2fe9ae71
  Committed by JZ-LIANG on Sep 26, 2021
  
  2fe9ae71
- J
  [new api] add func/class API psroi_pool and UT (#35352) · e45d64ec
  Committed by JYChen on Sep 26, 2021
```
* add func/class API psroi_pool and UT

* add UT in static mode

* Remove redundant type checks in static mode

* More detailed description for test_psroi_pool_op

* fix code format of UT

* fix en-doc
```
  e45d64ec
- Y
  
  Add a check for multiplex op (#34972) · b430f6a3
  Committed by Yulong Ao on Sep 26, 2021
  
  b430f6a3
- W
  
  Fix FPE of label smooth op (#35861) · 628ff34b
  Committed by whs on Sep 26, 2021
  
  628ff34b
- C
  
  CPU forward calculation replaces Eigen with Lapack;Modify linalg exposure rules (#35916) · 7ff226f0
  Committed by crystal on Sep 26, 2021
  
  7ff226f0
- Y
  Support fixed seed in Python for test (#36065) · 1b90f968
  Committed by YuanRisheng on Sep 26, 2021
```
* Add New Op: gumbel_softmax

* Add New Op: gumbel_softmax

* Add New Op: gumbel_softmax (amend)

* add __main__ function in unit test

* fix bugs when test in windows ci

* update en docs

* delete reletive error in unit test

* delete relative error in unit test

* set hard=True in unit test

* Support fix seed in Python for test
```
  1b90f968

Crayon鑫 / Paddle is up to date with the fork's source project
