Commits · cdd7b956e7a99401a09d99130a1757be2d979bf3 · PaddlePaddle / Paddle

09 Nov, 2022 · 3 commits
- [Paddle Inference]upgrade scale and slice op convert for Paddle-TensorRT (#47746) · cdd7b956
  Committed by Wangzheee on Nov 09, 2022
```
* upgrade scale and slice op convert for Paddle-TensorRT
```
- [Sparse]optimize sparse convolution and fix MaskHelper bug (#47703) · 1aa64d13
  Committed by zhangkaihuo on Nov 09, 2022
- refine python call error report (#47724) · 5c7fce47
  Committed by wanghuancoder on Nov 09, 2022
```
* refine python call error report
```
08 Nov, 2022 · 21 commits
- [CustomDevice] fix the not ready kernel can not register. (#47758) · 4b0f1b0c
  Committed by ronnywang on Nov 08, 2022
- Fix compiler error with_trt (#47716) · 6934ae2b
  Committed by Wilber on Nov 08, 2022
- refine comm api implementation (#47713) · 84c9a0d6
  Committed by LiYuRio on Nov 08, 2022
- [Zero-Dim] support input 0D Tensor for sundary api (#47734) · 3198af20
  Committed by zhouweiwei2014 on Nov 08, 2022
```
* [Zero-Dim] support input 0D Tensor for sundary api

* fix comment
```
- Migrate old C++ unit tests to Python framework (#47006) · 0c9f09b8
  Committed by Sławomir Siwek on Nov 08, 2022
```
* softplus+activation

* fc + elementwise_add test refactored

* rename MKLDNN to OneDNN

* fc+activation tests refactored

* remove softplus ut

* whitespace

* whitespace

* codestyle

* codestyle

* add more cases to fc+act

* remove softplus+hard_sigmoid pass

* remove softplus + hard_sigmoid UT

* add approximate for gelu

* swish beta range

* new codestyle

* reduce number of tests
```
- add adadelta op for xpu, test=kunlun (#47661) · 047971f0
  Committed by zhangyikun02 on Nov 08, 2022
- argsort support n > 16384 and add argsort_grad op for xpu, test=kunlun (#47701) · 6a6a3ff1
  Committed by zhangyikun02 on Nov 08, 2022
- add fuse_multi_transformer passes to fp16. test=develop (#47676) · caca5687
  Committed by Kaipeng Deng on Nov 08, 2022
- Fix bug of abs_double_grad in eager mode for kunlun, test=kunlun (#47722) · aba3c806
  Committed by Leo Guo on Nov 08, 2022
- [CustomDevice] fix undefined symbol GetCCLComm in the cpu version (#47717) · 97004f67
  Committed by ronnywang on Nov 08, 2022
- [Paddle Inference] allow fold fill_constant && allow nms3 into trt in int8 model (#47551) · c3a69111
  Committed by zhoutianzi666 on Nov 08, 2022
```
* allow fold fill_constant && allow nms3 into trt in int8 model
* use unordered_map
* fix CI failing
```
- [CodeStyle][py2][U004] unecessary explicit `object` inheritance in class definition (#47642) · 888272b5
  Committed by Nyakku Shigure on Nov 08, 2022
```
* [CodeStyle][py2][U004] unecessary explicit `object` inheritance in class definition

* fix an increment
```
- Split quant (#47449) · 130db92a
  Committed by Paulina Gacek on Nov 08, 2022
```
* Split kernel registered, tests for uint/int added

* Split quantized

* Split output scales calculated only once

* NearestInterp test fix reversed

* DequantizeOutputs corrected
```
- removing dependent to fluid/framework/eigen.h in phi (#47675) · c7cd8d98
  Committed by jzhang533 on Nov 08, 2022
```
* removing dependent to fluid/framework/eigen.h in phi

* more fix according to PR-CI-Py3 fail
```
- remove dist xpu tests for R200 (#47381) · ef21b58b
  Committed by tianshuo78520a on Nov 08, 2022
```
* disable distributed xpu tests

* test=kunlun

* test=document_fix;test=kunlun

* test=document_fix;test=kunlun

* test=document_fix;test=kunlun

* test=document_fix;test=kunlun
```
- support pow double grad op (#47691) · 6fe9dfb2
  Committed by Charles-hit on Nov 08, 2022
```
* support pow_double_grad op

* add unit test for pow double grad

* fix pow double grad

* optimize pow double grad kernel

* fix pow double grad kernel
```
- [Paddle-TRT]Fix cast converter bug , use setOutputType() instaead (#46289) · 18adbbd0
  Committed by zhoutianzi666 on Nov 08, 2022
```
* fix cast bug
```
- remove <fluid/eager/api/utils/global_utils.h> from phi (#47739) · 42d9fe2f
  Committed by Wang Xin on Nov 08, 2022
- normalize autotune tests dir (#47726) · 6bab3343
  Committed by Chen Weihang on Nov 08, 2022
- fix cinn_instruction_run_op_test when FLAGS_use_system_allocator=True (#47731) · a4a9ce0e
  Committed by TeFeng Chen on Nov 08, 2022
- Fix undefined symbol: shm_open (#47421) · 50c3632f
  Committed by Tomasz Socha on Nov 08, 2022
```
* Fix undefined symbol: shm_open

* Fix for Windows

* Exclude APLLE
```
07 Nov, 2022 · 16 commits
- Define ConvRunner to wrapper the call of cudnn conv functions. (#47576) · c331e2ce
  Committed by Yiqun Liu on Nov 07, 2022
```
* Define ConvRunner to wrapper the call of cudnn conv functions.

* Use ConvKind in SearchAlgorithm.
```
- suqeeze2 + transpose2 fuse onednn (#47592) · fa874a46
  Committed by Hui Zhang on Nov 07, 2022
```
* suqeeze2 transpose2 fuse onednn
* format
* fix output shape
* fix conflict
* format
* format
* remove useless
* remove log
* simply pass
* fix comment
* fix
* fix msg
* fix error msg
* format
```

- remove hardcoded -Wunused-variable compiler flags (#47706) · 45bc4542
  Committed by Wang Xin on Nov 07, 2022
- fix nlu compilation (#47707) · 75f34bb7
  Committed by Leo Chen on Nov 07, 2022
- support kldiv_loss/kldiv_loss_grad for kunlun (#47638) · 5f0a8adc
  Committed by QingshuChen on Nov 07, 2022
```
*test=kunlun
```
- Test FLAGS_enable_cudnn_frontend In CUDA117 CI (#47635) · 87753ee8
  Committed by tianshuo78520a on Nov 07, 2022
```
* test=cuda117

* test=cuda11

* test=document_fix;test=cuda117

* test=document_fix
```
- disable WITH_CUDNN_DSO (#47674) · c65f0565
  Committed by pangyoki on Nov 07, 2022

- add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun (#47368) · 5a4d2186
  Committed by ykkk2333 on Nov 07, 2022
```
* add stat tool
* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun
```

- refine python lib link (#47681) · eb102189
  Committed by wanghuancoder on Nov 07, 2022
```
* refine python lib link
```
- [Paddle inference] fix mixed precision (#47654) · 624ffdf2
  Committed by Yuanle Liu on Nov 07, 2022
- call InitDevices only once (#47678) · 0cbdcdda
  Committed by ronnywang on Nov 07, 2022
- Get three grad lists in CPP to avoid gpu idle time (#47665) · 01bfe786
  Committed by WangZhen on Nov 07, 2022
```
* Get three grad lists in CPP to avoid gpu idle time

* Support legacy mode
```
- [Restore PR] Remove hard code of PADDLE_WITH_CUDA (#47630) · 908a381d
  Committed by HongyuJia on Nov 07, 2022
```
* move cudnn hardcode outside GetExpectedKernelType
* add header file
* debug
* update interpreter_util with hardcode
* update interpreter_util headerfile
* solve activation hardcode
* debug with CI
* add mkldnn_op_list header file
* temporarily uncomment mkldnn
* temporarily uncomment mkldnn
* delete sequence_softmax cudnn hardcode
* add hardcode to data_transfer.cc
* update data_transfer headerfile
* try fix segment fault
* update cudnn&miopen_helper
* reset HasAttr of DygraphExctnCtx
* debug, this commit should pass all CI
* debug should pass CI, temporarily disable activation
* debug should pass CI
* fix default_attr=nullptr bug
* clean debug code
* Call SetDnnFallback function in the base class
* activation fallback to plain kernel
* fix default GetExpectedKernelType find wrong kernel
* search cudnn kernel instead of fallback
* fix cudnn_handle bug
* remove tanh use_cudnn
* restore tanh use_cudnn
* debug tanh
* fix tanh bug
* delete activation cudnn kernel
* polish code
```

- [cusotm device] add python inference api, test=develop (#46460) · 6074c50a
  Committed by Qi Li on Nov 07, 2022
- Refactor collective communication all_gather, all_reduce, broadcast & barrier C++ API (#47481) · e1a1c354
  Committed by Wen Sun on Nov 07, 2022
- [PHI] Migrate batch_norm (#47652) · 2337e609
  Committed by Sławomir Siwek on Nov 07, 2022
```
* init changes

* bnorm

* method signature

* change order

* bnorm

* removed unused args
```

PaddlePaddle / Paddle · last synced successfully about 1 year ago