Commits · 4e09b0893c259c72cfae79f1834201d5e061354b · BaiXuePrincess / Paddle

11 Nov, 2022 · 2 commits

[IPU]: add model_runtime backend support in IPU (#47363) · 21b901cb

Committed by czr-gc on Nov 11, 2022

* feat(ipu): add model_runtime backend support in IPU.

* fix(ipu_executor): fix error message format.

* fix(ipu_executor): fix format.

* fix(ipu_executor): fix format again.

* fix(ipu_executor): fix format again.

* fix(ipu_executor): fix format again.

21b901cb

bugfix in XPU legacy_dygraph distributed training: (#47838) · 9a6465ca
Committed by james on Nov 11, 2022
```
phi::Alloc() complains about missing device_allocator_
```
9a6465ca

10 Nov, 2022 · 1 commit

XPU multi-card support eager mode (#47445) · 3b91f8f3

Committed by james on Nov 10, 2022

* XPU support eager mode

* add unittest for XPU eager mode

* minor bugfix

* minor bugfix, test=kunlun

* correct copyright info

* 1. remove unused vars/funcs
2. ProcessGroupBKCL inherits from ProcessGroupStream

* bugfix for fp16 in eager mode multi-card, test=kunlun

* rebase & fix a few issues

* use new processgroup interface, test=kunlun

* fix compile issue, test=kunlun

3b91f8f3

09 Nov, 2022 · 1 commit

Final changes to introduce mem_desc to be held in Tensor (#46768) · 14f261ad

Committed by Jacek Czaja on Nov 09, 2022

* first commit

- more fixes

- compilation fix

- compilation fix

- fix

- another fix

- yet another fix

- Fix

- fix to fused ops

- compilation fix

- compilation fix

- another compilation fix

- another fix

- fix

- fix

- fix

- fix

- yet another fix

- fix

- fix

- cosmetic fix

- lint

- Revert some changes (to be brought back later)

- fix to build

- Added prototype of slice

- fix

- compilation fix

- compilation fix

- fix

- fix

- Fix

- fix

- fix
	modified:   cmake/flags.cmake

* lint

* rerun of CI

* - Fix

* - lint

* - lint2

14f261ad

08 Nov, 2022 · 2 commits
- add adadelta op for xpu, test=kunlun (#47661) · 047971f0
  Committed by zhangyikun02 on Nov 08, 2022
  
  047971f0
- argsort support n > 16384 and add argsort_grad op for xpu, test=kunlun (#47701) · 6a6a3ff1
  Committed by zhangyikun02 on Nov 08, 2022
  
  6a6a3ff1
07 Nov, 2022 · 5 commits

squeeze2 + transpose2 fuse onednn (#47592) · fa874a46

Committed by Hui Zhang on Nov 07, 2022

* squeeze2 transpose2 fuse onednn

* format

* fix output shape

* fix conflict

* format

* format

* remove useless

* remove log

* simply pass

* fix comment

* fix

* fix msg

* fix error msg

* format

fa874a46

support kldiv_loss/kldiv_loss_grad for kunlun (#47638) · 5f0a8adc
Committed by QingshuChen on Nov 07, 2022
```
*test=kunlun
```
5f0a8adc

add roll and roll_grad kernels and strided_slice and strided_slice_grad... · 5a4d2186

Committed by ykkk2333 on Nov 07, 2022

add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun (#47368)

* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

5a4d2186

call InitDevices only once (#47678) · 0cbdcdda
Committed by ronnywang on Nov 07, 2022

0cbdcdda

[Restore PR] Remove hard code of PADDLE_WITH_CUDA (#47630) · 908a381d

Committed by HongyuJia on Nov 07, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

* Call SetDnnFallback function in the base class

* activation fallback to plain kernel

* fix default GetExpectedKernelType find wrong kernel

* search cudnn kernel instead of fallback

* fix cudnn_handle bug

* remove tanh use_cudnn

* restore tanh use_cudnn

* debug tanh

* fix tanh bug

* delete activation cudnn kernel

* polish code

908a381d

05 Nov, 2022 · 1 commit
- Use a unified FLAGS_check_nan_inf_level to control the result of checking infinite. (#47672) · 54bc3b46
  Committed by Yiqun Liu on Nov 05, 2022
  
  54bc3b46
04 Nov, 2022 · 3 commits
- [XPU] add cumsum op. test=kunlun (#47585) · ac2a94c7
  Committed by houj04 on Nov 04, 2022
```
* [XPU] add cumsum op. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* debug. test=kunlun

* update xpu.cmake. remove unnecessary codes. test=kunlun.
```
  ac2a94c7
- fix deepfm and deep_wide bug, add embedding_sparse_grad kernel, test=kunlun (#47365) · f53e920d
  Committed by ykkk2333 on Nov 04, 2022
  
  f53e920d
- Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
  Committed by jakpiase on Nov 04, 2022
```
* tmp save

* minor change

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```
  9e006987
03 Nov, 2022 · 1 commit
- sparse attention kernel is used from 11.8 (#47594) · 7648f429
  Committed by zhouweiwei2014 on Nov 03, 2022
  
  7648f429
02 Nov, 2022 · 4 commits

Revert "[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325)" (#47582) · a57a19ea
Committed by HongyuJia on Nov 02, 2022
```
This reverts commit f9134045.
```
a57a19ea

Improve the tool for checking nan and inf, and support to compute the max, min... · ad39043f

Committed by Yiqun Liu on Nov 02, 2022

Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor. (#47095)

* Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor.

* Add a flag to control whether to abort when inf/nan is encountered, and polish code.

* Fix unittest.

* Change the computing of mean.

ad39043f

[XPU] add int64 support for slice and subtract. (#47409) · 77395619

Committed by houj04 on Nov 02, 2022

* [XPU] add int64 support for slice and subtract. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* remove unnecessary modification. test=kunlun

77395619

Add build option for CUDNN Frontend API (#47524) · eb100c7b

Committed by Tian Zheng on Nov 02, 2022

* Add build option for CUDNN Frontend API

* Fix review comments

* Change namespace for cudnn_frontend.h

eb100c7b

01 Nov, 2022 · 2 commits

[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325) · f9134045

Committed by HongyuJia on Nov 01, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

f9134045

Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9

Committed by Chen Weihang on Oct 31, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

c923e6c9

27 Oct, 2022 · 1 commit

[JIT] Add Predictor for JITLayer (#47379) · b160d09e

Committed by Aurelius84 on Oct 27, 2022

* add predictor_engine

* add predictor_engine

* fix zero shape

* fix lodTensor

* fix unittest

* fix code style

* update CmakeList

b160d09e

26 Oct, 2022 · 2 commits

FC/matmul(v2) + scale fuse pass (#47127) · c1c2be2d

Committed by Sławomir Siwek on Oct 26, 2022

* fc/matmuls + scale fuse pass

* remove double-extension

* add unit tests

* comments from review

* codestyle

* add pass to int8 list

* new codestyle

* attr name typo

c1c2be2d

[MKLDNN] Delete mkldnn hard code of prior_box (#47068) · d78dd7ea

Committed by HongyuJia on Oct 26, 2022

* remove prior_box mkldnn hard code

* add header file

* simplify PD_VISIT_TYPE

* decouple dependency between prior_box and density_prior_box

* fix pragma omp parallel error

* bypass #pragma omp_parallel_for error

* polish code

* remove visit_type headerfile

* polish codestyle

* polish codestyle

* try fix CI error

* add testcase, datatype=float64

* reset test_prior_box testcase

* add datacheck to DenseTensor

* update template name

* call prior_box with macro expand

d78dd7ea

25 Oct, 2022 · 2 commits
- [Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (Part2 add dnn_fallback flag) (#47200) · 6f5e7826
  Committed by HongyuJia on Oct 25, 2022
```
* use dnn_fallback flag to delete mkldnn hardcode

* polish code style

* fix protected error

* fix const error

* fix reduce_op fallback

* fix pool_op fallback

* add Set function of dnn_fallback_
```
  6f5e7826
- opt conv_transpose cudnn (#47294) · afd5a96b
  Committed by HongyuJia on Oct 25, 2022
  
  afd5a96b
24 Oct, 2022 · 1 commit

[MKLDNN] Delete mkldnn hard code of mul (#47166) · aede713a

Committed by HongyuJia on Oct 24, 2022

* delete GetExpectedKernelType mkldnn of mul_grad

* update mkldnn_op_list, remove mul_grad

* delete GetExpectedKernelType mkldnn of mul

aede713a

21 Oct, 2022 · 1 commit
- fix nvprof_nvtx_push interface bug (#47232) · 340009d6
  Committed by Yuanle Liu on Oct 21, 2022
```
* fix nvprof_nvtx_push interface bug
```
  340009d6
20 Oct, 2022 · 1 commit

[MKLDNN] Delete mkldnn hard code of fc (#47138) · 4dc4d5fc

Committed by HongyuJia on Oct 20, 2022

* remove fc mkldnn hardcode

* remove useless enum of kFCMKLDNN

* fix macro error

* update operators.cmake

4dc4d5fc

19 Oct, 2022 · 2 commits
- add nvtxRangePush/Pop for naive_executor and refine some code (#47139) · de6e7431
  Committed by Yuanle Liu on Oct 19, 2022
  
  de6e7431
- clean unused code: piece.cc/h (#47103) · e435d695
  Committed by Leo Chen on Oct 19, 2022
```
* clean unused code: piece.cc/h

* clean usage
```
  e435d695
18 Oct, 2022 · 1 commit
- delete GetExpectedKernelType mkldnn of conv_op (#47044) · a9c20660
  Committed by HongyuJia on Oct 18, 2022
  
  a9c20660
17 Oct, 2022 · 1 commit
- [PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
  Committed by YuanRisheng on Oct 17, 2022
```
* namespace modify

* update by comment
```
  ec749398
15 Oct, 2022 · 1 commit
- delete GetExpectedKernelType mkldnn of transpose2 (#46977) · 64b61fc4
  Committed by HongyuJia on Oct 15, 2022
  
  64b61fc4
13 Oct, 2022 · 2 commits

add thread name for dataloader (#46990) · 770501b8
Committed by Leo Chen on Oct 13, 2022

770501b8

[Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (#46606) · ef1c8759

Committed by HongyuJia on Oct 13, 2022

* remove PADDLE_WITH_MKLDNN, test white_list=abs

* fix unique_ptr

* fix op.Type()

* remove TODO in kernel_dispatch.h

* remove IndicateVarDataType function, update white_list

* remove mkldnn hard code

* add comments

* fix ==

* update mkldnn_op_list

* delete hard code of OPs

* update mkldnn_op_list

* update mkldnn_op_list, remove interp

* add error check for ExecutionContext

* update mkldnn_op_list, remove transpose2_grad

* remove interpolate mkldnn

* remove fill_constant mkldnn

* opt HasAttr in DygraphExecutionContext

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_black_list

* update mkldnn_op_list, add assert error op

* solve cudnn related op

* fix error

* add mkldnn fallback in phi_utils.cc

* remove mkldnn fallback in phi_utils.cc

* opt code implementation

* polish Copyright License

ef1c8759

11 Oct, 2022 · 2 commits
- Completes bfloat16 dtype for collective api in eager mode (#45844) · e4eb8d36
  Committed by Wen Sun on Oct 11, 2022
  
  e4eb8d36
- Remove LoDTensor using in fluid (Part 1) (#46663) · 940d8f25
  Committed by Chen Weihang on Oct 11, 2022
```
* remove using lodtensor part1

* polish history code format
```
  940d8f25
10 Oct, 2022 · 1 commit

add function FindInputNameByVarName (#46759) · 8eaff62d

Committed by Sylwester Fraczek on Oct 10, 2022

* Add methods that find input or output name by var name

* kind of bugfix - initialize variables

* ci fix

* review fixed

8eaff62d

BaiXuePrincess / Paddle · in sync with the fork's source project
