Commits · db3239273ccba4b974c2e83b28a3fd40c0fa99e6 · PaddlePaddle / Paddle

01 Nov, 2022 1 commit

Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9

Committed by Chen Weihang on Oct 31, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

c923e6c9

27 Oct, 2022 1 commit

[JIT] Add Predictor for JITLayer (#47379) · b160d09e

Committed by Aurelius84 on Oct 27, 2022

* add predictor_engine

* add predictor_engine

* fix zero shape

* fix lodTensor

* fix unittest

* fix code style

* update CmakeList

b160d09e

26 Oct, 2022 2 commits

FC/matmul(v2) + scale fuse pass (#47127) · c1c2be2d

Committed by Sławomir Siwek on Oct 26, 2022

* fc/matmuls + scale fuse pass

* remove double-extension

* add unit tests

* comments from review

* codestyle

* add pass to int8 list

* new codestyle

* attr name typo

c1c2be2d

[MKLDNN] Delete mkldnn hard code of prior_box (#47068) · d78dd7ea

Committed by HongyuJia on Oct 26, 2022

* remove prior_box mkldnn hard code

* add header file

* simplify PD_VISIT_TYPE

* decouple dependency between prior_box and density_prior_box

* fix pragma omp parallel error

* bypass #pragma omp_parallel_for error

* polish code

* remove visit_type headerfile

* polish codestyle

* polish codestyle

* try fix CI error

* add testcase, datatype=float64

* reset test_prior_box testcase

* add datacheck to DenseTensor

* update template name

* call prior_box with macro expand

d78dd7ea

25 Oct, 2022 2 commits

[Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (Part2 add dnn_fallback flag) (#47200) · 6f5e7826

Committed by HongyuJia on Oct 25, 2022

* use dnn_fallback flag to delete mkldnn hardcode

* polish code style

* fix protected error

* fix const error

* fix reduce_op fallback

* fix pool_op fallback

* add Set function of dnn_fallback_

6f5e7826

opt conv_transpose cudnn (#47294) · afd5a96b

Committed by HongyuJia on Oct 25, 2022

afd5a96b

24 Oct, 2022 1 commit

[MKLDNN] Delete mkldnn hard code of mul (#47166) · aede713a

Committed by HongyuJia on Oct 24, 2022

* delete GetExpectedKernelType mkldnn of mul_grad

* update mkldnn_op_list, remove mul_grad

* delete GetExpectedKernelType mkldnn of mul

aede713a

21 Oct, 2022 1 commit

fix nvprof_nvtx_push interface bug (#47232) · 340009d6

Committed by Yuanle Liu on Oct 21, 2022

* fix nvprof_nvtx_push interface bug

340009d6

20 Oct, 2022 1 commit

[MKLDNN] Delete mkldnn hard code of fc (#47138) · 4dc4d5fc

Committed by HongyuJia on Oct 20, 2022

* remove fc mkldnn hardcode

* remove useless enum of kFCMKLDNN

* fix macro error

* update operators.cmake

4dc4d5fc

19 Oct, 2022 2 commits

add nvtxRangePush/Pop for naive_executor and refine some code (#47139) · de6e7431

Committed by Yuanle Liu on Oct 19, 2022

de6e7431

clean unused code: piece.cc/h (#47103) · e435d695

Committed by Leo Chen on Oct 19, 2022

* clean unused code: piece.cc/h

* clean usage

e435d695

18 Oct, 2022 1 commit

delete GetExpectedKernelType mkldnn of conv_op (#47044) · a9c20660

Committed by HongyuJia on Oct 18, 2022

a9c20660

17 Oct, 2022 1 commit

[PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398

Committed by YuanRisheng on Oct 17, 2022

* namespace modify

* update by comment

ec749398

15 Oct, 2022 1 commit

delete GetExpectedKernelType mkldnn of transpose2 (#46977) · 64b61fc4

Committed by HongyuJia on Oct 15, 2022

64b61fc4

13 Oct, 2022 2 commits

add thread name for dataloader (#46990) · 770501b8

Committed by Leo Chen on Oct 13, 2022

770501b8

[Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (#46606) · ef1c8759

Committed by HongyuJia on Oct 13, 2022

* remove PADDLE_WITH_MKLDNN, test white_list=abs

* fix unique_ptr

* fix op.Type()

* remove TODO in kernel_dispatch.h

* remove IndicateVarDataType function, update white_list

* remove mkldnn hard code

* add comments

* fix ==

* update mkldnn_op_list

* delete hard code of OPs

* update mkldnn_op_list

* update mkldnn_op_list, remove interp

* add error check for ExecutionContext

* update mkldnn_op_list, remove transpose2_grad

* remove interpolate mkldnn

* remove fill_constant mkldnn

* opt HasAttr in DygraphExecutionContext

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_white_list

* deprecated commit, test mkldnn_black_list

* update mkldnn_op_list, add assert error op

* solve cudnn related op

* fix error

* add mkldnn fallback in phi_utils.cc

* remove mkldnn fallback in phi_utils.cc

* opt code implementation

* polish Copyright License

ef1c8759

11 Oct, 2022 2 commits

Completes bfloat16 dtype for collective api in eager mode (#45844) · e4eb8d36

Committed by Wen Sun on Oct 11, 2022

e4eb8d36

Remove LoDTensor using in fluid (Part 1) (#46663) · 940d8f25

Committed by Chen Weihang on Oct 11, 2022

* remove using lodtensor part1

* polish history code format

940d8f25

10 Oct, 2022 1 commit

add function FindInputNameByVarName (#46759) · 8eaff62d

Committed by Sylwester Fraczek on Oct 10, 2022

* Add methods that find input or output name by var name

* kind of bugfix - initialize variables

* ci fix

* review fixed

8eaff62d

30 Sep, 2022 3 commits

[IPU] paddle-inference support custom-ops (#45235) · a6b4bee3

Committed by Allen Guo on Sep 30, 2022

* paddle-inference support custom-ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

a6b4bee3

fix bugs of tipc, test=kunlun (#46540) · d16360c8

Committed by ykkk2333 on Sep 30, 2022

* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun

* migrate add_n kernel to phi, test=kunlun

* fix bugs of tipc, test=kunlun

d16360c8

support pure bfloat16 for more ops (#46364) · b7b231a6

Committed by sneaxiy on Sep 30, 2022

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* add bfloat16 to selu_grad to pass CI

* fix selu grad compilation error

b7b231a6

29 Sep, 2022 1 commit

Add index_select, index_select_grad, reduce_min kernel and their unittests for... · 9a1855ff

Committed by Leo Guo on Sep 29, 2022

Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)

9a1855ff

28 Sep, 2022 3 commits

fix collective helper (#46582) · bd10211c

Committed by sneaxiy on Sep 28, 2022

bd10211c

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

Committed by Chen Weihang on Sep 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

e12a905e

[PHI] relu6_grad kernel (#46501) · cee2b12d

Committed by Sławomir Siwek on Sep 28, 2022

* Relu6

* remove fluid handler

* add individual kernel signature

* coding style

* replace bounded_relu with clip

* whitespace

* code style

cee2b12d

26 Sep, 2022 1 commit

[MLU] fluid: add mluop (#46429) · 3e1e482b

Committed by cifar10 on Sep 26, 2022

3e1e482b

25 Sep, 2022 1 commit

move some singleton to cc file (#46470) · e8b9ae20

Committed by sneaxiy on Sep 25, 2022

e8b9ae20

22 Sep, 2022 1 commit

[MLU] fix profiler compile failure (#46208) · 608181a9

Committed by Chenxiao Niu on Sep 22, 2022

608181a9

18 Sep, 2022 1 commit

Add INT8 support for fused_multi_transformer_op (#45284) · 3d7e2118

Committed by RichardWooSJTU on Sep 18, 2022

3d7e2118

16 Sep, 2022 5 commits

Support broadcast elementwise operators with int64 index type (#45741) · 20b5bf84

Committed by sneaxiy on Sep 16, 2022

* support int64 non-broadcast

* support broadcast case for int64 index

* fix bug

* support more Arity

* remove some codes

* upgrade patchelf to v0.15.0 to pass CI build

* fix bug

* fix patchelf installation

* add debug flags

* remove useless codes

* fix viterbi_decode and set_value op uts

* remove always enable int64

20b5bf84

optimize device synchronization in profiler (#46089) · 2a5bd7dc

Committed by chenjian on Sep 16, 2022

* avoid to synchronize all devices

* synchronize custom device

2a5bd7dc

Modify callstacklevel flag for c++ (#46058) · d072aaeb

Committed by JingZhuangzhuang on Sep 16, 2022

d072aaeb

add interpretercore for jit engine (#46092) · 22c3cdb4

Committed by Leo Chen on Sep 16, 2022

* add interpretercore for jit engine

* add ut

22c3cdb4

[CustomDevice] add new executor support (#46038) · 268f097e

Committed by ronnywang on Sep 16, 2022

* [CustomDevice] add custom_device_resource_pool & device_event_custom_device

* update

* update

* update

* update

268f097e

15 Sep, 2022 2 commits

updating mul and matmul with set_mem_desc (#45624) · 416e0de7

Committed by Jacek Czaja on Sep 15, 2022

* - mul & matmul changes

- fix

- bs16 correction of strides

* - cosmetic fixes

* - lint

* - fix

* - fix

* - format -> mem_desc

* - fix

* - fix

* - fix

* - fix

* - fix

416e0de7

[CodeStyle] trim trailing whitespace in .h, .cc, .cu, etc. (#46006) · 8dde7aea

Committed by Nyakku Shigure on Sep 15, 2022

8dde7aea
14 Sep, 2022 2 commits

Support inference compilation in training package (#46008) · cbe64cc1

Committed by JingZhuangzhuang on Sep 14, 2022

* merge python lib
* Update third_party.cmake
* Update CMakeLists.txt

cbe64cc1

delay tensorrt registry (#45824) · d7d35ff8

Committed by JingZhuangzhuang on Sep 14, 2022

* Delay TensorRT registry
* Add unused define
* Fix TensorRT test
* fix function to reference
* Update trt_plugin.h

d7d35ff8

09 Sep, 2022 1 commit

[new-exe] convert fused_all_reduce_op_handle to program (#45774) · e755c07e

Committed by Leo Chen on Sep 09, 2022

* add operator<< for BuildStrategy

* add fake_coalesce

* fit allreduce mode for new_exe

* remove debug code

* follow comments

e755c07e

PaddlePaddle / Paddle synced successfully more than 1 year ago
