Commits · b7b231a668ac51365cdce11dfafe6f7da04b2350 · PaddlePaddle / Paddle

30 Sep, 2022 · 1 commit

support pure bfloat16 for more ops (#46364) · b7b231a6

Committed by sneaxiy on Sep 30, 2022

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* add bfloat16 to selu_grad to pass CI

* fix selu grad compilation error

b7b231a6

29 Sep, 2022 · 9 commits

Optimize softmax's performance when dim_size >= 100000. (#46535) · 9012787f

Committed by carryyu on Sep 29, 2022

9012787f

Move valid check from python to kernel (#46412) · 37bc2d7b

Committed by Zhang Zheng on Sep 29, 2022

* Move valid check from python to kernel

* fix error throw

* fix

* invalid label check

* fix

* Revert "fix"

This reverts commit 79fad6799cfa4b30423dbc84e67d7d843d22b84a.

* Revert "invalid label check"

This reverts commit 402a9707390ad5386b3222e85844b92d2e9b9fa4.

* Revert "fix"

This reverts commit 09ba3080ee0587447f875c19cdf060485f15ae3b.

* Revert "fix error throw"

This reverts commit a901bfcc2179d5c120ec29af766f392b122dab52.

* Revert "Move valid check from python to kernel"

This reverts commit baa03cc4ef82d8d45516c30dfb52bf5aead30748.

* final fix

* fix

* fix

37bc2d7b

Add index_select, index_select_grad, reduce_min kernel and their unittests for... · 9a1855ff

Committed by Leo Guo on Sep 29, 2022

Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)

9a1855ff

fix P40 topk: Make the optimized topk compatible with P40. (#46547) · 667082c0

Committed by carryyu on Sep 29, 2022

* fix P40 topk: Make the optimized topk compatible with P40.

* fix P40 topk: Make the optimized topk compatible with P40.

* fix P40 topk: Make the optimized topk compatible with P40.

667082c0

add register for strided_slice_grad (#46549) · 40ab6faf

Committed by ming1753 on Sep 29, 2022

40ab6faf

fix uniform_rand_kernel FP16 support in dygraph mode (#46212) · ccab0e2a

Committed by 傅剑寒 on Sep 29, 2022

ccab0e2a

[OptLayoutSelect] Select the highest priority layout (#46598) · 596d8209

Committed by HongyuJia on Sep 29, 2022

* select highest priority layout

* opt performance, save virtual table find

596d8209

[Fix KernelKeyParser] Unify the logic of `operator()` in `KernelKeyParser` (#46560) · 4140d7ec

Committed by HongyuJia on Sep 29, 2022

* add datatype check for ParseKernelKeyByInputArgs

* polish error message

* Actually, einsum has vector<Tensor> inpute with DataType::COMPLEX64, see test_einsum_v2.py

* headerfile remove enforce.h

4140d7ec

[XPU] update xpu cmake to 0928. (#46437) · 58a478f8

Committed by houj04 on Sep 29, 2022

* [XPU] update xpu cmake to 0923. test=kunlun

* [XPU] update xpu cmake to 0928. test=kunlun

58a478f8

28 Sep, 2022 · 10 commits

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

Committed by Chen Weihang on Sep 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

e12a905e

rename filenames from pten to phi (#46579) · 3f8585a9

Committed by HongyuJia on Sep 28, 2022

3f8585a9

[phi Backend] Change BackendSet from uint64_t to uint32_t (#46532) · f6f8c935

Committed by HongyuJia on Sep 28, 2022

* change BackendSet from 64bits to 32bits

* fix _MSC_VER error, __lzcnt32->__lzcnt

* fix __GNUC__ error, __builtin_clzl->__builtin_clz

f6f8c935
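
The bullets in this commit swap count-leading-zeros intrinsics to match the narrower bitset; GCC's `__builtin_clz` takes an `unsigned int`, whereas `__builtin_clzl` takes an `unsigned long`. As a rough illustration of why a backend bitset would rely on such an operation, the sketch below picks the highest-priority backend from a 32-bit word by locating its highest set bit. It is not Paddle's code: the bit positions, the `BACKEND_BITS` table, and all function names are hypothetical, and the priority order is simply borrowed from the commit title "adjust backend priority, GPUDNN>GPU>ONEDNN>CPU".

```python
# Illustrative sketch only: names, bit positions, and helpers below are
# hypothetical and are not taken from the Paddle source tree.

BITS = 32  # bitset width, mirroring the uint32_t in the commit title

# Hypothetical bit positions, ordered so that a higher bit means a higher
# priority (GPUDNN > GPU > ONEDNN > CPU, as listed in commit 7467221b).
BACKEND_BITS = {"CPU": 0, "ONEDNN": 1, "GPU": 2, "GPUDNN": 3}
BIT_TO_BACKEND = {bit: name for name, bit in BACKEND_BITS.items()}


def count_leading_zeros(word: int, bits: int = BITS) -> int:
    """Count the zero bits above the highest set bit of a `bits`-wide word."""
    if not 0 <= word < (1 << bits):
        raise ValueError(f"word does not fit in {bits} bits")
    # int.bit_length() is the index of the highest set bit plus one, so the
    # bits above it are all zero. For word == 0 this returns `bits`; note
    # that the C intrinsic __builtin_clz is undefined for a zero argument.
    return bits - word.bit_length()


def make_backend_set(*names: str) -> int:
    """Build a bitset with one bit set per named backend."""
    word = 0
    for name in names:
        word |= 1 << BACKEND_BITS[name]
    return word


def highest_priority_backend(backend_set: int) -> str:
    """Return the backend whose bit is the highest one set in the bitset."""
    if backend_set == 0:
        raise ValueError("empty backend set")
    top_bit = BITS - 1 - count_leading_zeros(backend_set)
    return BIT_TO_BACKEND[top_bit]
```

Under these assumptions, `highest_priority_backend(make_backend_set("CPU", "GPU"))` evaluates to `"GPU"`. Narrowing the word from 64 to 32 bits only changes `BITS` here, which is the same reason the C++ side needs the intrinsic that matches the integer width.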

first commit (#46525) · 806b252c

Committed by limingshu on Sep 28, 2022

806b252c

[PHI] relu6_grad kernel (#46501) · cee2b12d

Committed by Sławomir Siwek on Sep 28, 2022

* Relu6

* remove fluid handler

* add individual kernel signature

* coding style

* replace bounded_relu with clip

* whitespace

* code style

cee2b12d

add decode_jpeg yaml (#46562) · c7da8602

Committed by YuanRisheng on Sep 28, 2022

c7da8602

[BugFix]Fix concat bugs when call onednn kernel (#46518) · 0ee6dfbe

Committed by YuanRisheng on Sep 28, 2022

* fix concat bug

* fix ci bugs

* fix ci bugs

0ee6dfbe

[NPU] add gpu kernel for transfer layout (#46307) · 526d963e

Committed by kangguangli on Sep 28, 2022

* add gpu kernel for transfer layout

* comment error throw

* fix: flag setting in testcase; add condition check for raising error

* fix typo

* fix: add error type for PADDLE_THROW

* remove kernel fallback in data_transfer.cc

* remove useless variable definition

526d963e

[PHI] phi support xpu black list (#46527) · 84f7835d

Committed by wanghuancoder on Sep 28, 2022

* phi support xpu black list

84f7835d

Fix clip_extra logic in remove_training_info (#46534) · 7e2e2ee7

Committed by zyfncg on Sep 28, 2022

* fix clip_extra code in remove_training_info

* revert rnn opmaker clear

7e2e2ee7

27 Sep, 2022 · 6 commits
- adjust backend priority, GPUDNN>GPU>ONEDNN>CPU · 7467221b
  Committed by jiahongyu on Sep 27, 2022

  7467221b
- polish typo, emum->enum, defalutly->defaultly · c82d1020
  Committed by jiahongyu on Sep 27, 2022

  c82d1020
- Delete int kernel type in Scatter Kernel.test=kunlun (#46030) · 403cd2b5
  Committed by Leo Guo on Sep 27, 2022

  403cd2b5
- Fix syntax errors of args name in int_array.h (#46521) · 38e82868
  Committed by zyfncg on Sep 27, 2022

  38e82868
- Add README.md for phi (#46506) · a7aefaea
  Committed by Chen Weihang on Sep 27, 2022

  * add readme for phi

  * polish details, test=document_fix

  a7aefaea
- [Sparse] Support static graph (#46245) · a02eb143
  Committed by zhangkaihuo on Sep 27, 2022

  a02eb143

26 Sep, 2022 · 5 commits
- fix shard_index kernel (#46491) · 808bf2b4
  Committed by zhaoyingli on Sep 26, 2022

  808bf2b4
- [Fix] Remove std::trunc() in FloorDivideFunctor and InverseFloorDivideFunctor (#45051) · 091ae705
  Committed by Lin Manhui on Sep 26, 2022

  091ae705
- Enable eager mode on xpu (#46227) · 87a25fbd
  Committed by Chen Weihang on Sep 26, 2022

  * enable eager mode on xpu, test=kunlun

  * add numpy support to xpu

  * fix tensor using error

  * fix  error, test=kunlun

  * fix failed tests, test=kunlun

  87a25fbd
- clear extra atts of sequence_softmax in opmaker (#46457) · 159f10e3
  Committed by zyfncg on Sep 26, 2022

  159f10e3
- clear extra attrs of distribute op in opmaker (#46451) · 4f847433
  Committed by zyfncg on Sep 26, 2022

  4f847433

23 Sep, 2022 · 6 commits
- [Phi] support bincount yaml and _C_ops.bincount under eager (#46443) · 991ec7d3
  Committed by Weilong Wu on Sep 23, 2022

  991ec7d3
- Optimize performance of depthwise_conv_fwd (#46287) · 330b1a0a
  Committed by Zhang Zheng on Sep 23, 2022

  * Optimize performance of depthwise_conv_fwd

  * fix

  330b1a0a
- add phi reduce_sum test=kunlun (#46241) · 22fe4f03
  Committed by dongfangshenzhu on Sep 23, 2022

  * add phi reduce_sum test=kunlun

  * add fhi reduce_sum test=kunlun

  * add fhi reduce_sum test=kunlun

  22fe4f03
- move selected_rows_functor (#46373) · b6c6f4f9
  Committed by YuanRisheng on Sep 23, 2022

  b6c6f4f9
- Addition of bf16 type support for Compare OP (#46413) · 1a7d907d
  Committed by limingshu on Sep 23, 2022

  * first commit

  * clarify the quotes

  * change code style format

  * support bfloat16

  1a7d907d
- clear extra attrs of quantize op in opmaker (#46418) · 62c05369
  Committed by zyfncg on Sep 23, 2022

  62c05369

22 Sep, 2022 · 3 commits

[PHI] Sum op migration (#46239) · 3448afc1

Committed by Paulina Gacek on Sep 22, 2022

* Sum kernel migrated to phi

* Static cast added, file name changed

* OneDNNGetDataType to uppercase

* refactoring

* AddOneDNNHandler changed to SumOneDNNHandler

3448afc1

Clear extra attrs of lookup_table_v2 in OpMaker (#46321) · ffc697ff

Committed by zyfncg on Sep 22, 2022

* clear extra attrs of look_up_table_v2 in opmaker

* fix bug

ffc697ff

[PHI] Migrate sgd and stack oneDNN kernels (#46374) · 4ae37aee

Committed by Piotr Paturej on Sep 22, 2022

* Convert slice+grad oneDNN fluid kernels to PHI

* Change mutable_data to Alloc

* Refactor licences

4ae37aee

PaddlePaddle / Paddle · synced successfully about 1 year ago
