Commits · abb3813683f44f9bd96f57e87d8ddae8e4d44a82 · PaddlePaddle / Paddle

17 Oct, 2022 · 2 commits

[Hackathon 3rd No.22 ] add paddle.incubate.sparse.reshape (#46694) · abb38136

Committed by OccupyMars2025 on Oct 17, 2022

* add sparse reshape

* change the dtype in all test cases to int64

* just one test case

* modify comments

* Update test_sparse_reshape_op.py

* change the type of "shape" from vector<int64_t> to IntArray

* check whether sp_out.to_dense() is the cause of error

* print sp_out

* Update reshape_kernel.cc

* use numpy to generate the equal paddle tensor

* just check dense_tensor.numpy()

* check cpu and cuda versions

* Update test_sparse_reshape_op.py

* supply all test cases for cpu forward coo kernel

* test forward coo cuda kernel

* change configuration of cuda kernel

* keep only one test case

* test coo cpu kernel (forward and backward)

* row major or column major ???

* test cuda coo forward kernel

* complete declaration and registration

* Update __init__.py

* rebuild

* retrigger CI

* add cudaMalloc and cudaMemcpy in ReshapeCooKernel and change back to row major order in a cuda dense tensor

* modify minor error

* test only cpu coo forward kernel

* add all test cases for coo forward kernel  (both cpu and gpu)

* test all forward kernels (coo, csr; cpu, gpu)

* add all test cases for all kinds of kernels

* just retrigger CI

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* Update sparse_ops.yaml

* resolve conflicts

* Update sparse_ops.yaml

* don't specify tensor place

* new shape has -1 or 0 in it

* Update unary_grad_kernel.h

* correct lvalue error

* code style

* Update sparse_backward.yaml

* Update sparse_ops.yaml

* Update unary_kernel.h

* Update unary.py

* Update sparse_backward.yaml

* Update unary.py

* code style

* code style

* code style

* Update unary.py

* specify tensor place explicitly

* do not use numpy array

* use numpy array in unit test again

* modify example code in docstring

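The commit notes above mention two details of a sparse reshape: row-major ordering and a target shape containing `-1` or `0`. The following is a minimal pure-Python sketch of that idea, not the Paddle kernel itself. It assumes the conventions of the dense `paddle.reshape` (`0` copies the input dimension at the same position, `-1` is inferred), and the helper names `resolve_shape` and `reshape_coo` are hypothetical.

```python
from math import prod


def resolve_shape(old_shape, new_shape):
    """Resolve 0 (copy the input dimension) and -1 (infer) in new_shape.

    Assumption: same conventions as the dense reshape; illustrative only.
    """
    resolved = []
    for i, d in enumerate(new_shape):
        if d == 0:
            if i >= len(old_shape):
                raise ValueError("0 refers to a dimension the input does not have")
            d = old_shape[i]
        resolved.append(d)
    if resolved.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    total = prod(old_shape)
    if -1 in resolved:
        known = prod(d for d in resolved if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        resolved[resolved.index(-1)] = total // known
    if prod(resolved) != total:
        raise ValueError("element count changes under reshape")
    return resolved


def reshape_coo(indices, old_shape, new_shape):
    """Map COO index tuples from old_shape to new_shape in row-major order."""
    target = resolve_shape(old_shape, new_shape)
    out = []
    for idx in indices:
        # Flatten the multi-index to a row-major linear offset.
        flat = 0
        for i, dim in zip(idx, old_shape):
            flat = flat * dim + i
        # Unflatten the offset against the target shape, last axis first.
        new_idx = []
        for dim in reversed(target):
            flat, r = divmod(flat, dim)
            new_idx.append(r)
        out.append(tuple(reversed(new_idx)))
    return target, out
```

The stored values are untouched; only the coordinates are rewritten. Because a row-major reshape preserves linear offsets, coordinates that were sorted before the reshape stay sorted afterwards.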

Fix the bug of PHI kernel of reduce_sum in kunlun when using eager mode. (#47004) · f9c1cdc1
Committed by Leo Guo on Oct 17, 2022
```
test=kunlun
```

14 Oct, 2022 · 2 commits
- speed_up for deformable conv (#46997) · eee6b3a7
  Committed by Rayman on Oct 14, 2022
- TRT pool2d adaptive mode bugfix (#46802) · eb32746a
  Committed by Wang Bojun on Oct 14, 2022
```
* draft with debug print
```
13 Oct, 2022 · 5 commits
- logsumexp support fp16 (#45817) · 910e1b6a
  Committed by xiaohemaikoo on Oct 13, 2022
- [Zero-Dim] support 0D for paddle.transpose/reshape/stack/tile/unsqueeze (#46555) · 78add057
  Committed by zhouweiwei2014 on Oct 13, 2022
- fix softmax memory align (#46902) · 71748805
  Committed by carryyu on Oct 13, 2022
- Revert #46111 (#46961) · cf9ca61d
  Committed by Zhang Ting on Oct 13, 2022
```
* Revert "[Hackathon No.56&38] Add float16 data type support to the deformable_conv_v1 operator and speed up the forward pass (#46111)"
```
- Correct the logic and remove unnecessary template param (#46623) · 450af30c
  Committed by Zhang Zheng on Oct 13, 2022
```
* Correct the logic and remove unnecessary template param

* fix error throw

* fix print format

* fix ci
```
12 Oct, 2022 · 5 commits
- Revert "remove comment (#46827)" (#46935) · 2ea3700a
  Committed by Zhang Ting on Oct 12, 2022
```
This reverts commit 8a5f17e8.
```
- deliver indices_dict (#46919) · 4681f13b
  Committed by zhangkaihuo on Oct 12, 2022
- Fix some operators when the tensor.numel() > INT32_MAX (#46767) · e896567e
  Committed by sneaxiy on Oct 12, 2022
```
* fix some ops for int64 range

* update error message
```
- [Zero-Dim] support input 0D Tensor for some unary api (#45992) · 05c2b9ba
  Committed by zhouweiwei2014 on Oct 12, 2022
```
* [Zero-Dim] support input 0D Tensor for unary api

* fix CI
```
- [Sparse] Rename and fix doc (#46853) · a9cc5482
  Committed by zhangkaihuo on Oct 12, 2022
11 Oct, 2022 · 2 commits
- set_value_op: add support for complex types (#46884) · 34c7e3e3
  Committed by Feiyu Chan on Oct 11, 2022
- Remove LoDTensor using in fluid (Part 1) (#46663) · 940d8f25
  Committed by Chen Weihang on Oct 11, 2022
```
* remove using lodtensor part1

* polish history code format
```
10 Oct, 2022 · 4 commits
- remove comment (#46827) · 8a5f17e8
  Committed by Rayman on Oct 10, 2022
- [PHI] transpose2_grad op migration (#46139) · e3407a80
  Committed by Paulina Gacek on Oct 10, 2022
```
* op migrated, Copy(OneDNNContext, ...) added

* mutable_data & op registration in fluid removed

* refactoring

* OneDNNGetDataType to uppercase

* missing cpu check added, handler moved to .h file

* name changed to transpose_grad

* Copy changed back to TensorCopy

* Resizing corrected, Copy(OneDNNContext) removed
```
- [Hackathon No.36] Optimize the computational performance of the lerp_grad op on GPU (#45946) · ef61df30
  Committed by Rayman on Oct 10, 2022
- [Hackathon No.56&38] Add float16 data type support to the deformable_conv_v1 operator and speed up the forward pass (#46111) · 5e0614a1
  Committed by Rayman on Oct 10, 2022
```
support fp16 for deformable conv
```
09 Oct, 2022 · 4 commits
- add sync_batch_norm_kernel (#46430) · 5cd6a707
  Committed by zhangkaihuo on Oct 09, 2022
- [Sparse] Add a batch_norm kernel (#46359) · 888223b7
  Committed by zhangkaihuo on Oct 09, 2022
- add seed check (#46747) · 97ec57fe
  Committed by Sławomir Siwek on Oct 09, 2022
- Enable hard_swish_grad unit test (#46621) · ff0171e4
  Committed by Sławomir Siwek on Oct 09, 2022
```
* enable hard_swish_grad unit test

* remove unused argument
```
03 Oct, 2022 · 1 commit
- Requantize to use Memory Desc in Tensors (#46608) · a579e523
  Committed by Jacek Czaja on Oct 03, 2022
```
* - some more MD changes

* - lint

* - compilation fixes

* - compilation fixes

* - lint

* - fix
```
30 Sep, 2022 · 9 commits
- Optimize performance of depthwise_conv_bwd of filter (#46490) · 04eb211a
  Committed by Zhang Zheng on Sep 30, 2022
```
* Optimize performance of depthwise_conv_bwd of filter

* op-benchmark

* fix

* op benchmark

* merge bwd
```
- Optimize performance of depthwise_conv_bwd (#46362) · f17a73e9
  Committed by Zhang Zheng on Sep 30, 2022
```
* Optimize performance of depthwise_conv_bwd

* fix
```
- fix bugs of tipc, test=kunlun (#46540) · d16360c8
  Committed by ykkk2333 on Sep 30, 2022
```
* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun

* migrate add_n kernel to phi, test=kunlun

* fix bugs of tipc, test=kunlun
```
- change mkldnn kernel layout, ALL_LAYOUT->ONEDNN (#46626) · 22e81907
  Committed by HongyuJia on Sep 30, 2022
- change mkldnn kernel layout, ALL_LAYOUT->ONEDNN (#46628) · 4744cbc7
  Committed by HongyuJia on Sep 30, 2022
- [Hackathon No.21] Add the paddle.incubate.sparse.transpose sparse API to Paddle (#45849) · 2b879a69
  Committed by 六个骨头 on Sep 30, 2022
- change mkldnn kernel layout, ALL_LAYOUT->ONEDNN (#46627) · 4b9dae01
  Committed by HongyuJia on Sep 30, 2022
- change mkldnn kernel layout, ALL_LAYOUT->ONEDNN (#46629) · abee2210
  Committed by HongyuJia on Sep 30, 2022
- support pure bfloat16 for more ops (#46364) · b7b231a6
  Committed by sneaxiy on Sep 30, 2022
```
* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* add bfloat16 to selu_grad to pass CI

* fix selu grad compilation error
```
29 Sep, 2022 · 6 commits

Optimize softmax's performance when dim_size >= 100000. (#46535) · 9012787f
Committed by carryyu on Sep 29, 2022

Move valid check from python to kernel (#46412) · 37bc2d7b

Committed by Zhang Zheng on Sep 29, 2022

* Move valid check from python to kernel

* fix error throw

* fix

* invalid label check

* fix

* Revert "fix"

This reverts commit 79fad6799cfa4b30423dbc84e67d7d843d22b84a.

* Revert "invalid label check"

This reverts commit 402a9707390ad5386b3222e85844b92d2e9b9fa4.

* Revert "fix"

This reverts commit 09ba3080ee0587447f875c19cdf060485f15ae3b.

* Revert "fix error throw"

This reverts commit a901bfcc2179d5c120ec29af766f392b122dab52.

* Revert "Move valid check from python to kernel"

This reverts commit baa03cc4ef82d8d45516c30dfb52bf5aead30748.

* final fix

* fix

* fix


Add index_select, index_select_grad, reduce_min kernel and their unittests for... · 9a1855ff

Committed by Leo Guo on Sep 29, 2022

Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)

fix P40 topk: Make the optimized topk compatible with P40. (#46547) · 667082c0

Committed by carryyu on Sep 29, 2022

* fix P40 topk: Make the optimized topk compatible with P40.

* fix P40 topk: Make the optimized topk compatible with P40.

* fix P40 topk: Make the optimized topk compatible with P40.


add register for strided_slice_grad (#46549) · 40ab6faf
Committed by ming1753 on Sep 29, 2022

fix uniform_rand_kernel FP16 support in dygraph mode (#46212) · ccab0e2a
Committed by 傅剑寒 on Sep 29, 2022

PaddlePaddle / Paddle synced successfully about 1 year ago
