Commits · 5c73a6eaa2be74eaed7e974b433a4c44f6da58b6 · 机器未来 / Paddle

10 Jan, 2022 · 4 commits

[Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea

Committed by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

* Rearranged function calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues

Add MaxUnPool3D op and MaxUnPool1D op (#38716) · 7e31542c

Committed by andyjpaddle on Jan 10, 2022

* add maxunpool3d op

* update doc for maxunpool3d op

* update doc for maxunpool3d op

* update doc for maxunpool3d op

* update sample code for maxunpool3d

* add maxunpool1d op

* update some code for maxunpool1d

remove fp32 tmp tensor and cast op for initializer.Normal and initializer.Constant (#38818) · 2238a535
Committed by Guoxia Wang on Jan 10, 2022

fix cuda seed bug of class_center_sample training on multi gpu (#38815) · 04f73d89
Committed by Guoxia Wang on Jan 10, 2022

07 Jan, 2022 · 4 commits

[PTen]Refactor flatten_grad kernel (#38712) · 5cf0bb79

Committed by YuanRisheng on Jan 07, 2022

* refactor flatten grad kernel

* fix bugs when running CI unit tests

* fix bugs when using the default GetExpectedPtenKernelArgs

* xshape sometimes has a null holder; fix this bug

modify mish op and add mish api (#38734) · 8c92337c

Committed by wangxinxin08 on Jan 07, 2022

* add mish operator and api

* remove redundant code and modify grad_atol of mish unittest

* modify mish code to be consistent with other activation implementation
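For readers unfamiliar with the activation this commit touches, here is a minimal scalar sketch of the standard Mish definition, `x * tanh(softplus(x))`, together with its analytic gradient. It is not the code from the commit; the function names `softplus`, `mish`, and `mish_grad` are illustrative, and any extra parameters the Paddle operator may accept are deliberately left out.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x: float) -> float:
    # Stable logistic function; this is the derivative of softplus.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mish(x: float) -> float:
    # Standard Mish definition: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def mish_grad(x: float) -> float:
    # d/dx [x * tanh(sp)] = tanh(sp) + x * (1 - tanh(sp)^2) * sigmoid(x)
    t = math.tanh(softplus(x))
    return t + x * (1.0 - t * t) * sigmoid(x)
```

A gradient like this is typically validated against a finite-difference estimate within some absolute tolerance, which is the kind of check a `grad_atol` setting in a unit test controls.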

Add multi tensor for adam (#38010) · fb3313e9

Committed by zhangbo9674 on Jan 07, 2022

* add multi tensor for adam

* add merged_adam op

* refine code

* refine adam compute logic

Add fp16 support for scale and bias parameter for fused_layernorm_residual_dropout op. (#38775) · 1b6e4664
Committed by Li Min on Jan 07, 2022
```
* Add fp16 support for scale/bias for fused_layernorm_residual_dropout_bias op.
```

06 Jan, 2022 · 7 commits
- [PTen]Move manipulation mid to new directory and rename flatten/reshape kernel (#38730) · 3d3bc681
  Committed by YuanRisheng on Jan 06, 2022
```
* move mid api and rename kernel

* use empty kernel
```
- fix expand_v2 and expand_as_v2 bug (#38677) · aec493c0
  Committed by Thomas Young on Jan 06, 2022
- Revert "Remove useless headers for some grad ops (#38732)" (#38743) · fc990d08
  Committed by limingshu on Jan 06, 2022
```
This reverts commit c0e2b98e.
```
- Remove useless headers for some grad ops (#38732) · c0e2b98e
  Committed by limingshu on Jan 06, 2022
```
* fix the wrong filename

* first commit
```
- 【PTen】Adjust the format of full kernel (#38596) · 0c02d2ed
  Committed by zyfncg on Jan 06, 2022
```
* adjust the full kernel

* remove creation.h

* use Empty to create tensor in full
```
- [Pten]Move GPU_implementation of elementwise kernel in new directory (#38696) · c1adced7
  Committed by YuanRisheng on Jan 06, 2022
```
* move gpu_impl of elementwise kernel

* change copyright to 2022
```
- Added exp FP32 FWD/BWD oneDNN kernel and optimized other oneDNN grad kernels (#38624) · 718183f1
  Committed by jakpiase on Jan 06, 2022
```
* added exp activation and use_dst_for_bwd kernels

* CI RERUN

* minor change
```
05 Jan, 2022 · 7 commits

optimize elementwise_mul_grad using new interfaces (#37728) · 36a102f8

Committed by Lijunhui on Jan 05, 2022

* init commit: new elem_mul_grad

* add template specialization for complex in multiply

* reply review comments

* correct dx and dy computation when T is complex

* reply review comments

* update to new ReduceFunctor

* mul-output broadcast

* call functions

* call functions with comments

* remove comments

update masked_select_op for kunlun (#38678) · 40078103
Committed by TTerror on Jan 05, 2022

add depthwise_conv2d op for mkldnn (#38484) · e1cc2236
Committed by wangxinxin08 on Jan 05, 2022

[pten]Move reduce code new (#38648) · 7a4a512d

Committed by chentianyu03 on Jan 05, 2022

* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* fix compile bugs

* move reduce files by new rule

* add set header

* format code style

* merge develop and fix conflict

* merge develop and fix conflict

Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

Fix for matmul_v2 oneDNN op broadcasting when inputs dims have different lengths (#38665) · 67923124
Committed by jakpiase on Jan 05, 2022
```
* fix for matmul_v2 broadcasting

* fix for output shape not broadcasted
```

add huber_loss for kunlun (#38589) · a268c7ce

Committed by TTerror on Jan 05, 2022

* add huber_loss for kunlun

* update xpu.cmake

* update unit tests

* update unit tests

* update elementwise_add

* update elementwise_add

* update elementwise_add

implementation of broadcast div backward by reduce (#38044) · 55cd9cb8

Committed by crystal on Jan 05, 2022

* add elementwise div

* move mul and div grad functor

* Combine multiple CUDA kernels

* Update the reduce interface call

* add multi-output

* add multi-output div

* add branch judge

* Package branch

* Combine the x and y functions into one
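The idea named in this commit's title, computing the backward pass of a broadcast division with a reduction, can be illustrated with a small pure-Python sketch. It assumes the simplest broadcast case: `x` is a rows × cols matrix and `y` is a length-cols vector broadcast across the rows. The function name `div_backward` and the nested-list representation are illustrative only and do not reflect the CUDA kernels in the commit.

```python
def div_backward(x, y, dout):
    """Gradients of out = x / y, where y (length cols) is broadcast over the rows of x.

    x and dout are rows x cols nested lists; y is a flat list of length cols.
    """
    rows, cols = len(x), len(x[0])

    # dx keeps the shape of x, so no reduction is needed: d(out)/dx = 1 / y.
    dx = [[dout[i][j] / y[j] for j in range(cols)] for i in range(rows)]

    # d(out)/dy = -x / y^2 elementwise. Because y was broadcast along the
    # row axis, its gradient is the sum (reduce) over that axis.
    dy = [
        sum(-dout[i][j] * x[i][j] / (y[j] * y[j]) for i in range(rows))
        for j in range(cols)
    ]
    return dx, dy
```

For example, with `x = [[1.0, 2.0], [3.0, 4.0]]`, `y = [2.0, 4.0]`, and an upstream gradient of all ones, `dy` sums the per-row contributions for each column, while `dx` is simply `1 / y` repeated per row.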

04 Jan, 2022 · 7 commits
- Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with... · 6eac06e3
  Committed by niuliling123 on Jan 04, 2022
```
Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast (#38500)
```
- [XPU] update XPU device info, test=develop (#37884) · e1187e50
  Committed by Qi Li on Jan 04, 2022
- Fix memcpyD2H sync behavior with other stream (#38647) · c0c54ba3
  Committed by Aurelius84 on Jan 04, 2022
```
* Fix memcpyD2H sync behavior with other stream

* add wait

* add wait

* add wait
```
- [Pten]Move CPU_implementation of elementwise kernel in new directory (#38651) · 7c020c71
  Committed by YuanRisheng on Jan 04, 2022
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* move cpu_impl of elementwise kernel to new directory
```
- [NPU] add pad and pad_grad (#38658) · 6e9714a2
  Committed by furnace on Jan 04, 2022
```
[NPU] add pad and pad_grad
```
- added sqrt bf16 fwd/bwd (#38599) · 2d2609ea
  Committed by jakpiase on Jan 04, 2022
- [PTen] Move inner empty and cast api to kernel.h (#38587) · 64538c8d
  Committed by Chen Weihang on Jan 04, 2022
```
* move inner cast api to cast_kernel.h

* resolve conflict
```
31 Dec, 2021 · 9 commits

[XPU]add split op for kunlun2,*test=kunlun (#38277) · 26b845e2

Committed by Zhangjingyu06 on Dec 31, 2021

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun,*test=kunlun

Co-authored-by: QingshuChen <chenqingshu@baidu.com>

[new API] add paddle.kthvalue and paddle.Tensor.kthvalue (#38386) · 538b5721

Committed by JYChen on Dec 31, 2021

* add new api/op kthvalue

* kthvalue cuda kernel to cub sorting

* fix example code error

* throw errors instead of LOG in cuda sort

* throw errors by Paddle_ENFORCE
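As a rough illustration of what a kthvalue operation computes, the sketch below returns the k-th smallest element of a one-dimensional sequence together with its index. It assumes k is 1-based and that ties are resolved by original position; the helper name `kthvalue_1d` is hypothetical, and the real operator works along a chosen axis of a tensor using sort-based kernels rather than this Python logic.

```python
def kthvalue_1d(values, k):
    """Return (value, index) of the k-th smallest element; k is 1-based."""
    if not 1 <= k <= len(values):
        raise ValueError(f"k must be in [1, {len(values)}], got {k}")
    # Sort the indices by their values; Python's sort is stable, so equal
    # values keep their original relative order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    idx = order[k - 1]
    return values[idx], idx
```

Raising an error for an out-of-range k, rather than logging and continuing, mirrors the spirit of the "throw errors instead of LOG" item in the commit body.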

[Pten]Fix bugs of compilation when use pten::add/subtract (#38631) · 31efec53
Committed by YuanRisheng on Dec 31, 2021
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* fix compile bugs
```

add new API paddle.linalg.lu/lu_unpack (#38617) · 2ce91c33
Committed by zhiboniu on Dec 31, 2021

Add fold operators (#38613) · 8898dce1

Committed by xiaoting on Dec 31, 2021

* add fold operators, test=develop

* add fold operators, test=develop

* add fold operators, test=develop

* update fold op error test, test=develop

* fix unittest, test=develop

* fix unittest, test=develop

Put_along_axis (based on PR #37921 by Xu Huang) (#38608) · f147fc99

Committed by Huihuang Zheng on Dec 31, 2021

Paddle new APIs: put_along_axis.

Xu Huang is on holiday so we created this PR to work on it. It is based on his PR: https://github.com/PaddlePaddle/Paddle/pull/37921

add lu_op backward (#38616) · a1275c8b
Committed by zhiboniu on Dec 31, 2021

[PTen] Unify data layout of pten and fluid (#38583) · 8d32cef8
Committed by Chen Weihang on Dec 31, 2021
```
* unify data layout

* fix test_transfer_layout error
```
[Pten]Move math to new directory and change 'math' to 'math_kernel' (#38604) · e76087ad
Committed by YuanRisheng on Dec 31, 2021
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs
```

30 Dec, 2021 · 2 commits
- add OP lu forward (#38559) · 4e21457d
  Committed by zhiboniu on Dec 30, 2021
```
LGTM
```
- add sigmoid_cross_entropy_with_logits to kl1 (#38586) · 790cadd1
  Committed by houj04 on Dec 30, 2021
```
* add sigmoid cross entropy with logits to kl1. test=kunlun

* add sigmoid cross entropy with logits to kl1. test=kunlun
```