Commits · 79ac8870ec71c99188f6d487ba74922cf90468a5 · Crayon鑫 / Paddle

23 Apr, 2022 1 commit

[Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT (#42138) · 79ac8870

Committed by Aurelius84 on Apr 23, 2022

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT

79ac8870

22 Apr, 2022 5 commits

Add gpudnn yaml config for some OPs (#41773) · 4940a525

Committed by Ruibiao Chen on Apr 22, 2022

* Add gpudnn yaml config for some OPs

* Add grad gpudnn config

* Fix CI errors

* Fix CI errors

* Fix CI errors

* Fix conflicts

4940a525

[WIP] Algorithm Cache of cuBlasLt Epilogue (#41010) · 19650d72

Committed by Ming-Xu Huang on Apr 22, 2022

* Fix leading dimension setting error in fused_gemm_epilogue_grad_op.

* Add dyload to cuBlasLt functions.

* Added cublasLtMatmulAlgoGetHeuristic to improve performance.

* Added FLAGS_cublaslt_exhaustive_search_times to cublasLt epilogue

* Added UTs to FLAGS_cublaslt_exhaustive_search_times

* Added warmup runs in algo searching of Gemm epilogue.

* Update copyright and documents.

* Fixed error handling.

19650d72

Add Sparse BatchNorm and fix two bugs (#42013) · 8a6456db

Committed by zhangkaihuo on Apr 22, 2022

8a6456db

Dygraph performance optimization (v2) (#42103) · c79d1186

Committed by zyfncg on Apr 22, 2022

* optimize performance of PreparePhiData

* dygraph performance optimization

c79d1186

[Eager] fix memory issue for eager (#42086) · 23d1b3e8

Committed by Jiabin Yang on Apr 22, 2022

* fix memory issue for eager

* fix bug

23d1b3e8

21 Apr, 2022 2 commits

[CustomDevice] fix macro (#42073) · ec995c59

Committed by Aganlengzi on Apr 21, 2022

* [CustomDevice] fix macro

* fix

ec995c59

Support FP16 argmax/argmin kernel (#42038) · 7003dcaa

Committed by sneaxiy on Apr 21, 2022

* support int16 argmax kernel

* add fp16 test

7003dcaa

20 Apr, 2022 1 commit

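[PaddlePaddle Hackathon 2] No. 9: Add logspace API to Paddle (#41261) · a3c50c42

Committed by BrilliantYuKaimin on Apr 20, 2022

* Add operator description for logspace

* Add shape inference for logspace

* Add kernel implementation for logspace

* Add logspace interface in Python

* Add unit tests for logspace

* Add logspace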

* Update logspace_kernel.cu

* Update logspace_op.cc

* Adjust code formatting

* Update doc of logspace

* Update tensor.py

* Update logspace_op.cc

* Update logspace_kernel.cc

* Update logspace_kernel.cu

* Update test_logspace.py

* Adjust the position of logspace

* Adjust code formatting

a3c50c42
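
For readers unfamiliar with the operation this commit adds, here is a minimal pure-Python sketch of what a logspace computes, assuming the conventional definition: points evenly spaced in the exponent between `start` and `stop`, each raised to `base`. The helper name `logspace_sketch` and its defaults are illustrative assumptions, not code from the PR; the actual implementation lives in the `logspace_op.cc`, `logspace_kernel.cc` and `logspace_kernel.cu` files listed above.

```python
def logspace_sketch(start, stop, num, base=10.0):
    """Return `num` values base**e, with e evenly spaced from start to stop.

    Hypothetical helper for illustration only; it is not the Paddle kernel.
    """
    if num < 0:
        raise ValueError("num must be non-negative")
    if num == 0:
        return []
    if num == 1:
        # Assumption: a single point is taken at the start exponent.
        return [float(base) ** start]
    step = (stop - start) / (num - 1)  # spacing between consecutive exponents
    return [float(base) ** (start + i * step) for i in range(num)]


if __name__ == "__main__":
    print(logspace_sketch(0, 3, 4))          # powers of ten
    print(logspace_sketch(0, 4, 5, base=2))  # powers of two
```

Under these assumptions, `logspace_sketch(0, 3, 4)` yields the powers of ten from 1 to 1000.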

19 Apr, 2022 8 commits

polish tensor api details (#41971) · e5c61b15

Committed by Chen Weihang on Apr 19, 2022

e5c61b15

OneDNN md-in-tensor refactoring part 1: Added main changes for md-in-tensor (#41303) · c9f4fcf3

Committed by jakpiase on Apr 19, 2022

* changes for md in tensor

* ci fix

* Temporarily limited dims for test

* ci fix

* removed unnecessary includes

* added reviewers suggestions

* checked out two files to avoid changing more than 19 files in a single PR

* minor fix

* reverted one file to reduce files changed to 19

c9f4fcf3

fix pad3d infer shape (#41753) · 8f77f8bc

Committed by littletomatodonkey on Apr 19, 2022

* fix pad3d infer shape

8f77f8bc

[MLU] support add callback to stream (#41831) · 03533b0c

Committed by fwenguang on Apr 19, 2022

03533b0c

[Phi]Separate AddKernel/DivideKernel/SubtractKernel/MultiplyKernel from ElementwiseKernel (Part1) (#41806) · 2cb19d8f

Committed by YuanRisheng on Apr 19, 2022

* separate add/div/sub/mul from elementwise

* delete code

* fix compile bugs

* deal with conflict

* fix bugs when compile

* fix windows unit test bug

* fix ci coverage bugs

2cb19d8f

[Phi]Fix expand_sig infershape BUG under static graph mode (#41936) · 59e8382d

Committed by Aurelius84 on Apr 19, 2022

* [Phi]Fix expand_sig infershape BUG under static graph mode

* [Phi]Fix expand_sig infershape BUG under static graph mode

* [Phi]Fix unittest

* [Phi]Fix unittest

59e8382d

[Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml (#41920) · 7ce0ee69

Committed by Aurelius84 on Apr 19, 2022

* [Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml

* add unittest for full_like

* fix unittest

7ce0ee69

[Eager] Fix numpy interface for constructing empty tensor (#41904) · 2ed01960

Committed by Weilong Wu on Apr 19, 2022

* [Eager] Fix numpy interface for constructing empty tensor

* Fix CI, construct empty tensor

* Modify empty tensor's shape from [] to [0]

* Add more test for constructing empty tensor

2ed01960

18 Apr, 2022 4 commits

Create Tensor by paddle::empty in custom operator (#41840) · bc1c3e3e

Committed by zyfncg on Apr 18, 2022

* create tensor by empty in custom op

* fix some bug

bc1c3e3e

[KP] Add Reduce op registry & UT for xpu_kp compilation (#41869) · b3959fe4

Committed by Lijunhui on Apr 18, 2022

b3959fe4

Add sparse kernel coalesced (#41784) · 8f469ddd

Committed by zhangkaihuo on Apr 18, 2022

8f469ddd

Optimization for graph_sample_neighbors API (#41447) · c31dd04c

Committed by Siming Dai on Apr 18, 2022

* add eids result for graph_sample_neighbors

* fix bug

* move fisher_yates sample to warp

* add cpu eid output

* delete comment

* delete comment

* change nullptr placeholder

* optimize sample kernel

* fix mutable_data

c31dd04c
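
The "fisher_yates sample" bullet in the commit above refers to Fisher-Yates shuffling as a way to draw neighbors without replacement. The following is a CPU-side Python sketch of a partial Fisher-Yates sample, not the warp-level GPU kernel from the PR. The name `sample_neighbors_sketch` is hypothetical, and treating a negative or oversized sample size as "return every neighbor" is an assumption made for the example.

```python
import random


def sample_neighbors_sketch(neighbors, sample_size, rng=None):
    """Pick `sample_size` distinct neighbors with a partial Fisher-Yates shuffle.

    Hypothetical helper for illustration; only the first `sample_size`
    positions of the copied list are shuffled, so the cost is O(sample_size)
    swaps rather than a full shuffle.
    """
    rng = rng or random.Random()
    pool = list(neighbors)  # copy so the caller's list is left untouched
    n = len(pool)
    if sample_size < 0 or sample_size >= n:
        return pool  # assumption: nothing to sample, return all neighbors
    for i in range(sample_size):
        j = rng.randrange(i, n)  # uniform index in the not-yet-fixed suffix
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:sample_size]


if __name__ == "__main__":
    picked = sample_neighbors_sketch([3, 7, 8, 12, 15, 21], 3, random.Random(0))
    print(picked)
```

Each position `i` receives a uniformly chosen element from the remaining suffix, which is what makes every subset of the requested size equally likely.
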
17 Apr, 2022 2 commits

[Perf] Optimize dygraph scheduling performance (#41696) · 7ee31a96

Committed by Chen Weihang on Apr 17, 2022

* split phi and fluid infermeta context

* resolve conflict

* fix type error

* optimize scheduling perf

* spec small vector size

* replace all grad var name

* fix test failed

* move init default signature

* polish details

* polish details

* fix no init bug

* init sig for tests

* add init sig for infer

* fix infrt error

* fix infrt failed

* fix kunlun error

* fix infrt failed

7ee31a96

[CustomOp] Fix PlaceType related compat error (#41826) · b5d9c31c

Committed by Chen Weihang on Apr 17, 2022

* fix place type related compat error

* fix test failed

* remove dll decl

* revert place type change

* add dll decl

b5d9c31c

16 Apr, 2022 1 commit

move fc_functor from fluid to phi.test=develop (#41856) · 21aa3adc

Committed by 王明冬 on Apr 16, 2022

21aa3adc

15 Apr, 2022 8 commits

[Yaml]add adamw yaml (#41678) · ea0a164b

Committed by chentianyu03 on Apr 15, 2022

* add adamw yaml

* fix test case error

* make the name of weight and bias in linear1 and linear2 to be constant

ea0a164b

[Phi]Reduce kernels into multiple files (#41747) · 1927aff9

Committed by chentianyu03 on Apr 15, 2022

* split reduce_kernel

* rm reduce_kernel in cmake

* split reduce_grad kernels

* fix cmake build error

* format code

* fix standalone_executor_test error

1927aff9

[DoubleGrad] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode (#41730) · 27f28e82

Committed by Zhanlue Yang on Apr 15, 2022

* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad

* Fixed elementwise issue

* Addressed CI failures

* [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode

* [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode

* Enabled more test cases

* [DoubleGrad] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode

* Adjusted test_imperative_star_gan_with_gradient_penalty.py

27f28e82

Add eager string tensor (#41039) · a22b68b8

Committed by Jack Zhou on Apr 15, 2022

* Add core.eager.StringTensor __init__ which pyarray args can be passed

* Add the numpy method of core.eager.StringTensor

* revert tensor.to_string modification

* Add ToPyObject for core.eager.StringTensor

* Add debug string for core.eager.StringTensor

* Remove place args of core.eager.StringTensor temporarily

* Fix check string_tensor error

* remove dtype of core.eager.StringTensor

* add core.eager.StringTensor unittest

* remove pstring from VarDesc

* Add InitStringTensorWithStringTensor

* Remove to_string modification

* Remove zero_copy arg from StringTensor creator

a22b68b8

polish tensor deprecated method warning (#41807) · e83e44c7

Committed by Chen Weihang on Apr 15, 2022

e83e44c7

Add API: Sparse Convolution3D (#41434) · 1665594d

Committed by zhangkaihuo on Apr 15, 2022

1665594d

Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda

Committed by limingshu on Apr 15, 2022

* change cudnn helper for auto-tune

* Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.

* Fix the bug in calculating and printing current step cache hit rate.

* Improve the autotune cache and fix unittest.

* Change the key from AlgorithmType to int64_t.

* Fix unittest for cpu-only env.

* change ChooseAlgoByWorkspace for heuristic mode

Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

35acfeda
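
The bullets above mention an autotune cache, an integer cache key, and a per-step cache hit rate. As a rough illustration of how those pieces fit together, here is a small Python sketch of an algorithm cache that memoizes the result of an expensive search and tracks its hit rate. The class `AlgoCacheSketch`, its method names, and the way the key is built are all hypothetical; none of this is taken from the Paddle source.

```python
class AlgoCacheSketch:
    """Memoize a chosen algorithm per integer key and track the hit rate.

    Hypothetical illustration only; not the cache implemented in the PR.
    """

    def __init__(self):
        self._store = {}  # integer key -> chosen algorithm id
        self.hits = 0
        self.misses = 0

    def get_or_search(self, key, search):
        """Return the cached algorithm for `key`, running `search()` on a miss."""
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        algo = search()  # the expensive step, e.g. timing candidate algorithms
        self._store[key] = algo
        return algo

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = AlgoCacheSketch()
    # Assumption: the key is an integer derived from the problem configuration.
    key = hash(((8, 3, 224, 224), (64, 3, 7, 7), "float16"))
    cache.get_or_search(key, lambda: 1)  # miss: search runs
    cache.get_or_search(key, lambda: 2)  # hit: cached value 1 is reused
    print(cache.hit_rate())
```

Keying on a plain integer keeps lookups cheap and independent of any particular algorithm enum, which is one plausible motivation for the key change listed above.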

fix batch norm memory issue (#41717) · 42abcc08

Committed by hong on Apr 15, 2022

* try to fix batch norm memory issue

* fix batch norm memory alloc bug

* polish some code

42abcc08

14 Apr, 2022 8 commits

[KP] Add registry for elementwise_add/max/min/sub/div/mul/floordiv on XPU2 with KP lib (#41494) · fbe2c311

Committed by Lijunhui on Apr 14, 2022

* register elementwise_xxx

fbe2c311

remove all is initialized using (#41766) · 4733fe60

Committed by Chen Weihang on Apr 14, 2022

4733fe60

[Phi] Support construct Scalar by using Non-CPU Tensor (#41765) · 54ccc308

Committed by YuanRisheng on Apr 14, 2022

* support construct scalar using non-cpu tensor

* fix bugs when run unittest

* fix compile bugs

* fix bugs when run ci

* fix compile bugs

* fix bugs when move copy

* perfect unit test

* perfect unittest

* update according to comment

* add target dependency

* deal with conflict

* fix bugs when run unit test

* fix unit test bugs

54ccc308

Fix to #38693 (minimal UT) (#41026) · d0f3296b

Committed by Jacek Czaja on Apr 14, 2022

* Add UT

- Added missed data_layout

- Added missing conversions

- NDHWC added

- NDHWC support in data_transform

- another fix

- condddate change

- fix

- fix

- fix

- fix

- fix

- fix

- fix to hack

- compilation fix

- fix to automatic merge

* - reduced UT

* - fix

* - lint

* - fix to lint

d0f3296b

[PHI] Support some c++ api in paddle namespace (#41778) · b075dee8

Committed by zyfncg on Apr 14, 2022

* support some c++ api in paddle namespace

* change c++ api namespace in custom op

b075dee8

[DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode (#41668) · ad9585b6

Committed by Zhanlue Yang on Apr 14, 2022

* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad

* Fixed elementwise issue

* Addressed CI failures

* [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode

* [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode

* Enabled more test cases

* Fixed performance issues

* Fixed minor issue

ad9585b6

[Op]Fix adam/adamw beta1_pow/beta2_pow place while copying (#41732) · 4ae76d21

Committed by Aurelius84 on Apr 14, 2022

4ae76d21

remove inner_place using (#41768) · de2a3942

Committed by Chen Weihang on Apr 14, 2022

de2a3942
