Commits · 9d12e70c5daab2c59724d0d4b2758b1f1ecdaabc · BaiXuePrincess / Paddle

23 Jun, 2022 · 1 commit

fix set_value (#43694) (#43783) · 9d12e70c

Committed by zyfncg on Jun 23, 2022

9d12e70c

22 Jun, 2022 · 2 commits

Optimize linspace to avoid GPU -> CPU copy. (#42750) (#43746) · 4dcfc6df

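Committed by Yiqun Liu on Jun 22, 2022

Cherry-pick of #42750.

QA reported that the optimization in #42750 can improve solov2 model performance by 6%, so it is cherry-picked to 2.3. #41096 moved the linspace Python implementation from fluid.layers.tensor to paddle.tensor.creation, but that PR is not in the release/2.3 branch, so the Python changes from #42750 are applied to fluid.layers.tensor.linspace instead.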

4dcfc6df
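
For readers unfamiliar with the operator, here is a minimal pure-Python sketch of what linspace computes. `linspace_sketch` is a hypothetical helper written for illustration; it is not Paddle's kernel and says nothing about how #42750 avoids the GPU -> CPU copy.

```python
def linspace_sketch(start, stop, num):
    """Return `num` evenly spaced values from start to stop (both inclusive)."""
    if num <= 0:
        return []
    if num == 1:
        return [float(start)]
    # The spacing is fixed by the two endpoints and the number of points.
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]
```
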

[cherry pick] Support optional residual add in fused ops and slice large... · 0660d5f2

Committed by Zhang Ting on Jun 22, 2022

[cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax (#43719)

 [cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax

cherry-pick #43635 #43681 #43474

0660d5f2

20 Jun, 2022 · 1 commit

[Cherry pick] Einsum memory optimization PR #43397 (#43554) · 638b69dc

Committed by xiongkun on Jun 20, 2022
```
* cherry pick from #43397

* fix code
```
  638b69dc
15 Jun, 2022 · 1 commit

[cherry-pick] Fix bug of strided_slice and slice (#43388, #43443) (#43432) · 7e940b84

Committed by zyfncg on Jun 15, 2022
```
* fix bug of strided_slice (#43388)

* fix strided_slice bug

* fix bug

* fix bug of infer shape for slice (#43443)
```
  7e940b84
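
The commit message does not say what was wrong with shape inference. As background, the sketch below shows how the output shape of a strided slice can be derived. It assumes Python slicing semantics, which may differ from Paddle's strided_slice in edge cases, and both function names are hypothetical.

```python
def strided_slice_len(dim, start, end, stride):
    """Length of x[start:end:stride] along an axis of size `dim`."""
    if stride == 0:
        raise ValueError("stride must be non-zero")
    # slice.indices clamps start/end to the axis and resolves negative indices.
    lo, hi, st = slice(start, end, stride).indices(dim)
    return len(range(lo, hi, st))


def strided_slice_shape(shape, axes, starts, ends, strides):
    """Output shape after slicing the given axes; other axes are unchanged."""
    out = list(shape)
    for ax, s, e, st in zip(axes, starts, ends, strides):
        out[ax] = strided_slice_len(shape[ax], s, e, st)
    return out
```

For example, slicing a `[5, 6]` tensor with `starts=[1, 0]`, `ends=[4, 6]`, `strides=[1, 2]` on axes `[0, 1]` gives shape `[3, 3]` under these assumptions.
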
14 Jun, 2022 · 1 commit

[ CherryPick ] Cherry pick for einsum optimization. (#43468) · 22e75d92

Committed by xiongkun on Jun 14, 2022

* [EinsumOp] Polish forward logic and backward logic for optimize (#42603)

* change logic for optimize

* modify

* merge

* change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010)

* [EinsumOp] Make EinsumOp support bfloat16. (#43085)

* change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0

* make EInsumOP support bf16

* add unittest for BF16

* add condition for test_BF16

* fix bugs

* fix

* change the backward api to fit einsum op

22e75d92

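08 Jun, 2022 · 1 commit

Replace ReduceAmax/Amax.part.cu with KP (#43202) (#43263) · e161979e

Committed by niuliling123 on Jun 08, 2022

The original implementations of reduce amax/amin and frobenius_norm_kernel are Eigen-based and take a long time to compile, so this PR replaces them with KP implementations.
It also removes redundant functionality from DefaultElementwiseOperator to reduce the compile time of the elementwise_double_grad OP.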

e161979e

07 Jun, 2022 · 1 commit

[cherry-pick]Delete ElementwiseKernel in BroadcastKernel (#42779) (#43210) · 52ef8656

Committed by niuliling123 on Jun 07, 2022
```
Delete ElementwiseKernel in BroadcastKernel
Reduce redundant function calls across all Broadcast code paths, which also reduces compile time and file size
```
  52ef8656
06 Jun, 2022 · 1 commit

cherry-pick 42645 (#43205) · 835a1888

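Committed by niuliling123 on Jun 06, 2022

Remove the per-rank template instantiation and the Elementwise call in the Broadcast function to reduce compile time.
Adapted from PR #42645 on the develop branch. Because the develop and release branches have diverged significantly, a direct cherry-pick was not possible, so the PR was resubmitted against release/2.3.
Instantiating Broadcast per rank expands many underlying templates and makes reduce_sum_grad_kernel.cu.o excessively large; this change reduces both the .o file size and the compile time.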

835a1888

06 May, 2022 · 1 commit

Fix the race condition in cumsum operator (#42205) (#42500) · 58f40144

Committed by wawltor on May 06, 2022

* Fix the race condition in cumsum operator

* Optimize cumsum operator
Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>

58f40144
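
The commit message does not describe the race itself. As general background, each step of a parallel prefix sum reads and writes the same array, so a step must not overwrite values that other threads still need to read. The sketch below assumes a Hillis-Steele scan simulated sequentially in Python; it is not the Paddle kernel, and `inclusive_scan` is a hypothetical name.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum using two buffers per step."""
    src = list(values)
    n = len(src)
    offset = 1
    while offset < n:
        # Write into a separate buffer so every "thread" i reads only values
        # from the previous step. Updating src in place would let a later i
        # read an element that was already updated in this same step.
        dst = list(src)
        for i in range(offset, n):
            dst[i] = src[i] + src[i - offset]
        src = dst
        offset *= 2
    return src
```

On a GPU, the same effect is usually obtained with double buffering or a barrier between the read phase and the write phase of each step.
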

04 May, 2022 · 1 commit

[cherry-pick 2.3] fix bug of batch_norm_grad kernel with fp16 (#42461) · a5745864

Committed by XiaoguangHu on May 04, 2022
```
* fix bug of batch_norm_grad kernel with fp16

* format code
```
  a5745864
30 Apr, 2022 · 1 commit

Make einsum_v2 support multi-operands (#42327) (#42397) · 34352fcd

Committed by xiongkun on Apr 30, 2022

* Extend python einsum interface to make einsum_v2 support multi-operands and switch it to default.

* add opt_einsum dependence

* add yaml and support eager model

* fix by code review

34352fcd
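
One way to build a multi-operand einsum on top of a two-operand kernel is to contract the operands pairwise. The sketch below plans such a sequence in a fixed left-to-right order. That order is an assumption made for simplicity (the commit adds an opt_einsum dependency, and a real planner may choose a different order). `pairwise_plan` is a hypothetical helper that handles only explicit equations containing `->` and no ellipsis.

```python
def pairwise_plan(equation):
    """Split 'a,b,c->out' into a left-to-right list of 2-operand equations."""
    inputs, output = equation.split("->")
    operands = inputs.split(",")
    steps = []
    left = operands[0]
    last = len(operands) - 1
    for k in range(1, len(operands)):
        right = operands[k]
        if k == last:
            # The final contraction must produce exactly the requested output.
            merged = output
        else:
            # A label survives if the output or a remaining operand uses it.
            needed = set(output).union(*operands[k + 1:])
            # Keep first-seen order and drop duplicate labels.
            merged = "".join(dict.fromkeys(c for c in left + right if c in needed))
        steps.append(f"{left},{right}->{merged}")
        left = merged
    return steps
```

For `"ij,jk,kl->il"` this yields `"ij,jk->ik"` followed by `"ik,kl->il"`.
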

28 Apr, 2022 · 5 commits

Optimize attribute selected performance (#42294) (#42368) · e0e534ab

Committed by Chen Weihang on Apr 28, 2022

* opt attr eaque perf

* opt attr select code

* fix one hot infermeta

* polish get attr impl

* fix tests failed

* add testcases

e0e534ab

Add C++ EinsumOp which support 2 operands einsum. (#42105) (#42357) · d04a68d3

Committed by xiongkun on Apr 28, 2022

* full api fix

* when out is None, go old dygraph mode

* by static check

* first version: support 2-inputs forwards. TODO: 1. backward  2. BroadCast  3. MultiVariable

* time out -> 120

d04a68d3

set device id of Place() to get GPUContext needed by LimitGridDim in... · 0fe0aea9

Committed by FlyingQianMM on Apr 28, 2022
```
set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast (PaddlePaddle#42320) (#42332)
```
0fe0aea9

[cherry-pick] Optimize performance of dygraph (#42196) (#42329) · 2ea56c90

Committed by zyfncg on Apr 28, 2022

* Optimize performance of dygraph (v4)  (#42196)

* optimize performance of dygraph

* optimize performance of dygraph and elementwise_add

* optimize the trace op

* fix bug

* fix bug

* fix unittest bug

* fix code format

* fix cherry-pick problem

2ea56c90

[cherry-pick] Optimize performance of dygraph (#42231, #42253) (#42309) · 69a92b7b

Committed by zyfncg on Apr 28, 2022

* Optimize the performance of sum api (#42231)

* optimize the performance of sum api

* optimize IsDenseTensorInput

* remove debug log

* Add move construct for KernelSignature (#42253)

* add move construct for KernelSignature

* add noexcept

* fix cherry-pick problem

69a92b7b

26 Apr, 2022 · 1 commit

[Cherry-pick] Optimize dygraph performance part2 (#42224) · ab24b9c0

Committed by Chen Weihang on Apr 26, 2022

* Add paddle::variant and replace paddle::any (#42139)

* add variant and replace any

* split attribute

* Optimize dygraph GetExpectedKernelType perf (#42154)

* opt dygraph scheduling

* revert part impl

* fix variant compile error (#42203)

* replace any by variant in infermeta (#42181)

ab24b9c0

25 Apr, 2022 · 1 commit

[Cherry-Pick][Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm... · 58d0d15e

Committed by Aurelius84 on Apr 25, 2022

[Cherry-Pick][Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm and fix shape op (#42170)

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT (#42138)

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT

* [Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm (#42132)

58d0d15e
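
As background on why gradient clipping can trigger a stream synchronization: branching in host code on whether the global norm exceeds the threshold requires the norm to be copied back to the CPU first. One common way to avoid the branch is a scale factor that is always applied. The sketch below illustrates that idea in plain Python; it assumes `clip_norm > 0`, uses a hypothetical function name, and is not claimed to be what #42132 actually does.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients so their joint L2 norm is at most clip_norm.

    `grads` is a list of flat lists of floats; `clip_norm` must be > 0.
    """
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Branch-free scale: exactly 1.0 whenever global_norm <= clip_norm,
    # so no host-side comparison on the norm is needed.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads]
```
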

21 Apr, 2022 · 2 commits

[Cherry-pick] Optimize dygraph scheduling performance (#42010) · ec1d2a16

Committed by Chen Weihang on Apr 21, 2022

* [Phi] Support setting size of vector<Tensor> for out in yaml (#41576)

* support setting vector out size in yaml

* support setting size of vector<tensor> for out in yaml

* resolve conflict
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>

ec1d2a16

[Cherry-pick] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode (#41994) · af7439ad

Committed by Jiabin Yang on Apr 21, 2022
```
* cherry-pick python/paddle/utils/code_gen/backward.yaml

* remove unsupported yaml
Co-authored-by: Zhanlue Yang <jim19930609@gmail.com>
```
af7439ad

19 Apr, 2022 · 4 commits

[cherry-pick] add rsqrt, equal_all, expand yaml and unittest (#41443, #41540) (#41965) · 018245d8

Committed by zyfncg on Apr 19, 2022

* add rsqrt yaml and unittest (#41443)

* Add expand equal all yaml (#41540)

* add expand, poisson

* add poisson grad

* add expand equal_all poisson triangular solve yaml
Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>

018245d8

[Cherry-pick 2.3] Autotune the workspace and kernel choosing of conv (#41833) · b4adbe5c

Committed by Yiqun Liu on Apr 19, 2022
```
Cherry-pick #40338 #41741 #41313
```
b4adbe5c
Add kernel sparse_mask_helper; sparse_coo_tensor_grad (#41586) (#41902) · 44d8c6ed

Committed by zhangkaihuo on Apr 19, 2022
```
cherry-pick PR #41586 to release/2.3
```
44d8c6ed

Optimization for graph_sample_neighbors API (#41447) (#41897) · 6115b016

Committed by Siming Dai on Apr 19, 2022

* add eids result for graph_sample_neighbors

* fix bug

* move fisher_yates sample to warp

* add cpu eid output

* delete comment

* delete comment

* change nullptr placeholder

* optimize sample kernel

* fix mutable_data

6115b016
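
The bullet "move fisher_yates sample to warp" refers to Fisher-Yates sampling. As a reminder of the technique, here is a sequential sketch of a partial Fisher-Yates shuffle that draws `k` neighbors without replacement. The function name, the `rng` parameter, and the choice to return every neighbor when `k` is negative or at least the neighbor count are assumptions of this sketch, not details taken from the GPU kernel.

```python
import random


def sample_neighbors(neighbors, k, rng=None):
    """Draw k distinct neighbors uniformly at random (partial Fisher-Yates)."""
    rng = rng or random.Random()
    pool = list(neighbors)
    n = len(pool)
    if k < 0 or k >= n:
        # Nothing to sample: return every neighbor.
        return pool
    for i in range(k):
        # Pick one of the not-yet-chosen items and move it to position i.
        j = rng.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]
```

Only the first `k` positions are shuffled, so the cost is O(k) swaps after the initial copy rather than a full shuffle.
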

18 Apr, 2022 · 2 commits

[Phi]Reduce kernels into multiple files (#41747) (#41854) · 688f4ec0

Committed by chentianyu03 on Apr 18, 2022

* split reduce_kernel

* rm reduce_kernel in cmake

* split reduce_grad kernels

* fix cmake build error

* format code

* fix standalone_executor_test error

688f4ec0

[DoubleGrad] Enabled double grad test cases in eager_mode for... · a367fbab

Committed by Zhanlue Yang on Apr 18, 2022

[DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad (#41451) (#41893)

* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad

* Fixed elementwise issue

* Addressed CI failures

a367fbab

15 Apr, 2022 · 3 commits

[cherry-pick] Add Sparse API to_dense, to_sparse_coo and values (#41394) (#41834) · 8300e618

Committed by zhangkaihuo on Apr 15, 2022
```
Add paddle.sparse and three Sparse API (#41276)
Add Sparse API to_dense, to_sparse_coo and values (#41394)
```
  8300e618
fix p_norm gpu nan bug while divide zero (#41804) · 261f97fb

Committed by zhiboniu on Apr 15, 2022
  
  261f97fb
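
To illustrate how a p-norm computation can produce NaN through division by zero: the gradient divides by the norm raised to `p - 1`, which is zero for an all-zero input. The sketch below guards the denominator with a small `eps`. Both `eps` and the function names are hypothetical, the sketch assumes `p >= 1`, and the actual fix in #41804 may take a different approach.

```python
def p_norm(xs, p):
    """(sum |x_i|^p)^(1/p) for a flat list of floats."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)


def p_norm_grad(xs, p, eps=1e-12):
    """d||x||_p / dx_i = sign(x_i) * |x_i|^(p-1) / ||x||_p^(p-1)."""
    norm = p_norm(xs, p)
    # Guard: norm is 0 for an all-zero input, which would give 0/0 = NaN.
    denom = max(norm, eps) ** (p - 1)

    def sign(x):
        return (x > 0) - (x < 0)

    return [sign(x) * abs(x) ** (p - 1) / denom for x in xs]
```
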
[Phi]Add multi_dot/maxout/multiplex op yaml (#41550) (#41818) · 3449a34e

Committed by YuanRisheng on Apr 15, 2022
```
* add multi_dot,maxout,multiplex yaml

* add code coverage
```
  3449a34e
14 Apr, 2022 · 2 commits

Cherry pick final state ops (#41755) · 921a6fb7

Committed by chentianyu03 on Apr 14, 2022

* [Yaml]add exp yaml (#41217)

* add exp yaml

* add exp api in test case

* add determinant yaml

* fix exp op unittest

* change test class name

* modify api name

* compacted with raw api

* fix det api

* add python_api

* add test eager for determinant op

* [Yaml] Add assign yaml (#41428)

* add assign yaml

* add assign api

* add assign backward api

* add assign

* add assign yaml

* add assign

* assign yaml

* add assign raw kernel and use assign_raw in yaml

* merge develop branch

* add missing python_api

* exchange assign and assign_raw kernel name (#41625)

* exchange assign and assign_raw kernel name

* fix register error

* [Yaml]add gaussian_random yaml and test case (#41312)

* add gaussian random yaml

* add gaussian_random yaml and test case

* fix error modify of full yaml

* import in_dygraph_mode

* import _in_legacy_dygraph

* add place arg in api

* import __current_expected_place

* fix test_egr_python_api failed case

* add test case

* add cast for NormalInitializer

* fix test error

* fix test error

* rm unused check code

* fix test error in test_initializer_nn

* modify by review

* [Phi]fix split error when sections has 0 size and add test case (#41708)

* fix split error when sections has 0 size and add test case

* fix test case

921a6fb7
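
The last item in this commit concerns split when a section has size 0. The sketch below shows the expected behavior on a plain Python list: a zero-size section yields an empty piece and does not advance the offset. `split_by_sections` is a hypothetical helper and does not model Paddle's split API.

```python
def split_by_sections(values, sections):
    """Split `values` into consecutive pieces with the given sizes."""
    if any(size < 0 for size in sections):
        raise ValueError("section sizes must be non-negative")
    if sum(sections) != len(values):
        raise ValueError("sections must sum to len(values)")
    pieces, start = [], 0
    for size in sections:
        # A size of 0 produces an empty piece and leaves `start` unchanged.
        pieces.append(values[start:start + size])
        start += size
    return pieces
```
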

add fp16 kernel to clip_grad (#41675) · d447c678

Committed by wuyefeilin on Apr 14, 2022

d447c678

13 Apr, 2022 · 3 commits

[Cherry-pick] Two detail fix prs (#41728) · 8dba3d0b

Committed by Chen Weihang on Apr 13, 2022

* [Eager] Remove elementwise add in conv (#41515)

* remove elementwise add in conv

* use reshape

* fix warpctc grad kernel dep error (#41598)

8dba3d0b

add a inner loop for index_select_grad_init() in index_select op when dealing... · 5d4980c0

Committed by FlyingQianMM on Apr 13, 2022
```
add a inner loop for index_select_grad_init() in index_select op when dealing with large-shape data (PaddlePaddle#41563) (#41669)
```
5d4980c0
Revert "[Phi] Migrate Adam and AdamW into Phi (#40351)" (#41712) · 8663376f

Committed by Aurelius84 on Apr 13, 2022
```
* Revert "[Phi] Migrate Adam and AdamW into Phi (#40351)"

This reverts commit 56cd3407.

* add infermeta
```
8663376f

12 Apr, 2022 · 3 commits

fix search sort bug (#41664) (#41684) · 30fb07ef

Committed by hong on Apr 12, 2022
```
* fix search sort bug (#41664)

* fix depthwise dnn bug (#41666)
```
30fb07ef

[Cherry-Pick]Add... · a0b0a32f

Committed by YuanRisheng on Apr 12, 2022

[Cherry-Pick]Add hard_swish/kron/linspace/logit/graph_send_recv/multi_dot/maxout/multiplex op yaml file  (#41566)

* [Phi]Add graph_send_recv yaml file (#41206)

* add graph_send_recv yaml

* deal with conflict

* fix compile bugs

* cherry-pick pr 41298

* cherry-pick pr41550

* fix compile bugs

a0b0a32f

Fix RNN OP multi-threads predict bug (#41529) (#41560) · e4dcf0bf

Committed by Jack Zhou on Apr 12, 2022

e4dcf0bf

11 Apr, 2022 · 2 commits

add depthwise conv hip support (#41537) (#41603) · 676c960c

Committed by hong on Apr 11, 2022

676c960c

[Cherry-pick] Add truncated_normal/unique/swish/unbind yaml and polish Getting... · b2e095c4

Committed by Chen Weihang on Apr 11, 2022

[Cherry-pick] Add truncated_normal/unique/swish/unbind yaml and polish Getting tensor place impl (#41539)

* [Phi] Polish truncated normal kernel and add yaml (#41280)

* polish truncated normal kernel

* add yaml

* add truncated normal kernel and add yaml

* polish unittests and yaml

* import dygraph method

* add unique yaml and final state api (#41460)

* fix get tensor backend set bug (#41478)

* [Phi] Add unbind yaml and final state api (#41277)

* add unbind yaml

* fix unittest

* [Phi] Add swish yaml and final state api (#41479)

* add swish yaml and final state api

* skip mkldnn test

* fix grad mkldnn test

* add cherry-pick lost code

b2e095c4

BaiXuePrincess / Paddle is up to date with the fork's source project
