Commits · d04a68d3ab32b8bd86e54c8f01d7065323a65bb6 · BaiXuePrincess / Paddle

28 Apr, 2022 · 4 commits

Add C++ EinsumOp which support 2 operands einsum. (#42105) (#42357) · d04a68d3

Committed by xiongkun on Apr 28, 2022

* full api fix

* when out is None, go old dygraph mode

* by static check

* first version: support 2-inputs forwards. TODO: 1. backward  2. BroadCast  3. MultiVariable

* time out -> 120

d04a68d3

set device id of Place() to get GPUContext needed by LimitGridDim in... · 0fe0aea9
Committed by FlyingQianMM on Apr 28, 2022
```
set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast (PaddlePaddle#42320) (#42332)
```
0fe0aea9

[cherry-pick] Optimize performance of dygraph (#42196) (#42329) · 2ea56c90

Committed by zyfncg on Apr 28, 2022

* Optimize performance of dygraph (v4)  (#42196)

* optimize performance of dygraph

* optimize performance of dygraph and elementwise_add

* optimize the trace op

* fix bug

* fix bug

* fix unittest bug

* fix code format

* fix cherry-pick problem

2ea56c90

[cherry-pick] Optimize performance of dygraph (#42231, #42253) (#42309) · 69a92b7b

Committed by zyfncg on Apr 28, 2022

* Optimize the performance of sum api (#42231)

* optimize the performance of sum api

* optimize IsDenseTensorInput

* remove debug log

* Add move construct for KernelSignature (#42253)

* add move construct for KernelSignature

* add noexcept

* fix cherry-pick problem

69a92b7b

26 Apr, 2022 · 1 commit

[Cherry-pick] Optimize dygraph performance part2 (#42224) · ab24b9c0

Committed by Chen Weihang on Apr 26, 2022

* Add paddle::variant and replace paddle::any (#42139)

* add variant and replace any

* split attribute

* Optimize dygraph GetExpectedKernelType perf (#42154)

* opt dygraph scheduling

* revert part impl

* fix variant compile error (#42203)

* replace any by variant in infermeta (#42181)

ab24b9c0

25 Apr, 2022 · 1 commit

[Cherry-Pick][Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm... · 58d0d15e

Committed by Aurelius84 on Apr 25, 2022

[Cherry-Pick][Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm and fix shape op (#42170)

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT (#42138)

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT

* [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT

* [Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm (#42132)

58d0d15e

21 Apr, 2022 · 2 commits

[Cherry-pick] Optimize dygraph scheduling performance (#42010) · ec1d2a16

Committed by Chen Weihang on Apr 21, 2022

* [Phi] Support setting size of vector<Tensor> for out in yaml (#41576)

* support setting vector out size in yaml

* support setting size of vector<tensor> for out in yaml

* resolve conflict
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>

ec1d2a16

[Cherry-pick] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode (#41994) · af7439ad
Committed by Jiabin Yang on Apr 21, 2022
```
* cherry-pick python/paddle/utils/code_gen/backward.yaml

* remove unsupported yaml
Co-authored-by: Zhanlue Yang <jim19930609@gmail.com>
```
af7439ad

19 Apr, 2022 · 4 commits

[cherry-pick] add rsqrt, equal_all, expand yaml and unittest (#41443, #41540) (#41965) · 018245d8

Committed by zyfncg on Apr 19, 2022

* add rsqrt yaml and unittest (#41443)

* Add expand equal all yaml (#41540)

* add expand, poisson

* add poisson grad

* add expand equal_all poisson triangular solve yaml
Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>

018245d8

[Cherry-pick 2.3] Autotune the workspace and kernel choosing of conv (#41833) · b4adbe5c
Committed by Yiqun Liu on Apr 19, 2022
```
Cherry-pick #40338 #41741 #41313
```
b4adbe5c
Add kernel sparse_mask_helper; sparse_coo_tensor_grad (#41586) (#41902) · 44d8c6ed
Committed by zhangkaihuo on Apr 19, 2022
```
cherry-pick the PR#41586 to release/2.3
```
44d8c6ed

Optimization for graph_sample_neighbors API (#41447) (#41897) · 6115b016

Committed by Siming Dai on Apr 19, 2022

* add eids result for graph_sample_neighbors

* fix bug

* move fisher_yates sample to warp

* add cpu eid output

* delete comment

* delete comment

* change nullptr placeholder

* optimize sample kernel

* fix mutable_data

6115b016

18 Apr, 2022 · 2 commits

[Phi]Reduce kernels into multiple files (#41747) (#41854) · 688f4ec0

Committed by chentianyu03 on Apr 18, 2022

* split reduce_kernel

* rm reduce_kernel in cmake

* split reduce_grad kernels

* fix cmake build error

* format code

* fix standalone_executor_test error

688f4ec0

[DoubleGrad] Enabled double grad test cases in eager_mode for... · a367fbab

Committed by Zhanlue Yang on Apr 18, 2022

[DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad (#41451) (#41893)

* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad

* Fixed elementwise issue

* Addressed CI failures

a367fbab

15 Apr, 2022 · 3 commits
- [cherry-pick] Add Sparse API to_dense, to_sparse_coo and values (#41394) (#41834) · 8300e618
  Committed by zhangkaihuo on Apr 15, 2022
```
Add paddle.sparse and three Sparse API (#41276)
Add Sparse API to_dense, to_sparse_coo and values (#41394)
```
  8300e618
- fix p_norm gpu nan bug while divide zero (#41804) · 261f97fb
  Committed by zhiboniu on Apr 15, 2022
  
  261f97fb
- [Phi]Add multi_dot/maxout/multiplex op yaml (#41550) (#41818) · 3449a34e
  Committed by YuanRisheng on Apr 15, 2022
```
* add multi_dot,maxout,multiplex yaml

* add code coverage
```
  3449a34e
14 Apr, 2022 · 2 commits

Cherry pick final state ops (#41755) · 921a6fb7

Committed by chentianyu03 on Apr 14, 2022

* [Yaml]add exp yaml (#41217)

* add exp yaml

* add exp api in test case

* add determinant yaml

* fix exp op unittest

* change test class name

* modify api name

* compacted with raw api

* fix det api

* add python_api

* add test eager for determinant op

* [Yaml] Add assign yaml (#41428)

* add assign yaml

* add assign api

* add assign backward api

* add assign

* add assign yaml

* add assign

* assign yaml

* add assign raw kernel and use assign_raw in yaml

* merge develop branch

* add missing python_api

* exchange assign and assign_raw kernel name (#41625)

* exchange assign and assign_raw kernel name

* fix register error

* [Yaml]add gaussian_random yaml and test case (#41312)

* add gaussian random yaml

* add gaussian_random yaml and test case

* fix error modify of full yaml

* import in_dygraph_mode

* import _in_legacy_dygraph

* add place arg in api

* import __current_expected_place

* fix test_egr_python_api failed case

* add test case

* add cast for NormalInitializer

* fix test error

* fix test error

* rm unused check code

* fix test error in test_initializer_nn

* modify by review

* [Phi]fix split error when sections has 0 size and add test case (#41708)

* fix split error when sections has 0 size and add test case

* fix test case

921a6fb7

add fp16 kernel to clip_grad (#41675) · d447c678
Committed by wuyefeilin on Apr 14, 2022

d447c678

13 Apr, 2022 · 3 commits

[Cherry-pick] Two detail fix prs (#41728) · 8dba3d0b

Committed by Chen Weihang on Apr 13, 2022

* [Eager] Remove elementwise add in conv (#41515)

* remove elementwise add in conv

* use reshape

* fix warpctc grad kernel dep error (#41598)

8dba3d0b

add a inner loop for index_select_grad_init() in index_select op when dealing... · 5d4980c0
Committed by FlyingQianMM on Apr 13, 2022
```
add a inner loop for index_select_grad_init() in index_select op when dealing with large-shape data (PaddlePaddle#41563) (#41669)
```
5d4980c0
Revert "[Phi] Migrate Adam and AdamW into Phi (#40351)" (#41712) · 8663376f
Committed by Aurelius84 on Apr 13, 2022
```
* Revert "[Phi] Migrate Adam and AdamW into Phi (#40351)"

This reverts commit 56cd3407.

* add infermeta
```
8663376f

12 Apr, 2022 · 3 commits

fix search sort bug (#41664) (#41684) · 30fb07ef
Committed by hong on Apr 12, 2022
```
* fix search sort bug (#41664)

* fix depthwise dnn bug (#41666)
```
30fb07ef

[Cherry-Pick]Add... · a0b0a32f

Committed by YuanRisheng on Apr 12, 2022

[Cherry-Pick]Add hard_swish/kron/linspace/logit/graph_send_recv/multi_dot/maxout/multiplex op yaml file  (#41566)

* [Phi]Add graph_send_recv yaml file (#41206)

* add graph_send_recv yaml

* deal with conflict

* fix compile bugs

* cherry-pick pr 41298

* cherry-pick pr41550

* fix compile bugs

a0b0a32f

Fix RNN OP multi-threads predict bug (#41529) (#41560) · e4dcf0bf
Committed by Jack Zhou on Apr 12, 2022

e4dcf0bf

11 Apr, 2022 · 2 commits

add depthwise conv hip support (#41537) (#41603) · 676c960c
Committed by hong on Apr 11, 2022

676c960c

[Cherry-pick] Add truncated_normal/unique/swish/unbind yaml and polish Getting... · b2e095c4

Committed by Chen Weihang on Apr 11, 2022

[Cherry-pick] Add truncated_normal/unique/swish/unbind yaml and polish Getting tensor place impl (#41539)

* [Phi] Polish truncated normal kernel and add yaml (#41280)

* polish truncated normal kernel

* add yaml

* add truncated normal kernel and add yaml

* polish unittests and yaml

* import dygraph method

* add unique yaml and final state api (#41460)

* fix get tensor backend set bug (#41478)

* [Phi] Add unbind yaml and final state api (#41277)

* add unbind yaml

* fix unittest

* [Phi] Add swish yaml and final state api (#41479)

* add swish yaml and final state api

* skip mkldnn test

* fix grad mkldnn test

* add cherry-pick lost code

b2e095c4

08 Apr, 2022 · 1 commit
- fix bugs of reshape double grad infermeta (#41459) (#41493) · f196b84e
  Committed by YuanRisheng on Apr 08, 2022
  
  f196b84e
07 Apr, 2022 · 3 commits
- [BugFix] Add error hint for one_hot gpu version (#41335) (#41495) · 57fe4fc9
  Committed by Siming Dai on Apr 07, 2022
```
* add one_hot gpu hint

* move allow_out_of_range judgement

* delete useless unittest
```
  57fe4fc9
- fix bug of missing boost when compile cache.cc (#41449) · e34042af
  Committed by Sing_chan on Apr 07, 2022
```
[cherry-pick #41430] fix bug of random compile failure, due to incorrect compile order of dependencies
```
  e34042af
- [cherry-pick2.3]fix compile bug of windows cuda11.5 (#41464) · 7f502b7c
  Committed by zhouweiwei2014 on Apr 07, 2022
```
cherry-pick

fix compile bug of windows cuda11.5 #41433
```
  7f502b7c
06 Apr, 2022 · 2 commits

Add conv yaml (#41354) · 7ed7c6c7

Committed by hong on Apr 06, 2022

* update

* add conv yaml

* add backward

* remove useless code

* fix bug

* fix bug

* revert fluid dygraph conv2d

* remove useless infermeta function

* fix meta fn duplicate error

* conv using custom impl

* remove amp include

* fix bug

* use cudnn = true

* fix test mkldnn caching bug

7ed7c6c7

[Dygraph TestsFix] Test some tests in new dygraph final_state mode. (#41363) · 0b96793e
Committed by xiongkun on Apr 06, 2022
```
* fix less than

* fix some tests

* fix additional 3 unittest case
```
0b96793e

05 Apr, 2022 · 4 commits

Fix bug of data transform in inference executor (#41349) · 91212104
Committed by zyfncg on Apr 05, 2022
```
* fix bug of data transform in inference executor

* fix bug
```
91212104

[DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul (#41387) · d8a10977

Committed by Zhanlue Yang on Apr 05, 2022

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues

* [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul

* Fixed issues with phi kernel

* Added triple grad test case

* Fixed minor issue

d8a10977

add new format of quantization (#41041) · b72a7ebb
Committed by Guanghua Yu on Apr 05, 2022

b72a7ebb

Implement AutoTuneStatus class for Kernel Auto Tune (#41218) · b0f8000e

Committed by Zhang Ting on Apr 05, 2022

* switch autotune

* implement AutoTuneCache

* implement AutoTuneCache class

* add pybind api

* add dygraph test

* support static mode and eager mode and improve unittests

* rename the SwitchAutoTune Class and improve tests

* improve AutoTuneStatus and reduce the cost of tests

b0f8000e

04 Apr, 2022 · 3 commits
- Fix Warpctc error when using multi-gpu (#41389) · f8b3e576
  Committed by 0x45f on Apr 04, 2022
  
  f8b3e576
- fix index_select kernel configuration error where input numel is 0 (#41383) · 3e9ad093
  Committed by FlyingQianMM on Apr 04, 2022
  
  3e9ad093
- Add batch norm yaml (#41386) · 77cf305f
  Committed by hong on Apr 04, 2022
```
* update

* fix bug
```
  77cf305f

BaiXuePrincess / Paddle is in sync with the fork's source project