Commits · 572ba24ea0f61f9ce54617c8c0dcd99c86bd17a7 · BaiXuePrincess / Paddle

Jan 12, 2022 · 5 commits

[part 4]change type of function args (#38888) · a250c56c
Committed by Zhang Ting on Jan 12, 2022

[part 2]change type of function args (#38886) · 86434818
Committed by Zhang Ting on Jan 12, 2022

[part 1]change type of function args (#38885) · df5d55bb
Committed by Zhang Ting on Jan 12, 2022

Adjust wrapper of gpu_lanuch_config (#38654) · f5166284

Committed by limingshu on Jan 12, 2022

* first commit

* fix wrong filename

* fix the wrong spell name

* fix gpu config wrapper

* modify according to pr advices

* fix GpuLauchConfig1D api bugs

* change the config for dropout grad

* fix bugs

* modification according to pr advices

* modification according to pr advices

Os info (#38779) · 0d8d1e0e

Committed by liutiexing on Jan 12, 2022

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* os_info update

* update

* update

* update

* update

* update

* fix

* update

* update for windows

* fix windows

* update

* update

Co-authored-by: liutiexing <liutiexing@google.com>

Jan 11, 2022 · 10 commits

refactor reshape grad kernel (#38833) · 8cc09552
Committed by YuanRisheng on Jan 11, 2022

[PTen] Add dot and matmul grad kernel in pten (#38713) · be817719

Committed by zyfncg on Jan 11, 2022

* refactor matmul directory in pten

* fix merge conflict

* add dot_grad kernel

* add dot_grad kernel in pten

* add matmul_grad kernel

* update the code

* delete useless code in fluid

* fix some bug of running matmul grad kernel

* fix merge conflict

* refactor some code

* refactor code

open third_party cache in wincheck_inference (#38877) · 5b940c44
Committed by Sing_chan on Jan 11, 2022

Fix bug in elementwise_mul/div_grad when inplace strategy (#38840) · 7915d180

Committed by Zhang Zheng on Jan 11, 2022

* fix bug when inplace strategy

* fix

* fix

* fix

* fix

* fix

Modified Kernel Primitive API and elementwise for xpu2 #38688 · 3eaf8d2c
Committed by niuliling123 on Jan 11, 2022

[PTEN] Add pten::Place data structure. (#38844) · 2bed9b9c

Committed by Wilber on Jan 11, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

Remove useless headers for some grad ops (#38823) · 9f34a070

Committed by limingshu on Jan 11, 2022

* fix the wrong filename

* first commit

* first commit

* remove rest useless headers

* for ci approval

support vs2019 compilation in windows (#38719) · 0ad363b1

Committed by Sing_chan on Jan 11, 2022

* support vs2019 compilation in windows

* not modify pow_op's original compute logic

[Eager] fix some eager logic (#38576) · d3686471

Committed by wanghuancoder on Jan 11, 2022

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* support more eager tensor api

* fix merge compile error

* fix compile error and fit develop code

* support pure CPU

* fix some logic error in eager_mode

* support _varbase_creator in eager mode

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

* for eager mode

* refine

* support multiple constructor for eager tensor

* add place related code

* polish code

* specific randint with dtype of int64

* Support pure cpu test

* eager logic

* refine test in pure cpu

* eager logic

* eager logic

* eager logic, test=develop

* skip core.eager when in inference, test=develop

* refine, test=develop

* refine, test=develop

* call RetainGrad after run forward kernel, test=develop

* refine, test=develop

* support dygraph util, meta, guard test

* eager test case

* support inference test

* refine test and fix initializer failed

* modify eagertensor patch method

* add eagertensor.clear_grandint, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* support create varbase and fix retain grad error

* call monkey_patch_varbase in _test_eager_guard, test=develop

* fix windows error

* split clear_gradient to clear_gradient and zero_grads, test=develop

* refine, test=develop

* refine, test=develop

* support test_imperative_basic test in eager mode

* remove additional log in variable.h

* remove additional log in variable.h

* remove additional code create in merge

* eager

* fix some eager logic, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: JiabinYang <360788950@qq.com>

roi_align fix (#38788) · ffbc2122
Committed by fengkuangxiaxia on Jan 11, 2022

Jan 10, 2022 · 18 commits

add retry on pull dense sync (#38793) · 0a7cb901
Committed by yaoxuefeng on Jan 10, 2022

Add gpu kernel for new api : linalg.lstsq (#38621) · 405103d8

Committed by Haohongxiang on Jan 10, 2022

* add lstsq gpu kernel

* update

* add docs_en

* modify ut

* fix bugs

* modify example in docs_en

* remove lstsq_op.cu from ROCM cmake

* modify docs_en

* modify docs_en

* modify docs_en

* remove unnecessary TensorCopy

[Fleet Executor] Modified python cache strategy to support multi carriers (#38839) · c50c22b0
Committed by LiYuRio on Jan 10, 2022

[fleet_executor] framework for big model inference (#38795) · ededcda2
Committed by Yuang Liu on Jan 10, 2022

refactor the forward implementation of reshape npu op (#38748) · 31b1f707

Committed by baoachun on Jan 10, 2022

* refactor the forward implementation of reshape npu op

* update reshape npu op

* update reshape npu op

move get expected kernel args into pten (#38825) · 3a23c1a2
Committed by Chen Weihang on Jan 10, 2022

Add the backward support for QR (#38824) · 657b6742

Committed by Yulong Ao on Jan 10, 2022

* Add the backward support for QR

* Remove unnecessary comments

[Unify Tensors PR ] Removed interfaces & members from lod_tensor,test=allcases (#38811) · 953638e0

Committed by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues

* Removed interfaces & members from lod_tensor,test=allcases

[bug fix] fix unfold runtime bug (#38819) · 5c357504
Committed by shangliang Xu on Jan 10, 2022

Profiler skeleton (#38826) · a8afed69

Committed by liutiexing on Jan 10, 2022

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* profiler skeleton

* update

* update

* update

Co-authored-by: liutiexing <liutiexing@google.com>

1. fix elementwise_add_grad bug. 2. add dropout kernel in kl2 (#38726) · 7b860a23
Committed by taixiurong on Jan 10, 2022

fix attr missing in conv cudnn kernel (#38827) · 066a8063
Committed by wangxinxin08 on Jan 10, 2022

[Unify Tensors PR ] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea

Committed by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues

Support setting infershape function for custom grad op (#38776) · 046553c7

Committed by Chen Weihang on Jan 10, 2022

* unify infer_shape func calling

* support set grad infer shape fn for custom op

* unify infershape in new executor and eager

* remove todo comment

* revert infershape in operator

[fleet_executor] Add barrier rpc (#38799) · cd2855b0
Committed by LiYuRio on Jan 10, 2022

Add MaxUnPool3D op and MaxUnPool1D op (#38716) · 7e31542c

Committed by andyjpaddle on Jan 10, 2022

* add maxunpool3d op

* update doc for maxunpool3d op

* update doc for maxunpool3d op

* update doc for maxunpool3d op

* update sample code for maxunpool3d

* add maxunpool1d op

* update some code for maxunpool1d

remove fp32 tmp tensor and cast op for initializer.Normal and initializer.Constant (#38818) · 2238a535
Committed by Guoxia Wang on Jan 10, 2022

fix cuda seed bug of class_center_sample training on multi gpu (#38815) · 04f73d89
Committed by Guoxia Wang on Jan 10, 2022

Jan 07, 2022 · 7 commits

[PTen]Refactor flatten_grad kernel (#38712) · 5cf0bb79

Committed by YuanRisheng on Jan 07, 2022

* refactor flatten grad kernel

* fix bugs when run ci unittest

* fix bugs when use default GetExpectedPtenKernelArgs

* xshape sometimes has a null holder; fix this bug

modify mish op and add mish api (#38734) · 8c92337c

Committed by wangxinxin08 on Jan 07, 2022

* add mish operator and api

* remove redundant code and modify grad_atol of mish unittest

* modify mish code to be consistent with other activation implementation

Add multi tensor for adam (#38010) · fb3313e9

Committed by zhangbo9674 on Jan 07, 2022

* add multi tensor for adam

* add merged_adam op

* refine code

* refine adam compute logic

Fix a bug when reduce_num = 1 in Reduce Op (#38771) · f634c0b1
Committed by niuliling123 on Jan 07, 2022

[fleet_executor] Support multi carriers (#38709) · 769e5bc4
Committed by LiYuRio on Jan 07, 2022

[new-exec] support pten kernel (#38770) · 7f3b0877
Committed by Leo Chen on Jan 07, 2022

Add fp16 support for scale and bias parameter for fused_layernnorm_residual_dropout op. (#38775) · 1b6e4664

Committed by Li Min on Jan 07, 2022

* Add fp16 support for scale/bias for fused_layernnorm_residual_dropout_bias op.

BaiXuePrincess / Paddle is in sync with its fork source

BaiXuePrincess / Paddle
与 Fork 源项目一致