Commits · aec6e8a961ea22af9e144b46134bbaa9fb56e0fd · PaddlePaddle / Paddle

06 Jan, 2022 · 2 commits
- [Paddle-ASP]Asp sharding (#37725) · aec6e8a9
  Authored by minghaoBD on Jan 06, 2022
- Added exp FP32 FWD/BWD oneDNN kernel and optimized other oneDNN grad kernels (#38624) · 718183f1
  Authored by jakpiase on Jan 06, 2022
```
* added exp activation and use_dst_for_bwd kernels

* CI RERUN

* minor change
```
05 Jan, 2022 · 6 commits

Make post training quant API support dataloader (#38686) · 0af1a87b
Authored by Jiaqi Liu on Jan 05, 2022
```
* make post training quant API support dataloader
```

[Eager] Support test imperative basic in eager test_empty_grad (#38376) · 9108e777

Authored by wanghuancoder on Jan 05, 2022

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* support more eager tensor api

* fix merge compile error

* fix compile error and fit develop code

* support pure CPU

* fix some logic error in eager_mode

* support _varbase_creator in eager mode

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

* for eager mode

* refine

* support multiple constructor for eager tensor

* add place related code

* polish code

* specific randint with dtype of int64

* Support pure cpu test

* eager logic

* refine test in pure cpu

* eager logic

* eager logic

* eager logic, test=develop

* skip core.eager when in inference, test=develop

* refine, test=develop

* refine, test=develop

* call RetainGrad after run forward kernel, test=develop

* refine, test=develop

* support dygraph util, meta, guard test

* eager test case

* support inference test

* refine test and fix initializer failed

* modify eagertensor patch method

* add eagertensor.clear_grandint, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* call monkey_patch_varbase in _test_eager_guard, test=develop

* split clear_gradient to clear_gradient and zero_grads, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop
Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: JiabinYang <360788950@qq.com>

Fix for matmul_v2 oneDNN op broadcasting when inputs dims have different lengths (#38665) · 67923124
Authored by jakpiase on Jan 05, 2022
```
* fix for matmul_v2 broadcasting

* fix for output shape not broadcasted
```

Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d

Authored by joanna.wozna.intel on Jan 05, 2022

* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list

add huber_loss for kunlun (#38589) · a268c7ce

Authored by TTerror on Jan 05, 2022

* add huber_loss for kunlun

* update xpu.cmake

* update unitests

* update unitests

* update elementwise_add

* update elementwise_add

* update elementwise_add

Support EagerTensor initialization with kwargs (#38488) · 4ba6d4e4

Authored by Weilong Wu on Jan 05, 2022

* Support EagerTensor init with kwargs

* Updated comments

* Updated unit tests case

* Refactor InitTensor related code to reduce duplicate code

* Updated the error reporting msg

* Updated VLOG msg

* Merge develop and Update EagerTensor init func

* Polish switch case, reduce some code

* Add SyntaxError unit test case

* Refactor the related initialization func of EagerTensor

* Remove ParseStopGradient and ParseZeroCopy and ParsePersistable, construct ParseBooleanArgs instead.

* Updated error msg to pass CI

* Updated PADDLE_ENFORCE error type

04 Jan, 2022 · 7 commits
- [new-exec] avoid adding_feed_fetch in each run (#38672) · 1345a456
  Authored by Leo Chen on Jan 04, 2022
- [NPU] add pad and pad_grad (#38658) · 6e9714a2
  Authored by furnace on Jan 04, 2022
```
[NPU] add pad and pad_grad
```
- [fleet_executor] Support multi carriers (#38650) · 2273471d
  Authored by LiYuRio on Jan 04, 2022
- added sqrt bf16 fwd/bwd (#38599) · 2d2609ea
  Authored by jakpiase on Jan 04, 2022
- [Dy2st]Fix error when set buffer in forward (#38540) · 1e3f01ed
  Authored by 0x45f on Jan 04, 2022
```
* fix error when set buffer in forward

* add unittest

* refine class name

* refine not framework.in_dygraph_mode() in if

* fix UT error

* add comment

* refine code

* remove useless import
```
- Support test_imperative container_sequential and signal_handler with eager_guard (#38614) · a7b13d38
  Authored by Weilong Wu on Jan 04, 2022
- [Eager] Fix benchmark Performance (#38610) · 08b7f17d
  Authored by wanghuancoder on Jan 04, 2022
31 Dec, 2021 · 13 commits

[XPU]add split op for kunlun2,*test=kunlun (#38277) · 26b845e2

Authored by Zhangjingyu06 on Dec 31, 2021

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun,*test=kunlun
Co-authored-by: QingshuChen <chenqingshu@baidu.com>

[new API] add paddle.kthvalue and paddle.Tensor.kthvalue (#38386) · 538b5721

Authored by JYChen on Dec 31, 2021

* add new api/op kthvalue

* kthvalue cuda kernel to cub sorting

* fix example code error

* throw errors instead of LOG in cuda sort

* throw errors by Paddle_ENFORCE

add mul_gru_fuse_pass ut (#37772) · bc827307

Authored by baoachun on Dec 31, 2021

* add mul_gru_fuse_pass ut

* update ut

* update ut

* update ut timeout setting

* update ut

Probability distribution API of Beta and KL-Divergence (#38558) · 4794a44f
Authored by Xiaoxu Chen on Dec 31, 2021
```
* add beta distribution
* add kl_divergence and register_kl api
```

[MLU]support calling mlu op from python interface (#38292) · b6bf650a

Authored by fwenguang on Dec 31, 2021

* [MLU]support calling mlu op from python interface

* [MLU]fix

* fix

* [mlu]fix mlu_places

* [mlu]fix required mlu

* fix

* [MLU]fix tensor copy

* [mlu] fix MLUPlace call path

[new api] add new api paddle.quantile and paddle.Tensor.quantile (#38567) · 20dc1ac2
Authored by JYChen on Dec 31, 2021
```
* add new api paddle.quantile and paddle.Tensor.quantile

* add take_todo and fix UT
```

add new API paddle.linalg.lu/lu_unpack (#38617) · 2ce91c33
Authored by zhiboniu on Dec 31, 2021

[Auto Parallel] Add general gradient merge pass to support auto parallel (#38259) · 89ce6db8

Authored by xiayanming on Dec 31, 2021

* [Auto Parallel] add gradient merge pass

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix pr review

* fix pr review

* fix pr review

* fix pr review

* fix pr review

* fix pr review

Add fold opereators (#38613) · 8898dce1

Authored by xiaoting on Dec 31, 2021

* add fold opereators, test=develop

* add fold opereators, test=develop

* add fold opereators, test=develop

* update fold op error test, test=develop

* fix unitext, test=develop

* fix unitext, test=develop

fix timeout (#38612) · 02c17c0b
Authored by Double_V on Dec 31, 2021

Put_along_axis (based on PR #37921 by Xu Huang) (#38608) · f147fc99

Authored by Huihuang Zheng on Dec 31, 2021

Paddle new APIs: put_along_axis.

Xu Huang is on holiday so we created this PR to work on it. It is based on his PR: https://github.com/PaddlePaddle/Paddle/pull/37921

add lu_op backward (#38616) · a1275c8b
Authored by zhiboniu on Dec 31, 2021

[PTen] Unify data layout of pten and fluid (#38583) · 8d32cef8
Authored by Chen Weihang on Dec 31, 2021
```
* unify data layout

* fix test_transfer_layout error
```

30 Dec, 2021 · 12 commits

add OP lu forward (#38559) · 4e21457d
Authored by zhiboniu on Dec 30, 2021
```
LGTM
```

add sigmoid_cross_entropy_with_logits to kl1 (#38586) · 790cadd1

Authored by houj04 on Dec 30, 2021

* add sigmoid cross entropy with logits to kl1. test=kunlun

* add sigmoid cross entropy with logits to kl1. test=kunlun

Add exp, abs_grad, reciprocal, reciprocal_grad operator for XPU and update... · ceec1e21
Authored by zhangyk0314 on Dec 30, 2021
```
Add exp, abs_grad, reciprocal, reciprocal_grad operator for XPU and update xpu2_op_list.h,test=kunlun (#38570)
```
[New API] add new api paddle.mode and paddle.Tensor.mode (#38446) · 3777779b
Authored by JYChen on Dec 30, 2021
```
* add new OP mode

* rename trans-variable name and fix UT
```
[Auto parallel] Make sure the id semantics of every var and op unique (#38132) · 5620214e
Authored by Yulong Ao on Dec 30, 2021
```
* [Auto parallel] Make the id of var and op unique

* [Auto Parallel] Rename back dist_context to distop_context
```

Add cpu kernel of new api : lstsq (#38585) · ccf99b66

Authored by Haohongxiang on Dec 30, 2021

* add cpu kernel of lstsq

* update

* modify code style

* modify unittest

* remove support for complex

Support test imperative basic with fixed retain grad interface (#38548) · 2421a25a

Authored by Jiabin Yang on Dec 30, 2021

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* support more eager tensor api

* fix merge compile error

* fix compile error and fit develop code

* support pure CPU

* fix some logic error in eager_mode

* support _varbase_creator in eager mode

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

* for eager mode

* refine

* support multiple constructor for eager tensor

* add place related code

* polish code

* specific randint with dtype of int64

* Support pure cpu test

* eager logic

* refine test in pure cpu

* eager logic

* eager logic

* eager logic, test=develop

* skip core.eager when in inference, test=develop

* refine, test=develop

* refine, test=develop

* call RetainGrad after run forward kernel, test=develop

* refine, test=develop

* support dygraph util, meta, guard test

* support inference test

* refine test and fix initializer failed

* support create varbase and fix retain grad error

* fix windows error

* support test_imperative_basic test in eager mode

* remove additional log in variable.h

* remove additional log in variable.h

* remove additional code create in merge
Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: Wang Huan <wanghuan29@baidu.com>

Added Conv2D BF16 BWD oneDNN kernel (#38507) · ed8ba011

Authored by jakpiase on Dec 30, 2021

* working test for padding only

* added full conv2d grad kernel

* removed some trash

* minor change

* Ci fix

* format fix

[PSCore]Fix test fleet base 2 (#38588) · 04496d89
Authored by zmxdream on Dec 30, 2021

[PTen] Remove offset in storage (#38472) · a504ff3f

Authored by Chen Weihang on Dec 29, 2021

* remove offset in storage

* revert api change

* fix custom op slice bug

* fix mutable_data error

add ExponentialFamily and Dirichlet probability distribution (#38445) · 00cddf07

Authored by Xiaoxu Chen on Dec 30, 2021

* extend Distribution baseclass for supporting multivariant distribution and prob method

* add ExponentialFamily base class and entropy using Bregman divergence

* add dirichlet probability distribution

add dirichlet random sample op in cpu and gpu kernel (#38244) · c5bf09bb

Authored by Xiaoxu Chen on Dec 30, 2021

* add dirichlet sample op and cpu backend kernel

* add Dirichlet op cuda kernel  (#6)

* add dirichlet op hip kernel
Co-authored-by: Feiyu Chan <chenfeiyu@baidu.com>

PaddlePaddle / Paddle · synced successfully about 1 year ago