Commits · 08941eda5d0c007051abdb6dd0a5f45cd94a433a · Crayon鑫 / Paddle

Dec 24, 2021 · 13 commits

[pten] combine reduce_cuda codes (#38328) · 08941eda

Committed by chentianyu03 on Dec 24, 2021

* combine reduce_cuda codes

* support float16 in pten redcue_mean

* replace ReduceCudaKernel impl with pten reduce impl

* mv reduce funcs into reduce_cuda_impl

* rm unsed codes and headers

* mv GetReduceDim into reduce_cuda_impl

* recover GetReduceDim in reduce_op.h

* add new dispatch macro

* fix pool op output not inited and cause transform to pten::denseTensor error

* fix output tensor not initialized error

* rename new dispatch macro and format code style

* rm reduce_functor_op.h file

08941eda

[Unify Tensors PR ] Replaced pten::Allocation with... · 42cf2bee

Committed by Zhanlue Yang on Dec 24, 2021

[Unify Tensors PR #1] Replaced pten::Allocation with shared_ptr<memory::Allocation> for Storage (#38301)

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

42cf2bee

[heterps]move pre-init id logic from common_sparse_table to sparse_geo_table (#38173) · 52329f6f
Committed by zmxdream on Dec 24, 2021
```
* remove pre-init id in common_sparse_tabl.cc
```
52329f6f
add new API/OP:paddle.Tensor.exponential_ (#38256) · 33185000
Committed by zhouweiwei2014 on Dec 24, 2021
```
* add new API/OP:paddle.Tensor.exponential_

* fix CI
```
33185000
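
For context on the commit above: `exponential_` fills a tensor in place with samples from an exponential distribution. The plain-Python sketch below illustrates that idea with inverse-transform sampling on a list. The helper name `exponential_fill`, its `lam` and `rng` parameters, and the use of a list instead of a tensor are all assumptions made for illustration; this is not Paddle's implementation or signature.

```python
import math
import random


def exponential_fill(values, lam=1.0, rng=None):
    """Overwrite `values` in place with Exponential(lam) samples.

    Hypothetical helper for illustration only, not a Paddle API.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    rng = rng or random.Random()
    for i in range(len(values)):
        u = rng.random()  # uniform in [0, 1), so 1 - u is in (0, 1]
        # Inverse of the exponential CDF F(x) = 1 - exp(-lam * x).
        values[i] = -math.log(1.0 - u) / lam
    return values


data = [0.0] * 5
exponential_fill(data, lam=2.0, rng=random.Random(0))
print(data)  # five non-negative samples; the list object itself was mutated
```

The trailing underscore in the API name follows the common convention for in-place operations, which is why the sketch mutates its argument rather than returning a new list.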
[MLU]add mlu op interface (#38241) · c396ee65
Committed by 努力努力在努力丶 on Dec 24, 2021
```
* [MLU]add mlu op interface

* [MLU]fix alpha of activation op
```
c396ee65
add pull gpups sparse op (#37124) · 572b3e90
Committed by yaoxuefeng on Dec 24, 2021
```
 add pull gpups sparse op
```
572b3e90
fix share buffer to (#38407) · 9409ff6b
Committed by Baibaifan on Dec 24, 2021

9409ff6b
[infrt] fix infrt script bug and function error. test=develop (#38384) · 4b3d5195
Committed by 王明冬 on Dec 24, 2021

4b3d5195
add register general kernel marco (#38409) · fc0a50aa
Committed by Chen Weihang on Dec 23, 2021

fc0a50aa
Add new API cholesky_solve (#38167) · 39f7c41f
Committed by zhiboniu on Dec 24, 2021

39f7c41f
add new API/OP: paddle.poisson (#38117) · bcf86e5c
Committed by zhouweiwei2014 on Dec 24, 2021
```
* add new API/OP:paddle.poisson

* fix comment
```
bcf86e5c

add conv+hard_sigmoid and conv+hard_swish fuse pass ut (#37553) · a858326a

Committed by baoachun on Dec 24, 2021

* add conv+hard_sigmoid fuse pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_hard_sigmoid_mkldnn_fuse_pass ut

* update conv+hard_sigmoid and conv+hard_swish fuse pass ut

* update ut

* update ut

a858326a

Support test imperative basic in eager (#38313) · d48f7c89

Committed by Jiabin Yang on Dec 24, 2021

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* support more eager tensor api

* fix merge compile error

* fix compile error and fit develop code

* support pure CPU

* fix some logic error in eager_mode

* support _varbase_creator in eager mode

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

* for eager mode

* refine

* support multiple constructor for eager tensor

* add place related code

* polish code

* specific randint with dtype of int64

* Support pure cpu test

* eager logic

* refine test in pure cpu

* eager logic

* eager logic

* eager logic, test=develop

* skip core.eager when in inference, test=develop

* refine, test=develop

* refine, test=develop

* call RetainGrad after run forward kernel, test=develop

* refine, test=develop

* support dygraph util, meta, guard test

* support inference test

* refine test and fix initializer failed
Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: Wang Huan <wanghuan29@baidu.com>

d48f7c89

Dec 23, 2021 · 16 commits

move conj kernel impl (#38365) · 8da9eff4
Committed by Chen Weihang on Dec 23, 2021

8da9eff4
Make GetBlob assuming elements are cached (#38336) · 7da5368d
Committed by Jacek Czaja on Dec 23, 2021
```
* First set of fixes

* - Make more likely to GetBlob find a blobs

* - Lint
```
7da5368d
block warning when build demo_ci and infer_ut (#38306) · 3629cd27
Committed by Sing_chan on Dec 23, 2021
```
* block warning when build demo_ci and infer_ut

* use build pipe line clone to test
```
3629cd27
add mem pool (#37127) · 745477fe
Committed by yaoxuefeng on Dec 23, 2021
```
add mem pool
```
745477fe

Add erfinv API (#38295) · 6b59b58c

Committed by wuhuanzhou on Dec 23, 2021

* add erfinv API, test=develop

* fix gradient accuracy error, test=develop

* fix cuda compilation error on Windows, test=develop

* fix M_2_SQRTPI undeclared identifier on Windows, test=develop

6b59b58c
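
For context on the commit above: `erfinv` is the inverse of the error function, and the constant 2/sqrt(pi), which C's `math.h` names `M_2_SQRTPI`, appears in the derivative of `erf`. The sketch below computes the inverse with Newton's method using only Python's standard library. It is an illustrative assumption about how such a function can be evaluated, not the kernel added in this commit; the function name and the `tol` and `max_iter` parameters are hypothetical.

```python
import math

# 2/sqrt(pi), the value C's math.h calls M_2_SQRTPI.
TWO_OVER_SQRT_PI = 2.0 / math.sqrt(math.pi)


def erfinv(y, tol=1e-12, max_iter=100):
    """Return x such that erf(x) is approximately y, for -1 < y < 1.

    Illustrative Newton iteration, not Paddle's kernel.
    """
    if not -1.0 < y < 1.0:
        raise ValueError("y must lie strictly between -1 and 1")
    x = 0.0
    for _ in range(max_iter):
        err = math.erf(x) - y
        if abs(err) < tol:
            break
        # d/dx erf(x) = (2 / sqrt(pi)) * exp(-x^2)
        x -= err / (TWO_OVER_SQRT_PI * math.exp(-x * x))
    return x


x = erfinv(0.5)
print(x, math.erf(x))  # erf(x) should reproduce 0.5 to within the tolerance
```

Starting from zero keeps the iteration on the side of the root where `erf` is concave (for positive `y`) or convex (for negative `y`), so the iterates approach the root monotonically.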

Upgrade work queue (#38335) · 198d11be

Committed by liutiexing on Dec 23, 2021

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* update EventsWater

* fix

* split workqueue files

* add more tests

* fix

* bugfix

* bugfix

* update
Co-authored-by: liutiexing <liutiexing@google.com>

198d11be

【PTen】Add empty and empty_like kernel in pten (#38334) · 4221cd33
Committed by zyfncg on Dec 23, 2021
```
* add empty and empty_like kernel in pten

* add empty dev_api
```
4221cd33
Support external stream. (#38373) · 15ad7ee4
Committed by Wilber on Dec 23, 2021
```
* support external stream.

* update

* update

* update
```
15ad7ee4
add-leaky-relu-to-xpu2-op-list (#38366) · b7bafee8
Committed by houj04 on Dec 23, 2021

b7bafee8

add mkldnn conv_elementwise_add_mkldnn_fuse_pass ut (#37612) · f88065d3

Committed by baoachun on Dec 23, 2021

* add mkldnn conv_elementwise_add_mkldnn_fuse_pass ut

* update mkldnn conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* restrict conv2d data_format in conv_elementwise_add_mkldnn_fuse_pass

* update conv_elementwise_add_mkldnn_fuse_pass OpCompat

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update ut

f88065d3

[infrt] unify the paddle dialect operation name. test=develop (#38354) · 4aed099d
Committed by 王明冬 on Dec 23, 2021

4aed099d
add new API: paddle.clone;Tensor.element_size;nn.utils.parameters_to_vector (#38020) · 0eb03ed7
Committed by zhouweiwei2014 on Dec 23, 2021
```
* add new API: paddle.clone;Tensor.element_size;nn.utils.parameters_to_vector

* fix comment
```
0eb03ed7

Add unittest for flatten2_matmul squeeze2_matmul reshape2_matmul pass (#37644) · aa059885

Committed by heliqi on Dec 23, 2021

* add flatten2_matmul squeeze2_matmul reshape2_matmul test case

* modify skip func to ignore_pass_case func

* rebuild CI

* add test_xx_matmul_fuse_pass timeout

* add test_map_xx_pass timeout

* add max_duration of test cast

* add trt skip

* add timeout

* del commented code

aa059885

move sign kernel impl (#38363) · bb38b6aa
Committed by Chen Weihang on Dec 22, 2021

bb38b6aa
[PTen] Move dot kernel impl (#38359) · 0a4ffbc7
Committed by Chen Weihang on Dec 22, 2021
```
* move dot kernel impl

* remove needless cmake items
```
0a4ffbc7
updates the pten allocation, test=develop (#38355) · 4d5a6064
Committed by 石晓伟 on Dec 23, 2021
```
* updates the pten allocation, test=develop

* avoids an error message, test=develop
```
4d5a6064

Dec 22, 2021 · 11 commits
- Update CE-Framework doccker (#38270) · ced6ab6d
  Committed by tianshuo78520a on Dec 22, 2021
  
  ced6ab6d
- use elementwise to optimize gelu backward implementation on GPU (#38263) · 858e4358
  Committed by crystal on Dec 22, 2021
```
* optimize gelu backward

* optimize gelu backward

* optimize code

* Number to expression

* Replacement number
```
  858e4358
- optimize buddy_allocator (#38312) · 8fe1cb72
  Committed by Yang on Dec 22, 2021
  
  8fe1cb72
- [PTen] Change functions to funcs (#38340) · 64e2f670
  Committed by Chen Weihang on Dec 22, 2021
```
* change functions to funcs

* remove useless code
```
  64e2f670
- [PTen] Add cmake function for kernels (#38311) · e6310dbd
  Committed by Chen Weihang on Dec 22, 2021
```
* add pten kernel cmake

* add pten kernel cmake function

* fix compile error

* add enforce include for full kernel

* fix compile failed

* change cuda to gpu

* fix cmake function error
```
  e6310dbd
- add mkldnn reshape_transpose_matmul fuse pass ut and op version check (#37468) · 274b135b
  Committed by baoachun on Dec 22, 2021
```
* add mkldnn reshape_transpose_matmul fuse pass ut and op version check

* update reshape_transpose_matmul_mkldnn_fuse_pass ut

* update ut
```
  274b135b
- update mkldnn batch_norm_activation fuse pass ut (#37402) · 3d7e737c
  Committed by baoachun on Dec 22, 2021
```
* update mkldnn batch_norm_activation fuse pass ut

* update ut

* update mkldnn batch_norm_act_fuse_pass ut

* update batch_norm_act_fuse_pass ut

* update ut
```
  3d7e737c
- [infrt] add tensorrt op teller pass. test=develop (#38304) · 44112817
  Committed by 王明冬 on Dec 22, 2021
  
  44112817
- [fleet_executor] Move IntraSend to Carrier. Using blocking queue (#38322) · ddc15a18
  Committed by LiYuRio on Dec 22, 2021
  
  ddc15a18
- add copy constructor for densetensor (#38319) · fabc058b
  Committed by Chen Weihang on Dec 21, 2021
  
  fabc058b
- [PTen]Move flatten kernel to new directory (#38255) · 4d1ce184
  Committed by YuanRisheng on Dec 22, 2021
```
* move flatten

* fix bugs of test

* modify header file

* add copy declare

* fix compile bugs
```
  4d1ce184
