Commits · dae4e7f285f54be8f82e355a7b01f22a3b23a921 · 机器未来 / Paddle

Nov 29, 2021 · 3 commits
- add expand_v2/expand_as_v2 for kunlun (#37592) · dae4e7f2
  Committed by TTerror on Nov 29, 2021
```
* add expand_v2/expand_as_v2 for kunlun

* update expand_as_v2

* update expand_as_v2

* support float16/bool

* update xpu.cmake
```
- Add third batch of deprecated mkldnn namespace name changes (#37558) · 1ba81500
  Committed by piotrekobiIntel on Nov 29, 2021
- Support fetch lodtensor array (#37580) · a0678eb1
  Committed by wanghuancoder on Nov 29, 2021
```
* support fetch lodtensor array, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop
```
Nov 27, 2021 · 1 commit

[NPU] reorganization for device API abstraction (#37110) · 72241a6a

Committed by Aganlengzi on Nov 27, 2021

* [NPU] reorganization for device API abstraction

* [NPU] delete old files

* [NPU] fix npu_collective_helper

* [NPU] fix collective_helper

* [NPU] fix ut

* [NPU] mod memory allocation and hccl_helper

* [NPU] fix place_type

* [NPU] split enforce.h

* move acl* call into npu_info

* merge conflict

* fix merge

* merge conflict

* merge conflict

Nov 26, 2021 · 2 commits
- upgrade async distributed training in pscore (#37515) · 74605fc2
  Committed by zhaocaibei123 on Nov 26, 2021
```
* test

* test

* rm test

* update

* update

* update

* add unittest

* update

* update save
```
- fix reshape async copy error (#37595) · 5607bcf2
  Committed by Chen Weihang on Nov 26, 2021
Nov 25, 2021 · 7 commits
- 【PTen】Add fill_constant kernel using ScalarArray in pten (#37481) · a0d465f8
  Committed by zyfncg on Nov 25, 2021
```
* add scalar and scalar_array

* remove DenseTensor include from Scalar and ScalarArray

* remove inner header from scalar_array

* refactor the method of fill_constant and add some comment

* add fill_constant kernel using ScalarArray

* modify some prompt

* remove fill_constant kernel with no shape
```
- [NPU] add int64 support for argsort op (#37434) · 3e088aaf
  Committed by furnace on Nov 25, 2021
```
* [NPU] add int64 support for argsort op

* [NPU] delete debug codes
```
- [NPU] add NPU kernel for prior_box op (#37519) · 1127fecb
  Committed by furnace on Nov 25, 2021
```
* [NPU] add NPU kernel for prior_box op

* [NPU] delete debug codes
```
- Disable the check of missing op benchmark script temporarily. (#37535) · 65056742
  Committed by Yiqun Liu on Nov 25, 2021
- Pass the stream created by Paddle to CINN. (#37337) · c249556d
  Committed by Zhen Wang on Nov 25, 2021
- Added GradTensorHolder to Eager Dygraph (#37458) · bc9f9f43
  Committed by Zhanlue Yang on Nov 25, 2021
```
* Added GradTensorHolder to Eager Dygraph

* Added accumulation codes to Eager Dygraph

* Fix windows-ci issue

* Fix NPU-CI issue

* Fixed CI-Coverage issue
```
- Fix test rnn memory helper op (#37474) · e4791d88
  Committed by xiongkun on Nov 25, 2021
```
* clear LoDTensorArray

* fix  bugs

* fix

* fix gpu
```
Nov 24, 2021 · 5 commits
- Changed second batch of deprecated mkldnn header and function names to new oneDNN names (#37351) · 7db7a0ec
  Committed by piotrekobiIntel on Nov 24, 2021
```
* Add second batch of deprecated mkldnn namespace and macro changes

* Unlock CI

* Fix temporary namespace alias placing
```
- Fix lod in fetch_v2 (#37514) · acbf9974
  Committed by Aurelius84 on Nov 24, 2021
- elementwise_mul refactor (#37471) · c5e857d4
  Committed by YuanRisheng on Nov 24, 2021
```
* elementwise_mul refactor

* perfect code in test

* delete redundant code

* fix bugs when run test_multiply

* adjust the location of macro

* fix bugs when run ci
```
- 【PTen】Add Scalar and ScalarArray in pten (#37409) · 0f24de83
  Committed by zyfncg on Nov 24, 2021
```
* add scalar and scalar_array

* remove DenseTensor include from Scalar and ScalarArray

* remove inner header from scalar_array

* refactor the method of fill_constant and add some comment
```
- fix:transform the data from cpu to gpu when trt is used (#37427) · 49366a63
  Committed by feng_shuai on Nov 24, 2021
Nov 23, 2021 · 6 commits
- [XPU] Reorganize xpu device codes in platform, test=develop (#37428) · 79800978
  Committed by Qi Li on Nov 23, 2021
```
* [XPU] Reorganize xpu device codes in platform, test=develop

* fix xpu_header.h, test=develop
```
- Add support bias is none for fused_attention op. (#37411) · 1a8786cf
  Committed by Li Min on Nov 23, 2021
```
Add support for bias is none for fused_attention op.
```
- Enhance the error message of scatter op (#37429) · 11b17c88
  Committed by sneaxiy on Nov 23, 2021
```
* enhance scatter err msg check

* fix ci error
```
- [PTen]Elementwise_div Kernel Refactor (#37418) · 32d9beef
  Committed by YuanRisheng on Nov 23, 2021
```
* elementwise_div refactor

* fix compile bugs in windows ci
```
- [NPU] Added HCCL backend support in dygraph mode (#36285) · 83e55cff
  Committed by ronnywang on Nov 23, 2021
```
* Added HCCL backend support in dynamic graph mode

* fix segmentation fault

* add ut
```
- [NewExe] Support layout/dtype transform by adding transfer_layout/transfer_dtype op (#37299) · 2a1f009e
  Committed by Aurelius84 on Nov 23, 2021
```
* Add transfer_layout/dtype op

* clean useless codes

* fix unused var

* add optest in white.txt

* split into data_transfer.cc

* fix cmake

* modify according reviewer comment

* replace cast_op with transfer_dtype_op
```
Nov 22, 2021 · 6 commits

disable copying of datatype when sharing buffer between two tensors. (#37247) · 9ec1432d

Committed by Feiyu Chan on Nov 22, 2021

* disable copying of datatype when sharing buffer between two tensors.
* fix for mkldnn operator kernels (elementwise_add, sum, softplus, softmax, scale, activation), manually set the data type when reusing memory by ShareBufferWith.

Add isclose op (#37135) · d2200e97

Committed by andyjpaddle on Nov 22, 2021

* add isclose op, test=develop

* add isclose op, test=develop

* add isclose api, test=develop

* rm useless code

* rm useless code

* update python api of isclose

* add some unittest of isclose op, test=develop
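
For context, a closeness test of this kind is commonly defined as `|x - y| <= atol + rtol * |y|`. The sketch below illustrates that rule in plain Python. The function name `isclose_sketch`, the default tolerances, and the NaN/infinity handling are assumptions that follow the usual NumPy-style convention; they are not taken from this commit and may differ from the actual kernel.

```python
import math


def isclose_sketch(x, y, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Element-wise closeness check over two equal-length sequences of floats."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    result = []
    for a, b in zip(x, y):
        if math.isnan(a) or math.isnan(b):
            # NaN only matches NaN, and only when equal_nan is requested.
            result.append(equal_nan and math.isnan(a) and math.isnan(b))
        elif math.isinf(a) or math.isinf(b):
            # Infinities are close only to an infinity of the same sign.
            result.append(a == b)
        else:
            # The tolerance scales with the magnitude of the second operand.
            result.append(abs(a - b) <= atol + rtol * abs(b))
    return result


print(isclose_sketch([10000.0, 1e-07], [10000.1, 1e-08]))
```

Because the relative term is scaled by `|y|` only, the check is not symmetric in its arguments.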

elu support alpha < 0 (#37316) · e3503de8
Committed by zhupengyang on Nov 22, 2021

Support zero value in dimension for slice (#37313) · e788c7b5
Committed by zyfncg on Nov 22, 2021
```
* support zero dim for slice op

* support zero dim Tensor in set_value op

* polish some debug log
```

[PTen] Add variable transform to/from ptenTensor and add cast kernel (#36916) · 5caa6fc5

Committed by chentianyu03 on Nov 22, 2021

* add cast kernel

* add cast cuda kernel

* add cast kernel

* make cast kernel output dtype undefined

* get cast dtype from vardesc

* move cast to manipulation and add test case

* add castinfershape

* avoid reinitialize variable

* InitializeVariable support datatype

* merge develop branch

* fix merge bug

* revert modify initializeVariable

* revert modify on InitializeVariable

* revert modify on InitializeVariable

* mutable support reset dtype

* enable make pten tensor from variable when def_arg.type is undefined

* fix build pten ctx start_idx error

* copy pten out tensor to variable

* merge develop branch

* fix non pten kernel cast failed

* add reset allocation place for remake tensor

* fix inplace realloc error

* add mutable on pten kernels and remove unused cast files

* rename function names

* fix output type error

* fix conflict with develop branch

* set data type to variable with pten's dtype

* fix test_cast_api type mismatch

* DenseTensor mutable_data support 0 bytes value

* fix the inplace bug of reshape kernel

* fix pten.backend != variable.place when moving storage, place mismatch bug

* fix conflict with develop branch

* Fix bug of paddle::experimental::MovesStorage

* fix ReMakePtenDenseTensor place mismatch bug

* Revert "fix ReMakePtenDenseTensor place mismatch bug"

This reverts commit 86336032f60b8a15eacd2c1ff2fa513f5d8dfd1a.

* fix ReMakePtenDenseTensor place mismatch bug

* reverts the set_lod interface, test=develop

* modify by the review options

* modify error message

* add & for const input arguments

* add reference in params

* elementwise_sub add mutable_data

* fix ResetHolderWithType check size bug

* add dependence pten_tensor to test_cast_api object

* remove unused code to pass ci coverage
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

[new feature] add local scope for interpretercore (#37379) · 1f0512be
Committed by Leo Chen on Nov 22, 2021

Nov 19, 2021 · 6 commits

bug fix shard_index (#37042) · b505ff96
Committed by lilong12 on Nov 19, 2021

Optimize cinn_cache_key by replace GraphToProgram to Dot string (#37317) · edc3496f
Committed by jiangcheng on Nov 19, 2021
```
* optimize cache-key by replace GraphToProgram to Dot string

* fix compile failure bug
```

Add fuse_resnet_unit pass (#36818) · 3cd3bf29

Committed by wuhuanzhou on Nov 19, 2021

* GeneratePass support attr condition and mapping, test=develop

* fix coverage, test=develop

* Add fuse_resnet_unit pass, test=develop

* fix CI errors, test=develop

* fix CI errors, test=develop

* fix unittest error when compiling without CUDA, test=develop

* fix static ci error, test=develop

* limit kernel size must equal 1, test=develop

fix for cufft: some early versions of cufft do not define CUFFT_VERSION in the header (#37312) · d8191d06
Committed by Feiyu Chan on Nov 19, 2021

Add paddle.incubate.graph_send_recv API (#37205) · 39012536

Committed by Siming Dai on Nov 19, 2021

* add cpu version, using set: sum, min, max

* add cpu version: mean

* improve cpu code and fix dynamic memory allocation problem

* fix arg error, add index judge, delete fp16

* fix bug in CudaAtomicMax and CudaAtomicMin

* add CUDA version

* fix grad_op bug for index

* add op test, add correct cpu grad op

* Add correct CUDA Mean grad

* [Add] Successful MEAN and SUM

* [Add] Successful MIN and MAX in CPU

* [Add] Successful MIN and MAX in CUDA

* fix windows dtype ci

* fix ROCM ci by adding HIP flag

* rename fused_gather_scatter to send_recv

* unify name as send and recv

* change zero index return time

* add send_recv incubate api

* fix index data type, add unittest case for API

* delete redundant input tensor

* fix en example and docs, add default value in pool_type

* add shape judge and max grid judge

* fix comment

* fix index type bug

* add const &

* fix en docs

* delete numpy in examples

* add unittest for int input

* fix send_recv comment

* change send_recv to graph_send_recv
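
The bullets above describe a gather-then-scatter operation with `sum`, `mean`, `min` and `max` pooling. As a rough illustration of those semantics only, the plain-Python sketch below gathers rows of `x` by `src_index` and reduces them into output rows by `dst_index`. The function name `graph_send_recv_sketch` is hypothetical, the zero fill for rows that receive no message is an assumption, and this is not the CPU or CUDA kernel added by the commit.

```python
def graph_send_recv_sketch(x, src_index, dst_index, pool_type="sum"):
    """Gather rows x[src] and reduce them into rows out[dst]."""
    if pool_type not in ("sum", "mean", "min", "max"):
        raise ValueError(f"unsupported pool_type: {pool_type}")
    if len(src_index) != len(dst_index):
        raise ValueError("src_index and dst_index must have the same length")

    num_rows, width = len(x), len(x[0])
    # Collect the messages that arrive at each destination row.
    inbox = [[] for _ in range(num_rows)]
    for src, dst in zip(src_index, dst_index):
        inbox[dst].append(x[src])

    out = []
    for messages in inbox:
        if not messages:
            # Assumption: rows that receive nothing are filled with zeros.
            out.append([0] * width)
            continue
        columns = list(zip(*messages))
        if pool_type == "sum":
            out.append([sum(col) for col in columns])
        elif pool_type == "mean":
            out.append([sum(col) / len(col) for col in columns])
        elif pool_type == "min":
            out.append([min(col) for col in columns])
        else:
            out.append([max(col) for col in columns])
    return out


x = [[0, 2, 3], [1, 4, 5], [2, 6, 7]]
print(graph_send_recv_sketch(x, [0, 1, 2, 0], [1, 2, 1, 0], pool_type="sum"))
```

In this example row 1 of the output receives messages from rows 0 and 2 of `x`, so its value depends on the chosen `pool_type`, while rows 0 and 2 each receive a single message.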

fix cmake dependence error (#37304) · 6653ac5e
Committed by LiYuRio on Nov 19, 2021

Nov 18, 2021 · 4 commits

fix bug to support dropout eval grad computing. (#37305) · c3d3001f
Committed by Li Min on Nov 18, 2021
```
* fix bug to support dropout eval grad computing.

* Remove useless code.
```

[PTen]elementwise_sub kernel refactor (#37260) · 36a95654

Committed by YuanRisheng on Nov 18, 2021

* elementwise_add kernel refactor

* fix compile bugs in elementwise_add refactor

* fix compile bugs when run in npu/xpu

* fix bugs when run unit test

* fix bugs when run ci-windows

* modify code as recommended

* code format adjust

* fix bugs when run ci

* fix compile bug when run in ci-windows

* elementwise_sub refactor

* add PD_DLL_DECL for elementwise_sub

* fix bugs when compiling

Add the `GetFetchNames` method in CinnGraphSymbolization. (#37218) · 3ad495e8

Committed by Zhen Wang on Nov 18, 2021

* Add the `GetFetchNames` method in CinnGraphSymbolization.

* Use unordered_set instead vector as the type of fetch_var_names.

* Reuse the definition of kCompilationKey.

* Use CompileOptions to set fetch_var_ids.

* Update the argument passing of GraphCompiler.Build.

* Fix some bugs in CinnGraphSymbolization::GetFetchIds.

Opt topk (#37256) · c4862d99

Committed by zhangkaihuo on Nov 18, 2021

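topk has two implementations: one based on cub and one handwritten kernel. The cub version obtains the top k by sorting; measurements on multiple sets of data show that it outperforms the handwritten kernel only when input_width >= 128 and k exceeds 75% of input_width.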
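
The selection rule in this message can be written as a small predicate. The sketch below is illustrative only: the function name `prefer_cub_sort` and its parameter names are hypothetical, and the strict comparison for "exceeds 75%" is an assumption, so the condition in the actual kernel dispatch may differ.

```python
def prefer_cub_sort(input_width, k, min_width=128, ratio=0.75):
    """Return True when the sort-based (cub) path is expected to be faster."""
    if not 0 < k <= input_width:
        raise ValueError("k must satisfy 0 < k <= input_width")
    # Sorting the whole row only pays off for wide rows where most
    # elements are requested anyway.
    return input_width >= min_width and k > ratio * input_width


for width, k in [(128, 97), (128, 96), (64, 64), (1024, 100)]:
    path = "cub sort" if prefer_cub_sort(width, k) else "handwritten kernel"
    print(f"input_width={width}, k={k}: {path}")
```

Keeping the thresholds as parameters makes it easy to re-tune them if measurements on other hardware give different break-even points.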
