Commits · b4e44b0ab289c5f0da72fb6f9083b926c0ccddd9 · 机器未来 / Paddle

10 Dec, 2021 · 1 commit
- change several variable names and usages related to cinn_launch (#38022) · a9bd6f0c
  Committed by CtfGo on Dec 10, 2021
09 Dec, 2021 · 6 commits
- cache scope and place on CinnLaunchContext and pass them to callback (#37983) · 151c5d74
  Committed by CtfGo on Dec 09, 2021
```
cinn_launch_op: cache scope and place on CinnLaunchContext to skip duplicate alloc/free callback construction
```
- [PTen] Refine Kernel Registrar Writing (#37977) · b199ba85
  Committed by Chen Weihang on Dec 09, 2021
```
* refine the kernel register impl

* fix cmake and symbol error

* remove overload macro

* polish details
```
- add ipu device p2 (#37840) · cb636a48
  Committed by jianghaicheng on Dec 09, 2021
- optimize flip op, removing duplicated computation when dim size is one (#37825) · 890638cf
  Committed by Roc on Dec 09, 2021
- format softmax forward (#37927) · 18aca3f5
  Committed by Feng Xing on Dec 09, 2021
- adjust main dir (#37916) · 1911b6f0
  Committed by Chen Weihang on Dec 08, 2021
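
Note on #37825 (flip op) above: flipping along an axis of size one cannot change the data, so such axes can be dropped before any index arithmetic is done. The following is a minimal pure-Python sketch of that idea, not the Paddle kernel. The `flip` helper and its flat row-major layout are assumptions made for illustration only.

```python
from itertools import product


def flip(flat, shape, axes):
    """Flip a row-major flat list `flat` of the given `shape` along `axes`."""
    # Axes of size 1 cannot change the result, so drop them up front.
    active = [a for a in axes if shape[a] > 1]
    if not active:
        return list(flat)  # nothing to reverse: a plain copy is enough

    # Row-major strides for the given shape.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]

    out = [None] * len(flat)
    for idx in product(*(range(n) for n in shape)):
        # Mirror the index only on the axes that are actually flipped.
        src = sum(
            (shape[a] - 1 - i if a in active else i) * s
            for a, (i, s) in enumerate(zip(idx, strides))
        )
        dst = sum(i * s for i, s in zip(idx, strides))
        out[dst] = flat[src]
    return out
```

With shape `(1, 3)`, a flip along axis 0 takes the early-return path and only copies the input, while a flip along axes `[0, 1]` does the index work for axis 1 alone.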
08 Dec, 2021 · 6 commits

add a subdirectory named cinn in operators and move related files into it (#37938) · 9cb637ed

Committed by CtfGo on Dec 08, 2021

1. add a subdirectory named `cinn` in the `paddle/fluid/operators` directory and move related files into it
2. separate the CinnLaunchContext class from `cinn_launch_op.h` and put it in a new independent file named `cinn_launch_context.h`, so that other files can include it cleanly.

[PTen] Add alias kernel name (#37881) · ff6507db
Committed by YuanRisheng on Dec 08, 2021
```
* add alias kernel name

* modify code as suggested
```

Add paddle.lerp API to do a linear interpolation (#37253) · 1716324c

Committed by wuhuanzhou on Dec 08, 2021

* save temp

* add unittest, test=develop

* fix ci error, test=develop

* fix grad accuracy error, test=develop

* fix unused error, test=develop

* fix compilation error on Windows, test=develop

* add unittest, test=develop

* modify by review comment and add lerp_

* fix inplace api, test=develop

* fix inplace api, test=develop

* fix coverage error, test=develop

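
Note on #37253: linear interpolation is conventionally defined as `x + weight * (y - x)`. The sketch below shows that formula on plain Python lists. It is not the Paddle implementation, and the standalone `lerp` helper here is a hypothetical stand-in that only mirrors the argument order `(x, y, weight)`.

```python
def lerp(x, y, weight):
    """Element-wise linear interpolation between lists x and y."""
    # A scalar weight is applied to every element (assumption of this sketch).
    if isinstance(weight, (int, float)):
        weight = [weight] * len(x)
    # weight = 0 returns x, weight = 1 returns y.
    return [a + w * (b - a) for a, b, w in zip(x, y, weight)]
```

For example, `lerp([1.0, 2.0], [3.0, 6.0], 0.5)` gives the midpoints of each pair.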

implementation of broadcast sub backward by reduce (#37754) · 567e6bbc
Committed by crystal on Dec 08, 2021
```
* add broadcast_sub

* add broadcast_sub
```
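
Note on #37754: for `out = x - y` where `y` was broadcast over the rows of `x`, the gradient of `x` is the upstream gradient itself, and the gradient of `y` is the negated upstream gradient summed (reduced) over the broadcast axis. The sketch below works through that for a 2-D `x` and a 1-D `y`. The `sub_backward` helper is hypothetical and does not reflect the actual kernel code.

```python
def sub_backward(dout):
    """Backward of out = x - y, with x of shape (m, n) and y of shape (n,).

    `dout` is the upstream gradient as a list of m rows of length n.
    """
    # d(out)/d(x) is the identity, so dx is a copy of dout.
    dx = [row[:] for row in dout]
    # y was broadcast along the row axis, so reduce over rows and negate.
    n = len(dout[0])
    dy = [-sum(row[j] for row in dout) for j in range(n)]
    return dx, dy
```

With `dout = [[1, 2], [3, 4]]`, `dx` equals `dout` and `dy` is the negated column sums.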

fix softmax max dim (#37901) · b5dd12fb
Committed by Yanxing Shi on Dec 08, 2021

Fix CUDA Graph H2D bug by restoring host memory (#37774) · a1ad3a63
Committed by sneaxiy on Dec 08, 2021
```
* fix CUDA Graph H2D bug again

* fix no return bug
```

07 Dec, 2021 · 2 commits
- fix filter_by_instag op for lod_level=0 without lod;test=develop (#37834) · b48545ee
  Committed by danleifeng on Dec 07, 2021
- Quantize slice op (#37630) · 2bd0f3c7
  Committed by Zuza on Dec 07, 2021
```
* quantize slice op

* correct test

* fix code formatting
```
06 Dec, 2021 · 2 commits
- Update CINN tag (#37870) · 3e33ef5a
  Committed by Huihuang Zheng on Dec 06, 2021
```
1. Modify git tag for CINN
2. Support compile option "-DWITH_CINN=ON, -DWITH_TESTING=OFF"
```
- [PTen] Fix reshape move storage using error (#37765) · ead81230
  Committed by Chen Weihang on Dec 05, 2021
```
* fix reshape move storage error

* remove needless set type

* alloc tensor by shared storage
```
03 Dec, 2021 · 2 commits
- refine structure for cuda and rocm (#37202) · a6d2fddb
  Committed by ronnywang on Dec 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
- [new-exec] use stream safe allocator in memcpy_h2d (#37777) · 9ccb6228
  Committed by Leo Chen on Dec 03, 2021
```
* use sync h2d copy

* use stream safe allocator in memcpy_h2d

* remove wait

* add guard
```
02 Dec, 2021 · 1 commit
- [NPU] add int64 support for scatter op (#37440) · 85e5ab2e
  Committed by furnace on Dec 02, 2021
```
* [NPU] add int64 support for scatter op

* [NPU] delete debug code

* [NPU] optimize code
```
01 Dec, 2021 · 4 commits
- Fix inplace addto pass by setting dtype correctly (#37717) · b0d580a2
  Committed by sneaxiy on Dec 01, 2021
```
* fix inplace addto pass

* update

* fix ut

* improve ci coverage

* fix musl ci compile error
```
- add prior_box for kunlun (#37697) · e0fc8937
  Committed by TTerror on Dec 01, 2021
```
* add prior_box for kunlun

* update

* update CMakeLists
```
- add angle_op (#37689) · 28b43111
  Committed by Feiyu Chan on Dec 01, 2021
```
* add angle_op
```
- Modify ShareTensorWithCinnBuffer by callback to save memory (#37493) · 661dbdbe
  Committed by Huihuang Zheng on Dec 01, 2021
```
Modify ShareTensorWithCinnBuffer by callback to save memory
```
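
Note on #37689 (angle_op) above: assuming the op follows the usual definition of the argument of a complex number, the angle of `a + bj` is `atan2(b, a)`, in radians. The sketch below illustrates that definition with the standard library only. The `angle` helper is hypothetical and is not the operator's code.

```python
import cmath


def angle(values):
    """Return the phase angle, in radians, of each real or complex value."""
    # cmath.phase(z) is atan2(z.imag, z.real); real inputs are promoted to complex.
    return [cmath.phase(complex(v)) for v in values]
```

Under this definition non-negative real inputs map to `0.0` and negative real inputs map to `pi`.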
30 Nov, 2021 · 6 commits
- refactoring matmul_v2 mkldnn hierarchy (#37622) · fab92824
  Committed by Sylwester Fraczek on Nov 30, 2021
```
* refactoring matmul hierarchy

* review fix

* review fix

* review_FIX-part2
```
- Add new unittests for gIOHW format in conv_transpose_mkldnn_op (#37344) · d93ee063
  Committed by Sławomir Siwek on Nov 30, 2021
```
* Add new unittests

* Replace I with O channel for filter groups

* Undo changes affecting other operators

* Fix oneDNN namespace typo

* Fix code format error
```
- [opt] Add regularization and Nesterov for merged_momentum op (#37527) · c8ffdecb
  Committed by zhangbo9674 on Nov 30, 2021
```
* add regularization and Nesterov for merged_momentum

* refine unittest for use_nesterov attr

* refine op check

* refine code

* fix bug

* refine code of regularization_flag

* delete useless code
```
- add scale api and test (#37683) · 0c8b9994
  Committed by Chen Weihang on Nov 30, 2021
- support data_format='NHWC' for prelu channel mode (#37019) · 3f2a665a
  Committed by Guoxia Wang on Nov 30, 2021
```
* support data_format='NHWC' for prelu channel mode
```
- fix overflow in some cuda ops (#37670) · 0c82e3a0
  Committed by Yang on Nov 30, 2021
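
Note on #37527 (merged_momentum) above: the sketch below assumes the momentum rule as it is commonly documented, with optional L2 regularization folded into the gradient and an optional Nesterov step, applied to a list of parameters in one call. It uses scalar parameters for brevity. The helper names `momentum_step` and `merged_momentum_step` and the `l2_coeff` argument are hypothetical, and the real op may differ in detail.

```python
def momentum_step(param, grad, velocity, lr, mu, use_nesterov=False, l2_coeff=0.0):
    """One momentum update for a single scalar parameter."""
    # L2 regularization adds coeff * param to the gradient (assumption).
    grad = grad + l2_coeff * param
    velocity = mu * velocity + grad
    if use_nesterov:
        # Nesterov variant: look ahead along the updated velocity.
        param = param - lr * (grad + mu * velocity)
    else:
        param = param - lr * velocity
    return param, velocity


def merged_momentum_step(params, grads, velocities, lr, mu, **kwargs):
    """Apply the same update to several parameters in one call."""
    results = [
        momentum_step(p, g, v, lr, mu, **kwargs)
        for p, g, v in zip(params, grads, velocities)
    ]
    return [p for p, _ in results], [v for _, v in results]
```

Passing `use_nesterov=True` or a non-zero `l2_coeff` through `merged_momentum_step` applies the same option to every parameter in the batch.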
29 Nov, 2021 · 4 commits

[PTen] Add reduce mean kernel, replace with mean API (#37559) · f9e9fd19

Committed by chentianyu03 on Nov 29, 2021

* add pten reduce kernel

* add reduce_sum kernel

* update attribute args and order

* make out dtype undefined

* fix empty input error

* merge develop branch

* rename sum as reduce function

* rename sum as reduce function

* fix reducekernelImpl args error

* add reduce cuda kernel

* modify dims type to const &

* remove unused log

* fix reduce_all out eigen function error

* remove unused codes

* add the missing sum api define and testcase

* merge develop branch

* fix sum test axis value error

* replace pten mean kernel with reduce_mean

* recover mean cuda to original implementation

add expand_v2/expand_as_v2 for kunlun (#37592) · dae4e7f2

Committed by TTerror on Nov 29, 2021

* add expand_v2/expand_as_v2 for kunlun

* update expand_as_v2

* update expand_as_v2

* support float16/bool

* update xpu.cmake


Add third batch of deprecated mkldnn namespace name changes (#37558) · 1ba81500
Committed by piotrekobiIntel on Nov 29, 2021

Support fetch lodtensor array (#37580) · a0678eb1

Committed by wanghuancoder on Nov 29, 2021

* support fetch lodtensor array, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop


27 Nov, 2021 · 1 commit

[NPU] reorganization for device API abstraction (#37110) · 72241a6a

Committed by Aganlengzi on Nov 27, 2021

* [NPU] reorganization for device API abstraction

* [NPU] delete old files

* [NPU] fix npu_collective_helper

* [NPU] fix collective_helper

* [NPU] fix ut

* [NPU] mod memory allocation and hccl_helper

* [NPU] fix place_type

* [NPU] split enforce.h

* move acl* call into npu_info

* merge conflict

* fix merge

* merge conflict

* merge conflict


26 Nov, 2021 · 2 commits
- upgrade async distributed training in pscore (#37515) · 74605fc2
  Committed by zhaocaibei123 on Nov 26, 2021
```
* test

* test

* rm test

* update

* update

* update

* add unittest

* update

* update save
```
- fix reshape async copy error (#37595) · 5607bcf2
  Committed by Chen Weihang on Nov 26, 2021
25 Nov, 2021 · 3 commits

[PTen] Add fill_constant kernel using ScalarArray in pten (#37481) · a0d465f8

Committed by zyfncg on Nov 25, 2021

* add scalar and scalar_array

* remove DenseTensor include from Scalar and ScalarArray

* remove inner header from scalar_array

* refactor the method of fill_constant and add some comment

* add fill_constant kernel using ScalarArray

* modify some prompt

* remove fill_constant kernel with no shape


[NPU] add int64 support for argsort op (#37434) · 3e088aaf
Committed by furnace on Nov 25, 2021
```
* [NPU] add int64 support for argsort op

* [NPU] delete debug code
```

[NPU] add NPU kernel for prior_box op (#37519) · 1127fecb
Committed by furnace on Nov 25, 2021
```
* [NPU] add NPU kernel for prior_box op

* [NPU] delete debug code
```

机器未来 / Paddle: in sync with the fork's source project
