Commits · dd3afc9d5fbaae6bdc1e62669e5eb934f01a9cc4 · 机器未来 / Paddle

16 Dec, 2021 · 5 commits

Add fmax and fmin operators (#37826) · dd3afc9d
Committed by LJQ❤️ on Dec 16, 2021
```
Add elementwise_fmax and elementwise_fmin operators
```
dd3afc9d
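
For context, here is a minimal sketch of the semantics usually attached to `fmax` and `fmin`. It assumes the new operators follow the C library and NumPy convention, which the commit message does not spell out: when exactly one operand is NaN, the other operand is returned. The helper names below are hypothetical illustrations, not Paddle functions.

```python
import math


def fmax(a: float, b: float) -> float:
    """Maximum that ignores a single NaN operand (assumed C fmax convention)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def fmin(a: float, b: float) -> float:
    """Minimum that ignores a single NaN operand (assumed C fmin convention)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def elementwise(op, xs, ys):
    """Apply a binary op pairwise to two equally long lists."""
    return [op(x, y) for x, y in zip(xs, ys)]


nan = float("nan")
print(elementwise(fmax, [1.0, nan, 3.0], [2.0, 5.0, nan]))  # [2.0, 5.0, 3.0]
print(elementwise(fmin, [1.0, nan, 3.0], [2.0, 5.0, nan]))  # [1.0, 5.0, 3.0]
```

If both operands are NaN, this sketch returns NaN, which matches the same convention.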

Add sparse_attention mask ,test=develop (#37973) · fa463b90

Committed by Liu-xiandong on Dec 16, 2021

Add key_padding_mask and attn_mask to the sparse_attention API

1. The key padding mask is a tensor with dimensions [batch_size, seq_len], and the attention mask is a tensor with dimensions [seq_len, seq_len]. Both masks use the same data type as Q, K, and V, which is float32 or float64. A value of 0 in a mask means that the corresponding position is masked.

2. The changed files are mainly paddle/fluid/operators/sparse_attention_op.cu and python/paddle/fluid/tests/unittests/test_sparse_attention_op.py. sparse_attention has three parts: sddmm, softmax, and dsd. Adding the mask only requires modifying the softmax part; the other two parts are unaffected. Related tests have been added to cover the mask functionality.

fa463b90
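
The mask semantics described in this commit can be sketched as a masked softmax. This is a simplified illustration, not the CUDA kernel: it works on a dense score matrix for a single batch element instead of the sparse layout, the function name `masked_softmax` is hypothetical, and returning all zeros for a fully masked row is an assumption.

```python
import math


def masked_softmax(scores, key_padding_mask, attn_mask):
    """Row-wise softmax over a seq_len x seq_len score matrix.

    key_padding_mask: seq_len values; 0 masks that key for every query.
    attn_mask: seq_len x seq_len values; 0 masks that (query, key) pair.
    """
    out = []
    for i, row in enumerate(scores):
        # Keep only the key positions that neither mask sets to 0.
        kept = [j for j in range(len(row))
                if key_padding_mask[j] != 0 and attn_mask[i][j] != 0]
        probs = [0.0] * len(row)
        if kept:
            row_max = max(row[j] for j in kept)  # subtract max for stability
            exps = {j: math.exp(row[j] - row_max) for j in kept}
            total = sum(exps.values())
            for j in kept:
                probs[j] = exps[j] / total
        out.append(probs)  # a fully masked row stays all zeros (assumption)
    return out


scores = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
key_padding_mask = [1.0, 1.0, 0.0]  # the last key is padding
attn_mask = [[1.0, 0.0, 0.0],
             [1.0, 1.0, 0.0],
             [1.0, 1.0, 1.0]]       # causal-style mask
result = masked_softmax(scores, key_padding_mask, attn_mask)
print(result)  # [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.5, 0.5, 0.0]]
```

Only the softmax step consults the masks here, which mirrors the commit's note that sddmm and dsd are left unchanged.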

Add the transformop parameter in TensorReduceFunctorImpl (#38135) · 524389ee
Committed by niuliling123 on Dec 16, 2021
```
* Add the transformop parameter in TensorReduceFunctorImpl
```
524389ee

[Pten]Modify registered kernel name (#38109) · be874c08

Committed by YuanRisheng on Dec 16, 2021

* Reduce reshape kernel functions in pten

* delete notes

* fix bugs when compile

* modify register name

* fix compile bugs

be874c08

Add float16 type for scatter op. (#38136) · 9bac4a76

Committed by Li Min on Dec 16, 2021

* Add float16 type for scatter op.

* Add fp16 test for scatter op.

* Add int and int64 support for scatter_grad on gpu.

* Add int and int64 for check_variable_and_dtype routine.

* Minors.

* Code format.

9bac4a76

15 Dec, 2021 · 3 commits
- Change a comment to avoid the disturb to op benchmark ci. (#38148) · 4d8242df
  Committed by Yiqun Liu on Dec 15, 2021
```
test=document_fix
```
  4d8242df
- Add cinn_launch_op_test into Paddle-CINN ci (#38076) · e5a838f8
  Committed by Huihuang Zheng on Dec 15, 2021
```
As the title.
```
  e5a838f8
- replace with pten kernel in cast cuda compute and remove unused codes (#38074) · 75332401
  Committed by chentianyu03 on Dec 15, 2021
```
* replace with pten kernel in cast cuda compute and remove unused codes

* rm unused header file

* replace CastCUDAOpKernel with CastOpKernel
```
  75332401
14 Dec, 2021 · 6 commits
- add map_matmul and fc_act_fuse passes to quant2_int8_mkldnn_pass (#38023) · 8f800dc0
  Committed by Sylwester Fraczek on Dec 14, 2021
```
* add map_matmul passes to quant2_int8_mkldnn_pass

* fix fc+act fuse (activation scale)

* ci fix, c++17 structured bindings not available

* fix ci static check
```
  8f800dc0
- add conv_gelu_mkldnn_fuse_pass (#38107) · 206a33b3
  Committed by baoachun on Dec 14, 2021
```
* add conv_gelu_mkldnn_fuse_pass

* add post ops
```
  206a33b3
- modify the fix_seed attribute in dropout op is a def attribute.test=develop (#38100) · f44add7b
  Committed by weishengying on Dec 14, 2021
  
  f44add7b
- [PTen] Reduce reshape kernel functions in pten (#38055) · a3c8abc7
  Committed by YuanRisheng on Dec 14, 2021
```
* Reduce reshape kernel functions in pten

* delete notes

* fix bugs when compile
```
  a3c8abc7
- fix generate_proposals op doc (#38048) · c117dfba
  Committed by wangguanzhong on Dec 14, 2021
  
  c117dfba
- add reshape+transpose+matmul_v2 only (#37847) · a922168a
  Committed by Sylwester Fraczek on Dec 14, 2021
```
* reshape+transpose+matmul_v2

* in_name->input_name

* fix pr-ci-static-check
```
  a922168a
13 Dec, 2021 · 4 commits

update xpu_memcpy (#38049) · bdf5834e
Committed by taixiurong on Dec 13, 2021

bdf5834e

[pnorm] Optimize p_norm op for special cases (#37685) · 10d9ab4b
Committed by Noel on Dec 13, 2021

10d9ab4b

add logit API (#37844) · b197bfe6

Committed by wangzhen38 on Dec 13, 2021

* add Logit API

* add unittest

* conflict

* pull conflict

* pull conflict logit

* fix unittest

* fix code style

* update docs style of

* update en doc

* fix docs en style

* fix docs en style1

* fix docs en style2

* fix docs en style3

* fix docs en style4

* fix docs en style5

* fix docs en style6

* fix docs en style7

* fix docs en style8

* update by review

* fix nan bug

b197bfe6
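
As a reference for what a logit function computes, here is a minimal scalar sketch, assuming the usual definition log(x / (1 - x)) with an optional `eps` that clamps the input to [eps, 1 - eps]. The handling of out-of-range inputs (returning NaN when no `eps` is given) is an assumption made for illustration; the commit text only mentions that a NaN bug was fixed, not how.

```python
import math


def logit(x: float, eps=None) -> float:
    """Scalar logit: log(x / (1 - x)), with optional clamping by eps."""
    if eps is not None:
        # Clamp the input into [eps, 1 - eps] before taking the log.
        x = min(max(x, eps), 1.0 - eps)
    if x < 0.0 or x > 1.0:
        return float("nan")   # outside the domain (assumed behaviour)
    if x == 0.0:
        return float("-inf")
    if x == 1.0:
        return float("inf")
    return math.log(x / (1.0 - x))


print(logit(0.5))            # 0.0
print(logit(1.5))            # nan
print(logit(1.5, eps=1e-6))  # finite, because the input is clamped first
```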

complement deps on cinn_launch_context cmake (#38031) · cba84f88
Committed by CtfGo on Dec 13, 2021
```
complement deps of cmake files under WITH_CINN compilation
```
cba84f88

10 Dec, 2021 · 5 commits
- fix int32 overflow in cuda kernel loop (#38007) · 37f43ebc
  Committed by Leo Chen on Dec 10, 2021
  
  37f43ebc
- fix pscore geo&lr_decay (#37995) · 513d1f97
  Committed by zhaocaibei123 on Dec 10, 2021
```
* fix

* modify log

* fix batch_size
```
  513d1f97
- add as_complex and as_real op (#37784) · ae40370d
  Committed by Feiyu Chan on Dec 10, 2021
```
* add as_complex and as_real op
```
  ae40370d
- support pylayer with different input dtype (#37974) · c732c831
  Committed by Jiabin Yang on Dec 10, 2021
  
  c732c831
- change serval variable name and usage related cinn_launch (#38022) · a9bd6f0c
  Committed by CtfGo on Dec 10, 2021
  
  a9bd6f0c
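
The `as_complex` and `as_real` operators added in #37784 above convert between a real layout with a trailing dimension of size 2 and a complex layout. A minimal sketch of that idea on plain Python lists follows; the function names mirror the op names, but these helpers are illustrative stand-ins that only handle one list of pairs, not the Paddle operators themselves.

```python
def as_complex(pairs):
    """Turn [[re, im], ...] into a list of complex numbers."""
    return [complex(re, im) for re, im in pairs]


def as_real(values):
    """Turn a list of complex numbers back into [[re, im], ...]."""
    return [[z.real, z.imag] for z in values]


pairs = [[1.0, 2.0], [3.0, -4.0]]
z = as_complex(pairs)
print(z)           # [(1+2j), (3-4j)]
print(as_real(z))  # [[1.0, 2.0], [3.0, -4.0]]
```

The two functions are inverses of each other, so a round trip returns the original pairs.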
09 Dec, 2021 · 6 commits
- cache scope and place on CinnLaunchContext and pass them to callback (#37983) · 151c5d74
  Committed by CtfGo on Dec 09, 2021
```
cinn_launch_op: cache scope and place on CinnLaunchContext to skip duplicate alloc/free callback construction
```
  151c5d74
- [PTen] Refine Kernel Registrar Writing (#37977) · b199ba85
  Committed by Chen Weihang on Dec 09, 2021
```
* refine the kernel register impl

* fix cmake and symbol error

* remove overload macro

* polish details
```
  b199ba85
- add ipu device p2 (#37840) · cb636a48
  Committed by jianghaicheng on Dec 09, 2021
  
  cb636a48
- optimize flip op, removing duplicated computation when dim size is one (#37825) · 890638cf
  Committed by Roc on Dec 09, 2021
  
  890638cf
- format softmax forward (#37927) · 18aca3f5
  Committed by Feng Xing on Dec 09, 2021
  
  18aca3f5
- adjust main dir (#37916) · 1911b6f0
  Committed by Chen Weihang on Dec 08, 2021
  
  1911b6f0
08 Dec, 2021 · 6 commits

add a subdirectory named cinn in operators and move releated files into it (#37938) · 9cb637ed

Committed by CtfGo on Dec 08, 2021

1. Add a subdirectory named `cinn` in the `paddle/fluid/operators` directory and move related files into it.
2. Separate the CinnLaunchContext class from `cinn_launch_op.h` and put it in a new independent file named `cinn_launch_context.h`, so that other files can include it directly.

9cb637ed

[PTen]Add alias kernel name (#37881) · ff6507db
Committed by YuanRisheng on Dec 08, 2021
```
* add alias kernel name

* modify code as suggestions
```
ff6507db

Add paddle.lerp API to do a linear interpolation (#37253) · 1716324c

Committed by wuhuanzhou on Dec 08, 2021

* save temp

* add unittest, test=develop

* fix ci error, test=develop

* fix grad accuracy error, test=develop

* fix unused error, test=develop

* fix compilation error on Windows, test=develop

* add unittest, test=develop

* modify by review comment and add lerp_

* fix inplace api, test=develop

* fix inplace api, test=develop

* fix coverage error, test=develop

1716324c
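
Linear interpolation, as named in this commit title, is commonly defined as x + weight * (y - x). The sketch below illustrates that formula on plain Python lists with a scalar weight; the `lerp_` variant mimics the in-place flavour mentioned in the commit notes. Both helpers are illustrative stand-ins, not the Paddle API.

```python
def lerp(x, y, weight):
    """Return a new list with x[i] + weight * (y[i] - x[i])."""
    return [a + weight * (b - a) for a, b in zip(x, y)]


def lerp_(x, y, weight):
    """In-place variant: overwrite x with the interpolated values."""
    for i, (a, b) in enumerate(zip(x, y)):
        x[i] = a + weight * (b - a)
    return x


start = [1.0, 2.0, 3.0, 4.0]
end = [10.0, 10.0, 10.0, 10.0]
print(lerp(start, end, 0.5))  # [5.5, 6.0, 6.5, 7.0]

lerp_(start, end, 0.5)
print(start)                  # [5.5, 6.0, 6.5, 7.0]
```

A weight of 0 returns the start values and a weight of 1 returns the end values.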

implementation of broadcast sub backward by reduce (#37754) · 567e6bbc
Committed by crystal on Dec 08, 2021
```
* add boardcast_sub

* add boardcast_sub
```
567e6bbc

fix softmax max dim (#37901) · b5dd12fb
Committed by Yanxing Shi on Dec 08, 2021

b5dd12fb

Fix CUDA Graph H2D bug by restore host memory (#37774) · a1ad3a63
Committed by sneaxiy on Dec 08, 2021
```
* fix CUDA Graph H2D bug again

* fix no return bug
```
a1ad3a63

07 Dec, 2021 · 2 commits
- fix filter_by_instag op for lod_level=0 without lod;test=develop (#37834) · b48545ee
  Committed by danleifeng on Dec 07, 2021
  
  b48545ee
- Quantize slice op (#37630) · 2bd0f3c7
  Committed by Zuza on Dec 07, 2021
```
* quantize slice op

* correct test

* fix code formatting
```
  2bd0f3c7
06 Dec, 2021 · 2 commits
- Update CINN tag (#37870) · 3e33ef5a
  Committed by Huihuang Zheng on Dec 06, 2021
```
1. Modify git tag for CINN
2. Support compile option "-DWITH_CINN=ON, -DWITH_TESTING=OFF"
```
  3e33ef5a
- [PTen] Fix reshape move storage using error (#37765) · ead81230
  Committed by Chen Weihang on Dec 05, 2021
```
* fix reshape move storage error

* remove needless set type

* alloc tensor by shared storage
```
  ead81230
03 Dec, 2021 · 1 commit
- refine structure for cuda and rocm (#37202) · a6d2fddb
  Committed by ronnywang on Dec 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
  a6d2fddb

机器未来 / Paddle is in sync with its fork source project
