Commits · 71b046cda4d2c1751cfbc280e3695261f12fe8b4 · PaddlePaddle / Paddle

25 May, 2022 · 1 commit

[EinsumOp] Optimize the backward speed of EinsumOp (#42663) · 71b046cd

Committed by xiongkun on May 25, 2022

* change logic for optimize

* modifty

* optimize the backward speed of EinsumOp

* add cache optimizer for einsum op

* EinsumOp: fix new dygraph mode error

* fix bug

* change Cache->InnerCache

* fix code

* fix

* add nan inf utils for einsum op

* add as_extra

* Compatible with v2.3 EinsumOp

* remove dispensable

24 May, 2022 · 2 commits
- [Phi]Move grad_add op kernel into phi and delete elementwise_add_op file (#42903) · 4d7a9eef
  Committed by YuanRisheng on May 24, 2022
```
* move grad_add

* fix unittest bugs

* fix compile bugs
```
- fix cmake command, rm -> remove (#42927) · de735a9a
  Committed by Feiyu Chan on May 24, 2022
  
23 May, 2022 · 4 commits
- Add double grad yaml for celu/sqrt/rsqrt/square op (#42895) · 0211a833
  Committed by YuanRisheng on May 23, 2022
```
* add double grad yaml

* fix bugs when compile infrt
```
- [Phi] Remove Storage (#42872) · fa6b3c9a
  Committed by zyfncg on May 23, 2022
```
* remove storage

* add glog include

* add glog include

* add glog include
```
- remove is_init_py of RandomGenerator, and use Global RandomGenerator by default (#42876) · 3b488bae
  Committed by zhouweiwei2014 on May 23, 2022
```
* remove is_init_py of RandomGenerator, and use Global Generator if not OP seed

* fix comment
```
- Fix a bug in BroadcastConfig for KP XPU2 rec model (#42866) · 106083aa
  Committed by shixingbo on May 23, 2022
  
20 May, 2022 · 5 commits

Delete ElementwiseKernel in BroadcastKernel (#42779) · 0d878f1a

Committed by niuliling123 on May 20, 2022

use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output (#42851) · f36a9464

Committed by Leo Chen on May 20, 2022

* use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output

* add flags to control compute type

* default to false

* add unit test

* default to true

move activation kernel (#42880) · 191c441a

Committed by YuanRisheng on May 20, 2022

[Eager] Make CreateInferMeta more robust (#42871) · d8b69124

Committed by Weilong Wu on May 20, 2022

[Hackathon No.5] tril_indices OP (#41639) · 75db5b86

Committed by xiaoguoguo626807 on May 20, 2022

* add tril_indices cpu kernal

* modify tril_indice cpu op

* modify bug

* modify bug

* add tril_indices python api

* add tril_indices python api

* resolve conflict

* add tril_indices test

* modify details

* add tril_indices.cu

* pythonapi pass

* save tril_indices

* CPU tril_indices pass

* delete vlog

* modify test_tril_indices_op.py

* delete tril_indices_kernel.cc.swp

* delete tril_indice.cu

* modify code style

* add newline in creation.py

* modify creation.py linux newline

* delete annotation

* check code style

* check .py style add final_state??

* modify code style

* add gpu_tril_indices

* modify gpu_compiled_juage

* modify gpu judge

* code style

* add test example

* modify english document

modify english document

modify english document

modify document

modify document

* modify pram name

* modify pram name

* modify pram

* reduce test ex

19 May, 2022 · 3 commits

[Phi] Change the output format of C++ backward api (Part2) (#42545) · 4427f1b1

Committed by zyfncg on May 19, 2022

* change the output format of C++ backward api

* fix merge conflict

* fix sparse api code auto-gen

* fix eager_gen bug

* fix bug of output is null

* fix bug of conv2d_grad_impl

* fix optional grad

* fix bug of eager-gen double_grad

* fix bug

* fix multiply_double_grad bug

* fix bug of higher order derivative

* fix bug of FillZeroForEmptyGradInput

* remove redundant vector in grad_node

* fix bug of test_deformable_conv_v1_op

* fix bug of test_deformable_conv_v1_op

* some refacotr

[Phi] Remove shared_storage (#42821) · 7a171e3c

Committed by zyfncg on May 19, 2022
```
* remove shared_storage

* fix bug

* fix rnn bug
```

[CompileOpt] Refine enforce code and remove boost/variant include (#41093) · ca359fec

Committed by Chen Weihang on May 19, 2022
```
* refine enforce code

* refine enforce code

* fix compile failed

* fix infrt failed
```

18 May, 2022 · 4 commits
- Add Code Generation for operators, op makers and argument mapping functions (#41772) · e339d3c1
  Committed by Feiyu Chan on May 18, 2022
- matmul and matmul_v2 refactor (#42732) · 570d0322
  Committed by Sławomir Siwek on May 18, 2022
```
* matmul refactor

* remove UT which only check ENFORCE output

* code format

* improve memory usage
```
- Add return in initial function (#42823) · bebaee37
  Committed by niuliling123 on May 18, 2022
  
- Add intermediate config for some api in yaml (#42824) · 384062fa
  Committed by zyfncg on May 18, 2022
```
* add intermediate for some api

* fix bug

* fix fluid.layer
```

17 May, 2022 · 1 commit
- [Eager] Adapt faster tokenizer op (#42718) · b189e83f
  Committed by Chen Weihang on May 17, 2022
```
* adapt faster tokenizer op

* add eager test

* add unittest
```

16 May, 2022 · 3 commits

delete rank switch in broadcast_function.h for compile (#42645) · 8501fb00

Committed by niuliling123 on May 16, 2022

Optimize linspace to avoid GPU -> CPU copy. (#42750) · 34cda80b

Committed by Yiqun Liu on May 16, 2022

[PHI] Support construct IntArray by using Non-CPU Tensosr (#41764) · 8eecd852

Committed by zyfncg on May 16, 2022

* support construct scalar using non-cpu tensor

* fix bugs when run unittest

* fix compile bugs

* fix bugs when run ci

* fix compile bugs

* fix bugs when move copy

* perfect unit test

* perfect unittest

* update according to comment

* int_array supports constructed by gpu tensor

* add some test

* polish code

* adjust full api

* add unittest

* add unittest
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

13 May, 2022 · 1 commit
- add gpu resources. (#42723) · 1280f294
  Committed by Wilber on May 13, 2022
  
12 May, 2022 · 2 commits
- Fix some typos in paddle/. (#42408) · 2012672c
  Committed by Shuangchi He on May 12, 2022
  
- 【Hackathon No.60】refactor unary sparse ops and add sparse sqrt, tanh, sin (#41356) · f1eda7d0
  Committed by tiancaishaonvjituizi on May 12, 2022
  
11 May, 2022 · 1 commit

[Phi] Change the output format of C++ backward api (Part1) (#42677) · ba71fbea

Committed by zyfncg on May 11, 2022

* change the output format of C++ backward api

* fix merge conflict

* fix sparse api code auto-gen

* fix eager_gen bug

* fix bug of output is null

* fix bug of conv2d_grad_impl

* fix optional grad

* fix bug of eager-gen double_grad

* fix bug

* fix multiply_double_grad bug

* remove node pruning

10 May, 2022 · 3 commits

[EinsumOp] Polish forward logic and backward logic for optimize (#42603) · cf198dc9

Committed by xiongkun on May 10, 2022
```
* change logic for optimize

* modifty
```

【PaddlePaddle Hackathon 2】18. Add paddle.heaviside and paddle.Tensor.heaviside APIs to Paddle (#41872) · 4892d592

Committed by BrilliantYuKaimin on May 10, 2022

* Create elementwise_heaviside_op.cc

* add ElementwiseHeavisideFunctor

* Create test_elementwise_heaviside_op.py

* add the Python interface for heaviside

* add heaviside in white list

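* add the signature for heaviside

* add the kernel function for heaviside

* add the kernel function for the heaviside gradient

* add the registration for the heaviside gradient

* adjust code formatting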

* Update elementwise_sig.cc

* add heaviside in __all__

* Update heaviside docs

* Update math.py

* Update math.py

* Update math.py

broadcast_add kp performance optimization (#42097) · c7855125

Committed by shixingbo on May 10, 2022

09 May, 2022 · 2 commits
- [Need approval] Add AdamW-CPU FP32 JIT assembly kernel (#42522) · 766c50ac
  Committed by joanna.wozna.intel on May 09, 2022
```
* Add AdamW jit kernel

* Second implementation

* Add missing header

* Correct number of jit kernels in the test
```
- Modified reduce for xpu2 (#42439) · ae4d1ec1
  Committed by niuliling123 on May 09, 2022
  
07 May, 2022 · 1 commit
- [Phi] Change sync copy to async for gpu_pinned to gpu place in data transform (#41966) · 6583a8d2
  Committed by zyfncg on May 07, 2022
```
* the copy type of data transform for gpu_pinned to gpu change from syna to async

* refactor code
```

06 May, 2022 · 1 commit
- fix conv3d backward (#42502) · 503569a0
  Committed by zhangkaihuo on May 06, 2022
  
05 May, 2022 · 3 commits
- Fix Einsum Infershape get None (#42493) · f315489d
  Committed by xiongkun on May 05, 2022
  
- fix sparse mask (#42305) · e8e3b997
  Committed by zhangkaihuo on May 05, 2022
  
- update xpu depends (#42365) · d90e24ac
  Committed by QingshuChen on May 05, 2022
```
* update xpu depends
*test=kunlun

* minor
*test=kunlun
Co-authored-by: root <root@yq01-sys-hic-p40-0091.yq01.baidu.com>
```

04 May, 2022 · 2 commits
- fix bug of batch_norm_grad kernel with fp16 (#42460) · 65708141
  Committed by XiaoguangHu on May 04, 2022
```
* fix bug of batch_norm_grad kernel with fp16

* format code
```
- fix bug when compiling with cusparse in CUDA version >=11.4 (#42455) · 92fdfe33
  Committed by XiaoguangHu on May 04, 2022
  
01 May, 2022 · 1 commit
- [KP] Complete registry of elementwise ops on XPU with KP (#42056) · a3d56a9c
  Committed by Lijunhui on May 01, 2022
  

PaddlePaddle / Paddle · synced successfully about 1 year ago
