Commits · f87fa3c0e5d0ebf89b336cf16c4d1eb0b8767b25 · 机器未来 / Paddle

30 May, 2022 4 commits

【PaddlePaddle Hackathon 2】15 Add new API Nanmedian (#42385) · f87fa3c0

Committed by thunder95 on May 30, 2022

* nanmedian op

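* fix bug in the CUDA kernel

* fix count_if incompatibility on other hardware platforms

* fix incompatibility with some CPU hardware

* fix incompatibility with some CPU hardware

* fix the isnan check

* handle older NumPy versions that do not support all-NaN input

* handle older NumPy versions that do not support all-NaN input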

* fix code example

* fix api comment error

* revise the backward-propagation logic and the C++ handling logic

* address review suggestions

* typo pre_dim

* update en docs, test=document_fix

* remove numpy in en doc, test=document_fix

* add r,test=document_fix

* add the API to all

* follow advice from chenwhql

f87fa3c0
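
For readers unfamiliar with the operation added in #42385: a NaN-aware median ignores NaN entries and yields NaN only when every entry is NaN (the all-NaN case is the one the bullets above mention for older NumPy versions). The snippet below is a plain-Python sketch of that behaviour, not the Paddle kernel. The helper name `nanmedian_1d` is hypothetical, and averaging the two middle values for an even count is an assumption borrowed from `numpy.nanmedian`.

```python
import math
from statistics import median


def nanmedian_1d(values):
    """Median of the non-NaN entries of a 1-D sequence (hypothetical helper)."""
    # Drop NaN entries before computing the median.
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        # All-NaN (or empty) input: there is nothing to take a median of.
        return math.nan
    # statistics.median averages the two middle values for an even count
    # (assumed here to mirror numpy.nanmedian).
    return median(kept)
```

Calling `nanmedian_1d([1.0, float("nan"), 3.0, 2.0])` skips the NaN and takes the median of the three remaining values.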

Optimize memcpy operation in Eigh (#42853) · 806073d6
Committed by limingshu on May 30, 2022
```
* 1st commit

* fix useless change in header transpose_kernel_h file

* add sync
```
806073d6
[fix] addmm supports 1-d input (#42959) · 849d937b
Committed by Aganlengzi on May 30, 2022
```
* addmm supports 1-d input

* fix coverage

* fix

* more ut
```
849d937b
Make data transform inplaced when tensor is on GPUPinned (#43055) · 114a5d21
Committed by zyfncg on May 30, 2022
```
* make data transform inplace when tensor is on gpupinned in new dygraph

* fix unittest
```
114a5d21

27 May, 2022 2 commits

[Phi] Change optional tensor from `optional<const Tensor&>` to `optional<Tensor>` (#42939) · 6d78524c

Committed by zyfncg on May 27, 2022

* refactor the optional tensor

* remove optional<MetaTensor> in InferMeta

* fix bug

* fix optional<vector<Tensor>>

* fix bug

* fix rmsprop

* fix amp of eager_gen

* polish code

* fix deleted code

* fix merge conflict

* polish code

* remove is_nullopt_

* fix merge conflict

* fix merge conflict

6d78524c

change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010) · 668e235c
Committed by xiongkun on May 27, 2022

668e235c

26 May, 2022 2 commits
- move instance_norm_double_grad (#43021) · b2b78cd4
  Committed by YuanRisheng on May 26, 2022
  
  b2b78cd4
- [Phi]Refactor InstanceNormKernel and InstanceNormGradKernel (#42978) · cc272afb
  Committed by YuanRisheng on May 26, 2022
```
* move instance_norm

* change mutable_data

* fix compile bugs
```
  cc272afb
25 May, 2022 2 commits

fix maybe-uninitialized warning (#42902) · f1f79b0d

Committed by Leo Chen on May 25, 2022

* fix maybe-uninitialized warning

* fix compile

* fix xpu compile

* fix npu compile

* fix infer compile

* fix compile

* fix compile

f1f79b0d

[EinsumOp] Optimize the backward speed of EinsumOp (#42663) · 71b046cd

Committed by xiongkun on May 25, 2022

* change logic for optimize

* modify

* optimize the backward speed of EinsumOp

* add cache optimizer for einsum op

* EinsumOp: fix new dygraph mode error

* fix bug

* change Cache->InnerCache

* fix code

* fix

* add nan inf utils for einsum op

* add as_extra

* Compatible with v2.3 EinsumOp

* remove dispensable

71b046cd

24 May, 2022 2 commits
- [Phi]Move grad_add op kernel into phi and delete elementwise_add_op file (#42903) · 4d7a9eef
  Committed by YuanRisheng on May 24, 2022
```
* move grad_add

* fix unittest bugs

* fix compile bugs
```
  4d7a9eef
- fix cmake command, rm -> remove (#42927) · de735a9a
  Committed by Feiyu Chan on May 24, 2022
  
  de735a9a
23 May, 2022 4 commits
- Add double grad yaml for celu/sqrt/rsqrt/square op (#42895) · 0211a833
  Committed by YuanRisheng on May 23, 2022
```
* add double grad yaml

* fix bugs when compile infrt
```
  0211a833
- [Phi] Remove Storage (#42872) · fa6b3c9a
  Committed by zyfncg on May 23, 2022
```
* remove storage

* add glog include

* add glog include

* add glog include
```
  fa6b3c9a
- remove is_init_py of RandomGenerator, and use Global RandomGenerator by default (#42876) · 3b488bae
  Committed by zhouweiwei2014 on May 23, 2022
```
* remove is_init_py of RandomGenerator, and use Global Generator if not OP seed

* fix comment
```
  3b488bae
- Fix a bug in BroadcastConfig for KP XPU2 rec model (#42866) · 106083aa
  Committed by shixingbo on May 23, 2022
  
  106083aa
20 May, 2022 5 commits

Delete ElementwiseKernel in BroadcastKernel (#42779) · 0d878f1a
Committed by niuliling123 on May 20, 2022

0d878f1a

use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output (#42851) · f36a9464

Committed by Leo Chen on May 20, 2022

* use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output

* add flags to control compute type

* default to false

* add unit test

* default to true

f36a9464

move activation kernel (#42880) · 191c441a
Committed by YuanRisheng on May 20, 2022

191c441a

[Eager] Make CreateInferMeta more robust (#42871) · d8b69124
Committed by Weilong Wu on May 20, 2022

d8b69124

[Hackathon No.5] tril_indices OP (#41639) · 75db5b86

Committed by xiaoguoguo626807 on May 20, 2022

* add tril_indices cpu kernel

* modify tril_indice cpu op

* modify bug

* modify bug

* add tril_indices python api

* add tril_indices python api

* resolve conflict

* add tril_indices test

* modify details

* add tril_indices.cu

* pythonapi pass

* save tril_indices

* CPU tril_indices pass

* delete vlog

* modify test_tril_indices_op.py

* delete tril_indices_kernel.cc.swp

* delete tril_indice.cu

* modify code style

* add newline in creation.py

* modify creation.py linux newline

* delete annotation

* check code style

* check .py style add final_state??

* modify code style

* add gpu_tril_indices

* modify gpu_compiled_juage

* modify gpu judge

* code style

* add test example

* modify english document

modify english document

modify english document

modify document

modify document

* modify param name

* modify param name

* modify param

* reduce test ex

75db5b86
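
As a rough illustration of what an operator like the tril_indices op in #41639 computes: it lists the (row, column) positions on and below a chosen diagonal of a `row` x `col` matrix. The snippet below is a pure-Python sketch under that assumption, not the CPU or GPU kernel from the commit. The helper name `tril_indices_sketch` is hypothetical, and the two-list return layout (all row indices, then all column indices) is assumed.

```python
def tril_indices_sketch(row, col, offset=0):
    """Indices of the lower-triangular part of a row x col matrix.

    Hypothetical helper: returns [row_indices, col_indices], walking the
    matrix in row-major order. offset=0 keeps the main diagonal, a positive
    offset keeps diagonals above it, a negative offset drops diagonals.
    """
    rows, cols = [], []
    for i in range(row):
        for j in range(col):
            # Position (i, j) lies on or below the diagonal shifted by offset.
            if j - i <= offset:
                rows.append(i)
                cols.append(j)
    return [rows, cols]
```

For a 3 x 3 matrix with the default offset, the sketch yields six positions: one in the first row, two in the second, and three in the third.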

19 May, 2022 3 commits

[Phi] Change the output format of C++ backward api (Part2) (#42545) · 4427f1b1

Committed by zyfncg on May 19, 2022

* change the output format of C++ backward api

* fix merge conflict

* fix sparse api code auto-gen

* fix eager_gen bug

* fix bug of output is null

* fix bug of conv2d_grad_impl

* fix optional grad

* fix bug of eager-gen double_grad

* fix bug

* fix multiply_double_grad bug

* fix bug of higher order derivative

* fix bug of FillZeroForEmptyGradInput

* remove redundant vector in grad_node

* fix bug of test_deformable_conv_v1_op

* fix bug of test_deformable_conv_v1_op

* some refactor

4427f1b1

[Phi] Remove shared_storage (#42821) · 7a171e3c
Committed by zyfncg on May 19, 2022
```
* remove shared_storage

* fix bug

* fix rnn bug
```
7a171e3c
[CompileOpt] Refine enforce code and remove boost/variant include (#41093) · ca359fec
Committed by Chen Weihang on May 19, 2022
```
* refine enforce code

* refine enforce code

* fix compile failed

* fix infrt failed
```
ca359fec

18 May, 2022 4 commits
- Add Code Generation for operators, op makers and argument mapping functions (#41772) · e339d3c1
  Committed by Feiyu Chan on May 18, 2022
```
Add Code Generation for operators,  op makers and argument mapping functions (#41772)
```
  e339d3c1
- matmul and matmul_v2 refactor (#42732) · 570d0322
  Committed by Sławomir Siwek on May 18, 2022
```
* matmul refactor

* remove UT which only check ENFORCE output

* code format

* improve memory usage
```
  570d0322
- Add return in initial function (#42823) · bebaee37
  Committed by niuliling123 on May 18, 2022
  
  bebaee37
- Add intermediate config for some api in yaml (#42824) · 384062fa
  Committed by zyfncg on May 18, 2022
```
* add intermediate for some api

* fix bug

* fix fluid.layer
```
  384062fa
17 May, 2022 1 commit
- [Eager] Adapt faster tokenizer op (#42718) · b189e83f
  Committed by Chen Weihang on May 17, 2022
```
* adapt faster tokenizer op

* add eager test

* add unittest
```
  b189e83f
16 May, 2022 3 commits

delete rank switch in broadcast_function.h for compile (#42645) · 8501fb00
Committed by niuliling123 on May 16, 2022

8501fb00

Optimize linspace to avoid GPU -> CPU copy. (#42750) · 34cda80b
Committed by Yiqun Liu on May 16, 2022

34cda80b

[PHI] Support construct IntArray by using Non-CPU Tensor (#41764) · 8eecd852

Committed by zyfncg on May 16, 2022

* support construct scalar using non-cpu tensor

* fix bugs when run unittest

* fix compile bugs

* fix bugs when run ci

* fix compile bugs

* fix bugs when move copy

* perfect unit test

* perfect unittest

* update according to comment

* int_array supports constructed by gpu tensor

* add some test

* polish code

* adjust full api

* add unittest

* add unittest

Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

8eecd852

13 May, 2022 1 commit
- add gpu resources. (#42723) · 1280f294
  Committed by Wilber on May 13, 2022
  
  1280f294
12 May, 2022 2 commits
- Fix some typos in paddle/. (#42408) · 2012672c
  Committed by Shuangchi He on May 12, 2022
  
  2012672c
- 【Hackathon No.60】refactor unary sparse ops and add sparse sqrt, tanh, sin (#41356) · f1eda7d0
  Committed by tiancaishaonvjituizi on May 12, 2022
  
  f1eda7d0
11 May, 2022 1 commit

[Phi] Change the output format of C++ backward api (Part1) (#42677) · ba71fbea

Committed by zyfncg on May 11, 2022

* change the output format of C++ backward api

* fix merge conflict

* fix sparse api code auto-gen

* fix eager_gen bug

* fix bug of output is null

* fix bug of conv2d_grad_impl

* fix optional grad

* fix bug of eager-gen double_grad

* fix bug

* fix multiply_double_grad bug

* remove node pruning

ba71fbea

10 May, 2022 3 commits

[EinsumOp] Polish forward logic and backward logic for optimize (#42603) · cf198dc9
Committed by xiongkun on May 10, 2022
```
* change logic for optimize

* modify
```
cf198dc9

【PaddlePaddle Hackathon 2】18. Add paddle.heaviside and paddle.Tensor.heaviside APIs to Paddle (#41872) · 4892d592

Committed by BrilliantYuKaimin on May 10, 2022

* Create elementwise_heaviside_op.cc

* add ElementwiseHeavisideFunctor

* Create test_elementwise_heaviside_op.py

* add the Python interface for heaviside

* add heaviside in white list

* add the heaviside signature

* add the heaviside kernel

* add the heaviside gradient kernel

* register the heaviside gradient

* adjust code formatting

* Update elementwise_sig.cc

* add heaviside in __all__

* Update heaviside docs

* Update math.py

* Update math.py

* Update math.py

4892d592
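
For context on the function that #41872 exposes: the Heaviside step function maps negative inputs to 0, positive inputs to 1, and takes a caller-supplied value at exactly 0. The snippet below is a scalar plain-Python sketch of that definition, not the elementwise functor from the commit. The helper name `heaviside_scalar` is hypothetical, and returning NaN for a NaN input is an assumption that follows NumPy's behaviour.

```python
import math


def heaviside_scalar(x, y):
    """Heaviside step of x, with y as the value used at x == 0 (hypothetical helper)."""
    if math.isnan(x):
        # Assumption: propagate NaN, as numpy.heaviside does.
        return math.nan
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    # x is exactly zero: the second argument decides the result.
    return y
```

An elementwise version would apply the same three-way branch to each pair of entries from the two input tensors.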

broadcast_add kp performance optimization (#42097) · c7855125
Committed by shixingbo on May 10, 2022

c7855125

09 May, 2022 1 commit
- [Need approval] Add AdamW-CPU FP32 JIT assembly kernel (#42522) · 766c50ac
  Committed by joanna.wozna.intel on May 09, 2022
```
* Add AdamW jit kernel

* Second implementation

* Add missing header

* Correct number of jit kernels in the test
```
  766c50ac

机器未来 / Paddle is in sync with the fork's source project
