Commits · 829723f2361c17eb4bdf530a3c978d9878f4a058 · 机器未来 / Paddle

21 Jun, 2022 3 commits
- resort .cu headers, set clang-format not sort include block and consider .cu... · 829723f2
  Committed by Sing_chan on Jun 21, 2022
```
resort .cu headers, set clang-format not sort include block and consider .cu as main source file (#43633)
```
- slice large tensor for cudnn_softmax (#43681) · bd5e97d3
  Committed by Zhang Ting on Jun 21, 2022
- remove .clang-format in paddle/fluid to use the same config (#43678) · b2df4c76
  Committed by Sing_chan on Jun 21, 2022
20 Jun, 2022 2 commits
- add cross_op cuda kernel (#43558) · ec3e0a13
  Committed by zhangbopd on Jun 20, 2022
- 【Sparse】add new API/OP(csr->csr) of SparseTensor softmax (#43475) · 2ddbc647
  Committed by zhouweiwei2014 on Jun 20, 2022
```
* add new API/OP(csr->csr) of SparseTensor softmax

* fix comment
```
17 Jun, 2022 2 commits
- fix batch csr (#43553) · 03517d8a
  Committed by zhangkaihuo on Jun 17, 2022
```
* fix to_sparse_csr
```
- Fix index calculation error in ElementwiseKernel. (#43603) · 48ea76c9
  Committed by Yiqun Liu on Jun 17, 2022
16 Jun, 2022 2 commits
- [CustomKernel] add custom kernel c api (#42986) · 6fe10181
  Committed by ronnywang on Jun 16, 2022
```
* [CustomKernel] add custom kernel c api

* update

* update

* fix unable to export capi

Co-authored-by: ronny1996 <524019753@qq.com>
```
- fix xpu kp compilation (#43496) · 767efaca
  Committed by Leo Chen on Jun 16, 2022
```
* fix xpu kp compilation

* add depends
```
15 Jun, 2022 3 commits
- modify index dtype from int to int64_t of concat_and_split_functor (#43479) · 81abaaf5
  Committed by Guoxia Wang on Jun 15, 2022
- add some kernels(csr*dense->csr, dense*dense->csr) of SparseTensor matmul (#42935) · 346efe96
  Committed by zhouweiwei2014 on Jun 15, 2022
```
* add some kernel(csr*dense->csr, dense*dense->csr) of SparseTensor matmul

* fix CI

* fix CI

* fix comment

* fix comment
```
- Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to... · 15577630
  Committed by Yiqun Liu on Jun 15, 2022
```
Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to support large tensor. (#43506)

* Change some data type from int to int64_t in GetGpuLaunchConfig1D to support large tensor.

* Use int64_t in ElementwiseKernel as index type to support large tensor.
```
14 Jun, 2022 3 commits
- [Eager] Fix edvr starganv2 (#43471) · c62a7e25
  Committed by Jiabin Yang on Jun 14, 2022
  * fix starganv2
  * fix starganv2 stop_gradient end error
  * fix edvr_starganv2
  * fix mul kernel to fix optional ddx
  * fix typo
- fix compiling werror (#43337) · c6421019
  Committed by Zhang Jun on Jun 14, 2022
- [ Make FLAGS_einsum_opt as default ] Einsum memory optimization (#43397) · 83abec60
  Committed by xiongkun on Jun 14, 2022
  * change logic for optimize
  * modifty
  * optimize the backward speed of EinsumOp
  * add cache optimizer for einsum op
  * EinsumOp: fix new dygraph mode error
  * fix bug
  * change Cache->InnerCache
  * fix code
  * fix
  * add nan inf utils for einsum op
  * add as_extra
  * memory optimizer for einsum
  * update code
13 Jun, 2022 2 commits
- fix bug of strided_slice (#43388) · abc5d0c4
  Committed by zyfncg on Jun 13, 2022
```
* fix stride_slice bug

* fix bug
```
- sparse convertion kernel support secondary dispatch (#43345) · 5752643b
  Committed by zhangkaihuo on Jun 13, 2022
```
* use GpuMemcpy and GpuMemset

* sparse convert kernel support double dispatch by indices dtype

* cudaMemcpyKind->gpuMemcpyKind
```
10 Jun, 2022 4 commits
- [Phi] Fix depthwise conv yaml error (#43379) · f551d9fe
  Committed by Chen Weihang on Jun 10, 2022
```
* fix depthwise conv yaml error

* fix depthwise conv double grad error
```
- revert PR43039 (#43384) · ac75617a
  Committed by Wilber on Jun 10, 2022
- make all phi kernels to 2(host/device) static libraries directly (#43247) · 5781999d
  Committed by Leo Chen on Jun 10, 2022
```
* make all phi kernels to 2(host/device) static libraries directly

* fix calling kernel_declare

* fix compile

* fix cpu compile

* fix rocm compile

* fix xpu compile

* fix xpu kp compile

* fix inference compile
```
- [Hackathon No.28] implement logcumsumexp (#42267) · 19a7524f
  Committed by tiancaishaonvjituizi on Jun 10, 2022
09 Jun, 2022 1 commit
- Implement dropout_nd operator to optimize dropout with axis not None. (#42463) · caa57498
  Committed by crystal on Jun 09, 2022
```
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
```
08 Jun, 2022 1 commit
- [Phi]Move group op kernel into PHI and add yaml / unittest (#43104) · 99c6497b
  Committed by YuanRisheng on Jun 08, 2022
```
* move_group_norm

* move group norm backward

* fix code format

* modify code according comment
```
07 Jun, 2022 6 commits
- Optimized the performance of activation op in XPU2 (#43187) · d5afc1ba
  Committed by shixingbo on Jun 07, 2022
- Allocate and use new memory for temp data in cumsum kernel (#43101) · 5dcebb9b
  Committed by Leo Chen on Jun 07, 2022
- add bf16 dtype for flatten kernel (#43264) · 0fdb3ced
  Committed by Guoxia Wang on Jun 07, 2022
- [multi-stream] Fix split and concat problem. (#43039) · 8c3777df
  Committed by Wilber on Jun 07, 2022
- Transpose optimization with assitant of Chengdu Supercomputing Center and... · 71a63f0a
  Committed by limingshu on Jun 07, 2022
```
Transpose optimization with assitant of  Chengdu Supercomputing Center and auto_tune operation (#42704)
```
- [XPU KP]Add xpu register, any, amax, amin op test (#43204) · aec49361
  Committed by niuliling123 on Jun 07, 2022
06 Jun, 2022 1 commit
- Replace ReduceAmax/Amax.part.cu with KP (#43202) · 39903f72
  Committed by niuliling123 on Jun 06, 2022
05 Jun, 2022 1 commit
- 【code format check upgrade】 step2：clang-format (#42840) · a3730dc8
  Committed by Sing_chan on Jun 05, 2022
04 Jun, 2022 1 commit
- 【code format check upgrade】 step2：cmake-format (#43057) · 92568edb
  Committed by Sing_chan on Jun 04, 2022
02 Jun, 2022 2 commits
- Support hetergraph reindex (#43128) · ceb20406
  Committed by Siming Dai on Jun 02, 2022
```
* support heter reindex

* add unittest, fix bug

* add comment

* delete empty line

* refine example

* fix codestyle

* add disable static
```
- Extend forward fast layer_norm kernel to support more dimensions. (#43118) · 85baa3c0
  Committed by Li Min on Jun 02, 2022
```
* extend forward fast_ln_kernel to support more column values.
```
01 Jun, 2022 3 commits
- Add yaml and unittest for instance_norm op (#43060) · 56ae33b6
  Committed by YuanRisheng on Jun 01, 2022
```
* add yaml

* fix infrt compile bugs
```
- [fix] split nanmedian fluid deps (#43135) · b23914c2
  Committed by Aganlengzi on Jun 01, 2022
- [Yaml]add conv3d, depthwise_conv2d yaml (#42807) · 5f2c251c
  Committed by chentianyu03 on Jun 01, 2022
  * add conv3d yaml
  * add conv3d_grad, conv3d_double_grad
  * add final_state_conv3d test case
  * add conv3d double test case
  * add depthwise_conv2d grad yaml
  * add depthwise_conv2d double grad test case
  * modify the order of args
  * add depthwise_conv2d_grad_grad config
31 May, 2022 3 commits
- [Phi] Polish assign kernel copy impl (#43061) · c9e7c407
  Committed by Chen Weihang on May 31, 2022
```
* fix assign kernel copy impl

* fix test failed
```
- 【PaddlePaddle Hackathon 2】16 Add new API RRelu (#41823) · 21e1d10f
  Committed by thunder95 on May 31, 2022
  * rrelu logic part
  * unregistered op kernel (unresolved)
  * commit before merge
  * enrich test cases
  * fix rrelu-sig bug
  * fix tests in the CPU environment
  * fix spelling errors
  * fix code format
  * try to fix the test case timeout issue
  * optimize test cases
  * remove seed, optimize the random function
  * update en doc for rrelu
  * fix rrelu en docs, test=document_fix
  * add paper link for en docs, test=document_fix
  * udpate en doc
  * add r,test=document_fix
- [EinsumOp] Make EinsumOp support bfloat16. (#43085) · a4bb38cb
  Committed by xiongkun on May 31, 2022
  * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0
  * make EInsumOP support bf16
  * add unittest for BF16
  * add condition for test_BF16
  * fix bugs
  * fix

机器未来 / Paddle is in sync with its fork source project
