Commits · 4a08c781340b12ba59b8c42492652c2f5b5cefb2 · BaiXuePrincess / Paddle

18 Jun, 2022 1 commit
- remove unuse cuSparse function (#43626) · 4a08c781
  Committed by zhouweiwei2014 on Jun 18, 2022
17 Jun, 2022 2 commits
- fix batch csr (#43553) · 03517d8a
  Committed by zhangkaihuo on Jun 17, 2022
```
* fix to_sparse_csr
```
- Fix index calculation error in ElementwiseKernel. (#43603) · 48ea76c9
  Committed by Yiqun Liu on Jun 17, 2022
16 Jun, 2022 2 commits
- [CustomKernel] add custom kernel c api (#42986) · 6fe10181
  Committed by ronnywang on Jun 16, 2022
```
* [CustomKernel] add custom kernel c api

* update

* update

* fix unable to export capi
Co-authored-by: ronny1996 <524019753@qq.com>
```
- fix xpu kp compilation (#43496) · 767efaca
  Committed by Leo Chen on Jun 16, 2022
```
* fix xpu kp compilation

* add depends
```
15 Jun, 2022 5 commits
- modify index dtype from int to int64_t of concat_and_split_functor (#43479) · 81abaaf5
  Committed by Guoxia Wang on Jun 15, 2022
- Rename yaml (#43470) · fcd32950
  Committed by zyfncg on Jun 15, 2022
```
* rename yaml file

* fix merge conflict

* fix infrt
```
- add some kernels(csr*dense->csr, dense*dense->csr) of SparseTensor matmul (#42935) · 346efe96
  Committed by zhouweiwei2014 on Jun 15, 2022
```
* add some kernel(csr*dense->csr, dense*dense->csr) of SparseTensor matmul

* fix CI

* fix CI

* fix comment

* fix comment
```
- Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to... · 15577630
  Committed by Yiqun Liu on Jun 15, 2022
```
Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to support large tensor. (#43506)

* Change some data type from int to int64_t in GetGpuLaunchConfig1D to support large tensor.

* Use int64_t in ElementwiseKernel as index type to support large tensor.
```
- Refactor dynload/port.h (#43431) · 332fdd1e
  Committed by Ruibiao Chen on Jun 15, 2022
```
* Refactor port.h

* Remove some unnecessary code

* Fix CI errors
```
14 Jun, 2022 3 commits
- [Eager] Fix edvr starganv2 (#43471) · c62a7e25
  Committed by Jiabin Yang on Jun 14, 2022
  * fix starganv2
  * fix starganv2 stop_gradient end error
  * fix edvr_starganv2
  * fix mul kernel to fix optional ddx
  * fix typo
- fix compiling werror (#43337) · c6421019
  Committed by Zhang Jun on Jun 14, 2022
- [ Make FLAGS_einsum_opt as default ] Einsum memory optimization (#43397) · 83abec60
  Committed by xiongkun on Jun 14, 2022
  * change logic for optimize
  * modifty
  * optimize the backward speed of EinsumOp
  * add cache optimizer for einsum op
  * EinsumOp: fix new dygraph mode error
  * fix bug
  * change Cache->InnerCache
  * fix code
  * fix
  * add nan inf utils for einsum op
  * add as_extra
  * memory optimizer for einsum
  * update code
13 Jun, 2022 3 commits
- fix bug of strided_slice (#43388) · abc5d0c4
  Committed by zyfncg on Jun 13, 2022
```
* fix stride_slice bug

* fix bug
```
- Fix cmakelint errors for some files (#43428) · edf69ae0
  Committed by Ruibiao Chen on Jun 13, 2022
- sparse convertion kernel support secondary dispatch (#43345) · 5752643b
  Committed by zhangkaihuo on Jun 13, 2022
```
* use GpuMemcpy and GpuMemset

* sparse convert kernel support double dispatch by indices dtype

* cudaMemcpyKind->gpuMemcpyKind
```
10 Jun, 2022 4 commits
- [Phi] Fix depthwise conv yaml error (#43379) · f551d9fe
  Committed by Chen Weihang on Jun 10, 2022
```
* fix depthwise conv yaml error

* fix depthwise conv double grad error
```
- revert PR43039 (#43384) · ac75617a
  Committed by Wilber on Jun 10, 2022
- make all phi kernels to 2(host/device) static libraries directly (#43247) · 5781999d
  Committed by Leo Chen on Jun 10, 2022
```
* make all phi kernels to 2(host/device) static libraries directly

* fix calling kernel_declare

* fix compile

* fix cpu compile

* fix rocm compile

* fix xpu compile

* fix xpu kp compile

* fix inference compile
```
- [Hackathon No.28] implement logcumsumexp (#42267) · 19a7524f
  Committed by tiancaishaonvjituizi on Jun 10, 2022
09 Jun, 2022 2 commits
- [sparse inference] Supporting 2:4 sparse inference (#43179) · 20b38cfa
  Committed by minghaoBD on Jun 09, 2022
- Implement dropout_nd operator to optimize dropout with axis not None. (#42463) · caa57498
  Committed by crystal on Jun 09, 2022
```
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
```
08 Jun, 2022 3 commits
- call_once (#43206) · cad139a7
  Committed by xiaoxiaohehe001 on Jun 08, 2022
- fix tensor copy bug (#43299) · 88216f63
  Committed by zyfncg on Jun 08, 2022
- [Phi]Move group op kernel into PHI and add yaml / unittest (#43104) · 99c6497b
  Committed by YuanRisheng on Jun 08, 2022
```
* move_group_norm

* move group norm backward

* fix code format

* modify code according comment
```
07 Jun, 2022 6 commits
- Optimized the performance of activation op in XPU2 (#43187) · d5afc1ba
  Committed by shixingbo on Jun 07, 2022
- Allocate and use new memory for temp data in cumsum kernel (#43101) · 5dcebb9b
  Committed by Leo Chen on Jun 07, 2022
- add bf16 dtype for flatten kernel (#43264) · 0fdb3ced
  Committed by Guoxia Wang on Jun 07, 2022
- [multi-stream] Fix split and concat problem. (#43039) · 8c3777df
  Committed by Wilber on Jun 07, 2022
- Transpose optimization with assitant of Chengdu Supercomputing Center and... · 71a63f0a
  Committed by limingshu on Jun 07, 2022
```
Transpose optimization with assitant of  Chengdu Supercomputing Center and auto_tune operation (#42704)
```
- [XPU KP]Add xpu register, any, amax, amin op test (#43204) · aec49361
  Committed by niuliling123 on Jun 07, 2022
06 Jun, 2022 1 commit
- Replace ReduceAmax/Amax.part.cu with KP (#43202) · 39903f72
  Committed by niuliling123 on Jun 06, 2022
05 Jun, 2022 1 commit
- 【code format check upgrade】 step2：clang-format (#42840) · a3730dc8
  Committed by Sing_chan on Jun 05, 2022
04 Jun, 2022 1 commit
- 【code format check upgrade】 step2：cmake-format (#43057) · 92568edb
  Committed by Sing_chan on Jun 04, 2022
02 Jun, 2022 2 commits
- Support hetergraph reindex (#43128) · ceb20406
  Committed by Siming Dai on Jun 02, 2022
```
* support heter reindex

* add unittest, fix bug

* add comment

* delete empty line

* refine example

* fix codestyle

* add disable static
```
- Extend forward fast layer_norm kernel to support more dimensions. (#43118) · 85baa3c0
  Committed by Li Min on Jun 02, 2022
```
* extend forward fast_ln_kernel to support more column values.
```
01 Jun, 2022 3 commits
- Add yaml and unittest for instance_norm op (#43060) · 56ae33b6
  Committed by YuanRisheng on Jun 01, 2022
```
* add yaml

* fix infrt compile bugs
```
- [fix] split nanmedian fluid deps (#43135) · b23914c2
  Committed by Aganlengzi on Jun 01, 2022
- [Yaml]add conv3d, depthwise_conv2d yaml (#42807) · 5f2c251c
  Committed by chentianyu03 on Jun 01, 2022
  * add conv3d yaml
  * add conv3d_grad, conv3d_double_grad
  * add final_state_conv3d test case
  * add conv3d double test case
  * add depthwise_conv2d grad yaml
  * add depthwise_conv2d double grad test case
  * modify the order of args
  * add depthwise_conv2d_grad_grad config
31 May, 2022 1 commit
- [Phi] Polish assign kernel copy impl (#43061) · c9e7c407
  Committed by Chen Weihang on May 31, 2022
```
* fix assign kernel copy impl

* fix test failed
```

BaiXuePrincess / Paddle is in sync with the upstream project it was forked from
