Commits · b6bf8994bd176448929057628db05b226bc74ce3 · PaddlePaddle / Paddle

22 Jun, 2022 3 commits

fix the cumsum bug for large size (#43722) · 4b3e8d56
Committed by wawltor on Jun 22, 2022

fix bugs in codegen for operators (#43594) · a160c417

Committed by Feiyu Chan on Jun 22, 2022

* add codegen for get_expected_kernel, add argument mapping for selected_rows kernels, fix other bugs in codegen for operators.
* move bernoulli, erf, mv, poisson, trunc, erf to api.yaml and corresponding backward api to backward.yaml
* generate EmptyGradOpMaker for ops without grad op
* add code to generate all possible kernel signatures for infrt


Fix batch csr (#43708) · d41a9373
Committed by zhangkaihuo on Jun 22, 2022

21 Jun, 2022 6 commits
- cpplint fix 3 (#43679) · ff7d2464
  Committed by wangzhen38 on Jun 21, 2022
```
* cpplint fix 3

* cpplint fix 3

* cpplint fix 3

* cpplint fix 3
```
- fix set_value (#43694) · 94249f5e
  Committed by zyfncg on Jun 21, 2022
- Fix cudnn error for BatchNorm1D kernel (#43072) · bbe0fdb0
  Committed by Yao Zihang on Jun 21, 2022
- resort .cu headers, set clang-format not sort include block and consider .cu... · 829723f2
  Committed by Sing_chan on Jun 21, 2022
```
resort .cu headers, set clang-format not sort include block and consider .cu as main source file (#43633)
```
- slice large tensor for cudnn_softmax (#43681) · bd5e97d3
  Committed by Zhang Ting on Jun 21, 2022
- remove .clang-format in paddle/fluid to use the same config (#43678) · b2df4c76
  Committed by Sing_chan on Jun 21, 2022
20 Jun, 2022 2 commits
- add cross_op cuda kernel (#43558) · ec3e0a13
  Committed by zhangbopd on Jun 20, 2022
- 【Sparse】add new API/OP(csr->csr) of SparseTensor softmax (#43475) · 2ddbc647
  Committed by zhouweiwei2014 on Jun 20, 2022
```
* add new API/OP(csr->csr) of SparseTensor softmax

* fix comment
```
18 Jun, 2022 1 commit
- remove unuse cuSparse function (#43626) · 4a08c781
  Committed by zhouweiwei2014 on Jun 18, 2022
17 Jun, 2022 2 commits
- fix batch csr (#43553) · 03517d8a
  Committed by zhangkaihuo on Jun 17, 2022
```
* fix to_sparse_csr
```
- Fix index calculation error in ElementwiseKernel. (#43603) · 48ea76c9
  Committed by Yiqun Liu on Jun 17, 2022
16 Jun, 2022 2 commits
- [CustomKernel] add custom kernel c api (#42986) · 6fe10181
  Committed by ronnywang on Jun 16, 2022
```
* [CustomKernel] add custom kernel c api

* update

* update

* fix unable to export capi
Co-authored-by: ronny1996 <524019753@qq.com>
```
- fix xpu kp compilation (#43496) · 767efaca
  Committed by Leo Chen on Jun 16, 2022
```
* fix xpu kp compilation

* add depends
```
15 Jun, 2022 5 commits
- modify index dtype from int to int64_t of concat_and_split_functor (#43479) · 81abaaf5
  Committed by Guoxia Wang on Jun 15, 2022
- Rename yaml (#43470) · fcd32950
  Committed by zyfncg on Jun 15, 2022
```
* rename yaml file

* fix merge conflict

* fix infrt
```
- add some kernels(csr*dense->csr, dense*dense->csr) of SparseTensor matmul (#42935) · 346efe96
  Committed by zhouweiwei2014 on Jun 15, 2022
```
* add some kernel(csr*dense->csr, dense*dense->csr) of SparseTensor matmul

* fix CI

* fix CI

* fix comment

* fix comment
```
- Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to... · 15577630
  Committed by Yiqun Liu on Jun 15, 2022
```
Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to support large tensor. (#43506)

* Change some data type from int to int64_t in GetGpuLaunchConfig1D to support large tensor.

* Use int64_t in ElementwiseKernel as index type to support large tensor.
```
- Refactor dynload/port.h (#43431) · 332fdd1e
  Committed by Ruibiao Chen on Jun 15, 2022
```
* Refactor port.h

* Remove some unnecessary code

* Fix CI errors
```
14 Jun, 2022 3 commits

[Eager] Fix edvr starganv2 (#43471) · c62a7e25

Committed by Jiabin Yang on Jun 14, 2022

* fix starganv2

* fix starganv2 stop_gradient end error

* fix edvr_starganv2

* fix mul kernel to fix optional ddx

* fix typo


fix compiling werror (#43337) · c6421019
Committed by Zhang Jun on Jun 14, 2022

[ Make FLAGS_einsum_opt as default ] Einsum memory optimization (#43397) · 83abec60

Committed by xiongkun on Jun 14, 2022

* change logic for optimize

* modifty

* optimize the backward speed of EinsumOp

* add cache optimizer for einsum op

* EinsumOp: fix new dygraph mode error

* fix bug

* change Cache->InnerCache

* fix code

* fix

* add nan inf utils for einsum op

* add as_extra

* memory optimizer for einsum

* update code


13 Jun, 2022 3 commits
- fix bug of strided_slice (#43388) · abc5d0c4
  Committed by zyfncg on Jun 13, 2022
```
* fix stride_slice bug

* fix bug
```
- Fix cmakelint errors for some files (#43428) · edf69ae0
  Committed by Ruibiao Chen on Jun 13, 2022
- sparse convertion kernel support secondary dispatch (#43345) · 5752643b
  Committed by zhangkaihuo on Jun 13, 2022
```
* use GpuMemcpy and GpuMemset

* sparse convert kernel support double dispatch by indices dtype

* cudaMemcpyKind->gpuMemcpyKind
```
10 Jun, 2022 4 commits
- [Phi] Fix depthwise conv yaml error (#43379) · f551d9fe
  Committed by Chen Weihang on Jun 10, 2022
```
* fix depthwise conv yaml error

* fix depthwise conv double grad error
```
- revert PR43039 (#43384) · ac75617a
  Committed by Wilber on Jun 10, 2022
- make all phi kernels to 2(host/device) static libraries directly (#43247) · 5781999d
  Committed by Leo Chen on Jun 10, 2022
```
* make all phi kernels to 2(host/device) static libraries directly

* fix calling kernel_declare

* fix compile

* fix cpu compile

* fix rocm compile

* fix xpu compile

* fix xpu kp compile

* fix inference compile
```
- [Hackathon No.28] implement logcumsumexp (#42267) · 19a7524f
  Committed by tiancaishaonvjituizi on Jun 10, 2022
09 Jun, 2022 2 commits
- [sparse inference] Supporting 2:4 sparse inference (#43179) · 20b38cfa
  Committed by minghaoBD on Jun 09, 2022
- Implement dropout_nd operator to optimize dropout with axis not None. (#42463) · caa57498
  Committed by crystal on Jun 09, 2022
```
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
```
08 Jun, 2022 3 commits
- call_once (#43206) · cad139a7
  Committed by xiaoxiaohehe001 on Jun 08, 2022
- fix tensor copy bug (#43299) · 88216f63
  Committed by zyfncg on Jun 08, 2022
- [Phi]Move group op kernel into PHI and add yaml / unittest (#43104) · 99c6497b
  Committed by YuanRisheng on Jun 08, 2022
```
* move_group_norm

* move group norm backward

* fix code format

* modify code according comment
```
07 Jun, 2022 4 commits
- Optimized the performance of activation op in XPU2 (#43187) · d5afc1ba
  Committed by shixingbo on Jun 07, 2022
- Allocate and use new memory for temp data in cumsum kernel (#43101) · 5dcebb9b
  Committed by Leo Chen on Jun 07, 2022
- add bf16 dtype for flatten kernel (#43264) · 0fdb3ced
  Committed by Guoxia Wang on Jun 07, 2022
- [multi-stream] Fix split and concat problem. (#43039) · 8c3777df
  Committed by Wilber on Jun 07, 2022

PaddlePaddle / Paddle · synced successfully more than 1 year ago
