Commits · 7985407b7e29be7bd9018fbb25b3e04f64216b45 · 机器未来 / Paddle

24 Jun, 2022 · 3 commits
- [Sparse] support batch compute of SparseTensor matmul/masked_matmul/softmax (#43703) · eec4e034
  Authored by zhouweiwei2014 on Jun 24, 2022
- change svd_cpu_kernel from Eigen to Lapack, speed up the compile from 120s -> 20s (#43784) · bafd8dec
  Authored by xiongkun on Jun 24, 2022
- add comment for kernel of api in api.yaml (#43799) · 23036031
  Authored by zyfncg on Jun 24, 2022
23 Jun, 2022 · 3 commits
- Remove unnecessary includings for pstring.h (#43752) · 66a28e13
  Authored by Ruibiao Chen on Jun 23, 2022
```
* Remove unnecessary including for pstring.h

* Fix typos
```
- 【Hackathon No.56 57 58 59】sparse elementwise add sub mul div (#41857) · e3d94fc5
  Authored by Matsumoto Ruko on Jun 23, 2022
- clear cmake files of phi (#43769) · 295f289a
  Authored by Leo Chen on Jun 23, 2022
22 Jun, 2022 · 2 commits
- fix the cumsum bug for large size (#43722) · 4b3e8d56
  Authored by wawltor on Jun 22, 2022
- Fix batch csr (#43708) · d41a9373
  Authored by zhangkaihuo on Jun 22, 2022
21 Jun, 2022 · 6 commits
- cpplint fix 3 (#43679) · ff7d2464
  Authored by wangzhen38 on Jun 21, 2022
```
* cpplint fix 3

* cpplint fix 3

* cpplint fix 3

* cpplint fix 3
```
- fix set_value (#43694) · 94249f5e
  Authored by zyfncg on Jun 21, 2022
- Fix cudnn error for BatchNorm1D kernel (#43072) · bbe0fdb0
  Authored by Yao Zihang on Jun 21, 2022
- resort .cu headers, set clang-format not sort include block and consider .cu... · 829723f2
  Authored by Sing_chan on Jun 21, 2022
```
resort .cu headers, set clang-format not sort include block and consider .cu as main source file (#43633)
```
- slice large tensor for cudnn_softmax (#43681) · bd5e97d3
  Authored by Zhang Ting on Jun 21, 2022
- remove .clang-format in paddle/fluid to use the same config (#43678) · b2df4c76
  Authored by Sing_chan on Jun 21, 2022
20 Jun, 2022 · 2 commits
- add cross_op cuda kernel (#43558) · ec3e0a13
  Authored by zhangbopd on Jun 20, 2022
- 【Sparse】add new API/OP(csr->csr) of SparseTensor softmax (#43475) · 2ddbc647
  Authored by zhouweiwei2014 on Jun 20, 2022
```
* add new API/OP(csr->csr) of SparseTensor softmax

* fix comment
```
17 Jun, 2022 · 2 commits
- fix batch csr (#43553) · 03517d8a
  Authored by zhangkaihuo on Jun 17, 2022
```
* fix to_sparse_csr
```
- Fix index calculation error in ElementwiseKernel. (#43603) · 48ea76c9
  Authored by Yiqun Liu on Jun 17, 2022
16 Jun, 2022 · 2 commits
- [CustomKernel] add custom kernel c api (#42986) · 6fe10181
  Authored by ronnywang on Jun 16, 2022
```
* [CustomKernel] add custom kernel c api

* update

* update

* fix unable to export capi

Co-authored-by: ronny1996 <524019753@qq.com>
```
- fix xpu kp compilation (#43496) · 767efaca
  Authored by Leo Chen on Jun 16, 2022
```
* fix xpu kp compilation

* add depends
```
15 Jun, 2022 · 3 commits
- modify index dtype from int to int64_t of concat_and_split_functor (#43479) · 81abaaf5
  Authored by Guoxia Wang on Jun 15, 2022
- add some kernels(csr*dense->csr, dense*dense->csr) of SparseTensor matmul (#42935) · 346efe96
  Authored by zhouweiwei2014 on Jun 15, 2022
```
* add some kernel(csr*dense->csr, dense*dense->csr) of SparseTensor matmul

* fix CI

* fix CI

* fix comment

* fix comment
```
- Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to... · 15577630
  Authored by Yiqun Liu on Jun 15, 2022
```
Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to support large tensor. (#43506)

* Change some data type from int to int64_t in GetGpuLaunchConfig1D to support large tensor.

* Use int64_t in ElementwiseKernel as index type to support large tensor.
```
14 Jun, 2022 · 3 commits
- [Eager] Fix edvr starganv2 (#43471) · c62a7e25
  Authored by Jiabin Yang on Jun 14, 2022
  * fix starganv2
  * fix starganv2 stop_gradient end error
  * fix edvr_starganv2
  * fix mul kernel to fix optional ddx
  * fix typo
- fix compiling werror (#43337) · c6421019
  Authored by Zhang Jun on Jun 14, 2022
- [ Make FLAGS_einsum_opt as default ] Einsum memory optimization (#43397) · 83abec60
  Authored by xiongkun on Jun 14, 2022
  * change logic for optimize
  * modifty
  * optimize the backward speed of EinsumOp
  * add cache optimizer for einsum op
  * EinsumOp: fix new dygraph mode error
  * fix bug
  * change Cache->InnerCache
  * fix code
  * fix
  * add nan inf utils for einsum op
  * add as_extra
  * memory optimizer for einsum
  * update code
13 Jun, 2022 · 2 commits
- fix bug of strided_slice (#43388) · abc5d0c4
  Authored by zyfncg on Jun 13, 2022
```
* fix stride_slice bug

* fix bug
```
- sparse convertion kernel support secondary dispatch (#43345) · 5752643b
  Authored by zhangkaihuo on Jun 13, 2022
```
* use GpuMemcpy and GpuMemset

* sparse convert kernel support double dispatch by indices dtype

* cudaMemcpyKind->gpuMemcpyKind
```
10 Jun, 2022 · 4 commits
- [Phi] Fix depthwise conv yaml error (#43379) · f551d9fe
  Authored by Chen Weihang on Jun 10, 2022
```
* fix depthwise conv yaml error

* fix depthwise conv double grad error
```
- revert PR43039 (#43384) · ac75617a
  Authored by Wilber on Jun 10, 2022
- make all phi kernels to 2(host/device) static libraries directly (#43247) · 5781999d
  Authored by Leo Chen on Jun 10, 2022
```
* make all phi kernels to 2(host/device) static libraries directly

* fix calling kernel_declare

* fix compile

* fix cpu compile

* fix rocm compile

* fix xpu compile

* fix xpu kp compile

* fix inference compile
```
- [Hackathon No.28] implement logcumsumexp (#42267) · 19a7524f
  Authored by tiancaishaonvjituizi on Jun 10, 2022
09 Jun, 2022 · 1 commit
- Implement dropout_nd operator to optimize dropout with axis not None. (#42463) · caa57498
  Authored by crystal on Jun 09, 2022
```
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
```
08 Jun, 2022 · 1 commit
- [Phi]Move group op kernel into PHI and add yaml / unittest (#43104) · 99c6497b
  Authored by YuanRisheng on Jun 08, 2022
```
* move_group_norm

* move group norm backward

* fix code format

* modify code according comment
```
07 Jun, 2022 · 6 commits
- Optimized the performance of activation op in XPU2 (#43187) · d5afc1ba
  Authored by shixingbo on Jun 07, 2022
- Allocate and use new memory for temp data in cumsum kernel (#43101) · 5dcebb9b
  Authored by Leo Chen on Jun 07, 2022
- add bf16 dtype for flatten kernel (#43264) · 0fdb3ced
  Authored by Guoxia Wang on Jun 07, 2022
- [multi-stream] Fix split and concat problem. (#43039) · 8c3777df
  Authored by Wilber on Jun 07, 2022
- Transpose optimization with assitant of Chengdu Supercomputing Center and... · 71a63f0a
  Authored by limingshu on Jun 07, 2022
```
Transpose optimization with assitant of  Chengdu Supercomputing Center and auto_tune operation (#42704)
```
- [XPU KP]Add xpu register, any, amax, amin op test (#43204) · aec49361
  Authored by niuliling123 on Jun 07, 2022

机器未来 / Paddle: in sync with the fork's source project