Commits · ed604569b96774b970e52046d403c6c58fa7995c · PaddlePaddle / Paddle

05 Jun, 2023 7 commits

【Hackathon 4 No.19】Add polygamma API to Paddle (#53791) · ed604569

Committed by PommesPeter on Jun 05, 2023

* feat: added polygamma init code

* feat: added polygamma unittest code

* test: added more test cases

* refactor: added forward impl

* refactor: added backward impl

* test: updated cases

* refactor: updated test cases

* refactor: added more case and fixed some bugs

* test: updated ref func

* refactor: updated code style

* refactor: move the code

* refactor: updated test

* refactor: updated test

* docs: updated en doc
Co-authored-by: Nzachary sun <70642955+sunzhongkai588@users.noreply.github.com>

* docs: updated math eq

---------
Co-authored-by: Nzachary sun <70642955+sunzhongkai588@users.noreply.github.com>

ed604569
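
For reference, the function this commit exposes is conventionally defined as the (n+1)-th derivative of the log-gamma function. The formula below is the standard mathematical definition, stated as background; it is not quoted from the commit or from the Paddle documentation.

```latex
\psi^{(n)}(x) = \frac{d^{\,n+1}}{dx^{\,n+1}} \ln \Gamma(x), \qquad n = 0, 1, 2, \dots
```

With n = 0 this reduces to the digamma function.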

[static op generation] pool2d, pool3d (#54070) · 30881647
Committed by gouzil on Jun 05, 2023

30881647

[bug fix] group norm backward (#54341) · d338b2f8
Committed by wangzhen38 on Jun 05, 2023

d338b2f8

[XPU] fix unittest of shape op. (#54323) · f55eb06f
Committed by houj04 on Jun 05, 2023

f55eb06f

Add macro SPCONV_WITH_CUTLASS (#54274) · e7a38f15
Committed by umiswing on Jun 05, 2023

e7a38f15
Support code generation for op conv2d_transpose, conv3d_transpose,... · 1075d35d
Committed by huangjiyi on Jun 05, 2023
```
Support code generation for op conv2d_transpose, conv3d_transpose, depthwise_conv2d_transpose (#54242)
```
1075d35d

optimize logsumexp in small data scale (#52952) · 93e1bb98

Committed by Asthestarsfalll on Jun 05, 2023

* optimize logsumexp in small data scale

* fix

* fix

* add #pragma once

* switch to use aligned_vector and support arbitrary shape

* fix store

* fix store

* refine for special cases

* try

* fix

* update

* fix

* fix all_reduce

* try

* fix rocm bug

* fix rocm bug

* fix rocm bug

* fix rocm bug

* fix rocm bug

* fix rocm bug

* fix rocm bug

* fix rocm bug

93e1bb98
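
The commit above optimizes Paddle's logsumexp kernel, but the kernel code is not part of this log. As background only, here is a minimal Python sketch of the standard max-shift formulation of log-sum-exp. The name `logsumexp_stable` and its list-based interface are illustrative assumptions and do not reflect Paddle's implementation.

```python
import math


def logsumexp_stable(values):
    """Return log(sum(exp(v) for v in values)) without overflowing.

    Hypothetical helper for illustration; not Paddle's kernel.
    """
    if not values:
        # The empty sum is 0, and log(0) is -inf.
        return float("-inf")
    m = max(values)
    if math.isinf(m):
        # Either every entry is -inf, or some entry is +inf.
        return m
    # Shift by the maximum so that every exponent is <= 0.
    return m + math.log(sum(math.exp(v - m) for v in values))
```

Subtracting the maximum before exponentiating keeps each `exp` argument non-positive, so inputs such as 1000.0 do not overflow.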

03 Jun, 2023 1 commit
- 【Hackathon 4th No.29】Add the paddle.sparse.slice sparse API to Paddle (#53794) · d71baff6
  Committed by Scotty on Jun 03, 2023
  
  d71baff6
02 Jun, 2023 8 commits
- fix typo (#54299) · 06304ade
  Committed by RedContritio on Jun 02, 2023
  
  06304ade
- 【PaddlePaddle Hackathon 4】No.56 :add fp and bf16 for bernoulli (#54232) · 85d5f26d
  Committed by Difer on Jun 02, 2023
```
* add fp&bf16 bernoulli

* add check_dtype & fix error

* fix rocm error
```
  85d5f26d
- [XPU]Add yolo box fuse pass && kernel (#54163) · a087b9cb
  Committed by wz1qqx on Jun 02, 2023
  
  a087b9cb
- floor div support int8/int16/int32/int64/uint8/float32/float64/bfloat16/float16 (#53854) · 6310419b
  Committed by Hui Zhang on Jun 02, 2023
```
* floor div support float/double/bfloat16/float16

* add ut

* fix bug

* fix fft.ifftshift for floor_divide upgrade

* fix comment

* fix bugs

* fix bug
```
  6310419b
- Optimize perf of broadcast matmul (#54126) · 9f76d050
  Committed by Zhang Zheng on Jun 02, 2023
```
* Optimize perf of broadcast matmul

* support more dtype
```
  9f76d050
- add mixed bool and int index support for index_put (#54195) · 8fd4ef91
  Committed by 傅剑寒 on Jun 02, 2023
  
  8fd4ef91
- [AMP] support master_grad for adam and momentum (#54240) · 703a64a3
  Committed by Zhang Ting on Jun 02, 2023
```
* support master_grad for adam and momentum

Co-authored-by: zhangting_2017@163.com <zhangting2020>
```
  703a64a3
- static graph autogen code for shape op (#54221) · f5342918
  Committed by Wang Xin on Jun 02, 2023
```
* static graph autogen code for shape op

* fix onednn

* fix onednn
```
  f5342918
01 Jun, 2023 5 commits
- [Sparse] Support sparse conv 2d. (#54158) · 4f25604e
  Committed by umiswing on Jun 01, 2023
  
  4f25604e
- [Zero-Dim] OpTest support shape check and fix previous case problem (#54117) · d4451cb0
  Committed by zhouweiwei2014 on Jun 01, 2023
  
  d4451cb0
- [ROCM] fix multihead_matmul (#54108) · effebd41
  Committed by ronnywang on Jun 01, 2023
```
* [ROCM] fix multihead_matmul

* skip bf16 uts

* update
```
  effebd41
- fix xpu-kp bugs (#54234) · e8735ddf
  Committed by YuanRisheng on Jun 01, 2023
  
  e8735ddf
- Support static graph code generation for conv2d, conv3d, depthwise_conv2d (#54201) · f3eccb3f
  Committed by huangjiyi on Jun 01, 2023
```
* update

* update cmake

* update

* update

* update

* update

* Revert "update cmake"

This reverts commit 1e1dc1b2bc9967b725201272607f939260070fd4.

* update

* update

* update

* update
```
  f3eccb3f
31 May, 2023 1 commit
- support activation prim op bf16 dtype (#54193) · cbeff5fc
  Committed by Charles-hit on May 31, 2023
```
* support activation prim op bf16 dtype

* remove useless code
```
  cbeff5fc
30 May, 2023 4 commits

update_c++17 (#53892) · 950b563b

Committed by risemeup1 on May 30, 2023

* update_c++17

* update_c++17

* fix windows bug

* solve circular dependency

* solve circular dependency

* solve circular dependency

* solve circular dependency

* solve circular dependency

* fix windows bug

* fix compiler error

* fix compiler error

* update eigen3

* update eigen3

* update eigen3

* fix mac-py3 compiler error

* update C++17

* fix mac compiler error

* fix compile error

* fix coverage_compiler error

* fix coverage_ci_problem

* fix coverage_error

* fix_kunlun200 compile error

* fix kunlun200 compiler error

* fix compile error

* fix compiler error

* fix py3 failed test

* fix kunlun200 compiler error

* test

* fix test error

* fix test error

* fix test error

* test

* test

* fix mac py3 error

* fix mac py3 error

* fix mac py3 error

* fix test error

* fix test error

* fix compile error

* fix compile error

* fix compile error

* test

* test

* fix compiler error

* test

* test

* debug on ci

* fix compiler error

* fix compiler error

* test

* fix cinn compiler error

* test

* fix rocm compile error

* fix cinn and kunlun compile error

* update c++14

* Update flags.cmake

950b563b

softmax fwd: force vec size to 1 when dtype is float (#54183) · f5a3b427
Committed by shaojie_wang on May 30, 2023
```
* softmax fwd: force vec size to 1 when dtype is float

* use 1024 as threshold to use cudnn
```
f5a3b427

[AMP] Reimplement check_nan_inf as check_numerics_kernel. (#52245) · 44bd5927

Committed by Yiqun Liu on May 30, 2023

* Reimplement the check_nan_inf function as check_numerics kernel.

* Remove the cpu implementation to phi.

* Add ifdef for the including of omp.h.

* Move the use of FLAGS_check_nan_inf_level out of header file.

* Implement a common PrintAndThrowError function.

* Fix the incorrect use of __NVCC__, which should be replaced with __CUDA_ARCH__.

* Add dependency of phi.

* Polish codes and unittest.

44bd5927

[XPU] using xpu::normal in gaussian kernel. (#54176) · 060e4fab
Committed by houj04 on May 30, 2023

060e4fab

26 May, 2023 1 commit

[PHI Decoupling]Create PHI shared lib (#53735) · da50a009

Committed by YuanRisheng on May 26, 2023

* create phi so

* fix ci bugs

* fix py3 bugs

* add file

* fix py3 bugs

* fix windows bugs

* perfect so

* fix py3 bugs

* delete all static target in phi

* fix windows bugs

* fix py3 bugs

* fix ci bugs

* fix windows bugs

* fix bugs: gflags can't be linked by dynamic and static lib

* fix bugs that can not load 3rd party

* fix ci bugs

* fix compile bugs

* fix py3 bugs

* fix conflict

* fix xpu bugs

* fix mac compile bugs

* fix psgpu bugs

* fix inference failed

* deal with conflict

* fix LIBRARY_PATH bug

* fix windows bugs

* fix onednn error

* fix windows compile bugs

* fix windows compile bugs

* fix test_cuda_graph_static_mode_error aborted

* fix windows bugs

* fix mac-python3 error

* fix hip compile bugs

* change mode to static

* change to static mode

* fix ci bugs

* fix py3 bugs

* fix windows bugs

* fix bugs

* add static flag

* add PADDLE_API

* change position of PADDLE_API

* fix windows bugs

* change mode to dynamic lib

* fix windows static bugs

* deal with conflict

* fix windows unit bug

* fix coverage

* deal with conflict

* fix windows-inference

* fix py3 bugs

* fix bugs when compile type_info

* fix compile bugs

* fix py3 bugs

* fix windows bugs

* fix windows openblas

* fix xpu bugs

* fix enforce_test in windows

* update code according comment

* fix windows cmake bug

* fix windows bugs

* fix windows bugs

* delete cinn unittest

* fix cinn bugs

---------
Co-authored-by: lzydev <1528794076@qq.com>

da50a009

25 May, 2023 5 commits
- Using a sorting method may achieve better performance. (#54045) · 6d1292ef
  Committed by zhangkaihuo on May 25, 2023
  
  6d1292ef
- [Sparse]fix sparse bug (#53390) · acf3e526
  Committed by zhangkaihuo on May 25, 2023
  
  acf3e526
- 【Hackathon 4th No.26】Add COO-format computation logic for the paddle.sparse.nn.Softmax sparse API to Paddle (#53613) · 4ea1d041
  Committed by thunder95 on May 25, 2023
  
  4ea1d041
- [Zero-Dim] support ReshapeTransform/nll_loss/matmul support 0D (#53828) · a64a722a
  Committed by zhouweiwei2014 on May 25, 2023
  
  a64a722a
- add log for memory stats (#54083) · 5745a63f
  Committed by Leo Chen on May 25, 2023
```
* add log for memory stats

* fix string_split in einsum
```
  5745a63f
24 May, 2023 6 commits
- Try to increase the repeat of autotune and fix the setting of allow_tf32_cublas. (#53622) · f4abe34b
  Committed by Yiqun Liu on May 24, 2023
```
* Try to increase the repeat of autotune and fix the setting of allow_tf32_cublas.

* Change the repeat of cublaslt to 10.

* Use FLAGS_cublaslt_exhaustive_search_times as repeats.

* Fix compiling error on CI.

* Polish the key and simplify codes.
```
  f4abe34b
- move reduce raw kernels to legacy (#53961) · f488e3fd
  Committed by zhangyuqin1998 on May 24, 2023
  
  f488e3fd
- move raw kernels to legacy (#53913) · 48f5af99
  Committed by zhangyuqin1998 on May 24, 2023
```
* move raw kernels to legacy

* Update elementwise_add_kernel.cu

* fix
```
  48f5af99
- [XPU]Add act add fuse (#53965) · f55f9d79
  Committed by wz1qqx on May 24, 2023
  
  f55f9d79
- Update lerp_kernel.cu (#54071) · a299797d
  Committed by Winters Montagne on May 24, 2023
```
Removed unnecessary header files introduced
```
  a299797d
- [XPU][PHI Kernels] bind bitwise_add kernel & add int32/int64 support to... · 0a06140f
  Committed by lijin23 on May 24, 2023
```
[XPU][PHI Kernels] bind bitwise_add kernel & add int32/int64 support to scatter_nd_add kernel for xpu (#54066)

* bind new kernels to xpu

* refine code

* fix bugs in unittest
```
  0a06140f
23 May, 2023 2 commits
- [AMP OP&Test] Support float16 in selu (#54030) · 6133ca4e
  Committed by Zhang Zheng on May 23, 2023
```
* [AMP OP&Test] Support float16 in selu

* fix
```
  6133ca4e
- [PHI] bind nll_loss xpu kernel (#54043) · 73d706ce
  Committed by RuohengMa on May 23, 2023
  
  73d706ce

PaddlePaddle / Paddle synced successfully about 1 year ago