Commits · bc91012f826c72c0620a49651c11de76b7afecd4 · PaddlePaddle / Paddle

18 Apr, 2023 · 5 commits
- 【Hackathon No.60】Improve FP16/BF16 unit tests for the randperm, split, split_with_num operators (#52683) · bc91012f
  Committed by chenxujun on Apr 18, 2023
```
* Add split, split_with_num tests

* Add randperm tests

* Fix code
```
- Add index_add, index_sample, put_along_axis, take_along_axis tests (#52572) · 1eb30775
  Committed by chenxujun on Apr 18, 2023
- 【0D output】add 0D output support for linalg.slogdet (#52891) · a7155c5c
  Committed by GGBond8488 on Apr 18, 2023
```
* add 0D output support for linalg.slogdet, test=allcase

* fix zero-dim test error test=allcase

* fix test error test=allcase

* add static backward test, test=allcase
```
- fix the set_value error in cpu (#49804) · 239dbc4e
  Committed by JYChen on Apr 18, 2023
```
* fix the set_value error in cpu

* add a unit test for set_value OP

* fix platform::is_gpu_place

* add todo note for set_value
```
- reorder_prior_box (#52749) · a70d9db9
  Committed by zhangyuqin1998 on Apr 18, 2023
```
* reorder_prior_box

* fix
```
17 Apr, 2023 · 8 commits

[Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a

Committed by zhoutianzi666 on Apr 17, 2023

* initial commit for cutlass_teller

* second commit for cutlass_teller

* add conv2d_depthwise python template

* add conv2d_depthwise cutlass template

* /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h

* refine code in Conv2dFusionCanSupport

* add macro in cutlass_teller.h

* add 3x3 5x5 teller

* add groups not 1 or conv2d_depthwise teller

* only generate conv2d_depthwise kernels whose ic is a multiple of 8

* add EXPLICIT in cutlass_teller.h

* final commit

* add split_k_slices in conv2d_depthwise

* make stages == 2

* refactor part of the code

* add CutlassFusionType

* solve illegal memory

* make stride_h=stride_w && make dilation==1

* must check HasAttr(use_cutlass) before GetAttrIfExists

* add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String

* modify decl.h and util.cu


[Fused] controlled randomness for fused dropout add (#52903) · e36f80c6
Committed by Chitsing KUI on Apr 17, 2023
```
* add random control for fused dropout add

* add __init__
```
[AMP OP&Test]Add BF16 implementation and unit tests of multinomial (#52898) · d19d2486
Committed by Vvsmile on Apr 17, 2023
```
* fix multinomial

* fix test_elementwise

* fix convert_float_to_uint16

* add test_multinomial_op

* fix code style
```

【PaddlePaddle Hackathon 4 No.49】: Add float16 data type support for Paddle bce_loss (#50930) · 44e6de98

Committed by thunder95 on Apr 17, 2023

* untracked files

* bce_loss_fp16

* remove unused files

* back max_rel_error still big

* simplify code

* upd

* fix max_relative_error

* restart ci

* Update test_bce_loss.py

* Update test_bce_loss.py

* Update test_bce_loss.py

* Update test_bce_loss.py

* try to pass test

* restore file

* remove error value

* fix bug

---------
Co-authored-by: Zhang Ting <Douyaer2020@qq.com>


【Eager】fix multiply double grad error (#52870) · cf3ddf24
Committed by Jiabin Yang on Apr 17, 2023
```
* fix multiply double grad error

* fix multiply dy only kernel
```

【Hackathon No.32】Optimize the GPU compute performance of the expand_as forward & backward ops for Paddle (#52700) · 3c44e948

Committed by Hanchiao on Apr 17, 2023

* Implement optimized kernel for OP-expand_as.

* Support fp16.
Co-authored-by: Timber-Ye <ye_hanqiao@163.com>
Co-authored-by: BrianQian1999 <brianqianhitsz@gmail.com>

* remove fp16 support

* remove MAX_RANK_SUPPORTED

---------
Co-authored-by: BrianQian1999 <brianqianhitsz@gmail.com>

rename_SliceKernel (#52863) · d2b0d63f
Committed by zhangyuqin1998 on Apr 17, 2023

Add output defs for some kernelsPhi register (#52941) · 23f87442

Committed by Sonder on Apr 17, 2023

* add register info for eigh and eig_grad

* add sync_batch_norm_op.cu register info

* add lamb output register info

* add unique register info

* change type name

* change type name

* add output register info for check_finite_and_unscale

* update cmake and config file

* add register info for adagrad

* fix build error

* add sync to run_unittests.sh

* add register info for unique_consecutive

* fix build error

* add eigh to STATIC_BUILD_TESTS

* update eig_kernel.cc

* update eig_kernel.cc

* fix infer meta error

* fix unique register error

* fix lamb register info error

* fix lamb register info

* update lamb register info

* fix lamb

* remove one Output Register

* update static build file

* add eigh op to disable_wingpu_test

* update run_unittests


14 Apr, 2023 · 11 commits
- [AMP OP&Test] Cumprod support fp16 and bf16 (#52919) · 8a850af6
  Committed by Zhang Zheng on Apr 14, 2023
- 【Hackathon4 No58】logcumsum logsum (#51275) · 468869e4
  Committed by cyberslack_lee on Apr 14, 2023
- 【Hackathon4 No58】kthvalue (#51615) · 43efb979
  Committed by cyberslack_lee on Apr 14, 2023
- 【Hackathon No.62】Improve FP16/BF16 unit tests for the digamma, dirichlet operators (#52604) · 7ecbcc08
  Committed by chenxujun on Apr 14, 2023
```
* Add digamma, dirichlet tests

* Fix code
```
- 【Hackathon No.55】add erf FP16 test and BF16 test (#52136) · eeb4d165
  Committed by superwinner1 on Apr 14, 2023
```
* add erf FP16 test
```
- Add angle, bmm tests (#52630) · 6d7ee668
  Committed by chenxujun on Apr 14, 2023
- [Dcu]: Add rocsparse_spmm for dcu. (#52200) · 281ea2f4
  Committed by umiswing on Apr 14, 2023
- [Zero-Dim] support 0-D tensor for... · 6f41e177
  Committed by YangQun on Apr 14, 2023
```
[Zero-Dim] support 0-D tensor for reduce/reshape/stack/prelu/expand_v2/gaussian onednn kernels (#52185)

* support 0-D tensor for reduce/reshape/stack/prelu/expand_v2/gaussian ops

* fix gaussian random mkldnn op ut
```
- [phi] move sequence_pool to phi - Step 2 : sequence_pool_op (#52750) · b281b221
  Committed by gouzil on Apr 14, 2023
```
* [phi] move sequence_pool kernel to phi

* [phi] mv sequence_pooling to phi funcs

* [phi] mv sequence_pooling_test

* [phi] RollBACK `paddle/fluid/operators/sequence_ops/sequence_pool_op.cc`

* [phi][funcs] fix mutable_data

* [phi][funcs] fix mutable_data
```
- fix win cu116 compile error (#52894) · 60ba559a
  Committed by sneaxiy on Apr 14, 2023
- delete unused param from swish_grad and relu6_grad (#52805) · 54e4360a
  Committed by zhangyuqin1998 on Apr 14, 2023
13 Apr, 2023 · 12 commits
- 【Hackathon No.55】 add channel_shuffle FP16/BF16 support and tests (#51884) · 48ccb785
  Committed by superwinner1 on Apr 13, 2023
```
* No55 add channel_shuffle FP16/BF16 support and tests
```
- 【Hackathon No57】add_fp16_bf16_for_dot & bf16_for_cross (#52426) · 205094f0
  Committed by Difer on Apr 13, 2023
```
* add_fp_bf_for_dot & bf_for_cross

* fix error

* fix some error

* fix some error

* change something

* fix magic number
```
- [AMP OP&Test] Support fp16&bf16 in reduce_max (#52862) · e0e044c0
  Committed by Zhang Zheng on Apr 13, 2023
```
* [AMP OP&Test] Support fp16&bf16 in reduce_max
```
- Fix the parameter check error in rmsprop_kernel_xpu. (#52866) · 9dc7e5ef
  Committed by Leo Guo on Apr 13, 2023
- Add pixel_shuffle pixel_unshuffle fp16/bf16 (#52582) · 2aaed989
  Committed by chenxujun on Apr 13, 2023
- Add overlap_add, sign tests (#52667) · cb6de765
  Committed by chenxujun on Apr 13, 2023
- rename PD_REGISTER_GENERAL_KERNEL (#52759) · 3a66627e
  Committed by zhangyuqin1998 on Apr 13, 2023
```
* rename PD_REGISTER_GENERAL_KERNEL

* Update feed_op.cc

* fix

* Update strings_empty_kernel.cc
```
- [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26
  Committed by HongyuJia on Apr 13, 2023
```
* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h

* Add logging.h for profiler.cc

* Add logging.h for gloo_utils.h

* Add logging.h for addmm_kernel_impl.h

* Add logging.h for addmm_grad_kernel_impl.h

* Add logging.h for p_send_kernel.cu

* Add logging.h for determinant_grad_kernel_impl.h

* Add logging.h for p_recv_kernel.cu

* Add logging.h for elementwise_grad_base.h

* Add logging.h for transfer_layout_kernel.cc

* Add logging.h for eigvals_kernel.cc and index_select_impl.h

* Add logging.h for all files in kernel directory

* Add logging.h for xpu_info.cc

* Add logging.h for xpu
```
- delete useless cast, elementwise_mul (#52831) · 0695fb88
  Committed by zhupengyang on Apr 13, 2023
- [cutlass] Sparse conv3d backward fusion (#52361) · 0b98d1aa
  Committed by umiswing on Apr 13, 2023
- rename_bilinear_tensor_op (#52745) · eb93b5c9
  Committed by zhangyuqin1998 on Apr 13, 2023
- [XPU] Fix instance_norm, conv2d_xpu, inplace optimizer bugs. (#52627) · fa8abeec
  Committed by csy0225 on Apr 13, 2023
12 Apr, 2023 · 4 commits

Optimize performance of unique kernel (#52736) · 8cbeefea
Committed by Zhang Zheng on Apr 12, 2023
```
* Optimize performance of unique kernel

* fix ci
```

[AMP OP&Test] add fp16/bf16 unittest for pool2d op (#52288) · f9b155f9

Committed by Wei Shengyu on Apr 12, 2023

* add bf16 support and bf16/fp16 unittest for pool2d

* add include files

* dbg

* reformat

* reformat

* modify code according to review comment

* remove duplicate code

* remove dup code

* remove useless include

* dbg


Patch del (#52754) · 189e0d44

Committed by wangzhen38 on Apr 12, 2023

* [DO NOT MERGE] adadelta lr support

* [DO NOT MERGE] gpu support

* [test] follow torch

* fix acc update order

* for ci

* [bug fix] update master para

* [bug fix] update test

* [bug fix] for ci test

* for ci

* fix xpu

* [adadelta fix] del fluid head file

* for ci

* del notes


[AMP OP&Test] support bf16 for batch norm (#52407) · 523f8a26

Committed by Guoxia Wang on Apr 12, 2023

* [AMP OP&Test] support bf16 for batchnorm

* codestyle

* Update batch_norm_grad_kernel.cu

* Update batch_norm_kernel.cu

* fix codestyle

* fix

* fix

* fix

* fix

* fix

* Update batch_norm_kernel.cc


PaddlePaddle / Paddle synced successfully about 1 year ago
