Commits · 2a705b74a02cb8072d5a8510276e6a5154ab9ee1 · PaddlePaddle / Paddle

26 Apr, 2023 2 commits
- 【Hackathon No.48】Add float16 data type support to the Paddle determinant operator (#53286) · 2a705b74
  Committed by denglianbin on Apr 26, 2023
  
  2a705b74
- 【Hackathon No.48】Add float16 data type support to the Paddle meshgrid operator (#53284) · 9127cc3c
  Committed by denglianbin on Apr 26, 2023
  
  9127cc3c
25 Apr, 2023 4 commits
- 【PaddlePaddle Hackathon 4 No.33】Optimize the GPU compute performance of the Paddle Histogram op (#53112) · c1a61fc0
  Committed by Zero Rains on Apr 25, 2023
```
* create KernelMinMax to optimize the performance of histogram op in GPU

* change to block and warp wise operation

* remove the time in DtoH

* fix a bug
```
  c1a61fc0
- 【Hackathon No.61】Complete FP16/BF16 unit tests for the min operator (#52887) · d7a5e900
  Committed by cyberslack_lee on Apr 25, 2023
  
  d7a5e900
- fix shared memory over usage in embedding grad kernel on deterministic mode (#53247) · 6f684bd2
  Committed by shaojie_wang on Apr 25, 2023
```
* fix shared memory over usage in embedding grad kernel on deterministic mode

* use IdT as integer dtype
```
  6f684bd2
- 【Hackathon No57】add fp16 & bf16 for max_pool2d_with_index, max_pool3d_with_index (#52314) · 46951224
  Committed by Difer on Apr 25, 2023
```
* add fp_bf for pool_max_withidx

* fix some error

* fix error

* codestyle error

* fix masktype

* fix input bf type

* input bf dtype convert error

* back to convert input to bf16 first

* fix convert error

* fix bf16 grad check
```
  46951224
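
The Histogram commit above (c1a61fc0) mentions a `KernelMinMax` step and block- and warp-wise operation. As a rough illustration only, the plain-Python sketch below shows the general two-phase idea: reduce each chunk to a partial (min, max), combine the partials, then bin the values. The function names, the chunk size, and the bin-edge convention are assumptions made for this sketch. It is not Paddle's CUDA kernel.

```python
from typing import List, Sequence, Tuple


def chunked_min_max(values: Sequence[float], chunk: int = 4) -> Tuple[float, float]:
    """Two-level min/max reduction (hypothetical helper).

    Each chunk stands in for one GPU block producing a partial result;
    the partials are then combined into the global (min, max).
    """
    if not values:
        raise ValueError("values must be non-empty")
    partials = [
        (min(values[i:i + chunk]), max(values[i:i + chunk]))
        for i in range(0, len(values), chunk)
    ]
    return min(p[0] for p in partials), max(p[1] for p in partials)


def histogram(values: Sequence[float], bins: int) -> List[int]:
    """Count values into `bins` equal-width bins spanning [min, max]."""
    lo, hi = chunked_min_max(values)
    if lo == hi:
        # Assumption for this sketch: widen a degenerate range so the
        # bin width is non-zero.
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls into the last bin instead of overflowing.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts
```

Computing the range once up front means the binning pass needs no further synchronization on the bounds, which is the usual motivation for splitting the work this way.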
24 Apr, 2023 2 commits

add 0D support for trace (#53208) · 9d90738c

Committed by GGBond8488 on Apr 24, 2023

* add 0D support for trace, test=allcase

* fix trace gpu kernel 0d error, test=allcase

* fix windows error, test=allcase

9d90738c

Add weighted sample (#52013) · 6a8d98e0
Committed by Siming Dai on Apr 24, 2023
```
Add paddle.geometric.weighted_sample_neighbors API
```
6a8d98e0
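
The commit above adds `paddle.geometric.weighted_sample_neighbors`, but the log does not show its signature or algorithm. The sketch below only illustrates the general idea of weighted neighbor sampling without replacement, using the Efraimidis-Spirakis key trick as an assumed technique. The function name `sample_neighbors_weighted` and the adjacency-dict layout are hypothetical and are not the Paddle API.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical graph layout: node -> list of (neighbor, edge_weight).
Adjacency = Dict[int, List[Tuple[int, float]]]


def sample_neighbors_weighted(adj: Adjacency, node: int, k: int,
                              rng: random.Random) -> List[int]:
    """Pick up to k distinct neighbors, favoring larger edge weights."""
    neighbors = adj.get(node, [])
    if any(w <= 0 for _, w in neighbors):
        raise ValueError("edge weights must be positive")
    if len(neighbors) <= k:
        # Fewer neighbors than requested: return all of them.
        return [n for n, _ in neighbors]
    # Efraimidis-Spirakis: draw u in [0, 1), use key u ** (1 / w),
    # and keep the k largest keys.
    keyed = [(rng.random() ** (1.0 / w), n) for n, w in neighbors]
    keyed.sort(reverse=True)
    return [n for _, n in keyed[:k]]
```

Passing an explicit `random.Random` instance keeps the sketch reproducible under a fixed seed.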

23 Apr, 2023 2 commits
- delete overwrite from gather_grad (#52707) · a32c1391
  Committed by zhangyuqin1998 on Apr 23, 2023
```
* delete overwrite from gather_grad

* fix

* Update gather_grad_kernel.cc
```
  a32c1391
- delete axis from elementwise_grad (#53202) · a3cd9cb9
  Committed by zhangyuqin1998 on Apr 23, 2023
```
* remove axis from elementwise_grad

* Update elementwise_sig.cc
```
  a3cd9cb9
22 Apr, 2023 1 commit

[Zero-Dim] support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy, test=allcase (#52850) · b406a7db

Committed by wangfengsheng1999 on Apr 22, 2023

* [Zero-Dim] support output 0D for is_empty/as_complex/, test=allcase

* [Zero-Dim] support output 0D for is_empty/as_complex/, test=allcase

* add test case

* modify dot/metric.accuracy/static.accuracy/static.auc

* modify inner/tensordot bug

* test 9 api

* [Zero-Dim] support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy, test=allcase

* fix bug

* support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy

* code style

* fix bug

* fix test_dot_op bug

* fix accuracy bug

* fix bug

* fix bug

* fix bug

* fix bug

* codestyle

* fix dot bug

* fix dot bug

* fix dot bug

* code style

* fix dot bug

* fix dot bug

* fix dot bug

* fix dot bug

* fix dot bug

* fix dot bug

* modify code

b406a7db

21 Apr, 2023 3 commits
- add deterministic embedding grad kernel (#50494) · 017254d6
  Committed by Shijie on Apr 21, 2023
```
* add deterministic embedding grad kernel

* minor change

* minor change

* Add new FLAG to enable deterministic embedding

* Update embedding deterministic kernel
```
  017254d6
- Add trace tests (#52954) · 3371747d
  Committed by co63oc on Apr 21, 2023
  
  3371747d
- Add unfold tests (#52963) · f8823c1a
  Committed by co63oc on Apr 21, 2023
  
  f8823c1a
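
The deterministic embedding grad commit above (017254d6) concerns reproducibility. Floating-point addition is not associative, so accumulating gradient rows for repeated ids in a varying order can change the low bits of the result; in double precision, `(0.1 + 0.2) + 0.3` and `0.1 + (0.2 + 0.3)` already differ. One common remedy is to fix the accumulation order. The plain-Python sketch below does that by visiting positions sorted by `(id, position)`. This ordering and the name `embedding_grad` are assumptions for illustration and are not necessarily what the Paddle kernel does.

```python
from typing import List, Sequence


def embedding_grad(ids: Sequence[int], grad_out: Sequence[Sequence[float]],
                   num_rows: int, dim: int) -> List[List[float]]:
    """Accumulate output gradients into embedding-table rows.

    Rows for the same id are always summed in ascending position order,
    so repeated runs give bit-identical results.
    """
    table_grad = [[0.0] * dim for _ in range(num_rows)]
    # Sorting by (id, position) fixes the order in which duplicates are added.
    for pos in sorted(range(len(ids)), key=lambda p: (ids[p], p)):
        row = table_grad[ids[pos]]
        for j in range(dim):
            row[j] += grad_out[pos][j]
    return table_grad
```

A parallel kernel that uses atomic adds gives no such ordering guarantee, which is why a separate deterministic path can be worth its extra cost.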
20 Apr, 2023 1 commit
- [FlashAttn] add flash randomness control (#52902) · 00ac8014
  Committed by Chitsing KUI on Apr 20, 2023
```
* add flash randomness control

* fix VLOG undefined
```
  00ac8014
19 Apr, 2023 2 commits
- Support Linear operation in cuBlaslt and plug into attn_gemm and fusedLinear backward op (#52028) · f6f18835
  Committed by limingshu on Apr 19, 2023
```
* first commit

* restructure c++ interface to divide linear from matmulwithcublaslt

* finish building in cublaslt impl

* fix code bugs

* fix host cost

* add some changes
```
  f6f18835
- fix graph_reindex (#52930) · e5506be6
  Committed by zhangyuqin1998 on Apr 19, 2023
```
* fix graph_reindex

* fix

* Update op_compat.yaml
```
  e5506be6
18 Apr, 2023 7 commits
- 【Hackathon No.60】Complete FP16/BF16 unit tests for the prelu, clip_by_norm, multi_dot operators (#52666) · c3055d23
  Committed by chenxujun on Apr 18, 2023
```
* Add prelu, clip_by_norm, multi_dot tests

* Fix code

* Fix code
```
  c3055d23
- [AMP OP&Test] Unique support float16&bfloat16 (#52995) · 1d37868f
  Committed by Zhang Zheng on Apr 18, 2023
```
* [AMP OP&Test] Unique support float16&bfloat16

* add test
```
  1d37868f
- reorder MatrixRank (#52925) · 00efdf84
  Committed by zhangyuqin1998 on Apr 18, 2023
```
* reorder MatrixRank

* fix

* fix

* fix

* fix

* fix
```
  00efdf84
- Add logspace tests (#52956) · 417e5baf
  Committed by chenxujun on Apr 18, 2023
  
  417e5baf
- 【Hackathon No.60】Complete FP16/BF16 unit tests for the randperm, split, split_with_num operators (#52683) · bc91012f
  Committed by chenxujun on Apr 18, 2023
```
* Add split, split_with_num tests

* Add randperm tests

* Fix code
```
  bc91012f
- Add index_add, index_sample, put_along_axis, take_along_axis tests (#52572) · 1eb30775
  Committed by chenxujun on Apr 18, 2023
  
  1eb30775
- reorder_prior_box (#52749) · a70d9db9
  Committed by zhangyuqin1998 on Apr 18, 2023
```
* reorder_prior_box

* fix
```
  a70d9db9
17 Apr, 2023 5 commits

[AMP OP&Test]Add BF16 implementation and unit tests of multinomial (#52898) · d19d2486
Committed by Vvsmile on Apr 17, 2023
```
* fix multinomial

* fix test_elementwise

* fix convert_float_to_uint16

* add test_multimial_op

* fix code style
```
d19d2486

【PaddlePaddle Hackathon 4 No.49】Add float16 data type support to Paddle bce_loss (#50930) · 44e6de98

Committed by thunder95 on Apr 17, 2023

* untracked files

* bce_loss_fp16

* remove unused files

* back max_rel_erro still big

* simplify code

* upd

* fix max_relative_error

* restart ci

* Update test_bce_loss.py

* Update test_bce_loss.py

* Update test_bce_loss.py

* Update test_bce_loss.py

* try to pass test

* restore file

* remove error value

* fix bug

---------
Co-authored-by: Zhang Ting <Douyaer2020@qq.com>

44e6de98

【Hackathon No.32】Optimize the GPU compute performance of the Paddle expand_as forward & backward ops (#52700) · 3c44e948

Committed by Hanchiao on Apr 17, 2023

* Implement optimized kernel for OP-expand_as.

* Support fp16.
Co-authored-by: Timber-Ye <ye_hanqiao@163.com>
Co-authored-by: BrianQian1999 <brianqianhitsz@gmail.com>

* remove fp16 support

* remove MAX_RANK_SUPPORTED

---------
Co-authored-by: BrianQian1999 <brianqianhitsz@gmail.com>

3c44e948

rename_SliceKernel (#52863) · d2b0d63f
Committed by zhangyuqin1998 on Apr 17, 2023

d2b0d63f

Add output defs for some kernelsPhi register (#52941) · 23f87442

Committed by Sonder on Apr 17, 2023

* add register info for eigh and eig_grad

* add sync_batch_norm_op.cu register info

* add lamb output register info

* add unique register info

* change type name

* change type name

* add output register info for check_finite_and_unscale

* update cmake and config file

* add register info for adagrad

* fix build error

* add sync to run_unittests.sh

* add register info for unique_consecutive

* fix build error

* add eigh to STATIC_BUILD_TESTS

* update eig_kernel.cc

* update eig_kernel.cc

* fix infer meta error

* fix unique register error

* fix lamb register info error

* fix lamb register info

* update lamb register info

* fix lamb

* remove one Output Register

* update static build file

* add eigh op to disable_wingpu_test

* update run_unittests

23f87442

14 Apr, 2023 8 commits
- [AMP OP&Test] Cumprod support fp16 and bf16 (#52919) · 8a850af6
  Committed by Zhang Zheng on Apr 14, 2023
  
  8a850af6
- 【Hackathon4 No58】logcumsum logsum (#51275) · 468869e4
  Committed by cyberslack_lee on Apr 14, 2023
  
  468869e4
- 【Hackathon4 No58】kthvalue (#51615) · 43efb979
  Committed by cyberslack_lee on Apr 14, 2023
  
  43efb979
- 【Hackathon No.62】Complete FP16/BF16 unit tests for the digamma, dirichlet operators (#52604) · 7ecbcc08
  Committed by chenxujun on Apr 14, 2023
```
* Add digamma, dirichlet tests

* Fix code
```
  7ecbcc08
- 【Hackathon No.55】add erf FP16 test and BF16 test (#52136) · eeb4d165
  Committed by superwinner1 on Apr 14, 2023
```
* add erf FP16 test
```
  eeb4d165
- Add angle,bmm tests (#52630) · 6d7ee668
  Committed by chenxujun on Apr 14, 2023
  
  6d7ee668
- [phi] move sequence_pool to phi - Step 2 : sequence_pool_op (#52750) · b281b221
  Committed by gouzil on Apr 14, 2023
```
* [phi] move sequence_pool kernel to phi

* [phi] mv sequence_pooling to phi funcs

* [phi] mv sequence_pooling_test

* [phi] RollBACK `paddle/fluid/operators/sequence_ops/sequence_pool_op.cc`

* [phi][funcs] fix mutable_data

* [phi][funcs] fix mutable_data
```
  b281b221
- delete unused param from swish_grad and relu6_grad (#52805) · 54e4360a
  Committed by zhangyuqin1998 on Apr 14, 2023
  
  54e4360a
13 Apr, 2023 3 commits
- 【Hackathon No.55】 add channel_shuffle FP16/BF16 support and tests (#51884) · 48ccb785
  Committed by superwinner1 on Apr 13, 2023
```
* No55 add channel_shuffle FP16/BF16 support and tests
```
  48ccb785
- 【Hackathon No57】add_fp16_bf16_for_dot & bf16_for_cross (#52426) · 205094f0
  Committed by Difer on Apr 13, 2023
```
* add_fp_bf_for_dot & bf_for_cross

* fix error

* fix some error

* fix some error

* change something

* fix magic number
```
  205094f0
- [AMP OP&Test] Support fp16&bf16 in reduce_max (#52862) · e0e044c0
  Committed by Zhang Zheng on Apr 13, 2023
```
* [AMP OP&Test] Support fp16&bf16 in reduce_max
```
  e0e044c0

PaddlePaddle / Paddle synced successfully about 1 year ago