Commits · 6a8d98e0bbca6b9c45b75bf7fe156330c0a995f7 · PaddlePaddle / Paddle

24 Apr, 2023 5 commits
- Add weighted sample (#52013) · 6a8d98e0
  Committed by Siming Dai on Apr 24, 2023
```
Add paddle.geometric.weighted_sample_neighbors API
```
  6a8d98e0
- Move fused feedforward xpu (#53196) · 83c2e682
  Committed by Sonder on Apr 24, 2023
```
* add sig file

* trans fused feedforward compute function to phi

* remove fluid include

* delete old register info

* fix build error

* trans fused feedforward grad xpu to phi
```
  83c2e682
- shared_external mermory add xpu (#53240) · d71615dc
  Committed by csy0225 on Apr 24, 2023
  
  d71615dc
- [Sparse]fix bug in paddle.sparse.transpose and paddle.sparse.reshape (#53038) · 15251291
  Committed by Zhan Rongrui on Apr 24, 2023
  
  15251291
- remove some [-Wunused-parameter] (#53185) · 834eb2ba
  Committed by Galaxy1458 on Apr 24, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test ,test=develop
```
  834eb2ba
23 Apr, 2023 4 commits
- delete overwrite from gather_grad (#52707) · a32c1391
  Committed by zhangyuqin1998 on Apr 23, 2023
```
* delete overwrite from gather_grad

* fix

* Update gather_grad_kernel.cc
```
  a32c1391
- [XPU] fc use int_with_ll_t (#53183) · 7634a18a
  Committed by houj04 on Apr 23, 2023
```
* [XPU] fc use int_with_ll_t

* fix test_unbind_op_xpu
```
  7634a18a
- delete axis from elementwise_grad (#53202) · a3cd9cb9
  Committed by zhangyuqin1998 on Apr 23, 2023
```
* remove axis from elementwise_grad

* Update elementwise_sig.cc
```
  a3cd9cb9
- remove some [-Wunused-parameter] (#53162) · b02687cc
  Committed by Galaxy1458 on Apr 23, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop
```
  b02687cc
22 Apr, 2023 1 commit

[Zero-Dim] support output 0D for... · b406a7db

Committed by wangfengsheng1999 on Apr 22, 2023

[Zero-Dim] support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy, test=allcase (#52850)

* [Zero-Dim] support output 0D for is_empty/as_complex/, test=allcase

* [Zero-Dim] support output 0D for is_empty/as_complex/, test=allcase

* add test case

* modify dot/metric.accuracy/static.accuracy/static.auc

* modify inner/tensordot bug

* test 9 api

* [Zero-Dim] support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy, test=allcase

* fix bug

* support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy

* code style

* fix bug

* fix test_dot_op bug

* fix accuracy bug

* fix bug

* fix bug

* fix bug

* fix bug

* codestyle

* fix dot bug

* fix dot bug

* fix dot bug

* code style

* fix dot bug

* fix dot bug

* fix dot bug

* fix dot bug

* fix dot bug

* fix dot bug

* modify code

b406a7db

21 Apr, 2023 7 commits
- support 0-D output and 0-D as indice in __getitem__/__setitem__ (#52814) · 4e939c89
  Committed by JYChen on Apr 21, 2023
```
* support 0-D output and 0-D as indice in __getitem__

* fix tests

* fix inference and UT

* add unittest for setitem

* fix xpu test

* fix xpu 0-d
```
  4e939c89
- add deterministic embedding grad kernel (#50494) · 017254d6
  Committed by Shijie on Apr 21, 2023
```
* add deterministic embedding grad kernel

* minor change

* minor change

* Add new FLAG to enable deterministic embedding

* Update embedding deterministic kernel
```
  017254d6
- Add trace tests (#52954) · 3371747d
  Committed by co63oc on Apr 21, 2023
  
  3371747d
- Add unfold tests (#52963) · f8823c1a
  Committed by co63oc on Apr 21, 2023
  
  f8823c1a
- Revert "remove ASCEND* keyword" (#53131) · 0f99debd
  Committed by ronnywang on Apr 21, 2023
```
* Revert "remove ASCEND* keyword (#53046)"

This reverts commit 7fa415ca.

* Delete ascend_trigger_op.cc

* revert-53046-remove_ASCEND_keyword

* update

* update
```
  0f99debd
- [cutlass] gather-gemm-scatter fusion on sm 75 (#53017) · 8a1cdc70
  Committed by umiswing on Apr 21, 2023
  
  8a1cdc70
- Update for fused linear grad add. (#53118) · 63c83870
  Committed by Yuang Liu on Apr 21, 2023
  
  63c83870
20 Apr, 2023 5 commits
- move_elementwise_raw (#53010) · 7a72f7a2
  Committed by zhangyuqin1998 on Apr 20, 2023
```
* setup

* Update elementwise_kernel.cc

* Update elementwise_kernel.cc

* fix

* fix

* Update elementwise_kernel.cu

* fix

* Update elementwise_kernel.cc

* Update elementwise_kernel.cc

* Update elementwise_kernel.cc

* Update elementwise_kernel.cc

* Update elementwise_kernel.cc

* Update elementwise_kernel.cc
```
  7a72f7a2
- [FlashAttn] add flash randomness control (#52902) · 00ac8014
  Committed by Chitsing KUI on Apr 20, 2023
```
* add flash randomness control

* fix VLOG undefined
```
  00ac8014
- [Zero-Dim] Support all/any/min/prod/logsumexp/amax/amin/some loss output 0D,test=allcase (#53051) · e6def1eb
  Committed by zhouweiwei2014 on Apr 20, 2023
  
  e6def1eb
- remove ASCEND* keyword (#53046) · 7fa415ca
  Committed by Wang Xin on Apr 20, 2023
```
* remove ASCEND* keyword

* update docstring

* bug fixed

* bug fixed
```
  7fa415ca
- Fix typos, test=document_fix (#53099) · 20a66bbf
  Committed by co63oc on Apr 20, 2023
  
  20a66bbf
19 Apr, 2023 5 commits
- Move fused_attention op to phi [Migrate XPU OpKernel] [ test=kunlun ] (#53011) · 7b56bd25
  Committed by Sonder on Apr 19, 2023
```
* trans fused attention to phi

* add optional parm

* trans fused_attention_grad to phi

* add fused attention grad register info

* fix include

* test=kunlun

* add fused attention to static build list

* add remove

* update remove
```
  7b56bd25
- fix bug for pool2d and pool2d_grad when kernel_size > in_h/in_w in xpu (#53043) · b1d3ec16
  Committed by zhangyikun02 on Apr 19, 2023
  
  b1d3ec16
- [XPU] add numel op (#53041) · 4812d8e4
  Committed by houj04 on Apr 19, 2023
  
  4812d8e4
- Support Linear operation in cuBlaslt and plug into attn_gemm and fusedLinear backward op (#52028) · f6f18835
  Committed by limingshu on Apr 19, 2023
```
* first commit

* restruct c++ interface to divide linear from matmulwithcublaslt

* finish building in cublaslt impl

* fix code bugs

* fix host cost

* add some changes
```
  f6f18835
- fix graph_reindex (#52930) · e5506be6
  Committed by zhangyuqin1998 on Apr 19, 2023
```
* fix graph_reindex

* fix

* Update op_compat.yaml
```
  e5506be6
18 Apr, 2023 9 commits
- 【Hackathon No.60】Improve FP16/BF16 unit tests for the prelu, clip_by_norm, multi_dot operators (#52666) · c3055d23
  Committed by chenxujun on Apr 18, 2023
```
* Add prelu, clip_by_norm, multi_dot tests

* Fix code

* Fix code
```
  c3055d23
- [AMP OP&Test] Unique support float16&bfloat16 (#52995) · 1d37868f
  Committed by Zhang Zheng on Apr 18, 2023
```
* [AMP OP&Test] Unique support float16&bfloat16

* add test
```
  1d37868f
- reorder MatrixRank (#52925) · 00efdf84
  Committed by zhangyuqin1998 on Apr 18, 2023
```
* reorder MatrixRank

* fix

* fix

* fix

* fix

* fix
```
  00efdf84
- Add logspace tests (#52956) · 417e5baf
  Committed by chenxujun on Apr 18, 2023
  
  417e5baf
- 【Hackathon No.60】Improve FP16/BF16 unit tests for the randperm, split, split_with_num operators (#52683) · bc91012f
  Committed by chenxujun on Apr 18, 2023
```
* Add split, split_with_num tests

* Add randperm tests

* Fix code
```
  bc91012f
- Add index_add, index_sample, put_along_axis, take_along_axis tests (#52572) · 1eb30775
  Committed by chenxujun on Apr 18, 2023
  
  1eb30775
- 【0D output】add 0D output support for linalg.slogdet (#52891) · a7155c5c
  Committed by GGBond8488 on Apr 18, 2023
```
* add 0D output support for linalg.slogdet,test=allcase

* fix zero dim test error test=allcase

* fix test error test=allcase

* add static backward test, test=allcase
```
  a7155c5c
- fix the set_value error in cpu (#49804) · 239dbc4e
  Committed by JYChen on Apr 18, 2023
```
* fix the set_value error in cpu

* add a unittest for set_value OP

* fix platform::is_gpu_place

* add todo note for set_value
```
  239dbc4e
- reorder_prior_box (#52749) · a70d9db9
  Committed by zhangyuqin1998 on Apr 18, 2023
```
* reorder_prior_box

* fix
```
  a70d9db9
17 Apr, 2023 4 commits

[Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a

Committed by zhoutianzi666 on Apr 17, 2023

* initial commit for cutlass_teller

* second commit for cutlass_teller

* add conv2d_depthwise python template

* add conv2d_depthwise cutlass template

* /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h

* refine code in Conv2dFusionCanSupport

* add macro in cutlass_teller.h

* add 3x3 5x5 teller

* add groups not 1 or conv2d_depthwise teller

* only generate conv2d_depthwise kernels whose ic is a multiple of 8

* add EXPLICIT in cutlass_teller.h

* final commit

* add split_k_slices in conv2d_depthwise

* make stages == 2

* refactor part of the code

* add CutlassFusionType

* solve illegal memory

* make stride_h=stride_w && make dilation==1

* must check HasAttr(use_cutlass) before GetAttrIfExists

* add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String

* modify decl.h and util.cu

bd3b096a

[Fused] controlled randomness for fused dropout add (#52903) · e36f80c6
Committed by Chitsing KUI on Apr 17, 2023
```
* add random control for fused dropout add

* add __init__
```
e36f80c6
[AMP OP&Test]Add BF16 implementation and unit tests of multinomial (#52898) · d19d2486
Committed by Vvsmile on Apr 17, 2023
```
* fix multinomial

* fix test_elementwise

* fix convert_float_to_uint16

* aadd test_multimial_op

* fix code style
```
d19d2486

【PaddlePaddle Hackathon 4 No.49】: Support the float16 data type for Paddle bce_loss (#50930) · 44e6de98

Committed by thunder95 on Apr 17, 2023

* untracked files

* bce_loss_fp16

* remove unused files

* back max_rel_erro still big

* simplify code

* upd

* fix max_relative_error

* restart ci

* Update test_bce_loss.py

* Update test_bce_loss.py

* Update test_bce_loss.py

* Update test_bce_loss.py

* try to pass test

* restore file

* remove error value

* fix bug

---------
Co-authored-by: NZhang Ting <Douyaer2020@qq.com>

44e6de98

PaddlePaddle / Paddle: synced successfully about 1 year ago
