Commits · 77298931d14c102ae186f0d67fe077e624f35512 · PaddlePaddle / Paddle

Feb 27, 2023 · 21 commits
- support fp16 in broadcast (#50905) · 77298931
  Committed by 张春乔 on Feb 27, 2023
- fix fp16 dtype checking for clip op (#50878) · d832a54d
  Committed by haozi on Feb 27, 2023
```
* fix fp16 dtype checking for clip op

* modify the name

* fix type error

* fix check error

* Update test_clip_op.py

fix test error

* Update test_clip_op.py

fix code style

---------
Co-authored-by: Zhang Ting <Douyaer2020@qq.com>
```
- fix fp16 dtype checking for conj op (#50868) · 6b85eb59
  Committed by Infinity_lee on Feb 27, 2023
- [Error Msg] Polish error message when GPU kernel not found (#50880) · 3e9ffaef
  Committed by HongyuJia on Feb 27, 2023
```
* [Error Msg] Polish error message when GPU kernel not found

* Only test in GPU environment
```
- [bug fix] fix fp16 dtype checking for argmax op (#50811) · f3aec871
  Committed by Zhang Ting on Feb 27, 2023
```
* fix fp16 dtype checking for argmax op

* run fp16 test when place is gpu

* Update search.py

fix doc
```
- [fp16] fix fp16 support for nn.PairwiseDistance (#50849) · 587120ec
  Committed by Ainavo on Feb 27, 2023
- fix fp16 dtype checking for paddle.diag API (#50848) · ebea0885
  Committed by 陈沧夜 on Feb 27, 2023
- [fp16] support fp16 input in nansum (#50847) · 9951b86f
  Committed by 张春乔 on Feb 27, 2023
```
* add float16 in python/paddle/math

* add unittest for float16

* add float16 support in python.paddle.tensor.search.where

* remove fp16 error cases

* Add NotImplementedError unittest

* fix codestyle

* fluid to paddle.static; add cases with GPU

* Add float16 in English docs
```
- Reduce redundant cpu computation in slice compute (#50348) · 8aec0580
  Committed by Bo Zhang on Feb 27, 2023
```
* conflict

* add UpdateSliceAttrs
```
- change message info (#50546) · 097402d9
  Committed by gaoziyuan on Feb 27, 2023
- revert operator.cc (#50895) · ec814cf5
  Committed by csy0225 on Feb 27, 2023
- add prim test for sqrt and exp (#50942) · cf209204
  Committed by Charles-hit on Feb 27, 2023
- [kunlun] support reduce_scatter (#50792) · 6786c012
  Committed by jameszhang on Feb 27, 2023
```
* [kunlun] support reduce_scatter

* uncomment unittest

* update xccl to 1.0.10
```
- Add PADDLE_THROW in ToCudaDataType and polish codes. (#50922) · 2eeaaa7d
  Committed by Yiqun Liu on Feb 27, 2023
- revert reshape 0 represent copy and support perm < 0 for paddle.transpose (#50720) · 3669868d
  Committed by zhouweiwei2014 on Feb 27, 2023
- [IR] Type system stage2: add class Type, type uniquer utils, class IRContext (#50412) · a5827f0e
  Committed by zhangbo9674 on Feb 27, 2023
```
* add TypeUniquer and IrContext

* refine include code

* add Type, TypeBase

* add built-in type

* add built-in Float32Type

* refine ut

* refine code

* refine code

* delete type_base

* rename ImplType to StorageType

* rename ImplType to StorageType

* add macros util for register type

* add macros util for register type

* refine name

* refine name

* change storage manager

* add multi_thread for ir_ctx

* rwlock_2_spinlock, add REGISTER_TYPE_2_IRCONTEXT

* DECLARE_TYPE_UTILITY_FUNCTOR

* refine ircontext singleton

* del destructor for ParametricStorageManager

* refine code

* Add necessary logs for debugging

* refine ir_context instance

* refine type get interface

* refine code by comment
```
- xpu: bind op scatter_nd_add. add data type for transpose2, clip & assign_value (#50825) · 0d12afea
  Committed by wangshengxiang on Feb 27, 2023
```
* [XPU] bind op scatter_nd_add

* [XPU] add more data type for op: clip, transpose2 & assign_value
```
- [AutoParallel] add dist_attr in data_parallel optimization (#49744) · a36cdd6b
  Committed by zhaoyingli on Feb 27, 2023
```
* fix dist_attr in data_parallel in optimization

* fix grad_clip pass when pp2

* fix dist_attr
```
- [Bfloat16]register bfloat16 datatype for squared l2 norm (#50908) · 3c121040
  Committed by shaojie_wang on Feb 26, 2023
```
* register bfloat16 datatype for squared l2 norm

* register bfloat16 datatype for softmax with upper triangular mask

* register bfloat16 for tril triu cuda kernel
```
- [mv fleet] mv fleet to distributed (#50834) · 5d322ced
  Committed by wangzhen38 on Feb 27, 2023
```
* [mv fleet] mv fleet to distributed

* [mv fleet] for ci

* [mv fleet] for ci

* [mv fleet] solve ci of version
```
- [AutoParallel] fix set_grad_var_shape (#50722) · 76c495d7
  Committed by zhaoyingli on Feb 27, 2023
```
* fix set_grad_var_shape

* recover modify
```
Feb 26, 2023 · 2 commits

Matmul performance optimization with cuBlasLt (#46431) · d4217fc6

Committed by limingshu on Feb 26, 2023


* implement of matmul using cublasLt instead of cublas

* Update matmul_kernel_impl_via_blasLt.h

---------
Co-authored-by: zhangbopd <1299246947@qq.com>
Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>


Enable matmul + bias fusion in fused_gat_attention. (#50755) · 57f6a469

Committed by Yiqun Liu on Feb 26, 2023

* Enable matmul + bias fusion in fused_gat_attention.

* Add a variable to control whether using fused matmul + bias.


Feb 25, 2023 · 3 commits

Support 0D for equal tensor with scalar (#50857) · 7c73910e
Committed by zhouweiwei2014 on Feb 25, 2023


change outputs and grads from fp16-fp16-comparison and fp16-fp32 (#50700) · 2dec64d0

Committed by Vvsmile on Feb 25, 2023

* change outputs and grads from fp16-fp16-comparison and fp16-fp32 comparison

* support grad comparison fp16-fp32

* the change of reference dtype only occurred from np.float16 to np.float32

* fix the list type can not infer the dtype by attribute dtype by transfer the list to array

* adjust the default atol and rtol of float16 to 1e-3

* Polish code

* fix error

* fix

* Polish code

* fix the _is_cal_ref and np.float16

* fix the combination of is_calc_ref and np.float16

* remove unuseful codes in op_test.py

* fix ci

* fix the rtol set in the dygraph checker and eager checker

---------
Co-authored-by: ZzSean <18818272991@163.com>


Rename elementwise_heaviside to heaviside (#50821) · 8129c22e
Committed by zyfncg on Feb 25, 2023
```
* rename elementwise_heaviside to heaviside

* delete __init__.py

* fix bug
```

Feb 24, 2023 · 14 commits
- [Zero-Dim] Support 0D Tensor input for topk/broadcast_to/expand/expand_as/broadcast_shape (#50536) · 5041158f
  Committed by yunyaoXYY on Feb 24, 2023
- Fix typos (#50852) · 4a0855a5
  Committed by chenxujun on Feb 24, 2023
- Revert grad scale optimization pr (#50839) · 8a503522
  Committed by Weilong Wu on Feb 24, 2023
```
* Revert "fixoptminizer _set_auxiliary_var bug (#50335)"

This reverts commit c44005f0.

* Revert "refine optimizer create accumulators (#50188)"

This reverts commit 244e7546.

* Revert "fix found_inf bug for custom optimizer (#50158)"

This reverts commit 64573f9f.

* Revert "refine amp scaler found_inf (#49864)"

This reverts commit 382e9a06.

* fix code format

* fix conflict
```
- dynamic graph tests (#50572) · 09694f82
  Committed by 姜永久 on Feb 24, 2023
```
* fix

* and others

* more ops

* reset distribute_fpn and precision_recall

* reset fc

* modify arange test

* modify reshape&reduce

* add fill_any and sigmoid_cross_entropy

* reset linear_interp_v2

* reset reduce

* modify

* modify arange

* modify cast
```
- [Paddle-TRT] allow plugin fall back to fp16 when int8 (#50554) · f24eadd9
  Committed by zhoutianzi666 on Feb 24, 2023
```
* allow fall back to fp16 when int8

* refine code

* refine code

* refine code
```
- Fused ops converter (#50751) · 9429936c
  Committed by Sławomir Siwek on Feb 24, 2023
```
* ConvertToFusedOp

* change static to inline
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
```
- Fix KP operator Kernel selection error (#50178) · 6ef3f2ce
  Committed by niuliling123 on Feb 24, 2023
- 【Prim】Fix prim amp (#50518) · 6664a232
  Committed by Jiabin Yang on Feb 24, 2023
```
* change amp with to_prim

* fix prim amp

* fix rules

* fix linear

* add amp test

* add test

* disable this test on cpu

* disable this test on cpu

---------
Co-authored-by: cyber-pioneer <chenzhuo@tju.edu.cn>
```
- fix composite grad maker code gen (#50854) · 07c416c8
  Committed by Charles-hit on Feb 24, 2023
- Fix libpaddle_inference.so symbol conflicts with other .so (gflags) (#50787) · 041ea14c
  Committed by Yuanle Liu on Feb 24, 2023
- fix setup.py (#50800) · bda59b1b
  Committed by YUNSHEN XIE on Feb 24, 2023
- support 'backend' in static ops (#50671) · 363825df
  Committed by HappyHeavyRain on Feb 24, 2023
```
* support 'backend' in static ops

* change bitwise_xx comment in python

* change bitwise_xxx comment in python

* change 'backend' and 'data_type' in GetExpectedKernelType
```
- supplement header file's code (#50826) · 92cae577
  Committed by YuanRisheng on Feb 24, 2023
- Add bert prim and cinn test (#50545) · bfa217e4
  Committed by WangZhen on Feb 24, 2023
```
* Add bert prim and cinn test
```

PaddlePaddle / Paddle · synced successfully more than 1 year ago