- 26 Feb 2023, 2 commits
-
Committed by limingshu
* Implement matmul using cublasLt instead of cublas
* Update matmul_kernel_impl_via_blasLt.h
Co-authored-by: zhangbopd <1299246947@qq.com>
Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by Yiqun Liu
* Enable matmul + bias fusion in fused_gat_attention.
* Add a variable to control whether to use fused matmul + bias.
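For context on what this fusion covers: the pattern being replaced is a GEMM immediately followed by a bias add, computed as one kernel instead of two. The sketch below is only an illustrative NumPy reference of that math (the shapes and names are hypothetical); the actual fused kernel lives in the C++ op and is not shown here.

```python
import numpy as np

def matmul_bias_reference(x, w, b):
    # Unfused, this is two kernel launches (matmul, then elementwise add);
    # the fused path computes the same y = x @ w + b in a single kernel.
    return np.matmul(x, w) + b

x = np.random.rand(2, 4, 8).astype(np.float32)   # hypothetical activations
w = np.random.rand(8, 16).astype(np.float32)     # hypothetical weights
b = np.random.rand(16).astype(np.float32)        # hypothetical bias
print(matmul_bias_reference(x, w, b).shape)      # (2, 4, 16)
```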
-
- 25 Feb 2023, 3 commits
-
Committed by zhouweiwei2014
-
Committed by Vvsmile
* Change outputs and grads from fp16-fp16 comparison to fp16-fp32 comparison
* Support fp16-fp32 grad comparison
* The change of reference dtype only occurs from np.float16 to np.float32
* Fix the case where a list cannot infer the dtype from its dtype attribute by converting the list to an array
* Adjust the default atol and rtol of float16 to 1e-3
* Polish code
* Fix error
* Fix
* Polish code
* Fix the _is_cal_ref and np.float16
* Fix the combination of is_calc_ref and np.float16
* Remove unused code in op_test.py
* Fix CI
* Fix the rtol set in the dygraph checker and eager checker
Co-authored-by: ZzSean <18818272991@163.com>
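The commit above makes float16 outputs and gradients get checked against a float32 reference with relaxed default tolerances (atol/rtol 1e-3). A minimal sketch of that comparison policy, assuming a plain NumPy check rather than the real op_test.py machinery:

```python
import numpy as np

def check_fp16_against_fp32(out_fp16, ref_fp32, atol=1e-3, rtol=1e-3):
    # Upcast the fp16 kernel output and compare against the fp32 reference
    # using the relaxed float16 defaults mentioned in the commit message.
    out = np.asarray(out_fp16, dtype=np.float32)
    ref = np.asarray(ref_fp32, dtype=np.float32)
    return np.allclose(out, ref, rtol=rtol, atol=atol)

ref = np.random.rand(8).astype(np.float32)   # hypothetical fp32 reference
out = ref.astype(np.float16)                 # simulated fp16 kernel output
assert check_fp16_against_fp32(out, ref)
```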
-
Committed by zyfncg
* Rename elementwise_heaviside to heaviside
* Delete __init__.py
* Fix bug
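The rename touches the operator name; the user-facing Python API is paddle.heaviside (the element-wise Heaviside step function, where the second argument supplies the value used at x == 0). A short usage sketch:

```python
import paddle

x = paddle.to_tensor([-0.5, 0.0, 2.0])
y = paddle.to_tensor([1.0, 2.0, 3.0])   # values returned where x == 0
out = paddle.heaviside(x, y)
print(out.numpy())                      # [0. 2. 1.]
```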
-
- 24 Feb 2023, 25 commits
-
Committed by yunyaoXYY
-
Committed by chenxujun
-
Committed by Weilong Wu
* Revert "fixoptminizer _set_auxiliary_var bug (#50335)"; this reverts commit c44005f0.
* Revert "refine optimizer create accumulators (#50188)"; this reverts commit 244e7546.
* Revert "fix found_inf bug for custom optimizer (#50158)"; this reverts commit 64573f9f.
* Revert "refine amp scaler found_inf (#49864)"; this reverts commit 382e9a06.
* Fix code format
* Fix conflict
-
Committed by 姜永久
* Fix
* And others
* More ops
* Reset distribute_fpn and precision_recall
* Reset fc
* Modify arange test
* Modify reshape & reduce
* Add fill_any and sigmoid_cross_entropy
* Reset linear_interp_v2
* Reset reduce
* Modify
* Modify arange
* Modify cast
-
Committed by zhoutianzi666
* Allow falling back to fp16 when int8
* Refine code
* Refine code
* Refine code
-
Committed by Sławomir Siwek
* ConvertToFusedOp
* Change static to inline
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by niuliling123
-
Committed by Jiabin Yang
* Change amp with to_prim
* Fix prim amp
* Fix rules
* Fix linear
* Add amp test
* Add test
* Disable this test on CPU
* Disable this test on CPU
Co-authored-by: cyber-pioneer <chenzhuo@tju.edu.cn>
-
Committed by Charles-hit
-
Committed by Yuanle Liu
-
Committed by YUNSHEN XIE
-
Committed by HappyHeavyRain
* Support 'backend' in static ops
* Change bitwise_xx comment in Python
* Change bitwise_xxx comment in Python
* Change 'backend' and 'data_type' in GetExpectedKernelType
-
Committed by YuanRisheng
-
Committed by WangZhen
* Add bert prim and cinn test
-
Committed by xiaoguoguo626807
* Support prim test in OpTest
* Fix cmake
* Fix op test
* Fix test_input_spec
* Disable cinn in reduce_sum unit test
* Add bfloat16 dtype for sum
* Add approve rules
* Polish code
* Add clear jit program function
* Convert grad out from tensor to numpy
* Remove unnecessary code
* Add only_prim flag
* Fix flag
* Fix op test
* Add attr
* Fix optest comp inplace error
* Fix op test
* Fix op test with guard
* Add initialization of check_comp flag
* Fix comp inplace error in op test
* Rename check_comp to check_prim and add bfloat16 dtype convert
* Rename comp_op_type to prim_op_type
* Rename comp to prim
* Remove useless code
* Skip CI check for only prim
* Add no_grad_vars and grad_outputs in prim test
* Fix var_dict
* Fix op test for only_prim
* Fix dy2static bugs
* Polish some code
* Temp
* Modify op test
* Except cinn test
* Modify bfp16
* Modify pad grad
* Add pad_grad dtype
* Start cinn part
Co-authored-by: Charles-hit <wanghao107@baidu.com>
-
Committed by HongyuJia
-
Committed by xiongkun
-
Committed by cyber-pioneer
* Fix loss of attrs when creating op
* Add comment
* Add case
* Add case
* Remove unused case setting
-
Committed by YuanRisheng
* Polish translated layer
* Polish code according to comments
-
Committed by pangengzheng
* Change protobuf version in pslib mode and link libjvm.so for libps.so
* Keep protobuf version the same as pslib and enable compiling with pslib
-
Committed by ronnywang
* [XPU] Add expand_grad, isnan, meshgrid kernels
* Update
-
Committed by zhoutianzi666
* Fix multihead
* Fix multihead
-
Committed by Aurelius84
* [CINN] Enhance CacheKey hash logic by considering input dtypes
* Add unittest
* Fix typo
* Fix typo
* Fix map.at
* Fix find
* Fix test
* Fix cinn cache key structure realize
* Use ordered map for attributes
* Add test per review advice
Co-authored-by: jiangcheng <thisjiang@qq.com>
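To illustrate the idea of the first item above (not the actual C++ CacheKey class): a compilation cache keyed only on shapes lets programs that differ just in input dtype collide, so the dtypes, along with deterministically ordered attributes, are folded into the key as well. A minimal sketch, assuming a tuple-based key in Python with hypothetical names:

```python
def make_cache_key(input_shapes, input_dtypes, attrs):
    # Hashing shapes alone lets float32 and float16 graphs collide; folding
    # the input dtypes (and deterministically ordered attributes) into the
    # key keeps their compiled programs distinct.
    ordered_attrs = tuple(sorted(attrs.items()))
    return hash((tuple(map(tuple, input_shapes)),
                 tuple(input_dtypes),
                 ordered_attrs))

key_fp32 = make_cache_key([(4, 8)], ["float32"], {"axis": -1})
key_fp16 = make_cache_key([(4, 8)], ["float16"], {"axis": -1})
print(key_fp32 != key_fp16)   # True: dtype now differentiates the keys
```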
-
Committed by Hui Zhang
-
Committed by Paulina Gacek
* Get rid of save_quant_model
* Review changes
-
- 23 Feb 2023, 10 commits
-
Committed by Hui Zhang
-
Committed by limingshu
-
Committed by csy0225
-
Committed by HongyuJia
* Change phi tensor_gen -> tensor_operants_gen
* [Tensor API] Support multiple Tensor C++ APIs
* [Tensor API] Unsupport prob Tensor API
* Accept reviewers' comments on #50731
* Delete tensor_api.yaml
-
Committed by Huang Jiyi
* Move fluid generator to phi
* Move fluid generator to phi
* Update .gitignore
* Fix bugs
* Fix cannot find "glog/logging.h" in "generator.h"
* Fix bugs
-
Committed by limingshu
* First commit
* Main code has been developed
* Fix all bugs
* Add vectorized input & output
* Add a test for optimization_of_layer_norm_fwd
* Add some changes
* Fix memory-coalesced access for further optimization
* Fix additional ctest error
* Fix according to ci-approval
* Remove change on slice
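For reference on what the optimized kernel computes (the commit above changes how the CUDA forward kernel reads and writes memory, not the math): layer_norm forward normalizes over the last dimension and then applies scale and bias. A NumPy sketch of that reference math, with hypothetical shapes:

```python
import numpy as np

def layer_norm_fwd_reference(x, scale, bias, epsilon=1e-5):
    # Normalize over the last dimension, then apply scale and bias; the CUDA
    # kernel in this commit computes the same result with vectorized,
    # coalesced memory access.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + epsilon) * scale + bias

x = np.random.rand(16, 256).astype(np.float32)   # hypothetical input
scale = np.ones(256, dtype=np.float32)
bias = np.zeros(256, dtype=np.float32)
print(layer_norm_fwd_reference(x, scale, bias).shape)  # (16, 256)
```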
-
Committed by risemeup1
-
Committed by RuohengMa
* Fix accuracy diff issue when XPU op batch_norm is added to the XPU blacklist
* Remap op output tensor to input tensor when the op has fallen back to CPU
* Rename function name and fix bug caused by InplaceCounter
-
Committed by risemeup1
-
Committed by duanyanhui
-