Commits · 3669868d903b73e686e682a196ae20e9338efdbc · PaddlePaddle / Paddle

Feb 27, 2023 (3 commits)
- revert reshape 0 represent copy and support perm < 0 for paddle.transpose (#50720) · 3669868d
  Committed by zhouweiwei2014 on Feb 27, 2023
- xpu: bind op scatter_nd_add. add data type for transpose2, clip & assign_value (#50825) · 0d12afea
  Committed by wangshengxiang on Feb 27, 2023
```
* [XPU] bind op scatter_nd_add

* [XPU] add more data type for op: clip, transpose2 & assign_value
```
- [Bfloat16]register bfloat16 datatype for squared l2 norm (#50908) · 3c121040
  Committed by shaojie_wang on Feb 26, 2023
```
* register bfloat16 datatype for squared l2 norm

* register bfloat16 datatype for softmax with upper triangular mask

* register bfloat16 for tril triu cuda kernel
```
Feb 26, 2023 (2 commits)

Matmul performance optimization with cuBlasLt (#46431) · d4217fc6

Committed by limingshu on Feb 26, 2023


* implement of matmul using cublasLt instead of cublas

* Update matmul_kernel_impl_via_blasLt.h

---------
Co-authored-by: zhangbopd <1299246947@qq.com>
Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

Enable matmul + bias fusion in fused_gat_attention. (#50755) · 57f6a469

Committed by Yiqun Liu on Feb 26, 2023

* Enable matmul + bias fusion in fused_gat_attention.

* Add a variable to control whether using fused matmul + bias.

Feb 25, 2023 (1 commit)
- Rename elementwise_heaviside to heaviside (#50821) · 8129c22e
  Committed by zyfncg on Feb 25, 2023
```
* rename elementwise_heaviside to heaviside

* delete __init__.py

* fix bug
```
Feb 24, 2023 (8 commits)

[Zero-Dim] Support 0D Tensor input for topk/broadcast_to/expand/expand_as/broadcast_shape (#50536) · 5041158f
Committed by yunyaoXYY on Feb 24, 2023

Fix KP operator Kernel selection error (#50178) · 6ef3f2ce
Committed by niuliling123 on Feb 24, 2023

Fix libpaddle_inference.so symbol conflicts with other .so (gflags) (#50787) · 041ea14c
Committed by Yuanle Liu on Feb 24, 2023

support 'backend' in static ops (#50671) · 363825df

Committed by HappyHeavyRain on Feb 24, 2023

* support 'backend' in static ops

* change bitwise_xx comment in python

* change bitwise_xxx comment in python

* change 'backend' and 'data_type' in GetExpectedKernelType

supplement header file's code (#50826) · 92cae577
Committed by YuanRisheng on Feb 24, 2023

【prim】Slice grad (#50771) · f6dea800

Committed by xiaoguoguo626807 on Feb 24, 2023

* support prim test in OpTest

* fix cmake

* fix op test

* fix test_input_spec

* disable cinn in reduce_sum unit test

* add bfloat16 dtype for sum

* add approve rules

* polish code

* add clear jit program function

* convert grad out from tensor to numpy

* remove unnecessary code

* add only_prim flag

* fix flag

* fix op test

* add attr

* fix optest comp inplace error

* fix op test

* fix op test with guard

* add initialization of check_comp flag

* fix comp inplace error in op test

* rename check_comp with check_prim and add bfloat16 dtype convert

* rename comp_op_type to prim_op_type

* rename comp to prim

* remove useless code

* skip ci check for only prim

* add no_grad_vars and grad_outputs in prim test

* fix var_dict

* fix op test for only_prim

* fix dy2static bugs

* polish some code

* temp

* modify op test

* except cinn test

* modify bfp16

* modify pad grad

* add pad_grad dtype

* start cinn part

---------
Co-authored-by: Charles-hit <wanghao107@baidu.com>

[Tensor Operants & Prim] Tensor arithmetic operants support left scalar type (#50840) · 0d956e17
Committed by HongyuJia on Feb 24, 2023

[XPU] add expand_grad, isnan, meshgrid kernels (#50774) · 7271de88
Committed by ronnywang on Feb 24, 2023
```
* [XPU] add expand_grad, isnan, meshgrid kernels

* update
```

Feb 23, 2023 (12 commits)

first commit (#50808) · a1e96e47
Committed by limingshu on Feb 23, 2023

[XPU] Migrate xpu_embedding_with_eltwise_add_fuse_pass (#50590) · 8d325d82
Committed by csy0225 on Feb 23, 2023

[Tensor API & Prim-Relevant] Unsupport prob Tensor API (#50756) · d7673e2f

Committed by HongyuJia on Feb 23, 2023

* change phi tensor_gen->tensor_operants_gen

* [Tensor API] Support multiple Tensor C++ api

* [Tensor API] Unsupport prob Tensor API

* accept reviewers comment of #50731

* delete tensor_api.yaml

[phi decoupling] move generator implementation from fluid to phi (#50746) · 4e417409

Committed by Huang Jiyi on Feb 23, 2023

* move fluid generator to phi

* move fluid generator to phi

* update .gitignore

* fix bugs

* fix cannot find "glog/logging.h" in "generator.h"

* fix bugs

[OptionalOptimization]: LayerNorm forward Optimization with Welford (#50362) · 746b774b

Committed by limingshu on Feb 23, 2023

* first commit

* main codes has been developed

* fix all bugs

* add vectorize input&output

* a test for optimization_of_layer_norm_fwd

* add some changes

* fix memory coalesced access for more optimization.

* fix addition ctest error

* fix according to ci-approval

* remove change on slice

fix bug that touch __init__.py (#50793) · e1956ab5
Committed by risemeup1 on Feb 23, 2023

[Paddle C++ API] Remapping input and output tensors after XPU op has fallen back to CPU op (#50625) · f7b45b3e

Committed by RuohengMa on Feb 23, 2023

* fix accurary diff issue when XPU op batch_norm is added to XPU blacklist

* remap op output tensor to input tensor when the op has fallen back to CPU

* rename function name and fix bug causing by InplaceCounter

[Tensor Operants & Prim] Tensor arithmetic operants support right scalar type (#50563) · 5f5a2082

Committed by HongyuJia on Feb 23, 2023

* polish namespace

* change static_tensor_operants

* polish namespace

* support add, subtract, divide

* add unit test

* polish unittest

* fix cmake error

* solve conflicts, merge auto code-gen

* add scalar operator in tensor.h

* tensorbase

* static prim full support more datatype

* fix prim unittest

* polish codes

* fix cmake error

[PHI Decoupling]Remove Profiler header (Part3) (#50721) · 8476c552
Committed by YuanRisheng on Feb 23, 2023
```
* move profiler

* fix compile bugs
```

Support 'complex promote' in yaml (#50611) · 91a3d159

Committed by HappyHeavyRain on Feb 23, 2023

* support 'complex promote' in yaml

* change the compplex_promote

* change 'kron' in math.py

* change 'kron' comment in python

* change kron comment in python

* change kron comment in python

【Prim】Enhance gather vjp (#50786) · dca3a099

Committed by Jiabin Yang on Feb 23, 2023

* tmp gather vjp

* support gather

* remove useless code

* fix compiling error

* fix ut

* add eager test

* add eager test

* add seed

* fix cpu error

* fix transpose op compat

* remove tensor index case

* fix prim_cinn

* fix ut

* add gather composite

kunlun support c_softmax_with_cross_entropy (#49934) · f43b5fe5

Committed by jameszhang on Feb 23, 2023

* kunlun support c_softmax_with_cross_entropy

* fix grad calc error

* replace mutable_data() and ShareDataWith()

* update xdnn

* update xpu toolchain to 20230215

* remove fluid from test file

Feb 22, 2023 (9 commits)
- [Tensor API] Support multiple Tensor C++ api (#50731) · 652d12cc
  Committed by HongyuJia on Feb 22, 2023
```
* change phi tensor_gen->tensor_operants_gen

* [Tensor API] Support multiple Tensor C++ api
```
- [Win]fix compile error due to depend xxhash (#50760) · a35dbc29
  Committed by zhouweiwei2014 on Feb 22, 2023
- fix ninja and make incremental compiling error (#50616) · 433c2ffb
  Committed by risemeup1 on Feb 22, 2023
- Fix some typos. (#50429) · 93b2bf4b
  Committed by Shuangchi He on Feb 22, 2023
```
* Fix some typos.
Signed-off-by: Yulv-git <yulvchi@qq.com>

* pre-commit
Signed-off-by: Yulv-git <yulvchi@qq.com>

---------
Signed-off-by: Yulv-git <yulvchi@qq.com>
```
- [XPU] link out_max to x_max between xpu_fusion_ops (#50690) · 1fd1c169
  Committed by zhupengyang on Feb 22, 2023
- [Fix Typo] Fix typo error, implemention->implementation (#50495) · 7d077000
  Committed by HongyuJia on Feb 22, 2023
```
* fix py::array_t calling bug

* fix typo, implemention->implementation, test=document_fix
```
- [sparse]Fix mask_kernel name (#50713) · cf95db58
  Committed by zhangkaihuo on Feb 22, 2023
- 【Prim】Add gather vjp (#50305) · 4db8e5c7
  Committed by Jiabin Yang on Feb 22, 2023
```
* tmp gather vjp

* support gather

* remove useless code

* fix compiling error

* fix ut

* add eager test

* add eager test

* add seed

* fix cpu error

* fix transpose op compat

* remove tensor index case

* fix prim_cinn

* fix ut
```
- [XPU] add fp16 support for assign. update xccl to 1.0.9. (#50702) · 613a3ffe
  Committed by houj04 on Feb 22, 2023
Feb 21, 2023 (5 commits)

Support bw invoke fw (#50260) · d8845735

Committed by HappyHeavyRain on Feb 21, 2023

* support bw invoke fw

* fix scale in static_backward.yaml

* fix the bug in tensorrt/convert

* move 'scale','sign' into ops.yaml

* add scale_grad of scale in op_compat.yaml

* change generated_static_op in CMakeLists.txt

[Prim] Add op map (#50673) · 3c7e94d6

Committed by cyber-pioneer on Feb 21, 2023

* fix flatten op map

* remove prim op all list

* add op map info of full_like

* polish code

add c_reduce_sum/unstack/all_reduce_datatype for kunlun (#50606) · 397c9403
Committed by QingshuChen on Feb 21, 2023

[PHI Decoupling]Remove memory header (Part1) (#50419) · 1cfcb71d

Committed by YuanRisheng on Feb 21, 2023

* decouple_memory

* perfect memory utils

* fix ci bugs

* fix inference bugs

* fix custom test bugs

* fix converage bugs

* modify code according comment

* modify namespace

* deal with compile bugs

[phi decoupling] move sequence_padding from fluid to phi (#50639) · 5f443601

Committed by Huang Jiyi on Feb 21, 2023

* move sequence_padding to phi

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix buga

* fix bugs

* revert and update phi::XPUContext

PaddlePaddle / Paddle: synced successfully more than a year ago