Commits · f265a31324493e5cf426909f109cfd62f922060a · PaddlePaddle / Paddle

Feb 28, 2023 (4 commits)
- [XPU] support convert fp16 model (#50790) · f265a313
  Committed by zhupengyang on Feb 28, 2023
- xpu gaussian_random support fp16 (#50881) · 569b018e
  Committed by shentanyue on Feb 28, 2023
- xpu-paddlepaddle-57 [Task] adamw lr_radio support (#50979) · dda74715
  Committed by taixiurong on Feb 28, 2023
- 【Prim】Reshape, transpose, cast vjp (#50778) · ab1b6303
  Committed by Jiabin Yang on Feb 28, 2023
```
* support transpose and reshape

* support reshape, transpose, cast vjp

* merge develop

* recover unused file

* remove prim base

* support problem

* remove additional status setting

* remove additional status setting

* fix ut

* fix ut

* fix ut

* fix no grad branch

* add more test

* disable fp16 in cpu

* fix test
```
Feb 27, 2023 (8 commits)

[XPU] add fp16 support for shape and lookup_table_v2 op. (#50773) · d2a0577a

Committed by houj04 on Feb 27, 2023

* [XPU] add fp16 support for shape op.

* [XPU] add fp16 support for lookup_table_v2 op.

* update approval list: add qingshu's id.

【Hackathon No.68】Remove utils in phi (#50833) · 6c181d1d

Committed by 张春乔 on Feb 27, 2023

* remove utils

* remove utils

* remove utils

* remove utils

* Update get_data_from_tensor.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_kernel.cu.cc

* Update rnn_kernel.cc

* remove utils

* Update rnn_functor.h

* remove utils

* remove utils

* remove utils

* remove utils

* remove utils

* Update rnn_functor.h

* Update unsqueeze_op.h

* Update utils.h

* roll back

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* Update tensor_utils.h

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* use TensorToVector

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* add TensorToVector

* roll back

* Update tensor_utils.h

* Update rnn_functor.h

* Update rnn_grad_kernel.cu.cc

* Update tensor_utils.h

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* Update rnn_grad_kernel.cu.cc

* Update rnn_kernel.cu.cc

* Update rnn_grad_kernel.cc

* Update rnn_kernel.cc

* TensorCopySync to phi::Copy

* fix codestyle

* rnn_kernel.cc: add ;

* replace all GetDataFromTensor with phi::GetVectorFromTensor

* delete include of util.h

[Tensor Operants & Prim] Tensor pow API uses elementwise_pow (#50886) · 8a097399
Committed by HongyuJia on Feb 27, 2023
```
* [Tensor Operants & Prim] Tensor pow API uses elementwise_pow

* unittest change to fill_constant+elementwise_pow
```
Reduce redundant cpu computation in slice compute (#50348) · 8aec0580
Committed by Bo Zhang on Feb 27, 2023
```
* conflict

* add UpdateSliceAttrs
```

Add PADDLE_THROW in ToCudaDataType and polish codes. (#50922) · 2eeaaa7d
Committed by Yiqun Liu on Feb 27, 2023

revert reshape 0 represent copy and support perm < 0 for paddle.transpose (#50720) · 3669868d
Committed by zhouweiwei2014 on Feb 27, 2023

xpu: bind op scatter_nd_add. add data type for transpose2, clip & assign_value (#50825) · 0d12afea
Committed by wangshengxiang on Feb 27, 2023
```
* [XPU] bind op scatter_nd_add

* [XPU] add more data type for op: clip, transpose2 & assign_value
```

[Bfloat16]register bfloat16 datatype for squared l2 norm (#50908) · 3c121040

Committed by shaojie_wang on Feb 26, 2023

* register bfloat16 datatype for squared l2 norm

* register bfloat16 datatype for softmax with upper triangular mask

* register bfloat16 for tril triu cuda kernel

Feb 26, 2023 (2 commits)

Matmul performance optimization with cuBlasLt (#46431) · d4217fc6

Committed by limingshu on Feb 26, 2023


* implement of matmul using cublasLt instead of cublas

* Update matmul_kernel_impl_via_blasLt.h

---------
Co-authored-by: zhangbopd <1299246947@qq.com>
Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

Enable matmul + bias fusion in fused_gat_attention. (#50755) · 57f6a469

Committed by Yiqun Liu on Feb 26, 2023

* Enable matmul + bias fusion in fused_gat_attention.

* Add a variable to control whether using fused matmul + bias.

Feb 25, 2023 (1 commit)
- Rename elementwise_heaviside to heaviside (#50821) · 8129c22e
  Committed by zyfncg on Feb 25, 2023
```
* rename elementwise_heaviside to heaviside

* delete __init__.py

* fix bug
```
Feb 24, 2023 (8 commits)

[Zero-Dim] Support 0D Tensor input for topk/broadcast_to/expand/expand_as/broadcast_shape (#50536) · 5041158f
Committed by yunyaoXYY on Feb 24, 2023

Fix KP operator Kernel selection error (#50178) · 6ef3f2ce
Committed by niuliling123 on Feb 24, 2023

Fix libpaddle_inference.so symbol conflicts with other .so (gflags) (#50787) · 041ea14c
Committed by Yuanle Liu on Feb 24, 2023

support 'backend' in static ops (#50671) · 363825df

Committed by HappyHeavyRain on Feb 24, 2023

* support 'backend' in static ops

* change bitwise_xx comment in python

* change bitwise_xxx comment in python

* change 'backend' and 'data_type' in GetExpectedKernelType

supplement header file's code (#50826) · 92cae577
Committed by YuanRisheng on Feb 24, 2023

【prim】Slice grad (#50771) · f6dea800

Committed by xiaoguoguo626807 on Feb 24, 2023

* support prim test in OpTest

* fix cmake

* fix op test

* fix test_input_spec

* disable cinn in reduce_sum unit test

* add bfloat16 dtype for sum

* add approve rules

* polish code

* add clear jit program function

* convert grad out from tensor to numpy

* remove unnecessary code

* add only_prim flag

* fix flag

* fix op test

* add attr

* fix optest comp inplace error

* fix op test

* fix op test with guard

* add initialization of check_comp flag

* fix comp inplace error in op test

* rename check_comp with check_prim and add bfloat16 dtype convert

* rename comp_op_type to prim_op_type

* rename comp to prim

* remove useless code

* skip ci check for only prim

* add no_grad_vars and grad_outputs in prim test

* fix var_dict

* fix op test for only_prim

* fix dy2static bugs

* polish some code

* temp

* modify op test

* except cinn test

* modify bfp16

* modify pad grad

* add pad_grad dtype

* start cinn part

---------
Co-authored-by: Charles-hit <wanghao107@baidu.com>

[Tensor Operants & Prim] Tensor arithmetic operants support left scalar type (#50840) · 0d956e17
Committed by HongyuJia on Feb 24, 2023

[XPU] add expand_grad, isnan, meshgrid kernels (#50774) · 7271de88
Committed by ronnywang on Feb 24, 2023
```
* [XPU] add expand_grad, isnan, meshgrid kernels

* update
```

Feb 23, 2023 (12 commits)

first commit (#50808) · a1e96e47
Committed by limingshu on Feb 23, 2023

[XPU] Migrate xpu_embedding_with_eltwise_add_fuse_pass (#50590) · 8d325d82
Committed by csy0225 on Feb 23, 2023

[Tensor API & Prim-Relevant] Unsupport prob Tensor API (#50756) · d7673e2f

Committed by HongyuJia on Feb 23, 2023

* change phi tensor_gen->tensor_operants_gen

* [Tensor API] Support multiple Tensor C++ api

* [Tensor API] Unsupport prob Tensor API

* accept reviewers comment of #50731

* delete tensor_api.yaml

[phi decoupling] move generator implementation from fluid to phi (#50746) · 4e417409

Committed by Huang Jiyi on Feb 23, 2023

* move fluid generator to phi

* move fluid generator to phi

* update .gitignore

* fix bugs

* fix cannot find "glog/logging.h" in "generator.h"

* fix bugs

[OptionalOptimization]: LayerNorm forward Optimization with Welford (#50362) · 746b774b

Committed by limingshu on Feb 23, 2023

* first commit

* main codes has been developed

* fix all bugs

* add vectorize input&output

* a test for optimization_of_layer_norm_fwd

* add some changes

* fix memory coalesced access for more optimization.

* fix addition ctest error

* fix according to ci-approval

* remove change on slice

fix bug that touch __init__.py (#50793) · e1956ab5
Committed by risemeup1 on Feb 23, 2023

[Paddle C++ API] Remapping input and output tensors after XPU op has fallen back to CPU op (#50625) · f7b45b3e

Committed by RuohengMa on Feb 23, 2023

* fix accuracy diff issue when XPU op batch_norm is added to XPU blacklist

* remap op output tensor to input tensor when the op has fallen back to CPU

* rename function name and fix bug caused by InplaceCounter

[Tensor Operants & Prim] Tensor arithmetic operants support right scalar type (#50563) · 5f5a2082

Committed by HongyuJia on Feb 23, 2023

* polish namespace

* change static_tensor_operants

* polish namespace

* support add, subtract, divide

* add unit test

* polish unittest

* fix cmake error

* solve conflicts, merge auto code-gen

* add scalar operator in tensor.h

* tensorbase

* static prim full support more datatype

* fix prim unittest

* polish codes

* fix cmake error

[PHI Decoupling]Remove Profiler header (Part3) (#50721) · 8476c552
Committed by YuanRisheng on Feb 23, 2023
```
* move profiler

* fix compile bugs
```

Support 'complex promote' in yaml (#50611) · 91a3d159

Committed by HappyHeavyRain on Feb 23, 2023

* support 'complex promote' in yaml

* change the compplex_promote

* change 'kron' in math.py

* change 'kron' comment in python

* change kron comment in python

* change kron comment in python

【Prim】Enhance gather vjp (#50786) · dca3a099

Committed by Jiabin Yang on Feb 23, 2023

* tmp gather vjp

* support gather

* remove useless code

* fix compiling error

* fix ut

* add eager test

* add eager test

* add seed

* fix cpu error

* fix transpose op compat

* remove tensor index case

* fix prim_cinn

* fix ut

* add gather composite

kunlun support c_softmax_with_cross_entropy (#49934) · f43b5fe5

Committed by jameszhang on Feb 23, 2023

* kunlun support c_softmax_with_cross_entropy

* fix grad calc error

* replace mutable_data() and ShareDataWith()

* update xdnn

* update xpu toolchain to 20230215

* remove fluid from test file

Feb 22, 2023 (5 commits)
- [Tensor API] Support multiple Tensor C++ api (#50731) · 652d12cc
  Committed by HongyuJia on Feb 22, 2023
```
* change phi tensor_gen->tensor_operants_gen

* [Tensor API] Support multiple Tensor C++ api
```
- [Win]fix compile error due to depend xxhash (#50760) · a35dbc29
  Committed by zhouweiwei2014 on Feb 22, 2023
- fix ninja and make incremental compiling error (#50616) · 433c2ffb
  Committed by risemeup1 on Feb 22, 2023
- Fix some typos. (#50429) · 93b2bf4b
  Committed by Shuangchi He on Feb 22, 2023
```
* Fix some typos.
Signed-off-by: Yulv-git <yulvchi@qq.com>

* pre-commit
Signed-off-by: Yulv-git <yulvchi@qq.com>

---------
Signed-off-by: Yulv-git <yulvchi@qq.com>
```
- [XPU] link out_max to x_max between xpu_fusion_ops (#50690) · 1fd1c169
  Committed by zhupengyang on Feb 22, 2023

PaddlePaddle / Paddle: synced successfully about 1 year ago