Commits · 569b018ea8911c45b70dead0ad03dd623873cd1f · PaddlePaddle / Paddle

28 Feb, 2023 2 commits
- forbid tensorrt_engine op's output is a persistable var (#50932) · bbf2bc2b
  Committed by zhoutianzi666 on Feb 28, 2023
```
* forbid tensorrt_engine op's output is a persistable var
```
  bbf2bc2b
- Count the number of 0 in the output Tensor (#50981) · 6c471ed0
  Committed by niuliling123 on Feb 28, 2023
  
  6c471ed0
27 Feb, 2023 4 commits
- [CINN] fix cinn cache key should save var name bug (#50955) · f78b4079
  Committed by jiangcheng on Feb 27, 2023
  
  f78b4079
- [TRT] Add sm version check for TensorRT flash attention and cross attention pass/plugin (#50830) · 38dad3b9
  Committed by Wang Bojun on Feb 27, 2023
```
* add sm version check

* use GetGPUComputeCapability
```
  38dad3b9
- [Error Msg] Polish error message when GPU kernel not found (#50880) · 3e9ffaef
  Committed by HongyuJia on Feb 27, 2023
```
* [Error Msg] Polish error message when GPU kernel not found

* Only test in GPU environment
```
  3e9ffaef
- revert operator.cc (#50895) · ec814cf5
  Committed by csy0225 on Feb 27, 2023
  
  ec814cf5
24 Feb, 2023 3 commits

Committed by Sławomir Siwek on Feb 24, 2023

* ConvertToFusedOp

* change static to inline
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

9429936c

Fix KP operator Kernel selection error (#50178) · 6ef3f2ce
Committed by niuliling123 on Feb 24, 2023

6ef3f2ce

[CINN]Enhance CacheKey hash logic by considering input dtypes (#50557) · 21c6eccf

Committed by Aurelius84 on Feb 24, 2023

* [CINN]Enhance CacheKey hash logic by considering input dtypes

* add unittest

* fix typo

* fix typo

* fix map.at

* fix find

* fix test

* fix cinn cache key structure realize

* using ordered map for attributes

* add test by review advice

---------
Co-authored-by: jiangcheng <thisjiang@qq.com>

21c6eccf

23 Feb, 2023 4 commits
- [XPU] Migrate xpu_embedding_with_eltwise_add_fuse_pass (#50590) · 8d325d82
  Committed by csy0225 on Feb 23, 2023
  
  8d325d82
- [phi decoupling] move generator implementation from fluid to phi (#50746) · 4e417409
  Committed by Huang Jiyi on Feb 23, 2023
```
* move fluid generator to phi

* move fluid generator to phi

* update .gitignore

* fix bugs

* fix cannot find "glog/logging.h" in "generator.h"

* fix bugs
```
  4e417409
- fix bug that touch __init__.py (#50793) · e1956ab5
  Committed by risemeup1 on Feb 23, 2023
  
  e1956ab5
- [XPU] optimize multi_encoder_xpu_pass (#50759) · 5c9299e5
  Committed by zhupengyang on Feb 23, 2023
  
  5c9299e5
22 Feb, 2023 2 commits
- Fix some typos. (#50429) · 93b2bf4b
  Committed by Shuangchi He on Feb 22, 2023
```
* Fix some typos.
Signed-off-by: Yulv-git <yulvchi@qq.com>

* pre-commit
Signed-off-by: Yulv-git <yulvchi@qq.com>

---------
Signed-off-by: Yulv-git <yulvchi@qq.com>
```
  93b2bf4b
- [XPU] link out_max to x_max between xpu_fusion_ops (#50690) · 1fd1c169
  Committed by zhupengyang on Feb 22, 2023
  
  1fd1c169
21 Feb, 2023 3 commits

Support bw invoke fw (#50260) · d8845735

Committed by HappyHeavyRain on Feb 21, 2023

* support bw invoke fw

* fix scale in static_backward.yaml

* fix the bug in tensorrt/convert

* move 'scale','sign' into ops.yaml

* add scale_grad of scale in op_compat.yaml

* change generated_static_op in CMakeLists.txt

d8845735

[Custom Device] Add static custom back_list (#50666) · d79d5933
Committed by duanyanhui on Feb 21, 2023
```
* add static custom back_list

* rm comments

* fix log

* fix comment
```
d79d5933

Optimize the ernie inference performance on xpu backend. (#50357) · b39afb13

Committed by csy0225 on Feb 21, 2023

* Optimize the ernie inference performance on xpu

* fix enable runtime cache logic

* when op's input shape has changed, should create a new runtime context

* fix

* set flag when input shape has changed

b39afb13

20 Feb, 2023 4 commits
- [XPU] fix fc_xpu_fuse_pass (#50569) · 77606f5d
  Committed by shentanyue on Feb 20, 2023
  
  77606f5d
- [Tensor operants] Polish tensor operants implementation (#50634) · 8c844356
  Committed by HongyuJia on Feb 20, 2023
```
* polish tensor operants implementation

* change year, 2021->2023
```
  8c844356
- [phi decoupling] move serialization from phi to fluid (#50608) · 6b3c48c1
  Committed by Huang Jiyi on Feb 20, 2023
```
* move save_op to fluid

* fix namespace

* move_load_kernel

* fix kernel_register

* move serialization to fluid

* fix test

* fix bugs
```
  6b3c48c1
- fix cuda graph error when new executor change feed fetch (#50306) · 9167fda3
  Committed by pangyoki on Feb 20, 2023
```
* change error

* fix
```
  9167fda3
17 Feb, 2023 6 commits
- upgrade oneDNN to 2.7.3 (#46301) · f803b239
  Committed by Sławomir Siwek on Feb 17, 2023
```
* change SHA

* update to oneDNN 2.7

* update to 2.7.1

* update to 2.7.2

* add supported hardsigmoid

* update to 2.7.3

* limit cpu threads for int8 test

* group activations
```
  f803b239
- [phi decoupling] move platform/transform to phi (#50498) · fe332794
  Committed by Huang Jiyi on Feb 17, 2023
```
* move platform::transform to phi

* fix bugs

* move transform_test to phi

* fix cmake

* update namespace

* fix cmake
```
  fe332794
- [XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass,... · 61469eec
  Committed by zhupengyang on Feb 17, 2023
```
[XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass, generate_sequence_xpu kernel (#50570)
```
  61469eec
- Consider kernel argument def for data device transform in standalone Executor (#50471) · af1ace59
  Committed by Ruibiao Chen on Feb 17, 2023
```
* Consider kernel argument def for data device transform in standalone executor

* Fix ALL_BACKEND errors

* Fix CI errors
```
  af1ace59
- [CINN] support int8/uint8/int16/uint16 dtype (#50566) · 9e73be65
  Committed by jiangcheng on Feb 17, 2023
  
  9e73be65
- fix ninja error (#49181) · b5d0d8c8
  Committed by risemeup1 on Feb 17, 2023
  
  b5d0d8c8
16 Feb, 2023 6 commits

Add matmul_v2 and fused_matmul to the quantization process and adjust Ernie model test (#50354) · 8686a745

Committed by joanna.wozna.intel on Feb 16, 2023

* Add matmul_v2 to the quantization process and adjust Ernie model test

* Correct cpu_quantize_pass test

* Move op to fuse transformation to placement pass

* Correct test

8686a745

Rewrite mkldnn conv bn fuse pass tester (#50034) · e2aacd21

Committed by Hulek on Feb 16, 2023

* New onednn test

* checkopoint

* added new test, fixed issue with onednn bias

* fix bias check

* remove prints, refactor code

* delete old test

* update python tests cmake

* Delete depracated conv bias

* Delete outdated bias from convolution test

e2aacd21

[XPU][Fleet] Support multi-card infer for xpu (#50490) · 517d8074
Committed by shentanyue on Feb 16, 2023
```
* support xpu multi-card infer

* add ut

* clean code

* clean code

* fix

* fix

* fix

* fix
```
517d8074

[XPU] fix dropout pass; add multi_encoder_xpu_fuse_pass & multi_encoder_xpu kernel (#50499) · c8aa6405
Committed by zhupengyang on Feb 16, 2023

c8aa6405

Use StandaloneExecutor in FleetExecutor (#50239) · df207283

Committed by Ruibiao Chen on Feb 16, 2023

* Use StandaloneExecutor in FleetExecutor

* Update FLAGS

* Fix CI errors

* Update code

* Add force_root_scope_vars config

* Update code

* Fix CI errors

* Fix test_layer_new errors

df207283

[phi decoupling] remove variable.h in phi (#50407) · 905cefd4

Committed by Huang Jiyi on Feb 16, 2023

* move variable_utils from phi_api_utils to fluid

* fix coment

* update include

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* update

* update

* fix CI-Windows-OpenBLAS

* fix bugs

* fix bugs

* fix bugs

* update include

* move variable_utils to phi_utils

* fix namespace

905cefd4

15 Feb, 2023 5 commits

make cinn_launch_op run interpretercore in tracing mode to reduce number of threads (#50472) · bf38175e
Committed by Leo Chen on Feb 15, 2023
```
* make cinn_launch_op run interpretercore in tracing mode to reduce number of threads

* skip getWorkqueue in tracing mode
```
bf38175e

Rewrite conv activation mkldnn fuse pass tester (#49278) · 84beef80

Committed by Hulek on Feb 15, 2023

* Done

* Deleted old python test, fixed new python test, changed names in parallel_UT

* Revert parallel UT changes

* Revert parallel UT changes v2

* Review fixes and simplification of conv output shape calculation, disabled sqrt from conv_act_duse_pass

* delete sqrt from possible activations from conv_concat_relu test

* review refactor

* merge main

* delete sqrt from list of compatible activations

* Test with no outdated inputs

84beef80

[PHI Decoupling]Remove Profiler header (Part2) (#50183) · 8fabca11

Committed by YuanRisheng on Feb 15, 2023

* move profiler

* add file

* fix mac compile bugs

* fix ci bugs

* fix mac bugs

* fix ci bugs

* fix compile bugs

* perfect code according comment

8fabca11

fix ninja problem (#50431) · 96006f77
Committed by risemeup1 on Feb 15, 2023

96006f77

[CUSTOM]custom device add black_list (#50409) · 66d3c56e
Committed by YuhangLi on Feb 15, 2023
```
* [CUSTOM]custom device add black_list

* change log level

* fix some issues
```
66d3c56e

14 Feb, 2023 1 commit
- Expand mixed_precision to custom device (#50378) · fcb746cb
  Committed by duanyanhui on Feb 14, 2023
```
* expand mix_precision to custom_device

* fix bug

* fix bug

* fix comment

* fix DEFINE bug
```
  fcb746cb

PaddlePaddle / Paddle synced successfully about 1 year ago
