Commits · 257e6c99f4b70fdfea9816f21a6f007304176afe · PaddlePaddle / Paddle

04 Jan, 2023 6 commits

update vlog output (#49541) · bbc6dd94
Authored by Yuanle Liu on Jan 04, 2023

bbc6dd94
[Inference] Add conv_fusion nhwc impl. (#49047) · 4a8708bb
Authored by Wilber on Jan 04, 2023

4a8708bb
[Paddle Inference] fix mixed precision diff (#49475) · ac75a9a6
Authored by Yuanle Liu on Jan 04, 2023

ac75a9a6
Revert "Replace matmul with matmul_v2 during oneDNN fuse passes (#49108)" (#49524) · 338cbeaa
Authored by Sławomir Siwek on Jan 04, 2023
```
This reverts commit 2c444dfa.
```
338cbeaa

[Unify KernelKey] change OpKernelType->KernelKey (#49138) · 4383494f

Authored by HongyuJia on Jan 04, 2023

* execute use kernel_key first

* change OpKernelType->KernelKey

* fix py3 compile error, remove redundant header files

* fix build_strategy_test

* fix DataType::RAW

* fix custom_type test: operator_test.cc

* fix transform place

* fix backends_are_same_class

* try fix place TransDataDevice

* support all KernelKey

* fix TransformData

* fix place_are_same_class

* fix merge

* fix test_params_no_grad

* fix specific place of GetExpectedKernelType

* fix specific place of GetExpectedKernelType

* fix GetKernelTypeForVar

* fix dtype error

* fix fetch_v2

* change GetKernelTypeForVar

* fix interpreter

* fix typo error

* polish codes

* polish codes

* polish codes

* fix conflict

4383494f

add multi_devices_fused_multi_transformer_encoder_pass and cherry-pick from 48349 (#49383) · 29eec2dd
Authored by lzy on Jan 04, 2023

29eec2dd

03 Jan, 2023 10 commits
- [code_style fix] graph_brpc_client cpplint (#49457) · a2d7e1d7
  Authored by wangzhen38 on Jan 03, 2023
  
  a2d7e1d7
- [Dy2St]Fix param and out grad names in dy2st for high order grad (#49461) · f484a61e
  Authored by WangZhen on Jan 03, 2023
```
* Fix param and out grad names in dy2st for high order grad
```
  f484a61e
- [Paddle Inference] enhance paddle_infer::Tensor data type (#49388) · dc13f7c5
  Authored by Yuanle Liu on Jan 03, 2023
  
  dc13f7c5
- Replace matmul with matmul_v2 during oneDNN fuse passes (#49108) · 2c444dfa
  Authored by Sławomir Siwek on Jan 03, 2023
```
* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces
```
  2c444dfa
- set Flag_control_flow_use_new_executor=true by default (#49447) · 0f9e2b17
  Authored by kangguangli on Jan 03, 2023
  
  0f9e2b17
- [Paddle Inference] Implement conv2d_fusion NHWC format using cutlass (#47989) · c123dd1e
  Authored by zhoutianzi666 on Jan 03, 2023
```
* Implement conv2d_fusion NHWC format using CUTLASS
* Add unit testing for CUTLASS Conv in inference
* Add experimental API for CUTLASS.
```
  c123dd1e
- [OpAttr]Fix Ignore AttriteTensor in IndicateDataType bug in grad_op (#49472) · 5ac96468
  Authored by Aurelius84 on Jan 03, 2023
```
* [OpAttr]Fix Ignore AttriteTensor in IndicateDataType bug in grad_op

* add GetExpectedKernelType
```
  5ac96468
- [Zero-Dim] reshape/reshape_/reverse 0D support (#49357) · 347d2123
  Authored by zhaoyingli on Jan 03, 2023
```
* [Zero-Dim] reshape/reshape_/reverse 0D support

* rm comment

* change paddle.to_tensor to paddle.full

* fix docs

* update paddle.full
```
  347d2123
- forbid ops who have 1D intermediate tensor entering Paddle-TRT (#49378) · 021085e3
  Authored by zhoutianzi666 on Jan 03, 2023
  
  021085e3
- Add not_equal trt converter (#49393) · 822ea0f9
  Authored by Sanbu on Jan 03, 2023
  
  822ea0f9
02 Jan, 2023 1 commit
- Scale Matmul Fuse pass rewritten (#49105) · 18c0a002
  Authored by Hulek on Jan 02, 2023
  
  18c0a002
01 Jan, 2023 1 commit
- memorty_optimize remove inplace op (#49431) · aa96ddc3
  Authored by gem5 on Jan 01, 2023
  
  aa96ddc3
30 Dec, 2022 6 commits

Fix test_conv_bn_fuse_pass_cc on Windows System (#49446) · a4b4343f
Authored by zyfncg on Dec 30, 2022
```
* fix test_conv_bn_fuse_pass_cc

* remove comment
```
a4b4343f
[inference][trt] update Convolution to ConvolutionNd (#47653) · 6e5917e4
Authored by Zhang Jun on Dec 30, 2022
```
* update conv to convNd

* trigger ci
```
6e5917e4

Support static graph code-gen for squeeze and unsqueeze op (#49430) · 23c1ac2c

Authored by zyfncg on Dec 30, 2022

* support static graph code-gen for squeeze op

* generate static graph code of unsqueeze

* refine op name

* add extra output in op_compat

* remove debug log

23c1ac2c

fix possible bug (#49367) · 18f0ab86
Authored by HongyuJia on Dec 30, 2022

18f0ab86

Unify the English translations of static graph mode and dynamic graph mode in the docs (#49170) · a186e60d

Authored by Sanbu on Dec 30, 2022

* 1219

* temporarily change the num_diff_files limit, test=document_fix

* Revert "temporarily change the num_diff_files limit, test=document_fix"

This reverts commit 8e70f00ef468d2dad0e38b3da06295ed62990d20.

* for codestyle

* remove duplicate license

* `static mode` -> `static graph mode`

* Update hybrid_parallel_inference.py

* Update layer_function_generator.py

* Update manipulation.py

* reset
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

a186e60d

Fix default GetExpectedKernelType for ops supported tensor attrs (#49414) · 8a859554
Authored by WangZhen on Dec 30, 2022
```
* Fix default GetExpectedKernelType for ops supported tensor attrs
```
8a859554

29 Dec, 2022 4 commits
- auto parallel bf16 (#49079) · 418edae5
  Authored by xu98bin on Dec 29, 2022
```
* auto parallel bf16
```
  418edae5
- [pglbox2.0]fix load into memory (#49389) · 1078e064
  Authored by zmxdream on Dec 29, 2022
```
* fix load into memory

* fix load into memory

* fix code style
```
  1078e064
- fix ambiguous symbol error (#49406) · 6f07960c
  Authored by MarDino on Dec 29, 2022
  
  6f07960c
- fused_attention_op paratmers stop grad support (#49351) · 0bb999b6
  Authored by Wang Bojun on Dec 29, 2022
```
* fusedAttenGrad_noGrad

* code style fix

* add ut

* remove unnecessary log
```
  0bb999b6
28 Dec, 2022 5 commits

[new-exec] Ahead-Of-Time choosing kernel (#48789) · 63d2d722

Authored by Leo Chen on Dec 28, 2022

* add skip run

* alloc minimum memory

* skip check_size in Alloc

* skip check_size in Alloc

* skip check_size in Alloc

* fix cases when tensor is initialized or empty

* alloc empty output for place info

* add test

* increase timeout

* format code

* skip cpu

* add cudnn_deterministic

* fit for hostAlloc

* follow comments

* change check_size to fake_alloc

63d2d722

generate the static graph code of some ops (#49212) · 1804f834

Authored by HappyHeavyRain on Dec 28, 2022

* generate the static op of some ops

* add the VERSION of pixel_shuffle

* change the API doc of isclose

* change the API doc of isclose

* fix the isclose op comment

1804f834

update some trt log (#49330) · 02019804
Authored by Yuanle Liu on Dec 28, 2022

02019804
Fix misspelled words in comments (#49366) · e2b2f7d0
Authored by WangZhen on Dec 28, 2022

e2b2f7d0
delete old dygraph pylayer (#49339) · 0b60b784
Authored by wanghuancoder on Dec 28, 2022
```
* delete old dygraph pylayer
```
0b60b784

27 Dec, 2022 3 commits

[AutoParallel] quantization pass support export (#48072) · 27ce06aa

Authored by zhaoyingli on Dec 27, 2022

* [AutoParallel] quantization pass support export

* support subgraph

* move_presist_var_to_global_block

* update unittest

* fix ci-coverage

* fix codestyle

* fix fake_dequantize_op

* remove unused var

* fix ci error and aprroval error

* add unittest for fp16 in test_dequant_linear

* replace mutable data

* fix unittest in non-cuda-core

* fix unittest
Co-authored-by: carryyu <569782149@qq.com>
Co-authored-by: wufeisheng <wfs1997@163.com>

27ce06aa

[new executor]Support CINN use InterpreterCore (#48911) · 2ca3d3f7

Authored by zhangbo9674 on Dec 27, 2022

* cinn use interpretercore

* fix bug

* fix compile bug

* fix scope bug

* refine code

* refine code by comment

* refine code by comment

2ca3d3f7

Support priority scheduling for standalone executor (#49275) · 0839bba3
Authored by Ruibiao Chen on Dec 27, 2022
```
* Support priority scheduling for standalone executor

* Add CPU test
```
0839bba3

26 Dec, 2022 4 commits

[0d Tensor] update scatter for zero-dimension tensor (#49279) · 73aa98cf
Authored by Roc on Dec 26, 2022
```
* revert concat and change concat to stack

* let stack kernel support int8, uint8 and bool type
```
73aa98cf

[Auto Parallel] Merge the python and c++ impls of ProcessMesh (#47503) · 1c0afa79

Authored by Yulong Ao on Dec 26, 2022

* [Auto Parallel] Rename methods of ProcessMesh

* [Auto Parallel] Impl the python process_mesh by the c++ one

* [Auto Parallel] Add some minor modifications

* [Auto Parallel] Rename some methods

* [Auto Parallel] Remove unnecessary codes

* [Auto Parallel] Add back some removed files

* [Auto Parallel] Fix bugs

* [Auto Parallel] Fix a bug

* Update process_mesh.cc

* [Auto Parallel] Fix a bug

1c0afa79

Add FLAGS for communication op dependency in standalone executor (#49291) · d6fef01c
Authored by Ruibiao Chen on Dec 26, 2022

d6fef01c
Improve stream analyzer (#49314) · f0f4dd1e
Authored by Ruibiao Chen on Dec 26, 2022
```
* Memory search for stream analyzer

* Shrink redundant waiters
```
f0f4dd1e

PaddlePaddle / Paddle synced successfully about 1 year ago
