Commits · 2265d63c606d3652386b9521179dbf5ab5ef46d5 · PaddlePaddle / Paddle

31 Jul, 2023 3 commits

[NewIR]fix new ir shadow typo (#55706) · 2265d63c
Committed by hong on Jul 31, 2023
```
* fix new ir shadow typo

* update
```
2265d63c

[IR] Support GC and TraceRun for NewIr InterpreterCore (#55772) · dc96ebc0

Committed by zhangbo9674 on Jul 31, 2023

* add interface

* add code

* add code

* add code

* add code

* fix bug

* fix bug

* add var prefix

* add code

* add code

* add code

* fix compile bug

* fix bug

* refine code

* refine code

* refine code

* refine code

* fix bug

* add code

* add code

* fix bug

* add code

* add code

* refine code

* refine code

* fix bug

dc96ebc0

Support stride2 (#55156) · 859fc01b
Committed by wanghuancoder on Jul 31, 2023
```
support stride
```
859fc01b

30 Jul, 2023 1 commit

[IR] Add Construct event for new ir interpretercore (#55555) · 345de9a5

Committed by zhangbo9674 on Jul 30, 2023

* add interface

* add code

* add code

* add code

* add code

* fix bug

* fix bug

* add var prefix

* add code

* add code

* add code

* fix compile bug

* fix bug

* refine code

* refine code

* refine code

* refine code

* fix bug

345de9a5

28 Jul, 2023 1 commit

New ir support fluid op (#55693) · b76c2f94

Committed by hong on Jul 28, 2023

* new ir support save combine

* update

* polish code

* update

* new ir support fluid op

* remove duplicate op

* fix ir exe test compile error

* fix compile bug

* update

* code format

* update

* update

* polish code

b76c2f94

27 Jul, 2023 1 commit
- [CustomPass] add support for outputting the intermediate variables (#55728) · da258964
  Committed by ronnywang on Jul 27, 2023
```
* add support for outputting the intermediate variables

* fix fuse_resnet_unit
```
  da258964
26 Jul, 2023 3 commits
- [Static graph performance optimization] Share event (#55650) · 0601c2c9
  Committed by Sonder on Jul 26, 2023
```
* add sharing event info

* add sharing event info

* fix

* remove const

* add flag

* fix
```
  0601c2c9
- New ir support save combine (#55538) · a88d36aa
  Committed by hong on Jul 26, 2023
```
* new ir support save combine

* update

* polish code
```
  a88d36aa
- add modernize-redundant-void-arg check (#55652) · 12fb18dd
  Committed by gouzil on Jul 26, 2023
  
  12fb18dd
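
For readers unfamiliar with the clang-tidy check named in the commit above: `modernize-redundant-void-arg` flags C-style `(void)` parameter lists, which are redundant in C++. The following is a minimal sketch of a before/after pair; the function names are hypothetical and are not taken from the Paddle code base.

```cpp
#include <cassert>
#include <type_traits>

// Before: C-style empty parameter list. Legal C++, but the `void` is redundant.
int AnswerBefore(void) { return 42; }

// After: the spelling the check suggests. Same signature, same behavior.
int AnswerAfter() { return 42; }

// Both functions have the identical type int(), so the change is cosmetic.
static_assert(
    std::is_same<decltype(AnswerBefore), decltype(AnswerAfter)>::value,
    "f(void) and f() declare the same function type in C++");
```

Because the two spellings declare the same type, applying the check should not change call sites or ABI.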
25 Jul, 2023 2 commits

[NewIR]new ir dygraph to static support gpu (#55620) · fb9bec5d

Committed by hong on Jul 25, 2023

* add kernel dialect

* change DenseTensorTypeStorage to DenseTensorType

* add test case

* add first pd_op to kernel dialect

* lower pd op to kernel dialect

* update

* update

* remove useless code

* add attribute print test

* fix bug

* update

* update

* update

* update

* polish code

* fix bug

* polish code and add python test

* add test

* fix test error

* relax constraint when inserting get_parameter

* add env flag

* fix bug

* dygraph2static support new ir

* fix bug

* revert test env

* change cc_test_old to cc_test

* update

* fix build_static bug

* update test

* fix type test error

* update cmake

* disable test in windows

* fix inference compile

* fix program translator error

* only run on cpu, not support gpu yet

* fix conflict

* polish code

* fix bug

* add feed with place op

* update

* remove useless unit test

* update mkldnn

* update

* update

* align mkldnn version

* new ir support builtin slice op

* fix bug

* fix phi kernel adaptor bug

* add enable static

* add enable_static

* remove useless test case

* change feed list to single variable

* update

* add feed with place and shadow output op

* fix bug

* remove useless code

* support gpu

* fix bug

* fix bug

* remove template

* add more data type

* fix compile bug

* update

* remove useless code

* revert dygraph2st test

* remove useless code

* revert op

* fix bug

* new ir dygraph2static support gpu

* remove useless code

* code polish

* add const

* revert code and remove useless code

* revert code

* revert legacy op yaml

* remove useless code

* delete std::move

---------
Co-authored-by: kangguangli <kangguangli@hotmail.com>

fb9bec5d

Fix reduce_ops for mixed-precision FP16 support (#55573) · ca72aa2a
Committed by jiangfan06 on Jul 25, 2023

ca72aa2a

24 Jul, 2023 2 commits
- [sharding stage 1 optim] Sharding comm overlap with backward (#55598) · a9f877ff
  Committed by Yuang Liu on Jul 24, 2023
  
  a9f877ff
- onednn: remove fc_elementwise_add fusion (#55504) · bea1f04c
  Committed by Xinyu Chen on Jul 24, 2023
```
* onednn: remove fc+eltwiseadd fusion pass
* onednn: remove post-sum fusion in fc kernel
* onednn: tests: make unfused add run into f32
```
  bea1f04c
22 Jul, 2023 1 commit
- [PHI CAPI] Add support for registering a new operator, PART2 (#55533) · 14006e96
  Committed by ronnywang on Jul 22, 2023
  
  14006e96
21 Jul, 2023 2 commits
- [clang-tidy] enable modernize-make-unique (#55506) · 45d49619
  Committed by Ruibin Cheung on Jul 21, 2023
  
  45d49619
- [clang-tidy] enable modernize-use-override (#55491) · cd0f1523
  Committed by Ruibin Cheung on Jul 21, 2023
  
  cd0f1523
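
The two clang-tidy checks enabled on this date can be illustrated with a small sketch. `modernize-use-override` asks overriding member functions to carry `override` instead of repeating `virtual`, and `modernize-make-unique` prefers `std::make_unique` over constructing a `std::unique_ptr` from a raw `new`. The `Pass` and `FusePass` types below are hypothetical stand-ins, not classes from the Paddle code base, and the sketch assumes C++14 or later.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical base class used only for illustration.
struct Pass {
  virtual ~Pass() = default;
  virtual std::string Name() const { return "base"; }
};

struct FusePass : Pass {
  // modernize-use-override: previously `virtual std::string Name() const`.
  // With `override`, the compiler rejects the declaration if the base
  // signature ever changes and nothing is actually overridden.
  std::string Name() const override { return "fuse"; }
};

// modernize-make-unique: previously
//   return std::unique_ptr<Pass>(new FusePass());
// make_unique avoids a naked `new` and names the type only once.
std::unique_ptr<Pass> MakePass() { return std::make_unique<FusePass>(); }
```

Both rewrites preserve behavior; the benefit is compile-time checking of overrides and the absence of raw `new` expressions.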
20 Jul, 2023 5 commits

[Static graph performance optimization] Reuse graph dependency information (#55389) · ee65599e

Committed by Sonder on Jul 20, 2023

* add share api for DependencyBuilder

* add judge codes for sharing build results

* add ShareBuildResultsFrom

* update ShareDependencyFrom

* fix error

* add share codes

* fix memory error

* update according to review

* update notes

* fix code style

* remove const_cast

* fix code style

ee65599e

[NewIR]Change feed list to variable list && support GPU (#55401) · 75517841

Committed by hong on Jul 20, 2023

* add feed with place op

* remove useless unit test

* update mkldnn

* update

* new ir support builtin slice op

* fix phi kernel adaptor bug

* add enable_static

* remove useless test case

* change feed list to single variable

* support gpu

* fix bug

* remove template

* add more data type

* fix compile bug

75517841

[XPU] fuse cast to conv2d/fc in mixed precision model (#54493) · 4df00939
Committed by zhupengyang on Jul 20, 2023

4df00939

fix bug of constant folding pass (#55556) · bc61c796
Committed by ming1753 on Jul 20, 2023

bc61c796

[Kunlun] Modify some legacy code on distributed training (#55515) · 806f8d2b

Committed by XiaociZhang on Jul 20, 2023

* [Kunlun] Modify some legacy code on distributed training

XPUs previously had limitations: concat/split was not supported, and
c_broadcast only supported fp32. These limitations were lifted
recently.

This PR also adds support for multi-device profiling on XPU.
Without this PR, devices that enable profiling issue a broadcast that
hangs, eventually leading to a kernel timeout error.

* fix typo

806f8d2b

19 Jul, 2023 5 commits
- [CustomPass] add register_pass api (#55511) · 6216beb3
  Committed by ronnywang on Jul 19, 2023
  
  6216beb3
- [PHI CAPI] Add support for registering a new operator, PART1 (#55532) · 3f17596a
  Committed by ronnywang on Jul 19, 2023
  
  3f17596a
- [IR] Add Dependency build for new ir interpretercore (#55468) · fd192303
  Committed by zhangbo9674 on Jul 19, 2023
```
* add interface

* add code

* add code

* add code

* add code

* fix bug

* fix bug
```
  fd192303
- Delete repeat ops add gather squeeze unsqueeze (#55371) · 552ed8d8
  Committed by csy0225 on Jul 19, 2023
  
  552ed8d8
- Sharding stage 1 tensor fusion (#55427) · 4c4d3185
  Committed by Yuang Liu on Jul 19, 2023
  
  4c4d3185
17 Jul, 2023 2 commits
- [IR] finetune the StrAttribute interface. (#55439) · 896d7cfa
  Committed by winter-wang on Jul 17, 2023
  
  896d7cfa
- [0D-Tensor] CINN supports unsqueeze, delete hack in Paddle's pass (#55336) · f736f151
  Committed by HongyuJia on Jul 17, 2023
  
  f736f151
14 Jul, 2023 2 commits

[IR] Refine BuildScope in phi_kernel_util (#55423) · f00a06d8
Committed by zhangbo9674 on Jul 14, 2023
```
* add code

* fix bug

* refine code

* refine code

* fix bug
```
f00a06d8

[IR] Reconstruct the Instruction for NewIrInterpreter (#55239) · 69e9f03e

Committed by zhangbo9674 on Jul 14, 2023

* add inplace interface

* support inplace

* refine code

* fix bug

* fix bug

* refine code

* add file

* add interface

* refine code

* refine code

* add phi kernel instruction

* refine code

* add test

* delete unused code

* add test

* add test

* add deps

* delete unused code

* fix bug

* fix bug

69e9f03e

13 Jul, 2023 4 commits
- [BugFix] Replace include dense_tensor.h with forward declare in phi lib (#55396) · 9619443b
  Committed by Yuanle Liu on Jul 13, 2023
```
* copy dense_tensor.h to inference lib

* update

* update
```
  9619443b
- [BugFix] Fix issue-50853: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED · 78a4e3fd
  Committed by Yuanle Liu on Jul 13, 2023
  
  78a4e3fd
- Support nvprof for auto parallel (#55347) · 9210b1af
  Committed by Ruibiao Chen on Jul 13, 2023
```
* Support nvprof for auto parallel

* Fix CI errors

* Fix CI errors
```
  9210b1af
- [NewIR]new ir support builtin slice op (#55381) · 4b6d2f5f
  Committed by hong on Jul 13, 2023
```
* new ir support builtin slice op

* fix phi kernel adaptor bug
```
  4b6d2f5f
12 Jul, 2023 3 commits

[Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
Committed by Yuanle Liu on Jul 12, 2023
```
* rewrite identity_op_clean_pass

* fix

* adjust identity_op_clean_pass order in gpu passes

* fix ut
```
2363e623

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Committed by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fix sgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>

cfa513f7

[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

Committed by Wang Xin on Jul 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tidy-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug

be3a6fa7
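
As a rough illustration of what the `readability-container-size-empty` check enabled above looks for: it flags emptiness tests written as a comparison of `size()` against zero and suggests `empty()` instead. The sketch below uses hypothetical function names and is not code from the Paddle repository.

```cpp
#include <cassert>
#include <vector>

// Before: emptiness expressed through size(); this is what the check flags.
bool HasNoInputsBefore(const std::vector<int>& inputs) {
  return inputs.size() == 0;
}

// After: empty() states the intent directly and is the suggested rewrite.
bool HasNoInputsAfter(const std::vector<int>& inputs) {
  return inputs.empty();
}
```

The two forms return the same result for standard containers; `empty()` is preferred because it reads as a predicate and is guaranteed to be constant time.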

07 Jul, 2023 3 commits
- [XPU] Add layernorm fuse pass (#55154) · eb12739e
  Committed by wz1qqx on Jul 07, 2023
  
  eb12739e
- [XPU] Eliminate small ops (#55193) · b8f265d2
  Committed by wz1qqx on Jul 07, 2023
  
  b8f265d2
- rename WITH_INFERENCE_NVTX to WITH_NVTX and fix compile bug (#55219) · 43843192
  Committed by Yuanle Liu on Jul 07, 2023
```
* fix WITH_SHARED_IR option type

* rename WITH_INFERENCE_NVTX to WITH_NVTX and fix compile bug

* update
```
  43843192

PaddlePaddle / Paddle: synced successfully about 1 year ago