Commits · 2d98758c1b03c7670de95051930e278d981e15ac · PaddlePaddle / Paddle

20 Jul, 2023 5 commits

Committed by Sonder on Jul 20, 2023

* add share api for DependencyBuilder

* add judge codes for sharing build results

* add ShareBuildResultsFrom

* update ShareDependencyFrom

* fix error

* add share codes

* fix memory error

* update according to review

* update notes

* fix code style

* remove const_cast

* fix code style

ee65599e

[NewIR]Change feed list to variable list && support GPU (#55401) · 75517841

Committed by hong on Jul 20, 2023

* add feed with place op

* remove useless unittest

* update mkldnn

* update

* new ir support builtin slice op

* fix phi kernel adaptor bug

* add enable_static

* remove useless test case

* change feed list to single variable

* support gpu

* fix bug

* remove template

* add more data type

* fix compile bug

75517841

[XPU] fuse cast to conv2d/fc in mixed precision model (#54493) · 4df00939
Committed by zhupengyang on Jul 20, 2023

4df00939

fix bug of constant folding pass (#55556) · bc61c796
Committed by ming1753 on Jul 20, 2023

bc61c796

[Kunlun] Modify some legacy code on distributed training (#55515) · 806f8d2b

Committed by XiaociZhang on Jul 20, 2023

* [Kunlun] Modify some legacy code on distributed training

XPUs previously had limitations: concat/split was not supported, and
c_broadcast only supported fp32. These limitations were lifted
recently.

This PR also adds multi-device profiling support on XPU. Without it,
devices that enable profiling issue a hanging broadcast, which
eventually leads to a kernel timeout error.

* fix typo

806f8d2b

19 Jul, 2023 5 commits
- [CustomPass] add register_pass api (#55511) · 6216beb3
  Committed by ronnywang on Jul 19, 2023
  
  6216beb3
- [PHI CAPI] Add support for registering a new operator, PART1 (#55532) · 3f17596a
  Committed by ronnywang on Jul 19, 2023
  
  3f17596a
- [IR] Add Dependency build for new ir interpretercore (#55468) · fd192303
  Committed by zhangbo9674 on Jul 19, 2023
```
* add interface

* add code

* add code

* add code

* add code

* fix bug

* fix bug
```
  fd192303
- Delete repeat ops add gather squeeze unsqueeze (#55371) · 552ed8d8
  Committed by csy0225 on Jul 19, 2023
  
  552ed8d8
- Sharding stage 1 tensor fusion (#55427) · 4c4d3185
  Committed by Yuang Liu on Jul 19, 2023
  
  4c4d3185
17 Jul, 2023 2 commits
- [IR] finetune the StrAttribute interface. (#55439) · 896d7cfa
  Committed by winter-wang on Jul 17, 2023
  
  896d7cfa
- [0D-Tensor] CINN supports unsqueeze, delete hack in Paddle's pass (#55336) · f736f151
  Committed by HongyuJia on Jul 17, 2023
  
  f736f151
14 Jul, 2023 2 commits

[IR] Refine BuildScope in phi_kernel_util (#55423) · f00a06d8
Committed by zhangbo9674 on Jul 14, 2023
```
* add code

* fix bug

* refine code

* refine code

* fix bug
```
f00a06d8

[IR] Reconstruct the Instruction for NewIrInterpreter (#55239) · 69e9f03e

Committed by zhangbo9674 on Jul 14, 2023

* add inplace interface

* support inplace

* refine code

* fix bug

* fix bug

* refine code

* add file

* add interface

* refine code

* refine code

* add phi kernel instruction

* refine code

* add test

* delete unused code

* add test

* add test

* add deps

* delete unused code

* fix bug

* fix bug

69e9f03e

13 Jul, 2023 4 commits
- [BugFix] Replace include dense_tensor.h with forward declare in phi lib (#55396) · 9619443b
  Committed by Yuanle Liu on Jul 13, 2023
```
* copy dense_tensor.h to inference lib

* update

* update
```
  9619443b
- [BugFix] Fix issue-50853: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED · 78a4e3fd
  Committed by Yuanle Liu on Jul 13, 2023
  
  78a4e3fd
- Support nvprof for auto parallel (#55347) · 9210b1af
  Committed by Ruibiao Chen on Jul 13, 2023
```
* Support nvprof for auto parallel

* Fix CI errors

* Fix CI errors
```
  9210b1af
- [NewIR]new ir support builtin slice op (#55381) · 4b6d2f5f
  Committed by hong on Jul 13, 2023
```
* new ir support builtin slice op

* fix phi kernel adaptor bug
```
  4b6d2f5f
12 Jul, 2023 3 commits

[Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
Committed by Yuanle Liu on Jul 12, 2023
```
* rewrite identity_op_clean_pass

* fix

* adjust identity_op_clean_pass order in gpu passes

* fix ut
```
2363e623

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Committed by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fix sgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>

cfa513f7

[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

Committed by Wang Xin on Jul 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tidy-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug

be3a6fa7

07 Jul, 2023 5 commits
- [XPU] Add layernorm fuse pass (#55154) · eb12739e
  Committed by wz1qqx on Jul 07, 2023
  
  eb12739e
- [XPU] Eliminate small ops (#55193) · b8f265d2
  Committed by wz1qqx on Jul 07, 2023
  
  b8f265d2
- rename WITH_INFERENCE_NVTX to WITH_NVTX and fix compile bug (#55219) · 43843192
  Committed by Yuanle Liu on Jul 07, 2023
```
* fix WITH_SHARED_IR option type

* rename WITH_INFERENCE_NVTX to WITH_NVTX and fix compile bug

* update
```
  43843192
- fix exception bug (#55216) · 31edad21
  Committed by hong on Jul 07, 2023
  
  31edad21
- [CustomDevice]fix == error with place (#55173) · 70df3aa4
  Committed by engineer1109 on Jul 07, 2023
  
  70df3aa4
06 Jul, 2023 2 commits
- [CINN] Re-Implement operator = for two Expr Tree (#55145) · af58cc37
  Committed by 傅剑寒 on Jul 06, 2023
```
* optimize expr operator = implementation

* fix codestyle
```
  af58cc37
- [IR] Refine some code for NewIRInterpreter (#55169) · e9f9da14
  Committed by zhangbo9674 on Jul 06, 2023
```
* fix bug

* fix bug

* refine code

* refine code

* fix bug

* refine code
```
  e9f9da14
05 Jul, 2023 3 commits

[IR] New IR access InterpreterCore: add local scope logic (#55112) · 85831c32

Committed by zhangbo9674 on Jul 05, 2023

* add local scope

* refine code

* refine code

* refine code

* support local scope for BuildFuncList

* fix bug

* add log

* fix bug

* polish code

* fix bug

85831c32

[XPU] add reduce_max_fuse_pass (#54981) · 54a101d5
Committed by wz1qqx on Jul 05, 2023

54a101d5

[NewIR]Fix tensor attribute translator bug (#55129) · bf92ccc7

Committed by hong on Jul 05, 2023

* support optional input in new_ir

* polish code

* add coverage test

* update

* update

* add unittest

* remove duplicate code

* update

* fix assign error

* revert test arg min max

* update

* fix bug

* polish code

bf92ccc7

04 Jul, 2023 2 commits
- posh code (#55114) · b4a149a5
  Committed by hong on Jul 04, 2023
  
  b4a149a5
- [NewIR]Fix null value and support some attribute (#55100) · a2903920
  Committed by hong on Jul 04, 2023
```
* support optional input in new_ir

* polish code

* add coverage test

* update

* update

* add unittest

* remove duplicate code

* set test timeout
```
  a2903920
03 Jul, 2023 2 commits
- [XPU] Fix the topk, set_value ops that using temporary tensors avoiding the... · cc2059a0
  Committed by jiangfan06 on Jul 03, 2023
```
[XPU] Fix the topk, set_value ops that using temporary tensors avoiding the memory overlaps during multi-stream inference (#54851)
```
  cc2059a0
- [Paddle-TRT] use hook to collect shape in CollectShapeRangeInfo API. (#54841) · 989f3dde
  Committed by 周周周 on Jul 03, 2023
```
* commit

* commit

* commit

* commit

* final commit

* use hook to collect shape and shape value
```
  989f3dde
30 Jun, 2023 3 commits
- [XPU] Add conv2d transpose fuse pass (#54904) · 12c15b89
  Committed by mjp9527 on Jun 30, 2023
  
  12c15b89
- [IR&PASS] add conv + bn fuse pattern, and other works (#54933) · 19345fa7
  Committed by Yuanle Liu on Jun 30, 2023
  
  19345fa7
- fix cachek_kv_layout pass (#54994) · 19d6a988
  Committed by JiangHao on Jun 30, 2023
  
  19d6a988
29 Jun, 2023 2 commits

Refactor build attribute (#54968) · eef38db1
Committed by hong on Jun 29, 2023
```
* update

* refactor build context

* fix bug

* polish code

* change func name
```
eef38db1

[CodeStyle][CINN] format cpp code via clang-format (#54961) · af127342

Committed by 张经纬 on Jun 29, 2023

* fix clang-format

* 'fix_clang-format'

* fix remaining errors

* format

* empty commit, re-trigger all ci

* empty commit, re-trigger all ci

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

af127342

PaddlePaddle / Paddle synced successfully about 1 year ago