Commits · bc153701da60fd042335177c6bf2f145f38fc90b · PaddlePaddle / Paddle

19 Jul, 2023 1 commit
- Y
  
  Sharding stage 1 tensor fusion (#55427) · 4c4d3185
  Committed by Yuang Liu on Jul 19, 2023
  
  4c4d3185
18 Jul, 2023 3 commits
- H
  [NewIR]Fix new ir concat split bug (#55419) · 5e6645d7
  Committed by hong on Jul 18, 2023
```
* fix new ir concat op bug

* fix bug

* using add_n_with_kernel instead of add_n impl

* fix pd_op yaml bug

* fix bug
```
  5e6645d7
- L
  
  fix typo: thream->stream (#55445) · 2558364c
  Committed by Leo Chen on Jul 18, 2023
  
  2558364c
- K
  [NewIR] support custom verify in op definition generation (#55428) · 7bd50187
  Committed by kangguangli on Jul 18, 2023
```
* support custom verify

* fix

* fix

* fix

* fix coverage ci

* remove custom verify in assert
```
  7bd50187
17 Jul, 2023 7 commits
- Z
  
  fix bug (#55471) · e9b8feac
  Committed by zhangbo9674 on Jul 17, 2023
  
  e9b8feac
- W
  
  [IR] optimize the error log. (#55465) · 1d2a91c6
  Committed by winter-wang on Jul 17, 2023
  
  1d2a91c6
- I
  [Paddle-TRT] Support conv2d op enter into trt when filter is not a persistable tensor (#55246) · 74206917
  Committed by iamsonderr on Jul 17, 2023
```
* support_conv2d

* remove comment

* check code style

* add former Test

* check code style

* add unittest

* fix log

* change unittest

---------
Co-authored-by: zhoutianzi666 <17801055074@163.com>
```
  74206917
- M
  [Paddle-TRT] add assign op (#55426) · d778737e
  Committed by ming1753 on Jul 17, 2023
```
* [Paddle-TRT] add assign op
```
  d778737e
- W
  
  [IR] finetune the StrAttribute interface. (#55439) · 896d7cfa
  Committed by winter-wang on Jul 17, 2023
  
  896d7cfa
- H
  
  [0D-Tensor] CINN supports unsqueeze, delete hack in Paddle's pass (#55336) · f736f151
  Committed by HongyuJia on Jul 17, 2023
  
  f736f151
- C
  
  remove useless move (#55430) · 2982046b
  Committed by Chen Weihang on Jul 17, 2023
  
  2982046b
14 Jul, 2023 5 commits
- Z
  [IR] Refine BuildScope in phi_kernel_util (#55423) · f00a06d8
  Committed by zhangbo9674 on Jul 14, 2023
```
* add code

* fix bug

* refine code

* refine code

* fix bug
```
  f00a06d8
- R
  
  support auto generate for static op elementwise_min (#55008) · 36eb5cde
  Committed by RedContritio on Jul 14, 2023
  
  36eb5cde
- R
  
  [CustomDevice] add stream safe allocator support (#55393) · 73e441f9
  Committed by ronnywang on Jul 14, 2023
  
  73e441f9
- Z
  [IR] Reconstruct the Instruction for NewIrInterpreter (#55239) · 69e9f03e
  Committed by zhangbo9674 on Jul 14, 2023
```
* add inplace interface

* support inplace

* refine code

* fix bug

* fix bug

* refine code

* add file

* add interface

* refine code

* refine code

* add phi kernel instruction

* refine code

* add test

* delete unuse code

* add test

* add test

* add deps

* delete unused code

* fix bug

* fix bug
```
  69e9f03e
- H
  
  fix misspelling of type name (#55398) · f1bffdac
  Committed by hong on Jul 14, 2023
  
  f1bffdac
13 Jul, 2023 8 commits
- Y
  [BugFix] Replace include dense_tensor.h with forward declare in phi lib (#55396) · 9619443b
  Committed by Yuanle Liu on Jul 13, 2023
```
* copy dense_tensor.h to inference lib

* update

* update
```
  9619443b
- Y
  
  [BugFix] Fix issue-50853: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED · 78a4e3fd
  Committed by Yuanle Liu on Jul 13, 2023
  
  78a4e3fd
- R
  Support nvprof for auto parallel (#55347) · 9210b1af
  Committed by Ruibiao Chen on Jul 13, 2023
```
* Support nvprof for auto parallel

* Fix CI errors

* Fix CI errors
```
  9210b1af
- C
  【AMP Prim OP】support instance_norm prim ops for fp16 and bf16 dtype (#55368) · 65950324
  Committed by Charles-hit on Jul 13, 2023
```
* [prim]support fp16 for instance_norm and instance_norm_grad

* support fp16 and bfp16 dtype for instance_norm prim rules

* fix new ir test

---------
Co-authored-by: cxxly <chenxx_id@163.com>
```
  65950324
- H
  [NewIR]new ir support builtin slice op (#55381) · 4b6d2f5f
  Committed by hong on Jul 13, 2023
```
* new ir support builtin slice op

* fix phi kernel adaptor bug
```
  4b6d2f5f
- Z
  [Yaml] Fix bug of code-gen for op_maker (#55369) · 9c5e4b4e
  Committed by zyfncg on Jul 13, 2023
```
* add check of input tensors in Yaml

* fix bug of code-gen for opmaker

* fix bug
```
  9c5e4b4e
- R
  Add matmul_int8 op (#55228) · 27cc0df5
  Committed by RichardWooSJTU on Jul 13, 2023
```
* add matmul int8
```
  27cc0df5
- H
  [NewIR]fix new ir edit distance bug (#55294) · 2194e4c1
  Committed by hong on Jul 13, 2023
```
* fix edit distance bug

* add op define kernel data type

* fix bug

* update

* add header

* add op test to cmake
```
  2194e4c1
12 Jul, 2023 8 commits

[Semi Auto] Softmax SPMD Rule (#55196) · 885d1aec

Committed by JZ-LIANG on Jul 12, 2023

* resolute input sharding conflict maybe

* fixed comment

---------
Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com>
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

885d1aec

H
[NewIR] fix new ir expand op (#55327) · de9318a3
Committed by hong on Jul 12, 2023
```
* fix new ir expand op

* fix count bug

* remove useless code
```
de9318a3
Y
[Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
Committed by Yuanle Liu on Jul 12, 2023
```
* rewrite identity_op_clean_pass

* fix

* adjust identity_op_clean_pass order in gpu passes

* fix ut
```
2363e623
R

[CustomDevice] optimize SplitDenseTensor by calling split_with_num kernel (#55330) · d65209b6
Committed by ronnywang on Jul 12, 2023

d65209b6
R
[CustomDevice] fix release error in process_group_custom (#55293) · 7a705727
Committed by ronnywang on Jul 12, 2023
```
* [CustomDevice] fix release error for process_group_custom

* update
```
7a705727

Support selected rows new ir (#54987) · fc66b5d7

Committed by hong on Jul 12, 2023

* refine program translator

* fix warning: not override

* fix bug

* merge new modifications

* modify by reviews

* resolve conflicts

* resolve conflicts

* fix

* fix

* update

* support selected rows

* update

* add selectrows

* fix bug

* add ut

* refine code

* refine code

* update

* update

* support selected rows

* support selected rows

* support dense tensor

* remove useless code

* polish code

* remote standalone executor test

---------
Co-authored-by: kangguangli <kangguangli@hotmail.com>
Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>

fc66b5d7

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Committed by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fix sgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>

cfa513f7

[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

Committed by Wang Xin on Jul 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tidy-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug

be3a6fa7

11 Jul, 2023 4 commits

[NewIR] Fix new ir unsqueeze op bug (#55212) · 852d7a12

Committed by hong on Jul 11, 2023

* support optional input in new_ir

* polish code

* add coverage test

* update

* update

* add unittest

* remove reduplicate code

* update

* fix assign error

* revert test arg min max

* update

* fix bug

* polish code

* update

* fix unique and close op bug

* update

* update

* revert test code

* revert unique test

* polish code

* remove useless code

---------
Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>

852d7a12

Z
[IR] Add op compat info for grad op (#55277) · b4d7e1e0
Committed by zhangbo9674 on Jul 11, 2023
```
* fix bug

* fix bug

* fix bug
```
b4d7e1e0

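Contest task 7: develop the grad_fn and next_functions APIs and expose them to Python, v1 (#54838) · ab46b14c

Committed by qiuwenbo on Jul 11, 2023

* [Attempt] add an attribute to tensor whose value is the constant 1

* expose gradnode and add a new gradnode method (for testing), exposed to Python so it can be accessed from the Python side

* develop the grad_fn and next_functions APIs and expose them to Python; do some normalization cleanup

* add a unit test

* improve code style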

ab46b14c

A
[NewIR]Refine IrPrinter and basic Concept Interface for const Object (#55209) · 4fa3e149
Committed by Aurelius84 on Jul 11, 2023
```
* [NewIR]Refine IrPrinter and basic Concept Interface for const Object
```
4fa3e149

10 Jul, 2023 4 commits
- D
  
  fix generator pickle for custom device (#55247) · b20d22df
  Committed by duanyanhui on Jul 10, 2023
  
  b20d22df
- L
  [SemiAuto] move ut of auto_parallel (#55217) · 89600fa1
  Committed by Leo Chen on Jul 10, 2023
```
* move ut of auto_parallel

* fix ut
```
  89600fa1
- K
  [NewIR] add stop_gradient attribute for defining op (#55235) · c5a191bb
  Committed by kangguangli on Jul 10, 2023
```
* add stop_gradient attribute for defining op

* modify by reviews

* fix
```
  c5a191bb
- Y
  
  [PASS] add constant folding pass (#55099) · 4905a247
  Committed by Yuanle Liu on Jul 10, 2023
  
  4905a247

PaddlePaddle / Paddle synced successfully about 1 year ago
