Commits · cc262c5591f3ad704fb7fc91a26a926b3881a0b7 · PaddlePaddle / Paddle

19 Jul, 2023 1 commit
- Fix mea segmentation fault error (#55408) · cc262c55
  Committed by sneaxiy on Jul 19, 2023
```
* fix mea seg fault develop

* fix bias_grad seg fault
```
18 Jul, 2023 4 commits

batch add inpalce api (#55078) · 19302938

Committed by GGBond8488 on Jul 18, 2023

* batch add inpalce api

* fix inplace fn generate

* add test for  new inpalce api

* fix typro

* fix typro

* fix typro

* fix test error

* fix atan2

* remove atan2

* auto genereate inpalce api

* fix inplace generate fn error

* fix windows error

* fix test error

* fix test error

* fix windows ci error

* fix test error

* fix test_error

* fix test error

* fix eigen aliasing error in inplace

* remove elementwise_pow inplace

* fix doc error

* fix test error


[NewIR]Fix new ir concat split bug (#55419) · 5e6645d7

Committed by hong on Jul 18, 2023

* fix new ir concat op bug

* fix bug

* using add_n_with_kernel instead of add_n impl

* fix pd_op yaml bug

* fix bug


[NewIR] fix hsigmoid_loss (#55483) · 38782dc3
Committed by kangguangli on Jul 18, 2023
```
* fix hsigmoid_loss

* add test into whitelist
```
[OpCompat] add cast and repeat_interleave in op_compat.yaml (#55467) · 922d2481
Committed by gouzil on Jul 18, 2023
```
* add cast and repeat_interleave

* fix
```

17 Jul, 2023 6 commits
- update slice in op_compat.yaml (#55432) · c16ab557
  Committed by Zhenghai Zhang on Jul 17, 2023
- Support more dtype for any/all API. (#55253) · 7b19efe4
  Committed by zxcd on Jul 17, 2023
```
* add more data type for all/any.

* remove xpu fix.

* add test unit.

* fix typename name.

* fix output data type.
```
- TensorSetConstantXPU support to use xpu::constant when T is float/float16 (#55122) · 6692dc9a
  Committed by zhangyikun02 on Jul 17, 2023
```
* TensorSetConstantXPU support to use xpu::constant when T is float/float16

* add xpu_wait for TensorSetConstantXPU
```
- update transpose in op_compat.yaml (#55458) · f66a705f
  Committed by RedContritio on Jul 17, 2023
- [OpCompat] add fetch and update mish in op_compat.yaml (#55422) · 8f8fa38a
  Committed by Asthestarsfalll on Jul 17, 2023
```
* [OpCompat] add fetch and update mish in op_compat.yaml

* add missing outputs

* fix codestyle
```
- remove useless move (#55430) · 2982046b
  Committed by Chen Weihang on Jul 17, 2023
15 Jul, 2023 1 commit
- add increment in op_compat.yaml (#55404) · 07567939
  Committed by RedContritio on Jul 15, 2023
14 Jul, 2023 9 commits
- support auto generate for static op elementwise_min (#55008) · 36eb5cde
  Committed by RedContritio on Jul 14, 2023
- [NewIR] Fix logsumexp parameter name mapping (#55406) · 36b2c5e5
  Committed by RedContritio on Jul 14, 2023
- [NewIR] Fix elementwise_add parameter name mapping (#55403) · fd72b329
  Committed by RedContritio on Jul 14, 2023
- [OpCompat] add and update print, split and sync_batch_norm in op_compat.yaml (#55417) · 7eb0457c
  Committed by Wang Xin on Jul 14, 2023
- fix embedding_with_eltwise_add_xpu (#55354) · 95aab366
  Committed by zhupengyang on Jul 14, 2023
- fix fisher yates sample (#55329) · f311a927
  Committed by Siming Dai on Jul 14, 2023
- [OpCompat] add feed in op_compat.yaml (#55402) · 27fd2bc2
  Committed by kangguangli on Jul 14, 2023
```
* add feed in op_compat.yaml

* remove input mapping
```
- [XPU] Fix yolo_box to support multi-stream based inference (#55310) · 7e4290c5
  Committed by hong19860320 on Jul 14, 2023
- Update CUDNN Frontend API to v0.9.1 (#54949) · 76b77d81
  Committed by Tian Zheng on Jul 14, 2023
```
* Update CUDNN Frontend API to v0.9.1
- Remove old patches
- Remove workarounds that are no longer needed

* Fix test_switch_autotune
```
13 Jul, 2023 13 commits
- [BugFix] Replace include dense_tensor.h with forward declare in phi lib (#55396) · 9619443b
  Committed by Yuanle Liu on Jul 13, 2023
```
* copy dense_tensor.h to inference lib

* update

* update
```
- recover tanh_triple (#55372) · bfb861f5
  Committed by xiaoguoguo626807 on Jul 13, 2023
- [inference] Add FusedBiasActKernel (#55301) · 0a4d1999
  Committed by freeliuzc on Jul 13, 2023
```
* add init value for CudaSwishFunctor

* add new phi kernel fusedBiasActKernel
```
- 【AMP Prim OP】support instance_norm prim ops for fp16 and bf16 dtype (#55368) · 65950324
  Committed by Charles-hit on Jul 13, 2023
```
* [prim]support fp16 for instance_norm and instance_norm_grad

* support fp16 and bfp16 dtype for instance_norm prim rules

* fix new ir test

---------
Co-authored-by: cxxly <chenxx_id@163.com>
```
- add phi operator c_concat and ut (#55320) · 788be26d
  Committed by lil-Xing on Jul 13, 2023
```
* add phi operator c_concat and ut

* update create_var use

* update copyright
```
- [NewIR]new ir support builtin slice op (#55381) · 4b6d2f5f
  Committed by hong on Jul 13, 2023
```
* new ir support builtin slice op

* fix phi kernel adaptor bug
```
- Move compare_raw_kernel to legacy (#53928) · 1dd8770a
  Committed by zhangyuqin1998 on Jul 13, 2023
```
* Move compare_raw_kernel to legacy

* fix

* Update compare_kernel.cc

* Move compare_raw_kernel to legacy
```
- [CustomDevice] fix device guard (#55351) · 0fd6efbb
  Committed by ronnywang on Jul 13, 2023
- fix bug on case with gpu driver but no gpu (#55335) · acf4a2ae
  Committed by ming1753 on Jul 13, 2023
- fix roi_align roi_pool to static num 0 (#55342) · 0a21836d
  Committed by Feng Ni on Jul 13, 2023
- fix conv_fusion in multi thread. (#55374) · ceb83562
  Committed by Wilber on Jul 13, 2023
- Add matmul_int8 op (#55228) · 27cc0df5
  Committed by RichardWooSJTU on Jul 13, 2023
```
* add matmul int8
```
- Modify bf16 and fix the elementwise_max (#54799) · 6f7ceca0
  Committed by Qi Shao on Jul 13, 2023
```
* modify the accuracy checking framework of bf16 optest, including both of forward and backward
```
12 Jul, 2023 5 commits

Fix llm int8 build error (#55338) · 006bd959

Committed by FormlessUnit on Jul 12, 2023

* add macro to avoid llm.int8 build error

* fix ci

---------
Co-authored-by: wufeisheng <wfs1997@163.com>


[CustomDevice] fix release error in process_group_custom (#55293) · 7a705727
Committed by ronnywang on Jul 12, 2023
```
* [CustomDevice] fix release error for process_group_custom

* update
```

Support selected rows new ir (#54987) · fc66b5d7

Committed by hong on Jul 12, 2023

* refine program translator

* fix warning: not override

* fix bug

* merge new modifications

* modify by reviews

* resolve conflicts

* resolve conflicts

* fix

* fix

* update

* support selected rows

* update

* add selectrows

* fix bug

* add ut

* refine code

* refien code

* update

* update

* support selected rows

* support selected rows

* support dense tensor

* remove useless code

* polish code

* remote standalone executor test

---------
Co-authored-by: kangguangli <kangguangli@hotmail.com>
Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>


[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Committed by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fixsgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>


[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

Committed by Wang Xin on Jul 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tid-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug


11 Jul, 2023 1 commit

support sharding parallel (#54634) · b7a05057

Committed by pangengzheng on Jul 11, 2023

* support sharding parallel

* fix name

* fix

* update

* test amp for sharding

---------

Co-authored-by: pangengzheng <pangengzheng.baidu.com>


PaddlePaddle / Paddle synced successfully about 1 year ago