Commits · 197a4ffee970c807057aeb10df54f607987a8e21 · PaddlePaddle / Paddle

08 Feb, 2023 7 commits
- fuse quantize+transpose and transpose+dequantize (#49509) · 197a4ffe
  Committed by Paulina Gacek on Feb 08, 2023
```
* QuantTranspose pattern is being found by pass

* quant + transpose fuse

* code style changes

* UT written, reorder fixed

* Dequantize + transpose2 fuse added

* pass name changed

* UT added & shift corrected

* got rid of redundancy

* review changes

* AsIntermediate corrected

* compat added
```
- Use inference, save construct time (#50163) · 7a82b6de
  Committed by HongyuJia on Feb 08, 2023
- Fix bn performance degradation (#50287) · 6f1ec935
  Committed by zhangkaihuo on Feb 08, 2023
```
* fix bn performance degradation
```
- [Tensor Support unsigned] Tensor::data() supports unsigned int and bfloat16 (#50257) · 80dc81c5
  Committed by HongyuJia on Feb 08, 2023
```
* support unsigned int and bfloat16

* update unit test

* update DenseTensor datatype

* drop support for more data types in mutable_data(Place)

* fix unittest
```
- [Zero-Dim] Fix 0d axis support for argmin/argmax (#50293) · aec1e4ce
  Committed by Zhong Hui on Feb 08, 2023
- move mixed_vector (#50282) · 35d7d1f0
  Committed by Huang Jiyi on Feb 08, 2023
- [PHI] Unify Fluid and PHI kernel (#49328) · e92e3aab
  Committed by YuanRisheng on Feb 08, 2023
```
* unify_kernel

* fix compile bugs

* modify macro name

* polish code according to review comments

* fix compile bugs

* fix compile bugs

* fix ci bugs

* fix ci bug

* fix ci bugs

* fix ci bugs

* modify code according to review comments

* rm conv_fusion_op
```
07 Feb, 2023 5 commits
- Remove axis in some elementwise api (#50190) · 1dedaada
  Committed by zyfncg on Feb 07, 2023
```
* remove axis in some elementwise api

* fix inplace bug eager-gen

* fix bug

* revert change for CheckInplace

* polish code
```
- fix div 0 error in conv1/2/3 (#49999) · 7a0fdeb9
  Committed by 张春乔 on Feb 07, 2023
- add batch_norm composite rule (#49894) · 9b3a41b1
  Committed by cyber-pioneer on Feb 07, 2023
```
move composite test case

remove unused var

add composite op blacklist
```
- Support build with gcc12 for CUDA less than 12.0 (#50106) · 755049f2
  Committed by chalsliu on Feb 07, 2023
- Fix gather, scatter op 0d tensor GPU error. (#50271) · 05c9c0a5
  Committed by Yuang Liu on Feb 07, 2023
06 Feb, 2023 7 commits
- remove profiler (#50191) · 5a13280a
  Committed by YuanRisheng on Feb 06, 2023
- Delete extra input (Bias, ResidualData) in OpMaker of conv2d (#49121) · 2deada9a
  Committed by zyfncg on Feb 06, 2023
```
* remove extra input of conv2d

* fix bug

* fix unittest bug

* adjust conv2d.pbtxt

* fix cpu_quantize_pass_tester

* revert use_addto of conv2d

* fix runtime attribute

* fix bug

* recover force_fp32_output in conv2d

* refine error info

* fix bug
```
- fix div 0 error of split (#49958) · e12c9221
  Committed by 张春乔 on Feb 06, 2023
- [XPU] add int type for concat and split functor (#50200) · b3e5b0c4
  Committed by houj04 on Feb 06, 2023
- unique_consecutive add 0d (#50213) · eb8353a4
  Committed by duanboqiang on Feb 06, 2023
- phi move ReshapeToMatrix & GetValue (#50139) · d09962a1
  Committed by engineer1109 on Feb 06, 2023
- fix gcc12 error: mismatched-new-delete error in custom_device.cc (#47466) · 6d70761e
  Committed by risemeup1 on Feb 06, 2023
03 Feb, 2023 5 commits
- Fix stack overflow of case8: paddle.unique_consecutive (#49983) · 83077f6f
  Committed by RedContritio on Feb 03, 2023
```
* support negative index in unique_consecutive

* add unittest

* add unittest
```
- Replace matmul(v2) with fused_matmul during oneDNN fuse passes (#49515) · 5cfe1645
  Committed by Sławomir Siwek on Feb 03, 2023

* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces

* clean attrs in python tests

* delete checkpoint and restore matmul version

* remove unused code

* matmul and reshape/transpose fuses migrated

* split MatmulOneDNN headers

* fuse activation and eltwise_add

* add fuse_activation

* matmul_transpose_reshape/reshape_transpose_matmul

* matmul + elementwise_add (fused)

* activation temporary modification

* merge newest develop

* remove dependency from other PR

* revert pbtxt

* remove placeholders from matmul_v2

* add description in OPMaker

* remove matmul_v2_op.h and all dependencies

* remove dims changing in base op

* add possibility to fuse already fused_matmul

* restart broken CI

* Empty-Commit

* revert matmul_utils.h

* codestyle

* adjust imports

* add pbtxt file

* 100% matmul unit tests coverage

* trigger CI with minimal changes to develop

* adjust changes to develop

* add fused_matmul op

* inherit base ops

* add "v2"

* move OPMaker

* Gradually add fused_matmul files

* second batch of fused_matmul changes

* split infershapes of matmul_v2 and fused_matmul

* inherit fused_matmul from matmul_v2

* Update paddle/phi/backends/onednn/onednn_reuse.h
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

- Fix div 0 error of case20: paddle.min (#50013) · 50c43dd3
  Committed by RedContritio on Feb 03, 2023
- Fix stack overflow of case3: paddle.metric.accuracy (#49984) · 97411214
  Committed by RedContritio on Feb 03, 2023

* add input check for accuracyOp

* add input check for gpu/accuracyOp

* add unittest

* use rank instead of dimensions in message

* update unittest

* update unittest

- Generate some static graph ops (#49906) · 85490f70
  Committed by HappyHeavyRain on Feb 03, 2023

* generate some static graph ops

* fix the bug of pow

* add REGISTER_ACTIVATION_OP in operators.cmake

* modify the file operators.cmake

02 Feb, 2023 5 commits
- [CustomDevice] refine custom device api (#50152) · dd480273
  Committed by ronnywang on Feb 02, 2023
- Fix div 0 error of case10: paddle.nn.functional.max_pool2d/max_pool3d (#50012) · 1451fa51
  Committed by RedContritio on Feb 02, 2023
```
* add stride check for PoolOutputSize

* add unittest
```
- Several ops support zero dim on GPU and CPU (#49959) · 5db88d08
  Committed by Ccc on Feb 02, 2023
```
* paddle.nn.functional.softmax
* paddle.nn.functional.log_softmax
* paddle.nn.functional.gumbel_softmax
* paddle.nn.functional.prelu
```
- [BugFix] Fix bugs when compiling with OneDNN (#50096) · 3c557e2f
  Committed by YuanRisheng on Feb 02, 2023
```
* fix bugs

* fix ci bugs
```
- Fix the FP16 precision problem of add_n. (#50129) · 14dd68e1
  Committed by liuruyan on Feb 02, 2023
01 Feb, 2023 11 commits
- Fix UFA illegal address access of case3: paddle.crop (#49994) · 34bf3d09
  Committed by RedContritio on Feb 01, 2023
```
* add range check for crop_kernel

* remove shape negative check

* add unittest
```
- Fix null pointer of case8: paddle.slice (#49979) · 3cf50f91
  Committed by RedContritio on Feb 01, 2023
```
* add check for input of slice

* add unittest
```
- Fix div 0 error of case11: paddle.nn.functional.max_pool1d/max_pool2d/max_pool3d (#50010) · 3ab6faa8
  Committed by RedContritio on Feb 01, 2023
```
* add stride check for MaxPool

* add unittests
```
- [Zero-Dim] Fix 0-dim tensor for arg_min_max op. (#49570) · e4e94a88
  Committed by Zhong Hui on Feb 01, 2023

* fix 0-d tensor for arg_min_max op.

* fix xpu.

* fix zero dims

* fix

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update test_zero_dim_tensor.py

* Update test_zero_dim_tensor_xpu.py

* Update test_zero_dim_tensor.py

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

* Update arg_min_max_kernel.cc

- support grid_sampler_grad op for XPU (#49857) · 520f48d6
  Committed by zhangyikun02 on Feb 01, 2023
- [Divide by 0 Error] add lu check (#49974) · f71796b6
  Committed by gouzil on Feb 01, 2023
```
* [Divide by 0 Error] add lu check

* [Divide by 0 Error] lu check migrate to c++
```
- [Divide by 0 Error] add eig check (#49971) · 226a6567
  Committed by gouzil on Feb 01, 2023

* [Divide by 0 Error] add eig check

* [Divide by 0 Error] eig check migrate to c++

* [Divide by 0 Error] Fix class name error

- [Divide by 0 Error] add norm check (#49966) · 5dfddaea
  Committed by gouzil on Feb 01, 2023

* [Divide by 0 Error] add norm check

* [Divide by 0 Error] fix x AttributeError

* [Divide by 0 Error] norm check migrate to c++

- Combination of multiple paddle::memory::allocate operations into one for ops (#49126) · bdae5481
  Committed by limingshu on Feb 01, 2023

* A leap of try for cudaLaunchCooperativeKernel

* fix bugs

* Totally replace the lar cuda kernel

* Fix bugs

* fix code according to comments

* fix code according to review comments

* adding some function overload

* relocate the power operation.

* add bf16 support for index select relevant ops

* revert bf16 type change.

* add changes for more op

* fix code writing bugs

- Fix UFA illegal address access of case4: paddle.unbind (#49995) · 9ce8cfcf
  Committed by RedContritio on Feb 01, 2023

* add axis check for unbind

* add axis range check for unbind

* update unittest and axis validation for unbind

* add unittest invalid axis for unbind

* restore axis extract for unbind

- H2D data transfer optimization for split kernel (#49086) · 057ba778
  Committed by limingshu on Feb 01, 2023

* profile reduce kernel for fp16 and reduceHigherdim

* use reinterpret_cast

* fix for CI on ROCm

* add Macro for ROCm

* ROCm CI config

* ROCm CI config

* unit test repair

* pull

* add common_funcs.h

* reduceType

* Update reduce_function.h

* not higher

* rename

* implement of matmul using cublasLt instead of cublas

* cublasLt bugfix

* Update matmul_kernel_impl.h

* Update matmul_kernel_impl_via_blasLt.h

* for-loop-algo

* PR comments changes

* add macro

* ci unused variable isCublasLt

* ci unused variable isCublasLt macro

* split matmul to autotune

* rewrite the split kernel with segmented_array

* rewrite the split kernel with segmented_array

* rewrite the split kernel with segmented_array

* add some method for cuda_graph

* fix bugs for rocm

* change for ci-error

* ci-model-benchmark reports an unexplained error; restore the original code to check whether it still passes

* add some changes for passing mode_benchmark and coverage ci

* fix ci error

* fix ci-rocm error

* add some changes for header

---------
Co-authored-by: zhangbopd <1299246947@qq.com>
Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>

PaddlePaddle / Paddle synced successfully about 1 year ago