Commits · c123dd1e4032efdbfff0bf0c35a58155f2d6e1d9 · PaddlePaddle / Paddle

03 Jan, 2023 6 commits
- [Paddle Inference] Implement conv2d_fusion NHWC format using CUTLASS (#47989) · c123dd1e
  Committed by zhoutianzi666 on Jan 03, 2023
```
* Implement conv2d_fusion NHWC format using CUTLASS
* Add unit testing for CUTLASS Conv in inference
* Add experimental API for CUTLASS.
```
  c123dd1e
- [OpAttr]Fix Ignore AttriteTensor in IndicateDataType bug in grad_op (#49472) · 5ac96468
  Committed by Aurelius84 on Jan 03, 2023
```
* [OpAttr]Fix Ignore AttriteTensor in IndicateDataType bug in grad_op

* add GetExpectedKernelType
```
  5ac96468
- Use BroadcastKernel and ReduceKernel to optimize expand and expand_grad. (#49419) · c4604025
  Committed by Yiqun Liu on Jan 03, 2023
```
* Use BroadcastKernel and ReduceKernel to optimize expand and expand_grad.

* Correct the axis when there is only 1 input in BroadcastKernel.

* Add the calculation of the output's shape.
```
  c4604025
- [Zero-Dim] reshape/reshape_/reverse 0D support (#49357) · 347d2123
  Committed by zhaoyingli on Jan 03, 2023
```
* [Zero-Dim] reshape/reshape_/reverse 0D support

* rm comment

* change paddle.to_tensor to paddle.full

* fix docs

* update paddle.full
```
  347d2123
- Forbid ops that have a 1D intermediate tensor from entering Paddle-TRT (#49378) · 021085e3
  Committed by zhoutianzi666 on Jan 03, 2023
  
  021085e3
- Add not_equal trt converter (#49393) · 822ea0f9
  Committed by Sanbu on Jan 03, 2023
  
  822ea0f9
02 Jan, 2023 1 commit
- Scale Matmul Fuse pass rewritten (#49105) · 18c0a002
  Committed by Hulek on Jan 02, 2023
  
  18c0a002
01 Jan, 2023 1 commit
- memory_optimize remove inplace op (#49431) · aa96ddc3
  Committed by gem5 on Jan 01, 2023
  
  aa96ddc3
31 Dec, 2022 1 commit
- support flip 0D (#49460) · cb22a5c7
  Committed by caozhou on Dec 31, 2022
  
  cb22a5c7
30 Dec, 2022 10 commits

- [CI-Precision] Optimize precision test logic (#49441) · 3e8cec85
  Committed by zhangbo9674 on Dec 30, 2022
```
* speedup getFNDAFile

* add fnda_base for c++ ut cc file

* fix bug

* fix bug

* fix bug

* fix bug
```
3e8cec85

- [Custom device] Add custom_cpu testcase of custom_relu (#49300) · 69c7edcf
  Committed by HongyuJia on Dec 30, 2022

* add custom_cpu testcase

* update test_custom_device_setup

* update path to custom_runtime

* fix cmd wait

* test Linux only

* setup once

* integrate to one run_cmd

* add pip install

* change timeout

* add debug string

* add debug string

* add debug string

* use os.system and change module name

* add runtime

* add more debug message

* continue debug

* timestamp

* fix testcase import bug

* remove error message

* set TIMEOUT property

69c7edcf

- Fix test_conv_bn_fuse_pass_cc on Windows System (#49446) · a4b4343f
  Committed by zyfncg on Dec 30, 2022
```
* fix test_conv_bn_fuse_pass_cc

* remove comment
```
a4b4343f
- [inference][trt] update Convolution to ConvolutionNd (#47653) · 6e5917e4
  Committed by Zhang Jun on Dec 30, 2022
```
* update conv to convNd

* trigger ci
```
6e5917e4
- revert phi_static (#49433) · 802c5797
  Committed by Leo Chen on Dec 30, 2022

802c5797

- Support static graph code-gen for squeeze and unsqueeze op (#49430) · 23c1ac2c
  Committed by zyfncg on Dec 30, 2022

* support static graph code-gen for squeeze op

* generate static graph code of unsqueeze

* refine op name

* add extra output in op_compat

* remove debug log

23c1ac2c

- fix possible bug (#49367) · 18f0ab86
  Committed by HongyuJia on Dec 30, 2022

18f0ab86

- Unify the English translations of static graph mode and dynamic graph mode in the docs (#49170) · a186e60d
  Committed by Sanbu on Dec 30, 2022

* 1219

* temporarily change the num_diff_files limit, test=document_fix

* Revert "temporarily change the num_diff_files limit, test=document_fix"

This reverts commit 8e70f00ef468d2dad0e38b3da06295ed62990d20.

* for codestyle

* remove duplicate license

* `static mode` -> `static graph mode`

* Update hybrid_parallel_inference.py

* Update layer_function_generator.py

* Update manipulation.py

* reset
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

a186e60d

- fix_mac_build_problem (#49435) · 162f8fe2
  Committed by risemeup1 on Dec 30, 2022
```
* fix_mac_build_problem

* fix_mac_build_problem

* fix_mac_build_problem
```
162f8fe2
- Fix default GetExpectedKernelType for ops supported tensor attrs (#49414) · 8a859554
  Committed by WangZhen on Dec 30, 2022
```
* Fix default GetExpectedKernelType for ops supported tensor attrs
```
8a859554

29 Dec, 2022 6 commits
- fix_bug (#49390) · 839e1499
  Committed by risemeup1 on Dec 29, 2022
  
  839e1499
- xpu kernels support api int64 vector inputs, test=kunlun (#49336) · 3c2420a3
  Committed by ykkk2333 on Dec 29, 2022
  
  3c2420a3
- auto parallel bf16 (#49079) · 418edae5
  Committed by xu98bin on Dec 29, 2022
```
* auto parallel bf16
```
  418edae5
- [pglbox2.0]fix load into memory (#49389) · 1078e064
  Committed by zmxdream on Dec 29, 2022
```
* fix load into memory

* fix load into memory

* fix code style
```
  1078e064
- fix ambiguous symbol error (#49406) · 6f07960c
  Committed by MarDino on Dec 29, 2022
  
  6f07960c
- fused_attention_op parameters stop grad support (#49351) · 0bb999b6
  Committed by Wang Bojun on Dec 29, 2022
```
* fusedAttenGrad_noGrad

* code style fix

* add ut

* remove unnecessary log
```
  0bb999b6
28 Dec, 2022 8 commits
- fix unique_kernel support axis=-1 (#49385) · ab786715
  Committed by sprouteer on Dec 28, 2022
  
  ab786715
- [new-exec] Ahead-Of-Time choosing kernel (#48789) · 63d2d722
  Committed by Leo Chen on Dec 28, 2022
```
* add skip run

* alloc minimum memory

* skip check_size in Alloc

* skip check_size in Alloc

* skip check_size in Alloc

* fix cases when tensor is initialized or empty

* alloc empty output for place info

* add test

* increase timeout

* format code

* skip cpu

* add cudnn_deterministic

* fit for hostAlloc

* follow comments

* change check_size to fake_alloc
```
  63d2d722
- generate the static graph code of some ops (#49212) · 1804f834
  Committed by HappyHeavyRain on Dec 28, 2022
```
* generate the static op of some ops

* add the VERSION of pixel_shuffle

* change the API doc of isclose

* change the API doc of isclose

* fix the isclose op comment
```
  1804f834
- fix_moe (#49353) · 04511cf9
  Committed by xiaoxiaohehe001 on Dec 28, 2022
  
  04511cf9
- fix bugs of paddle.multiplex API (#49368) · f6f0c562
  Committed by Haohongxiang on Dec 28, 2022
  
  f6f0c562
- update some trt log (#49330) · 02019804
  Committed by Yuanle Liu on Dec 28, 2022
  
  02019804
- Fix misspelled words in comments (#49366) · e2b2f7d0
  Committed by WangZhen on Dec 28, 2022
  
  e2b2f7d0
- delete old dygraph pylayer (#49339) · 0b60b784
  Committed by wanghuancoder on Dec 28, 2022
```
* delete old dygraph pylayer
```
  0b60b784
27 Dec, 2022 7 commits

- add unbind op for xpu (#49356) · 16931039
  Committed by zhangyikun02 on Dec 27, 2022

16931039
- fix run_setup problem (#49358) · 746a4ddb
  Committed by risemeup1 on Dec 27, 2022
```
* fix run_setup problem

* test
```
746a4ddb
- fix fold for large bs (#49337) · 9dde26f6
  Committed by xiaoting on Dec 27, 2022
```
* fix fold for large bs

* fix fold for large bs
```
9dde26f6
- Revert "make bilinear interpolate stable. (#48644)" (#49307) · 17ec1620
  Committed by xiongkun on Dec 27, 2022
```
This reverts commit e1e8bf72.
```
17ec1620

- [AutoParallel] quantization pass support export (#48072) · 27ce06aa
  Committed by zhaoyingli on Dec 27, 2022

* [AutoParallel] quantization pass support export

* support subgraph

* move_presist_var_to_global_block

* update unittest

* fix ci-coverage

* fix codestyle

* fix fake_dequantize_op

* remove unused var

* fix ci error and approval error

* add unittest for fp16 in test_dequant_linear

* replace mutable data

* fix unittest in non-cuda-core

* fix unittest
Co-authored-by: carryyu <569782149@qq.com>
Co-authored-by: wufeisheng <wfs1997@163.com>

27ce06aa

- [new executor]Support CINN use InterpreterCore (#48911) · 2ca3d3f7
  Committed by zhangbo9674 on Dec 27, 2022

* cinn use interpretercore

* fix bug

* fix compile bug

* fix scope bug

* refine code

* refine code by comment

* refine code by comment

2ca3d3f7

- Support priority scheduling for standalone executor (#49275) · 0839bba3
  Committed by Ruibiao Chen on Dec 27, 2022
```
* Support priority scheduling for standalone executor

* Add CPU test
```
0839bba3

PaddlePaddle / Paddle synced successfully about 1 year ago
