Commits · 032da73170080c56830dd4bba7a943b4bd45a05e · PaddlePaddle / Paddle

05 Jan, 2023 7 commits
- Support 0D for paddle.sort/argsort (#49501) · 032da731
  Committed by Siming Dai on Jan 05, 2023
```
* support 0D for paddle.sort/argsort

* support 0D tensor for paddle.sort/argsort in xpu

* fix bug

* fix grad and add value assertion
```
  032da731
- [Paddle Inference] Add ci flags for a persistent IBuilder. (#49538) · fcd6d675
  Committed by xiaoxiaohehe001 on Jan 05, 2023
  
  fcd6d675
- Generate the static graph code of ops (#49413) · 39f0eb2c
  Committed by HappyHeavyRain on Jan 05, 2023
```
* generate the static graph code of ops

* modify the isclose comment

* modify the clip comment in nn.py

* reset nn.py
```
  39f0eb2c
- [BugFix] Fix illegal memory overflow for p_norm op (#49537) · ba1dce0a
  Committed by Zhong Hui on Jan 05, 2023
  
  ba1dce0a
- support generate static graph code for imag and real op (#49523) · 192eb4d5
  Committed by zyfncg on Jan 05, 2023
  
  192eb4d5
- fix trace heap overflow (#49548) · 5feadc0b
  Committed by XiangGao on Jan 05, 2023
  
  5feadc0b
- Add to_hash func and paddle2arg map for cinn (#49402) · 1168a178
  Committed by GaoYuYang on Jan 05, 2023
  
  1168a178
04 Jan, 2023 5 commits

Add the input check for softmax_with_cross_entropy (#49333) · f17b2de8
Committed by Guanghua Yu on Jan 04, 2023

f17b2de8

[Inference] Add conv_fusion nhwc impl. (#49047) · 4a8708bb
Committed by Wilber on Jan 04, 2023

4a8708bb

refine diagonal infermeta (#49520) · 852c8db3
Committed by zhangbo9674 on Jan 04, 2023

852c8db3

[Paddle Inference] fix mixed precision diff (#49475) · ac75a9a6
Committed by Yuanle Liu on Jan 04, 2023

ac75a9a6

[Unify KernelKey] change OpKernelType->KernelKey (#49138) · 4383494f

Committed by HongyuJia on Jan 04, 2023

* execute use kernel_key first

* change OpKernelType->KernelKey

* fix py3 compile error, remove redundant header files

* fix build_strategy_test

* fix DataType::RAW

* fix custom_type test: operator_test.cc

* fix transform place

* fix backends_are_same_class

* try fix place TransDataDevice

* support all KernelKey

* fix TransformData

* fix place_are_same_class

* fix merge

* fix test_params_no_grad

* fix specific place of GetExpectedKernelType

* fix specific place of GetExpectedKernelType

* fix GetKernelTypeForVar

* fix dtype error

* fix fetch_v2

* change GetKernelTypeForVar

* fix interpreter

* fix typo error

* polish codes

* polish codes

* polish codes

* fix conflict

4383494f

03 Jan, 2023 3 commits
- H2D data transfer optimization for concat kernel (#49040) · 0de94cd9
  Committed by limingshu on Jan 03, 2023
  
  0de94cd9
- [Paddle Inference] Implement conv2d_fusion NHWC format using cutlass (#47989) · c123dd1e
  Committed by zhoutianzi666 on Jan 03, 2023
```
* Implement conv2d_fusion NHWC format using CUTLASS
* Add unit testing for CUTLASS Conv in inference
* Add experimental API for CUTLASS.
```
  c123dd1e
- Use BroadcastKernel and ReduceKernel to optimize expand and expand_grad. (#49419) · c4604025
  Committed by Yiqun Liu on Jan 03, 2023
```
* Use BroadcastKernel and ReduceKernel to optimize expand and expand_grad.

* Correct the axis when there is only 1 input in BroadcastKernel.

* Add the calculate of output's shape.
```
  c4604025
31 Dec, 2022 1 commit
- support flip 0D (#49460) · cb22a5c7
  Committed by caozhou on Dec 31, 2022
  
  cb22a5c7
30 Dec, 2022 4 commits

[Custom device] Add custom_cpu testcase of custom_relu (#49300) · 69c7edcf

Committed by HongyuJia on Dec 30, 2022

* add custom_cpu testcase

* update test_custom_device_setup

* update path to custom_runtime

* fix cmd wait

* test Linux only

* setup once

* integrate to one run_cmd

* add pip install

* change timeout

* add debug string

* add debug string

* add debug string

* use os.system and change module name

* add runtime

* add more debug message

* continue debug

* timestamp

* fix testcase import bug

* remove error message

* set TIMEOUT property

69c7edcf

revert phi_static (#49433) · 802c5797
Committed by Leo Chen on Dec 30, 2022

802c5797

Support static graph code-gen for squeeze and unsqueeze op (#49430) · 23c1ac2c

Committed by zyfncg on Dec 30, 2022

* support static graph code-gen for squeeze op

* generate static graph code of unsqueeze

* refine op name

* add extra output in op_compat

* remove debug log

23c1ac2c

Unify the English translation of static graph mode and dynamic graph mode in the docs (#49170) · a186e60d

Committed by Sanbu on Dec 30, 2022

* 1219

* temporarily change the num_diff_files limit, test=document_fix

* Revert "temporarily change the num_diff_files limit, test=document_fix"

This reverts commit 8e70f00ef468d2dad0e38b3da06295ed62990d20.

* for codestyle

* remove duplicate license

* `static mode` -> `static graph mode`

* Update hybrid_parallel_inference.py

* Update layer_function_generator.py

* Update manipulation.py

* reset
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

a186e60d

29 Dec, 2022 1 commit
- xpu kernels support api int64 vector inputs, test=kunlun (#49336) · 3c2420a3
  Committed by ykkk2333 on Dec 29, 2022
  
  3c2420a3
28 Dec, 2022 5 commits

fix unique_kernel support axis=-1 (#49385) · ab786715
Committed by sprouteer on Dec 28, 2022

ab786715

[new-exec] Ahead-Of-Time choosing kernel (#48789) · 63d2d722

Committed by Leo Chen on Dec 28, 2022

* add skip run

* alloc minimum memory

* skip check_size in Alloc

* skip check_size in Alloc

* skip check_size in Alloc

* fix cases when tensor is initialized or empty

* alloc empty output for place info

* add test

* increase timeout

* format code

* skip cpu

* add cudnn_deterministic

* fit for hostAlloc

* follow comments

* change check_size to fake_alloc

63d2d722

generate the static graph code of some ops (#49212) · 1804f834

Committed by HappyHeavyRain on Dec 28, 2022

* generate the static op of some ops

* add the VERSION of pixel_shuffle

* change the API doc of isclose

* change the API doc of isclose

* fix the isclose op comment

1804f834

fix_moe (#49353) · 04511cf9
Committed by xiaoxiaohehe001 on Dec 28, 2022

04511cf9

fix bugs of paddle.multiplex API (#49368) · f6f0c562
Committed by Haohongxiang on Dec 28, 2022

f6f0c562

27 Dec, 2022 4 commits
- add unbind op for xpu (#49356) · 16931039
  Committed by zhangyikun02 on Dec 27, 2022
  
  16931039
- fix fold for large bs (#49337) · 9dde26f6
  Committed by xiaoting on Dec 27, 2022
```
* fix fold for large bs

* fix fold for large bs
```
  9dde26f6
- Revert "make bilinear interpolate stable. (#48644)" (#49307) · 17ec1620
  Committed by xiongkun on Dec 27, 2022
```
This reverts commit e1e8bf72.
```
  17ec1620
- [new executor]Support CINN use InterpreterCore (#48911) · 2ca3d3f7
  Committed by zhangbo9674 on Dec 27, 2022
```
* cinn use interpretercore

* fix bug

* fix compile bug

* fix scope bug

* refine code

* refine code by comment

* refine code by comment
```
  2ca3d3f7
26 Dec, 2022 2 commits

fix dlrm qpsproblem (#49171) · c8f76337

Committed by ykkk2333 on Dec 26, 2022

* migrate shaple sgd, split,sign xpu kernels to phi, test=kunlun

* fix dlrm throughput problem, test=kunlun

c8f76337

[0d Tensor] update scatter for zero-dimension tensor (#49279) · 73aa98cf
Committed by Roc on Dec 26, 2022
```
* revert concat and change concat to stack

* let stack kernel support int8, uint8 and bool type
```
73aa98cf

23 Dec, 2022 6 commits
- suport recompute for kunlun (#49069) · 98c17a68
  Committed by QingshuChen on Dec 23, 2022
  
  98c17a68
- Fix arange gpu kernel (#49273) · e073313d
  Committed by Yuanle Liu on Dec 23, 2022
  
  e073313d
- fix matmul double and triple grad (#48779) · 13c4fd59
  Committed by Charles-hit on Dec 23, 2022
```
* fix matmul double and triple grad

* remove some comment

* add matmul_double_grad unit test

* fix matmul triple grad

* fix dot triple grad and add unit test

* modify codestyle

* fix dot_grad

* refactor dot triple grad

* disable some unit test

* fix unit test

* fix unit test in double grad
```
  13c4fd59
- square_grad support fp16 *test=kunlun (#48847) · ae544586
  Committed by haosicheng on Dec 23, 2022
  
  ae544586
- add rnn-t loss and api (#49199) · c088f9ec
  Committed by Hui Zhang on Dec 23, 2022
```
* add warp transducer code
```
  c088f9ec
- Register half datatype for Roll Kernel (#49192) · 3b90a7f3
  Committed by MarDino on Dec 23, 2022
```
* register half datatype

* register roll grad fp16 kernel
```
  3b90a7f3
22 Dec, 2022 2 commits

[eager] use CPUAllocator directly (#47125) · 4537ba23

Committed by Weilong Wu on Dec 22, 2022

* [eager] use CPUAllocator directly

* modify pstring sizeof 48 default

* rm CPU test for NaiveBestFitAllocator

* fix Mac ci compile errors

* use UNUSED to state unused_obj

* mv UNUSED statement to allocator_facade.cc

* fix roi_align

* fix yolov3 test case

* recover original code

* recover original code

* fix trt roi_align test
Co-authored-by: jerrywgz <jerrywgz@126.com>

4537ba23

[Paddle Inference] Add moe phi kernel (#48703) · def2a87f
Committed by xiaoxiaohehe001 on Dec 22, 2022

def2a87f

PaddlePaddle / Paddle synced successfully about 1 year ago