Commits · 73e441f97d62cebc17f4716e805b67c5d3c00a99 · PaddlePaddle / Paddle

14 Jul, 2023 9 commits
- [CustomDevice] add stream safe allocator support (#55393) · 73e441f9
  Committed by ronnywang on Jul 14, 2023
- fix embedding_with_eltwise_add_xpu (#55354) · 95aab366
  Committed by zhupengyang on Jul 14, 2023
- fix fisher yates sample (#55329) · f311a927
  Committed by Siming Dai on Jul 14, 2023
- [OpCompat] add feed in op_compat.yaml (#55402) · 27fd2bc2
  Committed by kangguangli on Jul 14, 2023
```
* add feed in op_compat.yaml

* remove input mapping
```
- [0D-Tensor] CINN supports transpose, add special case to expand_zero_dim_pass (#55379) · d1b74ba5
  Committed by HongyuJia on Jul 14, 2023
- [IR] Reconstruct the Instruction for NewIrInterpreter (#55239) · 69e9f03e
  Committed by zhangbo9674 on Jul 14, 2023
```
* add inplace interface

* support inplace

* refine code

* fix bug

* fix bug

* refine code

* add file

* add interface

* refine code

* refine code

* add phi kernel instruction

* refine code

* add test

* delete unused code

* add test

* add test

* add deps

* delete unused code

* fix bug

* fix bug
```
- [XPU] Fix yolo_box to support multi-stream based inference (#55310) · 7e4290c5
  Committed by hong19860320 on Jul 14, 2023
- Update CUDNN Frontend API to v0.9.1 (#54949) · 76b77d81
  Committed by Tian Zheng on Jul 14, 2023
```
* Update CUDNN Frontend API to v0.9.1
- Remove old patches
- Remove workarounds that are no longer needed

* Fix test_switch_autotune
```
- fix misspelling of type name (#55398) · f1bffdac
  Committed by hong on Jul 14, 2023
13 Jul, 2023 22 commits
- [BugFix] Replace include dense_tensor.h with forward declare in phi lib (#55396) · 9619443b
  Committed by Yuanle Liu on Jul 13, 2023
```
* copy dense_tensor.h to inference lib

* update

* update
```
- [BugFix] Fix issue-50853: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED · 78a4e3fd
  Committed by Yuanle Liu on Jul 13, 2023
- recover tanh_triple (#55372) · bfb861f5
  Committed by xiaoguoguo626807 on Jul 13, 2023
- [inference] Add FusedBiasActKernel (#55301) · 0a4d1999
  Committed by freeliuzc on Jul 13, 2023
```
* add init value for CudaSwishFunctor

* add new phi kernel fusedBiasActKernel
```
- Support nvprof for auto parallel (#55347) · 9210b1af
  Committed by Ruibiao Chen on Jul 13, 2023
```
* Support nvprof for auto parallel

* Fix CI errors

* Fix CI errors
```
- 【AMP Prim OP】support instance_norm prim ops for fp16 and bf16 dtype (#55368) · 65950324
  Committed by Charles-hit on Jul 13, 2023
```
* [prim]support fp16 for instance_norm and instance_norm_grad

* support fp16 and bfp16 dtype for instance_norm prim rules

* fix new ir test

---------
Co-authored-by: cxxly <chenxx_id@163.com>
```
- add phi operator c_concat and ut (#55320) · 788be26d
  Committed by lil-Xing on Jul 13, 2023
```
* add phi operator c_concat and ut

* update create_var use

* update copyright
```
- [NewIR]new ir support builtin slice op (#55381) · 4b6d2f5f
  Committed by hong on Jul 13, 2023
```
* new ir support builtin slice op

* fix phi kernel adaptor bug
```
- Move compare_raw_kernel to legacy (#53928) · 1dd8770a
  Committed by zhangyuqin1998 on Jul 13, 2023
```
* Move compare_raw_kernel to legacy

* fix

* Update compare_kernel.cc

* Move compare_raw_kernel to legacy
```
- Cinn schedule error (#54983) · 5f05b22b
  Committed by Zhang Zheng on Jul 13, 2023
```
* [CINN] Schedule error message optimization

* format code style

* add test

* fix format

* using CINN_THROW and using flags

* optimize error msg

* do not use abstract class of error handler

* fix header
```
- [CustomDevice] fix device guard (#55351) · 0fd6efbb
  Committed by ronnywang on Jul 13, 2023
- fix bug on case with gpu driver but no gpu (#55335) · acf4a2ae
  Committed by ming1753 on Jul 13, 2023
- fix roi_align roi_pool to static num 0 (#55342) · 0a21836d
  Committed by Feng Ni on Jul 13, 2023
- [CINN] Refactor pass api of group fusion in CINN (#55090) · c80bf368
  Committed by zyfncg on Jul 13, 2023
```
* new group fuse pass api

* fix header

* update

* change logic of get master node to fix bug

* revert update for ReduceFuseReduce

* modify according review

* modify by review

* refine

* update

* fix code-format
```
- [Yaml] Fix bug of code-gen for op_maker (#55369) · 9c5e4b4e
  Committed by zyfncg on Jul 13, 2023
```
* add check of input tensors in Yaml

* fix bug of code-gen for opmaker

* fix bug
```
- fix conv_fusion in multi thread. (#55374) · ceb83562
  Committed by Wilber on Jul 13, 2023
- [0D-Tensor] Support matmul, fix infershape (#55316) · ce8455c0
  Committed by HongyuJia on Jul 13, 2023
- [CINN] comb the op lowering code (#54982) · 3559252a
  Committed by BiynXu on Jul 13, 2023
```
* [CINN] comb the op lowering code

* [CINN] format code of OpLower
```
- Add matmul_int8 op (#55228) · 27cc0df5
  Committed by RichardWooSJTU on Jul 13, 2023
```
* add matmul int8
```
- [NewIR]fix new ir edit distance bug (#55294) · 2194e4c1
  Committed by hong on Jul 13, 2023
```
* fix edit distance bug

* add op define kernel data type

* fix bug

* update

* add header

* add op test to cmake
```
- Modify bf16 and fix the elementwise_max (#54799) · 6f7ceca0
  Committed by Qi Shao on Jul 13, 2023
```
* modify the accuracy checking framework of bf16 optest, including both of forward and backward
```
- [NewIR]Disable copy and assign for Operation (#55328) · 4c5ce835
  Committed by Aurelius84 on Jul 13, 2023
```
* [NewIR]Disable copy and assign for Operation

* add macros.h
```
12 Jul, 2023 9 commits

- [0D-Tensor] CINN supports broadcast_to, fix infershape (#55321) · 276c159d
  Committed by HongyuJia on Jul 12, 2023

- [Semi Auto] Softmax SPMD Rule (#55196) · 885d1aec
  Committed by JZ-LIANG on Jul 12, 2023

* resolute input sharding conflict maybe

* fixed comment

---------
Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com>
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

- [0D-Tensor] CINN supports squeeze, fix infershape and GetPositiveAxes (#55333) · bb0df468
  Committed by HongyuJia on Jul 12, 2023
- [NewIR] fix new ir expand op (#55327) · de9318a3
  Committed by hong on Jul 12, 2023
```
* fix new ir expand op

* fix count bug

* remove useless code
```
- [Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
  Committed by Yuanle Liu on Jul 12, 2023
```
* rewrite identity_op_clean_pass

* fix

* adjust identity_op_clean_pass order in gpu passes

* fix ut
```

- Fix llm int8 build error (#55338) · 006bd959
  Committed by FormlessUnit on Jul 12, 2023

* add macro to avoid llm.int8 build error

* fix ci

---------
Co-authored-by: wufeisheng <wfs1997@163.com>

- [CustomDevice] optimize SplitDenseTensor by calling split_with_num kernel (#55330) · d65209b6
  Committed by ronnywang on Jul 12, 2023
- [CustomDevice] fix release error in process_group_custom (#55293) · 7a705727
  Committed by ronnywang on Jul 12, 2023
```
* [CustomDevice] fix release error for process_group_custom

* update
```

- Support selected rows new ir (#54987) · fc66b5d7
  Committed by hong on Jul 12, 2023

* refine program translator

* fix warning: not override

* fix bug

* merge new modifications

* modify by reviews

* resolve conflicts

* resolve conflicts

* fix

* fix

* update

* support selected rows

* update

* add selectrows

* fix bug

* add ut

* refine code

* refine code

* update

* update

* support selected rows

* support selected rows

* support dense tensor

* remove useless code

* polish code

* remote standalone executor test

---------
Co-authored-by: kangguangli <kangguangli@hotmail.com>
Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>

PaddlePaddle / Paddle synced successfully about 1 year ago
