Commits · f7832fea70d6d3dfb77312cf549d93851d75818e · PaddlePaddle / Paddle

07 Sep, 2022 16 commits
- Y
  
  rename the template type name for tranpose (#45834) · 9b70c556
  Committed by Yuang Liu on Sep 07, 2022
  
  9b70c556
- C
  [Phi] Fix infermeta bug for vector input and output (#45810) · 420d186a
  Committed by Chen Weihang on Sep 07, 2022
```
* fix infermeta bug for vector input and output

* add unittest
```
  420d186a
- W
  Construct exec and ctx only once in cond op to speed up (#45794) · ba653e7b
  Committed by WangZhen on Sep 07, 2022
```
* Construct exec and ctx only once in cond op to speed up

* Fix construct function error
```
  ba653e7b
- W
  
  Fix fused cuda op's mutable data [2] (#45562) · 4bbbed9a
  Committed by Wilber on Sep 07, 2022
  
  4bbbed9a
- P
  [PHI] Migrate reduce sum+grad, mean+grad, min and max oneDNN kernels (#45536) · 22255528
  Committed by piotrekobi on Sep 07, 2022
```
* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* Migrate reduce_op oneDNN kernels to phi

* Remove unnecessary header

* remove fluid code

* onednn renaming

* Change std::vector<int64_t> to IntArray

* Fix code style

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Move more functions from mkldnn_helper.h to onednn_helpper.h

* Change MKLDNN to OneDNN in VLOG message

* Implement reviewer suggestions
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>
```
  22255528
- W
  [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
  fe169bf1
- Y
  
  [alphafold] Transpose support large tensors where there numel is bigger than INT32_MAX (#45753) · d9a9e638
  Committed by Yuang Liu on Sep 07, 2022
  
  d9a9e638
- R
  
  Fix bug for AutoGrowthBestFitAllocator build (#45806) · fcbb307c
  Committed by Ruibiao Chen on Sep 07, 2022
  
  fcbb307c
- W
  Optimiza params sync between CPU and GPU. (#45805) · a2b2af90
  Committed by Wilber on Sep 07, 2022
```
* enable memory optimize when fp16.

* optimiza params sync between cpu and gpu.
```
  a2b2af90
- Z
  Clear extra attrs of reduce op in OpMaker (#45786) · 63b6a11b
  Committed by zyfncg on Sep 07, 2022
```
* clear extra attrs of reduce op in opmaker

* fix reduce_mean
```
  63b6a11b
- H
  
  [XPU] move rnn op to phi. (#45822) · 91631492
  Committed by houj04 on Sep 07, 2022
  
  91631492
- W
  Layernorm shift partition (#45736) · 960109af
  Committed by wenbin on Sep 07, 2022
```
* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT
```
  960109af
- C
  [Auto Parallel] Support Iterable dataset for auto parallel (#45518) · b77fa1d9
  Committed by caozhou on Sep 07, 2022
```
* support iterable dataset for auto parallel

* add split_data proto

* fix unittest bug

* fix recompute bug

* update cmake
```
  b77fa1d9
- Q
  [MLU] fix sync_bn of mlu and add unittests (#45707) · 500f070d
  Committed by qipengh on Sep 07, 2022
```
* [MLU] fix sync_bn of mlu and add unittests

* [MLU] remove redunant code of pytest
```
  500f070d
- L
  
  add device context getter (#45790) · b7d219be
  Committed by LiYuRio on Sep 07, 2022
  
  b7d219be
- S
  [PHI] Migrate scale kernel (#45537) · 429b5b5b
  Committed by Sławomir Siwek on Sep 07, 2022
```
* scale kernel

* endline

* add inplace

* fix merge conflicts

* Merge conflicts
```
  429b5b5b
06 Sep, 2022 22 commits
- Y
  [PHI]Add TensorArray for PHI (#45479) · 68f99b78
  Committed by YuanRisheng on Sep 06, 2022
```
* add tensor array

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* update by comment

* update code
```
  68f99b78
- D
  
  fix cmake download program (#45800) · 3f3f923b
  Committed by danleifeng on Sep 06, 2022
  
  3f3f923b
- H
  [jit] pe engine with mkldnn (#45728) · 0a7e6f90
  Committed by Hui Zhang on Sep 06, 2022
```
* using mkldnn

* using with mkldnn macro

* fix use mkldnn
```
  0a7e6f90
- W
  
  enable memory optimize when fp16. (#45792) · 1967c6a6
  Committed by Wilber on Sep 06, 2022
  
  1967c6a6
- J
  Added concat workaround for vivo model (#45091) · 8f37c66f
  Committed by jakpiase on Sep 06, 2022
```
* concat workaround

* CI rerun
```
  8f37c66f
- Y
  
  migrate deformable_conv and merged momentum kernels to phi, test=kunlun (#45691) · 7f3c7aeb
  Committed by ykkk2333 on Sep 06, 2022
  
  7f3c7aeb
- R
  Enable startup program for standalone executor (#45314) · 6df93364
  Committed by Ruibiao Chen on Sep 06, 2022
```
* Enable startup program for standalone executor

* Disable test_py_reader_using_executor

* Fix test_parallel_executor_mnist

* Fix CI errors

* Fix CI errors
```
  6df93364
- C
  Update protobuf output format for profiler (#45724) · 23bc0e3c
  Committed by chenjian on Sep 06, 2022
```
* update protobuf format

* fix protobuf content

* fix file mode

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* support rocm
```
  23bc0e3c
- Z
  [Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is... · ddc244d3
  Committed by zhoutianzi666 on Sep 06, 2022
```
[Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is shared  between ops (#45719)

* fix_old_format

* fix bug in quant_conv2d_dequant

* fix bug in quant_conv2d_dequant
```
  ddc244d3
- Y
  
  migrate unsqueeze kernels to phi, test=kunlun (#45673) · 4acf1ef7
  Committed by ykkk2333 on Sep 06, 2022
  
  4acf1ef7
- O
  
  take some notes about sparse API (#45720) · 5c95e5c8
  Committed by OccupyMars2025 on Sep 06, 2022
  
  5c95e5c8
- Y
  
  fix mkldnn bugs (#45770) · 23def396
  Committed by YuanRisheng on Sep 06, 2022
  
  23def396
- N
  
  Fix layout autotune in windows ci (#45751) · cd84e1bf
  Committed by niuliling123 on Sep 06, 2022
  
  cd84e1bf
- W
  
  Fix DequantizeTwoScale kernel (#45632) · 98a5af1a
  Committed by whs on Sep 06, 2022
  
  98a5af1a
- Z
  Clear extra attributes of matmul_v2 in OpMaker (#45708) · d4c4c53d
  Committed by zyfncg on Sep 06, 2022
```
* set use_cudnn=true for conv2d

* clear opmaker of matmul_v2

* fix bug of set_attr

* add extra attr checker in infer_shape
```
  d4c4c53d
- Z
  
  clear extra attrs of some op in opmaker (#45758) · 22f042ba
  Committed by zyfncg on Sep 06, 2022
  
  22f042ba
- L
  [TRT] Add silu converter (#45588) · dd0f9b96
  Committed by LielinJiang on Sep 06, 2022
```
* add silu converter
```
  dd0f9b96
- L
  Fix grad error of groupnorm op when cuda version==11.7 (#45738) · b0a3638f
  Committed by LielinJiang on Sep 06, 2022
```
* fix grad error of grounorm op when cuda version==11.7
```
  b0a3638f
- W
  [Paddle-Inference] remove int8 fallback (#45762) · 31efe00a
  Committed by Wangzheee on Sep 06, 2022
```
* remove int8 fallback
```
  31efe00a
- C
  
  add op count by lib method (#45680) · 8d4f2613
  Committed by Chen Weihang on Sep 06, 2022
  
  8d4f2613
- W
  
  Completes basic dtypes for collective api in eager mode (#45574) · 7a92e74b
  Committed by Wen Sun on Sep 06, 2022
  
  7a92e74b
- H
  
  [XPU] rmsprop to phi. (#45734) · 1137677a
  Committed by houj04 on Sep 06, 2022
  
  1137677a
05 Sep, 2022 2 commits

[PHI] Move oneDNN helper classes to new location (#45626) · 269bd1fe

Committed by piotrekobi on Sep 05, 2022

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Move more functions from mkldnn_helper.h to onednn_helpper.h

* Change MKLDNN to OneDNN in VLOG message
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

269bd1fe

New format quant model support for MKLDNN (#45416) · 4e4f4586

Committed by yeliang2258 on Sep 05, 2022

* support onnx format quantized model

* update code

* add test

* add test

* fix

* fix test

* fix cmake

* update code

* change scale file path to calibration file path

* update code

* update code

* fix build bug

* fix build bugs

* fix

* fix

4e4f4586

PaddlePaddle / Paddle: synced successfully about 1 year ago