Commits · a89e48fe8567d33478961c4336dbe0460690d63f · BaiXuePrincess / Paddle

07 Sep, 2022 23 commits
- use xxhash instead of cryptopp (#45837) · a89e48fe
  Committed by Leo Chen on Sep 07, 2022
- [XPU] update xdnn to 0907. (#45777) · 1e981d0d
  Committed by houj04 on Sep 07, 2022
```
* [XPU] update xdnn to 0906. test=kunlun

* [XPU] update xdnn to 0907. test=kunlun
```
- rename the template type name for tranpose (#45834) · 9b70c556
  Committed by Yuang Liu on Sep 07, 2022
- [Phi] Fix infermeta bug for vector input and output (#45810) · 420d186a
  Committed by Chen Weihang on Sep 07, 2022
```
* fix infermeta bug for vector input and output

* add unittest
```
- Construct exec and ctx only once in cond op to speed up (#45794) · ba653e7b
  Committed by WangZhen on Sep 07, 2022
```
* Construct exec and ctx only once in cond op to speed up

* Fix construct function error
```
- Fix fused cuda op's mutable data [2] (#45562) · 4bbbed9a
  Committed by Wilber on Sep 07, 2022
- fix nullptr bug of BmmGradInferMeta (#45765) · 26d161ef
  Committed by BiynXu on Sep 07, 2022
- [PHI] Migrate reduce sum+grad, mean+grad, min and max oneDNN kernels (#45536) · 22255528
  Committed by piotrekobi on Sep 07, 2022
```
* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* Migrate reduce_op oneDNN kernels to phi

* Remove unnecessary header

* remove fluid code

* onednn renaming

* Change std::vector<int64_t> to IntArray

* Fix code style

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Move more functions from mkldnn_helper.h to onednn_helpper.h

* Change MKLDNN to OneDNN in VLOG message

* Implement reviewer suggestions
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>
```
- [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
- [alphafold] Transpose support large tensors where there numel is bigger than INT32_MAX (#45753) · d9a9e638
  Committed by Yuang Liu on Sep 07, 2022
- Fix bug for AutoGrowthBestFitAllocator build (#45806) · fcbb307c
  Committed by Ruibiao Chen on Sep 07, 2022
- Optimiza params sync between CPU and GPU. (#45805) · a2b2af90
  Committed by Wilber on Sep 07, 2022
```
* enable memory optimize when fp16.

* optimiza params sync between cpu and gpu.
```
- Clear extra attrs of reduce op in OpMaker (#45786) · 63b6a11b
  Committed by zyfncg on Sep 07, 2022
```
* clear extra attrs of reduce op in opmaker

* fix reduce_mean
```
- [XPU] move rnn op to phi. (#45822) · 91631492
  Committed by houj04 on Sep 07, 2022
- Layernorm shift partition (#45736) · 960109af
  Committed by wenbin on Sep 07, 2022
```
* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT
```
- [Auto Parallel] Support Iterable dataset for auto parallel (#45518) · b77fa1d9
  Committed by caozhou on Sep 07, 2022
```
* support iterable dataset for auto parallel

* add split_data proto

* fix unittest bug

* fix recompute bug

* update cmake
```
- [MLU] fix sync_bn of mlu and add unittests (#45707) · 500f070d
  Committed by qipengh on Sep 07, 2022
```
* [MLU] fix sync_bn of mlu and add unittests

* [MLU] remove redunant code of pytest
```
- add device context getter (#45790) · b7d219be
  Committed by LiYuRio on Sep 07, 2022
- Performance fix for broadcast kernel [Part2] (#40051) · 87cba48b
  Committed by limingshu on Sep 07, 2022
```
* first commit

* merged with develop

* merged with develop

* fix merge sequential one dims bugs
```
- [PHI] Migrate scale kernel (#45537) · 429b5b5b
  Committed by Sławomir Siwek on Sep 07, 2022
```
* scale kernel

* endline

* add inplace

* fix merge conflicts

* Merge conflicts
```
- [InferMeta] add compile-time infermeta logic for stack infermeta. (#45528) · 5a4ceb32
  Committed by xiongkun on Sep 07, 2022
```
* add compile-time infermeta logic for stack infermeta.

* add unittest for stack infermeta where -1 exists in shapes.

* remove backward changes.
```
- [Sparse]Rename sparse kernel (#45730) · 36739748
  Committed by zhangkaihuo on Sep 07, 2022
- Fix UpdateLossScalingKernel to prevent data transform error (#45809) · c084a7b1
  Committed by sneaxiy on Sep 07, 2022
```
* fix amp kernel

* update to remove PADDLE_WITH_XPU macro
```
06 Sep, 2022 17 commits
- [PHI]Add TensorArray for PHI (#45479) · 68f99b78
  Committed by YuanRisheng on Sep 06, 2022
```
* add tensor array

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* update by comment

* update code
```
- fix cmake download program (#45800) · 3f3f923b
  Committed by danleifeng on Sep 06, 2022
- [jit] pe engine with mkldnn (#45728) · 0a7e6f90
  Committed by Hui Zhang on Sep 06, 2022
```
* using mkldnn

* using with mkldnn macro

* fix use mkldnn
```
- enable memory optimize when fp16. (#45792) · 1967c6a6
  Committed by Wilber on Sep 06, 2022
- Added concat workaround for vivo model (#45091) · 8f37c66f
  Committed by jakpiase on Sep 06, 2022
```
* concat workaround

* CI rerun
```
- migrate deformable_conv and merged momentum kernels to phi, test=kunlun (#45691) · 7f3c7aeb
  Committed by ykkk2333 on Sep 06, 2022
- Enable startup program for standalone executor (#45314) · 6df93364
  Committed by Ruibiao Chen on Sep 06, 2022
```
* Enable startup program for standalone executor

* Disable test_py_reader_using_executor

* Fix test_parallel_executor_mnist

* Fix CI errors

* Fix CI errors
```
- Update protobuf output format for profiler (#45724) · 23bc0e3c
  Committed by chenjian on Sep 06, 2022
```
* update protobuf format

* fix protobuf content

* fix file mode

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* support rocm
```
- [Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is... · ddc244d3
  Committed by zhoutianzi666 on Sep 06, 2022
```
[Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is shared  between ops (#45719)

* fix_old_format

* fix bug in quant_conv2d_dequant

* fix bug in quant_conv2d_dequant
```
- migrate unsqueeze kernels to phi, test=kunlun (#45673) · 4acf1ef7
  Committed by ykkk2333 on Sep 06, 2022
- take some notes about sparse API (#45720) · 5c95e5c8
  Committed by OccupyMars2025 on Sep 06, 2022
- fix mkldnn bugs (#45770) · 23def396
  Committed by YuanRisheng on Sep 06, 2022
- [Eager, Performance optimization] reduce_all interface move reduce_all flag... · 192b3033
  Committed by Weilong Wu on Sep 06, 2022
```
[Eager, Performance optimization] reduce_all interface move reduce_all flag from python to C++ (#45744)

* [Eager, Performance optimization] move reduce_all flag from python to c++

* polish reduce_all

* fix ci error

* fix errors
```
- Fix layout autotune in windows ci (#45751) · cd84e1bf
  Committed by niuliling123 on Sep 06, 2022
- Fix DequantizeTwoScale kernel (#45632) · 98a5af1a
  Committed by whs on Sep 06, 2022
- [Eager, Performance optimization] Reduce min/max kernel polish (#45755) · a6476418
  Committed by Weilong Wu on Sep 06, 2022
```
* [Eager, Performance optimization] reduce_max / min polish

* polish reduce_max / min

* update min/max kernel reduce_all logic

* fix a mistake

* fix ci errors

* fix errors
```
- elementwise op support fp16 (#45496) · f6d9ec27
  Committed by xiaohemaikoo on Sep 06, 2022
