Commits · 569d6c5b89981744eec77047a41389f01b47d513 · 机器未来 / Paddle

08 Sep, 2022 1 commit
- S
  
  fix fused_gemm_epilogue_op compile error (#45862) · 569d6c5b
  Committed by sneaxiy on Sep 08, 2022
  
  569d6c5b
07 Sep, 2022 29 commits
- C
  [Phi] Migrate save kernel (#45665) · fc66fdb7
  Committed by Chen Weihang on Sep 07, 2022
```
* add save kernel

* add save_sr_kernel

* remove original save_op

* add save gpu kernel

* remove combine kernel

* add port.h include

* add save selected rows test

* remove useless kernel.h
```
  fc66fdb7
- L
  
  use xxhash instead of cryptopp (#45837) · a89e48fe
  Committed by Leo Chen on Sep 07, 2022
  
  a89e48fe
- V
  update security policy test=document_fix (#45843) · f7832fea
  Committed by Vigi Zhang on Sep 07, 2022
```
add running untrusted models in security policy
```
  f7832fea
- H
  [XPU] update xdnn to 0907. (#45777) · 1e981d0d
  Committed by houj04 on Sep 07, 2022
```
* [XPU] update xdnn to 0906. test=kunlun

* [XPU] update xdnn to 0907. test=kunlun
```
  1e981d0d
- Y
  
  rename the template type name for tranpose (#45834) · 9b70c556
  Committed by Yuang Liu on Sep 07, 2022
  
  9b70c556
- C
  [Phi] Fix infermeta bug for vector input and output (#45810) · 420d186a
  Committed by Chen Weihang on Sep 07, 2022
```
* fix infermeta bug for vector input and output

* add unittest
```
  420d186a
- W
  Construct exec and ctx only once in cond op to speed up (#45794) · ba653e7b
  Committed by WangZhen on Sep 07, 2022
```
* Construct exec and ctx only once in cond op to speed up

* Fix construct function error
```
  ba653e7b
- W
  
  Fix fused cuda op's mutable data [2] (#45562) · 4bbbed9a
  Committed by Wilber on Sep 07, 2022
  
  4bbbed9a
- B
  
  fix nullptr bug of BmmGradInferMeta (#45765) · 26d161ef
  Committed by BiynXu on Sep 07, 2022
  
  26d161ef
- P
  [PHI] Migrate reduce sum+grad, mean+grad, min and max oneDNN kernels (#45536) · 22255528
  Committed by piotrekobi on Sep 07, 2022
```
* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* Migrate reduce_op oneDNN kernels to phi

* Remove unnecessary header

* remove fluid code

* onednn renaming

* Change std::vector<int64_t> to IntArray

* Fix code style

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Move more functions from mkldnn_helper.h to onednn_helpper.h

* Change MKLDNN to OneDNN in VLOG message

* Implement reviewer suggestions
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>
```
  22255528
- C
  Fix test_custom_relu_op_jit windows error (#45812) · 352babaa
  Committed by Chen Weihang on Sep 07, 2022
```
* fix test_custom_relu_op_jit windows error

* polish assert format
```
  352babaa
- W
  [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
  fe169bf1
- Y
  
  [alphafold] Transpose support large tensors where there numel is bigger than INT32_MAX (#45753) · d9a9e638
  Committed by Yuang Liu on Sep 07, 2022
  
  d9a9e638
- C
  replace fill_zeros_like op with fill_any_like op (#45657) · 0ddcf30c
  Committed by Charles-hit on Sep 07, 2022
```
* relace fill_zeros_like op with fill_any_like op in backward.py and tensor.py

* Remove unnecessary comments

* modify create op_desc param
```
  0ddcf30c
- R
  
  Fix bug for AutoGrowthBestFitAllocator build (#45806) · fcbb307c
  Committed by Ruibiao Chen on Sep 07, 2022
  
  fcbb307c
- W
  Optimiza params sync between CPU and GPU. (#45805) · a2b2af90
  Committed by Wilber on Sep 07, 2022
```
* enable memory optimize when fp16.

* optimiza params sync between cpu and gpu.
```
  a2b2af90
- Z
  Clear extra attrs of reduce op in OpMaker (#45786) · 63b6a11b
  Committed by zyfncg on Sep 07, 2022
```
* clear extra attrs of reduce op in opmaker

* fix reduce_mean
```
  63b6a11b
- H
  
  [XPU] move rnn op to phi. (#45822) · 91631492
  Committed by houj04 on Sep 07, 2022
  
  91631492
- Y
  
  [dygraph hybrid pp for interleave] Save/Load for interleaved pipeline. (#45797) · a9cc0274
  Committed by Yuang Liu on Sep 07, 2022
  
  a9cc0274
- W
  Layernorm shift partition (#45736) · 960109af
  Committed by wenbin on Sep 07, 2022
```
* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT
```
  960109af
- C
  [Auto Parallel] Support Iterable dataset for auto parallel (#45518) · b77fa1d9
  Committed by caozhou on Sep 07, 2022
```
* support iterable dataset for auto parallel

* add split_data proto

* fix unittest bug

* fix recompute bug

* update cmake
```
  b77fa1d9
- Q
  [MLU] fix sync_bn of mlu and add unittests (#45707) · 500f070d
  Committed by qipengh on Sep 07, 2022
```
* [MLU] fix sync_bn of mlu and add unittests

* [MLU] remove redunant code of pytest
```
  500f070d
- L
  
  add device context getter (#45790) · b7d219be
  Committed by LiYuRio on Sep 07, 2022
  
  b7d219be
- W
  
  [Eager, Performance optimization] polish uniform_random (#45807) · 1a372bd1
  Committed by Weilong Wu on Sep 07, 2022
  
  1a372bd1
- L
  Performance fix for broadcast kernel [Part2] (#40051) · 87cba48b
  Committed by limingshu on Sep 07, 2022
```
* first commit

* merged with develop

* merged with develop

* fix merge sequential one dims bugs
```
  87cba48b
- S
  [PHI] Migrate scale kernel (#45537) · 429b5b5b
  Committed by Sławomir Siwek on Sep 07, 2022
```
* scale kernel

* endline

* add inplace

* fix merge conflicts

* Merge conflicts
```
  429b5b5b
- X
  [InferMeta] add compile-time infermeta logic for stack infermeta. (#45528) · 5a4ceb32
  Committed by xiongkun on Sep 07, 2022
```
* add compile-time infermeta logic for stack infermeta.

* add unittest for stack infermeta where -1 exists in shapes.

* remove backward changes.
```
  5a4ceb32
- Z
  
  [Sparse]Rename sparse kernel (#45730) · 36739748
  Committed by zhangkaihuo on Sep 07, 2022
  
  36739748
- S
  Fix UpdateLossScalingKernel to prevent data transform error (#45809) · c084a7b1
  Committed by sneaxiy on Sep 07, 2022
```
* fix amp kernel

* update to remove PADDLE_WITH_XPU macro
```
  c084a7b1
06 Sep, 2022 10 commits
- Y
  [PHI]Add TensorArray for PHI (#45479) · 68f99b78
  Committed by YuanRisheng on Sep 06, 2022
```
* add tensor array

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* update by comment

* update code
```
  68f99b78
- D
  
  fix cmake download program (#45800) · 3f3f923b
  Committed by danleifeng on Sep 06, 2022
  
  3f3f923b
- R
  
  Use logging to print log of FLAGS_FORCE_USE_PROGRAM_CACHE (#45799) · 1d393cfb
  Committed by Ruibiao Chen on Sep 06, 2022
  
  1d393cfb
- H
  [jit] pe engine with mkldnn (#45728) · 0a7e6f90
  Committed by Hui Zhang on Sep 06, 2022
```
* using mkldnn

* using with mkldnn macro

* fix use mkldnn
```
  0a7e6f90
- W
  
  enable memory optimize when fp16. (#45792) · 1967c6a6
  Committed by Wilber on Sep 06, 2022
  
  1967c6a6
- J
  Added concat workaround for vivo model (#45091) · 8f37c66f
  Committed by jakpiase on Sep 06, 2022
```
* concat workaround

* CI rerun
```
  8f37c66f
- Y
  
  migrate deformable_conv and merged momentum kernels to phi, test=kunlun (#45691) · 7f3c7aeb
  Committed by ykkk2333 on Sep 06, 2022
  
  7f3c7aeb
- C
  
  replace reshape op with reshape2 op (#45735) · d8a09e25
  Committed by Charles-hit on Sep 06, 2022
  
  d8a09e25
- R
  Enable startup program for standalone executor (#45314) · 6df93364
  Committed by Ruibiao Chen on Sep 06, 2022
```
* Enable startup program for standalone executor

* Disable test_py_reader_using_executor

* Fix test_parallel_executor_mnist

* Fix CI errors

* Fix CI errors
```
  6df93364
- C
  Update protobuf output format for profiler (#45724) · 23bc0e3c
  Committed by chenjian on Sep 06, 2022
```
* update protobuf format

* fix protobuf content

* fix file mode

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* fix compiling error when gpu not exists

* support rocm
```
  23bc0e3c

机器未来 / Paddle is in sync with the fork's source project
