Commits · 1006383b8485c3409a9a7e09d9623df7e03f7364 · PaddlePaddle / Paddle

Jan 17, 2022 · 10 commits

add TransferCastOpPass, DeleteScaleOpPass (#38985) · 1006383b

Committed by Allen Guo on Jan 17, 2022

Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

[IPU] update ipu releated passes p0 (#38846) · 84f257bd

Committed by Allen Guo on Jan 17, 2022

* update ipu releated passes
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* remove ipu_pass_base

* update error msg

* update error msg 02

* split pr 01

* restore ipu_pass_base
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

Add NoReduce mode for ParallelExecutor (#38969) · e50d883e
Committed by sneaxiy on Jan 17, 2022
```
* add no reduce mode for pe

* add NoReduce ut
```

add squared_l2_norm (#38968) · 6eeb16b8
Committed by sneaxiy on Jan 17, 2022

[part 5]change type of function args (#38889) · ac933235
Committed by Zhang Ting on Jan 17, 2022

add convert func for string helper (#38600) · 73742d36
Committed by sneaxiy on Jan 17, 2022

fix paddle.where torch diff (#38870) · 096afbe1
Committed by ronnywang on Jan 17, 2022
```
* fix paddle.where torch diff

* update
```

[Dy2St]close enable_inplace PASS for PE and open test_mnist_pure_fp16.py for windows (#38752) · 724d49da
Committed by 0x45f on Jan 17, 2022
```
* close enable_inplace PASS for PE, and test dy2st pure fp16 training stability

* add some comment

* enlarge atol
```

Support auto prune logic in eager mode (#38960) · f81569e3
Committed by Jiabin Yang on Jan 17, 2022
```
* support test_auto_prune_partial

* support rest of autoprune strategy in eager mode
```

Removed debug info (#38947) · 3115d005
Committed by Zhanlue Yang on Jan 17, 2022

Jan 16, 2022 · 1 commit
- [Pten] Add select kernel map method for infrt (#38972) · 192184e8
  Committed by Chen Weihang on Jan 16, 2022
```
* add select kernel map method

* fix error
```
Jan 15, 2022 · 6 commits
- updates the ctor of tensor, test=develop (#38946) · 5c358674
  Committed by 石晓伟 on Jan 15, 2022
  
- isolates friends of storage, test=develop (#38977) · d13c7799
  Committed by 石晓伟 on Jan 15, 2022
  
- [PTen] Remove cached kernel context (#38953) · 35d2b71a
  Committed by Chen Weihang on Jan 15, 2022
```
* remove cached kernel context

* revert dataloader format change
```
- replace last contextT (#38971) · 1053b1d5
  Committed by Chen Weihang on Jan 15, 2022
  
- [Unify Tensors PR #7] Merged LoDTensor with Tensor, test=allcases (#38880) · 88966b28
  Committed by Zhanlue Yang on Jan 15, 2022
```
* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Fixed example code failure

* Polished function names, removed duplicated forward declarations
```
- fix performance problem caused by Conj (#38939) · a8879148
  Committed by zyfncg on Jan 15, 2022
  
Jan 14, 2022 · 8 commits
- add flatten_contiguous_range OpConvert for Paddle-TRT (#38922) · 050aa6fe
  Committed by heliqi on Jan 14, 2022
```
* add trt_convert_flatten_contiguous_rang op

* trt version >7,support trt_convert_flatten_contiguous_rang

* trt version >7,support trt_convert_flatten_contiguous_rang

* trt version >7,support trt_convert_flatten_contiguous_rang

* test cast add trt version >=7 skip
```
- [XPU]add stack_grad op for kunlun2,*test=kunlun (#38674) · 87ee3e4f
  Committed by Zhangjingyu06 on Jan 14, 2022
```
* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun,*test=kunlun

* [XPU]add stack_grad op for kunlun2,*test=kunlun
Co-authored-by: QingshuChen <chenqingshu@baidu.com>
```
- [infrt] update the version of llvm. test=develop (#38843) · 0de8a805
  Committed by 王明冬 on Jan 14, 2022
  
- Add dygraph sharding stage3 (#38052) · 4c77a908
  Committed by Baibaifan on Jan 14, 2022
  
- refactor impl of elementwise op part2 (#38898) · 556d5097
  Committed by YuanRisheng on Jan 14, 2022
  
- [MLU]Add mean and reduce_mean op (#38872) · 7f8d5bc8
  Committed by qipengh on Jan 14, 2022
```
* [MLU]: add mean and reduce mean op

* [MLU]add mlu pytest dir in CMakeLists.txt

* [MLU]fix tensor data

* [MLU]fix TensorToPyArray and license
```
- fix bug of -DPADDLE_WITH_SSE3 not set when WITH_AVX AND AVX_FOUND even SSE3_FOUND (#38931) · 9e0686ed
  Committed by Sing_chan on Jan 14, 2022
  
- remove interface: DenseTensor::release, test=develop (#38937) · 9ff989ae
  Committed by 石晓伟 on Jan 14, 2022
  
Jan 13, 2022 · 14 commits
- [PTen] Rename kernel register marco (#38861) · 158bf13f
  Committed by Chen Weihang on Jan 13, 2022
```
* rename register marco

* fix error changing

* fix format error
```
- [Paddle-Inference] add Paddle Trt config: with_interleaved (#38884) · dccdc719
  Committed by Wangzheee on Jan 13, 2022
```
* add Paddle Trt config: with_interleaved
```
- [bug fix] fix unfold bug in compile time (#38907) · 7f123456
  Committed by shangliang Xu on Jan 13, 2022
  
- [NPU] fix tril_triu (#38864) · eaccdc71
  Committed by furnace on Jan 13, 2022
```
[NPU] fix tril_triu
```
- [NPU] fix expand op (#38526) · 7a5af630
  Committed by furnace on Jan 13, 2022
```
* [NPU] fix expand op

* [NPU] optimize codes

* [NPU] optimize codes
```
- force close eager_generator.exe (#38896) · 23aa7b08
  Committed by Sing_chan on Jan 13, 2022
```
* force close eager_generator.exe

* modify according to zhouwei's comment
```
- [pten]Remove pten/include dir files (#38878) · 7e0292ea
  Committed by chentianyu03 on Jan 13, 2022
```
* move dot_dev api into dot_kernel.h

* add infermate header

* modify to dotkerel in dot_op.h

* mvoe conj dev api into complex_kernel.h

* move sign dev api into  sign_kernel.h

* move scale dev api into kernel.h and remove infermete.h

* rm paddle/pten/include/math.h

* rm paddle/pten/include/math.h

* rm include dir

* rm paddle/pten/include/math.h

* fix conflict with develop branch

* rm devContext in conj_op.h

* add the missing complex_kernel header
```
- [Dist Pass] AMP pass add dist_update_loss_scaling op (#38902) · 53783e1e
  Committed by JZ-LIANG on Jan 13, 2022
  
- [fleet_executor] fix uninitialized pointer (#38904) · a6cf6cdd
  Committed by LiYuRio on Jan 13, 2022
  
- roi_align aligned supported (#38905) · 08dcea18
  Committed by wenbin on Jan 13, 2022
```
roi_align aligned supported
```
- Added mul BF16/FP32 FWD/BWD oneDNN kernel (#38552) · fc6eed5b
  Committed by jakpiase on Jan 13, 2022
```
* base changes for mul reimplementation

* empty commit

* tmp save

* full implementation of mul bf16/fp32 fwd bwd

* CI fix

* CI rerun

* changed unity build cmake to avoid gpu issues

* removed mul mkldnn from unity build

* added skipping tests if not cpu_bf16

* CI fix

* CI fix

* CI fix
```
- Fix mkldnn invalid infershape impl (#38837) · 281644cd
  Committed by Chen Weihang on Jan 13, 2022
```
* fix mkldnn invalid infershape

* add unittest for mkldnn in new executor

* add import os
```
- Support test_imperative using_non_zero_gpu with _test_eager_guard() (#38881) · 5e515781
  Committed by Weilong Wu on Jan 13, 2022
```
* Support test_imperative using_non_zero_gpu and Add a TODO comment

* Change GPU number to 0

* Modify the cuda device selection method
```
- splits allocation for pten, test=develop (#38853) · 277cf900
  Committed by 石晓伟 on Jan 13, 2022
  
Jan 12, 2022 · 1 commit
- [part 3]change type of function args (#38887) · 0efcae86
  Committed by Zhang Ting on Jan 12, 2022
```
* code clean

* [part 3]change type of function args
```

PaddlePaddle / Paddle · synced successfully about 1 year ago