Commits · bfacd706c77e7875275e8187197e793035e2504c · PaddlePaddle / Paddle

18 Jan, 2022 8 commits
- add the uva function for the Tensor (#38950) · bfacd706
  Committed by wawltor on Jan 18, 2022
```
* add the uva api for the tensor

* fix the compiler problem for the uva

* fix the example for the _uva

* fix the compile problem in the pten library

* update the enviroment support for the uva

* use the make_shared replace the shared_ptr
```
  bfacd706
- fix lookup_table_v2 error in kunlun2 (#38855) · df898f8b
  Committed by taixiurong on Jan 18, 2022
  
  df898f8b
- Unify the functor of elementwise and logical ops. (#35767) · b1365d25
  Committed by Yiqun Liu on Jan 18, 2022
  
  b1365d25
- fix trt convert conv2d skip (#38999) · dfa242e4
  Committed by JingZhuangzhuang on Jan 18, 2022
```
* fix trt convert conv2d skip

* fix trt convert conv2d skip
```
  dfa242e4
- modify transpose params check (#39006) · 27f8460a
  Committed by wenbin on Jan 18, 2022
```
* modify params check

* correct compile
```
  27f8460a
- break the circular dependency between reduce and elementwise (#38951) · a1980d9c
  Committed by YuanRisheng on Jan 18, 2022
  
  a1980d9c
- [GPUPS]Fix ps_gpu_wrapper (#38993) · 4aa91fd6
  Committed by zmxdream on Jan 18, 2022
```
* update

* fix ps_gpu_wrapper. test=develop

* fix ps_gpu_wrapper. test=develop
```
  4aa91fd6
- Speedup FP16 Gelu op using fast math and vectorized 8 kernel (#38980) · 8c20d668
  Committed by sneaxiy on Jan 18, 2022
```
* speedup gelu using fast math

* add bwd part
```
  8c20d668
17 Jan, 2022 16 commits

disable unsupported trt dimension (#38962) · 55e9087f
Committed by wenbin on Jan 17, 2022
```
* develop test

* throw

* ne

* wrong cnt
```
55e9087f
fix for conv2D training error (#38938) · 944ea436
Committed by jakpiase on Jan 17, 2022

944ea436

update ipu_executor, remove ipu_optimizer (#38986) · 05c98ec7

Committed by Allen Guo on Jan 17, 2022

Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

05c98ec7

[IPU] update ipu_backend p0 (#38854) · b2aee3e3

Committed by Allen Guo on Jan 17, 2022

* update ipu_backend

* sync with paddle internal
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* apply comments 01

* update error messag

* restore ipu_executor and ipu_optimizer

* add clang-format on
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

b2aee3e3

expose input variables that only shape needed in each subgraph that compiled by CINN (#38367) · b4cb3589

Committed by CtfGo on Jan 17, 2022

collecting input variables that only shape needed of each subgraph that compiled by CINN in build_cinn_pass, and expose them to memory optimization of framework passes by declaring DECLARE_INPLACE_OP_INFERER in cinn_launch op.

b4cb3589

remove MakePtenDenseTensor in op compute (#38910) · 04f042a5
Committed by zyfncg on Jan 17, 2022

04f042a5

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Committed by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

c48a9ad5

fix benchmark in paddlerec (#38278) · 1dbc8632
Committed by wangguanqun on Jan 17, 2022

1dbc8632

add TransferCastOpPass, DeleteScaleOpPass (#38985) · 1006383b

Committed by Allen Guo on Jan 17, 2022

Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

1006383b

[IPU] update ipu releated passes p0 (#38846) · 84f257bd

Committed by Allen Guo on Jan 17, 2022

* update ipu releated passes
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* remove ipu_pass_base

* update error msg

* update error msg 02

* split pr 01

* restore ipu_pass_base
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

84f257bd

Add NoReduce mode for ParallelExecutor (#38969) · e50d883e
Committed by sneaxiy on Jan 17, 2022
```
* add no reduce mode for pe

* add NoReduce ut
```
e50d883e
add squared_l2_norm (#38968) · 6eeb16b8
Committed by sneaxiy on Jan 17, 2022

6eeb16b8
[part 5]change type of function args (#38889) · ac933235
Committed by Zhang Ting on Jan 17, 2022

ac933235
add convert func for string helper (#38600) · 73742d36
Committed by sneaxiy on Jan 17, 2022

73742d36
Support auto prune logic in eager mode (#38960) · f81569e3
Committed by Jiabin Yang on Jan 17, 2022
```
* support test_auto_prune_partial

* support rest of autoprune strategy in eager mode
```
f81569e3
Removed debug info (#38947) · 3115d005
Committed by Zhanlue Yang on Jan 17, 2022

3115d005

15 Jan, 2022 3 commits
- updates the ctor of tensor, test=develop (#38946) · 5c358674
  Committed by 石晓伟 on Jan 15, 2022
  
  5c358674
- [PTen] Remove cached kernel context (#38953) · 35d2b71a
  Committed by Chen Weihang on Jan 15, 2022
```
* remove cached kernel context

* revert dataloader format change
```
  35d2b71a
- [Unify Tensors PR #7] Merged LoDTensor with Tensor, test=allcases (#38880) · 88966b28
  Committed by Zhanlue Yang on Jan 15, 2022
```
* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Fixed example code failure

* Polished function names, removed duplicated forward declarations
```
  88966b28
14 Jan, 2022 5 commits

add flatten_contiguous_range OpConvert for Paddle-TRT (#38922) · 050aa6fe

Committed by heliqi on Jan 14, 2022

* add trt_convert_flatten_contiguous_rang op

* trt version >7,support trt_convert_flatten_contiguous_rang

* trt version >7,support trt_convert_flatten_contiguous_rang

* trt version >7,support trt_convert_flatten_contiguous_rang

* test cast add trt version >=7 skip

050aa6fe

[XPU]add stack_grad op for kunlun2,*test=kunlun (#38674) · 87ee3e4f

Committed by Zhangjingyu06 on Jan 14, 2022

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun,*test=kunlun

* [XPU]add stack_grad op for kunlun2,*test=kunlun
Co-authored-by: QingshuChen <chenqingshu@baidu.com>

87ee3e4f

refactor impl of elementwise op part2 (#38898) · 556d5097
Committed by YuanRisheng on Jan 14, 2022

556d5097

[MLU]Add mean and reduce_mean op (#38872) · 7f8d5bc8

Committed by qipengh on Jan 14, 2022

* [MLU]: add mean and reduce mean op

* [MLU]add mlu pytest dir in CMakeLists.txt

* [MLU]fix tensor data

* [MLU]fix TensorToPyArray and license

7f8d5bc8

remove interface: DenseTensor::release, test=develop (#38937) · 9ff989ae
Committed by 石晓伟 on Jan 14, 2022

9ff989ae

13 Jan, 2022 8 commits

[Paddle-Inference] add Paddle Trt config: with_interleaved (#38884) · dccdc719
Committed by Wangzheee on Jan 13, 2022
```
* add Paddle Trt config: with_interleaved
```
dccdc719
[bug fix] fix unfold bug in compile time (#38907) · 7f123456
Committed by shangliang Xu on Jan 13, 2022

7f123456
[NPU] fix tril_triu (#38864) · eaccdc71
Committed by furnace on Jan 13, 2022
```
[NPU] fix tril_triu
```
eaccdc71
[NPU] fix expand op (#38526) · 7a5af630
Committed by furnace on Jan 13, 2022
```
* [NPU] fix expand op

* [NPU] optimize codes

* [NPU] optimize codes
```
7a5af630

[pten]Remove pten/include dir files (#38878) · 7e0292ea

Committed by chentianyu03 on Jan 13, 2022

* move dot_dev api into dot_kernel.h

* add infermate header

* modify to dotkerel in dot_op.h

* mvoe conj dev api into complex_kernel.h

* move sign dev api into  sign_kernel.h

* move scale dev api into kernel.h and remove infermete.h

* rm paddle/pten/include/math.h

* rm paddle/pten/include/math.h

* rm include dir

* rm paddle/pten/include/math.h

* fix conflict with develop branch

* rm devContext in conj_op.h

* add the missing complex_kernel header

7e0292ea

[fleet_executor] fix uninitialized pointer (#38904) · a6cf6cdd
Committed by LiYuRio on Jan 13, 2022

a6cf6cdd
roi_align aligned supported (#38905) · 08dcea18
Committed by wenbin on Jan 13, 2022
```
roi_align aligned supported
```
08dcea18

Added mul BF16/FP32 FWD/BWD oneDNN kernel (#38552) · fc6eed5b

Committed by jakpiase on Jan 13, 2022

* base changes for mul reimplementation

* empty commit

* tmp save

* full implementation of mul bf16/fp32 fwd bwd

* CI fix

* CI rerun

* changed unity build cmake to avoid gpu issues

* removed mul mkldnn from unity build

* added skipping tests if not cpu_bf16

* CI fix

* CI fix

* CI fix

fc6eed5b

PaddlePaddle / Paddle synced successfully about 1 year ago
