Commits · c48a9ad56e69a5d27d1b36df8c731c9c32f84d78 · 机器未来 / Paddle

17 Jan, 2022 · 5 commits

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Authored by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

c48a9ad5

add TransferCastOpPass, DeleteScaleOpPass (#38985) · 1006383b

Authored by Allen Guo on Jan 17, 2022

Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

1006383b

[IPU] update ipu related passes p0 (#38846) · 84f257bd

Authored by Allen Guo on Jan 17, 2022

* update ipu related passes
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* remove ipu_pass_base

* update error msg

* update error msg 02

* split pr 01

* restore ipu_pass_base
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

84f257bd

Add NoReduce mode for ParallelExecutor (#38969) · e50d883e
Authored by sneaxiy on Jan 17, 2022
```
* add no reduce mode for pe

* add NoReduce ut
```
e50d883e

Removed debug info (#38947) · 3115d005
Authored by Zhanlue Yang on Jan 17, 2022

3115d005

15 Jan, 2022 · 2 commits

[PTen] Remove cached kernel context (#38953) · 35d2b71a
Authored by Chen Weihang on Jan 15, 2022
```
* remove cached kernel context

* revert dataloader format change
```
35d2b71a

[Unify Tensors PR ] Merged LoDTensor with Tensor, test=allcases (#38880) · 88966b28

Authored by Zhanlue Yang on Jan 15, 2022

* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Fixed example code failure

* Polished function names, removed duplicated forward declarations

88966b28

14 Jan, 2022 · 1 commit

[MLU]Add mean and reduce_mean op (#38872) · 7f8d5bc8

Authored by qipengh on Jan 14, 2022

* [MLU]: add mean and reduce mean op

* [MLU]add mlu pytest dir in CMakeLists.txt

* [MLU]fix tensor data

* [MLU]fix TensorToPyArray and license

7f8d5bc8

13 Jan, 2022 · 3 commits

[pten]Remove pten/include dir files (#38878) · 7e0292ea

Authored by chentianyu03 on Jan 13, 2022

* move dot_dev api into dot_kernel.h

* add infermate header

* modify to dotkerel in dot_op.h

* move conj dev api into complex_kernel.h

* move sign dev api into sign_kernel.h

* move scale dev api into kernel.h and remove infermete.h

* rm paddle/pten/include/math.h

* rm paddle/pten/include/math.h

* rm include dir

* rm paddle/pten/include/math.h

* fix conflict with develop branch

* rm devContext in conj_op.h

* add the missing complex_kernel header

7e0292ea

Fix mkldnn invalid infershape impl (#38837) · 281644cd
Authored by Chen Weihang on Jan 13, 2022
```
* fix mkldnn invalid infershape

* add unittest for mkldnn in new executor

* add import os
```
281644cd

splits allocation for pten, test=develop (#38853) · 277cf900
Authored by 石晓伟 on Jan 13, 2022

277cf900

12 Jan, 2022 · 2 commits

[IPU] add more ops (#38831) · 050fd168

Authored by Allen Guo on Jan 12, 2022

* support more ops

* Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* update date
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

050fd168

Fix conv act int8 scale (#38331) · 4825addd
Authored by Sylwester Fraczek on Jan 12, 2022
```
* fix conv act int8 scale

* add unit test for conv+hard_swish
```
4825addd

11 Jan, 2022 · 1 commit

【PTen】Add dot and matmul grad kernel in pten (#38713) · be817719

Authored by zyfncg on Jan 11, 2022

* refactor matmul directory in pten

* fix merge conflict

* add dot_grad kernel

* add dot_grad kernel in pten

* add matmul_grad kernel

* update the code

* delete useless code in fluid

* fix some bug of running matmul grad kernel

* fix merge conflict

* refactor some code

* refactor code

be817719

10 Jan, 2022 · 7 commits

add retry on pull dense sync (#38793) · 0a7cb901
Authored by yaoxuefeng on Jan 10, 2022

0a7cb901

move get expected kernel args into pten (#38825) · 3a23c1a2
Authored by Chen Weihang on Jan 10, 2022

3a23c1a2

[Unify Tensors PR #6] Removed interfaces & members from lod_tensor,test=allcases (#38811) · 953638e0

Authored by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues

* Removed interfaces & members from lod_tensor,test=allcases

953638e0

Profiler skeleton (#38826) · a8afed69

Authored by liutiexing on Jan 10, 2022

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* profiler skeleton

* update

* update

* update
Co-authored-by: liutiexing <liutiexing@google.com>

a8afed69


1.fix elementwise_add_grad bug. 2. add dropout kernel in kl2 (#38726) · 7b860a23
Authored by taixiurong on Jan 10, 2022

7b860a23

[Unify Tensors PR ] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea

Authored by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues

5c73a6ea

Support setting infershape function for custom grad op (#38776) · 046553c7

Authored by Chen Weihang on Jan 10, 2022

* unify infer_shape func calling

* support set grad infer shape fn for custom op

* unify infershape in new executor and eager

* remove todo comment

* revert infershape in operator

046553c7

07 Jan, 2022 · 2 commits
- [PTen]Refactor flatten_grad kernel (#38712) · 5cf0bb79
  Authored by YuanRisheng on Jan 07, 2022
```
* refactor flatten grad kernel

* fix bugs when run ci unittest

* fix bugs when use default GetExpectedPtenKernelArgs

* xshape sometimes has a null holder; fix this bug
```
  5cf0bb79
- [new-exec] support pten kernel (#38770) · 7f3b0877
  Authored by Leo Chen on Jan 07, 2022
  
  7f3b0877
05 Jan, 2022 · 2 commits
- Add input data type checking in BF16 placement pass (#38702) · 60c51de5
  Authored by joanna.wozna.intel on Jan 05, 2022
  
  60c51de5
- Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d
  Authored by joanna.wozna.intel on Jan 05, 2022
```
* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list
```
  1456b02d
04 Jan, 2022 · 4 commits

[XPU] update XPU device info, test=develop (#37884) · e1187e50
Authored by Qi Li on Jan 04, 2022

e1187e50

[Pten]Move CPU_implementation of elementwise kernel in new directory (#38651) · 7c020c71

Authored by YuanRisheng on Jan 04, 2022

* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* move cpu_impl of elementwise kernel to new directory

7c020c71

[Unify Tensors PR ]Port framework::Tensor members & interfaces to... · dfdc9960

Authored by Zhanlue Yang on Jan 04, 2022

[Unify Tensors PR #3]Port framework::Tensor members & interfaces to pten::DenseTensor, test=allcases (#38473)

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

dfdc9960

heter context support dynamic mf dim (#38487) · 59888bba
Authored by yaoxuefeng on Jan 04, 2022
```
heter context support dynamic mf dim
```
59888bba

31 Dec, 2021 · 5 commits
- add mul_gru_fuse_pass ut (#37772) · bc827307
  Authored by baoachun on Dec 31, 2021
```
* add mul_gru_fuse_pass ut

* update ut

* update ut

* update ut timeout setting

* update ut
```
  bc827307
- fix compile error for fleetwrapper with -DWITH_TESTING=ON (#38603) · 761055f0
  Authored by zmxdream on Dec 31, 2021
  
  761055f0
- [MLU]support calling mlu op from python interface (#38292) · b6bf650a
  Authored by fwenguang on Dec 31, 2021
```
* [MLU]support calling mlu op from python interface

* [MLU]fix

* fix

* [mlu]fix mlu_places

* [mlu]fix required mlu

* fix

* [MLU]fix tensor copy

* [mlu] fix MLUPlace call path
```
  b6bf650a
- Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor (#38607) · 5a6a2d27
  Authored by Zhanlue Yang on Dec 31, 2021
  
  5a6a2d27
- [PTen] Unify data layout of pten and fluid (#38583) · 8d32cef8
  Authored by Chen Weihang on Dec 31, 2021
```
* unify data layout

* fix test_transfer_layout error
```
  8d32cef8
30 Dec, 2021 · 6 commits
- Refactor cpu_quantize_pass (#38019) · 1fa6900e
  Authored by joanna.wozna.intel on Dec 30, 2021
  
  1fa6900e
- flags to choose kp kernel (#38455) · ed2cfecf
  Authored by Feng Xing on Dec 30, 2021
```
This PR adds the runtime flag run_kp_kernel, which chooses which op implementation runs on xpu2. There are two: dynamically linked, and built from kp.
```
  ed2cfecf
- [Auto parallel] Make sure the id semantics of every var and op unique (#38132) · 5620214e
  Authored by Yulong Ao on Dec 30, 2021
```
* [Auto parallel] Make the id of var and op unique

* [Auto Parallel] Rename back dist_context to distop_context
```
  5620214e
- dynamic shape clone (#38520) · 339c34e6
  Authored by wenbin on Dec 30, 2021
```
* dynamic shape clone supported
```
  339c34e6
- [New-Exe]Fix word2vec hang problem using InterpreterCore (#38584) · e683ab50
  Authored by xiongkun on Dec 30, 2021
```
* fix wait for tiexing

* fix word2vec model. new_exe now supports EOF Exception in ReadOp
```
  e683ab50
- [PTen] Remove offset in storage (#38472) · a504ff3f
  Authored by Chen Weihang on Dec 29, 2021
```
* remove offset in storage

* revert api change

* fix custom op slice bug

* fix mutable_data error
```
  a504ff3f

机器未来 / Paddle · In sync with the fork source project
