提交 · d985b4b181e502183ebe6d12979a6bc191f9cdd3 · PaddlePaddle / Paddle

14 7月, 2022 1 次提交

[Phi]Improve the mechanism for mkldnn kernel in PHI (#43941) · e9b4d0be

由 YuanRisheng 提交于 7月 14, 2022

* adapt mkldnn kernel in PHI

* fix ci compile bugs

* fix compile bugs

* fix compile bugs

* fix compile bugs

* fix compile bugs

* delete comment

* fix compile bugs in windows-inference

* delete code for converage

* modify code by review

* modify code by review

* add todo

* fix compile bugs

* fix compile bugs

* fix compile bugs

* fix unittest bugsx

e9b4d0be

05 6月, 2022 1 次提交
- S
  
  【code format check upgrade】 step2：clang-format (#42840) · a3730dc8
  由 Sing_chan 提交于 6月 05, 2022
  
  a3730dc8
27 5月, 2022 1 次提交

[Phi] Change optional tensor from `optional<const Tensor&>` to `optional<Tensor>` (#42939) · 6d78524c

由 zyfncg 提交于 5月 27, 2022

* refactor the optional tensor

* remove optiona<MetaTensor> in InferMeta

* fix bug

* fix optional<vector<Tensor>>

* fix bug

* fix rmsprop

* fix amp of eager_gen

* polish code

* fix deleted code

* fix merge conflict

* polish code

* remove is_nullopt_

* fix merge conflict

* fix merge conflict

6d78524c

26 4月, 2022 1 次提交

Remove std::type_index in AttributeArdDef (#42122) · 996358a1

由 Chen Weihang 提交于 4月 26, 2022

* polish some impl

* add lost attr type

* polish details

* fix error type

* polish in name lists

* add double attr

* adapt infrt attr parse

996358a1

24 4月, 2022 1 次提交
- C
  Add paddle::variant and replace paddle::any (#42139) · 79f717d6
  由 Chen Weihang 提交于 4月 24, 2022
```
* add variant and replace any

* split attribute
```
  79f717d6
01 4月, 2022 1 次提交

[Phi]Interploatd kernels into phi (#40855) · d65a7a46

由 chentianyu03 提交于 4月 01, 2022

* add interploate cpu kernel

* fix nullptr bug

* add interpolate gpu kernel

* fix unit test error

* remove raw kernels

* add cuda kernel impl

* add infermeta

* recover accidentally deleted kernels in interpolate op

* fix grad x_grad name error

* remove interpolate_v2_op.h

* rm unused codes

* fix xpu build error

* fix build error

* fix namespace error

* add register header for nup

* fix infermeta error

* modify by review

* add the missing args in test_trt_convert_nearest_interp_v2

d65a7a46

27 3月, 2022 1 次提交

Add StringTensor (#39830) · 0695e1ac

由 Jack Zhou 提交于 3月 27, 2022

* add string tensor and case convert kernels

* Add strings empty kernel; Reorganize the structure of case convert kernel

* Add string infermeta

* Update mutable_data of string tensor

* rename kernel name

* add string copy tmp

* Fix strings copy device bug

* add utf8 gpu converter

* add string tensor c++ api

* Remove mutable_data of string tensor

* update string tensor interface

* remove charcases_flag.h

* remove some fluid headers

* Add make_ddim

* __HIPCC__ -> PADDLE_WITH_HIP

* remove fluid headers

* fix cpu compile

* remove std::hash

* Fix cudaMalloc

* Remove strings/impl directory

* Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps

* Add empty kernel test

* Remove some comments

* Modify lower/upper api encoding type: string->bool

* STRING->PSTRING; Add CreateInferLikeMeta

* Add code gen for C++ String API

* remove strings_api_utils.h

* Add ignore file (strings_api.h, strings_api.cc)

* update strings gen script

* change args order of case convert kernels

* Add comments for pstring, StringTensor

* cpstring_internal.h -> cpstring_impl.h

* Update accordding to comments:

1. Remove fluid headers
2. paddle::platform::errors -> phi::errors
3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()'
4. Use camel code style

* Remove all singletons in strings kernels

* fix rocm compile

* Fix py3 compile

* Fix c++ coverage

* 1. Add pstring proto type
2. Add StringTensor debug info
3. Rename case_convert_kernel to strings_lower_upper
4. Remove serialize derialize strings kernel

* DataLayout::PSTRING -> DataLayout::PSTRING_UNION

* Register pstring data type

* Fix strings api gen

* Fix dense tensor register pstring dtype

* Fix error messages

* remove line

* add pstring unittest

* remove test string api unitest

* remove empty line

* Remove some headers to decrease the size of executable file

0695e1ac

18 3月, 2022 1 次提交
- Z
  Add Sparse OP Maxpool (#40569) · e52ffb70
  由 zhangkaihuo 提交于 3月 18, 2022
```
sparse maxpool; kernel_registry support sparse tensor
```
  e52ffb70
08 3月, 2022 1 次提交
- A
  [custom kernel]Upgrade support for multiple libs (#40223) · c39aa18e
  由 Aganlengzi 提交于 3月 08, 2022
```
* [custom kernel]Upgade support for multi libs

* upgrade phi_custom_kernel deps
```
  c39aa18e
06 3月, 2022 1 次提交

[Phi] Replace all prefix PT by PD and fix typo (#40046) · d30d85da

由 Chen Weihang 提交于 3月 06, 2022

* replace prefix pt by pd

* replace added kernel

* revert util change

* pd kernel to phi

* resolve conflict

* resolve conflict

d30d85da

04 3月, 2022 1 次提交

[Phi] Change input vec tensor to pointer type (#40078) · 50d5bf79

由 Chen Weihang 提交于 3月 04, 2022

* change input vec tensor to pointer

* update input between

* fix format error

* resolve conflict

* resolve conflict

50d5bf79

02 3月, 2022 1 次提交

Move sgd to phi (#40045) · f3d54e2e

由 hong 提交于 3月 02, 2022

* move sgd to phi; test=develop

* update

* add sgd kernel; test=develop

f3d54e2e

28 2月, 2022 1 次提交
- A
  
  [custom kernel] change kernel name judgement and remove macro control for selected_row (#39977) · 49677636
  由 Aganlengzi 提交于 2月 28, 2022
  
  49677636
22 2月, 2022 2 次提交
- A
  [custom kernel]Delete useless and upgrade (#39791) · edc3ba13
  由 Aganlengzi 提交于 2月 22, 2022
```
* [custom kernel]Delete useless

* change RegType enum names

* mod notes

* merge

* update
```
  edc3ba13
- C
  [PTen->Phi PR2] Rename PT_REGISTER macro to PD_REGISTER (#39790) · 4a338796
  由 Chen Weihang 提交于 2月 22, 2022
```
* unify register macro

* rename declare macro

* fix infrt error
```
  4a338796
21 2月, 2022 1 次提交

[PluggableDevice]custom kernel to phi core structs (#39690) · 68631ed4

由 Aganlengzi 提交于 2月 21, 2022

* [PluggableDevice]custom kernel to pten core structs

* mod extension.h for custom op

* compatible python for CI

* support custom context

* refactor to pten

* fix windows and ut

68631ed4

20 2月, 2022 1 次提交

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

由 Chen Weihang 提交于 2月 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

17 2月, 2022 1 次提交
- C
  
  support set fp32 input for fp16 kernel (#39625) · 5fb9cf60
  由 Chen Weihang 提交于 2月 17, 2022
  
  5fb9cf60
15 2月, 2022 1 次提交
- C
  [PTen] Fix single dtype register errror (#39506) · 9fd67ffe
  由 Chen Weihang 提交于 2月 15, 2022
```
* fix single dtype reg errror

* fix windows failed
```
  9fd67ffe
11 2月, 2022 1 次提交

[PTen] Remove pten core's dependency on fluid xxx_info.h (#39401) · d763a91a

由 Chen Weihang 提交于 2月 11, 2022

* ermove xxx_info include

* fix namespace error

* resolve conflict

* skip xpu context in registry

* fix macro error

* resolve conflict

* resolve conflict

* revert xpu convert

* remove trans to fluid place

* remove useless headers

d763a91a

09 2月, 2022 1 次提交

update basic infrastructure (#39383) · b12e7a17

由 hong 提交于 2月 09, 2022

* update basic infrastructure; support string,  suport vecotr<int>, add tensor args type index; test=develop

* remove useless code; test=develop

* fix bug; test=develop

* polish code; test=develop

b12e7a17

30 1月, 2022 1 次提交

[pten] fit get all register op kernels (#39288) · eefe5feb

由 Leo Chen 提交于 1月 30, 2022

* upgrade _get_all_register_op_kernels

* add ut

* support xpu/npu

* fix device id

* enhance TransToFluidPlace

* fix compile

eefe5feb

29 1月, 2022 1 次提交

[PTen] Tidy pten core headers (#39188) · dd990981

由 Chen Weihang 提交于 1月 29, 2022

* open header for custom kernel

* add core utils

* tidy core code

* tify header

* tidy include

* tidy namespace

* resolve conflit

* fix unittest and coverage

* remove platform using

* resolve conflict

* resolve conflict

* fix digamma namespace error

* fix xpu full kernel error

* fix xpu full kernel error

* polish details

* add place for lib storage

dd990981

28 1月, 2022 1 次提交
- Y
  [PTen]Refactor scale kernel that has selected_rows input (#39278) · abfc2fe9
  由 YuanRisheng 提交于 1月 28, 2022
```
* refactor scale kernel that its input is selected_rows

* complement upload file
```
  abfc2fe9
25 1月, 2022 1 次提交

[PTen] Migrate string tinyformat errors and part of enforce into pten (#39051) · 6ca49164

由 xiongkun 提交于 1月 25, 2022

* transfer: string tinyformat errors and part of enforce into pten

* remove comment

* fix by code review

* assert is not compile in -DNDEBUG

* add string as dependences of paddle_inference

6ca49164

22 1月, 2022 1 次提交
- C
  
  add meta tensor for unify infershape (#39131) · 09f6f17c
  由 Chen Weihang 提交于 1月 22, 2022
  
  09f6f17c
13 1月, 2022 1 次提交
- C
  [PTen] Rename kernel register marco (#38861) · 158bf13f
  由 Chen Weihang 提交于 1月 13, 2022
```
* rename register marco

* fix error changing

* fix format error
```
  158bf13f
11 1月, 2022 1 次提交

【PTen】Add dot and matmul grad kernel in pten (#38713) · be817719

由 zyfncg 提交于 1月 11, 2022

* refactor matmul directory in pten

* fix merge conflict

* add dot_grad kernel

* add dot_grad kernel in pten

* add matmul_grad kernel

* update the code

* delete useless code in fluid

* fix some bug of running matmul grad kernel

* fix merge conflict

* refactor some code

* refactor code

be817719

24 12月, 2021 1 次提交
- C
  
  add register general kernel marco (#38409) · fc0a50aa
  由 Chen Weihang 提交于 12月 23, 2021
  
  fc0a50aa
21 12月, 2021 1 次提交
- C
  [PTen] Rename cuda dir and context to gpu (#38296) · dc7597e3
  由 Chen Weihang 提交于 12月 21, 2021
```
* rename cuda to gpu

* revert CMake change

* resolve conflit

* rename other cuda to gpu

* poish details
```
  dc7597e3
16 12月, 2021 1 次提交

[PTen] Add register_ctx_kernel marco and move scale kernel (#38121) · af498677

由 Chen Weihang 提交于 12月 16, 2021

* add register_ctx_kernel and move scale kernel

* polish details by reviewer comment

* fix xpu compile failed

* fix cmake error

af498677

14 12月, 2021 2 次提交
- C
  [PTen] Polish kernel register marco design (#38078) · c9da845f
  由 Chen Weihang 提交于 12月 14, 2021
```
* polish register marco

* resolve compile failed

* revert needless change

* revert eager related change

* revert eager related change

* change register marco name

* polish deetails
```
  c9da845f
- Y
  
  remove KernelName (#38082) · 8198cad7
  由 YuanRisheng 提交于 12月 14, 2021
  
  8198cad7
13 12月, 2021 1 次提交

【PTen】Add variadic args kernel for PTen API to replace KernelContext (#37942) · b76ef045

由 zyfncg 提交于 12月 13, 2021

* add variadic_args kernel in pten

* merge develop code

* add variadic_args kernel and benchmark

* change dynamic_cast to static_cast for DeviceContext

* merge the code

* modify code format

* refactor variadic kernel function

b76ef045

10 12月, 2021 1 次提交
- Y
  [PTen]Add alias name for matmul and remove redundant member in kernel factory (#38011) · c5a7da4b
  由 YuanRisheng 提交于 12月 10, 2021
```
* add alias kernel name

* modify code as suggestions

* add alias name for matmul and remove redundant member in kernel factory
```
  c5a7da4b
09 12月, 2021 1 次提交

[PTen] Refine Kernel Registrar Writing (#37977) · b199ba85

由 Chen Weihang 提交于 12月 09, 2021

* refine the kernel register impl

* fix cmake and symbol error

* remove overload marco

* polish details

b199ba85

14 11月, 2021 1 次提交

[PTen]Reshape Kernel Refactor (#37164) · 895692e3

由 YuanRisheng 提交于 11月 14, 2021

* reshape kernel refactor

* fix compile bugs when run ci

* support xpu for reshape

* fix bugs when run unittest in kunlun ci

* fix compile bugs when run kunlun

* perfect code according to suggestion

895692e3

02 11月, 2021 2 次提交

support multiply inputs and outputs (#36851) · e4a134ac

由 chentianyu03 提交于 11月 02, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rcom compile faild

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar marco

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & nameing rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add merco for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflit

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key constrcut

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for cheking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unitests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unitests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflit

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop
Co-authored-by: NChen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflit with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move
Co-authored-by: NChen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop
Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

* polish kernel signature details

* fix a bug about offsets of the tensor, test=develop (#31)
Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* support multiply inputs and outputs

* rm attrs {}

* fix multioutputs bug

* merge develop

* remove unsed header file

* add missing & in const reference

* modify inputAt, outputAt to inputBetween, outputBetween
Co-authored-by: NChen Weihang <chenweihang@baidu.com>
Co-authored-by: Nzyfncg <1370305206@qq.com>
Co-authored-by: NYuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

e4a134ac

C
[PTen] Fix detail bugs and append registry macro (#36866) · 53b3f40f
由 Chen Weihang 提交于 11月 02, 2021
```
* fix several bugs

* fix elementwith override error
```
53b3f40f

01 11月, 2021 1 次提交

Paddle Tensor Operation Library initial implementation (#34425) · b9fdd3bc

由 Chen Weihang 提交于 11月 01, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rcom compile faild

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar marco

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & nameing rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add merco for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflit

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key constrcut

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for cheking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unitests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unitests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflit

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop
Co-authored-by: NChen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflit with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move
Co-authored-by: NChen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop
Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

* polish kernel signature details

* fix a bug about offsets of the tensor, test=develop (#31)
Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details
Co-authored-by: Nchentianyu03 <ctychentianyu@gmail.com>
Co-authored-by: Nzyfncg <1370305206@qq.com>
Co-authored-by: NYuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

b9fdd3bc

PaddlePaddle / Paddle 大约 1 年 前同步成功

PaddlePaddle / Paddle
大约 1 年前同步成功