Commits · 057cdb95bc8ac1f105c75ab8fd64785612796dfa · afeixing77 / Paddle

14 Feb, 2023 · 1 commit

decouple tensor_utils (#50264) · 057cdb95

Committed by engineer1109 on Feb 14, 2023

fix X

remove TensorCopy

codestyle

add fluid memory header

fix symbol

fix cmake

fix cmake

fix context

fix header

fix place

fix context

fix context

fix context

fix code

fix custom context

fix custom context

fix copy

fix data_transform

fix style

remove changes of custom

fix scalar

057cdb95

06 Feb, 2023 · 1 commit

phi move ReshapeToMatrix & GetValue (#50139) · d09962a1

Committed by engineer1109 on Feb 06, 2023
  
  d09962a1
30 Jan, 2023 · 1 commit

add phi tensor vector array api from fluid (#49885) · 094e3b8c

Committed by engineer1109 on Jan 30, 2023
```
replace all TensorFromVector & TensorToVector

AssignKernel async copy
```
  094e3b8c
12 Dec, 2022 · 1 commit

[PHI] OneDNN version of Copy (#48539) · d666c7df

Committed by Paulina Gacek on Dec 12, 2022

* OneDNN version of Copy, tranpose kernels adjusted

* style fixes in tranpose_grad

* redundant headers deleted

d666c7df

19 Sep, 2022 · 1 commit

[Sparse] Add infer meta (#46016) · 4b95f85e

Committed by zhangkaihuo on Sep 19, 2022
```
* sparse infer_meta
```
  4b95f85e
30 Aug, 2022 · 1 commit

fix memcpy_h2d bug related to cuda stream setting when allocate memory (#45450) · 10abdb8f

Committed by kangguangli on Aug 30, 2022
```
* fix memcpy_h2d bug related to cuda stream setting when allocate memory

* add header file

* fix compile error for cpu only
```
  10abdb8f
25 Aug, 2022 · 1 commit

Transfer memcpy d2h from fluid to phi (#45150) · 0d14e74a

Committed by kangguangli on Aug 25, 2022

* transfer memcpy_d2h from fluid to phi

* refine arg check and add comment

* fix cannot fallback to phi kernel

* fix gpu_context host alloc when tensor size = 0

* add kernel for std::vector<DenseTensor> args

* fix bugs in MemcpyD2HMultiIOKernel

* remove useless header file

* polish format

* fix typo

* add testcase for cudapinned place

* refine check condition in test

* polish error message

* polish error message

* remove header in fluid  directory

* merge memcpy_h2d and memcpy_d2h into one file, change register method to simplify implementation

* fix code style check

0d14e74a

09 Aug, 2022 · 1 commit

Fix copy bug for same src and dst Tensor (#44992) · 125e48c3

Committed by Ruibiao Chen on Aug 09, 2022
```
* Fix copy bug for same src and dst Tensor

* Improve code design

* Fix errors
```
  125e48c3
22 Jul, 2022 · 1 commit

[CustomDevice] register Copy for custom device (#44200) · 3b0aa75e

Committed by Aganlengzi on Jul 22, 2022

* [CustomDevice] register Copy for custom device

* [CustomDevice] register Copy for custom device

* [CustomDevice] register Copy for custom device

* merge and add uts

* merge and add uts

* fix for blocking and unittests coverage

3b0aa75e

18 Jul, 2022 · 1 commit

[Plugin] Fix Custom device in eager mode, test=develop (#43952) · 04e55582

Committed by Qi Li on Jul 18, 2022

* [Plugin] Fix Custom device in eager mode, test=develop

* update test case, test=develop

* update ut for coverage, test=develop

04e55582

29 Jun, 2022 · 1 commit

Change sparse Copy from Kernel to basic component utils (#43916) · 148fa05e

Committed by zhangkaihuo on Jun 29, 2022
  
  148fa05e
24 Jun, 2022 · 1 commit

[Phi]Change Copy from Kernel to basic component utils (#43622) · 2739bd73

Committed by YuanRisheng on Jun 24, 2022

* perfect copy

* deal with conflict

* deal with conflict

* fix compile bugs

* fix unittest bugs

* change code format

* deal with conflict

* modify code by review

* fix ce bugs

* fix ce bugs

* add lo

* perfect code format

* deal with conflicts

2739bd73

21 Jun, 2022 · 1 commit

resort .cu headers, set clang-format not sort include block and consider .cu... · 829723f2

Committed by Sing_chan on Jun 21, 2022
```
resort .cu headers, set clang-format not sort include block and consider .cu as main source file (#43633)
```
  829723f2
05 Jun, 2022 · 1 commit

【code format check upgrade】 step2：clang-format (#42840) · a3730dc8

Committed by Sing_chan on Jun 05, 2022
  
  a3730dc8
01 Apr, 2022 · 1 commit

[Eager] Support pinned (#41035) · f3270fc8

Committed by wanghuancoder on Apr 01, 2022

* support pinned, test=develop

* support async_write, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine,test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

f3270fc8

28 Mar, 2022 · 1 commit

[Phi] Fix assign kernel bug (#40927) · 822a2d1f

Committed by Chen Weihang on Mar 28, 2022

* fix assign kernel bug

* fix xpu kernel select error

* add cudn pinned place

* fix copy error

* fix infrt error

822a2d1f

21 Mar, 2022 · 1 commit

Refine to_tensor for eager mode and support gpu_pinned (#40535) · 45d1fb8d

Committed by 0x45f on Mar 21, 2022

* Refine to_tensor for eager mode

* support gpu_pinned

* refine code

* support gpu_pinned copy_to

* fix layer.__setattr__

* support to_tensor for gpu_pinned

* fix unit test

* refine gpu_pinned

* restore the original code

* add is_gup_pinned() and refine eager.Tensor._copy_to()

45d1fb8d

26 Feb, 2022 · 1 commit

[Pten] Refactor the copy kernel (#39731) · 9a7b9eda

Committed by zyfncg on Feb 26, 2022

* remove SetAllocationForOutputTenosr

* add place param for copy kernel

* recover SetAllocationForOutputTenosr

* polish code

* fix empty_dev api bug

* test=allcases

* test=allcases

* fix bug

* recover empty

* recover modify

9a7b9eda

22 Feb, 2022 · 2 commits

change Vector to std::vector and provide MixVector class as a helper … (#39559) · 728c0624

Committed by xiongkun on Feb 22, 2022

* change Vector to std::vector and provide MixVector class as a helper wrapper class

* solve the multi-gpu hang problem

* remove the duplicate template instantialize

* Copy vector to cpu

* add CopyToCPU

* xxx

* final version: fix the problem of all reduce

* remove mixvector dependence

* fix

* merge

* fix code

* fix by CI

728c0624

[PTen->Phi PR2] Rename PT_REGISTER macro to PD_REGISTER (#39790) · 4a338796

Committed by Chen Weihang on Feb 22, 2022
```
* unify register macro

* rename declare macro

* fix infrt error
```
4a338796

20 Feb, 2022 · 1 commit

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

Committed by Chen Weihang on Feb 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

17 Feb, 2022 · 1 commit

[PTen] Remove fluid device context deps (#39604) · d63ece1f

Committed by Chen Weihang on Feb 17, 2022
```
* remove fluid device context deps

* fix compile failde
```
  d63ece1f
15 Feb, 2022 · 1 commit

[PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404

Committed by Aurelius84 on Feb 15, 2022

* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePenScalarArray MakePtenScalar ResetHolderWithType

* pass the compile. the next step is remove VarType in Pten

* fix all and remove VarType from pten. success in linux. Next task is other platform

* fix conflict with develop

* fix compiled error

* Fix reset conversion

* fix conflict

* fix compiled problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compiled error

* fix npu

* compile npu sucessfully

* fix conflict

* fix conflict
Co-authored-by: xiongkun <xiongkun03@baidu.com>

7e7e9404

06 Feb, 2022 · 1 commit

[PTEN] Add Gpu context (#39305) · a821c4a9

Committed by Wilber on Feb 06, 2022
  
  a821c4a9
29 Jan, 2022 · 1 commit

[PTen] Tidy pten core headers (#39188) · dd990981

Committed by Chen Weihang on Jan 29, 2022

* open header for custom kernel

* add core utils

* tidy core code

* tify header

* tidy include

* tidy namespace

* resolve conflit

* fix unittest and coverage

* remove platform using

* resolve conflict

* resolve conflict

* fix digamma namespace error

* fix xpu full kernel error

* fix xpu full kernel error

* polish details

* add place for lib storage

dd990981

27 Jan, 2022 · 2 commits

implement AllocateFrom (#39280) · d89f246c

Committed by zhangkaihuo on Jan 27, 2022

d89f246c

Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3

Committed by zhangkaihuo on Jan 27, 2022

* fix bug:
1. atten: set the default value of attn_dropout_rate to None
2. ffn: add activation parameter

* for pure fp16

* Add a SparseCsrTensor

* remove unused functional

* remove const

* remove SetMemoberTensor

* remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows

* SparseCooTensor

* add SetMember

* merge upstream; add SetMember

* merge upstream

* merge upstream; add newline at end of file

* add newline at end of file

* remove newline at end of file

* remove newline at end of file

* stash

* user pten::framework::make_ddim

* user pten::framework::make_ddim

* merge upstream; use the latest mutable_data

* merge upstream; use the latest mutable_data

* return mutable dense tensor

a7edb3f3

24 Jan, 2022 · 1 commit

[Refactoring Tensor PR #5] replace storage with pten allocation (#39085) · a56e16a7

Committed by 石晓伟 on Jan 24, 2022

* updates callers, test=develop

* updates tensor, test=develop

* fixes errors, test=develop

* remove some dtypes, test=develop

* fix errors in the base storage modification, test=develop

* fixes a bug, test=develop

* fixes the bugs in push the whole, test=develop

* updates, test=develop

* update

* update, test=develop

* fixes the mac-py3 CI, test=develop

* remove the storage impl, test=develop

* updates some codes, test=develop

* update, test=develop

* updates pten allocation, test=develop

a56e16a7

20 Jan, 2022 · 1 commit

【PTen】Remove code of converting Tensor to DensoeTensor (#38926) · 8784ec65

Committed by zyfncg on Jan 20, 2022

* remove MakePtenTensor in BuildKernelContext

* fix a bug caused by storage

* remove WriteBackOutput in dynamic and static mode

* fix complie error of std::max

* fix complie error of std::max

* fix date_type bug

* fix memory alloc bug

* add some debug info

* fix compile problem

* fix problem of data_type check

* comment out some unreached code

8784ec65

18 Jan, 2022 · 1 commit

[Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3

Committed by Zhanlue Yang on Jan 18, 2022

* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Patched python level LoDTensor

* Merge Tensor into DenseTensor

* Fixed namespace issues,test=allcases

* Fixed merge issues

* Fixed inference issues

* Fixed NPU test issues

* Fixed merge issues

2052f1e3

17 Jan, 2022 · 1 commit

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Committed by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

c48a9ad5

31 Dec, 2021 · 1 commit

replace contextt to context (#38619) · f1366d58

Committed by Chen Weihang on Dec 31, 2021
  
  f1366d58
26 Dec, 2021 · 1 commit

[PTen] Move copy kernel impl (#38421) · 73819658

Committed by Chen Weihang on Dec 26, 2021

* add register general kernel marco

* move copy kernel impl

* revert needless change

* polish details

* fix xpu compil faild

* fix xpu compile failed

* polish format

73819658

21 Dec, 2021 · 1 commit

[PTen] Rename cuda dir and context to gpu (#38296) · dc7597e3

Committed by Chen Weihang on Dec 21, 2021
```
* rename cuda to gpu

* revert CMake change

* resolve conflit

* rename other cuda to gpu

* poish details
```
  dc7597e3
14 Dec, 2021 · 1 commit

[PTen] Polish kernel register marco design (#38078) · c9da845f

Committed by Chen Weihang on Dec 14, 2021

* polish register marco

* resolve compile failed

* revert needless change

* revert eager related change

* revert eager related change

* change register marco name

* polish deetails

c9da845f

09 Dec, 2021 · 1 commit

[PTen] Refine Kernel Registrar Writing (#37977) · b199ba85

Committed by Chen Weihang on Dec 09, 2021

* refine the kernel register impl

* fix cmake and symbol error

* remove overload marco

* polish details

b199ba85

19 Nov, 2021 · 1 commit

[PTen] Add copy_to and to method for Tensor (#37262) · 5a000900

Committed by Chen Weihang on Nov 18, 2021

* add copy_to and to method for Tensor

* polish msg format

* fix details error

* fix copy_to test compile failed

* fix typo

5a000900

17 Nov, 2021 · 1 commit

rename TensorBase interface data_type() to dtype() (#37257) · 1e9b3a3d

Committed by zyfncg on Nov 17, 2021
  
  1e9b3a3d
14 Nov, 2021 · 1 commit

[PTen]Reshape Kernel Refactor (#37164) · 895692e3

Committed by YuanRisheng on Nov 14, 2021

* reshape kernel refactor

* fix compile bugs when run ci

* support xpu for reshape

* fix bugs when run unittest in kunlun ci

* fix compile bugs when run kunlun

* perfect code according to suggestion

895692e3

01 Nov, 2021 · 1 commit

Paddle Tensor Operation Library initial implementation (#34425) · b9fdd3bc

Committed by Chen Weihang on Nov 01, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rcom compile faild

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar marco

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & nameing rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add merco for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflit

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key constrcut

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for cheking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unitests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unitests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflit

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflit with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

* polish kernel signature details

* fix a bug about offsets of the tensor, test=develop (#31)
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details
Co-authored-by: chentianyu03 <ctychentianyu@gmail.com>
Co-authored-by: zyfncg <1370305206@qq.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

b9fdd3bc

afeixing77 / Paddle is in sync with its fork source project