Commits · 3b0aa75efa4009d8f03301aaa55341b4bd5fcf22 · csdn_franckjun / Paddle

22 Jul, 2022 1 commit

[CustomDevice] register Copy for custom device (#44200) · 3b0aa75e

Committed by Aganlengzi on Jul 22, 2022

* [CustomDevice] register Copy for custom device

* [CustomDevice] register Copy for custom device

* [CustomDevice] register Copy for custom device

* merge and add uts

* merge and add uts

* fix for blocking and unittests coverage

3b0aa75e

18 Jul, 2022 1 commit

[Plugin] Fix Custom device in eager mode, test=develop (#43952) · 04e55582

Committed by Qi Li on Jul 18, 2022

* [Plugin] Fix Custom device in eager mode, test=develop

* update test case, test=develop

* update ut for coverage, test=develop

04e55582

29 Jun, 2022 1 commit

Change sparse Copy from Kernel to basic component utils (#43916) · 148fa05e

Committed by zhangkaihuo on Jun 29, 2022

148fa05e

24 Jun, 2022 1 commit

[Phi]Change Copy from Kernel to basic component utils (#43622) · 2739bd73

Committed by YuanRisheng on Jun 24, 2022

* perfect copy

* deal with conflict

* deal with conflict

* fix compile bugs

* fix unittest bugs

* change code format

* deal with conflict

* modify code by review

* fix ce bugs

* fix ce bugs

* add lo

* perfect code format

* deal with conflicts

2739bd73

21 Jun, 2022 1 commit

resort .cu headers, set clang-format not sort include block and consider .cu... · 829723f2

Committed by Sing_chan on Jun 21, 2022

resort .cu headers, set clang-format not sort include block and consider .cu as main source file (#43633)

829723f2

05 Jun, 2022 1 commit

【code format check upgrade】 step2：clang-format (#42840) · a3730dc8

Committed by Sing_chan on Jun 05, 2022

a3730dc8

01 Apr, 2022 1 commit

[Eager] Support pinned (#41035) · f3270fc8

Committed by wanghuancoder on Apr 01, 2022

* support pinned, test=develop

* support async_write, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine,test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

f3270fc8

28 Mar, 2022 1 commit

[Phi] Fix assign kernel bug (#40927) · 822a2d1f

Committed by Chen Weihang on Mar 28, 2022

* fix assign kernel bug

* fix xpu kernel select error

* add cuda pinned place

* fix copy error

* fix infrt error

822a2d1f

21 Mar, 2022 1 commit

Refine to_tensor for eager mode and support gpu_pinned (#40535) · 45d1fb8d

Committed by 0x45f on Mar 21, 2022

* Refine to_tensor for eager mode

* support gpu_pinned

* refine code

* support gpu_pinned copy_to

* fix layer.__setattr__

* support to_tensor for gpu_pinned

* fix unit test

* refine gpu_pinned

* restore the original code

* add is_gup_pinned() and refine eager.Tensor._copy_to()

45d1fb8d

26 Feb, 2022 1 commit

[Pten] Refactor the copy kernel (#39731) · 9a7b9eda

Committed by zyfncg on Feb 26, 2022

* remove SetAllocationForOutputTenosr

* add place param for copy kernel

* recover SetAllocationForOutputTenosr

* polish code

* fix empty_dev api bug

* test=allcases

* test=allcases

* fix bug

* recover empty

* recover modify

9a7b9eda

22 Feb, 2022 2 commits

change Vector to std::vector and provide MixVector class as a helper … (#39559) · 728c0624

Committed by xiongkun on Feb 22, 2022

* change Vector to std::vector and provide MixVector class as a helper wrapper class

* solve the multi-gpu hang problem

* remove the duplicate template instantiation

* Copy vector to cpu

* add CopyToCPU

* xxx

* final version: fix the problem of all reduce

* remove mixvector dependence

* fix

* merge

* fix code

* fix by CI

728c0624

[PTen->Phi PR2] Rename PT_REGISTER macro to PD_REGISTER (#39790) · 4a338796

Committed by Chen Weihang on Feb 22, 2022

* unify register macro

* rename declare macro

* fix infrt error

4a338796

20 Feb, 2022 1 commit

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

Committed by Chen Weihang on Feb 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

17 Feb, 2022 1 commit

[PTen] Remove fluid device context deps (#39604) · d63ece1f

Committed by Chen Weihang on Feb 17, 2022

* remove fluid device context deps

* fix compile failed

d63ece1f

15 Feb, 2022 1 commit

[PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404

Committed by Aurelius84 on Feb 15, 2022

* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePenScalarArray MakePtenScalar ResetHolderWithType

* pass the compile. the next step is remove VarType in Pten

* fix all and remove VarType from pten. success in linux. Next task is other platform

* fix conflict with develop

* fix compiled error

* Fix reset conversion

* fix conflict

* fix compiled problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compiled error

* fix npu

* compile npu successfully

* fix conflict

* fix conflict

Co-authored-by: xiongkun <xiongkun03@baidu.com>

7e7e9404

06 Feb, 2022 1 commit

[PTEN] Add Gpu context (#39305) · a821c4a9

Committed by Wilber on Feb 06, 2022

a821c4a9

29 Jan, 2022 1 commit

[PTen] Tidy pten core headers (#39188) · dd990981

Committed by Chen Weihang on Jan 29, 2022

* open header for custom kernel

* add core utils

* tidy core code

* tidy header

* tidy include

* tidy namespace

* resolve conflict

* fix unittest and coverage

* remove platform using

* resolve conflict

* resolve conflict

* fix digamma namespace error

* fix xpu full kernel error

* fix xpu full kernel error

* polish details

* add place for lib storage

dd990981

27 Jan, 2022 2 commits

implement AllocateFrom (#39280) · d89f246c

Committed by zhangkaihuo on Jan 27, 2022

d89f246c

Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3

Committed by zhangkaihuo on Jan 27, 2022

* fix bug:
1. atten: set the default value of attn_dropout_rate to None
2. ffn: add activation parameter

* for pure fp16

* Add a SparseCsrTensor

* remove unused functional

* remove const

* remove SetMemoberTensor

* remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows

* SparseCooTensor

* add SetMember

* merge upstream; add SetMember

* merge upstream

* merge upstream; add newline at end of file

* add newline at end of file

* remove newline at end of file

* remove newline at end of file

* stash

* use pten::framework::make_ddim

* use pten::framework::make_ddim

* merge upstream; use the latest mutable_data

* merge upstream; use the latest mutable_data

* return mutable dense tensor

a7edb3f3

24 Jan, 2022 1 commit

[Refactoring Tensor PR #5] replace storage with pten allocation (#39085) · a56e16a7

Committed by 石晓伟 on Jan 24, 2022

* updates callers, test=develop

* updates tensor, test=develop

* fixes errors, test=develop

* remove some dtypes, test=develop

* fix errors in the base storage modification, test=develop

* fixes a bug, test=develop

* fixes the bugs in push the whole, test=develop

* updates, test=develop

* update

* update, test=develop

* fixes the mac-py3 CI, test=develop

* remove the storage impl, test=develop

* updates some codes, test=develop

* update, test=develop

* updates pten allocation, test=develop

a56e16a7

20 Jan, 2022 1 commit

【PTen】Remove code of converting Tensor to DenseTensor (#38926) · 8784ec65

Committed by zyfncg on Jan 20, 2022

* remove MakePtenTensor in BuildKernelContext

* fix a bug caused by storage

* remove WriteBackOutput in dynamic and static mode

* fix compile error of std::max

* fix compile error of std::max

* fix data_type bug

* fix memory alloc bug

* add some debug info

* fix compile problem

* fix problem of data_type check

* comment out some unreached code

8784ec65

18 Jan, 2022 1 commit

[Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3

Committed by Zhanlue Yang on Jan 18, 2022

* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Patched python level LoDTensor

* Merge Tensor into DenseTensor

* Fixed namespace issues,test=allcases

* Fixed merge issues

* Fixed inference issues

* Fixed NPU test issues

* Fixed merge issues

2052f1e3

17 Jan, 2022 1 commit

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Committed by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

c48a9ad5

31 Dec, 2021 1 commit

replace contextt to context (#38619) · f1366d58

Committed by Chen Weihang on Dec 31, 2021

f1366d58

26 Dec, 2021 1 commit

[PTen] Move copy kernel impl (#38421) · 73819658

Committed by Chen Weihang on Dec 26, 2021

* add register general kernel macro

* move copy kernel impl

* revert needless change

* polish details

* fix xpu compile failed

* fix xpu compile failed

* polish format

73819658

21 Dec, 2021 1 commit

[PTen] Rename cuda dir and context to gpu (#38296) · dc7597e3

Committed by Chen Weihang on Dec 21, 2021

* rename cuda to gpu

* revert CMake change

* resolve conflict

* rename other cuda to gpu

* polish details

dc7597e3

14 Dec, 2021 1 commit

[PTen] Polish kernel register macro design (#38078) · c9da845f

Committed by Chen Weihang on Dec 14, 2021

* polish register macro

* resolve compile failed

* revert needless change

* revert eager related change

* revert eager related change

* change register macro name

* polish details

c9da845f

09 Dec, 2021 1 commit

[PTen] Refine Kernel Registrar Writing (#37977) · b199ba85

Committed by Chen Weihang on Dec 09, 2021

* refine the kernel register impl

* fix cmake and symbol error

* remove overload macro

* polish details

b199ba85

19 Nov, 2021 1 commit

[PTen] Add copy_to and to method for Tensor (#37262) · 5a000900

Committed by Chen Weihang on Nov 18, 2021

* add copy_to and to method for Tensor

* polish msg format

* fix details error

* fix copy_to test compile failed

* fix typo

5a000900

17 Nov, 2021 1 commit

rename TensorBase interface data_type() to dtype() (#37257) · 1e9b3a3d

Committed by zyfncg on Nov 17, 2021

1e9b3a3d

14 Nov, 2021 1 commit

[PTen]Reshape Kernel Refactor (#37164) · 895692e3

Committed by YuanRisheng on Nov 14, 2021

* reshape kernel refactor

* fix compile bugs when run ci

* support xpu for reshape

* fix bugs when run unittest in kunlun ci

* fix compile bugs when run kunlun

* perfect code according to suggestion

895692e3

01 Nov, 2021 1 commit

Paddle Tensor Operation Library initial implementation (#34425) · b9fdd3bc

Committed by Chen Weihang on Nov 01, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rocm compile failed

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar macro

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & naming rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add macro for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflict

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key construct

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for checking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unittests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unittests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflict

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop

Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflict with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move

Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop

Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

* polish kernel signature details

* fix a bug about offsets of the tensor, test=develop (#31)

Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

Co-authored-by: chentianyu03 <ctychentianyu@gmail.com>
Co-authored-by: zyfncg <1370305206@qq.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

b9fdd3bc

csdn_franckjun / Paddle is up to date with the fork source project
