Commits · 1fa6900ee9f14e72b36e86e34ea73cfff9606101 · BaiXuePrincess / Paddle

29 Dec, 2021 1 commit

add argsort/scatter for kunlun (#38345) · 4643baa7

Committed by TTerror on Dec 29, 2021

* add argsort/scatter for kunlun

* update test_scatter

* update xpu.cmake

* update xpu.cmake

* fix scatter

4643baa7

27 Dec, 2021 1 commit
- C
  
  remove npu related impl (#38428) · f1d56b77
  Committed by Chen Weihang on Dec 26, 2021
  
  f1d56b77
26 Dec, 2021 1 commit
- C
  
  auto parse kernel deps by include (#38438) · e5c7ca48
  Committed by Chen Weihang on Dec 26, 2021
  
  e5c7ca48
24 Dec, 2021 2 commits
- C
  
  add register general kernel macro (#38409) · fc0a50aa
  Committed by Chen Weihang on Dec 23, 2021
  
  fc0a50aa
- Z
  
  Add new API cholesky_solve (#38167) · 39f7c41f
  Committed by zhiboniu on Dec 24, 2021
  
  39f7c41f
22 Dec, 2021 2 commits
- C
  [PTen] Add cmake function for kernels (#38311) · e6310dbd
  Committed by Chen Weihang on Dec 22, 2021
```
* add pten kernel cmake

* add pten kernel cmake function

* fix compile error

* add enforce include for full kernel

* fix compile failed

* change cuda to gpu

* fix cmake function error
```
  e6310dbd
- 王
  
  [infrt] add tensorrt op teller pass. test=develop (#38304) · 44112817
  Committed by 王明冬 on Dec 22, 2021
  
  44112817
20 Dec, 2021 1 commit
- F
  
  [MLU]add mlu backend (#38207) · 76514a1f
  Committed by fwenguang on Dec 20, 2021
  
  76514a1f
18 Dec, 2021 1 commit
- 王
  
  [infrt] add unit test script for infrt. test=develop (#38232) · a3bd6fc0
  Committed by 王明冬 on Dec 18, 2021
  
  a3bd6fc0
16 Dec, 2021 2 commits
- S
  
  block warning: overriding D9025 (#38034) · 672dba1b
  Committed by Sing_chan on Dec 16, 2021
  
  672dba1b
- C
  
  fix header match error (#38175) · 4ef59f08
  Committed by Chen Weihang on Dec 15, 2021
  
  4ef59f08
15 Dec, 2021 1 commit
- 王
  
  [infrt] fix the infrt compile error. test=develop (#38124) · 660d9ff2
  Committed by 王明冬 on Dec 15, 2021
  
  660d9ff2
07 Dec, 2021 2 commits

introduce INF-RT (#37669) · 70dea138

Committed by Yan Chunwei on Dec 07, 2021

* add infrt code

refined with Paddle's code style.

* rename CinnRtConfig to InfRtConfig

* rename CinnRt to InfRt of some code

* rename CINNRT to INFRT

* remove unnecessary code

* replace CINN to INFRT in the source code

* replace all "cinn" in code to "infrt"

* remove some const_cast

70dea138

block MASM : warning A4018 when building cryptopp in windows with ninja (#37890) · ca6ff1f6
Committed by Sing_chan on Dec 07, 2021

ca6ff1f6

06 Dec, 2021 1 commit

Update CINN tag (#37870) · 3e33ef5a

Committed by Huihuang Zheng on Dec 06, 2021

1. Modify git tag for CINN
2. Support compile option "-DWITH_CINN=ON, -DWITH_TESTING=OFF"

3e33ef5a

03 Dec, 2021 1 commit
- J
  
  add ipu_backend (#36322) · a3b3ec68
  Committed by jianghaicheng on Dec 03, 2021
  
  a3b3ec68
01 Dec, 2021 1 commit
- W
  add xpu_base_url parameter (#37712) · caff6668
  Committed by wangye707 on Dec 01, 2021
```
* add a parameter for specifying the xpu url
```
  caff6668
29 Nov, 2021 2 commits
- T
  add expand_v2/expand_as_v2 for kunlun (#37592) · dae4e7f2
  Committed by TTerror on Nov 29, 2021
```
* add expand_v2/expand_as_v2 for kunlun

* update expand_as_v2

* update expand_as_v2

* support float16/bool

* update xpu.cmake
```
  dae4e7f2
- S
  
  unity variable name in third_party cmake files (#37590) · ba6f645e
  Committed by Sing_chan on Nov 29, 2021
  
  ba6f645e
26 Nov, 2021 1 commit
- S
  block xxhash warning of c4711 (#37442) · 6b7c0616
  Committed by Sing_chan on Nov 26, 2021
```
* block xxhash warning of c4711

* modify according to zhouwei's comment

* fix syntax error
```
  6b7c0616
25 Nov, 2021 2 commits
- S
  make third_party's cmake get source code directly 2 (#37372) · c520da39
  Committed by Sing_chan on Nov 25, 2021
```
* make third_party's cmake get source code directly 2

* modify according to zhouwei's comment

* eager needs mkldnn to compile
```
  c520da39
- S
  block unknown option /arch:SSE3 (#37439) · adb54eb0
  Committed by Sing_chan on Nov 25, 2021
```
* block unknown option /arch:SSE3

* modify according to zhouwei's comment
```
  adb54eb0
23 Nov, 2021 1 commit

[PTen] Adapt to inference api dir for pten (#37415) · 73f4601d

Committed by Chen Weihang on Nov 22, 2021

* adapt to inference api dir for pten

* fix conflict with develop

* fix test_egr_ds_eager_tensor compile failed

73f4601d

19 Nov, 2021 2 commits

Fix CI bug caused by type of TensorMeta (#37373) · d29cc7b4

Committed by zyfncg on Nov 19, 2021

* rename TensorBase interface data_type() to dtype()

* rename type to dtype of TensorMeta

* merge the code

* merge the code

* fix the problem when merge conflict

* fix bug of ci caused by type of tensor_meta

* changes cmake to clear cache

d29cc7b4

make third_party's cmake get source code directly (#37332) · da5fb1d4
Committed by Sing_chan on Nov 19, 2021

da5fb1d4

17 Nov, 2021 1 commit

Upgrade oneDNN to v2.4.4 (#36226) · d08753df

Committed by piotrekobiIntel on Nov 17, 2021

* upgrade oneDNN to v2.4-rc

* Removed failing test

* Revert "Removed failing test"

This reverts commit 60e70e717fac2c86b7beb24dfa1343a5804ea455.

* Remove most tests for debugging purposes

* Update hash to oneDNN 2.4

* Revert test change

* Update oneDNN to 2.4.2

* Update oneDNN to 2.4.3

* Change oneDNN version to 2.3 for Jenkins test

* Revert "Change oneDNN version to 2.3 for Jenkins test"

This reverts commit 0b176defc3b63f65dd0ba85873a018534f287000.

* Update oneDNN to 2.4.4

* Change version of oneDNN to 2.3 for new Jenkins test

* Revert "Change version of oneDNN to 2.3 for new Jenkins test"

This reverts commit e005a0f78f2b41cdcf4d7de3a21df7f910b78268.

d08753df

15 Nov, 2021 1 commit

[Pten] Refactor the implementation of custom operator (#37122) · 1e598f1a

Committed by Chen Weihang on Nov 15, 2021

* move extension into pten [no-verify]

* append tensor methods by ext_tensor [no-verify]

* append other tensor methods [no-verify]

* ext related files tidy [no-verify]

* include relation tidy [no-verify]

* add pten tensor test [no-verify]

* replace tensor in custom op & compile success

* refine tensor constructor for unittest

* custom relu jit run success

* fix all custom op unittests

* add inference cmake adapt [no-verify]

* fix failed unittests

* fix windows failed unittests

* try to fix kunlun and inference failed

* fix test_elementwise_api error

* try to fix win compile failed

* fix kunlun fp16 type error

* remove useless handle error macro

* add custom linear op test

* fix compile failed & add win symbols

* fix non pten kernel cast failed

* add dll decl for api

* polish several details

* polish details by review comment

* add dll_decl for register

1e598f1a

11 Nov, 2021 1 commit
- Z
  Add macro required by CINN. (#37066) · 9a9345fa
  Committed by Zhen Wang on Nov 11, 2021
```
* Add macro required by CINN.

* Remove CMAKE_BUILD_TYPE from cinn.cmake.
```
  9a9345fa
09 Nov, 2021 2 commits
- S
  
  fix bugs when build in windows with_inference_api_test=on (#36973) · fd15477f
  Committed by Sing_chan on Nov 09, 2021
  
  fd15477f
- T
  
  add gather_nd/tile op for kunlun (#37029) · 819b9589
  Committed by TTerror on Nov 09, 2021
  
  819b9589
06 Nov, 2021 1 commit
- Z
  Update the batch size used in test_resnet50_with_cinn.py. (#37013) · 68c3e2cb
  Committed by Zhen Wang on Nov 06, 2021
```
* Update the batch size used in test_resnet50_with_cinn.py.
* Enable more debug info.
```
  68c3e2cb
04 Nov, 2021 1 commit
- Y
  
  [fleet_executor] Framework for message and manager part. (#36966) · be4eaba0
  Committed by Yuang Liu on Nov 04, 2021
  
  be4eaba0
02 Nov, 2021 1 commit
- Q
  support different precision in kunlun (#36836) · e512aa9a
  Committed by QingshuChen on Nov 02, 2021
```
* support different precision in kunlun

* minor

* minor

* minor
```
  e512aa9a
01 Nov, 2021 3 commits

update cinn commit id tag to the newest one for fix some bugs (#36890) · fe81306c
Committed by CtfGo on Nov 01, 2021
```
update cinn commit id tag to the newest one for fix some bugs
```
fe81306c

Paddle Tensor Operation Library initial implementation (#34425) · b9fdd3bc

Committed by Chen Weihang on Nov 01, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rocm compile failed

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar macro

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & naming rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add macro for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflict

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key construct

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for checking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unittests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unittests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflict

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflict with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

* polish kernel signature details

* fix a bug about offsets of the tensor, test=develop (#31)
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details
Co-authored-by: chentianyu03 <ctychentianyu@gmail.com>
Co-authored-by: zyfncg <1370305206@qq.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

b9fdd3bc

change boost url, which block warning of Unknown compiler version (#36857) · 3c0a68ce
Committed by Sing_chan on Nov 01, 2021

3c0a68ce

28 Oct, 2021 1 commit

Fix several bugs for enabling Paddle to train with CINN. (#36739) · c93331c5

Committed by Zhen Wang on Oct 28, 2021

* Update the content of `test_parallel_executor_run_cinn.py`.

* Fix some bugs in the topological sort and `CreateNewSubGraph`.

* Update the CINN commit id used by Paddle.

* Update the unit test to `add+relu`.

* Update according to reviewers' suggestion.

c93331c5

27 Oct, 2021 1 commit

add paddle.linalg.eigvalsh API (#35615) · 9f9ed3ae

Committed by huangjun12 on Oct 27, 2021

* add eigvalsh with is_test

* add eigvalsh op

* fix backward bug

* forward and backward, float and complex, unittest

* remove eigvalsh_helper.h

* remove changes of cusolver.h

* fix unittest

* fix unittest bug

* update code following eigh

* fix test

* update lapack

* pull develop

* update functor

* fix unittest bug

* fix details

* add tensor_method_func

* fix notes

9f9ed3ae

25 Oct, 2021 2 commits

add some ops to train ssd on kunlun (#36407) · 50778ad6

Committed by TTerror on Oct 25, 2021

* add some ops to train ssd on kunlun

* add some ops to train ssd on kunlun

* add some ops to train ssd on kunlun

* update cast op unittest

* update cast op unittest

* update cast op unittest

* update xpu cmake

* update cast unittest

50778ad6

add op: fused_feedforward(forward) (#35843) · b18cbfb2

Committed by zhangkaihuo on Oct 25, 2021

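This PR contains only the forward code for fused_feedforward.

Related kernel implementations: fused_dropout_act_bias, fused_residual_dropout_bias, fused_layernorm_residual_dropout_bias

fused_feedforward is a fused operator. It fuses and wraps the operators of the transformer model's feed-forward layer so that the front end exposes a single interface, and the fusion reduces part of the memory-access and kernel-launch time, which improves performance.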
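
For orientation, the computation that a fused feed-forward operator replaces can be written as an unfused reference in plain Python. This is only a sketch under stated assumptions, not the code from this PR: it assumes a post-layer-norm layout, a ReLU activation, dropout disabled (rate 0), and a layer norm without learnable scale or shift. The function names (`feed_forward_reference` and its helpers) are hypothetical and are not part of Paddle's API.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_bias(m, bias):
    """Add a bias vector to every row of a matrix."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


def relu(m):
    """Element-wise ReLU (assumed activation, the real op may use another)."""
    return [[max(0.0, v) for v in row] for row in m]


def layer_norm(m, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no scale or shift)."""
    out = []
    for row in m:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def feed_forward_reference(x, w1, b1, w2, b2):
    """Unfused feed-forward layer; each step is a separate pass over the data."""
    # Step 1: first linear layer, bias, activation (dropout omitted, rate 0).
    hidden = relu(add_bias(matmul(x, w1), b1))
    # Step 2: second linear layer and bias (dropout omitted again).
    projected = add_bias(matmul(hidden, w2), b2)
    # Step 3: residual connection followed by layer normalization.
    residual = [[a + b for a, b in zip(rx, rp)] for rx, rp in zip(x, projected)]
    return layer_norm(residual)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(feed_forward_reference([[1.0, 2.0]], identity, [0.0, 0.0], identity, [0.0, 0.0]))
```

Every helper above reads and writes a full intermediate matrix. A fused kernel can combine steps such as bias, activation, and dropout into one pass, which is where the savings in memory access and kernel launches described above would come from.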

b18cbfb2

BaiXuePrincess / Paddle is in sync with the fork's source project