Commits · bc3ca6786668c732c451d1742e1aee08fac3cb1f · PaddlePaddle / Paddle

17 Feb, 2022 1 commit

add softplus op for kunlun2. test=kunlun (#39555) · 9f99b591

Committed by houj04 on Feb 17, 2022

* add softplus op for kunlun2. test=kunlun

* add softplus op for kunlun2. test=kunlun

* fix code style. test=kunlun

* fix code style. test=kunlun

* add more test cases. test=kunlun

9f99b591

15 Feb, 2022 3 commits

[PluggableDevice] Add custom runtime support (#38740) · 3e7825f3

Committed by ronnywang on Feb 15, 2022

* [CustomRuntime] Add DeviceManager

* [CustomRuntime] Add DeviceInterface

* [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager

* [CustomRuntime] Add plug-in device

* [CustomRuntime] Memory module support PluggableDevice

* [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option

* update

* [API] update API doc based on comments, test=develop
Co-authored-by: qili93 <qili93@qq.com>

3e7825f3

fix bug when use extern_openblas and generator is ninja (#39428) · f73f5b06
Committed by Sing_chan on Feb 15, 2022
```
* fix bug when use extern_openblas and generator is ninja

* modify according to zhouwei's comment
```
f73f5b06

new way of test case, 2nd, *test=kunlun (#39478) · 4745234f

Committed by z8hanghuan on Feb 15, 2022

* new way of test case, 2nd, *test=kunlun

* new way of test case, 2nd, *test=kunlun

* new way of test case, 2nd, *test=kunlun

4745234f

14 Feb, 2022 1 commit
- [ROCm] fix missing dcu kernel in operator.cmake, test=develop (#39480) · 55da9344
  Committed by Qi Li on Feb 14, 2022
  
  55da9344
11 Feb, 2022 1 commit
- get build time (#39368) · 72ad280b
  Committed by zhangchunle on Feb 11, 2022
  
  72ad280b
02 Feb, 2022 1 commit

[PTen] Remove kernel alias name (#39321) · 5dc20c27

Committed by Chen Weihang on Feb 02, 2022

* remove kernel alias name

* fix depreacted error

* fix deprecated failed

* fix mean error

* resolve conflict

* fix windows failed

5dc20c27

30 Jan, 2022 1 commit
- feat(cncl_mlu): add cncl dev for mlu distributed backend (#39294) · d28f6f7b
  Committed by mhhhh1 on Jan 30, 2022
  
  d28f6f7b
29 Jan, 2022 2 commits

Add xpu2 compiler (#37254) · 92da5055

Committed by Liu-xiandong on Jan 29, 2022

* Add XPU compiler for paddle, test=develop

* clean code

* clean useless code

* clean useless code

* clean useless code

* test

* add include path

* use clang compiler

* xpu2.cmake

* XPU2 compiler passed

* update

* update after pten

* combination the WITH_XPU and WITH_XPU2

* update the fuse operation in WITH_XPU and WITH_XPU2

* update

* update

* update

* fix the merge error

* update

* update the code

* update the code

* add run_kp_kernel flag

* update

* update

* fix prepared type_ bug

* clean and update the code

* reset the kernel_primitives

* update

* clean the code

* delete useless comment

* fix the bug in WITH_XPU

* update

* update

* modify the abi

* delete some useless code

* Parameter automation in xpu compilation

* Parameter automation in xpu compilation

* delete kps in cmake

* delete useless comment

* clean the code

* clean the code

92da5055

Update register_kernels and kernel_library function in pten.cmake (#39259) · 6b3a6a9f
Committed by Jack Zhou on Jan 29, 2022

6b3a6a9f

28 Jan, 2022 1 commit
- [PTen]Refactor scale kernel that has selected_rows input (#39278) · abfc2fe9
  Committed by YuanRisheng on Jan 28, 2022
```
* refactor scale kernel that its input is selected_rows

* complement upload file
```
  abfc2fe9
27 Jan, 2022 1 commit
- compile for afs api (#39113) · 4748486e
  Committed by Thunderbrook on Jan 27, 2022
```
* compile for afs api

* with pslib
```
  4748486e
26 Jan, 2022 3 commits

[pten] remove deprecated fluid op kernel for pten (#38842) · 3ab9aef1

Committed by Leo Chen on Jan 26, 2022

* update cmake file to remove fluid kernel

* add pten declaration.h to where pybind.h used

* fix sync_bn and tensorrt_engine

* refine detection_library

* fix interpreter_core

* support eager legacy

* fit eager legacy for pten

* fall back to cpu if not found kernel

* fix compile problem

* fix compile problem

* refine fallback logic

* fit operator.run()

* fix xpu compile

* fit for new_exec

* add REGISTER_OP_WITHOUT_GRADIENT

* un-cache pt_kernel_context

* fix compile

* fix cudnn

* fix compiling with on_infer

* fix mkldnn

* fix isfinite_v2

* fix xpu problem

* fix op_device

* refine fallback for xpu

* fix xpu compile

* merge develop

* refine code format

* fix compile

* fix compile

* add data_transfer

* fix PreparePtenData

* fix cpu context

* merge develop

* fix compile

* fix error device context

* fix xpu

* fix dev_ctx

3ab9aef1

[IPU] sync misc changes 02 (#39189) · 5df78366

Committed by Allen Guo on Jan 26, 2022

* sync misc changes

* apply comments 01

* fix compile error

* remove is_ipu_place check

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* sync changes

* restore cmake

* update ir cmake and setup.py

* update inference_lib cmake

* restore for split PR
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

5df78366

[PTen] Unify InferMeta(Shape) Function in pten and fluid op (#38976) · b75507d3

Committed by Chen Weihang on Jan 26, 2022

* infermeta context init design

* support infermeta called in fluid op

* add hasattr and attr methods

* add dygraah GetVarPtrs support

* rename arg_map_context to arg_map_utils

* add registry for arg map func

* resolve conflit

* refactor op utils design

* polish meta config

* fix details

* remove hasattr method

* resolve conflit

* revert cmake order change

* revert some change

* change init pos

* fix compile faileed

* fix typo

* fix inference failed

* fix windows ccompile failed

* polish format
Co-authored-by: Wang Huan <wanghuan29@baidu.com>

b75507d3

25 Jan, 2022 1 commit

[PTen] Migrate string tinyformat errors and part of enforce into pten (#39051) · 6ca49164

Committed by xiongkun on Jan 25, 2022

* transfer: string tinyformat errors and part of enforce into pten

* remove comment

* fix by code review

* assert is not compile in -DNDEBUG

* add string as dependences of paddle_inference

6ca49164

24 Jan, 2022 1 commit

support sparse of adam, *test=kunlun (#38483) · e106901e

Committed by z8hanghuan on Jan 24, 2022

* support sparse of adam, *test=kunlun

* add pre-commit-config.yaml

* support sparse of adam in KL2,*test=kunlun

* support sparse of adam in KL2, *test=kunlun

* modify xpu.cmake, *test=kunlun

* support sparse of adam, rm some wait, *test=kunlun

* support sparse of adam, rm some wait, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

e106901e

22 Jan, 2022 1 commit
- [PTen] Auto generate include headers (#39123) · e92b3040
  Committed by Chen Weihang on Jan 22, 2022
```
* auto gen include headers

* move to pten.cmake
```
  e92b3040
21 Jan, 2022 1 commit

[PTen]Separate origin Kernel and add Kernel for C++ API (#39002) · a0f586bc

Committed by YuanRisheng on Jan 21, 2022

* add kernel for c++ api

* fix compile bugs

* fix kunlun compile bugs

* perfect cmake

* fix compile bugs when run ci-inference

* fix compile bugs

* add non-raw kernel for fluid op

* fix compile bugs

* fix compile bugs

* fix unit test bug

a0f586bc

20 Jan, 2022 1 commit
- [Pten] Migrate bfloat16/float16/complex from paddle::platform into pten::common (#39044) · f1143f0c
  Committed by Aurelius84 on Jan 20, 2022
```
* Migrate bfloat16/float16/complex from platform into pten::common

* fix typo

* fix code style
```
  f1143f0c
14 Jan, 2022 2 commits
- [infrt] update the version of llvm. test=develop (#38843) · 0de8a805
  Committed by 王明冬 on Jan 14, 2022
  
  0de8a805
- fix bug of -DPADDLE_WITH_SSE3 not set when WITH_AVX AND AVX_FOUND even SSE3_FOUND (#38931) · 9e0686ed
  Committed by Sing_chan on Jan 14, 2022
  
  9e0686ed
13 Jan, 2022 1 commit
- [PTen] Rename kernel register marco (#38861) · 158bf13f
  Committed by Chen Weihang on Jan 13, 2022
```
* rename register marco

* fix error changing

* fix format error
```
  158bf13f
12 Jan, 2022 1 commit
- pscore perfermance optimization (#38582) · f1201482
  Committed by zhaocaibei123 on Jan 12, 2022
  
  f1201482
11 Jan, 2022 2 commits

【PTen】Add dot and matmul grad kernel in pten (#38713) · be817719

Committed by zyfncg on Jan 11, 2022

* refactor matmul directory in pten

* fix merge conflict

* add dot_grad kernel

* add dot_grad kernel in pten

* add matmul_grad kernel

* update the code

* delete useless code in fluid

* fix some bug of running matmul grad kernel

* fix merge conflict

* refactor some code

* refactor code

be817719

support vs2019 compilation in windows (#38719) · 0ad363b1
Committed by Sing_chan on Jan 11, 2022
```
* support vs2019 compilation in windows

* not modify pow_op's original compute logic
```
0ad363b1

10 Jan, 2022 1 commit

Add gpu kernel for new api : linalg.lstsq (#38621) · 405103d8

Committed by Haohongxiang on Jan 10, 2022

* add lstsq gpu kernel

* update

* add docs_en

* modify ut

* fix bugs

* modify example in docs_en

* remove lstsq_op.cu from ROCM cmake

* modify docs_en

* modify docs_en

* modify docs_en

* remove unneccessary TensorCopy

405103d8

05 Jan, 2022 1 commit
- update masked_select_op for kunlun (#38678) · 40078103
  Committed by TTerror on Jan 05, 2022
  
  40078103
30 Dec, 2021 2 commits
- add OP lu forward (#38559) · 4e21457d
  Committed by zhiboniu on Dec 30, 2021
```
LGTM
```
  4e21457d
- Add exp, abs_grad, reciprocal, reciprocal_grad operator for XPU and update... · ceec1e21
  Committed by zhangyk0314 on Dec 30, 2021
```
Add exp, abs_grad, reciprocal, reciprocal_grad operator for XPU and update xpu2_op_list.h,test=kunlun (#38570)
```
  ceec1e21
29 Dec, 2021 1 commit

add argsort/scatter for kunlun (#38345) · 4643baa7

Committed by TTerror on Dec 29, 2021

* add argsort/scatter for kunlun

* update test_scatter

* update xpu.cmake

* update xpu.cmake

* fix scatter

4643baa7

27 Dec, 2021 1 commit
- remove npu related impl (#38428) · f1d56b77
  Committed by Chen Weihang on Dec 26, 2021
  
  f1d56b77
26 Dec, 2021 1 commit
- auto parse kernel deps by include (#38438) · e5c7ca48
  Committed by Chen Weihang on Dec 26, 2021
  
  e5c7ca48
24 Dec, 2021 2 commits
- add register general kernel marco (#38409) · fc0a50aa
  Committed by Chen Weihang on Dec 23, 2021
  
  fc0a50aa
- Add new API cholesky_solve (#38167) · 39f7c41f
  Committed by zhiboniu on Dec 24, 2021
  
  39f7c41f
22 Dec, 2021 2 commits
- [PTen] Add cmake function for kernels (#38311) · e6310dbd
  Committed by Chen Weihang on Dec 22, 2021
```
* add pten kernel cmake

* add pten kernel cmake function

* fix compile error

* add enforce include for full kernel

* fix compile failed

* change cuda to gpu

* fix cmake function error
```
  e6310dbd
- [infrt] add tensorrt op teller pass. test=develop (#38304) · 44112817
  Committed by 王明冬 on Dec 22, 2021
  
  44112817
20 Dec, 2021 1 commit
- [MLU]add mlu backend (#38207) · 76514a1f
  Committed by fwenguang on Dec 20, 2021
  
  76514a1f
18 Dec, 2021 1 commit
- [infrt] add unit test script for infrt. test=develop (#38232) · a3bd6fc0
  Committed by 王明冬 on Dec 18, 2021
  
  a3bd6fc0
16 Dec, 2021 1 commit
- block warning: overriding D9025 (#38034) · 672dba1b
  Committed by Sing_chan on Dec 16, 2021
  
  672dba1b

PaddlePaddle / Paddle: last synced successfully more than 1 year ago