Commits · 848ae7dc34c84c09ac6df93e5cfd5c2031156cea · PaddlePaddle / Paddle

Jan 28, 2022 · 6 commits

Move digamma to pten (#39240) · 848ae7dc

Committed by hong on Jan 28, 2022

* move digamma to pten; test=develop

* fix mutable_data bugs; test=develop

* remove useless code; test=develop

* remove kernel compute; test=develop

* fix bug; test=develop

848ae7dc

compile fix (#39272) · 91dd0f0d

Committed by wenbin on Jan 28, 2022

* slice

* shuffle pass enhancement

91dd0f0d

[PSLIB] Add Metrics Module, Support User-defined Add Metric (#38789) · 2e6be886

Committed by Fan Zhang on Jan 28, 2022

* [PSLIB] Add Metrics Module, Support User-defined Add Metric

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI Coverage

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI

* [PSLIB] Modify According to CI Coverage

* [PSLIB] Modify According to CI Coverage

* [PSLIB] Modify According to CI Coverage

* modify role_maker

* update CMakeLists.txt

2e6be886

【Pten】Remove WriteBackOutput in tensor_utils (#39291) · 3ef2922b

Committed by zyfncg on Jan 28, 2022

* remove remake densetensor

* fix eager test error

* fix bug in eager

* implement AllocateFrom

* remove WriteBackOutput

* fix problem of eager
Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>

3ef2922b

[Eager] Refactor TensorAdd by template (#39282) · 0bb3e5f1

Committed by Weilong Wu on Jan 28, 2022

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* Use overload instead of template

0bb3e5f1

Auto-geneate kernel signature in C++ API (#39281) · fc5fa0de

Committed by zyfncg on Jan 28, 2022

fc5fa0de

Jan 27, 2022 · 24 commits

implement AllocateFrom (#39280) · d89f246c

Committed by zhangkaihuo on Jan 27, 2022

d89f246c

Add Khop Graph Sampler API (#39146) · 35f949b5

Committed by Siming Dai on Jan 27, 2022

* add the test case for the UVA

* add the context load for the uva

* Add graph_sample kernel

* Add graph_sample commit

* add new commit for graph_sample

* add unsigned long long int

* delete some remarks

* add cpu version

* add cuda eids

* add cpu eids

* delete _uva

* optimize speed: emplace_back, last_layer

* add to_uva_tensor

* add cpu return_eids choice

* add gpu return_eids choice

* add cpu reindex_nodes

* add gpu reindex_nodes

* rename op and add OMP for cpu

* add incubate api

* fix the compile problem for the PADDLE_ENFORE and different device

* fix the rcom and windows compile problem

* add unittest for graph_sample_neighbors

* fix cpu unittest and unique problem

* fix uva unittest, fix cuda unique problem

* fix the windows compile problem

* fix the windows rand_r compile problem

* add correct unittest, add src_eids dispensable

* delete black

* combine uva unittest

* mv Sample_index to Sample_Index; check input shape; fix random sample func

* delete memset & cudaMemset

* fix according to PR comments

* fix rocm ci

* modify function names according to the specification

* fix windows_openblas ci

* refine annotations, fix windows unittest, add default value for uva device_id, fix bug for input nodes with empty neighbors

* fix rocm ci

* rename graph_sample_neighbors as graph_khop_sampler, add incubate api doc

* add data type

* fix conflict
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>

35f949b5

[pten] remove concat fluid kernel (#39268) · 552db8dc

Committed by Leo Chen on Jan 27, 2022

552db8dc

Add kernelsignature constructor for windows (#39253) · 33e3f5ac

Committed by Chen Weihang on Jan 27, 2022

* add constructor for win

* change impl

* fix bug

33e3f5ac

【PTen】Remove ReMakePtenDenseTensor (#39094) · 98c1829b

Committed by zyfncg on Jan 27, 2022

* remove remake densetensor

* fix eager test error

* fix bug in eager

98c1829b

refactor elementwise sub grad (#39225) · 7a1e1193

Committed by YuanRisheng on Jan 27, 2022

7a1e1193

[PTen]Support AllocateFrom in Tensor and Alloc/HostAlloc in Context (#39022) · 5631da9c

Committed by Aurelius84 on Jan 27, 2022

* Support allocate_from in Tensor and allocate_data in Context

* fix #ifdef CUDA

* fix cycle depends

* fix test_xxx_dev_api failed

* fix windows compiling error

* fix unittest

* modify into PImpl

* fix selected rows

* add TODO comment

* refine interface according reviewer

5631da9c

[PTen] Add infermeta registry (#39204) · f3f16126

Committed by Chen Weihang on Jan 27, 2022

* add infermeta registry

* add infermeta registry

* add unittest

* polish details

f3f16126

[MLU] add compile ci scripts for MLU, test=mlu_ci (#39122) · 56410b4a

Committed by Qi Li on Jan 27, 2022

56410b4a

[PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215

Committed by Aganlengzi on Jan 27, 2022

* [Demo] custom kernel based on pten kernel

* merge and npu custom work well

* del comments

* delete other code

* fix CUDAContext

* fix not found small_vector.h

* support NPU

* fix NPUContext

* fix DeviceContext support

* add UT

* fix call

* add UT

* fix

* fix for comments and ut

* add MACRO control

* fix multi input output

* support env CUSTOM_DEVICE_ROOT

* deal with special cases

* fix for Windows

* try coverage with test_custom_kernel_dot.py

* fix test_custom_kernel_dot

* fix test_custom_kernel_dot

* fix merge

* fix merge

* fix CI

* update

* merge and fix

* remove WITH_CUSTOM_KERNEL

* fix merge

* merge and fix

* fix ut

* fix ut for mac

* add more UT

* add more UT

* fix

a8879215

[pten] add full xpu kernel (#39172) · 93839717

Committed by chentianyu03 on Jan 27, 2022

* add full_kernel xpu

* fix full xpu register device type error

* fix full kernel bug

* add fulllike kernel impl and replace with raw kernel

* fix dev_ctx convert template args error

* modify namespace and header file

* add isinf check

* fix input type args in TensorSetConstantXPU error

93839717

fix shuffle_channel_detect_pass (#39242) · af9ddeb7

Committed by wenbin on Jan 27, 2022

* shuffle channel pass

* add ut

* timeout fix

* makefile fix

af9ddeb7

optimize kunlun/xpu softmax_with_cross_entropy add add unitest (#39180) · 2b9bb8bb

Committed by QingshuChen on Jan 27, 2022

* optimize kunlun/xpu softmax_with_cross_entropy add add unitest
*test=kunlun

* minor
*test=kunlun

* minor
*test=kunlun

* minor
*test=kunlun

* minor
*test=kunlun

2b9bb8bb

fix the c api unit test failed in windows. test=develop (#39244) · 9b79988c

Committed by 王明冬 on Jan 27, 2022

9b79988c

Fix slice error in jit.to_static mode (#39251) · c0f993f6

Committed by zyfncg on Jan 27, 2022

* fix slice bug

* fix syntax error

c0f993f6

INFRT/add LLVM lit to infrt (#39149) · a1addeef

Committed by Yan Chunwei on Jan 27, 2022

a1addeef

compile for afs api (#39113) · 4748486e

Committed by Thunderbrook on Jan 27, 2022

* compile for afs api

* with pslib

4748486e

Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3

Committed by zhangkaihuo on Jan 27, 2022

* fix bug:
1. atten: set the default value of attn_dropout_rate to None
2. ffn: add activation parameter

* for pure fp16

* Add a SparseCsrTensor

* remove unused functional

* remove const

* remove SetMemoberTensor

* remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows

* SparseCooTensor

* add SetMember

* merge upstream; add SetMember

* merge upstream

* merge upstream; add newline at end of file

* add newline at end of file

* remove newline at end of file

* remove newline at end of file

* stash

* user pten::framework::make_ddim

* user pten::framework::make_ddim

* merge upstream; use the latest mutable_data

* merge upstream; use the latest mutable_data

* return mutable dense tensor

a7edb3f3

[Paddle-Inference]: fix concat slice (#39096) · f080e8d5

Committed by Wangzheee on Jan 27, 2022

* Paddle-Inference:fix_concat_slice

* Paddle-Inference:fix_concat_slice

* Paddle-Inference:fix_concat_slice

* Paddle-Inference:fix_concat_slice

* [Paddle-Inference]: fix concat slice

* [Paddle-Inference]: fix concat slice

* [Paddle-Inference]: fix concat slice

f080e8d5

[fleet_executor] add flag to control timer (#39241) · d6d745d2

Committed by Yuang Liu on Jan 27, 2022

d6d745d2

move math_cuda_utils.h to pten/kernels/funcs (#39246) · 809a10b6

Committed by Feiyu Chan on Jan 27, 2022

809a10b6

[NPU] fix aarch64 deps (#39257) · 80dfa010

Committed by Aganlengzi on Jan 27, 2022

80dfa010

Modified EagerUtils interfaces (#39200) · 7ed81711

Committed by Zhanlue Yang on Jan 27, 2022

7ed81711

Added automatic code generation for final state Eager Dygraph (#39192) · eb1f9439

Committed by Zhanlue Yang on Jan 27, 2022

* Removed debug info

* Added automatic code generation for final state Eager Dygraph

* Modified backward yaml

* Fixed CI Issues

eb1f9439

Jan 26, 2022 · 10 commits

[pten] remove deprecated fluid op kernel for pten (#38842) · 3ab9aef1

Committed by Leo Chen on Jan 26, 2022

* update cmake file to remove fluid kernel

* add pten declaration.h to where pybind.h used

* fix sync_bn and tensorrt_engine

* refine detection_library

* fix interpreter_core

* support eager legacy

* fit eager legacy for pten

* fall back to cpu if not found kernel

* fix compile problem

* fix compile problem

* refine fallback logic

* fit operator.run()

* fix xpu compile

* fit for new_exec

* add REGISTER_OP_WITHOUT_GRADIENT

* un-cache pt_kernel_context

* fix compile

* fix cudnn

* fix compiling with on_infer

* fix mkldnn

* fix isfinite_v2

* fix xpu problem

* fix op_device

* refine fallback for xpu

* fix xpu compile

* merge develop

* refine code format

* fix compile

* fix compile

* add data_transfer

* fix PreparePtenData

* fix cpu context

* merge develop

* fix compile

* fix error device context

* fix xpu

* fix dev_ctx

3ab9aef1

[pten] Cast xpu kernel (#39179) · 93d2f0a6

Committed by chentianyu03 on Jan 26, 2022

* cast xpu kernel init

* cast xpu kernel

* replace with raw cast xpu kernel

* fix cast kernel bug

* add the missing break

* modify namespace and header file

93d2f0a6

add dependences of enforce (#39237) · 2c0160e5

Committed by xiongkun on Jan 26, 2022

2c0160e5

[Eager] Support imperative selected_rows_to_lod_tensor and the opposite case (#39223) · 787980b1

Committed by Weilong Wu on Jan 26, 2022

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Selected_Rows inherits from TensorBase

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

* Use paddle/pten/core/enforce and polish code

* Support imperative selected_rows_to_lod_tensor

* Polish code

787980b1

[MLU]Add conv2d op (#39110) · 71634a61

Committed by qipengh on Jan 26, 2022

* [MLU]Add conv2d op

* [MLU]fix comment

* [MLU]adapt NCHW of conv2d op

71634a61

[IPU] sync misc changes 01 (#38876) · 4efbebea

Committed by Allen Guo on Jan 26, 2022

* sync misc changes

* apply comments 01

* fix compile error

* remove is_ipu_place check

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* sync changes

* restore cmake

* update ir cmake and setup.py

* update inference_lib cmake

* split PR
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

4efbebea

[Move selected_rows PR #5] VisitDataType use Pten::DataType (#39236) · 42a0947e

Committed by Weilong Wu on Jan 26, 2022

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Selected_Rows inherits from TensorBase

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

* Use paddle/pten/core/enforce and polish code

* Use pten::DataType instead of using proto_type

* Move part of data_type to pten

* Polish Code

42a0947e

[Pten]Move kernel_primitives lib to Pten directory (#39169) · 452bcbe2

Committed by YuanRisheng on Jan 26, 2022

* move kernel_primitives

* use pten's errors

452bcbe2

[PTEN] cpu_context add eigen deps (#39234) · bd5c962d

Committed by Wilber on Jan 26, 2022

* add eigen deps

* update

bd5c962d

[IPU] sync misc changes 02 (#39189) · 5df78366

Committed by Allen Guo on Jan 26, 2022

* sync misc changes

* apply comments 01

* fix compile error

* remove is_ipu_place check

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* sync changes

* restore cmake

* update ir cmake and setup.py

* update inference_lib cmake

* restore for split PR
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

5df78366

PaddlePaddle / Paddle · synced successfully about 1 year ago
