Commits · 0d46a1085f58d20e8b9d3693172d5739e73cc08d · Crayon鑫 / Paddle

Feb 15, 2022 · 8 commits

[Pten] move paddle/operators/math/functors.h and compound_functors.h (#39514) · 0d46a108
Committed by Feiyu Chan on Feb 15, 2022
```
* move paddle/operators/math/functors.h
* move paddle/operators/math/compound_functors.h
```
0d46a108

Add cinn_instruction_run_op for launching execution of a cinn instruction (#39435) · 9d0baeab

Committed by TeFeng Chen on Feb 15, 2022

* add cinn_instruction_run_op for launching execution of a cinn instruction

* fix multi definition compilation error

* update cmake

* fix bug at infershape

* fix compile error due to lacking header file

9d0baeab

move histogram to pten (#39496) · 556f6eb0

Committed by hong on Feb 15, 2022

* move histogram to pten; test=develop

* fix format error; test=develop

* fix histogram kernel format; test=develop

556f6eb0

Move Abs OP to pten (#39492) · fb473067

Committed by From00 on Feb 15, 2022

* Move Abs op to pten

* Fix NPU compilation error

* Fix CI error

* Use LaunchSameDimsElementwiseCudaKernel in pten

fb473067

add dropout fp32 (#39501) · b81358d1
Committed by sneaxiy on Feb 15, 2022

b81358d1

move algorithm.h (#39502) · 7eb9593e

Committed by Feiyu Chan on Feb 15, 2022

Move paddle/fluid/operators/math/algorithm.h to paddle/pten/kernels/funcs and rename all references to symbols in it.

7eb9593e

[Pten]Move expand_v2 to pten (#39471) · 2d16d69b

Committed by Linjie Chen on Feb 15, 2022

* move expand to pten

* move expand_v2 to pten

* move expand_v2 to pten

* fix grad register

* fix grad register

* fix tensorcopy

* fix tensorcopy

* fix tensorcopy

* fix tensorcopy

* fix tensorcopy

* fix ci

* fix tensorcopy

2d16d69b

[PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404

Committed by Aurelius84 on Feb 15, 2022

* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePtenScalarArray MakePtenScalar ResetHolderWithType

* passes compilation; the next step is to remove VarType from Pten

* fix all and remove VarType from pten; succeeds on Linux. Next task is other platforms

* fix conflict with develop

* fix compile error

* Fix reset conversion

* fix conflict

* fix compile problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compile error

* fix npu

* compile npu successfully

* fix conflict

* fix conflict

Co-authored-by: xiongkun <xiongkun03@baidu.com>

7e7e9404

Feb 14, 2022 · 4 commits

[PTen] Add HasAttr for ArgumentMappingContext (#39464) · ddb1e23f
Committed by Chen Weihang on Feb 14, 2022
```
* add has_attr for arg map context

* skip useless attr now

* skip attr if not exists

* fix typo
```
ddb1e23f

[pten] add split kernel (#39060) · d0df5632

Committed by chentianyu03 on Feb 14, 2022

* add split kernel

* add split kernel signature

* fix split bug

* modify MakePtenScalarArrayFromVarList

* modify MakePtenScalarArrayFromVarList

* fix split windows register error

* add test case for split kernel

* replace raw split kernel with pten kernel

* fix makeScalar/ScalarArray bug

* remove debug log

* remove int64_t type in buildPtcontext

* update by code review

* fix split dev test failed

* change DenseTensorMeta to MetaTensor

* change split api code from auto gen to manual

* split cuda kernel support bfloat16 type

* fix conflict

* rm raw split kernel

* merge develop branch

* change to pten::errors

d0df5632

fix gather_nd, *test=kunlun (#39283) · d12c3636
Committed by TTerror on Feb 14, 2022

d12c3636

[MLU] add mlu kernel for c_broadcast op (#39470) · 1b9e6790
Committed by mhhhh1 on Feb 14, 2022

1b9e6790

Feb 11, 2022 · 11 commits
- Add TensorRT inspector into Paddle-TRT (#38362) · 69793a27
  Committed by Leo Chen on Feb 11, 2022
  
  69793a27
- Added shape (U)INT8/BF16/FP32 oneDNN kernel (#36033) · 52bbaae9
  Committed by jakpiase on Feb 11, 2022
```
* added shape oneDNN kernel

* removed unnecessary import from test

* added skipping tests for GPU

* refactoring

* refactored shape kernel

* added tests in new framework

* removed one line

* minor change

* added newline at EOF

* added formatting

* added attributes as extra
```
  52bbaae9
- uniform_random op for mlu (#39450) · 02f06708
  Committed by joeqiao12 on Feb 11, 2022
  
  02f06708
- [bf16] add bf16 kernel: transpose & unbind (#39457) · 1e6047f1
  Committed by zhangbo9674 on Feb 11, 2022
```
* add transpose unbind

* add unittest

* refine transpose unittest
```
  1e6047f1
- [MLU]support c_gen_cncl_id_op run on MLU device (#39336) · 89aa8b1a
  Committed by zn on Feb 11, 2022
```
Co-authored-by: zhangna <zhangna@cambricon.com>
```
  89aa8b1a
- [MLU] add pool2d and pool2d_grad mlu kernel (#39453) · 702bce57
  Committed by fwenguang on Feb 11, 2022
  
  702bce57
- [Pten] move operators/math/math_function_* to pten/kernels/func (#39300) · d25a7f9e
  Committed by Feiyu Chan on Feb 11, 2022
```
* move operators/math/math_function_* to pten/kernels/func
* namespace from `paddle::operators::math` to `pten::funcs`
```
  d25a7f9e
- Optimize performance of softmax_bwd when axis!=-1 (#38609) · 2ea15fc9
  Committed by Zhang Zheng on Feb 11, 2022
```
* Optimize performance of softmax_bwd when axis!=-1

* fix

* fix

* fix

* fix
```
  2ea15fc9
- Optimize bilinear interpolation forward (#39243) · a1174973
  Committed by Lijunhui on Feb 11, 2022
```
* bilinear_fw init

* optimize code

* pre-compute linear_interp input index
```
  a1174973
- [PTen] Move grad GetExpectedPtenKernelArgs into pten (#39418) · 667bd962
  Committed by Chen Weihang on Feb 11, 2022
```
* move grad get expected pten kernel args

* fix reduce sum error

* fix element_sub_grad failed

* revert kernel judge change
```
  667bd962
- Support different dtypes of inputs for elementwise ops (#38859) · bf305033
  Committed by Zhang Ting on Feb 11, 2022
```
* improve backward performance

* support different dtypes for elementwise ops
```
  bf305033

Feb 10, 2022 · 7 commits

[MLU] add mlu kernel for accuracy op (#39337) · 383de295
Committed by fwenguang on Feb 10, 2022
```
* [MLU] add mlu kernel for accuracy op

* fix license format

* fix error message
```
383de295

[NPU] add reduce_min (#39019) · 2b8b16d7
Committed by furnace on Feb 10, 2022
```
[NPU] add reduce_min
```
2b8b16d7

move Masked select to pten (#39193) · e2ad433b

Committed by hong on Feb 10, 2022

* move masked select cpu kernel

* add masked selected gpu kernel; test=develop

* fix bugs; test=develop

* bug fix; test=develop

* bug fix; test=develop

* add namespace to set mask array; test=develop

* fix bug; test=develop

* fix bugs; test=develop

* fix ddim bug; test=develop

* fix npu op bug; test=develop

* fix xpu dependency bug; test=develop

* move kernel args to sig.cc; test=develop

e2ad433b

Modify the unsqueeze dimension of input data in conv1d NCL And NLC format (#38425) · 224bc511

Committed by crystal on Feb 10, 2022

* optimize conv1d forward

* add conv opt

* Optimize memory copy

* delete share data with

* set num_filters=512

* add nlc optimize

* Optimize num_filter=512 data on A100 and V100

* Fix the workspace_size size setting of filter

224bc511

[bf16] add bf16 kernel: squeeze & unsqueeze & stack (#39402) · 59c7aea5
Committed by zhangbo9674 on Feb 10, 2022
```
* add squeeze unsqueeze stack

* add unittest

* add cpu kernel
```
59c7aea5

[bf16] add bf16 kernel: dropout & reshape & slice (#39395) · e8ac7fc3

Committed by zhangbo9674 on Feb 10, 2022

* add dropout

* add reshape

* add slice

* refine slice unittest

* refine slice unittest

* add cpu bf16 kernel

e8ac7fc3

[pten] update isnan registration (#39419) · 14ed2f54
Committed by Leo Chen on Feb 10, 2022
```
* update isnan registration

* fix compile
```
14ed2f54

Feb 09, 2022 · 10 commits
- Optimize performance of softmax_fwd when axis!=-1 (#38602) · 8e1b0204
  Committed by Zhang Zheng on Feb 09, 2022
```
* Optimize performance of softmax_fwd when axis!=-1

* use functor

* support hip

* fix functor
```
  8e1b0204
- Replace EigenBroadcast with ElementwiseBroadcast in ReduceGrad (#39255) · 772be4f5
  Committed by niuliling123 on Feb 09, 2022
  
  772be4f5
- [MLU] add mlu kernel for c_comm_init op (#39364) · 1bd7a143
  Committed by mhhhh1 on Feb 09, 2022
  
  1bd7a143
- [MLU] add gaussian_random mlu kernel (#39338) · c35b4b8e
  Committed by fwenguang on Feb 09, 2022
  
  c35b4b8e
- [mlu] add mlu kernel for momentum op (#39331) · f8ba12e5
  Committed by fwenguang on Feb 09, 2022
  
  f8ba12e5
- [mlu] add mlu kernel for elementwise_add (#39313) · d47a511a
  Committed by fwenguang on Feb 09, 2022
  
  d47a511a
- Replace EagerTensor with Tensor (#39376) · 945a3ce9
  Committed by Jiabin Yang on Feb 09, 2022
```
* merge legacy to fluid

* Remove legacy code

* Remove legacy code

* Remove DataType test

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer
```
  945a3ce9
- Rename partial function name TensorReduceFunctorImpl to TensorReduceImpl. (#39387) · 6354f81c
  Committed by Yiqun Liu on Feb 09, 2022
  
  6354f81c
- Move trace op to pten (#39227) · d7dddf94
  Committed by hong on Feb 09, 2022
```
* add trace op

* bug fix

* bug fix; test=develop

* thrust bug fix; test=develop

* remove useless register; test=develop

* fix bug; test=develop

* update trace kernel; test=develop

* move kernel args to trace_sig; test=develop
```
  d7dddf94
- move stream into pten (#39392) · 266955a9
  Committed by Chen Weihang on Feb 09, 2022
  
  266955a9

Crayon鑫 / Paddle · Up to date with the fork source project
