Commits · 9f34a0702213ada872c04ddbc367db2ceedfc697 · Crayon鑫 / Paddle

11 Jan, 2022 2 commits
- L
  Remove useless headers for some grad ops (#38823) · 9f34a070
  Committed by limingshu on Jan 11, 2022
```
* fix the wrong filename

* first commit

* first commit

* remove rest useless headers

* for ci approval
```
  9f34a070
- S
  support vs2019 compilation in windows (#38719) · 0ad363b1
  Committed by Sing_chan on Jan 11, 2022
```
* support vs2019 compilation in windows

* not modify pow_op's original compute logic
```
  0ad363b1
10 Jan, 2022 1 commit
- T
  
  1.fix elementwise_add_grad bug. 2. add dropout kernel in kl2 (#38726) · 7b860a23
  Committed by taixiurong on Jan 10, 2022
  
  7b860a23
06 Jan, 2022 3 commits
- L
  Revert "Remove useless headers for some grad ops (#38732)" (#38743) · fc990d08
  Committed by limingshu on Jan 06, 2022
```
This reverts commit c0e2b98e.
```
  fc990d08
- L
  Remove useless headers for some grad ops (#38732) · c0e2b98e
  Committed by limingshu on Jan 06, 2022
```
* fix the wrong filename

* first commit
```
  c0e2b98e
- Y
  [Pten]Move GPU_implementation of elementwise kernel in new directory (#38696) · c1adced7
  Committed by YuanRisheng on Jan 06, 2022
```
* move gpu_impl of elementwise kernel

* change copyright to 2022
```
  c1adced7
05 Jan, 2022 2 commits

optimize elementwise_mul_grad using new interfaces (#37728) · 36a102f8

Committed by Lijunhui on Jan 05, 2022

* init commit: new elem_mul_grad

* add template specialization for complex in multiply

* reply review comments

* correct dx and dy computation when T is complex

* reply review comments

* update to new ReduceRunctor

* mul-output broadcast

* call functions

* call functions with comments

* remove comments

36a102f8

implementation of broadcast div backward by reduce (#38044) · 55cd9cb8

Committed by crystal on Jan 05, 2022

* add elementwise div

* move mul and div grad functor

* Combine multiple CUDA kernels

* Update the reduce interface call

* add multi-output

* add multi-output div

* add branch judge

* Package branch

* Combine the x and y functions into one

55cd9cb8

04 Jan, 2022 1 commit

[Pten]Move CPU_implementation of elementwise kernel in new directory (#38651) · 7c020c71

Committed by YuanRisheng on Jan 04, 2022

* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* move cpu_impl of elementwise kernel to new directory

7c020c71

31 Dec, 2021 1 commit
- Y
  [Pten]Move math to new directory and change 'math' to 'math_kernel' (#38604) · e76087ad
  Committed by YuanRisheng on Dec 31, 2021
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs
```
  e76087ad
29 Dec, 2021 1 commit
- L
  
  code clean (#38550) · 206a8f6c
  Committed by limingshu on Dec 29, 2021
  
  206a8f6c
28 Dec, 2021 1 commit
- L
  Support multi-output feature for elementwise (#38410) · 48f061fb
  Committed by limingshu on Dec 28, 2021
```
* first commit

* pass ctest of  elementwise_div_grad
```
  48f061fb
21 Dec, 2021 1 commit
- A
  
  Fix for wrong conditions between forward and backward in elementwise_add_grad op (#38176) · d9780a22
  Committed by arlesniak on Dec 21, 2021
  
  d9780a22
20 Dec, 2021 1 commit

Support FP16 for more ops (#38123) · 1f445bf3

Committed by sneaxiy on Dec 20, 2021

* support FP16 for more ops

* add amp list tests

* refine reduce_mean_grad

* fix OP benchmark ci

* fix fp16 reduce_mean

* update ut, but still have some problems

* remove mean/reduce_mean fp16 kernel

1f445bf3

18 Dec, 2021 1 commit
- F
  add complex op (#37918) · 31e874b1
  Committed by Feiyu Chan on Dec 18, 2021
```
* add complex op and `paddle.complex`.
```
  31e874b1
17 Dec, 2021 1 commit
- L
  [BugFix]: Elementwise branch selection and Broadcast dimension merge (#38204) · e097a748
  Committed by limingshu on Dec 17, 2021
```
* fix_bugs_for_elementwise_branch_selection

* fix merge_dims bugs

* fix all influenced file
```
  e097a748
16 Dec, 2021 3 commits
- L
  Add fmax and fmin operators (#37826) · dd3afc9d
  Committed by LJQ❤️ on Dec 16, 2021
```
Add elementwise_fmax and elementwise_fmin operators
```
  dd3afc9d
- N
  Add the transformop parameter in TensorReduceFunctorImpl (#38135) · 524389ee
  Committed by niuliling123 on Dec 16, 2021
```
* Add the transformop parameter in TensorReduceFunctorImpl
```
  524389ee
- Y
  [Pten]Modify registered kernel name (#38109) · be874c08
  Committed by YuanRisheng on Dec 16, 2021
```
* Reduce reshape kernel functions in pten

* delete notes

* fix bugs when compile

* modify register name

* fix compile bugs
```
  be874c08
15 Dec, 2021 1 commit
- Y
  Change a comment to avoid the disturb to op benchmark ci. (#38148) · 4d8242df
  Committed by Yiqun Liu on Dec 15, 2021
```
test=document_fix
```
  4d8242df
09 Dec, 2021 1 commit
- C
  
  adjust main dir (#37916) · 1911b6f0
  Committed by Chen Weihang on Dec 08, 2021
  
  1911b6f0
08 Dec, 2021 2 commits
- Y
  [PTen]Add alias kernel name (#37881) · ff6507db
  Committed by YuanRisheng on Dec 08, 2021
```
* add alias kernel name

* modify code as suggestions
```
  ff6507db
- C
  implementation of broadcast sub backward by reduce (#37754) · 567e6bbc
  Committed by crystal on Dec 08, 2021
```
* add boardcast_sub

* add boardcast_sub
```
  567e6bbc
03 Dec, 2021 1 commit
- R
  refine structure for cuda and rocm (#37202) · a6d2fddb
  Committed by ronnywang on Dec 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
  a6d2fddb
27 Nov, 2021 1 commit

[NPU] reorganization for device API abstraction (#37110) · 72241a6a

Committed by Aganlengzi on Nov 27, 2021

* [NPU] reorganization for device API abstraction

* [NPU] delete old files

* [NPU] fix npu_collective_helper

* [NPU] fix collective_helper

* [NPU] fix ut

* [NPU] mod memory allocation and hccl_helper

* [NPU] fix place_type

* [NPU] split enforce.h

* move acl* call into npu_info

* merge conflict

* fix merge

* merge conflict

* merge conflict

72241a6a

24 Nov, 2021 1 commit

elementwise_mul refactor (#37471) · c5e857d4

Committed by YuanRisheng on Nov 24, 2021

* elementwise_mul refactor

* perfect code in test

* delete redundant code

* fix bugs when run test_multiply

* adjust the location of macro

* fix bugs when run ci

c5e857d4

23 Nov, 2021 1 commit
- Y
  [PTen]Elementwise_div Kernel Refactor (#37418) · 32d9beef
  Committed by YuanRisheng on Nov 23, 2021
```
* elementwise_div refactor

* fix compile bugs in windows ci
```
  32d9beef
22 Nov, 2021 1 commit

disable copying of datatype when sharing buffer between two tensors. (#37247) · 9ec1432d

Committed by Feiyu Chan on Nov 22, 2021

* disable copying of datatype when sharing buffer between two tensors.
* fix for mkldnn operator kernels (elementwise_add, sum, softplus, softmax, scale, activation), manually set the data type when reusing memory by ShareBufferWith.

9ec1432d

18 Nov, 2021 1 commit

[PTen]elementwise_sub kernel refactor (#37260) · 36a95654

Committed by YuanRisheng on Nov 18, 2021

* elementwise_add kernel refactor

* fix compile bugs in elementwise_add refactor

* fix compile bugs when run in npu/xpu

* fix bugs when run unit test

* fix bugs when run ci-windows

* modify code as recommended

* code format adjust

* fix bugs when run ci

* fix compile bug when run in ci-windows

* elementwise_sub refactor

* add PD_DLL_DECL for elementwise_sub

* fix bugs when compile

36a95654

17 Nov, 2021 1 commit

Changed first batch of deprecated mkldnn headers and function names to new oneDNN names (#37040) · ce3ee9bb

Committed by piotrekobiIntel on Nov 17, 2021

* Change first batch of mkldnn headers and namespace names to dnnl

* Revert changes to tensor.h, which require approval

* Format changes with pre-commit

* Add int32 tests

* Fix int32 tests and call GetDataFromTensor for int32

* Fix test

ce3ee9bb

15 Nov, 2021 1 commit
- W
  [New features] Add elementwise_mul triple grad kernel (#37152) · 59fdf4da
  Committed by Weilong Wu on Nov 15, 2021
```
* Add elementwise_mul triple grad kernel

* Removed InplaceInferer and polished code
```
  59fdf4da
12 Nov, 2021 1 commit

[Pten]Refactor the Elementwise_add Kernel (#37043) · c1310343

Committed by YuanRisheng on Nov 12, 2021

* elementwise_add kernel refactor

* fix compile bugs in elementwise_add refactor

* fix compile bugs when run in npu/xpu

* fix bugs when run unit test

* fix bugs when run ci-windows

* modify code as recommended

* code format adjust

* fix bugs when run ci

* fix compile bug when run in ci-windows

c1310343

02 Nov, 2021 1 commit
- C
  [PTen] Fix detail bugs and append registry macro (#36866) · 53b3f40f
  Committed by Chen Weihang on Nov 02, 2021
```
* fix several bugs

* fix elementwise override error
```
  53b3f40f
28 Oct, 2021 1 commit

[NPU] Add int64 supporting for expand_v2, reduce_max, scale and tests (#36582) · c038cc7a

Committed by ronnywang on Oct 28, 2021

* add TypeAdapter method for npu_op_runner

* add int64 supporting for elementwise_mul and reduce_sum

* add int64 supporting and UT for expand_v2, scale and reduce_max

* fix bug

c038cc7a

27 Oct, 2021 1 commit

Added fp32 / bf16 forward and backward elementwise_div_mkldnn operator (#36158) · e92e6b06

Committed by piotrekobiIntel on Oct 27, 2021

* Add WIP version of elementwise_div_mkldnn without working dy grad

* Add dy gradient calculation implementation, disable broadcast tests

* Readd removed tests from static_mode_white_list

* Add bfloat16 gradient tests, remove int8 and uint8 support

* - Change the way dy grad is calculated to improve performance
- Refactor BinaryMKLDNNHandler to use a default parameter

* Change copyright year

* Refactor as suggested

* Attempt to bypass CI Approval
not accepting max_relative_error

* Fix formatting issue

e92e6b06

25 Oct, 2021 1 commit
- A
  [NPU] modifications for model ernie-1.0 (#36642) · 19b02d95
  Committed by Aganlengzi on Oct 25, 2021
```
* [NPU] modifications for model ernie-1.0

* rollback 503003 and change cast to dtype
```
  19b02d95
22 Oct, 2021 1 commit

【Bug Fixes】Elementwise_add triple grad, fixed an input uninitialized problem (#36618) · 6580ad16

Committed by Weilong Wu on Oct 22, 2021

* Support elementwise_add triple grad Kernel

* Change code-format to follow CI std

* Removed unreasonable code, and fixed an input uninitialized issue

* Support elementwise_add triple grad Kernel

* Change code-format to follow CI std

* Removed unreasonable code, and fixed an input uninitialized issue

6580ad16

21 Oct, 2021 2 commits

Add viterbi decode (#35778) · 6072aecb

Committed by Jack Zhou on Oct 21, 2021

* add viterbi decode cpu kernel

* add viterbi decoder api in paddle.text

* add a data buffer once to avoid create many small pieces of data buffer frequently

* fix viterbi max_seq_length bug

* fix seq_len=1 bug

* fix device context

* move split out of for loop

* remove INVERSE_SUB

* remove 2 GET_CAST_MASK

* remove 1 loop

* remove Functor

* add to_static deploy code

* use MAX_FUNC instead of ELE_MAX

* add MaxFunctor

* impl max_func

* remove MaxFunctor

* remove cast op

* use REGISTER_OP_WITHOUT_GRADIENT

* add viterbi cuda kernel

* add FIX_BLOCKDIM_CASE macro

* add MKL add, mul; add get data mask

* add arange mkl impl

* add CPU Argmax

* add cpu gather

* use EXECUTE_MKL_ELEMENT_BINARY_OP instead of some ADD, MUL

* use SameDimsBinaryOP instead of EXECUTE_MKL_ELEMENT_BINARY_OP

* use SAME_DIMS_ELEMENT_BINARY_OP

* add SimpleBroadcastBinaryOP

* use int instead of int64_t to accelerate

* optimize SimpleBroadcastBinaryOP

* optimize SimpleBroadcastBinaryOP

* optimize performance in both single thread and multithread situation

* remove useless line

* remove useless code

* add CREATE_TENSOR_BUFFER macro

* add INIT_REQUIRED_TENSOR macro

* add comment

* fix windows ci

* add viterbi unittest

* remove cuda add functor

* remove cuda equal

* remove a template function

* fix windows ci

* fix windows dtype

* remove some template instance

* remove useless header file

* remove some blockdim

* remove transpose impl

* accelerate cpu performance on single thread situation

* viterbi_decode->crf_decode

* rename crf params name

* add viterbi api test

* remove useless import

* add enable_static

* use viterbi decoder

* fix viterbi len=1

* fix  viterbi unittest

* remove useless comments

* reconstruct viterbi decode

* remove ADD,SUB,MUL structure

* fix coverage

* remove CREATE_TENSOR

* add name args

* crf.py->ops.py; with_start_stop_tag->include_start_end_tag

* update crf_decode en docs

* fix viterbi decode en docs

* fix some review comments

* add FIXED_BLOCK_DIM_CASE in cuda

* push_back->emplace_back

* crf_decode->viterbi_decode; include_start_end_tag->include_bos_eos_tag

* paddle.text.ops.viterbi_decode->paddle.text.viterbi_decode

* fix viterbi_decode en docs

6072aecb

Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1 (#36373) · 921c0917

Committed by niuliling123 on Oct 21, 2021

* Update the implement of reduceAnyKernel according to kernel primitive api
* Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1

921c0917

19 Oct, 2021 1 commit
- W
  Support elementwise_add triple grad Kernel (#36508) · 51c97d9f
  Committed by Weilong Wu on Oct 19, 2021
```
* Support elementwise_add triple grad Kernel

* Change code-format to follow CI std
```
  51c97d9f
