提交 · 55cd9cb867566c58876d671f4c35039722fd1b34 · PaddlePaddle / Paddle

05 1月, 2022 1 次提交

implementation of broadcast div backward by reduce (#38044) · 55cd9cb8

由 crystal 提交于 1月 05, 2022

* add elementwise div

* move mul and div grad functor

* Combine multiple CUDA kernels

* Update the reduce interface call

* add multi-output

* add multi-output div

* add branch judge

* Package branch

* Combine the x and y functions into one

55cd9cb8

04 1月, 2022 1 次提交

[Pten]Move CPU_implementation of elementwise kernel in new directory (#38651) · 7c020c71

由 YuanRisheng 提交于 1月 04, 2022

* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* move cpu_impl of elementwise kernel to new directory

7c020c71

31 12月, 2021 1 次提交
- Y
  [Pten]Move math to new directory and change 「math」 to 「math_kernel」 (#38604) · e76087ad
  由 YuanRisheng 提交于 12月 31, 2021
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs
```
  e76087ad
29 12月, 2021 1 次提交
- L
  
  code clean (#38550) · 206a8f6c
  由 limingshu 提交于 12月 29, 2021
  
  206a8f6c
28 12月, 2021 1 次提交
- L
  Support multi-output feature for elementwise (#38410) · 48f061fb
  由 limingshu 提交于 12月 28, 2021
```
* first commit

* pass ctest of  elementwise_div_grad
```
  48f061fb
21 12月, 2021 1 次提交
- A
  
  Fix for wrong conditions between forward and backward in elementwise_add_grad op (#38176) · d9780a22
  由 arlesniak 提交于 12月 21, 2021
  
  d9780a22
20 12月, 2021 1 次提交

Support FP16 for more ops (#38123) · 1f445bf3

由 sneaxiy 提交于 12月 20, 2021

* support FP16 for more ops

* add amp list tests

* refine reduce_mean_grad

* fix OP benchmark ci

* fix fp16 reduce_mean

* updat ut, but still have some problems

* remove mean/reduce_mean fp16 kernel

1f445bf3

18 12月, 2021 1 次提交
- F
  add complex op (#37918) · 31e874b1
  由 Feiyu Chan 提交于 12月 18, 2021
```
* add complex op and `paddle.complex`.
```
  31e874b1
17 12月, 2021 1 次提交
- L
  [BugFix]: Elementwise branch selection and Broadcast dimension merge (#38204) · e097a748
  由 limingshu 提交于 12月 17, 2021
```
* fix_bugs_for_elementwise_branch_selection

* fix merge_dims bugs

* fix all influenced file
```
  e097a748
16 12月, 2021 3 次提交
- L
  Add fmax and fmin operators (#37826) · dd3afc9d
  由 LJQ❤️ 提交于 12月 16, 2021
```
Add elementwise_fmax and elementwise_fmin operators
```
  dd3afc9d
- N
  Add the transformop parameter in TensorReduceFunctorImpl (#38135) · 524389ee
  由 niuliling123 提交于 12月 16, 2021
```
* Add the transformop parameter in TensorReduceFunctorImpl
```
  524389ee
- Y
  [Pten]Modify registered kernel name (#38109) · be874c08
  由 YuanRisheng 提交于 12月 16, 2021
```
* Reduce reshape kernel functions in pten

* delete notes

* fix bugs when compile

* modify register name

* fix compile bugs
```
  be874c08
15 12月, 2021 1 次提交
- Y
  Change a comment to avoid the disturb to op benchmark ci. (#38148) · 4d8242df
  由 Yiqun Liu 提交于 12月 15, 2021
```
test=document_fix
```
  4d8242df
09 12月, 2021 1 次提交
- C
  
  adjust main dir (#37916) · 1911b6f0
  由 Chen Weihang 提交于 12月 08, 2021
  
  1911b6f0
08 12月, 2021 2 次提交
- Y
  [PTen]Add alias kernel name (#37881) · ff6507db
  由 YuanRisheng 提交于 12月 08, 2021
```
* add alias kernel name

* modify code as suggestions
```
  ff6507db
- C
  implementation of broadcast sub backward by reduce (#37754) · 567e6bbc
  由 crystal 提交于 12月 08, 2021
```
* add boardcast_sub

* add boardcast_sub
```
  567e6bbc
03 12月, 2021 1 次提交
- R
  refine structure for cuda and rocm (#37202) · a6d2fddb
  由 ronnywang 提交于 12月 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
  a6d2fddb
27 11月, 2021 1 次提交

[NPU] reorganization for device API abstraction (#37110) · 72241a6a

由 Aganlengzi 提交于 11月 27, 2021

* [NPU] reorganization for device API abstraction

* [NPU] delete old files

* [NPU] fix npu_collective_helper

* [NPU] fix collective_helper

* [NPU] fix ut

* [NPU] mod memory allocation and hccl_helper

* [NPU] fix place_type

* [NPU] split enfoce.h

* move acl* call into npu_info

* merge conflict

* fix merge

* merge conflict

* merge conflict

72241a6a

24 11月, 2021 1 次提交

elementwise_mul refactor (#37471) · c5e857d4

由 YuanRisheng 提交于 11月 24, 2021

* elementwise_mul refactor

* perfect code in test

* delete redundant code

* fix bugs when run test_multiply

* adjust the location of macro

* fix bugs when run ci

c5e857d4

23 11月, 2021 1 次提交
- Y
  [PTen]Elementwise_div Kernel Refactor (#37418) · 32d9beef
  由 YuanRisheng 提交于 11月 23, 2021
```
* elementwise_div refactor

* fix compile bugs in windows ci
```
  32d9beef
22 11月, 2021 1 次提交

disable copying of datatype when sharing buffer between two tensors. (#37247) · 9ec1432d

由 Feiyu Chan 提交于 11月 22, 2021

* disable copying of datatype when sharing buffer between two tensors.
* fix for mkldnn operator kernels (elementwise_add, sum, softplus, softmax, scale, activation), mannually set the data type when reusing memory by ShareBufferWith.

9ec1432d

18 11月, 2021 1 次提交

[PTen]elementwise_sub kernel refactor (#37260) · 36a95654

由 YuanRisheng 提交于 11月 18, 2021

* elementwise_add kernel refactor

* fix compile bugs in elementwise_add refactor

* fix compile bugs when run in npu/xpu

* fix bugs when run unit test

* fix bugs when run ci-windows

* modify code as recommended

* code format adjust

* fix bugs when run ci

* fix compile bug when run in ci-windwos

* elementwise_sub refactor

* add PD_DLL_DECL for elementwise_sub

* fix bugs when compilei

36a95654

17 11月, 2021 1 次提交

Changed first batch of deprecated mkldnn headers and function names to new oneDNN names (#37040) · ce3ee9bb

由 piotrekobiIntel 提交于 11月 17, 2021

* Change first batch of mkldnn headers and namespace names to dnnl

* Revert changes to tensor.h, which require approval

* Format changes with pre-commit

* Add int32 tests

* Fix int32 tests and call GetDataFromTensor for int32

* Fix test

ce3ee9bb

15 11月, 2021 1 次提交
- W
  [New features] Add elementwise_mul triple grad kernel (#37152) · 59fdf4da
  由 Weilong Wu 提交于 11月 15, 2021
```
* Add elementwise_mul triple grad kernel

* Removed InplaceInferer and polished code
```
  59fdf4da
12 11月, 2021 1 次提交

[Pten]Refactor the Elementwise_add Kernel (#37043) · c1310343

由 YuanRisheng 提交于 11月 12, 2021

* elementwise_add kernel refactor

* fix compile bugs in elementwise_add refactor

* fix compile bugs when run in npu/xpu

* fix bugs when run unit test

* fix bugs when run ci-windows

* modify code as recommended

* code format adjust

* fix bugs when run ci

* fix compile bug when run in ci-windwos

c1310343

02 11月, 2021 1 次提交
- C
  [PTen] Fix detail bugs and append registry macro (#36866) · 53b3f40f
  由 Chen Weihang 提交于 11月 02, 2021
```
* fix several bugs

* fix elementwith override error
```
  53b3f40f
28 10月, 2021 1 次提交

[NPU] Add int64 supporting for expand_v2, reduce_max, scale and tests (#36582) · c038cc7a

由 ronnywang 提交于 10月 28, 2021

* add TypeAdapter method for npu_op_runner

* add int64 supporting for elementwise_mul and reduce_sum

* add int64 supporting and UT for expand_v2, scale and reduce_max

* fix bug

c038cc7a

27 10月, 2021 1 次提交

Added fp32 / bf16 forward and backward elementwise_div_mkldnn operator (#36158) · e92e6b06

由 piotrekobiIntel 提交于 10月 27, 2021

* Add WIP version of elementwise_div_mkldnn without working dy grad

* Add dy gradient calculation implementation, disable broadcast tests

* Readd removed tests from static_mode_white_list

* Add bfloat16 gradient tests, remove int8 and uint8 support

* - Change the way dy grad is calculated to improve performance
- Refactor BinaryMKLDNNHandler to use a default parameter

* Change copyright year

* Refactor as suggested

* Attempt to bypass CI Approval
not accepting max_relative_error

* Fix formatting issue

e92e6b06

25 10月, 2021 1 次提交
- A
  [NPU] modifications for model ernie-1.0 (#36642) · 19b02d95
  由 Aganlengzi 提交于 10月 25, 2021
```
* [NPU] modifications for model ernie-1.0

* rollback 503003 and change cast to dtype
```
  19b02d95
22 10月, 2021 1 次提交

【Bug Fixes】Elementwise_add triple grad, fixed an input uninitialized problem (#36618) · 6580ad16

由 Weilong Wu 提交于 10月 22, 2021

* Support elementwise_add triple grad Kernel

* Change code-format to follow CI std

* Removed unreasonable code, and fixed an input uninitialized issue

* Support elementwise_add triple grad Kernel

* Change code-format to follow CI std

* Removed unreasonable code, and fixed an input uninitialized issue

6580ad16

21 10月, 2021 2 次提交

Add viterbi decode (#35778) · 6072aecb

由 Jack Zhou 提交于 10月 21, 2021

* add viterbi decode cpu kernel

* add viterbi decoder api in paddle.text

* add a data buffer once to avoid create many small pieces of data buffer frequently

* fix viterbi max_seq_length bug

* fix seq_len=1 bug

* fix device context

* move split out of for loop

* remove INVERSE_SUB

* remove 2 GET_CAST_MASK

* remove 1 loop

* remove Functor

* add to_static deploy code

* use MAX_FUNC instead of ELE_MAX

* add MaxFunctor

* impl max_func

* remove MaxFunctor

* remove cast op

* use REGISTER_OP_WITHOUT_GRADIENT

* add viterbi cuda kernel

* add FIX_BLOCKDIM_CASE macro

* add MKL add, mul; add get data mask

* add arange mkl impl

* add CPU Argmax

* add cpu gather

* use EXECUTE_MKL_ELEMENT_BINARY_OP instead of some ADD, MUL

* use SameDimsBinaryOP instead of EXECUTE_MKL_ELEMENT_BINARY_OP

* use SAME_DIMS_ELEMENT_BINARY_OP

* add SimpleBroadcastBinaryOP

* use int instead of int64_t to accelerate

* optimize SimpleBroadcastBinaryOP

* optimize SimpleBroadcastBinaryOP

* optimize performance in both single thread and multithread situation

* remove useless line

* remove useless code

* add CREATE_TENSOR_BUFFER macro

* add INIT_REQUIRED_TENSOR macro

* add comment

* fix windows ci

* add viterbi unittest

* remove cuda add functor

* remove cuda equal

* remove a template function

* fix windows ci

* fix windows dtype

* remove some template instance

* remove useless header file

* remove some blockdim

* remove transpose impl

* accelerate cpu performance on single thread situation

* viterbi_decode->crf_decode

* rename crf params name

* add viterbi api test

* remove useless import

* add enable_static

* use viterbi decoder

* fix viterbi len=1

* fix  viterbi unittest

* remove useless comments

* reconstruct viterbi decode

* remove ADD,SUB,MUL structure

* fix coverage

* remove CREATE_TENSOR

* add name args

* crf.py->ops.py; with_start_stop_tag->include_start_end_tag

* update crf_decode en docs

* fix viterbi decode en docs

* fix some review comments

* add FIXED_BLOCK_DIM_CASE in cuda

* push_back->emplace_back

* crf_decode->viterbi_decode; include_start_end_tag->include_bos_eos_tag

* paddle.text.ops.viterbi_decode->paddle.text.viterbi_decode

* fix viterbi_decode en docs

6072aecb

Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1 (#36373) · 921c0917

由 niuliling123 提交于 10月 21, 2021

* Update the implement of reduceAnyKernel according to kernel primitive api
* Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1

921c0917

19 10月, 2021 1 次提交
- W
  Support elementwise_add triple grad Kernel (#36508) · 51c97d9f
  由 Weilong Wu 提交于 10月 19, 2021
```
* Support elementwise_add triple grad Kernel

* Change code-format to follow CI std
```
  51c97d9f
18 10月, 2021 1 次提交
- Q
  
  [NPU] add kernels for elementwise_add gather_nd tile, test=develop (#36464) · cbd15f7d
  由 Qi Li 提交于 10月 18, 2021
  
  cbd15f7d
12 10月, 2021 1 次提交

[NPU] fix elementwise_mul to support broadcast, test=develop (#36258) · 09778f46

由 Qi Li 提交于 10月 12, 2021

* [NPU] fix elementwise_mul to support broadcast, test=develop

* remove debug files, test=develop

* add axis support, test=develop

09778f46

29 9月, 2021 1 次提交

[NPU] mod for model bert (#36165) · 7bddf2e8

由 Aganlengzi 提交于 9月 29, 2021

* merge conflict of paddle_gtest_main.cc

* modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt

7bddf2e8

24 9月, 2021 1 次提交

Added elementwise_sub_mkldnn operator (#35662) · 787273ed

由 piotrekobiIntel 提交于 9月 24, 2021

* Add elementwise_sub_mkldnn_op without grad

* Add test to static_mode_white_list

* Refactor code, change license years

* Remove invalid grad implementation

* Fix element_wise_sub_op test

* Fix CI Approval error

* Remove unnecessary EltwiseSubMKLDNNGradKernel class

* Fix CI Approval 2

* Fix CI Approval 3

* Fix CI Approval Attempt #4

* Fix CI Approve Attempt #5

* Fix CI Approval Attempt #6

* Fix CI Approval Attemt #7

* Change test names containing add to sub

* Fix old tests testing add instead of sub

* Copy grad implementation from elementwise_add_mkldnn

* CI test fix attempt

* Revert "CI test fix attempt"

This reverts commit c647cacf41e6a87c715385a185de5cbf65fc8900.

* Fix CI attempt 2

* Fix elementwise_sub tests, temporary mkldnn broadcast test disable

* Add working implementation of elementwise_sub grad

* Fix build errors caused by pull

* Fix format error

* Fix format error 2

* Disable elementwise_sub_mkldnn test on GPU

* Apply fix for paddle.fluid import

* Revert changes of test_elementwise_sub and Fix mkldnn test

* Revert "Apply fix for paddle.fluid import"

This reverts commit fc3b122fec8e12f2bcb32928a2685ba4d20fd742.

* fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (#35862)

* Add changes suggested by reviewers

* Change @unittest.skipIf... to @OpTestTool.skip_if_not_cpu_bf16() to satisfy Approval CI

* Remove check_dygraph=False to satisify CI Approval
Co-authored-by: Nzhangbo9674 <82555433+zhangbo9674@users.noreply.github.com>

787273ed

23 9月, 2021 1 次提交
- L
  
  Add fused_attention_op: add impl wrappers. (#35903) · 88ea8e6f
  由 Li Min 提交于 9月 23, 2021
  
  88ea8e6f
21 9月, 2021 1 次提交
- G
  
  support fp16 (#35888) · 087c23a9
  由 Guoxia Wang 提交于 9月 21, 2021
  
  087c23a9
18 9月, 2021 1 次提交
- Y
  
  Correct the return type of elementwise kernel to avoid many compiling warnings. (#35839) · 71f051fb
  由 Yiqun Liu 提交于 9月 18, 2021
  
  71f051fb

PaddlePaddle / Paddle 大约 1 年 前同步成功

PaddlePaddle / Paddle
大约 1 年前同步成功