- 05 Jan, 2022 1 commit
Committed by crystal
* add elementwise div * move mul and div grad functor * Combine multiple CUDA kernels * Update the reduce interface call * add multi-output * add multi-output div * add branch judge * Package branch * Combine the x and y functions into one
-
- 04 Jan, 2022 1 commit
Committed by YuanRisheng
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs * move cpu_impl of elementwise kernel to new directory
-
- 31 Dec, 2021 1 commit
Committed by YuanRisheng
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs
-
- 29 Dec, 2021 1 commit
Committed by limingshu
-
- 28 Dec, 2021 1 commit
Committed by limingshu
* first commit * pass ctest of elementwise_div_grad
-
- 21 Dec, 2021 1 commit
Committed by arlesniak
-
- 20 Dec, 2021 1 commit
Committed by sneaxiy
* support FP16 for more ops * add amp list tests * refine reduce_mean_grad * fix OP benchmark ci * fix fp16 reduce_mean * updat ut, but still have some problems * remove mean/reduce_mean fp16 kernel
-
- 18 Dec, 2021 1 commit
Committed by Feiyu Chan
* add complex op and `paddle.complex`.
-
- 17 Dec, 2021 1 commit
Committed by limingshu
* fix_bugs_for_elementwise_branch_selection * fix merge_dims bugs * fix all influenced file
-
- 16 Dec, 2021 3 commits
Committed by LJQ❤️
Add elementwise_fmax and elementwise_fmin operators
-
Committed by niuliling123
* Add the transformop parameter in TensorReduceFunctorImpl
-
Committed by YuanRisheng
* Reduce reshape kernel functions in pten * delete notes * fix bugs when compile * modify register name * fix compile bugs
-
- 15 Dec, 2021 1 commit
Committed by Yiqun Liu
test=document_fix
-
- 09 Dec, 2021 1 commit
Committed by Chen Weihang
-
- 08 Dec, 2021 2 commits
Committed by YuanRisheng
* add alias kernel name * modify code as suggestions
-
Committed by crystal
* add boardcast_sub * add boardcast_sub
-
- 03 Dec, 2021 1 commit
Committed by ronnywang
* refine structure for cuda and rocm * update * update * update * update
-
- 27 Nov, 2021 1 commit
Committed by Aganlengzi
* [NPU] reorganization for device API abstraction * [NPU] delete old files * [NPU] fix npu_collective_helper * [NPU] fix collective_helper * [NPU] fix ut * [NPU] mod memory allocation and hccl_helper * [NPU] fix place_type * [NPU] split enfoce.h * move acl* call into npu_info * merge conflict * fix merge * merge conflict * merge conflict
-
- 24 Nov, 2021 1 commit
Committed by YuanRisheng
* elementwise_mul refactor * perfect code in test * delete redundant code * fix bugs when run test_multiply * adjust the location of macro * fix bugs when run ci
-
- 23 Nov, 2021 1 commit
Committed by YuanRisheng
* elementwise_div refactor * fix compile bugs in windows ci
-
- 22 Nov, 2021 1 commit
Committed by Feiyu Chan
* disable copying of datatype when sharing buffer between two tensors. * fix for mkldnn operator kernels (elementwise_add, sum, softplus, softmax, scale, activation), mannually set the data type when reusing memory by ShareBufferWith.
-
- 18 Nov, 2021 1 commit
Committed by YuanRisheng
* elementwise_add kernel refactor * fix compile bugs in elementwise_add refactor * fix compile bugs when run in npu/xpu * fix bugs when run unit test * fix bugs when run ci-windows * modify code as recommended * code format adjust * fix bugs when run ci * fix compile bug when run in ci-windwos * elementwise_sub refactor * add PD_DLL_DECL for elementwise_sub * fix bugs when compilei
-
- 17 Nov, 2021 1 commit
Committed by piotrekobiIntel
* Change first batch of mkldnn headers and namespace names to dnnl * Revert changes to tensor.h, which require approval * Format changes with pre-commit * Add int32 tests * Fix int32 tests and call GetDataFromTensor for int32 * Fix test
-
- 15 Nov, 2021 1 commit
Committed by Weilong Wu
* Add elementwise_mul triple grad kernel * Removed InplaceInferer and polished code
-
- 12 Nov, 2021 1 commit
Committed by YuanRisheng
* elementwise_add kernel refactor * fix compile bugs in elementwise_add refactor * fix compile bugs when run in npu/xpu * fix bugs when run unit test * fix bugs when run ci-windows * modify code as recommended * code format adjust * fix bugs when run ci * fix compile bug when run in ci-windwos
-
- 02 Nov, 2021 1 commit
Committed by Chen Weihang
* fix several bugs * fix elementwith override error
-
- 28 Oct, 2021 1 commit
Committed by ronnywang
* add TypeAdapter method for npu_op_runner * add int64 supporting for elementwise_mul and reduce_sum * add int64 supporting and UT for expand_v2, scale and reduce_max * fix bug
-
- 27 Oct, 2021 1 commit
Committed by piotrekobiIntel
* Add WIP version of elementwise_div_mkldnn without working dy grad * Add dy gradient calculation implementation, disable broadcast tests * Readd removed tests from static_mode_white_list * Add bfloat16 gradient tests, remove int8 and uint8 support * - Change the way dy grad is calculated to improve performance - Refactor BinaryMKLDNNHandler to use a default parameter * Change copyright year * Refactor as suggested * Attempt to bypass CI Approval not accepting max_relative_error * Fix formatting issue
-
- 25 Oct, 2021 1 commit
Committed by Aganlengzi
* [NPU] modifications for model ernie-1.0 * rollback 503003 and change cast to dtype
-
- 22 Oct, 2021 1 commit
Committed by Weilong Wu
* Support elementwise_add triple grad Kernel * Change code-format to follow CI std * Removed unreasonable code, and fixed an input uninitialized issue * Support elementwise_add triple grad Kernel * Change code-format to follow CI std * Removed unreasonable code, and fixed an input uninitialized issue
-
- 21 Oct, 2021 2 commits
Committed by Jack Zhou
* add viterbi decode cpu kernel * add viterbi decoder api in paddle.text * add a data buffer once to avoid create many small pieces of data buffer frequently * fix viterbi max_seq_length bug * fix seq_len=1 bug * fix device context * move split out of for loop * remove INVERSE_SUB * remove 2 GET_CAST_MASK * remove 1 loop * remove Functor * add to_static deploy code * use MAX_FUNC instead of ELE_MAX * add MaxFunctor * impl max_func * remove MaxFunctor * remove cast op * use REGISTER_OP_WITHOUT_GRADIENT * add viterbi cuda kernel * add FIX_BLOCKDIM_CASE macro * add MKL add, mul; add get data mask * add arange mkl impl * add CPU Argmax * add cpu gather * use EXECUTE_MKL_ELEMENT_BINARY_OP instead of some ADD, MUL * use SameDimsBinaryOP instead of EXECUTE_MKL_ELEMENT_BINARY_OP * use SAME_DIMS_ELEMENT_BINARY_OP * add SimpleBroadcastBinaryOP * use int instead of int64_t to accelerate * optimize SimpleBroadcastBinaryOP * optimize SimpleBroadcastBinaryOP * optimize performance in both single thread and multithread situation * remove useless line * remove useless code * add CREATE_TENSOR_BUFFER macro * add INIT_REQUIRED_TENSOR macro * add comment * fix windows ci * add viterbi unittest * remove cuda add functor * remove cuda equal * remove a template function * fix windows ci * fix windows dtype * remove some template instance * remove useless header file * remove some blockdim * remove transpose impl * accelerate cpu performance on single thread situation * viterbi_decode->crf_decode * rename crf params name * add viterbi api test * remove useless import * add enable_static * use viterbi decoder * fix viterbi len=1 * fix viterbi unittest * remove useless comments * reconstruct viterbi decode * remove ADD,SUB,MUL structure * fix coverage * remove CREATE_TENSOR * add name args * crf.py->ops.py; with_start_stop_tag->include_start_end_tag * update crf_decode en docs * fix viterbi decode en docs * fix some review comments * add FIXED_BLOCK_DIM_CASE in cuda * 
push_back->emplace_back * crf_decode->viterbi_decode; include_start_end_tag->include_bos_eos_tag * paddle.text.ops.viterbi_decode->paddle.text.viterbi_decode * fix viterbi_decode en docs
-
Committed by niuliling123
* Update the implement of reduceAnyKernel according to kernel primitive api * Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1
-
- 19 Oct, 2021 1 commit
Committed by Weilong Wu
* Support elementwise_add triple grad Kernel * Change code-format to follow CI std
-
- 18 Oct, 2021 1 commit
Committed by Qi Li
-
- 12 Oct, 2021 1 commit
Committed by Qi Li
* [NPU] fix elementwise_mul to support broadcast, test=develop * remove debug files, test=develop * add axis support, test=develop
-
- 29 Sep, 2021 1 commit
Committed by Aganlengzi
* merge conflict of paddle_gtest_main.cc * modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt
-
- 24 Sep, 2021 1 commit
Committed by piotrekobiIntel
* Add elementwise_sub_mkldnn_op without grad * Add test to static_mode_white_list * Refactor code, change license years * Remove invalid grad implementation * Fix element_wise_sub_op test * Fix CI Approval error * Remove unnecessary EltwiseSubMKLDNNGradKernel class * Fix CI Approval 2 * Fix CI Approval 3 * Fix CI Approval Attempt #4 * Fix CI Approve Attempt #5 * Fix CI Approval Attempt #6 * Fix CI Approval Attemt #7 * Change test names containing add to sub * Fix old tests testing add instead of sub * Copy grad implementation from elementwise_add_mkldnn * CI test fix attempt * Revert "CI test fix attempt" This reverts commit c647cacf41e6a87c715385a185de5cbf65fc8900. * Fix CI attempt 2 * Fix elementwise_sub tests, temporary mkldnn broadcast test disable * Add working implementation of elementwise_sub grad * Fix build errors caused by pull * Fix format error * Fix format error 2 * Disable elementwise_sub_mkldnn test on GPU * Apply fix for paddle.fluid import * Revert changes of test_elementwise_sub and Fix mkldnn test * Revert "Apply fix for paddle.fluid import" This reverts commit fc3b122fec8e12f2bcb32928a2685ba4d20fd742. * fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (#35862) * Add changes suggested by reviewers * Change @unittest.skipIf... to @OpTestTool.skip_if_not_cpu_bf16() to satisfy Approval CI * Remove check_dygraph=False to satisify CI Approval Co-authored-by: Nzhangbo9674 <82555433+zhangbo9674@users.noreply.github.com>
-
- 23 Sep, 2021 1 commit
Committed by Li Min
-
- 21 Sep, 2021 1 commit
Committed by Guoxia Wang
-
- 18 Sep, 2021 1 commit
Committed by Yiqun Liu
-