1. 12 Nov, 2021 1 commit
    • [Pten] Refactor the Elementwise_add Kernel (#37043) · c1310343
      YuanRisheng committed
      * elementwise_add kernel refactor
      
      * fix compile bugs in elementwise_add refactor
      
      * fix compile bugs when run in npu/xpu
      
      * fix bugs when run unit test
      
      * fix bugs when run ci-windows
      
      * modify code as recommended
      
      * code format adjust
      
      * fix bugs when run ci
      
      * fix compile bug when run in ci-windows
  2. 21 Oct, 2021 1 commit
    • Add viterbi decode (#35778) · 6072aecb
      Jack Zhou committed
      * add viterbi decode cpu kernel
      
      * add viterbi decoder api in paddle.text
      
      * allocate a data buffer once to avoid frequently creating many small data buffers
      
      * fix viterbi max_seq_length bug
      
      * fix seq_len=1 bug
      
      * fix device context
      
      * move split out of for loop
      
      * remove INVERSE_SUB
      
      * remove 2 GET_CAST_MASK
      
      * remove 1 loop
      
      * remove Functor
      
      * add to_static deploy code
      
      * use MAX_FUNC instead of ELE_MAX
      
      * add MaxFunctor
      
      * impl max_func
      
      * remove MaxFunctor
      
      * remove cast op
      
      * use REGISTER_OP_WITHOUT_GRADIENT
      
      * add viterbi cuda kernel
      
      * add FIX_BLOCKDIM_CASE macro
      
      * add MKL add, mul; add get data mask
      
      * add arange mkl impl
      
      * add CPU Argmax
      
      * add cpu gather
      
      * use EXECUTE_MKL_ELEMENT_BINARY_OP instead of some ADD, MUL
      
      * use SameDimsBinaryOP instead of EXECUTE_MKL_ELEMENT_BINARY_OP
      
      * use SAME_DIMS_ELEMENT_BINARY_OP
      
      * add SimpleBroadcastBinaryOP
      
      * use int instead of int64_t to accelerate
      
      * optimize SimpleBroadcastBinaryOP
      
      * optimize SimpleBroadcastBinaryOP
      
      * optimize performance in both single thread and multithread situation
      
      * remove useless line
      
      * remove useless code
      
      * add CREATE_TENSOR_BUFFER macro
      
      * add INIT_REQUIRED_TENSOR macro
      
      * add comment
      
      * fix windows ci
      
      * add viterbi unittest
      
      * remove cuda add functor
      
      * remove cuda equal
      
      * remove a template function
      
      * fix windows ci
      
      * fix windows dtype
      
      * remove some template instance
      
      * remove useless header file
      
      * remove some blockdim
      
      * remove transpose impl
      
      * accelerate cpu performance on single thread situation
      
      * viterbi_decode->crf_decode
      
      * rename crf params name
      
      * add viterbi api test
      
      * remove useless import
      
      * add enable_static
      
      * use viterbi decoder
      
      * fix viterbi len=1
      
      * fix viterbi unittest
      
      * remove useless comments
      
      * reconstruct viterbi decode
      
      * remove ADD,SUB,MUL structure
      
      * fix coverage
      
      * remove CREATE_TENSOR
      
      * add name args
      
      * crf.py->ops.py; with_start_stop_tag->include_start_end_tag
      
      * update crf_decode en docs
      
      * fix viterbi decode en docs
      
      * fix some review comments
      
      * add FIXED_BLOCK_DIM_CASE in cuda
      
      * push_back->emplace_back
      
      * crf_decode->viterbi_decode; include_start_end_tag->include_bos_eos_tag
      
      * paddle.text.ops.viterbi_decode->paddle.text.viterbi_decode
      
      * fix viterbi_decode en docs
  3. 15 Sep, 2021 1 commit
  4. 14 Sep, 2021 1 commit
  5. 13 Sep, 2021 2 commits
  6. 22 Aug, 2021 1 commit
  7. 05 Jul, 2021 2 commits
  8. 04 Jun, 2021 1 commit
  9. 02 Jun, 2021 2 commits
  10. 12 Apr, 2021 1 commit
    • [ROCM] fix some unittests (#32129) · bd2a4e23
      ronnywang committed
      * [ROCM] fix test_gru_rnn_op
      
      * [ROCM] fix test_expand_op
      
      * [ROCM] fix test_cross_entropy_loss
      
      * [ROCM] fix test_conv_nn_grad
      
      * [ROCM] fix test_bilinear_tensor_product_op
      
      * [ROCM] fix elementwise_op_function
      
      * [ROCM] fix test_lstm_cudnn_op
      
      * [ROCM] fix test_gpu_package_without_gpu_device
      
      * [ROCM] fix test_gru_unit_op
      
      * [ROCM] fix test_imperative_optimizer
      
      * [ROCM] fix rnn
      
      * [ROCM] fix group_norm_op
      
      * [ROCM] fix test_pool3d_api
      
      * [ROCM] fix test_pool3d_op
  11. 10 Mar, 2021 1 commit
  12. 03 Mar, 2021 1 commit
  13. 03 Feb, 2021 1 commit
  14. 10 Jan, 2021 1 commit
  15. 05 Aug, 2020 1 commit
  16. 16 Jun, 2020 1 commit
  17. 12 May, 2020 1 commit
  18. 11 May, 2020 1 commit
    • Add macro BOOST_GET to enrich the error information of boost::get (#24175) · aa0f254f
      Chen Weihang committed
      * add new macro BOOST_GET_SAFELY & unittests, test=develop
      
      * add different macro type, test=develop
      
      * fix get macro type in executor, test=develop
      
      * four macro part change backup
      
      * using one macro for all case, test=develop
      
      * revert attribute change, test=develop
      
      * change to three func to solve gcc4.8 bug, test=develop
      
      * polish some details, test=develop
  19. 13 Apr, 2020 1 commit
    • Enhance elementwise ops error messages; the Python error messages were added previously · 289edf39
      LutaoChu committed
      The following ops get the kernel error message enhancement:
      paddle.fluid.layers.elementwise_add
      paddle.fluid.layers.elementwise_div
      paddle.fluid.layers.elementwise_floordiv
      paddle.fluid.layers.elementwise_max
      paddle.fluid.layers.elementwise_min
      paddle.fluid.layers.elementwise_mod
      paddle.fluid.layers.elementwise_mul
      paddle.fluid.layers.elementwise_pow
      paddle.fluid.layers.elementwise_sub
  20. 03 Apr, 2020 2 commits
  21. 29 Mar, 2020 1 commit
    • Improve elementwise performance. (#23001) · 58615a62
      zhaoyuchen2018 committed
      * Improve elementwise performance.
      
      Elementwise performance is poor when execution falls into CommonGradBroadcastCUDA; add some new kernels for different data patterns.
      
      * Add some cuda kernel to speedup common broadcast cases. test=develop
      
      * Add more test cases and fix cuda kernel bug. test=develop
      
      * Remove tests as cpu precision fails. test=develop
      
      * Refine SplitDims, test=develop
      
      * Change file mode, test=develop
  22. 25 Mar, 2020 1 commit
  23. 17 Jan, 2020 1 commit
  24. 19 Nov, 2019 1 commit
  25. 10 Oct, 2019 1 commit
  26. 04 Sep, 2019 1 commit
  27. 20 Aug, 2019 1 commit
  28. 14 Jun, 2019 1 commit
  29. 20 May, 2019 1 commit
    • Double backward elementwise div (#17416) · 10b23a72
      lvmengsi committed
      * double backward, elementwise_div
      
      * fix dx empty. test=develop
      
      * bug fix (#17392)
      
      fix security bug
      
      * Enable stack operator for Ngraph, test=develop (#17406)
      
      * fix sqrt_grad_grad unittest. test=develop (#17410)
      
      * fix sqrt_grad_grad unittest. test=develop
      
      * disable sqrt_grad_grad unittest. test=develop
      
      * test=develop, fix unittest
      
      * test=develop, fix unittest
      
      * test=develop, fix unittest
      
      * test=develop, fix bug
      
      * fix unittest. test=develop
      
      * fix unittest dx. test=develop
      
      * tmp fix! for test... test=develop
      
      * reduce tmp, test=develop
      
      * test=develop, reduce tmp
      
      * fix broadcast unittest. test=develop
      
      * fix format. test=develop
      
      * refine code. test=develop
      
      * refine code. test=develop
      
      * refine GetDoubleGradSafeTensor. test=develop
      
      * fix format. test=develop
  30. 13 May, 2019 1 commit
    • add double grad for elementwise_mul op (#17255) · 8bae8590
      Kaipeng Deng committed
      * add double grad for elementwise_mul. test=develop
      
      * remove comment. test=develop
      
      * fix grad sum. test=develop
      
      * fix for axis expand. test=develop
      
      * add test for axis expand. test=develop
  31. 08 May, 2019 1 commit
  32. 24 Jan, 2019 1 commit
  33. 16 Nov, 2018 1 commit
    • Refine operator cmake (#14413) · a2d9b344
      Wu Yi committed
      * wip simplify operator framework
      
      * wip
      
      * wip
      
      * done test=develop
      
      * clean test=develop
      
      * fix test=develop
      
      * fix deps test=develop
      
      * fix cpu build test=develop
      
      * fix tensorrt build test=develop
      
      * fix tests test=develop
      
      * fix test=develop
      
      * fix cpu build test=develop
  34. 14 Nov, 2018 1 commit
  35. 08 Nov, 2018 1 commit
  36. 07 Nov, 2018 1 commit
    • Add fp16 backward support (#14202) · a9b5d42d
      chengduo committed
      * add fp16 backward support
      test=develop
      
      * add sum_op fp16 test
      
      * disable test_dist_save_load
      test=develop
      
      * add check_grad for sum
      
      * add unit test for softmax_grad fp16
      test=develop
      
      * add scale_op unit test
      
      * add mul_grad_op unit test for fp16
      
      * add cross_entropy_grad and mean_grad unit test for fp16
      test=develop
      
      * fix cross_entropy unit test
      
      * add pool2d fp16 unit test
      
      * refine conv2d fp16 unit test
      test=develop
      
      * refine activation unit test
      test=develop
      
      * fix ci
      test=develop
      
      * follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
      test=develop