1. 25 Nov, 2021 1 commit
    • [PaddlePaddle Hackathon] 6. Add ZeroPad2d to Paddle (#37151) · 81861f69
      Committed by Matsumoto GAO
      * add zeropad2d v0.1
      
      * add zeropad2d v0.2
      
      * add zeropad2d v0.3
      
      * add zeropad2d v0.3
      
      * add zeropad2d v0.3
      
      * add zeropad2d v0.4
      
      * add zeropad2d v0.5
      
      * add zeropad2d v0.5 codestyle
      
      * add zeropad2d v0.5 codestyle
      
      * add zeropad2d v0.6 functional
      
      * add zeropad2d v0.6 functional
      
      * add zeropad2d v0.6 functional
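As a rough illustration of what a 2D zero-padding layer computes, here is a minimal pure-Python sketch. The function name `zero_pad_2d` and the `(left, right, top, bottom)` padding order are assumptions made for this example; they are not taken from the commit itself.

```python
def zero_pad_2d(image, padding):
    """Pad a 2D list-of-rows with zeros (illustrative sketch, not Paddle code).

    `padding` is assumed to be (left, right, top, bottom).
    """
    left, right, top, bottom = padding
    width = len(image[0]) + left + right
    # Rows of zeros added above the image.
    padded = [[0] * width for _ in range(top)]
    # Original rows, widened with zeros on each side.
    for row in image:
        padded.append([0] * left + list(row) + [0] * right)
    # Rows of zeros added below the image.
    padded.extend([0] * width for _ in range(bottom))
    return padded


result = zero_pad_2d([[1, 2], [3, 4]], (1, 0, 0, 2))
for row in result:
    print(row)
```

A real layer would apply the same padding to the last two dimensions of a batched tensor; the sketch handles a single 2D plane only.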
  2. 22 Nov, 2021 1 commit
  3. 28 Oct, 2021 1 commit
  4. 26 Oct, 2021 1 commit
    • Add fused attention op backward and python layer. (#36498) · 5119428e
      Committed by Li Min
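      Purpose: this PR aims to improve the computational performance of the attention module.
      To reduce the framework's op scheduling overhead, this PR implements the attention module by hand in C++ and exposes it as a single large attention op.
      To reduce memory-access overhead, this PR applies two optimizations:
      (1) When computing q, k, and v, the shared input X is used to reduce the gemm, transpose, and bias add at that point from three calls to one.
      (2) Kernel fusion is used to pass data through registers between different CUDA kernels.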
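Optimization (1) can be illustrated with a small pure-Python sketch: because q, k, and v share the input X, the three weight matrices can be concatenated column-wise so that one matrix multiply and one bias add produce all three projections. The helper names, shapes, and values below are assumptions chosen for this example, and the transpose step is omitted for brevity; this is not the C++ implementation from the PR.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_bias(m, bias):
    """Add a bias vector to every row of a matrix."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


# Shared input X with shape (seq_len=3, hidden=2); values are arbitrary.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w_q = [[1.0, 0.0], [0.0, 1.0]]
w_k = [[2.0, 1.0], [0.0, 3.0]]
w_v = [[0.5, -1.0], [1.5, 2.0]]
b_q, b_k, b_v = [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]

# Unfused: three separate gemm + bias-add calls.
q = add_bias(matmul(x, w_q), b_q)
k = add_bias(matmul(x, w_k), b_k)
v = add_bias(matmul(x, w_v), b_v)

# Fused: concatenate weights column-wise into shape (hidden, 3 * hidden),
# then run a single gemm + bias add and slice the result.
w_qkv = [rq + rk + rv for rq, rk, rv in zip(w_q, w_k, w_v)]
b_qkv = b_q + b_k + b_v
qkv = add_bias(matmul(x, w_qkv), b_qkv)

hidden = len(w_q[0])
q_fused = [row[:hidden] for row in qkv]
k_fused = [row[hidden:2 * hidden] for row in qkv]
v_fused = [row[2 * hidden:] for row in qkv]
```

The fused path performs the same arithmetic per output element, so the sliced results match the unfused ones while the input X is read once instead of three times.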
  5. 22 Oct, 2021 1 commit
    • Fused attention op forward (#35905) · d4906214
      Committed by Li Min
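      Purpose: this PR aims to improve the computational performance of the attention module.
      To reduce the framework's op scheduling overhead, this PR implements the attention module by hand in C++ and exposes it as a single large attention op.
      To reduce memory-access overhead, this PR applies two optimizations:
      (1) When computing q, k, and v, the shared input X is used to reduce the gemm, transpose, and bias add at that point from three calls to one.
      (2) Kernel fusion is used to pass data through registers between different CUDA kernels.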
  6. 13 Oct, 2021 2 commits
  7. 17 Sep, 2021 1 commit
  8. 15 Sep, 2021 1 commit
  9. 06 Sep, 2021 1 commit
    • replace pass with error exception (#35367) · 5675042d
      Committed by Feng Xing
      This PR raises an exception in the fused transformer Python interface.
      The function bodies are not implemented yet (they will be implemented later).
      Following zhiqiu's comment on the previous PR-35206 (already merged), it is better to raise an exception than to use "pass".
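The reasoning behind preferring an exception over "pass" can be shown with a short sketch. The class names below are hypothetical placeholders, not the actual classes from the PR: a stub that uses `pass` silently returns `None`, while a stub that raises makes the missing implementation obvious to the caller.

```python
class SilentStub:
    """Hypothetical placeholder layer whose body is just `pass`."""

    def forward(self, x):
        pass  # Returns None, so callers may not notice nothing happened.


class LoudStub:
    """Hypothetical placeholder layer that fails explicitly."""

    def forward(self, x):
        raise NotImplementedError("forward is not implemented yet")


silent_result = SilentStub().forward([1.0, 2.0])

try:
    LoudStub().forward([1.0, 2.0])
    outcome = "returned silently"
except NotImplementedError as exc:
    outcome = f"raised: {exc}"

print(silent_result)
print(outcome)
```

With the raising stub, a caller who uses the interface before it is implemented gets an immediate, descriptive error rather than a `None` that fails somewhere downstream.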
  10. 31 Aug, 2021 1 commit
    • transformer opt python files (#35206) · e2991555
      Committed by Feng Xing
      This PR adds the Python files for the fused transformer and defines its interface.
      
      The fused transformer implements an optimized version of the transformer layer (in python/paddle/nn/layer/transformer.py). This PR defines four layers (functions):
      (1) FusedMultiHeadAttention: multi-head attention layer
      (2) FusedFeedForward: feed forward layer
      (3) FusedTransformerEncoderLayer: transformer encoder layer
      (4) FusedTransformer: transformer layer
  11. 27 Aug, 2021 1 commit
    • Add unpool2d op & Expose max_unpool2d API (#35056) · ceee71a0
      Committed by xiaoting
      * add maxunpool2d op, test=develop
      
      * fix typo, test=develop
      
      * fix unpool unittest, test=develop
      
      * fix unpool code-example, test=develop
      
      * fix for unpool_op_unittest,test=develop
      
      * fix example code, test=develop
      
      * add noqa:F401, test=develop
      
      * fix coverage, test=develop
      
      * fix unittest for unpool, test=develop
      
      * rename unpool2d to unpool, test=develop
      
      * rename unpool2d to unpool, test=develop
  12. 20 Aug, 2021 1 commit
  13. 18 Aug, 2021 1 commit
    • fix pad outliers err (#34979) · 248e27b7
      Committed by littletomatodonkey
      * fix pad outliers err
      
      * fix pad api input type and doc
      
      * fix example of pad
      
      * add unittest for pad3d
      
      * fix unittest
      
      * fix error format
      
      * fix pad doc
  14. 17 Aug, 2021 1 commit
    • Align CTC grad scale same with ESPNet (#34729) · 10f9644c
      Committed by Hui Zhang
      * dygraph support more ctc grad scale
      
      * scale for 1.x
      
      * fix unittest
      
      * fix unittest
      
      * format code
      
      * fix unittest
      
      * fix log info
      
      * unittest cov
      
      * fix format;notest,test=cpu,coverage
      
      * skip ctc_loss egs;test=cpu
      
      * warpctc grad cov;test=coverage
      
      * add dygraph test;test=coverage
      
      * format;test=cpu,coverage
      
      * format;test=cpu
      
      * add api compat;test=cpu
      
      * add cpu test
      
      * rename
      
      * rename
      
      * fix
      
      * fix test
      
      * format
      
      * eigen cpu
      
      * eigen gpu grad pass
      
      * cuda gpu pass
      
      * format
      
      * fix ci
  15. 06 Aug, 2021 2 commits
    • 4caf60df
    • paddle/nn fix formula bugs (#34643) · 0f19ac7c
      Committed by sunzhongkai588
      * fix paddle.optimizer test=document_fix
      
      * fix paddle.optimizer test=document_fix
      
      * fix bugs in paddle.nn.functional document test=document_fix
      
      * fix bugs in paddle.nn.functional document test=document_fix
      
      * fix bugs in paddle.nn.functional document test=document_fix
      
      * fix bugs in paddle.nn.functional document test=document_fix
      
      * fix nn formula bugs test=document_fix
      
      * fix nn formula bugs test=document_fix
      
      * fix nn formula bugs test=document_fix
  16. 02 Aug, 2021 1 commit
    • Add Identity OP (#34420) · 80f7f7ea
      Committed by shiyutang
      * test=develop
      
      * update identity
      
      * add unittest
      
      * notest,test=mac_py3
      
      * modify comment & testname
      
      * test=document_fix
      
      * update comment
      
      * test=document_fix
      
      * activate all of the CI
  17. 26 Jul, 2021 1 commit
  18. 19 Jul, 2021 1 commit
  19. 15 Jul, 2021 1 commit
  20. 01 Jul, 2021 1 commit
  21. 25 Jun, 2021 1 commit
  22. 16 Jun, 2021 1 commit
  23. 09 Jun, 2021 1 commit
  24. 01 Jun, 2021 1 commit
  25. 31 May, 2021 1 commit
  26. 22 May, 2021 1 commit
  27. 29 Apr, 2021 1 commit
  28. 27 Apr, 2021 2 commits
  29. 26 Apr, 2021 3 commits
  30. 25 Apr, 2021 1 commit
  31. 22 Apr, 2021 1 commit
  32. 20 Apr, 2021 1 commit
  33. 13 Apr, 2021 1 commit
  34. 12 Apr, 2021 1 commit
    • [ROCM] fix some unittests (#32129) · bd2a4e23
      Committed by ronnywang
      * [ROCM] fix test_gru_rnn_op
      
      * [ROCM] fix test_expand_op
      
      * [ROCM] fix test_cross_entropy_loss
      
      * [ROCM] fix test_conv_nn_grad
      
      * [ROCM] fix test_bilinear_tensor_product_op
      
      * [ROCM] fix elementwise_op_function
      
      * [ROCM] fix test_lstm_cudnn_op
      
      * [ROCM] fix test_gpu_package_without_gpu_device
      
      * [ROCM] fix test_gru_unit_op
      
      * [ROCM] fix test_imperative_optimizer
      
      * [ROCM] fix rnn
      
      * [ROCM] fix group_norm_op
      
      * [ROCM] fix test_pool3d_api
      
      * [ROCM] fix test_pool3d_op
  35. 08 Apr, 2021 1 commit
    • Add LayerDict class (#31951) · e45c3fa5
      Committed by chentianyu03
      * add layerdict class
      
      * add docs and test cases for LayerDict class
      
      * remove the arguments type in function define
      
      * add update inputs type check