1. 09 Oct, 2021 3 commits
  2. 08 Oct, 2021 5 commits
  3. 07 Oct, 2021 1 commit
  4. 05 Oct, 2021 1 commit
    • J
      Added concat BF16/FP32 BWD OneDNN kernel (#35889) · dc4d5719
      Committed by jakpiase
      * tmp
      
      * added concat BF16/FP32 BWD oneDNN kernel
      
      * minor change
      
      * minor change
      
      * fix for CI
      
      * added formatting
      
      * Reverted deleting static keyword
      
      * added reviewers suggestions
      
      * reverted deleting concat bf16 test file
      
      * fixed concat tests
  5. 30 Sep, 2021 4 commits
  6. 29 Sep, 2021 10 commits
  7. 28 Sep, 2021 8 commits
    • L
      Add sparse_attention api, test=develop (#35676) · 6b587e93
      Committed by Liu-xiandong
      Add sparse_attention OPs, python api will be added in next pr
    • L
      add API paddle.linalg.eig (#35674) · bc7e2b92
      Committed by Lijunhui
      * Add paddle.linalg.eig op
      
      * remove comments
      
      * remove comments
      
      * extend batch_size to the origin
      
      * add real times complex functor & destroy the backward complex output bug
      
      * terminate output diff when input real tensors
      
      * correct tiny doc errors
      
      * move functions from eig_helper to svd_helper and remove eig_helper
      
      * remove tensor.Resize
      
      * remove no longer used code
      
      * use existing lapack functions
      
      * reply review comments 21/27
      
      * remove .cu as this op is only executed on CPU
      
      * remove const_cast & add const in argument list for read-only references
      
      * fix sample code error in CI
      
      * remove template typename Tbase and more
      
      * remove eig exposure in paddle.*
      
      * add 'name=None' in eig python implementation
      
      * handle the unittest
      
      * try to solve the unittest
      
      * solve CI coverage
      
      * remove no longer used code
      
      * polish API doc and more
      
      * reply review comments
      
      * polish unittest, commit plan B
      
      * polish unittest
    • X
      [hybrid] seed and dropout op support force-cpu (#35820) · 58c8f6b3
      Committed by xiayanming
      * [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
      
      * [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
      
      * [HIP] fix op not support AMD GPU bug
      
      * [hybrid] seed and dropout op support force-cpu
      
      * [hybrid] seed and dropout op support force-cpu
      
      * [hybrid] seed and dropout op support force-cpu
      
      * [hybrid] seed and dropout op support force-cpu
      
      * [hybrid] seed and dropout op support force-cpu
      
      * [hybrid] fix seed ci failed issue
      
      * add AsExtra for force_cpu of seed op
    • Z
      remove new linalg api in paddle.__init__ (#36151) · 3bb4715e
      Committed by zhiboniu
      remove recent linalg api in paddle.init;
      add args 'name' in some new linalg api interface
      same change in develop branch to #36112
    • J
      【Bug fix】Fix dygraph double grad dtype error (#36125) · af4f018a
      Committed by Jiabin Yang
      * fix dygraph double grad dtype error when calling for high differential scenario
      
      * reinvoke ci
      
      * add test for partial_engine.cc
    • K
      py2 to py3 bug and iface fix for pslib (#36102) · 0e07f20e
      Committed by kuizhiqing
    • W
      [hybrid] optimizer sharding support optimize cast (#35878) · eef0a943
      Committed by WangXi
    • Y
      Add paddle.device.cuda.get_device_properties (#35661) · 4cbed9e5
      Committed by Yanxing Shi
      * Initial Commit
      
      * add unittest and add error information
      
      * modify doc
      
      * fix some error
      
      * fix some word
      
      * fix bug cudaDeviceProp* and modify error explanation
      
      * fix cudaDeviceProp* error and unittest samples
      
      * fix hip error and PADDLE_WITH_HIP
      
      * update style
      
      * fix error is_compiled_with_cuda
      
      * fix paddle.device.cuda.get_device_properties
      
      * fix error for multi thread safe
      
      * update style
      
      * merge conflict
      
      * modify after mentor review
      
      * update style
      
      * delete word
      
      * fix unittest error for windows
      
      * support string input and modify some code
      
      * modify doc to support string input
      
      * fix error for express information
      
      * fix error for express information
      
      * fix unittest for windows
      
      * fix device.startswith('gpu:')
      
      * format error and doc
      
      * fix after review
      
      * format code
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix py2 error
      
      * fix wrong words and doc
      
      * fix _gpuDeviceProperties
  8. 27 Sep, 2021 4 commits
    • J
      fix zero tensor for unique, unstack (#36021) · efd35384
      Committed by Jiawei Wang
      * fix extra op for expand, expand_as, tile, unstack
      
      * fix unique unstack dim 0
      
      * Update expand_v2_op.cc
      
      * fix unique_op format
    • J
      Added flatten and flatten2 BF16/FP32 FWD/BWD kernels (#35892) · e427a0f1
      Committed by jakpiase
      * refactored reshape multiop kernel and added flatten1/2 kernels
      
      * added formatting for flatten tests
      
      * CI fix
      
      * disabled reshape_kernel ops after successful CI run
      
      * minor fix
    • L
      Add functional autograd API: jacobian (#35917) · ec2f68e8
      Committed by levi131
      * init functional jacobian api
      
      * finish test with dtype float32
      
      * add float64 test case
      
      * polish code
      
      * use atol=1e-5 with dtype float64
      
      * fix for ci
      
      * set timeout for test_jacobian
      
      * polish API docstring
      
      * modify docstring
    • H
      support saving model defined parameters without add scale_op (#36119) · 8db6d221
      Committed by Haipeng Wang
      * add scale_op in model save step is not necessary, just fix the prune method to support static graph and inplace op
      
      * fix jit.save, no need to add scale_op to each outputvar anymore.
      fix prune_with_input, now it supports inplace op
      
      * temporarily disable test_trt_dynamic_shape.TRTDynamicShapeOutOfBound2Test
      
      * allow user to export parameters defined in model
  9. 26 Sep, 2021 4 commits