1. 26 Apr 2021 (3 commits)
  2. 06 Feb 2021 (6 commits)
    • xprtrdma: Clean up rpcrdma_prepare_readch() · 586a0787
      Chuck Lever authored
      Since commit 9ed5af26 ("SUNRPC: Clean up the handling of page
      padding in rpc_prepare_reply_pages()") [Dec 2020] the NFS client
      passes payload data to the transport with the padding in xdr->pages
      instead of in the send buffer's tail kvec. There's no need for the
      extra logic to advance the base of the tail kvec because the upper
      layer no longer places XDR padding there.
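
      As a rough sketch of the resulting simplification (the helper
      name is hypothetical, not the kernel's), a non-empty tail kvec
      can now be used as-is:

          #include <linux/sunrpc/xdr.h>

          /* Hypothetical sketch: with the XDR pad carried in
           * xdr->pages, the tail kvec holds only real trailing data,
           * so its base never needs to be advanced past pad bytes.
           */
          static bool tail_needs_sending(const struct xdr_buf *xdr,
                                         struct kvec *out)
          {
                  if (!xdr->tail[0].iov_len)
                          return false;
                  *out = xdr->tail[0];
                  return true;
          }
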
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Pad optimization, revisited · 2324fbed
      Chuck Lever authored
      The NetApp Linux team discovered that with NFS/RDMA servers that do
      not support RFC 8797, the Linux client is forming NFSv4.x WRITE
      requests incorrectly.
      
      In this case, the Linux NFS client disables implicit chunk round-up
      for odd-length Read and Write chunks. The goal was to support old
      servers that needed that padding to be sent explicitly by clients.
      
      In that case the Linux NFS client included the tail kvec in the
      Read chunk, since the tail contains any needed padding. That
      meant a separate memory registration was needed for the tail
      kvec, adding to the cost of forming such requests. To avoid that
      cost for a mere 3 bytes of zeroes that are always ignored by
      receivers, we try to use implicit round-up when possible.
      
      For NFSv4.x, the tail kvec also sometimes contains a trailing
      GETATTR operation. The Linux NFS client unintentionally includes
      that GETATTR operation in the Read chunk as well as inline.
      
      The fix is simply to /never/ include the tail kvec when forming a
      data payload Read chunk. The padding is thus now always present.
      
      Note that since commit 9ed5af26 ("SUNRPC: Clean up the handling
      of page padding in rpc_prepare_reply_pages()") [Dec 2020] the NFS
      client passes payload data to the transport with the padding in
      xdr->pages instead of in the send buffer's tail kvec. So now the
      Linux NFS client appends XDR padding to all odd-sized Read chunks.
      This shouldn't be a problem because:
      
       - RFC 8166-compliant servers are supposed to work with or without
         that XDR padding in Read chunks.
      
       - Since the padding is now in the same memory region as the data
         payload, a separate memory registration is not needed. In
         addition, the link layer extends data in RDMA Read responses to
         4-byte boundaries anyway. Thus omitting the padding no longer
         saves anything.
      
      Because older kernels include the payload's XDR padding in the
      tail kvec, a fix there will be more complicated. Thus backporting
      this patch is not recommended.
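
      A minimal sketch of the resulting rule (the function name is
      hypothetical): the chunk is sized from xdr->pages alone, with
      implicit round-up covering the pad, and the tail stays inline:

          #include <linux/sunrpc/xdr.h>

          /* Hypothetical sketch: size a data payload Read chunk from
           * xdr->pages only. Rounding page_len up to a 4-byte
           * boundary covers the XDR pad, which now sits at the end
           * of the pages; the tail kvec (possibly holding a trailing
           * GETATTR) is never included.
           */
          static size_t read_chunk_size(const struct xdr_buf *xdr)
          {
                  return xdr_align_size(xdr->page_len);
          }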
      
      Reported-by: Olga Kornievskaia <Olga.Kornievskaia@netapp.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Tom Talpey <tom@talpey.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • rpcrdma: Fix comments about reverse-direction operation · 84dff5eb
      Chuck Lever authored
      During the final stages of publication of RFC 8167, reviewers
      requested that we use the term "reverse direction" rather than
      "backwards direction". Update comments to reflect this preference.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Tom Talpey <tom@talpey.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Refactor invocations of offset_in_page() · 67b16625
      Chuck Lever authored
      Clean up so that offset_in_page() is invoked less often in the
      most common case, which is mapping xdr->pages.
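
      A hedged sketch of the pattern (helper and parameters are
      illustrative): only the first entry of xdr->pages can start
      mid-page, so the offset is computed once before the loop rather
      than per segment:

          #include <linux/kernel.h>
          #include <linux/mm.h>

          static unsigned int count_page_segments(unsigned int page_base,
                                                  unsigned int len)
          {
                  unsigned int offset = offset_in_page(page_base);
                  unsigned int nsegs = 0;

                  while (len) {
                          unsigned int seg = min_t(unsigned int,
                                                   PAGE_SIZE - offset, len);

                          /* ... map one SGL entry of 'seg' bytes ... */
                          nsegs++;
                          len -= seg;
                          offset = 0;  /* later pages start at 0 */
                  }
                  return nsegs;
          }
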
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Tom Talpey <tom@talpey.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map() · 54e6aec5
      Chuck Lever authored
      Clean up.
      
      Remove a conditional branch from the SGL set-up loop in frwr_map():
      Instead of using either sg_set_page() or sg_set_buf(), initialize
      the mr_page field properly when rpcrdma_convert_kvec() converts the
      kvec to an SGL entry. frwr_map() can then invoke sg_set_page()
      unconditionally.
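
      A sketch of the idea, with hypothetical struct and function
      names: record a struct page pointer even for kvec-backed
      segments, so the mapping loop has one unconditional path:

          #include <linux/mm.h>
          #include <linux/scatterlist.h>
          #include <linux/uio.h>

          struct seg_sketch {
                  struct page     *mr_page;   /* now always valid */
                  unsigned int    mr_len;
                  unsigned int    mr_offset;
          };

          static void convert_kvec_sketch(const struct kvec *vec,
                                          struct seg_sketch *seg)
          {
                  seg->mr_page   = virt_to_page(vec->iov_base);
                  seg->mr_offset = offset_in_page(vec->iov_base);
                  seg->mr_len    = vec->iov_len;
          }

          static void map_seg_sketch(struct scatterlist *sg,
                                     const struct seg_sketch *seg)
          {
                  /* no sg_set_page()/sg_set_buf() branch needed */
                  sg_set_page(sg, seg->mr_page, seg->mr_len,
                              seg->mr_offset);
          }
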
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Tom Talpey <tom@talpey.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Remove FMR support in rpcrdma_convert_iovs() · 9929f4ad
      Chuck Lever authored
      Support for FMR was removed by commit ba69cd12 ("xprtrdma:
      Remove support for FMR memory registration") [Dec 2018]. That means
      the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
      commit 821c791a ("xprtrdma: Segment head and tail XDR buffers
      on page boundaries") [Mar 2016], is no longer necessary. FRWR
      memory registration handles this case with aplomb.
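
      An illustrative before/after contrast, with hypothetical helper
      names, of how many segments a kvec costs under each scheme:

          #include <linux/mm.h>
          #include <linux/uio.h>

          /* FMR needed one page-aligned segment per page touched */
          static unsigned int kvec_segs_fmr(const struct kvec *vec)
          {
                  unsigned long base = (unsigned long)vec->iov_base;
                  unsigned long end  = base + vec->iov_len;

                  return (PAGE_ALIGN(end) - (base & PAGE_MASK))
                         >> PAGE_SHIFT;
          }

          /* FRWR registers the whole kvec as a single segment */
          static unsigned int kvec_segs_frwr(const struct kvec *vec)
          {
                  return 1;
          }
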
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  3. 14 Dec 2020 (1 commit)
    • xprtrdma: Fix XDRBUF_SPARSE_PAGES support · 15261b91
      Chuck Lever authored
      Olga K. observed that rpcrdma_marshal_req() allocates sparse pages
      only when it has determined that a Reply chunk is necessary. There
      are plenty of cases where no Reply chunk is needed, but the
      XDRBUF_SPARSE_PAGES flag is set. The result would be a crash in
      rpcrdma_inline_fixup() when it tries to copy parts of the received
      Reply into a missing page.
      
      To avoid crashing, handle sparse page allocation up front.
      
      Until XATTR support was added, this issue did not appear often
      because the only SPARSE_PAGES consumer always expected a reply
      large enough to require a Reply chunk.
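
      A hedged sketch of the up-front allocation (helper name is
      hypothetical): fill in any missing receive pages during
      marshaling, before the Reply-chunk decision, so the fixup path
      can never copy into a NULL page:

          #include <linux/errno.h>
          #include <linux/gfp.h>
          #include <linux/mm.h>
          #include <linux/sunrpc/xdr.h>

          static int alloc_sparse_pages(struct xdr_buf *buf)
          {
                  unsigned int i, npages;

                  if (!buf->pages)
                          return 0;
                  npages = (buf->page_base + buf->page_len +
                            PAGE_SIZE - 1) >> PAGE_SHIFT;
                  for (i = 0; i < npages; i++) {
                          if (buf->pages[i])
                                  continue;
                          buf->pages[i] = alloc_page(GFP_NOWAIT |
                                                     __GFP_NOWARN);
                          if (!buf->pages[i])
                                  return -ENOBUFS;
                  }
                  return 0;
          }
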
      Reported-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
  4. 11 Nov 2020 (5 commits)
  5. 16 Jul 2020 (1 commit)
  6. 14 Jul 2020 (3 commits)
  7. 22 Jun 2020 (1 commit)
    • xprtrdma: Fix handling of RDMA_ERROR replies · 7b2182ec
      Chuck Lever authored
      The RPC client currently doesn't handle ERR_CHUNK replies correctly.
      rpcrdma_complete_rqst() incorrectly passes a negative number to
      xprt_complete_rqst() as the number of bytes copied. Instead, set
      task->tk_status to the error value, and return zero bytes copied.
      
      In these cases, return -EIO rather than -EREMOTEIO. The RPC client's
      finite state machine doesn't know what to do with -EREMOTEIO.
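
      A sketch of the corrected completion path (the function name is
      hypothetical): report the failure via tk_status and complete
      with zero bytes copied, instead of passing a negative count:

          #include <linux/errno.h>
          #include <linux/sunrpc/xprt.h>

          static void complete_err_chunk(struct rpc_task *task)
          {
                  /* -EIO, not -EREMOTEIO: the RPC state machine
                   * has no transition for the latter */
                  task->tk_status = -EIO;
                  xprt_complete_rqst(task, 0);
          }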
      
      Additional clean ups:
      - Don't double-count RDMA_ERROR replies
      - Remove a stale comment
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  8. 12 Jun 2020 (1 commit)
    • SUNRPC: receive buffer size estimation values almost never change · 53bc19f1
      Chuck Lever authored
      Avoid unnecessary cache sloshing by placing the buffer size
      estimation update logic behind an atomic bit flag.
      
      The size of GSS information included in each wrapped Reply does
      not change during the lifetime of a GSS context. Therefore, the
      au_rslack and au_ralign fields need to be updated only once after
      establishing a fresh GSS credential.
      
      Thus a slack size update must occur after a cred is created,
      duplicated, or renewed, or after it expires. I'm not sure I have
      this exactly right. A trace point is introduced to track updates
      to these variables to enable troubleshooting if I missed a spot.
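
      A hedged sketch of the one-shot pattern (the bit name and
      helper are illustrative): only the first wrapped Reply after a
      fresh GSS context pays for the shared-cacheline writes:

          #include <linux/bitops.h>
          #include <linux/sunrpc/auth.h>

          #define UPDATE_SLACK_BIT 0  /* assumed flag in au_flags */

          static void update_slack_once(struct rpc_auth *auth,
                                        unsigned int rslack,
                                        unsigned int ralign)
          {
                  if (!test_and_clear_bit(UPDATE_SLACK_BIT,
                                          &auth->au_flags))
                          return;  /* already done for this context */
                  auth->au_rslack = rslack;
                  auth->au_ralign = ralign;
          }
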
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  9. 20 Apr 2020 (1 commit)
  10. 27 Mar 2020 (2 commits)
    • xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt · e28ce900
      Chuck Lever authored
      Change the rpcrdma_xprt_disconnect() function so that it no longer
      waits for the DISCONNECTED event.  This prevents blocking if the
      remote is unresponsive.
      
      In rpcrdma_xprt_disconnect(), the transport's rpcrdma_ep is
      detached. Upon return from rpcrdma_xprt_disconnect(), the transport
      (r_xprt) is ready immediately for a new connection.
      
      The RDMA_CM_DEVICE_REMOVAL and RDMA_CM_DISCONNECTED events are now
      handled almost identically.
      
      However, because the lifetimes of rpcrdma_xprt structures and
      rpcrdma_ep structures are now independent, creating an rpcrdma_ep
      needs to take a module ref count. The ep now owns most of the
      hardware resources for a transport.
      
      Also, a kref is needed to ensure that the rpcrdma_ep sticks
      around long enough for the cm_event_handler to finish.
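
      A sketch of that lifetime rule (structure trimmed, names
      illustrative): the ep holds a kref plus a module reference so
      it can outlive its r_xprt until the CM event handler is done:

          #include <linux/kref.h>
          #include <linux/module.h>
          #include <linux/slab.h>

          struct ep_sketch {
                  struct kref     re_kref;
                  /* ... device, CM id, connection state ... */
          };

          static void ep_release_sketch(struct kref *kref)
          {
                  struct ep_sketch *ep =
                          container_of(kref, struct ep_sketch, re_kref);

                  kfree(ep);
                  module_put(THIS_MODULE);  /* ref taken at create */
          }

          static void ep_put_sketch(struct ep_sketch *ep)
          {
                  kref_put(&ep->re_kref, ep_release_sketch);
          }
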
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Merge struct rpcrdma_ia into struct rpcrdma_ep · 93aa8e0a
      Chuck Lever authored
      I eventually want to allocate rpcrdma_ep separately from struct
      rpcrdma_xprt so that on occasion there can be more than one ep per
      xprt.
      
      The new struct rpcrdma_ep will contain all the fields currently in
      rpcrdma_ia and in rpcrdma_ep. This is all the device and CM settings
      for the connection, in addition to per-connection settings
      negotiated with the remote.
      
      Take this opportunity to rename the existing ep fields from rep_* to
      re_* to disambiguate these from struct rpcrdma_rep.
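
      An illustrative shape of the merged structure (fields abridged
      and names hypothetical, not the actual layout): device/CM state
      formerly in rpcrdma_ia sits beside the negotiated connection
      settings, all under re_* names:

          #include <rdma/rdma_cm.h>

          struct ep_merged_sketch {
                  struct rdma_cm_id *re_id;   /* was in rpcrdma_ia */
                  unsigned int re_inline_send; /* negotiated values */
                  unsigned int re_inline_recv;
                  /* ... */
          };
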
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  11. 17 Mar 2020 (1 commit)
  12. 15 Jan 2020 (4 commits)
  13. 24 Oct 2019 (9 commits)
  14. 27 Aug 2019 (2 commits)