1. 03 5月, 2010 1 次提交
    • N
      sunrpc: centralise most calls to svc_xprt_received · b48fa6b9
      Neil Brown 提交于
      svc_xprt_received must be called when ->xpo_recvfrom has finished
      receiving a message, so that the XPT_BUSY flag will be cleared and
      if necessary, requeued for further work.
      
      This call is currently made in each ->xpo_recvfrom function, often
      from multiple different points.  In each case it is the earliest point
      on a particular path where it is known that the protection provided by
      XPT_BUSY is no longer needed.
      
      However there are (still) some error paths which do not call
      svc_xprt_received, and requiring each ->xpo_recvfrom to make the call
      does not encourage robustness.
      
      So: move the svc_xprt_received call to be made just after the
      call to ->xpo_recvfrom(), and move it of the various ->xpo_recvfrom
      methods.
      
      This means that it may not be called at the earliest possible instant,
      but this is unlikely to be a measurable performance issue.
      
      Note that there are still other calls to svc_xprt_received as it is
      also needed when an xprt is newly created.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      b48fa6b9
  2. 30 11月, 2009 1 次提交
  3. 16 6月, 2009 1 次提交
  4. 26 4月, 2009 1 次提交
  5. 15 12月, 2008 1 次提交
  6. 07 10月, 2008 1 次提交
    • T
      svcrdma: Modify the RPC recv path to use FRMR when available · 146b6df6
      Tom Tucker 提交于
      RPCRDMA requests that specify a read-list are fetched with RDMA_READ. Using
      an FRMR to map the data sink improves NFSRDMA security on transports that
      place the RDMA_READ data sink LKEY on the wire because the valid lifetime
      of the MR is only the duration of the RDMA_READ. The LKEY is invalidated
      when the last RDMA_READ WR completes.
      
      Mapping the data sink also allows for very large amounts to data to be
      fetched with a single WR, so if the client is also using FRMR, the entire
      RPC read-list can be fetched with a single WR.
      Signed-off-by: NTom Tucker <tom@opengridcomputing.com>
      146b6df6
  7. 14 8月, 2008 1 次提交
    • T
      svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet · 24b8b447
      Tom Tucker 提交于
      RDMA_READ completions are kept on a separate queue from the general
      I/O request queue. Since a separate lock is used to protect the RDMA_READ
      completion queue, a race exists between the dto_tasklet and the
      svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA
      bit and adds I/O to the read-completion queue. Concurrently, the
      recvfrom thread checks the generic queue, finds it empty and resets
      the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue
      the transport for I/O and cause the transport to "stall".
      
      The fix is to protect both lists with the same lock and set the XPT_DATA
      bit with this lock held.
      Signed-off-by: NTom Tucker <tom@opengridcomputing.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      24b8b447
  8. 03 7月, 2008 2 次提交
  9. 19 5月, 2008 7 次提交
  10. 27 3月, 2008 1 次提交
    • T
      SVCRDMA: Check num_sge when setting LAST_CTXT bit · c8237a5f
      Tom Tucker 提交于
      The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly
      when the last chunk in the read-list spanned multiple pages. This
      resulted in a kernel panic when the wrong context was used to
      build the RPC iovec page list.
      
      RDMA_READ is used to fetch RPC data from the client for
      NFS_WRITE requests. A scatter-gather is used to map the
      advertised client side buffer to the server-side iovec and
      associated page list.
      
      WR contexts are used to convey which scatter-gather entries are
      handled by each WR. When the write data is large, a single RPC may
      require multiple RDMA_READ requests so the contexts for a single RPC
      are chained together in a linked list. The last context in this list
      is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes,
      the CQ handler code can enqueue the RPC for processing.
      
      The code in rdma_read_xdr was setting this bit on the last two
      contexts on this list when the last read-list chunk spanned multiple
      pages. This caused the svc_rdma_recvfrom logic to incorrectly build
      the RPC and caused the kernel to crash because the second-to-last
      context doesn't contain the iovec page list.
      
      Modified the condition that sets this bit so that it correctly detects
      the last context for the RPC.
      Signed-off-by: NTom Tucker <tom@opengridcomputing.com>
      Tested-by: NRoland Dreier <rolandd@cisco.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c8237a5f
  11. 25 3月, 2008 1 次提交
    • R
      SVCRDMA: Use only 1 RDMA read scatter entry for iWARP adapters · d3073779
      Roland Dreier 提交于
      The iWARP protocol limits RDMA read requests to a single scatter
      entry.  NFS/RDMA has code in rdma_read_max_sge() that is supposed to
      limit the sge_count for RDMA read requests to 1, but the code to do
      that is inside an #ifdef RDMA_TRANSPORT_IWARP block.  In the mainline
      kernel at least, RDMA_TRANSPORT_IWARP is an enum and not a
      preprocessor #define, so the #ifdef'ed code is never compiled.
      
      In my test of a kernel build with -j8 on an NFS/RDMA mount, this
      problem eventually leads to trouble starting with:
      
          svcrdma: Error posting send = -22
          svcrdma : RDMA_READ error = -22
      
      and things go downhill from there.
      
      The trivial fix is to delete the #ifdef guard.  The check seems to be
      a remnant of when the NFS/RDMA code was not merged and needed to
      compile against multiple kernel versions, although I don't think it
      ever worked as intended.  In any case now that the code is upstream
      there's no need to test whether the RDMA_TRANSPORT_IWARP constant is
      defined or not.
      
      Without this patch, my kernel build on an NFS/RDMA mount using NetEffect
      adapters quickly and 100% reproducibly failed with an error like:
      
          ld: final link failed: Software caused connection abort
      
      With the patch applied I was able to complete a kernel build on the
      same setup.
      
      (Tom Tucker says this is "actually an _ancient_ remnant when it had to
      compile against iWARP vs. non-iWARP enabled OFA trees.")
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Acked-by: NTom Tucker <tom@opengridcomputing.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d3073779
  12. 02 2月, 2008 1 次提交
    • T
      rdma: SVCRDMA recvfrom · d5b31be6
      Tom Tucker 提交于
      This file implements the RDMA transport recvfrom function. The function
      dequeues work reqeust completion contexts from an I/O list that it shares
      with the I/O tasklet in svc_rdma_transport.c. For ONCRPC RDMA, an RPC may
      not be complete when it is received. Instead, the RDMA header that precedes
      the RPC message informs the transport where to get the RPC data from on
      the client and where to place it in the RPC message before it is delivered
      to the server. The svc_rdma_recvfrom function therefore, parses this RDMA
      header and issues any necessary RDMA operations to fetch the remainder of
      the RPC from the client.
      
      Special handling is required when the request involves an RDMA_READ.
      In this case, recvfrom submits the RDMA_READ requests to the underlying
      transport driver and then returns 0. When the transport
      completes the last RDMA_READ for the request, it enqueues it on a
      read completion queue and enqueues the transport. The recvfrom code
      favors this queue over the regular DTO queue when satisfying reads.
      Signed-off-by: NTom Tucker <tom@opengridcomputing.com>
      Acked-by: NNeil Brown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      d5b31be6