1. 14 7月, 2017 11 次提交
    • C
      xprtrdma: Fix documenting comments in frwr_ops.c · 6afafa77
      Chuck Lever 提交于
      Clean up.
      
      FASTREG and LOCAL_INV WRs are typically not signaled. localinv_wake
      is used for the last LOCAL_INV WR in a chain, which is always
      signaled. The documenting comments should reflect that.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      6afafa77
    • C
      xprtrdma: Replace PAGE_MASK with offset_in_page() · d933cc32
      Chuck Lever 提交于
      Clean up.
      
      Reported by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      d933cc32
    • C
      xprtrdma: FMR does not need list_del_init() · e2f6ef09
      Chuck Lever 提交于
      Clean up.
      
      Commit 38f1932e ("xprtrdma: Remove FMRs from the unmap list
      after unmapping") utilized list_del_init() to try to prevent some
      list corruption. The corruption was actually caused by the reply
      handler racing with a signal. Now that MR invalidation is properly
      serialized, list_del_init() can safely be replaced.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      e2f6ef09
    • C
      xprtrdma: Demote "connect" log messages · 173b8f49
      Chuck Lever 提交于
      Some have complained about the log messages generated when xprtrdma
      opens or closes a connection to a server. When an NFS mount is
      mostly idle these can appear every few minutes as the client idles
      out the connection and reconnects.
      
      Connection and disconnection is a normal part of operation, and not
      exceptional, so change these to dprintk's for now. At some point
      all of these will be converted to tracepoints, but that's for
      another day.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      173b8f49
    • C
      xprtrdma: Don't defer MR recovery if ro_map fails · 1f541895
      Chuck Lever 提交于
      Deferred MR recovery does a DMA-unmapping of the MW. However, ro_map
      invokes rpcrdma_defer_mr_recovery in some error cases where the MW
      has not even been DMA-mapped yet.
      
      Avoid a DMA-unmapping error replacing rpcrdma_defer_mr_recovery.
      
      Also note that if ib_dma_map_sg is asked to map 0 nents, it will
      return 0. So the extra "if (i == 0)" check is no longer needed.
      
      Fixes: 42fe28f6 ("xprtrdma: Do not leak an MW during a DMA ...")
      Fixes: 505bbe64 ("xprtrdma: Refactor MR recovery work queues")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      1f541895
    • C
      xprtrdma: Fix FRWR invalidation error recovery · 8d75483a
      Chuck Lever 提交于
      When ib_post_send() fails, all LOCAL_INV WRs past @bad_wr have to be
      examined, and the MRs reset by hand.
      
      I'm not sure how the existing code can work by comparing R_keys.
      Restructure the logic so that instead it walks the chain of WRs,
      starting from the first bad one.
      
      Make sure to wait for completion if at least one WR was actually
      posted. Otherwise, if the ib_post_send fails, we can end up
      DMA-unmapping the MR while LOCAL_INV operations are in flight.
      
      Commit 7a89f9c6 ("xprtrdma: Honor ->send_request API contract")
      added the rdma_disconnect() call site. The disconnect actually
      causes more problems than it solves, and SQ overruns happen only as
      a result of software bugs. So remove it.
      
      Fixes: d7a21c1b ("xprtrdma: Reset MRs in frwr_op_unmap_sync()")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      8d75483a
    • C
      xprtrdma: Fix client lock-up after application signal fires · 431af645
      Chuck Lever 提交于
      After a signal, the RPC client aborts synchronous RPCs running on
      behalf of the signaled application.
      
      The server is still executing those RPCs, and will write the results
      back into the client's memory when it's done. By the time the server
      writes the results, that memory is likely being used for other
      purposes. Therefore xprtrdma has to immediately invalidate all
      memory regions used by those aborted RPCs to prevent the server's
      writes from clobbering that re-used memory.
      
      With FMR memory registration, invalidation takes a relatively long
      time. In fact, the invalidation is often still running when the
      server tries to write the results into the memory regions that are
      being invalidated.
      
      This sets up a race between two processes:
      
      1.  After the signal, xprt_rdma_free calls ro_unmap_safe.
      2.  While ro_unmap_safe is still running, the server replies and
          rpcrdma_reply_handler runs, calling ro_unmap_sync.
      
      Both processes invoke ib_unmap_fmr on the same FMR.
      
      The mlx4 driver allows two ib_unmap_fmr calls on the same FMR at
      the same time, but HCAs generally don't tolerate this. Sometimes
      this can result in a system crash.
      
      If the HCA happens to survive, rpcrdma_reply_handler continues. It
      removes the rpc_rqst from rq_list and releases the transport_lock.
      This enables xprt_rdma_free to run in another process, and the
      rpc_rqst is released while rpcrdma_reply_handler is still waiting
      for the ib_unmap_fmr call to finish.
      
      But further down in rpcrdma_reply_handler, the transport_lock is
      taken again, and "rqst" is dereferenced. If "rqst" has already been
      released, this triggers a general protection fault. Since bottom-
      halves are disabled, the system locks up.
      
      Address both issues by reversing the order of the xprt_lookup_rqst
      call and the ro_unmap_sync call. Introduce a separate lookup
      mechanism for rpcrdma_req's to enable calling ro_unmap_sync before
      xprt_lookup_rqst. Now the handler takes the transport_lock once
      and holds it for the XID lookup and RPC completion.
      
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=305
      Fixes: 68791649 ('xprtrdma: Invalidate in the RPC reply ... ')
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      431af645
    • C
      xprtrdma: Rename rpcrdma_req::rl_free · a80d66c9
      Chuck Lever 提交于
      Clean up: I'm about to use the rl_free field for purposes other than
      a free list. So use a more generic name.
      
      This is a refactoring change only.
      
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=305
      Fixes: 68791649 ('xprtrdma: Invalidate in the RPC reply ... ')
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      a80d66c9
    • C
      xprtrdma: Pass only the list of registered MRs to ro_unmap_sync · 451d26e1
      Chuck Lever 提交于
      There are rare cases where an rpcrdma_req can be re-used (via
      rpcrdma_buffer_put) while the RPC reply handler is still running.
      This is due to a signal firing at just the wrong instant.
      
      Since commit 9d6b0409 ("xprtrdma: Place registered MWs on a
      per-req list"), rpcrdma_mws are self-contained; ie., they fully
      describe an MR and scatterlist, and no part of that information is
      stored in struct rpcrdma_req.
      
      As part of closing the above race window, pass only the req's list
      of registered MRs to ro_unmap_sync, rather than the rpcrdma_req
      itself.
      
      Some extra transport header sanity checking is removed. Since the
      client depends on its own recollection of what memory had been
      registered, there doesn't seem to be a way to abuse this change.
      
      And, the check was not terribly effective. If the client had sent
      Read chunks, the "list_empty" test is negative in both of the
      removed cases, which are actually looking for Write or Reply
      chunks.
      
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=305
      Fixes: 68791649 ('xprtrdma: Invalidate in the RPC reply ... ')
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      451d26e1
    • C
      xprtrdma: Pre-mark remotely invalidated MRs · 4b196dc6
      Chuck Lever 提交于
      There are rare cases where an rpcrdma_req and its matched
      rpcrdma_rep can be re-used, via rpcrdma_buffer_put, while the RPC
      reply handler is still using that req. This is typically due to a
      signal firing at just the wrong instant.
      
      As part of closing this race window, avoid using the wrong
      rpcrdma_rep to detect remotely invalidated MRs. Mark MRs as
      invalidated while we are sure the rep is still OK to use.
      
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=305
      Fixes: 68791649 ('xprtrdma: Invalidate in the RPC reply ... ')
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      4b196dc6
    • C
      xprtrdma: On invalidation failure, remove MWs from rl_registered · 04d25b7d
      Chuck Lever 提交于
      Callers assume the ro_unmap_sync and ro_unmap_safe methods empty
      the list of registered MRs. Ensure that all paths through
      fmr_op_unmap_sync() remove MWs from that list.
      
      Fixes: 9d6b0409 ("xprtrdma: Place registered MWs on a ... ")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      04d25b7d
  2. 24 5月, 2017 1 次提交
  3. 26 4月, 2017 25 次提交
  4. 29 3月, 2017 1 次提交
  5. 18 3月, 2017 1 次提交
    • C
      xprtrdma: Squelch kbuild sparse complaint · eed50879
      Chuck Lever 提交于
      New complaint from kbuild for 4.9.y:
      
      net/sunrpc/xprtrdma/verbs.c:489:19: sparse: incompatible types in
          comparison expression (different type sizes)
      
      verbs.c:
      489	max_sge = min(ia->ri_device->attrs.max_sge, RPCRDMA_MAX_SEND_SGES);
      
      I can't reproduce this running sparse here. Likewise, "make W=1
      net/sunrpc/xprtrdma/verbs.o" never indicated any issue.
      
      A little poking suggests that because the range of its values is
      small, gcc can make the actual width of RPCRDMA_MAX_SEND_SGES
      smaller than the width of an unsigned integer.
      
      Fixes: 16f906d6 ("xprtrdma: Reduce required number of send SGEs")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      eed50879
  6. 25 2月, 2017 1 次提交