1. 14 Jul, 2020 · 2 commits
  2. 22 Jun, 2020 · 1 commit
    • xprtrdma: Fix handling of RDMA_ERROR replies · 7b2182ec
      Authored by Chuck Lever
      The RPC client currently doesn't handle ERR_CHUNK replies correctly.
      rpcrdma_complete_rqst() incorrectly passes a negative number to
      xprt_complete_rqst() as the number of bytes copied. Instead, set
      task->tk_status to the error value, and return zero bytes copied.
      
      In these cases, return -EIO rather than -EREMOTEIO. The RPC client's
      finite state machine doesn't know what to do with -EREMOTEIO.
      
      Additional clean ups:
      - Don't double-count RDMA_ERROR replies
      - Remove a stale comment
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: <stable@kernel.vger.org>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
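      A minimal sketch of the fix pattern this commit describes; the function
      name and the surrounding decode logic are illustrative, while
      xprt_complete_rqst(), xprt_unpin_rqst(), tk_status, and queue_lock are
      existing SUNRPC interfaces:

      #include <linux/sunrpc/xprt.h>

      /* Sketch: on ERR_CHUNK or any other decode failure, record the
       * error in tk_status and report zero bytes copied, instead of
       * handing a negative length to xprt_complete_rqst().
       */
      static void sketch_complete_rqst(struct rpc_rqst *rqst, int status)
      {
              struct rpc_xprt *xprt = rqst->rq_xprt;

              if (status < 0) {
                      /* use -EIO here; the RPC FSM has no handling for -EREMOTEIO */
                      rqst->rq_task->tk_status = status;
                      status = 0;             /* zero bytes copied */
              }

              spin_lock(&xprt->queue_lock);
              xprt_complete_rqst(rqst->rq_task, status);
              xprt_unpin_rqst(rqst);
              spin_unlock(&xprt->queue_lock);
      }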
  3. 12 Jun, 2020 · 1 commit
    • SUNRPC: receive buffer size estimation values almost never change · 53bc19f1
      Authored by Chuck Lever
      Avoid unnecessary cache sloshing by placing the buffer size
      estimation update logic behind an atomic bit flag.
      
      The size of GSS information included in each wrapped Reply does
      not change during the lifetime of a GSS context. Therefore, the
      au_rslack and au_ralign fields need to be updated only once after
      establishing a fresh GSS credential.
      
      Thus a slack size update must occur after a cred is created,
      duplicated, or renewed, or after it expires. I'm not sure I have
      this exactly right, so a trace point is introduced to track
      updates to these variables and enable troubleshooting if I
      missed a spot.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
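      A sketch of the atomic-bit guard the patch description refers to; the
      bit name and helper below are hypothetical, while au_flags, au_rslack,
      and au_ralign are real fields of struct rpc_auth:

      #include <linux/bitops.h>
      #include <linux/sunrpc/auth.h>

      #define SKETCH_UPDATE_SLACK     3       /* hypothetical bit number in au_flags */

      /* Set SKETCH_UPDATE_SLACK when a fresh GSS cred is established; the
       * unwrap path then writes au_rslack/au_ralign exactly once per
       * context, so steady-state replies never dirty this cache line.
       */
      static void sketch_update_slack(struct rpc_auth *auth,
                                      unsigned int rslack, unsigned int ralign)
      {
              if (!test_and_clear_bit(SKETCH_UPDATE_SLACK, &auth->au_flags))
                      return;                 /* common case: nothing to do */

              auth->au_rslack = rslack;
              auth->au_ralign = ralign;
      }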
  4. 20 Apr, 2020 · 1 commit
  5. 27 Mar, 2020 · 2 commits
    • xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt · e28ce900
      Authored by Chuck Lever
      Change the rpcrdma_xprt_disconnect() function so that it no longer
      waits for the DISCONNECTED event.  This prevents blocking if the
      remote is unresponsive.
      
      In rpcrdma_xprt_disconnect(), the transport's rpcrdma_ep is
      detached. Upon return from rpcrdma_xprt_disconnect(), the transport
      (r_xprt) is ready immediately for a new connection.
      
      The RDMA_CM_DEVICE_REMOVAL and RDMA_CM_DISCONNECTED events are now
      handled almost identically.
      
      However, because the lifetimes of rpcrdma_xprt structures and
      rpcrdma_ep structures are now independent, creating an rpcrdma_ep
      needs to take a module ref count. The ep now owns most of the
      hardware resources for a transport.
      
      Also, a kref is needed to ensure that rpcrdma_ep sticks around
      long enough for the cm_event_handler to finish.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
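      A sketch of the lifetime arrangement described above (kref plus module
      reference); the structure and function names are illustrative, not the
      actual xprtrdma symbols:

      #include <linux/kernel.h>
      #include <linux/kref.h>
      #include <linux/module.h>
      #include <linux/slab.h>

      /* The ep pins the module at creation and is freed only when the last
       * kref is dropped, so a CM event handler that runs after the ep has
       * been detached from the transport still sees a valid object.
       */
      struct sketch_ep {
              struct kref     re_kref;
              /* QP, PD, CM id, and other hardware resources live here */
      };

      static void sketch_ep_destroy(struct kref *kref)
      {
              struct sketch_ep *ep = container_of(kref, struct sketch_ep, re_kref);

              kfree(ep);
              module_put(THIS_MODULE);        /* balances try_module_get() below */
      }

      static struct sketch_ep *sketch_ep_create(void)
      {
              struct sketch_ep *ep;

              if (!try_module_get(THIS_MODULE))
                      return NULL;
              ep = kzalloc(sizeof(*ep), GFP_KERNEL);
              if (!ep) {
                      module_put(THIS_MODULE);
                      return NULL;
              }
              kref_init(&ep->re_kref);
              return ep;
      }

      /* The transport and the CM event handler each drop their own
       * reference when they are done with the ep.
       */
      static void sketch_ep_put(struct sketch_ep *ep)
      {
              kref_put(&ep->re_kref, sketch_ep_destroy);
      }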
    • xprtrdma: Merge struct rpcrdma_ia into struct rpcrdma_ep · 93aa8e0a
      Authored by Chuck Lever
      I eventually want to allocate rpcrdma_ep separately from struct
      rpcrdma_xprt so that on occasion there can be more than one ep per
      xprt.
      
      The new struct rpcrdma_ep will contain all the fields currently in
      rpcrdma_ia and in rpcrdma_ep: all of the device and CM settings
      for the connection, plus the per-connection settings negotiated
      with the remote.
      
      Take this opportunity to rename the existing ep fields from rep_* to
      re_* to disambiguate these from struct rpcrdma_rep.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
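      A rough sketch of the consolidated structure this commit aims at; the
      field list is illustrative and far from complete:

      #include <rdma/ib_verbs.h>
      #include <rdma/rdma_cm.h>

      /* Device and CM state formerly held in rpcrdma_ia joins the
       * connection state in one ep structure; the re_ prefix keeps these
       * fields visually distinct from struct rpcrdma_rep (a Receive buffer).
       */
      struct sketch_rpcrdma_ep {
              /* formerly in rpcrdma_ia */
              struct rdma_cm_id       *re_id;
              struct ib_pd            *re_pd;

              /* formerly rep_* fields in rpcrdma_ep */
              struct ib_qp_init_attr  re_attr;
              struct rdma_conn_param  re_remote_cma;
              unsigned int            re_inline_send;  /* negotiated with peer */
              unsigned int            re_inline_recv;
              int                     re_connect_status;
      };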
  6. 17 Mar, 2020 · 1 commit
  7. 15 Jan, 2020 · 4 commits
  8. 24 Oct, 2019 · 9 commits
  9. 27 Aug, 2019 · 2 commits
  10. 22 Aug, 2019 · 1 commit
  11. 21 Aug, 2019 · 3 commits
  12. 09 Jul, 2019 · 8 commits
    • xprtrdma: Refactor chunk encoding · 6a6c6def
      Authored by Chuck Lever
      Clean up.
      
      Move the "not present" case into the individual chunk encoders. This
      improves code organization and readability.
      
      The reason for the original organization was to optimize for the
      case where there are no chunks. That optimization turned out to
      be inconsequential, so let's err on the side of code readability.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
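      A sketch of the shape of this refactor: each chunk encoder now emits
      the one-word XDR "not present" discriminator itself when it has
      nothing to encode. The function name and its arguments are
      illustrative; xdr_reserve_space() and xdr_zero are real XDR helpers.

      #include <linux/sunrpc/xdr.h>

      static int sketch_encode_read_list(struct xdr_stream *xdr, bool present)
      {
              __be32 *p;

              if (!present) {
                      /* chunk list absent: encode a single xdr_zero item */
                      p = xdr_reserve_space(xdr, sizeof(*p));
                      if (unlikely(!p))
                              return -EMSGSIZE;
                      *p = xdr_zero;
                      return 0;
              }

              /* ... otherwise encode one read segment per registered MR ... */
              return 0;
      }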
    • xprtrdma: Wake RPCs directly in rpcrdma_wc_send path · 0ab11523
      Authored by Chuck Lever
      Eliminate a context switch in the path that handles RPC wake-ups
      when a Receive completion has to wait for a Send completion.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Reduce context switching due to Local Invalidation · d8099fed
      Authored by Chuck Lever
      Since commit ba69cd12 ("xprtrdma: Remove support for FMR memory
      registration"), FRWR is the only supported memory registration mode.
      
      We can take advantage of the asynchronous nature of FRWR's LOCAL_INV
      Work Requests to get rid of the completion wait by having the
      LOCAL_INV completion handler take care of DMA unmapping MRs and
      waking the upper layer RPC waiter.
      
      This eliminates two context switches when local invalidation is
      necessary. As a side benefit, we will no longer need the per-xprt
      deferred completion work queue.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
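      A sketch of the completion-handler shape implied by the commit. Only
      the core verbs interfaces (struct ib_cqe, wr_cqe, ib_dma_unmap_sg())
      are real; the structure layout and the wake-up step are illustrative.

      #include <rdma/ib_verbs.h>

      struct sketch_frwr {
              struct ib_cqe           fr_cqe;   /* .done = sketch_wc_localinv_done */
              struct scatterlist      *fr_sg;
              int                     fr_nents;
              enum dma_data_direction fr_dir;
      };

      /* Runs from the CQ's process-context poller, so the DMA unmap and
       * the RPC wake-up can happen right here, with no deferred work item
       * and no completion for the upper layer to sleep on.
       */
      static void sketch_wc_localinv_done(struct ib_cq *cq, struct ib_wc *wc)
      {
              struct sketch_frwr *frwr =
                      container_of(wc->wr_cqe, struct sketch_frwr, fr_cqe);

              ib_dma_unmap_sg(cq->device, frwr->fr_sg, frwr->fr_nents, frwr->fr_dir);

              /* ...then wake the RPC that was parked on this invalidation,
               * e.g. via xprt_complete_rqst() under the transport's queue_lock. */
      }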
    • xprtrdma: Add mechanism to place MRs back on the free list · 40088f0e
      Authored by Chuck Lever
      When a marshal operation fails, any MRs that were already set up for
      that request are recycled. Recycling releases MRs and creates new
      ones, which is expensive.
      
      Since commit f2877623 ("xprtrdma: Chain Send to FastReg WRs")
      was merged, recycling FRWRs is unnecessary. This is because before
      that commit, frwr_map had already posted FAST_REG Work Requests,
      so ownership of the MRs had already been passed to the NIC and thus
      dealing with them had to be delayed until they completed.
      
      Since that commit, however, FAST_REG WRs are posted at the same time
      as the Send WR. This means that if marshaling fails, we are certain
      the MRs are safe to simply unmap and place back on the free list
      because neither the Send nor the FAST_REG WRs have been posted yet.
      The kernel still has ownership of the MRs at this point.
      
      This reduces the total number of MRs that the xprt has to create
      under heavy workloads and makes the marshaling logic less brittle.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
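      A sketch of the "unmap and put back" path this commit adds. The list
      helpers are real kernel API; the structure, locking, and function
      names are illustrative.

      #include <linux/list.h>
      #include <linux/spinlock.h>

      struct sketch_mr {
              struct list_head        mr_list;
              /* ib_mr, scatterlist, mapping state, ... */
      };

      /* Marshaling failed before any WR was posted, so the kernel still
       * owns these MRs: unmap each one and return it to the free list
       * instead of destroying and re-creating it.
       */
      static void sketch_req_reset(struct list_head *registered,
                                   struct list_head *free_mrs, spinlock_t *lock)
      {
              struct sketch_mr *mr;

              while ((mr = list_first_entry_or_null(registered,
                                                    struct sketch_mr, mr_list))) {
                      list_del(&mr->mr_list);
                      /* DMA unmap the MR's pages here */
                      spin_lock(lock);
                      list_add_tail(&mr->mr_list, free_mrs);
                      spin_unlock(lock);
              }
      }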
    • xprtrdma: Remove fr_state · 84756894
      Authored by Chuck Lever
      Now that both the Send and Receive completions are handled in
      process context, it is safe to DMA unmap and return MRs to the
      free or recycle lists directly in the completion handlers.
      
      Doing this means a VALID or FLUSHED MR can no longer appear on an
      xprt's MR free list, so rpcrdma_frwr no longer needs to track each
      MR's registration state and fr_state can be removed.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Remove the RPCRDMA_REQ_F_PENDING flag · 5809ea4f
      Authored by Chuck Lever
      Commit 9590d083 ("xprtrdma: Use xprt_pin_rqst in
      rpcrdma_reply_handler") pins incoming RPC/RDMA replies so they
      can be left in the pending requests queue while they are being
      processed without introducing a race between ->buf_free and the
      transport's reply handler. Therefore RPCRDMA_REQ_F_PENDING is no
      longer necessary.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
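      The pin/unpin pattern that makes the flag redundant, sketched with
      real SUNRPC calls (xprt_lookup_rqst, xprt_pin_rqst, xprt_unpin_rqst,
      xprt_complete_rqst); the reply-processing step in the middle is
      abbreviated and the function name is illustrative.

      #include <linux/sunrpc/xprt.h>

      static void sketch_reply_handler(struct rpc_xprt *xprt, __be32 xid, int copied)
      {
              struct rpc_rqst *rqst;

              spin_lock(&xprt->queue_lock);
              rqst = xprt_lookup_rqst(xprt, xid);
              if (!rqst) {
                      spin_unlock(&xprt->queue_lock);
                      return;
              }
              xprt_pin_rqst(rqst);            /* rqst can't be released by ->buf_free */
              spin_unlock(&xprt->queue_lock);

              /* ... parse the RPC/RDMA header and copy the reply data ... */

              spin_lock(&xprt->queue_lock);
              xprt_complete_rqst(rqst->rq_task, copied);
              xprt_unpin_rqst(rqst);
              spin_unlock(&xprt->queue_lock);
      }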
    • xprtrdma: Fix occasional transport deadlock · 05eb06d8
      Authored by Chuck Lever
      Under high I/O workloads, I've noticed that an RPC/RDMA transport
      occasionally deadlocks (IOPS goes to zero, and doesn't recover).
      Diagnosis shows that the sendctx queue is empty, but when sendctxs
      are returned to the queue, the xprt_write_space wake-up never
      occurs. The wake-up logic in rpcrdma_sendctx_put_locked is racy.
      
      I noticed that both EMPTY_SCQ and XPRT_WRITE_SPACE are implemented
      via an atomic bit. Just one of those is sufficient. Removing
      EMPTY_SCQ in favor of the generic bit mechanism makes the deadlock
      un-reproducible.
      
      Without EMPTY_SCQ, rpcrdma_buffer::rb_flags is no longer used and
      is therefore removed.
      
      Unfortunately this patch does not apply cleanly to stable. If
      needed, someone will have to port it and test it.
      
      Fixes: 2fad6592 ("xprtrdma: Wait on empty sendctx queue")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
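      A simplified sketch of the generic write-space mechanism that replaces
      the private EMPTY_SCQ bit. xprt_wait_for_buffer_space() and
      xprt_write_space() are SUNRPC entry points; the two callers shown are
      illustrative stand-ins for the sendctx paths.

      #include <linux/sunrpc/xprt.h>

      /* Send side: no sendctx is available, so mark the transport as
       * waiting for write space (this sets the XPRT_WRITE_SPACE bit).
       */
      static int sketch_no_sendctx(struct rpc_xprt *xprt)
      {
              xprt_wait_for_buffer_space(xprt);
              return -ENOBUFS;
      }

      /* Completion side: a sendctx was just returned to the queue, so
       * clear the bit and wake any sender blocked on write space.
       */
      static void sketch_sendctx_put(struct rpc_xprt *xprt)
      {
              xprt_write_space(xprt);
      }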
    • xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req · 1310051c
      Authored by Chuck Lever
      This is a latent bug. xdr_stream_pos works by subtracting
      xdr_stream::nwords from xdr_buf::len. But xdr_stream::nwords is not
      initialized by xdr_init_encode().
      
      It works today only because all fields in rpcrdma_req::rl_stream
      are initialized to zero by rpcrdma_req_create, making the
      subtraction in xdr_stream_pos always a no-op.
      
      I found this issue via code inspection. It was introduced by commit
      39f4cd9e ("xprtrdma: Harden chunk list encoding against send
      buffer overflow"), but the code has changed enough since then that
      this fix can't be automatically applied to stable.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
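      An illustration of the point above, assuming the encode buffer starts
      out empty (as the transport-header buffer does): for an encode stream
      the reliable length source is the stream's xdr_buf, which
      xdr_reserve_space() keeps up to date, not xdr_stream_pos(). The helper
      name is illustrative.

      #include <linux/sunrpc/xdr.h>

      /* xdr_stream_pos() derives its result from xdr->nwords, which
       * xdr_init_encode() never initializes. The bytes encoded so far are
       * already tracked in xdr->buf->len, bumped on every reservation.
       */
      static unsigned int sketch_encoded_len(const struct xdr_stream *xdr)
      {
              return xdr->buf->len;
      }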
  13. 07 Jul, 2019 · 1 commit
  14. 26 Apr, 2019 · 4 commits