1. 20 9月, 2016 13 次提交
    • C
      xprtrdma: Move send_wr to struct rpcrdma_req · 90aab602
      Chuck Lever 提交于
      Clean up: Most of the fields in each send_wr do not vary. There is
      no need to initialize them before each ib_post_send(). This removes
      a large-ish data structure from the stack.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      90aab602
    • C
      xprtrdma: Simplify rpcrdma_ep_post_recv() · b157380a
      Chuck Lever 提交于
      Clean up.
      
      Since commit fc664485 ("xprtrdma: Split the completion queue"),
      rpcrdma_ep_post_recv() no longer uses the "ep" argument.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      b157380a
    • C
      xprtrdma: Eliminate "ia" argument in rpcrdma_{alloc, free}_regbuf · 13650c23
      Chuck Lever 提交于
      Clean up. The "ia" argument is no longer used.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      13650c23
    • C
      xprtrdma: Delay DMA mapping Send and Receive buffers · 54cbd6b0
      Chuck Lever 提交于
      Currently, each regbuf is allocated and DMA mapped at the same time.
      This is done during transport creation.
      
      When a device driver is unloaded, every DMA-mapped buffer in use by
      a transport has to be unmapped, and then remapped to the new
      device if the driver is loaded again. Remapping will have to be done
      _after_ the connect worker has set up the new device.
      
      But there's an ordering problem:
      
      call_allocate, which invokes xprt_rdma_allocate which calls
      rpcrdma_alloc_regbuf to allocate Send buffers, happens _before_
      the connect worker can run to set up the new device.
      
      Instead, at transport creation, allocate each buffer, but leave it
      unmapped. Once the RPC carries these buffers into ->send_request, by
      which time a transport connection should have been established,
      check to see that the RPC's buffers have been DMA mapped. If not,
      map them there.
      
      When device driver unplug support is added, it will simply unmap all
      the transport's regbufs, but it doesn't have to deallocate the
      underlying memory.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      54cbd6b0
    • C
      xprtrdma: Replace DMA_BIDIRECTIONAL · 99ef4db3
      Chuck Lever 提交于
      The use of DMA_BIDIRECTIONAL is discouraged by DMA-API.txt.
      Fortunately, xprtrdma now knows which direction I/O is going as
      soon as it allocates each regbuf.
      
      The RPC Call and Reply buffers are no longer the same regbuf. They
      can each be labeled correctly now. The RPC Reply buffer is never
      part of either a Send or Receive WR, but it can be part of Reply
      chunk, which is mapped and registered via ->ro_map . So it is not
      DMA mapped when it is allocated (DMA_NONE), to avoid a double-
      mapping.
      
      Since Receive buffers are no longer DMA_BIDIRECTIONAL and their
      contents are never modified by the host CPU, DMA-API-HOWTO.txt
      suggests that a DMA sync before posting each buffer should be
      unnecessary. (See my_card_interrupt_handler).
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      99ef4db3
    • C
      xprtrdma: Use smaller buffers for RPC-over-RDMA headers · 08cf2efd
      Chuck Lever 提交于
      Commit 94931746 ("xprtrdma: Limit number of RDMA segments in
      RPC-over-RDMA headers") capped the number of chunks that may appear
      in RPC-over-RDMA headers. The maximum header size can be estimated
      and fixed to avoid allocating buffer space that is never used.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      08cf2efd
    • C
      xprtrdma: Initialize separate RPC call and reply buffers · 9c40c49f
      Chuck Lever 提交于
      RPC-over-RDMA needs to separate its RPC call and reply buffers.
      
       o When an RPC Call is sent, rq_snd_buf is DMA mapped for an RDMA
         Send operation using DMA_TO_DEVICE
      
       o If the client expects a large RPC reply, it DMA maps rq_rcv_buf
         as part of a Reply chunk using DMA_FROM_DEVICE
      
      The two mappings are for data movement in opposite directions.
      
      DMA-API.txt suggests that if these mappings share a DMA cacheline,
      bad things can happen. This could occur in the final bytes of
      rq_snd_buf and the first bytes of rq_rcv_buf if the two buffers
      happen to share a DMA cacheline.
      
      On x86_64 the cacheline size is typically 8 bytes, and RPC call
      messages are usually much smaller than the send buffer, so this
      hasn't been a noticeable problem. But the DMA cacheline size can be
      larger on other platforms.
      
      Also, often rq_rcv_buf starts most of the way into a page, thus
      an additional RDMA segment is needed to map and register the end of
      that buffer. Try to avoid that scenario to reduce the cost of
      registering and invalidating Reply chunks.
      
      Instead of carrying a single regbuf that covers both rq_snd_buf and
      rq_rcv_buf, each struct rpcrdma_req now carries one regbuf for
      rq_snd_buf and one regbuf for rq_rcv_buf.
      
      Some incidental changes worth noting:
      
      - To clear out some spaghetti, refactor xprt_rdma_allocate.
      - The value stored in rg_size is the same as the value stored in
        the iov.length field, so eliminate rg_size
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      9c40c49f
    • C
      SUNRPC: Add a transport-specific private field in rpc_rqst · 5a6d1db4
      Chuck Lever 提交于
      Currently there's a hidden and indirect mechanism for finding the
      rpcrdma_req that goes with an rpc_rqst. It depends on getting from
      the rq_buffer pointer in struct rpc_rqst to the struct
      rpcrdma_regbuf that controls that buffer, and then to the struct
      rpcrdma_req it goes with.
      
      This was done back in the day to avoid the need to add a per-rqst
      pointer or to alter the buf_free API when support for RPC-over-RDMA
      was introduced.
      
      I'm about to change the way regbuf's work to support larger inline
      thresholds. Now is a good time to replace this indirect mechanism
      with something that is more straightforward. I guess this should be
      considered a clean up.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      5a6d1db4
    • C
      SUNRPC: Separate buffer pointers for RPC Call and Reply messages · 68778945
      Chuck Lever 提交于
      For xprtrdma, the RPC Call and Reply buffers are involved in real
      I/O operations.
      
      To start with, the DMA direction of the I/O for a Call is opposite
      that of a Reply.
      
      In the current arrangement, the Reply buffer address is on a
      four-byte alignment just past the call buffer. Would be friendlier
      on some platforms if that was at a DMA cache alignment instead.
      
      Because the current arrangement allocates a single memory region
      which contains both buffers, the RPC Reply buffer often contains a
      page boundary in it when the Call buffer is large enough (which is
      frequent).
      
      It would be a little nicer for setting up DMA operations (and
      possible registration of the Reply buffer) if the two buffers were
      separated, well-aligned, and contained as few page boundaries as
      possible.
      
      Now, I could just pad out the single memory region used for the pair
      of buffers. But frequently that would mean a lot of unused space to
      ensure the Reply buffer did not have a page boundary.
      
      Add a separate pointer to rpc_rqst that points right to the RPC
      Reply buffer. This makes no difference to xprtsock, but it will help
      xprtrdma in subsequent patches.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      68778945
    • C
      SUNRPC: Generalize the RPC buffer release API · 3435c74a
      Chuck Lever 提交于
      xprtrdma needs to allocate the Call and Reply buffers separately.
      TBH, the reliance on using a single buffer for the pair of XDR
      buffers is transport implementation-specific.
      
      Instead of passing just the rq_buffer into the buf_free method, pass
      the task structure and let buf_free take care of freeing both
      XDR buffers at once.
      
      There's a micro-optimization here. In the common case, both
      xprt_release and the transport's buf_free method were checking if
      rq_buffer was NULL. Now the check is done only once per RPC.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      3435c74a
    • C
      SUNRPC: Generalize the RPC buffer allocation API · 5fe6eaa1
      Chuck Lever 提交于
      xprtrdma needs to allocate the Call and Reply buffers separately.
      TBH, the reliance on using a single buffer for the pair of XDR
      buffers is transport implementation-specific.
      
      Transports that want to allocate separate Call and Reply buffers
      will ignore the "size" argument anyway.  Don't bother passing it.
      
      The buf_alloc method can't return two pointers. Instead, make the
      method's return value an error code, and set the rq_buffer pointer
      in the method itself.
      
      This gives call_allocate an opportunity to terminate an RPC instead
      of looping forever when a permanent problem occurs. If a request is
      just bogus, or the transport is in a state where it can't allocate
      resources for any request, there needs to be a way to kill the RPC
      right there and not loop.
      
      This immediately fixes a rare problem in the backchannel send path,
      which loops if the server happens to send a CB request whose
      call+reply size is larger than a page (which it shouldn't do yet).
      
      One more issue: looks like xprt_inject_disconnect was incorrectly
      placed in the failure path in call_allocate. It needs to be in the
      success path, as it is for other call-sites.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      5fe6eaa1
    • C
      SUNRPC: Refactor rpc_xdr_buf_init() · b9c5bc03
      Chuck Lever 提交于
      Clean up: there is some XDR initialization logic that is common
      to the forward channel and backchannel. Move it to an XDR header
      so it can be shared.
      
      rpc_rqst::rq_buffer points to a buffer containing big-endian data.
      Update its annotation as part of the clean up.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      b9c5bc03
    • C
      xprtrdma: Eliminate INLINE_THRESHOLD macros · eb342e9a
      Chuck Lever 提交于
      Clean up: r_xprt is already available everywhere these macros are
      invoked, so just dereference that directly.
      
      RPCRDMA_INLINE_PAD_VALUE is no longer used, so it can simply be
      removed.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      eb342e9a
  2. 07 9月, 2016 2 次提交
    • C
      xprtrdma: Fix receive buffer accounting · 05c97466
      Chuck Lever 提交于
      An RPC can terminate before its reply arrives, if a credential
      problem or a soft timeout occurs. After this happens, xprtrdma
      reports it is out of Receive buffers.
      
      A Receive buffer is posted before each RPC is sent, and returned to
      the buffer pool when a reply is received. If no reply is received
      for an RPC, that Receive buffer remains posted. But xprtrdma tries
      to post another when the next RPC is sent.
      
      If this happens a few dozen times, there are no receive buffers left
      to be posted at send time. I don't see a way for a transport
      connection to recover at that point, and it will spit warnings and
      unnecessarily delay RPCs on occasion for its remaining lifetime.
      
      Commit 1e465fd4 ("xprtrdma: Replace send and receive arrays")
      removed a little bit of logic to detect this case and not provide
      a Receive buffer so no more buffers are posted, and then transport
      operation continues correctly. We didn't understand what that logic
      did, and it wasn't commented, so it was removed as part of the
      overhaul to support backchannel requests.
      
      Restore it, but be wary of the need to keep extra Receives posted
      to deal with backchannel requests.
      
      Fixes: 1e465fd4 ("xprtrdma: Replace send and receive arrays")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Reviewed-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      05c97466
    • C
      xprtrdma: Revert 3d4cf35b ("xprtrdma: Reply buffer exhaustion...") · 78d506e1
      Chuck Lever 提交于
      Receive buffer exhaustion, if it were to actually occur, would be
      catastrophic. However, when there are no reply buffers to post, that
      means all of them have already been posted and are waiting for
      incoming replies. By design, there can never be more RPCs in flight
      than there are available receive buffers.
      
      A receive buffer can be left posted after an RPC exits without a
      received reply; say, due to a credential problem or a soft timeout.
      This does not result in fewer posted receive buffers than there are
      pending RPCs, and there is already logic in xprtrdma to deal
      appropriately with this case.
      
      It also looks like the "+ 2" that was removed was accidentally
      accommodating the number of extra receive buffers needed for
      receiving backchannel requests. That will need to be addressed by
      another patch.
      
      Fixes: 3d4cf35b ("xprtrdma: Reply buffer exhaustion can be...")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Reviewed-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      78d506e1
  3. 20 7月, 2016 1 次提交
  4. 12 7月, 2016 22 次提交
  5. 18 5月, 2016 2 次提交