1. 03 Mar, 2016 1 commit
  2. 08 Oct, 2015 1 commit
    • C
      IB: split struct ib_send_wr · e622f2f4
      Christoph Hellwig committed
      This patch splits up struct ib_send_wr so that all non-trivial verbs
      use their own structure which embeds struct ib_send_wr.  This dramatically
      shrinks the size of a WR for most common operations:
      
      sizeof(struct ib_send_wr) (old):	96
      
      sizeof(struct ib_send_wr):		48
      sizeof(struct ib_rdma_wr):		64
      sizeof(struct ib_atomic_wr):		96
      sizeof(struct ib_ud_wr):		88
      sizeof(struct ib_fast_reg_wr):		88
      sizeof(struct ib_bind_mw_wr):		96
      sizeof(struct ib_sig_handover_wr):	80
      
      And with Sagi's pending MR rework the fast registration WR will also be
      down to a reasonable size:
      
      sizeof(struct ib_fastreg_wr):		64
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt]
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc]
      Tested-by: Haggai Eran <haggaie@mellanox.com>
      Tested-by: Sagi Grimberg <sagig@mellanox.com>
      Tested-by: Steve Wise <swise@opengridcomputing.com>
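The embedding described in this commit message can be sketched in plain C as follows. This is a minimal illustration, not the kernel's definitions: every `demo_` name, the field selection, and the helper are assumptions made for the example. It shows a small base work request embedded as the first member of an operation-specific structure, with a `container_of`-style helper recovering the outer structure from a pointer to the base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified base work request (not the kernel struct). */
struct demo_send_wr {
    struct demo_send_wr *next;
    uint64_t wr_id;
    int opcode;
};

/* Hypothetical RDMA-specific request that embeds the base request. */
struct demo_rdma_wr {
    struct demo_send_wr wr;   /* embedded base, handed to generic code */
    uint64_t remote_addr;
    uint32_t rkey;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Convert a base pointer back to the RDMA-specific request. */
static inline struct demo_rdma_wr *demo_rdma_wr(struct demo_send_wr *wr)
{
    assert(wr != NULL);
    return demo_container_of(wr, struct demo_rdma_wr, wr);
}

/* Generic code receives only the base pointer and looks up the extras. */
static uint64_t demo_remote_addr_of(struct demo_send_wr *wr)
{
    return demo_rdma_wr(wr)->remote_addr;
}
```

With this layout, operations that need no extra fields pay only for the base structure, which is the size reduction the commit message reports.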
  3. 06 Oct, 2015 3 commits
  4. 01 Oct, 2015 1 commit
  5. 31 Aug, 2015 1 commit
  6. 26 Aug, 2015 2 commits
  7. 22 Jun, 2015 1 commit
  8. 19 May, 2015 1 commit
  9. 24 Nov, 2014 1 commit
  10. 20 Nov, 2012 1 commit
  11. 30 May, 2012 1 commit
  12. 07 Jun, 2011 1 commit
  13. 01 Feb, 2011 1 commit
    • T
      rds/ib: use system_wq instead of rds_ib_fmr_wq · c534a107
      Tejun Heo committed
      With cmwq, there's no reason to use dedicated rds_ib_fmr_wq - it's not
      in the memory reclaim path and the maximum number of concurrent work
      items is bound by the number of devices.  Drop it and use system_wq
      instead.  This makes rds_ib_fmr_init/exit() no-ops, so both are removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Andy Grover <andy.grover@oracle.com>
  14. 21 Oct, 2010 1 commit
  15. 09 Sep, 2010 22 commits
    • Z
      RDS/IB: print string constants in more places · 59f740a6
      Zach Brown committed
      This prints the constant identifier for work completion status and rdma
      cm event types, like we already do for IB event types.
      
      A core string array helper is added that each string type uses.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • Z
      RDS/IB: protect the list of IB devices · ea819867
      Zach Brown committed
      The RDS IB device list wasn't protected by any locking.  Traversal in
      both the get_mr and FMR flushing paths could race with addition and
      removal.
      
      List manipulation is done with RCU primitives and is protected by the
      write side of a rwsem.  The list traversal in the get_mr fast path is
      protected by an RCU read critical section.  The FMR list traversal is
      more problematic because it can block while traversing the list.  We
      protect this with the read side of the rwsem.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • Z
      RDS/IB: track signaled sends · f046011c
      Zach Brown committed
      We're seeing bugs today where IB connection shutdown clears the send
      ring while the tasklet is processing completed sends.  Implementation
      details cause this to dereference a null pointer.  Shutdown needs to
      wait for send completion to stop before tearing down the connection.  We
      can't simply wait for the ring to empty because it may contain
      unsignaled sends that will never be processed.
      
      This patch tracks the number of signaled sends that we've posted and
      waits for them to complete.  It also makes sure that the tasklet has
      finished executing.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
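The counting idea in this commit message can be sketched in userspace C11 as follows. It is an assumption-laden illustration rather than the RDS code: the `demo_` names are hypothetical, and a real shutdown path would sleep on a wait queue and also quiesce the tasklet instead of polling a predicate. Only signaled sends are counted, because unsignaled sends never produce a completion to wait for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-connection state: number of signaled sends in flight. */
struct demo_conn {
    atomic_int signaled_sends;
};

/* Post path: count the send only if it will generate a completion. */
static void demo_post_send(struct demo_conn *c, bool signaled)
{
    if (signaled)
        atomic_fetch_add(&c->signaled_sends, 1);
}

/* Completion path: one signaled send has finished. */
static void demo_send_completed(struct demo_conn *c)
{
    int old = atomic_fetch_sub(&c->signaled_sends, 1);
    assert(old > 0);   /* a completion without a matching post is a bug */
    (void)old;
}

/* Shutdown may tear the ring down only once no completions are pending. */
static bool demo_shutdown_may_proceed(struct demo_conn *c)
{
    return atomic_load(&c->signaled_sends) == 0;
}
```

Waiting on this counter, rather than on the ring becoming empty, avoids blocking forever on unsignaled entries.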
    • Z
      RDS: remove __init and __exit annotation · ef87b7ea
      Zach Brown committed
      The trivial amount of memory saved isn't worth the cost of dealing with section
      mismatches.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • Z
      RDS/IB: create a work queue for FMR flushing · 515e079d
      Zach Brown committed
      This patch moves the FMR flushing work into its own multi-threaded work queue.
      This is to maintain performance in preparation for returning the main krdsd
      work queue back to a single-threaded work queue to avoid deep-rooted
      concurrency bugs.
      
      This is also good because it further separates FMRs, which might be removed
      some day, from the rest of the code base.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • Z
      RDS/IB: destroy connections on rmmod · 8aeb1ba6
      Zach Brown committed
      IB connections were not being destroyed during rmmod.
      
      First, the IB device removal callback was recently changed to disconnect
      connections that used the device being removed rather than destroying them.  So
      connections with devices during rmmod were not being destroyed.
      
      Second, rds_ib_destroy_nodev_conns() was being called before connections were
      disassociated from devices.  It would almost never find connections in the
      nodev list.
      
      We first get rid of rds_ib_destroy_conns(), which is no longer called, and
      refactor the existing caller into the main body of the function and get rid of
      the list and lock wrappers.
      
      Then we call rds_ib_destroy_nodev_conns() *after* ib_unregister_client() has
      removed the IB device from all the conns and put the conns on the nodev list.
      
      The result is that IB connections are destroyed by rmmod.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • A
      RDS/IB: Make ib_recv_refill return void · b6fb0df1
      Andy Grover committed
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • C
      rds: more FMRs are faster · eabb7322
      Chris Mason committed
      When we add more FMRs, we flush them less often and so we go faster.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • C
      RDS/IB: Add caching of frags and incs · 33244125
      Chris Mason committed
      This patch is based heavily on an initial patch by Chris Mason.
      Instead of freeing slab memory and pages, it keeps them, and
      funnels them back to be reused.
      
      The lock minimization strategy uses xchg and cmpxchg atomic ops
      for manipulation of pointers to list heads. We anchor the lists with a
      pointer to a list_head struct instead of a static list_head struct.
      We just have to carefully use the existing primitives with
      the difference between a pointer and a static head struct.
      
      For example, 'list_empty()' means that our anchor pointer points to a list with
      a single item instead of meaning that our static head element doesn't point to
      any list items.
      
      Original patch by Chris, with significant mods and fixes by Andy and Zach.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
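The pointer-anchored list described in this commit message can be illustrated with a small userspace sketch. It is not the RDS implementation: the `demo_` names are hypothetical, the list helpers are simplified stand-ins for the kernel's `list_head` primitives, and C11 `atomic_exchange` and `atomic_compare_exchange_strong` stand in for `xchg` and `cmpxchg`. The point it demonstrates is that when the anchor is a pointer to a node, a NULL anchor means "no items" while `list_empty()` on the anchor means "exactly one item".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified circular doubly linked list node (stand-in for list_head). */
struct demo_list_head {
    struct demo_list_head *next, *prev;
};

static void demo_init_list_head(struct demo_list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* True when the node links only to itself. */
static bool demo_list_empty(const struct demo_list_head *h)
{
    return h->next == h;
}

/* Link node n into the ring just before node h. */
static void demo_list_add_tail(struct demo_list_head *n,
                               struct demo_list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Anchor: pointer to any node of a ring of items, or NULL for no items. */
struct demo_cache {
    _Atomic(struct demo_list_head *) ready;
};

/* Take the whole ring in one atomic step, leaving the slot empty. */
static struct demo_list_head *demo_cache_take_all(struct demo_cache *c)
{
    return atomic_exchange(&c->ready, (struct demo_list_head *)NULL);
}

/* Publish a privately built ring only if the slot is currently empty. */
static bool demo_cache_publish(struct demo_cache *c,
                               struct demo_list_head *ring)
{
    struct demo_list_head *expected = NULL;

    assert(ring != NULL);
    return atomic_compare_exchange_strong(&c->ready, &expected, ring);
}

/* Count items in a pointer-anchored ring; every node is an item. */
static size_t demo_ring_len(const struct demo_list_head *anchor)
{
    const struct demo_list_head *p = anchor;
    size_t n = 0;

    if (!anchor)
        return 0;
    do {
        n++;
        p = p->next;
    } while (p != anchor);
    return n;
}
```

Because there is no dedicated head element, a ring can be handed over by swapping a single pointer, which is what keeps the hand-off free of a list lock in this sketch.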
    • A
      RDS: Use page_remainder_alloc() for recv bufs · 0b088e00
      Andy Grover committed
      Instead of splitting up a page into RDS_FRAG_SIZE chunks
      ourselves, ask rds_page_remainder_alloc() to do it. While it
      is possible PAGE_SIZE > FRAG_SIZE, on x86en it isn't, so having
      duplicate "carve up a page into buffers" code seems excessive.
      
      The other modification this spawns is the use of a single
      struct scatterlist in rds_page_frag instead of a bare page ptr.
      This causes verbosity to increase in some places, and decrease
      in others.
      
      Finally, I decided to unify the lifetimes and alloc/free of
      rds_page_frag and its page. This is a nice simplification in itself,
      but will be extra-nice once we come to adding cmason's recycling
      patch.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • Z
      RDS/IB: add refcount tracking to struct rds_ib_device · 3e0249f9
      Zach Brown committed
      The RDS IB client .remove callback used to free the rds_ibdev for the given
      device unconditionally.  This could race other users of the struct.  This patch
      adds refcounting so that we only free the rds_ibdev once all of its users are
      done.
      
      Many rds_ibdev users are tied to connections.  We give the connection a
      reference and change these users to reference the device in the connection
      instead of looking it up in the IB client data.  The only user of the IB client
      data remaining is the first lookup of the device as connections are built up.
      
      Incrementing the reference count of a device found in the IB client data could
      race with final freeing so we use an RCU grace period to make sure that freeing
      won't happen until those lookups are done.
      
      MRs need the rds_ibdev to get at the pool that they're freed into.  They exist
      outside a connection and many MRs can reference different devices from one
      socket, so it was natural to have each MR hold a reference.  MR refs can be
      dropped from interrupt handlers and final device teardown can block so we push
      it off to a work struct.  Pool teardown had to be fixed to cancel its pending
      work instead of deadlocking waiting for all queued work, including itself, to
      finish.
      
      MRs get their reference from the global device list, which gets a reference.
      It is left unprotected by locks and remains racy.  A simple global lock would
      be a significant bottleneck.  More scalable (complicated) locking should be
      done carefully in a later patch.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
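The reference-counting scheme in this commit message can be sketched in userspace C11 as follows. This is a simplified illustration under stated assumptions: the `demo_` names are hypothetical, a boolean flag stands in for queueing the blocking teardown to a work struct, and the RCU grace period that protects the first lookup is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device object shared by connections and MRs. */
struct demo_dev {
    atomic_int refcount;
    bool teardown_queued;   /* stands in for scheduling deferred free work */
};

/* The creator holds the initial reference. */
static void demo_dev_init(struct demo_dev *d)
{
    atomic_init(&d->refcount, 1);
    d->teardown_queued = false;
}

/* Each connection or MR takes its own reference. */
static void demo_dev_get(struct demo_dev *d)
{
    int old = atomic_fetch_add(&d->refcount, 1);
    assert(old > 0);   /* must not revive an object already being freed */
    (void)old;
}

/*
 * Drop a reference. The last put does not free inline, since it may run
 * in a context that cannot block; it only marks the teardown as queued.
 * Returns true if this call dropped the final reference.
 */
static bool demo_dev_put(struct demo_dev *d)
{
    int old = atomic_fetch_sub(&d->refcount, 1);
    assert(old > 0);
    if (old == 1) {
        d->teardown_queued = true;
        return true;
    }
    return false;
}
```

In this sketch the removal callback would simply drop the creator's reference, and the object survives until the last connection or MR lets go.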
    • A
      RDS/IB: add _to_node() macros for numa and use {k,v}malloc_node() · e4c52c98
      Andy Grover committed
      Allocate send/recv rings in memory that is node-local to the HCA.
      This significantly helps performance.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • A
      51e2cba8
    • A
      RDS: Refill recv ring directly from tasklet · f17a1a55
      Andy Grover committed
      Performance is better if we use allocations that don't block
      to refill the receive ring. Since the whole reason we were
      kicking out to the worker thread was so we could do blocking
      allocs, we no longer need to do this.
      
      Remove gfp params from rds_ib_recv_refill(); we always use
      GFP_NOWAIT.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • A
      RDS: Perform unmapping ops in stages · ff3d7d36
      Andy Grover committed
      Previously, RDS would wait until the final send WR had completed
      and then handle cleanup. With silent ops, we do not know
      if an atomic, rdma, or data op will be last. This patch
      handles any of these cases by keeping a pointer to the last
      op in the message in m_last_op.
      
      When the TX completion event fires, rds dispatches to per-op-type
      cleanup functions, and then does whole-message cleanup, if the
      last op equalled m_last_op.
      
      This patch also moves towards having op-specific functions take
      the op struct, instead of the overall rm struct.
      
      rds_ib_connection has a pointer to keep track of a partially-completed
      data send operation. This patch changes it from an
      rds_message pointer to the narrower rm_data_op pointer, and
      modifies places that use this pointer as needed.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
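The staged cleanup described in this commit message can be sketched as follows. The structures and function names are hypothetical simplifications, not the RDS definitions; only the `last_op` pointer mirrors the `m_last_op` idea from the message. Each completion runs the cleanup for its own op type, and whole-message cleanup happens only when the completed op is the one recorded as last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_op_type { DEMO_OP_DATA, DEMO_OP_RDMA, DEMO_OP_ATOMIC };

/* Hypothetical per-op state. */
struct demo_op {
    enum demo_op_type type;
    bool unmapped;
};

/* Hypothetical message holding one op of each kind. */
struct demo_msg {
    struct demo_op data, rdma, atomic;
    struct demo_op *last_op;   /* whichever op was posted last */
    bool cleaned_up;           /* whole-message cleanup has run */
};

/* Per-op-type cleanup; each takes the op, not the whole message. */
static void demo_unmap_data(struct demo_op *op)   { op->unmapped = true; }
static void demo_unmap_rdma(struct demo_op *op)   { op->unmapped = true; }
static void demo_unmap_atomic(struct demo_op *op) { op->unmapped = true; }

/* Completion handler: op-specific cleanup first, then message cleanup. */
static void demo_tx_complete(struct demo_msg *m, struct demo_op *op)
{
    assert(m != NULL && op != NULL);

    switch (op->type) {
    case DEMO_OP_DATA:
        demo_unmap_data(op);
        break;
    case DEMO_OP_RDMA:
        demo_unmap_rdma(op);
        break;
    case DEMO_OP_ATOMIC:
        demo_unmap_atomic(op);
        break;
    }

    if (op == m->last_op)
        m->cleaned_up = true;
}
```

Comparing against a stored pointer avoids assuming that any particular op type always completes last.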
    • A
      RDS: Remove struct rds_rdma_op · f8b3aaf2
      Andy Grover committed
      A big changeset, but it's all pretty dumb.
      
      struct rds_rdma_op was already embedded in struct rm_rdma_op.
      Remove rds_rdma_op and put its members in rm_rdma_op. Rename
      members with "op_" prefix instead of "r_", for consistency.
      
      Of course this breaks a lot, so fixup the code accordingly.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • A
      RDS: Implement silent atomics · 241eef3e
      Andy Grover committed
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • A
      RDS: Remove unsignaled_bytes sysctl · 1d34f175
      Andy Grover committed
      Removed unsignaled_bytes sysctl and code to signal
      based on it. I believe unsignaled_wrs is more than
      sufficient for our purposes.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • A
      RDS/IB: Remove ib_[header/data]_sge() functions · 919ced4c
      Andy Grover committed
      These functions were to cope with differently ordered
      sg entries depending on RDS 3.0 or 3.1+. Now that
      we've dropped 3.0 compatibility we no longer need them.
      
      Also, modify usage sites for these to refer to sge[0] or [1]
      directly. Reorder code to initialize header sgs first.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • A
      809fa148
    • A
      RDS: Base init_depth and responder_resources on hw values · 40589e74
      Andy Grover committed
      Instead of using a constant for initiator_depth and
      responder_resources, read the per-QP values when the
      device is enumerated, and then use these values when creating
      the connection.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • A
      RDS: Implement atomic operations · 15133f6e
      Andy Grover committed
      Implement a CMSG-based interface to do FADD and CSWP ops.
      
      Alter send routines to handle atomic ops.
      
      Add atomic counters to stats.
      
      Add xmit_atomic() to struct rds_transport
      
      Inline rds_ib_send_unmap_rdma into unmap_rm
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
  16. 31 Oct, 2009 1 commit