1. 30 Apr, 2008 10 commits
    •
      RDMA/nes: Add support for SFP+ PHY · 0e1de5d6
      Committed by Eric Schneider
      This patch enables the iw_nes module for NetEffect RNICs to support
      additional PHYs including SFP+ (referred to as ARGUS in the code).
      Signed-off-by: Eric Schneider <eric.schneider@neteffect.com>
      Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    •
      RDMA/nes: Use LRO · 37dab411
      Committed by Faisal Latif
      Signed-off-by: Faisal Latif <flatif@neteffect.com>
      Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    •
      IB/mthca: Avoid changing userspace ABI to handle DMA write barrier attribute · baaad380
      Committed by Roland Dreier
      Commit cb9fbc5c ("IB: expand ib_umem_get() prototype") changed the
      mthca userspace ABI to provide a way for userspace to indicate which
      memory regions need the DMA write barrier attribute.  However, it is
      possible to handle this without breaking existing userspace, by having
      the mthca kernel driver recognize whether it is talking to old or new
      userspace, depending on the size of the register MR structure passed in.
      
      The only potential drawback of this is that it allows old userspace
      (which has a bug with DMA ordering on large SGI Altix systems) to
      continue to run on new kernels, but the advantage of allowing old
      userspace to continue to work on unaffected systems seems to outweigh
      this, and we can print a warning to push people to upgrade their
      userspace.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
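
The size-based detection described in the commit above can be sketched as follows. This is a minimal illustration, not the mthca code: the structure layouts, the `MR_ATTR_DMASYNC` flag, and `decode_reg_mr()` are hypothetical names chosen for the example. The idea is only that a shorter command buffer implies old userspace, so the new field is treated as zero and the caller can be warned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts for illustration; NOT the real mthca ABI structures. */
struct reg_mr_cmd_old {
    uint64_t start;
    uint64_t length;
};

struct reg_mr_cmd_new {
    uint64_t start;
    uint64_t length;
    uint32_t mr_attrs;   /* assumed new field: bit 0 requests the DMA write barrier */
    uint32_t reserved;
};

#define MR_ATTR_DMASYNC 0x1u   /* hypothetical flag name */

/*
 * Decode a register-MR command of in_len bytes into the new layout.
 * Returns 0 on success, -1 if the buffer is too short even for the old
 * layout. *legacy is set to 1 when the caller used the old layout, so a
 * driver could print a one-time warning asking users to upgrade.
 */
int decode_reg_mr(const void *buf, size_t in_len,
                  struct reg_mr_cmd_new *out, int *legacy)
{
    memset(out, 0, sizeof(*out));

    if (in_len < sizeof(struct reg_mr_cmd_old))
        return -1;

    if (in_len < sizeof(struct reg_mr_cmd_new)) {
        /* Old userspace: copy the common prefix, leave mr_attrs at 0. */
        memcpy(out, buf, sizeof(struct reg_mr_cmd_old));
        *legacy = 1;
        return 0;
    }

    /* New userspace: the full structure, including mr_attrs, is present. */
    memcpy(out, buf, sizeof(*out));
    *legacy = 0;
    return 0;
}
```

Because the new structure only appends fields to the old one, the common prefix can be copied unchanged, which is what keeps existing binaries working.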
    •
      IB/mthca: Avoid recycling old FMR R_Keys too soon · 0bfe151c
      Committed by Olaf Kirch
      When a FMR is unmapped, mthca resets the map count to 0, and clears
      the upper part of the R_Key which is used as the sequence counter.
      
      This poses a problem for RDS, which uses ib_fmr_unmap as a fence
      operation.  RDS assumes that after issuing an unmap, the old R_Keys
      will be invalid for a "reasonable" period of time. For instance,
      Oracle processes use shared memory buffers allocated from a pool of
      buffers.  When a process dies, we want to reclaim these buffers -- but
      we must make sure there are no pending RDMA operations to/from those
      buffers.  The only way to achieve that is to unmap and sync the
      TPT.
      
      However, when the sequence count is reset on unmap, there is a high
      likelihood that a new mapping will be given the same R_Key that was
      issued a few milliseconds ago.
      
      To prevent this, don't reset the sequence count when unmapping a FMR.
      Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
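
The key-reuse problem described in the commit above can be illustrated with a small model. This is a sketch under stated assumptions, not the mthca implementation: the 24-bit index / 8-bit sequence split, `struct fmr`, and all function names are hypothetical. It only shows why clearing the sequence bits on unmap hands out the same key again, while leaving them alone does not.

```c
#include <stdint.h>

#define INDEX_BITS 24u                        /* assumed split, for illustration only */
#define INDEX_MASK ((1u << INDEX_BITS) - 1u)

/* Hypothetical model of an FMR: low bits are the index, high bits the sequence. */
struct fmr {
    uint32_t key;
    unsigned maps;
};

/* Each new mapping bumps the sequence counter held in the upper bits. */
uint32_t fmr_map(struct fmr *f)
{
    f->key += 1u << INDEX_BITS;   /* unsigned arithmetic, wraps within 32 bits */
    f->maps++;
    return f->key;
}

/* Behaviour before the fix: unmap clears the sequence bits. */
void fmr_unmap_reset(struct fmr *f)
{
    f->key &= INDEX_MASK;
    f->maps = 0;
}

/* Behaviour after the fix: unmap leaves the sequence bits untouched. */
void fmr_unmap_keep(struct fmr *f)
{
    f->maps = 0;
}
```

With `fmr_unmap_reset()`, a map/unmap/map cycle returns the same key twice in a row; with `fmr_unmap_keep()`, the second key differs, so a stale key from before the unmap no longer matches.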
    •
      IB/ehca: Allocate event queue size depending on max number of CQs and QPs · d227fa72
      Committed by Stefan Roscher
      If a lot of QPs fall into Error state at once and the EQ of the
      respective HCA is too small, it might overrun, causing the eHCA driver
      to stop processing completion events and calling the application's
      completion handlers, effectively causing traffic to stop.
      
      Fix this by limiting available QPs and CQs to a customizable max
      count, and determining EQ size based on these counts and a worst-case
      assumption.
      Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    •
      IB/ehca: handle negative return value from ibmebus_request_irq() properly · 7df109d9
      Committed by Hoang-Nam Nguyen
      ehca_create_eq() was assigning a signed return value to an unsigned
      local variable and then checking if the variable was < 0, which meant
      that errors were always ignored.  Fix this by using one variable for
      signed integer return values and another for u64 hcall return values.
      
      Bug originally found by Roel Kluin <12o3l@tiscali.nl>.
      Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
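
The class of bug fixed in the commit above is easy to reproduce in isolation. The following is a minimal sketch, not the ehca code: `request_irq_stub()`, `setup_buggy()`, and `setup_fixed()` are hypothetical stand-ins. It shows that a negative return value stored in an unsigned 64-bit variable can never compare less than zero.

```c
#include <stdint.h>

/* Hypothetical stand-in for a call that reports failure as a negative value. */
static int request_irq_stub(int fail)
{
    return fail ? -22 : 0;
}

/* Buggy pattern: the signed result lands in an unsigned variable. */
int setup_buggy(int fail)
{
    uint64_t ret = (uint64_t)request_irq_stub(fail);

    if (ret < 0)        /* always false for an unsigned type */
        return -1;
    return 0;           /* the error is silently ignored */
}

/* Fixed pattern: keep signed return values in a signed variable. */
int setup_fixed(int fail)
{
    int ret = request_irq_stub(fail);

    if (ret < 0)
        return ret;     /* the error is propagated to the caller */
    return 0;
}
```

Many compilers can flag the always-false comparison when extra warnings are enabled, which is one way such bugs are found.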
    •
      RDMA/cxgb3: Support peer-2-peer connection setup · f8b0dfd1
      Committed by Steve Wise
      Open MPI, Intel MPI and other applications don't respect the iWARP
      requirement that the client (active) side of the connection send the
      first RDMA message.  This class of application connection setup is
      called peer-to-peer.  Typically once the connection is set up, _both_
      sides want to send data.
      
      This patch enables supporting peer-to-peer over the Chelsio RNIC by
      enforcing this iWARP requirement in the driver itself as part of RDMA
      connection setup.
      
      Connection setup is extended, when the peer2peer module option is 1,
      such that the MPA initiator will send a 0B Read (the RTR) just after
      connection setup.  The MPA responder will suspend SQ processing until
      the RTR message is received and replied to.
      
      In the longer term, this will be handled in a standardized way by
      enhancing the MPA negotiation so peers can indicate whether they
      want/need the RTR and what type of RTR (0B read, 0B write, or 0B send)
      should be sent.  This will be done by standardizing a few bits of the
      private data in order to negotiate all this.  However this patch
      enables peer-to-peer applications now and allows most of the required
      firmware and driver changes to be done and tested now.
      
      Design:
      
       - Add a module option, peer2peer, to enable this mode.
      
       - New firmware support for peer-to-peer mode:
      
      	- a new bit in the rdma_init WR to tell it to do peer-2-peer
      	  and what form of RTR message to send or expect.
      
      	- process _all_ preposted recvs before moving the connection
      	  into rdma mode.
      
      	- passive side: defer completing the rdma_init WR until all
      	  pre-posted recvs are processed.  Suspend SQ processing until
      	  the RTR is received.
      
      	- active side: expect and process the 0B read WR on offload TX
      	  queue. Defer completing the rdma_init WR until all
      	  pre-posted recvs are processed.  Suspend SQ processing until
      	  the 0B read WR is processed from the offload TX queue.
      
       - If peer2peer is set, driver posts 0B read request on offload TX
         queue just after posting the rdma_init WR to the offload TX queue.
      
       - Add CQ poll logic to ignore unsolicited read responses.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    •
      RDMA/cxgb3: Set the max_mr_size device attribute correctly · ccaf10d0
      Committed by Steve Wise
      cxgb3 only supports 4GB memory regions.  The lustre RDMA code uses
      this attribute and currently has to code around our bad setting.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    •
      RDMA/cxgb3: Correctly serialize peer abort path · 989a1780
      Committed by Steve Wise
      Open MPI and other stress testing exposed a few bad bugs in handling
      aborts in the middle of a normal close.  Fix these by:
      
       - serializing abort reply and peer abort processing with disconnect
         processing
      
       - warning (and ignoring) if ep timer is stopped when it wasn't running
      
       - cleaning up disconnect path to correctly deal with aborting and
         dead endpoints
      
       - in iwch_modify_qp(), taking a ref on the ep before releasing the qp
         lock if iwch_ep_disconnect() will be called.  The ref is dropped
         after calling disconnect.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    •
      mlx4_core: Add a way to set the "collapsed" CQ flag · e463c7b1
      Committed by Yevgeny Petrilin
      Extend the mlx4_cq_alloc() API with a way to set the "collapsed" flag
      for the CQ being created.
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  2. 29 Apr, 2008 1 commit
    •
      IB: expand ib_umem_get() prototype · cb9fbc5c
      Committed by Arthur Kepner
      Add a new parameter, dmasync, to the ib_umem_get() prototype.  Use dmasync = 1
      when mapping user-allocated CQs with ib_umem_get().
      Signed-off-by: Arthur Kepner <akepner@sgi.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Jes Sorensen <jes@sgi.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Roland Dreier <rdreier@cisco.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 24 Apr, 2008 10 commits
  4. 22 Apr, 2008 8 commits
  5. 20 Apr, 2008 1 commit
  6. 19 Apr, 2008 2 commits
  7. 17 Apr, 2008 8 commits