1. 11 Mar, 2008 1 commit
  2. 01 Mar, 2008 3 commits
  3. 16 Feb, 2008 1 commit
  4. 15 Feb, 2008 1 commit
    • RDMA/cma: Do not issue MRA if user rejects connection request · ead595ae
      Committed by Sean Hefty
      There's an undesirable interaction with issuing MRA requests to
      increase connection timeouts and the listen backlog.
      
      When the rdma_cm receives a connection request, it queues an MRA with
      the ib_cm.  (The ib_cm will send an MRA if it receives a duplicate
      REQ.)  The rdma_cm will then create a new rdma_cm_id and give that to
      the user, which in this case is the rdma_user_cm.
      
      If the listen backlog maintained in the rdma_user_cm is full, it
      destroys the rdma_cm_id, which in turn destroys the ib_cm_id.  The
      ib_cm_id generates a REJ because the state of the ib_cm_id has changed
      to MRA sent, versus REQ received.  When the backlog is full, we just
      want to drop the REQ so that it is retried later.
      
      Fix this by deferring queuing the MRA until after the user of the
      rdma_cm has examined the connection request.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      ead595ae
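      A conceptual sketch of the deferral, in C.  The function name and
      cma_queue_mra() are hypothetical, not the actual cma.c code; only the
      ordering described above is the point.

      ```c
      /*
       * Sketch: queue the MRA only after the rdma_cm user has examined the
       * REQ.  An id destroyed because the backlog is full therefore never
       * left the "REQ received" state, so no REJ is generated and the
       * remote peer simply retries the REQ later.
       */
      static int rdma_accept_sketch(struct rdma_cm_id *id)
      {
              cma_queue_mra(id);  /* hypothetical helper: moved here from the REQ handler */
              /* ... proceed with the normal accept path (send the REP) ... */
              return 0;
      }
      ```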
  5. 13 Feb, 2008 2 commits
  6. 05 Feb, 2008 2 commits
    • IB/fmr_pool: Allocate page list for pool FMRs only when caching enabled · 1d96354e
      Committed by Or Gerlitz
      Allocate memory for the page_list field of struct ib_pool_fmr only
      when caching is enabled for the FMR pool, since the field is not used
      otherwise.  This can save significant amounts of memory for large
      pools with caching turned off.
      Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      1d96354e
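      A minimal sketch of the allocation change, assuming the usual fmr_pool
      names (struct ib_pool_fmr, cache_bucket, max_pages_per_fmr); it is not a
      verbatim excerpt from fmr_pool.c.

      ```c
      /* Size the per-FMR allocation so the trailing page_list storage is
       * only present when the pool actually caches mappings. */
      static struct ib_pool_fmr *alloc_pool_fmr_sketch(struct ib_fmr_pool *pool,
                                                       struct ib_fmr_pool_param *params)
      {
              size_t fmr_size = sizeof(struct ib_pool_fmr);

              if (pool->cache_bucket)   /* caching enabled for this pool */
                      fmr_size += params->max_pages_per_fmr * sizeof(u64);

              return kmalloc(fmr_size, GFP_KERNEL);
      }
      ```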
    • IB/cm: Add interim support for routed paths · 3971c9f6
      Committed by Sean Hefty
      Paths with hop_limit > 1 indicate that the connection will be routed
      between IB subnets.  Update the subnet local field in the CM REQ based
      on the hop_limit value.  In addition, if the path is routed, then set
      the LIDs in the REQ to the permissive LIDs.  This is used to indicate
      to the passive side that it should use the LIDs in the received local
      route header (LRH) associated with the REQ when programming the QP.
      
      This is a temporary work-around to the IB CM to support IB router
      development until the IB router specification is completed.  It is not
      anticipated that this work-around will cause any interoperability
      issues with existing stacks or future stacks that will properly
      support IB routers when defined.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      3971c9f6
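      Roughly what the work-around amounts to; the REQ field names below are
      illustrative rather than the exact accessors used in cm.c, while
      IB_LID_PERMISSIVE is the standard permissive LID constant.

      ```c
      /* A hop_limit > 1 marks the path as routed between subnets: clear the
       * "subnet local" bit in the REQ and advertise permissive LIDs, so the
       * passive side takes the LIDs from the LRH of the received REQ. */
      if (path->hop_limit > 1) {
              req->primary_subnet_local = 0;                /* illustrative field */
              req->primary_local_lid    = IB_LID_PERMISSIVE;
              req->primary_remote_lid   = IB_LID_PERMISSIVE;
      }
      ```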
  7. 29 Jan, 2008 3 commits
  8. 26 Jan, 2008 14 commits
    • IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs · a3cd7d90
      Committed by Olaf Kirch
      When a FMR is released via ib_fmr_pool_unmap(), the FMR usually ends
      up on the free_list rather than the dirty_list (because we allow a
      certain number of remappings before actually requiring a flush).
      
      However, ib_fmr_batch_release() only looks at dirty_list when flushing
      out old mappings.  This means that when ib_fmr_pool_flush() is used to
      force a flush of the FMR pool, some dirty FMRs that have not reached
      their maximum remap count will not actually be flushed.
      
      Fix this by flushing all FMRs that have been used at least once in
      ib_fmr_batch_release().
      Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      a3cd7d90
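      The gist of the fix, sketched with the pool's assumed list and field
      names (free_list, dirty_list, remap_count):

      ```c
      /* Before flushing, also pick up FMRs on the free list that have been
       * remapped at least once; previously only dirty_list (FMRs at their
       * maximum remap count) was flushed. */
      struct ib_pool_fmr *fmr, *next;

      list_for_each_entry_safe(fmr, next, &pool->free_list, list) {
              if (fmr->remap_count > 0)
                      list_move_tail(&fmr->list, &pool->dirty_list);
      }
      ```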
    • IB/fmr_pool: Flush serial numbers can get out of sync · a656eb75
      Committed by Olaf Kirch
      Normally, the serial numbers for flush requests and flushes executed
      for an FMR pool should be in sync.
      
      However, if the FMR pool flushes dirty FMRs because the
      dirty_watermark was reached, we wake up the cleanup thread and let it
      do its stuff.  As a side effect, the cleanup thread increments
      pool->flush_ser, which leaves it one higher than pool->req_ser.  The
      next time the user calls ib_flush_fmr_pool(), the cleanup thread will
      be woken up, but ib_flush_fmr_pool() won't wait for the flush to
      complete because flush_ser is already past req_ser.  This means the
      FMRs that the user expects to be flushed may not have all been flushed
      when the function returns.
      
      Fix this by telling the cleanup thread to do work exclusively by
      incrementing req_ser, and by moving the comparison of dirty_len and
      dirty_watermark into ib_fmr_pool_unmap().
      Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
      a656eb75
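      A simplified view of the serial-number handshake described above,
      assuming the pool members req_ser, flush_ser, thread and force_wait;
      not the exact code.

      ```c
      /* The caller bumps req_ser, wakes the cleanup thread, and waits until
       * flush_ser catches up.  Watermark-triggered wakeups no longer touch
       * req_ser, so they cannot make flush_ser run ahead of it. */
      static int flush_fmr_pool_sketch(struct ib_fmr_pool *pool)
      {
              atomic_inc(&pool->req_ser);
              wake_up_process(pool->thread);

              return wait_event_interruptible(pool->force_wait,
                              atomic_read(&pool->flush_ser) -
                              atomic_read(&pool->req_ser) >= 0);
      }
      ```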
    • IB/umad: Simplify and fix locking · 2fe7e6f7
      Committed by Roland Dreier
      In addition to being overly complex, the locking in user_mad.c is
      broken: there were multiple reports of deadlocks and lockdep warnings.
      In particular it seems that a single thread may end up trying to take
      the same rwsem for reading more than once, which is explicitly
      forbidden in the comments in <linux/rwsem.h>.
      
      To solve this, we change the locking to use plain mutexes instead of
      rwsems.  There is one mutex per open file, which protects the contents
      of the struct ib_umad_file, including the array of agents and list of
      queued packets; and there is one mutex per struct ib_umad_port, which
      protects the contents, including the list of open files.  We never
      hold the file mutex across calls to functions like ib_unregister_mad_agent(),
      which can call back into other ib_umad code to queue a packet, and we
      always hold the port mutex as long as we need to make sure that a
      device is not hot-unplugged from under us.
      
      This even makes things nicer for users of the -rt patch, since we
      remove calls to downgrade_write() (which is not implemented in -rt).
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      2fe7e6f7
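      The resulting locking pattern, in outline (a sketch, not the full
      user_mad.c code; the per-file agent array is simplified):

      ```c
      /* Hold file->mutex only around the file's own state, and drop it
       * before calling into the MAD layer, whose callbacks may re-enter
       * ib_umad to queue packets. */
      static void unregister_agent_sketch(struct ib_umad_file *file, int id)
      {
              struct ib_mad_agent *agent;

              mutex_lock(&file->mutex);
              agent = file->agent[id];
              file->agent[id] = NULL;
              mutex_unlock(&file->mutex);

              if (agent)
                      ib_unregister_mad_agent(agent);  /* no ib_umad mutex held here */
      }
      ```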
    • RDMA/cma: Override default responder_resources with user value · 5851bb89
      Committed by Sean Hefty
      By default, the responder_resources parameter is set to that received
      in a connection request.  The passive side may override this value
      when accepting the connection.  Use the value provided by the passive
      side when transitioning the QP to RTR state, rather than the value
      given in the connect request.  Without this change, the RTR transition
      may fail if the passive side supports fewer responder_resources than
      that in the request.
      
      For code consistency and to protect against QP destruction, restructure
      overriding initiator_depth to match how responder_resources is set.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      5851bb89
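      In code, the override amounts to roughly the following when moving the
      QP to RTR (names assumed; max_dest_rd_atomic is the QP attribute that
      carries responder resources):

      ```c
      /* Prefer the responder_resources value the passive side passed to
       * rdma_accept() over the value carried in the incoming REQ. */
      qp_attr.max_dest_rd_atomic = conn_param->responder_resources;
      ret = ib_modify_qp(id->qp, &qp_attr, qp_attr_mask | IB_QP_MAX_DEST_RD_ATOMIC);
      ```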
    • IPoIB: improve IPv4/IPv6 to IB mcast mapping functions · a9e527e3
      Committed by Rolf Manderscheid
      An IPoIB subnet on an IB fabric that spans multiple IB subnets can't
      use link-local scope in multicast GIDs.  The existing routines that
      map IP/IPv6 multicast addresses into IB link-level addresses hard-code
      the scope to link-local, and they also leave the partition key field
      uninitialised.  This patch adds a parameter (the link-level broadcast
      address) to the mapping routines, allowing them to initialise both the
      scope and the P_Key appropriately, and fixes up the call sites.
      
      The next step will be to add a way to configure the scope for an IPoIB
      interface.
      Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      a9e527e3
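      A trimmed sketch of the new mapping shape; the byte offsets are
      abbreviated and the function name is made up, so treat this as an
      illustration rather than the verbatim helper.

      ```c
      /* The mapping routine now receives the interface's link-level
       * broadcast address and copies the MGID scope and P_Key from it,
       * instead of hard-coding link-local scope and leaving P_Key unset. */
      static void ipoib_mc_map_sketch(__be32 group, const unsigned char *broadcast,
                                      unsigned char *buf /* 20-byte IPoIB hw address */)
      {
              unsigned char scope = broadcast[5] & 0xF;  /* scope nibble of broadcast MGID */

              buf[5] = 0x10 | scope;    /* previously hard-coded to link-local scope */
              buf[8] = broadcast[8];    /* P_Key, previously left uninitialised */
              buf[9] = broadcast[9];
              /* ... remaining signature bytes and the group bits from 'group' ... */
      }
      ```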
    • RDMA/cma: add support for rdma_migrate_id() · 88314e4d
      Committed by Sean Hefty
      This is based on user feedback from Doug Ledford at RedHat:
      
      Events that occur on an rdma_cm_id are reported to userspace through an
      event channel.  Connection request events are reported on the event
      channel associated with the listen.  When the connection is accepted, a
      new rdma_cm_id is created and automatically uses the listen event
      channel.  This is suboptimal where the user only wants listen events on
      that channel.
      
      Additionally, it may be desirable to have events related to connection
      establishment use a different event channel than those related to
      already established connections.
      
      Allow the user to migrate an rdma_cm_id between event channels. All
      pending events associated with the rdma_cm_id are moved to the new event
      channel.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      88314e4d
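      Typical userspace use goes through librdmacm, whose rdma_migrate_id()
      call wraps this kernel support; error handling is trimmed here.

      ```c
      /* Move a freshly reported connection-request id off the listen
       * channel, so that channel only ever carries listen events. */
      struct rdma_event_channel *conn_channel = rdma_create_event_channel();

      /* ... rdma_get_cm_event() on the listen channel has returned an
       *     RDMA_CM_EVENT_CONNECT_REQUEST event in 'event' ... */
      if (rdma_migrate_id(event->id, conn_channel))
              perror("rdma_migrate_id");
      ```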
    • RDMA/cma: Reenable device removal on passive side · 45d9478d
      Committed by Vladimir Sokolovsky
      Enable conn_id remove on the passive side after connection
      establishment.  This corrects an issue where the IB driver can't be
      unloaded after running applications over RDS.  The 'dev_remove' counter
      does not reach 0 for established connections on the passive side.
      
      This problem is limited to device removal, and only occurs on the
      passive side if there are established connections.
      Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      45d9478d
    • IB/mad: Fix incorrect access to items on local_list · b61d92d8
      Committed by Sean Hefty
      In cancel_mads(), MADs are moved from the wait_list and local_list
      to a cancel_list for processing.  However, the structures on these two
      lists are not the same.  The wait_list references struct
      ib_mad_send_wr_private, but local_list references struct
      ib_mad_local_private.  Cancel_mads() treats all items moved to the
      cancel_list as struct ib_mad_send_wr_private.  This leads to a system
      crash when requests are moved from the local_list to the cancel_list.
      
      Fix this by leaving local_list alone.  All requests on the local_list
      have completed and are just awaiting processing by a queued worker thread.
      
      Bug (crash) reported by Dotan Barak <dotanb@dev.mellanox.co.il>.
      Problem with local_list access reported by Robert Reynolds
      <rreynolds@opengridcomputing.com>.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      b61d92d8
    • IB/cm: Add basic performance counters · 9af57b7a
      Committed by Sean Hefty
      Add performance/debug counters to track sent/received messages, retries,
      and duplicates.  Counters are tracked per CM message type, per port.
      
      The counters are always enabled, so intrusive state tracking is not done.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      9af57b7a
    • IB/mad: Report number of times a mad was retried · 4fc8cd49
      Committed by Sean Hefty
      To allow ULPs to tune timeout values and capture retry statistics,
      report the number of times that a mad send operation was retried.
      
      For RMPP mads, report the total number of times that any portion
      (send window) of the send operation was retried.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      4fc8cd49
    • IB/multicast: Report errors on multicast groups if P_key changes · 547af765
      Committed by Sean Hefty
      P_key changes can invalidate multicast groups.  Report errors on all
      multicast groups affected by a pkey change.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      547af765
    • IB/mad: Enable loopback of DR SMP responses from userspace · 727792da
      Committed by Steve Welch
      The local loopback of an outgoing DR SMP response is limited to those
      that originate at the driver specific SMA implementation during the
      driver specific process_mad() function.  This patch enables a
      returning DR SMP originating in userspace (or elsewhere) to be
      delivered to the local management stack.  In this specific case the
      driver process_mad() function does not consume or process the MAD, so
      a response MAD has not been created and the original MAD must be manually
      copied to the MAD buffer that is to be handed off to the local agent.
      Signed-off-by: Steve Welch <swelch@systemfabricworks.com>
      Acked-by: Hal Rosenstock <hal@xsigo.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      727792da
    • IB/mad: Remove redundant NULL pointer check in ib_mad_recv_done_handler() · 3828ff45
      Committed by Ralph Campbell
      In ib_mad_recv_done_handler(), the response pointer is checked for
      NULL after allocating it.  It is then checked again in the local
      process_mad() path but there is no possibility of it changing in
      between.
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
      Acked-by: Hal Rosenstock <hal@xsigo.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      3828ff45
    • RDMA/iwcm: Set initiator depth and responder resources to device max values · 8d8293cf
      Committed by Steve Wise
      Set the initiator depth and responder resources to the device max
      values for new connect request events in the iWARP connection manager.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      8d8293cf
  9. 25 Jan, 2008 2 commits
  10. 31 Oct, 2007 1 commit
  11. 24 Oct, 2007 1 commit
  12. 23 Oct, 2007 1 commit
  13. 20 Oct, 2007 1 commit
    • IB/uverbs: Fix checking of userspace object ownership · cbfb50e6
      Committed by Roland Dreier
          
      Commit 9ead190b ("IB/uverbs: Don't serialize with ib_uverbs_idr_mutex")
      rewrote how userspace objects are looked up in the uverbs module's
      idrs, and introduced a severe bug in the process: there is no checking
      that an operation is being performed by the right process any more.
      Fix this by adding the missing check of uobj->context in __idr_get_uobj().
      
      Apparently everyone is being very careful to only touch their own
      objects, because this bug was introduced in June 2006 in 2.6.18, and
      has gone undetected until now.
      
      Cc: stable <stable@kernel.org>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      cbfb50e6
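      The essence of the fix in simplified form; reference counting and the
      surrounding lookup details are omitted from this sketch.

      ```c
      /* Reject handles that belong to a different process's context. */
      static struct ib_uobject *__idr_get_uobj_sketch(struct idr *idr, int id,
                                                      struct ib_ucontext *context)
      {
              struct ib_uobject *uobj;

              spin_lock(&ib_uverbs_idr_lock);
              uobj = idr_find(idr, id);
              if (uobj && uobj->context != context)  /* the missing ownership check */
                      uobj = NULL;
              spin_unlock(&ib_uverbs_idr_lock);

              return uobj;
      }
      ```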
  14. 19 Oct, 2007 1 commit
  15. 17 Oct, 2007 2 commits
    • RDMA/cma: Fix deadlock destroying listen requests · d02d1f53
      Committed by Sean Hefty
      Deadlock condition reported by Kanoj Sarcar <kanoj@netxen.com>.
      The deadlock occurs when a connection request arrives at the same
      time that a wildcard listen is being destroyed.
      
      A wildcard listen maintains per device listen requests for each
      RDMA device in the system.  The per device listens are automatically
      added and removed when RDMA devices are inserted or removed from
      the system.
      
      When a wildcard listen is destroyed, rdma_destroy_id() acquires
      the rdma_cm's device mutex ('lock') to protect against hot-plug
      events adding or removing per device listens.  It then tries to
      destroy the per device listens by calling ib_destroy_cm_id() or
      iw_destroy_cm_id().  It does this while holding the device mutex.
      
      However, if the underlying iw/ib CM reports a connection request
      while this is occurring, the rdma_cm callback function will try
      to acquire the same device mutex.  Since we're in a callback,
      the ib_destroy_cm_id() or iw_destroy_cm_id() calls will block until
      their callback thread returns, but the callback is blocked waiting for
      the device mutex.
      
      Fix this by re-working how per device listens are destroyed.  Use
      rdma_destroy_id(), which avoids the deadlock, in place of
      cma_destroy_listen().  Additional synchronization is added to handle
      device hot-plug events and ensure that the id is not destroyed twice.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      d02d1f53
    • RDMA/cma: Add locking around QP accesses · c5483388
      Committed by Sean Hefty
      If a user allocates a QP on an rdma_cm_id, the rdma_cm will automatically
      transition the QP through its states (RTR, RTS, error, etc.)  While the
      QP state transitions are occurring, the QP itself must remain valid.
      Provide locking around the QP pointer to prevent its destruction while
      accessing the pointer.
      
      This fixes an issue reported by Olaf Kirch from Oracle that resulted in
      a system crash:
      
      "An incoming connection arrives and we decide to tear down the nascent
       connection.  The remote end decides to do the same.  We start to shut
       down the connection, and call rdma_destroy_qp on our cm_id. ... Now
       apparently a 'connect reject' message comes in from the other host,
       and cma_ib_handler() is called with an event of IB_CM_REJ_RECEIVED.
       It calls cma_modify_qp_err, which for some odd reason tries to modify
       the exact same QP we just destroyed."
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      c5483388
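      A sketch of the resulting pattern (member names assumed): every access
      to the id's QP pointer happens under a per-id mutex and tolerates the
      pointer already being NULL.

      ```c
      static int cma_modify_qp_err_sketch(struct rdma_id_private *id_priv)
      {
              struct ib_qp_attr qp_attr = { .qp_state = IB_QPS_ERR };
              int ret = 0;

              mutex_lock(&id_priv->qp_mutex);
              if (id_priv->id.qp)       /* QP may already have been destroyed */
                      ret = ib_modify_qp(id_priv->id.qp, &qp_attr, IB_QP_STATE);
              mutex_unlock(&id_priv->qp_mutex);

              return ret;
      }
      ```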
  16. 13 Oct, 2007 1 commit
    • Driver core: change add_uevent_var to use a struct · 7eff2e7a
      Committed by Kay Sievers
      This changes the uevent buffer functions to use a struct instead of a
      long list of parameters. It no longer requires the caller to do the
      proper buffer termination and size accounting, which is currently wrong
      in some places. It fixes a known bug where parts of the uevent
      environment are overwritten because of wrong index calculations.
      
      Many thanks to Mathieu Desnoyers for finding bugs and improving the
      error handling.
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      
      7eff2e7a
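      With the struct-based interface, a uevent callback ends up looking
      roughly like this; the driver and variable names are hypothetical, but
      the add_uevent_var() calling convention is the one introduced here.

      ```c
      /* The env struct owns the buffer and does all termination and index
       * bookkeeping; callers just add KEY=value strings. */
      static int foo_uevent(struct device *dev, struct kobj_uevent_env *env)
      {
              if (add_uevent_var(env, "FOO_MODE=%d", 1))
                      return -ENOMEM;
              return add_uevent_var(env, "FOO_DEVICE=%s", "foo0");
      }
      ```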
  17. 11 Oct, 2007 1 commit
  18. 10 Oct, 2007 2 commits
    • RDMA/cma: Queue IB CM MRAs to avoid unnecessary remote retries · dcb3f974
      Committed by Sean Hefty
      Automatically queue MRA message to decrease the number of retries sent
      by the remote side during connection establishment.  This also has the
      effect of increasing the overall connection timeout without using a
      longer retry time in the case of dropped packets.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      dcb3f974
    • IB/cm: Modify interface to send MRAs in response to duplicate messages · de98b693
      Committed by Sean Hefty
      The IB CM provides a message received acknowledged (MRA) message that
      can be sent to indicate that a REQ or REP message has been received, but
      will require more time to process than the timeout specified by those
      messages.  In many cases, the application may not know how long it will
      take to respond to a CM message, but the majority of the time, it will
      usually respond before a retry has been sent.  Rather than sending an
      MRA in response to all messages just to handle the case where a longer
      timeout is needed, it is more efficient to queue the MRA for sending in
      case a duplicate message is received.
      
      This avoids sending an MRA when it is not needed, but limits the number
      of times that a REQ or REP will be resent.  It also provides for a
      simpler implementation than generating the MRA based on a timer event.
      (That is, trying to send the MRA after receiving the first REQ or REP if
      a response has not been generated, so that it is received at the remote
      side before a duplicate REQ or REP has been received.)
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      de98b693
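      Assuming the delay flag added by this change is named IB_CM_MRA_FLAG_DELAY
      and is OR'ed into the service timeout, a caller that wants the lazy
      behaviour would do roughly:

      ```c
      /* Queue the MRA instead of sending it immediately; it only goes on
       * the wire if a duplicate REQ or REP arrives before we respond. */
      u8 service_timeout = 16;   /* example value */
      int ret = ib_send_cm_mra(cm_id, service_timeout | IB_CM_MRA_FLAG_DELAY,
                               NULL, 0);
      ```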