1. 25 Feb 2010, 5 commits
  2. 11 Feb 2010, 1 commit
    • RDMA/cm: Revert association of an RDMA device when binding to loopback · 8523c048
      Committed by Sean Hefty
      Revert the following change from commit 6f8372b6 ("RDMA/cm: fix
      loopback address support")
      
         The defined behavior of rdma_bind_addr is to associate an RDMA
         device with an rdma_cm_id, as long as the user specified a non-
         zero address.  (ie they weren't just trying to reserve a port)
         Currently, if the loopback address is passed to rdma_bind_addr,
         no device is associated with the rdma_cm_id.  Fix this.
      
      It turns out that important apps such as Open MPI depend on
      rdma_bind_addr() NOT associating any RDMA device when binding to a
      loopback address.  Open MPI is being updated to deal with this, but at
      least until a new Open MPI release is available, maintain the previous
      behavior: allow rdma_bind_addr() to succeed, but do not bind to a
      device.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Acked-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
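      
      A minimal user-space sketch of the behavior being kept (a hypothetical
      test program, not part of the patch): with this revert, binding an
      rdma_cm_id to 127.0.0.1 succeeds but leaves id->verbs NULL, which is
      what Open MPI currently expects.
      
          #include <rdma/rdma_cma.h>
          #include <arpa/inet.h>
          #include <stdio.h>
      
          int main(void)
          {
              struct rdma_event_channel *ch = rdma_create_event_channel();
              struct rdma_cm_id *id;
              struct sockaddr_in sin = { .sin_family = AF_INET };
      
              inet_pton(AF_INET, "127.0.0.1", &sin.sin_addr);
              if (!ch || rdma_create_id(ch, &id, NULL, RDMA_PS_TCP))
                  return 1;
      
              if (rdma_bind_addr(id, (struct sockaddr *) &sin))
                  return 1;   /* the bind itself still succeeds */
      
              /* ...but no RDMA device has been associated yet */
              printf("verbs context after loopback bind: %p\n", (void *) id->verbs);
      
              rdma_destroy_id(id);
              rdma_destroy_event_channel(ch);
              return 0;
          }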
  3. 07 Jan 2010, 1 commit
  4. 17 Dec 2009, 1 commit
  5. 10 Dec 2009, 1 commit
  6. 20 Nov 2009, 8 commits
  7. 19 Nov 2009, 1 commit
  8. 17 Nov 2009, 1 commit
    • RDMA/ucma: Add option to manually set IB path · a7ca1f00
      Committed by Sean Hefty
      Export rdma_set_ib_paths to user space to allow applications to
      manually set the IB path used for connections.  This allows
      alternative ways for a user space application or library to obtain
      path record information, including retrieving path information
      from cached data, avoiding direct interaction with the IB SA.
      The IB SA is a single, centralized entity that can limit scaling
      on large clusters running MPI applications.
      
      Future changes to the rdma cm can expand on this framework to
      support the full range of features allowed by the IB CM, such as
      separate forward and reverse paths and APM.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
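      
      A hedged user-space sketch of how a cached path might be supplied
      through this option; rdma_set_option() is the librdmacm entry point,
      while the RDMA_OPTION_IB / RDMA_OPTION_IB_PATH names and the
      ibv_path_data payload are assumed from the interface that grew out of
      this change rather than quoted from the patch:
      
          /* 'id' has already had its address resolved; 'path' comes from a
           * cache or an ibacm-style service instead of an SA query. */
          struct ibv_path_data path;      /* assumed payload type */
      
          memset(&path, 0, sizeof(path));
          /* ... fill in DLID/SLID, SL, MTU, rate, ... from the cached record ... */
      
          if (rdma_set_option(id, RDMA_OPTION_IB, RDMA_OPTION_IB_PATH,
                              &path, sizeof(path)))
              perror("rdma_set_option(IB_PATH)");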
  9. 12 Oct 2009, 1 commit
  10. 08 Oct 2009, 2 commits
  11. 05 Oct 2009, 1 commit
  12. 24 Sep 2009, 1 commit
  13. 10 Sep 2009, 1 commit
  14. 07 Sep 2009, 2 commits
    • IB/mad: Allow tuning of QP0 and QP1 sizes · b76aabc3
      Committed by Hal Rosenstock
      MADs are UD and can be dropped if there are no receives posted, so
      allow receive queue size to be set with a module parameter in case the
      queue needs to be lengthened.  Send side tuning is done for symmetry
      with receive.
      Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
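      
      A sketch of the kind of module parameters this adds (names, defaults,
      and permissions assumed to follow the description rather than copied
      from the patch):
      
          #include <linux/module.h>
      
          static int mad_sendq_size = IB_MAD_QP_SEND_SIZE;
          module_param_named(send_queue_size, mad_sendq_size, int, 0444);
          MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests");
      
          static int mad_recvq_size = IB_MAD_QP_RECV_SIZE;
          module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
          MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");
      
      With something like the above, a longer receive queue would be requested
      at load time, e.g. "modprobe ib_mad recv_queue_size=512".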
    • IB/mad: Fix possible lock-lock-timer deadlock · 6b2eef8f
      Committed by Roland Dreier
      Lockdep reported a possible deadlock with cm_id_priv->lock,
      mad_agent_priv->lock and mad_agent_priv->timed_work.timer; this
      happens because the mad module does
      
      	cancel_delayed_work(&mad_agent_priv->timed_work);
      
      while holding mad_agent_priv->lock.  cancel_delayed_work() internally
      does del_timer_sync(&mad_agent_priv->timed_work.timer).
      
      This can turn into a deadlock because mad_agent_priv->lock is taken
      inside cm_id_priv->lock, so we can get the following set of contexts
      that deadlock each other:
      
       A: holding cm_id_priv->lock, waiting for mad_agent_priv->lock
       B: holding mad_agent_priv->lock, waiting for del_timer_sync()
       C: interrupt during mad_agent_priv->timed_work.timer that takes
          cm_id_priv->lock
      
      Fix this by using the new __cancel_delayed_work() interface (which
      internally does del_timer() instead of del_timer_sync()) in all the
      places where we are holding a lock.
      
      Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757
      Reported-by: Bart Van Assche <bart.vanassche@gmail.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
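      
      The pattern in question, sketched with assumed surrounding code:
      
          unsigned long flags;
      
          spin_lock_irqsave(&mad_agent_priv->lock, flags);
          /* cancel_delayed_work() implies del_timer_sync(), which can
           * deadlock against a timer handler that takes a lock we nest
           * under; __cancel_delayed_work() only does del_timer(). */
          __cancel_delayed_work(&mad_agent_priv->timed_work);
          spin_unlock_irqrestore(&mad_agent_priv->lock, flags);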
  15. 06 Sep 2009, 4 commits
  16. 24 Jun 2009, 1 commit
  17. 16 Jun 2009, 1 commit
    • infiniband: remove driver_data direct access of struct device · 3f7c58a0
      Committed by Greg Kroah-Hartman
      In the near future, the driver core is going to not allow direct access
      to the driver_data pointer in struct device.  Instead, the functions
      dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
      have been around since the beginning, so are backwards compatible with
      all older kernel versions.
      
      
      Cc: general@lists.openfabrics.org
      Cc: Roland Dreier <rolandd@cisco.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
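      
      The conversion is mechanical; a before/after sketch with an
      illustrative private-data type:
      
          struct ib_umad_port *port;      /* illustrative type */
      
          /* before: direct access to the driver_data pointer */
          dev->driver_data = port;
          port = dev->driver_data;
      
          /* after: use the accessors */
          dev_set_drvdata(dev, port);
          port = dev_get_drvdata(dev);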
  18. 09 Apr 2009, 1 commit
    • RDMA/cma: Create cm id even when IB port is down · d2ca39f2
      Committed by Yossi Etigin
      When doing rdma_resolve_addr(), if the relevant IB port is down, the
      function fails and the cm_id is not bound to the correct device.
      Therefore, the application does not have a device handle and cannot
      wait for the port to become active.  The function fails because the
      underlying IPoIB interface is not joined to the broadcast group and
      therefore the SA does not have a multicast record to take a Q_Key
      from.
      
      The fix is to use lazy Q_Key resolution - cma_set_qkey() will set
      id_priv->qkey if it was not set, and will be called just before the
      Q_Key is really required.
      Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
      Acked-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
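      
      A hedged sketch of the lazy-resolution shape this describes (the
      helper called at the end is hypothetical; only cma_set_qkey() and
      id_priv->qkey are named in the text above):
      
          static int cma_set_qkey(struct rdma_id_private *id_priv)
          {
              if (id_priv->qkey)
                  return 0;   /* already resolved */
      
              /* Resolve only now that it is needed, e.g. from the
               * multicast member record of the IPoIB broadcast group;
               * this can work later even if the port was down when
               * rdma_resolve_addr() was called. */
              return cma_lookup_qkey(id_priv);    /* hypothetical helper */
          }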
  19. 02 Apr 2009, 1 commit
  20. 05 Mar 2009, 1 commit
  21. 04 Mar 2009, 2 commits
    • IB/sa_query: Fix AH leak due to update_sm_ah() race · 6b708b3d
      Committed by Jack Morgenstein
      Our testing uncovered a race condition in ib_sa_event():
      
      	spin_lock_irqsave(&port->ah_lock, flags);
      	if (port->sm_ah)
      		kref_put(&port->sm_ah->ref, free_sm_ah);
      	port->sm_ah = NULL;
      	spin_unlock_irqrestore(&port->ah_lock, flags);
      
      	schedule_work(&sa_dev->port[event->element.port_num -
      				    sa_dev->start_port].update_task);
      
      If two events occur back-to-back (e.g., client-reregister and LID
      change), both may pass the spinlock-protected code above before the
      scheduled work updates the port->sm_ah handle.  Then if the scheduled
      work ends up running twice, the second operation will then find a
      non-NULL port->sm_ah, and will simply overwrite it in update_sm_ah --
      resulting in an AH leak.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
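      
      One way to close that window, sketched with the names from the snippet
      above (not the patch verbatim): release any handle that is already
      installed under the same lock that publishes the new one, so a second
      run of update_sm_ah() cannot silently overwrite it.
      
          spin_lock_irq(&port->ah_lock);
          if (port->sm_ah)    /* a second run may find one already set */
              kref_put(&port->sm_ah->ref, free_sm_ah);
          port->sm_ah = new_ah;
          spin_unlock_irq(&port->ah_lock);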
    • IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp · 4780c195
      Committed by Ralph Campbell
      If ib_post_send_mad() returns 0, the API guarantees that there will be
      a callback to send_buf->mad_agent->send_handler() so that the sender
      can call ib_free_send_mad().  Otherwise, the ib_mad_send_buf will be
      leaked and the mad_agent reference count will never go to zero and the
      IB device module cannot be unloaded.  The above can happen without
      this patch if process_mad() returns (IB_MAD_RESULT_SUCCESS |
      IB_MAD_RESULT_CONSUMED).
      
      If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent
      registered to receive the mad being sent, handle_outgoing_dr_smp()
      returns zero, which causes a MAD packet at the end of the directed
      route to be incorrectly sent on the wire; it doesn't cause a hang,
      though, since the HCA generates a send completion.
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
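      
      The caller-side contract the text relies on, sketched (buffer setup
      omitted):
      
          ret = ib_post_send_mad(send_buf, NULL);
          if (ret) {
              /* no send completion will be generated for this buffer,
               * so the caller must release it itself */
              ib_free_send_mad(send_buf);
          }
          /* on success (ret == 0) the buffer is freed later, from
           * send_buf->mad_agent->send_handler() */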
  22. 28 Feb 2009, 2 commits
    • IB/mad: initialize mad_agent_priv before putting on lists · d9620a4c
      Committed by Ralph Campbell
      There is a potential race in ib_register_mad_agent() where the struct
      ib_mad_agent_private is not fully initialized before it is added to
      the list of agents per IB port. This means the ib_mad_agent_private
      could be seen before the refcount, spin locks, and linked lists are
      initialized.  The fix is to initialize the structure earlier.
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
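      
      The fix follows the usual initialize-before-publish rule; a hedged
      sketch (field names assumed to resemble the mad code):
      
          /* finish setting up the agent ... */
          spin_lock_init(&mad_agent_priv->lock);
          INIT_LIST_HEAD(&mad_agent_priv->send_list);
          INIT_LIST_HEAD(&mad_agent_priv->wait_list);
          atomic_set(&mad_agent_priv->refcount, 1);
      
          /* ... and only then make it visible on the per-port list */
          spin_lock_irqsave(&port_priv->reg_lock, flags);
          list_add_tail(&mad_agent_priv->agent_list, &port_priv->agent_list);
          spin_unlock_irqrestore(&port_priv->reg_lock, flags);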
    • IB/mad: Fix null pointer dereference in local_completions() · 1d9bc6d6
      Committed by Ralph Campbell
      handle_outgoing_dr_smp() can queue a struct ib_mad_local_private
      *local on the mad_agent_priv->local_work work queue with
      local->mad_priv == NULL if device->process_mad() returns
      IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and
      (!ib_response_mad(&mad_priv->mad.mad) ||
      !mad_agent_priv->agent.recv_handler).
      
      In this case, local_completions() will be called with local->mad_priv
      == NULL. The code does check for this case and skips calling
      recv_mad_agent->agent.recv_handler(), but recv == 0, so
      kmem_cache_free() is called with a NULL pointer.
      
      Also, since recv isn't reinitialized each time through the loop, it
      can cause a memory leak if recv should have been zero.
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
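      
      A hedged sketch of the corrected loop shape (names assumed to resemble
      the mad code): the flag is reset for every entry, it is only set when
      there is a mad_priv that could not be delivered, and the free is
      guarded by it, so kmem_cache_free() never sees a NULL pointer and
      undelivered buffers are no longer leaked.
      
          while (!list_empty(&mad_agent_priv->local_list)) {
              int free_mad = 0;   /* reset for every entry */
              /* ... dequeue 'local' ... */
      
              if (local->mad_priv) {
                  if (local->recv_mad_agent) {
                      /* hand local->mad_priv to the agent's
                       * recv_handler() as before */
                  } else {
                      free_mad = 1;   /* nobody to deliver to */
                  }
              }
      
              /* ... complete the send side ... */
      
              if (free_mad)
                  kmem_cache_free(ib_mad_cache, local->mad_priv);
          }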