1. 02 May, 2013 1 commit
  2. 23 Mar, 2013 1 commit
    • IPoIB: Fix send lockup due to missed TX completion · 1ee9e2aa
      Committed by Mike Marciniszyn
      Commit f0dc117a ("IPoIB: Fix TX queue lockup with mixed UD/CM
      traffic") attempts to solve an issue where unprocessed UD send
      completions can deadlock the netdev.
      
      That patch doesn't fully resolve the issue: if more than half of
      the tx_outstanding entries are UD and all of the destinations are
      RC reachable, arming the CQ doesn't help.
      
      This patch passes the IB_CQ_REPORT_MISSED_EVENTS flag to
      ib_req_notify_cq().  If the return code is greater than 0, the UD send
      CQ completion callback is called directly to re-arm the send
      completion timer.
      
      This issue is seen in very large parallel filesystem deployments
      and the patch has been shown to correct the issue.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Dean Luick <dean.luick@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
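The return-code handling described in this commit message can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the patch itself: `mock_cq`, `mock_req_notify_cq()`, `mock_send_comp_handler()` and `arm_send_cq()` are hypothetical stand-ins, and the mock only imitates the return convention of `ib_req_notify_cq()` (negative on error, 0 on success, greater than 0 when completions may have been missed and the report flag is set).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; these are not the kernel verbs API. */
enum {
    MOCK_CQ_NEXT_COMP            = 1 << 0,
    MOCK_CQ_REPORT_MISSED_EVENTS = 1 << 1,
};

struct mock_cq {
    int pending;        /* completions queued but not yet polled */
    int armed;          /* set once a notification has been requested */
    int handler_calls;  /* how often the completion callback ran */
};

/*
 * Imitates the return convention of ib_req_notify_cq():
 *   < 0  error
 *     0  notification armed
 *   > 0  armed, but completions were already pending (only reported
 *        when the "report missed events" flag is set)
 */
static int mock_req_notify_cq(struct mock_cq *cq, int flags)
{
    if (!(flags & MOCK_CQ_NEXT_COMP))
        return -1;
    cq->armed = 1;
    if ((flags & MOCK_CQ_REPORT_MISSED_EVENTS) && cq->pending > 0)
        return 1;
    return 0;
}

/* Stand-in for the UD send CQ completion callback. Per the commit
 * message the real callback re-arms the send completion timer; the
 * mock simply records the call and drains the queue. */
static void mock_send_comp_handler(struct mock_cq *cq)
{
    cq->handler_calls++;
    cq->pending = 0;
}

/* Arm the CQ and call the callback directly if events were missed. */
static int arm_send_cq(struct mock_cq *cq)
{
    int rc;

    assert(cq);
    rc = mock_req_notify_cq(cq, MOCK_CQ_NEXT_COMP |
                                MOCK_CQ_REPORT_MISSED_EVENTS);
    if (rc < 0)
        fprintf(stderr, "request notify on send CQ failed\n");
    else if (rc > 0)
        mock_send_comp_handler(cq);
    return rc;
}
```

Without the report flag, the mock returns 0 even when completions are already pending, which corresponds to the case the commit message describes where arming the CQ alone does not help.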
  3. 26 Feb, 2013 5 commits
    • IPoIB: Free ipoib neigh on path record failure so path rec queries are retried · f72dd566
      Committed by Roland Dreier
      If IPoIB fails to look up a path record (e.g., if it tries during an SM
      failover when one SM is dead but the new one hasn't taken over yet), the
      driver ends up with a neighbour structure but no address handle (AH).
      There's no mechanism to recover from this: any further packets sent to
      this destination will be silently dumped in ipoib_start_xmit().
      
      Fix this by freeing the neighbour structures when a path rec query
      fails, so that the next packet queued to be sent will trigger a new path
      record query.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/srp: Fail I/O requests if the transport is offline · 2ce19e72
      Committed by Bart Van Assche
      If an SRP target is no longer reachable and srp_reset_host() fails to
      reconnect then ib_srp will invoke scsi_remove_host().  That function
      will invoke __scsi_remove_device() for each LUN.  And that last
      function will change the device state from SDEV_TRANSPORT_OFFLINE into
      SDEV_CANCEL.  Certain user space software, e.g. older versions of
      multipathd, continue queueing I/O to SCSI devices that are in the
      SDEV_CANCEL state.
      
      If these I/O requests are submitted as SG_IO that means that the
      REQ_PREEMPT flag will be set and hence that these requests will be
      passed to srp_queuecommand().  These requests will time out.  If new
      requests are queued fast enough from user space these active requests
      will prevent __scsi_remove_device() from finishing.
      
      Avoid this by failing I/O requests in the SDEV_CANCEL state if the
      transport is offline.  Introduce a new variable to keep track of the
      transport state instead of failing requests if (!target->connected ||
      target->qp_in_error), so that the SCSI error handler has a chance to
      retry commands after a transport layer failure occurred.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
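The design choice in the last paragraph of this commit message (a dedicated transport-state variable rather than a check on `!target->connected || target->qp_in_error`) can be contrasted in a small userspace sketch. Everything here is hypothetical: `mock_target`, `mock_disposition` and both functions are illustrative names, not the ib_srp code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-target state mentioned in the message. */
struct mock_target {
    bool connected;
    bool qp_in_error;
    bool transport_offline;  /* the "new variable" tracking transport state */
};

enum mock_disposition { MOCK_QUEUED, MOCK_FAILED_FAST };

/* The alternative the commit message rejects: fail as soon as the
 * connection looks unhealthy, even if recovery is still possible. */
static enum mock_disposition queue_rejected_alternative(const struct mock_target *t)
{
    assert(t);
    return (!t->connected || t->qp_in_error) ? MOCK_FAILED_FAST : MOCK_QUEUED;
}

/* The described approach: fail only once the transport is known to be
 * offline, so the error handler still gets a chance to retry commands
 * after a transient transport layer failure. */
static enum mock_disposition queue_with_offline_flag(const struct mock_target *t)
{
    assert(t);
    return t->transport_offline ? MOCK_FAILED_FAST : MOCK_QUEUED;
}
```

For a target with `qp_in_error` set but `transport_offline` clear, only the rejected alternative fails the request immediately.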
    • IB/srp: Avoid endless SCSI error handling loop · c7c4e7ff
      Committed by Bart Van Assche
      If a SCSI command times out it is passed to the SCSI error
      handler. The SCSI error handler will try to abort the commands that
      timed out.  If aborting fails, a device reset will be attempted.  If
      the device reset also fails a host reset will be attempted.  If the
      host reset also fails the whole procedure will be repeated.
      
      srp_abort() and srp_reset_device() fail for a QP in the error state.
      srp_reset_host() fails after host removal has started.  Hence if the
      SCSI error handler gets invoked after host removal has started and
      with the QP in the error state an endless loop will be triggered.
      
      Modify the SCSI error handling functions in ib_srp as follows:
      - Abort SCSI commands properly even if the QP is in the error state.
      - Make srp_reset_host() reset SCSI requests even after host removal
        has already started or if reconnecting fails.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: David Dillow <dave@thedillows.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
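The escalation loop described in this commit message can be modelled in a few lines of userspace C. This is a simplified sketch of the message's own description (abort, then device reset, then host reset, then repeat), not the SCSI midlayer or ib_srp code; all names are hypothetical, and the `fixed` flag only toggles the two behaviours the message says were changed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the target state that drives the handlers. */
struct mock_target {
    bool qp_in_error;
    bool removal_started;
    bool fixed;  /* true: handlers behave as the commit message describes */
};

enum eh_result { EH_SUCCESS, EH_FAILED };

/* Before the change, aborting fails for a QP in the error state. */
static enum eh_result mock_abort(const struct mock_target *t)
{
    return (t->qp_in_error && !t->fixed) ? EH_FAILED : EH_SUCCESS;
}

/* A device reset still fails for a QP in the error state. */
static enum eh_result mock_reset_device(const struct mock_target *t)
{
    return t->qp_in_error ? EH_FAILED : EH_SUCCESS;
}

/* Before the change, a host reset fails once host removal has started. */
static enum eh_result mock_reset_host(const struct mock_target *t)
{
    return (t->removal_started && !t->fixed) ? EH_FAILED : EH_SUCCESS;
}

/* Runs the escalation sequence; returns the round in which some step
 * succeeded, or -1 if max_rounds passed without any success. */
static int run_error_handler(const struct mock_target *t, int max_rounds)
{
    assert(max_rounds > 0);
    for (int round = 1; round <= max_rounds; round++) {
        if (mock_abort(t) == EH_SUCCESS)
            return round;
        if (mock_reset_device(t) == EH_SUCCESS)
            return round;
        if (mock_reset_host(t) == EH_SUCCESS)
            return round;
    }
    return -1;
}
```

The `max_rounds` bound exists only so the sketch terminates; in the scenario the message describes there is no such bound, which is why the loop is endless.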
    • IB/srp: Avoid sending a task management function needlessly · 3780d1f0
      Committed by Bart Van Assche
      Do not send a task management function if sending will fail anyway
      because either there is no RDMA/RC connection or the QP is in the
      error state.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: David Dillow <dave@thedillows.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/srp: Track connection state properly · e1b2f13a
      Committed by Bart Van Assche
      Remove an assignment that incorrectly overwrites the connection state
      update made by srp_connect_target().
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: David Dillow <dave@thedillows.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  4. 22 Feb, 2013 3 commits
  5. 20 Feb, 2013 3 commits
  6. 06 Feb, 2013 1 commit
    • IPoIB: Fix crash due to skb double destruct · 7e5a90c2
      Committed by Shlomo Pongratz
      After commit b13912bb ("IPoIB: Call skb_dst_drop() once skb is
      enqueued for sending"), using connected mode and running multithreaded
      iperf for a long time, i.e.
      
          iperf -c <IP> -P 16 -t 3600
      
      results in a crash.
      
      After the above-mentioned patch, the driver is calling skb_orphan() and
      skb_dst_drop() after calling post_send() in ipoib_cm.c::ipoib_cm_send()
      (and also in ipoib_ib.c::ipoib_send()).
      
      The problem with this is, as is written in a comment in both routines,
      "it's entirely possible that the completion handler will run before we
      execute anything after the post_send()."  This leads to running the
      skb cleanup routines simultaneously in two different contexts.
      
      The solution is to always perform the skb_orphan() and skb_dst_drop()
      before queueing the send work request.  If an error occurs, the result
      is no different from the regular case, where dev_kfree_skb_any() runs
      in the completion path, which is assumed to come after these two routines.
      Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
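The ordering problem in this commit message can be illustrated with a single-threaded userspace model. This is a sketch under stated assumptions: `mock_skb` and the functions below are hypothetical, the "early completion" is simulated by running the completion inside the post call, and the real bug involves two contexts running concurrently rather than a sequential call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a packet buffer and its cleanup state. */
struct mock_skb {
    bool orphaned;
    bool dst_dropped;
    bool freed;              /* set by the completion path */
    int  touched_after_free; /* counts cleanup attempts on a freed buffer */
};

/* Stand-in for the skb_orphan() + skb_dst_drop() pair. */
static void mock_orphan_and_drop_dst(struct mock_skb *skb)
{
    assert(skb);
    if (skb->freed) {
        skb->touched_after_free++;  /* the hazard the commit removes */
        return;
    }
    skb->orphaned = true;
    skb->dst_dropped = true;
}

/* Stand-in for the completion path releasing the buffer. */
static void mock_completion(struct mock_skb *skb)
{
    skb->freed = true;
}

/* Posts the send; the completion may run before this returns. */
static int mock_post_send(struct mock_skb *skb, bool completes_early)
{
    if (completes_early)
        mock_completion(skb);
    return 0;
}

/* Ordering after commit b13912bb: cleanup follows the post. */
static void send_cleanup_after_post(struct mock_skb *skb, bool early)
{
    mock_post_send(skb, early);
    mock_orphan_and_drop_dst(skb);
}

/* Ordering described as the solution: cleanup precedes the post. */
static void send_cleanup_before_post(struct mock_skb *skb, bool early)
{
    mock_orphan_and_drop_dst(skb);
    mock_post_send(skb, early);
}
```

With an early completion, only the cleanup-before-post ordering leaves `touched_after_free` at zero.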
  7. 07 Jan, 2013 1 commit
  8. 20 Dec, 2012 1 commit
    • IPoIB: Call skb_dst_drop() once skb is enqueued for sending · b13912bb
      Committed by Roland Dreier
      Currently, IPoIB delays collecting send completions for TX packets in
      order to batch work more efficiently.  It does skb_orphan() right after
      queuing the packets so that destructors run early, to avoid problems
      like holding socket send buffers for too long (since we might not
      collect a send completion until a long time after the packet is
      actually sent).
      
      However, IPoIB clears IFF_XMIT_DST_RELEASE because it actually looks
      at skb_dst() to update the PMTU when it gets a too-long packet.  This
      means that the packets sitting in the TX ring with uncollected send
      completions are holding a reference on the dst.  We've seen this lead
      to pathological behavior with respect to route and neighbour GC.  The
      easy fix for this is to call skb_dst_drop() when we call skb_orphan().
      
      Also, give packets sent via connected mode (CM) the same skb_orphan()
      / skb_dst_drop() treatment that packets sent via datagram mode get.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  9. 01 Dec, 2012 12 commits
  10. 29 Nov, 2012 1 commit
  11. 28 Nov, 2012 1 commit
    • ib_srpt: Convert I/O path to target_submit_cmd + drop legacy ioctx->kref · 9474b043
      Committed by Nicholas Bellinger
      This patch converts the main srpt_handle_cmd() I/O path to use modern
      target_submit_cmd() with TARGET_SCF_ACK_KREF flag usage.  This includes
      dropping the original internal ioctx->kref + srpt_put_send_ioctx() usage
      in favor of target_put_sess_cmd() w/ se_cmd_t->cmd_kref within ib_srpt
      response callbacks.
      
      It also updates srpt_abort_cmd() to call target_put_sess_cmd() for
      completion of aborted commands, and adds target_wait_for_sess_cmds() into
      srpt_release_channel_work() to allow outstanding I/O to complete during
      session shutdown.
      
      Also, go ahead and update srpt_handle_tsk_mgmt() so that the remaining
      transport_init_se_cmd() call sets up ioctx->cmd with se_tmr_req.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Roland Dreier <roland@kernel.org>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
  12. 07 Nov, 2012 1 commit
    • target: pass sense_reason as a return value · de103c93
      Committed by Christoph Hellwig
      Pass the sense reason as an explicit return value from the I/O submission
      path instead of storing it in struct se_cmd and using negative return
      values.  This cleans up a lot of the code paths, and with the sparse
      annotations for the new sense_reason_t type allows for much better
      error checking.
      
      (nab: Convert spc_emulate_modesense + spc_emulate_modeselect to use
            sense_reason_t with Roland's MODE SELECT changes)
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
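The refactoring pattern this commit message describes (returning the sense reason instead of storing it in the command and returning a negative value) can be sketched as below. The names are hypothetical: `mock_sense_reason_t`, the `MOCK_*` values and both parse functions are illustrations, not the target core's definitions, and a plain `unsigned int` typedef does not give the sparse checking the message mentions.

```c
#include <assert.h>

/* Hypothetical stand-in for a dedicated sense-reason type. */
typedef unsigned int mock_sense_reason_t;

enum {
    MOCK_NO_SENSE           = 0,
    MOCK_UNSUPPORTED_OPCODE = 1,
    MOCK_INVALID_CDB_FIELD  = 2,
};

struct mock_cmd {
    unsigned char opcode;
    unsigned int  length;
    unsigned int  legacy_sense_reason;  /* only used by the old style */
};

/* Old style: negative return value, reason stashed in the command. */
static int parse_cdb_old(struct mock_cmd *cmd)
{
    assert(cmd);
    if (cmd->opcode != 0x28) {  /* 0x28 is READ(10) */
        cmd->legacy_sense_reason = MOCK_UNSUPPORTED_OPCODE;
        return -1;
    }
    if (cmd->length == 0) {
        cmd->legacy_sense_reason = MOCK_INVALID_CDB_FIELD;
        return -1;
    }
    return 0;
}

/* New style: the reason is the return value, so the command can stay
 * const and callers cannot forget to read a side channel. */
static mock_sense_reason_t parse_cdb_new(const struct mock_cmd *cmd)
{
    assert(cmd);
    if (cmd->opcode != 0x28)
        return MOCK_UNSUPPORTED_OPCODE;
    if (cmd->length == 0)
        return MOCK_INVALID_CDB_FIELD;
    return MOCK_NO_SENSE;
}
```

A caller of the new style can propagate the returned value directly, whereas the old style needs two reads (the return code and the stored field) to learn the same thing.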
  13. 04 Oct, 2012 1 commit
  14. 03 Oct, 2012 1 commit
  15. 02 Oct, 2012 1 commit
    • IB/ipoib: Add more rtnl_link_ops callbacks · 862096a8
      Committed by Or Gerlitz
      Add the rtnl_link_ops changelink and fill_info callbacks, through
      which the admin can now set and get policies such as the driver mode.
      Maintain the proprietary sysfs entries only for legacy children.
      
      For child devices, set dev->iflink to point to the parent
      device ifindex, so that user space tools can now correctly
      show the uplink relation, as is done for vlan, macvlan, and similar
      devices.  Pointed out by Patrick McHardy <kaber@trash.net>.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 01 Oct, 2012 3 commits
  17. 21 Sep, 2012 1 commit
    • IB/ipoib: Add rtnl_link_ops support · 9baa0b03
      Committed by Or Gerlitz
      Add rtnl_link_ops to IPoIB, with the first usage being child device
      create/delete through them. Child devices are now either legacy ones,
      created/deleted through the ipoib sysfs entries, or RTNL ones.
      
      Adding support for RTNL children involved refactoring ipoib_vlan_add,
      which is now used by both the sysfs and the link_ops code.
      
      Also add an ndo_uninit entry to support calling unregister_netdevice_queue
      from the rtnl dellink entry. This required removing the calls to
      ipoib_dev_cleanup from driver flows that use unregister_netdevice,
      since the networking core will invoke ipoib_uninit, which does exactly that.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 18 Sep, 2012 2 commits