1. 08 Jul, 2013 2 commits
    • iser-target: Add vendor_err debug output · c5a2adbf
      Nicholas Bellinger committed
      Add output for ib_wc.vendor_err in isert_cq_[t,r]x_work(), which
      is useful for debugging future issues.
      Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
    • iser-target: Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED · b2cb9649
      Nicholas Bellinger committed
      This patch addresses a bug where RDMA_CM_EVENT_DISCONNECTED may occur
      before the connection shutdown has been completed by rx/tx threads,
      which causes isert_free_conn() to wait indefinitely on ->conn_wait.
      
      This patch allows isert_disconnect_work code to invoke rdma_disconnect
      when isert_disconnect_work() process context is started by client
      session reset before isert_free_conn() code has been reached.
      
      It also adds isert_conn->conn_mutex protection for ->state within
      isert_disconnect_work(), isert_cq_comp_err() and isert_free_conn()
      code, along with isert_check_state() for wait_event usage.
      
      (v2: Add explicit iscsit_cause_connection_reinstatement call
           during isert_disconnect_work() to force conn reset)
      
      Cc: stable@vger.kernel.org  # 3.10+
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
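      The locking pattern described in this commit (connection state read and
      written only under a mutex, plus a small check helper used as the wait
      condition) can be illustrated in user space. This is a minimal sketch and
      not the driver code: it assumes POSIX threads in place of the kernel mutex
      and wait queue, and struct toy_conn, toy_check_state(), toy_set_state() and
      toy_wait_for_state() are hypothetical names chosen for illustration.

      ```c
      #include <pthread.h>
      #include <stdbool.h>

      /* Hypothetical connection states; the real driver uses its own enum. */
      enum toy_state { TOY_CONN_UP, TOY_CONN_TERMINATING, TOY_CONN_DOWN };

      struct toy_conn {
          pthread_mutex_t lock;   /* stands in for isert_conn->conn_mutex */
          pthread_cond_t  wait;   /* stands in for ->conn_wait */
          enum toy_state  state;
      };

      static void toy_conn_init(struct toy_conn *c)
      {
          pthread_mutex_init(&c->lock, NULL);
          pthread_cond_init(&c->wait, NULL);
          c->state = TOY_CONN_UP;
      }

      /* Sample the state under the lock, so a waiter never sees a torn update. */
      static bool toy_check_state(struct toy_conn *c, enum toy_state want)
      {
          bool match;

          pthread_mutex_lock(&c->lock);
          match = (c->state == want);
          pthread_mutex_unlock(&c->lock);
          return match;
      }

      /* Change the state under the lock and wake every waiter. */
      static void toy_set_state(struct toy_conn *c, enum toy_state next)
      {
          pthread_mutex_lock(&c->lock);
          c->state = next;
          pthread_cond_broadcast(&c->wait);
          pthread_mutex_unlock(&c->lock);
      }

      /* Block until the state matches; returns at once if it already does. */
      static void toy_wait_for_state(struct toy_conn *c, enum toy_state want)
      {
          pthread_mutex_lock(&c->lock);
          while (c->state != want)
              pthread_cond_wait(&c->wait, &c->lock);
          pthread_mutex_unlock(&c->lock);
      }
      ```

      Because every transition goes through toy_set_state(), a disconnect that
      arrives before the waiter starts is not lost: the waiter rechecks the state
      under the lock before sleeping.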
  2. 07 Jul, 2013 4 commits
    • iscsi-target: Fix ISCSI_OP_SCSI_TMFUNC handling for iser · 186a9647
      Nicholas Bellinger committed
      This patch adds target_get_sess_cmd reference counting for
      iscsit_handle_task_mgt_cmd(), and adds a target_put_sess_cmd()
      for the failure case.
      
      It also fixes a bug where ISCSI_OP_SCSI_TMFUNC type commands
      were leaking iscsi_cmd->i_conn_node and eventually triggering
      an OOPS during struct isert_conn shutdown.
      
      Cc: stable@vger.kernel.org  # 3.10+
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
    • iscsi-target: Fix iscsit_sequence_cmd reject handling for iser · 561bf158
      Nicholas Bellinger committed
      This patch moves ISCSI_OP_REJECT failures into iscsit_sequence_cmd()
      in order to avoid external iscsit_reject_cmd() reject usage for all
      PDU types.
      
      It also updates PDU specific handlers for traditional iscsi-target
      code to not reset the session after posting an ISCSI_OP_REJECT during
      setup.
      
      (v2: Fix CMDSN_LOWER_THAN_EXP for ISCSI_OP_SCSI to call
           target_put_sess_cmd() after iscsit_sequence_cmd() failure)
      
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: stable@vger.kernel.org  # 3.10+
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
    • iscsi-target: Fix iscsit_add_reject* usage for iser · ba159914
      Nicholas Bellinger committed
      This patch changes iscsit_add_reject() + iscsit_add_reject_from_cmd()
      usage to not sleep on iscsi_cmd->reject_comp to address a use-after-free
      bug in v3.10 with iser-target code.
      
      It saves ->reject_reason for use within iscsit_build_reject() so the
      correct value is used in both transport cases.  It also drops the legacy
      fail_conn parameter usage throughout iscsi-target code and adds
      two helper functions, iscsit_add_reject_cmd() and iscsit_reject_cmd(),
      along with various small cleanups.
      
      (v2: Re-enable target_put_sess_cmd() to be called from
           iscsit_add_reject_from_cmd() for rejects invoked after
           target_get_sess_cmd() has been called)
      
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: stable@vger.kernel.org  # 3.10+
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
    • iser-target: Fix isert_put_reject payload buffer post · 3df8f68a
      Nicholas Bellinger committed
      This patch adds the missing isert_put_reject() logic to post
      an outgoing payload buffer to hold the 48 bytes of original PDU
      header request payload for the rejected cmd.
      
      It also fixes ISTATE_SEND_REJECT handling in isert_response_completion()
      -> isert_do_control_comp() code, and drops incorrect iscsi_cmd_t->reject_comp
      usage.
      
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: stable@vger.kernel.org  # 3.10+
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
  3. 25 Jun, 2013 1 commit
    • iscsi/isert-target: Refactor ISCSI_OP_NOOP RX handling · 778de368
      Nicholas Bellinger committed
      This patch refactors ISCSI_OP_NOOP handling within iscsi-target in
      order to handle iscsi_nopout payloads in a transport specific manner.
      
      This includes splitting the existing iscsit_handle_nop_out() into
      iscsit_setup_nop_out() and iscsit_process_nop_out() calls, so that
      iscsit_handle_nop_out() is used only internally by traditional
      iscsi socket calls.
      
      Next, update iser-target code to use the new callers and add a FIXME for
      the handling of iscsi_nopout payloads.  Also fix reject response handling
      in iscsit_setup_nop_out() to use proper iscsit_add_reject_from_cmd().
      
      v2: Fix uninitialized iscsit_handle_nop_out() payload_length usage (Fengguang)
      v3: Remove left-over dead code in iscsit_setup_nop_out() (DanC)
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
  4. 05 Jun, 2013 2 commits
  5. 30 May, 2013 1 commit
    • ib_srpt: Call target_sess_cmd_list_set_waiting during shutdown_session · 1d19f780
      Nicholas Bellinger committed
      Given that srpt_release_channel_work() calls target_wait_for_sess_cmds()
      to allow outstanding se_cmd_t->cmd_kref a chance to complete, the call
      to perform target_sess_cmd_list_set_waiting() needs to happen in
      srpt_shutdown_session().
      
      Also, this patch adds an explicit call to srpt_shutdown_session() within
      srpt_drain_channel() so that target_sess_cmd_list_set_waiting() will be
      called in the cases where TFO->shutdown_session() is not triggered
      directly by TCM.
      
      Cc: Joern Engel <joern@logfs.org>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
  6. 21 May, 2013 1 commit
  7. 08 May, 2013 1 commit
  8. 02 May, 2013 4 commits
  9. 25 Apr, 2013 1 commit
    • iser-target: Add iSCSI Extensions for RDMA (iSER) target driver · b8d26b3b
      Nicholas Bellinger committed
      This patch adds support for iSCSI Extensions for RDMA target mode,
      and includes CQ pooling per isert_device context distributed across
      multiple active iser target sessions.
      
      It also uses cmwq process context for RX / TX ib_poll_cq() polling
      via isert_cq_desc->cq_[rx,tx]_work invoked by isert_cq_[rx,tx]_callback()
      hardIRQ context callbacks.
      
      v5 changes:
      
      - Use ISER_RECV_DATA_SEG_LEN instead of hardcoded value in ISER_RX_PAD_SIZE (Or)
      - Fix make W=1 warnings (Or)
      - Add missing depends on NET && INFINIBAND_ADDR_TRANS in Kconfig (Randy + Or)
      - Make isert_device_find_by_ib_dev() return proper ERR_PTR (Wei Yongjun)
      - Properly setup iscsi_np->np_sockaddr in isert_setup_np() (Shlomi + nab)
      - Add special case for early ISCSI_OP_SCSI_CMD exception handling (nab)
      
      v4 changes:
      - Mark isert_cq_rx_work as static (Or)
      - Drop unnecessary ib_dma_sync_single_for_cpu + ib_dma_sync_single_for_device
        calls for isert_cmd->sense_buf_dma from isert_put_response (Or)
      - Use 12288 for ISER_RX_PAD_SIZE base to save extra page per
        struct iser_rx_desc (Or + nab)
      - Drop now unnecessary isert_rx_desc usage, and convert RX users to
        iser_rx_desc (Or + nab)
      - Move isert_[alloc,free]_rx_descriptors() ahead of
        isert_create_device_ib_res() usage (nab)
      - Mark isert_cq_[rx,tx]_callback() + prototypes as static
      - Fix 'warning: 'ret' may be used uninitialized' warning for
        isert_create_device_ib_res on powerpc allmodconfig (fengguang + nab)
      - Fix 'warning: 'ret' may be used uninitialized' warning for
        isert_connect_request on i386 allyesconfig (fengguang + nab)
      - Fix pr_debug conversion specification in isert_rx_completion()
        (fengguang + nab)
      - Drop unnecessary isert_conn->conn_cm_id != NULL check in
        isert_connect_release causing the build warning:
        "variable dereferenced before check 'isert_conn->conn_cm_id'"
      - Fix isert_lid + isert_np leak in isert_setup_np failure path
      - Add isert_conn->conn_wait_comp_err usage in isert_free_conn()
        for isert_cq_comp_err completion path
      - Add isert_conn->logout_posted bit to determine decrement of
        isert_conn->post_send_buf_count from logout response completion
      - Always set ISER_CONN_DOWN from isert_disconnect_work() callback
      
      v3 changes:
      
      - Convert to use per isert_cq_desc->cq_[rx,tx]_work + drop tasklets (Or + nab)
      - Move IB_EVENT_QP_LAST_WQE_REACHED warn into correct
        isert_qp_event_callback (Or)
      - Drop unnecessary IB_ACCESS_REMOTE_* access flag usage in
        isert_create_device_ib_res (Or)
      - Add common isert_init_send_wr(), and convert isert_put_* calls (Or)
      - Move verbs+core logic into a single ib_isert.[c,h]  (Or + nab)
      - Add kmem_cache isert_cmd_cache usage for descriptor allocation (nab)
      - Move common ib_post_send() logic used by isert_put_*() to
        isert_post_response() (nab)
      - Add isert_put_reject call in isert_response_queue() for posting
        ISCSI_REJECT response. (nab)
      - Add ISTATE_SEND_REJECT checking in isert_do_control_comp. (nab)
      
      v2 changes:
      
      - Drop unused ISERT_ADDR_ROUTE_TIMEOUT define
      - Add rdma_notify() call for IB_EVENT_COMM_EST in isert_qp_event_callback()
      - Make isert_query_device() less verbose
      - Drop unused RDMA_CM_EVENT_ADDR_ERROR and RDMA_CM_EVENT_ROUTE_ERROR
        cases from isert_cma_handler()
      - Drop unused rdma/ib_fmr_pool.h include
      - Update isert_conn_setup_qp() to assign cq based upon least used
      - Add isert_create_device_ib_res() to setup PD, CQs and MRs for each
        underlying struct ib_device, instead of using per isert_conn resources.
      - Add isert_free_device_ib_res() to release PD, CQs and MRs for each
        underlying struct ib_device.
      - Add isert_device_find_by_ib_dev()
      - Change isert_connect_request() to drop PD, CQs and MRs allocation,
        and use isert_device_find_by_ib_dev() instead.
      - Add isert_device_try_release()
      - Change isert_connect_release() to decrement cq_active_qps, and drop
        PD, CQs and MRs resource release.
      - Update isert_connect_release() to call isert_device_try_release()
      - Make isert_create_device_ib_res() determine device->cqs_used based
        upon num_online_cpus()
      - Drop misleading isert_dump_ib_wc() usage
      - Drop unused rdma/ib_fmr_pool.h include
      - Use proper xfer_len for login PDUs in isert_rx_completion()
      - Add isert_release_cmd() usage
      - Change isert_alloc_cmd() to setup iscsi_cmd.release_cmd() pointer
      - Change isert_put_cmd() to perform per iscsi_opcode specific release
        logic
      - Add isert_unmap_cmd() call for ISCSI_OP_SCSI_CMD from isert_put_cmd()
      - Change isert_send_completion() to call
        atomic_dec(&isert_conn->post_send_buf_count)
        based upon per iscsi_opcode logic
      - Drop ISTATE_REMOVE processing from isert_immediate_queue()
      - Drop ISTATE_SEND_DATAIN processing from isert_response_queue()
      - Drop ISTATE_SEND_STATUS processing from isert_response_queue()
      - Drop iscsit_transport->iscsit_unmap_cmd() and ->iscsit_free_cmd()
      - Convert iser_cq_tx_tasklet() to use struct isert_cq_desc pooling logic
      - Convert isert_cq_tx_callback() to use struct isert_cq_desc pooling
        logic
      - Convert iser_cq_rx_tasklet() to use struct isert_cq_desc pooling logic
      - Convert isert_cq_rx_callback() to use struct isert_cq_desc pooling
        logic
      - Add explicit iscsit_stop_dataout_timer() call to
        isert_do_rdma_read_comp()
      - Use isert_get_dataout() for iscsit_transport->iscsit_get_dataout()
        caller
      - Drop ISTATE_SEND_R2T processing from isert_immediate_queue()
      - Drop unused rdma/ib_fmr_pool.h include
      - Drop isert_cmd->cmd_kref in favor of se_cmd->cmd_kref usage
      - Add struct isert_device in order to support multiple EQs + CQ pooling
      - Add struct isert_cq_desc
      - Drop tasklets and cqs from isert_conn
      - Bump ISERT_MAX_CQ to 64
      - Various minor checkpatch fixes
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
  10. 18 Apr, 2013 1 commit
    • IPoIB: add support for TIPC protocol · dc850b0e
      Patrick McHardy committed
      Support TIPC in the IPoIB driver. Since IPoIB now keeps track of its own
      neighbour entries and doesn't require the packet to have a dst_entry
      anymore, the only necessary changes are to:
      
      - not drop multicast TIPC packets because of the unknown ethernet type
      - handle unicast TIPC packets similar to IPv4/IPv6 unicast packets
      
      in ipoib_start_xmit().
      
      An alternative would be to remove all ethertype limitations since they're
      not necessary anymore, all TIPC needs to know about is ARP and RARP since
      it wants to always perform "path find", even if a path is already known.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
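      The two ethertype decisions described in this commit can be sketched as a
      small classifier. This is an illustration under stated assumptions, not the
      code in ipoib_start_xmit(): the enum, the TOY_ constants and both function
      names are hypothetical, the ethertype values are the standard IEEE
      assignments, and the ARP/RARP branch is inferred from the "path find"
      remark in the message.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      /* Standard ethertype values (host byte order), renamed to avoid
       * clashing with system headers. */
      enum {
          TOY_ETH_IP   = 0x0800,
          TOY_ETH_ARP  = 0x0806,
          TOY_ETH_RARP = 0x8035,
          TOY_ETH_IPV6 = 0x86DD,
          TOY_ETH_TIPC = 0x88CA
      };

      /* Hypothetical outcome of the unicast transmit decision. */
      enum toy_xmit_action {
          TOY_XMIT_NEIGH_LOOKUP,  /* use the driver's own neighbour entry */
          TOY_XMIT_PATH_FIND,     /* always resolve a path first */
          TOY_XMIT_DROP           /* unknown ethertype */
      };

      /* Multicast: TIPC joins the set of types that are not dropped. */
      static bool toy_multicast_allowed(uint16_t ethertype)
      {
          switch (ethertype) {
          case TOY_ETH_IP:
          case TOY_ETH_IPV6:
          case TOY_ETH_ARP:
          case TOY_ETH_RARP:
          case TOY_ETH_TIPC:
              return true;
          default:
              return false;
          }
      }

      /* Unicast: TIPC is handled like IPv4/IPv6; ARP/RARP force a path find. */
      static enum toy_xmit_action toy_classify_unicast(uint16_t ethertype)
      {
          switch (ethertype) {
          case TOY_ETH_IP:
          case TOY_ETH_IPV6:
          case TOY_ETH_TIPC:
              return TOY_XMIT_NEIGH_LOOKUP;
          case TOY_ETH_ARP:
          case TOY_ETH_RARP:
              return TOY_XMIT_PATH_FIND;
          default:
              return TOY_XMIT_DROP;
          }
      }
      ```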
  11. 17 Apr, 2013 3 commits
  12. 23 Mar, 2013 1 commit
    • IPoIB: Fix send lockup due to missed TX completion · 1ee9e2aa
      Mike Marciniszyn committed
      Commit f0dc117a ("IPoIB: Fix TX queue lockup with mixed UD/CM
      traffic") attempts to solve an issue where unprocessed UD send
      completions can deadlock the netdev.
      
      The patch doesn't fully resolve the issue because if more than half
      the tx_outstanding's were UD and all of the destinations are RC
      reachable, arming the CQ doesn't solve the issue.
      
      This patch passes IB_CQ_REPORT_MISSED_EVENTS to
      ib_req_notify_cq().  If the return value is above 0, the UD send cq
      completion callback is called directly to re-arm the send completion timer.
      
      This issue is seen in very large parallel filesystem deployments
      and the patch has been shown to correct the issue.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Dean Luick <dean.luick@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
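      The idea of asking the arm call to report missed events, and running the
      handler directly when it does, can be shown with a toy completion queue.
      This is a sketch only: struct toy_cq and the toy_* functions are
      hypothetical stand-ins and do not call the verbs API; the only behaviour
      modelled is "arming returns a positive value when completions were already
      pending".

      ```c
      #include <stdbool.h>

      /* Toy model of a send completion queue. */
      struct toy_cq {
          int  pending;        /* completions queued but not yet processed */
          bool armed;          /* a notification has been requested */
          int  handler_calls;  /* how often the completion handler ran */
      };

      /* Arm the queue. With report_missed set, return 1 when completions
       * were already pending, i.e. an event may have been missed. */
      static int toy_req_notify(struct toy_cq *cq, bool report_missed)
      {
          cq->armed = true;
          return (report_missed && cq->pending > 0) ? 1 : 0;
      }

      /* Completion handler: drain everything that is pending. */
      static void toy_send_comp_handler(struct toy_cq *cq)
      {
          cq->handler_calls++;
          cq->pending = 0;
          cq->armed = false;
      }

      /* Arm, and if completions were missed, process them right away
       * instead of waiting for an event that will never arrive. */
      static void toy_arm_send_cq(struct toy_cq *cq)
      {
          if (toy_req_notify(cq, true) > 0)
              toy_send_comp_handler(cq);
      }
      ```

      Without the return-value check, a queue that already holds completions when
      it is armed would sit idle, which is the stall this commit describes.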
  13. 26 Feb, 2013 5 commits
    • IPoIB: Free ipoib neigh on path record failure so path rec queries are retried · f72dd566
      Roland Dreier committed
      If IPoIB fails to look up a path record (e.g. if it tries during an SM
      failover when one SM is dead but the new one hasn't taken over yet), the
      driver ends up with a neighbour structure but no address handle (AH).
      There's no mechanism to recover from this: any further packets sent to
      this destination will be silently dumped in ipoib_start_xmit().
      
      Fix this by freeing the neighbour structures when a path rec query
      fails, so that the next packet queued to be sent will trigger a new path
      record query.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/srp: Fail I/O requests if the transport is offline · 2ce19e72
      Bart Van Assche committed
      If an SRP target is no longer reachable and srp_reset_host() fails to
      reconnect then ib_srp will invoke scsi_remove_host().  That function
      will invoke __scsi_remove_device() for each LUN.  And that last
      function will change the device state from SDEV_TRANSPORT_OFFLINE into
      SDEV_CANCEL.  Certain user space software, e.g. older versions of
      multipathd, continue queueing I/O to SCSI devices that are in the
      SDEV_CANCEL state.
      
      If these I/O requests are submitted as SG_IO that means that the
      REQ_PREEMPT flag will be set and hence that these requests will be
      passed to srp_queuecommand().  These requests will time out.  If new
      requests are queued fast enough from user space these active requests
      will prevent __scsi_remove_device() from finishing.
      
      Avoid this by failing I/O requests in the SDEV_CANCEL state if the
      transport is offline.  Introduce a new variable to keep track of the
      transport state instead of failing requests if (!target->connected ||
      target->qp_in_error), so that the SCSI error handler has a chance to
      retry commands after a transport layer failure occurred.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/srp: Avoid endless SCSI error handling loop · c7c4e7ff
      Bart Van Assche committed
      If a SCSI command times out it is passed to the SCSI error
      handler. The SCSI error handler will try to abort the commands that
      timed out.  If aborting fails, a device reset will be attempted.  If
      the device reset also fails a host reset will be attempted.  If the
      host reset also fails the whole procedure will be repeated.
      
      srp_abort() and srp_reset_device() fail for a QP in the error state.
      srp_reset_host() fails after host removal has started.  Hence if the
      SCSI error handler gets invoked after host removal has started and
      with the QP in the error state an endless loop will be triggered.
      
      Modify the SCSI error handling functions in ib_srp as follows:
      - Abort SCSI commands properly even if the QP is in the error state.
      - Make srp_reset_host() reset SCSI requests even after host removal
        has already started or if reconnecting fails.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: David Dillow <dave@thedillows.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
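      The escalation sequence described in this commit (abort, then device reset,
      then host reset, then repeat) can be modelled in a few lines to show why it
      never terminated. This is a simplified sketch: struct toy_target and the
      toy_* functions are hypothetical, the abort_works_in_error_state flag stands
      in for the first listed change only, and the round cap exists solely so the
      model returns.

      ```c
      #include <stdbool.h>

      /* Toy model of the target state that the error handlers look at. */
      struct toy_target {
          bool qp_in_error;                 /* QP is in the error state */
          bool removal_started;             /* host removal has begun */
          bool abort_works_in_error_state;  /* models the fixed abort path */
      };

      /* Before the fix, abort fails whenever the QP is in the error state. */
      static bool toy_abort(const struct toy_target *t)
      {
          return t->abort_works_in_error_state || !t->qp_in_error;
      }

      /* Device reset fails for a QP in the error state. */
      static bool toy_reset_device(const struct toy_target *t)
      {
          return !t->qp_in_error;
      }

      /* Host reset fails once host removal has started. */
      static bool toy_reset_host(const struct toy_target *t)
      {
          return !t->removal_started;
      }

      /* Run the escalation ladder. Returns the round in which some step
       * succeeded, or -1 if max_rounds passed without any success. */
      static int toy_eh_rounds(const struct toy_target *t, int max_rounds)
      {
          for (int round = 1; round <= max_rounds; round++) {
              if (toy_abort(t) || toy_reset_device(t) || toy_reset_host(t))
                  return round;
          }
          return -1;
      }
      ```

      With the QP in the error state and removal already started, every rung
      fails and the ladder restarts forever; once abort can succeed in that state,
      the first round ends the loop.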
    • IB/srp: Avoid sending a task management function needlessly · 3780d1f0
      Bart Van Assche committed
      Do not send a task management function if sending will fail anyway
      because either there is no RDMA/RC connection or the QP is in the
      error state.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: David Dillow <dave@thedillows.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/srp: Track connection state properly · e1b2f13a
      Bart Van Assche committed
      Remove an assignment that incorrectly overwrites the connection state
      update by srp_connect_target().
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: David Dillow <dave@thedillows.org>
      Cc: <stable@vger.kernel.org> # 3.8
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  14. 22 Feb, 2013 3 commits
  15. 20 Feb, 2013 3 commits
  16. 06 Feb, 2013 1 commit
    • IPoIB: Fix crash due to skb double destruct · 7e5a90c2
      Shlomo Pongratz committed
      After commit b13912bb ("IPoIB: Call skb_dst_drop() once skb is
      enqueued for sending"), using connected mode and running multithreaded
      iperf for a long time, i.e.
      
          iperf -c <IP> -P 16 -t 3600
      
      results in a crash.
      
      After the above-mentioned patch, the driver is calling skb_orphan() and
      skb_dst_drop() after calling post_send() in ipoib_cm.c::ipoib_cm_send()
      (also in ipoib_ib.c::ipoib_send()).
      
      The problem with this is, as is written in a comment in both routines,
      "it's entirely possible that the completion handler will run before we
      execute anything after the post_send()."  This leads to running the
      skb cleanup routines simultaneously in two different contexts.
      
      The solution is to always perform the skb_orphan() and skb_dst_drop()
      before queueing the send work request.  If an error occurs, then it
      will be no different than the regular case where dev_kfree_skb_any() is
      called in the completion path, which is assumed to be after these two routines.
      Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
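      The ordering problem can be reproduced with a toy model in which posting
      the send immediately runs the completion path, the worst case the quoted
      comment warns about. This is a sketch under that assumption: struct toy_skb
      and the toy_* functions are hypothetical and only record whether the buffer
      was touched after the completion path had released it.

      ```c
      #include <stdbool.h>

      /* Toy packet buffer that records misuse instead of crashing. */
      struct toy_skb {
          bool owned_by_socket;     /* still charged to the sending socket */
          bool freed;               /* completion path has released it */
          int  touched_after_free;  /* cleanup attempts on a freed buffer */
      };

      /* Stand-in for the orphan/dst-drop cleanup on the send path. */
      static void toy_orphan(struct toy_skb *skb)
      {
          if (skb->freed) {
              skb->touched_after_free++;  /* second context hit the same skb */
              return;
          }
          skb->owned_by_socket = false;
      }

      /* Worst case: the completion handler runs before post_send returns. */
      static void toy_post_send(struct toy_skb *skb)
      {
          skb->freed = true;
      }

      /* Ordering introduced by the fix: clean up first, then post. */
      static void toy_send_orphan_first(struct toy_skb *skb)
      {
          toy_orphan(skb);
          toy_post_send(skb);
      }

      /* Ordering that caused the crash: post first, then clean up. */
      static void toy_send_orphan_last(struct toy_skb *skb)
      {
          toy_post_send(skb);
          toy_orphan(skb);
      }
      ```

      Only the orphan-first ordering leaves touched_after_free at zero, because
      nothing on the send path looks at the buffer once it has been posted.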
  17. 07 Jan, 2013 1 commit
  18. 20 Dec, 2012 1 commit
    • IPoIB: Call skb_dst_drop() once skb is enqueued for sending · b13912bb
      Roland Dreier committed
      Currently, IPoIB delays collecting send completions for TX packets in
      order to batch work more efficiently.  It does skb_orphan() right after
      queuing the packets so that destructors run early, to avoid problems
      like holding socket send buffers for too long (since we might not
      collect a send completion until a long time after the packet is
      actually sent).
      
      However, IPoIB clears IFF_XMIT_DST_RELEASE because it actually looks
      at skb_dst() to update the PMTU when it gets a too-long packet.  This
      means that the packets sitting in the TX ring with uncollected send
      completions are holding a reference on the dst.  We've seen this lead
      to pathological behavior with respect to route and neighbour GC.  The
      easy fix for this is to call skb_dst_drop() when we call skb_orphan().
      
      Also, give packets sent via connected mode (CM) the same skb_orphan()
      / skb_dst_drop() treatment that packets sent via datagram mode get.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  19. 01 Dec, 2012 4 commits