1. 18 Sep, 2012 1 commit
  2. 16 Aug, 2012 2 commits
  3. 15 Aug, 2012 2 commits
  4. 30 Jul, 2012 1 commit
    • IPoIB: Use a private hash table for path lookup in xmit path · b63b70d8
      Shlomo Pongratz committed
      Dave Miller <davem@davemloft.net> provided a detailed description of
      why the way IPoIB is using neighbours for its own ipoib_neigh struct
      is buggy:
      
          Any time an ipoib_neigh is changed, a sequence like the following is made:
      
          			spin_lock_irqsave(&priv->lock, flags);
          			/*
          			 * It's safe to call ipoib_put_ah() inside
          			 * priv->lock here, because we know that
          			 * path->ah will always hold one more reference,
          			 * so ipoib_put_ah() will never do more than
          			 * decrement the ref count.
          			 */
          			if (neigh->ah)
          				ipoib_put_ah(neigh->ah);
          			list_del(&neigh->list);
          			ipoib_neigh_free(dev, neigh);
          			spin_unlock_irqrestore(&priv->lock, flags);
          			ipoib_path_lookup(skb, n, dev);
      
          This doesn't work, because you're leaving a stale pointer to the freed up
          ipoib_neigh in the special neigh->ha pointer cookie.  Yes, it even fails
          with all the locking done to protect _changes_ to *ipoib_neigh(n), and
          with the code in ipoib_neigh_free() that NULLs out the pointer.
      
          The core issue is that read side calls to *to_ipoib_neigh(n) are not
          being synchronized at all, they are performed without any locking.  So
          whether we hold the lock or not when making changes to *ipoib_neigh(n)
          you still can have threads see references to freed up ipoib_neigh
          objects.
      
          	cpu 1			cpu 2
          	n = *ipoib_neigh()
          				*ipoib_neigh() = NULL
          				kfree(n)
          	n->foo == OOPS
      
          [..]
      
          Perhaps the ipoib code can have a private path database it manages
          entirely itself, which holds all the necessary information and is
          looked up by some generic key which is available easily at transmit
          time and does not involve generic neighbour entries.
      
      See <http://marc.info/?l=linux-rdma&m=132812793105624&w=2> and
      <http://marc.info/?l=linux-rdma&w=2&r=1&s=allows+references+to+freed+memory&q=b>
      for the full discussion.
      
      This patch aims to solve the race conditions found in the IPoIB driver.
      
      The patch removes the connection between the core networking neighbour
      structure and the ipoib_neigh structure.  In addition to avoiding the
      race described above, it allows us to handle SKBs carrying IP packets
      that don't have any associated neighbour.
      
      We add an ipoib_neigh hash table with N buckets where the key is the
      destination hardware address.  The ipoib_neigh is fetched from the
      hash table instead of from the stashed location in the neighbour
      structure.  The hash table uses both RCU and reference counting to
      guarantee that no ipoib_neigh instance is ever deleted while in use.
      
      Fetching the ipoib_neigh structure instance from the hash also makes
      the special code in ipoib_start_xmit that handles remote and local
      bonding failover redundant.
      
      Aged ipoib_neigh instances are deleted by a garbage collection task
      that runs every M seconds and deletes every ipoib_neigh instance that
      was idle for at least 2*M seconds. The deletion is safe since the
      ipoib_neigh instances are protected using RCU and reference count
      mechanisms.
      
      The number of buckets (N) and the frequency of running the GC thread (M)
      are taken from the exported arp_tbl.
      Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
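      
      The scheme described in this commit message can be sketched in plain
      userspace C.  This is a single-threaded model only, not the driver
      code: RCU is left out entirely, and the names neigh_entry, neigh_get,
      neigh_put, neigh_add and neigh_gc, as well as the bucket count, are
      assumptions made for illustration.  It shows a table keyed by the
      destination hardware address, reference counting so that an entry is
      never freed while a user holds it, and a GC pass that unlinks entries
      idle for at least 2*M seconds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HW_ADDR_LEN 20   /* IPoIB hardware addresses are 20 bytes long */
#define N_BUCKETS   16   /* hypothetical; the patch takes N from arp_tbl */

struct neigh_entry {
    unsigned char hw_addr[HW_ADDR_LEN]; /* lookup key */
    int refcnt;                         /* 1 for the table + 1 per user */
    long last_used;                     /* time of last lookup, seconds */
    struct neigh_entry *next;           /* bucket chain */
};

static struct neigh_entry *buckets[N_BUCKETS];

static unsigned hash_addr(const unsigned char *a)
{
    unsigned h = 0;
    for (int i = 0; i < HW_ADDR_LEN; i++)
        h = h * 31 + a[i];
    return h % N_BUCKETS;
}

/* Look up an entry and take a reference; returns NULL if absent. */
static struct neigh_entry *neigh_get(const unsigned char *a, long now)
{
    for (struct neigh_entry *e = buckets[hash_addr(a)]; e; e = e->next) {
        if (memcmp(e->hw_addr, a, HW_ADDR_LEN) == 0) {
            e->refcnt++;
            e->last_used = now;
            return e;
        }
    }
    return NULL;
}

/* Drop a reference; the entry is freed only when the last one goes. */
static void neigh_put(struct neigh_entry *e)
{
    if (--e->refcnt == 0)
        free(e);
}

/* Insert a new entry; the table itself holds the initial reference. */
static struct neigh_entry *neigh_add(const unsigned char *a, long now)
{
    struct neigh_entry *e = calloc(1, sizeof(*e));
    if (!e)
        return NULL;
    memcpy(e->hw_addr, a, HW_ADDR_LEN);
    e->refcnt = 1;
    e->last_used = now;
    unsigned b = hash_addr(a);
    e->next = buckets[b];
    buckets[b] = e;
    return e;
}

/* Unlink entries idle for at least 2*m seconds; returns how many. */
static int neigh_gc(long now, long m)
{
    int unlinked = 0;
    for (int b = 0; b < N_BUCKETS; b++) {
        struct neigh_entry **pp = &buckets[b];
        while (*pp) {
            struct neigh_entry *e = *pp;
            if (now - e->last_used >= 2 * m) {
                *pp = e->next;  /* new lookups can no longer find it */
                neigh_put(e);   /* drop only the table's reference */
                unlinked++;
            } else {
                pp = &e->next;
            }
        }
    }
    return unlinked;
}
```

      In this model a caller pairs every successful neigh_get() with a
      neigh_put().  If the GC pass unlinks an entry while a caller still
      holds it, the memory stays valid until that caller's neigh_put().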
  5. 17 Jul, 2012 2 commits
    • net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() · 6700c270
      David S. Miller committed
      This will be used so that we can compose a full flow key.
      
      Even though we have a route in this context, we need more.  In the
      future the routes will be without destination address, source address,
      etc. keying.  One ipv4 route will cover entire subnets, etc.
      
      In this environment we have to have a way to possess persistent storage
      for redirects and PMTU information.  This persistent storage will exist
      in the FIB tables, and that's why we'll need to be able to rebuild a
      full lookup flow key here.  Using that flow key will do a fib_lookup()
      and create/update the persistent entry.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • srpt: use target_execute_cmd for WRITEs in srpt_handle_rdma_comp · e672a47f
      Christoph Hellwig committed
      srpt_handle_rdma_comp is called from kthread context and thus can execute
      target_execute_cmd directly.  srpt_abort_cmd sets the CMD_T_LUN_STOP
      flag directly, and thus the abuse of transport_generic_handle_data can be
      replaced with an opencoded variant of that code path.  I'm still not happy
      about a fabric driver poking into target core internals like this, but
      let's defer the bigger architecture changes for now.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
  6. 11 Jul, 2012 1 commit
    • IPoIB: fix skb truesize underestimation · b28ba726
      Eric Dumazet committed
      Or Gerlitz reported that WARN_ON_ONCE(delta < len) triggers in
      skb_try_coalesce().  This warning catches drivers that set
      skb->truesize incorrectly.
      
      IPoIB does allocate a full page to store a fragment, but accounts
      only the used part of the page (the frame length) in skb->truesize.
      
      This patch fixes the skb truesize underestimation.  It also fixes a
      performance issue: RX skbs did not have enough tailroom for the IP
      and TCP stacks to pull their headers into the skb linear part
      without an expensive call to pskb_expand_head().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Erez Shitrit <erezsh@mellanox.com>
      Cc: Shlomo Pongratz <shlomop@mellanox.com>
      Cc: Roland Dreier <roland@purestorage.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
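      
      The accounting problem can be illustrated with a small model.  This
      is not kernel code: model_skb, its mem_held field, the two add_frag
      helpers and the 4096-byte page size are all assumptions made for the
      sketch.  It contrasts charging only the bytes used in a page with
      charging the whole page that the buffer actually pins.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed page size for this model */

struct model_skb {
    unsigned len;       /* bytes of packet data */
    unsigned truesize;  /* bytes of memory the buffer claims to use */
    unsigned mem_held;  /* bytes of memory the buffer really pins */
};

/* Underestimating variant: charges only the bytes used in the page. */
static void add_frag_underestimate(struct model_skb *skb, unsigned used)
{
    skb->len += used;
    skb->mem_held += MODEL_PAGE_SIZE;
    skb->truesize += used;
}

/* Corrected variant: a whole page is held, so a whole page is charged. */
static void add_frag_full_page(struct model_skb *skb, unsigned used)
{
    skb->len += used;
    skb->mem_held += MODEL_PAGE_SIZE;
    skb->truesize += MODEL_PAGE_SIZE;
}

/* True when the claimed size covers the memory actually held. */
static int truesize_is_honest(const struct model_skb *skb)
{
    return skb->truesize >= skb->mem_held;
}
```

      With a 1500-byte frame, the first variant claims 1500 bytes while
      pinning a full page, which is the kind of mismatch the warning in
      skb_try_coalesce() is meant to expose.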
  7. 09 Jul, 2012 1 commit
  8. 06 Jul, 2012 1 commit
  9. 05 Jul, 2012 1 commit
  10. 19 May, 2012 1 commit
  11. 15 Apr, 2012 2 commits
  12. 12 Apr, 2012 1 commit
    • IB/srpt: Set srq_type to IB_SRQT_BASIC · 6f360336
      Roland Dreier committed
      Since commit 96104eda ("RDMA/core: Add SRQ type field"), kernel
      users of SRQs need to specify srq_type = IB_SRQT_BASIC in struct
      ib_srq_init_attr, or else most low-level drivers will fail with
      -ENOSYS when srpt_add_one() calls ib_create_srq().
      
      (mlx4_ib works OK nearly all of the time, because it just needs
      srq_type != IB_SRQT_XRC.  And apparently nearly everyone using
      ib_srpt is using mlx4 hardware)
      Reported-by: Alexey Shvetsov <alexxy@gentoo.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  13. 20 Mar, 2012 1 commit
  14. 18 Mar, 2012 1 commit
    • ib_srpt: Fix srpt_handle_cmd send_ioctx->ioctx_kref leak on exception · 187e70a5
      Nicholas Bellinger committed
      This patch addresses a bug in srpt_handle_cmd() failure handling where
      send_ioctx->kref is being leaked with the local extra reference after init,
      causing the expected kref_put() in srpt_handle_send_comp() to not be the final
      call to invoke srpt_put_send_ioctx_kref() -> transport_generic_free_cmd() and
      perform se_cmd descriptor memory release.
      
      It also fixes a SCF_SCSI_RESERVATION_CONFLICT handling bug where this code
      is incorrectly falling through to transport_handle_cdb_direct() after
      invoking srpt_queue_status() to send SAM_STAT_RESERVATION_CONFLICT status.
      
      Note this patch is for >= v3.3 mainline code, and current lio-core.git
      code has already been converted to target_submit_cmd() + se_cmd->cmd_kref usage,
      and internal ioctx->kref usage has been removed.  I'm including this patch
      now into target-pending/for-next with a CC' for v3.3 stable.
      
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Roland Dreier <roland@purestorage.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
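      
      The reference leak described above follows a common pattern, shown
      here as a minimal model rather than the ib_srpt code.  The names
      ioctx_model, ioctx_put, handle_cmd and send_completion are
      hypothetical.  The point is that a handler which takes an extra local
      reference must drop it on the failure path as well, otherwise the put
      in the completion handler is never the final one and the object is
      never released.

```c
#include <assert.h>

struct ioctx_model {
    int kref;      /* reference count */
    int released;  /* set once the final reference is dropped */
};

/* Drop one reference; release the object on the final put. */
static void ioctx_put(struct ioctx_model *c)
{
    if (--c->kref == 0)
        c->released = 1;
}

/* Model of a command handler that holds an extra local reference. */
static int handle_cmd(struct ioctx_model *c, int fail)
{
    c->kref = 1;       /* initial reference, dropped by send completion */
    c->kref++;         /* extra local reference while handling */
    if (fail) {
        ioctx_put(c);  /* error path must drop the local reference too */
        return -1;
    }
    ioctx_put(c);      /* normal path drops the local reference */
    return 0;
}

/* Model of the send completion handler: expected to be the final put. */
static void send_completion(struct ioctx_model *c)
{
    ioctx_put(c);
}
```

      Without the ioctx_put() on the failure path, the count would still be
      1 after send_completion() and the release would never run.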
  15. 11 Mar, 2012 1 commit
  16. 09 Mar, 2012 1 commit
    • IB: Change CQE "csum_ok" field to a bit flag · d927d505
      Or Gerlitz committed
      Use a bit in wc_flags rather than a whole integer to hold the
      "checksum OK" flag.  By itself, this change doesn't reduce the size of
      struct ib_wc on 64-bit machines; it stays at 56 bytes because of
      padding.  However, it will allow more fields to be added in the future
      without enlarging the struct.  Also, it will let us have a unified
      approach with future libibverbs checksum offload reporting, because a
      bit flag doesn't break the library ABI.
      
      This patch was suggested during conversation with Liran Liss
      <liranl@mellanox.com>.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
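      
      A minimal sketch of the bit-flag approach, using made-up names and
      values (wc_model, WC_MODEL_WITH_IMM, WC_MODEL_CSUM_OK) rather than the
      kernel's real definitions: one flags word carries several boolean
      properties, so adding another property needs a new bit and no new
      struct member.

```c
#include <assert.h>

/* Hypothetical flag values for illustration; not the kernel's own. */
enum wc_flags_model {
    WC_MODEL_WITH_IMM = 1 << 0,
    WC_MODEL_CSUM_OK  = 1 << 1,
};

struct wc_model {
    unsigned wc_flags;  /* one word carries many boolean properties */
};

/* Set or clear the "checksum OK" bit without touching other flags. */
static void wc_set_csum_ok(struct wc_model *wc, int ok)
{
    if (ok)
        wc->wc_flags |= WC_MODEL_CSUM_OK;
    else
        wc->wc_flags &= ~(unsigned)WC_MODEL_CSUM_OK;
}

/* Test the "checksum OK" bit. */
static int wc_csum_ok(const struct wc_model *wc)
{
    return (wc->wc_flags & WC_MODEL_CSUM_OK) != 0;
}
```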
  17. 06 Mar, 2012 1 commit
    • IB/iser: Post initial receive buffers before sending the final login request · 89e984e2
      Or Gerlitz committed
      An iser target may send iscsi NO-OP PDUs as soon as it marks the iSER
      iSCSI session as fully operative.  This means that there is a window
      where there are no posted receive buffers on the initiator side, so
      it's possible for the iSER RC connection to break because of RNR NAK /
      retry errors.  To fix this, rely on the flags bits in the login
      request to have FFP (0x3) in the lower nibble as a marker for the
      final login request, and post an initial chunk of receive buffers
      before sending that login request instead of after getting the login
      response.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
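      
      The ordering described above can be modelled as follows.  This is a
      sketch under assumptions, not the iSER initiator code: conn_model,
      is_final_login_request and send_login_request are hypothetical names,
      and the check looks only at the two low bits of the flags byte, where
      the value 0x3 (FFP) named in the commit message would sit.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FFP 0x03u  /* full feature phase value from the message */

struct conn_model {
    int rx_posted;  /* nonzero once initial receive buffers are posted */
    int sent;       /* number of login requests sent */
};

/* Nonzero if the flags byte marks the final login request. */
static int is_final_login_request(uint8_t flags)
{
    return (flags & MODEL_FFP) == MODEL_FFP;
}

/* Post receive buffers before the final login request goes out. */
static void send_login_request(struct conn_model *c, uint8_t flags)
{
    if (is_final_login_request(flags) && !c->rx_posted)
        c->rx_posted = 1;  /* stand-in for posting the receive buffers */
    c->sent++;             /* stand-in for the actual send */
}
```

      Because the buffers are posted before the send rather than after the
      login response, the target cannot find the receive queue empty once it
      considers the session fully operative.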
  18. 05 Mar, 2012 1 commit
  19. 28 Feb, 2012 2 commits
  20. 26 Feb, 2012 3 commits
  21. 10 Feb, 2012 1 commit
  22. 09 Feb, 2012 2 commits
  23. 07 Feb, 2012 1 commit
  24. 04 Feb, 2012 1 commit
  25. 03 Feb, 2012 3 commits
  26. 04 Jan, 2012 1 commit
  27. 16 Dec, 2011 1 commit
    • ib_srpt: Initial SRP Target merge for v3.3-rc1 · a42d985b
      Bart Van Assche committed
      This patch adds the kernel module ib_srpt SCSI RDMA Protocol (SRP) target
      implementation conforming to the SRP r16a specification for the mainline
      drivers/target infrastructure.
      
      This driver was originally developed by Vu Pham and has been optimized by
      Bart Van Assche and merged into upstream LIO based on his srpt-lio-4.1
      branch here:
      
         https://github.com/bvanassche/srpt-lio/commits/srpt-lio-4.1/
      
      This updated patch also contains the following two changes from
      lio-core-2.6.git/master.  One is to fix a bug with 1 >= task->task_sg[]
      chained mappings in ib_srpt, and the other to convert the configfs control
      plane to reference IB Port GUID and struct srpt_port directly following
      mainline v4.x target_core_fabric_configfs.c conversion for ib_srpt
      to work with rtslib/rtsadmin v2 code.
      
      These separate patches can be found here:
      
      ib_srpt: Fix bug with chained SGLs in srpt_map_sg_to_ib_sge
      http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=ea485147563b6555a97dbf811825fbb586519252
      
      ib_srpt: Convert se_wwn endpoint reference to struct srpt_port->port_wwn
      http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=4e544a210acb227df1bb4ca5086e65bdf4e648ea
      
      This also includes the following recent v1 -> v2 review changes:
      
      ib_srpt: Fix potential out-of-bounds array access
      ib_srpt: Avoid failed multipart RDMA transfers
      ib_srpt: Fix srpt_alloc_fabric_acl failure case return value
      ib_srpt: Update comments to reference $driver/$port layout
      ib_srpt: Fix sport->port_guid formatting code
      ib_srpt: Remove legacy use_port_guid_in_session_name module parameter
      ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
      ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
      ib_srpt: Convert srpt_sq_size into per port configfs attribute
      
      and v2 -> v3 review changes:
      
      ib_srpt: Fix possible race with srp_sq_size in srpt_create_ch_ib
      ib_srpt: Fix possible race with srp_max_rsp_size in srpt_release_channel_work
      ib_srpt: Fix up MAX_SRPT_RDMA_SIZE define
      ib_srpt: Make srpt_map_sg_to_ib_sge() failure case return -EAGAIN
      ib_srpt: Convert port_guid to use subnet_prefix + interface_id formatting
      ib_srpt: Make srpt_check_stop_free return kref_put status
      ib_srpt: Make compilation with BUG=n proceed
      ib_srpt: Use new target_core_fabric.h include
      ib_srpt: Check hex2bin() return code to silence build warning
      
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Roland Dreier <roland@purestorage.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Vu Pham <vu@mellanox.com>
      Cc: David Dillow <dillowda@ornl.gov>
      Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
  28. 06 Dec, 2011 2 commits
  29. 01 Dec, 2011 1 commit