- 04 10月, 2012 1 次提交
-
-
由 Alex Tabachnik 提交于
RX/TX CQs will now be selected from a per HCA pool. For the RX flow this has the effect of using different interrupt vectors when using low level drivers (such as mlx4) that map the "vector" param provided by the ULP on CQ creation to a dedicated IRQ/MSI-X vector. This allows the RX flow processing of IO responses to be distributed across multiple CPUs. QPs (--> iSER sessions) are assigned to CQs in round robin order using the CQ with the minimum number of sessions attached to it. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NAlex Tabachnik <alext@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 02 10月, 2012 1 次提交
-
-
由 Or Gerlitz 提交于
Add the rtnl_link_ops changelink and fill_info callbacks, through which the admin can now set/get the driver mode, etc policies. Maintain the proprietary sysfs entries only for legacy childs. For child devices, set dev->iflink to point to the parent device ifindex, such that user space tools can now correctly show the uplink relation as done for vlan, macvlan, etc devices. Pointed out by Patrick McHardy <kaber@trash.net> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 10月, 2012 3 次提交
-
-
由 Bart Van Assche 提交于
We need to call scsi_done() for commands after we abort them. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Acked-by: NDavid Dillow <dillowda@ornl.gov> Cc: <stable@vger.kernel.org> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Bart Van Assche 提交于
srp_free_req() uses the scsi_cmnd structure contents to unmap buffers, so we must invoke srp_free_req() before we release ownership of that structure. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Acked-by: NDavid Dillow <dillowda@ornl.gov> Cc: <stable@vger.kernel.org> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Patrick McHardy 提交于
Fix a crash in ipoib_mcast_join_task(). (with help from Or Gerlitz) Commit c8c2afe3 ("IPoIB: Use rtnl lock/unlock when changing device flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which is run from the ipoib_workqueue, and hence the workqueue can't be flushed from the context of ipoib_stop(). In the current code, ipoib_stop() (which doesn't flush the workqueue) calls ipoib_mcast_dev_flush(), which goes and deletes all the multicast entries. This takes place without any synchronization with a possible running instance of ipoib_mcast_join_task() for the same ipoib device, leading to a crash due to NULL pointer dereference. Fix this by making sure that the workqueue is flushed before ipoib_mcast_dev_flush() is called. To make that possible, we move the RTNL-lock wrapped code to ipoib_mcast_join_finish(). Signed-off-by: NPatrick McHardy <kaber@trash.net> Cc: <stable@vger.kernel.org> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 21 9月, 2012 1 次提交
-
-
由 Or Gerlitz 提交于
Add rtnl_link_ops to IPoIB, with the first usage being child device create/delete through them. Childs devices are now either legacy ones, created/deleted through the ipoib sysfs entries, or RTNL ones. Adding support for RTNL childs involved refactoring of ipoib_vlan_add which is now used by both the sysfs and the link_ops code. Also, added ndo_uninit entry to support calling unregister_netdevice_queue from the rtnl dellink entry. This required removal of calls to ipoib_dev_cleanup from the driver in flows which use unregister_netdevice, since the networking core will invoke ipoib_uninit which does exactly that. Signed-off-by: NErez Shitrit <erezsh@mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 9月, 2012 2 次提交
-
-
由 Shlomo Pongratz 提交于
Lockdep points out a circular locking dependency betwwen the ipoib device priv spinlock (priv->lock) and the neighbour table rwlock (ntbl->rwlock). In the normal path, ie neigbour garbage collection task, the neigh table rwlock is taken first and then if the neighbour needs to be deleted, priv->lock is taken. However in some error paths, such as in ipoib_cm_handle_tx_wc(), priv->lock is taken first and then ipoib_neigh_free routine is called which in turn takes the neighbour table ntbl->rwlock. The solution is to get rid the neigh table rwlock completely and use only priv->lock. Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Shlomo Pongratz 提交于
If the neighbours hash table is empty when unloading the module, then ipoib_flush_neighs(), the cleanup routine, isn't called and the memory used for the hash table itself leaked. To fix this, ipoib_flush_neighs() is allways called, and another completion object is added to signal when the table is freed. Once invoked, ipoib_flush_neighs() flushes all the neighbours (if there are any), calls the the hash table RCU free routine, which now signals completion of the deletion process, and waits for the last neighbour to be freed. Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 16 8月, 2012 2 次提交
-
-
由 Bart Van Assche 提交于
Avoid a crash caused by the scmnd->scsi_done(scmnd) call in srp_process_rsp() being invoked with scsi_done == NULL. This can happen if a reply is received during or after a command abort. Reported-by: NJoseph Glanville <joseph.glanville@orionvm.com.au> Reference: http://marc.info/?l=linux-rdma&m=134314367801595 Cc: <stable@vger.kernel.org> Acked-by: NDavid Dillow <dillowda@ornl.gov> Signed-off-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Masanari Iida 提交于
Correct spelling typos in comments in drivers/infiniband. Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 15 8月, 2012 2 次提交
-
-
由 Shlomo Pongratz 提交于
Commit b63b70d8 ("IPoIB: Use a private hash table for path lookup in xmit path") introduced a bug where in ipoib_neigh_free() (which is called from a few errors flows in the driver), rcu_dereference() is invoked with the wrong pointer object, which results in a crash. Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Shlomo Pongratz 提交于
Commit b63b70d8 ("IPoIB: Use a private hash table for path lookup in xmit path") introduced a bug where in ipoib_cm_destroy_tx() a CM object is moved between lists without any supported locking. Under a stress test, this eventually leads to list corruption and a crash. Previously when this routine was called, callers were taking the device priv lock. Currently this function is called from the RCU callback associated with neighbour deletion. Fix the race by taking the same lock we used to before. Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 30 7月, 2012 1 次提交
-
-
由 Shlomo Pongratz 提交于
Dave Miller <davem@davemloft.net> provided a detailed description of why the way IPoIB is using neighbours for its own ipoib_neigh struct is buggy: Any time an ipoib_neigh is changed, a sequence like the following is made: spin_lock_irqsave(&priv->lock, flags); /* * It's safe to call ipoib_put_ah() inside * priv->lock here, because we know that * path->ah will always hold one more reference, * so ipoib_put_ah() will never do more than * decrement the ref count. */ if (neigh->ah) ipoib_put_ah(neigh->ah); list_del(&neigh->list); ipoib_neigh_free(dev, neigh); spin_unlock_irqrestore(&priv->lock, flags); ipoib_path_lookup(skb, n, dev); This doesn't work, because you're leaving a stale pointer to the freed up ipoib_neigh in the special neigh->ha pointer cookie. Yes, it even fails with all the locking done to protect _changes_ to *ipoib_neigh(n), and with the code in ipoib_neigh_free() that NULLs out the pointer. The core issue is that read side calls to *to_ipoib_neigh(n) are not being synchronized at all, they are performed without any locking. So whether we hold the lock or not when making changes to *ipoib_neigh(n) you still can have threads see references to freed up ipoib_neigh objects. cpu 1 cpu 2 n = *ipoib_neigh() *ipoib_neigh() = NULL kfree(n) n->foo == OOPS [..] Perhaps the ipoib code can have a private path database it manages entirely itself, which holds all the necessary information and is looked up by some generic key which is available easily at transmit time and does not involve generic neighbour entries. See <http://marc.info/?l=linux-rdma&m=132812793105624&w=2> and <http://marc.info/?l=linux-rdma&w=2&r=1&s=allows+references+to+freed+memory&q=b> for the full discussion. This patch aims to solve the race conditions found in the IPoIB driver. The patch removes the connection between the core networking neighbour structure and the ipoib_neigh structure. In addition to avoiding the race described above, it allows us to handle SKBs carrying IP packets that don't have any associated neighbour. We add an ipoib_neigh hash table with N buckets where the key is the destination hardware address. The ipoib_neigh is fetched from the hash table and instead of the stashed location in the neighbour structure. The hash table uses both RCU and reference counting to guarantee that no ipoib_neigh instance is ever deleted while in use. Fetching the ipoib_neigh structure instance from the hash also makes the special code in ipoib_start_xmit that handles remote and local bonding failover redundant. Aged ipoib_neigh instances are deleted by a garbage collection task that runs every M seconds and deletes every ipoib_neigh instance that was idle for at least 2*M seconds. The deletion is safe since the ipoib_neigh instances are protected using RCU and reference count mechanisms. The number of buckets (N) and frequency of running the GC thread (M), are taken from the exported arb_tbl. Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 17 7月, 2012 2 次提交
-
-
由 David S. Miller 提交于
This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Hellwig 提交于
srpt_handle_rdma_comp is called from kthread context and thus can execute target_execute_cmd directly. srpt_abort_cmd sets the CMD_T_LUN_STOP flag directly, and thus the abuse of transport_generic_handle_data can be replaced with an opencoded variant of that code path. I'm still not happy about a fabric driver poking into target core internals like this, but let's defer the bigger architecture changes for now. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 11 7月, 2012 1 次提交
-
-
由 Eric Dumazet 提交于
Or Gerlitz reported triggering of WARN_ON_ONCE(delta < len); in skb_try_coalesce() This warning tracks drivers that incorrectly set skb->truesize IPoIB indeed allocates a full page to store a fragment, but only accounts in skb->truesize the used part of the page (frame length) This patch fixes skb truesize underestimation, and also fixes a performance issue, because RX skbs have not enough tailroom to allow IP and TCP stacks to pull their header in skb linear part without an expensive call to pskb_expand_head() Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NOr Gerlitz <ogerlitz@mellanox.com> Cc: Erez Shitrit <erezsh@mellanox.com> Cc: Shlomo Pongartz <shlomop@mellanox.com> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 7月, 2012 1 次提交
-
-
由 Roland Dreier 提交于
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 06 7月, 2012 1 次提交
-
-
由 David S. Miller 提交于
Otherwise local_bh_enable() complains. Reported-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 7月, 2012 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 5月, 2012 1 次提交
-
-
由 Or Gerlitz 提交于
The current error flow code was releasing the IB connection object and calling iscsi_destroy_endpoint() directly without going through the reference counting mechanism introduced in commit 39ff05db ("IB/iser: Enhance disconnection logic for multi-pathing"). This resulted in a double free of the iscsi endpoint object, which causes a kernel NULL pointer dereference. Fix that by plugging into the IB conn reference counting correctly. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 15 4月, 2012 2 次提交
-
-
由 Andy Grover 提交于
This patch renames a horribly misnamed function that no longer allocate tasks to something more descriptive for it's modern use in target core. (nab: Fix up ib_srpt to use this as well ahead of a target_submit_cmd conversion) Signed-off-by: NAndy Grover <agrover@redhat.com> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
由 Roland Dreier 提交于
With the modern target core, se_cmd->t_data_sg already points to a sglist that covers the whole command. So task_sg chaining is needless overhead and obfuscation -- instead of splicing the split up task sglists back into one list, we can just use the original list directly. Signed-off-by: NRoland Dreier <roland@purestorage.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 12 4月, 2012 1 次提交
-
-
由 Roland Dreier 提交于
Since commit 96104eda ("RDMA/core: Add SRQ type field"), kernel users of SRQs need to specify srq_type = IB_SRQT_BASIC in struct ib_srq_init_attr, or else most low-level drivers will fail in when srpt_add_one() calls ib_create_srq() and gets -ENOSYS. (mlx4_ib works OK nearly all of the time, because it just needs srq_type != IB_SRQT_XRC. And apparently nearly everyone using ib_srpt is using mlx4 hardware) Reported-by: NAlexey Shvetsov <alexxy@gentoo.org> Cc: <stable@vger.kernel.org> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 20 3月, 2012 1 次提交
-
-
由 Cong Wang 提交于
Acked-by: NRoland Dreier <roland@purestorage.com> Signed-off-by: NCong Wang <amwang@redhat.com>
-
- 18 3月, 2012 1 次提交
-
-
由 Nicholas Bellinger 提交于
This patch addresses a bug in srpt_handle_cmd() failure handling where send_ioctx->kref is being leaked with the local extra reference after init, causing the expected kref_put() in srpt_handle_send_comp() to not be the final call to invoke srpt_put_send_ioctx_kref() -> transport_generic_free_cmd() and perform se_cmd descriptor memory release. It also fixes a SCF_SCSI_RESERVATION_CONFLICT handling bug where this code is incorrectly falling through to transport_handle_cdb_direct() after invoking srpt_queue_status() to send SAM_STAT_RESERVATION_CONFLICT status. Note this patch is for >= v3.3 mainline code, and current lio-core.git code has already been converted to target_submit_cmd() + se_cmd->cmd_kref usage, and internal ioctx->kref usage has been removed. I'm including this patch now into target-pending/for-next with a CC' for v3.3 stable. Cc: Bart Van Assche <bvanassche@acm.org> Cc: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 11 3月, 2012 1 次提交
-
-
由 Nicholas Bellinger 提交于
This patch drops the following unused legacy API callers from target_core_fabric.h: *) TFO->fall_back_to_erl0() *) TFO->stop_session() *) TFO->sess_logged_in() *) TFO->is_state_remove() This patch also removes the stub usage in loopback, tcm_fc, iscsi_target, and ib_srpt fabric modules. Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 09 3月, 2012 1 次提交
-
-
由 Or Gerlitz 提交于
Use a bit in wc_flags rather then a whole integer to hold the "checksum OK" flag. By itself, this change doesn't reduce the size of struct ib_wc on 64bit machines -- it stays on 56 bytes because of padding. However, it will allow to add more fields in the future without enlarging the struct. Also, it will let us have a unified approach with future libibverbs checksum offload reporting, because a bit flag doesn't break the library ABI. This patch was suggested during conversation with Liran Liss <liranl@mellanox.com>. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Reviewed-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 06 3月, 2012 1 次提交
-
-
由 Or Gerlitz 提交于
An iser target may send iscsi NO-OP PDUs as soon as it marks the iSER iSCSI session as fully operative. This means that there is window where there are no posted receive buffers on the initiator side, so it's possible for the iSER RC connection to break because of RNR NAK / retry errors. To fix this, rely on the flags bits in the login request to have FFP (0x3) in the lower nibble as a marker for the final login request, and post an initial chunk of receive buffers before sending that login request instead of after getting the login response. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 05 3月, 2012 1 次提交
-
-
由 Doug Ledford 提交于
We allocate the login dma buffers in iser_verbs.c as part of alloc_ib_conn_resources(), however we are freeing them in iser_initiator.c as part of iser_free_rx_descriptors(). This is needlessly confusing. We have an alloc_rx_descriptors() and it doesn't alloc something that the free_rx_descriptors() frees, and we have an alloc_ib_conn_resources() that allocs something not freed by free_ib_conn_resources(). Clean that up. Signed-off-by: NDoug Ledford <dledford@redhat.com> [ Fix build error in iser_free_ib_conn_res(). - Or ] Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 28 2月, 2012 2 次提交
-
-
由 Bart Van Assche 提交于
Remove sysfs attributes before removing a target instead of testing the target state in every sysfs attribute callback method. Note: it is safe to invoke a sysfs attribute removal method like device_remove_file() twice on the same attribute. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Acked-by: NDavid Dillow <dillowda@ornl.gov> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Bart Van Assche 提交于
Use pr_fmt() and pr_xxx() instead of more verbose printk() equivalents. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Acked-by: NDavid Dillow <dillowda@ornl.gov> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 26 2月, 2012 3 次提交
-
-
由 Masanari Iida 提交于
Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Andy Grover 提交于
Change the test for if a cmd is a tmr request to checking if SCF_SCSI_TMR_CDB (a new flag) is set in cmd->se_cmd_flags. Also remove se_tmr_req_cache usage in favor of kzalloc usage, and make core_tmr_alloc_req() return int + setup se_cmd->se_tmr_req directly and fix up various fabric module usages Cc: Andy Grover <agrover@redhat.com> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
由 Christoph Hellwig 提交于
Replace various atomic_ts used as flags in struct se_cmd with a single transport_state bitmap that requires t_state_lock to be held for modifications. In the target core that assumption generally is true, but some recently added code in the SRP target had to grow new lock calls. I can't say I like the way how it messes with the command state directly, but let's leave that for later. (Re-add missing ib_srpt.c changes that nab dropped..) Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 10 2月, 2012 1 次提交
-
-
由 Masanari Iida 提交于
Correct spelling "alocate" to "allocate" in drivers/infiniband/ulp/srpt/ib_srpt.c Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 09 2月, 2012 2 次提交
-
-
由 Roland Dreier 提交于
Commit a0417fa3 ("net: Make qdisc_skb_cb upper size bound explicit.") made it possible for a netdev driver to use skb->cb between its header_ops.create method and its .ndo_start_xmit method. Use this in ipoib_hard_header() to stash away the LL address (GID + QPN), instead of the "ipoib_pseudoheader" hack. This allows IPoIB to stop lying about its hard_header_len, which will let us fix the L2 check for GRO. Signed-off-by: NRoland Dreier <roland@purestorage.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Roland Dreier 提交于
Commit a0417fa3 ("net: Make qdisc_skb_cb upper size bound explicit.") made it possible for a netdev driver to use skb->cb between its header_ops.create method and its .ndo_start_xmit method. Use this in ipoib_hard_header() to stash away the LL address (GID + QPN), instead of the "ipoib_pseudoheader" hack. This allows IPoIB to stop lying about its hard_header_len, which will let us fix the L2 check for GRO. Signed-off-by: NRoland Dreier <roland@purestorage.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 2月, 2012 1 次提交
-
-
由 Jesper Juhl 提交于
Signed-off-by: NJesper Juhl <jj@chaosbits.net> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 04 2月, 2012 1 次提交
-
-
由 Dan Carpenter 提交于
transport_init_session() and target_fabric_configfs_init() don't return NULL pointers, they only return ERR_PTRs or valid pointers. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 03 2月, 2012 1 次提交
-
-
由 Jesper Juhl 提交于
Signed-off-by: NJesper Juhl <jj@chaosbits.net> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-