- 12 March 2008, 8 commits
-
-
Submitted by Roland Dreier
Commit 7143740d ("IPoIB: Add send gather support") made struct ipoib_tx_buf significantly larger, since the mapping member changed from a single u64 to an array with MAX_SKB_FRAGS + 1 entries. This means that allocating tx_rings with kzalloc() may fail because there is not enough contiguous memory for the new, much bigger size. Fix this regression by allocating the rings with vmalloc() instead.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Roland Dreier
Commit 7143740d ("IPoIB: Add send gather support") made it possible for tx_wr.num_sge to be != 1 -- this happens if send gather support is enabled. However, the code in the connected mode post_send() function assumes the old invariant, namely that tx_wr.num_sge is always 1. Fix this by explicitly setting tx_wr.num_sge to 1 in the CM post_send().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Or Gerlitz
When set_multicast_list() is called, the multicast task is restarted and the IPOIB_MCAST_STARTED bit is cleared. As a result, for some window of time, multicast packets are neither transmitted nor queued but rather dropped by ipoib_mcast_send(). These dropped packets are painful in two cases:
- bonding fail-over, which both calls set_multicast_list() on the new active slave and sends a Gratuitous ARP through that slave.
- the IP_DROP_MEMBERSHIP code, which both calls set_multicast_list() on the device and issues an IGMP leave.
In both these cases, depending on the scheduling of the IPoIB multicast task, the packets would be dropped. As a result, in the bonding case, the failover would not be detected by the peers until their neighbour entries are renewed (which takes a few tens of seconds). In the IGMP case, the IP router doesn't get an IGMP leave and would only learn of it from further probes on the group (also a delay of at least a few tens of seconds). Fix this by allowing transmission (or queuing) depending on the IPOIB_FLAG_OPER_UP flag instead of the IPOIB_MCAST_STARTED flag.
Signed-off-by: Olga Shern <olgas@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
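A minimal sketch of the new gating condition, with a hypothetical helper name and an illustrative private struct (the real flag and struct are in ipoib.h):

    #include <linux/bitops.h>
    #include <linux/netdevice.h>
    #include <linux/types.h>

    /* Illustrative stand-ins for the driver's definitions. */
    enum { IPOIB_FLAG_OPER_UP = 0 };
    struct ipoib_dev_priv { unsigned long flags; };

    /* ipoib_mcast_send() drops packets only while the interface is not
     * operationally up, rather than whenever the multicast task restarts. */
    static bool ipoib_mcast_may_queue(struct net_device *dev)
    {
        struct ipoib_dev_priv *priv = netdev_priv(dev);

        return test_bit(IPOIB_FLAG_OPER_UP, &priv->flags); /* was IPOIB_MCAST_STARTED */
    }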
-
Submitted by Patrick Marchand Latifi
Reset the retry counter when we get a good RDMA_READ_RESPONSE_MIDDLE packet. This fix will prevent the requester from reporting a retry exceeded error too early.
Signed-off-by: Patrick Marchand Latifi <patrick.latifi@qlogic.com>
-
Submitted by Patrick Marchand Latifi
A work completion entry could be placed on the wrong completion queue when an RC QP is placed in the error state.
Signed-off-by: Patrick Marchand Latifi <patrick.latifi@qlogic.com>
Acked-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Patrick Marchand Latifi
This patch fixes the initialization of RC QPs, since we would rely on the queue pair type (ibqp->qp_type) being set, but this field is only initialized when we return from ipath_create_qp (it is initialized by the user-level verbs library). The fix is to not depend on this field to initialize the send and the receive state of the RC QP.
Signed-off-by: Patrick Marchand Latifi <patrick.latifi@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Patrick Marchand Latifi
There can be a case where the requester's rnr retry counter (s_rnr_retry) is less than the number of rnr retries allowed per QP (s_rnr_retry_cnt). This can happen if the s_rnr_retry counter is being decremented and an ipath_query_qp call is issued during that time frame. The fix is to always return the number of rnr retries allowed per QP instead of the requester's rnr counter. Found by code review.
Signed-off-by: Patrick Marchand Latifi <patrick.latifi@qlogic.com>
Acked-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Ralph Campbell
Subnet manager SetPortInfo messages distinguish between changing the link state (DOWN, ARM, ACTIVE) and the link physical state (POLL, SLEEP, DISABLED). These are somewhat independent commands and affect when link width and speed changes take effect. Without this patch, a link DOWN physical state NOP command was causing the link width and speed settings to take effect, which should only happen when the link physical state goes down (either by an SMP or some link physical error like link errors exceeding the threshold).
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 11 March 2008, 3 commits
-
-
Submitted by Steve Wise
cm_work_handler() can access cm_id_priv after it drops its reference by calling iwch_deref_id(), which might cause it to be freed. The fix is to look at whether IWCM_F_CALLBACK_DESTROY is set _before_ dropping the reference. Then if it was set, free the cm_id on this thread.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Arne Redlich
"iser_device" allocation failure is "handled" with a BUG_ON() right before dereferencing the NULL pointer -- fix this!
Signed-off-by: Arne Redlich <arne.redlich@xiranet.com>
Signed-off-by: Erez Zilber <erezz@voltaire.com>
-
Submitted by Arne Redlich
The iteration through the list of "iser_device"s during device lookup/creation is broken -- it might result in an infinite loop if more than one HCA is used with iSER. Fix this by using list_for_each_entry() instead of the open-coded, flawed list iteration code.
Signed-off-by: Arne Redlich <arne.redlich@xiranet.com>
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
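A minimal sketch of the corrected lookup using list_for_each_entry(); the iser_device fields shown are assumptions for illustration, not the exact layout in iscsi_iser.h:

    #include <linux/list.h>
    #include <rdma/ib_verbs.h>

    struct iser_device {
        struct list_head  ig_list;
        struct ib_device *ib_device;
        int               refcount;
    };

    static struct iser_device *iser_device_find(struct list_head *device_list,
                                                struct ib_device *ib_dev)
    {
        struct iser_device *device;

        list_for_each_entry(device, device_list, ig_list)
            if (device->ib_device == ib_dev) {
                device->refcount++;
                return device;
            }

        return NULL; /* not found: the caller allocates a new iser_device */
    }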
-
- 10 March 2008, 1 commit
-
-
Submitted by Jon Mason
The cxgb3 driver is unnecessarily decreasing the number of CQ entries by one when creating a CQ. This will cause the CQ not to have as many entries as requested by the user if the user requests a power-of-2 size.
Signed-off-by: Jon Mason <jon@opengridcomputing.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 01 March 2008, 4 commits
-
-
Submitted by Jon Mason
Set cap.max_inline_data to the actual max inline data that the adapter supports, so that userspace apps see the right value returned.
Signed-off-by: Jon Mason <jon@opengridcomputing.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Pete Wyckoff
Commit a3cd7d90 ("IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs") caused a regression for iSER and was reverted in e5507736. This change attempts to redo the original patch so that all used FMR entries are flushed when ib_flush_fmr_pool() is called, without affecting the normal FMR pool cleaning thread. Simply move used entries from the clean list onto the dirty list in ib_flush_fmr_pool() before letting the cleanup thread do its job.
Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Pete Wyckoff
This reverts commit a3cd7d90. The original commit breaks iSER reliably, making it complain:
    iser: iser_reg_page_vec:ib_fmr_pool_map_phys failed: -11
The FMR cleanup thread runs ib_fmr_batch_release() as dirty entries build up. This commit causes clean but used FMR entries also to be purged. During that process, another thread can see that there are no free FMRs and fail, even though there should always have been enough available.
Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Sean Hefty
When a CM MAD is received, it is queued to a CM workqueue for processing. The queued work item references the port and device on which the MAD was received. If that device is removed from the system before the work item can execute, the work item will reference freed memory. To fix this, flush the workqueue after unregistering to receive MADs, and before the device is freed.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
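A minimal sketch of the teardown ordering, with assumed names for the CM's workqueue and port structure:

    #include <linux/workqueue.h>
    #include <rdma/ib_mad.h>

    static struct workqueue_struct *cm_wq;              /* the CM's private workqueue (name assumed) */

    struct cm_port { struct ib_mad_agent *mad_agent; }; /* illustrative slice */

    static void cm_port_teardown(struct cm_port *port)
    {
        /* Stop new MADs (and therefore new work items) for this port... */
        ib_unregister_mad_agent(port->mad_agent);
        /* ...then drain work that may still reference the port, before the
         * device can be freed. */
        flush_workqueue(cm_wq);
    }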
-
- 27 February 2008, 6 commits
-
-
Submitted by John Lacombe
The interrupt moderation low threshold value was incorrectly triggering, indicating that the threshold should be lowered. The impact was that the timer was likely to become 40 usecs and get stuck there. The biggest side effect was too many interrupts and non-optimal performance.
Signed-off-by: John Lacombe <jlacombe@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Faisal Latif
With commit ef19454b ("[LIB] crc32c: Keep intermediate crc state in cpu order"), the behavior of crc32c changes on big-endian platforms. Our algorithm expects the previous behavior; otherwise we have RDMA connection establishment failure on big-endian platforms like powerpc. Apply cpu_to_le32() to the value returned by crc32c() to get the previous behavior.
Signed-off-by: Faisal Latif <flatif@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
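A minimal sketch of the fix; the wrapper name is illustrative, and the point is only the cpu_to_le32() applied to crc32c()'s now CPU-endian result:

    #include <linux/crc32c.h>
    #include <linux/types.h>
    #include <asm/byteorder.h>

    /* crc32c() now returns the CRC in CPU byte order; convert it to
     * little-endian so the on-the-wire value is the same on big-endian
     * hosts as it was before the crc32c change. */
    static __le32 nes_wire_crc32c(u32 seed, const void *data, unsigned int len)
    {
        return cpu_to_le32(crc32c(seed, data, len));
    }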
-
Submitted by Faisal Latif
Fix a use-after-free spotted by the Coverity checker and flagged by Adrian Bunk.
Signed-off-by: Faisal Latif <flatif@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Glenn Streiff
Just delete the debugging statement so we don't use cqp_request after freeing it. Adrian Bunk flagged this use-after-free issue, spotted by the Coverity checker.
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Adrian Bunk
Fix a check-after-use spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Adrian Bunk
Fix a memory leak spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 26 February 2008, 3 commits
-
-
Submitted by Adrian Bunk
Fix an off-by-one spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Chien Tung
Adrian Bunk pointed out that a Coverity scan found some apparently dead code in nes_verbs.c that really shouldn't have been dead. The function nes_create_cq() was missing the assignment "err = 1;" just prior to an iteration that conditionally set err = 0 if a PBL was found for a given virtual CQ. I also noticed we should have been returning -EFAULT on a couple of related error paths.
Signed-off-by: Chien Tung <ctung@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
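A self-contained sketch of the control flow being restored; the structure and function names are invented for illustration and are not the nes driver's:

    #include <linux/errno.h>
    #include <linux/list.h>

    struct nes_pbl_entry { struct list_head list; unsigned long user_base; };

    static int nes_claim_user_pbl(struct list_head *pbl_list, unsigned long user_base,
                                  struct nes_pbl_entry **found)
    {
        struct nes_pbl_entry *pbl;
        int err = 1;                 /* the missing "assume failure" assignment */

        list_for_each_entry(pbl, pbl_list, list)
            if (pbl->user_base == user_base) {
                list_del(&pbl->list);
                *found = pbl;
                err = 0;
                break;
            }

        return err ? -EFAULT : 0;    /* a miss is reported as -EFAULT */
    }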
-
Submitted by Bryan Rosenburg
A single entry (addr 0x10001000, size 0x2000) will get converted to page address 0x10000000 with a page size of 0x4000. The code as it stands doesn't address the single-buffer case, but in fact it allows the subsequent single-buffer special case to be eliminated entirely. Because the mask now includes the (page-adjusted) starting and ending addresses, the general case works for the single-buffer case as well.
Signed-off-by: Bryan Rosenburg <rosnbrg@us.ibm.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 20 February 2008, 2 commits
-
-
Submitted by Roland Dreier
When mthca_fmr_alloc() returns an error, it should free the MPT at the index key, not mr->ibmr.lkey, since the lkey has been mangled by hw_index_to_key() and no longer is the real index. This bug causes corruption of the MPT table free bitmap when mthca_fmr_alloc() fails.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Pradeep Satyanarayana
Commit efcd9971 ("IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list()") introduced a bug in ipoib_cm_dev_stop() when the receive drain times out. In that case, the function moves all the pending rx stuff into a private list but then calls ipoib_cm_free_rx_reap_list(), which handles a different list. Fix this by moving everything to the rx_reap_list that will actually get freed up. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=906>.
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 19 February 2008, 1 commit
-
-
Submitted by Roland Dreier
In nes_create_qp(), the test
    if (nesqp->mmap_sq_db_index > NES_MAX_USER_WQ_REGIONS) {
is used to error out if the db_index is too large; however, if the test doesn't trigger, then the index is used as
    nes_ucontext->mmap_nesqp[nesqp->mmap_sq_db_index] = nesqp;
and mmap_nesqp is declared as
    struct nes_qp *mmap_nesqp[NES_MAX_USER_WQ_REGIONS];
which leads to an array overrun if the index is exactly equal to NES_MAX_USER_WQ_REGIONS. Fix this by bailing out if the index is greater than or equal to NES_MAX_USER_WQ_REGIONS. This was spotted by the Coverity checker (CID 2162).
Acked-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
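A minimal sketch of the corrected bound check; NES_MAX_USER_WQ_REGIONS is stubbed here purely for illustration:

    #define NES_MAX_USER_WQ_REGIONS 4  /* illustrative value; the driver defines the real one */

    /* Valid indices into mmap_nesqp[NES_MAX_USER_WQ_REGIONS] run from 0 to
     * NES_MAX_USER_WQ_REGIONS - 1, so equality must be rejected as well. */
    static int nes_db_index_ok(unsigned int mmap_sq_db_index)
    {
        return mmap_sq_db_index < NES_MAX_USER_WQ_REGIONS;  /* old check used '>' */
    }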
-
- 17 February 2008, 1 commit
-
-
Submitted by Chien Tung
We need to account for the VLAN header size in nes_netdev_change_mtu() and nes_netdev_init(). Also, add spin lock/unlock during VLAN RX registration so only one process can assign a VLAN group for a given interface at a time.
Signed-off-by: Chien Tung <ctung@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 16 February 2008, 2 commits
-
-
Submitted by Glenn Streiff
Only mask out the MAC interrupt if necessary and re-enable it on ifup. There could be multiple netdevs going through the same MAC. MAC interrupts should not be masked off until the last netdev is downed.
Signed-off-by: Chien Tung <ctung@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Li Zefan
If kobject_create_and_add() fails and returns NULL, the current code in ib_device_register_sysfs() does not set ret and hence returns 0. Set ret to -ENOMEM for this failure, so that the caller knows that ib_device_register_sysfs() actually failed.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
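A minimal sketch of the error path, with the surrounding sysfs setup reduced to the one relevant call (the helper name is assumed):

    #include <linux/errno.h>
    #include <linux/kobject.h>

    static int register_ports_kobj(struct kobject *parent, struct kobject **ports_kobj)
    {
        int ret = 0;

        *ports_kobj = kobject_create_and_add("ports", parent);
        if (!*ports_kobj)
            ret = -ENOMEM;  /* previously left at 0, so callers never saw the failure */

        return ret;
    }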
-
- 15 February 2008, 4 commits
-
-
Submitted by Sean Hefty
There's an undesirable interaction with issuing MRA requests to increase connection timeouts and the listen backlog. When the rdma_cm receives a connection request, it queues an MRA with the ib_cm. (The ib_cm will send an MRA if it receives a duplicate REQ.) The rdma_cm will then create a new rdma_cm_id and give that to the user, which in this case is the rdma_user_cm. If the listen backlog maintained in the rdma_user_cm is full, it destroys the rdma_cm_id, which in turn destroys the ib_cm_id. The ib_cm_id generates a REJ because the state of the ib_cm_id has changed to MRA sent, versus REQ received. When the backlog is full, we just want to drop the REQ so that it is retried later. Fix this by deferring queuing the MRA until after the user of the rdma_cm has examined the connection request.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Jack Morgenstein
Currently mlx4_ib_fmr_alloc() calls mlx4_mr_enable() instead of mlx4_fmr_enable(). The two functions are equivalent at the moment, but this is not really correct (and the change is needed to fix a bug).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Eli Cohen
struct ipoib_cm_tx.ibwc is unused since commit 1b524963 ("IPoIB/cm: Use common CQ for CM send completions"), so remove it.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
-
Submitted by Jack Morgenstein
In P_Key event handling, if the old P_Key is no longer available, the driver must call ipoib_ib_dev_stop() -- just as it does when the P_Key is still available (see procedure __ipoib_ib_dev_flush()). When a P_Key becomes available, the driver will perform ipoib_open(), which assumes that the QP is in RESET, the cm_id has been destroyed/deleted, etc. If ipoib_ib_dev_stop() is not called as described above, then these assumptions will be false, and the attempt to bring the interface up will fail. Found by Mellanox QA.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 13 February 2008, 5 commits
-
-
Submitted by Marcin Slusarz
Replace:
    big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) + expression_in_cpu_byteorder);
with:
    beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);
Generated with a semantic patch.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
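The same transformation shown on an illustrative __be32 field (the function and field names are examples, not code from this patch):

    #include <asm/byteorder.h>
    #include <linux/types.h>

    static void advance_psn(__be32 *psn, u32 delta)
    {
        /* before: *psn = cpu_to_be32(be32_to_cpu(*psn) + delta); */
        be32_add_cpu(psn, delta);
    }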
-
Submitted by Steve Wise
The cxgb3 HW and driver don't support loopback RDMA connections. So fail any connection attempt where the destination address is local.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Roland Dreier
Commit 9af57b7a ("IB/cm: Add basic performance counters") introduced a bug in how the reference count for cm_class.subsys.kobj was handled: the path that released a device did a kobject_put() on that kobject, but there was no kobject_get() in the path that handles adding a device. So the reference count ended up too low, which leads to bad things. Fix up and simplify the reference counting to avoid this. (Actually, I introduced the bug when fixing the patch up to match some of Greg's kobject changes, but who's counting)
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Roland Dreier
Pesky little devils, sneaking around...
Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Submitted by Roland Dreier
Usually harmless, since the scatterlist is always hard-coded to a length of 1, but it triggers a BUG() if CONFIG_DEBUG_SG=y, so we better fix it. This fixes <http://bugzilla.kernel.org/show_bug.cgi?id=9934>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
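A minimal sketch of the pattern implied above (not the actual diff): build the single-entry scatterlist with sg_init_one(), which sets the CONFIG_DEBUG_SG magic and the end marker, instead of filling the entry by hand:

    #include <linux/scatterlist.h>

    static void map_single_buffer(struct scatterlist *sg, void *buf, unsigned int len)
    {
        sg_init_one(sg, buf, len);  /* initializes sg_magic and marks the end */
    }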
-