- 16 4月, 2015 5 次提交
-
-
由 Doug Ledford 提交于
During my recent work on the rtnl lock deadlock in the IPoIB driver, I saw that even once I fixed the apparent races for a single device, as soon as that device had any children, new races popped up. It turns out that this is because no matter how well we protect against races on a single device, the fact that all devices use the same workqueue, and flush_workqueue() flushes *everything* from that workqueue means that we would also have to prevent all races between different devices (for instance, ipoib_mcast_restart_task on interface ib0 can race with ipoib_mcast_flush_dev on interface ib0.8002, resulting in a deadlock on the rtnl_lock). There are several possible solutions to this problem: Make carrier_on_task and mcast_restart_task try to take the rtnl for some set period of time and if they fail, then bail. This runs the real risk of dropping work on the floor, which can end up being its own separate kind of deadlock. Set some global flag in the driver that says some device is in the middle of going down, letting all tasks know to bail. Again, this can drop work on the floor. Or the method this patch attempts to use, which is when we bring an interface up, create a workqueue specifically for that interface, so that when we take it back down, we are flushing only those tasks associated with our interface. In addition, keep the global workqueue, but now limit it to only flush tasks. In this way, the flush tasks can always flush the device specific work queues without having deadlock issues. Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Doug Ledford 提交于
We blindly assume that we can just take the rtnl lock and that will prevent races with downing this interface. Unfortunately, that's not the case. In ipoib_mcast_stop_thread() we will call flush_workqueue() in an attempt to clear out all remaining instances of ipoib_join_task. But, since this task is put on the same workqueue as the join task, the flush_workqueue waits on this thread too. But this thread is deadlocked on the rtnl lock. The better thing here is to use trylock and loop on that until we either get the lock or we see that FLAG_OPER_UP has been cleared, in which case we don't need to do anything anyway and we just return. While investigating which flag should be used, FLAG_ADMIN_UP or FLAG_OPER_UP, it was determined that FLAG_OPER_UP was the more appropriate flag to use. However, there was a mix of these two flags in use in the existing code. So while we check for that flag here as part of this race fix, also cleanup the two places that had used the less appropriate flag for their tests. Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Doug Ledford 提交于
The ipoib_mcast_flush_dev routine is called with the rtnl_lock held and needs to keep it held. It also needs to call flush_workqueue() to flush out any outstanding work. In the past, we've had to try and make sure that we didn't flush out any outstanding join completions because they also wanted to grab rtnl_lock() and that would deadlock. It turns out that the only thing in the join completion handler that needs this lock can be safely moved to our carrier_on_task, thereby reducing the potential for the join completion code and the flush code to deadlock against each other. Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Doug Ledford 提交于
In preparation for using per device work queues, we need to move the start of the neighbor thread task to after ipoib_ib_dev_init and move the destruction of the neighbor task to before ipoib_ib_dev_cleanup. Otherwise we will end up freeing our workqueue with work possibly still on it. Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Doug Ledford 提交于
Create a an ipoib_flush_ah and ipoib_stop_ah routines to use at appropriate times to flush out all remaining ah entries before we shut the device down. Because neighbors and mcast entries can each have a reference on any given ah, we must make sure to free all of those first before our ah will actually have a 0 refcount and be able to be reaped. This factoring is needed in preparation for having per-device work queues. The original per-device workqueue code resulted in the following error message: <ibdev>: ib_dealloc_pd failed That error was tracked down to this issue. With the changes to which workqueues were flushed when, there were no flushes of the per device workqueue after the last ah's were freed, resulting in an attempt to dealloc the pd with outstanding resources still allocated. This code puts the explicit flushes in the needed places to avoid that problem. Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 03 4月, 2015 10 次提交
-
-
由 Saeed Mahameed 提交于
Preparation for ethernet driver. Signed-off-by: NAchiad Shochat <achiad@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Saeed Mahameed 提交于
Pass consumer index as a parameter to arm CQ Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Saeed Mahameed 提交于
Preparation for ethernet driver. These functions will be used in drivers other than mlx5_ib. Signed-off-by: NAchiad Shochat <achiad@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Saeed Mahameed 提交于
Signed-off-by: NAchiad Shochat <achiad@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Saeed Mahameed 提交于
Do it in one place instead of every where the function is invoked Signed-off-by: NAchiad Shochat <achiad@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eli Cohen 提交于
Put a line of space before return and next statement. Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Haggai Abramonvsky 提交于
Pass 0 in the sqd_event parameter. Signed-off-by: NHaggai Abramovsky <hagaya@mellanox.com> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Shamay 提交于
The calls to SET_PORT used hard-code numbers, when supplying command's opcode modifiers, fix that to use well defined constants. Signed-off-by: NIdo Shamay <idos@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
Don't use dev->iflink anymore. CC: Roland Dreier <roland@kernel.org> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shachar Raindel 提交于
Properly verify that the resulting page aligned end address is larger than both the start address and the length of the memory area requested. Both the start and length arguments for ib_umem_get are controlled by the user. A misbehaving user can provide values which will cause an integer overflow when calculating the page aligned end address. This overflow can cause also miscalculation of the number of pages mapped, and additional logic issues. Addresses: CVE-2014-8159 Cc: <stable@vger.kernel.org> Signed-off-by: NShachar Raindel <raindel@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 26 3月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 19 3月, 2015 2 次提交
-
-
由 Majd Dibbiny 提交于
For RoCE ports, we set the u32 PMA values based on u64 HCA counters. In case of overflow, according to the IB spec, we have to saturate a counter to its max value, do that. Fixes: c3779134 ('IB/mlx4: Support PMA counters for IBoE') Signed-off-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com> Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moni Shoua 提交于
Processing an event is done in a different context from the one when the event was dispatched. This requires a check that the slave net device is still valid when the event is being processed. The check is done under the iboe lock which ensure correctness. Fixes: a5750090 ('IB/mlx4: Add port aggregation support') Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 2月, 2015 2 次提交
-
-
由 Mike Marciniszyn 提交于
Upstream checkpatch now requires this. Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mike Marciniszyn 提交于
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 20 2月, 2015 1 次提交
-
-
由 David Howells 提交于
Code that does this: if (!(d_unhashed(tmp) && tmp->d_inode)) { ... simple_unlink(parent->d_inode, tmp); } is broken because: !(d_unhashed(tmp) && tmp->d_inode) is equivalent to: !d_unhashed(tmp) || !tmp->d_inode so it is possible to get into simple_unlink() with tmp->d_inode == NULL. simple_unlink(), however, assumes tmp->d_inode cannot be NULL. I think that what was meant is this: !d_unhashed(tmp) && tmp->d_inode and that the logical-not operator or the final close-bracket was misplaced. Signed-off-by: NDavid Howells <dhowells@redhat.com> cc: Bryan O'Sullivan <bos@pathscale.com> cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 19 2月, 2015 19 次提交
-
-
由 Haggai Eran 提交于
Re-enable the on-demand paging capability query through the extended query device verb. Signed-off-by: NHaggai Eran <haggaie@mellanox.com> Reviewed-by: NYann Droneaud <ydroneaud@opteya.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Haggai Eran 提交于
Add on-demand paging capabilities reporting to the extended query device verb. Yann Droneaud writes: Note: as offsetof() is used to retrieve the size of the lower chunk of the response, beware that it only works if the upper chunk is right after, without any implicit padding. And, as the size of the latter chunk is added to the base size, implicit padding at the end of the structure is not taken in account. Both point must be taken in account when extending the uverbs functionalities. Signed-off-by: NHaggai Eran <haggaie@mellanox.com> Reviewed-by: NYann Droneaud <ydroneaud@opteya.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Eli Cohen 提交于
Add extensible query device capabilities verb to allow adding new features. ib_uverbs_ex_query_device is added and copy_query_dev_fields is used to copy capability fields to be used by both ib_uverbs_query_device and ib_uverbs_ex_query_device. Following the discussion about this patch [1], the code now validates the command's comp_mask is zero, returning -EINVAL for unknown values, in order to allow extending the verb in the future. The verb also checks the user-space provided response buffer size and only fills in capabilities that will fit in the buffer. In attempt to follow the spirit of presentation [2] by Tzahi Oved that was presented during OpenFabrics Alliance International Developer Workshop 2013, the comp_mask bits will only describe which fields are valid. Furthermore, fields that can simply be cleared when they are not supported, do not require a comp_mask bit at all. The verb returns a response_length field containing the actual number of bytes written by the kernel, so that a newer version running on an older kernel can tell which fields were actually returned. [1] [PATCH v1 0/5] IB/core: extended query device caps cleanup for v3.19 http://thread.gmane.org/gmane.linux.kernel.api/7889/ [2] https://www.openfabrics.org/images/docs/2013_Dev_Workshop/Tues_0423/2013_Workshop_Tues_0830_Tzahi_Oved-verbs_extensions_ofa_2013-tzahio.pdfSigned-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NHaggai Eran <haggaie@mellanox.com> Reviewed-by: NYann Droneaud <ydroneaud@opteya.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Hariprasad S 提交于
In c4iw_wait_for_reply(), if a FW6_MSG WR reply is not received after C4IW_WR_TO seconds, fail the WR operation and mark the device as fatally dead. Further, if the device is marked fatally dead, then fail the WR wait immediately. Also change the timeout to 60 seconds. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Dan Carpenter 提交于
The ->sgid_tbl[] array has OCRDMA_MAX_SGID number of elements so this test is off by one. ->sgid_tbl is allocated in ocrdma_alloc_resources(). Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Rasmus Villemoes 提交于
In the expressions idx/32 and idx%32, both idx and 32 have signed type, and unfortunately the C standard prescribes rounding to 0, so unless gcc can prove that idx is non-negative, these cannot be implemented as simple shift respectively mask operations. Help gcc by changing the type of idx to unsigned - this cuts another few instructions from the generated code. Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Rasmus Villemoes 提交于
gcc emits a surprising amount of code in order to flip a bit. One would think that a single instruction is enough. $ scripts/bloat-o-meter /tmp/ocrdma_verbs.o drivers/infiniband/hw/ocrdma/ocrdma_verbs.o add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-142 (-142) function old new delta ocrdma_post_srq_recv 498 460 -38 ocrdma_poll_cq 2010 1962 -48 ocrdma_discard_cqes 495 439 -56 All three calls of ocrdma_srq_toggle_bit happen within spinlocks, so saving a few useless instructions might be worthwhile. Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mitesh Ahuja 提交于
Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Devesh Sharma 提交于
For the AH that describs a VLAN interface details, vlan present bit needs to be set during posting a WQE. This patch adds the code to allow it happening. Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mitesh Ahuja 提交于
Use get_ocrdma_dev(ocrdma_qp->ibqp.device) function to access ocrdma device pointer. Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mitesh Ahuja 提交于
Add support for interrupt moderation for ocrdma device. Thresholds for high interrupt rates are static values derived based on experimental results. Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Devesh Sharma 提交于
Check for return value for ocrdma_resolve_dmac while setting AV params. Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Selvin Xavier 提交于
If the SQ and RQ of the QP in error state uses separate CQs, traverse the list of QPs using each CQs and invoke the buddy CQ handler for both SQ and RQ. Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Devesh Sharma 提交于
Remove support for RDMA-READ-WITH-INVALIDATE from ocrdma driver. Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mitesh Ahuja 提交于
1. Cleanup sequence in ocrdma_remove(). The device should be unregistered from IB stack before any device specific cleanup. 2. Always return success in the resource destroy path. In case destroy command returns error, IB stack will trigger cleanup again while closing the uverbs device causing kernel panic BUG_ON(). Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Padmanabh Ratnakar 提交于
Fix ocrdma_query_qp to refelect correct qp state based on FW. Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NPadmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Selvin Xavier 提交于
1. Add statistics counters for error cqes. 2. Add file ("reset_stats") to reset rdma stats in Debugfs. Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Devesh Sharma 提交于
Fix ocrdma_register_device to initialize correct number of interrupt vectors in device pointer. Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mitesh Ahuja 提交于
Move PD allocation and deallocation from firmware to driver. At driver load time all the PDs will be requested from firmware and their management will be handled by driver to reduce mailbox commands overhead at runtime. Signed-off-by: NMitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: NDevesh Sharma <devesh.sharma@emulex.com> Signed-off-by: NSelvin Xavier <selvin.xavier@emulex.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-