- 02 12月, 2008 1 次提交
-
-
由 Jack Morgenstein 提交于
When resizing a CQ, MTTs associated with the old CQE buffer were not freed. As a result, if any app used resize CQ repeatedly, all MTTs were eventually exhausted, which led to all memory registration operations failing until the driver is reloaded. Once the RESIZE_CQ command returns successfully from FW, FW no longer accesses the old CQ buffer, so it is safe to deallocate the MTT entries used by the old CQ buffer. Finally, if the RESIZE_CQ command fails, the MTTs allocated for the new CQEs buffer also need to be de-allocated. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1416>. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 07 8月, 2008 1 次提交
-
-
由 Yevgeny Petrilin 提交于
Add ethernet-related fields to struct mlx4_cqe so that the mlx4_en ethernet NIC driver can share the same definition. Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 26 7月, 2008 1 次提交
-
-
由 Jack Morgenstein 提交于
Update existing Mellanox copyright lines to 2008, and add such lines to files where they are missing. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 23 7月, 2008 1 次提交
-
-
由 Roland Dreier 提交于
Add support for the following operations to mlx4 when device firmware supports them: - Send with invalidate and local invalidate send queue work requests; - Allocate/free fast register MRs; - Allocate/free fast register MR page lists; - Fast register MR send queue work requests; - Local DMA L_Key. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 15 7月, 2008 1 次提交
-
-
由 Steve Wise 提交于
This patch adds support for the IB "base memory management extension" (BMME) and the equivalent iWARP operations (which the iWARP verbs mandates all devices must implement). The new operations are: - Allocate an ib_mr for use in fast register work requests. - Allocate/free a physical buffer lists for use in fast register work requests. This allows device drivers to allocate this memory as needed for use in posting send requests (eg via dma_alloc_coherent). - New send queue work requests: * send with remote invalidate * fast register memory region * local invalidate memory region * RDMA read with invalidate local memory region (iWARP only) Consumer interface details: - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added to indicate device support for these features. - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV, IB_WR_RDMA_READ_WITH_INV are added. - A new consumer API function, ib_alloc_mr() is added to allocate fast register memory regions. - New consumer API functions, ib_alloc_fast_reg_page_list() and ib_free_fast_reg_page_list() are added to allocate and free device-specific memory for fast registration page lists. - A new consumer API function, ib_update_fast_reg_key(), is added to allow the key portion of the R_Key and L_Key of a fast registration MR to be updated. Consumers call this if desired before posting a IB_WR_FAST_REG_MR work request. Consumers can use this as follows: - MR is allocated with ib_alloc_mr(). - Page list memory is allocated with ib_alloc_fast_reg_page_list(). - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key(). - MR made VALID and bound to a specific page list via ib_post_send(IB_WR_FAST_REG_MR) - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV), ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with invalidate operation. - MR is deallocated with ib_dereg_mr() - page lists dealloced via ib_free_fast_reg_page_list(). Applications can allocate a fast register MR once, and then can repeatedly bind the MR to different physical block lists (PBLs) via posting work requests to a send queue (SQ). For each outstanding MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be allocated (the fast_reg_page_list is owned by the low-level driver from the consumer posting a work request until the request completes). Thus pipelining can be achieved while still allowing device-specific page_list processing. The 32-bit fast register memory key/STag is composed of a 24-bit index and an 8-bit key. The application can change the key each time it fast registers thus allowing more control over the peer's use of the key/STag (ie it can effectively be changed each time the rkey is rebound to a page list). Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 01 5月, 2008 1 次提交
-
-
由 Roland Dreier 提交于
When I merged bbf8eed1 ("IB/mlx4: Add support for resizing CQs") I changed things around so that mlx4_ib_alloc_cq_buf() and mlx4_ib_free_cq_buf() were used everywhere they could be. However, I screwed up the number of entries passed into mlx4_ib_alloc_cq_buf() in a couple places -- the function bumps the number of entries internally, so the caller shouldn't add 1 as well. Passing a too-big value for the number of entries to mlx4_ib_free_cq_buf() can cause the cleanup to go off the end of an array and corrupt allocator state in interesting ways. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 30 4月, 2008 1 次提交
-
-
由 Yevgeny Petrilin 提交于
Extend the mlx4_cq_resize() API with a way to set the "collapsed" flag for the CQ being created. Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 29 4月, 2008 1 次提交
-
-
由 Arthur Kepner 提交于
Add a new parameter, dmasync, to the ib_umem_get() prototype. Use dmasync = 1 when mapping user-allocated CQs with ib_umem_get(). Signed-off-by: NArthur Kepner <akepner@sgi.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Jes Sorensen <jes@sgi.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 4月, 2008 1 次提交
-
-
由 Yevgeny Petrilin 提交于
In addition to mlx4_ib, there will be ethernet and FC consumers of mlx4_core, so move the code for managing kernel doorbells into the core module to avoid having to duplicate this multiple times. Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 17 4月, 2008 4 次提交
-
-
由 Vladimir Sokolovsky 提交于
Signed-off-by: NVladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Eli Cohen 提交于
Signed-off-by: NEli Cohen <eli@mellnaox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Eli Cohen 提交于
Add TSO support to the mlx4_ib driver. Signed-off-by: NEli Cohen <eli@mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Eli Cohen 提交于
ConnectX devices support checksum generation and verification of TCP and UDP packets for UD IPoIB messages. This patch checks if the HCA supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it does. It implements support for handling the IB_SEND_IP_CSUM send flag and setting the csum_ok field in receive work completions. Signed-off-by: NEli Cohen <eli@mellanox.co.il> Signed-off-by: NAli Ayub <ali@mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 09 2月, 2008 1 次提交
-
-
由 Jack Morgenstein 提交于
ConnectX HCA supports shrinking WQEs, so that a single work request can be made of multiple units of wqe_shift. This way, WRs can differ in size, and do not have to be a power of 2 in size, saving memory and speeding up send WR posting. Unfortunately, if we do this then the wqe_index field in CQEs can't be used to look up the WR ID anymore, so our implementation does this only if selective signaling is off. Further, on 32-bit platforms, we can't use vmap() to make the QP buffer virtually contigious. Thus we have to use constant-sized WRs to make sure a WR is always fully within a single page-sized chunk. Finally, we use WRs with the NOP opcode to avoid wrapping around the queue buffer in the middle of posting a WR, and we set the NoErrorCompletion bit to avoid getting completions with error for NOP WRs. However, NEC is only supported starting with firmware 2.2.232, so we use constant-sized WRs for older firmware. And, since MLX QPs only support SEND, we use constant-sized WRs in this case. When stamping during NOP posting, do stamping following setting of the NOP WQE valid bit. Signed-off-by: NMichael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 07 2月, 2008 1 次提交
-
-
由 Roland Dreier 提交于
We use struct mlx4_buf for kernel QP, CQ and SRQ buffers, and the code to look up an entry is duplicated in get_cqe_from_buf() and the QP and SRQ versions of get_wqe(). Factor this out into mlx4_buf_offset(). This will also make it easier to switch over to using vmap() for buffers. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 26 1月, 2008 1 次提交
-
-
由 Roland Dreier 提交于
Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a field from it, byte-swap it once into a temporary variable. This results in smaller, better code -- eg, on 32-bit x86: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5) function old new delta mlx4_ib_poll_cq 1188 1183 -5 Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 09 1月, 2008 1 次提交
-
-
由 Dotan Barak 提交于
Fix the value of pkey_index in completions to get a valid value for GSI QPs. Without this fix, incoming GSI packets on port 2 get an invalid P_Key index in the completion, which prevents the MAD layer from sending back a response, which can make the second port of ConnectX HCAs completely useless. Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 04 8月, 2007 1 次提交
-
-
由 Vu Pham 提交于
Current code has a cut-and-paste error and returns IB_WC_SEND when it should return IB_WC_RDMA_READ. Signed-off-by: NVu Pham <vu@mellanox.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 18 6月, 2007 2 次提交
-
-
由 Jack Morgenstein 提交于
When compacting CQ entries, we need to set the correct value of the ownership bit in case the value is different between the index we copy the CQE from and the index we copy it to. Found by Ronni Zimmerman of Mellanox. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
New ConnectX firmware introduces FW command interface revision 2, which requires that for each QP, a chunk of send queue entries (the "headroom") is kept marked as invalid, so that the HCA doesn't get confused if it prefetches entries that haven't been posted yet. Add code to the driver to do this, and also update the user ABI so that userspace can request that the prefetcher be turned off for userspace QPs (we just leave the prefetcher on for all kernel QPs). Unfortunately, marking send queue entries this way is confuses older firmware, so we change the driver to allow only FW command interface revisions 2. This means that users will have to update their firmware to work with the new driver, but the firmware is changing quickly and the old firmware has lots of other bugs anyway, so this shouldn't be too big a deal. Based on a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 13 6月, 2007 1 次提交
-
-
由 Roland Dreier 提交于
Cast the increment added to wq->tail when send completions are processed to u16 to avoid using wrong values caused by standard integer promotions. The same bug was fixed in libmlx4 by Eli Cohen <eli@mellanox.co.il>. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 09 5月, 2007 1 次提交
-
-
由 Roland Dreier 提交于
Add an InfiniBand driver for Mellanox ConnectX adapters. Because these adapters can also be used as ethernet NICs and Fibre Channel HBAs, the driver is split into two modules: mlx4_core: Handles low-level things like device initialization and processing firmware commands. Also controls resource allocation so that the InfiniBand, ethernet and FC functions can share a device without stepping on each other. mlx4_ib: Handles InfiniBand-specific things; plugs into the InfiniBand midlayer. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-