- 16 4月, 2015 3 次提交
-
-
由 Yishai Hadas 提交于
Request GIDs from the SM on demand, i.e., when a VF actually needs them, and release them when the GIDs are no longer in use. In cloud environments, this is useful for GID migrations, in which a GID is assigned to a VF on the destination HCA, while the VF on the source HCA is shutdown (but the GID was not administratively released). Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Yishai Hadas 提交于
Change the init flow to ask GUIDs only for active VFs. This is done for both SM & HOST modes so that there is no need any more to maintain the ownership record type. In case SM mode is used, the initial value will be 0, ask the SM to assign, for the HOST mode the initial value will be the HOST generated GUID. This will enable out of the box experience for both probed and attached VFs. Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Yishai Hadas 提交于
If the SM rejects an alias GUID request the PF driver keeps trying to acquire the specified GUID indefinitely, utilizing an exponential backoff scheme. Retrying is managed per GUID entry. Each entry that wasn't applied holds its next retry information. Retry requests to the SM consist of records of 8 consecutive GUIDS. Each record that contains GUIDs requiring retries holds its next time-to-run based on the retry information of all its GUID entries. The record having the lowest retry time will run first when that retry time arrives. Since the method (SET or DELETE) as sent to the SM applies to all the GUIDs in the record, we must handle SET requests and DELETE requests in separate SM messages (one for SETs and the other for DELETEs). To avoid race conditions where a GUID entry request (set or delete) was modified after the SM request was sent, we save the method and the requested indices as part of the callback's context -- thus, only the requested indexes are evaluated when the response is received. When an GUID entry is approved we turn off its retry-required bit, this prevents redundant SM retries from occurring on that record. The port down event should be sent only when previously it was up. Likewise, the port up event should be sent only if previously the port was down. Synchronization was added around the flows that change entries and record state to prevent race conditions. Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 10 2月, 2015 1 次提交
-
-
由 Yishai Hadas 提交于
The driver exposes interfaces that directly relate to HW state. Upon fatal error, consumers of these interfaces (ULPs) that rely on completion of all their posted work-request could hang, thereby introducing dependencies in shutdown order. To prevent this from happening, we manage the relevant resources (CQs, QPs) that are used by the device. Upon a fatal error, we now generate simulated completions for outstanding WQEs that were not completed at the time the HW was reset. It includes invoking the completion event handler for all involved CQs so that the ULPs will poll those CQs. When polled we return simulated CQEs with IB_WC_WR_FLUSH_ERR return code enabling ULPs to clean up their resources and not wait forever for completions upon receiving remove_one. The above change requires an extra check in the data path to make sure that when device is in error state, the simulated CQEs will be returned and no further WQEs will be posted. Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2015 2 次提交
-
-
由 Moni Shoua 提交于
When the mlx4 IB (RoCE) device works in link aggregation mode, it exposes a single port to upper layers. Therefore, applications always set '1' in port_num attribute when modifying a QP or creating an address handle. To make sure that a node uses all available ports the mlx4 driver will override the port_num attribute with a round robin policy. Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moni Shoua 提交于
In port aggregation mode flows for port #1 (the only port) should be mirrored on port #2. This is because packets can arrive from either physical ports. Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 9月, 2014 1 次提交
-
-
由 Jack Morgenstein 提交于
The source MAC is needed in RoCE when building the QP1 header. Currently, this is obtained from the source net device. However, the net device may not yet exist, or can be destroyed in parallel to this QP1 send operation (e.g through the VPI port change flow) so accessing it may cause a kernel crash. To fix this, we maintain a source MAC cache per port for the net device in struct mlx4_ib_roce. This cached MAC is initialized to be the default MAC address obtained during HCA initialization via QUERY_PORT. This cached MAC is updated via the netdev event notifier handler. Since the cached MAC is held in an atomic64 object, we do not need locking when accessing it. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 02 8月, 2014 1 次提交
-
-
由 Matan Barak 提交于
This enables the user to change the protection domain, access flags and translation (address and length) of the MR. Use basic mlx4_core helper functions to get, update and set MPT and MTT objects according to the required modifications. Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 03 6月, 2014 1 次提交
-
-
由 Jiri Kosina 提交于
Modify the various routines used to allocate memory resources which serve QPs in mlx4 to get an input GFP directive. Have the Ethernet driver to use GFP_KERNEL in it's QP allocations as done prior to this commit, and the IB driver to use GFP_NOIO when the IB verbs IB_QP_CREATE_USE_GFP_NOIO QP creation flag is provided. Signed-off-by: NMel Gorman <mgorman@suse.de> Signed-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 17 5月, 2014 1 次提交
-
-
由 Matan Barak 提交于
When we receive a netdev event indicating a netdev change and/or a netdev address change, we must change the MAC index used by the proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the VF will not carry the same source MAC address as the non-CM packets. We use the UPDATE_QP command to perform this change. In order to avoid modifying a QP context based on netdev event, while the driver attempts to destroy this QP (e.g either the mlx4_ib or ib_mad modules are unloaded), we use mutex locking in both flows. Since the relevant mlx4 proxy GSI QP is created indirectly by the mad module when they create their GSI QP, the mlx4 didn't need to keep track on that QP prior to this change. Now, when QP modifications are needed to this QP from within the driver, we added refernece to it. Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 3月, 2014 2 次提交
-
-
由 Jack Morgenstein 提交于
Since there is no connection between the MAC/VLAN and the GID when using IP-based addressing, the proxy QP1 (running on the slave) must pass the source-mac, destination-mac, and vlan_id information separately from the GID. Additionally, the Host must pass the remote source-mac and vlan_id back to the slave, This is achieved as follows: Outgoing MADs: 1. Source MAC: obtained from the CQ completion structure (struct ib_wc, smac field). 2. Destination MAC: obtained from the tunnel header 3. vlan_id: obtained from the tunnel header. Incoming MADs 1. The source (i.e., remote) MAC and vlan_id are passed in the tunnel header to the proxy QP1. VST mode support: For outgoing MADs, the vlan_id obtained from the header is discarded, and the vlan_id specified by the Hypervisor is used instead. For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the "invalid" vlan (0xffff) is substituted when forwarding to the slave. Signed-off-by: NMoni Shoua <monis@mellanox.co.il> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jack Morgenstein 提交于
The IB side of RoCE requires the MAC table index of the MAC address used by its QPs. To obtain the real MAC index, the IB side registers the MAC (increasing its ref count, and also returning the real MAC index) during the modify-qp sequence. This protects against the ETH side deleting or modifying that MAC table entry while the QP is active. Note that until the modify-qp command returns success, the MAC and VLAN information only has "candidate" status. If the modify-qp succeeds, the "candidate" info is promoted to the operational MAC/VLAN info for the qp. If the modify fails, the candidate MAC/VLAN is unregistered, and the old qp info is preserved. The patch is a bit complex, because there are multiple qp transitions where the primary-path information may be modified: INIT-to-RTR, and SQD-to-SQD. Similarly for the alternate path information. Therefore the code must handle cases where path information has already been entered into the QP context by previous qp transitions. For the MAC address, the success logic is as follows: 1. If there was no previous MAC, simply move the candidate MAC information to the operational information, and reset the candidate MAC info. 2. If there was a previous MAC, unregister it. Then move the MAC information from candidate to operational, and reset the candidate info (as in 1. above). The MAC address failure logic is the same for all cases: - Unregister the candidate MAC, and reset the candidate MAC info. For Vlan registration, the logic is similar. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 1月, 2014 2 次提交
-
-
由 Moni Shoua 提交于
IP based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN. Therefore, we need to extract them from the CQE and place them in struct ib_wc (to be used for cases were they were taken from the gid). Also, when modifying a QP or building address handle, instead of parsing the dgid to get the MAC and VLAN, take them from the address handle attributes. Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Moni Shoua 提交于
Currently, the mlx4 driver set IBoE (RoCE) gids to encode related Ethernet netdevice interface MAC address and possibly VLAN id. Change this scheme such that gids encode interface IP addresses (both IP4 and IPv6). This requires learning the IP addresses which are of use by a netdevice associated with the HCA port, formatting them to gids and adding them to the port gid table. Furthermore, events of add and delete address are caught to maintain the gid table accordingly. Associated IP addresses may belong to a master of an Ethernet netdevice on top of that port so this should be considered when building and maintaining the gid table. Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 15 1月, 2014 2 次提交
-
-
由 Matan Barak 提交于
This patch adds support for steerable (NETIF) QP creation. When we create the device, we allocate a range of steerable QPs. Afterward when a QP is created with the NETIF flag, it's allocated from this range. Allocation is managed by bitmap allocator. Internal steering rules for those QPs is automatically generated on their creation. Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Matan Barak 提交于
Up until now, flow steering wasn't supported when using IB ports. This patch enables support for flow steering if all hardware ports support that, for example the new MLX4_DEV_CAP_FLAG2_DMFS_IPOIB mlx4 device capability. Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 29 8月, 2013 1 次提交
-
-
由 Hadar Hen Zion 提交于
Implement ib_create_flow() and ib_destroy_flow(). Translate the verbs structures provided by the user to HW structures and call the MLX4_QP_FLOW_STEERING_ATTACH/DETACH firmware commands. On the ATTACH command completion, the firmware provides a 64-bit registration ID, which is placed into struct mlx4_ib_flow that wraps the instance of struct ib_flow which is retuned to caller. Later, this reg ID is used for detaching that flow from the firmware. Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 26 2月, 2013 3 次提交
-
-
由 Shani Michaeli 提交于
* Implement memory windows binding in mlx4_ib_post_send. * Implement mlx4_ib_bind_mw by deferring to mlx4_ib_post_send. * Rename MLX4_WQE_FMR_PERM_* flags to MLX4_WQE_FMR_AND_BIND_PERM_*, indicating that they are used both for fast registration work requests, and for memory window bind work requests. Signed-off-by: NHaggai Eran <haggaie@mellanox.com> Signed-off-by: NShani Michaeli <shanim@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Shani Michaeli 提交于
Implement MW allocation and deallocation in mlx4_core and mlx4_ib. Pass down the enable bind flag when registering memory regions. Signed-off-by: NHaggai Eran <haggaie@mellanox.com> Signed-off-by: NShani Michaeli <shanim@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Roland Dreier 提交于
Matches the way they're used, and actually lets at least x86-64 generate better code: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-38 (-38) function old new delta mlx4_ib_post_send 4416 4378 -38 Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 27 11月, 2012 1 次提交
-
-
由 Or Gerlitz 提交于
ConnectX-3 devices can use either 64- or 32-byte completion queue entries (CQEs) and event queue entries (EQEs). Using 64-byte EQEs/CQEs performs better because each entry is aligned to a complete cacheline. This patch queries the HCA's capabilities, and if it supports 64-byte CQEs and EQES the driver will configure the HW to work in 64-byte mode. The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ. Since this mode is global, userspace (libmlx4) must be updated to work with the configured CQE size, and guests using SR-IOV virtual functions need to know both EQE and CQE size. In case one of the 64-byte CQE/EQE capabilities is activated, the patch makes sure that older guest drivers that use the QUERY_DEV_FUNC command (e.g as done in mlx4_core of Linux 3.3..3.6) will notice that they need an update to be able to work with the PPF. This is done by changing the returned pf_context_behaviour not to be zero any more. In case none of these capabilities is activated that value remains zero and older guest drivers can run OK. The SRIOV related flow is as follows 1. the PPF does the detection of the new capabilities using QUERY_DEV_CAP command. 2. the PPF activates the new capabilities using INIT_HCA. 3. the VF detects if the PPF activated the capabilities using QUERY_HCA, and if this is the case activates them for itself too. Note that the VF detects that it must be aware to the new PF behaviour using QUERY_FUNC_CAP. Steps 1 and 2 apply also for native mode. User space notification is done through a new field introduced in struct mlx4_ib_ucontext which holds device capabilities for which user space must take action. This changes the binary interface so the ABI towards libmlx4 exposed through uverbs is bumped from 3 to 4 but only when **needed** i.e. only when the driver does use 64-byte CQEs or future device capabilities which must be in sync by user space. This practice allows to work with unmodified libmlx4 on older devices (e.g A0, B0) which don't support 64-byte CQEs. In order to keep existing systems functional when they update to a newer kernel that contains these changes in VF and userspace ABI, a module parameter enable_64b_cqe_eqe must be set to enable 64-byte mode; the default is currently false. Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 01 10月, 2012 8 次提交
-
-
由 Jack Morgenstein 提交于
This is necessary in order to support > 1 VF/PF in a VM for software that uses the node guid as a discriminator, such as librdmacm. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Jack Morgenstein 提交于
This directory is added only for the master -- slaves do not have it. The sysfs iov directory is used to manage and examine the port P_Key and guid paravirtualization. Under iov/ports, the administrator may examine the gid and P_Key tables as they are present in the device (and as are seen in the "network view" presented to the SM). Under the iov/<pci slot number> directories, the admin may map the index numbers in the physical tables (as under iov/ports) to the paravirtualized index numbers that guests see. For example, if the administrator, for port 1 on guest 2 maps physical pkey index 10 to virtual index 1, then that guest, whenever it uses its pkey index 1, will actually be using the real pkey index 10. Based on patch from Erez Shitrit <erezsh@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Jack Morgenstein 提交于
For IB ports, we paravirtualize the GUID at index 0 on slaves. The GUID at index 0 seen by a slave is the actual GUID occupying the GUID table at the slave-id index. The driver, by default, requests at startup time that subnet manager populate its entire guid table with GUIDs. These guids are then mapped (paravirtualized) to the slaves, and appear for each slave as its GUID at index 0. Until each slave has such a guid, its port status is DOWN. The guid table is cached to support special QP paravirtualization, and event propagation to slaves on guid change (we test to see if the guid really changed before propagating an event to the slave). To support this caching, add capability to __mlx4_ib_query_gid() to obtain the network view (i.e., physical view) gid at index X, not just the host (paravirtualized) view. Based on a patch from Erez Shitrit <erezsh@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Amir Vadai 提交于
In CM para-virtualization: 1. Incoming requests are steered to the correct vHCA according to the embedded GID. 2. Communication IDs on outgoing requests are replaced by a globally unique ID, generated by the PPF, since there is no synchronization of ID generation between guests (and so these IDs are not guaranteed to be globally unique). The guest's comm ID is stored, and is returned to the response MAD when it arrives. Signed-off-by: NAmir Vadai <amirv@mellanox.co.il> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Oren Duer 提交于
MCG paravirtualization support includes: - Creating multicast groups by VFs, and keeping accounting of them - Leaving multicast groups by VFs - Updating SM only with real changes in the overall picture of MCGs status - Creation of MGID=0 groups (let SM choose MGID) Note that the MCG module maintains its own internal MCG object reference counts. The reason for this is that the IB core is used to track only the multicast groups joins generated by the PF it runs over. The PF IB core layer is unaware of slaves, so it cannot be used to keep track of MCG joins they generate. Signed-off-by: NOren Duer <oren@mellanox.co.il> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Jack Morgenstein 提交于
The MAD_IFC firmware command fulfills two functions. First, it is used in the QP0/QP1 MAD-handling flow to obtain information from the FW (for answering queries), and for setting variables in the HCA (MAD SET packets). For this, MAD_IFC should provide the FW (physical) view of the data. This is the view that OpenSM needs. We call this the "network view". In the second case, MAD_IFC is used by various verbs to obtain data regarding the local HCA (e.g., ib_query_device()). We call this the "host view". This data needs to be paravirtualized. MAD_IFC therefore needs a wrapper function, and also needs another flag indicating whether it should provide the network view (when it is called by ib_process_mad in special-qp packet handling), or the host view (when it is called while implementing a verb). There are currently 2 flag parameters in mlx4_MAD_IFC already: ignore_bkey and ignore_mkey. These two parameters are replaced by a single "mad_ifc_flags" parameter, with different bits set for each flag. A third flag is added: "network-view/host-view". Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Jack Morgenstein 提交于
Allocate SR-IOV paravirtualization resources and MAD demuxing contexts on the master. This has two parts. The first part is to initialize the structures to contain the contexts. This is done at master startup time in mlx4_ib_init_sriov(). The second part is to actually create the tunneling resources required on the master to support a slave. This is performed the master detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event generated when a slave initializes its comm channel). For the master, there is no such startup event, so it creates its own tunneling resources when it starts up. In addition, the master also creates the real special QPs. The ib_core layer on the master causes creation of proxy special QPs, since the master is also paravirtualized at the ib_core layer. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Jack Morgenstein 提交于
1. Introduce the basic SR-IOV parvirtualization context objects for multiplexing and demultiplexing MADs. 2. Introduce support for the new proxy and tunnel QP types. This patch introduces the objects required by the master for managing QP paravirtualization for guests. struct mlx4_ib_sriov is created by the master only. It is a container for the following: 1. All the info required by the PPF to multiplex and de-multiplex MADs (including those from the PF). (struct mlx4_ib_demux_ctx demux) 2. All the info required to manage alias GUIDs (i.e., the GUID at index 0 that each guest perceives. In fact, this is not the GUID which is actually at index 0, but is, in fact, the GUID which is at index[<VF number>] in the physical table. 3. structures which are used to manage CM paravirtualization 4. structures for managing the real special QPs when running in SR-IOV mode. The real SQPs are controlled by the PPF in this case. All SQPs created and controlled by the ib core layer are proxy SQP. struct mlx4_ib_demux_ctx contains the information per port needed to manage paravirtualization: 1. All multicast paravirt info 2. All tunnel-qp paravirt info for the port. 3. GUID-table and GUID-prefix for the port 4. work queues. struct mlx4_ib_demux_pv_ctx contains all the info for managing the paravirtualized QPs for one slave/port. struct mlx4_ib_demux_pv_qp contains the info need to run an individual QP (either tunnel qp or real SQP). Note: We made use of the 2 most significant bits in enum mlx4_ib_qp_flags (based on enum ib_qp_create_flags in ib_verbs.h). We need these bits in the low-level driver for internal purposes. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 11 7月, 2012 1 次提交
-
-
由 Jack Morgenstein 提交于
The port management change event can replace smp_snoop. If the capability bit for this event is set in dev-caps, the event is used (by the driver setting the PORT_MNG_CHG_EVENT bit in the async event mask in the MAP_EQ fw command). In this case, when the driver passes incoming SMP PORT_INFO SET mads to the FW, the FW generates port management change events to signal any changes to the driver. If the FW generates these events, smp_snoop shouldn't be invoked in ib_process_mad(), or duplicate events will occur (once from the FW-generated event, and once from smp_snoop). In the case where the FW does not generate port management change events smp_snoop needs to be invoked to create these events. The flow in smp_snoop has been modified to make use of the same procedures as in the fw-generated-event event case to generate the port management events (LID change, Client-rereg, Pkey change, and/or GID change). Port management change event handling required changing the mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument (last argument) had to be changed to unsigned long in order to accomodate passing the EQE pointer. We also needed to move the definition of struct mlx4_eqe from net/mlx4.h to file device.h -- to make it available to the IB driver, to handle port management change events. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 09 7月, 2012 1 次提交
-
-
由 Jack Morgenstein 提交于
Define pr_fmt and add some pr_debug prints. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 08 7月, 2012 1 次提交
-
-
由 Hadar Hen Zion 提交于
The driver is modified to support three operation modes. If supported by firmware use the device managed flow steering API, that which we call device managed steering mode. Else, if the firmware supports the B0 steering mode use it, and finally, if none of the above, use the A0 steering mode. When the steering mode is device managed, the code is modified such that L2 based rules set by the mlx4_en driver for Ethernet unicast and multicast, and the IB stack multicast attach calls done through the mlx4_ib driver are all routed to use the device managed API. When attaching rule using device managed flow steering API, the firmware returns a 64 bit registration id, which is to be provided during detach. Currently the firmware is always programmed during HCA initialization to use standard L2 hashing. Future work should be done to allow configuring the flow-steering hash function with common, non proprietary means. Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 6月, 2012 1 次提交
-
-
由 Sagi Grimberg 提交于
1. Limit the max number of WQEs per QP reported when querying the device, so that ib_create_qp() will not fail for a QP size that the device claimed to support due to additional headroom WQEs being allocated. 2. Limit qp resources accepted for ib_create_qp() to the limits reported in ib_query_device(). In kernel space, make sure that the limits returned to the caller following qp creation also lie within the reported device limits. For userspace, report as before, and do adjustment in libmlx4 (so as not to break ABI). Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NSagi Grimberg <sagig@mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 19 5月, 2012 1 次提交
-
-
由 Shlomo Pongratz 提交于
Enable IB ULPs to use a larger portion of the device EQs (which map to IRQs). The mlx4_ib driver follows the mlx4_core framework of the EQs to be divided among the device ports. In this scheme, for each IB port, the number of allocated EQs follows the number of cores, subject to other system constraints, such as number available MSI-X vectors. Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 14 10月, 2011 2 次提交
-
-
由 Sean Hefty 提交于
Support the creation of XRC INI and TGT QPs. To handle the case where a CQ or PD is not provided, we allocate them internally with the xrcd. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Support creating and destroying XRC domains. Any sharing of the XRCD is managed above the low-level driver. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 19 7月, 2011 1 次提交
-
-
由 Or Gerlitz 提交于
Allocate flow counter per Ethernet/IBoE port, and attach this counter to all the QPs created on that port. Based on patch by Eli Cohen <eli@mellanox.co.il>. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 26 10月, 2010 1 次提交
-
-
由 Eli Cohen 提交于
Add support for IBoE to mlx4_ib. The bulk of the code is handling the new address vector fields; mlx4 needs the MAC address of a remote node to include it in a WQE (for datagrams) or in the QP context (for connected QPs). Address resolution is done by assuming all unicast GIDs are either link-local IPv6 addresses. Multicast group attach/detach needs to update the NIC's multicast filters; but since attaching a QP to a multicast group can be done before the QP is bound to a port, for IBoE we need to keep track of all multicast groups that a QP is attached too before it transitions from INIT to RTR (since it does not have a port in the INIT state). Signed-off-by: NEli Cohen <eli@mellanox.co.il> [ Many things cleaned up and otherwise monkeyed with; hope I didn't introduce too many bugs. - Roland ] Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 06 9月, 2009 1 次提交
-
-
由 Jack Morgenstein 提交于
Userspace apps are supposed to release all ib device resources if they receive a fatal async event (IBV_EVENT_DEVICE_FATAL). However, the app has no way of knowing when the device has come back up, except to repeatedly attempt ibv_open_device() until it succeeds. However, currently there is no protection against the open succeeding while the device is in being removed following the fatal event. In this case, the open will succeed, but as a result the device waits in the middle of its removal until the new app releases its resources -- and the new app will not do so, since the open succeeded at a point following the fatal event generation. This patch adds an "active" flag to the device. The active flag is set to false (in the fatal event flow) before the "fatal" event is generated, so any subsequent ibv_dev_open() call to the device will fail until the device comes back up, thus preventing the above deadlock. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 08 5月, 2009 1 次提交
-
-
由 Jack Morgenstein 提交于
The low-level mlx4 driver modified the page-list addresses for fast register work requests post send to big-endian, and set a "present" bit. This caused problems later when the consumer attempted to unmap the pages using the page-list (using the list addresses which were assumed to be still in CPU-endian order). Fix the mlx4 driver to allocate two buffers and use a private buffer for the hardware-format bus addresses. This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>, an NFS/RDMA server crash. The cause of the crash was found by Vu Pham of Mellanox. The fix is along the lines suggested by Steve Wise in comment #21 in bug 1571. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-