- 27 Nov 2020, 11 commits
-
-
Committed by Parav Pandit

When the eswitch manager is running on the ECPF, the host PF should be treated as a non-eswitch-manager port, similar to other VF vports. Failing to do so results in firmware treating the PF's vport as the ECPF vport for eswitch ACL tables. A non-zero check to figure out whether a given vport is an "other" vport is not sufficient, because the PF vport number is 0 on the ECPF. Hence, create esw ACL tables with an explicit other-vport attribute.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
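A minimal sketch of the idea, with hypothetical names (the driver's actual table-creation attributes differ):

```c
/* Hypothetical attribute struct: carry "other vport" explicitly
 * instead of inferring it from vport_num != 0, which breaks on the
 * ECPF, where the host PF's vport number is 0. */
struct esw_acl_table_attr {
	u16  vport_num;    /* 0 for the host PF when running on ECPF */
	bool other_vport;  /* set explicitly, never derived from vport_num */
};
```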
-
Committed by Parav Pandit

To match the hardware spec, rename peer_pf to host_pf.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Parav Pandit

A subsequent patch implements a helper API that takes mlx5_core_dev as a const pointer; make its caller API const as well.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Yishai Hadas

Expose other-function ifc bits to enable setting HCA caps on behalf of another function. In addition, expose the vhca_resource_manager bit to control whether the other-function functionality is supported by firmware.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Aya Levin

Expose the FW indication that it supports stateless offloads for IP-over-IP tunneled packets per direction. In some HW, like ConnectX-4, IP-in-IP support is not symmetric: it supports steering on the inner header, but not TX checksum and TSO. Add the IP-in-IP capability per direction to cover this case as well. Note: the global tunnel_stateless_ip_over_ip is on only if both per-direction indications are on.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
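A minimal sketch of combining the two per-direction bits, assuming the _rx/_tx capability field names added by this patch and the standard MLX5_CAP_ETH accessor:

```c
#include <linux/mlx5/driver.h>

/* Report global IP-over-IP stateless offload only when both the RX
 * and TX per-direction capability bits are set. */
static bool tunnel_stateless_ipip_both_dirs(struct mlx5_core_dev *mdev)
{
	return MLX5_CAP_ETH(mdev, tunnel_stateless_ip_over_ip_rx) &&
	       MLX5_CAP_ETH(mdev, tunnel_stateless_ip_over_ip_tx);
}
```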
-
Committed by Parav Pandit

Update the hardware interface definitions to query and modify vhca state, and the related EQE and event code.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Parav Pandit

The mlx5 command init and cleanup routines are internal to the mlx5_core driver. Hence, avoid exporting them and move their declarations to the driver's internal file mlx5_core.h.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Eran Ben Elisha

Add a bit in the HCA capabilities layout to indicate whether ts_cqe_to_dest_cqn is supported. In addition, add a ts_cqe_to_dest_cqn field to the SQ context, for the driver to set the actual CQN.

Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Muhammad Sammar

Add misc4 match params to enable matching on prog_sample_fields.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Chris Mi

The flow sampler object is a new destination type. Add a new member for the flow destination.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Chris Mi

Hardware introduces the flow sampler object for packet sampling. Add the offload hardware bits and structures.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
- 31 Oct 2020, 1 commit
-
-
Committed by Gustavo A. R. Silva

There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
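A short userspace illustration of the pattern this series applies (kernel code itself would allocate with kmalloc() and the struct_size() helper):

```c
#include <stdlib.h>

/* Old, deprecated style: zero-length (or one-element) trailing array. */
struct old_msg {
	size_t len;
	char data[0];	/* ambiguous size; defeats compiler bounds checks */
};

/* Preferred C99 style: flexible array member, must be the last member. */
struct msg {
	size_t len;
	char data[];	/* contributes nothing to sizeof(struct msg) */
};

static struct msg *msg_alloc(size_t n)
{
	struct msg *m = malloc(sizeof(*m) + n); /* header + payload */

	if (m)
		m->len = n;
	return m;
}
```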
-
- 27 Oct 2020, 1 commit
-
-
Committed by Parav Pandit

When an mlx5 core devlink instance is reloaded in a different net namespace, its associated IB device is deleted and recreated. An example sequence is:

$ ip netns add foo
$ devlink dev reload pci/0000:00:08.0 netns foo
$ ip netns del foo

The mlx5 IB device needs to attach and detach the netdevice to it through the netdev notifier chain during the load and unload sequence. Below is a call graph of the unload flow:

cleanup_net()
  down_read(&pernet_ops_rwsem); <- first sem acquired
  ops_pre_exit_list()
    pre_exit()
      devlink_pernet_pre_exit()
        devlink_reload()
          mlx5_devlink_reload_down()
            mlx5_unload_one()
            [...]
            mlx5_ib_remove()
              mlx5_ib_unbind_slave_port()
                mlx5_remove_netdev_notifier()
                  unregister_netdevice_notifier()
                    down_write(&pernet_ops_rwsem); <- recursive lock

Hence, when the net namespace is deleted, mlx5 reload results in a deadlock. When the deadlock occurs, the devlink mutex is also held. This not only deadlocks the mlx5 device under reload, but also all processes which attempt to access unrelated devlink devices. Hence, fix this by having the mlx5 ib driver register for a per-net netdev notifier instead of the global one, which operates on the net namespace without holding the pernet_ops_rwsem.

Fixes: 4383cfcc ("net/mlx5: Add devlink reload")
Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
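A minimal sketch of the per-net registration pattern the fix relies on; the context struct and callback are illustrative, but register_netdevice_notifier_net()/unregister_netdevice_notifier_net() are the kernel's per-namespace notifier APIs:

```c
#include <linux/netdevice.h>

struct my_port {			/* illustrative driver context */
	struct notifier_block nb;
	struct net *net;		/* namespace the device lives in */
};

static int my_netdev_event(struct notifier_block *nb,
			   unsigned long event, void *ptr)
{
	struct net_device *ndev = netdev_notifier_info_to_dev(ptr);

	/* handle NETDEV_REGISTER / NETDEV_UNREGISTER for ndev ... */
	return NOTIFY_DONE;
}

static int my_port_init(struct my_port *port)
{
	port->nb.notifier_call = my_netdev_event;
	/* Per-net registration does not write-lock pernet_ops_rwsem,
	 * so devlink reload in a dying namespace cannot deadlock here. */
	return register_netdevice_notifier_net(port->net, &port->nb);
}

static void my_port_cleanup(struct my_port *port)
{
	unregister_netdevice_notifier_net(port->net, &port->nb);
}
```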
-
- 13 Oct 2020, 2 commits
-
-
Committed by Huy Nguyen

Add a new FTE in the TX IPsec FT per IPsec state. It has the same matching criteria as the RX steering rule. The IPsec FT is created/destroyed when the first/last rule is added/deleted, respectively.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Huy Nguyen

Add a new namespace that represents the NIC TX domain.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
- 10 Oct 2020, 2 commits
-
-
Committed by Moshe Shemesh

The firmware live patch event notifies the driver that the firmware was just updated using a live patch. In such a case the driver should not reload or re-initiate entities, apart from updating the firmware version and re-initiating the firmware tracer, which live patch can update with a new strings database to help debug issues.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Moshe Shemesh

Once the driver gets sync_reset_request from firmware, it prepares for the coming reset and sends an acknowledgment. After getting this event the driver expects a device reset: either it will trigger a PCI reset on the sync_reset_now event, or such a PCI reset will be triggered by another PF of the same device. So it moves to reset-requested mode, and if it detects a PCI reset triggered by the other PF, it reloads.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 03 Oct 2020, 2 commits
-
-
Committed by Saeed Mahameed

In case the PCI is offline, reclaim_pages_cmd() will still try to call the FW to release FW pages; cmd_exec() in this case will return a silent success without actually calling the FW. This is wrong and will cause page leaks. What we should do is detect PCI offline or command interface unavailability before trying to access the FW, and manually release the FW pages in the driver. In this patch we share the code that checks for FW command interface availability and call it in sensitive places, e.g. reclaim_pages_cmd().

Alternative fix:
1. Remove MLX5_CMD_OP_MANAGE_PAGES from mlx5_internal_err_ret_value, the command-success simulation list.
2. Always release FW pages even if cmd_exec fails in reclaim_pages_cmd().

Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
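A hedged sketch of the kind of availability gate described above; the helper name and error-state flag are illustrative, not the driver's actual ones:

```c
#include <linux/pci.h>

/* Bail out before building a FW command if the device cannot be
 * reached; the caller then releases the pages in software instead
 * of trusting a silently simulated "success". */
static bool fw_cmd_available(struct pci_dev *pdev, bool internal_error)
{
	return !pci_channel_offline(pdev) && !internal_error;
}
```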
-
Committed by Eran Ben Elisha

Upon command completion timeout, the driver simulates a forced command completion. In a rare case where the real interrupt for that command arrives simultaneously, it might release the command entry while the forced handler might still access it. Fix that by adding an entry refcount, to track the current number of allowed handlers; the command entry is released only when this refcount is decremented to zero. The command refcount is always initialized to one. For callback commands, the command completion handler is the symmetric flow to decrement it; for non-callback commands, it is wait_func(). Before ringing the doorbell, increment the refcount for the real completion handler; once the real completion handler is called, it will decrement it. For callback commands, once the delayed work is scheduled, increment the refcount. Upon the callback command completion handler, we try to cancel the timeout callback; in case of success, we need to decrement the callback refcount as it will never run. In addition, gather the entry-index free and the entry free into one flow for all command types' release.

Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
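A minimal sketch of the refcount scheme described above, using the kernel's refcount_t; the struct and helper names are illustrative:

```c
#include <linux/refcount.h>
#include <linux/slab.h>

struct cmd_entry {
	refcount_t refcnt;	/* base ref + one per in-flight handler */
	/* ... command state ... */
};

static struct cmd_entry *cmd_entry_alloc(void)
{
	struct cmd_entry *ent = kzalloc(sizeof(*ent), GFP_KERNEL);

	if (ent)
		refcount_set(&ent->refcnt, 1);	/* base reference */
	return ent;
}

/* Taken before arming each handler (real completion, forced
 * completion on timeout, callback timeout work). */
static void cmd_entry_get(struct cmd_entry *ent)
{
	refcount_inc(&ent->refcnt);
}

/* Whichever handler runs last frees the entry, so a forced completion
 * can no longer free it out from under the real interrupt. */
static void cmd_entry_put(struct cmd_entry *ent)
{
	if (refcount_dec_and_test(&ent->refcnt))
		kfree(ent);
}
```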
-
- 01 Oct 2020, 1 commit
-
-
Committed by sunils

Currently only 256 vports can be supported, as only 8 bits are reserved for them and 8 bits are reserved for vhca_ids in metadata reg c0. To support more than 256 vports, replace the vhca_id with a unique, shorter 4-bit PF number, which covers up to 16 PFs. Use the remaining 12 bits for vports, ranging 1-4095. This will continue to generate unique metadata even if multiple PCI devices have the same switch_id.

Signed-off-by: sunils <sunils@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
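A small sketch of the resulting 16-bit layout; the macro and helper names are made up for illustration:

```c
#include <linux/types.h>

#define MD_PF_BITS	4			/* up to 16 PFs */
#define MD_VPORT_BITS	12			/* vports 1..4095 */
#define MD_VPORT_MASK	((1u << MD_VPORT_BITS) - 1)

/* metadata reg c0 (low 16 bits): [15:12] PF number, [11:0] vport */
static inline u16 md_match_value(u8 pf_num, u16 vport)
{
	return ((u16)(pf_num & ((1u << MD_PF_BITS) - 1)) << MD_VPORT_BITS) |
	       (vport & MD_VPORT_MASK);
}
```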
-
- 18 Sep 2020, 3 commits
-
-
Committed by Alex Vesker

Added sw_owner_v2, which will be enabled for future devices, replacing the sw_owner bit.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
Committed by Aharon Landau

Combine two identical enums to avoid duplication.

Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
Committed by Aharon Landau

The functions mlx5_query_port_link_width_oper and mlx5_query_port_ib_proto_oper are always called together, so combine them into a new function called mlx5_query_port_oper to avoid duplication. And since mlx5i_get_port_settings is the same as mlx5_query_port_oper, remove it. According to the IB spec, link_width_oper and ib_proto_oper should be u16 and not, as written, u8, so perform casting as a preparation for a cross-RDMA patch which will fix that type for all drivers in the RDMA subsystem.

Fixes: ada68c31 ("net/mlx5: Introduce a new header file for physical port functions")
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
- 16 Sep 2020, 2 commits
-
-
Committed by Ofer Levi

Add CQE compression support for completions of packets that span multiple strides in a Striding RQ, per the HW capability. In our memory model, we use small strides (256B as of today) for the non-linear SKB mode. This feature allows CQE compression to work also for multi-stride packets; in this case, decompressing the mini-CQE array will use the stride index provided by HW as part of the mini CQE. Before this feature, compression was possible only for single-strided packets, i.e. for packets of size up to 256 bytes when in non-linear mode, and the index was maintained by SW. This feature is supported for ConnectX-5 and above.

Feature performance test: this was whitebox-tested; we reduced the PCI speed from 125Gb/s to 62.5Gb/s to overload the PCI bus, and manipulated the mlx5 driver to drop incoming packets before building the SKB, to achieve low CPU utilization. The outcome is low CPU utilization and a bottleneck on PCI only.

Test setup:
Server: Intel(R) Xeon(R) Silver 4108 CPU @ 1.80GHz, 32 cores
NIC: ConnectX-6 DX

The sender side generates 300-byte packets at full PCI bandwidth. Receiver side configuration: single channel, one CPU processing with one ring allocated. CPU utilization is ~20% while PCI bandwidth is fully utilized. For the generated traffic and an interface MTU of 4500B (to activate the non-linear SKB mode), the packet rate improvement is about 19%, from ~17.6Mpps to ~21Mpps. Without this feature, counters show no CQE compression blocks for this setup, while with the feature, counters show ~20.7Mpps of compressed CQEs in ~500K compression blocks.

Signed-off-by: Ofer Levi <oferle@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Committed by Eran Ben Elisha

The clock struct is part of struct mlx5_core_dev. The code was inconsistent: in some cases it used container_of and in others it used clock->mdev. Align the code to use container_of and remove the clock->mdev pointer. While here, fix reverse-xmas-tree coding style.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
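A minimal sketch of the container_of pattern, with simplified stand-in struct names:

```c
#include <linux/kernel.h>

struct clock_state {		/* simplified stand-in for the clock */
	int ptp_index;
};

struct core_dev {		/* simplified stand-in for mlx5_core_dev */
	struct clock_state clock;
};

/* Recover the containing device from the embedded clock member,
 * making a clock->mdev back-pointer unnecessary. */
static inline struct core_dev *clock_to_dev(struct clock_state *clock)
{
	return container_of(clock, struct core_dev, clock);
}
```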
-
- 27 Aug 2020, 1 commit
-
-
Committed by Mark Zhang

When DCT QPs work in RoCE LAG mode:
1. DCT creation is allowed only when it is supported.
2. The "port" of a DCT QP is assigned in a round-robin way.

Link: https://lore.kernel.org/r/20200818115245.700581-3-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
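A minimal sketch of round-robin port assignment across a LAG; the counter and helper names are hypothetical:

```c
#include <linux/atomic.h>
#include <linux/types.h>

static atomic_t dct_rr = ATOMIC_INIT(-1);

/* Spread DCT QPs evenly across the LAG's ports (ports are 1-based);
 * the unsigned cast keeps the modulo well-defined on wraparound. */
static u8 dct_pick_port(u8 num_lag_ports)
{
	return ((u32)atomic_inc_return(&dct_rr) % num_lag_ports) + 1;
}
```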
-
- 29 Jul 2020, 1 commit
-
-
Committed by Ron Diskin

When setting the PF interface up/down, notify the firmware to update the uplink state via MODIFY_VPORT_STATE when the E-Switch is enabled. This behavior will prevent sending traffic out on the uplink port when the PF is down, such as traffic sent from a VF interface which is still up. Currently, when calling mlx5e_open/close(), the driver only sends a PAOS command to notify the firmware to set the physical port state to up/down; however, this is not sufficient. When a VF is in the "auto" state, it follows the uplink state, which was not updated on mlx5e_open/close() before this patch. When switchdev mode is enabled and the uplink representor is first enabled, set the uplink port state value back to its FW default, "AUTO".

Fixes: 63bfd399 ("net/mlx5e: Send PAOS command on interface up/down")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
- 28 Jul 2020, 1 commit
-
-
Committed by Eran Ben Elisha

Per page-request event, FW requests to allocate or release pages for a single function. The driver maintains a FW pages object per function, so there is no need to hold one global page database. Instead, have a page database per function, which will improve the release flow's performance in all cases, especially for "release all pages". As the range of function IDs is large and not sequential, use an xarray to store a per-function-ID page database, where the function ID is the key. Upon the first allocation of a page to a function ID, create the per-function page database; this database will be released only at pagealloc mechanism cleanup.

NIC: ConnectX-4 Lx
CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
Test case: 32 VFs, measure release pages on one VF as part of FLR
Before: 0.021 sec
After: 0.014 sec

The improvement depends on the number of VFs and their memory utilization. The time measurements above were taken from an idle system.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
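A minimal sketch of a per-function-ID page database keyed through an xarray; the names are illustrative, and serialization is assumed to be handled by the caller, as the real driver holds a lock here:

```c
#include <linux/xarray.h>
#include <linux/rbtree.h>
#include <linux/slab.h>

struct func_pages {
	struct rb_root pages;		/* pages owned by this function */
};

static DEFINE_XARRAY(page_db);		/* function ID -> struct func_pages */

/* Look up the function's database, creating it on first allocation. */
static struct func_pages *get_func_pages(u32 func_id)
{
	struct func_pages *fp = xa_load(&page_db, func_id);

	if (fp)
		return fp;
	fp = kzalloc(sizeof(*fp), GFP_KERNEL);
	if (!fp)
		return NULL;
	fp->pages = RB_ROOT;
	if (xa_err(xa_store(&page_db, func_id, fp, GFP_KERNEL))) {
		kfree(fp);
		return NULL;
	}
	return fp;
}
```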
-
- 27 Jul 2020, 2 commits
-
-
Committed by Meir Lichtinger

Up to ConnectX-7, UMR is not used when the user passes the relaxed ordering access flag. ConnectX-7 supports setting the relaxed ordering read/write mkey attributes by UMR, indicated by new HCA capabilities. With ConnectX-7 the driver uses UMR when the user sets the relaxed ordering access flag, in contrast to previous silicon models. Specifically, this includes setting the relevant flags of the mkey context mask in the UMR control segment, and the relaxed ordering write and read flags in the UMR mkey context segment.

Link: https://lore.kernel.org/r/20200716105248.1423452-4-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Committed by Meir Lichtinger

Use the generic mlx5 structure defined in mlx5_ifc.h to represent ConnectX device data structures, instead of using a structure defined specifically for the mlx5_ib module.

Link: https://lore.kernel.org/r/20200716105248.1423452-3-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
- 25 Jul 2020, 1 commit
-
-
Committed by Meir Lichtinger

Up to ConnectX-7, setting mkey relaxed ordering read/write attributes by UMR is not supported. ConnectX-7 supports this option, which is indicated by two new HCA capabilities: relaxed_ordering_write_umr and relaxed_ordering_read_umr.

Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
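A minimal sketch of gating UMR relaxed-ordering updates on these two bits, assuming the standard MLX5_CAP_GEN accessor and the capability field names named above:

```c
#include <linux/mlx5/driver.h>

/* UMR may modify both relaxed-ordering attributes only when the
 * firmware advertises both of the new capabilities. */
static bool umr_modifies_relaxed_ordering(struct mlx5_core_dev *mdev)
{
	return MLX5_CAP_GEN(mdev, relaxed_ordering_write_umr) &&
	       MLX5_CAP_GEN(mdev, relaxed_ordering_read_umr);
}
```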
-
- 17 Jul 2020, 2 commits
-
-
Committed by Huy Nguyen

- Add FTE actions IPsec ENCRYPT/DECRYPT.
- Add an ipsec_obj_id field in the FTE.
- Add the new action field MLX5_ACTION_IN_FIELD_IPSEC_SYNDROME.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Committed by Raed Salem

This sets the base for downstream patches to support the new IPsec implementation of the ConnectX family. The following modifications are made:
- Remove the accel layer dependency from MLX5_FPGA_IPSEC.
- Introduce accel_ipsec_ops; each IPsec device will have to support these ops.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
- 16 Jul 2020, 3 commits
-
-
Committed by Eli Cohen

Rename mlx5_ifc_device_virtio_emulation_cap_bits to mlx5_ifc_virtio_emulation_cap_bits to match the names produced by the tools generating these auto-generated files. In addition, add missing capabilities that will be required by the VDPA implementation.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Committed by Eli Cohen

VDPA is a new interface that will be added in subsequent patches. It uses mlx5 core devices and resources. Add an interface type for it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Committed by Eli Cohen

mlx5_fill_page_frag_array() is used to populate DMA addresses for resources that require it, such as QPs, RQs, etc. When the resource is used, PA-list permissions are ignored. For resources that use an MTT list, the user is required to provide the access rights. Subsequent patches use resources that require MTT lists, so modify the API and implementation to support that.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
- 10 Jul 2020, 1 commit
-
-
Committed by Eran Ben Elisha

The device unit for port buffer sizes, xoff_threshold and xon_threshold is cells. Fix a bug in the driver where the cell unit size was hard-coded to 128 bytes. This hard-coded value is buggy, as it is wrong for some hardware versions. The driver should read the cell size from the SBCAM register and translate bytes to cell units accordingly. In order to fix the bug, this patch exposes the SBCAM (Shared buffer capabilities mask) layout and defines. If SBCAM.cap_cell_size is valid, use it for all bytes-to-cells calculations; if not, fall back to 128. Cell size does not change on the fly per device, so instead of issuing an SBCAM access-reg command every time such a translation is needed, cache it in mlx5e_dcbx as part of mlx5e_dcbnl_initialize(), and pass dcbx.port_buff_cell_sz as a param to every function that needs bytes-to-cells translation. While fixing the bug, move the MLX5E_BUFFER_CELL_SHIFT macro to en_dcbnl.c, as it is only used by that file.

Fixes: 0696d608 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
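A minimal sketch of the bytes-to-cells translation with the cached cell size; the 128-byte fallback comes from the text above, and the names are illustrative:

```c
#include <linux/kernel.h>
#include <linux/types.h>

#define FALLBACK_CELL_SZ 128	/* used when SBCAM.cap_cell_size is invalid */

/* Round up so a buffer always spans enough whole cells. */
static u32 bytes_to_cells(u32 bytes, u16 port_buff_cell_sz)
{
	return DIV_ROUND_UP(bytes, port_buff_cell_sz);
}

static u32 cells_to_bytes(u32 cells, u16 port_buff_cell_sz)
{
	return cells * port_buff_cell_sz;
}
```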
-
- 09 Jul 2020, 1 commit
-
-
Committed by Meir Lichtinger

This patch exposes new link modes using 100Gbps per lane, including 100G, 200G and 400G modes.

Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 03 Jul 2020, 1 commit
-
-
Committed by Michael Guralnik

If, in the process of creating the underlay QP for an IPoIB interface, the user has set the address, and specifically the 1st-3rd bytes representing the QP number, use the requested QP number when creating the underlay QP. For a user to be able to request a QP number on QP creation, the MKEY_BY_NAME NVCONFIG should be set, as mkey_by_name and qp_by_name are coupled in FW. This requires the driver to query the mkey_by_name max cap during initialization and set the current cap if it was enabled in FW.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
- 28 Jun 2020, 1 commit
-
-
Committed by Tariq Toukan

Add explicit WQE segment structures for the TLS static and progress params. According to the HW spec, TISN is not part of the progress params context, so take it out of it. Rename the control segment's tisn field, as it can hold either a TIS or a TIR number.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-