- 02 7月, 2019 9 次提交
-
-
由 Petr Machata 提交于
On Spectrum-1, timestamps are delivered separately from the packets, and need to paired up. Therefore, at some point after mlxsw_sp_port_xmit() is invoked, it is necessary to involve the chip-specific driver code to allow it to do the necessary bookkeeping and matching. On Spectrum-2, timestamps are delivered in CQE. For that reason, position the point of driver involvement into mlxsw_pci_cqe_sdq_handle() to make it hopefully easier to extend for Spectrum-2 in the future. To tell the driver what port the packet was sent on, keep tx_info in SKB control buffer. Introduce a new driver core interface mlxsw_core_ptp_transmitted(), a driver callback ptp_transmitted, and a PTP op transmitted. The callee is responsible for taking care of releasing the SKB passed to the new interfaces, and correspondingly have the new stub callbacks just call dev_kfree_skb_any(). Follow-up patches will introduce the actual content into mlxsw_sp1_ptp_transmitted() in particular. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
The SKB control buffer is useful (and used) for bookkeeping of information related to that SKB. Add helpers so that the mlxsw driver(s) can safely use the buffer as well. The structure is currently empty, individual users will add members to it as necessary. Note that SKB allocation functions already clear the buffer, so the cleanup is only necessary when ndo_start_xmit is called. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
When configured, the Spectrum hardware can recognize PTP packets and trap them to the CPU using dedicated traps, PTP0 and PTP1. One reason to get PTP packets under dedicated traps is to have a separate policer suitable for the amount of PTP traffic expected when switch is operated as a boundary clock. For this, add two new trap groups, MLXSW_REG_HTGT_TRAP_GROUP_SP_PTP0 and _PTP1, and associate the two PTP traps with these two groups. In the driver, specifically for Spectrum-1, event PTP packets will need to be paired up with their timestamps. Those arrive through a different set of traps, added later in the patch set. To support this future use, introduce a new PTP op, ptp_receive. It is possible to configure which PTP messages should be trapped under which PTP trap. On Spectrum systems, we will use PTP0 for event packets (which need timestamping), and PTP1 for control packets (which do not). Thus configure PTP0 trap with a custom callback that defers to the ptp_receive op. Additionally, L2 PTP packets are actually trapped through the LLDP trap, not through any of the PTP traps. So treat the LLDP trap the same way as the PTP0 trap. Unlike PTP traps, which are currently still disabled, LLDP trap is active. Correspondingly, have all the implementations of the ptp_receive op return true, which the handler treats as a signal to forward the packet immediately. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
On Spectrum-1, timestamps for PTP packets are delivered through queues of ingress and egress timestamps. There are two event traps corresponding to activity on each of those queues. This mechanism is absent on Spectrum-2, and therefore the traps should only be registered on Spectrum-1. Carry a chip-specific listener array in mlxsw_sp->listeners and listeners_count. Register listeners from that array in mlxsw_sp_traps_init(). Add a new listener array for Spectrum-1 traps and configure the newly-added mlxsw_sp->listeners with this array. The listener array is empty for now, the events will be added in a later patch. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
On Spectrum-1, timestamps for PTP packets are delivered through queues of ingress and egress timestamps. There are two event traps corresponding to activity on each of those queues. This mechanism is absent on Spectrum-2, and therefore the traps should only be registered on Spectrum-1. Extract out of mlxsw_sp_traps_init() a generic helper, mlxsw_sp_traps_register(), and likewise with _unregister(). The new helpers will later be called with Spectrum-1-specific traps. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
This register serves to configure global parameters of certain monitoring operations. The following patches will use it to configure that when PTP timestamps are delivered through the PTP FIFO traps, the FIFO in question is cleared as well. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
The MTPPTR is used for reading the per port PTP timestamp FIFO. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
This register is used for configuring under which trap to deliver PTP packets depending on type of the packet. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
This register serves for configuration of which PTP messages should be timestamped. This is a global configuration, despite the register name. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 6月, 2019 10 次提交
-
-
由 Paul Blakey 提交于
After changing the parent_id to be the same for both NICs of same the hardware device, netdev_port_same_parent_id now returns true for more cases (all the lower devices in the hierarchy are on the same hardware device). If merged eswitch isn't enabled, these cases aren't supported, so disallow them. Signed-off-by: NPaul Blakey <paulb@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Paul Blakey 提交于
Report system_image_guid as the E-Switch switch_id, this ensures that when a NIC contains multiple PCI functions and which has merged eswitch capability, all representors from multiple PFs publish same switch_id. Signed-off-by: NPaul Blakey <paulb@mellanox.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Gavi Teitz 提交于
Refreshing TIRs is done in order to update the TIRs with the current state of SQs in the transport domain, so that the TIRs can filter out undesired self-loopback packets based on the source SQ of the packet. Representor TIRs will only receive packets that originate from their associated vport, due to dedicated steering, and therefore will never receive self-loopback packets, whose source vport will be the vport of the E-Switch manager, and therefore not the vport associated with the representor. As such, it is not necessary to refresh the representors' TIRs, since self-loopback packets can't reach them. Since representors only exist in switchdev mode, and there is no scenario in which a representor will exist in the transport domain alongside a non-representor, it is not necessary to refresh the transport domain's TIRs upon changing the state of a representor's queues. Therefore, do not refresh TIRs upon such a change. Achieve this by adding an update_rx callback to the mlx5e_profile, which refreshes TIRs for non-representors and does nothing for representors, and replace instances of mlx5e_refresh_tirs() upon changing the state of the queues with update_rx(). Signed-off-by: NGavi Teitz <gavi@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NTariq Toukan <tariqt@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Arnd Bergmann 提交于
Putting an empty 'mlx5_flow_spec' structure on the stack is a bit wasteful and causes a warning on 32-bit architectures when building with clang -fsanitize-coverage: drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c: In function 'mlx5_eswitch_termtbl_create': drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c:90:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Since the structure is never written to, we can statically allocate it to avoid the stack usage. To be on the safe side, mark all subsequent function arguments that we pass it into as 'const' as well. Fixes: 10caabda ("net/mlx5e: Use termination table for VLAN push actions") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NSaeed Mahameed <saeedm@mellanox.com> Acked-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Parav Pandit 提交于
Consider PCI and non PCI device types while setting device name in get_drvinfo() callback using existing generic device. Signed-off-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NVu Pham <vuhuong@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Parav Pandit 提交于
Currently PF phys_port_name is named as pfNvf-1 as vport number for PF vport is 65535. Correct PF's phys_port name as agreed upon name as pfN. Signed-off-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NVu Pham <vuhuong@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Ariel Levkovich 提交于
Set supported device features in the netdevice MPLS features mask. This will enable HW checksumming and TSO for MPLS tagged traffic. Signed-off-by: NAriel Levkovich <lariel@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Ariel Levkovich 提交于
This patch changes the way the driver advertises its checksum offload capabilities within the net device features bit mask. Instead of advertising protocol specific checksumming capabilities which are limited today to IPv4 and IPv6, we move to reporing generic HW checksumming capabilities. This will allow the network stack to let mlx5 device offload checksum for cases where the IP header is encapsulated within another protocol and the skb->protocol doesn't indicate one of the IP versions protocol, specifically in the case of MPLS label encapsulating the IP header and the skb->protocol indiciates MPLS ethertype rather than IP. Moving the HW_CSUM reporting is required in the basic net device hw features mask and also in the extensions (vlan and encpasulation features) since the extensions are always multiplied by the basic features set during the packet's traversal through the stack's tx flow. Signed-off-by: NAriel Levkovich <lariel@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Gavi Teitz 提交于
Remove the limitation preventing adding a vport's MAC address to the Multi-Physical Function Switch (MPFS) more than once per E-switch, as there is no difference in the MPFS if an address is being used by an E-switch more than once. This allows the E-switch to have multiple vports with the same MAC address, allowing vports to be classified by VLAN id instead of by MAC if desired. Signed-off-by: NGavi Teitz <gavi@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Gavi Teitz 提交于
Unify and isolate the error handling flow in mlx5_mpfs_add_mac(), removing code duplication. Signed-off-by: NGavi Teitz <gavi@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
- 27 6月, 2019 11 次提交
-
-
由 Jianbo Liu 提交于
As the ingress ACL rules save vhca id and vport number to packet's metadata REG_C_0, and the metadata matching for the rules in both fast path and slow path are all added, enable this feature if supported. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
In slow path, packet that not matched by any offloaded rule is forwarded to eswitch vport manager for further processing. Add matching on metadata for peer miss rules in FDB, and rules which forward packet to correct representor in esw manager NIC_RX table. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NEli Britstein <elibr@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
In order to do matching on metadata in slow path when demuxing traffic to representors, explicitly enable the feature that allows HW to pass metadata REG_C_0 from FDB to eswitch manager NIC_RX table. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
Add esw vport query and modify functions, and exposing them is needed for enabling or disabling registers passed as metatdata to vport NIC_RX table in slow path. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
If FW's capabilities and configurations meet the requirement of vport metadata matching, this feature will be used. As the information about vport number and vhca_id related to packet is already stored to its metadata register, which is used as an indicator for perticular vport, now we can change to match on this metadata for all the offloading rules in fast path. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NEli Britstein <elibr@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
In vport metadata matching, source port number is replaced by metadata. While FW has no idea about what it is in the metadata, a syndrome will happen. Specify a known origin to avoid the syndrome. However, there is no functional change because ANY_VPORT (0) is filled in flow_source, the same default value as before, as a pre-step towards metadata matching for fast path. There are two other values can be filled in flow_source. When setting 0x1, packet matching this rule is from uplink, while 0x2 is for packet from other local vports. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
When a dual-port VHCA sends a RoCE packet on its non-native port, and the packet arrives to its affiliated vport FDB, a mismatch might occur on the rules that match the packet source vport as it is not represented by single VHCA only in this case. So we change to match on metadata instead of source vport. To do that, a rule is created in all vports and uplink ingress ACLs, to save the source vport number and vhca id in the packet's metadata in order to match on it later. The metadata register used is the first of the 32-bit type C registers. It can be used for matching and header modify operations. The higher 16 bits of this register are for vhca id, and the lower 16 ones is for vport number. This change is not for dual-port RoCE only. If HW and FW allow, the vport metadata matching is enabled by default. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NEli Britstein <elibr@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
Refactor the flow data structures, add new flow_context and move flow_tag into it, as flow_tag doesn't belong to the rule action. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Parav Pandit 提交于
Introduce a helper API mlx5_eswitch_is_vf_vport() to check if a given vport_num belongs to VF or not. Signed-off-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NJianbo Liu <jianbol@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
That modify header action can be then attached to a steering rule in the ingress ACL. Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NEli Britstein <elibr@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Jianbo Liu 提交于
The ingress and egress ACL root namespaces are created per vport and stored into arrays. However, the vport number is not the same as the index. Passing the array index, instead of vport number, to get the correct ingress and egress acl namespace. Fixes: 9b93ab98 ("net/mlx5: Separate ingress/egress namespaces for each vport") Signed-off-by: NJianbo Liu <jianbol@mellanox.com> Reviewed-by: NOz Shlomo <ozsh@mellanox.com> Reviewed-by: NEli Britstein <elibr@mellanox.com> Reviewed-by: NRoi Dayan <roid@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
- 26 6月, 2019 4 次提交
-
-
由 Tal Gilboa 提交于
Moved all logic from dim.h and net_dim.h to dim.c and net_dim.c. This is both more structurally appealing and would allow to only expose externally used functions. Signed-off-by: NTal Gilboa <talgi@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Tal Gilboa 提交于
Removed 'net' prefix from functions and structs used by external drivers. Signed-off-by: NTal Gilboa <talgi@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Tal Gilboa 提交于
In order to avoid confusion between the function and the similarly named struct. In preparation for removing the 'net' prefix from dim members. Signed-off-by: NTal Gilboa <talgi@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Tal Gilboa 提交于
Renamed macros in use by external drivers. Signed-off-by: NTal Gilboa <talgi@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
- 25 6月, 2019 1 次提交
-
-
由 Matthew Wilcox 提交于
The lock protecting the data structure does not need to be an rwlock. The only read access to the lock is in an error path, and if that's limiting your scalability, you have bigger performance problems. Eliminate mlx5_mkey_table in favour of using the xarray directly. reg_mr_callback must use GFP_ATOMIC for allocating XArray nodes as it may be called in interrupt context. This also fixes a minor bug where SRCU locking was being used on the radix tree read side, when RCU was needed too. Signed-off-by: NMatthew Wilcox <willy@infradead.org> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
- 24 6月, 2019 3 次提交
-
-
由 Vadim Pasternak 提交于
Extend macros MLXSW_REG_MTMP_TEMP_TO_MC() to allow support of negative temperature readout, since chip and others thermal components are capable of operating within the negative temperature. With no such support negative temperature will be consider as very high temperature and it will cause wrong readout and thermal shutdown. For negative values 2`s complement is used. Tested in chamber. Example of chip ambient temperature readout with chamber temperature: -10 Celsius: temp1: -6.0C (highest = -5.0C) -5 Celsius: temp1: -1.0C (highest = -1.0C) v2 (Andrew Lunn): * Replace '%u' with '%d' in mlxsw_hwmon_module_temp_show() Signed-off-by: NVadim Pasternak <vadimp@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vadim Pasternak 提交于
When multiple sensors are mapped to the same cooling device, the cooling device should be set according the worst sensor from the sensors associated with this cooling device. Provide the hottest thermal zone detection and enforce cooling device to follow the temperature trends of the hottest zone only. Prevent competition for the cooling device control from others zones, by "stable trend" indication. A cooling device will not perform any actions associated with a zone with a "stable trend". When other thermal zone is detected as a hottest, a cooling device is to be switched to following temperature trends of new hottest zone. Thermal zone score is represented by 32 bits unsigned integer and calculated according to the next formula: For T < TZ<t><i>, where t from {normal trip = 0, high trip = 1, hot trip = 2, critical = 3}: TZ<i> score = (T + (TZ<t><i> - T) / 2) / (TZ<t><i> - T) * 256 ** j; Highest thermal zone score s is set as MAX(TZ<i>score); Following this formula, if TZ<i> is in trip point higher than TZ<k>, the higher score is to be always assigned to TZ<i>. For two thermal zones located at the same kind of trip point, the higher score will be assigned to the zone which is closer to the next trip point. Thus, the highest score will always be assigned objectively to the hottest thermal zone. All the thermal zones initially are to be configured with mode "enabled" with the "step_wise" governor. Signed-off-by: NVadim Pasternak <vadimp@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vadim Pasternak 提交于
Add a dedicated thermal zone for each inter-connect device. The current temperature is obtained from inter-connect temperature sensor and the default trip points are set to the same values as default ASIC trip points. These settings could be changed from the user space. A cooling device (fan) is bound to all inter-connect devices. Signed-off-by: NVadim Pasternak <vadimp@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 6月, 2019 2 次提交
-
-
由 Jesper Dangaard Brouer 提交于
This patch is needed before we can allow drivers to use page_pool for DMA-mappings. Today with page_pool and XDP return API, it is possible to remove the page_pool object (from rhashtable), while there are still in-flight packet-pages. This is safely handled via RCU and failed lookups in __xdp_return() fallback to call put_page(), when page_pool object is gone. In-case page is still DMA mapped, this will result in page note getting correctly DMA unmapped. To solve this, the page_pool is extended with tracking in-flight pages. And XDP disconnect system queries page_pool and waits, via workqueue, for all in-flight pages to be returned. To avoid killing performance when tracking in-flight pages, the implement use two (unsigned) counters, that in placed on different cache-lines, and can be used to deduct in-flight packets. This is done by mapping the unsigned "sequence" counters onto signed Two's complement arithmetic operations. This is e.g. used by kernel's time_after macros, described in kernel commit 1ba3aab3 and 5a581b36, and also explained in RFC1982. The trick is these two incrementing counters only need to be read and compared, when checking if it's safe to free the page_pool structure. Which will only happen when driver have disconnected RX/alloc side. Thus, on a non-fast-path. It is chosen that page_pool tracking is also enabled for the non-DMA use-case, as this can be used for statistics later. After this patch, using page_pool requires more strict resource "release", e.g. via page_pool_release_page() that was introduced in this patchset, and previous patches implement/fix this more strict requirement. Drivers no-longer call page_pool_destroy(). Drivers already call xdp_rxq_info_unreg() which call xdp_rxq_info_unreg_mem_model(), which will attempt to disconnect the mem id, and if attempt fails schedule the disconnect for later via delayed workqueue. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
The mlx5 driver is using page_pool, but not for DMA-mapping (currently), and is a little too relaxed about returning or releasing page resources, as it is not strictly necessary, when not using DMA-mappings. As this patchset is working towards tracking page_pool resources, to know about in-flight frames on shutdown. Then fix places where mlx5 leak page_pool resource. In case of dma_mapping_error, then recycle into page_pool. In mlx5e_free_rq() moved the page_pool_destroy() call to after the mlx5e_page_release() calls, as it is more correct. In mlx5e_page_release() when no recycle was requested, then release page from the page_pool, via page_pool_release_page(). Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: NTariq Toukan <tariqt@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-