- 18 6月, 2021 2 次提交
-
-
由 Ioana Ciornei 提交于
There are many places where both the fwnode_handle and the of_node of a device need to be populated. Add a function which does both so that we have consistency. Suggested-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jian Shen 提交于
Th_strings arrays netdev_features_strings, tunable_strings, and phy_tunable_strings has been moved to file net/ethtool/common.c. So fixes the comment. Signed-off-by: NJian Shen <shenjian15@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 6月, 2021 1 次提交
-
-
由 Kuniyuki Iwashima 提交于
This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT to check if the attached eBPF program is capable of migrating sockets. When the eBPF program is attached, we run it for socket migration if the expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or net.ipv4.tcp_migrate_req is enabled. Currently, the expected_attach_type is not enforced for the BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the earlier idea in the commit aac3fc32 ("bpf: Post-hooks for sys_bind") to fix up the zero expected_attach_type in bpf_prog_load_fixup_attach_type(). Moreover, this patch adds a new field (migrating_sk) to sk_reuseport_md to select a new listener based on the child socket. migrating_sk varies depending on if it is migrating a request in the accept queue or during 3WHS. - accept_queue : sock (ESTABLISHED/SYN_RECV) - 3WHS : request_sock (NEW_SYN_RECV) In the eBPF program, we can select a new listener by BPF_FUNC_sk_select_reuseport(). Also, we can cancel migration by returning SK_DROP. This feature is useful when listeners have different settings at the socket API level or when we want to free resources as soon as possible. - SK_PASS with selected_sk, select it as a new listener - SK_PASS with selected_sk NULL, fallbacks to the random selection - SK_DROP, cancel the migration. There is a noteworthy point. We select a listening socket in three places, but we do not have struct skb at closing a listener or retransmitting a SYN+ACK. On the other hand, some helper functions do not expect skb is NULL (e.g. skb_header_pointer() in BPF_FUNC_skb_load_bytes(), skb_tail_pointer() in BPF_FUNC_skb_load_bytes_relative()). So we allocate an empty skb temporarily before running the eBPF program. Suggested-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NEric Dumazet <edumazet@google.com> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201203042402.6cskdlit5f3mw4ru@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-10-kuniyu@amazon.co.jp
-
- 15 6月, 2021 5 次提交
-
-
由 Shay Drory 提交于
FW is now supporting more than 256 MSI-X per PF (up to 2K). Hence, enlarge interrupt field in CREATE_EQ to make use of the new MSI-X's. Signed-off-by: NShay Drory <shayd@nvidia.com> Reviewed-by: NMaor Gottlieb <maorg@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
由 Leon Romanovsky 提交于
The users of EQ are running their code on different CPUs and with various affinity patterns. Move the cpumask setting close to their actual usage. Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Reviewed-by: NShay Drory <shayd@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
由 Oleksij Rempel 提交于
This patch support for cable test for the ksz886x switches and the ksz8081 PHY. The patch was tested on a KSZ8873RLL switch with following results: - port 1: - provides invalid values, thus return -ENOTSUPP (Errata: DS80000830A: "LinkMD does not work on Port 1", http://ww1.microchip.com/downloads/en/DeviceDoc/KSZ8873-Errata-DS80000830A.pdf) - port 2: - can detect distance - can detect open on each wire of pair A (wire 1 and 2) - can detect open only on one wire of pair B (only wire 3) - can detect short between wires of a pair (wires 1 + 2 or 3 + 6) - short between pairs is detected as open. For example short between wires 2 + 3 is detected as open. Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Oleksij Rempel 提交于
Add support for MDI-X status and configuration Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael Grzeschik 提交于
Some micrel devices share the same PHY register defines. This patch moves them to one common header so other drivers can reuse them. And reuse generic MII_* defines where possible. Signed-off-by: NMichael Grzeschik <m.grzeschik@pengutronix.de> Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: NVladimir Oltean <olteanv@gmail.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 6月, 2021 3 次提交
-
-
由 Alex Elder 提交于
The csum_value field in the rmnet_map_dl_csum_trailer structure is a "real" Internet checksum. It is a 16 bit value, in big endian format, which represents an inverted ones' complement sum over pairs of bytes. Make that clear by changing its type to __sum16. This makes a typecast in rmnet_map_ipv4_dl_csum_trailer() and another in rmnet_map_ipv6_dl_csum_trailer() unnecessary. Signed-off-by: NAlex Elder <elder@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Add support to create (and destroy) interfaces via a new rtnetlink kind "wwan". The responsible driver has to use the new wwan_register_ops() to make this possible. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NSergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: NLoic Poulain <loic.poulain@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steen Hegelund 提交于
Add 25gbase-r phy interface mode Signed-off-by: NSteen Hegelund <steen.hegelund@microchip.com> Signed-off-by: NBjarni Jonasson <bjarni.jonasson@microchip.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 6月, 2021 18 次提交
-
-
由 Vladimir Oltean 提交于
The sja1105 hardware has a quirk in that some changes require a switch reset, which loses all configuration. When the reset is initiated, everything needs to be reprogrammed, including the MACs and the PCS. This is currently done in sja1105_static_config_reload() - we manually call sja1105_adjust_port_config(), sja1105_sgmii_pcs_config() and sja1105_sgmii_pcs_force_speed() which are all internal functions. There is a desire for sja1105 to use the common xpcs driver, and that means that the equivalents of those functions, xpcs_do_config() and xpcs_link_up() respectively, will no longer be local functions. Forcing phylink to retrigger a resolve somehow, say by doing dev_close() followed by dev_open() is not really an option, because the CPU port might have a PCS as well, and there is no net device which we can close and reopen for that. Additionally, the dev_close/dev_open sequence might force a renegotiation of the copper-side link for SGMII ports connected to a PHY, and this is undesirable as well, because the switch reset is much quicker than a PHY autoneg, so we would have a lot more downtime. The only solution I see is for the sja1105 driver to keep doing what it's doing, and that means we need to export the equivalents from xpcs for sja1105_sgmii_pcs_config and sja1105_sgmii_pcs_force_speed, and call them directly in sja1105_static_config_reload(). This will be done during the conversion patch. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
The NXP SJA1110 switch integrates its own, non-Synopsys PMA, but it manages it through the register space of the XPCS itself, in a small register window inside MDIO_MMD_VEND2 from address 0x8030 to 0x806e. This coincides with where the registers for the default Synopsys PMA are, but the register definitions are of course not the same. This situation is an odd hardware quirk, but the simplest way to manage it is to drive the SJA1110's PMA from within the XPCS driver. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
The NXP SJA1105 DSA switch integrates a Synopsys SGMII XPCS on port 4. The generic code works fine, except there is an integration issue which needs to be dealt with: in this switch, the XPCS is integrated with a PMA that has the TX lane polarity inverted by default (PLUS is MINUS, MINUS is PLUS). To obtain normal non-inverted behavior, the TX lane polarity must be inverted in the PCS, via the DIGITAL_CONTROL_2 register. We introduce a pma_config() method in xpcs_compat which is called by the phylink_pcs_config() implementation. Also, the NXP SJA1105 returns all zeroes in the PHY ID registers 2 and 3. We need to hack up an ad-hoc PHY ID (OUI is zero, device ID is 1) in order for the XPCS driver to recognize it. This PHY ID is added to the public include/linux/pcs/pcs-xpcs.h for that reason (for the sja1105 driver to be able to use it in a later patch). Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
The struct mdio_xpcs_args is reminiscent of when a similarly named struct mdio_xpcs_ops existed. Now that that is removed, we can shorten the name to dw_xpcs (dw for DesignWare). Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arseny Krasnov 提交于
Small updates to make SOCK_SEQPACKET work: 1) Send SHUTDOWN on socket close for SEQPACKET type. 2) Set SEQPACKET packet type during send. 3) Set 'VIRTIO_VSOCK_SEQ_EOR' bit in flags for last packet of message. 4) Implement data check function for SEQPACKET. 5) Check for max datagram size. Signed-off-by: NArseny Krasnov <arseny.krasnov@kaspersky.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arseny Krasnov 提交于
Callback fetches RW packets from rx queue of socket until whole record is copied(if user's buffer is full, user is not woken up). This is done to not stall sender, because if we wake up user and it leaves syscall, nobody will send credit update for rest of record, and sender will wait for next enter of read syscall at receiver's side. So if user buffer is full, we just send credit update and drop data. Signed-off-by: NArseny Krasnov <arseny.krasnov@kaspersky.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Calvin Johnson 提交于
Define phylink_fwnode_phy_connect() to connect phy specified by a fwnode to a phylink instance. Signed-off-by: NCalvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Acked-by: NGrant Likely <grant.likely@arm.com> Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Calvin Johnson 提交于
Define acpi_mdiobus_register() to Register mii_bus and create PHYs for each ACPI child node. Signed-off-by: NCalvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Acked-by: NRafael J. Wysocki <rafael@kernel.org> Acked-by: NGrant Likely <grant.likely@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Calvin Johnson 提交于
Introduce a wrapper around the _ADR evaluation. Signed-off-by: NCalvin Johnson <calvin.johnson@oss.nxp.com> Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Acked-by: NRafael J. Wysocki <rafael@kernel.org> Acked-by: NGrant Likely <grant.likely@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Calvin Johnson 提交于
Introduce fwnode_mdiobus_register_phy() to register PHYs on the mdiobus. From the compatible string, identify whether the PHY is c45 and based on this create a PHY device instance which is registered on the mdiobus. Along with fwnode_mdiobus_register_phy() also introduce fwnode_find_mii_timestamper() and fwnode_mdiobus_phy_device_register() since they are needed. While at it, also use the newly introduced fwnode operation in of_mdiobus_phy_device_register(). Signed-off-by: NCalvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Acked-by: NGrant Likely <grant.likely@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Calvin Johnson 提交于
Extract phy_id from compatible string. This will be used by fwnode_mdiobus_register_phy() to create phy device using the phy_id. Signed-off-by: NCalvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Acked-by: NGrant Likely <grant.likely@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Calvin Johnson 提交于
Define fwnode_phy_find_device() to iterate an mdiobus and find the phy device of the provided phy fwnode. Additionally define device_phy_find_device() to find phy device of provided device. Define fwnode_get_phy_node() to get phy_node using named reference. Signed-off-by: NCalvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Acked-by: NGrant Likely <grant.likely@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Calvin Johnson 提交于
Define fwnode_mdio_find_device() to get a pointer to the mdio_device from fwnode passed to the function. Refactor of_mdio_find_device() to use fwnode_mdio_find_device(). Signed-off-by: NCalvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Acked-by: NGrant Likely <grant.likely@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
The TX timestamping procedure for SJA1105 is a bit unconventional because the transmit procedure itself is unconventional. Control packets (and therefore PTP as well) are transmitted to a specific port in SJA1105 using "management routes" which must be written over SPI to the switch. These are one-shot rules that match by destination MAC address on traffic coming from the CPU port, and select the precise destination port for that packet. So to transmit a packet from NET_TX softirq context, we actually need to defer to a process context so that we can perform that SPI write before we send the packet. The DSA master dev_queue_xmit() runs in process context, and we poll until the switch confirms it took the TX timestamp, then we annotate the skb clone with that TX timestamp. This is why the sja1105 driver does not need an skb queue for TX timestamping. But the SJA1110 is a bit (not much!) more conventional, and you can request 2-step TX timestamping through the DSA header, as well as give the switch a cookie (timestamp ID) which it will give back to you when it has the timestamp. So now we do need a queue for keeping the skb clones until their TX timestamps become available. The interesting part is that the metadata frames from SJA1105 haven't disappeared completely. On SJA1105 they were used as follow-ups which contained RX timestamps, but on SJA1110 they are actually TX completion packets, which contain a variable (up to 32) array of timestamps. Why an array? Because: - not only is the TX timestamp on the egress port being communicated, but also the RX timestamp on the CPU port. Nice, but we don't care about that, so we ignore it. - because a packet could be multicast to multiple egress ports, each port takes its own timestamp, and the TX completion packet contains the individual timestamps on each port. This is unconventional because switches typically have a timestamping FIFO and raise an interrupt, but this one doesn't. So the tagger needs to detect and parse meta frames, and call into the main switch driver, which pairs the timestamps with the skbs in the TX timestamping queue which are waiting for one. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
The SJA1110 has improved a few things compared to SJA1105: - To send a control packet from the host port with SJA1105, one needed to program a one-shot "management route" over SPI. This is no longer true with SJA1110, you can actually send "in-band control extensions" in the packets sent by DSA, these are in fact DSA tags which contain the destination port and switch ID. - When receiving a control packet from the switch with SJA1105, the source port and switch ID were written in bytes 3 and 4 of the destination MAC address of the frame (which was a very poor shot at a DSA header). If the control packet also had an RX timestamp, that timestamp was sent in an actual follow-up packet, so there were reordering concerns on multi-core/multi-queue DSA masters, where the metadata frame with the RX timestamp might get processed before the actual packet to which that timestamp belonged (there is no way to pair a packet to its timestamp other than the order in which they were received). On SJA1110, this is no longer true, control packets have the source port, switch ID and timestamp all in the DSA tags. - Timestamps from the switch were partial: to get a 64-bit timestamp as required by PTP stacks, one would need to take the partial 24-bit or 32-bit timestamp from the packet, then read the current PTP time very quickly, and then patch in the high bits of the current PTP time into the captured partial timestamp, to reconstruct what the full 64-bit timestamp must have been. That is awful because packet processing is done in NAPI context, but reading the current PTP time is done over SPI and therefore needs sleepable context. But it also aggravated a few things: - Not only is there a DSA header in SJA1110, but there is a DSA trailer in fact, too. So DSA needs to be extended to support taggers which have both a header and a trailer. Very unconventional - my understanding is that the trailer exists because the timestamps couldn't be prepared in time for putting them in the header area. - Like SJA1105, not all packets sent to the CPU have the DSA tag added to them, only control packets do: * the ones which match the destination MAC filters/traps in MAC_FLTRES1 and MAC_FLTRES0 * the ones which match FDB entries which have TRAP or TAKETS bits set So we could in theory hack something up to request the switch to take timestamps for all packets that reach the CPU, and those would be DSA-tagged and contain the source port / switch ID by virtue of the fact that there needs to be a timestamp trailer provided. BUT: - The SJA1110 does not parse its own DSA tags in a way that is useful for routing in cross-chip topologies, a la Marvell. And the sja1105 driver already supports cross-chip bridging from the SJA1105 days. It does that by automatically setting up the DSA links as VLAN trunks which contain all the necessary tag_8021q RX VLANs that must be communicated between the switches that span the same bridge. So when using tag_8021q on sja1105, it is possible to have 2 switches with ports sw0p0, sw0p1, sw1p0, sw1p1, and 2 VLAN-unaware bridges br0 and br1, and br0 can take sw0p0 and sw1p0, and br1 can take sw0p1 and sw1p1, and forwarding will happen according to the expected rules of the Linux bridge. We like that, and we don't want that to go away, so as a matter of fact, the SJA1110 tagger still needs to support tag_8021q. So the sja1110 tagger is a hybrid between tag_8021q for data packets, and the native hardware support for control packets. On RX, packets have a 13-byte trailer if they contain an RX timestamp. That trailer is padded in such a way that its byte 8 (the start of the "residence time" field - not parsed by Linux because we don't care) is aligned on a 16 byte boundary. So the padding has a variable length between 0 and 15 bytes. The DSA header contains the offset of the beginning of the padding relative to the beginning of the frame (and the end of the padding is obviously the end of the packet minus 13 bytes, the length of the trailer). So we discard it. Packets which don't have a trailer contain the source port and switch ID information in the header (they are "trap-to-host" packets). Packets which have a trailer contain the source port and switch ID in the trailer. On TX, the destination port mask and switch ID is always in the trailer, so we always need to say in the header that a trailer is present. The header needs a custom EtherType and this was chosen as 0xdadc, after 0xdada which is for Marvell and 0xdadb which is for VLANs in VLAN-unaware mode on SJA1105 (and SJA1110 in fact too). Because we use tag_8021q in concert with the native tagging protocol, control packets will have 2 DSA tags. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
In SJA1105, RX timestamps for packets sent to the CPU are transmitted in separate follow-up packets (metadata frames). These contain partial timestamps (24 or 32 bits) which are kept in SJA1105_SKB_CB(skb)->meta_tstamp. Thankfully, SJA1110 improved that, and the RX timestamps are now transmitted in-band with the actual packet, in the timestamp trailer. The RX timestamps are now full-width 64 bits. Because we process the RX DSA tags in the rcv() method in the tagger, but we would like to preserve the DSA code structure in that we populate the skb timestamp in the port_rxtstamp() call which only happens later, the implication is that we must somehow pass the 64-bit timestamp from the rcv() method all the way to port_rxtstamp(). We can use the skb->cb for that. Rename the meta_tstamp from struct sja1105_skb_cb from "meta_tstamp" to "tstamp", and increase its size to 64 bits. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
The added value of this function is that it can deal with both the case where the VLAN header is in the skb head, as well as in the offload field. This is something I was not able to do using other functions in the network stack. Since both ocelot-8021q and sja1105 need to do the same stuff, let's make it a common service provided by tag_8021q. This is done as refactoring for the new SJA1110 tagger, which partly uses tag_8021q as well (just like SJA1105), and will be the third caller. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
All users of tag_8021q select it in Kconfig, so shim functions are not needed because it is not possible for it to be disabled and its callers enabled. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 6月, 2021 1 次提交
-
-
由 Jacob Keller 提交于
Add the ice_ptp_hw.c file and some associated definitions to the ice driver folder. This file contains basic low level definitions for functions that interact with the device hardware. For now, only E810-based devices are supported. The ice hardware supports 2 major variants which have different PHYs with different procedures necessary for interacting with the device clock. Because the device captures timestamps in the PHY, each PHY has its own internal timer. The timers are synchronized in hardware by first preparing the source timer and the PHY timer shadow registers, and then issuing a synchronization command. This ensures that both the source timer and PHY timers are programmed simultaneously. The timers themselves are all driven from the same oscillator source. The functions in ice_ptp_hw.c abstract over the differences between how the PHYs in E810 are programmed vs how the PHYs in E822 devices are programmed. This series only implements E810 support, but E822 support will be added in a future change. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Tested-by: NTony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 10 6月, 2021 4 次提交
-
-
由 Vlad Buslov 提交于
Create new files bridge.{c|h} in en/rep directory that implement bridge interaction with representor netdevices and handle required events/notifications, bridge.{c|h} in esw directory that implement all necessary eswitch offloading infrastructure and works on vport/eswitch level. Provide new kconfig MLX5_BRIDGE which is automatically selected when both kernel bridge and mlx5 eswitch configs are enabled. Provide basic infrastructure for bridge offloads: - struct mlx5_esw_bridge_offloads - per-eswitch bridge offload structure that encapsulates generic bridge-offloads data (notifier blocks, ingress flow table/group, etc.) that is created/deleted on enable/disable eswitch offloads. - struct mlx5_esw_bridge - per-bridge structure that encapsulates per-bridge data (reference counter, FDB, egress flow table/group, etc.) that is created when first eswitch represetor is attached to new bridge and deleted when last representor is removed from the bridge as a result of NETDEV_CHANGEUPPER event. The bridge tables are created with new priority FDB_BR_OFFLOAD in FDB namespace. The new priority is between tc-miss and slow path priorities. Priority consist of two levels: the ingress table that is global per eswitch and matches incoming packets by src_mac/vid and redirects them to next level (egress table) that is chosen according to ingress port bridge membership and matches on dst_mac/vid in order to redirect packet to vport according to the following diagram: + | +---------v----------+ | | | FDB_TC_OFFLOAD | | | +---------+----------+ | | +---------v----------+ | | | FDB_FT_OFFLOAD | | | +---------+----------+ | | +---------v----------+ | | | FDB_TC_MISS | | | +---------+----------+ | +--------------------------------------+ | | | | +------+ | | | | | +------v--------+ FDB_BR_OFFLOAD | | | INGRESS_TABLE | | | +------+---+----+ | | | | match | | | +---------+ | | | | | +-------+ | | +-------v-------+ match | | | | | | EGRESS_TABLE +------------> vport | | | +-------+-------+ | | | | | | | +-------+ | | miss | | | +------+------+ | | | | +--------------------------------------+ | | +---------v----------+ | | | FDB_SLOW_PATH | | | +---------+----------+ | v Signed-off-by: NVlad Buslov <vladbu@nvidia.com> Reviewed-by: NJianbo Liu <jianbol@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
由 Vlad Buslov 提交于
In order to adhere to kernel software datapath model bridge offloads must come after TC and NF FDBs. Following patches in this series add new FDB priority for bridge after FDB_FT_OFFLOAD. However, since netfilter offload is implemented with unmanaged tables, its miss path is not automatically connected to next priority and requires the code to manually connect with slow table. To keep bridge offloads encapsulated and not mix it with eswitch offloads, create a new FDB_TC_MISS priority between FDB_FT_OFFLOAD and FDB_SLOW_PATH: + | +---------v----------+ | | | FDB_TC_OFFLOAD | | | +---------+----------+ | | | +---------v----------+ | | | FDB_FT_OFFLOAD | | | +---------+----------+ | | | +---------v----------+ | | | FDB_TC_MISS | | | +---------+----------+ | | | +---------v----------+ | | | FDB_SLOW_PATH | | | +---------+----------+ | v Initialize the new priority with single default empty managed table and use the table as TC/NF miss patch instead of slow table. This approach allows bridge offloads to be created as new FDB namespace priority between FDB_TC_MISS and FDB_SLOW_PATH without exposing its internal tables to any other modules since miss path of managed TC-miss table is automatically wired to next priority. Signed-off-by: NVlad Buslov <vladbu@nvidia.com> Reviewed-by: NJianbo Liu <jianbol@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
由 Yevgeny Kliteynik 提交于
Adding new reformat context type (INSERT_HEADER) requires adding two new parameters to reformat context - reformat_param_0 and reformat_param_1. As defined by HW spec, these parameters have different meaning for different reformat context type. The first parameter (reformat_param_0) is not new to HW spec, but it wasn't used by any of the supported reformats. The second parameter (reformat_param_1) is new to the HW spec - it was added to allow supporting INSERT_HEADER. For NSERT_HEADER, reformat_param_0 indicates the header used to reference the location of the inserted header, and reformat_param_1 indicates the offset of the inserted header from the reference point defined by reformat_param_0. Signed-off-by: NYevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
由 Yevgeny Kliteynik 提交于
Add support for HCA caps 2 that contains capabilities for the new insert/remove header actions. Added the required definitions for supporting the new reformat type: added packet reformat parameters, reformat anchors and definitions to allow copy/set into the inserted EMD (Embedded MetaData) tag. Signed-off-by: NYevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: NVlad Buslov <vladbu@nvidia.com> Reviewed-by: NJianbo Liu <jianbol@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
- 09 6月, 2021 4 次提交
-
-
由 Matthew Hagan 提交于
We are currently assuming that GMAC_AHB_RESET will already be deasserted by the bootloader. However if this has not been done, probing of the GMAC will fail. To remedy this we must ensure GMAC_AHB_RESET has been deasserted prior to probing. v2 changes: - remove NULL condition check for stmmac_ahb_rst in stmmac_main.c - unwrap dev_err() message in stmmac_main.c - add PTR_ERR() around plat->stmmac_ahb_rst in stmmac_platform.c v3 changes: - add error pointer to dev_err() output - add reset_control_assert(stmmac_ahb_rst) in stmmac_dvr_remove - revert PTR_ERR() around plat->stmmac_ahb_rst since this is performed on the returned value of ret by the calling function Signed-off-by: NMatthew Hagan <mnhagan88@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sergey Ryazanov 提交于
It is quite unusual when some value can not be equal to a defined range max value. Also most subsystems defines FOO_TYPE_MAX as a maximum valid value. So turn the WAN_PORT_MAX meaning from the number of supported port types to the maximum valid port type. Signed-off-by: NSergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: NLoic Poulain <loic.poulain@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Voon Weifeng 提交于
The Intel mGbE supports 2.5Gbps link speed by increasing the clock rate by 2.5 times of the original rate. In this mode, the serdes/PHY operates at a serial baud rate of 3.125 Gbps and the PCS data path and GMII interface of the MAC operate at 312.5 MHz instead of 125 MHz. For Intel mGbE, the overclocking of 2.5 times clock rate to support 2.5G is only able to be configured in the BIOS during boot time. Kernel driver has no access to modify the clock rate for 1Gbps/2.5G mode. The way to determined the current 1G/2.5G mode is by reading a dedicated adhoc register through mdio bus. In short, after the system boot up, it is either in 1G mode or 2.5G mode which not able to be changed on the fly. Compared to 1G mode, the 2.5G mode selects the 2500BASEX as PHY interface and disables the xpcs_an_inband. This is to cater for some PHYs that only supports 2500BASEX PHY interface with no autonegotiation. v2: remove MAC supported link speed masking v3: Restructure to introduce intel_speed_mode_2500() to read serdes registers for max speed supported and select the appropritate configuration. Use max_speed to determine the supported link speed mask. Signed-off-by: NVoon Weifeng <weifeng.voon@intel.com> Signed-off-by: NMichael Sit Wei Hong <michael.wei.hong.sit@intel.com> Reviewed-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Voon Weifeng 提交于
XPCS IP supports 2500BASEX as PHY interface. It is configured as autonegotiation disable to cater for PHYs that does not supports 2500BASEX autonegotiation. v2: Add supported link speed masking. v3: Restructure to introduce xpcs_config_2500basex() used to configure the xpcs for 2.5G speeds. Added 2500BASEX specific information for configuration. v4: Fix indentation error Signed-off-by: NVoon Weifeng <weifeng.voon@intel.com> Signed-off-by: NMichael Sit Wei Hong <michael.wei.hong.sit@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 6月, 2021 2 次提交
-
-
由 Ilias Apalodimas 提交于
Up to now several high speed NICs have custom mechanisms of recycling the allocated memory they use for their payloads. Our page_pool API already has recycling capabilities that are always used when we are running in 'XDP mode'. So let's tweak the API and the kernel network stack slightly and allow the recycling to happen even during the standard operation. The API doesn't take into account 'split page' policies used by those drivers currently, but can be extended once we have users for that. The idea is to be able to intercept the packet on skb_release_data(). If it's a buffer coming from our page_pool API recycle it back to the pool for further usage or just release the packet entirely. To achieve that we introduce a bit in struct sk_buff (pp_recycle:1) and a field in struct page (page->pp) to store the page_pool pointer. Storing the information in page->pp allows us to recycle both SKBs and their fragments. We could have skipped the skb bit entirely, since identical information can bederived from struct page. However, in an effort to affect the free path as less as possible, reading a single bit in the skb which is already in cache, is better that trying to derive identical information for the page stored data. The driver or page_pool has to take care of the sync operations on it's own during the buffer recycling since the buffer is, after opting-in to the recycling, never unmapped. Since the gain on the drivers depends on the architecture, we are not enabling recycling by default if the page_pool API is used on a driver. In order to enable recycling the driver must call skb_mark_for_recycle() to store the information we need for recycling in page->pp and enabling the recycling bit, or page_pool_store_mem_info() for a fragment. Co-developed-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matteo Croce 提交于
This is a prerequisite patch, the next one is enabling recycling of skbs and fragments. Add an extra argument on __skb_frag_unref() to handle recycling, and update the current users of the function with that. Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-