- 12 2月, 2021 4 次提交
-
-
由 Tariq Toukan 提交于
Use a new config SOCK_RX_QUEUE_MAPPING to compile-in the socket RX queue field and logic, instead of the XPS config. This breaks dependency in XPS, and allows selecting it from non-XPS use cases, as we do in the next patch. In addition, use the new flag to wrap the logic in sk_rx_queue_get() and protect access to the sk_rx_queue_mapping field, while keeping the function exposed unconditionally, just like sk_rx_queue_set() and sk_rx_queue_clear(). Signed-off-by: NTariq Toukan <tariqt@nvidia.com> Reviewed-by: NMaxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
It is simpler to make net->net_cookie a plain u64 written once in setup_net() instead of looping and using atomic64 helpers. Lorenz Bauer wants to add SO_NETNS_COOKIE socket option and this patch would makes his patch series simpler. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Lorenz Bauer <lmb@cloudflare.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Tested-by: NLorenz Bauer <lmb@cloudflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 George McCollister 提交于
Add support for offloading of HSR/PRP (IEC 62439-3) tag insertion tag removal, duplicate generation and forwarding on DSA switches. Add DSA_NOTIFIER_HSR_JOIN and DSA_NOTIFIER_HSR_LEAVE which trigger calls to .port_hsr_join and .port_hsr_leave in the DSA driver for the switch. The DSA switch driver should then set netdev feature flags for the HSR/PRP operation that it offloads. NETIF_F_HW_HSR_TAG_INS NETIF_F_HW_HSR_TAG_RM NETIF_F_HW_HSR_FWD NETIF_F_HW_HSR_DUP Signed-off-by: NGeorge McCollister <george.mccollister@gmail.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Reviewed-by: NVladimir Oltean <olteanv@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Aya Levin 提交于
%s/dest_mac_filter/dmac_filter/g Fixes: e78ab164 ("devlink: Add DMAC filter generic packet trap") Signed-off-by: NAya Levin <ayal@nvidia.com> Reported-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NTariq Toukan <tariqt@nvidia.com> Reviewed-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 2月, 2021 3 次提交
-
-
由 Amit Cohen 提交于
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. To avoid such cases, previous patch set added the ability to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed, this behavior is controlled by sysctl. With the above mentioned behavior, it is possible to know from user-space if the route was offloaded, but if the offload fails there is no indication to user-space. Following a failure, a routing daemon will wait indefinitely for a notification that will never come. This patch adds an "offload_failed" indication to IPv6 routes, so that users will have better visibility into the offload process. 'struct fib6_info' is extended with new field that indicates if route offload failed. Note that the new field is added using unused bit and therefore there is no need to increase struct size. Signed-off-by: NAmit Cohen <amcohen@nvidia.com> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Amit Cohen 提交于
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. To avoid such cases, previous patch set added the ability to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed, this behavior is controlled by sysctl. With the above mentioned behavior, it is possible to know from user-space if the route was offloaded, but if the offload fails there is no indication to user-space. Following a failure, a routing daemon will wait indefinitely for a notification that will never come. This patch adds an "offload_failed" indication to IPv4 routes, so that users will have better visibility into the offload process. 'struct fib_alias', and 'struct fib_rt_info' are extended with new field that indicates if route offload failed. Note that the new field is added using unused bit and therefore there is no need to increase structs size. Signed-off-by: NAmit Cohen <amcohen@nvidia.com> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Horatiu Vultur 提交于
Now that MRP started to use also SWITCHDEV_ATTR_ID_PORT_STP_STATE to notify HW, then SWITCHDEV_ATTR_ID_MRP_PORT_STAT is not used anywhere else, therefore we can remove it. Fixes: c284b545 ("switchdev: mrp: Extend switchdev API to offload MRP") Signed-off-by: NHoratiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2021 4 次提交
-
-
由 Leon Romanovsky 提交于
Fix the following compilation warnings: net/netfilter/ipvs/ip_vs_proto_tcp.c:147:1: warning: no previous prototype for 'tcp_snat_handler' [-Wmissing-prototypes] 147 | tcp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp, | ^~~~~~~~~~~~~~~~ net/netfilter/ipvs/ip_vs_proto_udp.c:136:1: warning: no previous prototype for 'udp_snat_handler' [-Wmissing-prototypes] 136 | udp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp, | ^~~~~~~~~~~~~~~~ Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Leon Romanovsky 提交于
Fir the following compilation warnings: 1031 | INDIRECT_CALLABLE_SCOPE void udp_v6_early_demux(struct sk_buff *skb) net/ipv6/ip6_offload.c:182:41: warning: no previous prototype for ‘ipv6_gro_receive’ [-Wmissing-prototypes] 182 | INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head, | ^~~~~~~~~~~~~~~~ net/ipv6/ip6_offload.c:320:29: warning: no previous prototype for ‘ipv6_gro_complete’ [-Wmissing-prototypes] 320 | INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff) | ^~~~~~~~~~~~~~~~~ net/ipv6/ip6_offload.c:182:41: warning: no previous prototype for ‘ipv6_gro_receive’ [-Wmissing-prototypes] 182 | INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head, | ^~~~~~~~~~~~~~~~ net/ipv6/ip6_offload.c:320:29: warning: no previous prototype for ‘ipv6_gro_complete’ [-Wmissing-prototypes] 320 | INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff) Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Leon Romanovsky 提交于
Fix the following compilation warning: net/ipv6/udp.c:1031:30: warning: no previous prototype for 'udp_v6_early_demux' [-Wmissing-prototypes] 1031 | INDIRECT_CALLABLE_SCOPE void udp_v6_early_demux(struct sk_buff *skb) | ^~~~~~~~~~~~~~~~~~ net/ipv6/udp.c:1072:29: warning: no previous prototype for 'udpv6_rcv' [-Wmissing-prototypes] 1072 | INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb) | ^~~~~~~~~ Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
When enabling encap for a ipv6 socket without udp_encap_needed_key increased, UDP GRO won't work for v4 mapped v6 address packets as sk will be NULL in udp4_gro_receive(). This patch is to enable it by increasing udp_encap_needed_key for v6 sockets in udp_tunnel_encap_enable(), and correspondingly decrease udp_encap_needed_key in udpv6_destroy_sock(). v1->v2: - add udp_encap_disable() and export it. v2->v3: - add the change for rxrpc and bareudp into one patch, as Alex suggested. v3->v4: - move rxrpc part to another patch. Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 04 2月, 2021 5 次提交
-
-
由 Wei Wang 提交于
Currently, a percpu_counter with the default batch size (2*nr_cpus) is used to record the total # of active sockets per protocol. This means sk_sockets_allocated_read_positive() could be off by +/-2*(nr_cpus^2). This under/over-estimation could lead to wrong memory suppression conditions in __sk_raise_mem_allocated(). Fix this by using a more reasonable fixed batch size of 16. See related commit cf86a086 ("net/dst: use a smaller percpu_counter batch for dst entries accounting") that addresses a similar issue. Signed-off-by: NWei Wang <weiwan@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NSoheil Hassas Yeganeh <soheil@google.com> Link: https://lore.kernel.org/r/20210202193408.1171634-1-weiwan@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Brian Vazquez 提交于
This patch avoids the indirect call for the common case: ip6_dst_check and ipv4_dst_check Signed-off-by: NBrian Vazquez <brianvv@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Brian Vazquez 提交于
This patch avoids the indirect call for the common case: ip6_mtu and ipv4_mtu Signed-off-by: NBrian Vazquez <brianvv@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Brian Vazquez 提交于
This patch avoids the indirect call for the common case: ip6_output and ip_output Signed-off-by: NBrian Vazquez <brianvv@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Brian Vazquez 提交于
This patch avoids the indirect call for the common case: ip_local_deliver and ip6_input Signed-off-by: NBrian Vazquez <brianvv@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 03 2月, 2021 3 次提交
-
-
由 Amit Cohen 提交于
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. It is also possible for a route already installed in hardware to change its action and therefore its flags. For example, a host route that is trapping packets can be "promoted" to perform decapsulation following the installation of an IPinIP/VXLAN tunnel. Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed. The aim is to provide an indication to user-space (e.g., routing daemons) about the state of the route in hardware. Introduce a sysctl that controls this behavior. Keep the default value at 0 (i.e., do not emit notifications) for several reasons: - Multiple RTM_NEWROUTE notification per-route might confuse existing routing daemons. - Convergence reasons in routing daemons. - The extra notifications will negatively impact the insertion rate. - Not all users are interested in these notifications. Move fib6_info_hw_flags_set() to C file because it is no longer a short function. Signed-off-by: NAmit Cohen <amcohen@nvidia.com> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Amit Cohen 提交于
The next patch will emit notification when hardware flags are changed, in case that fib_notify_on_flag_change sysctl is set to 1. To know sysctl values, net struct is needed. This change is consistent with the IPv4 version, which gets 'net' struct as its first argument. Currently, the only callers of this function are mlxsw and netdevsim. Patch the callers to pass net. Signed-off-by: NAmit Cohen <amcohen@nvidia.com> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Amit Cohen 提交于
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. It is also possible for a route already installed in hardware to change its action and therefore its flags. For example, a host route that is trapping packets can be "promoted" to perform decapsulation following the installation of an IPinIP/VXLAN tunnel. Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed. The aim is to provide an indication to user-space (e.g., routing daemons) about the state of the route in hardware. Introduce a sysctl that controls this behavior. Keep the default value at 0 (i.e., do not emit notifications) for several reasons: - Multiple RTM_NEWROUTE notification per-route might confuse existing routing daemons. - Convergence reasons in routing daemons. - The extra notifications will negatively impact the insertion rate. - Not all users are interested in these notifications. Signed-off-by: NAmit Cohen <amcohen@nvidia.com> Acked-by: NRoopa Prabhu <roopa@nvidia.com> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 02 2月, 2021 4 次提交
-
-
由 Dongseok Yi 提交于
UDP/IP header of UDP GROed frag_skbs are not updated even after NAT forwarding. Only the header of head_skb from ip_finish_output_gso -> skb_gso_segment is updated but following frag_skbs are not updated. A call path skb_mac_gso_segment -> inet_gso_segment -> udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list does not try to update UDP/IP header of the segment list but copy only the MAC header. Update port, addr and check of each skb of the segment list in __udp_gso_segment_list. It covers both SNAT and DNAT. Fixes: 9fd1ff5d (udp: Support UDP fraglist GRO/GSO.) Signed-off-by: NDongseok Yi <dseok.yi@samsung.com> Acked-by: NSteffen Klassert <steffen.klassert@secunet.com> Link: https://lore.kernel.org/r/1611962007-80092-1-git-send-email-dseok.yi@samsung.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alexander Ovechkin 提交于
Commit e5f0e8f8 ("net: sched: introduce and use qdisc tree flush/purge helpers") introduced qdisc tree flush/purge helpers, but erroneously used flush helper instead of purge helper in qdisc_replace function. This issue was found in our CI, that tests various qdisc setups by configuring qdisc and sending data through it. Call of invalid helper sporadically leads to corruption of vt_tree/cf_tree of hfsc_class that causes kernel oops: Oops: 0000 [#1] SMP PTI CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.11.0-8f6859df #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:rb_insert_color+0x18/0x190 Code: c3 31 c0 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 07 48 85 c0 0f 84 05 01 00 00 48 8b 10 f6 c2 01 0f 85 34 01 00 00 <48> 8b 4a 08 49 89 d0 48 39 c1 74 7d 48 85 c9 74 32 f6 01 01 75 2d RSP: 0018:ffffc900000b8bb0 EFLAGS: 00010246 RAX: ffff8881ef4c38b0 RBX: ffff8881d956e400 RCX: ffff8881ef4c38b0 RDX: 0000000000000000 RSI: ffff8881d956f0a8 RDI: ffff8881d956e4b0 RBP: 0000000000000000 R08: 000000d5c4e249da R09: 1600000000000000 R10: ffffc900000b8be0 R11: ffffc900000b8b28 R12: 0000000000000001 R13: 000000000000005a R14: ffff8881f0905000 R15: ffff8881f0387d00 FS: 0000000000000000(0000) GS:ffff8881f8b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 00000001f4796004 CR4: 0000000000060ee0 Call Trace: <IRQ> init_vf.isra.19+0xec/0x250 [sch_hfsc] hfsc_enqueue+0x245/0x300 [sch_hfsc] ? fib_rules_lookup+0x12a/0x1d0 ? __dev_queue_xmit+0x4b6/0x930 ? hfsc_delete_class+0x250/0x250 [sch_hfsc] __dev_queue_xmit+0x4b6/0x930 ? ip6_finish_output2+0x24d/0x590 ip6_finish_output2+0x24d/0x590 ? ip6_output+0x6c/0x130 ip6_output+0x6c/0x130 ? __ip6_finish_output+0x110/0x110 mld_sendpack+0x224/0x230 mld_ifc_timer_expire+0x186/0x2c0 ? igmp6_group_dropped+0x200/0x200 call_timer_fn+0x2d/0x150 run_timer_softirq+0x20c/0x480 ? tick_sched_do_timer+0x60/0x60 ? tick_sched_timer+0x37/0x70 __do_softirq+0xf7/0x2cb irq_exit+0xa0/0xb0 smp_apic_timer_interrupt+0x74/0x150 apic_timer_interrupt+0xf/0x20 </IRQ> Fixes: e5f0e8f8 ("net: sched: introduce and use qdisc tree flush/purge helpers") Signed-off-by: NAlexander Ovechkin <ovov@yandex-team.ru> Reported-by: NAlexander Kuznetsov <wwfq@yandex-team.ru> Acked-by: NDmitry Monakhov <dmtrmonakhov@yandex-team.ru> Acked-by: NDmitry Yakunin <zeil@yandex-team.ru> Acked-by: NCong Wang <xiyou.wangcong@gmail.com> Link: https://lore.kernel.org/r/20210201200049.299153-1-ovov@yandex-team.ruSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Johannes Berg 提交于
If register_netdevice() fails after having called cfg80211's netdev notifier (cfg80211_netdev_notifier_call) it will call the notifier again with UNREGISTER. This would then lock the wiphy mutex because we're marked as registered, which causes a deadlock. Fix this by separately keeping track of whether or not we're in the middle of registering to also skip the notifier call on this unregister. Reported-by: syzbot+2ae0ca9d7737ad1a62b7@syzkaller.appspotmail.com Fixes: a05829a7 ("cfg80211: avoid holding the RTNL when calling the driver") Link: https://lore.kernel.org/r/20210201192048.ed8bad436737.I7cae042c44b15f80919a285799a15df467e9d42d@changeidSigned-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Yu Liu 提交于
If a peer device doesn't support eSCO 2M we should skip the params that use it when setting up sync connection since they will always fail. Signed-off-by: NYu Liu <yudiliu@google.com> Reviewed-by: NAbhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
- 30 1月, 2021 6 次提交
-
-
由 Neal Cardwell 提交于
This commit shrinks inet_connection_sock by 4 bytes, by shrinking icsk_mtup.enabled from 32 bits to 1 bit, and shrinking icsk_mtup.probe_size from s32 to an unsuigned 31 bit field. This is to save space to compensate for the recent introduction of a new u32 in inet_connection_sock, icsk_probes_tstamp, in the recent bug fix commit 9d9b1ee0 ("tcp: fix TCP_USER_TIMEOUT with zero window"). This should not change functionality, since icsk_mtup.enabled is only ever set to 0 or 1, and icsk_mtup.probe_size can only be either 0 or a positive MTU value returned by tcp_mss_to_mtu() Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20210129185438.1813237-1-ncardwell.kernel@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
There are use cases for which the existing tagger, based on the NPI (Node Processor Interface) functionality, is insufficient. Namely: - Frames injected through the NPI port bypass the frame analyzer, so no source address learning is performed, no TSN stream classification, etc. - Flow control is not functional over an NPI port (PAUSE frames are encapsulated in the same Extraction Frame Header as all other frames) - There can be at most one NPI port configured for an Ocelot switch. But in NXP LS1028A and T1040 there are two Ethernet CPU ports. The non-NPI port is currently either disabled, or operated as a plain user port (albeit an internally-facing one). Having the ability to configure the two CPU ports symmetrically could pave the way for e.g. creating a LAG between them, to increase bandwidth seamlessly for the system. So there is a desire to have an alternative to the NPI mode. This change keeps the default tagger for the Seville and Felix switches as "ocelot", but it can be changed via the following device attribute: echo ocelot-8021q > /sys/class/<dsa-master>/dsa/tagging Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
Currently DSA exposes the following sysfs: $ cat /sys/class/net/eno2/dsa/tagging ocelot which is a read-only device attribute, introduced in the kernel as commit 98cdb480 ("net: dsa: Expose tagging protocol to user-space"), and used by libpcap since its commit 993db3800d7d ("Add support for DSA link-layer types"). It would be nice if we could extend this device attribute by making it writable: $ echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging This is useful with DSA switches that can make use of more than one tagging protocol. It may be useful in dsa_loop in the future too, to perform offline testing of various taggers, or for changing between dsa and edsa on Marvell switches, if that is desirable. In terms of implementation, drivers can support this feature by implementing .change_tag_protocol, which should always leave the switch in a consistent state: either with the new protocol if things went well, or with the old one if something failed. Teardown of the old protocol, if necessary, must be handled by the driver. Some things remain as before: - The .get_tag_protocol is currently only called at probe time, to load the initial tagging protocol driver. Nonetheless, new drivers should report the tagging protocol in current use now. - The driver should manage by itself the initial setup of tagging protocol, no later than the .setup() method, as well as destroying resources used by the last tagger in use, no earlier than the .teardown() method. For multi-switch DSA trees, error handling is a bit more complicated, since e.g. the 5th out of 7 switches may fail to change the tag protocol. When that happens, a revert to the original tag protocol is attempted, but that may fail too, leaving the tree in an inconsistent state despite each individual switch implementing .change_tag_protocol transactionally. Since the intersection between drivers that implement .change_tag_protocol and drivers that support D in DSA is currently the empty set, the possibility for this error to happen is ignored for now. Testing: $ insmod mscc_felix.ko [ 79.549784] mscc_felix 0000:00:00.5: Adding to iommu group 14 [ 79.565712] mscc_felix 0000:00:00.5: Failed to register DSA switch: -517 $ insmod tag_ocelot.ko $ rmmod mscc_felix.ko $ insmod mscc_felix.ko [ 97.261724] libphy: VSC9959 internal MDIO bus: probed [ 97.267363] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 0 [ 97.274998] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 1 [ 97.282561] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 2 [ 97.289700] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 3 [ 97.599163] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) [ 97.862034] mscc_felix 0000:00:00.5 swp1 (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) [ 97.950731] mscc_felix 0000:00:00.5 swp0: configuring for inband/qsgmii link mode [ 97.964278] 8021q: adding VLAN 0 to HW filter on device swp0 [ 98.146161] mscc_felix 0000:00:00.5 swp2 (uninitialized): PHY [0000:00:00.3:12] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) [ 98.238649] mscc_felix 0000:00:00.5 swp1: configuring for inband/qsgmii link mode [ 98.251845] 8021q: adding VLAN 0 to HW filter on device swp1 [ 98.433916] mscc_felix 0000:00:00.5 swp3 (uninitialized): PHY [0000:00:00.3:13] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) [ 98.485542] mscc_felix 0000:00:00.5: configuring for fixed/internal link mode [ 98.503584] mscc_felix 0000:00:00.5: Link is Up - 2.5Gbps/Full - flow control rx/tx [ 98.527948] device eno2 entered promiscuous mode [ 98.544755] DSA: tree 0 setup $ ping 10.0.0.1 PING 10.0.0.1 (10.0.0.1): 56 data bytes 64 bytes from 10.0.0.1: seq=0 ttl=64 time=2.337 ms 64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.754 ms ^C - 10.0.0.1 ping statistics - 2 packets transmitted, 2 packets received, 0% packet loss round-trip min/avg/max = 0.754/1.545/2.337 ms $ cat /sys/class/net/eno2/dsa/tagging ocelot $ cat ./test_ocelot_8021q.sh #!/bin/bash ip link set swp0 down ip link set swp1 down ip link set swp2 down ip link set swp3 down ip link set swp5 down ip link set eno2 down echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging ip link set eno2 up ip link set swp0 up ip link set swp1 up ip link set swp2 up ip link set swp3 up ip link set swp5 up $ ./test_ocelot_8021q.sh ./test_ocelot_8021q.sh: line 9: echo: write error: Protocol not available $ rmmod tag_ocelot.ko rmmod: can't unload module 'tag_ocelot': Resource temporarily unavailable $ insmod tag_ocelot_8021q.ko $ ./test_ocelot_8021q.sh $ cat /sys/class/net/eno2/dsa/tagging ocelot-8021q $ rmmod tag_ocelot.ko $ rmmod tag_ocelot_8021q.ko rmmod: can't unload module 'tag_ocelot_8021q': Resource temporarily unavailable $ ping 10.0.0.1 PING 10.0.0.1 (10.0.0.1): 56 data bytes 64 bytes from 10.0.0.1: seq=0 ttl=64 time=0.953 ms 64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.787 ms 64 bytes from 10.0.0.1: seq=2 ttl=64 time=0.771 ms $ rmmod mscc_felix.ko [ 645.544426] mscc_felix 0000:00:00.5: Link is Down [ 645.838608] DSA: tree 0 torn down $ rmmod tag_ocelot_8021q.ko Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
Cascading DSA switches can be done multiple ways. There is the brute force approach / tag stacking, where one upstream switch, located between leaf switches and the host Ethernet controller, will just happily transport the DSA header of those leaf switches as payload. For this kind of setups, DSA works without any special kind of treatment compared to a single switch - they just aren't aware of each other. Then there's the approach where the upstream switch understands the tags it transports from its leaves below, as it doesn't push a tag of its own, but it routes based on the source port & switch id information present in that tag (as opposed to DMAC & VID) and it strips the tag when egressing a front-facing port. Currently only Marvell implements the latter, and Marvell DSA trees contain only Marvell switches. So it is safe to say that DSA trees already have a single tag protocol shared by all switches, and in fact this is what makes the switches able to understand each other. This fact is also implied by the fact that currently, the tagging protocol is reported as part of a sysfs installed on the DSA master and not per port, so it must be the same for all the ports connected to that DSA master regardless of the switch that they belong to. It's time to make this official and enforce it (yes, this also means we won't have any "switch understands tag to some extent but is not able to speak it" hardware oddities that we'll support in the future). This is needed due to the imminent introduction of the dsa_switch_ops:: change_tag_protocol driver API. When that is introduced, we'll have to notify switches of the tagging protocol that they're configured to use. Currently the tag_ops structure pointer is held only for CPU ports. But there are switches which don't have CPU ports and nonetheless still need to be configured. These would be Marvell leaf switches whose upstream port is just a DSA link. How do we inform these of their tagging protocol setup/deletion? One answer to the above would be: iterate through the DSA switch tree's ports once, list the CPU ports, get their tag_ops, then iterate again now that we have it, and notify everybody of that tag_ops. But what to do if conflicts appear between one cpu_dp->tag_ops and another? There's no escaping the fact that conflict resolution needs to be done, so we can be upfront about it. Ease our work and just keep the master copy of the tag_ops inside the struct dsa_switch_tree. Reference counting is now moved to be per-tree too, instead of per-CPU port. There are many places in the data path that access master->dsa_ptr->tag_ops and we would introduce unnecessary performance penalty going through yet another indirection, so keep those right where they are. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
This patch is to add csum offload support for gre header: On the TX path in gre_build_header(), when CHECKSUM_PARTIAL's set for inner proto, it will calculate the csum for outer proto, and inner csum will be offloaded later. Otherwise, CHECKSUM_PARTIAL and csum_start/offset will be set for outer proto, and the outer csum will be offloaded later. On the GSO path in gre_gso_segment(), when CHECKSUM_PARTIAL is not set for inner proto and the hardware supports csum offload, CHECKSUM_PARTIAL and csum_start/offset will be set for outer proto, and outer csum will be offloaded later. Otherwise, it will do csum for outer proto by calling gso_make_checksum(). Note that SCTP has to do the csum by itself for non GSO path in sctp_packet_pack(), as gre_build_header() can't handle the csum with CHECKSUM_PARTIAL set for SCTP CRC csum offload. v1->v2: - remove the SCTP part, as GRE dev doesn't support SCTP CRC CSUM and it will always do checksum for SCTP in sctp_packet_pack() when it's not a GSO packet. Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paul Blakey 提交于
Give offloading drivers the direction of the offloaded ct flow, this will be used for matches on direction (ct_state +/-rpl). Signed-off-by: NPaul Blakey <paulb@nvidia.com> Reviewed-by: NJiri Pirko <jiri@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 29 1月, 2021 5 次提交
-
-
由 Hans de Goede 提交于
Some devices, e.g. the RTL8723BS bluetooth part, some USB attached devices, completely drop from the bus on a system-suspend. These devices will have their driver unbound and rebound on resume (when the dropping of the bus gets detected) and will show up as a new HCI after resume. These devices do not benefit from the suspend / resume handling work done by the hci_suspend_notifier. At best this unnecessarily adds some time to the suspend/resume time. But this may also actually cause problems, if the code doing the driver unbinding runs after the pm-notifier then the hci_suspend_notifier code will try to talk to a device which is now in an uninitialized state. This commit adds a new HCI_QUIRK_NO_SUSPEND_NOTIFIER quirk which allows drivers to opt-out of the hci_suspend_notifier when they know beforehand that their device will be fully re-initialized / reprobed on resume. Signed-off-by: NHans de Goede <hdegoede@redhat.com> Reviewed-by: NAbhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Ido Schimmel 提交于
Currently there are only two types of in-kernel nexthop notification. The two are distinguished by the 'is_grp' boolean field in 'struct nh_notifier_info'. As more notification types are introduced for more next-hop group types, a boolean is not an easily extensible interface. Instead, convert it to an enum. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NPetr Machata <petrm@nvidia.com> Signed-off-by: NPetr Machata <petrm@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Petr Machata 提交于
The values that a next-hop group needs to keep track of depend on the group type. Introduce a union to separate fields specific to the mpath groups from fields specific to other group types. Signed-off-by: NPetr Machata <petrm@nvidia.com> Reviewed-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Julian Wiedmann 提交于
Stop maintaining the skb_send_q list for TRANS_HIPER sockets. Not only is it extra overhead, but keeping around a list of skb clones means that we later also have to match the ->sk_txnotify() calls against these clones and free them accordingly. The current matching logic (comparing the skbs' shinfo location) is frustratingly fragile, and breaks if the skb's head is mangled in any sort of way while passing from dev_queue_xmit() to the device's HW queue. Also adjust the interface for ->sk_txnotify(), to make clear that we don't actually care about any skb internals. Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Julian Wiedmann 提交于
The TX code keeps track of all skbs that are in-flight but haven't actually been sent out yet. For native IUCV sockets that's not a huge deal, but with TRANS_HIPER sockets it would be much better if we didn't need to maintain a list of skb clones. Note that we actually only care about the _count_ of skbs in this stage of the TX pipeline. So as prep work for removing the skb tracking on TRANS_HIPER sockets, keep track of the skb count in a separate variable and pair any list {enqueue, unlink} with a count {increment, decrement}. Then replace all occurences where we currently look at the skb list's fill level. Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 28 1月, 2021 4 次提交
-
-
由 Aya Levin 提交于
Add packet trap that can report packets that were dropped due to destination MAC filtering. Signed-off-by: NAya Levin <ayal@nvidia.com> Reviewed-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NMoshe Shemesh <moshe@nvidia.com> Reviewed-by: NTariq Toukan <tariqt@nvidia.com> Signed-off-by: NTariq Toukan <tariqt@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Pablo Neira Ayuso 提交于
This function is not used anymore by any extension, statify it. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
This new function combines the netlink register attribute parser and the store validation function. This update requires to replace: enum nft_registers dreg:8; in many of the expression private areas otherwise compiler complains with: error: cannot take address of bit-field ‘dreg’ when passing the register field as reference. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
This new function combines the netlink register attribute parser and the load validation function. This update requires to replace: enum nft_registers sreg:8; in many of the expression private areas otherwise compiler complains with: error: cannot take address of bit-field ‘sreg’ when passing the register field as reference. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 27 1月, 2021 2 次提交
-
-
由 Praveen Chaudhary 提交于
For IPv4, default route is learned via DHCPv4 and user is allowed to change metric using config etc/network/interfaces. But for IPv6, default route can be learned via RA, for which, currently a fixed metric value 1024 is used. Ideally, user should be able to configure metric on default route for IPv6 similar to IPv4. This patch adds sysctl for the same. Logs: For IPv4: Config in etc/network/interfaces: auto eth0 iface eth0 inet dhcp metric 4261413864 IPv4 Kernel Route Table: $ ip route list default via 172.21.47.1 dev eth0 metric 4261413864 FRR Table, if a static route is configured: [In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.] Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP, T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP, > - selected route, * - FIB route S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03 K 0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m i.e. User can prefer Default Router learned via Routing Protocol in IPv4. Similar behavior is not possible for IPv6, without this fix. After fix [for IPv6]: sudo sysctl -w net.ipv6.conf.eth0.net.ipv6.conf.eth0.ra_defrtr_metric=1996489705 IP monitor: [When IPv6 RA is received] default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 pref high Kernel IPv6 routing table $ ip -6 route list default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high FRR Table, if a static route is configured: [In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.] Codes: K - kernel route, C - connected, S - static, R - RIPng, O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP, > - selected route, * - FIB route S>* ::/0 [20/0] is directly connected, eth0, 00:00:06 K ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m If the metric is changed later, the effect will be seen only when next IPv6 RA is received, because the default route must be fully controlled by RA msg. Below metric is changed from 1996489705 to 1996489704. $ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704 net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704 IP monitor: [On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric] Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high Signed-off-by: NPraveen Chaudhary <pchaudhary@linkedin.com> Signed-off-by: NZhenggen Xu <zxu@linkedin.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xie He 提交于
In the lapb module, the timers may run concurrently with other code in this module, and there is currently no locking to prevent the code from racing on "struct lapb_cb". This patch adds locking to prevent racing. 1. Add "spinlock_t lock" to "struct lapb_cb"; Add "spin_lock_bh" and "spin_unlock_bh" to APIs, timer functions and notifier functions. 2. Add "bool t1timer_stop, t2timer_stop" to "struct lapb_cb" to make us able to ask running timers to abort; Modify "lapb_stop_t1timer" and "lapb_stop_t2timer" to make them able to abort running timers; Modify "lapb_t2timer_expiry" and "lapb_t1timer_expiry" to make them abort after they are stopped by "lapb_stop_t1timer", "lapb_stop_t2timer", and "lapb_start_t1timer", "lapb_start_t2timer". 3. Let lapb_unregister wait for other API functions and running timers to stop. 4. The lapb_device_event function calls lapb_disconnect_request. In order to avoid trying to hold the lock twice, add a new function named "__lapb_disconnect_request" which assumes the lock is held, and make it called by lapb_disconnect_request and lapb_device_event. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: NXie He <xie.he.0141@gmail.com> Link: https://lore.kernel.org/r/20210126040939.69995-1-xie.he.0141@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-