- 17 11月, 2020 3 次提交
-
-
由 Paolo Abeni 提交于
unlocked version of protocol level close, will be used by MPTCP to allow decouple orphaning and subflow level close. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
Will be needed by the next patch, as MPTCP needs to handle directly the error/memory-allocation-needed path. No functional changes intended. Additionally let MPTCP code access the tcp_remove_empty_skb() helper. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Francis Laniel 提交于
Calls to nla_strlcpy are now replaced by calls to nla_strscpy which is the new name of this function. Signed-off-by: NFrancis Laniel <laniel_francis@privacyrequired.com> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 15 11月, 2020 3 次提交
-
-
由 Eric Dumazet 提交于
These functions do not need to be exported. Signed-off-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20201113113553.3411756-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
Both IPv4 and IPv6 needs it via a function pointer. Following patch will avoid the indirect call. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Oliver Herms 提交于
This patch adds an IPv4 routes encapsulation attribute to the result of netlink RTM_GETROUTE requests (e.g. ip route get 192.0.2.1). Signed-off-by: NOliver Herms <oliver.peter.herms@gmail.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20201113085517.GA1307262@twsSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 13 11月, 2020 4 次提交
-
-
由 Thierry Reding 提交于
When dumping the name and NTP servers advertised by DHCP, a blank line is emitted if either of the lists is empty. This can lead to confusing issues such as the blank line getting flagged as warning. This happens because the blank line is the result of pr_cont("\n") and that may see its level corrupted by some other driver concurrently writing to the console. Fix this by making sure that the terminating newline is only emitted if at least one entry in the lists was printed before. Reported-by: NJon Hunter <jonathanh@nvidia.com> Signed-off-by: NThierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20201110073757.1284594-1-thierry.reding@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Menglong Dong 提交于
The initialization for 'err' with '-ENOSYS' is redundant and can be removed, as it is updated soon and not used. Changes since v1: - Move the err declaration below struct sock *sk Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn> Link: https://lore.kernel.org/r/5faa01d5.1c69fb81.8451c.cb5b@mx.google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alexander Lobakin 提交于
udp{4,6}_lib_lookup_skb() use ip{,v6}_hdr() to get IP header of the packet. While it's probably OK for non-frag0 paths, this helpers will also point to junk on Fast/frag0 GRO when all headers are located in frags. As a result, sk/skb lookup may fail or give wrong results. To support both GRO modes, skb_gro_network_header() might be used. To not modify original functions, add private versions of udp{4,6}_lib_lookup_skb() only to perform correct sk lookups on GRO. Present since the introduction of "application-level" UDP GRO in 4.7-rc1. Misc: replace totally unneeded ternaries with plain ifs. Fixes: a6024562 ("udp: Add GRO functions to UDP socket") Suggested-by: NWillem de Bruijn <willemb@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NAlexander Lobakin <alobakin@pm.me> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alexander Lobakin 提交于
UDP GRO uses udp_hdr(skb) in its .gro_receive() callback. While it's probably OK for non-frag0 paths (when all headers or even the entire frame are already in skb head), this inline points to junk when using Fast GRO (napi_gro_frags() or napi_gro_receive() with only Ethernet header in skb head and all the rest in the frags) and breaks GRO packet compilation and the packet flow itself. To support both modes, skb_gro_header_fast() + skb_gro_header_slow() are typically used. UDP even has an inline helper that makes use of them, udp_gro_udphdr(). Use that instead of troublemaking udp_hdr() to get rid of the out-of-order delivers. Present since the introduction of plain UDP GRO in 5.0-rc1. Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NAlexander Lobakin <alobakin@pm.me> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 12 11月, 2020 2 次提交
-
-
由 Ido Schimmel 提交于
Be more consistent about the way in which the nexthop flags are set and set them in one go. Suggested-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20201110102553.1924232-1-idosch@idosch.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vincent Bernat 提交于
The disable_policy and disable_xfrm are a per-interface sysctl to disable IPsec policy or encryption on an interface. However, while a "all" variant is exposed, it was a noop since it was never evaluated. We use the usual "or" logic for this kind of sysctls. Signed-off-by: NVincent Bernat <vincent@bernat.ch> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 11 11月, 2020 3 次提交
-
-
由 Eric Dumazet 提交于
The skb is needed only to fetch the keys for the lookup. Both functions are used from GRO stack, we do not want accidental modification of the skb. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NAlexander Lobakin <alobakin@pm.me> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Mao Wenan 提交于
When net.ipv4.tcp_syncookies=1 and syn flood is happened, cookie_v4_check or cookie_v6_check tries to redo what tcp_v4_send_synack or tcp_v6_send_synack did, rsk_window_clamp will be changed if SOCK_RCVBUF is set, which will make rcv_wscale is different, the client still operates with initial window scale and can overshot granted window, the client use the initial scale but local server use new scale to advertise window value, and session work abnormally. Fixes: e88c64f0 ("tcp: allow effective reduction of TCP's rcv-buffer via setsockopt") Signed-off-by: NMao Wenan <wenan.mao@linux.alibaba.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1604967391-123737-1-git-send-email-wenan.mao@linux.alibaba.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Menglong Dong 提交于
The initialization for 'err' with '-EINVAL' is redundant and can be removed, as it is updated soon. Changes since v1: - Remove redundant empty line Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn> Link: https://lore.kernel.org/r/20201108010541.12432-1-dong.menglong@zte.com.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 10 11月, 2020 7 次提交
-
-
由 Heiner Kallweit 提交于
After having migrated all users remove ip_tunnel_get_stats64(). Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Heiner Kallweit 提交于
Replace ip_tunnel_get_stats64() with the new identical core function dev_get_tstats64(). Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Heiner Kallweit 提交于
Replace ip_tunnel_get_stats64() with the new identical core function dev_get_tstats64(). Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Menglong Dong 提交于
The initialization for 'err' with '-EINVAL' is redundant and can be removed, as it is updated soon and not used. Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn> Link: https://lore.kernel.org/r/1604644960-48378-2-git-send-email-dong.menglong@zte.com.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Menglong Dong 提交于
The initialization for 'err' with 0 is redundant and can be removed, as it is updated by ip_send_skb and not used before that. Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn> Link: https://lore.kernel.org/r/1604644960-48378-4-git-send-email-dong.menglong@zte.com.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Stefano Brivio 提交于
Jianlin reports that a bridged IPv6 VXLAN endpoint, carrying IPv6 packets over a link with a PMTU estimation of exactly 1350 bytes, won't trigger ICMPv6 Packet Too Big replies when the encapsulated datagrams exceed said PMTU value. VXLAN over IPv6 adds 70 bytes of overhead, so an ICMPv6 reply indicating 1280 bytes as inner MTU would be legitimate and expected. This comes from an off-by-one error I introduced in checks added as part of commit 4cb47a86 ("tunnels: PMTU discovery support for directly bridged IP packets"), whose purpose was to prevent sending ICMPv6 Packet Too Big messages with an MTU lower than the smallest permissible IPv6 link MTU, i.e. 1280 bytes. In iptunnel_pmtud_check_icmpv6(), avoid triggering a reply only if the advertised MTU would be less than, and not equal to, 1280 bytes. Also fix the analogous comparison for IPv4, that is, skip the ICMP reply only if the resulting MTU is strictly less than 576 bytes. This becomes apparent while running the net/pmtu.sh bridged VXLAN or GENEVE selftests with adjusted lower-link MTU values. Using e.g. GENEVE, setting ll_mtu to the values reported below, in the test_pmtu_ipvX_over_bridged_vxlanY_or_geneveY_exception() test function, we can see failures on the following tests: test | ll_mtu -------------------------------|-------- pmtu_ipv4_br_geneve4_exception | 626 pmtu_ipv6_br_geneve4_exception | 1330 pmtu_ipv6_br_geneve6_exception | 1350 owing to the different tunneling overheads implied by the corresponding configurations. Reported-by: NJianlin Shi <jishi@redhat.com> Fixes: 4cb47a86 ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: NStefano Brivio <sbrivio@redhat.com> Link: https://lore.kernel.org/r/4f5fc2f33bfdf8409549fafd4f952b008bf04d63.1604681709.git.sbrivio@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Menglong Dong 提交于
When udp_memory_allocated is at the limit, __udp_enqueue_schedule_skb will return a -ENOBUFS, and skb will be dropped in __udp_queue_rcv_skb without any counters being done. It's hard to find out what happened once this happen. So we introduce a UDP_MIB_MEMERRORS to do this job. Well, this change looks friendly to the existing users, such as netstat: $ netstat -u -s Udp: 0 packets received 639 packets to unknown port received. 158689 packet receive errors 180022 packets sent RcvbufErrors: 20930 MemErrors: 137759 UdpLite: IpExt: InOctets: 257426235 OutOctets: 257460598 InNoECTPkts: 181177 v2: - Fix some alignment problems Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn> Link: https://lore.kernel.org/r/1604627354-43207-1-git-send-email-dong.menglong@zte.com.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 08 11月, 2020 1 次提交
-
-
由 Allen Pais 提交于
In preparation for unconditionally passing the struct tasklet_struct pointer to all tasklet callbacks, switch to using the new tasklet_setup() and from_tasklet() to pass the tasklet pointer explicitly. Signed-off-by: NRomain Perier <romain.perier@gmail.com> Signed-off-by: NAllen Pais <apais@linux.microsoft.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 07 11月, 2020 13 次提交
-
-
由 Ido Schimmel 提交于
Remove in-kernel route notifications when the configuration of their nexthop changes. These notifications are unnecessary because the route still uses the same nexthop ID. A separate notification for the nexthop change itself is now sent in the nexthop notification chain. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
When registering a new notifier to the nexthop notification chain, replay all the existing nexthops to the new notifier so that it will have a complete picture of the available nexthops. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
This will be used by the next patch which extends the function to replay all the existing nexthops to the notifier block being registered. Device drivers will be able to pass extack to the function since it is passed to them upon reload from devlink. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
When a single nexthop is deleted, the configuration of all the groups using the nexthop is effectively modified. In this case, emit a notification in the nexthop notification chain for each modified group so that listeners would not need to keep track of which nexthops are member in which groups. In the rare cases where the notification fails, emit an error to the kernel log. This is done by allocating extack on the stack and printing the error logged by the listener that rejected the notification. Changes since RFC: * Allocate extack on the stack Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
When a single nexthop is replaced, the configuration of all the groups using the nexthop is effectively modified. In this case, emit a notification in the nexthop notification chain for each modified group so that listeners would not need to keep track of which nexthops are member in which groups. The notification can only be emitted after the new configuration (i.e., 'struct nh_info') is pointed at by the old shell (i.e., 'struct nexthop'). Before that the configuration of the nexthop groups is still the same as before the replacement. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
The notification is emitted after all the validation checks were performed, but before the new configuration (i.e., 'struct nh_info') is pointed at by the old shell (i.e., 'struct nexthop'). This prevents the need to perform rollback in case the notification is vetoed. The next patch will also emit a replace notification for all the nexthop groups in which the nexthop is used. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
Emit a notification in the nexthop notification chain when an existing nexthop group is replaced. The notification is emitted after all the validation checks were performed, but before the new configuration (i.e., 'struct nh_grp') is pointed at by the old shell (i.e., 'struct nexthop'). This prevents the need to perform rollback in case the notification is vetoed. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
Emit a notification in the nexthop notification chain when a new nexthop is added (not replaced). The nexthop can either be a new group or a single nexthop. The notification is sent after the nexthop is inserted into the red-black tree, as listeners might need to callback into the nexthop code with the nexthop ID in order to mark the nexthop as offloaded. A 'REPLACE' notification is emitted instead of 'ADD' as the distinction between the two is not important for in-kernel listeners. In case the listener is not familiar with the encoded nexthop ID, it can simply treat it as a new one. This is also consistent with the route offload API. Changes since RFC: * Reword commit message Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
Add a function that can be called by device drivers to set "offload" or "trap" indication on nexthops following nexthop notifications. Changes since RFC: * s/nexthop_hw_flags_set/nexthop_set_hw_flags/ Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
The flag indicates to user space that the nexthop is not programmed to forward packets in hardware, but rather to trap them to the CPU. This is needed, for example, when the MAC of the nexthop neighbour is not resolved and packets should reach the CPU to trigger neighbour resolution. The flag will be used in subsequent patches by netdevsim to test nexthop objects programming to device drivers and in the future by mlxsw as well. Changes since RFC: * Reword commit message Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
Convert the sole listener of the nexthop notification chain (the VXLAN driver) to the new notification info. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
Prepare the new notification information so that it could be passed to listeners in the new patch. Changes since RFC: * Add a blank line in __nh_notifier_single_info_init() Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
The next patch will add extack to the notification info. This allows listeners to veto notifications and communicate the reason to user space. Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 05 11月, 2020 1 次提交
-
-
由 Paolo Abeni 提交于
When the TCP stack splits a packet on the write queue, the tail half currently lose the associated skb extensions, and will not carry the DSM on the wire. The above does not cause functional problems and is allowed by the RFC, but interact badly with GRO and RX coalescing, as possible candidates for aggregation will carry different TCP options. This change tries to improve the MPTCP behavior, propagating the skb extensions on split. Additionally, we must prevent the MPTCP stack from updating the mapping after the split occur: that will both violate the RFC and fool the reader. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 03 11月, 2020 1 次提交
-
-
由 Yuchung Cheng 提交于
During TCP fast recovery, the congestion control in charge is by default the Proportional Rate Reduction (PRR) unless the congestion control module specified otherwise (e.g. BBR). Previously when tcp_packets_in_flight() is below snd_ssthresh PRR would slow start upon receiving an ACK that 1) cumulatively acknowledges retransmitted data and 2) does not detect further lost retransmission Such conditions indicate the repair is in good steady progress after the first round trip of recovery. Otherwise PRR adopts the packet conservation principle to send only the amount that was newly delivered (indicated by this ACK). This patch generalizes the previous design principle to include also the newly sent data beside retransmission: as long as the delivery is making good progress, both retransmission and new data should be accounted to make PRR more cautious in slow starting. Suggested-by: NMatt Mathis <mattmathis@google.com> Suggested-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20201031013412.1973112-1-ycheng@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 01 11月, 2020 2 次提交
-
-
由 Pablo Neira Ayuso 提交于
Enhance validation to support for reject from inet ingress chains. Note that, reject from inet ingress and netdev ingress differ. Reject packets from inet ingress are sent through ip_local_out() since inet reject emulates the IP layer receive path. So the reject packet follows to classic IP output and postrouting paths. The reject action from netdev ingress assumes the packet not yet entered the IP layer, so the reject packet is sent through dev_queue_xmit(). Therefore, reject packets from netdev ingress do not follow the classic IP output and postrouting paths. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 wenxu 提交于
The tunnel device such as vxlan, bareudp and geneve in the lwt mode set the outer df only based TUNNEL_DONT_FRAGMENT. And this was also the behavior for gre device before switching to use ip_md_tunnel_xmit in commit 962924fa ("ip_gre: Refactor collect metatdata mode tunnel xmit to ip_md_tunnel_xmit") When the ip_gre in lwt mode xmit with ip_md_tunnel_xmi changed the rule and make the discrepancy between handling of DF by different tunnels. So in the ip_md_tunnel_xmit should follow the same rule like other tunnels. Fixes: cfc7381b ("ip_tunnel: add collect_md mode to IPIP tunnel") Signed-off-by: Nwenxu <wenxu@ucloud.cn> Link: https://lore.kernel.org/r/1604028728-31100-1-git-send-email-wenxu@ucloud.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-