- 19 9月, 2017 2 次提交
-
-
由 Xin Long 提交于
Now in ip6gre_header before packing the ipv6 header, it skb_push t->hlen which only includes encap_hlen + tun_hlen. It means greh and inner header would be over written by ipv6 stuff and ipv6h might have no chance to set up. Jianlin found this issue when using remote any on ip6_gre, the packets he captured on gre dev are truncated: 22:50:26.210866 Out ethertype IPv6 (0x86dd), length 120: truncated-ip6 -\ 8128 bytes missing!(flowlabel 0x92f40, hlim 0, next-header Options (0) \ payload length: 8192) ::1:2000:0 > ::1:0:86dd: HBH [trunc] ip-proto-128 \ 8184 It should also skb_push ipv6hdr so that ipv6h points to the right position to set ipv6 stuff up. This patch is to skb_push hlen + sizeof(*ipv6h) and also fix some indents in ip6gre_header. Fixes: c12b395a ("gre: Support GRE over IPv6") Reported-by: NJianlin Shi <jishi@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
While trying an ESP transport mode encryption for UDPv6 packets of datagram size 1436 with MTU 1500, checksum error was observed in the secondary fragment. This error occurs due to the UDP payload checksum being missed out when computing the full checksum for these packets in udp6_hwcsum_outgoing(). Fixes: d39d938c ("ipv6: Introduce udpv6_send_skb()") Signed-off-by: NSubash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 9月, 2017 1 次提交
-
-
由 Haishuang Yan 提交于
In collect_md mode, if the tun dev is down, it still can call __ip6_tnl_rcv to receive on packets, and the rx statistics increase improperly. When the md tunnel is down, it's not neccessary to increase RX drops for the tunnel device, packets would be recieved on fallback tunnel, and the RX drops on fallback device will be increased as expected. Fixes: 8d79266b ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Cc: Alexei Starovoitov <ast@fb.com> Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 9月, 2017 1 次提交
-
-
由 David Lebrun 提交于
As seg6_validate_srh() already checks that the Routing Header type is correct, it is not necessary to do it again in get_srh(). Fixes: 5829d70b ("ipv6: sr: fix get_srh() to comply with IPv6 standard "RFC 8200") Signed-off-by: NDavid Lebrun <dlebrun@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 9月, 2017 5 次提交
-
-
由 Haishuang Yan 提交于
Similar to vxlan/geneve tunnel, if hop_limit is zero, it should fall back to ip6_dst_hoplimt(). Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
IPv6 FIB should use FIB6_TABLE_HASHSZ, not FIB_TABLE_HASHSZ. Fixes: ba1cc08d ("ipv6: fix memory leak with multiple tables during netns destruction") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
While the cited commit fixed a possible deadlock, it added a leak of the request socket, since reqsk_put() must be called if the BPF filter decided the ACK packet must be dropped. Fixes: d624d276 ("tcp: fix possible deadlock in TCP stack vs BPF filter") Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
There are reports about spurious softlockups during iptables-restore, a backtrace i saw points at get_counters -- it uses a sequence lock and also has unbounded restart loop. Signed-off-by: NFlorian Westphal <fw@strlen.de> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Sabrina Dubroca 提交于
fib6_net_exit only frees the main and local tables. If another table was created with fib6_alloc_table, we leak it when the netns is destroyed. Fix this in the same way ip_fib_net_exit cleans up tables, by walking through the whole hashtable of fib6_table's. We can get rid of the special cases for local and main, since they're also part of the hashtable. Reproducer: ip netns add x ip -net x -6 rule add from 6003:1::/64 table 100 ip netns del x Reported-by: NJianlin Shi <jishi@redhat.com> Fixes: 58f09b78 ("[NETNS][IPV6] ip6_fib - make it per network namespace") Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 9月, 2017 1 次提交
-
-
由 Xin Long 提交于
Now when probessing ICMPV6_PKT_TOOBIG, ip6gre_err only subtracts the offset of gre header from mtu info. The expected mtu of gre device should also subtract gre header. Otherwise, the next packets still can't be sent out. Jianlin found this issue when using the topo: client(ip6gre)<---->(nic1)route(nic2)<----->(ip6gre)server and reducing nic2's mtu, then both tcp and sctp's performance with big size data became 0. This patch is to fix it by also subtracting grehdr (tun->tun_hlen) from mtu info when updating gre device's mtu in ip6gre_err(). It also needs to subtract ETH_HLEN if gre dev'type is ARPHRD_ETHER. Reported-by: NJianlin Shi <jishi@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 9月, 2017 4 次提交
-
-
由 Varsha Rao 提交于
This patch removes CONFIG_NETFILTER_DEBUG and _ASSERT() macros as they are no longer required. Replace _ASSERT() macros with WARN_ON(). Signed-off-by: NVarsha Rao <rvarsha016@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Varsha Rao 提交于
This patch removes NF_CT_ASSERT() and instead uses WARN_ON(). Signed-off-by: NVarsha Rao <rvarsha016@gmail.com>
-
由 Florian Westphal 提交于
tested with allmodconfig build. Signed-off-by: NFlorian Westphal <fw@strlen.de>
-
由 Jesper Dangaard Brouer 提交于
This reverts commit 1d6119ba. After reverting commit 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting") then here is no need for this fix-up patch. As percpu_counter is no longer used, it cannot memory leak it any-longer. Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting") Fixes: 1d6119ba ("net: fix percpu memory leaks") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 9月, 2017 2 次提交
-
-
由 Ido Schimmel 提交于
When a listener registers to the FIB notification chain it receives a dump of the FIB entries and rules from existing address families by invoking their dump operations. While we call into these modules we need to make sure they aren't removed. Do that by increasing their reference count before invoking their dump operations and decrease it afterwards. Fixes: 04b1d4e5 ("net: core: Make the FIB notification chain generic") Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Meyer 提交于
Grepping for "sizeof\(.+\) / sizeof\(" found this as one of the first candidates. Maybe a coccinelle can catch all of those. Signed-off-by: NThomas Meyer <thomas@m3y3r.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 8月, 2017 2 次提交
-
-
由 Yossi Kuperman 提交于
In conjunction with crypto offload [1], removing the ESP trailer by hardware can potentially improve the performance by avoiding (1) a cache miss incurred by reading the nexthdr field and (2) the necessity to calculate the csum value of the trailer in order to keep skb->csum valid. This patch introduces the changes to the xfrm stack and merely serves as an infrastructure. Subsequent patch to mlx5 driver will put this to a good use. [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg175733.htmlSigned-off-by: NYossi Kuperman <yossiku@mellanox.com> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 Ahmed Abdelsalam 提交于
IPv6 packet may carry more than one extension header, and IPv6 nodes must accept and attempt to process extension headers in any order and occurring any number of times in the same packet. Hence, there should be no assumption that Segment Routing extension header is to appear immediately after the IPv6 header. Moreover, section 4.1 of RFC 8200 gives a recommendation on the order of appearance of those extension headers within an IPv6 packet. According to this recommendation, Segment Routing extension header should appear after Hop-by-Hop and Destination Options headers (if they present). This patch fixes the get_srh(), so it gets the segment routing header regardless of its position in the chain of the extension headers in IPv6 packet, and makes sure that the IPv6 routing extension header is of Type 4. Signed-off-by: NAhmed Abdelsalam <amsalam20@gmail.com> Acked-by: NDavid Lebrun <david.lebrun@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 8月, 2017 4 次提交
-
-
由 Eric Dumazet 提交于
Florian reported UDP xmit drops that could be root caused to the too small neigh limit. Current limit is 64 KB, meaning that even a single UDP socket would hit it, since its default sk_sndbuf comes from net.core.wmem_default (~212992 bytes on 64bit arches). Once ARP/ND resolution is in progress, we should allow a little more packets to be queued, at least for one producer. Once neigh arp_queue is filled, a rogue socket should hit its sk_sndbuf limit and either block in sendmsg() or return -EAGAIN. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Tariq repored local pings to linklocal address is failing: $ ifconfig ens8 ens8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 11.141.16.6 netmask 255.255.0.0 broadcast 11.141.255.255 inet6 fe80::7efe:90ff:fecb:7502 prefixlen 64 scopeid 0x20<link> ether 7c:fe:90:cb:75:02 txqueuelen 1000 (Ethernet) RX packets 12 bytes 1164 (1.1 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 30 bytes 2484 (2.4 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 $ /bin/ping6 -c 3 fe80::7efe:90ff:fecb:7502%ens8 PING fe80::7efe:90ff:fecb:7502%ens8(fe80::7efe:90ff:fecb:7502) 56 data bytes Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
ChunYu found a kernel warn_on during syzkaller fuzzing: [40226.038539] WARNING: CPU: 5 PID: 23720 at net/ipv4/af_inet.c:152 inet_sock_destruct+0x78d/0x9a0 [40226.144849] Call Trace: [40226.147590] <IRQ> [40226.149859] dump_stack+0xe2/0x186 [40226.176546] __warn+0x1a4/0x1e0 [40226.180066] warn_slowpath_null+0x31/0x40 [40226.184555] inet_sock_destruct+0x78d/0x9a0 [40226.246355] __sk_destruct+0xfa/0x8c0 [40226.290612] rcu_process_callbacks+0xaa0/0x18a0 [40226.336816] __do_softirq+0x241/0x75e [40226.367758] irq_exit+0x1f6/0x220 [40226.371458] smp_apic_timer_interrupt+0x7b/0xa0 [40226.376507] apic_timer_interrupt+0x93/0xa0 The warn_on happned when sk->sk_rmem_alloc wasn't 0 in inet_sock_destruct. As after commit f970bd9e ("udp: implement memory accounting helpers"), udp has changed to use udp_destruct_sock as sk_destruct where it would udp_rmem_release all rmem. But IPV6_ADDRFORM sockopt sets sk_destruct with inet_sock_destruct after changing family to PF_INET. If rmem is not 0 at that time, and there is no place to release rmem before calling inet_sock_destruct, the warn_on will be triggered. This patch is to fix it by not setting sk_destruct in IPV6_ADDRFORM sockopt any more. As IPV6_ADDRFORM sockopt only works for tcp and udp. TCP sock has already set it's sk_destruct with inet_sock_destruct and UDP has set with udp_destruct_sock since they're created. Fixes: f970bd9e ("udp: implement memory accounting helpers") Reported-by: NChunYu Wang <chunwang@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
There appears to be no need to use rtnl, addrlabel entries are refcounted and add/delete is serialized by the addrlabel table spinlock. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 8月, 2017 3 次提交
-
-
由 Xin Long 提交于
Now it doesn't check for the cached route expiration in ipv6's dst_ops->check(), because it trusts dst_gc that would clean the cached route up when it's expired. The problem is in dst_gc, it would clean the cached route only when it's refcount is 1. If some other module (like xfrm) keeps holding it and the module only release it when dst_ops->check() fails. But without checking for the cached route expiration, .check() may always return true. Meanwhile, without releasing the cached route, dst_gc couldn't del it. It will cause this cached route never to expire. This patch is to set dst.obsolete with DST_OBSOLETE_KILL in .gc when it's expired, and check obsolete != DST_OBSOLETE_FORCE_CHK in .check. Note that this is even needed when ipv6 dst_gc timer is removed one day. It would set dst.obsolete in .redirect and .update_pmtu instead, and check for cached route expiration when getting it, just like what ipv4 route does. Reported-by: NJianlin Shi <jishi@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
Commit c5cff856 adds rcu grace period before freeing fib6_node. This generates a new sparse warning on rt->rt6i_node related code: net/ipv6/route.c:1394:30: error: incompatible types in comparison expression (different address spaces) ./include/net/ip6_fib.h:187:14: error: incompatible types in comparison expression (different address spaces) This commit adds "__rcu" tag for rt6i_node and makes sure corresponding rcu API is used for it. After this fix, sparse no longer generates the above warning. Fixes: c5cff856 ("ipv6: add rcu grace period before freeing fib6_node") Signed-off-by: NWei Wang <weiwan@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Acked-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Twice patches trying to constify inet{6}_protocol have been reverted: 39294c3d ("Revert "ipv6: constify inet6_protocol structures"") to revert 3a3a4e30 and then 03157937 ("Revert "ipv4: make net_protocol const"") to revert aa8db499. Add a comment that the structures can not be const because the early_demux field can change based on a sysctl. Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 8月, 2017 1 次提交
-
-
由 Florian Westphal 提交于
When enabling logging for invalid connections we currently also log most icmpv6 types, which we don't track intentionally (e.g. neigh discovery). "invalid" should really mean "invalid", i.e. short header or bad checksum. We don't do any logging for icmp(v4) either, its just useless noise. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 26 8月, 2017 7 次提交
-
-
由 Paolo Abeni 提交于
Currently, in the udp6 code, the dst cookie is not initialized/updated concurrently with the RX dst used by early demux. As a result, the dst_check() in the early_demux path always fails, the rx dst cache is always invalidated, and we can't really leverage significant gain from the demux lookup. Fix it adding udp6 specific variant of sk_rx_dst_set() and use it to set the dst cookie when the dst entry is really changed. The issue is there since the introduction of early demux for ipv6. Fixes: 5425077d ("net: ipv6: Add early demux handler for UDP unicast") Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Lebrun 提交于
This patch implements the following seg6local actions. - SEG6_LOCAL_ACTION_END_T: regular SRH processing and forward to the next-hop looked up in the specified routing table. - SEG6_LOCAL_ACTION_END_DX2: decapsulate an L2 frame and forward it to the specified network interface. - SEG6_LOCAL_ACTION_END_DX4: decapsulate an IPv4 packet and forward it, possibly to the specified next-hop. - SEG6_LOCAL_ACTION_END_DT6: decapsulate an IPv6 packet and forward it to the next-hop looked up in the specified routing table. Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Lebrun 提交于
This patch adds three helper functions to be used with the seg6local packet processing actions. The decap_and_validate() function will be used by the End.D* actions, that decapsulate an SR-enabled packet. The advance_nextseg() function applies the fundamental operations to update an SRH for the next segment. The lookup_nexthop() function helps select the next-hop for the processed SR packets. It supports an optional next-hop address to route the packet specifically through it, and an optional routing table to use. Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Lebrun 提交于
This patch ensures that the seg6local lightweight tunnel is used solely with IPv6 routes and processes only IPv6 packets. Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Lebrun 提交于
This patch implements the L2 frame encapsulation mechanism, referred to as T.Encaps.L2 in the SRv6 specifications [1]. A new type of SRv6 tunnel mode is added (SEG6_IPTUN_MODE_L2ENCAP). It only accepts packets with an existing MAC header (i.e., it will not work for locally generated packets). The resulting packet looks like IPv6 -> SRH -> Ethernet -> original L3 payload. The next header field of the SRH is set to NEXTHDR_NONE. [1] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Lebrun 提交于
This patch enables the SRv6 encapsulation mode to carry an IPv4 payload. All the infrastructure was already present, I just had to add a parameter to seg6_do_srh_encap() to specify the inner packet protocol, and perform some additional checks. Usage example: ip route add 1.2.3.4 encap seg6 mode encap segs fc00::1,fc00::2 dev eth0 Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steffen Klassert 提交于
rt_cookie might be used uninitialized, fix this by initializing it. Fixes: c5cff856 ("ipv6: add rcu grace period before freeing fib6_node") Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 8月, 2017 7 次提交
-
-
由 Steffen Klassert 提交于
We use skb_availroom to calculate the skb tailroom for the ESP trailer. skb_availroom calculates the tailroom and subtracts this value by reserved_tailroom. However reserved_tailroom is a union with the skb mark. This means that we subtract the tailroom by the skb mark if set. Fix this by using skb_tailroom instead. Fixes: cac2661c ("esp4: Avoid skb_cow_data whenever possible") Fixes: 03e2a30f ("esp6: Avoid skb_cow_data whenever possible") Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 Steffen Klassert 提交于
We allocate the page fragment for the ESP trailer inside a spinlock, but consume it outside of the lock. This is racy as some other cou could get the same page fragment then. Fix this by consuming the page fragment inside the lock too. Fixes: cac2661c ("esp4: Avoid skb_cow_data whenever possible") Fixes: 03e2a30f ("esp6: Avoid skb_cow_data whenever possible") Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 Jakub Sitnicki 提交于
Allow our callers to influence the choice of ECMP link by honoring the hash passed together with the flow info. This allows for special treatment of ICMP errors which we would like to route over the same path as the IPv6 datagram that triggered the error. Also go through rt6_multipath_hash(), in the usual case when we aren't dealing with an ICMP error, so that there is one central place where multipath hash is computed. Signed-off-by: NJakub Sitnicki <jkbs@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Sitnicki 提交于
Commit 644d0e65 ("ipv6 Use get_hash_from_flowi6 for rt6 hash") has turned rt6_info_hash_nhsfn() into a one-liner, so it no longer makes sense to keep it around. Also remove the accompanying comment that has become outdated. Signed-off-by: NJakub Sitnicki <jkbs@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Sitnicki 提交于
When forwarding or sending out an ICMPv6 error, look at the embedded packet that triggered the error and compute a flow hash over its headers. This let's us route the ICMP error together with the flow it belongs to when multipath (ECMP) routing is in use, which in turn makes Path MTU Discovery work in ECMP load-balanced or anycast setups (RFC 7690). Granted, end-hosts behind the ECMP router (aka servers) need to reflect the IPv6 Flow Label for PMTUD to work. The code is organized to be in parallel with ipv4 stack: ip_multipath_l3_keys -> ip6_multipath_l3_keys fib_multipath_hash -> rt6_multipath_hash Signed-off-by: NJakub Sitnicki <jkbs@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Sitnicki 提交于
Reflecting IPv6 Flow Label at server nodes is useful in environments that employ multipath routing to load balance the requests. As "IPv6 Flow Label Reflection" standard draft [1] points out - ICMPv6 PTB error messages generated in response to a downstream packets from the server can be routed by a load balancer back to the original server without looking at transport headers, if the server applies the flow label reflection. This enables the Path MTU Discovery past the ECMP router in load-balance or anycast environments where each server node is reachable by only one path. Introduce a sysctl to enable flow label reflection per net namespace for all newly created sockets. Same could be earlier achieved only per socket by setting the IPV6_FL_F_REFLECT flag for the IPV6_FLOWLABEL_MGR socket option. [1] https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01Signed-off-by: NJakub Sitnicki <jkbs@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
CONFIG_NF_CONNTRACK_PROCFS is deprecated, no need to use a function pointer in the trackers for this. Place the printf formatting in the one place that uses it. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-