- 10 10月, 2013 1 次提交
-
-
由 Eric Dumazet 提交于
TCP listener refactoring, part 5 : We want to be able to insert request sockets (SYN_RECV) into main ehash table instead of the per listener hash table to allow RCU lookups and remove listener lock contention. This patch includes the needed struct sock_common in front of struct request_sock This means there is no more inet6_request_sock IPv6 specific structure. Following inet_request_sock fields were renamed as they became macros to reference fields from struct sock_common. Prefix ir_ was chosen to avoid name collisions. loc_port -> ir_loc_port loc_addr -> ir_loc_addr rmt_addr -> ir_rmt_addr rmt_port -> ir_rmt_port iif -> ir_iif Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2013 4 次提交
-
-
由 Eric Dumazet 提交于
TCP listener refactoring, part 4 : To speed up inet lookups, we moved IPv4 addresses from inet to struct sock_common Now is time to do the same for IPv6, because it permits us to have fast lookups for all kind of sockets, including upcoming SYN_RECV. Getting IPv6 addresses in TCP lookups currently requires two extra cache lines, plus a dereference (and memory stall). inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6 This patch is way bigger than its IPv4 counter part, because for IPv4, we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6, it's not doable easily. inet6_sk(sk)->daddr becomes sk->sk_v6_daddr inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr at the same offset. We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic macro. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP listener refactoring, part 3 : Our goal is to hash SYN_RECV sockets into main ehash for fast lookup, and parallel SYN processing. Current inet_ehash_bucket contains two chains, one for ESTABLISH (and friend states) sockets, another for TIME_WAIT sockets only. As the hash table is sized to get at most one socket per bucket, it makes little sense to have separate twchain, as it makes the lookup slightly more complicated, and doubles hash table memory usage. If we make sure all socket types have the lookup keys at the same offsets, we can use a generic and faster lookup. It turns out TIME_WAIT and ESTABLISHED sockets already have common lookup fields for IPv4. [ INET_TW_MATCH() is no longer needed ] I'll provide a follow-up to factorize IPv6 lookup as well, to remove INET6_TW_MATCH() This way, SYN_RECV pseudo sockets will be supported the same. A new sock_gen_put() helper is added, doing either a sock_put() or inet_twsk_put() [ and will support SYN_RECV later ]. Note this helper should only be called in real slow path, when rcu lookup found a socket that was moved to another identity (freed/reused immediately), but could eventually be used in other contexts, like sock_edemux() Before patch : dmesg | grep "TCP established" TCP established hash table entries: 524288 (order: 11, 8388608 bytes) After patch : TCP established hash table entries: 524288 (order: 10, 4194304 bytes) Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Oussama Ghorbel 提交于
Unlike ipv4, the struct member hlen holds the length of the GRE and ipv6 headers. This length is also counted in dev->hard_header_len. Perhaps, it's more clean to modify the hlen to count only the GRE header without ipv6 header as the variable name suggest, but the simple way to fix this without regression risk is simply modify the calculation of the limit in ip6gre_tunnel_change_mtu function. Verified in kernel version v3.11. Signed-off-by: NOussama Ghorbel <ou.ghorbel@gmail.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shawn Bohrer 提交于
UDP sockets can receive packets from multiple endpoints and thus may be received on multiple receive queues. Since packets packets can arrive on multiple receive queues we should not mark the napi_id for all packets. This makes busy read/poll only work for connected UDP sockets. This additionally enables busy read/poll for UDP multicast packets as long as the socket is connected by moving the check into __udp_queue_rcv_skb(). Signed-off-by: NShawn Bohrer <sbohrer@rgmadvisors.com> Suggested-by: NEric Dumazet <edumazet@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 10月, 2013 1 次提交
-
-
由 Oussama Ghorbel 提交于
The (inner) MTU of a ipip6 (IPv4-in-IPv6) tunnel cannot be set below 1280, which is the minimum MTU in IPv6. However, there should be no IPv6 on the tunnel interface at all, so the IPv6 rules should not apply. More info at https://bugzilla.kernel.org/show_bug.cgi?id=15530 This patch allows to check the minimum MTU for ipv6 tunnel according to these rules: -In case the tunnel is configured with ipip6 mode the minimum MTU is 68. -In case the tunnel is configured with ip6ip6 or any mode the minimum MTU is 1280. Signed-off-by: NOussama Ghorbel <ou.ghorbel@gmail.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 10月, 2013 2 次提交
-
-
由 Eric Dumazet 提交于
While working on tcp listener refactoring, I found that it would really make things easier if sock_common could include the IPv6 addresses needed in the lookups, instead of doing very complex games to get their values (depending on sock being SYN_RECV, ESTABLISHED, TIME_WAIT) For this to happen, I need to be sure that tcp6_timewait_sock and tcp_timewait_sock consume same number of cache lines. This is possible if we only use 32bits for tw_ttd, as we remove one 32bit hole in inet_timewait_sock inet_tw_time_stamp() is defined and used, even if its current implementation looks like tcp_time_stamp : We might need finer resolution for tcp_time_stamp in the future. Before patch : sizeof(struct tcp6_timewait_sock) = 0xc8 After patch : sizeof(struct tcp6_timewait_sock) = 0xc0 Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP listener refactoring, part 2 : We can use a generic lookup, sockets being in whatever state, if we are sure all relevant fields are at the same place in all socket types (ESTABLISH, TIME_WAIT, SYN_RECV) This patch removes these macros : inet_addrpair, inet_addrpair, tw_addrpair, tw_portpair And adds : sk_portpair, sk_addrpair, sk_daddr, sk_rcv_saddr Then, INET_TW_MATCH() is really the same than INET_MATCH() Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 10月, 2013 1 次提交
-
-
由 Eric Dumazet 提交于
commit 3ab5aee7 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 10月, 2013 2 次提交
-
-
由 Nicolas Dichtel 提交于
rtnl ops where introduced by c075b130 ("ip6tnl: advertise tunnel param via rtnl"), but I forget to assign rtnl ops to fb tunnels. Now that it is done, we must remove the explicit call to unregister_netdevice_queue(), because the fallback tunnel is added to the queue in ip6_tnl_destroy_tunnels() when checking rtnl_link_ops of all netdevices (this is valid since commit 0bd87628 ("ip6tnl: add x-netns support")). Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
rtnl ops where introduced by ba3e3f50 ("sit: advertise tunnel param via rtnl"), but I forget to assign rtnl ops to fb tunnels. Now that it is done, we must remove the explicit call to unregister_netdevice_queue(), because the fallback tunnel is added to the queue in sit_destroy_tunnels() when checking rtnl_link_ops of all netdevices (this is valid since commit 5e6700b3 ("sit: add support of x-netns")). Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 10月, 2013 4 次提交
-
-
由 Salam Noureddine 提交于
It is possible for the timer handlers to run after the call to ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the handler function in order to do proper cleanup when the refcnt reaches 0. Otherwise, the refcnt can reach zero without the inet6_dev being destroyed and we end up leaking a reference to the net_device and see messages like the following, unregister_netdevice: waiting for eth0 to become free. Usage count = 1 Tested on linux-3.4.43. Signed-off-by: NSalam Noureddine <noureddine@aristanetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
gre_hlen already accounts for sizeof(struct ipv6_hdr) + gre header, so initialize max_headroom to zero. Otherwise the if (encap_limit >= 0) { max_headroom += 8; mtu -= 8; } increments an uninitialized variable before max_headroom was reset. Found with coverity: 728539 Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li RongQing 提交于
setting fl6.flowi6_flags as zero after memset is redundant, Remove it. Signed-off-by: NLi RongQing <roy.qing.li@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paul Marks 提交于
Consider the scenario where an IPv6 router is advertising a fixed preferred_lft of 1800 seconds, while the valid_lft begins at 3600 seconds and counts down in realtime. A client should reset its preferred_lft to 1800 every time the RA is received, but a bug is causing Linux to ignore the update. The core problem is here: if (prefered_lft != ifp->prefered_lft) { Note that ifp->prefered_lft is an offset, so it doesn't decrease over time. Thus, the comparison is always (1800 != 1800), which fails to trigger an update. The most direct solution would be to compute a "stored_prefered_lft", and use that value in the comparison. But I think that trying to filter out unnecessary updates here is a premature optimization. In order for the filter to apply, both of these would need to hold: - The advertised valid_lft and preferred_lft are both declining in real time. - No clock skew exists between the router & client. So in this patch, I've set "update_lft = 1" unconditionally, which allows the surrounding code to be greatly simplified. Signed-off-by: NPaul Marks <pmarks@google.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 9月, 2013 1 次提交
-
-
由 Patrick McHardy 提交于
TCP packets hitting the SYN proxy through the SYNPROXY target are not validated by TCP conntrack. When th->doff is below 5, an underflow happens when calculating the options length, causing skb_header_pointer() to return NULL and triggering the BUG_ON(). Handle this case gracefully by checking for NULL instead of using BUG_ON(). Reported-by: NMartin Topholm <mph@one.com> Tested-by: NMartin Topholm <mph@one.com> Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 29 9月, 2013 1 次提交
-
-
由 Catalin\(ux\) M. BOIE 提交于
When a router is doing DNAT for 6to4/6rd packets the latest anti-spoofing commit 218774dc ("ipv6: add anti-spoofing checks for 6to4 and 6rd") will drop them because the IPv6 address embedded does not match the IPv4 destination. This patch will allow them to pass by testing if we have an address that matches on 6to4/6rd interface. I have been hit by this problem using Fedora and IPV6TO4_IPV4ADDR. Also, log the dropped packets (with rate limit). Signed-off-by: NCatalin(ux) M. BOIE <catab@embedromix.ro> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 9月, 2013 2 次提交
-
-
由 Hannes Frederic Sowa 提交于
This patch provides an additional safety net against NULL pointer dereferences while walking the fib trie for the new /proc/net/ipv6_route walkers. I never needed it myself and am unsure if it is needed at all, but the same checks where introduced in 2bec5a36 ("ipv6: fib: fix crash when changing large fib while dumping it") to fix NULL pointer bugs. This patch is separated from the first patch to make it easier to revert if we are sure we can drop this logic. Cc: Ben Greear <greearb@candelatech.com> Cc: Patrick McHardy <kaber@trash.net> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
Dumping routes on a system with lots rt6_infos in the fibs causes up to 11-order allocations in seq_file (which fail). While we could switch there to vmalloc we could just implement the streaming interface for /proc/net/ipv6_route. This patch switches /proc/net/ipv6_route from single_open_net to seq_open_net. loff_t *pos tracks dst entries. Also kill never used struct rt6_proc_arg and now unused function fib6_clean_all_ro. Cc: Ben Greear <greearb@candelatech.com> Cc: Patrick McHardy <kaber@trash.net> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 9月, 2013 6 次提交
-
-
由 Hannes Frederic Sowa 提交于
In the following scenario the socket is corked: If the first UDP packet is larger then the mtu we try to append it to the write queue via ip6_ufo_append_data. A following packet, which is smaller than the mtu would be appended to the already queued up gso-skb via plain ip6_append_data. This causes random memory corruptions. In ip6_ufo_append_data we also have to be careful to not queue up the same skb multiple times. So setup the gso frame only when no first skb is available. This also fixes a shortcoming where we add the current packet's length to cork->length but return early because of a packet > mtu with dontfrag set (instead of sutracting it again). Found with trinity. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Reported-by: NDmitry Vyukov <dvyukov@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
There was some bug report on ipv6 module removal path before. Also, as Stephen pointed out, after vxlan module gets ipv6 support, the ipv6 stub it used is not safe against this module removal either. So, let's just remove inet6_exit() so that ipv6 module will not be able to be unloaded. Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NCong Wang <amwang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
Halve mss table size to make blind cookie guessing more difficult. This is sad since the tables were already small, but there is little alternative except perhaps adding more precise mss information in the tcp timestamp. Timestamps are unfortunately not ubiquitous. Guessing all possible cookie values still has 8-in 2**32 chance. Reported-by: NJakob Lell <jakob@jakoblell.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
We currently accept cookies that were created less than 4 minutes ago (ie, cookies with counter delta 0-3). Combined with the 8 mss table values, this yields 32 possible values (out of 2**32) that will be valid. Reducing the lifetime to < 2 minutes halves the guessing chance while still providing a large enough period. While at it, get rid of jiffies value -- they overflow too quickly on 32 bit platforms. getnstimeofday is used to create a counter that increments every 64s. perf shows getnstimeofday cost is negible compared to sha_transform; normal tcp initial sequence number generation uses getnstimeofday, too. Reported-by: NJakob Lell <jakob@jakoblell.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Duan Jiong 提交于
Redirect isn't an error condition, it should leave the error handler without touching the socket. Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Duan Jiong 提交于
Redirect isn't an error condition, it should leave the error handler without touching the socket. Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 9月, 2013 1 次提交
-
-
由 Ding Zhi 提交于
IFLA_IPTUN_LOCAL and IFLA_IPTUN_REMOTE were inverted. Introduced by c075b130 (ip6tnl: advertise tunnel param via rtnl). Signed-off-by: NDing Zhi <zhi.ding@6wind.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 9月, 2013 1 次提交
-
-
由 Phil Oester 提交于
In commit 58a317f1 (netfilter: ipv6: add IPv6 NAT support), icmpv6_manip_pkt was added with an incorrect comparison of ICMP codes to types. This causes problems when using NAT rules with the --random option. Correct the comparison. This closes netfilter bugzilla #851, reported by Alexander Neumann. Signed-off-by: NPhil Oester <kernel@linuxace.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 12 9月, 2013 4 次提交
-
-
由 Michal Kubeček 提交于
When loading the ipv6 module, ndisc_init() is called before ip6_route_init(). As the former registers a handler calling fib6_run_gc(), this opens a window to run the garbage collector before necessary data structures are initialized. If a network device is initialized in this window, adding MAC address to it triggers a NETDEV_CHANGEADDR event, leading to a crash in fib6_clean_all(). Take the event handler registration out of ndisc_init() into a separate function ndisc_late_init() and move it after ip6_route_init(). Signed-off-by: NMichal Kubecek <mkubecek@suse.cz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stefan Tomanek 提交于
This change just removes two tabs from the source file. Signed-off-by: NStefan Tomanek <stefan.tomanek@wertarbyte.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
When the kernel is compiled with CONFIG_IPV6_SUBTREES, and we return with an error in fn = fib6_add_1(), then error codes are encoded into the return pointer e.g. ERR_PTR(-ENOENT). In such an error case, we write the error code into err and jump to out, hence enter the if(err) condition. Now, if CONFIG_IPV6_SUBTREES is enabled, we check for: if (pn != fn && pn->leaf == rt) ... if (pn != fn && !pn->leaf && !(pn->fn_flags & RTN_RTINFO)) ... Since pn is NULL and fn is f.e. ERR_PTR(-ENOENT), then pn != fn evaluates to true and causes a NULL-pointer dereference on further checks on pn. Fix it, by setting both NULL in error case, so that pn != fn already evaluates to false and no further dereference takes place. This was first correctly implemented in 4a287eba ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag"), but the bug got later on introduced by 188c517a ("ipv6: return errno pointers consistently for fib6_add_1()"). Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Lin Ming <mlin@ss.pku.edu.cn> Cc: Matti Vaittinen <matti.vaittinen@nsn.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NMatti Vaittinen <matti.vaittinen@nsn.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
In rfc4942 and rfc2460 I cannot find anything which would implicate to drop packets which have only padding in tlv. Current behaviour breaks TAHI Test v6LC.1.2.6. Problem was intruduced in: 9b905fe6 "ipv6/exthdrs: strict Pad1 and PadN check" Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 9月, 2013 1 次提交
-
-
由 Duan Jiong 提交于
RFC 4861 says that the IP source address of the Redirect is the same as the current first-hop router for the specified ICMP Destination Address, so the gateway should be taken into consideration when we find the route for redirect. There was once a check in commit a6279458 ("NDISC: Search over all possible rules on receipt of redirect.") and the check went away in commit b94f1c09 ("ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect()"). The bug is only "exploitable" on layer-2 because the source address of the redirect is checked to be a valid link-local address but it makes spoofing a lot easier in the same L2 domain nonetheless. Thanks very much for Hannes's help. Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 9月, 2013 8 次提交
-
-
由 Daniel Borkmann 提交于
We already have mld_{gq,ifc,dad}_start_timer() functions, so introduce mld_{gq,ifc,dad}_stop_timer() functions to reduce code size and make it more readable. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Make igmp6_event_query() a bit easier to read by refactoring code parts into mld_process_v1() and mld_process_v2(). Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Similarly as we do in MLDv2 queries, set a forged MLDv1 query with 0 ms mld_maxdelay to minimum timer shot time of 1 jiffies. This is eventually done in igmp6_group_queried() anyway, so we can simplify a check there. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
RFC3810, 10. Security Considerations says under subsection 10.1. Query Message: A forged Version 1 Query message will put MLDv2 listeners on that link in MLDv1 Host Compatibility Mode. This scenario can be avoided by providing MLDv2 hosts with a configuration option to ignore Version 1 messages completely. Hence, implement a MLDv2-only mode that will ignore MLDv1 traffic: echo 2 > /proc/sys/net/ipv6/conf/ethX/force_mld_version or echo 2 > /proc/sys/net/ipv6/conf/all/force_mld_version Note that <all> device has a higher precedence as it was previously also the case in the macro MLD_V1_SEEN() that would "short-circuit" if condition on <all> case. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Get rid of MLDV2_MRC and use our new macros for mantisse and exponent to calculate Maximum Response Delay out of the Maximum Response Code. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Replace the macro with a function to make it more readable. GCC will eventually decide whether to inline this or not (also, that's not fast-path anyway). Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
i) RFC3810, 9.2. Query Interval [QI] says: The Query Interval variable denotes the interval between General Queries sent by the Querier. Default value: 125 seconds. [...] ii) RFC3810, 9.3. Query Response Interval [QRI] says: The Maximum Response Delay used to calculate the Maximum Response Code inserted into the periodic General Queries. Default value: 10000 (10 seconds) [...] The number of seconds represented by the [Query Response Interval] must be less than the [Query Interval]. iii) RFC3810, 9.12. Older Version Querier Present Timeout [OVQPT] says: The Older Version Querier Present Timeout is the time-out for transitioning a host back to MLDv2 Host Compatibility Mode. When an MLDv1 query is received, MLDv2 hosts set their Older Version Querier Present Timer to [Older Version Querier Present Timeout]. This value MUST be ([Robustness Variable] times (the [Query Interval] in the last Query received)) plus ([Query Response Interval]). Hence, on *default* the timeout results in: [RV] = 2, [QI] = 125sec, [QRI] = 10sec [OVQPT] = [RV] * [QI] + [QRI] = 260sec Having that said, we currently calculate [OVQPT] (here given as 'switchback' variable) as ... switchback = (idev->mc_qrv + 1) * max_delay RFC3810, 9.12. says "the [Query Interval] in the last Query received". In section "9.14. Configuring timers", it is said: This section is meant to provide advice to network administrators on how to tune these settings to their network. Ambitious router implementations might tune these settings dynamically based upon changing characteristics of the network. [...] iv) RFC38010, 9.14.2. Query Interval: The overall level of periodic MLD traffic is inversely proportional to the Query Interval. A longer Query Interval results in a lower overall level of MLD traffic. The value of the Query Interval MUST be equal to or greater than the Maximum Response Delay used to calculate the Maximum Response Code inserted in General Query messages. I assume that was why switchback is calculated as is (3 * max_delay), although this setting seems to be meant for routers only to configure their [QI] interval for non-default intervals. So usage here like this is clearly wrong. Concluding, the current behaviour in IPv6's multicast code is not conform to the RFC as switch back is calculated wrongly. That is, it has a too small value, so MLDv2 hosts switch back again to MLDv2 way too early, i.e. ~30secs instead of ~260secs on default. Hence, introduce necessary helper functions and fix this up properly as it should be. Introduced in 06da92283 ("[IPV6]: Add MLDv2 support."). Credits to Hannes Frederic Sowa who also had a hand in this as well. Also thanks to Hangbin Liu who did initial testing. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: David Stevens <dlstevens@us.ibm.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
In tcp_v6_do_rcv() code, when processing pkt options, we soley work on our skb clone opt_skb that we've created earlier before entering tcp_rcv_established() on our way. However, only in condition ... if (np->rxopt.bits.rxtclass) np->rcv_tclass = ipv6_get_dsfield(ipv6_hdr(skb)); ... we work on skb itself. As we extract every other information out of opt_skb in ipv6_pktoptions path, this seems wrong, since skb can already be released by tcp_rcv_established() earlier on. When we try to access it in ipv6_hdr(), we will dereference freed skb. [ Bug added by commit 4c507d28 ("net: implement IP_RECVTOS for IP_PKTOPTIONS") ] Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com> Acked-by: NJiri Benc <jbenc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-