- 19 11月, 2013 3 次提交
-
-
由 Vlad Yasevich 提交于
Commit 6d0bfe22 net: ipv6: Add IPv6 support to the ping socket introduced a change in the cleanup logic of inet6_init and has a bug in that ipv6_packet_cleanup() may not be called. Fix the cleanup ordering. CC: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Lorenzo Colitti <lorenzo@google.com> CC: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: NVlad Yasevich <vyasevich@gmail.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
Only update *addr_len when we actually fill in sockaddr, otherwise we can return uninitialized memory from the stack to the caller in the recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL) checks because we only get called with a valid addr_len pointer either from sock_common_recvmsg or inet_recvmsg. If a blocking read waits on a socket which is concurrently shut down we now return zero and set msg_msgnamelen to 0. Reported-by: Nmpb <mpb.mail@gmail.com> Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fabio Estevam 提交于
When CONFIG_SYSCTL=n the following build warning happens: net/ipv6/ndisc.c:1730:1: warning: label 'out' defined but not used [-Wunused-label] The 'out' label is only used when CONFIG_SYSCTL=y, so move it inside the 'ifdef CONFIG_SYSCTL' block. Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 11月, 2013 1 次提交
-
-
由 Martin Topholm 提交于
When the synproxy_parse_options is called on the client ack the mss option will not be present. Consequently mss wont be included in the backend syn packet, which falls back to 536 bytes mss. Therefore XT_SYNPROXY_OPT_MSS is explicitly flagged when recovering mss value from cookie. Signed-off-by: NMartin Topholm <mph@one.com> Reviewed-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 15 11月, 2013 5 次提交
-
-
由 Nicolas Dichtel 提交于
Bug has been introduced by commit bb814094 ("ip6tnl: allow to use rtnl ops on fb tunnel"). When ip6_tunnel.ko is unloaded, FB device is delete by rtnl_link_unregister() and then we try to use the pointer in ip6_tnl_destroy_tunnels(). Let's add an handler for dellink, which will never remove the FB tunnel. With this patch it will no more be possible to remove it via 'ip link del ip6tnl0', but it's safer. The same fix was already proposed by Willem de Bruijn <willemb@google.com> for sit interfaces. CC: Willem de Bruijn <willemb@google.com> Reported-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
addrconf_add_linklocal() already adds the link local route, so there is no reason to add it before calling this function. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
When a link local address was added to a sit interface, the corresponding route was not configured. This breaks routing protocols that use the link local address, like OSPFv3. To ease the code reading, I remove sit_route_add(), which only adds v4 mapped routes, and add this kind of route directly in sit_add_v4_addrs(). Thus link local and v4 mapped routes are configured in the same place. Reported-by: NLi Hongjun <hongjun.li@6wind.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
When the local IPv4 endpoint is wilcard (0.0.0.0), the prefix length is correctly set, ie 64 if the address is a link local one or 96 if the address is a v4 mapped one. But when the local endpoint is specified, the prefix length is set to 128 for both kind of address. This patch fix this wrong prefix length. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Bug: The fallback device is created in sit_init_net and assumed to be freed in sit_exit_net. First, it is dereferenced in that function, in sit_destroy_tunnels: struct net *net = dev_net(sitn->fb_tunnel_dev); Prior to this, rtnl_unlink_register has removed all devices that match rtnl_link_ops == sit_link_ops. Commit 205983c4 added the line + sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops; which cases the fallback device to match here and be freed before it is last dereferenced. Fix: This commit adds an explicit .delllink callback to sit_link_ops that skips deallocation at rtnl_unlink_register for the fallback device. This mechanism is comparable to the one in ip_tunnel. It also modifies sit_destroy_tunnels and its only caller sit_exit_net to avoid the offending dereference in the first place. That double lookup is more complicated than required. Test: The bug is only triggered when CONFIG_NET_NS is enabled. It causes a GPF only when CONFIG_DEBUG_SLAB is enabled. Verified that this bug exists at the mentioned commit, at davem-net HEAD and at 3.11.y HEAD. Verified that it went away after applying this patch. Fixes: 205983c4 ("sit: allow to use rtnl ops on fb tunnel") Signed-off-by: NWillem de Bruijn <willemb@google.com> Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2013 3 次提交
-
-
由 Hannes Frederic Sowa 提交于
Fixes a suspicious rcu derference warning. Cc: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Pushing original fragments through causes several problems. For example for matching, frags may not be matched correctly. Take following example: <example> On HOSTA do: ip6tables -I INPUT -p icmpv6 -j DROP ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT and on HOSTB you do: ping6 HOSTA -s2000 (MTU is 1500) Incoming echo requests will be filtered out on HOSTA. This issue does not occur with smaller packets than MTU (where fragmentation does not happen) </example> As was discussed previously, the only correct solution seems to be to use reassembled skb instead of separete frags. Doing this has positive side effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams dances in ipvs and conntrack can be removed. Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c entirely and use code in net/ipv6/reassembly.c instead. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Acked-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NMarcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
If reassembled packet would fit into outdev MTU, it is not fragmented according the original frag size and it is send as single big packet. The second case is if skb is gso. In that case fragmentation does not happen according to the original frag size. This patch fixes these. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 11月, 2013 4 次提交
-
-
由 Duan Jiong 提交于
As the rfc 4191 said, the Router Preference and Lifetime values in a ::/0 Route Information Option should override the preference and lifetime values in the Router Advertisement header. But when the kernel deals with a ::/0 Route Information Option, the rt6_get_route_info() always return NULL, that means that overriding will not happen, because those default routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router(). In order to deal with that condition, we should call rt6_get_dflt_router when the prefix length is 0. Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florent Fourcot 提交于
Take ip6_fl_lock before to read and update a label. v2: protect only the relevant code Reported-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NFlorent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florent Fourcot 提交于
If the last RFC 6437 does not give any constraints for lifetime of flow labels, the previous RFC 3697 spoke of a minimum of 120 seconds between reattribution of a flow label. The maximum linger is currently set to 60 seconds and does not allow this configuration without CAP_NET_ADMIN right. This patch increase the maximum linger to 150 seconds, allowing more flexibility to standard users. Signed-off-by: NFlorent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florent Fourcot 提交于
It is already possible to set/put/renew a label with IPV6_FLOWLABEL_MGR and setsockopt. This patch add the possibility to get information about this label (current value, time before expiration, etc). It helps application to take decision for a renew or a release of the label. v2: * Add spin_lock to prevent race condition * return -ENOENT if no result found * check if flr_action is GET v3: * move the spin_lock to protect only the relevant code Signed-off-by: NFlorent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 11月, 2013 5 次提交
-
-
由 John Stultz 提交于
While enabling lockdep on seqlocks, I ran across the warning below caused by the ipv6 stats being updated in both irq and non-irq context. This patch changes from IP6_INC_STATS_BH to IP6_INC_STATS (suggested by Eric Dumazet) to resolve this problem. [ 11.120383] ================================= [ 11.121024] [ INFO: inconsistent lock state ] [ 11.121663] 3.12.0-rc1+ #68 Not tainted [ 11.122229] --------------------------------- [ 11.122867] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 11.123741] init/4483 [HC0[0]:SC1[3]:HE1:SE0] takes: [ 11.124505] (&stats->syncp.seq#6){+.?...}, at: [<c1ab80c2>] ndisc_send_ns+0xe2/0x130 [ 11.125736] {SOFTIRQ-ON-W} state was registered at: [ 11.126447] [<c10e0eb7>] __lock_acquire+0x5c7/0x1af0 [ 11.127222] [<c10e2996>] lock_acquire+0x96/0xd0 [ 11.127925] [<c1a9a2c3>] write_seqcount_begin+0x33/0x40 [ 11.128766] [<c1a9aa03>] ip6_dst_lookup_tail+0x3a3/0x460 [ 11.129582] [<c1a9e0ce>] ip6_dst_lookup_flow+0x2e/0x80 [ 11.130014] [<c1ad18e0>] ip6_datagram_connect+0x150/0x4e0 [ 11.130014] [<c1a4d0b5>] inet_dgram_connect+0x25/0x70 [ 11.130014] [<c198dd61>] SYSC_connect+0xa1/0xc0 [ 11.130014] [<c198f571>] SyS_connect+0x11/0x20 [ 11.130014] [<c198fe6b>] SyS_socketcall+0x12b/0x300 [ 11.130014] [<c1bbf880>] syscall_call+0x7/0xb [ 11.130014] irq event stamp: 1184 [ 11.130014] hardirqs last enabled at (1184): [<c1086901>] local_bh_enable+0x71/0x110 [ 11.130014] hardirqs last disabled at (1183): [<c10868cd>] local_bh_enable+0x3d/0x110 [ 11.130014] softirqs last enabled at (0): [<c108014d>] copy_process.part.42+0x45d/0x11a0 [ 11.130014] softirqs last disabled at (1147): [<c1086e05>] irq_exit+0xa5/0xb0 [ 11.130014] [ 11.130014] other info that might help us debug this: [ 11.130014] Possible unsafe locking scenario: [ 11.130014] [ 11.130014] CPU0 [ 11.130014] ---- [ 11.130014] lock(&stats->syncp.seq#6); [ 11.130014] <Interrupt> [ 11.130014] lock(&stats->syncp.seq#6); [ 11.130014] [ 11.130014] *** DEADLOCK *** [ 11.130014] [ 11.130014] 3 locks held by init/4483: [ 11.130014] #0: (rcu_read_lock){.+.+..}, at: [<c109363c>] SyS_setpriority+0x4c/0x620 [ 11.130014] #1: (((&ifa->dad_timer))){+.-...}, at: [<c108c1c0>] call_timer_fn+0x0/0xf0 [ 11.130014] #2: (rcu_read_lock){.+.+..}, at: [<c1ab6494>] ndisc_send_skb+0x54/0x5d0 [ 11.130014] [ 11.130014] stack backtrace: [ 11.130014] CPU: 0 PID: 4483 Comm: init Not tainted 3.12.0-rc1+ #68 [ 11.130014] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 11.130014] 00000000 00000000 c55e5c10 c1bb0e71 c57128b0 c55e5c4c c1badf79 c1ec1123 [ 11.130014] c1ec1484 00001183 00000000 00000000 00000001 00000003 00000001 00000000 [ 11.130014] c1ec1484 00000004 c5712dcc 00000000 c55e5c84 c10de492 00000004 c10755f2 [ 11.130014] Call Trace: [ 11.130014] [<c1bb0e71>] dump_stack+0x4b/0x66 [ 11.130014] [<c1badf79>] print_usage_bug+0x1d3/0x1dd [ 11.130014] [<c10de492>] mark_lock+0x282/0x2f0 [ 11.130014] [<c10755f2>] ? kvm_clock_read+0x22/0x30 [ 11.130014] [<c10dd8b0>] ? check_usage_backwards+0x150/0x150 [ 11.130014] [<c10e0e74>] __lock_acquire+0x584/0x1af0 [ 11.130014] [<c10b1baf>] ? sched_clock_cpu+0xef/0x190 [ 11.130014] [<c10de58c>] ? mark_held_locks+0x8c/0xf0 [ 11.130014] [<c10e2996>] lock_acquire+0x96/0xd0 [ 11.130014] [<c1ab80c2>] ? ndisc_send_ns+0xe2/0x130 [ 11.130014] [<c1ab66d3>] ndisc_send_skb+0x293/0x5d0 [ 11.130014] [<c1ab80c2>] ? ndisc_send_ns+0xe2/0x130 [ 11.130014] [<c1ab80c2>] ndisc_send_ns+0xe2/0x130 [ 11.130014] [<c108cc32>] ? mod_timer+0xf2/0x160 [ 11.130014] [<c1aa706e>] ? addrconf_dad_timer+0xce/0x150 [ 11.130014] [<c1aa70aa>] addrconf_dad_timer+0x10a/0x150 [ 11.130014] [<c1aa6fa0>] ? addrconf_dad_completed+0x1c0/0x1c0 [ 11.130014] [<c108c233>] call_timer_fn+0x73/0xf0 [ 11.130014] [<c108c1c0>] ? __internal_add_timer+0xb0/0xb0 [ 11.130014] [<c1aa6fa0>] ? addrconf_dad_completed+0x1c0/0x1c0 [ 11.130014] [<c108c5b1>] run_timer_softirq+0x141/0x1e0 [ 11.130014] [<c1086b20>] ? __do_softirq+0x70/0x1b0 [ 11.130014] [<c1086b70>] __do_softirq+0xc0/0x1b0 [ 11.130014] [<c1086e05>] irq_exit+0xa5/0xb0 [ 11.130014] [<c106cfd5>] smp_apic_timer_interrupt+0x35/0x50 [ 11.130014] [<c1bbfbca>] apic_timer_interrupt+0x32/0x38 [ 11.130014] [<c10936ed>] ? SyS_setpriority+0xfd/0x620 [ 11.130014] [<c10e26c9>] ? lock_release+0x9/0x240 [ 11.130014] [<c10936d7>] ? SyS_setpriority+0xe7/0x620 [ 11.130014] [<c1bbee6d>] ? _raw_read_unlock+0x1d/0x30 [ 11.130014] [<c1093701>] SyS_setpriority+0x111/0x620 [ 11.130014] [<c109363c>] ? SyS_setpriority+0x4c/0x620 [ 11.130014] [<c1bbf880>] syscall_call+0x7/0xb Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "David S. Miller" <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/1381186321-4906-5-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 John Stultz 提交于
In order to enable lockdep on seqcount/seqlock structures, we must explicitly initialize any locks. The u64_stats_sync structure, uses a seqcount, and thus we need to introduce a u64_stats_init() function and use it to initialize the structure. This unfortunately adds a lot of fairly trivial initialization code to a number of drivers. But the benefit of ensuring correctness makes this worth while. Because these changes are required for lockdep to be enabled, and the changes are quite trivial, I've not yet split this patch out into 30-some separate patches, as I figured it would be better to get the various maintainers thoughts on how to best merge this change along with the seqcount lockdep enablement. Feedback would be appreciated! Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Acked-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Cc: Jesse Gross <jesse@nicira.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mirko Lindner <mlindner@marvell.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Roger Luethi <rl@hellgate.ch> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Simon Horman <horms@verge.net.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Wensong Zhang <wensong@linux-vs.org> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Duan Jiong 提交于
Now rt6_alloc_cow() is only called by ip6_pol_route() when rt->rt6i_flags doesn't contain both RTF_NONEXTHOP and RTF_GATEWAY, and rt->rt6i_flags hasn't been changed in ip6_rt_copy(). So there is no neccessary to judge whether rt->rt6i_flags contains RTF_GATEWAY or not. Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
Commit 1e2bd517 ("udp6: Fix udp fragmentation for tunnel traffic.") changed the calculation if there is enough space to include a fragment header in the skb from a skb->mac_header dervived one to skb_headroom. Because we already peeled off the skb to transport_header this is wrong. Change this back to check if we have enough room before the mac_header. This fixes a panic Saran Neti reported. He used the tbf scheduler which skb_gso_segments the skb. The offsets get negative and we panic in memcpy because the skb was erroneously not expanded at the head. Reported-by: NSaran Neti <Saran.Neti@telus.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florent Fourcot 提交于
The code of flow label in Linux Kernel follows the rules of RFC 1809 (an informational one) for conditions on flow label sharing. There rules are not in the last proposed standard for flow label (RFC 6437), or in the previous one (RFC 3697). Since this code does not follow any current or old standard, we can remove it. With this removal, the ipv6_opt_cmp function is now a dead code and it can be removed too. Changelog to v1: * add justification for the change * remove the condition on IPv6 options [ Remove ipv6_hdr_cmp and it is now unused as well. -DaveM ] Signed-off-by: NFlorent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 11月, 2013 1 次提交
-
-
由 Steffen Klassert 提交于
On some codepaths the skb does not have a dst entry when xfrm_decode_session() is called. So check for a valid skb_dst() before dereferencing the device interface index. We use 0 as the device index if there is no valid skb_dst(), or at reverse decoding we use skb_iif as device interface index. Bug was introduced with git commit bafd4bd4 ("xfrm: Decode sessions with output interface."). Reported-by: NMeelis Roos <mroos@linux.ee> Tested-by: NMeelis Roos <mroos@linux.ee> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
- 31 10月, 2013 1 次提交
-
-
由 Duan Jiong 提交于
After reading the function rt6_check_neigh(), we can know that the RT6_NUD_FAIL_SOFT can be returned only when the IS_ENABLE(CONFIG_IPV6_ROUTER_PREF) is false. so in function find_match(), there is no need to execute the statement !IS_ENABLED(CONFIG_IPV6_ROUTER_PREF). Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2013 3 次提交
-
-
由 Mathias Krause 提交于
struct esp_data consists of a single pointer, vanishing the need for it to be a structure. Fold the pointer into 'data' direcly, removing one level of pointer indirection. Signed-off-by: NMathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 Mathias Krause 提交于
The padlen member of struct esp_data is always zero. Get rid of it. Signed-off-by: NMathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 David S. Miller 提交于
The code for privacy extentions is very mature, and making it configurable only gives marginal memory/code savings in exchange for obfuscation and hard to read code via CPP ifdef'ery. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 10月, 2013 1 次提交
-
-
由 Steffen Klassert 提交于
With the removal of the routing cache, we lost the option to tweak the garbage collector threshold along with the maximum routing cache size. So git commit 703fb94e ("xfrm: Fix the gc threshold value for ipv4") moved back to a static threshold. It turned out that the current threshold before we start garbage collecting is much to small for some workloads, so increase it from 1024 to 32768. This means that we start the garbage collector if we have more than 32768 dst entries in the system and refuse new allocations if we are above 65536. Reported-by: NWolfgang Walter <linux@stwm.de> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
- 26 10月, 2013 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
On receiving a packet too big icmp error we check if our current cached dst_entry in the socket is still valid. This validation check did not care about the expiration of the (cached) route. The error path I traced down: The socket receives a packet too big mtu notification. It still has a valid dst_entry and thus issues the ip6_rt_pmtu_update on this dst_entry, setting RTF_EXPIRE and updates the dst.expiration value (which could fail because of not up-to-date expiration values, see previous patch). In some seldom cases we race with a) the ip6_fib gc or b) another routing lookup which would result in a recreation of the cached rt6_info from its parent non-cached rt6_info. While copying the rt6_info we reinitialize the metrics store by copying it over from the parent thus invalidating the just installed pmtu update (both dsts use the same key to the inetpeer storage). The dst_entry with the just invalidated metrics data would just get its RTF_EXPIRES flag cleared and would continue to stay valid for the socket. We should have not issued the pmtu update on the already expired dst_entry in the first placed. By checking the expiration on the dst entry and doing a relookup in case it is out of date we close the race because we would install a new rt6_info into the fib before we issue the pmtu update, thus closing this race. Not reliably updating the dst.expire value was fixed by the patch "ipv6: reset dst.expires value when clearing expire flag". Reported-by: NSteinar H. Gunderson <sgunderson@bigfoot.com> Reported-by: NValentijn Sessink <valentyn@blub.net> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Reviewed-by: NEric Dumazet <edumazet@google.com> Tested-by: NValentijn Sessink <valentyn@blub.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 10月, 2013 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
Defer the fragmentation hash secret initialization for IPv6 like the previous patch did for IPv4. Because the netfilter logic reuses the hash secret we have to split it first. Thus introduce a new nf_hash_frag function which takes care to seed the hash secret. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 10月, 2013 1 次提交
-
-
由 Stanislav Fomichev 提交于
Don't verify checksum for outgoing packets because checksum calculation may be done by the device. Without this patch: $ ip6tables -I OUTPUT -p tcp --dport 80 -j REJECT --reject-with tcp-reset $ time telnet ipv6.google.com 80 Trying 2a00:1450:4010:c03::67... telnet: Unable to connect to remote host: Connection timed out real 0m7.201s user 0m0.000s sys 0m0.000s With the patch applied: $ ip6tables -I OUTPUT -p tcp --dport 80 -j REJECT --reject-with tcp-reset $ time telnet ipv6.google.com 80 Trying 2a00:1450:4010:c03::67... telnet: Unable to connect to remote host: Connection refused real 0m0.085s user 0m0.000s sys 0m0.000s Signed-off-by: NStanislav Fomichev <stfomichev@yandex-team.ru> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 22 10月, 2013 6 次提交
-
-
由 Will Deacon 提交于
During kernel stability testing on an SMP ARMv7 system, Yalin Wang reported the following panic from the netfilter code: 1fe0: 0000001c 5e2d3b10 4007e779 4009e110 60000010 00000032 ff565656 ff545454 [<c06c48dc>] (ipt_do_table+0x448/0x584) from [<c0655ef0>] (nf_iterate+0x48/0x7c) [<c0655ef0>] (nf_iterate+0x48/0x7c) from [<c0655f7c>] (nf_hook_slow+0x58/0x104) [<c0655f7c>] (nf_hook_slow+0x58/0x104) from [<c0683bbc>] (ip_local_deliver+0x88/0xa8) [<c0683bbc>] (ip_local_deliver+0x88/0xa8) from [<c0683718>] (ip_rcv_finish+0x418/0x43c) [<c0683718>] (ip_rcv_finish+0x418/0x43c) from [<c062b1c4>] (__netif_receive_skb+0x4cc/0x598) [<c062b1c4>] (__netif_receive_skb+0x4cc/0x598) from [<c062b314>] (process_backlog+0x84/0x158) [<c062b314>] (process_backlog+0x84/0x158) from [<c062de84>] (net_rx_action+0x70/0x1dc) [<c062de84>] (net_rx_action+0x70/0x1dc) from [<c0088230>] (__do_softirq+0x11c/0x27c) [<c0088230>] (__do_softirq+0x11c/0x27c) from [<c008857c>] (do_softirq+0x44/0x50) [<c008857c>] (do_softirq+0x44/0x50) from [<c0088614>] (local_bh_enable_ip+0x8c/0xd0) [<c0088614>] (local_bh_enable_ip+0x8c/0xd0) from [<c06b0330>] (inet_stream_connect+0x164/0x298) [<c06b0330>] (inet_stream_connect+0x164/0x298) from [<c061d68c>] (sys_connect+0x88/0xc8) [<c061d68c>] (sys_connect+0x88/0xc8) from [<c000e340>] (ret_fast_syscall+0x0/0x30) Code: 2a000021 e59d2028 e59de01c e59f011c (e7824103) ---[ end trace da227214a82491bd ]--- Kernel panic - not syncing: Fatal exception in interrupt This comes about because CPU1 is executing xt_replace_table in response to a setsockopt syscall, resulting in: ret = xt_jumpstack_alloc(newinfo); --> newinfo->jumpstack = kzalloc(size, GFP_KERNEL); [...] table->private = newinfo; newinfo->initial_entries = private->initial_entries; Meanwhile, CPU0 is handling the network receive path and ends up in ipt_do_table, resulting in: private = table->private; [...] jumpstack = (struct ipt_entry **)private->jumpstack[cpu]; On weakly ordered memory architectures, the writes to table->private and newinfo->jumpstack from CPU1 can be observed out of order by CPU0. Furthermore, on architectures which don't respect ordering of address dependencies (i.e. Alpha), the reads from CPU0 can also be re-ordered. This patch adds an smp_wmb() before the assignment to table->private (which is essentially publishing newinfo) to ensure that all writes to newinfo will be observed before plugging it into the table structure. A dependent-read barrier is also added on the consumer sides, to ensure the same ordering requirements are also respected there. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reported-by: NWang, Yalin <Yalin.Wang@sonymobile.com> Tested-by: NWang, Yalin <Yalin.Wang@sonymobile.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Hannes Frederic Sowa 提交于
Routes need to be probed asynchronous otherwise the call stack gets exhausted when the kernel attemps to deliver another skb inline, like e.g. xt_TEE does, and we probe at the same time. We update neigh->updated still at once, otherwise we would send to many probes. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Now ipv6_gso_segment() is stackable, its relatively easy to implement GSO/TSO support for SIT tunnels Performance results, when segmentation is done after tunnel device (as no NIC is yet enabled for TSO SIT support) : Before patch : lpq84:~# ./netperf -H 2002:af6:1153:: -Cc MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6 Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 3168.31 4.81 4.64 2.988 2.877 After patch : lpq84:~# ./netperf -H 2002:af6:1153:: -Cc MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6 Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 5525.00 7.76 5.17 2.763 1.840 Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
In order to support GSO on SIT tunnels, we need to make inet_gso_segment() stackable. It should not assume network header starts right after mac header. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
The code that is implemented is per memory cgroup not per netns, and having per netns bits is just confusing. Remove the per netns bits to make it easier to see what is really going on. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julian Anastasov 提交于
Make sure rt6i_gateway contains nexthop information in all routes returned from lookup or when routes are directly attached to skb for generated ICMP packets. The effect of this patch should be a faster version of rt6_nexthop() and the consideration of local addresses as nexthop. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 10月, 2013 4 次提交
-
-
由 Hannes Frederic Sowa 提交于
Initialize the ehash and ipv6_hash_secrets with net_get_random_once. Each compilation unit gets its own secret now: ipv4/inet_hashtables.o ipv4/udp.o ipv6/inet6_hashtables.o ipv6/udp.o rds/connection.o The functions still get inlined into the hashing functions. In the fast path we have at most two (needed in ipv6) if (unlikely(...)). Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
This patch splits the secret key for syncookies for ipv4 and ipv6 and initializes them with net_get_random_once. This change was the reason I did this series. I think the initialization of the syncookie_secret is way to early. Cc: Florian Westphal <fw@strlen.de> Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
This patch splits the inet6_ehashfn into separate ones in ipv6/inet6_hashtables.o and ipv6/udp.o to ease the introduction of seperate secrets keys later. Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Now inet_gso_segment() is stackable, its relatively easy to implement GSO/TSO support for IPIP Performance results, when segmentation is done after tunnel device (as no NIC is yet enabled for TSO IPIP support) : Before patch : lpq83:~# ./netperf -H 7.7.9.84 -Cc MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 3357.88 5.09 3.70 2.983 2.167 After patch : lpq83:~# ./netperf -H 7.7.9.84 -Cc MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 7710.19 4.52 6.62 1.152 1.687 Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-