提交 · 00fd38d938db3f1ab1c486549afc450cb7e751b1 · openeuler / raspberrypi-kernel

16 11月, 2015 4 次提交

tcp: ensure proper barriers in lockless contexts · 00fd38d9

由 Eric Dumazet 提交于 11月 12, 2015

Some functions access TCP sockets without holding a lock and
might output non consistent data, depending on compiler and or
architecture.

tcp_diag_get_info(), tcp_get_info(), tcp_poll(), get_tcp4_sock() ...

Introduce sk_state_load() and sk_state_store() to fix the issues,
and more clearly document where this lack of locking is happening.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

00fd38d9

ipv6: Check rt->dst.from for the DST_NOCACHE route · 02bcf4e0

由 Martin KaFai Lau 提交于 11月 11, 2015

All DST_NOCACHE rt6_info used to have rt->dst.from set to
its parent.

After commit 8e3d5be7 ("ipv6: Avoid double dst_free"),
DST_NOCACHE is also set to rt6_info which does not have
a parent (i.e. rt->dst.from is NULL).

This patch catches the rt->dst.from == NULL case.

Fixes: 8e3d5be7 ("ipv6: Avoid double dst_free")
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

02bcf4e0

ipv6: Check expire on DST_NOCACHE route · 5973fb1e

由 Martin KaFai Lau 提交于 11月 11, 2015

Since the expires of the DST_NOCACHE rt can be set during
the ip6_rt_update_pmtu(), we also need to consider the expires
value when doing ip6_dst_check().

This patches creates __rt6_check_expired() to only
check the expire value (if one exists) of the current rt.

In rt6_dst_from_check(), it adds __rt6_check_expired() as
one of the condition check.
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5973fb1e

ipv6: Avoid creating RTF_CACHE from a rt that is not managed by fib6 tree · 0d3f6d29

由 Martin KaFai Lau 提交于 11月 11, 2015

The original bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1272571

The setup has a IPv4 GRE tunnel running in a IPSec.  The bug
happens when ndisc starts sending router solicitation at the gre
interface.  The simplified oops stack is like:

__lock_acquire+0x1b2/0x1c30
lock_acquire+0xb9/0x140
_raw_write_lock_bh+0x3f/0x50
__ip6_ins_rt+0x2e/0x60
ip6_ins_rt+0x49/0x50
~~~~~~~~
__ip6_rt_update_pmtu.part.54+0x145/0x250
ip6_rt_update_pmtu+0x2e/0x40
~~~~~~~~
ip_tunnel_xmit+0x1f1/0xf40
__gre_xmit+0x7a/0x90
ipgre_xmit+0x15a/0x220
dev_hard_start_xmit+0x2bd/0x480
__dev_queue_xmit+0x696/0x730
dev_queue_xmit+0x10/0x20
neigh_direct_output+0x11/0x20
ip6_finish_output2+0x21f/0x770
ip6_finish_output+0xa7/0x1d0
ip6_output+0x56/0x190
~~~~~~~~
ndisc_send_skb+0x1d9/0x400
ndisc_send_rs+0x88/0xc0
~~~~~~~~

The rt passed to ip6_rt_update_pmtu() is created by
icmp6_dst_alloc() and it is not managed by the fib6 tree,
so its rt6i_table == NULL.  When __ip6_rt_update_pmtu() creates
a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL
and it causes the ip6_ins_rt() oops.

During pmtu update, we only want to create a RTF_CACHE clone
from a rt which is currently managed (or owned) by the
fib6 tree.  It means either rt->rt6i_node != NULL or
rt is a RTF_PCPU clone.

It is worth to note that rt6i_table may not be NULL even it is
not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()).
Hence, rt6i_node is a better check instead of rt6i_table.

Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu")
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Reported-by: NChris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
Cc: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d3f6d29

06 11月, 2015 2 次提交

tcp: use correct req pointer in tcp_move_syn() calls · 49a496c9

由 Eric Dumazet 提交于 11月 05, 2015

I mistakenly took wrong request sock pointer when calling tcp_move_syn()

@req_unhash is either a copy of @req, or a NULL value for
FastOpen connexions (as we do not expect to unhash the temporary
request sock from ehash table)

Fixes: 805c4bc0 ("tcp: fix req->saved_syn race")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Ying Cai <ycai@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

49a496c9

tcp: fix req->saved_syn race · 805c4bc0

由 Eric Dumazet 提交于 11月 05, 2015

For the reasons explained in commit ce105008 ("tcp/dccp: fix
ireq->pktopts race"), we need to make sure we do not access
req->saved_syn unless we own the request sock.

This fixes races for listeners using TCP_SAVE_SYN option.

Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NYing Cai <ycai@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

805c4bc0

05 11月, 2015 1 次提交

ipv6: clean up dev_snmp6 proc entry when we fail to initialize inet6_dev · 2a189f9e

由 Sabrina Dubroca 提交于 11月 04, 2015

In ipv6_add_dev, when addrconf_sysctl_register fails, we do not clean up
the dev_snmp6 entry that we have already registered for this device.
Call snmp6_unregister_dev in this case.

Fixes: a317a2f1 ("ipv6: fail early when creating netdev named all or default")
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2a189f9e

03 11月, 2015 5 次提交

ipv6: fix tunnel error handling · ebac62fe

由 Michal Kubeček 提交于 11月 03, 2015

Both tunnel6_protocol and tunnel46_protocol share the same error
handler, tunnel6_err(), which traverses through tunnel6_handlers list.
For ipip6 tunnels, we need to traverse tunnel46_handlers as we do e.g.
in tunnel46_rcv(). Current code can generate an ICMPv6 error message
with an IPv4 packet embedded in it.

Fixes: 73d605d1 ("[IPSEC]: changing API of xfrm6_tunnel_register")
Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ebac62fe

sit: fix sit0 percpu double allocations · 4ece9009

由 Eric Dumazet 提交于 11月 02, 2015

sit0 device allocates its percpu storage twice :
- One time in ipip6_tunnel_init()
- One time in ipip6_fb_tunnel_init()

Thus we leak 48 bytes per possible cpu per network namespace dismantle.

ipip6_fb_tunnel_init() can be much simpler and does not
return an error, and should be called after register_netdev()

Note that ipip6_tunnel_clone_6rd() also needs to be called
after register_netdev() (calling ipip6_tunnel_init())

Fixes: ebe084aa ("sit: Use ipip6_tunnel_init as the ndo_init function.")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4ece9009

net: fix percpu memory leaks · 1d6119ba

由 Eric Dumazet 提交于 11月 02, 2015

This patch fixes following problems :

1) percpu_counter_init() can return an error, therefore
  init_frag_mem_limit() must propagate this error so that
  inet_frags_init_net() can do the same up to its callers.

2) If ip[46]_frags_ns_ctl_register() fail, we must unwind
   properly and free the percpu_counter.

Without this fix, we leave freed object in percpu_counters
global list (if CONFIG_HOTPLUG_CPU) leading to crashes.

This bug was detected by KASAN and syzkaller tool
(http://github.com/google/syzkaller)

Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1d6119ba

ipv6: fix crash on ICMPv6 redirects with prohibited/blackholed source · ec13ad1d

由 Matthias Schiffer 提交于 11月 02, 2015

There are other error values besides ip6_null_entry that can be returned by
ip6_route_redirect(): fib6_rule_action() can also result in
ip6_blk_hole_entry and ip6_prohibit_entry if such ip rules are installed.

Only checking for ip6_null_entry in rt6_do_redirect() causes ip6_ins_rt()
to be called with rt->rt6i_table == NULL in these cases, making the kernel
crash.
Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ec13ad1d

tcp/dccp: fix ireq->pktopts race · ce105008

由 Eric Dumazet 提交于 10月 30, 2015

IPv6 request sockets store a pointer to skb containing the SYN packet
to be able to transfer it to full blown socket when 3WHS is done
(ireq->pktopts -> np->pktoptions)

As explained in commit 5e0724d0 ("tcp/dccp: fix hashdance race for
passive sessions"), we must transfer the skb only if we won the
hashdance race, if multiple cpus receive the 'ack' packet completing
3WHS at the same time.

Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ce105008

02 11月, 2015 2 次提交

ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment · 405c92f7

由 Hannes Frederic Sowa 提交于 10月 27, 2015

CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
of those warn about them once and handle them gracefully by recalculating
the checksum.

Fixes: commit 32dce968 ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

405c92f7

ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets · 682b1a9d

由 Hannes Frederic Sowa 提交于 10月 27, 2015

We cannot reliable calculate packet size on MSG_MORE corked sockets
and thus cannot decide if they are going to be fragmented later on,
so better not use CHECKSUM_PARTIAL in the first place.

The IPv6 code also intended to protect and not use CHECKSUM_PARTIAL in
the existence of IPv6 extension headers, but the condition was wrong. Fix
it up, too. Also the condition to check whether the packet fits into
one fragment was wrong and has been corrected.

Fixes: commit 32dce968 ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

682b1a9d

30 10月, 2015 1 次提交

ipv6: recreate ipv6 link-local addresses when increasing MTU over IPV6_MIN_MTU · b7b0b1d2

由 Alexander Duyck 提交于 10月 26, 2015

This change makes it so that we reinitialize the interface if the MTU is
increased back above IPV6_MIN_MTU and the interface is up.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b7b0b1d2

29 10月, 2015 2 次提交

ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues · 89bc7848

由 Hannes Frederic Sowa 提交于 10月 28, 2015

Raw sockets with hdrincl enabled can insert ipv6 extension headers
right into the data stream. In case we need to fragment those packets,
we reparse the options header to find the place where we can insert
the fragment header. If the extension headers exceed the link's MTU we
actually cannot make progress in such a case.

Instead of ending up in broken arithmetic or rounding towards 0 and
entering an endless loop in ip6_fragment, just prevent those cases by
aborting early and signal -EMSGSIZE to user space.

This is the second version of the patch which doesn't use the
overflow_usub function, which got reverted for now.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

89bc7848

Revert "Merge branch 'ipv6-overflow-arith'" · 1e0d69a9

由 Hannes Frederic Sowa 提交于 10月 28, 2015

Linus dislikes these changes. To not hold up the net-merge let's revert
it for now and fix the bug like Linus suggested.

This reverts commit ec3661b4, reversing
changes made to c80dbe04.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1e0d69a9

28 10月, 2015 1 次提交

ipv6: Export nf_ct_frag6_consume_orig() · 190b8ffb

由 Joe Stringer 提交于 10月 25, 2015

This is needed in openvswitch to fix an skb leak in the next patch.
Signed-off-by: NJoe Stringer <joestringer@nicira.com>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

190b8ffb

27 10月, 2015 1 次提交

ipv6: icmp: include addresses in debug messages · 4b3418fb

由 Bjørn Mork 提交于 10月 24, 2015

Messages like "icmp6_send: no reply to icmp error" are close
to useless. Adding source and destination addresses to provide
some more clue.
Signed-off-by: NBjørn Mork <bjorn@mork.no>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b3418fb

23 10月, 2015 3 次提交

tcp/dccp: fix hashdance race for passive sessions · 5e0724d0

由 Eric Dumazet 提交于 10月 22, 2015

Multiple cpus can process duplicates of incoming ACK messages
matching a SYN_RECV request socket. This is a rare event under
normal operations, but definitely can happen.

Only one must win the race, otherwise corruption would occur.

To fix this without adding new atomic ops, we use logic in
inet_ehash_nolisten() to detect the request was present in the same
ehash bucket where we try to insert the new child.

If request socket was not found, we have to undo the child creation.

This actually removes a spin_lock()/spin_unlock() pair in
reqsk_queue_unlink() for the fast path.

Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5e0724d0

ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues · b72a2b01

由 Hannes Frederic Sowa 提交于 10月 16, 2015

Raw sockets with hdrincl enabled can insert ipv6 extension headers
right into the data stream. In case we need to fragment those packets,
we reparse the options header to find the place where we can insert
the fragment header. If the extension headers exceed the link's MTU we
actually cannot make progress in such a case.

Instead of ending up in broken arithmetic or rounding towards 0 and
entering an endless loop in ip6_fragment, just prevent those cases by
aborting early and signal -EMSGSIZE to user space.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b72a2b01

ipv6: fix the incorrect return value of throw route · ab997ad4

由 lucien 提交于 10月 23, 2015

The error condition -EAGAIN, which is signaled by throw routes, tells
the rules framework to walk on searching for next matches. If the walk
ends and we stop walking the rules with the result of a throw route we
have to translate the error conditions to -ENETUNREACH.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab997ad4

22 10月, 2015 4 次提交

net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set · d46a9d67

由 David Ahern 提交于 10月 21, 2015

741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
adds the RT6_LOOKUP_F_IFACE flag to make device index mismatch fatal if
oif is given. Hajime reported that this change breaks the Mobile IPv6
use case that wants to force the message through one interface yet use
the source address from another interface. Handle this case by only
adding the flag if oif is set and saddr is not set.

Fixes: 741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
Cc: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d46a9d67

ipv6: gro: support sit protocol · feec0cb3

由 Eric Dumazet 提交于 10月 19, 2015

Tom Herbert added SIT support to GRO with commit
19424e05 ("sit: Add gro callbacks to sit_offload"),
later reverted by Herbert Xu.

The problem came because Tom patch was building GRO
packets without proper meta data : If packets were locally
delivered, we would not care.

But if packets needed to be forwarded, GSO engine was not
able to segment individual segments.

With the following patch, we correctly set skb->encapsulation
and inner network header. We also update gso_type.

Tested:

Server :
netserver
modprobe dummy
ifconfig dummy0 8.0.0.1 netmask 255.255.255.0 up
arp -s 8.0.0.100 4e:32:51:04:47:e5
iptables -I INPUT -s 10.246.7.151 -j TEE --gateway 8.0.0.100
ifconfig sixtofour0
sixtofour0 Link encap:IPv6-in-IPv4
          inet6 addr: 2002:af6:798::1/128 Scope:Global
          inet6 addr: 2002:af6:798::/128 Scope:Global
          UP RUNNING NOARP  MTU:1480  Metric:1
          RX packets:411169 errors:0 dropped:0 overruns:0 frame:0
          TX packets:409414 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:20319631739 (20.3 GB)  TX bytes:29529556 (29.5 MB)

Client :
netperf -H 2002:af6:798::1 -l 1000 &

Checked on server traffic copied on dummy0 and verify segments were
properly rebuilt, with proper IP headers, TCP checksums...

tcpdump on eth0 shows proper GRO aggregation takes place.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

feec0cb3

netlink: Rightsize IFLA_AF_SPEC size calculation · b1974ed0

由 Arad, Ronen 提交于 10月 19, 2015

if_nlmsg_size() overestimates the minimum allocation size of netlink
dump request (when called from rtnl_calcit()) or the size of the
message (when called from rtnl_getlink()). This is because
ext_filter_mask is not supported by rtnl_link_get_af_size() and
rtnl_link_get_size().

The over-estimation is significant when at least one netdev has many
VLANs configured (8 bytes for each configured VLAN).

This patch-set "rightsizes" the protocol specific attribute size
calculation by propagating ext_filter_mask to rtnl_link_get_af_size()
and adding this a argument to get_link_af_size op in rtnl_af_ops.

Bridge module already used filtering aware sizing for notifications.
br_get_link_af_size_filtered() is consistent with the modified
get_link_af_size op so it replaces br_get_link_af_size() in br_af_ops.
br_get_link_af_size() becomes unused and thus removed.
Signed-off-by: NRonen Arad <ronen.arad@intel.com>
Acked-by: NSridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1974ed0

net: Really fix vti6 with oif in dst lookups · f1900fb5

由 David Ahern 提交于 10月 19, 2015

6e28b000 ("net: Fix vti use case with oif in dst lookups for IPv6")
is missing the checks on FLOWI_FLAG_SKIP_NH_OIF. Add them.

Fixes: 42a7b32b ("xfrm: Add oif to dst lookups")
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f1900fb5

19 10月, 2015 2 次提交

xfrm: Fix pmtu discovery for local generated packets. · ca064bd8

由 Steffen Klassert 提交于 10月 19, 2015

Commit 044a832a ("xfrm: Fix local error reporting crash
with interfamily tunnels") moved the setting of skb->protocol
behind the last access of the inner mode family to fix an
interfamily crash. Unfortunately now skb->protocol might not
be set at all, so we fail dispatch to the inner address family.
As a reault, the local error handler is not called and the
mtu value is not reported back to userspace.

We fix this by setting skb->protocol on message size errors
before we call xfrm_local_error.

Fixes: 044a832a ("xfrm: Fix local error reporting crash with interfamily tunnels")
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

ca064bd8

tcp: do not set queue_mapping on SYNACK · dc6ef6be

由 Eric Dumazet 提交于 10月 16, 2015

At the time of commit fff32699 ("tcp: reflect SYN queue_mapping into
SYNACK packets") we had little ways to cope with SYN floods.

We no longer need to reflect incoming skb queue mappings, and instead
can pick a TX queue based on cpu cooking the SYNACK, with normal XPS
affinities.

Note that all SYNACK retransmits were picking TX queue 0, this no longer
is a win given that SYNACK rtx are now distributed on all cpus.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dc6ef6be

17 10月, 2015 1 次提交

netfilter: remove hook owner refcounting · 2ffbceb2

由 Florian Westphal 提交于 10月 13, 2015

since commit 8405a8ff ("netfilter: nf_qeueue: Drop queue entries on
nf_unregister_hook") all pending queued entries are discarded.

So we can simply remove all of the owner handling -- when module is
removed it also needs to unregister all its hooks.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

2ffbceb2

16 10月, 2015 3 次提交

tcp/dccp: add inet_csk_reqsk_queue_drop_and_put() helper · f03f2e15

由 Eric Dumazet 提交于 10月 14, 2015

Let's reduce the confusion about inet_csk_reqsk_queue_drop() :
In many cases we also need to release reference on request socket,
so add a helper to do this, reducing code size and complexity.

Fixes: 4bdc3d66 ("tcp/dccp: fix behavior of stale SYN_RECV request sockets")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f03f2e15

ipv6: Initialize rt6_info properly in ip6_blackhole_route() · 0a1f5962

由 Martin KaFai Lau 提交于 10月 15, 2015

ip6_blackhole_route() does not initialize the newly allocated
rt6_info properly.  This patch:
1. Call rt6_info_init() to initialize rt6i_siblings and rt6i_uncached

2. The current rt->dst._metrics init code is incorrect:
   - 'rt->dst._metrics = ort->dst._metris' is not always safe
   - Not sure what dst_copy_metrics() is trying to do here
     considering ip6_rt_blackhole_cow_metrics() always returns
     NULL

   Fix:
   - Always do dst_copy_metrics()
   - Replace ip6_rt_blackhole_cow_metrics() with
     dst_cow_metrics_generic()

3. Mask out the RTF_PCPU bit from the newly allocated blackhole route.
   This bug triggers an oops (reported by Phil Sutter) in rt6_get_cookie().
   It is because RTF_PCPU is set while rt->dst.from is NULL.

Fixes: d52d3997 ("ipv6: Create percpu rt6_info")
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Reported-by: NPhil Sutter <phil@nwl.cc>
Tested-by: NPhil Sutter <phil@nwl.cc>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0a1f5962

ipv6: Move common init code for rt6_info to a new function rt6_info_init() · ebfa45f0

由 Martin KaFai Lau 提交于 10月 15, 2015

Introduce rt6_info_init() to do the common init work for
'struct rt6_info' (after calling dst_alloc).

It is a prep work to fix the rt6_info init logic in the
ip6_blackhole_route().
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ebfa45f0

14 10月, 2015 3 次提交

netfilter: ipv6: pointer cast layout · dbb526eb

由 Ian Morris 提交于 10月 11, 2015

Correct whitespace layout of a pointer casting.

No changes detected by objdiff.
Signed-off-by: NIan Morris <ipm@chirality.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

dbb526eb

netfilter: ip6_tables: improve if statements · 4305ae44

由 Ian Morris 提交于 10月 11, 2015

Correct whitespace layout of if statements.

No changes detected by objdiff.
Signed-off-by: NIan Morris <ipm@chirality.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

4305ae44

tcp/dccp: fix behavior of stale SYN_RECV request sockets · 4bdc3d66

由 Eric Dumazet 提交于 10月 13, 2015

When a TCP/DCCP listener is closed, its pending SYN_RECV request sockets
become stale, meaning 3WHS can not complete.

But current behavior is wrong :
incoming packets finding such stale sockets are dropped.

We need instead to cleanup the request socket and perform another
lookup :
- Incoming ACK will give a RST answer,
- SYN rtx might find another listener if available.
- We expedite cleanup of request sockets and old listener socket.

Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4bdc3d66

13 10月, 2015 5 次提交

netfilter: ip6_tables: ternary operator layout · 544d9b17

由 Ian Morris 提交于 10月 11, 2015

Correct whitespace layout of ternary operators in the netfilter-ipv6
code.

No changes detected by objdiff.
Signed-off-by: NIan Morris <ipm@chirality.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

544d9b17

netfilter: ipv6: whitespace around operators · f9527ea9

由 Ian Morris 提交于 10月 11, 2015

This patch cleanses whitespace around arithmetical operators.

No changes detected by objdiff.
Signed-off-by: NIan Morris <ipm@chirality.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

f9527ea9

netfilter: ipv6: code indentation · 7695495d

由 Ian Morris 提交于 10月 11, 2015

Use tabs instead of spaces to indent code.

No changes detected by objdiff.
Signed-off-by: NIan Morris <ipm@chirality.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

7695495d

netfilter: ip6_tables: function definition layout · cda219c6

由 Ian Morris 提交于 10月 11, 2015

Use tabs instead of spaces to indent second line of parameters in
function definitions.

No changes detected by objdiff.
Signed-off-by: NIan Morris <ipm@chirality.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

cda219c6

netfilter: ip6_tables: label placement · 6ac94619

由 Ian Morris 提交于 10月 11, 2015

Whitespace cleansing: Labels should not be indented.

No changes detected by objdiff.
Signed-off-by: NIan Morris <ipm@chirality.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

6ac94619