提交 · b7bb110008607a915298bf0f47d25886ecb94477 · openanolis / cloud-kernel

10 12月, 2015 1 次提交

rfkill: copy the name into the rfkill struct · b7bb1100

由 Johannes Berg 提交于 12月 10, 2015

Some users of rfkill, like NFC and cfg80211, use a dynamic name when
allocating rfkill, in those cases dev_name(). Therefore, the pointer
passed to rfkill_alloc() might not be valid forever, I specifically
found the case that the rfkill name was quite obviously an invalid
pointer (or at least garbage) when the wiphy had been renamed.

Fix this by making a copy of the rfkill name in rfkill_alloc().

Cc: stable@vger.kernel.org
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

b7bb1100

03 12月, 2015 3 次提交

mac80211: fix off-channel mgmt-tx uninitialized variable usage · c1df932c

由 Johannes Berg 提交于 11月 27, 2015

In the last change here, I neglected to update the cookie in one code
path: when a mgmt-tx has no real cookie sent to userspace as it doesn't
wait for a response, but is off-channel. The original code used the SKB
pointer as the cookie and always assigned the cookie to the TX SKB in
ieee80211_start_roc_work(), but my change turned this around and made
the code rely on a valid cookie being passed in.

Unfortunately, the off-channel no-wait TX path wasn't assigning one at
all, resulting in an uninitialized stack value being used. This wasn't
handed back to userspace as a cookie (since in the no-wait case there
isn't a cookie), but it was tested for non-zero to distinguish between
mgmt-tx and off-channel.

Fix this by assigning a dummy non-zero cookie unconditionally, and get
rid of a misleading comment and some dead code while at it. I'll clean
up the ACK SKB handling separately later.

Fixes: 3b79af97 ("mac80211: stop using pointers as userspace cookies")
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

c1df932c

mac80211: do not actively scan DFS channels · 4e39ccac

由 Antonio Quartulli 提交于 11月 21, 2015

DFS channels should not be actively scanned as we can't be sure
if we are allowed or not.

If the current channel is in the DFS band, active scan might be
performed after CSA, but we have no guarantee about other channels,
therefore it is safer to prevent active scanning at all.

Cc: stable@vger.kernel.org
Signed-off-by: NAntonio Quartulli <antonio@open-mesh.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

4e39ccac

mac80211: don't teardown sdata on sdata stop · 835112b2

由 Eliad Peller 提交于 11月 17, 2015

Interfaces are being initialized (setup) on addition,
and torn down on removal.

However, p2p device is being torn down when stopped,
resulting in the next p2p start operation being done
on uninitialized interface.

Solve it by calling ieee80211_teardown_sdata() only
on interface removal (for the non-netdev case).
Signed-off-by: NEliad Peller <eliadx.peller@intel.com>
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
[squashed in fix to call teardown after unregister]
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

835112b2

20 11月, 2015 2 次提交

mac80211: always set the buf_size in AddBA req to 64 · ac062197

由 Gregory Greenman 提交于 11月 17, 2015

Advertising reordering window in ADDBA less than 64 can crash some APs,
an example is LinkSys WRT120N (with FW v1.0.07 build 002 Jun 18 2012).
On the other hand, a driver may need to limit Tx A-MPDU size for its own
reasons, like specific HW limitations.
Signed-off-by: NGregory Greenman <gregory.greenman@intel.com>
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

ac062197

mac80211: ensure we don't update tx power on a non-running sdata · 5ad11b50

由 Emmanuel Grumbach 提交于 11月 17, 2015

We can't update the Tx power on the device unless it is
running.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=101521.

Cc: stable@vger.kernel.org
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

5ad11b50

17 11月, 2015 2 次提交

mac80211: mesh: fix call_rcu() usage · c2e703a5

由 Johannes Berg 提交于 11月 17, 2015

When using call_rcu(), the called function may be delayed quite
significantly, and without a matching rcu_barrier() there's no
way to be sure it has finished.
Therefore, global state that could be gone/freed/reused should
never be touched in the callback.

Fix this in mesh by moving the atomic_dec() into the caller;
that's not really a problem since we already unlinked the path
and it will be destroyed anyway.

This fixes a crash Jouni observed when running certain tests in
a certain order, in which the mesh interface was torn down, the
memory reused for a function pointer (work struct) and running
that then crashed since the pointer had been decremented by 1,
resulting in an invalid instruction byte stream.

Cc: stable@vger.kernel.org
Fixes: eb2b9311 ("mac80211: mesh path table implementation")
Reported-by: NJouni Malinen <j@w1.fi>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

c2e703a5

mac80211: don't advertise NL80211_FEATURE_FULL_AP_CLIENT_STATE · 45bb780a

由 Johannes Berg 提交于 11月 04, 2015

For now, this feature doesn't actually work. To avoid shipping a
kernel that has it enabled but where it can't be used disable it
for now - we can re-enable it when it's fixed.

This partially reverts 44674d9c ("mac80211: advertise support
for full station state in AP mode").

Cc: Ayala Beker <ayala.beker@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

45bb780a

16 11月, 2015 12 次提交

ipvs: use skb_to_full_sk() helper · 340c78e5

由 Eric Dumazet 提交于 11月 12, 2015

SYNACK packets might be attached to request sockets.

Use skb_to_full_sk() helper to avoid illegal accesses to
inet_sk(skb->sk)

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NSander Eikelenboom <linux@eikelenboom.it>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Acked-by: NSimon Horman <horms@verge.net.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

340c78e5

tcp: ensure proper barriers in lockless contexts · 00fd38d9

由 Eric Dumazet 提交于 11月 12, 2015

Some functions access TCP sockets without holding a lock and
might output non consistent data, depending on compiler and or
architecture.

tcp_diag_get_info(), tcp_get_info(), tcp_poll(), get_tcp4_sock() ...

Introduce sk_state_load() and sk_state_store() to fix the issues,
and more clearly document where this lack of locking is happening.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

00fd38d9

sctp: translate host order to network order when setting a hmacid · ed5a377d

由 lucien 提交于 11月 12, 2015

now sctp auth cannot work well when setting a hmacid manually, which
is caused by that we didn't use the network order for hmacid, so fix
it by adding the transformation in sctp_auth_ep_set_hmacs.

even we set hmacid with the network order in userspace, it still
can't work, because of this condition in sctp_auth_ep_set_hmacs():

		if (id > SCTP_AUTH_HMAC_ID_MAX)
			return -EOPNOTSUPP;

so this wasn't working before and thus it won't break compatibility.

Fixes: 65b07e5d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NVlad Yasevich <vyasevich@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ed5a377d

packet: fix tpacket_snd max frame len · 5cfb4c8d

由 Daniel Borkmann 提交于 11月 11, 2015

Since it's introduction in commit 69e3c75f ("net: TX_RING and
packet mmap"), TX_RING could be used from SOCK_DGRAM and SOCK_RAW
side. When used with SOCK_DGRAM only, the size_max > dev->mtu +
reserve check should have reserve as 0, but currently, this is
unconditionally set (in it's original form as dev->hard_header_len).

I think this is not correct since tpacket_fill_skb() would then
take dev->mtu and dev->hard_header_len into account for SOCK_DGRAM,
the extra VLAN_HLEN could be possible in both cases. Presumably, the
reserve code was copied from packet_snd(), but later on missed the
check. Make it similar as we have it in packet_snd().

Fixes: 69e3c75f ("net: TX_RING and packet mmap")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5cfb4c8d

packet: infer protocol from ethernet header if unset · c72219b7

由 Daniel Borkmann 提交于 11月 11, 2015

In case no struct sockaddr_ll has been passed to packet
socket's sendmsg() when doing a TX_RING flush run, then
skb->protocol is set to po->num instead, which is the protocol
passed via socket(2)/bind(2).

Applications only xmitting can go the path of allocating the
socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the
TX_RING with sll_protocol of 0. That way, register_prot_hook()
is neither called on creation nor on bind time, which saves
cycles when there's no interest in capturing anyway.

That leaves us however with po->num 0 instead and therefore
the TX_RING flush run sets skb->protocol to 0 as well. Eric
reported that this leads to problems when using tools like
trafgen over bonding device. I.e. the bonding's hash function
could invoke the kernel's flow dissector, which depends on
skb->protocol being properly set. In the current situation, all
the traffic is then directed to a single slave.

Fix it up by inferring skb->protocol from the Ethernet header
when not set and we have ARPHRD_ETHER device type. This is only
done in case of SOCK_RAW and where we have a dev->hard_header_len
length. In case of ARPHRD_ETHER devices, this is guaranteed to
cover ETH_HLEN, and therefore being accessed on the skb after
the skb_store_bits().
Reported-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c72219b7

packet: only allow extra vlan len on ethernet devices · 3c70c132

由 Daniel Borkmann 提交于 11月 11, 2015

Packet sockets can be used by various net devices and are not
really restricted to ARPHRD_ETHER device types. However, when
currently checking for the extra 4 bytes that can be transmitted
in VLAN case, our assumption is that we generally probe on
ARPHRD_ETHER devices. Therefore, before looking into Ethernet
header, check the device type first.

This also fixes the issue where non-ARPHRD_ETHER devices could
have no dev->hard_header_len in TX_RING SOCK_RAW case, and thus
the check would test unfilled linear part of the skb (instead
of non-linear).

Fixes: 57f89bfa ("network: Allow af_packet to transmit +4 bytes for VLAN packets.")
Fixes: 52f1454f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3c70c132

packet: always probe for transport header · 8fd6c80d

由 Daniel Borkmann 提交于 11月 11, 2015

We concluded that the skb_probe_transport_header() should better be
called unconditionally. Avoiding the call into the flow dissector has
also not really much to do with the direct xmit mode.

While it seems that only virtio_net code makes use of GSO from non
RX/TX ring packet socket paths, we should probe for a transport header
nevertheless before they hit devices.

Reference: http://thread.gmane.org/gmane.linux.network/386173/Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8fd6c80d

packet: do skb_probe_transport_header when we actually have data · efdfa2f7

由 Daniel Borkmann 提交于 11月 11, 2015

In tpacket_fill_skb() commit c1aad275 ("packet: set transport
header before doing xmit") and later on 40893fd0 ("net: switch
to use skb_probe_transport_header()") was probing for a transport
header on the skb from a ring buffer slot, but at a time, where
the skb has _not even_ been filled with data yet. So that call into
the flow dissector is pretty useless. Lets do it after we've set
up the skb frags.

Fixes: c1aad275 ("packet: set transport header before doing xmit")
Reported-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

efdfa2f7

ipv6: Check rt->dst.from for the DST_NOCACHE route · 02bcf4e0

由 Martin KaFai Lau 提交于 11月 11, 2015

All DST_NOCACHE rt6_info used to have rt->dst.from set to
its parent.

After commit 8e3d5be7 ("ipv6: Avoid double dst_free"),
DST_NOCACHE is also set to rt6_info which does not have
a parent (i.e. rt->dst.from is NULL).

This patch catches the rt->dst.from == NULL case.

Fixes: 8e3d5be7 ("ipv6: Avoid double dst_free")
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

02bcf4e0

ipv6: Check expire on DST_NOCACHE route · 5973fb1e

由 Martin KaFai Lau 提交于 11月 11, 2015

Since the expires of the DST_NOCACHE rt can be set during
the ip6_rt_update_pmtu(), we also need to consider the expires
value when doing ip6_dst_check().

This patches creates __rt6_check_expired() to only
check the expire value (if one exists) of the current rt.

In rt6_dst_from_check(), it adds __rt6_check_expired() as
one of the condition check.
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5973fb1e

ipv6: Avoid creating RTF_CACHE from a rt that is not managed by fib6 tree · 0d3f6d29

由 Martin KaFai Lau 提交于 11月 11, 2015

The original bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1272571

The setup has a IPv4 GRE tunnel running in a IPSec.  The bug
happens when ndisc starts sending router solicitation at the gre
interface.  The simplified oops stack is like:

__lock_acquire+0x1b2/0x1c30
lock_acquire+0xb9/0x140
_raw_write_lock_bh+0x3f/0x50
__ip6_ins_rt+0x2e/0x60
ip6_ins_rt+0x49/0x50
~~~~~~~~
__ip6_rt_update_pmtu.part.54+0x145/0x250
ip6_rt_update_pmtu+0x2e/0x40
~~~~~~~~
ip_tunnel_xmit+0x1f1/0xf40
__gre_xmit+0x7a/0x90
ipgre_xmit+0x15a/0x220
dev_hard_start_xmit+0x2bd/0x480
__dev_queue_xmit+0x696/0x730
dev_queue_xmit+0x10/0x20
neigh_direct_output+0x11/0x20
ip6_finish_output2+0x21f/0x770
ip6_finish_output+0xa7/0x1d0
ip6_output+0x56/0x190
~~~~~~~~
ndisc_send_skb+0x1d9/0x400
ndisc_send_rs+0x88/0xc0
~~~~~~~~

The rt passed to ip6_rt_update_pmtu() is created by
icmp6_dst_alloc() and it is not managed by the fib6 tree,
so its rt6i_table == NULL.  When __ip6_rt_update_pmtu() creates
a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL
and it causes the ip6_ins_rt() oops.

During pmtu update, we only want to create a RTF_CACHE clone
from a rt which is currently managed (or owned) by the
fib6 tree.  It means either rt->rt6i_node != NULL or
rt is a RTF_PCPU clone.

It is worth to note that rt6i_table may not be NULL even it is
not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()).
Hence, rt6i_node is a better check instead of rt6i_table.

Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu")
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Reported-by: NChris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
Cc: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d3f6d29

af-unix: fix use-after-free with concurrent readers while splicing · 73ed5d25

由 Hannes Frederic Sowa 提交于 11月 10, 2015

During splicing an af-unix socket to a pipe we have to drop all
af-unix socket locks. While doing so we allow another reader to enter
unix_stream_read_generic which can read, copy and finally free another
skb. If exactly this skb is just in process of being spliced we get a
use-after-free report by kasan.

First, we must make sure to not have a free while the skb is used during
the splice operation. We simply increment its use counter before unlocking
the reader lock.

Stream sockets have the nice characteristic that we don't care about
zero length writes and they never reach the peer socket's queue. That
said, we can take the UNIXCB.consumed field as the indicator if the
skb was already freed from the socket's receive queue. If the skb was
fully consumed after we locked the reader side again we know it has been
dropped by a second reader. We indicate a short read to user space and
abort the current splice operation.

This bug has been found with syzkaller
(http://github.com/google/syzkaller) by Dmitry Vyukov.

Fixes: 2b514574 ("net: af_unix: implement splice for stream af_unix sockets")
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73ed5d25

11 11月, 2015 4 次提交

netfilter: nf_tables: add clone interface to expression operations · 086f3321

由 Pablo Neira Ayuso 提交于 11月 10, 2015

With the conversion of the counter expressions to make it percpu, we
need to clone the percpu memory area, otherwise we crash when using
counters from flow tables.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

086f3321

netfilter: fix xt_TEE and xt_TPROXY dependencies · 74ec4d55

由 Arnd Bergmann 提交于 11月 10, 2015

Kconfig is too smart for its own good: a Kconfig line that states

	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES

means that if IP6_NF_IPTABLES is set to 'm', then NF_DEFRAG_IPV6 will
also be set to 'm', regardless of the state of the symbol from which
it is selected. When the xt_TEE driver is built-in and nothing else
forces NF_DEFRAG_IPV6 to be built-in, this causes a link-time error:

net/built-in.o: In function `tee_tg6':
net/netfilter/xt_TEE.c:46: undefined reference to `nf_dup_ipv6'

This works around that behavior by changing the dependency to
'if IP6_NF_IPTABLES != n', which is interpreted as boolean expression
rather than a tristate and causes the NF_DEFRAG_IPV6 symbol to
be built-in as well.

The bug only occurs once in thousands of 'randconfig' builds and
does not really impact real users. From inspecting the other
surrounding Kconfig symbols, I am guessing that NETFILTER_XT_TARGET_TPROXY
and NETFILTER_XT_MATCH_SOCKET have the same issue. If not, this
change should still be harmless.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

74ec4d55

netfilter: nfnetlink_log: work around uninitialized variable warning · c872a2d9

由 Arnd Bergmann 提交于 11月 10, 2015

After a recent (correct) change, gcc started warning about the use
of the 'flags' variable in nfulnl_recv_config()

net/netfilter/nfnetlink_log.c: In function 'nfulnl_recv_config':
net/netfilter/nfnetlink_log.c:320:14: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/netfilter/nfnetlink_log.c:828:6: note: 'flags' was declared here

The warning first shows up in ARM s3c2410_defconfig with gcc-4.3 or
higher (including 5.2.1, which is the latest version I checked) I
tried working around it by rearranging the code but had no success
with that.

As a last resort, this initializes the variable to zero, which shuts
up the warning, but means that we don't get a warning if the code
is ever changed in a way that actually causes the variable to be
used without first being written.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Fixes: 8cbc8708 ("netfilter: nfnetlink_log: validate dependencies to avoid breaking atomicity")
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

c872a2d9

Revert "bridge: Allow forward delay to be cfgd when STP enabled" · 8a921265

由 Vlad Yasevich 提交于 11月 10, 2015

This reverts commit 34c2d9fb.

There are 2 reasons for this revert:
 1)  The commit in question doesn't do what it says it does.  The
     description reads: "Allow bridge forward delay to be configured
     when Spanning Tree is enabled."  This was already the case before
     the commit was made.  What the commit actually do was disallow
     invalid values or 'forward_delay' when STP was turned off.

 2)  The above change was actually a change in the user observed
     behavior and broke things like libvirt and other network configs
     that set 'forward_delay' to 0 without enabling STP.  The value
     of 0 is actually used when STP is turned off to immediately mark
     the bridge as forwarding.
Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8a921265

10 11月, 2015 2 次提交

net: fix a race in dst_release() · d69bbf88

由 Eric Dumazet 提交于 11月 09, 2015

Only cpu seeing dst refcount going to 0 can safely
dereference dst->flags.

Otherwise an other cpu might already have freed the dst.

Fixes: 27b75c95 ("net: avoid RCU for NOCACHE dst")
Reported-by: NGreg Thelen <gthelen@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d69bbf88

remove abs64() · 79211c8e

由 Andrew Morton 提交于 11月 09, 2015

Switch everything to the new and more capable implementation of abs().
Mainly to give the new abs() a bit of a workout.

Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

79211c8e

09 11月, 2015 5 次提交

netfilter: Fix removal of GRE expectation entries created by PPTP · c255cb2e

由 Anthony Lineham 提交于 10月 22, 2015

The uninitialized tuple structure caused incorrect hash calculation
and the lookup failed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=106441Signed-off-by: NAnthony Lineham <anthony.lineham@alliedtelesis.co.nz>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

c255cb2e

netfilter: nft_meta: use skb_to_full_sk() helper · 3aed8225

由 Eric Dumazet 提交于 11月 08, 2015

SYNACK packets might be attached to request sockets.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3aed8225

net_sched: em_meta: use skb_to_full_sk() helper · 02a56c81

由 Eric Dumazet 提交于 11月 08, 2015

SYNACK packets might be attached to request sockets.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

02a56c81

sched: cls_flow: use skb_to_full_sk() helper · 743b2a66

由 Eric Dumazet 提交于 11月 08, 2015

SYNACK packets might be attached to request sockets.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

743b2a66

netfilter: xt_owner: use skb_to_full_sk() helper · fdd723e2

由 Eric Dumazet 提交于 11月 08, 2015

SYNACK packets might be attached to a request socket,
xt_owner wants to gte the listener in this case.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fdd723e2

07 11月, 2015 4 次提交

J
netfilter: ipset: Fix hash type expire: release empty hash bucket block · 0aae24eb
由 Jozsef Kadlecsik 提交于 11月 07, 2015
```
When all entries are expired/all slots are empty, release the bucket.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
```
0aae24eb

netfilter: ipset: Fix hash:* type expiration · e9dfdc05

由 Jozsef Kadlecsik 提交于 11月 07, 2015

Incorrect index was used when the data blob was shrinked at expiration,
which could lead to falsely expired entries and memory leak when
the comment extension was used too.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

e9dfdc05

netfilter: ipset: Fix extension alignment · 95ad1f4a

由 Jozsef Kadlecsik 提交于 11月 07, 2015

The data extensions in ipset lacked the proper memory alignment and
thus could lead to kernel crash on several architectures. Therefore
the structures have been reorganized and alignment attributes added
where needed. The patch was tested on armv7h by Gerhard Wiesinger and
on x86_64, sparc64 by Jozsef Kadlecsik.
Reported-by: NGerhard Wiesinger <lists@wiesinger.com>
Tested-by: NGerhard Wiesinger <lists@wiesinger.com>
Tested-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

95ad1f4a

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc

由 Mel Gorman 提交于 11月 06, 2015

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.
Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d0164adc

06 11月, 2015 4 次提交

tcp: use correct req pointer in tcp_move_syn() calls · 49a496c9

由 Eric Dumazet 提交于 11月 05, 2015

I mistakenly took wrong request sock pointer when calling tcp_move_syn()

@req_unhash is either a copy of @req, or a NULL value for
FastOpen connexions (as we do not expect to unhash the temporary
request sock from ehash table)

Fixes: 805c4bc0 ("tcp: fix req->saved_syn race")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Ying Cai <ycai@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

49a496c9

packet: race condition in packet_bind · 30f7ea1c

由 Francesco Ruggeri 提交于 11月 05, 2015

There is a race conditions between packet_notifier and packet_bind{_spkt}.

It happens if packet_notifier(NETDEV_UNREGISTER) executes between the
time packet_bind{_spkt} takes a reference on the new netdevice and the
time packet_do_bind sets po->ifindex.
In this case the notification can be missed.
If this happens during a dev_change_net_namespace this can result in the
netdevice to be moved to the new namespace while the packet_sock in the
old namespace still holds a reference on it. When the netdevice is later
deleted in the new namespace the deletion hangs since the packet_sock
is not found in the new namespace' &net->packet.sklist.
It can be reproduced with the script below.

This patch makes packet_do_bind check again for the presence of the
netdevice in the packet_sock's namespace after the synchronize_net
in unregister_prot_hook.
More in general it also uses the rcu lock for the duration of the bind
to stop dev_change_net_namespace/rollback_registered_many from
going past the synchronize_net following unlist_netdevice, so that
no NETDEV_UNREGISTER notifications can happen on the new netdevice
while the bind is executing. In order to do this some code from
packet_bind{_spkt} is consolidated into packet_do_dev.

import socket, os, time, sys
proto=7
realDev='em1'
vlanId=400
if len(sys.argv) > 1:
   vlanId=int(sys.argv[1])
dev='vlan%d' % vlanId

os.system('taskset -p 0x10 %d' % os.getpid())

s = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, proto)
os.system('ip link add link %s name %s type vlan id %d' %
          (realDev, dev, vlanId))
os.system('ip netns add dummy')

pid=os.fork()

if pid == 0:
   # dev should be moved while packet_do_bind is in synchronize net
   os.system('taskset -p 0x20000 %d' % os.getpid())
   os.system('ip link set %s netns dummy' % dev)
   os.system('ip netns exec dummy ip link del %s' % dev)
   s.close()
   sys.exit(0)

time.sleep(.004)
try:
   s.bind(('%s' % dev, proto+1))
except:
   print 'Could not bind socket'
   s.close()
   os.system('ip netns del dummy')
   sys.exit(0)

os.waitpid(pid, 0)
s.close()
os.system('ip netns del dummy')
sys.exit(0)
Signed-off-by: NFrancesco Ruggeri <fruggeri@arista.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

30f7ea1c

ipv4: use sk_fullsock() in ipv4_conntrack_defrag() · f668f5f7

由 Eric Dumazet 提交于 11月 05, 2015

Before converting a 'socket pointer' into inet socket,
use sk_fullsock() to detect timewait or request sockets.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Tested-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f668f5f7

tcp: fix req->saved_syn race · 805c4bc0

由 Eric Dumazet 提交于 11月 05, 2015

For the reasons explained in commit ce105008 ("tcp/dccp: fix
ireq->pktopts race"), we need to make sure we do not access
req->saved_syn unless we own the request sock.

This fixes races for listeners using TCP_SAVE_SYN option.

Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NYing Cai <ycai@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

805c4bc0

05 11月, 2015 1 次提交

ipv6: clean up dev_snmp6 proc entry when we fail to initialize inet6_dev · 2a189f9e

由 Sabrina Dubroca 提交于 11月 04, 2015

In ipv6_add_dev, when addrconf_sysctl_register fails, we do not clean up
the dev_snmp6 entry that we have already registered for this device.
Call snmp6_unregister_dev in this case.

Fixes: a317a2f1 ("ipv6: fail early when creating netdev named all or default")
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2a189f9e

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功