Commits · 76061f631c2ea4ab9c4d66f3a96ecc5737f5aaf7 · openanolis / cloud-kernel

09 Sep, 2016 1 commit

tcp: fastopen: avoid negative sk_forward_alloc · 76061f63

Authored by Eric Dumazet on Sep 07, 2016

When DATA and/or FIN are carried in a SYN/ACK message or SYN message,
we append an skb in socket receive queue, but we forget to call
sk_forced_mem_schedule().

Effect is that the socket has a negative sk->sk_forward_alloc as long as
the message is not read by the application.
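
The accounting problem can be pictured with a toy model. The sketch below is not kernel code: toy_sock, toy_forced_mem_schedule() and toy_charge() are hypothetical stand-ins, and the 4096-byte quantum is an assumption chosen for illustration. It only shows why charging a queued buffer without reserving memory first leaves the forward-allocation counter negative.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUANTUM 4096 /* assumed reservation granularity */

/* Hypothetical stand-in for the socket memory-accounting fields. */
struct toy_sock {
    int forward_alloc; /* bytes reserved but not yet charged */
    int rmem_alloc;    /* bytes charged to the receive queue */
};

/* Reserve enough whole quanta to cover 'size' bytes, unconditionally. */
void toy_forced_mem_schedule(struct toy_sock *sk, int size)
{
    assert(sk != NULL && size >= 0);
    if (size <= sk->forward_alloc)
        return;
    int quanta = (size + TOY_QUANTUM - 1) / TOY_QUANTUM;
    sk->forward_alloc += quanta * TOY_QUANTUM;
}

/* Charge a queued buffer against the current reservation. */
void toy_charge(struct toy_sock *sk, int size)
{
    assert(sk != NULL && size >= 0);
    sk->forward_alloc -= size;
    sk->rmem_alloc += size;
}

/* Queue a buffer without reserving first: the counter goes negative. */
int toy_queue_without_schedule(int size)
{
    struct toy_sock sk = { 0, 0 };
    toy_charge(&sk, size);
    return sk.forward_alloc;
}

/* Queue a buffer after reserving: the counter stays non-negative. */
int toy_queue_with_schedule(int size)
{
    struct toy_sock sk = { 0, 0 };
    toy_forced_mem_schedule(&sk, size);
    toy_charge(&sk, size);
    return sk.forward_alloc;
}
```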

Josh Hunt fixed a similar issue in commit d22e1537 ("tcp: fix tcp
fin memory accounting")

Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Sep, 2016 3 commits

ipv6: addrconf: fix dev refcont leak when DAD failed · 751eb6b6

Authored by Wei Yongjun on Sep 05, 2016

In general, when DAD detects a duplicate IPv6 address, ifp->state
is set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
delayed work. The call tree looks like this:

ndisc_recv_ns
  -> addrconf_dad_failure        <- missing ifp put
     -> addrconf_mod_dad_work
       -> schedule addrconf_dad_work()
         -> addrconf_dad_stop()  <- missing ifp hold before call it

addrconf_dad_failure() is called with an ifp refcount held but never
puts it. addrconf_dad_work() calls addrconf_dad_stop() without taking
an extra refcount. Normally this does not cause any issue.

But a race between addrconf_dad_failure() and addrconf_dad_work()
may leak an ifp refcount, so the netdevice cannot be unregistered.
dmesg shows the following messages:

IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
...
unregister_netdevice: waiting for eth0 to become free. Usage count = 1
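
The hold/put imbalance described above can be sketched with a toy reference counter. This is an illustration only, not the addrconf code: toy_ifaddr, toy_hold(), toy_put() and toy_dad_failure() are hypothetical names, and the "balanced" flag simply switches between a callee that drops the reference it was handed and one that does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an address object with a reference count. */
struct toy_ifaddr {
    int refcnt;
};

void toy_hold(struct toy_ifaddr *ifp)
{
    assert(ifp != NULL);
    ifp->refcnt++;
}

void toy_put(struct toy_ifaddr *ifp)
{
    assert(ifp != NULL && ifp->refcnt > 0);
    ifp->refcnt--;
}

/*
 * The caller passes in one reference. A balanced callee drops it before
 * returning; an unbalanced one forgets, and the count never returns to
 * the owner's baseline.
 */
void toy_dad_failure(struct toy_ifaddr *ifp, int balanced)
{
    /* ... mark the address as failed and schedule follow-up work ... */
    if (balanced)
        toy_put(ifp);
}

/* Returns the reference count left after one failure notification. */
int toy_refs_after_failure(int balanced)
{
    struct toy_ifaddr ifp = { 1 }; /* the owner's own reference */
    toy_hold(&ifp);                /* reference taken for the call */
    toy_dad_failure(&ifp, balanced);
    return ifp.refcnt;
}
```

In this model a leftover count above the owner's baseline corresponds to the "Usage count = 1" symptom: something still appears to hold the object, so teardown waits forever.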

Cc: stable@vger.kernel.org
Fixes: c15b1cca ("ipv6: move DAD and addrconf_verify processing to workqueue")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Don't delete routes in different VRFs · 5a56a0b3

Authored by Mark Tomlinson on Sep 05, 2016

When deleting an IP address from an interface, there is a clean-up of
routes which refer to this local address. However, there was no check to
see that the VRF matched. This meant that deletion wasn't confined to
the VRF it should have been.

To solve this, a new field has been added to fib_info to hold a table
id. When removing fib entries corresponding to a local ip address, this
table id is also used in the comparison.

The table id is populated when the fib_info is created. This was already
done in some places, but not in ip_rt_ioctl(). This has now been fixed.
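
A minimal sketch of the idea, assuming a simplified entry type: toy_fib_info and toy_sync_down_addr() are hypothetical names, not the kernel's structures or functions. The point is only that the clean-up compares the table id as well as the local address, so entries in another VRF's table are left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified route entry. */
struct toy_fib_info {
    uint32_t prefsrc; /* local (preferred source) address */
    uint32_t tb_id;   /* id of the routing table the entry lives in */
    int dead;         /* set when the entry is marked for removal */
};

/*
 * Mark entries that refer to 'local' as dead, but only those that
 * belong to table 'tb_id'. Returns the number of entries marked.
 */
int toy_sync_down_addr(struct toy_fib_info *fi, size_t n,
                       uint32_t tb_id, uint32_t local)
{
    int marked = 0;

    assert(fi != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (fi[i].prefsrc != local)
            continue;
        if (fi[i].tb_id != tb_id) /* same address, different VRF */
            continue;
        fi[i].dead = 1;
        marked++;
    }
    return marked;
}
```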

Fixes: 021dd3b8 ("net: Add routes to the table associated with the device")
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: release dst in ping_v6_sendmsg · 03c2778a

Authored by Dave Jones on Sep 02, 2016

Neither the failure nor the success path of ping_v6_sendmsg releases
the dst it acquires.  This leads to a flood of warnings from
"net/core/dst.c:288 dst_release" on older kernels that
don't have 8bf4ada2 backported.

That patch optimistically hoped this had been fixed post 3.10, but
it seems at least one case wasn't, where I've seen this triggered
a lot from machines doing unprivileged icmp sockets.

Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Sep, 2016 3 commits

af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' · 6e1ce3c3

Authored by Linus Torvalds on Sep 01, 2016

Right now we use the 'readlock' both for protecting some of the af_unix
IO path and for making the bind be single-threaded.

The two are independent, but using the same lock makes for a nasty
deadlock due to ordering with regard to filesystem locking. The bind
locking would want to nest outside the VFS pathname locking, but the IO
locking wants to nest inside some of those same locks.

We tried to fix this earlier with commit c845acb3 ("af_unix: Fix
splice-bind deadlock") which moved the readlock inside the vfs locks,
but that caused problems with overlayfs that will then call back into
filesystem routines that take the lock in the wrong order anyway.

Splitting the locks means that we can go back to having the bind lock be
the outermost lock, and we don't have any deadlocks with lock ordering.

Acked-by: Rainer Weikusat <rweikusat@cyberadapt.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "af_unix: Fix splice-bind deadlock" · 38f7bd94

Authored by Linus Torvalds on Sep 01, 2016

This reverts commit c845acb3.

It turns out that it just replaces one deadlock with another one: we can
still get the wrong lock ordering with the readlock due to overlayfs
calling back into the filesystem layer and still taking the vfs locks
after the readlock.

The proper solution ends up being to just split the readlock into two
pieces: the bind lock (taken *outside* the vfs locks) and the IO lock
(taken *inside* the filesystem locks).  The two locks are independent
anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Fix bonding crash · 24b27fc4

Authored by Mahesh Bandewar on Sep 01, 2016

The following steps crash the kernel:

  (a) Create bonding master
      > modprobe bonding miimon=50
  (b) Create macvlan bridge on eth2
      > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
	   type macvlan
  (c) Now try adding eth2 into the bond
      > echo +eth2 > /sys/class/net/bond0/bonding/slaves
      <crash>

Bonding does lots of things before checking if the device enslaved is
busy or not.

In this case, when the notifier call-chain sends notifications,
bond_netdev_event() assumes that rx_handler/rx_handler_data are
registered, while bond_enslave() hasn't progressed far enough to
register the rx_handler for the new slave.

This patch adds an rx_handler check that can be performed right at the
beginning of the enslave code to avoid getting into this situation.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Sep, 2016 1 commit

l2tp: fix use-after-free during module unload · 2f86953e

Authored by Sabrina Dubroca on Sep 02, 2016

Tunnel deletion is delayed by both a workqueue (l2tp_tunnel_delete -> wq
 -> l2tp_tunnel_del_work) and RCU (sk_destruct -> RCU ->
l2tp_tunnel_destruct).

By the time l2tp_tunnel_destruct() runs to destroy the tunnel and finish
destroying the socket, the private data reserved via the net_generic
mechanism has already been freed, but l2tp_tunnel_destruct() actually
uses this data.

Make sure tunnel deletion for the netns has completed before returning
from l2tp_exit_net() by first flushing the tunnel removal workqueue, and
then waiting for RCU callbacks to complete.

Fixes: 167eb17e ("l2tp: create tunnel sockets in the right namespace")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Sep, 2016 7 commits

ipv6: Don't unset flowi6_proto in ipxip6_tnl_xmit() · ab343801

Authored by Eli Cooper on Aug 26, 2016

Commit 8eb30be0 ("ipv6: Create ip6_tnl_xmit") unsets
flowi6_proto in ip4ip6_tnl_xmit() and ip6ip6_tnl_xmit().
Since xfrm_selector_match() relies on this info, IPv6 packets
sent by an ip6tunnel cannot be properly selected by their
protocols after removing it. This patch puts flowi6_proto back.

Cc: stable@vger.kernel.org
Fixes: 8eb30be0 ("ipv6: Create ip6_tnl_xmit")
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rps: flow_dissector: Fix uninitialized flow_keys used in __skb_get_hash possibly · 635c223c

Authored by Gao Feng on Aug 31, 2016

The original code depends on function arguments being evaluated from
left to right. But the C standard does not actually define the order in
which arguments are evaluated.

When flow_keys_have_l4(&keys) is invoked before ___skb_get_hash(skb, &keys,
hashrnd) with some compilers or environment, the keys passed to
flow_keys_have_l4 is not initialized.
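
The hazard and one way around it can be sketched in plain C. Everything named toy_* below is hypothetical and only mimics the shape of the problem; it is not the flow dissector code. The fragile form passes both calls as arguments of one call, where C leaves their evaluation order unspecified. The safer form runs the initializing call in its own statement first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the dissected flow keys. */
struct toy_keys {
    int ports;
    bool filled;
};

struct toy_result {
    unsigned int hash;
    bool l4;
};

/* Fills 'keys' as a side effect and returns a hash derived from them. */
unsigned int toy_dissect_and_hash(struct toy_keys *keys, int ports)
{
    assert(keys != NULL);
    keys->ports = ports;
    keys->filled = true;
    return 0x9e3779b9u ^ (unsigned int)ports;
}

/* Only meaningful after 'keys' has been filled. */
bool toy_keys_have_l4(const struct toy_keys *keys)
{
    assert(keys != NULL);
    return keys->filled && keys->ports != 0;
}

void toy_set_hash(struct toy_result *res, unsigned int hash, bool l4)
{
    assert(res != NULL);
    res->hash = hash;
    res->l4 = l4;
}

struct toy_result toy_get_hash(int ports)
{
    /* Zero-initialized here so the toy never reads garbage. */
    struct toy_keys keys = { 0, false };
    struct toy_result res;

    /*
     * Fragile: the two argument expressions may run in either order.
     *   toy_set_hash(&res, toy_dissect_and_hash(&keys, ports),
     *                toy_keys_have_l4(&keys));
     */

    /* Safer: a separate statement sequences the fill before the read. */
    unsigned int hash = toy_dissect_and_hash(&keys, ports);
    toy_set_hash(&res, hash, toy_keys_have_l4(&keys));
    return res;
}
```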

Fixes: 6db61d79 ("flow_dissector: Ignore flow dissector return value from ___skb_get_hash")
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fastopen: fix rcv_wup initialization for TFO server on SYN/data · 28b346cb

Authored by Neal Cardwell on Aug 30, 2016

Yuchung noticed that on the first TFO server data packet sent after
the (TFO) handshake, the server echoed the TCP timestamp value in the
SYN/data instead of the timestamp value in the final ACK of the
handshake. This problem did not happen on regular opens.

The tcp_replace_ts_recent() logic that decides whether to remember an
incoming TS value needs tp->rcv_wup to hold the latest receive
sequence number that we have ACKed (latest tp->rcv_nxt we have
ACKed). This commit fixes this issue by ensuring that a TFO server
properly updates tp->rcv_wup to match tp->rcv_nxt at the time it sends
a SYN/ACK for the SYN/data.

Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: don't increment tx_dropped in br_do_proxy_arp · 85a3d4a9

Authored by Nikolay Aleksandrov on Aug 30, 2016

pskb_may_pull may fail for various reasons (e.g. alloc failure), but the
skb isn't changed/dropped and processing continues, so we shouldn't
increment tx_dropped.

CC: Kyeyoon Park <kyeyoonp@codeaurora.org>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: bridge@lists.linux-foundation.org
Fixes: 95850116 ("bridge: Add support for IEEE 802.11 Proxy ARP")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netconf: add a notif when settings are created · 29c994e3

Authored by Nicolas Dichtel on Aug 30, 2016

All changes are notified, but the initial state was missing.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: add missing netconf notif when 'all' is updated · d26c638c

Authored by Nicolas Dichtel on Aug 30, 2016

The 'default' value was not advertised.

Fixes: f3a1bfb1 ("rtnl/ipv6: use netconf msg to advertise forwarding status")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix random link resets while adding a second bearer · d2f394dc

Authored by Parthasarathy Bhuvaragan on Sep 01, 2016

In a dual bearer configuration, if the second tipc link becomes
active while the first link still has pending nametable "bulk"
updates, it randomly leads to reset of the second link.

When a link is established, the function named_distribute()
fills the skb, based on the node mtu (which allows room for
TUNNEL_PROTOCOL), with a NAME_DISTRIBUTOR message for each PUBLICATION.
However, the function named_distribute() allocates the buffer by
increasing the node mtu by INT_H_SIZE (to insert NAME_DISTRIBUTOR).
This consumes the space allocated for TUNNEL_PROTOCOL.

When establishing the second link, the link shall tunnel all the
messages in the first link queue, including the "bulk" update.
Because the size of the NAME_DISTRIBUTOR messages exceeds the link mtu
once they are tunnelled, the transmission fails (-EMSGSIZE).

Thus, the synch point based on the message count of the tunnel
packets is never reached, leading to a link timeout.

In this commit, we adjust the size of the name distributor messages so
that they can be tunnelled.

Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Sep, 2016 2 commits

kcm: fix a socket double free · c0338aff

Authored by WANG Cong on Aug 28, 2016

Dmitry reported a double free on kcm socket, which could
be easily reproduced by:

	#include <unistd.h>
	#include <sys/syscall.h>

	int main()
	{
	  int fd = syscall(SYS_socket, 0x29ul, 0x5ul, 0x0ul, 0, 0, 0);
	  syscall(SYS_ioctl, fd, 0x89e2ul, 0x20a98000ul, 0, 0, 0);
	  return 0;
	}

This is because on the error path, after we install
the new socket file, we call sock_release() to clean
up the socket, which leaves the fd pointing to a freed
socket. Fix this by calling sys_close() on that fd
directly.

Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: re-introduce 'fix parsing of MLDv2 reports' · 9264251e

Authored by Davide Caratti on Aug 31, 2016

commit bc8c20ac ("bridge: multicast: treat igmpv3 report with
INCLUDE and no sources as a leave") seems to have accidentally reverted
commit 47cc84ce ("bridge: fix parsing of MLDv2 reports"). This
commit brings back a change to br_ip6_multicast_mld2_report() where
parsing of MLDv2 reports stops when the first group is successfully
added to the MDB cache.

Fixes: bc8c20ac ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Aug, 2016 2 commits

netfilter: nf_tables_netdev: remove redundant ip_hdr assignment · c73c2484

Authored by Liping Zhang on Aug 28, 2016

We already use skb_header_pointer to get the IP header pointer,
so there's no need to call ip_hdr again. Moreover, in the NETDEV INGRESS
hook the IP header may not be linear, so using ip_hdr is not appropriate;
remove it.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mac80211: TDLS: don't require beaconing for AP BW · 554d072e

Authored by Arik Nemtsov on Aug 29, 2016

Stop downgrading TDLS chandef when reaching the AP BW. The AP provides
the necessary regulatory protection in this case.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=153961, which
reported an infinite loop here.

Reported-by: Kamil Toman <kamil.toman@gmail.com>
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

26 Aug, 2016 4 commits

qdisc: fix a module refcount leak in qdisc_create_dflt() · 166ee5b8

Authored by Eric Dumazet on Aug 24, 2016

Should qdisc_alloc() fail, we must release the module refcount
we got right before.
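
The error-path rule can be sketched with a toy module counter. The names toy_module, toy_try_module_get(), toy_module_put() and toy_create_dflt() are hypothetical, and the allocation outcome is passed in as a flag purely so both paths can be exercised. This is not the qdisc code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a module with a reference count. */
struct toy_module {
    int refcnt;
};

bool toy_try_module_get(struct toy_module *m)
{
    assert(m != NULL);
    m->refcnt++;
    return true;
}

void toy_module_put(struct toy_module *m)
{
    assert(m != NULL && m->refcnt > 0);
    m->refcnt--;
}

/*
 * Take a module reference, then "allocate". If the allocation fails,
 * give the reference back before returning; otherwise the reference
 * stays with the newly created object. Returns 0 on success, -1 on error.
 */
int toy_create_dflt(struct toy_module *owner, bool alloc_ok)
{
    if (!toy_try_module_get(owner))
        return -1;

    if (!alloc_ok) {
        toy_module_put(owner); /* without this, the count would leak */
        return -1;
    }
    return 0;
}
```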

Fixes: 6da7c8fc ("qdisc: allow setting default queuing discipline")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix the error handling in tipc_udp_enable() · a5de125d

Authored by Wei Yongjun on Aug 24, 2016

Return a negative error code in the enable_mcast() error handling
case, and release the udp socket when necessary.

Fixes: d0f91938 ("tipc: add ip/udp media type")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Bluetooth: Fix hci_sock_recvmsg when MSG_TRUNC is not set · 4f34228b

Authored by Luiz Augusto von Dentz on Aug 15, 2016

Similar to bt_sock_recvmsg, MSG_TRUNC shall be checked using the original
flags, not msg_flags.

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Fix bt_sock_recvmsg when MSG_TRUNC is not set · 90a56f72

Authored by Luiz Augusto von Dentz on Aug 12, 2016

Commit b5f34f94 attempted to introduce proper handling for MSG_TRUNC,
but recv and its variants should still work like read if no flag is
passed. Because the code may set MSG_TRUNC in msg->msg_flags, that field
shall not be used for the check: doing so may make the code behave as if
MSG_TRUNC were always set. Instead, this change uses the flags
parameter, which contains the original flags.

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

25 Aug, 2016 6 commits

netfilter: ebtables: put module reference when an incorrect extension is found · 4249fc1f

Authored by Sabrina Dubroca on Aug 23, 2016

commit bcf49342 ("netfilter: ebtables: Fix extension lookup with
identical name") added a second lookup in case the extension that was
found during the first lookup matched another extension with the same
name, but didn't release the reference on the incorrect module.

Fixes: bcf49342 ("netfilter: ebtables: Fix extension lookup with identical name")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nft_meta: improve the validity check of pkttype set expr · 960fa72f