提交 · 07a5d38453599052aff0877b16bb9c1585f08609 · openeuler / Kernel

07 1月, 2016 1 次提交

net: possible use after free in dst_release · 07a5d384

由 Francesco Ruggeri 提交于 1月 06, 2016

dst_release should not access dst->flags after decrementing
__refcnt to 0. The dst_entry may be in dst_busy_list and
dst_gc_task may dst_destroy it before dst_release gets a chance
to access dst->flags.

Fixes: d69bbf88 ("net: fix a race in dst_release()")
Fixes: 27b75c95 ("net: avoid RCU for NOCACHE dst")
Signed-off-by: NFrancesco Ruggeri <fruggeri@arista.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07a5d384

06 1月, 2016 2 次提交

net: sched: fix missing free per cpu on qstats · 73c20a8b

由 John Fastabend 提交于 1月 05, 2016

When a qdisc is using per cpu stats (currently just the ingress
qdisc) only the bstats are being freed. This also free's the qstats.

Fixes: b0ab6f92 ("net: sched: enable per cpu qstats")
Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73c20a8b

bridge: Only call /sbin/bridge-stp for the initial network namespace · ff621985

由 Hannes Frederic Sowa 提交于 1月 05, 2016

[I stole this patch from Eric Biederman. He wrote:]

> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the
> same network device could cause problems.

[Hannes: changed patch using netns_eq]

Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ff621985

05 1月, 2016 2 次提交

af_unix: Fix splice-bind deadlock · c845acb3

由 Rainer Weikusat 提交于 1月 03, 2016

On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice
system call and AF_UNIX sockets,

http://lists.openwall.net/netdev/2015/11/06/24

The situation was analyzed as

(a while ago) A: socketpair()
B: splice() from a pipe to /mnt/regular_file
	does sb_start_write() on /mnt
C: try to freeze /mnt
	wait for B to finish with /mnt
A: bind() try to bind our socket to /mnt/new_socket_name
	lock our socket, see it not bound yet
	decide that it needs to create something in /mnt
	try to do sb_start_write() on /mnt, block (it's
	waiting for C).
D: splice() from the same pipe to our socket
	lock the pipe, see that socket is connected
	try to lock the socket, block waiting for A
B:	get around to actually feeding a chunk from
	pipe to file, try to lock the pipe.  Deadlock.

on 2015/11/10 by Al Viro,

http://lists.openwall.net/netdev/2015/11/10/4

The patch fixes this by removing the kern_path_create related code from
unix_mknod and executing it as part of unix_bind prior acquiring the
readlock of the socket in question. This means that A (as used above)
will sb_start_write on /mnt before it acquires the readlock, hence, it
won't indirectly block B which first did a sb_start_write and then
waited for a thread trying to acquire the readlock. Consequently, A
being blocked by C waiting for B won't cause a deadlock anymore
(effectively, both A and B acquire two locks in opposite order in the
situation described above).

Dmitry Vyukov(<dvyukov@google.com>) tested the original patch.
Signed-off-by: NRainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c845acb3

net: Propagate lookup failure in l3mdev_get_saddr to caller · b5bdacf3

由 David Ahern 提交于 1月 04, 2016

Commands run in a vrf context are not failing as expected on a route lookup:
    root@kenny:~# ip ro ls table vrf-red
    unreachable default

    root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
    ping: Warning: source address might be selected on device other than vrf-red.
    PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.

    --- 10.100.1.254 ping statistics ---
    2 packets transmitted, 0 received, 100% packet loss, time 999ms

Since the vrf table does not have a route for 10.100.1.254 the ping
should have failed. The saddr lookup causes a full VRF table lookup.
Propogating a lookup failure to the user allows the command to fail as
expected:

    root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
    connect: No route to host
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b5bdacf3

31 12月, 2015 2 次提交

sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close · 068d8bd3

由 Xin Long 提交于 12月 29, 2015

In sctp_close, sctp_make_abort_user may return NULL because of memory
allocation failure. If this happens, it will bypass any state change
and never free the assoc. The assoc has no chance to be freed and it
will be kept in memory with the state it had even after the socket is
closed by sctp_close().

So if sctp_make_abort_user fails to allocate memory, we should abort
the asoc via sctp_primitive_ABORT as well. Just like the annotation in
sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
"Even if we can't send the ABORT due to low memory delete the TCB.
This is a departure from our typical NOMEM handling".

But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
dereference the chunk pointer, and system crash. So we should add
SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
places where it adds SCTP_CMD_REPLY cmd.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

068d8bd3

net, socket, socket_wq: fix missing initialization of flags · 574aab1e

由 Nicolai Stange 提交于 12月 29, 2015

Commit ceb5d58b ("net: fix sock_wake_async() rcu protection") from
the current 4.4 release cycle introduced a new flags member in
struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
from struct socket's flags member into that new place.

Unfortunately, the new flags field is never initialized properly, at least
not for the struct socket_wq instance created in sock_alloc_inode().

One particular issue I encountered because of this is that my GNU Emacs
failed to draw anything on my desktop -- i.e. what I got is a transparent
window, including the title bar. Bisection lead to the commit mentioned
above and further investigation by means of strace told me that Emacs
is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
reproducible 100% of times and the fact that properly initializing the
struct socket_wq ->flags fixes the issue leads me to the conclusion that
somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
preventing my Emacs from receiving any SIGIO's due to data becoming
available and it got stuck.

Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
member to zero.

Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection")
Signed-off-by: NNicolai Stange <nicstange@gmail.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

574aab1e

30 12月, 2015 1 次提交

openvswitch: Fix template leak in error cases. · 90c7afc9

由 Joe Stringer 提交于 12月 23, 2015

Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
reference leak on helper objects, but inadvertently introduced a leak on
the ct template.

Previously, ct_info.ct->general.use was initialized to 0 by
nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
returned successful. If an error occurred while adding the helper or
adding the action to the actions buffer, the __ovs_ct_free_action()
cleanup would use nf_ct_put() to free the entry; However, this relies on
atomic_dec_and_test(ct_info.ct->general.use). This reference must be
incremented first, or nf_ct_put() will never free it.

Fix the issue by acquiring a reference to the template immediately after
allocation.

Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action")
Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
Signed-off-by: NJoe Stringer <joe@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

90c7afc9

28 12月, 2015 2 次提交

sctp: label accepted/peeled off sockets · 3538a5c8

由 Marcelo Ricardo Leitner 提交于 12月 23, 2015

Accepted or peeled off sockets were missing a security label (e.g.
SELinux) which means that socket was in "unlabeled" state.

This patch clones the sock's label from the parent sock and resolves the
issue (similar to AF_BLUETOOTH protocol family).

Cc: Paul Moore <pmoore@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NPaul Moore <paul@paul-moore.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3538a5c8

sctp: use GFP_USER for user-controlled kmalloc · 9ba0b963

由 Marcelo Ricardo Leitner 提交于 12月 23, 2015

Commit cacc0621 ("sctp: use GFP_USER for user-controlled kmalloc")
missed two other spots.

For connectx, as it's more likely to be used by kernel users of the API,
it detects if GFP_USER should be used or not.

Fixes: cacc0621 ("sctp: use GFP_USER for user-controlled kmalloc")
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ba0b963

24 12月, 2015 1 次提交

ipv6: honor ifindex in case we receive ll addresses in router advertisements · c1a9a291

由 Hannes Frederic Sowa 提交于 12月 23, 2015

Marc Haber reported we don't honor interface indexes when we receive link
local router addresses in router advertisements. Luckily the non-strict
version of ipv6_chk_addr already does the correct job here, so we can
simply use it to lighten the checks and use those addresses by default
without any configuration change.

Link: <http://permalink.gmane.org/gmane.linux.network/391348>
Reported-by: NMarc Haber <mh+netdev@zugschlus.de>
Cc: Marc Haber <mh+netdev@zugschlus.de>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c1a9a291

23 12月, 2015 3 次提交

addrconf: always initialize sysctl table data · 5449a5ca

由 WANG Cong 提交于 12月 21, 2015

When sysctl performs restrict writes, it allows to write from
a middle position of a sysctl file, which requires us to initialize
the table data before calling proc_dostring() for the write case.

Fixes: 3d1bec99 ("ipv6: introduce secret_stable to ipv6_devconf")
Reported-by: NSasha Levin <sasha.levin@oracle.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5449a5ca

ipv6/addrlabel: fix ip6addrlbl_get() · e459dfee

由 Andrey Ryabinin 提交于 12月 21, 2015

ip6addrlbl_get() has never worked. If ip6addrlbl_hold() succeeded,
ip6addrlbl_get() will exit with '-ESRCH'. If ip6addrlbl_hold() failed,
ip6addrlbl_get() will use about to be free ip6addrlbl_entry pointer.

Fix this by inverting ip6addrlbl_hold() check.

Fixes: 2a8cc6c8 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: NCong Wang <cwang@twopensource.com>
Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e459dfee

switchdev: bridge: Pass ageing time as clock_t instead of jiffies · ef9cdd0f

由 Ido Schimmel 提交于 12月 21, 2015

The bridge's ageing time is offloaded to hardware when:
	1) A port joins a bridge
	2) The ageing time of the bridge is changed

In the first case the ageing time is offloaded as jiffies, but in the
second case it's offloaded as clock_t, which is what existing switchdev
drivers expect to receive.

Fixes: 6ac311ae ("Adding switchdev ageing notification on port bridged")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ef9cdd0f

19 12月, 2015 2 次提交

openvswitch: correct encoding of set tunnel action attributes · e905eabc

由 Simon Horman 提交于 12月 18, 2015

In a set action tunnel attributes should be encoded in a
nested action.

I noticed this because ovs-dpctl was reporting an error
when dumping flows due to the incorrect encoding of tunnel attributes
in a set action.

Fixes: fc4099f1 ("openvswitch: Fix egress tunnel info.")
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e905eabc

ipip: ioctl: Remove superfluous IP-TTL handling. · 6d3c348a

由 Pravin B Shelar 提交于 12月 17, 2015

IP-TTL case is already handled in ip_tunnel_ioctl() API.
Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6d3c348a

18 12月, 2015 5 次提交

netfilter: nft_ct: include direction when dumping NFT_CT_L3PROTOCOL key · d5f79b6e

由 Florian Westphal 提交于 12月 18, 2015

one nft userspace test case fails with

'ct l3proto original ipv4' mismatches 'ct l3proto ipv4'

... because NFTA_CT_DIRECTION attr is missing.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

d5f79b6e

netfilter: nf_tables: use skb->protocol instead of assuming ethernet header · aa47e42c

由 Pablo Neira Ayuso 提交于 12月 15, 2015

Otherwise we may end up with incorrect network and transport header for
other protocols.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

aa47e42c

net: check both type and procotol for tcp sockets · ac5cc977

由 WANG Cong 提交于 12月 16, 2015

Dmitry reported the following out-of-bound access:

Call Trace:
 [<ffffffff816cec2e>] __asan_report_load4_noabort+0x3e/0x40
mm/kasan/report.c:294
 [<ffffffff84affb14>] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880
 [<     inline     >] SYSC_setsockopt net/socket.c:1746
 [<ffffffff84aed7ee>] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729
 [<ffffffff85c18c76>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

This is because we mistake a raw socket as a tcp socket.
We should check both sk->sk_type and sk->sk_protocol to ensure
it is a tcp socket.

Willem points out __skb_complete_tx_timestamp() needs to fix as well.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ac5cc977

tcp: restore fastopen with no data in SYN packet · 07e100f9

由 Eric Dumazet 提交于 12月 16, 2015

Yuchung tracked a regression caused by commit 57be5bda ("ip: convert
tcp_sendmsg() to iov_iter primitives") for TCP Fast Open.

Some Fast Open users do not actually add any data in the SYN packet.

Fixes: 57be5bda ("ip: convert tcp_sendmsg() to iov_iter primitives")
Reported-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07e100f9

af_unix: Revert 'lock_interruptible' in stream receive code · 3822b5c2

由 Rainer Weikusat 提交于 12月 16, 2015

With b3ca9b02, the AF_UNIX SOCK_STREAM
receive code was changed from using mutex_lock(&u->readlock) to
mutex_lock_interruptible(&u->readlock) to prevent signals from being
delayed for an indefinite time if a thread sleeping on the mutex
happened to be selected for handling the signal. But this was never a
problem with the stream receive code (as opposed to its datagram
counterpart) as that never went to sleep waiting for new messages with the
mutex held and thus, wouldn't cause secondary readers to block on the
mutex waiting for the sleeping primary reader. As the interruptible
locking makes the code more complicated in exchange for no benefit,
change it back to using mutex_lock.
Signed-off-by: NRainer Weikusat <rweikusat@mobileactivedefense.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3822b5c2

17 12月, 2015 1 次提交

fou: clean up socket with kfree_rcu · 3036facb

由 Hannes Frederic Sowa 提交于 12月 15, 2015

fou->udp_offloads is managed by RCU. As it is actually included inside
the fou sockets, we cannot let the memory go out of scope before a grace
period. We either can synchronize_rcu or switch over to kfree_rcu to
manage the sockets. kfree_rcu seems appropriate as it is used by vxlan
and geneve.

Fixes: 23461551 ("fou: Support for foo-over-udp RX path")
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3036facb

16 12月, 2015 4 次提交

ipv6: automatically enable stable privacy mode if stable_secret set · 9b29c696

由 Hannes Frederic Sowa 提交于 12月 15, 2015

Bjørn reported that while we switch all interfaces to privacy stable mode
when setting the secret, we don't set this mode for new interfaces. This
does not make sense, so change this behaviour.

Fixes: 622c81d5 ("ipv6: generation of stable privacy addresses for link-local and autoconf")
Reported-by: NBjørn Mork <bjorn@mork.no>
Cc: Bjørn Mork <bjorn@mork.no>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b29c696

net: fix uninitialized variable issue · 130ed5d1

由 tadeusz.struk@intel.com 提交于 12月 15, 2015

msg_iocb needs to be initialized on the recv/recvfrom path.
Otherwise afalg will wrongly interpret it as an async call.

Cc: stable@vger.kernel.org
Reported-by: NHarald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

130ed5d1

D
bluetooth: Validate socket address length in sco_sock_bind(). · 5233252f
由 David S. Miller 提交于 12月 15, 2015
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
5233252f

net_sched: make qdisc_tree_decrease_qlen() work for non mq · 225734de

由 Eric Dumazet 提交于 12月 15, 2015

Stas Nichiporovich reported a regression in his HFSC qdisc setup
on a non multi queue device.

It turns out I mistakenly added a TCQ_F_NOPARENT flag on all qdisc
allocated in qdisc_create() for non multi queue devices, which was
rather buggy. I was clearly mislead by the TCQ_F_ONETXQUEUE that is
also set here for no good reason, since it only matters for the root
qdisc.

Fixes: 4eaf3b84 ("net_sched: fix qdisc_tree_decrease_qlen() races")
Reported-by: NStas Nichiporovich <stasn77@gmail.com>
Tested-by: NStas Nichiporovich <stasn77@gmail.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

225734de

15 12月, 2015 10 次提交

mac80211: handle width changes from opmode notification IE in beacon · cf1e05c6

由 Eyal Shapira 提交于 12月 08, 2015

An AP can send an operating channel width change in a beacon
opmode notification IE as long as there's a change in the nss as
well (See 802.11ac-2013 section 10.41).
So don't limit updating to nss only from an opmode notification IE.
Signed-off-by: NEyal Shapira <eyalx.shapira@intel.com>
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

cf1e05c6

mac80211: suppress unchanged "limiting TX power" messages · a87da0cb

由 Johannes Berg 提交于 12月 08, 2015

When the AP is advertising limited TX power, the message can be
printed over and over again. Suppress it when the power level
isn't changing.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=106011Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

a87da0cb

mac80211: reprogram in interface order · 1ea2c864

由 Johannes Berg 提交于 12月 08, 2015

During reprogramming, mac80211 currently first adds all the channel
contexts, then binds them to the vifs and then goes to reconfigure
all the interfaces. Drivers might, perhaps implicitly, rely on the
operation order for certain things that typically happen within a
single function elsewhere in mac80211. To avoid problems with that,
reorder the code in mac80211's restart/reprogramming to work fully
within the interface loop so that the order of operations is like
in normal operation.

For iwlwifi, this fixes a firmware crash when reprogramming with an
AP/GO interface active.
Reported-by: NDavid Spinadel <david.spinadel@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

1ea2c864

mac80211: run scan completed work on reconfig failure · 74430f94

由 Johannes Berg 提交于 12月 08, 2015

When reconfiguration during resume fails while a scan is pending
for completion work, that work will never run, and the scan will
be stuck forever. Factor out the code to recover this and call it
also in ieee80211_handle_reconfig_failure().
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

74430f94

nl80211: Fix potential memory leak in nl80211_connect · 707554b4

由 Ola Olsson 提交于 12月 11, 2015

Free cached keys if the last early return path is taken.
Signed-off-by: NOla Olsson <ola.olsson@sonymobile.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

707554b4

nl80211: Fix potential memory leak in nl80211_set_wowlan · e5dbe070

由 Ola Olsson 提交于 12月 12, 2015

Compared to cfg80211_rdev_free_wowlan in core.h,
the error goto label lacks the freeing of nd_config.
Fix that.
Signed-off-by: NOla Olsson <ola.olsson@sonymobile.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

e5dbe070

nl80211: fix a few memory leaks in reg.c · 09d11800

由 Ola Olsson 提交于 12月 13, 2015

The first leak occurs when entering the default case
in the switch for the initiator in set_regdom.
The second leaks a platform_device struct if the
platform registration in regulatory_init succeeds but
the sub sequent regulatory hint fails due to no memory.
Signed-off-by: NOla Olsson <ola.olsson@sonymobile.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

09d11800

skbuff: Fix offset error in skb_reorder_vlan_header · f6548615

由 Vlad Yasevich 提交于 12月 14, 2015

skb_reorder_vlan_header is called after the vlan header has
been pulled.  As a result the offset of the begining of
the mac header has been incrased by 4 bytes (VLAN_HLEN).
When moving the mac addresses, include this incrase in
the offset calcualation so that the mac addresses are
copied correctly.

Fixes: a6e18ff1 (vlan: Fix untag operations of stacked vlans with REORDER_HEADER off)
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: NVladislav Yasevich <vyasevich@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6548615

net: fix IP early demux races · 5037e9ef

由 Eric Dumazet 提交于 12月 14, 2015

David Wilder reported crashes caused by dst reuse.

<quote David>
  I am seeing a crash on a distro V4.2.3 kernel caused by a double
  release of a dst_entry.  In ipv4_dst_destroy() the call to
  list_empty() finds a poisoned next pointer, indicating the dst_entry
  has already been removed from the list and freed. The crash occurs
  18 to 24 hours into a run of a network stress exerciser.
</quote>

Thanks to his detailed report and analysis, we were able to understand
the core issue.

IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.

When socket cache is not properly set, we want to store into
sk->sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.

Problem is this acquisition is simply using an atomic_inc(),
which works well, unless the dst was queued for destruction from
dst_release() noticing dst refcount went to zero, if DST_NOCACHE
was set on dst.

We need to make sure current refcount is not zero before incrementing
it, or risk double free as David reported.

This patch, being a stable candidate, adds two new helpers, and use
them only from IP early demux problematic paths.

It might be possible to merge in net-next skb_dst_force() and
skb_dst_force_safe(), but I prefer having the smallest patch for stable
kernels : Maybe some skb_dst_force() callers do not expect skb->dst
can suddenly be cleared.

Can probably be backported back to linux-3.6 kernels
Reported-by: NDavid J. Wilder <dwilder@us.ibm.com>
Tested-by: NDavid J. Wilder <dwilder@us.ibm.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5037e9ef

net: add validation for the socket syscall protocol argument · 79462ad0

由 Hannes Frederic Sowa 提交于 12月 14, 2015

郭永刚 reported that one could simply crash the kernel as root by
using a simple program:

	int socket_fd;
	struct sockaddr_in addr;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = INADDR_ANY;
	addr.sin_family = 10;

	socket_fd = socket(10,3,0x40000000);
	connect(socket_fd , &addr,16);

AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.

This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.

kernel: Call Trace:
kernel:  [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
kernel:  [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
kernel:  [<ffffffff81645069>] SYSC_connect+0xd9/0x110
kernel:  [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
kernel:  [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
kernel:  [<ffffffff81645e0e>] SyS_connect+0xe/0x10
kernel:  [<ffffffff81779515>] tracesys_phase2+0x84/0x89

I found no particular commit which introduced this problem.

CVE: CVE-2015-8543
Cc: Cong Wang <cwang@twopensource.com>
Reported-by: N郭永刚 <guoyonggang@360.cn>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79462ad0

14 12月, 2015 3 次提交

net: Flush local routes when device changes vrf association · 7f49e7a3

由 David Ahern 提交于 12月 10, 2015

The VRF driver cycles netdevs when an interface is enslaved or released:
the down event is used to flush neighbor and route tables and the up
event (if the interface was already up) effectively moves local and
connected routes to the proper table.

As of 4f823def the local route is left hanging around after a link
down, so when a netdev is moved from one VRF to another (or released
from a VRF altogether) local routes are left in the wrong table.

Fix by handling the NETDEV_CHANGEUPPER event. When the upper dev is
an L3mdev then call fib_disable_ip to flush all routes, local ones
to.

Fixes: 4f823def ("ipv4: fix to not remove local route on link down")
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7f49e7a3

sched/wait: Fix the signal handling fix · dfd01f02

由 Peter Zijlstra 提交于 12月 13, 2015

Jan Stancek reported that I wrecked things for him by fixing things for
Vladimir :/

His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which
should not be possible, however my previous patch made this possible by
unconditionally checking signal_pending().

We cannot use current->state as was done previously, because the
instruction after the store to that variable it can be changed.  We must
instead pass the initial state along and use that.

Fixes: 68985633 ("sched/wait: Fix signal handling in bit wait helpers")
Reported-by: NJan Stancek <jstancek@redhat.com>
Reported-by: NChris Mason <clm@fb.com>
Tested-by: NJan Stancek <jstancek@redhat.com>
Tested-by: NVladimir Murzin <vladimir.murzin@arm.com>
Tested-by: NChris Mason <clm@fb.com>
Reviewed-by: NPaul Turner <pjt@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: tglx@linutronix.de
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: hpa@zytor.com
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dfd01f02

netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort · a907e36d

由 Xin Long 提交于 12月 07, 2015

When we use 'nft -f' to submit rules, it will build multiple rules into
one netlink skb to send to kernel, kernel will process them one by one.
meanwhile, it add the trans into commit_list to record every commit.
if one of them's return value is -EAGAIN, status |= NFNL_BATCH_REPLAY
will be marked. after all the process is done. it will roll back all the
commits.

now kernel use list_add_tail to add trans to commit, and use
list_for_each_entry_safe to roll back. which means the order of adding
and rollback is the same. that will cause some cases cannot work well,
even trigger call trace, like:

1. add a set into table foo  [return -EAGAIN]:
   commit_list = 'add set trans'
2. del foo:
   commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans'
then nf_tables_abort will be called to roll back:
firstly process 'add set trans':
                   case NFT_MSG_NEWSET:
                        trans->ctx.table->use--;
                        list_del_rcu(&nft_trans_set(trans)->list);

  it will del the set from the table foo, but it has removed when del
  table foo [step 2], then the kernel will panic.

the right order of rollback should be:
  'del tab trans' -> 'del set trans' -> 'add set trans'.
which is opposite with commit_list order.

so fix it by rolling back commits with reverse order in nf_tables_abort.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

a907e36d

12 12月, 2015 1 次提交

mpls: make via address optional for multipath routes · f20367df

由 Robert Shearman 提交于 12月 10, 2015

The via address is optional for a single path route, yet is mandatory
when the multipath attribute is used:

  # ip -f mpls route add 100 dev lo
  # ip -f mpls route add 101 nexthop dev lo
  RTNETLINK answers: Invalid argument

Make them consistent by making the via address optional when the
RTA_MULTIPATH attribute is being parsed so that both forms of
specifying the route work.
Signed-off-by: NRobert Shearman <rshearma@brocade.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f20367df

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功