提交 · 469f87e158628fe66dcbbce9dd5e7b7acfe934a9 · openanolis / cloud-kernel

17 6月, 2017 1 次提交

ip_tunnel: fix potential issue in ip_tunnel_rcv · 469f87e1

由 Haishuang Yan 提交于 6月 15, 2017

When ip_tunnel_rcv fails, the tun_dst won't be freed, so call
dst_release to free it in error code path.

Fixes: 2e15ea39 ("ip_gre: Add support to collect tunnel metadata.")
Acked-by: NEric Dumazet <edumazet@google.com>
Acked-by: NPravin B Shelar <pshelar@ovn.org>
Tested-by: NZhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

469f87e1

16 6月, 2017 2 次提交

sctp: return next obj by passing pos + 1 into sctp_transport_get_idx · 988c7322

由 Xin Long 提交于 6月 15, 2017

In sctp_for_each_transport, pos is used to save how many objs it has
dumped. Now it gets the last obj by sctp_transport_get_idx, then gets
the next obj by sctp_transport_get_next.

The issue is that in the meanwhile if some objs in transport hashtable
are removed and the objs nums are less than pos, sctp_transport_get_idx
would return NULL and hti.walker.tbl is NULL as well. At this moment
it should stop hti, instead of continue getting the next obj. Or it
would cause a NULL pointer dereference in sctp_transport_get_next.

This patch is to pass pos + 1 into sctp_transport_get_idx to get the
next obj directly, even if pos > objs nums, it would return NULL and
stop hti.

Fixes: 626d16f5 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

988c7322

rxrpc: Fix several cases where a padded len isn't checked in ticket decode · 5f2f9765

由 David Howells 提交于 6月 15, 2017

This fixes CVE-2017-7482.

When a kerberos 5 ticket is being decoded so that it can be loaded into an
rxrpc-type key, there are several places in which the length of a
variable-length field is checked to make sure that it's not going to
overrun the available data - but the data is padded to the nearest
four-byte boundary and the code doesn't check for this extra.  This could
lead to the size-remaining variable wrapping and the data pointer going
over the end of the buffer.

Fix this by making the various variable-length data checks use the padded
length.
Reported-by: N石磊 <shilei-c@360.cn>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NMarc Dionne <marc.c.dionne@auristor.com>
Reviewed-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f2f9765

15 6月, 2017 4 次提交

ipv6: fix calling in6_ifa_hold incorrectly for dad work · f8a894b2

由 Xin Long 提交于 6月 15, 2017

Now when starting the dad work in addrconf_mod_dad_work, if the dad work
is idle and queued, it needs to hold ifa.

The problem is there's one gap in [1], during which if the pending dad work
is removed elsewhere. It will miss to hold ifa, but the dad word is still
idea and queue.

        if (!delayed_work_pending(&ifp->dad_work))
                in6_ifa_hold(ifp);
                    <--------------[1]
        mod_delayed_work(addrconf_wq, &ifp->dad_work, delay);

An use-after-free issue can be caused by this.

Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in
net6_ifa_finish_destroy was hit because of it.

As Hannes' suggestion, this patch is to fix it by holding ifa first in
addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if
the dad_work is already in queue.

Note that this patch did not choose to fix it with:

  if (!mod_delayed_work(delay))
          in6_ifa_hold(ifp);

As with it, when delay == 0, dad_work would be scheduled immediately, all
addrconf_mod_dad_work(0) callings had to be moved under ifp->lock.
Reported-by: NWei Chen <weichen@redhat.com>
Suggested-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f8a894b2

net: don't global ICMP rate limit packets originating from loopback · 849a44de

由 Jesper Dangaard Brouer 提交于 6月 14, 2017

Florian Weimer seems to have a glibc test-case which requires that
loopback interfaces does not get ICMP ratelimited.  This was broken by
commit c0303efe ("net: reduce cycles spend on ICMP replies that
gets rate limited").

An ICMP response will usually be routed back-out the same incoming
interface.  Thus, take advantage of this and skip global ICMP
ratelimit when the incoming device is loopback.  In the unlikely event
that the outgoing it not loopback, due to strange routing policy
rules, ICMP rate limiting still works via peer ratelimiting via
icmpv4_xrlim_allow().  Thus, we should still comply with RFC1812
(section 4.3.2.8 "Rate Limiting").

This seems to fix the reproducer given by Florian.  While still
avoiding to perform expensive and unneeded outgoing route lookup for
rate limited packets (in the non-loopback case).

Fixes: c0303efe ("net: reduce cycles spend on ICMP replies that gets rate limited")
Reported-by: NFlorian Weimer <fweimer@redhat.com>
Reported-by: N"H.J. Lu" <hjl.tools@gmail.com>
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

849a44de

net/act_pedit: fix an error code · c4f65b09

由 Dan Carpenter 提交于 6月 14, 2017

I'm reviewing static checker warnings where we do ERR_PTR(0), which is
the same as NULL. I'm pretty sure we intended to return ERR_PTR(-EINVAL)
here. Sometimes these bugs lead to a NULL dereference but I don't
immediately see that problem here.

Fixes: 71d0ed70 ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Acked-by: NAmir Vadai <amir@vadai.me>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c4f65b09

net_sched: move tcf_lock down after gen_replace_estimator() · 74030603

由 WANG Cong 提交于 6月 13, 2017

Laura reported a sleep-in-atomic kernel warning inside
tcf_act_police_init() which calls gen_replace_estimator() with
spinlock protection.

It is not necessary in this case, we already have RTNL lock here
so it is enough to protect concurrent writers. For the reader,
i.e. tcf_act_police(), it needs to make decision based on this
rate estimator, in the worst case we drop more/less packets than
necessary while changing the rate in parallel, it is still acceptable.
Reported-by: NLaura Abbott <labbott@redhat.com>
Reported-by: NNick Huber <nicholashuber@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

74030603

14 6月, 2017 2 次提交

caif: Add sockaddr length check before accessing sa_family in connect handler · 20a3d5bf

由 Mateusz Jurczyk 提交于 6月 13, 2017

Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in the connect()
handler of the AF_CAIF socket. Since the syscall doesn't enforce a minimum
size of the corresponding memory region, very short sockaddrs (zero or one
byte long) result in operating on uninitialized memory while referencing
sa_family.
Signed-off-by: NMateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20a3d5bf

igmp: acquire pmc lock for ip_mc_clear_src() · c38b7d32

由 WANG Cong 提交于 6月 12, 2017

Andrey reported a use-after-free in add_grec():

        for (psf = *psf_list; psf; psf = psf_next) {
		...
                psf_next = psf->sf_next;

where the struct ip_sf_list's were already freed by:

 kfree+0xe8/0x2b0 mm/slub.c:3882
 ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
 ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
 ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
 inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
 sock_release+0x8d/0x1e0 net/socket.c:597
 sock_close+0x16/0x20 net/socket.c:1072

This happens because we don't hold pmc->lock in ip_mc_clear_src()
and a parallel mr_ifc_timer timer could jump in and access them.

The RCU lock is there but it is merely for pmc itself, this
spinlock could actually ensure we don't access them in parallel.

Thanks to Eric and Long for discussion on this bug.
Reported-by: NAndrey Konovalov <andreyknvl@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c38b7d32

13 6月, 2017 8 次提交

net: rps: fix uninitialized symbol warning · 97d8b6e3

由 Ashwanth Goli 提交于 6月 13, 2017

This patch fixes uninitialized symbol warning that
got introduced by the following commit
773fc8f6 ("net: rps: send out pending IPI's on CPU hotplug")
Signed-off-by: NAshwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97d8b6e3

mac80211: don't send SMPS action frame in AP mode when not needed · b3dd8279

由 Emmanuel Grumbach 提交于 6月 10, 2017

mac80211 allows to modify the SMPS state of an AP both,
when it is started, and after it has been started. Such a
change will trigger an action frame to all the peers that
are currently connected, and will be remembered so that
new peers will get notified as soon as they connect (since
the SMPS setting in the beacon may not be the right one).

This means that we need to remember the SMPS state
currently requested as well as the SMPS state that was
configured initially (and advertised in the beacon).
The former is bss->req_smps and the latter is
sdata->smps_mode.

Initially, the AP interface could only be started with
SMPS_OFF, which means that sdata->smps_mode was SMPS_OFF
always. Later, a nl80211 API was added to be able to start
an AP with a different AP mode. That code forgot to update
bss->req_smps and because of that, if the AP interface was
started with SMPS_DYNAMIC, we had:
   sdata->smps_mode = SMPS_DYNAMIC
   bss->req_smps = SMPS_OFF

That configuration made mac80211 think it needs to fire off
an action frame to any new station connecting to the AP in
order to let it know that the actual SMPS configuration is
SMPS_OFF.

Fix that by properly setting bss->req_smps in
ieee80211_start_ap.

Fixes: f6993174 ("mac80211: set smps_mode according to ap params")
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

b3dd8279

mac80211/wpa: use constant time memory comparison for MACs · 98c67d18

由 Jason A. Donenfeld 提交于 6月 10, 2017

Otherwise, we enable all sorts of forgeries via timing attack.
Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: linux-wireless@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

98c67d18

mac80211: set bss_info data before configuring the channel · c87905be

由 Johannes Berg 提交于 6月 10, 2017

When mac80211 changes the channel, it also calls into the driver's
bss_info_changed() callback, e.g. with BSS_CHANGED_IDLE. The driver
may, like iwlwifi does, access more data from bss_info in that case
and iwlwifi accesses the basic_rates bitmap, but if changing from a
band with more (basic) rates to one with fewer, an out-of-bounds
access of the rate array may result.

While we can't avoid having invalid data at some point in time, we
can avoid having it while we call the driver - so set up all the
data before configuring the channel, and then apply it afterwards.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195677Reported-by: NJohannes Hirte <johannes.hirte@datenkhaos.de>
Tested-by: NJohannes Hirte <johannes.hirte@datenkhaos.de>
Debugged-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

c87905be

mac80211: remove 5/10 MHz rate code from station MLME · 44f6d42c

由 Johannes Berg 提交于 6月 10, 2017

There's no need for the station MLME code to handle bitrates for 5
or 10 MHz channels when it can't ever create such a configuration.
Remove the unnecessary code.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

44f6d42c

mac80211: Fix incorrect condition when checking rx timestamp · 204a7dbc

由 Avraham Stern 提交于 6月 12, 2017

If the driver reports the rx timestamp at PLCP start, mac80211 can
only handle legacy encoding, but the code checks that the encoding
is not legacy. Fix this.

Fixes: da6a4352 ("mac80211: separate encoding/bandwidth from flags")
Signed-off-by: NAvraham Stern <avraham.stern@intel.com>
Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

204a7dbc

mac80211: don't look at the PM bit of BAR frames · 769dc04d

由 Emmanuel Grumbach 提交于 6月 08, 2017

When a peer sends a BAR frame with PM bit clear, we should
not modify its PM state as madated by the spec in
802.11-20012 10.2.1.2.

Cc: stable@vger.kernel.org
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

769dc04d

hsr: fix incorrect warning · 675c8da0

由 Karicheri, Muralidharan 提交于 6月 12, 2017

When HSR interface is setup using ip link command, an annoying warning
appears with the trace as below:-

[  203.019828] hsr_get_node: Non-HSR frame
[  203.019833] Modules linked in:
[  203.019848] CPU: 0 PID: 158 Comm: sd-resolve Tainted: G        W       4.12.0-rc3-00052-g9fa6bf70 #2
[  203.019853] Hardware name: Generic DRA74X (Flattened Device Tree)
[  203.019869] [<c0110280>] (unwind_backtrace) from [<c010c2f4>] (show_stack+0x10/0x14)
[  203.019880] [<c010c2f4>] (show_stack) from [<c04b9f64>] (dump_stack+0xac/0xe0)
[  203.019894] [<c04b9f64>] (dump_stack) from [<c01374e8>] (__warn+0xd8/0x104)
[  203.019907] [<c01374e8>] (__warn) from [<c0137548>] (warn_slowpath_fmt+0x34/0x44)
root@am57xx-evm:~# [  203.019921] [<c0137548>] (warn_slowpath_fmt) from [<c081126c>] (hsr_get_node+0x148/0x170)
[  203.019932] [<c081126c>] (hsr_get_node) from [<c0814240>] (hsr_forward_skb+0x110/0x7c0)
[  203.019942] [<c0814240>] (hsr_forward_skb) from [<c0811d64>] (hsr_dev_xmit+0x2c/0x34)
[  203.019954] [<c0811d64>] (hsr_dev_xmit) from [<c06c0828>] (dev_hard_start_xmit+0xc4/0x3bc)
[  203.019963] [<c06c0828>] (dev_hard_start_xmit) from [<c06c13d8>] (__dev_queue_xmit+0x7c4/0x98c)
[  203.019974] [<c06c13d8>] (__dev_queue_xmit) from [<c0782f54>] (ip6_finish_output2+0x330/0xc1c)
[  203.019983] [<c0782f54>] (ip6_finish_output2) from [<c0788f0c>] (ip6_output+0x58/0x454)
[  203.019994] [<c0788f0c>] (ip6_output) from [<c07b16cc>] (mld_sendpack+0x420/0x744)

As this is an expected path to hsr_get_node() with frame coming from
the master interface, add a check to ensure packet is not from the
master port and then warn.
Signed-off-by: NMurali Karicheri <m-karicheri2@ti.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

675c8da0

12 6月, 2017 2 次提交

proc: snmp6: Use correct type in memset · 3500cd73

由 Christian Perle 提交于 6月 12, 2017

Reading /proc/net/snmp6 yields bogus values on 32 bit kernels.
Use "u64" instead of "unsigned long" in sizeof().

Fixes: 4a4857b1 ("proc: Reduce cache miss in snmp6_seq_show")
Signed-off-by: NChristian Perle <christian.perle@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3500cd73

net: ipmr: Fix some mroute forwarding issues in vrf's · 4b1f0d33

由 Donald Sharp 提交于 6月 10, 2017

This patch fixes two issues:

1) When forwarding on *,G mroutes that are in a vrf, the
kernel was dropping information about the actual incoming
interface when calling ip_mr_forward from ip_mr_input.
This caused ip_mr_forward to send the multicast packet
back out the incoming interface.  Fix this by
modifying ip_mr_forward to be handed the correctly
resolved dev.

2) When a unresolved cache entry is created we store
the incoming skb on the unresolved cache entry and
upon mroute resolution from the user space daemon,
we attempt to forward the packet.  Again we were
not resolving to the correct incoming device for
a vrf scenario, before calling ip_mr_forward.
Fix this by resolving to the correct interface
and calling ip_mr_forward with the result.

Fixes: e58e4159 ("net: Enable support for VRF with ipv4 multicast")
Signed-off-by: NDonald Sharp <sharpd@cumulusnetworks.com>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: NYotam Gigi <yotamg@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b1f0d33

11 6月, 2017 4 次提交

net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse · 343eba69

由 Jia-Ju Bai 提交于 6月 10, 2017

The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
function call path is:
tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
  tipc_rcv
    tipc_sk_rcv
      tipc_msg_reverse
        pskb_expand_head(GFP_KERNEL) --> may sleep
tipc_node_broadcast
  tipc_node_xmit_skb
    tipc_node_xmit
      tipc_sk_rcv
        tipc_msg_reverse
          pskb_expand_head(GFP_KERNEL) --> may sleep

To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: NJia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

343eba69

net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx · f146e872

由 Jia-Ju Bai 提交于 6月 10, 2017

The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
function call path is:
cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
  cfctrl_linkdown_req
    cfpkt_create
      cfpkt_create_pfx
        alloc_skb(GFP_KERNEL) --> may sleep
cfserl_receive (acquire the lock by rcu_read_lock)
  cfpkt_split
    cfpkt_create_pfx
      alloc_skb(GFP_KERNEL) --> may sleep

There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
"GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function
is called under a rcu read lock, instead in interrupt.

To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.
Signed-off-by: NJia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f146e872

sctp: disable BH in sctp_for_each_endpoint · 581409da

由 Xin Long 提交于 6月 10, 2017

Now sctp holds read_lock when foreach sctp_ep_hashtable without disabling
BH. If CPU schedules to another thread A at this moment, the thread A may
be trying to hold the write_lock with disabling BH.

As BH is disabled and CPU cannot schedule back to the thread holding the
read_lock, while the thread A keeps waiting for the read_lock. A dead
lock would be triggered by this.

This patch is to fix this dead lock by calling read_lock_bh instead to
disable BH when holding the read_lock in sctp_for_each_endpoint.

Fixes: 626d16f5 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Reported-by: NXiumei Mu <xmu@redhat.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

581409da

l2tp: cast l2tp traffic counter to unsigned · 9b3dc0a1

由 Dominik Heidler 提交于 6月 09, 2017

This fixes a counter problem on 32bit systems:
When the rx_bytes counter reached 2 GiB, it jumpd to (2^64 Bytes - 2GiB) Bytes.

rtnl_link_stats64 has __u64 type and atomic_long_read returns
atomic_long_t which is signed. Due to the conversation
we get an incorrect value on 32bit systems if the MSB of
the atomic_long_t value is set.

CC: Tom Parkin <tparkin@katalix.com>
Fixes: 7b7c0719 ("l2tp: avoid deadlock in l2tp stats update")
Signed-off-by: NDominik Heidler <dheidler@suse.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b3dc0a1

10 6月, 2017 3 次提交

mac80211: free netdev on dev_alloc_name() error · c7a61cba

由 Johannes Berg 提交于 6月 09, 2017

The change to remove free_netdev() from ieee80211_if_free()
erroneously didn't add the necessary free_netdev() for when
ieee80211_if_free() is called directly in one place, rather
than as the priv_destructor. Add the missing call.

Fixes: cf124db5 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c7a61cba

net: rps: send out pending IPI's on CPU hotplug · 773fc8f6

由 ashwanth@codeaurora.org 提交于 6月 09, 2017

IPI's from the victim cpu are not handled in dev_cpu_callback.
So these pending IPI's would be sent to the remote cpu only when
NET_RX is scheduled on the victim cpu and since this trigger is
unpredictable it would result in packet latencies on the remote cpu.

This patch add support to send the pending ipi's of victim cpu.
Signed-off-by: NAshwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

773fc8f6

Fix an intermittent pr_emerg warning about lo becoming free. · f186ce61

由 Krister Johansen 提交于 6月 08, 2017

It looks like this:

Message from syslogd@flamingo at Apr 26 00:45:00 ...
 kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4

They seem to coincide with net namespace teardown.

The message is emitted by netdev_wait_allrefs().

Forced a kdump in netdev_run_todo, but found that the refcount on the lo
device was already 0 at the time we got to the panic.

Used bcc to check the blocking in netdev_run_todo.  The only places
where we're off cpu there are in the rcu_barrier() and msleep() calls.
That behavior is expected.  The msleep time coincides with the amount of
time we spend waiting for the refcount to reach zero; the rcu_barrier()
wait times are not excessive.

After looking through the list of callbacks that the netdevice notifiers
invoke in this path, it appears that the dst_dev_event is the most
interesting.  The dst_ifdown path places a hold on the loopback_dev as
part of releasing the dev associated with the original dst cache entry.
Most of our notifier callbacks are straight-forward, but this one a)
looks complex, and b) places a hold on the network interface in
question.

I constructed a new bcc script that watches various events in the
liftime of a dst cache entry.  Note that dst_ifdown will take a hold on
the loopback device until the invalidated dst entry gets freed.

[      __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
    __dst_free
    rcu_nocb_kthread
    kthread
    ret_from_fork
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f186ce61

09 6月, 2017 3 次提交

af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers · defbcf2d

由 Mateusz Jurczyk 提交于 6月 08, 2017

Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Signed-off-by: NMateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

defbcf2d

can: af_can: namespace support: fix lockdep splat: properly initialize spin_lock · 74b7b490

由 Marc Kleine-Budde 提交于 6月 06, 2017

This patch uses spin_lock_init() instead of __SPIN_LOCK_UNLOCKED() to
initialize the per namespace net->can.can_rcvlists_lock lock to fix this
lockdep warning:

| INFO: trying to register non-static key.
| the code is fine but needs lockdep annotation.
| turning off the locking correctness validator.
| CPU: 0 PID: 186 Comm: candump Not tainted 4.12.0-rc3+ #47
| Hardware name: Marvell Kirkwood (Flattened Device Tree)
| [<c0016644>] (unwind_backtrace) from [<c00139a8>] (show_stack+0x18/0x1c)
| [<c00139a8>] (show_stack) from [<c0058c8c>] (register_lock_class+0x1e4/0x55c)
| [<c0058c8c>] (register_lock_class) from [<c005bdfc>] (__lock_acquire+0x148/0x1990)
| [<c005bdfc>] (__lock_acquire) from [<c005deec>] (lock_acquire+0x174/0x210)
| [<c005deec>] (lock_acquire) from [<c04a6780>] (_raw_spin_lock+0x50/0x88)
| [<c04a6780>] (_raw_spin_lock) from [<bf02116c>] (can_rx_register+0x94/0x15c [can])
| [<bf02116c>] (can_rx_register [can]) from [<bf02a868>] (raw_enable_filters+0x60/0xc0 [can_raw])
| [<bf02a868>] (raw_enable_filters [can_raw]) from [<bf02ac14>] (raw_enable_allfilters+0x2c/0xa0 [can_raw])
| [<bf02ac14>] (raw_enable_allfilters [can_raw]) from [<bf02ad38>] (raw_bind+0xb0/0x250 [can_raw])
| [<bf02ad38>] (raw_bind [can_raw]) from [<c03b5fb8>] (SyS_bind+0x70/0xac)
| [<c03b5fb8>] (SyS_bind) from [<c000f8c0>] (ret_fast_syscall+0x0/0x1c)

Cc: Mario Kicherer <dev@kicherer.org>
Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

74b7b490

ila_xlat: add missing hash secret initialization · 0db47e3d

由 Arnd Bergmann 提交于 6月 08, 2017

While discussing the possible merits of clang warning about unused initialized
functions, I found one function that was clearly meant to be called but
never actually is.

__ila_hash_secret_init() initializes the hash value for the ila locator,
apparently this is intended to prevent hash collision attacks, but this ends
up being a read-only zero constant since there is no caller. I could find
no indication of why it was never called, the earliest patch submission
for the module already was like this. If my interpretation is right, we
certainly want to backport the patch to stable kernels as well.

I considered adding it to the ila_xlat_init callback, but for best effect
the random data is read as late as possible, just before it is first used.
The underlying net_get_random_once() is already highly optimized to avoid
overhead when called frequently.

Fixes: 7f00feaf ("ila: Add generic ILA translation facility")
Cc: stable@vger.kernel.org
Link: https://www.spinics.net/lists/kernel/msg2527243.htmlSigned-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0db47e3d

08 6月, 2017 7 次提交

net: ipv6: Release route when device is unregistering · 8397ed36

由 David Ahern 提交于 6月 07, 2017

Roopa reported attempts to delete a bond device that is referenced in a
multipath route is hanging:

$ ifdown bond2    # ifupdown2 command that deletes virtual devices
unregister_netdevice: waiting for bond2 to become free. Usage count = 2

Steps to reproduce:
    echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
    ip link add dev bond12 type bond
    ip link add dev bond13 type bond
    ip addr add 2001:db8:2::0/64 dev bond12
    ip addr add 2001:db8:3::0/64 dev bond13
    ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
    ip link del dev bond12
    ip link del dev bond13

The root cause is the recent change to keep routes on a linkdown. Update
the check to detect when the device is unregistering and release the
route for that case.

Fixes: a1a22c12 ("net: ipv6: Keep nexthop of multipath route on admin down")
Reported-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid Ahern <dsahern@gmail.com>
Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8397ed36

net: Zero ifla_vf_info in rtnl_fill_vfinfo() · 0eed9cf5

由 Mintz, Yuval 提交于 6月 07, 2017

Some of the structure's fields are not initialized by the
rtnetlink. If driver doesn't set those in ndo_get_vf_config(),
they'd leak memory to user.
Signed-off-by: NYuval Mintz <Yuval.Mintz@cavium.com>
CC: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: NGreg Rose <gvrose8192@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0eed9cf5

decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb · dd0da17b

由 Mateusz Jurczyk 提交于 6月 07, 2017

Verify that the length of the socket buffer is sufficient to cover the
nlmsghdr structure before accessing the nlh->nlmsg_len field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.

The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: NMateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd0da17b

Revert "decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb" · c164772d

由 David S. Miller 提交于 6月 08, 2017

This reverts commit 85eac2ba.

There is an updated version of this fix which we should
use instead.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c164772d

decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb · 85eac2ba

由 Mateusz Jurczyk 提交于 6月 07, 2017

Verify that the length of the socket buffer is sufficient to cover the
entire nlh->nlmsg_len field before accessing that field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.

85eac2ba

net: Fix inconsistent teardown and release of private netdev state. · cf124db5

由 David S. Miller 提交于 5月 08, 2017

Network devices can allocate reasources and private memory using
netdev_ops->ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops->ndo_uninit() or netdev->destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops->ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev->destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev->destructor()
almost universally does also a free_netdev().

This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops->ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
it is not able to invoke netdev->destructor().

This is because netdev->destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev->destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev->destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().

netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
and netdev->priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev->priv_destructor() and optionally call free_netdev().
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf124db5

net: don't call strlen on non-terminated string in dev_set_alias() · c28294b9

由 Alexander Potapenko 提交于 6月 06, 2017

KMSAN reported a use of uninitialized memory in dev_set_alias(),
which was caused by calling strlcpy() (which in turn called strlen())
on the user-supplied non-terminated string.
Signed-off-by: NAlexander Potapenko <glider@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c28294b9

07 6月, 2017 2 次提交

net: bridge: fix a null pointer dereference in br_afspec · 1020ce31

由 Nikolay Aleksandrov 提交于 6月 06, 2017

We might call br_afspec() with p == NULL which is a valid use case if
the action is on the bridge device itself, but the bridge tunnel code
dereferences the p pointer without checking, so check if p is null
first.
Reported-by: NGustavo A. R. Silva <garsilva@embeddedor.com>
Fixes: efa5356b ("bridge: per vlan dst_metadata netlink support")
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1020ce31

net/ipv6: Fix CALIPSO causing GPF with datagram support · e3ebdb20

由 Richard Haines 提交于 6月 05, 2017

When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the
IP header may have moved.

Also update the payload length after adding the CALIPSO option.
Signed-off-by: NRichard Haines <richard_c_haines@btinternet.com>
Acked-by: NPaul Moore <paul@paul-moore.com>
Signed-off-by: NHuw Davies <huw@codeweavers.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e3ebdb20

06 6月, 2017 1 次提交

Revert "sit: reload iphdr in ipip6_rcv" · f4eb17e1

由 David S. Miller 提交于 6月 06, 2017

This reverts commit b699d003.

As per Eric Dumazet, the pskb_may_pull() is a NOP in this
particular case, so the 'iph' reload is unnecessary.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f4eb17e1

05 6月, 2017 1 次提交

devlink: fix potential memort leak · 6044bd4a

由 Haishuang Yan 提交于 6月 05, 2017

We must free allocated skb when genlmsg_put() return fails.

Fixes: 1555d204 ("devlink: Support for pipeline debug (dpipe)")
Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6044bd4a

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功