提交 · 7254ad094f4a7d624d988d3095e6cb94f3e9afe6 · openanolis / cloud-kernel

03 4月, 2019 28 次提交

ila: Fix rhashtable walker list corruption · 7254ad09

由 Herbert Xu 提交于 3月 26, 2019

[ Upstream commit b5f9bd15b88563b55a99ed588416881367a0ce5f ]

ila_xlat_nl_cmd_flush uses rhashtable walkers allocated from the
stack but it never frees them.  This corrupts the walker list of
the hash table.

This patch fixes it.

Reported-by: syzbot+dae72a112334aa65a159@syzkaller.appspotmail.com
Fixes: b6e71bde ("ila: Flush netlink command to clear xlat...")
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

7254ad09

vxlan: Don't call gro_cells_destroy() before device is unregistered · 979f8a67

由 Zhiqiang Liu 提交于 3月 16, 2019

[ Upstream commit cc4807bb609230d8959fd732b0bf3bd4c2de8eac ]

Commit ad6c9986bcb62 ("vxlan: Fix GRO cells race condition between
receive and link delete") fixed a race condition for the typical case a vxlan
device is dismantled from the current netns. But if a netns is dismantled,
vxlan_destroy_tunnels() is called to schedule a unregister_netdevice_queue()
of all the vxlan tunnels that are related to this netns.

In vxlan_destroy_tunnels(), gro_cells_destroy() is called and finished before
unregister_netdevice_queue(). This means that the gro_cells_destroy() call is
done too soon, for the same reasons explained in above commit.

So we need to fully respect the RCU rules, and thus must remove the
gro_cells_destroy() call or risk use after-free.

Fixes: 58ce31cc ("vxlan: GRO support at tunnel layer")
Signed-off-by: NSuanming.Mou <mousuanming@huawei.com>
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
Reviewed-by: NZhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

979f8a67

vrf: prevent adding upper devices · 3b1386be

由 Sabrina Dubroca 提交于 3月 26, 2019

[ Upstream commit 1017e0987117c32783ba7c10fe2e7ff1456ba1dc ]

VRF devices don't work with upper devices. Currently, it's possible to
add a VRF device to a bridge or team, and to create macvlan, macsec, or
ipvlan devices on top of a VRF (bond and vlan are prevented respectively
by the lack of an ndo_set_mac_address op and the NETIF_F_VLAN_CHALLENGED
feature flag).

Fix this by setting the IFF_NO_RX_HANDLER flag (introduced in commit
f5426250 ("net: introduce IFF_NO_RX_HANDLER")).

Cc: David Ahern <dsahern@gmail.com>
Fixes: 193125db ("net: Introduce VRF device driver")
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

3b1386be

tun: properly test for IFF_UP · 8ea78da1

由 Eric Dumazet 提交于 3月 14, 2019

[ Upstream commit 4477138fa0ae4e1b699786ef0600863ea6e6c61c ]

Same reasons than the ones explained in commit 4179cb5a4c92
("vxlan: test dev->flags & IFF_UP before calling netif_rx()")

netif_rx_ni() or napi_gro_frags() must be called under a strict contract.

At device dismantle phase, core networking clears IFF_UP
and flush_all_backlogs() is called after rcu grace period
to make sure no incoming packet might be in a cpu backlog
and still referencing the device.

A similar protocol is used for gro layer.

Most drivers call netif_rx() from their interrupt handler,
and since the interrupts are disabled at device dismantle,
netif_rx() does not have to check dev->flags & IFF_UP

Virtual drivers do not have this guarantee, and must
therefore make the check themselves.

Fixes: 1bd4978a ("tun: honor IFF_UP in tun_get_user()")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

8ea78da1

tipc: fix cancellation of topology subscriptions · 52a7505c

由 Erik Hugne 提交于 3月 21, 2019

[ Upstream commit 33872d79f5d1cbedaaab79669cc38f16097a9450 ]

When cancelling a subscription, we have to clear the cancel bit in the
request before iterating over any established subscriptions with memcmp.
Otherwise no subscription will ever be found, and it will not be
possible to explicitly unsubscribe individual subscriptions.

Fixes: 8985ecc7 ("tipc: simplify endianness handling in topology subscriber")
Signed-off-by: NErik Hugne <erik.hugne@gmail.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

52a7505c

tipc: change to check tipc_own_id to return in tipc_net_stop · 1be6c0c7

由 Xin Long 提交于 3月 24, 2019

[ Upstream commit 9926cb5f8b0f0aea535735185600d74db7608550 ]

When running a syz script, a panic occurred:

[  156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc]
[  156.094315] Call Trace:
[  156.094844]  <IRQ>
[  156.095306]  dump_stack+0x7c/0xc0
[  156.097346]  print_address_description+0x65/0x22e
[  156.100445]  kasan_report.cold.3+0x37/0x7a
[  156.102402]  tipc_disc_timeout+0x9c9/0xb20 [tipc]
[  156.106517]  call_timer_fn+0x19a/0x610
[  156.112749]  run_timer_softirq+0xb51/0x1090

It was caused by the netns freed without deleting the discoverer timer,
while later on the netns would be accessed in the timer handler.

The timer should have been deleted by tipc_net_stop() when cleaning up a
netns. However, tipc has been able to enable a bearer and start d->timer
without the local node_addr set since Commit 52dfae5c ("tipc: obtain
node identity from interface by default"), which caused the timer not to
be deleted in tipc_net_stop() then.

So fix it in tipc_net_stop() by changing to check local node_id instead
of local node_addr, as Jon suggested.

While at it, remove the calling of tipc_nametbl_withdraw() there, since
tipc_nametbl_stop() will take of the nametbl's freeing after.

Fixes: 52dfae5c ("tipc: obtain node identity from interface by default")
Reported-by: syzbot+a25307ad099309f1c2b9@syzkaller.appspotmail.com
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NYing Xue <ying.xue@windriver.com>
Acked-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

1be6c0c7

tipc: allow service ranges to be connect()'ed on RDM/DGRAM · 24d1a625

由 Erik Hugne 提交于 3月 17, 2019

[ Upstream commit ea239314fe42ace880bdd834256834679346c80e ]

We move the check that prevents connecting service ranges to after
the RDM/DGRAM check, and move address sanity control to a separate
function that also validates the service range.

Fixes: 23998835 ("tipc: improve address sanity check in tipc_connect()")
Signed-off-by: NErik Hugne <erik.hugne@gmail.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

24d1a625

tcp: do not use ipv6 header for ipv4 flow · 7115df61

由 Eric Dumazet 提交于 3月 19, 2019

[ Upstream commit 89e4130939a20304f4059ab72179da81f5347528 ]

When a dual stack tcp listener accepts an ipv4 flow,
it should not attempt to use an ipv6 header or tcp_v6_iif() helper.

Fixes: 1397ed35 ("ipv6: add flowinfo for tcp6 pkt_options for all cases")
Fixes: df3687ff ("ipv6: add the IPV6_FL_F_REFLECT flag to IPV6_FL_A_GET")
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

7115df61

sctp: use memdup_user instead of vmemdup_user · cab576f1

由 Xin Long 提交于 3月 20, 2019

[ Upstream commit ef82bcfa671b9a635bab5fa669005663d8b177c5 ]

In sctp_setsockopt_bindx()/__sctp_setsockopt_connectx(), it allocates
memory with addrs_size which is passed from userspace. We used flag
GFP_USER to put some more restrictions on it in Commit cacc0621
("sctp: use GFP_USER for user-controlled kmalloc").

However, since Commit c981f254 ("sctp: use vmemdup_user() rather
than badly open-coding memdup_user()"), vmemdup_user() has been used,
which doesn't check GFP_USER flag when goes to vmalloc_*(). So when
addrs_size is a huge value, it could exhaust memory and even trigger
oom killer.

This patch is to use memdup_user() instead, in which GFP_USER would
work to limit the memory allocation with a huge addrs_size.

Note we can't fix it by limiting 'addrs_size', as there's no demand
for it from RFC.

Reported-by: syzbot+ec1b7575afef85a0e5ca@syzkaller.appspotmail.com
Fixes: c981f254 ("sctp: use vmemdup_user() rather than badly open-coding memdup_user()")
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

cab576f1

sctp: get sctphdr by offset in sctp_compute_cksum · 97265479

由 Xin Long 提交于 3月 18, 2019

[ Upstream commit 273160ffc6b993c7c91627f5a84799c66dfe4dee ]

sctp_hdr(skb) only works when skb->transport_header is set properly.

But in Netfilter, skb->transport_header for ipv6 is not guaranteed
to be right value for sctphdr. It would cause to fail to check the
checksum for sctp packets.

So fix it by using offset, which is always right in all places.

v1->v2:
  - Fix the changelog.

Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
Reported-by: NLi Shuang <shuali@redhat.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

97265479

rhashtable: Still do rehash when we get EEXIST · cf86f7a9

由 Herbert Xu 提交于 3月 21, 2019

[ Upstream commit 408f13ef358aa5ad56dc6230c2c7deb92cf462b1 ]

As it stands if a shrink is delayed because of an outstanding
rehash, we will go into a rescheduling loop without ever doing
the rehash.

This patch fixes this by still carrying out the rehash and then
rescheduling so that we can shrink after the completion of the
rehash should it still be necessary.

The return value of EEXIST captures this case and other cases
(e.g., another thread expanded/rehashed the table at the same
time) where we should still proceed with the rehash.

Fixes: da20420f ("rhashtable: Add nested tables")
Reported-by: NJosh Elsasser <jelsasser@appneta.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Tested-by: NJosh Elsasser <jelsasser@appneta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

cf86f7a9

packets: Always register packet sk in the same order · 69cea7cf

由 Maxime Chevallier 提交于 3月 16, 2019

[ Upstream commit a4dc6a49156b1f8d6e17251ffda17c9e6a5db78a ]

When using fanouts with AF_PACKET, the demux functions such as
fanout_demux_cpu will return an index in the fanout socket array, which
corresponds to the selected socket.

The ordering of this array depends on the order the sockets were added
to a given fanout group, so for FANOUT_CPU this means sockets are bound
to cpus in the order they are configured, which is OK.

However, when stopping then restarting the interface these sockets are
bound to, the sockets are reassigned to the fanout group in the reverse
order, due to the fact that they were inserted at the head of the
interface's AF_PACKET socket list.

This means that traffic that was directed to the first socket in the
fanout group is now directed to the last one after an interface restart.

In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
socket that used to receive traffic from the last CPU after an interface
restart.

This commit introduces a helper to add a socket at the tail of a list,
then uses it to register AF_PACKET sockets.

Note that this changes the order in which sockets are listed in /proc and
with sock_diag.

Fixes: dc99f600 ("packet: Add fanout support")
Signed-off-by: NMaxime Chevallier <maxime.chevallier@bootlin.com>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

69cea7cf

net-sysfs: call dev_hold if kobject_init_and_add success · d9d215be

由 YueHaibing 提交于 3月 19, 2019

[ Upstream commit a3e23f719f5c4a38ffb3d30c8d7632a4ed8ccd9e ]

In netdev_queue_add_kobject and rx_queue_add_kobject,
if sysfs_create_group failed, kobject_put will call
netdev_queue_release to decrease dev refcont, however
dev_hold has not be called. So we will see this while
unregistering dev:

unregister_netdevice: waiting for bcsh0 to become free. Usage count = -1
Reported-by: NHulk Robot <hulkci@huawei.com>
Fixes: d0d66837 ("net: don't decrement kobj reference count on init failure")
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

d9d215be

net: stmmac: fix memory corruption with large MTUs · 8dcf078d

由 Aaro Koskinen 提交于 3月 18, 2019

[ Upstream commit 223a960c01227e4dbcb6f9fa06b47d73bda21274 ]

When using 16K DMA buffers and ring mode, the DES3 refill is not working
correctly as the function is using a bogus pointer for checking the
private data. As a result stale pointers will remain in the RX descriptor
ring, so DMA will now likely overwrite/corrupt some already freed memory.

As simple reproducer, just receive some UDP traffic:

	# ifconfig eth0 down; ifconfig eth0 mtu 9000; ifconfig eth0 up
	# iperf3 -c 192.168.253.40 -u -b 0 -R

If you didn't crash by now check the RX descriptors to find non-contiguous
RX buffers:

	cat /sys/kernel/debug/stmmaceth/eth0/descriptors_status
	[...]
	1 [0x2be5020]: 0xa3220321 0x9ffc1ffc 0x72d70082 0x130e207e
					     ^^^^^^^^^^^^^^^^^^^^^
	2 [0x2be5040]: 0xa3220321 0x9ffc1ffc 0x72998082 0x1311a07e
					     ^^^^^^^^^^^^^^^^^^^^^

A simple ping test will now report bad data:

	# ping -s 8200 192.168.253.40
	PING 192.168.253.40 (192.168.253.40) 8200(8228) bytes of data.
	8208 bytes from 192.168.253.40: icmp_seq=1 ttl=64 time=1.00 ms
	wrong data byte #8144 should be 0xd0 but was 0x88

Fix the wrong pointer. Also we must refill DES3 only if the DMA buffer
size is 16K.

Fixes: 54139cf3 ("net: stmmac: adding multiple buffers for rx")
Signed-off-by: NAaro Koskinen <aaro.koskinen@nokia.com>
Acked-by: NJose Abreu <joabreu@synopsys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

8dcf078d

net: rose: fix a possible stack overflow · 7eeb12ed

由 Eric Dumazet 提交于 3月 15, 2019

[ Upstream commit e5dcc0c3223c45c94100f05f28d8ef814db3d82c ]

rose_write_internal() uses a temp buffer of 100 bytes, but a manual
inspection showed that given arbitrary input, rose_create_facilities()
can fill up to 110 bytes.

Lets use a tailroom of 256 bytes for peace of mind, and remove
the bounce buffer : we can simply allocate a big enough skb
and adjust its length as needed.

syzbot report :

BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:352 [inline]
BUG: KASAN: stack-out-of-bounds in rose_create_facilities net/rose/rose_subr.c:521 [inline]
BUG: KASAN: stack-out-of-bounds in rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
Write of size 7 at addr ffff88808b1ffbef by task syz-executor.0/24854

CPU: 0 PID: 24854 Comm: syz-executor.0 Not tainted 5.0.0+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 check_memory_region_inline mm/kasan/generic.c:185 [inline]
 check_memory_region+0x123/0x190 mm/kasan/generic.c:191
 memcpy+0x38/0x50 mm/kasan/common.c:131
 memcpy include/linux/string.h:352 [inline]
 rose_create_facilities net/rose/rose_subr.c:521 [inline]
 rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
 rose_connect+0x7cb/0x1510 net/rose/af_rose.c:826
 __sys_connect+0x266/0x330 net/socket.c:1685
 __do_sys_connect net/socket.c:1696 [inline]
 __se_sys_connect net/socket.c:1693 [inline]
 __x64_sys_connect+0x73/0xb0 net/socket.c:1693
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458079
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f47b8d9dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458079
RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000004
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f47b8d9e6d4
R13: 00000000004be4a4 R14: 00000000004ceca8 R15: 00000000ffffffff

The buggy address belongs to the page:
page:ffffea00022c7fc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x1fffc0000000000()
raw: 01fffc0000000000 0000000000000000 ffffffff022c0101 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88808b1ffa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88808b1ffb00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 03
>ffff88808b1ffb80: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 04 f3
                                                             ^
 ffff88808b1ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

7eeb12ed

net: phy: meson-gxl: fix interrupt support · a6f0168e

由 Jerome Brunet 提交于 3月 14, 2019

[ Upstream commit daa5c4d0167a308306525fd5ab9a5e18e21f4f74 ]

If an interrupt is already pending when the interrupt is enabled on the
GXL phy, no IRQ will ever be triggered.

The fix is simply to make sure pending IRQs are cleared before setting
up the irq mask.

Fixes: cf127ff2 ("net: phy: meson-gxl: add interrupt support")
Signed-off-by: NJerome Brunet <jbrunet@baylibre.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

a6f0168e

net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec · 85ef72d8

由 Christoph Paasch 提交于 3月 18, 2019

[ Upstream commit 398f0132c14754fcd03c1c4f8e7176d001ce8ea1 ]

Since commit fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
one can now allocate packet ring buffers >= UINT_MAX. However, syzkaller
found that that triggers a warning:

[   21.100000] WARNING: CPU: 2 PID: 2075 at mm/page_alloc.c:4584 __alloc_pages_nod0
[   21.101490] Modules linked in:
[   21.101921] CPU: 2 PID: 2075 Comm: syz-executor.0 Not tainted 5.0.0 #146
[   21.102784] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
[   21.103887] RIP: 0010:__alloc_pages_nodemask+0x2a0/0x630
[   21.104640] Code: fe ff ff 65 48 8b 04 25 c0 de 01 00 48 05 90 0f 00 00 41 bd 01 00 00 00 48 89 44 24 48 e9 9c fe 3
[   21.107121] RSP: 0018:ffff88805e1cf920 EFLAGS: 00010246
[   21.107819] RAX: 0000000000000000 RBX: ffffffff85a488a0 RCX: 0000000000000000
[   21.108753] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
[   21.109699] RBP: 1ffff1100bc39f28 R08: ffffed100bcefb67 R09: ffffed100bcefb67
[   21.110646] R10: 0000000000000001 R11: ffffed100bcefb66 R12: 000000000000000d
[   21.111623] R13: 0000000000000000 R14: ffff88805e77d888 R15: 000000000000000d
[   21.112552] FS:  00007f7c7de05700(0000) GS:ffff88806d100000(0000) knlGS:0000000000000000
[   21.113612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   21.114405] CR2: 000000000065c000 CR3: 000000005e58e006 CR4: 00000000001606e0
[   21.115367] Call Trace:
[   21.115705]  ? __alloc_pages_slowpath+0x21c0/0x21c0
[   21.116362]  alloc_pages_current+0xac/0x1e0
[   21.116923]  kmalloc_order+0x18/0x70
[   21.117393]  kmalloc_order_trace+0x18/0x110
[   21.117949]  packet_set_ring+0x9d5/0x1770
[   21.118524]  ? packet_rcv_spkt+0x440/0x440
[   21.119094]  ? lock_downgrade+0x620/0x620
[   21.119646]  ? __might_fault+0x177/0x1b0
[   21.120177]  packet_setsockopt+0x981/0x2940
[   21.120753]  ? __fget+0x2fb/0x4b0
[   21.121209]  ? packet_release+0xab0/0xab0
[   21.121740]  ? sock_has_perm+0x1cd/0x260
[   21.122297]  ? selinux_secmark_relabel_packet+0xd0/0xd0
[   21.123013]  ? __fget+0x324/0x4b0
[   21.123451]  ? selinux_netlbl_socket_setsockopt+0x101/0x320
[   21.124186]  ? selinux_netlbl_sock_rcv_skb+0x3a0/0x3a0
[   21.124908]  ? __lock_acquire+0x529/0x3200
[   21.125453]  ? selinux_socket_setsockopt+0x5d/0x70
[   21.126075]  ? __sys_setsockopt+0x131/0x210
[   21.126533]  ? packet_release+0xab0/0xab0
[   21.127004]  __sys_setsockopt+0x131/0x210
[   21.127449]  ? kernel_accept+0x2f0/0x2f0
[   21.127911]  ? ret_from_fork+0x8/0x50
[   21.128313]  ? do_raw_spin_lock+0x11b/0x280
[   21.128800]  __x64_sys_setsockopt+0xba/0x150
[   21.129271]  ? lockdep_hardirqs_on+0x37f/0x560
[   21.129769]  do_syscall_64+0x9f/0x450
[   21.130182]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

We should allocate with __GFP_NOWARN to handle this.

Cc: Kal Conley <kal.conley@dectris.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Fixes: fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
Signed-off-by: NChristoph Paasch <cpaasch@apple.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

85ef72d8

net: datagram: fix unbounded loop in __skb_try_recv_datagram() · 88c64f9c

由 Paolo Abeni 提交于 3月 25, 2019

[ Upstream commit 0b91bce1ebfc797ff3de60c8f4a1e6219a8a3187 ]

Christoph reported a stall while peeking datagram with an offset when
busy polling is enabled. __skb_try_recv_datagram() uses as the loop
termination condition 'queue empty'. When peeking, the socket
queue can be not empty, even when no additional packets are received.

Address the issue explicitly checking for receive queue changes,
as currently done by __skb_wait_for_more_packets().

Fixes: 2b5cd0df ("net: Change return type of sk_busy_loop from bool to void")
Reported-and-tested-by: NChristoph Paasch <cpaasch@apple.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

88c64f9c

net: aquantia: fix rx checksum offload for UDP/TCP over IPv6 · e4ff39e1

由 Dmitry Bogdanov 提交于 3月 16, 2019

[ Upstream commit a7faaa0c5dc7d091cc9f72b870d7edcdd6f43f12 ]

TCP/UDP checksum validity was propagated to skb
only if IP checksum is valid.
But for IPv6 there is no validity as there is no checksum in IPv6.
This patch propagates TCP/UDP checksum validity regardless of IP checksum.

Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
Signed-off-by: NIgor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: NNikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: NDmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

e4ff39e1

mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S · c4084262

由 Bjorn Helgaas 提交于 3月 18, 2019

[ Upstream commit fae846e2b7124d4b076ef17791c73addf3b26350 ]

The device ID alone does not uniquely identify a device.  Test both the
vendor and device ID to make sure we don't mistakenly think some other
vendor's 0xB410 device is a Digium HFC4S.  Also, instead of the bare hex
ID, use the same constant (PCI_DEVICE_ID_DIGIUM_HFC4S) used in the device
ID table.

No functional change intended.
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c4084262

mac8390: Fix mmio access size probe · e0f8c06f

由 Finn Thain 提交于 3月 16, 2019

[ Upstream commit bb9e5c5bcd76f4474eac3baf643d7a39f7bac7bb ]

The bug that Stan reported is as follows. After a restart, a 16-bit NIC
may be incorrectly identified as a 32-bit NIC and stop working.

mac8390 slot.E: Memory length resource not found, probing
mac8390 slot.E: Farallon EtherMac II-C (type farallon)
mac8390 slot.E: MAC 00:00:c5:30:c2:99, IRQ 61, 32 KB shared memory at 0xfeed0000, 32-bit access.

The bug never arises after a cold start and only intermittently after a
warm start. (I didn't investigate why the bug is intermittent.)

It turns out that memcpy_toio() is deprecated and memcmp_withio() also
has issues. Replacing these calls with mmio accessors fixes the problem.
Reported-and-tested-by: NStan Johnson <userm57@yahoo.com>
Fixes: 2964db0f ("m68k: Mac DP8390 update")
Signed-off-by: NFinn Thain <fthain@telegraphics.com.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

e0f8c06f

ipv6: make ip6_create_rt_rcu return ip6_null_entry instead of NULL · be092113

由 Xin Long 提交于 3月 20, 2019

[ Upstream commit 1c87e79a002f6a159396138cd3f3ab554a2a8887 ]

Jianlin reported a crash:

  [  381.484332] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
  [  381.619802] RIP: 0010:fib6_rule_lookup+0xa3/0x160
  [  382.009615] Call Trace:
  [  382.020762]  <IRQ>
  [  382.030174]  ip6_route_redirect.isra.52+0xc9/0xf0
  [  382.050984]  ip6_redirect+0xb6/0xf0
  [  382.066731]  icmpv6_notify+0xca/0x190
  [  382.083185]  ndisc_redirect_rcv+0x10f/0x160
  [  382.102569]  ndisc_rcv+0xfb/0x100
  [  382.117725]  icmpv6_rcv+0x3f2/0x520
  [  382.133637]  ip6_input_finish+0xbf/0x460
  [  382.151634]  ip6_input+0x3b/0xb0
  [  382.166097]  ipv6_rcv+0x378/0x4e0

It was caused by the lookup function __ip6_route_redirect() returns NULL in
fib6_rule_lookup() when ip6_create_rt_rcu() returns NULL.

So we fix it by simply making ip6_create_rt_rcu() return ip6_null_entry
instead of NULL.

v1->v2:
  - move down 'fallback:' to make it more readable.

Fixes: e873e4b9 ("ipv6: use fib6_info_hold_safe() when necessary")
Reported-by: NJianlin Shi <jishi@redhat.com>
Suggested-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Acked-by: NWei Wang <weiwan@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

be092113

gtp: change NET_UDP_TUNNEL dependency to select · 53adaacb

由 Matteo Croce 提交于 3月 16, 2019

[ Upstream commit c22da36688d6298f2e546dcc43fdc1ad35036467 ]

Similarly to commit a7603ac1fc8c ("geneve: change NET_UDP_TUNNEL
dependency to select"), GTP has a dependency on NET_UDP_TUNNEL which
makes impossible to compile it if no other protocol depending on
NET_UDP_TUNNEL is selected.

Fix this by changing the depends to a select, and drop NET_IP_TUNNEL from
the select list, as it already depends on NET_UDP_TUNNEL.
Signed-off-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

53adaacb

genetlink: Fix a memory leak on error path · 9b8ef421

由 YueHaibing 提交于 3月 21, 2019

[ Upstream commit ceabee6c59943bdd5e1da1a6a20dc7ee5f8113a2 ]

In genl_register_family(), when idr_alloc() fails,
we forget to free the memory we possibly allocate for
family->attrbuf.
Reported-by: NHulk Robot <hulkci@huawei.com>
Fixes: 2ae0f17d ("genetlink: use idr to track families")
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Reviewed-by: NKirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

9b8ef421

dccp: do not use ipv6 header for ipv4 flow · 321461f2

由 Eric Dumazet 提交于 3月 19, 2019

[ Upstream commit e0aa67709f89d08c8d8e5bdd9e0b649df61d0090 ]

When a dual stack dccp listener accepts an ipv4 flow,
it should not attempt to use an ipv6 header or
inet6_iif() helper.

Fixes: 3df80d93 ("[DCCP]: Introduce DCCPv6")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

321461f2

ipmi_si: Fix crash when using hard-coded device · 6bba17f6

由 Corey Minyard 提交于 3月 26, 2019

Backport from 41b766d661bf94a364960862cfc248a78313dbd3

When excuting a command like:
  modprobe ipmi_si ports=0xffc0e3 type=bt
The system would get an oops.

The trouble here is that ipmi_si_hardcode_find_bmc() is called before
ipmi_si_platform_init(), but initialization of the hard-coded device
creates an IPMI platform device, which won't be initialized yet.

The real trouble is that hard-coded devices aren't created with
any device, and the fixup is done later.  So do it right, create the
hard-coded devices as normal platform devices.

This required adding some new resource types to the IPMI platform
code for passing information required by the hard-coded device
and adding some code to remove the hard-coded platform devices
on module removal.

To enforce the "hard-coded devices passed by the user take priority
over firmware devices" rule, some special code was added to check
and see if a hard-coded device already exists.

The backport required some minor fixups and adding the device
id table that had been added in another change and was used
in this one.
Reported-by: NYang Yingliang <yangyingliang@huawei.com>
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: NCorey Minyard <cminyard@mvista.com>
Tested-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6bba17f6

Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer · 15d6538a

由 Marcel Holtmann 提交于 1月 18, 2019

commit 7c9cbd0b5e38a1672fcd137894ace3b042dfbf69 upstream.

The function l2cap_get_conf_opt will return L2CAP_CONF_OPT_SIZE + opt->len
as length value. The opt->len however is in control over the remote user
and can be used by an attacker to gain access beyond the bounds of the
actual packet.

To prevent any potential leak of heap memory, it is enough to check that
the resulting len calculation after calling l2cap_get_conf_opt is not
below zero. A well formed packet will always return >= 0 here and will
end with the length value being zero after the last option has been
parsed. In case of malformed packets messing with the opt->len field the
length value will become negative. If that is the case, then just abort
and ignore the option.

In case an attacker uses a too short opt->len value, then garbage will
be parsed, but that is protected by the unknown option handling and also
the option parameter size checks.
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

15d6538a

Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt · 2318c0e4

由 Marcel Holtmann 提交于 1月 18, 2019

commit af3d5d1c87664a4f150fcf3534c6567cb19909b0 upstream.

When doing option parsing for standard type values of 1, 2 or 4 octets,
the value is converted directly into a variable instead of a pointer. To
avoid being tricked into being a pointer, check that for these option
types that sizes actually match. In L2CAP every option is fixed size and
thus it is prudent anyway to ensure that the remote side sends us the
right option size along with option paramters.

If the option size is not matching the option type, then that option is
silently ignored. It is a protocol violation and instead of trying to
give the remote attacker any further hints just pretend that option is
not present and proceed with the default values. Implementation
following the specification and its qualification procedures will always
use the correct size and thus not being impacted here.

To keep the code readable and consistent accross all options, a few
cosmetic changes were also required.
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

2318c0e4

27 3月, 2019 12 次提交

G

Linux 4.19.32 · 3a2156c8
由 Greg Kroah-Hartman 提交于 3月 27, 2019

3a2156c8

power: supply: charger-manager: Fix incorrect return value · 33bd347f

由 Baolin Wang 提交于 11月 16, 2018

commit f25a646fbe2051527ad9721853e892d13a99199e upstream.

Fix incorrect return value.
Signed-off-by: NBaolin Wang <baolin.wang@linaro.org>
Signed-off-by: NSebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

33bd347f

ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec · 19184190

由 Hui Wang 提交于 3月 19, 2019

commit b5a236c175b0d984552a5f7c9d35141024c2b261 upstream.

Recently we found the audio jack detection stop working after suspend
on many machines with Realtek codec. Sometimes the audio selection
dialogue didn't show up after users plugged headhphone/headset into
the headset jack, sometimes after uses plugged headphone/headset, then
click the sound icon on the upper-right corner of gnome-desktop, it
also showed the speaker rather than the headphone.

The root cause is that before suspend, the codec already call the
runtime_suspend since this codec is not used by any apps, then in
resume, it will not call runtime_resume for this codec. But for some
realtek codec (so far, alc236, alc255 and alc891) with the specific
BIOS, if it doesn't run runtime_resume after suspend, all codec
functions including jack detection stop working anymore.

This problem existed for a long time, but it was not exposed, that is
because when problem happens, if users play sound or open
sound-setting to check audio device, this will trigger calling to
runtime_resume (via snd_hda_power_up), then the codec starts working
again before users notice this problem.

Since we don't know how many codec and BIOS combinations have this
problem, to fix it, let the driver call runtime_resume for all codecs
in pm_resume, maybe for some codecs, this is not needed, but it is
harmless. After a codec is runtime resumed, if it is not used by any
apps, it will be runtime suspended soon and furthermore we don't run
suspend frequently, this change will not add much power consumption.

Fixes: cc72da7d ("ALSA: hda - Use standard runtime PM for codec power-save control")
Signed-off-by: NHui Wang <hui.wang@canonical.com>
Signed-off-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

19184190

ALSA: hda - Record the current power state before suspend/resume calls · 156ba57f

由 Takashi Iwai 提交于 1月 29, 2019

commit 98081ca62cbac31fb0f7efaf90b2e7384ce22257 upstream.

Currently we deal with single codec and suspend codec callbacks for
all S3, S4 and runtime PM handling.  But it turned out that we want
distinguish the call patterns sometimes, e.g. for applying some init
sequence only at probing and restoring from hibernate.

This patch slightly modifies the common PM callbacks for HD-audio
codec and stores the currently processed PM event in power_state of
the codec's device.power field, which is currently unused.  The codec
callback can take a look at this event value and judges which purpose
it's being called.
Signed-off-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

156ba57f

locking/lockdep: Add debug_locks check in __lock_downgrade() · 0e0f7b30

由 Waiman Long 提交于 1月 09, 2019

commit 71492580571467fb7177aade19c18ce7486267f5 upstream.

Tetsuo Handa had reported he saw an incorrect "downgrading a read lock"
warning right after a previous lockdep warning. It is likely that the
previous warning turned off lock debugging causing the lockdep to have
inconsistency states leading to the lock downgrade warning.

Fix that by add a check for debug_locks at the beginning of
__lock_downgrade().
Debugged-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
Signed-off-by: NWaiman Long <longman@redhat.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0e0f7b30

x86/unwind: Add hardcoded ORC entry for NULL · 206a76a6

由 Jann Horn 提交于 3月 01, 2019

commit ac5ceccce5501e43d217c596e4ee859f2a3fef79 upstream.

When the ORC unwinder is invoked for an oops caused by IP==0,
it currently has no idea what to do because there is no debug information
for the stack frame of NULL.

But if RIP is NULL, it is very likely that the last successfully executed
instruction was an indirect CALL/JMP, and it is possible to unwind out in
the same way as for the first instruction of a normal function. Hardcode
a corresponding ORC entry.

With an artificially-added NULL call in prctl_set_seccomp(), before this
patch, the trace is:

Call Trace:
 ? __x64_sys_prctl+0x402/0x680
 ? __ia32_sys_prctl+0x6e0/0x6e0
 ? __do_page_fault+0x457/0x620
 ? do_syscall_64+0x6d/0x160
 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

After this patch, the trace looks like this:

Call Trace:
 __x64_sys_prctl+0x402/0x680
 ? __ia32_sys_prctl+0x6e0/0x6e0
 ? __do_page_fault+0x457/0x620
 do_syscall_64+0x6d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

prctl_set_seccomp() still doesn't show up in the trace because for some
reason, tail call optimization is only disabled in builds that use the
frame pointer unwinder.
Signed-off-by: NJann Horn <jannh@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20190301031201.7416-2-jannh@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

206a76a6

x86/unwind: Handle NULL pointer calls better in frame unwinder · 367ccafb

由 Jann Horn 提交于 3月 01, 2019

commit f4f34e1b82eb4219d8eaa1c7e2e17ca219a6a2b5 upstream.

When the frame unwinder is invoked for an oops caused by a call to NULL, it
currently skips the parent function because BP still points to the parent's
stack frame; the (nonexistent) current function only has the first half of
a stack frame, and BP doesn't point to it yet.

Add a special case for IP==0 that calculates a fake BP from SP, then uses
the real BP for the next frame.

Note that this handles first_frame specially: Return information about the
parent function as long as the saved IP is >=first_frame, even if the fake
BP points below it.

With an artificially-added NULL call in prctl_set_seccomp(), before this
patch, the trace is:

Call Trace:
 ? prctl_set_seccomp+0x3a/0x50
 __x64_sys_prctl+0x457/0x6f0
 ? __ia32_sys_prctl+0x750/0x750
 do_syscall_64+0x72/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

After this patch, the trace is:

Call Trace:
 prctl_set_seccomp+0x3a/0x50
 __x64_sys_prctl+0x457/0x6f0
 ? __ia32_sys_prctl+0x750/0x750
 do_syscall_64+0x72/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: NJann Horn <jannh@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20190301031201.7416-1-jannh@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

367ccafb

loop: access lo_backing_file only when the loop device is Lo_bound · 3254dd30

由 Dongli Zhang 提交于 3月 18, 2019

commit f7c8a4120eedf24c36090b7542b179ff7a649219 upstream.

Commit 758a58d0bc67 ("loop: set GENHD_FL_NO_PART_SCAN after
blkdev_reread_part()") separates "lo->lo_backing_file = NULL" and
"lo->lo_state = Lo_unbound" into different critical regions protected by
loop_ctl_mutex.

However, there is below race that the NULL lo->lo_backing_file would be
accessed when the backend of a loop is another loop device, e.g., loop0's
backend is a file, while loop1's backend is loop0.

loop0's backend is file            loop1's backend is loop0

__loop_clr_fd()
  mutex_lock(&loop_ctl_mutex);
  lo->lo_backing_file = NULL; --> set to NULL
  mutex_unlock(&loop_ctl_mutex);
                                   loop_set_fd()
                                     mutex_lock_killable(&loop_ctl_mutex);
                                     loop_validate_file()
                                       f = l->lo_backing_file; --> NULL
                                         access if loop0 is not Lo_unbound
  mutex_lock(&loop_ctl_mutex);
  lo->lo_state = Lo_unbound;
  mutex_unlock(&loop_ctl_mutex);

lo->lo_backing_file should be accessed only when the loop device is
Lo_bound.

In fact, the problem has been introduced already in commit 7ccd0791d985
("loop: Push loop_ctl_mutex down into loop_clr_fd()") after which
loop_validate_file() could see devices in Lo_rundown state with which it
did not count. It was harmless at that point but still.

Fixes: 7ccd0791d985 ("loop: Push loop_ctl_mutex down into loop_clr_fd()")
Reported-by: syzbot+9bdc1adc1c55e7fe765b@syzkaller.appspotmail.com
Signed-off-by: NDongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

3254dd30

netfilter: ebtables: remove BUGPRINT messages · 35cdcdc5

由 Florian Westphal 提交于 2月 19, 2019

commit d824548dae220820bdf69b2d1561b7c4b072783f upstream.

They are however frequently triggered by syzkaller, so remove them.

ebtables userspace should never trigger any of these, so there is little
value in making them pr_debug (or ratelimited).
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

35cdcdc5

f2fs: fix to avoid deadlock of atomic file operations · 1fd916e8

由 Chao Yu 提交于 2月 25, 2019

commit 48432984d718c95cf13e26d487c2d1b697c3c01f upstream.

Thread A				Thread B
- __fput
 - f2fs_release_file
  - drop_inmem_pages
   - mutex_lock(&fi->inmem_lock)
   - __revoke_inmem_pages
    - lock_page(page)
					- open
					- f2fs_setattr
					- truncate_setsize
					 - truncate_inode_pages_range
					  - lock_page(page)
					  - truncate_cleanup_page
					   - f2fs_invalidate_page
					    - drop_inmem_page
					    - mutex_lock(&fi->inmem_lock);

We may encounter above ABBA deadlock as reported by Kyungtae Kim:

I'm reporting a bug in linux-4.17.19: "INFO: task hung in
drop_inmem_page" (no reproducer)

I think this might be somehow related to the following:
https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ

=========================================
INFO: task syz-executor7:10822 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D27024 10822   6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
 mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
 drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
 f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
 do_invalidatepage mm/truncate.c:165 [inline]
 truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
 truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
 truncate_inode_pages mm/truncate.c:478 [inline]
 truncate_pagecache+0x6d/0x90 mm/truncate.c:801
 truncate_setsize+0x81/0xa0 mm/truncate.c:826
 f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
 notify_change+0xa62/0xe80 fs/attr.c:313
 do_truncate+0x12e/0x1e0 fs/open.c:63
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
INFO: task syz-executor7:10858 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D28880 10858   6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 __rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
 rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
 call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
 __down_write arch/x86/include/asm/rwsem.h:142 [inline]
 down_write+0x58/0xa0 kernel/locking/rwsem.c:72
 inode_lock include/linux/fs.h:713 [inline]
 do_truncate+0x120/0x1e0 fs/open.c:61
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
INFO: task syz-executor5:10829 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor5   D28760 10829   6308 0x80000002
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 io_schedule+0x21/0x80 kernel/sched/core.c:5179
 wait_on_page_bit_common mm/filemap.c:1100 [inline]
 __lock_page+0x2b5/0x390 mm/filemap.c:1273
 lock_page include/linux/pagemap.h:483 [inline]
 __revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
 drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
 f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
 __fput+0x2c7/0x780 fs/file_table.c:209
 ____fput+0x1a/0x20 fs/file_table.c:243
 task_work_run+0x151/0x1d0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x8ba/0x30a0 kernel/exit.c:865
 do_group_exit+0x13b/0x3a0 kernel/exit.c:968
 get_signal+0x6bb/0x1650 kernel/signal.c:2482
 do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700

This patch tries to use trylock_page to mitigate such deadlock condition
for fix.
Signed-off-by: NChao Yu <yuchao0@huawei.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

1fd916e8

RDMA/cma: Rollback source IP address if failing to acquire device · 9dd5053c

由 Myungho Jung 提交于 1月 09, 2019

commit 5fc01fb846bce8fa6d5f95e2625b8ce0f8e86810 upstream.

If cma_acquire_dev_by_src_ip() returns error in addr_handler(), the
device state changes back to RDMA_CM_ADDR_BOUND but the resolved source
IP address is still left. After that, if rdma_destroy_id() is called
after rdma_listen(), the device is freed without removed from
listen_any_list in cma_cancel_operation(). Revert to the previous IP
address if acquiring device fails.

Reported-by: syzbot+f3ce716af730c8f96637@syzkaller.appspotmail.com
Signed-off-by: NMyungho Jung <mhjungk@gmail.com>
Reviewed-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

9dd5053c

drm: Reorder set_property_atomic to avoid returning with an active ww_ctx · 015b828b

由 Chris Wilson 提交于 12月 30, 2018

commit 227ad6d957898a88b1746e30234ece64d305f066 upstream.

Delay the drm_modeset_acquire_init() until after we check for an
allocation failure so that we can return immediately upon error without
having to unwind.

WARNING: lock held when returning to user space!
4.20.0+ #174 Not tainted
------------------------------------------------
syz-executor556/8153 is leaving the kernel with locks still held!
1 lock held by syz-executor556/8153:
  #0: 000000005100c85c (crtc_ww_class_acquire){+.+.}, at:
set_property_atomic+0xb3/0x330 drivers/gpu/drm/drm_mode_object.c:462

Reported-by: syzbot+6ea337c427f5083ebdf2@syzkaller.appspotmail.com
Fixes: 144a7999 ("drm: Handle properties in the core for atomic drivers")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <sean@poorly.run>
Cc: David Airlie <airlied@linux.ie>
Cc: <stable@vger.kernel.org> # v4.14+
Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181230122842.21917-1-chris@chris-wilson.co.ukSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

015b828b

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功