提交 · 8b9ea9010139596e71aae4be5264c3dd901b13cb · openeuler / raspberrypi-kernel

17 3月, 2020 1 次提交

pagecache: support percpu refcount to imporve performance · 8b9ea901

由 Yunfeng Ye 提交于 3月 17, 2020

euleros inclusion
category: feature
feature: pagecache percpu refcount
bugzilla: 31398
CVE: NA

-------------------------------------------------

The pagecache manages the file physical pages, and the life cycle of
page is managed by atomic counting. With the increasing number of cpu
cores, the cost of atomic counting is very large when reading file
pagecaches at large concurrent.

For example, when running nginx http application, the biggest hotspot is
found in the atomic operation of find_get_entry():

 11.94% [kernel] [k] find_get_entry
  7.45% [kernel] [k] do_tcp_sendpages
  6.12% [kernel] [k] generic_file_buffered_read

So we using the percpu refcount mechanism to fix this problem. and the
test result show that the read performance of nginx http can be improved
by 100%：

  worker   original(requests/sec)   percpu(requests/sec)   imporve
  64       759656.87                1627088.95             114.2%

Notes: we use page->lru to save percpu information, so the pages with
percpu attribute will not be recycled by memory recycling process, we
should avoid grow the file size unlimited.
Signed-off-by: NYunfeng Ye <yeyunfeng@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

8b9ea901

16 3月, 2020 2 次提交

net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup · d93d588d

由 Sabrina Dubroca 提交于 3月 16, 2020

mainline inclusion
from mainline-v5.5-rc1
commit 6c8991f41546c3c472503dff1ea9daaddf9331c2
category: bugfix
bugzilla: 13690
CVE: CVE-2020-1749

-------------------------------------------------

ipv6_stub uses the ip6_dst_lookup function to allow other modules to
perform IPv6 lookups. However, this function skips the XFRM layer
entirely.

All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the
ip_route_output_key and ip_route_output helpers) for their IPv4 lookups,
which calls xfrm_lookup_route(). This patch fixes this inconsistent
behavior by switching the stub to ip6_dst_lookup_flow, which also calls
xfrm_lookup_route().

This requires some changes in all the callers, as these two functions
take different arguments and have different return types.

Fixes: 5f81bd2e ("ipv6: export a stub for IPv6 symbols used by vxlan")
Reported-by: NXiumei Mu <xmu@redhat.com>
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Conflicts:
  include/net/addrconf.h
  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
  net/core/lwt_bpf.c
  net/tipc/udp_media.c
  net/ipv6/addrconf_core.c
  net/ipv6/af_inet6.c
  drivers/infiniband/core/addr.c
[yyl: adjust context]
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-by: NWenan Mao <maowenan@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

d93d588d

net: ipv6: add net argument to ip6_dst_lookup_flow · e6a9ef86

由 Sabrina Dubroca 提交于 3月 16, 2020

mainline inclusion
from mainline-v5.5-rc1
commit c4e85f73afb6384123e5ef1bba3315b2e3ad031e
category: bugfix
bugzilla: 13690
CVE: CVE-2020-1749

It's prepare for fixing CVE-2020-1749.

-------------------------------------------------

This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow,
as some modules currently pass a net argument without a socket to
ip6_dst_lookup. This is equivalent to commit 343d60aa ("ipv6: change
ipv6_stub_impl.ipv6_dst_lookup to take net argument").
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-by: NWenan Mao <maowenan@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

e6a9ef86

12 3月, 2020 1 次提交

blktrace: Protect q->blk_trace with RCU · 40b51620

由 Jan Kara 提交于 3月 12, 2020

mainline inclusion
from mainline-v5.6-rc4
commit c780e86dd48ef6467a1146cf7d0fe1e05a635039
category: bugfix
bugzilla: 13690
CVE: CVE-2019-19768

-------------------------------------------------

KASAN is reporting that __blk_add_trace() has a use-after-free issue
when accessing q->blk_trace. Indeed the switching of block tracing (and
thus eventual freeing of q->blk_trace) is completely unsynchronized with
the currently running tracing and thus it can happen that the blk_trace
structure is being freed just while __blk_add_trace() works on it.
Protect accesses to q->blk_trace by RCU during tracing and make sure we
wait for the end of RCU grace period when shutting down tracing. Luckily
that is rare enough event that we can afford that. Note that postponing
the freeing of blk_trace to an RCU callback should better be avoided as
it could have unexpected user visible side-effects as debugfs files
would be still existing for a short while block tracing has been shut
down.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711
CC: stable@vger.kernel.org
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Tested-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NBart Van Assche <bvanassche@acm.org>
Reported-by: NTristan Madani <tristmd@gmail.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Conflicts:
  kernel/trace/blktrace.c
[yyl: adjust context]
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Nzhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

40b51620

05 3月, 2020 36 次提交

iscsi: add member for NUMA aware order workqueue · c8a2308c

由 Yang Yingliang 提交于 2月 24, 2020

euleros inclusion
category: feature
feature: Implement NUMA affinity for order workqueue

-------------------------------------------------

Add member to struct iscsi_conn.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

c8a2308c

Revert "debugfs: fix kabi for function debugfs_remove_recursive" · acd24e6d

由 Yang Yingliang 提交于 2月 24, 2020

hulk inclusion
category: bugfix
bugzilla: 30939
CVE: NA

---------------------------

The kabi can be broken before official release.

This reverts commit ce620c1a6783b2341a376ef948484b5314ed064e.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

acd24e6d

Revert "bdi: fix kabi for struct backing_dev_info" · 6cd5af2c

由 Yang Yingliang 提交于 2月 24, 2020

hulk inclusion
category: bugfix
bugzilla: 30939
CVE: NA
---------------------------

The kabi can be broken before official release.

This reverts commit f8589079659b51222d86a1cb8fd9129752b0d97c.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

6cd5af2c

Revert "membarrier/kabi: fix kabi for membarrier_state" · 6597894c

由 Yang Yingliang 提交于 2月 24, 2020

hulk inclusion
category: bugfix
bugzilla: 30939
CVE: NA

-------------------------------------------------

The kabi can be broken before official release.

This reverts commit f316812150a4fbb52720fe7fb7702c5a52c37602.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

6597894c

Revert "PCI: fix kabi change in struct pci_bus" · 6432d8c9

由 Yang Yingliang 提交于 2月 24, 2020

hulk inclusion
category: bugfix
bugzilla: 30939
CVE: NA

---------------------------

The kabi can be broken before official release.

This reverts commit 8664b79edac95322379eee025763ba0840d458d1.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

6432d8c9

bdi: get device name under rcu protect · de1b854e

由 Yufen Yu 提交于 2月 20, 2020

hulk inclusion
category: bugfix
bugzilla: 30109
CVE: NA
---------------------------

bdi->dev may be set as "NULL" or freed by bdi_unregister().
To avoid causing "NULL" pointer reference or use-after-free
in user, we add a common function bdi_get_dev_name(), in which
dev is protected by RCU lock. Then, the caller can get device
name safely.

Fixes: 5ca4579ae59b ("bdi: fix use-after-free for the bdi device")
Signed-off-by: NYufen Yu <yuyufen@huawei.com>
Reviewed-by: NHou Tao <houao1@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

de1b854e

iommu/iova: avoid softlockup in fq_flush_timeout · 90ba118b

由 Li Bin 提交于 2月 20, 2020

hulk inclusion
category: bugfix
bugzilla: 30859
CVE: NA

---------------------------

There is softlockup under fio pressure test with smmu enabled:
watchdog: BUG: soft lockup - CPU#81 stuck for 22s!  [swapper/81:0]
...
Call trace:
 fq_flush_timeout+0xc0/0x110
 call_timer_fn+0x34/0x178
 expire_timers+0xec/0x158
 run_timer_softirq+0xc0/0x1f8
 __do_softirq+0x120/0x324
 irq_exit+0x11c/0x140
 __handle_domain_irq+0x6c/0xc0
 gic_handle_irq+0x6c/0x170
 el1_irq+0xb8/0x140
 arch_cpu_idle+0x38/0x1c0
 default_idle_call+0x24/0x44
 do_idle+0x1f4/0x2d8
 cpu_startup_entry+0x2c/0x30
 secondary_start_kernel+0x17c/0x1c8

This is because the timer callback fq_flush_timeout may run more than
10ms, and timer may be processed continuously in the softirq so trigger
softlockup. We can use work to deal with fq_ring_free for each cpu which
may take long time, that to avoid triggering softlockup.
Signed-off-by: NLi Bin <huawei.libin@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

90ba118b

uacce: Remove uacce mode 1 relatives · f7acb516

由 Yu'an Wang 提交于 2月 20, 2020

driver inclusion
category: feature
bugzilla: NA
CVE: NA

To be simple, in this patch  we try to remove uacce mode 1
related logic of uacce.c. Because in open mainline ,we do
not use this mode, mode 0 is used for kernel and mode 2 is
used for user.
At the same time, we update correspondingheader file uacce.h.
We also delete UACCE_QFRT_DKO in dummy_wd_dev.c and dummy_wd_v2.c
Signed-off-by: NYu'an Wang <wangyuan46@huawei.com>
Reviewed-by: NHui Tang <tanghui20@huawei.com>
Reviewed-by: NCheng Hu <hucheng.hu@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

f7acb516

debugfs: fix kabi for function debugfs_remove_recursive · 27d247b2

由 yu kuai 提交于 2月 20, 2020

hulk inclusion
category: bugfix
bugzilla: 24454
CVE: NA

---------------------------

debugfs_remove_recursive was changed from a function to an alias to
debugfs_remove in patch "simple_recursive_removal(): kernel-side rm -rf
for ramfs-style filesystems". Change it back to a function.
Signed-off-by: Nyu kuai <yukuai3@huawei.com>
Reviewed-by: Nzhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

27d247b2

simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystems · 3d1b056c

由 yu kuai 提交于 2月 20, 2020

mainline inclusion
from mainline-5.6-rc1
commit a3d1e7eb5abe3aa1095bc75d1a6760d3809bd672
category: bugfix
bugzilla: 24454
CVE: NA

---------------------------

two requirements: no file creations in IS_DEADDIR and no cross-directory
renames whatsoever.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

Conflicts:
 fs/debugfs/inode.c
 fs/libfs.c
 fs/tracefs/inode.c
 include/linux/debugfs.h
 include/linux/fs.h
 include/linux/tracefs.h
 kernel/trace/trace.c
functional changes:
 replace current_time() with current_fs_time()
 remove call to fsnotify_rmdir() and fsnotify_unlink()
Signed-off-by: Nyu kuai <yukuai3@huawei.com>
Reviewed-by: Nzhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

3d1b056c

bdi: fix kabi for struct backing_dev_info · b45215d4

由 Yufen Yu 提交于 2月 20, 2020

hulk inclusion
category: bugfix
bugzilla: 30109
CVE: NA
---------------------------
Signed-off-by: NYufen Yu <yuyufen@huawei.com>
Reviewed-by: NJason Yan <yanaijie@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

b45215d4

bdi: fix use-after-free for the bdi device · 725ee753

由 Yufen Yu 提交于 2月 20, 2020

hulk inclusion
category: bugfix
bugzilla: 30109
CVE: NA
---------------------------

We reported kernel crash:

[201962.639350] Call trace:
[201962.644403]  string+0x28/0xa0
[201962.650501]  vsnprintf+0x5f0/0x748
[201962.657472]  seq_vprintf+0x70/0x98
[201962.664442]  seq_printf+0x7c/0xa0
[201962.671238]  __blkg_prfill_rwstat+0x84/0x128
[201962.679949]  blkg_prfill_rwstat_field+0x94/0xc0
[201962.689182]  blkcg_print_blkgs+0xcc/0x140
[201962.697370]  blkg_print_stat_bytes+0x4c/0x60
[201962.706083]  cgroup_seqfile_show+0x58/0xc0
[201962.714446]  kernfs_seq_show+0x44/0x50
[201962.722112]  seq_read+0xd4/0x4a8
[201962.728732]  kernfs_fop_read+0x16c/0x218
[201962.736748]  __vfs_read+0x60/0x188
[201962.743717]  vfs_read+0x94/0x150
[201962.750338]  ksys_read+0x6c/0xd8
[201962.756958]  __arm64_sys_read+0x24/0x30
[201962.764800]  el0_svc_common+0x78/0x130
[201962.772466]  el0_svc_handler+0x38/0x78
[201962.780131]  el0_svc+0x8/0xc

__blkg_prfill_rwstat() tried to get the device name by
'bdi->dev', while the 'dev' have been freed by bdi_release().
The race as following:

blkg_print_stat_bytes         __scsi_remove_device
                              del_gendisk
                                bdi_unregister

                                put_device(bdi->dev)
                                  kfree(bdi->dev)

__blkg_prfill_rwstat
  blkg_dev_name
    //use the freed bdi->dev
    dev_name(blkg->q->backing_dev_info->dev)

                                bdi->dev = NULL

Since blkg_dev_name() have been coverd by rcu_read_lock/unlock(),
we wait all rcu reader before free 'bdi->dev' to avoid use-after-free.

Link: https://lore.kernel.org/linux-block/20200211140038.146629-1-yuyufen@huawei.com/Signed-off-by: NYufen Yu <yuyufen@huawei.com>
Reviewed-by: NJason Yan <yanaijie@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

725ee753

jbd2: make jbd2_handle_buffer_credits() handle reserved handles · c7f65ce1

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit 3c845acd0237caef617f330a0e3b37ad8ae9fea5
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

The helper jbd2_handle_buffer_credits() doesn't correctly handle reserved
handles which can lead to crashes. Fix it getting of journal pointer to
work for reserved handles as well.

Fixes: a9a8344ee171 ("ext4, jbd2: Provide accessor function for handle credits")
Reported-by: NEric Biggers <ebiggers@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191115102210.29445-1-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

c7f65ce1

jbd2: Provide trace event for handle restarts · 92a8f9f1

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit 0094f981bbaca3ae707c95c5e5977429d29c2dd0
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

Provide trace event for handle restarts to ease debugging.
Reviewed-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191105164437.32602-24-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

92a8f9f1

ext4: Reserve revoke credits for freed blocks · a9e4f54d

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit 83448bdfb59731c2f54784ed3f4a93ff95be6e7e
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

So far we have reserved only relatively high fixed amount of revoke
credits for each transaction. We over-reserved by large amount for most
cases but when freeing large directories or files with data journalling,
the fixed amount is not enough. In fact the worst case estimate is
inconveniently large (maximum extent size) for freeing of one extent.

We fix this by doing proper estimate of the amount of blocks that need
to be revoked when removing blocks from the inode due to truncate or
hole punching and otherwise reserve just a small amount of revoke
credits for each transaction to accommodate freeing of xattrs block or
so.
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191105164437.32602-23-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

a9e4f54d

jbd2: Rename h_buffer_credits to h_total_credits · c38a5168

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit 933f1c1e0b75bbc29730eef07c9e196c6dfd37e5
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

The credit counter now contains both buffer and revoke descriptor block
credits. Rename to counter to h_total_credits to reflect that. No
functional change.
Reviewed-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191105164437.32602-21-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>

Conflict:
  fs/jbd2/transaction.c
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

c38a5168

jbd2: Reserve space for revoke descriptor blocks · 38d8c053

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit fdc3ef882a5d59c1709a13b5486ae2b1632e12b6
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

Extend functions for starting, extending, and restarting transaction
handles to take number of revoke records handle must be able to
accommodate. These functions then make sure transaction has enough
credits to be able to store resulting revoke descriptor blocks. Also
revoke code tracks number of revoke records created by a handle to catch
situation where some place didn't reserve enough space for revoke
records. Similarly to standard transaction credits, space for unused
reserved revoke records is released when the handle is stopped.

On the ext4 side we currently take a simplistic approach of reserving
space for 1024 revoke records for any transaction. This grows amount of
credits reserved for each handle only by a few and is enough for any
normal workload so that we don't hit warnings in jbd2. We will refine
the logic in following commits.
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191105164437.32602-20-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>

Conflict:
  include/linux/jbd2.h
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

38d8c053

jbd2: Drop jbd2_space_needed() · a5140ec0

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit 77444ac4f9537bc4211f928959d5231445e30c6e
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

The function is now just a trivial wrapper returning
journal->j_max_transaction_buffers. Drop it.
Reviewed-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191105164437.32602-19-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

a5140ec0

jbd2: Account descriptor blocks into t_outstanding_credits · ad68953d

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit 9f356e5a4f12008fa0df8b6385fc0ab830416e72
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

Currently, journal descriptor blocks were not accounted in
transaction->t_outstanding_credits and we were just leaving some slack
space in the journal for them (in jbd2_log_space_left() and
jbd2_space_needed()). This is making proper accounting (and reservation
we want to add) of descriptor blocks difficult so switch to accounting
descriptor blocks in transaction->t_outstanding_credits and just reserve
the same amount of credits in t_outstanding credits for journal
descriptor blocks when creating transaction.
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191105164437.32602-18-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>

Conflict:
  include/linux/jbd2.h
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

ad68953d

ext4, jbd2: Provide accessor function for handle credits · 02b7342a

由 Jan Kara 提交于 2月 20, 2020

mainline inclusion
from mainline-5.5-rc1
commit a9a8344ee1714f835ba394077e8c13d751e2f148
category: bugfix
bugzilla: 25031
CVE: NA
---------------------------

Provide accessor function to get number of credits available in a handle
and use it from ext4. Later, computation of available credits won't be
so straightforward.
Reviewed-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191105164437.32602-11-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

02b7342a

membarrier/kabi: fix kabi for membarrier_state · 8b9958f0

由 Cheng Jian 提交于 2月 05, 2020

hulk inclusion
category: feature
bugzilla: 28332
CVE: NA

-------------------------------------------------
Signed-off-by: NCheng Jian <cj.chengjian@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

8b9958f0

sched/membarrier: Fix p->mm->membarrier_state racy load · 08946ecc

由 Mathieu Desnoyers 提交于 2月 05, 2020

mainline inclusion
from mainline-5.4-rc1
commit 227a4aadc75ba22fcb6c4e1c078817b8cbaae4ce
category: bugfix
bugzilla: 28332
CVE: NA

-------------------------------------------------

The membarrier_state field is located within the mm_struct, which
is not guaranteed to exist when used from runqueue-lock-free iteration
on runqueues by the membarrier system call.

Copy the membarrier_state from the mm_struct into the scheduler runqueue
when the scheduler switches between mm.

When registering membarrier for mm, after setting the registration bit
in the mm membarrier state, issue a synchronize_rcu() to ensure the
scheduler observes the change. In order to take care of the case
where a runqueue keeps executing the target mm without swapping to
other mm, iterate over each runqueue and issue an IPI to copy the
membarrier_state from the mm_struct into each runqueue which have the
same mm which state has just been modified.

Move the mm membarrier_state field closer to pgd in mm_struct to use
a cache line already touched by the scheduler switch_mm.

The membarrier_execve() (now membarrier_exec_mmap) hook now needs to
clear the runqueue's membarrier state in addition to clear the mm
membarrier state, so move its implementation into the scheduler
membarrier code so it can access the runqueue structure.

Add memory barrier in membarrier_exec_mmap() prior to clearing
the membarrier state, ensuring memory accesses executed prior to exec
are not reordered with the stores clearing the membarrier state.

As suggested by Linus, move all membarrier.c RCU read-side locks outside
of the for each cpu loops.

[Cheng Jian: use task_rcu_dereference in sync_runqueues_membarrier_state]
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190919173705.2181-5-mathieu.desnoyers@efficios.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NCheng Jian <cj.chengjian@huawei.com>
Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

08946ecc

PCI: fix kabi change in struct pci_bus · 822a5290

由 Xiongfeng Wang 提交于 2月 05, 2020

hulk inclusion
category: bugfix
bugzilla: NA
CVE: NA

---------------------------

Fix kabi change in struct pci_bus since the following patch.
12de28f380a9 ("PCI: add a member in 'struct pci_bus' to record the
original 'pci_ops'")
Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: NHanjun Guo <guohanjun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

822a5290

PCI: add a member in 'struct pci_bus' to record the original 'pci_ops' · 03377a8a

由 Xiongfeng Wang 提交于 2月 05, 2020

hulk inclusion
category: bugfix
bugzilla: NA
CVE: NA

-------------------------------------------------

When I test 'aer-inject' with the following procedures:
1. inject a fatal error into a upstream PCI bridge
2. remove the upstream bridge by sysfs
3. rescan the PCI tree by 'echo 1 > /sys/bus/pci/rescan'
4. execute command 'rmmod aer-inject'
5. remove the upstream bridge by sysfs again

I came across the following Oops.

[  799.713238] Internal error: Oops: 96000007 [#1] SMP
[  799.718099] Process bash (pid: 10683, stack limit = 0x00000000125a3b1b)
[  799.724686] CPU: 108 PID: 10683 Comm: bash Kdump: loaded Not tainted 4.19.36 #2
[  799.731962] Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 1.05 09/18/2019
[  799.739325] pstate: 40400009 (nZcv daif +PAN -UAO)
[  799.744104] pc : pci_remove_bus+0xc0/0x1c0
[  799.748182] lr : pci_remove_bus+0x94/0x1c0
[  799.752260] sp : ffffa02e335df940
[  799.755560] x29: ffffa02e335df940 x28: ffff2000088216a8
[  799.760849] x27: 1ffff405c66bbfbc x26: ffff20000a9518c0
[  799.766139] x25: ffffa02dea6ec418 x24: 1ffff405bd4dd883
[  799.771427] x23: ffffa02e72576628 x22: 1ffff405ce4aecc0
[  799.776715] x21: ffffa02e72576608 x20: ffff200002e75080
[  799.782003] x19: ffffa02e72576600 x18: 0000000000000000
[  799.787291] x17: 0000000000000000 x16: 0000000000000000
[  799.792578] x15: 0000000000000001 x14: dfff200000000000
[  799.797866] x13: ffff20000a6dfaf0 x12: 0000000000000000
[  799.803154] x11: 1fffe4000159b217 x10: ffff04000159b217
[  799.808442] x9 : dfff200000000000 x8 : ffff20000acd90bf
[  799.813730] x7 : 0000000000000000 x6 : 0000000000000000
[  799.819017] x5 : 0000000000000001 x4 : 0000000000000000
[  799.824306] x3 : 1ffff405dbe62603 x2 : 1fffe400005cea11
[  799.829593] x1 : dfff200000000000 x0 : ffff200002e75088
[  799.834882] Call trace:
[  799.837323]  pci_remove_bus+0xc0/0x1c0
[  799.841056]  pci_remove_bus_device+0xd0/0x2f0
[  799.845392]  pci_stop_and_remove_bus_device_locked+0x2c/0x40
[  799.851028]  remove_store+0x1b8/0x1d0
[  799.854679]  dev_attr_store+0x60/0x80
[  799.858330]  sysfs_kf_write+0x104/0x170
[  799.862149]  kernfs_fop_write+0x23c/0x430
[  799.866143]  __vfs_write+0xec/0x4e0
[  799.869615]  vfs_write+0x12c/0x3d0
[  799.873001]  ksys_write+0xd0/0x190
[  799.876389]  __arm64_sys_write+0x70/0xa0
[  799.880298]  el0_svc_common+0xfc/0x278
[  799.884030]  el0_svc_handler+0x50/0xc0
[  799.887764]  el0_svc+0x8/0xc
[  799.890634] Code: d2c40001 f2fbffe1 91002280 d343fc02 (38e16841)
[  799.896700] kernel fault(0x1) notification starting on CPU 108

It is because when we alloc a new bus in rescanning process, the
'pci_ops' of the newly allocced 'pci_bus' is inherited from its parent
pci bus. Whereas, the 'pci_ops' of the parent bus may be changed to
'aer_inj_pci_ops' in 'aer_inject()'. When we unload the module
'aer_inject', we only restore the 'pci_ops' for the pci bus of the
error-injected device and the root port in 'aer_inject_exit'. After we
have unloaded the module, the 'pci_ops' of the newly allocced pci bus is
still 'aer_inj_pci_ops'. When we access it, an Oops happened.

This patch add a member 'backup_ops' in 'struct pci_bus' to record the
original 'ops'. When we alloc a child pci bus, we assign the
'backup_ops' of the parent bus to the 'ops' of the child bus.

Maybe the best way is to not modify the 'pci_ops' in 'struct pci_bus',
but this will refactor the 'aer_inject' framework a lot. I haven't found
a better way to handle it.
Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: NHanjun Guo <guohanjun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

03377a8a

PM / hibernate: introduce system_in_hibernation · ba4c6e55

由 Hongbo Yao 提交于 1月 19, 2020

hulk inclusion
category: bugfix
bugzilla: 26326
CVE: NA

-------------------------------------------------
Introduce boolean function system_in_hibernation() returning
'true' when the system carrying out hibernation.

Some device drivers or syscore need such a function to check
if it is in the phase of hibernation.
Signed-off-by: NHongbo Yao <yaohongbo@huawei.com>
Reviewed-by: NHanjun Guo <guohanjun@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

ba4c6e55

f2fs: support swap file w/ DIO · c9c60491

由 Jaegeuk Kim 提交于 1月 16, 2020

mainline inclusion
from mainline-v5.3-rc1
commit 4969c06a0d83c9c3dc50b8efcdc8eeedfce896f6
category: bugfix
bugzilla: 13690
CVE: CVE-2019-19815

-------------------------------------------------
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
Conflicts:
  fs/f2fs/f2fs.h
  fs/f2fs/data.c
[yyl: adjust context]
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-by: NWei Fang <fangwei1@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

c9c60491

cfg80211/mac80211: make ieee80211_send_layer2_update a public function · d869e0d7

由 Dedy Lansky 提交于 1月 16, 2020

mainline inclusion
from mainline-v4.20-rc1
commit 30ca1aa536211f5ac3de0173513a7a99a98a97f3
category: bugfix
bugzilla: 13690
CVE: CVE-2019-5108

This patch is prepare for fixing CVE-2019-5108

-------------------------------------------------

Make ieee80211_send_layer2_update() a common function so other drivers
can re-use it.
Signed-off-by: NDedy Lansky <dlansky@codeaurora.org>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Conflicts:
  net/wireless/util.c
[yyl: adjust context]
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-by: NWenan Mao <maowenan@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

d869e0d7

macvlan: do not assume mac_header is set in macvlan_broadcast() · b2b473d5

由 Eric Dumazet 提交于 1月 16, 2020

[ Upstream commit 96cc4b69581db68efc9749ef32e9cf8e0160c509 ]

Use of eth_hdr() in tx path is error prone.

Many drivers call skb_reset_mac_header() before using it,
but others do not.

Commit 6d1ccff6 ("net: reset mac header in dev_start_xmit()")
attempted to fix this generically, but commit d346a3fa
("packet: introduce PACKET_QDISC_BYPASS socket option") brought
back the macvlan bug.

Lets add a new helper, so that tx paths no longer have
to call skb_reset_mac_header() only to get a pointer
to skb->data.

Hopefully we will be able to revert 6d1ccff6
("net: reset mac header in dev_start_xmit()") and save few cycles
in transmit fast path.

BUG: KASAN: use-after-free in __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
BUG: KASAN: use-after-free in mc_hash drivers/net/macvlan.c:251 [inline]
BUG: KASAN: use-after-free in macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
Read of size 4 at addr ffff8880a4932401 by task syz-executor947/9579

CPU: 0 PID: 9579 Comm: syz-executor947 Not tainted 5.5.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
 kasan_report+0x12/0x20 mm/kasan/common.c:639
 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:145
 __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
 mc_hash drivers/net/macvlan.c:251 [inline]
 macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
 macvlan_queue_xmit drivers/net/macvlan.c:520 [inline]
 macvlan_start_xmit+0x402/0x77f drivers/net/macvlan.c:559
 __netdev_start_xmit include/linux/netdevice.h:4447 [inline]
 netdev_start_xmit include/linux/netdevice.h:4461 [inline]
 dev_direct_xmit+0x419/0x630 net/core/dev.c:4079
 packet_direct_xmit+0x1a9/0x250 net/packet/af_packet.c:240
 packet_snd net/packet/af_packet.c:2966 [inline]
 packet_sendmsg+0x260d/0x6220 net/packet/af_packet.c:2991
 sock_sendmsg_nosec net/socket.c:639 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:659
 __sys_sendto+0x262/0x380 net/socket.c:1985
 __do_sys_sendto net/socket.c:1997 [inline]
 __se_sys_sendto net/socket.c:1993 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1993
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x442639
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc13549e08 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000442639
RDX: 000000000000000e RSI: 0000000020000080 RDI: 0000000000000003
RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000403bb0 R14: 0000000000000000 R15: 0000000000000000

Allocated by task 9389:
 save_stack+0x23/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 __kasan_kmalloc mm/kasan/common.c:513 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527
 __do_kmalloc mm/slab.c:3656 [inline]
 __kmalloc+0x163/0x770 mm/slab.c:3665
 kmalloc include/linux/slab.h:561 [inline]
 tomoyo_realpath_from_path+0xc5/0x660 security/tomoyo/realpath.c:252
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
 tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
 security_inode_getattr+0xf2/0x150 security/security.c:1222
 vfs_getattr+0x25/0x70 fs/stat.c:115
 vfs_statx_fd+0x71/0xc0 fs/stat.c:145
 vfs_fstat include/linux/fs.h:3265 [inline]
 __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
 __se_sys_newfstat fs/stat.c:375 [inline]
 __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9389:
 save_stack+0x23/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 kasan_set_free_info mm/kasan/common.c:335 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x2c0 mm/slab.c:3757
 tomoyo_realpath_from_path+0x1a7/0x660 security/tomoyo/realpath.c:289
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
 tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
 security_inode_getattr+0xf2/0x150 security/security.c:1222
 vfs_getattr+0x25/0x70 fs/stat.c:115
 vfs_statx_fd+0x71/0xc0 fs/stat.c:145
 vfs_fstat include/linux/fs.h:3265 [inline]
 __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
 __se_sys_newfstat fs/stat.c:375 [inline]
 __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8880a4932000
 which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 1025 bytes inside of
 4096-byte region [ffff8880a4932000, ffff8880a4933000)
The buggy address belongs to the page:
page:ffffea0002924c80 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
raw: 00fffe0000010200 ffffea0002846208 ffffea00028f3888 ffff8880aa402000
raw: 0000000000000000 ffff8880a4932000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880a4932300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4932380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880a4932400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff8880a4932480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4932500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: b863ceb7 ("[NET]: Add macvlan driver")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

b2b473d5

netfilter: uapi: Avoid undefined left-shift in xt_sctp.h · fccad51d

由 Phil Sutter 提交于 1月 16, 2020

[ Upstream commit 164166558aacea01b99c8c8ffb710d930405ba69 ]

With 'bytes(__u32)' being 32, a left-shift of 31 may happen which is
undefined for the signed 32-bit value 1. Avoid this by declaring 1 as
unsigned.
Signed-off-by: NPhil Sutter <phil@nwl.cc>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

fccad51d

block: fix use-after-free on cached last_lookup partition · f8096783

由 Yufen Yu 提交于 1月 16, 2020

hulk inclusion
category: bugfix
bugzilla: 27962
CVE: NA
---------------------------

delete_partition() clears the cached last_lookup partition. However
the .last_lookup cache may be overwritten by one IO path after
it is cleared from delete_partition(). Then another IO path may
use the cached deleting partition after __delete_partition() is
called, then use-after-free is triggered on the cached partition.

Fixes the issue by the following approach:

1) always get the partition's refcount via hd_struct_try_get() before
setting .last_lookup

2) move clearing .last_lookup from delete_partition() to
__delete_partition() which is release handle of the partition's
percpu-refcount, so that no IO path can overwrite .last_lookup after it
is cleared in __delete_partition().

It is one candidate approach of Yufen's patch[1] which adds overhead
in fast path by indirect lookup which may introduce one extra cacheline
in IO path. Also this patch relies on percpu-refcount's protection, and
it is easier to understand and verify.

[1] https://lore.kernel.org/linux-block/20200109013551.GB9655@ming.t460p/T/#tReported-by: NYufen Yu <yuyufen@huawei.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hou Tao <houtao1@huawei.com>
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Conflict:
	include/linux/genhd.h
	block/blk-core.c
Signed-off-by: NYufen Yu <yuyufen@huawei.com>
Reviewed-by: NHou Tao <houtao1@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

f8096783

net: add annotations on hh->hh_len lockless accesses · acae4f46

由 Eric Dumazet 提交于 1月 16, 2020

[ Upstream commit c305c6ae79e2ce20c22660ceda94f0d86d639a82 ]

KCSAN reported a data-race [1]

While we can use READ_ONCE() on the read sides,
we need to make sure hh->hh_len is written last.

[1]

BUG: KCSAN: data-race in eth_header_cache / neigh_resolve_output

write to 0xffff8880b9dedcb8 of 4 bytes by task 29760 on cpu 0:
 eth_header_cache+0xa9/0xd0 net/ethernet/eth.c:247
 neigh_hh_init net/core/neighbour.c:1463 [inline]
 neigh_resolve_output net/core/neighbour.c:1480 [inline]
 neigh_resolve_output+0x415/0x470 net/core/neighbour.c:1470
 neigh_output include/net/neighbour.h:511 [inline]
 ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
 __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
 __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
 ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
 dst_output include/net/dst.h:436 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ndisc_send_skb+0x459/0x5f0 net/ipv6/ndisc.c:505
 ndisc_send_ns+0x207/0x430 net/ipv6/ndisc.c:647
 rt6_probe_deferred+0x98/0xf0 net/ipv6/route.c:615
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

read to 0xffff8880b9dedcb8 of 4 bytes by task 29572 on cpu 1:
 neigh_resolve_output net/core/neighbour.c:1479 [inline]
 neigh_resolve_output+0x113/0x470 net/core/neighbour.c:1470
 neigh_output include/net/neighbour.h:511 [inline]
 ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
 __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
 __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
 ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
 dst_output include/net/dst.h:436 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ndisc_send_skb+0x459/0x5f0 net/ipv6/ndisc.c:505
 ndisc_send_ns+0x207/0x430 net/ipv6/ndisc.c:647
 rt6_probe_deferred+0x98/0xf0 net/ipv6/route.c:615
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 29572 Comm: kworker/1:4 Not tainted 5.4.0-rc6+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events rt6_probe_deferred
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

acae4f46

net: core: limit nested device depth · cdb5d42b

由 Taehee Yoo 提交于 1月 16, 2020

[ Upstream commit 5343da4c17429efaa5fb1594ea96aee1a283e694 ]

Current code doesn't limit the number of nested devices.
Nested devices would be handled recursively and this needs huge stack
memory. So, unlimited nested devices could make stack overflow.

This patch adds upper_level and lower_level, they are common variables
and represent maximum lower/upper depth.
When upper/lower device is attached or dettached,
{lower/upper}_level are updated. and if maximum depth is bigger than 8,
attach routine fails and returns -EMLINK.

In addition, this patch converts recursive routine of
netdev_walk_all_{lower/upper} to iterator routine.

Test commands:
    ip link add dummy0 type dummy
    ip link add link dummy0 name vlan1 type vlan id 1
    ip link set vlan1 up

    for i in {2..55}
    do
	    let A=$i-1

	    ip link add vlan$i link vlan$A type vlan id $i
    done
    ip link del dummy0

Splat looks like:
[  155.513226][  T908] BUG: KASAN: use-after-free in __unwind_start+0x71/0x850
[  155.514162][  T908] Write of size 88 at addr ffff8880608a6cc0 by task ip/908
[  155.515048][  T908]
[  155.515333][  T908] CPU: 0 PID: 908 Comm: ip Not tainted 5.4.0-rc3+ #96
[  155.516147][  T908] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  155.517233][  T908] Call Trace:
[  155.517627][  T908]
[  155.517918][  T908] Allocated by task 0:
[  155.518412][  T908] (stack is not available)
[  155.518955][  T908]
[  155.519228][  T908] Freed by task 0:
[  155.519885][  T908] (stack is not available)
[  155.520452][  T908]
[  155.520729][  T908] The buggy address belongs to the object at ffff8880608a6ac0
[  155.520729][  T908]  which belongs to the cache names_cache of size 4096
[  155.522387][  T908] The buggy address is located 512 bytes inside of
[  155.522387][  T908]  4096-byte region [ffff8880608a6ac0, ffff8880608a7ac0)
[  155.523920][  T908] The buggy address belongs to the page:
[  155.524552][  T908] page:ffffea0001822800 refcount:1 mapcount:0 mapping:ffff88806c657cc0 index:0x0 compound_mapcount:0
[  155.525836][  T908] flags: 0x100000000010200(slab|head)
[  155.526445][  T908] raw: 0100000000010200 ffffea0001813808 ffffea0001a26c08 ffff88806c657cc0
[  155.527424][  T908] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[  155.528429][  T908] page dumped because: kasan: bad access detected
[  155.529158][  T908]
[  155.529410][  T908] Memory state around the buggy address:
[  155.530060][  T908]  ffff8880608a6b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  155.530971][  T908]  ffff8880608a6c00: fb fb fb fb fb f1 f1 f1 f1 00 f2 f2 f2 f3 f3 f3
[  155.531889][  T908] >ffff8880608a6c80: f3 fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  155.532806][  T908]                                            ^
[  155.533509][  T908]  ffff8880608a6d00: fb fb fb fb fb fb fb fb fb f1 f1 f1 f1 00 00 00
[  155.534436][  T908]  ffff8880608a6d80: f2 f3 f3 f3 f3 fb fb fb 00 00 00 00 00 00 00 00
[ ... ]
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

cdb5d42b

regulator: ab8500: Remove AB8505 USB regulator · cbbb04a5

由 Stephan Gerhold 提交于 1月 16, 2020

commit 99c4f70df3a6446c56ca817c2d0f9c12d85d4e7c upstream.

The USB regulator was removed for AB8500 in
commit 41a06aa7 ("regulator: ab8500: Remove USB regulator").
It was then added for AB8505 in
commit 547f384f ("regulator: ab8500: add support for ab8505").

However, there was never an entry added for it in
ab8505_regulator_match. This causes all regulators after it
to be initialized with the wrong device tree data, eventually
leading to an out-of-bounds array read.

Given that it is not used anywhere in the kernel, it seems
likely that similar arguments against supporting it exist for
AB8505 (it is controlled by hardware).

Therefore, simply remove it like for AB8500 instead of adding
an entry in ab8505_regulator_match.

Fixes: 547f384f ("regulator: ab8500: add support for ab8505")
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: NStephan Gerhold <stephan@gerhold.net>
Reviewed-by: NLinus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20191106173125.14496-1-stephan@gerhold.netSigned-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

cbbb04a5

libata: Fix retrieving of active qcs · c30da1f4

由 Sascha Hauer 提交于 1月 16, 2020

commit 8385d756e114f2df8568e508902d5f9850817ffb upstream.

ata_qc_complete_multiple() is called with a mask of the still active
tags.

mv_sata doesn't have this information directly and instead calculates
the still active tags from the started tags (ap->qc_active) and the
finished tags as (ap->qc_active ^ done_mask)

Since 28361c40 the hw_tag and tag are no longer the same and the
equation is no longer valid. In ata_exec_internal_sg() ap->qc_active is
initialized as 1ULL << ATA_TAG_INTERNAL, but in hardware tag 0 is
started and this will be in done_mask on completion. ap->qc_active ^
done_mask becomes 0x100000000 ^ 0x1 = 0x100000001 and thus tag 0 used as
the internal tag will never be reported as completed.

This is fixed by introducing ata_qc_get_active() which returns the
active hardware tags and calling it where appropriate.

This is tested on mv_sata, but sata_fsl and sata_nv suffer from the same
problem. There is another case in sata_nv that most likely needs fixing
as well, but this looks a little different, so I wasn't confident enough
to change that.

Fixes: 28361c40 ("libata: add extra internal command")
Cc: stable@vger.kernel.org
Tested-by: NPali Rohár <pali.rohar@gmail.com>
Signed-off-by: NSascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

Add missing export of ata_qc_get_active(), as per Pali.
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

c30da1f4

ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys() · e86dc8a4

由 Florian Fainelli 提交于 1月 16, 2020

commit 84b032dbfdf1c139cd2b864e43959510646975f8 upstream.

This reverts commit 6bb86fef
("libahci_platform: Staticize ahci_platform_<en/dis>able_phys()") we are
going to need ahci_platform_{enable,disable}_phys() in a subsequent
commit for ahci_brcm.c in order to properly control the PHY
initialization order.

Also make sure the function prototypes are declared in
include/linux/ahci_platform.h as a result.

Cc: stable@vger.kernel.org
Reviewed-by: NHans de Goede <hdegoede@redhat.com>
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

e86dc8a4

dmaengine: Fix access to uninitialized dma_slave_caps · 2f16dfc9

由 Lukas Wunner 提交于 1月 16, 2020

commit 53a256a9b925b47c7e67fc1f16ca41561a7b877c upstream.

dmaengine_desc_set_reuse() allocates a struct dma_slave_caps on the
stack, populates it using dma_get_slave_caps() and then accesses one
of its members.

However dma_get_slave_caps() may fail and this isn't accounted for,
leading to a legitimate warning of gcc-4.9 (but not newer versions):

In file included from drivers/spi/spi-bcm2835.c:19:0:
drivers/spi/spi-bcm2835.c: In function 'dmaengine_desc_set_reuse':
>> include/linux/dmaengine.h:1370:10: warning: 'caps.descriptor_reuse' is used uninitialized in this function [-Wuninitialized]
if (caps.descriptor_reuse) {

Fix it, thereby also silencing the gcc-4.9 warning.

The issue has been present for 4 years but surfaces only now that
the first caller of dmaengine_desc_set_reuse() has been added in
spi-bcm2835.c. Another user of reusable DMA descriptors has existed
for a while in pxa_camera.c, but it sets the DMA_CTRL_REUSE flag
directly instead of calling dmaengine_desc_set_reuse(). Nevertheless,
tag this commit for stable in case there are out-of-tree users.

Fixes: 27242021 ("dmaengine: Add DMA_CTRL_REUSE")
Reported-by: Nkbuild test robot <lkp@intel.com>
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.3+
Link: https://lore.kernel.org/r/ca92998ccc054b4f2bfd60ef3adbab2913171eac.1575546234.git.lukas@wunner.deSigned-off-by: NVinod Koul <vkoul@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>

2f16dfc9