提交 · 442fd787cff86f90543dbbcfa7b11b16de097238 · openeuler / Kernel

21 10月, 2021 40 次提交

btrfs: reset replace target device to allocation state on close · 442fd787

由 Desmond Cheong Zhi Xi 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit c1b249e02a80347fe519b761a8c66008e5e7dcbf
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c1b249e02a80347fe519b761a8c66008e5e7dcbf

--------------------------------

commit 0d977e0e upstream.

This crash was observed with a failed assertion on device close:

  BTRFS: Transaction aborted (error -28)
  WARNING: CPU: 1 PID: 3902 at fs/btrfs/extent-tree.c:2150 btrfs_run_delayed_refs+0x1d2/0x1e0 [btrfs]
  Modules linked in: btrfs blake2b_generic libcrc32c crc32c_intel xor zstd_decompress zstd_compress xxhash lzo_compress lzo_decompress raid6_pq loop
  CPU: 1 PID: 3902 Comm: kworker/u8:4 Not tainted 5.14.0-rc5-default+ #1532
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
  Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
  RIP: 0010:btrfs_run_delayed_refs+0x1d2/0x1e0 [btrfs]
  RSP: 0018:ffffb7a5452d7d80 EFLAGS: 00010282
  RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffffffffabee13c4 RDI: 00000000ffffffff
  RBP: ffff97834176a378 R08: 0000000000000001 R09: 0000000000000001
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff97835195d388
  R13: 0000000005b08000 R14: ffff978385484000 R15: 000000000000016c
  FS:  0000000000000000(0000) GS:ffff9783bd800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000056190d003fe8 CR3: 000000002a81e005 CR4: 0000000000170ea0
  Call Trace:
   flush_space+0x197/0x2f0 [btrfs]
   btrfs_async_reclaim_metadata_space+0x139/0x300 [btrfs]
   process_one_work+0x262/0x5e0
   worker_thread+0x4c/0x320
   ? process_one_work+0x5e0/0x5e0
   kthread+0x144/0x170
   ? set_kthread_struct+0x40/0x40
   ret_from_fork+0x1f/0x30
  irq event stamp: 19334989
  hardirqs last  enabled at (19334997): [<ffffffffab0e0c87>] console_unlock+0x2b7/0x400
  hardirqs last disabled at (19335006): [<ffffffffab0e0d0d>] console_unlock+0x33d/0x400
  softirqs last  enabled at (19334900): [<ffffffffaba0030d>] __do_softirq+0x30d/0x574
  softirqs last disabled at (19334893): [<ffffffffab0721ec>] irq_exit_rcu+0x12c/0x140
  ---[ end trace 45939e308e0dd3c7 ]---
  BTRFS: error (device vdd) in btrfs_run_delayed_refs:2150: errno=-28 No space left
  BTRFS info (device vdd): forced readonly
  BTRFS warning (device vdd): failed setting block group ro: -30
  BTRFS info (device vdd): suspending dev_replace for unmount
  assertion failed: !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state), in fs/btrfs/volumes.c:1150
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/ctree.h:3431!
  invalid opcode: 0000 [#1] PREEMPT SMP
  CPU: 1 PID: 3982 Comm: umount Tainted: G        W         5.14.0-rc5-default+ #1532
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
  RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
  RSP: 0018:ffffb7a5454c7db8 EFLAGS: 00010246
  RAX: 0000000000000068 RBX: ffff978364b91c00 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffffffabee13c4 RDI: 00000000ffffffff
  RBP: ffff9783523a4c00 R08: 0000000000000001 R09: 0000000000000001
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff9783523a4d18
  R13: 0000000000000000 R14: 0000000000000004 R15: 0000000000000003
  FS:  00007f61c8f42800(0000) GS:ffff9783bd800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000056190cffa810 CR3: 0000000030b96002 CR4: 0000000000170ea0
  Call Trace:
   btrfs_close_one_device.cold+0x11/0x55 [btrfs]
   close_fs_devices+0x44/0xb0 [btrfs]
   btrfs_close_devices+0x48/0x160 [btrfs]
   generic_shutdown_super+0x69/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x2c/0xa0
   cleanup_mnt+0x144/0x1b0
   task_work_run+0x59/0xa0
   exit_to_user_mode_loop+0xe7/0xf0
   exit_to_user_mode_prepare+0xaf/0xf0
   syscall_exit_to_user_mode+0x19/0x50
   do_syscall_64+0x4a/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This happens when close_ctree is called while a dev_replace hasn't
completed. In close_ctree, we suspend the dev_replace, but keep the
replace target around so that we can resume the dev_replace procedure
when we mount the root again. This is the call trace:

  close_ctree():
    btrfs_dev_replace_suspend_for_unmount();
    btrfs_close_devices():
      btrfs_close_fs_devices():
        btrfs_close_one_device():
          ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
                 &device->dev_state));

However, since the replace target sticks around, there is a device
with BTRFS_DEV_STATE_REPLACE_TGT set on close, and we fail the
assertion in btrfs_close_one_device.

To fix this, if we come across the replace target device when
closing, we should properly reset it back to allocation state. This
fix also ensures that if a non-target device has a corrupted state and
has the BTRFS_DEV_STATE_REPLACE_TGT bit set, the assertion will still
catch the error.
Reported-by: NDavid Sterba <dsterba@suse.com>
Fixes: b2a61667 ("btrfs: fix rw device counting in __btrfs_free_extra_devids")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: NAnand Jain <anand.jain@oracle.com>
Signed-off-by: NDesmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

442fd787

btrfs: wake up async_delalloc_pages waiters after submit · 00f0b637

由 Josef Bacik 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit 0901af53da8f4cfee3bb364a35f66d0d4a9b93ba
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=0901af53da8f4cfee3bb364a35f66d0d4a9b93ba

--------------------------------

commit ac98141d upstream.

We use the async_delalloc_pages mechanism to make sure that we've
completed our async work before trying to continue our delalloc
flushing.  The reason for this is we need to see any ordered extents
that were created by our delalloc flushing.  However we're waking up
before we do the submit work, which is before we create the ordered
extents.  This is a pretty wide race window where we could potentially
think there are no ordered extents and thus exit shrink_delalloc
prematurely.  Fix this by waking us up after we've done the work to
create ordered extents.

CC: stable@vger.kernel.org # 5.4+
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

00f0b637

io-wq: fix wakeup race when adding new work · 25575955

由 Jens Axboe 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit 9ac218642dfc1309682baa708e81935e5cec5ff3
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9ac218642dfc1309682baa708e81935e5cec5ff3

--------------------------------

commit 87df7fb9 upstream.

When new work is added, io_wqe_enqueue() checks if we need to wake or
create a new worker. But that check is done outside the lock that
otherwise synchronizes us with a worker going to sleep, so we can end
up in the following situation:

CPU0				CPU1
lock
insert work
unlock
atomic_read(nr_running) != 0
				lock
				atomic_dec(nr_running)
no wakeup needed

Hold the wqe lock around the "need to wakeup" check. Then we can also get
rid of the temporary work_flags variable, as we know the work will remain
valid as long as we hold the lock.

Cc: stable@vger.kernel.org
Reported-by: NAndres Freund <andres@anarazel.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

25575955

io_uring: fail links of cancelled timeouts · 62d8cd05

由 Pavel Begunkov 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit 548ee201fb4a3c372a332284df3ecc21749918e1
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=548ee201fb4a3c372a332284df3ecc21749918e1

--------------------------------

commit 2ae2eb9d upstream.

When we cancel a timeout we should mark it with REQ_F_FAIL, so
linked requests are cancelled as well, but not queued for further
execution.

Cc: stable@vger.kernel.org
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/fff625b44eeced3a5cae79f60e6acf3fbdf8f990.1631192135.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

62d8cd05

io_uring: add ->splice_fd_in checks · 046eebfd

由 Pavel Begunkov 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit 54eb6211b979f21c4908719e8bfb005a1143e8c3
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=54eb6211b979f21c4908719e8bfb005a1143e8c3

--------------------------------

commit 26578cda upstream.

->splice_fd_in is used only by splice/tee, but no other request checks
it for validity. Add the check for most of request types excluding
reads/writes/sends/recvs, we don't want overhead for them and can leave
them be as is until the field is actually used.

Cc: stable@vger.kernel.org
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f44bc2acd6777d932de3d71a5692235b5b2b7397.1629451684.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

046eebfd

io_uring: place fixed tables under memcg limits · 4f326107

由 Pavel Begunkov 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit a3ed34bcada565ccd33fc61360fd8157c32b5636
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a3ed34bcada565ccd33fc61360fd8157c32b5636

--------------------------------

commit 0bea96f5 upstream.

Fixed tables may be large enough, place all of them together with
allocated tags under memcg limits.

Cc: stable@vger.kernel.org
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b3ac9f5da9821bb59837b5fe25e8ef4be982218c.1629451684.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

4f326107

io_uring: limit fixed table size by RLIMIT_NOFILE · 1feab377

由 Pavel Begunkov 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit 5103b733348e2d39ecee5bb37ed6ead10a0a1129
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5103b733348e2d39ecee5bb37ed6ead10a0a1129

--------------------------------

Limit the number of files in io_uring fixed tables by RLIMIT_NOFILE,
that's the first and the simpliest restriction that we should impose.

Cc: stable@vger.kernel.org
Suggested-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b2756c340aed7d6c0b302c26dab50c6c5907f4ce.1629451684.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

1feab377

rtc: tps65910: Correct driver module alias · ff75f736

由 Dmitry Osipenko 提交于 10月 21, 2021

stable inclusion
from stable-5.10.67
commit ebedb252a47f754575ac2486298ba6912924e4bf
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ebedb252a47f754575ac2486298ba6912924e4bf

--------------------------------

commit 8d448fa0 upstream.

The TPS65910 RTC driver module doesn't auto-load because of the wrong
module alias that doesn't match the device name, fix it.

Cc: stable@vger.kernel.org
Reported-by: NAnton Bambura <jenneron@protonmail.com>
Tested-by: NAnton Bambura <jenneron@protonmail.com>
Signed-off-by: NDmitry Osipenko <digetx@gmail.com>
Signed-off-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210808160030.8556-1-digetx@gmail.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

ff75f736

ext4: flush s_error_work before journal destroy in ext4_fill_super · bb39867f

由 yangerkun 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.15-rc4
commit bb9464e0
category: bugfix
bugzilla: 176737 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bb9464e08309f6befe80866f5be51778ca355ee9

---------------------------

The error path in ext4_fill_super forget to flush s_error_work before
journal destroy, and it may trigger the follow bug since
flush_stashed_error_work can run concurrently with journal destroy
without any protection for sbi->s_journal.

[32031.740193] EXT4-fs (loop66): get root inode failed
[32031.740484] EXT4-fs (loop66): mount failed
[32031.759805] ------------[ cut here ]------------
[32031.759807] kernel BUG at fs/jbd2/transaction.c:373!
[32031.760075] invalid opcode: 0000 [#1] SMP PTI
[32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded
4.18.0
[32031.765112] Call Trace:
[32031.765375]  ? __switch_to_asm+0x35/0x70
[32031.765635]  ? __switch_to_asm+0x41/0x70
[32031.765893]  ? __switch_to_asm+0x35/0x70
[32031.766148]  ? __switch_to_asm+0x41/0x70
[32031.766405]  ? _cond_resched+0x15/0x40
[32031.766665]  jbd2__journal_start+0xf1/0x1f0 [jbd2]
[32031.766934]  jbd2_journal_start+0x19/0x20 [jbd2]
[32031.767218]  flush_stashed_error_work+0x30/0x90 [ext4]
[32031.767487]  process_one_work+0x195/0x390
[32031.767747]  worker_thread+0x30/0x390
[32031.768007]  ? process_one_work+0x390/0x390
[32031.768265]  kthread+0x10d/0x130
[32031.768521]  ? kthread_flush_work_fn+0x10/0x10
[32031.768778]  ret_from_fork+0x35/0x40

static int start_this_handle(...)
    BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this

Besides, after we enable fast commit, ext4_fc_replay can add work to
s_error_work but return success, so the latter journal destroy in
ext4_load_journal can trigger this problem too.

Fix this problem with two steps:
1. Call ext4_commit_super directly in ext4_handle_error for the case
   that called from ext4_fc_replay
2. Since it's hard to pair the init and flush for s_error_work, we'd
   better add a extras flush_work before journal destroy in
   ext4_fill_super

Besides, this patch will call ext4_commit_super in ext4_handle_error for
any nojournal case too. But it seems safe since the reason we call
schedule_work was that we should save error info to sb through journal
if available. Conversely, for the nojournal case, it seems useless delay
commit superblock to s_error_work.

Fixes: c92dc856 ("ext4: defer saving error info from atomic context")
Fixes: 2d01ddc8 ("ext4: save error info to sb through journal if available")
Cc: stable@kernel.org
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210924093917.1953239-1-yangerkun@huawei.comReviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

bb39867f

crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd() · 8c5d963a

由 Dan Carpenter 提交于 10月 21, 2021

mainline inclusion
from mainline-5.15-rc4
commit 505d9dcb
bugzilla: 182864 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-3744

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=505d9dcb0f7ddf9d075e729523a33d38642ae680

--------------------------------

There are three bugs in this code:

1) If we ccp_init_data() fails for &src then we need to free aad.
   Use goto e_aad instead of goto e_ctx.
2) The label to free the &final_wa was named incorrectly as "e_tag" but
   it should have been "e_final_wa".  One error path leaked &final_wa.
3) The &tag was leaked on one error path.  In that case, I added a free
   before the goto because the resource was local to that block.

Fixes: 36cf515b ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Reported-by: N"minihanshen(沈明航)" <minihanshen@tencent.com>
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: NJohn Allen <john.allen@amd.com>
Tested-by: NJohn Allen <john.allen@amd.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8c5d963a

make OPTIMIZE_INLINING config editable · 090bca0f

由 Guo Xuenan 提交于 10月 21, 2021

hulk inclusion
category: bugfix
bugzilla: 182617 https://gitee.com/openeuler/kernel/issues/I4DDEL

--------------------------------

in commit e2b3eb3c5 disable OPTIMIZE_INLINING by default, make it
editable when using menuconfig
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NJason Yan <yanaijie@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

090bca0f

bpf: Fix integer overflow in prealloc_elems_and_freelist() · 506ab636

由 Xu Kuohai 提交于 10月 21, 2021

mainline inclusion
from mainline-5.15-rc4
commit 30e29a9a
bugzilla: 182857 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-41864

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=30e29a9a2bc6a4888335a6ede968b75cd329657a

-------------------------------------------------

In prealloc_elems_and_freelist(), the multiplication to calculate the
size passed to bpf_map_area_alloc() could lead to an integer overflow.
As a result, out-of-bounds write could occur in pcpu_freelist_populate()
as reported by KASAN:

[...]
[   16.968613] BUG: KASAN: slab-out-of-bounds in pcpu_freelist_populate+0xd9/0x100
[   16.969408] Write of size 8 at addr ffff888104fc6ea0 by task crash/78
[   16.970038]
[   16.970195] CPU: 0 PID: 78 Comm: crash Not tainted 5.15.0-rc2+ #1
[   16.970878] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[   16.972026] Call Trace:
[   16.972306]  dump_stack_lvl+0x34/0x44
[   16.972687]  print_address_description.constprop.0+0x21/0x140
[   16.973297]  ? pcpu_freelist_populate+0xd9/0x100
[   16.973777]  ? pcpu_freelist_populate+0xd9/0x100
[   16.974257]  kasan_report.cold+0x7f/0x11b
[   16.974681]  ? pcpu_freelist_populate+0xd9/0x100
[   16.975190]  pcpu_freelist_populate+0xd9/0x100
[   16.975669]  stack_map_alloc+0x209/0x2a0
[   16.976106]  __sys_bpf+0xd83/0x2ce0
[...]

The possibility of this overflow was originally discussed in [0], but
was overlooked.

Fix the integer overflow by changing elem_size to u64 from u32.

[0] https://lore.kernel.org/bpf/728b238e-a481-eb50-98e9-b0f430ab01e7@gmail.com/

Fixes: 557c0c6e ("bpf: convert stackmap to pre-allocation")
Signed-off-by: NTatsuhiko Yasumatsu <th.yasumatsu@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930135545.173698-1-th.yasumatsu@gmail.comSigned-off-by: NXu Kuohai <xukuohai@huawei.com>
Reviewed-by: NYang Jihong <yangjihong1@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

506ab636

iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries · 12cea97e

由 Lu Baolu 提交于 10月 21, 2021

mainline inclusion
from mainline-5.14-rc2
commit 474dd1c6
category: bugfix
bugzilla: 174363 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=474dd1c6506411752a9b2f2233eec11f1733a099

------------------------------------------------

The commit 2b0140c6 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
fixes an issue of "sub-device is removed where the context entry is cleared
for all aliases". But this commit didn't consider the PASID entry and PASID
table in VT-d scalable mode. This fix increases the coverage of scalable
mode.
Suggested-by: NSanjay Kumar <sanjay.k.kumar@intel.com>
Fixes: 8038bdb8 ("iommu/vt-d: Only clear real DMA device's context entries")
Fixes: 2b0140c6 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: stable@vger.kernel.org # v5.6+
Cc: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210712071712.3416949-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NJing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: NChen Wandun <chenwandun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

12cea97e

iommu/vt-d: Global devTLB flush when present context entry changed · 3a3f71d3

由 Sanjay Kumar 提交于 10月 21, 2021

mainline inclusion
from mainline-5.14-rc2
commit 37764b95
category: bugfix
bugzilla: 174360 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37764b952e1b39053defc7ebe5dcd8c4e3e78de9

------------------------------------------------

This fixes a bug in context cache clear operation. The code was not
following the correct invalidation flow. A global device TLB invalidation
should be added after the IOTLB invalidation. At the same time, it
uses the domain ID from the context entry. But in scalable mode, the
domain ID is in PASID table entry, not context entry.

Fixes: 7373a8cc ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: NSanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210712071315.3416543-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NJing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: NChen Wandun <chenwandun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

3a3f71d3

mm: slub: fix slub_debug disabling for list of slabs · 015e8f1d

由 Vlastimil Babka 提交于 10月 21, 2021

mainline inclusion
from mainline-5.14-rc6
commit a7f1d485
category: bugfix
bugzilla: 176508 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7f1d48585b34730765dcda09ead6edc4ac16a5c

------------------------------------------------

Vijayanand Jitta reports:

  Consider the scenario where CONFIG_SLUB_DEBUG_ON is set and we would
  want to disable slub_debug for few slabs. Using boot parameter with
  slub_debug=-,slab_name syntax doesn't work as expected i.e; only
  disabling debugging for the specified list of slabs. Instead it
  disables debugging for all slabs, which is wrong.

This patch fixes it by delaying the moment when the global slub_debug
flags variable is updated.  In case a "slub_debug=-,slab_name" has been
passed, the global flags remain as initialized (depending on
CONFIG_SLUB_DEBUG_ON enabled or disabled) and are not simply reset to 0.

Link: https://lkml.kernel.org/r/8a3d992a-473a-467b-28a0-4ad2ff60ab82@suse.czSigned-off-by: NVlastimil Babka <vbabka@suse.cz>
Reported-by: NVijayanand Jitta <vjitta@codeaurora.org>
Reviewed-by: NVijayanand Jitta <vjitta@codeaurora.org>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Conflicts:
	mm/slub.c
Signed-off-by: NJing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: NChen Wandun <chenwandun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

015e8f1d

mm: vmscan: fix missing psi annotation for node_reclaim() · 77bd601a

由 Johannes Weiner 提交于 10月 21, 2021

mainline inclusion
from mainline-5.14-rc7
commit 57f29762
category: bugfix
bugzilla: 176927 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=57f29762cdd4687a02f245d1b1e78de046388eac

------------------------------------------------

In a debugging session the other day, Rik noticed that node_reclaim()
was missing memstall annotations.  This means we'll miss pressure and
lost productivity resulting from reclaim on an overloaded local NUMA
node when vm.zone_reclaim_mode is enabled.

There haven't been any reports, but that's likely because
vm.zone_reclaim_mode hasn't been a commonly used feature recently, and
the intersection between such setups and psi users is probably nil.

But secondary memory such as CXL-connected DIMMS, persistent memory etc,
and the page demotion patches that handle them
(https://lore.kernel.org/lkml/20210401183216.443C4443@viggo.jf.intel.com/)
could soon make this a more common codepath again.

Link: https://lkml.kernel.org/r/20210818152457.35846-1-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Reported-by: NRik van Riel <riel@surriel.com>
Reviewed-by: NShakeel Butt <shakeelb@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: NChen Wandun <chenwandun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

77bd601a

ipc: replace costly bailout check in sysvipc_find_ipc() · fc7b1e65

由 Rafael Aquini 提交于 10月 21, 2021

mainline inclusion
from mainline-5.15-rc1
commit 20401d10
bugzilla: 182755 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-3669

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=20401d1058f3f841f35a594ac2fc1293710e55b9

--------------------------------

sysvipc_find_ipc() was left with a costly way to check if the offset
position fed to it is bigger than the total number of IPC IDs in use.  So
much so that the time it takes to iterate over /proc/sysvipc/* files grows
exponentially for a custom benchmark that creates "N" SYSV shm segments
and then times the read of /proc/sysvipc/shm (milliseconds):

    12 msecs to read   1024 segs from /proc/sysvipc/shm
    18 msecs to read   2048 segs from /proc/sysvipc/shm
    65 msecs to read   4096 segs from /proc/sysvipc/shm
   325 msecs to read   8192 segs from /proc/sysvipc/shm
  1303 msecs to read  16384 segs from /proc/sysvipc/shm
  5182 msecs to read  32768 segs from /proc/sysvipc/shm

The root problem lies with the loop that computes the total amount of ids
in use to check if the "pos" feeded to sysvipc_find_ipc() grew bigger than
"ids->in_use".  That is a quite inneficient way to get to the maximum
index in the id lookup table, specially when that value is already
provided by struct ipc_ids.max_idx.

This patch follows up on the optimization introduced via commit
15df03c8 ("sysvipc: make get_maxid O(1) again") and gets rid of the
aforementioned costly loop replacing it by a simpler checkpoint based on
ipc_get_maxidx() returned value, which allows for a smooth linear increase
in time complexity for the same custom benchmark:

     2 msecs to read   1024 segs from /proc/sysvipc/shm
     2 msecs to read   2048 segs from /proc/sysvipc/shm
     4 msecs to read   4096 segs from /proc/sysvipc/shm
     9 msecs to read   8192 segs from /proc/sysvipc/shm
    19 msecs to read  16384 segs from /proc/sysvipc/shm
    39 msecs to read  32768 segs from /proc/sysvipc/shm

Link: https://lkml.kernel.org/r/20210809203554.1562989-1-aquini@redhat.comSigned-off-by: NRafael Aquini <aquini@redhat.com>
Acked-by: NDavidlohr Bueso <dbueso@suse.de>
Acked-by: NManfred Spraul <manfred@colorfullife.com>
Cc: Waiman Long <llong@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

fc7b1e65

bpf, mips: Validate conditional branch offsets · 018337c2

由 Piotr Krysiuk 提交于 10月 21, 2021

maillist inclusion
category: bugfix
bugzilla: 182756 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-38300

Reference: https://lore.kernel.org/bpf/20210915160437.4080-1-piotras@gmail.com/

-------------------------------------------------

The conditional branch instructions on MIPS use 18-bit signed offsets
allowing for a branch range of 128 KBytes (backward and forward).
However, this limit is not observed by the cBPF JIT compiler, and so
the JIT compiler emits out-of-range branches when translating certain
cBPF programs. A specific example of such a cBPF program is included in
the "BPF_MAXINSNS: exec all MSH" test from lib/test_bpf.c that executes
anomalous machine code containing incorrect branch offsets under JIT.

Furthermore, this issue can be abused to craft undesirable machine
code, where the control flow is hijacked to execute arbitrary Kernel
code.

The following steps can be used to reproduce the issue:

  # echo 1 > /proc/sys/net/core/bpf_jit_enable
  # modprobe test_bpf test_name="BPF_MAXINSNS: exec all MSH"

This should produce multiple warnings from build_bimm() similar to:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 209 at arch/mips/mm/uasm-mips.c:210 build_insn+0x558/0x590
  Micro-assembler field overflow
  Modules linked in: test_bpf(+)
  CPU: 0 PID: 209 Comm: modprobe Not tainted 5.14.3 #1
  Stack : 00000000 807bb824 82b33c9c 801843c0 00000000 00000004 00000000 63c9b5ee
          82b33af4 80999898 80910000 80900000 82fd6030 00000001 82b33a98 82087180
          00000000 00000000 80873b28 00000000 000000fc 82b3394c 00000000 2e34312e
          6d6d6f43 809a180f 809a1836 6f6d203a 80900000 00000001 82b33bac 80900000
          00027f80 00000000 00000000 807bb824 00000000 804ed790 001cc317 00000001
  [...]
  Call Trace:
  [<80108f44>] show_stack+0x38/0x118
  [<807a7aac>] dump_stack_lvl+0x5c/0x7c
  [<807a4b3c>] __warn+0xcc/0x140
  [<807a4c3c>] warn_slowpath_fmt+0x8c/0xb8
  [<8011e198>] build_insn+0x558/0x590
  [<8011e358>] uasm_i_bne+0x20/0x2c
  [<80127b48>] build_body+0xa58/0x2a94
  [<80129c98>] bpf_jit_compile+0x114/0x1e4
  [<80613fc4>] bpf_prepare_filter+0x2ec/0x4e4
  [<8061423c>] bpf_prog_create+0x80/0xc4
  [<c0a006e4>] test_bpf_init+0x300/0xba8 [test_bpf]
  [<8010051c>] do_one_initcall+0x50/0x1d4
  [<801c5e54>] do_init_module+0x60/0x220
  [<801c8b20>] sys_finit_module+0xc4/0xfc
  [<801144d0>] syscall_common+0x34/0x58
  [...]
  ---[ end trace a287d9742503c645 ]---

Then the anomalous machine code executes:

=> 0xc0a18000:  addiu   sp,sp,-16
   0xc0a18004:  sw      s3,0(sp)
   0xc0a18008:  sw      s4,4(sp)
   0xc0a1800c:  sw      s5,8(sp)
   0xc0a18010:  sw      ra,12(sp)
   0xc0a18014:  move    s5,a0
   0xc0a18018:  move    s4,zero
   0xc0a1801c:  move    s3,zero

   # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0)
   0xc0a18020:  lui     t6,0x8012
   0xc0a18024:  ori     t4,t6,0x9e14
   0xc0a18028:  li      a1,0
   0xc0a1802c:  jalr    t4
   0xc0a18030:  move    a0,s5
   0xc0a18034:  bnez    v0,0xc0a1ffb8           # incorrect branch offset
   0xc0a18038:  move    v0,zero
   0xc0a1803c:  andi    s4,s3,0xf
   0xc0a18040:  b       0xc0a18048
   0xc0a18044:  sll     s4,s4,0x2
   [...]

   # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0)
   0xc0a1ffa0:  lui     t6,0x8012
   0xc0a1ffa4:  ori     t4,t6,0x9e14
   0xc0a1ffa8:  li      a1,0
   0xc0a1ffac:  jalr    t4
   0xc0a1ffb0:  move    a0,s5
   0xc0a1ffb4:  bnez    v0,0xc0a1ffb8           # incorrect branch offset
   0xc0a1ffb8:  move    v0,zero
   0xc0a1ffbc:  andi    s4,s3,0xf
   0xc0a1ffc0:  b       0xc0a1ffc8
   0xc0a1ffc4:  sll     s4,s4,0x2

   # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0)
   0xc0a1ffc8:  lui     t6,0x8012
   0xc0a1ffcc:  ori     t4,t6,0x9e14
   0xc0a1ffd0:  li      a1,0
   0xc0a1ffd4:  jalr    t4
   0xc0a1ffd8:  move    a0,s5
   0xc0a1ffdc:  bnez    v0,0xc0a3ffb8           # correct branch offset
   0xc0a1ffe0:  move    v0,zero
   0xc0a1ffe4:  andi    s4,s3,0xf
   0xc0a1ffe8:  b       0xc0a1fff0
   0xc0a1ffec:  sll     s4,s4,0x2
   [...]

   # epilogue
   0xc0a3ffb8:  lw      s3,0(sp)
   0xc0a3ffbc:  lw      s4,4(sp)
   0xc0a3ffc0:  lw      s5,8(sp)
   0xc0a3ffc4:  lw      ra,12(sp)
   0xc0a3ffc8:  addiu   sp,sp,16
   0xc0a3ffcc:  jr      ra
   0xc0a3ffd0:  nop

To mitigate this issue, we assert the branch ranges for each emit call
that could generate an out-of-range branch.

Fixes: 36366e36 ("MIPS: BPF: Restore MIPS32 cBPF JIT")
Fixes: c6610de3 ("MIPS: net: Add BPF JIT")
Signed-off-by: NPiotr Krysiuk <piotras@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Tested-by: NJohan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: NJohan Almbladh <johan.almbladh@anyfinetworks.com>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lore.kernel.org/bpf/20210915160437.4080-1-piotras@gmail.comReviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: NKuohai Xu <xukuohai@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

018337c2

ARM: Qualify enabling of swiotlb_init() · 2993f3f4

由 Florian Fainelli 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.13-rc1
commit fcf04489
category: bugfix
bugzilla: 106749 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fcf044891c84e38fc90eb736b818781bccf94e38

-----------------------------------------------

We do not need a SWIOTLB unless we have DRAM that is addressable beyond
the arm_dma_limit. Compare max_pfn with arm_dma_pfn_limit to determine
whether we do need a SWIOTLB to be initialized.

Fixes: ad3c7b18 ("arm: use swiotlb for bounce buffering on LPAE configs")
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: Ntong tiangen <tongtiangen@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

2993f3f4

arm64: mm: account for hotplug memory when randomizing the linear region · 1a537fe8

由 Ard Biesheuvel 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.11-rc1
commit 97d6786e
category: bugfix
bugzilla: 175103 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97d6786e0669daa5c2f2d07a057f574e849dfd3e

-----------------------------------------------

As a hardening measure, we currently randomize the placement of
physical memory inside the linear region when KASLR is in effect.
Since the random offset at which to place the available physical
memory inside the linear region is chosen early at boot, it is
based on the memblock description of memory, which does not cover
hotplug memory. The consequence of this is that the randomization
offset may be chosen such that any hotplugged memory located above
memblock_end_of_DRAM() that appears later is pushed off the end of
the linear region, where it cannot be accessed.

So let's limit this randomization of the linear region to ensure
that this can no longer happen, by using the CPU's addressable PA
range instead. As it is guaranteed that no hotpluggable memory will
appear that falls outside of that range, we can safely put this PA
range sized window anywhere in the linear region.
Signed-off-by: NArd Biesheuvel <ardb@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201014081857.3288-1-ardb@kernel.orgSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NChen Wandun <chenwandun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

1a537fe8

blk-mq-sched: Fix blk_mq_sched_alloc_tags() error handling · a990edfa

由 John Garry 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc1
commit b93af305
category: bugfix
bugzilla: 177012 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b93af3055d6f32d3b0361cfdb110c9399c1241ba

---------------------------

If the blk_mq_sched_alloc_tags() -> blk_mq_alloc_rqs() call fails, then we
call blk_mq_sched_free_tags() -> blk_mq_free_rqs().

It is incorrect to do so, as any rqs would have already been freed in the
blk_mq_alloc_rqs() call.

Fix by calling blk_mq_free_rq_map() only directly.

Fixes: 6917ff0b ("blk-mq-sched: refactor scheduler initialization")
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1627378373-148090-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NLaibin Qiu <qiulaibin@huawei.com>
Reviewed-by: NJason Yan <yanaijie@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

a990edfa

disable OPTIMIZE_INLINING by default · 003a421b

由 Guo Xuenan 提交于 10月 21, 2021

hulk inclusion
category: bugfix
bugzilla: 182617 https://gitee.com/openeuler/kernel/issues/I4DDEL

--------------------------------

for performance reasons, hulk 5.10 do not use OPTIMIZE_INLINING.
it using gnu_inline attribute causing inline functions not really inline,
which introducing performance issues,so we disable it and adapt
some link conflicting functions.
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NJason Yan <yanaijie@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

003a421b

Revert "compiler: remove CONFIG_OPTIMIZE_INLINING entirely" · 077d8826

由 Guo Xuenan 提交于 10月 21, 2021

hulk inclusion
category: bugfix
bugzilla: 182617 https://gitee.com/openeuler/kernel/issues/I4DDEL

--------------------------------

This reverts commit 889b3c12.
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NJason Yan <yanaijie@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

077d8826

ARM: Support KFENCE for ARM · 7d1c8c11

由 Kefeng Wang 提交于 10月 21, 2021

hulk inclusion
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

-----------------------------------------------

Add architecture specific implementation details for KFENCE and enable
KFENCE on ARM. In particular, this implements the required interface in
 <asm/kfence.h>.

KFENCE requires that attributes for pages from its memory pool can
individually be set. Therefore, force the kfence pool to be mapped
at page granularity.

Testing this patch using the testcases in kfence_test.c and all passed
with or without ARM_LPAE.
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

7d1c8c11

ARM: mm: Provide is_write_fault() · 5e954eaa

由 Kefeng Wang 提交于 10月 21, 2021

hulk inclusion
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

-----------------------------------------------

The function will check whether the fault is caused by a write access.
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

5e954eaa

ARM: mm: Provide set_memory_valid() · 12c11a8b

由 Kefeng Wang 提交于 10月 21, 2021

hulk inclusion
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

-----------------------------------------------

This function validates and invalidates PTE entries.
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

12c11a8b

kfence: show cpu and timestamp in alloc/free info · 4d2f24b4

由 Marco Elver 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.15-rc1
commit 4bbf04aa
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

Record cpu and timestamp on allocations and frees, and show them in
reports.  Upon an error, this can help correlate earlier messages in the
kernel log via allocation and free timestamps.

Link: https://lkml.kernel.org/r/20210714175312.2947941-1-elver@google.comSuggested-by: NJoern Engel <joern@purestorage.com>
Signed-off-by: NMarco Elver <elver@google.com>
Acked-by: NAlexander Potapenko <glider@google.com>
Acked-by: NJoern Engel <joern@purestorage.com>
Cc: Yuanyuan Zhong <yzhong@purestorage.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

4d2f24b4

kfence: test: fail fast if disabled at boot · 472e0b04

由 Marco Elver 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.15-rc1
commit c40c6e59
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

Fail kfence_test fast if KFENCE was disabled at boot, instead of each test
case trying several seconds to allocate from KFENCE and failing.  KUnit
will fail all test cases if kunit_suite::init returns an error.

Even if KFENCE was disabled, we still want the test to fail, so that CI
systems that parse KUnit output will alert on KFENCE being disabled
(accidentally or otherwise).

Link: https://lkml.kernel.org/r/20210825105533.1247922-1-elver@google.comSigned-off-by: NMarco Elver <elver@google.com>
Reported-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: NAlexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

472e0b04

slub: force on no_hash_pointers when slub_debug is enabled · 75928e39

由 Stephen Boyd 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc1
commit 79270291
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=792702911f581f7793962fbeb99d5c3a1b28f4c3

-----------------------------------------------

Obscuring the pointers that slub shows when debugging makes for some
confusing slub debug messages:

 Padding overwritten. 0x0000000079f0674a-0x000000000d4dce17

Those addresses are hashed for kernel security reasons.  If we're trying
to be secure with slub_debug on the commandline we have some big problems
given that we dump whole chunks of kernel memory to the kernel logs.
Let's force on the no_hash_pointers commandline flag when slub_debug is on
the commandline.  This makes slub debug messages more meaningful and if by
chance a kernel address is in some slub debug object dump we will have a
better chance of figuring out what went wrong.

Note that we don't use %px in the slub code because we want to reduce the
number of places that %px is used in the kernel.  This also nicely prints
a big fat warning at kernel boot if slub_debug is on the commandline so
that we know that this kernel shouldn't be used on production systems.

[akpm@linux-foundation.org: fix build with CONFIG_SLUB_DEBUG=n]

Link: https://lkml.kernel.org/r/20210601182202.3011020-5-swboyd@chromium.orgSigned-off-by: NStephen Boyd <swboyd@chromium.org>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NPetr Mladek <pmladek@suse.com>
Cc: Joe Perches <joe@perches.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

75928e39

printk: clarify the documentation for plain pointer printing · ac1dd90c

由 Vlastimil Babka 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.13-rc1
commit a48849e2
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a48849e2358ecf1a347a03b33dc27b9b2f25f8fd

-----------------------------------------------

We have several modifiers for plain pointers (%p, %px and %pK) and now
also the no_hash_pointers boot parameter. The documentation should help
to choose which variant to use. Importantly, we should discourage %px
in favor of %p (with the new boot parameter when debugging), and stress
that %pK should be only used for procfs and similar files, not dmesg
buffer. This patch clarifies the documentation in that regard.
Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: NPetr Mladek <pmladek@suse.com>
Signed-off-by: NPetr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210225164639.27212-1-vbabka@suse.czSigned-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

ac1dd90c

lib/vsprintf: do not show no_hash_pointers message multiple times · eeb1fbf7

由 Marco Elver 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.13-rc1
commit 9f961c2e
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f961c2e08741579aa53095d0dbffbcb25a9ae66

-----------------------------------------------

Do not show no_hash_pointers message multiple times if the option was
passed more than once (e.g. via generated command line).
Signed-off-by: NMarco Elver <elver@google.com>
Reviewed-by: NPetr Mladek <pmladek@suse.com>
Signed-off-by: NPetr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210305194206.3165917-1-elver@google.comSigned-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

eeb1fbf7

kfence: add function to mask address bits · 4aff2600

由 Sven Schnelle 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.15-rc1
commit f99e12b2
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

s390 only reports the page address during a translation fault.
To make the kfence unit tests pass, add a function that might
be implemented by architectures to mask out address bits.
Signed-off-by: NSven Schnelle <svens@linux.ibm.com>
Reviewed-by: NMarco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20210728190254.3921642-3-hca@linux.ibm.comSigned-off-by: NHeiko Carstens <hca@linux.ibm.com>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

4aff2600

kfence, x86: only define helpers if !MODULE · d1ff1d58

由 Marco Elver 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.15-rc1
commit 00e67bf0
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

x86's <asm/tlbflush.h> only declares non-module accessible functions
(such as flush_tlb_one_kernel) if !MODULE.

In preparation of including <asm/kfence.h> from the KFENCE test module,
only define the helpers if !MODULE to avoid breaking the build with
CONFIG_KFENCE_KUNIT_TEST=m.
Signed-off-by: NMarco Elver <elver@google.com>
Link: https://lore.kernel.org/r/YQJdarx6XSUQ1tFZ@elver.google.comSigned-off-by: NHeiko Carstens <hca@linux.ibm.com>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

d1ff1d58

kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE · ca25384b

由 Marco Elver 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc7
commit a7cb5d23
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

Originally the addr != NULL check was meant to take care of the case
where __kfence_pool == NULL (KFENCE is disabled).  However, this does
not work for addresses where addr > 0 && addr < KFENCE_POOL_SIZE.

This can be the case on NULL-deref where addr > 0 && addr < PAGE_SIZE or
any other faulting access with addr < KFENCE_POOL_SIZE.  While the
kernel would likely crash, the stack traces and report might be
confusing due to double faults upon KFENCE's attempt to unprotect such
an address.

Fix it by just checking that __kfence_pool != NULL instead.

Link: https://lkml.kernel.org/r/20210818130300.2482437-1-elver@google.com
Fixes: 0ce20dd8 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: NMarco Elver <elver@google.com>
Reported-by: NKuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Acked-by: NAlexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>    [5.12+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

ca25384b

kfence: skip all GFP_ZONEMASK allocations · 73edbd11

由 Alexander Potapenko 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc3
commit 236e9f15
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

Allocation requests outside ZONE_NORMAL (MOVABLE, HIGHMEM or DMA) cannot
be fulfilled by KFENCE, because KFENCE memory pool is located in a zone
different from the requested one.

Because callers of kmem_cache_alloc() may actually rely on the
allocation to reside in the requested zone (e.g.  memory allocations
done with __GFP_DMA must be DMAable), skip all allocations done with
GFP_ZONEMASK and/or respective SLAB flags (SLAB_CACHE_DMA and
SLAB_CACHE_DMA32).

Link: https://lkml.kernel.org/r/20210714092222.1890268-2-glider@google.com
Fixes: 0ce20dd8 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: NAlexander Potapenko <glider@google.com>
Reviewed-by: NMarco Elver <elver@google.com>
Acked-by: NSouptick Joarder <jrdr.linux@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

73edbd11

kfence: move the size check to the beginning of __kfence_alloc() · 4ebce3c0

由 Alexander Potapenko 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc3
commit 235a85cb
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

Check the allocation size before toggling kfence_allocation_gate.

This way allocations that can't be served by KFENCE will not result in
waiting for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating
anything.

Link: https://lkml.kernel.org/r/20210714092222.1890268-1-glider@google.comSigned-off-by: NAlexander Potapenko <glider@google.com>
Suggested-by: NMarco Elver <elver@google.com>
Reviewed-by: NMarco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

4ebce3c0

kfence: defer kfence_test_init to ensure that kunit debugfs is created · 350dc26c

由 Weizhao Ouyang 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc3
commit 32ae8a06
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=

-----------------------------------------------

kfence_test_init and kunit_init both use the same level late_initcall,
which means if kfence_test_init linked ahead of kunit_init,
kfence_test_init will get a NULL debugfs_rootdir as parent dentry, then
kfence_test_init and kfence_debugfs_init both create a debugfs node
named "kfence" under debugfs_mount->mnt_root, and it will throw out
"debugfs: Directory 'kfence' with parent '/' already present!" with
EEXIST.  So kfence_test_init should be deferred.

Link: https://lkml.kernel.org/r/20210714113140.2949995-1-o451686892@gmail.comSigned-off-by: NWeizhao Ouyang <o451686892@gmail.com>
Tested-by: NMarco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

350dc26c

kfence: unconditionally use unbound work queue · 811791b9

由 Marco Elver 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc1
commit 3d1a90ab
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3d1a90ab0ed93362ec8ac85cf291243c87260c21

-----------------------------------------------

Unconditionally use unbound work queue, and not just if wq_power_efficient
is true.  Because if the system is idle, KFENCE may wait, and by being run
on the unbound work queue, we permit the scheduler to make better
scheduling decisions and not require pinning KFENCE to the same CPU upon
waking up.

Link: https://lkml.kernel.org/r/20210521111630.472579-1-elver@google.com
Fixes: 36f0b35d ("kfence: use power-efficient work queue to run delayed work")
Signed-off-by: NMarco Elver <elver@google.com>
Reported-by: NHillf Danton <hdanton@sina.com>
Reviewed-by: NAlexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

811791b9

mm, slub: change run-time assertion in kmalloc_index() to compile-time · 0d59918d

由 Hyeonggon Yoo 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.14-rc1
commit 588c7fa0
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=588c7fa022d7b2361500ead5660d9a1a2ecd9b7d

-----------------------------------------------

Currently when size is not supported by kmalloc_index, compiler will
generate a run-time BUG() while compile-time error is also possible, and
better.  So change BUG to BUILD_BUG_ON_MSG to make compile-time check
possible.

Also remove code that allocates more than 32MB because current
implementation supports only up to 32MB.

[42.hyeyoo@gmail.com: fix support for clang 10]
  Link: https://lkml.kernel.org/r/20210518181247.GA10062@hyeyoo
[vbabka@suse.cz: fix false-positive assert in kernel/bpf/local_storage.c]
  Link: https://lkml.kernel.org/r/bea97388-01df-8eac-091b-a3c89b4a4a09@suse.czLink: https://lkml.kernel.org/r/20210511173448.GA54466@hyeyoo
[elver@google.com: kfence fix]
  Link: https://lkml.kernel.org/r/20210512195227.245000695c9014242e9a00e5@linux-foundation.orgSigned-off-by: NHyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
Reviewed-by: NVlastimil Babka <vbabka@suse.cz>
Signed-off-by: NMarco Elver <elver@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

0d59918d

kfence: use TASK_IDLE when awaiting allocation · 1efe3854

由 Marco Elver 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.13-rc5
commit 8fd0e995
category: feature
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8fd0e995cc7b6a7a8a40bc03d52a2cd445beeff4

-----------------------------------------------

Since wait_event() uses TASK_UNINTERRUPTIBLE by default, waiting for an
allocation counts towards load.  However, for KFENCE, this does not make
any sense, since there is no busy work we're awaiting.

Instead, use TASK_IDLE via wait_event_idle() to not count towards load.

BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1185565
Link: https://lkml.kernel.org/r/20210521083209.3740269-1-elver@google.com
Fixes: 407f1d8c ("kfence: await for allocation using wait_event")
Signed-off-by: NMarco Elver <elver@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Hillf Danton <hdanton@sina.com>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeng Liu <liupeng256@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

1efe3854

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功