1. 03 Jun 2023 (5 commits)
  2. 31 May 2023 (7 commits)
  3. 08 Mar 2023 (4 commits)
  4. 28 Feb 2023 (2 commits)
  5. 02 Dec 2022 (2 commits)
  6. 21 Nov 2022 (1 commit)
  7. 18 Nov 2022 (1 commit)
    • md-raid: destroy the bitmap after destroying the thread · 0e996bdc
      Authored by Mikulas Patocka
      stable inclusion
      from stable-v5.10.137
      commit e0bdaed154e5b9cc4310ddaf5da290483d00e6ba
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e0bdaed154e5b9cc4310ddaf5da290483d00e6ba
      
      --------------------------------
      
      commit e151db8e upstream.
      
      When we ran the lvm test "shell/integrity-blocksize-3.sh" on a kernel with
      kasan, we got a failure in write_page.
      
      The reason for the failure is that md_bitmap_destroy is called before
      destroying the thread and the thread may be waiting in the function
      write_page for the bio to complete. When the thread finishes waiting, it
      executes "if (test_bit(BITMAP_WRITE_ERROR, &bitmap->flags))", which
      triggers the kasan warning.
      
      Note that commit 48df498d, which caused this bug, claims that it is
      needed for md-cluster; md-cluster should be checked and may need a
      separate bugfix.
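
      The fixed teardown order can be sketched in user space (a toy pthread model with hypothetical names such as toy_md_thread, not the actual md code): the worker thread must be stopped and joined before the bitmap it dereferences is freed.

      ```c
      #include <assert.h>
      #include <pthread.h>
      #include <stdatomic.h>
      #include <stdlib.h>

      /* Toy stand-ins (hypothetical names) for the md thread and its bitmap. */
      struct toy_bitmap { unsigned long flags; };

      static struct toy_bitmap *bitmap;
      static atomic_int stop_requested;

      /* Like a thread stuck in write_page(): keeps dereferencing the bitmap
       * until it is asked to stop. */
      static void *toy_md_thread(void *arg)
      {
          (void)arg;
          while (!atomic_load(&stop_requested))
              (void)bitmap->flags;
          return NULL;
      }

      /* Fixed teardown order per the commit: stop and join the thread FIRST,
       * then destroy the bitmap it was reading. Returns 0 on success. */
      int run_fixed_teardown(void)
      {
          pthread_t tid;

          bitmap = malloc(sizeof(*bitmap));
          if (!bitmap)
              return -1;
          bitmap->flags = 0;
          atomic_store(&stop_requested, 0);
          if (pthread_create(&tid, NULL, toy_md_thread, NULL))
              return -1;

          atomic_store(&stop_requested, 1); /* 1. ask the thread to exit */
          pthread_join(tid, NULL);          /* 2. nobody touches the bitmap now */
          free(bitmap);                     /* 3. safe: no use-after-free */
          bitmap = NULL;
          return 0;
      }

      int main(void)
      {
          assert(run_fixed_teardown() == 0);
          return 0;
      }
      ```

      Reversing steps 2 and 3 is the bug KASAN caught: the thread could still be inside its loop when the bitmap is freed.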
      
      BUG: KASAN: use-after-free in write_page+0x18d/0x680 [md_mod]
      Read of size 8 at addr ffff889162030c78 by task mdX_raid1/5539
      
      CPU: 10 PID: 5539 Comm: mdX_raid1 Not tainted 5.19.0-rc2 #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x34/0x44
       print_report.cold+0x45/0x57a
       ? __lock_text_start+0x18/0x18
       ? write_page+0x18d/0x680 [md_mod]
       kasan_report+0xa8/0xe0
       ? write_page+0x18d/0x680 [md_mod]
       kasan_check_range+0x13f/0x180
       write_page+0x18d/0x680 [md_mod]
       ? super_sync+0x4d5/0x560 [dm_raid]
       ? md_bitmap_file_kick+0xa0/0xa0 [md_mod]
       ? rs_set_dev_and_array_sectors+0x2e0/0x2e0 [dm_raid]
       ? mutex_trylock+0x120/0x120
       ? preempt_count_add+0x6b/0xc0
       ? preempt_count_sub+0xf/0xc0
       md_update_sb+0x707/0xe40 [md_mod]
       md_reap_sync_thread+0x1b2/0x4a0 [md_mod]
       md_check_recovery+0x533/0x960 [md_mod]
       raid1d+0xc8/0x2a20 [raid1]
       ? var_wake_function+0xe0/0xe0
       ? psi_group_change+0x411/0x500
       ? preempt_count_sub+0xf/0xc0
       ? _raw_spin_lock_irqsave+0x78/0xc0
       ? __lock_text_start+0x18/0x18
       ? raid1_end_read_request+0x2a0/0x2a0 [raid1]
       ? preempt_count_sub+0xf/0xc0
       ? _raw_spin_unlock_irqrestore+0x19/0x40
       ? del_timer_sync+0xa9/0x100
       ? try_to_del_timer_sync+0xc0/0xc0
       ? _raw_spin_lock_irqsave+0x78/0xc0
       ? __lock_text_start+0x18/0x18
       ? __list_del_entry_valid+0x68/0xa0
       ? finish_wait+0xa3/0x100
       md_thread+0x161/0x260 [md_mod]
       ? unregister_md_personality+0xa0/0xa0 [md_mod]
       ? _raw_spin_lock_irqsave+0x78/0xc0
       ? prepare_to_wait_event+0x2c0/0x2c0
       ? unregister_md_personality+0xa0/0xa0 [md_mod]
       kthread+0x148/0x180
       ? kthread_complete_and_exit+0x20/0x20
       ret_from_fork+0x1f/0x30
       </TASK>
      
      Allocated by task 5522:
       kasan_save_stack+0x1e/0x40
       __kasan_kmalloc+0x80/0xa0
       md_bitmap_create+0xa8/0xe80 [md_mod]
       md_run+0x777/0x1300 [md_mod]
       raid_ctr+0x249c/0x4a30 [dm_raid]
       dm_table_add_target+0x2b0/0x620 [dm_mod]
       table_load+0x1c8/0x400 [dm_mod]
       ctl_ioctl+0x29e/0x560 [dm_mod]
       dm_compat_ctl_ioctl+0x7/0x20 [dm_mod]
       __do_compat_sys_ioctl+0xfa/0x160
       do_syscall_64+0x90/0xc0
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Freed by task 5680:
       kasan_save_stack+0x1e/0x40
       kasan_set_track+0x21/0x40
       kasan_set_free_info+0x20/0x40
       __kasan_slab_free+0xf7/0x140
       kfree+0x80/0x240
       md_bitmap_free+0x1c3/0x280 [md_mod]
       __md_stop+0x21/0x120 [md_mod]
       md_stop+0x9/0x40 [md_mod]
       raid_dtr+0x1b/0x40 [dm_raid]
       dm_table_destroy+0x98/0x1e0 [dm_mod]
       __dm_destroy+0x199/0x360 [dm_mod]
       dev_remove+0x10c/0x160 [dm_mod]
       ctl_ioctl+0x29e/0x560 [dm_mod]
       dm_compat_ctl_ioctl+0x7/0x20 [dm_mod]
       __do_compat_sys_ioctl+0xfa/0x160
       do_syscall_64+0x90/0xc0
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 48df498d ("md: move bitmap_destroy to the beginning of __md_stop")
      Signed-off-by: Song Liu <song@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
  8. 19 Oct 2022 (1 commit)
  9. 29 Sep 2022 (2 commits)
  10. 04 Aug 2022 (1 commit)
  11. 20 Jan 2022 (1 commit)
    • md: Fix undefined behaviour in is_mddev_idle · 406295a3
      Authored by zhangwensheng
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4QXS1?from=project-issue
      CVE: NA
      
      --------------------------------
      
      UBSAN reports this problem:
      
      [ 5984.281385] UBSAN: Undefined behaviour in drivers/md/md.c:8175:15
      [ 5984.281390] signed integer overflow:
      [ 5984.281393] -2147483291 - 2072033152 cannot be represented in type 'int'
      [ 5984.281400] CPU: 25 PID: 1854 Comm: md101_resync Kdump: loaded Not tainted 4.19.90
      [ 5984.281404] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDDA
      [ 5984.281406] Call trace:
      [ 5984.281415]  dump_backtrace+0x0/0x310
      [ 5984.281418]  show_stack+0x28/0x38
      [ 5984.281425]  dump_stack+0xec/0x15c
      [ 5984.281430]  ubsan_epilogue+0x18/0x84
      [ 5984.281434]  handle_overflow+0x14c/0x19c
      [ 5984.281439]  __ubsan_handle_sub_overflow+0x34/0x44
      [ 5984.281445]  is_mddev_idle+0x338/0x3d8
      [ 5984.281449]  md_do_sync+0x1bb8/0x1cf8
      [ 5984.281452]  md_thread+0x220/0x288
      [ 5984.281457]  kthread+0x1d8/0x1e0
      [ 5984.281461]  ret_from_fork+0x10/0x18
      
      When the stat accum of the disk is greater than INT_MAX, its value
      becomes negative after casting to 'int', which may lead to overflow
      after subtracting a positive number. In the same way, when the value
      of sync_io is greater than INT_MAX, overflow may also occur. These
      situations lead to undefined behavior.
      
      Moreover, if the stat accum of the disk is close to INT_MAX when the
      raid array is created, the initial value of last_events is set close
      to INT_MAX when mddev initializes its IO event counters, so
      'curr_events - rdev->last_events > 64' will always be false during
      synchronization. If all the disks of the mddev are in this state,
      is_mddev_idle() always returns 1, which may make non-sync IO very
      slow.
      
      To address these problems, use a 64-bit signed integer type for
      sync_io, last_events, and curr_events.
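
      The arithmetic can be illustrated with a small hedged user-space example (toy values and a hypothetical idle_delta helper, not real disk stats): with 64-bit signed math the delta stays exact even after the counter passes INT_MAX, which is exactly where 'int' arithmetic would overflow.

      ```c
      #include <assert.h>
      #include <limits.h>

      /* Toy event counters modeled on curr_events/last_events. With
       * 'long long' the subtraction is well-defined far past INT_MAX. */
      long long idle_delta(long long curr_events, long long last_events)
      {
          return curr_events - last_events;
      }

      int main(void)
      {
          /* Accumulated IO events just below INT_MAX at array creation... */
          long long last = (long long)INT_MAX - 10;
          /* ...and the counter keeps growing past INT_MAX during sync. */
          long long curr = last + 100;

          /* Exact with 64-bit math; the same subtraction on 'int' values
           * would be signed overflow, i.e. the UB that UBSAN reported. */
          assert(idle_delta(curr, last) == 100);
          return 0;
      }
      ```
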
      Signed-off-by: zhangwensheng <zhangwensheng5@huawei.com>
      Reviewed-by: Tao Hou <houtao1@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  12. 14 Jan 2022 (1 commit)
  13. 06 Dec 2021 (1 commit)
  14. 15 Nov 2021 (1 commit)
  15. 19 Oct 2021 (1 commit)
  16. 03 Jun 2021 (5 commits)
    • md: Fix missing unused status line of /proc/mdstat · 1e099dfb
      Authored by Jan Glauber
      stable inclusion
      from stable-5.10.37
      commit 0035a4704557ba66824c08d5759d6e743747410b
      bugzilla: 51868
      CVE: NA
      
      --------------------------------
      
      commit 7abfabaf upstream.
      
      Reading /proc/mdstat with a read buffer too small to fit the unused
      status line in the first read will skip this line from the output.
      
      So 'dd if=/proc/mdstat bs=64 2>/dev/null' will not print a line
      like: unused devices: <none>
      
      Don't return NULL immediately in start() for v=2; call show() once
      so the status line is also printed when the read is split across
      multiple calls.
      
      Cc: stable@vger.kernel.org
      Fixes: 1f4aace6 ("fs/seq_file.c: simplify seq_file iteration code and interface")
      Signed-off-by: Jan Glauber <jglauber@digitalocean.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • md: md_open returns -EBUSY when entering racing area · 640134e4
      Authored by Zhao Heming
      stable inclusion
      from stable-5.10.37
      commit b70b7ec500892f8bc12ffc6f60a3af6fd61d3a8b
      bugzilla: 51868
      CVE: NA
      
      --------------------------------
      
      commit 6a4db2a6 upstream.
      
      commit d3374825 ("md: make devices disappear when they are no longer
      needed.") introduced protection between mddev creation & removal.
      md_open shouldn't create an mddev when the all_mddevs list doesn't
      contain it. With the current code logic, it is very easy to trigger
      a soft lockup in a non-preempt env.
      
      This patch changes md_open's return value from -ERESTARTSYS to
      -EBUSY, which breaks the infinite retry when md_open enters the
      racing area.
      
      This patch only partly fixes the soft lockup issue; the full fix
      needs mddev_find to be split into two functions: mddev_find &
      mddev_find_or_alloc, with md_open calling the new mddev_find (which
      only does the searching job).
      
      For more detail, please refer to Christoph's "split mddev_find"
      patch in later commits.
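
      Why the errno change matters can be sketched with a toy model (hypothetical md_open_racing/try_open helpers, not the block layer): -ERESTARTSYS makes the opener retry the open in a tight loop, while -EBUSY goes straight back to the caller.

      ```c
      #include <assert.h>
      #include <errno.h>

      /* Toy md_open caught in the racing window (hypothetical model):
       * the mddev is mid-removal, so the gendisk check fails. */
      int md_open_racing(int use_ebusy)
      {
          return use_ebusy ? -EBUSY : -ERESTARTSYS;
      }

      /* Toy opener: retries only on -ERESTARTSYS, like the syscall path. */
      int try_open(int use_ebusy, int max_attempts, int *attempts)
      {
          int err = 0;

          *attempts = 0;
          while (*attempts < max_attempts) {
              (*attempts)++;
              err = md_open_racing(use_ebusy);
              if (err != -ERESTARTSYS)
                  break; /* -EBUSY (or success) escapes the loop */
          }
          return err;
      }

      int main(void)
      {
          int attempts;

          /* Old behaviour: spins until the retry budget runs out; on a
           * non-preempt CPU this is the soft lockup. */
          assert(try_open(0, 1000, &attempts) == -ERESTARTSYS && attempts == 1000);
          /* Fixed behaviour: fails fast with -EBUSY on the first attempt. */
          assert(try_open(1, 1000, &attempts) == -EBUSY && attempts == 1);
          return 0;
      }
      ```
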
      
      *** env ***
      kvm-qemu VM 2C1G with 2 iscsi luns
      kernel should be non-preempt
      
      *** script ***
      
      The issue triggers almost every time with the below script
      
      ```
      1  node1="mdcluster1"
      2  node2="mdcluster2"
      3
      4  mdadm -Ss
      5  ssh ${node2} "mdadm -Ss"
      6  wipefs -a /dev/sda /dev/sdb
      7  mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda \
         /dev/sdb --assume-clean
      8
      9  for i in {1..10}; do
      10    echo ==== $i ====;
      11
      12    echo "test  ...."
      13    ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb"
      14    sleep 1
      15
      16    echo "clean  ....."
      17    ssh ${node2} "mdadm -Ss"
      18 done
      ```
      
      I use the mdcluster env to trigger the soft lockup, but it isn't an
      mdcluster-specific bug. Stopping an md array in an mdcluster env
      does more jobs than in a non-cluster array, which leaves enough
      time/gap for the kernel to run md_open.
      
      *** stack ***
      
      ```
      [  884.226509]  mddev_put+0x1c/0xe0 [md_mod]
      [  884.226515]  md_open+0x3c/0xe0 [md_mod]
      [  884.226518]  __blkdev_get+0x30d/0x710
      [  884.226520]  ? bd_acquire+0xd0/0xd0
      [  884.226522]  blkdev_get+0x14/0x30
      [  884.226524]  do_dentry_open+0x204/0x3a0
      [  884.226531]  path_openat+0x2fc/0x1520
      [  884.226534]  ? seq_printf+0x4e/0x70
      [  884.226536]  do_filp_open+0x9b/0x110
      [  884.226542]  ? md_release+0x20/0x20 [md_mod]
      [  884.226543]  ? seq_read+0x1d8/0x3e0
      [  884.226545]  ? kmem_cache_alloc+0x18a/0x270
      [  884.226547]  ? do_sys_open+0x1bd/0x260
      [  884.226548]  do_sys_open+0x1bd/0x260
      [  884.226551]  do_syscall_64+0x5b/0x1e0
      [  884.226554]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      ```
      
      *** rootcause ***
      
      "mdadm -A" (or other array assemble commands) will start a daemon "mdadm
      --monitor" by default. When "mdadm -Ss" is running, the stop action will
      wakeup "mdadm --monitor". The "--monitor" daemon will immediately get
      info from /proc/mdstat. This time mddev in kernel still exist, so
      /proc/mdstat still show md device, which makes "mdadm --monitor" to open
      /dev/md0.
      
      The previously "mdadm -Ss" is removing action, the "mdadm --monitor"
      open action will trigger md_open which is creating action. Racing is
      happening.
      
      ```
      <thread 1>: "mdadm -Ss"
      md_release
        mddev_put deletes mddev from all_mddevs
        queue_work for mddev_delayed_delete
        at this time, "/dev/md0" is still available for opening
      
      <thread 2>: "mdadm --monitor ..."
      md_open
       + mddev_find can't find mddev of /dev/md0, and create a new mddev and
       |    return.
       + trigger "if (mddev->gendisk != bdev->bd_disk)" and return
            -ERESTARTSYS.
      ```
      
      In a non-preempt kernel, <thread 2> occupies the current CPU, and
      the mddev_delayed_delete work queued in <thread 1> can't be
      scheduled either.
      
      In a preempt kernel, the above race can also trigger, but the kernel
      doesn't allow one thread to run on a CPU all the time. After
      <thread 2> has run for some time, the later "mdadm -A" (refer to
      script line 13 above) calls md_alloc to allocate a new gendisk for
      the mddev. This breaks the md_open condition
      "if (mddev->gendisk != bdev->bd_disk)" and returns 0 to the caller,
      so the soft lockup is broken.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Zhao Heming <heming.zhao@suse.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • md: factor out a mddev_find_locked helper from mddev_find · f80d4b29
      Authored by Christoph Hellwig
      stable inclusion
      from stable-5.10.37
      commit cdcfa77a332a57962ee3af255f8769fd5cdf97ad
      bugzilla: 51868
      CVE: NA
      
      --------------------------------
      
      commit 8b57251f upstream.
      
      Factor out a self-contained helper to just look up an mddev by the
      dev_t "unit".
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Heming Zhao <heming.zhao@suse.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Song Liu <song@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • md: split mddev_find · 69eae441
      Authored by Christoph Hellwig
      stable inclusion
      from stable-5.10.37
      commit 07e73740850299e39f1737aff4811e79021f72e5
      bugzilla: 51868
      CVE: NA
      
      --------------------------------
      
      commit 65aa97c4 upstream.
      
      Split mddev_find into a simple mddev_find that just finds an existing
      mddev by the unit number, and a more complicated variant that deals
      with finding or allocating an mddev.
      
      This turns out to fix this bug reported by Zhao Heming.
      
      ----------------------------- snip ------------------------------
      commit d3374825 ("md: make devices disappear when they are no longer
      needed.") introduced protection between mddev creation & removal.
      md_open shouldn't create an mddev when the all_mddevs list doesn't
      contain it. With the current code logic, it is very easy to trigger
      a soft lockup in a non-preempt env.
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • md-cluster: fix use-after-free issue when removing rdev · 152be1b9
      Authored by Heming Zhao
      stable inclusion
      from stable-5.10.37
      commit 61b8c6efbe87c445c3907fc36a9644ed705228f8
      bugzilla: 51868
      CVE: NA
      
      --------------------------------
      
      commit f7c7a2f9 upstream.
      
      md_kick_rdev_from_array will remove the rdev, so we should use
      rdev_for_each_safe to walk the list.
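
      This is the classic "iterate while deleting" bug: the safe iterator caches the next pointer before the loop body can free the current node. A toy user-space equivalent (assumption: simplified list and hypothetical kick_faulty helper, not the kernel macros):

      ```c
      #include <assert.h>
      #include <stdlib.h>

      struct rdev_toy { int faulty; struct rdev_toy *next; };

      /* Like rdev_for_each_safe(): save 'next' in tmp BEFORE the body may
       * free the current node, then unlink like md_kick_rdev_from_array. */
      int kick_faulty(struct rdev_toy **head)
      {
          int kicked = 0;
          struct rdev_toy **link = head;

          for (struct rdev_toy *r = *head, *tmp; r; r = tmp) {
              tmp = r->next;        /* saved before possible removal */
              if (r->faulty) {
                  *link = tmp;      /* unlink the node from the list */
                  free(r);          /* r is dead; tmp keeps us safe */
                  kicked++;
              } else {
                  link = &r->next;
              }
          }
          return kicked;
      }

      struct rdev_toy *mk(int faulty, struct rdev_toy *next)
      {
          struct rdev_toy *r = malloc(sizeof(*r));
          r->faulty = faulty;
          r->next = next;
          return r;
      }

      int main(void)
      {
          struct rdev_toy *head = mk(0, mk(1, mk(0, NULL)));
          int n = 0;

          assert(kick_faulty(&head) == 1); /* one faulty rdev removed */
          for (struct rdev_toy *r = head; r; r = r->next)
              n++;
          assert(n == 2);                  /* two healthy rdevs remain */
          return 0;
      }
      ```

      With the plain iterator, the loop would read r->next from freed memory, which is the use-after-free the crash stack shows.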
      
      How to trigger:
      
      env: Two nodes on kvm-qemu x86_64 VMs (2C2G with 2 iscsi luns).
      
      ```
      node2=192.168.0.3
      
      for i in {1..20}; do
          echo ==== $i `date` ====;
      
          mdadm -Ss && ssh ${node2} "mdadm -Ss"
          wipefs -a /dev/sda /dev/sdb
      
          mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l 1 /dev/sda \
             /dev/sdb --assume-clean
          ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb"
          mdadm --wait /dev/md0
          ssh ${node2} "mdadm --wait /dev/md0"
      
          mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda
          sleep 1
      done
      ```
      
      Crash stack:
      
      ```
      stack segment: 0000 [#1] SMP
      ... ...
      RIP: 0010:md_check_recovery+0x1e8/0x570 [md_mod]
      ... ...
      RSP: 0018:ffffb149807a7d68 EFLAGS: 00010207
      RAX: 0000000000000000 RBX: ffff9d494c180800 RCX: ffff9d490fc01e50
      RDX: fffff047c0ed8308 RSI: 0000000000000246 RDI: 0000000000000246
      RBP: 6b6b6b6b6b6b6b6b R08: ffff9d490fc01e40 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
      R13: ffff9d494c180818 R14: ffff9d493399ef38 R15: ffff9d4933a1d800
      FS:  0000000000000000(0000) GS:ffff9d494f700000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fe68cab9010 CR3: 000000004c6be001 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       raid1d+0x5c/0xd40 [raid1]
       ? finish_task_switch+0x75/0x2a0
       ? lock_timer_base+0x67/0x80
       ? try_to_del_timer_sync+0x4d/0x80
       ? del_timer_sync+0x41/0x50
       ? schedule_timeout+0x254/0x2d0
       ? md_start_sync+0xe0/0xe0 [md_mod]
       ? md_thread+0x127/0x160 [md_mod]
       md_thread+0x127/0x160 [md_mod]
       ? wait_woken+0x80/0x80
       kthread+0x10d/0x130
       ? kthread_park+0xa0/0xa0
       ret_from_fork+0x1f/0x40
      ```
      
      Fixes: dbb64f86 ("md-cluster: Fix adding of new disk with new reload code")
      Fixes: 659b254f ("md-cluster: remove a disk asynchronously from cluster environment")
      Cc: stable@vger.kernel.org
      Reviewed-by: Gang He <ghe@suse.com>
      Signed-off-by: Heming Zhao <heming.zhao@suse.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  17. 09 Mar 2021 (1 commit)
    • md: Set prev_flush_start and flush_bio in an atomic way · 78a62c13
      Authored by Xiao Ni
      stable inclusion
      from stable-5.10.15
      commit fe272570d0037c54c40cba31eaa42a343a5656fa
      bugzilla: 48167
      
      --------------------------------
      
      commit dc5d17a3 upstream.
      
      One customer reported a crash caused by a flush request. A warning
      is triggered before the crash.
      
              /* new request after previous flush is completed */
              if (ktime_after(req_start, mddev->prev_flush_start)) {
                      WARN_ON(mddev->flush_bio);
                      mddev->flush_bio = bio;
                      bio = NULL;
              }
      
      The WARN_ON is triggered. We use a spin lock to protect
      prev_flush_start and flush_bio in md_flush_request, but there is no
      lock protection in md_submit_flush_data. It can set flush_bio to
      NULL first because the compiler may reorder the write instructions.
      
      For example, flush bio1 sets flush_bio to NULL first in
      md_submit_flush_data. An interrupt, or VMware causing an extended
      stall, happens between updating flush_bio and prev_flush_start.
      Because flush_bio is NULL, flush bio2 can get the lock and submit to
      the underlying disks. Then flush bio1 updates prev_flush_start after
      the interrupt or extended stall.
      
      Then flush bio3 enters md_flush_request. Its start time req_start is
      after prev_flush_start, and flush_bio is not NULL (flush bio2 hasn't
      finished), so the WARN_ON triggers. It then calls INIT_WORK again.
      INIT_WORK() re-initializes the list pointers in the work_struct,
      which can result in a corrupted work list and the work_struct being
      queued a second time. With the work list corrupted, invalid work
      items can be used, causing a crash in process_one_work.
      
      We need to make sure only one flush bio can be handled at the same
      time. So add a spin lock in md_submit_flush_data to protect
      prev_flush_start and flush_bio in an atomic way.
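
      The shape of the fix is to publish both fields under one lock, so no reader can see flush_bio cleared without prev_flush_start updated. A minimal pthread sketch (assumption: toy types and names; the kernel uses its own lock and a real bio pointer):

      ```c
      #include <assert.h>
      #include <pthread.h>
      #include <stddef.h>

      struct toy_mddev {
          pthread_mutex_t lock;
          long long prev_flush_start; /* time of last completed flush */
          void *flush_bio;            /* in-flight flush, NULL when idle */
      };

      /* Fixed md_submit_flush_data tail (modeled on the commit text):
       * update prev_flush_start and clear flush_bio as one atomic step,
       * so neither the CPU nor the compiler can let them tear apart. */
      void flush_complete(struct toy_mddev *m, long long start_time)
      {
          pthread_mutex_lock(&m->lock);
          m->prev_flush_start = start_time;
          m->flush_bio = NULL; /* only visible together with the time */
          pthread_mutex_unlock(&m->lock);
      }

      /* md_flush_request-style admission check, under the same lock. */
      int can_start_flush(struct toy_mddev *m, long long req_start)
      {
          int ok;

          pthread_mutex_lock(&m->lock);
          ok = (req_start > m->prev_flush_start) && (m->flush_bio == NULL);
          pthread_mutex_unlock(&m->lock);
          return ok;
      }

      int main(void)
      {
          struct toy_mddev m = { PTHREAD_MUTEX_INITIALIZER, 0, NULL };

          m.flush_bio = &m;                /* a flush is in flight */
          assert(!can_start_flush(&m, 5)); /* busy: must wait */
          flush_complete(&m, 3);
          assert(can_start_flush(&m, 5));  /* new request after completion */
          return 0;
      }
      ```
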
      Reviewed-by: David Jeffery <djeffery@redhat.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
  18. 12 Jan 2021 (2 commits)
    • md/cluster: fix deadlock when node is doing resync job · 4663a7ee
      Authored by Zhao Heming
      stable inclusion
      from stable-5.10.4
      commit d27d1942e173c1c719e00a596f5db60221b05b14
      bugzilla: 46903
      
      --------------------------------
      
      commit bca5b065 upstream.
      
      md-cluster uses MD_CLUSTER_SEND_LOCK so that a node can exclusively
      send a msg. While sending a msg, a node can concurrently receive a
      msg from another node. When a node does a resync job, grabbing
      token_lockres:EX may trigger a deadlock:
      ```
      nodeA                       nodeB
      --------------------     --------------------
      a.
      send METADATA_UPDATED
      held token_lockres:EX
                               b.
                               md_do_sync
                                resync_info_update
                                  send RESYNCING
                                   + set MD_CLUSTER_SEND_LOCK
                                   + wait for holding token_lockres:EX
      
                               c.
                               mdadm /dev/md0 --remove /dev/sdg
                                + held reconfig_mutex
                                + send REMOVE
                                   + wait_event(MD_CLUSTER_SEND_LOCK)
      
                               d.
                               recv_daemon //METADATA_UPDATED from A
                                process_metadata_update
                                 + (mddev_trylock(mddev) ||
                                    MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD)
                                   //this time, both return false forever
      ```
      Explanation:
      a. A sends METADATA_UPDATED.
         This blocks any other node from sending a msg.
      
      b. B does sync jobs, which send RESYNCING at intervals.
         These block waiting to hold the token_lockres:EX lock.
      
      c. B does "mdadm --remove", which sends REMOVE.
         This is blocked by step <b>: MD_CLUSTER_SEND_LOCK is 1.
      
      d. B receives the METADATA_UPDATED msg, which A sent in step <a>.
         This is blocked by step <c>: with the mddev lock held there, the
         wait_event condition can never take the mddev lock. (btw,
         MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD stays ZERO in this scenario.)
      
      There is a similar deadlock in commit 0ba95977
      ("md-cluster: use sync way to handle METADATA_UPDATED msg").
      In that commit, step c is "update sb"; in this patch, step c is
      "mdadm --remove".
      
      To fix this issue, we can follow the solution of
      metadata_update_start, which does the same grab-lock_token action:
      lock_comm can use the same steps to avoid the deadlock, by moving
      MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD from lock_token to lock_comm.
      This slightly enlarges the window of
      MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD, but it is safe & breaks the
      deadlock.
      
      Repro steps (I triggered it only 3 times in hundreds of tests):
      
      two nodes share 3 iSCSI luns: sdg/sdh/sdi. Each lun size is 1GB.
      ```
      ssh root@node2 "mdadm -S --scan"
      mdadm -S --scan
      for i in {g,h,i};do dd if=/dev/zero of=/dev/sd$i oflag=direct bs=1M \
      count=20; done
      
      mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdg /dev/sdh \
       --bitmap-chunk=1M
      ssh root@node2 "mdadm -A /dev/md0 /dev/sdg /dev/sdh"
      
      sleep 5
      
      mkfs.xfs /dev/md0
      mdadm --manage --add /dev/md0 /dev/sdi
      mdadm --wait /dev/md0
      mdadm --grow --raid-devices=3 /dev/md0
      
      mdadm /dev/md0 --fail /dev/sdg
      mdadm /dev/md0 --remove /dev/sdg
      mdadm --grow --raid-devices=2 /dev/md0
      ```
      
      The test script hangs when executing "mdadm --remove".
      
      ```
       # dump stacks by "echo t > /proc/sysrq-trigger"
      md0_cluster_rec D    0  5329      2 0x80004000
      Call Trace:
       __schedule+0x1f6/0x560
       ? _cond_resched+0x2d/0x40
       ? schedule+0x4a/0xb0
       ? process_metadata_update.isra.0+0xdb/0x140 [md_cluster]
       ? wait_woken+0x80/0x80
       ? process_recvd_msg+0x113/0x1d0 [md_cluster]
       ? recv_daemon+0x9e/0x120 [md_cluster]
       ? md_thread+0x94/0x160 [md_mod]
       ? wait_woken+0x80/0x80
       ? md_congested+0x30/0x30 [md_mod]
       ? kthread+0x115/0x140
       ? __kthread_bind_mask+0x60/0x60
       ? ret_from_fork+0x1f/0x40
      
      mdadm           D    0  5423      1 0x00004004
      Call Trace:
       __schedule+0x1f6/0x560
       ? __schedule+0x1fe/0x560
       ? schedule+0x4a/0xb0
       ? lock_comm.isra.0+0x7b/0xb0 [md_cluster]
       ? wait_woken+0x80/0x80
       ? remove_disk+0x4f/0x90 [md_cluster]
       ? hot_remove_disk+0xb1/0x1b0 [md_mod]
       ? md_ioctl+0x50c/0xba0 [md_mod]
       ? wait_woken+0x80/0x80
       ? blkdev_ioctl+0xa2/0x2a0
       ? block_ioctl+0x39/0x40
       ? ksys_ioctl+0x82/0xc0
       ? __x64_sys_ioctl+0x16/0x20
       ? do_syscall_64+0x5f/0x150
       ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      md0_resync      D    0  5425      2 0x80004000
      Call Trace:
       __schedule+0x1f6/0x560
       ? schedule+0x4a/0xb0
       ? dlm_lock_sync+0xa1/0xd0 [md_cluster]
       ? wait_woken+0x80/0x80
       ? lock_token+0x2d/0x90 [md_cluster]
       ? resync_info_update+0x95/0x100 [md_cluster]
       ? raid1_sync_request+0x7d3/0xa40 [raid1]
       ? md_do_sync.cold+0x737/0xc8f [md_mod]
       ? md_thread+0x94/0x160 [md_mod]
       ? md_congested+0x30/0x30 [md_mod]
       ? kthread+0x115/0x140
       ? __kthread_bind_mask+0x60/0x60
       ? ret_from_fork+0x1f/0x40
      ```
      
      Finally, thanks to Xiao for the solution.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Zhao Heming <heming.zhao@suse.com>
      Suggested-by: Xiao Ni <xni@redhat.com>
      Reviewed-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
    • md/cluster: block reshape with remote resync job · e35dbd4a
      Authored by Zhao Heming
      stable inclusion
      from stable-5.10.4
      commit 3ace8d52ee4afac46e4d718612efa09591bd7506
      bugzilla: 46903
      
      --------------------------------
      
      commit a8da01f7 upstream.
      
      A reshape request should be blocked while a resync job is ongoing.
      In a cluster env, a node can start a resync job even if the resync
      cmd isn't executed on it; e.g., the user executes "mdadm --grow" on
      node A, and sometimes node B will start the resync job. However, the
      current update_raid_disks() only checks the local recovery status,
      which is incomplete. As a result, the user can execute "mdadm
      --grow" successfully on the local node, while the remote node
      refuses to do the reshape job when it is doing a resync job. This
      inconsistent handling causes the array to enter an unexpected
      status. If the user doesn't notice this issue and keeps executing
      mdadm cmds, the array stops working in the end.
      
      Fix this issue by blocking the reshape request: when a node executes
      "--grow" and detects an ongoing resync, it should stop and report an
      error to the user.
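
      The decision logic can be sketched as a simple guard (assumption: toy flags and a hypothetical try_reshape helper; the real check lives in update_raid_disks() and consults cluster-wide state):

      ```c
      #include <assert.h>
      #include <errno.h>

      struct toy_array {
          int local_resync;  /* resync running on this node */
          int remote_resync; /* resync running on a peer node (cluster) */
      };

      /* Fixed guard, modeled on the commit text: refuse reshape if ANY
       * node is resyncing, not just the local one. */
      int try_reshape(const struct toy_array *a)
      {
          if (a->local_resync || a->remote_resync)
              return -EBUSY; /* stop and report an error to the user */
          return 0;          /* reshape may proceed */
      }

      int main(void)
      {
          struct toy_array idle = { 0, 0 };
          struct toy_array remote_busy = { 0, 1 };

          assert(try_reshape(&idle) == 0);
          /* The old, local-only check would have let this one through,
           * leaving the two nodes in an inconsistent state. */
          assert(try_reshape(&remote_busy) == -EBUSY);
          return 0;
      }
      ```
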
      
      The following script reproduces the issue with ~100% probability.
      (two nodes share 3 iSCSI luns: sdg/sdh/sdi. Each lun size is 1GB)
      ```
       # on node1, node2 is the remote node.
      ssh root@node2 "mdadm -S --scan"
      mdadm -S --scan
      for i in {g,h,i};do dd if=/dev/zero of=/dev/sd$i oflag=direct bs=1M \
      count=20; done
      
      mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdg /dev/sdh
      ssh root@node2 "mdadm -A /dev/md0 /dev/sdg /dev/sdh"
      
      sleep 5
      
      mdadm --manage --add /dev/md0 /dev/sdi
      mdadm --wait /dev/md0
      mdadm --grow --raid-devices=3 /dev/md0
      
      mdadm /dev/md0 --fail /dev/sdg
      mdadm /dev/md0 --remove /dev/sdg
      mdadm --grow --raid-devices=2 /dev/md0
      ```
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Zhao Heming <heming.zhao@suse.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
  19. 07 Jan 2021 (1 commit)
    • md: fix a warning caused by a race between concurrent md_ioctl()s · a69924ea
      Authored by Dae R. Jeong
      stable inclusion
      from stable-5.10.3
      commit 70eb256f8c8a7b9c698af22157bf20005641883f
      bugzilla: 46871
      
      --------------------------------
      
      commit c731b84b upstream.
      
      Syzkaller reports a warning as below.
      WARNING: CPU: 0 PID: 9647 at drivers/md/md.c:7169
      ...
      Call Trace:
      ...
      RIP: 0010:md_ioctl+0x4017/0x5980 drivers/md/md.c:7169
      RSP: 0018:ffff888096027950 EFLAGS: 00010293
      RAX: ffff88809322c380 RBX: 0000000000000932 RCX: ffffffff84e266f2
      RDX: 0000000000000000 RSI: ffffffff84e299f7 RDI: 0000000000000007
      RBP: ffff888096027bc0 R08: ffff88809322c380 R09: ffffed101341a482
      R10: ffff888096027940 R11: ffff88809a0d240f R12: 0000000000000932
      R13: ffff8880a2c14100 R14: ffff88809a0d2268 R15: ffff88809a0d2408
       __blkdev_driver_ioctl block/ioctl.c:304 [inline]
       blkdev_ioctl+0xece/0x1c10 block/ioctl.c:606
       block_ioctl+0xee/0x130 fs/block_dev.c:1930
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:509 [inline]
       do_vfs_ioctl+0xd5f/0x1380 fs/ioctl.c:696
       ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
       __do_sys_ioctl fs/ioctl.c:720 [inline]
       __se_sys_ioctl fs/ioctl.c:718 [inline]
       __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
       do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      This is caused by a race between two concurrent md_ioctl()s closing
      the array.
      CPU1 (md_ioctl())                   CPU2 (md_ioctl())
      ------                              ------
      set_bit(MD_CLOSING, &mddev->flags);
      did_set_md_closing = true;
                                          WARN_ON_ONCE(test_bit(MD_CLOSING,
                                                  &mddev->flags));
      if(did_set_md_closing)
          clear_bit(MD_CLOSING, &mddev->flags);
      
      Fix the warning by returning immediately if the MD_CLOSING bit is
      set in &mddev->flags, which indicates that the array is being
      closed.
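
      A toy model of the fix (assumption: a plain flag word and a hypothetical md_ioctl_close helper instead of the kernel's atomic bitops): the second closer sees MD_CLOSING already set and bails out with -EBUSY instead of setting and later clearing the bit again.

      ```c
      #include <assert.h>
      #include <errno.h>

      #define MD_CLOSING 0x1UL

      /* Toy md_ioctl() close path, modeled on the commit text. The real
       * code uses atomic test/set bitops on mddev->flags. */
      int md_ioctl_close(unsigned long *flags)
      {
          if (*flags & MD_CLOSING)
              return -EBUSY;    /* array already being closed: fail fast */
          *flags |= MD_CLOSING; /* this caller now owns the close */
          return 0;
      }

      int main(void)
      {
          unsigned long flags = 0;

          assert(md_ioctl_close(&flags) == 0);      /* first closer wins */
          assert(md_ioctl_close(&flags) == -EBUSY); /* racing closer bails,
                                                     * so the WARN_ON path
                                                     * is never reached */
          return 0;
      }
      ```
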
      
      Fixes: 065e519e ("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop")
      Reported-by: syzbot+1e46a0864c1a6e9bd3d8@syzkaller.appspotmail.com
      Cc: stable@vger.kernel.org
      Signed-off-by: Dae R. Jeong <dae.r.jeong@kaist.ac.kr>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>