提交 · b0907cadabcae6f1248f37a32a6e777f9ff6d4aa · openeuler / Kernel

13 1月, 2023 1 次提交

md: fix incorrect declaration about claim_rdev in md_import_device · b0907cad

由 Adrian Huang 提交于 1月 10, 2023

Commit fb541ca4 ("md: remove lock_bdev / unlock_bdev") removes
wrappers for blkdev_get/blkdev_put. However, the uninitialized local
static variable of pointer type 'claim_rdev' in md_import_device()
is NULL, which leads to the following warning call trace:

  WARNING: CPU: 22 PID: 1037 at block/bdev.c:577 bd_prepare_to_claim+0x131/0x150
  CPU: 22 PID: 1037 Comm: mdadm Not tainted 6.2.0-rc3+ #69
  ..
  RIP: 0010:bd_prepare_to_claim+0x131/0x150
  ..
  Call Trace:
   <TASK>
   ? _raw_spin_unlock+0x15/0x30
   ? iput+0x6a/0x220
   blkdev_get_by_dev.part.0+0x4b/0x300
   md_import_device+0x126/0x1d0
   new_dev_store+0x184/0x240
   md_attr_store+0x80/0xf0
   kernfs_fop_write_iter+0x128/0x1c0
   vfs_write+0x2be/0x3c0
   ksys_write+0x5f/0xe0
   do_syscall_64+0x38/0x90
   entry_SYSCALL_64_after_hwframe+0x72/0xdc

It turns out the md device cannot be used:

  md: could not open device unknown-block(259,0).
  md: md127 stopped.

Fix the issue by declaring the local static variable of struct type
and passing the pointer of the variable to blkdev_get_by_dev().

Fixes: fb541ca4 ("md: remove lock_bdev / unlock_bdev")
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAdrian Huang <ahuang12@lenovo.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>

b0907cad

05 1月, 2023 1 次提交

block: handle bio_split_to_limits() NULL return · 613b1488

由 Jens Axboe 提交于 1月 04, 2023

This can't happen right now, but in preparation for allowing
bio_split_to_limits() returning NULL if it ended the bio, check for it
in all the callers.
Signed-off-by: NJens Axboe <axboe@kernel.dk>

613b1488

03 12月, 2022 3 次提交

md: fold unbind_rdev_from_array into md_kick_rdev_from_array · b5c1acf0

由 Christoph Hellwig 提交于 11月 29, 2022

unbind_rdev_from_array is only called from md_kick_rdev_from_array, so
merge it into its only caller.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>

b5c1acf0

md: mark md_kick_rdev_from_array static · d57d9d69

由 Christoph Hellwig 提交于 11月 29, 2022

md_kick_rdev_from_array is only used in md.c, so unexport it and mark
the symbol static.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>

d57d9d69

md: remove lock_bdev / unlock_bdev · fb541ca4

由 Christoph Hellwig 提交于 11月 29, 2022

These wrappers for blkdev_get / blkdev_put just horribly confuse the
code with their odd naming.  Remove them and improve the error unwinding
in md_import_device with the now folded code.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>

fb541ca4

15 11月, 2022 3 次提交

md: fix a crash in mempool_free · 341097ee

由 Mikulas Patocka 提交于 11月 04, 2022

There's a crash in mempool_free when running the lvm test
shell/lvchange-rebuild-raid.sh.

The reason for the crash is this:
* super_written calls atomic_dec_and_test(&mddev->pending_writes) and
  wake_up(&mddev->sb_wait). Then it calls rdev_dec_pending(rdev, mddev)
  and bio_put(bio).
* so, the process that waited on sb_wait and that is woken up is racing
  with bio_put(bio).
* if the process wins the race, it calls bioset_exit before bio_put(bio)
  is executed.
* bio_put(bio) attempts to free a bio into a destroyed bio set - causing
  a crash in mempool_free.

We fix this bug by moving bio_put before atomic_dec_and_test.

We also move rdev_dec_pending before atomic_dec_and_test as suggested by
Neil Brown.

The function md_end_flush has a similar bug - we must call bio_put before
we decrement the number of in-progress bios.

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 11557f0067 P4D 11557f0067 PUD 0
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 0 PID: 73 Comm: kworker/0:1 Not tainted 6.1.0-rc3 #5
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 Workqueue: kdelayd flush_expired_bios [dm_delay]
 RIP: 0010:mempool_free+0x47/0x80
 Code: 48 89 ef 5b 5d ff e0 f3 c3 48 89 f7 e8 32 45 3f 00 48 63 53 08 48 89 c6 3b 53 04 7d 2d 48 8b 43 10 8d 4a 01 48 89 df 89 4b 08 <48> 89 2c d0 e8 b0 45 3f 00 48 8d 7b 30 5b 5d 31 c9 ba 01 00 00 00
 RSP: 0018:ffff88910036bda8 EFLAGS: 00010093
 RAX: 0000000000000000 RBX: ffff8891037b65d8 RCX: 0000000000000001
 RDX: 0000000000000000 RSI: 0000000000000202 RDI: ffff8891037b65d8
 RBP: ffff8891447ba240 R08: 0000000000012908 R09: 00000000003d0900
 R10: 0000000000000000 R11: 0000000000173544 R12: ffff889101a14000
 R13: ffff8891562ac300 R14: ffff889102b41440 R15: ffffe8ffffa00d05
 FS:  0000000000000000(0000) GS:ffff88942fa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000001102e99000 CR4: 00000000000006b0
 Call Trace:
  <TASK>
  clone_endio+0xf4/0x1c0 [dm_mod]
  clone_endio+0xf4/0x1c0 [dm_mod]
  __submit_bio+0x76/0x120
  submit_bio_noacct_nocheck+0xb6/0x2a0
  flush_expired_bios+0x28/0x2f [dm_delay]
  process_one_work+0x1b4/0x300
  worker_thread+0x45/0x3e0
  ? rescuer_thread+0x380/0x380
  kthread+0xc2/0x100
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>
 Modules linked in: brd dm_delay dm_raid dm_mod af_packet uvesafb cfbfillrect cfbimgblt cn cfbcopyarea fb font fbdev tun autofs4 binfmt_misc configfs ipv6 virtio_rng virtio_balloon rng_core virtio_net pcspkr net_failover failover qemu_fw_cfg button mousedev raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod sd_mod t10_pi crc64_rocksoft crc64 virtio_scsi scsi_mod evdev psmouse bsg scsi_common [last unloaded: brd]
 CR2: 0000000000000000
 ---[ end trace 0000000000000000 ]---
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NSong Liu <song@kernel.org>

341097ee

md: introduce md_ro_state · f97a5528

由 Ye Bin 提交于 9月 20, 2022

Introduce md_ro_state for mddev->ro, so it is easy to understand.
Signed-off-by: NYe Bin <yebin10@huawei.com>
Signed-off-by: NSong Liu <song@kernel.org>

f97a5528

md: factor out __md_set_array_info() · 2f6d261e

由 Ye Bin 提交于 9月 20, 2022

Factor out __md_set_array_info(). No functional change.
Signed-off-by: NYe Bin <yebin10@huawei.com>
Signed-off-by: NSong Liu <song@kernel.org>

2f6d261e

27 9月, 2022 1 次提交

block: replace blk_queue_nowait with bdev_nowait · 568ec936

由 Christoph Hellwig 提交于 9月 27, 2022

Replace blk_queue_nowait with a bdev_nowait helpers that takes the
block_device given that the I/O submission path should not have to
look into the request_queue.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NPankaj Raghav <p.raghav@samsung.com>
Link: https://lore.kernel.org/r/20220927075815.269694-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

568ec936

22 9月, 2022 1 次提交

md: Remove extra mddev_get() in md_seq_start() · 3bfc3bcd

由 Logan Gunthorpe 提交于 9月 08, 2022

A regression is seen where mddev devices stay permanently after they
are stopped due to an elevated reference count.

This was tracked down to an extra mddev_get() in md_seq_start().

It only happened rarely because most of the time the md_seq_start()
is called with a zero offset. The path with an extra mddev_get() only
happens when it starts with a non-zero offset.

The commit noted below changed an mddev_get() to check its success
but inadvertently left the original call in. Remove the extra call.

Fixes: 12a6caf2 ("md: only delete entries from all_mddevs when the disk is freed")
Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NGuoqing Jiang <Guoqing.jiang@linux.dev>
Signed-off-by: NSong Liu <song@kernel.org>

3bfc3bcd

25 8月, 2022 3 次提交

md: call __md_stop_writes in md_stop · 0dd84b31

由 Guoqing Jiang 提交于 8月 17, 2022

From the link [1], we can see raid1d was running even after the path
raid_dtr -> md_stop -> __md_stop.

Let's stop write first in destructor to align with normal md-raid to
fix the KASAN issue.

[1]. https://lore.kernel.org/linux-raid/CAPhsuW5gc4AakdGNdF8ubpezAuDLFOYUO_sfMZcec6hQFm8nhg@mail.gmail.com/T/#m7f12bf90481c02c6d2da68c64aeed4779b7df74a

Fixes: 48df498d ("md: move bitmap_destroy to the beginning of __md_stop")
Reported-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NGuoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: NSong Liu <song@kernel.org>

0dd84b31

Revert "md-raid: destroy the bitmap after destroying the thread" · 1d258758

由 Guoqing Jiang 提交于 8月 17, 2022

This reverts commit e151db8e. Because it
obviously breaks clustered raid as noticed by Neil though it fixed KASAN
issue for dm-raid, let's revert it and fix KASAN issue in next commit.

[1]. https://lore.kernel.org/linux-raid/a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev/T/#m95ac225cab7409f66c295772483d091084a6d470

Fixes: e151db8e ("md-raid: destroy the bitmap after destroying the thread")
Signed-off-by: NGuoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: NSong Liu <song@kernel.org>

1d258758

md: Flush workqueue md_rdev_misc_wq in md_alloc() · 5e8daf90

由 David Sloan 提交于 8月 11, 2022

A race condition still exists when removing and re-creating md devices
in test cases. However, it is only seen on some setups.

The race condition was tracked down to a reference still being held
to the kobject by the rdev in the md_rdev_misc_wq which will be released
in rdev_delayed_delete().

md_alloc() waits for previous deletions by waiting on the md_misc_wq,
but the md_rdev_misc_wq may still be holding a reference to a recently
removed device.

To fix this, also flush the md_rdev_misc_wq in md_alloc().
Signed-off-by: NDavid Sloan <david.sloan@eideticom.com>
[logang@deltatee.com: rewrote commit message]
Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NSong Liu <song@kernel.org>

5e8daf90

03 8月, 2022 21 次提交

block: change the blk_queue_split calling convention · 5a97806f

由 Christoph Hellwig 提交于 7月 27, 2022

The double indirect bio leads to somewhat suboptimal code generation.
Instead return the (original or split) bio, and make sure the
request_queue arguments to the lower level helpers is passed after the
bio to avoid constant reshuffling of the argument passing registers.

Also give it and the helpers used to implement it more descriptive names.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220727162300.3089193-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

5a97806f

md-raid: destroy the bitmap after destroying the thread · e151db8e

由 Mikulas Patocka 提交于 7月 24, 2022

When we ran the lvm test "shell/integrity-blocksize-3.sh" on a kernel with
kasan, we got failure in write_page.

The reason for the failure is that md_bitmap_destroy is called before
destroying the thread and the thread may be waiting in the function
write_page for the bio to complete. When the thread finishes waiting, it
executes "if (test_bit(BITMAP_WRITE_ERROR, &bitmap->flags))", which
triggers the kasan warning.

Note that the commit 48df498d that caused this bug claims that it is
neede for md-cluster, you should check md-cluster and possibly find
another bugfix for it.

BUG: KASAN: use-after-free in write_page+0x18d/0x680 [md_mod]
Read of size 8 at addr ffff889162030c78 by task mdX_raid1/5539

CPU: 10 PID: 5539 Comm: mdX_raid1 Not tainted 5.19.0-rc2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x34/0x44
 print_report.cold+0x45/0x57a
 ? __lock_text_start+0x18/0x18
 ? write_page+0x18d/0x680 [md_mod]
 kasan_report+0xa8/0xe0
 ? write_page+0x18d/0x680 [md_mod]
 kasan_check_range+0x13f/0x180
 write_page+0x18d/0x680 [md_mod]
 ? super_sync+0x4d5/0x560 [dm_raid]
 ? md_bitmap_file_kick+0xa0/0xa0 [md_mod]
 ? rs_set_dev_and_array_sectors+0x2e0/0x2e0 [dm_raid]
 ? mutex_trylock+0x120/0x120
 ? preempt_count_add+0x6b/0xc0
 ? preempt_count_sub+0xf/0xc0
 md_update_sb+0x707/0xe40 [md_mod]
 md_reap_sync_thread+0x1b2/0x4a0 [md_mod]
 md_check_recovery+0x533/0x960 [md_mod]
 raid1d+0xc8/0x2a20 [raid1]
 ? var_wake_function+0xe0/0xe0
 ? psi_group_change+0x411/0x500
 ? preempt_count_sub+0xf/0xc0
 ? _raw_spin_lock_irqsave+0x78/0xc0
 ? __lock_text_start+0x18/0x18
 ? raid1_end_read_request+0x2a0/0x2a0 [raid1]
 ? preempt_count_sub+0xf/0xc0
 ? _raw_spin_unlock_irqrestore+0x19/0x40
 ? del_timer_sync+0xa9/0x100
 ? try_to_del_timer_sync+0xc0/0xc0
 ? _raw_spin_lock_irqsave+0x78/0xc0
 ? __lock_text_start+0x18/0x18
 ? __list_del_entry_valid+0x68/0xa0
 ? finish_wait+0xa3/0x100
 md_thread+0x161/0x260 [md_mod]
 ? unregister_md_personality+0xa0/0xa0 [md_mod]
 ? _raw_spin_lock_irqsave+0x78/0xc0
 ? prepare_to_wait_event+0x2c0/0x2c0
 ? unregister_md_personality+0xa0/0xa0 [md_mod]
 kthread+0x148/0x180
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 5522:
 kasan_save_stack+0x1e/0x40
 __kasan_kmalloc+0x80/0xa0
 md_bitmap_create+0xa8/0xe80 [md_mod]
 md_run+0x777/0x1300 [md_mod]
 raid_ctr+0x249c/0x4a30 [dm_raid]
 dm_table_add_target+0x2b0/0x620 [dm_mod]
 table_load+0x1c8/0x400 [dm_mod]
 ctl_ioctl+0x29e/0x560 [dm_mod]
 dm_compat_ctl_ioctl+0x7/0x20 [dm_mod]
 __do_compat_sys_ioctl+0xfa/0x160
 do_syscall_64+0x90/0xc0
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Freed by task 5680:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x40
 kasan_set_free_info+0x20/0x40
 __kasan_slab_free+0xf7/0x140
 kfree+0x80/0x240
 md_bitmap_free+0x1c3/0x280 [md_mod]
 __md_stop+0x21/0x120 [md_mod]
 md_stop+0x9/0x40 [md_mod]
 raid_dtr+0x1b/0x40 [dm_raid]
 dm_table_destroy+0x98/0x1e0 [dm_mod]
 __dm_destroy+0x199/0x360 [dm_mod]
 dev_remove+0x10c/0x160 [dm_mod]
 ctl_ioctl+0x29e/0x560 [dm_mod]
 dm_compat_ctl_ioctl+0x7/0x20 [dm_mod]
 __do_compat_sys_ioctl+0xfa/0x160
 do_syscall_64+0x90/0xc0
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 48df498d ("md: move bitmap_destroy to the beginning of __md_stop")
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e151db8e

md: return the allocated devices from md_alloc · 34cb92c0

由 Christoph Hellwig 提交于 7月 23, 2022

Two callers of md_alloc want to use the newly allocated devices, so
return it instead of letting them find it cumbersomely after the
allocation.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Reviewed-and-tested-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

34cb92c0

md: open code md_probe in autorun_devices · a1108768

由 Christoph Hellwig 提交于 7月 23, 2022

autorun_devices should not be limited to the controls for the legacy
probe on open, so just call md_alloc directly.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Reviewed-and-tested-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

a1108768

md: remove unneeded semicolon · c0250d16

由 Yang Li 提交于 7月 22, 2022

Eliminate the following coccicheck warning:
./drivers/md/md.c:8208:2-3: Unneeded semicolon
Reported-by: NAbaci Robot <abaci@linux.alibaba.com>
Signed-off-by: NYang Li <yang.lee@linux.alibaba.com>
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

c0250d16

md: fix build failure for !MODULE · 2198c51a

由 Stephen Rothwell 提交于 7月 21, 2022

After merging the block tree, today's linux-next build (x86_64
allmodconfig) failed like this:

drivers/md/md.c:717:22: error: 'mddev_find' defined but not used [-Werror=unused-function]
  717 | static struct mddev *mddev_find(dev_t unit)
      |                      ^~~~~~~~~~
cc1: all warnings being treated as errors

Caused by commit

  4500d5c17910 ("md: simplify md_open")

Make mddev_find() available only for non-modular builds.
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220721131132.070be166@canb.auug.org.auSigned-off-by: NJens Axboe <axboe@kernel.dk>

2198c51a

md: simplify md_open · 5b26804b

由 Christoph Hellwig 提交于 7月 19, 2022

Now that devices are on the all_mddevs list until the gendisk is freed,
there can't be any duplicates.  Remove the global list lookup and just
grab a reference.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

5b26804b

md: only delete entries from all_mddevs when the disk is freed · 12a6caf2

由 Christoph Hellwig 提交于 7月 19, 2022

This ensures device names don't get prematurely reused.  Instead add a
deleted flag to skip already deleted devices in mddev_get and other
places that only want to see live mddevs.
Reported-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

12a6caf2

md: stop using for_each_mddev in md_exit · 16648bac

由 Christoph Hellwig 提交于 7月 19, 2022

Just do a simple list_for_each_entry_safe on all_mddevs, and only grab a
reference when we drop the lock and delete the now unused for_each_mddev
macro.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

16648bac

md: stop using for_each_mddev in md_notify_reboot · f2651434

由 Christoph Hellwig 提交于 7月 19, 2022

Just do a simple list_for_each_entry_safe on all_mddevs, and only grab a
reference when we drop the lock.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

f2651434

md: stop using for_each_mddev in md_do_sync · b0e706a1

由 Christoph Hellwig 提交于 7月 19, 2022

Just do a plain list_for_each that only grabs a mddev reference in
the case where the thread sleeps and restarts the list iteration.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

b0e706a1

md: factor out the rdev overlaps check from rdev_size_store · 2652a1bd

由 Christoph Hellwig 提交于 7月 19, 2022

This splits the code into nicely readable chunks and also avoids
the refcount inc/dec manipulations.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

2652a1bd

md: rename md_free to md_kobj_release · 33b614e3

由 Christoph Hellwig 提交于 7月 19, 2022

The md_free name is rather misleading, so pick a better one.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

33b614e3

md: implement ->free_disk · e8c59ac4

由 Christoph Hellwig 提交于 7月 19, 2022

Ensure that all private data is only freed once all accesses are done.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e8c59ac4

md: fix error handling in md_alloc · c57094a6

由 Christoph Hellwig 提交于 7月 19, 2022

Error handling in md_alloc is a mess.  Untangle it to just free the mddev
directly before add_disk is called and thus the gendisk is globally
visible.  After that clear the hold flag and let the mddev_put take care
of cleaning up the mddev through the usual mechanisms.

Fixes: 5e55e2f5 ("[PATCH] md: convert compile time warnings into runtime warnings")
Fixes: 9be68dd7 ("md: add error handling support for add_disk()")
Fixes: 7ad10691 ("md: properly unwind when failing to add the kobject in md_alloc")
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

c57094a6

md: fix mddev->kobj lifetime · ca39f750

由 Christoph Hellwig 提交于 7月 19, 2022

Once a kobject is initialized, the containing object should not be
directly freed.  So delay initialization until it is added.  Also
remove the kobject_del call as the last put will remove the kobject as
well.  The explicitly delete isn't needed here, and dropping it will
simplify further fixes.

With this md_free now does not need to check that ->gendisk is non-NULL
as it is always set by the time that kobject_init is called on
mddev->kobj.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ca39f750

md: unlock mddev before reap sync_thread in action_store · 9dfbdafd

由 Guoqing Jiang 提交于 6月 21, 2022

Since the bug which commit 8b48ec23 ("md: don't unregister sync_thread
with reconfig_mutex held") fixed is related with action_store path, other
callers which reap sync_thread didn't need to be changed.

Let's pull md_unregister_thread from md_reap_sync_thread, then fix previous
bug with belows.

1. unlock mddev before md_reap_sync_thread in action_store.
2. save reshape_position before unlock, then restore it to ensure position
   not changed accidentally by others.
Signed-off-by: NGuoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9dfbdafd

md: Explicitly create command-line configured devices · 05ce7fb9

由 Chris Webb 提交于 6月 01, 2022

Boot-time assembly of arrays with md= command-line arguments breaks when
CONFIG_BLOCK_LEGACY_AUTOLOAD is unset. md_setup_drive() in md-autodetect.c
calls blkdev_get_by_dev(), assuming this implicitly creates the block
device.

Fix this by attempting to md_alloc() the array first. As in the probe path,
ignore any error as failure is caught by blkdev_get_by_dev() anyway.
Signed-off-by: NChris Webb <chris@arachsys.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

05ce7fb9

md: Notify sysfs sync_completed in md_reap_sync_thread() · 9973f0fa

由 Logan Gunthorpe 提交于 6月 08, 2022

The mdadm test 07layouts randomly produces a kernel hung task deadlock.
The deadlock is caused by the suspend_lo/suspend_hi files being set by
the mdadm background process during reshape and not being cleared
because the process hangs. (Leaving aside the issue of the fragility of
freezing kernel tasks by buggy userspace processes...)

When the background mdadm process hangs it, is waiting (without a
timeout) on a change to the sync_completed file signalling that the
reshape has completed. The process is woken up a couple times when
the reshape finishes but it is woken up before MD_RECOVERY_RUNNING
is cleared so sync_completed_show() reports 0 instead of "none".

To fix this, notify the sysfs file in md_reap_sync_thread() after
MD_RECOVERY_RUNNING has been cleared. This wakes up mdadm and causes
it to continue and write to suspend_lo/suspend_hi to allow IO to
continue.
Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9973f0fa

md: Ensure resync is reported after it starts · b368856a

由 Logan Gunthorpe 提交于 6月 08, 2022

The 07layouts test in mdadm fails on some systems. The failure
presents itself as the backup file not being removed before the next
layout is grown into:

  mdadm: /dev/md0: cannot create backup file /tmp/md-test-backup:
      File exists

This is because the background mdadm process, which is responsible for
cleaning up this backup file gets into an infinite loop waiting for
the reshape to start. mdadm checks the mdstat file if a reshape is
going and, if it is not, it waits for an event on the file or times
out in 5 seconds. On faster machines, the reshape may complete before
the 5 seconds times out, and thus the background mdadm process loops
waiting for a reshape to start that has already occurred.

mdadm reads the mdstat file to start, but mdstat does not report that the
reshape has begun, even though it has indeed begun. So the mdstat_wait()
call (in mdadm) which polls on the mdstat file won't ever return until
timing out.

The reason mdstat reports the reshape has started is due to an issue
in status_resync(). recovery_active is subtracted from curr_resync which
will result in a value of zero for the first chunk of reshaped data, and
the resulting read will report no reshape in progress.

To fix this, if "resync - recovery_active" is an overloaded value, force
the value to be MD_RESYNC_ACTIVE so the code reports a resync in progress.
Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

b368856a

md: Use enum for overloaded magic numbers used by mddev->curr_resync · eac58d08

由 Logan Gunthorpe 提交于 6月 08, 2022

Comments in the code document special values used for
mddev->curr_resync. Make this clearer by using an enum to label these
values.

The only functional change is a couple places use the wrong comparison
operator that implied 3 is another special value. They are all
fixed to imply that 3 or greater is an active resync.
Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

eac58d08

15 7月, 2022 2 次提交

md/core: Combine two sync_page_io() arguments · 4ce4c73f

由 Bart Van Assche 提交于 7月 14, 2022

Improve uniformity in the kernel of handling of request operation and
flags by passing these as a single argument.

Cc: Song Liu <song@kernel.org>
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-32-bvanassche@acm.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>

4ce4c73f

block: remove bdevname · 900d156b

由 Christoph Hellwig 提交于 7月 13, 2022

Replace the remaining calls of bdevname with snprintf using the %pg
format specifier.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220713055317.1888500-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

900d156b

28 6月, 2022 1 次提交

block: remove blk_cleanup_disk · 8b9ab626

由 Christoph Hellwig 提交于 6月 19, 2022

blk_cleanup_disk is nothing but a trivial wrapper for put_disk now,
so remove it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

8b9ab626

16 6月, 2022 1 次提交

Revert "md: don't unregister sync_thread with reconfig_mutex held" · d0a18034

由 Guoqing Jiang 提交于 6月 07, 2022

The 07reshape5intr test is broke because of below path.

    md_reap_sync_thread
            -> mddev_unlock
            -> md_unregister_thread(&mddev->sync_thread)

And md_check_recovery is triggered by,

mddev_unlock -> md_wakeup_thread(mddev->thread)

then mddev->reshape_position is set to MaxSector in raid5_finish_reshape
since MD_RECOVERY_INTR is cleared in md_check_recovery, which means
feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's
reshape_position can't be updated accordingly.

Fixes: 8b48ec23 ("md: don't unregister sync_thread with reconfig_mutex held")
Reported-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NGuoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: NSong Liu <song@kernel.org>

d0a18034

23 5月, 2022 2 次提交

md: fix double free of io_acct_set bioset · 42b805af

由 Xiao Ni 提交于 5月 12, 2022

Now io_acct_set is alloc and free in personality. Remove the codes that
free io_acct_set in md_free and md_stop.

Fixes: 0c031fd3 (md: Move alloc/free acct bioset in to personality)
Signed-off-by: NXiao Ni <xni@redhat.com>
Signed-off-by: NSong Liu <song@kernel.org>

42b805af

md: remove most calls to bdevname · 913cce5a

由 Christoph Hellwig 提交于 5月 12, 2022

Use the %pg format specifier to save on stack consumption and code size.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSong Liu <song@kernel.org>

913cce5a

openeuler / Kernel 大约 2 年 前同步成功

openeuler / Kernel
大约 2 年前同步成功