提交 · 92d9aac92b7cc92c770e736c70c3acae7b803278 · openeuler / Kernel

26 4月, 2022 6 次提交

md: replace deprecated strlcpy & remove duplicated line · 92d9aac9

由 Heming Zhao 提交于 4月 01, 2022

This commit includes two topics:

1> replace deprecated strlcpy

change strlcpy to strscpy for strlcpy is marked as deprecated in
Documentation/process/deprecated.rst

2> remove duplicated strlcpy line

in md_bitmap_read_sb@md-bitmap.c there are two duplicated strlcpy(), the
history:

- commit cf921cc1 ("Add node recovery callbacks") introduced the first
  usage of strlcpy().

- commit b97e9257 ("Use separate bitmaps for each nodes in the cluster")
  introduced the second strlcpy(). this time, the two strlcpy() are same,
   we can remove anyone safely.

- commit d3b178ad ("md: Skip cluster setup for dm-raid") added dm-raid
  special handling. And the "nodes" value is the key of this patch. but
  from this patch, strlcpy() which was introduced by b97e9257
  become necessary.

- commit 3c462c88 ("md: Increment version for clustered bitmaps") used
  clustered major version to only handle in clustered env. this patch
  could look a polishment for clustered code logic.

So cf921cc1 became useless after d3b178ad, we could remove it
safely.
Signed-off-by: NHeming Zhao <heming.zhao@suse.com>
Signed-off-by: NSong Liu <song@kernel.org>

92d9aac9

md/bitmap: don't set sb values if can't pass sanity check · e68cb83a

由 Heming Zhao 提交于 4月 01, 2022

If bitmap area contains invalid data, kernel will crash then mdadm
triggers "Segmentation fault".
This is cluster-md speical bug. In non-clustered env, mdadm will
handle broken metadata case. In clustered array, only kernel space
handles bitmap slot info. But even this bug only happened in clustered
env, current sanity check is wrong, the code should be changed.

How to trigger: (faulty injection)

dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
mdadm -Ss
echo aaa > magic.txt
 == below modifying slot 2 bitmap data ==
dd if=magic.txt of=/dev/sda seek=16384 bs=1 count=3 <== destroy magic
dd if=/dev/zero of=/dev/sda seek=16436 bs=1 count=4 <== ZERO chunksize
mdadm -A /dev/md0 /dev/sda /dev/sdb
 == kernel crashes. mdadm outputs "Segmentation fault" ==

Reason of kernel crash:

In md_bitmap_read_sb (called by md_bitmap_create), bad bitmap magic didn't
block chunksize assignment, and zero value made DIV_ROUND_UP_SECTOR_T()
trigger "divide error".

Crash log:

kernel: md: md0 stopped.
kernel: md/raid1:md0: not clean -- starting background reconstruction
kernel: md/raid1:md0: active with 2 out of 2 mirrors
kernel: dlm: ... ...
kernel: md-cluster: Joined cluster 44810aba-38bb-e6b8-daca-bc97a0b254aa slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: divide error: 0000 [#1] SMP NOPTI
kernel: CPU: 0 PID: 1603 Comm: mdadm Not tainted 5.14.6-1-default
kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]
kernel: RSP: 0018:ffffc22ac0843ba0 EFLAGS: 00010246
kernel: ... ...
kernel: Call Trace:
kernel:  ? dlm_lock_sync+0xd0/0xd0 [md_cluster 77fe..7a0]
kernel:  md_bitmap_copy_from_slot+0x2c/0x290 [md_mod 24ea..d3a]
kernel:  load_bitmaps+0xec/0x210 [md_cluster 77fe..7a0]
kernel:  md_bitmap_load+0x81/0x1e0 [md_mod 24ea..d3a]
kernel:  do_md_run+0x30/0x100 [md_mod 24ea..d3a]
kernel:  md_ioctl+0x1290/0x15a0 [md_mod 24ea....d3a]
kernel:  ? mddev_unlock+0xaa/0x130 [md_mod 24ea..d3a]
kernel:  ? blkdev_ioctl+0xb1/0x2b0
kernel:  block_ioctl+0x3b/0x40
kernel:  __x64_sys_ioctl+0x7f/0xb0
kernel:  do_syscall_64+0x59/0x80
kernel:  ? exit_to_user_mode_prepare+0x1ab/0x230
kernel:  ? syscall_exit_to_user_mode+0x18/0x40
kernel:  ? do_syscall_64+0x69/0x80
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f4a15fa722b
kernel: ... ...
kernel: ---[ end trace 8afa7612f559c868 ]---
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]
Reported-by: Nkernel test robot <lkp@intel.com>
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Acked-by: NGuoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: NHeming Zhao <heming.zhao@suse.com>
Signed-off-by: NSong Liu <song@kernel.org>

e68cb83a

md: fix an incorrect NULL check in md_reload_sb · 64c54d92

由 Xiaomeng Tong 提交于 4月 08, 2022

The bug is here:
	if (!rdev || rdev->desc_nr != nr) {

The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each_rcu(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
found (In fact, it will be a bogus pointer to an invalid struct
object containing the HEAD). Otherwise it will bypass the check
and lead to invalid memory access passing the check.

To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'pdev' as a dedicated pointer to
point to the found element.

Cc: stable@vger.kernel.org
Fixes: 70bcecdb ("md-cluster: Improve md_reload_sb to be less error prone")
Signed-off-by: NXiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: NSong Liu <song@kernel.org>

64c54d92

md: fix an incorrect NULL check in does_sb_need_changing · fc873834

由 Xiaomeng Tong 提交于 4月 08, 2022

The bug is here:
	if (!rdev)

The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element found.
Otherwise it will bypass the NULL check and lead to invalid memory
access passing the check.

To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'rdev' as a dedicated pointer to
point to the found element.

Cc: stable@vger.kernel.org
Fixes: 2aa82191 ("md-cluster: Perform a lazy update")
Acked-by: NGuoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: NXiaomeng Tong <xiam0nd.tong@gmail.com>
Acked-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: NSong Liu <song@kernel.org>

fc873834

raid5: introduce MD_BROKEN · 57668f0a

由 Mariusz Tkaczyk 提交于 3月 22, 2022

Raid456 module had allowed to achieve failed state. It was fixed by
fb73b357 ("raid5: block failing device if raid will be failed").
This fix introduces a bug, now if raid5 fails during IO, it may result
with a hung task without completion. Faulty flag on the device is
necessary to process all requests and is checked many times, mainly in
analyze_stripe().
Allow to set faulty on drive again and set MD_BROKEN if raid is failed.

As a result, this level is allowed to achieve failed state again, but
communication with userspace (via -EBUSY status) will be preserved.

This restores possibility to fail array via #mdadm --set-faulty command
and will be fixed by additional verification on mdadm side.

Reproduction steps:
 mdadm -CR imsm -e imsm -n 3 /dev/nvme[0-2]n1
 mdadm -CR r5 -e imsm -l5 -n3 /dev/nvme[0-2]n1 --assume-clean
 mkfs.xfs /dev/md126 -f
 mount /dev/md126 /mnt/root/

 fio --filename=/mnt/root/file --size=5GB --direct=1 --rw=randrw
--bs=64k --ioengine=libaio --iodepth=64 --runtime=240 --numjobs=4
--time_based --group_reporting --name=throughput-test-job
--eta-newline=1 &

 echo 1 > /sys/block/nvme2n1/device/device/remove
 echo 1 > /sys/block/nvme1n1/device/device/remove

 [ 1475.787779] Call Trace:
 [ 1475.793111] __schedule+0x2a6/0x700
 [ 1475.799460] schedule+0x38/0xa0
 [ 1475.805454] raid5_get_active_stripe+0x469/0x5f0 [raid456]
 [ 1475.813856] ? finish_wait+0x80/0x80
 [ 1475.820332] raid5_make_request+0x180/0xb40 [raid456]
 [ 1475.828281] ? finish_wait+0x80/0x80
 [ 1475.834727] ? finish_wait+0x80/0x80
 [ 1475.841127] ? finish_wait+0x80/0x80
 [ 1475.847480] md_handle_request+0x119/0x190
 [ 1475.854390] md_make_request+0x8a/0x190
 [ 1475.861041] generic_make_request+0xcf/0x310
 [ 1475.868145] submit_bio+0x3c/0x160
 [ 1475.874355] iomap_dio_submit_bio.isra.20+0x51/0x60
 [ 1475.882070] iomap_dio_bio_actor+0x175/0x390
 [ 1475.889149] iomap_apply+0xff/0x310
 [ 1475.895447] ? iomap_dio_bio_actor+0x390/0x390
 [ 1475.902736] ? iomap_dio_bio_actor+0x390/0x390
 [ 1475.909974] iomap_dio_rw+0x2f2/0x490
 [ 1475.916415] ? iomap_dio_bio_actor+0x390/0x390
 [ 1475.923680] ? atime_needs_update+0x77/0xe0
 [ 1475.930674] ? xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
 [ 1475.938455] xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
 [ 1475.946084] xfs_file_read_iter+0xba/0xd0 [xfs]
 [ 1475.953403] aio_read+0xd5/0x180
 [ 1475.959395] ? _cond_resched+0x15/0x30
 [ 1475.965907] io_submit_one+0x20b/0x3c0
 [ 1475.972398] __x64_sys_io_submit+0xa2/0x180
 [ 1475.979335] ? do_io_getevents+0x7c/0xc0
 [ 1475.986009] do_syscall_64+0x5b/0x1a0
 [ 1475.992419] entry_SYSCALL_64_after_hwframe+0x65/0xca
 [ 1476.000255] RIP: 0033:0x7f11fc27978d
 [ 1476.006631] Code: Bad RIP value.
 [ 1476.073251] INFO: task fio:3877 blocked for more than 120 seconds.

Cc: stable@vger.kernel.org
Fixes: fb73b357 ("raid5: block failing device if raid will be failed")
Reviewd-by: NXiao Ni <xni@redhat.com>
Signed-off-by: NMariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: NSong Liu <song@kernel.org>

57668f0a

md: Set MD_BROKEN for RAID1 and RAID10 · 9631abdb

由 Mariusz Tkaczyk 提交于 3月 22, 2022

There is no direct mechanism to determine raid failure outside
personality. It is done by checking rdev->flags after executing
md_error(). If "faulty" flag is not set then -EBUSY is returned to
userspace. -EBUSY means that array will be failed after drive removal.

Mdadm has special routine to handle the array failure and it is executed
if -EBUSY is returned by md.

There are at least two known reasons to not consider this mechanism
as correct:
1. drive can be removed even if array will be failed[1].
2. -EBUSY seems to be wrong status. Array is not busy, but removal
   process cannot proceed safe.

-EBUSY expectation cannot be removed without breaking compatibility
with userspace. In this patch first issue is resolved by adding support
for MD_BROKEN flag for RAID1 and RAID10. Support for RAID456 is added in
next commit.

The idea is to set the MD_BROKEN if we are sure that raid is in failed
state now. This is done in each error_handler(). In md_error() MD_BROKEN
flag is checked. If is set, then -EBUSY is returned to userspace.

As in previous commit, it causes that #mdadm --set-faulty is able to
fail array. Previously proposed workaround is valid if optional
functionality[1] is disabled.

[1] commit 9a567843("md: allow last device to be forcibly removed from
    RAID1/RAID10.")
Reviewd-by: NXiao Ni <xni@redhat.com>
Signed-off-by: NMariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: NSong Liu <song@kernel.org>

9631abdb

18 4月, 2022 34 次提交

block/rnbd-clt: Avoid flush_workqueue(system_long_wq) usage · 5ea7c133

由 Jack Wang 提交于 4月 13, 2022

Flushing system-wide workqueues is dangerous and will be forbidden.

Replace system_long_wq with local rnbd_clt_wq.

Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp

Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: NJack Wang <jinpu.wang@ionos.com>
Reviewed-by: NSantosh Kumar Pradhan <santosh.pradhan@ionos.com>
Link: https://lore.kernel.org/r/20220413123420.66470-1-jinpu.wang@ionos.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

5ea7c133

loop: don't destroy lo->workqueue in __loop_clr_fd · d292dc80

由 Christoph Hellwig 提交于 3月 30, 2022

There is no need to destroy the workqueue when clearing unbinding
a loop device from a backing file. Not doing so on the other hand
avoid creating a complex lock dependency chain involving the global
system_transition_mutex.

Based on a patch from Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>.

Reported-by: syzbot+6479585dfd4dedd3f7e1@syzkaller.appspotmail.com
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Tested-by: syzbot+6479585dfd4dedd3f7e1@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20220330052917.2566582-16-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

d292dc80

loop: remove lo_refcount and avoid lo_mutex in ->open / ->release · a0e286b6

由 Christoph Hellwig 提交于 3月 30, 2022

lo_refcount counts how many openers a loop device has, but that count
is already provided by the block layer in the bd_openers field of the
whole-disk block_device. Remove lo_refcount and allow opens to
succeed even on devices beeing deleted - now that ->free_disk is
implemented we can handle that race gracefull and all I/O on it will
just fail. Similarly there is a small race window now where
loop_control_remove does not synchronize the delete vs the remove
due do bd_openers not being under lo_mutex protection, but we can
handle that just as gracefully.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-15-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

a0e286b6

loop: avoid loop_validate_mutex/lo_mutex in ->release · 158eaeba

由 Tetsuo Handa 提交于 3月 30, 2022

Since ->release is called with disk->open_mutex held, and __loop_clr_fd()
 from lo_release() is called via ->release when disk_openers() == 0, we are
guaranteed that "struct file" which will be passed to loop_validate_file()
via fget() cannot be the loop device __loop_clr_fd(lo, true) will clear.
Thus, there is no need to hold loop_validate_mutex from __loop_clr_fd()
if release == true.

When I made commit 3ce6e1f6 ("loop: reintroduce global lock for
safe loop_validate_file() traversal"), I wrote "It is acceptable for
loop_validate_file() to succeed, for actual clear operation has not started
yet.". But now I came to feel why it is acceptable to succeed.

It seems that the loop driver was added in Linux 1.3.68, and

  if (lo->lo_refcnt > 1)
    return -EBUSY;

check in loop_clr_fd() was there from the beginning. The intent of this
check was unclear. But now I think that current

  disk_openers(lo->lo_disk) > 1

form is there for three reasons.

(1) Avoid I/O errors when some process which opens and reads from this
    loop device in response to uevent notification (e.g. systemd-udevd),
    as described in commit a1ecac3b ("loop: Make explicit loop
    device destruction lazy"). This opener is short-lived because it is
    likely that the file descriptor used by that process is closed soon.

(2) Avoid I/O errors caused by underlying layer of stacked loop devices
    (i.e. ioctl(some_loop_fd, LOOP_SET_FD, other_loop_fd)) being suddenly
    disappeared. This opener is long-lived because this reference is
    associated with not a file descriptor but lo->lo_backing_file.

(3) Avoid I/O errors caused by underlying layer of mounted loop device
    (i.e. mount(some_loop_device, some_mount_point)) being suddenly
    disappeared. This opener is long-lived because this reference is
    associated with not a file descriptor but mount.

While race in (1) might be acceptable, (2) and (3) should be checked
racelessly. That is, make sure that __loop_clr_fd() will not run if
loop_validate_file() succeeds, by doing refcount check with global lock
held when explicit loop device destruction is requested.

As a result of no longer waiting for lo->lo_mutex after setting Lo_rundown,
we can remove pointless BUG_ON(lo->lo_state != Lo_rundown) check.
Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220330052917.2566582-14-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

158eaeba

loop: suppress uevents while reconfiguring the device · 498ef5c7

由 Christoph Hellwig 提交于 3月 30, 2022

Currently, udev change event is generated for a loop device before the
device is ready for IO. Due to serialization on lo->lo_mutex in
lo_open() this does not matter because anybody is able to open the
device and do IO only after the configuration is finished. However this
synchronization in lo_open() is going away so make sure userspace
reacting to the change event will see the new device state by generating
the event only when the device is setup.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-13-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

498ef5c7

loop: implement ->free_disk · d2c7f56f

由 Christoph Hellwig 提交于 3月 30, 2022

Ensure that the lo_device which is stored in the gendisk private
data is valid until the gendisk is freed.  Currently the loop driver
uses a lot of effort to make sure a device is not freed when it is
still in use, but to to fix a potential deadlock this will be relaxed
a bit soon.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-12-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

d2c7f56f

loop: only freeze the queue in __loop_clr_fd when needed · 1fe0b1ac

由 Christoph Hellwig 提交于 3月 30, 2022

->release is only called after all outstanding I/O has completed, so only
freeze the queue when clearing the backing file of a live loop device.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Tested-by: NDarrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220330052917.2566582-11-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

1fe0b1ac

loop: don't freeze the queue in lo_release · 46dc9674

由 Christoph Hellwig 提交于 3月 30, 2022

By the time the final ->release is called there can't be outstanding I/O.
For non-final ->release there is no need for driver action at all.

Thus remove the useless queue freeze.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Tested-by: NDarrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220330052917.2566582-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

46dc9674

loop: remove the racy bd_inode->i_mapping->nrpages asserts · 98ded54a

由 Christoph Hellwig 提交于 3月 30, 2022

Nothing prevents a file system or userspace opener of the block device
from redirtying the page right afte sync_blockdev returned.  Fortunately
data in the page cache during a block device change is mostly harmless
anyway.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Tested-by: NDarrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220330052917.2566582-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

98ded54a

loop: initialize the worker tracking fields once · b15ed546

由 Christoph Hellwig 提交于 3月 30, 2022

There is no need to reinitialize idle_worker_list, worker_tree and timer
every time a loop device is configured.  Just initialize them once at
allocation time.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Tested-by: NDarrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220330052917.2566582-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

b15ed546

loop: de-duplicate the idle worker freeing code · 2cf429b5

由 Christoph Hellwig 提交于 3月 30, 2022

Use a common helper for both timer based and uncoditional freeing of idle
workers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Tested-by: NDarrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220330052917.2566582-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

2cf429b5

block: turn bdev->bd_openers into an atomic_t · 9acf381f

由 Christoph Hellwig 提交于 3月 30, 2022

All manipulation of bd_openers is under disk->open_mutex and will remain
so for the foreseeable future. But at least one place reads it without
the lock (blkdev_get) and there are more to be added. So make sure the
compiler does not do turn the increments and decrements into non-atomic
sequences by using an atomic_t.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

9acf381f

block: add a disk_openers helper · dbdc1be3

由 Christoph Hellwig 提交于 3月 30, 2022

Add a helper that returns the openers for a given gendisk to avoid having
drivers poke into disk->part0 to get at this information in a somewhat
cumbersome way.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

dbdc1be3

zram: cleanup zram_remove · 7a86d6dc

由 Christoph Hellwig 提交于 3月 30, 2022

Remove the bdev variable and just use the gendisk pointed to by the
zram_device directly.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

7a86d6dc

zram: cleanup reset_store · d666e20e

由 Christoph Hellwig 提交于 3月 30, 2022

Use a local variable for the gendisk instead of the part0 block_device,
as the gendisk is what this function actually operates on.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

d666e20e

nbd: use the correct block_device in nbd_bdev_reset · 2a852a69

由 Christoph Hellwig 提交于 3月 30, 2022

The bdev parameter to ->ioctl contains the block device that the ioctl
is called on, which can be the partition.  But the openers check in
nbd_bdev_reset really needs to check use the whole device, so switch to
using that.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220330052917.2566582-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

2a852a69

drbd: Return true/false (not 1/0) from bool functions · 8fd6533e

由 Haowen Bai 提交于 4月 06, 2022

Return boolean values ("true" or "false") instead of 1 or 0 from bool
functions.  This fixes the following warnings from coccicheck:

./drivers/block/drbd/drbd_req.c:912:9-10: WARNING: return of 0/1 in
function 'remote_due_to_read_balancing' with return type bool
Signed-off-by: NHaowen Bai <baihaowen@meizu.com>
Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-8-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

8fd6533e

drdb: Switch to kvfree_rcu() API · 90c6c291

由 Uladzislau Rezki (Sony) 提交于 4月 06, 2022

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

TO: Jens Axboe <axboe@kernel.dk>
TO: Philipp Reisner <philipp.reisner@linbit.com>
TO: Jason Gunthorpe <jgg@nvidia.com>
TO: drbd-dev@lists.linbit.com
TO: linux-block@vger.kernel.org
Signed-off-by: NUladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-7-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

90c6c291

drbd: Replace "unsigned" with "unsigned int" · e6be38a1

由 Cai Huoqing 提交于 4月 06, 2022

when run checkpath.pl for the first patch, found that
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'.
so fix it. BTW
Signed-off-by: NCai Huoqing <caihuoqing@baidu.com>
Acked-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-6-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

e6be38a1

drbd: Make use of PFN_UP helper macro · ba6bee98

由 Cai Huoqing 提交于 4月 06, 2022

it's a refactor to make use of PFN_UP helper macro
Signed-off-by: NCai Huoqing <caihuoqing@baidu.com>
Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-5-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

ba6bee98

block: drbd: drbd_receiver: Remove redundant assignment to err · e1838cf0

由 Jiapeng Chong 提交于 4月 06, 2022

Variable err is set to '-EIO' but this value is never read as
it is overwritten or not used later on, hence it is a redundant
assignment and can be removed.

Clean up the following clang-analyzer warning:

drivers/block/drbd/drbd_receiver.c:3955:5: warning: Value stored to
'err' is never read [clang-analyzer-deadcode.DeadStores].
Reported-by: NAbaci Robot <abaci@linux.alibaba.com>
Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-4-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

e1838cf0

drbd: address enum mismatch warnings · 4b28f3b4

由 Arnd Bergmann 提交于 4月 06, 2022

gcc -Wextra warns about mixing drbd_state_rv with drbd_ret_code
in a couple of places:

drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_set_role':
drivers/block/drbd/drbd_nl.c:777:14: warning: comparison between 'enum drbd_state_rv' and 'enum drbd_ret_code' [-Wenum-compare]
  777 |  if (retcode != NO_ERROR)
      |              ^~
drivers/block/drbd/drbd_nl.c:784:12: warning: implicit conversion from 'enum drbd_ret_code' to 'enum drbd_state_rv' [-Wenum-conversion]
  784 |    retcode = ERR_MANDATORY_TAG;
      |            ^
drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_attach':
drivers/block/drbd/drbd_nl.c:1965:10: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion]
 1965 |  retcode = rv;  /* FIXME: Type mismatch. */
      |          ^
drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_connect':
drivers/block/drbd/drbd_nl.c:2690:10: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion]
 2690 |  retcode = conn_request_state(connection, NS(conn, C_UNCONNECTED), CS_VERBOSE);
      |          ^
drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_disconnect':
drivers/block/drbd/drbd_nl.c:2803:11: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion]
 2803 |   retcode = rv;  /* FIXME: Type mismatch. */
      |           ^

In each case, both are passed into drbd_adm_finish(), which just takes
a 32-bit integer and is happy with either, presumably intentionally.

Restructure the code to pass either type directly in there in most
cases, avoiding the warnings.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-3-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

4b28f3b4

drbd: fix duplicate array initializer · 33cb0917

由 Arnd Bergmann 提交于 4月 06, 2022

There are two initializers for P_RETRY_WRITE:

drivers/block/drbd/drbd_main.c:3676:22: warning: initialized field overwritten [-Woverride-init]

Remove the first one since it was already ignored by the compiler
and reorder the list to match the enum definition. As P_ZEROES had
no entry, add that one instead.

Fixes: 036b17ea ("drbd: Receiving part for the PROTOCOL_UPDATE packet")
Fixes: f31e583a ("drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-2-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

33cb0917

direct-io: remove random prefetches · c22198e7

由 Christoph Hellwig 提交于 4月 15, 2022

Randomly poking into block device internals for manual prefetches isn't
exactly a very maintainable thing to do. And none of the performance
critical direct I/O implementations still use this library function
anyway, so just drop it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220415045258.199825-28-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

c22198e7

block: decouple REQ_OP_SECURE_ERASE from REQ_OP_DISCARD · 44abff2c

由 Christoph Hellwig 提交于 4月 15, 2022

Secure erase is a very different operation from discard in that it is
a data integrity operation vs hint.  Fully split the limits and helper
infrastructure to make the separation more clear.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2]
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: NChao Yu <chao@kernel.org>
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

44abff2c

block: add a bdev_discard_granularity helper · 7b47ef52

由 Christoph Hellwig 提交于 4月 15, 2022

Abstract away implementation details from file systems by providing a
block_device based helper to retrieve the discard granularity.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: NRyusuke Konishi <konishi.ryusuke@gmail.com>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Link: https://lore.kernel.org/r/20220415045258.199825-26-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

7b47ef52

block: remove QUEUE_FLAG_DISCARD · 70200574

由 Christoph Hellwig 提交于 4月 15, 2022

Just use a non-zero max_discard_sectors as an indicator for discard
support, similar to what is done for write zeroes.

The only places where needs special attention is the RAID5 driver,
which must clear discard support for security reasons by default,
even if the default stacking rules would allow for it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

70200574

block: add a bdev_max_discard_sectors helper · cf0fbf89

由 Christoph Hellwig 提交于 4月 15, 2022

Add a helper to query the number of sectors support per each discard bio
based on the block device and use this helper to stop various places from
poking into the request_queue to see if discard is supported and if so how
much. This mirrors what is done e.g. for write zeroes as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-24-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

cf0fbf89

block: refactor discard bio size limiting · e3cc28ea

由 Christoph Hellwig 提交于 4月 15, 2022

Move all the logic to limit the discard bio size into a common helper
so that it is better documented.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: NColy Li <colyli@suse.de>
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-23-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

e3cc28ea

block: move {bdev,queue_limit}_discard_alignment out of line · 5c4b4a5c

由 Christoph Hellwig 提交于 4月 15, 2022

No need to inline these fairly larger helpers. Also fix the return value
to be unsigned, just like the field in struct queue_limits.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-22-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

5c4b4a5c

block: use bdev_discard_alignment in part_discard_alignment_show · f0f975a4

由 Christoph Hellwig 提交于 4月 15, 2022

Use the bdev based alignment helper instead of open coding it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-21-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

f0f975a4

block: remove queue_discard_alignment · 4e1462ff

由 Christoph Hellwig 提交于 4月 15, 2022

Just use bdev_alignment_offset in disk_discard_alignment_show instead.
That helpers is the same except for an always false branch that doesn't
matter in this slow path.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-20-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

4e1462ff

block: move bdev_alignment_offset and queue_limit_alignment_offset out of line · 89098b07

由 Christoph Hellwig 提交于 4月 15, 2022

No need to inline these fairly larger helpers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-19-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

89098b07

block: use bdev_alignment_offset in disk_alignment_offset_show · 640f2a23

由 Christoph Hellwig 提交于 4月 15, 2022

This does the same as the open coded variant except for an extra branch,
and allows to remove queue_alignment_offset entirely.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-18-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

640f2a23

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功