- 26 4月, 2022 3 次提交
-
-
由 Xiaomeng Tong 提交于
The bug is here: if (!rdev) The list iterator value 'rdev' will *always* be set and non-NULL by rdev_for_each(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element found. Otherwise it will bypass the NULL check and lead to invalid memory access passing the check. To fix the bug, use a new variable 'iter' as the list iterator, while using the original variable 'rdev' as a dedicated pointer to point to the found element. Cc: stable@vger.kernel.org Fixes: 2aa82191 ("md-cluster: Perform a lazy update") Acked-by: NGuoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: NXiaomeng Tong <xiam0nd.tong@gmail.com> Acked-by: NGoldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Mariusz Tkaczyk 提交于
Raid456 module had allowed to achieve failed state. It was fixed by fb73b357 ("raid5: block failing device if raid will be failed"). This fix introduces a bug, now if raid5 fails during IO, it may result with a hung task without completion. Faulty flag on the device is necessary to process all requests and is checked many times, mainly in analyze_stripe(). Allow to set faulty on drive again and set MD_BROKEN if raid is failed. As a result, this level is allowed to achieve failed state again, but communication with userspace (via -EBUSY status) will be preserved. This restores possibility to fail array via #mdadm --set-faulty command and will be fixed by additional verification on mdadm side. Reproduction steps: mdadm -CR imsm -e imsm -n 3 /dev/nvme[0-2]n1 mdadm -CR r5 -e imsm -l5 -n3 /dev/nvme[0-2]n1 --assume-clean mkfs.xfs /dev/md126 -f mount /dev/md126 /mnt/root/ fio --filename=/mnt/root/file --size=5GB --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=240 --numjobs=4 --time_based --group_reporting --name=throughput-test-job --eta-newline=1 & echo 1 > /sys/block/nvme2n1/device/device/remove echo 1 > /sys/block/nvme1n1/device/device/remove [ 1475.787779] Call Trace: [ 1475.793111] __schedule+0x2a6/0x700 [ 1475.799460] schedule+0x38/0xa0 [ 1475.805454] raid5_get_active_stripe+0x469/0x5f0 [raid456] [ 1475.813856] ? finish_wait+0x80/0x80 [ 1475.820332] raid5_make_request+0x180/0xb40 [raid456] [ 1475.828281] ? finish_wait+0x80/0x80 [ 1475.834727] ? finish_wait+0x80/0x80 [ 1475.841127] ? finish_wait+0x80/0x80 [ 1475.847480] md_handle_request+0x119/0x190 [ 1475.854390] md_make_request+0x8a/0x190 [ 1475.861041] generic_make_request+0xcf/0x310 [ 1475.868145] submit_bio+0x3c/0x160 [ 1475.874355] iomap_dio_submit_bio.isra.20+0x51/0x60 [ 1475.882070] iomap_dio_bio_actor+0x175/0x390 [ 1475.889149] iomap_apply+0xff/0x310 [ 1475.895447] ? iomap_dio_bio_actor+0x390/0x390 [ 1475.902736] ? iomap_dio_bio_actor+0x390/0x390 [ 1475.909974] iomap_dio_rw+0x2f2/0x490 [ 1475.916415] ? iomap_dio_bio_actor+0x390/0x390 [ 1475.923680] ? atime_needs_update+0x77/0xe0 [ 1475.930674] ? xfs_file_dio_aio_read+0x6b/0xe0 [xfs] [ 1475.938455] xfs_file_dio_aio_read+0x6b/0xe0 [xfs] [ 1475.946084] xfs_file_read_iter+0xba/0xd0 [xfs] [ 1475.953403] aio_read+0xd5/0x180 [ 1475.959395] ? _cond_resched+0x15/0x30 [ 1475.965907] io_submit_one+0x20b/0x3c0 [ 1475.972398] __x64_sys_io_submit+0xa2/0x180 [ 1475.979335] ? do_io_getevents+0x7c/0xc0 [ 1475.986009] do_syscall_64+0x5b/0x1a0 [ 1475.992419] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 1476.000255] RIP: 0033:0x7f11fc27978d [ 1476.006631] Code: Bad RIP value. [ 1476.073251] INFO: task fio:3877 blocked for more than 120 seconds. Cc: stable@vger.kernel.org Fixes: fb73b357 ("raid5: block failing device if raid will be failed") Reviewd-by: NXiao Ni <xni@redhat.com> Signed-off-by: NMariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Mariusz Tkaczyk 提交于
There is no direct mechanism to determine raid failure outside personality. It is done by checking rdev->flags after executing md_error(). If "faulty" flag is not set then -EBUSY is returned to userspace. -EBUSY means that array will be failed after drive removal. Mdadm has special routine to handle the array failure and it is executed if -EBUSY is returned by md. There are at least two known reasons to not consider this mechanism as correct: 1. drive can be removed even if array will be failed[1]. 2. -EBUSY seems to be wrong status. Array is not busy, but removal process cannot proceed safe. -EBUSY expectation cannot be removed without breaking compatibility with userspace. In this patch first issue is resolved by adding support for MD_BROKEN flag for RAID1 and RAID10. Support for RAID456 is added in next commit. The idea is to set the MD_BROKEN if we are sure that raid is in failed state now. This is done in each error_handler(). In md_error() MD_BROKEN flag is checked. If is set, then -EBUSY is returned to userspace. As in previous commit, it causes that #mdadm --set-faulty is able to fail array. Previously proposed workaround is valid if optional functionality[1] is disabled. [1] commit 9a567843("md: allow last device to be forcibly removed from RAID1/RAID10.") Reviewd-by: NXiao Ni <xni@redhat.com> Signed-off-by: NMariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: NSong Liu <song@kernel.org>
-
- 18 4月, 2022 37 次提交
-
-
由 Jack Wang 提交于
Flushing system-wide workqueues is dangerous and will be forbidden. Replace system_long_wq with local rnbd_clt_wq. Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJack Wang <jinpu.wang@ionos.com> Reviewed-by: NSantosh Kumar Pradhan <santosh.pradhan@ionos.com> Link: https://lore.kernel.org/r/20220413123420.66470-1-jinpu.wang@ionos.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
There is no need to destroy the workqueue when clearing unbinding a loop device from a backing file. Not doing so on the other hand avoid creating a complex lock dependency chain involving the global system_transition_mutex. Based on a patch from Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>. Reported-by: syzbot+6479585dfd4dedd3f7e1@syzkaller.appspotmail.com Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Tested-by: syzbot+6479585dfd4dedd3f7e1@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20220330052917.2566582-16-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
lo_refcount counts how many openers a loop device has, but that count is already provided by the block layer in the bd_openers field of the whole-disk block_device. Remove lo_refcount and allow opens to succeed even on devices beeing deleted - now that ->free_disk is implemented we can handle that race gracefull and all I/O on it will just fail. Similarly there is a small race window now where loop_control_remove does not synchronize the delete vs the remove due do bd_openers not being under lo_mutex protection, but we can handle that just as gracefully. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-15-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tetsuo Handa 提交于
Since ->release is called with disk->open_mutex held, and __loop_clr_fd() from lo_release() is called via ->release when disk_openers() == 0, we are guaranteed that "struct file" which will be passed to loop_validate_file() via fget() cannot be the loop device __loop_clr_fd(lo, true) will clear. Thus, there is no need to hold loop_validate_mutex from __loop_clr_fd() if release == true. When I made commit 3ce6e1f6 ("loop: reintroduce global lock for safe loop_validate_file() traversal"), I wrote "It is acceptable for loop_validate_file() to succeed, for actual clear operation has not started yet.". But now I came to feel why it is acceptable to succeed. It seems that the loop driver was added in Linux 1.3.68, and if (lo->lo_refcnt > 1) return -EBUSY; check in loop_clr_fd() was there from the beginning. The intent of this check was unclear. But now I think that current disk_openers(lo->lo_disk) > 1 form is there for three reasons. (1) Avoid I/O errors when some process which opens and reads from this loop device in response to uevent notification (e.g. systemd-udevd), as described in commit a1ecac3b ("loop: Make explicit loop device destruction lazy"). This opener is short-lived because it is likely that the file descriptor used by that process is closed soon. (2) Avoid I/O errors caused by underlying layer of stacked loop devices (i.e. ioctl(some_loop_fd, LOOP_SET_FD, other_loop_fd)) being suddenly disappeared. This opener is long-lived because this reference is associated with not a file descriptor but lo->lo_backing_file. (3) Avoid I/O errors caused by underlying layer of mounted loop device (i.e. mount(some_loop_device, some_mount_point)) being suddenly disappeared. This opener is long-lived because this reference is associated with not a file descriptor but mount. While race in (1) might be acceptable, (2) and (3) should be checked racelessly. That is, make sure that __loop_clr_fd() will not run if loop_validate_file() succeeds, by doing refcount check with global lock held when explicit loop device destruction is requested. As a result of no longer waiting for lo->lo_mutex after setting Lo_rundown, we can remove pointless BUG_ON(lo->lo_state != Lo_rundown) check. Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220330052917.2566582-14-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Currently, udev change event is generated for a loop device before the device is ready for IO. Due to serialization on lo->lo_mutex in lo_open() this does not matter because anybody is able to open the device and do IO only after the configuration is finished. However this synchronization in lo_open() is going away so make sure userspace reacting to the change event will see the new device state by generating the event only when the device is setup. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-13-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Ensure that the lo_device which is stored in the gendisk private data is valid until the gendisk is freed. Currently the loop driver uses a lot of effort to make sure a device is not freed when it is still in use, but to to fix a potential deadlock this will be relaxed a bit soon. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-12-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
->release is only called after all outstanding I/O has completed, so only freeze the queue when clearing the backing file of a live loop device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Tested-by: NDarrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-11-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
By the time the final ->release is called there can't be outstanding I/O. For non-final ->release there is no need for driver action at all. Thus remove the useless queue freeze. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Tested-by: NDarrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Nothing prevents a file system or userspace opener of the block device from redirtying the page right afte sync_blockdev returned. Fortunately data in the page cache during a block device change is mostly harmless anyway. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Tested-by: NDarrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
There is no need to reinitialize idle_worker_list, worker_tree and timer every time a loop device is configured. Just initialize them once at allocation time. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Tested-by: NDarrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use a common helper for both timer based and uncoditional freeing of idle workers. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Tested-by: NDarrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
All manipulation of bd_openers is under disk->open_mutex and will remain so for the foreseeable future. But at least one place reads it without the lock (blkdev_get) and there are more to be added. So make sure the compiler does not do turn the increments and decrements into non-atomic sequences by using an atomic_t. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper that returns the openers for a given gendisk to avoid having drivers poke into disk->part0 to get at this information in a somewhat cumbersome way. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Remove the bdev variable and just use the gendisk pointed to by the zram_device directly. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use a local variable for the gendisk instead of the part0 block_device, as the gendisk is what this function actually operates on. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The bdev parameter to ->ioctl contains the block device that the ioctl is called on, which can be the partition. But the openers check in nbd_bdev_reset really needs to check use the whole device, so switch to using that. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Haowen Bai 提交于
Return boolean values ("true" or "false") instead of 1 or 0 from bool functions. This fixes the following warnings from coccicheck: ./drivers/block/drbd/drbd_req.c:912:9-10: WARNING: return of 0/1 in function 'remote_due_to_read_balancing' with return type bool Signed-off-by: NHaowen Bai <baihaowen@meizu.com> Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220406190715.1938174-8-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Uladzislau Rezki (Sony) 提交于
Instead of invoking a synchronize_rcu() to free a pointer after a grace period we can directly make use of new API that does the same but in more efficient way. TO: Jens Axboe <axboe@kernel.dk> TO: Philipp Reisner <philipp.reisner@linbit.com> TO: Jason Gunthorpe <jgg@nvidia.com> TO: drbd-dev@lists.linbit.com TO: linux-block@vger.kernel.org Signed-off-by: NUladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220406190715.1938174-7-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Cai Huoqing 提交于
when run checkpath.pl for the first patch, found that WARNING: Prefer 'unsigned int' to bare use of 'unsigned'. so fix it. BTW Signed-off-by: NCai Huoqing <caihuoqing@baidu.com> Acked-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220406190715.1938174-6-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Cai Huoqing 提交于
it's a refactor to make use of PFN_UP helper macro Signed-off-by: NCai Huoqing <caihuoqing@baidu.com> Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220406190715.1938174-5-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jiapeng Chong 提交于
Variable err is set to '-EIO' but this value is never read as it is overwritten or not used later on, hence it is a redundant assignment and can be removed. Clean up the following clang-analyzer warning: drivers/block/drbd/drbd_receiver.c:3955:5: warning: Value stored to 'err' is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com> Acked-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220406190715.1938174-4-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Arnd Bergmann 提交于
gcc -Wextra warns about mixing drbd_state_rv with drbd_ret_code in a couple of places: drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_set_role': drivers/block/drbd/drbd_nl.c:777:14: warning: comparison between 'enum drbd_state_rv' and 'enum drbd_ret_code' [-Wenum-compare] 777 | if (retcode != NO_ERROR) | ^~ drivers/block/drbd/drbd_nl.c:784:12: warning: implicit conversion from 'enum drbd_ret_code' to 'enum drbd_state_rv' [-Wenum-conversion] 784 | retcode = ERR_MANDATORY_TAG; | ^ drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_attach': drivers/block/drbd/drbd_nl.c:1965:10: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion] 1965 | retcode = rv; /* FIXME: Type mismatch. */ | ^ drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_connect': drivers/block/drbd/drbd_nl.c:2690:10: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion] 2690 | retcode = conn_request_state(connection, NS(conn, C_UNCONNECTED), CS_VERBOSE); | ^ drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_disconnect': drivers/block/drbd/drbd_nl.c:2803:11: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion] 2803 | retcode = rv; /* FIXME: Type mismatch. */ | ^ In each case, both are passed into drbd_adm_finish(), which just takes a 32-bit integer and is happy with either, presumably intentionally. Restructure the code to pass either type directly in there in most cases, avoiding the warnings. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220406190715.1938174-3-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Arnd Bergmann 提交于
There are two initializers for P_RETRY_WRITE: drivers/block/drbd/drbd_main.c:3676:22: warning: initialized field overwritten [-Woverride-init] Remove the first one since it was already ignored by the compiler and reorder the list to match the enum definition. As P_ZEROES had no entry, add that one instead. Fixes: 036b17ea ("drbd: Receiving part for the PROTOCOL_UPDATE packet") Fixes: f31e583a ("drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NChristoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220406190715.1938174-2-christoph.boehmwalder@linbit.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Randomly poking into block device internals for manual prefetches isn't exactly a very maintainable thing to do. And none of the performance critical direct I/O implementations still use this library function anyway, so just drop it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220415045258.199825-28-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Secure erase is a very different operation from discard in that it is a data integrity operation vs hint. Fully split the limits and helper infrastructure to make the separation more clear. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2] Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: NChao Yu <chao@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Abstract away implementation details from file systems by providing a block_device based helper to retrieve the discard granularity. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Link: https://lore.kernel.org/r/20220415045258.199825-26-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to query the number of sectors support per each discard bio based on the block device and use this helper to stop various places from poking into the request_queue to see if discard is supported and if so how much. This mirrors what is done e.g. for write zeroes as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-24-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Move all the logic to limit the discard bio size into a common helper so that it is better documented. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NColy Li <colyli@suse.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-23-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
No need to inline these fairly larger helpers. Also fix the return value to be unsigned, just like the field in struct queue_limits. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220415045258.199825-22-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use the bdev based alignment helper instead of open coding it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220415045258.199825-21-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Just use bdev_alignment_offset in disk_discard_alignment_show instead. That helpers is the same except for an always false branch that doesn't matter in this slow path. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220415045258.199825-20-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
No need to inline these fairly larger helpers. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220415045258.199825-19-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This does the same as the open coded variant except for an extra branch, and allows to remove queue_alignment_offset entirely. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220415045258.199825-18-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Replace the open coded offset calculation with the proper helper. This is an ABI change in that the -1 for a misaligned partition is properly propagated, which can be considered a bug fix and matches what is done on the whole device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-17-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to check the max supported sectors for zone append based on the block_device instead of having to poke into the block layer internal request_queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-16-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to check the stable writes flag based on the block_device instead of having to poke into the block layer internal request_queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-15-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-