- 26 4月, 2022 6 次提交
-
-
由 Heming Zhao 提交于
This commit includes two topics: 1> replace deprecated strlcpy change strlcpy to strscpy for strlcpy is marked as deprecated in Documentation/process/deprecated.rst 2> remove duplicated strlcpy line in md_bitmap_read_sb@md-bitmap.c there are two duplicated strlcpy(), the history: - commit cf921cc1 ("Add node recovery callbacks") introduced the first usage of strlcpy(). - commit b97e9257 ("Use separate bitmaps for each nodes in the cluster") introduced the second strlcpy(). this time, the two strlcpy() are same, we can remove anyone safely. - commit d3b178ad ("md: Skip cluster setup for dm-raid") added dm-raid special handling. And the "nodes" value is the key of this patch. but from this patch, strlcpy() which was introduced by b97e9257 become necessary. - commit 3c462c88 ("md: Increment version for clustered bitmaps") used clustered major version to only handle in clustered env. this patch could look a polishment for clustered code logic. So cf921cc1 became useless after d3b178ad, we could remove it safely. Signed-off-by: NHeming Zhao <heming.zhao@suse.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Heming Zhao 提交于
If bitmap area contains invalid data, kernel will crash then mdadm triggers "Segmentation fault". This is cluster-md speical bug. In non-clustered env, mdadm will handle broken metadata case. In clustered array, only kernel space handles bitmap slot info. But even this bug only happened in clustered env, current sanity check is wrong, the code should be changed. How to trigger: (faulty injection) dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb mdadm -Ss echo aaa > magic.txt == below modifying slot 2 bitmap data == dd if=magic.txt of=/dev/sda seek=16384 bs=1 count=3 <== destroy magic dd if=/dev/zero of=/dev/sda seek=16436 bs=1 count=4 <== ZERO chunksize mdadm -A /dev/md0 /dev/sda /dev/sdb == kernel crashes. mdadm outputs "Segmentation fault" == Reason of kernel crash: In md_bitmap_read_sb (called by md_bitmap_create), bad bitmap magic didn't block chunksize assignment, and zero value made DIV_ROUND_UP_SECTOR_T() trigger "divide error". Crash log: kernel: md: md0 stopped. kernel: md/raid1:md0: not clean -- starting background reconstruction kernel: md/raid1:md0: active with 2 out of 2 mirrors kernel: dlm: ... ... kernel: md-cluster: Joined cluster 44810aba-38bb-e6b8-daca-bc97a0b254aa slot 1 kernel: md0: invalid bitmap file superblock: bad magic kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2 kernel: md-cluster: Could not gather bitmaps from slot 2 kernel: divide error: 0000 [#1] SMP NOPTI kernel: CPU: 0 PID: 1603 Comm: mdadm Not tainted 5.14.6-1-default kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod] kernel: RSP: 0018:ffffc22ac0843ba0 EFLAGS: 00010246 kernel: ... ... kernel: Call Trace: kernel: ? dlm_lock_sync+0xd0/0xd0 [md_cluster 77fe..7a0] kernel: md_bitmap_copy_from_slot+0x2c/0x290 [md_mod 24ea..d3a] kernel: load_bitmaps+0xec/0x210 [md_cluster 77fe..7a0] kernel: md_bitmap_load+0x81/0x1e0 [md_mod 24ea..d3a] kernel: do_md_run+0x30/0x100 [md_mod 24ea..d3a] kernel: md_ioctl+0x1290/0x15a0 [md_mod 24ea....d3a] kernel: ? mddev_unlock+0xaa/0x130 [md_mod 24ea..d3a] kernel: ? blkdev_ioctl+0xb1/0x2b0 kernel: block_ioctl+0x3b/0x40 kernel: __x64_sys_ioctl+0x7f/0xb0 kernel: do_syscall_64+0x59/0x80 kernel: ? exit_to_user_mode_prepare+0x1ab/0x230 kernel: ? syscall_exit_to_user_mode+0x18/0x40 kernel: ? do_syscall_64+0x69/0x80 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae kernel: RIP: 0033:0x7f4a15fa722b kernel: ... ... kernel: ---[ end trace 8afa7612f559c868 ]--- kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod] Reported-by: Nkernel test robot <lkp@intel.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NGuoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: NHeming Zhao <heming.zhao@suse.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Xiaomeng Tong 提交于
The bug is here: if (!rdev || rdev->desc_nr != nr) { The list iterator value 'rdev' will *always* be set and non-NULL by rdev_for_each_rcu(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element found (In fact, it will be a bogus pointer to an invalid struct object containing the HEAD). Otherwise it will bypass the check and lead to invalid memory access passing the check. To fix the bug, use a new variable 'iter' as the list iterator, while using the original variable 'pdev' as a dedicated pointer to point to the found element. Cc: stable@vger.kernel.org Fixes: 70bcecdb ("md-cluster: Improve md_reload_sb to be less error prone") Signed-off-by: NXiaomeng Tong <xiam0nd.tong@gmail.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Xiaomeng Tong 提交于
The bug is here: if (!rdev) The list iterator value 'rdev' will *always* be set and non-NULL by rdev_for_each(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element found. Otherwise it will bypass the NULL check and lead to invalid memory access passing the check. To fix the bug, use a new variable 'iter' as the list iterator, while using the original variable 'rdev' as a dedicated pointer to point to the found element. Cc: stable@vger.kernel.org Fixes: 2aa82191 ("md-cluster: Perform a lazy update") Acked-by: NGuoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: NXiaomeng Tong <xiam0nd.tong@gmail.com> Acked-by: NGoldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Mariusz Tkaczyk 提交于
Raid456 module had allowed to achieve failed state. It was fixed by fb73b357 ("raid5: block failing device if raid will be failed"). This fix introduces a bug, now if raid5 fails during IO, it may result with a hung task without completion. Faulty flag on the device is necessary to process all requests and is checked many times, mainly in analyze_stripe(). Allow to set faulty on drive again and set MD_BROKEN if raid is failed. As a result, this level is allowed to achieve failed state again, but communication with userspace (via -EBUSY status) will be preserved. This restores possibility to fail array via #mdadm --set-faulty command and will be fixed by additional verification on mdadm side. Reproduction steps: mdadm -CR imsm -e imsm -n 3 /dev/nvme[0-2]n1 mdadm -CR r5 -e imsm -l5 -n3 /dev/nvme[0-2]n1 --assume-clean mkfs.xfs /dev/md126 -f mount /dev/md126 /mnt/root/ fio --filename=/mnt/root/file --size=5GB --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=240 --numjobs=4 --time_based --group_reporting --name=throughput-test-job --eta-newline=1 & echo 1 > /sys/block/nvme2n1/device/device/remove echo 1 > /sys/block/nvme1n1/device/device/remove [ 1475.787779] Call Trace: [ 1475.793111] __schedule+0x2a6/0x700 [ 1475.799460] schedule+0x38/0xa0 [ 1475.805454] raid5_get_active_stripe+0x469/0x5f0 [raid456] [ 1475.813856] ? finish_wait+0x80/0x80 [ 1475.820332] raid5_make_request+0x180/0xb40 [raid456] [ 1475.828281] ? finish_wait+0x80/0x80 [ 1475.834727] ? finish_wait+0x80/0x80 [ 1475.841127] ? finish_wait+0x80/0x80 [ 1475.847480] md_handle_request+0x119/0x190 [ 1475.854390] md_make_request+0x8a/0x190 [ 1475.861041] generic_make_request+0xcf/0x310 [ 1475.868145] submit_bio+0x3c/0x160 [ 1475.874355] iomap_dio_submit_bio.isra.20+0x51/0x60 [ 1475.882070] iomap_dio_bio_actor+0x175/0x390 [ 1475.889149] iomap_apply+0xff/0x310 [ 1475.895447] ? iomap_dio_bio_actor+0x390/0x390 [ 1475.902736] ? iomap_dio_bio_actor+0x390/0x390 [ 1475.909974] iomap_dio_rw+0x2f2/0x490 [ 1475.916415] ? iomap_dio_bio_actor+0x390/0x390 [ 1475.923680] ? atime_needs_update+0x77/0xe0 [ 1475.930674] ? xfs_file_dio_aio_read+0x6b/0xe0 [xfs] [ 1475.938455] xfs_file_dio_aio_read+0x6b/0xe0 [xfs] [ 1475.946084] xfs_file_read_iter+0xba/0xd0 [xfs] [ 1475.953403] aio_read+0xd5/0x180 [ 1475.959395] ? _cond_resched+0x15/0x30 [ 1475.965907] io_submit_one+0x20b/0x3c0 [ 1475.972398] __x64_sys_io_submit+0xa2/0x180 [ 1475.979335] ? do_io_getevents+0x7c/0xc0 [ 1475.986009] do_syscall_64+0x5b/0x1a0 [ 1475.992419] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 1476.000255] RIP: 0033:0x7f11fc27978d [ 1476.006631] Code: Bad RIP value. [ 1476.073251] INFO: task fio:3877 blocked for more than 120 seconds. Cc: stable@vger.kernel.org Fixes: fb73b357 ("raid5: block failing device if raid will be failed") Reviewd-by: NXiao Ni <xni@redhat.com> Signed-off-by: NMariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Mariusz Tkaczyk 提交于
There is no direct mechanism to determine raid failure outside personality. It is done by checking rdev->flags after executing md_error(). If "faulty" flag is not set then -EBUSY is returned to userspace. -EBUSY means that array will be failed after drive removal. Mdadm has special routine to handle the array failure and it is executed if -EBUSY is returned by md. There are at least two known reasons to not consider this mechanism as correct: 1. drive can be removed even if array will be failed[1]. 2. -EBUSY seems to be wrong status. Array is not busy, but removal process cannot proceed safe. -EBUSY expectation cannot be removed without breaking compatibility with userspace. In this patch first issue is resolved by adding support for MD_BROKEN flag for RAID1 and RAID10. Support for RAID456 is added in next commit. The idea is to set the MD_BROKEN if we are sure that raid is in failed state now. This is done in each error_handler(). In md_error() MD_BROKEN flag is checked. If is set, then -EBUSY is returned to userspace. As in previous commit, it causes that #mdadm --set-faulty is able to fail array. Previously proposed workaround is valid if optional functionality[1] is disabled. [1] commit 9a567843("md: allow last device to be forcibly removed from RAID1/RAID10.") Reviewd-by: NXiao Ni <xni@redhat.com> Signed-off-by: NMariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: NSong Liu <song@kernel.org>
-
- 18 4月, 2022 6 次提交
-
-
由 Christoph Hellwig 提交于
Secure erase is a very different operation from discard in that it is a data integrity operation vs hint. Fully split the limits and helper infrastructure to make the separation more clear. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2] Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: NChao Yu <chao@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to query the number of sectors support per each discard bio based on the block device and use this helper to stop various places from poking into the request_queue to see if discard is supported and if so how much. This mirrors what is done e.g. for write zeroes as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-24-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to check the stable writes flag based on the block_device instead of having to poke into the block layer internal request_queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-15-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to check the nonrot flag based on the block_device instead of having to poke into the block layer internal request_queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-12-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Remove the magic autofree semantics and require the callers to explicitly call bio_init to initialize the bio. This allows bio_free to catch accidental bio_put calls on bio_init()ed bios as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NColy Li <colyli@suse.de> Acked-by: NMike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/r/20220406061228.410163-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 16 4月, 2022 1 次提交
-
-
由 Shin'ichiro Kawasaki 提交于
The commit 92986f6b ("dm: use bio_clone_fast in alloc_io/alloc_tio") removed bio_clone_fast() call from alloc_tio() when ci->io->tio is available. In this case, ci->bio is not copied to ci->io->tio.clone. This is fine since init_clone_info() sets same values to ci->bio and ci->io->tio.clone. However, when incoming bios have REQ_PREFLUSH flag, __send_empty_flush() prepares a zero length bio on stack and set it to ci->bio. At this time, ci->io->tio.clone still keeps non-zero length. When alloc_tio() chooses this ci->io->tio.clone as the bio to map, it is passed to targets as non-empty flush bio. It causes bio length check failure in dm-zoned and unexpected operation such as dm_accept_partial_bio() call. To avoid the non-empty flush bio, set zero length to ci->io->tio.clone in __send_empty_flush(). Fixes: 92986f6b ("dm: use bio_clone_fast in alloc_io/alloc_tio") Signed-off-by: NShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
- 15 4月, 2022 1 次提交
-
-
由 Mike Snitzer 提交于
The intent behind commit e6fc9f62 ("dm: flag clones created by __send_duplicate_bios") was to formally disallow the use of dm_accept_partial_bio() where it simply isn't possible -- due to constraint that multiple bios cannot meaningfully update a shared tio->len_ptr. But that commit went too far and disallowed the case where "abormal" IO (e.g. WRITE_ZEROES) is only using a single bio. Fix this by not marking a dm_io with a single dm_target_io (and bio), that happens to be created by __send_duplicate_bios, as DM_TIO_IS_DUPLICATE_BIO. Also remove 'unsigned *len' parameter from alloc_multiple_bios(). This commit fixes a dm_accept_partial_bio() BUG_ON() with dm-zoned when a WRITE_ZEROES bio is issued. Fixes: 655f3aad ("dm: switch dm_target_io booleans over to proper flags") Reported-by: NShinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
- 14 4月, 2022 3 次提交
-
-
由 Mike Snitzer 提交于
Commit 0fbb4d93 ("dm: add dm_submit_bio_remap interface") changed the alloc_io() function to delay the initialization of struct dm_io's orig_bio member, leaving it NULL until after the dm_io and associated user submitted bio is processed by __split_and_process_bio(). This change causes a NULL pointer dereference in dm_zone_map_bio() when the original user bio is inspected to detect the need for zone append command emulation. Fix this NULL pointer by updating dm_zone_map_bio() to not access ->orig_bio when the same info can be accessed from the clone of the ->orig_bio _before_ any ->map processing. Save off the bio_op() and bio_sectors() for the clone and then use the saved orig_bio_details as needed. Fixes: 0fbb4d93 ("dm: add dm_submit_bio_remap interface") Reported-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
由 Khazhismel Kumykov 提交于
Mixing sched_clock() and ktime_get_ns() usage will give bad results. Switch hst_select_path() from using sched_clock() to ktime_get_ns(). Also rename path_service_time()'s 'sched_now' variable to 'now'. Fixes: 2613eab1 ("dm mpath: add Historical Service Time Path Selector") Signed-off-by: NKhazhismel Kumykov <khazhy@google.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
由 Mikulas Patocka 提交于
It is possible to set up dm-integrity in such a way that the "tag_size" parameter is less than the actual digest size. In this situation, a part of the digest beyond tag_size is ignored. In this case, dm-integrity would write beyond the end of the ic->recalc_tags array and corrupt memory. The corruption happened in integrity_recalc->integrity_sector_checksum->crypto_shash_final. Fix this corruption by increasing the tags array so that it has enough padding at the end to accomodate the loop in integrity_recalc() being able to write a full digest size for the last member of the tags array. Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
- 02 4月, 2022 2 次提交
-
-
由 Ming Lei 提交于
Expanded testing of DM's bio polling support (using more fio threads to dm-linear ontop of null_blk) exposed the possibility for polled bios to hang (repeatedly polling in io_uring) when null_blk responds with BLK_STS_AGAIN (due to lack of resources): 1) io_complete_rw_iopoll() is called from blkdev_bio_end_io_async() to notify kiocb is done, that is the completion interface between block layer and io_uring 2) io_complete_rw_iopoll() is called from io_do_iopoll() 3) dm returns BLK_STS_AGAIN for one bio (on behalf of underlying driver), then io_complete_rw_iopoll is called, but io_do_iopoll() doesn't handle -EAGAIN at all (due to logic in io_rw_should_reissue) 4) reason for dm's BLK_STS_AGAIN is underlying null_blk driver ran out of requests (easier to reproduce by setting low hw_queue_depth). 5) dm should handle BLK_STS_AGAIN for POLLED underlying IO, and may retry in dm layer. This fix adds REQ_POLLED specific BLK_STS_AGAIN handling to dm_io_complete() that clears REQ_POLLED and requeues the bio to DM using queue_io(). Fixes: b99fdcdc ("dm: support bio polling") Signed-off-by: NMing Lei <ming.lei@redhat.com> [snitzer: revised header, reused dm_io_complete's REQ_POLLED case] Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
由 Mikulas Patocka 提交于
Early alpha processors cannot write a single byte or short; they read 8 bytes, modify the value in registers and write back 8 bytes. This could cause race condition in the structure dm_io - if the fields flags and io_count are modified simultaneously. Fix this bug by using 32-bit flags if we are on Alpha and if we are compiling for a processor that doesn't have the byte-word-extension. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Fixes: bd4a6dd2 ("dm: reduce size of dm_io and dm_target_io structs") [snitzer: Jens allowed this change since Mikulas owns a relevant Alpha!] Acked-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
- 01 4月, 2022 2 次提交
-
-
由 Mikulas Patocka 提交于
Commit f6f72f32 ("dm integrity: don't replay journal data past the end of the device") skips journal replay if the target sector points beyond the end of the device. Unfortunatelly, it doesn't set the journal entry unused, which resulted in this BUG being triggered: BUG_ON(!journal_entry_is_unused(je)) Fix this by calling journal_entry_set_unused() for this case. Fixes: f6f72f32 ("dm integrity: don't replay journal data past the end of the device") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Tested-by: NMilan Broz <gmazyland@gmail.com> [snitzer: revised header] Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
由 Mikulas Patocka 提交于
This will help triage bugs when userspace is passing invalid ioctl structure to the kernel. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> [snitzer: log errors using DMERR instead of DMWARN] Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
- 22 3月, 2022 4 次提交
-
-
由 Mike Snitzer 提交于
No reason to have separate startio_lock and endio_lock given endio_lock could be used during submission anyway. This change leaves the dm_io struct weighing in at 256 bytes (down from 272 bytes, so saves a cacheline). Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
由 Mike Snitzer 提交于
Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
由 Mike Snitzer 提交于
Add flags to dm_target_io and manage them using the same pattern used for bi_flags in struct bio. Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
由 Mike Snitzer 提交于
Add flags to dm_io and manage them using the same pattern used for bi_flags in struct bio. Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
- 11 3月, 2022 6 次提交
-
-
由 Mike Snitzer 提交于
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Optimizes dm_io_dec_pending() slightly by avoiding local variables. Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Remove the from_wq argument from dm_sumbit_bio_remap(). Eliminates the need for dm_sumbit_bio_remap() callers to know whether they are calling for a workqueue or from the original dm_submit_bio(). Add map_task to dm_io struct, record the map_task in alloc_io and clear it after all target ->map() calls have completed. Update dm_sumbit_bio_remap to check if 'current' matches io->map_task rather than rely on passed 'from_rq' argument. This change really simplifies the chore of porting each DM target to using dm_sumbit_bio_remap() because there is no longer the risk of programming error by not completely knowing all the different contexts a particular method that calls dm_sumbit_bio_remap() might be used in. Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
If a target uses dm_submit_bio_remap() it should set ti->accounts_remapped_io. Also, switch dm_start_io_acct() WARN_ON to WARN_ON_ONCE. Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 10 3月, 2022 1 次提交
-
-
由 Ming Lei 提交于
Support bio polling (REQ_POLLED) in the following approach: 1) only support io polling on normal READ/WRITE, and other abnormal IOs still fallback to IRQ mode, so the target io (and DM's clone bio) is exactly inside the dm io. 2) hold one refcnt on io->io_count after submitting this dm bio with REQ_POLLED 3) support dm native bio splitting, any dm io instance associated with current bio will be added into one list which head is bio->bi_private which will be recovered before ending this bio 4) implement .poll_bio() callback, call bio_poll() on the single target bio inside the dm io which is retrieved via bio->bi_bio_drv_data; call dm_io_dec_pending() after the target io is done in .poll_bio() 5) enable QUEUE_FLAG_POLL if all underlying queues enable QUEUE_FLAG_POLL, which is based on Jeffle's previous patch. These changes are good for a 30-35% IOPS improvement for polled IO. For detailed test results please see (Jens, thanks for testing!): https://listman.redhat.com/archives/dm-devel/2022-March/049868.html or https://marc.info/?l=linux-block&m=164684246214700&w=2Tested-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 09 3月, 2022 6 次提交
-
-
由 Christoph Hellwig 提交于
Use bio_init to initialize the bios when needed to the full state instead of a partial initialization plus later setting of dev and op and bio_reset. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Christoph Hellwig 提交于
There is no need to preallocate the bio and reset it when use. Just allocate it on-stack and use a bvec places next to the pages used for it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Christoph Hellwig 提交于
Stop using bio_reset and just initialize the bio fully when needed. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Christoph Hellwig 提交于
We have all the information to pass the bdev and op directly to bio_init, so do that. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Eric Dumazet 提交于
Calling mdelay(1000) from process context, even while a reboot is in progress, does not make sense. Using msleep() allows other threads to make progress. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: linux-raid@vger.kernel.org Signed-off-by: NSong Liu <song@kernel.org>
-
由 Mariusz Tkaczyk 提交于
Those counters are not necessary after commit 11bb45e8aaf6 ("md: drop queue limitation for RAID1 and RAID10"). Remove them from all code (conf and plug structs). raid1_plug_cb and raid10_plug_cb are identical, so move definition of raid1_plug_cb to common raid1-10 definitions and use it for RAID10 too. Signed-off-by: NMariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: NSong Liu <song@kernel.org>
-
- 08 3月, 2022 1 次提交
-
-
由 Christoph Hellwig 提交于
With the NVMe support for this gone, there are no consumers of these hints left, so remove them. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220304175556.407719-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 07 3月, 2022 1 次提交
-
-
由 Christoph Hellwig 提交于
Use the %pg format specifier to save on stack consuption and code size. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NSong Liu <song@kernel.org> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220304180105.409765-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-