- 14 5月, 2021 2 次提交
-
-
由 Mikulas Patocka 提交于
Revert the commit 7a5b96b4 ("dm integrity: use discard support when recalculating"). There's a bug that when we write some data beyond the current recalculate boundary, the checksum will be rewritten with the discard filler later. And the data will no longer have integrity protection. There's no easy fix for this case. Also, another problematic case is if dm-integrity is used to detect bitrot (random device errors, bit flips, etc); dm-integrity should detect that even for unused sectors. With commit 7a5b96b4 it can happen that such change is undetected (because discard filler is not a valid checksum). Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Acked-by: NMilan Broz <gmazyland@gmail.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
The following commands will crash the kernel: modprobe brd rd_size=1048576 dmsetup create o --table "0 `blockdev --getsize /dev/ram0` snapshot-origin /dev/ram0" dmsetup create s --table "0 `blockdev --getsize /dev/ram0` snapshot /dev/ram0 /dev/ram1 N 0" The reason is that when we test for zero chunk size, we jump to the label bad_read_metadata without setting the "r" variable. The function snapshot_ctr destroys all the structures and then exits with "r == 0". The kernel then crashes because it falsely believes that snapshot_ctr succeeded. In order to fix the bug, we set the variable "r" to -EINVAL. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 10 5月, 2021 1 次提交
-
-
由 Mikulas Patocka 提交于
If an origin target has no snapshots, o->split_boundary is set to 0. This causes BUG_ON(sectors <= 0) in block/bio.c:bio_split(). Fix this by initializing chunk_size, and in turn split_boundary, to rounddown_pow_of_two(UINT_MAX) -- the largest power of two that fits into "unsigned" type. Reported-by: NMichael Tokarev <mjt@tls.msk.ru> Tested-by: NMichael Tokarev <mjt@tls.msk.ru> Cc: stable@vger.kernel.org Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 07 5月, 2021 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
My UEK-derived config has 1030 files depending on pagemap.h before this change. Afterwards, just 326 files need to be rebuilt when I touch pagemap.h. I think blkdev.h is probably included too widely, but untangling that dependency is harder and this solves my problem. x86 allmodconfig builds, but there may be implicit include problems on other architectures. Link: https://lkml.kernel.org/r/20210309195747.283796-1-willy@infradead.orgSigned-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Dan Williams <dan.j.williams@intel.com> [nvdimm] Acked-by: Jens Axboe <axboe@kernel.dk> [block] Reviewed-by: NChristoph Hellwig <hch@lst.de> Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: Martin K. Petersen <martin.petersen@oracle.com> [scsi] Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 5月, 2021 5 次提交
-
-
由 Mike Snitzer 提交于
Commit 29efc390 ("md/md0: optimize raid0 discard handling") and commit d30588b2 ("md/raid10: improve raid10 discard request") remove MD raid0's and raid10's inability to properly handle large discards. So eliminate associated constraints from dm-raid's support. Depends-on: 29efc390 ("md/md0: optimize raid0 discard handling") Depends-on: d30588b2 ("md/raid10: improve raid10 discard request") Reported-by: NMatthew Ruffell <matthew.ruffell@canonical.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Benjamin Block 提交于
When loading a device-mapper table for a request-based mapped device, and the allocation/initialization of the blk_mq_tag_set for the device fails, a following device remove will cause a double free. E.g. (dmesg): device-mapper: core: Cannot initialize queue for request-based dm-mq mapped device device-mapper: ioctl: unable to set up device queue for new table. Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0305e098835de000 TEID: 0305e098835de803 Fault in home space mode while using kernel ASCE. AS:000000025efe0007 R3:0000000000000024 Oops: 0038 ilc:3 [#1] SMP Modules linked in: ... lots of modules ... Supported: Yes, External CPU: 0 PID: 7348 Comm: multipathd Kdump: loaded Tainted: G W X 5.3.18-53-default #1 SLE15-SP3 Hardware name: IBM 8561 T01 7I2 (LPAR) Krnl PSW : 0704e00180000000 000000025e368eca (kfree+0x42/0x330) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 000000000000004a 000000025efe5230 c1773200d779968d 0000000000000000 000000025e520270 000000025e8d1b40 0000000000000003 00000007aae10000 000000025e5202a2 0000000000000001 c1773200d779968d 0305e098835de640 00000007a8170000 000003ff80138650 000000025e5202a2 000003e00396faa8 Krnl Code: 000000025e368eb8: c4180041e100 lgrl %r1,25eba50b8 000000025e368ebe: ecba06b93a55 risbg %r11,%r10,6,185,58 #000000025e368ec4: e3b010000008 ag %r11,0(%r1) >000000025e368eca: e310b0080004 lg %r1,8(%r11) 000000025e368ed0: a7110001 tmll %r1,1 000000025e368ed4: a7740129 brc 7,25e369126 000000025e368ed8: e320b0080004 lg %r2,8(%r11) 000000025e368ede: b904001b lgr %r1,%r11 Call Trace: [<000000025e368eca>] kfree+0x42/0x330 [<000000025e5202a2>] blk_mq_free_tag_set+0x72/0xb8 [<000003ff801316a8>] dm_mq_cleanup_mapped_device+0x38/0x50 [dm_mod] [<000003ff80120082>] free_dev+0x52/0xd0 [dm_mod] [<000003ff801233f0>] __dm_destroy+0x150/0x1d0 [dm_mod] [<000003ff8012bb9a>] dev_remove+0x162/0x1c0 [dm_mod] [<000003ff8012a988>] ctl_ioctl+0x198/0x478 [dm_mod] [<000003ff8012ac8a>] dm_ctl_ioctl+0x22/0x38 [dm_mod] [<000000025e3b11ee>] ksys_ioctl+0xbe/0xe0 [<000000025e3b127a>] __s390x_sys_ioctl+0x2a/0x40 [<000000025e8c15ac>] system_call+0xd8/0x2c8 Last Breaking-Event-Address: [<000000025e52029c>] blk_mq_free_tag_set+0x6c/0xb8 Kernel panic - not syncing: Fatal exception: panic_on_oops When allocation/initialization of the blk_mq_tag_set fails in dm_mq_init_request_queue(), it is uninitialized/freed, but the pointer is not reset to NULL; so when dev_remove() later gets into dm_mq_cleanup_mapped_device() it sees the pointer and tries to uninitialize and free it again. Fix this by setting the pointer to NULL in dm_mq_init_request_queue() error-handling. Also set it to NULL in dm_mq_cleanup_mapped_device(). Cc: <stable@vger.kernel.org> # 4.6+ Fixes: 1c357a1e ("dm: allocate blk_mq_tag_set rather than embed in mapped_device") Signed-off-by: NBenjamin Block <bblock@linux.ibm.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
If we have discard support we don't have to recalculate hash - we can just fill the metadata with the discard pattern. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
Increase RECALC_SECTORS because it improves recalculate speed slightly (from 390kiB/s to 410kiB/s). Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
If we discard already discarded blocks we do not need to write discard pattern to the metadata, because it is already there. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 24 4月, 2021 2 次提交
-
-
由 Paul Clements 提交于
This patch addresses a data corruption bug in raid1 arrays using bitmaps. Without this fix, the bitmap bits for the failed I/O end up being cleared. Since we are in the failure leg of raid1_end_write_request, the request either needs to be retried (R1BIO_WriteError) or failed (R1BIO_Degraded). Fixes: eeba6809 ("md/raid1: end bio when the device faulty") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: NPaul Clements <paul.clements@us.sios.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Heming Zhao 提交于
md_kick_rdev_from_array will remove rdev, so we should use rdev_for_each_safe to search list. How to trigger: env: Two nodes on kvm-qemu x86_64 VMs (2C2G with 2 iscsi luns). ``` node2=192.168.0.3 for i in {1..20}; do echo ==== $i `date` ====; mdadm -Ss && ssh ${node2} "mdadm -Ss" wipefs -a /dev/sda /dev/sdb mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l 1 /dev/sda \ /dev/sdb --assume-clean ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb" mdadm --wait /dev/md0 ssh ${node2} "mdadm --wait /dev/md0" mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda sleep 1 done ``` Crash stack: ``` stack segment: 0000 [#1] SMP ... ... RIP: 0010:md_check_recovery+0x1e8/0x570 [md_mod] ... ... RSP: 0018:ffffb149807a7d68 EFLAGS: 00010207 RAX: 0000000000000000 RBX: ffff9d494c180800 RCX: ffff9d490fc01e50 RDX: fffff047c0ed8308 RSI: 0000000000000246 RDI: 0000000000000246 RBP: 6b6b6b6b6b6b6b6b R08: ffff9d490fc01e40 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff9d494c180818 R14: ffff9d493399ef38 R15: ffff9d4933a1d800 FS: 0000000000000000(0000) GS:ffff9d494f700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe68cab9010 CR3: 000000004c6be001 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: raid1d+0x5c/0xd40 [raid1] ? finish_task_switch+0x75/0x2a0 ? lock_timer_base+0x67/0x80 ? try_to_del_timer_sync+0x4d/0x80 ? del_timer_sync+0x41/0x50 ? schedule_timeout+0x254/0x2d0 ? md_start_sync+0xe0/0xe0 [md_mod] ? md_thread+0x127/0x160 [md_mod] md_thread+0x127/0x160 [md_mod] ? wait_woken+0x80/0x80 kthread+0x10d/0x130 ? kthread_park+0xa0/0xa0 ret_from_fork+0x1f/0x40 ``` Fixes: dbb64f86 ("md-cluster: Fix adding of new disk with new reload code") Fixes: 659b254f ("md-cluster: remove a disk asynchronously from cluster environment") Cc: stable@vger.kernel.org Reviewed-by: NGang He <ghe@suse.com> Signed-off-by: NHeming Zhao <heming.zhao@suse.com> Signed-off-by: NSong Liu <song@kernel.org>
-
- 22 4月, 2021 1 次提交
-
-
由 Heinz Mauelshagen 提交于
If fast table reloads occur during an ongoing reshape of raid4/5/6 devices the target may race reading a superblock vs the the MD resync thread; causing an inconclusive reshape state to be read in its constructor. lvm2 test lvconvert-raid-reshape-stripes-load-reload.sh can cause BUG_ON() to trigger in md_run(), e.g.: "kernel BUG at drivers/md/raid5.c:7567!". Scenario triggering the bug: 1. the MD sync thread calls end_reshape() from raid5_sync_request() when done reshaping. However end_reshape() _only_ updates the reshape position to MaxSector keeping the changed layout configuration though (i.e. any delta disks, chunk sector or RAID algorithm changes). That inconclusive configuration is stored in the superblock. 2. dm-raid constructs a mapping, loading named inconsistent superblock as of step 1 before step 3 is able to finish resetting the reshape state completely, and calls md_run() which leads to mentioned bug in raid5.c. 3. the MD RAID personality's finish_reshape() is called; which resets the reshape information on chunk sectors, delta disks, etc. This explains why the bug is rarely seen on multi-core machines, as MD's finish_reshape() superblock update races with the dm-raid constructor's superblock load in step 2. Fix identifies inconclusive superblock content in the dm-raid constructor and resets it before calling md_run(), factoring out identifying checks into rs_is_layout_change() to share in existing rs_reshape_requested() and new rs_reset_inclonclusive_reshape(). Also enhance a comment and remove an empty line. Cc: stable@vger.kernel.org Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 21 4月, 2021 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 20 4月, 2021 7 次提交
-
-
由 Jiapeng Chong 提交于
Fix the following clang warning: drivers/md/dm-clone-metadata.c:279:19: warning: unused function 'superblock_write_lock' [-Wunused-function]. Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Tian Tao 提交于
Fixes: 468dfca3 ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org Signed-off-by: NTian Tao <tiantao6@hisilicon.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Matthew Wilcox (Oracle) 提交于
Use kvcalloc or kvmalloc_array instead (depending whether zeroing is useful). Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Joe Thornber 提交于
This division bug meant the search for free metadata space could skip the final allocation bitmap's worth of entries. Fix affects DM thinp, cache and era targets. Cc: stable@vger.kernel.org Signed-off-by: NJoe Thornber <ejt@redhat.com> Tested-by: NMing-Hung Tsai <mtsai@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Joe Thornber 提交于
Otherwise most non-x86 architectures (e.g. riscv, arm) will resort to byte-by-byte access. Cc: stable@vger.kernel.org Signed-off-by: NJoe Thornber <ejt@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Joe Thornber 提交于
Remove this extra BUG_ON() that calls node_check() -- which avoids extra crc checking. Signed-off-by: NJoe Thornber <ejt@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Joe Thornber 提交于
Both sm_disk_new_block and sm_disk_commit are needlessly calling sm_disk_get_nr_free(). Looks like old queries used for some debugging. Signed-off-by: NJoe Thornber <ejt@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 16 4月, 2021 4 次提交
-
-
由 Sudhakar Panneerselvam 提交于
NULL pointer dereference was observed in super_written() when it tries to access the mddev structure. [The below stack trace is from an older kernel, but the problem described in this patch applies to the mainline kernel.] [ 1194.474861] task: ffff8fdd20858000 task.stack: ffffb99d40790000 [ 1194.488000] RIP: 0010:super_written+0x29/0xe1 [ 1194.499688] RSP: 0018:ffff8ffb7fcc3c78 EFLAGS: 00010046 [ 1194.512477] RAX: 0000000000000000 RBX: ffff8ffb7bf4a000 RCX: ffff8ffb78991048 [ 1194.527325] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ffb56b8a200 [ 1194.542576] RBP: ffff8ffb7fcc3c90 R08: 000000000000000b R09: 0000000000000000 [ 1194.558001] R10: ffff8ffb56b8a298 R11: 0000000000000000 R12: ffff8ffb56b8a200 [ 1194.573070] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 1194.588117] FS: 0000000000000000(0000) GS:ffff8ffb7fcc0000(0000) knlGS:0000000000000000 [ 1194.604264] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1194.617375] CR2: 00000000000002b8 CR3: 00000021e040a002 CR4: 00000000007606e0 [ 1194.632327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1194.647865] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1194.663316] PKRU: 55555554 [ 1194.674090] Call Trace: [ 1194.683735] <IRQ> [ 1194.692948] bio_endio+0xae/0x135 [ 1194.703580] blk_update_request+0xad/0x2fa [ 1194.714990] blk_update_bidi_request+0x20/0x72 [ 1194.726578] __blk_end_bidi_request+0x2c/0x4d [ 1194.738373] __blk_end_request_all+0x31/0x49 [ 1194.749344] blk_flush_complete_seq+0x377/0x383 [ 1194.761550] flush_end_io+0x1dd/0x2a7 [ 1194.772910] blk_finish_request+0x9f/0x13c [ 1194.784544] scsi_end_request+0x180/0x25c [ 1194.796149] scsi_io_completion+0xc8/0x610 [ 1194.807503] scsi_finish_command+0xdc/0x125 [ 1194.818897] scsi_softirq_done+0x81/0xde [ 1194.830062] blk_done_softirq+0xa4/0xcc [ 1194.841008] __do_softirq+0xd9/0x29f [ 1194.851257] irq_exit+0xe6/0xeb [ 1194.861290] do_IRQ+0x59/0xe3 [ 1194.871060] common_interrupt+0x1c6/0x382 [ 1194.881988] </IRQ> [ 1194.890646] RIP: 0010:cpuidle_enter_state+0xdd/0x2a5 [ 1194.902532] RSP: 0018:ffffb99d40793e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff43 [ 1194.917317] RAX: ffff8ffb7fce27c0 RBX: ffff8ffb7fced800 RCX: 000000000000001f [ 1194.932056] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000 [ 1194.946428] RBP: ffffb99d40793ea0 R08: 0000000000000004 R09: 0000000000002ed2 [ 1194.960508] R10: 0000000000002664 R11: 0000000000000018 R12: 0000000000000003 [ 1194.974454] R13: 000000000000000b R14: ffffffff925715a0 R15: 0000011610120d5a [ 1194.988607] ? cpuidle_enter_state+0xcc/0x2a5 [ 1194.999077] cpuidle_enter+0x17/0x19 [ 1195.008395] call_cpuidle+0x23/0x3a [ 1195.017718] do_idle+0x172/0x1d5 [ 1195.026358] cpu_startup_entry+0x73/0x75 [ 1195.035769] start_secondary+0x1b9/0x20b [ 1195.044894] secondary_startup_64+0xa5/0xa5 [ 1195.084921] RIP: super_written+0x29/0xe1 RSP: ffff8ffb7fcc3c78 [ 1195.096354] CR2: 00000000000002b8 bio in the above stack is a bitmap write whose completion is invoked after the tear down sequence sets the mddev structure to NULL in rdev. During tear down, there is an attempt to flush the bitmap writes, but for external bitmaps, there is no explicit wait for all the bitmap writes to complete. For instance, md_bitmap_flush() is called to flush the bitmap writes, but the last call to md_bitmap_daemon_work() in md_bitmap_flush() could generate new bitmap writes for which there is no explicit wait to complete those writes. The call to md_bitmap_update_sb() will return simply for external bitmaps and the follow-up call to md_update_sb() is conditional and may not get called for external bitmaps. This results in a kernel panic when the completion routine, super_written() is called which tries to reference mddev in the rdev that has been set to NULL(in unbind_rdev_from_array() by tear down sequence). The solution is to call md_super_wait() for external bitmaps after the last call to md_bitmap_daemon_work() in md_bitmap_flush() to ensure there are no pending bitmap writes before proceeding with the tear down. Cc: stable@vger.kernel.org Signed-off-by: NSudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com> Reviewed-by: NZhao Heming <heming.zhao@suse.com> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Christoph Hellwig 提交于
Instead of returning an existing mddev, just for it to be discarded later directly return -EEXIST. Rename the function to mddev_alloc now that it doesn't find an existing mddev. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Christoph Hellwig 提交于
Allocate the new mddev first speculatively, which greatly simplifies the code flow. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Christoph Hellwig 提交于
Split out a self contained helper to find a free minor for the md "unit" number. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
- 15 4月, 2021 1 次提交
-
-
由 Jaegeuk Kim 提交于
commit df7b59ba ("dm verity: fix FEC for RS roots unaligned to block size") introduced the possibility for misaligned roots IO relative to the underlying device's logical block size. E.g. Android's default RS roots=2 results in dm_bufio->block_size=1024, which causes the following EIO if the logical block size of the device is 4096, given v->data_dev_block_bits=12: E sd 0 : 0:0:0: [sda] tag#30 request not aligned to the logical block size E blk_update_request: I/O error, dev sda, sector 10368424 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 E device-mapper: verity-fec: 254:8: FEC 9244672: parity read failed (block 18056): -5 Fix this by onlu using f->roots for dm_bufio blocksize IFF it is aligned to v->data_dev_block_bits. Fixes: df7b59ba ("dm verity: fix FEC for RS roots unaligned to block size") Cc: stable@vger.kernel.org Signed-off-by: NJaegeuk Kim <jaegeuk@google.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 11 4月, 2021 7 次提交
-
-
由 Coly Li 提交于
The patch "bcache: remove PTR_CACHE" introduces a compiling failure in debug.c with following error message, In file included from drivers/md/bcache/bcache.h:182:0, from drivers/md/bcache/debug.c:9: drivers/md/bcache/debug.c: In function 'bch_btree_verify': drivers/md/bcache/debug.c:53:19: error: 'c' undeclared (first use in this function) bio_set_dev(bio, c->cache->bdev); ^ This patch fixes the regression by replacing c->cache->bdev by b->c-> cache->bdev. Signed-off-by: NColy Li <colyli@suse.de> Cc: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210411134316.80274-8-colyli@suse.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Gustavo A. R. Silva 提交于
Cast multiple variables to (int64_t) in order to give the compiler complete information about the proper arithmetic to use. Notice that these variables are being used in contexts that expect expressions of type int64_t (64 bit, signed). And currently, such expressions are being evaluated using 32-bit arithmetic. Fixes: d0cf9503 ("octeontx2-pf: ethtool fec mode support") Addresses-Coverity-ID: 1501724 ("Unintentional integer overflow") Addresses-Coverity-ID: 1501725 ("Unintentional integer overflow") Addresses-Coverity-ID: 1501726 ("Unintentional integer overflow") Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210411134316.80274-7-colyli@suse.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Bhaskar Chowdhury 提交于
s/condidate/candidate/ s/folowing/following/ Signed-off-by: NBhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210411134316.80274-6-colyli@suse.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Arnd Bergmann 提交于
building with 'make W=1' shows a harmless warning for each user of the EBUG_ON() macro: drivers/md/bcache/bset.c: In function 'bch_btree_sort_partial': drivers/md/bcache/util.h:30:55: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 30 | #define EBUG_ON(cond) do { if (cond); } while (0) | ^ drivers/md/bcache/bset.c:1312:9: note: in expansion of macro 'EBUG_ON' 1312 | EBUG_ON(oldsize >= 0 && bch_count_data(b) != oldsize); | ^~~~~~~ Reword the macro slightly to avoid the warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210411134316.80274-5-colyli@suse.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Yang Li 提交于
This fixes the following sparse warnings: drivers/md/bcache/features.c:22:16: warning: Using plain integer as NULL pointer Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NYang Li <yang.lee@linux.alibaba.com> Signed-off-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210411134316.80274-4-colyli@suse.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Remove the PTR_CACHE inline and replace it with a direct dereference of c->cache. (Coly Li: fix the typo from PTR_BUCKET to PTR_CACHE in commit log) Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210411134316.80274-3-colyli@suse.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Zhiqiang Liu 提交于
In bch_cached_dev_run(), free(env[1])|free(env[2])|free(buf) show up three times. This patch introduce out tag in which free(env[1])|free(env[2])|free(buf) are only called one time. If we need to call free() when errors occur, we can set error code to ret, and then goto out tag directly. Signed-off-by: NZhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210411134316.80274-2-colyli@suse.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 4月, 2021 1 次提交
-
-
由 Sami Tolvanen 提交于
list_sort() internally casts the comparison function passed to it to a different type with constant struct list_head pointers, and uses this pointer to call the functions, which trips indirect call Control-Flow Integrity (CFI) checking. Instead of removing the consts, this change defines the list_cmp_func_t type and changes the comparison function types of all list_sort() callers to use const pointers, thus avoiding type mismatches. Suggested-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKees Cook <keescook@chromium.org> Tested-by: NNick Desaulniers <ndesaulniers@google.com> Tested-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
-
- 08 4月, 2021 3 次提交
-
-
由 Christoph Hellwig 提交于
Split mddev_find into a simple mddev_find that just finds an existing mddev by the unit number, and a more complicated mddev_find that deals with find or allocating a mddev. This turns out to fix this bug reported by Zhao Heming. ----------------------------- snip ------------------------------ commit d3374825 ("md: make devices disappear when they are no longer needed.") introduced protection between mddev creating & removing. The md_open shouldn't create mddev when all_mddevs list doesn't contain mddev. With currently code logic, there will be very easy to trigger soft lockup in non-preempt env. *** env *** kvm-qemu VM 2C1G with 2 iscsi luns kernel should be non-preempt *** script *** about trigger 1 time with 10 tests `1 node1="15sp3-mdcluster1" 2 node2="15sp3-mdcluster2" 3 4 mdadm -Ss 5 ssh ${node2} "mdadm -Ss" 6 wipefs -a /dev/sda /dev/sdb 7 mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda \ /dev/sdb --assume-clean 8 9 for i in {1..100}; do 10 echo ==== $i ====; 11 12 echo "test ...." 13 ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb" 14 sleep 1 15 16 echo "clean ....." 17 ssh ${node2} "mdadm -Ss" 18 done ` I use mdcluster env to trigger soft lockup, but it isn't mdcluster speical bug. To stop md array in mdcluster env will do more jobs than non-cluster array, which will leave enough time/gap to allow kernel to run md_open. *** stack *** `ID: 2831 TASK: ffff8dd7223b5040 CPU: 0 COMMAND: "mdadm" #0 [ffffa15d00a13b90] __schedule at ffffffffb8f1935f #1 [ffffa15d00a13ba8] exact_lock at ffffffffb8a4a66d #2 [ffffa15d00a13bb0] kobj_lookup at ffffffffb8c62fe3 #3 [ffffa15d00a13c28] __blkdev_get at ffffffffb89273b9 #4 [ffffa15d00a13c98] blkdev_get at ffffffffb8927964 #5 [ffffa15d00a13cb0] do_dentry_open at ffffffffb88dc4b4 #6 [ffffa15d00a13ce0] path_openat at ffffffffb88f0ccc #7 [ffffa15d00a13db8] do_filp_open at ffffffffb88f32bb #8 [ffffa15d00a13ee0] do_sys_open at ffffffffb88ddc7d #9 [ffffa15d00a13f38] do_syscall_64 at ffffffffb86053cb ffffffffb900008c or: [ 884.226509] mddev_put+0x1c/0xe0 [md_mod] [ 884.226515] md_open+0x3c/0xe0 [md_mod] [ 884.226518] __blkdev_get+0x30d/0x710 [ 884.226520] ? bd_acquire+0xd0/0xd0 [ 884.226522] blkdev_get+0x14/0x30 [ 884.226524] do_dentry_open+0x204/0x3a0 [ 884.226531] path_openat+0x2fc/0x1520 [ 884.226534] ? seq_printf+0x4e/0x70 [ 884.226536] do_filp_open+0x9b/0x110 [ 884.226542] ? md_release+0x20/0x20 [md_mod] [ 884.226543] ? seq_read+0x1d8/0x3e0 [ 884.226545] ? kmem_cache_alloc+0x18a/0x270 [ 884.226547] ? do_sys_open+0x1bd/0x260 [ 884.226548] do_sys_open+0x1bd/0x260 [ 884.226551] do_syscall_64+0x5b/0x1e0 [ 884.226554] entry_SYSCALL_64_after_hwframe+0x44/0xa9 ` *** rootcause *** "mdadm -A" (or other array assemble commands) will start a daemon "mdadm --monitor" by default. When "mdadm -Ss" is running, the stop action will wakeup "mdadm --monitor". The "--monitor" daemon will immediately get info from /proc/mdstat. This time mddev in kernel still exist, so /proc/mdstat still show md device, which makes "mdadm --monitor" to open /dev/md0. The previously "mdadm -Ss" is removing action, the "mdadm --monitor" open action will trigger md_open which is creating action. Racing is happening. `<thread 1>: "mdadm -Ss" md_release mddev_put deletes mddev from all_mddevs queue_work for mddev_delayed_delete at this time, "/dev/md0" is still available for opening <thread 2>: "mdadm --monitor ..." md_open + mddev_find can't find mddev of /dev/md0, and create a new mddev and | return. + trigger "if (mddev->gendisk != bdev->bd_disk)" and return -ERESTARTSYS. ` In non-preempt kernel, <thread 2> is occupying on current CPU. and mddev_delayed_delete which was created in <thread 1> also can't be schedule. In preempt kernel, it can also trigger above racing. But kernel doesn't allow one thread running on a CPU all the time. after <thread 2> running some time, the later "mdadm -A" (refer above script line 13) will call md_alloc to alloc a new gendisk for mddev. it will break md_open statement "if (mddev->gendisk != bdev->bd_disk)" and return 0 to caller, the soft lockup is broken. ------------------------------ snip ------------------------------ Cc: stable@vger.kernel.org Fixes: d3374825 ("md: make devices disappear when they are no longer needed.") Reported-by: NHeming Zhao <heming.zhao@suse.com> Reviewed-by: NHeming Zhao <heming.zhao@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Christoph Hellwig 提交于
Factor out a self-contained helper to just lookup a mddev by the dev_t "unit". Cc: stable@vger.kernel.org Reviewed-by: NHeming Zhao <heming.zhao@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org>
-
由 Zhao Heming 提交于
commit d3374825 ("md: make devices disappear when they are no longer needed.") introduced protection between mddev creating & removing. The md_open shouldn't create mddev when all_mddevs list doesn't contain mddev. With currently code logic, there will be very easy to trigger soft lockup in non-preempt env. This patch changes md_open returning from -ERESTARTSYS to -EBUSY, which will break the infinitely retry when md_open enter racing area. This patch is partly fix soft lockup issue, full fix needs mddev_find is split into two functions: mddev_find & mddev_find_or_alloc. And md_open should call new mddev_find (it only does searching job). For more detail, please refer with Christoph's "split mddev_find" patch in later commits. *** env *** kvm-qemu VM 2C1G with 2 iscsi luns kernel should be non-preempt *** script *** about trigger every time with below script ``` 1 node1="mdcluster1" 2 node2="mdcluster2" 3 4 mdadm -Ss 5 ssh ${node2} "mdadm -Ss" 6 wipefs -a /dev/sda /dev/sdb 7 mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda \ /dev/sdb --assume-clean 8 9 for i in {1..10}; do 10 echo ==== $i ====; 11 12 echo "test ...." 13 ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb" 14 sleep 1 15 16 echo "clean ....." 17 ssh ${node2} "mdadm -Ss" 18 done ``` I use mdcluster env to trigger soft lockup, but it isn't mdcluster speical bug. To stop md array in mdcluster env will do more jobs than non-cluster array, which will leave enough time/gap to allow kernel to run md_open. *** stack *** ``` [ 884.226509] mddev_put+0x1c/0xe0 [md_mod] [ 884.226515] md_open+0x3c/0xe0 [md_mod] [ 884.226518] __blkdev_get+0x30d/0x710 [ 884.226520] ? bd_acquire+0xd0/0xd0 [ 884.226522] blkdev_get+0x14/0x30 [ 884.226524] do_dentry_open+0x204/0x3a0 [ 884.226531] path_openat+0x2fc/0x1520 [ 884.226534] ? seq_printf+0x4e/0x70 [ 884.226536] do_filp_open+0x9b/0x110 [ 884.226542] ? md_release+0x20/0x20 [md_mod] [ 884.226543] ? seq_read+0x1d8/0x3e0 [ 884.226545] ? kmem_cache_alloc+0x18a/0x270 [ 884.226547] ? do_sys_open+0x1bd/0x260 [ 884.226548] do_sys_open+0x1bd/0x260 [ 884.226551] do_syscall_64+0x5b/0x1e0 [ 884.226554] entry_SYSCALL_64_after_hwframe+0x44/0xa9 ``` *** rootcause *** "mdadm -A" (or other array assemble commands) will start a daemon "mdadm --monitor" by default. When "mdadm -Ss" is running, the stop action will wakeup "mdadm --monitor". The "--monitor" daemon will immediately get info from /proc/mdstat. This time mddev in kernel still exist, so /proc/mdstat still show md device, which makes "mdadm --monitor" to open /dev/md0. The previously "mdadm -Ss" is removing action, the "mdadm --monitor" open action will trigger md_open which is creating action. Racing is happening. ``` <thread 1>: "mdadm -Ss" md_release mddev_put deletes mddev from all_mddevs queue_work for mddev_delayed_delete at this time, "/dev/md0" is still available for opening <thread 2>: "mdadm --monitor ..." md_open + mddev_find can't find mddev of /dev/md0, and create a new mddev and | return. + trigger "if (mddev->gendisk != bdev->bd_disk)" and return -ERESTARTSYS. ``` In non-preempt kernel, <thread 2> is occupying on current CPU. and mddev_delayed_delete which was created in <thread 1> also can't be schedule. In preempt kernel, it can also trigger above racing. But kernel doesn't allow one thread running on a CPU all the time. after <thread 2> running some time, the later "mdadm -A" (refer above script line 13) will call md_alloc to alloc a new gendisk for mddev. it will break md_open statement "if (mddev->gendisk != bdev->bd_disk)" and return 0 to caller, the soft lockup is broken. Cc: stable@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NZhao Heming <heming.zhao@suse.com> Signed-off-by: NSong Liu <song@kernel.org>
-
- 27 3月, 2021 4 次提交
-
-
由 Mikulas Patocka 提交于
Add a new flag "reset_recalculate" that will restart recalculating from the beginning of the device. It can be used if we want to change the hash function. Example: dmsetup remove_all rmmod brd set -e modprobe brd rd_size=1048576 dmsetup create in --table '0 2000000 integrity /dev/ram0 0 16 J 2 internal_hash:sha256 recalculate' sleep 10 dmsetup status dmsetup remove in dmsetup create in --table '0 2000000 integrity /dev/ram0 0 16 J 2 internal_hash:sha3-256 reset_recalculate' Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Jiapeng Chong 提交于
Fix the following coccicheck warnings: ./drivers/md/persistent-data/dm-btree-spine.c:188:5-6: Unneeded variable: "r". Return "0" on line 194. Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Xu Wang 提交于
Since commit ff9ea323 ("block, bdi: an active gendisk always has a request_queue associated with it") the request_queue pointer returned from bdev_get_queue() shall never be NULL. Signed-off-by: NXu Wang <vulab@iscas.ac.cn> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Xu Wang 提交于
Since commit ff9ea323 ("block, bdi: an active gendisk always has a request_queue associated with it") the request_queue pointer returned from bdev_get_queue() shall never be NULL. Signed-off-by: NXu Wang <vulab@iscas.ac.cn> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-