- 03 Jun 2023 (19 commits)
-
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188535, https://gitee.com/openeuler/kernel/issues/I6O61Q CVE: NA -------------------------------- Recovery will go to giveup and let chunks_skipped++ in raid10_sync_request() if there are some bad_blocks, and it will return max_sector when chunks_skipped >= geo.raid_disks. As a result, recovery fails and data is inconsistent, but the user thinks recovery is done, which is wrong. Fix it by setting the mirror's recovery_disabled, and the spare device shouldn't be added here. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit b0ac58c9)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188378, https://gitee.com/openeuler/kernel/issues/I6GGV7 CVE: NA -------------------------------- init_resync() inits the mempool and sets conf->have_replacement at the beginning of sync, and close_sync() frees the mempool when sync is completed. After commit 7e83ccbe ("md/raid10: Allow skipping recovery when clean arrays are assembled"), recovery might be skipped, so init_resync() is called but close_sync() is not. A null-ptr-deref occurs as below: 1) create an array, wait for resync to complete, mddev->recovery_cp is set to MaxSector. 2) recovery is woken and it is skipped. conf->have_replacement is set to 0 in init_resync(). close_sync() is not called. 3) some io errors occur and rdev A is set to WantReplacement. 4) a new device is added and set as A's replacement. 5) recovery is woken, A has a replacement, but conf->have_replacement is 0. r10bio->dev[i].repl_bio will not be allocated and a null-ptr-deref occurs. Fix it by not calling init_resync() if recovery is skipped. Fixes: 7e83ccbe ("md/raid10: Allow skipping recovery when clean arrays are assembled") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 2de30b8f)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6XBZQ CVE: NA -------------------------------- If setting any badblocks fails, we will remove this rdev (set it to Faulty or set recovery_disabled). The previous patch "md/raid10: fix io hung in md_wait_for_blocked_rdev()" checks badblocks->changed instead of the return value in rdev_set_badblocks(), but the return value of this function also changed accordingly, which is not what we expected. Keep the return value consistent with before. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit bebf3d97)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6XBZQ CVE: NA -------------------------------- If badblocks are merged but bb->count is exceeded, badblocks_set() will return 1 and the merged badblocks will become un-acked. rdev_set_badblocks() will not set sb_flags and wake up mddev->thread, so io waiting in md_wait_for_blocked_rdev() will hang because BlockedBadBlocks may not be cleared. Fix it by checking badblocks->changed instead of the return value. This flag is set when the badblocks change. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit c23e1cd1)
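A sketch of the approach described above; the helper name is made up, but the 'changed' flag and the wakeup follow the md code the commit refers to:

    /*
     * Sketch only, not the actual hunk: after badblocks_set(), use the
     * 'changed' flag (set whenever the badblocks table was modified) instead
     * of the return value to decide whether the superblock must be written
     * out and the md thread woken, so a merge that exceeds bb->count still
     * unblocks md_wait_for_blocked_rdev().
     */
    static void note_badblocks_changed(struct mddev *mddev, struct md_rdev *rdev)
    {
            if (rdev->badblocks.changed) {
                    set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags);
                    set_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags);
                    md_wakeup_thread(mddev->thread);
            }
    }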
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188605, https://gitee.com/openeuler/kernel/issues/I6ZJ3T CVE: NA -------------------------------- We get rdev from mirrors->replacement twice in raid10_write_request(). If the replacement changes between the two reads, it will increase A->nr_pending and decrease B->nr_pending. T1 (write) T2 (remove) T3 (add) raid10_remove_disk raid10_write_request rrdev = conf->mirrors[d].replacement; ->rdev A A nr_pending++ p->rdev = p->replacement; ->rdev A p->replacement = NULL; // A is set to WantReplacement raid10_add_disk p->replacement = rdev; ->rdev B if blocked_rdev rdev = conf->mirrors[d].replacement; ->rdev B B nr_pending-- Fix it by recording rdev in r10bio and getting rdev from r10bio. Fixes: 475b0321 ("md/raid10: writes should get directed to replacement as well as original.") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 7b3b8187)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188605, https://gitee.com/openeuler/kernel/issues/I6GOYF CVE: NA -------------------------------- It might read mirror->rdev first and then mirror->replacement because of memory reordering in raid10_end_write_request(), and a WARN_ON occurs if we remove the disk at the same time. T1 remove T2 io end raid10_remove_disk raid10_end_write_request p->rdev = NULL read rdev -> NULL smp_mb p->replacement = NULL read replacement -> NULL It is meaningless to compare rdev with mirror->rdev after we get it from r10_bio in raid10_end_write_request(). Remove this WARN_ON_ONCE. Fixes: 2ecf5e6ecbfd ("md/raid10: fix uaf if replacement replaces rdev") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit a3ebeed7)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188377, https://gitee.com/openeuler/kernel/issues/I6GOYF CVE: NA -------------------------------- After commit 4ca40c2c ("md/raid10: Allow replacement device to be replace old drive.") mirrors->replacement can replace rdev during replacement's io pending, and repl_bio will write rdev (see raid10_write_one_disk()). We will get wrong device by r10conf in raid10_end_write_request(). In which case, r10_bio->devs[slot].repl_bio will be put but not set to IO_MADE_GOOD, and it will be put again later in raid_end_bio_io(), uaf occurs. Fix it by using r10_bio to record rdev. Put the operations of io fail and no replacement together, so no need to change repl. ================================================================== BUG: KASAN: use-after-free in bio_flagged include/linux/bio.h:238 [inline] BUG: KASAN: use-after-free in bio_put+0x78/0x80 block/bio.c:650 Read of size 2 at addr ffff888116524dd4 by task md0_raid10/2618 CPU: 0 PID: 2618 Comm: md0_raid10 Not tainted 5.10.0+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 sd 0:0:0:0: rejecting I/O to offline device Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x107/0x167 lib/dump_stack.c:118 print_address_description.constprop.0+0x1c/0x270 mm/kasan/report.c:390 __kasan_report mm/kasan/report.c:550 [inline] kasan_report.cold+0x22/0x3a mm/kasan/report.c:567 bio_flagged include/linux/bio.h:238 [inline] bio_put+0x78/0x80 block/bio.c:650 put_all_bios drivers/md/raid10.c:248 [inline] free_r10bio drivers/md/raid10.c:257 [inline] raid_end_bio_io+0x3b5/0x590 drivers/md/raid10.c:309 handle_write_completed drivers/md/raid10.c:2699 [inline] raid10d+0x2f85/0x5af0 drivers/md/raid10.c:2759 md_thread+0x444/0x4b0 drivers/md/md.c:7932 kthread+0x38c/0x470 kernel/kthread.c:313 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299 Allocated by task 1400: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track mm/kasan/common.c:56 [inline] set_alloc_info mm/kasan/common.c:498 [inline] __kasan_kmalloc.constprop.0+0xb5/0xe0 mm/kasan/common.c:530 slab_post_alloc_hook mm/slab.h:512 [inline] slab_alloc_node mm/slub.c:2923 [inline] slab_alloc mm/slub.c:2931 [inline] kmem_cache_alloc+0x144/0x360 mm/slub.c:2936 mempool_alloc+0x146/0x360 mm/mempool.c:391 bio_alloc_bioset+0x375/0x610 block/bio.c:486 bio_clone_fast+0x20/0x50 block/bio.c:711 raid10_write_one_disk+0x166/0xd30 drivers/md/raid10.c:1240 raid10_write_request+0x1600/0x2c90 drivers/md/raid10.c:1484 __make_request drivers/md/raid10.c:1508 [inline] raid10_make_request+0x376/0x620 drivers/md/raid10.c:1537 md_handle_request+0x699/0x970 drivers/md/md.c:451 md_submit_bio+0x204/0x400 drivers/md/md.c:489 __submit_bio block/blk-core.c:959 [inline] __submit_bio_noacct block/blk-core.c:1007 [inline] submit_bio_noacct+0x2e3/0xcf0 block/blk-core.c:1086 submit_bio+0x1a0/0x3a0 block/blk-core.c:1146 submit_bh_wbc+0x685/0x8e0 fs/buffer.c:3053 ext4_commit_super+0x37e/0x6c0 fs/ext4/super.c:5696 flush_stashed_error_work+0x28b/0x400 fs/ext4/super.c:791 process_one_work+0x9a6/0x1590 kernel/workqueue.c:2280 worker_thread+0x61d/0x1310 kernel/workqueue.c:2426 kthread+0x38c/0x470 kernel/kthread.c:313 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299 Freed by task 2618: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:361 __kasan_slab_free+0x151/0x180 mm/kasan/common.c:482 slab_free_hook mm/slub.c:1569 [inline] 
slab_free_freelist_hook+0xa9/0x180 mm/slub.c:1608 slab_free mm/slub.c:3179 [inline] kmem_cache_free+0xcd/0x3d0 mm/slub.c:3196 mempool_free+0xe3/0x3b0 mm/mempool.c:500 bio_free+0xe2/0x140 block/bio.c:266 bio_put+0x58/0x80 block/bio.c:651 raid10_end_write_request+0x885/0xb60 drivers/md/raid10.c:516 bio_endio+0x376/0x6a0 block/bio.c:1465 req_bio_endio block/blk-core.c:289 [inline] blk_update_request+0x5f5/0xf40 block/blk-core.c:1525 blk_mq_end_request+0x4c/0x510 block/blk-mq.c:654 blk_flush_complete_seq+0x835/0xd80 block/blk-flush.c:204 flush_end_io+0x7b7/0xb90 block/blk-flush.c:261 __blk_mq_end_request+0x282/0x4c0 block/blk-mq.c:645 scsi_end_request+0x3a8/0x850 drivers/scsi/scsi_lib.c:607 scsi_io_completion+0x3f5/0x1320 drivers/scsi/scsi_lib.c:970 scsi_softirq_done+0x11b/0x490 drivers/scsi/scsi_lib.c:1448 blk_mq_complete_request block/blk-mq.c:788 [inline] blk_mq_complete_request+0x84/0xb0 block/blk-mq.c:785 scsi_mq_done+0x155/0x360 drivers/scsi/scsi_lib.c:1603 virtscsi_vq_done drivers/scsi/virtio_scsi.c:184 [inline] virtscsi_req_done+0x14c/0x220 drivers/scsi/virtio_scsi.c:199 vring_interrupt drivers/virtio/virtio_ring.c:2061 [inline] vring_interrupt+0x27a/0x300 drivers/virtio/virtio_ring.c:2047 __handle_irq_event_percpu+0x2f8/0x830 kernel/irq/handle.c:156 handle_irq_event_percpu kernel/irq/handle.c:196 [inline] handle_irq_event+0x105/0x280 kernel/irq/handle.c:213 handle_edge_irq+0x258/0xd20 kernel/irq/chip.c:828 asm_call_irq_on_stack+0xf/0x20 __run_irq_on_irqstack arch/x86/include/asm/irq_stack.h:48 [inline] run_irq_on_irqstack_cond arch/x86/include/asm/irq_stack.h:101 [inline] handle_irq arch/x86/kernel/irq.c:230 [inline] __common_interrupt arch/x86/kernel/irq.c:249 [inline] common_interrupt+0xe2/0x190 arch/x86/kernel/irq.c:239 asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:626 Fixes: 4ca40c2c ("md/raid10: Allow replacement device to be replace old drive.") Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> (cherry picked from commit af959500)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188527, https://gitee.com/openeuler/kernel/issues/I6O3HO CVE: NA -------------------------------- need_replace will be set to 1 if a non-Faulty mreplace exists, and mreplace will be dereferenced later. However, the latter check of mreplace might set mreplace to NULL, and a null-ptr-deref occurs if need_replace is 1 at this time. Fix it by merging the two checks into one. Fixes: ee37d731 ("md/raid10: Fix raid10 replace hang when new added disk faulty") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 7718714e)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188787, https://gitee.com/openeuler/kernel/issues/I78YIW CVE: NA -------------------------------- When we remove a disk which has a replacement, we first set rdev to NULL, then set rdev to the replacement, and finally set replacement to NULL (see raid10_remove_disk()). If io is submitted at the same time, it might read both rdev and replacement as NULL, and the io will not be submitted. rdev -> NULL read rdev replacement -> NULL read replacement Fix it by reading replacement first and rdev later, and use smp_mb() to prevent memory reordering. Fixes: 475b0321 ("md/raid10: writes should get directed to replacement as well as original.") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit e8025850)
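The fix described above might look roughly like the following read-side helper (a sketch, not the exact patch; the struct and field names follow the raid10 code):

    /*
     * Read replacement before rdev, with a full barrier in between, so a
     * concurrent raid10_remove_disk() -- which clears rdev first and
     * replacement last -- can never make both loads observe NULL.
     */
    static struct md_rdev *deref_rdev_and_rrdev(struct raid10_info *mirror,
                                                struct md_rdev **prrdev)
    {
            struct md_rdev *rdev, *rrdev;

            rrdev = READ_ONCE(mirror->replacement);
            smp_mb();                       /* pairs with smp_mb() in the remove path */
            rdev = READ_ONCE(mirror->rdev);
            if (rdev == rrdev)              /* replacement already promoted to rdev */
                    rrdev = NULL;

            *prrdev = rrdev;
            return rdev;
    }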
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188804, https://gitee.com/openeuler/kernel/issues/I78YIS CVE: NA -------------------------------- When adding a new disk to raid10, it traverses conf->mirrors from the start and finds one of the following mirrors: 1. mirror->rdev is set to WantReplacement and it has no replacement; set the new disk as mirror->replacement. 2. no rdev; set the new disk as mirror->rdev. There is an array as below (sda is set to WantReplacement): Number Major Minor RaidDevice State 0 8 0 0 active sync set-A /dev/sda - 0 0 1 removed 2 8 32 2 active sync set-A /dev/sdc 3 8 48 3 active sync set-B /dev/sdd Use 'mdadm --add' to add a new disk to this array; the new disk will become sda's replacement instead of being added to the removed position, which is confusing for users. Meanwhile, after the new disk recovers successfully, sda will be set to Faulty. Prioritizing adding the disk to the 'removed' mirror is a better choice. In the above scenario, the behavior is the same as before, except sda will not be deleted. Before other disks are added, continuing to use sda is more reliable. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 2e2e7ab6)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188628, https://gitee.com/openeuler/kernel/issues/I71EKW CVE: NA -------------------------------- We first set the rdev to WantRemove, and then check if there is any io pending; if so, we clear the flag and return busy in raid10_remove_disk(). io will be lost as below: raid10_remove_disk set WantRemove write rdev if WantRemove do not submit io if rdev->nr_pending clear WantRemove return BUSY read rdev get error data Fix it by calling md_error() on the rdev which has io pending while removing it. When the code reaches this point, it means this rdev will be removed later, so setting it as faulty has little impact. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 894f89fa)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188533, https://gitee.com/openeuler/kernel/issues/I6O7YB CVE: NA -------------------------------- commit ceff49d9 ("md/raid1: fix a race between removing rdev and access conf->mirrors[i].rdev") fixed a null-ptr-deref in raid1. The same bug exists in raid10; fix it in the same way. There is no sync_thread running while removing an rdev, so there is no need to check the flag in raid10_sync_request(). Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 4461a62e)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188380, https://gitee.com/openeuler/kernel/issues/I6GISC CVE: NA -------------------------------- commit fe630de0 ("md/raid10: avoid deadlock on recovery.") allowed normal io and sync io to exist at the same time. A task hang will occur as below: T1 T2 T3 T4 raid10d handle_read_error allow_barrier conf->nr_pending-- -> 0 //submit sync io raid10_sync_request raise_barrier ->will not be blocked ... //submit to drivers raid10_read_request wait_barrier conf->nr_pending++ -> 1 //retry read fail raid10_end_read_request reschedule_retry add to retry_list conf->nr_queued++ -> 1 //sync io fail end_sync_read __end_sync_read reschedule_retry add to retry_list conf->nr_queued++ -> 2 ... handle_read_error freeze_array wait nr_pending == nr_queued+1 ->1 ->3 //task hung Retry read and sync io will be added to retry_list (nr_queued -> 2) if they fail. raid10d() calls handle_read_error() and hangs in freeze_array(). nr_queued will not decrease because raid10d is blocked, and nr_pending will not increase because conf->barrier is not released. Fix it by moving allow_barrier() after raid10_read_request(). raise_barrier() will wait for nr_waiting to become 0; therefore, sync io and regular io will not be issued at the same time. We also removed the check of nr_queued: it can be 0 but doesn't need to be blocked. MD_RECOVERY_RUNNING is always set after this patch, because all sync io is waiting in raise_barrier(); remove it, too. Fixes: fe630de0 ("md/raid10: avoid deadlock on recovery.") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 1fe782f0)
-
Submitted by Yu Kuai
mainline inclusion from mainline-v6.1-rc1 commit ed2e063f category: bugfix bugzilla: 188380, https://gitee.com/openeuler/kernel/issues/I6GISC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=ed2e063f92c44c891ccd883e289dde6ca870edcc -------------------------------- Currently the nasty condition in wait_barrier() is hard to read. This patch factors out the condition into a function. There are no functional changes. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Acked-by: NPaul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: NLogan Gunthorpe <logang@deltatee.com> Acked-by: NGuoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: NSong Liu <song@kernel.org> conflict: drivers/md/raid10.c Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> (cherry picked from commit 7aad54e0)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188628, https://gitee.com/openeuler/kernel/issues/I6WKDR CVE: NA -------------------------------- There is no limit to the number of io for the raid10 plug, which may result in excessive memory usage and a potential softlockup when a large number of io are submitted at once. There is no good way to fix it now; just add a schedule point to prevent the softlockup. Fixes: 57c67df4 ("md/raid10: submit IO from originating thread instead of md thread.") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit f8cecf7a)
-
Submitted by Jiang Li
mainline inclusion from mainline-v6.2-rc1 commit b611ad14 category: bugfix bugzilla: 188662, https://gitee.com/openeuler/kernel/issues/I6UMUF CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=b611ad14006e5be2170d9e8e611bf49dff288911 -------------------------------- fail run raid1 array when we assemble array with the inactive disk only, but the mdx_raid1 thread were not stop, Even if the associated resources have been released. it will caused a NULL dereference when we do poweroff. This causes the following Oops: [ 287.587787] BUG: kernel NULL pointer dereference, address: 0000000000000070 [ 287.594762] #PF: supervisor read access in kernel mode [ 287.599912] #PF: error_code(0x0000) - not-present page [ 287.605061] PGD 0 P4D 0 [ 287.607612] Oops: 0000 [#1] SMP NOPTI [ 287.611287] CPU: 3 PID: 5265 Comm: md0_raid1 Tainted: G U 5.10.146 #0 [ 287.619029] Hardware name: xxxxxxx/To be filled by O.E.M, BIOS 5.19 06/16/2022 [ 287.626775] RIP: 0010:md_check_recovery+0x57/0x500 [md_mod] [ 287.632357] Code: fe 01 00 00 48 83 bb 10 03 00 00 00 74 08 48 89 ...... [ 287.651118] RSP: 0018:ffffc90000433d78 EFLAGS: 00010202 [ 287.656347] RAX: 0000000000000000 RBX: ffff888105986800 RCX: 0000000000000000 [ 287.663491] RDX: ffffc90000433bb0 RSI: 00000000ffffefff RDI: ffff888105986800 [ 287.670634] RBP: ffffc90000433da0 R08: 0000000000000000 R09: c0000000ffffefff [ 287.677771] R10: 0000000000000001 R11: ffffc90000433ba8 R12: ffff888105986800 [ 287.684907] R13: 0000000000000000 R14: fffffffffffffe00 R15: ffff888100b6b500 [ 287.692052] FS: 0000000000000000(0000) GS:ffff888277f80000(0000) knlGS:0000000000000000 [ 287.700149] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 287.705897] CR2: 0000000000000070 CR3: 000000000320a000 CR4: 0000000000350ee0 [ 287.713033] Call Trace: [ 287.715498] raid1d+0x6c/0xbbb [raid1] [ 287.719256] ? __schedule+0x1ff/0x760 [ 287.722930] ? schedule+0x3b/0xb0 [ 287.726260] ? schedule_timeout+0x1ed/0x290 [ 287.730456] ? __switch_to+0x11f/0x400 [ 287.734219] md_thread+0xe9/0x140 [md_mod] [ 287.738328] ? md_thread+0xe9/0x140 [md_mod] [ 287.742601] ? wait_woken+0x80/0x80 [ 287.746097] ? md_register_thread+0xe0/0xe0 [md_mod] [ 287.751064] kthread+0x11a/0x140 [ 287.754300] ? kthread_park+0x90/0x90 [ 287.757974] ret_from_fork+0x1f/0x30 In fact, when raid1 array run fail, we need to do md_unregister_thread() before raid1_free(). Signed-off-by: NJiang Li <jiang.li@ugreen.com> Signed-off-by: NSong Liu <song@kernel.org> Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> (cherry picked from commit 22eeb5d1)
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188553, https://gitee.com/openeuler/kernel/issues/I6TNFX CVE: NA -------------------------------- rdev->del_work has not been queued to md_rdev_misc_wq yet, so flush_workqueue will not flush it if two threads add and remove the same device. sysfs might WARN about a duplicate filename as below. //T1 //T2 mdadm write super add success remove unbind_rdev_from_array md_ioctl flush_workqueue INIT_WORK queue_work md_add_new_disk duplicate filename dev-xxx Check if there is any kobj with the same name, and return busy if true. Fixes: 5792a285 ("md: avoid a deadlock when removing a device from an md array via sysfs") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit 5815341f)
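A possible shape for that check, shown only as a hedged sketch (the helper and its call site are assumptions, not the actual patch):

    /*
     * Sketch: before creating the sysfs entry for a newly added rdev, see
     * whether an entry with the same "dev-xxx" name still exists because its
     * delayed delete has not run yet, and back off instead of hitting the
     * duplicate-filename warning.
     */
    static int check_duplicate_rdev_name(struct mddev *mddev, const char *nm)
    {
            struct kernfs_node *kn;

            kn = kernfs_find_and_get(mddev->kobj.sd, nm);
            if (kn) {
                    kernfs_put(kn);
                    return -EBUSY;      /* old entry still pending deletion */
            }
            return 0;
    }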
-
Submitted by Li Nan
hulk inclusion category: bugfix bugzilla: 188553, https://gitee.com/openeuler/kernel/issues/I6TNFX CVE: NA -------------------------------- If we want to remove a device, we first delete it from the mddev->disks list, then init rdev->del_work to put it (see unbind_rdev_from_array()). flush_rdev_wq() traverses mddev->disks to check if there is any pending rdev->del_work and, if so, flushes it. However, the rdev will not be in the mddev->disks list if rdev->del_work exists, so flush_workqueue() will never be executed. Replace flush_rdev_wq() with flush_workqueue() to ensure del_work has completed when adding devices. Fixes: cc1ffe61 ("md: add new workqueue for delete rdev") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> (cherry picked from commit ff461e2d)
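In effect the fix is a one-liner; a sketch, assuming the md_rdev_misc_wq workqueue named in these commits:

    /*
     * Sketch of the change: rather than walking mddev->disks looking for a
     * pending rdev->del_work (the rdev was already removed from that list
     * when the work was queued, so the walk finds nothing), simply flush the
     * workqueue so any queued deletion has finished before a new device is
     * added.
     */
    static void wait_for_pending_rdev_deletes(void)
    {
            flush_workqueue(md_rdev_misc_wq);   /* replaces the old flush_rdev_wq() walk */
    }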
-
Submitted by David Sloan
mainline inclusion from mainline-v6.0-rc3 commit 5e8daf90 category: bugfix bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=5e8daf906f890560df430d30617c692a794acb73 -------------------------------- A race condition still exists when removing and re-creating md devices in test cases. However, it is only seen on some setups. The race condition was tracked down to a reference still being held to the kobject by the rdev in the md_rdev_misc_wq which will be released in rdev_delayed_delete(). md_alloc() waits for previous deletions by waiting on the md_misc_wq, but the md_rdev_misc_wq may still be holding a reference to a recently removed device. To fix this, also flush the md_rdev_misc_wq in md_alloc(). Signed-off-by: NDavid Sloan <david.sloan@eideticom.com> [logang@deltatee.com: rewrote commit message] Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NSong Liu <song@kernel.org> Conflict: drivers/md/md.c Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> (cherry picked from commit 5fa41917)
-
- 31 May 2023 (8 commits)
-
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA -------------------------------- Struct mddev is only used inside raid code; this is just in case md_mod is compiled from a new kernel while raid1/raid10 or other out-of-tree raid modules are compiled from an old kernel. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA -------------------------------- Before refactoring idle and frozen out of action_store(), interruptible APIs were used so that the hungtask warning won't be triggered if it takes too long to finish the idle/frozen sync_thread. This patch does the same. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA -------------------------------- We just replaced md_reap_sync_thread() with wait_event(resync_wait, ...) in action_store(); this patch just makes sure action_store() will still wait for everything to be done in md_reap_sync_thread(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA -------------------------------- Our test found the following deadlock in raid10: 1) Issue a normal write, and such write failed: raid10_end_write_request set_bit(R10BIO_WriteError, &r10_bio->state) one_write_done reschedule_retry // later from md thread raid10d handle_write_completed list_add(&r10_bio->retry_list, &conf->bio_end_io_list) // later from md thread raid10d if (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags)) list_move(conf->bio_end_io_list.prev, &tmp) r10_bio = list_first_entry(&tmp, struct r10bio, retry_list) raid_end_bio_io(r10_bio) Dependency chain 1: normal io is waiting for updating superblock 2) Trigger a recovery: raid10_sync_request raise_barrier Dependency chain 2: sync thread is waiting for normal io 3) echo idle/frozen to sync_action: action_store mddev_lock md_unregister_thread kthread_stop Dependency chain 3: drop 'reconfig_mutex' is waiting for sync thread 4) md thread can't update superblock: raid10d md_check_recovery if (mddev_trylock(mddev)) md_update_sb Dependency chain 4: update superblock is waiting for 'reconfig_mutex' Hence a cyclic dependency exists; in order to fix the problem, we must break one of them. Dependencies 1 and 2 can't be broken because they are part of the fundamental design. Dependency 4 may be possible if it can be guaranteed that no io can be inflight; however, this requires a new mechanism which seems complex. Dependency 3 is a good choice, because idle/frozen only requires the sync thread to finish, which can be done asynchronously and is already implemented, and 'reconfig_mutex' is not needed anymore. This patch switches 'idle' and 'frozen' to waiting for the sync thread to be done asynchronously, and it also adds a sequence counter to record how many times the sync thread has finished, so that 'idle' won't keep waiting on a newly started sync thread. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
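A rough sketch of the asynchronous 'idle' path described above; the sequence counter (called sync_seq here) is the new field the commit mentions, while resync_wait and the recovery flags are existing mddev members:

    /*
     * Sketch only: snapshot the "sync thread done" counter, ask the sync
     * thread to stop, then wait without holding 'reconfig_mutex' until either
     * that thread has finished (counter advanced) or no sync thread is
     * running anymore -- so 'idle' does not keep waiting on a newly started
     * sync thread.
     */
    static void idle_sync_thread_sketch(struct mddev *mddev)
    {
            int start = atomic_read(&mddev->sync_seq);          /* assumed new field */

            set_bit(MD_RECOVERY_INTR, &mddev->recovery);        /* ask it to stop */
            md_wakeup_thread(mddev->sync_thread);

            wait_event_interruptible(mddev->resync_wait,
                            atomic_read(&mddev->sync_seq) != start ||
                            !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
    }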
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA -------------------------------- Currently, for idle and frozen, action_store() will hold 'reconfig_mutex' and call md_reap_sync_thread() to stop the sync thread; however, this will cause a deadlock (explained in the next patch). In order to fix the problem, the following patch will release 'reconfig_mutex' and wait on 'resync_wait', like md_set_readonly() and do_md_stop() do. Consider that action_store() will set/clear 'MD_RECOVERY_FROZEN' unconditionally, which might cause unexpected problems: for example, 'frozen' just set 'MD_RECOVERY_FROZEN' and is still in progress, while 'idle' clears 'MD_RECOVERY_FROZEN' and a new sync thread is started, which might starve the in-progress 'frozen'. This patch adds a mutex to synchronize idle and frozen from action_store(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA -------------------------------- Prepare to handle 'idle' and 'frozen' differently to fix a deadlock; there are no functional changes except that MD_RECOVERY_RUNNING is checked again after 'reconfig_mutex' is held. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA -------------------------------- This reverts commit 9dfbdafd. Because it will introduce a defect that sync_thread can be running while MD_RECOVERY_RUNNING is cleared, which will cause some unexpected problems, for example: list_add corruption. prev->next should be next (ffff0001ac1daba0), but was ffff0000ce1a02a0. (prev=ffff0000ce1a02a0). Call trace: __list_add_valid+0xfc/0x140 insert_work+0x78/0x1a0 __queue_work+0x500/0xcf4 queue_work_on+0xe8/0x12c md_check_recovery+0xa34/0xf30 raid10d+0xb8/0x900 [raid10] md_thread+0x16c/0x2cc kthread+0x1a4/0x1ec ret_from_fork+0x10/0x18 This is because work is requeued while it's still inside workqueue: t1: t2: action_store mddev_lock if (mddev->sync_thread) mddev_unlock md_unregister_thread // first sync_thread is done md_check_recovery mddev_try_lock /* * once MD_RECOVERY_DONE is set, new sync_thread * can start. */ set_bit(MD_RECOVERY_RUNNING, &mddev->recovery) INIT_WORK(&mddev->del_work, md_start_sync) queue_work(md_misc_wq, &mddev->del_work) test_and_set_bit(WORK_STRUCT_PENDING_BIT, ...) // set pending bit insert_work list_add_tail mddev_unlock mddev_lock_nointr md_reap_sync_thread // MD_RECOVERY_RUNNING is cleared mddev_unlock t3: // before queued work started from t2 md_check_recovery // MD_RECOVERY_RUNNING is not set, a new sync_thread can be started INIT_WORK(&mddev->del_work, md_start_sync) work->data = 0 // work pending bit is cleared queue_work(md_misc_wq, &mddev->del_work) insert_work list_add_tail // list is corrupted This patch revert the commit to fix the problem, the deadlock this commit tries to fix will be fixed in following patches. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NSong Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230322064122.2384589-2-yukuai1@huaweicloud.comReviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Guoqing Jiang
mainline inclusion from mainline-v6.0-rc1 commit 9dfbdafd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OMCC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc3&id=9dfbdafda3b34e262e43e786077bab8e476a89d1 -------------------------------- Since the bug which commit 8b48ec23 ("md: don't unregister sync_thread with reconfig_mutex held") fixed is related with action_store path, other callers which reap sync_thread didn't need to be changed. Let's pull md_unregister_thread from md_reap_sync_thread, then fix previous bug with belows. 1. unlock mddev before md_reap_sync_thread in action_store. 2. save reshape_position before unlock, then restore it to ensure position not changed accidentally by others. Signed-off-by: NGuoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: NSong Liu <song@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 10 May 2023 (1 commit)
-
-
Submitted by Mike Snitzer
mainline inclusion from mainline-v6.4-rc1 commit 3d32aaa7 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6YQZS CVE: CVE-2023-2269 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=3d32aaa7e66d5c1479a3c31d6c2c5d45dd0d3b89 ---------------------------------------- syzkaller found the following problematic rwsem locking (with write lock already held): down_read+0x9d/0x450 kernel/locking/rwsem.c:1509 dm_get_inactive_table+0x2b/0xc0 drivers/md/dm-ioctl.c:773 __dev_status+0x4fd/0x7c0 drivers/md/dm-ioctl.c:844 table_clear+0x197/0x280 drivers/md/dm-ioctl.c:1537 In table_clear, it first acquires a write lock https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L1520 down_write(&_hash_lock); Then before the lock is released at L1539, there is a path shown above: table_clear -> __dev_status -> dm_get_inactive_table -> down_read https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L773 down_read(&_hash_lock); It tries to acquire the same read lock again, resulting in the deadlock problem. Fix this by moving table_clear()'s __dev_status() call to after its up_write(&_hash_lock); Cc: stable@vger.kernel.org Reported-by: NZheng Zhang <zheng.zhang@email.ucr.edu> Signed-off-by: NMike Snitzer <snitzer@kernel.org> Conflicts: drivers/md/dm-ioctl.c Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
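A simplified sketch of table_clear() after the reordering the commit describes (details elided; not the verbatim dm-ioctl.c code):

    /*
     * __dev_status() ends up taking _hash_lock for read (via
     * dm_get_inactive_table()), so it must only run once the write lock
     * taken here has been dropped.
     */
    static int table_clear_sketch(struct file *filp, struct dm_ioctl *param,
                                  size_t param_size)
    {
            struct hash_cell *hc;
            struct mapped_device *md;

            down_write(&_hash_lock);

            hc = __find_device_hash_cell(param);
            if (!hc) {
                    up_write(&_hash_lock);
                    return -ENXIO;
            }

            /* ... destroy hc->new_map and clear DM_INACTIVE_PRESENT_FLAG ... */

            md = hc->md;
            up_write(&_hash_lock);

            /* moved: fill in the status only after _hash_lock is released */
            __dev_status(md, param);
            dm_put(md);

            return 0;
    }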
-
- 13 Apr 2023 (4 commits)
-
-
Submitted by Logan Gunthorpe
stable inclusion from stable-v5.10.150 commit 782b3e71c957991ac8ae53318bc369049d49bb53 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=782b3e71c957991ac8ae53318bc369049d49bb53 -------------------------------- [ Upstream commit 5e2cf333 ] A complicated deadlock exists when using the journal and an elevated group_thrtead_cnt. It was found with loop devices, but its not clear whether it can be seen with real disks. The deadlock can occur simply by writing data with an fio script. When the deadlock occurs, multiple threads will hang in different ways: 1) The group threads will hang in the blk-wbt code with bios waiting to be submitted to the block layer: io_schedule+0x70/0xb0 rq_qos_wait+0x153/0x210 wbt_wait+0x115/0x1b0 io_schedule+0x70/0xb0 rq_qos_wait+0x153/0x210 wbt_wait+0x115/0x1b0 __rq_qos_throttle+0x38/0x60 blk_mq_submit_bio+0x589/0xcd0 wbt_wait+0x115/0x1b0 __rq_qos_throttle+0x38/0x60 blk_mq_submit_bio+0x589/0xcd0 __submit_bio+0xe6/0x100 submit_bio_noacct_nocheck+0x42e/0x470 submit_bio_noacct+0x4c2/0xbb0 ops_run_io+0x46b/0x1a30 handle_stripe+0xcd3/0x36b0 handle_active_stripes.constprop.0+0x6f6/0xa60 raid5_do_work+0x177/0x330 Or: io_schedule+0x70/0xb0 rq_qos_wait+0x153/0x210 wbt_wait+0x115/0x1b0 __rq_qos_throttle+0x38/0x60 blk_mq_submit_bio+0x589/0xcd0 __submit_bio+0xe6/0x100 submit_bio_noacct_nocheck+0x42e/0x470 submit_bio_noacct+0x4c2/0xbb0 flush_deferred_bios+0x136/0x170 raid5_do_work+0x262/0x330 2) The r5l_reclaim thread will hang in the same way, submitting a bio to the block layer: io_schedule+0x70/0xb0 rq_qos_wait+0x153/0x210 wbt_wait+0x115/0x1b0 __rq_qos_throttle+0x38/0x60 blk_mq_submit_bio+0x589/0xcd0 __submit_bio+0xe6/0x100 submit_bio_noacct_nocheck+0x42e/0x470 submit_bio_noacct+0x4c2/0xbb0 submit_bio+0x3f/0xf0 md_super_write+0x12f/0x1b0 md_update_sb.part.0+0x7c6/0xff0 md_update_sb+0x30/0x60 r5l_do_reclaim+0x4f9/0x5e0 r5l_reclaim_thread+0x69/0x30b However, before hanging, the MD_SB_CHANGE_PENDING flag will be set for sb_flags in r5l_write_super_and_discard_space(). This flag will never be cleared because the submit_bio() call never returns. 3) Due to the MD_SB_CHANGE_PENDING flag being set, handle_stripe() will do no processing on any pending stripes and re-set STRIPE_HANDLE. This will cause the raid5d thread to enter an infinite loop, constantly trying to handle the same stripes stuck in the queue. The raid5d thread has a blk_plug that holds a number of bios that are also stuck waiting seeing the thread is in a loop that never schedules. These bios have been accounted for by blk-wbt thus preventing the other threads above from continuing when they try to submit bios. --Deadlock. To fix this, add the same wait_event() that is used in raid5_do_work() to raid5d() such that if MD_SB_CHANGE_PENDING is set, the thread will schedule and wait until the flag is cleared. The schedule action will flush the plug which will allow the r5l_reclaim thread to continue, thus preventing the deadlock. However, md_check_recovery() calls can also clear MD_SB_CHANGE_PENDING from the same thread and can thus deadlock if the thread is put to sleep. So avoid waiting if md_check_recovery() is being called in the loop. 
It's not clear when the deadlock was introduced, but the similar wait_event() call in raid5_do_work() was added in 2017 by this commit: 16d997b7 ("md/raid5: simplfy delaying of writes while metadata is updated.") Link: https://lore.kernel.org/r/7f3b87b6-b52a-f737-51d7-a4eec5c44112@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Coly Li
stable inclusion from stable-v5.10.150 commit c263516c2c20df9c29f33baeb4a817af7212fb69 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c263516c2c20df9c29f33baeb4a817af7212fb69 -------------------------------- [ Upstream commit d2d05b88 ] Inside set_at_max_writeback_rate() the calculation in the following if() check is wrong: if (atomic_inc_return(&c->idle_counter) < atomic_read(&c->attached_dev_nr) * 6) Because each attached backing device has its own writeback thread running and increasing c->idle_counter, the counter increases much faster than expected. The correct calculation should be, (counter / dev_nr) < dev_nr * 6 which equals to, counter < dev_nr * dev_nr * 6 This patch fixes the above mistake with the correct calculation, and the helper routine idle_counter_exceeded() is added to make the code clearer. Reported-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Signed-off-by: Coly Li <colyli@suse.de> Acked-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Link: https://lore.kernel.org/r/20220919161647.81238-6-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
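The corrected check, written out as a sketch of the idle_counter_exceeded() helper the commit mentions:

    /*
     * Every attached backing device bumps c->idle_counter from its own
     * writeback thread, so compare the counter against dev_nr * dev_nr * 6
     * (equivalently, counter / dev_nr < dev_nr * 6) rather than dev_nr * 6.
     */
    static bool idle_counter_exceeded(struct cache_set *c)
    {
            int dev_nr = atomic_read(&c->attached_dev_nr);
            int counter;

            if (dev_nr == 0)
                    return false;

            counter = atomic_inc_return(&c->idle_counter);
            return counter > dev_nr * dev_nr * 6;
    }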
-
Submitted by Logan Gunthorpe
stable inclusion from stable-v5.10.150 commit a1263294b55c948842a2c058a47fb330223b0f6e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a1263294b55c948842a2c058a47fb330223b0f6e -------------------------------- [ Upstream commit e2eed85b ] When doing degrade/recover tests using the journal a kernel BUG is hit at drivers/md/raid5.c:4381 in handle_parity_checks5(): BUG_ON(!test_bit(R5_UPTODATE, &dev->flags)); This was found to occur because handle_stripe_fill() was skipped for stripes in the journal due to a condition in that function. Thus blocks were not fetched and R5_UPTODATE was not set when the code reached handle_parity_checks5(). To fix this, don't skip handle_stripe_fill() unless the stripe is for read. Fixes: 07e83364 ("md/r5cache: shift complex rmw from read path to write path") Link: https://lore.kernel.org/linux-raid/e05c4239-41a9-d2f7-3cfa-4aa9d2cea8c1@deltatee.com/Suggested-by: NSong Liu <song@kernel.org> Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NSong Liu <song@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Saurabh Sengar
stable inclusion from stable-v5.10.150 commit 76694e9ce0b2238c0a5f3ba54f9361dd3770ec78 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=76694e9ce0b2238c0a5f3ba54f9361dd3770ec78 -------------------------------- [ Upstream commit 1727fd50 ] Current code produces a warning as shown below when total characters in the constituent block device names plus the slashes exceeds 200. snprintf() returns the number of characters generated from the given input, which could cause the expression “200 – len” to wrap around to a large positive number. Fix this by using scnprintf() instead, which returns the actual number of characters written into the buffer. [ 1513.267938] ------------[ cut here ]------------ [ 1513.267943] WARNING: CPU: 15 PID: 37247 at <snip>/lib/vsprintf.c:2509 vsnprintf+0x2c8/0x510 [ 1513.267944] Modules linked in: <snip> [ 1513.267969] CPU: 15 PID: 37247 Comm: mdadm Not tainted 5.4.0-1085-azure #90~18.04.1-Ubuntu [ 1513.267969] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/09/2022 [ 1513.267971] RIP: 0010:vsnprintf+0x2c8/0x510 <-snip-> [ 1513.267982] Call Trace: [ 1513.267986] snprintf+0x45/0x70 [ 1513.267990] ? disk_name+0x71/0xa0 [ 1513.267993] dump_zones+0x114/0x240 [raid0] [ 1513.267996] ? _cond_resched+0x19/0x40 [ 1513.267998] raid0_run+0x19e/0x270 [raid0] [ 1513.268000] md_run+0x5e0/0xc50 [ 1513.268003] ? security_capable+0x3f/0x60 [ 1513.268005] do_md_run+0x19/0x110 [ 1513.268006] md_ioctl+0x195e/0x1f90 [ 1513.268007] blkdev_ioctl+0x91f/0x9f0 [ 1513.268010] block_ioctl+0x3d/0x50 [ 1513.268012] do_vfs_ioctl+0xa9/0x640 [ 1513.268014] ? __fput+0x162/0x260 [ 1513.268016] ksys_ioctl+0x75/0x80 [ 1513.268017] __x64_sys_ioctl+0x1a/0x20 [ 1513.268019] do_syscall_64+0x5e/0x200 [ 1513.268021] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 76603884 ("md/raid0: replace printk() with pr_*()") Reviewed-by: NMichael Kelley <mikelley@microsoft.com> Acked-by: NGuoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: NSaurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: NSong Liu <song@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
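The difference matters because snprintf() returns the length that would have been written, so 'len' can exceed the buffer size and '200 - len' wraps around; a small illustrative sketch (the function below is hypothetical, not the raid0 code):

    /*
     * Build a "sda/sdb/..." style line into a fixed buffer. scnprintf()
     * returns the number of bytes actually written, so 'len' can never
     * exceed the buffer and 'size - len' cannot wrap around.
     */
    static void append_names(char *line, size_t size, const char *names[], int nr)
    {
            size_t len = 0;
            int i;

            for (i = 0; i < nr && len < size; i++)
                    len += scnprintf(line + len, size - len, "%s/", names[i]);
    }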
-
- 12 Apr 2023 (1 commit)
-
-
Submitted by Mikulas Patocka
mainline inclusion from mainline-v6.3-rc4 commit fb294b1c category: bugfix bugzilla: 188393, https://gitee.com/openeuler/kernel/issues/I6JPSH Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fb294b1c0ba982144ca467a75e7d01ff26304e2b ---------------------------------------- The loop in dmcrypt_write may be running for unbounded amount of time, thus we need cond_resched() in it. This commit fixes the following warning: [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [dmcrypt_write/2:2897] ... [ 3391.387210][ C12] Call trace: [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] [ 3391.416005][ C12] kthread+0x130/0x138 [ 3391.419911][ C12] ret_from_fork+0x10/0x18 Reported-by: Nyangerkun <yangerkun@huawei.com> Fixes: dc267621 ("dm crypt: offload writes to thread") Cc: stable@vger.kernel.org Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org> Signed-off-by: Nyangerkun <yangerkun@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
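A sketch of the kind of loop being fixed: a writer thread draining queued bios, with the added cond_resched(). This is illustrative only, not the dm-crypt source:

    /*
     * Drain a list of queued write bios. The cond_resched() gives the
     * scheduler a chance between submissions, so a large backlog cannot keep
     * the CPU busy long enough to trigger the soft-lockup watchdog.
     */
    static void drain_write_queue(struct bio_list *queue)
    {
            struct bio *bio;

            while ((bio = bio_list_pop(queue)) != NULL) {
                    submit_bio_noacct(bio);
                    cond_resched();     /* the fix: yield between bios */
            }
    }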
-
- 29 Mar 2023 (1 commit)
-
-
Submitted by Zhong Jinghua
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OSXU ---------------------------------------- commit "md/raid6: refactor raid5_read_one_chunk" incorrectly merged the code. Repeatedly allocating memory leads to memory leaks. Fix it by removing the redundant memory allocation code. Fixes: c13c2cd2 ("md/raid6: refactor raid5_read_one_chunk") Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
- 15 Mar 2023 (2 commits)
-
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6L586 CVE: NA -------------------------------- 'ios' and 'sectors' are counted in bio_start_io_acct() when io is started instead of when io is done. Hence switch to precise io accounting to count them when io is done. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6L4UU CVE: NA -------------------------------- In the error path of raid10_run(), 'conf' needs to be freed; however, 'conf->bio_split' is missed and its memory is leaked. Since there are 3 places that free 'conf', factor out a helper to fix the problem. Fixes: fc9977dd ("md/raid10: simplify the splitting of requests.") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
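The factored-out cleanup might look roughly like this (field names follow struct r10conf; treat it as a sketch rather than the exact patch):

    /*
     * Single cleanup helper so every error/exit path frees the same things,
     * including the bio_split bioset that was previously being leaked.
     */
    static void raid10_free_conf(struct r10conf *conf)
    {
            if (!conf)
                    return;

            mempool_exit(&conf->r10bio_pool);
            kfree(conf->mirrors);
            kfree(conf->mirrors_old);
            kfree(conf->mirrors_new);
            safe_put_page(conf->tmppage);
            bioset_exit(&conf->bio_split);      /* previously missed */
            kfree(conf);
    }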
-
- 08 Mar 2023 (4 commits)
-
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6JZ3F CVE: NA -------------------------------- raid10_sync_request() will increase 'r10bio->remaining' for both the rdev and the replacement rdev. However, if the read io failed, recovery_request_write() will return without issuing the write io; in this case, end_sync_request() is only called once and 'remaining' is leaked, which will cause an io hang. Fix the problem by decreasing 'remaining' according to whether 'bio' and 'repl_bio' are valid. Fixes: 24afd80d ("md/raid10: handle recovery of replacement devices.") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Yu Kuai
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6JN1I CVE: NA -------------------------------- status_resync() will calculate 'curr_resync - recovery_active' to show the user a progress bar like the following: [============>........] resync = 61.4% 'curr_resync' and 'recovery_active' are updated in md_do_sync(), and status_resync() can read them concurrently; hence it's possible that 'curr_resync - recovery_active' overflows to a huge number. In this case status_resync() will be stuck in the loop printing a large amount of '=', which ends up as a soft lockup. Fix the problem by setting 'resync' to MD_RESYNC_ACTIVE in this case; this way, a resync in progress will be reported to the user. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
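The guard amounts to something like the following sketch (MD_RESYNC_ACTIVE and the field names come from the commit text):

    /*
     * 'curr_resync' and 'recovery_active' are updated by md_do_sync() and
     * read here without locking, so the subtraction can momentarily wrap to
     * a huge number. Clamp it to MD_RESYNC_ACTIVE so the progress-bar loop
     * cannot spin printing '='.
     */
    u64 resync = mddev->curr_resync;

    resync -= atomic_read(&mddev->recovery_active);
    if (resync < MD_RESYNC_ACTIVE)
            resync = MD_RESYNC_ACTIVE;  /* still report "resync in progress" */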
-
Submitted by Hou Tao
mainline inclusion from mainline-v6.3-rc1 commit 1d1f25bf category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6JN1I CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1d1f25bfda432a6b61bd0205d426226bbbd73504 -------------------------------- Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise md may skip the resync of the first 3 sectors if the resync procedure is interrupted before the first calling of ->sync_request() as shown below: md_do_sync thread control thread // setup resync mddev->recovery_cp = 0 j = 0 mddev->curr_resync = MD_RESYNC_ACTIVE // e.g., set array as idle set_bit(MD_RECOVERY_INTR, &&mddev_recovery) // resync loop // check INTR before calling sync_request !test_bit(MD_RECOVERY_INTR, &mddev->recovery // resync interrupted // update recovery_cp from 0 to 3 // the resync of three 3 sectors will be skipped mddev->recovery_cp = 3 Fixes: eac58d08 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync") Cc: stable@vger.kernel.org # 6.0+ Signed-off-by: NHou Tao <houtao1@huawei.com> Reviewed-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NSong Liu <song@kernel.org> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
Submitted by Logan Gunthorpe
mainline inclusion from mainline-v6.0-rc1 commit b368856a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6JN1I CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b368856aab02c8fcaabb809aad401b2cf96504f2 -------------------------------- The 07layouts test in mdadm fails on some systems. The failure presents itself as the backup file not being removed before the next layout is grown into: mdadm: /dev/md0: cannot create backup file /tmp/md-test-backup: File exists This is because the background mdadm process, which is responsible for cleaning up this backup file gets into an infinite loop waiting for the reshape to start. mdadm checks the mdstat file if a reshape is going and, if it is not, it waits for an event on the file or times out in 5 seconds. On faster machines, the reshape may complete before the 5 seconds times out, and thus the background mdadm process loops waiting for a reshape to start that has already occurred. mdadm reads the mdstat file to start, but mdstat does not report that the reshape has begun, even though it has indeed begun. So the mdstat_wait() call (in mdadm) which polls on the mdstat file won't ever return until timing out. The reason mdstat reports the reshape has started is due to an issue in status_resync(). recovery_active is subtracted from curr_resync which will result in a value of zero for the first chunk of reshaped data, and the resulting read will report no reshape in progress. To fix this, if "resync - recovery_active" is an overloaded value, force the value to be MD_RESYNC_ACTIVE so the code reports a resync in progress. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <song@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-