- 03 10月, 2020 5 次提交
-
-
由 Coly Li 提交于
Currently although the bcache code has a framework for multiple caches in a cache set, but indeed the multiple caches never completed and users use md raid1 for multiple copies of the cached data. This patch does the following change in struct cache_set, to explicitly make a cache_set only have single cache, - Change pointer array "*cache[MAX_CACHES_PER_SET]" to a single pointer "*cache". - Remove pointer array "*cache_by_alloc[MAX_CACHES_PER_SET]". - Remove "caches_loaded". Now the code looks as exactly what it does in practic: only one cache is used in the cache set. Signed-off-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Coly Li 提交于
The parameter 'int n' from bch_bucket_alloc_set() is not cleared defined. From the code comments n is the number of buckets to alloc, but from the code itself 'n' is the maximum cache to iterate. Indeed all the locations where bch_bucket_alloc_set() is called, 'n' is alwasy 1. This patch removes the confused and unnecessary 'int n' from parameter list of bch_bucket_alloc_set(), and explicitly allocates only 1 bucket for its caller. Signed-off-by: NColy Li <colyli@suse.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Qinglang Miao 提交于
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. As inode->iprivate equals to third parameter of debugfs_create_file() which is NULL. So it's equivalent to original code logic. Signed-off-by: NQinglang Miao <miaoqinglang@huawei.com> Signed-off-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Dongsheng Yang 提交于
In mca_reserve(c) macro, we are checking root whether is NULL or not. But that's not enough, when we read the root node in run_cache_set(), if we got an error in bch_btree_node_read_done(), we will return ERR_PTR(-EIO) to c->root. And then we will go continue to unregister, but before calling unregister_shrinker(&c->shrink), there is a possibility to call bch_mca_count(), and we would get a crash with call trace like that: [ 2149.876008] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b5 ... ... [ 2150.598931] Call trace: [ 2150.606439] bch_mca_count+0x58/0x98 [escache] [ 2150.615866] do_shrink_slab+0x54/0x310 [ 2150.624429] shrink_slab+0x248/0x2d0 [ 2150.632633] drop_slab_node+0x54/0x88 [ 2150.640746] drop_slab+0x50/0x88 [ 2150.648228] drop_caches_sysctl_handler+0xf0/0x118 [ 2150.657219] proc_sys_call_handler.isra.18+0xb8/0x110 [ 2150.666342] proc_sys_write+0x40/0x50 [ 2150.673889] __vfs_write+0x48/0x90 [ 2150.681095] vfs_write+0xac/0x1b8 [ 2150.688145] ksys_write+0x6c/0xd0 [ 2150.695127] __arm64_sys_write+0x24/0x30 [ 2150.702749] el0_svc_handler+0xa0/0x128 [ 2150.710296] el0_svc+0x8/0xc Signed-off-by: NDongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Coly Li 提交于
Previously the experimental async registration uses a separate sysfs file register_async. Now the async registration code seems working well for a while, we can do furtuher testing with it now. This patch changes the async bcache registration shares the same sysfs file /sys/fs/bcache/register (and register_quiet). Async registration will be default behavior if BCACHE_ASYNC_REGISTRATION is set in kernel configure. By default, BCACHE_ASYNC_REGISTRATION is not configured yet. Signed-off-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 9月, 2020 19 次提交
-
-
由 Xiao Ni 提交于
For far layout, the discard region is not continuous on disks. So it needs far copies r10bio to cover all regions. It needs a way to know all r10bios have finish or not. Similar with raid10_sync_request, only the first r10bio master_bio records the discard bio. Other r10bios master_bio record the first r10bio. The first r10bio can finish after other r10bios finish and then return the discard bio. Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
Now the discard request is split by chunk size. So it takes a long time to finish mkfs on disks which support discard function. This patch improve handling raid10 discard request. It uses the similar way with patch 29efc390 (md/md0: optimize raid0 discard handling). But it's a little complex than raid0. Because raid10 has different layout. If raid10 is offset layout and the discard request is smaller than stripe size. There are some holes when we submit discard bio to underlayer disks. For example: five disks (disk1 - disk5) D01 D02 D03 D04 D05 D05 D01 D02 D03 D04 D06 D07 D08 D09 D10 D10 D06 D07 D08 D09 The discard bio just wants to discard from D03 to D10. For disk3, there is a hole between D03 and D08. For disk4, there is a hole between D04 and D09. D03 is a chunk, raid10_write_request can handle one chunk perfectly. So the part that is not aligned with stripe size is still handled by raid10_write_request. If reshape is running when discard bio comes and the discard bio spans the reshape position, raid10_write_request is responsible to handle this discard bio. I did a test with this patch set. Without patch: time mkfs.xfs /dev/md0 real4m39.775s user0m0.000s sys0m0.298s With patch: time mkfs.xfs /dev/md0 real0m0.105s user0m0.000s sys0m0.007s nvme3n1 259:1 0 477G 0 disk └─nvme3n1p1 259:10 0 50G 0 part nvme4n1 259:2 0 477G 0 disk └─nvme4n1p1 259:11 0 50G 0 part nvme5n1 259:6 0 477G 0 disk └─nvme5n1p1 259:12 0 50G 0 part nvme2n1 259:9 0 477G 0 disk └─nvme2n1p1 259:15 0 50G 0 part nvme0n1 259:13 0 477G 0 disk └─nvme0n1p1 259:14 0 50G 0 part Reviewed-by: NColy Li <colyli@suse.de> Reviewed-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
The following patch will reuse these logics, so pull the same codes into one function. Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
Now it allocs r10bio->devs[conf->copies]. Discard bio needs to submit to all member disks and it needs to use r10bio. So extend to r10bio->devs[geo.raid_disks]. Reviewed-by: NColy Li <colyli@suse.de> Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
Move these logic from raid0.c to md.c, so that we can also use it in raid10.c. Reviewed-by: NColy Li <colyli@suse.de> Reviewed-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Zhen Lei 提交于
#define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9) "RESYNC_BLOCK_SIZE/512" is equal to "RESYNC_BLOCK_SIZE >> 9", replace it with RESYNC_SECTORS. Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
When try to resize stripe_size, we also need to free old shared page array and allocate new. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
When reshape array, we try to reuse shared pages of old stripe_head, and allocate more for the new one if needed. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
In current implementation, grow_buffers() uses alloc_page() to allocate the buffers for each stripe_head, i.e. allocate a page for each dev[i] in stripe_head. After setting stripe_size as a configurable value by writing sysfs entry, it means that we always allocate 64K buffers, but just use 4K of them when stripe_size is 4K in 64KB arm64. To avoid wasting memory, we try to let multiple sh->dev share one real page. That means, multiple sh->dev[i].page will point to the only page with different offset. Example of 64K PAGE_SIZE and 4K stripe_size as following: 64K PAGE_SIZE +---+---+---+---+------------------------------+ | | | | | | | | | | +-+-+-+-+-+-+-+-+------------------------------+ ^ ^ ^ ^ | | | +----------------------------+ | | | | | | +-------------------+ | | | | | | +----------+ | | | | | | +-+ | | | | | | | +-----+-----+------+-----+------+-----+------+------+ sh | offset(0) | offset(4K) | offset(8K) | offset(12K) | + +-----------+------------+------------+-------------+ +----> dev[0].page dev[1].page dev[2].page dev[3].page A new 'pages' array will be added into stripe_head to record shared page used by this stripe_head. Allocate them when grow_buffers() and free them when shrink_buffers(). After trying to share page, the users of sh->dev[i].page need to take care of the related page offset: page of issued bio and page passed to xor compution functions. But thanks for previous different page offset supported. Here, we just need to set correct dev[i].offset. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
For now, asynchronous raid6 recovery calculate functions are require common offset for pages. But, we expect them to support different page offset after introducing stripe shared page. Do that by simplily adding page offset where each page address are referred. Then, replace the old interface with the new ones in raid6 and raid6test. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
For now, syndrome compute functions require common offset in the pages array. However, we expect them to support different offset when try to use shared page in the following. Simplily covert them by adding page offset where each page address are referred. Since the only caller of async_gen_syndrome() and async_syndrome_val() are in raid6, we don't want to reserve the old interface but modify the interface directly. After that, replacing old interfaces with new ones for raid6 and raid6test. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
We try to replace async_xor() and async_xor_val() with the new introduced interface async_xor_offs() and async_xor_val_offs() for raid456. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
ops_run_biofill() and ops_run_biodrain() will call async_copy_data() to copy sh->dev[i].page from or to bio page. For now, it implies the offset of dev[i].page is 0. But we want to support different page offset in the following. Thus, pass page offset to these functions and replace 'page_offset' with 'page_offset + poff'. No functional change. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
Add a new member of offset into struct r5dev. It indicates the offset of related dev[i].page. For now, since each device have a privated page, the value is always 0. Thus, we set offset as 0 when allcate page in grow_buffers() and resize_stripes(). To support following different page offset, we try to use the page offset rather than '0' directly for async_memcpy() and ops_run_io(). We try to support different page offset for xor compution functions in the following. To avoid repeatly allocate a new array each time, we add a memory region into scribble buffer to record offset. No functional change. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xianting Tian 提交于
We alreday has the interface i_blocksize(), which can be used to get blocksize, so use it. Only calculate blocksize once and use it within read_page(). Signed-off-by: NXianting Tian <tian.xianting@h3c.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Christoph Hellwig 提交于
The BDI_CAP_STABLE_WRITES is one of the few bits of information in the backing_dev_info shared between the block drivers and the writeback code. To help untangling the dependency replace it with a queue flag and a superblock flag derived from it. This also helps with the case of e.g. a file system requiring stable writes due to its own checksumming, but not forcing it on other users of the block device like the swap code. One downside is that we an't support the stable_pages_required bdi attribute in sysfs anymore. It is replaced with a queue attribute which also is writable for easier testing. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Drivers shouldn't really mess with the readahead size, as that is a VM concept. Instead set it based on the optimal I/O size by lifting the algorithm from the md driver when registering the disk. Also set bdi->io_pages there as well by applying the same scheme based on max_sectors. To ensure the limits work well for stacking drivers a new helper is added to update the readahead limits from the block limits, which is also called from disk_stack_limits. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The raid5 and raid10 drivers currently update the read-ahead size, but not the optimal I/O size on reshape. To prepare for deriving the read-ahead size from the optimal I/O size make sure it is updated as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NSong Liu <song@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Inherit the optimal I/O size setting just like the readahead window, as any reason to do larger I/O does not apply to just readahead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 9月, 2020 2 次提交
-
-
由 Jan Kara 提交于
DM was calling generic_fsdax_supported() to determine whether a device referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that they don't have to duplicate common generic checks. High level code should call dax_supported() helper which that calls into appropriate helper for the particular device. This problem manifested itself as kernel messages: dm-3: error: dax access failed (-95) when lvm2-testsuite run in cases where a DM device was stacked on top of another DM device. Fixes: 7bf7eac8 ("dax: Arrange for dax_supported check to span multiple devices") Cc: <stable@vger.kernel.org> Tested-by: NAdrian Huang <ahuang12@lenovo.com> Signed-off-by: NJan Kara <jack@suse.cz> Acked-by: NMike Snitzer <snitzer@redhat.com> Reported-by: Nkernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/160061715195.13131.5503173247632041975.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
A recent fix to the dm_dax_supported() flow uncovered a latent bug. When dm_get_live_table() fails it is still required to drop the srcu_read_lock(). Without this change the lvm2 test-suite triggers this warning: # lvm2-testsuite --only pvmove-abort-all.sh WARNING: lock held when returning to user space! 5.9.0-rc5+ #251 Tainted: G OE ------------------------------------------------ lvm/1318 is leaving the kernel with locks still held! 1 lock held by lvm/1318: #0: ffff9372abb5a340 (&md->io_barrier){....}-{0:0}, at: dm_get_live_table+0x5/0xb0 [dm_mod] ...and later on this hang signature: INFO: task lvm:1344 blocked for more than 122 seconds. Tainted: G OE 5.9.0-rc5+ #251 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:lvm state:D stack: 0 pid: 1344 ppid: 1 flags:0x00004000 Call Trace: __schedule+0x45f/0xa80 ? finish_task_switch+0x249/0x2c0 ? wait_for_completion+0x86/0x110 schedule+0x5f/0xd0 schedule_timeout+0x212/0x2a0 ? __schedule+0x467/0xa80 ? wait_for_completion+0x86/0x110 wait_for_completion+0xb0/0x110 __synchronize_srcu+0xd1/0x160 ? __bpf_trace_rcu_utilization+0x10/0x10 __dm_suspend+0x6d/0x210 [dm_mod] dm_suspend+0xf6/0x140 [dm_mod] Fixes: 7bf7eac8 ("dax: Arrange for dax_supported check to span multiple devices") Cc: <stable@vger.kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Reported-by: NAdrian Huang <ahuang12@lenovo.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Tested-by: NAdrian Huang <ahuang12@lenovo.com> Link: https://lore.kernel.org/r/160045867590.25663.7548541079217827340.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 12 9月, 2020 2 次提交
-
-
由 Song Liu 提交于
This enables proper statistics in /proc/diskstats for bcache partitions. Signed-off-by: NSong Liu <songliubraving@fb.com> Reviewed-by: NColy Li <colyli@suse.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Song Liu 提交于
This enables proper statistics in /proc/diskstats for md partitions. Signed-off-by: NSong Liu <songliubraving@fb.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 10 9月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
The md driver does not have a ->revalidate_disk method, so it can just use bdev_check_media_change without any additional changes. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Acked-by: NSong Liu <song@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 9月, 2020 3 次提交
-
-
由 Ye Bin 提交于
The following error ocurred when testing disk online/offline: [ 301.798344] device-mapper: thin: 253:5: aborting current metadata transaction [ 301.848441] device-mapper: thin: 253:5: failed to abort metadata transaction [ 301.849206] Aborting journal on device dm-26-8. [ 301.850489] EXT4-fs error (device dm-26) in __ext4_new_inode:943: Journal has aborted [ 301.851095] EXT4-fs (dm-26): Delayed block allocation failed for inode 398742 at logical offset 181 with max blocks 19 with error 30 [ 301.854476] BUG: KASAN: use-after-free in dm_bm_set_read_only+0x3a/0x40 [dm_persistent_data] Reason is: metadata_operation_failed abort_transaction dm_pool_abort_metadata __create_persistent_data_objects r = __open_or_format_metadata if (r) --> If failed will free pmd->bm but pmd->bm not set NULL dm_block_manager_destroy(pmd->bm); set_pool_mode dm_pool_metadata_read_only(pool->pmd); dm_bm_set_read_only(pmd->bm); --> use-after-free Add checks to see if pmd->bm is NULL in dm_bm_set_read_only and dm_bm_set_read_write functions. If bm is NULL it means creating the bm failed and so dm_bm_is_read_only must return true. Signed-off-by: NYe Bin <yebin10@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Ye Bin 提交于
Maybe __create_persistent_data_objects() caller will use PTR_ERR as a pointer, it will lead to some strange things. Signed-off-by: NYe Bin <yebin10@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Ye Bin 提交于
Maybe __create_persistent_data_objects() caller will use PTR_ERR as a pointer, it will lead to some strange things. Signed-off-by: NYe Bin <yebin10@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 02 9月, 2020 7 次提交
-
-
由 Christoph Hellwig 提交于
Remove the now unused helper. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: NSong Liu <song@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
revalidate_disk is a relative awkward helper for driver use, as it first calls an optional driver method and then updates the block device size, while most callers either don't need the method call at all, or want to keep state between the caller and the called method. Add a revalidate_disk_size helper that just performs the update of the block device size from the gendisk one, and switch all drivers that do not implement ->revalidate_disk to use the new helper instead of revalidate_disk() Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: NSong Liu <song@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Two different callers use two different mutexes for updating the block device size, which obviously doesn't help to actually protect against concurrent updates from the different callers. In addition one of the locks, bd_mutex is rather prone to deadlocks with other parts of the block stack that use it for high level synchronization. Switch to using a new spinlock protecting just the size updates, as that is all we need, and make sure everyone does the update through the proper helper. This fixes a bug reported with the nvme revalidating disks during a hot removal operation, which can currently deadlock on bd_mutex. Reported-by: NXianting Tian <xianting_tian@126.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Mikulas Patocka 提交于
The dm-integrity target did not report errors in bitmap mode just after creation. The reason is that the function integrity_recalc didn't clean up ic->recalc_bitmap as it proceeded with recalculation. Fix this by updating the bitmap accordingly -- the double shift serves to rounddown. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Fixes: 468dfca3 ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Damien Le Moal 提交于
Use the DECLARE_CRYPTO_WAIT() macro to properly initialize the crypto wait structures declared on stack before their use with crypto_wait_req(). Fixes: 39d13a1a ("dm crypt: reuse eboiv skcipher for IV generation") Fixes: bbb16584 ("dm crypt: Implement Elephant diffuser for Bitlocker compatibility") Cc: stable@vger.kernel.org Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Commit 935fcc56 ("dm mpath: only flush workqueue when needed") changed flush_multipath_work() to avoid needless workqueue flushing (of a multipath global workqueue). But that change didn't realize the surrounding flush_multipath_work() code should also only run if 'pg_init_in_progress' is set. Fix this by only doing all of flush_multipath_work()'s PG init related work if 'pg_init_in_progress' is set. Otherwise multipath_wait_for_pg_init_completion() will run unconditionally but the preceeding flush_workqueue(kmpath_handlerd) may not. This could lead to deadlock (though only if kmpath_handlerd never runs a corresponding work to decrement 'pg_init_in_progress'). It could also be, though highly unlikely, that the kmpath_handlerd work that does PG init completes before 'pg_init_in_progress' is set, and then an intervening DM table reload's multipath_postsuspend() triggers flush_multipath_work(). Fixes: 935fcc56 ("dm mpath: only flush workqueue when needed") Cc: stable@vger.kernel.org Reported-by: NBen Marzinski <bmarzins@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
The function dax_direct_access doesn't take partitions into account, it always maps pages from the beginning of the device. Therefore, persistent_memory_claim() must get the partition offset using get_start_sect() and add it to the page offsets passed to dax_direct_access(). Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Fixes: 48debafe ("dm: add writecache target") Cc: stable@vger.kernel.org # 4.18+ Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 28 8月, 2020 1 次提交
-
-
由 Yufen Yu 提交于
Commit 3b5408b9 ("md/raid5: support config stripe_size by sysfs entry") make stripe_size as a configurable value. It just requires stripe_size as multiple of 4KB. In fact, we should make sure stripe_size as power of two. Otherwise, stripe_shift which is the result of ilog2 can not represent the real stripe_size. Then, stripe_hash() and stripe_hash_locks_hash() may get unexpected value. Fixes: 3b5408b9 ("md/raid5: support config stripe_size by sysfs entry") Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-