- 06 6月, 2020 5 次提交
-
-
由 Hannes Reinecke 提交于
Checking the tertiary superblock just consists of validating UUIDs, crcs, and the generation number; it doesn't have contents which would be required during the actual operation. So allocate a temporary superblock when checking tertiary devices to avoid having to store it together with the 'real' superblocks. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
The zones array is getting really large, and large arrays tend to wreak havoc with the CPU caches. So convert it to xarray to become more cache friendly. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> # fix leak in dmz_insert Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Instead of counting the number of reserved zones in dmz_free_zone(), mark the zone as 'reserved' during allocation and simplify dmz_free_zone(). Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
The secondary superblock must reside on the same device as the primary superblock, so there is no need to re-calculate the device. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 23 5月, 2020 1 次提交
-
-
由 Hannes Reinecke 提交于
Remove a leftover hunk to switch from random zones to sequential zones when selecting a reclaim zone; the logic has moved into the caller and this hunk is now pointless. Fixes: 34f5affd ("dm zoned: separate random and cache zones") Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 21 5月, 2020 8 次提交
-
-
由 Hannes Reinecke 提交于
When dmz_get_chunk_mapping() selects a zone which is under reclaim we should terminate the reclaim copy process. Since we're changing the zone itself, reclaim needs to run afterwards again anyway. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
When the system is idle we should be starting reclaiming random zones, too. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Instead of lumping emulated zones together with random zones we should be handling them as separate 'cache' zones. This improves code readability and allows an easier implementation of different cache policies. Also add additional allocation flags, to separate the type (cache, random, or sequential) from the purpose (eg reclaim). Also switch the allocation policy to not use random zones as buffer zones if cache zones are present. This avoids a performance drop when all cache zones are used. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
The only case where dmz_get_zone_for_reclaim() cannot return a zone is if the respective lists are empty. So we should just return a simple NULL value here as we really don't have an error code which would make sense. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Implement handling for metadata version 2. The new metadata adds a label and UUID for the device mapper device, and additional UUID for the underlying block devices. It also allows for an additional regular drive to be used for emulating random access zones. The emulated zones will be placed logically in front of the zones from the zoned block device, causing the superblocks and metadata to be stored on that device. The first zone of the original zoned device will be used to hold another, tertiary copy of the metadata; this copy carries a generation number of 0 and is never updated; it's just used for identification. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NBob Liu <bob.liu@oracle.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
When looking up zones in dmz_alloc_zone() we need to ignore metadata zones so as not to accidentally overwrite metadata. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
dm-zoned is becoming quite chatty during startup; reduce the noise by moving some information to 'debug' level. Suggested-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Use the metadata label for logging and not the underlying device. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 20 5月, 2020 1 次提交
-
-
由 Hannes Reinecke 提交于
Use accessors to retrieve the device pointer in preparation for adding an additional block device. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 15 5月, 2020 7 次提交
-
-
由 Hannes Reinecke 提交于
Introduce accessors dmz_dev_is_dying() and dmz_check_dev() to avoid having to reference the devices directly. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NBob Liu <bob.liu@oracle.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Introduce dmz_metadata_label() to format the device-mapper device name and use it instead of the device name of the underlying device. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Move fields from the device structure into the metadata structure and provide accessor functions. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Store the device together with the superblock so that we don't have to recur to the metadata to find it. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Instead of storing just the first superblock zone and calculate the secondary relative to that we should be using an array for holding the superblock zones. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Instead of calculating the zone index by the offset within the zone array store the index within the structure itself. With that the helper dmz_id() is pointless and can be replaced with accessing the ->id value directly. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NBob Liu <bob.liu@oracle.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Add callback to supply information for 'dmsetup status' and 'dmsetup table'. The output for 'dmsetup status' is 0 <size> zoned <nr_zones> zones <nr_unmap_rnd>/<nr_rnd> random <nr_unmap_seq>/<nr_seq> sequential where <nr_unmap_rnd> is the number of unmapped (ie free) random zones, <nr_rnd> the total number of random zones, <nr_unmap_seq> the number of unmapped sequential zones, and <nr_seq> the total number of sequential zones. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NBob Liu <bob.liu@oracle.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 25 3月, 2020 1 次提交
-
-
由 Bob Liu 提交于
zmd->nr_rnd_zones was increased twice by mistake. The other place it is increased in dmz_init_zone() is the only one needed: 1131 zmd->nr_useable_zones++; 1132 if (dmz_is_rnd(zone)) { 1133 zmd->nr_rnd_zones++; ^^^ Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: NBob Liu <bob.liu@oracle.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 08 1月, 2020 1 次提交
-
-
由 Dmitry Fomichev 提交于
dm-zoned is observed to log failed kernel assertions and not work correctly when operating against a device with a zone size smaller than 128MiB (e.g. 32768 bits per 4K block). The reason is that the bitmap size per zone is calculated as zero with such a small zone size. Fix this problem and also make the code related to zone bitmap management be able to handle per zone bitmaps smaller than a single block. A dm-zoned-tools patch is required to properly format dm-zoned devices with zone sizes smaller than 128MiB. Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 13 11月, 2019 1 次提交
-
-
由 Christoph Hellwig 提交于
Avoid the need to allocate a potentially large array of struct blk_zone in the block layer by switching the ->report_zones method interface to a callback model. Now the caller simply supplies a callback that is executed on each reported zone, and private data for it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 07 11月, 2019 2 次提交
-
-
由 Dmitry Fomichev 提交于
Commit 75d66ffb added backing device health checks and as a part of these checks, check_events() block ops template call is invoked in dm-zoned mapping path as well as in reclaim and flush path. Calling check_events() with ATA or SCSI backing devices introduces a blocking scsi_test_unit_ready() call being made in sd_check_events(). Even though the overhead of calling scsi_test_unit_ready() is small for ATA zoned devices, it is much larger for SCSI and it affects performance in a very negative way. Fix this performance regression by executing check_events() only in case of any I/O errors. The function dmz_bdev_is_dying() is modified to call only blk_queue_dying(), while calls to check_events() are made in a new helper function, dmz_check_bdev(). Reported-by: Nzhangxiaoxu <zhangxiaoxu5@huawei.com> Fixes: 75d66ffb ("dm zoned: properly handle backing device failure") Cc: stable@vger.kernel.org Signed-off-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Ajay Joshi 提交于
Zoned block devices (ZBC and ZAC devices) allow an explicit control over the condition (state) of zones. The operations allowed are: * Open a zone: Transition to open condition to indicate that a zone will actively be written * Close a zone: Transition to closed condition to release the drive resources used for writing to a zone * Finish a zone: Transition an open or closed zone to the full condition to prevent write operations To enable this control for in-kernel zoned block device users, define the new request operations REQ_OP_ZONE_OPEN, REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH as well as the generic function blkdev_zone_mgmt() for submitting these operations on a range of zones. This results in blkdev_reset_zones() removal and replacement with this new zone magement function. Users of blkdev_reset_zones() (f2fs and dm-zoned) are updated accordingly. Contains contributions from Matias Bjorling, Hans Holmberg, Dmitry Fomichev, Keith Busch, Damien Le Moal and Christoph Hellwig. Reviewed-by: NJavier González <javier@javigon.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAjay Joshi <ajay.joshi@wdc.com> Signed-off-by: NMatias Bjorling <matias.bjorling@wdc.com> Signed-off-by: NHans Holmberg <hans.holmberg@wdc.com> Signed-off-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 21 8月, 2019 1 次提交
-
-
由 Dan Carpenter 提交于
This function is supposed to return error pointers so it matches the dmz_get_rnd_zone_for_reclaim() function. The current code could lead to a NULL dereference in dmz_do_reclaim() Fixes: b234c6d7 ("dm zoned: improve error handling in reclaim") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 16 8月, 2019 4 次提交
-
-
由 Dmitry Fomichev 提交于
Signed-off-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Dmitry Fomichev 提交于
Signed-off-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Dmitry Fomichev 提交于
dm-zoned is observed to lock up or livelock in case of hardware failure or some misconfiguration of the backing zoned device. This patch adds a new dm-zoned target function that checks the status of the backing device. If the request queue of the backing device is found to be in dying state or the SCSI backing device enters offline state, the health check code sets a dm-zoned target flag prompting all further incoming I/O to be rejected. In order to detect backing device failures timely, this new function is called in the request mapping path, at the beginning of every reclaim run and before performing any metadata I/O. The proper way out of this situation is to do dmsetup remove <dm-zoned target> and recreate the target when the problem with the backing device is resolved. Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Dmitry Fomichev 提交于
There are several places in reclaim code where errors are not propagated to the main function, dmz_reclaim(). This function is responsible for unlocking zones that might be still locked at the end of any failed reclaim iterations. As the result, some device zones may be left permanently locked for reclaim, degrading target's capability to reclaim zones. This patch fixes these issues as follows - Make sure that dmz_reclaim_buf(), dmz_reclaim_seq_data() and dmz_reclaim_rnd_data() return error codes to the caller. dmz_reclaim() function is renamed to dmz_do_reclaim() to avoid clashing with "struct dmz_reclaim" and is modified to return the error to the caller. dmz_get_zone_for_reclaim() now returns an error instead of NULL pointer and reclaim code checks for that error. Error logging/debug messages are added where necessary. Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: NDmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 17 7月, 2019 1 次提交
-
-
由 Damien Le Moal 提交于
dm-zoned uses the zone flag DMZ_ACTIVE to indicate that a zone of the backend device is being actively read or written and so cannot be reclaimed. This flag is set as long as the zone atomic reference counter is not 0. When this atomic is decremented and reaches 0 (e.g. on BIO completion), the active flag is cleared and set again whenever the zone is reused and BIO issued with the atomic counter incremented. These 2 operations (atomic inc/dec and flag set/clear) are however not always executed atomically under the target metadata mutex lock and this causes the warning: WARN_ON(!test_bit(DMZ_ACTIVE, &zone->flags)); in dmz_deactivate_zone() to be displayed. This problem is regularly triggered with xfstests generic/209, generic/300, generic/451 and xfs/077 with XFS being used as the file system on the dm-zoned target device. Similarly, xfstests ext4/303, ext4/304, generic/209 and generic/300 trigger the warning with ext4 use. This problem can be easily fixed by simply removing the DMZ_ACTIVE flag and managing the "ACTIVE" state by directly looking at the reference counter value. To do so, the functions dmz_activate_zone() and dmz_deactivate_zone() are changed to inline functions respectively calling atomic_inc() and atomic_dec(), while the dmz_is_active() macro is changed to an inline function calling atomic_read(). Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Reported-by: NMasato Suzuki <masato.suzuki@wdc.com> Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 12 7月, 2019 1 次提交
-
-
由 Damien Le Moal 提交于
Only GFP_KERNEL and GFP_NOIO are used with blkdev_report_zones(). In preparation of using vmalloc() for large report buffer and zone array allocations used by this function, remove its "gfp_t gfp_mask" argument and rely on the caller context to use memalloc_noio_save/restore() where necessary (block layer zone revalidation and dm-zoned I/O error path). Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 19 4月, 2019 1 次提交
-
-
由 Damien Le Moal 提交于
The function blkdev_report_zones() returns success even if no zone information is reported (empty report). Empty zone reports can only happen if the report start sector passed exceeds the device capacity. The conditions for this to happen are either a bug in the caller code, or, a change in the device that forced the low level driver to change the device capacity to a value that is lower than the report start sector. This situation includes a failed disk revalidation resulting in the disk capacity being changed to 0. If this change happens while dm-zoned is in its initialization phase executing dmz_init_zones(), this function may enter an infinite loop and hang the system. To avoid this, add a check to disallow empty zone reports and bail out early. Also fix the function dmz_update_zone() to make sure that the report for the requested zone was correctly obtained. Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NShaun Tancheff <shaun@tancheff.com> Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 19 10月, 2018 2 次提交
-
-
由 Damien Le Moal 提交于
dmz_fetch_mblock() called from dmz_get_mblock() has a race since the allocation of the new metadata block descriptor and its insertion in the cache rbtree with the READING state is not atomic. Two different contexts requesting the same block may end up each adding two different descriptors of the same block to the cache. Another problem for this function is that the BIO for processing the block read is allocated after the metadata block descriptor is inserted in the cache rbtree. If the BIO allocation fails, the metadata block descriptor is freed without first being removed from the rbtree. Fix the first problem by checking again if the requested block is not in the cache right before inserting the newly allocated descriptor, atomically under the mblk_lock spinlock. The second problem is fixed by simply allocating the BIO before inserting the new block in the cache. Finally, since dmz_fetch_mblock() also increments a block reference counter, rename the function to dmz_get_mblock_slow(). To be symmetric and clear, also rename dmz_lookup_mblock() to dmz_get_mblock_fast() and increment the block reference counter directly in that function rather than in dmz_get_mblock(). Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Damien Le Moal 提交于
Since the ref field of struct dmz_mblock is always used with the spinlock of struct dmz_metadata locked, there is no need to use an atomic_t type. Change the type of the ref field to an unsigne integer. Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 17 1月, 2018 1 次提交
-
-
由 Mike Snitzer 提交于
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 24 8月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 7月, 2017 1 次提交
-
-
由 Damien Le Moal 提交于
Use GFP_NOIO for memory allocations in the I/O path. Other memory allocations in the initialization path can use GFP_KERNEL. Reported-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-