1. 11 Aug 2021 (1 commit)
    • dm ima: measure data on table load · 91ccbbac
      Authored by Tushar Sugandhi
      DM configures a block device with various target-specific attributes
      passed to it as a table.  DM loads the table and calls each target's
      respective constructor with the attributes as input parameters.
      Some of these attributes are critical to ensuring the device meets a
      certain security bar.  Thus, IMA should measure these attributes to
      ensure they are not tampered with during the lifetime of the device,
      so that external services can have high confidence in the
      configuration of the block devices on a given system.
      
      Some devices may have large tables, and a given device may change its
      state (table-load, suspend, resume, rename, remove, table-clear, etc.)
      many times.  Measuring these attributes each time the device changes
      state would significantly increase the size of the IMA logs.
      Further, once configured, these attributes are not expected to change
      unless a new table is loaded, or a device is removed and recreated.
      Therefore the clear-text of the attributes should only be measured
      during table load, and the hash of the active/inactive table should be
      measured for the remaining device state changes.
      
      Export the IMA function ima_measure_critical_data() to allow
      measurement of DM device parameters, as well as target-specific
      attributes, during table load.  Compute the hash of the inactive
      table and store it for measurements during future state changes.  If
      a load is called multiple times, update the inactive table hash with
      the hash of the latest populated table, so that the correct inactive
      table hash is measured when the device transitions to different
      states such as resume, remove, and rename.
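
      For illustration, a minimal sketch of a call into the exported hook,
      assuming the five-argument v5.14-era signature of
      ima_measure_critical_data(); the buffer names are hypothetical, not
      the actual DM code:

          /*
           * Measure a concatenated clear-text attribute string at
           * table-load time.  "dm_buf"/"dm_buf_len" are hypothetical;
           * passing 'true' instead would record a hash of the buffer.
           */
          ima_measure_critical_data("device_mapper", "dm_table_load",
                                    dm_buf, dm_buf_len, false);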
      Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
      Signed-off-by: Colin Ian King <colin.king@canonical.com> # leak fix
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2. 05 Jun 2021 (2 commits)
    • dm: introduce zone append emulation · bb37d772
      Authored by Damien Le Moal
      For zoned targets that cannot support zone append operations, implement
      an emulation using regular write operations. If the original BIO
      submitted by the user is a zone append operation, change its clone into
      a regular write operation directed at the target zone write pointer
      position.
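
      For illustration, the remapping step can be sketched as follows (a
      minimal sketch using the zwp_offset array introduced below; not the
      exact kernel code):

          /* turn a zone-append clone into a regular write at the tracked
           * write pointer position of its target zone */
          static void dm_emulate_zone_append_sketch(struct bio *clone,
                                                    struct mapped_device *md,
                                                    unsigned int zno,
                                                    sector_t zone_start)
          {
                  /* zwp_offset[] holds offsets relative to the zone start */
                  clone->bi_iter.bi_sector = zone_start + md->zwp_offset[zno];

                  /* replace the operation type, keeping the other flags */
                  clone->bi_opf &= ~REQ_OP_MASK;
                  clone->bi_opf |= REQ_OP_WRITE;
          }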
      
      To do so, an array of write pointer offsets (write pointer position
      relative to the start of a zone) is added to struct mapped_device. All
      operations that modify a sequential zone write pointer (writes, zone
      reset, zone finish and zone append) are intercepted in __map_bio() and
      processed using the new function dm_zone_map_bio().
      
      Detection of a target's ability to natively support zone append
      operations is done from dm_table_set_restrictions() by calling the
      function dm_set_zones_restrictions(). A target that does not support
      zone append operations, either because it explicitly declares so using
      the new struct dm_target field zone_append_not_supported, or because
      the device table contains a non-zoned device, has its mapped device
      marked with the new flag DMF_ZONE_APPEND_EMULATED. The helper function
      dm_emulate_zone_append() is introduced to test a mapped device for this
      new flag.
      
      Atomicity of the zone write pointer tracking and updates is ensured using
      a zone write locking mechanism based on a bitmap. This is similar to
      the block layer method but based on BIOs rather than struct request.
      A zone write lock is taken in dm_zone_map_bio() for any clone BIO with
      an operation type that changes the BIO target zone write pointer
      position. The zone write lock is released if the clone BIO fails
      before submission, or from dm_zone_endio() when the clone BIO
      completes.
      
      The zone write lock bitmap of the mapped device, together with a bitmap
      indicating zone types (conv_zones_bitmap) and the write pointer offset
      array (zwp_offset) are allocated and initialized with a full device zone
      report in dm_set_zones_restrictions() using the function
      dm_revalidate_zones().
      
      For failed operations that may have modified a zone write pointer, the
      zone write pointer offset is marked as invalid in dm_zone_endio().
      Zones with an invalid write pointer offset are checked and the write
      pointer updated using an internal report zone operation when the
      faulty zone is accessed again by the user.
      
      All functions added for this emulation have minimal overhead for
      zoned targets that natively support zone append operations. Regular
      device targets are also not affected. Builds with CONFIG_BLK_DEV_ZONED
      disabled are likewise unaffected, since all dm zone related functions
      are stubbed out.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: Introduce dm_report_zones() · 912e8875
      Authored by Damien Le Moal
      To simplify the implementation of the report_zones operation of a
      zoned target, introduce the function dm_report_zones() to set a
      target mapping start sector in struct dm_report_zones_args and call
      blkdev_report_zones(). This new function is exported, while the report
      zones callback function dm_report_zones_cb() is not.
      
      dm-linear, dm-flakey and dm-crypt are modified to use dm_report_zones().
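
      A minimal sketch of how a simple remapping target might use the new
      helper, loosely modeled on dm-linear (the context structure and its
      dev/start fields are illustrative):

          static int example_report_zones(struct dm_target *ti,
                          struct dm_report_zones_args *args,
                          unsigned int nr_zones)
          {
                  struct example_c *ec = ti->private;  /* hypothetical */

                  /* dm_report_zones() records the mapping start sector in
                   * args and forwards to blkdev_report_zones() */
                  return dm_report_zones(ec->dev->bdev, ec->start,
                                         ec->start + dm_target_offset(ti,
                                                        args->next_sector),
                                         args, nr_zones);
          }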
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
3. 23 Mar 2021 (1 commit)
    • dm table: Fix zoned model check and zone sectors check · 2d669ceb
      Authored by Shin'ichiro Kawasaki
      Commit 24f6b603 ("dm table: fix zoned iterate_devices based device
      capability checks") triggered dm table load failures when a dm-zoned
      device is set up on zoned block devices with a regular device for
      cache.
      
      The commit inverted the logic of two callback functions for
      iterate_devices: device_is_zoned_model() and
      device_matches_zone_sectors(). With the logic of
      device_is_zoned_model() inverted, all destination devices of all
      targets in the dm table are required to have the expected zoned model.
      This is fine for dm-linear, dm-flakey and dm-crypt on zoned block
      devices, since each target has only one destination device. However,
      this results in failure for dm-zoned with a regular cache device,
      since that target has both regular and zoned block devices.
      
      As for device_matches_zone_sectors(), the commit inverted the logic to
      require that all zoned block devices in each target have the specified
      zone_sectors. This check also fails for regular block devices, which
      do not have zones.
      
      To avoid the check failures, fix the zone model check and the zone
      sectors check. For the zone model check, introduce the new feature
      flag DM_TARGET_MIXED_ZONED_MODEL and set it for the dm-zoned target.
      When a target has this flag, allow it to have destination devices with
      any zoned model. For the zone sectors check, skip the check if the
      destination device is not a zoned block device. Also add comments and
      improve an error message to clarify the expectations of the two
      checks.
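
      A minimal sketch of the two fixes, assuming current helper names such
      as bdev_is_zoned() and bdev_zone_sectors() (the surrounding callback
      plumbing is abridged):

          /* dm-zoned opts out of the uniform-zoned-model requirement */
          static struct target_type dmz_target = {
                  .name     = "zoned",
                  .features = DM_TARGET_SINGLETON | DM_TARGET_MIXED_ZONED_MODEL,
                  /* ... */
          };

          /* zone sectors check: regular (non-zoned) devices are skipped */
          static int device_not_matches_zone_sectors(struct dm_target *ti,
                          struct dm_dev *dev, sector_t start, sector_t len,
                          void *data)
          {
                  unsigned int *zone_sectors = data;

                  if (!bdev_is_zoned(dev->bdev))
                          return 0;  /* not zoned: nothing to match */
                  return bdev_zone_sectors(dev->bdev) != *zone_sectors;
          }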
      
      Fixes: 24f6b603 ("dm table: fix zoned iterate_devices based device capability checks")
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4. 11 Feb 2021 (3 commits)
    • dm: fix deadlock when swapping to encrypted device · a666e5c0
      Authored by Mikulas Patocka
      The system would deadlock when swapping to a dm-crypt device. The
      reason is that for each incoming write bio, dm-crypt allocates memory
      that holds the encrypted data. These excessive allocations exhaust all
      the memory, and the result is either a deadlock or an OOM trigger.
      
      This patch limits the number of in-flight swap bios, so that the
      memory consumed by dm-crypt is limited. The limit is enforced if the
      target sets the "limit_swap_bios" variable and the bio has REQ_SWAP
      set.
      
      Non-swap bios are not affected, because taking the semaphore for them
      would cause performance degradation.
      
      This is similar to request-based drivers - they will also block when the
      number of requests is over the limit.
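
      A minimal sketch of the limiting scheme, assuming a per-device
      semaphore initialized to the configured limit (the names follow the
      description above, but the code is illustrative):

          /* only swap bios on targets that opted in are throttled */
          static bool swap_bios_limit(struct dm_target *ti, struct bio *bio)
          {
                  return ti->limit_swap_bios && (bio->bi_opf & REQ_SWAP);
          }

          /* on submission: block while too many swap bios are in flight */
          if (unlikely(swap_bios_limit(ti, bio)))
                  down(&md->swap_bios_semaphore);

          /* on completion: release the slot */
          if (unlikely(swap_bios_limit(ti, bio)))
                  up(&md->swap_bios_semaphore);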
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: simplify target code conditional on CONFIG_BLK_DEV_ZONED · e3290b94
      Authored by Mike Snitzer
      Allow removal of CONFIG_BLK_DEV_ZONED conditionals in the target_type
      definitions of various targets.
      Suggested-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: add support for passing through inline crypto support · aa6ce87a
      Authored by Satya Tangirala
      Update the device-mapper core to support exposing the inline crypto
      support of the underlying device(s) through the device-mapper device.
      
      This works by creating a "passthrough keyslot manager" for the dm
      device, which declares support for the encryption settings that all
      underlying devices support.  When a supported setting is used, the bio
      cloning code handles cloning the crypto context to the bios for all the
      underlying devices.  When an unsupported setting is used, the blk-crypto
      fallback is used as usual.
      
      Crypto support on each underlying device is ignored unless the
      corresponding dm target opts into exposing it.  This is needed because
      for inline crypto to semantically operate on the original bio, the data
      must not be transformed by the dm target.  Thus, targets like dm-linear
      can expose crypto support of the underlying device, but targets like
      dm-crypt can't.  (dm-crypt could use inline crypto itself, though.)
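
      A minimal sketch of how a non-transforming target might opt in,
      assuming the DM_TARGET_PASSES_CRYPTO target feature flag used for this
      capability (the definition is abridged):

          /* dm-linear does not transform data, so it can expose the
           * underlying devices' inline crypto capabilities */
          static struct target_type linear_target = {
                  .name     = "linear",
                  .features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_PASSES_CRYPTO,
                  /* ... */
          };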
      
      A DM device's table can only be changed if the "new" inline encryption
      capabilities are a (*not* necessarily strict) superset of the "old" inline
      encryption capabilities.  Attempts to make changes to the table that result
      in some inline encryption capability becoming no longer supported will be
      rejected.
      
      For the sake of clarity, key eviction from underlying devices will be
      handled in a future patch.
      Co-developed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Satya Tangirala <satyat@google.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
5. 24 Jul 2020 (1 commit)
    • dm integrity: fix integrity recalculation that is improperly skipped · 5df96f2b
      Authored by Mikulas Patocka
      Commit adc0daad ("dm: report suspended
      device during destroy") broke integrity recalculation.
      
      The problem is that dm_suspended() returns true not only during
      suspend but also during resume, so this race condition could occur:
      1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
      2. integrity_recalc (&ic->recalc_work) preempts the current thread
      3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
      4. integrity_recalc exits and no recalculation is done.
      
      To fix this race condition, add a function dm_post_suspending() that
      returns true only during the postsuspend phase, and use it instead of
      dm_suspended().
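
      A minimal sketch of the resulting check in integrity_recalc(), using
      the helper introduced by this commit:

          /* before: also skipped recalculation during resume */
          if (unlikely(dm_suspended(ic->ti)))
                  goto unlock_ret;

          /* after: only the postsuspend phase stops recalculation */
          if (unlikely(dm_post_suspending(ic->ti)))
                  goto unlock_ret;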
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Fixes: adc0daad ("dm: report suspended device during destroy")
      Cc: stable@vger.kernel.org # v4.18+
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
6. 06 Nov 2019 (1 commit)
    • dm stripe: use struct_size() in kmalloc() · 8adeac3b
      Authored by Gustavo A. R. Silva
      One of the more common cases of allocation size calculations is finding
      the size of a structure that has a zero-sized array at the end, along
      with memory for some number of elements for that array. For example:
      
      struct stripe_c {
              ...
              struct stripe stripe[0];
      };
      
      In this case, alloc_context() and dm_array_too_big() are removed and
      replaced by direct use of the struct_size() helper in kmalloc().
      
      Notice that the open-coded form is prone to type mistakes.
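
      For illustration, the before/after allocation looks roughly like this
      (a sketch, not the exact dm-stripe code):

          /* open-coded: repeats the types and can overflow silently */
          sc = kmalloc(sizeof(struct stripe_c)
                       + stripes * sizeof(struct stripe), GFP_KERNEL);

          /* struct_size(): type-checked and overflow-aware */
          sc = kmalloc(struct_size(sc, stripe, stripes), GFP_KERNEL);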
      
      This code was detected with the help of Coccinelle.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7. 26 Apr 2019 (1 commit)
    • dm mpath: fix missing call of path selector type->end_io · 5de719e3
      Authored by Yufen Yu
      After commit 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via
      blk_insert_cloned_request feedback"), map_request() will requeue the
      tio when the issued clone request returns BLK_STS_RESOURCE or
      BLK_STS_DEV_RESOURCE.
      
      Thus, if the device driver returns an error, a tio may be requeued
      multiple times until the return value is not DM_MAPIO_REQUEUE.  That
      means type->start_io may be called multiple times, while type->end_io
      is only called when the IO completes.
      
      In fact, even without commit 396eaf21, a setup_clone() failure can
      also cause a tio requeue and an associated missed call to
      type->end_io.
      
      The service-time path selector selects a path based on in_flight_size,
      which is increased by st_start_io() and decreased by st_end_io().
      Missed calls to st_end_io() can lead to an in_flight_size accounting
      error and will cause the selector to make the wrong choice.  The
      queue-length path selector is affected in the same way.
      
      To fix the problem, call type->end_io in ->release_clone_rq before the
      tio is requeued.  The map_info is passed to ->release_clone_rq() for
      the map_request() error paths that result in a requeue.
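
      An abridged sketch of the idea in dm-mpath's release hook, assuming
      the three-argument path selector end_io of that era:

          static void multipath_release_clone(struct request *clone,
                                              union map_info *map_context)
          {
                  if (unlikely(map_context)) {
                          /* requeue path: undo the accounting that
                           * ps.type->start_io did for this tio */
                          struct dm_mpath_io *mpio = get_mpio(map_context);
                          struct pgpath *pgpath = mpio->pgpath;

                          if (pgpath && pgpath->pg->ps.type->end_io)
                                  pgpath->pg->ps.type->end_io(&pgpath->pg->ps,
                                                              &pgpath->path,
                                                              mpio->nr_bytes);
                  }
                  blk_put_request(clone);
          }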
      
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Cc: stable@vger.kernel.org
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
8. 06 Mar 2019 (2 commits)
    • dm: add support to directly boot to a mapped device · 6bbc923d
      Authored by Helen Koike
      Add a "create" module parameter, which allows device-mapper targets to
      be configured at boot time. This enables early use of DM targets in the
      boot process (as the root device or otherwise) without the need for an
      initramfs.
      
      The syntax used in the boot param is based on the concise format from
      the dmsetup tool to follow the rule of least surprise:
      
      	dmsetup table --concise /dev/mapper/lroot
      
      Which is:
      	dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+]
      
      Where,
      	<name>		::= The device name.
      	<uuid>		::= xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | ""
      	<minor>		::= The device minor number | ""
      	<flags>		::= "ro" | "rw"
      	<table>		::= <start_sector> <num_sectors> <target_type> <target_args>
      	<target_type>	::= "verity" | "linear" | ...
      
      For example, the following could be added in the boot parameters:
      dm-mod.create="lroot,,,rw, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" root=/dev/dm-0
      
      Only targets that were tested are allowed, and only those that don't
      change any block device when the device is created as read-only. For
      example, the mirror and cache targets are not allowed. The rationale
      behind this is that if the user makes a mistake, choosing the wrong
      device to be the mirror or the cache can corrupt data.
      
      The only targets initially allowed are:
      * crypt
      * delay
      * linear
      * snapshot-origin
      * striped
      * verity
      Co-developed-by: Will Drewry <wad@chromium.org>
      Co-developed-by: Kees Cook <keescook@chromium.org>
      Co-developed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
      Signed-off-by: Helen Koike <helen.koike@collabora.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix to_sector() for 32bit · 0bdb50c5
      Authored by NeilBrown
      A dm-raid array with devices larger than 4GB won't assemble on
      a 32-bit host since _check_data_dev_sectors() was added in 4.16.
      This is because to_sector() treats its argument as an "unsigned long",
      which is 32 bits (4GB) on a 32-bit host.  Using "unsigned long long"
      is more correct.
      
      Kernels as early as 4.2 can have other problems due to to_sector()
      being used on the size of a device.
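
      The fix is essentially a one-type change in the helper; a sketch of
      the corrected form:

          /* 'unsigned long long' keeps byte counts above 4GB intact on
           * 32-bit hosts before the shift down to sectors */
          static inline sector_t to_sector(unsigned long long n)
          {
                  return (n >> SECTOR_SHIFT);
          }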
      
      Fixes: 0cf45031 ("dm raid: add support for the MD RAID0 personality")
      Cc: stable@vger.kernel.org (v4.2+)
      Reported-and-tested-by: Guillaume Perréal <gperreal@free.fr>
      Signed-off-by: NeilBrown <neil@brown.name>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
9. 26 Oct 2018 (1 commit)
    • block: add a report_zones method · e76239a3
      Authored by Christoph Hellwig
      Dispatching a report zones command through the request queue is a
      major pain due to the command reply payload rewriting that is
      necessary. Given that blkdev_report_zones() executes everything
      synchronously, implement report zones as a block device file operation
      instead, allowing major simplification of the code in many places.
      
      With sd, null-blk, dm-linear and dm-flakey being the only block device
      drivers that expose zoned block devices, these drivers are modified to
      provide the device-side implementation of the report_zones() block
      device file operation.
      
      For device mappers, a new report_zones() target type operation is
      defined so that calls to blkdev_report_zones() from the upper block
      layer can be propagated down to the underlying devices of the dm
      targets. An implementation of this new operation is added to the
      dm-linear and dm-flakey targets.
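
      A sketch of the new hook, roughly as it looked at the time (the
      parameter list has changed in later kernels, so treat this as
      illustrative):

          struct block_device_operations {
                  /* ... */
                  /* fill 'zones' starting at 'sector'; *nr_zones is an
                   * in/out count of the zones reported */
                  int (*report_zones)(struct gendisk *disk, sector_t sector,
                                      struct blk_zone *zones,
                                      unsigned int *nr_zones, gfp_t gfp_mask);
          };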
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      [Damien]
      * Changed method block_device argument to gendisk
      * Various bug fixes and improvements
      * Added support for null_blk, dm-linear and dm-flakey.
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
10. 23 May 2018 (1 commit)
    • dax: Introduce a ->copy_to_iter dax operation · b3a9a0c3
      Authored by Dan Williams
      Similar to the ->copy_from_iter() operation, a platform may want to
      deploy an architecture or device specific routine for handling reads
      from a dax_device like /dev/pmemX. On x86 this routine will point to a
      machine check safe version of copy_to_iter(). For now, add the plumbing
      to device-mapper and the dax core.
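
      A sketch of the new operation in struct dax_operations, mirroring the
      existing ->copy_from_iter() signature:

          struct dax_operations {
                  /* ... */
                  /* copy data from the dax device into an iov_iter */
                  size_t (*copy_to_iter)(struct dax_device *dax_dev,
                                         pgoff_t pgoff, void *addr,
                                         size_t bytes, struct iov_iter *i);
          };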
      
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
11. 18 Mar 2018 (1 commit)
    • block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> · 233bde21
      Authored by Bart Van Assche
      It happens often while I'm preparing a patch for a block driver that
      I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
      available for this driver? Do I have to introduce definitions of these
      constants before I can use them? To avoid this confusion, move the
      existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
      <linux/blkdev.h> header file such that they become available to all
      block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
      header file conditional, so that including that header file after
      <linux/blkdev.h> does not cause the compiler to complain about a
      SECTOR_SIZE redefinition.
      
      Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
      not been removed from uapi header files nor from NAND drivers in
      which these constants are used for another purpose than converting
      block layer offsets and sizes into a number of sectors.
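
      The shared definitions are the usual 512-byte sector constants, as now
      provided by <linux/blkdev.h>:

          #define SECTOR_SHIFT 9
          #define SECTOR_SIZE  (1 << SECTOR_SHIFT)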
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
12. 20 Dec 2017 (1 commit)
    • dm: introduce DM_TYPE_NVME_BIO_BASED · 22c11858
      Authored by Mike Snitzer
      If dm_table_determine_type() establishes DM_TYPE_NVME_BIO_BASED, then
      none of the devices in the DM table support partial completions.
      Also, the table has a single immutable target that doesn't require DM
      core to split bios.
      
      This will enable adding NVMe optimizations to bio-based DM.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
13. 17 Dec 2017 (1 commit)
    • dm: improve performance by moving dm_io structure to per-bio-data · 64f52b0e
      Authored by Mike Snitzer
      Eliminates the need for a separate mempool to allocate 'struct dm_io'
      objects from.  As such, it saves an extra mempool allocation for each
      original bio issued to DM core.
      
      This complicates the per-bio-data accessor functions by needing to
      conditionally add extra padding to get to a target's per-bio-data.  But
      in the end this provides a decent performance improvement for all
      bio-based DM devices.
      
      On an NVMe-loop based testbed to a ramdisk (~3100 MB/s): bio-based
      DM linear performance improved by ~4% (went from 2665 to 2777 MB/s).
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
14. 11 Sep 2017 (1 commit)
    • dax: remove the pmem_dax_ops->flush abstraction · c3ca015f
      Authored by Mikulas Patocka
      Commit abebfbe2 ("dm: add ->flush() dax operation support") is
      buggy. A DM device may be composed of multiple underlying devices and
      all of them need to be flushed. That commit just routes the flush
      request to the first device and ignores the other devices.
      
      It could be fixed by adding more complex logic to the device mapper.
      But there is only one implementation of the pmem_dax_ops->flush method
      - pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently,
      we don't need the pmem_dax_ops->flush abstraction at all: we can call
      arch_wb_cache_pmem() directly from dax_flush(), because
      dax_dev->ops->flush can't ever reach anything other than
      arch_wb_cache_pmem().
      
      It should also be pointed out that some uses of persistent memory need
      to flush only a very small amount of data (such as one cacheline), and
      going through the device mapper machinery for a single flushed cache
      line would be overkill.
      
      Fix this by removing the pmem_dax_ops->flush abstraction and calling
      arch_wb_cache_pmem() directly from dax_flush(). Also, remove the
      device mapper code that forwards the flushes.
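
      A sketch of the simplified dax_flush() after the abstraction is
      removed (close to, but not verbatim, the resulting code):

          void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
          {
                  if (unlikely(!dax_write_cache_enabled(dax_dev)))
                          return;

                  /* no per-driver indirection: write back CPU caches directly */
                  arch_wb_cache_pmem(addr, size);
          }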
      
      Fixes: abebfbe2 ("dm: add ->flush() dax operation support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>