提交 · 03100aada96f0645bbcb89aea24c01f02d0ef1fa · openanolis / cloud-kernel

20 8月, 2015 1 次提交

block: Replace SG_GAPS with new queue limits mask · 03100aad

由 Keith Busch 提交于 8月 19, 2015

The SG_GAPS queue flag caused checks for bio vector alignment against
PAGE_SIZE, but the device may have different constraints. This patch
adds a queue limits so a driver with such constraints can set to allow
requests that would have been unnecessarily split. The new gaps check
takes the request_queue as a parameter to simplify the logic around
invoking this function.

This new limit makes the queue flag redundant, so removing it and
all usage. Device-mappers will inherit the correct settings through
blk_stack_limits().
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

03100aad

14 8月, 2015 1 次提交

block: kill merge_bvec_fn() completely · 8ae12666

由 Kent Overstreet 提交于 4月 27, 2015

As generic_make_request() is now able to handle arbitrarily sized bios,
it's no longer necessary for each individual block driver to define its
own ->merge_bvec_fn() callback. Remove every invocation completely.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: drbd-user@lists.linbit.com
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@kernel.org>
Cc: ceph-devel@vger.kernel.org
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: NeilBrown <neilb@suse.de> (for the 'md' bits)
Acked-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com>
[dpark: also remove ->merge_bvec_fn() in dm-thin as well as
 dm-era-target, and resolve merge conflicts]
Signed-off-by: NDongsu Park <dpark@posteo.net>
Signed-off-by: NMing Lin <ming.l@ssi.samsung.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

8ae12666

26 6月, 2015 2 次提交

Revert "block, dm: don't copy bios for request clones" · 78d8e58a

由 Mike Snitzer 提交于 6月 26, 2015

This reverts commit 5f1b670d.

Justification for revert as reported in this dm-devel post:
https://www.redhat.com/archives/dm-devel/2015-June/msg00160.html

this change should not be pushed to mainline yet.

Firstly, Christoph has a newer version of the patch that fixes silent
data corruption problem:
  https://www.redhat.com/archives/dm-devel/2015-May/msg00229.html

And the new version still depends on LLDDs to always complete requests
to the end when error happens, while block API doesn't enforce such a
requirement. If the assumption is ever broken, the inconsistency between
request and bio (e.g. rq->__sector and rq->bio) will cause silent data
corruption:
  https://www.redhat.com/archives/dm-devel/2015-June/msg00022.htmlReported-by: NJunichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

78d8e58a

Revert "dm: do not allocate any mempools for blk-mq request-based DM" · 4e6e36c3

由 Mike Snitzer 提交于 6月 26, 2015

This reverts commit cbc4e3c1.
Reported-by: NJunichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

4e6e36c3

30 5月, 2015 2 次提交

dm: do not allocate any mempools for blk-mq request-based DM · cbc4e3c1

由 Mike Snitzer 提交于 4月 27, 2015

Do not allocate the io_pool mempool for blk-mq request-based DM
(DM_TYPE_MQ_REQUEST_BASED) in dm_alloc_rq_mempools().

Also refine __bind_mempools() to have more precise awareness of which
mempools each type of DM device uses -- avoids mempool churn when
reloading DM tables (particularly for DM_TYPE_REQUEST_BASED).
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

cbc4e3c1

dm: fix reload failure of 0 path multipath mapping on blk-mq devices · 15b94a69

由 Junichi Nomura 提交于 5月 29, 2015

dm-multipath accepts 0 path mapping.

  # echo '0 2097152 multipath 0 0 0 0' | dmsetup create newdev

Such a mapping can be used to release underlying devices while still
holding requests in its queue until working paths come back.

However, once the multipath device is created over blk-mq devices,
it rejects reloading of 0 path mapping:

  # echo '0 2097152 multipath 0 0 1 1 queue-length 0 1 1 /dev/sda 1' \
      | dmsetup create mpath1
  # echo '0 2097152 multipath 0 0 0 0' | dmsetup load mpath1
  device-mapper: reload ioctl on mpath1 failed: Invalid argument
  Command failed

With following kernel message:
  device-mapper: ioctl: can't change device type after initial table load.

DM tries to inherit the current table type using dm_table_set_type()
but it doesn't work as expected because of unnecessary check about
whether the target type is hybrid or not.

Hybrid type is for targets that work as either request-based or bio-based
and not required for blk-mq or non blk-mq checking.

Fixes: 65803c20 ("dm table: train hybrid target type detection to select blk-mq if appropriate")
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

15b94a69

22 5月, 2015 1 次提交

block, dm: don't copy bios for request clones · 5f1b670d

由 Christoph Hellwig 提交于 5月 22, 2015

Currently dm-multipath has to clone the bios for every request sent
to the lower devices, which wastes cpu cycles and ties down memory.

This patch instead adds a new REQ_CLONE flag that instructs req_bio_endio
to not complete bios attached to a request, which we set on clone
requests similar to bios in a flush sequence.  With this change I/O
errors on a path failure only get propagated to dm-multipath, which
can then either resubmit the I/O or complete the bios on the original
request.

I've done some basic testing of this on a Linux target with ALUA support,
and it survives path failures during I/O nicely.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

5f1b670d

16 4月, 2015 4 次提交

dm table: use bool function return values of true/false not 1/0 · 7f61f5a0

由 Joe Perches 提交于 3月 30, 2015

Use the normal return values for bool functions.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

7f61f5a0

dm table: fall back to getting device using name_to_dev_t() · 644bda6f

由 Dan Ehrenberg 提交于 2月 10, 2015

If a device is used as the root filesystem, it can't be built
off of devices which are within the root filesystem (just like
command line arguments to root=).  For this reason, Linux has a
pseudo-filesystem for root= and MD initialization (based on the
function name_to_dev_t) which handles different ways of specifying
devices including PARTUUID and major:minor.

Switch to using name_to_dev_t() in dm_get_device().  Rather than
having DM assume that all things which are not major:minor are paths in
an already-mounted filesystem, change dm_get_device() to first attempt
to look up the device in the filesystem, and if not found it will fall
back to using name_to_dev_t().

In terms of backwards compatibility, there are some cases where
behavior will be different:
- If you have a file in the current working directory named 1:2 and
  you initialze DM there, then it will try to use that file rather
  than the disk with that major:minor pair as a backing device.
- Similarly for other bdev types which name_to_dev_t() knows how to
  interpret, the previous behavior was to repeatedly check for the
  existence of the file (e.g., while waiting for rootfs to come up)
  but the new behavior is to use the name_to_dev_t() interpretation.
  For example, if you have a file named /dev/ubiblock0_0 which is
  a symlink to /dev/sda3, but it is not yet present when DM starts
  to initialize, then the name_to_dev_t() interpretation will take
  precedence.

These incompatibilities would only show up in really strange setups
with bad practices so we shouldn't have to worry about them.
Signed-off-by: NDan Ehrenberg <dehrenberg@chromium.org>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

644bda6f

dm: add 'use_blk_mq' module param and expose in per-device ro sysfs attr · 17e149b8

由 Mike Snitzer 提交于 3月 11, 2015

Request-based DM's blk-mq support defaults to off; but a user can easily
change the default using the dm_mod.use_blk_mq module/boot option.

Also, you can check what mode a given request-based DM device is using
with: cat /sys/block/dm-X/dm/use_blk_mq

This change enabled further cleanup and reduced work (e.g. the
md->io_pool and md->rq_pool isn't created if using blk-mq).
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

17e149b8

dm: add full blk-mq support to request-based DM · bfebd1cd

由 Mike Snitzer 提交于 3月 08, 2015

Commit e5863d9a ("dm: allocate requests in target when stacking on
blk-mq devices") served as the first step toward fully utilizing blk-mq
in request-based DM -- it enabled stacking an old-style (request_fn)
request_queue ontop of the underlying blk-mq device(s).  That first step
didn't improve performance of DM multipath ontop of fast blk-mq devices
(e.g. NVMe) because the top-level old-style request_queue was severely
limited by the queue_lock.

The second step offered here enables stacking a blk-mq request_queue
ontop of the underlying blk-mq device(s).  This unlocks significant
performance gains on fast blk-mq devices, Keith Busch tested on his NVMe
testbed and offered this really positive news:

 "Just providing a performance update. All my fio tests are getting
  roughly equal performance whether accessed through the raw block
  device or the multipath device mapper (~470k IOPS). I could only push
  ~20% of the raw iops through dm before this conversion, so this latest
  tree is looking really solid from a performance standpoint."
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Tested-by: NKeith Busch <keith.busch@intel.com>

bfebd1cd

01 4月, 2015 1 次提交

dm: remove request-based DM queue's lld_busy_fn hook · d56b9b28

由 Mike Snitzer 提交于 2月 23, 2015

DM multipath is the only caller of blk_lld_busy() -- which calls a
queue's lld_busy_fn hook. Request-based DM doesn't support stacking
multipath devices so there is no reason to register the lld_busy_fn hook
on a multipath device's queue using blk_queue_lld_busy().

As such, remove functions dm_lld_busy and dm_table_any_busy_target.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

d56b9b28

11 2月, 2015 1 次提交

dm: inherit QUEUE_FLAG_SG_GAPS flags from underlying queues · a4afe76b

由 Keith Busch 提交于 1月 24, 2015

A DM device must inherit the QUEUE_FLAG_SG_GAPS flags from its
underlying block devices' request queues.

This fixes problems when submitting cloned requests to multipathed
devices requiring virtually contiguous buffers.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

a4afe76b

10 2月, 2015 2 次提交

dm table: train hybrid target type detection to select blk-mq if appropriate · 65803c20

由 Mike Snitzer 提交于 12月 18, 2014

Otherwise replacing the multipath target with the error target fails:
device-mapper: ioctl: can't change device type after initial table load.

The error target was mistakenly considered to be target type
DM_TYPE_REQUEST_BASED rather than DM_TYPE_MQ_REQUEST_BASED even if the
target it was to replace was of type DM_TYPE_MQ_REQUEST_BASED.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

65803c20

dm: allocate requests in target when stacking on blk-mq devices · e5863d9a

由 Mike Snitzer 提交于 12月 17, 2014

For blk-mq request-based DM the responsibility of allocating a cloned
request is transfered from DM core to the target type.  Doing so
enables the cloned request to be allocated from the appropriate
blk-mq request_queue's pool (only the DM target, e.g. multipath, can
know which block device to send a given cloned request to).

Care was taken to preserve compatibility with old-style block request
completion that requires request-based DM _not_ acquire the clone
request's queue lock in the completion path.  As such, there are now 2
different request-based DM target_type interfaces:
1) the original .map_rq() interface will continue to be used for
   non-blk-mq devices -- the preallocated clone request is passed in
   from DM core.
2) a new .clone_and_map_rq() and .release_clone_rq() will be used for
   blk-mq devices -- blk_get_request() and blk_put_request() are used
   respectively from these hooks.

dm_table_set_type() was updated to detect if the request-based target is
being stacked on blk-mq devices, if so DM_TYPE_MQ_REQUEST_BASED is set.
DM core disallows switching the DM table's type after it is set.  This
means that there is no mixing of non-blk-mq and blk-mq devices within
the same request-based DM table.

[This patch was started by Keith and later heavily modified by Mike]
Tested-by: NBart Van Assche <bvanassche@acm.org>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

e5863d9a

20 11月, 2014 1 次提交

dm: add presuspend_undo hook to target_type · d67ee213

由 Mike Snitzer 提交于 10月 28, 2014

The DM thin-pool target now must undo the changes performed during
pool_presuspend() so introduce presuspend_undo hook in target_type.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Acked-by: NJoe Thornber <ejt@redhat.com>

d67ee213

06 10月, 2014 1 次提交

dm: allow active and inactive tables to share dm_devs · 86f1152b

由 Benjamin Marzinski 提交于 8月 13, 2014

Until this change, when loading a new DM table, DM core would re-open
all of the devices in the DM table. Now, DM core will avoid redundant
device opens (and closes when destroying the old table) if the old
table already has a device open using the same mode. This is achieved
by managing reference counts on the table_devices that DM core now
stores in the mapped_device structure (rather than in the dm_table
structure). So a mapped_device's active and inactive dm_tables' dm_dev
lists now just point to the dm_devs stored in the mapped_device's
table_devices list.

This improvement in DM core's device reference counting has the
side-effect of fixing a long-standing limitation of the multipath
target: a DM multipath table couldn't include any paths that were unusable
(failed). For example: if all paths have failed and you add a new,
working, path to the table; you can't use it since the table load would
fail due to it still containing failed paths. Now a re-load of a
multipath table can include failed devices and when those devices become
active again they can be used instantly.

The device list code in dm.c isn't a straight copy/paste from the code in
dm-table.c, but it's very close (aside from some variable renames). One
subtle difference is that find_table_device for the tables_devices list
will only match devices with the same name and mode. This is because we
don't want to upgrade a device's mode in the active table when an
inactive table is loaded.

Access to the mapped_device structure's tables_devices list requires a
mutex (tables_devices_lock), so that tables cannot be created and
destroyed concurrently.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

86f1152b

11 8月, 2014 1 次提交

dm table: propagate QUEUE_FLAG_NO_SG_MERGE · 200612ec

由 Jeff Moyer 提交于 8月 08, 2014

Commit 05f1dd53 ("block: add queue flag for disabling SG merging")
introduced a new queue flag: QUEUE_FLAG_NO_SG_MERGE.  This gets set by
default in blk_mq_init_queue for mq-enabled devices.  The effect of
the flag is to bypass the SG segment merging.  Instead, the
bio->bi_vcnt is used as the number of hardware segments.

With a device mapper target on top of a device with
QUEUE_FLAG_NO_SG_MERGE set, we can end up sending down more segments
than a driver is prepared to handle.  I ran into this when backporting
the virtio_blk mq support.  It triggerred this BUG_ON, in
virtio_queue_rq:

        BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);

The queue's max is set here:
        blk_queue_max_segments(q, vblk->sg_elems-2);

Basically, what happens is that a bio is built up for the dm device
(which does not have the QUEUE_FLAG_NO_SG_MERGE flag set) using
bio_add_page.  That path will call into __blk_recalc_rq_segments, so
what you end up with is bi_phys_segments being much smaller than bi_vcnt
(and bi_vcnt grows beyond the maximum sg elements).  Then, when the bio
is submitted, it gets cloned.  When the cloned bio is submitted, it will
end up in blk_recount_segments, here:

        if (test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags))
                bio->bi_phys_segments = bio->bi_vcnt;

and now we've set bio->bi_phys_segments to a number that is beyond what
was registered as queue_max_segments by the driver.

The right way to fix this is to propagate the queue flag up the stack.

The rules for propagating the flag are simple:
- if the flag is set for any underlying device, it must be set for the
  upper device
- consequently, if the flag is not set for any underlying device, it
  should not be set for the upper device.
Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.16+

200612ec

02 8月, 2014 1 次提交

dm table: make dm_table_supports_discards static · a7ffb6a5

由 Mikulas Patocka 提交于 7月 10, 2014

The function dm_table_supports_discards is only called from
dm-table.c:dm_table_set_restrictions().  So move it above
dm_table_set_restrictions and make it static.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

a7ffb6a5

04 6月, 2014 1 次提交

dm: remove symbol export for dm_set_device_limits · 11f0431b

由 Mike Snitzer 提交于 6月 03, 2014

There is no need for code other than DM core to use dm_set_device_limits
so remove its EXPORT_SYMBOL_GPL. Also, cleanup a couple whitespace nits.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

11f0431b

28 3月, 2014 2 次提交

dm table: add dm_table_run_md_queue_async · 9974fa2c

由 Mike Snitzer 提交于 2月 28, 2014

Introduce dm_table_run_md_queue_async() to run the request_queue of the
mapped_device associated with a request-based DM table.

Also add dm_md_get_queue() wrapper to extract the request_queue from a
mapped_device.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NHannes Reinecke <hare@suse.de>
Reviewed-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>

9974fa2c

dm: make dm_table_alloc_md_mempools static · 473c36df

由 Mikulas Patocka 提交于 2月 13, 2014

Make the function dm_table_alloc_md_mempools static because it is not
called from another file.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

473c36df

07 1月, 2014 1 次提交

dm table: remove unused buggy code that extends the targets array · 57a2f238

由 Mikulas Patocka 提交于 11月 22, 2013

A device mapper table is allocated in the following way:
* The function dm_table_create is called, it gets the number of targets
  as an argument -- it allocates a targets array accordingly.
* For each target, we call dm_table_add_target.

If we add more targets than were specified in dm_table_create, the
function dm_table_add_target reallocates the targets array.  However,
this reallocation code is wrong - it moves the targets array to a new
location, while some target constructors hold pointers to the array in
the old location.

The following DM target drivers save the pointer to the target
structure, so they corrupt memory if the target array is moved:
multipath, raid, mirror, snapshot, stripe, switch, thin, verity.

Under normal circumstances, the reallocation function is not called
(because dm_table_create is called with the correct number of targets),
so the buggy reallocation code is not used.

Prior to the fix "dm table: fail dm_table_create on dm_round_up
overflow", the reallocation code could only be used in case the user
specifies too large a value in param->target_count, such as 0xffffffff.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

57a2f238

11 12月, 2013 1 次提交

dm table: fail dm_table_create on dm_round_up overflow · 5b2d0657

由 Mikulas Patocka 提交于 11月 22, 2013

The dm_round_up function may overflow to zero.  In this case,
dm_table_create() must fail rather than go on to allocate an empty array
with alloc_targets().

This fixes a possible memory corruption that could be caused by passing
too large a number in "param->target_count".
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

5b2d0657

10 11月, 2013 1 次提交

dm table: print error on preresume failure · 7833b08e

由 Mike Snitzer 提交于 10月 24, 2013

If preresume fails it is worth logging an error given that a device is
left suspended due to the failure.

This change was motivated by local preresume error logging that was
added to the cache target ("preresume failed").  Elevating this
target-agnostic context for the where the target-specific error occurred
relative to the DM core's callouts makes sense.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJoe Thornber <ejt@redhat.com>

7833b08e

01 11月, 2013 1 次提交

dm: allocate buffer for messages with small number of arguments using GFP_NOIO · f36afb39

由 Mikulas Patocka 提交于 10月 31, 2013

dm-mpath and dm-thin must process messages even if some device is
suspended, so we allocate argv buffer with GFP_NOIO. These messages have
a small fixed number of arguments.

On the other hand, dm-switch needs to process bulk data using messages
so excessive use of GFP_NOIO could cause trouble.

The patch also lowers the default number of arguments from 64 to 8, so
that there is smaller load on GFP_NOIO allocations.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: NAlasdair G Kergon <agk@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

f36afb39

06 9月, 2013 2 次提交

dm ioctl: increase granularity of type_lock when loading table · 00c4fc3b

由 Mike Snitzer 提交于 8月 27, 2013

Hold the mapped device's type_lock before calling populate_table() since
it is where the table's type is determined based on the specified
targets.  There is no need to allow concurrent table loads to race to
establish the table's targets or type.

This eliminates the need to grab the lock in dm_table_set_type().

Also verify that the type_lock is held in both dm_set_md_type() and
dm_get_md_type().
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

00c4fc3b

dm: allow error target to replace bio-based and request-based targets · 169e2cc2

由 Mike Snitzer 提交于 8月 22, 2013

It may be useful to switch a request-based table to the "error" target.
Enhance the DM core to allow a hybrid target_type which is capable of
handling either bios (via .map) or requests (via .map_rq).

Add a request-based map function (.map_rq) to the "error" target_type;
making it DM's first hybrid target.  Train dm_table_set_type() to prefer
the mapped device's established type (request-based or bio-based).  If
the mapped device doesn't have an established type default to making the
table with the hybrid target(s) bio-based.

Tested 'dmsetup wipe_table' to work on both bio-based and request-based
devices.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJoe Jin <joe.jin@oracle.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

169e2cc2

11 7月, 2013 1 次提交

dm: optimize use SRCU and RCU · 83d5e5b0

由 Mikulas Patocka 提交于 7月 10, 2013

This patch removes "io_lock" and "map_lock" in struct mapped_device and
"holders" in struct dm_table and replaces these mechanisms with
sleepable-rcu.

Previously, the code would call "dm_get_live_table" and "dm_table_put" to
get and release table. Now, the code is changed to call "dm_get_live_table"
and "dm_put_live_table". dm_get_live_table locks sleepable-rcu and
dm_put_live_table unlocks it.

dm_get_live_table_fast/dm_put_live_table_fast can be used instead of
dm_get_live_table/dm_put_live_table. These *_fast functions use
non-sleepable RCU, so the caller must not block between them.

If the code changes active or inactive dm table, it must call
dm_sync_table before destroying the old table.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

83d5e5b0

10 5月, 2013 1 次提交

dm table: fix write same support · dc019b21

由 Mike Snitzer 提交于 5月 10, 2013

If device_not_write_same_capable() returns true then the iterate_devices
loop in dm_table_supports_write_same() should return false.
Reported-by: NBharata B Rao <bharata.rao@gmail.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # v3.8+
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

dc019b21

02 3月, 2013 2 次提交

dm: rename request variables to bios · 55a62eef

由 Alasdair G Kergon 提交于 3月 01, 2013

Use 'bio' in the name of variables and functions that deal with
bios rather than 'request' to avoid confusion with the normal
block layer use of 'request'.

No functional changes.
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

55a62eef

dm table: remove superfluous variable reset · d2ce70a1

由 Wang Sheng-Hui 提交于 3月 01, 2013

If allocation fails, the local var *t is not used any more after kfree.
Don't need to reset it to NULL. Remove the unnecesary NULL set here.
Signed-off-by: NWang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

d2ce70a1

22 12月, 2012 3 次提交

dm: introduce per_bio_data · c0820cf5

由 Mikulas Patocka 提交于 12月 21, 2012

Introduce a field per_bio_data_size in struct dm_target.

Targets can set this field in the constructor. If a target sets this
field to a non-zero value, "per_bio_data_size" bytes of auxiliary data
are allocated for each bio submitted to the target. These data can be
used for any purpose by the target and help us improve performance by
removing some per-target mempools.

Per-bio data is accessed with dm_per_bio_data. The
argument data_size must be the same as the value per_bio_data_size in
dm_target.

If the target has a pointer to per_bio_data, it can get a pointer to
the bio with dm_bio_from_per_bio_data() function (data_size must be the
same as the value passed to dm_per_bio_data).
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

c0820cf5

dm: prepare to support WRITE SAME · d54eaa5a

由 Mike Snitzer 提交于 12月 21, 2012

Allow targets to opt in to WRITE SAME support by setting
'num_write_same_requests' in the dm_target structure.

A dm device will only advertise WRITE SAME support if all its
targets and all its underlying devices support it.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

d54eaa5a

dm: disable WRITE SAME · c1a94672

由 Mike Snitzer 提交于 12月 21, 2012

WRITE SAME bios are not yet handled correctly by device-mapper so
disable their use on device-mapper devices by setting
max_write_same_sectors to zero.

As an example, a ciphertext device is incompatible because the data
gets changed according to the location at which it written and so the
dm crypt target cannot support it.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Cc: Milan Broz <mbroz@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

c1a94672

27 9月, 2012 2 次提交

dm: retain table limits when swapping to new table with no devices · 3ae70656

由 Mike Snitzer 提交于 9月 26, 2012

Add a safety net that will re-use the DM device's existing limits in the
event that DM device has a temporary table that doesn't have any
component devices.  This is to reduce the chance that requests not
respecting the hardware limits will reach the device.

DM recalculates queue limits based only on devices which currently exist
in the table.  This creates a problem in the event all devices are
temporarily removed such as all paths being lost in multipath.  DM will
reset the limits to the maximum permissible, which can then assemble
requests which exceed the limits of the paths when the paths are
restored.  The request will fail the blk_rq_check_limits() test when
sent to a path with lower limits, and will be retried without end by
multipath.  This became a much bigger issue after v3.6 commit fe86cdce
("block: do not artificially constrain max_sectors for stacking
drivers").
Reported-by: NDavid Jeffery <djeffery@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

3ae70656

dm table: clear add_random unless all devices have it set · c3c4555e

由 Milan Broz 提交于 9月 26, 2012

Always clear QUEUE_FLAG_ADD_RANDOM if any underlying device does not
have it set. Otherwise devices with predictable characteristics may
contribute entropy.

QUEUE_FLAG_ADD_RANDOM specifies whether or not queue IO timings
contribute to the random pool.

For bio-based targets this flag is always 0 because such devices have no
real queue.

For request-based devices this flag was always set to 1 by default.

Now set it according to the flags on underlying devices. If there is at
least one device which should not contribute, set the flag to zero: If a
device, such as fast SSD storage, is not suitable for supplying entropy,
a request-based queue stacked over it will not be either.

Because the checking logic is exactly same as for the rotational flag,
share the iteration function with device_is_nonrot().
Signed-off-by: NMilan Broz <mbroz@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

c3c4555e

27 7月, 2012 1 次提交

dm: allow targets to request flushes regardless of underlying device support · 0e9c24ed

由 Joe Thornber 提交于 7月 27, 2012

Allow targets to override the 'supports flush' calculation.

Set 'flush_supported' if a target needs to receive flushes regardless of
whether or not its underlying devices have support.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

0e9c24ed

29 3月, 2012 2 次提交

dm: reject trailing characters in sccanf input · 31998ef1

由 Mikulas Patocka 提交于 3月 28, 2012

Device mapper uses sscanf to convert arguments to numbers. The problem is that
the way we use it ignores additional unmatched characters in the scanned string.

For example, this `if (sscanf(string, "%d", &number) == 1)' will match a number,
but also it will match number with some garbage appended, like "123abc".

As a result, device mapper accepts garbage after some numbers. For example
the command `dmsetup create vg1-new --table "0 16384 linear 254:1bla 34816bla"'
will pass without an error.

This patch fixes all sscanf uses in device mapper. It appends "%c" with
a pointer to a dummy character variable to every sscanf statement.

The construct `if (sscanf(string, "%d%c", &number, &dummy) == 1)' succeeds
only if string is a null-terminated number (optionally preceded by some
whitespace characters). If there is some character appended after the number,
sscanf matches "%c", writes the character to the dummy variable and returns 2.
We check the return value for 1 and consequently reject numbers with some
garbage appended.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Acked-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

31998ef1

dm table: simplify call to free_devices · 574ce07e

由 Hannes Reinecke 提交于 3月 28, 2012

free_devices in dm_table.c already uses list_for_each(), so we don't
need to check if the list is empty.
Signed-off-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

574ce07e

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功