提交 · 6cfa58573f9b4a543005baa002e137820504c7af · openanolis / cloud-kernel

06 9月, 2013 2 次提交

由 Mikulas Patocka 提交于 8月 16, 2013

Support the collection of I/O statistics on user-defined regions of
a DM device.  If no regions are defined no statistics are collected so
there isn't any performance impact.  Only bio-based DM devices are
currently supported.

Each user-defined region specifies a starting sector, length and step.
Individual statistics will be collected for each step-sized area within
the range specified.

The I/O statistics counters for each step-sized area of a region are
in the same format as /sys/block/*/stat or /proc/diskstats but extra
counters (12 and 13) are provided: total time spent reading and
writing in milliseconds.  All these counters may be accessed by sending
the @stats_print message to the appropriate DM device via dmsetup.

The creation of DM statistics will allocate memory via kmalloc or
fallback to using vmalloc space.  At most, 1/4 of the overall system
memory may be allocated by DM statistics.  The admin can see how much
memory is used by reading
/sys/module/dm_mod/parameters/stats_current_allocated_bytes

See Documentation/device-mapper/statistics.txt for more details.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

fd2ed4d2

dm: allow error target to replace bio-based and request-based targets · 169e2cc2

由 Mike Snitzer 提交于 8月 22, 2013

It may be useful to switch a request-based table to the "error" target.
Enhance the DM core to allow a hybrid target_type which is capable of
handling either bios (via .map) or requests (via .map_rq).

Add a request-based map function (.map_rq) to the "error" target_type;
making it DM's first hybrid target.  Train dm_table_set_type() to prefer
the mapped device's established type (request-based or bio-based).  If
the mapped device doesn't have an established type default to making the
table with the hybrid target(s) bio-based.

Tested 'dmsetup wipe_table' to work on both bio-based and request-based
devices.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJoe Jin <joe.jin@oracle.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

169e2cc2

22 12月, 2012 1 次提交

dm: introduce per_bio_data · c0820cf5

由 Mikulas Patocka 提交于 12月 21, 2012

Introduce a field per_bio_data_size in struct dm_target.

Targets can set this field in the constructor. If a target sets this
field to a non-zero value, "per_bio_data_size" bytes of auxiliary data
are allocated for each bio submitted to the target. These data can be
used for any purpose by the target and help us improve performance by
removing some per-target mempools.

Per-bio data is accessed with dm_per_bio_data. The
argument data_size must be the same as the value per_bio_data_size in
dm_target.

If the target has a pointer to per_bio_data, it can get a pointer to
the bio with dm_bio_from_per_bio_data() function (data_size must be the
same as the value passed to dm_per_bio_data).
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

c0820cf5

27 9月, 2012 1 次提交

dm: retain table limits when swapping to new table with no devices · 3ae70656

由 Mike Snitzer 提交于 9月 26, 2012

Add a safety net that will re-use the DM device's existing limits in the
event that DM device has a temporary table that doesn't have any
component devices.  This is to reduce the chance that requests not
respecting the hardware limits will reach the device.

DM recalculates queue limits based only on devices which currently exist
in the table.  This creates a problem in the event all devices are
temporarily removed such as all paths being lost in multipath.  DM will
reset the limits to the maximum permissible, which can then assemble
requests which exceed the limits of the paths when the paths are
restored.  The request will fail the blk_rq_check_limits() test when
sent to a path with lower limits, and will be retried without end by
multipath.  This became a much bigger issue after v3.6 commit fe86cdce
("block: do not artificially constrain max_sectors for stacking
drivers").
Reported-by: NDavid Jeffery <djeffery@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

3ae70656

27 7月, 2012 1 次提交

dm thin: commit before gathering status · 1f4e0ff0

由 Alasdair G Kergon 提交于 7月 27, 2012

Commit outstanding metadata before returning the status for a dm thin
pool so that the numbers reported are as up-to-date as possible.

The commit is not performed if the device is suspended or if
the DM_NOFLUSH_FLAG is supplied by userspace and passed to the target
through a new 'status_flags' parameter in the target's dm_status_fn.

The userspace dmsetup tool will support the --noflush flag with the
'dmsetup status' and 'dmsetup wait' commands from version 1.02.76
onwards.
Tested-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

1f4e0ff0

01 11月, 2011 1 次提交

dm table: add immutable feature · 36a0456f

由 Alasdair G Kergon 提交于 10月 31, 2011

Introduce DM_TARGET_IMMUTABLE to indicate that the target type cannot be mixed
with any other target type, and once loaded into a device, it cannot be
replaced with a table containing a different type.

The thin provisioning pool device will use this.
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

36a0456f

02 8月, 2011 1 次提交

dm: ignore merge_bvec for snapshots when safe · d5b9dd04

由 Mikulas Patocka 提交于 8月 02, 2011

Add a new flag DMF_MERGE_IS_OPTIONAL to struct mapped_device to indicate
whether the device can accept bios larger than the size its merge
function returns. When set, use this to send large bios to snapshots
which can split them if necessary. Snapshot I/O may be significantly
fragmented and this approach seems to improve peformance.

Before the patch, dm_set_device_limits restricted bio size to page size
if the underlying device had a merge function and the target didn't
provide a merge function. After the patch, dm_set_device_limits
restricts bio size to page size if the underlying device has a merge
function, doesn't have DMF_MERGE_IS_OPTIONAL flag and the target doesn't
provide a merge function.

The snapshot target can't provide a merge function because when the merge
function is called, it is impossible to determine where the bio will be
remapped. Previously this led us to impose a 4k limit, which we can
now remove if the snapshot store is located on a device without a merge
function. Together with another patch for optimizing full chunk writes,
it improves performance from 29MB/s to 40MB/s when writing to the
filesystem on snapshot store.

If the snapshot store is placed on a non-dm device with a merge function
(such as md-raid), device mapper still limits all bios to page size.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

d5b9dd04

17 3月, 2011 1 次提交

block: Require subsystems to explicitly allocate bio_set integrity mempool · a91a2785

由 Martin K. Petersen 提交于 3月 17, 2011

MD and DM create a new bio_set for every metadevice. Each bio_set has an
integrity mempool attached regardless of whether the metadevice is
capable of passing integrity metadata. This is a waste of memory.

Instead we defer the allocation decision to MD and DM since we know at
metadevice creation time whether integrity passthrough is needed or not.

Automatic integrity mempool allocation can then be removed from
bioset_create() and we make an explicit integrity allocation for the
fs_bio_set.
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reported-by: NZdenek Kabelac <zkabelac@redhat.com>
Acked-by: NMike Snitzer <snizer@redhat.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

a91a2785

12 8月, 2010 5 次提交

dm: linear support discard · 5ae89a87

由 Mike Snitzer 提交于 8月 12, 2010

Allow discards to be passed through to linear mappings if at least one
underlying device supports it.  Discards will be forwarded only to
devices that support them.

A target that supports discards should set num_discard_requests to
indicate how many times each discard request must be submitted to it.

Verify table's underlying devices support discards prior to setting the
associated DM device as capable of discards (via QUEUE_FLAG_DISCARD).
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Reviewed-by: NJoe Thornber <thornber@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

5ae89a87

dm ioctl: refactor dm_table_complete · 26803b9f

由 Will Drewry 提交于 8月 12, 2010

This change unifies the various checks and finalization that occurs on a
table prior to use.  By doing so, it allows table construction without
traversing the dm-ioctl interface.
Signed-off-by: NWill Drewry <wad@chromium.org>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

26803b9f

dm: do not initialise full request queue when bio based · 4a0b4ddf

由 Mike Snitzer 提交于 8月 12, 2010

Change bio-based mapped devices no longer to have a fully initialized
request_queue (request_fn, elevator, etc).  This means bio-based DM
devices no longer register elevator sysfs attributes ('iosched/' tree
or 'scheduler' other than "none").

In contrast, a request-based DM device will continue to have a full
request_queue and will register elevator sysfs attributes.  Therefore
a user can determine a DM device's type by checking if elevator sysfs
attributes exist.

First allocate a minimalist request_queue structure for a DM device
(needed for both bio and request-based DM).

Initialization of a full request_queue is deferred until it is known
that the DM device is request-based, at the end of the table load
sequence.

Factor DM device's request_queue initialization:
- common to both request-based and bio-based into dm_init_md_queue().
- specific to request-based into dm_init_request_based_queue().

The md->type_lock mutex is used to protect md->queue, in addition to
md->type, during table_load().

A DM device's first table_load will establish the immutable md->type.
But md->queue initialization, based on md->type, may fail at that time
(because blk_init_allocated_queue cannot allocate memory).  Therefore
any subsequent table_load must (re)try dm_setup_md_queue independently of
establishing md->type.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Acked-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

4a0b4ddf

dm ioctl: make bio or request based device type immutable · a5664dad

由 Mike Snitzer 提交于 8月 12, 2010

Determine whether a mapped device is bio-based or request-based when
loading its first (inactive) table and don't allow that to be changed
later.

This patch performs different device initialisation in each of the two
cases.  (We don't think it's necessary to add code to support changing
between the two types.)

Allowed md->type transitions:
  DM_TYPE_NONE to DM_TYPE_BIO_BASED
  DM_TYPE_NONE to DM_TYPE_REQUEST_BASED

We now prevent table_load from replacing the inactive table with a
conflicting type of table even after an explicit table_clear.

Introduce 'type_lock' into the struct mapped_device to protect md->type
and to prepare for the next patch that will change the queue
initialization and allocate memory while md->type_lock is held.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Acked-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

 drivers/md/dm-ioctl.c    |   15 +++++++++++++++
 drivers/md/dm.c          |   37 ++++++++++++++++++++++++++++++-------
 drivers/md/dm.h          |    5 +++++
 include/linux/dm-ioctl.h |    4 ++--
 4 files changed, 52 insertions(+), 9 deletions(-)

a5664dad

dm: separate device deletion from dm_put · 3f77316d

由 Kiyoshi Ueda 提交于 8月 12, 2010

This patch separates the device deletion code from dm_put()
to make sure the deletion happens in the process context.

By this patch, device deletion always occurs in an ioctl (process)
context and dm_put() can be called in interrupt context.
As a result, the request-based dm's bad dm_put() usage pointed out
by Mikulas below disappears.
    http://marc.info/?l=dm-devel&m=126699981019735&w=2

Without this patch, I confirmed there is a case to crash the system:
    dm_put() => dm_table_destroy() => vfree() => BUG_ON(in_interrupt())

Some more backgrounds and details:
In request-based dm, a device opener can remove a mapped_device
while the last request is still completing, because bios in the last
request complete first and then the device opener can close and remove
the mapped_device before the last request completes:
  CPU0                                          CPU1
  =================================================================
  <<INTERRUPT>>
  blk_end_request_all(clone_rq)
    blk_update_request(clone_rq)
      bio_endio(clone_bio) == end_clone_bio
        blk_update_request(orig_rq)
          bio_endio(orig_bio)
                                                <<I/O completed>>
                                                dm_blk_close()
                                                dev_remove()
                                                  dm_put(md)
                                                    <<Free md>>
   blk_finish_request(clone_rq)
     ....
     dm_end_request(clone_rq)
       free_rq_clone(clone_rq)
       blk_end_request_all(orig_rq)
       rq_completed(md)

So request-based dm used dm_get()/dm_put() to hold md for each I/O
until its request completion handling is fully done.
However, the final dm_put() can call the device deletion code which
must not be run in interrupt context and may cause kernel panic.

To solve the problem, this patch moves the device deletion code,
dm_destroy(), to predetermined places that is actually deleting
the mapped_device in ioctl (process) context, and changes dm_put()
just to decrement the reference count of the mapped_device.
By this change, dm_put() can be used in any context and the symmetric
model below is introduced:
    dm_create():  create a mapped_device
    dm_destroy(): destroy a mapped_device
    dm_get():     increment the reference count of a mapped_device
    dm_put():     decrement the reference count of a mapped_device

dm_destroy() waits for all references of the mapped_device to disappear,
then deletes the mapped_device.

dm_destroy() uses active waiting with msleep(1), since deleting
the mapped_device isn't performance-critical task.
And since at this point, nobody opens the mapped_device and no new
reference will be taken, the pending counts are just for racing
completing activity and will eventually decrease to zero.

For the unlikely case of the forced module unload, dm_destroy_immediate(),
which doesn't wait and forcibly deletes the mapped_device, is also
introduced and used in dm_hash_remove_all().  Otherwise, "rmmod -f"
may be stuck and never return.
And now, because the mapped_device is deleted at this point, subsequent
accesses to the mapped_device may cause NULL pointer references.

Cc: stable@kernel.org
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

3f77316d

06 3月, 2010 1 次提交

dm ioctl: introduce flag indicating uevent was generated · 3abf85b5

由 Peter Rajnoha 提交于 3月 06, 2010

Set a new DM_UEVENT_GENERATED_FLAG when returning from ioctls to
indicate that a uevent was actually generated.  This tells the userspace
caller that it may need to wait for the event to be processed.
Signed-off-by: NPeter Rajnoha <prajnoha@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

3abf85b5

11 12月, 2009 3 次提交

dm: rename dm_suspended to dm_suspended_md · 4f186f8b

由 Kiyoshi Ueda 提交于 12月 10, 2009

This patch renames dm_suspended() to dm_suspended_md() and
keeps it internal to dm.
No functional change.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

4f186f8b

dm: add dm_deleting_md function · 432a212c

由 Mike Anderson 提交于 12月 10, 2009

Add dm_deleting_md to check whether or not a given mapped
device is currently being deleted.
Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

432a212c

dm io: use slab for struct io · 952b3557

由 Mikulas Patocka 提交于 12月 10, 2009

Allocate "struct io" from a slab.

This patch changes dm-io, so that "struct io" is allocated from a slab cache.
It used to be allocated with kmalloc. Allocating from a slab will be needed
for the next patch, because it requires a special alignment of "struct io"
and kmalloc cannot meet this alignment.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

952b3557

24 7月, 2009 1 次提交

dm: remove queue next_ordered workaround for barriers · a732c207

由 Mike Snitzer 提交于 7月 23, 2009

This patch removes DM's bio-based vs request-based conditional setting
of next_ordered.  For bio-based DM the next_ordered check is no longer a
concern (as that check is now in the __make_request path).  For
request-based DM the default of QUEUE_ORDERED_NONE is now appropriate.

bio-based DM was changed to work-around the previously misplaced
next_ordered check with this commit:
99360b4c

request-based DM does not yet support barriers but reacted to the above
bio-based DM change with this commit:
5d67aa23

The above changes are no longer needed given Neil Brown's recent fix to
put the next_ordered check in the __make_request path:
db64f680Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: NeilBrown <neilb@suse.de>
Acked-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Acked-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

a732c207

22 6月, 2009 5 次提交

dm: do not set QUEUE_ORDERED_DRAIN if request based · 5d67aa23

由 Kiyoshi Ueda 提交于 6月 22, 2009

Request-based dm doesn't have barrier support yet.
So we need to set QUEUE_ORDERED_DRAIN only for bio-based dm.
Since the device type is decided at the first table loading time,
the flag set is deferred until then.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

5d67aa23

dm: enable request based option · e6ee8c0b

由 Kiyoshi Ueda 提交于 6月 22, 2009

This patch enables request-based dm.

o Request-based dm and bio-based dm coexist, since there are
  some target drivers which are more fitting to bio-based dm.
  Also, there are other bio-based devices in the kernel
  (e.g. md, loop).
  Since bio-based device can't receive struct request,
  there are some limitations on device stacking between
  bio-based and request-based.

                     type of underlying device
                   bio-based      request-based
   ----------------------------------------------
    bio-based         OK                OK
    request-based     --                OK

  The device type is recognized by the queue flag in the kernel,
  so dm follows that.

o The type of a dm device is decided at the first table binding time.
  Once the type of a dm device is decided, the type can't be changed.

o Mempool allocations are deferred to at the table loading time, since
  mempools for request-based dm are different from those for bio-based
  dm and needed mempool type is fixed by the type of table.

o Currently, request-based dm supports only tables that have a single
  target.  To support multiple targets, we need to support request
  splitting or prevent bio/request from spanning multiple targets.
  The former needs lots of changes in the block layer, and the latter
  needs that all target drivers support merge() function.
  Both will take a time.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

e6ee8c0b

dm: prepare for request based option · cec47e3d

由 Kiyoshi Ueda 提交于 6月 22, 2009

This patch adds core functions for request-based dm.

When struct mapped device (md) is initialized, md->queue has
an I/O scheduler and the following functions are used for
request-based dm as the queue functions:
    make_request_fn: dm_make_request()
    pref_fn:         dm_prep_fn()
    request_fn:      dm_request_fn()
    softirq_done_fn: dm_softirq_done()
    lld_busy_fn:     dm_lld_busy()
Actual initializations are done in another patch (PATCH 2).

Below is a brief summary of how request-based dm behaves, including:
  - making request from bio
  - cloning, mapping and dispatching request
  - completing request and bio
  - suspending md
  - resuming md

  bio to request
  ==============
  md->queue->make_request_fn() (dm_make_request()) calls __make_request()
  for a bio submitted to the md.
  Then, the bio is kept in the queue as a new request or merged into
  another request in the queue if possible.

  Cloning and Mapping
  ===================
  Cloning and mapping are done in md->queue->request_fn() (dm_request_fn()),
  when requests are dispatched after they are sorted by the I/O scheduler.

  dm_request_fn() checks busy state of underlying devices using
  target's busy() function and stops dispatching requests to keep them
  on the dm device's queue if busy.
  It helps better I/O merging, since no merge is done for a request
  once it is dispatched to underlying devices.

  Actual cloning and mapping are done in dm_prep_fn() and map_request()
  called from dm_request_fn().
  dm_prep_fn() clones not only request but also bios of the request
  so that dm can hold bio completion in error cases and prevent
  the bio submitter from noticing the error.
  (See the "Completion" section below for details.)

  After the cloning, the clone is mapped by target's map_rq() function
    and inserted to underlying device's queue using
    blk_insert_cloned_request().

  Completion
  ==========
  Request completion can be hooked by rq->end_io(), but then, all bios
  in the request will have been completed even error cases, and the bio
  submitter will have noticed the error.
  To prevent the bio completion in error cases, request-based dm clones
  both bio and request and hooks both bio->bi_end_io() and rq->end_io():
      bio->bi_end_io(): end_clone_bio()
      rq->end_io():     end_clone_request()

  Summary of the request completion flow is below:
  blk_end_request() for a clone request
    => blk_update_request()
       => bio->bi_end_io() == end_clone_bio() for each clone bio
          => Free the clone bio
          => Success: Complete the original bio (blk_update_request())
             Error:   Don't complete the original bio
    => blk_finish_request()
       => rq->end_io() == end_clone_request()
          => blk_complete_request()
             => dm_softirq_done()
                => Free the clone request
                => Success: Complete the original request (blk_end_request())
                   Error:   Requeue the original request

  end_clone_bio() completes the original request on the size of
  the original bio in successful cases.
  Even if all bios in the original request are completed by that
  completion, the original request must not be completed yet to keep
  the ordering of request completion for the stacking.
  So end_clone_bio() uses blk_update_request() instead of
  blk_end_request().
  In error cases, end_clone_bio() doesn't complete the original bio.
  It just frees the cloned bio and gives over the error handling to
  end_clone_request().

  end_clone_request(), which is called with queue lock held, completes
  the clone request and the original request in a softirq context
  (dm_softirq_done()), which has no queue lock, to avoid a deadlock
  issue on submission of another request during the completion:
      - The submitted request may be mapped to the same device
      - Request submission requires queue lock, but the queue lock
        has been held by itself and it doesn't know that

  The clone request has no clone bio when dm_softirq_done() is called.
  So target drivers can't resubmit it again even error cases.
  Instead, they can ask dm core for requeueing and remapping
  the original request in that cases.

  suspend
  =======
  Request-based dm uses stopping md->queue as suspend of the md.
  For noflush suspend, just stops md->queue.

  For flush suspend, inserts a marker request to the tail of md->queue.
  And dispatches all requests in md->queue until the marker comes to
  the front of md->queue.  Then, stops dispatching request and waits
  for the all dispatched requests to complete.
  After that, completes the marker request, stops md->queue and
  wake up the waiter on the suspend queue, md->wait.

  resume
  ======
  Starts md->queue.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

cec47e3d

dm: calculate queue limits during resume not load · 754c5fc7

由 Mike Snitzer 提交于 6月 22, 2009

Currently, device-mapper maintains a separate instance of 'struct
queue_limits' for each table of each device.  When the configuration of
a device is to be changed, first its table is loaded and this structure
is populated, then the device is 'resumed' and the calculated
queue_limits are applied.

This places restrictions on how userspace may process related devices,
where it is often advantageous to 'load' tables for several devices
at once before 'resuming' them together.  As the new queue_limits
only take effect after the 'resume', if they are changing and one
device uses another, the latter must be 'resumed' before the former
may be 'loaded'.

This patch moves the calculation of these queue_limits out of
the 'load' operation into 'resume'.  Since we are no longer
pre-calculating this struct, we no longer need to maintain copies
within our dm structs.

dm_set_device_limits() now passes the 'start' of the device's
data area (aka pe_start) as the 'offset' to blk_stack_limits().

init_valid_queue_limits() is replaced by blk_set_default_limits().
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: martin.petersen@oracle.com
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

754c5fc7

dm ioctl: support cookies for udev · 60935eb2

由 Milan Broz 提交于 6月 22, 2009

Add support for passing a 32 bit "cookie" into the kernel with the
DM_SUSPEND, DM_DEV_RENAME and DM_DEV_REMOVE ioctls. The (unsigned)
value of this cookie is returned to userspace alongside the uevents
issued by these ioctls in the variable DM_COOKIE.

This means the userspace process issuing these ioctls can be notified
by udev after udev has completed any actions triggered.

To minimise the interface extension, we pass the cookie into the
kernel in the event_nr field which is otherwise unused when calling
these ioctls. Incrementing the version number allows userspace to
determine in advance whether or not the kernel supports the cookie.
If the kernel does support this but userspace does not, there should
be no impact as the new variable will just get ignored.
Signed-off-by: NMilan Broz <mbroz@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

60935eb2

09 4月, 2009 1 次提交

dm: remove limited barrier support · 692d0eb9

由 Mikulas Patocka 提交于 4月 09, 2009

Prepare for full barrier implementation: first remove the restricted support.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

692d0eb9

03 4月, 2009 1 次提交

dm target: remove struct tt_internal · 45194e4f

由 Cheng Renquan 提交于 4月 02, 2009

The tt_internal is really just a list_head to manage registered target_type
in a double linked list,

Here embed the list_head into target_type directly,
1. to avoid kmalloc/kfree;
2. then tt_internal is really unneeded;

Cc: stable@kernel.org
Signed-off-by: NCheng Renquan <crquan@gmail.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
Reviewed-by: NAlasdair G Kergon <agk@redhat.com>

45194e4f

06 1月, 2009 3 次提交

dm: add name and uuid to sysfs · 784aae73

由 Milan Broz 提交于 1月 06, 2009

Implement simple read-only sysfs entry for device-mapper block device.

This patch adds a simple sysfs directory named "dm" under block device
properties and implements
	- name attribute (string containing mapped device name)
	- uuid attribute (string containing UUID, or empty string if not set)

The kobject is embedded in mapped_device struct, so no additional
memory allocation is needed for initializing sysfs entry.

During the processing of sysfs attribute we need to lock mapped device
which is done by a new function dm_get_from_kobj, which returns the md
associated with kobject and increases the usage count.

Each 'show attribute' function is responsible for its own locking.
Signed-off-by: NMilan Broz <mbroz@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

784aae73

dm table: rework reference counting · d5816876

由 Mikulas Patocka 提交于 1月 06, 2009

Rework table reference counting.

The existing code uses a reference counter. When the last reference is
dropped and the counter reaches zero, the table destructor is called.
Table reference counters are acquired/released from upcalls from other
kernel code (dm_any_congested, dm_merge_bvec, dm_unplug_all).
If the reference counter reaches zero in one of the upcalls, the table
destructor is called from almost random kernel code.

This leads to various problems:
* dm_any_congested being called under a spinlock, which calls the
  destructor, which calls some sleeping function.
* the destructor attempting to take a lock that is already taken by the
  same process.
* stale reference from some other kernel code keeps the table
  constructed, which keeps some devices open, even after successful
  return from "dmsetup remove". This can confuse lvm and prevent closing
  of underlying devices or reusing device minor numbers.

The patch changes reference counting so that the table destructor can be
called only at predetermined places.

The table has always exactly one reference from either mapped_device->map
or hash_cell->new_map. After this patch, this reference is not counted
in table->holders.  A pair of dm_create_table/dm_destroy_table functions
is used for table creation/destruction.

Temporary references from the other code increase table->holders. A pair
of dm_table_get/dm_table_put functions is used to manipulate it.

When the table is about to be destroyed, we wait for table->holders to
reach 0. Then, we call the table destructor.  We use active waiting with
msleep(1), because the situation happens rarely (to one user in 5 years)
and removing the device isn't performance-critical task: the user doesn't
care if it takes one tick more or not.

This way, the destructor is called only at specific points
(dm_table_destroy function) and the above problems associated with lazy
destruction can't happen.

Finally remove the temporary protection added to dm_any_congested().
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

d5816876

dm: support barriers on simple devices · ab4c1424

由 Andi Kleen 提交于 1月 06, 2009

Implement barrier support for single device DM devices

This patch implements barrier support in DM for the common case of dm linear
just remapping a single underlying device. In this case we can safely
pass the barrier through because there can be no reordering between
devices.

 NB. Any DM device might cease to support barriers if it gets
     reconfigured so code must continue to allow for a possible
     -EOPNOTSUPP on every barrier bio submitted.  - agk
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

ab4c1424

22 10月, 2008 1 次提交

dm: publish array_too_big · d63a5ce3

由 Mikulas Patocka 提交于 10月 21, 2008

Move array_too_big to include/linux/device-mapper.h because it is
used by targets.

Remove the test from dm-raid1 as the number of mirror legs is limited
such that it can never fail.  (Even for stripes it seems rather
unlikely.)
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

d63a5ce3

10 10月, 2008 4 次提交

dm: publish dm_vcalloc · 54160904

由 Mikulas Patocka 提交于 10月 10, 2008

Publish dm_vcalloc in include/linux/device-mapper.h because this function is
used by targets.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

54160904

dm: publish dm_table_unplug_all · ea0ec640

由 Mikulas Patocka 提交于 10月 10, 2008

Publish dm_table_unplug_all in include/linux/device-mapper.h because this
function is used by targets.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

ea0ec640

dm: publish dm_get_mapinfo · 89343da0

由 Mikulas Patocka 提交于 10月 10, 2008

Publish dm_get_mapinfo in include/linux/device-mapper.h because this function
is used by targets.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

89343da0

dm: export struct dm_dev · 82b1519b

由 Mikulas Patocka 提交于 10月 10, 2008

Split struct dm_dev in two and publish the part that other targets need in
include/linux/device-mapper.h.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

82b1519b

21 7月, 2008 1 次提交

dm log: make dm_dirty_log init and exit static · c8da2f8d

由 Adrian Bunk 提交于 7月 21, 2008

dm_dirty_log_{init,exit}() can now become static.
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

c8da2f8d

25 4月, 2008 3 次提交

dm: expose macros · 0da336e5

由 Alasdair G Kergon 提交于 4月 24, 2008

Make dm.h macros and inlines available in include/linux/device-mapper.h
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

0da336e5

dm kcopyd: remove redundant client counting · 945fa4d2

由 Mikulas Patocka 提交于 4月 24, 2008

Remove client counting code that is no longer needed.

Initialization and destruction is made globally from dm_init and dm_exit and is
not based on client counts. Initialization allocates only one empty slab cache,
so there is no negative impact from performing the initialization always,
regardless of whether some client uses kcopyd or not.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

945fa4d2

dm log: clean interface · 416cd17b

由 Heinz Mauelshagen 提交于 4月 24, 2008

Clean up the dm-log interface to prepare for publishing it in include/linux.
Signed-off-by: NHeinz Mauelshagen <hjm@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

416cd17b

21 12月, 2007 2 次提交

dm: trigger change uevent on rename · 69267a30

由 Alasdair G Kergon 提交于 12月 13, 2007

Insert a missing KOBJ_CHANGE notification when a device is renamed.

Cc: Scott James Remnant <scott@ubuntu.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

69267a30

dm: table detect io beyond device · 512875bd

由 Jun'ichi Nomura 提交于 12月 13, 2007

This patch fixes a panic on shrinking a DM device if there is
outstanding I/O to the part of the device that is being removed.
(Normally this doesn't happen - a filesystem would be resized first,
for example.)

The bug is that __clone_and_map() assumes dm_table_find_target()
always returns a valid pointer.  It may fail if a bio arrives from the
block layer but its target sector is no longer included in the DM
btree.

This patch appends an empty entry to table->targets[] which will
be returned by a lookup beyond the end of the device.

After calling dm_table_find_target(), __clone_and_map() and target_message()
check for this condition using
dm_target_is_valid().

Sample test script to trigger oops:

512875bd

16 10月, 2007 1 次提交

block: convert blkdev_issue_flush() to use empty barriers · fd5d8062

由 Jens Axboe 提交于 10月 16, 2007

Then we can get rid of ->issue_flush_fn() and all the driver private
implementations of that.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

fd5d8062

openanolis / cloud-kernel 11 个月 前同步成功

openanolis / cloud-kernel
11 个月前同步成功