Commits · 0ac55489d9e3898987b2ae305844cf2af86e6b8d · bug2833 / cloud-kernel

27 Jul, 2012 2 commits

dm: use bool bitfields in struct dm_target · 0ac55489

Authored by Alasdair G Kergon on Jul 27, 2012

Use boolean bit fields for flags in struct dm_target.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

0ac55489
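For readers unfamiliar with the idiom, here is a minimal sketch of boolean bit fields in C. The struct `example_target`, the field `flag_a`, and the helper `example_set_flags` are hypothetical names made up for illustration; only `discard_zeroes_data_unsupported` is a flag name that appears in this log, and this is not the kernel's definition of struct dm_target.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a target structure; not the kernel's layout. */
struct example_target {
    unsigned int len;
    /* Each flag takes a single bit instead of a whole integer field. */
    bool flag_a:1;
    bool discard_zeroes_data_unsupported:1;
};

/* Set one flag, clear the other, and report whether both took effect. */
static bool example_set_flags(struct example_target *t)
{
    t->flag_a = false;
    t->discard_zeroes_data_unsupported = true;
    return t->discard_zeroes_data_unsupported && !t->flag_a;
}
```

Bit fields of type bool read and write like ordinary booleans, so callers do not need masks or shifts.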

dm: support non power of two target max_io_len · 542f9038

Authored by Mike Snitzer on Jul 27, 2012

Remove the restriction that limits a target's specified maximum incoming
I/O size to be a power of 2.

Rename this setting from 'split_io' to the less-ambiguous 'max_io_len'.
Change it from sector_t to uint32_t, which is plenty big enough, and
introduce a wrapper function dm_set_target_max_io_len() to set it.
Use sector_div() to process it now that it is not necessarily a power of 2.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

542f9038
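The arithmetic behind the commit above can be sketched in userspace C. The function names `remaining_pow2` and `remaining_any` are hypothetical; the second uses the plain `%` operator where kernel code would use sector_div(), and this is an illustration of the idea rather than the actual dm code.

```c
#include <stdint.h>

/* Sectors left before the next boundary; correct only if len is a power of 2. */
static uint64_t remaining_pow2(uint64_t sector, uint32_t len)
{
    return len - (sector & (len - 1));
}

/* Same quantity for any non-zero len, using a true remainder. */
static uint64_t remaining_any(uint64_t sector, uint32_t len)
{
    return len - (sector % len);
}
```

With len = 8 both functions agree, but with len = 6 the mask version gives a wrong answer, which is why the mask has to be replaced by a division once arbitrary lengths are allowed.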

20 Jul, 2012 2 commits

dm raid1: set discard_zeroes_data_unsupported · 7c8d3a42

Authored by Mikulas Patocka on Jul 20, 2012

We can't guarantee that REQ_DISCARD on dm-mirror zeroes the data even if
the underlying disks support zero on discard.  So this patch sets
ti->discard_zeroes_data_unsupported.

For example, if the mirror is in the process of resynchronizing, it may
happen that kcopyd reads a piece of data, then discard is sent on the
same area and then kcopyd writes the piece of data to another leg.
Consequently, the data is not zeroed.

The flag was made available by commit 983c7db3
(dm crypt: always disable discard_zeroes_data).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

7c8d3a42

dm raid1: fix crash with mirror recovery and discard · 751f188d

Authored by Mikulas Patocka on Jul 20, 2012

This patch fixes a crash when a discard request is sent during mirror
recovery.

Firstly, some background.  Generally, the following sequence happens during
mirror synchronization:
- function do_recovery is called
- do_recovery calls dm_rh_recovery_prepare
- dm_rh_recovery_prepare uses a semaphore to limit the number of
  simultaneously recovered regions (by default the semaphore value is 1,
  so only one region at a time is recovered)
- dm_rh_recovery_prepare calls __rh_recovery_prepare,
  __rh_recovery_prepare asks the log driver for the next region to
  recover. Then, it sets the region state to DM_RH_RECOVERING. If there
  are no pending I/Os on this region, the region is added to
  quiesced_regions list. If there are pending I/Os, the region is not
  added to any list. It is added to the quiesced_regions list later (by
  dm_rh_dec function) when all I/Os finish.
- when the region is on quiesced_regions list, there are no I/Os in
  flight on this region. The region is popped from the list in
  dm_rh_recovery_start function. Then, a kcopyd job is started in the
  recover function.
- when the kcopyd job finishes, recovery_complete is called. It calls
  dm_rh_recovery_end. dm_rh_recovery_end adds the region to
  recovered_regions or failed_recovered_regions list (depending on
  whether the copy operation was successful or not).

The above mechanism assumes that if the region is in DM_RH_RECOVERING
state, no new I/Os are started on this region. When I/O is started,
dm_rh_inc_pending is called, which increases reg->pending count. When
I/O is finished, dm_rh_dec is called. It decreases reg->pending count.
If the count is zero and the region was in DM_RH_RECOVERING state,
dm_rh_dec adds it to the quiesced_regions list.

Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is
in DM_RH_RECOVERING state, it could be added to quiesced_regions list
multiple times or it could be added to this list when kcopyd is copying
data (it is assumed that the region is not on any list while kcopyd does
its jobs). This results in memory corruption and crash.

There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests
do not belong to any region, so they are always added to the sync list
in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH
requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH
requests. These bypasses avoid the crash possibility described above.

These bypasses were improperly implemented for REQ_DISCARD when
the mirror target gained discard support in commit
5fc2ffea (dm raid1: support discard).

In do_writes, REQ_DISCARD requests are always added to the sync queue and
immediately dispatched (even if the region is in DM_RH_RECOVERING).  However,
dm_rh_inc and dm_rh_dec are called for REQ_DISCARD requests.  So it violates the
rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list
corruption described above.

This patch changes it so that REQ_DISCARD requests follow the same path
as REQ_FLUSH. This avoids the crash.

Reference: https://bugzilla.redhat.com/837607
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

751f188d
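The double-queueing problem described in the commit above can be modelled with a small userspace sketch. All names here (`struct region`, `region_inc_pending`, `region_dec`, `region_recovery_prepare`, `simulate_discard`) are hypothetical simplifications, and a counter stands in for the real quiesced_regions list; this is not the dm-region-hash code.

```c
#include <stdbool.h>

enum rh_state { RH_CLEAN, RH_RECOVERING };

struct region {
    enum rh_state state;
    int pending;         /* in-flight I/Os counted against the region */
    int times_quiesced;  /* how often the region was put on the quiesced list */
};

static void region_inc_pending(struct region *r)
{
    r->pending++;
}

/* When the last I/O finishes on a recovering region, queue it. */
static void region_dec(struct region *r)
{
    if (--r->pending == 0 && r->state == RH_RECOVERING)
        r->times_quiesced++;
}

/* Start recovery; queue the region immediately if nothing is pending. */
static void region_recovery_prepare(struct region *r)
{
    r->state = RH_RECOVERING;
    if (r->pending == 0)
        r->times_quiesced++;
}

/* Returns how many times the region got queued when one request arrives
 * during recovery, with or without the pending-count bypass. */
static int simulate_discard(bool bypass_counting)
{
    struct region r = { RH_CLEAN, 0, 0 };

    region_recovery_prepare(&r);      /* queued once, as intended */
    if (!bypass_counting) {
        region_inc_pending(&r);
        region_dec(&r);               /* queued a second time */
    }
    return r.times_quiesced;
}
```

In this model, counting the request queues the region twice, while bypassing the count (the path REQ_FLUSH already takes) queues it exactly once.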

29 Mar, 2012 1 commit

dm: reject trailing characters in sccanf input · 31998ef1

Authored by Mikulas Patocka on Mar 28, 2012

Device mapper uses sscanf to convert arguments to numbers. The problem is that
the way we use it ignores additional unmatched characters in the scanned string.

For example, this `if (sscanf(string, "%d", &number) == 1)' will match a number,
but also it will match number with some garbage appended, like "123abc".

As a result, device mapper accepts garbage after some numbers. For example
the command `dmsetup create vg1-new --table "0 16384 linear 254:1bla 34816bla"'
will pass without an error.

This patch fixes all sscanf uses in device mapper. It appends "%c" with
a pointer to a dummy character variable to every sscanf statement.

The construct `if (sscanf(string, "%d%c", &number, &dummy) == 1)' succeeds
only if string is a null-terminated number (optionally preceded by some
whitespace characters). If there is some character appended after the number,
sscanf matches "%c", writes the character to the dummy variable and returns 2.
We check the return value for 1 and consequently reject numbers with some
garbage appended.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

31998ef1
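The "%c" technique from the commit above can be tried in a few lines of standard C. The helper names `parse_loose` and `parse_strict` are hypothetical and exist only for this sketch; the sscanf format strings are the ones quoted in the commit message.

```c
#include <stdio.h>

/* Accepts a number even if garbage follows it, e.g. "123abc". */
static int parse_loose(const char *s, int *out)
{
    return sscanf(s, "%d", out) == 1;
}

/* Accepts a number only if nothing follows it: a trailing character would
 * be consumed by "%c", making sscanf return 2 instead of 1. */
static int parse_strict(const char *s, int *out)
{
    char dummy;

    return sscanf(s, "%d%c", out, &dummy) == 1;
}
```

Note that the strict form also rejects a trailing newline or space, since any extra character matches "%c".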

29 May, 2011 3 commits

dm kcopyd: return client directly and not through a pointer · fa34ce73

Authored by Mikulas Patocka on May 29, 2011

Return client directly from dm_kcopyd_client_create, not through a
parameter, making it consistent with dm_io_client_create.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

fa34ce73

dm kcopyd: reserve fewer pages · 5f43ba29

Authored by Mikulas Patocka on May 29, 2011

Reserve just the minimum of pages needed to process one job.

Because we allocate pages from page allocator, we don't need to reserve
a large number of pages.  The maximum job size is SUB_JOB_SIZE and we
calculate the number of reserved pages based on this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

5f43ba29

dm io: use fixed initial mempool size · bda8efec

Authored by Mikulas Patocka on May 29, 2011

Replace the arbitrary calculation of an initial io struct mempool size
with a constant.

The code calculated the number of reserved structures based on the request
size and used a "magic" multiplication constant of 4.  This patch changes
it to reserve a fixed number - itself still chosen quite arbitrarily.
Further testing might show if there is a better number to choose.

Note that if there is no memory pressure, we can still allocate an
arbitrary number of "struct io" structures.  One structure is enough to
process the whole request.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

bda8efec

10 Mar, 2011 1 commit

block: remove per-queue plugging · 7eaceacc

Authored by Jens Axboe on Mar 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7eaceacc

14 Jan, 2011 4 commits

dm: use non reentrant workqueues if equivalent · 9c4376de

Authored by Tejun Heo on Jan 13, 2011

kmirrord_wq, kcopyd_work and md->wq are created per dm instance and
serve only a single work item from the dm instance, so non-reentrant
workqueues would provide the same ordering guarantees as ordered ones
while allowing CPU affinity and use of the workqueues for other
purposes.  Switch them to non-reentrant workqueues.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

9c4376de

dm: convert workqueues to alloc_ordered · 4d4d66ab

Authored by Tejun Heo on Jan 13, 2011

Convert all create[_singlethread]_workqueue() users to the new
alloc[_ordered]_workqueue().  This conversion is mechanical and
doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

4d4d66ab

dm: dont use flush_scheduled_work · d5ffa387

Authored by Tejun Heo on Jan 13, 2011

flush_scheduled_work() is being deprecated.  Flush the used work
directly instead.  In all dm targets, the only work which uses
system_wq is ->trigger_event.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

d5ffa387

dm raid1: support discard · 5fc2ffea

Authored by Mike Snitzer on Jan 13, 2011

Enable discard support in the DM mirror target.
Also change an existing use of 'bvec' to 'addr' in the union.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

5fc2ffea

10 Sep, 2010 1 commit

dm: implement REQ_FLUSH/FUA support for bio-based dm · d87f4c14

Authored by Tejun Heo on Sep 03, 2010

This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
now deprecated REQ_HARDBARRIER.

* -EOPNOTSUPP handling logic dropped.

* Preflush is handled as before but postflush is dropped and replaced
  with passing down REQ_FUA to member request_queues.  This replaces
  one array wide cache flush w/ member specific FUA writes.

* __split_and_process_bio() now calls __clone_and_map_flush() directly
  for flushes and guarantees all FLUSH bio's going to targets are zero
  length.

* It's now guaranteed that all FLUSH bio's which are passed onto dm
  targets are zero length.  bio_empty_barrier() tests are replaced
  with REQ_FLUSH tests.

* Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.

* Dropped unlikely() around REQ_FLUSH tests.  Flushes are not unlikely
  enough to be marked with unlikely().

* Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue
  doesn't support cache flushing.  Advertise REQ_FLUSH | REQ_FUA
  capability.

* Request based dm isn't converted yet.  dm_init_request_based_queue()
  resets flush support to 0 for now.  To avoid disturbing request
  based dm code, dm->flush_error is added for bio based dm while
  request based dm continues to use dm->barrier_error.

Lightly tested linear, stripe, raid1, snap and crypt targets.  Please
proceed with caution as I'm not familiar with the code base.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: dm-devel@redhat.com
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

d87f4c14

12 Aug, 2010 1 commit

dm: use dm_target_offset macro · b441a262

Authored by Alasdair G Kergon on Aug 12, 2010

Use new dm_target_offset() macro to avoid most references to ti->begin
in dm targets.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

b441a262

08 Aug, 2010 1 commit

block: unify flags for struct bio and struct request · 7b6d91da

Authored by Christoph Hellwig on Aug 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7b6d91da

06 Mar, 2010 3 commits

dm raid1: fix deadlock when suspending failed device · f0703040

Authored by Takahiro Yasui on Mar 06, 2010

To prevent deadlock, bios in the hold list should be flushed before
dm_rh_stop_recovery() is called in mirror_suspend().

The recovery can't start because there are pending bios and therefore
dm_rh_stop_recovery deadlocks.

When there are pending bios in the hold list, the recovery waits for
the completion of the bios after recovery_count is acquired.
The recovery_count is released when the recovery finished, however,
the bios in the hold list are processed after dm_rh_stop_recovery() in
mirror_presuspend(). dm_rh_stop_recovery() also acquires recovery_count,
then deadlock occurs.
Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>

f0703040

dm table: remove unused dm_get_device range parameters · 8215d6ec

Authored by Nikanth Karthikesan on Mar 06, 2010

Remove unused parameters (start and len) of dm_get_device()
and fix the callers.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

8215d6ec

dm raid1: always return error if all legs fail · ede5ea0b

Authored by Mikulas Patocka on Mar 06, 2010

If all mirror legs fail, always return an error instead of holding the
bio, even if the handle_errors option was set.  At present it is the
responsibility of the driver underneath us to deal with retries,
multipath etc.

The patch adds the bio to the failures list instead of holding it
directly.  do_failures tests first if all legs failed and, if so,
returns the bio with -EIO.  If any leg is still alive and handle_errors
is set, do_failures calls hold_bio.
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

ede5ea0b
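The ordering of checks that the commit above describes for do_failures can be summarised as a small decision function. The enum `failure_action` and the function `decide_failure` are hypothetical names for this sketch. The final branch (complete the bio when errors are not handled) is an assumption drawn from other commit messages in this log, not from this commit.

```c
#include <stdbool.h>

enum failure_action {
    FAIL_WITH_EIO,  /* return the bio with -EIO */
    HOLD_BIO,       /* keep the bio until userspace has reconfigured */
    COMPLETE_BIO    /* assumption: complete the bio as before */
};

/* Test "all legs failed" first, so handle_errors can no longer cause a
 * bio to be held when there is no leg left to serve it. */
static enum failure_action decide_failure(bool any_leg_alive,
                                          bool errors_handled)
{
    if (!any_leg_alive)
        return FAIL_WITH_EIO;
    if (errors_handled)
        return HOLD_BIO;
    return COMPLETE_BIO;
}
```

The key point is the order: the dead-array check comes before the handle_errors check.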

17 Feb, 2010 1 commit

dm raid1: fail writes if errors are not handled and log fails · 5528d17d

Authored by Mikulas Patocka on Feb 16, 2010

If the mirror log fails when the handle_errors option was not selected
and there is no remaining valid mirror leg, writes return success even
though they weren't actually written to any device.  This patch
completes them with EIO instead.

This code path is taken:
do_writes:
	bio_list_merge(&ms->failures, &sync);
do_failures:
	if (!get_valid_mirror(ms)) (false)
	else if (errors_handled(ms)) (false)
	else bio_endio(bio, 0);

The logic in do_failures is based on presuming that the write was already
tried: if it succeeded at least on one leg (without handle_errors) it
is reported as success.

Reference: https://bugzilla.redhat.com/show_bug.cgi?id=555197
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

5528d17d

11 Dec, 2009 11 commits

dm raid1: explicitly initialise bio_lists · 5339fc2d

Authored by Mikulas Patocka on Dec 10, 2009

Explicitly initialize bio lists instead of relying on kzalloc.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

5339fc2d

dm raid1: hold all write bios when leg fails · 929be8fc

Authored by Mikulas Patocka on Dec 10, 2009

Hold all write bios when a leg fails and errors are handled.

When using a userspace daemon such as dmeventd to handle errors, we must
delay completing bios until it has done its job.
This patch prevents the following race:
  - primary leg fails
  - write "1" fails, the write is held, secondary leg is set default
  - write "2" goes straight to the secondary leg
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

929be8fc

dm raid1: hold write bios when errors are handled · 60f355ea

Authored by Mikulas Patocka on Dec 10, 2009

Hold all write bios when errors are handled.

Previously the failures list was used only when handling errors with
a userspace daemon such as dmeventd.  Now, it is always used for all bios.
The regions where some writes failed must be marked as nosync. This can only
be done in process context (i.e. in raid1 workqueue), not in the
write_callback function.

Previously the write would succeed if writing to at least one leg
succeeded.  This is wrong because data from the failed leg may be
replicated to the correct leg.  Now, if using a userspace daemon, the
write with some failures will be held until the daemon has done its job
and reconfigured the array.  If not using a daemon, the write still
succeeds if at least one leg succeeds. This is bad, but it is consistent
with current behavior.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

60f355ea

dm raid1: remove bio_endio from dm_rh_mark_nosync · c58098be

Authored by Mikulas Patocka on Dec 10, 2009

Move bio completion out of dm_rh_mark_nosync in preparation for the
next patch.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

c58098be

dm raid1: abstract get_valid_mirror function · 87968ddd

Authored by Mikulas Patocka on Dec 10, 2009

Move the logic to get a valid mirror leg into a function for re-use
in a later patch.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

87968ddd

dm raid1: use hold framework in do_failures · 0f398a84

Authored by Mikulas Patocka on Dec 10, 2009

Use the hold framework in do_failures.

This patch doesn't change the bio processing logic, it just simplifies
failure handling and avoids periodically polling the failures list.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

0f398a84

dm raid1: add framework to hold bios during suspend · 04788507

Authored by Mikulas Patocka on Dec 10, 2009

Add framework to delay bios until a suspend and then resubmit them with
either DM_ENDIO_REQUEUE (if the suspend was noflush) or complete them
with -EIO.  I/O barrier support will use this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

04788507

dm raid1: report flush errors separately in status · 64b30c46

Authored by Mikulas Patocka on Dec 10, 2009

Report flush errors as 'F' instead of 'D' for log and mirror devices.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

64b30c46

dm raid1: implement mirror_flush · c0da3748

Authored by Mikulas Patocka on Dec 10, 2009

Implement flush callee. It uses dm_io to send a zero-size barrier synchronously
and concurrently to all the mirror legs.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

c0da3748

dm log: add flush callback fn · 87a8f240

Authored by Mikulas Patocka on Dec 10, 2009

Introduce a callback pointer from the log to dm-raid1 layer.

Before some region is set as "in-sync", we need to flush hardware cache on
all the disks. But the log module doesn't have access to the mirror_set
structure. So it will use this callback.

So far the callback is unused; it will be used in further patches.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

87a8f240

dm raid1: support flush · 4184153f

Authored by Mikulas Patocka on Dec 10, 2009

Flush support for dm-raid1.

When it receives an empty barrier, submit it to all the devices via dm-io.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

4184153f

11 Sep, 2009 1 commit

bio: first step in sanitizing the bio->bi_rw flag testing · 1f98a13f

Authored by Jens Axboe on Sep 11, 2009

Get rid of any functions that test for these bits and make callers
use bio_rw_flagged() directly. Then it is at least directly apparent
what variable and flag they check.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1f98a13f

05 Sep, 2009 1 commit

dm raid1: do not allow log_failure variable to unset after being set · d2b69864

Authored by Jonathan Brassow on Sep 04, 2009

This patch fixes a bug which was triggering a case where the primary leg
could not be changed on failure even when the mirror was in-sync.

The case involves the failure of the primary device along with
the transient failure of the log device.  The problem is that
bios can be put on the 'failures' list (due to log failure)
before 'fail_mirror' is called due to the primary device failure.
Normally, this is fine, but if the log device failure is transient,
a subsequent iteration of the work thread, 'do_mirror', will
reset 'log_failure'.  The 'do_failures' function then resets
the 'in_sync' variable when processing bios on the failures list.
The 'in_sync' variable is what is used to determine if the
primary device can be switched in the event of a failure.  Since
this has been reset, the primary device is incorrectly assumed
to be not switchable.

The case has been seen in the cluster mirror context, where one
machine realizes the log device is dead before the other machines.
As the responsibilities of the server migrate from one node to
another (because the mirror is being reconfigured due to the failure),
the new server may think for a moment that the log device is fine -
thus resetting the 'log_failure' variable.

In any case, it is inappropriate for us to reset the 'log_failure'
variable.  The above bug simply illustrates that it can actually
hurt us.

Cc: stable@kernel.org
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

d2b69864

24 Jul, 2009 2 commits

dm table: pass correct dev area size to device_area_is_valid · 5dea271b

Authored by Mike Snitzer on Jul 23, 2009

Incorrect device area lengths are being passed to device_area_is_valid().

The regression appeared in 2.6.31-rc1 through commit
754c5fc7.

With the dm-stripe target, the size of the target (ti->len) was used
instead of the stripe_width (ti->len/#stripes).  An example of a
consequent incorrect error message is:

  device-mapper: table: 254:0: sdb too small for target
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

5dea271b

dm raid1: wake kmirrord when requeueing delayed bios after remote recovery · 69885683

Authored by Mikulas Patocka on Jul 23, 2009

The recent commit 7513c2a7 (dm raid1:
add is_remote_recovering hook for clusters) changed do_writes() to
update the ms->writes list but forgot to wake up kmirrord to process it.

The rule is that when anything is being added on ms->reads, ms->writes
or ms->failures and the list was empty before we must call
wakeup_mirrord (for immediate processing) or delayed_wake (for delayed
processing).  Otherwise the bios could sit on the list indefinitely.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
CC: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

69885683
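The wake-up rule stated in the commit above (wake the worker whenever something is added to a list that was empty) can be sketched as follows. `struct work_list` and `list_add_and_wake` are hypothetical names, and a counter stands in for the real wakeup_mirrord() or delayed_wake() call; this is a toy model, not the dm-raid1 code.

```c
#include <stdbool.h>

struct work_list {
    int count;    /* number of queued items */
    int wakeups;  /* how many times the worker was woken */
};

/* Add one item; wake the worker only on the empty -> non-empty transition,
 * since a worker that is already awake will see later additions. */
static void list_add_and_wake(struct work_list *l)
{
    bool was_empty = (l->count == 0);

    l->count++;
    if (was_empty)
        l->wakeups++;  /* stands in for wakeup_mirrord() / delayed_wake() */
}
```

Skipping the wake-up on that first addition is exactly the failure mode the commit fixes: items sit on the list with nothing scheduled to process them.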

22 Jun, 2009 1 commit

dm target:s introduce iterate devices fn · af4874e0

Authored by Mike Snitzer on Jun 22, 2009

Add .iterate_devices to 'struct target_type' to allow a function to be
called for all devices in a DM target.  Implemented it for all targets
except those in dm-snap.c (origin and snapshot).

(The raid1 version number jumps to 1.12 because we originally reserved
1.1 to 1.11 for 'block_on_error' but ended up using 'handle_errors'
instead.)
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: martin.petersen@oracle.com

af4874e0

15 Apr, 2009 1 commit

block: move bio list helpers into bio.h · 8f3d8ba2

Authored by Christoph Hellwig on Apr 07, 2009

They are used by DM and MD and are generally useful, so move the bio list
helpers into bio.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

8f3d8ba2

03 Apr, 2009 2 commits

dm raid1: add is_remote_recovering hook for clusters · 7513c2a7

Authored by Jonathan Brassow on Apr 02, 2009

The logging API needs an extra function to make cluster mirroring
possible.  This new function allows us to check whether a mirror
region is being recovered on another machine in the cluster.  This
helps us prevent simultaneous recovery I/O and process I/O to the
same locations on disk.

Cluster-aware log modules will implement this function.  Single
machine log modules will not.  So, there is no performance
penalty for single machine mirrors.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Acked-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

7513c2a7

dm raid1: switch read_record from kmalloc to slab to save memory · 95f8fac8

Authored by Mikulas Patocka on Apr 02, 2009

With my previous patch to save bi_io_vec, the size of dm_raid1_read_record
is significantly increased (the vector list takes 3072 bytes on 32-bit machines
and 4096 bytes on 64-bit machines).

The structure dm_raid1_read_record used to be allocated with kmalloc,
but kmalloc aligns the size on the next power-of-two so an object
slightly greater than 4096 will allocate 8192 bytes of memory and half of
that memory will be wasted.

This patch turns kmalloc into a slab cache which doesn't have this
padding so it will reduce the memory consumed.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

95f8fac8
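The waste described in the commit above follows from simple rounding arithmetic. The sketch below models an allocator that rounds every request up to the next power of two, as the commit message describes; real kmalloc size classes may differ in detail, and the names `round_up_pow2` and `wasted_bytes` are hypothetical.

```c
#include <stddef.h>

/* Smallest power of two that is >= n (for n >= 1). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes lost to rounding when n bytes are requested. */
static size_t wasted_bytes(size_t n)
{
    return round_up_pow2(n) - n;
}
```

In this model an object of 4100 bytes occupies 8192 bytes, so almost half of the allocation is padding, whereas a dedicated slab cache can size its objects to fit.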

06 Jan, 2009 1 commit

dm log: move region_size validation · 2045e88e

Authored by Milan Broz on Jan 06, 2009

Move log size validation from mirror target to log constructor.

Remove the PAGE_SIZE restriction, which we no longer think is necessary.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

2045e88e
