提交 · 0232605d987d8230b254aa139805bbb56a7ca30c · openeuler / raspberrypi-kernel

03 7月, 2012 2 次提交

md: make 'name' arg to md_register_thread non-optional. · 0232605d

由 NeilBrown 提交于 7月 03, 2012

Having the 'name' arg optional and defaulting to the current
personality name is no necessary and leads to errors, as when
changing the level of an array we can end up using the
name of the old level instead of the new one.

So make it non-optional and always explicitly pass the name
of the level that the array will be.
Reported-by: Nmajianpeng <majianpeng@gmail.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

0232605d

md:Add blk_plug in sync_thread. · 7c2c57c9

由 majianpeng 提交于 7月 03, 2012

Add blk_plug in sync_thread will increase the performance of sync.
Because sync_thread did not blk_plug,so when raid sync, the bio merge
not well.

Testing environment:
SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI
Controller.
OS:Linux xxx 3.5.0-rc2+ #340 SMP Tue Jun 12 09:00:25 CST 2012
x86_64 x86_64 x86_64 GNU/Linux.
RAID5: four ST31000524NS disk.

Without blk_plug:recovery speed about 63M/Sec;
Add blk_plug:recovery speed about 120M/Sec.

Using blktrace:
blktrace -d /dev/sdb -w 60  -o -|blkparse -i -

without blk_plug:
Total (8,16):
 Reads Queued:      309811,     1239MiB	 Writes Queued:           0,        0KiB
 Read Dispatches:   283583,     1189MiB	 Write Dispatches:        0,        0KiB
 Reads Requeued:         0		 Writes Requeued:         0
 Reads Completed:   273351,     1149MiB	 Writes Completed:        0,        0KiB
 Read Merges:        23533,    94132KiB	 Write Merges:            0,        0KiB
 IO unplugs:             0        	 Timer unplugs:           0

add blk_plug:
Total (8,16):
 Reads Queued:      428697,     1714MiB	 Writes Queued:           0,        0KiB
 Read Dispatches:     3954,     1714MiB	 Write Dispatches:        0,        0KiB
 Reads Requeued:         0		 Writes Requeued:         0
 Reads Completed:     3956,     1715MiB	 Writes Completed:        0,        0KiB
 Read Merges:       424743,     1698MiB	 Write Merges:            0,        0KiB
 IO unplugs:             0        	 Timer unplugs:        3384

The ratio of merge will be markedly increased.
Signed-off-by: Nmajianpeng <majianpeng@gmail.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

7c2c57c9

22 5月, 2012 8 次提交

md: check the return of mddev_find() · 0c098220

由 Yuanhan Liu 提交于 5月 22, 2012

Check the return of mddev_find(), since it may fail due to out of
memeory or out of usable minor number.

The reason I chose -ENODEV instead of -ENOMEM or something else is
md_alloc() function chose that ;)
Signed-off-by: NYuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

0c098220

DM RAID: Set recovery flags on resume · 47525e59

由 Jonathan Brassow 提交于 5月 22, 2012

Properly initialize MD recovery flags when resuming device-mapper devices.

When a device-mapper device is suspended, all I/O must stop.  This is done by
calling 'md_stop_writes' and 'mddev_suspend'.  These calls in-turn manipulate
the recovery flags - including setting 'MD_RECOVERY_FROZEN'.  The DM device
may have been suspended while recovery was not yet complete, so the process
needs to pick-up where it left off.  Since 'mddev_resume' does not unset
'MD_RECOVERY_FROZEN' and set 'MD_RECOVERY_NEEDED', we must do it ourselves.
'MD_RECOVERY_NEEDED' can safely be set in 'mddev_resume', but 'MD_RECOVERY_FROZEN'
must be set outside of 'mddev_resume' due to how MD handles RAID reshaping.
(e.g.  It is possible for a user to delay reshaping a RAID5->RAID6 by purposefully
setting 'MD_RECOVERY_FROZEN'.  Clearing it in 'mddev_resume' would override the
desired behavior.)

Because 'mddev_resume' already unconditionally calls 'md_wakeup_thread(mddev->thread)'
there is no need to make this call from 'raid_resume' since it calls 'mddev_resume'.

Also clean up where  level_store calls mddev_resume() - it current
duplicates some of the funcitons of that call. - NB
Signed-off-by: NJonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

47525e59

md: allow array to be resized while bitmap is present. · a4a6125a

由 NeilBrown 提交于 5月 22, 2012

Now that bitmaps can be resized, we can allow an array to be resized
while the bitmap is present.

This only covers resizing that involves changing the effective size
of member devices, not resizing that changes the number of devices.
Signed-off-by: NNeilBrown <neilb@suse.de>

a4a6125a

md/bitmap: move some fields of 'struct bitmap' into a 'storage' substruct. · 1ec885cd

由 NeilBrown 提交于 5月 22, 2012

This new 'struct bitmap_storage' reflects the external storage of the
bitmap.
Having this clearly defined will make it easier to change the storage
used while the array is active.
Signed-off-by: NNeilBrown <neilb@suse.de>

1ec885cd

md/bitmap: allow a bitmap with no backing storage. · ef99bf48

由 NeilBrown 提交于 5月 22, 2012

An md bitmap comprises two parts
 - internal counting of active writes per 'chunk'.
 - external storage of whether there are any active writes on
   each chunk

The second requires the first, but the first doesn't require the
second.

Not having backing storage means that the bitmap cannot expedite
resync after a crash, but it still allows us to expedite the recovery
of a recently-removed device.

So: allow a bitmap to exist even if there is no backing device.
In that case we default to 128M chunks.

A particular value of this is that we can remove and re-add a bitmap
(possibly of a different granularity) on a degraded array, and not
lose the information needed to fast-recover the missing device.

We don't actually activate these bitmaps yet - that will come
in a later patch.
Signed-off-by: NNeilBrown <neilb@suse.de>

ef99bf48

md/bitmap: add new 'space' attribute for bitmaps. · 6409bb05

由 NeilBrown 提交于 5月 22, 2012

If we are to allow bitmaps to be resized when the array is resized,
we need to know how much space there is.

So create an attribute to store this information and set appropriate
defaults.

It can be set more precisely via sysfs, or future metadata extensions
may allow it to be recorded.
Signed-off-by: NNeilBrown <neilb@suse.de>

6409bb05

md: move freeing of badblocks.page into md_rdev_clear · 4fa2f327

由 NeilBrown 提交于 5月 22, 2012

This ensures that it is always freed - there were case where
we failed to free the page.
Reported-by: Nmajianpeng <majianpeng@gmail.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

4fa2f327

md: dm-raid should call helper function to clear rdev. · 545c8795

由 NeilBrown 提交于 5月 22, 2012

dm-raid currently open-codes the freeing of some members of
and rdev.  It is more maintainable to have it call common code
from md.c which does this for all call-sites.

So remove free_disk_sb to md_rdev_clear, export it, and use it in
dm-raid.c
Signed-off-by: NNeilBrown <neilb@suse.de>

545c8795

21 5月, 2012 5 次提交

md: use resync_max_sectors for reshape as well as resync. · c804cdec

由 NeilBrown 提交于 5月 21, 2012

Some resync type operations need to act on the address space of the
device, others on the address space of the array.

This only affects RAID10, so it sets resync_max_sectors to the array
size (it defaults to the device size), and that is currently used for
resync only.  However reshape of a RAID10 must be done against the
array size, not device size, so change code to use resync_max_sectors
for both the resync and the reshape cases.
This does not affect RAID5 or RAID1, just RAID10.
Signed-off-by: NNeilBrown <neilb@suse.de>

c804cdec

md: teach sync_page_io about new_data_offset. · 1fdd6fc9

由 NeilBrown 提交于 5月 21, 2012

Some code in raid1 and raid10 use sync_page_io to
read/write pages when responding to read errors.
As we will shortly support changing data_offset for
raid10, this function must understand new_data_offset.

So add that understanding.
Signed-off-by: NNeilBrown <neilb@suse.de>

1fdd6fc9

md: add possibility to change data-offset for devices. · c6563a8c

由 NeilBrown 提交于 5月 21, 2012

When reshaping we can avoid costly intermediate backup by
changing the 'start' address of the array on the device
(if there is enough room).

So as a first step, allow such a change to be requested
through sysfs, and recorded in v1.x metadata.

(As we didn't previous check that all 'pad' fields were zero,
 we need a new FEATURE flag for this.
 A (belatedly) check that all remaining 'pad' fields are
 zero to avoid a repeat of this)

The new data offset must be requested separately for each device.
This allows each to have a different change in the data offset.
This is not likely to be used often but as data_offset can be
set per-device, new_data_offset should be too.

This patch also removes the 'acknowledged' arg to rdev_set_badblocks as
it is never used and never will be.  At the same time we add a new
arg ('in_new') which is currently always zero but will be used more
soon.

When a reshape finishes we will need to update the data_offset
and rdev->sectors.  So provide an exported function to do that.
Signed-off-by: NNeilBrown <neilb@suse.de>

c6563a8c

md: allow a reshape operation to be reversed. · 2c810cdd

由 NeilBrown 提交于 5月 21, 2012

Currently a reshape operation always progresses from the start
of the array to the end unless the number of devices is being
reduced, in which case it progressed in the opposite direction.

To reverse a partial reshape which changes the number of devices
you can stop the array and re-assemble with the raid-disks numbers
reversed and it will undo.

However for a reshape that does not change the number of devices
it is not possible to reverse the reshape in the middle - you have to
wait until it completes.

So add a 'reshape_direction' attribute with is either 'forwards' or
'backwards' and can be explicitly set when delta_disks is zero.

This will become more important when we allow the data_offset to
change in a reshape.  Then the explicit statement of what direction is
being used will be more useful.

This can be enabled in raid5 trivially as it already supports
reverse reshape and just needs to use a different trigger to request it.
Signed-off-by: NNeilBrown <neilb@suse.de>

2c810cdd

md: using GFP_NOIO to allocate bio for flush request · b5e1b8ce

由 Shaohua Li 提交于 5月 21, 2012

A flush request is usually issued in transaction commit code path, so
using GFP_KERNEL to allocate memory for flush request bio falls into
the classic deadlock issue.

This is suitable for any -stable kernel to which it applies as it
avoids a possible deadlock.

Cc: stable@vger.kernel.org
Signed-off-by: NShaohua Li <shli@fusionio.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

b5e1b8ce

17 5月, 2012 1 次提交

MD: Add del_timer_sync to mddev_suspend (fix nasty panic) · 0d9f4f13

由 Jonathan Brassow 提交于 5月 16, 2012

Use del_timer_sync to remove timer before mddev_suspend finishes.

We don't want a timer going off after an mddev_suspend is called. This is
especially true with device-mapper, since it can call the destructor function
immediately following a suspend. This results in the removal (kfree) of the
structures upon which the timer depends - resulting in a very ugly panic.
Therefore, we add a del_timer_sync to mddev_suspend to prevent this.

Cc: stable@vger.kernel.org
Signed-off-by: NNeilBrown <neilb@suse.de>

0d9f4f13

24 4月, 2012 2 次提交

md: fix possible corruption of array metadata on shutdown. · 30b8aa91

由 NeilBrown 提交于 4月 24, 2012

commit c744a65c
  md: don't set md arrays to readonly on shutdown.

removed the possibility of a 'BUG' when data is written to an array
that has just been switched to read-only, but also introduced the
possibility that the array metadata could be corrupted.

If, when md_notify_reboot gets the mddev lock, the array is
in a state where it is assembled but hasn't been started (as can
happen if the personality module is not available, or in other unusual
situations), then incorrect metadata will be written out making it
impossible to re-assemble the array.

So only call __md_stop_writes() if the array has actually been
activated.

This patch is needed for any stable kernel which has had the above
commit applied.

Cc: stable@vger.kernel.org
Reported-by: NChristoph Nelles <evilazrael@evilazrael.de>
Signed-off-by: NNeilBrown <neilb@suse.de>

30b8aa91

md: don't call ->add_disk unless there is good reason. · ed209584

由 NeilBrown 提交于 4月 24, 2012

Commit 7bfec5f3

   md/raid5: If there is a spare and a want_replacement device, start replacement.

cause md_check_recovery to call ->add_disk much more often.
Instead of only when the array is degraded, it is now called whenever
md_check_recovery finds anything useful to do, which includes
updating the metadata for clean<->dirty transition.
This causes unnecessary work, and causes info messages from ->add_disk
to be reported much too often.

So refine md_check_recovery to only do any actual recovery checking
(including ->add_disk) if MD_RECOVERY_NEEDED is set.

This fix is suitable for 3.3.y:

Cc: stable@vger.kernel.org
Reported-by: NJan Ceuleers <jan.ceuleers@computer.org>
Signed-off-by: NNeilBrown <neilb@suse.de>

ed209584

19 3月, 2012 6 次提交

md: Add judgement bb->unacked_exist in function md_ack_all_badblocks(). · ecb178bb

由 majianpeng 提交于 3月 19, 2012

If there are no unacked bad blocks, then there is no point searching
for them to acknowledge them.
Signed-off-by: Nmajianpeng <majianpeng@gmail.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

ecb178bb

md: fix clearing of the 'changed' flags for the bad blocks list. · d0962936

由 NeilBrown 提交于 3月 19, 2012

In super_1_sync (the first hunk) we need to clear 'changed' before
checking read_seqretry(), otherwise we might race with other code
adding a bad block and so won't retry later.

In md_update_sb (the second hunk), in the case where there is no
metadata (neither persistent nor external), we treat any bad blocks as
an error.  However we need to clear the 'changed' flag before calling
md_ack_all_badblocks, else it won't do anything.

This patch is suitable for -stable release 3.0 and later.

Cc: stable@vger.kernel.org
Signed-off-by: NNeilBrown <neilb@suse.de>

d0962936

md/bitmap: move printing of bitmap status to bitmap.c · 57148964

由 NeilBrown 提交于 3月 19, 2012

The part of /proc/mdstat which describes the bitmap should really
be generated by code in bitmap.c.  So move it there.
Signed-off-by: NNeilBrown <neilb@suse.de>

57148964

md/raid10: handle merge_bvec_fn in member devices. · 050b6615

由 NeilBrown 提交于 3月 19, 2012

Currently we don't honour merge_bvec_fn in member devices so if there
is one, we force all requests to be single-page at most.
This is not ideal.

So enhance the raid10 merge_bvec_fn to check that function in children
as well.

This introduces a small problem.  There is no locking around calls
the ->merge_bvec_fn and subsequent calls to ->make_request.  So a
device added between these could end up getting a request which
violates its merge_bvec_fn.

Currently the best we can do is synchronize_sched().  This will work
providing no preemption happens.  If there is preemption, we just
have to hope that new devices are largely consistent with old devices.
Signed-off-by: NNeilBrown <neilb@suse.de>

050b6615

md: tidy up rdev_for_each usage. · dafb20fa

由 NeilBrown 提交于 3月 19, 2012

md.h has an 'rdev_for_each()' macro for iterating the rdevs in an
mddev.  However it uses the 'safe' version of list_for_each_entry,
and so requires the extra variable, but doesn't include 'safe' in the
name, which is useful documentation.

Consequently some places use this safe version without needing it, and
many use an explicity list_for_each entry.

So:
 - rename rdev_for_each to rdev_for_each_safe
 - create a new rdev_for_each which uses the plain
   list_for_each_entry,
 - use the 'safe' version only where needed, and convert all other
   list_for_each_entry calls to use rdev_for_each.
Signed-off-by: NNeilBrown <neilb@suse.de>

dafb20fa

md: don't set md arrays to readonly on shutdown. · c744a65c

由 NeilBrown 提交于 3月 19, 2012

It seems that with recent kernel, writeback can still be happening
while shutdown is happening, and consequently data can be written
after the md reboot notifier switches all arrays to read-only.
This causes a BUG.

So don't switch them to read-only - just mark them clean and
set 'safemode' to '2' which mean that immediately after any
write the array will be switch back to 'clean'.

This could result in the shutdown happening when array is marked
dirty, thus forcing a resync on reboot.  However if you reboot
without performing a "sync" first, you get to keep both halves.

This is suitable for any stable kernel (though there might be some
conflicts with obvious fixes in earlier kernels).

Cc: stable@vger.kernel.org
Signed-off-by: NNeilBrown <neilb@suse.de>

c744a65c

07 2月, 2012 1 次提交

md: two small fixes to handling interrupt resync. · db91ff55

由 NeilBrown 提交于 2月 07, 2012

1/ If a resync is aborted we should record how far we got
 (recovery_cp) the last request that we know has completed
 (->curr_resync_completed) rather than the last request that was
 submitted (->curr_resync).

2/ When a resync aborts we still want to update the metadata with
 any changes, so set MD_CHANGE_DEVS even if we 'skip'.
Signed-off-by: NNeilBrown <neilb@suse.de>

db91ff55

11 1月, 2012 2 次提交

block: Introduce blk_set_stacking_limits function · b1bd055d

由 Martin K. Petersen 提交于 1月 11, 2012

Stacking driver queue limits are typically bounded exclusively by the
capabilities of the low level devices, not by the stacking driver
itself.

This patch introduces blk_set_stacking_limits() which has more liberal
metrics than the default queue limits function. This allows us to
inherit topology parameters from bottom devices without manually
tweaking the default limits in each driver prior to calling the stacking
function.

Since there is now a clear distinction between stacking and low-level
devices, blk_set_default_limits() has been modified to carry the more
conservative values that we used to manually set in
blk_queue_make_request().
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

b1bd055d

md: notify the 'degraded' sysfs attribute on failure. · f2a371c5

由 NeilBrown 提交于 1月 09, 2012

We currently only 'notify' changes to the 'degraded' attribute
when it decreases, not when it increases.

Notifying on failure is a little awkward as it happen in
interrupt context.
So instead, notify when we remove the failed device from the array,
which is very soon afterwards.
Reported-and-tested-by: NMikhail Balabin <mbalabin@gmail.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

f2a371c5

04 1月, 2012 1 次提交

fs: move code out of buffer.c · ff01bb48

由 Al Viro 提交于 9月 16, 2011

Move invalidate_bdev, block_sync_page into fs/block_dev.c.  Export
kill_bdev as well, so brd doesn't have to open code it.  Reduce
buffer_head.h requirement accordingly.

Removed a rather large comment from invalidate_bdev, as it looked a bit
obsolete to bother moving.  The small comment replacing it says enough.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

ff01bb48

23 12月, 2011 6 次提交

md/raid5: If there is a spare and a want_replacement device, start replacement. · 7bfec5f3

由 NeilBrown 提交于 12月 23, 2011

When attempting to add a spare to a RAID[456] array, also consider
adding it as a replacement for a want_replacement device.

This requires that common md code attempt hot_add even when the array
is not formally degraded.
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

7bfec5f3

md: create externally visible flags for supporting hot-replace. · 2d78f8c4

由 NeilBrown 提交于 12月 23, 2011

hot-replace is a feature being added to md which will allow a
device to be replaced without removing it from the array first.

With hot-replace a spare can be activated and recovery can start while
the original device is still in place, thus allowing a transition from
an unreliable device to a reliable device without leaving the array
degraded during the transition.  It can also be use when the original
device is still reliable but it not wanted for some reason.

This will eventually be supported in RAID4/5/6 and RAID10.

This patch adds a super-block flag to distinguish the replacement
device.  If an old kernel sees this flag it will reject the device.

It also adds two per-device flags which are viewable and settable via
sysfs.
   "want_replacement" can be set to request that a device be replaced.
   "replacement" is set to show that this device is replacing another
   device.

The "rd%d" links in /sys/block/mdXx/md only apply to the original
device, not the replacement.  We currently don't make links for the
replacement - there doesn't seem to be a need.
Signed-off-by: NNeilBrown <neilb@suse.de>

2d78f8c4

md: change hot_remove_disk to take an rdev rather than a number. · b8321b68

由 NeilBrown 提交于 12月 23, 2011

Soon an array will be able to have multiple devices with the
same raid_disk number (an original and a replacement).  So removing
a device based on the number won't work.  So pass the actual device
handle instead.
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

b8321b68

md: remove test for duplicate device when setting slot number. · 476a7abb

由 NeilBrown 提交于 12月 23, 2011

When setting the slot number on a device in an active array we
currently check that the number is not already in use.
We then call into the personality's hot_add_disk function
which performs the same test and returns the same error.

Thus the common test is not needed.

As we will shortly be changing some personalities to allow duplicates
in some cases (to support hot-replace), the common test will become
inconvenient.

So remove the common test.
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

476a7abb

md: allow non-privileged uses to GET_*_INFO about raid arrays. · 506c9e44

由 NeilBrown 提交于 12月 23, 2011

The info is already available in /proc/mdstat and /sys/block in
an accessible form so there is no point in putting a road-block in
the ioctl for information gathering.
Signed-off-by: NNeilBrown <neilb@suse.de>

506c9e44

md: don't give up looking for spares on first failure-to-add · 60fc1370

由 NeilBrown 提交于 12月 23, 2011

Before performing a recovery we try to remove any spares that
might not be working, then add any that might have become relevant.

Currently we abort on the first spare that cannot be added.
This is a false optimisation.
It is conceivable that - depending on rules in the personality - a
subsequent spare might be accepted.
Also the loop does other things like count the available spares and
reset the 'recovery_offset' value.

If we abort early these might not happen properly.

So remove the early abort.

In particular if you have an array what is undergoing recovery and
which has extra spares, then the recovery may not restart after as
reboot as the could of 'spares' might end up as zero.
Reported-by: NAnssi Hannula <anssi.hannula@iki.fi>
Signed-off-by: NNeilBrown <neilb@suse.de>

60fc1370

08 12月, 2011 4 次提交

md: ensure new badblocks are handled promptly. · 8bd2f0a0

由 NeilBrown 提交于 12月 08, 2011

When we mark blocks as bad we need them to be acknowledged by the
metadata handler promptly.

For an in-kernel metadata handler that was already being done.  But
for an external metadata handler we need to alert it of the change by
sending a notification through the sysfs file.  This adds that
notification.
Signed-off-by: NNeilBrown <neilb@suse.de>

8bd2f0a0

md: bad blocks shouldn't cause a Blocked status on a Faulty device. · 52c64152

由 NeilBrown 提交于 12月 08, 2011

Once a device is marked Faulty the badblocks - whether acknowledged or
not - become irrelevant.  So they shouldn't cause the device to be
marked as Blocked.

Without this patch, a process might write "-blocked" to clear the
Blocked status, but while that will correctly fail the device, it
won't remove the apparent 'blocked' status.
Signed-off-by: NNeilBrown <neilb@suse.de>

52c64152

md: take a reference to mddev during sysfs access. · af8a2434

由 NeilBrown 提交于 12月 08, 2011


When we are accessing an mddev via sysfs we know that the
mddev cannot disappear because it has an embedded kobj which
is refcounted by sysfs.
And we also take the mddev_lock.
However this is not enough.

The final mddev_put could have been called and the
mddev_delayed_delete is waiting for sysfs to let go so it can destroy
the kobj and mddev.
In this state there are a lot of changes that should not be attempted.

To to guard against this we:
 - initialise mddev->all_mddevs in on last put so the state can be
   easily detected.
 - in md_attr_show and md_attr_store, check ->all_mddevs under
   all_mddevs_lock and mddev_get the mddev if it still appears to
   be active.

This means that if we get to sysfs as the mddev is being deleted we
will get -EBUSY.

rdev_attr_store and rdev_attr_show are similar but already have
sufficient protection.  They check that rdev->mddev still points to
mddev after taking mddev_lock.  As this is cleared  before delayed
removal which can only be requested under the mddev_lock, this
ensure the rdev and mddev are still alive.
Signed-off-by: NNeilBrown <neilb@suse.de>

af8a2434

md: refine interpretation of "hold_active == UNTIL_IOCTL". · 1d23f178

由 NeilBrown 提交于 12月 08, 2011

We like md devices to disappear when they really are not needed.
However it is not possible to tell from the current state whether it
is needed or not.  We can only tell from recent history of changes.

In particular immediately after we create an md device it looks very
similar to immediately after we have finished with it.

So we always preserve a newly created md device until something
significant happens.  This state is stored in 'hold_active'.

The normal case is to keep it until an ioctl happens, as that will
normally either activate it, or explicitly de-activate it.  If it
doesn't then it was probably created by mistake and it is now time to
get rid of it.

We can also modify an array via sysfs (instead of via ioctl) and we
currently treat any change via sysfs like an ioctl as a sign that if
it now isn't more active, it should be destroyed.
However this is not appropriate as changes made via sysfs are more
gradual so we should look for a more definitive change.

So this patch only clears 'hold_active' from UNTIL_IOCTL to clear when
the array_state is changed via sysfs.  Other changes via sysfs
are ignored.
Signed-off-by: NNeilBrown <neilb@suse.de>

1d23f178

01 11月, 2011 1 次提交

md: Add module.h to all files using it implicitly · 056075c7

由 Paul Gortmaker 提交于 7月 03, 2011

A pending cleanup will mean that module.h won't be implicitly
everywhere anymore. Make sure the modular drivers in md dir
are actually calling out for <module.h> explicitly in advance.
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>

056075c7

19 10月, 2011 1 次提交

md.c: trivial comment fix · 751e67ca

由 Chris Dunlop 提交于 10月 19, 2011

Trivial comment fix
Signed-off-by: NChris Dunlop <chris@onthe.net.au>
Signed-off-by: NNeilBrown <neilb@suse.de>

751e67ca