Commits · 1b30e66f5acc6bf22fff49d4093cf17454f914b7 · openeuler / raspberrypi-kernel

06 Feb, 2015 7 commits

md: minor cleanup in safe_delay_store. · 1b30e66f

Committed by NeilBrown on Dec 15, 2014

There isn't really much room for races with ->safemode_delay.
But as I am trying to clean up any racy code and will soon
be removing reconfig_mutex protection from most _store()
functions:
 - only set mddev->safemode_delay once, to ensure no code
   can see an intermediate value
 - use safemode_timer to call md_safemode_timeout() rather than
   calling it directly, to ensure it never races with itself.
Signed-off-by: NeilBrown <neilb@suse.de>

md: move GET_BITMAP_FILE ioctl out from mddev_lock. · 4af1a041

Committed by NeilBrown on Dec 15, 2014

It makes more sense to report bitmap_info->file, rather than
bitmap->file (the latter is only available once the array is
active).

With that change, use mddev->lock to protect bitmap_info being
set to NULL, and we can call get_bitmap_file() without taking
the mutex.
Signed-off-by: NeilBrown <neilb@suse.de>

md: tidy up set_bitmap_file · 1e594bb2

Committed by NeilBrown on Dec 15, 2014

1/ delay setting mddev->bitmap_info.file until 'f' looks
   usable, so we don't have to unset it.
2/ Don't allow bitmap file to be set if bitmap_info.file
   is already set.
Signed-off-by: NeilBrown <neilb@suse.de>

md: remove unnecessary 'buf' from get_bitmap_file. · f4ad3d38

Committed by NeilBrown on Dec 15, 2014

'buf' is only used because d_path fills from the end of the
buffer instead of from the start.
We don't need a separate buf to handle that, we just need to use
memmove() to move the string to the start.
Signed-off-by: NeilBrown <neilb@suse.de>
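The memmove() approach described in this commit can be illustrated in user space. This is only a sketch under stated assumptions: `fill_from_end()` is a hypothetical stand-in for a d_path()-style helper that builds its result at the end of the buffer, and `path_at_start()` is an illustrative name, not the kernel function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a d_path()-style helper: it places the
 * NUL-terminated string at the END of buf and returns a pointer to its
 * first character, or NULL if the string does not fit.
 */
static char *fill_from_end(const char *src, char *buf, size_t buflen)
{
    size_t need = strlen(src) + 1;   /* include the trailing NUL */
    char *start;

    if (need > buflen)
        return NULL;
    start = buf + buflen - need;
    memcpy(start, src, need);
    return start;
}

/*
 * Leave the string at the START of buf without using a second buffer.
 * Source and destination may overlap, so memmove() is used rather
 * than memcpy().
 */
static int path_at_start(const char *src, char *buf, size_t buflen)
{
    char *p = fill_from_end(src, buf, buflen);

    if (!p)
        return -1;
    memmove(buf, p, strlen(p) + 1);
    return 0;
}
```

The point of the design is that one buffer serves both as scratch space and as the final result, at the cost of a single overlapping copy.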

md: remove mddev_lock from rdev_attr_show() · 758bfc8a

Committed by NeilBrown on Dec 15, 2014

No rdev attributes need locking for 'show', though
state_show() might benefit from ensuring it sees a
consistent set of flags.

None even use rdev->mddev, so testing for it isn't really
needed and it certainly doesn't need to be held constant.

So improve state_show() and remove the locking.
Signed-off-by: NeilBrown <neilb@suse.de>

md: remove mddev_lock() from md_attr_show() · b7b17c9b

Committed by NeilBrown on Dec 15, 2014

Most attributes can be read safely without any locking.
A race might lead to a slightly out-dated value, but nothing wrong.

We already have locking in some places where needed.
All that remains is can_clear_show(), behind_writes_used_show()
and action_show() which are easily fixed.
Signed-off-by: NeilBrown <neilb@suse.de>

md: remove need for mddev_lock() in md_seq_show() · f97fcad3

Committed by NeilBrown on Dec 15, 2014

The only access in md_seq_show that could suffer from races
not protected by ->lock is walking the rdev list.
This can receive sufficient protection from 'rcu'.

So use rdev_for_each_rcu() and get rid of mddev_lock().

Now reading /proc/mdstat will never block in md_seq_show.
Signed-off-by: NeilBrown <neilb@suse.de>

04 Feb, 2015 7 commits

md: protect ->pers changes with mddev->lock · 36d091f4

Committed by NeilBrown on Dec 15, 2014

->pers is already protected by ->reconfig_mutex, and
cannot possibly change when there are threads running or
outstanding IO.

However there are some places where we access ->pers
not in a thread or IO context, and where ->reconfig_mutex
is unnecessarily heavy-weight:  level_show and md_seq_show().

So protect all changes, and those accesses, with ->lock.
This is a step toward taking those accesses out from under
reconfig_mutex.

[Fixed missing "mddev->pers" -> "pers" conversion, thanks to
 Dan Carpenter <dan.carpenter@oracle.com>]
Signed-off-by: NeilBrown <neilb@suse.de>

md: level_store: group all important changes into one place. · db721d32

Committed by NeilBrown on Dec 15, 2014

Gather all the changes that can happen atomically and might
be relevant to other code into one place.  This will
make it easier to refine the locking.

Note that this puts quite a few things between mddev_detach()
and ->free().  Enabling this was the point of some recent patches.
Signed-off-by: NeilBrown <neilb@suse.de>

md: rename ->stop to ->free · afa0f557

Committed by NeilBrown on Dec 15, 2014

Now that the ->stop function only frees the private data,
rename it accordingly.

Also pass in the private pointer as an arg rather than using
mddev->private.  This flexibility will be useful in level_store().

Finally, don't clear ->private.  It doesn't make sense to clear
it seeing that isn't what we free, and it is no longer necessary
to clear ->private (it was some time ago before  ->to_remove was
introduced).

Setting ->to_remove in ->free() is a bit of a wart, but not a
big problem at the moment.
Signed-off-by: NeilBrown <neilb@suse.de>

md: split detach operation out from ->stop. · 5aa61f42

Committed by NeilBrown on Dec 15, 2014

Each md personality has a 'stop' operation which does two
things:
 1/ it finalizes some aspects of the array to ensure nothing
    is accessing the ->private data
 2/ it frees the ->private data.

All the steps in '1' can apply to all arrays and so can be
performed in common code.

This is useful as in the case where we change the personality which
manages an array (in level_store()), it would be helpful to do
step 1 early, and step 2 later.

So split the 'step 1' functionality out into a new mddev_detach().
Signed-off-by: NeilBrown <neilb@suse.de>

md: make merge_bvec_fn more robust in face of personality changes. · 64590f45

Committed by NeilBrown on Dec 15, 2014

There is no locking around calls to merge_bvec_fn(), so
it is possible that calls which coincide with a level (or personality)
change could go wrong.

So create a central dispatch point for these functions and use
rcu_read_lock().
If the array is suspended, reject any merge that can be rejected.
If not, we know it is safe to call the function.
Signed-off-by: NeilBrown <neilb@suse.de>

md: make ->congested robust against personality changes. · 5c675f83

Committed by NeilBrown on Dec 15, 2014

There is currently no locking around calls to the 'congested'
bdi function.  If called at an awkward time while an array is
being converted from one level (or personality) to another, there
is a tiny chance of running code in an unreferenced module etc.

So add a 'congested' function to the md_personality operations
structure, and call it with appropriate locking from a central
'mddev_congested'.

When the array personality is changing the array will be 'suspended'
so no IO is processed.
If mddev_congested detects this, it simply reports that the
array is congested, which is a safe guess.
As mddev_suspend calls synchronize_rcu(), mddev_congested can
avoid races by including the whole call inside an rcu_read_lock()
region.
This requires that the congested functions for all subordinate devices
can be run under rcu_lock.  Fortunately this is the case.
Signed-off-by: NeilBrown <neilb@suse.de>

md: rename mddev->write_lock to mddev->lock · 85572d7c

Committed by NeilBrown on Dec 15, 2014

This lock is used for (slightly) more than helping with writing
superblocks, and it will soon be extended further.  So the
name is inappropriate.

Also, the _irq variant hasn't been needed since 2.6.37 as it is
never taken from interrupt or bh context.

So:
  -rename write_lock to lock
  -document what it protects
  -remove _irq ... except in md_flush_request() as there
     is no wait_event_lock() (with no _irq).  This can be
     cleaned up after appropriate changes to wait.h.
Signed-off-by: NeilBrown <neilb@suse.de>

11 Dec, 2014 1 commit

md: Check MD_RECOVERY_RUNNING as well as ->sync_thread. · f851b60d

Committed by NeilBrown on Dec 11, 2014

A recent change to md started the ->sync_thread asynchronously
from a work_queue rather than synchronously.  This means that there
can be a small window between the time when MD_RECOVERY_RUNNING is set
and when ->sync_thread is set.

So code that checks ->sync_thread might now conclude that the thread
has not been started and (because a lock is held) will not be started.
That is no longer the case.

Most of those places are best fixed by testing MD_RECOVERY_RUNNING
as well.  To make this completely reliable, we wake_up(&resync_wait)
after clearing that flag as well as after clearing ->sync_thread.

Other places are better served by flushing the relevant workqueue
to ensure that if the sync thread was starting, it has now
started.  This is particularly appropriate if we are about to stop the
sync thread.

Fixes: ac05f256
Signed-off-by: NeilBrown <neilb@suse.de>

03 Dec, 2014 1 commit

md: fix semicolon.cocci warnings · 7d7e64f2

Committed by kbuild test robot on Dec 03, 2014

drivers/md/md.c:7175:43-44: Unneeded semicolon

 Removes unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>

24 Nov, 2014 1 commit

md: use generic io stats accounting functions to simplify io stat accounting · 18c0b223

Committed by Gu Zheng on Nov 24, 2014

Use generic io stats accounting helper functions (generic_{start,end}_io_acct)
to simplify io stat accounting.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

17 Nov, 2014 1 commit

md: Always set RECOVERY_NEEDED when clearing RECOVERY_FROZEN · 45eaf45d

Committed by NeilBrown on Oct 29, 2014

md_check_recovery will skip any recovery and also clear
MD_RECOVERY_NEEDED if MD_RECOVERY_FROZEN is set.
So when we clear _FROZEN, we must set _NEEDED and ensure that
md_check_recovery gets run.
Otherwise we could miss out on something that is needed.

In particular, this can make it impossible to remove a
failed device from an array if the 'recovery-needed' processing
didn't happen.
Suitable for stable kernels since 3.13.

Cc: stable@vger.kernel.org (3.13+)
Reported-and-tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Fixes: 30b8feb7
Signed-off-by: NeilBrown <neilb@suse.de>

14 Oct, 2014 14 commits

md: move EXPORT_SYMBOL to after function in md.c · 6c144d31

Committed by NeilBrown on Sep 30, 2014

Signed-off-by: NeilBrown <neilb@suse.de>

md: discard PRINT_RAID_DEBUG ioctl · 2cbbca5e

Committed by NeilBrown on Sep 30, 2014

All the interesting information printed by this ioctl
is provided in /proc/mdstat and/or sysfs.
So it isn't needed and isn't used and would be best if it didn't
exist.
Signed-off-by: NeilBrown <neilb@suse.de>

md: remove MD_BUG() · 403df478

Committed by NeilBrown on Sep 30, 2014

Most of the places that call this are doing so pointlessly.
A couple of the others are best replaced with WARN_ON().
Signed-off-by: NeilBrown <neilb@suse.de>

md: clean up 'exit' labels in md_ioctl(). · 3adc28d8

Committed by NeilBrown on Sep 30, 2014

There are 4 labels and we only really need two.
Signed-off-by: NeilBrown <neilb@suse.de>

md: remove unnecessary test for MD_MAJOR in md_ioctl() · 326eb17d

Committed by NeilBrown on Sep 30, 2014

unknown ioctls no longer get this deep into md_ioctl since
md_ioctl_valid() was introduced in 3.14.
So remove the test and the misleading comment.
Signed-off-by: NeilBrown <neilb@suse.de>

md: don't allow "-sync" to be set for device in an active array. · e1960f8c

Committed by NeilBrown on Sep 30, 2014

If an array is active, devices can be marked 'faulty', but simply
removing the 'sync' flag is wrong.  That only makes sense
for an array which is not active (and is probably only useful
for testing anyway).
Signed-off-by: NeilBrown <neilb@suse.de>

md: remove unwanted white space from md.c · f72ffdd6

Committed by NeilBrown on Sep 30, 2014

My editor shows much of this is RED.
Signed-off-by: NeilBrown <neilb@suse.de>

md: don't start resync thread directly from md thread. · ac05f256

Committed by NeilBrown on Sep 30, 2014

The main 'md' thread is needed for processing writes, so if it blocks
write requests could be delayed.

Starting a new thread requires some GFP_KERNEL allocations and so can
wait for writes to complete.  This can deadlock.

So instead, ask a workqueue to start the sync thread.
There is no particular rush for this to happen, so any work queue
will do.

MD_RECOVERY_RUNNING is used to ensure only one thread is started.
Reported-by: BillStuff <billstuff2001@sbcglobal.net>
Signed-off-by: NeilBrown <neilb@suse.de>

md: Just use RCU when checking for overlap between arrays. · 8b1afc3d

Committed by NeilBrown on Sep 29, 2014

We don't really need the full mddev_lock here, and having to
drop it is messy.
RCU is enough to protect these lists.
Signed-off-by: NeilBrown <neilb@suse.de>

md: avoid potential long delay under pers_lock · 50bd3774

Committed by Chao Yu on Sep 25, 2014

printk can take a long time if printk_delay is set to a large value
via sysctl.  If register_md_personality spends that time printing while
holding the pers_lock spinlock, other contenders for pers_lock spin in
the meantime, which can cause high CPU usage.
We can avoid this by moving the printk calls outside the region covered
by pers_lock.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: simplify export_array() · 0638bb0e

Committed by NeilBrown on Sep 25, 2014

We don't really need that for_each loop, or those MD_BUGs.
Signed-off-by: NeilBrown <neilb@suse.de>

md: discard find_rdev_nr in favour of find_rdev_nr_rcu · 4878e9eb

Committed by NeilBrown on Sep 25, 2014

Having both is a waste - just use the one.
Signed-off-by: NeilBrown <neilb@suse.de>

md: use wait_event() to simplify md_super_wait() · 1967cd56

Committed by NeilBrown on Sep 09, 2014

md_super_wait is really just wait_event() open-coded.
So use the macro instead.
Signed-off-by: NeilBrown <neilb@suse.de>

md: be more relaxed about stopping an array which isn't started. · 9ba3b7f5

Committed by NeilBrown on Sep 09, 2014

In general we don't allow an array to be stopped if it is in use.
However if the array hasn't really been started yet, then any
apparent use is an anomaly, probably due to 'udev' or similar
having a look to see what is there.

This means that if something goes wrong while assembling an array
it cannot reliably be un-assembled - STOP_ARRAY could fail.
There is no value here, so change do_md_stop() to succeed
despite concurrent opens if the array has not yet been
activated.  i.e. if ->pers is NULL.
Reported-by: "Baldysiak, Pawel" <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>

08 Aug, 2014 2 commits

md: don't allow bitmap file to be added to raid0/linear. · d66b1b39

Committed by NeilBrown on Aug 08, 2014

An array can only accept a bitmap if it will call bitmap_daemon_work
periodically, which means it needs a thread running.

If there is no thread, don't allow a bitmap to be added.
Signed-off-by: NeilBrown <neilb@suse.de>

md: Recovery speed is wrong · ac7e50a3

Committed by Xiao Ni on Aug 07, 2014

When we calculate the speed of recovery, the numerator contains
the number of sectors for which recovery has been started.  The sectors
whose recovery has not yet finished need to be subtracted from it.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
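The arithmetic behind this fix can be sketched as follows. This is an illustration under assumptions, not the code from md.c: the function and parameter names are hypothetical, sectors are taken to be 512 bytes, and the rate is measured since a "mark" point.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of a recovery-speed calculation.
 *
 *   submitted_sectors : sectors for which recovery has been started
 *   in_flight_sectors : sectors started but not yet finished
 *   mark_sectors      : completed-sector count at the last mark
 *   elapsed_sec       : seconds since that mark
 *
 * Only completed sectors count towards the speed, so the in-flight
 * sectors are subtracted from the numerator first.
 */
static uint64_t recovery_speed_kib_per_sec(uint64_t submitted_sectors,
                                           uint64_t in_flight_sectors,
                                           uint64_t mark_sectors,
                                           uint64_t elapsed_sec)
{
    uint64_t completed;

    if (in_flight_sectors > submitted_sectors)
        return 0;
    completed = submitted_sectors - in_flight_sectors;
    if (completed < mark_sectors)
        return 0;
    if (elapsed_sec == 0)
        elapsed_sec = 1;    /* avoid division by zero */

    /* two 512-byte sectors per KiB */
    return (completed - mark_sectors) / 2 / elapsed_sec;
}
```

Without the subtraction, work that is merely queued would be reported as finished and the displayed speed would be too high.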

31 Jul, 2014 1 commit

md: disable probing for md devices 512 and over. · af5628f0

Committed by NeilBrown on Jul 31, 2014

The way md devices are traditionally created in the kernel
is simply to open the device with the desired major/minor number.

This can be problematic as some support tools, notably udev and
programs run by udev, can open a device just to see what is there, and
find that it has created something.  It is easy for a race to cause
udev to open an md device just after it was destroyed, causing it to
suddenly re-appear.

For some time we have had an alternate way to create md devices
  echo md_somename > /sys/module/md_mod/parameters/new_array

This will always use a minor number of 512 or higher, which mdadm
normally avoids.
Using this makes the creation-by-opening unnecessary, but does
not disable it, so it is still there to cause problems.

This patch disables probing for devices with a major of 9 (MD_MAJOR)
and a minor of 512 and up.  Thus devices created by writing to
new_array cannot be re-created by opening the node in /dev.
Signed-off-by: NeilBrown <neilb@suse.de>
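The rule stated in this commit message can be written as a small predicate. This is a sketch only: `md_probe_allowed()` and `MD_FIRST_NAMED_MINOR` are hypothetical names for illustration, and the message does not say how the kernel implements the restriction.

```c
#define MD_MAJOR 9                  /* major number given in the commit message */
#define MD_FIRST_NAMED_MINOR 512    /* hypothetical name for the 512 threshold */

/*
 * Return 1 if opening a device node with this major/minor may create
 * the device as a side effect, 0 if such probing is disabled.
 */
static int md_probe_allowed(unsigned int major, unsigned int minor)
{
    if (major == MD_MAJOR && minor >= MD_FIRST_NAMED_MINOR)
        return 0;
    return 1;
}
```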

03 Jul, 2014 2 commits

md: flush writes before starting a recovery. · 133d4527

Committed by NeilBrown on Jul 02, 2014

When we write to a degraded array which has a bitmap, we
make sure the relevant bit in the bitmap remains set when
the write completes (so a 're-add' can quickly rebuild a
temporarily-missing device).

If, immediately after such a write starts, we incorporate a spare,
commence recovery, and skip over the region where the write is
happening (because the 'needs recovery' flag isn't set yet),
then that write will not get to the new device.

Once the recovery finishes the new device will be trusted, but will
have incorrect data, leading to possible corruption.

We cannot set the 'needs recovery' flag when we start the write as we
do not know easily if the write will be "degraded" or not.  That
depends on details of the particular raid level and particular write
request.

This patch fixes a corruption issue of long standing and so is
suitable for any -stable kernel.  It applied correctly to 3.0 at
least and will apply with minor editing to earlier kernels.
Reported-by: Bill <billstuff2001@sbcglobal.net>
Tested-by: Bill <billstuff2001@sbcglobal.net>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/53A518BB.60709@sbcglobal.net
Signed-off-by: NeilBrown <neilb@suse.de>

md: make sure GET_ARRAY_INFO ioctl reports correct "clean" status · 9bd35920

Committed by NeilBrown on Jul 02, 2014

If an array has a bitmap, then when we set the "has bitmap" flag we
incorrectly clear the "is clean" flag.

"is clean" isn't really important when a bitmap is present, but it is
best to get it right anyway.
Reported-by: George Duffield <forumscollective@gmail.com>
Link: http://lkml.kernel.org/CAG__1a4MRV6gJL38XLAurtoSiD3rLBTmWpcS5HYvPpSfPR88UQ@mail.gmail.com
Fixes: 36fa3063 (v2.6.14)
Signed-off-by: NeilBrown <neilb@suse.de>
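One common way a flag gets cleared by accident like this is plain assignment where a bitwise OR was intended. The sketch below illustrates that pattern only; the constant and function names are hypothetical and it is not claimed to be the code from md.c.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
enum {
    SB_CLEAN          = 0,
    SB_BITMAP_PRESENT = 8
};

/* Build a state word from two independent properties of an array. */
static uint32_t array_state(int is_clean, int has_bitmap)
{
    uint32_t state = 0;

    if (is_clean)
        state = 1u << SB_CLEAN;
    if (has_bitmap)
        state |= 1u << SB_BITMAP_PRESENT;   /* '=' here would drop SB_CLEAN */
    return state;
}
```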

29 May, 2014 3 commits

md: md_clear_badblocks should return an error code on failure. · 8b32bf5e

Committed by NeilBrown on May 28, 2014

Julia Lawall and coccinelle report that md_clear_badblocks always
returns 0, despite appearing to have an error path.
The error path really should return an error code.  ENOSPC is
reasonably appropriate.
Reported-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: NeilBrown <neilb@suse.de>
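The following sketch shows an error path that reports -ENOSPC instead of silently returning 0. It assumes a hypothetical fixed-capacity table of ranges in which clearing the middle of an entry splits it in two and therefore needs a free slot. The types, names and capacity are illustrative and are not the md bad-block structures.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_RANGES 4    /* hypothetical, deliberately tiny capacity */

struct range { unsigned long start, len; };
struct range_table { struct range r[MAX_RANGES]; size_t count; };

/*
 * Clear [start, start + len) lying strictly inside entry idx.
 * The entry is split in two, which needs one free slot; if none is
 * available the caller is told so via -ENOSPC.
 */
static int clear_inside(struct range_table *t, size_t idx,
                        unsigned long start, unsigned long len)
{
    struct range *e;
    unsigned long e_end;
    size_t i;

    if (idx >= t->count || len == 0)
        return -EINVAL;
    e = &t->r[idx];
    e_end = e->start + e->len;
    if (start <= e->start || start + len >= e_end)
        return -EINVAL;
    if (t->count >= MAX_RANGES)
        return -ENOSPC;             /* the error path must not return 0 */

    /* Shift later entries up by one to open a slot after idx. */
    for (i = t->count; i > idx + 1; i--)
        t->r[i] = t->r[i - 1];
    t->r[idx + 1].start = start + len;
    t->r[idx + 1].len = e_end - (start + len);
    e->len = start - e->start;
    t->count++;
    return 0;
}
```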

md: refuse to change shape of array if it is active but read-only · bd8839e0

Committed by NeilBrown on May 28, 2014

read-only arrays should not be changed.  This includes changing
the level, layout, size, or number of devices.

So reject those changes for readonly arrays.
Signed-off-by: NeilBrown <neilb@suse.de>

md: always set MD_RECOVERY_INTR when interrupting a reshape thread. · 2ac295a5

Committed by NeilBrown on May 29, 2014

Commit 8313b8e5
   md: fix problem when adding device to read-only array with bitmap.

added a call to md_reap_sync_thread() which can cause a reshape thread
to be interrupted (in particular, it could cause md_thread() to never even
call md_do_sync()).
However it didn't set MD_RECOVERY_INTR so ->finish_reshape() would not
know that the reshape didn't complete.

This only happens when mddev->ro is set and normally reshape threads
don't run in that situation.  But raid5 and raid10 can start a reshape
thread during "run" if the array is in the middle of a reshape.
They do this even if ->ro is set.

So it is best to set MD_RECOVERY_INTR before aborting the
sync thread, just in case.

Though it is rare for this to trigger a problem, it can cause data corruption
because the reshape isn't finished properly.
So it is suitable for any stable kernel to which the offending commit was
applied (3.2 or later).

Fixes: 8313b8e5
Cc: stable@vger.kernel.org (3.2+)
Signed-off-by: NeilBrown <neilb@suse.de>