Commits · ffd96e35c16a99fdb490cc5723b8e32135ae5883 · openeuler / Kernel

18 Jul, 2011 · 2 commits

md/raid5: get rid of duplicated call to bio_data_dir() · ffd96e35

Authored by Namhyung Kim on Jul 18, 2011

In raid5::make_request(), once bio_data_dir(@bi) has been determined
it never changes (and cannot). Always use the stored result.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: use kmem_cache_zalloc() · 6ce32846

Authored by Namhyung Kim on Jul 18, 2011

Replace kmem_cache_alloc + memset(,0,) with kmem_cache_zalloc.
I think it's not harmful since @conf->slab_cache already knows
the actual size of struct stripe_head.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
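
A userspace analogue of this change, as a minimal sketch: calloc stands in for kmem_cache_zalloc, and malloc followed by memset stands in for the old pair. The struct and function names are hypothetical stand-ins, not kernel code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct stripe_head; not the kernel layout. */
struct stripe_head_sketch {
    unsigned long sector;
    int count;
    int disks;
};

/* Old pattern: allocate, then zero the object by hand. */
static struct stripe_head_sketch *alloc_then_memset(void)
{
    struct stripe_head_sketch *sh = malloc(sizeof(*sh));

    if (!sh)
        return NULL;
    memset(sh, 0, sizeof(*sh));
    return sh;
}

/* New pattern: a single call hands back zeroed memory. */
static struct stripe_head_sketch *alloc_zeroed(void)
{
    return calloc(1, sizeof(struct stripe_head_sketch));
}
```

Both helpers return an object whose bytes are all zero; the second simply leaves no room to forget the memset or to pass it the wrong size.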

14 Jun, 2011 · 3 commits

md/raid5: remove unusual use of bio_iovec_idx() · fcde9075

Authored by Namhyung Kim on Jun 14, 2011

In the bio_for_each_segment loop, bvl always points to the current
bio_vec, so it is the same as bio_iovec_idx(, i). Let's get rid of
it.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: fix FUA request handling in ops_run_io() · b062962e

Authored by Namhyung Kim on Jun 14, 2011

Commit e9c7469b ("md: implment REQ_FLUSH/FUA support")
introduced R5_WantFUA flag and set rw to WRITE_FUA in that case.
However, the remaining code still checks whether rw is exactly equal
to WRITE, so a FUA write ends up being treated as a READ.
Fix it.

This bug has been present since 2.6.37 and the fix is suitable for any
-stable kernel since then.  It is not clear why this has not caused
more problems.

Cc: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
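
The failure mode described here can be reproduced in a few lines. This is only a sketch: the SK_* flag values are made up for illustration and do not match the kernel's bit layout, and the helper names are hypothetical.

```c
#include <stdbool.h>

/* Illustrative flag values only; the kernel's real bit positions differ. */
#define SK_READ      0u
#define SK_WRITE     1u                       /* data-direction bit */
#define SK_REQ_FUA   (1u << 4)                /* hypothetical FUA bit */
#define SK_WRITE_FUA (SK_WRITE | SK_REQ_FUA)

/* Buggy test: a FUA write is not exactly SK_WRITE, so it looks like a read. */
static bool is_write_buggy(unsigned int rw)
{
    return rw == SK_WRITE;
}

/* Fixed test: look only at the direction bit and ignore extra flags. */
static bool is_write_fixed(unsigned int rw)
{
    return (rw & SK_WRITE) != 0;
}
```

With these definitions, is_write_buggy(SK_WRITE_FUA) is false while is_write_fixed(SK_WRITE_FUA) is true, which is the misclassification the commit message describes.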

md/raid5: fix raid5_set_bi_hw_segments · 9b2dc8b6

Authored by Namhyung Kim on Jun 13, 2011

The @bio->bi_phys_segments field holds the active stripes count in the
lower 16 bits and the processed stripes count in the upper 16 bits. So
the logical-OR operator should be a bitwise one.

This bug has been present since 2.6.27 and the fix is suitable for any
-stable kernel since then.  Fortunately the bad code is only used on
error paths and is relatively unlikely to be hit.

Cc: stable@kernel.org
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
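
A small sketch of why the operator matters, using a plain 32-bit integer in place of bi_phys_segments. The helper names are hypothetical; only the bit layout (low 16 bits active, high 16 bits processed) comes from the commit message.

```c
#include <stdint.h>

/* Low 16 bits: active stripes. High 16 bits: processed stripes. */
static uint32_t active_stripes(uint32_t segs)
{
    return segs & 0xffff;
}

static uint32_t processed_stripes(uint32_t segs)
{
    return (segs >> 16) & 0xffff;
}

/* Buggy: logical OR collapses the whole expression to 0 or 1. */
static uint32_t set_processed_buggy(uint32_t segs, uint32_t cnt)
{
    return active_stripes(segs) || (cnt << 16);
}

/* Fixed: bitwise OR keeps both 16-bit fields intact. */
static uint32_t set_processed_fixed(uint32_t segs, uint32_t cnt)
{
    return active_stripes(segs) | (cnt << 16);
}
```

With 3 active stripes and a processed count of 5, the buggy version yields 1, losing both counts, while the fixed version yields a value from which 3 and 5 can both be read back.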

09 Jun, 2011 · 1 commit

MD: raid5 do not set fullsync · d6b212f4

Authored by Jonathan Brassow on Jun 08, 2011

Add check to determine if a device needs full resync or if partial resync will do

RAID 5 was assuming that if a device was not In_sync, it must undergo a full
resync.  We add a check to see if 'saved_raid_disk' is the same as 'raid_disk'.
If it is, we can safely skip the full resync and rely on the bitmap for
partial recovery instead.  This is the legitimate purpose of 'saved_raid_disk',
from md.h:
int saved_raid_disk;            /* role that device used to have in the
                                 * array and could again if we did a partial
                                 * resync from the bitmap
                                 */
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
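
The decision described above can be sketched as a small predicate. The struct is a hypothetical, trimmed-down stand-in for the kernel's per-device state, and the function name is invented for illustration; the real code sets a flag rather than returning a value.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down view of a member device. */
struct rdev_sketch {
    bool in_sync;
    int raid_disk;        /* role the device has now */
    int saved_raid_disk;  /* role it used to have, per the md.h comment */
};

/*
 * Returns true when a full resync is needed. Returns false when the
 * device is already in sync, or when it is returning to its old role
 * and bitmap-based partial recovery is enough.
 */
static bool needs_full_resync(const struct rdev_sketch *rdev)
{
    if (rdev->in_sync)
        return false;
    return rdev->saved_raid_disk != rdev->raid_disk;
}
```

Before the change, the equivalent of this predicate returned true for every device that was not In_sync, regardless of its saved role.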

11 May, 2011 · 2 commits

md: allow resync_start to be set while an array is active. · b098636c

Authored by NeilBrown on May 11, 2011

The sysfs attribute 'resync_start' (known internally as recovery_cp)
records where a resync is up to.  A value of 0 means the array is
not known to be in-sync at all.  A value of MaxSector means the array
is believed to be fully in-sync.

When the size of member devices of an array (RAID1,RAID4/5/6) is
increased, the array can be increased to match.  This process sets
resync_start to the old end-of-device offset so that the new part of
the array gets resynced.

However with RAID1 (and RAID6) a resync is not technically necessary
and may be undesirable.  So it would be good if the implied resync
after the array is resized could be avoided.

So: change 'resync_start' so the value can be changed while the array
is active, and as a precaution only allow it to be changed while
resync/recovery is 'frozen'.  Changing it once resync has started is
not going to be useful anyway.

This allows the array to be resized without a resync by:
  write 'frozen' to 'sync_action'
  write new size to 'component_size' (this will set resync_start)
  write 'none' to 'resync_start'
  write 'idle' to 'sync_action'.

Also slightly improve some tests on recovery_cp when resizing
raid1/raid5.  Now that an arbitrary value could be set we should be
more careful in our tests.
Signed-off-by: NeilBrown <neilb@suse.de>
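
The four-step sequence above could be driven from a small userspace helper. This is a sketch under assumptions: md_dir is expected to be the array's sysfs directory (for example /sys/block/md0/md, where md0 is a placeholder), the helper names are hypothetical, and the sketch stops at the first failed write without undoing earlier steps.

```c
#include <stdio.h>

/* Write one value to <md_dir>/<attr>; returns 0 on success, -1 on error. */
static int write_attr(const char *md_dir, const char *attr, const char *value)
{
    char path[512];
    FILE *f;
    int n = snprintf(path, sizeof(path), "%s/%s", md_dir, attr);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fputs(value, f) == EOF) {
        fclose(f);
        return -1;
    }
    /* Buffered write errors surface at fclose(), so check it. */
    return fclose(f) == 0 ? 0 : -1;
}

/* Resize an array without the implied resync, following the steps above. */
static int resize_without_resync(const char *md_dir, const char *new_size)
{
    if (write_attr(md_dir, "sync_action", "frozen"))
        return -1;
    if (write_attr(md_dir, "component_size", new_size))
        return -1;
    if (write_attr(md_dir, "resync_start", "none"))
        return -1;
    return write_attr(md_dir, "sync_action", "idle");
}
```

A caller would pass the new per-device size as a string in whatever unit 'component_size' expects. A production tool would likely want to return 'sync_action' to 'idle' on failure as well.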

md: make error_handler functions more uniform and correct. · 6f8d0c77

Authored by NeilBrown on May 11, 2011

- there is no need to test_bit Faulty, as that was already done in
  md_error which is the only caller of these functions.
- MD_CHANGE_DEVS should be set *after* faulty is set to ensure
  metadata is updated correctly.
- spinlock should be held while updating ->degraded.
Signed-off-by: NeilBrown <neilb@suse.de>

10 May, 2011 · 1 commit

md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course'). · aeb878b0

Authored by Jesper Juhl on Apr 10, 2011

There's a small typo in a comment in drivers/md/raid5.c - 'Of course' is
misspelled as 'Ofcourse'. This patch fixes the spelling error.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

22 Apr, 2011 · 1 commit

raid5: fix build error, sector_t usage · d76c8420

Authored by Randy Dunlap on Apr 21, 2011

Change <sectors> from unsigned long long to sector_t.
This matches its source field.

  ERROR: "__udivdi3" [drivers/md/raid456.ko] undefined!
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
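
For background: "__udivdi3" is the libgcc routine a compiler may call for 64-bit unsigned division on 32-bit targets, and the kernel does not link libgcc, so sector arithmetic goes through dedicated helpers instead. The sketch below only mimics the calling shape of such a helper in userspace (divide in place, return the remainder). The names are hypothetical and this is not the kernel's implementation.

```c
#include <stdint.h>

/* Stand-in for sector_t; in the kernel its width depends on configuration. */
typedef uint64_t sector_sketch_t;

/*
 * Divide *n by base in place and return the remainder, mirroring the
 * shape of a sector_div()-style helper. base must be non-zero.
 */
static uint32_t sector_div_sketch(sector_sketch_t *n, uint32_t base)
{
    uint32_t rem = (uint32_t)(*n % base);

    *n /= base;
    return rem;
}
```

Keeping the variable's type identical to the field it is read from, as the commit does, means the helper operates on the width it was written for.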

20 Apr, 2011 · 2 commits

md: Fix dev_sectors on takeover from raid0 to raid4/5 · 3b71bd93

Authored by NeilBrown on Apr 20, 2011

A raid0 array doesn't set 'dev_sectors' as each device might
contribute a different number of sectors.
So when converting to a RAID4 or RAID5 we need to set dev_sectors
as they need the number.
We have already verified that in fact all devices do contribute
the same number of sectors, so use that number.
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: remove setting of ->queue_lock · 2b7da309

Authored by NeilBrown on Apr 20, 2011

We previously needed to set ->queue_lock to match the raid5
device_lock so we could safely use queue_flag_* operations (e.g. for
plugging), which test that the ->queue_lock is in fact locked.

However that need has completely gone away and is unlikely to come
back, so remove this now-pointless setting.
Signed-off-by: NeilBrown <neilb@suse.de>

18 Apr, 2011 · 3 commits

md: incorporate new plugging into raid5. · 7c13edc8

Authored by NeilBrown on Apr 18, 2011

In raid5 plugging is used for 2 things:
 1/ collecting writes that require a bitmap update
 2/ collecting writes in the hope that we can create full
    stripes - or at least more-full.

We now release these different sets of stripes when plug_cnt
is zero.

Also in make_request, we call mddev_check_plug to hopefully increase
plug_cnt, and wake up the thread at the end if plugging wasn't
achieved for some reason.
Signed-off-by: NeilBrown <neilb@suse.de>

md - remove old plugging code. · 482c0834

Authored by NeilBrown on Apr 18, 2011

md has some plugging infrastructure for RAID5 to use because the
normal plugging infrastructure required a 'request_queue', and when
called from dm, RAID5 doesn't have one of those available.

This relied on the ->unplug_fn callback which doesn't exist any more.

So remove all of that code, both in md and raid5.  Subsequent patches
will restore the plugging functionality.
Signed-off-by: NeilBrown <neilb@suse.de>

md: use new plugging interface for RAID IO. · e1dfa0a2

Authored by NeilBrown on Apr 18, 2011

md/raid submits a lot of IO from the various raid threads.
So add start/finish plug calls to those threads so that some
plugging happens.
Signed-off-by: NeilBrown <neilb@suse.de>

10 Mar, 2011 · 1 commit

block: remove per-queue plugging · 7eaceacc

Authored by Jens Axboe on Mar 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

21 Feb, 2011 · 1 commit

md: avoid spinlock problem in blk_throtl_exit · da9cf505

Authored by NeilBrown on Feb 21, 2011

blk_throtl_exit assumes that ->queue_lock still exists,
so make sure that it does.
To do this, we stop redirecting ->queue_lock to conf->device_lock
and leave it pointing where it is initialised - __queue_lock.

As the blk_plug functions check the ->queue_lock is held, we now
take that spin_lock explicitly around the plug functions.  We don't
need the locking, just the warning removal.

This is needed for any kernel with the blk_throtl code, which is
2.6.37 and later.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

31 Jan, 2011 · 3 commits

md: don't abort checking spares as soon as one cannot be added. · 50da0840

Authored by NeilBrown on Jan 31, 2011

As spares can be added manually before a reshape starts, we need to
find them all to mark some of them as in_sync.

Previously we would abort looking for spares when we found an
unallocated spare that could not be added to the array (implying there
was no room for new spares).  However already-added spares could be
later in the list, so we need to keep searching.
Signed-off-by: NeilBrown <neilb@suse.de>

md: fix the test for finding spares in raid5_start_reshape. · 469518a3

Authored by NeilBrown on Jan 31, 2011

As spares can be added to the array before the reshape is started,
we need to find and count them when checking there are enough.
The array could have been degraded, so we need to check all devices,
not just those outside of the range of devices in the array before
the reshape.

So instead of checking the index, check the In_sync flag as that
reliably tells if the device is a spare for this purpose.
Signed-off-by: NeilBrown <neilb@suse.de>

md: simplify some 'if' conditionals in raid5_start_reshape. · 87a8dec9

Authored by NeilBrown on Jan 31, 2011

There are two consecutive 'if' statements.

 if (mddev->delta_disks >= 0)
      ....
 if (mddev->delta_disks > 0)

The code in the second is equally valid if delta_disks == 0, and these
two statements are the only place that 'added_devices' is used.

So make them a single if statement, make added_devices a local
variable, and re-indent it all.

No functional change.
Signed-off-by: NeilBrown <neilb@suse.de>

14 Jan, 2011 · 4 commits

md/raid5: handle manually-added spares in start_reshape. · 1a940fce

Authored by NeilBrown on Jan 14, 2011

It is possible to manually add spares to specific slots before
starting a reshape.
raid5_start_reshape should recognise this possibility and include
it in the accounting.
Signed-off-by: NeilBrown <neilb@suse.de>

md: Don't let implementation detail of curr_resync leak out through sysfs. · 75d3da43

Authored by NeilBrown on Jan 14, 2011

mddev->curr_resync has artificial values of '1' and '2' which are used
by the code which ensures only one resync is happening at a time on
any given device.

These values are internal and should never be exposed to user-space
(except when translated appropriately as in the 'pending' status in
/proc/mdstat).

Unfortunately they are exposed, as ->curr_resync is assigned to
->curr_resync_completed and that value is directly visible through
sysfs.

So change the assignments to ->curr_resync_completed to get the same
value from elsewhere in a form that doesn't have the magic '1' or '2'
values.
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: use sysfs_notify_dirent_safe to avoid NULL pointer · 43c73ca4

Authored by Jonathan Brassow on Jan 14, 2011

With the module parameter 'start_dirty_degraded' set,
raid5_spare_active() previously called sysfs_notify_dirent() with a NULL
argument (rdev->sysfs_state) when a rebuild finished.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
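
The "_safe" variant amounts to a NULL check in front of the notification. A minimal userspace sketch of that pattern follows; the struct and both function names are hypothetical stand-ins, with a counter in place of a real sysfs notification.

```c
#include <stddef.h>

/* Hypothetical stand-in for the sysfs dirent behind rdev->sysfs_state. */
struct dirent_sketch {
    int notified;
};

/* Stand-in for the raw notifier: dereferences its argument, so NULL would crash. */
static void notify_dirent(struct dirent_sketch *sd)
{
    sd->notified++;
}

/* Safe wrapper: skip the notification when there is no dirent to notify. */
static void notify_dirent_safe(struct dirent_sketch *sd)
{
    if (sd)
        notify_dirent(sd);
}
```

Callers that cannot guarantee the dirent exists, as in the rebuild-finished path described above, call the wrapper instead of the raw function.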

md: Fix single printks with multiple KERN_<level>s · 067032bc

Authored by Joe Perches on Jan 14, 2011

Noticed-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: NeilBrown <neilb@suse.de>

28 Oct, 2010 · 2 commits

md: use separate bio pool for each md device. · a167f663

Authored by NeilBrown on Oct 26, 2010

bio_clone and bio_alloc allocate from a common bio pool.
If an md device is stacked with other devices that use this pool, or under
something like swap which uses the pool, then the multiple calls on
the pool can cause deadlocks.

So allocate a local bio pool for each md array and use that rather
than the common pool.

This pool is used both for regular IO and metadata updates.
Signed-off-by: NeilBrown <neilb@suse.de>

md: use sector_t in bitmap_get_counter · 57dab0bd

Authored by NeilBrown on Oct 19, 2010

bitmap_get_counter returns the number of sectors covered
by the counter in a pass-by-reference variable.
In some cases this can be very large, so make it a sector_t
for safety.
Signed-off-by: NeilBrown <neilb@suse.de>

10 Sep, 2010 · 1 commit

md: implment REQ_FLUSH/FUA support · e9c7469b

Authored by Tejun Heo on Sep 03, 2010

This patch converts md to support REQ_FLUSH/FUA instead of now
deprecated REQ_HARDBARRIER.  In the core part (md.c), the following
changes are notable.

* Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA don't interfere with
  processing of other requests and thus there is no reason to mark the
  queue congested while FLUSH/FUA is in progress.

* REQ_FLUSH/FUA failures are final and its users don't need retry
  logic.  Retry logic is removed.

* Preflush needs to be issued to all member devices but FUA writes can
  be handled the same way as other writes - their processing can be
  deferred to request_queue of member devices.  md_barrier_request()
  is renamed to md_flush_request() and simplified accordingly.

For linear, raid0 and multipath, the core changes are enough.  raid1,
5 and 10 need the following conversions.

* raid1: Handling of FLUSH/FUA bio's can simply be deferred to
  request_queues of member devices.  Barrier related logic removed.

* raid5: Queue draining logic dropped.  FUA bit is propagated through
  biodrain and stripe reconstruction such that all the updated parts
  of the stripe are written out with FUA writes if any of the dirtying
  writes was FUA.  preread_active_stripes handling in make_request()
  is updated as suggested by Neil Brown.

* raid10: FUA bit needs to be propagated to write clones.

linear, raid0, 1, 5 and 10 tested.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

18 Aug, 2010 · 2 commits

md: provide appropriate return value for spare_active functions. · 6b965620

Authored by NeilBrown on Aug 18, 2010

md_check_recovery expects ->spare_active to return 'true' if any
spares were activated, but none of them do, so the consequent change
in 'degraded' is not notified through sysfs.

So count the number of spares activated, subtract it from 'degraded'
just once, and return it.
Reported-by: Adrian Drzewiecki <adriand@vmware.com>
Signed-off-by: NeilBrown <neilb@suse.de>
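
The counting approach described above can be sketched as follows. The structs are hypothetical simplifications of the array and member state, and the activation condition is an assumption chosen for illustration; each real personality has its own test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified member-device and array state. */
struct member_sketch {
    bool present;
    bool faulty;
    bool in_sync;
};

struct array_sketch {
    struct member_sketch *devs;
    size_t ndevs;
    int degraded;
};

/* Activate recovered spares; return how many were activated. */
static int spare_active_sketch(struct array_sketch *a)
{
    int count = 0;

    for (size_t i = 0; i < a->ndevs; i++) {
        struct member_sketch *m = &a->devs[i];

        if (m->present && !m->faulty && !m->in_sync) {
            m->in_sync = true;
            count++;
        }
    }
    a->degraded -= count;  /* adjusted once, after the loop */
    return count;
}
```

A non-zero return value tells the caller that 'degraded' changed, so it knows to send the sysfs notification.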

md: Notify sysfs when RAID1/5/10 disk is In_sync. · e6ffbcb6

Authored by Adrian Drzewiecki on Aug 18, 2010

When RAID1 is done syncing disks, it'll update the state
of synced rdevs to In_sync. But it neglected to notify
sysfs that the attribute changed. So any programs that
are waiting for an rdev's state to change will not be
woken.

(raid5/raid10 added by neilb)
Signed-off-by: Adrian Drzewiecki <adriand@vmware.com>
Signed-off-by: NeilBrown <neilb@suse.de>

08 Aug, 2010 · 1 commit

block: unify flags for struct bio and struct request · 7b6d91da

Authored by Christoph Hellwig on Aug 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

26 Jul, 2010 · 6 commits

md/raid5: export raid5 unplugging interface. · 9f7c2220

Authored by NeilBrown on Jul 26, 2010

Also remove remaining accesses to ->queue and ->gendisk when ->queue
is NULL (As it is in a DM target).
Signed-off-by: NeilBrown <neilb@suse.de>

md/plug: optionally use plugger to unplug an array during resync/recovery. · 252ac522

Authored by NeilBrown on Jun 01, 2010

If an array doesn't have a 'queue' then md_do_sync cannot
unplug it.
In that case it will have a 'plugger', so make that available
to the mddev, and use it to unplug the array if needed.
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: add simple plugging infrastructure. · 2ac87401

Authored by NeilBrown on Jun 01, 2010

md/raid5 uses the plugging infrastructure provided by the block layer
and 'struct request_queue'.  However when we plug raid5 under dm there
is no request queue so we cannot use that.

So create a similar infrastructure that is much lighter weight and use
it for raid5.
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: export is_congested test · 11d8a6e3

Authored by NeilBrown on Jul 26, 2010

The dm module will need this for dm-raid45.

Also only access ->queue->backing_dev_info->congested_fn
if ->queue actually exists.  It won't in a dm target.
Signed-off-by: NeilBrown <neilb@suse.de>

raid5: Don't set read-ahead when there is no queue · 4a5add49

Authored by NeilBrown on Jun 01, 2010

dm-raid456 does not provide a 'queue' for raid5 to use,
so we must make raid5 stop depending on the queue.

First: read_ahead
dm handles read-ahead adjustment fully in userspace, so
simply don't do any readahead adjustments if there is
no queue.

Also re-arrange code slightly so all the accesses to ->queue are
together.

Finally, move the blk_queue_merge_bvec function into the 'if' as
the ->split_io setting in dm-raid456 has the same effect.
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk · f4be6b43

Authored by NeilBrown on Jun 01, 2010

We will shortly allow md devices with no gendisk (they are attached to
a dm-target instead).  That will cause mdname() to return 'mdX'.
There is one place where mdname really needs to be unique: when
creating the name for a slab cache.
So in that case, if there is no gendisk, use the address of the mddev
formatted in HEX to provide a unique name.
Signed-off-by: NeilBrown <neilb@suse.de>
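
A sketch of the naming rule described above, in userspace C. The function name, its parameters, and the "raid<level>-" prefix are assumptions made for illustration; disk_name being NULL models an mddev with no gendisk.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build a slab-cache name. With a gendisk, use its name; without one,
 * fall back to the address of the mddev so the name stays unique.
 * Returns what snprintf() returns.
 */
static int make_cache_name(char *buf, size_t len, int level,
                           const char *disk_name, const void *mddev)
{
    if (disk_name)
        return snprintf(buf, len, "raid%d-%s", level, disk_name);
    /* %p output is implementation-defined but distinct per address. */
    return snprintf(buf, len, "raid%d-%p", level, (void *)mddev);
}
```

Two arrays without a gendisk live at different addresses, so their cache names cannot collide the way two identical 'mdX' strings would.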

21 Jul, 2010 · 2 commits

md/raid5: factor out code for changing size of stripe cache. · c41d4ac4

Authored by NeilBrown on Jun 01, 2010

Separate the actual 'change' code from the sysfs interface
so that it can eventually be called internally.
Signed-off-by: NeilBrown <neilb@suse.de>

md: reduce dependence on sysfs. · 00bcb4ac

Authored by NeilBrown on Jun 01, 2010

We will want md devices to live as dm targets where sysfs is not
visible.  So allow md to not connect to sysfs.
Signed-off-by: NeilBrown <neilb@suse.de>

24 Jun, 2010 · 2 commits

md/raid5: don't include 'spare' drives when reshaping to fewer devices. · 3424bf6a

Authored by NeilBrown on Jun 17, 2010

There are few situations where it would make any sense to add a spare
when reducing the number of devices in an array, but it is
conceivable:  A 6 drive RAID6 with two missing devices could be
reshaped to a 5 drive RAID6, and a spare could become available
just in time for the reshape, but not early enough to have been
recovered first.  'freezing' recovery can make this easy to
do without any races.

However doing such a thing is a bad idea.  md will not record the
partially-recovered state of the 'spare', and when the reshape
finishes it will think that the spare is still a spare.
The easiest way to avoid this confusion is to simply disallow it.
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: add a missing 'continue' in a loop. · 2f115882

Authored by NeilBrown on Jun 17, 2010

As the comment says, the tail of this loop only applies to devices
that are not fully in sync, so if In_sync was set, we should avoid
the rest of the loop.

This bug will hardly ever cause an actual problem.  The worst it
can do is allow an array to be assembled that is dirty and degraded,
which is not generally a good idea (without warning the sysadmin
first).

This will only happen if the array is RAID4 or a RAID5/6 in an
intermediate state during a reshape and so has one drive that is
all 'parity' - no data - while some other device has failed.

This is certainly possible, but not at all common.
Signed-off-by: NeilBrown <neilb@suse.de>
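
The shape of the fix can be sketched with a simplified loop. Everything here is hypothetical: the struct, the function name, and the counter that stands in for whatever the real loop tail does for devices that are not fully in sync.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified member-device state. */
struct dev_sketch {
    bool present;
    bool in_sync;
};

/* Count devices that reach the "not fully in sync" tail of the loop. */
static int count_not_in_sync(const struct dev_sketch *devs, size_t n)
{
    int dirty = 0;

    for (size_t i = 0; i < n; i++) {
        if (!devs[i].present)
            continue;
        if (devs[i].in_sync)
            continue;  /* the kind of 'continue' the commit adds */
        dirty++;       /* loop tail: only for devices not in sync */
    }
    return dirty;
}
```

Without the second 'continue', in-sync devices would fall through into the tail and be treated like devices that still need recovery.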

openeuler / Kernel · synced successfully more than 1 year ago