提交 · 0403e3827788d878163f9ef0541b748b0f88ca5d · openanolis / cloud-kernel

09 9月, 2009 1 次提交

由 Dan Williams 提交于 9月 08, 2009

Some engines optimize operation by reading ahead in the descriptor chain
such that descriptor2 may start execution before descriptor1 completes.
If descriptor2 depends on the result from descriptor1 then a fence is
required (on descriptor2) to disable this optimization. The async_tx
api could implicitly identify dependencies via the 'depend_tx'
parameter, but that would constrain cases where the dependency chain
only specifies a completion order rather than a data dependency. So,
provide an ASYNC_TX_FENCE to explicitly identify data dependencies.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

0403e382

30 8月, 2009 13 次提交

md/raid456: distribute raid processing over multiple cores · 07a3b417

由 Dan Williams 提交于 8月 29, 2009

Now that the resources to handle stripe_head operations are allocated
percpu it is possible for raid5d to distribute stripe handling over
multiple cores.  This conversion also adds a call to cond_resched() in
the non-multicore case to prevent one core from getting monopolized for
raid operations.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

07a3b417

md/raid6: remove synchronous infrastructure · b774ef49

由 Yuri Tikhonov 提交于 8月 29, 2009

These routines have been replaced by there asynchronous counterparts.
Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
Signed-off-by: NIlya Yanok <yanok@emcraft.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

b774ef49

md/raid6: asynchronous handle_stripe6 · 6c0069c0

由 Yuri Tikhonov 提交于 8月 29, 2009

1/ Use STRIPE_OP_BIOFILL to offload completion of read requests to
   raid_run_ops
2/ Implement a handler for sh->reconstruct_state similar to the raid5 case
   (adds handling of Q parity)
3/ Prevent handle_parity_checks6 from running concurrently with 'compute'
   operations
4/ Hook up raid_run_ops
Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
Signed-off-by: NIlya Yanok <yanok@emcraft.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

6c0069c0

md/raid6: asynchronous handle_parity_check6 · d82dfee0

由 Dan Williams 提交于 7月 14, 2009

[ Based on an original patch by Yuri Tikhonov ]

Implement the state machine for handling the RAID-6 parities check and
repair functionality.  Note that the raid6 case does not need to check
for new failures, like raid5, as it will always writeback the correct
disks.  The raid5 case can be updated to check zero_sum_result to avoid
getting confused by new failures rather than retrying the entire check
operation.
Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
Signed-off-by: NIlya Yanok <yanok@emcraft.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d82dfee0

md/raid6: asynchronous handle_stripe_dirtying6 · a9b39a74

由 Yuri Tikhonov 提交于 8月 29, 2009

In the synchronous implementation of stripe dirtying we processed a
degraded stripe with one call to handle_stripe_dirtying6().  I.e.
compute the missing blocks from the other drives, then copy in the new
data and reconstruct the parities.

In the asynchronous case we do not perform stripe operations directly.
Instead, operations are scheduled with flags to be later serviced by
raid_run_ops.  So, for the degraded case the final reconstruction step
can only be carried out after all blocks have been brought up to date by
being read, or computed.  Like the raid5 case schedule_reconstruction()
sets STRIPE_OP_RECONSTRUCT to request a parity generation pass and
through operation chaining can handle compute and reconstruct in a
single raid_run_ops pass.

[dan.j.williams@intel.com: fixup handle_stripe_dirtying6 gating]
Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
Signed-off-by: NIlya Yanok <yanok@emcraft.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

a9b39a74

md/raid6: asynchronous handle_stripe_fill6 · 5599becc

由 Yuri Tikhonov 提交于 8月 29, 2009

Modify handle_stripe_fill6 to work asynchronously by introducing
fetch_block6 as the raid6 analog of fetch_block5 (schedule compute
operations for missing/out-of-sync disks).

[dan.j.williams@intel.com: compute D+Q in one pass]
Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
Signed-off-by: NIlya Yanok <yanok@emcraft.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

5599becc

md/raid5,6: common schedule_reconstruction for raid5/6 · c0f7bddb

由 Yuri Tikhonov 提交于 8月 29, 2009

Extend schedule_reconstruction5 for reuse by the raid6 path.  Add
support for generating Q and BUG() if a request is made to perform
'prexor'.
Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
Signed-off-by: NIlya Yanok <yanok@emcraft.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c0f7bddb

md/raid6: asynchronous raid6 operations · ac6b53b6

由 Dan Williams 提交于 7月 14, 2009

[ Based on an original patch by Yuri Tikhonov ]

The raid_run_ops routine uses the asynchronous offload api and
the stripe_operations member of a stripe_head to carry out xor+pq+copy
operations asynchronously, outside the lock.

The operations performed by RAID-6 are the same as in the RAID-5 case
except for no support of STRIPE_OP_PREXOR operations. All the others
are supported:
STRIPE_OP_BIOFILL
 - copy data into request buffers to satisfy a read request
STRIPE_OP_COMPUTE_BLK
 - generate missing blocks (1 or 2) in the cache from the other blocks
STRIPE_OP_BIODRAIN
 - copy data out of request buffers to satisfy a write request
STRIPE_OP_RECONSTRUCT
 - recalculate parity for new data that has entered the cache
STRIPE_OP_CHECK
 - verify that the parity is correct

The flow is the same as in the RAID-5 case, and reuses some routines, namely:
1/ ops_complete_postxor (renamed to ops_complete_reconstruct)
2/ ops_complete_compute (updated to set up to 2 targets uptodate)
3/ ops_run_check (renamed to ops_run_check_p for xor parity checks)

[neilb@suse.de: fixes to get it to pass mdadm regression suite]
Reviewed-by: NAndre Noll <maan@systemlinux.org>
Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
Signed-off-by: NIlya Yanok <yanok@emcraft.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

ac6b53b6

md/raid5: factor out mark_uptodate from ops_complete_compute5 · 4e7d2c0a

由 Dan Williams 提交于 8月 29, 2009

ops_complete_compute5 can be reused in the raid6 path if it is updated to
generically handle a second target.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

4e7d2c0a

async_tx: raid6 recovery self test · cb3c8299

由 Dan Williams 提交于 7月 14, 2009

Port drivers/md/raid6test/test.c to use the async raid6 recovery
routines.  This is meant as a unit test for raid6 acceleration drivers.  In
addition to the 16-drive test case this implements tests for the 4-disk and
5-disk special cases (dma devices can not generically handle less than 2
sources), and adds a test for the D+Q case.
Reviewed-by: NAndre Noll <maan@systemlinux.org>
Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

cb3c8299

async_tx: add sum check flags · ad283ea4

由 Dan Williams 提交于 8月 29, 2009

Replace the flat zero_sum_result with a collection of flags to contain
the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed
solomon syndrome) zero-sum result.  Use the SUM_CHECK_ namespace instead
of DMA_ since these flags will be used on non-dma-zero-sum enabled
platforms.
Reviewed-by: NAndre Noll <maan@systemlinux.org>
Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

ad283ea4

md/raid5,6: add percpu scribble region for buffer lists · d6f38f31

由 Dan Williams 提交于 7月 14, 2009

Use percpu memory rather than stack for storing the buffer lists used in
parity calculations.  Include space for dma address conversions and pass
that to async_tx via the async_submit_ctl.scribble pointer.

[ Impact: move memory pressure from stack to heap ]
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d6f38f31

md/raid6: move the spare page to a percpu allocation · 36d1c647

由 Dan Williams 提交于 7月 14, 2009

In preparation for asynchronous handling of raid6 operations move the
spare page to a percpu allocation to allow multiple simultaneous
synchronous raid6 recovery operations.

Make this allocation cpu hotplug aware to maximize allocation
efficiency.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

36d1c647

15 7月, 2009 1 次提交

md/raid6: release spare page at ->stop() · a11034b4

由 Dan Williams 提交于 7月 14, 2009

Add missing call to safe_put_page from stop() by unifying open coded
raid5_conf_t de-allocation under free_conf().
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

a11034b4

09 6月, 2009 3 次提交

md/raid5: fix bug in reshape code when chunk_size decreases. · 0e6e0271

由 NeilBrown 提交于 6月 09, 2009

Now that we support changing the chunksize, we calculate
"reshape_sectors" to be the max of number of sectors in old
and new chunk size.
However there is one please where we still use 'chunksize'
rather than 'reshape_sectors'.
This causes a reshape that reduces the size of chunks to freeze.
Signed-off-by: NNeilBrown <neilb@suse.de>

0e6e0271

md/raid5 - avoid deadlocks in get_active_stripe during reshape · a8c906ca

由 NeilBrown 提交于 6月 09, 2009

md has functionality to 'quiesce' and array so that all pending
IO completed and no new IO starts.  This is used to achieve a
stable state before making internal changes.

Currently this quiescing applies equally to normal IO, resync
IO, and reshape IO.
However there is a problem with applying it to reshape IO.
Reshape can have multiple 'stripe_heads' that must be active together.
If the quiesce come between allocating the first and the last of
such a collection, then we deadlock, as the last will not be allocated
until the quiesce is lifted, the quiesce will not be lifted until the
first (which has been allocated) gets used, and that first cannot be
used until the last is allocated.

It is not necessary to inhibit reshape IO when a quiesce is
requested.  Those places in the code that require a full quiesce will
ensure the reshape thread is not running at all.

So allow reshape requests to get access to new stripe_heads without
being blocked by a 'quiesce'.

This only affects in-place reshapes (i.e. where the array does not
grow or shrink) and these are only newly supported.  So this patch is
not needed in earlier kernels.
Signed-off-by: NNeilBrown <neilb@suse.de>

a8c906ca

md/raid5: use conf->raid_disks in preference to mddev->raid_disk · f001a70c

由 NeilBrown 提交于 6月 09, 2009

mddev->raid_disks can be changed and any time by a request from
user-space.  It is a suggestion as to what number of raid_disks is
desired.

conf->raid_disks can only be changed by the raid5 module with suitable
locks in place.  It is a statement as to the current number of
raid_disks.

There are two places where the latter should be used, but the former
is used.  This can lead to a crash when reshaping an array.

This patch changes to mddev-> to conf->
Signed-off-by: NNeilBrown <neilb@suse.de>

f001a70c

04 6月, 2009 2 次提交

async_tx: structify submission arguments, add scribble · a08abd8c

由 Dan Williams 提交于 6月 03, 2009

Prepare the api for the arrival of a new parameter, 'scribble'.  This
will allow callers to identify scratchpad memory for dma address or page
address conversions.  As this adds yet another parameter, take this
opportunity to convert the common submission parameters (flags,
dependency, callback, and callback argument) into an object that is
passed by reference.

Also, take this opportunity to fix up the kerneldoc and add notes about
the relevant ASYNC_TX_* flags for each routine.

[ Impact: moves api pass-by-value parameters to a pass-by-reference struct ]
Signed-off-by: NAndre Noll <maan@systemlinux.org>
Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

a08abd8c

async_tx: kill ASYNC_TX_DEP_ACK flag · 88ba2aa5

由 Dan Williams 提交于 4月 09, 2009

In support of inter-channel chaining async_tx utilizes an ack flag to
gate whether a dependent operation can be chained to another.  While the
flag is not set the chain can be considered open for appending.  Setting
the ack flag closes the chain and flags the descriptor for garbage
collection.  The ASYNC_TX_DEP_ACK flag essentially means "close the
chain after adding this dependency".  Since each operation can only have
one child the api now implicitly sets the ack flag at dependency
submission time.  This removes an unnecessary management burden from
clients of the api.

[ Impact: clean up and enforce one dependency per operation ]
Reviewed-by: NAndre Noll <maan@systemlinux.org>
Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

88ba2aa5

27 5月, 2009 1 次提交

md: raid5: change incorrect usage of 'min' macro to 'min_t' · ed37d83e

由 NeilBrown 提交于 5月 27, 2009

A recent patch to raid5.c use min on an int and a sector_t.
This isn't allowed.
So change it to min_t(sector_t,x,y).
Signed-off-by: NNeilBrown <neilb@suse.de>

ed37d83e

26 5月, 2009 7 次提交

md: don't use locked_ioctl. · b492b852

由 NeilBrown 提交于 5月 26, 2009

md has no need for the BKL - it does its own locking.
So md_ioctl doesn't need to be a locked_ioctl.
Signed-off-by: NNeilBrown <neilb@suse.de>

b492b852

md: don't update curr_resync_completed without also updating reshape_position. · 7a91ee1f

由 NeilBrown 提交于 5月 26, 2009

In order for the metadata to always be consistent, we mustn't updated
curr_resync_completed without also updating reshape_position.

The reshape code updates both at the same time.  However since
commit 97e4f42d
the common md_do_sync will sometimes update curr_resync_completed
but is not in a position to update reshape_position.
So if MD_RECOVERY_RESHAPE is set (indicating that a reshape is
happening, so reshape_position might change), don't update
curr_resync_completed in md_do_sync, leave it to the per-personality
reshape code.
Signed-off-by: NNeilBrown <neilb@suse.de>

7a91ee1f

md: raid5: avoid sector values going negative when testing reshape progress. · 848b3182

由 NeilBrown 提交于 5月 26, 2009

As sector_t in unsigned, we cannot afford to let 'safepos' etc go
negative.
So replace
   a -= b;
by
   a -= min(b,a);
Signed-off-by: NNeilBrown <neilb@suse.de>

848b3182

md: export 'frozen' resync state through sysfs · b6a9ce68

由 NeilBrown 提交于 5月 26, 2009

The md resync engine has a 'frozen' state which ensures that
no resync/recovery.  This is used to avoid races.

Export this state through the 'sync_action' sysfs attribute
so that user-space can benefit and also avoid some races.
Signed-off-by: NNeilBrown <neilb@suse.de>

b6a9ce68

md: bitmap: improve bitmap maintenance code. · be512691

由 NeilBrown 提交于 5月 26, 2009

The code for checking which bits in the bitmap can be cleared
has 2 problems:
 1/ it repeatedly takes and drops a spinlock, where it would make
    more sense to just hold on to it most of the time.
 2/ it doesn't make use of some opportunities to skip large sections
    of the bitmap

This patch fixes those.  It will only affect CPU consumption, not
correctness.
Signed-off-by: NNeilBrown <neilb@suse.de>

be512691

md: improve errno return when setting array_size · 2b69c839

由 NeilBrown 提交于 5月 26, 2009

Instead of always returns EINVAL if anything goes wrong
when setting the array size, add the option of
  E2BIG
if the size requested is too large.  This makes it easier
for user-space to be sure what went wrong.
Signed-off-by: NNeilBrown <neilb@suse.de>

2b69c839

md: always update level / chunk_size / layout when writing v1.x metadata. · 62e1e389

由 NeilBrown 提交于 5月 26, 2009

We previously didn't update these fields when writing the metadata
because they could never change.  They can now, so we better write
them.
v0.90 metadata always updated these fields.
Signed-off-by: NNeilBrown <neilb@suse.de>

62e1e389

07 5月, 2009 7 次提交

md: remove rd%d links immediately after stopping an array. · c4647292

由 NeilBrown 提交于 5月 07, 2009

md maintains link in sys/mdXX/md/ to identify which device has
which role in the array. e.g.
   rd2 -> dev-sda

indicates that the device with role '2' in the array is sda.

These links are only present when the array is active.  They are
created immediately after ->run is called, and so should be removed
immediately after ->stop is called.
However they are currently removed a little bit later, and it is
possible for ->run to be called again, thus adding these links, before
they are removed.

So move the removal earlier so they are consistently only present when
the array is active.
Signed-off-by: NNeilBrown <neilb@suse.de>

c4647292

md: remove ability to explicit set an inactive array to 'clean'. · 5bf29597

由 NeilBrown 提交于 5月 07, 2009

Being able to write 'clean' to an 'array_state' of an inactive array
to activate it in 'clean' mode is both unnecessary and inconvenient.

It is unnecessary because the same can be achieved by writing
'active'.  This activates and array, but it still remains 'clean'
until the first write.

It is inconvenient because writing 'clean' is more often used to
cause an 'active' array to revert to 'clean' mode (thus blocking
any writes until a 'write-pending' is promoted to 'active').

Allowing 'clean' to both activate an array and mark an active array as
clean can lead to races:  One program writes 'clean' to mark the
active array as clean at the same time as another program writes
'inactive' to deactivate (stop) and active array.  Depending on which
writes first, the array could be deactivated and immediately
reactivated which isn't what was desired.

So just disable the use of 'clean' to activate an array.

This avoids a race that can be triggered with mdadm-3.0 and external
metadata, so it suitable for -stable.
Reported-by: NRafal Marszewski <rafal.marszewski@intel.com>
Acked-by: NDan Williams <dan.j.williams@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: NNeilBrown <neilb@suse.de>

5bf29597

md: constify VFTs · 110518bc

由 Jan Engelhardt 提交于 5月 07, 2009

Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
Signed-off-by: NNeilBrown <neilb@suse.de>

110518bc

md: tidy up status_resync to handle large arrays. · dd71cf6b

由 NeilBrown 提交于 5月 07, 2009

Two problems in status_resync.
1/ It still used Kilobytes as the basic block unit, while most code
   now uses sectors uniformly.
2/ It doesn't allow for the possibility that max_sectors exceeds
   the range of "unsigned long".

So
 - change "max_blocks" to "max_sectors", and store sector numbers
   in there and in 'resync'
 - Make 'rt' a 'sector_t' so it can temporarily hold the number of
   remaining sectors.
 - use sector_div rather than normal division.
 - change the magic '100' used to preserve precision to '32'.
   + making it a power of 2 makes division easier
   + it doesn't need to be as large as it was chosen when we averaged
     speed over the entire run.  Now we average speed over the last 30
     seconds or so.
Reported-by: N"Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Signed-off-by: NNeilBrown <neilb@suse.de>

dd71cf6b

md: fix some (more) errors with bitmaps on devices larger than 2TB. · db305e50

由 NeilBrown 提交于 5月 07, 2009

If a write intent bitmap covers more than 2TB, we sometimes work with
values beyond 32bit, so these need to be sector_t.  This patches
add the required casts to some unsigned longs that are being shifted
up.

This will affect any raid10 larger than 2TB, or any raid1/4/5/6 with
member devices that are larger than 2TB.
Signed-off-by: NNeilBrown <neilb@suse.de>
Reported-by: N"Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Cc: stable@kernel.org

db305e50

md/raid10: don't clear bitmap during recovery if array will still be degraded. · 18055569

由 NeilBrown 提交于 5月 07, 2009

If we have a raid10 with multiple missing devices, and we recover just
one of these to a spare, then we risk (depending on the bitmap and
array chunk size) clearing bits of the bitmap for which recovery isn't
complete (because a device is still missing).

This can lead to a subsequent "re-add" being recovered without
any IO happening, which would result in loss of data.

This patch takes the safe approach of not clearing bitmap bits
if the array will still be degraded.

This patch is suitable for all active -stable kernels.

Cc: stable@kernel.org
Signed-off-by: NNeilBrown <neilb@suse.de>

18055569

md: fix loading of out-of-date bitmap. · b74fd282

由 NeilBrown 提交于 5月 07, 2009

When md is loading a bitmap which it knows is out of date, it fills
each page with 1s and writes it back out again.  However the
write_page call makes used of bitmap->file_pages and
bitmap->last_page_size which haven't been set correctly yet.  So this
can sometimes fail.

Move the setting of file_pages and last_page_size to before the call
to write_page.

This bug can cause the assembly on an array to fail, thus making the
data inaccessible.  Hence I think it is a suitable candidate for
-stable.

Cc: stable@kernel.org
Reported-by: NVojtech Pavlik <vojtech@suse.cz>
Signed-off-by: NNeilBrown <neilb@suse.de>

b74fd282

20 4月, 2009 1 次提交

md: support bitmaps on RAID10 arrays larger then 2 terabytes · 1f593903

由 NeilBrown 提交于 4月 20, 2009

.. and other arrays with components larger than 2 terabytes.

We use a "long" rather than a "sector_t" in part of the bitmap
size calculations, which is sad.
Reported-by: N"Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Signed-off-by: NNeilBrown <neilb@suse.de>

1f593903

17 4月, 2009 1 次提交

md: update sync_completed and reshape_position even more often. · c03f6a19

由 NeilBrown 提交于 4月 17, 2009

There are circumstances when a user-space process might need to
"oversee" a resync/reshape process.  For example when doing an
in-place reshape of a raid5, it is prudent to take a backup of each
section before reshaping it as this is the only way to provide
safety against an unplanned shutdown (i.e. crash/power failure).

The sync_max sysfs value can be used to stop the resync from
advancing beyond a particular point.
So user-space can:
  suspend IO to the first section and back it up
  set 'sync_max' to the end of the section
  wait for 'sync_completed' to reach that point
  resume IO on the first section and move on to the next section.

However this process requires the kernel and user-space to run in
lock-step which could introduce unnecessary delays.

It would be better if a 'double buffered' approach could be used with
userspace and kernel space working on different sections with the
'next' section always ready when the 'current' section is finished.

One problem with implementing this is that sync_completed is only
guaranteed to be updated when the sync process reaches sync_max.
(it is updated on a time basis at other times, but it is hard to rely
on that).  This defeats some of the double buffering.

With this patch, sync_completed (and reshape_position) get updated as
the current position approaches sync_max, so there is room for
userspace to advance sync_max early without losing updates.

To be precise, sync_completed is updated when the current sync
position reaches half way between the current value of sync_completed
and the value of sync_max.  This will usually be a good time for user
space to update sync_max.

If sync_max does not get updated, the updates to sync_completed
(together with associated metadata updates) will occur at an
exponentially increasing frequency which will get unreasonably fast
(one update every page) immediately before the process hits sync_max
and stops.  So the update rate will be unreasonably fast only for an
insignificant period of time.
Signed-off-by: NNeilBrown <neilb@suse.de>

c03f6a19

15 4月, 2009 1 次提交

block: move bio list helpers into bio.h · 8f3d8ba2

由 Christoph Hellwig 提交于 4月 07, 2009

It's used by DM and MD and generally useful, so move the bio list
helpers into bio.h.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NAlasdair G Kergon <agk@redhat.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

8f3d8ba2

14 4月, 2009 2 次提交

md: improve usefulness and accuracy of sysfs file md/sync_completed. · acb180b0

由 NeilBrown 提交于 4月 14, 2009

The sync_completed file reports how much of a resync (or recovery or
reshape) has been completed.
However due to the possibility of out-of-order completion of writes,
it is not certain to be accurate.

We have an internal value - mddev->curr_resync_completed - which is an
accurate value (though it might not always be quite so uptodate).

So:
 - make curr_resync_completed be uptodate a little more often,
   particularly when raid5 reshape updates status in the metadata
 - report curr_resync_completed in the sysfs file
 - allow poll/select to report all updates to md/sync_completed.

This makes sync_completed completed usable by any external metadata
handler that wants to record this status information in its metadata.
Signed-off-by: NNeilBrown <neilb@suse.de>

acb180b0

md: allow setting newly added device to 'in_sync' via sysfs. · 6d56e278

由 NeilBrown 提交于 4月 14, 2009

When adding devices to an active array via sysfs, there is currently
no way to mark a device as 'in-sync' which is useful when
incrementally assembling an array.

So add that option.
Signed-off-by: NNeilBrown <neilb@suse.de>

6d56e278

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功