提交 · 4b532c9b8c87eb8e51605c4d08dfb5139c758dc5 · openanolis / cloud-kernel

28 10月, 2010 2 次提交

由 NeilBrown 提交于 10月 28, 2010

lock_kernel calls were recently pushed down into open/release
functions.
md doesn't need that protection.
Then the BKL calls were change to md_mutex.  We don't need those
either.
So remove it all.
Signed-off-by: NNeilBrown <neilb@suse.de>

4b532c9b

md: Fix regression with raid1 arrays without persistent metadata. · d97a41dc

由 NeilBrown 提交于 10月 28, 2010

A RAID1 which has no persistent metadata, whether internal or
external, will hang on the first write.
This is caused by commit  070dc6dd
In that case, MD_CHANGE_PENDING never gets cleared.

So during md_update_sb, is neither persistent or external,
clear MD_CHANGE_PENDING.

This is suitable for 2.6.36-stable.
Signed-off-by: NNeilBrown <neilb@suse.de>
Cc: stable@kernel.org

d97a41dc

27 10月, 2010 1 次提交

workqueues: s/ON_STACK/ONSTACK/ · ca1cab37

由 Andrew Morton 提交于 10月 26, 2010

Silly though it is, completions and wait_queue_heads use foo_ONSTACK
(COMPLETION_INITIALIZER_ONSTACK, DECLARE_COMPLETION_ONSTACK,
__WAIT_QUEUE_HEAD_INIT_ONSTACK and DECLARE_WAIT_QUEUE_HEAD_ONSTACK) so I
guess workqueues should do the same thing.

s/INIT_WORK_ON_STACK/INIT_WORK_ONSTACK/
s/INIT_DELAYED_WORK_ON_STACK/INIT_DELAYED_WORK_ONSTACK/

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ca1cab37

15 10月, 2010 1 次提交

llseek: automatically add .llseek fop · 6038f373

由 Arnd Bergmann 提交于 8月 15, 2010

All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>

6038f373

07 10月, 2010 3 次提交

md: check return code of read_sb_page · 5c04f551

由 Vasiliy Kulikov 提交于 10月 01, 2010

Function read_sb_page may return ERR_PTR(...). Check for it.
Signed-off-by: NVasiliy Kulikov <segooon@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NNeilBrown <neilb@suse.de>

5c04f551

md/raid1: minor bio initialisation improvements. · db8d9d35

由 NeilBrown 提交于 10月 07, 2010


When performing a resync we pre-allocate some bios and repeatedly use
them.  This requires us to re-initialise them each time.
One field (bi_comp_cpu) and some flags weren't being initiaised
reliably.
Signed-off-by: NNeilBrown <neilb@suse.de>

db8d9d35

md/raid1: avoid overflow in raid1 resync when bitmap is in use. · 7571ae88

由 NeilBrown 提交于 10月 07, 2010

bitmap_start_sync returns - via a pass-by-reference variable - the
number of sectors before we need to check with the bitmap again.
Since commit ef425673 this number can be substantially larger,
2^27 is a common value.

Unfortunately it is an 'int' and so when raid1.c:sync_request shifts
it 9 places to the left it becomes 0.  This results in a zero-length
read which the scsi layer justifiably complains about.

This patch just removes the shift so the common case becomes safe with
a trivially-correct patch.

In the next merge window we will convert this 'int' to a 'sector_t'
Reported-by: N"George Spelvin" <linux@horizon.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

7571ae88

05 10月, 2010 1 次提交

block: autoconvert trivial BKL users to private mutex · 2a48fc0a

由 Arnd Bergmann 提交于 6月 02, 2010

The block device drivers have all gained new lock_kernel
calls from a recent pushdown, and some of the drivers
were already using the BKL before.

This turns the BKL into a set of per-driver mutexes.
Still need to check whether this is safe to do.

file=$1
name=$2
if grep -q lock_kernel ${file} ; then
    if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
    else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
    fi
    sed -i ${file} \
        -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                     /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

} }"  \
    -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
    -e '/[      ]*cycle_kernel_lock();/d'
else
    sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                -e '/cycle_kernel_lock()/d'
fi
Signed-off-by: NArnd Bergmann <arnd@arndb.de>

2a48fc0a

17 9月, 2010 2 次提交

md: fix v1.x metadata update when a disk is missing. · ddcf3522

由 NeilBrown 提交于 9月 08, 2010

If an array with 1.x metadata is assembled with the last disk missing,
md doesn't properly record the fact that the disk was missing.

This is unlikely to cause a real problem as the event count will be
different to the count on the missing disk so it won't be included in
the array. However it could still cause confusion.

So make sure we clear all the relevant slots, not just the early ones.
Signed-off-by: NNeilBrown <neilb@suse.de>

ddcf3522

md: call md_update_sb even for 'external' metadata arrays. · 126925c0

由 NeilBrown 提交于 9月 07, 2010

Now that we depend on md_update_sb to clear variable bits in
mddev->flags (rather than trying not to set them) it is important to
always call md_update_sb when appropriate.

md_check_recovery has this job but explicitly avoids it for ->external
metadata arrays.  This is not longer appropraite, or needed.

However we do want to avoid taking the mddev lock if only
MD_CHANGE_PENDING is set as that is not cleared by md_update_sb for
external-metadata arrays.
Reported-by: N"Kwolek, Adam" <adam.kwolek@intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

126925c0

11 9月, 2010 1 次提交

Consolidate min_not_zero · c8bf1336

由 Martin K. Petersen 提交于 9月 10, 2010

We have several users of min_not_zero, each of them using their own
definition.  Move the define to kernel.h.
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@carl.home.kernel.dk>

c8bf1336

10 9月, 2010 7 次提交

dm: convey that all flushes are processed as empty · b372d360

由 Mike Snitzer 提交于 9月 08, 2010

Rename __clone_and_map_flush to __clone_and_map_empty_flush for added
clarity.

Simplify logic associated with REQ_FLUSH conditionals.

Introduce a BUG_ON() and add a few more helpful comments to the code
so that it is clear that all flushes are empty.

Cleanup __split_and_process_bio() so that an empty flush isn't processed
by a 'sector_count' focused while loop.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

b372d360

dm: fix locking context in queue_io() · 05447420

由 Kiyoshi Ueda 提交于 9月 08, 2010

Now queue_io() is called from dec_pending(), which may be called with
interrupts disabled, so queue_io() must not enable interrupts
unconditionally and must save/restore the current interrupts status.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

05447420

dm: relax ordering of bio-based flush implementation · 6a8736d1

由 Tejun Heo 提交于 9月 08, 2010

Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA doesn't mandate any ordering
against other bio's.  This patch relaxes ordering around flushes.

* A flush bio is no longer deferred to workqueue directly.  It's
  processed like other bio's but __split_and_process_bio() uses
  md->flush_bio as the clone source.  md->flush_bio is initialized to
  empty flush during md initialization and shared for all flushes.

* As a flush bio now travels through the same execution path as other
  bio's, there's no need for dedicated error handling path either.  It
  can use the same error handling path in dec_pending().  Dedicated
  error handling removed along with md->flush_error.

* When dec_pending() detects that a flush has completed, it checks
  whether the original bio has data.  If so, the bio is queued to the
  deferred list w/ REQ_FLUSH cleared; otherwise, it's completed.

* As flush sequencing is handled in the usual issue/completion path,
  dm_wq_work() no longer needs to handle flushes differently.  Now its
  only responsibility is re-issuing deferred bio's the same way as
  _dm_request() would.  REQ_FLUSH handling logic including
  process_flush() is dropped.

* There's no reason for queue_io() and dm_wq_work() write lock
  dm->io_lock.  queue_io() now only uses md->deferred_lock and
  dm_wq_work() read locks dm->io_lock.

* bio's no longer need to be queued on the deferred list while a flush
  is in progress making DMF_QUEUE_IO_TO_THREAD unncessary.  Drop it.

This avoids stalling the device during flushes and simplifies the
implementation.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

6a8736d1

dm: implement REQ_FLUSH/FUA support for request-based dm · 29e4013d

由 Tejun Heo 提交于 9月 08, 2010

This patch converts request-based dm to support the new REQ_FLUSH/FUA.

The original request-based flush implementation depended on
request_queue blocking other requests while a barrier sequence is in
progress, which is no longer true for the new REQ_FLUSH/FUA.

In general, request-based dm doesn't have infrastructure for cloning
one source request to multiple targets, but the original flush
implementation had a special mostly independent path which can issue
flushes to multiple targets and sequence them.  However, the
capability isn't currently in use and adds a lot of complexity.
Moreoever, it's unlikely to be useful in its current form as it
doesn't make sense to be able to send out flushes to multiple targets
when write requests can't be.

This patch rips out special flush code path and deals handles
REQ_FLUSH/FUA requests the same way as other requests.  The only
special treatment is that REQ_FLUSH requests use the block address 0
when finding target, which is enough for now.

* added BUG_ON(!dm_target_is_valid(ti)) in dm_request_fn() as
  suggested by Mike Snitzer
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NMike Snitzer <snitzer@redhat.com>
Tested-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

29e4013d

dm: implement REQ_FLUSH/FUA support for bio-based dm · d87f4c14

由 Tejun Heo 提交于 9月 03, 2010

This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
now deprecated REQ_HARDBARRIER.

* -EOPNOTSUPP handling logic dropped.

* Preflush is handled as before but postflush is dropped and replaced
  with passing down REQ_FUA to member request_queues.  This replaces
  one array wide cache flush w/ member specific FUA writes.

* __split_and_process_bio() now calls __clone_and_map_flush() directly
  for flushes and guarantees all FLUSH bio's going to targets are zero
`  length.

* It's now guaranteed that all FLUSH bio's which are passed onto dm
  targets are zero length.  bio_empty_barrier() tests are replaced
  with REQ_FLUSH tests.

* Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.

* Dropped unlikely() around REQ_FLUSH tests.  Flushes are not unlikely
  enough to be marked with unlikely().

* Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue
  doesn't support cache flushing.  Advertise REQ_FLUSH | REQ_FUA
  capability.

* Request based dm isn't converted yet.  dm_init_request_based_queue()
  resets flush support to 0 for now.  To avoid disturbing request
  based dm code, dm->flush_error is added for bio based dm while
  requested based dm continues to use dm->barrier_error.

Lightly tested linear, stripe, raid1, snap and crypt targets.  Please
proceed with caution as I'm not familiar with the code base.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: dm-devel@redhat.com
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

d87f4c14

md: implment REQ_FLUSH/FUA support · e9c7469b

由 Tejun Heo 提交于 9月 03, 2010

This patch converts md to support REQ_FLUSH/FUA instead of now
deprecated REQ_HARDBARRIER.  In the core part (md.c), the following
changes are notable.

* Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA don't interfere with
  processing of other requests and thus there is no reason to mark the
  queue congested while FLUSH/FUA is in progress.

* REQ_FLUSH/FUA failures are final and its users don't need retry
  logic.  Retry logic is removed.

* Preflush needs to be issued to all member devices but FUA writes can
  be handled the same way as other writes - their processing can be
  deferred to request_queue of member devices.  md_barrier_request()
  is renamed to md_flush_request() and simplified accordingly.

For linear, raid0 and multipath, the core changes are enough.  raid1,
5 and 10 need the following conversions.

* raid1: Handling of FLUSH/FUA bio's can simply be deferred to
  request_queues of member devices.  Barrier related logic removed.

* raid5: Queue draining logic dropped.  FUA bit is propagated through
  biodrain and stripe resconstruction such that all the updated parts
  of the stripe are written out with FUA writes if any of the dirtying
  writes was FUA.  preread_active_stripes handling in make_request()
  is updated as suggested by Neil Brown.

* raid10: FUA bit needs to be propagated to write clones.

linear, raid0, 1, 5 and 10 tested.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NNeil Brown <neilb@suse.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

e9c7469b

block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4

由 Tejun Heo 提交于 9月 03, 2010

Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
-EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
blk_queue_flush().

blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
device has write cache and can flush it, it should set REQ_FLUSH.  If
the device can handle FUA writes, it should also set REQ_FUA.

All blk_queue_ordered() users are converted.

* ORDERED_DRAIN is mapped to 0 which is the default value.
* ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
* ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NBoaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

4913efe4

30 8月, 2010 3 次提交

md: resolve confusion of MD_CHANGE_CLEAN · 070dc6dd

由 NeilBrown 提交于 8月 30, 2010

MD_CHANGE_CLEAN is used for two different purposes and this leads to
confusion.
One of the purposes is largely mirrored by MD_CHANGE_PENDING which is
not used for anything else, so have MD_CHANGE_PENDING take over that
purpose fully.

The two purposes are:
 1/ tell md_update_sb that an update is needed and that it is just a
   clean/dirty transition.
 2/ tell user-space that an transition from clean to dirty is pending
    (something wants to write), and tell te kernel (by clearin the
    flag) that the transition is OK.

The first purpose remains wit MD_CHANGE_CLEAN, the second is moved
fully to MD_CHANGE_PENDING.

This means that various places which conditionally set or cleared
MD_CHANGE_CLEAN no longer need to be conditional.
Signed-off-by: NNeilBrown <neilb@suse.de>

070dc6dd

md: don't clear MD_CHANGE_CLEAN in md_update_sb() for external arrays · bd52b746

由 Dan Williams 提交于 8月 30, 2010

If this bit is cleared in md_update_sb() the kernel will allow writes to the
array if userspace triggers md_allow_write(), e.g. through stripe_cache_size,
when mdmon is not active. When mdmon is active the array transitions to
active-idle bypassing write-pending, setting up a race for mdmon to set the
array clean before a write arrives.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

bd52b746

Move .gitignore from drivers/md to lib/raid6 · 7c44ece9

由 NeilBrown 提交于 8月 30, 2010

Another missing bit of the raid6 -> /lib move.
Reported-by: NAndreas Schwab <schwab@linux-m68k.org>
Signed-off-by: NNeilBrown <neilb@suse.de>

7c44ece9

18 8月, 2010 4 次提交

md raid-1/10 Fix bio_rw bit manipulations again · 2c7d46ec

由 NeilBrown 提交于 8月 18, 2010

commit 7b6d91da changed the behaviour
of a few variables in raid1 and raid10 from flags to bit-sets, but
left them as type 'bool' so they did not work.

Change them (back) to unsigned long.
(historical note: see 1ef04fef)
Signed-off-by: NNeilBrown <neilb@suse.de>
Reported-by: Jiri Slaby <jslaby@suse.cz> and many others

2c7d46ec

md: provide appropriate return value for spare_active functions. · 6b965620

由 NeilBrown 提交于 8月 18, 2010

md_check_recovery expects ->spare_active to return 'true' if any
spares were activated, but none of them do, so the consequent change
in 'degraded' is not notified through sysfs.

So count the number of spares activated, subtract it from 'degraded'
just once, and return it.
Reported-by: NAdrian Drzewiecki <adriand@vmware.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

6b965620

md: Notify sysfs when RAID1/5/10 disk is In_sync. · e6ffbcb6

由 Adrian Drzewiecki 提交于 8月 18, 2010

When RAID1 is done syncing disks, it'll update the state
of synced rdevs to In_sync. But it neglected to notify
sysfs that the attribute changed. So any programs that
are waiting for an rdev's state to change will not be
woken.

(raid5/raid10 added by neilb)
Signed-off-by: NAdrian Drzewiecki <adriand@vmware.com>
Signed-off-by: NNeilBrown <neilb@suse.de>

e6ffbcb6

Update recovery_offset even when external metadata is used. · 3a3a5ddb

由 NeilBrown 提交于 8月 16, 2010

The update of ->recovery_offset in sync_sbs is appropriate even then external
metadata is in use.  However sync_sbs is only called when native
metadata is used.

So move that update in to the top of md_update_sb (which is the only
caller of sync_sbs) before the test on ->external.

This moves the update out of ->write_lock protection, but those fields
only need ->reconfig_mutex protection which they still have.

Also move the test on ->persistent up to where ->external is set as
for metadata update purposes they are the same.

Clear MD_CHANGE_DEVS and MD_CHANGE_CLEAN as they can only be confusing
if ->external is set or ->persistent isn't.

Finally move the update of ->utime down as it is only relevent (like
the ->events update) for native metadata.
Signed-off-by: NNeilBrown <neilb@suse.de>
Reported-by: N"Kwolek, Adam" <adam.kwolek@intel.com>

3a3a5ddb

12 8月, 2010 15 次提交

dm mpath: support discard · 959eb4e5

由 Mike Snitzer 提交于 8月 12, 2010

Enable discard support in the DM multipath target.

This discard support depends on a few discard-specific fixes to the
block layer's request stacking driver methods.

Discard requests are optional so don't allow a failed discard to trigger
path failures.  If there is a real problem with a given path the
barriers associated with the discard (either before or after the
discard) will cause path failure.  That said, unconditionally passing
discard failures up the stack is not ideal.  This must be fixed once DM
has more information about the nature of the underlying storage failure.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
Cc: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>

959eb4e5

dm stripe: support discards · 7b76ec11

由 Mikulas Patocka 提交于 8月 12, 2010

The DM core will submit a discard bio to the stripe target for each
stripe in a striped DM device.  The stripe target will determine
stripe-specific portions of the supplied bio to be remapped into
individual (at most 'num_discard_requests' extents).  If a given
stripe-specific discard bio doesn't touch a particular stripe the bio
will be dropped.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

7b76ec11

dm: split discard requests on target boundaries · a79245b3

由 Mike Snitzer 提交于 8月 12, 2010

Update __clone_and_map_discard to loop across all targets in a DM
device's table when it processes a discard bio.  If a discard crosses a
target boundary it must be split accordingly.

Update __issue_target_requests and __issue_target_request to allow a
cloned discard bio to have a custom start sector and size.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

a79245b3

dm stripe: optimize sector division · c96053b7

由 Mikulas Patocka 提交于 8月 12, 2010

Optimize sector division: If the number of stripes is a power of two,
we can do shift and mask instead of division.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

c96053b7

dm stripe: move sector translation to a function · 65988525

由 Mikulas Patocka 提交于 8月 12, 2010

Move sector to stripe translation into a function.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

65988525

dm: error return error for discards · 38e1b257

由 Mike Snitzer 提交于 8月 12, 2010

Have the error target respond to a discard request with a hard -EIO
rather than fail the request with -EOPNOTSUPP.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

38e1b257

dm delay: support discard · 3fd5d480

由 Mike Snitzer 提交于 8月 12, 2010

Enable discard support for the delay target.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

3fd5d480

dm: zero silently drop discards · f8facb61

由 Mike Snitzer 提交于 8月 12, 2010

Have the zero target silently drop a discard rather than fail the
request with -EOPNOTSUPP.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

f8facb61

dm: use dm_target_offset macro · b441a262

由 Alasdair G Kergon 提交于 8月 12, 2010

Use new dm_target_offset() macro to avoid most references to ti->begin
in dm targets.
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

b441a262

dm: factor out max_io_len_target_boundary · 56a67df7

由 Mike Snitzer 提交于 8月 12, 2010

Split max_io_len_target_boundary out of max_io_len so that the discard
support can make use of it without duplicating max_io_len code.

Avoiding max_io_len's split_io logic enables DM's discard support to
submit the entire discard request to a target.  But discards must still
be split on target boundaries.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

56a67df7

dm: use common __issue_target_request for flush and discard support · 06a426ce

由 Mike Snitzer 提交于 8月 12, 2010

Rename __flush_target to __issue_target_request now that it is used to
issue both flush and discard requests.

Introduce __issue_target_requests as a convenient wrapper to
__issue_target_request 'num_flush_requests' or 'num_discard_requests'
times per target.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

06a426ce

dm: linear support discard · 5ae89a87

由 Mike Snitzer 提交于 8月 12, 2010

Allow discards to be passed through to linear mappings if at least one
underlying device supports it.  Discards will be forwarded only to
devices that support them.

A target that supports discards should set num_discard_requests to
indicate how many times each discard request must be submitted to it.

Verify table's underlying devices support discards prior to setting the
associated DM device as capable of discards (via QUEUE_FLAG_DISCARD).
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Reviewed-by: NJoe Thornber <thornber@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

5ae89a87

dm crypt: simplify crypt_ctr · 5ebaee6d

由 Milan Broz 提交于 8月 12, 2010

Allocate cipher strings indpendently of struct crypt_config and move
cipher parsing and allocation into a separate function to prepare for
supporting the cryptoapi format e.g. "xts(aes)".

No functional change in this patch.
Signed-off-by: NMilan Broz <mbroz@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

5ebaee6d

dm crypt: simplify crypt_config destruction logic · 28513fcc

由 Milan Broz 提交于 8月 12, 2010

Use just one label and reuse common destructor for crypt target.

Parse remaining argv arguments in logic order.

Also do not ignore error values from IV init and set key functions.

No functional change in this patch except changed return codes
based on above.
Signed-off-by: NMilan Broz <mbroz@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

28513fcc

dm: allow autoloading of dm mod · 7e507eb6

由 Peter Rajnoha 提交于 8月 12, 2010

Add devname:mapper/control and MAPPER_CTRL_MINOR module alias
to support dm-mod module autoloading.
Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
Signed-off-by: NPeter Rajnoha <prajnoha@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

7e507eb6

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功