1. 28 Apr 2008: 2 commits
  2. 25 Apr 2008: 23 commits
  3. 22 Apr 2008: 1 commit
  4. 11 Apr 2008: 1 commit
  5. 29 Mar 2008: 2 commits
  6. 20 Mar 2008: 2 commits
  7. 11 Mar 2008: 2 commits
  8. 05 Mar 2008: 7 commits
    • md: the md RAID10 resync thread could cause a md RAID10 array deadlock · a07e6ab4
      Authored by K.Tanaka
      This message describes another md RAID10 issue, found by testing the
      2.6.24 md RAID10 with the new scsi fault injection framework.
      
      Abstract:
      
      When a scsi error results in a disk being disabled during RAID10 recovery,
      the md RAID10 resync thread can stall.
      
      In this case the raid array has already been broken, so it may not matter
      much.  But a stall is still undesirable: if it occurs, even a shutdown or
      reboot will fail because the resource stays busy.
      
      The deadlock mechanism:
      
      The r10bio_s structure has a "remaining" member to keep track of BIOs yet to
      be handled when recovering.  The "remaining" counter is incremented when
      building a BIO in sync_request() and is decremented when a BIO is finished
      in end_sync_write().
      
      If building a BIO fails for some reason in sync_request(), the "remaining"
      counter should be decremented if it has already been incremented.  I found
      a case where this decrement is forgotten.  This causes an md_do_sync()
      deadlock: md_do_sync() waits for md_done_sync() to be called from
      end_sync_write(), but end_sync_write() never calls md_done_sync() because
      of the "remaining" counter mismatch.
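
      To make the accounting concrete, here is a minimal userspace sketch of the
      pattern described above.  The names echo drivers/md/raid10.c (r10bio,
      sync_request(), end_sync_write(), md_done_sync()), but the bodies are
      simplified stand-ins, not the kernel code; the failure-path comment marks
      where the missing decrement belongs.

      #include <stdbool.h>
      #include <stdio.h>

      struct r10bio_sketch {
          int remaining;                      /* sync BIOs still outstanding */
      };

      static void md_done_sync_sketch(void)
      {
          printf("md_done_sync(): md_do_sync() may proceed\n");
      }

      /* Completion path: the last outstanding BIO reports back to md_do_sync(). */
      static void end_sync_write_sketch(struct r10bio_sketch *r10_bio)
      {
          if (--r10_bio->remaining == 0)
              md_done_sync_sketch();
      }

      /* Build path: the counter is raised before each BIO is built; if building
       * fails, the increment must be undone.  Forgetting the decrement on this
       * failure path is what leaves md_do_sync() waiting forever. */
      static void sync_request_sketch(struct r10bio_sketch *r10_bio, bool build_ok)
      {
          r10_bio->remaining++;
          if (!build_ok) {
              if (--r10_bio->remaining == 0)  /* the fix: balance the counter */
                  md_done_sync_sketch();
              return;
          }
          end_sync_write_sketch(r10_bio);     /* pretend the BIO completed at once */
      }

      int main(void)
      {
          struct r10bio_sketch r10_bio = { .remaining = 0 };
          sync_request_sketch(&r10_bio, false);   /* a failed build still balances */
          return 0;
      }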
      
      For example, this problem can be reproduced in the following case:
      
      Personalities : [raid10]
      md0 : active raid10 sdf1[4] sde1[5](F) sdd1[2] sdc1[1] sdb1[6](F)
            3919616 blocks 64K chunks 2 near-copies [4/2] [_UU_]
            [>....................]  recovery =  2.2% (45376/1959808) finish=0.7min speed=45376K/sec
      
      In this case sdf1 is recovering, while sdb1 and sde1 are disabled.
      An additional error that detaches sdd1 will cause the deadlock.
      
      md0 : active raid10 sdf1[4] sde1[5](F) sdd1[6](F) sdc1[1] sdb1[7](F)
            3919616 blocks 64K chunks 2 near-copies [4/1] [_U__]
            [=>...................]  recovery =  5.0% (99520/1959808) finish=5.9min speed=5237K/sec
      
       2739 ?        S<     0:17 [md0_raid10]
      28608 ?        D<     0:00 [md0_resync]
      28629 pts/1    Ss     0:00 bash
      28830 pts/1    R+     0:00 ps ax
      31819 ?        D<     0:00 [kjournald]
      
      The resync thread appears to keep running, but it is actually deadlocked.
      
      Patch:
      With this patch, the "remaining" counter is decremented whenever needed.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: fix possible raid1/raid10 deadlock on read error during resync · 1c830532
      Authored by NeilBrown
      Thanks to K.Tanaka and the scsi fault injection framework, here is a fix for
      another possible deadlock in raid1/raid10 error handling.
      
      If a read request returns an error while a resync is happening and a resync
      request is pending, the attempt to fix the error will block until the resync
      progresses, and the resync will block until the read request completes.  Thus
      a deadlock.
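
      As an abstract illustration only, the circular wait can be modelled with
      two pthreads standing in for the raid1/raid10 machinery; nothing below is
      kernel code.  fix_read_error() waits for the resync to progress, while
      resync() waits for the failed read to complete first, so running this
      small program blocks forever, which is exactly the deadlock being removed.

      #include <pthread.h>
      #include <stdbool.h>
      #include <stdio.h>

      static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
      static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
      static bool read_completed  = false;   /* set when the failed read is retired */
      static bool resync_advanced = false;   /* set when the resync passes the region */

      static void *fix_read_error(void *arg)
      {
          (void)arg;
          pthread_mutex_lock(&lock);
          while (!resync_advanced)           /* blocks until the resync progresses */
              pthread_cond_wait(&cond, &lock);
          read_completed = true;             /* never reached */
          pthread_cond_broadcast(&cond);
          pthread_mutex_unlock(&lock);
          return NULL;
      }

      static void *resync(void *arg)
      {
          (void)arg;
          pthread_mutex_lock(&lock);
          while (!read_completed)            /* blocks until the read completes */
              pthread_cond_wait(&cond, &lock);
          resync_advanced = true;            /* never reached */
          pthread_cond_broadcast(&cond);
          pthread_mutex_unlock(&lock);
          return NULL;
      }

      int main(void)
      {
          pthread_t a, b;
          pthread_create(&a, NULL, fix_read_error, NULL);
          pthread_create(&b, NULL, resync, NULL);
          puts("both threads are now waiting on each other");
          pthread_join(a, NULL);             /* never returns: the deadlock */
          pthread_join(b, NULL);
          return 0;
      }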
      
      This patch fixes the problem.
      
      Cc: "K.Tanaka" <k-tanaka@ce.jp.nec.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: don't attempt read-balancing for raid10 'far' layouts · 8ed3a195
      Authored by Keld Simonsen
      This patch changes the read-balancing for layout "far > 1" so that the disk
      read from is always the one with the lowest block address.
      
      Thus the chunks to be read will always be (for a fully functioning array) from
      the first band of stripes, and the raid will then work as a raid0 consisting
      of the first band of stripes.
      
      Some advantages:
      
      The fastest part of the disks involved, the outer sectors, will be used.
      The outer blocks of a disk may be as much as 100% faster than the inner
      blocks.
      
      Average seek time will be smaller, as seeks will always be confined to the
      first part of the disks.
      
      Mixed disks with different performance characteristics will work better, as
      they will behave like raid0: the sequential read rate will be the number of
      disks involved times the IO rate of the slowest disk.
      
      If a disk is malfunctioning, the first working disk that holds the lowest
      block address for the logical block will be used.
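
      As a sketch of this selection rule (illustrative only: the struct copy type
      and pick_far_copy() helper below are invented for the example and this is
      not the actual read_balance() code in raid10.c):

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      struct copy {                /* one replica of a logical block */
          int      disk;           /* component device holding it */
          uint64_t dev_sector;     /* address of the copy on that device */
          bool     in_sync;        /* device working and up to date? */
      };

      /* Among the working copies, choose the one at the lowest device address;
       * returns -1 if no working copy exists. */
      static int pick_far_copy(const struct copy *copies, int ncopies)
      {
          int best = -1;

          for (int i = 0; i < ncopies; i++) {
              if (!copies[i].in_sync)
                  continue;
              if (best < 0 || copies[i].dev_sector < copies[best].dev_sector)
                  best = i;
          }
          return best;
      }

      int main(void)
      {
          struct copy copies[] = {
              { .disk = 0, .dev_sector = 2000000, .in_sync = true },  /* far copy  */
              { .disk = 2, .dev_sector = 1024,    .in_sync = true },  /* near copy */
          };
          int best = pick_far_copy(copies, 2);

          if (best >= 0)
              printf("read from disk %d\n", copies[best].disk);       /* disk 2 */
          return 0;
      }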
      Signed-off-by: Keld Simonsen <keld@dkuug.dk>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: lock access to rdev attributes properly · 27c529bb
      Authored by NeilBrown
      When we access attributes of an rdev (component device on an md array) through
      sysfs, we really need to lock the array against concurrent changes.  We
      currently do that when we change an attribute, but not when we read an
      attribute.  We need to lock when reading as well, or else rdev->mddev could
      become NULL while we are accessing it.
      
      So add appropriate locking (mddev_lock) to rdev_attr_show.
      
      rdev_size_store requires some extra care as well, since it needs to unlock
      the mddev while scanning other mddevs for overlapping regions.  We currently
      assume that rdev->mddev will still be unchanged after the scan, but that
      cannot be guaranteed.  So take a copy of rdev->mddev for use at the end of
      the function.
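
      A minimal userspace sketch of these two rules follows, with a pthread mutex
      standing in for mddev_lock()/mddev_unlock(); the *_sketch types and
      functions are illustrative only, not the md.c implementation.

      #include <pthread.h>
      #include <stdio.h>

      struct mddev_sketch {
          pthread_mutex_t reconfig_mutex;     /* stands in for mddev_lock() */
      };

      struct rdev_sketch {
          struct mddev_sketch *mddev;         /* may be cleared by concurrent changes */
          long size;
      };

      /* Reading an attribute: take the array lock before using the rdev, and
       * re-check rdev->mddev under the lock, since it may have become NULL. */
      static long rdev_attr_show_sketch(struct rdev_sketch *rdev)
      {
          struct mddev_sketch *mddev = rdev->mddev;
          long val = -1;

          if (!mddev)
              return val;
          pthread_mutex_lock(&mddev->reconfig_mutex);
          if (rdev->mddev)                    /* still attached? */
              val = rdev->size;
          pthread_mutex_unlock(&mddev->reconfig_mutex);
          return val;
      }

      /* Changing the size: keep a local copy of rdev->mddev, because the field
       * may change while the lock is dropped to scan other arrays for overlap. */
      static void rdev_size_store_sketch(struct rdev_sketch *rdev, long new_size)
      {
          struct mddev_sketch *my_mddev = rdev->mddev;    /* copy taken up front */

          pthread_mutex_lock(&my_mddev->reconfig_mutex);
          rdev->size = new_size;
          pthread_mutex_unlock(&my_mddev->reconfig_mutex);

          /* ... scan other mddevs here with the lock released ... */

          /* Re-take the lock through the saved pointer, not rdev->mddev. */
          pthread_mutex_lock(&my_mddev->reconfig_mutex);
          pthread_mutex_unlock(&my_mddev->reconfig_mutex);
      }

      int main(void)
      {
          struct mddev_sketch md = { .reconfig_mutex = PTHREAD_MUTEX_INITIALIZER };
          struct rdev_sketch rdev = { .mddev = &md, .size = 128 };

          rdev_size_store_sketch(&rdev, 256);
          printf("size=%ld\n", rdev_attr_show_sketch(&rdev));
          return 0;
      }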
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: make sure a reshape is started when device switches to read-write · 25156198
      Authored by NeilBrown
      A resync/reshape/recovery thread will refuse to progress when the array is
      marked read-only.  So whenever the array is marked read-write, it is
      important to wake up the resync thread.  There is one place where we didn't
      do this.
      
      The problem manifests if the start_ro module parameter is set and a raid5
      array that is in the middle of a reshape (restripe) is started.  The array
      will initially be semi-read-only (meaning it acts as if it were read-only
      until the first write).  So the reshape will not proceed.
      
      On the first write, the array will become read-write, but the reshape will not
      be started, and there is no event which will ever restart that thread.
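
      A tiny sketch of the rule being enforced (the mddev_sketch type and
      wake_sync_thread() helper are invented for illustration; this is not the
      md.c code):

      #include <stdbool.h>
      #include <stdio.h>

      struct mddev_sketch {
          bool ro;                   /* array currently read-only? */
      };

      static void wake_sync_thread(void)
      {
          /* stands in for waking the md sync thread; a stalled reshape resumes */
          printf("sync/reshape thread woken\n");
      }

      /* Switching the array to read-write must always wake the sync thread,
       * otherwise a pending reshape stays stalled forever (the bug described). */
      static void set_read_write(struct mddev_sketch *mddev)
      {
          if (!mddev->ro)
              return;
          mddev->ro = false;
          wake_sync_thread();
      }

      int main(void)
      {
          struct mddev_sketch mddev = { .ro = true };   /* semi-read-only start */
          set_read_write(&mddev);                       /* first write arrives */
          return 0;
      }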
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: clean up irregularity with raid autodetect · d0fae18f
      Authored by NeilBrown
      When a raid1 array is stopped, all components currently get added to the list
      for auto-detection.  However, we should really only add components that were
      found by autodetection in the first place.  So add a flag to record that
      information, and use it.
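
      For illustration, a minimal sketch of such a flag; the field and function
      names are invented for the example and do not claim to match the md.c
      implementation:

      #include <stdbool.h>
      #include <stdio.h>

      struct rdev_sketch {
          const char *name;
          bool auto_detected;     /* set only when found by raid autodetect */
      };

      /* When the array is stopped, only devices that originally came from
       * autodetection go back on the autodetect list. */
      static void export_rdev_sketch(const struct rdev_sketch *rdev)
      {
          if (rdev->auto_detected)
              printf("%s re-added to the autodetect list\n", rdev->name);
          else
              printf("%s left alone\n", rdev->name);
      }

      int main(void)
      {
          struct rdev_sketch a = { "sdb1", true };    /* found by autodetect   */
          struct rdev_sketch b = { "loop0", false };  /* assembled by the user */
          export_rdev_sketch(&a);
          export_rdev_sketch(&b);
          return 0;
      }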
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: guard against possible bad array geometry in v1 metadata · a1801f85
      Authored by NeilBrown
      Make sure the data doesn't start before the end of the superblock when the
      superblock is at the start of the device.
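
      For illustration only, a small sketch of the kind of check described; the
      function name and parameters are invented for the example and this is not
      the actual v1 superblock loading code:

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      /* All values are in 512-byte sectors.  When the superblock sits at the
       * start of the device, the data area must begin after the superblock
       * ends; when it sits at the end, there is nothing to check here. */
      static bool v1_geometry_ok(uint64_t sb_offset, uint64_t sb_sectors,
                                 uint64_t data_offset, bool sb_at_start)
      {
          if (!sb_at_start)
              return true;
          return data_offset >= sb_offset + sb_sectors;
      }

      int main(void)
      {
          /* Superblock at sector 0 and 8 sectors long, but data claimed to
           * start at sector 4: the data would overlap the superblock. */
          printf("%s\n", v1_geometry_ok(0, 8, 4, true) ? "ok" : "bad geometry");
          return 0;
      }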
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>