Commits · 84692195969b83f0ba57dc33ecf73e6c124dd186 · openanolis / cloud-kernel

28 Aug, 2006 2 commits

[PATCH] md: avoid backward event updates in md superblock when degraded. · 84692195

Authored by NeilBrown on Aug 27, 2006

If we
  - shut down a clean array,
  - restart with one (or more) drive(s) missing,
  - make some changes,
  - pause, so that the array gets marked 'clean',
the event count on the superblock of included drives
will be the same as that of the removed drives.
So adding the removed drive back in will cause it
to be included with no resync.

To avoid this, we only update the eventcount backwards when the array
is not degraded.  In this case there can (should) be no non-connected
drives that we can get confused with, and this is the particular case
where updating-backwards is valuable.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
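
The rule described in this commit can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct array_state`, its fields and `events_after_clean()` are hypothetical names, not the md driver's actual code, and the odd/even convention is taken from the related "Don't write dirty/clean update to spares" commit further down this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the md array state. */
struct array_state {
    uint64_t events;        /* event count recorded in the superblocks */
    bool degraded;          /* one or more member drives are missing */
    bool only_state_change; /* nothing but a dirty -> clean transition */
};

/* Event count to record when the array is marked clean. */
uint64_t events_after_clean(const struct array_state *a)
{
    /* Rolling back from an odd (dirty) count is only safe when every
     * member drive is present to receive the update. */
    if (a->only_state_change && !a->degraded && (a->events & 1))
        return a->events - 1;
    /* Otherwise always move forward, so a drive that was absent ends up
     * with an older count and is resynced when it is added back. */
    return a->events + 1;
}
```

With a degraded array the count only ever grows, so a removed drive can no longer end up holding the same event count as the drives that stayed in the array.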

[PATCH] dm: Fix deadlock under high i/o load in raid1 setup. · c06aad85

Authored by Daniel Kobras on Aug 27, 2006

On an nForce4-equipped machine with two SATA disks in raid1 setup using dmraid,
we experienced frequent deadlocks of the system under high i/o load.  'cat
/dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly
after a few GB, 'cp' would be left in 'D' state along with kjournald and
kmirrord.  The functions cp and kjournald were blocked in did vary, but
kmirrord's wchan always pointed to 'mempool_alloc()'.  We've seen this pattern
on 2.6.15 and 2.6.17 kernels.  http://lkml.org/lkml/2005/4/20/142 indicates
that this problem has been around even before.

So much for the facts, here's my interpretation: mempool_alloc() first tries
to atomically allocate the requested memory, or falls back to hand out
preallocated chunks from the mempool.  If both fail, it puts the calling
process (kmirrord in this case) on a private waitqueue until somebody refills
the pool.  Here the only 'somebody' is kmirrord itself, so we have a
deadlock.

I worked around this problem by falling back to a (blocking) kmalloc where
kmirrord would previously have ended up on the waitqueue.  This defeats part of
the benefits of using the mempool, but at least keeps the system running.  And
it could be done with a two-line change.  Note that mempool_alloc() clears the
GFP_NOIO flag internally, and only uses it to decide whether to wait or return
an error if immediate allocation fails, so the attached patch doesn't change
behaviour in the non-deadlocking case.  Patch is against current git
(2.6.18-rc4), but should apply to earlier versions as well.  I've tested on
2.6.15, where this patch makes the difference between random lockup and a
stable system.

Signed-off-by: Daniel Kobras <kobras@linux.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
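
The workaround can be pictured with a small userspace model. This is a sketch only: `struct toy_pool`, `toy_pool_alloc_nowait()` and `alloc_for_refiller()` are hypothetical names, `malloc()` merely stands in for a blocking kmalloc, and the real mempool API is not used here.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE 2

/* Hypothetical userspace stand-in for a mempool: a few reserved elements. */
struct toy_pool {
    void *reserve[POOL_SIZE];
    int free_count;
};

/* Non-waiting attempt: hand out a reserved element or return NULL.
 * It never sleeps waiting for someone to refill the pool. */
static void *toy_pool_alloc_nowait(struct toy_pool *p)
{
    if (p->free_count > 0)
        return p->reserve[--p->free_count];
    return NULL;
}

/* Allocation as done by the one thread that also refills the pool:
 * when the pool is empty, fall back to the general allocator instead
 * of waiting on the pool, since nobody else would ever refill it. */
void *alloc_for_refiller(struct toy_pool *p, size_t size, int *used_fallback)
{
    void *elem = toy_pool_alloc_nowait(p);

    *used_fallback = 0;
    if (elem == NULL) {
        elem = malloc(size);   /* stands in for a blocking kmalloc() */
        *used_fallback = 1;
    }
    return elem;
}
```

The trade-off is the one the commit message names: the fallback gives up part of the mempool's guarantee, but the refilling thread can no longer wait on itself.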

15 Aug, 2006 1 commit

[PATCH] dm: BUG/OOPS fix · 485311a2

Authored by Michal Miroslaw on Aug 13, 2006

Fix BUG I tripped on while testing failover and multipathing.

BUG shows up on error path in multipath_ctr() when parse_priority_group()
fails after returning at least once without error.  The fix is to
initialize m->ti early - just after alloc()ing it.

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c027c3d2
*pde = 00000000
Oops: 0000 [#3]
Modules linked in: qla2xxx ext3 jbd mbcache sg ide_cd cdrom floppy
CPU:    0
EIP:    0060:[<c027c3d2>]    Not tainted VLI
EFLAGS: 00010202   (2.6.17.3 #1)
EIP is at dm_put_device+0xf/0x3b
eax: 00000001   ebx: ee4fcac0   ecx: 00000000   edx: ee4fcac0
esi: ee4fc4e0   edi: ee4fc4e0   ebp: 00000000   esp: c5db3e78
ds: 007b   es: 007b   ss: 0068
Process multipathd (pid: 15912, threadinfo=c5db2000 task=ef485a90)
Stack: ec4eda40 c02816bd ee4fc4c0 00000000 f7e89498 f883e0bc c02816f6 f7e89480
       f7e8948c c0281801 ffffffea f7e89480 f883e080 c0281ffe 00000001 00000000
       00000004 dfe9cab8 f7a693c0 f883e080 f883e0c0 ca4b99c0 c027c6ee 01400000
Call Trace:
 <c02816bd> free_pgpaths+0x31/0x45  <c02816f6> free_priority_group+0x25/0x2e
 <c0281801> free_multipath+0x35/0x67  <c0281ffe> multipath_ctr+0x123/0x12d
 <c027c6ee> dm_table_add_target+0x11e/0x18b  <c027e5b4> populate_table+0x8a/0xaf
 <c027e62b> table_load+0x52/0xf9  <c027ec23> ctl_ioctl+0xca/0xfc
 <c027e5d9> table_load+0x0/0xf9  <c0152146> do_ioctl+0x3e/0x43
 <c0152360> vfs_ioctl+0x16c/0x178  <c01523b4> sys_ioctl+0x48/0x60
 <c01029b3> syscall_call+0x7/0xb
Code: 97 f0 00 00 00 89 c1 83 c9 01 80 e2 01 0f 44 c1 88 43 14 8b 04 24 59 5b 5e 5f 5d c3 53 89 c1 89 d3 ff 4a 08 0f 94 c0 84 c0 74 2a <8b> 01 8b 10 89 d8 e8 f6 fb ff ff 8b 03 8b 53 04 89 50 04 89 02
EIP: [<c027c3d2>] dm_put_device+0xf/0x3b SS:ESP 0068:c5db3e78

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
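
The shape of the fix can be illustrated with a toy constructor. Everything here is a hypothetical simplification (`struct target`, `struct multipath`, `multipath_ctr_sketch()` and the `fail_at` parameter are made up for the example); it only shows why the owner pointer has to be set before anything that may trigger the teardown path.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct target {
    int devices_held;    /* devices currently acquired for this target */
};

struct multipath {
    struct target *ti;   /* owning target, needed to release devices */
    int nr_paths;        /* paths acquired so far */
};

/* Teardown releases every acquired path through the owning target,
 * so it dereferences m->ti as soon as one path has been set up. */
static void free_multipath(struct multipath *m)
{
    while (m->nr_paths > 0) {
        m->ti->devices_held--;   /* a NULL m->ti would crash here */
        m->nr_paths--;
    }
}

/* Constructor sketch: fail_at is the index of the group whose parsing
 * fails, or -1 for no failure. Returns 0 on success, -1 on error. */
int multipath_ctr_sketch(struct target *ti, int ngroups, int fail_at)
{
    /* The fix: record the owner right after allocation, before any
     * group is parsed, so the error path below can rely on it. */
    struct multipath m = { .ti = ti, .nr_paths = 0 };

    for (int i = 0; i < ngroups; i++) {
        if (i == fail_at) {
            free_multipath(&m);  /* error path after partial setup */
            return -1;
        }
        ti->devices_held++;
        m.nr_paths++;
    }
    return 0;
}
```

A failure on the first group never reaches the dereference, which matches the report that the BUG only shows up once at least one group has been parsed successfully.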

06 Aug, 2006 1 commit

[PATCH] md: Fix a bug that recently crept into md/linear · f9abd1ac

Authored by NeilBrown on Aug 05, 2006

A recent patch that allowed linear arrays to be reconfigured on-line
let in a bug which results in a divide by zero - not all uses of
mddev->array_size were converted to conf->array_size.

This patch finishes the conversion and fixes the bug.

The offending patch was commit 7c7546cc.

Thanks to Simon Kirby <sim@netnation.com> for the bug report.

Cc: Simon Kirby <sim@netnation.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

11 Jul, 2006 11 commits

[PATCH] md: fix oops in error-handling · d0a0a5ee

Authored by Andrew Morton on Jul 10, 2006

During early MD setup (superblock reading), we don't have a personality yet.
But the error-handling code tries to dereference mddev->pers.  Fix.

Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: include sector number in messages about corrected read errors · d6950432

Authored by NeilBrown on Jul 10, 2006

This is generally useful, but particularly helps see if it is the same sector
that always needs correcting, or different ones.

[akpm@osdl.org: fix printk warnings]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: require CAP_SYS_ADMIN for (re-)configuring md devices via sysfs · 67463acb

Authored by NeilBrown on Jul 10, 2006

The ioctl requires CAP_SYS_ADMIN, so sysfs should too.  Note that we don't
require CAP_SYS_ADMIN for reading attributes even though the ioctl does.
There is no reason to limit the read access, and much of the information is
already available via /proc/mdstat.

Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: unify usage of symbolic names for perms · 80ca3a44

Authored by NeilBrown on Jul 10, 2006

In some places we use numbers (0660), in others names (S_IRUGO).  Change all
numbers to be names, and change 0655 to be what it should be.

Also make some formatting more consistent.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
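
A short sketch of why symbolic names help. The commit does not say which mode replaced 0655, so the read-for-all, write-for-owner combination below is an assumption chosen for illustration; `rw_owner_mode()` and `has_exec_bits()` are hypothetical helpers. `S_IRUGO` is a kernel-side macro, so this userspace example defines it from the POSIX constants.

```c
#include <sys/stat.h>

/* The kernel defines S_IRUGO; userspace headers usually do not. */
#ifndef S_IRUGO
#define S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)
#endif

/* Readable by everyone, writable by the owner only. */
unsigned int rw_owner_mode(void)
{
    return S_IRUGO | S_IWUSR;
}

/* Nonzero if the mode grants execute permission to anyone. */
int has_exec_bits(unsigned int mode)
{
    return (mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
}
```

Octal 0655 decodes to rw-r-xr-x, so it sets execute bits for group and others, which is easy to miss in numeric form and hard to write by accident with names.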

[PATCH] md: fix usage of wrong variable in raid1 · 5e3db645

Authored by NeilBrown on Jul 10, 2006

Though it rarely matters, we should be using 's' rather than r1_bio->sector
here.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: fix some small races in bitmap plugging in raid5 · ae3c20cc

Authored by NeilBrown on Jul 10, 2006

The comment gives more details, but I didn't quite have the sequencing right,
so there was room for races to leave bits unset in the on-disk bitmap for
short periods of time.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: fix a plug/unplug race in raid5 · 7c785b7a

Authored by NeilBrown on Jul 10, 2006

When a device is unplugged, requests are moved from one or two (depending on
whether a bitmap is in use) queues to the main request queue.

So whenever requests are put on either of those queues, we should make sure
the raid5 array is 'plugged'. However we don't. We currently plug the raid5
queue just before putting requests on queues, so there is room for a race. If
something unplugs the queue at just the wrong time, requests will be left on
the queue and nothing will want to unplug them. Normally something else will
plug and unplug the queue fairly soon, but there is a risk that nothing will.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: fix resync speed calculation for restarted resyncs · ff4e8d9a

Authored by NeilBrown on Jul 10, 2006

We introduced 'io_sectors' recently so we could count the sectors that cause
IO during resync separately from sectors which didn't cause IO - there can be a
difference if a bitmap is being used to accelerate resync.

However when a speed is reported, we find the number of sectors processed
recently by subtracting an oldish io_sectors count from a current
'curr_resync' count.  This is wrong because curr_resync counts all sectors,
not just io sectors.

So, add a field to mddev to store the current io_sectors separately from
curr_resync, and use that in the calculations.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
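
The arithmetic behind this fix, as a small sketch. The function name `resync_speed_kib()` and the sample numbers are made up for illustration; the only assumptions are 512-byte sectors and a speed expressed in KiB per second.

```c
/* Speed in KiB/s from two sector counts taken `seconds` apart.
 * Both counts must measure the same thing for the result to make sense. */
unsigned long resync_speed_kib(unsigned long sectors_now,
                               unsigned long sectors_mark,
                               unsigned long seconds)
{
    if (seconds == 0)
        seconds = 1;                  /* avoid dividing by zero */
    /* Two 512-byte sectors make one KiB. */
    return (sectors_now - sectors_mark) / 2 / seconds;
}
```

Suppose a bitmap lets the resync skip most of the array: curr_resync has reached 1,000,000 sectors while io_sectors went from 100,000 to 200,000 over 10 seconds. Subtracting like from like gives 100,000 / 2 / 10 = 5,000 KiB/s, whereas mixing the counters gives (1,000,000 - 100,000) / 2 / 10 = 45,000 KiB/s, a ninefold overstatement.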

[PATCH] md: delay starting md threads until array is completely setup · 0b8c9de0

Authored by NeilBrown on Jul 10, 2006

When an array is started we start one or two threads (two if there is a
reshape or recovery that needs to be completed).

We currently start these *before* the array is completely set up and in
particular before queue->queuedata is set.  If the thread actually starts
very quickly on another CPU, we can end up dereferencing queue->queuedata
and oops.

This patch also makes sure we don't try to start a recovery if a reshape is
being restarted.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: set desc_nr correctly for version-1 superblocks · 31b65a0d

Authored by NeilBrown on Jul 10, 2006

This has to be done in ->load_super, not ->validate_super.

Without this, hot-adding devices to an array doesn't always
work right - though there is a workaround in mdadm-2.5.2 to
make this less of an issue.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: possible fix for unplug problem · f4370781

Authored by NeilBrown on Jul 10, 2006

I have reports of a problem with raid5 which turns out to be because the raid5
device gets stuck in a 'plugged' state.  This shouldn't be able to happen as
3msec after it gets plugged it should get unplugged.  However it happens
nonetheless.  This patch fixes the problem and is a reasonable thing to do,
though it might hurt performance slightly in some cases.

Until I can find the real problem, we should probably have this workaround in
place.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

04 Jul, 2006 1 commit

[PATCH] lockdep: annotate blkdev nesting · 663d440e

Authored by Ingo Molnar on Jul 03, 2006

Teach special (recursive) locking code to the lock validator.

Effects on non-lockdep kernels:

- the introduction of the following function variants:

  extern struct block_device *open_partition_by_devnum(dev_t, unsigned);

  extern int blkdev_put_partition(struct block_device *);

  static int
  blkdev_get_whole(struct block_device *bdev, mode_t mode, unsigned flags);

 which on non-lockdep are the same as open_by_devnum(), blkdev_put()
 and blkdev_get().

- a subclass parameter to do_open(). [unused on non-lockdep]

- a subclass parameter to __blkdev_put(), which is a new internal
  function for the main blkdev_put*() functions. [parameter unused
  on non-lockdep kernels, except for two sanity check WARN_ON()s]

These functions carry no semantic difference - they only express
object dependencies towards the lockdep subsystem.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

01 Jul, 2006 1 commit

Remove obsolete #include <linux/config.h> · 6ab3d562

Authored by Jörn Engel on Jun 30, 2006

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

30 Jun, 2006 1 commit

[PATCH] drivers/md/raid5.c: remove an unused variable · cfb9e32f

Authored by Adrian Bunk on Jun 29, 2006

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

27 Jun, 2006 22 commits

[PATCH] devfs: Last little devfs cleanups throughout the kernel tree. · 890fbae2

Authored by Greg Kroah-Hartman on Jun 20, 2005

Just removes a few unused #defines and fixes some comments due to
devfs now being gone.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed · ce7b0f46

Authored by Greg Kroah-Hartman on Jun 20, 2005

And remove the now unneeded number field.
Also fixes all drivers that set these fields.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed · 96192ff1

Authored by Greg Kroah-Hartman on Jun 20, 2005

Also fixes all drivers that set this field.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree · ff23eca3

Authored by Greg Kroah-Hartman on Jun 20, 2005

Also fixes up all files that #include it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] devfs: Remove devfs_remove() function from the kernel tree · 8ab5e4c1

Authored by Greg Kroah-Hartman on Jun 20, 2005

Removes the devfs_remove() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree · 1a715c5c

Authored by Greg Kroah-Hartman on Jun 20, 2005

Removes the devfs_mk_bdev() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree · 95dc112a

Authored by Greg Kroah-Hartman on Jun 20, 2005

Removes the devfs_mk_dir() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] drivers/md/md.c: make code static · 05381954

Authored by Adrian Bunk on Jun 26, 2006

Make needlessly global code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: Allow the write_mostly flag to be set via sysfs · f655675b

Authored by NeilBrown on Jun 26, 2006

It appears in /sys/mdX/md/dev-YYY/state
and can be set or cleared by writing 'writemostly' or '-writemostly'
respectively.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
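
How the two words map onto the flag can be sketched as follows. This is not the driver's code: `word_is()` and `apply_state_word()` are hypothetical helpers, and accepting one trailing newline is an assumption made because text written with a shell `echo` usually ends in one.

```c
#include <stdbool.h>
#include <string.h>

/* True if buf is exactly `word`, optionally followed by one newline. */
static bool word_is(const char *buf, const char *word)
{
    size_t n = strlen(word);

    if (strncmp(buf, word, n) != 0)
        return false;
    return buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0');
}

/* Apply a word written to the state file to the write-mostly flag.
 * Returns 0 on success, -1 if the word is not recognised. */
int apply_state_word(const char *buf, bool *write_mostly)
{
    if (word_is(buf, "writemostly")) {
        *write_mostly = true;
        return 0;
    }
    if (word_is(buf, "-writemostly")) {
        *write_mostly = false;
        return 0;
    }
    return -1;
}
```

The leading minus sign as a "clear" marker keeps setting and clearing symmetric within a single attribute file.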

[PATCH] md: Allow resync_start to be set and queried via sysfs · a94213b1

Authored by NeilBrown on Jun 26, 2006

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: Allow raid 'layout' to be read and set via sysfs · d4dbd025

Authored by NeilBrown on Jun 26, 2006

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: Allow rdev state to be set via sysfs · 45dc2de1

Authored by NeilBrown on Jun 26, 2006

The md/dev-XXX/state file can now be written:

 "faulty" simulates an error on the device
 "remove" removes the device from the array (if it is not busy)

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: Set/get state of array via sysfs · 9e653b63

Authored by NeilBrown on Jun 26, 2006

This allows the state of an md/array to be directly controlled via sysfs and
adds the ability to stop an array without tearing it down.

Array states/settings:

 clear
     No devices, no size, no level
     Equivalent to STOP_ARRAY ioctl
 inactive
     May have some settings, but array is not active
        all IO results in error
     When written, doesn't tear down array, but just stops it
 suspended (not supported yet)
     All IO requests will block. The array can be reconfigured.
     Writing this, if accepted, will block until array is quiescent
 readonly
     no resync can happen.  no superblocks get written.
     write requests fail
 read-auto
     like readonly, but behaves like 'clean' on a write request.

 clean - no pending writes, but otherwise active.
     When written to inactive array, starts without resync
     If a write request arrives then
       if metadata is known, mark 'dirty' and switch to 'active'.
       if not known, block and switch to write-pending
     If written to an active array that has pending writes, then fails.
 active
     fully active: IO and resync can be happening.
     When written to inactive array, starts with resync

 write-pending (not supported yet)
     clean, but writes are blocked waiting for 'active' to be written.

 active-idle
     like active, but no writes have been seen for a while (100msec).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
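
The list of states lends itself to an enum plus a name table. The sketch below is an illustration only: the `ST_*` identifiers, `match_state()` and `state_supported()` are names chosen for this example, while the state words and the two "not supported yet" markers come from the commit message.

```c
#include <string.h>

/* One enumerator per state word, in the order listed in the commit. */
enum array_state {
    ST_CLEAR, ST_INACTIVE, ST_SUSPENDED, ST_READONLY, ST_READ_AUTO,
    ST_CLEAN, ST_ACTIVE, ST_WRITE_PENDING, ST_ACTIVE_IDLE,
    ST_BAD_WORD                      /* input matched no known state */
};

/* Words as they would be written to or read from the sysfs file. */
static const char *const state_names[] = {
    "clear", "inactive", "suspended", "readonly", "read-auto",
    "clean", "active", "write-pending", "active-idle", NULL
};

/* Map a word to its state, or ST_BAD_WORD if it is unknown. */
enum array_state match_state(const char *word)
{
    for (int n = 0; state_names[n] != NULL; n++)
        if (strcmp(word, state_names[n]) == 0)
            return (enum array_state)n;
    return ST_BAD_WORD;
}

/* The commit marks two states as not supported yet. */
int state_supported(enum array_state st)
{
    return st != ST_SUSPENDED && st != ST_WRITE_PENDING &&
           st != ST_BAD_WORD;
}
```

Keeping the enum and the table in the same order means the table index doubles as the state value, so adding a state is a two-line change.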

[PATCH] md: Don't write dirty/clean update to spares - leave them alone · 42543769

Authored by NeilBrown on Jun 26, 2006

- record the 'event' count on each individual device (they
  might sometimes be slightly different now)
- add a new value for 'sb_dirty': '3' means that the super
  block only needs to be updated to record a clean<->dirty
  transition.
- Prefer odd event numbers for dirty states and even numbers
  for clean states
- Using all the above, don't update the superblock on
  a spare device if the update is just doing a clean-dirty
  transition.  To accommodate this, a transition from
  dirty back to clean might now decrement the events counter
  if nothing else has changed.

The net effect of this is that spare drives will not see any IO requests
during normal running of the array, so they can go to sleep if that is what
they want to do.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
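
The points above boil down to two small predicates, sketched here under stated assumptions: the constant name `SB_CLEAN_DIRTY_ONLY` and both function names are hypothetical, and only the value 3 and the odd/even preference are taken from the commit message.

```c
#include <stdbool.h>

/* sb_dirty value meaning "only a clean<->dirty transition to record". */
#define SB_CLEAN_DIRTY_ONLY 3

/* Decide whether a device's superblock is written for this update.
 * Spares are skipped when nothing but the clean/dirty state changed. */
bool should_write_superblock(bool is_spare, int sb_dirty)
{
    if (is_spare && sb_dirty == SB_CLEAN_DIRTY_ONLY)
        return false;
    return true;
}

/* Preferred parity: even event counts for clean, odd for dirty. */
bool events_look_clean(unsigned long long events)
{
    return (events & 1) == 0;
}
```

Because a spare may now miss clean/dirty updates, its recorded event count can lag slightly behind the active devices, which is why the count is tracked per device.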

[PATCH] md: Allow re-add to work on array without bitmaps · 07d84d10

Authored by NeilBrown on Jun 26, 2006

When an array has a bitmap, a device can be removed and re-added and only
blocks changed since the removal (as recorded in the bitmap) will be resynced.

It should be possible to do a similar thing to arrays without bitmaps. i.e.
if a device is removed and re-added and *no* changes have been made in the
interim, then the add should not require a resync.

This patch allows that option. This means that when assembling an array one
device at a time (e.g. during device discovery) the array can be enabled
read-only as soon as enough devices are available, but extra devices can still
be added without causing a resync.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: Fix bug that stops raid5 resync from happening · 3285edf1

Authored by NeilBrown on Jun 26, 2006

As data_disks is *less* than raid_disks, the current test here is obviously
wrong.  And as the difference is already available in conf->max_degraded, it
makes much more sense to use that.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
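
Why the old test misfires can be shown with plain integers. The exact shape of the original comparison is an assumption here (the commit only says the test involved data_disks and raid_disks and was replaced by max_degraded); both function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed old form: the subtraction is the wrong way round, so the
 * threshold is negative and the test holds for every degraded count. */
bool too_degraded_old(int degraded, int data_disks, int raid_disks)
{
    return degraded >= (data_disks - raid_disks);
}

/* Fixed form: compare against the number of failures the level tolerates. */
bool too_degraded_new(int degraded, int max_degraded)
{
    return degraded >= max_degraded;
}
```

For a four-disk raid5 (three data disks, one failure tolerated) with no failed devices, the old form evaluates 0 >= -1 and reports the array as too degraded to resync, while the new form evaluates 0 >= 1 and lets the resync proceed.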

[PATCH] md: Fix Kconfig error · b3cc9ec7

Authored by akpm@osdl.org on Jun 26, 2006

RAID5 recently changed to RAID456

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: md Kconfig speeling feex · 4d2554d0

Authored by Justin Piszcz on Jun 26, 2006

I was experimenting with Linux SW raid today and found a spelling error when
reading the help menus...  (and fly spell found more).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: Calculate correct array size for raid10 in new offset mode · 88388328

Authored by NeilBrown on Jun 26, 2006

The size calculation made assumptions which the new offset mode didn't
follow.  This gets the size right in all cases.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md: Change md/bitmap file handling to use bmap to file blocks-fix · ce25c31b

Authored by NeilBrown on Jun 26, 2006

Fix problems with new bmap based access to bitmap files.

1/ When not using a file based bitmap, attach a NULL list of buffers
   to each page so the common free_buffer routine can cope.
2/ Use submit_bh to read as well as write, rather than vfs_read.
   This makes read and write more symmetric.
3/ Sync the file before reading, to ensure that the page cache has no
   dirty pages that might get written out later.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md/bitmap: change md/bitmap file handling to use bmap to file blocks · d785a06a

Authored by NeilBrown on Jun 26, 2006

If md is asked to store a bitmap in a file, it tries to hold onto the page
cache pages for that file, manipulate them directly, and call a cocktail of
operations to write the file out.  I don't believe this is a supportable
approach.

This patch changes the approach to use the same approach as swap files.  i.e.
bmap is used to enumerate the block addresses of all parts of the file and we
write directly to those blocks of the device.

swapfile only uses parts of the file that provide full pages at contiguous
addresses.  We don't have that luxury so we have to cope with pages that are
non-contiguous in storage.  To handle this we attach buffers to each page, and
store the addresses in those buffers.

With this approach the pagecache may contain data which is inconsistent with
what is on disk.  To alleviate the problems this can cause, md invalidates the
pagecache when releasing the file.  If the file is to be examined while the
array is active (a non-critical but occasionally useful function), O_DIRECT io
must be used.  A new version of mdadm will have support for this.

This approach simplifies a lot of code:
 - we no longer need to keep a list of pages which we need to wait for,
   as the b_endio function can keep track of how many outstanding
   writes there are.  This saves a mempool.
 - -EAGAIN returns from write_page are no longer possible (not sure if
    they ever were actually).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] md/bitmap: tidy up i_writecount handling in md/bitmap · acc55e22

Authored by NeilBrown on Jun 26, 2006

md/bitmap modifies i_writecount of a bitmap file to make sure that no-one else
writes to it.  The reverting of the change is sometimes done twice, and there
is one error path where it is omitted.

This patch tidies that up.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

openanolis / cloud-kernel synced successfully about 1 year ago
