- 28 8月, 2006 1 次提交
-
-
由 Daniel Kobras 提交于
On an nForce4-equipped machine with two SATA disk in raid1 setup using dmraid, we experienced frequent deadlock of the system under high i/o load. 'cat /dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly after a few GB, 'cp' would be left in 'D' state along with kjournald and kmirrord. The functions cp and kjournald were blocked in did vary, but kmirrord's wchan always pointed to 'mempool_alloc()'. We've seen this pattern on 2.6.15 and 2.6.17 kernels. http://lkml.org/lkml/2005/4/20/142 indicates that this problem has been around even before. So much for the facts, here's my interpretation: mempool_alloc() first tries to atomically allocate the requested memory, or falls back to hand out preallocated chunks from the mempool. If both fail, it puts the calling process (kmirrord in this case) on a private waitqueue until somebody refills the pool. Where the only 'somebody' is kmirrord itself, so we have a deadlock. I worked around this problem by falling back to a (blocking) kmalloc when before kmirrord would have ended up on the waitqueue. This defeats part of the benefits of using the mempool, but at least keeps the system running. And it could be done with a two-line change. Note that mempool_alloc() clears the GFP_NOIO flag internally, and only uses it to decide whether to wait or return an error if immediate allocation fails, so the attached patch doesn't change behaviour in the non-deadlocking case. Path is against current git (2.6.18-rc4), but should apply to earlier versions as well. I've tested on 2.6.15, where this patch makes the difference between random lockup and a stable system. Signed-off-by: NDaniel Kobras <kobras@linux.de> Acked-by: NAlasdair G Kergon <agk@redhat.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 15 8月, 2006 1 次提交
-
-
由 Michal Miroslaw 提交于
Fix BUG I tripped on while testing failover and multipathing. BUG shows up on error path in multipath_ctr() when parse_priority_group() fails after returning at least once without error. The fix is to initialize m->ti early - just after alloc()ing it. BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000 printing eip: c027c3d2 *pde = 00000000 Oops: 0000 [#3] Modules linked in: qla2xxx ext3 jbd mbcache sg ide_cd cdrom floppy CPU: 0 EIP: 0060:[<c027c3d2>] Not tainted VLI EFLAGS: 00010202 (2.6.17.3 #1) EIP is at dm_put_device+0xf/0x3b eax: 00000001 ebx: ee4fcac0 ecx: 00000000 edx: ee4fcac0 esi: ee4fc4e0 edi: ee4fc4e0 ebp: 00000000 esp: c5db3e78 ds: 007b es: 007b ss: 0068 Process multipathd (pid: 15912, threadinfo=c5db2000 task=ef485a90) Stack: ec4eda40 c02816bd ee4fc4c0 00000000 f7e89498 f883e0bc c02816f6 f7e89480 f7e8948c c0281801 ffffffea f7e89480 f883e080 c0281ffe 00000001 00000000 00000004 dfe9cab8 f7a693c0 f883e080 f883e0c0 ca4b99c0 c027c6ee 01400000 Call Trace: <c02816bd> free_pgpaths+0x31/0x45 <c02816f6> free_priority_group+0x25/0x2e <c0281801> free_multipath+0x35/0x67 <c0281ffe> multipath_ctr+0x123/0x12d <c027c6ee> dm_table_add_target+0x11e/0x18b <c027e5b4> populate_table+0x8a/0xaf <c027e62b> table_load+0x52/0xf9 <c027ec23> ctl_ioctl+0xca/0xfc <c027e5d9> table_load+0x0/0xf9 <c0152146> do_ioctl+0x3e/0x43 <c0152360> vfs_ioctl+0x16c/0x178 <c01523b4> sys_ioctl+0x48/0x60 <c01029b3> syscall_call+0x7/0xb Code: 97 f0 00 00 00 89 c1 83 c9 01 80 e2 01 0f 44 c1 88 43 14 8b 04 24 59 5b 5e 5f 5d c3 53 89 c1 89 d3 ff 4a 08 0f 94 c0 84 c0 74 2a <8b> 01 8b 10 89 d8 e8 f6 fb ff ff 8b 03 8b 53 04 89 50 04 89 02 EIP: [<c027c3d2>] dm_put_device+0xf/0x3b SS:ESP 0068:c5db3e78 Signed-off-by: NMichal Miroslaw <mirq-linux@rere.qmqm.pl> Acked-by: NAlasdair G Kergon <agk@redhat.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 06 8月, 2006 1 次提交
-
-
由 NeilBrown 提交于
A recent patch that allowed linear arrays to be reconfigured on-line allowed in a bug which results in divide by zero - not all mddev->array_size were converted to conf->array_size. This patch finished the conversion and fixed the bug. The offending patch was commit 7c7546cc. Thanks to Simon Kirby <sim@netnation.com> for the bug report. Cc: Simon Kirby <sim@netnation.com> Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 11 7月, 2006 11 次提交
-
-
由 Andrew Morton 提交于
During early MD setup (superblock reading), we don't have a personality yet. But the error-handling code tries to dereference mddev->pers. Fix. Acked-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
This is generally useful, but particularly helps see if it is the same sector that always needs correcting, or different ones. [akpm@osdl.org: fix printk warnings] Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The ioctl requires CAP_SYS_ADMIN, so sysfs should too. Note that we don't require CAP_SYS_ADMIN for reading attributes even though the ioctl does. There is no reason to limit the read access, and much of the information is already available via /proc/mdstat Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Some places we use number (0660) someplaces names (S_IRUGO). Change all numbers to be names, and change 0655 to be what it should be. Also make some formatting more consistent. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Though it rarely matters, we should be using 's' rather than r1_bio->sector here. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The comment gives more details, but I didn't quite have the sequencing write, so there was room for races to leave bits unset in the on-disk bitmap for short periods of time. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
When a device is unplugged, requests are moved from one or two (depending on whether a bitmap is in use) queues to the main request queue. So whenever requests are put on either of those queues, we should make sure the raid5 array is 'plugged'. However we don't. We currently plug the raid5 queue just before putting requests on queues, so there is room for a race. If something unplugs the queue at just the wrong time, requests will be left on the queue and nothing will want to unplug them. Normally something else will plug and unplug the queue fairly soon, but there is a risk that nothing will. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
We introduced 'io_sectors' recently so we could count the sectors that causes io during resync separate from sectors which didn't cause IO - there can be a difference if a bitmap is being used to accelerate resync. However when a speed is reported, we find the number of sectors processed recently by subtracting an oldish io_sectors count from a current 'curr_resync' count. This is wrong because curr_resync counts all sectors, not just io sectors. So, add a field to mddev to store the curren io_sectors separately from curr_resync, and use that in the calculations. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
When an array is started we start one or two threads (two if there is a reshape or recovery that needs to be completed). We currently start these *before* the array is completely set up and in particular before queue->queuedata is set. If the thread actually starts very quickly on another CPU, we can end up dereferencing queue->queuedata and oops. This patch also makes sure we don't try to start a recovery if a reshape is being restarted. Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
This has to be done in ->load_super, not ->validate_super Without this, hot-adding devices to an array doesn't always work right - though there is a work around in mdadm-2.5.2 to make this less of an issue. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
I have reports of a problem with raid5 which turns out to be because the raid5 device gets stuck in a 'plugged' state. This shouldn't be able to happen as 3msec after it gets plugged it should get unplugged. However it happens none-the-less. This patch fixes the problem and is a reasonable thing to do, though it might hurt performance slightly in some cases. Until I can find the real problem, we should probably have this workaround in place. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 04 7月, 2006 1 次提交
-
-
由 Ingo Molnar 提交于
Teach special (recursive) locking code to the lock validator. Effects on non-lockdep kernels: - the introduction of the following function variants: extern struct block_device *open_partition_by_devnum(dev_t, unsigned); extern int blkdev_put_partition(struct block_device *); static int blkdev_get_whole(struct block_device *bdev, mode_t mode, unsigned flags); which on non-lockdep are the same as open_by_devnum(), blkdev_put() and blkdev_get(). - a subclass parameter to do_open(). [unused on non-lockdep] - a subclass parameter to __blkdev_put(), which is a new internal function for the main blkdev_put*() functions. [parameter unused on non-lockdep kernels, except for two sanity check WARN_ON()s] these functions carry no semantical difference - they only express object dependencies towards the lockdep subsystem. Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 7月, 2006 1 次提交
-
-
由 Jörn Engel 提交于
Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
- 30 6月, 2006 1 次提交
-
-
由 Adrian Bunk 提交于
Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 27 6月, 2006 23 次提交
-
-
由 Greg Kroah-Hartman 提交于
Just removes a few unused #defines and fixes some comments due to devfs now being gone. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
And remove the now unneeded number field. Also fixes all drivers that set these fields. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
Also fixes all drivers that set this field. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
Also fixes up all files that #include it. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
Removes the devfs_remove() function and all callers of it. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
Removes the devfs_mk_bdev() function and all callers of it. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
Removes the devfs_mk_dir() function and all callers of it. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Adrian Bunk 提交于
Make needlessly global code static. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Cc: Neil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
It appears in /sys/mdX/md/dev-YYY/state and can be set or cleared by writing 'writemostly' or '-writemostly' respectively. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The md/dev-XXX/state file can now be written: "faulty" simulates an error on the device "remove" removes the device from the array (if it is not busy) Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
This allows the state of an md/array to be directly controlled via sysfs and adds the ability to stop and array without tearing it down. Array states/settings: clear No devices, no size, no level Equivalent to STOP_ARRAY ioctl inactive May have some settings, but array is not active all IO results in error When written, doesn't tear down array, but just stops it suspended (not supported yet) All IO requests will block. The array can be reconfigured. Writing this, if accepted, will block until array is quiescent readonly no resync can happen. no superblocks get written. write requests fail read-auto like readonly, but behaves like 'clean' on a write request. clean - no pending writes, but otherwise active. When written to inactive array, starts without resync If a write request arrives then if metadata is known, mark 'dirty' and switch to 'active'. if not known, block and switch to write-pending If written to an active array that has pending writes, then fails. active fully active: IO and resync can be happening. When written to inactive array, starts with resync write-pending (not supported yet) clean, but writes are blocked waiting for 'active' to be written. active-idle like active, but no writes have been seen for a while (100msec). Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
- record the 'event' count on each individual device (they might sometimes be slightly different now) - add a new value for 'sb_dirty': '3' means that the super block only needs to be updated to record a clean<->dirty transition. - Prefer odd event numbers for dirty states and even numbers for clean states - Using all the above, don't update the superblock on a spare device if the update is just doing a clean-dirty transition. To accomodate this, a transition from dirty back to clean might now decrement the events counter if nothing else has changed. The net effect of this is that spare drives will not see any IO requests during normal running of the array, so they can go to sleep if that is what they want to do. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
When an array has a bitmap, a device can be removed and re-added and only blocks changes since the removal (as recorded in the bitmap) will be resynced. It should be possible to do a similar thing to arrays without bitmaps. i.e. if a device is removed and re-added and *no* changes have been made in the interim, then the add should not require a resync. This patch allows that option. This means that when assembling an array one device at a time (e.g. during device discovery) the array can be enabled read-only as soon as enough devices are available, but extra devices can still be added without causing a resync. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
As data_disks is *less* than raid_disks, the current test here is obviously wrong. And as the difference is already available in conf->max_degraded, it makes much more sense to use that. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 akpm@osdl.org 提交于
RAID5 recently changed to RAID456 Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Justin Piszcz 提交于
I was experimenting with Linux SW raid today and found a spelling error when reading the help menus... (and fly spell found more). Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The size calculation made assumtion which the new offset mode didn't follow. This gets the size right in all cases. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Fix problems with new bmap based access to bitmap files. 1/ When not using a file based bitmap, attach a NULL list of buffers to each page so the common free_buffer routine can cope. 2/ Use submit_bh to read as well as write, rather than vfs_read. This makes read and write more symetric. 3/ sync the file before reading, to ensure that the page cache has no dirty pages that might get written out later. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
If md is asked to store a bitmap in a file, it tries to hold onto the page cache pages for that file, manipulate them directly, and call a cocktail of operations to write the file out. I don't believe this is a supportable approach. This patch changes the approach to use the same approach as swap files. i.e. bmap is used to enumerate all the block address of parts of the file and we write directly to those blocks of the device. swapfile only uses parts of the file that provide a full pages at contiguous addresses. We don't have that luxury so we have to cope with pages that are non-contiguous in storage. To handle this we attach buffers to each page, and store the addresses in those buffers. With this approach the pagecache may contain data which is inconsistent with what is on disk. To alleviate the problems this can cause, md invalidates the pagecache when releasing the file. If the file is to be examined while the array is active (a non-critical but occasionally useful function), O_DIRECT io must be used. And new version of mdadm will have support for this. This approach simplifies a lot of code: - we no longer need to keep a list of pages which we need to wait for, as the b_endio function can keep track of how many outstanding writes there are. This saves a mempool. - -EAGAIN returns from write_page are no longer possible (not sure if they ever were actually). Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
md/bitmap modifies i_writecount of a bitmap file to make sure that no-one else writes to it. The reverting of the change is sometimes done twice, and there is one error path where it is omitted. This patch tidies that up. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
bitmap_active is never called, and the BITMAP_ACTIVE flag is never users or tested, so discard them both. Also remove some out-of-date 'todo' comments. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-