- 08 Mar 2012, 6 commits
-
-
Submitted by Joe Thornber

Correct the number of mapped sectors shown on a thin device's status line by decrementing td->mapped_blocks in __remove() each time a block is removed.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
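As a hedged sketch, the fix lands in __remove() in drivers/md/dm-thin-metadata.c; the function shape below follows that file, but treat it as an illustration rather than the verbatim patch:

    static int __remove(struct dm_thin_device *td, dm_block_t block)
    {
            int r;
            struct dm_pool_metadata *pmd = td->pmd;
            dm_block_t keys[2] = { td->id, block };

            r = dm_btree_remove(&pmd->info, pmd->root, keys, &pmd->root);
            if (r)
                    return r;

            /* the fix: keep the per-device block count in step with the
             * btree so the status line reports accurate mapped sectors */
            td->mapped_blocks--;
            td->changed = 1;

            return 0;
    }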
-
Submitted by Joe Thornber

If dm_sm_disk_create() fails, the superblock must be unlocked.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
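A minimal sketch of the corrected error path, assuming the names sblock, tm and nr_blocks from the dm-thin metadata code (dm_sm_disk_create() returns an ERR_PTR on failure):

    data_sm = dm_sm_disk_create(tm, nr_blocks);
    if (IS_ERR(data_sm)) {
            r = PTR_ERR(data_sm);
            dm_bm_unlock(sblock);   /* the fix: release the superblock
                                     * lock before bailing out */
            goto bad;
    }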
-
Submitted by Mike Snitzer

The __open_device() error paths in __create_thin() and __create_snap() incorrectly call __close_device() even if td was not initialized by __open_device(). Remove these calls. Also document __open_device()'s return values, remove a redundant td->changed = 1 in __create_thin(), and add an additional safeguard against creating an already-existing device.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
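Roughly, the corrected __create_thin() error path looks like this (a sketch under assumed names; the point is simply that td is never touched when __open_device() fails):

    r = __open_device(pmd, dev, 1, &td);
    if (r) {
            /*
             * __open_device() failed, so td was never initialised;
             * the old code called __close_device(td) here and walked
             * off an invalid pointer.  Just undo the btree insert.
             */
            dm_btree_remove(&pmd->tl_info, pmd->root, &key, &pmd->root);
            return r;
    }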
-
Submitted by Mike Snitzer

The following BUG is hit on the first read submitted to a dm-flakey test device while the device is "down", if the corrupt_bio_byte feature wasn't requested when the device's table was loaded.

Example DM table that will hit this BUG:
0 2097152 flakey 8:0 2048 0 30

This bug was introduced by commit a3998799 (dm flakey: add corrupt_bio_byte feature) in v3.1-rc1.

BUG: unable to handle kernel paging request at ffff8801cfce3fff
IP: [<ffffffffa008c233>] corrupt_bio_data+0x6e/0xae [dm_flakey]
PGD 1606063 PUD 0
Oops: 0002 [#1] SMP
...
Call Trace:
<IRQ>
[<ffffffffa008c2b5>] flakey_end_io+0x42/0x48 [dm_flakey]
[<ffffffffa00dca98>] clone_endio+0x54/0xb6 [dm_mod]
[<ffffffff81130587>] bio_endio+0x2d/0x2f
[<ffffffff811c819a>] req_bio_endio+0x96/0x9f
[<ffffffff811c94b9>] blk_update_request+0x1dc/0x3a9
[<ffffffff812f5ee2>] ? rcu_read_unlock+0x21/0x23
[<ffffffff811c96a6>] blk_update_bidi_request+0x20/0x6e
[<ffffffff811c9713>] blk_end_bidi_request+0x1f/0x5d
[<ffffffff811c978d>] blk_end_request+0x10/0x12
[<ffffffff8128f450>] scsi_io_completion+0x1e5/0x4b1
[<ffffffff812882a9>] scsi_finish_command+0xec/0xf5
[<ffffffff8128f830>] scsi_softirq_done+0xff/0x108
[<ffffffff811ce284>] blk_done_softirq+0x84/0x98
[<ffffffff81048d19>] __do_softirq+0xe3/0x1d5
[<ffffffff8138f83f>] ? _raw_spin_lock+0x62/0x69
[<ffffffff810997cf>] ? handle_irq_event+0x4c/0x61
[<ffffffff8139833c>] call_softirq+0x1c/0x30
[<ffffffff81003b37>] do_softirq+0x4b/0xa3
[<ffffffff81048a39>] irq_exit+0x53/0xca
[<ffffffff81398acd>] do_IRQ+0x9d/0xb4
[<ffffffff81390333>] common_interrupt+0x73/0x73
...

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.1+
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
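The fix amounts to not touching the bio payload unless the feature was actually configured. A sketch of the guarded end_io path (simplified; see drivers/md/dm-flakey.c for the full set of conditions):

    static int flakey_end_io(struct dm_target *ti, struct bio *bio,
                             int error, union map_info *map_context)
    {
            struct flakey_c *fc = ti->private;

            /*
             * the fix: only corrupt successful READs if corrupt_bio_byte
             * was requested in the table; previously this ran even when
             * the feature was absent and scribbled past the bio payload
             */
            if (fc->corrupt_bio_byte && !error &&
                bio_data_dir(bio) == READ && fc->corrupt_bio_rw == READ)
                    corrupt_bio_data(bio, fc);

            return error;
    }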
-
Submitted by Milan Broz

This patch fixes a crash by recognising discards in dm_io. Currently dm_mirror can send REQ_DISCARD bios when running over a discard-enabled device, and without support in dm_io the system crashes badly.

BUG: unable to handle kernel paging request at 00800000
IP: __bio_add_page.part.17+0xf5/0x1e0
...
bio_add_page+0x56/0x70
dispatch_io+0x1cf/0x240 [dm_mod]
? km_get_page+0x50/0x50 [dm_mod]
? vm_next_page+0x20/0x20 [dm_mod]
? mirror_flush+0x130/0x130 [dm_mirror]
dm_io+0xdc/0x2b0 [dm_mod]
...

Introduced in 2.6.38-rc1 by commit 5fc2ffea (dm raid1: support discard).

Signed-off-by: Milan Broz <mbroz@redhat.com>
Cc: stable@kernel.org
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
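The crux is that REQ_DISCARD bios carry no data pages, so dm-io has to size the bio directly instead of looping over bio_add_page(). A sketch of the special case inside do_region() (assumed shape, not the verbatim patch):

    /* inside do_region(), before the usual page-adding loop */
    if (rw & REQ_DISCARD) {
            /* discards carry no data pages: set the bio size directly
             * instead of calling bio_add_page() on the page iterator */
            num_sectors = min_t(sector_t,
                                q->limits.max_discard_sectors, remaining);
            bio->bi_size = num_sectors << SECTOR_SHIFT;
            remaining -= num_sectors;
    }
    /* non-discard I/O continues into the bio_add_page() loop as before */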
-
Submitted by Jesper Juhl

If 'argc' is zero we jump to the 'out:' label, but this leaks the (unused) memory that 'dm_split_args()' allocated for 'argv' if the string being split consisted entirely of whitespace. Jump to the 'out_argv:' label instead to free that memory.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
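A sketch of the corrected flow (label names from the description; the surrounding code is assumed, with the real function in drivers/md/dm-ioctl.c):

    r = dm_split_args(&argc, &argv, dmi->message);
    if (r) {
            DMWARN("Failed to split message arguments");
            goto out;
    }

    if (!argc) {
            DMWARN("Empty message received.");
            goto out_argv;          /* the fix: "goto out" here leaked
                                     * the argv allocation */
    }

    /* ...dispatch the message to the target... */

    out_argv:
        kfree(argv);
    out:
        return r;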
-
- 07 Feb 2012, 1 commit
-
-
Submitted by NeilBrown

1/ If a resync is aborted, we should record how far we got (recovery_cp) as the last request that we know has completed (->curr_resync_completed), rather than the last request that was submitted (->curr_resync).

2/ When a resync aborts we still want to update the metadata with any changes, so set MD_CHANGE_DEVS even if we 'skip'.

Signed-off-by: NeilBrown <neilb@suse.de>
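Both changes live in md's resync path (md_do_sync()); roughly, as a sketch rather than the exact diff:

    /* 1/ on abort, record the last request known to have completed,
     *    not the last one submitted */
    if (test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
        mddev->curr_resync > 2)
            mddev->recovery_cp = mddev->curr_resync_completed;

    /* 2/ make sure the metadata gets updated even when we skip */
    set_bit(MD_CHANGE_DEVS, &mddev->flags);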
-
- 31 Jan 2012, 1 commit
-
-
Submitted by Jonathan Brassow

The life cycle of a device-mapper target is:
1) create
2) resume
3) suspend
*) possibly repeat from 2
4) destroy

The dm-raid target is unconditionally calling MD's bitmap_load function upon every resume. If steps 2 and 3 above are repeated, bitmap_load is called multiple times. It is only written to be called once; otherwise, it allocates new memory for the bitmap (without freeing the old) and increments the number of pages it thinks it has without zeroing first. This ultimately leads to accesses beyond the allocated memory and to leaked memory.

Simply avoiding the bitmap_load call upon resume is not sufficient. If the target was suspended while the initial recovery was only partially complete, that recovery needs to be restarted when the target is resumed. This is why 'md_wakeup_thread' is called before issuing the 'mddev_resume'.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
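One way to picture the fix, as a sketch of raid_resume() (the bitmap_loaded guard flag is an assumption; bitmap_load(), md_wakeup_thread() and mddev_resume() are the md interfaces named above):

    static void raid_resume(struct dm_target *ti)
    {
            struct raid_set *rs = ti->private;

            if (!rs->bitmap_loaded) {
                    /* load the bitmap once, on the first resume only */
                    bitmap_load(&rs->md);
                    rs->bitmap_loaded = 1;
            }

            /* restart any recovery that suspend interrupted, before
             * letting I/O flow again */
            md_wakeup_thread(rs->md.thread);
            mddev_resume(&rs->md);
    }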
-
- 15 Jan 2012, 1 commit
-
-
Submitted by Paolo Bonzini

A logical volume can map to just part of an underlying physical volume. In this case, it must be treated like a partition.

Based on a patch from Alasdair G Kergon.

Cc: Alasdair G Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
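For dm this comes down to only passing SCSI ioctls straight through when the single target spans the whole underlying device. A sketch of the check (scsi_verify_blk_ioctl() comes from the same patch series; its exact placement in dm_blk_ioctl() is assumed):

    /*
     * Only pass the ioctl through unfiltered if the target maps the
     * whole underlying device; a partial mapping must be treated
     * like a partition.
     */
    if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
            r = scsi_verify_blk_ioctl(NULL, cmd);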
-
- 11 Jan 2012, 3 commits
-
-
Submitted by Martin K. Petersen

Stacking driver queue limits are typically bounded exclusively by the capabilities of the low-level devices, not by the stacking driver itself. This patch introduces blk_set_stacking_limits(), which has more liberal metrics than the default queue limits function. This allows us to inherit topology parameters from bottom devices without manually tweaking the default limits in each driver prior to calling the stacking function.

Since there is now a clear distinction between stacking and low-level devices, blk_set_default_limits() has been modified to carry the more conservative values that we used to set manually in blk_queue_make_request().

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
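Usage from a stacking driver then looks roughly like this (a sketch; blk_set_stacking_limits() and blk_stack_limits() are the block-layer helpers this patch touches, while the surrounding variables are assumed):

    struct queue_limits *limits = &t->limits;

    /* start from permissive stacking defaults instead of the
     * conservative low-level defaults */
    blk_set_stacking_limits(limits);

    /* then, for each component device, fold in its limits;
     * 'start' is the component's offset in sectors */
    if (blk_stack_limits(limits, &bdev_get_queue(bdev)->limits, start))
            pr_warn("device limits are inconsistent across the stack\n");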
-
Submitted by NeilBrown

We normally try to avoid reading from write-mostly devices, but when we do, we really have to check for bad blocks and be sure not to try reading them. With the current code, best_good_sectors might not get set, and that causes zero-length read requests to be sent down, which is very confusing.

This bug was introduced in commit d2eb35ac, so the patch is suitable for 3.1.x and 3.2.x.

Reported-and-tested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reported-and-tested-by: Art -kwaak- van Breemen <ard@telegraafnet.nl>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@vger.kernel.org
-
Submitted by NeilBrown

We currently only 'notify' changes to the 'degraded' attribute when it decreases, not when it increases. Notifying on failure is a little awkward as it happens in interrupt context. So instead, notify when we remove the failed device from the array, which happens very soon afterwards.

Reported-and-tested-by: Mikhail Balabin <mbalabin@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
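The notification itself is a one-liner; the patch is about where it runs. A sketch (md's sysfs attribute really is called "degraded"; the call site is the device-removal path described above):

    /* runs from the process-context path that removes the failed
     * device, not from the interrupt-context error handler */
    sysfs_notify(&mddev->kobj, NULL, "degraded");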
-
- 04 Jan 2012, 1 commit
-
-
Submitted by Al Viro

Move invalidate_bdev and block_sync_page into fs/block_dev.c. Export kill_bdev as well, so brd doesn't have to open-code it. Reduce the buffer_head.h requirement accordingly.

Removed a rather large comment from invalidate_bdev, as it looked a bit too obsolete to bother moving. The small comment replacing it says enough.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 23 Dec 2011, 27 commits
-
-
Submitted by NeilBrown

Now that WantReplacement drives are replaced cleanly, mark a drive as want_replacement when we see a write error. It might get failed soon, making the WantReplacement flag irrelevant, but if the write error is recorded in the bad block log, we still want to activate any spare that might be available.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When attempting to add a spare to a RAID1 array, also consider adding it as a replacement for a want_replacement device.

Signed-off-by: NeilBrown <neilb@suse.de>
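A sketch of what the hot-add path gains (WantReplacement, Replacement and In_sync are the flag names this series uses; p is assumed to point at the slot being considered, with replacements stored raid_disks entries further on):

    if (test_bit(WantReplacement, &p->rdev->flags) &&
        p[conf->raid_disks].rdev == NULL) {
            /* install the spare as a replacement rather than as a
             * rebuild target for an empty slot */
            clear_bit(In_sync, &rdev->flags);
            set_bit(Replacement, &rdev->flags);
            rdev->raid_disk = mirror;
            conf->fullsync = 1;
            rcu_assign_pointer(p[conf->raid_disks].rdev, rdev);
            err = 0;
    }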
-
Submitted by NeilBrown

If a Replacement is seen, file it as such. If we see two replacements (or two normal devices) for the same slot, abort.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When recovery completes, ->spare_active is called. This checks if the replacement is ready and, if so, fails the original.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

Replacement devices are stored at a different offset, so look there too.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

In RAID1, a replacement is much like a normal device, so we just double the size of the relevant arrays and look at all possible devices for reads and writes. This means that the array looks like it is now double its real size in some ways, and we need to be careful about that. In particular, when checking whether the array is still degraded while creating a recovery request, we must consider only the first 'half', i.e. the real (non-replacement) devices.

Signed-off-by: NeilBrown <neilb@suse.de>
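The consequence for iteration, as a sketch: device loops walk both halves of conf->mirrors[], while degraded accounting looks only at the first half (do_something() is a hypothetical stand-in for the per-device work):

    int i, degraded = 0;

    /* reads/writes consider originals and replacements alike */
    for (i = 0; i < conf->raid_disks * 2; i++) {
            struct md_rdev *rdev = conf->mirrors[i].rdev;

            if (rdev)
                    do_something(rdev);     /* hypothetical per-device work */
    }

    /* degraded accounting only looks at the real devices */
    for (i = 0; i < conf->raid_disks; i++)
            if (!conf->mirrors[i].rdev)
                    degraded++;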
-
Submitted by NeilBrown

In general, mddev->raid_disks can change unexpectedly, while conf->raid_disks will only change in a very controlled way. So change some uses of one to the other. The use of mddev->raid_disks will not actually cause problems, but this way is more consistent and safer in the long term.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When attempting to add a spare to a RAID10 array, also consider adding it as a replacement for a want_replacement device.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

If a Replacement is seen, file it as such. If we see two replacements (or two normal devices) for the same slot, abort.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When recovery finishes and spare_active is called, check for a replacement that might have just become fully synced, mark it as such, and mark the original as failed. Then when the original is removed, move the replacement into its position.

This means that 'replacement' can spontaneously become NULL in some situations. Make sure we check for those. It also means that 'rdev' and 'replacement' could appear to be identical - check for that too.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

If there is a replacement device, then recover to it, reading from any drives - maybe the one being replaced, maybe not.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

If we need to resync an array which has replacement devices, we always write any block checked to every replacement.

If the resync was a bitmap-based resync, we will then complete the replacement normally. If it was a full resync, we mark the replacements as fully recovered when the resync finishes, so no further recovery is needed.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When writing, we need to submit two writes: one to the original, and one to the replacement - if there is a replacement.

If the write to the replacement results in a write error, we just fail the device. We only try to record write errors to the original.

This only handles writing new data. Writing for resync/recovery will come later.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

Enhance raid10_remove_disk to be able to remove ->replacement as well as ->rdev.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When reading (for array reads, not for recovery etc.), we read from the replacement device if it has recovered far enough. This requires storing the chosen rdev in the 'r10_bio' so we can make sure to drop the reference on the right device when the read finishes.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

It makes more sense to return an rdev than just an index, as read_balance() gets a reference to the rdev, so returning the pointer makes this more idiomatic. This will be needed in a future patch, when we might return a 'replacement' rdev instead of the main rdev.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

Allow each slot in the RAID10 to have 2 devices: the want_replacement device and the replacement. Also allow an r10bio to have 2 bios, and for resync/recovery allocate the second bio if there are any replacement devices.

Signed-off-by: NeilBrown <neilb@suse.de>
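The per-slot layout this describes, sketched in the style of raid10.h (the fields shown are the ones the text names; the other bookkeeping fields of r10bio are elided):

    struct r10bio {
            /* ...counters, sector, mddev, state... */
            struct r10dev {
                    struct bio      *bio;
                    struct bio      *repl_bio;      /* the second bio: used
                                                     * when a replacement
                                                     * device exists */
                    sector_t        addr;
                    int             devnum;
            } devs[0];
    };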
-
Submitted by NeilBrown

Now that WantReplacement drives are replaced cleanly, mark a drive as WantReplacement when we see a write error. It might get failed soon, making the WantReplacement flag irrelevant, but if the write error is recorded in the bad block log, we still want to activate any spare that might be available.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
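The write-error hook is small; a sketch of the pattern (WantReplacement and MD_RECOVERY_NEEDED are the real md flag names; placement in the raid5 write-completion path is assumed):

    /* on a failed write, mark the device as wanting replacement and
     * poke md so that an available spare can be activated */
    if (!test_and_set_bit(WantReplacement, &rdev->flags))
            set_bit(MD_RECOVERY_NEEDED, &rdev->mddev->recovery);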
-
Submitted by NeilBrown

When attempting to add a spare to a RAID[456] array, also consider adding it as a replacement for a want_replacement device. This requires that common md code attempt hot_add even when the array is not formally degraded.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

If a Replacement is seen, file it as such. If we see two replacements (or two normal devices) for the same slot, abort.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When recovery completes - as reported by a call to ->spare_active - we clear In_sync on the original and set it on the replacement. Then, when the original gets removed, we move the replacement from 'replacement' to 'rdev'.

This could race with other code that is looking at these pointers, so we use memory barriers and careful ordering to ensure that a reader might see one device twice, but never no devices. The readers then guard against using both devices, which could only happen when writing.

Signed-off-by: NeilBrown <neilb@suse.de>
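The ordering trick described above, sketched (the shape follows the raid5 remove-disk path; the comments spell out the invariant):

    /* the failed original has just been removed: promote the
     * replacement.  A racing reader may briefly see the same device
     * through both pointers, but will never see neither. */
    p->rdev = p->replacement;
    clear_bit(Replacement, &p->replacement->flags);
    smp_mb();       /* make sure the new ->rdev is visible before
                     * ->replacement goes away */
    p->replacement = NULL;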
-
Submitted by NeilBrown

During recovery we want to write to the replacement but not the original. So we have two new flags:
- R5_NeedReplace if this stripe has a replacement that needs to be written at some stage
- R5_WantReplace if NeedReplace, and the data is available, and a 'sync' has been requested on this stripe

We also distinguish between 'sync and replace', which needs to read all other devices, and 'replace', which only needs to read the devices being replaced.

Note that during resync we always write to any replacement device. It might not need to be written to, but as we don't read to compare, we have to write to be sure.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

When writing, we need to submit two writes: one to the original, and one to the replacement - if there is a replacement.

If the write to the replacement results in a write error, we just fail the device. We only try to record write errors to the original.

When writing for recovery, we shouldn't write to the original. This will be addressed in a subsequent patch that generally addresses recovery.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
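In terms of the raid5 I/O submission path, the double write looks roughly like this (the bi/rbi bio names are assumed; both device pointers are fetched under RCU in the raid5 style):

    struct md_rdev *rdev, *rrdev;

    rcu_read_lock();
    rdev  = rcu_dereference(conf->disks[i].rdev);
    rrdev = rcu_dereference(conf->disks[i].replacement);
    if (rdev)
            atomic_inc(&rdev->nr_pending);
    if (rrdev)
            atomic_inc(&rrdev->nr_pending);
    rcu_read_unlock();

    if (rdev)
            generic_make_request(bi);       /* original: errors go to the
                                             * bad block log */
    if (rrdev)
            generic_make_request(rbi);      /* replacement: a write error
                                             * just fails the device */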
-
Submitted by NeilBrown

Enhance raid5_remove_disk to be able to remove ->replacement as well as ->rdev.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

If a replacement device is present and has been recovered far enough, then use it for reading into the stripe cache. If we get an error, we don't try to repair it; we just fail the device - a replacement device that gives errors does not sound sensible.

This requires removing the setting of R5_ReadError when we get a read error during a read that bypasses the cache. It was probably a bad idea anyway, as we don't know that every block in the read caused an error, and it could cause ReadError to be set on the replacement device, which is bad.

Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

We currently initialise some fields of a bio when preparing a stripe_head, and again just before submitting the request. Remove the duplication by only setting the fields that lower-level devices don't touch in raid5_build_block, and only setting the changeable fields in ops_run_io.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
-
Submitted by NeilBrown

Remove some #defines that are no longer used, and replace some others with an enum. Also remove an unused field.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
-