- 13 10月, 2008 3 次提交
-
-
由 NeilBrown 提交于
'read-auto' is a variant of 'readonly' which will switch to writable on the first write attempt. Calling do_md_stop to set the array readonly when it is already readonly returns an error. So make sure not to do that. Signed-off-by: NNeilBrown <neilb@suse.de>
-
由 NeilBrown 提交于
For externally managed metadata, the 'metadata_version' sysfs attribute is really just a channel for user-space programs to communicate about how the array is being managed. It can be useful for this to be changed while the array is active. Normally changes to metadata_version are not permitted while the array is active. Change that so that if the metadata is externally managed, the metadata_version can be changed to a different flavour of external management. Signed-off-by: NNeilBrown <neilb@suse.de>
-
由 Chris Webb 提交于
Fix rdev_size_store with size == 0. size == 0 means to use the largest size allowed by the underlying device and is used when modifying an active array. This fixes a regression introduced by commit d7027458 Cc: <stable@kernel.org> Signed-off-by: NChris Webb <chris@arachsys.com> Signed-off-by: NNeilBrown <neilb@suse.de>
-
- 10 10月, 2008 18 次提交
-
-
由 Alasdair G Kergon 提交于
Detect and report buggy drivers that destroy their request_queue. Signed-off-by: NAlasdair G Kergon <agk@redhat.com> Cc: Stefan Raspl <raspl@linux.vnet.ibm.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org>
-
由 Mikulas Patocka 提交于
Publish dm_vcalloc in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Publish dm_table_unplug_all in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Publish dm_get_mapinfo in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Split struct dm_dev in two and publish the part that other targets need in include/linux/device-mapper.h. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Don't wait between submitting crypt requests for a bio unless we are short of memory. There are two situations when we must split an encrypted bio: 1) there are no free pages; 2) the new bio would violate underlying device restrictions (e.g. max hw segments). In case (2) we do not need to wait. Add output variable to crypt_alloc_buffer() to distinguish between these cases. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Move the initialisation of ctx->pending into one place, at the start of crypt_convert(). Introduce crypt_finished to indicate whether or not the encryption is finished, for use in a later patch. No functional change. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
The pending reference count must be incremented *before* the async work is queued to another thread, not after. Otherwise there's a race if the work completes and decrements the reference count before it gets incremented. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Make kcryptd_crypt_write_io_submit() responsible for decrementing the pending count after an error. Also fixes a bug in the async path that forgot to decrement it. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Alasdair G Kergon 提交于
Make the caller reponsible for incrementing the pending count before calling kcryptd_crypt_write_io_submit() in the non-async case to bring it into line with the async case. Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Move kcryptd_crypt_write_convert_loop inside kcryptd_crypt_write_convert. This change is needed for a later patch. No functional change. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Factor out crypt io allocation code. Later patches will call it from another place. No functional change. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Move io pending to one place. No functional change, usefull to simplify debugging. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Change uint32_t into chunk_t to remove 32-bit limitation on the number of chunks on systems with 64-bit sector numbers. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Move this logic to a function, because it will be reused later. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Jonathan Brassow 提交于
dm-raid1 is setting the 'DM_KCOPYD_IGNORE_ERROR' flag unconditionally when assigning kcopyd work. kcopyd is responsible for copying an assigned section of disk to one or more other disks. The 'DM_KCOPYD_IGNORE_ERROR' flag affects kcopyd in the following way: When not set: kcopyd will immediately stop the copy operation when an error is encountered. When set: kcopyd will try to proceed regardless of errors and try to continue copying any remaining amount. Since dm-raid1 tracks regions of the address space that are (or are not) in sync and it now has the ability to handle these errors, we can safely enable this optimization. This optimization is conditional on whether mirror error handling has been enabled. Signed-off-by: NJonathan Brassow <jbrassow@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Kiyoshi Ueda 提交于
This patch moves 'is_active' from struct dm_path to struct pgpath as it does not need exporting. Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Benjamin Marzinski 提交于
This patch allows path errors from the multipath ctr function to propagate up to userspace as errno values from the ioctl() call. This is in response to https://www.redhat.com/archives/dm-devel/2008-May/msg00000.html and https://bugzilla.redhat.com/show_bug.cgi?id=444421 The patch only lets through the errors that it needs to in order to get the path errors from parse_path(). Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
- 09 10月, 2008 11 次提交
-
-
由 Denis ChengRq 提交于
Since all bio_split calls refer the same single bio_split_pool, the bio_split function can use bio_split_pool directly instead of the mempool_t parameter; then the mempool_t parameter can be removed from bio_split param list, and bio_split_pool is only referred in fs/bio.c file, can be marked static. Signed-off-by: NDenis ChengRq <crquan@gmail.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Mike Anderson 提交于
Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Move stats related fields - stamp, in_flight, dkstats - from disk to part0 and unify stat handling such that... * part_stat_*() now updates part0 together if the specified partition is not part0. ie. part_stat_*() are now essentially all_stat_*(). * {disk|all}_stat_*() are gone. * part_round_stats() is updated similary. It handles part0 stats automatically and disk_round_stats() is killed. * part_{inc|dec}_in_fligh() is implemented which automatically updates part0 stats for parts other than part0. * disk_map_sector_rcu() is updated to return part0 if no part matches. Combined with the above changes, this makes NULL special case handling in callers unnecessary. * Separate stats show code paths for disk are collapsed into part stats show code paths. * Rename disk_stat_lock/unlock() to part_stat_lock/unlock() While at it, reposition stat handling macros a bit and add missing parentheses around macro parameters. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Till now, bdev->bd_part is set only if the bdev was for parts other than part0. This patch makes bdev->bd_part always set so that code paths don't have to differenciate common handling. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Move disk->policy to part0->policy. Implement and use get_disk_ro(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Implement {disk|part}_to_dev() and use them to access generic device instead of directly dereferencing {disk|part}->dev. To make sure no user is left behind, rename generic devices fields to __dev. This is in preparation of unifying partition 0 handling with other partitions. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
There are two variants of stat functions - ones prefixed with double underbars which don't care about preemption and ones without which disable preemption before manipulating per-cpu counters. It's unclear whether the underbarred ones assume that preemtion is disabled on entry as some callers don't do that. This patch unifies diskstats access by implementing disk_stat_lock() and disk_stat_unlock() which take care of both RCU (for partition access) and preemption (for per-cpu counter access). diskstats access should always be enclosed between the two functions. As such, there's no need for the versions which disables preemption. They're removed and double underbars ones are renamed to drop the underbars. As an extra argument is added, there's no danger of using the old version unconverted. disk_stat_lock() uses get_cpu() and returns the cpu index and all diskstat functions which access per-cpu counters now has @cpu argument to help RT. This change adds RCU or preemption operations at some places but also collapses several preemption ops into one at others. Overall, the performance difference should be negligible as all involved ops are very lightweight per-cpu ones. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
* Implement disk_devt() and part_devt() and use them to directly access devt instead of computing it from ->major and ->first_minor. Note that all references to ->major and ->first_minor outside of block layer is used to determine devt of the disk (the part0) and as ->major and ->first_minor will continue to represent devt for the disk, converting these users aren't strictly necessary. However, convert them for consistency. * Implement disk_max_parts() to avoid directly deferencing genhd->minors. * Update bdget_disk() such that it doesn't assume consecutive minor space. * Move devt computation from register_disk() to add_disk() and make it the only one (all other usages use the initially determined value). These changes clean up the code and will help disk->part dereference fix and extended block device numbers. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
raid5 can overflow with more than 255 stripes, and we can increase it to an int for free on both 32 and 64-bit archs due to the padding. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Mikulas Patocka 提交于
Remove hw_segments field from struct bio and struct request. Without virtual merge accounting they have no purpose. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 01 10月, 2008 3 次提交
-
-
由 Chandra Seetharaman 提交于
Moving the path activation to workqueue along with scsi_dh patches introduced a race. It is due to the fact that the current_pgpath (in the multipath data structure) can be modified if changes happen in any of the paths leading to the lun. If the changes lead to current_pgpath being set to NULL, then it leads to the invalid access which results in the panic below. This patch fixes that by storing the pgpath to activate in the multipath data structure and properly protecting it. Note that if activate_path is called twice in succession with different pgpath, with the second one being called before the first one is done, then activate path will be called twice for the second pgpath, which is fine. Unable to handle kernel paging request for data at address 0x00000020 Faulting instruction address: 0xd000000000aa1844 cpu 0x1: Vector: 300 (Data Access) at [c00000006b987a80] pc: d000000000aa1844: .activate_path+0x30/0x218 [dm_multipath] lr: c000000000087a2c: .run_workqueue+0x114/0x204 sp: c00000006b987d00 msr: 8000000000009032 dar: 20 dsisr: 40000000 current = 0xc0000000676bb3f0 paca = 0xc0000000006f3680 pid = 2528, comm = kmpath_handlerd enter ? for help [c00000006b987da0] c000000000087a2c .run_workqueue+0x114/0x204 [c00000006b987e40] c000000000088b58 .worker_thread+0x120/0x144 [c00000006b987f00] c00000000008ca70 .kthread+0x78/0xc4 [c00000006b987f90] c000000000027cc8 .kernel_thread+0x4c/0x68 Signed-off-by: NChandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
If for any reason dm_merge_bvec() is given an offset beyond the end of the device, avoid an oops and always allow one page to be added to an empty bio. We'll reject the I/O later after the bio is submitted. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Some callers assume they can always add at least one page to an empty bio, so dm_merge_bvec should not return 0 in this case: we'll reject the I/O later after the bio is submitted. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
- 19 9月, 2008 1 次提交
-
-
由 NeilBrown 提交于
When two md arrays share some block device (e.g each uses different partitions on the one device), a resync of one array will wait for the resync on the other to finish. This can be a long time and as it currently waits TASK_UNINTERRUPTIBLE, the softlockup code notices and complains. So use TASK_INTERRUPTIBLE instead and make sure to flush signals before calling schedule. Signed-off-by: NNeilBrown <neilb@suse.de>
-
- 01 9月, 2008 2 次提交
-
-
由 NeilBrown 提交于
A recent patch to protect the rdev list with rcu locking leaves us with a problem because we can sleep on memalloc while holding the rcu lock. The rcu lock is only needed while walking the linked list as uninteresting devices (failed or spares) can be removed at any time. So only take the rcu lock while actually walking the linked list. Take a refcount on the rdev during the time when we drop the lock and do the memalloc to start IO. When we return to the locked code, all the interesting devices on the list will not have moved, so we can simply use list_for_each_continue_rcu to pick up where we left off. Signed-off-by: NNeilBrown <neilb@suse.de>
-
由 NeilBrown 提交于
When stopping an md array, or just switching to read-only, we currently call invalidate_partition while holding the mddev lock. The main reason for this is probably to ensure all dirty buffers are flushed (invalidate_partition calls fsync_bdev). However if any dirty buffers are found, it will almost certainly cause a deadlock as starting writeout will require an update to the superblock, and performing that updates requires taking the mddev lock - which is already held. This deadlock can be demonstrated by running "reboot -f -n" with a root filesystem on md/raid, and some dirty buffers in memory. All other calls to stop an array should already happen after a flush. The normal sequence is to stop using the array (e.g. umount) which will cause __blkdev_put to call sync_blockdev. Then open the array and issue the STOP_ARRAY ioctl while the buffers are all still clean. So this invalidate_partition is normally a no-op, except for one case where it will cause a deadlock. So remove it. This patch possibly addresses the regression recored in http://bugzilla.kernel.org/show_bug.cgi?id=11460 and http://bugzilla.kernel.org/show_bug.cgi?id=11452 though it isn't yet clear how it ever worked. Signed-off-by: NNeilBrown <neilb@suse.de>
-
- 08 8月, 2008 1 次提交
-
-
由 Dan Williams 提交于
If a 'repair' is requested when an array is in a position to 'recover' raid1 will perform the repair while md believes a recovery is happening. Address this at both ends, i.e. cancel check/repair requests upon detecting a recover condition and do not call ->spare_active after completing a check/repair. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 05 8月, 2008 1 次提交
-
-
由 NeilBrown 提交于
The raid10 resync/recovery code currently limits the amount of in-flight resync IO to 2Meg. This was copied from raid1 where it seems quite adequate. However for raid10, some layouts require a bit of seeking to perform a resync, and allowing a larger buffer size means that the seeking can be significantly reduced. There is probably no real need to limit the amount of in-flight IO at all. Any shortage of memory will naturally reduce the amount of buffer space available down to a set minimum, and any concurrent normal IO will quickly cause resync IO to back off. The only problem would be that normal IO has to wait for all resync IO to finish, so a very large amount of resync IO could cause unpleasant latency when normal IO starts up. So: increase RESYNC_DEPTH to allow 32Meg of buffer (if memory is available) which seems to be a good amount. Also reduce the amount of memory reserved as there is no need to keep 2Meg just for resync if memory is tight. Thanks to Keld for the suggestion. Cc: Keld Jørn Simonsen <keld@dkuug.dk> Signed-off-by: NNeilBrown <neilb@suse.de>
-