- 12 8月, 2010 33 次提交
-
-
由 Mike Snitzer 提交于
Enable discard support in the DM multipath target. This discard support depends on a few discard-specific fixes to the block layer's request stacking driver methods. Discard requests are optional so don't allow a failed discard to trigger path failures. If there is a real problem with a given path the barriers associated with the discard (either before or after the discard) will cause path failure. That said, unconditionally passing discard failures up the stack is not ideal. This must be fixed once DM has more information about the nature of the underlying storage failure. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com> Cc: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
-
由 Mikulas Patocka 提交于
The DM core will submit a discard bio to the stripe target for each stripe in a striped DM device. The stripe target will determine stripe-specific portions of the supplied bio to be remapped into individual (at most 'num_discard_requests' extents). If a given stripe-specific discard bio doesn't touch a particular stripe the bio will be dropped. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Update __clone_and_map_discard to loop across all targets in a DM device's table when it processes a discard bio. If a discard crosses a target boundary it must be split accordingly. Update __issue_target_requests and __issue_target_request to allow a cloned discard bio to have a custom start sector and size. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Optimize sector division: If the number of stripes is a power of two, we can do shift and mask instead of division. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Move sector to stripe translation into a function. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Have the error target respond to a discard request with a hard -EIO rather than fail the request with -EOPNOTSUPP. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Enable discard support for the delay target. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Have the zero target silently drop a discard rather than fail the request with -EOPNOTSUPP. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Alasdair G Kergon 提交于
Use new dm_target_offset() macro to avoid most references to ti->begin in dm targets. Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Split max_io_len_target_boundary out of max_io_len so that the discard support can make use of it without duplicating max_io_len code. Avoiding max_io_len's split_io logic enables DM's discard support to submit the entire discard request to a target. But discards must still be split on target boundaries. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Rename __flush_target to __issue_target_request now that it is used to issue both flush and discard requests. Introduce __issue_target_requests as a convenient wrapper to __issue_target_request 'num_flush_requests' or 'num_discard_requests' times per target. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Allow discards to be passed through to linear mappings if at least one underlying device supports it. Discards will be forwarded only to devices that support them. A target that supports discards should set num_discard_requests to indicate how many times each discard request must be submitted to it. Verify table's underlying devices support discards prior to setting the associated DM device as capable of discards (via QUEUE_FLAG_DISCARD). Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Reviewed-by: NJoe Thornber <thornber@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Allocate cipher strings indpendently of struct crypt_config and move cipher parsing and allocation into a separate function to prepare for supporting the cryptoapi format e.g. "xts(aes)". No functional change in this patch. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Milan Broz 提交于
Use just one label and reuse common destructor for crypt target. Parse remaining argv arguments in logic order. Also do not ignore error values from IV init and set key functions. No functional change in this patch except changed return codes based on above. Signed-off-by: NMilan Broz <mbroz@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Peter Rajnoha 提交于
Add devname:mapper/control and MAPPER_CTRL_MINOR module alias to support dm-mod module autoloading. Signed-off-by: NKay Sievers <kay.sievers@vrfy.org> Signed-off-by: NPeter Rajnoha <prajnoha@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
'target_request_nr' is a more generic name that reflects the fact that it will be used for both flush and discard support. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Will Drewry 提交于
This change unifies the various checks and finalization that occurs on a table prior to use. By doing so, it allows table construction without traversing the dm-ioctl interface. Signed-off-by: NWill Drewry <wad@chromium.org> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Implement merge method for the snapshot origin to improve read performance. Without merge method, dm asks the upper layers to submit smallest possible bios --- one page. Submitting such small bios impacts performance negatively when reading or writing the origin device. Without this patch, CPU consumption when reading the origin on lvm on md-raid0 was 6 to 12%, with this patch, it drops to 1 to 4%. Note: in my testing, it actually degraded performance in some settings, I traced it to Maxtor disks having problems with > 512-sector requests. Reducing the number of sectors to /sys/block/sd*/queue/max_sectors_kb to 256 fixed the read performance. I think we don't have to care about weird disks that actually degrade performance because of large requests being sent to them. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Change bio-based mapped devices no longer to have a fully initialized request_queue (request_fn, elevator, etc). This means bio-based DM devices no longer register elevator sysfs attributes ('iosched/' tree or 'scheduler' other than "none"). In contrast, a request-based DM device will continue to have a full request_queue and will register elevator sysfs attributes. Therefore a user can determine a DM device's type by checking if elevator sysfs attributes exist. First allocate a minimalist request_queue structure for a DM device (needed for both bio and request-based DM). Initialization of a full request_queue is deferred until it is known that the DM device is request-based, at the end of the table load sequence. Factor DM device's request_queue initialization: - common to both request-based and bio-based into dm_init_md_queue(). - specific to request-based into dm_init_request_based_queue(). The md->type_lock mutex is used to protect md->queue, in addition to md->type, during table_load(). A DM device's first table_load will establish the immutable md->type. But md->queue initialization, based on md->type, may fail at that time (because blk_init_allocated_queue cannot allocate memory). Therefore any subsequent table_load must (re)try dm_setup_md_queue independently of establishing md->type. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Acked-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mike Snitzer 提交于
Determine whether a mapped device is bio-based or request-based when loading its first (inactive) table and don't allow that to be changed later. This patch performs different device initialisation in each of the two cases. (We don't think it's necessary to add code to support changing between the two types.) Allowed md->type transitions: DM_TYPE_NONE to DM_TYPE_BIO_BASED DM_TYPE_NONE to DM_TYPE_REQUEST_BASED We now prevent table_load from replacing the inactive table with a conflicting type of table even after an explicit table_clear. Introduce 'type_lock' into the struct mapped_device to protect md->type and to prepare for the next patch that will change the queue initialization and allocate memory while md->type_lock is held. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Acked-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com> drivers/md/dm-ioctl.c | 15 +++++++++++++++ drivers/md/dm.c | 37 ++++++++++++++++++++++++++++++------- drivers/md/dm.h | 5 +++++ include/linux/dm-ioctl.h | 4 ++-- 4 files changed, 52 insertions(+), 9 deletions(-)
-
由 Mikulas Patocka 提交于
When processing barriers, skip the second flush if processing the bio failed with -EOPNOTSUPP. This can happen with discard+barrier requests. If the device doesn't support discard, there would be two useless SYNCHRONIZE CACHE commands. The first dm_flush cannot be so easily optimized out, so we leave it there. Previously, -EOPNOTSUPP could be received in dec_pending only with empty barriers and we ignored that error, assuming the device not supporting cache flushes has cache always consistent. With the addition of discard barriers, this -EOPNOTSUPP can also be generated by discards and we must record it in md->barrier_error for process_barrier. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Tomohiro Kusumi 提交于
This patch fixes hard-coded value for the size of a chunk that includes disk header for persistent snapshot. It should be changed to existing macro NUM_SNAPSHOT_HDR_CHUNKS instead of using hard-coded value 1. Signed-off-by: NTomohiro Kusumi <kusumi.tomohiro@jp.fujitsu.com> Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Julia Lawall 提交于
Use kstrdup when the goal of an allocation is copy a string into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to; expression flag,E1,E2; statement S; @@ - to = kmalloc(strlen(from) + 1,flag); + to = kstrdup(from, flag); ... when != \(from = E1 \| to = E1 \) if (to==NULL || ...) S ... when != \(from = E2 \| to = E2 \) - strcpy(to, from); // </smpl> Signed-off-by: NJulia Lawall <julia@diku.dk> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Arnd Bergmann 提交于
The dm control device does not implement read/write, so it has no use for seeking. Using no_llseek prevents falling back to default_llseek, which requires the BKL. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Kiyoshi Ueda 提交于
This patch separates the device deletion code from dm_put() to make sure the deletion happens in the process context. By this patch, device deletion always occurs in an ioctl (process) context and dm_put() can be called in interrupt context. As a result, the request-based dm's bad dm_put() usage pointed out by Mikulas below disappears. http://marc.info/?l=dm-devel&m=126699981019735&w=2 Without this patch, I confirmed there is a case to crash the system: dm_put() => dm_table_destroy() => vfree() => BUG_ON(in_interrupt()) Some more backgrounds and details: In request-based dm, a device opener can remove a mapped_device while the last request is still completing, because bios in the last request complete first and then the device opener can close and remove the mapped_device before the last request completes: CPU0 CPU1 ================================================================= <<INTERRUPT>> blk_end_request_all(clone_rq) blk_update_request(clone_rq) bio_endio(clone_bio) == end_clone_bio blk_update_request(orig_rq) bio_endio(orig_bio) <<I/O completed>> dm_blk_close() dev_remove() dm_put(md) <<Free md>> blk_finish_request(clone_rq) .... dm_end_request(clone_rq) free_rq_clone(clone_rq) blk_end_request_all(orig_rq) rq_completed(md) So request-based dm used dm_get()/dm_put() to hold md for each I/O until its request completion handling is fully done. However, the final dm_put() can call the device deletion code which must not be run in interrupt context and may cause kernel panic. To solve the problem, this patch moves the device deletion code, dm_destroy(), to predetermined places that is actually deleting the mapped_device in ioctl (process) context, and changes dm_put() just to decrement the reference count of the mapped_device. By this change, dm_put() can be used in any context and the symmetric model below is introduced: dm_create(): create a mapped_device dm_destroy(): destroy a mapped_device dm_get(): increment the reference count of a mapped_device dm_put(): decrement the reference count of a mapped_device dm_destroy() waits for all references of the mapped_device to disappear, then deletes the mapped_device. dm_destroy() uses active waiting with msleep(1), since deleting the mapped_device isn't performance-critical task. And since at this point, nobody opens the mapped_device and no new reference will be taken, the pending counts are just for racing completing activity and will eventually decrease to zero. For the unlikely case of the forced module unload, dm_destroy_immediate(), which doesn't wait and forcibly deletes the mapped_device, is also introduced and used in dm_hash_remove_all(). Otherwise, "rmmod -f" may be stuck and never return. And now, because the mapped_device is deleted at this point, subsequent accesses to the mapped_device may cause NULL pointer references. Cc: stable@kernel.org Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Kiyoshi Ueda 提交于
This patch changes dm_hash_remove_all() to release _hash_lock when removing a device. After removing the device, dm_hash_remove_all() takes _hash_lock and searches the hash from scratch again. This patch is a preparation for the next patch, which changes device deletion code to wait for md reference to be 0. Without this patch, the wait in the next patch may cause AB-BA deadlock: CPU0 CPU1 ----------------------------------------------------------------------- dm_hash_remove_all() down_write(_hash_lock) table_status() md = find_device() dm_get(md) <increment md->holders> dm_get_live_or_inactive_table() dm_get_inactive_table() down_write(_hash_lock) <in the md deletion code> <wait for md->holders to be 0> Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@kernel.org Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Kiyoshi Ueda 提交于
This patch prevents access to mapped_device which is being deleted. Currently, even after a mapped_device has been removed from the hash, it could be accessed through idr_find() using minor number. That could cause a race and NULL pointer reference below: CPU0 CPU1 ------------------------------------------------------------------ dev_remove(param) down_write(_hash_lock) dm_lock_for_deletion(md) spin_lock(_minor_lock) set_bit(DMF_DELETING) spin_unlock(_minor_lock) __hash_remove(hc) up_write(_hash_lock) dev_status(param) md = find_device(param) down_read(_hash_lock) __find_device_hash_cell(param) dm_get_md(param->dev) md = dm_find_md(dev) spin_lock(_minor_lock) md = idr_find(MINOR(dev)) spin_unlock(_minor_lock) dm_put(md) free_dev(md) dm_get(md) up_read(_hash_lock) __dev_status(md, param) dm_put(md) This patch fixes such problems. Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@kernel.org Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Peter Rajnoha 提交于
All the dm ioctls that generate uevents set the DM_UEVENT_GENERATED flag so that userspace knows whether or not to wait for a uevent to be processed before continuing, The dm rename ioctl sets this flag but was not structured to return it to userspace. This patch restructures the rename ioctl processing to behave like the other ioctls that return data and so fix this. Signed-off-by: NPeter Rajnoha <prajnoha@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Alasdair G Kergon 提交于
__dev_status() cannot fail so make it void and simplify callers. Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Peter Rajnoha 提交于
Remove useless __dev_status call while processing an ioctl that sets up device geometry and target message. The data is not returned to userspace so there is no point collecting it and in the case of target_message it is collected before processing the message so if it did return it might be stale. Signed-off-by: NPeter Rajnoha <prajnoha@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Validate chunk size against both origin and snapshot sector size Don't allow chunk size smaller than either origin or snapshot logical sector size. Reading or writing data not aligned to sector size is not allowed and causes immediate errors. This requires us to open the origin before initialising the exception store and to export dm_snap_origin. Cc: stable@kernel.org Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Mikulas Patocka 提交于
Iterate both origin and snapshot devices iterate_devices method should call the callback for all the devices where the bio may be remapped. Thus, snapshot_iterate_devices should call the callback for both snapshot and origin underlying devices because it remaps some bios to the snapshot and some to the origin. snapshot_iterate_devices called the callback only for the origin device. This led to badly calculated device limits if snapshot and origin were placed on different types of disks. Cc: stable@kernel.org Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Alasdair G Kergon 提交于
multipath_ctr() forgets to return an error after detecting missing path parameters. Fix this. Signed-off-by: NPatrick LoPresti <lopresti@gmail.com> Cc: stable@kernel.org Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
- 08 8月, 2010 7 次提交
-
-
由 NeilBrown 提交于
There is only one error exit from do_md_stop, so make that more explicit and discard the 'err' variable. Also drop the 'revalidate' variable by moving the unlock calls around. Signed-off-by: NNeilBrown <neilb@suse.de>
-
由 NeilBrown 提交于
Move the deletion of sysfs attributes from reconfig_mutex to open_mutex didn't really help as a process can try to take open_mutex while holding reconfig_mutex, so the same deadlock can happen, just requiring one more process to be involved in the chain. I looks like I cannot easily use locking to wait for the sysfs deletion to complete, so don't. The only things that we cannot do while the deletions are still pending is other things which can change the sysfs namespace: run, takeover, stop. Each of these can fail with -EBUSY. So set a flag while doing a sysfs deletion, and fail run, takeover, stop if that flag is set. This is suitable for 2.6.35.x Cc: stable@kernel.org Signed-off-by: NNeilBrown <neilb@suse.de>
-
由 Dan Williams 提交于
Commit b821eaa5 "md: remove ->changed and related code" moved revalidate_disk() under open_mutex, and lockdep noticed. [ INFO: possible circular locking dependency detected ] 2.6.32-mdadm-locking #1 ------------------------------------------------------- mdadm/3640 is trying to acquire lock: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff811acecb>] revalidate_disk+0x5b/0x90 but task is already holding lock: (&mddev->open_mutex){+.+...}, at: [<ffffffffa055e07a>] do_md_stop+0x4a/0x4d0 [md_mod] which lock already depends on the new lock. It is suitable for 2.6.35.x Cc: <stable@kernel.org> Reported-by: NPrzemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NNeilBrown <neilb@suse.de>
-
由 Arnd Bergmann 提交于
The open and release block_device_operations are currently called with the BKL held. In order to change that, we must first make sure that all drivers that currently rely on this have no regressions. This blindly pushes the BKL into all .open and .release operations for all block drivers to prepare for the next step. The drivers can subsequently replace the BKL with their own locks or remove it completely when it can be shown that it is not needed. The functions blkdev_get and blkdev_put are the only remaining users of the big kernel lock in the block layer, besides a few uses in the ioctl code, none of which need to serialize with blkdev_{get,put}. Most of these two functions is also under the protection of bdev->bd_mutex, including the actual calls to ->open and ->release, and the common code does not access any global data structures that need the BKL. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 FUJITA Tomonori 提交于
This removes q->prepare_flush_fn completely (changes the blk_queue_ordered API). Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 FUJITA Tomonori 提交于
use REQ_FLUSH flag instead. Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Alasdair G Kergon <agk@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Christoph Hellwig 提交于
Remove the current bio flags and reuse the request flags for the bio, too. This allows to more easily trace the type of I/O from the filesystem down to the block driver. There were two flags in the bio that were missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've renamed two request flags that had a superflous RW in them. Note that the flags are in bio.h despite having the REQ_ name - as blkdev.h includes bio.h that is the only way to go for now. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-