- 09 10月, 2008 40 次提交
-
-
由 Andrew Patterson 提交于
We call flush_disk() to make sure the buffer cache for the disk is flushed after a disk resize. There are two resize cases, growing and shrinking. Given that users can shrink/then grow a disk before revalidate_disk() is called, we treat the grow case identically to shrinking. We need to flush the buffer cache after an online shrink because, as James Bottomley puts it, The two use cases for shrinking I can see are 1. planned: the fs is already shrunk to within the new boundaries and all data is relocated, so invalidate is fine (any dirty buffers that might exist in the shrunk region are there only because they were relocated but not yet written to their original location). 2. unplanned: In this case, the fs is probably toast, so whether we invalidate or not isn't going to make a whole lot of difference; it's still going to try to read or write from sectors beyond the new size and get I/O errors. Immediately invalidating shrunk disks will cause errors for outstanding I/Os for reads/write beyond the new end of the disk to be generated earlier then if we waited for the normal buffer cache operation. It also removes a potential security hole where we might keep old data around from beyond the end of the shrunk disk if the disk was not invalidated. Signed-off-by: NAndrew Patterson <andrew.patterson@hp.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Andrew Patterson 提交于
We need to be able to flush the buffer cache for for more than just when a disk is changed, so we factor out common cache flush code in check_disk_change() to an internal flush_disk() routine. This routine will then be used for both disk changes and disk resizes (in a later patch). Include the disk name in the text indicating that there are busy inodes on the device and increase the KERN severity of the message. Signed-off-by: NAndrew Patterson <andrew.patterson@hp.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Andrew Patterson 提交于
Modify the SCSI disk driver to call the revalidate_disk() wrapper. This allows us to do some housekeeping such as accounting for a disk being resized online. The wrapper will call sd_revalidate_disk() at the appropriate time. Signed-off-by: NAndrew Patterson <andrew.patterson@hp.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Andrew Patterson 提交于
Check for device resize in the rescan_partitions() routine. If the device has been resized, the bdev size is set to match. The rescan_partitions() routine is called when opening the device and when calling the BLKRRPART ioctl. Signed-off-by: NAndrew Patterson <andrew.patterson@hp.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Andrew Patterson 提交于
The revalidate_disk routine now checks if a disk has been resized by comparing the gendisk capacity to the bdev inode size. If they are different (usually because the disk has been resized underneath the kernel) the bdev inode size is adjusted to match the capacity. Signed-off-by: NAndrew Patterson <andrew.patterson@hp.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Andrew Patterson 提交于
This is a wrapper for the lower-level revalidate_disk call-backs such as sd_revalidate_disk(). It allows us to perform pre and post operations when calling them. We will use this wrapper in a later patch to adjust block device sizes after an online resize (a _post_ operation). Signed-off-by: NAndrew Patterson <andrew.patterson@hp.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
seqf can be started multiple times for a read and the header should be printed only for the initial one. Fix it. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
With the older SG interface, we don't know a user-space address to trasfer data when executing a SCSI command. So we can't pass a user-space address to blk_rq_map_user. This patch fixes sg to pass a NULL user-space address to blk_rq_map_user so that it just sets up a request and bios with page frames propely without data transfer. Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
This patch changes blk_rq_map_user to accept a NULL user-space buffer with a READ command if rq_map_data is not NULL. Thus a caller can pass page frames to lk_rq_map_user to just set up a request and bios with page frames propely. bio_uncopy_user (called via blk_rq_unmap_user) doesn't copy data to user space with such request. Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
It refers to functions that no longer exist after the IO completion changes. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
DEBUG_BLOCK_EXT_DEVT shuffles SCSI and IDE device numbers and root device number set using rdev become meaningless. Root devices should be explicitly specified using textual names. Warn about it if root can't be found and DEBUG_BLOCK_EXT_DEVT is enabled. Also, add warning to the help text. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
bdget_disk() and blk_lookup_devt() never cared whether the specified partition (or disk) is zero sized or not. I got confused while converting those not to depend on consecutive minor numbers in commit 5a6411b1178baf534aa9138052864dfa89d3eada and later when dev0 was added it broke callers which expected to get valid return for zero sized disk devices. So, they never needed nr_sects checks in the first place. Kill them. This problem was spotted and debugged by Bartlmoiej Zolnierkiewicz. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
It's a debug option that you would explicitly enable to test this feature, we should default it to 'n' to prevent accidental surprises for now. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Harvey Harrison 提交于
Noticed by sparse: block/blk-softirq.c:156:12: warning: symbol 'blk_softirq_init' was not declared. Should it be static? block/genhd.c:583:28: warning: function 'bdget_disk' with external linkage has definition block/genhd.c:659:17: warning: incorrect type in argument 1 (different base types) block/genhd.c:659:17: expected unsigned int [unsigned] [usertype] size block/genhd.c:659:17: got restricted gfp_t block/genhd.c:659:29: warning: incorrect type in argument 2 (different base types) block/genhd.c:659:29: expected restricted gfp_t [usertype] flags block/genhd.c:659:29: got unsigned int block: kmalloc args reversed Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Douglas Gilbert <dougg@torque.net> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
This adds blk_rq_aligned helper function to see if alignment and padding requirement is satisfied for DMA transfer. This also converts blk_rq_map_kern and __blk_rq_map_user to use the helper function. Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
bio_copy_kern and bio_copy_user are very similar. This converts bio_copy_kern to use bio_copy_user. Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
This patch converts the indirect IO path (including mmap IO and old struct sg_header) to use the block layer functions (blk_get_request, blk_execute_rq_nowait, blk_rq_map_user, etc) instead of scsi_execute_async(). [Jens: fixed compile error with SCSI logging enabled] Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NDouglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
This patch converts the direct IO path (SG_FLAG_DIRECT_IO) to use the block layer functions (blk_get_request, blk_execute_rq_nowait, blk_rq_map_user, etc) instead of scsi_execute_async(). Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NDouglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
This patch converts the non data path to use the block layer functions (blk_get_request, blk_execute_rq_nowait, etc) instead of uses scsi_execute_async(). Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NDouglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
This patch introduces struct rq_map_data to enable bio_copy_use_iov() use reserved pages. Currently, bio_copy_user_iov allocates bounce pages but drivers/scsi/sg.c wants to allocate pages by itself and use them. struct rq_map_data can be used to pass allocated pages to bio_copy_user_iov. The current users of bio_copy_user_iov simply passes NULL (they don't want to use pre-allocated pages). Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
Currently, blk_rq_map_user and blk_rq_map_user_iov always do GFP_KERNEL allocation. This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov so sg can use it (sg always does GFP_ATOMIC allocation). Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NDouglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Aaron Carroll 提交于
CFQ's detection of queueing devices assumes a non-queuing device and detects if the queue depth reaches a certain threshold. Under some workloads (e.g. synchronous reads), CFQ effectively forces a unit queue depth, thus defeating the detection logic. This leads to poor performance on queuing hardware, since the idle window remains enabled. This patch inverts the sense of the logic: assume a queuing-capable device, and detect if the depth does not exceed the threshold. Signed-off-by: NAaron Carroll <aaronc@gelato.unsw.edu.au> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
We should just check for rq->bio, as that is really the information we are looking for. Even if the bio attached doesn't carry data, we still need to do IO post processing on it. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
Somewhat incomplete, as we do allow merges of requests and bios that have different completion CPUs given. This is done on the assumption that a larger IO is still more beneficial than CPU locality. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
This patch adds support for controlling the IO completion CPU of either all requests on a queue, or on a per-request basis. We export a sysfs variable (rq_affinity) which, if set, migrates completions of requests to the CPU that originally submitted it. A bio helper (bio_set_completion_cpu()) is also added, so that queuers can ask for completion on that specific CPU. In testing, this has been show to cut the system time by as much as 20-40% on synthetic workloads where CPU affinity is desired. This requires a little help from the architecture, so it'll only work as designed for archs that are using the new generic smp helper infrastructure. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
Preparatory patch for checking queuing affinity. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Now that disk and partition handlings are mostly unified, it's easy to allow disk to have extended device number. This patch makes add_disk() use extended device number if disk->minors is zero. Both sd and ide-disk are updated to use this. * sd_format_disk_name() is implemented which can generically determine the drive name. This removes disk number restriction stemming from limited device names. * If sd index goes over SD_MAX_DISKS (which can be increased now BTW), sd simply doesn't initialize minors letting block layer choose extended device number. * If CONFIG_DEBUG_EXT_DEVT is set, both sd and ide-disk always set minors to 0 and use extended device numbers. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
With previous changes, it's meaningless to limit the number of partitions. Replace @ext_minors with GENHD_FL_EXT_DEVT such that setting the flag allows the disk to have maximum number of allowed partitions (only limited by the number of entries in parsed_partitions as determined by MAX_PART constant). This kills not-too-pretty alloc_disk_ext[_node]() functions and makes @minors parameter to alloc_disk[_node]() unnecessary. The parameter is left alone to avoid disturbing the users. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
disk->__part used to be statically allocated to the maximum possible number of partitions. This patch makes partition array allocation dynamic. The added overhead is minimal as only real change is one memory dereference changed to RCU one. This saves both a bit of memory and cpu cycles iterating through unoccupied slots and makes increasing partition limit easier. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Move stats related fields - stamp, in_flight, dkstats - from disk to part0 and unify stat handling such that... * part_stat_*() now updates part0 together if the specified partition is not part0. ie. part_stat_*() are now essentially all_stat_*(). * {disk|all}_stat_*() are gone. * part_round_stats() is updated similary. It handles part0 stats automatically and disk_round_stats() is killed. * part_{inc|dec}_in_fligh() is implemented which automatically updates part0 stats for parts other than part0. * disk_map_sector_rcu() is updated to return part0 if no part matches. Combined with the above changes, this makes NULL special case handling in callers unnecessary. * Separate stats show code paths for disk are collapsed into part stats show code paths. * Rename disk_stat_lock/unlock() to part_stat_lock/unlock() While at it, reposition stat handling macros a bit and add missing parentheses around macro parameters. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
GENHD_FL_FAIL for disk is what make_it_fail is for parts. Kill it and use part0->make_it_fail. Sysfs node handling is unified too. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Till now, bdev->bd_part is set only if the bdev was for parts other than part0. This patch makes bdev->bd_part always set so that code paths don't have to differenciate common handling. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Move disk->holder_dir to part0->holder_dir. Kill now mostly superflous bdev_get_holder(). While at it, kill superflous kobject_get/put() around holder_dir, slave_dir and cmd_filter creation and collapse disk_sysfs_add_subdirs() into register_disk(). These serve no purpose but obfuscating the code. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Move disk->policy to part0->policy. Implement and use get_disk_ro(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Now that capacity and __dev are moved to part0, part0 and others can share the same method. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Move disk->__dev to part0->__dev. This simplifies bdget_disk() and lookup_devt() and allows common sysfs attributes to be unified. part_to_disk() is updated to handle part0 -> disk. Updated to include a fix from Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>, he writes: "part0 is a "special" partition and doesn't need to have capacity set - this fixes regression caused by "block: move __dev from disk to part0" commit." Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Move disk->capacity to part0->nr_sects and convert all users who directly accessed the field to use {get|set}_capacity(). This is done early to allow the __dev field to be moved. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-