- 11 Sep, 2009 1 commit
-
-
Committed by Tejun Heo
bio and request use the same set of failfast bits. This patch makes the following changes to simplify things.
* enumify BIO_RW* bits and reorder bits such that BIO_RW_FAILFAST_* bits coincide with __REQ_FAILFAST_* bits.
* The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV but the matching is useless anyway. init_request_from_bio() is responsible for setting FAILFAST bits on FS requests and non-FS requests never use BIO_RW_AHEAD. Drop the code and comment from blk_rq_bio_prep().
* Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and simplify FAILFAST flags handling in init_request_from_bio().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
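The benefit of making the bio and request failfast bits coincide can be sketched in a few lines of userspace C. This is only an illustration: the `DEMO_*` names and the bit positions are hypothetical and do not match the kernel's actual enums. It shows how a single mask that ORs all failfast bits lets the flags be copied in one step.

```c
#include <stdint.h>

/* Hypothetical bit indices, chosen for illustration only. */
enum demo_bio_rw_bits {
    DEMO_BIO_RW = 0,
    DEMO_BIO_RW_FAILFAST_DEV = 1,
    DEMO_BIO_RW_FAILFAST_TRANSPORT = 2,
    DEMO_BIO_RW_FAILFAST_DRIVER = 3,
    DEMO_BIO_RW_AHEAD = 4,
};

/* Request-side flags deliberately use the same bit positions. */
#define DEMO_REQ_FAILFAST_DEV       (1u << 1)
#define DEMO_REQ_FAILFAST_TRANSPORT (1u << 2)
#define DEMO_REQ_FAILFAST_DRIVER    (1u << 3)

/* OR of all failfast bits, analogous in spirit to REQ_FAILFAST_MASK. */
#define DEMO_REQ_FAILFAST_MASK \
    (DEMO_REQ_FAILFAST_DEV | DEMO_REQ_FAILFAST_TRANSPORT | DEMO_REQ_FAILFAST_DRIVER)

/* Replace the failfast bits of req_flags with those found in bio_rw.
 * Because the positions coincide, no per-bit translation is needed. */
uint32_t demo_init_failfast(uint32_t req_flags, uint32_t bio_rw)
{
    return (req_flags & ~DEMO_REQ_FAILFAST_MASK) |
           (bio_rw & DEMO_REQ_FAILFAST_MASK);
}
```

Bits outside the mask, such as the readahead bit in this sketch, are not carried over, which mirrors the point that matching BIO_RW_AHEAD against a request flag is unnecessary.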
-
- 02 Sep, 2009 1 commit
-
-
Committed by Nikanth Karthikesan
The patch "block: Use accessor functions for queue limits" (ae03bf63) changed queue_max_sectors_store() to use blk_queue_max_sectors() instead of directly assigning the value. But blk_queue_max_sectors() differs a bit:
1. It sets both max_sectors_kb and max_hw_sectors_kb.
2. It never allows one to change max_sectors_kb above BLK_DEF_MAX_SECTORS. If one specifies a greater value, max_hw_sectors is set to that value but max_sectors is set to BLK_DEF_MAX_SECTORS.
I am not sure whether blk_queue_max_sectors() should be changed, as it seems to have been that way for a long time, and there may be callers dependent on that behaviour. This patch simply reverts to the older way of directly assigning the value to max_sectors, as it was before.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
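The difference between the accessor and a direct assignment, as described in the message above, can be modelled roughly as follows. This is a simplified userspace sketch: `struct demo_limits`, the `demo_*` functions, and the value chosen for `DEMO_BLK_DEF_MAX_SECTORS` are assumptions made for illustration, not the kernel's code.

```c
/* Illustrative cap; stands in for BLK_DEF_MAX_SECTORS. */
#define DEMO_BLK_DEF_MAX_SECTORS 1024u

/* Hypothetical, reduced version of the queue limits. */
struct demo_limits {
    unsigned int max_sectors;
    unsigned int max_hw_sectors;
};

/* Models the accessor behaviour described in the commit message:
 * it touches both fields and clamps max_sectors to the default cap. */
void demo_blk_queue_max_sectors(struct demo_limits *lim, unsigned int max_sectors)
{
    if (max_sectors > DEMO_BLK_DEF_MAX_SECTORS) {
        lim->max_hw_sectors = max_sectors;
        lim->max_sectors = DEMO_BLK_DEF_MAX_SECTORS;
    } else {
        lim->max_hw_sectors = max_sectors;
        lim->max_sectors = max_sectors;
    }
}

/* Models the restored sysfs store path: only max_sectors changes. */
void demo_store_direct(struct demo_limits *lim, unsigned int max_sectors)
{
    lim->max_sectors = max_sectors;
}
```

With limits of {512, 2048} and a requested value of 2000, the accessor model ends at {1024, 2000} while the direct store ends at {2000, 2048}, which is the behavioural gap the revert addresses.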
-
- 05 Aug, 2009 1 commit
-
-
Committed by John Stoffel
Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL dependency, since udev depends on BSG
Make Block Layer SG support v4 the default, since recent udev versions depend on this to access serial numbers and other low level info properly. This should be backported to older kernels as well, since most distros have enabled this for a long time.
Signed-off-by: John Stoffel <john@stoffel.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 01 Aug, 2009 4 commits
-
-
Committed by Martin K. Petersen
Update topology comments and sysfs documentation based upon discussions with Neil Brown.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Martin K. Petersen
When stacking block devices ensure that optimal I/O size is scaled accordingly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Martin K. Petersen
Introduce blk_limits_io_min() and make blk_queue_io_min() call it.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Martin K. Petersen
blk_queue_stack_limits() has been superseded by blk_stack_limits() and disk_stack_limits(). Wrap the function call for now; we'll deprecate it later.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 29 Jul, 2009 1 commit
-
-
Committed by Jens Axboe
Prior to the change for more sane end_io functions, we exported the helpers with the normal EXPORT_SYMBOL(). That got changed to _GPL() for the new interface. Revert that particular change, on the basis that this is basic functionality and doesn't dip into internal structures. If these exports can't be non-GPL, then we may as well make EXPORT_SYMBOL() imply GPL for everything.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 28 Jul, 2009 2 commits
-
-
Committed by Xiaotian Feng
blk_integrity_unregister should use kobject_put to release the kobject; otherwise, after bi is freed, the memory of bi->kobj->name is leaked.
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Move the assignment of a default lock below blk_init_queue() to blk_queue_make_request(), so we also get to set the default lock for ->make_request_fn() based drivers. This is important since the queue flag locking requires a lock to be in place.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 17 Jul, 2009 2 commits
-
-
Committed by Xiaotian Feng
In blk-sysfs.c, queue_var_store uses unsigned long to store data, but queue_var_show uses unsigned int to show data. This causes:
# echo 70000000000 > /sys/block/<dev>/queue/read_ahead_kb
# cat /sys/block/<dev>/queue/read_ahead_kb
=> get wrong value
Fix it by using unsigned long. While at it, convert queue_rq_affinity_show() such that it uses a bool variable instead of explicit != 0 testing.
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
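The wrong value comes from narrowing a wide stored value on the way out. A minimal sketch, assuming a 64-bit `unsigned long` (modelled here with `uint64_t`) and a 32-bit `unsigned int` (modelled with `uint32_t`); the `demo_*` helpers are hypothetical and ignore the unit conversion the real sysfs attribute performs:

```c
#include <stdint.h>

/* Buggy pattern: the stored 64-bit value is squeezed through a
 * 32-bit type before being printed, so the high bits are lost. */
uint64_t demo_show_narrow(uint64_t stored)
{
    uint32_t shown = (uint32_t)stored;  /* truncates modulo 2^32 */
    return shown;
}

/* Fixed pattern: show the value with the same width it was stored in. */
uint64_t demo_show_wide(uint64_t stored)
{
    return stored;
}
```

For the value 70000000000 used in the report, the narrow path yields 1280523264 (70000000000 modulo 2^32), while the wide path returns the value unchanged.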
-
Committed by Tejun Heo
Commit ab0fd1de tries to prevent merging of requests with different failfast settings. In elv_rq_merge_ok(), it compares the new bio's failfast flags against the merge target request's. However, the flag testing accessors for bio and blk don't return a boolean but the tested bit value directly, and FAILFAST on bio and blk don't match, so directly comparing them with == results in a false negative, unnecessarily preventing merging of readahead requests. This patch converts the results to boolean by negating them before comparison.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jeff Garzik <jeff@garzik.org>
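The false negative described above is easy to reproduce in isolation. In this sketch the accessor names and bit positions are hypothetical; the only property that matters is that the two flags live at different bit positions and the accessors return the masked bit rather than 0 or 1.

```c
/* Hypothetical flag bits at different positions on each side. */
#define DEMO_BIO_FAILFAST (1u << 3)
#define DEMO_REQ_FAILFAST (1u << 5)

/* Accessors that return the raw tested bit, not a boolean. */
unsigned int demo_bio_failfast(unsigned int bio_flags)
{
    return bio_flags & DEMO_BIO_FAILFAST;
}

unsigned int demo_rq_failfast(unsigned int rq_flags)
{
    return rq_flags & DEMO_REQ_FAILFAST;
}

/* Buggy check: 8 == 32 is false even though both sides are failfast. */
int demo_merge_ok_buggy(unsigned int bio_flags, unsigned int rq_flags)
{
    return demo_bio_failfast(bio_flags) == demo_rq_failfast(rq_flags);
}

/* Fixed check: negation collapses each side to 0 or 1 before comparing. */
int demo_merge_ok_fixed(unsigned int bio_flags, unsigned int rq_flags)
{
    return !demo_bio_failfast(bio_flags) == !demo_rq_failfast(rq_flags);
}
```

When both sides have their failfast bit set, the buggy check rejects the merge while the fixed check allows it; when only one side is failfast, both reject it.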
-
- 11 Jul, 2009 2 commits
-
-
Committed by Vivek Goyal
In case memory is scarce, we now default to oom_cfqq. Once memory is available again, we should allocate a new cfqq and stop using oom_cfqq for a particular io context. Once a new request comes in, check if we are using oom_cfqq, and if yes, try to allocate a new cfqq. Tested the patch by forcing the use of oom_cfqq; upon the next request, the thread realized that it was using oom_cfqq and allocated a new cfqq.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by FUJITA Tomonori
Currently, blk_scsi_ioctl_init() is not called since it lacks an initcall marking. This causes the command table to be uninitialized, hence some commands are blocked when they should not have been. This fixes a regression introduced by commit 018e0446.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 04 Jul, 2009 1 commit
-
-
Committed by Tejun Heo
Block layer used to merge requests and bios with different failfast settings. This caused regular IOs to fail prematurely when they were merged into failfast requests for readahead.
Niel Lambrechts could trigger the problem semi-reliably on ext4 when resuming from STR. ext4 uses readahead when reading inodes and, combined with the deterministic extra SATA PHY exception cycle during resume on the specific configuration, a non-readahead inode read would fail, causing ext4 errors. Please read the following thread for details.
http://lkml.org/lkml/2009/5/23/21
This patch makes block layer reject merging if the failfast settings don't match. This is correct but likely to lower IO performance by preventing regular IOs from mingling into surrounding readahead requests. Changes to allow such mixed merges and handle errors correctly will be added later.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Niel Lambrechts <niel.lambrechts@gmail.com>
Cc: Theodore Tso <tytso@mit.edu>
Signed-off-by: Jens Axboe <axboe@carl.(none)>
-
- 01 Jul, 2009 6 commits
-
-
Committed by Shan Wei
With the changes for falling back to an oom_cfqq, we never fail to find/allocate a queue in cfq_get_queue(). So remove the check.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by NeilBrown
The next_ordered flag is only meaningful for devices that use __make_request. So move the test against next_ordered out of generic code and into __make_request.
Since this test was added, barriers have not worked on md or any devices that don't use __make_request and so don't bother to set next_ordered. (dm explicitly sets something other than QUEUE_ORDERED_NONE since commit 99360b4c but notes in the comments that it is otherwise meaningless).
Cc: Ken Milmore <ken.milmore@googlemail.com>
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
The initial patches to support this through sysfs export were broken and have been #if 0'ed out in any release. So let's just kill the code and reclaim some space in struct request_queue. If anyone would later like to fix up the sysfs bits, the git history can easily restore the removed bits.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Martin K. Petersen
This patch restores stacking ability to the block layer integrity infrastructure by creating a set of dedicated bip slabs. Each bip slab has an embedded bio_vec array at the end. This cuts down on memory allocations and also simplifies the code compared to the original bvec version. Only the largest bip slab is backed by a mempool. The pool is contained in the bio_set so stacking drivers can ensure forward progress.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.(none)>
-
Committed by Jens Axboe
Set up an emergency fallback cfqq that we allocate at IO scheduler init time. If the slab allocation fails in cfq_find_alloc_queue(), we'll just punt IO to that cfqq instead. This ensures that cfq_find_alloc_queue() never fails without having to ensure free memory.
On cfqq lookup, always try to allocate a new cfqq if the given cfq io context has the oom_cfqq assigned. This ensures that we only temporarily punt to this shared queue.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
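The fallback scheme described above follows a general pattern: embed one preallocated object, hand it out when allocation fails, and retry allocation on every later lookup that still sees the fallback. A minimal userspace sketch of that pattern follows; all `demo_*` names and the `fail_alloc` switch are hypothetical, and none of this is CFQ's actual code.

```c
#include <stdbool.h>
#include <stdlib.h>

struct demo_queue { int id; };

/* The scheduler owns one emergency queue, set up at init time. */
struct demo_sched {
    struct demo_queue oom_queue;
    bool fail_alloc;  /* test hook that simulates memory pressure */
};

/* Per-context state holding the queue currently in use. */
struct demo_ctx { struct demo_queue *queue; };

static struct demo_queue *demo_alloc(struct demo_sched *s)
{
    if (s->fail_alloc)
        return NULL;
    return calloc(1, sizeof(struct demo_queue));
}

/* Never returns NULL: falls back to the shared queue when allocation
 * fails, and retries whenever the context still uses the fallback. */
struct demo_queue *demo_get_queue(struct demo_sched *s, struct demo_ctx *ctx)
{
    if (ctx->queue == NULL || ctx->queue == &s->oom_queue) {
        struct demo_queue *q = demo_alloc(s);
        ctx->queue = q ? q : &s->oom_queue;
    }
    return ctx->queue;
}
```

Because the lookup cannot fail, callers no longer need a NULL check, and the shared queue is only used for as long as allocation keeps failing.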
-
Committed by Jens Axboe
We're going to be needing that init code outside of that function to get rid of the __GFP_NOFAIL in cfqq allocation.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 21 Jun, 2009 1 commit
-
-
Committed by FUJITA Tomonori
The SMP handler (sas_smp_request) was fixed to use the block API properly, so we don't need this workaround to avoid the blk_put_request() warning.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
- 19 Jun, 2009 2 commits
-
-
Committed by Randy Dunlap
Warning(block/blk-settings.c:108): No description found for parameter 'lim'
Warning(block/blk-settings.c:108): Excess function parameter 'limits' description in 'blk_set_default_limits'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Follow-up to "block: enable by default support for large devices and files on 32-bit archs". Rename CONFIG_LBD to CONFIG_LBDAF to:
- allow update of existing [def]configs for "default y" change
- reflect that it is used also for large files support nowadays
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 18 Jun, 2009 1 commit
-
-
Committed by Martin K. Petersen
Correct stacking bounce_pfn limit setting and prevent warnings on 32-bit.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 16 Jun, 2009 7 commits
-
-
Committed by Li Zefan
When porting blktrace to tracepoints, we changed to trace/block.h for trace prober declarations.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Martin K. Petersen
DM reuses the request queue when swapping in a new device table. Introduce blk_set_default_limits() which can be used to reset the queue_limits prior to stacking devices.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jeff Moyer
I noticed a blank line in blktrace output. This patch fixes that.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Move the defaults to where we do the init of the backing_dev_info.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Gui Jianfeng
Actually, last_end_request in cfq_data isn't used now. So let's just remove it.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Kay Sievers
This adds support to the BSG driver to report the proper device name to userspace for the bsg devices.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Committed by Kay Sievers
This adds support for block drivers to report their requested nodename to userspace. It also updates a number of block drivers to provide the needed subdirectory and device name to be used for them.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
- 12 Jun, 2009 1 commit
-
-
Committed by Randy Dunlap
Fix kernel-doc warnings in recently changed block/ source code.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 11 Jun, 2009 2 commits
-
-
Committed by Kiyoshi Ueda
This patch adds the following 2 interfaces for request-stacking drivers:
- blk_rq_prep_clone(struct request *clone, struct request *orig, struct bio_set *bs, gfp_t gfp_mask, int (*bio_ctr)(struct bio *, struct bio*, void *), void *data)
  * Clones bios in the original request to the clone request (bio_ctr is called for each cloned bio.)
  * Copies attributes of the original request to the clone request. The actual data parts (e.g. ->cmd, ->buffer, ->sense) are not copied.
- blk_rq_unprep_clone(struct request *clone)
  * Frees cloned bios from the clone request.
Request stacking drivers (e.g. request-based dm) need to make a clone request for a submitted request and dispatch it to other devices.
To allocate a request for the clone, request stacking drivers may not be able to use blk_get_request() because the allocation may be done in an irq-disabled context. So blk_rq_prep_clone() takes a request allocated by the caller as an argument.
For each clone bio in the clone request, request stacking drivers should be able to set up their own completion handler. So blk_rq_prep_clone() takes a callback function which is called for each clone bio, and a pointer for private data which is passed to the callback.
NOTE: blk_rq_prep_clone() doesn't copy any actual data of the original request. Pages are shared between original bios and cloned bios. So the caller must not complete the original request before the clone request.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
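The shape of the two interfaces above (a caller-supplied clone, a per-bio constructor callback with private data, and a matching teardown) can be illustrated with a toy linked list. This is a userspace sketch under heavy simplification: the `demo_*` types and functions are hypothetical stand-ins, and the real functions also take a bio_set and gfp mask and share pages rather than copying a field.

```c
#include <stdlib.h>

/* Toy stand-ins for struct bio and struct request. */
struct demo_bio {
    int sector;
    void *private_data;
    struct demo_bio *next;
};

struct demo_request { struct demo_bio *bio; };

/* Per-bio constructor callback; a nonzero return aborts the clone. */
typedef int (*demo_bio_ctr)(struct demo_bio *clone, struct demo_bio *orig, void *data);

/* Frees all cloned bios attached to the clone request. */
void demo_rq_unprep_clone(struct demo_request *clone)
{
    struct demo_bio *bio = clone->bio;
    while (bio) {
        struct demo_bio *next = bio->next;
        free(bio);
        bio = next;
    }
    clone->bio = NULL;
}

/* Fills a caller-allocated clone request with copies of orig's bios,
 * calling ctr for each one. Returns 0 on success, -1 on failure. */
int demo_rq_prep_clone(struct demo_request *clone, const struct demo_request *orig,
                       demo_bio_ctr ctr, void *data)
{
    struct demo_bio **tail = &clone->bio;

    clone->bio = NULL;
    for (struct demo_bio *src = orig->bio; src; src = src->next) {
        struct demo_bio *copy = malloc(sizeof(*copy));
        if (!copy)
            goto fail;
        copy->sector = src->sector;
        copy->private_data = NULL;
        copy->next = NULL;
        *tail = copy;          /* link first so cleanup can find it */
        tail = &copy->next;
        if (ctr && ctr(copy, src, data))
            goto fail;
    }
    return 0;

fail:
    demo_rq_unprep_clone(clone);  /* undo the partial clone */
    return -1;
}

/* Example constructor: tags each clone and counts the calls. */
int demo_count_ctr(struct demo_bio *clone, struct demo_bio *orig, void *data)
{
    (void)orig;
    clone->private_data = data;
    ++*(int *)data;
    return 0;
}
```

Taking the clone request from the caller keeps allocation policy out of the helper, which matches the stated reason that the caller may be running in a context where the usual request allocation is not available.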
-
Committed by Nikanth Karthikesan
Currently io_context has an atomic_t (32-bit) as refcount. In the case of cfq, for each device against which a task does I/O, a reference to the io_context would be taken. And when there are multiple processes sharing io_contexts (CLONE_IO), each would also have a reference to the same io_context.
Theoretically, the possible maximum number of processes sharing the same io_context plus the number of disks/cfq_data referring to the same io_context can overflow the 32-bit counter on a very high-end machine. Even though it is an improbable case, let us make it atomic_long_t.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 10 Jun, 2009 1 commit
-
-
Committed by Li Zefan
TRACE_EVENT is a more generic way to define tracepoints. Doing so adds these new capabilities to this tracepoint:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
...
Cons:
- no dev_t info for the output of plug, unplug_timer and unplug_io events. no dev_t info for getrq and sleeprq events if bio == NULL. no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL. This is mainly because we can't get the device from a request queue. But this may change in the future.
- A packet command is converted to a string in TP_assign, not TP_print, while blktrace does the conversion just before output. Since pc requests should be rather rare, this is not a big issue.
- In blktrace, an event can have 2 different print formats, but a TRACE_EVENT has a unique format, which means we have some unused data in a trace entry. The overhead is minimized by using __dynamic_array() instead of __array().
I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
      dd                   dd + ioctl blktrace    dd + TRACE_EVENT (splice)
1     7.36s, 42.7 MB/s     7.50s, 42.0 MB/s       7.41s, 42.5 MB/s
2     7.43s, 42.3 MB/s     7.48s, 42.1 MB/s       7.43s, 42.4 MB/s
3     7.38s, 42.6 MB/s     7.45s, 42.2 MB/s       7.41s, 42.5 MB/s
So the overhead of tracing is very small, and there is no regression when using those trace events vs blktrace.
And the binary output of TRACE_EVENT is much smaller than blktrace:
# ls -l -h
-rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
-rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
-rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
Following are some comparisons between TRACE_EVENT and blktrace:
plug:
kjournald-480 [000] 303.084981: block_plug: [kjournald]
kjournald-480 [000] 303.084981: 8,0 P N [kjournald]
unplug_io:
kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1
remap:
kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
kjournald-480 [000] 303.085043: 8,0 A W 102736992 + 8 <- (8,8) 33384
bio_backmerge:
kjournald-480 [000] 303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
kjournald-480 [000] 303.085086: 8,0 M W 102737032 + 8 [kjournald]
getrq:
kjournald-480 [000] 303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084975: 8,0 G W 102736984 + 8 [kjournald]
bash-2066 [001] 1072.953770: 8,0 G N [bash]
bash-2066 [001] 1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
rq_complete:
konsole-2065 [001] 300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
konsole-2065 [001] 300.053191: 8,0 C W 103669040 + 16 [0]
ksoftirqd/1-7 [001] 1072.953811: 8,0 C N (5a 00 08 00 00 00 00 00 24 00) [0]
ksoftirqd/1-7 [001] 1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
rq_insert:
kjournald-480 [000] 303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084986: 8,0 I W 102736984 + 8 [kjournald]
Changelog from v2 -> v3:
- use the newly introduced __dynamic_array().
Changelog from v1 -> v2:
- use __string() instead of __array() to minimize the memory required to store hex dump of rq->cmd().
- support large pc requests.
- add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
- some cleanups.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 09 Jun, 2009 4 commits
-
-
Committed by FUJITA Tomonori
Due to commit 1cd96c24 ("block: WARN in __blk_put_request() for potential bio leak"), BSG SMP requests get false warnings:
WARNING: at block/blk-core.c:1068 __blk_put_request+0x52/0xc0()
This sets rq->bio to NULL to avoid that false warning.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Martin K. Petersen
DM no longer needs to set limits explicitly when calling blk_stack_limits. Let the latter automatically deal with bounce_pfn scaling. Fix kerneldoc variable names.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
This reverts commit a05c0205. DM doesn't need to access the bounce_pfn directly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by FUJITA Tomonori
Tejun's "block: set rq->resid_len to blk_rq_bytes() on issue" patch seems to be incomplete; it doesn't set rq->resid_len to blk_rq_bytes() for a bidi request (req->next_rq). As a result, all bidi users are broken.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-