- 16 Jul, 2007 3 commits
-
-
Committed by FUJITA Tomonori
This just kills linux/config.h and dprintk warnings. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by FUJITA Tomonori
This converts block/scsi_ioctl.c to use the new blk_rq_unmap_user API. blk_unmap_sghdr_rq is too simple, and it might be better to remove it. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 10 Jul, 2007 5 commits
-
-
Committed by Matthias Kaehlcke
elevator Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jan Engelhardt
instead of going through all options. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
With the cfq_queue hash removal, we inadvertently got rid of the async queue sharing. This was not intentional; in fact, CFQ purposely shares the async queue per priority level to get good merging for async writes. So put some logic in cfq_get_queue() to track the shared queues. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Tejun Heo
Barrier bios are completed twice: once after the barrier write itself is done and again after the whole sequence is complete. flush_dry_bio_endio() is for the first completion. It doesn't really complete the bio. It rewinds bvec and resets bio so that it can be completed again when the whole barrier sequence is complete.

The bvec rewinding code has the following problems.

1. The rewinding code is wrong because filesystems may pass bvec with non-zero bv_offset.
2. The block layer doesn't guarantee anything about the state of the bvec array on request completion. bv_offset and len are updated iff __end_that_request_first() completes the bvec partially.

Because of #2, #1 doesn't really matter (nobody cares whether bvec is rewound correctly or not), but then again, by not doing unwinding at all, we'll always give back the same bvec to the caller, as full bvec completion doesn't alter bvecs and the final completion is always full completion.

Drop unnecessary rewinding code.

This was spotted by Neil Brown.

Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Two bugs in there:

- The virt oversize check should use the current bio hardware back size and the next bio front size, not the same bio. Spotted by Neil Brown.
- The segment size check should add hw front sizes, not total bio sizes. Spotted by James Bottomley.

Acked-by: James Bottomley <James.Bottomley@SteelEye.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 16 Jun, 2007 1 commit
-
-
Committed by Tejun Heo
SCSI marks internal commands with REQ_PREEMPT and pushes them to the front of the request queue using blk_execute_rq(). When entering suspended or frozen state, SCSI devices are quiesced using scsi_device_quiesce(). In quiesced state, only REQ_PREEMPT requests are processed. This is how SCSI blocks other requests out while suspending and resuming. As all internal commands are pushed to the front of the queue, this usually works.

Unfortunately, this interacts badly with ordered requeueing. To preserve request order on requeueing (due to busy device, active EH or other failures), requests are sorted according to ordered sequence on requeue if an IO barrier is in progress.

The following sequence deadlocks.

1. IO barrier sequence issues.
2. Suspend requested. Queue is quiesced with part or all of the IO barrier sequence at the front.
3. During suspending or resuming, SCSI issues an internal command which gets deferred and requeued for some reason. As the command is issued after the IO barrier in #1, the ordered requeueing code puts the request after the IO barrier sequence.
4. The device is ready to process requests again but is still in quiesced state, and the first request of the queue isn't REQ_PREEMPT, so command processing is deadlocked: suspending/resuming waits for the issued request to complete, while the request can't be processed till the device is put back into running state by resuming.

This can be fixed by always putting !fs requests at the front when requeueing.

The following thread reports this deadlock.

http://thread.gmane.org/gmane.linux.kernel/537473

Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: David Greaves <david@dgreaves.com> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 24 May, 2007 2 commits
-
-
Committed by Kristen Carlson Accardi
Send a uevent to user space to indicate that a media change event has occurred. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Kristen Carlson Accardi
Allow user space to determine if a disk supports Asynchronous Notification of media changes. This is done by adding a new sysfs file "capability_flags", which is documented in (insert file name). This sysfs file will export all disk capability flags to user space. We also define a new flag for the media change notification capability. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 May, 2007 1 commit
-
-
Committed by Jens Axboe
current_io_context() is both static and exported with EXPORT_SYMBOL(). As there are no users outside of ll_rw_blk.c itself, just kill the export. Problem reported by Martin Michlmayr <tbm@cyrius.com>. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 11 May, 2007 1 commit
-
-
Committed by Neil Brown
to generic_make_request can use up a lot of space, and we would rather they didn't.

As generic_make_request is a void function, and as it is generally not expected that it will have any effect immediately, it is safe to delay any call to generic_make_request until there is sufficient stack space available.

As ->bi_next is reserved for the driver to use, it can have no valid value when generic_make_request is called, and as __make_request implicitly assumes it will be NULL (ELEVATOR_BACK_MERGE fork of switch) we can be certain that all callers set it to NULL. We can therefore safely use bi_next to link pending requests together, providing we clear it before making the real call.

So, we choose to allow each thread to only be active in one generic_make_request at a time. If a subsequent (recursive) call is made, the bio is linked into a per-thread list, and is handled when the active call completes.

As the list of pending bios is per-thread, there are no locking issues to worry about.

I say above that it is "safe to delay any call...". There are, however, some behaviours of a make_request_fn which would make it unsafe. These include any behaviour that assumes anything will have changed after a recursive call to generic_make_request.

These could include:

- waiting for that call to finish and call its bi_end_io function. md used to sometimes do this (marking the superblock dirty before completing a write) but doesn't any more.
- inspecting the bio for fields that generic_make_request might change, such as bi_sector or bi_bdev. It is hard to see a good reason for this, and I don't think anyone actually does it.
- inspecting the queue to see if, e.g., it is 'full' yet. Again, I think this is very unlikely to be useful, or to be done.

Signed-off-by: Neil Brown <neilb@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: <dm-devel@redhat.com>

Alasdair G Kergon <agk@redhat.com> said:

I can see nothing wrong with this in principle. For device-mapper at the moment though it's essential that, while the bio mappings may now get delayed, they still get processed in exactly the same order as they were passed to generic_make_request(). My main concern is whether the timing changes implicit in this patch will make the rare data-corrupting races in the existing snapshot code more likely. (I'm working on a fix for these races, but the unfinished patch is already several hundred lines long.) It would be helpful if some people on this mailing list would test this patch in various scenarios and report back.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
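
The deferral scheme described in this commit message can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the kernel patch itself: `struct bio` here is a cut-down stand-in, and `sketch_make_request()`, `handle_one()`, the `depth` field and the counters are hypothetical names invented for the example. It shows a recursive submission being linked onto a per-thread list through `bi_next` and handled only after the active call returns, so the handler never nests.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for the kernel's struct bio. */
struct bio {
    int depth;            /* illustrative: layers this bio still has to pass through */
    struct bio *bi_next;  /* NULL on entry; reused to link pending bios */
};

/* Per-thread state, so no locking is needed (as the commit message notes). */
static _Thread_local struct bio *pending_head;
static _Thread_local struct bio **pending_tail;
static _Thread_local int in_make_request;

/* Counters used only to observe the behaviour of the sketch. */
static int n_handled;
static int cur_nesting;
static int max_nesting;

void sketch_make_request(struct bio *bio);

/* Stand-in for a stacked driver's make_request_fn: remap, then resubmit. */
static void handle_one(struct bio *bio)
{
    cur_nesting++;
    if (cur_nesting > max_nesting)
        max_nesting = cur_nesting;
    n_handled++;
    if (bio->depth > 0) {
        bio->depth--;
        sketch_make_request(bio);   /* the "recursive" submission */
    }
    cur_nesting--;
}

void sketch_make_request(struct bio *bio)
{
    if (in_make_request) {
        /* Already active on this thread: queue the bio and return at once. */
        bio->bi_next = NULL;
        *pending_tail = bio;
        pending_tail = &bio->bi_next;
        return;
    }

    in_make_request = 1;
    pending_head = NULL;
    pending_tail = &pending_head;

    do {
        handle_one(bio);

        /* Pop the next pending bio, in submission order. */
        bio = pending_head;
        if (bio) {
            pending_head = bio->bi_next;
            if (!pending_head)
                pending_tail = &pending_head;
            bio->bi_next = NULL;    /* clear the link before the real call */
        }
    } while (bio);

    in_make_request = 0;
}
```

Submitting a bio with `depth = 3` runs the handler four times, but always from the outer loop, so the stack depth stays constant however many layers are stacked. Pending bios are processed in the order they were submitted, which is the property the device-mapper comment above asks for.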
-
- 10 May, 2007 4 commits
-
-
Committed by Rafael J. Wysocki
Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Oleg Nesterov
flush_work(wq, work) doesn't need the first parameter; we can use cwq->wq (this was possible from the very beginning, I missed this). So we can unify flush_work_keventd and flush_work. Also, rename flush_work() to cancel_work_sync() and fix all callers. Perhaps this is not the best name, but "flush_work" is really bad. (akpm: this is why the earlier patches bypassed maintainers) Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Auke Kok <auke-jan.h.kok@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Andrew Morton
Switch the kblockd flushing from a global flush to a more specific flush_work(). (akpm: bypassed maintainers, sorry. There are other patches which depend on this) Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <axboe@suse.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Dave Gilbert
Display all possible partitions when the root filesystem is not mounted. This helps to track spell'o's and missing drivers. Updated to work with newer kernels.

Example output:

    VFS: Cannot open root device "foobar" or unknown-block(0,0)
    Please append a correct "root=" boot option; here are the available partitions:
    0800 8388608 sda driver: sd
    0801 192748 sda1
    0802 8193150 sda2
    0810 4194304 sdb driver: sd
    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

[akpm@linux-foundation.org: cleanups, fix printk warnings] Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Cc: Dave Gilbert <linux@treblig.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 09 May, 2007 3 commits
-
-
Committed by Michael Opdenacker
Signed-off-by: Michael Opdenacker <michael@free-electrons.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
-
Committed by Nick Piggin
Fix units mismatch (jiffies vs msecs) in as-iosched.c, spotted by Xiaoning Ding <dingxn@cse.ohio-state.edu>. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
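
To illustrate the class of bug named here (the commit message does not show the actual change to as-iosched.c), the following user-space sketch compares an elapsed time in ticks against a threshold in milliseconds. Everything in it is an assumption made for the example: `SKETCH_HZ`, the helper names and the numbers are hypothetical, and the conversion helper only mimics the general idea of the kernel's `msecs_to_jiffies()`.

```c
#include <assert.h>

/* Assumed tick rate for the sketch; real kernels configure HZ at build time. */
#define SKETCH_HZ 250UL

/* Convert milliseconds to ticks, rounding up so a nonzero timeout never becomes zero. */
static unsigned long sketch_msecs_to_jiffies(unsigned long ms)
{
    return (ms * SKETCH_HZ + 999UL) / 1000UL;
}

/* Buggy comparison: the left side is in ticks, the right side in milliseconds. */
static int sketch_expired_buggy(unsigned long elapsed_jiffies, unsigned long threshold_ms)
{
    return elapsed_jiffies >= threshold_ms;
}

/* Fixed comparison: both sides are converted to ticks first. */
static int sketch_expired_fixed(unsigned long elapsed_jiffies, unsigned long threshold_ms)
{
    return elapsed_jiffies >= sketch_msecs_to_jiffies(threshold_ms);
}
```

With the assumed 250 ticks per second, 30 elapsed ticks are 120 ms. The fixed version therefore reports a 100 ms threshold as exceeded, while the buggy version compares 30 against 100 and does not.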
-
Committed by Mike Christie
I think we might just need the blk_map_kern users now. For the async execute I added the bounce code already, and the block SG_IO has it already. I think the blk_map_kern bounce code got dropped because we thought the correct gfp_t would be passed in. But I think all we need is the patch below and all the paths are taken care of. The patch is not tested. Patch was made against scsi-misc. The last place that is sending non-sg commands may just be md/dm-emc.c, but that is just waiting on Alasdair to take some patches that fix that and a bunch of junk in there, including adding bounce support. If the patch below is ok though and dm-emc finally gets converted, then it will have sg and bounce buffer support. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 08 May, 2007 2 commits
-
-
Committed by Christoph Lameter
This patch provides a new macro KMEM_CACHE(<struct>, <flags>) to simplify slab creation. KMEM_CACHE creates a slab with the name of the struct, with the size of the struct and with the alignment of the struct. Additional slab flags may be specified if necessary.

Example:

    struct test_slab {
            int a, b, c;
            struct list_head list;
    } __cacheline_aligned_in_smp;

    test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC);

will create a new slab named "test_slab" of the size sizeof(struct test_slab) and aligned to the alignment of struct test_slab. If it fails then we panic.

Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
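
How such a macro can derive name, size and alignment from a struct tag may be sketched in user space as below. This is not the kernel definition: `SKETCH_KMEM_CACHE`, `sketch_cache_create()`, `struct sketch_cache` and `SKETCH_SLAB_PANIC` are hypothetical stand-ins, and the "cache" only records its parameters instead of allocating anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor that just records what a cache was created with. */
struct sketch_cache {
    const char *name;
    size_t size;
    size_t align;
    unsigned long flags;
};

/* Hypothetical creation function taking explicit name, size, alignment and flags. */
static struct sketch_cache sketch_cache_create(const char *name, size_t size,
                                               size_t align, unsigned long flags)
{
    struct sketch_cache c = { name, size, align, flags };
    return c;
}

/*
 * The macro takes a struct tag: #struct_name stringifies it for the cache
 * name, while sizeof and _Alignof supply the object size and alignment.
 */
#define SKETCH_KMEM_CACHE(struct_name, cache_flags)                   \
    sketch_cache_create(#struct_name, sizeof(struct struct_name),     \
                        _Alignof(struct struct_name), (cache_flags))

#define SKETCH_SLAB_PANIC 0x1UL   /* illustrative flag value only */

struct test_slab {
    int a, b, c;
};
```

A call such as `SKETCH_KMEM_CACHE(test_slab, SKETCH_SLAB_PANIC)` then yields a descriptor named "test_slab" with the size and alignment of `struct test_slab`, so callers no longer need to repeat those three arguments by hand.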
-
Committed by Peter Zijlstra
Remove the destroy_dirty_buffers argument from invalidate_bdev(); it hasn't been used in 6 years (so akpm says).

    find * -name \*.[ch] | xargs grep -l invalidate_bdev |
    while read file; do
            quilt add $file;
            sed -ie 's/invalidate_bdev(\([^,]*\),[^)]*)/invalidate_bdev(\1)/g' $file;
    done

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 May, 2007 1 commit
-
-
Committed by Greg Kroah-Hartman
We need to work on cleaning up the relationship between kobjects, ksets and ktypes. The removal of 'struct subsystem' is the first step of this, especially as it is not really needed at all. Thanks to Kay for fixing the bugs in this patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
- 30 Apr, 2007 17 commits
-
-
Committed by Jens Axboe
It's never grabbed from irq context, so just make it plain spin_lock(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
We often look up the same queue many times in succession, so cache the last looked-up queue to avoid browsing the rbtree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
To be used by as/cfq as they see fit. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Vasily Tarasov
The cfq hash is no longer necessary. We can always get the cfqq from the io context. The cfq_get_io_context_noalloc() function is introduced because we don't want to allocate a cic on merging and checking may_queue. In order to identify the sync queue we've used hash key = CFQ_KEY_ASYNC. Since the hash is eliminated we need to use another criterion: a sync flag for the queue is added. In all places where we dig in the rb_tree we're in current context, so no additional locking is required.

Advantages of this patch: no additional memory for the hash, no seeking in the hash, and the code is cleaner. It is now necessary to seek the cic in the per-ioc rbtree, but that is faster:

- most processes work only with few devices
- most systems have only few block devices
- it is a rb-tree

Signed-off-by: Vasily Tarasov <vtaras@openvz.org>

Changes by me:

- Merge into CFQ devel branch
- Get rid of cfq_get_io_context_noalloc()
- Fix various bugs with dereferencing cic->cfqq[] with offset other than 0 or 1.
- Fix bug in cfqq setup, is_sync condition was reversed.
- Fix bug where only bio_sync() is used, we need to check for a READ too

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
For tagged devices, allow overlap of requests if the idle window isn't enabled on the current active queue. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
We don't enable it by default, so don't let it get enabled during runtime. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
We can track it fairly accurately locally; let the slice handling take care of the rest. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
We don't use it anymore in the slice expiry handling. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
It's only used for preemption now that the IDLE and RT queues also use the rbtree. If we pass an 'add_front' variable to cfq_service_tree_add(), we can set ->rb_key to 0 to force insertion at the front of the tree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Use max_slice - cur_slice as the multiplier for the insertion offset. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Same treatment as the RT conversion, just put the sorted idle branch at the end of the tree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Currently CFQ does a linked insert into the current list for RT queues. We can just factor the class into the rb insertion, and then we don't have to treat RT queues in a special way. It's faster, too. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
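
One way to picture "factoring the class into the insertion" is a single ordering in which the scheduling class is compared first and the service time second, so RT entries naturally sort to the front and idle entries to the end. The sketch below is an assumption-laden illustration, not CFQ's code: the enum, `struct sketch_queue`, `service_at` and `sketch_queue_cmp()` are hypothetical, and a sorted array stands in for the rbtree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical scheduling classes, ordered from most to least urgent. */
enum sketch_class { SKETCH_RT = 0, SKETCH_BE = 1, SKETCH_IDLE = 2 };

/* Hypothetical queue entry: class plus the time at which it should be serviced. */
struct sketch_queue {
    enum sketch_class cls;
    unsigned long service_at;
};

/*
 * Single ordering used for insertion: class first, then service time.
 * With this comparison, no separate list is needed for RT or idle queues.
 */
static int sketch_queue_cmp(const void *pa, const void *pb)
{
    const struct sketch_queue *a = pa;
    const struct sketch_queue *b = pb;

    if (a->cls != b->cls)
        return a->cls < b->cls ? -1 : 1;
    if (a->service_at != b->service_at)
        return a->service_at < b->service_at ? -1 : 1;
    return 0;
}
```

Sorting a mixed set with `qsort()` and this comparison puts an RT entry ahead of best-effort entries even when its service time is later, and an idle entry last even when its service time is earliest.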
-
Committed by Jens Axboe
For cases where the rbtree is mainly used for sorting and min retrieval, a nice speedup of the rbtree code is to maintain a cache of the leftmost node in the tree. Also spotted in the CFS CPU scheduler code. Improved by Alan D. Brunelle <Alan.Brunelle@hp.com> by updating the leftmost hint in cfq_rb_first() if it isn't set, instead of only updating it on insert. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
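
The leftmost-node cache can be sketched as follows. To stay short, the example uses a plain unbalanced binary search tree rather than a red-black tree and omits deletion; that simplification, and all the `sketch_*` names, are assumptions of this illustration rather than the CFQ code. Insertion updates the hint when the new node was only ever placed to the left, and the lookup fills the hint lazily when it is unset, mirroring the improvement credited to Alan D. Brunelle.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified tree node: an unbalanced BST stands in for the kernel rbtree. */
struct sketch_node {
    unsigned long key;
    struct sketch_node *left, *right;
};

struct sketch_root {
    struct sketch_node *root;
    struct sketch_node *leftmost;   /* cached minimum, or NULL if unknown */
};

static void sketch_insert(struct sketch_root *t, struct sketch_node *n)
{
    struct sketch_node **link = &t->root;
    int went_left_only = 1;

    n->left = n->right = NULL;
    while (*link) {
        if (n->key < (*link)->key) {
            link = &(*link)->left;
        } else {
            link = &(*link)->right;
            went_left_only = 0;     /* some existing node is smaller or equal */
        }
    }
    *link = n;

    /* A node that only ever descended left is the new minimum. */
    if (went_left_only)
        t->leftmost = n;
}

static struct sketch_node *sketch_first(struct sketch_root *t)
{
    /* Lazily rebuild the hint if it is unset, then serve it in O(1). */
    if (!t->leftmost && t->root) {
        struct sketch_node *n = t->root;

        while (n->left)
            n = n->left;
        t->leftmost = n;
    }
    return t->leftmost;
}
```

Once the hint is set, retrieving the minimum no longer walks the tree, which is the common operation when the tree is used mainly as a sorted service queue.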
-
Committed by Jens Axboe
Drawing on some inspiration from the CFS CPU scheduler design, overhaul the pending cfq_queue concept list management. Currently CFQ uses a doubly linked list per priority level for sorting and service uses. Kill those lists and maintain an rbtree of cfq_queue's, sorted by when to service them.

This unfortunately means that the ionice levels aren't as strong anymore; will work on improving those later. We only scale the slice time now, not the number of times we service. This means that latency is better (for all priority levels), but that the distinction between the highest and lower levels isn't as big.

The diffstat speaks for itself.

    cfq-iosched.c | 363 +++++++++++++++++---------------------------------
    1 file changed, 125 insertions(+), 238 deletions(-)

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
- Move the queue_new flag clear to when the queue is selected
- Only select the non-first queue in cfq_get_best_queue() if there's a substantial difference between the best and first.
- Get rid of ->busy_rr
- Only select a close cooperator if the current queue is known to take a while to "think".

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-