1. 11 Sep 2009, 5 commits
  2. 11 Jul 2009, 1 commit
    • cfq-iosched: reset oom_cfqq in cfq_set_request() · 32f2e807
      Committed by Vivek Goyal
      In case memory is scarce, we now default to oom_cfqq. Once memory is
      available again, we should allocate a new cfqq and stop using oom_cfqq for
      a particular io context.
      
      Once a new request comes in, check if we are using oom_cfqq, and if yes,
      try to allocate a new cfqq.
      
      Tested the patch by forcing the use of oom_cfqq; on the next request the
      thread noticed it was using oom_cfqq and allocated a new cfqq.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
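      A minimal sketch of the check described above, as it might look inside
      cfq_set_request(); the helper names (cic_to_cfqq(), cfq_get_queue(),
      cic_set_cfqq()) follow cfq-iosched conventions but are illustrative here,
      not a quote of the actual patch:

          struct cfq_queue *cfqq = cic_to_cfqq(cic, is_sync);

          /*
           * If this io context is still on the shared oom_cfqq fallback,
           * memory may be available again: retry the real allocation.
           */
          if (!cfqq || cfqq == &cfqd->oom_cfqq) {
              cfqq = cfq_get_queue(cfqd, is_sync, cic->ioc, gfp_mask);
              cic_set_cfqq(cic, cfqq, is_sync);
          }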
  3. 01 Jul 2009, 3 commits
  4. 16 Jun 2009, 2 commits
  5. 11 Jun 2009, 1 commit
  6. 11 May 2009, 3 commits
    • block: drop request->hard_* and *nr_sectors · 2e46e8b2
      Committed by Tejun Heo
      struct request has had a few different ways to represent some
      properties of a request.  The ->hard_* fields represent the block
      layer's view of request progress (the completion cursor), while the
      ones without the prefix represent the issue cursor and may be updated
      as necessary by the low level drivers.  The thing is that, as the
      block layer supports partial completion, the two cursors really
      aren't necessary and only cause confusion.  In addition, manual
      management of request details from low level drivers is cumbersome
      and error-prone at the very least.
      
      Another interesting set of duplicates is rq->[hard_]nr_sectors and
      rq->{hard_cur|current}_nr_sectors against rq->data_len and
      rq->bio->bi_size.  This is more convoluted than the hard_ case.
      
      rq->[hard_]nr_sectors are initialized for requests with a bio, but
      blk_rq_bytes() uses them only for !pc requests.  rq->data_len is
      initialized for all requests, but blk_rq_bytes() uses it only for pc
      requests.  This causes a good amount of confusion throughout the
      block layer and its drivers, and determining the request length has
      been a bit of black magic which may or may not work depending on
      circumstances and what the specific LLD is actually doing.
      
      rq->{hard_cur|current}_nr_sectors represent the number of sectors in
      the contiguous data area at the front.  This is mainly used by
      drivers which transfer data by walking the request
      segment-by-segment.  This value always equals rq->bio->bi_size >> 9.
      However, the data length for pc requests may not be a multiple of
      512 bytes, and using this field becomes a bit confusing.
      
      In general, having multiple fields to represent the same property
      leads only to confusion and subtle bugs.  With recent block low level
      driver cleanups, no driver is accessing or manipulating these
      duplicate fields directly.  Drop all the duplicates.  Now rq->sector
      means the current sector, rq->data_len the current total length and
      rq->bio->bi_size the current segment length.  Everything else is
      defined in terms of these three and available only through accessors.
      
      * blk_recalc_rq_sectors() is collapsed into blk_update_request() and
        now handles pc and fs requests equally, other than the rq->sector
        update.  This means that pc requests can now use partial completion
        too (no in-kernel user yet, though).
      
      * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer
        now uses byte count as the primary data length.
      
      * blk_rq_pos() is now guaranteed to be always correct.  In-block
        users converted.
      
      * blk_rq_bytes() is now guaranteed to be always valid as is
        blk_rq_sectors().  In-block users converted.
      
      * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9.
        Whichever is more convenient is used.
      
      * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const
        pointer to request.
      
      [ Impact: API cleanup, single way to represent one property of a request ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
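      A hedged sketch of the accessor model described above; the actual
      definitions live in include/linux/blkdev.h and may differ in detail
      from this illustration:

          static inline sector_t blk_rq_pos(const struct request *rq)
          {
              return rq->sector;              /* current sector */
          }

          static inline unsigned int blk_rq_bytes(const struct request *rq)
          {
              return rq->data_len;            /* current total length */
          }

          static inline unsigned int blk_rq_cur_bytes(const struct request *rq)
          {
              return rq->bio ? rq->bio->bi_size : 0;  /* current segment length */
          }

          static inline unsigned int blk_rq_sectors(const struct request *rq)
          {
              return blk_rq_bytes(rq) >> 9;   /* always equals bytes >> 9 */
          }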
    • block: convert to pos and nr_sectors accessors · 83096ebf
      Committed by Tejun Heo
      With recent cleanups, there is no place where a low level driver
      directly manipulates request fields.  This means that the 'hard'
      request fields always equal the !hard fields.  Convert all
      rq->sectors, nr_sectors and current_nr_sectors references to
      accessors.
      
      While at it, drop the superfluous blk_rq_pos() < 0 test in swim.c.
      
      [ Impact: use pos and nr_sectors accessors ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Tested-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: Mike Miller <mike.miller@hp.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Cc: Tim Waugh <tim@cyberelk.net>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Dario Ballabio <ballabio_dario@emc.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: unsik Kim <donari75@gmail.com>
      Cc: Laurent Vivier <Laurent@lvivier.info>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
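      An illustrative before/after of the kind of per-driver conversion this
      commit performs; the variable names are hypothetical and the real hunks
      differ per driver:

          /* before: direct field access */
          block = rq->sector;
          nsect = rq->current_nr_sectors;

          /* after: accessors */
          block = blk_rq_pos(rq);
          nsect = blk_rq_cur_sectors(rq);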
    • block: implement blk_rq_pos/[cur_]sectors() and convert obvious ones · 5b93629b
      Committed by Tejun Heo
      Implement accessors - blk_rq_pos(), blk_rq_sectors() and
      blk_rq_cur_sectors() - which return rq->hard_sector,
      rq->hard_nr_sectors and rq->hard_cur_sectors respectively, and
      convert direct references to the said fields to the accessors.
      
      This is in preparation for the request data length handling cleanup.
      
      Geert	: suggested adding const to struct request * parameter to accessors
      Sergei	: spotted error in patch description
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Tested-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
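      A hedged sketch of the initial accessors as this commit describes them
      (before the later data-length cleanup redefined them); not a quote of
      the patch:

          static inline sector_t blk_rq_pos(const struct request *rq)
          {
              return rq->hard_sector;
          }

          static inline unsigned int blk_rq_sectors(const struct request *rq)
          {
              return rq->hard_nr_sectors;
          }

          static inline unsigned int blk_rq_cur_sectors(const struct request *rq)
          {
              return rq->hard_cur_sectors;
          }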
  7. 28 Apr 2009, 1 commit
    • block: kill blk_start_queueing() · a7f55792
      Committed by Tejun Heo
      blk_start_queueing() is identical to __blk_run_queue() except that it
      doesn't check for recursion.  None of the current users depends on
      blk_start_queueing() running request_fn directly.  Replace usages of
      blk_start_queueing() with [__]blk_run_queue() and kill it.
      
      [ Impact: removal of mostly duplicate interface function ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
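      Illustratively, call sites are converted roughly like this (with
      blk_run_queue() as the variant that takes the queue lock itself, for
      callers not already holding it); the exact per-caller hunks differ:

          -   blk_start_queueing(q);
          +   __blk_run_queue(q);     /* queue lock held by the caller */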
  8. 24 Apr 2009, 3 commits
    • cfq-iosched: cache prio_tree root in cfqq->p_root · f2d1f0ae
      Committed by Jens Axboe
      Currently we look it up from ->ioprio, but ->ioprio can change if
      either the process gets its IO priority changed explicitly, or if
      cfq decides to temporarily boost it. So if we are unlucky, we can
      end up attempting to remove a node from a different rbtree root than
      the one it was added to.
      
      Fix this by using ->org_ioprio as the prio_tree index, since that
      only changes for explicit IO priority settings (not for a boost).
      Additionally, cache the rbtree root inside the cfqq, so we don't have
      to add code to reinsert the cfqq in the prio_tree if the IO priority changes.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
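      A hedged sketch of the caching idea: remember which prio_tree root a
      cfqq was inserted into, so removal never has to guess from the
      (possibly changed) priority. Names are illustrative, not the exact
      kernel code:

          /* insert: index by org_ioprio; remember which root we used */
          struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];

          /* (parent and p come from the usual rbtree walk, omitted here) */
          rb_link_node(&cfqq->p_node, parent, p);
          rb_insert_color(&cfqq->p_node, root);
          cfqq->p_root = root;

          /* removal: erase from the cached root, whatever ->ioprio is now */
          if (cfqq->p_root) {
              rb_erase(&cfqq->p_node, cfqq->p_root);
              cfqq->p_root = NULL;
          }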
    • cfq-iosched: fix bug with aliased request and cooperation detection · 3ac6c9f8
      Committed by Jens Axboe
      cfq_prio_tree_lookup() should return the direct match, yet it always
      returns zero. Fix that.
      
      cfq_prio_tree_add() assumes that we don't get a direct match, while
      it is very possible that we do. Using O_DIRECT, you can have different
      cfqqs with matching requests, since you don't have the page cache
      to serialize things for you. Fix this bug by only adding the cfqq if
      there isn't an existing match.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
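      A hedged sketch of the add-side guard described above; the surrounding
      rbtree walk and the lookup signature are simplified, not copied from
      the patch:

          __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root, sector,
                                        &parent, &p);
          if (__cfqq)
              return;     /* direct match already in the tree: don't add an alias */

          rb_link_node(&cfqq->p_node, parent, p);
          rb_insert_color(&cfqq->p_node, cfqq->p_root);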
    • cfq-iosched: clear ->prio_trees[] on cfqd alloc · 26a2ac00
      Committed by Jens Axboe
      Not strictly needed, but we should make it clear that we init the
      rbtree roots here.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
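      A minimal sketch of the explicit initialization, assuming the
      CFQ_PRIO_LISTS bound used elsewhere in cfq-iosched:

          for (i = 0; i < CFQ_PRIO_LISTS; i++)
              cfqd->prio_trees[i] = RB_ROOT;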
  9. 22 Apr 2009, 2 commits
  10. 15 Apr 2009, 7 commits
  11. 07 Apr 2009, 3 commits
    • cfq-iosched: don't let idling interfere with plugging · b029195d
      Committed by Jens Axboe
      When CFQ is waiting for a new request from a process, it currently
      restarts queuing immediately when it sees such a request. This doesn't
      work very well with streamed IO, since we then end up splitting IO
      that would otherwise have been merged nicely. For a simple dd test,
      this causes 10x as many requests to be issued as we should have.
      Normally this goes unnoticed due to the low overhead of requests
      at the device side, but some hardware is very sensitive to request
      sizes and there it can cause big slowdowns.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • cfq-iosched: kill two unused cfqq flags · 75e50984
      Committed by Jens Axboe
      We only manipulate the must_dispatch and queue_new flags; they are not
      tested anymore. So get rid of them.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • cfq-iosched: change dispatch logic to deal with single requests at the time · 2f5cb738
      Committed by Jens Axboe
      The IO scheduler core calls into the IO scheduler dispatch_request hook
      to move requests from the IO scheduler and into the driver dispatch
      list. It only does so when the dispatch list is empty. CFQ moves several
      requests to the dispatch list, which can cause higher latencies if we
      suddenly have to switch to some important sync IO. Change the logic to
      move one request at a time instead.
      
      This should almost be functionally equivalent to what we did before,
      except that we now honor 'quantum' as the maximum queue depth at the
      device side from any single cfqq. If there's just a single active
      cfqq, we allow up to 4 times the normal quantum.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
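      A hedged sketch of the dispatch bound described above; the real
      cfq_dispatch_requests() logic has more cases, and the names here are
      simplified:

          unsigned int max_dispatch = cfqd->cfq_quantum;

          /* a lone active cfqq may drive the device deeper than 'quantum' */
          if (cfqd->busy_queues == 1)
              max_dispatch *= 4;

          if (cfqq->dispatched >= max_dispatch)
              return 0;       /* honor the per-cfqq depth limit */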
  12. 06 Apr 2009, 1 commit
  13. 30 Jan 2009, 1 commit
    • cfq-iosched: Allow RT requests to pre-empt ongoing BE timeslice · 3a9a3f6c
      Committed by Divyesh Shah
      This patch adds the ability to pre-empt an ongoing BE timeslice when an RT
      request is waiting for the current timeslice to complete. This reduces the
      wait time to disk for RT requests from an upper bound of 4 (the current value
      of cfq_quantum) to 1 disk request.
      
      Applied Jens' suggested changes to avoid the rb lookup and use !cfq_class_rt(),
      and retested.
      
      Latency (secs) for the RT task when doing sequential reads from a 10G file.
                             | only RT | RT + BE | RT + BE + this patch
      small (512 byte) reads | 143     | 163     | 145
      large (1MB) reads      | 142     | 158     | 146
      Signed-off-by: Divyesh Shah <dpshah@google.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
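      A hedged sketch of the pre-emption test described above, using the
      !cfq_class_rt() check mentioned in the log; this is illustrative, not
      the complete cfq_should_preempt() logic:

          /* an RT queue with work pending may pre-empt a non-RT (BE) slice */
          if (cfq_class_rt(new_cfqq) && !cfq_class_rt(active_cfqq))
              return 1;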
  14. 29 Dec 2008, 4 commits
  15. 09 Oct 2008, 3 commits