- 30 Apr 2007, 4 commits
-
By Jens Axboe
Drawing on some inspiration from the CFS CPU scheduler design, overhaul the pending cfq_queue concept list management. Currently CFQ uses a doubly linked list per priority level for sorting and service selection. Kill those lists and maintain an rbtree of cfq_queue's, sorted by when to service them. This unfortunately means that the ionice levels aren't as strong anymore; that will be improved later. We only scale the slice time now, not the number of times we service. This means that latency is better (for all priority levels), but that the distinction between the highest and lower levels isn't as big. The diffstat speaks for itself:
cfq-iosched.c | 363 +++++++++++++++++---------------------------------
1 file changed, 125 insertions(+), 238 deletions(-)
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
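For illustration, a minimal sketch of what a service-time-keyed rbtree of queues can look like with the kernel's rbtree API; the struct layout and field names below are assumptions for this sketch, not the actual cfq-iosched.c code:

```c
#include <linux/rbtree.h>

struct example_cfqq {			/* hypothetical stand-in for cfq_queue */
	struct rb_node rb_node;		/* node in the service tree */
	unsigned long rb_key;		/* when to service this queue (jiffies) */
};

/* Insert a queue into the service tree, sorted by service time. */
static void service_tree_add(struct rb_root *root, struct example_cfqq *cfqq)
{
	struct rb_node **p = &root->rb_node;
	struct rb_node *parent = NULL;

	while (*p) {
		struct example_cfqq *entry;

		parent = *p;
		entry = rb_entry(parent, struct example_cfqq, rb_node);
		if (cfqq->rb_key < entry->rb_key)
			p = &(*p)->rb_left;
		else
			p = &(*p)->rb_right;
	}
	rb_link_node(&cfqq->rb_node, parent, p);
	rb_insert_color(&cfqq->rb_node, root);
}

/* The next queue to service is simply the leftmost node. */
static struct example_cfqq *service_tree_first(struct rb_root *root)
{
	struct rb_node *n = rb_first(root);

	return n ? rb_entry(n, struct example_cfqq, rb_node) : NULL;
}
```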
-
By Jens Axboe
- Move the queue_new flag clear to when the queue is selected.
- Only select the non-first queue in cfq_get_best_queue() if there's a substantial difference between the best and the first.
- Get rid of ->busy_rr.
- Only select a close cooperator if the current queue is known to take a while to "think".
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
- Implement logic for detecting cooperating processes, so we choose the best available queue whenever possible.
- Improve residual slice time accounting.
- Remove dead code: we no longer see async requests coming in on sync queues. That part was removed a long time ago. That means we can also remove the difference between cfq_cfqq_sync() and cfq_cfqq_class_sync(); they are now identical. And we can kill the on_dispatch array, just make it a counter.
- Allow a process to go into the current list if it hasn't been serviced in this scheduler tick yet.
Possible future improvements include caching the cfqq lookup in cfq_close_cooperator(), so we don't have to look it up twice. cfq_get_best_queue() should just reuse that last decision instead of doing it again.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
When testing the syslet async io approach, I discovered that CFQ sometimes didn't perform as well as expected. cfq_should_preempt() needs to better check for cooperating tasks, so fix that by allowing preemption of an equal-priority queue if the recently queued request is as good a candidate for IO as the one we are currently waiting for.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
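A hedged sketch of the kind of check this describes, using today's helper names; the threshold value and the position-based notion of "as good a candidate" are assumptions for illustration, not the exact upstream logic:

```c
#include <linux/blkdev.h>

#define EXAMPLE_CLOSE_THR	(8 * 1024)	/* sectors; illustrative value */

/*
 * Let a request from an equal-priority queue preempt the queue we are
 * idling on if it is at least as good a candidate for IO, here judged
 * by proximity to the last serviced sector.
 */
static int example_should_preempt(sector_t last_sector, struct request *rq,
				  int new_prio, int cur_prio)
{
	sector_t s = blk_rq_pos(rq);

	/* higher priority always preempts; this patch adds the equal case */
	if (new_prio != cur_prio)
		return new_prio > cur_prio;

	return (s > last_sector ? s - last_sector
				: last_sector - s) <= EXAMPLE_CLOSE_THR;
}
```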
-
- 25 Apr 2007, 1 commit
-
By Jens Axboe
There's a really rare and obscure bug in CFQ that causes a crash in cfq_dispatch_insert() due to rq == NULL. One example of the resulting oops is seen here: http://lkml.org/lkml/2007/4/15/41 Neil correctly diagnosed how this can happen: if two concurrent requests arrive with the exact same sector number (due to direct IO, or aliasing between MD and the raw device access), the alias handling will add the request to the sortlist, but next_rq remains NULL. Read the more complete analysis at: http://lkml.org/lkml/2007/4/25/57 This looks like it requires md to trigger, even though it should potentially be possible to do with O_DIRECT (at least if you edit the kernel and doctor some of the unplug calls). The fix is to move the ->next_rq update to when we add a request to the rbtree. Then we remove the possibility for a request to exist in the rbtree code but not have ->next_rq correctly updated.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
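The shape of that fix, as a hedged sketch: pick ->next_rq at rbtree-insert time so a request can never sit in the tree while ->next_rq is stale or NULL. The queue struct and the choose_req() helper are illustrative stand-ins:

```c
#include <linux/elevator.h>

struct example_queue {			/* illustrative stand-in for cfq_queue */
	struct rb_root sort_list;	/* per-queue sorted request tree */
	struct request *next_rq;	/* next request to serve from this queue */
};

/* stand-in for CFQ's "which request is better to serve next" decision */
static struct request *choose_req(struct request *a, struct request *b)
{
	return a ? a : b;		/* the real code compares positions */
}

/* Update ->next_rq where the request enters the tree, not at dispatch. */
static void example_add_rq_rb(struct example_queue *q, struct request *rq)
{
	elv_rb_add(&q->sort_list, rq);
	q->next_rq = choose_req(q->next_rq, rq);
}
```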
-
- 21 Apr 2007, 1 commit
-
By Jens Axboe
We have a 10-15% performance regression for sequential writes on TCQ/NCQ enabled drives in 2.6.21-rcX after the CFQ update went in. It has been reported by Valerie Clement <valerie.clement@bull.net> and the Intel testing folks. The regression is because of CFQ's now more aggressive queue control, limiting the depth available to the device. This patch fixes that regression by allowing a greater depth when only one queue is busy. It has been tested to not impact sync-vs-async workloads too much - we still do a lot better than 2.6.20.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
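The idea in miniature, as a sketch (the depth values and the busy_queues parameter are illustrative assumptions, not the exact upstream logic):

```c
/*
 * With a single busy queue there is no inter-queue fairness to protect,
 * so let that queue drive a TCQ/NCQ device at full depth; otherwise
 * fall back to the normal, fairness-preserving quantum.
 */
static int example_max_dispatch(int busy_queues, int quantum)
{
	if (busy_queues == 1)
		return -1;	/* effectively unlimited dispatch depth */
	return quantum;
}
```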
-
- 12 Feb 2007, 11 commits
-
By Jens Axboe
This improves performance considerably for sync requests when you have command queuing enabled.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
We only really need it for a process going away, so move it to those locations.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
Makes it more fair for the residual slice count.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
We currently check the FIFO once per slice. Optimize that a bit and only do it as the first thing for a new slice, so we don't end up doing a single request and then seek to the FIFO requests.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
It must always be the active queue, otherwise it's a bug. So just use the active_queue, don't pass it in explicitly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
If a slice uses less than it is entitled to (or perhaps more), include that in the decision on how much time to give it the next time it gets serviced.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
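One way to express residual slice accounting, sketched with assumed field names:

```c
struct example_slice {
	unsigned long slice_end;	/* jiffies at which the slice expires */
	long slice_resid;		/* time left over from the last slice */
};

/* On expiry, record how much of the slice was unused (or overrun). */
static void example_slice_expired(struct example_slice *s, unsigned long now)
{
	/* positive: expired early, owed time; negative: overran the slice */
	s->slice_resid = (long)(s->slice_end - now);
}

/* Fold the residual into the next slice the queue is granted. */
static void example_set_slice(struct example_slice *s,
			      unsigned long base_slice, unsigned long now)
{
	s->slice_end = now + base_slice + s->slice_resid;
	s->slice_resid = 0;
}
```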
-
By Jens Axboe
This better matches the time the queue actually spends doing IO.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
Right now we use slice_start, which gives async queues an unfair advantage. Change that to service_last, and base the resorting on that.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
Move the on_rr check into cfq_resort_rr_list(), since every call site needs to check it anyway.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
By Jens Axboe
It hasn't been used for a while; kill it off and remove the old '#if 0' code chunk.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 03 Jan 2007, 1 commit
-
By Jens Axboe
Two issues:
- The final return 1 should be a return 0, otherwise comparing cfqq is a noop.
- bio_sync() only checks the sync flag, while rq_is_sync() checks both for READ and sync. The latter is what we want. Expand the bio check to include reads, and relax the restriction to allow merging of async io into sync requests.
In the future we want to clean up the SYNC logic; right now it means both sync request (such as READ and O_DIRECT WRITE) and unplug-on-issue. Leave that for later.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 23 Dec 2006, 1 commit
-
By Jens Axboe
The logic in cfq_allow_merge() wasn't clear enough: basically, allow merging for the same queues only. Do a fast check for 'rq and bio both sync/async' before doing the cfqq hash lookup. This is verified to work with the fixed elv_try_merge() from commit bb4067e3.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 20 Dec 2006, 1 commit
-
By Jens Axboe
Currently we allow any merge, even if the io originates from different processes. This can cause really bad starvation and unfairness if those ios happen to be synchronous (reads or direct writes). So add an allow_merge hook to the io scheduler ops, so an io scheduler can help decide whether a bio/process combination may be merged with an existing request.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
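A hedged sketch of what such a hook can decide, written with today's helper names (op_is_sync()/rq_is_sync()); the per-process queue type and the lookup helpers are assumptions, declared here but left unimplemented:

```c
#include <linux/blkdev.h>

struct example_cfqq;			/* per-process queue, illustrative */

/* hypothetical lookup helpers, assumed for this sketch */
static struct example_cfqq *bio_to_queue(struct request_queue *q,
					 struct bio *bio);
static struct example_cfqq *rq_to_queue(struct request *rq);

/*
 * Allow a bio to merge into an existing request only when both come
 * from the same per-process queue, so one process cannot piggyback
 * on (and starve) another process's synchronous IO.
 */
static bool example_allow_merge(struct request_queue *q, struct request *rq,
				struct bio *bio)
{
	/* cheap check first: sync and async never merge with each other */
	if (op_is_sync(bio->bi_opf) != rq_is_sync(rq))
		return false;

	/* expensive check second: same submitting process's queue? */
	return bio_to_queue(q, bio) == rq_to_queue(rq);
}
```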
-
- 13 Dec 2006, 1 commit
-
By Jens Axboe
We need to do this, otherwise the io schedulers don't get access to the sync flag. Then they cannot tell the difference between a regular write and an O_DIRECT write, which can cause a performance loss.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 08 Dec 2006, 1 commit
-
By Christoph Lameter
Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script:

#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#
set -e
for file in `find * -name "*.c" -o -name "*.h" | xargs grep -l $1`; do
	quilt add $file
	sed -e "1,\$s/$1/$2/g" $file > /tmp/$$
	mv /tmp/$$ $file
	quilt refresh
done

The script was run like this:

sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 01 Dec 2006, 1 commit
-
By Jens Axboe
- ->init_queue() does not need the elevator passed in.
- ->put_request() is a hot path and need not have the queue passed in.
- cfq_update_io_seektime() does not need cfqd passed in.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 22 Nov 2006, 1 commit
-
By David Howells
Pass the work_struct pointer to the work function rather than context data. The work function can use container_of() to work out the data.

For the cases where the container of the work_struct may go away the moment the pending bit is cleared, it is made possible to defer the release of the structure by deferring the clearing of the pending bit. To make this work, an extra flag is introduced into the management side of the work_struct. This governs auto-release of the structure upon execution.

Ordinarily, the work queue executor would release the work_struct for further scheduling or deallocation by clearing the pending bit prior to jumping to the work function. This means that, unless the driver makes some guarantee itself that the work_struct won't go away, the work function may not access anything else in the work_struct or its container lest they be deallocated. This is a problem if the auxiliary data is taken away (as done by the last patch).

However, if the pending bit is *not* cleared before jumping to the work function, then the work function *may* access the work_struct and its container with no problems. But then the work function must itself release the work_struct by calling work_release().

In most cases, automatic release is fine, so this is the default. Special initiators exist for the non-auto-release case (ending in _NAR).

Signed-off-by: David Howells <dhowells@redhat.com>
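The resulting driver-side pattern looks roughly like this (a minimal sketch; the frob_* names are invented for illustration):

```c
#include <linux/workqueue.h>
#include <linux/slab.h>

struct frob_ctx {
	struct work_struct work;	/* embedded work item */
	int payload;			/* driver-private data */
};

/* The work function now receives the work_struct itself... */
static void frob_work_fn(struct work_struct *work)
{
	/* ...and recovers its container instead of taking a void *context. */
	struct frob_ctx *ctx = container_of(work, struct frob_ctx, work);

	pr_info("frobbing payload %d\n", ctx->payload);
	kfree(ctx);
}

static int frob_schedule(int payload)
{
	struct frob_ctx *ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);

	if (!ctx)
		return -ENOMEM;
	ctx->payload = payload;
	INIT_WORK(&ctx->work, frob_work_fn);	/* no context pointer needed */
	schedule_work(&ctx->work);
	return 0;
}
```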
-
- 01 Nov 2006, 1 commit
-
By Jens Axboe
In very rare circumstances we could be pruning a merged request and at the same time deleting the implicated cfqq from the rr_list, without re-adding it when the merged request was added. This could cause io stalls until that process issued io again. Fix it up by putting the rr_list add handling into cfq_add_rq_rb(), identical to how pruning is handled in cfq_del_rq_rb(). This fixes a hang reproducible with fsx-linux.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 31 Oct 2006, 2 commits
-
By Jens Axboe
When the ioprio code recently got juggled a bit, a bug was introduced: changed_ioprio() is no longer called with interrupts disabled, so using plain spin_lock() on the queue_lock is a bug.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
By Jens Axboe
If cfq_set_request() is called for a new process AND a non-fs io request (so that __GFP_WAIT may not be set), cfq_cic_link() may use spin_lock_irq() and spin_unlock_irq() with interrupts already disabled. The fix is to always use irq-safe locking in cfq_cic_link().
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
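Both of these fixes reduce to the same locking rule, sketched here with a stand-alone lock: spin_unlock_irq() unconditionally re-enables interrupts, which is wrong when the caller already had them disabled, while the irqsave variant restores the previous interrupt state:

```c
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_lock);

/* Safe from any context: process, irq-disabled, or interrupt. */
static void example_link(void)
{
	unsigned long flags;

	spin_lock_irqsave(&example_lock, flags);
	/* ... manipulate the shared per-ioc / per-queue state ... */
	spin_unlock_irqrestore(&example_lock, flags);
}
```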
-
- 01 Oct 2006, 13 commits
-
By Peter Zijlstra
All on-stack DECLARE_COMPLETIONs should be replaced by DECLARE_COMPLETION_ONSTACK.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
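For reference, the usage difference is just the declaration (a minimal sketch; the surrounding function is invented):

```c
#include <linux/completion.h>

static void example_wait_for_event(void)
{
	/*
	 * A plain DECLARE_COMPLETION() on the stack is re-created at a
	 * new address on every call, which confuses lockdep; the _ONSTACK
	 * variant marks it as a fresh, on-stack object.
	 */
	DECLARE_COMPLETION_ONSTACK(done);

	/* normally &done is handed to another context that completes it */
	complete(&done);		/* stand-in for the async completer */
	wait_for_completion(&done);
}
```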
-
By Jens Axboe
As people often look for the copyright in files to see who to mail, update the link to a neutral one.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
By Jens Axboe
Give metadata reads preference over regular reads, as the process often needs to get that out of the way to do the io it was actually interested in.
Signed-off-by: Jens Axboe <axboe@suse.de>
-
By Jens Axboe
Don't touch the current queues; just make sure that the wanted queue is selected next. Simplifies the logic.
Signed-off-by: Jens Axboe <axboe@suse.de>
-
By Jens Axboe
CFQ implements this on its own now, but it's really block layer knowledge. Tells a device queue to start dispatching requests to the driver, taking care to unplug if needed. Also fixes the issue where as/cfq will invoke a stopped queue, which we really don't want.
Signed-off-by: Jens Axboe <axboe@suse.de>
-
By Jens Axboe
No point in having a placeholder list just for empty queues, so remove it. It's not used for anything other than to keep ->cfq_list busy.
Signed-off-by: Jens Axboe <axboe@suse.de>
-
By Jens Axboe
Currently it scales with the number of processes in that priority group, which is potentially not very nice as it's called quite often. Basically we always need to do tail inserts, except for the case of a new process. So just mark/detect a queue as such.
Signed-off-by: Jens Axboe <axboe@suse.de>
-
By Jens Axboe
Some were kmalloc_node(), some were still kmalloc(). Change them all to kmalloc_node().
Signed-off-by: Jens Axboe <axboe@suse.de>
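kmalloc_node() is kmalloc() plus a NUMA node id, so the conversions are mechanical; a one-line illustration:

```c
#include <linux/slab.h>

/* Allocate per-queue data on the same NUMA node as the device's queue. */
static void *example_alloc_on_node(size_t size, int node)
{
	return kmalloc_node(size, GFP_KERNEL, node);
}
```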
-
By Jens Axboe
Kill a few inlines that bring in too much code to more than one location. Shrinks kernel text by about 300 bytes on 32-bit x86.
Signed-off-by: Jens Axboe <axboe@suse.de>
-
By Jens Axboe
It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits.
Signed-off-by: Jens Axboe <axboe@suse.de>
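One standard shape for such a counter is per-cpu increments with a sum over all cpus on read; a sketch of the idea, not the exact implementation used here:

```c
#include <linux/percpu.h>

static DEFINE_PER_CPU(long, example_ioc_count);

/* Cheap and contention-free: runs on every io context alloc/free. */
static inline void example_count_inc(void)
{
	this_cpu_inc(example_ioc_count);
}

static inline void example_count_dec(void)
{
	this_cpu_dec(example_ioc_count);
}

/* Expensive read that sums every cpu's slot: only runs at queue exit. */
static long example_count_read(void)
{
	long total = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		total += per_cpu(example_ioc_count, cpu);
	return total;
}
```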
-
By Jens Axboe
cfq_exit_lock is protecting two things now:
- The per-ioc rbtree of cfq_io_contexts
- The per-cfqd linked list of cfq_io_contexts
The per-cfqd linked list can be protected by the queue lock, as it is (by definition) per-cfqd, just as the queue lock is. The per-ioc rbtree is mainly used and updated by the process itself; the only outside use is the io priority changing. If we move the priority changing to not browsing the rbtree, we can remove any locking from the rbtree updates and lookups completely. Let the sys_ioprio syscall just mark processes as having the iopriority changed, and lazily update the private cfq io contexts the next time io is queued; then we can remove this locking as well.
Signed-off-by: Jens Axboe <axboe@suse.de>
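The lazy-update idea in miniature (the struct and field names are assumptions for illustration):

```c
struct example_ioc {			/* per-process io context */
	int ioprio;			/* priority set via sys_ioprio_set() */
	unsigned int ioprio_changed:1;	/* flag set by the syscall path */
};

struct example_cic {			/* per-(process, queue) context */
	int ioprio;			/* cached copy the scheduler uses */
};

/* Runs in the submitting process's own context, so no rbtree locking. */
static void example_queue_io(struct example_ioc *ioc, struct example_cic *cic)
{
	if (ioc->ioprio_changed) {	/* lazily pick up the new priority */
		cic->ioprio = ioc->ioprio;
		ioc->ioprio_changed = 0;
	}
	/* ... queue the request at cic->ioprio ... */
}
```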
-
By Jens Axboe
A collection of little fixes and cleanups:
- We don't use the 'queued' sysfs exported attribute since the may_queue() logic was rewritten, so kill it.
- Remove dead defines.
- cfq_set_active_queue() can be rewritten more cleanly with else-if conditions.
- Several places had cfq_exit_cfqq()-like logic; abstract that out and use it.
- Annotate the cfqq kmem_cache_alloc() so the allocator knows that this is a repeat allocation if it fails with __GFP_WAIT set. This allows the allocator to start freeing some memory, if needed. CFQ already loops for this condition, so might as well pass the hint down.
- Remove the cfqd->rq_starved logic. It's not needed anymore after we dropped the crq allocation in cfq_set_request().
- Remove unneeded parameter passing.
Signed-off-by: Jens Axboe <axboe@suse.de>
-
By Jens Axboe
It's not needed for anything, so kill the bio passing.
Signed-off-by: Jens Axboe <axboe@suse.de>
-