- 28 January 2008, 3 commits
-
Submitted by Jens Axboe
The io context sharing introduced a per-ioc spinlock that would protect the cfq io context lookup. That is a regression from the original, since we never needed any locking there because the ioc/cic were process private. The cic lookup is changed from an rbtree construct to a radix tree, on which we can then use RCU to make the reader side lockless. That is the performance-critical path; modifying the radix tree is only done on process creation (when that process first does IO, actually) and on process exit (if that process has done IO). As it so happens, radix trees are also much faster for this type of lookup where the key is a pointer. It's a very sparse tree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
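[Editor's note: a minimal sketch of the lockless reader path described above, using the kernel's radix tree and RCU APIs. The radix_root field name and the helper are illustrative assumptions, not the exact code from the patch.]

#include <linux/radix-tree.h>
#include <linux/rcupdate.h>

/* Reader side: no per-ioc spinlock, just an RCU read-side section.
 * The key is a queue pointer, which is why the tree is very sparse.
 * Writers (process creation/exit) still serialize among themselves. */
static struct cfq_io_context *cic_lookup(struct io_context *ioc, void *key)
{
	struct cfq_io_context *cic;

	rcu_read_lock();
	cic = radix_tree_lookup(&ioc->radix_root, (unsigned long) key);
	rcu_read_unlock();

	return cic;
}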
-
Submitted by Nikanth Karthikesan
Changes in cfq for io_context sharing. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
This is where it belongs, and then it doesn't take up space for a process that doesn't do IO. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 18 December 2007, 1 commit
-
Submitted by Adrian Bunk
elv_register() always returns 0, and there isn't anything it does where it should return an error (the only error condition is so grave that it's handled with a BUG_ON). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
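[Editor's note: the visible effect of this change is just the prototype, sketched here as before/after.]

/* Before: a return value that could never be anything but 0 */
int elv_register(struct elevator_type *e);

/* After: the one fatal error case is a BUG_ON inside the function,
 * so the return type becomes void and callers stop checking it */
void elv_register(struct elevator_type *e);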
-
- 07 November 2007, 3 commits
-
Submitted by Oleg Nesterov
In theory, if the queue was idle long enough, cfq_idle_class_timer may have a false (and very long) timeout because jiffies can wrap into the past wrt ->last_end_request. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
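[Editor's note: a sketch of the wrap-safe idiom involved, using time_after() from <linux/jiffies.h>. The helper is hypothetical; CFQ's real code differs.]

#include <linux/jiffies.h>

static inline int cfq_idle_grace_expired(unsigned long last_end_request)
{
	/* Unsafe form: "last_end_request + CFQ_IDLE_GRACE < jiffies"
	 * yields a false (and very long) timeout once jiffies wraps
	 * into the past relative to the stored timestamp. */

	/* Wrap-safe form: time_after() compares the signed difference,
	 * so a wrapped jiffies value still orders correctly. */
	return time_after(jiffies, last_end_request + CFQ_IDLE_GRACE);
}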
-
Submitted by Oleg Nesterov
After a fresh boot:

ionice -c3 -p $$
echo cfq >> /sys/block/XXX/queue/scheduler
dd if=/dev/XXX of=/dev/null bs=512 count=1

Now dd hangs in D state and the queue is completely stalled for approximately INITIAL_JIFFIES + CFQ_IDLE_GRACE jiffies. This is because cfq_init_queue() forgets to initialize cfq_data->last_end_request. (I guess this patch is not complete; overflow is still possible.) Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Oleg Nesterov
Spotted by Nick <gentuu@gmail.com>; hopefully this explains the second trace in http://bugzilla.kernel.org/show_bug.cgi?id=9180. If ->async_idle_cfqq != NULL, cfq_put_async_queues() puts it IOPRIO_BE_NR times in a loop. Fix this. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
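[Editor's note: a sketch of the fixed teardown. The async_cfqq/async_idle_cfqq fields and the IOPRIO_BE_NR bound come from the surrounding commits; the exact function body is an approximation.]

static void cfq_put_async_queues(struct cfq_data *cfqd)
{
	int i;

	for (i = 0; i < IOPRIO_BE_NR; i++) {
		if (cfqd->async_cfqq[0][i])
			cfq_put_queue(cfqd->async_cfqq[0][i]);
		if (cfqd->async_cfqq[1][i])
			cfq_put_queue(cfqd->async_cfqq[1][i]);
	}

	/* The bug: this put sat inside the loop above, dropping
	 * IOPRIO_BE_NR references to a queue we hold only once. */
	if (cfqd->async_idle_cfqq)
		cfq_put_queue(cfqd->async_idle_cfqq);
}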
-
- 29 October 2007, 2 commits
-
Submitted by Oleg Nesterov
cfq_get_queue()->cfq_find_alloc_queue() can fail; check the returned value. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Note that this isn't a bug at the moment, since the regular IO path does not call this path without __GFP_WAIT set. However, it could be a future bug, so I've applied it. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
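[Editor's note: the caller-side check, sketched. The function names are from the commit; the argument list is an assumption.]

cfqq = cfq_find_alloc_queue(cfqd, is_sync, ioc, gfp_mask);
/* Without __GFP_WAIT the allocation inside can fail, so bail out
 * instead of dereferencing a NULL queue. */
if (!cfqq)
	return NULL;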
-
Submitted by Oleg Nesterov
Spotted by Nick <gentuu@gmail.com>; perhaps this explains the first trace in http://bugzilla.kernel.org/show_bug.cgi?id=9180. cfq_exit_queue() should cancel cfqd->unplug_work before freeing cfqd. blk_sync_queue() seems unneeded, so it is removed. Q: why does cfq_exit_queue() call cfq_shutdown_timer_wq() twice? Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 24 July 2007, 1 commit
-
Submitted by Jens Axboe
Some of the code has been gradually transitioned to using the proper struct request_queue, but there's lots left. So do a full sweep of the kernel, get rid of this typedef, and replace its uses with the proper type. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
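[Editor's note: the mechanical shape of the sweep, illustrated with one representative declaration.]

/* Before: a typedef that hides the struct */
request_queue_t *q = bdev_get_queue(bdev);

/* After: the plain struct type, spelled out everywhere */
struct request_queue *q = bdev_get_queue(bdev);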
-
- 20 July 2007, 2 commits
-
Submitted by Alexey Dobriyan
There are some leftover bits from the task cooperator patch that was yanked out again. While it will get reintroduced, there's no point in having this write-only stuff in the tree. So yank it. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Vasily Tarasov
If we have two processes with different ioprio_class but the same ioprio_data, their async requests will fall into the same queue. I guess such behavior is not expected, because it's not right to put real-time requests and best-effort requests in the same queue. The attached patch fixes the problem by introducing additional *cfqq fields on cfqd, pointing to per-(class, priority) async queues. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
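[Editor's note: a sketch of the new per-(class, priority) pointers on cfq_data, following the commit's description; surrounding fields are elided.]

struct cfq_data {
	/* ... existing fields elided ... */

	/* async queues, one per (ioprio class, ioprio) pair:
	 * index 0 = RT class, index 1 = BE class */
	struct cfq_queue *async_cfqq[2][IOPRIO_BE_NR];
	/* the idle class has no priority levels; one queue suffices */
	struct cfq_queue *async_idle_cfqq;
};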
-
- 18 July 2007, 1 commit
-
Submitted by Christoph Lameter
kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing variant in the past. But with __GFP_ZERO it is now possible to do zeroing while allocating. Use __GFP_ZERO to remove the explicit clearing of memory via memset wherever we can. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
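[Editor's note: the pattern as it applies to an allocation like CFQ's; a sketch, exact call sites vary.]

/* Before: allocate, then clear by hand */
cfqd = kmalloc_node(sizeof(*cfqd), GFP_KERNEL, q->node);
if (cfqd)
	memset(cfqd, 0, sizeof(*cfqd));

/* After: let the allocator hand back zeroed memory in one step */
cfqd = kmalloc_node(sizeof(*cfqd), GFP_KERNEL | __GFP_ZERO, q->node);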
-
- 10 July 2007, 1 commit
-
Submitted by Jens Axboe
With the cfq_queue hash removal, we inadvertently got rid of the async queue sharing. This was not intentional; in fact CFQ purposely shares the async queue per priority level to get good merging for async writes. So put some logic in cfq_get_queue() to track the shared queues. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 08 May 2007, 1 commit
-
Submitted by Christoph Lameter
This patch provides a new macro KMEM_CACHE(<struct>, <flags>) to simplify slab creation. KMEM_CACHE creates a slab with the name of the struct, with the size of the struct and with the alignment of the struct. Additional slab flags may be specified if necessary. Example:

struct test_slab {
	int a, b, c;
	struct list_head;
} __cacheline_aligned_in_smp;

test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC);

This will create a new slab named "test_slab" of the size sizeof(struct test_slab) and aligned to the alignment of struct test_slab. If it fails then we panic. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 April 2007, 17 commits
-
Submitted by Jens Axboe
We often look up the same queue many times in succession, so cache the last looked-up queue to avoid browsing the rbtree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Vasily Tarasov
The cfq hash is no longer necessary: we can always get the cfqq from the io context. The cfq_get_io_context_noalloc() function is introduced, because we don't want to allocate a cic on merging and when checking may_queue. To identify a sync queue we used the hash key CFQ_KEY_ASYNC; since the hash is eliminated we need another criterion, so a sync flag is added to the queue. In all places where we dig in the rb-tree we're in current context, so no additional locking is required.

Advantages of this patch: no additional memory for the hash, no seeking in the hash, and the code is cleaner. It is now necessary to seek the cic in the per-ioc rbtree instead, but that is faster:

- most processes work with only a few devices
- most systems have only a few block devices
- it is an rb-tree

Signed-off-by: Vasily Tarasov <vtaras@openvz.org>

Changes by me:

- Merge into CFQ devel branch
- Get rid of cfq_get_io_context_noalloc()
- Fix various bugs with dereferencing cic->cfqq[] with offset other than 0 or 1
- Fix bug in cfqq setup, is_sync condition was reversed
- Fix bug where only bio_sync() is used, we need to check for a READ too

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
For tagged devices, allow overlap of requests if the idle window isn't enabled on the current active queue. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
We don't enable it by default; don't let it get enabled during runtime. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
We can track it fairly accurately locally; let the slice handling take care of the rest. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
We don't use it anymore in the slice expiry handling. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
It's only used for preemption now that the IDLE and RT queues also use the rbtree. If we pass an 'add_front' variable to cfq_service_tree_add(), we can set ->rb_key to 0 to force insertion at the front of the tree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
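[Editor's note: a sketch of the mechanism. cfq_service_tree_add() and the rb_key ordering come from the commit; the offset helper and the elided insertion are approximations.]

static void cfq_service_tree_add(struct cfq_data *cfqd,
				 struct cfq_queue *cfqq, int add_front)
{
	unsigned long rb_key;

	if (add_front) {
		/* rb_key 0 sorts before every live key, so a preempting
		 * queue lands at the front of the service tree */
		rb_key = 0;
	} else {
		rb_key = cfq_slice_offset(cfqd, cfqq) + jiffies;
	}

	/* rb-tree insertion ordered by rb_key elided */
}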
-
Submitted by Jens Axboe
Use max_slice - cur_slice as the multiplier for the insertion offset. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Same treatment as the RT conversion: just put the sorted idle branch at the end of the tree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Currently CFQ does a linked insert into the current list for RT queues. We can just factor the class into the rb insertion, and then we don't have to treat RT queues in a special way. It's faster, too. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
For cases where the rbtree is mainly used for sorting and min retrieval, a nice speedup of the rbtree code is to maintain a cache of the leftmost node in the tree. Also spotted in the CFS CPU scheduler code. Improved by Alan D. Brunelle <Alan.Brunelle@hp.com> by updating the leftmost hint in cfq_rb_first() if it isn't set, instead of only updating it on insert. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
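[Editor's note: a sketch of the cached-leftmost rbtree, including the lazy repair in cfq_rb_first() credited above; names follow the commit's description.]

#include <linux/rbtree.h>

struct cfq_rb_root {
	struct rb_root rb;
	struct rb_node *left;	/* cached leftmost node, NULL if unknown */
};

static struct rb_node *cfq_rb_first(struct cfq_rb_root *root)
{
	/* repair the hint lazily here, not only at insert time */
	if (!root->left)
		root->left = rb_first(&root->rb);

	return root->left;
}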
-
Submitted by Jens Axboe
Drawing on some inspiration from the CFS CPU scheduler design, overhaul the pending cfq_queue concept list management. Currently CFQ uses a doubly linked list per priority level for sorting and servicing. Kill those lists and maintain an rbtree of cfq_queues, sorted by when to service them. This unfortunately means that the ionice levels aren't as strong anymore; will work on improving those later. We only scale the slice time now, not the number of times we service. This means that latency is better (for all priority levels), but that the distinction between the highest and lower levels isn't as big. The diffstat speaks for itself:

cfq-iosched.c | 363 +++++++++++++++++---------------------------------
1 file changed, 125 insertions(+), 238 deletions(-)

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe

- Move the queue_new flag clear to when the queue is selected
- Only select the non-first queue in cfq_get_best_queue() if there's a substantial difference between the best and first
- Get rid of ->busy_rr
- Only select a close cooperator if the current queue is known to take a while to "think"

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe

- Implement logic for detecting cooperating processes, so we choose the best available queue whenever possible.
- Improve residual slice time accounting.
- Remove dead code: we no longer see async requests coming in on sync queues. That part was removed a long time ago. That means we can also remove the difference between cfq_cfqq_sync() and cfq_cfqq_class_sync(); they are now identical. And we can kill the on_dispatch array, just make it a counter.
- Allow a process to go into the current list if it hasn't been serviced in this scheduler tick yet.

Possible future improvements include caching the cfqq lookup in cfq_close_cooperator(), so we don't have to look it up twice. cfq_get_best_queue() should just use that last decision instead of doing it again.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
When testing the syslet async io approach, I discovered that CFQ sometimes didn't perform as well as expected. cfq_should_preempt() needs to better check for cooperating tasks, so fix that by allowing preemption of an equal-priority queue if the recently queued request is as good a candidate for IO as the one we are currently waiting for. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 25 April 2007, 1 commit
-
Submitted by Jens Axboe
There's a really rare and obscure bug in CFQ that causes a crash in cfq_dispatch_insert() due to rq == NULL. One example of the resulting oops is seen here: http://lkml.org/lkml/2007/4/15/41 Neil correctly diagnosed the situation for how this can happen: if two concurrent requests arrive with the exact same sector number (due to direct IO or aliasing between MD and the raw device access), the alias handling will add the request to the sortlist, but next_rq remains NULL. Read the more complete analysis at: http://lkml.org/lkml/2007/4/25/57 This looks like it requires md to trigger, even though it should potentially be possible to do with O_DIRECT (at least if you edit the kernel and doctor some of the unplug calls). The fix is to move the ->next_rq update to when we add a request to the rbtree. Then we remove the possibility for a request to exist in the rbtree but not have ->next_rq correctly updated. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 21 April 2007, 1 commit
-
Submitted by Jens Axboe
We have a 10-15% performance regression for sequential writes on TCQ/NCQ-enabled drives in 2.6.21-rcX after the CFQ update went in. It has been reported by Valerie Clement <valerie.clement@bull.net> and the Intel testing folks. The regression is because of CFQ's now more aggressive queue control, limiting the depth available to the device. This patch fixes that regression by allowing a greater depth when only one queue is busy. It has been tested to not impact sync-vs-async workloads too much; we still do a lot better than 2.6.20. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 12 February 2007, 6 commits
-
Submitted by Jens Axboe
This improves performance considerably for sync requests when you have command queuing enabled. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
We only really need it for a process going away, so move it to those locations. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Makes it fairer for the residual slice count. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
We currently check the FIFO once per slice. Optimize that a bit and only do it as the first thing for a new slice, so we don't end up doing a single request and then seeking to the FIFO requests. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
It must always be the active queue; otherwise it's a bug. So just use active_queue, and don't pass it in explicitly. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
If a slice uses less than it is entitled to (or perhaps more), include that in the decision on how much time to give it the next time it gets serviced. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-