Commits · 4ac845a2e9a816ed5a7b301f56dcc0a3d0b1ba4d · openanolis / cloud-kernel

28 Jan, 2008 4 commits

block: cfq: make the io contect sharing lockless · 4ac845a2

Authored by Jens Axboe on Jan 24, 2008

The io context sharing introduced a per-ioc spinlock, that would protect
the cfq io context lookup. That is a regression from the original, since
we never needed any locking there because the ioc/cic were process private.

The cic lookup is changed from an rbtree construct to a radix tree, which
we can then use RCU to make the reader side lockless. That is the performance
critical path, modifying the radix tree is only done on process creation
(when that process first does IO, actually) and on process exit (if that
process has done IO).

As it so happens, radix trees are also much faster for this type of
lookup where the key is a pointer. It's a very sparse tree.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

4ac845a2

io_context sharing - cfq changes · 66dac98e

Authored by Nikanth Karthikesan on Nov 27, 2007

changes in the cfq for io_context sharing
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

66dac98e

io context sharing: preliminary support · d38ecf93

Authored by Jens Axboe on Jan 24, 2008

Detach task state from ioc, instead keep track of how many processes
are accessing the ioc.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d38ecf93

ioprio: move io priority from task_struct to io_context · fd0928df

Authored by Jens Axboe on Jan 24, 2008

This is where it belongs and then it doesn't take up space for a
process that doesn't do IO.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

fd0928df

25 Jan, 2008 7 commits

Kobject: rename kobject_init_ng() to kobject_init() · f9cb074b

Authored by Greg Kroah-Hartman on Dec 17, 2007

Now that the old kobject_init() function is gone, rename
kobject_init_ng() to kobject_init() to clean up the namespace.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f9cb074b

Kobject: rename kobject_add_ng() to kobject_add() · b2d6db58

Authored by Greg Kroah-Hartman on Dec 17, 2007

Now that the old kobject_add() function is gone, rename kobject_add_ng()
to kobject_add() to clean up the namespace.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b2d6db58

Kobject: convert block/ll_rw_blk.c to use kobject_init/add_ng() · d5a379f7

Authored by Greg Kroah-Hartman on Dec 17, 2007

This converts the code to use the new kobject functions, cleaning up the
logic in doing so.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

d5a379f7

Kobject: convert block/elevator.c to use kobject_init/add_ng() · 29e3dd0d

Authored by Greg Kroah-Hartman on Dec 17, 2007

This converts the code to use the new kobject functions, cleaning up the
logic in doing so.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

29e3dd0d

Driver core: convert block from raw kobjects to core devices · edfaa7c3

Authored by Kay Sievers on May 21, 2007

This moves the block devices to /sys/class/block. It will create a
flat list of all block devices, with the disks and partitions in one
directory. For compatibility /sys/block is created and contains symlinks
to the disks.

  /sys/class/block
  |-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
  |-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1
  |-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10
  |-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5
  |-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6
  |-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7
  |-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8
  |-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9
  `-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0

  /sys/block/
  |-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
  `-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

edfaa7c3

kset: convert block_subsys to use kset_create · 830d3cfb

Authored by Greg Kroah-Hartman on Nov 06, 2007

Dynamically create the kset instead of declaring it statically.  We also
rename block_subsys to block_kset to catch all users of this symbol
with a build error instead of an easy-to-ignore build warning.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

830d3cfb

kobject: remove struct kobj_type from struct kset · 3514faca

Authored by Greg Kroah-Hartman on Oct 16, 2007

We don't need a "default" ktype for a kset.  We should set this
explicitly every time for each kset.  This change is needed so that we
can make ksets dynamic, and cleans up one of the odd, undocumented
assumptions that the kset/kobject/ktype model has.

This patch is based on a lot of help from Kay Sievers.

Nasty bug in the block code was found by Dave Young
<hidave.darkstar@gmail.com>

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

3514faca

12 Jan, 2008 2 commits

[SCSI] block: Introduce new blk_queue_update_dma_alignment interface · 11c3e689

Authored by James Bottomley on Dec 31, 2007

The purpose of this is to allow stacked alignment settings, with the
ultimate queue alignment being set to the largest alignment requirement
in the stack.

The reason for this is so that the SCSI mid-layer can relax the default
alignment requirements (which are basically causing a lot of superfluous
copying to go on in the SG_IO interface) while allowing transports,
devices or HBAs to add stricter limits if they need them.
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

11c3e689
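
The stacking rule described in 11c3e689 (the final queue alignment is the largest requirement in the stack) can be illustrated with a small userspace sketch. The struct and function names below are hypothetical stand-ins, not the kernel's definitions; the only behaviour taken from the commit message is that an update may tighten the alignment but never relax it.

```c
#include <assert.h>

/* Hypothetical stand-in for a request queue; only the field we need. */
struct queue_sketch {
	unsigned int dma_alignment; /* alignment mask, e.g. 511 for 512 bytes */
};

/* Plain setter: whoever calls last wins, even if that relaxes the limit. */
static void set_dma_alignment_sketch(struct queue_sketch *q, unsigned int mask)
{
	assert(q);
	q->dma_alignment = mask;
}

/* Stacked update: a layer can only make the requirement stricter. */
static void update_dma_alignment_sketch(struct queue_sketch *q,
					unsigned int mask)
{
	assert(q);
	if (mask > q->dma_alignment)
		q->dma_alignment = mask;
}
```

In this sketch a mid-layer could set a relaxed default with the plain setter, and each transport, device or HBA would then call the update variant; the order of those later calls does not matter because the largest mask always survives.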

[SCSI] libsas, bsg: pass errors through correctly · 2d507a01

Authored by James Bottomley on Dec 29, 2007

Currently in BSG, errors returned in req->errors aren't passed back to
the calling programme (either via SG_IO or via read/write).  Fix this,
while preserving the SCSI convention of returning status in
req->errors.

Now update libsas to return errors correctly instead of ignoring
them.
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

2d507a01

11 Jan, 2008 2 commits

blktrace: kill the unneeded initcall · 11a57153

Authored by Jens Axboe on Jan 11, 2008

It just inits the mutex, we can do that with DEFINE_MUTEX() instead.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

11a57153

block: fix blktrace timestamps · 2997c8c4

Authored by Ingo Molnar on Jan 11, 2008

David Dillow reported broken blktrace timestamps. The reason
is cpu_clock() which is not a global time source.

Fix blktrace timestamps by using ktime_get() like the networking
code does for packet timestamps. This also removes a whole lot
of complexity from blktrace.c and shrinks the code by 500 bytes:

   text    data     bss     dec     hex filename
   2888     124      44    3056     bf0 blktrace.o.before
   2390     116      44    2550     9f6 blktrace.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2997c8c4

18 Dec, 2007 4 commits

block: let elv_register() return void · 2fdd82bd

Authored by Adrian Bunk on Dec 12, 2007

elv_register() always returns 0, and there isn't anything it does where
it should return an error (the only error condition is so grave that
it's handled with a BUG_ON).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2fdd82bd

as-iosched: fix write batch start point · 49565124

Authored by Aaron Carroll on Dec 05, 2007

New write batches currently start from where the last one completed.
We have no idea where the head is after switching batches, so this
makes little sense.  Instead, start the next batch from the request
with the earliest deadline in the hope that we avoid a deadline
expiry later on.
Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

49565124

as-iosched: fix incorrect comments · 8896f3c0

Authored by Aaron Carroll on Dec 05, 2007

Two comments refer to deadlines applying to reads only.  This is
not the case.
Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

8896f3c0

block: use jiffies conversion functions in scsi_ioctl.c · 24bb8fb9

Authored by Tejun Heo on Dec 05, 2007

Use msecs_to_jiffies() and jiffies_to_msecs() in scsi_ioctl().
Sometimes callers use very large values, e.g. for a vendor-specific media
clear command, and the calculation can overflow.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

24bb8fb9
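
The overflow mentioned in 24bb8fb9 is the generic hazard of multiplying before dividing in a fixed-width type. The sketch below is a userspace illustration only: the tick rate `SKETCH_HZ` and both function names are assumptions, and the widened variant is not the kernel's msecs_to_jiffies() implementation, just one way to avoid the wraparound.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ 250u /* assumed tick rate, ticks per second */

/* Open-coded conversion: ms * HZ is computed in 32 bits and can wrap. */
static uint32_t naive_ms_to_ticks(uint32_t ms)
{
	return (ms * SKETCH_HZ) / 1000u;
}

/* Same conversion with a 64-bit intermediate, rounding up to a full tick. */
static uint32_t widened_ms_to_ticks(uint32_t ms)
{
	uint64_t ticks = ((uint64_t)ms * SKETCH_HZ + 999u) / 1000u;

	/* With SKETCH_HZ <= 1000 the result always fits in 32 bits. */
	assert(ticks <= UINT32_MAX);
	return (uint32_t)ticks;
}
```

For small inputs both functions agree; for a timeout of a few hours expressed in milliseconds the 32-bit product wraps and the open-coded version returns a far shorter timeout than requested.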

27 Nov, 2007 3 commits

Revert "ll_rw_blk: temporarily enable max_segments tweaking" · 7c9f29b1

Authored by Jens Axboe on Nov 27, 2007

This was a temporary debugging thing for sg chaining testing, revert
it now as it has served its purpose.

This reverts commit 563063a8.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

7c9f29b1

block: Fix memory leak in alloc_disk_node() · c7674030

Authored by Jerome Marchand on Nov 23, 2007

Fix a memory leak in alloc_disk_node(). Don't forget to free 'dkstats' when the allocation of 'part' fails.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c7674030
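
The leak pattern fixed in c7674030 (an earlier allocation is not released when a later one fails) can be sketched in userspace as follows. Everything here is hypothetical scaffolding: `disk_sketch`, its `stats` and `parts` fields (standing in for 'dkstats' and 'part'), and the allocation counter exist only to make the unwind order observable.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs; /* number of allocations not yet freed */

static void *tracked_alloc(size_t size, int force_fail)
{
	void *p = force_fail ? NULL : calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (!p)
		return;
	assert(live_allocs > 0);
	live_allocs--;
	free(p);
}

struct disk_sketch {
	int *stats; /* stands in for 'dkstats' */
	int *parts; /* stands in for 'part' */
};

/* Each failure path unwinds everything allocated before it. */
static struct disk_sketch *alloc_disk_sketch(int fail_parts)
{
	struct disk_sketch *disk = tracked_alloc(sizeof(*disk), 0);

	if (!disk)
		return NULL;
	disk->stats = tracked_alloc(sizeof(*disk->stats), 0);
	if (!disk->stats)
		goto free_disk;
	disk->parts = tracked_alloc(4 * sizeof(*disk->parts), fail_parts);
	if (!disk->parts)
		goto free_stats; /* the step the leak skipped */
	return disk;

free_stats:
	tracked_free(disk->stats);
free_disk:
	tracked_free(disk);
	return NULL;
}

static void free_disk_sketch(struct disk_sketch *disk)
{
	tracked_free(disk->parts);
	tracked_free(disk->stats);
	tracked_free(disk);
}
```

Calling `alloc_disk_sketch(1)` forces the last allocation to fail; with the `free_stats` label in place the counter returns to zero, whereas jumping straight to `free_disk` would leave one allocation outstanding.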

blktrace: Make sure BLKTRACETEARDOWN does the full cleanup. · 35fc51e7

Authored by Aneesh Kumar K.V on Nov 21, 2007

If the blktrace program segfaults, it will not be able
to call BLKTRACETEARDOWN. If we then run blktrace
again, it fails to create the block/<device> debugfs
directory. This results in blk_remove_root() being
called, which sets blk_tree_root to NULL. But the
debugfs block dir still exists because it contains
a subdirectory.

Now if we try to fix it using BLKTRACETEARDOWN,
it won't work because blk_tree_root is NULL.

Fix the same.

Tested as below

root@qemu-image:/home/kvaneesh/blktrace# ./blktrace  -d /dev/hdc
Segmentation fault
root@qemu-image:/home/kvaneesh/blktrace# ./blktrace  -d /dev/hdc
BLKTRACESETUP: No such file or directory
Failed to start trace on /dev/hdc
root@qemu-image:/home/kvaneesh/blktrace# ./blktrace  -k /dev/hdc
root@qemu-image:/home/kvaneesh/blktrace# ./blktrace  -d /dev/hdc
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

35fc51e7

09 Nov, 2007 2 commits

Add UNPLUG traces to all appropriate places · 2ad8b1ef

Authored by Alan D. Brunelle on Nov 07, 2007

Added blk_unplug interface, allowing all invocations of unplugs to result
in a generated blktrace UNPLUG.
Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2ad8b1ef

block: fix requeue handling in blk_queue_invalidate_tags() · d85532ed

Authored by Jens Axboe on Nov 09, 2007

Credit goes to juergen.kadidlo@exasol.com for diagnosing this issue
and supplying the initial patch.

blk_queue_invalidate_tags() must use the proper requeueing paths instead
of open coding the re-add of the request, otherwise we bug out in rq
accounting. Just switch to using blk_requeue_request(), that takes care
of end-tag handling as well and also adds the blktrace REQUEUE notify
event that is also appropriate here.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d85532ed

07 Nov, 2007 3 commits

cfq_idle_class_timer: add paranoid checks for jiffies overflow · 0e7be9ed

Authored by Oleg Nesterov on Nov 07, 2007

In theory, if the queue was idle long enough, cfq_idle_class_timer may have
a false (and very long) timeout because jiffies can wrap into the past wrt
->last_end_request.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

0e7be9ed
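
The failure mode described in 0e7be9ed can be reproduced with plain unsigned arithmetic. The sketch below is not the patch's code; the function names are hypothetical, and it only assumes the usual signed-difference idiom for comparing tick counters (which relies on two's-complement conversion). A signed difference is only meaningful when the two values are less than half the counter range apart, so a queue that has been idle for longer than that looks as if its deadline were still in the future.

```c
#include <assert.h>
#include <limits.h>

/*
 * Signed-difference test alone: "has the grace period not ended yet?"
 * Gives a false "still pending" if now is more than half the counter
 * range past the deadline.
 */
static int grace_pending_naive(unsigned long now, unsigned long last_end,
			       unsigned long grace)
{
	return (long)(now - (last_end + grace)) < 0;
}

/*
 * Paranoid variant: additionally require that now has not run away from
 * last_end, i.e. last_end <= now < last_end + grace in wrap-safe terms.
 * Anything outside that window is treated as already expired.
 */
static int grace_pending_checked(unsigned long now, unsigned long last_end,
				 unsigned long grace)
{
	unsigned long end = last_end + grace;

	assert(grace <= (unsigned long)LONG_MAX);
	return (long)(now - last_end) >= 0 && (long)(now - end) < 0;
}
```

Inside the grace window and shortly after it, the two functions agree. They differ only for a very stale `last_end`, where the naive test would arm a timer with an enormous timeout and the checked one reports the period as over.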

cfq: fix IOPRIO_CLASS_IDLE delays · b70c864d

Authored by Oleg Nesterov on Nov 07, 2007

After the fresh boot:

	ionice -c3 -p $$
	echo cfq >> /sys/block/XXX/queue/scheduler
	dd if=/dev/XXX of=/dev/null bs=512 count=1

Now dd hangs in D state and the queue is completely stalled for approximately
INITIAL_JIFFIES + CFQ_IDLE_GRACE jiffies. This is because cfq_init_queue()
forgets to initialize cfq_data->last_end_request.

(I guess this patch is not complete, overflow is still possible)
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b70c864d

cfq: fix IOPRIO_CLASS_IDLE accounting · 2389d1ef

Authored by Oleg Nesterov on Nov 05, 2007

Spotted by Nick <gentuu@gmail.com>, hopefully can explain the second trace in
http://bugzilla.kernel.org/show_bug.cgi?id=9180.

If ->async_idle_cfqq != NULL cfq_put_async_queues() puts it IOPRIO_BE_NR times
in a loop. Fix this.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2389d1ef
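
The accounting bug in 2389d1ef (a single queue being put once per loop iteration) can be modelled with a bare reference counter. This is a simplified userspace sketch: the struct layout, the names, and the value 8 for the number of priority levels are assumptions for illustration, not the kernel's data structures.

```c
#include <assert.h>

#define NR_PRIO_SKETCH 8 /* assumed number of best-effort priority levels */

struct cfqq_sketch {
	int ref; /* reference count */
};

struct cfqd_sketch {
	struct cfqq_sketch *async[NR_PRIO_SKETCH]; /* one queue per priority */
	struct cfqq_sketch *async_idle;            /* single idle-class queue */
};

static void put_queue_sketch(struct cfqq_sketch *q)
{
	q->ref--;
}

/* Buggy shape: the idle queue is put inside the per-priority loop. */
static void put_async_buggy(struct cfqd_sketch *d)
{
	assert(d);
	for (int i = 0; i < NR_PRIO_SKETCH; i++) {
		if (d->async[i])
			put_queue_sketch(d->async[i]);
		if (d->async_idle)
			put_queue_sketch(d->async_idle);
	}
}

/* Fixed shape: the idle queue is put exactly once, after the loop. */
static void put_async_fixed(struct cfqd_sketch *d)
{
	assert(d);
	for (int i = 0; i < NR_PRIO_SKETCH; i++)
		if (d->async[i])
			put_queue_sketch(d->async[i]);
	if (d->async_idle)
		put_queue_sketch(d->async_idle);
}
```

With an idle queue holding one reference, the fixed version leaves the count at zero while the buggy one drives it negative, which in real code would mean a premature free or corrupted accounting.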

02 Nov, 2007 5 commits

[BLOCK] Don't allow empty barriers to be passed down to queues that don't grok them · 51fd77bd

Authored by Jens Axboe on Nov 02, 2007

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

51fd77bd

Deadline iosched: Fix batching fairness · 6f5d8aa6

Authored by Aaron Carroll on Oct 30, 2007

After switching data directions, deadline always starts the next batch
from the lowest-sector request.  This gives excessive deadline expiries
and large latency and throughput disparity between high- and low-sector
requests; an order of magnitude in some tests.

This patch changes the batching behaviour so new batches start from the
request whose expiry is earliest.
Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6f5d8aa6
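
The two batch-start policies contrasted in 6f5d8aa6 can be compared on a tiny array of pending requests. This is a sketch under stated assumptions: `req_sketch` and both picker functions are hypothetical, the linear scans stand in for whatever ordered structures a real scheduler keeps, and deadlines are compared with a plain `<` that ignores counter wraparound.

```c
#include <assert.h>
#include <stddef.h>

struct req_sketch {
	unsigned long sector;   /* position on disk */
	unsigned long deadline; /* expiry time, smaller means sooner */
};

/* Old policy: start the new batch at the lowest-sector request. */
static size_t pick_lowest_sector(const struct req_sketch *r, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (r[i].sector < r[best].sector)
			best = i;
	return best;
}

/* New policy: start the new batch at the request that expires first. */
static size_t pick_earliest_deadline(const struct req_sketch *r, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (r[i].deadline < r[best].deadline)
			best = i;
	return best;
}
```

When the request closest to expiry sits at a high sector, the first policy keeps serving low sectors and lets it expire, while the second starts the batch there.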

Deadline iosched: Reset batch for ordered requests · dfb3d72a

Authored by Aaron Carroll on Oct 30, 2007

The deadline I/O scheduler does not reset the batch count when starting
a new batch at a higher-sectored request. This means the second and
subsequent batch in the same data direction will never exceed a single
request in size whenever higher-sectored requests are pending.

This patch gives new batches in the same data direction as old ones
their full quota of requests by resetting the batch count.
Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

dfb3d72a

Deadline iosched: Factor out finding latter reques · 5d1a5366

Authored by Aaron Carroll on Oct 30, 2007

Factor finding the next request in sector-sorted order into
a function deadline_latter_request.
Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5d1a5366

[SG] Get rid of __sg_mark_end() · c46f2334

Authored by Jens Axboe on Oct 31, 2007

sg_mark_end() overwrites the page_link information, but all users want
__sg_mark_end() behaviour where we just set the end bit. That is the most
natural way to use the sg list, since you'll fill it in and then mark the
end point.

So change sg_mark_end() to only set the termination bit. Add a sg_magic
debug check as well, and clear a chain pointer if it is set.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c46f2334
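
The difference described in c46f2334 between overwriting page_link and only setting the end bit comes down to a few bit operations on a tagged pointer. The sketch below is a userspace model, not the kernel header: the struct, the function names and the specific bit assignments (bit 0 for "chain", bit 1 for "end") are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SG_CHAIN_BIT ((uintptr_t)0x01) /* assumed: entry is a chain pointer */
#define SG_END_BIT   ((uintptr_t)0x02) /* assumed: entry terminates the list */

struct sg_sketch {
	uintptr_t page_link; /* page pointer with flag bits in the low bits */
};

/* Overwriting variant: the page pointer stored in the entry is lost. */
static void mark_end_overwrite(struct sg_sketch *sg)
{
	assert(sg);
	sg->page_link = SG_END_BIT;
}

/* Bit-only variant: set the end bit, clear a chain bit, keep the page. */
static void mark_end_keep_page(struct sg_sketch *sg)
{
	assert(sg);
	sg->page_link |= SG_END_BIT;
	sg->page_link &= ~SG_CHAIN_BIT;
}

/* Strip the flag bits to recover the stored pointer value. */
static uintptr_t page_bits(const struct sg_sketch *sg)
{
	return sg->page_link & ~(SG_CHAIN_BIT | SG_END_BIT);
}
```

The bit-only variant matches the fill-then-terminate usage the message describes: an entry can be populated first and marked as the end afterwards without losing its page.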

29 Oct, 2007 7 commits

compat_ioctl: fix block device compat ioctl regression · 33013a88

Authored by Philip Langdale on Oct 27, 2007

The conversion of handlers to compat_blkdev_ioctl accidentally
disabled handling of most ioctl numbers on block devices because
of a typo. Fix the one line to enable it all again.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>

33013a88

[BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps · 6eca9004

Authored by Jens Axboe on Oct 25, 2007

For the locking to work, only the tag map and tag bit map may be shared
(incidentally, I was just explaining this to Nick yesterday, but I
apparently didn't review the code well enough myself). But we also share
the busy list!  The busy_list must be queue private, or we need a
block_queue_tag covering lock as well.

So we have to move the busy_list to the queue. This'll work fine, and
it'll actually also fix a problem with blk_queue_invalidate_tags() which
will invalidate tags across all shared queues. This is a bit confusing;
the low level driver should call it for each queue separately, since
otherwise you cannot kill tags on just a single queue for, e.g., a hard
drive that stops responding. Since the function has no callers
currently, it's not an issue.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6eca9004

block: use lock bitops for the tag map. · adb4ddbb

Authored by Nick Piggin on Oct 24, 2007

The block queue tag map can use lock bitops.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

adb4ddbb

cfq_get_queue: fix possible NULL pointer access · 0a0836a0

Authored by Oleg Nesterov on Oct 23, 2007

cfq_get_queue()->cfq_find_alloc_queue() can fail, check the returned value.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

Note that this isn't a bug at the moment, since the regular IO path
does not call this path without __GFP_WAIT set. However, it could be a
future bug, so I've applied it.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

0a0836a0

blk_sync_queue() should cancel request_queue->unplug_work · abbeb88d

Authored by Oleg Nesterov on Oct 23, 2007

blk_sync_queue() cancels the timer, but forgets to cancel the work.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

abbeb88d

cfq_exit_queue() should cancel cfq_data->unplug_work · 4310864b

Authored by Oleg Nesterov on Oct 23, 2007

Spotted by Nick <gentuu@gmail.com>, perhaps explains the first trace in
http://bugzilla.kernel.org/show_bug.cgi?id=9180.

cfq_exit_queue() should cancel cfqd->unplug_work before freeing cfqd.
blk_sync_queue() seems unneeded, removed.

Q: why cfq_exit_queue() calls cfq_shutdown_timer_wq() twice?
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

4310864b

block layer: remove a unused argument of drive_stat_acct() · b238b3d4

Authored by Jerome Marchand on Oct 23, 2007

The nr_sector argument of drive_stat_acct() is not used anymore since the read and write sectors statistics are now updated in end_that_request_first(). This patch removes the useless argument.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b238b3d4

24 Oct, 2007 1 commit

SG: Change sg_set_page() to take length and offset argument · 642f1490

Authored by Jens Axboe on Oct 24, 2007

Most drivers need to set length and offset as well, so may as well fold
those three lines into one.

Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

642f1490
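
The API shape described in 642f1490 (one call that sets page, length and offset, plus a page-only helper) can be sketched as below. The types and the `_sketch` names are hypothetical userspace stand-ins; only the idea of folding three assignments into one call, and the argument order of page, length, offset, is taken from the commit title and message.

```c
#include <assert.h>

struct page_sketch {
	int id; /* placeholder payload */
};

struct sg_entry_sketch {
	struct page_sketch *page;
	unsigned int offset;
	unsigned int length;
};

/* Page-only helper, for callers that set offset and length elsewhere. */
static void assign_page_sketch(struct sg_entry_sketch *sg,
			       struct page_sketch *page)
{
	assert(sg);
	sg->page = page;
}

/* Combined setter: replaces three separate assignments at each call site. */
static void set_page_sketch(struct sg_entry_sketch *sg,
			    struct page_sketch *page,
			    unsigned int len, unsigned int offset)
{
	assign_page_sketch(sg, page);
	sg->offset = offset;
	sg->length = len;
}
```

A typical caller would then write `set_page_sketch(&sg, &pg, 512, 64);` instead of assigning the three fields one by one, and fall back to `assign_page_sketch()` only where offset and length are filled in by other code.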

openanolis / cloud-kernel synced successfully more than 1 year ago
