Commits · 963b72fc6664be12ea52f35a6addea14ec373433 · openeuler / raspberrypi-kernel

04 Oct, 2009 1 commit

cfq-iosched: rename 'desktop' sysfs entry to 'low_latency' · 963b72fc

Committed by Jens Axboe on Oct 03, 2009

Don't think that's necessarily a perfect description of what this
option fiddles with, but it's probably better than 'desktop'.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

963b72fc

03 Oct, 2009 3 commits

cfq-iosched: implement slower async initiate and queue ramp up · 8e296755

Committed by Jens Axboe on Oct 03, 2009

This slowly ramps up the async queue depth based on the time
passed since the sync IO, and doesn't allow async at all until
a sync slice period has passed.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

8e296755
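
The ramp-up described in 8e296755 can be pictured with a small model. This is only a sketch based on the commit message, not the actual cfq-iosched code: the function name, the parameter names and the tick-based time units are all hypothetical.

```c
/*
 * Hypothetical model of the async ramp-up (names are illustrative only).
 * Times are in arbitrary ticks; sync_slice must be non-zero.
 *
 * The allowed async depth grows by one for every full sync slice that
 * has elapsed since the last sync IO, and is capped at max_dispatch.
 * Before one full slice has passed the result is 0, i.e. no async IO.
 */
unsigned int async_dispatch_limit(unsigned long now,
                                  unsigned long last_sync_end,
                                  unsigned long sync_slice,
                                  unsigned int max_dispatch)
{
    unsigned long since_sync = now - last_sync_end;
    unsigned long depth = since_sync / sync_slice;

    return depth < max_dispatch ? (unsigned int)depth : max_dispatch;
}
```

With a slice of 100 ticks and a cap of 4, the model allows nothing for the first 100 ticks after sync IO, then one more async request per elapsed slice until the cap is reached.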

cfq-iosched: delay async IO dispatch, if sync IO was just done · 365722bb

Committed by Vivek Goyal on Oct 03, 2009

o Do not allow more than max_dispatch requests from an async queue, if some
  sync request has finished recently. This is in the hope that sync activity
  is still going on in the system and we might receive a sync request soon.
  Most likely from a sync queue which finished a request and we did not enable
  idling on it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

365722bb

cfq-iosched: add a knob for desktop interactiveness · 1d223515

Committed by Jens Axboe on Oct 02, 2009

This is basically identical to what Vivek Goyal posted, but combined
into one and labelled 'desktop' instead of 'fairness'. The goal
is to continue to improve on the latency side of things as it relates
to interactiveness, keeping the questionable bits under this sysfs
tunable so it would be easy for throughput-only people to turn off.

Apart from adding the interactive sysfs knob, it also adds the
behavioural change of allowing slice idling even if the hardware
does tagged command queuing.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1d223515

02 Oct, 2009 6 commits

Add a tracepoint for block request remapping · b0da3f0d

Committed by Jun'ichi Nomura on Oct 01, 2009

Since 2.6.31 now has request-based device-mapper, it's useful to have
a tracepoint for request-remapping as well as bio-remapping.
This patch adds a tracepoint for request-remapping, trace_block_rq_remap().
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b0da3f0d

block: allow large discard requests · 67efc925

Committed by Christoph Hellwig on Sep 30, 2009

Currently we set the bio size to the byte equivalent of the blocks to
be trimmed when submitting the initial DISCARD ioctl.  That means it
is subject to the max_hw_sectors limitation of the HBA which is
much lower than the size of a DISCARD request we can support.
Add a separate max_discard_sectors tunable to limit the size for discard
requests.

We limit the max discard request size in bytes to 32bit as that is the
limit for bio->bi_size.  This could be much larger if we had a way to pass
that information through the block layer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

67efc925
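
To make the size limits in 67efc925 concrete, here is a rough sketch of how a large discard range might be cut into requests no bigger than a max_discard_sectors-style limit. The helper names are hypothetical and the 512-byte sector size is an assumption; this is not the kernel's blkdev_issue_discard().

```c
#include <stdint.h>

/*
 * A 32-bit byte count (the bi_size limit mentioned above) can describe at
 * most this many sectors, assuming 512-byte sectors.
 */
#define MAX_BIO_DISCARD_SECTORS (UINT32_MAX >> 9)

/* Hypothetical helper: sectors to put into the next discard request. */
uint64_t next_discard_chunk(uint64_t remaining, uint32_t max_discard_sectors)
{
    return remaining < max_discard_sectors ? remaining : max_discard_sectors;
}

/*
 * Hypothetical helper: number of requests needed to discard nr_sectors.
 * A limit of 0 is treated here as "discard not supported" and yields 0.
 */
uint64_t count_discard_requests(uint64_t nr_sectors,
                                uint32_t max_discard_sectors)
{
    uint64_t count = 0;

    if (max_discard_sectors == 0)
        return 0;

    while (nr_sectors > 0) {
        nr_sectors -= next_discard_chunk(nr_sectors, max_discard_sectors);
        count++;
    }
    return count;
}
```

The point of a separate discard limit is that this loop runs far fewer times than it would if each chunk were bounded by max_hw_sectors.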

block: use normal I/O path for discard requests · c15227de

Committed by Christoph Hellwig on Sep 30, 2009

prepare_discard_fn() was being called in a place where memory allocation
was effectively impossible.  This makes it inappropriate for all but
the most trivial translations of Linux's DISCARD operation to the block
command set.  Additionally adding a payload there makes the ownership
of the bio backing unclear as it's now allocated by the device driver
and not the submitter as usual.

It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
the queue supports discard operations or not.  blkdev_issue_discard now
allocates a one-page, sector-length payload which is the right thing
for the common ATA and SCSI implementations.

The mtd implementation of prepare_discard_fn() is replaced with simply
checking for the request being a discard.

Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx>
which did the prepare_discard_fn but not the different payload allocation
yet.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c15227de

Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfs · 48c0d4d4

Committed by Zdenek Kabelac on Sep 25, 2009

Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfs
introduced in commit 1d54ad6d.
Release kobject also in case the request_fn is NULL.

Problem was noticed via kmemleak backtrace when some sysfs entries were
not properly destroyed during device removal:

unreferenced object 0xffff88001aa76640 (size 80):
  comm "lvcreate", pid 2120, jiffies 4294885144
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 f0 65 a7 1a 00 88 ff ff  .........e......
    90 66 a7 1a 00 88 ff ff 86 1d 53 81 ff ff ff ff  .f........S.....
  backtrace:
    [<ffffffff813f9cc6>] kmemleak_alloc+0x26/0x60
    [<ffffffff8111d693>] kmem_cache_alloc+0x133/0x1c0
    [<ffffffff81195891>] sysfs_new_dirent+0x41/0x120
    [<ffffffff81194b0c>] sysfs_add_file_mode+0x3c/0xb0
    [<ffffffff81197c81>] internal_create_group+0xc1/0x1a0
    [<ffffffff81197d93>] sysfs_create_group+0x13/0x20
    [<ffffffff810d8004>] blk_trace_init_sysfs+0x14/0x20
    [<ffffffff8123f45c>] blk_register_queue+0x3c/0xf0
    [<ffffffff812447e4>] add_disk+0x94/0x160
    [<ffffffffa00d8b08>] dm_create+0x598/0x6e0 [dm_mod]
    [<ffffffffa00de951>] dev_create+0x51/0x350 [dm_mod]
    [<ffffffffa00de823>] ctl_ioctl+0x1a3/0x240 [dm_mod]
    [<ffffffffa00de8f2>] dm_compat_ctl_ioctl+0x12/0x20 [dm_mod]
    [<ffffffff81177bfd>] compat_sys_ioctl+0xcd/0x4f0
    [<ffffffff81036ed8>] sysenter_dispatch+0x7/0x2c
    [<ffffffffffffffff>] 0xffffffffffffffff
Signed-off-by: Zdenek Kabelac <zkabelac@redhat.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

48c0d4d4

block: Do not clamp max_hw_sectors for stacking devices · 5dee2477

Committed by Martin K. Petersen on Sep 21, 2009

Stacking devices do not have an inherent max_hw_sector limit.  Set the
default to INT_MAX so we are bounded only by capabilities of the
underlying storage.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5dee2477
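
The idea in 5dee2477 of being "bounded only by capabilities of the underlying storage" can be sketched as follows. The function and parameter names are hypothetical; the real limit stacking lives in the block layer and handles many more fields.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical model: a stacking device starts with no inherent limit
 * (INT_MAX, as in the commit message) and is then narrowed to the
 * smallest limit of the devices underneath it.
 */
unsigned int stacked_max_hw_sectors(const unsigned int *component_limits,
                                    size_t n)
{
    unsigned int limit = INT_MAX;
    size_t i;

    for (i = 0; i < n; i++) {
        if (component_limits[i] < limit)
            limit = component_limits[i];
    }
    return limit;
}
```

With no components the default stays at INT_MAX, so the stacking device itself never becomes the bottleneck.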

block: Set max_sectors correctly for stacking devices · 80ddf247

Committed by Martin K. Petersen on Sep 18, 2009

The topology changes unintentionally caused SAFE_MAX_SECTORS to be set
for stacking devices.  Set the default limit to BLK_DEF_MAX_SECTORS and
provide SAFE_MAX_SECTORS in blk_queue_make_request() for legacy hw
drivers that depend on the old behavior.
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

80ddf247

20 Sep, 2009 1 commit

Driver-Core: extend devnode callbacks to provide permissions · e454cea2

Committed by Kay Sievers on Sep 18, 2009

This allows subsystems to provide devtmpfs with non-default permissions
for the device node. Instead of the default mode of 0600, null, zero,
random, urandom, full, tty, ptmx now have a mode of 0666, which allows
non-privileged processes to access standard device nodes in case no
other userspace process applies the expected permissions.

This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complaint.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e454cea2

16 Sep, 2009 1 commit

driver model: constify attribute groups · a4dbd674

Committed by David Brownell on Jun 24, 2009

Let attribute group vectors be declared "const".  We'd
like to let most attribute metadata live in read-only
sections... this is a start.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

a4dbd674

14 Sep, 2009 5 commits

block: use blkdev_issue_discard in blk_ioctl_discard · 746cd1e7

Committed by Christoph Hellwig on Sep 12, 2009

blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard.
The only difference between the two is that blkdev_issue_discard needs to
send a barrier discard request and blk_ioctl_discard a non-barrier one,
and blk_ioctl_discard needs to wait on the request. To facilitate this,
add a flags argument to blkdev_issue_discard to control both aspects of the
behaviour. This will be very useful later on for using the waiting
functionality for other callers.

Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

746cd1e7

block: don't assume device has a request list backing in nr_requests store · b8a9ae77

Committed by Jens Axboe on Sep 11, 2009

Stacked devices do not. For now, just error out with -EINVAL. Later
we could make the limit apply on stacked devices too, for throttling
reasons.

This fixes

5a54cd13353bb3b88887604e2c980aa01e314309

and should go into 2.6.31 stable as well.

Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b8a9ae77

block: Optimal I/O limit wrapper · 3c5820c7

Committed by Martin K. Petersen on Sep 11, 2009

Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper
around it. DM needs this to avoid poking at the queue_limits directly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

3c5820c7

cfq: choose a new next_req when a request is dispatched · 06d21886

Committed by Jeff Moyer on Sep 11, 2009

This patch addresses http://bugzilla.kernel.org/show_bug.cgi?id=13401, a
regression introduced in 2.6.30.

From the bug report:
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

06d21886

Separate read and write statistics of in_flight requests · a9327cac

Committed by Nikanth Karthikesan on Sep 11, 2009

Currently, there is a single in_flight counter measuring the number of
requests in the request_queue. But some monitoring tools would like to
know how many read requests and write requests are in progress. Split the
current in_flight counter into two separate counters for read and write.

This information is exported as a sysfs attribute, as changing the
currently available stat files would break the existing tools.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

a9327cac
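
A minimal sketch of the counter split in a9327cac, assuming a two-element array indexed by direction. The struct and function names are hypothetical and there is no locking, which a real implementation would need.

```c
/* Hypothetical model: index 0 counts reads, index 1 counts writes. */
struct inflight_stats {
    unsigned int in_flight[2];
};

/* Account for a request entering the queue. */
void inflight_start(struct inflight_stats *s, int is_write)
{
    s->in_flight[!!is_write]++;
}

/* Account for a request completing. */
void inflight_end(struct inflight_stats *s, int is_write)
{
    s->in_flight[!!is_write]--;
}

/* The old single counter is simply the sum of both directions. */
unsigned int inflight_total(const struct inflight_stats *s)
{
    return s->in_flight[0] + s->in_flight[1];
}
```

Keeping the total derivable from the two counters is what lets the existing stat files stay unchanged while a new attribute exposes the split.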

11 Sep, 2009 17 commits

block: trace bio queueing trial only when it occurs · 01edede4

Committed by Minchan Kim on Sep 08, 2009

If a BIO is discarded or crosses the end of the device,
the BIO queueing trial doesn't occur.

Actually the trace was called just before make_request at first:
[PATCH] Block queue IO tracing support (blktrace) as of 2006-03-23
      2056a782

And then 2 patches added some checks between them:
[PATCH] md: check bio address after mapping through partitions
        5ddfe969,
[BLOCK] Don't allow empty barriers to be passed down to
queues that don't grok them
        51fd77bd

This breaks the original goal.
Let's trace it only when it happens.
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

01edede4

cfq: fix the log message after dispatched a request · b217a903

Committed by Shan Wei on Sep 01, 2009

Use cfq_log_cfqq() instead of cfq_log() so that the blktrace tools can
show the process ID when cfq dispatches a request.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b217a903

cfq-iosched: get rid of must_alloc flag · 1b379d8d

Committed by Jens Axboe on Aug 11, 2009

It's not currently used, as pointed out by
Gui Jianfeng <guijianfeng@cn.fujitsu.com>. We already check the
wait_request flag to allow an idling queue priority allocation access,
so we don't need this extra flag.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1b379d8d

block: use interrupts disabled version of raise_softirq_irqoff() · a33dac26

Committed by Jens Axboe on Aug 06, 2009

We already have interrupts disabled at that point, so use the
__raise_softirq_irqoff() variant.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

a33dac26

block: fix comment in blk-iopoll.c · fca51d64

Committed by Jens Axboe on Aug 06, 2009

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

fca51d64

block: adjust default budget for blk-iopoll · 37867ae7

Committed by Jens Axboe on Aug 06, 2009

It's not exported, I doubt we'll have a reason to change this...
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

37867ae7

block: fix long lines in block/blk-iopoll.c · 1badcfbd

Committed by Jens Axboe on Aug 06, 2009

Not sure why they happened in the first place, probably some bad
terminal setting.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1badcfbd

block: add blk-iopoll, a NAPI like approach for block devices · 5e605b64

Committed by Jens Axboe on Aug 05, 2009

This borrows some code from NAPI and implements a polled completion
mode for block devices. The idea is the same as NAPI - instead of
doing the command completion when the irq occurs, schedule a dedicated
softirq in the hopes that we will complete more IO when the iopoll
handler is invoked. Devices have a budget of commands assigned, and will
stay in polled mode as long as they continue to consume their budget
from the iopoll softirq handler. If they do not, the device is set back
to interrupt completion mode.

This patch holds the core bits for blk-iopoll, device driver support
sold separately.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5e605b64
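
The budget rule described in 5e605b64 can be modelled in a few lines. This is a toy sketch of the decision only, with hypothetical names; it does not use the real blk-iopoll API and has no softirq or interrupt handling.

```c
#include <stdbool.h>

/* Hypothetical device model for illustration. */
struct poll_dev {
    unsigned int pending; /* completions waiting to be reaped */
    bool polled;          /* true while the device stays in polled mode */
};

/*
 * One pass of the (modelled) poll handler. The device may complete up to
 * 'budget' commands. It stays in polled mode only if it consumed the
 * whole budget; otherwise it drops back to interrupt completion mode.
 * Returns the number of completions handled.
 */
unsigned int poll_once(struct poll_dev *dev, unsigned int budget)
{
    unsigned int done = dev->pending < budget ? dev->pending : budget;

    dev->pending -= done;
    dev->polled = (done == budget);
    return done;
}
```

A busy device therefore keeps being polled, while one that runs out of work falls back to interrupts on the first pass that leaves budget unused.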

block: improve queue_should_plug() by looking at IO depths · fb1e7538

Committed by Jens Axboe on Jul 30, 2009

Instead of just checking whether this device uses block layer
tagging, we can improve the detection by looking at the maximum
queue depth it has reached. If that crosses 4, then deem it a
queuing device.

This is important on high IOPS devices, since plugging hurts
the performance there (it can be as much as 10-15% of the sys
time).
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

fb1e7538
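
As a rough illustration of the heuristic in fb1e7538, the sketch below remembers the deepest queue depth seen and stops plugging once it goes past 4. The names are hypothetical, and reading "crosses 4" as "strictly greater than 4" is an assumption.

```c
#include <stdbool.h>

/* Threshold taken from the commit message; "crosses" is read as '>'. */
#define QUEUING_DEPTH_THRESHOLD 4u

/* Hypothetical queue model. */
struct queue_model {
    unsigned int max_depth_seen; /* deepest in-flight count observed */
};

/* Record the current in-flight depth after a dispatch. */
void note_dispatch(struct queue_model *q, unsigned int in_flight)
{
    if (in_flight > q->max_depth_seen)
        q->max_depth_seen = in_flight;
}

/* A device that has ever exceeded the threshold is deemed "queuing". */
bool is_queuing_device(const struct queue_model *q)
{
    return q->max_depth_seen > QUEUING_DEPTH_THRESHOLD;
}

/* Plugging is only worthwhile on devices that do not queue deeply. */
bool should_plug(const struct queue_model *q)
{
    return !is_queuing_device(q);
}
```

Because the maximum is sticky, a device that has once shown deep queuing keeps skipping the plug even while it is momentarily idle.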

bio: first step in sanitizing the bio->bi_rw flag testing · 1f98a13f

Committed by Jens Axboe on Sep 11, 2009

Get rid of any functions that test for these bits and make callers
use bio_rw_flagged() directly. Then it is at least directly apparent
what variable and flag they check.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1f98a13f

Send uevents for write_protect changes · e3264a4d

Committed by Hannes Reinecke on Jul 28, 2009

Whenever a block device changes its read-only attribute,
notify userspace about it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e3264a4d

cfq-iosched: no need to keep track of busy_rt_queues · d58b85e1

Committed by Vivek Goyal on Jul 10, 2009

o Get rid of busy_rt_queues infrastructure. Looks like it is redundant.

o Once an RT queue gets request it will preempt any of the BE or IDLE queues
immediately. Otherwise this queue will be put on service tree and scheduler
will anyway select this queue before any of the BE or IDLE queue. Hence
looks like there is no need to keep track of how many busy RT queues are
currently on service tree.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d58b85e1

cfq-iosched: drain device queue before switching to a sync queue · 5ad531db

Committed by Jens Axboe on Jul 03, 2009

To lessen the impact of async IO on sync IO, let the device drain of
any async IO in progress when switching to a sync cfqq that has idling
enabled.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5ad531db

scsi,block: update SCSI to handle mixed merge failures · da6c5c72

Committed by Tejun Heo on Sep 11, 2009

Update scsi_io_completion() such that it only fails requests till the
next error boundary and retries the leftover.  This enables block layer
to merge requests with different failfast settings and still behave
correctly on errors.  Allow merge of requests of different failfast
settings.

As SCSI is currently the only subsystem which follows failfast status,
there's no need to worry about other block drivers for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

da6c5c72

block: implement mixed merge of different failfast requests · 80a761fd

Committed by Tejun Heo on Jul 03, 2009

Failfast has characteristics different from other attributes.  When issuing,
executing and successfully completing requests, failfast doesn't make
any difference.  It only affects how a request is handled on failure.
Allowing requests with different failfast settings to be merged causes
normal IOs to fail prematurely, while not allowing it has performance
penalties, as failfast is used for read aheads which are likely to be
located near in-flight or to-be-issued normal IOs.

This patch introduces the concept of 'mixed merge'.  A request is a
mixed merge if it is a merge of segments which require different
handling on failure.  Currently the only mixable attributes are
failfast ones (or lack thereof).

When a bio with different failfast settings is added to an existing
request or requests of different failfast settings are merged, the
merged request is marked mixed.  Each bio carries failfast settings
and the request always tracks failfast state of the first bio.  When
the request fails, blk_rq_err_bytes() can be used to determine how
many bytes can be safely failed without crossing into an area which
requires further retrials.

This allows request merging regardless of failfast settings while
keeping the failure handling correct.

This patch only implements mixed merge but doesn't enable it.  The
next one will update SCSI to make use of mixed merge.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

80a761fd
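
The "how many bytes can be safely failed" rule from 80a761fd can be sketched as a walk over the merged segments. The types and names here are hypothetical stand-ins for bios and requests, and the sketch only approximates what blk_rq_err_bytes() is described as doing.

```c
#include <stddef.h>

/* Hypothetical stand-in for one bio inside a merged request. */
struct segment {
    unsigned int failfast; /* failfast bits carried by this segment */
    unsigned int bytes;    /* payload size of this segment */
};

/*
 * The request tracks the failfast state of its first segment. On failure,
 * bytes are failed up to the first segment that lacks those failfast bits,
 * i.e. the first one that needs further retries.
 */
unsigned int err_bytes(const struct segment *segs, size_t n)
{
    unsigned int ff = n ? segs[0].failfast : 0;
    unsigned int bytes = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((segs[i].failfast & ff) != ff)
            break;
        bytes += segs[i].bytes;
    }
    return bytes;
}
```

If the first segment is a normal (non-failfast) one, nothing stops the walk and the whole request is failed, since by then the normal retries have already been exhausted.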

block: use the same failfast bits for bio and request · a82afdfc

Committed by Tejun Heo on Jul 03, 2009

bio and request use the same set of failfast bits.  This patch makes
the following changes to simplify things.

* enumify BIO_RW* bits and reorder bits such that BIO_RW_FAILFAST_*
  bits coincide with __REQ_FAILFAST_* bits.

* The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
  but the matching is useless anyway.  init_request_from_bio() is
  responsible for setting FAILFAST bits on FS requests and non-FS
  requests never use BIO_RW_AHEAD.  Drop the code and comment from
  blk_rq_bio_prep().

* Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
  simplify FAILFAST flags handling in init_request_from_bio().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

a82afdfc
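
Once the bio and request bits coincide, copying failfast state becomes a single mask operation. The sketch below shows that idea with made-up bit positions and names (FF_*); the actual values in the kernel headers are not reproduced here.

```c
/* Hypothetical bit positions, shared by the bio and request flag words. */
enum {
    FF_DEV_BIT = 1,
    FF_TRANSPORT_BIT,
    FF_DRIVER_BIT
};

#define FF_DEV       (1u << FF_DEV_BIT)
#define FF_TRANSPORT (1u << FF_TRANSPORT_BIT)
#define FF_DRIVER    (1u << FF_DRIVER_BIT)

/* OR of all failfast bits, in the spirit of REQ_FAILFAST_MASK. */
#define FF_MASK (FF_DEV | FF_TRANSPORT | FF_DRIVER)

/*
 * Because both flag words use the same positions, a request can inherit
 * the failfast bits of a bio without translating them one by one.
 */
unsigned int inherit_failfast(unsigned int req_flags, unsigned int bio_rw)
{
    return req_flags | (bio_rw & FF_MASK);
}
```

Any non-failfast bits in the bio word are filtered out by the mask, so unrelated flags cannot leak into the request.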

writeback: add name to backing_dev_info · d993831f

Committed by Jens Axboe on Jun 12, 2009

This enables us to track who does what and print info. Its main use
is catching dirty inodes on the default_backing_dev_info, so we can
fix that up.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d993831f

02 Sep, 2009 1 commit

block: Allow changing max_sectors_kb above the default 512 · c295fc05

Committed by Nikanth Karthikesan on Sep 01, 2009

The patch "block: Use accessor functions for queue limits"
(ae03bf63) changed queue_max_sectors_store()
to use blk_queue_max_sectors() instead of directly assigning the value.

But blk_queue_max_sectors() differs a bit:
1. It sets both max_sectors_kb and max_hw_sectors_kb.
2. It never allows one to change max_sectors_kb above BLK_DEF_MAX_SECTORS. If one
specifies a greater value, then max_hw_sectors is set to that value but
max_sectors is set to BLK_DEF_MAX_SECTORS.

I am not sure whether blk_queue_max_sectors() should be changed, as it seems
to have been that way for a long time, and there may be callers that depend on
that behaviour.

This patch simply reverts to the older way of directly assigning the value to
max_sectors as it was before.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c295fc05
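
The difference between the two ways of storing the value in c295fc05 can be shown with a small model. The struct and function names are hypothetical, and the default of 1024 sectors is an assumption derived from the "default 512" KB in the title with 512-byte sectors.

```c
/* Assumed default: 512 KB expressed in 512-byte sectors. */
#define DEF_MAX_SECTORS 1024u

/* Hypothetical subset of the queue limits. */
struct limits_model {
    unsigned int max_sectors;
    unsigned int max_hw_sectors;
};

/*
 * Behaviour the commit message attributes to blk_queue_max_sectors():
 * the hardware limit takes the requested value, while the soft limit is
 * clamped to the default.
 */
void set_via_helper(struct limits_model *l, unsigned int sectors)
{
    l->max_hw_sectors = sectors;
    l->max_sectors = sectors > DEF_MAX_SECTORS ? DEF_MAX_SECTORS : sectors;
}

/* Direct assignment, as restored by the patch: only the soft limit moves. */
void set_directly(struct limits_model *l, unsigned int sectors)
{
    l->max_sectors = sectors;
}
```

Requesting 2048 sectors through the helper leaves max_sectors at 1024 and overwrites the hardware limit, whereas the direct store yields 2048 and leaves the hardware limit alone.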

05 Aug, 2009 1 commit

Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL dependency,... · 14d9fa35

Committed by John Stoffel on Aug 04, 2009

Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL dependency, since udev depends on BSG

Make Block Layer SG support v4 the default, since recent udev versions
depend on this to access serial numbers and other low level info properly.

This should be backported to older kernels as well, since most distros have
enabled this for a long time.
Signed-off-by: John Stoffel <john@stoffel.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

14d9fa35

01 Aug, 2009 4 commits

block: Update topology documentation · 7e5f5fb0

Committed by Martin K. Petersen on Jul 31, 2009

Update topology comments and sysfs documentation based upon discussions
with Neil Brown.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

7e5f5fb0

block: Stack optimal I/O size · 70dd5bf3

Committed by Martin K. Petersen on Jul 31, 2009

When stacking block devices ensure that optimal I/O size is scaled
accordingly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

70dd5bf3
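
The commit message for 70dd5bf3 does not say how the optimal I/O size is "scaled". One plausible reading, shown here purely as an assumption, is to take the least common multiple of the sizes being stacked so that the result suits every component. All names are hypothetical, and 0 is treated as "no preference".

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned int gcd_u(unsigned int a, unsigned int b)
{
    while (b != 0) {
        unsigned int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/*
 * Hypothetical stacking rule: the combined optimal size is the least
 * common multiple of the top and bottom values. A value of 0 means the
 * device expressed no preference, so the other value wins.
 */
unsigned int stack_io_opt(unsigned int top, unsigned int bottom)
{
    if (top == 0)
        return bottom;
    if (bottom == 0)
        return top;
    /* Divide first to reduce the risk of overflow. */
    return top / gcd_u(top, bottom) * bottom;
}
```

For example, stacking 4096 over 6144 gives 12288 under this rule, the smallest size that is a whole multiple of both.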

block: Add a wrapper for setting minimum request size without a queue · 7c958e32

Committed by Martin K. Petersen on Jul 31, 2009

Introduce blk_limits_io_min() and make blk_queue_io_min() call it.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

7c958e32
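
The wrapper arrangement in 7c958e32 follows a simple pattern: one function works on a bare limits structure, and the queue-level function forwards to it. The sketch below uses hypothetical model types and names rather than the kernel's, and the setter is reduced to a plain assignment.

```c
/* Hypothetical stand-ins for the limits structure and the queue. */
struct limits_model {
    unsigned int io_min;
};

struct queue_model {
    struct limits_model limits;
};

/* Works on limits alone, so callers without a queue can use it. */
void limits_io_min(struct limits_model *l, unsigned int min)
{
    l->io_min = min;
}

/* Queue-level entry point: a thin wrapper around the limits version. */
void queue_io_min(struct queue_model *q, unsigned int min)
{
    limits_io_min(&q->limits, min);
}
```

Callers that only hold a limits structure can use the first function directly, which avoids reaching into the structure's fields themselves.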

block: Make blk_queue_stack_limits use the new stacking interface · fef24667

Committed by Martin K. Petersen on Jul 31, 2009

blk_queue_stack_limits() has been superseded by blk_stack_limits() and
disk_stack_limits().  Wrap the function call for now, we'll deprecate it
later.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

fef24667