Commits · 150e6c67f4bf6ab51e62defc41bd19a2eefe5709 · bug2833 / cloud-kernel

04 Nov, 2009 1 commit

Merge branch 'cfq-2.6.33' into for-2.6.33 · 150e6c67

Committed by Jens Axboe on Nov 03, 2009

02 Nov, 2009 2 commits

Do not __always_inline bvec_kmap_irq() and bvec_kunmap_irq() · 4f570f99

Committed by Alberto Bertogli on Nov 02, 2009

So remove both the comment and the inline requirement, going back to the
inline hint.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: simplify prio-unboost code · dddb7451

Committed by Corrado Zoccolo on Nov 02, 2009

Eliminate redundant checks.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

29 Oct, 2009 2 commits

blkdev: flush disk cache on ->fsync · ab0a9735

Committed by Christoph Hellwig on Oct 29, 2009

Currently there is no barrier support in the block device code.  That
means we cannot guarantee any sort of data integrity when using the
block device node with disk write caches enabled.  Using the raw block
device node is a typical use case for virtualization (and I assume
databases, too).  This patch changes block_fsync to issue a cache flush
and thus make fsync on block device nodes actually useful.

Note that in mainline we would also need to add such code to the
->aio_write method for O_SYNC handling, but assuming that Jan's patch
series for the O_SYNC rewrite goes in it will also call into ->fsync
for 2.6.32.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: move bdi/address_space unplug functions to backing-dev.h · b9d128f1

Committed by Jens Axboe on Oct 29, 2009

There's nothing block related about them; the backing device
is used by things like NFS as well. This gets rid of the
need to protect such calls by CONFIG_BLOCK.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

28 Oct, 2009 9 commits

drbd: fix in_flight rw indexing · a870a3a4

Committed by Jens Axboe on Oct 28, 2009

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

aio: implement request batching · cfb1e33e

Committed by Jeff Moyer on Oct 02, 2009

Hi,

Some workloads issue batches of small I/O, and the performance is poor
due to the call to blk_run_address_space for every single iocb.  Nathan
Roberts pointed this out, and suggested that by deferring this call
until all I/Os in the iocb array are submitted to the block layer, we
can realize some impressive performance gains (up to 30% for sequential
4k reads in batches of 16).
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: get rid of the WRITE_ODIRECT flag · 1af60fbd

Committed by Jeff Moyer on Oct 02, 2009

Hi,

The WRITE_ODIRECT flag is only used in one place, and that code path
happens to also call blk_run_address_space.  The introduction of this
flag, then, could result in the device being unplugged twice for every
I/O.

Further, with the batching changes in the next patch, we don't want an
O_DIRECT write to imply a queue unplug.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: fix style issue in cfq_get_avg_queues() · 5869619c

Committed by Jens Axboe on Oct 28, 2009

Line breaks and bad brace placement.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: fairness for sync no-idle queues · 718eee05

Committed by Corrado Zoccolo on Oct 26, 2009

Currently no-idle queues in cfq are not serviced fairly:
even if they can only dispatch a small number of requests at a time,
they have to compete with idling queues to be serviced, experiencing
large latencies.

We should notice, instead, that no-idle queues are the ones that would
benefit most from having low latency, in fact they are any of:
* processes with large think times (e.g. interactive ones like file
  managers)
* seeky (e.g. programs faulting in their code at startup)
* or marked as no-idle from upper levels, to improve latencies of those
  requests.

This patch improves the fairness and latency for those queues, by:
* separating sync idle, sync no-idle and async queues into separate
  service_trees, for each priority
* servicing all no-idle queues together
* idling when the last no-idle queue has been serviced, to anticipate
  more no-idle work
* computing the timeslices allotted to the idle and no-idle
  service_trees proportionally to the number of processes in each set.

Servicing all no-idle queues together should have a performance boost
for NCQ-capable drives, without compromising fairness.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: enable idling for last queue on priority class · a6d44e98

Committed by Corrado Zoccolo on Oct 26, 2009

cfq can disable idling for queues in various circumstances.
When workloads of different priorities are competing, if the higher
priority queue has idling disabled, lower priority queues may steal
its disk share. For example, in a scenario with an RT process
performing seeky reads vs a BE process performing sequential reads,
on NCQ-enabled hardware, with low_latency unset,
the RT process will dispatch only the few pending requests every full
slice of service for the BE process.

The patch solves this issue by always performing idle on the last
queue at a given priority class > idle. If the same process, or one
that can pre-empt it (so at the same priority or higher), submits a
new request within the idle window, the lower priority queue won't
dispatch, saving the disk bandwidth for higher priority ones.

Note: this doesn't touch the non_rotational + NCQ case (no hardware
to test if this is a benefit in that case).
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: reimplement priorities using different service trees · c0324a02

Committed by Corrado Zoccolo on Oct 27, 2009

We use different service trees for different priority classes.
This allows a simplification in the service tree insertion code, which no
longer has to consider priority while walking the tree.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: preparation to handle multiple service trees · aa6f6a3d

Committed by Corrado Zoccolo on Oct 26, 2009

We embed a pointer to the service tree in each queue, to handle multiple
service trees easily.
Service trees are enriched with a counter.
cfq_add_rq_rb is invoked after putting the rq in the fifo, to ensure
that all fields in rq are properly initialized.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: adapt slice to number of processes doing I/O · 5db5d642

Committed by Corrado Zoccolo on Oct 26, 2009

When the number of processes performing I/O concurrently increases,
a fixed time slice per process will cause large latencies.

This patch, if low_latency mode is enabled, will scale the time slice
assigned to each process according to a 300ms target latency.

In order to keep fairness among processes:
* The number of active processes is computed using a special form of
  running average, that quickly follows sudden increases (to keep latency low),
  and decreases slowly (to have fairness in spite of rapid decreases of this
  value).

To safeguard sequential bandwidth, we impose a minimum time slice
(computed using 2*cfq_slice_idle as base, adjusted according to priority
and async-ness).
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
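
The two ideas in the slice-scaling message above (a running average that rises quickly and falls slowly, and a slice scaled towards the 300ms target with a floor based on 2*cfq_slice_idle) can be illustrated with a minimal C sketch. The function names, the 3:1 weighting and the integer arithmetic are assumptions made for illustration, and the priority and async adjustments mentioned in the message are left out; this is not the kernel's code.

```c
/* All names and formulas here are illustrative assumptions. */
#define TARGET_LATENCY_MS 300u   /* target latency quoted in the commit message */

/*
 * Running average of the number of busy queues. The larger of the previous
 * average and the current count gets three times the weight, so the result
 * follows increases quickly and decays slowly.
 */
static unsigned avg_queues(unsigned prev_avg, unsigned busy_now)
{
    unsigned hi = prev_avg > busy_now ? prev_avg : busy_now;
    unsigned lo = prev_avg > busy_now ? busy_now : prev_avg;

    return (3u * hi + lo + 2u) / 4u;   /* +2 rounds to nearest */
}

/*
 * Shrink the per-process slice so that nr_procs slices fit into the target
 * latency, but never go below a floor of 2 * slice_idle_ms (or below the
 * base slice itself, if that is already smaller than the floor).
 */
static unsigned scaled_slice_ms(unsigned base_slice_ms, unsigned nr_procs,
                                unsigned slice_idle_ms)
{
    unsigned min_slice = 2u * slice_idle_ms;
    unsigned slice = base_slice_ms;

    if (min_slice > base_slice_ms)
        min_slice = base_slice_ms;
    if (nr_procs > 0 && base_slice_ms * nr_procs > TARGET_LATENCY_MS)
        slice = TARGET_LATENCY_MS / nr_procs;
    if (slice < min_slice)
        slice = min_slice;

    return slice;
}
```

With a 100ms base slice and an 8ms idle slice, two processes keep the full slice, six processes are scaled down, and thirty processes hit the floor instead of shrinking further.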

27 Oct, 2009 1 commit

cfq-iosched: improve hw_tag detection · 1a1238a7

Committed by Shaohua Li on Oct 27, 2009

If the active queue doesn't have enough requests and the idle window opens,
cfq will not dispatch enough requests to the hardware. In that situation the
current code zeroes hw_tag. But this happens because cfq doesn't dispatch
enough requests, not because the hardware queue doesn't work. Don't zero
hw_tag in that case.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

26 Oct, 2009 4 commits

cfq: break apart merged cfqqs if they stop cooperating · e6c5bc73

Committed by Jeff Moyer on Oct 23, 2009

cfq_queues are merged if they are issuing requests within the mean seek
distance of one another.  This patch detects when the cooperation stops and
breaks the queues back up.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq: change the meaning of the cfqq_coop flag · b3b6d040

Committed by Jeff Moyer on Oct 23, 2009

The flag used to indicate that a cfqq was allowed to jump ahead in the
scheduling order due to submitting a request close to the queue that
just executed.  Since closely cooperating queues are now merged, the flag
holds little meaning.  Change it to indicate that multiple queues were
merged.  This will later be used to allow the breaking up of merged queues
when they are no longer cooperating.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq: merge cooperating cfq_queues · df5fe3e8

Committed by Jeff Moyer on Oct 23, 2009

When cooperating cfq_queues are detected currently, they are allowed to
skip ahead in the scheduling order. It is much more efficient to
automatically share the cfq_queue data structure between cooperating processes.
Performance of the read-test2 benchmark (which is written to emulate the
dump(8) utility) went from 12MB/s to 90MB/s on my SATA disk. NFS servers
with multiple nfsd threads also saw performance increases.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq: calculate the seek_mean per cfq_queue not per cfq_io_context · b2c18e1e

Committed by Jeff Moyer on Oct 23, 2009

async cfq_queue's are already shared between processes within the same
priority, and forthcoming patches will change the mapping of cic to sync
cfq_queue from 1:1 to 1:N.  So, calculate the seekiness of a process
based on the cfq_queue instead of the cfq_io_context.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

13 Oct, 2009 3 commits

Merge branch 'for-linus' into for-2.6.33 · c30f3343

Committed by Jens Axboe on Oct 13, 2009

cciss: Add cciss_allow_hpsa module parameter · 2ec24ff1

Committed by Stephen M. Cameron on Oct 13, 2009

Add cciss_allow_hpsa module parameter.  This parameter causes
the cciss driver to ignore any Smart Array devices known to be
supported by the hpsa driver.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cciss: Fix multiple calls to pci_release_regions · 2cfa948c

Committed by Stephen M. Cameron on Oct 13, 2009

Fix multiple calls to pci_release_regions.  If cciss_pci_init
fails, it already does any necessary call to pci_release_regions,
so this does not need to be done again in cciss_init_one in that
case.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

12 Oct, 2009 1 commit

blk-settings: fix function parameter kernel-doc notation · c7ebf065

Committed by Randy Dunlap on Oct 12, 2009

Fix kernel-doc notation in blk-settings.c::blk_queue_max_discard_sectors().
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

09 Oct, 2009 3 commits

writeback: kill space in debugfs item name · 961515f6

Committed by Wu Fengguang on Oct 09, 2009

The space is not script friendly, kill it.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

writeback: account IO throttling wait as iowait · d25105e8

Committed by Wu Fengguang on Oct 09, 2009

It makes sense to do IOWAIT when someone is blocked
due to IO throttle, as suggested by Kame and Peter.

There is an old comment for not doing IOWAIT on throttle,
however it has been mismatching the code for a long time.

If we stop accounting IOWAIT for 2.6.32, it could be an
undesirable behavior change. So restore the io_schedule.

CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

elv_iosched_store(): fix strstrip() misuse · 8c279598

Committed by KOSAKI Motohiro on Oct 09, 2009

elv_iosched_store() ignores the return value of strstrip().  This causes a
small inconsistency in behavior.

This patch fixes it.

 <before>
 ====================================
 # cd /sys/block/{blockdev}/queue

 case1:
 # echo "anticipatory" > scheduler
 # cat scheduler
 noop [anticipatory] deadline cfq

 case2:
 # echo "anticipatory " > scheduler
 # cat scheduler
 noop [anticipatory] deadline cfq

 case3:
 # echo " anticipatory" > scheduler
 bash: echo: write error: Invalid argument

 <after>
 ====================================
 # cd /sys/block/{blockdev}/queue

 case1:
 # echo "anticipatory" > scheduler
 # cat scheduler
 noop [anticipatory] deadline cfq

 case2:
 # echo "anticipatory " > scheduler
 # cat scheduler
 noop [anticipatory] deadline cfq

 case3:
 # echo " anticipatory" > scheduler
 noop [anticipatory] deadline cfq

Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
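
The case3 difference in the strstrip() message above comes from how a strstrip()-style function reports its result: trailing whitespace is trimmed in place, but leading whitespace is skipped only in the returned pointer. The following user-space sketch shows that behavior. `strip_ws` is a hypothetical stand-in written for this illustration, not the kernel function itself.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for a strstrip()-style helper.
 * Trailing whitespace is removed by writing NUL bytes into the buffer;
 * leading whitespace is skipped by returning a pointer past it.
 */
static char *strip_ws(char *s)
{
    size_t len = strlen(s);

    /* Trim trailing whitespace in place. */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';

    /* Skip leading whitespace; only the returned pointer reflects this. */
    while (*s && isspace((unsigned char)*s))
        s++;

    return s;
}
```

A caller that keeps using the original buffer after the call sees "anticipatory " trimmed correctly but still sees the leading space in " anticipatory", which matches the before/after cases listed in the message. Using the returned pointer handles both.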

08 Oct, 2009 4 commits

cfq-iosched: avoid probable slice overrun when idling · 355b659c

Committed by Corrado Zoccolo on Oct 08, 2009

If the average think time is larger than the remaining time slice
for any given queue, don't allow it to idle. A successful idle also
means that we need to dispatch and complete a request, so if we don't
even have time left for the idle process, we would overrun the slice
in any case.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
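
The rule in the slice-overrun message above reduces to a single comparison. A minimal sketch follows, assuming time is counted in plain ticks; the function name and parameters are hypothetical, and wraparound of the tick counter is deliberately ignored here.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: idling is only worthwhile if the queue's mean think
 * time fits into what is left of its slice. Wraparound is not handled.
 */
static bool may_idle(unsigned long now, unsigned long slice_end,
                     unsigned long think_time_mean)
{
    unsigned long remaining = slice_end > now ? slice_end - now : 0;

    return think_time_mean <= remaining;
}
```

A queue with 8 ticks left and a mean think time of 10 ticks would not be allowed to idle, since waiting for its next request would already overrun the slice.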

cfq-iosched: apply bool value where we return 0/1 · a6151c3a

Committed by Jens Axboe on Oct 07, 2009

Saves 16 bytes of text, woohoo. But the more important point is
that it makes the code more readable when returning bool for 0/1
cases.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: fix think time allowed for seekers · ec60e4f6

Committed by Corrado Zoccolo on Oct 07, 2009

CFQ enables idle only for processes that think less than the allowed
idle time. Since idle time is lower for seeky queues, we should use the
correct value in the comparison.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

drbd: needs __ratelimit() · 132cc538

Committed by Randy Dunlap on Oct 07, 2009

drbd_int.h uses __ratelimit(), so it needs to #include ratelimit.h:

drivers/block/drbd/drbd_int.h:1765: error: implicit declaration of function '__ratelimit'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: drbd-dev@lists.linbit.com
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

07 Oct, 2009 4 commits

cfq-iosched: fix the slice residual sign · b9c8946b

Committed by Jens Axboe on Oct 06, 2009

We should subtract the slice residual from the rb tree key, since
a negative residual count indicates that the cfqq overran its slice
the last time. Hence we want to add the overrun time, to position
it a bit further away in the service tree.
Reported-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: abstract out the 'may this cfqq dispatch' logic · 0b182d61

Committed by Jens Axboe on Oct 06, 2009

Makes the whole thing easier to read; cfq_dispatch_requests() was
a bit messy before.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: use proper BLK_RW_ASYNC in blk_queue_start_tag() · 1b59dd51

Committed by Jens Axboe on Oct 06, 2009

Makes it easier to read than the 0.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: Seperate read and write statistics of in_flight requests v2 · 316d315b

Committed by Nikanth Karthikesan on Oct 06, 2009

Commit a9327cac added separate read
and write statistics of in_flight requests, and exported the number
of read and write requests in progress separately through sysfs.

But Corrado Zoccolo <czoccolo@gmail.com> reported getting strange
output from "iostat -kx 2". Global values for service time and
utilization were garbage. For interval values, utilization was always
100%, and service time was higher than normal.

So this was reverted by commit 0f78ab98.

The problem was in part_round_stats_single(), I missed the following:
        if (now == part->stamp)
                return;

-       if (part->in_flight) {
+       if (part_in_flight(part)) {
                __part_stat_add(cpu, part, time_in_queue,
                                part_in_flight(part) * (now - part->stamp));
                __part_stat_add(cpu, part, io_ticks, (now - part->stamp));

With this chunk included, the reported regression gets fixed.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>

--
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
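
The one-line change quoted in the in_flight statistics message above is easier to follow with a small sketch of the helper it relies on. This assumes the split statistics are stored as a two-element array, one counter for reads and one for writes; the struct and enum names are simplified stand-ins made up for this illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the partition structure. */
enum { RW_READ = 0, RW_WRITE = 1 };

struct part_sketch {
    int in_flight[2];   /* [RW_READ] and [RW_WRITE] counters (assumed layout) */
};

/* Total number of requests in flight, reads plus writes. */
static int part_in_flight(const struct part_sketch *part)
{
    return part->in_flight[RW_READ] + part->in_flight[RW_WRITE];
}

/* Time-in-queue accounting should only run while something is in flight. */
static bool needs_accounting(const struct part_sketch *part)
{
    return part_in_flight(part) != 0;
}
```

Under that assumption, a bare `if (part->in_flight)` in C tests the address of the array, which is never null, so the accounting branch would run on every call. That is consistent with the reported utilization of 100%.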

06 Oct, 2009 1 commit

drbd: Work on permission enforcement · 9f5180e5

Committed by Philipp Reisner on Oct 06, 2009

Now we have the capabilities of the sending process available,
use them to enforce CAP_SYS_ADMIN.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

05 Oct, 2009 5 commits

block: get rid of kblock_schedule_delayed_work() · 23e018a1

Committed by Jens Axboe on Oct 05, 2009

It was briefly introduced to allow CFQ to do delayed scheduling,
but we ended up removing that feature again. So let's kill the
function and export, and just switch CFQ back to the normal work
schedule since it is now passing in a '0' delay from all call
sites.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: fix possible problem with jiffies wraparound · 48e025e6

Committed by Corrado Zoccolo on Oct 05, 2009

The RR service tree is indexed by a key that is relative to current jiffies.
This can cause problems on jiffies wraparound.

The patch fixes it using a time_before comparison, and changing
the add_front path to use a relative number, too.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
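
The wraparound problem described in the jiffies message above, and the idea behind a time_before-style comparison, can be shown with a small user-space sketch. A 32-bit counter type is assumed here so the wrap is easy to reach; `tick_t`, `tick_before` and `tick_before_naive` are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a tick counter that wraps at 2^32. */
typedef uint32_t tick_t;

/*
 * Wrap-safe "a is before b": take the unsigned difference and look at its
 * sign. This is correct as long as the two values are less than half the
 * counter range apart.
 */
static bool tick_before(tick_t a, tick_t b)
{
    return (int32_t)(uint32_t)(a - b) < 0;
}

/* Plain comparison, shown only for contrast; it breaks across a wrap. */
static bool tick_before_naive(tick_t a, tick_t b)
{
    return a < b;
}
```

For a key taken just before the counter wraps and another taken just after, the plain comparison orders them the wrong way round, while the difference-based comparison still treats the earlier key as earlier.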

cfq-iosched: fix issue with rq-rq merging and fifo list ordering · 30996f40

Committed by Jens Axboe on Oct 05, 2009

cfq uses rq->start_time as the fifo indicator, but that field may
get modified prior to cfq doing its fifo list adjustment when
a request gets merged with another request. This can cause the
fifo list to become unordered.
Reported-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

drbd: fixup for reverted dual in_flight patch · 25d2d4ed

Committed by Jens Axboe on Oct 05, 2009

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Merge branch 'master' into for-2.6.33 · 5d13379a

Committed by Jens Axboe on Oct 05, 2009
