Commits · 61cc74fbb87af6aa551a06a370590c9bc07e29d9 · openeuler / raspberrypi-kernel

04 Dec, 2009 27 commits

block: Fix io_context leak after clone with CLONE_IO · 61cc74fb

Committed by Louis Rilling on Dec 04, 2009

With CLONE_IO, copy_io() increments both ioc->refcount and ioc->nr_tasks.
However, exit_io_context() only decrements ioc->refcount if ioc->nr_tasks
reaches 0.

Always call put_io_context() in exit_io_context().

Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
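
The leak described above can be reproduced with a toy model. The sketch below is not kernel code: `IoContext`, `exit_io_context_buggy` and `exit_io_context_fixed` are hypothetical Python stand-ins that mirror only the two counters named in the commit message, assuming a context starts with one reference and one task.

```python
class IoContext:
    """Toy stand-in for io_context; only the two counters are modelled."""

    def __init__(self):
        self.refcount = 1   # assumed: one reference held by the creating task
        self.nr_tasks = 1   # assumed: one task attached at creation


def copy_io_clone_io(ioc):
    # With CLONE_IO the child shares the context: both counters go up.
    ioc.refcount += 1
    ioc.nr_tasks += 1
    return ioc


def put_io_context(ioc):
    # Drop one reference; True means the context could now be freed.
    ioc.refcount -= 1
    return ioc.refcount == 0


def exit_io_context_buggy(ioc):
    # Old behaviour: the reference is dropped only by the last task.
    ioc.nr_tasks -= 1
    if ioc.nr_tasks == 0:
        return put_io_context(ioc)
    return False


def exit_io_context_fixed(ioc):
    # Fixed behaviour: every exiting task drops its own reference.
    ioc.nr_tasks -= 1
    return put_io_context(ioc)


def remaining_refs(exit_fn):
    """Parent clones one child with CLONE_IO, then both tasks exit."""
    ioc = IoContext()
    copy_io_clone_io(ioc)
    exit_fn(ioc)   # child exits
    exit_fn(ioc)   # parent exits
    return ioc.refcount


print(remaining_refs(exit_io_context_buggy))   # one reference is never dropped
print(remaining_refs(exit_io_context_fixed))   # all references are dropped
```

In this model the buggy path leaves one reference behind for every CLONE_IO child, which is the leak the commit title refers to.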

cfq-iosched: make nonrot check logic consistent · 3c764b7a

Committed by Shaohua Li on Dec 04, 2009

cfq_arm_slice_timer() has logic to disable the idle window for SSD devices. The
same thing should be done in cfq_select_queue() too; otherwise we will still see
the idle window. This makes the nonrot check logic consistent in cfq.
In tests on an Intel SSD with the low_latency knob turned off, this patch can
triple disk throughput for multi-threaded sequential reads.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

io controller: quick fix for blk-cgroup and modular CFQ · 237e5bc4

Committed by Jens Axboe on Dec 04, 2009

It's currently not an allowed configuration, so express that in Kconfig.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: move IO controller declarations to a header file · f2eecb91

Committed by Jens Axboe on Dec 04, 2009

They should not be declared inside some other file that's not related
to CFQ.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cfq-iosched: fix compile problem with !CONFIG_CGROUP · 2f5ea477

Committed by Jens Axboe on Dec 03, 2009

```
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
```

blkio: Documentation · 72f924f6

Committed by Vivek Goyal on Dec 03, 2009

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Wait on sync-noidle queue even if rq_noidle = 1 · c04645e5

Committed by Vivek Goyal on Dec 03, 2009

o rq_noidle() is supposed to tell cfq not to expect a request after this
  one, and hence not to idle. But this does not seem to work very well. For
  example, for direct random readers, rq_noidle = 1 but there is a next
  request coming after this one. Not idling leads to a group not getting its
  share even if group_isolation=1.

o The right solution for this issue is to scan the higher layers and set the
  right flag (WRITE_SYNC or WRITE_ODIRECT). For the time being, this
  single-line fix helps. This should not have any significant impact when we
  are not using cgroups. I will later figure out the IO paths in the higher
  layers and fix it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Implement group_isolation tunable · ae30c286

Committed by Vivek Goyal on Dec 03, 2009

o If a group is running only a random reader, then it will not have enough
  traffic to keep the disk busy and we will reduce overall throughput. This
  should result in better latencies for the random reader, though. If we don't
  idle on the random reader service tree, then this random reader will
  experience large latencies if there are other groups present in the system
  with sequential readers running in them.

o One solution, suggested by Corrado, is to keep the random readers or
  sync-noidle workload in the root group by default, so that during one
  dispatch round we idle only once on the sync-noidle tree. This means that
  all the sync-idle workload queues will be in their respective groups and we
  will see service differentiation in those, but not on the sync-noidle
  workload.

o Provide a tunable, group_isolation. If set, this will make sure that even
  sync-noidle queues go in their respective groups and we wait on these. This
  provides stronger isolation between groups, but at the expense of throughput
  if a group does not have enough traffic to keep the disk busy.

o By default group_isolation = 0

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
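
The placement rule described above can be written as a small decision function. This is a minimal sketch, not the CFQ implementation: `target_group` and the string constants are hypothetical names, and the async branch relies on the statement elsewhere in this series that async IO is mapped to the root group.

```python
SYNC_IDLE = "sync-idle"
SYNC_NOIDLE = "sync-noidle"
ASYNC = "async"


def target_group(workload, own_group, root_group="root", group_isolation=0):
    """Return the name of the group a queue would be placed in."""
    if workload == ASYNC:
        # Async queues are system wide and live in the root group.
        return root_group
    if workload == SYNC_NOIDLE and not group_isolation:
        # Default: collect sync-noidle queues in the root group so that
        # one dispatch round idles only once on the sync-noidle tree.
        return root_group
    # Sync-idle queues, or any sync queue when group_isolation is set,
    # stay in the group of the submitting task.
    return own_group


print(target_group(SYNC_NOIDLE, "grp1"))                     # default tunable
print(target_group(SYNC_NOIDLE, "grp1", group_isolation=1))  # stronger isolation
```

With the default setting only sync-idle queues see per-group service differentiation; setting the tunable trades throughput for isolation, as the message notes.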

blkio: Determine async workload length based on total number of queues · f26bd1f0

Committed by Vivek Goyal on Dec 03, 2009

o Async queues are not per group. Instead, they are system wide and maintained
  in the root group. Hence their workload slice length should be calculated
  based on the total number of queues in the system and not just the queues in
  the root group.

o As the root group's default weight is 1000, make sure to charge the async
  queue more in terms of vtime so that it does not get more time on disk
  because the root group has a higher weight.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Wait for cfq queue to get backlogged if group is empty · f75edf2d

Committed by Vivek Goyal on Dec 03, 2009

o If a queue consumes its slice and then gets deleted from the service tree,
  its associated group will also get deleted from the service tree if this was
  the only queue in the group. That will make the group lose its share.

o For queues on which idling is enabled and which have used their slice,
  wait a bit for these queues to get backlogged again and then expire them
  so that the group does not lose its share.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Propagate cgroup weight updation to cfq groups · f8d461d6

Committed by Vivek Goyal on Dec 03, 2009

o Propagate blkio cgroup weight updation to associated cfq groups.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Drop the reference to queue once the task changes cgroup · 24610333

Committed by Vivek Goyal on Dec 03, 2009

o If a task changes cgroup, drop the reference to the cfqq associated with the
  io context and set the cfqq pointer stored in ioc to NULL so that upon the
  next request arrival we will allocate a new queue in the new group.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Provide some isolation between groups · 8682e1f1

Committed by Vivek Goyal on Dec 03, 2009

o Do not allow the following three operations across groups, for isolation:
	- selection of co-operating queues
	- preemptions across groups
	- request merging across groups

o Async queues are currently global and not per group. Allow preemption of
  an async queue if a sync queue in another group gets backlogged.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Export disk time and sectors used by a group to user space · 22084190

Committed by Vivek Goyal on Dec 03, 2009

o Export disk time and sectors used by a group to user space through the
  cgroup interface.

o Also export a "dequeue" interface to cgroup which keeps track of how many
  times a group was deleted from the service tree. Helps in debugging.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Some debugging aids for CFQ · 2868ef7b

Committed by Vivek Goyal on Dec 03, 2009

o Some debugging aids for CFQ.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Take care of cgroup deletion and cfq group reference counting · b1c35769

Committed by Vivek Goyal on Dec 03, 2009

o One can choose to change the elevator or delete a cgroup. Implement group
  reference counting so that both elevator exit and cgroup deletion can
  take place gracefully.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Nauman Rafique <nauman@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Dynamic cfq group creation based on cgroup tasks belongs to · 25fb5169

Committed by Vivek Goyal on Dec 03, 2009

o Determine the cgroup that the IO-submitting task belongs to and create the
  cfq group if it does not exist already.

o Also link the cfqq and its associated cfq group.

o Currently all async IO is mapped to the root group.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Group time used accounting and workload context save restore · dae739eb

Committed by Vivek Goyal on Dec 03, 2009

o This patch introduces the functionality to account for group time when a
  queue expires. The time used decides which group goes next.

o Also introduce the functionality to save and restore the workload type
  context within a group. It might happen that once we expire the cfq queue
  and group, a different group will schedule in and we will lose the context
  of the workload type. Hence save and restore it upon queue expiry.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Implement per cfq group latency target and busy queue avg · 58ff82f3

Committed by Vivek Goyal on Dec 03, 2009

o So far we had a 300ms soft target latency system wide. Now, with the
  introduction of cfq groups, divide that latency by the number of groups to
  get a group target latency, which helps in determining the workload slice
  within a group and also the dynamic slice length of the cfq queue.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
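
The division described above is simple enough to show directly. The helper below is a hypothetical sketch: the name `group_target_latency_ms` and the integer rounding are assumptions, and the real scheduler works with averaged busy-group counts rather than a plain integer.

```python
TARGET_LATENCY_MS = 300  # system-wide soft target quoted in the commit message


def group_target_latency_ms(nr_groups):
    """Split the system-wide soft target evenly across the given groups."""
    # Guard against an empty system so the sketch never divides by zero.
    return TARGET_LATENCY_MS // max(nr_groups, 1)


for n in (1, 3, 4):
    print(n, "group(s) ->", group_target_latency_ms(n), "ms each")
```

The more groups there are, the smaller each group's latency budget, which in turn shortens the workload slice available within a group.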

blkio: Introduce per cfq group weights and vdisktime calculations · 25bc6b07

Committed by Vivek Goyal on Dec 03, 2009

o Bring in the per cfq group weight and how vdisktime is calculated for the
  group. Also bring in the functionality of updating the min_vdisktime of
  the group service tree.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
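
To illustrate how a per-group weight can translate into a share of disk time, here is a minimal weighted-virtual-time sketch. It assumes, without confirmation from the commit message, that a group's vdisktime advances by the time used scaled inversely to its weight and that the group with the smallest vdisktime is served next. `DEFAULT_WEIGHT`, `Group`, `charge` and `simulate` are hypothetical names, and min_vdisktime tracking is left out.

```python
DEFAULT_WEIGHT = 500  # assumed reference weight for this sketch


class Group:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
        self.vdisktime = 0   # virtual time consumed so far
        self.served = 0      # real disk time received so far


def charge(group, used_time):
    """Account real time and advance virtual time inversely to weight."""
    group.served += used_time
    group.vdisktime += used_time * DEFAULT_WEIGHT // group.weight


def simulate(groups, rounds, slice_time=100):
    """Always serve the group with the smallest vdisktime for one slice."""
    for _ in range(rounds):
        nxt = min(groups, key=lambda g: g.vdisktime)
        charge(nxt, slice_time)
    return {g.name: g.served for g in groups}


result = simulate([Group("A", 1000), Group("B", 500)], rounds=300)
print(result)  # A, with twice the weight, receives twice the disk time
```

A heavier group is charged less virtual time per slice, so it comes up for service more often. That yields the 2:1 split between weights 1000 and 500 in this toy run.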

blkio: Introduce blkio controller cgroup interface · 31e4c28d

Committed by Vivek Goyal on Dec 03, 2009

o This is a basic implementation of the blkio controller cgroup interface.
  This is the common interface visible to user space and should be used by
  different IO control policies as we implement those.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Introduce the root service tree for cfq groups · 1fa8f6d6

Committed by Vivek Goyal on Dec 03, 2009

o So far we just had one cfq_group in cfq_data. To create space for more than
  one cfq_group, we need to have a service tree of groups where all the groups
  can be queued if they have active cfq queues backlogged in them.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Keep queue on service tree until we expire it · f04a6424

Committed by Vivek Goyal on Dec 03, 2009

o Currently cfqq deletes a queue from the service tree if it is empty (even if
  we might idle on the queue). This patch keeps the queue on the service tree,
  hence the associated group remains on the service tree, until we decide that
  we are not going to idle on the queue and expire it.

o This just helps in time accounting for queue/group and in the implementation
  of the rest of the patches.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

blkio: Implement macro to traverse each service tree in group · 615f0259

Committed by Vivek Goyal on Dec 03, 2009

o Implement a macro to traverse each service tree in the group. This avoids
  the use of a double for loop and a special condition for the idle tree in
  4 places.

o The macro is a little twisted because of the special handling of the idle
  class service tree.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
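
The idea of hiding the double loop and the idle-tree special case behind one iterator can be sketched with a Python generator. The layout is an assumption made for illustration: two priority classes with one tree per workload type, plus a single tree for the idle class. `make_group` and `for_each_service_tree` are hypothetical names, not the kernel macro.

```python
PRIO_CLASSES = ("BE", "RT")                  # classes with per-workload trees
WORKLOADS = ("ASYNC", "SYNC_NOIDLE", "SYNC")


def make_group():
    """Build an empty group: one tree per (class, workload) plus an idle tree."""
    return {
        "trees": {p: {w: [] for w in WORKLOADS} for p in PRIO_CLASSES},
        "idle_tree": [],
    }


def for_each_service_tree(group):
    """Yield (prio_class, workload, tree) for every service tree in a group."""
    for prio in PRIO_CLASSES:
        for workload in WORKLOADS:
            yield prio, workload, group["trees"][prio][workload]
    # Special case: the idle class has a single tree with no workload split.
    yield "IDLE", None, group["idle_tree"]


group = make_group()
group["trees"]["BE"]["SYNC"].append("queue-1")
busy = [(p, w) for p, w, tree in for_each_service_tree(group) if tree]
print(busy)
```

Callers iterate once and never repeat the nested loops or the idle-tree condition, which is the duplication the commit says it removes.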

blkio: Introduce the notion of cfq groups · cdb16e8f

Committed by Vivek Goyal on Dec 03, 2009

o This patch introduces the notion of cfq groups. Soon we will be able to
  have multiple groups of different weights in the system.

o Various service trees (prioclass and workload type trees) will become per
  cfq group. So the hierarchy looks as follows.

			cfq_groups
			   |
			workload type
			   |
			cfq queue

o When a scheduling decision has to be taken, first we select the cfq group,
  then the workload within the group, and then the cfq queue within the
  workload type.

o This patch just makes the various workload service trees per cfq group and
  introduces the function to choose a group for scheduling.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
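
The three-level selection order described above (group, then workload type, then queue) can be sketched as nested lookups. The "first non-empty" policy at each level is a placeholder assumption used only to show the order of decisions, and `select_queue` and the dictionary layout are hypothetical.

```python
def select_queue(groups):
    """Pick (group, workload type, queue) in that order, or None if idle.

    groups: list of {"name": str, "workloads": {workload_type: [queues]}}
    """
    for group in groups:                                    # 1. choose a group
        for workload, queues in group["workloads"].items():  # 2. choose a workload
            if queues:
                return group["name"], workload, queues[0]   # 3. choose a queue
    return None


groups = [
    {"name": "grp1", "workloads": {"sync-idle": [], "sync-noidle": []}},
    {"name": "grp2", "workloads": {"sync-idle": [], "sync-noidle": ["q7", "q9"]}},
]
print(select_queue(groups))
```

In the scheduler each level has its own policy; the sketch keeps only the nesting shown in the hierarchy diagram.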

blkio: Set must_dispatch only if we decided to not dispatch the request · bf791937

Committed by Vivek Goyal on Dec 03, 2009

o The must_dispatch flag should be set only if we decided not to run the queue
  and dispatch the request.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

drbd_req.c: use part_[inc|dec]_in_flight() · 753c8913

Committed by Philipp Reisner on Nov 18, 2009

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

03 Dec, 2009 13 commits

writeback: remove unused nonblocking and congestion checks · 0d99519e

Committed by Wu Fengguang on Dec 03, 2009

- no one is calling wb_writeback and write_cache_pages with
  wbc.nonblocking=1 any more
- lumpy pageout will want to do nonblocking writeback without the
  congestion wait

So remove the congestion checks as suggested by Chris.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Alex Elder <aelder@sgi.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

writeback: introduce wbc.for_background · b17621fe

Committed by Wu Fengguang on Dec 03, 2009

It will lower the flush priority for NFS, and maybe more in the future.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

writeback: remove the always false bdi_cap_writeback_dirty() test · 951c30d1

Committed by Wu Fengguang on Dec 03, 2009

This is dead code because no bdi flush thread will be started for
!bdi_cap_writeback_dirty bdi.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

flusher: Fix PF_FROZEN race · bf7ec5bb

Committed by OGAWA Hirofumi on Dec 03, 2009

Touching task->flags directly is racy. thaw_process() still has a race
(it changes non_current->flags, but that is another issue); even so, I
think it is much better.

So, use thaw_process() instead.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Merge branch 'master' into for-2.6.33 · 220d0b1d

Committed by Jens Axboe on Dec 03, 2009

cfq-iosched: no dispatch limit for single queue · 474b18cc

Committed by Shaohua Li on Dec 03, 2009

Since commit 2f5cb738, each queue can send
up to 4 * 4 requests if only one queue exists. I wonder why we have such a
limit. Devices that support tagging can send more requests. For example, AHCI
can send 31 requests. A test (direct aio randread) shows that the limit
reduces disk throughput by about 4%.
On the other hand, since we send one request at a time, if another queue
pops up while the current queue is sending more than cfq_quantum requests, the
current queue will stop sending requests soon after one request, so it sounds
like there is no big latency.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: Allow devices to indicate whether discarded blocks are zeroed · 98262f27

Committed by Martin K. Petersen on Dec 03, 2009

The discard ioctl is used by mkfs utilities to clear a block device
prior to putting metadata down. However, not all devices return zeroed
blocks after a discard. Some drives return stale data, potentially
containing old superblocks. It is therefore important to know whether
discarded blocks are properly zeroed.

Both ATA and SCSI drives have configuration bits that indicate whether
zeroes are returned after a discard operation. Implement a block level
interface that allows this information to be bubbled up the stack and
queried via a new block device ioctl.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Linux 2.6.32 · 22763c5c

Committed by Linus Torvalds on Dec 02, 2009

VIDEO: Correct use of request_region/request_mem_region · 0fdd07f7

Committed by Julia Lawall on Aug 09, 2009

request_region should be used with release_region, not request_mem_region.

Geert Uytterhoeven pointed out that in the case of drivers/video/gbefb.c,
the problem is actually the other way around; request_mem_region should be
used instead of request_region.

The semantic patch that finds/fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r1@
expression start;
@@

request_region(start,...)

@b1@
expression r1.start;
@@

request_mem_region(start,...)

@depends on !b1@
expression r1.start;
expression E;
@@

- release_mem_region
+ release_region
  (start,E)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

SPI: spi_txx9: Fix bit rate calculation · dbf763a2

Committed by Atsushi Nemoto on Sep 03, 2009

TXx9 SPI bit rate is calculated by:
        fBR = (spi-baseclk) / (n + 1)
Fix calculation of min_speed_hz, max_speed_hz and n.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
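
The formula above can be worked through numerically. The sketch below is not the driver code: `divider_for`, `bit_rate` and the divider bounds `n_min`/`n_max` are hypothetical, chosen only to show how `n` and the reachable speed range follow from fBR = baseclk / (n + 1).

```python
def bit_rate(baseclk_hz, n):
    """fBR = baseclk / (n + 1), as stated in the commit message."""
    return baseclk_hz / (n + 1)


def divider_for(baseclk_hz, target_hz, n_min=1, n_max=255):
    """Smallest n whose bit rate does not exceed target_hz.

    n_min and n_max are assumed hardware limits for this sketch. If the
    result is clamped to n_max, the rate stays above target_hz.
    """
    n = -(-baseclk_hz // target_hz) - 1   # ceil(baseclk / target) - 1
    return min(max(n, n_min), n_max)


def speed_range(baseclk_hz, n_min=1, n_max=255):
    """(min_speed_hz, max_speed_hz) reachable with the assumed divider range."""
    return bit_rate(baseclk_hz, n_max), bit_rate(baseclk_hz, n_min)


baseclk = 10_000_000
n = divider_for(baseclk, 3_000_000)
print(n, bit_rate(baseclk, n))   # divider and resulting rate at or below target
print(speed_range(baseclk))
```

Rounding the division up before subtracting one keeps the resulting rate at or below the requested speed.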

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 · 56f3f55c

Committed by Linus Torvalds on Dec 02, 2009

```
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Correct WM831X_MAX_ISEL_VALUE
```

Input: i8042 - add Dell Vostro 1320, 1520 and 1720 to the reset list · 049e2d13

Committed by Anisse Astier on Dec 01, 2009

These laptops often leave i8042 in a weird state, resulting in a
non-operational touchpad and keyboard.

Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://neil.brown.name/md · 0a45281f

Committed by Linus Torvalds on Dec 02, 2009

```
* 'for-linus' of git://neil.brown.name/md:
  md: revert incorrect fix for read error handling in raid1.
```