提交 · 2fc391112fb6f3424435a3aa2fda887497b5f807 · openeuler / raspberrypi-kernel

10 8月, 2009 1 次提交

locking, sched: Give waitqueue spinlocks their own lockdep classes · 2fc39111

由 Peter Zijlstra 提交于 8月 10, 2009

Give waitqueue spinlocks their own lockdep classes when they
are initialised from init_waitqueue_head().  This means that
struct wait_queue::func functions can operate other waitqueues.

This is used by CacheFiles to catch the page from a backing fs
being unlocked and to wake up another thread to take a copy of
it.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Tested-by: NTakashi Iwai <tiwai@suse.de>
Cc: linux-cachefs@redhat.com
Cc: torvalds@osdl.org
Cc: akpm@linux-foundation.org
LKML-Reference: <20090810113305.17284.81508.stgit@warthog.procyon.org.uk>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2fc39111

09 8月, 2009 1 次提交

perf_counter: Fix/complete ftrace event records sampling · f413cdb8

由 Frederic Weisbecker 提交于 8月 07, 2009

This patch implements the kernel side support for ftrace event
record sampling.

A new counter sampling attribute is added:

   PERF_SAMPLE_TP_RECORD

which requests ftrace events record sampling. In this case
if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
fires, we emit the tracepoint binary record to the
perfcounter event buffer, as a sample.

Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
record:

 perf record -f -F 1 -a -e workqueue:workqueue_execution
 perf report -D

 0x21e18 [0x48]: event: 9
 .
 . ... raw event: size 72 bytes
 .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
 .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
 .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
 .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
 .  0040:  e0 b1 31 81 ff ff ff ff                          .......
.
0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33

The raw ftrace binary record starts at offset 0020.

Translation:

 struct trace_entry {
	type		= 0x2b = 43;
	flags		= 1;
	preempt_count	= 2;
	pid		= 0xa = 10;
	tgid		= 0xa = 10;
 }

 thread_comm = "events/1"
 thread_pid  = 0xa = 10;
 func	    = 0xffffffff8131b1e0 = flush_to_ldisc()

What will come next?

 - Userspace support ('perf trace'), 'flight data recorder' mode
   for perf trace, etc.

 - The unconditional copy from the profiling callback brings
   some costs however if someone wants no such sampling to
   occur, and needs to be fixed in the future. For that we need
   to have an instant access to the perf counter attribute.
   This is a matter of a flag to add in the struct ftrace_event.

 - Take care of the events recursivity! Don't ever try to record
   a lock event for example, it seems some locking is used in
   the profiling fast path and lead to a tracing recursivity.
   That will be fixed using raw spinlock or recursivity
   protection.

 - [...]

 - Profit! :-)
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f413cdb8

08 8月, 2009 4 次提交

bzip2/lzma/gzip: fix comments describing decompressor API · daeb6b6f

由 Phillip Lougher 提交于 8月 06, 2009

Fix and improve comments in decompress/generic.h that describe the
decompressor API.  Also remove an unused definition, and rename INBUF_LEN
in lib/decompress_inflate.c to conform to bzip2/lzma naming.
Signed-off-by: NPhillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

daeb6b6f

mm: make set_mempolicy(MPOL_INTERLEAV) N_HIGH_MEMORY aware · 4bfc4495

由 KAMEZAWA Hiroyuki 提交于 8月 06, 2009

At first, init_task's mems_allowed is initialized as this.
 init_task->mems_allowed == node_state[N_POSSIBLE]

And cpuset's top_cpuset mask is initialized as this
 top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]

Before 2.6.29:
policy's mems_allowed is initialized as this.

  1. update tasks->mems_allowed by its cpuset->mems_allowed.
  2. policy->mems_allowed = nodes_and(tasks->mems_allowed, user's mask)

Updating task's mems_allowed in reference to top_cpuset's one.
cpuset's mems_allowed is aware of N_HIGH_MEMORY, always.

In 2.6.30: After commit 58568d2a
("cpuset,mm: update tasks' mems_allowed in time"), policy's mems_allowed
is initialized as this.

  1. policy->mems_allowd = nodes_and(task->mems_allowed, user's mask)

Here, if task is in top_cpuset, task->mems_allowed is not updated from
init's one.  Assume user excutes command as #numactrl --interleave=all
,....

  policy->mems_allowd = nodes_and(N_POSSIBLE, ALL_SET_MASK)

Then, policy's mems_allowd can includes a possible node, which has no pgdat.

MPOL's INTERLEAVE just scans nodemask of task->mems_allowd and access this
directly.

  NODE_DATA(nid)->zonelist even if NODE_DATA(nid)==NULL

Then, what's we need is making policy->mems_allowed be aware of
N_HIGH_MEMORY.  This patch does that.  But to do so, extra nodemask will
be on statck.  Because I know cpumask has a new interface of
CPUMASK_ALLOC(), I added it to node.

This patch stands on old behavior.  But I feel this fix itself is just a
Band-Aid.  But to do fundametal fix, we have to take care of memory
hotplug and it takes time.  (task->mems_allowd should be N_HIGH_MEMORY, I
think.)

mpol_set_nodemask() should be aware of N_HIGH_MEMORY and policy's nodemask
should be includes only online nodes.

In old behavior, this is guaranteed by frequent reference to cpuset's
code.  Now, most of them are removed and mempolicy has to check it by
itself.

To do check, a few nodemask_t will be used for calculating nodemask.  But,
size of nodemask_t can be big and it's not good to allocate them on stack.

Now, cpumask_t has CPUMASK_ALLOC/FREE an easy code for get scratch area.
NODEMASK_ALLOC/FREE shoudl be there.

[akpm@linux-foundation.org: cleanups & tweaks]
Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Paul Menage <menage@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4bfc4495

vfs: add __destroy_inode · 2e00c97e

由 Christoph Hellwig 提交于 8月 07, 2009

When we want to tear down an inode that lost the add to the cache race
in XFS we must not call into ->destroy_inode because that would delete
the inode that won the race from the inode cache radix tree.

This patch provides the __destroy_inode helper needed to fix this,
the actual fix will be in th next patch. As XFS was the only reason
destroy_inode was exported we shift the export to the new __destroy_inode.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NEric Sandeen <sandeen@sandeen.net>

2e00c97e

vfs: fix inode_init_always calling convention · 54e34621

由 Christoph Hellwig 提交于 8月 07, 2009

Currently inode_init_always calls into ->destroy_inode if the additional
initialization fails. That's not only counter-intuitive because
inode_init_always did not allocate the inode structure, but in case of
XFS it's actively harmful as ->destroy_inode might delete the inode from
a radix-tree that has never been added. This in turn might end up
deleting the inode for the same inum that has been instanciated by
another process and cause lots of cause subtile problems.

Also in the case of re-initializing a reclaimable inode in XFS it would
free an inode we still want to keep alive.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NEric Sandeen <sandeen@sandeen.net>

54e34621

06 8月, 2009 2 次提交

Input: matrix_keypad - make matrix keymap size dynamic · d82f1c35

由 Eric Miao 提交于 8月 05, 2009

Remove assumption on the shift and size of rows/columns form
matrix_keypad driver.
Signed-off-by: NEric Miao <eric.y.miao@gmail.com>
Signed-off-by: NDmitry Torokhov <dtor@mail.ru>

d82f1c35

ftrace: Fix perf-tracepoint OOPS · af6af30c

由 Peter Zijlstra 提交于 8月 05, 2009

Not all tracepoints are created equal, in specific the ftrace
tracepoints are created with TRACE_EVENT_FORMAT() which does
not generate the needed bits to tie them into perf counters.

For those events, don't create the 'id' file and fail
->profile_enable when their ID is specified through other
means.
Reported-by: NChris Mason <chris.mason@oracle.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1249497664.5890.4.camel@laptop>
[ v2: fix build error in the !CONFIG_EVENT_PROFILE case ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

af6af30c

05 8月, 2009 2 次提交

KVM: fix ack not being delivered when msi present · 5116d8f6

由 Michael S. Tsirkin 提交于 7月 26, 2009

kvm_notify_acked_irq does not check irq type, so that it sometimes
interprets msi vector as irq.  As a result, ack notifiers are not
called, which typially hangs the guest.  The fix is to track and
check irq type.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

5116d8f6

tty-ldisc: make refcount be atomic_t 'users' count · 18eac1cc

由 Linus Torvalds 提交于 8月 03, 2009

This is pure preparation of changing the ldisc reference counting to be
a true refcount that defines the lifetime of the ldisc.  But this is a
purely syntactic change for now to make the next steps easier.

This patch should make no semantic changes at all. But I wanted to make
the ldisc refcount be an atomic (I will be touching it without locks
soon enough), and I wanted to rename it so that there isn't quite as
much confusion between 'ldo->refcount' (ldisk operations refcount) and
'ld->refcount' (ldisc refcount itself) in the same file.

So it's now an atomic 'ld->users' count. It still starts at zero,
despite having a reference from 'tty->ldisc', but that will change once
we turn it into a _real_ refcount.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Tested-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: NSergey Senozhatsky <sergey.senozhatsky@mail.by>
Acked-by: NAlan Cox <alan@linux.intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

18eac1cc

03 8月, 2009 2 次提交

mtd: fix the conversion from dev to mtd_info · 6afc4fdb

由 Saeed Bishara 提交于 7月 28, 2009

The patch fixes a bug when converting dev to mtd_info by using the
drvdata of the dev, the previous code used
container_of(dev, struct mtd_info, dev), but won't work for the mtdXro
devices as they created without being contained inside mtd_info structure.
Signed-off-by: NSaeed Bishara <saeed@marvell.com>
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>

6afc4fdb

mtd: let include/linux/mtd/partitions.h stand on its own · 7699ad35

由 Nicolas Pitre 提交于 6月 15, 2009

When declaring static MTD partitions in board specific code, only
including <include/linux/mtd/partitions.h> should suffice without
gcc nagging us with:

In file included from arch/arm/mach-kirkwood/sheevaplug-setup.c:14:
include/linux/mtd/partitions.h:50: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:50: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mtd/partitions.h:51: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:61: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:67: warning: 'struct mtd_info' declared inside parameter list
Signed-off-by: NNicolas Pitre <nico@marvell.com>
Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>

7699ad35

02 8月, 2009 1 次提交

perf_counter: Full task tracing · 9f498cc5

由 Peter Zijlstra 提交于 7月 23, 2009

In order to be able to distinguish between no samples due to
inactivity and no samples due to task ended, Arjan asked for
PERF_EVENT_EXIT events. This is useful to the boot delay
instrumentation (bootchart) app.

This patch changes the PERF_EVENT_FORK to be emitted on every
clone, and adds PERF_EVENT_EXIT to be emitted on task exit,
after the task's counters have been closed.

This task tracing is controlled through: attr.comm || attr.mmap
and through the new attr.task field.
Suggested-by: NArjan van de Ven <arjan@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
[ cleaned up perf_counter.h a bit ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9f498cc5

01 8月, 2009 1 次提交

block: Add a wrapper for setting minimum request size without a queue · 7c958e32

由 Martin K. Petersen 提交于 7月 31, 2009

Introduce blk_limits_io_min() and make blk_queue_io_min() call it.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

7c958e32

31 7月, 2009 4 次提交

clocksource: Save mult_orig in clocksource_disable() · c7121843

由 Magnus Damm 提交于 7月 28, 2009

To fix the common case where ->enable() does not set up
mult, make sure mult_orig is saved in mult on disable.

Also add comments to explain why we do this.
Signed-off-by: NMagnus Damm <damm@igel.co.jp>
Cc: johnstul@us.ibm.com
Cc: lethal@linux-sh.org
Cc: akpm@linux-foundation.org
LKML-Reference: <20090618152432.10136.9932.sendpatchset@rx1.opensource.se>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

c7121843

cb710: use SG_MITER_TO_SG/SG_MITER_FROM_SG · 4b2a108c

由 Sebastian Andrzej Siewior 提交于 6月 22, 2009

the code allready uses flush_kernel_dcache_page(). This patch updates the
driver to the recent sg API changes which require that either SG_MITER_TO_SG
or SG_MITER_FROM_SG is set. SG_MITER_TO_SG calls flush_kernel_dcache_page()
in sg_mitter_stop()
Signed-off-by: NSebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: NPierre Ossman <pierre@ossman.eu>

4b2a108c

lib/scatterlist: add a flags to signalize mapping direction · 6de7e356

由 Sebastian Andrzej Siewior 提交于 6月 18, 2009

sg_miter_start() is currently unaware of the direction of the copy
process (to or from the scatter list). It is important to know the
direction because the page has to be flushed in case the data written
is seen on a different mapping in user land on cache incoherent
architectures.
Signed-off-by: NSebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NPierre Ossman <pierre@ossman.eu>

6de7e356

io context: fix ref counting · cbb4f264

由 Li Zefan 提交于 7月 31, 2009

Commit d9c7d394
("block: prevent possible io_context->refcount overflow") mistakenly
changed atomic_inc(&ioc->nr_tasks) to atomic_long_inc(&ioc->refcount).
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NNikanth Karthikesan <knikanth@suse.de>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

cbb4f264

30 7月, 2009 9 次提交

lguest and virtio: cleanup struct definitions to Linux style. · 1842f23c

由 Rusty Russell 提交于 7月 30, 2009

I've been doing this for years, and akpm picked me up on it about 12
months ago.  lguest partly serves as example code, so let's do it Right.

Also, remove two unused fields in struct vblk_info in the example launcher.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>

1842f23c

lguest: fix comment style · 2e04ef76

由 Rusty Russell 提交于 7月 30, 2009

I don't really notice it (except to begrudge the extra vertical
space), but Ingo does.  And he pointed out that one excuse of lguest
is as a teaching tool, it should set a good example.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>

2e04ef76

uio: mark uio.h functions __KERNEL__ only · 812ed032

由 Jiri Slaby 提交于 7月 29, 2009

To avoid userspace build failures such as:

.../linux/uio.h:37: error: expected `=', `,', `;', `asm' or `__attribute__' before `iov_length'
.../linux/uio.h:47: error: expected declaration specifiers or `...' before `size_t'

move uio functions inside a __KERNEL__ block.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Acked-by: NSam Ravnborg <sam@ravnborg.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

812ed032

lib: flexible array implementation · 534acc05

由 Dave Hansen 提交于 7月 29, 2009

Once a structure goes over PAGE_SIZE*2, we see occasional allocation
failures.  Some people have chosen to switch over to things like vmalloc()
that will let them keep array-like access to such a large structures.
But, vmalloc() has plenty of downsides.

Here's an alternative.  I think it's what Andrew was suggesting here:

	http://lkml.org/lkml/2009/7/2/518

I call it a flexible array.  It does all of its work in PAGE_SIZE bits, so
never does an order>0 allocation.  The base level has
PAGE_SIZE-2*sizeof(int) bytes of storage for pointers to the second level.
 So, with a 32-bit arch, you get about 4MB (4183112 bytes) of total
storage when the objects pack nicely into a page.  It is half that on
64-bit because the pointers are twice the size.  There's a table detailing
this in the code.

There are kerneldocs for the functions, but here's an
overview:

flex_array_alloc() - dynamically allocate a base structure
flex_array_free() - free the array and all of the
		    second-level pages
flex_array_free_parts() - free the second-level pages, but
			  not the base (for static bases)
flex_array_put() - copy into the array at the given index
flex_array_get() - copy out of the array at the given index
flex_array_prealloc() - preallocate the second-level pages
			between the given indexes to
			guarantee no allocs will occur at
			put() time.

We could also potentially just pass the "element_size" into each of the
API functions instead of storing it internally.  That would get us one
more base pointer on 32-bit.

I've been testing this by running it in userspace.  The header and patch
that I've been using are here, as well as the little script I'm using to
generate the size table which goes in the kerneldocs.

	http://sr71.net/~dave/linux/flexarray/

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

534acc05

pps.h needs <linux/types.h> · f5a55efa

由 Dave Jones 提交于 7月 29, 2009

Found with make headers_check

/usr/include/linux/pps.h:52: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: NDave Jones <davej@redhat.com>
Cc: Rodolfo Giometti <giometti@linux.it>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f5a55efa

cgroup avoid permanent sleep at rmdir · 88703267

由 KAMEZAWA Hiroyuki 提交于 7月 29, 2009

After commit ec64f515 ("cgroup: fix
frequent -EBUSY at rmdir"), cgroup's rmdir (especially against memcg)
doesn't return -EBUSY by temporary ref counts.  That commit expects all
refs after pre_destroy() is temporary but...it wasn't.  Then, rmdir can
wait permanently.  This patch tries to fix that and change followings.

 - set CGRP_WAIT_ON_RMDIR flag before pre_destroy().
 - clear CGRP_WAIT_ON_RMDIR flag when the subsys finds racy case.
   if there are sleeping ones, wakes them up.
 - rmdir() sleeps only when CGRP_WAIT_ON_RMDIR flag is set.
Tested-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reported-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: NPaul Menage <menage@google.com>
Acked-by: NBalbir Sigh <balbir@linux.vnet.ibm.com>
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

88703267

cgroups: fix pid namespace bug · 096b7fe0

由 Li Zefan 提交于 7月 29, 2009

The bug was introduced by commit cc31edce
("cgroups: convert tasks file to use a seq_file with shared pid array").

We cache a pid array for all threads that are opening the same "tasks"
file, but the pids in the array are always from the namespace of the
last process that opened the file, so all other threads will read pids
from that namespace instead of their own namespaces.

To fix it, we maintain a list of pid arrays, which is keyed by pid_ns.
The list will be of length 1 at most time.
Reported-by: NPaul Menage <menage@google.com>
Idea-by: NPaul Menage <menage@google.com>
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Reviewed-by: NSerge Hallyn <serue@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

096b7fe0

pty: avoid forcing 'low_latency' tty flag · e043e42b

由 OGAWA Hirofumi 提交于 7月 29, 2009

We really don't want to mark the pty as a low-latency device, because as
Alan points out, the ->write method can be called from an IRQ (ppp?),
and that means we can't use ->low_latency=1 as we take mutexes in the
low_latency case.

So rather than using low_latency to force the written data to be pushed
to the ldisc handling at 'write()' time, just make the reader side (or
the poll function) do the flush when it checks whether there is data to
be had.

This also fixes the problem with lost data in an emacs compile buffer
(bugzilla 13815), and we can thus revert the low_latency pty hack
(commit 3a542974: "pty: quickfix for the
pty ENXIO timing problems").
Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[ Modified to do the tty_flush_to_ldisc() inside input_available_p() so
  that it triggers for both read and poll()  - Linus]
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e043e42b

PM / Hibernate: Replace bdget call with simple atomic_inc of i_count · dddac6a7

由 Alan Jenkins 提交于 7月 29, 2009

Create bdgrab(). This function copies an existing reference to a
block_device. It is safe to call from any context.

Hibernation code wishes to copy a reference to the active swap device.
Right now it calls bdget() under a spinlock, but this is wrong because
bdget() can sleep. It doesn't need a full bdget() because we already
hold a reference to active swap devices (and the spinlock protects
against swapoff).

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13827Signed-off-by: NAlan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>

dddac6a7

29 7月, 2009 1 次提交

libata: accept late unlocking of HPA · 5920dadf

由 Tejun Heo 提交于 7月 15, 2009

On certain configurations, HPA isn't or can't be unlocked during
probing but it somehow ends up unlocked afterwards.  In the following
thread, the problem can be reliably reproduced after resuming from
STR.  The BIOS turns on HPA during boot but forgets to do it during
resume.

  http://thread.gmane.org/gmane.linux.kernel/858310

This patch updates libata revalidation such that it considers native
n_sectors.  If the device size has increased to match native
n_sectors, it's assumed that HPA has been unlocked involuntarily and
the device is recognized as the same one.  This should be fairly safe
while nicely working around the problem.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NChristof Warlich <christof@warlich.name>
Signed-off-by: NJeff Garzik <jgarzik@redhat.com>

5920dadf

28 7月, 2009 1 次提交

ip: fix logic of reverse path filter sysctl · 27fed417

由 Stephen Hemminger 提交于 7月 27, 2009

Even though reverse path filter was changed from simple boolean to
trinary control, the loose mode only works if both all and device are
configured because of this logic error.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

27fed417

25 7月, 2009 1 次提交

V4L/DVB (12283): gspca - sn9c20x: New subdriver for sn9c201 and sn9c202 bridges. · 26e744b6

由 Brian Johnson 提交于 7月 19, 2009

Signed-off-by: NBrian Johnson <brijohn@gmail.com>
Signed-off-by: NJean-Francois Moine <moinejf@free.fr>
Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>

26e744b6

24 7月, 2009 1 次提交

dm table: pass correct dev area size to device_area_is_valid · 5dea271b

由 Mike Snitzer 提交于 7月 23, 2009

Incorrect device area lengths are being passed to device_area_is_valid().

The regression appeared in 2.6.31-rc1 through commit
754c5fc7.

With the dm-stripe target, the size of the target (ti->len) was used
instead of the stripe_width (ti->len/#stripes).  An example of a
consequent incorrect error message is:

  device-mapper: table: 254:0: sdb too small for target
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

5dea271b

23 7月, 2009 2 次提交

of/mdio: Add support function for Ethernet fixed-link property · 24c30dbb

由 Anton Vorontsov 提交于 7月 16, 2009

Fixed-link support is broken for the ucc_eth, gianfar, and fs_enet
device drivers.  The "OF MDIO rework" patches removed most of the
support. Instead of re-adding fixed-link stuff to the drivers, this
patch adds a support function for parsing the fixed-link property
and obtaining a dummy phy to match.

Note: the dummy phy handling in arch/powerpc is a bit of a hack and
needs to be reworked.  This function is being added now to solve the
regression in the Ethernet drivers, but it should be considered a
temporary measure until the fixed link handling can be reworked.
Signed-off-by: NAnton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

24c30dbb

perf_counter: PERF_SAMPLE_ID and inherited counters · 7f453c24

由 Peter Zijlstra 提交于 7月 21, 2009

Anton noted that for inherited counters the counter-id as provided by
PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID
because each inherited counter gets its own id.

His suggestion was to always return the parent counter id, since that
is the primary counter id as exposed. However, these inherited
counters have a unique identifier so that events like
PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which
counter gets modified, which is important when trying to normalize the
sample streams.

This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD,
which is more useful anyway, since changing periods became a lot more
common than initially thought -- rendering PERF_EVENT_PERIOD the less
useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate
value, since it reports the value used to trigger the overflow,
whereas PERF_EVENT_PERIOD simply reports the requested period changed,
which might only take effect on the next cycle).

This still leaves us PERF_EVENT_THROTTLE to consider, but since that
_should_ be a rare occurrence, and linking it to a primary id is the
most useful bit to diagnose the problem, we introduce a
PERF_SAMPLE_STREAM_ID, for those few cases where the full
reconstruction is important.

[Does change the ABI a little, but I see no other way out]
Suggested-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1248095846.15751.8781.camel@twins>

7f453c24

22 7月, 2009 3 次提交

softirq: introduce tasklet_hrtimer infrastructure · 9ba5f005

由 Peter Zijlstra 提交于 7月 22, 2009

commit ca109491 (hrtimer: removing all ur callback modes) moved all
hrtimer callbacks into hard interrupt context when high resolution
timers are active. That breaks code which relied on the assumption
that the callback happens in softirq context.

Provide a generic infrastructure which combines tasklets and hrtimers
together to provide an in-softirq hrtimer experience.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: torvalds@linux-foundation.org
Cc: kaber@trash.net
Cc: David Miller <davem@davemloft.net>
LKML-Reference: <1248265724.27058.1366.camel@twins>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

9ba5f005

inotify: use GFP_NOFS under potential memory pressure · f44aebcc

由 Eric Paris 提交于 7月 15, 2009

inotify can have a watchs removed under filesystem reclaim.

=================================
[ INFO: inconsistent lock state ]
2.6.31-rc2 #16
---------------------------------
inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
khubd/217 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (iprune_mutex){+.+.?.}, at: [<c10ba899>] invalidate_inodes+0x20/0xe3
{IN-RECLAIM_FS-W} state was registered at:
  [<c10536ab>] __lock_acquire+0x2c9/0xac4
  [<c1053f45>] lock_acquire+0x9f/0xc2
  [<c1308872>] __mutex_lock_common+0x2d/0x323
  [<c1308c00>] mutex_lock_nested+0x2e/0x36
  [<c10ba6ff>] shrink_icache_memory+0x38/0x1b2
  [<c108bfb6>] shrink_slab+0xe2/0x13c
  [<c108c3e1>] kswapd+0x3d1/0x55d
  [<c10449b5>] kthread+0x66/0x6b
  [<c1003fdf>] kernel_thread_helper+0x7/0x10
  [<ffffffff>] 0xffffffff

Two things are needed to fix this.  First we need a method to tell
fsnotify_create_event() to use GFP_NOFS and second we need to stop using
one global IN_IGNORED event and allocate them one at a time.  This solves
current issues with multiple IN_IGNORED on a queue having tail drop
problems and simplifies the allocations since we don't have to worry about
two tasks opperating on the IGNORED event concurrently.
Signed-off-by: NEric Paris <eparis@redhat.com>

f44aebcc

rfkill: remove too-strict __must_check · e56f0975

由 Alan Jenkins 提交于 7月 18, 2009

Some drivers don't need the return value of rfkill_set_hw_state(),
so it should not be marked as __must_check.
Signed-off-by: NAlan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

e56f0975

21 7月, 2009 1 次提交

genirq: Delegate irq affinity setting to the irq thread · 591d2fb0

由 Thomas Gleixner 提交于 7月 21, 2009

irq_set_thread_affinity() calls set_cpus_allowed_ptr() which might
sleep, but irq_set_thread_affinity() is called with desc->lock held
and can be called from hard interrupt context as well. The code has
another bug as it does not hold a ref on the task struct as required
by set_cpus_allowed_ptr().

Just set the IRQTF_AFFINITY bit in action->thread_flags. The next time
the thread runs it migrates itself. Solves all of the above problems
nicely.

Add kerneldoc to irq_set_thread_affinity() while at it.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>

591d2fb0

18 7月, 2009 1 次提交

sched: fix nr_uninterruptible accounting of frozen tasks really · 6301cb95

由 Thomas Gleixner 提交于 7月 17, 2009

commit e3c8ca83 (sched: do not count frozen tasks toward load) broke
the nr_uninterruptible accounting on freeze/thaw. On freeze the task
is excluded from accounting with a check for (task->flags &
PF_FROZEN), but that flag is cleared before the task is thawed. So
while we prevent that the task with state TASK_UNINTERRUPTIBLE
is accounted to nr_uninterruptible on freeze we decrement
nr_uninterruptible on thaw.

Use a separate flag which is handled by the freezing task itself. Set
it before calling the scheduler with TASK_UNINTERRUPTIBLE state and
clear it after we return from frozen state.

Cc: <stable@kernel.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

6301cb95

17 7月, 2009 2 次提交

virtio_net: Sync header with qemu · e79f07e2

由 Alex Williamson 提交于 7月 07, 2009

Qemu added support for a few extra RX modes that Linux doesn't
currently make use of.  Sync the headers to maintain consistency.
Signed-off-by: NAlex Williamson <alex.williamson@hp.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

e79f07e2

lguest: fix journey · 5780888b

由 Matias Zabaljauregui 提交于 6月 18, 2009

fix: "make Guest" was complaining about duplicated G:032
Signed-off-by: NMatias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

5780888b