1. 04 Jan 2012: 6 commits
  2. 01 Jan 2012: 1 commit
    • futex: Fix uninterruptible loop due to gate_area · e6780f72
      Committed by Hugh Dickins
      It was found (by Sasha) that if you use a futex located in the gate
      area we get stuck in an uninterruptible infinite loop, much like the
      ZERO_PAGE issue.
      
      While looking at this problem, PeterZ realized you'll get into similar
      trouble when hitting any install_special_mapping() mapping.  And are there
      still drivers setting up their own special mmaps without page->mapping,
      and without special VM or pte flags to make get_user_pages fail?
      
      In most cases, if page->mapping is NULL, we do not need to retry at all:
      Linus points out that even /proc/sys/vm/drop_caches poses no problem,
      because it ends up using remove_mapping(), which takes care not to
      interfere when the page reference count is raised.
      
      But there is still one case which does need a retry: if memory pressure
      called shmem_writepage in between get_user_pages_fast dropping the page
      table lock and our acquiring the page lock, then the page gets switched
      from filecache to swapcache (and ->mapping set to NULL) whatever the
      refcount.  Fault it back in to get the page->mapping needed for
      key->shared.inode.  (A sketch of this check follows this entry.)
      Reported-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
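      Below is a minimal sketch of the retry logic described above, assuming it
      sits in get_futex_key() after get_user_pages_fast() has pinned the page
      and the page lock has been taken; the structure follows the commit
      message, not necessarily the exact committed diff.

          if (!page->mapping) {
                  /*
                   * No mapping: ZERO_PAGE, gate area, a driver's special mmap,
                   * or a file page truncated/invalidated after we pinned it.
                   * All of these can simply fail -- except shmem, where memory
                   * pressure may have moved the page from filecache to
                   * swapcache beneath us, clearing ->mapping along the way.
                   */
                  int shmem_swizzled = PageSwapCache(page);

                  unlock_page(page);
                  put_page(page);

                  if (shmem_swizzled)
                          goto again;   /* fault it back in for key->shared.inode */

                  return -EFAULT;       /* fail instead of looping forever */
          }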
  3. 31 Dec 2011: 1 commit
  4. 21 Dec 2011: 2 commits
    • binary_sysctl(): fix memory leak · 3d3c8f93
      Committed by Michel Lespinasse
      binary_sysctl() calls sysctl_getname(), which allocates from the
      names_cache slab using __getname().
      
      The matching function to free the name is __putname(), not putname(),
      which should be used only to match getname() allocations.
      
      This is because when auditing is enabled, putname() calls audit_putname()
      *instead of* (not in addition to) __putname().  Then, if a syscall is in
      progress, audit_putname() does not release the name; instead, it expects
      the name to be released when the syscall completes, but that will happen
      only if audit_getname() was called previously, i.e. if the name was
      allocated with getname() rather than the naked __getname().  So,
      __getname() followed by putname() ends up leaking memory.  (A sketch of
      the correct pairing follows this entry.)
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
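      As a minimal illustration of the pairing rule described above (a
      simplified sketch with a made-up helper name, not the actual
      binary_sysctl() or sysctl_getname() code):

          /* Hypothetical helper, for illustration only. */
          static char *example_getname_raw(const char __user *uname)
          {
                  char *name = __getname();       /* raw names_cache allocation */

                  if (!name)
                          return ERR_PTR(-ENOMEM);
                  if (strncpy_from_user(name, uname, PATH_MAX) < 0) {
                          __putname(name);        /* correct: matches __getname() */
                          return ERR_PTR(-EFAULT);
                  }
                  return name;
          }

      Callers must likewise release the buffer with __putname().  Calling
      putname() here would, with auditing enabled, defer the release to audit
      bookkeeping that was never set up by getname(), so the buffer would leak.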
    • cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask · b246272e
      Committed by David Rientjes
      Kernels where MAX_NUMNODES > BITS_PER_LONG may temporarily see an empty
      nodemask in a tsk's mempolicy if its previous nodemask is remapped onto a
      new set of allowed cpuset nodes where the two nodemasks, as a result of
      the remap, are now disjoint.
      
      c0ff7453 ("cpuset,mm: fix no node to alloc memory when changing
      cpuset's mems") adds get_mems_allowed() to prevent the set of allowed
      nodes from changing for a thread.  This causes any update to a set of
      allowed nodes to stall until put_mems_allowed() is called.
      
      This stall is unnecessary, however, if at least one node remains unchanged
      in the update to the set of allowed nodes.  This was addressed by
      89e8a244 ("cpusets: avoid looping when storing to mems_allowed if one
      node remains set"), but it's still possible that an empty nodemask may be
      read from a mempolicy because the old nodemask may be remapped to the new
      nodemask during rebind.  To prevent this, only avoid the stall if there is
      no mempolicy for the thread being changed.
      
      This is a temporary solution until all reads from mempolicy nodemasks can
      be guaranteed to not be empty without the get_mems_allowed()
      synchronization.
      
      This patch also moves the check for nodemask intersection inside
      task_lock() so that tsk->mems_allowed cannot change.  This ensures that
      nothing can set this tsk's mems_allowed out from under us and also
      protects tsk->mempolicy.  (A sketch of the resulting ordering follows
      this entry.)
      Reported-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Paul Menage <paul@paulmenage.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
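      A minimal sketch of the ordering described above (the function and
      variable names are illustrative, and the reader-waiting machinery is
      elided; the committed code may differ):

          static void example_change_task_nodemask(struct task_struct *tsk,
                                                   nodemask_t *newmems)
          {
                  bool need_stall;

                  task_lock(tsk);
                  /*
                   * The stall may be skipped only when it is provably safe:
                   * the task has no mempolicy to rebind and at least one
                   * currently allowed node survives the update, so a
                   * concurrent reader can never observe an empty nodemask
                   * even though the store is not atomic when
                   * MAX_NUMNODES > BITS_PER_LONG.
                   */
                  need_stall = tsk->mempolicy != NULL ||
                               !nodes_intersects(*newmems, tsk->mems_allowed);
                  if (need_stall) {
                          /* step 1: publish old|new so readers never see an empty mask */
                          nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
                          /* ... wait out readers inside get_mems_allowed() ... */
                  }
                  /* step 2 (or the only step): publish the final mask */
                  tsk->mems_allowed = *newmems;
                  task_unlock(tsk);
          }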
  5. 20 Dec 2011: 1 commit
    • cgroups: fix a css_set not found bug in cgroup_attach_proc · e0197aae
      Committed by Mandeep Singh Baines
      There is a BUG when migrating a PF_EXITING proc. Since css_set_prefetch()
      is not called for the PF_EXITING case, find_existing_css_set() will return
      NULL inside cgroup_task_migrate(), causing a BUG.
      
      This bug is easy to reproduce. Create a zombie and echo its pid to
      cgroup.procs.
      
      $ cat zombie.c
      #include <unistd.h>
      
      int main()
      {
        if (fork())
            pause();
        return 0;
      }
      $
      
      We are hitting this bug pretty regularly on ChromeOS.
      
      This bug is already fixed by Tejun Heo's cgroup patchset, which is
      targeted for the next merge window:
      
      https://lkml.org/lkml/2011/11/1/356
      
      I've created a smaller patch here which just fixes this bug so that a
      fix can be merged into the current release and stable.
      Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
      Downstream-Bug-Report: http://crosbug.com/23953
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: containers@lists.linux-foundation.org
      Cc: cgroups@vger.kernel.org
      Cc: stable@kernel.org
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul Menage <paul@paulmenage.org>
      Cc: Olof Johansson <olofj@chromium.org>
  6. 19 Dec 2011: 1 commit
  7. 16 Dec 2011: 1 commit
  8. 14 Dec 2011: 1 commit
  9. 09 Dec 2011: 2 commits
  10. 07 Dec 2011: 2 commits
  11. 06 Dec 2011: 6 commits
  12. 05 Dec 2011: 1 commit
    • perf: Fix loss of notification with multi-event · 10c6db11
      Committed by Peter Zijlstra
      When you do:
              $ perf record -e cycles,cycles,cycles noploop 10
      
      You expect about 10,000 samples for each event, i.e., 10s at
      1000 samples/sec. However, this is not what's happening. You
      get far fewer samples, maybe 3700 samples/event:
      
      $ perf report -D | tail -15
      Aggregated stats:
                 TOTAL events:      10998
                  MMAP events:         66
                  COMM events:          2
                SAMPLE events:      10930
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      cycles stats:
                 TOTAL events:       3642
                SAMPLE events:       3642
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      
      On an Intel Nehalem or even AMD64, there are 4 counters capable
      of measuring cycles, so there is plenty of space to measure those
      events without multiplexing (even with the NMI watchdog active).
      And even with multiplexing, we'd expect roughly the same number
      of samples per event.
      
      The root of the problem was that when the event that caused the buffer
      to become full was not the first event passed on the cmdline, the user
      notification would get lost. The notification was sent to the file
      descriptor of the overflowed event but the perf tool was not polling
      on it.  The perf tool aggregates all samples into a single buffer,
      i.e., the buffer of the first event. Consequently, it assumes
      notifications for any event will come via that descriptor.
      
      The seemingly straightforward solution of moving the waitq into the
      ringbuffer object doesn't work because of lifetime issues. One could call
      perf_event_set_output() on an fd that you're also blocking on and cause
      the old rb object to be freed while its waitq is still referenced by the
      blocked thread -> FAIL.
      
      Therefore, link all events to the ringbuffer and broadcast the wakeup
      from the ringbuffer object to all possible events that could be waited
      upon. This is rather ugly, and we're open to better solutions, but it
      works for now. (A sketch of the broadcast follows this entry.)
      Reported-by: Stephane Eranian <eranian@google.com>
      Finished-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
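      A minimal sketch of the broadcast described above (field and helper names
      are assumptions; the committed code may differ): every event attached to
      a ring buffer is kept on a list hanging off that buffer, and an overflow
      wakes every waiter on that list rather than only the overflowing event's
      own waitq.

          static void ring_buffer_wakeup(struct perf_event *event)
          {
                  struct ring_buffer *rb;
                  struct perf_event *iter;

                  rcu_read_lock();
                  rb = rcu_dereference(event->rb);
                  if (rb) {
                          /* wake everyone who could be poll()ing on this buffer */
                          list_for_each_entry_rcu(iter, &rb->event_list, rb_entry)
                                  wake_up_all(&iter->waitq);
                  }
                  rcu_read_unlock();
          }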
  13. 02 Dec 2011: 5 commits
  14. 29 Nov 2011: 1 commit
  15. 25 Nov 2011: 1 commit
    • cgroup_freezer: fix freezing groups with stopped tasks · 884a45d9
      Committed by Michal Hocko
      2d3cbf8b (cgroup_freezer: update_freezer_state() does incorrect state
      transitions) removed is_task_frozen_enough() and replaced it with a simple
      frozen() call. This, however, breaks freezing for a group with stopped tasks
      because those cannot be frozen and so the group remains in CGROUP_FREEZING
      state (update_if_frozen doesn't count stopped tasks) and never reaches
      CGROUP_FROZEN.
      
      Let's add is_task_frozen_enough() back and use it at the original
      locations (update_if_frozen and try_to_freeze_cgroup). Semantically,
      stopped tasks count as frozen enough, so both cases should be covered
      when testing whether tasks are frozen. (A sketch of the restored
      predicate follows this entry.)
      
      Testcase:
      mkdir /dev/freezer
      mount -t cgroup -o freezer none /dev/freezer
      mkdir /dev/freezer/foo
      sleep 1h &
      pid=$!
      kill -STOP $pid
      echo $pid > /dev/freezer/foo/tasks
      echo FROZEN > /dev/freezer/foo/freezer.state
      while true
      do
      	cat /dev/freezer/foo/freezer.state
      	[ "`cat /dev/freezer/foo/freezer.state`" = "FROZEN" ] && break
      	sleep 1
      done
      echo OK
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tomasz Buchert <tomasz.buchert@inria.fr>
      Cc: Paul Menage <paul@paulmenage.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@kernel.org
      Signed-off-by: Tejun Heo <htejun@gmail.com>
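      A minimal sketch of the restored helper described above (the committed
      body may differ slightly): a task counts as frozen enough when it is
      truly frozen, or when it is stopped or traced while freezing is in
      progress, since such a task cannot enter the refrigerator but is not
      runnable either.

          /* Predicate used by update_if_frozen() and try_to_freeze_cgroup(). */
          static bool is_task_frozen_enough(struct task_struct *task)
          {
                  return frozen(task) ||
                         (task_is_stopped_or_traced(task) && freezing(task));
          }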
  16. 24 Nov 2011: 1 commit
  17. 19 Nov 2011: 3 commits
  18. 18 Nov 2011: 2 commits
  19. 17 Nov 2011: 1 commit
  20. 16 Nov 2011: 1 commit