Commits · 72acc854427948efed7a83da27f7dc3239ac9afc · openeuler / raspberrypi-kernel

28 Jul, 2010 18 commits

fsnotify: put inode specific fields in an fsnotify_mark in a union · 2823e04d

Committed by Eric Paris on Dec 17, 2009

The addition of marks on vfs mounts will be simplified if the inode
specific parts of a mark and the vfsmnt specific parts of a mark are
actually in a union so naming can be easy.  This patch just implements the
inode struct and the union.
Signed-off-by: Eric Paris <eparis@redhat.com>
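A minimal sketch of the layout idea, not the kernel's actual definition: the type and field names below (`sketch_mark`, `sketch_inode_part`, `sketch_mnt_part`, `kind`) are hypothetical. It shows per-object fields grouped into small structs that share storage through a union, with a tag recording which member is live.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel types are not reproduced here. */
struct sketch_inode_part {
    void *inode;              /* inode the mark is attached to */
};

struct sketch_mnt_part {
    void *mnt;                /* mount the mark is attached to */
};

enum sketch_mark_kind { SKETCH_MARK_INODE, SKETCH_MARK_MNT };

struct sketch_mark {
    unsigned int mask;            /* fields common to both kinds */
    enum sketch_mark_kind kind;   /* says which union member is live */
    union {
        struct sketch_inode_part i;
        struct sketch_mnt_part m;
    };
};

/* Return the inode of an inode mark, or NULL for any other kind. */
static void *sketch_mark_inode(const struct sketch_mark *mark)
{
    assert(mark != NULL);
    return mark->kind == SKETCH_MARK_INODE ? mark->i.inode : NULL;
}
```

With this shape, object-specific fields are reached as `mark->i.inode` or `mark->m.mnt`, so the two sets of names cannot collide.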

fsnotify: include vfsmount in should_send_event when appropriate · 3a9fb89f

Committed by Eric Paris on Dec 17, 2009

To ensure that a group will not duplicate events when it receives them based
on both the vfsmount and the inode, the should_send_event test should distinguish
those two cases.  We pass a vfsmount to this function so groups can make
their own determinations.
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: drop mask argument from fsnotify_alloc_group · 0d2e2a1d

Committed by Eric Paris on Dec 17, 2009

Nothing uses the mask argument to fsnotify_alloc_group.  This patch drops
that argument.
Signed-off-by: Eric Paris <eparis@redhat.com>

Audit: only set group mask when something is being watched · 220d14df

Committed by Eric Paris on Dec 17, 2009

Currently the audit watch group always sets a mask equal to all events it
might care about.  We instead should only set the group mask if we are
actually watching inodes.  This should be a perf win when audit watches are
compiled in.
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: fsnotify_obtain_group should be fsnotify_alloc_group · ffab8340

Committed by Eric Paris on Dec 17, 2009

fsnotify_obtain_group was intended to be able to find an already existing
group.  Nothing uses that functionality.  This just renames it to
fsnotify_alloc_group so it is clear what it is doing.
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: remove group_num altogether · 74be0cc8

Committed by Eric Paris on Dec 17, 2009

The original fsnotify interface has a group_num which was intended to be
able to find a group after it was added.  I no longer think this is a
necessary thing to do and so we remove the group_num.
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: include data in should_send calls · 8112e2d6

Committed by Eric Paris on Dec 17, 2009

fanotify is going to need to look at file->private_data to know if an event
should be sent or not.  This passes the data (which might be a file,
dentry, inode, or none) to the should_send function calls so fanotify can
get that information when available.
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: provide the data type to should_send_event · 7b0a04fb

Committed by Eric Paris on Dec 17, 2009

fanotify is only interested in event types which contain enough information
to open the original file in the context of the fanotify listener.  Since
fanotify may not want to send events if that data isn't present we pass
the data type to the should_send_event function call so fanotify can express
its lack of interest.
Signed-off-by: Eric Paris <eparis@redhat.com>

inotify: remove inotify in kernel interface · 2dfc1cae

Committed by Eric Paris on Dec 17, 2009

Nothing uses inotify in the kernel, drop it!
Signed-off-by: Eric Paris <eparis@redhat.com>

Audit: audit watch init should not be before fsnotify init · 1a3aedbc

Committed by Eric Paris on Dec 17, 2009

Audit watch init and fsnotify init both use subsys_initcall() but since the
audit watch code is linked in before the fsnotify code the audit watch code
would be using the fsnotify srcu struct before it was initialized.  This
patch fixes that problem by moving audit watch init to device_initcall() so
it happens after fsnotify is ready.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Sachin Sant <sachinp@in.ibm.com>

Audit: split audit watch Kconfig · 939a67fc

Committed by Eric Paris on Dec 17, 2009

Audit watch should depend on CONFIG_AUDIT_SYSCALL and should select
FSNOTIFY.  This splits the spaghetti-like mixing of audit_watch and
audit_filter code so they can be configured separately.
Signed-off-by: Eric Paris <eparis@redhat.com>

audit: reimplement audit_trees using fsnotify rather than inotify · 28a3a7eb

Committed by Eric Paris on Dec 17, 2009

Simply switch audit_trees from using inotify to using fsnotify for its
inode pinning and disappearing act information.
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: allow addition of duplicate fsnotify marks · 40554c3d

Committed by Eric Paris on Dec 17, 2009

This patch allows a task to add a second fsnotify mark to an inode for the
same group. This mark will be added to the end of the inode's list and
will never be found by the standard fsnotify_find_mark() function. This
is useful if a user wants to add a new mark before removing the old one.
Signed-off-by: Eric Paris <eparis@redhat.com>
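The lookup behaviour described in this commit can be illustrated with a plain singly linked list. This is only a sketch under assumptions: `sketch_dup_mark`, `sketch_add_dup_mark` and `sketch_find_dup_mark` are hypothetical names, and the kernel's real list handling and locking are not shown. Because a duplicate is appended at the tail and the lookup returns the first match, the older mark always wins.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dup_mark {
    int group_id;                    /* which group owns this mark */
    struct sketch_dup_mark *next;
};

/* Append at the tail, even if the group already has a mark in the list. */
static void sketch_add_dup_mark(struct sketch_dup_mark **head,
                                struct sketch_dup_mark *mark)
{
    assert(head != NULL && mark != NULL);
    mark->next = NULL;
    while (*head != NULL)
        head = &(*head)->next;
    *head = mark;
}

/* Return the first mark for the group; later duplicates are never reached. */
static struct sketch_dup_mark *sketch_find_dup_mark(struct sketch_dup_mark *head,
                                                    int group_id)
{
    for (; head != NULL; head = head->next)
        if (head->group_id == group_id)
            return head;
    return NULL;
}
```

Once the old mark is removed from the list, the same lookup starts returning the newer one.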

audit: do not get and put just to free a watch · a05fb6cc

Committed by Eric Paris on Dec 17, 2009

Deleting audit watch rules is not currently done under audit_filter_mutex.
It was done this way because we could not hold the mutex during inotify
manipulation. Since we are using fsnotify we don't need to do the extra
get/put pair nor do we need the private list on which to store the parents
while they are about to be freed.
Signed-off-by: Eric Paris <eparis@redhat.com>

audit: redo audit watch locking and refcnt in light of fsnotify · e118e9c5

Committed by Eric Paris on Dec 17, 2009

fsnotify allows mutexes to be held across all fsnotify operations since
it deals strictly in spinlocks.  This can simplify and reduce some of the
audit_filter_mutex taking and dropping.
Signed-off-by: Eric Paris <eparis@redhat.com>

audit: convert audit watches to use fsnotify instead of inotify · e9fd702a

Committed by Eric Paris on Dec 17, 2009

Audit currently uses inotify to pin inodes in core and to detect when
watched inodes are deleted or unmounted.  This patch uses fsnotify instead
of inotify.
Signed-off-by: Eric Paris <eparis@redhat.com>

Audit: clean up the audit_watch split · ae7b8f41

Committed by Eric Paris on Dec 17, 2009

No real changes, just cleanup to the audit_watch split patch which was done
with minimal code changes for easy review.  Now fix interfaces to make
things work better.
Signed-off-by: Eric Paris <eparis@redhat.com>

dynamic debug: move ddebug_remove_module() down into free_module() · b82bab4b

Committed by Jason Baron on Jul 27, 2010

The command

	echo "file ec.c +p" >/sys/kernel/debug/dynamic_debug/control

causes an oops.

Move the call to ddebug_remove_module() down into free_module().  In this
way it should be called from all error paths.  Currently, we are missing
the remove if the module init routine fails.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Reported-by: Thomas Renninger <trenn@suse.de>
Tested-by: Thomas Renninger <trenn@suse.de>
Cc: <stable@kernel.org>		[2.6.32+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

22 Jul, 2010 5 commits

sysrq,kdb: Use __handle_sysrq() for kdb's sysrq function · edd63cb6

Committed by Jason Wessel on Jul 21, 2010

The kdb code should not toggle the sysrq state in case an end user
wants to try and resume the normal kernel execution.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

debug_core,kdb: fix kgdb_connected bit set in the wrong place · b0679c63

Committed by Jason Wessel on Jul 21, 2010

Immediately following an exit from the kdb shell the kgdb_connected
variable should be set to zero, unless there are breakpoints planted.
If the kgdb_connected variable is not zeroed out with kdb, it is
impossible to turn off kdb.

This patch is merely a workaround for now; the real fix will check
for the breakpoints.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

Fix merge regression from external kdb to upstream kdb · 9e8b624f

Committed by Jason Wessel on Jul 21, 2010

In the process of merging kdb to the mainline, the kdb lsmod command
stopped printing the base load address of kernel modules.  This is
needed for using kdb in conjunction with external tools such as gdb.

Simply restore the functionality by adding a kdb_printf for the base
load address of the kernel modules.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

repair gdbstub to match the gdbserial protocol specification · fb82c0ff

Committed by Jason Wessel on Jul 21, 2010

The gdbserial protocol handler should return an empty packet instead
of an error string whenever it responds to a command it does not
implement.

The problem cases come from a debugger client sending
qTBuffer, qTStatus, qSearch, qSupported.

The incorrect response from the gdbstub leads the debugger clients to
not function correctly.  Recent versions of gdb will not detach
correctly as a result of this behavior.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
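For illustration, a user-space sketch of what an empty reply looks like on the wire. GDB remote protocol packets are framed as `$<payload>#<checksum>`, where the checksum is the modulo-256 sum of the payload bytes written as two hex digits, so an empty packet is `$#00`. The helper names `rsp_frame` and `sketch_reply`, and the choice of `?` as the only implemented command, are assumptions made for this example and are not the kernel gdbstub's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Frame a payload as "$<payload>#<two hex digit checksum>". */
static int rsp_frame(const char *payload, char *out, size_t out_len)
{
    unsigned char sum = 0;
    size_t i, n;

    assert(payload != NULL && out != NULL);
    n = strlen(payload);
    for (i = 0; i < n; i++)
        sum = (unsigned char)(sum + (unsigned char)payload[i]); /* sum modulo 256 */

    return snprintf(out, out_len, "$%s#%02x", payload, (unsigned int)sum);
}

/* Hypothetical dispatcher: implements only '?', everything else is unsupported. */
static int sketch_reply(const char *cmd, char *out, size_t out_len)
{
    assert(cmd != NULL);
    if (strcmp(cmd, "?") == 0)
        return rsp_frame("S05", out, out_len);  /* stop reply: signal 5 */

    /* Unknown command: answer with an empty packet, not an error string. */
    return rsp_frame("", out, out_len);
}
```

A client that sends `qTStatus` to this sketch gets `$#00` back, which it can interpret as "not supported" and carry on.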

kdb: break out of kdb_ll() when command is terminated · 1396a21b

Committed by Martin Hicks on Jul 21, 2010

Without this patch the "ll" linked-list traversal command won't
terminate when you hit q/Q.
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

19 Jul, 2010 1 commit

kmemleak: Add support for NO_BOOTMEM configurations · 9078370c

Committed by Catalin Marinas on Jul 19, 2010

With commits 08677214 and 59be5a8e, alloc_bootmem()/free_bootmem() and
friends use the early_res functions for memory management when
NO_BOOTMEM is enabled. This patch adds the kmemleak calls in the
corresponding code paths for bootmem allocations.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: stable@kernel.org

05 Jul, 2010 1 commit

module: initialize module dynamic debug later · ff49d74a

Committed by Yehuda Sadeh on Jul 03, 2010

We should initialize the module dynamic debug data structures
only after determining that the module is not loaded yet. This
fixes a bug that was introduced in 2.6.35-rc2, where, when trying
to load a module twice, we also load its dynamic printing data
twice, which causes all sorts of nasty issues. Also handle
the dynamic debug cleanup later on failure.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed a #ifdef)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

01 Jul, 2010 2 commits

sched: Cure nr_iowait_cpu() users · 8c215bd3

Committed by Peter Zijlstra on Jul 01, 2010

Commit 0224cf4c (sched: Intoduce get_cpu_iowait_time_us())
broke things by not making sure preemption was indeed disabled
by the callers of nr_iowait_cpu() which took the iowait value of
the current cpu.

This resulted in a heap of preempt warnings. Cure this by making
nr_iowait_cpu() take a cpu number and fix up the callers to pass
in the right number.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: linux-pm@lists.linux-foundation.org
LKML-Reference: <1277968037.1868.120.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
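The interface change can be pictured with a toy per-CPU counter table. This is a sketch only: `SKETCH_NR_CPUS`, `sketch_iowait` and `sketch_nr_iowait_cpu` are hypothetical, and the real function reads scheduler runqueue state. The point is that the caller names the CPU explicitly, so the result no longer depends on which CPU the code happens to be running on at the moment of the call.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical per-CPU counters of tasks waiting on I/O. */
static unsigned long sketch_iowait[SKETCH_NR_CPUS];

/* The caller passes the CPU it means; nothing here asks "which CPU am I on?". */
static unsigned long sketch_nr_iowait_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
    return sketch_iowait[cpu];
}
```

A caller that sampled a CPU number earlier, for example while preemption was disabled, can pass that same number later and still read the counter it intended.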

futex: futex_find_get_task remove credentails check · 7a0ea09a

Committed by Michal Hocko on Jun 30, 2010

futex_find_get_task is currently used (through lookup_pi_state) from two
contexts, futex_requeue and futex_lock_pi_atomic.  Neither path appears
to need the credentials check, though.  Different (e)uids
shouldn't matter at all because the only thing that is important for
shared futex is the accessibility of the shared memory.

The credentials check results in glibc assert failure or process hang (if
glibc is compiled without assert support) for shared robust pthread
mutex with priority inheritance if a process tries to lock an already held
lock owned by a process with a different euid:

pthread_mutex_lock.c:312: __pthread_mutex_lock_full: Assertion `(-(e)) != 3 || !robust' failed.

The problem is that futex_lock_pi_atomic which is called when we try to
lock already held lock checks the current holder (tid is stored in the
futex value) to get the PI state.  It uses lookup_pi_state which in turn
gets task struct from futex_find_get_task.  ESRCH is returned either
when the task is not found or if credentials check fails.

futex_lock_pi_atomic simply returns if it gets ESRCH.  glibc code,
however, doesn't expect that robust lock returns with ESRCH because it
should get either success or owner died.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30 Jun, 2010 1 commit

kexec: fix Oops in crash_shrink_memory() · e05bd336

Committed by Pavan Naregundi on Jun 29, 2010

When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
OOPSes the kernel in crash_shrink_memory.  This happens when
crash_shrink_memory tries to release the 'crashk_res' resource, which is
not reserved.  Also, the value of "/sys/kernel/kexec_crash_size" shows as 1,
when it should be 0.

This patch fixes the OOPS in crash_shrink_memory and shows
"/sys/kernel/kexec_crash_size" as 0 when crash kernel memory is not
reserved.
Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
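One plausible way a size of 1 arises, sketched under assumptions: if the region is stored as an inclusive `[start, end]` range and an unreserved region has `start == end == 0`, then `end - start + 1` evaluates to 1. The struct and function names below are hypothetical, and the check shown is only one way to special-case the unreserved state; it also assumes a genuine one-byte reservation never occurs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inclusive address range, standing in for the crash resource. */
struct sketch_res {
    unsigned long start;
    unsigned long end;
};

/* Naive inclusive size: reports 1 for an all-zero (unreserved) range. */
static size_t sketch_size_naive(const struct sketch_res *r)
{
    assert(r != NULL);
    return r->end - r->start + 1;
}

/* Treat start == end as "nothing reserved" and report 0. */
static size_t sketch_size_checked(const struct sketch_res *r)
{
    assert(r != NULL);
    if (r->end == r->start)
        return 0;
    return r->end - r->start + 1;
}
```

A shrink routine built on the checked size can then return early when the size is 0 instead of trying to release a region that was never reserved.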

25 Jun, 2010 1 commit

sched: Prevent compiler from optimising the sched_avg_update() loop · 0d98bb26

Committed by Will Deacon on May 24, 2010

GCC 4.4.1 on ARM has been observed to replace the while loop in
sched_avg_update with a call to uldivmod, resulting in the
following build failure at link-time:

kernel/built-in.o: In function `sched_avg_update':
 kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
 kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
make: *** [.tmp_vmlinux1] Error 1

This patch introduces a fake data hazard to the loop body to
prevent the compiler optimising the loop away.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

24 Jun, 2010 1 commit

sched: silence PROVE_RCU in sched_fork() · 86951599

Committed by Peter Zijlstra on Jun 22, 2010

Because cgroup_fork() is run before sched_fork() [ from copy_process() ]
and the child's pid is not yet visible, the child is pinned to its
cgroup. Therefore we can silence this warning.

A nicer solution would be moving cgroup_fork() to right after
dup_task_struct() and exclude PF_STARTING from task_subsys_state().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

23 Jun, 2010 1 commit

rcu: apply RCU protection to wake_affine() · f3b577de

Committed by Daniel J Blueman on Jun 01, 2010

The task_group() function returns a pointer that must be protected
by either RCU, the ->alloc_lock, or the cgroup lock (see the
rcu_dereference_check() in task_subsys_state(), which is invoked by
task_group()). The wake_affine() function currently does none of these,
which means that a concurrent update would be within its rights to free
the structure returned by task_group(). Because wake_affine() uses this
structure only to compute load-balancing heuristics, there is no reason
to acquire either of the two locks.

Therefore, this commit introduces an RCU read-side critical section that
starts before the first call to task_group() and ends after the last use
of the "tg" pointer returned from task_group(). Thanks to Li Zefan for
pointing out the need to extend the RCU read-side critical section from
that proposed by the original patch.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

18 Jun, 2010 2 commits

sched: Fix over-scheduling bug · 3c93717c

Committed by Alex,Shi on Jun 17, 2010

Commit e7097159 ("sched: Optimize unused cgroup configuration") introduced
an imbalanced scheduling bug.

If we do not use CGROUP, function update_h_load won't update h_load. When the
system has a large number of tasks, far more than the number of logical CPUs, the
incorrect cfs_rq[cpu]->h_load value will cause load_balance() to pull too
many tasks to the local CPU from the busiest CPU. So the busiest CPU keeps
going in a round robin. That will hurt performance.

The issue was found originally by a scientific calculation workload that
was developed by Yanmin. With that commit, the workload performance drops
about 40%.

 CPU  before    after

 00   : 2       : 7
 01   : 1       : 7
 02   : 11      : 6
 03   : 12      : 7
 04   : 6       : 6
 05   : 11      : 7
 06   : 10      : 6
 07   : 12      : 7
 08   : 11      : 6
 09   : 12      : 6
 10   : 1       : 6
 11   : 1       : 6
 12   : 6       : 6
 13   : 2       : 6
 14   : 2       : 6
 15   : 1       : 6
Reviewed-by: Yanmin zhang <yanmin.zhang@intel.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1276754893.9452.5442.camel@debian>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

nohz: Fix nohz ratelimit · 3310d4d3

Committed by Peter Zijlstra on Jun 17, 2010

Chris Wedgwood reports that 39c0cbe2 (sched: Rate-limit nohz) causes a
serial console regression, unresponsiveness, and indeed it does. The
reason is that the nohz code is skipped even when the tick was already
stopped before the nohz_ratelimit(cpu) condition changed.

Move the nohz_ratelimit() check to the other conditions which prevent
long idle sleeps.
Reported-by: Chris Wedgwood <cw@f00f.org>
Tested-by: Brian Bloniarz <bmb@athenacr.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg KH <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Jef Driesen <jefdriesen@telenet.be>
LKML-Reference: <1276790557.27822.516.camel@twins>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 Jun, 2010 1 commit

perf/tracing: Fix regression of perf losing kprobe events · a8fb2608

Committed by Steven Rostedt on Jun 10, 2010

With the addition of the code to shrink the kernel tracepoint
infrastructure, we lost kprobes being traced by perf. The reason
is that I tested if the "tp_event->class->perf_probe" existed before
enabling it. This prevents "ftrace only" events (like the function
trace events) from being enabled by perf.

Unfortunately, kprobe events do not use perf_probe. This causes
kprobes to be missed by perf. To fix this, we add the test to
see if "tp_event->class->reg" exists as well as perf_probe.

Normal trace events have only "perf_probe" but no "reg" function,
and kprobes and syscalls have the "reg" but no "perf_probe".
The ftrace unique events do not have either, so this is a valid
test. If a kprobe or syscall is not to be probed by perf, the
"reg" function is called anyway, and will return a failure and
prevent perf from probing it.
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
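The eligibility test described in this commit can be sketched as a check on two optional callbacks. The struct below is a hypothetical stand-in for the event class with simplified callback signatures; only the member names `perf_probe` and `reg` come from the commit message. An event qualifies if it provides either callback, which excludes only the ftrace-only events that provide neither.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an event class; signatures are simplified. */
struct sketch_class {
    void (*perf_probe)(void);   /* set for normal trace events */
    int (*reg)(void);           /* set for kprobe and syscall events */
};

/* Placeholder callbacks so the example has something to point at. */
static void sketch_probe(void) { }
static int sketch_reg(void) { return 0; }

/* An event may be enabled by perf if it has either callback. */
static bool sketch_perf_can_enable(const struct sketch_class *c)
{
    assert(c != NULL);
    return c->perf_probe != NULL || c->reg != NULL;
}
```

Testing `perf_probe` alone rejects the kprobe case, which is the regression the commit describes.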

10 Jun, 2010 1 commit

suspend: Move NVS save/restore code to generic suspend functionality · dd4c4f17

Committed by Matthew Garrett on May 28, 2010

Saving platform non-volatile state may be required for suspend to RAM as
well as hibernation. Move it to more generic code.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>

09 Jun, 2010 3 commits

genirq: Deal with desc->set_type() changing desc->chip · 46732475

Committed by Thomas Gleixner on Jun 07, 2010

The set_type() function can change the chip implementation when the
trigger mode changes. That might result in using a non-initialized
irq chip when called from __setup_irq() or when called via
set_irq_type() on an already enabled irq.

The set_irq_type() function should not be called on an enabled irq,
but because we forgot to put a check into it, we have a bunch of users
which grew the habit of doing that and it never blew up as the
function is serialized via desc->lock against all users of desc->chip
and they never hit the non-initialized irq chip issue.

The easy fix for the __setup_irq() issue would be to move the
irq_chip_set_defaults(desc->chip) call after the trigger setting to
make sure that a chip change is covered.

But as we already have users which do the type setting after
request_irq(), the safe fix for now is to call irq_chip_set_defaults()
from __irq_set_trigger() when desc->set_type() changed the irq chip.

It needs a deeper analysis whether we should refuse to change the chip
on an already enabled irq, but that'd be a large scale change to fix
all the existing users. So that's neither stable nor 2.6.35 material.
Reported-by: Esben Haabendal <eha@doredevelopment.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>
Cc: stable@kernel.org

sched: Fix PROVE_RCU vs cpu_cgroup · dc61b1d6

Committed by Peter Zijlstra on Jun 08, 2010

PROVE_RCU has a few issues with the cpu_cgroup because the scheduler
typically holds rq->lock around the css rcu derefs but the generic
cgroup code doesn't (and can't) know about that lock.

Provide means to add extra checks to the css dereference and use that
in the scheduler to annotate its users.

The addition of rq->lock to these checks is correct because the
cgroup_subsys::attach() method takes the rq->lock for each task it
moves, therefore by holding that lock, we ensure the task is pinned to
the current cgroup and the RCU dereference is valid.

That leaves one genuine race in __sched_setscheduler() where we used
task_group() without holding any of the required locks and thus raced
with the cgroup code. Solve this by moving the check under the
appropriate lock.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Fix signed comparison in perf_adjust_period() · f6ab91ad

Committed by Peter Zijlstra on Jun 04, 2010

Frederic reported that frequency driven swevents didn't work properly
and even caused a division-by-zero error.

It turns out there are two bugs, the division-by-zero comes from a
failure to deal with that in perf_calculate_period().

The other was more interesting and turned out to be a wrong comparison
in perf_adjust_period(). The comparison was between an s64 and u64 and
got implicitly converted to an unsigned comparison. The problem is
that period_left is typically < 0, so it ended up being always true.

Cure this by making the local period variables s64.
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
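The comparison pitfall can be reproduced in a few lines of standard C. This is a sketch with hypothetical function names, and the comparison shape is assumed rather than copied from perf_adjust_period(): when a signed 64-bit value meets an unsigned 64-bit value, C converts the signed operand to unsigned, so a negative `period_left` becomes a huge positive number. The explicit cast below spells out what the language otherwise does implicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mixed s64/u64 comparison: the signed operand is converted to unsigned. */
static bool sketch_over_unsigned(int64_t period_left, uint64_t limit)
{
    return (uint64_t)period_left > limit;   /* -1 becomes 2^64 - 1 */
}

/* Both operands signed: a negative period_left compares as expected. */
static bool sketch_over_signed(int64_t period_left, int64_t limit)
{
    assert(limit >= 0);   /* a period is assumed to be non-negative */
    return period_left > limit;
}
```

For `period_left = -1` and `limit = 8`, the unsigned variant reports true while the signed variant reports false, which matches the "always true" symptom when `period_left` is typically negative.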

05 Jun, 2010 2 commits

module: fix bne2 "gave up waiting for init of module libcrc32c" · 9bea7f23

Committed by Rusty Russell on Jun 05, 2010

Problem: it's hard to avoid an init routine stumbling over a
request_module these days.  And it's not clear it's always a bad idea:
for example, a module like kvm with dynamic dependencies on kvm-intel
or kvm-amd would be neater if it could simply request_module the right
one.

In this particular case, it's libcrc32c:

	libcrc32c_mod_init
	 crypto_alloc_shash
	  crypto_alloc_tfm
	   crypto_find_alg
	    crypto_alg_mod_lookup
	     crypto_larval_lookup
	      request_module

If another module is waiting inside resolve_symbol() for libcrc32c to
finish initializing (ie. bne2 depends on libcrc32c) then it does so
holding the module lock, and our request_module() can't make progress
until that is released.

Waiting inside resolve_symbol() without the lock isn't all that hard:
we just need to pass the -EBUSY up the call chain so we can sleep
where we don't hold the lock.  Error reporting is a bit trickier: we
need to copy the name of the unfinished module before releasing the
lock.

Other notes:
1) This also fixes a theoretical issue where a weak dependency would allow
   symbol version mismatches to be ignored.
2) We rename use_module to ref_module to make life easier for the only
   external user (the out-of-tree ksplice patches).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Abbot <tabbott@ksplice.com>
Tested-by: Brandon Philips <bphilips@suse.de>

module: verify_export_symbols under the lock · be593f4c

Committed by Rusty Russell on Jun 05, 2010

It disabled preempt so it was "safe", but nothing stops another module
slipping in before this module is added to the global list now that we don't
hold the lock the whole time.

So we check this just after we check for duplicate modules, and just
before we put the module in the global list.

(find_symbol finds symbols in coming and going modules, too).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>