1. 19 April 2008: 5 commits
  2. 18 April 2008: 10 commits
    • ptrace_signal subroutine · 18c98b65
      Roland McGrath authored
      This breaks out the ptrace handling from get_signal_to_deliver into a
      new subroutine.  The actual code there doesn't change, and it gets
      inlined into nearly identical compiled code.  This makes the function
      substantially shorter and thus easier to read, and it nicely isolates
      the ptrace magic.
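
      A hedged sketch of the shape of the change (the helper name comes
      from the commit; the body is illustrative, not the kernel code):

        static int ptrace_signal(int signr, siginfo_t *info,
                                 struct k_sigaction *return_ka, sigset_t *mask)
        {
                if (!(current->ptrace & PT_PTRACED))
                        return signr;
                /* the ptrace stop, signal re-fetch and blocked-signal
                   handling formerly inlined in get_signal_to_deliver() */
                return signr;
        }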
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cgroup: fix a race condition in manipulating tsk->cg_list · 0e04388f
      Li Zefan authored
      When I ran a test program that forks many processes while
      concurrently running 'cat /cgroup/tasks', I got the following oops:
      
        ------------[ cut here ]------------
        kernel BUG at lib/list_debug.c:72!
        invalid opcode: 0000 [#1] SMP
        Pid: 4178, comm: a.out Not tainted (2.6.25-rc9 #72)
        ...
        Call Trace:
         [<c044a5f9>] ? cgroup_exit+0x55/0x94
         [<c0427acf>] ? do_exit+0x217/0x5ba
       [<c0427ed7>] ? do_group_exit+0x65/0x7c
         [<c0427efd>] ? sys_exit_group+0xf/0x11
         [<c0404842>] ? syscall_call+0x7/0xb
         [<c05e0000>] ? init_cyrix+0x2fa/0x479
        ...
        EIP: [<c04df671>] list_del+0x35/0x53 SS:ESP 0068:ebc7df4
        ---[ end trace caffb7332252612b ]---
        Fixing recursive fault but reboot is needed!
      
      After digging into the code and debugging, I finally found the race:
      
      				do_exit()
      				  ->cgroup_exit()
      				    ->if (!list_empty(&tsk->cg_list))
      				        list_del(&tsk->cg_list);
      
        cgroup_iter_start()
          ->cgroup_enable_task_cg_list()
            ->list_add(&tsk->cg_list, ..);
      
      In this case the list won't be deleted though the process has exited.
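
      A sketch of the kind of fix that closes this window, assuming
      css_set_lock is the lock guarding cg_list (my reading of the patch,
      not a quote of it):

        /* in cgroup_exit(): re-check under the lock so a concurrent
           cgroup_enable_task_cg_list() cannot race with the deletion */
        if (!list_empty(&tsk->cg_list)) {
                write_lock(&css_set_lock);
                if (!list_empty(&tsk->cg_list))
                        list_del(&tsk->cg_list);
                write_unlock(&css_set_lock);
        }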
      
      We got two bug reports in the past, which seem to be the same bug as
      this one:
      	http://lkml.org/lkml/2008/3/5/332
      	http://lkml.org/lkml/2007/10/17/224
      
      Sometimes I got the oops in list_del, sometimes in list_add, and by
      changing my test program a bit I could trigger other oopses.
      
      The patch has been tested both on x86_32 and x86_64.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Paul Menage <menage@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kgdb: always use icache flush for sw breakpoints · 1a9a3e76
      Jason Wessel authored
      On the ppc 4xx architecture the instruction cache must be flushed as
      well as the data cache.  This patch just makes it generic for all
      architectures where CACHE_FLUSH_IS_SAFE is set to 1.
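
      Conceptually the change is one guarded call in the breakpoint
      set/remove paths; a hedged sketch (addr and BREAK_INSTR_SIZE as used
      by the kgdb core):

        /* after patching the breakpoint instruction in, flush the
           instruction cache too whenever flushing is safe at all */
        if (CACHE_FLUSH_IS_SAFE)
                flush_icache_range(addr, addr + BREAK_INSTR_SIZE);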
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kgdb: fix SMP NMI kgdb_handle_exception exit race · 56fb7093
      Jason Wessel authored
      Fix an NMI race condition in the kgdb handle_exception exit path,
      hit while trying to restore normal system operation.
      
      There was a small window after the master processor sets cpu_in_debug
      to zero but before it has set kgdb_active to zero where a
      non-master processor in an SMP system could receive an NMI and
      re-enter the kgdb_wait() loop.
      
      As long as the master processor sets cpu_in_debug before sending
      the CPU roundup, the cpu_in_debug variable can also be used to
      guard against the race condition.
      
      The kgdb_wait() function no longer needs to check kgdb_active,
      because that is done in the arch-specific code and handled along
      with the NMI traps at the low level.  This also allows kgdb_wait()
      to exit correctly if it was entered for some unknown reason, e.g. a
      spurious NMI that could not be handled by the arch-specific code.
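
      An illustrative sketch of the guard, reusing the variable names from
      this message (the real arch code differs in detail):

        /* arch NMI hook: a late roundup NMI that arrives after the
           master has cleared cpu_in_debug must not re-enter the wait
           loop, even though kgdb_active is still set */
        if (!atomic_read(&cpu_in_debug))
                return;         /* spurious or late NMI: resume */
        kgdb_wait(regs);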
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kgdb: fix several kgdb regressions · 737a460f
      Jason Wessel authored
      kgdb core fixes:
      - Check that mm->mmap_cache is not NULL before calling
        flush_cache_range(); otherwise on ARM it causes a fatal fault
        (see the sketch after these lists).
      
      - Breakpoints should only be restored if they are in the BP_ACTIVE
        state.
      
      - Fix a typo in the comments for "kgdb_register_io_module".
      
      x86 kgdb fixes:
      - Fix the x86 arch handler so that on a kill or detach the
        appropriate cleanup of the single-stepping flags gets run.
      
      - Add in the DIE_NMIWATCHDOG call for x86_64
      
      - Touch the NMI watchdog before returning the system to normal
        operation after performing any kind of kgdb operation; otherwise
        the watchdog may trigger.
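
      A hedged sketch of the first core fix above (the shape of the guard
      is inferred from the message, not quoted from the patch):

        /* only flush when there is a cached vma to flush against;
           passing a NULL mm->mmap_cache was fatal on ARM */
        if (current->mm && current->mm->mmap_cache)
                flush_cache_range(current->mm->mmap_cache,
                                  addr, addr + BREAK_INSTR_SIZE);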
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kgdb: fix optional arch functions and probe_kernel_* · b4b8ac52
      Jason Wessel authored
      Fix two regressions dealing with the kgdb core.
      
      1) kgdb_skipexception and kgdb_post_primary_code are optional
      functions that are only required on archs that need special exception
      fixups.
      
      2) The kernel address-space scope must be set for any probe_kernel_*
      function, or architectures such as ARCH=arm will not allow access to
      kernel memory.  For example, the full kernel address space must be
      allowed when you use the kernel debugger to inspect a system call.
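
      The classic pattern for widening the address-space scope around a
      kernel-memory access looked roughly like this (a sketch of the era's
      probe_kernel_read(); details may differ from the merged code):

        long probe_kernel_read(void *dst, void *src, size_t size)
        {
                long ret;
                mm_segment_t old_fs = get_fs();

                set_fs(KERNEL_DS);      /* accept kernel addresses */
                pagefault_disable();
                ret = __copy_from_user_inatomic(dst,
                                (__force const void __user *)src, size);
                pagefault_enable();
                set_fs(old_fs);

                return ret ? -EFAULT : 0;
        }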
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kgdb: add x86 HW breakpoints · 64e9ee30
      Jason Wessel authored
      Add HW breakpoints to the arch-specific portion of x86 kgdb.  In the
      current x86 kernel.org kernels, HW breakpoints are switched out in a
      lazy fashion because there is no infrastructure for changing them
      when switching to a kernel task or entering kernel mode via a system
      call.  This lazy approach means that if a user process uses HW
      breakpoints, kgdb will lose out.  This is an acceptable trade-off
      because the developer debugging the kernel is assumed to know what is
      going on system-wide and would be aware of it.
      
      There is a minor bug fix to the kgdb core so as to correctly call the
      hw breakpoint functions with a valid value from the enum.
      
      There is also a minor change to the x86_64 startup code when using
      early HW breakpoints.  When the debugger is connected, the CPU
      startup code must not zero out the HW breakpoint registers, or you
      can never hit the breakpoints you are interested in.
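
      A hedged sketch of both halves; set_debugreg()/get_debugreg() are the
      x86 accessors, while zero_debug_registers() is a hypothetical helper
      standing in for the startup code's register clearing:

        unsigned long dr7;

        /* install a HW breakpoint in slot 0 */
        get_debugreg(dr7, 7);
        set_debugreg(addr, 0);              /* DR0 = breakpoint address */
        set_debugreg(dr7 | (1UL << 1), 7);  /* DR7.G0 enables slot 0 */

        /* cpu startup: keep the registers if a debugger already
           planted early breakpoints */
        if (!kgdb_connected)
                zero_debug_registers();     /* hypothetical helper */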
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kgdb: print breakpoint removed on exception · 67baf94c
      Jason Wessel authored
      If kgdb removes a breakpoint that failed the recursion check, it
      should also print the address of the breakpoint.
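
      Roughly (a sketch, not the exact patch line):

        printk(KERN_ERR "KGDB: breakpoint remove failed: %lx\n", addr);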
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kgdb: clocksource watchdog · 7c3078b6
      Jason Wessel authored
      To avoid tripping the clocksource watchdog, kgdb must touch it on
      the return to the normal system run state.
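
      The resume path gains a single call; sketched, assuming the helper
      introduced for this purpose is named clocksource_touch_watchdog():

        /* on leaving the debugger, reset the watchdog's notion of time
           so the stopped interval does not mark the TSC unstable */
        clocksource_touch_watchdog();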
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kgdb: core · dc7d5527
      Jason Wessel authored
      kgdb core code. Handles the protocol and the arch details.
      
      [ mingo@elte.hu: heavily modified, simplified and cleaned up. ]
      [ xemul@openvz.org: use find_task_by_pid_ns ]
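
      The wire protocol the core speaks is GDB's remote serial protocol:
      each packet is "$" + payload + "#" + a two-hex-digit checksum, the
      sum of the payload bytes modulo 256.  A minimal sketch of the
      framing (helper name invented):

        /* frame a payload as a GDB remote-protocol packet */
        static void frame_packet(char *pkt, const char *payload)
        {
                unsigned char checksum = 0;
                const char *p;

                for (p = payload; *p; p++)
                        checksum += *p;
                sprintf(pkt, "$%s#%02x", payload, checksum);
        }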
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
  3. 17 April 2008: 13 commits
  4. 16 April 2008: 1 commit
  5. 14 April 2008: 1 commit
  6. 11 April 2008: 2 commits
    • cgroups: include hierarchy ids in /proc/<pid>/cgroup · b6c3006d
      Paul Menage authored
      Extend the /proc/<pid>/cgroup file to include the appropriate hierarchy ID on
      each line.
      
      Currently this ID isn't really needed since a hierarchy can be completely
      identified by the set of subsystems bound to it, but this is likely to change
      in the near future in order to support stateless subsystems and
      merging/rebinding of subsystems.  Getting this change into 2.6.25 reduces the
      need for an API change later.
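
      Each line becomes "hierarchy-id:subsystems:path"; illustrative
      output with invented IDs, subsystems and paths:

        $ cat /proc/self/cgroup
        2:memory:/
        1:cpu,cpuacct:/batch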
      Signed-off-by: Paul Menage <menage@google.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • asmlinkage_protect replaces prevent_tail_call · 54a01510
      Roland McGrath authored
      The prevent_tail_call() macro works around the problem of the compiler
      clobbering argument words on the stack, which for asmlinkage functions
      is the caller's (user's) struct pt_regs.  The tail/sibling-call
      optimization is not the only way that the compiler can decide to use
      stack argument words as scratch space, which we have to prevent.
      Other optimizations can do it too.
      
      Until we have new compiler support to make "asmlinkage" binding on
      the compiler's own use of the stack argument frame, we have to work
      around all the manifestations of this issue that crop up.
      
      More cases seem to be prevented by also keeping the incoming argument
      variables live at the end of the function.  This makes their original
      stack slots attractive places to leave those variables, so the
      compiler tends not to clobber them for something else.  It's still no
      guarantee, but it handles some observed cases that
      prevent_tail_call() did not.
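
      A hedged sketch of the x86 shape of the macro: an empty asm that
      names the return value and each argument as inputs, so the compiler
      believes their stack slots are still read at the end of the function
      (a reconstruction, not a quote of the patch):

        #define asmlinkage_protect(n, ret, args...) \
                __asmlinkage_protect##n(ret, ##args)
        #define __asmlinkage_protect_n(ret, args...) \
                __asm__ __volatile__ ("" : "=r" (ret) : "0" (ret), ##args)
        #define __asmlinkage_protect0(ret) \
                __asmlinkage_protect_n(ret)
        #define __asmlinkage_protect1(ret, arg1) \
                __asmlinkage_protect_n(ret, "g" (arg1))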
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 05 April 2008: 1 commit
    • cgroups: add cgroup support for enabling controllers at boot time · 8bab8dde
      Paul Menage authored
      The effects of cgroup_disable=foo are:
      
      - foo isn't auto-mounted if you mount all cgroups in a single hierarchy
      - foo isn't visible as an individually mountable subsystem
      
      As a result there will only ever be one call to foo->create(), at init time;
      all processes will stay in this group, and the group will never be mounted on
      a visible hierarchy.  Any additional effects (e.g.  not allocating metadata)
      are up to the foo subsystem.
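
      For example, booting with the memory controller compiled in but
      disabled is just a kernel command-line option:

        cgroup_disable=memory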
      
      This doesn't handle early_init subsystems (their "disabled" bit
      isn't set), but it could easily be extended to do so if any of the
      early_init subsystems wanted it; I think it would just involve some
      nastier parameter processing, since it would occur before the
      command-line argument parser has been run.
      
      Hugh said:
      
        Ballpark figures, I'm trying to get this question out rather than
        processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
        to the affected paths, booting with cgroup_disable=memory cuts that back to
        1% overhead (due to slightly bigger struct page).
      
        I'm no expert on distros, they may have no interest whatever in
        CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
        without it, or apply the cgroup_disable=memory patches.
      
      Unix bench's execl test result on x86_64 was
      
      == just after boot without mounting any cgroup fs.==
      mem_cgroup=off : Execl Throughput       43.0     3150.1      732.6
      mem_cgroup=on  : Execl Throughput       43.0     2932.6      682.0
      ==
      
      [lizf@cn.fujitsu.com: fix boot option parsing]
      Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
      Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 03 April 2008: 1 commit
    • markers: use synchronize_sched() · 6496968e
      Mathieu Desnoyers authored
      Markers do not mix well with CONFIG_PREEMPT_RCU because, for minimal
      intrusiveness, the marker code uses preempt_disable/enable() rather
      than rcu_read_lock/unlock().  We would need call_sched and
      sched_barrier primitives.
      
      Currently, the modification (connection and disconnection) of probes
      from markers requires changes to the data structure done RCU-style:
      a new data structure is created, the pointer is changed atomically, a
      quiescent state is reached and then the old data structure is freed.
      
      The quiescent state is reached once all the currently running
      preempt_disable regions are done running.  We use the call_rcu mechanism
      to execute kfree() after such quiescent state has been reached.
      However, the new CONFIG_PREEMPT_RCU version of call_rcu and rcu_barrier
      does not guarantee that all preempt_disable code regions have finished,
      hence the race.
      
      The "proper" way to do this is to use rcu_read_lock/unlock, but we don't
      want to use it to minimize intrusiveness on the traced system.  (we do
      not want the marker code to call into much of the OS code, because it
      would quickly restrict what can and cannot be instrumented, such as the
      scheduler).
      
      The temporary fix, until we get call_rcu_sched and rcu_barrier_sched
      in mainline, is to use synchronize_sched() before each call_rcu call,
      so we wait for the quiescent state in the system call code path.  It
      will slow down batch marker enable/disable, but it makes sure the
      race is gone.
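
      Sketched, the ordering the message describes (the rcu_head field and
      callback name are assumptions):

        /* wait until every preempt_disable() region currently running
           has finished, then queue the old structure for freeing */
        synchronize_sched();
        call_rcu(&old->rcu, free_old_closure);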
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 31 March 2008: 2 commits
  10. 29 March 2008: 2 commits
  11. 27 March 2008: 1 commit
  12. 26 March 2008: 1 commit