1. 02 Feb, 2008 (6 commits)
    • futex: Add bitset conditional wait/wakeup functionality · cd689985
      Committed by Thomas Gleixner
      To allow the implementation of optimized rw-locks in user space, glibc
      needs a way to select waiters for wakeup depending on a bitset
      mask.
      
      This requires two new futex OPs: FUTEX_WAIT_BITS and FUTEX_WAKE_BITS.
      These OPs are basically the same as FUTEX_WAIT and FUTEX_WAKE, plus an
      additional argument: a bitset. Further, the FUTEX_WAIT_BITS OP expects
      an absolute timeout value instead of the relative one used for the
      FUTEX_WAIT OP.
      
      FUTEX_WAIT_BITS calls into the kernel with a bitset. The bitset is
      stored in the futex_q structure, which is used to enqueue the waiter
      into the hashed futex waitqueue.
      
      FUTEX_WAKE_BITS also calls into the kernel with a bitset. The wakeup
      function logically ANDs that bitset with the bitset stored in each
      waiter's futex_q structure. If the result is zero (i.e. none of the set
      bits match), the waiter is not woken up. If the result is non-zero
      (i.e. at least one of the set bits matches), the waiter is woken.
      
      The bitset provided by the caller must be non-zero. If the provided
      bitset is zero, the kernel returns EINVAL.
      
      Internally, the new OPs are only extensions of the existing FUTEX_WAIT
      and FUTEX_WAKE functions: the existing OPs hand a bitset with all bits
      set into the futex_wait() and futex_wake() functions.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
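
      The OPs ended up in the futex uapi header as FUTEX_WAIT_BITSET and
      FUTEX_WAKE_BITSET (with FUTEX_BITSET_MATCH_ANY naming the all-bits-set
      value). A minimal user space sketch follows; the wrapper names are
      ours, and since glibc exposes no futex wrapper the calls go through
      syscall(2).

      /* Sketch only: wait with an absolute CLOCK_MONOTONIC timeout and
       * wake only waiters whose bitset overlaps ours. */
      #include <linux/futex.h>
      #include <sys/syscall.h>
      #include <unistd.h>
      #include <stdint.h>
      #include <time.h>

      static long futex_wait_bitset(uint32_t *uaddr, uint32_t expected,
                                    const struct timespec *abs_timeout,
                                    uint32_t bitset)
      {
              /* Unlike FUTEX_WAIT, the timeout here is absolute. */
              return syscall(SYS_futex, uaddr, FUTEX_WAIT_BITSET, expected,
                             abs_timeout, NULL, bitset);
      }

      static long futex_wake_bitset(uint32_t *uaddr, int nr_wake,
                                    uint32_t bitset)
      {
              /* Waiters whose stored bitset ANDs to zero with 'bitset'
               * are skipped; a zero bitset yields -1/EINVAL. */
              return syscall(SYS_futex, uaddr, FUTEX_WAKE_BITSET, nr_wake,
                             NULL, NULL, bitset);
      }

      Passing FUTEX_BITSET_MATCH_ANY (all bits set) makes both calls behave
      like plain FUTEX_WAIT/FUTEX_WAKE, which is exactly how the old OPs are
      implemented on top of the new code.
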
    • futex: Remove warn on in return fixup path · 83e96c60
      Committed by Thomas Gleixner
      The WARN_ON() in the fixup return path of futex_lock_pi() can
      trigger with false positives.
      
      The following scenario happens:
      t1 holds the futex, and t2 and t3 are blocked on the kernel-side rt_mutex.
      t1 releases the futex (and the rt_mutex) and assigns t2 to be the next
      owner of the futex.
      
      t2 is interrupted and returns without acquiring the rt_mutex, before t1
      can release the rt_mutex.
      
      t1 releases the rt_mutex, and t3 becomes the pending owner of the rt_mutex.
      
      t2 notices that it is the designated owner (user space variable) and
      fails to acquire the rt_mutex via trylock, because it is not allowed to
      steal the rt_mutex from t3. Now it looks at the rt_mutex pending owner (t3)
      and assigns the futex and the pi_state to it.
      
      During the fixup, t4 steals the rt_mutex from t3.
      
      t2 returns from the fixup and the owner of the rt_mutex has changed from
      t3 to t4.
      
      There is no need to do another round of fixups from t2. The important
      part (t2 is not returning as the user space visible owner) is
      done. The remaining fixups are done before either t3 or t4 returns to
      user space.
      
      For user space it is not relevant which task (t3 or t4) is the real
      owner, as long as both are still in the kernel, which is guaranteed by
      the serialization of the hash bucket lock. Both tasks (whichever returns
      first to user space: t4 because it locked the rt_mutex, or t3 due to a signal)
      go through the futex_lock_pi() return path, where the ownership is
      fixed up before the return to user space.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
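
      The "user space visible owner" above is the thread id stored in the
      futex word by the PI futex protocol (see futex(2)). A rough sketch of
      that protocol follows; the wrapper names are hypothetical, and a real
      implementation also has to handle the FUTEX_WAITERS/FUTEX_OWNER_DIED
      bits and retry loops.

      /* Sketch only: 0 in the futex word means unlocked, otherwise it
       * holds the owner's TID.  Contended paths fall back to the kernel,
       * which queues the caller on the rt_mutex. */
      #include <linux/futex.h>
      #include <sys/syscall.h>
      #include <unistd.h>
      #include <stdatomic.h>
      #include <stdint.h>

      static _Atomic uint32_t lock_word;   /* the "user space variable" */

      static void pi_lock(void)
      {
              uint32_t expected = 0, tid = (uint32_t)syscall(SYS_gettid);

              /* Fast path: uncontended acquisition stays in user space. */
              if (atomic_compare_exchange_strong(&lock_word, &expected, tid))
                      return;
              /* Slow path: the kernel blocks us on the rt_mutex and, on
               * success, writes our TID into the futex word. */
              syscall(SYS_futex, &lock_word, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
      }

      static void pi_unlock(void)
      {
              uint32_t tid = (uint32_t)syscall(SYS_gettid), expected = tid;

              /* Fast path: no waiters recorded, just clear our TID. */
              if (atomic_compare_exchange_strong(&lock_word, &expected, 0))
                      return;
              /* Slow path: let the kernel hand the lock to the top waiter. */
              syscall(SYS_futex, &lock_word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
      }

      The fixup discussed in this commit is about making sure that, whichever
      task ends up as the rt_mutex owner in the kernel, the TID in this word
      is corrected before that task returns to user space.
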
    • x86: replace LOCK_PREFIX in futex.h · 9d55b992
      Committed by Thomas Gleixner
      The exception fixup for the futex macros __futex_atomic_op1/2 and
      futex_atomic_cmpxchg_inatomic() is missing an entry for the case where
      the lock prefix is replaced by a NOP via SMP alternatives: the NOP
      shifts the start of the faulting instruction past the address recorded
      in the exception table, so the fault is no longer fixed up.
      
      Chuck Ebbert tracked this down from the information provided in:
      https://bugzilla.redhat.com/show_bug.cgi?id=429412
      
      A possible solution would be to add another fixup after the
      LOCK_PREFIX, so both the LOCK and NOP case have their own entry in the
      exception table, but it's not really worth the trouble.
      
      Simply replace LOCK_PREFIX with a plain "lock" prefix, which is left
      untouched by SMP alternatives.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tick-sched: add more debug information · 5df7fa1c
      Committed by Thomas Gleixner
      To allow better diagnosis of tick-sched related problems, especially
      NOHZ related ones, we need to know when the last wakeup via an irq
      happened and when the CPU left the idle state.
      
      Add two fields (idle_waketime, idle_exittime) to the tick_sched
      structure and add them to the timer_list output.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
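
      The timer_list output mentioned here is the per-CPU debug dump exposed
      as /proc/timer_list. A small sketch for pulling the two new timestamps
      out of it follows; the exact field spelling ("idle_waketime",
      "idle_exittime") in the dump is assumed from the struct field names,
      not quoted from the patch.

      /* Sketch only: print the NOHZ wakeup/idle-exit timestamps for each
       * CPU from /proc/timer_list. */
      #include <stdio.h>
      #include <string.h>

      int main(void)
      {
              char line[256];
              FILE *f = fopen("/proc/timer_list", "r");

              if (!f) {
                      perror("fopen /proc/timer_list");
                      return 1;
              }
              while (fgets(line, sizeof(line), f))
                      if (strstr(line, "idle_waketime") ||
                          strstr(line, "idle_exittime"))
                              fputs(line, stdout);
              fclose(f);
              return 0;
      }
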
    • timekeeping: update xtime_cache when time(zone) changes · 1001d0a9
      Committed by Thomas Gleixner
      xtime_cache needs to be updated whenever xtime and/or wall_to_monotonic
      are changed. Otherwise users of xtime_cache might see a stale (and in
      the case of timezone changes utterly wrong) value until the next
      update happens.
      
      Fix up the obvious places which miss this update.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: John Stultz <johnstul@us.ibm.com>
      Tested-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • hrtimer: fix hrtimer_init_sleeper() users · 3588a085
      Committed by Peter Zijlstra
      This patch:
      
       commit 37bb6cb4
       Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
       Date:   Fri Jan 25 21:08:32 2008 +0100
      
           hrtimer: unlock hrtimer_wakeup
      
      broke hrtimer_init_sleeper() users: it forgot to fix up the futex
      caller of this function to detect the failed queueing, and it messed up
      the do_nanosleep() caller in that it could leak a TASK_INTERRUPTIBLE
      state.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 01 Feb, 2008 (34 commits)