提交 · c2172ce2303051764829d4958bd50a11ada0590f · openeuler / raspberrypi-kernel

25 8月, 2015 1 次提交

ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore() · 3fba7e23

由 Russell King 提交于 8月 19, 2015

Provide uaccess_save_and_enable() and uaccess_restore() to permit
control of userspace visibility to the kernel, and hook these into
the appropriate places in the kernel where we need to access
userspace.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

3fba7e23

19 5月, 2015 2 次提交

sched/preempt, arm/futex: Disable preemption in UP futex_atomic_op_inuser() explicitly · 388b0e0a

由 David Hildenbrand 提交于 5月 11, 2015

The !CONFIG_SMP implementation of futex_atomic_op_inuser() seems to rely
on disabled preemption to guarantee mutual exclusion.

From commit e589ed23 ("[ARM] 5218/1: arm: improved futex support"):

    "For UP it's enough to disable preemption to ensure mutual exclusion..."

From the code itself:

    "!SMP, we can work around lack of atomic ops by disabling preemption"

Let's make this explicit, to prepare for pagefault_disable() not
touching preemption anymore.
Reviewed-and-tested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: airlied@linux.ie
Cc: akpm@linux-foundation.org
Cc: benh@kernel.crashing.org
Cc: bigeasy@linutronix.de
Cc: borntraeger@de.ibm.com
Cc: daniel.vetter@intel.com
Cc: heiko.carstens@de.ibm.com
Cc: herbert@gondor.apana.org.au
Cc: hocko@suse.cz
Cc: hughd@google.com
Cc: mst@redhat.com
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: schwidefsky@de.ibm.com
Cc: yang.shi@windriver.com
Link: http://lkml.kernel.org/r/1431359540-32227-12-git-send-email-dahi@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

388b0e0a

sched/preempt, arm/futex: Disable preemption in UP futex_atomic_cmpxchg_inatomic() explicitly · 39919b01

由 David Hildenbrand 提交于 5月 11, 2015

The !CONFIG_SMP implementation of futex_atomic_cmpxchg_inatomic()
requires preemption to be disabled to guarantee mutual exclusion.
Let's make this explicit.

This patch is based on a patch by Sebastian Andrzej Siewior on the
-rt branch.
Reviewed-and-tested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: airlied@linux.ie
Cc: akpm@linux-foundation.org
Cc: benh@kernel.crashing.org
Cc: bigeasy@linutronix.de
Cc: borntraeger@de.ibm.com
Cc: daniel.vetter@intel.com
Cc: heiko.carstens@de.ibm.com
Cc: herbert@gondor.apana.org.au
Cc: hocko@suse.cz
Cc: hughd@google.com
Cc: mst@redhat.com
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: schwidefsky@de.ibm.com
Cc: yang.shi@windriver.com
Link: http://lkml.kernel.org/r/1431359540-32227-11-git-send-email-dahi@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

39919b01

30 3月, 2015 1 次提交

ARM: 8322/1: keep .text and .fixup regions closer together · c4a84ae3

由 Ard Biesheuvel 提交于 3月 24, 2015

This moves all fixup snippets to the .text.fixup section, which is
a special section that gets emitted along with the .text section
for each input object file, i.e., the snippets are kept much closer
to the code they refer to, which helps prevent linker failure on
large kernels.
Acked-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

c4a84ae3

25 2月, 2014 1 次提交

ARM: 7984/1: prefetch: add prefetchw invocations for barriered atomics · c32ffce0

由 Will Deacon 提交于 2月 21, 2014

After a bunch of benchmarking on the interaction between dmb and pldw,
it turns out that issuing the pldw *after* the dmb instruction can
give modest performance gains (~3% atomic_add_return improvement on a
dual A15).

This patch adds prefetchw invocations to our barriered atomic operations
including cmpxchg, test_and_xxx and futexes.
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

c32ffce0

10 2月, 2014 1 次提交

ARM: 7954/1: mm: remove remaining domain support from ARMv6 · b6ccb980

由 Will Deacon 提交于 2月 07, 2014

CPU_32v6 currently selects CPU_USE_DOMAINS if CPU_V6 and MMU. This is
because ARM 1136 r0pX CPUs lack the v6k extensions, and therefore do
not have hardware thread registers. The lack of these registers requires
the kernel to update the vectors page at each context switch in order to
write a new TLS pointer. This write must be done via the userspace
mapping, since aliasing caches can lead to expensive flushing when using
kmap. Finally, this requires the vectors page to be mapped r/w for
kernel and r/o for user, which has implications for things like put_user
which must trigger CoW appropriately when targetting user pages.

The upshot of all this is that a v6/v7 kernel makes use of domains to
segregate kernel and user memory accesses. This has the nasty
side-effect of making device mappings executable, which has been
observed to cause subtle bugs on recent cores (e.g. Cortex-A15
performing a speculative instruction fetch from the GIC and acking an
interrupt in the process).

This patch solves this problem by removing the remaining domain support
from ARMv6. A new memory type is added specifically for the vectors page
which allows that page (and only that page) to be mapped as user r/o,
kernel r/w. All other user r/o pages are mapped also as kernel r/o.
Patch co-developed with Russell King.

Cc: <stable@vger.kernel.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

b6ccb980

16 6月, 2012 1 次提交

ARM: 7425/1: extable: ensure fixup entries are 4-byte aligned · 667d1b48

由 Will Deacon 提交于 6月 15, 2012

Fixup entries in the kernel exception tables should be 4-byte aligned
since we return directly to them when handling a faulting instruction in
the kernel.

This patch adds the missing align directives to the fixup entries.
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

667d1b48

25 1月, 2012 1 次提交

ARM: 7301/1: Rename the T() macro to TUSER() to avoid namespace conflicts · 4e7682d0

由 Catalin Marinas 提交于 1月 25, 2012

This macro is used to generate unprivileged accesses (LDRT/STRT) to user
space.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

4e7682d0

26 9月, 2011 1 次提交

ARM: 7099/1: futex: preserve oldval in SMP __futex_atomic_op · df77abca

由 Will Deacon 提交于 9月 23, 2011

The SMP implementation of __futex_atomic_op clobbers oldval with the
status flag from the exclusive store. This causes it to always read as
zero when performing the FUTEX_OP_CMP_* operation.

This patch updates the ARM __futex_atomic_op implementations to take a
tmp argument, allowing us to store the strex status flag without
overwriting the register containing oldval.

Cc: stable@kernel.org
Reported-by: NMinho Ban <mhban@samsung.com>
Reviewed-by: NNicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

df77abca

12 5月, 2011 1 次提交

ARM: 6889/1: futex: add SMP futex support when !CPU_USE_DOMAINS · c1b0db56

由 Will Deacon 提交于 4月 28, 2011

This patch uses the load/store exclusive instructions to add SMP futex
support for ARM.

Since the ARM architecture does not provide instructions for
unprivileged exclusive memory accesses, we can only provide SMP futexes
when CPU domain support is disabled.

Cc: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

c1b0db56

15 3月, 2011 1 次提交

arm: Remove bogus comment in futex_atomic_cmpxchg_inatomic() · 07d5ecae

由 Thomas Gleixner 提交于 3月 14, 2011

commit 522d7dec(futex: Remove redundant pagefault_disable in
futex_atomic_cmpxchg_inatomic()) added a bogus comment.

/* Note that preemption is disabled by futex_atomic_cmpxchg_inatomic
 * call sites. */

Bogus in two aspects:

1) pagefault_disable != preempt_disable even if the mechanism we use
   is the same

2) we have a call site which deliberately does not disable pagefaults
   as it wants the possible fault to be handled - though that has been
   changed for consistency reasons now.

Sigh. I really should have seen that when committing the above. :(
Catched-by-and-rightfully-ranted-at-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
LKML-Reference: <alpine.LFD.2.00.1103141126590.2787@localhost6.localdomain6>
Cc: Michel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>

07d5ecae

11 3月, 2011 3 次提交

futex: Sanitize futex ops argument types · 8d7718aa

由 Michel Lespinasse 提交于 3月 10, 2011

Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
prototypes to use u32 types for the futex as this is the data type the
futex core code uses all over the place.
Signed-off-by: NMichel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311025058.GD26122@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

8d7718aa

futex: Sanitize cmpxchg_futex_value_locked API · 37a9d912

由 Michel Lespinasse 提交于 3月 10, 2011

The cmpxchg_futex_value_locked API was funny in that it returned either
the original, user-exposed futex value OR an error code such as -EFAULT.
This was confusing at best, and could be a source of livelocks in places
that retry the cmpxchg_futex_value_locked after trying to fix the issue
by running fault_in_user_writeable().
    
This change makes the cmpxchg_futex_value_locked API more similar to the
get_futex_value_locked one, returning an error code and updating the
original value through a reference argument.
Signed-off-by: NMichel Lespinasse <walken@google.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>  [tile]
Acked-by: Tony Luck <tony.luck@intel.com>  [ia64]
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>  [microblaze]
Acked-by: David Howells <dhowells@redhat.com> [frv]
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024851.GC26122@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

37a9d912

futex: Remove redundant pagefault_disable in futex_atomic_cmpxchg_inatomic() · 522d7dec

由 Michel Lespinasse 提交于 3月 10, 2011

kernel/futex.c disables page faults before calling
futex_atomic_cmpxchg_inatomic(), so there is no need to do it again
within that function.
Signed-off-by: NMichel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024731.GB26122@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

522d7dec

04 11月, 2010 1 次提交

ARM: 6384/1: Remove the domain switching on ARMv6k/v7 CPUs · 247055aa

由 Catalin Marinas 提交于 9月 13, 2010

This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.

Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.

Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.

The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if the hardware has a TLS register
(CPU_32v6K is defined) since writing the TLS value to the high vectors page
isn't possible.

The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Tested-by: NAnton Vorontsov <cbouatmailru@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

247055aa

21 4月, 2010 1 次提交

ARM: fix build error in arch/arm/kernel/process.c · 4260415f

由 Russell King 提交于 4月 19, 2010

/tmp/ccJ3ssZW.s: Assembler messages:
/tmp/ccJ3ssZW.s:1952: Error: can't resolve `.text' {.text section} - `.LFB1077'

This is caused because:

	.section .data
	.section .text
	.section .text
	.previous

does not return us to the .text section, but the .data section; this
makes use of .previous dangerous if the ordering of previous sections
is not known.

Fix up the other users of .previous; .pushsection and .popsection are
a safer pairing to use than .section and .previous.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

4260415f

24 7月, 2009 1 次提交

Thumb-2: Implementation of the unified start-up and exceptions code · b86040a5

由 Catalin Marinas 提交于 7月 24, 2009

This patch implements the ARM/Thumb-2 unified kernel start-up and
exception handling code.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

b86040a5

01 9月, 2008 1 次提交

[ARM] 5218/1: arm: improved futex support · e589ed23

由 Mikael Pettersson 提交于 8月 20, 2008

Linux/ARM currently doesn't support robust or PI futexes.
The problem is that the kernel wants to perform certain ops
(cmpxchg, set, add, or, andn, xor) atomically on user-space
addresses, and ARM's futex.h doesn't support that.

This patch adds that support, but only for uniprocessor machines.
For UP it's enough to disable preemption to ensure mutual exclusion
with other software agents (futexes don't need to care about other
hardware agents, fortunately).

This patch is based on one posted by Khem Raj on 2007-08-01
<http://marc.info/?l=linux-arm-kernel&m=118599407413016&w=2>.
(That patch is included in the -RT kernel patches.)
My changes since that version include:
* corrected implementation of FUTEX_OP_ANDN (must complement oparg)
* added missing memory clobber to futex_atomic_cmpxchg_inatomic()
* removed spinlock because it's unnecessary for UP and insufficient
  for SMP, instead the code is restricted to UP and relies on the
  fact that pagefault_disable() also disables preemption
* coding style cleanups

Tested on ARMv5 XScales with the glibc-2.6 nptl test suite.
Tested-by: NBruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: NMikael Pettersson <mikpe@it.uu.se>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

e589ed23

03 8月, 2008 1 次提交

[ARM] move include/asm-arm to arch/arm/include/asm · 4baa9922

由 Russell King 提交于 8月 02, 2008

Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

4baa9922

09 1月, 2006 1 次提交

[PATCH] consolidate asm/futex.h · f8aaeace

由 Jeff Dike 提交于 1月 08, 2006

Most of the architectures have the same asm/futex.h.  This consolidates them
into asm-generic, with the arches including it from their own asm/futex.h.

In the case of UML, this reverts the old broken futex.h and goes back to using
the same one as almost everyone else.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f8aaeace

19 9月, 2005 1 次提交

[ARM] Fix warning in asm/futex.h · 7f8c0fd7

由 Russell King 提交于 9月 18, 2005

The recently added futex.h contains an unused variable, which gcc
naturally warns about.  Remove this unused variable.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

7f8c0fd7

08 9月, 2005 1 次提交

[PATCH] FUTEX_WAKE_OP: pthread_cond_signal() speedup · 4732efbe

由 Jakub Jelinek 提交于 9月 06, 2005

ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter
(which at least on UP usually means an immediate context switch to one of
the waiter threads).  This waiter wakes up and after a few instructions it
attempts to acquire the cv internal lock, but that lock is still held by
the thread calling pthread_cond_signal.  So it goes to sleep and eventually
the signalling thread is scheduled in, unlocks the internal lock and wakes
the waiter again.

Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal
to avoid this performance issue, but it was removed when locks were
redesigned to the 3 state scheme (unlocked, locked uncontended, locked
contended).

Following scenario shows why simply using FUTEX_REQUEUE in
pthread_cond_signal together with using lll_mutex_unlock_force in place of
lll_mutex_unlock is not enough and probably why it has been disabled at
that time:

The number is value in cv->__data.__lock.
        thr1            thr2            thr3
0       pthread_cond_wait
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
0       lll_futex_wait (&cv->__data.__futex, futexval)
0                       pthread_cond_signal
1                       lll_mutex_lock (cv->__data.__lock)
1                                       pthread_cond_signal
2                                       lll_mutex_lock (cv->__data.__lock)
2                                         lll_futex_wait (&cv->__data.__lock, 2)
2                       lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock)
                          # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE
2                       lll_mutex_unlock_force (cv->__data.__lock)
0                         cv->__data.__lock = 0
0                         lll_futex_wake (&cv->__data.__lock, 1)
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
          # Here, lll_mutex_unlock doesn't know there are threads waiting
          # on the internal cv's lock

Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal,
but it will cost us not one, but 2 extra syscalls and, what's worse, one of
these extra syscalls will be done for every single waiting loop in
pthread_cond_*wait.

We would need to use lll_mutex_unlock_force in pthread_cond_signal after
requeue and lll_mutex_cond_lock in pthread_cond_*wait after lll_futex_wait.

Another alternative is to do the unlocking pthread_cond_signal needs to do
(the lock can't be unlocked before lll_futex_wake, as that is racy) in the
kernel.

I have implemented both variants, futex-requeue-glibc.patch is the first
one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel.
 The kernel interface allows userland to specify how exactly an unlocking
operation should look like (some atomic arithmetic operation with optional
constant argument and comparison of the previous futex value with another
constant).

It has been implemented just for ppc*, x86_64 and i?86, for other
architectures I'm including just a stub header which can be used as a
starting point by maintainers to write support for their arches and ATM
will just return -ENOSYS for FUTEX_WAKE_OP.  The requeue patch has been
(lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running
32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL.

With the following benchmark on UP x86-64 I get:

for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
time elf/ld.so --library-path .:nptl-orig /tmp/bench
real 0m0.655s user 0m0.253s sys 0m0.403s
real 0m0.657s user 0m0.269s sys 0m0.388s
time elf/ld.so --library-path .:nptl-requeue /tmp/bench
real 0m0.496s user 0m0.225s sys 0m0.271s
real 0m0.531s user 0m0.242s sys 0m0.288s
time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
real 0m0.380s user 0m0.176s sys 0m0.204s
real 0m0.382s user 0m0.175s sys 0m0.207s

The benchmark is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt
Older futex-requeue-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt
Older futex-wake_op-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt
Will post a new version (just x86-64 fixes so that the patch
applies against pthread_cond_signal.S) to libc-hacker ml soon.

Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded
testcase that will not test the atomicity of the operation, but at least
check if the threads that should have been woken up are woken up and
whether the arithmetic operation in the kernel gave the expected results.
Acked-by: NIngo Molnar <mingo@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Jamie Lokier <jamie@shareable.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NYoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4732efbe