提交 · a7e926abc3adfbd2e5e20d2b46177adb4e313915 · openeuler / raspberrypi-kernel

26 2月, 2010 3 次提交

x86-32: Rewrite 32-bit atomic64 functions in assembly · a7e926ab

由 Luca Barbieri 提交于 2月 24, 2010

This patch replaces atomic64_32.c with two assembly implementations,
one for 386/486 machines using pushf/cli/popf and one for 586+ machines
using cmpxchg8b.

The cmpxchg8b implementation provides the following advantages over the
current one:

1. Implements atomic64_add_unless, atomic64_dec_if_positive and
   atomic64_inc_not_zero

2. Uses the ZF flag changed by cmpxchg8b instead of doing a comparison

3. Uses custom register calling conventions that reduce or eliminate
   register moves to suit cmpxchg8b

4. Reads the initial value instead of using cmpxchg8b to do that.
   Currently we use lock xaddl and movl, which seems the fastest.

5. Does not use the lock prefix for atomic64_set
   64-bit writes are already atomic, so we don't need that.
   We still need it for atomic64_read to avoid restoring a value
   changed in the meantime.

6. Allocates registers as well or better than gcc

The 386 implementation provides support for 386 and 486 machines.
386/486 SMP is not supported (we dropped it), but such support can be
added easily if desired.

A pure assembly implementation is required due to the custom calling
conventions, and desire to use %ebp in atomic64_add_return (we need
7 registers...), as well as the ability to use pushf/popf in the 386
code without an intermediate pop/push.

The parameter names are changed to match the convention in atomic_64.h

Changes in v3 (due to rebasing to tip/x86/asm):
- Patches atomic64_32.h instead of atomic_32.h
- Uses the CALL alternative mechanism from commit
  1b1d9258

Changes in v2:
- Merged 386 and cx8 support in the same patch
- 386 support now done in assembly, C code no longer used at all
- cmpxchg64 is used for atomic64_cmpxchg
- stop using macros, use one-line inline functions instead
- miscellanous changes and improvements
Signed-off-by: NLuca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-5-git-send-email-luca@luca-barbieri.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

a7e926ab

x86-32: Allow UP/SMP lock replacement in cmpxchg64 · 9c76b384

由 Luca Barbieri 提交于 2月 24, 2010

Use the functionality just introduced in the previous patch: mark the
lock prefixes in cmpxchg64 alternatives for UP removal.

Changes in v2:
- Naming change
Signed-off-by: NLuca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-3-git-send-email-luca@luca-barbieri.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

9c76b384

x86: Add support for lock prefix in alternatives · b3ac891b

由 Luca Barbieri 提交于 2月 24, 2010

The current lock prefix UP/SMP alternative code doesn't allow
LOCK_PREFIX to be used in alternatives code.

This patch solves the problem by adding a new LOCK_PREFIX_ALTERNATIVE_PATCH
macro that only records the lock prefix location but does not emit
the prefix.

The user of this macro can then start any alternative sequence with
"lock" and have it UP/SMP patched.

To make this work, the UP/SMP alternative code is changed to do the
lock/DS prefix switching only if the byte actually contains a lock or
DS prefix.

Thus, if an alternative without the "lock" is selected, it will now do
nothing instead of clobbering the code.

Changes in v2:
- Naming change
- Change label to not conflict with alternatives
Signed-off-by: NLuca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-2-git-send-email-luca@luca-barbieri.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

b3ac891b

17 2月, 2010 1 次提交

x86: Mark atomic irq ops raw for 32bit legacy · 17c0e710

由 Ingo Molnar 提交于 7月 03, 2009

The atomic ops emulation for 32bit legacy CPUs floods the tracer with
irq off/on entries. The irq disabled regions are short and therefor
not interesting when chasing long irq disabled latencies. Mark them
raw and keep them out of the trace.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

17c0e710

14 1月, 2010 1 次提交

x86: Merge show_regs() · 3bef4447

由 Brian Gerst 提交于 1月 13, 2010

Using kernel_stack_pointer() allows 32-bit and 64-bit versions to
be merged.  This is more correct for 64-bit, since the old %rsp is
always saved on the stack.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
LKML-Reference: <1263397555-27695-1-git-send-email-brgerst@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

3bef4447

13 1月, 2010 2 次提交

x86: Macroise x86 cache descriptors · 2ca49b2f

由 Dave Jones 提交于 1月 04, 2010

Use a macro to define the cache sizes when cachesize > 1 MB.

This is less typing, and less prone to introducing bugs like we
saw in e02e0e1a, and means we
don't have to do maths when adding new non-power-of-2 updates
like those seen recently.
Signed-off-by: NDave Jones <davej@redhat.com>
LKML-Reference: <20100104144735.GA18390@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2ca49b2f

x86-32: clean up rwsem inline asm statements · 59c33fa7

由 Linus Torvalds 提交于 1月 12, 2010

This makes gcc use the right register names and instruction operand sizes
automatically for the rwsem inline asm statements.

So instead of using "(%%eax)" to specify the memory address that is the
semaphore, we use "(%1)" or similar. And instead of forcing the operation
to always be 32-bit, we use "%z0", taking the size from the actual
semaphore data structure itself.

This doesn't actually matter on x86-32, but if we want to use the same
inline asm for x86-64, we'll need to have the compiler generate the proper
64-bit names for the registers (%rax instead of %eax), and if we want to
use a 64-bit counter too (in order to avoid the 15-bit limit on the
write counter that limits concurrent users to 32767 threads), we'll need
to be able to generate instructions with "q" accesses rather than "l".

Since this header currently isn't enabled on x86-64, none of that matters,
but we do want to use the xadd version of the semaphores rather than have
to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended
when you have lots of threads all taking page faults, and the fallback
rwsem code that uses a spinlock performs abysmally badly in that case.

[ hpa: modified the patch to skip size suffixes entirely when they are
redundant due to register operands. ]
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

59c33fa7

08 1月, 2010 3 次提交

x86: Merge asm/atomic_{32,64}.h · 5abbbbf0

由 Brian Gerst 提交于 1月 07, 2010

Merge the now identical code from asm/atomic_32.h and asm/atomic_64.h
into asm/atomic.h.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-4-git-send-email-brgerst@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

5abbbbf0

x86: Sync asm/atomic_32.h and asm/atomic_64.h · 3ce59bb8

由 Brian Gerst 提交于 1月 07, 2010

Prepare for merging into asm/atomic.h.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-3-git-send-email-brgerst@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

3ce59bb8

x86: Split atomic64_t functions into seperate headers · 1a3b1d89

由 Brian Gerst 提交于 1月 07, 2010

Split atomic64_t functions out into separate headers, since they will
not be practical to merge between 32 and 64 bits.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-2-git-send-email-brgerst@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

1a3b1d89

30 12月, 2009 3 次提交

x86-64: Modify memcpy()/memset() alternatives mechanism · 7269e881

由 Jan Beulich 提交于 12月 18, 2009

In order to avoid unnecessary chains of branches, rather than
implementing memcpy()/memset()'s access to their alternative
implementations via a jump, patch the (larger) original function
directly.

The memcpy() part of this is slightly subtle: while alternative
instruction patching does itself use memcpy(), with the
replacement block being less than 64-bytes in size the main loop
of the original function doesn't get used for copying memcpy_c()
over memcpy(), and hence we can safely write over its beginning.

Also note that the CFI annotations are fine for both variants of
each of the functions.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB8D30200007800026AF2@vpn.id2.novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7269e881

x86-64: Modify copy_user_generic() alternatives mechanism · 1b1d9258

由 Jan Beulich 提交于 12月 18, 2009

In order to avoid unnecessary chains of branches, rather than
implementing copy_user_generic() as a function consisting of
just a single (possibly patched) branch, instead properly deal
with patching call instructions in the alternative instructions
framework, and move the patching into the callers.

As a follow-on, one could also introduce something like
__EXPORT_SYMBOL_ALT() to avoid patching call sites in modules.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1b1d9258

x86: Lift restriction on the location of FIX_BTMAP_* · 499a5f1e

由 Jan Beulich 提交于 12月 18, 2009

The early ioremap fixmap entries cover half (or for 32-bit
non-PAE, a quarter) of a page table, yet they got
uncondtitionally aligned so far to a 256-entry boundary. This is
not necessary if the range of page table entries anyway falls
into a single page table.

This buys back, for (theoretically) 50% of all configurations
(25% of all non-PAE ones), at least some of the lowmem
necessarily lost with commit e621bd18.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB66F0200007800026AD6@vpn.id2.novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

499a5f1e

24 12月, 2009 1 次提交

Revert "x86, ucode-amd: Ensure ucode update on suspend/resume after CPU off/online cycle" · 2f99f5c8

由 Linus Torvalds 提交于 12月 23, 2009

This reverts commit 9f15226e.  It's just
wrong, and broke resume for Rafael even on a non-AMD CPU.

As Rafael says:
 "... it causes microcode_init_cpu() to be called during resume even for
  CPUs for which there's no microcode to apply.  That, in turn, results
  in executing request_firmware() (on Intel CPUs at least) which doesn't
  work at this stage of resume (we have device interrupts disabled, I/O
  devices are still suspended and so on).

  If I'm not mistaken, the "if (uci->valid)" logic means "if that CPU is
  known to us" , so before commit 9f15226e microcode_resume_cpu() was
  called for all CPUs already in the system during suspend, which was
  the right thing to do.  The commit changed it so that the CPUs without
  microcode to apply are now treated as "unknown", which is not quite
  right.

  The problem this commit attempted to solve has to be handled
  differently."

Bisected-and -requested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2f99f5c8

23 12月, 2009 1 次提交

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: avoid cross-CPU interrupts by... · 4a28395d

由 Andrew Morton 提交于 12月 21, 2009

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: avoid cross-CPU interrupts by using smp_call_function_any()

Presently acpi-cpufreq will perform the MSR read on the first CPU in the
mask.  That's inefficient if that CPU differs from the current CPU.
Because we have to perform a cross-CPU call, but we could have run the
rdmsr on the current CPU.

So switch to using the new smp_call_function_any(), which will perform the
call on the current CPU if that CPU is present in the mask (it is).

Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLen Brown <len.brown@intel.com>

4a28395d

22 12月, 2009 5 次提交

ACPI: processor: unify arch_acpi_processor_cleanup_pdc · 47817254