提交 · 7f387d3f2421781610588faa2f49ae5f1737b137 · openeuler / raspberrypi-kernel

29 10月, 2009 1 次提交

由 Masami Hiramatsu 提交于 10月 27, 2009

Fix superscripts position because some superscripts of SSE
opcode are not put in correct position.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
LKML-Reference: <20091027204204.30545.97296.stgit@harusame>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7f387d3f

21 10月, 2009 2 次提交

x86: Add AES opcodes to opcode map · 9983d60d

由 Masami Hiramatsu 提交于 10月 20, 2009

Add Intel AES opcodes to x86 opcode map. These opcodes are
used in arch/x86/crypt/aesni-intel_asm.S.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap<systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9983d60d

x86: Fix group attribute decoding bug · 06ed6ba5

由 Masami Hiramatsu 提交于 10月 20, 2009

Fix a typo in inat_get_group_attribute() which should refer
inat_group_tables, not inat_escape_tables.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap<systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020165524.4145.97333.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

06ed6ba5

17 10月, 2009 2 次提交

x86: Add AMD prefetch and 3DNow! opcodes to opcode map · d1baf5a5

由 Masami Hiramatsu 提交于 10月 16, 2009

Add AMD prefetch and 3DNow! opcode including FEMMS. Since 3DNow!
uses the last immediate byte as an opcode extension byte, x86
insn just treats the extenstion byte as an immediate byte
instead of a part of opcode (insn_get_opcode() decodes first
"0x0f 0x0f" bytes.)

Users who are interested in analyzing 3DNow! opcode still can
decode it by analyzing the immediate byte.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20091017000744.16556.27881.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d1baf5a5

x86: Add MMX/SSE opcode groups to opcode map · 8c95bc3e

由 Masami Hiramatsu 提交于 10月 16, 2009

Add missing MMX/SSE opcode groups to x86 opcode map.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20091017000736.16556.29061.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

8c95bc3e

03 10月, 2009 1 次提交

x86: Add VIA processor instructions in opcodes decoder · c0b11d3a

由 Masami Hiramatsu 提交于 9月 25, 2009

Add VIA processor's Padlock instructions(MONTMUL, XSHA1, XSHA256)
as parts of the kernel may use them.

This fixes the following crash in opcodes decoder selftests:

 make[2]: `scripts/unifdef' is up to date.
   TEST    posttest
 Error: c145cf71:        f3 0f a6 d0             repz xsha256
 Error: objdump says 4 bytes, but insn_get_length() says 3 (attr:0)
 make[1]: *** [posttest] Error 2
 make: *** [bzImage] Error 2
Reported-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Acked-by: NIngo Molnar <mingo@elte.hu>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20090925182037.10157.3180.stgit@omoto>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>

c0b11d3a

01 10月, 2009 2 次提交

x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=y · 04edbdef

由 Eric Dumazet 提交于 10月 01, 2009

Conditionaly compile cmpxchg8b_emu.o and EXPORT_SYMBOL(cmpxchg8b_emu).

This reduces the kernel size a bit.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4AC43E7E.1000600@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

04edbdef

x86: Provide an alternative() based cmpxchg64() · 79e1dd05

由 Arjan van de Ven 提交于 9月 30, 2009

cmpxchg64() today generates, to quote Linus, "barf bag" code.

cmpxchg64() is about to get used in the scheduler to fix a bug there,
but it's a prerequisite that cmpxchg64() first be made non-sucking.

This patch turns cmpxchg64() into an efficient implementation that
uses the alternative() mechanism to just use the raw instruction on
all modern systems.

Note: the fallback is NOT smp safe, just like the current fallback
is not SMP safe. (Interested parties with i486 based SMP systems
are welcome to submit fix patches for that.)
Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
[ fixed asm constraint bug ]
Fixed-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090930170754.0886ff2e@infradead.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

79e1dd05

11 9月, 2009 1 次提交

x86: Add MMX support for instruction decoder · f12b4f54

由 Masami Hiramatsu 提交于 9月 08, 2009

Add MMX/SSE instructions to x86 opcode maps, since some of those
instructions are used in the kernel.

This also fixes failures in the x86 instruction decoder seftest.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20090908163246.23516.78835.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>

f12b4f54

05 9月, 2009 1 次提交

x86, msr: change msr-reg.o to obj-y, and export its symbols · b19ae399

由 H. Peter Anvin 提交于 9月 04, 2009

Change msr-reg.o to obj-y (it will be included in virtually every
kernel since it is used by the initialization code for AMD processors)
and add a separate C file to export its symbols to modules, so that
msr.ko can use them; on uniprocessors we bypass the helper functions
in msr.o and use the accessor functions directly via inlines.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
LKML-Reference: <20090904140834.GA15789@elte.hu>
Cc: Borislav Petkov <petkovbb@googlemail.com>

b19ae399

04 9月, 2009 1 次提交

x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too · 8adf65cf

由 Ingo Molnar 提交于 9月 03, 2009

The macro was defined in the 32-bit path as well - breaking the
build on 32-bit platforms:

  arch/x86/lib/msr-reg.S: Assembler messages:
  arch/x86/lib/msr-reg.S:53: Error: Bad macro parameter list
  arch/x86/lib/msr-reg.S:100: Error: invalid character '_' in mnemonic
  arch/x86/lib/msr-reg.S:101: Error: invalid character '_' in mnemonic

Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <tip-f6909f39@git.kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

8adf65cf

02 9月, 2009 1 次提交

x86, msr: fix msr-reg.S compilation with gas 2.16.1 · f6909f39

由 H. Peter Anvin 提交于 9月 01, 2009

msr-reg.S used the :req option on a macro argument, which wasn't
supported by gas 2.16.1 (but apparently by some earlier versions of
gas, just to be confusing.)  It isn't necessary, so just remove it.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>

f6909f39

01 9月, 2009 3 次提交

x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs() · 8b956bf1

由 H. Peter Anvin 提交于 8月 31, 2009

Create _on_cpu helpers for {rw,wr}msr_safe_regs() analogously with the
other MSR functions.  This will be necessary to add support for these
to the MSR driver.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <petkovbb@gmail.com>

8b956bf1

x86, msr: CFI annotations, cleanups for msr-reg.S · 79c5dca3

由 H. Peter Anvin 提交于 8月 31, 2009

Add CFI annotations for native_{rd,wr}msr_safe_regs().
Simplify the 64-bit implementation: we don't allow the upper half
registers to be set, and so we can use them to carry state across the
operation.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <petkovbb@gmail.com>
LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com>

79c5dca3

x86, msr: Add rd/wrmsr interfaces with preset registers · 132ec92f

由 Borislav Petkov 提交于 8月 31, 2009

native_{rdmsr,wrmsr}_safe_regs are two new interfaces which allow
presetting of a subset of eight x86 GPRs before executing the rd/wrmsr
instructions. This is needed at least on AMD K8 for accessing an erratum
workaround MSR.

Originally based on an idea by H. Peter Anvin.
Signed-off-by: NBorislav Petkov <petkovbb@gmail.com>
LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

132ec92f

27 8月, 2009 1 次提交

x86: Instruction decoder API · eb13296c

由 Masami Hiramatsu 提交于 8月 13, 2009

Add x86 instruction decoder to arch-specific libraries. This decoder
can decode x86 instructions used in kernel into prefix, opcode, modrm,
sib, displacement and immediates. This can also show the length of
instructions.

This version introduces instruction attributes for decoding
instructions.
The instruction attribute tables are generated from the opcode map file
(x86-opcode-map.txt) by the generator script(gen-insn-attr-x86.awk).

Currently, the opcode maps are based on opcode maps in Intel(R) 64 and
IA-32 Architectures Software Developers Manual Vol.2: Appendix.A,
and consist of below two types of opcode tables.

1-byte/2-bytes/3-bytes opcodes, which has 256 elements, are
written as below;

 Table: table-name
 Referrer: escaped-name
 opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
  (or)
 opcode: escape # escaped-name
 EndTable

Group opcodes, which has 8 elements, are written as below;

 GrpTable: GrpXXX
 reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
 EndTable

These opcode maps include a few SSE and FP opcodes (for setup), because
those opcodes are used in the kernel.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: NJim Keniston <jkenisto@us.ibm.com>
Acked-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
Cc: Roland McGrath <roland@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
LKML-Reference: <20090813203413.31965.49709.stgit@localhost.localdomain>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>

eb13296c

04 8月, 2009 1 次提交

x86, msr: execute on the correct CPU subset · bab9a3da

由 Borislav Petkov 提交于 7月 30, 2009

Make rdmsr_on_cpus/wrmsr_on_cpus execute on the current CPU only if it
is in the supplied bitmask.
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

bab9a3da

11 7月, 2009 1 次提交

Fix congestion_wait() sync/async vs read/write confusion · 8aa7e847

由 Jens Axboe 提交于 7月 09, 2009

Commit 1faa16d2 accidentally broke
the bdi congestion wait queue logic, causing us to wait on congestion
for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

8aa7e847

04 7月, 2009 4 次提交

x86: atomic64: Inline atomic64_read() again · a79f0da8

由 Eric Dumazet 提交于 7月 03, 2009

Now atomic64_read() is light weight (no register pressure and
small icache), we can inline it again.

Also use "=&A" constraint instead of "+A" to avoid warning
about unitialized 'res' variable. (gcc had to force 0 in eax/edx)

  $ size vmlinux.prev vmlinux.after
     text    data     bss     dec     hex filename
  4908667  451676 1684868 7045211  6b805b vmlinux.prev
  4908651  451676 1684868 7045195  6b804b vmlinux.after
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <4A4E1AA2.30002@gmail.com>
[ Also fix typo in atomic64_set() export ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a79f0da8

x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative() · ddf9a003

由 Ingo Molnar 提交于 7月 03, 2009

Linus noticed that the variable name 'old_val' is
confusingly named in these functions - the correct
naming is 'new_val'.
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907030942260.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ddf9a003

x86: atomic64: Improve atomic64_xchg() · 3a8d1788

由 Ingo Molnar 提交于 7月 03, 2009

Remove the read-first logic from atomic64_xchg() and simplify
the loop.

This function was the last user of __atomic64_read() - remove it.

Also, change the 'real_val' assumption from the somewhat quirky
1ULL << 32 value to the (just as arbitrary, but simpler) value
of 0.
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3a8d1788

x86: atomic64: Export APIs to modules · 1fde902d

由 Ingo Molnar 提交于 7月 03, 2009

atomic64_t primitives are used by a handful of drivers,
so export the APIs consistently. These were inlined
before.

Also mark atomic64_32.o a core object, so that the symbols
are available even if not linked to core kernel pieces.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1fde902d

03 7月, 2009 8 次提交

x86: atomic64: Improve atomic64_read() · 67d7178f

由 Eric Dumazet 提交于 7月 03, 2009

Optimize atomic64_read() as a special open-coded
cmpxchg8b variant. This generates nicer code:

arch/x86/lib/atomic64_32.o:

   text	   data	    bss	    dec	    hex	filename
    435	      0	      0	    435	    1b3	atomic64_32.o.before
    431	      0	      0	    431	    1af	atomic64_32.o.after

md5:
   bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.before.asm
   2bdfd4bd1f6b7b61b7fc127aef90ce3b  atomic64_32.o.after.asm

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

67d7178f

x86: Add missing annotation to arch/x86/lib/copy_user_64.S::copy_to_user · 3fd382ce

由 Mike Galbraith 提交于 7月 02, 2009

While examining symbol generation in perf_counter tools, I
noticed that copy_to_user() had no size in vmlinux's symtab.
Signed-off-by: NMike Galbraith <efault@gmx.de>
Acked-by: NAlexander van Heukelum <heukelum@fastmail.fm>
Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <1246512440.13293.3.camel@marge.simson.net>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3fd382ce

x86: atomic64: Fix unclean type use in atomic64_xchg() · 199e2378

由 Ingo Molnar 提交于 7月 03, 2009

Linus noticed that atomic64_xchg() uses atomic_read(), which
happens to work because atomic_read() is a macro so the
.counter value gets u64-read on 32-bit too - but this is really
bogus and serious bugs are waiting to happen.

Fix atomic64_xchg() to use __atomic64_read() instead.

No code changed:

arch/x86/lib/atomic64_32.o:

   text	   data	    bss	    dec	    hex	filename
    435	      0	      0	    435	    1b3	atomic64_32.o.before
    435	      0	      0	    435	    1b3	atomic64_32.o.after

md5:
   bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.before.asm
   bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.after.asm
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

199e2378

x86: atomic64: Reduce size of functions · 3ac805d2

由 Ingo Molnar 提交于 7月 03, 2009

cmpxchg8b is a huge instruction in terms of register footprint,
we almost never want to inline it, not even within the same
code module.

GCC 4.3 still messes up for two functions, under-judging the
true cost of this instruction - so annotate two key functions
to reduce the bloat:

arch/x86/lib/atomic64_32.o:

   text	   data	    bss	    dec	    hex	filename
   1763	      0	      0	   1763	    6e3	atomic64_32.o.before
    435	      0	      0	    435	    1b3	atomic64_32.o.after

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3ac805d2

x86: atomic64: Improve atomic64_add_return() · 824975ef

由 Ingo Molnar 提交于 7月 03, 2009

Linus noted (based on Eric Dumazet's numbers) that we would
probably be better off not trying an atomic_read() in
atomic64_add_return() but intead intentionally let the first
cmpxchg8b fail - to get a cache-friendly 'give me ownership
of this cacheline' transaction. That can then be followed
by the real cmpxchg8b which sets the value local to the CPU.
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

824975ef

x86: atomic64: Improve cmpxchg8b() · 69237f94

由 Eric Dumazet 提交于 7月 03, 2009

Rewrite cmpxchg8b() to not use %edi register but a generic "+m"
constraint, to increase compiler freedom in code generation and
possibly better code.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

69237f94

x86: atomic64: Improve atomic64_read() · aacf682f

由 Eric Dumazet 提交于 7月 03, 2009

Linus noticed that the 32-bit version of atomic64_read() was
being overly complex with re-reading the value and doing a
retry loop over that.

Instead we can just rely on cmpxchg8b returning either the new
value or returning the current value.

We can use any 'old' value, which will be faster as it can be
loaded via immediates. Using some value that is not equal to
the real value in memory the instruction gets faster.

This also has the advantage that the CPU could avoid dirtying
the cacheline.
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

aacf682f

x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file · b7882b7c

由 Ingo Molnar 提交于 7月 03, 2009

Linus noted that the atomic64_t primitives are all inlines
currently which is crazy because these functions have a large
register footprint anyway.

Move them to a separate file: arch/x86/lib/atomic64_32.c

Also, while at it, rename all uses of 'unsigned long long' to
the much shorter u64.

This makes the appearance of the prototypes a lot nicer - and
it also uncovered a few bugs where (yet unused) API variants
had 'long' as their return type instead of u64.

[ More intrusive changes are not yet done in this patch. ]
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b7882b7c

01 7月, 2009 1 次提交

x86: Fix symbol annotation for arch/x86/lib/clear_page_64.S::clear_page_c · 9e314996

由 Mike Galbraith 提交于 6月 30, 2009

Noticed the zero-sized function symbol while looking at 'perf' profiles,
it causes the profiler to display those addresses in hexa.

Turns out that this was wrong/bogus for an eternity.
Signed-off-by: NMike Galbraith <efault@gmx.de>
Acked-by: NAlexander van Heukelum <heukelum@fastmail.fm>
Acked-by: NCyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1246366820.6538.1.camel@marge.simson.net>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9e314996

26 6月, 2009 1 次提交

x86, delay: tsc based udelay should have rdtsc_barrier · e888d7fa

由 Pallipadi, Venkatesh 提交于 6月 25, 2009

delay_tsc needs rdtsc_barrier to provide proper delay.

Output from a test driver using hpet to cross check delay
provided by udelay().

Before:
[   86.794363] Expected delay 5us actual 4679ns
[   87.154362] Expected delay 5us actual 698ns
[   87.514162] Expected delay 5us actual 4539ns
[   88.653716] Expected delay 5us actual 4539ns
[   94.664106] Expected delay 10us actual 9638ns
[   95.049351] Expected delay 10us actual 10126ns
[   95.416110] Expected delay 10us actual 9568ns
[   95.799216] Expected delay 10us actual 9638ns
[  103.624104] Expected delay 10us actual 9707ns
[  104.020619] Expected delay 10us actual 768ns
[  104.419951] Expected delay 10us actual 9707ns

After:
[   50.983320] Expected delay 5us actual 5587ns
[   51.261807] Expected delay 5us actual 5587ns
[   51.565715] Expected delay 5us actual 5657ns
[   51.861171] Expected delay 5us actual 5587ns
[   52.164704] Expected delay 5us actual 5726ns
[   52.487457] Expected delay 5us actual 5657ns
[   52.789338] Expected delay 5us actual 5726ns
[   57.119680] Expected delay 10us actual 10755ns
[   57.893997] Expected delay 10us actual 10615ns
[   58.261287] Expected delay 10us actual 10755ns
[   58.620505] Expected delay 10us actual 10825ns
[   58.941035] Expected delay 10us actual 10755ns
[   59.320903] Expected delay 10us actual 10615ns
[   61.306311] Expected delay 10us actual 10755ns
[   61.520542] Expected delay 10us actual 10615ns
Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

e888d7fa

21 6月, 2009 1 次提交

x86, 64-bit: Clean up user address masking · 9063c61f

由 Linus Torvalds 提交于 6月 20, 2009

The discussion about using "access_ok()" in get_user_pages_fast() (see
commit 7f818906: "x86: don't use
'access_ok()' as a range check in get_user_pages_fast()" for details and
end result), made us notice that x86-64 was really being very sloppy
about virtual address checking.

So be way more careful and straightforward about masking x86-64 virtual
addresses:

 - All the VIRTUAL_MASK* variants now cover half of the address
   space, it's not like we can use the full mask on a signed
   integer, and the larger mask just invites mistakes when
   applying it to either half of the 48-bit address space.

 - /proc/kcore's kc_offset_to_vaddr() becomes a lot more
   obvious when it transforms a file offset into a
   (kernel-half) virtual address.

 - Unify/simplify the 32-bit and 64-bit USER_DS definition to
   be based on TASK_SIZE_MAX.

This cleanup and more careful/obvious user virtual address checking also
uncovered a buglet in the x86-64 implementation of strnlen_user(): it
would do an "access_ok()" check on the whole potential area, even if the
string itself was much shorter, and thus return an error even for valid
strings. Our sloppy checking had hidden this.

So this fixes 'strnlen_user()' to do this properly, the same way we
already handled user strings in 'strncpy_from_user()'.  Namely by just
checking the first byte, and then relying on fault handling for the
rest.  That always works, since we impose a guard page that cannot be
mapped at the end of the user space address space (and even if we
didn't, we'd have the address space hole).
Acked-by: NIngo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9063c61f

10 6月, 2009 2 次提交

x86: MSR: add methods for writing of an MSR on several CPUs · b034c19f

由 Borislav Petkov 提交于 5月 22, 2009

Provide for concurrent MSR writes on all the CPUs in the cpumask. Also,
add a temporary workaround for smp_call_function_many which skips the
CPU we're executing on.

Bart: zero out rv struct which is allocated on stack.

CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>

b034c19f

x86: MSR: add a struct representation of an MSR · 6bc1096d

由 Borislav Petkov 提交于 5月 22, 2009

Add a struct representing a 64bit MSR pair consisting of a low and high
register part and convert msr_info to use it. Also, rename msr-on-cpu.c
to msr.c.

Side note: Put the cpumask.h include in __KERNEL__ space thus fixing an
allmodconfig build failure in the headers_check target.

CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>

6bc1096d

12 3月, 2009 2 次提交

x86: memcpy, clean up · f3b6eaf0

由 Ingo Molnar 提交于 3月 12, 2009

Impact: cleanup

Make this file more readable by bringing it more in line
with the usual kernel style.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f3b6eaf0

x86-64: remove unnecessary spill/reload of rbx from memcpy · dd1ef4ec

由 Jan Beulich 提交于 3月 12, 2009

Impact: micro-optimization

This should slightly improve its performance.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
LKML-Reference: <49B8F641.76E4.0078.0@novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

dd1ef4ec

14 2月, 2009 1 次提交

x86: use _types.h headers in asm where available · 0341c14d

由 Jeremy Fitzhardinge 提交于 2月 13, 2009

In general, the only definitions that assembly files can use
are in _types.S headers (where available), so convert them.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

0341c14d

21 1月, 2009 1 次提交

x86: use early clobbers in usercopy*.c · e0a96129

由 Andi Kleen 提交于 1月 16, 2009

Impact: fix rare (but currently harmless) miscompile with certain configs and gcc versions

Hugh Dickins noticed that strncpy_from_user() was miscompiled
in some circumstances with gcc 4.3.

Thanks to Hugh's excellent analysis it was easy to track down.

Hugh writes:

> Try building an x86_64 defconfig 2.6.29-rc1 kernel tree,
> except not quite defconfig, switch CONFIG_PREEMPT_NONE=y
> and CONFIG_PREEMPT_VOLUNTARY off (because it expands a
> might_fault() there, which hides the issue): using a
> gcc 4.3.2 (I've checked both openSUSE 11.1 and Fedora 10).
>
> It generates the following:
>
> 0000000000000000 <__strncpy_from_user>:
>    0:   48 89 d1                mov    %rdx,%rcx
>    3:   48 85 c9                test   %rcx,%rcx
>    6:   74 0e                   je     16 <__strncpy_from_user+0x16>
>    8:   ac                      lods   %ds:(%rsi),%al
>    9:   aa                      stos   %al,%es:(%rdi)
>    a:   84 c0                   test   %al,%al
>    c:   74 05                   je     13 <__strncpy_from_user+0x13>
>    e:   48 ff c9                dec    %rcx
>   11:   75 f5                   jne    8 <__strncpy_from_user+0x8>
>   13:   48 29 c9                sub    %rcx,%rcx
>   16:   48 89 c8                mov    %rcx,%rax
>   19:   c3                      retq
>
> Observe that "sub %rcx,%rcx; mov %rcx,%rax", whereas gcc 4.2.1
> (and many other configs) say "sub %rcx,%rdx; mov %rdx,%rax".
> Isn't it returning 0 when it ought to be returning strlen?

The asm constraints for the strncpy_from_user() result were missing an
early clobber, which tells gcc that the last output arguments
are written before all input arguments are read.

Also add more early clobbers in the rest of the file and fix 32-bit
usercopy.c in the same way.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
[ since this API is rarely used and no in-kernel user relies on a 'len'
  return value (they only rely on negative return values) this miscompile
  was never noticed in the field. But it's worth fixing it nevertheless. ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e0a96129

12 9月, 2008 1 次提交

x86: some lock annotations for user copy paths, v3 · 1d18ef48

由 Ingo Molnar 提交于 9月 11, 2008

- add annotation back to clear_user()
- change probe_kernel_address() to _inatomic*() method
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1d18ef48