1. 29 Oct, 2009 1 commit
    • M
      x86: Fix SSE opcode map bug · 7f387d3f
      Masami Hiramatsu authored
      Fix the position of superscripts, because some superscripts of SSE
      opcodes are not in the correct position.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204204.30545.97296.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7f387d3f
  2. 21 Oct, 2009 2 commits
  3. 17 Oct, 2009 2 commits
    • M
      x86: Add AMD prefetch and 3DNow! opcodes to opcode map · d1baf5a5
      Masami Hiramatsu authored
      Add AMD prefetch and 3DNow! opcodes, including FEMMS. Since 3DNow!
      uses the last immediate byte as an opcode extension byte, the x86
      insn decoder just treats the extension byte as an immediate byte
      instead of a part of the opcode (insn_get_opcode() decodes the first
      "0x0f 0x0f" bytes.)
      
      Users who are interested in analyzing 3DNow! opcodes can still
      decode them by analyzing the immediate byte.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20091017000744.16556.27881.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d1baf5a5
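The approach described in d1baf5a5 (treat the trailing byte as an immediate, then inspect it) can be sketched as below. This is a minimal illustration, not the kernel's decoder: `struct decoded_insn` and `threednow_mnemonic()` are hypothetical names invented here, and only three commonly documented suffix values are listed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a decoded instruction (not struct insn). */
struct decoded_insn {
    uint8_t opcode[2];   /* for 3DNow! these are always 0x0f 0x0f */
    uint8_t modrm;       /* operand selection */
    uint8_t immediate;   /* trailing byte: selects the actual operation */
};

/* Map the trailing immediate byte to a mnemonic; NULL if not recognized. */
static const char *threednow_mnemonic(const struct decoded_insn *insn)
{
    if (insn->opcode[0] != 0x0f || insn->opcode[1] != 0x0f)
        return NULL;                 /* not a 3DNow! instruction */

    switch (insn->immediate) {
    case 0x9a: return "pfsub";
    case 0x9e: return "pfadd";
    case 0xb4: return "pfmul";
    default:   return NULL;          /* suffix not in this small table */
    }
}
```

A caller that cares about 3DNow! would run the normal decode first and only then look at the immediate, which is why the decoder itself does not need a separate table for these suffixes.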
    • M
      x86: Add MMX/SSE opcode groups to opcode map · 8c95bc3e
      Masami Hiramatsu authored
      Add missing MMX/SSE opcode groups to x86 opcode map.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20091017000736.16556.29061.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8c95bc3e
  4. 03 Oct, 2009 1 commit
    • M
      x86: Add VIA processor instructions in opcodes decoder · c0b11d3a
      Masami Hiramatsu authored
      Add the VIA processor's Padlock instructions (MONTMUL, XSHA1, XSHA256)
      as parts of the kernel may use them.
      
      This fixes the following crash in opcodes decoder selftests:
      
       make[2]: `scripts/unifdef' is up to date.
         TEST    posttest
       Error: c145cf71:        f3 0f a6 d0             repz xsha256
       Error: objdump says 4 bytes, but insn_get_length() says 3 (attr:0)
       make[1]: *** [posttest] Error 2
       make: *** [bzImage] Error 2
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20090925182037.10157.3180.stgit@omoto>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      c0b11d3a
  5. 01 Oct, 2009 2 commits
  6. 11 Sep, 2009 1 commit
  7. 05 Sep, 2009 1 commit
    • H
      x86, msr: change msr-reg.o to obj-y, and export its symbols · b19ae399
      H. Peter Anvin authored
      Change msr-reg.o to obj-y (it will be included in virtually every
      kernel since it is used by the initialization code for AMD processors)
      and add a separate C file to export its symbols to modules, so that
      msr.ko can use them; on uniprocessors we bypass the helper functions
      in msr.o and use the accessor functions directly via inlines.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <20090904140834.GA15789@elte.hu>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      b19ae399
  8. 04 Sep, 2009 1 commit
    • I
      x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too · 8adf65cf
      Ingo Molnar authored
      The macro was defined in the 32-bit path as well - breaking the
      build on 32-bit platforms:
      
        arch/x86/lib/msr-reg.S: Assembler messages:
        arch/x86/lib/msr-reg.S:53: Error: Bad macro parameter list
        arch/x86/lib/msr-reg.S:100: Error: invalid character '_' in mnemonic
        arch/x86/lib/msr-reg.S:101: Error: invalid character '_' in mnemonic
      
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <tip-f6909f39@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8adf65cf
  9. 02 Sep, 2009 1 commit
  10. 01 Sep, 2009 3 commits
  11. 27 Aug, 2009 1 commit
    • M
      x86: Instruction decoder API · eb13296c
      Masami Hiramatsu authored
      Add x86 instruction decoder to arch-specific libraries. This decoder
      can decode x86 instructions used in kernel into prefix, opcode, modrm,
      sib, displacement and immediates. This can also show the length of
      instructions.
      
      This version introduces instruction attributes for decoding
      instructions.
      The instruction attribute tables are generated from the opcode map file
      (x86-opcode-map.txt) by the generator script (gen-insn-attr-x86.awk).
      
      Currently, the opcode maps are based on the opcode maps in the Intel(R) 64
      and IA-32 Architectures Software Developer's Manual Vol.2: Appendix A,
      and consist of the following two types of opcode tables.
      
      1-byte/2-byte/3-byte opcode tables, which have 256 elements, are
      written as below:
      
       Table: table-name
       Referrer: escaped-name
       opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
        (or)
       opcode: escape # escaped-name
       EndTable
      
      Group opcode tables, which have 8 elements, are written as below:
      
       GrpTable: GrpXXX
       reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
       EndTable
      
      These opcode maps include a few SSE and FP opcodes (for setup), because
      those opcodes are used in the kernel.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      LKML-Reference: <20090813203413.31965.49709.stgit@localhost.localdomain>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      eb13296c
  12. 04 Aug, 2009 1 commit
  13. 11 Jul, 2009 1 commit
  14. 04 Jul, 2009 4 commits
    • E
      x86: atomic64: Inline atomic64_read() again · a79f0da8
      Eric Dumazet authored
      Now that atomic64_read() is lightweight (no register pressure and
      small icache footprint), we can inline it again.
      
      Also use the "=&A" constraint instead of "+A" to avoid a warning
      about the uninitialized 'res' variable. (gcc had to force 0 in eax/edx)
      
        $ size vmlinux.prev vmlinux.after
           text    data     bss     dec     hex filename
        4908667  451676 1684868 7045211  6b805b vmlinux.prev
        4908651  451676 1684868 7045195  6b804b vmlinux.after
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <4A4E1AA2.30002@gmail.com>
      [ Also fix typo in atomic64_set() export ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a79f0da8
    • I
      x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative() · ddf9a003
      Ingo Molnar authored
      Linus noticed that the variable name 'old_val' is
      confusingly named in these functions - the correct
      naming is 'new_val'.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907030942260.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ddf9a003
    • I
      x86: atomic64: Improve atomic64_xchg() · 3a8d1788
      Ingo Molnar authored
      Remove the read-first logic from atomic64_xchg() and simplify
      the loop.
      
      This function was the last user of __atomic64_read() - remove it.
      
      Also, change the 'real_val' assumption from the somewhat quirky
      1ULL << 32 value to the (just as arbitrary, but simpler) value
      of 0.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a8d1788
    • I
      x86: atomic64: Export APIs to modules · 1fde902d
      Ingo Molnar authored
      atomic64_t primitives are used by a handful of drivers,
      so export the APIs consistently. These were inlined
      before.
      
      Also mark atomic64_32.o a core object, so that the symbols
      are available even if not linked to core kernel pieces.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1fde902d
  15. 03 Jul, 2009 8 commits
    • E
      x86: atomic64: Improve atomic64_read() · 67d7178f
      Eric Dumazet authored
      Optimize atomic64_read() as a special open-coded
      cmpxchg8b variant. This generates nicer code:
      
      arch/x86/lib/atomic64_32.o:
      
         text	   data	    bss	    dec	    hex	filename
          435	      0	      0	    435	    1b3	atomic64_32.o.before
          431	      0	      0	    431	    1af	atomic64_32.o.after
      
      md5:
         bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.before.asm
         2bdfd4bd1f6b7b61b7fc127aef90ce3b  atomic64_32.o.after.asm
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      67d7178f
    • M
      x86: Add missing annotation to arch/x86/lib/copy_user_64.S::copy_to_user · 3fd382ce
      Mike Galbraith authored
      While examining symbol generation in perf_counter tools, I
      noticed that copy_to_user() had no size in vmlinux's symtab.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <1246512440.13293.3.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3fd382ce
    • I
      x86: atomic64: Fix unclean type use in atomic64_xchg() · 199e2378
      Ingo Molnar authored
      Linus noticed that atomic64_xchg() uses atomic_read(), which
      happens to work because atomic_read() is a macro so the
      .counter value gets u64-read on 32-bit too - but this is really
      bogus and serious bugs are waiting to happen.
      
      Fix atomic64_xchg() to use __atomic64_read() instead.
      
      No code changed:
      
      arch/x86/lib/atomic64_32.o:
      
         text	   data	    bss	    dec	    hex	filename
          435	      0	      0	    435	    1b3	atomic64_32.o.before
          435	      0	      0	    435	    1b3	atomic64_32.o.after
      
      md5:
         bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.before.asm
         bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.after.asm
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      199e2378
    • I
      x86: atomic64: Reduce size of functions · 3ac805d2
      Ingo Molnar authored
      cmpxchg8b is a huge instruction in terms of register footprint,
      we almost never want to inline it, not even within the same
      code module.
      
      GCC 4.3 still messes up for two functions, under-judging the
      true cost of this instruction - so annotate two key functions
      to reduce the bloat:
      
      arch/x86/lib/atomic64_32.o:
      
         text	   data	    bss	    dec	    hex	filename
         1763	      0	      0	   1763	    6e3	atomic64_32.o.before
          435	      0	      0	    435	    1b3	atomic64_32.o.after
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3ac805d2
    • I
      x86: atomic64: Improve atomic64_add_return() · 824975ef
      Ingo Molnar authored
      Linus noted (based on Eric Dumazet's numbers) that we would
      probably be better off not trying an atomic_read() in
      atomic64_add_return() but instead intentionally letting the first
      cmpxchg8b fail - to get a cache-friendly 'give me ownership
      of this cacheline' transaction. That can then be followed
      by the real cmpxchg8b which sets the value local to the CPU.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      824975ef
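The "let the first compare-and-exchange fail" idea from 824975ef can be sketched in portable C. This is an illustration under assumptions, not the kernel code: it uses C11 `atomic_compare_exchange_strong()` in place of the kernel's cmpxchg8b wrapper, and `add_return_sketch()` is a name invented here.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Add 'delta' to '*ptr' and return the new value without reading
 * '*ptr' first. The initial guess of 0 is arbitrary: when it is wrong,
 * the failed compare-and-exchange writes the actual value into
 * 'old_val', and the next iteration retries with it.
 */
static uint64_t add_return_sketch(_Atomic uint64_t *ptr, uint64_t delta)
{
    uint64_t old_val = 0;            /* deliberate guess, usually wrong */
    uint64_t new_val;

    do {
        new_val = old_val + delta;
    } while (!atomic_compare_exchange_strong(ptr, &old_val, new_val));

    return new_val;
}
```

The first iteration usually fails, but it already acquires the cacheline for writing, so the second attempt tends to succeed without a separate read transaction.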
    • E
      x86: atomic64: Improve cmpxchg8b() · 69237f94
      Eric Dumazet authored
      Rewrite cmpxchg8b() to not use %edi register but a generic "+m"
      constraint, to increase compiler freedom in code generation and
      possibly better code.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      69237f94
    • E
      x86: atomic64: Improve atomic64_read() · aacf682f
      Eric Dumazet authored
      Linus noticed that the 32-bit version of atomic64_read() was
      being overly complex with re-reading the value and doing a
      retry loop over that.
      
      Instead we can just rely on cmpxchg8b returning either the new
      value or returning the current value.
      
      We can use any 'old' value, which will be faster as it can be
      loaded via immediates. The instruction also gets faster when the
      value used is not equal to the real value in memory.
      
      This also has the advantage that the CPU could avoid dirtying
      the cacheline.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      aacf682f
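The read-by-compare-and-exchange trick from aacf682f can be sketched as follows. This is a hedged illustration using C11 atomics rather than the kernel's cmpxchg8b inline assembly; `read_sketch()` is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Read a 64-bit value atomically with a single compare-and-exchange.
 * 'seen' starts as an arbitrary constant. If memory happens to hold
 * that constant, the same value is stored back and nothing changes;
 * otherwise the operation fails and copies the current value into 'seen'.
 * Either way 'seen' ends up holding the value that was in memory.
 */
static uint64_t read_sketch(_Atomic uint64_t *ptr)
{
    const uint64_t desired = 0;      /* same constant as the guess */
    uint64_t seen = 0;               /* arbitrary guess */

    atomic_compare_exchange_strong(ptr, &seen, desired);
    return seen;
}
```

No retry loop is needed, because both outcomes of the single operation yield the current contents of the variable.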
    • I
      x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file · b7882b7c
      Ingo Molnar authored
      Linus noted that the atomic64_t primitives are all inlines
      currently which is crazy because these functions have a large
      register footprint anyway.
      
      Move them to a separate file: arch/x86/lib/atomic64_32.c
      
      Also, while at it, rename all uses of 'unsigned long long' to
      the much shorter u64.
      
      This makes the appearance of the prototypes a lot nicer - and
      it also uncovered a few bugs where (yet unused) API variants
      had 'long' as their return type instead of u64.
      
      [ More intrusive changes are not yet done in this patch. ]
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b7882b7c
  16. 01 Jul, 2009 1 commit
  17. 26 Jun, 2009 1 commit
    • P
      x86, delay: tsc based udelay should have rdtsc_barrier · e888d7fa
      Pallipadi, Venkatesh authored
      delay_tsc needs rdtsc_barrier to provide proper delay.
      
      Output from a test driver using hpet to cross check delay
      provided by udelay().
      
      Before:
      [   86.794363] Expected delay 5us actual 4679ns
      [   87.154362] Expected delay 5us actual 698ns
      [   87.514162] Expected delay 5us actual 4539ns
      [   88.653716] Expected delay 5us actual 4539ns
      [   94.664106] Expected delay 10us actual 9638ns
      [   95.049351] Expected delay 10us actual 10126ns
      [   95.416110] Expected delay 10us actual 9568ns
      [   95.799216] Expected delay 10us actual 9638ns
      [  103.624104] Expected delay 10us actual 9707ns
      [  104.020619] Expected delay 10us actual 768ns
      [  104.419951] Expected delay 10us actual 9707ns
      
      After:
      [   50.983320] Expected delay 5us actual 5587ns
      [   51.261807] Expected delay 5us actual 5587ns
      [   51.565715] Expected delay 5us actual 5657ns
      [   51.861171] Expected delay 5us actual 5587ns
      [   52.164704] Expected delay 5us actual 5726ns
      [   52.487457] Expected delay 5us actual 5657ns
      [   52.789338] Expected delay 5us actual 5726ns
      [   57.119680] Expected delay 10us actual 10755ns
      [   57.893997] Expected delay 10us actual 10615ns
      [   58.261287] Expected delay 10us actual 10755ns
      [   58.620505] Expected delay 10us actual 10825ns
      [   58.941035] Expected delay 10us actual 10755ns
      [   59.320903] Expected delay 10us actual 10615ns
      [   61.306311] Expected delay 10us actual 10755ns
      [   61.520542] Expected delay 10us actual 10615ns
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      e888d7fa
  18. 21 Jun, 2009 1 commit
    • L
      x86, 64-bit: Clean up user address masking · 9063c61f
      Linus Torvalds authored
      The discussion about using "access_ok()" in get_user_pages_fast() (see
      commit 7f818906: "x86: don't use
      'access_ok()' as a range check in get_user_pages_fast()" for details and
      end result), made us notice that x86-64 was really being very sloppy
      about virtual address checking.
      
      So be way more careful and straightforward about masking x86-64 virtual
      addresses:
      
       - All the VIRTUAL_MASK* variants now cover half of the address
         space, it's not like we can use the full mask on a signed
         integer, and the larger mask just invites mistakes when
         applying it to either half of the 48-bit address space.
      
       - /proc/kcore's kc_offset_to_vaddr() becomes a lot more
         obvious when it transforms a file offset into a
         (kernel-half) virtual address.
      
       - Unify/simplify the 32-bit and 64-bit USER_DS definition to
         be based on TASK_SIZE_MAX.
      
      This cleanup and more careful/obvious user virtual address checking also
      uncovered a buglet in the x86-64 implementation of strnlen_user(): it
      would do an "access_ok()" check on the whole potential area, even if the
      string itself was much shorter, and thus return an error even for valid
      strings. Our sloppy checking had hidden this.
      
      So this fixes 'strnlen_user()' to do this properly, the same way we
      already handled user strings in 'strncpy_from_user()'.  Namely by just
      checking the first byte, and then relying on fault handling for the
      rest.  That always works, since we impose a guard page that cannot be
      mapped at the end of the user space address space (and even if we
      didn't, we'd have the address space hole).
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9063c61f
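The half-address-space mask and the file-offset-to-kernel-address transformation mentioned in 9063c61f can be illustrated as below. This is a sketch under assumptions: the macro and function names are invented here, and the 47-bit shift is inferred from "half of the 48-bit address space" rather than copied from the commit.

```c
#include <stdint.h>

/* Illustrative names; the mask covers one half of a 48-bit address space. */
#define SKETCH_VIRTUAL_MASK_SHIFT 47
#define SKETCH_VIRTUAL_MASK ((UINT64_C(1) << SKETCH_VIRTUAL_MASK_SHIFT) - 1)

/* Turn a file offset into a kernel-half address by setting all upper bits. */
static uint64_t offset_to_kernel_vaddr(uint64_t offset)
{
    return offset | ~SKETCH_VIRTUAL_MASK;
}

/* Inverse: strip the upper bits to recover the file offset. */
static uint64_t kernel_vaddr_to_offset(uint64_t vaddr)
{
    return vaddr & SKETCH_VIRTUAL_MASK;
}
```

With a mask that covers only one half, the same constant serves both directions, and it is visible at a glance that the result always lands in the kernel half.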
  19. 10 Jun, 2009 2 commits
  20. 12 Mar, 2009 2 commits
  21. 14 Feb, 2009 1 commit
  22. 21 Jan, 2009 1 commit
    • A
      x86: use early clobbers in usercopy*.c · e0a96129
      Andi Kleen authored
      Impact: fix rare (but currently harmless) miscompile with certain configs and gcc versions
      
      Hugh Dickins noticed that strncpy_from_user() was miscompiled
      in some circumstances with gcc 4.3.
      
      Thanks to Hugh's excellent analysis it was easy to track down.
      
      Hugh writes:
      
      > Try building an x86_64 defconfig 2.6.29-rc1 kernel tree,
      > except not quite defconfig, switch CONFIG_PREEMPT_NONE=y
      > and CONFIG_PREEMPT_VOLUNTARY off (because it expands a
      > might_fault() there, which hides the issue): using a
      > gcc 4.3.2 (I've checked both openSUSE 11.1 and Fedora 10).
      >
      > It generates the following:
      >
      > 0000000000000000 <__strncpy_from_user>:
      >    0:   48 89 d1                mov    %rdx,%rcx
      >    3:   48 85 c9                test   %rcx,%rcx
      >    6:   74 0e                   je     16 <__strncpy_from_user+0x16>
      >    8:   ac                      lods   %ds:(%rsi),%al
      >    9:   aa                      stos   %al,%es:(%rdi)
      >    a:   84 c0                   test   %al,%al
      >    c:   74 05                   je     13 <__strncpy_from_user+0x13>
      >    e:   48 ff c9                dec    %rcx
      >   11:   75 f5                   jne    8 <__strncpy_from_user+0x8>
      >   13:   48 29 c9                sub    %rcx,%rcx
      >   16:   48 89 c8                mov    %rcx,%rax
      >   19:   c3                      retq
      >
      > Observe that "sub %rcx,%rcx; mov %rcx,%rax", whereas gcc 4.2.1
      > (and many other configs) say "sub %rcx,%rdx; mov %rdx,%rax".
      > Isn't it returning 0 when it ought to be returning strlen?
      
      The asm constraints for the strncpy_from_user() result were missing an
      early clobber, which tells gcc that the last output arguments
      are written before all input arguments are read.
      
      Also add more early clobbers in the rest of the file and fix 32-bit
      usercopy.c in the same way.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      [ since this API is rarely used and no in-kernel user relies on a 'len'
        return value (they only rely on negative return values) this miscompile
        was never noticed in the field. But it's worth fixing it nevertheless. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e0a96129
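The role of the early clobber in e0a96129 can be shown with a much smaller example than the usercopy code. This is a sketch, not the kernel's asm: `sub_sketch()` is a hypothetical function, the assembly is only used on x86-64 with a GCC-compatible compiler, and other targets fall back to plain C.

```c
/*
 * Compute total - remaining with two instructions. The output 'res' is
 * written by the first instruction, before the input 'remaining' is read
 * by the second one.
 */
static long sub_sketch(long total, long remaining)
{
    long res;
#if defined(__GNUC__) && defined(__x86_64__)
    /*
     * The '&' in "=&r" marks 'res' as an early clobber, so the compiler
     * must not place it in the same register as any input. Without it,
     * 'res' and 'remaining' may share a register, and the result
     * degenerates to "sub reg,reg", which is always 0.
     */
    __asm__("mov %1, %0\n\t"
            "sub %2, %0"
            : "=&r"(res)
            : "r"(total), "r"(remaining)
            : "cc");
#else
    res = total - remaining;         /* portable fallback for other targets */
#endif
    return res;
}
```

The failure mode without the modifier matches the `sub %rcx,%rcx` seen in the disassembly quoted above: the code is syntactically fine, but the register allocation makes it return 0.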
  23. 12 Sep, 2008 1 commit