1. 25 Sep 2010, 1 commit
    • x86, mem: Optimize memmove for small size and unaligned cases · 3b4b682b
      Committed by Ma Ling
      The movs instruction combines data accesses to accelerate copying,
      but two cases need special handling (see the sketch below):
      
      1. movs needs a long latency to start up, so for small copies we
         use general mov instructions to move the data instead.
      2. movs is not good for unaligned cases; even when, say, the source
         offset is 0x10 and the destination offset is 0x0, we avoid it
         and handle the copy with general mov instructions as well.
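      A minimal C sketch of that dispatch (the 32-byte cutoff and the
      helper name are illustrative assumptions; the real implementation
      is x86 assembly):
      
        #include <stddef.h>
        #include <string.h>
        
        #define SMALL_COPY 32   /* illustrative cutoff */
        
        void *memmove_sketch(void *dest, const void *src, size_t n)
        {
            unsigned char *d = dest;
            const unsigned char *s = src;
        
            if (n < SMALL_COPY) {
                /* "general mov" path: plain loads/stores avoid the movs
                 * startup latency; pick the overlap-safe direction. */
                if (d <= s)
                    while (n--) *d++ = *s++;   /* copy forward */
                else
                    while (n--) d[n] = s[n];   /* copy backward */
                return dest;
            }
            /* Large copies: defer to a bulk (movs-style) routine; the
             * real patch also keeps unaligned cases off the movs path. */
            return memmove(dest, src, n);
        }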
      Signed-off-by: Ma Ling <ling.ma@intel.com>
      LKML-Reference: <1284664360-6138-1-git-send-email-ling.ma@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  2. 24 Aug 2010, 2 commits
    • x86, mem: Optimize memcpy by avoiding memory false dependence · 59daa706
      Committed by Ma Ling
      All read operations after the allocation stage can run speculatively,
      while all write operations run in program order; if the addresses
      differ, a read may run before an older write, otherwise it waits
      until the write commits. However, the CPU does not check every
      address bit, so a read can fail to recognize a different address
      even when the two addresses are in different pages. For example, if
      %rsi is 0xf004 and %rdi is 0xe008, the following sequence incurs a
      large performance penalty:
      1. movq (%rsi),	%rax
      2. movq %rax,	(%rdi)
      3. movq 8(%rsi), %rax
      4. movq %rax,	8(%rdi)
      
      If %rsi and %rdi really were in the same memory page, there would be
      a true read-after-write dependence, because instruction 2 writes
      offset 0x008 and instruction 3 reads offset 0x00c; the two accesses
      partially overlap. In fact they are in different pages and there is
      no real conflict, but because the CPU does not check every address
      bit it may assume they share a page, so instruction 3 has to wait
      for instruction 2 to move its data from the write buffer into the
      cache and then load the data back; the time the read spends waiting
      is comparable to an mfence instruction. We can avoid this by tuning
      the operation sequence as follows:
      
      1. movq 8(%rsi), %rax
      2. movq %rax,	8(%rdi)
      3. movq (%rsi),	%rax
      4. movq %rax,	(%rdi)
      
      Instruction 3 now reads offset 0x004 while instruction 2 writes
      offset 0x010, so there is no dependence at all. On Core2 we gain a
      1.83x speedup compared with the original instruction sequence. In
      this patch we first handle small sizes (less than 20 bytes), then
      jump to the appropriate copy mode. Based on our micro-benchmark, we
      got up to a 2x improvement for small copies from 1 to 127 bytes, and
      up to a 1.5x improvement for 1024 bytes on Core i7. (We used our own
      micro-benchmark and will do further testing according to your
      requirements.)
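      The reordering idea, rendered as a hedged C sketch (the real patch
      is hand-written x86-64 assembly; the word-granular loop, the
      assumption of 8-byte-aligned buffers, and the omitted tail handling
      are simplifications):
      
        #include <stddef.h>
        #include <stdint.h>
        
        /* Copy 8-byte words starting from the highest address, as in
         * the reordered sequence above: each load's page offset then
         * trails the preceding store's offset, so the CPU cannot
         * mistake it for a read-after-write hazard. */
        void copy_words_high_to_low(void *dst, const void *src, size_t n)
        {
            uint64_t *d = dst;
            const uint64_t *s = src;
            size_t words = n / 8;   /* tail bytes (n % 8) not handled */
        
            while (words--)
                d[words] = s[words];
        }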
      Signed-off-by: Ma Ling <ling.ma@intel.com>
      LKML-Reference: <1277753065-18610-1-git-send-email-ling.ma@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, mem: Don't implement forward memmove() as memcpy() · fdf42896
      Committed by Ma, Ling
      memmove() allows the source and destination addresses to overlap,
      but memcpy() offers no such guarantee. Therefore, explicitly
      implement memmove() in both the forward and backward directions, to
      give us the ability to optimize memcpy().
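      A short, hedged demonstration (standard C, not the kernel code) of
      why the forward direction cannot simply be memcpy():
      
        #include <stdio.h>
        #include <string.h>
        
        int main(void)
        {
            char buf[8] = "abcdef";
        
            /* Shift five bytes right by one: source and destination
             * overlap with dest > src, so the copy must run backward;
             * a naive forward memcpy()-style loop would re-read bytes
             * it had already overwritten. */
            memmove(buf + 1, buf, 5);
            printf("%s\n", buf);   /* prints "aabcde" */
            return 0;
        }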
      Signed-off-by: Ma Ling <ling.ma@intel.com>
      LKML-Reference: <C10D3FB0CD45994C8A51FEC1227CE22F0E483AD86A@shsmsx502.ccr.corp.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  3. 12 Aug 2010, 2 commits
  4. 29 Jul 2010, 2 commits
  5. 14 Jul 2010, 1 commit
  6. 08 Jul 2010, 1 commit
    • x86, alternatives: Use 16-bit numbers for cpufeature index · 83a7a2ad
      Committed by H. Peter Anvin
      We already have cpufeature indices above 255, so use a 16-bit number
      for the alternatives index.  This consumes a padding field and so
      doesn't add any size, but it means that abusing the padding field to
      create assembly errors on overflow no longer works.  We can retain
      the test simply by redirecting it to the .discard section, however.
      
      [ v3: updated to include open-coded locations ]
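      A hedged sketch of the layout change (field names and types are
      illustrative; the kernel's actual struct alt_instr differs in
      detail):
      
        #include <stdint.h>
        
        struct alt_instr_old {              /* before this patch */
            uint8_t *instr;                 /* original instruction */
            uint8_t *replacement;
            uint8_t  cpuid;                 /* feature index, max 255 */
            uint8_t  instrlen;
            uint8_t  replacementlen;
            uint8_t  pad;                   /* abused for overflow checks */
        };
        
        struct alt_instr_new {              /* after: same total size */
            uint8_t *instr;
            uint8_t *replacement;
            uint16_t cpuid;                 /* 16 bits, absorbs the pad */
            uint8_t  instrlen;
            uint8_t  replacementlen;
        };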
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <tip-f88731e3068f9d1392ba71cc9f50f035d26a0d4f@git.kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  7. 05 May 2010, 1 commit
  8. 10 Mar 2010, 1 commit
    • perf, x86: Add INSTRUCTION_DECODER config flag · ba7e4d13
      Committed by Ingo Molnar
      The PEBS+LBR decoding magic needs the insn_get_length() infrastructure
      to be able to decode x86 instruction length.
      
      So split it out of KPROBES dependency and make it enabled when either
      KPROBES or PERF_EVENTS is enabled.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 02 Mar 2010, 2 commits
  10. 26 Feb 2010, 1 commit
    • x86-32: Rewrite 32-bit atomic64 functions in assembly · a7e926ab
      Committed by Luca Barbieri
      This patch replaces atomic64_32.c with two assembly implementations,
      one for 386/486 machines using pushf/cli/popf and one for 586+ machines
      using cmpxchg8b.
      
      The cmpxchg8b implementation provides the following advantages over the
      current one:
      
      1. Implements atomic64_add_unless, atomic64_dec_if_positive and
         atomic64_inc_not_zero (the retry pattern behind these is
         sketched right after this list)
      
      2. Uses the ZF flag changed by cmpxchg8b instead of doing a comparison
      
      3. Uses custom register calling conventions that reduce or eliminate
         register moves to suit cmpxchg8b
      
      4. Reads the initial value instead of using cmpxchg8b to do that.
         Currently we use lock xaddl and movl, which seems the fastest.
      
      5. Does not use the lock prefix for atomic64_set
         64-bit writes are already atomic, so we don't need that.
         We still need it for atomic64_read to avoid restoring a value
         changed in the meantime.
      
      6. Allocates registers as well or better than gcc
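      The cmpxchg8b retry pattern behind point 1, as a hedged C sketch
      (the real code is hand-written assembly with custom calling
      conventions; __sync_val_compare_and_swap on a 64-bit value compiles
      to lock cmpxchg8b on 586+):
      
        #include <stdbool.h>
        
        bool atomic64_add_unless_sketch(long long *v, long long a,
                                        long long u)
        {
            /* Plain read as the initial guess; the real code takes
             * care to read the 64-bit value atomically. */
            long long old = *v;
        
            while (old != u) {
                long long seen =
                    __sync_val_compare_and_swap(v, old, old + a);
                if (seen == old)
                    return true;   /* exchanged: the ZF success test */
                old = seen;        /* raced; retry with the value seen */
            }
            return false;          /* value was u, nothing added */
        }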
      
      The 386 implementation provides support for 386 and 486 machines.
      386/486 SMP is not supported (we dropped it), but such support can be
      added easily if desired.
      
      A pure assembly implementation is required due to the custom calling
      conventions, the desire to use %ebp in atomic64_add_return (we need
      7 registers...), and the ability to use pushf/popf in the 386
      code without an intermediate pop/push.
      
      The parameter names are changed to match the convention in atomic_64.h
      
      Changes in v3 (due to rebasing to tip/x86/asm):
      - Patches atomic64_32.h instead of atomic_32.h
      - Uses the CALL alternative mechanism from commit
        1b1d9258
      
      Changes in v2:
      - Merged 386 and cx8 support in the same patch
      - 386 support now done in assembly, C code no longer used at all
      - cmpxchg64 is used for atomic64_cmpxchg
      - stop using macros, use one-line inline functions instead
      - miscellaneous changes and improvements
      Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
      LKML-Reference: <1267005265-27958-5-git-send-email-luca@luca-barbieri.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  11. 06 Feb 2010, 1 commit
  12. 23 Jan 2010, 1 commit
  13. 14 Jan 2010, 1 commit
    • x86-64: support native xadd rwsem implementation · bafaecd1
      Committed by Linus Torvalds
      This one is much faster than the spinlock-based fallback rwsem code,
      with certain artificial benchmarks having shown 300%+ improvement on
      threaded page faults etc.
      
      Again, note the 32767-thread limit here. So this really does need that
      whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
      extension on top of it, but that is conceptually a totally independent
      issue.
      
      NOT TESTED! The original patch that this was all based on was tested
      by KAMEZAWA Hiroyuki, but maybe I screwed up something when I
      created the cleaned-up series, so caveat emptor...
      
      Also note that it _may_ be a good idea to mark some more registers
      clobbered on x86-64 in the inline asms instead of saving/restoring them.
      They are inline functions, but they are only used in places where there
      are not a lot of live registers _anyway_, so doing for example the
      clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
      worse, and would make the slow-path code smaller.
      
      (Not that the slow-path really matters to that degree. Saving a few
      unnecessary registers is the _least_ of our problems when we hit the slow
      path. The instruction/cycle counting really only matters in the fast
      path).
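      For reference, a hedged C rendering of the xadd fast path (the real
      code is inline assembly; the bias constant and the trylock shape
      are illustrative):
      
        #include <stdbool.h>
        
        #define RWSEM_ACTIVE_READ_BIAS 1L   /* illustrative */
        
        static bool down_read_trylock_sketch(long *count)
        {
            /* __sync_fetch_and_add compiles to lock xadd on x86: one
             * instruction both takes the reader reference and reports
             * the old count. */
            long old = __sync_fetch_and_add(count, RWSEM_ACTIVE_READ_BIAS);
        
            if (old < 0) {
                /* a writer holds the lock: back out (the slow path in
                 * the real implementation) */
                __sync_fetch_and_sub(count, RWSEM_ACTIVE_READ_BIAS);
                return false;
            }
            return true;
        }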
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  14. 30 Dec 2009, 2 commits
    • x86-64: Modify memcpy()/memset() alternatives mechanism · 7269e881
      Committed by Jan Beulich
      In order to avoid unnecessary chains of branches, rather than
      implementing memcpy()/memset()'s access to their alternative
      implementations via a jump, patch the (larger) original function
      directly.
      
      The memcpy() part of this is slightly subtle: while alternative
      instruction patching does itself use memcpy(), with the
      replacement block being less than 64 bytes in size the main loop
      of the original function doesn't get used for copying memcpy_c()
      over memcpy(), and hence we can safely write over its beginning.
      
      Also note that the CFI annotations are fine for both variants of
      each of the functions.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B2BB8D30200007800026AF2@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86-64: Modify copy_user_generic() alternatives mechanism · 1b1d9258
      Committed by Jan Beulich
      In order to avoid unnecessary chains of branches, rather than
      implementing copy_user_generic() as a function consisting of
      just a single (possibly patched) branch, instead properly deal
      with patching call instructions in the alternative instructions
      framework, and move the patching into the callers.
      
      As a follow-on, one could also introduce something like
      __EXPORT_SYMBOL_ALT() to avoid patching call sites in modules.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 17 Dec 2009, 1 commit
    • x86, msr: msrs_alloc/free for CONFIG_SMP=n · 6ede31e0
      Committed by Borislav Petkov
      Randy Dunlap reported the following build error:
      
      "When CONFIG_SMP=n, CONFIG_X86_MSR=m:
      
      ERROR: "msrs_free" [drivers/edac/amd64_edac_mod.ko] undefined!
      ERROR: "msrs_alloc" [drivers/edac/amd64_edac_mod.ko] undefined!"
      
      This is due to the fact that <arch/x86/lib/msr.c> is conditioned on
      CONFIG_SMP and in the UP case we have only the stubs in the header.
      Fork off SMP functionality into a new file (msr-smp.c) and build
      msrs_{alloc,free} unconditionally.
      Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
      LKML-Reference: <20091216231625.GD27228@liondog.tnic>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  16. 12 Dec 2009, 1 commit
    • x86, msr: Add support for non-contiguous cpumasks · 50542251
      Committed by Borislav Petkov
      The current rd/wrmsr_on_cpus helpers assume that the supplied
      cpumasks are contiguous. However, there are machines out there
      like some K8 multinode Opterons which have a non-contiguous core
      enumeration on each node (e.g. cores 0,2 on node 0 instead of 0,1), see
      http://www.gossamer-threads.com/lists/linux/kernel/1160268.
      
      This patch fixes out-of-bounds writes (see URL above) by adding per-CPU
      msr structs which are used on the respective cores.
      
      Additionally, two helpers, msrs_{alloc,free}, are provided for use by
      the callers of the MSR accessors.
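      A hedged sketch of the allocation side (kernel-style C, simplified
      from the actual msr.c change):
      
        #include <linux/percpu.h>
        #include <asm/msr.h>
        
        /* One struct msr slot per possible CPU: each core indexes its
         * own slot by CPU number, so holes in the cpumask can no longer
         * push writes past the end of a caller-sized array. */
        struct msr *msrs_alloc_sketch(void)
        {
            return alloc_percpu(struct msr);
        }
        
        void msrs_free_sketch(struct msr *msrs)
        {
            free_percpu(msrs);
        }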
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <20091211171440.GD31998@aftab>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  17. 08 Dec 2009, 1 commit
  18. 07 Dec 2009, 1 commit
  19. 16 Nov 2009, 1 commit
    • x86: Add missing might_fault() checks to copy_{to,from}_user() · 3c93ca00
      Committed by Frederic Weisbecker
      On x86-64, copy_{to,from}_user() rely on assembly routines that
      never call might_fault(), causing us to miss various lockdep
      checks.
      
      This doesn't apply to __copy_{to,from}_user(), which handle these
      calls explicitly, nor is it a problem on x86-32, where
      copy_{to,from}_user() rely on the "__"-prefixed versions that also
      call might_fault().
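      The shape of the fix, as a hedged sketch (simplified; the actual
      change is in the x86-64 uaccess wrappers):
      
        #include <linux/kernel.h>
        #include <linux/uaccess.h>
        
        unsigned long copy_from_user_sketch(void *to,
                                            const void __user *from,
                                            unsigned long n)
        {
            might_fault();   /* lockdep annotation: we may sleep */
            return copy_user_generic(to, (__force const void *)from, n);
        }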
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1258382538-30979-1-git-send-email-fweisbec@gmail.com>
      [ v2: fix module export ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  20. 15 Nov 2009, 1 commit
    • x86-64: __copy_from_user_inatomic() adjustments · 14722485
      Committed by Jan Beulich
      This v2.6.26 commit:
      
          ad2fc2cd: x86: fix copy_user on x86
      
      rendered __copy_from_user_inatomic() identical to
      copy_user_generic(), yet didn't make the former just call the
      latter from an inline function.
      
      Furthermore, this v2.6.19 commit:
      
          b885808e: [PATCH] Add proper sparse __user casts to __copy_to_user_inatomic
      
      converted the return type of __copy_to_user_inatomic() from
      unsigned long to int, but didn't do the same to
      __copy_from_user_inatomic().
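      A hedged sketch of the resulting wrapper (simplified from the real
      header change; note the int return type, per the second
      adjustment):
      
        /* (kernel context: linux/uaccess.h) */
        static inline int
        __copy_from_user_inatomic_sketch(void *dst,
                                         const void __user *src,
                                         unsigned size)
        {
            return copy_user_generic(dst, (__force const void *)src, size);
        }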
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: <v.mayatskih@gmail.com>
      LKML-Reference: <4AFD5778020000780001F8F4@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  21. 04 Nov 2009, 1 commit
  22. 29 Oct 2009, 5 commits
    • x86: Add Intel FMA instructions to x86 opcode map · 3f7e454a
      Committed by Masami Hiramatsu
      Add the Intel FMA (fused multiply-add) instructions to the x86
      opcode map for the x86 instruction decoder.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204235.30545.33997.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: AVX instruction set decoder support · e0e492e9
      Committed by Masami Hiramatsu
      Add Intel AVX (Advanced Vector Extensions) instruction set support
      to the x86 instruction decoder. This adds the insn.vex_prefix field
      for storing VEX prefixes, and introduces some custom tags for
      expressing opcode attributes.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204226.30545.23451.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Add pclmulq to x86 opcode map · 82cb5702
      Committed by Masami Hiramatsu
      Add the pclmulq opcode to the x86 opcode map.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204219.30545.82039.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Merge INAT_REXPFX into INAT_PFX_* · 04d46c1b
      Committed by Masami Hiramatsu
      Merge INAT_REXPFX into the INAT_PFX_* macros and rename it to
      INAT_PFX_REX.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204211.30545.58090.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix SSE opcode map bug · 7f387d3f
      Committed by Masami Hiramatsu
      Fix superscript positions: some superscripts in the SSE opcode map
      were not placed in the correct position.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204204.30545.97296.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  23. 21 Oct 2009, 2 commits
  24. 17 Oct 2009, 2 commits
    • x86: Add AMD prefetch and 3DNow! opcodes to opcode map · d1baf5a5
      Committed by Masami Hiramatsu
      Add the AMD prefetch and 3DNow! opcodes, including FEMMS. Since
      3DNow! uses the last immediate byte as an opcode extension byte, the
      x86 insn decoder simply treats that extension byte as an immediate
      byte rather than as part of the opcode (insn_get_opcode() decodes
      the leading "0x0f 0x0f" bytes).
      
      Users interested in analyzing 3DNow! opcodes can still recover the
      real operation by examining that immediate byte, as sketched below.
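      A hedged sketch of that recovery (standalone C, not the kernel
      decoder's API):
      
        #include <stdint.h>
        #include <stddef.h>
        
        /* 3DNow! instructions look like "0x0f 0x0f <modrm...> <suffix>";
         * the decoder reports the trailing suffix byte as an imm8. */
        static int is_3dnow(const uint8_t *insn, size_t len)
        {
            return len >= 3 && insn[0] == 0x0f && insn[1] == 0x0f;
        }
        
        static uint8_t amd3dnow_opcode(const uint8_t *insn, size_t len)
        {
            return insn[len - 1];   /* e.g. 0xb7 = pmulhrw */
        }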
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20091017000744.16556.27881.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Add MMX/SSE opcode groups to opcode map · 8c95bc3e
      Committed by Masami Hiramatsu
      Add the missing MMX/SSE opcode groups to the x86 opcode map.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20091017000736.16556.29061.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  25. 03 Oct 2009, 1 commit
    • x86: Add VIA processor instructions in opcodes decoder · c0b11d3a
      Committed by Masami Hiramatsu
      Add the VIA processor's Padlock instructions (MONTMUL, XSHA1,
      XSHA256), as parts of the kernel may use them.
      
      This fixes the following crash in opcodes decoder selftests:
      
       make[2]: `scripts/unifdef' is up to date.
         TEST    posttest
       Error: c145cf71:        f3 0f a6 d0             repz xsha256
       Error: objdump says 4 bytes, but insn_get_length() says 3 (attr:0)
       make[1]: *** [posttest] Error 2
       make: *** [bzImage] Error 2
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20090925182037.10157.3180.stgit@omoto>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  26. 01 Oct 2009, 3 commits
    • x86: Turn the copy_from_user check into an (optional) compile time warning · 4a312769
      Committed by Arjan van de Ven
      A previous patch added the buffer size check to copy_from_user().
      
      One of the things learned from analyzing the results of the previous
      patch is that, in general, gcc is really good at proving that the
      code contains sufficient security checks and needs no runtime check.
      But in the cases where gcc could not prove this, a relatively high
      percentage turned out to be real security issues.
      
      This patch turns the "gcc cannot prove" case into a compile-time
      warning, as long as a sufficiently new gcc that supports this is in
      use. The objective is that these warnings prompt developers to check
      new cases before a security hole enters a Linux kernel release.
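      A hedged sketch of the mechanism (simplified; whether the warning
      fires depends on the optimizer keeping or killing the call):
      
        #include <string.h>
        
        /* gcc's warning attribute triggers at compile time if a call
         * to the function survives optimization. */
        extern void copy_from_user_overflow(void)
            __attribute__((warning("copy_from_user() buffer size is not provably correct")));
        
        static inline unsigned long
        copy_from_user_sketch(void *to, const void *from, unsigned long n)
        {
            unsigned long sz = __builtin_object_size(to, 0); /* -1 if unknown */
        
            if (sz != (unsigned long)-1 && n > sz)
                copy_from_user_overflow();   /* kept only if unprovable */
            else
                memcpy(to, from, n);         /* stand-in for the real copy */
            return 0;
        }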
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jan Beulich <jbeulich@novell.com>
      LKML-Reference: <20090930130523.348ae6c4@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=y · 04edbdef
      Committed by Eric Dumazet
      Conditionally compile cmpxchg8b_emu.o and EXPORT_SYMBOL(cmpxchg8b_emu).
      
      This reduces the kernel size a bit.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4AC43E7E.1000600@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Provide an alternative() based cmpxchg64() · 79e1dd05
      Committed by Arjan van de Ven
      cmpxchg64() today generates, to quote Linus, "barf bag" code.
      
      cmpxchg64() is about to get used in the scheduler to fix a bug there,
      but it's a prerequisite that cmpxchg64() first be made non-sucking.
      
      This patch turns cmpxchg64() into an efficient implementation that
      uses the alternative() mechanism to just use the raw instruction on
      all modern systems.
      
      Note: the fallback is NOT smp safe, just like the current fallback
      is not SMP safe. (Interested parties with i486 based SMP systems
      are welcome to submit fix patches for that.)
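      A hedged C rendering of the idea (the real implementation is an
      alternative()-patched asm sequence with no runtime branch;
      cmpxchg8b_emu_sketch and cpu_has_cx8 are illustrative stand-ins):
      
        #include <stdint.h>
        
        extern uint64_t cmpxchg8b_emu_sketch(uint64_t *p, uint64_t old,
                                             uint64_t new);
        extern int cpu_has_cx8;   /* illustrative feature flag */
        
        static inline uint64_t
        cmpxchg64_sketch(uint64_t *p, uint64_t old, uint64_t new)
        {
            if (cpu_has_cx8)
                /* compiles to lock cmpxchg8b on 32-bit x86 */
                return __sync_val_compare_and_swap(p, old, new);
            return cmpxchg8b_emu_sketch(p, old, new);
        }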
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      [ fixed asm constraint bug ]
      Fixed-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090930170754.0886ff2e@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  27. 26 Sep 2009, 1 commit
    • x86: Use __builtin_object_size() to validate the buffer size for copy_from_user() · 9f0cf4ad
      Committed by Arjan van de Ven
      gcc (4.x) supports the __builtin_object_size() builtin, which
      reports the size of the object a pointer points to, when that size
      is known at compile time. If the buffer size is not known at compile
      time, a constant -1 is returned.
      
      This patch uses this feature to add a sanity check to
      copy_from_user(); if the target buffer is known to be smaller than
      the copy size, the copy is aborted and a WARNing is emitted in
      memory debug mode.
      
      These extra checks compile away when the object size is not known,
      or if both the buffer size and the copy length are constants.
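      A hedged sketch of the check (WARN() replaced by fprintf so the
      example is self-contained; the real check sits in the uaccess
      wrappers):
      
        #include <stdio.h>
        #include <string.h>
        
        static inline unsigned long
        copy_from_user_checked(void *to, const void *from, unsigned long n)
        {
            /* (unsigned long)-1 means "size unknown at compile time" */
            unsigned long sz = __builtin_object_size(to, 0);
        
            if (sz != (unsigned long)-1 && n > sz) {
                fprintf(stderr, "copy_from_user: buffer overflow detected\n");
                return n;            /* report all bytes as uncopied */
            }
            memcpy(to, from, n);     /* stand-in for the real user copy */
            return 0;
        }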
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      LKML-Reference: <20090926143301.2c396b94@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>