  1. 26 Jun, 2015 1 commit
    • compiler-intel: fix wrong compiler barrier() macro · b86a50c3
      Daniel Borkmann committed
      Cleanup commit 73679e50 ("compiler-intel.h: Remove duplicate
      definition") removed the double definition of __memory_barrier()
      intrinsics.
      
      However, in doing so, it also removed the preceding #undef barrier by
      accident, meaning that the barrier() macro from compiler-gcc.h with
      inline asm is still in place, as __GNUC__ is defined.
      
      As a result, barrier() can never be defined as __memory_barrier() in
      compiler.h, since it already has a definition in place. And if we trust
      the comment in compiler-intel.h, ecc doesn't support gcc-specific asm
      statements.
      
      I don't have an ecc at hand (unsure if that's still used in the field?)
      and only found this by accident during code review. A revert of that
      cleanup would be the simplest option.
      
      Fixes: 73679e50 ("compiler-intel.h: Remove duplicate definition")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
      Cc: Pranith Kumar <bobby.prani@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: mancha security <mancha1@zoho.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
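
The effect of the missing #undef described in this commit can be sketched with a few preprocessor lines. This is a simplified stand-in for the header chain, not the kernel's actual headers: `SKETCH_IS_ECC` and `barrier_uses_inline_asm()` are hypothetical names introduced only for illustration, and the three sections merely imitate the roles of compiler-gcc.h, compiler-intel.h, and compiler.h.

```c
#include <string.h>

/* Stand-in for compiler-gcc.h: GCC-style barrier built on inline asm. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/*
 * Stand-in for the ecc section of compiler-intel.h. The #undef below plays
 * the role of the line that the cleanup dropped. SKETCH_IS_ECC is a
 * hypothetical switch; it is left undefined here to model the broken state.
 */
#ifdef SKETCH_IS_ECC
#undef barrier
#endif

/* Stand-in for the generic fallback in compiler.h. */
#ifndef barrier
#define barrier() __memory_barrier()
#endif

/* Stringify the macro expansion so the chosen definition can be inspected. */
#define SKETCH_STR(x) #x
#define SKETCH_XSTR(x) SKETCH_STR(x)

/* Returns 1 if barrier() still expands to the inline-asm variant. */
static int barrier_uses_inline_asm(void)
{
    return strstr(SKETCH_XSTR(barrier()), "__asm__") != NULL;
}
```

With `SKETCH_IS_ECC` undefined, the `#ifndef barrier` fallback is skipped and the inline-asm definition stays in effect, which is the situation the commit message describes. Defining `SKETCH_IS_ECC` before this snippet would let the fallback take over instead.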
  2. 04 May, 2015 1 commit
    • lib: make memzero_explicit more robust against dead store elimination · 7829fb09
      Daniel Borkmann committed
      In commit 0b053c95 ("lib: memzero_explicit: use barrier instead
      of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
      case LTO would decide to inline memzero_explicit() and eventually
      find out it could be eliminated as a dead store.
      
      While using barrier() works well for gcc, recent efforts by the
      LLVMLinux people aim to use llvm as an alternative to gcc, and there
      Stephan found in a simple stand-alone user space example that llvm
      could nevertheless optimize and thus eliminate the memset().
      A similar issue has been observed in the referenced llvm bug report,
      which is regarded as not-a-bug.
      
      Based on some experiments, icc is a special case: while it doesn't
      seem to eliminate the memset(), it could do so with a custom
      implementation of it, which would then lead to findings similar to
      those with llvm.
      
      The fix in this patch now works for all three compilers (also tested
      with more aggressive optimization levels). Arguably, in the current
      kernel tree it's more of a theoretical issue, but imho, it's better
      to be pedantic about it.
      
      It's clearly visible with gcc/llvm though, with the code below: if we
      had used only barrier() here, llvm would have omitted the clearing,
      which is not the case with the barrier_data() variant:
      
        static inline void memzero_explicit(void *s, size_t count)
        {
          memset(s, 0, count);
          barrier_data(s);
        }
      
        int main(void)
        {
          char buff[20];
          memzero_explicit(buff, sizeof(buff));
          return 0;
        }
      
        $ gcc -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
         0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
         0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
         0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
         0x000000000040041f <+31>: xor   %eax,%eax
         0x0000000000400421 <+33>: retq
        End of assembler dump.
      
        $ clang -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
         0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
         0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
         0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
         0x0000000000400505 <+21>: xor    %eax,%eax
         0x0000000000400507 <+23>: retq
        End of assembler dump.
      
      Since gcc, clang, and also icc all define __GNUC__, it's sufficient to
      define barrier_data() in compiler-gcc.h only for it to be picked up. As
      a fallback for otherwise unsupported compilers, we define it as
      barrier(). The same applies to ecc, which does not support gcc inline
      asm.
      
      Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
      Reported-by: Stephan Mueller <smueller@chronox.de>
      Tested-by: Stephan Mueller <smueller@chronox.de>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Stephan Mueller <smueller@chronox.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: mancha security <mancha1@zoho.com>
      Cc: Mark Charlebois <charlebm@gmail.com>
      Cc: Behan Webster <behanw@converseincode.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
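
The test program in this commit message uses barrier_data() without showing its definition. The sketch below fills that in for user space under stated assumptions: the two macro bodies are written to match the behaviour the message describes (a plain memory clobber versus a clobber that also takes the pointer as an input) and should be read as an approximation for GCC-compatible compilers, not as a copy of the kernel header. `wipe_and_check()` is a hypothetical helper added only to exercise the function.

```c
#include <stddef.h>
#include <string.h>

/* Plain compiler barrier: clobbers memory but names no particular object. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/*
 * Data barrier: additionally hands the pointer to the empty asm, so the
 * compiler has to assume the asm may read the buffer behind it and cannot
 * treat the preceding memset() as a dead store.
 */
#define barrier_data(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

static inline void memzero_explicit(void *s, size_t count)
{
    memset(s, 0, count);
    barrier_data(s);
}

/* Hypothetical helper: fill a local buffer, wipe it, verify every byte. */
static int wipe_and_check(void)
{
    unsigned char key[32];
    size_t i;
    int all_zero = 1;

    memset(key, 0xA5, sizeof(key));
    memzero_explicit(key, sizeof(key));
    for (i = 0; i < sizeof(key); i++)
        if (key[i] != 0)
            all_zero = 0;
    return all_zero;
}
```

A functional check like `wipe_and_check()` cannot prove that the stores survive optimization when nothing reads the buffer afterwards; for that, inspecting the disassembly as done in the commit message remains the reliable approach.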
  3. 16 Apr, 2014 1 commit
  4. 11 Dec, 2013 1 commit
  5. 05 Dec, 2013 1 commit
    • crypto: more robust crypto_memneq · fe8c8a12
      Cesar Eduardo Barros committed
      Disabling compiler optimizations can be fragile, since a new
      optimization could be added to -O0 or -Os that breaks the assumptions
      the code is making.
      
      Instead of disabling compiler optimizations, use a dummy inline assembly
      (based on RELOC_HIDE) to block the problematic kinds of optimization,
      while still allowing other optimizations to be applied to the code.
      
      The dummy inline assembly is added after every OR, and has the
      accumulator variable as its input and output. The compiler is forced to
      assume that the dummy inline assembly could both depend on the
      accumulator variable and change the accumulator variable, so it is
      forced to compute the value correctly before the inline assembly, and
      cannot assume anything about its value after the inline assembly.
      
      This change should be enough to make crypto_memneq work correctly (with
      data-independent timing) even if it is inlined at its call sites. That
      can be done later in a followup patch.
      
      Compile-tested on x86_64.
      
      Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.eti.br>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
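
The dummy inline assembly described in this commit can be sketched as follows. This is a minimal byte-wise illustration, not the kernel's crypto_memneq: the macro body is an assumption modelled on the description (an empty asm with the accumulator as both input and output), and `memneq_sketch()` is a hypothetical name. It relies on GCC-style inline asm.

```c
#include <stddef.h>

/*
 * Empty asm that takes var as input and output. The compiler must have the
 * real value ready before this point and may assume nothing about it after.
 */
#define OPTIMIZER_HIDE_VAR(var) __asm__("" : "=r"(var) : "0"(var))

/* Returns 0 if the buffers are equal, nonzero otherwise. */
static unsigned long memneq_sketch(const void *a, const void *b, size_t size)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    unsigned long neq = 0;

    while (size > 0) {
        /* Accumulate differences without branching on the data. */
        neq |= (unsigned long)(*pa ^ *pb);
        /* Hide the accumulator after every OR, as the commit describes. */
        OPTIMIZER_HIDE_VAR(neq);
        pa++;
        pb++;
        size--;
    }
    return neq;
}
```

Because the accumulator is opaque after every OR, the compiler has no basis for an early exit once `neq` becomes nonzero, which is the kind of data-dependent shortcut the commit wants to rule out.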
  6. 06 Dec, 2012 1 commit
    • byteorder: allow arch to opt to use GCC intrinsics for byteswapping · cf66bb93
      David Woodhouse committed
      Since GCC 4.4, there have been __builtin_bswap32() and __builtin_bswap64()
      intrinsics. A __builtin_bswap16() came a little later (4.6 for PowerPC,
      4.8 for other platforms).
      
      By using these instead of the inline assembler that most architectures
      have in their __arch_swabXX() macros, we let the compiler see what's
      actually happening. The resulting code should be at least as good, and
      much *better* in the cases where it can be combined with a nearby load
      or store, using a load-and-byteswap or store-and-byteswap instruction
      (e.g. lwbrx/stwbrx on PowerPC, movbe on Atom).
      
      When GCC is sufficiently recent *and* the architecture opts in to using
      the intrinsics by setting CONFIG_ARCH_USE_BUILTIN_BSWAP, they will be
      used in preference to the __arch_swabXX() macros. An architecture which
      does not set ARCH_USE_BUILTIN_BSWAP will continue to use its own
      hand-crafted macros.
      
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
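
The opt-in selection described in this commit can be sketched roughly as below. `SKETCH_USE_BUILTIN_BSWAP`, `sketch_swab32()`, and `generic_swab32()` are hypothetical names standing in for the config symbol, the public swab macro, and an architecture's hand-crafted fallback; the compiler version checks that the commit mentions are left out for brevity.

```c
#include <stdint.h>

/* Hypothetical stand-in for an arch selecting CONFIG_ARCH_USE_BUILTIN_BSWAP. */
#define SKETCH_USE_BUILTIN_BSWAP 1

/* Portable fallback, playing the role of a hand-crafted __arch_swab32(). */
static inline uint32_t generic_swab32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* Prefer the intrinsic when opted in and the compiler is GCC-compatible. */
#if SKETCH_USE_BUILTIN_BSWAP && defined(__GNUC__)
#define sketch_swab32(x) __builtin_bswap32(x)
#else
#define sketch_swab32(x) generic_swab32(x)
#endif
```

Both paths produce the same value; the difference is that the intrinsic stays visible to the optimizer, so a swap next to a load or store can be folded into a single instruction on targets that have one.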
  7. 17 Oct, 2007 1 commit
  8. 08 May, 2007 2 commits
  9. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!