1. 08 June 2015, 3 commits
    • x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32 · b2502b41
      By Ingo Molnar
      The 'system_call' entry points differ starkly between native 32-bit and 64-bit
      kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on
      64-bit it's the SYSCALL entry point.
      
      This is pretty confusing when looking at generic code, and it also obscures
      the nature of the entry point at the assembly level.
      
      So untangle this by splitting the name into its two uses:
      
      	system_call (32) -> entry_INT80_32
      	system_call (64) -> entry_SYSCALL_64
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b2502b41
    • x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat · 4c8cd0c5
      By Ingo Molnar
      
      So the SYSENTER instruction is pretty quirky and it has different behavior
      depending on bitness and CPU maker.
      
      Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
      in both of the cases.
      
      Split the name into its two uses:
      
      	ia32_sysenter_target (32)    -> entry_SYSENTER_32
      	ia32_sysenter_target (64)    -> entry_SYSENTER_compat
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4c8cd0c5
    • x86/asm/entry: Rename compat syscall entry points · 2cd23553
      By Ingo Molnar
      Rename the following system call entry points:
      
      	ia32_cstar_target       -> entry_SYSCALL_compat
      	ia32_syscall            -> entry_INT80_compat
      
      The generic naming scheme for x86 system call entry points is:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2cd23553
  2. 07 June 2015, 1 commit
    • x86/asm/entry/64/compat: Rename ia32entry.S -> entry_64_compat.S · 138bd56a
      By Ingo Molnar
      So we now have the following system entry code related
      files, which define the system call instruction and
      other entry paths:
      
         entry_32.S            # 32-bit binaries on 32-bit kernels
         entry_64.S            # 64-bit binaries on 64-bit kernels
         entry_64_compat.S     # 32-bit binaries on 64-bit kernels
      
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      138bd56a
  3. 05 June 2015, 9 commits
    • x86/asm/entry/32: Remove unnecessary optimization in stub32_clone · 7a5a9824
      By Denys Vlasenko
      Really swap arguments #4 and #5 in stub32_clone instead of
      "optimizing" it into a move.
      
      Yes, tls_val is currently unused. Yes, on some CPUs XCHG is a
      little bit more expensive than MOV. But a cycle or two on an
      expensive syscall like clone() is way below the noise floor, and
      this optimization is simply not worth the obfuscation of logic.
      
      [ There's also ongoing work on the clone() ABI by Josh Triplett
        that will depend on this change later on. ]
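
      A minimal sketch of the change (illustrative only, not the verbatim
      kernel diff; the register roles and the external sys_clone symbol
      are assumptions based on the description above):

         /*
          * Sketch: the 32-bit clone() ABI passes tls as argument #4 and
          * child_tidptr as argument #5, while the 64-bit sys_clone()
          * expects them the other way around, so the two registers
          * really have to be exchanged rather than copied one way.
          */
         xchg   %r8, %rcx              /* was: mov %r8, %rcx */
         jmp    sys_clone              /* assumed external symbol */
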
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1433339930-20880-2-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7a5a9824
    • x86/asm/entry/32: Explain the stub32_clone logic · 5cdc683b
      By Denys Vlasenko
      The reason for copying %r8 to %rcx is quite non-obvious.
      Add a comment which explains why it is done.
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1433339930-20880-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5cdc683b
    • x86/asm/entry/32: Improve code readability · 54ad726c
      By Ingo Molnar
      Make the 64-bit compat 32-bit syscall entry code a bit more readable:
      
       - eliminate whitespace noise
      
       - use consistent vertical spacing
      
       - use consistent assembly coding style similar to entry_64.S
      
       - fix various comments
      
      No code changed:
      
      arch/x86/entry/ia32entry.o:
      
         text	   data	    bss	    dec	    hex	filename
         1391	      0	      0	   1391	    56f	ia32entry.o.before
         1391	      0	      0	   1391	    56f	ia32entry.o.after
      
      md5:
         f28501dcc366e68b557313942c6496d6  ia32entry.o.before.asm
         f28501dcc366e68b557313942c6496d6  ia32entry.o.after.asm
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      54ad726c
    • x86/asm/entry/32: Do not use R9 in SYSCALL32 entry point · 53e9accf
      By Denys Vlasenko
      SYSENTER and SYSCALL 32-bit entry points differ in handling of
      arg2 and arg6.
      
      SYSENTER:
       * ecx  arg2
       * ebp  user stack
       * 0(%ebp) arg6
      
      SYSCALL:
       * ebp  arg2
       * esp  user stack
       * 0(%esp) arg6
      
      Sysenter code loads 0(%ebp) to %ebp right away.
      (This destroys %ebp. It means we do not preserve it on return.
      It's not causing problems since userspace VDSO code does not
      depend on it, and SYSENTER insn can't be sanely used outside of
      VDSO).
      
      Syscall code loads 0(%ebp) to %r9. This allows us to eliminate one
      MOV insn (r9 is the register where arg6 should be for the 64-bit
      ABI), but on audit/ptrace code paths this requires juggling of r9
      and ebp: (1) ptrace expects arg6 to be in pt_regs->bp;
      (2) r9 is a callee-clobbered register and needs to be
      saved/restored around calls to C functions.
      
      This patch changes syscall code to load 0(%ebp) to %ebp, making
      it more similar to sysenter code. It's a bit smaller:
      
         text    data     bss     dec     hex filename
         1407       0       0    1407     57f ia32entry.o.before
         1391       0       0    1391     56f ia32entry.o
      
      To preserve ABI compat, we restore ebp on exit.
      
      Run-tested.
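
      Roughly, the change amounts to the following (a sketch under the
      assumption that %r8 still holds the saved user stack pointer at
      this point; this is not the verbatim diff):

         /* Load arg6 from the user stack into %ebp, as the SYSENTER
          * path already does, instead of into the callee-clobbered %r9;
          * %ebp is then restored before returning to user space.
          */
         movl   (%r8), %ebp            /* was: movl (%r8), %r9d */
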
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1433336169-18964-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      53e9accf
    • x86/asm/entry/32: Open-code LOAD_ARGS32 · 73cbf687
      By Denys Vlasenko
      This macro is small, has only three callsites, and one of them
      is slightly different using a conditional parameter.
      
      A few saved lines aren't worth the resulting obfuscation.
      
      Generated machine code is identical.
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1433271842-9139-2-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      73cbf687
    • x86/asm/entry/32: Open-code CLEAR_RREGS · ef0cd5dc
      By Denys Vlasenko
      This macro is small, has only four callsites, and one of them is
      slightly different using a conditional parameter.
      
      A few saved lines aren't worth the resulting obfuscation.
      
      Generated machine code is identical.
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      [ Added comments. ]
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1433271842-9139-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ef0cd5dc
    • x86/asm/entry/32: Simplify the zeroing of pt_regs->r8..r11 in the int80 code path · 61b1e3e7
      By Denys Vlasenko
      32-bit syscall entry points do not save the complete pt_regs struct,
      they leave some fields uninitialized. However, they must be
      careful to not leak uninitialized data in pt_regs->r8..r11 to
      ptrace users.
      
      The CLEAR_RREGS macro is used to zero these fields out when needed.
      
      However, in the int80 code path this zeroing is unconditional.
      This patch simplifies it by storing zeroes there right away,
      when pt_regs is constructed on stack.
      
      This uses shorter instructions:
      
         text    data     bss     dec     hex filename
         1423       0       0    1423     58f ia32entry.o.before
         1407       0       0    1407     57f ia32entry.o
      
      Compile-tested.
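
      Conceptually, the int80 path now pushes literal zeroes while the
      pt_regs frame is being built, instead of clearing the slots in a
      separate pass (a sketch, slot order illustrative, not the verbatim
      diff):

         pushq  $0                     /* pt_regs->r8  = 0, 2-byte insn */
         pushq  $0                     /* pt_regs->r9  = 0 */
         pushq  $0                     /* pt_regs->r10 = 0 */
         pushq  $0                     /* pt_regs->r11 = 0 */
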
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1433266510-2938-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      61b1e3e7
    • x86/asm/entry/64: Remove pointless jump to irq_return · 5ca6f70f
      By Andy Lutomirski
      INTERRUPT_RETURN turns into a jmp instruction.  There's no need
      for extra indirection.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: <linux-kernel@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/2f2318653dbad284a59311f13f08cea71298fd7c.1433449436.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5ca6f70f
    • x86/asm/msr: Make wrmsrl_safe() a function · cf991de2
      By Andy Lutomirski
      The wrmsrl_safe macro performs invalid shifts if the value
      argument is 32 bits.  This makes it unnecessarily awkward to
      write code that puts an unsigned long into an MSR.
      
      Convert it to a real inline function.
      
      For inspiration, see:
      
        7c74d5b7 ("x86/asm/entry/64: Fix MSR_IA32_SYSENTER_CS MSR value").
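
      For background, WRMSR itself takes the MSR number in %ecx and the
      64-bit value split across %edx:%eax, which is why the wrapper has
      to compute 'val >> 32' for the high half, the very shift that is
      invalid when the macro argument is a 32-bit variable. A sketch
      (the MSR number and values below are purely illustrative):

         movl   $0x174, %ecx           /* MSR_IA32_SYSENTER_CS, illustrative */
         movl   $0x10, %eax            /* low 32 bits of the value */
         xorl   %edx, %edx             /* high 32 bits, i.e. value >> 32 */
         wrmsr
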
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: <linux-kernel@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      [ Applied small improvements. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cf991de2
  4. 04 June 2015, 7 commits
    • x86/asm/entry: Move the vsyscall code to arch/x86/entry/vsyscall/ · 00398a00
      By Ingo Molnar
      The vsyscall code is entry code too, so move it to arch/x86/entry/vsyscall/.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      00398a00
    • x86/asm/entry: Move the arch/x86/syscalls/ definitions to arch/x86/entry/syscalls/ · 1f57d5d8
      By Ingo Molnar
      The build-time generated syscall definitions are entry code related, so move
      them into the arch/x86/entry/ directory.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1f57d5d8
    • x86/asm/entry: Move arch/x86/include/asm/calling.h to arch/x86/entry/ · d36f9479
      By Ingo Molnar
      asm/calling.h is private to the entry code, so make this more apparent
      by moving it to the new arch/x86/entry/ directory.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d36f9479
    • x86/asm/entry: Move the 'thunk' functions to arch/x86/entry/ · e6b93f4e
      By Ingo Molnar
      These are all calling x86 entry code functions, so move them close
      to other entry code.
      
      Change lib-y to obj-y: there's no real difference between the two
      as we don't really drop any of them during the linking stage, and
      obj-y is the more common approach for core kernel object code.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e6b93f4e
    • x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/ · d603c8e1
      By Ingo Molnar
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d603c8e1
    • x86/asm/entry: Move the compat syscall entry code to arch/x86/entry/ · 19a433f4
      By Ingo Molnar
      Move the ia32entry.S file over into arch/x86/entry/.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      19a433f4
    • x86/asm/entry: Move entry_64.S and entry_32.S to arch/x86/entry/ · 905a36a2
      By Ingo Molnar
      Create a new directory hierarchy for the low level x86 entry code:
      
          arch/x86/entry/*
      
      This will host all the low level glue that is currently scattered
      all across arch/x86/.
      
      Start with entry_64.S and entry_32.S.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      905a36a2
  5. 02 June 2015, 3 commits
    • x86/asm/entry/64: Fold identical code paths · 2f63b9db
      By Jan Beulich
      retint_kernel doesn't require %rcx to be pointing to thread info
      (anymore?), and the code on the two alternative paths is - not
      really surprisingly - identical.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/556C664F020000780007FB64@mail.emea.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2f63b9db
    • x86/asm/entry/64: Use negative immediates for stack adjustments · 2bf557ea
      By Jan Beulich
      Doing so allows adjustments by 128 bytes (occurring for
      'REMOVE_PT_GPREGS_FROM_STACK 8' uses) to be expressed with a
      single-byte immediate.
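
      For example (a sketch of the encoding difference, not kernel code):

         addq   $128, %rsp             /* 48 81 c4 80 00 00 00  (imm32) */
         subq   $-128, %rsp            /* 48 83 ec 80           (imm8)  */

      Signed 8-bit immediates cover -128..+127, so +128 needs a 4-byte
      immediate while -128 fits in a single byte.
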
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/556C660F020000780007FB60@mail.emea.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2bf557ea
    • x86/debug: Remove perpetually broken, unmaintainable dwarf annotations · 131484c8
      By Ingo Molnar
      So the dwarf2 annotations in low level assembly code have
      become an increasing hindrance: unreadable, messy macros
      mixed into some of the most security sensitive code paths
      of the Linux kernel.
      
      These debug info annotations don't even buy the upstream
      kernel anything: dwarf driven stack unwinding has caused
      problems in the past so it's out of tree, and the upstream
      kernel only uses the much more robust framepointers based
      stack unwinding method.
      
      In addition to that there's a steady, slow bitrot going
      on with these annotations, requiring frequent fixups.
      There's no tooling and no functionality upstream that
      keeps it correct.
      
      So burn down the sick forest, allowing new, healthier growth:
      
         27 files changed, 350 insertions(+), 1101 deletions(-)
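
      For reference, this is the flavor of boilerplate being removed (a
      sketch using the old dwarf2.h macro names; not an excerpt of any
      particular file):

         pushq  %rbx
         CFI_ADJUST_CFA_OFFSET 8
         CFI_REL_OFFSET rbx, 0

         popq   %rbx
         CFI_ADJUST_CFA_OFFSET -8
         CFI_RESTORE rbx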
      
      Someone who has the willingness and time to do this
      properly can attempt to reintroduce dwarf debuginfo in x86
      assembly code plus dwarf unwinding from first principles,
      with the following conditions:
      
       - it should be maximally readable, and maximally low-key to
         'ordinary' code reading and maintenance.
      
       - find a build time method to insert dwarf annotations
         automatically in the most common cases, for pop/push
         instructions that manipulate the stack pointer. This could
         be done for example via a preprocessing step that just
         looks for common patterns - plus special annotations for
         the few cases where we want to depart from the default.
         We have hundreds of CFI annotations, so automating most of
         that makes sense.
      
       - it should come with build tooling checks that ensure that
         CFI annotations are sensible. We've seen such efforts from
         the framepointer side, and there's no reason it couldn't be
         done on the dwarf side.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      131484c8
  6. 24 May 2015, 1 commit
    • x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers · cdeb6048
      By Andy Lutomirski
      The early_idt_handlers asm code generates an array of entry
      points spaced nine bytes apart.  It's not really clear from that
      code or from the places that reference it what's going on, and
      the code only works in the first place because GAS never
      generates two-byte JMP instructions when jumping to global
      labels.
      
      Clean up the code to generate the correct array stride (member size)
      explicitly. This should be considerably more robust against
      screw-ups, as GAS will warn if a .fill directive has a negative
      count.  Using '. =' to advance would have been even more robust
      (it would generate an actual error if it tried to move
      backwards), but it would pad with nulls, confusing anyone who
      tries to disassemble the code.  The new scheme should be much
      clearer to future readers.
      
      While we're at it, improve the comments and rename the array and
      common code.
      
      Binutils may start relaxing jumps to non-weak labels.  If so,
      this change will fix our build, and we may need to backport this
      change.
      
      Before, on x86_64:
      
        0000000000000000 <early_idt_handlers>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 00 00 00 00          jmpq   9 <early_idt_handlers+0x9>
                                5: R_X86_64_PC32        early_idt_handler-0x4
        ...
          48:   66 90                   xchg   %ax,%ax
          4a:   6a 08                   pushq  $0x8
          4c:   e9 00 00 00 00          jmpq   51 <early_idt_handlers+0x51>
                                4d: R_X86_64_PC32       early_idt_handler-0x4
        ...
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   e9 00 00 00 00          jmpq   120 <early_idt_handler>
                                11c: R_X86_64_PC32      early_idt_handler-0x4
      
      After:
      
        0000000000000000 <early_idt_handler_array>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 14 01 00 00          jmpq   11d <early_idt_handler_common>
        ...
          48:   6a 08                   pushq  $0x8
          4a:   e9 d1 00 00 00          jmpq   120 <early_idt_handler_common>
          4f:   cc                      int3
          50:   cc                      int3
        ...
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   eb 03                   jmp    120 <early_idt_handler_common>
         11d:   cc                      int3
         11e:   cc                      int3
         11f:   cc                      int3
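
      A sketch of the fixed-stride scheme (names and the per-vector
      details are illustrative, not the verbatim kernel code):

         .set   STUB_STRIDE, 9                 /* assumed bytes per member */
         .globl idt_stub_array_sketch
      idt_stub_array_sketch:
         vector = 0
         .rept  32
         pushq  $0                             /* dummy error code */
         pushq  $vector                        /* vector number */
         jmp    idt_stub_common_sketch
         vector = vector + 1
         .fill  idt_stub_array_sketch + vector*STUB_STRIDE - ., 1, 0xcc
         .endr

      idt_stub_common_sketch:
         ret                                   /* real code dispatches here */

      The .fill pads each member with int3 (0xcc) up to the next 9-byte
      slot and fails the build with a negative count if a stub ever
      outgrows its slot.
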
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Binutils <binutils@sourceware.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H.J. Lu <hjl.tools@gmail.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cdeb6048
  7. 17 May 2015, 3 commits
    • x86/asm/entry/64: Use shorter MOVs from segment registers · adeb5537
      By Denys Vlasenko
      The "movw %ds,%cx" instruction needs a 0x66 prefix, while
      "movl %ds,%ecx" does not.
      
      The difference is that the latter form (on 64-bit CPUs)
      overwrites the entire %ecx, not only its lower half.
      
      But subsequent code doesn't depend on the value of upper
      half of %ecx, so we can safely use the shorter instruction.
      
      The new code is also faster than the old one - now we don't
      depend on the old value of %ecx, but this code fragment is
      not performance-critical so it does not matter much.
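
      The size difference at the instruction level (sketch):

         movw   %ds, %cx               /* 66 8c d9  - needs the 0x66 prefix */
         movl   %ds, %ecx              /* 8c d9     - two bytes, writes all of %ecx */
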
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1431722346-26585-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      adeb5537
    • x86/asm/head*.S: Change global labels to local · e839004b
      By Borislav Petkov
      Make the disassembly look less confusing:
      
        -- head_64.o.before.asm
        ++ head_64.o.after.asm
         0000000000000120 <early_idt_handler>:
          120:	fc                   	cld
          121:	83 3c 24 02          	cmpl   $0x2,(%rsp)
        - 125:	0f 84 9d 00 00 00    	je     1c8 <is_nmi>
        + 125:	0f 84 9d 00 00 00    	je     1c8 <early_idt_handler+0xa8>
          12b:	83 3d 00 00 00 00 02 	cmpl   $0x2,0x0(%rip)        # 132 <early_idt_handler+0x12>
          132:	74 7e                	je     1b2 <early_idt_handler+0x92>
          134:	ff 05 00 00 00 00    	incl   0x0(%rip)        # 13a <early_idt_handler+0x1a>
        @@ -1198,9 +1198,7 @@ Disassembly of section .init.text:
          1bf:	5a                   	pop    %rdx
          1c0:	59                   	pop    %rcx
          1c1:	58                   	pop    %rax
        - 1c2:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # 1c8 <is_nmi>
        -
        -00000000000001c8 <is_nmi>:
        + 1c2:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # 1c8 <early_idt_handler+0xa8>
          1c8:	48 83 c4 10          	add    $0x10,%rsp
          1cc:	48 cf                	iretq
      
        -- head_32.o.before.asm
        ++ head_32.o.after.asm
         0000016c <early_idt_handler>:
          16c:  fc                      cld
          16d:  83 3c 24 02             cmpl   $0x2,(%esp)
        - 171:  74 73                   je     1e6 <is_nmi>
        + 171:  74 73                   je     1e6 <ex_entry+0xc>
          173:  36 83 3d 00 00 00 00    cmpl   $0x2,%ss:0x0
          17a:  02
          17b:  74 5a                   je     1d7 <hlt_loop>
        @@ -483,8 +483,6 @@ Disassembly of section .init.text:
          1dd:  59                      pop    %ecx
          1de:  58                      pop    %eax
          1df:  36 ff 0d 00 00 00 00    decl   %ss:0x0
        -
        -000001e6 <is_nmi>:
          1e6:  83 c4 08                add    $0x8,%esp
          1e9:  cf                      iret
          1ea:  66 90                   xchg   %ax,%ax
      
      No functionality change.
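
      The idea, sketched (label name illustrative, not the exact patch):

         /* before: a global-looking label ends up in the symbol table and
          * splits the disassembly at <is_nmi> */
      is_nmi:
         addq   $16, %rsp
         iretq

         /* after: an assembler-local .L label is invisible to objdump, so
          * the branch target shows up as an offset into early_idt_handler */
      .Lis_nmi:
         addq   $16, %rsp
         iretq
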
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1431793079-11153-1-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e839004b
    • x86: Pack loops tightly as well · 52648e83
      By Ingo Molnar
      Packing loops tightly (-falign-loops=1) is beneficial to code size:
      
           text        data    bss     dec              filename
       12566391        1617840 1089536 15273767         vmlinux.align.16-byte
       12224951        1617840 1089536 14932327         vmlinux.align.1-byte
       11976567        1617840 1089536 14683943         vmlinux.align.1-byte.funcs-1-byte
       11903735        1617840 1089536 14611111         vmlinux.align.1-byte.funcs-1-byte.loops-1-byte
      
      Which reduces the size of the kernel by another 0.6%, so the
      total combined size reduction of the alignment-packing
      patches is ~5.5%.
      
      The x86 decoder bandwidth and caching arguments laid out in:
      
        be6cb027 ("x86: Align jump targets to 1-byte boundaries")
      
      apply to loop alignment as well.
      
      Furthermore, modern CPU uarchs have a loop cache/buffer that
      is a L0 cache before even any uop cache, covering a few
      dozen most recently executed instructions.
      
      This loop cache generally does not have the 16-byte alignment
      restrictions of the uop cache.
      
      Now loop alignment can still be beneficial if:
      
       - a loop is cache-hot and its surroundings are not.
      
       - if the loop is so cache hot that the instruction
         flow becomes x86 decoder bandwidth limited
      
      But loop alignment is harmful if:
      
       - a loop is cache-cold
      
       - a loop's surroundings are cache-hot as well
      
       - two cache-hot loops are close to each other
      
       - if the loop fits into the loop cache
      
       - if the code flow is not decoder bandwidth limited
      
      and I'd argue that the latter five scenarios are much
      more common in the kernel, as our hottest loops are
      typically:
      
       - pointer chasing: this should fit into the loop cache
         in most cases and is typically data cache and address
         generation limited
      
       - generic memory ops (memset, memcpy, etc.): these generally
         fit into the loop cache as well, and are likewise data
         cache limited.
      
      So this patch packs loop addresses tightly as well.
      Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410123017.GB19918@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      52648e83
  8. 16 May 2015, 2 commits
  9. 15 May 2015, 1 commit
    • x86: Align jump targets to 1-byte boundaries · be6cb027
      By Ingo Molnar
      The following NOP in a hot function caught my attention:
      
        >   5a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
      
      That's a dead NOP that bloats the function a bit, added for the
      default 16-byte alignment that GCC applies for jump targets.
      
      I realize that x86 CPU manufacturers recommend 16-byte jump
      target alignments (it's in the Intel optimization manual),
      to help their relatively narrow decoder prefetch alignment
      and uop cache constraints, but the cost of that is very
      significant:
      
              text           data       bss         dec      filename
          12566391        1617840   1089536    15273767      vmlinux.align.16-byte
          12224951        1617840   1089536    14932327      vmlinux.align.1-byte
      
      By using 1-byte jump target alignment (i.e. no alignment at all)
      we get an almost 3% reduction in kernel size (!) - and a
      probably similar reduction in I$ footprint.
      
      Now, the usual justification for jump target alignment is the
      following:
      
       - modern decoders tend to have 16-byte (effective) decoder
         prefetch windows. (AMD documents it higher but measurements
         suggest the effective prefetch window on current uarchs is
         still around 16 bytes)
      
       - on Intel there's also the uop-cache with cachelines that have
         16-byte granularity and limited associativity.
      
       - older x86 uarchs had a penalty for decoder fetches that crossed
         16-byte boundaries. These limits are mostly gone from recent
         uarchs.
      
      So if a forward jump target is aligned to a cacheline boundary then
      prefetches will start from a new prefetch-cacheline and there's a
      higher chance of decoding in fewer steps and packing tightly.
      
      But I think that argument is flawed for typical optimized kernel
      code flows: forward jumps often go to 'cold' (uncommon) pieces
      of code, and aligning cold code to cache lines does not bring a
      lot of advantages (they are uncommon), while it causes
      collateral damage:
      
       - their alignment 'spreads out' the cache footprint, it shifts
         followup hot code further out
      
       - plus it slows down even 'cold' code that immediately follows 'hot'
         code (like in the above case), which could have benefited from the
         partial cacheline that comes off the end of hot code.
      
      But even in the cache-hot case the 16 byte alignment brings
      disadvantages:
      
       - it spreads out the cache footprint, possibly making the code
         fall out of the L1 I$.
      
       - On Intel CPUs, recent microarchitectures have plenty of
         uop cache (typically doubling every 3 years) - while the
         size of the L1 cache grows much less aggressively. So
         workloads are rarely uop cache limited.
      
      The only situation where alignment might matter is tight
      loops that could fit into a single 16-byte chunk - but those
      are pretty rare in the kernel: if they exist they tend
      to be pointer chasing or generic memory ops, which both tend
      to be cache miss (or cache allocation) intensive and are not
      decoder bandwidth limited.
      
      So the balance of arguments strongly favors packing kernel
      instructions tightly versus maximizing for decoder bandwidth:
      this patch changes the jump target alignment from 16 bytes
      to 1 byte (tightly packed, unaligned).
      Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410120846.GA17101@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      be6cb027
  10. 14 May 2015, 5 commits
  11. 13 May 2015, 5 commits