1. 04 Jul, 2015 1 commit
  2. 26 Jun, 2015 2 commits
  3. 25 Jun, 2015 3 commits
  4. 23 Jun, 2015 1 commit
  5. 19 Jun, 2015 8 commits
  6. 18 Jun, 2015 1 commit
  7. 12 Jun, 2015 3 commits
  8. 09 Jun, 2015 11 commits
    • D
      x86/mpx: Support 32-bit binaries on 64-bit kernels · 613fcb7d
      Dave Hansen committed
      Right now, the kernel can only switch between 64-bit and 32-bit
      binaries at compile time. This patch adds support for 32-bit
      binaries on 64-bit kernels when we support ia32 emulation.
      
      We essentially choose which set of table sizes to use when doing
      arithmetic for the bounds table calculations.
      
      This also uses a different approach for calculating the table
      indexes than before.  I think the new one makes it much more
      clear what is going on, and allows us to share more code between
      the 32-bit and 64-bit cases.
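
      The idea of choosing one set of table sizes and sharing the index
      arithmetic can be sketched roughly as below.  This is a user-space
      illustration, not the kernel code: the struct, the function names
      and the bit widths (28/17/3 for 64-bit, 20/10/2 for 32-bit) are
      assumptions based on Intel's published MPX layout and do not come
      from this commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-mode geometry. The numbers are assumptions taken from
 * Intel's published MPX layout, not from the commit itself. */
struct mpx_geometry {
    unsigned bd_index_bits; /* address bits that select a bounds directory entry */
    unsigned bt_index_bits; /* address bits that select a bounds table entry */
    unsigned ignored_bits;  /* low address bits that take no part in the lookup */
};

static const struct mpx_geometry geom64 = { 28, 17, 3 };
static const struct mpx_geometry geom32 = { 20, 10, 2 };

/* Pick the set of sizes once; all later arithmetic is shared. */
const struct mpx_geometry *pick_geometry(bool is_64bit)
{
    return is_64bit ? &geom64 : &geom32;
}

/* Index of the entry inside a bounds table for a given address. */
uint64_t bt_index(const struct mpx_geometry *g, uint64_t addr)
{
    uint64_t mask = (UINT64_C(1) << g->bt_index_bits) - 1;
    return (addr >> g->ignored_bits) & mask;
}

/* Index of the entry inside the bounds directory for a given address. */
uint64_t bd_index(const struct mpx_geometry *g, uint64_t addr)
{
    uint64_t mask = (UINT64_C(1) << g->bd_index_bits) - 1;
    return (addr >> (g->ignored_bits + g->bt_index_bits)) & mask;
}
```

      A caller would obtain the geometry with pick_geometry() and pass it
      to both index helpers, so only the table of sizes differs between
      the 32-bit and 64-bit paths.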
      Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183705.E01F21E2@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      613fcb7d
    • D
      x86/mpx: Introduce new 'directory entry' to 'addr' helper function · 54587653
      Dave Hansen committed
      Currently, to get from a bounds directory entry to the virtual
      address of a bounds table, we simply mask off a few low bits.
      However, the set of bits we mask off is different for 32-bit and
      64-bit binaries.
      
      This breaks the operation out into a helper function and also
      adds a temporary variable to store the result until we are
      sure we are returning one.
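
      A minimal user-space sketch of such a helper follows.  The function
      name, the valid-flag constant and the 8-byte versus 4-byte alignment
      are assumptions made for illustration; the commit message only says
      that a different set of low bits is masked off in each mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: bit 0 of a directory entry marks it as valid. */
#define BD_ENTRY_VALID_FLAG UINT64_C(0x1)

/* Turn a bounds directory entry into the address of a bounds table. */
uint64_t bd_entry_to_bt_addr(bool is_64bit, uint64_t bd_entry)
{
    uint64_t bt_addr = bd_entry;        /* temporary that holds the result */
    uint64_t align = is_64bit ? 8 : 4;  /* assumed alignment in bytes */

    bt_addr &= ~BD_ENTRY_VALID_FLAG;    /* drop the valid bit */
    bt_addr &= ~(align - 1);            /* drop the remaining low bits */
    return bt_addr;
}
```

      With these assumed masks, a 64-bit entry of 0x7f0000001007 yields
      0x7f0000001000, while a 32-bit entry of 0x10007 yields 0x10004.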
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183704.007686CE@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      54587653
    • D
      x86: Make is_64bit_mm() widely available · b0e9b09b
      Dave Hansen committed
      The uprobes code has a nice helper, is_64bit_mm(), that consults
      both the runtime and compile-time flags for 32-bit support.
      Instead of reinventing the wheel, pull it into an x86 header so
      we can use it for MPX.
      
      I prefer passing the 'mm' around to test_thread_flag(TIF_IA32)
      because it makes it explicit where the context is coming from.
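
      The combination of compile-time and runtime checks can be modelled
      in user space roughly as follows.  The macros, the struct and the
      function name are stand-ins invented for this sketch; they are not
      the kernel's definitions.

```c
#include <stdbool.h>

/* Stand-ins for kernel build options (assumed names and values). */
#define MODEL_64BIT_KERNEL   1
#define MODEL_IA32_EMULATION 1

/* Stand-in for the per-mm runtime flag (hypothetical type). */
struct mm_model {
    bool ia32_compat; /* true when the address space runs a 32-bit binary */
};

/* Model of a helper that consults both kinds of flag. */
bool is_64bit_mm_model(const struct mm_model *mm)
{
    (void)mm; /* unused when the answer is fixed at compile time */
#if !MODEL_64BIT_KERNEL
    return false;            /* a 32-bit kernel never has 64-bit mms */
#elif MODEL_IA32_EMULATION
    return !mm->ia32_compat; /* decided per mm at runtime */
#else
    return true;             /* no emulation: every mm is 64-bit */
#endif
}
```

      Passing the mm explicitly, as the message prefers, keeps the source
      of the answer visible at each call site.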
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183704.F0209999@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b0e9b09b
    • D
      x86/mpx: Trace allocation of new bounds tables · cd4996dc
      Dave Hansen committed
      Bounds tables are a significant consumer of memory.  It is
      important to know when they are being allocated.  Add a trace
      point to trace whenever an allocation occurs and also its
      virtual address.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183704.EC23A93E@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cd4996dc
    • D
      x86/mpx: Trace the attempts to find bounds tables · 2a1dcb1f
      Dave Hansen committed
      There are two different events being traced here.  They are
      doing similar things so share a trace "EVENT_CLASS" and are
      presented together.
      
      1. Trace when MPX is zapping pages "mpx_unmap_zap":
      
      	When MPX can not free an entire bounds table, it will
      	instead try to zap unused parts of a bounds table to free
      	the backing memory.  This decreases RSS (resident set
      	size) without decreasing the virtual space allocated
      	for bounds tables.
      
      2. Trace attempts to find bounds tables "mpx_unmap_search":
      
      	This event traces any time we go looking to unmap a
      	bounds table for a given virtual address range.  This is
      	useful to ensure that the kernel actually "tried" to free
      	a bounds table versus times it succeeded in finding one.
      
      	It might try and fail if it realized that a table was
      	shared with an adjacent VMA which is not being unmapped.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183703.B9D2468B@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2a1dcb1f
    • D
      x86/mpx: Trace entry to bounds exception paths · 97efebf1
      Dave Hansen committed
      There are two basic things that can happen as the result of
      a bounds exception (#BR):
      
      	1. We allocate a new bounds table
      	2. We pass up a bounds exception to userspace.
      
      This patch adds a trace point for the case where we are
      passing the exception up to userspace with a signal.
      
      We are also explicit that we're printing out the inverse of
      the 'upper' that we encounter.  If you want to filter, for
      instance, you need to ~ the value first.  The reason we do
      this is because of how 'upper' is stored in the bounds table.
      
      If a pointer's range is:
      
      	0x1000 -> 0x2000
      
      it is stored in the bounds table as (32-bits here for brevity):
      
      	lower: 0x00001000
      	upper: 0xffffdfff
      
      That is so that an all 0's entry:
      
      	lower: 0x00000000
      	upper: 0x00000000
      
      corresponds to the "init" bounds which store a *range* of:
      
      	0x00000000 -> 0xffffffff
      
      That is, by far, the common case, and that lets us use the
      zero page, or deduplicate the memory, etc... The 'upper'
      stored in the table is gibberish to print by itself, so we
      print ~upper to get the *actual*, logical, human-readable
      value printed out.
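
      The inverted storage of 'upper' can be illustrated with a small
      user-space sketch.  It uses 32-bit values, as the example above
      does; the struct and function names are hypothetical and are not
      the kernel's.

```c
#include <stdint.h>

/* Hypothetical 32-bit bounds table entry holding only the two bounds. */
struct bt_entry32 {
    uint32_t lower;
    uint32_t upper_raw; /* stored as the bitwise inverse of the upper bound */
};

/* Encode a logical [lower, upper] range the way the text describes. */
struct bt_entry32 encode_bounds(uint32_t lower, uint32_t upper)
{
    struct bt_entry32 e;
    e.lower = lower;
    e.upper_raw = ~upper;
    return e;
}

/* Recover the human-readable upper bound, i.e. what the trace prints. */
uint32_t logical_upper(const struct bt_entry32 *e)
{
    return ~e->upper_raw;
}
```

      Encoding the range 0x1000 -> 0x2000 gives a stored upper of
      0xffffdfff, and an all-zero entry decodes to an upper bound of
      0xffffffff, matching the "init" bounds described above.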
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183703.027BB9B0@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      97efebf1
    • D
      x86/mpx: Trace #BR exceptions · e7126cf5
      Dave Hansen committed
      This is the first in a series of MPX tracing patches.
      I've found these extremely useful in the process of
      debugging applications and the kernel code itself.
      
      This trace point hooks into the bounds (#BR) exception
      very early and allows capturing the key registers which
      would influence how the exception is handled.
      
      Note that bndcfgu/bndstatus are technically still
      64-bit registers even in 32-bit mode.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183703.5FE2619A@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e7126cf5
    • Q
      x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK · 3c1d3230
      Qiaowei Ren committed
      MPX_BNDCFG_ADDR_MASK is defined twice, so this patch removes the
      redundant definition.
      Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183702.5F129376@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3c1d3230
    • D
      x86/mpx: Clean up the code by not passing a task pointer around when unnecessary · 46a6e0cf
      Dave Hansen committed
      The MPX code can only work on the current task.  You can not,
      for instance, enable MPX management in another process or
      thread. You can also not handle a fault for another process or
      thread.
      
      Despite this, we pass a task_struct around prolifically.  This
      patch removes all of the task struct passing for code paths
      where the code can not deal with another task (which turns out
      to be all of them).
      
      This has no functional changes.  It's just a cleanup.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/20150607183702.6A81DA2C@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      46a6e0cf
    • D
      x86/mpx: Use the new get_xsave_field_ptr() API · a84eeaa9
      Dave Hansen committed
      The MPX registers (bndcsr/bndcfgu/bndstatus) are not directly
      accessible via normal instructions.  They essentially act as
      if they were floating point registers and are saved/restored
      along with those registers.
      
      There are two main paths in the MPX code where we care about
      the contents of these registers:
      
      	1. #BR (bounds) faults
      	2. the prctl() code where we are setting MPX up
      
      Both of those paths _might_ be called without the FPU having
      been used.  That means that 'tsk->thread.fpu.state' might
      never be allocated.
      
      Also, fpu_save_init() is not preempt-safe.  It was a bug to
      call it without disabling preemption.  The new
      get_xsave_addr() calls unlazy_fpu() instead and properly
      disables preemption.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/20150607183701.BC0D37CF@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a84eeaa9
    • D
      x86/fpu/xstate: Wrap get_xsave_addr() to make it safer · 04cd027b
      Dave Hansen committed
      The MPX code is calling a low-level FPU function
      (copy_fpregs_to_fpstate()).  This function is not able to
      be called in all contexts, although it is safe to call
      directly in some cases.
      
      Although probably correct, the current code is ugly and
      potentially error-prone.  So, add a wrapper that calls
      the (slightly) higher-level fpu__save() (which is preempt-
      safe) and also ensures that we even *have* an FPU context
      (in the case that this was called when in lazy FPU mode).
      
      Ingo had this to say about the details about when we need
      preemption disabled:
      
      > it's indeed generally unsafe to access/copy FPU registers with preemption enabled,
      > for two reasons:
      >
      >   - on older systems that use FSAVE the instruction destroys FPU register
      >     contents, which has to be handled carefully
      >
      >   - even on newer systems if we copy to FPU registers (which this code doesn't)
      >     then we don't want a context switch to occur in the middle of it, because a
      >     context switch will write to the fpstate, potentially overwriting our new data
      >     with old FPU state.
      >
      > But it's safe to access FPU registers with preemption enabled in a couple of
      > special cases:
      >
      >   - potentially destructively saving FPU registers: the signal handling code does
      >     this in copy_fpstate_to_sigframe(), because it can rely on the signal restore
      >     side to restore the original FPU state.
      >
      >   - reading FPU registers on modern systems: we don't do this anywhere at the
      >     moment, mostly to keep symmetry with older systems where FSAVE is
      >     destructive.
      >
      >   - initializing FPU registers on modern systems: fpu__clear() does this. Here
      >     it's safe because we don't copy from the fpstate.
      >
      >   - directly writing FPU registers from user-space memory (!). We do this in
      >     fpu__restore_sig(), and it's safe because neither context switches nor
      >     irq-handler FPU use can corrupt the source context of the copy (which is
      >     user-space memory).
      >
      > Note that the MPX code's current use of copy_fpregs_to_fpstate() was safe I think,
      > because:
      >
      >  - MPX is predicated on eagerfpu, so the destructive F[N]SAVE instruction won't be
      >    used.
      >
      >  - the code was only reading FPU registers, and was doing it only in places that
      >    guaranteed that an FPU state was already active (i.e. didn't do it in
      >    kthreads)
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/20150607183700.AA881696@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      04cd027b
  9. 08 Jun, 2015 4 commits
    • B
      PCI: Remove unused pci_dma_burst_advice() · 01d72a95
      Bjorn Helgaas committed
      pci_dma_burst_advice() was added by e24c2d96 ("[PATCH] PCI: DMA
      bursting advice") but apparently never used.  Remove it.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Michal Simek <monstr@monstr.eu>	# microblaze
      CC: David S. Miller <davem@davemloft.net>
      01d72a95
    • I
      x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32 · b2502b41
      Ingo Molnar committed
      The 'system_call' entry points differ starkly between native 32-bit and 64-bit
      kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on
      64-bit it's the SYSCALL entry point.
      
      This is pretty confusing when looking at generic code, and it also obscures
      the nature of the entry point at the assembly level.
      
      So untangle this by splitting the name into its two uses:
      
      	system_call (32) -> entry_INT80_32
      	system_call (64) -> entry_SYSCALL_64
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b2502b41
    • I
      x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points:... · 4c8cd0c5
      Ingo Molnar committed
      x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat
      
      So the SYSENTER instruction is pretty quirky and it has different behavior
      depending on bitness and CPU maker.
      
      Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
      in both of the cases.
      
      Split the name into its two uses:
      
      	ia32_sysenter_target (32)    -> entry_SYSENTER_32
      	ia32_sysenter_target (64)    -> entry_SYSENTER_compat
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4c8cd0c5
    • I
      x86/asm/entry: Rename compat syscall entry points · 2cd23553
      Ingo Molnar committed
      Rename the following system call entry points:
      
      	ia32_cstar_target       -> entry_SYSCALL_compat
      	ia32_syscall            -> entry_INT80_compat
      
      The generic naming scheme for x86 system call entry points is:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2cd23553
  10. 07 Jun, 2015 6 commits