  1. 08 December 2014 (3 commits)
  2. 01 December 2014 (1 commit)
  3. 26 November 2014 (1 commit)
    • kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn() · d3fccc7e
      Authored by Ard Biesheuvel
      This reverts commit 85c8555f ("KVM: check for !is_zero_pfn() in
      kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn.
      
      The problem being addressed by the patch above was that some ARM code
      based the memory mapping attributes of a pfn on the return value of
      kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should
      be mapped as device memory.
      
      However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin,
      and the existing non-ARM users were already using it in a way which
      suggests that its name should probably have been 'kvm_is_reserved_pfn'
      from the beginning, e.g., whether or not to call get_page/put_page on
      it etc. This means that returning false for the zero page is a mistake
      and the patch above should be reverted.
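      
      As an illustration, a minimal sketch of the renamed helper, assuming
      it keeps the pfn_valid()/PageReserved() logic described above (not
      necessarily the exact kernel source):
      
      	bool kvm_is_reserved_pfn(pfn_t pfn)
      	{
      		if (pfn_valid(pfn))
      			return PageReserved(pfn_to_page(pfn));
      
      		/* MMIO and other non-RAM pfns are treated as reserved */
      		return true;
      	}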
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  4. 25 November 2014 (1 commit)
  5. 24 November 2014 (5 commits)
  6. 23 November 2014 (3 commits)
    • PCI/MSI: Rename mask/unmask_msi_irq treewide · 280510f1
      Authored by Thomas Gleixner
      The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed
      to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage
      sites. The conversion helper functions are kept around to avoid
      conflicts in next and will be removed after merging into mainline.
      
      Coccinelle assisted conversion. No functional change.
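      
      For illustration, the kind of transitional wrapper the changelog
      describes, sketched on the assumption that the old names simply
      forward to the new PCI-specific ones:
      
      	/* conversion helpers, kept only to avoid conflicts in -next */
      	static inline void mask_msi_irq(struct irq_data *data)
      	{
      		pci_msi_mask_irq(data);
      	}
      
      	static inline void unmask_msi_irq(struct irq_data *data)
      	{
      		pci_msi_unmask_irq(data);
      	}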
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: x86@kernel.org
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Murali Karicheri <m-karicheri2@ti.com>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Cc: Mohit Kumar <mohit.kumar@st.com>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: Yijing Wang <wangyijing@huawei.com>
    • PCI/MSI: Rename write_msi_msg() to pci_write_msi_msg() · 83a18912
      Authored by Jiang Liu
      Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI
      specific.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
      Cc: Yijing Wang <wangyijing@huawei.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • PCI/MSI: Rename __read_msi_msg() to __pci_read_msi_msg() · 891d4a48
      Authored by Jiang Liu
      Rename __read_msi_msg() to __pci_read_msi_msg() and kill the unused
      read_msi_msg(). This is preparation for separating the generic MSI
      code from the PCI core.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
      Cc: Yijing Wang <wangyijing@huawei.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  7. 21 November 2014 (1 commit)
  8. 19 November 2014 (5 commits)
    • x86: Cleanly separate use of asm-generic/mm_hooks.h · a1ea1c03
      Authored by Dave Hansen
      asm-generic/mm_hooks.h provides some generic fillers for the 90%
      of architectures that do not need to hook some mmap-manipulation
      functions.  A comment inside says:
      
      > Define generic no-op hooks for arch_dup_mmap and
      > arch_exit_mmap, to be included in asm-FOO/mmu_context.h
      > for any arch FOO which doesn't need to hook these.
      
      So, does x86 need to hook these?  It depends on CONFIG_PARAVIRT.
      We *conditionally* include this generic header if we have
      CONFIG_PARAVIRT=n.  That's madness.
      
      With this patch, x86 stops using asm-generic/mm_hooks.h entirely.
      We use our own copies of the functions.  The paravirt code
      provides some stubs if it is disabled, and we always call those
      stubs in our x86-private versions of arch_exit_mmap() and
      arch_dup_mmap().
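      
      A sketch of the resulting shape, assuming the paravirt stubs are
      named paravirt_arch_dup_mmap()/paravirt_arch_exit_mmap() (simplified
      from the changelog, not copied from the tree):
      
      	static inline void arch_dup_mmap(struct mm_struct *oldmm,
      					 struct mm_struct *mm)
      	{
      		/* a no-op stub unless CONFIG_PARAVIRT provides a hook */
      		paravirt_arch_dup_mmap(oldmm, mm);
      	}
      
      	static inline void arch_exit_mmap(struct mm_struct *mm)
      	{
      		paravirt_arch_exit_mmap(mm);
      	}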
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/20141118182349.14567FA5@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86 mpx: Change return type of get_reg_offset() · 68c009c4
      Authored by Dave Hansen
      get_reg_offset() used to return the register contents themselves
      instead of the register offset.  When it did that, it was an
      unsigned long.  I changed it to return an integer _offset_
      instead of the register.  But, I neglected to change the return
      type of the function or the variables in which we store the
      result of the call.
      
      This fixes up the code to clear up the warnings from the smatch
      bot:
      
      New smatch warnings:
      arch/x86/mm/mpx.c:178 mpx_get_addr_ref() warn: unsigned 'addr_offset' is never less than zero.
      arch/x86/mm/mpx.c:184 mpx_get_addr_ref() warn: unsigned 'base_offset' is never less than zero.
      arch/x86/mm/mpx.c:188 mpx_get_addr_ref() warn: unsigned 'indx_offset' is never less than zero.
      arch/x86/mm/mpx.c:196 mpx_get_addr_ref() warn: unsigned 'addr_offset' is never less than zero.
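      
      The pattern behind the warnings, as a hypothetical illustration
      (names simplified; not copied from mpx.c):
      
      	int get_reg_offset(void);	/* returns an offset, or -1 on error */
      
      	void example(void)
      	{
      		int addr_offset = get_reg_offset();	/* was: unsigned long */
      
      		if (addr_offset < 0)	/* always false when the type is unsigned */
      			return;
      	}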
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/20141118182343.C3E0C629@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, kaslr: Handle Gold linker for finding bss/brk · 70b61e36
      Authored by Kees Cook
      When building with the Gold linker, the .bss and .brk areas of vmlinux
      are shown as consecutive instead of having the same file offset. Allow
      for either state, as long as things add up correctly.
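      
      A sketch of the relaxed check (illustrative, not the actual
      build-script logic): accept .brk either at the same file offset as
      .bss, or immediately after it, as long as the sizes add up:
      
      	int layout_ok(unsigned long bss_off, unsigned long bss_size,
      		      unsigned long brk_off)
      	{
      		return brk_off == bss_off ||		/* BFD ld: shared offset */
      		       brk_off == bss_off + bss_size;	/* Gold: consecutive     */
      	}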
      
      Fixes: e6023367 ("x86, kaslr: Prevent .bss from overlaping initrd")
      Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Junjie Mao <eternal.n08@gmail.com>
      Link: http://lkml.kernel.org/r/20141118001604.GA25045@www.outflux.net
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, mm: Set NX across entire PMD at boot · 45e2a9d4
      Authored by Kees Cook
      When setting up permissions on kernel memory at boot, the end of the
      PMD that was split from bss remained executable. It should be NX like
      the rest. This performs a PMD alignment instead of a PAGE alignment to
      get the correct span of memory.
      
      Before:
      ---[ High Kernel Mapping ]---
      ...
      0xffffffff8202d000-0xffffffff82200000  1868K     RW       GLB NX pte
      0xffffffff82200000-0xffffffff82c00000    10M     RW   PSE GLB NX pmd
      0xffffffff82c00000-0xffffffff82df5000  2004K     RW       GLB NX pte
      0xffffffff82df5000-0xffffffff82e00000    44K     RW       GLB x  pte
      0xffffffff82e00000-0xffffffffc0000000   978M                     pmd
      
      After:
      ---[ High Kernel Mapping ]---
      ...
      0xffffffff8202d000-0xffffffff82200000  1868K     RW       GLB NX pte
      0xffffffff82200000-0xffffffff82e00000    12M     RW   PSE GLB NX pmd
      0xffffffff82e00000-0xffffffffc0000000   978M                     pmd
      
      [ tglx: Changed it to roundup(_brk_end, PMD_SIZE) and added a comment.
              We really should unmap the remainder along with the holes
              caused by init, initdata etc., but that's a different issue ]
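      
      A sketch of the alignment change the note describes (simplified; the
      rodata_start cursor is an assumption, not copied from the tree):
      
      	/* was: all_end = PFN_ALIGN(&_end); PAGE alignment left the  */
      	/* tail of the split PMD executable, so round up to the PMD: */
      	unsigned long all_end = roundup(_brk_end, PMD_SIZE);
      
      	set_memory_nx(rodata_start, (all_end - rodata_start) >> PAGE_SHIFT);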
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20141114194737.GA3091@www.outflux.net
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, microcode: Update BSPs microcode on resume · fb86b973
      Authored by Borislav Petkov
      In the situation where we apply early microcode but do *not* apply
      late microcode, we fail to update the BSP's microcode on resume
      because we haven't initialized the uci->mc microcode pointer. So, to
      alleviate that, dig out the microcode patch we stashed during early
      boot. This is basically what the APs do early in boot, so do the
      same here.
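      
      A hedged sketch of the resume path described above; the helper name
      reload_early_microcode() is an assumption here, not confirmed by the
      changelog:
      
      	static void mc_bp_resume(void)
      	{
      		int cpu = smp_processor_id();
      		struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
      
      		if (uci->valid && uci->mc)
      			microcode_ops->apply_microcode(cpu);
      		else if (!uci->mc)
      			/* dig out the patch stashed during early boot */
      			reload_early_microcode();
      	}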
      
      Tested-by: alex.schnaidt@gmail.com
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=88001
      Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: <stable@vger.kernel.org> # v3.9
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/20141118094657.GA6635@pd.tnic
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 18 November 2014 (8 commits)
    • x86, mpx: Cleanup unused bound tables · 1de4fa14
      Authored by Dave Hansen
      The previous patch allocates bounds tables on-demand.  As noted in
      an earlier description, these can add up to *HUGE* amounts of
      memory.  This has caused OOMs in practice when running tests.
      
      This patch adds support for freeing bounds tables when they are no
      longer in use.
      
      There are two types of mappings in play when unmapping tables:
       1. The mapping with the actual data, which userspace is
          munmap()ing or brk()ing away, etc...
       2. The mapping for the bounds table *backing* the data
          (tagged with VM_MPX; see the patch "add MPX specific
          mmap interface").
      
      If userspace uses the prctl() introduced earlier in this patchset
      to enable kernel management of bounds tables, then when it unmaps
      the first type of mapping (the actual data), the kernel needs to
      free the mapping for the bounds table backing that data. This patch
      hooks in at the very end of do_munmap() to do so. We look at the
      addresses being unmapped and find the bounds directory entries and
      tables which cover those addresses.  If an entire table is unused,
      we clear the associated directory entry and free the table.
      
      Once we unmap the bounds table, we would have a bounds directory
      entry pointing at empty address space. That address space might
      now be allocated for some other (random) use, and the MPX
      hardware might now try to walk it as if it were a bounds table.
      That would be bad.  So any unmapping of an entire bounds table
      has to be accompanied by a corresponding write to the bounds
      directory entry to invalidate it.  That write to the bounds
      directory can fault, which causes the following problem:
      
      Since we are doing the freeing from munmap() (and other paths
      like it), we hold mmap_sem for write. If we fault, the page
      fault handler will attempt to acquire mmap_sem for read and
      we will deadlock.  To avoid the deadlock, we pagefault_disable()
      when touching the bounds directory entry and use a
      get_user_pages() to resolve the fault.
      
      The unmapping of bounds tables happens under vm_munmap().  We
      also (indirectly) call vm_munmap() to _do_ the unmapping of the
      bounds tables.  We avoid unbounded recursion by disallowing
      freeing of bounds tables *for* bounds tables.  This would not
      occur normally, so should not have any practical impact.  Being
      strict about it here helps ensure that we do not have an
      exploitable stack overflow.
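      
      The deadlock-avoidance pattern, as an illustrative sketch (the helper
      shape is assumed, not copied from mpx.c): touch the directory entry
      with page faults disabled, and resolve any fault by hand:
      
      	static int write_bd_entry(long __user *bd_entry, long new_val)
      	{
      		int ret;
      
      		do {
      			pagefault_disable();	/* mmap_sem is held for write */
      			ret = __put_user(new_val, bd_entry);
      			pagefault_enable();
      			if (!ret)
      				return 0;
      			/* faulted: resolve it manually, then retry the write */
      		} while (get_user_pages(current, current->mm,
      					(unsigned long)bd_entry, 1, 1, 0,
      					NULL, NULL) == 1);
      
      		return ret;
      	}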
      Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151831.E4531C4A@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, mpx: On-demand kernel allocation of bounds tables · fe3d197f
      Authored by Dave Hansen
      This is really the meat of the MPX patch set.  If there is one patch to
      review in the entire series, this is the one.  There is a new ABI here
      and this kernel code also interacts with userspace memory in a
      relatively unusual manner.  (small FAQ below).
      
      Long Description:
      
      This patch adds two prctl() commands to enable or disable kernel
      management of bounds tables, including on-demand kernel allocation
      (see the patch "on-demand kernel allocation of bounds tables") and
      cleanup (see the patch "cleanup unused bound tables"). Applications
      do not strictly need the kernel to manage bounds tables and we expect
      some applications to use MPX without taking advantage of this kernel
      support. This means the kernel cannot simply infer whether an
      application needs bounds table management from the MPX registers.
      The prctl() is an explicit signal from userspace.
      
      PR_MPX_ENABLE_MANAGEMENT is a signal from userspace that it requires
      the kernel's help in managing bounds tables.
      
      PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace
      doesn't want the kernel's help any more. With
      PR_MPX_DISABLE_MANAGEMENT, the kernel won't allocate and free bounds
      tables even if the CPU supports MPX.
      
      PR_MPX_ENABLE_MANAGEMENT will fetch the base address of the bounds
      directory from a userspace register (bndcfgu) and cache it in a new
      field (->bd_addr) in the 'mm_struct'.  PR_MPX_DISABLE_MANAGEMENT
      will set "bd_addr" to an invalid address.  Using this scheme, we can
      use "bd_addr" to determine whether kernel management of bounds
      tables is enabled.
      
      Also, the only way to access that bndcfgu register is via an xsaves,
      which can be expensive.  Caching "bd_addr" like this also helps reduce
      the cost of those xsaves when doing table cleanup at munmap() time.
      Unfortunately, we can not apply this optimization to #BR fault time
      because we need an xsave to get the value of BNDSTATUS.
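      
      From userspace, enabling management looks roughly like this sketch;
      the prctl values 43/44 are an assumption based on this series, not
      confirmed by the changelog:
      
      	#include <stdio.h>
      	#include <sys/prctl.h>
      
      	#ifndef PR_MPX_ENABLE_MANAGEMENT
      	#define PR_MPX_ENABLE_MANAGEMENT	43	/* assumed value */
      	#define PR_MPX_DISABLE_MANAGEMENT	44	/* assumed value */
      	#endif
      
      	int main(void)
      	{
      		/* ask the kernel to manage bounds tables for this process */
      		if (prctl(PR_MPX_ENABLE_MANAGEMENT, 0, 0, 0, 0))
      			perror("MPX management unavailable");
      		return 0;
      	}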
      
      ==== Why does the hardware even have these Bounds Tables? ====
      
      MPX only has 4 hardware registers for storing bounds information.
      If MPX-enabled code needs more than these 4 registers, it needs to
      spill them somewhere. It has two special instructions for this
      which allow the bounds to be moved between the bounds registers
      and some new "bounds tables".
      
      Accesses to these tables can raise #BR exceptions, which are
      conceptually similar to page faults; the MPX hardware raises them
      both on bounds violations and when the tables are not present. This
      patch handles those #BR exceptions for not-present tables by carving
      the space out of the normal process's address space (essentially
      calling the new mmap() interface introduced earlier in this patch
      set) and then pointing the bounds directory over to it.
      
      The tables *need* to be accessed and controlled by userspace because
      the instructions for moving bounds in and out of them are extremely
      frequent. They potentially happen every time a register pointing to
      memory is dereferenced. Any direct kernel involvement (like a syscall)
      to access the tables would obviously destroy performance.
      
      ==== Why not do this in userspace? ====
      
      This patch is obviously doing this allocation in the kernel.
      However, MPX does not strictly *require* anything in the kernel.
      It can theoretically be done completely from userspace. Here are
      a few ways this *could* be done. I don't think any of them are
      practical in the real-world, but here they are.
      
      Q: Can virtual space simply be reserved for the bounds tables so
         that we never have to allocate them?
      A: As noted earlier, these tables are *HUGE*. An X-GB virtual
         area needs 4*X GB of virtual space, plus 2GB for the bounds
         directory. If we were to preallocate them for the 128TB of
         user virtual address space, we would need to reserve 512TB+2GB,
         which is larger than the entire virtual address space today.
         This means they cannot be reserved ahead of time. Also, a
         single process's pre-populated bounds directory consumes 2GB
         of virtual *AND* physical memory. IOW, it's completely
         infeasible to prepopulate bounds directories.
      
      Q: Can we preallocate bounds table space at the same time memory
         is allocated which might contain pointers that might eventually
         need bounds tables?
      A: This would work if we could hook the site of each and every
         memory allocation syscall. This can be done for small,
         constrained applications. But, it isn't practical at a larger
         scale since a given app has no way of controlling how all the
         parts of the app might allocate memory (think libraries). The
         kernel is really the only place to intercept these calls.
      
      Q: Could a bounds fault be handed to userspace and the tables
         allocated there in a signal handler instead of in the kernel?
      A: (thanks to tglx) mmap() is not on the list of safe async
         handler functions and even if mmap() would work it still
         requires locking or nasty tricks to keep track of the
         allocation state there.
      
      Having ruled out all of the userspace-only approaches for managing
      bounds tables that we could think of, we create them on demand in
      the kernel.
      Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151829.AD4310DE@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, mpx: Decode MPX instruction to get bound violation information · fcc7ffd6
      Authored by Dave Hansen
      This patch sets bound violation fields of siginfo struct in #BR
      exception handler by decoding the user instruction and constructing
      the faulting pointer.
      
      We have to be very careful when decoding these instructions.  They
      are completely controlled by userspace and may be changed at any
      time up to and including the point where we try to copy them in to
      the kernel.  They may or may not be MPX instructions and could be
      completely invalid for all we know.
      
      Note: This code is based on Qiaowei Ren's specialized MPX
      decoder, but uses the generic decoder whenever possible.  It was
      tested for robustness by generating a completely random data
      stream and trying to decode that stream.  I also unmapped random
      pages inside the stream to test the "partial instruction" short
      read code.
      
      We kzalloc() the siginfo instead of stack allocating it because
      we need to memset() it anyway, and doing this makes it much more
      clear when it got initialized by the MPX instruction decoder.
      
      Changes from the old decoder:
       * Use the generic decoder instead of custom functions.  Saved
         ~70 lines of code overall.
       * Remove insn->addr_bytes code (never used??)
       * Make sure never to possibly overflow the regoff[] array, plus
         check the register range correctly in 32 and 64-bit modes.
       * Allow get_reg() to return an error and have mpx_get_addr_ref()
         handle when it sees errors.
       * Only call insn_get_*() near where we actually use the values
         instead of trying to call them all at once.
       * Handle short reads from copy_from_user() and check the actual
         number of read bytes against what we expect from
         insn_get_length().  If a read stops in the middle of an
         instruction, we error out.
       * Actually check the opcodes instead of ignoring them.
       * Dynamically kzalloc() siginfo_t so we don't leak any stack
         data.
       * Detect and handle decoder failures instead of ignoring them.
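      
      The short-read handling from the list above, as an illustrative
      sketch (buffer and variable names are assumptions):
      
      	int nr_copied;
      	struct insn insn;
      	unsigned char buf[MAX_INSN_SIZE];
      
      	nr_copied = MAX_INSN_SIZE -
      		    copy_from_user(buf, (void __user *)ip, MAX_INSN_SIZE);
      	if (!nr_copied)
      		return -EFAULT;		/* nothing usable was copied */
      
      	insn_init(&insn, buf, nr_copied, x86_64);
      	insn_get_length(&insn);
      	if (nr_copied < insn.length)
      		return -EFAULT;		/* read stopped mid-instruction */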
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151828.5BDD0915@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, mpx: Add MPX-specific mmap interface · 57319d80
      Authored by Qiaowei Ren
      We have chosen to perform the allocation of bounds tables in
      kernel (See the patch "on-demand kernel allocation of bounds
      tables") and to mark these VMAs with VM_MPX.
      
      However, there is currently no suitable interface to actually do
      this.  Existing interfaces, like do_mmap_pgoff(), have no way to
      set a modified ->vm_ops or ->vm_flags and don't hold mmap_sem
      long enough to let a caller do it.
      
      This patch wraps mmap_region() and holds mmap_sem long enough to
      make the modifications to the VMA which we need.
      
      Also note the 32/64-bit #ifdef in the header.  We actually need
      to do this at runtime eventually.  But, for now, we don't support
      running 32-bit binaries on 64-bit kernels.  Support for this will
      come in later patches.
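      
      A hedged sketch of the wrapper shape described above (simplified;
      the details are assumptions, not copied from arch/x86/mm/mpx.c):
      
      	static unsigned long mpx_mmap(unsigned long len)
      	{
      		struct mm_struct *mm = current->mm;
      		unsigned long addr, pgoff;
      		vm_flags_t vm_flags;
      
      		down_write(&mm->mmap_sem);
      
      		addr = get_unmapped_area(NULL, 0, len, 0,
      					 MAP_ANONYMOUS | MAP_PRIVATE);
      		if (IS_ERR_VALUE(addr))
      			goto out;
      
      		/* mmap_sem is still held, so the VM_MPX tag is set before
      		 * anyone else can observe the new VMA */
      		vm_flags = VM_READ | VM_WRITE | VM_MPX |
      			   VM_MAYREAD | VM_MAYWRITE | mm->def_flags;
      		pgoff = addr >> PAGE_SHIFT;
      		addr = mmap_region(NULL, addr, len, vm_flags, pgoff);
      	out:
      		up_write(&mm->mmap_sem);
      		return addr;
      	}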
      Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151827.CE440F67@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, mpx: Add MPX to disabled features · 95290cf1
      Authored by Dave Hansen
      This allows us to use cpu_feature_enabled(X86_FEATURE_MPX) as
      both a runtime and compile-time check.
      
      When CONFIG_X86_INTEL_MPX is disabled,
      cpu_feature_enabled(X86_FEATURE_MPX) will evaluate at
      compile-time to 0. If CONFIG_X86_INTEL_MPX=y, then the cpuid
      flag will be checked at runtime.
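      
      A sketch of the mechanism, assuming the usual x86 disabled-features
      pattern (simplified, not copied from the tree):
      
      	/* asm/disabled-features.h style: */
      	#ifdef CONFIG_X86_INTEL_MPX
      	# define DISABLE_MPX	0
      	#else
      	# define DISABLE_MPX	(1 << (X86_FEATURE_MPX & 31))
      	#endif
      
      	/* callers need no #ifdefs; this folds to 0 at compile time
      	 * when CONFIG_X86_INTEL_MPX is off: */
      	if (cpu_feature_enabled(X86_FEATURE_MPX))
      		mpx_enable_management();	/* hypothetical caller */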
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151823.B358EAD2@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86, mpx: Rename cfg_reg_u and status_reg · 62e7759b
      Authored by Dave Hansen
      According to the Intel SDM, the MPX configuration and status
      registers are named BNDCFGU and BNDSTATUS. This patch renames
      cfg_reg_u and status_reg to bndcfgu and bndstatus accordingly.
      
      [ tglx: Renamed 'struct bndscr_struct' to 'struct bndscr' ]
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Link: http://lkml.kernel.org/r/20141114151817.031762AC@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: mpx: Give bndX registers actual names · c04e051c
      Authored by Dave Hansen
      Consider the bndX MPX registers.  There are 4 registers, each
      containing a 64-bit lower and a 64-bit upper bound.  That's 8*64
      bits, and we declare it thusly:
      
      	struct bndregs_struct {
      		u64 bndregs[8];
      	};
      Let's say you want to read the upper bound from the MPX register
      bnd2 out of the xsave buf.  You do:
      
      	bndregno = 2;
      	upper_bound = xsave_buf->bndregs.bndregs[2*bndregno+1];
      
      That kinda sucks.  Every time you access it, you need to know:
      1. Each bndX register is two entries wide in "bndregs"
      2. The lower comes first followed by upper.  We do the +1 to get
         upper vs. lower.
      
      This replaces the old definition.  You can now access them
      indexed by the register number directly, and with a meaningful
      name for the lower and upper bound:
      
      	bndregno = 2;
      	xsave_buf->bndreg[bndregno].upper_bound;
      
      It's now *VERY* clear that there are 4 registers.  The programmer
      now doesn't have to care what order the lower and upper bounds
      are in, and it's harder to get it wrong.
      
      [ tglx: Changed ub/lb to upper_bound/lower_bound and renamed struct
      bndreg_struct to struct bndreg ]
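      
      The replacement layout, sketched per the description and the tglx
      note above:
      
      	struct bndreg {
      		u64 lower_bound;
      		u64 upper_bound;
      	} __packed;
      
      	/* in the xsave area: four registers, indexed directly */
      	struct bndreg bndreg[4];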
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: x86@kernel.org
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: "Yu, Fenghua" <fenghua.yu@intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141031215820.5EA5E0EC@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: Remove arbitrary instruction size limit in instruction decoder · 6ba48ff4
      Authored by Dave Hansen
      The current x86 instruction decoder steps along through the
      instruction stream but always ensures that it never steps farther
      than the largest possible instruction size (MAX_INSN_SIZE).
      
      The MPX code is now going to be doing some decoding of userspace
      instructions.  We copy those from userspace in to the kernel and
      they're obviously completely untrusted coming from userspace.  In
      addition to the constraint that instructions can only be so long,
      we also have to be aware of how long the buffer is that came in
      from userspace.  This _looks_ similar to what perf and kprobes are
      doing, but it's unclear to me whether they are affected.
      
      The whole reason we need this is that it is perfectly valid to be
      executing an instruction within MAX_INSN_SIZE bytes of an
      unreadable page. We should be able to gracefully handle short
      reads in those cases.
      
      This adds support to the decoder to record how long the buffer
      being decoded is and to refuse to "validate" the instruction if
      we would have gone over the end of the buffer to decode it.
      
      The kprobes code probably needs to be looked at here a bit more
      carefully.  This patch still respects the MAX_INSN_SIZE limit
      there but the kprobes code does look like it might be able to
      be a bit more strict than it currently is.
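      
      A sketch of the added validation (field names assumed; simplified
      from the changelog): the decoder is told how many bytes are really
      available and refuses instructions that would run past them:
      
      	insn_init(&insn, buf, buf_size, x86_64);  /* buf_size: bytes available */
      	insn_get_length(&insn);
      	if (insn.length > buf_size)
      		return -EINVAL;	/* decoding would run past the buffer */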
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Jim Keniston <jkenisto@us.ibm.com>
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: x86@kernel.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20141114153957.E6B01535@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  10. 17 November 2014 (1 commit)
    • x86-64: make csum_partial_copy_from_user() error handling consistent · 3b91270a
      Authored by Linus Torvalds
      Al Viro pointed out that the x86-64 csum_partial_copy_from_user() is
      somewhat confused about what it should do on errors: notably, it
      mostly clears the uncopied end-result buffer, but misses doing that
      for the initial alignment case.
      
      All users should check for errors, so it's dubious whether the clearing
      is even necessary, and Al also points out that we should probably clean
      up the calling conventions, but regardless of any future changes to this
      function, the fact that it is inconsistent is just annoying.
      
      So make the __get_user() failure path use the same error exit as all the
      other errors do.
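      
      The principle of the fix, as a hypothetical illustration (not the
      actual csum code): every failure path, including the initial
      alignment step, takes the same exit that clears the remaining
      destination buffer:
      
      	static int copy_example(void __user *src, char *dst, int len)
      	{
      		char tmp;
      
      		if (__get_user(tmp, (char __user *)src))
      			goto out_err;	/* previously skipped the clearing */
      		*dst = tmp;
      		return 0;
      	out_err:
      		memset(dst, 0, len);	/* consistent with the other paths */
      		return -EFAULT;
      	}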
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 16 November 2014 (8 commits)
  12. 12 November 2014 (3 commits)