1. 18 11月, 2014 6 次提交
    • D
      x86, mpx: Decode MPX instruction to get bound violation information · fcc7ffd6
      Dave Hansen 提交于
      This patch sets bound violation fields of siginfo struct in #BR
      exception handler by decoding the user instruction and constructing
      the faulting pointer.
      
      We have to be very careful when decoding these instructions.  They
      are completely controlled by userspace and may be changed at any
      time up to and including the point where we try to copy them in to
      the kernel.  They may or may not be MPX instructions and could be
      completely invalid for all we know.
      
      Note: This code is based on Qiaowei Ren's specialized MPX
      decoder, but uses the generic decoder whenever possible.  It was
      tested for robustness by generating a completely random data
      stream and trying to decode that stream.  I also unmapped random
      pages inside the stream to test the "partial instruction" short
      read code.
      
      We kzalloc() the siginfo instead of stack allocating it because
      we need to memset() it anyway, and doing this makes it much more
      clear when it got initialized by the MPX instruction decoder.
      
      Changes from the old decoder:
       * Use the generic decoder instead of custom functions.  Saved
         ~70 lines of code overall.
       * Remove insn->addr_bytes code (never used??)
       * Make sure never to possibly overflow the regoff[] array, plus
         check the register range correctly in 32 and 64-bit modes.
       * Allow get_reg() to return an error and have mpx_get_addr_ref()
         handle when it sees errors.
       * Only call insn_get_*() near where we actually use the values
         instead if trying to call them all at once.
       * Handle short reads from copy_from_user() and check the actual
         number of read bytes against what we expect from
         insn_get_length().  If a read stops in the middle of an
         instruction, we error out.
       * Actually check the opcodes intead of ignoring them.
       * Dynamically kzalloc() siginfo_t so we don't leak any stack
         data.
       * Detect and handle decoder failures instead of ignoring them.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Based-on-patch-by: NQiaowei Ren <qiaowei.ren@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151828.5BDD0915@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      fcc7ffd6
    • Q
      x86, mpx: Add MPX-specific mmap interface · 57319d80
      Qiaowei Ren 提交于
      We have chosen to perform the allocation of bounds tables in
      kernel (See the patch "on-demand kernel allocation of bounds
      tables") and to mark these VMAs with VM_MPX.
      
      However, there is currently no suitable interface to actually do
      this.  Existing interfaces, like do_mmap_pgoff(), have no way to
      set a modified ->vm_ops or ->vm_flags and don't hold mmap_sem
      long enough to let a caller do it.
      
      This patch wraps mmap_region() and hold mmap_sem long enough to
      make the modifications to the VMA which we need.
      
      Also note the 32/64-bit #ifdef in the header.  We actually need
      to do this at runtime eventually.  But, for now, we don't support
      running 32-bit binaries on 64-bit kernels.  Support for this will
      come in later patches.
      Signed-off-by: NQiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151827.CE440F67@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      57319d80
    • D
      x86, mpx: Add MPX to disabled features · 95290cf1
      Dave Hansen 提交于
      This allows us to use cpu_feature_enabled(X86_FEATURE_MPX) as
      both a runtime and compile-time check.
      
      When CONFIG_X86_INTEL_MPX is disabled,
      cpu_feature_enabled(X86_FEATURE_MPX) will evaluate at
      compile-time to 0. If CONFIG_X86_INTEL_MPX=y, then the cpuid
      flag will be checked at runtime.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151823.B358EAD2@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      95290cf1
    • D
      x86, mpx: Rename cfg_reg_u and status_reg · 62e7759b
      Dave Hansen 提交于
      According to Intel SDM extension, MPX configuration and status registers
      should be BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u and
      status_reg to bndcfgu and bndstatus.
      
      [ tglx: Renamed 'struct bndscr_struct' to 'struct bndscr' ]
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Link: http://lkml.kernel.org/r/20141114151817.031762AC@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      62e7759b
    • D
      x86: mpx: Give bndX registers actual names · c04e051c
      Dave Hansen 提交于
      Consider the bndX MPX registers.  There 4 registers each
      containing a 64-bit lower and a 64-bit upper bound.  That's 8*64
      bits and we declare it thusly:
      
      	struct bndregs_struct {
      		u64 bndregs[8];
      	}
          
      Let's say you want to read the upper bound from the MPX register
      bnd2 out of the xsave buf.  You do:
      
      	bndregno = 2;
      	upper_bound = xsave_buf->bndregs.bndregs[2*bndregno+1];
      
      That kinda sucks.  Every time you access it, you need to know:
      1. Each bndX register is two entries wide in "bndregs"
      2. The lower comes first followed by upper.  We do the +1 to get
         upper vs. lower.
      
      This replaces the old definition.  You can now access them
      indexed by the register number directly, and with a meaningful
      name for the lower and upper bound:
      
      	bndregno = 2;
      	xsave_buf->bndreg[bndregno].upper_bound;
      
      It's now *VERY* clear that there are 4 registers.  The programmer
      now doesn't have to care what order the lower and upper bounds
      are in, and it's harder to get it wrong.
      
      [ tglx: Changed ub/lb to upper_bound/lower_bound and renamed struct
      bndreg_struct to struct bndreg ]
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: x86@kernel.org
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: "Yu, Fenghua" <fenghua.yu@intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141031215820.5EA5E0EC@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      c04e051c
    • D
      x86: Remove arbitrary instruction size limit in instruction decoder · 6ba48ff4
      Dave Hansen 提交于
      The current x86 instruction decoder steps along through the
      instruction stream but always ensures that it never steps farther
      than the largest possible instruction size (MAX_INSN_SIZE).
      
      The MPX code is now going to be doing some decoding of userspace
      instructions.  We copy those from userspace in to the kernel and
      they're obviously completely untrusted coming from userspace.  In
      addition to the constraint that instructions can only be so long,
      we also have to be aware of how long the buffer is that came in
      from userspace.  This _looks_ to be similar to what the perf and
      kprobes is doing, but it's unclear to me whether they are
      affected.
      
      The whole reason we need this is that it is perfectly valid to be
      executing an instruction within MAX_INSN_SIZE bytes of an
      unreadable page. We should be able to gracefully handle short
      reads in those cases.
      
      This adds support to the decoder to record how long the buffer
      being decoded is and to refuse to "validate" the instruction if
      we would have gone over the end of the buffer to decode it.
      
      The kprobes code probably needs to be looked at here a bit more
      carefully.  This patch still respects the MAX_INSN_SIZE limit
      there but the kprobes code does look like it might be able to
      be a bit more strict than it currently is.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Acked-by: NJim Keniston <jkenisto@us.ibm.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: x86@kernel.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20141114153957.E6B01535@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6ba48ff4
  2. 02 11月, 2014 3 次提交
    • P
      KVM: vmx: defer load of APIC access page address during reset · a73896cb
      Paolo Bonzini 提交于
      Most call paths to vmx_vcpu_reset do not hold the SRCU lock.  Defer loading
      the APIC access page to the next vmentry.
      
      This avoids the following lockdep splat:
      
      [ INFO: suspicious RCU usage. ]
      3.18.0-rc2-test2+ #70 Not tainted
      -------------------------------
      include/linux/kvm_host.h:474 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 0
      1 lock held by qemu-system-x86/2371:
       #0:  (&vcpu->mutex){+.+...}, at: [<ffffffffa037d800>] vcpu_load+0x20/0xd0 [kvm]
      
      stack backtrace:
      CPU: 4 PID: 2371 Comm: qemu-system-x86 Not tainted 3.18.0-rc2-test2+ #70
      Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
       0000000000000001 ffff880209983ca8 ffffffff816f514f 0000000000000000
       ffff8802099b8990 ffff880209983cd8 ffffffff810bd687 00000000000fee00
       ffff880208a2c000 ffff880208a10000 ffff88020ef50040 ffff880209983d08
      Call Trace:
       [<ffffffff816f514f>] dump_stack+0x4e/0x71
       [<ffffffff810bd687>] lockdep_rcu_suspicious+0xe7/0x120
       [<ffffffffa037d055>] gfn_to_memslot+0xd5/0xe0 [kvm]
       [<ffffffffa03807d3>] __gfn_to_pfn+0x33/0x60 [kvm]
       [<ffffffffa0380885>] gfn_to_page+0x25/0x90 [kvm]
       [<ffffffffa038aeec>] kvm_vcpu_reload_apic_access_page+0x3c/0x80 [kvm]
       [<ffffffffa08f0a9c>] vmx_vcpu_reset+0x20c/0x460 [kvm_intel]
       [<ffffffffa039ab8e>] kvm_vcpu_reset+0x15e/0x1b0 [kvm]
       [<ffffffffa039ac0c>] kvm_arch_vcpu_setup+0x2c/0x50 [kvm]
       [<ffffffffa037f7e0>] kvm_vm_ioctl+0x1d0/0x780 [kvm]
       [<ffffffff810bc664>] ? __lock_is_held+0x54/0x80
       [<ffffffff812231f0>] do_vfs_ioctl+0x300/0x520
       [<ffffffff8122ee45>] ? __fget+0x5/0x250
       [<ffffffff8122f0fa>] ? __fget_light+0x2a/0xe0
       [<ffffffff81223491>] SyS_ioctl+0x81/0xa0
       [<ffffffff816fed6d>] system_call_fastpath+0x16/0x1b
      Reported-by: NTakashi Iwai <tiwai@suse.de>
      Reported-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Reviewed-by: NWanpeng Li <wanpeng.li@linux.intel.com>
      Tested-by: NWanpeng Li <wanpeng.li@linux.intel.com>
      Fixes: 38b99173Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a73896cb
    • J
      KVM: nVMX: Disable preemption while reading from shadow VMCS · 282da870
      Jan Kiszka 提交于
      In order to access the shadow VMCS, we need to load it. At this point,
      vmx->loaded_vmcs->vmcs and the actually loaded one start to differ. If
      we now get preempted by Linux, vmx_vcpu_put and, on return, the
      vmx_vcpu_load will work against the wrong vmcs. That can cause
      copy_shadow_to_vmcs12 to corrupt the vmcs12 state.
      
      Fix the issue by disabling preemption during the copy operation.
      copy_vmcs12_to_shadow is safe from this issue as it is executed by
      vmx_vcpu_run when preemption is already disabled before vmentry.
      
      This bug is exposed by running Jailhouse within KVM on CPUs with
      shadow VMCS support.  Jailhouse never expects an interrupt pending
      vmexit, but the bug can cause it if, after copy_shadow_to_vmcs12
      is preempted, the active VMCS happens to have the virtual interrupt
      pending flag set in the CPU-based execution controls.
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      282da870
    • N
      KVM: x86: Fix far-jump to non-canonical check · 7e46dddd
      Nadav Amit 提交于
      Commit d1442d85 ("KVM: x86: Handle errors when RIP is set during far
      jumps") introduced a bug that caused the fix to be incomplete.  Due to
      incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit
      segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may
      not trigger #GP.  As we know, this imposes a security problem.
      
      In addition, the condition for two warnings was incorrect.
      
      Fixes: d1442d85Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
      [Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7e46dddd
  3. 01 11月, 2014 1 次提交
    • A
      x86_64, entry: Fix out of bounds read on sysenter · 653bc77a
      Andy Lutomirski 提交于
      Rusty noticed a Really Bad Bug (tm) in my NT fix.  The entry code
      reads out of bounds, causing the NT fix to be unreliable.  But, and
      this is much, much worse, if your stack is somehow just below the
      top of the direct map (or a hole), you read out of bounds and crash.
      
      Excerpt from the crash:
      
      [    1.129513] RSP: 0018:ffff88001da4bf88  EFLAGS: 00010296
      
        2b:*    f7 84 24 90 00 00 00     testl  $0x4000,0x90(%rsp)
      
      That read is deterministically above the top of the stack.  I
      thought I even single-stepped through this code when I wrote it to
      check the offset, but I clearly screwed it up.
      
      Fixes: 8c7aa698 ("x86_64, entry: Filter RFLAGS.NT on entry from userspace")
      Reported-by: NRusty Russell <rusty@ozlabs.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      653bc77a
  4. 29 10月, 2014 7 次提交
    • P
      KVM: emulator: fix execution close to the segment limit · fd56e154
      Paolo Bonzini 提交于
      Emulation of code that is 14 bytes to the segment limit or closer
      (e.g. RIP = 0xFFFFFFF2 after reset) is broken because we try to read as
      many as 15 bytes from the beginning of the instruction, and __linearize
      fails when the passed (address, size) pair reaches out of the segment.
      
      To fix this, let __linearize return the maximum accessible size (clamped
      to 2^32-1) for usage in __do_insn_fetch_bytes, and avoid the limit check
      by passing zero for the desired size.
      
      For expand-down segments, __linearize is performing a redundant check.
      (u32)(addr.ea + size - 1) <= lim can only happen if addr.ea is close
      to 4GB; in this case, addr.ea + size - 1 will also fail the check against
      the upper bound of the segment (which is provided by the D/B bit).
      After eliminating the redundant check, it is simple to compute
      the *max_size for expand-down segments too.
      
      Now that the limit check is done in __do_insn_fetch_bytes, we want
      to inject a general protection fault there if size < op_size (like
      __linearize would have done), instead of just aborting.
      
      This fixes booting Tiano Core from emulated flash with EPT disabled.
      
      Cc: stable@vger.kernel.org
      Fixes: 719d5a9bReported-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fd56e154
    • P
      KVM: emulator: fix error code for __linearize · 3606189f
      Paolo Bonzini 提交于
      The error code for #GP and #SS is zero when the segment is used to
      access an operand or an instruction.  It is only non-zero when
      a segment register is being loaded; for limit checks this means
      cases such as:
      
      * for #GP, when RIP is beyond the limit on a far call (before the first
      instruction is executed).  We do not implement this check, but it
      would be in em_jmp_far/em_call_far.
      
      * for #SS, if the new stack overflows during an inter-privilege-level
      call to a non-conforming code segment.  We do not implement stack
      switching at all.
      
      So use an error code of zero.
      Reviewed-by: NNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3606189f
    • I
      perf/x86/intel: Revert incomplete and undocumented Broadwell client support · 1776b106
      Ingo Molnar 提交于
      These patches:
      
        86a349a2 ("perf/x86/intel: Add Broadwell core support")
        c46e665f ("perf/x86: Add INST_RETIRED.ALL workarounds")
        fdda3c4a ("perf/x86/intel: Use Broadwell cache event list for Haswell")
      
      introduced magic constants and unexplained changes:
      
        https://lkml.org/lkml/2014/10/28/1128
        https://lkml.org/lkml/2014/10/27/325
        https://lkml.org/lkml/2014/8/27/546
        https://lkml.org/lkml/2014/10/28/546
      
      Peter Zijlstra has attempted to help out, to clean up the mess:
      
        https://lkml.org/lkml/2014/10/28/543
      
      But has not received helpful and constructive replies which makes
      me doubt wether it can all be finished in time until v3.18 is
      released.
      
      Despite various review feedback the author (Andi Kleen) has answered
      only few of the review questions and has generally been uncooperative,
      only giving replies when prompted repeatedly, and only giving minimal
      answers instead of constructively explaining and helping along the effort.
      
      That kind of behavior is not acceptable.
      
      There's also a boot crash on Intel E5-1630 v3 CPUs reported for another
      commit from Andi Kleen:
      
        e735b9db ("perf/x86/intel/uncore: Add Haswell-EP uncore support")
      
        https://lkml.org/lkml/2014/10/22/730
      
      Which is not yet resolved. The uncore driver is independent in theory,
      but the crash makes me worry about how well all these patches were
      tested and makes me uneasy about the level of interminging that the
      Broadwell and Haswell code has received by the commits above.
      
      As a first step to resolve the mess revert the Broadwell client commits
      back to the v3.17 version, before we run out of time and problematic
      code hits a stable upstream kernel.
      
      ( If the Haswell-EP crash is not resolved via a simple fix then we'll have
        to revert the Haswell-EP uncore driver as well. )
      
      The Broadwell client series has to be submitted in a clean fashion, with
      single, well documented changes per patch. If they are submitted in time
      and are accepted during review then they can possibly go into v3.19 but
      will need additional scrutiny due to the rocky history of this patch set.
      
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1409683455-29168-3-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1776b106
    • D
      x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE · d1cd1210
      Dexuan Cui 提交于
      pte_pfn() returns a PFN of long (32 bits in 32-PAE), so "long <<
      PAGE_SHIFT" will overflow for PFNs above 4GB.
      
      Due to this issue, some Linux 32-PAE distros, running as guests on Hyper-V,
      with 5GB memory assigned, can't load the netvsc driver successfully and
      hence the synthetic network device can't work (we can use the kernel parameter
      mem=3000M to work around the issue).
      
      Cast pte_pfn() to phys_addr_t before shifting.
      
      Fixes: "commit d7656534: x86, mm: Create slow_virt_to_phys()"
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: gregkh@linuxfoundation.org
      Cc: linux-mm@kvack.org
      Cc: olaf@aepfle.de
      Cc: apw@canonical.com
      Cc: jasowang@redhat.com
      Cc: dave.hansen@intel.com
      Cc: riel@redhat.com
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1414580017-27444-1-git-send-email-decui@microsoft.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      d1cd1210
    • J
      ACPI, irq, x86: Return IRQ instead of GSI in mp_register_gsi() · b77e8f43
      Jiang Liu 提交于
      Function mp_register_gsi() returns blindly the GSI number for the ACPI
      SCI interrupt. That causes a regression when the GSI for ACPI SCI is
      shared with other devices.
      
      The regression was caused by commit 84245af7 "x86, irq, ACPI:
      Change __acpi_register_gsi to return IRQ number instead of GSI" and
      exposed on a SuperMicro system, which shares one GSI between ACPI SCI
      and PCI device, with following failure:
      
      http://sourceforge.net/p/linux1394/mailman/linux1394-user/?viewmonth=201410
      [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low
      level)
      [    2.699224] firewire_ohci 0000:06:00.0: failed to allocate interrupt
      20
      
      Return mp_map_gsi_to_irq(gsi, 0) instead of the GSI number.
      Reported-and-Tested-by: NDaniel Robbins <drobbins@funtoo.org>
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: <stable@vger.kernel.org> # 3.17
      Link: http://lkml.kernel.org/r/1414387308-27148-4-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b77e8f43
    • J
      x86, intel-mid: Create IRQs for APB timers and RTC timers · f1829859
      Jiang Liu 提交于
      Intel MID platforms has no legacy interrupts, so no IRQ descriptors
      preallocated. We need to call mp_map_gsi_to_irq() to create IRQ
      descriptors for APB timers and RTC timers, otherwise it may cause
      invalid memory access as:
      [    0.116839] BUG: unable to handle kernel NULL pointer dereference at
      0000003a
      [    0.123803] IP: [<c1071c0e>] setup_irq+0xf/0x4d
      Tested-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # 3.17
      Link: http://lkml.kernel.org/r/1414387308-27148-3-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      f1829859
    • D
      x86: Don't enable F00F workaround on Intel Quark processors · d4e1a0af
      Dave Jones 提交于
      The Intel Quark processor is a part of family 5, but does not have the
      F00F bug present in Pentiums of the same family.
      
      Pentiums were models 0 through 8, Quark is model 9.
      Signed-off-by: NDave Jones <davej@redhat.com>
      Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Link: http://lkml.kernel.org/r/20141028175753.GA12743@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d4e1a0af
  5. 28 10月, 2014 5 次提交
  6. 25 10月, 2014 1 次提交
  7. 24 10月, 2014 13 次提交
  8. 23 10月, 2014 4 次提交
    • M
      x86/xen: panic on bad Xen-provided memory map · 1ea644c8
      Martin Kelly 提交于
      Panic if Xen provides a memory map with 0 entries. Although this is
      unlikely, it is better to catch the error at the point of seeing the map
      than later on as a symptom of some other crash.
      Signed-off-by: NMartin Kelly <martkell@amazon.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      1ea644c8
    • B
      x86/xen: Fix incorrect per_cpu accessor in xen_clocksource_read() · 3251f20b
      Boris Ostrovsky 提交于
      Commit 89cbc767 ("x86: Replace __get_cpu_var uses") replaced
      __get_cpu_var() with this_cpu_ptr() in xen_clocksource_read() in such a
      way that instead of accessing a structure pointed to by a per-cpu pointer
      we are trying to get to a per-cpu structure.
      
      __this_cpu_read() of the pointer is the more appropriate accessor.
      Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      3251f20b
    • J
      x86/xen: avoid race in p2m handling · 3a0e94f8
      Juergen Gross 提交于
      When a new p2m leaf is allocated this leaf is linked into the p2m tree
      via cmpxchg. Unfortunately the compare value for checking the success
      of the update is read after checking for the need of a new leaf. It is
      possible that a new leaf has been linked into the tree concurrently
      in between. This could lead to a leaked memory page and to the loss of
      some p2m entries.
      
      Avoid the race by using the read compare value for checking the need
      of a new p2m leaf and use ACCESS_ONCE() to get it.
      
      There are other places which seem to need ACCESS_ONCE() to ensure
      proper operation. Change them accordingly.
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      3a0e94f8
    • J
      x86/xen: delay construction of mfn_list_list · 2c185687
      Juergen Gross 提交于
      The 3 level p2m tree for the Xen tools is constructed very early at
      boot by calling xen_build_mfn_list_list(). Memory needed for this tree
      is allocated via extend_brk().
      
      As this tree (other than the kernel internal p2m tree) is only needed
      for domain save/restore, live migration and crash dump analysis it
      doesn't matter whether it is constructed very early or just some
      milliseconds later when memory allocation is possible by other means.
      
      This patch moves the call of xen_build_mfn_list_list() just after
      calling xen_pagetable_p2m_copy() simplifying this function, too, as it
      doesn't have to bother with two parallel trees now. The same applies
      for some other internal functions.
      
      While simplifying code, make early_can_reuse_p2m_middle() static and
      drop the unused second parameter. p2m_mid_identity_mfn can be removed
      as well, it isn't used either.
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      2c185687