1. 18 11月, 2014 8 次提交
    • D
      x86, mpx: Decode MPX instruction to get bound violation information · fcc7ffd6
      Dave Hansen 提交于
      This patch sets bound violation fields of siginfo struct in #BR
      exception handler by decoding the user instruction and constructing
      the faulting pointer.
      
      We have to be very careful when decoding these instructions.  They
      are completely controlled by userspace and may be changed at any
      time up to and including the point where we try to copy them in to
      the kernel.  They may or may not be MPX instructions and could be
      completely invalid for all we know.
      
      Note: This code is based on Qiaowei Ren's specialized MPX
      decoder, but uses the generic decoder whenever possible.  It was
      tested for robustness by generating a completely random data
      stream and trying to decode that stream.  I also unmapped random
      pages inside the stream to test the "partial instruction" short
      read code.
      
      We kzalloc() the siginfo instead of stack allocating it because
      we need to memset() it anyway, and doing this makes it much more
      clear when it got initialized by the MPX instruction decoder.
      
      Changes from the old decoder:
       * Use the generic decoder instead of custom functions.  Saved
         ~70 lines of code overall.
       * Remove insn->addr_bytes code (never used??)
       * Make sure never to possibly overflow the regoff[] array, plus
         check the register range correctly in 32 and 64-bit modes.
       * Allow get_reg() to return an error and have mpx_get_addr_ref()
         handle when it sees errors.
       * Only call insn_get_*() near where we actually use the values
         instead if trying to call them all at once.
       * Handle short reads from copy_from_user() and check the actual
         number of read bytes against what we expect from
         insn_get_length().  If a read stops in the middle of an
         instruction, we error out.
       * Actually check the opcodes intead of ignoring them.
       * Dynamically kzalloc() siginfo_t so we don't leak any stack
         data.
       * Detect and handle decoder failures instead of ignoring them.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Based-on-patch-by: NQiaowei Ren <qiaowei.ren@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151828.5BDD0915@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      fcc7ffd6
    • Q
      x86, mpx: Add MPX-specific mmap interface · 57319d80
      Qiaowei Ren 提交于
      We have chosen to perform the allocation of bounds tables in
      kernel (See the patch "on-demand kernel allocation of bounds
      tables") and to mark these VMAs with VM_MPX.
      
      However, there is currently no suitable interface to actually do
      this.  Existing interfaces, like do_mmap_pgoff(), have no way to
      set a modified ->vm_ops or ->vm_flags and don't hold mmap_sem
      long enough to let a caller do it.
      
      This patch wraps mmap_region() and hold mmap_sem long enough to
      make the modifications to the VMA which we need.
      
      Also note the 32/64-bit #ifdef in the header.  We actually need
      to do this at runtime eventually.  But, for now, we don't support
      running 32-bit binaries on 64-bit kernels.  Support for this will
      come in later patches.
      Signed-off-by: NQiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151827.CE440F67@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      57319d80
    • D
      x86, mpx: Add MPX to disabled features · 95290cf1
      Dave Hansen 提交于
      This allows us to use cpu_feature_enabled(X86_FEATURE_MPX) as
      both a runtime and compile-time check.
      
      When CONFIG_X86_INTEL_MPX is disabled,
      cpu_feature_enabled(X86_FEATURE_MPX) will evaluate at
      compile-time to 0. If CONFIG_X86_INTEL_MPX=y, then the cpuid
      flag will be checked at runtime.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151823.B358EAD2@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      95290cf1
    • Q
      ia64: Sync struct siginfo with general version · 53f037b0
      Qiaowei Ren 提交于
      New fields about bound violation are added into general struct
      siginfo. This will impact MIPS and IA64, which extend general
      struct siginfo. This patch syncs this struct for IA64 with
      general version.
      Signed-off-by: NQiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151822.82B3B486@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      53f037b0
    • Q
      mips: Sync struct siginfo with general version · 232b5fff
      Qiaowei Ren 提交于
      New fields about bound violation are added into general struct
      siginfo. This will impact MIPS and IA64, which extend general
      struct siginfo. This patch syncs this struct for MIPS with
      general version.
      Signed-off-by: NQiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151820.F7EDC3CC@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      232b5fff
    • D
      x86, mpx: Rename cfg_reg_u and status_reg · 62e7759b
      Dave Hansen 提交于
      According to Intel SDM extension, MPX configuration and status registers
      should be BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u and
      status_reg to bndcfgu and bndstatus.
      
      [ tglx: Renamed 'struct bndscr_struct' to 'struct bndscr' ]
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Link: http://lkml.kernel.org/r/20141114151817.031762AC@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      62e7759b
    • D
      x86: mpx: Give bndX registers actual names · c04e051c
      Dave Hansen 提交于
      Consider the bndX MPX registers.  There 4 registers each
      containing a 64-bit lower and a 64-bit upper bound.  That's 8*64
      bits and we declare it thusly:
      
      	struct bndregs_struct {
      		u64 bndregs[8];
      	}
          
      Let's say you want to read the upper bound from the MPX register
      bnd2 out of the xsave buf.  You do:
      
      	bndregno = 2;
      	upper_bound = xsave_buf->bndregs.bndregs[2*bndregno+1];
      
      That kinda sucks.  Every time you access it, you need to know:
      1. Each bndX register is two entries wide in "bndregs"
      2. The lower comes first followed by upper.  We do the +1 to get
         upper vs. lower.
      
      This replaces the old definition.  You can now access them
      indexed by the register number directly, and with a meaningful
      name for the lower and upper bound:
      
      	bndregno = 2;
      	xsave_buf->bndreg[bndregno].upper_bound;
      
      It's now *VERY* clear that there are 4 registers.  The programmer
      now doesn't have to care what order the lower and upper bounds
      are in, and it's harder to get it wrong.
      
      [ tglx: Changed ub/lb to upper_bound/lower_bound and renamed struct
      bndreg_struct to struct bndreg ]
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: x86@kernel.org
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: "Yu, Fenghua" <fenghua.yu@intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141031215820.5EA5E0EC@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      c04e051c
    • D
      x86: Remove arbitrary instruction size limit in instruction decoder · 6ba48ff4
      Dave Hansen 提交于
      The current x86 instruction decoder steps along through the
      instruction stream but always ensures that it never steps farther
      than the largest possible instruction size (MAX_INSN_SIZE).
      
      The MPX code is now going to be doing some decoding of userspace
      instructions.  We copy those from userspace in to the kernel and
      they're obviously completely untrusted coming from userspace.  In
      addition to the constraint that instructions can only be so long,
      we also have to be aware of how long the buffer is that came in
      from userspace.  This _looks_ to be similar to what the perf and
      kprobes is doing, but it's unclear to me whether they are
      affected.
      
      The whole reason we need this is that it is perfectly valid to be
      executing an instruction within MAX_INSN_SIZE bytes of an
      unreadable page. We should be able to gracefully handle short
      reads in those cases.
      
      This adds support to the decoder to record how long the buffer
      being decoded is and to refuse to "validate" the instruction if
      we would have gone over the end of the buffer to decode it.
      
      The kprobes code probably needs to be looked at here a bit more
      carefully.  This patch still respects the MAX_INSN_SIZE limit
      there but the kprobes code does look like it might be able to
      be a bit more strict than it currently is.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Acked-by: NJim Keniston <jkenisto@us.ibm.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: x86@kernel.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20141114153957.E6B01535@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6ba48ff4
  2. 09 11月, 2014 1 次提交
  3. 07 11月, 2014 1 次提交
    • M
      MIPS: Fix build with binutils 2.24.51+ · 842dfc11
      Manuel Lauss 提交于
      Starting with version 2.24.51.20140728 MIPS binutils complain loudly
      about mixing soft-float and hard-float object files, leading to this
      build failure since GCC is invoked with "-msoft-float" on MIPS:
      
      {standard input}: Warning: .gnu_attribute 4,3 requires `softfloat'
        LD      arch/mips/alchemy/common/built-in.o
      mipsel-softfloat-linux-gnu-ld: Warning: arch/mips/alchemy/common/built-in.o
       uses -msoft-float (set by arch/mips/alchemy/common/prom.o),
       arch/mips/alchemy/common/sleeper.o uses -mhard-float
      
      To fix this, we detect if GAS is new enough to support "-msoft-float" command
      option, and if it does, we can let GCC pass it to GAS;  but then we also need
      to sprinkle the files which make use of floating point registers with the
      necessary ".set hardfloat" directives.
      Signed-off-by: NManuel Lauss <manuel.lauss@gmail.com>
      Cc: Linux-MIPS <linux-mips@linux-mips.org>
      Cc: Matthew Fortune <Matthew.Fortune@imgtec.com>
      Cc: Markos Chandras <Markos.Chandras@imgtec.com>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Patchwork: https://patchwork.linux-mips.org/patch/8355/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      842dfc11
  4. 06 11月, 2014 5 次提交
  5. 04 11月, 2014 3 次提交
  6. 02 11月, 2014 4 次提交
    • P
      KVM: vmx: defer load of APIC access page address during reset · a73896cb
      Paolo Bonzini 提交于
      Most call paths to vmx_vcpu_reset do not hold the SRCU lock.  Defer loading
      the APIC access page to the next vmentry.
      
      This avoids the following lockdep splat:
      
      [ INFO: suspicious RCU usage. ]
      3.18.0-rc2-test2+ #70 Not tainted
      -------------------------------
      include/linux/kvm_host.h:474 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 0
      1 lock held by qemu-system-x86/2371:
       #0:  (&vcpu->mutex){+.+...}, at: [<ffffffffa037d800>] vcpu_load+0x20/0xd0 [kvm]
      
      stack backtrace:
      CPU: 4 PID: 2371 Comm: qemu-system-x86 Not tainted 3.18.0-rc2-test2+ #70
      Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
       0000000000000001 ffff880209983ca8 ffffffff816f514f 0000000000000000
       ffff8802099b8990 ffff880209983cd8 ffffffff810bd687 00000000000fee00
       ffff880208a2c000 ffff880208a10000 ffff88020ef50040 ffff880209983d08
      Call Trace:
       [<ffffffff816f514f>] dump_stack+0x4e/0x71
       [<ffffffff810bd687>] lockdep_rcu_suspicious+0xe7/0x120
       [<ffffffffa037d055>] gfn_to_memslot+0xd5/0xe0 [kvm]
       [<ffffffffa03807d3>] __gfn_to_pfn+0x33/0x60 [kvm]
       [<ffffffffa0380885>] gfn_to_page+0x25/0x90 [kvm]
       [<ffffffffa038aeec>] kvm_vcpu_reload_apic_access_page+0x3c/0x80 [kvm]
       [<ffffffffa08f0a9c>] vmx_vcpu_reset+0x20c/0x460 [kvm_intel]
       [<ffffffffa039ab8e>] kvm_vcpu_reset+0x15e/0x1b0 [kvm]
       [<ffffffffa039ac0c>] kvm_arch_vcpu_setup+0x2c/0x50 [kvm]
       [<ffffffffa037f7e0>] kvm_vm_ioctl+0x1d0/0x780 [kvm]
       [<ffffffff810bc664>] ? __lock_is_held+0x54/0x80
       [<ffffffff812231f0>] do_vfs_ioctl+0x300/0x520
       [<ffffffff8122ee45>] ? __fget+0x5/0x250
       [<ffffffff8122f0fa>] ? __fget_light+0x2a/0xe0
       [<ffffffff81223491>] SyS_ioctl+0x81/0xa0
       [<ffffffff816fed6d>] system_call_fastpath+0x16/0x1b
      Reported-by: NTakashi Iwai <tiwai@suse.de>
      Reported-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Reviewed-by: NWanpeng Li <wanpeng.li@linux.intel.com>
      Tested-by: NWanpeng Li <wanpeng.li@linux.intel.com>
      Fixes: 38b99173Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a73896cb
    • J
      KVM: nVMX: Disable preemption while reading from shadow VMCS · 282da870
      Jan Kiszka 提交于
      In order to access the shadow VMCS, we need to load it. At this point,
      vmx->loaded_vmcs->vmcs and the actually loaded one start to differ. If
      we now get preempted by Linux, vmx_vcpu_put and, on return, the
      vmx_vcpu_load will work against the wrong vmcs. That can cause
      copy_shadow_to_vmcs12 to corrupt the vmcs12 state.
      
      Fix the issue by disabling preemption during the copy operation.
      copy_vmcs12_to_shadow is safe from this issue as it is executed by
      vmx_vcpu_run when preemption is already disabled before vmentry.
      
      This bug is exposed by running Jailhouse within KVM on CPUs with
      shadow VMCS support.  Jailhouse never expects an interrupt pending
      vmexit, but the bug can cause it if, after copy_shadow_to_vmcs12
      is preempted, the active VMCS happens to have the virtual interrupt
      pending flag set in the CPU-based execution controls.
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      282da870
    • N
      KVM: x86: Fix far-jump to non-canonical check · 7e46dddd
      Nadav Amit 提交于
      Commit d1442d85 ("KVM: x86: Handle errors when RIP is set during far
      jumps") introduced a bug that caused the fix to be incomplete.  Due to
      incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit
      segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may
      not trigger #GP.  As we know, this imposes a security problem.
      
      In addition, the condition for two warnings was incorrect.
      
      Fixes: d1442d85Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
      [Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7e46dddd
    • D
      powerpc: use device_online/offline() instead of cpu_up/down() · 10ccaf17
      Dan Streetman 提交于
      In powerpc pseries platform dlpar operations, use device_online() and
      device_offline() instead of cpu_up() and cpu_down().
      
      Calling cpu_up/down() directly does not update the cpu device offline
      field, which is used to online/offline a cpu from sysfs. Calling
      device_online/offline() instead keeps the sysfs cpu online value
      correct. The hotplug lock, which is required to be held when calling
      device_online/offline(), is already held when dlpar_online/offline_cpu()
      are called, since they are called only from cpu_probe|release_store().
      
      This patch fixes errors on phyp (PowerVM) systems that have cpu(s)
      added/removed using dlpar operations; without this patch, the
      /sys/devices/system/cpu/cpuN/online nodes do not correctly show the
      online state of added/removed cpus.
      Signed-off-by: NDan Streetman <ddstreet@ieee.org>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Fixes: 0902a904 ("Driver core: Use generic offline/online for CPU offline/online")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      10ccaf17
  7. 01 11月, 2014 2 次提交
    • A
      x86_64, entry: Fix out of bounds read on sysenter · 653bc77a
      Andy Lutomirski 提交于
      Rusty noticed a Really Bad Bug (tm) in my NT fix.  The entry code
      reads out of bounds, causing the NT fix to be unreliable.  But, and
      this is much, much worse, if your stack is somehow just below the
      top of the direct map (or a hole), you read out of bounds and crash.
      
      Excerpt from the crash:
      
      [    1.129513] RSP: 0018:ffff88001da4bf88  EFLAGS: 00010296
      
        2b:*    f7 84 24 90 00 00 00     testl  $0x4000,0x90(%rsp)
      
      That read is deterministically above the top of the stack.  I
      thought I even single-stepped through this code when I wrote it to
      check the offset, but I clearly screwed it up.
      
      Fixes: 8c7aa698 ("x86_64, entry: Filter RFLAGS.NT on entry from userspace")
      Reported-by: NRusty Russell <rusty@ozlabs.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      653bc77a
    • T
      net: smc91x: Fix gpios for device tree based booting · 7d2911c4
      Tony Lindgren 提交于
      With legacy booting, the platform init code was taking care of
      the configuring of GPIOs. With device tree based booting, things
      may or may not work depending what bootloader has configured or
      if the legacy platform code gets called.
      
      Let's add support for the pwrdn and reset GPIOs to the smc91x
      driver to fix the issues of smc91x not working properly when
      booted in device tree mode.
      
      And let's change n900 to use these settings as some versions
      of the bootloader do not configure things properly causing
      errors.
      Reported-by: NKevin Hilman <khilman@linaro.org>
      Signed-off-by: NTony Lindgren <tony@atomide.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d2911c4
  8. 31 10月, 2014 3 次提交
  9. 30 10月, 2014 8 次提交
  10. 29 10月, 2014 5 次提交
    • P
      KVM: emulator: fix execution close to the segment limit · fd56e154
      Paolo Bonzini 提交于
      Emulation of code that is 14 bytes to the segment limit or closer
      (e.g. RIP = 0xFFFFFFF2 after reset) is broken because we try to read as
      many as 15 bytes from the beginning of the instruction, and __linearize
      fails when the passed (address, size) pair reaches out of the segment.
      
      To fix this, let __linearize return the maximum accessible size (clamped
      to 2^32-1) for usage in __do_insn_fetch_bytes, and avoid the limit check
      by passing zero for the desired size.
      
      For expand-down segments, __linearize is performing a redundant check.
      (u32)(addr.ea + size - 1) <= lim can only happen if addr.ea is close
      to 4GB; in this case, addr.ea + size - 1 will also fail the check against
      the upper bound of the segment (which is provided by the D/B bit).
      After eliminating the redundant check, it is simple to compute
      the *max_size for expand-down segments too.
      
      Now that the limit check is done in __do_insn_fetch_bytes, we want
      to inject a general protection fault there if size < op_size (like
      __linearize would have done), instead of just aborting.
      
      This fixes booting Tiano Core from emulated flash with EPT disabled.
      
      Cc: stable@vger.kernel.org
      Fixes: 719d5a9bReported-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fd56e154
    • P
      KVM: emulator: fix error code for __linearize · 3606189f
      Paolo Bonzini 提交于
      The error code for #GP and #SS is zero when the segment is used to
      access an operand or an instruction.  It is only non-zero when
      a segment register is being loaded; for limit checks this means
      cases such as:
      
      * for #GP, when RIP is beyond the limit on a far call (before the first
      instruction is executed).  We do not implement this check, but it
      would be in em_jmp_far/em_call_far.
      
      * for #SS, if the new stack overflows during an inter-privilege-level
      call to a non-conforming code segment.  We do not implement stack
      switching at all.
      
      So use an error code of zero.
      Reviewed-by: NNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3606189f
    • F
      ARM: 8182/1: l2c: Make l2x0_cache_size_of_parse() return 'int' · d0b92845
      Fabio Estevam 提交于
      Since commit f3354ab6 ("ARM: 8169/1: l2c: parse cache properties from
      ePAPR definitions") the following error is seen on imx6q:
      
      [    0.000000] PL310 OF: cache setting yield illegal associativity
      [    0.000000] PL310 OF: -2147097556 calculated, only 8 and 16 legal
      
      As imx6q does not pass the "cache-size" and "cache-sets" properties in DT, the function l2x0_cache_size_of_parse() returns early and keep the 'associativity' pointer uninitialized.
      
      To fix this problem, return error codes inside l2x0_cache_size_of_parse() and only use the 'associativity' pointer result if l2x0_cache_size_of_parse() succeeds.
      Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com>
      Reviewed-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      d0b92845
    • I
      perf/x86/intel: Revert incomplete and undocumented Broadwell client support · 1776b106
      Ingo Molnar 提交于
      These patches:
      
        86a349a2 ("perf/x86/intel: Add Broadwell core support")
        c46e665f ("perf/x86: Add INST_RETIRED.ALL workarounds")
        fdda3c4a ("perf/x86/intel: Use Broadwell cache event list for Haswell")
      
      introduced magic constants and unexplained changes:
      
        https://lkml.org/lkml/2014/10/28/1128
        https://lkml.org/lkml/2014/10/27/325
        https://lkml.org/lkml/2014/8/27/546
        https://lkml.org/lkml/2014/10/28/546
      
      Peter Zijlstra has attempted to help out, to clean up the mess:
      
        https://lkml.org/lkml/2014/10/28/543
      
      But has not received helpful and constructive replies which makes
      me doubt wether it can all be finished in time until v3.18 is
      released.
      
      Despite various review feedback the author (Andi Kleen) has answered
      only few of the review questions and has generally been uncooperative,
      only giving replies when prompted repeatedly, and only giving minimal
      answers instead of constructively explaining and helping along the effort.
      
      That kind of behavior is not acceptable.
      
      There's also a boot crash on Intel E5-1630 v3 CPUs reported for another
      commit from Andi Kleen:
      
        e735b9db ("perf/x86/intel/uncore: Add Haswell-EP uncore support")
      
        https://lkml.org/lkml/2014/10/22/730
      
      Which is not yet resolved. The uncore driver is independent in theory,
      but the crash makes me worry about how well all these patches were
      tested and makes me uneasy about the level of interminging that the
      Broadwell and Haswell code has received by the commits above.
      
      As a first step to resolve the mess revert the Broadwell client commits
      back to the v3.17 version, before we run out of time and problematic
      code hits a stable upstream kernel.
      
      ( If the Haswell-EP crash is not resolved via a simple fix then we'll have
        to revert the Haswell-EP uncore driver as well. )
      
      The Broadwell client series has to be submitted in a clean fashion, with
      single, well documented changes per patch. If they are submitted in time
      and are accepted during review then they can possibly go into v3.19 but
      will need additional scrutiny due to the rocky history of this patch set.
      
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1409683455-29168-3-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1776b106
    • D
      x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE · d1cd1210
      Dexuan Cui 提交于
      pte_pfn() returns a PFN of long (32 bits in 32-PAE), so "long <<
      PAGE_SHIFT" will overflow for PFNs above 4GB.
      
      Due to this issue, some Linux 32-PAE distros, running as guests on Hyper-V,
      with 5GB memory assigned, can't load the netvsc driver successfully and
      hence the synthetic network device can't work (we can use the kernel parameter
      mem=3000M to work around the issue).
      
      Cast pte_pfn() to phys_addr_t before shifting.
      
      Fixes: "commit d7656534: x86, mm: Create slow_virt_to_phys()"
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: gregkh@linuxfoundation.org
      Cc: linux-mm@kvack.org
      Cc: olaf@aepfle.de
      Cc: apw@canonical.com
      Cc: jasowang@redhat.com
      Cc: dave.hansen@intel.com
      Cc: riel@redhat.com
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1414580017-27444-1-git-send-email-decui@microsoft.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      d1cd1210