1. 21 Jun, 2013 1 commit
  2. 20 Jun, 2013 2 commits
    • x86: Fix trigger_all_cpu_backtrace() implementation · b52e0a7c
      Michel Lespinasse committed
      The following change fixes the x86 implementation of
      trigger_all_cpu_backtrace(), which was previously (accidentally,
      as far as I can tell) disabled to always return false as on
      architectures that do not implement this function.
      
      trigger_all_cpu_backtrace(), as defined in include/linux/nmi.h,
      should call arch_trigger_all_cpu_backtrace() if available, or
      return false if the underlying arch doesn't implement this
      function.
      
      x86 did provide a suitable arch_trigger_all_cpu_backtrace()
      implementation, but it wasn't actually being used because it was
      declared in asm/nmi.h, which linux/nmi.h doesn't include. Also,
      linux/nmi.h couldn't easily be fixed by including asm/nmi.h,
      because that file is not available on all architectures.
      
      I am proposing to fix this by moving the x86 definition of
      arch_trigger_all_cpu_backtrace() to asm/irq.h.
      
      Tested via: echo l > /proc/sysrq-trigger
      
      Before the change, this uses a fallback implementation which
      shows backtraces on active CPUs (using
      smp_call_function_interrupt()).

      After the change, this shows NMI backtraces on all CPUs.

      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1370518875-1346-1-git-send-email-walken@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
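      The header arrangement described in this commit can be sketched in
      userspace C. This is only an illustration, not the kernel source: the
      counter nmi_backtraces_sent and the stub body are assumptions made for
      the example. The point is that the generic wrapper only sees the arch
      hook if the macro of the same name is visible where the wrapper is
      compiled.

```c
#include <stdbool.h>

/* Stand-in for an arch header such as asm/irq.h (stub body is hypothetical). */
static int nmi_backtraces_sent;

static void arch_trigger_all_cpu_backtrace(void)
{
	nmi_backtraces_sent++;	/* a real arch would send NMIs here */
}
/* Advertise the hook so generic code can test for it with #ifdef. */
#define arch_trigger_all_cpu_backtrace arch_trigger_all_cpu_backtrace

/* Generic wrapper, in the spirit of include/linux/nmi.h. */
static inline bool trigger_all_cpu_backtrace(void)
{
#ifdef arch_trigger_all_cpu_backtrace
	arch_trigger_all_cpu_backtrace();
	return true;
#else
	return false;	/* arch provides no implementation */
#endif
}
```

      If the #define lives in a header that the generic header never
      includes, the #else branch is silently compiled instead, which matches
      the failure mode the commit message describes.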
    • x86: Fix section mismatch on load_ucode_ap · 94978599
      Paul Gortmaker committed
      We are in the process of removing all the __cpuinit annotations.
      While working on making that change, an existing problem was
      made evident:
      
        WARNING: arch/x86/kernel/built-in.o(.text+0x198f2): Section mismatch
        in reference from the function cpu_init() to the function
        .init.text:load_ucode_ap()   The function cpu_init() references
        the function __init load_ucode_ap().  This is often because cpu_init
        lacks a __init annotation or the annotation of load_ucode_ap is wrong.
      
      This now appears because in my working tree, cpu_init() is no longer
      tagged as __cpuinit, and so the audit picks up the mismatch.  The 2nd
      hypothesis from the audit is the correct one, as there was an incorrect
      __init tag on the prototype in the header (but __cpuinit was used on
      the function itself.)
      
      The audit is telling us that the prototype's __init annotation took
      effect and the function did land in the .init.text section.  Checking
      with objdump on a mainline tree that still has __cpuinit shows that
      the __cpuinit on the function takes precedence over the __init on the
      prototype, but that won't be true once we make __cpuinit a no-op.
      
      Even though we are removing __cpuinit, we temporarily align both
      the function and the prototype on __cpuinit so that the changeset
      can be applied to stable trees if desired.
      
      [ hpa: build fix only, no object code change ]
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: stable <stable@vger.kernel.org> # 3.9+
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Link: http://lkml.kernel.org/r/1371654926-11729-1-git-send-email-paul.gortmaker@windriver.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  3. 19 Jun, 2013 4 commits
  4. 18 Jun, 2013 1 commit
  5. 13 Jun, 2013 3 commits
  6. 12 Jun, 2013 1 commit
    • idle: Add the stack canary init to cpu_startup_entry() · d7880812
      Thomas Gleixner committed
      Moving x86 to the generic idle implementation (commit 7d1a9417 "x86:
      Use generic idle loop") wrecked the stack protector.
      
      I stupidly missed that boot_init_stack_canary() must be inlined from a
      function which never returns, but I put that call into
      arch_cpu_idle_prepare() which of course returns.
      
      I first considered playing tricks with arch_cpu_idle_prepare(), but
      then I noticed that the other archs which have implemented the
      stackprotector (ARM and SH) do not initialize the canary for the
      non-boot cpus.

      So I decided to move the boot_init_stack_canary() call into
      cpu_startup_entry(), under an #ifdef CONFIG_X86 for now. This #ifdef
      is just a temporary measure as I don't want to inflict the
      boot_init_stack_canary() call on ARM and SH that late in the cycle.
      
      I'll queue a patch for 3.11 which removes the #ifdef if the ARM/SH
      maintainers have no objection.

      Reported-by: Wouter van Kesteren <woutershep@gmail.com>
      Cc: x86@kernel.org
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  7. 11 Jun, 2013 1 commit
    • Modify UEFI anti-bricking code · f8b84043
      Matthew Garrett committed
      This patch reworks the UEFI anti-bricking code, including an effective
      reversion of cc5a080c and 31ff2f20. It turns out that calling
      QueryVariableInfo() from boot services results in some firmware
      implementations jumping to physical addresses even after entering virtual
      mode, so until we have 1:1 mappings for UEFI runtime space this isn't
      going to work so well.
      
      Reverting these gets us back to the situation where we'd refuse to create
      variables on some systems because they classify deleted variables as "used"
      until the firmware triggers a garbage collection run, which they won't do
      until they reach a lower threshold. This results in it being impossible to
      install a bootloader, which is unhelpful.
      
      Feedback from Samsung indicates that the firmware doesn't need more than
      5KB of storage space for its own purposes, so that seems like a reasonable
      threshold. However, there's still no guarantee that a platform will attempt
      garbage collection merely because it drops below this threshold. It seems
      that this is often only triggered if an attempt to write generates a
      genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to
      create a variable larger than the remaining space. This should fail, but if
      it somehow succeeds we can then immediately delete it.
      
      I've tested this on the UEFI machines I have available, but I don't have
      a Samsung and so can't verify that it avoids the bricking problem.

      Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
      Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ]
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
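      The policy described in this commit (keep a 5KB reserve, and provoke
      garbage collection with a deliberately oversized write) can be modelled
      with a toy variable store. Everything below is an assumption made for
      illustration: struct fake_store, fake_set_variable() and
      can_create_variable() are hypothetical names, and the "firmware" here
      simply reclaims deleted space whenever a write fails for lack of room.
      It is a sketch of the idea, not the kernel's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define EFI_MIN_RESERVE 5120	/* the 5KB threshold mentioned in the commit */

/* Toy variable store: deleted bytes still count as used until GC runs. */
struct fake_store {
	size_t total;
	size_t live;
	size_t deleted;
};

static size_t store_remaining(const struct fake_store *s)
{
	return s->total - s->live - s->deleted;
}

/* Toy firmware write: a write that does not fit triggers GC and fails. */
static bool fake_set_variable(struct fake_store *s, size_t size)
{
	if (size > store_remaining(s)) {
		s->deleted = 0;	/* garbage collection reclaims deleted space */
		return false;	/* models EFI_OUT_OF_RESOURCES */
	}
	s->live += size;
	return true;
}

/* Decide whether a new variable of 'size' bytes may be created. */
static bool can_create_variable(struct fake_store *s, size_t size)
{
	if (store_remaining(s) >= size + EFI_MIN_RESERVE)
		return true;

	/* Provoke GC by asking for more than the remaining space. */
	size_t dummy = store_remaining(s) + 1;
	if (fake_set_variable(s, dummy))
		s->live -= dummy;	/* unexpectedly succeeded: delete it again */

	/* Re-check the reserve after the (possible) garbage collection. */
	return store_remaining(s) >= size + EFI_MIN_RESERVE;
}
```

      In this model a store that is mostly filled with deleted entries
      becomes writable again after the forced failure, while a store that is
      genuinely full still refuses the write.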
  8. 06 Jun, 2013 1 commit
    • x86/PCI: Map PCI setup data with ioremap() so it can be in highmem · 65694c5a
      Matt Fleming committed
      f9a37be0 ("x86: Use PCI setup data") added support for using PCI ROM
      images from setup_data.  This used phys_to_virt(), which is not valid for
      highmem addresses, and can cause a crash when booting a 32-bit kernel via
      the EFI boot stub.
      
      pcibios_add_device() assumes that the physical addresses stored in
      setup_data are accessible via the direct kernel mapping, and that calling
      phys_to_virt() is valid.  This isn't guaranteed to be true on 32-bit x86,
      where the direct mapping range is much smaller than on x86-64.
      
      Calling phys_to_virt() on a highmem address results in the following:
      
       BUG: unable to handle kernel paging request at 39a3c198
       IP: [<c262be0f>] pcibios_add_device+0x2f/0x90
       ...
       Call Trace:
        [<c2370c73>] pci_device_add+0xe3/0x130
        [<c274640b>] pci_scan_single_device+0x8b/0xb0
        [<c2370d08>] pci_scan_slot+0x48/0x100
        [<c2371904>] pci_scan_child_bus+0x24/0xc0
        [<c262a7b0>] pci_acpi_scan_root+0x2c0/0x490
        [<c23b7203>] acpi_pci_root_add+0x312/0x42f
        ...
      
      The solution is to use ioremap() instead of phys_to_virt() to map the
      setup data into the kernel address space.
      
      [bhelgaas: changelog]
      Tested-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: stable@vger.kernel.org # v3.8+
  9. 04 Jun, 2013 1 commit
    • xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU. · 466318a8
      Konrad Rzeszutek Wilk committed
      xen_play_dead is an undead function. When the vCPU is told to go
      offline, it ends up calling xen_play_dead, wherein it calls the
      VCPUOP_down hypercall, which offlines the vCPU. However, when the
      vCPU is brought back online, it resumes execution right after the
      VCPUOP_down hypercall.

      That was OK (albeit the API for play_dead assumes that the CPU
      stays dead and never returns), but with commit 4b0c0f29
      (tick: Cleanup NOHZ per cpu data on cpu down) that is no longer safe,
      as said commit resets ts->inidle, which was set at the start of the
      cpu_idle loop.
      
      The net effect is that we get this warn:
      
      Broke affinity for irq 16
      installing Xen timer for CPU 1
      cpu 1 spinlock event irq 48
      ------------[ cut here ]------------
      WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0()
      Modules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-rc3upstream-00068-gdcdbe33a #1
      Hardware name: BIOSTAR Group N61PB-M2S/N61PB-M2S, BIOS 6.00 PG 09/03/2009
       ffffffff8193b448 ffff880039da5e60 ffffffff816707c8 ffff880039da5ea0
       ffffffff8108ce8b ffff880039da4010 ffff88003fa8e500 ffff880039da4010
       0000000000000001 ffff880039da4000 ffff880039da4010 ffff880039da5eb0
      Call Trace:
       [<ffffffff816707c8>] dump_stack+0x19/0x1b
       [<ffffffff8108ce8b>] warn_slowpath_common+0x6b/0xa0
       [<ffffffff8108ced5>] warn_slowpath_null+0x15/0x20
       [<ffffffff810e4745>] tick_nohz_idle_exit+0x195/0x1b0
       [<ffffffff810da755>] cpu_startup_entry+0x205/0x250
       [<ffffffff81661070>] cpu_bringup_and_idle+0x13/0x15
      ---[ end trace 915c8c486004dda1 ]---
      
      because ts->inidle is set to zero. Thomas suggested that we just add a
      workaround to call tick_nohz_idle_enter() before returning from
      xen_play_dead(), and that is what this patch does to fix the issue.

      We also add the stable part because git commit 4b0c0f29 is in the
      stable tree.
      
      CC: stable@vger.kernel.org
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  10. 03 Jun, 2013 3 commits
  11. 01 Jun, 2013 1 commit
    • x86: Fix adjust_range_size_mask calling position · 7de3d66b
      Yinghai Lu committed
      Commit
      
          8d57470d x86, mm: setup page table in top-down
      
      causes a kernel panic when booting with mem=2G.
      
           [mem 0x00000000-0x000fffff] page 4k
           [mem 0x7fe00000-0x7fffffff] page 1G
           [mem 0x7c000000-0x7fdfffff] page 1G
           [mem 0x00100000-0x001fffff] page 4k
           [mem 0x00200000-0x7bffffff] page 2M
      
      The last entry is not what we want; we should have
           [mem 0x00200000-0x3fffffff] page 2M
           [mem 0x40000000-0x7bffffff] page 1G
      
      We merge contiguous ranges with the same page size too early.
      In this case, before merging we have
           [mem 0x00200000-0x3fffffff] page 2M
           [mem 0x40000000-0x7bffffff] page 2M
      After merging them, we get
           [mem 0x00200000-0x7bffffff] page 2M
      even though we could use a 1G page to map
           [mem 0x40000000-0x7bffffff]
      
      That causes a problem, because we have already mapped
           [mem 0x7fe00000-0x7fffffff] page 1G
           [mem 0x7c000000-0x7fdfffff] page 1G
      with a 1G page, i.e. [0x40000000-0x7fffffff] is already mapped with a
      1G page. During phys_pud_init() for [0x40000000-0x7bffffff], it will
      not reuse the existing pud page; it allocates a new one and then tries
      to use 2M pages to map the range instead, as page_size_mask does not
      include PG_LEVEL_1G. In the end [0x7c000000-0x7fffffff] is left
      unmapped, because the loop in phys_pmd_init() stops mapping at
      0x7bffffff.

      That is the right behavior: it maps the exact range with the exact
      page size that we ask for, and we would have to call it explicitly to
      map [0x7c000000-0x7fffffff] before or after mapping
      [0x40000000-0x7bffffff]. In any case we need to make sure that each
      range's page_size_mask is correct and consistent after
      split_mem_range().

      Fix that by calling adjust_range_size_mask() before merging ranges
      with the same page size.
      
      -v2: update change log.
      -v3: add more explanation of why [0x7c000000-0x7fffffff] is not mapped
          and why that causes a panic.

      Bisected-by: "Xie, ChanglongX" <changlongx.xie@intel.com>
      Bisected-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      Reported-and-tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1370015587-20835-1-git-send-email-yinghai@kernel.org
      Cc: <stable@vger.kernel.org> v3.9
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  12. 31 May, 2013 3 commits
    • sched/x86: Construct all sibling maps if smt · b0bc225d
      Andrew Jones committed
      Commit 316ad248 ("sched/x86: Rewrite
      set_cpu_sibling_map()") broke the construction of sibling maps,
      which also broke the booted_cores accounting.
      
      Before the rewrite, if smt was present, then each map was
      updated for each smt sibling. After the rewrite only
      cpu_sibling_mask gets updated, as the llc and core maps depend
      on 'has_mc = x86_max_cores > 1' instead. This leads to problems
      with topologies like the following
      
      (qemu -smp sockets=2,cores=1,threads=2)
      
        processor       : 0
        physical id     : 0
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 1
      
        processor       : 1
        physical id     : 0
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 0    <= should be 1
      
        processor       : 2
        physical id     : 1
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 1
      
        processor       : 3
        physical id     : 1
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 0    <= should be 1
      
      This patch restores the former construction by defining has_mc
      as (has_smt || x86_max_cores > 1). This should be fine as there
      were no (has_smt && !has_mc) conditions in the context.
      
      Also rename has_mc to has_mp now that it's not just for cores.

      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: fenghua.yu@intel.com
      Link: http://lkml.kernel.org/r/1369831695-11970-1-git-send-email-drjones@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
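      The condition change in this commit is small enough to show directly.
      The sketch below is a userspace illustration only; the function names
      has_mc_before() and has_mp_after() and their parameters are
      hypothetical stand-ins for the kernel variables named in the commit
      message.

```c
#include <stdbool.h>

/* Condition after the rewrite in 316ad248: only multi-core counts. */
static bool has_mc_before(unsigned int x86_max_cores)
{
	return x86_max_cores > 1;
}

/* Condition restored by this patch: SMT alone is enough. */
static bool has_mp_after(unsigned int smp_num_siblings,
			 unsigned int x86_max_cores)
{
	bool has_smt = smp_num_siblings > 1;

	return has_smt || x86_max_cores > 1;
}
```

      For the topology quoted above (sockets=2,cores=1,threads=2) the first
      condition is false and the second is true, which is why the llc and
      core maps were left unpopulated before the fix.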
    • x86: Allow FPU to be used at interrupt time even with eagerfpu · 5187b28f
      Pekka Riikonen committed
      With the addition of eagerfpu the irq_fpu_usable() now returns false
      negatives especially in the case of ksoftirqd and interrupted idle task,
      two common cases for FPU use for example in networking/crypto.  With
      eagerfpu=off FPU use is possible in those contexts.  This is because of
      the eagerfpu check in interrupted_kernel_fpu_idle():
      
      ...
        * For now, with eagerfpu we will return interrupted kernel FPU
        * state as not-idle. TBD: Ideally we can change the return value
        * to something like __thread_has_fpu(current). But we need to
        * be careful of doing __thread_clear_has_fpu() before saving
        * the FPU etc for supporting nested uses etc. For now, take
        * the simple route!
      ...
       	if (use_eager_fpu())
       		return 0;
      
      As eagerfpu is automatically "on" on those CPUs that also have
      features like AES-NI, this patch changes the eagerfpu check to return 1
      if kernel_fpu_begin() has not been called yet.  Once it has been,
      __thread_has_fpu() will start returning 0.
      
      Notice that with eagerfpu the __thread_has_fpu is always true initially.
      FPU use is thus always possible no matter what task is under us, unless
      the state has already been saved with kernel_fpu_begin().
      
      [ hpa: this is a performance regression, not a correctness regression,
        but since it can be quite serious on CPUs which need encryption at
        interrupt time I am marking this for urgent/stable. ]
      Signed-off-by: Pekka Riikonen <priikone@iki.fi>
      Link: http://lkml.kernel.org/r/alpine.GSO.2.00.1305131356320.18@git.silcnet.org
      Cc: <stable@vger.kernel.org> v3.7+
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86, crc32-pclmul: Fix build with older binutils · 2baad612
      Jan Beulich committed
      binutils prior to 2.18 (e.g. the ones found on SLE10) don't support
      assembling PEXTRD, so a macro based approach like the one for PCLMULQDQ
      in the same file should be used.
      
      This requires making the helper macros capable of recognizing 32-bit
      general purpose register operands.
      
      [ hpa: tagging for stable as it is a low risk build fix ]
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/51A6142A02000078000D99D8@nat28.tlf.novell.com
      Cc: Alexander Boyko <alexander_boyko@xyratex.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: <stable@vger.kernel.org> v3.9
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  13. 29 May, 2013 2 commits
    • xen: Clean up apic ipi interface · 1db01b49
      Stefan Bader committed
      Commit f447d56d introduced the implementation of the PV apic ipi
      interface. But there were some odd things (none of which seem to
      cause any real issue, but they should probably be cleaned up
      anyway):
       - xen_send_IPI_mask_allbutself (and by that xen_send_IPI_allbutself)
         ignore the passed in vector and only use the CALL_FUNCTION_SINGLE
         vector, while xen_send_IPI_all and xen_send_IPI_mask use the vector.
       - physflat_send_IPI_allbutself is declared unnecessarily. It is never
         used.

      This patch tries to clean up those things.

      Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • x86-64, init: Fix a possible wraparound bug in switchover in head_64.S · e9d0626e
      Zhang Yanfei committed
      In head_64.S, a switchover has been used to handle kernel crossing
      1G, 512G boundaries.
      
      And commit 8170e6be
          x86, 64bit: Use a #PF handler to materialize early mappings on demand
      said:
          During the switchover in head_64.S, before #PF handler is available,
          we use three pages to handle kernel crossing 1G, 512G boundaries with
          sharing page by playing games with page aliasing: the same page is
          mapped twice in the higher-level tables with appropriate wraparound.
      
      But from the switchover code, when we set up the PUD table:
      114         addq    $4096, %rdx
      115         movq    %rdi, %rax
      116         shrq    $PUD_SHIFT, %rax
      117         andl    $(PTRS_PER_PUD-1), %eax
      118         movq    %rdx, (4096+0)(%rbx,%rax,8)
      119         movq    %rdx, (4096+8)(%rbx,%rax,8)
      
      It seems line 119 has a potential bug there. For example,
      if the kernel is loaded at physical address 511G+1008M, that is
          000000000 111111111 111111000 000000000000000000000
      and the kernel _end is 512G+2M, that is
          000000001 000000000 000000001 000000000000000000000
      So in this example, when using the 2nd page to setup PUD (line 114~119),
      rax is 511.
      In line 118, we put rdx which is the address of the PMD page (the 3rd page)
      into entry 511 of the PUD table. But in line 119, the entry we calculate from
      (4096+8)(%rbx,%rax,8) has exceeded the PUD page. IMO, the entry in line
      119 should wrap around to entry 0 of the PUD table.

      The patch fixes the bug.

      Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/5191DE5A.3020302@cn.fujitsu.com
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: <stable@vger.kernel.org> v3.9
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
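      The index arithmetic in the example above can be checked with a few
      lines of C. This is a sketch of the calculation only, not the assembly
      fix itself; the helper names are hypothetical, and PUD_SHIFT = 30 with
      PTRS_PER_PUD = 512 are the usual x86-64 values, assumed here.

```c
#include <stdint.h>

#define PUD_SHIFT	30	/* each PUD entry covers 1G */
#define PTRS_PER_PUD	512	/* entries in one PUD page */

/* Index of the PUD entry that covers 'addr' (lines 116-117 above). */
static unsigned int pud_index_of(uint64_t addr)
{
	return (unsigned int)((addr >> PUD_SHIFT) & (PTRS_PER_PUD - 1));
}

/* Second entry as line 119 computes it: may run past the page. */
static unsigned int next_index_unwrapped(unsigned int idx)
{
	return idx + 1;
}

/* Second entry with the wraparound the commit message asks for. */
static unsigned int next_index_wrapped(unsigned int idx)
{
	return (idx + 1) & (PTRS_PER_PUD - 1);
}
```

      For a kernel loaded at 511G+1008M the first index is 511, so the
      unwrapped second index is 512 (one slot past the end of the PUD page),
      while the wrapped one is 0.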
  14. 28 May, 2013 1 commit
  15. 21 May, 2013 2 commits
  16. 15 May, 2013 1 commit
    • time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizaitons · b4f711ee
      John Stultz committed
      Kay Sievers noted that the ALWAYS_USE_PERSISTENT_CLOCK config,
      which enables some minor compile time optimization to avoid
      unnecessary code, mostly in the suspend/resume path, could cause
      problems for userland.
      
      In particular, the dependency for RTC_HCTOSYS on
      !ALWAYS_USE_PERSISTENT_CLOCK, which avoids setting the time
      twice and simplifies suspend/resume, has the side effect
      of causing the /sys/class/rtc/rtcN/hctosys flag to always be
      zero, and this flag is commonly used by udev to setup the
      /dev/rtc symlink to /dev/rtcN, which can cause pain for
      older applications.
      
      While the udev rules could use some work to be less fragile,
      breaking userland should strongly be avoided. Additionally
      the compile time optimizations are fairly minor, and the code
      being optimized is likely to be reworked in the future, so
      let's revert this change.

      Reported-by: Kay Sievers <kay@vrfy.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: stable <stable@vger.kernel.org> #3.9
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Link: http://lkml.kernel.org/r/1366828376-18124-1-git-send-email-john.stultz@linaro.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  17. 14 May, 2013 1 commit
    • x86, efi: initial the local variable of DataSize to zero · eccaf52f
      Lee, Chun-Yi committed
      It is better to initialize DataSize to zero before passing it to
      GetVariable(); otherwise we feed in a random value. The debug log of
      the input DataSize looks like this:
      
      ...
      [  195.915612] EFI Variables Facility v0.08 2004-May-17
      [  195.915819] efi: size: 18446744071581821342
      [  195.915969] efi:  size': 18446744071581821342
      [  195.916324] efi: size: 18446612150714306560
      [  195.916632] efi:  size': 18446612150714306560
      [  195.917159] efi: size: 18446612150714306560
      [  195.917453] efi:  size': 18446612150714306560
      ...
      
      size' is the value that was returned by the BIOS.

      After applying this patch:
      [   82.442042] EFI Variables Facility v0.08 2004-May-17
      [   82.442202] efi: size: 0
      [   82.442360] efi:  size': 1039
      [   82.443828] efi: size: 0
      [   82.444127] efi:  size': 2616
      [   82.447057] efi: size: 0
      [   82.447356] efi:  size': 5832
      ...
      
      Found on the Acer Aspire V3 BIOS: it will not return the size of the
      data if we input a non-zero DataSize.

      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
  18. 10 May, 2013 4 commits
  19. 09 May, 2013 5 commits
    • KVM: emulator: emulate SALC · 326f578f
      Paolo Bonzini committed
      This is an almost-undocumented instruction available in 32-bit mode.
      I say "almost" undocumented because AMD documents it in their opcode
      maps just to say that it is unavailable in 64-bit mode (sections
      "A.2.1 One-Byte Opcodes" and "B.3 Invalid and Reassigned Instructions
      in 64-Bit Mode").
      
      It is roughly equivalent to "sbb %al, %al" except it does not
      set the flags.  Use fastop to emulate it, but do not use the opcode
      directly because it would fail if the host is 64-bit!

      Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: stable@vger.kernel.org # 3.9
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
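      As a plain-C illustration of the semantics described above (not the
      fastop-based code the commit adds), SALC can be modelled as a function
      of the carry flag alone. The function name salc_result() is
      hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * SALC: set AL to 0xFF if the carry flag is set, 0x00 otherwise.
 * Same value as "sbb %al, %al" would produce, but the flags are
 * left untouched, so only the new AL is returned here.
 */
static uint8_t salc_result(bool carry_flag)
{
	return carry_flag ? 0xFF : 0x00;
}
```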
    • KVM: emulator: emulate XLAT · 7fa57952
      Paolo Bonzini committed
      This is used by SGABIOS; without it, KVM breaks with
      emulate_invalid_guest_state=1.
      It is just a MOV in disguise, with a funny source address.

      Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: stable@vger.kernel.org # 3.9
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
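      The "MOV in disguise" can be sketched as a table lookup. This is an
      illustration of the instruction's semantics under simplifying
      assumptions (segment handling and address-size wrapping are ignored),
      not the emulator code; xlat_result() and its parameters are
      hypothetical names.

```c
#include <stdint.h>

/*
 * XLAT: AL = byte at [BX + zero-extended AL].
 * 'table_at_bx' stands in for the guest memory that BX points to.
 */
static uint8_t xlat_result(const uint8_t *table_at_bx, uint8_t al)
{
	return table_at_bx[al];
}
```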
    • KVM: emulator: emulate AAM · a035d5c6
      Paolo Bonzini committed
      This is used by SGABIOS; without it, KVM breaks with
      emulate_invalid_guest_state=1.

      AAM needs the source operand to be unsigned; do the same in AAD as well
      for consistency, even though it does not affect the result.

      Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: stable@vger.kernel.org # 3.9
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
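      A small model of AAM shows why the operand must be treated as
      unsigned. The sketch below is an assumption-laden illustration, not
      the emulator code: struct ax_regs and aam_apply() are hypothetical
      names, and the flag updates (SF, ZF, PF) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

struct ax_regs {
	uint8_t al;
	uint8_t ah;
};

/*
 * AAM imm8: AH = AL / imm8, AL = AL % imm8, using unsigned arithmetic.
 * Returns false for imm8 == 0, where the CPU would raise #DE instead.
 */
static bool aam_apply(struct ax_regs *r, uint8_t imm8)
{
	uint8_t al = r->al;

	if (imm8 == 0)
		return false;

	r->ah = al / imm8;
	r->al = al % imm8;
	return true;
}
```

      With a signed operand, an immediate such as 0x80 would be read as a
      negative divisor and give a different quotient, which is the case the
      unsigned treatment avoids.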
    • x86/microcode: Add local mutex to fix physical CPU hot-add deadlock · 074d72ff
      Konrad Rzeszutek Wilk committed
      This can easily be triggered if a new CPU is added (via
      ACPI hotplug mechanism) and from user-space you do:
      
         echo 1 > /sys/devices/system/cpu/cpu3/online
      
      (or wait for UDEV to do it) on a newly appeared physical CPU.
      
      The deadlock is that the "store_online" in drivers/base/cpu.c
      takes the cpu_hotplug_driver_lock() lock, then calls "cpu_up".
      "cpu_up" eventually ends up calling "save_mc_for_early"
      which also takes the cpu_hotplug_driver_lock() lock.
      
      And here is what lockdep thinks of it:
      
       smpboot: Stack at about ffff880075c39f44
       smpboot: CPU3: has booted.
       microcode: CPU3 sig=0x206a7, pf=0x2, revision=0x25
      
       =============================================
       [ INFO: possible recursive locking detected ]
       3.9.0upstream-10129-g167af0e #1 Not tainted
       ---------------------------------------------
       sh/2487 is trying to acquire lock:
        (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
      
       but task is already holding lock:
        (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(x86_cpu_hotplug_driver_mutex);
         lock(x86_cpu_hotplug_driver_mutex);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       6 locks held by sh/2487:
        #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811ca48d>] vfs_write+0x17d/0x190
        #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff812464ef>] sysfs_write_file+0x3f/0x160
        #2:  (s_active#20){.+.+.+}, at: [<ffffffff81246578>] sysfs_write_file+0xc8/0x160
        #3:  (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
        #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810961c2>] cpu_maps_update_begin+0x12/0x20
        #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810962a7>] cpu_hotplug_begin+0x27/0x60
      Suggested-and-Acked-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: fenghua.yu@intel.com
      Cc: xen-devel@lists.xensource.com
      Cc: stable@vger.kernel.org # for v3.9
      Link: http://lkml.kernel.org/r/1368029583-23337-1-git-send-email-konrad.wilk@oracle.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • KVM: VMX: fix halt emulation while emulating invalid guest sate · 8d76c49e
      Gleb Natapov committed
      The invalid guest state emulation loop does not check halt_request,
      which causes a 100% CPU loop while the guest is halted and in an
      invalid state. A more serious issue is that this leaves halt_request
      set, so a random instruction emulated by a vm86 #GP exit can be
      interpreted as a halt, which causes the guest to hang. Fix both
      problems by handling halt_request in the emulation loop.

      Reported-by: Tomas Papan <tomas.papan@gmail.com>
      Tested-by: Tomas Papan <tomas.papan@gmail.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      CC: stable@vger.kernel.org
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
  20. 08 May, 2013 2 commits
    • xen: mask x2APIC feature in PV · 4ea9b9ac
      Zhenzhong Duan committed
      On an x2apic-enabled pvm, doing sysrq+l results in the NULL pointer
      dereference below.
      
          SysRq : Show backtrace of all active CPUs
          BUG: unable to handle kernel NULL pointer dereference at           (null)
          IP: [<ffffffff8125e3cb>] memcpy+0xb/0x120
          Call Trace:
           [<ffffffff81039633>] ? __x2apic_send_IPI_mask+0x73/0x160
           [<ffffffff8103973e>] x2apic_send_IPI_all+0x1e/0x20
           [<ffffffff8103498c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0
           [<ffffffff81501be4>] ? _raw_spin_lock_irqsave+0x34/0x50
           [<ffffffff8131654e>] sysrq_handle_showallcpus+0xe/0x10
           [<ffffffff8131616d>] __handle_sysrq+0x7d/0x140
           [<ffffffff81316230>] ? __handle_sysrq+0x140/0x140
           [<ffffffff81316287>] write_sysrq_trigger+0x57/0x60
           [<ffffffff811ca996>] proc_reg_write+0x86/0xc0
           [<ffffffff8116dd8e>] vfs_write+0xce/0x190
           [<ffffffff8116e3e5>] sys_write+0x55/0x90
           [<ffffffff8150a242>] system_call_fastpath+0x16/0x1b
      
      That's because apic points to apic_x2apic_cluster or apic_x2apic_phys
      but basic elements such as the cpumask aren't initialized.

      Mask the x2APIC feature in pvm to avoid overwriting the apic pointer;
      commit message updated per Konrad's suggestion.

      Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Tested-by: Tamon Shiose <tamon.shiose@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • xen/spinlock: Fix check from greater than to be also be greater or equal to. · cb91f8f4
      Konrad Rzeszutek Wilk committed
      During review of git commit cb9c6f15
      ("xen/spinlock:  Check against default value of -1 for IRQ line.")
      Stefano pointed out a bug in the patch. Unfortunately, due to vacation
      timing, the fix was not applied; this patch fixes it up.

      Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>