1. 02 Apr, 2013 1 commit
    • pmu: prepare for migration support · afd80d85
      Committed by Paolo Bonzini
      In order to migrate the PMU state correctly, we need to restore the
      values of MSR_CORE_PERF_GLOBAL_STATUS (a read-only register) and
      MSR_CORE_PERF_GLOBAL_OVF_CTRL (which has side effects when written).
      We also need to write the full 40-bit value of the performance counter,
      which would only be possible with a v3 architectural PMU's full-width
      counter MSRs.
      
      To distinguish host-initiated writes from the guest's, pass the
      full struct msr_data to kvm_pmu_set_msr.
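      
      A rough sketch of the interface involved (struct layout and signature
      recalled from the KVM code of that era; an illustration, not the
      verbatim patch):
      
      struct msr_data {
              bool host_initiated;    /* true for userspace-initiated writes,
                                         e.g. when restoring migration state */
              u32 index;              /* the MSR number being accessed */
              u64 data;               /* the value to write */
      };
      
      /* The setter now sees the whole descriptor, so it can accept a
       * host-initiated restore of MSR_CORE_PERF_GLOBAL_STATUS while still
       * treating the same write from the guest as invalid. */
      int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);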
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
  2. 22 Mar, 2013 2 commits
    • KVM: MMU: Rename kvm_mmu_free_some_pages() to make_mmu_pages_available() · 81f4f76b
      Committed by Takuya Yoshikawa
      The current name "kvm_mmu_free_some_pages" should be used for something
      that actually frees some shadow pages, as the name suggests, but what
      the function actually does is make a minimum number of shadow pages,
      KVM_MIN_FREE_MMU_PAGES, available: it does nothing when there are
      already enough.
      
      This patch changes the name to reflect this meaning better; while doing
      this renaming, the code in the wrapper function is inlined into the main
      body since the whole function will be inlined into the only caller now.
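      
      A minimal sketch of the resulting shape, with the zap loop abridged
      (identifiers recalled from the KVM MMU code of that era):
      
      static void make_mmu_pages_available(struct kvm_vcpu *vcpu)
      {
              /* Formerly the body of the kvm_mmu_free_some_pages() wrapper:
               * bail out when enough shadow pages are already available. */
              if (likely(kvm_mmu_available_pages(vcpu->kvm) >=
                         KVM_MIN_FREE_MMU_PAGES))
                      return;
      
              /* ... otherwise zap shadow pages until the minimum is met ... */
      }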
      Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: MMU: Move kvm_mmu_free_some_pages() into kvm_mmu_alloc_page() · 7ddca7e4
      Committed by Takuya Yoshikawa
      What this function does is ensure that the number of shadow pages does
      not exceed the maximum limit stored in n_max_mmu_pages, so a call to it
      is placed on every code path that can reach kvm_mmu_alloc_page().
      
      Although spreading the call across each such code path may have made
      some sense while it could be called before taking mmu_lock, the rule
      was changed so that this is no longer done.
      
      Taking this background into account, this patch moves it into
      kvm_mmu_alloc_page() and simplifies the code.
      
      Note: the unlikely hint in kvm_mmu_free_some_pages() guarantees that the
      overhead of this function is almost zero except when we actually need to
      allocate some shadow pages, so we need not worry about it being called
      multiple times in one path when kvm_mmu_get_page() is invoked a few
      times.
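      
      A sketch of the idea (the allocator's signature is recalled from the
      kernel of that era, and its body is abridged; treat this as an
      illustration, not the verbatim patch):
      
      static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
                                                     u64 *parent_pte,
                                                     int direct)
      {
              /* Previously duplicated at every call site that could reach
               * this function; the unlikely() hint keeps it nearly free. */
              kvm_mmu_free_some_pages(vcpu);
      
              /* ... allocate and initialize the shadow page as before ... */
      }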
      Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  3. 21 Mar, 2013 1 commit
  4. 20 Mar, 2013 2 commits
  5. 19 Mar, 2013 2 commits
  6. 18 Mar, 2013 1 commit
    • perf,x86: fix wrmsr_on_cpu() warning on suspend/resume · 2a6e06b2
      Committed by Linus Torvalds
      Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
      suspend/resume") fixed a crash when doing PEBS performance profiling
      after resuming, but in using init_debug_store_on_cpu() to restore the
      DS_AREA MSR it also resulted in a new WARN_ON() triggering.
      
      init_debug_store_on_cpu() uses wrmsr_on_cpu(), which in turn uses CPU
      cross-calls to do the MSR update.  That is not really valid at the
      early resume stage, and the warning is quite reasonable.  Now, it all
      happens to _work_, for the simple reason that smp_call_function_single()
      ends up just doing the call directly on the CPU when the CPU number
      matches, but we really should just do the wrmsr() directly instead.
      
      This duplicates the wrmsr() logic, but hopefully we can just remove the
      wrmsr_on_cpu() version eventually.
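      
      The shape of the fix, roughly (reconstructed from the description
      above; details abridged):
      
      void perf_restore_debug_store(void)
      {
              struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
      
              if (!x86_pmu.bts && !x86_pmu.pebs)
                      return;
      
              /* We are already running on the target CPU at this point in
               * resume, so write the MSR directly instead of going through
               * wrmsr_on_cpu(), whose cross-call path trips the WARN_ON()
               * this early. */
              wrmsrl(MSR_IA32_DS_AREA, (unsigned long)ds);
      }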
      Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 16 Mar, 2013 1 commit
  8. 14 Mar, 2013 4 commits
  9. 13 Mar, 2013 4 commits
  10. 12 Mar, 2013 2 commits
  11. 11 Mar, 2013 1 commit
  12. 08 Mar, 2013 6 commits
  13. 07 Mar, 2013 4 commits
  14. 06 Mar, 2013 2 commits
  15. 05 Mar, 2013 5 commits
  16. 03 Mar, 2013 1 commit
    • x86, ACPI, mm: Revert movablemem_map support · 20e6926d
      Committed by Yinghai Lu
      Tim found:
      
        WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
        Hardware name: S2600CP
        sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
        smpboot: Booting Node   1, Processors  #1
        Modules linked in:
        Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
        Call Trace:
          set_cpu_sibling_map+0x279/0x449
          start_secondary+0x11d/0x1e5
      
      Don Morris reproduced it on an HP z620 workstation and bisected it to
      commit e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock
      is ready").
      
      It turns out movablemem_map has some problems, and it breaks several
      things:
      
      1. numa_init is called several times, NOT just for SRAT, so those
      	nodes_clear(numa_nodes_parsed)
      	memset(&numa_meminfo, 0, sizeof(numa_meminfo))
         calls cannot simply be removed.  The sequence to consider is: numaq,
         srat, amd, dummy, and the fallback path must keep working (see the
         sketch after this list).
      
      2. acpi_numa_init was simply split into early_parse_srat.
         a. early_parse_srat is NOT called for ia64, so ia64 is broken.
         b. the loop
      	     for (i = 0; i < MAX_LOCAL_APIC; i++)
      		     set_apicid_to_node(i, NUMA_NO_NODE)
            is still left in numa_init, so it just clears the result from
            early_parse_srat; it should be moved before that.
         c. it breaks ACPI table override, as the ACPI table scan is moved
            early, before the override from the INITRD is settled.
      
      3. the patch TITLE is totally misleading: there is NO x86 in the title,
         yet it changes critical x86 code. That caused the x86 maintainers not
         to pay attention and find the problem early. Those patches really
         should have been routed via tip/x86/mm.
      
      4. after that commit, the following ranges cannot use movable RAM:
        a. real_mode code.... well... funny: could legacy Node0 [0,1M) really
           be hot-removed?
        b. initrd... it is freed after booting, so it could be on movable RAM.
        c. crashkernel for kdump: it looks like we cannot put the kdump kernel
           above 4G anymore.
        d. init_mem_mapping: we cannot put the page table high anymore.
        e. initmem_init: vmemmap cannot be high on the local node anymore.
           That is not good.
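      
      For reference, the fallback sequence from point 1 looked roughly like
      this in arch/x86/mm/numa.c of that era (abridged sketch; each
      numa_init() attempt re-runs the clears quoted above before trying the
      next backend, which is why they cannot simply be removed):
      
      void __init x86_numa_init(void)
      {
              if (!numa_off) {
      #ifdef CONFIG_X86_NUMAQ
                      if (!numa_init(numaq_numa_init))
                              return;
      #endif
      #ifdef CONFIG_ACPI_NUMA
                      if (!numa_init(x86_acpi_numa_init))
                              return;
      #endif
      #ifdef CONFIG_AMD_NUMA
                      if (!numa_init(amd_numa_init))
                              return;
      #endif
              }
      
              numa_init(dummy_numa_init);     /* last-resort fallback */
      }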
      
      If a node is hotpluggable, memory-related ranges like the page table and
      vmemmap can live on that node without problems, and indeed should be on
      that node.
      
      We have workaround patches that could fix some of the problems, but some
      cannot be fixed.
      
      So just revert the offending commit and the related ones, including:
      
       f7210e6c ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
          protect movablecore_map in memblock_overlaps_region().")
      
       01a178a9 ("acpi, memory-hotplug: support getting hotplug info from
          SRAT")
      
       27168d38 ("acpi, memory-hotplug: extend movablemem_map ranges to
          the end of node")
      
       e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock is
          ready")
      
       fb06bc8e ("page_alloc: bootmem limit with movablecore_map")
      
       42f47e27 ("page_alloc: make movablemem_map have higher priority")
      
       6981ec31 ("page_alloc: introduce zone_movable_limit[] to keep
          movable limit for nodes")
      
       34b71f1e ("page_alloc: add movable_memmap kernel parameter")
      
       4d59a751 ("x86: get pg_data_t's memory from other node")
      
      Later we should have patches that make sure the kernel puts the page
      table and vmemmap on local node RAM instead of pushing them down to
      node0.  We also need to find a way to put other kernel-used RAM on
      local node RAM.
      Reported-by: Tim Gardner <tim.gardner@canonical.com>
      Reported-by: Don Morris <don.morris@hp.com>
      Bisected-by: Don Morris <don.morris@hp.com>
      Tested-by: Don Morris <don.morris@hp.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
  17. 01 Mar, 2013 1 commit
    • xen/pci: We don't do multiple MSI's. · 884ac297
      Committed by Konrad Rzeszutek Wilk
      There is no hypercall to set up multiple MSIs per PCI device. As such,
      with these two new commits:
      - 08261d87
        PCI/MSI: Enable multiple MSIs with pci_enable_msi_block_auto()
      - 5ca72c4f
        AHCI: Support multiple MSIs
      
      we would call PHYSDEVOP_map_pirq 'nvec' times with the same contents
      for the PCI device. Sander discovered that we would get the same PIRQ
      value 'nvec' times and return those values to the caller. That of
      course meant that the device was configured with only one MSI, and
      AHCI would fail with:
      
      ahci 0000:00:11.0: version 3.0
      xen: registering gsi 19 triggering 0 polarity 1
      xen: --> pirq=19 -> irq=19 (gsi=19)
      (XEN) [2013-02-27 19:43:07] IOAPIC[0]: Set PCI routing entry (6-19 -> 0x99 -> IRQ 19 Mode:1 Active:1)
      ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 6 Gbps 0xf impl SATA mode
      ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part
      ahci: probe of 0000:00:11.0 failed with error -22
      
      That is because in ahci_host_activate the second call to
      devm_request_threaded_irq would return -EINVAL, as we passed in
      (on the second run) an IRQ that was never initialized.
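      
      The fix is essentially an early guard in Xen's MSI setup path; a rough
      sketch (abridged, with the single-MSI mapping logic elided):
      
      static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
      {
              /* No hypercall exists to map multiple MSIs, so refuse up
               * front.  Returning 1 tells the MSI core that only a single
               * MSI can be allocated, letting callers fall back cleanly
               * instead of receiving 'nvec' copies of the same PIRQ. */
              if (type == PCI_CAP_ID_MSI && nvec > 1)
                      return 1;
      
              /* ... existing PHYSDEVOP_map_pirq single-MSI path ... */
      }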
      
      CC: stable@vger.kernel.org
      Reported-and-Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>