1. 17 10月, 2013 3 次提交
    • P
      KVM: PPC: Book3S HV: Implement timebase offset for guests · 93b0f4dc
      Paul Mackerras 提交于
      This allows guests to have a different timebase origin from the host.
      This is needed for migration, where a guest can migrate from one host
      to another and the two hosts might have a different timebase origin.
      However, the timebase seen by the guest must not go backwards, and
      should go forwards only by a small amount corresponding to the time
      taken for the migration.
      
      Therefore this provides a new per-vcpu value accessed via the one_reg
      interface using the new KVM_REG_PPC_TB_OFFSET identifier.  This value
      defaults to 0 and is not modified by KVM.  On entering the guest, this
      value is added onto the timebase, and on exiting the guest, it is
      subtracted from the timebase.
      
      This is only supported for recent POWER hardware which has the TBU40
      (timebase upper 40 bits) register.  Writing to the TBU40 register only
      alters the upper 40 bits of the timebase, leaving the lower 24 bits
      unchanged.  This provides a way to modify the timebase for guest
      migration without disturbing the synchronization of the timebase
      registers across CPU cores.  The kernel rounds up the value given
      to a multiple of 2^24.
      
      Timebase values stored in KVM structures (struct kvm_vcpu, struct
      kvmppc_vcore, etc.) are stored as host timebase values.  The timebase
      values in the dispatch trace log need to be guest timebase values,
      however, since that is read directly by the guest.  This moves the
      setting of vcpu->arch.dec_expires on guest exit to a point after we
      have restored the host timebase so that vcpu->arch.dec_expires is a
      host timebase value.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      93b0f4dc
    • P
      KVM: PPC: Book3S HV: Save/restore SIAR and SDAR along with other PMU registers · 14941789
      Paul Mackerras 提交于
      Currently we are not saving and restoring the SIAR and SDAR registers in
      the PMU (performance monitor unit) on guest entry and exit.  The result
      is that performance monitoring tools in the guest could get false
      information about where a program was executing and what data it was
      accessing at the time of a performance monitor interrupt.  This fixes
      it by saving and restoring these registers along with the other PMU
      registers on guest entry/exit.
      
      This also provides a way for userspace to access these values for a
      vcpu via the one_reg interface.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      14941789
    • M
      KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg · 3b783474
      Michael Neuling 提交于
      This reserves space in get/set_one_reg ioctl for the extra guest state
      needed for POWER8.  It doesn't implement these at all, it just reserves
      them so that the ABI is defined now.
      
      A few things to note here:
      
      - This add *a lot* state for transactional memory.  TM suspend mode,
        this is unavoidable, you can't simply roll back all transactions and
        store only the checkpointed state.  I've added this all to
        get/set_one_reg (including GPRs) rather than creating a new ioctl
        which returns a struct kvm_regs like KVM_GET_REGS does.  This means we
        if we need to extract the TM state, we are going to need a bucket load
        of IOCTLs.  Hopefully most of the time this will not be needed as we
        can look at the MSR to see if TM is active and only grab them when
        needed.  If this becomes a bottle neck in future we can add another
        ioctl to grab all this state in one go.
      
      - The TM state is offset by 0x80000000.
      
      - For TM, I've done away with VMX and FP and created a single 64x128 bit
        VSX register space.
      
      - I've left a space of 1 (at 0x9c) since Paulus needs to add a value
        which applies to POWER7 as well.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      3b783474
  2. 14 10月, 2013 1 次提交
  3. 13 9月, 2013 2 次提交
  4. 12 9月, 2013 1 次提交
    • N
      mm: migrate: check movability of hugepage in unmap_and_move_huge_page() · 83467efb
      Naoya Horiguchi 提交于
      Currently hugepage migration works well only for pmd-based hugepages
      (mainly due to lack of testing,) so we had better not enable migration of
      other levels of hugepages until we are ready for it.
      
      Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
      page table walk and check pud/pmd_huge() there, so they are safe.  But the
      other users (softoffline and memory hotremove) don't do this, so without
      this patch they can try to migrate unexpected types of hugepages.
      
      To prevent this, we introduce hugepage_migration_support() as an
      architecture dependent check of whether hugepage are implemented on a pmd
      basis or not.  And on some architecture multiple sizes of hugepages are
      available, so hugepage_migration_support() also checks hugepage size.
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83467efb
  5. 11 9月, 2013 4 次提交
    • V
      powerpc: Default arch idle could cede processor on pseries · 363edbe2
      Vaidyanathan Srinivasan 提交于
      When adding cpuidle support to pSeries, we introduced two
      regressions:
      
        - The new cpuidle backend driver only works under hypervisors
          supporting the "SLPLAR" option, which isn't the case of the
          old POWER4 hypervisor and the HV "light" used on js2x blades
      
        - The cpuidle driver registers fairly late, meaning that for
          a significant portion of the boot process, we end up having
          all threads spinning. This slows down the boot process and
          increases the overall resource usage if the hypervisor has
          shared processors.
      
      This fixes both by implementing a "default" idle that will cede
      to the hypervisor when possible, in a very simple way without
      all the bells and whisles of cpuidle.
      Reported-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Acked-by: NDeepthi Dharwar <deepthi@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org>
      363edbe2
    • V
      powerpc: Fix section mismatch warning for prom_rtas_call · 620e5050
      Vladimir Murzin 提交于
      While cross-building for PPC64 I've got
      
      WARNING: vmlinux.o(.text.unlikely+0x1ba): Section mismatch in
      reference from the function .prom_rtas_call() to the variable
      .init.data:dt_string_start The function .prom_rtas_call() references
      the variable __initdata dt_string_start.  This is often because
      .prom_rtas_call lacks a __initdata annotation or the annotation of
      dt_string_start is wrong.
      
      WARNING: vmlinux.o(.meminit.text+0xeb0): Section mismatch in reference
      from the function .free_area_init_core.isra.47() to the function
      .init.text:.set_pageblock_order() The function __meminit
      .free_area_init_core.isra.47() references a function __init
      .set_pageblock_order().  If .set_pageblock_order is only used by
      .free_area_init_core.isra.47 then annotate .set_pageblock_order with a
      matching annotation.
      
      Fix it by proper annotation of prom_rtas_call.
      Signed-off-by: NVladimir Murzin <murzin.v@gmail.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      620e5050
    • A
      powerpc: Fix possible deadlock on page fault · 69e044dd
      Aneesh Kumar K.V 提交于
       stack_grow_into/14082 is trying to acquire lock:
        (&mm->mmap_sem){++++++}, at: [<c000000000206d28>] .might_fault+0x78/0xe0
      
       but task is already holding lock:
        (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&mm->mmap_sem);
         lock(&mm->mmap_sem);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by stack_grow_into/14082:
        #0:  (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
      
       stack backtrace:
       CPU: 21 PID: 14082 Comm: stack_grow_into Not tainted 3.10.0-10.el7.ppc64.debug #1
       Call Trace:
       [c0000003d396b850] [c000000000016e7c] .show_stack+0x7c/0x1f0 (unreliable)
       [c0000003d396b920] [c000000000813fc8] .dump_stack+0x28/0x3c
       [c0000003d396b990] [c000000000124b90] .__lock_acquire+0x1640/0x1800
       [c0000003d396bab0] [c00000000012570c] .lock_acquire+0xac/0x250
       [c0000003d396bb80] [c000000000206d54] .might_fault+0xa4/0xe0
       [c0000003d396bbf0] [c0000000007ffe2c] .do_page_fault+0x2ec/0x910
       [c0000003d396be30] [c0000000000092e8] handle_page_fault+0x10/0x30
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      69e044dd
    • G
      powerpc: Export cpu_to_chip_id() to fix build error · 256588fd
      Guenter Roeck 提交于
      powerpc allmodconfig build fails with:
      
      ERROR: ".cpu_to_chip_id" [drivers/block/mtip32xx/mtip32xx.ko] undefined!
      
      The problem was introduced with commit 15863ff3 (powerpc: Make chip-id
      information available to userspace).
      
      Export the missing symbol.
      
      Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Cc: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      256588fd
  6. 05 9月, 2013 2 次提交
  7. 04 9月, 2013 4 次提交
  8. 29 8月, 2013 2 次提交
  9. 28 8月, 2013 6 次提交
  10. 27 8月, 2013 15 次提交