1. 28 Jul, 2014 23 commits
    • KVM: PPC: Book3S: Add hack for split real mode · c01e3f66
      Alexander Graf committed
      Today we handle split real mode by mapping both instruction and data faults
      into a special virtual address space that only exists during the split mode
      phase.
      
      This is good enough to catch 32bit Linux guests that use split real mode for
      copy_from/to_user. In this case we're always prefixed with 0xc0000000 for our
      instruction pointer and can map the user space process freely below there.
      
      However, that approach fails when we're running KVM inside of KVM. Here the 1st
      level last_inst reader may well be in the same virtual page as a 2nd level
      interrupt handler.
      
      It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a
      kernel copy_from/to_user implementation can easily overlap with user space
      addresses.
      
      The architecturally correct way to fix this would be to implement an instruction
      interpreter in KVM that kicks in whenever we go into split real mode. This
      interpreter however would not receive a great amount of testing and be a lot of
      bloat for a reasonably isolated corner case.
      
      So I went back to the drawing board and tried to come up with a way to make
      split real mode work with a single flat address space. And then I realized that
      we could get away with the same trick that makes it work for Linux:
      
      Whenever we see an instruction address during split real mode that may collide,
      we just move it higher up the virtual address space to a place that hopefully
      does not collide (keep your fingers crossed!).
      
      That approach does work surprisingly well. I am able to successfully run
      Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I
      apply a tiny timing probe hack to QEMU. I'd say this is a win over even more
      broken split real mode :).
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S: Stop PTE lookup on write errors · 2e27ecc9
      Alexander Graf committed
      When a page lookup failed because we're not allowed to write to the page, we
      should not overwrite that value with another lookup on the second PTEG which
      will return "page not found". Instead, we should just tell the caller that we
      had a permission problem.
      
      This fixes Mac OS X guests looping endlessly in page lookup code for me.
      Signed-off-by: Alexander Graf <agraf@suse.de>
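The lookup rule described in this commit can be sketched as below. This is a minimal illustration and not the kernel's code: `struct demo_pte`, `demo_search_pteg()` and `demo_xlate()` are hypothetical names, and the PTE layout is heavily simplified. The point is only that the secondary group is consulted when the primary lookup found nothing, and never after a permission error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified PTE: not the real Book3S layout. */
struct demo_pte {
    bool valid;
    unsigned long vpn;   /* virtual page number this entry maps */
    bool writable;
};

#define DEMO_PTES_PER_GROUP 8

/* Search one PTE group. Returns 0 on a hit, -ENOENT if no entry matches,
 * and -EPERM if an entry matches but forbids the requested write. */
static int demo_search_pteg(const struct demo_pte *pteg, unsigned long vpn,
                            bool is_write)
{
    for (size_t i = 0; i < DEMO_PTES_PER_GROUP; i++) {
        if (!pteg[i].valid || pteg[i].vpn != vpn)
            continue;
        if (is_write && !pteg[i].writable)
            return -EPERM;
        return 0;
    }
    return -ENOENT;
}

/* Try the primary group first. Fall back to the secondary group only
 * when the primary lookup reported "not found". */
int demo_xlate(const struct demo_pte *primary,
               const struct demo_pte *secondary,
               unsigned long vpn, bool is_write)
{
    assert(primary != NULL && secondary != NULL);

    int ret = demo_search_pteg(primary, vpn, is_write);

    if (ret != -ENOENT)
        return ret;      /* hit or permission error: do not retry */
    return demo_search_pteg(secondary, vpn, is_write);
}
```

With this shape, a write to a read-only page reports `-EPERM` to the caller instead of being overwritten by the "page not found" result of a second lookup.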
    • KVM: PPC: Deflect page write faults properly in kvmppc_st · 17824b5a
      Alexander Graf committed
      When we have a page that we're not allowed to write to, xlate() will already
      tell us -EPERM on lookup of that page. With the code as is we change it into
      a "page missing" error which a guest may get confused about. Instead, just
      tell the caller about the -EPERM directly.
      
      This fixes Mac OS X guests when run with DCBZ32 emulation.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: e500: Emulate power management control SPR · debf27d6
      Mihai Caraman committed
      For FSL e6500 core the kernel uses power management SPR register (PWRMGTCR0)
      to enable idle power down for cores and devices by setting up the idle count
      period at boot time. With the host already controlling the power management
      configuration the guest could simply benefit from it, so emulate guest request
      as a general store.
      Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Enable for little endian hosts · 6947f948
      Alexander Graf committed
      Now that we've fixed all the issues that HV KVM code had on little endian
      hosts, we can enable it in the kernel configuration for users to play with.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Fix ABIv2 on LE · 9bf163f8
      Alexander Graf committed
      For code that doesn't live in modules we can just branch to the real function
      names, giving us compatibility with ABIv1 and ABIv2.
      
      Do this for the compiled-in code of HV KVM.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Access XICS in BE · 76d072fb
      Alexander Graf committed
      On the exit path from the guest we check what type of interrupt we received
      if we received one. This means we're doing hardware access to the XICS interrupt
      controller.
      
      However, when running on a little endian system, this access is byte reversed.
      
      So let's make sure to swizzle the bytes back again and virtually make XICS
      accesses big endian.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE · 0865a583
      Alexander Graf committed
      Some data structures are always stored in big endian. Among those are the LPPACA
      fields as well as the shadow slb. These structures might be shared with a
      hypervisor.
      
      So whenever we access those fields, make sure we do so in big endian byte order.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Access guest VPA in BE · 02407552
      Alexander Graf committed
      There are a few shared data structures between the host and the guest. Most
      of them get registered through the VPA interface.
      
      These data structures are defined to always be in big endian byte order, so
      let's make sure we always access them in big endian.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Make HTAB code LE host aware · 6f22bd32
      Alexander Graf committed
      When running on an LE host all data structures are kept in little endian
      byte order. However, the HTAB still needs to be maintained in big endian.
      
      So every time we access any HTAB we need to make sure we do so in the right
      byte order. Fix up all accesses to manually byte swap.
      Signed-off-by: Alexander Graf <agraf@suse.de>
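As a rough illustration of accessing a big-endian structure from a host of either byte order, the sketch below reads and writes a 64-bit big-endian field one byte at a time. The helper names `demo_load_be64()` and `demo_store_be64()` are made up for this example. The kernel has its own byte-order helpers, which are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit big-endian value from memory. The result is the same
 * on little-endian and big-endian hosts because bytes are combined
 * explicitly, most significant byte first. */
uint64_t demo_load_be64(const void *p)
{
    assert(p != NULL);

    const unsigned char *b = p;
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return v;
}

/* Store a 64-bit value to memory in big-endian byte order. */
void demo_store_be64(void *p, uint64_t v)
{
    assert(p != NULL);

    unsigned char *b = p;

    for (int i = 7; i >= 0; i--) {
        b[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}
```

Going through helpers like these on every access to a table entry, rather than dereferencing a native integer pointer, is what keeps the in-memory layout big endian regardless of the host.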
    • KVM: PPC: e500: Fix default tlb for victim hint · d57cef91
      Mihai Caraman committed
      Tlb search operation used for victim hint relies on the default tlb set by the
      host. When hardware tablewalk support is enabled in the host, the default tlb is
      TLB1 which leads KVM to evict the bolted entry. Set and restore the default tlb
      when searching for victim hint.
      Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
      Reviewed-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling · 9642382e
      Michael Neuling committed
      This adds support for the H_SET_MODE hcall.  This hcall is a
      multiplexer that has several functions, some of which are called
      rarely, and some which are potentially called very frequently.
      Here we add support for the functions that set the debug registers
      CIABR (Completed Instruction Address Breakpoint Register) and
      DAWR/DAWRX (Data Address Watchpoint Register and eXtension),
      since they could be updated by the guest as often as every context
      switch.
      
      This also adds a kvmppc_power8_compatible() function to test to see
      if a guest is compatible with POWER8 or not.  The CIABR and DAWR/X
      only exist on POWER8.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled · ae2113a4
      Paul Mackerras committed
      This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL
      capability is used to enable or disable in-kernel handling of an
      hcall, that the hcall is actually implemented by the kernel.
      If not, an EINVAL error is returned.
      
      This also checks the default-enabled list of hcalls and prints a
      warning if any hcall there is not actually implemented.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling · 699a0ea0
      Paul Mackerras committed
      This provides a way for userspace to control which sPAPR hcalls get
      handled in the kernel.  Each hcall can be individually enabled or
      disabled for in-kernel handling, except for H_RTAS.  The exception
      for H_RTAS is because userspace can already control whether
      individual RTAS functions are handled in-kernel or not via the
      KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for
      H_RTAS is out of the normal sequence of hcall numbers.
      
      Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the
      KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM.
      The args field of the struct kvm_enable_cap specifies the hcall number
      in args[0] and the enable/disable flag in args[1]; 0 means disable
      in-kernel handling (so that the hcall will always cause an exit to
      userspace) and 1 means enable.  Enabling or disabling in-kernel
      handling of an hcall is effective across the whole VM.
      
      The ability for KVM_ENABLE_CAP to be used on a VM file descriptor
      on PowerPC is new, added by this commit.  The KVM_CAP_ENABLE_CAP_VM
      capability advertises that this ability exists.
      
      When a VM is created, an initial set of hcalls are enabled for
      in-kernel handling.  The set that is enabled is the set that have
      an in-kernel implementation at this point.  Any new hcall
      implementations from this point onwards should not be added to the
      default set without a good reason.
      
      No distinction is made between real-mode and virtual-mode hcall
      implementations; the one setting controls them both.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
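The argument convention described above (hcall number in args[0], enable flag in args[1]) can be modelled in plain user-space C as follows. This is only a sketch under stated assumptions: `struct demo_vm`, `demo_enable_hcall()`, `demo_hcall_enabled()` and the bound `DEMO_MAX_HCALL` are hypothetical, the per-VM bitmap is an assumed representation, and the sketch further assumes that hcall numbers are multiples of 4. It does not call the real KVM_ENABLE_CAP ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_H_RTAS        0xf000u  /* outside the normal hcall sequence */
#define DEMO_MAX_HCALL     0x400u   /* illustrative upper bound (assumption) */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical per-VM state: one bit per hcall, indexed by hcall / 4. */
struct demo_vm {
    unsigned long enabled[DEMO_MAX_HCALL / 4 / DEMO_BITS_PER_LONG + 1];
};

/* args[0] = hcall number, args[1] = 1 to enable in-kernel handling,
 * 0 to disable it. Returns 0 on success or -EINVAL on bad input. */
int demo_enable_hcall(struct demo_vm *vm, const uint64_t args[2])
{
    assert(vm != NULL && args != NULL);

    uint64_t hcall = args[0];

    if (hcall == DEMO_H_RTAS || hcall > DEMO_MAX_HCALL || (hcall & 3))
        return -EINVAL;
    if (args[1] > 1)
        return -EINVAL;

    size_t bit = (size_t)(hcall / 4);
    unsigned long mask = 1UL << (bit % DEMO_BITS_PER_LONG);

    if (args[1])
        vm->enabled[bit / DEMO_BITS_PER_LONG] |= mask;
    else
        vm->enabled[bit / DEMO_BITS_PER_LONG] &= ~mask;
    return 0;
}

/* True if the hcall is currently handled in the (modelled) kernel. */
bool demo_hcall_enabled(const struct demo_vm *vm, uint64_t hcall)
{
    assert(vm != NULL);

    if (hcall > DEMO_MAX_HCALL || (hcall & 3))
        return false;

    size_t bit = (size_t)(hcall / 4);

    return (vm->enabled[bit / DEMO_BITS_PER_LONG] >>
            (bit % DEMO_BITS_PER_LONG)) & 1;
}
```

Because the state lives in the VM structure rather than in a vcpu, a single call takes effect for the whole VM, which matches the VM-wide behaviour the commit message describes.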
    • KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule · 1f0eeb7e
      Mihai Caraman committed
      On vcpu schedule, the condition checked for tlb pollution is too loose.
      The tlb entries of a vcpu become polluted (vs stale) only when a different
      vcpu within the same logical partition runs in-between. Optimize the tlb
      invalidation condition keeping last_vcpu per logical partition id.
      
      With the new invalidation condition, a guest shows 4% performance improvement
      on P5020DS while running a memory stress application with the cpu oversubscribed,
      the other guest running a cpu intensive workload.
      
      Guest - old invalidation condition
        real 3.89
        user 3.87
        sys 0.01
      
      Guest - enhanced invalidation condition
        real 3.75
        user 3.73
        sys 0.01
      
      Host
        real 3.70
        user 1.85
        sys 0.00
      
      The memory stress application accesses 4KB pages backed by 75% of available
      TLB0 entries:
      
      char foo[ENTRIES][4096] __attribute__ ((aligned (4096)));
      
      int main()
      {
      	char bar;
      	int i, j;
      
      	for (i = 0; i < ITERATIONS; i++)
      		for (j = 0; j < ENTRIES; j++)
      			bar = foo[j][0];
      
      	return 0;
      }
      Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
      Reviewed-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
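The tightened invalidation condition can be sketched as a small bookkeeping routine. This is an illustration only, assuming one such table per physical CPU; `struct demo_vcpu`, `demo_last_vcpu` and `demo_vcpu_load()` are hypothetical names and the real code also has to perform the actual TLB invalidation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_LPID 64   /* illustrative table size (assumption) */

/* Hypothetical vcpu: only the logical partition id matters here. */
struct demo_vcpu {
    unsigned int lpid;
};

/* Last vcpu that ran for each logical partition id. In a real
 * implementation there would be one such table per physical CPU. */
static const struct demo_vcpu *demo_last_vcpu[DEMO_MAX_LPID];

/* Called when a vcpu is scheduled in. Returns true when the TLB
 * entries tagged with vcpu->lpid may be polluted, i.e. a different
 * vcpu of the same partition (or none yet) ran here last. */
bool demo_vcpu_load(const struct demo_vcpu *vcpu)
{
    assert(vcpu != NULL && vcpu->lpid < DEMO_MAX_LPID);

    bool polluted = demo_last_vcpu[vcpu->lpid] != vcpu;

    demo_last_vcpu[vcpu->lpid] = vcpu;
    return polluted;
}
```

A vcpu from another partition running in between does not trigger an invalidation in this model, which is the case the commit message identifies as previously handled too conservatively.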
    • KVM: PPC: Book3S PR: Fix sparse endian checks · f396df35
      Alexander Graf committed
      While sending sparse with endian checks over the code base, it triggered at
      some places that were missing casts or had wrong types. Fix them up.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S PR: Fix ABIv2 on LE · da166fac
      Alexander Graf committed
      We switched to ABIv2 on Little Endian systems now which gets rid of the
      dotted function names. Branch to the actual functions when we see such
      a system.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() · ad7d4584
      Anton Blanchard committed
      Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are
      assembly functions that are exported to modules and also require
      a valid r2.
      
      As such we need to use _GLOBAL_TOC so we provide a global entry
      point that establishes the TOC (r2).
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue · 05a308c7
      Anton Blanchard committed
      To establish addressability quickly, ABIv2 requires the target
      address of the function being called to be in r12.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S PR: Handle hyp doorbell exits · 568fccc4
      Alexander Graf committed
      If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts.
      Handle those the same way we treat normal doorbells.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3s PR: Disable AIL mode with OPAL · fb4188ba
      Alexander Graf committed
      When we're using PR KVM we must not allow the CPU to take interrupts
      in virtual mode, as the SLB does not contain host kernel mappings
      when running inside the guest context.
      
      To make sure we get good performance for non-KVM tasks but still
      properly functioning PR KVM, let's just disable AIL whenever a vcpu
      is scheduled in.
      
      This is fundamentally different from how we deal with AIL on pSeries
      type machines where we disable AIL for the whole machine as soon as
      a single KVM VM is up.
      
      The reason for that is easy - on pSeries we do not have control over
      per-cpu configuration of AIL. We also don't want to mess with CPU hotplug
      races and AIL configuration, so setting it per CPU is easier and more
      flexible.
      
      This patch fixes running PR KVM on POWER8 bare metal for me.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Paul Mackerras <paulus@samba.org>
    • KVM: PPC: BOOK3S: PR: Emulate instruction counter · 06da28e7
      Aneesh Kumar K.V committed
      Writing to IC is not allowed in the privileged mode.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: BOOK3S: PR: Emulate virtual timebase register · 8f42ab27
      Aneesh Kumar K.V committed
      The virtual time base (VTB) register is a per-VM, per-CPU register that
      needs to be saved and restored on VM exit and entry. Writing to VTB is not
      allowed in the privileged mode.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [agraf: fix compile error]
      Signed-off-by: Alexander Graf <agraf@suse.de>
  2. 06 Jul, 2014 1 commit
  3. 11 Jun, 2014 1 commit
    • powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest. · 74845bc2
      Mahesh Salgaonkar committed
      Currently we forward MCEs to the guest which have been recovered by guest,
      and for unhandled errors we do not deliver the MCE to the guest. It looks
      like, with no support for FWNMI in qemu, the guest just panics whenever we
      deliver the recovered MCEs to it. Also, the existing code used to return to
      the host for unhandled errors, which was causing the guest to hang with soft
      lockups inside the guest and made it difficult to recover the guest instance.
      
      This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
      And, for recovered errors we just go back to normal functioning of guest
      instead of returning to host. This fixes soft lockup issues in guest.
      This patch also fixes an issue where guest MCE events were not logged to
      host console.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  4. 30 May, 2014 15 commits
    • KVM: PPC: Book3S PR: Rework SLB switching code · d8d164a9
      Alexander Graf committed
      On LPAR guest systems Linux enables the shadow SLB to indicate to the
      hypervisor a number of SLB entries that always have to be available.
      
      Today we go through this shadow SLB and disable all ESID's valid bits.
      However, pHyp doesn't like this approach very much and honors us with
      fancy machine checks.
      
      Fortunately the shadow SLB descriptor also has an entry that indicates
      the number of valid entries following. During the lifetime of a guest
      we can just swap that value to 0 and don't have to worry about the
      SLB restoration magic.
      
      While we're touching the code, let's also make it more readable (get
      rid of rldicl), allow it to deal with a dynamic number of bolted
      SLB entries and only do shadow SLB swizzling on LPAR systems.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S PR: Use SLB entry 0 · 207438d4
      Alexander Graf committed
      We didn't make use of SLB entry 0 because ... of no good reason. SLB entry 0
      will always be used by the Linux linear SLB entry, so the fact that slbia
      does not invalidate it doesn't matter as we overwrite SLB 0 on exit anyway.
      
      Just enable use of SLB entry 0 for our shadow SLB code.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Fix machine check delivery to guest · 000a25dd
      Paul Mackerras committed
      The code that delivered a machine check to the guest after handling
      it in real mode failed to load up r11 before calling kvmppc_msr_interrupt,
      which needs the old MSR value in r11 so it can see the transactional
      state there.  This adds the missing load.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs · 9bc01a9b
      Paul Mackerras committed
      This adds workarounds for two hardware bugs in the POWER8 performance
      monitor unit (PMU), both related to interrupt generation.  The effect
      of these bugs is that PMU interrupts can get lost, leading to tools
      such as perf reporting fewer counts and samples than they should.
      
      The first bug relates to the PMAO (perf. mon. alert occurred) bit in
      MMCR0; setting it should cause an interrupt, but doesn't.  The other
      bug relates to the PMAE (perf. mon. alert enable) bit in MMCR0.
      Setting PMAE when a counter is negative and counter negative
      conditions are enabled to cause alerts should cause an alert, but
      doesn't.
      
      The workaround for the first bug is to create conditions where a
      counter will overflow, whenever we are about to restore a MMCR0
      value that has PMAO set (and PMAO_SYNC clear).  The workaround for
      the second bug is to freeze all counters using MMCR2 before reading
      MMCR0.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Make sure we don't miss dirty pages · 6c576e74
      Paul Mackerras committed
      Currently, when testing whether a page is dirty (when constructing the
      bitmap for the KVM_GET_DIRTY_LOG ioctl), we test the C (changed) bit
      in the HPT entries mapping the page, and if it is 0, we consider the
      page to be clean.  However, the Power ISA doesn't require processors
      to set the C bit to 1 immediately when writing to a page, and in fact
      allows them to delay the writeback of the C bit until they receive a
      TLB invalidation for the page.  Thus it is possible that the page
      could be dirty and we miss it.
      
      Now, if there are vcpus running, this is not serious since the
      collection of the dirty log is racy already - some vcpu could dirty
      the page just after we check it.  But if there are no vcpus running we
      should return definitive results, in case we are in the final phase of
      migrating the guest.
      
      Also, if the permission bits in the HPTE don't allow writing, then we
      know that no CPU can set C.  If the HPTE was previously writable and
      the page was modified, any C bit writeback would have been flushed out
      by the tlbie that we did when changing the HPTE to read-only.
      
      Otherwise we need to do a TLB invalidation even if the C bit is 0, and
      then check the C bit.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Fix dirty map for hugepages · 687414be
      Alexey Kardashevskiy committed
      The dirty map that we construct for the KVM_GET_DIRTY_LOG ioctl has
      one bit per system page (4K/64K).  Currently, we only set one bit in
      the map for each HPT entry with the Change bit set, even if the HPT entry
      is for a large page (e.g., 16MB).  Userspace then considers only the
      first system page dirty, though in fact the guest may have modified
      anywhere in the large page.
      
      To fix this, we make kvm_test_clear_dirty() return the actual number
      of pages that are dirty (and rename it to kvm_test_clear_dirty_npages()
      to emphasize that that's what it returns).  In kvmppc_hv_get_dirty_log()
      we then set that many bits in the dirty map.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
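The effect of the fix can be illustrated with a small user-space bitmap sketch: a dirty large page marks as many bits as it spans system pages. The helpers `demo_npages()`, `demo_set_dirty()` and `demo_test_dirty()` are hypothetical and are not the kernel functions named above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of system pages covered by one mapping of map_size bytes. */
unsigned long demo_npages(unsigned long map_size, unsigned long sys_page_size)
{
    assert(sys_page_size != 0);
    return map_size <= sys_page_size ? 1 : map_size / sys_page_size;
}

/* Mark npages consecutive system pages dirty, starting at bit first.
 * Bits beyond map_bits are silently ignored. */
void demo_set_dirty(unsigned long *map, size_t map_bits,
                    unsigned long first, unsigned long npages)
{
    assert(map != NULL);

    for (unsigned long i = first; i < first + npages && i < map_bits; i++)
        map[i / DEMO_BITS_PER_LONG] |= 1UL << (i % DEMO_BITS_PER_LONG);
}

/* Return 1 if the given system page is marked dirty, 0 otherwise. */
int demo_test_dirty(const unsigned long *map, unsigned long bit)
{
    assert(map != NULL);
    return (map[bit / DEMO_BITS_PER_LONG] >> (bit % DEMO_BITS_PER_LONG)) & 1;
}
```

For a 16MB page with 64kB system pages, `demo_npages()` yields 256, so 256 consecutive bits are set instead of just the first one.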
    • KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address · 1066f772
      Paul Mackerras committed
      Currently, when a huge page is faulted in for a guest, we select the
      rmap chain to insert the HPTE into based on the guest physical address
      that the guest tried to access.  Since there is an rmap chain for each
      system page, there are many rmap chains for the area covered by a huge
      page (e.g. 256 for 16MB pages when PAGE_SIZE = 64kB), and the huge-page
      HPTE could end up in any one of them.
      
      For consistency, and to make the huge-page HPTEs easier to find, we now
      put huge-page HPTEs in the rmap chain corresponding to the base address
      of the huge page.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
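Selecting the rmap chain for the base of a huge page amounts to rounding the faulting page index down to a huge-page boundary. The sketch below shows that arithmetic only. `demo_hugepage_base_gfn()` is a hypothetical helper, and it assumes that huge pages are naturally aligned and span a power-of-two number of system pages.

```c
#include <assert.h>

/* Index of the first system page of the huge page that contains gfn.
 * npages is the number of system pages per huge page and must be a
 * power of two (for example 256 for 16MB pages with 64kB system pages). */
unsigned long demo_hugepage_base_gfn(unsigned long gfn, unsigned long npages)
{
    assert(npages != 0 && (npages & (npages - 1)) == 0);
    return gfn & ~(npages - 1);
}
```

Every access within the same huge page then maps to the same index, so the huge-page HPTE is always found in one predictable chain.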
    • KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates() · 55765483
      Paul Mackerras committed
      The global_invalidates() function contains a check that is intended
      to tell whether we are currently executing in the context of a hypercall
      issued by the guest.  The reason is that the optimization of using a
      local TLB invalidate instruction is only valid in that context.  The
      check was testing local_paca->kvm_hstate.kvm_vcore, which gets set
      when entering the guest but no longer gets cleared when exiting the
      guest.  To fix this, we use the kvm_vcpu field instead, which does
      get cleared when exiting the guest, by the kvmppc_release_hwthread()
      calls inside kvmppc_run_core().
      
      The effect of having the check wrong was that when kvmppc_do_h_remove()
      got called from htab_write() on the destination machine during a
      migration, it cleared the current cpu's bit in kvm->arch.need_tlb_flush.
      This meant that when the guest started running in the destination VM,
      it may miss out on doing a complete TLB flush, and therefore may end
      up using stale TLB entries from a previous guest that used the same
      LPID value.
      
      This should make migration more reliable.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Add CAP to indicate hcall fixes · f2e91042
      Alexander Graf committed
      We worked around some nasty KVM magic page hcall breakages:
      
        1) NX bit not honored, so ignore NX when we detect it
        2) LE guests swizzle hypercall instruction
      
      Without these fixes in place, there's no way it would make sense to expose kvm
      hypercalls to a guest. Chances are immensely high it would trip over and break.
      
      So add a new CAP that gives user space a hint that we have workarounds for the
      bugs above in place. It can use that as a hint to disable PV hypercalls when
      the guest CPU is anything POWER7 or higher and the host does not have fixes
      in place.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: MPIC: Reset IRQ source private members · aae65596
      Alexander Graf committed
      When we reset the in-kernel MPIC controller, we forget to reset some hidden
      state such as destmask and output. This state is usually set when the guest
      writes to the IDR register for a specific IRQ line.
      
      To make sure we stay in sync and don't forget hidden state, treat reset of
      the IDR register as a simple write of the IDR register. That automatically
      updates all the hidden state as well.
      Reported-by: Paul Janzen <pcj@pauljanzen.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Graciously fail broken LE hypercalls · 42188365
      Alexander Graf committed
      There are LE Linux guests out there that don't handle hypercalls correctly.
      Instead of interpreting the instruction stream from device tree as big endian
      they assume it's a little endian instruction stream and fail.
      
      When we see an illegal instruction from such a byte reversed instruction stream,
      bail out gracefully and just declare every hcall as an error.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler · ddca156a
      Aneesh Kumar K.V committed
      Use make_dsisr instead of open coding it. This also has
      the added benefit of handling alignment interrupts on additional
      instructions.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: BOOK3S: Always use the saved DAR value · 7310f3a5
      Aneesh Kumar K.V committed
      Although it's optional, IBM POWER CPUs have always set the DAR value on
      an alignment interrupt. So don't try to compute these values.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Disable NX for old magic page using guests · f3383cf8
      Alexander Graf committed
      Old guests try to use the magic page, but map their trampoline code inside
      of an NX region.
      
      Since we can't fix those old kernels, try to detect whether the guest is sane
      or not. If not, just disable NX functionality in KVM so that old guests at
      least work at all. For newer guests, add a bit that we can set to keep NX
      functionality available.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest · 1f365bb0
      Aneesh Kumar K.V committed
      On recent IBM Power CPUs, while the hashed page table is looked up using
      the page size from the segmentation hardware (i.e. the SLB), it is
      possible to have the HPT entry indicate a larger page size.  Thus for
      example it is possible to put a 16MB page in a 64kB segment, but since
      the hash lookup is done using a 64kB page size, it may be necessary to
      put multiple entries in the HPT for a single 16MB page.  This
      capability is called mixed page-size segment (MPSS).  With MPSS,
      there are two relevant page sizes: the base page size, which is the
      size used in searching the HPT, and the actual page size, which is the
      size indicated in the HPT entry. [ Note that the actual page size is
      always >= base page size ].
      
      We use "ibm,segment-page-sizes" device tree node to advertise
      the MPSS support to PAPR guest. The penc encoding indicates whether
      we support a specific combination of base page size and actual
      page size in the same segment. We also use the penc value in the
      LP encoding of HPTE entry.
      
      This patch exposes MPSS support to KVM guest by advertising the
      feature via "ibm,segment-page-sizes". It also adds the necessary changes
      to decode the base page size and the actual page size correctly from the
      HPTE entry.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>