1. 05 4月, 2018 2 次提交
    • S
      KVM: VMX: remove bogus WARN_ON in handle_ept_misconfig · c75d0edc
      Sean Christopherson 提交于
      Remove the WARN_ON in handle_ept_misconfig() as it is unnecessary
      and causes false positives.  Return the unmodified result of
      kvm_mmu_page_fault() instead of converting a system error code to
      KVM_EXIT_UNKNOWN so that userspace sees the error code of the
      actual failure, not a generic "we don't know what went wrong".
      
        * kvm_mmu_page_fault() will WARN if reserved bits are set in the
          SPTEs, i.e. it covers the case where an EPT misconfig occurred
          because of a KVM bug.
      
        * The WARN_ON will fire on any system error code that is hit while
          handling the fault, e.g. -ENOMEM from mmu_topup_memory_caches()
          while handling a legitmate MMIO EPT misconfig or -EFAULT from
          kvm_handle_bad_page() if the corresponding HVA is invalid.  In
          either case, userspace should receive the original error code
          and firing a warning is incorrect behavior as KVM is operating
          as designed.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      c75d0edc
    • S
      Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown" · 2c151b25
      Sean Christopherson 提交于
      The bug that led to commit 95e057e2
      was a benign warning (no adverse affects other than the warning
      itself) that was detected by syzkaller.  Further inspection shows
      that the WARN_ON in question, in handle_ept_misconfig(), is
      unnecessary and flawed (this was also briefly discussed in the
      original patch: https://patchwork.kernel.org/patch/10204649).
      
        * The WARN_ON is unnecessary as kvm_mmu_page_fault() will WARN
          if reserved bits are set in the SPTEs, i.e. it covers the case
          where an EPT misconfig occurred because of a KVM bug.
      
        * The WARN_ON is flawed because it will fire on any system error
          code that is hit while handling the fault, e.g. -ENOMEM can be
          returned by mmu_topup_memory_caches() while handling a legitmate
          MMIO EPT misconfig.
      
      The original behavior of returning -EFAULT when userspace munmaps
      an HVA without first removing the memslot is correct and desirable,
      i.e. KVM is letting userspace know it has generated a bad address.
      Returning RET_PF_EMULATE masks the WARN_ON in the EPT misconfig path,
      but does not fix the underlying bug, i.e. the WARN_ON is bogus.
      
      Furthermore, returning RET_PF_EMULATE has the unwanted side effect of
      causing KVM to attempt to emulate an instruction on any page fault
      with an invalid HVA translation, e.g. a not-present EPT violation
      on a VM_PFNMAP VMA whose fault handler failed to insert a PFN.
      
        * There is no guarantee that the fault is directly related to the
          instruction, i.e. the fault could have been triggered by a side
          effect memory access in the guest, e.g. while vectoring a #DB or
          writing a tracing record.  This could cause KVM to effectively
          mask the fault if KVM doesn't model the behavior leading to the
          fault, i.e. emulation could succeed and resume the guest.
      
        * If emulation does fail, KVM will return EMULATION_FAILED instead
          of -EFAULT, which is a red herring as the user will either debug
          a bogus emulation attempt or scratch their head wondering why we
          were attempting emulation in the first place.
      
      TL;DR: revert to returning -EFAULT and remove the bogus WARN_ON in
      handle_ept_misconfig in a future patch.
      
      This reverts commit 95e057e2.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2c151b25
  2. 04 4月, 2018 2 次提交
    • S
      kvm: Add emulation for movups/movupd · 29916968
      Stefan Fritsch 提交于
      This is very similar to the aligned versions movaps/movapd.
      
      We have seen the corresponding emulation failures with openbsd as guest
      and with Windows 10 with intel HD graphics pass through.
      Signed-off-by: NChristian Ehrhardt <christian_ehrhardt@genua.de>
      Signed-off-by: NStefan Fritsch <sf@sfritsch.de>
      Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      29916968
    • S
      KVM: VMX: raise internal error for exception during invalid protected mode state · add5ff7a
      Sean Christopherson 提交于
      Exit to userspace with KVM_INTERNAL_ERROR_EMULATION if we encounter
      an exception in Protected Mode while emulating guest due to invalid
      guest state.  Unlike Big RM, KVM doesn't support emulating exceptions
      in PM, i.e. PM exceptions are always injected via the VMCS.  Because
      we will never do VMRESUME due to emulation_required, the exception is
      never realized and we'll keep emulating the faulting instruction over
      and over until we receive a signal.
      
      Exit to userspace iff there is a pending exception, i.e. don't exit
      simply on a requested event. The purpose of this check and exit is to
      aid in debugging a guest that is in all likelihood already doomed.
      Invalid guest state in PM is extremely limited in normal operation,
      e.g. it generally only occurs for a few instructions early in BIOS,
      and any exception at this time is all but guaranteed to be fatal.
      Non-vectored interrupts, e.g. INIT, SIPI and SMI, can be cleanly
      handled/emulated, while checking for vectored interrupts, e.g. INTR
      and NMI, without hitting false positives would add a fair amount of
      complexity for almost no benefit (getting hit by lightning seems
      more likely than encountering this specific scenario).
      
      Add a WARN_ON_ONCE to vmx_queue_exception() if we try to inject an
      exception via the VMCS and emulation_required is true.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      add5ff7a
  3. 30 3月, 2018 1 次提交
  4. 29 3月, 2018 22 次提交
  5. 28 3月, 2018 5 次提交
    • A
      KVM: x86: Fix perf timer mode IP reporting · dd60d217
      Andi Kleen 提交于
      KVM and perf have a special backdoor mechanism to report the IP for interrupts
      re-executed after vm exit. This works for the NMIs that perf normally uses.
      
      However when perf is in timer mode it doesn't work because the timer interrupt
      doesn't get this special treatment. This is common when KVM is running
      nested in another hypervisor which may not implement the PMU, so only
      timer mode is available.
      
      Call the functions to set up the backdoor IP also for non NMI interrupts.
      
      I renamed the functions to set up the backdoor IP reporting to be more
      appropiate for their new use.  The SVM change is only compile tested.
      
      v2: Moved the functions inline.
      For the normal interrupt case the before/after functions are now
      called from x86.c, not arch specific code.
      For the NMI case we still need to call it in the architecture
      specific code, because it's already needed in the low level *_run
      functions.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      [Removed unnecessary calls from arch handle_external_intr. - Radim]
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      dd60d217
    • R
      Merge tag 'kvm-arm-for-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm · abe7a458
      Radim Krčmář 提交于
      KVM/ARM updates for v4.17
      
      - VHE optimizations
      - EL2 address space randomization
      - Variant 3a mitigation for Cortex-A57 and A72
      - The usual vgic fixes
      - Various minor tidying-up
      abe7a458
    • M
      arm64: Add temporary ERRATA_MIDR_ALL_VERSIONS compatibility macro · dc6ed61d
      Marc Zyngier 提交于
      MIDR_ALL_VERSIONS is changing, and won't have the same meaning
      in 4.17, and the right thing to use will be ERRATA_MIDR_ALL_VERSIONS.
      
      In order to cope with the merge window, let's add a compatibility
      macro that will allow a relatively smooth transition, and that
      can be removed post 4.17-rc1.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      dc6ed61d
    • M
      Revert "arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening" · adc91ab7
      Marc Zyngier 提交于
      Creates far too many conflicts with arm64/for-next/core, to be
      resent post -rc1.
      
      This reverts commit f9f5dc19.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      adc91ab7
    • P
      KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot() in page fault handler · 31c8b0d0
      Paul Mackerras 提交于
      This changes the hypervisor page fault handler for radix guests to use
      the generic KVM __gfn_to_pfn_memslot() function instead of using
      get_user_pages_fast() and then handling the case of VM_PFNMAP vmas
      specially.  The old code missed the case of VM_IO vmas; with this
      change, VM_IO vmas will now be handled correctly by code within
      __gfn_to_pfn_memslot.
      
      Currently, __gfn_to_pfn_memslot calls hva_to_pfn, which only uses
      __get_user_pages_fast for the initial lookup in the cases where
      either atomic or async is set.  Since we are not setting either
      atomic or async, we do our own __get_user_pages_fast first, for now.
      
      This also adds code to check for the KVM_MEM_READONLY flag on the
      memslot.  If it is set and this is a write access, we synthesize a
      data storage interrupt for the guest.
      
      In the case where the page is not normal RAM (i.e. page == NULL in
      kvmppc_book3s_radix_page_fault(), we read the PTE from the Linux page
      tables because we need the mapping attribute bits as well as the PFN.
      (The mapping attribute bits indicate whether accesses have to be
      non-cacheable and/or guarded.)
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      31c8b0d0
  6. 26 3月, 2018 2 次提交
    • M
      KVM: arm/arm64: vgic-its: Fix potential overrun in vgic_copy_lpi_list · 7d8b44c5
      Marc Zyngier 提交于
      vgic_copy_lpi_list() parses the LPI list and picks LPIs targeting
      a given vcpu. We allocate the array containing the intids before taking
      the lpi_list_lock, which means we can have an array size that is not
      equal to the number of LPIs.
      
      This is particularly obvious when looking at the path coming from
      vgic_enable_lpis, which is not a command, and thus can run in parallel
      with commands:
      
      vcpu 0:                                        vcpu 1:
      vgic_enable_lpis
        its_sync_lpi_pending_table
          vgic_copy_lpi_list
            intids = kmalloc_array(irq_count)
                                                     MAPI(lpi targeting vcpu 0)
            list_for_each_entry(lpi_list_head)
              intids[i++] = irq->intid;
      
      At that stage, we will happily overrun the intids array. Boo. An easy
      fix is is to break once the array is full. The MAPI command will update
      the config anyway, and we won't miss a thing. We also make sure that
      lpi_list_count is read exactly once, so that further updates of that
      value will not affect the array bound check.
      
      Cc: stable@vger.kernel.org
      Fixes: ccb1d791 ("KVM: arm64: vgic-its: Fix pending table sync")
      Reviewed-by: NAndre Przywara <andre.przywara@arm.com>
      Reviewed-by: NEric Auger <eric.auger@redhat.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      7d8b44c5
    • M
      KVM: arm/arm64: vgic: Disallow Active+Pending for level interrupts · 67b5b673
      Marc Zyngier 提交于
      It was recently reported that VFIO mediated devices, and anything
      that VFIO exposes as level interrupts, do no strictly follow the
      expected logic of such interrupts as it only lowers the input
      line when the guest has EOId the interrupt at the GIC level, rather
      than when it Acked the interrupt at the device level.
      
      THe GIC's Active+Pending state is fundamentally incompatible with
      this behaviour, as it prevents KVM from observing the EOI, and in
      turn results in VFIO never dropping the line. This results in an
      interrupt storm in the guest, which it really never expected.
      
      As we cannot really change VFIO to follow the strict rules of level
      signalling, let's forbid the A+P state altogether, as it is in the
      end only an optimization. It ensures that we will transition via
      an invalid state, which we can use to notify VFIO of the EOI.
      Reviewed-by: NEric Auger <eric.auger@redhat.com>
      Tested-by: NEric Auger <eric.auger@redhat.com>
      Tested-by: NShunyong Yang <shunyong.yang@hxt-semitech.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      67b5b673
  7. 24 3月, 2018 5 次提交
  8. 21 3月, 2018 1 次提交