1. 12 Dec, 2020 2 commits
    • tools/kvm_stat: Exempt time-based counters · 111d0bda
      Stefan Raspl committed
      The new counters halt_poll_success_ns and halt_poll_fail_ns do not count
      events. Instead they provide a time, and mess up our statistics. Therefore,
      we should exclude them.
      Removal is currently implemented with an exempt list. If more counters like
      these appear, we can think about a more general rule like excluding all
      fields named "*_ns", in case that's a standing convention.
      Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
      Tested-and-reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Message-Id: <20201208210829.101324-1-raspl@linux.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      111d0bda
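The two filtering approaches mentioned in the message above can be sketched as follows. This is a minimal illustration, not the actual kvm_stat code: `EXEMPT_FIELDS`, `is_event_counter`, `filter_counters`, and the sample values are made up for the example; only the two counter names come from the commit message.

```python
# Hypothetical sketch; only the two counter names are taken from the commit message.
EXEMPT_FIELDS = {"halt_poll_success_ns", "halt_poll_fail_ns"}


def is_event_counter(name, use_suffix_rule=False):
    """Return True if the field counts events and belongs in the statistics."""
    if use_suffix_rule:
        # Possible more general rule: treat every "*_ns" field as a time value.
        return not name.endswith("_ns")
    # Current approach described in the message: an explicit exempt list.
    return name not in EXEMPT_FIELDS


def filter_counters(stats, use_suffix_rule=False):
    """Drop time-based fields from a {name: value} mapping."""
    return {
        name: value
        for name, value in stats.items()
        if is_event_counter(name, use_suffix_rule)
    }


# Made-up sample data for illustration.
sample = {
    "exits": 1200,
    "halt_poll_success_ns": 987654321,
    "halt_poll_fail_ns": 12345,
}
print(filter_counters(sample))
```

The exempt list only catches names it already knows about, whereas the suffix rule would also exclude future `*_ns` fields, provided the naming convention holds.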
    • KVM: mmu: Fix SPTE encoding of MMIO generation upper half · 34c0f6f2
      Maciej S. Szmigiero committed
      Commit cae7ed3c ("KVM: x86: Refactor the MMIO SPTE generation handling")
      cleaned up the computation of MMIO generation SPTE masks; however, it
      introduced a bug in how the upper part was encoded:
      SPTE bits 52-61 were supposed to contain bits 10-19 of the current
      generation number, however a missing shift encoded bits 1-10 there instead
      (mostly duplicating the lower part of the encoded generation number that
      then consisted of bits 1-9).
      
      In the meantime, the upper part was shrunk by one bit and moved by
      subsequent commits to become an upper half of the encoded generation number
      (bits 9-17 of bits 0-17 encoded in a SPTE).
      
      In addition to the above, commit 56871d44 ("KVM: x86: fix overlap between SPTE_MMIO_MASK and generation")
      has changed the SPTE bit range assigned to encode the generation number and
      the total number of bits encoded but did not update them in the comment
      attached to their defines, nor in the KVM MMU doc.
      Let's do it here, too, since it is too trivial a thing to warrant a separate
      commit.
      
      Fixes: cae7ed3c ("KVM: x86: Refactor the MMIO SPTE generation handling")
      Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Message-Id: <156700708db2a5296c5ed7a8b9ac71f1e9765c85.1607129096.git.maciej.szmigiero@oracle.com>
      Cc: stable@vger.kernel.org
      [Reorganize macros so that everything is computed from the bit ranges. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      34c0f6f2
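The effect of the missing shift can be illustrated with a small model that splits an 18-bit generation number into a lower half (bits 0-8) and an upper half (bits 9-17) and stores each half in its own SPTE bit range. The SPTE bit positions used here (3-11 and 52-60) are assumptions chosen for the example, not values taken from the kernel headers, and all function names are hypothetical.

```python
# Assumed SPTE bit ranges, for illustration only.
LOW_START, LOW_END = 3, 11
HIGH_START, HIGH_END = 52, 60

LOW_BITS = LOW_END - LOW_START + 1      # 9 bits: generation bits 0-8
HIGH_BITS = HIGH_END - HIGH_START + 1   # 9 bits: generation bits 9-17


def bit_mask(start, end):
    """Mask with bits start..end (inclusive) set."""
    return ((1 << (end - start + 1)) - 1) << start


LOW_MASK = bit_mask(LOW_START, LOW_END)
HIGH_MASK = bit_mask(HIGH_START, HIGH_END)


def encode(gen):
    """Place the lower and upper halves of gen in their SPTE ranges."""
    low = (gen << LOW_START) & LOW_MASK
    high = ((gen >> LOW_BITS) << HIGH_START) & HIGH_MASK
    return low | high


def encode_missing_shift(gen):
    """Same as encode(), but without the >> LOW_BITS for the upper half."""
    low = (gen << LOW_START) & LOW_MASK
    high = (gen << HIGH_START) & HIGH_MASK  # lower bits land here again
    return low | high


def decode(spte):
    """Reassemble the generation number from both SPTE ranges."""
    low = (spte & LOW_MASK) >> LOW_START
    high = (spte & HIGH_MASK) >> HIGH_START
    return low | (high << LOW_BITS)


gen = 0x200  # only bit 9 set, so it lives entirely in the upper half
print(hex(decode(encode(gen))), hex(decode(encode_missing_shift(gen))))
```

Deriving every mask and shift from the bit ranges, as in this sketch, follows the same idea as the maintainer note about computing everything from the bit ranges: a range change then cannot silently go out of sync with a hand-written shift.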
  2. 11 Dec, 2020 1 commit
  3. 04 Dec, 2020 3 commits
  4. 02 Dec, 2020 3 commits
  5. 28 Nov, 2020 1 commit
    • kvm: x86/mmu: Fix get_mmio_spte() on CPUs supporting 5-level PT · 9a2a0d3c
      Vitaly Kuznetsov committed
      Commit 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU") caused
      the following WARNING on an Intel Ice Lake CPU:
      
       get_mmio_spte: detect reserved bits on spte, addr 0xb80a0, dump hierarchy:
       ------ spte 0xb80a0 level 5.
       ------ spte 0xfcd210107 level 4.
       ------ spte 0x1004c40107 level 3.
       ------ spte 0x1004c41107 level 2.
       ------ spte 0x1db00000000b83b6 level 1.
       WARNING: CPU: 109 PID: 10254 at arch/x86/kvm/mmu/mmu.c:3569 kvm_mmu_page_fault.cold.150+0x54/0x22f [kvm]
      ...
       Call Trace:
        ? kvm_io_bus_get_first_dev+0x55/0x110 [kvm]
        vcpu_enter_guest+0xaa1/0x16a0 [kvm]
        ? vmx_get_cs_db_l_bits+0x17/0x30 [kvm_intel]
        ? skip_emulated_instruction+0xaa/0x150 [kvm_intel]
        kvm_arch_vcpu_ioctl_run+0xca/0x520 [kvm]
      
      The guest triggering this crashes. Note, this happens with the traditional
      MMU and EPT enabled, not with the newly introduced TDP MMU. Turns out,
      there was a subtle change in the above mentioned commit. Previously,
      walk_shadow_page_get_mmio_spte() was setting 'root' to 'iterator.level'
      which is returned by shadow_walk_init() and is equal to
      'vcpu->arch.mmu->shadow_root_level'. Now, get_mmio_spte() sets it to
      'int root = vcpu->arch.mmu->root_level'.
      
      The difference between 'root_level' and 'shadow_root_level' on CPUs
      supporting 5-level page tables is that in some cases we don't want to
      use 5-level paging; in particular, when 'cpuid_maxphyaddr(vcpu) <= 48',
      kvm_mmu_get_tdp_level() returns '4'. If the upper level is not used,
      the corresponding SPTE will fail the '__is_rsvd_bits_set()' check.
      
      Revert to using 'shadow_root_level'.
      
      Fixes: 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20201126110206.2118959-1-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9a2a0d3c
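The level selection described in the message above can be modeled in a few lines. This is a simplified sketch under stated assumptions, not kernel code: `tdp_level` and `levels_walked` are hypothetical helpers, and the only rule modeled is the one quoted in the message (a 5-level-capable CPU falls back to 4 levels when the guest's maximum physical address width is 48 bits or less).

```python
def tdp_level(max_tdp_level, maxphyaddr):
    """Number of TDP page-table levels actually used (simplified model)."""
    if max_tdp_level == 5 and maxphyaddr <= 48:
        # A fifth level is not needed to cover 48 bits or fewer.
        return 4
    return max_tdp_level


def levels_walked(start_level):
    """Levels visited when walking from start_level down to level 1."""
    return list(range(start_level, 0, -1))


shadow_root_level = tdp_level(max_tdp_level=5, maxphyaddr=46)
print(levels_walked(shadow_root_level))  # walk starting at the level in use
print(levels_walked(5))                  # walk that also touches unused level 5
```

Starting the walk from a level above the one returned by `tdp_level` means looking at an entry that was never populated, which matches the "level 5" line in the dump above.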
  6. 27 Nov, 2020 3 commits
    • KVM: x86: Fix split-irqchip vs interrupt injection window request · 71cc849b
      Paolo Bonzini committed
      kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
      a hodge-podge of conditions, hacked together to get something that
      more or less works.  But what is actually needed is much simpler;
      in both cases the fundamental question is, do we have a place to stash
      an interrupt if userspace does KVM_INTERRUPT?
      
      In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
      Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
      unnecessarily restrictive.
      
      In split irqchip mode it's a bit more complicated, we need to check
      kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
      cycle and thus requires ExtINTs not to be masked) as well as
      !pending_userspace_extint(vcpu).  However, there is no need to
      check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
      pending ExtINT state separate from event injection state, and checking
      kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
      priority than APIC interrupts.  In fact the latter fixes a bug:
      when userspace requests an IRQ window vmexit, an interrupt in the
      local APIC can cause kvm_cpu_has_interrupt() to be true and thus
      kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
      happens, vcpu_run does not exit to userspace but the interrupt window
      vmexits keep occurring.  The VM loops without any hope of making progress.
      
      Once we try to fix these with something like
      
           return kvm_arch_interrupt_allowed(vcpu) &&
      -        !kvm_cpu_has_interrupt(vcpu) &&
      -        !kvm_event_needs_reinjection(vcpu) &&
      -        kvm_cpu_accept_dm_intr(vcpu);
      +        (!lapic_in_kernel(vcpu)
      +         ? !vcpu->arch.interrupt.injected
      +         : (kvm_apic_accept_pic_intr(vcpu)
      +            && !pending_userspace_extint(vcpu)));
      
      we realize two things.  First, thanks to the previous patch the complex
      conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
      window request in vcpu_enter_guest()
      
              bool req_int_win =
                      dm_request_for_irq_injection(vcpu) &&
                      kvm_cpu_accept_dm_intr(vcpu);
      
      should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
      it is unnecessary to ask the processor for an interrupt window
      if we would not be able to return to userspace.  Therefore,
      kvm_cpu_accept_dm_intr(vcpu) is basically !kvm_cpu_has_extint(vcpu)
      ANDed with the existing check for masked ExtINT.  It all makes sense:
      
      - we can accept an interrupt from userspace if there is a place
        to stash it (and, for irqchip split, ExtINTs are not masked).
        Interrupts from userspace _can_ be accepted even if right now
        EFLAGS.IF=0.
      
      - in order to tell userspace we will inject its interrupt ("IRQ
        window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
        KVM and the vCPU need to be ready to accept the interrupt.
      
      ... and this is what the patch implements.
      Reported-by: David Woodhouse <dwmw@amazon.co.uk>
      Analyzed-by: David Woodhouse <dwmw@amazon.co.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Nikos Tsironis <ntsironis@arrikto.com>
      Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
      Tested-by: David Woodhouse <dwmw@amazon.co.uk>
      71cc849b
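The two conditions summarized at the end of the message above can be written as a small boolean model. It is a sketch based only on the description in the message, not the kernel implementation: the `VcpuState` fields and the three functions are hypothetical stand-ins for the kernel helpers they are named after.

```python
from dataclasses import dataclass


@dataclass
class VcpuState:
    """Hypothetical flattened view of the vCPU state used by the checks."""
    lapic_in_kernel: bool                   # True: split irqchip, False: userspace irqchip
    interrupt_injected: bool = False        # userspace irqchip: slot already occupied
    pending_userspace_extint: bool = False  # split irqchip: ExtINT already pending
    apic_accepts_pic_intr: bool = True      # split irqchip: ExtINTs not masked
    interrupt_allowed: bool = True          # vCPU can take an interrupt right now
    lapic_interrupt_pending: bool = False   # deliberately not consulted below


def has_extint(v):
    """Is the place where a userspace interrupt is stashed already taken?"""
    if not v.lapic_in_kernel:
        return v.interrupt_injected
    return v.pending_userspace_extint


def accept_dm_intr(v):
    """Can KVM accept an interrupt from userspace (KVM_INTERRUPT)?"""
    if has_extint(v):
        return False
    # For split irqchip, ExtINTs must additionally not be masked.
    return (not v.lapic_in_kernel) or v.apic_accepts_pic_intr


def ready_for_interrupt_injection(v):
    """'IRQ window open': both KVM and the vCPU are ready."""
    return v.interrupt_allowed and accept_dm_intr(v)


# Scenario from the message: split irqchip with a local APIC interrupt pending.
v = VcpuState(lapic_in_kernel=True, lapic_interrupt_pending=True)
print(ready_for_interrupt_injection(v))
```

In this model a pending local APIC interrupt has no influence on the result, which is the property the message relies on to avoid the endless interrupt-window exits.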
    • KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint · 72c3bcdc
      Paolo Bonzini committed
      Centralize handling of interrupts from the userspace APIC
      in kvm_cpu_has_extint and kvm_cpu_get_extint, since
      userspace APIC interrupts are handled more or less the
      same as ExtINTs are with split irqchip.  This removes
      duplicated code from kvm_cpu_has_injectable_intr and
      kvm_cpu_has_interrupt, and makes the code more similar
      between kvm_cpu_has_{extint,interrupt} on one side
      and kvm_cpu_get_{extint,interrupt} on the other.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Filippo Sironi <sironi@amazon.de>
      Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
      Tested-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      72c3bcdc
    • Merge tag 'kvmarm-fixes-5.10-4' of... · 545f6394
      Paolo Bonzini committed
      Merge tag 'kvmarm-fixes-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
      
      KVM/arm64 fixes for v5.10, take #4
      
      - Fix alignment of the new HYP sections
      - Fix GICR_TYPER access from userspace
      545f6394
  7. 20 Nov, 2020 1 commit
    • MAINTAINERS: Update email address for Sean Christopherson · c2b1209d
      Sean Christopherson committed
      Update my email address to one provided by my new benefactor.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jarkko Sakkinen <jarkko@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: kvm@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20201119183707.291864-1-sean.kvm@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c2b1209d
  8. 19 Nov, 2020 1 commit
  9. 18 Nov, 2020 3 commits
  10. 17 Nov, 2020 3 commits
  11. 16 Nov, 2020 1 commit
    • KVM: arm64: Correctly align nVHE percpu data · 7bab16a6
      Jamie Iles committed
      The nVHE percpu data is partially linked but the nVHE linker script did
      not align the percpu section.  The PERCPU_INPUT macro would then align
      the data to a page boundary:
      
        #define PERCPU_INPUT(cacheline)					\
        	__per_cpu_start = .;						\
        	*(.data..percpu..first)						\
        	. = ALIGN(PAGE_SIZE);						\
        	*(.data..percpu..page_aligned)					\
        	. = ALIGN(cacheline);						\
        	*(.data..percpu..read_mostly)					\
        	. = ALIGN(cacheline);						\
        	*(.data..percpu)						\
        	*(.data..percpu..shared_aligned)				\
        	PERCPU_DECRYPTED_SECTION					\
        	__per_cpu_end = .;
      
      but then when the final vmlinux linking happens the hypervisor percpu
      data is included after page alignment and so the offsets potentially
      don't match.  On my build I saw that the .hyp.data..percpu section was
      at address 0x20 and then the percpu data would begin at 0x1000 (because
      of the page alignment in PERCPU_INPUT), but when linked into vmlinux,
      everything would be shifted down by 0x20 bytes.
      
      This manifests as one of the CPUs getting lost when running
      kvm-unit-tests or starting any VM and subsequent soft lockup on a Cortex
      A72 device.
      
      Fixes: 30c95391 ("kvm: arm64: Set up hyp percpu data for nVHE")
      Signed-off-by: Jamie Iles <jamie@nuviainc.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Acked-by: David Brazdil <dbrazdil@google.com>
      Cc: David Brazdil <dbrazdil@google.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201113150406.14314-1-jamie@nuviainc.com
      7bab16a6
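The address arithmetic behind the mismatch can be reproduced with a short calculation. This only illustrates the `ALIGN(PAGE_SIZE)` step using the numbers quoted in the message (section at 0x20, data at 0x1000); a 4 KiB page size is assumed, and `align_up` and `padding_before_aligned_data` are hypothetical helpers, not linker or kernel functions.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def padding_before_aligned_data(section_start):
    """Bytes the ALIGN(PAGE_SIZE) step inserts after a given section start."""
    return align_up(section_start, PAGE_SIZE) - section_start


# Section start seen in the partial link, as quoted in the message.
print(hex(align_up(0x20, PAGE_SIZE)), hex(padding_before_aligned_data(0x20)))
# A section that already starts on a page boundary needs no padding.
print(hex(padding_before_aligned_data(0x2000)))
```

Because the padding depends on where the section starts, offsets computed while the section sits at an unaligned address need not match those seen once it is placed at a page-aligned address in the final link.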
  12. 15 Nov, 2020 1 commit
    • kvm: mmu: fix is_tdp_mmu_check when the TDP MMU is not in use · c887c9b9
      Paolo Bonzini committed
      In some cases where shadow paging is in use, the root page will
      be either mmu->pae_root or vcpu->arch.mmu->lm_root.  Then it will
      not have an associated struct kvm_mmu_page, because it is allocated
      with alloc_page instead of kvm_mmu_alloc_page.
      
      Just return false quickly from is_tdp_mmu_root if the TDP MMU is
      not in use, which also includes the case where shadow paging is
      enabled.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c887c9b9
  13. 13 Nov, 2020 17 commits