1. 28 9月, 2020 3 次提交
  2. 22 9月, 2020 5 次提交
  3. 21 9月, 2020 3 次提交
  4. 19 9月, 2020 2 次提交
  5. 17 9月, 2020 3 次提交
    • P
      KVM: PPC: Book3S HV: Set LPCR[HDICE] before writing HDEC · 35dfb43c
      Paul Mackerras 提交于
      POWER8 and POWER9 machines have a hardware deviation where generation
      of a hypervisor decrementer exception is suppressed if the HDICE bit
      in the LPCR register is 0 at the time when the HDEC register
      decrements from 0 to -1.  When entering a guest, KVM first writes the
      HDEC register with the time until it wants the CPU to exit the guest,
      and then writes the LPCR with the guest value, which includes
      HDICE = 1.  If HDEC decrements from 0 to -1 during the interval
      between those two events, it is possible that we can enter the guest
      with HDEC already negative but no HDEC exception pending, meaning that
      no HDEC interrupt will occur while the CPU is in the guest, or at
      least not until HDEC wraps around.  Thus it is possible for the CPU to
      keep executing in the guest for a long time; up to about 4 seconds on
      POWER8, or about 4.46 years on POWER9 (except that the host kernel
      hard lockup detector will fire first).
      
      To fix this, we set the LPCR[HDICE] bit before writing HDEC on guest
      entry.
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      35dfb43c
    • F
      KVM: PPC: Book3S HV: Do not allocate HPT for a nested guest · 05e6295d
      Fabiano Rosas 提交于
      The current nested KVM code does not support HPT guests. This is
      informed/enforced in some ways:
      
      - Hosts < P9 will not be able to enable the nested HV feature;
      
      - The nested hypervisor MMU capabilities will not contain
        KVM_CAP_PPC_MMU_HASH_V3;
      
      - QEMU reflects the MMU capabilities in the
        'ibm,arch-vec-5-platform-support' device-tree property;
      
      - The nested guest, at 'prom_parse_mmu_model' ignores the
        'disable_radix' kernel command line option if HPT is not supported;
      
      - The KVM_PPC_CONFIGURE_V3_MMU ioctl will fail if trying to use HPT.
      
      There is, however, still a way to start a HPT guest by using
      max-compat-cpu=power8 at the QEMU machine options. This leads to the
      guest being set to use hash after QEMU calls the KVM_PPC_ALLOCATE_HTAB
      ioctl.
      
      With the guest set to hash, the nested hypervisor goes through the
      entry path that has no knowledge of nesting (kvmppc_run_vcpu) and
      crashes when it tries to execute an hypervisor-privileged (mtspr
      HDEC) instruction at __kvmppc_vcore_entry:
      
      root@L1:~ $ qemu-system-ppc64 -machine pseries,max-cpu-compat=power8 ...
      
      <snip>
      [  538.543303] CPU: 83 PID: 25185 Comm: CPU 0/KVM Not tainted 5.9.0-rc4 #1
      [  538.543355] NIP:  c00800000753f388 LR: c00800000753f368 CTR: c0000000001e5ec0
      [  538.543417] REGS: c0000013e91e33b0 TRAP: 0700   Not tainted  (5.9.0-rc4)
      [  538.543470] MSR:  8000000002843033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE>  CR: 22422882  XER: 20040000
      [  538.543546] CFAR: c00800000753f4b0 IRQMASK: 3
                     GPR00: c0080000075397a0 c0000013e91e3640 c00800000755e600 0000000080000000
                     GPR04: 0000000000000000 c0000013eab19800 c000001394de0000 00000043a054db72
                     GPR08: 00000000003b1652 0000000000000000 0000000000000000 c0080000075502e0
                     GPR12: c0000000001e5ec0 c0000007ffa74200 c0000013eab19800 0000000000000008
                     GPR16: 0000000000000000 c00000139676c6c0 c000000001d23948 c0000013e91e38b8
                     GPR20: 0000000000000053 0000000000000000 0000000000000001 0000000000000000
                     GPR24: 0000000000000001 0000000000000001 0000000000000000 0000000000000001
                     GPR28: 0000000000000001 0000000000000053 c0000013eab19800 0000000000000001
      [  538.544067] NIP [c00800000753f388] __kvmppc_vcore_entry+0x90/0x104 [kvm_hv]
      [  538.544121] LR [c00800000753f368] __kvmppc_vcore_entry+0x70/0x104 [kvm_hv]
      [  538.544173] Call Trace:
      [  538.544196] [c0000013e91e3640] [c0000013e91e3680] 0xc0000013e91e3680 (unreliable)
      [  538.544260] [c0000013e91e3820] [c0080000075397a0] kvmppc_run_core+0xbc8/0x19d0 [kvm_hv]
      [  538.544325] [c0000013e91e39e0] [c00800000753d99c] kvmppc_vcpu_run_hv+0x404/0xc00 [kvm_hv]
      [  538.544394] [c0000013e91e3ad0] [c0080000072da4fc] kvmppc_vcpu_run+0x34/0x48 [kvm]
      [  538.544472] [c0000013e91e3af0] [c0080000072d61b8] kvm_arch_vcpu_ioctl_run+0x310/0x420 [kvm]
      [  538.544539] [c0000013e91e3b80] [c0080000072c7450] kvm_vcpu_ioctl+0x298/0x778 [kvm]
      [  538.544605] [c0000013e91e3ce0] [c0000000004b8c2c] sys_ioctl+0x1dc/0xc90
      [  538.544662] [c0000013e91e3dc0] [c00000000002f9a4] system_call_exception+0xe4/0x1c0
      [  538.544726] [c0000013e91e3e20] [c00000000000d140] system_call_common+0xf0/0x27c
      [  538.544787] Instruction dump:
      [  538.544821] f86d1098 60000000 60000000 48000099 e8ad0fe8 e8c500a0 e9264140 75290002
      [  538.544886] 7d1602a6 7cec42a6 40820008 7d0807b4 <7d164ba6> 7d083a14 f90d10a0 480104fd
      [  538.544953] ---[ end trace 74423e2b948c2e0c ]---
      
      This patch makes the KVM_PPC_ALLOCATE_HTAB ioctl fail when running in
      the nested hypervisor, causing QEMU to abort.
      Reported-by: NSatheesh Rajendran <sathnaga@linux.vnet.ibm.com>
      Signed-off-by: NFabiano Rosas <farosas@linux.ibm.com>
      Reviewed-by: NGreg Kurz <groug@kaod.org>
      Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      05e6295d
    • G
      KVM: PPC: Don't return -ENOTSUPP to userspace in ioctls · 4e1b2ab7
      Greg Kurz 提交于
      ENOTSUPP is a linux only thingy, the value of which is unknown to
      userspace, not to be confused with ENOTSUP which linux maps to
      EOPNOTSUPP, as permitted by POSIX [1]:
      
      [EOPNOTSUPP]
      Operation not supported on socket. The type of socket (address family
      or protocol) does not support the requested operation. A conforming
      implementation may assign the same values for [EOPNOTSUPP] and [ENOTSUP].
      
      Return -EOPNOTSUPP instead of -ENOTSUPP for the following ioctls:
      - KVM_GET_FPU for Book3s and BookE
      - KVM_SET_FPU for Book3s and BookE
      - KVM_GET_DIRTY_LOG for BookE
      
      This doesn't affect QEMU which doesn't call the KVM_GET_FPU and
      KVM_SET_FPU ioctls on POWER anyway since they are not supported,
      and _buggily_ ignores anything but -EPERM for KVM_GET_DIRTY_LOG.
      
      [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.htmlSigned-off-by: NGreg Kurz <groug@kaod.org>
      Acked-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      4e1b2ab7
  6. 14 9月, 2020 1 次提交
  7. 13 9月, 2020 4 次提交
    • M
      KVM: emulator: more strict rsm checks. · 37f66bbe
      Maxim Levitsky 提交于
      Don't ignore return values in rsm_load_state_64/32 to avoid
      loading invalid state from SMM state area if it was tampered with
      by the guest.
      
      This is primarly intended to avoid letting guest set bits in EFER
      (like EFER.SVME when nesting is disabled) by manipulating SMM save area.
      Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20200827171145.374620-8-mlevitsk@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      37f66bbe
    • M
      KVM: nSVM: more strict SMM checks when returning to nested guest · 3ebb5d26
      Maxim Levitsky 提交于
      * check that guest is 64 bit guest, otherwise the SVM related fields
        in the smm state area are not defined
      
      * If the SMM area indicates that SMM interrupted a running guest,
        check that EFER.SVME which is also saved in this area is set, otherwise
        the guest might have tampered with SMM save area, and so indicate
        emulation failure which should triple fault the guest.
      
      * Check that that guest CPUID supports SVM (due to the same issue as above)
      Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20200827162720.278690-4-mlevitsk@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3ebb5d26
    • M
      SVM: nSVM: setup nested msr permission bitmap on nested state load · 772b81bb
      Maxim Levitsky 提交于
      This code was missing and was forcing the L2 run with L1's msr
      permission bitmap
      Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20200827162720.278690-3-mlevitsk@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      772b81bb
    • M
      SVM: nSVM: correctly restore GIF on vmexit from nesting after migration · 9883764a
      Maxim Levitsky 提交于
      Currently code in svm_set_nested_state copies the current vmcb control
      area to L1 control area (hsave->control), under assumption that
      it mostly reflects the defaults that kvm choose, and later qemu
      overrides  these defaults with L2 state using standard KVM interfaces,
      like KVM_SET_REGS.
      
      However nested GIF (which is AMD specific thing) is by default is true,
      and it is copied to hsave area as such.
      
      This alone is not a big deal since on VMexit, GIF is always set to false,
      regardless of what it was on VM entry.  However in nested_svm_vmexit we
      were first were setting GIF to false, but then we overwrite the control
      fields with value from the hsave area.  (including the nested GIF field
      itself if GIF virtualization is enabled).
      
      Now on normal vm entry this is not a problem, since GIF is usually false
      prior to normal vm entry, and this is the value that copied to hsave,
      and then restored, but this is not always the case when the nested state
      is loaded as explained above.
      
      To fix this issue, move svm_set_gif after we restore the L1 control
      state in nested_svm_vmexit, so that even with wrong GIF in the
      saved L1 control area, we still clear GIF as the spec says.
      Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20200827162720.278690-2-mlevitsk@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9883764a
  8. 12 9月, 2020 13 次提交
    • V
      x86/kvm: don't forget to ACK async PF IRQ · cc17b225
      Vitaly Kuznetsov 提交于
      Merge commit 26d05b36 ("Merge branch 'kvm-async-pf-int' into HEAD")
      tried to adapt the new interrupt based async PF mechanism to the newly
      introduced IDTENTRY magic but unfortunately it missed the fact that
      DEFINE_IDTENTRY_SYSVEC() doesn't call ack_APIC_irq() on its own and
      all DEFINE_IDTENTRY_SYSVEC() users have to call it manually.
      
      As the result all multi-CPU KVM guest hang on boot when
      KVM_FEATURE_ASYNC_PF_INT is present. The breakage went unnoticed because no
      KVM userspace (e.g. QEMU) currently set it (and thus async PF mechanism
      is currently disabled) but we're about to change that.
      
      Fixes: 26d05b36 ("Merge branch 'kvm-async-pf-int' into HEAD")
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200908135350.355053-3-vkuznets@redhat.com>
      Tested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      cc17b225
    • V
      x86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro · 244081f9
      Vitaly Kuznetsov 提交于
      DEFINE_IDTENTRY_SYSVEC() already contains irqentry_enter()/
      irqentry_exit().
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200908135350.355053-2-vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      244081f9
    • W
      KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit · 99b82a14
      Wanpeng Li 提交于
      According to SDM 27.2.4, Event delivery causes an APIC-access VM exit.
      Don't report internal error and freeze guest when event delivery causes
      an APIC-access exit, it is handleable and the event will be re-injected
      during the next vmentry.
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Message-Id: <1597827327-25055-2-git-send-email-wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      99b82a14
    • W
      KVM: SVM: avoid emulation with stale next_rip · e42c6828
      Wanpeng Li 提交于
      svm->next_rip is reset in svm_vcpu_run() only after calling
      svm_exit_handlers_fastpath(), which will cause SVM's
      skip_emulated_instruction() to write a stale RIP.
      
      We can move svm_exit_handlers_fastpath towards the end of
      svm_vcpu_run().  To align VMX with SVM, keep svm_complete_interrupts()
      close as well.
      Suggested-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Cc: Paul K. <kronenpj@kronenpj.dyndns.org>
      Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      [Also move vmcb_mark_all_clean before any possible write to the VMCB.
       - Paolo]
      e42c6828
    • V
      KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN · d831de17
      Vitaly Kuznetsov 提交于
      Even without in-kernel LAPIC we should allow writing '0' to
      MSR_KVM_ASYNC_PF_EN as we're not enabling the mechanism. In
      particular, QEMU with 'kernel-irqchip=off' fails to start
      a guest with
      
      qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
      
      Fixes: 9d3c447c ("KVM: X86: Fix async pf caused null-ptr-deref")
      Reported-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200911093147.484565-1-vkuznets@redhat.com>
      [Actually commit the version proposed by Sean Christopherson. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d831de17
    • D
      KVM: SVM: Periodically schedule when unregistering regions on destroy · 7be74942
      David Rientjes 提交于
      There may be many encrypted regions that need to be unregistered when a
      SEV VM is destroyed.  This can lead to soft lockups.  For example, on a
      host running 4.15:
      
      watchdog: BUG: soft lockup - CPU#206 stuck for 11s! [t_virtual_machi:194348]
      CPU: 206 PID: 194348 Comm: t_virtual_machi
      RIP: 0010:free_unref_page_list+0x105/0x170
      ...
      Call Trace:
       [<0>] release_pages+0x159/0x3d0
       [<0>] sev_unpin_memory+0x2c/0x50 [kvm_amd]
       [<0>] __unregister_enc_region_locked+0x2f/0x70 [kvm_amd]
       [<0>] svm_vm_destroy+0xa9/0x200 [kvm_amd]
       [<0>] kvm_arch_destroy_vm+0x47/0x200
       [<0>] kvm_put_kvm+0x1a8/0x2f0
       [<0>] kvm_vm_release+0x25/0x30
       [<0>] do_exit+0x335/0xc10
       [<0>] do_group_exit+0x3f/0xa0
       [<0>] get_signal+0x1bc/0x670
       [<0>] do_signal+0x31/0x130
      
      Although the CLFLUSH is no longer issued on every encrypted region to be
      unregistered, there are no other changes that can prevent soft lockups for
      very large SEV VMs in the latest kernel.
      
      Periodically schedule if necessary.  This still holds kvm->lock across the
      resched, but since this only happens when the VM is destroyed this is
      assumed to be acceptable.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Message-Id: <alpine.DEB.2.23.453.2008251255240.2987727@chino.kir.corp.google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7be74942
    • H
      KVM: MIPS: Change the definition of kvm type · 15e9e35c
      Huacai Chen 提交于
      MIPS defines two kvm types:
      
       #define KVM_VM_MIPS_TE          0
       #define KVM_VM_MIPS_VZ          1
      
      In Documentation/virt/kvm/api.rst it is said that "You probably want to
      use 0 as machine type", which implies that type 0 be the "automatic" or
      "default" type. And, in user-space libvirt use the null-machine (with
      type 0) to detect the kvm capability, which returns "KVM not supported"
      on a VZ platform.
      
      I try to fix it in QEMU but it is ugly:
      https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html
      
      And Thomas Huth suggests me to change the definition of kvm type:
      https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html
      
      So I define like this:
      
       #define KVM_VM_MIPS_AUTO        0
       #define KVM_VM_MIPS_VZ          1
       #define KVM_VM_MIPS_TE          2
      
      Since VZ and TE cannot co-exists, using type 0 on a TE platform will
      still return success (so old user-space tools have no problems on new
      kernels); the advantage is that using type 0 on a VZ platform will not
      return failure. So, the only problem is "new user-space tools use type
      2 on old kernels", but if we treat this as a kernel bug, we can backport
      this patch to old stable kernels.
      Signed-off-by: NHuacai Chen <chenhc@lemote.com>
      Message-Id: <1599734031-28746-1-git-send-email-chenhc@lemote.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      15e9e35c
    • L
      kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed · f6f6195b
      Lai Jiangshan 提交于
      When kvm_mmu_get_page() gets a page with unsynced children, the spt
      pagetable is unsynchronized with the guest pagetable. But the
      guest might not issue a "flush" operation on it when the pagetable
      entry is changed from zero or other cases. The hypervisor has the
      responsibility to synchronize the pagetables.
      
      KVM behaved as above for many years, But commit 8c8560b8
      ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes")
      inadvertently included a line of code to change it without giving any
      reason in the changelog. It is clear that the commit's intention was to
      change KVM_REQ_TLB_FLUSH -> KVM_REQ_TLB_FLUSH_CURRENT, so we don't
      needlessly flush other contexts; however, one of the hunks changed
      a nearby KVM_REQ_MMU_SYNC instead.  This patch changes it back.
      
      Link: https://lore.kernel.org/lkml/20200320212833.3507-26-sean.j.christopherson@intel.com/
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NLai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20200902135421.31158-1-jiangshanlai@gmail.com>
      fixes: 8c8560b8 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes")
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f6f6195b
    • C
      KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control · c6b177a3
      Chenyi Qiang 提交于
      A minor fix for the update of VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL field
      in exit_ctls_high.
      
      Fixes: 03a8871a ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
      VM-{Entry,Exit} control")
      Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com>
      Reviewed-by: NXiaoyao Li <xiaoyao.li@intel.com>
      Message-Id: <20200828085622.8365-5-chenyi.qiang@intel.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      c6b177a3
    • R
      KVM: fix memory leak in kvm_io_bus_unregister_dev() · f6588660
      Rustam Kovhaev 提交于
      when kmalloc() fails in kvm_io_bus_unregister_dev(), before removing
      the bus, we should iterate over all other devices linked to it and call
      kvm_iodevice_destructor() for them
      
      Fixes: 90db1043 ("KVM: kvm_io_bus_unregister_dev() should never fail")
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: syzbot+f196caa45793d6374707@syzkaller.appspotmail.com
      Link: https://syzkaller.appspot.com/bug?extid=f196caa45793d6374707Signed-off-by: NRustam Kovhaev <rkovhaev@gmail.com>
      Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200907185535.233114-1-rkovhaev@gmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f6588660
    • H
      KVM: Check the allocation of pv cpu mask · 0f990222
      Haiwei Li 提交于
      check the allocation of per-cpu __pv_cpu_mask. Initialize ops only when
      successful.
      Signed-off-by: NHaiwei Li <lihaiwei@tencent.com>
      Message-Id: <d59f05df-e6d3-3d31-a036-cc25a2b2f33f@gmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0f990222
    • P
      KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected · 43fea4e4
      Peter Shier 提交于
      When L2 uses PAE, L0 intercepts of L2 writes to CR0/CR3/CR4 call
      load_pdptrs to read the possibly updated PDPTEs from the guest
      physical address referenced by CR3.  It loads them into
      vcpu->arch.walk_mmu->pdptrs and sets VCPU_EXREG_PDPTR in
      vcpu->arch.regs_dirty.
      
      At the subsequent assumed reentry into L2, the mmu will call
      vmx_load_mmu_pgd which calls ept_load_pdptrs. ept_load_pdptrs sees
      VCPU_EXREG_PDPTR set in vcpu->arch.regs_dirty and loads
      VMCS02.GUEST_PDPTRn from vcpu->arch.walk_mmu->pdptrs[]. This all works
      if the L2 CRn write intercept always resumes L2.
      
      The resume path calls vmx_check_nested_events which checks for
      exceptions, MTF, and expired VMX preemption timers. If
      vmx_check_nested_events finds any of these conditions pending it will
      reflect the corresponding exit into L1. Live migration at this point
      would also cause a missed immediate reentry into L2.
      
      After L1 exits, vmx_vcpu_run calls vmx_register_cache_reset which
      clears VCPU_EXREG_PDPTR in vcpu->arch.regs_dirty.  When L2 next
      resumes, ept_load_pdptrs finds VCPU_EXREG_PDPTR clear in
      vcpu->arch.regs_dirty and does not load VMCS02.GUEST_PDPTRn from
      vcpu->arch.walk_mmu->pdptrs[]. prepare_vmcs02 will then load
      VMCS02.GUEST_PDPTRn from vmcs12->pdptr0/1/2/3 which contain the stale
      values stored at last L2 exit. A repro of this bug showed L2 entering
      triple fault immediately due to the bad VMCS02.GUEST_PDPTRn values.
      
      When L2 is in PAE paging mode add a call to ept_load_pdptrs before
      leaving L2. This will update VMCS02.GUEST_PDPTRn if they are dirty in
      vcpu->arch.walk_mmu->pdptrs[].
      
      Tested:
      kvm-unit-tests with new directed test: vmx_mtf_pdpte_test.
      Verified that test fails without the fix.
      
      Also ran Google internal VMM with an Ubuntu 16.04 4.4.0-83 guest running a
      custom hypervisor with a 32-bit Windows XP L2 guest using PAE. Prior to fix
      would repro readily. Ran 14 simultaneous L2s for 140 iterations with no
      failures.
      Signed-off-by: NPeter Shier <pshier@google.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Message-Id: <20200820230545.2411347-1-pshier@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      43fea4e4
    • P
      Merge tag 'kvmarm-fixes-5.9-1' of... · 1b67fd08
      Paolo Bonzini 提交于
      Merge tag 'kvmarm-fixes-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for Linux 5.9, take #1
      
      - Multiple stolen time fixes, with a new capability to match x86
      - Fix for hugetlbfs mappings when PUD and PMD are the same level
      - Fix for hugetlbfs mappings when PTE mappings are enforced
        (dirty logging, for example)
      - Fix tracing output of 64bit values
      1b67fd08
  9. 08 9月, 2020 6 次提交
反馈
建议
客服 返回
顶部