1. 18 June 2021 (4 commits)
  2. 11 June 2021 (1 commit)
    • KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU · 654430ef
      Sean Christopherson committed
      Calculate and check the full mmu_role when initializing the MMU context
      for the nested MMU, where "full" means the bits and pieces of the role
      that aren't handled by kvm_calc_mmu_role_common().  While the nested MMU
      isn't used for shadow paging, things like the number of levels in the
      guest's page tables are surprisingly important when walking the guest
      page tables.  Failure to reinitialize the nested MMU context if L2's
      paging mode changes can result in unexpected and/or missed page faults,
      and likely other explosions.
      
      E.g. if an L1 vCPU is running both a 32-bit PAE L2 and a 64-bit L2, the
      "common" role calculation will yield the same role for both L2s.  If the
      64-bit L2 is run after the 32-bit PAE L2, L0 will fail to reinitialize
      the nested MMU context, ultimately resulting in a bad walk of L2's page
      tables as the MMU will still have a guest root_level of PT32E_ROOT_LEVEL.
      
        WARNING: CPU: 4 PID: 167334 at arch/x86/kvm/vmx/vmx.c:3075 ept_save_pdptrs+0x15/0xe0 [kvm_intel]
        Modules linked in: kvm_intel
        CPU: 4 PID: 167334 Comm: CPU 3/KVM Not tainted 5.13.0-rc1-d849817d5673-reqs #185
        Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
        RIP: 0010:ept_save_pdptrs+0x15/0xe0 [kvm_intel]
        Code: <0f> 0b c3 f6 87 d8 02 00
        RSP: 0018:ffffbba702dbba00 EFLAGS: 00010202
        RAX: 0000000000000011 RBX: 0000000000000002 RCX: ffffffff810a2c08
        RDX: ffff91d7bc30acc0 RSI: 0000000000000011 RDI: ffff91d7bc30a600
        RBP: ffff91d7bc30a600 R08: 0000000000000010 R09: 0000000000000007
        R10: 0000000000000000 R11: 0000000000000000 R12: ffff91d7bc30a600
        R13: ffff91d7bc30acc0 R14: ffff91d67c123460 R15: 0000000115d7e005
        FS:  00007fe8e9ffb700(0000) GS:ffff91d90fb00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 000000029f15a001 CR4: 00000000001726e0
        Call Trace:
         kvm_pdptr_read+0x3a/0x40 [kvm]
         paging64_walk_addr_generic+0x327/0x6a0 [kvm]
         paging64_gva_to_gpa_nested+0x3f/0xb0 [kvm]
         kvm_fetch_guest_virt+0x4c/0xb0 [kvm]
         __do_insn_fetch_bytes+0x11a/0x1f0 [kvm]
         x86_decode_insn+0x787/0x1490 [kvm]
         x86_decode_emulated_instruction+0x58/0x1e0 [kvm]
         x86_emulate_instruction+0x122/0x4f0 [kvm]
         vmx_handle_exit+0x120/0x660 [kvm_intel]
         kvm_arch_vcpu_ioctl_run+0xe25/0x1cb0 [kvm]
         kvm_vcpu_ioctl+0x211/0x5a0 [kvm]
         __x64_sys_ioctl+0x83/0xb0
         do_syscall_64+0x40/0xb0
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: bf627a92 ("x86/kvm/mmu: check if MMU reconfiguration is needed in init_kvm_nested_mmu()")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210610220026.1364486-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
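      The pattern behind the fix generalizes: a cached role only works as a
      "no reinit needed" signal if it folds in every input that shapes the
      resulting context. A minimal, self-contained C model of that idea
      (all names here are hypothetical; the real union kvm_mmu_role in
      arch/x86/kvm is far richer):

        #include <stdint.h>
        #include <stdio.h>

        /* Hypothetical, heavily simplified stand-in for KVM's mmu_role. */
        union mmu_role {
                uint64_t word;
                struct {
                        uint64_t cr0_pg   : 1;
                        uint64_t cr4_pae  : 1;
                        uint64_t efer_lma : 1;
                        uint64_t level    : 4; /* guest root level: the input the bug dropped */
                } bits;
        };

        struct nested_mmu {
                union mmu_role role;
                int root_level;
        };

        /* Reinitialize only when the *full* role changes.  If 'level' were
         * left out of the comparison, a 32-bit PAE L2 (level 3) and a
         * 64-bit L2 (level 4) could yield the same role word and the stale
         * context would be kept, exactly the failure described above. */
        static void init_nested_mmu(struct nested_mmu *mmu, union mmu_role new_role)
        {
                if (new_role.word == mmu->role.word)
                        return;         /* cached context is still valid */

                mmu->role = new_role;
                mmu->root_level = (int)new_role.bits.level;
        }

        int main(void)
        {
                struct nested_mmu mmu = { .role.word = 0 };
                union mmu_role pae = { .bits = { .cr0_pg = 1, .cr4_pae = 1, .level = 3 } };
                union mmu_role lm  = { .bits = { .cr0_pg = 1, .cr4_pae = 1, .efer_lma = 1, .level = 4 } };

                init_nested_mmu(&mmu, pae);
                init_nested_mmu(&mmu, lm);      /* different word forces reinit */
                printf("root_level=%d\n", mmu.root_level);
                return 0;
        }

      With the level folded into the word, the PAE-then-64-bit sequence from
      the example above produces two distinct roles and forces the reinit.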
  3. 07 May 2021 (1 commit)
    • KVM: x86: Prevent KVM SVM from loading on kernels with 5-level paging · 03ca4589
      Sean Christopherson committed
      Disallow loading KVM SVM if 5-level paging is supported.  In theory, NPT
      for L1 should simply work, but there are unknowns with respect to how the
      guest's MAXPHYADDR will be handled by hardware.
      
      Nested NPT is more problematic, as running an L1 VMM that is using
      2-level page tables requires stacking single-entry PDP and PML4 tables in
      KVM's NPT for L2, as there are no equivalent entries in L1's NPT to
      shadow.  Barring hardware magic, for 5-level paging, KVM would need to
      stack another layer to handle PML5.
      
      Opportunistically rename the lm_root pointer, which is used for the
      aforementioned stacking when shadowing 2-level L1 NPT, to pml4_root to
      call out that it's specifically for PML4.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210505204221.1934471-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
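      The guard itself is small; a sketch of its shape at module load,
      assuming kernel context and the real x86 helper pgtable_l5_enabled()
      (the exact placement and error message in the actual patch may differ):

        /* Sketch; assumes kernel headers for pr_err(), -EINVAL and
         * pgtable_l5_enabled(). */
        static int __init svm_init(void)
        {
                /*
                 * NPT for L1 is untested with a 5-level host, and nested
                 * NPT would need a stacked PML5 with no L1 entry to
                 * shadow, so refuse to load rather than risk breakage.
                 */
                if (pgtable_l5_enabled()) {
                        pr_err("KVM doesn't yet support 5-level paging on AMD SVM\n");
                        return -EINVAL;
                }

                /* normal module initialization continues here */
                return 0;
        }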
  4. 20 April 2021 (2 commits)
    • KVM: x86/mmu: Tear down roots before kvm_mmu_zap_all_fast returns · 4c6654bd
      Ben Gardon committed
      To avoid saddling a vCPU thread with the work of tearing down an entire
      paging structure, take a reference on each root before it becomes
      obsolete, so that the thread initiating the fast invalidation can tear
      down the paging structure and (most likely) release the last reference.
      As a bonus, this teardown can happen under the MMU lock in read mode so
      as not to block the progress of vCPU threads.
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Message-Id: <20210401233736.638171-14-bgardon@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
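      The ordering is the whole trick: pin every root before anything is
      marked obsolete, so the invalidating thread, not a vCPU, most likely
      ends up dropping the last reference. A sketch with hypothetical helper
      and iterator names standing in for KVM's real refcounting primitives,
      whose signatures differ:

        static void fast_zap_all(struct kvm *kvm)
        {
                struct kvm_mmu_page *root;

                write_lock(&kvm->mmu_lock);
                for_each_tdp_mmu_root(kvm, root) {
                        tdp_mmu_get_root(root);    /* pin before obsoleting */
                        root->role.invalid = true; /* no new references now */
                }
                write_unlock(&kvm->mmu_lock);

                /*
                 * Drop the pins under mmu_lock held for read: this thread
                 * most likely holds the last reference and tears the paging
                 * structures down without blocking vCPU progress.
                 */
                read_lock(&kvm->mmu_lock);
                for_each_tdp_mmu_root(kvm, root)
                        kvm_tdp_mmu_put_root(kvm, root);
                read_unlock(&kvm->mmu_lock);
        }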
    • KVM: x86/mmu: Fast invalidation for TDP MMU · b7cccd39
      Ben Gardon committed
      Provide a real mechanism for fast invalidation by marking roots as
      invalid so that their reference count will quickly fall to zero
      and they will be torn down.
      
      One negative side effect of this approach is that a vCPU thread will
      likely drop the last reference to a root and be saddled with the work of
      tearing down an entire paging structure. This issue will be resolved in
      a later commit.
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Message-Id: <20210401233736.638171-13-bgardon@google.com>
      [Move the loop to tdp_mmu.c, otherwise compilation fails on 32-bit. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
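      Stripped of KVM specifics, the mechanism is an "invalid" flag that
      gates new references, so the count can only drain. A self-contained C
      model (deliberately simplified; the real code closes the
      check-versus-increment race with an inc-unless-zero primitive):

        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdlib.h>

        struct tdp_root {
                atomic_int refcount;    /* one reference per active user */
                atomic_bool invalid;    /* set by fast invalidation */
        };

        /* Users must re-check validity when picking up a root: an invalid
         * root takes no new references, so its count only falls. */
        static bool root_tryget(struct tdp_root *r)
        {
                if (atomic_load(&r->invalid))
                        return false;
                atomic_fetch_add(&r->refcount, 1);
                return true;
        }

        static void root_put(struct tdp_root *r)
        {
                if (atomic_fetch_sub(&r->refcount, 1) == 1) {
                        /* Last reference: this caller inherits the full
                         * teardown, the vCPU burden that the follow-up
                         * commit (4c6654bd) shifts to the invalidating
                         * thread. */
                        free(r);
                }
        }

        /* Fast invalidation itself is just the flag flip. */
        static void root_invalidate(struct tdp_root *r)
        {
                atomic_store(&r->invalid, true);
        }

        int main(void)
        {
                struct tdp_root *r = calloc(1, sizeof(*r));

                root_tryget(r);         /* a vCPU starts using the root */
                root_invalidate(r);     /* fast invalidation: no new users */
                root_put(r);            /* last ref drops; teardown here */
                return 0;
        }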
  5. 19 April 2021 (6 commits)
  6. 17 April 2021 (9 commits)
  7. 08 April 2021 (1 commit)
    • KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_sp · 315f02c6
      Paolo Bonzini committed
      Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller
      will skip the TLB flush, which is wrong.  There are two ways to fix
      it:
      
      - since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush
        the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to
        use "flush |= ..."
      
      - or we can chain the flush argument through kvm_tdp_mmu_zap_sp down
        to __kvm_tdp_mmu_zap_gfn_range.  Note that kvm_tdp_mmu_zap_sp will
        neither yield nor flush, so flush would never go from true to
        false.
      
      This patch does the former to simplify application to stable kernels,
      and to make it even clearer that kvm_tdp_mmu_zap_sp will not flush.
      
      Cc: seanjc@google.com
      Fixes: 048f4980 ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping")
      Cc: <stable@vger.kernel.org> # 5.10.x: 048f4980: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
      Cc: <stable@vger.kernel.org> # 5.10.x: 33a31641: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
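      The former fix is effectively one operator at the call site in the NX
      recovery loop; a sketch of the before/after (context abbreviated, not
      the literal diff):

        /* Before: a 'false' return overwrote a flush already owed, e.g.
         * one accumulated by the legacy MMU earlier in the loop. */
        if (is_tdp_mmu_page(sp))
                flush = kvm_tdp_mmu_zap_sp(kvm, sp);

        /* After: OR the result in.  kvm_tdp_mmu_zap_sp() never yields and
         * never flushes itself, so 'flush' can never legitimately go from
         * true back to false. */
        if (is_tdp_mmu_page(sp))
                flush |= kvm_tdp_mmu_zap_sp(kvm, sp);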
  8. 31 March 2021 (2 commits)
    • KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages · 33a31641
      Sean Christopherson committed
      Prevent the TDP MMU from yielding when zapping a gfn range during NX
      page recovery.  If a flush is pending from a previous invocation of the
      zapping helper, either in the TDP MMU or the legacy MMU, but the TDP MMU
      has not accumulated a flush for the current invocation, then yielding
      will release mmu_lock with stale TLB entries.
      
      That being said, this isn't technically a bug fix in the current code, as
      the TDP MMU will never yield in this case.  tdp_mmu_iter_cond_resched()
      will yield if and only if it has made forward progress, as defined by the
      current gfn vs. the last yielded (or starting) gfn.  Because zapping a
      single shadow page is guaranteed to (a) find that page and (b) step
      sideways at the level of the shadow page, the TDP iter will break its loop
      before getting a chance to yield.
      
      But that is all very, very subtle, and will break at the slightest sneeze,
      e.g. zapping while holding mmu_lock for read would break as the TDP MMU
      wouldn't be guaranteed to see the present shadow page, and thus could step
      sideways at a lower level.
      
      Cc: Ben Gardon <bgardon@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210325200119.1359384-4-seanjc@google.com>
      [Add lockdep assertion. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
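      A sketch of the resulting helper; signatures are approximate, and the
      lockdep assertion is the piece noted above as added on application:

        /* Sketch: zap a single shadow page without ever yielding.
         * Yielding would drop mmu_lock having flushed only what *this*
         * invocation accumulated, leaving a caller's pending flush
         * unserved and stale TLB entries live. */
        bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
        {
                gfn_t end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level);

                lockdep_assert_held_write(&kvm->mmu_lock);

                return __kvm_tdp_mmu_zap_gfn_range(kvm, kvm_mmu_page_as_id(sp),
                                                   sp->gfn, end,
                                                   false, /* can_yield */
                                                   false  /* flush */);
        }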
    • KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping · 048f4980
      Sean Christopherson committed
      Honor the "flush needed" return from kvm_tdp_mmu_zap_gfn_range(), which
      does the flush itself if and only if it yields (which it will never do in
      this particular scenario), and otherwise expects the caller to do the
      flush.  If pages are zapped from the TDP MMU but not the legacy MMU, then
      no flush will occur.
      
      Fixes: 29cf0f50 ("kvm: x86/mmu: NX largepage recovery for TDP MMU")
      Cc: stable@vger.kernel.org
      Cc: Ben Gardon <bgardon@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210325200119.1359384-3-seanjc@google.com>
      Reviewed-by: Ben Gardon <bgardon@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
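      On the caller side the pattern is simply to honor the returned
      boolean; a sketch of the tail of such a path (the function names are
      the real KVM ones of that era, the surrounding context is abbreviated):

        /* The zap helper reports whether a TLB flush is still owed; since
         * kvm_tdp_mmu_zap_gfn_range() only flushes when it yields, and it
         * cannot yield in this scenario, the caller must flush itself. */
        flush = kvm_tdp_mmu_zap_gfn_range(kvm, gfn_start, gfn_end);
        if (flush)
                kvm_flush_remote_tlbs(kvm);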
  9. 18 March 2021 (1 commit)
    • x86: Fix various typos in comments · d9f6e12f
      Ingo Molnar committed
      Fix ~144 single-word typos in arch/x86/ code comments.
      
      Doing this in a single commit should reduce the churn.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: linux-kernel@vger.kernel.org
  10. 15 March 2021 (13 commits)