1. 04 2月, 2021 7 次提交
  2. 08 1月, 2021 4 次提交
  3. 28 11月, 2020 1 次提交
    • V
      kvm: x86/mmu: Fix get_mmio_spte() on CPUs supporting 5-level PT · 9a2a0d3c
      Vitaly Kuznetsov 提交于
      Commit 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU") caused
      the following WARNING on an Intel Ice Lake CPU:
      
       get_mmio_spte: detect reserved bits on spte, addr 0xb80a0, dump hierarchy:
       ------ spte 0xb80a0 level 5.
       ------ spte 0xfcd210107 level 4.
       ------ spte 0x1004c40107 level 3.
       ------ spte 0x1004c41107 level 2.
       ------ spte 0x1db00000000b83b6 level 1.
       WARNING: CPU: 109 PID: 10254 at arch/x86/kvm/mmu/mmu.c:3569 kvm_mmu_page_fault.cold.150+0x54/0x22f [kvm]
      ...
       Call Trace:
        ? kvm_io_bus_get_first_dev+0x55/0x110 [kvm]
        vcpu_enter_guest+0xaa1/0x16a0 [kvm]
        ? vmx_get_cs_db_l_bits+0x17/0x30 [kvm_intel]
        ? skip_emulated_instruction+0xaa/0x150 [kvm_intel]
        kvm_arch_vcpu_ioctl_run+0xca/0x520 [kvm]
      
      The guest triggering this crashes. Note, this happens with the traditional
      MMU and EPT enabled, not with the newly introduced TDP MMU. Turns out,
      there was a subtle change in the above mentioned commit. Previously,
      walk_shadow_page_get_mmio_spte() was setting 'root' to 'iterator.level'
      which is returned by shadow_walk_init() and this equals to
      'vcpu->arch.mmu->shadow_root_level'. Now, get_mmio_spte() sets it to
      'int root = vcpu->arch.mmu->root_level'.
      
      The difference between 'root_level' and 'shadow_root_level' on CPUs
      supporting 5-level page tables is that in some case we don't want to
      use 5-level, in particular when 'cpuid_maxphyaddr(vcpu) <= 48'
      kvm_mmu_get_tdp_level() returns '4'. In case upper layer is not used,
      the corresponding SPTE will fail '__is_rsvd_bits_set()' check.
      
      Revert to using 'shadow_root_level'.
      
      Fixes: 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20201126110206.2118959-1-vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9a2a0d3c
  4. 15 11月, 2020 2 次提交
    • P
      KVM: Don't allocate dirty bitmap if dirty ring is enabled · 044c59c4
      Peter Xu 提交于
      Because kvm dirty rings and kvm dirty log is used in an exclusive way,
      Let's avoid creating the dirty_bitmap when kvm dirty ring is enabled.
      At the meantime, since the dirty_bitmap will be conditionally created
      now, we can't use it as a sign of "whether this memory slot enabled
      dirty tracking".  Change users like that to check against the kvm
      memory slot flags.
      
      Note that there still can be chances where the kvm memory slot got its
      dirty_bitmap allocated, _if_ the memory slots are created before
      enabling of the dirty rings and at the same time with the dirty
      tracking capability enabled, they'll still with the dirty_bitmap.
      However it should not hurt much (e.g., the bitmaps will always be
      freed if they are there), and the real users normally won't trigger
      this because dirty bit tracking flag should in most cases only be
      applied to kvm slots only before migration starts, that should be far
      latter than kvm initializes (VM starts).
      Signed-off-by: NPeter Xu <peterx@redhat.com>
      Message-Id: <20201001012226.5868-1-peterx@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      044c59c4
    • P
      KVM: X86: Implement ring-based dirty memory tracking · fb04a1ed
      Peter Xu 提交于
      This patch is heavily based on previous work from Lei Cao
      <lei.cao@stratus.com> and Paolo Bonzini <pbonzini@redhat.com>. [1]
      
      KVM currently uses large bitmaps to track dirty memory.  These bitmaps
      are copied to userspace when userspace queries KVM for its dirty page
      information.  The use of bitmaps is mostly sufficient for live
      migration, as large parts of memory are be dirtied from one log-dirty
      pass to another.  However, in a checkpointing system, the number of
      dirty pages is small and in fact it is often bounded---the VM is
      paused when it has dirtied a pre-defined number of pages. Traversing a
      large, sparsely populated bitmap to find set bits is time-consuming,
      as is copying the bitmap to user-space.
      
      A similar issue will be there for live migration when the guest memory
      is huge while the page dirty procedure is trivial.  In that case for
      each dirty sync we need to pull the whole dirty bitmap to userspace
      and analyse every bit even if it's mostly zeros.
      
      The preferred data structure for above scenarios is a dense list of
      guest frame numbers (GFN).  This patch series stores the dirty list in
      kernel memory that can be memory mapped into userspace to allow speedy
      harvesting.
      
      This patch enables dirty ring for X86 only.  However it should be
      easily extended to other archs as well.
      
      [1] https://patchwork.kernel.org/patch/10471409/Signed-off-by: NLei Cao <lei.cao@stratus.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NPeter Xu <peterx@redhat.com>
      Message-Id: <20201001012222.5767-1-peterx@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fb04a1ed
  5. 08 11月, 2020 1 次提交
  6. 31 10月, 2020 1 次提交
  7. 23 10月, 2020 10 次提交
  8. 22 10月, 2020 10 次提交
  9. 28 9月, 2020 4 次提交