1. 14 7月, 2017 2 次提交
  2. 13 7月, 2017 13 次提交
  3. 10 7月, 2017 1 次提交
  4. 04 7月, 2017 1 次提交
  5. 03 7月, 2017 4 次提交
    • P
      x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12 · 995f00a6
      Peter Feiner 提交于
      EPT A/D was enabled in the vmcs02 EPTP regardless of the vmcs12's EPTP
      value. The problem is that enabling A/D changes the behavior of L2's
      x86 page table walks as seen by L1. With A/D enabled, x86 page table
      walks are always treated as EPT writes.
      
      Commit ae1e2d10 ("kvm: nVMX: support EPT accessed/dirty bits",
      2017-03-30) tried to work around this problem by clearing the write
      bit in the exit qualification for EPT violations triggered by page
      walks.  However, that fixup introduced the opposite bug: page-table walks
      that actually set x86 A/D bits were *missing* the write bit in the exit
      qualification.
      
      This patch fixes the problem by disabling EPT A/D in the shadow MMU
      when EPT A/D is disabled in vmcs12's EPTP.
      Signed-off-by: NPeter Feiner <pfeiner@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      995f00a6
    • P
      kvm: x86: mmu: allow A/D bits to be disabled in an mmu · ac8d57e5
      Peter Feiner 提交于
      Adds the plumbing to disable A/D bits in the MMU based on a new role
      bit, ad_disabled. When A/D is disabled, the MMU operates as though A/D
      aren't available (i.e., using access tracking faults instead).
      
      To avoid SP -> kvm_mmu_page.role.ad_disabled lookups all over the
      place, A/D disablement is now stored in the SPTE. This state is stored
      in the SPTE by tweaking the use of SPTE_SPECIAL_MASK for access
      tracking. Rather than just setting SPTE_SPECIAL_MASK when an
      access-tracking SPTE is non-present, we now always set
      SPTE_SPECIAL_MASK for access-tracking SPTEs.
      Signed-off-by: NPeter Feiner <pfeiner@google.com>
      [Use role.ad_disabled even for direct (non-shadow) EPT page tables.  Add
       documentation and a few MMU_WARN_ONs. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ac8d57e5
    • P
      x86: kvm: mmu: make spte mmio mask more explicit · dcdca5fe
      Peter Feiner 提交于
      Specify both a mask (i.e., bits to consider) and a value (i.e.,
      pattern of bits that indicates a special PTE) for mmio SPTEs. On
      Intel, this lets us pack even more information into the
      (SPTE_SPECIAL_MASK | EPT_VMX_RWX_MASK) mask we use for access
      tracking liberating all (SPTE_SPECIAL_MASK | (non-misconfigured-RWX))
      values.
      Signed-off-by: NPeter Feiner <pfeiner@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      dcdca5fe
    • P
      x86: kvm: mmu: dead code thanks to access tracking · ce00053b
      Peter Feiner 提交于
      The MMU always has hardware A bits or access tracking support, thus
      it's unnecessary to handle the scenario where we have neither.
      Signed-off-by: NPeter Feiner <pfeiner@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ce00053b
  6. 30 6月, 2017 5 次提交
  7. 29 6月, 2017 1 次提交
  8. 27 6月, 2017 5 次提交
  9. 22 6月, 2017 1 次提交
    • P
      KVM: x86: fix singlestepping over syscall · c8401dda
      Paolo Bonzini 提交于
      TF is handled a bit differently for syscall and sysret, compared
      to the other instructions: TF is checked after the instruction completes,
      so that the OS can disable #DB at a syscall by adding TF to FMASK.
      When the sysret is executed the #DB is taken "as if" the syscall insn
      just completed.
      
      KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
      Fix the behavior, otherwise you could get #DB on a user stack which is not
      nice.  This does not affect Linux guests, as they use an IST or task gate
      for #DB.
      
      This fixes CVE-2017-7518.
      
      Cc: stable@vger.kernel.org
      Reported-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      c8401dda
  10. 13 6月, 2017 1 次提交
  11. 11 6月, 2017 1 次提交
    • W
      KVM: async_pf: avoid async pf injection when in guest mode · 9bc1f09f
      Wanpeng Li 提交于
       INFO: task gnome-terminal-:1734 blocked for more than 120 seconds.
             Not tainted 4.12.0-rc4+ #8
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
       gnome-terminal- D    0  1734   1015 0x00000000
       Call Trace:
        __schedule+0x3cd/0xb30
        schedule+0x40/0x90
        kvm_async_pf_task_wait+0x1cc/0x270
        ? __vfs_read+0x37/0x150
        ? prepare_to_swait+0x22/0x70
        do_async_page_fault+0x77/0xb0
        ? do_async_page_fault+0x77/0xb0
        async_page_fault+0x28/0x30
      
      This is triggered by running both win7 and win2016 on L1 KVM simultaneously,
      and then gives stress to memory on L1, I can observed this hang on L1 when
      at least ~70% swap area is occupied on L0.
      
      This is due to async pf was injected to L2 which should be injected to L1,
      L2 guest starts receiving pagefault w/ bogus %cr2(apf token from the host
      actually), and L1 guest starts accumulating tasks stuck in D state in
      kvm_async_pf_task_wait() since missing PAGE_READY async_pfs.
      
      This patch fixes the hang by doing async pf when executing L1 guest.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9bc1f09f
  12. 08 6月, 2017 1 次提交
    • W
      KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation · a3641631
      Wanpeng Li 提交于
      If "i" is the last element in the vcpu->arch.cpuid_entries[] array, it
      potentially can be exploited the vulnerability. this will out-of-bounds
      read and write.  Luckily, the effect is small:
      
      	/* when no next entry is found, the current entry[i] is reselected */
      	for (j = i + 1; ; j = (j + 1) % nent) {
      		struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j];
      		if (ej->function == e->function) {
      
      It reads ej->maxphyaddr, which is user controlled.  However...
      
      			ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
      
      After cpuid_entries there is
      
      	int maxphyaddr;
      	struct x86_emulate_ctxt emulate_ctxt;  /* 16-byte aligned */
      
      So we have:
      
      - cpuid_entries at offset 1B50 (6992)
      - maxphyaddr at offset 27D0 (6992 + 3200 = 10192)
      - padding at 27D4...27DF
      - emulate_ctxt at 27E0
      
      And it writes in the padding.  Pfew, writing the ops field of emulate_ctxt
      would have been much worse.
      
      This patch fixes it by modding the index to avoid the out-of-bounds
      access. Worst case, i == j and ej->function == e->function,
      the loop can bail out.
      Reported-by: NMoguofang <moguofang@huawei.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Guofang Mo <moguofang@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a3641631
  13. 07 6月, 2017 4 次提交