1. 05 12月, 2017 1 次提交
    • B
      KVM: Introduce KVM_MEMORY_ENCRYPT_OP ioctl · 5acc5c06
      Brijesh Singh 提交于
      If the hardware supports memory encryption then the
      KVM_MEMORY_ENCRYPT_OP ioctl can be used by qemu to issue a platform
      specific memory encryption commands.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: x86@kernel.org
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      5acc5c06
  2. 19 10月, 2017 1 次提交
  3. 12 10月, 2017 3 次提交
    • L
      KVM: nSVM: fix SMI injection in guest mode · 05cade71
      Ladi Prosek 提交于
      Entering SMM while running in guest mode wasn't working very well because several
      pieces of the vcpu state were left set up for nested operation.
      
      Some of the issues observed:
      
      * L1 was getting unexpected VM exits (using L1 interception controls but running
        in SMM execution environment)
      * MMU was confused (walk_mmu was still set to nested_mmu)
      * INTERCEPT_SMI was not emulated for L1 (KVM never injected SVM_EXIT_SMI)
      
      Intel SDM actually prescribes the logical processor to "leave VMX operation" upon
      entering SMM in 34.14.1 Default Treatment of SMI Delivery. AMD doesn't seem to
      document this but they provide fields in the SMM state-save area to stash the
      current state of SVM. What we need to do is basically get out of guest mode for
      the duration of SMM. All this completely transparent to L1, i.e. L1 is not given
      control and no L1 observable state changes.
      
      To avoid code duplication this commit takes advantage of the existing nested
      vmexit and run functionality, perhaps at the cost of efficiency. To get out of
      guest mode, nested_svm_vmexit is called, unchanged. Re-entering is performed using
      enter_svm_guest_mode.
      
      This commit fixes running Windows Server 2016 with Hyper-V enabled in a VM with
      OVMF firmware (OVMF_CODE-need-smm.fd).
      Signed-off-by: NLadi Prosek <lprosek@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      05cade71
    • L
      KVM: x86: introduce ISA specific smi_allowed callback · 72d7b374
      Ladi Prosek 提交于
      Similar to NMI, there may be ISA specific reasons why an SMI cannot be
      injected into the guest. This commit adds a new smi_allowed callback to
      be implemented in following commits.
      Signed-off-by: NLadi Prosek <lprosek@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      72d7b374
    • L
      KVM: x86: introduce ISA specific SMM entry/exit callbacks · 0234bf88
      Ladi Prosek 提交于
      Entering and exiting SMM may require ISA specific handling under certain
      circumstances. This commit adds two new callbacks with empty implementations.
      Actual functionality will be added in following commits.
      
      * pre_enter_smm() is to be called when injecting an SMM, before any
        SMM related vcpu state has been changed
      * pre_leave_smm() is to be called when emulating the RSM instruction,
        when the vcpu is in real mode and before any SMM related vcpu state
        has been restored
      Signed-off-by: NLadi Prosek <lprosek@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0234bf88
  4. 26 9月, 2017 1 次提交
    • T
      x86/apic: Sanitize 32/64bit APIC callbacks · 64063505
      Thomas Gleixner 提交于
      The 32bit and the 64bit implementation of default_cpu_present_to_apicid()
      and default_check_phys_apicid_present() are exactly the same, but
      implemented and located differently.
      
      Move them to common apic code and get rid of the pointless difference.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJuergen Gross <jgross@suse.com>
      Tested-by: NYu Chen <yu.c.chen@intel.com>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213153.757329991@linutronix.de
      64063505
  5. 14 9月, 2017 1 次提交
  6. 13 9月, 2017 1 次提交
  7. 01 9月, 2017 1 次提交
  8. 25 8月, 2017 6 次提交
  9. 18 8月, 2017 1 次提交
  10. 10 8月, 2017 1 次提交
  11. 08 8月, 2017 1 次提交
  12. 18 7月, 2017 1 次提交
    • T
      kvm/x86/svm: Support Secure Memory Encryption within KVM · d0ec49d4
      Tom Lendacky 提交于
      Update the KVM support to work with SME. The VMCB has a number of fields
      where physical addresses are used and these addresses must contain the
      memory encryption mask in order to properly access the encrypted memory.
      Also, use the memory encryption mask when creating and using the nested
      page tables.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Toshimitsu Kani <toshi.kani@hpe.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kvm@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/89146eccfa50334409801ff20acd52a90fb5efcf.1500319216.git.thomas.lendacky@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d0ec49d4
  13. 14 7月, 2017 5 次提交
  14. 13 7月, 2017 2 次提交
    • R
      kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2 · efc479e6
      Roman Kagan 提交于
      There is a flaw in the Hyper-V SynIC implementation in KVM: when message
      page or event flags page is enabled by setting the corresponding msr,
      KVM zeroes it out.  This is problematic because on migration the
      corresponding MSRs are loaded on the destination, so the content of
      those pages is lost.
      
      This went unnoticed so far because the only user of those pages was
      in-KVM hyperv synic timers, which could continue working despite that
      zeroing.
      
      Newer QEMU uses those pages for Hyper-V VMBus implementation, and
      zeroing them breaks the migration.
      
      Besides, in newer QEMU the content of those pages is fully managed by
      QEMU, so zeroing them is undesirable even when writing the MSRs from the
      guest side.
      
      To support this new scheme, introduce a new capability,
      KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
      pages aren't zeroed out in KVM.
      Signed-off-by: NRoman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      efc479e6
    • L
      KVM: x86: make backwards_tsc_observed a per-VM variable · a826faf1
      Ladi Prosek 提交于
      The backwards_tsc_observed global introduced in commit 16a96021 is never
      reset to false. If a VM happens to be running while the host is suspended
      (a common source of the TSC jumping backwards), master clock will never
      be enabled again for any VM. In contrast, if no VM is running while the
      host is suspended, master clock is unaffected. This is inconsistent and
      unnecessarily strict. Let's track the backwards_tsc_observed variable
      separately and let each VM start with a clean slate.
      
      Real world impact: My Windows VMs get slower after my laptop undergoes a
      suspend/resume cycle. The only way to get the perf back is unloading and
      reloading the kvm module.
      Signed-off-by: NLadi Prosek <lprosek@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      a826faf1
  15. 03 7月, 2017 1 次提交
    • P
      kvm: x86: mmu: allow A/D bits to be disabled in an mmu · ac8d57e5
      Peter Feiner 提交于
      Adds the plumbing to disable A/D bits in the MMU based on a new role
      bit, ad_disabled. When A/D is disabled, the MMU operates as though A/D
      aren't available (i.e., using access tracking faults instead).
      
      To avoid SP -> kvm_mmu_page.role.ad_disabled lookups all over the
      place, A/D disablement is now stored in the SPTE. This state is stored
      in the SPTE by tweaking the use of SPTE_SPECIAL_MASK for access
      tracking. Rather than just setting SPTE_SPECIAL_MASK when an
      access-tracking SPTE is non-present, we now always set
      SPTE_SPECIAL_MASK for access-tracking SPTEs.
      Signed-off-by: NPeter Feiner <pfeiner@google.com>
      [Use role.ad_disabled even for direct (non-shadow) EPT page tables.  Add
       documentation and a few MMU_WARN_ONs. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ac8d57e5
  16. 04 6月, 2017 1 次提交
  17. 17 5月, 2017 1 次提交
    • P
      KVM: x86: lower default for halt_poll_ns · b401ee0b
      Paolo Bonzini 提交于
      In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
      increase heavily even in cases where the performance improvement was
      small.  In particular, bandwidth divided by CPU usage was as much as
      60% lower.
      
      To some extent this is the expected effect of the patch, and the
      additional CPU utilization is only visible when running the
      benchmarks.  However, halving the threshold also halves the extra
      CPU utilization (from +30-130% to +20-70%) and has no negative
      effect on performance.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      b401ee0b
  18. 09 5月, 2017 1 次提交
  19. 02 5月, 2017 1 次提交
  20. 27 4月, 2017 2 次提交
    • P
      KVM: mark requests that need synchronization · 7a97cec2
      Paolo Bonzini 提交于
      kvm_make_all_requests() provides a synchronization that waits until all
      kicked VCPUs have acknowledged the kick.  This is important for
      KVM_REQ_MMU_RELOAD as it prevents freeing while lockless paging is
      underway.
      
      This patch adds the synchronization property into all requests that are
      currently being used with kvm_make_all_requests() in order to preserve
      the current behavior and only introduce a new framework.  Removing it
      from requests where it is not necessary is left for future patches.
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7a97cec2
    • R
      KVM: mark requests that do not need a wakeup · 930f7fd6
      Radim Krčmář 提交于
      Some operations must ensure that the guest is not running with stale
      data, but if the guest is halted, then the update can wait until another
      event happens.  kvm_make_all_requests() currently doesn't wake up, so we
      can mark all requests used with it.
      
      First 8 bits were arbitrarily reserved for request numbers.
      
      Most uses of requests have the request type as a constant, so a compiler
      will optimize the '&'.
      
      An alternative would be to have an inline function that would return
      whether the request needs a wake-up or not, but I like this one better
      even though it might produce worse assembly.
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Reviewed-by: NAndrew Jones <drjones@redhat.com>
      Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      930f7fd6
  21. 21 4月, 2017 1 次提交
  22. 13 4月, 2017 1 次提交
  23. 07 4月, 2017 2 次提交
    • P
      kvm: make KVM_COALESCED_MMIO_PAGE_OFFSET public · 4b4357e0
      Paolo Bonzini 提交于
      Its value has never changed; we might as well make it part of the ABI instead
      of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO).
      
      Because PPC does not always make MMIO available, the code has to be made
      dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      4b4357e0
    • P
      kvm: nVMX: support EPT accessed/dirty bits · ae1e2d10
      Paolo Bonzini 提交于
      Now use bit 6 of EPTP to optionally enable A/D bits for EPTP.  Another
      thing to change is that, when EPT accessed and dirty bits are not in use,
      VMX treats accesses to guest paging structures as data reads.  When they
      are in use (bit 6 of EPTP is set), they are treated as writes and the
      corresponding EPT dirty bit is set.  The MMU didn't know this detail,
      so this patch adds it.
      
      We also have to fix up the exit qualification.  It may be wrong because
      KVM sets bit 6 but the guest might not.
      
      L1 emulates EPT A/D bits using write permissions, so in principle it may
      be possible for EPT A/D bits to be used by L1 even though not available
      in hardware.  The problem is that guest page-table walks will be treated
      as reads rather than writes, so they would not cause an EPT violation.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      [Fixed typo in walk_addr_generic() comment and changed bit clear +
       conditional-set pattern in handle_ept_violation() to conditional-clear]
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      ae1e2d10
  24. 17 2月, 2017 1 次提交
  25. 15 2月, 2017 1 次提交
    • P
      KVM: x86: do not scan IRR twice on APICv vmentry · 76dfafd5
      Paolo Bonzini 提交于
      Calls to apic_find_highest_irr are scanning IRR twice, once
      in vmx_sync_pir_from_irr and once in apic_search_irr.  Change
      sync_pir_from_irr to get the new maximum IRR from kvm_apic_update_irr;
      now that it does the computation, it can also do the RVI write.
      
      In order to avoid complications in svm.c, make the callback optional.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      76dfafd5
  26. 09 1月, 2017 1 次提交