1. 17 April 2021, 2 commits
  2. 15 March 2021, 8 commits
  3. 13 March 2021, 1 commit
    • kvm: x86: annotate RCU pointers · 6fcd9cbc
      Authored by Muhammad Usama Anjum
      This patch adds the annotation to fix the following sparse errors:
      arch/x86/kvm//x86.c:8147:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//x86.c:8147:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//x86.c:8147:15:    struct kvm_apic_map *
      arch/x86/kvm//x86.c:10628:16: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//x86.c:10628:16:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//x86.c:10628:16:    struct kvm_apic_map *
      arch/x86/kvm//x86.c:10629:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//x86.c:10629:15:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//x86.c:10629:15:    struct kvm_pmu_event_filter *
      arch/x86/kvm//lapic.c:267:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:267:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:267:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:269:9: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:269:9:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:269:9:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:637:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:637:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:637:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:994:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:994:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:994:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:1036:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:1036:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:1036:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:1173:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:1173:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:1173:15:    struct kvm_apic_map *
      arch/x86/kvm//pmu.c:190:18: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//pmu.c:190:18:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//pmu.c:190:18:    struct kvm_pmu_event_filter *
      arch/x86/kvm//pmu.c:251:18: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//pmu.c:251:18:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//pmu.c:251:18:    struct kvm_pmu_event_filter *
      arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//pmu.c:522:18:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//pmu.c:522:18:    struct kvm_pmu_event_filter *
      Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
      Message-Id: <20210305191123.GA497469@LEGION>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  4. 03 March 2021, 1 commit
    • KVM: x86/xen: Add support for vCPU runstate information · 30b5c851
      Authored by David Woodhouse
      This is how Xen guests do steal time accounting. The hypervisor records
      the amount of time spent in each of running/runnable/blocked/offline
      states.
      
      In the Xen accounting, a vCPU is still in state RUNSTATE_running while
      in Xen for a hypercall or I/O trap, etc. Only if Xen explicitly
      schedules it out does the state become RUNSTATE_blocked. In KVM this
      means that even when the vCPU exits the kvm_run loop, the state remains
      RUNSTATE_running.
      
      The VMM can explicitly set the vCPU to RUNSTATE_blocked by using the
      KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT attribute, and can also use
      KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST to retrospectively add a given
      amount of time to the blocked state and subtract it from the running
      state.
      
      The state_entry_time corresponds to get_kvmclock_ns() at the time the
      vCPU entered the current state, and the total times of all four states
      should always add up to state_entry_time.
      Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20210301125309.874953-2-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
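The accounting invariant described in this commit (the four state totals always sum to state_entry_time, and RUNSTATE_ADJUST retroactively moves time between buckets without breaking that sum) can be sketched in a small Python model. All names here are illustrative and do not correspond to the actual KVM/Xen ABI structures:

```python
# Illustrative model of Xen runstate accounting; not the KVM/Xen ABI.
RUNNING, RUNNABLE, BLOCKED, OFFLINE = range(4)

class Runstate:
    def __init__(self):
        self.time = [0, 0, 0, 0]   # cumulative ns spent in each state
        self.state = RUNNING
        self.state_entry_time = 0  # kvmclock value when current state began

def set_state(rs, new_state, now):
    # Close out the current stint, then switch. After every transition,
    # the four totals add up to state_entry_time, as the commit requires.
    rs.time[rs.state] += now - rs.state_entry_time
    rs.state = new_state
    rs.state_entry_time = now

def runstate_adjust(rs, delta_ns):
    # RUNSTATE_ADJUST-style fixup: retroactively move time from the
    # running bucket to the blocked bucket; the sum invariant is preserved.
    rs.time[RUNNING] -= delta_ns
    rs.time[BLOCKED] += delta_ns
```

The model makes the invariant easy to exercise: after any sequence of transitions, sum(rs.time) equals rs.state_entry_time, before and after an adjustment.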
  5. 26 February 2021, 1 commit
  6. 19 February 2021, 5 commits
  7. 09 February 2021, 5 commits
    • KVM: x86: hyper-v: Make Hyper-V emulation enablement conditional · 8f014550
      Authored by Vitaly Kuznetsov
      Hyper-V emulation is enabled in KVM unconditionally. This is bad at
      least from a security standpoint, as it is an extra attack surface.
      Ideally, there should be a per-VM capability explicitly enabled by the
      VMM, but currently that is not the case, and we can't mandate one
      without breaking backwards compatibility. We can, however, check
      guest-visible CPUIDs and only enable Hyper-V emulation when the "Hv#1"
      interface is exposed in HYPERV_CPUID_INTERFACE.
      
      Note, VMMs are free to act in any sequence they like, e.g. they can try
      to set MSRs first and CPUIDs later so we still need to allow the host
      to read/write Hyper-V specific MSRs unconditionally.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210126134816.1880136-14-vkuznets@redhat.com>
      [Add selftest vcpu_set_hv_cpuid API to avoid breaking xen_vmcall_test. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
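The CPUID gate described above can be sketched as follows. The dict-based CPUID table and the function name are illustrative assumptions; the leaf (HYPERV_CPUID_INTERFACE, 0x40000001) and the "Hv#1" signature string come from the commit:

```python
import struct

HYPERV_CPUID_INTERFACE = 0x40000001
# "Hv#1" as a little-endian dword, the value a Hyper-V-aware VMM puts in EAX.
HV1_SIGNATURE = struct.unpack("<I", b"Hv#1")[0]

def hv_guest_enabled(cpuid):
    """Enable Hyper-V emulation only if the guest-visible CPUID exposes Hv#1.

    'cpuid' is a hypothetical {leaf: {"eax": ...}} table standing in for
    the guest's CPUID entries.
    """
    leaf = cpuid.get(HYPERV_CPUID_INTERFACE)
    return leaf is not None and leaf["eax"] == HV1_SIGNATURE
```

Note the commit also keeps host-side MSR access unconditional, since a VMM may set MSRs before CPUIDs; this sketch only models the guest-facing gate.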
    • KVM: x86: hyper-v: Allocate 'struct kvm_vcpu_hv' dynamically · 4592b7ea
      Authored by Vitaly Kuznetsov
      Hyper-V context is only needed for guests which use Hyper-V emulation in
      KVM (e.g. Windows/Hyper-V guests). 'struct kvm_vcpu_hv' is, however,
      quite big: it accounts for more than a quarter of the total
      'struct kvm_vcpu_arch', which is already quite big itself. This all
      looks like a waste.
      
      Allocate 'struct kvm_vcpu_hv' dynamically. This patch does not bring any
      (intentional) functional change as we still allocate the context
      unconditionally but it paves the way to doing that only when needed.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210126134816.1880136-13-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Raise the maximum number of user memslots · 4fc096a9
      Authored by Vitaly Kuznetsov
      Current KVM_USER_MEM_SLOTS limits are arch-specific (512 on Power, 509 on
      x86, 32 on s390, 16 on MIPS), but they don't really need to be. Memory
      slots are allocated dynamically in KVM when added, so the only real
      limitation is the 'id_to_index' array, whose entries are 'short'. There
      are no other statically defined structures sized by KVM_MEM_SLOTS_NUM or
      KVM_USER_MEM_SLOTS.
      
      A low KVM_USER_MEM_SLOTS can be a limiting factor for some configurations.
      In particular, when QEMU tries to start a Windows guest with Hyper-V SynIC
      enabled and e.g. 256 vCPUs, the limit is hit: SynIC requires two pages per
      vCPU, the guest is free to pick any GFN for each of them, and QEMU wants a
      separate memslot for each of these pages (which are supposed to act as
      'overlay' pages), so the memslot space becomes fragmented.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210127175731.2020089-3-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
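A quick sanity check of the arithmetic in the scenario above. The numbers come from the commit message itself; the exact new limit chosen by the patch is arch code and is not asserted here, only the bound implied by the 'short' id_to_index entries:

```python
# Scenario from the commit: Windows guest, Hyper-V SynIC, 256 vCPUs.
X86_USER_MEM_SLOTS_OLD = 509        # old x86 limit, per the commit text
vcpus = 256
synic_pages_per_vcpu = 2            # SynIC needs two pages per vCPU
needed = vcpus * synic_pages_per_vcpu  # one 'overlay' memslot per page

assert needed > X86_USER_MEM_SLOTS_OLD  # 512 > 509: the old limit is hit

# 'id_to_index' entries are C 'short', so any raised limit must still fit.
SHRT_MAX = 2**15 - 1
assert needed <= SHRT_MAX
```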
    • KVM: x86: reading DR cannot fail · 29d6ca41
      Authored by Paolo Bonzini
      kvm_get_dr and emulator_get_dr expect an in-range value for the register
      number, so they cannot fail.  Change the return type to void.
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: compile out TDP MMU on 32-bit systems · 897218ff
      Authored by Paolo Bonzini
      The TDP MMU assumes that it can do atomic accesses to 64-bit PTEs.
      Rather than just disabling it on 32-bit systems, compile it out
      completely so that it is possible to use, for example, 64-bit xchg.
      
      To limit the number of stubs, wrap all accesses to tdp_mmu_enabled
      or tdp_mmu_page with a function.  Calls to all other functions in
      tdp_mmu.c are eliminated and do not even reach the linker.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Tested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  8. 04 February 2021, 14 commits
    • KVM: x86: SEV: Treat C-bit as legal GPA bit regardless of vCPU mode · ca29e145
      Authored by Sean Christopherson
      Rename cr3_lm_rsvd_bits to reserved_gpa_bits, and use it for all GPA
      legality checks.  AMD's APM states:
      
        If the C-bit is an address bit, this bit is masked from the guest
        physical address when it is translated through the nested page tables.
      
      Thus, any access that can conceivably be run through NPT should ignore
      the C-bit when checking for validity.
      
      For features that KVM emulates in software, e.g. MTRRs, there is no
      clear direction in the APM for how the C-bit should be handled.  For
      such cases, follow the SME behavior as much as possible, since SEV is
      essentially a VM-specific variant of SME.  For SME, the APM states:
      
        In this case the upper physical address bits are treated as reserved
        when the feature is enabled except where otherwise indicated.
      
      Collecting the various relevant SME snippets in the APM and
      cross-referencing the omissions with Linux kernel code, this leaves
      MTRRs and APIC_BASE as the only flows that KVM emulates that should
      _not_ ignore the C-bit.
      
      Note, this means the reserved bit checks in the page tables are
      technically broken.  This will be remedied in a future patch.
      
      Although the page table checks are technically broken, in practice, it's
      all but guaranteed to be irrelevant.  NPT is required for SEV, i.e.
      shadowing page tables isn't needed in the common case.  Theoretically,
      the checks could be in play for nested NPT, but it's extremely unlikely
      that anyone is running nested VMs on SEV, as doing so would require L1
      to expose sensitive data to L0, e.g. the entire VMCB.  And if anyone is
      running nested VMs, L0 can't read the guest's encrypted memory, i.e. L1
      would need to put its NPT in shared memory, in which case the C-bit will
      never be set.  Or, L1 could use shadow paging, but again, if L0 needs to
      read page tables, e.g. to load PDPTRs, the memory can't be encrypted if
      L1 has any expectation of L0 doing the right thing.
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210204000117.3303214-8-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
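The C-bit handling described above can be sketched as a mask on the reserved-GPA bits. The function names and the MAXPHYADDR/C-bit positions below are illustrative, not KVM's actual code:

```python
def reserved_gpa_bits(maxphyaddr, c_bit=None):
    """Bits above MAXPHYADDR are reserved in a GPA; for an SEV guest, the
    C-bit is treated as a legal GPA bit and removed from the reserved mask,
    mirroring how NPT masks it out during translation."""
    rsvd = ((1 << 64) - 1) & ~((1 << maxphyaddr) - 1)
    if c_bit is not None:
        rsvd &= ~(1 << c_bit)
    return rsvd

def gpa_is_legal(gpa, rsvd):
    # A GPA is legal if it sets no reserved bits.
    return (gpa & rsvd) == 0
```

With an illustrative MAXPHYADDR of 43 and a C-bit at position 47, a GPA with the C-bit set is illegal under a naive check but legal once the C-bit is excluded from the reserved mask.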
    • KVM: x86/xen: Add event channel interrupt vector upcall · 40da8ccd
      Authored by David Woodhouse
      It turns out that we can't handle event channels *entirely* in userspace
      by delivering them as ExtINT, because KVM is a bit picky about when it
      accepts ExtINT interrupts from a legacy PIC. The in-kernel local APIC
      has to have LVT0 configured in APIC_MODE_EXTINT and unmasked, which
      isn't necessarily the case for Xen guests especially on secondary CPUs.
      
      To cope with this, add kvm_xen_get_interrupt() which checks the
      evtchn_pending_upcall field in the Xen vcpu_info, and delivers the Xen
      upcall vector (configured by KVM_XEN_ATTR_TYPE_UPCALL_VECTOR) if it's
      set regardless of LAPIC LVT0 configuration. This gives us the minimum
      support we need for completely userspace-based implementation of event
      channels.
      
      This does mean that vcpu_enter_guest() needs to check for the
      evtchn_pending_upcall flag being set, because it can't rely on someone
      having set KVM_REQ_EVENT unless we were to add some way for userspace to
      do so manually.
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
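A minimal sketch of the delivery rule described above. The function and field names are illustrative; in KVM the check lives in kvm_xen_get_interrupt() against the evtchn_pending_upcall field of the Xen vcpu_info:

```python
def xen_pending_upcall_vector(vcpu_info, upcall_vector):
    """Return the Xen upcall vector to inject, or None.

    Unlike ExtINT delivery (which requires LVT0 to be configured in
    APIC_MODE_EXTINT and unmasked), the upcall vector is delivered
    whenever the pending flag is set, regardless of LVT0 state.
    'vcpu_info' is a hypothetical dict standing in for the Xen struct.
    """
    if upcall_vector is None:
        return None  # vector not configured via KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
    if vcpu_info.get("evtchn_upcall_pending"):
        return upcall_vector
    return None
```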
    • KVM: x86/xen: register vcpu time info region · f2340cd9
      Authored by Joao Martins
      Allow the Xen-emulated guest to register secondary vCPU time
      information. On Xen guests this is mapped to userspace, which allows
      the vdso gettimeofday implementation to work.
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    • KVM: x86/xen: register vcpu info · 73e69a86
      Authored by Joao Martins
      The vcpu info supersedes the per vcpu area of the shared info page and
      the guest vcpus will use this instead.
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    • KVM: x86/xen: register shared_info page · 13ffb97a
      Authored by Joao Martins
      Add KVM_XEN_ATTR_TYPE_SHARED_INFO to allow the hypervisor to know where
      the guest's shared info page is.
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    • a3833b81
    • KVM: x86/xen: intercept xen hypercalls if enabled · 23200b7a
      Authored by Joao Martins
      Add a new exit reason for the emulator to handle Xen hypercalls.
      
      Since this means KVM owns the ABI, dispense with the facility for the
      VMM to provide its own copy of the hypercall pages; just fill them in
      directly using VMCALL/VMMCALL as we do for the Hyper-V hypercall page.
      
      This behaviour is enabled by a new INTERCEPT_HCALL flag in the
      KVM_XEN_HVM_CONFIG ioctl structure, and advertised by the same flag
      being returned from the KVM_CAP_XEN_HVM check.
      
      Rename xen_hvm_config() to kvm_xen_write_hypercall_page() and move it
      to the nascent xen.c while we're at it, and add a test case.
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    • KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map · 9a77daac
      Authored by Ben Gardon
      To prepare for handling page faults in parallel, change the TDP MMU
      page fault handler to use atomic operations to set SPTEs so that changes
      are not lost if multiple threads attempt to modify the same SPTE.
      Reviewed-by: Peter Feiner <pfeiner@google.com>
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Message-Id: <20210202185734.1680553-21-bgardon@google.com>
      [Document new locking rules. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
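The lost-update problem and its compare-and-swap fix can be modeled as below. This is a single-threaded Python illustration of cmpxchg semantics (names are hypothetical), not the kernel's actual atomic code:

```python
def tdp_mmu_set_spte_atomic(sptes, gfn, old_spte, new_spte):
    """Install new_spte only if the SPTE still holds old_spte.

    This models the cmpxchg a parallel page-fault handler uses: if another
    thread modified the SPTE since we read it, the install fails and the
    caller retries (or backs off) instead of silently losing the other
    thread's update. 'sptes' is a hypothetical {gfn: spte} table.
    """
    if sptes.get(gfn, 0) != old_spte:
        return False          # raced with another writer; caller retries
    sptes[gfn] = new_spte
    return True
```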
    • KVM: x86/mmu: Use an rwlock for the x86 MMU · 531810ca
      Authored by Ben Gardon
      Add a read/write lock to be used in place of the MMU spinlock on x86.
      The rwlock will enable the TDP MMU to handle page faults and other
      operations in parallel in future commits.
      Reviewed-by: Peter Feiner <pfeiner@google.com>
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Message-Id: <20210202185734.1680553-19-bgardon@google.com>
      [Introduce virt/kvm/mmu_lock.h - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: use static calls to reduce kvm_x86_ops overhead · b3646477
      Authored by Jason Baron
      Convert kvm_x86_ops to use static calls. Note that all kvm_x86_ops are
      covered here except for 'pmu_ops' and 'nested ops'.
      
      Here are some numbers running cpuid in a loop of 1 million calls averaged
      over 5 runs, measured in the vm (lower is better).
      
      Intel Xeon 3000MHz:
      
                 |default    |mitigations=off
      -------------------------------------
      vanilla    |.671s      |.486s
      static call|.573s(-15%)|.458s(-6%)
      
      AMD EPYC 2500MHz:
      
                 |default    |mitigations=off
      -------------------------------------
      vanilla    |.710s      |.609s
      static call|.664s(-6%) |.609s(0%)
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Message-Id: <e057bf1b8a7ad15652df6eeba3f907ae758d3399.1610680941.git.jbaron@akamai.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: introduce definitions to support static calls for kvm_x86_ops · 9af5471b
      Authored by Jason Baron
      Use static calls to improve kvm_x86_ops performance. Introduce the
      definitions that will be used by a subsequent patch to realize the
      savings. Add a new kvm-x86-ops.h header that can be used for the
      definition of static calls. This header is also intended to be
      used to simplify the definition of svm_kvm_ops and vmx_x86_ops.
      
      Note that all functions in kvm_x86_ops are covered here except for
      'pmu_ops' and 'nested ops'. I think they can be covered by static
      calls in a similar manner, but they were omitted from this series to
      reduce its scope and because I don't think they have as large a
      performance impact.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Message-Id: <e5cc82ead7ab37b2dceb0837a514f3f8bea4f8d1.1610680941.git.jbaron@akamai.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW · 9a3ecd5e
      Authored by Chenyi Qiang
      DR6_INIT contains the 1-reserved bits as well as the bit that is cleared
      to 0 when the corresponding condition (e.g. RTM) occurs. The value can
      be used to initialize DR6 and also serves as the XOR mask between the
      #DB exit qualification (or payload) and DR6.
      
      Since DR6_INIT is used as an initial value only once, rename it to
      DR6_ACTIVE_LOW and apply it in the other places as well, which makes
      the incoming changes for the bus lock debug exception simpler.
      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20210202090433.13441-2-chenyi.qiang@intel.com>
      [Define DR6_FIXED_1 from DR6_ACTIVE_LOW and DR6_VOLATILE. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
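The XOR relationship described above can be checked directly. 0xffff0ff0 is the architectural value of the old DR6_INIT; the function name is illustrative:

```python
DR6_ACTIVE_LOW = 0xFFFF0FF0  # value of the old DR6_INIT constant

def dr6_from_payload(payload):
    """Convert a #DB exit qualification/payload into a DR6 value.

    The payload uses positive polarity for all bits, while in DR6 some
    bits (e.g. the RTM bit, bit 16) are active-low; XOR with
    DR6_ACTIVE_LOW flips exactly those bits and sets the 1-reserved bits.
    """
    return payload ^ DR6_ACTIVE_LOW
```

A payload of 0 (no conditions) yields DR6_ACTIVE_LOW itself, and a payload with bit 16 set (RTM condition) yields a DR6 with bit 16 cleared, as the active-low semantics require.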
    • KVM: VMX: Enable bus lock VM exit · fe6b6bc8
      Authored by Chenyi Qiang
      A virtual machine can exploit bus locks to degrade system performance.
      A bus lock is caused by a split locked access to writeback (WB) memory
      or by a locked access to uncacheable (UC) memory. A bus lock is
      typically >1000 cycles slower than an atomic operation within a cache
      line, and it also disrupts performance on other cores (which must wait
      for the bus lock to be released before their memory operations can
      complete).
      
      To address the threat, bus lock VM exit is introduced to notify the VMM
      when a bus lock was acquired, allowing it to enforce throttling or other
      policy based mitigations.
      
      A VMM can enable VM exits due to bus locks by setting a new "Bus Lock
      Detection" VM-execution control (bit 30 of the Secondary Processor-based
      VM-execution controls). If delivery of this VM exit was preempted by a
      higher-priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC
      access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of
      the exit reason field in the VMCS is set to 1.
      
      In the current implementation, KVM exposes this capability through
      KVM_CAP_X86_BUS_LOCK_EXIT. User space can query the supported mode
      bitmap (i.e. off and exit) and must enable the feature explicitly (it
      is disabled by default). If KVM detects bus locks in the guest, it
      exits to user space even when the current exit reason is handled by KVM
      internally, and sets a new field, KVM_RUN_BUS_LOCK, in
      vcpu->run->flags to inform user space that a bus lock was detected in
      the guest.
      
      Document for Bus Lock VM exit is now available at the latest "Intel
      Architecture Instruction Set Extensions Programming Reference".
      
      Document Link:
      https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
      Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20201106090315.18606-4-chenyi.qiang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
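The exit-reason encoding described above can be sketched as a small decoder. Only "bit 26" comes from the commit text; the flag name is illustrative, and the basic-reason mask assumes the standard VMX layout where the basic exit reason occupies the low 16 bits:

```python
BUS_LOCK_DETECTED_FLAG = 1 << 26  # illustrative name for "bit 26"

def decode_exit_reason(exit_reason):
    """Split a VMX exit reason into (basic_reason, bus_lock_detected).

    When a bus lock coincides with a higher-priority VM exit, the basic
    reason reports that exit and bit 26 flags the bus lock alongside it.
    """
    basic = exit_reason & 0xFFFF
    bus_lock = bool(exit_reason & BUS_LOCK_DETECTED_FLAG)
    return basic, bus_lock
```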
    • KVM: x86/mmu: Remove the defunct update_pte() paging hook · c5e2184d
      Authored by Sean Christopherson
      Remove the update_pte() shadow paging logic, which was obsoleted by
      commit 4731d4c7 ("KVM: MMU: out of sync shadow core"), but never
      removed.  As pointed out by Yu, KVM never write protects leaf page
      tables for the purposes of shadow paging, and instead marks their
      associated shadow page as unsync so that the guest can write PTEs at
      will.
      
      The update_pte() path, which predates the unsync logic, optimizes COW
      scenarios by refreshing leaf SPTEs when they are written, as opposed to
      zapping the SPTE, restarting the guest, and installing the new SPTE on
      the subsequent fault.  Since KVM no longer write-protects leaf page
      tables, update_pte() is unreachable and can be dropped.
      Reported-by: Yu Zhang <yu.c.zhang@intel.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210115004051.4099250-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  9. 08 January 2021, 2 commits
    • KVM: SVM: Add support for booting APs in an SEV-ES guest · 647daca2
      Authored by Tom Lendacky
      Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence:
      the guest vCPU register state is updated and then the vCPU is run
      (VMRUN) to begin execution of the AP. For an SEV-ES guest, this won't
      work because the guest register state is encrypted.
      
      Following the GHCB specification, the hypervisor must not alter the guest
      register state, so KVM must track an AP/vCPU boot. Should the guest want
      to park the AP, it must use the AP Reset Hold exit event in place of, for
      example, a HLT loop.
      
      First AP boot (first INIT-SIPI-SIPI sequence):
        Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
        support. It is up to the guest to transfer control of the AP to the
        proper location.
      
      Subsequent AP boot:
        KVM will expect to receive an AP Reset Hold exit event indicating that
        the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
        awaken it. When the AP Reset Hold exit event is received, KVM will place
        the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
        sequence, KVM will make the vCPU runnable. It is again up to the guest
        to then transfer control of the AP to the proper location.
      
        To differentiate between an actual HLT and an AP Reset Hold, a new MP
        state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
        placed in upon receiving the AP Reset Hold exit event. Additionally, to
        communicate the AP Reset Hold exit event up to userspace (if needed), a
        new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.
      
      A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
      to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
      original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
      a new function that, for non SEV-ES guests, invokes the original SIPI
      delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
      implements the logic above.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
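The AP parking flow above can be modeled as a tiny state machine. The class and state names are illustrative, loosely mirroring the new KVM_MP_STATE_AP_RESET_HOLD MP state; the real logic also covers the first boot and the vcpu_deliver_sipi_vector hook, which this sketch omits:

```python
RUNNABLE = "runnable"
AP_RESET_HOLD = "ap-reset-hold"  # models KVM_MP_STATE_AP_RESET_HOLD

class SevEsVcpu:
    """Hypothetical model of an SEV-ES AP being parked and re-awakened."""

    def __init__(self):
        self.mp_state = RUNNABLE

    def ap_reset_hold(self):
        # Guest issued the AP Reset Hold exit event (in place of a HLT
        # loop): park the vCPU in a simulated HLT state.
        self.mp_state = AP_RESET_HOLD

    def init_sipi_sipi(self):
        # A subsequent INIT-SIPI-SIPI wakes the parked AP. KVM must not
        # alter the encrypted register state, so it only makes the vCPU
        # runnable; the guest transfers control of the AP itself.
        if self.mp_state == AP_RESET_HOLD:
            self.mp_state = RUNNABLE
```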
    • KVM: x86/mmu: Clarify TDP MMU page list invariants · c0dba6e4
      Authored by Ben Gardon
      The tdp_mmu_roots and tdp_mmu_pages in struct kvm_arch should only contain
      pages with tdp_mmu_page set to true. tdp_mmu_pages should not contain any
      pages with a non-zero root_count and tdp_mmu_roots should only contain
      pages with a positive root_count, unless a thread holds the MMU lock and
      is in the process of modifying the list. Various functions expect these
      invariants to be maintained, but they are not explicitly documented. Add
      comments on both fields documenting the above invariants.
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Message-Id: <20210107001935.3732070-2-bgardon@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
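The invariants spelled out in this commit can be expressed as an executable check. The dict-based page representation is an illustration, not the kernel's struct kvm_mmu_page:

```python
def check_tdp_mmu_invariants(tdp_mmu_roots, tdp_mmu_pages):
    """Check the list invariants from the commit message:
    - every page on either list has tdp_mmu_page set;
    - pages on tdp_mmu_roots have a positive root_count;
    - pages on tdp_mmu_pages have root_count == 0.
    (In the kernel these may be transiently violated by a thread that
    holds the MMU lock while modifying the lists.)
    """
    for sp in tdp_mmu_roots + tdp_mmu_pages:
        if not sp["tdp_mmu_page"]:
            return False
    if any(sp["root_count"] <= 0 for sp in tdp_mmu_roots):
        return False
    if any(sp["root_count"] != 0 for sp in tdp_mmu_pages):
        return False
    return True
```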
  10. 15 December 2020, 1 commit