- 18 Oct 2018, 5 commits
-
-
By Jim Mattson
This is a per-VM capability which can be enabled by userspace so that the faulting linear address will be included with the information about a pending #PF in L2, and the "new DR6 bits" will be included with the information about a pending #DB in L2. With this capability enabled, the L1 hypervisor can now intercept #PF before CR2 is modified. Under VMX, the L1 hypervisor can now intercept #DB before DR6 and DR7 are modified.

When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should generally provide an appropriate payload when injecting a #PF or #DB exception via KVM_SET_VCPU_EVENTS. However, to support restoring old checkpoints, this payload is not required.

Note that bit 16 of the "new DR6 bits" is set to indicate that a debug exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM region while advanced debugging of RTM transactional regions was enabled. This is the reverse of DR6.RTM, which is cleared in this scenario.

This capability also enables exception.pending in struct kvm_vcpu_events, which allows userspace to distinguish between pending and injected exceptions.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
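A minimal userspace sketch of the flow (hypothetical VMM code, not part of this series; vm_fd, vcpu_fd and fault_addr are assumed to come from the caller, field names follow the uapi definitions):

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Enable the per-VM capability, then queue a *pending* #PF whose
     * payload becomes CR2 only when the exception is delivered. */
    static void inject_pf_with_payload(int vm_fd, int vcpu_fd, __u64 fault_addr)
    {
            struct kvm_enable_cap cap = { .cap = KVM_CAP_EXCEPTION_PAYLOAD };
            struct kvm_vcpu_events events;

            ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

            memset(&events, 0, sizeof(events));
            events.exception.pending = 1;          /* pending, not injected */
            events.exception.nr = 14;              /* #PF */
            events.exception.has_error_code = 1;
            events.exception.error_code = 0x2;     /* e.g. a write access fault */
            events.exception_has_payload = 1;
            events.exception_payload = fault_addr; /* loaded into CR2 on delivery */
            ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
    }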
-
By Jim Mattson
When exception payloads are enabled by userspace (which is not yet possible) and a #DB is raised in L2, defer the setting of DR6 until later. Under VMX, this allows the L1 hypervisor to intercept the fault before DR6 is modified. Under SVM, DR6 is modified before L1 can intercept the fault (as has always been the case with DR7).

Note that the payload associated with a #DB exception includes only the "new DR6 bits." When the payload is delivered, DR6.B0-B3 will be cleared and DR6.RTM will be set prior to merging in the new DR6 bits.

Also note that bit 16 in the "new DR6 bits" is set to indicate that a debug exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM region while advanced debugging of RTM transactional regions was enabled. Though the reverse of DR6.RTM, this makes the #DB payload field compatible with both the pending debug exceptions field under VMX and the exit qualification for #DB exceptions under VMX.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
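A sketch of the merge described above, modeled on the kernel's delivery path (constant names follow the kernel's DR6 definitions: DR_TRAP_BITS covers DR6.B0-B3, DR6_RTM is bit 16):

    /* Fold a #DB payload ("new DR6 bits") into vcpu->arch.dr6 at
     * delivery time rather than at queueing time. */
    static void deliver_db_payload(struct kvm_vcpu *vcpu, unsigned long payload)
    {
            vcpu->arch.dr6 &= ~DR_TRAP_BITS;     /* #DB clears DR6.B0-B3 */
            vcpu->arch.dr6 |= DR6_RTM;           /* RTM set unless payload clears it */
            vcpu->arch.dr6 |= payload;           /* merge the new DR6 bits */
            vcpu->arch.dr6 ^= payload & DR6_RTM; /* payload bit 16 set => clear DR6.RTM */
    }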
-
By Jim Mattson
When exception payloads are enabled by userspace (which is not yet possible) and a #PF is raised in L2, defer the setting of CR2 until the #PF is delivered. This allows the L1 hypervisor to intercept the fault before CR2 is modified.

For backwards compatibility, when exception payloads are not enabled by userspace, kvm_multiple_exception modifies CR2 when the #PF exception is raised.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
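The #PF half of delivery is simpler; a sketch under the same assumptions as above:

    /* The payload of a #PF is the faulting linear address; it reaches
     * CR2 only when the exception is delivered, giving L1 a chance to
     * intercept the fault with CR2 still unmodified. */
    static void deliver_pf_payload(struct kvm_vcpu *vcpu, unsigned long payload)
    {
            vcpu->arch.cr2 = payload;
    }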
-
By Jim Mattson
kvm_multiple_exception now takes two additional operands: has_payload and payload, so that updates to CR2 (and DR6 under VMX) can be delayed until the exception is delivered. This is necessary to properly emulate VMX or SVM hardware behavior for nested virtualization.

The new behavior is triggered by vcpu->kvm->arch.exception_payload_enabled, which will (later) be set by a new per-VM capability, KVM_CAP_EXCEPTION_PAYLOAD.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Jim Mattson
The per-VM capability KVM_CAP_EXCEPTION_PAYLOAD (to be introduced in a later commit) adds the following fields to struct kvm_vcpu_events: exception_has_payload, exception_payload, and exception.pending.

With this capability set, all of the details of vcpu->arch.exception, including the payload for a pending exception, are reported to userspace in response to KVM_GET_VCPU_EVENTS. With this capability clear, the original ABI is preserved, and the exception.injected field is set for either pending or injected exceptions.

When userspace calls KVM_SET_VCPU_EVENTS with KVM_CAP_EXCEPTION_PAYLOAD clear, exception.injected is no longer translated to exception.pending. KVM_SET_VCPU_EVENTS can now only establish a pending exception when KVM_CAP_EXCEPTION_PAYLOAD is set.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
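For reference, an abridged sketch of the resulting uapi layout (middle fields elided; the authoritative definition lives in arch/x86/include/uapi/asm/kvm.h):

    struct kvm_vcpu_events {
            struct {
                    __u8 injected;
                    __u8 nr;
                    __u8 has_error_code;
                    __u8 pending;           /* meaningful with the new capability */
                    __u32 error_code;
            } exception;
            /* ... interrupt, nmi, sipi_vector, flags, smi ... */
            __u8  reserved[27];             /* carved out of the old reserved bytes */
            __u8  exception_has_payload;
            __u64 exception_payload;
    };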
-
- 17 Oct 2018, 35 commits
-
-
By Jim Mattson
The payload associated with a #PF exception is the linear address of the fault to be loaded into CR2 when the fault is delivered. The payload associated with a #DB exception is a mask of the DR6 bits to be set (or in the case of DR6.RTM, cleared) when the fault is delivered.

Add fields has_payload and payload to kvm_queued_exception to track payloads for pending exceptions. The new fields are introduced here, but for now, they are just cleared.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
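An abridged sketch of the queued-exception bookkeeping after this change (cf. struct kvm_queued_exception in arch/x86/include/asm/kvm_host.h; field order is illustrative):

    struct kvm_queued_exception {
            bool pending;           /* queued but not yet visible to hardware */
            bool injected;          /* already established in the VMCS/VMCB */
            bool has_error_code;
            u8 nr;
            u32 error_code;
            bool has_payload;       /* new: payload below is valid */
            unsigned long payload;  /* new: deferred CR2 value or "new DR6 bits" */
    } exception;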
-
By Jim Mattson
The header file indicates that there are 36 reserved bytes at the end of this structure. Adjust the documentation to agree with the header file.

Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
Modify test library and add eVMCS test. This includes nVMX save/restore testing.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
Pick up the latest kvm.h definitions.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
Add support for get/set of nested state when Enlightened VMCS is in use. A new KVM_STATE_NESTED_EVMCS flag is added to indicate that eVMCS was enabled on the vCPU.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
Split prepare_for_vmx_operation() into prepare_for_vmx_operation() and load_vmcs() so we can inject GUEST_SYNC() in between.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
It is perfectly valid for a guest to do VMXON and not do VMPTRLD. This state needs to be preserved on migration.

Cc: stable@vger.kernel.org
Fixes: 8fcc4b59
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
vcpu->arch.pv_eoi is accessible through both HV_X64_MSR_VP_ASSIST_PAGE and MSR_KVM_PV_EOI_EN, so on migration userspace may try to restore them in any order. The values match; however, kvm_lapic_enable_pv_eoi() uses a different length: for the Hyper-V case it's the whole struct hv_vp_assist_page, for the KVM-native case it is 8 bytes.

If we restore the KVM-native MSR last, the cache will be reinitialized with len=8, so trying to access the VP assist page beyond 8 bytes with kvm_read_guest_cached() will fail. Check whether we are re-initializing the cache for the same address and preserve the length in case it was greater.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
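A sketch of the resulting check, modeled on kvm_lapic_enable_pv_eoi() (alignment and enable checks elided):

    /* When the same GPA is re-registered with a shorter length, keep
     * the longer mapping so later cached reads past the first 8 bytes
     * of the VP assist page still succeed. */
    static int enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len)
    {
            u64 addr = data & ~KVM_MSR_ENABLED;
            struct gfn_to_hva_cache *ghc = &vcpu->arch.pv_eoi.data;
            unsigned long new_len;

            if (addr == ghc->gpa && len <= ghc->len)
                    new_len = ghc->len;     /* same page: preserve the larger length */
            else
                    new_len = len;

            return kvm_gfn_to_hva_cache_init(vcpu->kvm, ghc, addr, new_len);
    }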
-
By Vitaly Kuznetsov
VP assist pages may hold valuable data which needs to be preserved across migration. Clean the PV EOI portion of the data on init; the guest is responsible for making sure there's no garbage in the rest.

This will be used for nVMX migration: the eVMCS address needs to be preserved.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
When Enlightened VMCS is in use by the L1 hypervisor we can avoid vmwriting VMCS fields which did not change.

Our first goal is to achieve minimal impact on the traditional VMCS case, so we're not wrapping each vmwrite() with an if-changed checker. We also can't utilize static keys as Enlightened VMCS usage is per-guest.

This patch implements the simplest solution: checking fields in groups. We skip single vmwrite() statements as doing the check will cost us something even in the non-evmcs case, and the win is tiny. Unfortunately, this makes prepare_vmcs02{,_full}() code Enlightened VMCS-dependent (and a bit ugly).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
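The group-check pattern looks roughly like this (a sketch; the clean-field constant comes from the kernel's Hyper-V TLFS header, and the exact fields in each group are illustrative):

    /* Skip a whole block of vmwrites when the enlightened VMCS marks
     * the corresponding field group as clean, i.e. unchanged since the
     * last VM entry. */
    if (!hv_evmcs || !(hv_evmcs->hv_clean_fields &
                       HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2)) {
            vmcs_write32(EXCEPTION_BITMAP, exception_bitmap);
            vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK,
                         vmcs12->page_fault_error_code_mask);
            /* ... the remaining fields of this group ... */
    }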
-
By Vitaly Kuznetsov
Per Hyper-V TLFS 5.0b:

"The L1 hypervisor may choose to use enlightened VMCSs by writing 1 to the corresponding field in the VP assist page (see section 7.8.7). Another field in the VP assist page controls the currently active enlightened VMCS. Each enlightened VMCS is exactly one page (4 KB) in size and must be initially zeroed. No VMPTRLD instruction must be executed to make an enlightened VMCS active or current. After the L1 hypervisor performs a VM entry with an enlightened VMCS, the VMCS is considered active on the processor. An enlightened VMCS can only be active on a single processor at the same time. The L1 hypervisor can execute a VMCLEAR instruction to transition an enlightened VMCS from the active to the non-active state. Any VMREAD or VMWRITE instructions while an enlightened VMCS is active is unsupported and can result in unexpected behavior."

Keep the Enlightened VMCS structure for the current L2 guest permanently mapped from struct nested_vmx instead of mapping it every time.

Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
Add the hv_evmcs pointer and implement copy_enlightened_to_vmcs12() and copy_vmcs12_to_enlightened(). The prepare_vmcs02()/prepare_vmcs02_full() separation is not valid for Enlightened VMCS; do a full sync for now.

Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
Enlightened VMCS is opt-in. The current version does not contain all the fields supported by nested VMX, so we must not advertise the corresponding VMX features if Enlightened VMCS is enabled.

Userspace is given the Enlightened VMCS version supported by KVM as part of enabling KVM_CAP_HYPERV_ENLIGHTENED_VMCS. This version is to be advertised to the nested hypervisor, currently done via a CPUID leaf for Hyper-V.

Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
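A hypothetical userspace sketch of the handshake (the supported version is written back through the pointer passed in args[0]; vcpu_fd is assumed to come from the VMM):

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Enable eVMCS on a vCPU; KVM reports the supported Enlightened
     * VMCS version, which the VMM then advertises to the nested
     * hypervisor through the Hyper-V CPUID leaves. */
    static uint16_t enable_evmcs(int vcpu_fd)
    {
            uint16_t evmcs_version = 0;
            struct kvm_enable_cap cap = {
                    .cap = KVM_CAP_HYPERV_ENLIGHTENED_VMCS,
                    .args[0] = (uintptr_t)&evmcs_version,
            };

            ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
            return evmcs_version;
    }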
-
By Vitaly Kuznetsov
Split off EVMCS1_UNSUPPORTED_* macros so we can re-use them when enabling Enlightened VMCS for Hyper-V on KVM.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Ladi Prosek
The state related to the VP assist page is still managed by the LAPIC code in the pv_eoi field.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Wei Yang
The original comment is a little hard to understand. No functional change, just amend the comment a little.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Wei Yang
rmap_remove() removes the sptep after locating the correct rmap_head, but in several cases the caller already knows the correct rmap_head. This patch introduces a new pte_list_remove(); because it is known that the spte is present (or it would not have an rmap_head), it is safe to remove the tracking bits without any previous check.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
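The new helper is essentially the following (a sketch modeled on the patch; both callees already exist in mmu.c at this point in the series):

    /* The caller already located rmap_head, so the spte is known to be
     * present: clear the tracking bits and unlink, with no extra check. */
    static void pte_list_remove(struct kvm_rmap_head *rmap_head, u64 *sptep)
    {
            mmu_spte_clear_track_bits(sptep);
            __pte_list_remove(sptep, rmap_head);
    }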
-
By Wei Yang
This is a preparatory patch for a further change.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Peng Hao
Coalesced pio is based on coalesced mmio and can be used for ports like the RTC port, the PCI host config ports, and so on.

The RTC port is a particularly good fit for coalesced pio: some versions of Windows guests access the RTC frequently because the RTC serves as the system tick. The guest accesses the RTC like this: write the register index to port 0x70, then write or read data through port 0x71. Writing port 0x70 only selects the index and does nothing else, so we can use coalesced pio to handle this pattern and reduce VM-exit time.

A virtual machine also accesses the PCI host config port frequently while starting and shutting down, so setting these ports as coalesced pio can reduce startup and shutdown time.

Without this patch, the vm-exit time of accessing RTC 0x70 and PIIX 0xcf8, measured with perf tools (guest OS: Windows 7 64-bit):

    IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
    0x70:POUT       86       30.99%    74.59%  9us       29us      10.75us (+- 3.41%)
    0xcf8:POUT      1119     2.60%     2.12%   2.79us    56.83us   3.41us  (+- 2.23%)

With this patch:

    IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
    0x70:POUT       106      32.02%    29.47%  0us       10us      1.57us  (+- 7.38%)
    0xcf8:POUT      1065     1.67%     0.28%   0.41us    65.44us   0.66us  (+- 10.55%)

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
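A hypothetical userspace sketch of registering such a zone (this series adds the pio flag to the existing coalesced-MMIO zone struct; vm_fd is assumed to come from the VMM):

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Mark the RTC index port as coalesced PIO, so guest writes to
     * 0x70 are batched in the ring buffer instead of each one exiting
     * to userspace. */
    static int coalesce_rtc_index_port(int vm_fd)
    {
            struct kvm_coalesced_mmio_zone zone = {
                    .addr = 0x70,   /* RTC index port: write-only, no side effects */
                    .size = 1,
                    .pio  = 1,      /* port I/O rather than MMIO */
            };
            return ioctl(vm_fd, KVM_REGISTER_COALESCED_MMIO, &zone);
    }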
-
By Peng Hao
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Peng Hao
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Lan Tianyu
If the EPT table pointers are mismatched, flushing the TLB for each vCPU via the Hyper-V flush interface still helps to reduce the number of vmexits triggered by IPI and INVEPT emulation.

Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Uros Bizjak
x86_64 zero-extends a 32-bit xor to the full 64-bit register. Use the %k asm operand modifier to force the 32-bit register name and save 268 bytes in kvm.o.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
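For illustration (not code from the patch), the modifier works like this in GCC extended asm:

    /* %k0 forces the 32-bit name of operand 0, so the assembler emits
     * "xor %eax,%eax" (no REX.W prefix); the CPU zero-extends the
     * result into the full 64-bit register anyway. */
    static inline unsigned long clear_reg(void)
    {
            unsigned long v;
            asm volatile ("xor %k0, %k0" : "=r" (v));
            return v;
    }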
-
By Uros Bizjak
Recently the minimum required version of binutils was changed to 2.20, which supports all VMX instruction mnemonics. The patch removes all .byte #defines and uses real instruction mnemonics instead.

The compiler is now able to pass a memory operand to the instruction, so there is no need for a memory clobber anymore. Also, the compiler adds a CC register clobber automatically to all extended asm clauses, so the patch also removes the explicit CC clobber.

The immediate benefit of the patch is the removal of many unnecessary register moves, resulting in 1434 saved bytes in vmx.o:

       text    data     bss     dec     hex filename
     151257   18246    8500  178003   2b753 vmx.o
     152691   18246    8500  179437   2bced vmx-old.o

Some examples of improvement include removal of unneeded moves of %rsp to %rax in front of invept and invvpid instructions:

    a57e: b9 01 00 00 00          mov    $0x1,%ecx
    a583: 48 89 04 24             mov    %rax,(%rsp)
    a587: 48 89 e0                mov    %rsp,%rax
    a58a: 48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
    a591: 00 00
    a593: 66 0f 38 80 08          invept (%rax),%rcx

to:

    a45c: 48 89 04 24             mov    %rax,(%rsp)
    a460: b8 01 00 00 00          mov    $0x1,%eax
    a465: 48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
    a46c: 00 00
    a46e: 66 0f 38 80 04 24       invept (%rsp),%rax

and the ability to use more optimal registers and memory operands in the instruction:

    8faa: 48 8b 44 24 28          mov    0x28(%rsp),%rax
    8faf: 4c 89 c2                mov    %r8,%rdx
    8fb2: 0f 79 d0                vmwrite %rax,%rdx

to:

    8e7c: 44 0f 79 44 24 28       vmwrite 0x28(%rsp),%r8

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Uros Bizjak
The register operand of the invvpid and invept instructions is always 64 bits wide in 64-bit mode. Adjust the inline function argument type to reflect the correct size.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
We don't use the root page role for nested_mmu; however, optimizing out re-initialization when nothing changed is still valuable, as this is done for every nested vmentry.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
MMU reconfiguration in init_kvm_tdp_mmu()/kvm_init_shadow_mmu() can be avoided if the source data used to configure it didn't change; enhance the MMU extended role with the required fields and consolidate common code in kvm_calc_mmu_role_common().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
MMU re-initialization is expensive; in particular, update_permission_bitmask() and update_pkru_bitmask() are. Cache the data used to set up the shadow EPT MMU and avoid full re-init when it is unchanged.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
In preparation for MMU reconfiguration avoidance we need space to cache source data. As this partially intersects with kvm_mmu_page_role, create a 64-bit sized union kvm_mmu_role holding both base and extended data. No functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
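The union introduced here, abridged (the extended bits are populated with the cached CR0/CR4/EFER source data by follow-up patches in the series):

    union kvm_mmu_extended_role {
            u32 word;
            struct {
                    unsigned int valid:1;   /* the cached role has been computed */
                    /* source-data bits are added by later patches */
            };
    };

    union kvm_mmu_role {
            u64 as_u64;
            struct {
                    union kvm_mmu_page_role base;
                    union kvm_mmu_extended_role ext;
            };
    };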
-
By Paolo Bonzini
Just inline the contents into the sole caller; kvm_init_mmu is now public.

Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
-
By Vitaly Kuznetsov
When EPT is used for a nested guest, we need to re-init the MMU as a shadow EPT MMU (nested_ept_init_mmu_context() does that). When we return from L2 to L1, kvm_mmu_reset_context() in nested_vmx_load_cr3() resets the MMU back to normal TDP mode.

Add a special 'guest_mmu' so we can use separate root caches; the improved hit rate is not very important for single-vCPU performance, but it avoids contention on the mmu_lock for many vCPUs. On the nested CPUID benchmark, with 16 vCPUs, an L2->L1->L2 vmexit goes from 42k to 26k cycles.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Vitaly Kuznetsov
Add an option to specify which MMU root we want to free. This will be used when nested and non-nested MMUs for L1 are split.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
-
By Vitaly Kuznetsov
kvm_init_shadow_ept_mmu() doesn't set the get_pdptr() hook, and this is not a problem only because the MMU context is already initialized and the hook points to kvm_pdptr_read(). As we intend to use a dedicated MMU for the shadow EPT MMU, set this hook explicitly.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
-
By Vitaly Kuznetsov
As a preparation for the full MMU split between L1 and L2, make vcpu->arch.mmu a pointer to the currently used MMU. For now, this is always vcpu->arch.root_mmu. No functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
-
By Paolo Bonzini
The quote from the comment almost says it all: we are currently zeroing the guest dr6 in kvm_arch_vcpu_put, because do_debug expects it. However, the host %dr6 is either:

- zero because the guest hasn't run after kvm_arch_vcpu_load
- written from vcpu->arch.dr6 by vcpu_enter_guest
- written by the guest and copied to vcpu->arch.dr6 by ->sync_dirty_debug_regs().

Therefore, we can skip the write if vcpu->arch.dr6 is already zero. We may do extra useless writes if vcpu->arch.dr6 is nonzero but the guest hasn't run; however that is less important for performance.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-