1. 06 June 2018, 1 commit
    • x86/bugs: Add AMD's SPEC_CTRL MSR usage · 6ac2f49e
      Konrad Rzeszutek Wilk authored
      The AMD document outlining the SSBD handling,
      124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf,
      mentions that if CPUID 8000_0008.EBX[24] is set, we should use
      the SPEC_CTRL MSR (0x48) rather than the VIRT_SPEC_CTRL MSR
      (0xC001_011F) for speculative store bypass disable.
      
      In effect, this means we should clear the X86_FEATURE_VIRT_SSBD
      flag so that the SPEC_CTRL MSR is preferred.
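
      A minimal userspace sketch of the enumeration check (illustrative
      only; the kernel clears the feature flag in its cpufeature code,
      not like this):

          #include <stdio.h>
          #include <cpuid.h>

          int main(void)
          {
                  unsigned int eax, ebx, ecx, edx;

                  /* CPUID 8000_0008: EBX[24] = architectural SSBD support */
                  if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
                          return 1;

                  if (ebx & (1u << 24))
                          printf("AMD_SSBD: use SPEC_CTRL MSR (0x48)\n");
                  else
                          printf("no AMD_SSBD: use VIRT_SPEC_CTRL (0xC001_011F)\n");
                  return 0;
          }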
      
      See the document titled:
         124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
      
      A copy of this document is available at
         https://bugzilla.kernel.org/show_bug.cgi?id=199889
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: kvm@vger.kernel.org
      Cc: KarimAllah Ahmed <karahmed@amazon.de>
      Cc: andrew.cooper3@citrix.com
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Kees Cook <keescook@chromium.org>
      Link: https://lkml.kernel.org/r/20180601145921.9500-3-konrad.wilk@oracle.com
  2. 17 May 2018, 4 commits
  3. 03 May 2018, 2 commits
  4. 16 April 2018, 2 commits
  5. 11 April 2018, 1 commit
  6. 05 April 2018, 1 commit
  7. 29 March 2018, 2 commits
    • KVM: SVM: Implement pause loop exit logic in SVM · 8566ac8b
      Babu Moger authored
      Bring the PLE (pause loop exit) logic to the AMD SVM driver.
      
      While testing, we found this helps in situations where numerous
      pauses are generated. Without these patches we could see continuous
      VMEXITs due to pause interceptions. Tested on an AMD EPYC server with
      the boot parameter idle=poll on a VM with 32 vCPUs to simulate
      extensive pause behaviour. Here are the VMEXIT counts over a
      10-second interval.
      
                              Before                  After
      Pauses                  810199                  504
      Total                   882184                  325415
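
      A hedged sketch of the dynamic window handling brought over from
      VMX (names and defaults are illustrative, not the exact patch);
      note the floor at the initial value, per Radim's fixup below:

          /* Grow the pause filter count while a vCPU keeps PAUSE-exiting;
           * shrink it when vCPUs make progress, but never below the
           * initial value. */
          #define PLE_COUNT_INIT  3000
          #define PLE_COUNT_MAX   65535
          #define PLE_FACTOR      2

          static unsigned int grow_ple_window(unsigned int count)
          {
                  unsigned int new = count * PLE_FACTOR;

                  return new > PLE_COUNT_MAX ? PLE_COUNT_MAX : new;
          }

          static unsigned int shrink_ple_window(unsigned int count)
          {
                  unsigned int new = count / PLE_FACTOR;

                  return new < PLE_COUNT_INIT ? PLE_COUNT_INIT : new;
          }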
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      [Prevented the window from dropping below the initial value. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      8566ac8b
    • KVM: SVM: Add pause filter threshold · 1d8fb44a
      Babu Moger authored
      This patch adds support for the pause filter threshold. This feature
      is indicated by CPUID Fn8000_000A_EDX. See AMD APM Vol 2, Section
      15.14.4 Pause Intercept Filtering for more details.
      
      In this mode, a 16-bit pause filter threshold field is added to the
      VMCB. The threshold value is a cycle count used to reset the pause
      counter. As with simple pause filtering, VMRUN loads the pause count
      value from the VMCB into an internal counter. Then, on each pause
      instruction, the hardware checks the number of cycles elapsed since
      the most recent pause instruction against the pause filter threshold.
      If the elapsed cycle count is greater than the pause filter threshold,
      the internal pause count is reloaded from the VMCB and execution
      continues. If the elapsed cycle count is less than the pause filter
      threshold, the internal pause count is decremented. If the count value
      is less than zero and pause intercept is enabled, a #VMEXIT is
      triggered. If advanced pause filtering is supported and the pause
      filter threshold field is set to zero, the filter operates in the
      simpler, count-only mode.
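
      An illustrative model of the hardware behaviour just described
      (simplified; the APM section above is authoritative):

          struct pause_filter {
                  int            count;      /* internal counter, loaded at VMRUN */
                  unsigned short threshold;  /* cycles; 0 = count-only mode */
                  unsigned short vmcb_count; /* reload value from the VMCB */
          };

          /* Conceptually evaluated on each guest PAUSE; returns 1 when a
           * #VMEXIT(PAUSE) should be raised (intercept enabled). */
          static int on_guest_pause(struct pause_filter *f,
                                    unsigned long cycles_since_last_pause)
          {
                  if (f->threshold && cycles_since_last_pause > f->threshold) {
                          f->count = f->vmcb_count;  /* long gap: not a spin loop */
                          return 0;
                  }
                  return --f->count < 0;             /* tight loop: count down */
          }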
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
  8. 28 March 2018, 1 commit
    • KVM: x86: Fix perf timer mode IP reporting · dd60d217
      Andi Kleen authored
      KVM and perf have a special backdoor mechanism to report the IP for
      interrupts re-executed after VM exit. This works for the NMIs that
      perf normally uses.
      
      However, when perf is in timer mode it does not work, because the
      timer interrupt does not get this special treatment. This is common
      when KVM is running nested in another hypervisor which may not
      implement the PMU, so only timer mode is available.
      
      Call the functions that set up the backdoor IP for non-NMI interrupts
      too.
      
      I renamed the functions that set up the backdoor IP reporting to be
      more appropriate for their new use. The SVM change is only compile
      tested.
      
      v2: Moved the functions inline.
      For the normal interrupt case, the before/after functions are now
      called from x86.c, not arch-specific code.
      For the NMI case, we still need to call them in the
      architecture-specific code, because they are already needed in the
      low-level *_run functions.
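
      A compilable userspace model of the backdoor handoff (illustrative
      only; names are assumptions, and in the kernel the recorded vcpu is
      a per-cpu variable):

          #include <stdio.h>

          struct kvm_vcpu { unsigned long guest_rip; };

          static struct kvm_vcpu *current_vcpu;  /* per-cpu in the kernel */

          static void kvm_before_interrupt(struct kvm_vcpu *v) { current_vcpu = v; }
          static void kvm_after_interrupt(struct kvm_vcpu *v)  { current_vcpu = NULL; }

          /* perf swaps in the guest RIP when the sample hit guest context */
          static unsigned long perf_ip(unsigned long host_ip)
          {
                  return current_vcpu ? current_vcpu->guest_rip : host_ip;
          }

          int main(void)
          {
                  struct kvm_vcpu vcpu = { .guest_rip = 0xffffffff81000000UL };

                  kvm_before_interrupt(&vcpu);
                  printf("in guest: %#lx\n", perf_ip(0x1000));
                  kvm_after_interrupt(&vcpu);
                  printf("in host:  %#lx\n", perf_ip(0x1000));
                  return 0;
          }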
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      [Removed unnecessary calls from arch handle_external_intr. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
  9. 24 March 2018, 3 commits
  10. 17 March 2018, 5 commits
  11. 08 March 2018, 1 commit
  12. 02 March 2018, 3 commits
  13. 24 February 2018, 3 commits
    • KVM: SVM: Fix SEV LAUNCH_SECRET command · 9c5e0afa
      Brijesh Singh authored
      The SEV LAUNCH_SECRET command fails with error code 'invalid param'
      because we missed filling in the guest and header system physical
      addresses when issuing the command.
      
      Fixes: 9f5b5b95 (KVM: SVM: Add support for SEV LAUNCH_SECRET command)
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: install RSM intercept · 7607b717
      Brijesh Singh authored
      The RSM instruction is used by the SMM handler to return from SMM
      mode. Currently, RSM causes a #UD, which results in an instruction
      fetch, decode, and emulate cycle. By installing the RSM intercept we
      can avoid the instruction fetch, since we know the #VMEXIT was due
      to RSM.
      
      The patch is required for SEV guests, because SEV guest memory is
      encrypted with a guest-specific key and the hypervisor is not able
      to fetch the instruction bytes from guest memory.
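
      A hedged sketch of the intercept handler (simplified from the
      approach described above; the RSM opcode bytes 0x0F 0xAA are fed
      straight to the emulator so no guest-memory fetch is needed):

          static const u8 rsm_ins_bytes[] = "\x0f\xaa";  /* RSM opcode */

          static int rsm_interception(struct vcpu_svm *svm)
          {
                  /* The #VMEXIT already tells us it was RSM: skip the fetch
                   * and hand the known bytes to the x86 emulator directly. */
                  return x86_emulate_instruction(&svm->vcpu, 0, 0,
                                                 rsm_ins_bytes, 2) == EMULATE_DONE;
          }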
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE command · 3e233385
      Brijesh Singh authored
      Using access_ok() to validate the input before issuing the SEV
      command does not buy us anything in this case. If userland gives us
      a garbage pointer, copy_to_user() will catch it when we try to
      return the measurement.
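
      A minimal sketch of the resulting pattern (field names illustrative):

          /* copy_to_user() fails with -EFAULT on a bad user pointer, so a
           * separate access_ok() check beforehand adds nothing. */
          if (copy_to_user((void __user *)params.uaddr, blob, params.len))
                  ret = -EFAULT;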
      Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
      Fixes: 0d0736f7 (KVM: SVM: Add support for KVM_SEV_LAUNCH_MEASURE ...)
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  14. 23 February 2018, 2 commits
  15. 04 February 2018, 2 commits
    • KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL · b2ac58f9
      KarimAllah Ahmed authored
      [ Based on a patch from Paolo Bonzini <pbonzini@redhat.com> ]
      
      ... basically doing exactly what we do for VMX:
      
      - Pass through SPEC_CTRL to guests (if enabled in guest CPUID)
      - Save and restore SPEC_CTRL around VMExit and VMEntry only if the
        guest actually used it (a sketch of this pattern follows the list).
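
      A hedged sketch of that save/restore pattern (simplified; not the
      exact diff):

          /* Restore the guest's value only if it ever wrote one ... */
          if (svm->spec_ctrl)
                  wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);

          /* ... VMRUN ... #VMEXIT ... */

          /* ... and save/clear it afterwards so the host runs with 0. */
          rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
          if (svm->spec_ctrl)
                  wrmsrl(MSR_IA32_SPEC_CTRL, 0);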
      Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jun Nakajima <jun.nakajima@intel.com>
      Cc: kvm@vger.kernel.org
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de
    • KVM/x86: Add IBPB support · 15d45071
      Ashok Raj authored
      The Indirect Branch Predictor Barrier (IBPB) is an indirect branch
      control mechanism. It keeps earlier branches from influencing
      later ones.
      
      Unlike IBRS and STIBP, IBPB does not define a new mode of operation.
      It is a command that ensures predicted branch targets are not used
      after the barrier. Although IBRS and IBPB are enumerated together in
      CPUID, IBPB is very different.
      
      IBPB helps mitigate three potential attacks:
      
      * Guests attacking other guests.
        - This is addressed by issuing an IBPB when we do a guest switch.
      
      * Attacks from guest/ring3 to host/ring3.
        These would require an IBPB during context switch in the host, or
        after VMEXIT. The host process has two ways to mitigate:
        - Either it can be compiled with retpoline.
        - If it is going through a context switch and has set !dumpable,
          then there is an IBPB in that path.
          (Tim's patch: https://patchwork.kernel.org/patch/10192871)
        - The case where you return to Qemu after a VMEXIT may make Qemu
          attackable from the guest when Qemu is not compiled with
          retpoline.
        Doing an IBPB on every VMEXIT has been reported to cause TSC
        calibration problems in the guest.
      
      * Attacks from guest/ring0 to host/ring0.
        When the host kernel is using retpoline it is safe against these
        attacks. If the host kernel is not using retpoline, we might need
        to do an IBPB flush on every VMEXIT.
      
      Even when using retpoline for indirect calls, in certain conditions 'ret'
      can use the BTB on Skylake-era CPUs. There are other mitigations
      available like RSB stuffing/clearing.
      
      * IBPB is issued only for SVM during svm_free_vcpu().
        VMX has a vmclear and SVM doesn't. See the discussion here:
        https://lkml.org/lkml/2018/1/15/146
      
      Refer to the following page for details on the enumeration and
      control, and for documentation about the mitigations:
      
      https://software.intel.com/en-us/side-channel-security-support
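
      A minimal sketch of the barrier itself (the kernel wraps this in an
      alternatives-patched helper; shown raw here):

          #define MSR_IA32_PRED_CMD  0x00000049
          #define PRED_CMD_IBPB      (1UL << 0)

          /* PRED_CMD is a write-only command MSR: writing bit 0 flushes
           * indirect branch predictions; there is nothing to read back. */
          static inline void indirect_branch_prediction_barrier(void)
          {
                  wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
          }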
      
      [peterz: rebase and changelog rewrite]
      [karahmed: - rebase
                 - vmx: expose PRED_CMD if guest has it in CPUID
                 - svm: only pass through IBPB if guest has it in CPUID
                 - vmx: support !cpu_has_vmx_msr_bitmap()
                 - vmx: support nested]
      [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
              PRED_CMD is a write-only MSR]
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: kvm@vger.kernel.org
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Jun Nakajima <jun.nakajima@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com
      Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de
  16. 16 January 2018, 2 commits
  17. 12 January 2018, 1 commit
  18. 11 January 2018, 1 commit
  19. 05 January 2018, 1 commit
  20. 14 December 2017, 1 commit
    • KVM: x86: add support for emulating UMIP · 66336cab
      Paolo Bonzini authored
      The User-Mode Instruction Prevention (UMIP) feature present in recent
      Intel processors prevents a group of instructions (sgdt, sidt, sldt,
      smsw, and str) from being executed with CPL > 0; executing them there
      raises a general protection fault.
      
      UMIP-protected instructions can also trigger vmexits, so we can
      actually emulate UMIP on older processors. This commit sets up the
      infrastructure so that kvm-intel.ko and kvm-amd.ko can set the UMIP
      feature bit for CPUID even if the feature is not actually available
      in hardware.
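
      A small userspace probe of what UMIP enforces (x86-only, illustrative;
      SMSW is one of the affected instructions):

          #include <stdio.h>

          int main(void)
          {
                  unsigned long cr0_bits;

                  /* With CR4.UMIP set this raises #GP (SIGSEGV) at CPL 3;
                   * without UMIP it leaks the low CR0 bits to user mode. */
                  asm volatile("smsw %0" : "=r"(cr0_bits));
                  printf("smsw => %#lx\n", cr0_bits);
                  return 0;
          }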
      Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  21. 05 December 2017, 1 commit
    • KVM: X86: Restart the guest when insn_len is zero and SEV is enabled · 00b10fe1
      Brijesh Singh authored
      On AMD platforms, under certain conditions insn_len may be zero on
      #NPF. This can happen if a guest gets a page fault on a data access
      but the hardware table walker is not able to read the instruction
      page (e.g. the instruction page is not present in memory).
      
      Typically, when insn_len is zero, x86_emulate_instruction() walks the
      guest page table and fetches the instruction bytes from guest memory.
      When SEV is enabled, guest memory is encrypted with a guest-specific
      key, so the hypervisor is not able to fetch the instruction bytes.
      In those cases we simply restart the guest.
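
      A hedged sketch of the idea (not the exact patch):

          /* No instruction bytes (#NPF with insn_len == 0) and guest memory
           * is SEV-encrypted: decoding is impossible, so skip emulation and
           * re-enter the guest to let it retry the access. */
          if (unlikely(insn_len == 0 && sev_guest(vcpu->kvm)))
                  return 1;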
      
      I have encountered this issue when running kernbench inside the guest.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: x86@kernel.org
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>