Unverified · Commit e02a5e30 authored by openeuler-ci-bot, committed by Gitee

!109 SPR: KVM: Notify VM exit support

Merge Pull Request from: @allen-shi 
 
Virtual machines can exploit Intel ISA characteristics to cause a
functional denial of service to the VMM. Introduce a new feature
named Notify VM exit, which helps mitigate such attacks.

Intel-kernel issue:
[#I5PAJ5:SPR:KVM:Notify VM exit](https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5)

Test:
1. KVM Sanity Test
2. Run the kernel normally on openEuler 22.03 LTS

Known issue:
N/A 
 
Link: https://gitee.com/openeuler/kernel/pulls/109
Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Reviewed-by: Kevin Zhu <zhukeqian1@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> 
@@ -1084,6 +1084,10 @@ The following bits are defined in the flags field:
fields contain a valid state. This bit will be set whenever
KVM_CAP_EXCEPTION_PAYLOAD is enabled.
- KVM_VCPUEVENT_VALID_TRIPLE_FAULT may be set to signal that the
triple_fault_pending field contains a valid state. This bit will
be set whenever KVM_CAP_X86_TRIPLE_FAULT_EVENT is enabled.
ARM/ARM64:
^^^^^^^^^^
@@ -1179,6 +1183,10 @@ can be set in the flags field to signal that the
exception_has_payload, exception_payload, and exception.pending fields
contain a valid state and shall be written into the VCPU.
If KVM_CAP_X86_TRIPLE_FAULT_EVENT is enabled, KVM_VCPUEVENT_VALID_TRIPLE_FAULT
can be set in flags field to signal that the triple_fault field contains
a valid state and shall be written into the VCPU.
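A minimal sketch (not part of this patch) of how userspace might inject a
pending triple fault with these ioctls, assuming `vcpu_fd` is an open vCPU
file descriptor and KVM_CAP_X86_TRIPLE_FAULT_EVENT was enabled on the VM;
the selftest at the end of this PR follows the same pattern:

    /* needs <linux/kvm.h>, <sys/ioctl.h>, <err.h> */
    struct kvm_vcpu_events events;

    if (ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &events) < 0)
        err(1, "KVM_GET_VCPU_EVENTS");
    /* Mark the triple_fault field valid and request a pending event. */
    events.flags |= KVM_VCPUEVENT_VALID_TRIPLE_FAULT;
    events.triple_fault.pending = 1;
    if (ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events) < 0)
        err(1, "KVM_SET_VCPU_EVENTS");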
ARM/ARM64:
^^^^^^^^^^
@@ -5488,6 +5496,26 @@ array field represents return values. The userspace should update the return
values of SBI call before resuming the VCPU. For more details on RISC-V SBI
spec refer, https://github.com/riscv/riscv-sbi-doc.
::
/* KVM_EXIT_NOTIFY */
struct {
#define KVM_NOTIFY_CONTEXT_INVALID (1 << 0)
__u32 flags;
} notify;
Used on x86 systems. When the VM capability KVM_CAP_X86_NOTIFY_VMEXIT is
enabled, a VM exit is generated if no event window occurs in VM non-root mode
for a specified amount of time. If KVM_X86_NOTIFY_VMEXIT_USER was set when
enabling the cap, KVM exits to userspace with the exit reason
KVM_EXIT_NOTIFY for further handling. The "flags" field contains more
detailed information.
The valid value for 'flags' is:
- KVM_NOTIFY_CONTEXT_INVALID -- the VM context is corrupted and no longer
valid in the VMCS. Resuming the target VM would produce unpredictable
results.
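A minimal sketch (not part of this patch) of a VMM run loop consuming this
exit, assuming `run` is the mmap'ed kvm_run structure and `vcpu_fd` is the
vCPU file descriptor:

    for (;;) {
        if (ioctl(vcpu_fd, KVM_RUN, 0) < 0)
            err(1, "KVM_RUN");
        switch (run->exit_reason) {
        case KVM_EXIT_NOTIFY:
            /* A corrupted guest context must not be resumed. */
            if (run->notify.flags & KVM_NOTIFY_CONTEXT_INVALID)
                errx(1, "guest context invalid, terminating VM");
            /* Context still valid: the vCPU may simply be re-run. */
            break;
        /* ... other exit reasons ... */
        }
    }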
::
/* Fix the size of the union. */
@@ -6280,6 +6308,35 @@ the bus lock vm exit can be preempted by a higher priority VM exit, the exit
notifications to userspace can be KVM_EXIT_BUS_LOCK or other reasons.
KVM_RUN_BUS_LOCK flag is used to distinguish between them.
7.25 KVM_CAP_X86_NOTIFY_VMEXIT
------------------------------
:Architectures: x86
:Target: VM
:Parameters: args[0] is the value of notify window as well as some flags
:Returns: 0 on success, -EINVAL if args[0] contains invalid flags or notify
VM exit is unsupported.
Bits 63:32 of args[0] hold the notify window.
Bits 31:0 of args[0] carry flags. Valid bits are::
#define KVM_X86_NOTIFY_VMEXIT_ENABLED (1 << 0)
#define KVM_X86_NOTIFY_VMEXIT_USER (1 << 1)
This capability allows userspace to configure the notify VM exit on/off
in per-VM scope during VM creation. Notify VM exit is disabled by default.
When userspace sets the KVM_X86_NOTIFY_VMEXIT_ENABLED bit in args[0], KVM
enables this feature with the notify window provided, which generates
a VM exit if no event window occurs in VM non-root mode for a specified
amount of time (the notify window).
If KVM_X86_NOTIFY_VMEXIT_USER is set in args[0], KVM exits to userspace
for handling whenever a notify VM exit occurs.
This capability aims to mitigate the threat that malicious VMs can cause
the CPU to get stuck (because no event window opens up) and make the CPU
unavailable to the host or other VMs.
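A minimal sketch (not part of this patch) of enabling the capability with a
hypothetical notify window of 131072 cycles, assuming `vm_fd` is the VM file
descriptor and no vCPUs have been created yet (the window and flags can only
be set before vCPU creation):

    struct kvm_enable_cap cap = {
        .cap = KVM_CAP_X86_NOTIFY_VMEXIT,
        /* Bits 63:32 carry the notify window, bits 31:0 the flags. */
        .args[0] = ((__u64)131072 << 32) |
                   KVM_X86_NOTIFY_VMEXIT_ENABLED |
                   KVM_X86_NOTIFY_VMEXIT_USER,
    };

    if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
        err(1, "KVM_ENABLE_CAP(KVM_CAP_X86_NOTIFY_VMEXIT)");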
8. Other capabilities.
======================
......
@@ -55,6 +55,9 @@
#define KVM_BUS_LOCK_DETECTION_VALID_MODE (KVM_BUS_LOCK_DETECTION_OFF | \
KVM_BUS_LOCK_DETECTION_EXIT)
#define KVM_X86_NOTIFY_VMEXIT_VALID_BITS (KVM_X86_NOTIFY_VMEXIT_ENABLED | \
KVM_X86_NOTIFY_VMEXIT_USER)
/* x86-specific vcpu->requests bit members */
#define KVM_REQ_MIGRATE_TIMER KVM_ARCH_REQ(0)
#define KVM_REQ_REPORT_TPR_ACCESS KVM_ARCH_REQ(1)
@@ -996,8 +999,13 @@ struct kvm_arch {
bool guest_can_read_msr_platform_info;
bool exception_payload_enabled;
bool triple_fault_event;
bool bus_lock_detection_enabled;
u32 notify_window;
u32 notify_vmexit_flags;
/* Guest can access the SGX PROVISIONKEY. */
bool sgx_provisioning_allowed;
@@ -1090,6 +1098,7 @@ struct kvm_vcpu_stat {
u64 preemption_reported;
u64 preemption_other;
u64 preemption_timer_exits;
u64 notify_window_exits;
};
struct x86_instruction_info;
@@ -1440,6 +1449,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
extern u64 kvm_default_tsc_scaling_ratio;
/* bus lock detection supported? */
extern bool kvm_has_bus_lock_exit;
/* notify vmexit supported? */
extern bool kvm_has_notify_vmexit;
extern u64 kvm_mce_cap_supported;
......
@@ -75,6 +75,7 @@
#define SECONDARY_EXEC_TSC_SCALING VMCS_CONTROL_BIT(TSC_SCALING)
#define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE VMCS_CONTROL_BIT(USR_WAIT_PAUSE)
#define SECONDARY_EXEC_BUS_LOCK_DETECTION VMCS_CONTROL_BIT(BUS_LOCK_DETECTION)
#define SECONDARY_EXEC_NOTIFY_VM_EXITING VMCS_CONTROL_BIT(NOTIFY_VM_EXITING)
/*
* Definitions of Tertiary Processor-Based VM-Execution Controls.
@@ -279,6 +280,7 @@ enum vmcs_field {
SECONDARY_VM_EXEC_CONTROL = 0x0000401e,
PLE_GAP = 0x00004020,
PLE_WINDOW = 0x00004022,
NOTIFY_WINDOW = 0x00004024,
VM_INSTRUCTION_ERROR = 0x00004400,
VM_EXIT_REASON = 0x00004402,
VM_EXIT_INTR_INFO = 0x00004404,
@@ -565,6 +567,11 @@ enum vm_entry_failure_code {
#define EPT_VIOLATION_EXECUTABLE (1 << EPT_VIOLATION_EXECUTABLE_BIT)
#define EPT_VIOLATION_GVA_TRANSLATED (1 << EPT_VIOLATION_GVA_TRANSLATED_BIT)
/*
* Exit Qualifications for NOTIFY VM EXIT
*/
#define NOTIFY_VM_CONTEXT_INVALID BIT(0)
/*
* VM-instruction error numbers
*/
......
@@ -85,6 +85,7 @@
#define VMX_FEATURE_USR_WAIT_PAUSE ( 2*32+ 26) /* Enable TPAUSE, UMONITOR, UMWAIT in guest */
#define VMX_FEATURE_ENCLV_EXITING ( 2*32+ 28) /* "" VM-Exit on ENCLV (leaf dependent) */
#define VMX_FEATURE_BUS_LOCK_DETECTION ( 2*32+ 30) /* "" VM-Exit when bus lock caused */
#define VMX_FEATURE_NOTIFY_VM_EXITING ( 2*32+ 31) /* VM-Exit when no event windows after notify window */
/* Tertiary Processor-Based VM-Execution Controls, word 3 */
#define VMX_FEATURE_IPI_VIRT ( 3*32+ 4) /* Enable IPI virtualization */
......
@@ -310,6 +310,7 @@ struct kvm_reinject_control {
#define KVM_VCPUEVENT_VALID_SHADOW 0x00000004
#define KVM_VCPUEVENT_VALID_SMM 0x00000008
#define KVM_VCPUEVENT_VALID_PAYLOAD 0x00000010
#define KVM_VCPUEVENT_VALID_TRIPLE_FAULT 0x00000020
/* Interrupt shadow states */
#define KVM_X86_SHADOW_INT_MOV_SS 0x01
@@ -344,7 +345,10 @@ struct kvm_vcpu_events {
__u8 smm_inside_nmi;
__u8 latched_init;
} smi;
struct {
__u8 pending;
} triple_fault;
__u8 reserved[26];
__u8 exception_has_payload;
__u64 exception_payload;
};
......
@@ -90,6 +90,7 @@
#define EXIT_REASON_UMWAIT 67
#define EXIT_REASON_TPAUSE 68
#define EXIT_REASON_BUS_LOCK 74
#define EXIT_REASON_NOTIFY 75
#define VMX_EXIT_REASONS \
{ EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \
@@ -151,7 +152,8 @@
{ EXIT_REASON_XRSTORS, "XRSTORS" }, \
{ EXIT_REASON_UMWAIT, "UMWAIT" }, \
{ EXIT_REASON_TPAUSE, "TPAUSE" }, \
{ EXIT_REASON_BUS_LOCK, "BUS_LOCK" }, \
{ EXIT_REASON_NOTIFY, "NOTIFY" }
#define VMX_EXIT_REASON_FLAGS \
{ VMX_EXIT_REASONS_FAILED_VMENTRY, "FAILED_VMENTRY" }
......
@@ -417,4 +417,10 @@ static inline u64 vmx_supported_debugctl(void)
return debugctl;
}
static inline bool cpu_has_notify_vmexit(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &
SECONDARY_EXEC_NOTIFY_VM_EXITING;
}
#endif /* __KVM_X86_VMX_CAPS_H */
@@ -2157,6 +2157,8 @@ static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
{
struct kvm *kvm = vmx->vcpu.kvm;
/*
* If vmcs02 hasn't been initialized, set the constant vmcs02 state
* according to L0's settings (vmcs12 is irrelevant here). Host
@@ -2201,6 +2203,9 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
if (cpu_has_vmx_encls_vmexit())
vmcs_write64(ENCLS_EXITING_BITMAP, -1ull);
if (kvm_notify_vmexit_enabled(kvm))
vmcs_write32(NOTIFY_WINDOW, kvm->arch.notify_window);
/*
* Set the MSR load/store lists to match L0's settings. Only the
* addresses are constant (for vmcs02), the counts can change based
@@ -5985,6 +5990,9 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE);
case EXIT_REASON_ENCLS:
return nested_vmx_exit_handled_encls(vcpu, vmcs12);
case EXIT_REASON_NOTIFY:
/* Notify VM exit is not exposed to L1 */
return false;
default:
return true;
}
......
@@ -2626,7 +2626,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
SECONDARY_EXEC_PT_USE_GPA |
SECONDARY_EXEC_PT_CONCEAL_VMX |
SECONDARY_EXEC_ENABLE_VMFUNC |
SECONDARY_EXEC_BUS_LOCK_DETECTION |
SECONDARY_EXEC_NOTIFY_VM_EXITING;
if (cpu_has_sgx())
opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
if (adjust_vmx_controls(min2, opt2,
@@ -4504,6 +4505,9 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
if (!vcpu->kvm->arch.bus_lock_detection_enabled)
exec_control &= ~SECONDARY_EXEC_BUS_LOCK_DETECTION;
if (!kvm_notify_vmexit_enabled(vcpu->kvm))
exec_control &= ~SECONDARY_EXEC_NOTIFY_VM_EXITING;
vmx->secondary_exec_control = exec_control;
}
@@ -4600,6 +4604,9 @@ static void init_vmcs(struct vcpu_vmx *vmx)
vmx->ple_window_dirty = true;
}
if (kvm_notify_vmexit_enabled(vmx->vcpu.kvm))
vmcs_write32(NOTIFY_WINDOW, vmx->vcpu.kvm->arch.notify_window);
vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, 0);
vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, 0);
vmcs_write32(CR3_TARGET_COUNT, 0); /* 22.2.1 */
@@ -5940,6 +5947,32 @@ static int handle_bus_lock_vmexit(struct kvm_vcpu *vcpu)
return 1;
}
static int handle_notify(struct kvm_vcpu *vcpu)
{
unsigned long exit_qual = vmx_get_exit_qual(vcpu);
bool context_invalid = exit_qual & NOTIFY_VM_CONTEXT_INVALID;
++vcpu->stat.notify_window_exits;
/*
* Notify VM exit happened while executing iret from NMI,
* "blocked by NMI" bit has to be set before next VM entry.
*/
if (enable_vnmi && (exit_qual & INTR_INFO_UNBLOCK_NMI))
vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
GUEST_INTR_STATE_NMI);
if (vcpu->kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_USER ||
context_invalid) {
vcpu->run->exit_reason = KVM_EXIT_NOTIFY;
vcpu->run->notify.flags = context_invalid ?
KVM_NOTIFY_CONTEXT_INVALID : 0;
return 0;
}
return 1;
}
/*
* The exit handlers return 1 if the exit was handled fully and guest execution
* may resume. Otherwise they set the kvm_run parameter to indicate what needs
@@ -5997,6 +6030,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[EXIT_REASON_PREEMPTION_TIMER] = handle_preemption_timer,
[EXIT_REASON_ENCLS] = handle_encls,
[EXIT_REASON_BUS_LOCK] = handle_bus_lock_vmexit,
[EXIT_REASON_NOTIFY] = handle_notify,
};
static const int kvm_vmx_max_exit_handlers =
@@ -6332,7 +6366,8 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
exit_reason.basic != EXIT_REASON_EPT_VIOLATION &&
exit_reason.basic != EXIT_REASON_PML_FULL &&
exit_reason.basic != EXIT_REASON_APIC_ACCESS &&
exit_reason.basic != EXIT_REASON_TASK_SWITCH &&
exit_reason.basic != EXIT_REASON_NOTIFY)) {
int ndata = 3;
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
@@ -8215,6 +8250,7 @@ static __init int hardware_setup(void)
}
kvm_has_bus_lock_exit = cpu_has_vmx_bus_lock_detection();
kvm_has_notify_vmexit = cpu_has_notify_vmexit();
set_bit(0, vmx_vpid_bitmap); /* 0 is reserved for host */
......
@@ -143,6 +143,8 @@ u64 __read_mostly kvm_default_tsc_scaling_ratio;
EXPORT_SYMBOL_GPL(kvm_default_tsc_scaling_ratio);
bool __read_mostly kvm_has_bus_lock_exit;
EXPORT_SYMBOL_GPL(kvm_has_bus_lock_exit);
bool __read_mostly kvm_has_notify_vmexit;
EXPORT_SYMBOL_GPL(kvm_has_notify_vmexit);
/* tsc tolerance in parts per million - default to 1/2 of the NTP threshold */
static u32 __read_mostly tsc_tolerance_ppm = 250;
@@ -240,6 +242,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
VCPU_STAT("halt_poll_fail_ns", halt_poll_fail_ns),
VCPU_STAT("preemption_reported", preemption_reported),
VCPU_STAT("preemption_other", preemption_other),
VCPU_STAT("notify_window_exits", notify_window_exits),
VM_STAT("mmu_shadow_zapped", mmu_shadow_zapped), VM_STAT("mmu_shadow_zapped", mmu_shadow_zapped),
VM_STAT("mmu_pte_write", mmu_pte_write), VM_STAT("mmu_pte_write", mmu_pte_write),
VM_STAT("mmu_pde_zapped", mmu_pde_zapped), VM_STAT("mmu_pde_zapped", mmu_pde_zapped),
...@@ -3845,6 +3848,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) ...@@ -3845,6 +3848,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_GET_MSR_FEATURES: case KVM_CAP_GET_MSR_FEATURES:
case KVM_CAP_MSR_PLATFORM_INFO: case KVM_CAP_MSR_PLATFORM_INFO:
case KVM_CAP_EXCEPTION_PAYLOAD: case KVM_CAP_EXCEPTION_PAYLOAD:
case KVM_CAP_X86_TRIPLE_FAULT_EVENT:
case KVM_CAP_SET_GUEST_DEBUG:
case KVM_CAP_LAST_CPU:
case KVM_CAP_X86_USER_SPACE_MSR:
@@ -3928,6 +3932,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
else
r = 0;
break;
case KVM_CAP_X86_NOTIFY_VMEXIT:
r = kvm_has_notify_vmexit;
break;
default:
break;
}
@@ -4416,6 +4423,10 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
| KVM_VCPUEVENT_VALID_SMM);
if (vcpu->kvm->arch.exception_payload_enabled)
events->flags |= KVM_VCPUEVENT_VALID_PAYLOAD;
if (vcpu->kvm->arch.triple_fault_event) {
events->triple_fault.pending = kvm_test_request(KVM_REQ_TRIPLE_FAULT, vcpu);
events->flags |= KVM_VCPUEVENT_VALID_TRIPLE_FAULT;
}
memset(&events->reserved, 0, sizeof(events->reserved));
}
@@ -4429,7 +4440,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
| KVM_VCPUEVENT_VALID_SIPI_VECTOR
| KVM_VCPUEVENT_VALID_SHADOW
| KVM_VCPUEVENT_VALID_SMM
| KVM_VCPUEVENT_VALID_PAYLOAD
| KVM_VCPUEVENT_VALID_TRIPLE_FAULT))
return -EINVAL;
if (events->flags & KVM_VCPUEVENT_VALID_PAYLOAD) {
@@ -4507,6 +4519,15 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
}
}
if (events->flags & KVM_VCPUEVENT_VALID_TRIPLE_FAULT) {
if (!vcpu->kvm->arch.triple_fault_event)
return -EINVAL;
if (events->triple_fault.pending)
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
else
kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);
}
kvm_make_request(KVM_REQ_EVENT, vcpu);
return 0;
@@ -5309,6 +5330,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
kvm->arch.exception_payload_enabled = cap->args[0];
r = 0;
break;
case KVM_CAP_X86_TRIPLE_FAULT_EVENT:
kvm->arch.triple_fault_event = cap->args[0];
r = 0;
break;
case KVM_CAP_X86_USER_SPACE_MSR:
kvm->arch.user_space_msr_mask = cap->args[0];
r = 0;
@@ -5358,6 +5383,22 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
kvm->arch.bus_lock_detection_enabled = true;
r = 0;
break;
case KVM_CAP_X86_NOTIFY_VMEXIT:
r = -EINVAL;
if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
break;
if (!kvm_has_notify_vmexit)
break;
if (!((u32)cap->args[0] & KVM_X86_NOTIFY_VMEXIT_ENABLED))
break;
mutex_lock(&kvm->lock);
if (!kvm->created_vcpus) {
kvm->arch.notify_window = cap->args[0] >> 32;
kvm->arch.notify_vmexit_flags = (u32)cap->args[0];
r = 0;
}
mutex_unlock(&kvm->lock);
break;
default:
r = -EINVAL;
break;
......
@@ -340,6 +340,11 @@ static inline bool kvm_cstate_in_guest(struct kvm *kvm)
DECLARE_PER_CPU(struct kvm_vcpu *, current_vcpu);
static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm)
{
return kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_ENABLED;
}
static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu)
{
__this_cpu_write(current_vcpu, vcpu);
......
@@ -252,6 +252,7 @@ struct kvm_hyperv_exit {
#define KVM_EXIT_X86_WRMSR 30
#define KVM_EXIT_RISCV_SBI 31
#define KVM_EXIT_X86_BUS_LOCK 33
#define KVM_EXIT_NOTIFY 37
/* For KVM_EXIT_INTERNAL_ERROR */
/* Emulate instruction failed. */
@@ -435,6 +436,11 @@ struct kvm_run {
unsigned long args[6];
unsigned long ret[2];
} riscv_sbi;
/* KVM_EXIT_NOTIFY */
struct {
#define KVM_NOTIFY_CONTEXT_INVALID (1 << 0)
__u32 flags;
} notify;
/* Fix the size of the union. */
char padding[256];
};
@@ -1064,6 +1070,8 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 190
#define KVM_CAP_X86_BUS_LOCK_EXIT 193
#define KVM_CAP_SGX_ATTRIBUTE 196
#define KVM_CAP_X86_TRIPLE_FAULT_EVENT 218
#define KVM_CAP_X86_NOTIFY_VMEXIT 219
#define KVM_CAP_ARM_CPU_FEATURE 555
@@ -1741,5 +1749,8 @@ struct kvm_hyperv_eventfd {
#define KVM_BUS_LOCK_DETECTION_OFF (1 << 0)
#define KVM_BUS_LOCK_DETECTION_EXIT (1 << 1)
/* Available with KVM_CAP_X86_NOTIFY_VMEXIT */
#define KVM_X86_NOTIFY_VMEXIT_ENABLED (1ULL << 0)
#define KVM_X86_NOTIFY_VMEXIT_USER (1ULL << 1)
#endif /* __LINUX_KVM_H */
@@ -26,6 +26,7 @@
/x86_64/vmx_tsc_adjust_test
/x86_64/xss_msr_test
/clear_dirty_log_test
/x86_64/triple_fault_event_test
/demand_paging_test
/dirty_log_test
/dirty_log_perf_test
......
@@ -60,6 +60,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/debug_regs
TEST_GEN_PROGS_x86_64 += x86_64/tsc_msrs_test
TEST_GEN_PROGS_x86_64 += x86_64/user_msr_test
TEST_GEN_PROGS_x86_64 += x86_64/max_vcpuid_cap_test
TEST_GEN_PROGS_x86_64 += x86_64/triple_fault_event_test
TEST_GEN_PROGS_x86_64 += demand_paging_test
TEST_GEN_PROGS_x86_64 += dirty_log_test
TEST_GEN_PROGS_x86_64 += dirty_log_perf_test
......
// SPDX-License-Identifier: GPL-2.0-only
#include "test_util.h"
#include "kvm_util.h"
#include "processor.h"
#include "vmx.h"
#include <string.h>
#include <sys/ioctl.h>
#include "kselftest.h"
#define VCPU_ID 0
#define ARBITRARY_IO_PORT 0x2000
/* The virtual machine object. */
static struct kvm_vm *vm;
static void l2_guest_code(void)
{
asm volatile("inb %%dx, %%al"
: : [port] "d" (ARBITRARY_IO_PORT) : "rax");
}
void l1_guest_code(struct vmx_pages *vmx)
{
#define L2_GUEST_STACK_SIZE 64
unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
GUEST_ASSERT(vmx->vmcs_gpa);
GUEST_ASSERT(prepare_for_vmx_operation(vmx));
GUEST_ASSERT(load_vmcs(vmx));
prepare_vmcs(vmx, l2_guest_code,
&l2_guest_stack[L2_GUEST_STACK_SIZE]);
GUEST_ASSERT(!vmlaunch());
/* L2 should triple fault after a triple fault event injected. */
GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_TRIPLE_FAULT);
GUEST_DONE();
}
int main(void)
{
struct kvm_run *run;
struct kvm_vcpu_events events;
vm_vaddr_t vmx_pages_gva;
struct ucall uc;
struct kvm_enable_cap cap = {
.cap = KVM_CAP_X86_TRIPLE_FAULT_EVENT,
.args = {1}
};
if (!nested_vmx_supported()) {
print_skip("Nested VMX not supported");
exit(KSFT_SKIP);
}
if (!kvm_check_cap(KVM_CAP_X86_TRIPLE_FAULT_EVENT)) {
print_skip("KVM_CAP_X86_TRIPLE_FAULT_EVENT not supported");
exit(KSFT_SKIP);
}
vm = vm_create_default(VCPU_ID, 0, (void *) l1_guest_code);
vm_enable_cap(vm, &cap);
run = vcpu_state(vm, VCPU_ID);
vcpu_alloc_vmx(vm, &vmx_pages_gva);
vcpu_args_set(vm, VCPU_ID, 1, vmx_pages_gva);
vcpu_run(vm, VCPU_ID);
TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
"Expected KVM_EXIT_IO, got: %u (%s)\n",
run->exit_reason, exit_reason_str(run->exit_reason));
TEST_ASSERT(run->io.port == ARBITRARY_IO_PORT,
"Expected IN from port %d from L2, got port %d",
ARBITRARY_IO_PORT, run->io.port);
vcpu_events_get(vm, VCPU_ID, &events);
events.flags |= KVM_VCPUEVENT_VALID_TRIPLE_FAULT;
events.triple_fault.pending = true;
vcpu_events_set(vm, VCPU_ID, &events);
run->immediate_exit = true;
vcpu_run_complete_io(vm, VCPU_ID);
vcpu_events_get(vm, VCPU_ID, &events);
TEST_ASSERT(events.flags & KVM_VCPUEVENT_VALID_TRIPLE_FAULT,
"Triple fault event invalid");
TEST_ASSERT(events.triple_fault.pending,
"No triple fault pending");
vcpu_run(vm, VCPU_ID);
switch (get_ucall(vm, VCPU_ID, &uc)) {
case UCALL_DONE:
break;
case UCALL_ABORT:
TEST_FAIL("%s", (const char *)uc.args[0]);
default:
TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
}
}