- 24 Sep 2014, 4 commits
-
-
Committed by Nadav Amit
A guest that sets the PAT CR to an invalid value should get a #GP. Currently, if vmx supports loading the PAT CR during entry, the value is not checked. This patch adds the required check in that case. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
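As an illustration, here is a minimal standalone sketch of the rule such a check enforces: each of the eight byte-wide PAT entries must encode an architecturally defined memory type, and values 2, 3, and anything above 7 are reserved. The helper name and demo values are assumptions, not the kernel code.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Each 8-bit PAT entry holds a 3-bit memory type; types 2 and 3
     * are reserved, and the upper bits of each entry must be zero. */
    static bool pat_value_valid(uint64_t pat)
    {
        for (int i = 0; i < 8; i++) {
            uint8_t type = (pat >> (8 * i)) & 0xff;
            if (type == 2 || type == 3 || type > 7)
                return false;
        }
        return true;
    }

    int main(void)
    {
        printf("%d\n", pat_value_valid(0x0007040600070406ULL)); /* power-on default: valid */
        printf("%d\n", pat_value_valid(0x0000000000000002ULL)); /* reserved type 2: invalid */
        return 0;
    }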
-
Committed by Liang Chen
A one-line wrapper around kvm_make_request is not particularly useful. Replace kvm_mmu_flush_tlb() with kvm_make_request(). Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Marcelo Tosatti
Initialization of an L2 guest with -cpu host, on an L1 guest with -cpu host, triggers: (qemu) KVM: entry failed, hardware error 0x7 ... nested_vmx_run: VMCS MSR_{LOAD,STORE} unsupported Nested VMX MSR load/store support is not sufficient to allow perf for the L2 guest. Until properly fixed, trap CPUID and disable function 0xA. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini
In init_rmode_tss(), there are two variables indicating the return value, r and ret, and the function returns 0 on error, 1 on success. The function is only called by vmx_set_tss_addr(), and ret is redundant. This patch removes the redundant variable by making init_rmode_tss() return 0 on success, -errno on failure. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
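A minimal sketch of the resulting convention; the function body and the failure source are stand-ins for illustration only:

    #include <errno.h>
    #include <stdio.h>

    /* Stand-in for the reworked function: a single result variable,
     * 0 on success and -errno on failure. */
    static int init_rmode_tss(int simulate_failure)
    {
        if (simulate_failure)
            return -ENOMEM;   /* e.g. a failed page allocation */
        /* ... TSS setup would happen here ... */
        return 0;
    }

    int main(void)
    {
        printf("%d %d\n", init_rmode_tss(0), init_rmode_tss(1));
        return 0;
    }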
-
- 17 Sep 2014, 2 commits
-
-
Committed by Tang Chen
In init_rmode_identity_map(), there are two variables indicating the return value, r and ret, and the function returns 0 on error, 1 on success. The function is only called by vmx_create_vcpu(), and ret is redundant. This patch removes the redundant variable and makes init_rmode_identity_map() return 0 on success, -errno on failure. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Tang Chen
kvm_arch->ept_identity_pagetable holds the ept identity pagetable page, but it is never used to refer to the page at all. In vcpu initialization, it indicates two things: 1. whether the ept page is allocated; 2. whether a memory slot for the identity page is initialized. Actually, kvm_arch->ept_identity_pagetable_done is enough to tell whether the ept identity pagetable is initialized, so we can remove ept_identity_pagetable. NOTE: In the original code, the ept identity pagetable page is pinned in memory, so it cannot be migrated/hot-removed. After this patch, since kvm_arch->ept_identity_pagetable is removed, the page is no longer pinned in memory and can be migrated/hot-removed. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 11 Sep 2014, 1 commit
-
-
Committed by Tang Chen
We have APIC_DEFAULT_PHYS_BASE defined as 0xfee00000, which is also the address of the APIC access page. So use this macro. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 29 Aug 2014, 4 commits
-
-
Committed by Radim Krčmář
In the beginning was on_each_cpu(), which required an unused argument to kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten. Remove the unnecessary arguments that stem from this. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini
Sparse reports the following easily fixed warnings:

    arch/x86/kvm/vmx.c:8795:48: sparse: Using plain integer as NULL pointer
    arch/x86/kvm/vmx.c:2138:5: sparse: symbol vmx_read_l1_tsc was not declared. Should it be static?
    arch/x86/kvm/vmx.c:6151:48: sparse: Using plain integer as NULL pointer
    arch/x86/kvm/vmx.c:8851:6: sparse: symbol vmx_sched_in was not declared. Should it be static?
    arch/x86/kvm/svm.c:2162:5: sparse: symbol svm_read_l1_tsc was not declared. Should it be static?

Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Wanpeng Li
This patch fixes the bug reported at https://bugzilla.kernel.org/show_bug.cgi?id=61411. The TPR shadow/threshold feature is important to speed up Windows guests; besides, it is a required feature for certain VMMs. We map the virtual APIC page address and TPR threshold from the L1 VMCS. If a TPR_BELOW_THRESHOLD VM exit is triggered by the L2 guest and L1 is interested in it, we inject it into the L1 VMM for handling. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> [Add PAGE_ALIGNED check, do not write useless virtual APIC page address if TPR shadowing is disabled. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Wanpeng Li
Introduce the function nested_get_vmcs12_pages() to check the validity of the nested APIC access page and virtual APIC page earlier. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 22 Aug 2014, 4 commits
-
-
Committed by Radim Krčmář
Add a tracepoint for the dynamic PLE window, fired on every potential change. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Radim Krčmář
The window is increased on every PLE exit and decreased on every sched_in. The idea is that we don't want to PLE-exit if there is no preemption going on. We do this with sched_in() because it does not hold the rq lock. There are two new kernel parameters for changing the window: ple_window_grow and ple_window_shrink. ple_window_grow affects the window on PLE exit and ple_window_shrink does it on sched_in; depending on their value, the window is modified like this (ple_window is kvm_intel's global):

    ple_window_shrink/ |                    |
    ple_window_grow    | PLE exit           | sched_in
    -------------------+--------------------+---------------------
    < 1                |  = ple_window      |  = ple_window
    < ple_window       | *= ple_window_grow | /= ple_window_shrink
    otherwise          | += ple_window_grow | -= ple_window_shrink

A third new parameter, ple_window_max, controls the maximal ple_window; it is internally rounded down to the closest multiple of ple_window_grow. A VCPU's PLE window is never allowed below ple_window. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
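A standalone sketch of this policy with the clamping described above; parameter names mirror the module parameters, but the helpers are illustrative, not the kernel code, and the rounding of ple_window_max is not modeled:

    #include <stdio.h>

    /* Grow on PLE exit, clamped to ple_window_max. */
    static unsigned int grow_ple_window(unsigned int window,
                                        unsigned int ple_window,
                                        unsigned int grow,
                                        unsigned int max)
    {
        unsigned int val;

        if (grow < 1)
            val = ple_window;          /* growing disabled */
        else if (grow < ple_window)
            val = window * grow;       /* multiplicative growth */
        else
            val = window + grow;       /* additive growth */

        return val > max ? max : val;
    }

    /* Shrink on sched_in, never below ple_window. */
    static unsigned int shrink_ple_window(unsigned int window,
                                          unsigned int ple_window,
                                          unsigned int shrink)
    {
        unsigned int val;

        if (shrink < 1)
            val = ple_window;                             /* shrinking disabled */
        else if (shrink < ple_window)
            val = window / shrink;                        /* multiplicative shrink */
        else
            val = window > shrink ? window - shrink : 0;  /* additive shrink */

        return val < ple_window ? ple_window : val;       /* floor at ple_window */
    }

    int main(void)
    {
        unsigned int w = 4096;                       /* seeded from ple_window */
        w = grow_ple_window(w, 4096, 2, 1u << 30);   /* on PLE exit */
        printf("after PLE exit: %u\n", w);           /* 8192 */
        w = shrink_ple_window(w, 4096, 2);           /* on sched_in */
        printf("after sched_in: %u\n", w);           /* 4096 */
        return 0;
    }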
-
Committed by Radim Krčmář
Change the PLE window into a per-VCPU variable, seeded from the module parameter, to allow greater flexibility. This brings in a small overhead on every vmentry. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Radim Krčmář
The sched_in preempt notifier is available for x86; allow its use in specific virtualization technologies as well. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 20 Aug 2014, 1 commit
-
-
Committed by Wanpeng Li
The EPT misconfig handler in kvm checks which reason led to the EPT misconfiguration after a vmexit. One of the reasons is that an EPT paging-structure entry is configured with settings reserved for future functionality. However, the handler can't identify whether a paging-structure entry has the reserved bits for a 1-GByte page configured, since a PDPTE which points to a 1-GByte page reserves bits 29:12 instead of bits 7:3, which are reserved for a PDPTE that references an EPT page directory. This patch fixes it by reserving bits 29:12 for 1-GByte pages. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
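An illustrative check of the rule described above, assuming the usual EPT encoding where bit 7 marks a large page; the mask name and demo values are made up for this sketch:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* For an EPT PDPTE that maps a 1-GByte page (bit 7 set), address
     * bits 29:12 are reserved and must be zero; a PDPTE referencing a
     * page directory reserves only bits 7:3 instead. */
    #define EPT_PDPTE_1G_RSVD_MASK ((((1ULL << 30) - 1) & ~((1ULL << 12) - 1))) /* bits 29:12 */

    static bool pdpte_1g_rsvd_bits_set(uint64_t pdpte)
    {
        return (pdpte & (1ULL << 7)) && (pdpte & EPT_PDPTE_1G_RSVD_MASK);
    }

    int main(void)
    {
        uint64_t good = (1ULL << 30) | (1ULL << 7) | 0x7; /* 1G-aligned, large, RWX */
        uint64_t bad  = good | (1ULL << 13);              /* reserved bit set */
        printf("%d %d\n", pdpte_1g_rsvd_bits_set(good), pdpte_1g_rsvd_bits_set(bad));
        return 0;
    }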
-
- 19 Aug 2014, 2 commits
-
-
Committed by Monam Agarwal
Here rcu_assign_pointer() is ensuring that the initialization of a structure is carried out before storing a pointer to that structure. So rcu_assign_pointer(p, NULL) can always safely be converted to RCU_INIT_POINTER(p, NULL). Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
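A userspace analogy of the pattern, using C11 atomics rather than the kernel API: the release store models rcu_assign_pointer() and the relaxed store models RCU_INIT_POINTER():

    #include <stdatomic.h>
    #include <stddef.h>
    #include <stdio.h>

    /* rcu_assign_pointer() is a release store so readers that follow
     * the pointer see a fully initialized structure; storing NULL
     * publishes no structure, so the barrier buys nothing and a
     * relaxed store (RCU_INIT_POINTER) is sufficient. */
    static _Atomic(void *) gp;

    int main(void)
    {
        int data = 42;

        atomic_store_explicit(&gp, &data, memory_order_release); /* like rcu_assign_pointer(gp, &data) */
        atomic_store_explicit(&gp, NULL, memory_order_relaxed);  /* like RCU_INIT_POINTER(gp, NULL) */
        printf("published and retracted\n");
        return 0;
    }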
-
Committed by Wanpeng Li
The only user of the fpu_activate hook was dropped in commit 2d04a05b (KVM: x86 emulator: emulate CLTS internally, 2011-04-20). vmx_fpu_activate and svm_fpu_activate are still called on #NM (and for Intel CLTS), but never from common code; hence, there's no need for a hook. Reviewed-by: Yang Zhang <yang.z.zhang@intel.com> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 Aug 2014, 1 commit
-
-
Committed by Wanpeng Li
An external interrupt will cause a vmexit with reason "external interrupt" when L2 is running. L1 will pick up the interrupt through vmcs12 if L1 has set the ack interrupt bit. Commit 77b0f5d6 (KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to) retrieves the interrupt that belongs to L1 before vmcs01 is loaded. This will lead to problems in the next patch, which would write to SVI of vmcs02 instead of vmcs01 (SVI of vmcs02 doesn't make sense because L2 runs without APICv). Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Liu, RongrongX <rongrongx.liu@intel.com> Tested-by: Felipe Reyes <freyes@suse.com> Fixes: 77b0f5d6 Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> [Move tracepoint as well. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 30 Jul 2014, 1 commit
-
-
Committed by Chris J Arges
Remove a prototype which was added by both 93c4adc7 and 36be0b9d. Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 24 Jul 2014, 2 commits
-
-
Committed by Paolo Bonzini
Using ARRAY_SIZE directly makes the code easier to read. While touching the code, replace the division by a multiplication in the recently added BUILD_BUG_ON. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
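An illustrative sketch of the pattern under assumed names, layouts, and limits (the kernel has its own ARRAY_SIZE and BUILD_BUG_ON definitions):

    #include <stdint.h>
    #include <stdio.h>

    #define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
    #define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

    struct shared_msr { uint32_t index; uint64_t value; };   /* assumed layout */
    #define MAX_SHARED_MSRS 4                                 /* assumed limit */

    static const uint32_t vmx_msr_index[] = { 0x174, 0x175, 0x176 };
    static struct shared_msr guest_msrs[MAX_SHARED_MSRS];

    int main(void)
    {
        /* ARRAY_SIZE reads better than open-coded sizeof arithmetic,
         * and a multiplication keeps the bound exact where integer
         * division of sizes could truncate away a partial element. */
        BUILD_BUG_ON(ARRAY_SIZE(vmx_msr_index) * sizeof(guest_msrs[0])
                     > sizeof(guest_msrs));
        printf("%zu tracked MSRs fit in %zu slots\n",
               ARRAY_SIZE(vmx_msr_index), ARRAY_SIZE(guest_msrs));
        return 0;
    }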
-
Committed by Nadav Amit
Currently there is no check whether the shared MSRs list overruns the allocated size, which can result in bugs. In addition, there is no check that vmx->guest_msrs has sufficient space to accommodate all the VMX MSRs. This patch adds the assertions. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 21 Jul 2014, 3 commits
-
-
Committed by Nadav Amit
Haswell and newer Intel CPUs have support for RTM, and in that case DR6.RTM is not fixed to 1 and DR7.RTM is not fixed to zero. That is not the case in the current KVM implementation. This bug is apparent only if the MOV-DR instruction is emulated or the host also debugs the guest. This patch is a partial fix which enables DR6.RTM and DR7.RTM to be cleared and set respectively. It also sets DR6.RTM upon every debug exception. Obviously, it is not a complete fix, as debugging of RTM is still unsupported. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
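A sketch of how a #DB payload might be formed once DR6.RTM participates; the bit positions follow the SDM (DR6[16] and DR7[11]), but the helper and mask values here are illustrative, not KVM's code:

    #include <stdint.h>
    #include <stdio.h>

    #define DR6_RTM     (1u << 16)    /* active-low: 1 = #DB outside an RTM region */
    #define DR6_FIXED_1 0xfffe0ff0u   /* reserved-1 bits once bit 16 is writable */
    #define DR7_RTM     (1u << 11)

    static uint32_t dr6_for_debug_exception(uint32_t cause)
    {
        /* This sketch only models exceptions outside RTM regions,
         * so DR6.RTM is set on every debug exception. */
        return cause | DR6_FIXED_1 | DR6_RTM;
    }

    int main(void)
    {
        printf("dr6 = %#x\n", dr6_for_debug_exception(1u << 0)); /* breakpoint 0 hit */
        printf("dr7 rtm bit = %#x\n", DR7_RTM);
        return 0;
    }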
-
Committed by Paolo Bonzini
Make nested_release_vmcs12 idempotent. Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini
free_nested needs the loaded_vmcs to be valid if it is a vmcs02, in order to detach it from the shadow vmcs. However, this is not available anymore after commit 26a865f4 (KVM: VMX: fix use after free of vmx->loaded_vmcs, 2014-01-03). Revert that patch, and fix its problem by forcing a vmcs01 as the active VMCS before freeing all the nested VMX state. Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com> Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 17 Jul 2014, 1 commit
-
-
Committed by Wanpeng Li
This patch fixes the bug reported at https://bugzilla.kernel.org/show_bug.cgi?id=73331; after the patch http://www.spinics.net/lists/kvm/msg105230.html is applied, there is some progress and L2 can boot up, however slowly. The original idea of this vid injection fix is from "Zhang, Yang Z" <yang.z.zhang@intel.com>. An interrupt delivered by vid should be injected into L1 by L0 if the vCPU is currently in L1, or should be injected into L2 by L0 through the old injection path if L1 doesn't have the External-interrupt exiting bit set. The current logic doesn't consider these cases. This patch fixes it by injecting the vid interrupt into L1 if the vCPU is in L1, or into L2 through the old injection path if L1 doesn't have the External-interrupt exiting bit set. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: "Zhang, Yang Z" <yang.z.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 11 Jul 2014, 2 commits
-
-
Committed by Paolo Bonzini
For the next patch we will need to know the full state of the interrupt shadow; we will then set KVM_REQ_EVENT when one bit is cleared. However, right now get_interrupt_shadow only returns the one corresponding to the emulated instruction, or an unconditional 0 if the emulated instruction does not have an interrupt shadow. This is confusing and does not allow us to check for cleared bits as mentioned above. Clean the callback up, and modify toggle_interruptibility to match the comment above the call. As a small result, the call to set_interrupt_shadow will be skipped in the common case where int_shadow == 0 && mask == 0. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
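A sketch of the reworked flow under assumed flag values (KVM encodes the STI and MOV-SS shadows as separate mask bits): skip the update entirely in the common case, and flag an event check when a shadow bit is cleared:

    #include <stdio.h>

    #define SHADOW_INT_MOV_SS 0x01    /* assumed flag values, for illustration */
    #define SHADOW_INT_STI    0x02

    static void toggle_interruptibility(unsigned int *int_shadow, unsigned int mask)
    {
        unsigned int old = *int_shadow;

        if (old == 0 && mask == 0)
            return;                            /* common fast path: nothing changes */

        *int_shadow = mask;                    /* would call set_interrupt_shadow() */
        if (old & ~mask)
            printf("request KVM_REQ_EVENT\n"); /* a shadow bit was cleared */
    }

    int main(void)
    {
        unsigned int shadow = SHADOW_INT_STI;
        toggle_interruptibility(&shadow, 0);   /* clears the STI shadow */
        toggle_interruptibility(&shadow, 0);   /* fast path, no output */
        return 0;
    }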
-
Committed by Paolo Bonzini
About 25% of the time spent in emulation of invalid guest state is wasted in checking whether emulation is required for the next instruction. However, this almost never changes except when a segment register (or TR or LDTR) changes, or when there is a mode transition (i.e. CR0 changes). In fact, vmx_set_segment and vmx_set_cr0 already modify vmx->emulation_required (except that the former for some reason uses |= instead of just an assignment). So there is no need to call guest_state_valid in the emulation loop. Emulation performance test results indicate 1650-2600 cycles for common instructions, versus 2300-3200 before this patch, on a Sandy Bridge Xeon. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 19 Jun 2014, 9 commits
-
-
Committed by Nadav Amit
VMX instructions use 32-bit operands in 32-bit mode, and 64-bit operands in 64-bit mode. The current implementation is broken since it does not use the register operands correctly and always uses 64-bit for reads and writes. Moreover, the write to memory in vmwrite only considers long mode, so it ignores cs.l. This patch fixes this behavior. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Nadav Amit
In 32-bit mode, only bits [31:0] of the CR should be used for setting the CR value. Otherwise, the host may incorrectly assume the value is invalid if bits [63:32] are not zero. Moreover, the CR is currently read twice when CR8 is used. Last, nested mov-cr exiting is modified to handle the CR value correctly as well. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
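The operand-size rule reduces to masking the source register outside 64-bit mode; a minimal demonstration (helper name invented for this sketch):

    #include <stdint.h>
    #include <stdio.h>

    /* In 32-bit mode the MOV-to-CR source is a 32-bit register, so
     * bits 63:32 of the backing storage must be ignored rather than
     * treated as part of the new CR value. */
    static uint64_t cr_operand(uint64_t reg, int long_mode)
    {
        return long_mode ? reg : (uint32_t)reg;
    }

    int main(void)
    {
        uint64_t reg = 0xdeadbeef00000011ULL;   /* stale upper half */
        printf("%#llx\n", (unsigned long long)cr_operand(reg, 0)); /* 0x11 */
        printf("%#llx\n", (unsigned long long)cr_operand(reg, 1));
        return 0;
    }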
-
Committed by Nadav Amit
When the guest sets DR6 and DR7, KVM asserts that the high 32 bits are clear, and otherwise injects a #GP exception. This exception should be injected only if running in long mode. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
Many real CPUs get this wrong as well, but ours is totally off: bits 9:1 define the highest index value. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
Allow L1 to "leak" its debug controls into L2, i.e. permit cleared VM_{ENTRY_LOAD,EXIT_SAVE}_DEBUG_CONTROLS. This requires manually transferring the state of DR7 and IA32_DEBUGCTLMSR from L1 into L2, as both run on different VMCS. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
The SDM says bits 1, 4-6, 8, 13-16, and 26 have to be set. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
We already have this control enabled by exposing a broken MSR_IA32_VMX_PROCBASED_CTLS value. This will properly advertise our capability once the value is fixed by clearing the right bits in MSR_IA32_VMX_TRUE_PROCBASED_CTLS. We also have to ensure we test the right value on L2 entry. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
We already implemented them but failed to advertise them. Currently they all return values identical to the capability MSRs they are augmenting, so there is no change in exposed features yet. Drop related comments at this chance that are partially incorrect and redundant anyway. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Fabian Frederick
Use the mm.h definition. Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 22 May 2014, 2 commits
-
-
Committed by Nadav Amit
The DR7 masking which is done on task-switch emulation should be in hex format (clearing the local breakpoint enable bits 0, 2, 4 and 6). Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
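The difference is easy to demonstrate: decimal 55 (0b00110111) clears bits 0, 1, 2, 4 and 5, while hex 0x55 (0b01010101) clears the intended local breakpoint enable bits 0, 2, 4 and 6:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t dr7 = 0xff;                  /* all local+global enables set */
        printf("buggy: %#x\n", dr7 & ~55u);   /* decimal mask clears the wrong bits: 0xc8 */
        printf("fixed: %#x\n", dr7 & ~0x55u); /* local enables cleared: 0xaa */
        return 0;
    }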
-
Committed by Paolo Bonzini
CS.RPL is not equal to the CPL in the few instructions between setting CR0.PE and reloading CS, and CS.DPL is also not equal to the CPL for conforming code segments. However, SS.DPL *is* always equal to the CPL, except for the weird case of SYSRET on AMD processors, which sets SS.DPL=SS.RPL from the value in the STAR MSR but forces CPL=3 (Intel instead forces SS.DPL=SS.RPL=CPL=3). So this patch:

- modifies SVM to update the CPL from SS.DPL rather than CS.RPL; the above case with SYSRET is not broken further, and the way to fix it would be to pass the CPL to userspace and back
- modifies VMX to always return the CPL from SS.DPL (except forcing it to 0 if we are emulating real mode via vm86 mode; in vm86 mode all DPLs have to be 3, but real mode does allow privileged instructions)

It also removes the CPL cache, which becomes a duplicate of the SS access rights cache. This fixes doing KVM_IOCTL_SET_SREGS exactly after setting CR0.PE=1 but before CS has been reloaded. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 08 May 2014, 1 commit
-
-
Committed by Gabriel L. Somlo
Treat the monitor and mwait instructions as nops, which is architecturally correct (but inefficient) behavior. We do this to prevent misbehaving guests (e.g. OS X <= 10.7) from crashing after they fail to check for monitor/mwait availability via cpuid. Since mwait-based idle loops relying on these nop-emulated instructions would keep the host CPU pegged at 100%, do NOT advertise their presence via cpuid, to prevent compliant guests from using them inadvertently. Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
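A toy model of "emulate as nop" (structure and names are illustrative, not KVM's): the handler simply advances RIP past the instruction and resumes the guest:

    #include <stdint.h>
    #include <stdio.h>

    struct vcpu { uint64_t rip; };

    /* On a MONITOR/MWAIT exit, skip the instruction and keep running;
     * returning 1 mirrors the KVM convention for "resume the guest". */
    static int handle_nop_exit(struct vcpu *v, unsigned int insn_len)
    {
        v->rip += insn_len;   /* the skip_emulated_instruction() step */
        return 1;
    }

    int main(void)
    {
        struct vcpu v = { .rip = 0x1000 };
        handle_nop_exit(&v, 3);               /* MWAIT (0F 01 C9) is 3 bytes */
        printf("rip = %#llx\n", (unsigned long long)v.rip);
        return 0;
    }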
-