提交 · 73a6d9416279f138833574f11dc82134fb56908d · openeuler / Kernel

11 9月, 2014 1 次提交

kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address. · 73a6d941

由 Tang Chen 提交于 9月 11, 2014

We have APIC_DEFAULT_PHYS_BASE defined as 0xfee00000, which is also the address of
apic access page. So use this macro.
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: NGleb Natapov <gleb@kernel.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

73a6d941

05 9月, 2014 2 次提交

KVM: x86: propagate exception from permission checks on the nested page fault · 54987b7a

由 Paolo Bonzini 提交于 9月 02, 2014

Currently, if a permission error happens during the translation of
the final GPA to HPA, walk_addr_generic returns 0 but does not fill
in walker->fault.  To avoid this, add an x86_exception* argument
to the translate_gpa function, and let it fill in walker->fault.
The nested_page_fault field will be true, since the walk_mmu is the
nested_mmu and translate_gpu instead operates on the "outer" (NPT)
instance.
Reported-by: NValentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

54987b7a

KVM: x86: skip writeback on injection of nested exception · ef54bcfe

由 Paolo Bonzini 提交于 9月 04, 2014

If a nested page fault happens during emulation, we will inject a vmexit,
not a page fault.  However because writeback happens after the injection,
we will write ctxt->eip from L2 into the L1 EIP.  We do not write back
if an instruction caused an interception vmexit---do the same for page
faults.
Suggested-by: NGleb Natapov <gleb@kernel.org>
Reviewed-by: NGleb Natapov <gleb@kernel.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ef54bcfe

03 9月, 2014 6 次提交

KVM: nSVM: propagate the NPF EXITINFO to the guest · 5e352519

由 Paolo Bonzini 提交于 9月 02, 2014

This is similar to what the EPT code does with the exit qualification.
This allows the guest to see a valid value for bits 33:32.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5e352519

KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMD · a0c0feb5

由 Paolo Bonzini 提交于 9月 02, 2014

Bit 8 would be the "global" bit, which does not quite make sense for non-leaf
page table entries. Intel ignores it; AMD ignores it in PDEs, but reserves it
in PDPEs and PML4Es. The SVM test is relying on this behavior, so enforce it.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a0c0feb5

KVM: mmio: cleanup kvm_set_mmio_spte_mask · d1431483

由 Tiejun Chen 提交于 9月 01, 2014

Just reuse rsvd_bits() inside kvm_set_mmio_spte_mask()
for slightly better code.
Signed-off-by: NTiejun Chen <tiejun.chen@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d1431483

kvm: x86: fix stale mmio cache bug · 56f17dd3

由 David Matlack 提交于 8月 18, 2014

The following events can lead to an incorrect KVM_EXIT_MMIO bubbling
up to userspace:

(1) Guest accesses gpa X without a memory slot. The gfn is cached in
struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets
the SPTE write-execute-noread so that future accesses cause
EPT_MISCONFIGs.

(2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION
covering the page just accessed.

(3) Guest attempts to read or write to gpa X again. On Intel, this
generates an EPT_MISCONFIG. The memory slot generation number that
was incremented in (2) would normally take care of this but we fast
path mmio faults through quickly_check_mmio_pf(), which only checks
the per-vcpu mmio cache. Since we hit the cache, KVM passes a
KVM_EXIT_MMIO up to userspace.

This patch fixes the issue by using the memslot generation number
to validate the mmio cache.

Cc: stable@vger.kernel.org
Signed-off-by: NDavid Matlack <dmatlack@google.com>
[xiaoguangrong: adjust the code to make it simpler for stable-tree fix.]
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: NDavid Matlack <dmatlack@google.com>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Tested-by: NDavid Matlack <dmatlack@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

56f17dd3

kvm: fix potentially corrupt mmio cache · ee3d1570

由 David Matlack 提交于 8月 18, 2014

vcpu exits and memslot mutations can run concurrently as long as the
vcpu does not aquire the slots mutex. Thus it is theoretically possible
for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after
synchronize_srcu_expedited(), vcpus can safely cache memslot generation
without maintaining a single rcu_dereference through an entire vm exit.
And much of the x86/kvm code does not maintain a single rcu_dereference
of the current memslots during each exit.

We can prevent the following case:

   vcpu (CPU 0)                             | thread (CPU 1)
--------------------------------------------+--------------------------
1  vm exit                                  |
2  srcu_read_unlock(&kvm->srcu)             |
3  decide to cache something based on       |
     old memslots                           |
4                                           | change memslots
                                            | (increments generation)
5                                           | synchronize_srcu(&kvm->srcu);
6  retrieve generation # from new memslots  |
7  tag cache with new memslot generation    |
8  srcu_read_unlock(&kvm->srcu)             |
...                                         |
   <action based on cache occurs even       |
    though the caching decision was based   |
    on the old memslots>                    |
...                                         |
   <action *continues* to occur until next  |
    memslot generation change, which may    |
    be never>                               |
                                            |

By incrementing the generation after synchronizing with kvm->srcu readers,
we ensure that the generation retrieved in (6) will become invalid soon
after (8).

Keeping the existing increment is not strictly necessary, but we
do keep it and just move it for consistency from update_memslots to
install_new_memslots.  It invalidates old cached MMIOs immediately,
instead of having to wait for the end of synchronize_srcu_expedited,
which makes the code more clearly correct in case CPU 1 is preempted
right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the
low bit of the generation is zero when reconstructing a generation number
out of an SPTE.  This effectively disables MMIO caching in SPTEs during
the call to synchronize_srcu_expedited.  Using the low bit this way is
somewhat like a seqcount---where the protected thing is a cache, and
instead of retrying we can simply punt if we observe the low bit to be 1.

Cc: stable@vger.kernel.org
Signed-off-by: NDavid Matlack <dmatlack@google.com>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: NDavid Matlack <dmatlack@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ee3d1570

KVM: do not bias the generation number in kvm_current_mmio_generation · 00f034a1

由 Paolo Bonzini 提交于 8月 20, 2014

The next patch will give a meaning (a la seqcount) to the low bit of the
generation number.  Ensure that it matches between kvm->memslots->generation
and kvm_current_mmio_generation().

Cc: stable@vger.kernel.org
Reviewed-by: NDavid Matlack <dmatlack@google.com>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

00f034a1

30 8月, 2014 1 次提交

KVM: x86: use guest maxphyaddr to check MTRR values · fd275235

由 Paolo Bonzini 提交于 8月 29, 2014

The check introduced in commit d7a2a246 (KVM: x86: #GP when attempts to write reserved bits of Variable Range MTRRs, 2014-08-19)
will break if the guest maxphyaddr is higher than the host's (which
sometimes happens depending on your hardware and how QEMU is
configured).

To fix this, use cpuid_maxphyaddr similar to how the APIC_BASE MSR
does already.
Reported-by: NJan Kiszka <jan.kiszka@siemens.com>
Tested-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fd275235

29 8月, 2014 7 次提交

KVM: remove garbage arg to *hardware_{en,dis}able · 13a34e06

由 Radim Krčmář 提交于 8月 28, 2014

In the beggining was on_each_cpu(), which required an unused argument to
kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten.

Remove unnecessary arguments that stem from this.
Signed-off-by: NRadim KrÄmÃ¡Å™ <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

13a34e06

KVM: x86: remove Aligned bit from movntps/movntpd · d5b77069

由 Paolo Bonzini 提交于 7月 14, 2014

These are not explicitly aligned, and do not require alignment on AVX.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d5b77069

KVM: x86 emulator: emulate MOVNTDQ · 0a37027e

由 Alex Williamson 提交于 7月 11, 2014

Windows 8.1 guest with NVIDIA driver and GPU fails to boot with an
emulation failure.  The KVM spew suggests the fault is with lack of
movntdq emulation (courtesy of Paolo):

Code=02 00 00 b8 08 00 00 00 f3 0f 6f 44 0a f0 f3 0f 6f 4c 0a e0 <66> 0f e7 41 f0 66 0f e7 49 e0 48 83 e9 40 f3 0f 6f 44 0a 10 f3 0f 6f 0c 0a 66 0f e7 41 10

$ as -o a.out
        .section .text
        .byte 0x66, 0x0f, 0xe7, 0x41, 0xf0
        .byte 0x66, 0x0f, 0xe7, 0x49, 0xe0
$ objdump -d a.out
    0:  66 0f e7 41 f0          movntdq %xmm0,-0x10(%rcx)
    5:  66 0f e7 49 e0          movntdq %xmm1,-0x20(%rcx)

Add the necessary emulation.
Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0a37027e

KVM: vmx: VMXOFF emulation in vm86 should cause #UD · 0f54a321

由 Nadav Amit 提交于 8月 29, 2014

Unlike VMCALL, the instructions VMXOFF, VMLAUNCH and VMRESUME should cause a UD
exception in real-mode or vm86. However, the emulator considers all these
instructions the same for the matter of mode checks, and emulation upon exit
due to #UD exception.

As a result, the hypervisor behaves incorrectly on vm86 mode. VMXOFF, VMLAUNCH
or VMRESUME cause on vm86 exit due to #UD. The hypervisor then emulates these
instruction and inject #GP to the guest instead of #UD.

This patch creates a new group for these instructions and mark only VMCALL as
an instruction which can be emulated.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0f54a321

KVM: x86: fix some sparse warnings · 48d89b92

由 Paolo Bonzini 提交于 8月 26, 2014

Sparse reports the following easily fixed warnings:

arch/x86/kvm/vmx.c:8795:48: sparse: Using plain integer as NULL pointer
arch/x86/kvm/vmx.c:2138:5: sparse: symbol vmx_read_l1_tsc was not declared. Should it be static?
arch/x86/kvm/vmx.c:6151:48: sparse: Using plain integer as NULL pointer
arch/x86/kvm/vmx.c:8851:6: sparse: symbol vmx_sched_in was not declared. Should it be static?

arch/x86/kvm/svm.c:2162:5: sparse: symbol svm_read_l1_tsc was not declared. Should it be static?

Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

48d89b92

KVM: nVMX: nested TPR shadow/threshold emulation · a7c0b07d

由 Wanpeng Li 提交于 8月 21, 2014

This patch fix bug https://bugzilla.kernel.org/show_bug.cgi?id=61411

TPR shadow/threshold feature is important to speed up the Windows guest.
Besides, it is a must feature for certain VMM.

We map virtual APIC page address and TPR threshold from L1 VMCS. If
TPR_BELOW_THRESHOLD VM exit is triggered by L2 guest and L1 interested
in, we inject it into L1 VMM for handling.
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
[Add PAGE_ALIGNED check, do not write useless virtual APIC page address
 if TPR shadowing is disabled. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a7c0b07d

KVM: nVMX: introduce nested_get_vmcs12_pages · a2bcba50

由 Wanpeng Li 提交于 8月 21, 2014

Introduce function nested_get_vmcs12_pages() to check the valid
of nested apic access page and virtual apic page earlier.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a2bcba50

25 8月, 2014 1 次提交

kvm: x86: fix tracing for 32-bit · 54ad89b0

由 Paolo Bonzini 提交于 8月 25, 2014

Fix commit 7b46268d, which mistakenly
included the new tracepoint under #ifdef CONFIG_X86_64.
Reported-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

54ad89b0

22 8月, 2014 5 次提交

KVM: trace kvm_ple_window grow/shrink · 7b46268d

由 Radim Krčmář 提交于 8月 21, 2014

Tracepoint for dynamic PLE window, fired on every potential change.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7b46268d

KVM: VMX: dynamise PLE window · b4a2d31d

由 Radim Krčmář 提交于 8月 21, 2014

Window is increased on every PLE exit and decreased on every sched_in.
The idea is that we don't want to PLE exit if there is no preemption
going on.
We do this with sched_in() because it does not hold rq lock.

There are two new kernel parameters for changing the window:
 ple_window_grow and ple_window_shrink
ple_window_grow affects the window on PLE exit and ple_window_shrink
does it on sched_in;  depending on their value, the window is modifier
like this: (ple_window is kvm_intel's global)

  ple_window_shrink/ |
  ple_window_grow    | PLE exit           | sched_in
  -------------------+--------------------+---------------------
  < 1                |  = ple_window      |  = ple_window
  < ple_window       | *= ple_window_grow | /= ple_window_shrink
  otherwise          | += ple_window_grow | -= ple_window_shrink

A third new parameter, ple_window_max, controls the maximal ple_window;
it is internally rounded down to a closest multiple of ple_window_grow.

VCPU's PLE window is never allowed below ple_window.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b4a2d31d

KVM: VMX: make PLE window per-VCPU · a7653ecd

由 Radim Krčmář 提交于 8月 21, 2014

Change PLE window into per-VCPU variable, seeded from module parameter,
to allow greater flexibility.

Brings in a small overhead on every vmentry.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a7653ecd

KVM: x86: introduce sched_in to kvm_x86_ops · ae97a3b8

由 Radim Krčmář 提交于 8月 21, 2014

sched_in preempt notifier is available for x86, allow its use in
specific virtualization technlogies as well.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ae97a3b8

KVM: add kvm_arch_sched_in · e790d9ef

由 Radim Krčmář 提交于 8月 21, 2014

Introduce preempt notifiers for architecture specific code.
Advantage over creating a new notifier in every arch is slightly simpler
code and guaranteed call order with respect to kvm_sched_in.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e790d9ef

21 8月, 2014 1 次提交

KVM: x86: Replace X86_FEATURE_NX offset with the definition · 6689fbe3

由 Nadav Amit 提交于 8月 20, 2014

Replace reference to X86_FEATURE_NX using bit shift with the defined
X86_FEATURE_NX.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6689fbe3

20 8月, 2014 4 次提交

KVM: emulate: warn on invalid or uninitialized exception numbers · e0ad0b47

由 Paolo Bonzini 提交于 8月 20, 2014

These were reported when running Jailhouse on AMD processors.

Initialize ctxt->exception.vector with an invalid exception number,
and warn if it remained invalid even though the emulator got
an X86EMUL_PROPAGATE_FAULT return code.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e0ad0b47

KVM: emulate: do not return X86EMUL_PROPAGATE_FAULT explicitly · 592f0858

由 Paolo Bonzini 提交于 8月 20, 2014

Always get it through emulate_exception or emulate_ts.  This
ensures that the ctxt->exception fields have been populated.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

592f0858

KVM: x86: Clarify PMU related features bit manipulation · d27aa7f1

由 Nadav Amit 提交于 8月 20, 2014

kvm_pmu_cpuid_update makes a lot of bit manuiplation operations, when in fact
there are already unions that can be used instead. Changing the bit
manipulation to the union for clarity. This patch does not change the
functionality.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d27aa7f1

KVM: vmx: fix ept reserved bits for 1-GByte page · a32e8459

由 Wanpeng Li 提交于 8月 20, 2014

EPT misconfig handler in kvm will check which reason lead to EPT
misconfiguration after vmexit. One of the reasons is that an EPT
paging-structure entry is configured with settings reserved for
future functionality. However, the handler can't identify if
paging-structure entry of reserved bits for 1-GByte page are
configured, since PDPTE which point to 1-GByte page will reserve
bits 29:12 instead of bits 7:3 which are reserved for PDPTE that
references an EPT Page Directory. This patch fix it by reserve
bits 29:12 for 1-GByte page.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a32e8459

19 8月, 2014 10 次提交

KVM: x86: recalculate_apic_map after enabling apic · 1e1b6c26

由 Nadav Amit 提交于 8月 19, 2014

Currently, recalculate_apic_map ignores vcpus whose lapic is software disabled
through the spurious interrupt vector. However, once it is re-enabled, the map
is not recalculated. Therefore, if the guest OS configured DFR while lapic is
software-disabled, the map may be incorrect. This patch recalculates apic map
after software enabling the lapic.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1e1b6c26

KVM: x86: Clear apic tsc-deadline after deadline · fae0ba21

由 Nadav Amit 提交于 8月 18, 2014

Intel SDM 10.5.4.1 says "When the timer generates an interrupt, it disarms
itself and clears the IA32_TSC_DEADLINE MSR".

This patch clears the MSR upon timer interrupt delivery which delivered on
deadline mode.  Since the MSR may be reconfigured while an interrupt is
pending, causing the new value to be overriden, pending timer interrupts are
checked before setting a new deadline.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fae0ba21

KVM: x86: #GP when attempts to write reserved bits of Variable Range MTRRs · d7a2a246

由 Wanpeng Li 提交于 8月 19, 2014

Section 11.11.2.3 of the SDM mentions "All other bits in the IA32_MTRR_PHYSBASEn
and IA32_MTRR_PHYSMASKn registers are reserved; the processor generates a
general-protection exception(#GP) if software attempts to write to them". This
patch do it in kvm.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d7a2a246

KVM: x86: fix check legal type of Variable Range MTRRs · adfb5d27

由 Wanpeng Li 提交于 8月 19, 2014

The first entry in each pair(IA32_MTRR_PHYSBASEn) defines the base
address and memory type for the range; the second entry(IA32_MTRR_PHYSMASKn)
contains a mask used to determine the address range. The legal values
for the type field of IA32_MTRR_PHYSBASEn are 0,1,4,5, and 6. However,
IA32_MTRR_PHYSMASKn don't have type field. This patch avoid check if
the type field is legal for IA32_MTRR_PHYSMASKn.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

adfb5d27

arch/x86: Use RCU_INIT_POINTER(x, NULL) in kvm/vmx.c · 3b63a43f

由 Monam Agarwal 提交于 3月 22, 2014

Here rcu_assign_pointer() is ensuring that the
initialization of a structure is carried out before storing a pointer
to that structure.
So, rcu_assign_pointer(p, NULL) can always safely be converted to
RCU_INIT_POINTER(p, NULL).
Signed-off-by: NMonam Agarwal <monamagarwal123@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3b63a43f

KVM: x86: raise invalid TSS exceptions during a task switch · 15fc0752

由 Paolo Bonzini 提交于 8月 18, 2014

Conditions that would usually trigger a general protection fault should
instead raise #TS.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

15fc0752

KVM: x86: drop fpu_activate hook · 4473b570

由 Wanpeng Li 提交于 8月 18, 2014

The only user of the fpu_activate hook was dropped in commit
2d04a05b (KVM: x86 emulator: emulate CLTS internally, 2011-04-20).
vmx_fpu_activate and svm_fpu_activate are still called on #NM (and for
Intel CLTS), but never from common code; hence, there's no need for
a hook.
Reviewed-by: NYang Zhang <yang.z.zhang@intel.com>
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4473b570

KVM: SVM: add rdmsr support for AMD event registers · dc9b2d93

由 Wei Huang 提交于 8月 13, 2014

Current KVM only supports RDMSR for K7_EVNTSEL0 and K7_PERFCTR0
MSRs. Reading the rest MSRs will trigger KVM to inject #GP into
guest VM. This causes a warning message "Failed to access perfctr
msr (MSR c0010001 is ffffffffffffffff)" on AMD host. This patch
adds RDMSR support for all K7_EVNTSELn and K7_PERFCTRn registers
and thus supresses the warning message.
Signed-off-by: NWei Huang <wehuang@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

dc9b2d93

KVM: x86: do not check CS.DPL against RPL during task switch · 9a4cfb27

由 Paolo Bonzini 提交于 8月 18, 2014

This reverts the check added by commit 5045b468 (KVM: x86: check CS.DPL
against RPL during task switch, 2014-05-15).  Although the CS.DPL=CS.RPL
check is mentioned in table 7-1 of the SDM as causing a #TSS exception,
it is not mentioned in table 6-6 that lists "invalid TSS conditions"
which cause #TSS exceptions. In fact it causes some tests to fail, which
pass on bare-metal.

Keep the rest of the commit, since we will find new uses for it in 3.18.
Reported-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9a4cfb27

KVM: x86: Avoid emulating instructions on #UD mistakenly · 3a6095a0

由 Nadav Amit 提交于 8月 13, 2014

Commit d40a6898 mistakenly caused instructions which are not marked as
EmulateOnUD to be emulated upon #UD exception. The commit caused the check of
whether the instruction flags include EmulateOnUD to never be evaluated. As a
result instructions whose emulation is broken may be emulated.  This fix moves
the evaluation of EmulateOnUD so it would be evaluated.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
[Tweak operand order in &&, remove EmulateOnUD where it's now superfluous.
 - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3a6095a0

09 8月, 2014 1 次提交

arch/x86: replace strict_strto calls · 164109e3

由 Daniel Walter 提交于 8月 08, 2014

Replace obsolete strict_strto calls with appropriate kstrto calls
Signed-off-by: NDaniel Walter <dwalter@google.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

164109e3

05 8月, 2014 1 次提交

KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use · 56cc2406

由 Wanpeng Li 提交于 8月 05, 2014

After commit 77b0f5d6 (KVM: nVMX: Ack and write vector info to intr_info
if L1 asks us to), "Acknowledge interrupt on exit" behavior can be
emulated. To do so, KVM will ask the APIC for the interrupt vector if
during a nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set.  With APICv,
kvm_get_apic_interrupt would return -1 and give the following WARNING:

Call Trace:
 [<ffffffff81493563>] dump_stack+0x49/0x5e
 [<ffffffff8103f0eb>] warn_slowpath_common+0x7c/0x96
 [<ffffffffa059709a>] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffff8103f11a>] warn_slowpath_null+0x15/0x17
 [<ffffffffa059709a>] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffffa0594295>] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [<ffffffffa0537931>] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [<ffffffffa05972ec>] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [<ffffffffa051ebe9>] inject_pending_event+0xd0/0x16e [kvm]
 [<ffffffffa051efa0>] vcpu_enter_guest+0x319/0x704 [kvm]

To fix this, we cannot rely on the processor's virtual interrupt delivery,
because "acknowledge interrupt on exit" must only update the virtual
ISR/PPR/IRR registers (and SVI, which is just a cache of the virtual ISR)
but it should not deliver the interrupt through the IDT.  Thus, KVM has
to deliver the interrupt "by hand", similar to the treatment of EOI in
commit fc57ac2c (KVM: lapic: sync highest ISR to hardware apic on
EOI, 2014-05-14).

The patch modifies kvm_cpu_get_interrupt to always acknowledge an
interrupt; there are only two callers, and the other is not affected
because it is never reached with kvm_apic_vid_enabled() == true.  Then it
modifies apic_set_isr and apic_clear_irr to update SVI and RVI in addition
to the registers.
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Suggested-by: N"Zhang, Yang Z" <yang.z.zhang@intel.com>
Tested-by: NLiu, RongrongX <rongrongx.liu@intel.com>
Tested-by: NFelipe Reyes <freyes@suse.com>
Fixes: 77b0f5d6
Cc: stable@vger.kernel.org
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

56cc2406

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功