- July 21, 2014: 2 commits
-
-
Submitted by Nadav Amit

When skipping an emulated instruction, rflags.rf should be cleared as it would be on a real x86 CPU.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Merge tag 'kvm-s390-20140715' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next

This series enables the KVM_(S|G)ET_MP_STATE ioctls on s390 to make the cpu state settable by user space. This is necessary to avoid races in s390 SIGP/reset handling which happen because some SIGPs are handled in QEMU, while others are handled in the kernel. Together with the busy conditions returned by SIGP, races happen especially in areas like starting and stopping of CPUs. (For example, there is a program 'cpuplugd', running on several s390 distros, which does automatic onlining and offlining of cpus.)

As soon as the MPSTATE interface is used, user space takes complete control of the cpu states. Otherwise the kernel will use the old way. Therefore, the new kernel continues to work fine with old QEMUs.
-
- July 17, 2014: 1 commit
-
-
Submitted by Wanpeng Li

This patch fixes the bug reported in https://bugzilla.kernel.org/show_bug.cgi?id=73331. After the patch http://www.spinics.net/lists/kvm/msg105230.html is applied, there is some progress and L2 can boot up, although slowly.

The original idea of this vid injection fix is from "Zhang, Yang Z" <yang.z.zhang@intel.com>. An interrupt delivered by vid should be injected into L1 by L0 if the CPU is currently in L1, or injected into L2 by L0 through the old injection path if L1 does not have the External-interrupt exiting bit set. The current logic doesn't consider these cases. This patch fixes it by injecting the vid interrupt to L1 if the CPU is in L1, or to L2 through the old injection path if L1 doesn't have the External-interrupt exiting bit set.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- July 11, 2014: 24 commits
-
-
Submitted by Nadav Amit

Certain instructions (e.g., mwait and monitor) cause a #UD exception when they are executed in user mode. This is in contrast to the regular privileged instructions, which cause #GP. In order not to mess with SVM interception of mwait and monitor, which assumes privilege-level assertions take place before interception, a flag has been added.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
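A minimal sketch of the check being described, assuming a hypothetical PrivUD-style decode flag; the actual flag name and its placement in the emulator may differ:

    /* Sketch only: raise #UD (not #GP) for user-mode execution of flagged
     * instructions, and do it before intercept handling so SVM still sees
     * the privilege check happen first. "PrivUD" is an assumed flag name. */
    if ((ctxt->d & PrivUD) && ctxt->ops->cpl(ctxt) > 0)
            return emulate_ud(ctxt);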
-
Submitted by Nadav Amit

Certain instructions, such as monitor and xsave, do not support big real mode and cause a #GP exception if the effective address of any accessed byte is not within [0, 0xffff]. This patch introduces a flag to mark these instructions, including the necessary checks.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
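A minimal sketch of that range check; the helper name is illustrative, not the emulator's actual code:

    /* Sketch: every byte of the access must have an effective address in
     * [0, 0xffff]; otherwise the instruction takes #GP in real mode. */
    static bool within_16bit_limit(unsigned long ea, unsigned int bytes)
    {
            return ea <= 0xffffu && ea + bytes - 1 <= 0xffffu;
    }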
-
Submitted by Paolo Bonzini

Emulator accesses are always done a page at a time, either by the emulator itself (for fetches) or because we need to query the MMU for address translations. Speed up these accesses by using kvm_read_guest_page and, in the case of fetches, by inlining kvm_read_guest_virt_helper and dropping the loop around kvm_read_guest_page. This final tweak saves 30-100 more clock cycles (4-10%), bringing the count (as measured by kvm-unit-tests) down to 720-1100 clock cycles on a Sandy Bridge Xeon host, compared to 2300-3200 before the whole series and 925-1700 after the first two low-hanging-fruit changes.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

When the CS base is not page-aligned, the linear address of the code can get close to the page boundary (e.g. 0x...ffe) even if the EIP value is not. So we need to first linearize the address, and only then compute the number of valid bytes that can be fetched. This happens relatively often when executing real-mode code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
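An illustrative sketch of the ordering, assuming the usual PAGE_SIZE/PAGE_MASK definitions; the variable names are not the emulator's:

    /* Sketch: clamp the fetch to the page boundary of the *linear* address
     * (CS base + EIP), since the CS base may not be page aligned. */
    unsigned long linear = cs_base + eip;                     /* linearize first   */
    unsigned int avail   = PAGE_SIZE - (linear & ~PAGE_MASK); /* bytes to page end */
    unsigned int size    = avail < want ? avail : want;       /* then clamp fetch  */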
-
Submitted by Paolo Bonzini

This simplifies the code a bit, especially the overflow checks.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

We do not need a memory copying loop anymore in insn_fetch; we can use a byte-aligned pointer to access instruction fields directly from the fetch_cache. This eliminates 50-150 cycles (corresponding to a 5-10% improvement in performance) from each instruction.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

do_insn_fetch_bytes will only be called once in a given insn_fetch and insn_fetch_arr, because in fact it will be called at most twice for any instruction and the first call is explicit in x86_decode_insn. This observation lets us hoist the call out of the memory copying loop. It does not buy performance, because most fetches are one byte long anyway, but it prepares for the next patch.

The overflow check is tricky, but correct. Because do_insn_fetch_bytes has already been called once, we know that fc->end is at least 15. So it is okay to subtract the number of bytes we want to read.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Hoist the common case up from do_insn_fetch_byte to do_insn_fetch, and prime the fetch_cache in x86_decode_insn. This helps the compiler and the branch predictor a bit, but above all it lays the ground for further changes in the next few patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Bandan Das

rip_relative is only set if decode_modrm runs, and if you have ModRM you will also have a memopp. We can then access memopp unconditionally. Note that rip_relative cannot be hoisted up to decode_modrm, or you break "mov $0, xyz(%rip)". Also, move the typecast on the "out of range value" of mem.ea to decode_modrm.

Together, all these optimizations save about 50 cycles on each emulated instruction (4-6%).

Signed-off-by: Bandan Das <bsd@redhat.com>
[Fix immediate operands with rip-relative addressing. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Bandan Das

x86_decode_insn already sets a default for seg_override, so remove it from the zeroed area. Also replace the set/get functions with direct access to the field.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Bandan Das

A lot of initializations are unnecessary, as the fields get set to appropriate values before actually being used. Also optimize the placement of fields in x86_emulate_ctxt.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Bandan Das

Remove the if conditional - that will help us avoid an "else initialize to 0". Also, rearrange operators for slightly better code.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Bandan Das

The same information can be gleaned from ctxt->d, and this avoids having to zero/NULL-initialize intercept and check_perm.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Bandan Das

Core emulator functions all belong in emulate.c; x86 should have no knowledge of emulator internals.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

The "if/return" checks are useless, because we return X86EMUL_CONTINUE anyway if we do not return.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

We can just blindly move all 16 bytes of ctxt->src's value to ctxt->dst; write_register_operand will take care of writing only the lower bytes. Avoiding a call to memcpy (the compiler optimizes it out) gains about 200 cycles on kvm-unit-tests for register-to-register moves, and makes them about as fast as arithmetic instructions.

We could perhaps get a larger speedup by moving all instructions _except_ moves out of x86_emulate_insn, removing opcode_len, and replacing the switch statement with an inlined em_mov.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
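A rough sketch of the idea (not necessarily the exact emulator code): copy the whole operand value and let writeback truncate it to dst.bytes.

    /* Sketch: a fixed-size copy of the full operand value; the compiler
     * turns it into plain moves, and writeback only stores dst.bytes. */
    memcpy(ctxt->dst.valptr, ctxt->src.valptr, sizeof(ctxt->src.valptr));
    return X86EMUL_CONTINUE;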
-
Submitted by Paolo Bonzini

There are several checks for "peculiar" aspects of instructions in both x86_decode_insn and x86_emulate_insn. Group them together, and guard them with a single "if" that lets the processor quickly skip them all. Make this more effective by adding two more flag bits that say whether the .intercept and .check_perm fields are valid. We will reuse these flags later to avoid initializing fields of the emulate_ctxt struct.

This skims about 30 cycles off each emulated instruction, which is approximately a 3% improvement.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

The only purpose of this patch is to make the next patch simpler to review. No semantic change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Despite the provisions to emulate up to 130 consecutive instructions, in practice KVM will emulate just one before exiting handle_invalid_guest_state, because x86_emulate_instruction always sets KVM_REQ_EVENT.

However, we only need to do this if an interrupt could be injected, which happens a) if an interrupt shadow bit (STI or MOV SS) has gone away; b) if the interrupt flag has just been set (instructions other than STI can set it without enabling an interrupt shadow).

This cuts another 700-900 cycles from the cost of emulating an instruction (measured on a Sandy Bridge Xeon: 1650-2600 cycles before the patch on kvm-unit-tests, 925-1700 afterwards).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
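A hedged sketch of that condition; the variable names here are illustrative, not KVM's:

    /* Sketch: only request an event check when an interrupt window may
     * have opened, i.e. a shadow bit was cleared or IF was just set. */
    if ((shadow_before & ~shadow_after) || (!if_before && if_after))
            kvm_make_request(KVM_REQ_EVENT, vcpu);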
-
Submitted by Paolo Bonzini

For the next patch we will need to know the full state of the interrupt shadow; we will then set KVM_REQ_EVENT when one bit is cleared.

However, right now get_interrupt_shadow only returns the one corresponding to the emulated instruction, or an unconditional 0 if the emulated instruction does not have an interrupt shadow. This is confusing and does not allow us to check for cleared bits as mentioned above.

Clean the callback up, and modify toggle_interruptibility to match the comment above the call. As a small result, the call to set_interrupt_shadow will be skipped in the common case where int_shadow == 0 && mask == 0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

About 25% of the time spent in emulation of invalid guest state is wasted in checking whether emulation is required for the next instruction. However, this almost never changes except when a segment register (or TR or LDTR) changes, or when there is a mode transition (i.e. CR0 changes).

In fact, vmx_set_segment and vmx_set_cr0 already modify vmx->emulation_required (except that the former for some reason uses |= instead of just an assignment). So there is no need to call guest_state_valid in the emulation loop.

Emulation performance test results indicate 1650-2600 cycles for common instructions, versus 2300-3200 before this patch on a Sandy Bridge Xeon.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
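A small sketch of the loop shape after such a change (details elided; only the cached flag and the setters named above are assumed):

    /* Sketch: rely on the cached flag, which is updated only where it can
     * change (vmx_set_segment / vmx_set_cr0), instead of re-running
     * guest_state_valid() for every emulated instruction. */
    while (vmx->emulation_required && count-- != 0) {
            /* emulate one instruction; exit and error handling elided */
    }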
-
Submitted by Matthias Lange

Since commit 575203 the MCE subsystem in the Linux kernel for AMD sets bit 18 in MSR_K7_HWCR. Running such a kernel as a guest in KVM on an AMD host results in a GPE injected into the guest because kvm_set_msr_common returns 1. This patch fixes this by masking bit 18 from the MSR value desired by the guest.

Signed-off-by: Matthias Lange <matthias.lange@kernkonzept.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
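A hedged sketch of the masking described; the surrounding MSR_K7_HWCR handling is elided and the bit value simply follows from "bit 18":

    case MSR_K7_HWCR:
            data &= ~(1ULL << 18);  /* drop bit 18 instead of rejecting the write */
            /* existing HWCR checks continue with the masked value */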
-
Submitted by Nadav Amit

We encountered a scenario in which, after an INIT is delivered, a pending interrupt is delivered, although it was sent before the INIT. As the SDM states in section 10.4.7.1, the ISR and the IRR should be cleared after INIT, as KVM does. This also means that pending interrupts should be cleared.

This patch clears the pending interrupts upon reset (and INIT), and on the same occasion clears the pending exceptions, since they may cause a similar issue.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Jim Mattson

We have noticed that qemu-kvm hangs early in the BIOS when running nested under some versions of VMware ESXi.

The problem, we believe, is that KVM assumes the platform preserves the 'G' bit for any segment register. The SVM specification itemizes the segment attribute bits that are observed by the CPU, but the (G)ranularity bit is not one of the bits itemized, for any segment. Though current AMD CPUs keep track of the (G)ranularity bit for all segment registers other than CS, the specification does not require it. VMware's virtual CPU may not track the (G)ranularity bit for any segment register.

Since KVM already synthesizes the (G)ranularity bit for the CS segment, it should do so for all segments. The patch below does that, and helps get rid of the hangs.

Patch applies on top of Linus' tree.

Signed-off-by: Jim Mattson <jmattson@vmware.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
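A minimal sketch of how the bit can be synthesized, assuming a kvm_segment-style "var" filled from the VMCB; the exact code in svm.c may differ:

    /* Sketch: derive (G)ranularity from the limit rather than trusting
     * whatever attribute value the platform happened to preserve. */
    var->g = var->limit > 0xfffff;  /* limits above 1 MiB imply 4K granularity */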
-
- July 10, 2014: 13 commits
-
-
Submitted by David Hildenbrand

This patch
- adds s390 specific MP states to linux headers and documents them
- implements the KVM_{SET,GET}_MP_STATE ioctls
- enables KVM_CAP_MP_STATE
- allows user space to control the VCPU state on s390.

If user space sets the VCPU state using the ioctl KVM_SET_MP_STATE, we can disable manual changing of the VCPU state and trust user space to do the right thing.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
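A short user-space sketch of driving the state through the new ioctl (error handling trimmed; vcpu_fd is assumed to be an already-created KVM VCPU file descriptor):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    /* Sketch: once user space issues KVM_SET_MP_STATE, it owns the VCPU
     * state; otherwise the kernel keeps using the old handling. */
    static int stop_vcpu(int vcpu_fd)
    {
            struct kvm_mp_state mp = { .mp_state = KVM_MP_STATE_STOPPED };
            return ioctl(vcpu_fd, KVM_SET_MP_STATE, &mp);
    }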
-
Submitted by David Hildenbrand

Highlight the aspects of the ioctls that are actually specific to x86 and ia64. As the defined restrictions (irqchip) and mp states may not apply to other architectures, these parts are flagged as belonging to x86 and ia64. This is in preparation for the use of KVM_(S|G)ET_MP_STATE by s390.

Also fix a spelling error (KVM_SET_MP_STATE vs. KVM_SET_MPSTATE) on the way.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Submitted by David Hildenbrand

The function "__cpu_is_stopped" is not used any more. Let's remove it and expose the function "is_vcpu_stopped" instead, which is actually what we want. This patch also converts an open-coded check for CPUSTAT_STOPPED to is_vcpu_stopped().

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Submitted by David Hildenbrand

Let's move the finalization of SIGP STOP and SIGP STOP AND STORE STATUS orders to the point where the VCPU is actually stopped. This change is needed to prepare for a user space driven VCPU state change. The action_bits may only be cleared when setting the cpu state to STOPPED while holding the local irq lock.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Submitted by David Hildenbrand

A SIGP STOP (AND STORE STATUS) order is complete as soon as the VCPU has been stopped. This patch makes sure that only one SIGP STOP (AND STORE STATUS) may be pending at a time (as defined by the architecture). If the action_bits are still set, a SIGP STOP has been issued but not completed yet; the VCPU is busy for further SIGP STOP orders.

Also set CPUSTAT_STOP_INT after the action_bits variable has been modified (the same order that is used when injecting a KVM_S390_SIGP_STOP from userspace).

Both changes are needed in preparation for a user space driven VCPU state change (to avoid race conditions).

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Submitted by James Hogan

Document the MIPS specific parts of the KVM API, including:
- The layout of the kvm_regs structure.
- The interrupt number passed to KVM_INTERRUPT.
- The registers supported by the KVM_{GET,SET}_ONE_REG interface, and the encoding of those register ids.
- That KVM_INTERRUPT and KVM_GET_REG_LIST are supported on MIPS.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by James Hogan

Some of the MIPS registers that can be accessed with the KVM_{GET,SET}_ONE_REG interface have fairly long names, so widen the Register column of the table in the KVM_SET_ONE_REG documentation to allow them to fit. Tabs in the table are replaced with spaces at the same time for consistency.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by James Hogan

KVM_SET_SIGNAL_MASK is implemented in generic code and isn't x86 specific, so document it as being applicable to all architectures.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Nadav Amit

In two cases lapic.c does not use the apic_debug macro correctly. This patch fixes them.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Tomasz Grabiec

I've observed kvmclock being marked as unstable on a modern single-socket system with a stable TSC and qemu-1.6.2 or qemu-2.0.0. The culprit was a failure in TSC matching because of overflow of kvm_arch::nr_vcpus_matched_tsc in case there were multiple TSC writes in a single synchronization cycle.

It turns out that qemu does multiple TSC writes during init; below is the evidence of that (qemu-2.0.0).

The first one:

 0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
 0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
 0xffffffffa04cfd6b : kvm_arch_vcpu_postcreate+0x4b/0x80 [kvm]
 0xffffffffa04b8188 : kvm_vm_ioctl+0x418/0x750 [kvm]

The second one:

 0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
 0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
 0xffffffffa090610d : vmx_set_msr+0x29d/0x350 [kvm_intel]
 0xffffffffa04be83b : do_set_msr+0x3b/0x60 [kvm]
 0xffffffffa04c10a8 : msr_io+0xc8/0x160 [kvm]
 0xffffffffa04caeb6 : kvm_arch_vcpu_ioctl+0xc86/0x1060 [kvm]
 0xffffffffa04b6797 : kvm_vcpu_ioctl+0xc7/0x5a0 [kvm]

 #0 kvm_vcpu_ioctl at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1780
 #1 kvm_put_msrs at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1270
 #2 kvm_arch_put_registers at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1909
 #3 kvm_cpu_synchronize_post_init at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1641
 #4 cpu_synchronize_post_init at /build/buildd/qemu-2.0.0+dfsg/include/sysemu/kvm.h:330
 #5 cpu_synchronize_all_post_init () at /build/buildd/qemu-2.0.0+dfsg/cpus.c:521
 #6 main at /build/buildd/qemu-2.0.0+dfsg/vl.c:4390

The third one:

 0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
 0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
 0xffffffffa090610d : vmx_set_msr+0x29d/0x350 [kvm_intel]
 0xffffffffa04be83b : do_set_msr+0x3b/0x60 [kvm]
 0xffffffffa04c10a8 : msr_io+0xc8/0x160 [kvm]
 0xffffffffa04caeb6 : kvm_arch_vcpu_ioctl+0xc86/0x1060 [kvm]
 0xffffffffa04b6797 : kvm_vcpu_ioctl+0xc7/0x5a0 [kvm]

 #0 kvm_vcpu_ioctl at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1780
 #1 kvm_put_msrs at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1270
 #2 kvm_arch_put_registers at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1909
 #3 kvm_cpu_synchronize_post_reset at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1635
 #4 cpu_synchronize_post_reset at /build/buildd/qemu-2.0.0+dfsg/include/sysemu/kvm.h:323
 #5 cpu_synchronize_all_post_reset () at /build/buildd/qemu-2.0.0+dfsg/cpus.c:512
 #6 main at /build/buildd/qemu-2.0.0+dfsg/vl.c:4482

The fix is to count each vCPU only once when matched, so that nr_vcpus_matched_tsc holds the size of the matched set. This is achieved by reusing generation counters: every vCPU with this_tsc_generation == cur_tsc_generation is in the matched set. The matched set is cleared by setting cur_tsc_generation to a value which no other vCPU is set to (by incrementing it).

I needed to bump the counter size from u8 to u64 to ensure it never overflows. Otherwise, in cases where the TSC is not written the same number of times on each vCPU, the counter could overflow and incorrectly indicate some vCPUs as being in the matched set. This scenario seems unlikely, but I'm not sure if it can be disregarded.

Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
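A hedged sketch of the counting logic described above; the field names follow the commit text, and the surrounding locking and generation bumping are elided:

    /* Sketch: a vCPU is in the matched set iff its this_tsc_generation
     * equals cur_tsc_generation, so it is counted at most once per cycle. */
    bool already_matched =
            (vcpu->arch.this_tsc_generation == kvm->arch.cur_tsc_generation);

    if (!matched)
            kvm->arch.nr_vcpus_matched_tsc = 0;  /* new cycle: reset the set */
    else if (!already_matched)
            kvm->arch.nr_vcpus_matched_tsc++;    /* count this vCPU once     */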
-
Submitted by Jan Kiszka

Obtaining the port number from DX is bogus because a) there are immediate port accesses and b) user space may have changed the register content while processing the PIO access. Forward the correct value from the instruction emulator instead.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Jan Kiszka

The access size of an in/ins is reported in dst_bytes, and that of out/outs in src_bytes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Jan Kiszka

First, kvm_read_guest returns 0 on success. And then we need to take the access size into account when testing the bitmap: intercept if any of the bits corresponding to the access are set.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
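A rough sketch of the corrected check, assuming "size" is the access width in bytes and the permission bitmap has one bit per port; the names and constants are illustrative, not the exact SVM nested-intercept code:

    /* Sketch: the read succeeded only when kvm_read_guest() returns 0,
     * and the access intercepts if ANY bit covering the accessed ports
     * is set in the I/O permission bitmap. */
    u16 bits = 0;
    u16 mask = ((1u << size) - 1) << (port & 7);
    if (kvm_read_guest(kvm, iopm_gpa + port / 8, &bits, 2))
            return NESTED_EXIT_DONE;            /* read failed: intercept */
    return (bits & mask) ? NESTED_EXIT_DONE : NESTED_EXIT_HOST;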
-