Commits · 7a72f7a140bfd3a5dae73088947010bfdbcf6a40 · openanolis / cloud-kernel

04 Dec, 2014 15 commits

KVM: track pid for VCPU only on KVM_RUN ioctl · 7a72f7a1

Committed by Christian Borntraeger on Aug 05, 2014

We currently track the pid of the task that runs the VCPU in vcpu_load.
If a yield to that VCPU is triggered while the PID of the wrong thread
is active, the wrong thread might receive a yield, but this will most
likely not help the executing thread at all.  Instead, if we only track
the pid on the KVM_RUN ioctl, there are two possibilities:

1) the thread that did a non-KVM_RUN ioctl is holding a mutex that
the VCPU thread is waiting for.  In this case, the VCPU thread is not
runnable, but we also do not do a wrong yield.

2) the thread that did a non-KVM_RUN ioctl is sleeping, or doing
something that does not block the VCPU thread.  In this case, the
VCPU thread can receive the directed yield correctly.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Rik van Riel <riel@redhat.com>
CC: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
CC: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7a72f7a1

KVM: don't check for PF_VCPU when yielding · eed6e79d

Committed by David Hildenbrand on Nov 25, 2014

kvm_enter_guest() has to be called with preemption disabled and will
set PF_VCPU.  Current code takes PF_VCPU as a hint that the VCPU thread
is running and therefore needs no yield.

However, the check on PF_VCPU is wrong on s390, where preemption has
to stay enabled in order to correctly process page faults.  Thus,
s390 reenables preemption and starts to execute the guest.  The thread
might be scheduled out between kvm_enter_guest() and kvm_exit_guest(),
resulting in PF_VCPU being set but not being run.  When this happens,
the opportunity for directed yield is missed.

However, this check is done already in kvm_vcpu_on_spin before calling
kvm_vcpu_yield_loop:

        if (!ACCESS_ONCE(vcpu->preempted))
                continue;

so the check on PF_VCPU is superfluous in general, and this patch
removes it.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

eed6e79d

kvm: optimize GFN to memslot lookup with large slots amount · 9c1a5d38

Committed by Igor Mammedov on Dec 01, 2014

The current linear search doesn't scale well when a
large number of memslots is used and the slot being looked up
is not near the beginning of the memslots array.
Taking into account that memslots don't overlap, it's
possible to switch the sorting order of the memslots array from
'npages' to 'base_gfn' and use binary search for
memslot lookup by GFN.

As a result of switching to binary search, lookup times
are reduced when a large number of memslots is in use.

The following table shows search_memslot() cycles
during WS2008R2 guest boot.

                                     boot,          boot + ~10 min
                                     mostly same    of using it,
                                     slot lookup    randomized lookup
                            max      average        average
                            cycles   cycles         cycles

13 slots                  : 1450       28           30
13 slots, binary search   : 1400       30           40
117 slots                 : 13000      30           460
117 slots, binary search  : 2000       35           180
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9c1a5d38
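
To illustrate the lookup described in the commit above, here is a minimal user-space sketch of a binary search over non-overlapping slots sorted by base GFN. The `struct slot` type, the `find_slot` name, and the ascending sort order are assumptions made for the example; the kernel's actual data structures and ordering are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical, simplified stand-in for a memory slot. */
struct slot {
    gfn_t base_gfn;   /* first guest frame number covered by the slot */
    uint64_t npages;  /* number of pages in the slot */
};

/*
 * Binary search over slots sorted by ascending base_gfn.
 * The slots must not overlap. Returns NULL when no slot contains gfn.
 */
static const struct slot *find_slot(const struct slot *slots, size_t n,
                                    gfn_t gfn)
{
    size_t lo = 0, hi = n;   /* search window is [lo, hi) */

    assert(slots != NULL || n == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (gfn < slots[mid].base_gfn)
            hi = mid;                 /* gfn lies below this slot */
        else if (gfn >= slots[mid].base_gfn + slots[mid].npages)
            lo = mid + 1;             /* gfn lies above this slot */
        else
            return &slots[mid];       /* gfn falls inside this slot */
    }
    return NULL;
}
```

Because the slots do not overlap, sorting by base GFN also orders their end addresses, which is what makes the three-way comparison valid.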

kvm: change memslot sorting rule from size to GFN · 0e60b079

Committed by Igor Mammedov on Dec 01, 2014

This allows binary search to be used for GFN -> memslot
lookups, reducing lookup cost when the number of slots is large.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

0e60b079

kvm: search_memslots: add simple LRU memslot caching · d4ae84a0

Committed by Igor Mammedov on Dec 01, 2014

In a typical guest boot workload only 2-3 memslots are used
extensively, and most lookups hit the same memslot as the
previous one.

Adding an LRU cache improves average lookup time from
46 to 28 cycles (~40%) for this workload.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d4ae84a0
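
The caching idea in the commit above can be sketched as a single remembered index that is tried before the full search. This is only an illustration: `struct slot_table`, `last_hit` and `lookup_cached` are hypothetical names, and the slow path here is a plain linear scan rather than whatever the kernel uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memory slot. */
struct slot {
    uint64_t base_gfn;
    uint64_t npages;
};

/* Slot array plus a one-entry cache of the most recently matched index. */
struct slot_table {
    const struct slot *slots;
    size_t count;
    size_t last_hit;
};

static int slot_contains(const struct slot *s, uint64_t gfn)
{
    return gfn >= s->base_gfn && gfn < s->base_gfn + s->npages;
}

static const struct slot *lookup_cached(struct slot_table *t, uint64_t gfn)
{
    size_t i;

    assert(t != NULL);

    /* Fast path: the slot that matched last time often matches again. */
    if (t->last_hit < t->count && slot_contains(&t->slots[t->last_hit], gfn))
        return &t->slots[t->last_hit];

    /* Slow path: full scan, remembering the hit for the next lookup. */
    for (i = 0; i < t->count; i++) {
        if (slot_contains(&t->slots[i], gfn)) {
            t->last_hit = i;
            return &t->slots[i];
        }
    }
    return NULL;   /* a miss leaves the cached index unchanged */
}
```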

kvm: update_memslots: drop not needed check for the same slot · 7f379cff

Committed by Igor Mammedov on Dec 01, 2014

The UP/DOWN shift loops shift the array in the needed
direction and stop at the place where the new slot should
be placed, regardless of the old slot size.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7f379cff

kvm: update_memslots: drop not needed check for the same number of pages · 5a38b6e6

Committed by Igor Mammedov on Dec 01, 2014

If the number of pages hasn't changed, the sorting algorithm
will do nothing, so there is no need for an extra check
to avoid entering the sorting logic.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5a38b6e6

KVM: x86: allow 256 logical x2APICs again · 45c3094a

Committed by Radim Krčmář on Nov 27, 2014

While fixing an x2apic bug,
 17d68b76 KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376)
we've made only one cluster available.  This means that the number of
logically addressable x2APICs was reduced to 16 and VCPUs kept
overwriting themselves in that region, so even the first cluster wasn't
set up correctly.

This patch extends x2APIC support back to the logical_map's limit, and
keeps the CVE fixed as messages for non-present APICs are dropped.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

45c3094a
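
As background for the commit above: in x2APIC mode the logical ID is derived from the x2APIC ID, with a cluster number in the upper 16 bits and a one-hot position in the lower 16 bits. The sketch below computes that value and maps it into a 16-cluster by 16-position table, which gives the 256 entries mentioned in the title. The function names and the table layout are assumptions for illustration, not the kernel's `logical_map` code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Derive the x2APIC logical ID from the x2APIC ID:
 * cluster = ID[19:4] in bits 31:16, one-hot bit (1 << ID[3:0]) in bits 15:0.
 */
static uint32_t x2apic_ldr(uint32_t id)
{
    return (((id >> 4) & 0xffffu) << 16) | (1u << (id & 0xfu));
}

/*
 * Map a logical ID to an index in a hypothetical 16 x 16 table.
 * Returns -1 when the cluster is out of range or the low half is
 * not exactly one bit (i.e. it does not name a single APIC).
 */
static int x2apic_map_index(uint32_t ldr)
{
    uint32_t cluster = ldr >> 16;
    uint32_t mask = ldr & 0xffffu;
    int pos;

    if (cluster >= 16 || mask == 0 || (mask & (mask - 1)) != 0)
        return -1;

    for (pos = 0; !(mask & 1u); pos++)
        mask >>= 1;

    return (int)(cluster * 16u) + pos;
}
```

With only cluster 0 available, every ID would collapse onto indices 0 to 15, which matches the overwriting problem the commit message describes.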

KVM: x86: check bounds of APIC maps · 25995e5b

Committed by Radim Krčmář on Nov 27, 2014

They can't be violated now, but play it safe for the future.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

25995e5b

KVM: x86: fix APIC physical destination wrapping · fa834e91

Committed by Radim Krčmář on Nov 27, 2014

x2apic allows destinations > 0xff and we don't want them delivered to
lower APICs.  They are correctly handled by doing nothing.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fa834e91
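
The wrapping problem in the commit above, together with the bounds check from the commit before it, can be sketched with a 256-entry lookup table. Everything here is hypothetical (`struct apic_stub`, `PHYS_MAP_SIZE`, `phys_lookup`); the point is only that an out-of-range destination is rejected rather than truncated to its low 8 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PHYS_MAP_SIZE 256u   /* assumed table size for the sketch */

/* Hypothetical placeholder for a local APIC object. */
struct apic_stub {
    uint32_t id;
};

/*
 * Look up a physical destination. Destinations beyond the table are
 * treated as "no such APIC" (NULL) so the caller does nothing, instead
 * of indexing with (dest & 0xff) and hitting a lower APIC by mistake.
 */
static struct apic_stub *phys_lookup(struct apic_stub *const *map,
                                     uint32_t dest)
{
    assert(map != NULL);

    if (dest >= PHYS_MAP_SIZE)
        return NULL;
    return map[dest];
}
```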

KVM: x86: deliver phys lowest-prio · 085563fb

Committed by Radim Krčmář on Nov 27, 2014

Physical mode can't address more than one APIC, but lowest-prio is
allowed, so we just reuse our paths.

SDM 10.6.2.1 Physical Destination:
  Also, for any non-broadcast IPI or I/O subsystem initiated interrupt
  with lowest priority delivery mode, software must ensure that APICs
  defined in the interrupt address are present and enabled to receive
  interrupts.

We could warn on top of that.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

085563fb

KVM: x86: don't retry hopeless APIC delivery · 698f9755

Committed by Radim Krčmář on Nov 27, 2014

A false return from kvm_irq_delivery_to_apic_fast() means that we don't
handle it in the fast path, but we still return false in cases that were
handled perfectly well.  Fix that.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

698f9755

KVM: x86: use MSR_ICR instead of a number · decdc283

Committed by Radim Krčmář on Nov 26, 2014

0x830 MSR is 0x300 xAPIC MMIO, which is MSR_ICR.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

decdc283
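
The relationship stated in the commit above follows from how x2APIC maps registers into MSR space: the MSR range starts at 0x800 and each 16-byte xAPIC MMIO register takes one MSR. A small sketch of the arithmetic, using made-up names (`X2APIC_MSR_BASE`, `x2apic_msr_for_offset`) rather than the kernel's own macros:

```c
#include <assert.h>
#include <stdint.h>

#define X2APIC_MSR_BASE 0x800u   /* start of the x2APIC MSR range */

/* Convert an xAPIC MMIO register offset to the matching x2APIC MSR. */
static uint32_t x2apic_msr_for_offset(uint32_t mmio_offset)
{
    /* xAPIC registers are 16-byte aligned, so offset / 16 is the index. */
    assert((mmio_offset & 0xfu) == 0);
    return X2APIC_MSR_BASE + (mmio_offset >> 4);
}
```

For the ICR at MMIO offset 0x300 this gives 0x800 + 0x30 = 0x830, the number the commit replaces with a named constant.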

KVM: x86: Fix reserved x2apic registers · c69d3d9b

Committed by Nadav Amit on Nov 26, 2014

x2APIC has no registers for DFR and ICR2 (see Intel SDM 10.12.1.2 "x2APIC
Register Address Space"). KVM needs to cause #GP on such accesses.

Fix it (DFR and ICR2 on read, ICR2 on write, DFR already handled on writes).
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c69d3d9b

KVM: x86: Generate #UD when memory operand is required · 39f062ff

Committed by Nadav Amit on Nov 26, 2014

Certain x86 instructions that use modrm operands only allow a memory operand
(i.e., mod 0, 1 or 2), and cause a #UD exception otherwise. KVM ignores this fact.
Currently, the instructions of this kind that KVM emulates are MOVBE,
MOVNTPS, MOVNTPD and MOVNTI.  MOVBE is the most blatant example, since it may be
emulated by the host regardless of MMIO.

The fix introduces a new group for handling such instructions, marking mod3 as
an illegal instruction.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

39f062ff
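
For the commit above, the relevant part of instruction decoding is the two-bit mod field in the top of the ModRM byte: the value 3 selects a register operand, while 0, 1 and 2 select memory forms. The helper names below are hypothetical and the sketch only shows the check, not how the emulator's opcode tables are organised.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the mod field (bits 7:6) of a ModRM byte. */
static unsigned modrm_mod(uint8_t modrm)
{
    return modrm >> 6;
}

/*
 * For an instruction that only accepts a memory operand, mod == 3
 * (register form) is invalid and should raise #UD.
 * Returns 1 when the encoding is acceptable, 0 when it should fault.
 */
static int memory_only_operand_ok(uint8_t modrm)
{
    return modrm_mod(modrm) != 3;
}
```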

03 Dec, 2014 1 commit

Merge tag 'kvm-s390-next-20141128' of... · be06b6be

Committed by Paolo Bonzini on Dec 03, 2014

Merge tag 'kvm-s390-next-20141128' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Several fixes, cleanups and reworks

Here is a bunch of fixes that deal mostly with architectural compliance:
- interrupt priorities
- interrupt handling
- instruction exit handling

We also provide a helper function for getting the guest visible storage key.

be06b6be

28 Nov, 2014 11 commits

KVM: s390: allow injecting all kinds of machine checks · fc2020cf

Committed by Jens Freimann on Aug 13, 2014

Allow specifying CR14, logout area, external damage code
and failed storage address.

Since more than one machine check can be indicated to the guest at
a time we need to combine all indication bits with already pending
requests.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

fc2020cf

KVM: s390: handle pending local interrupts via bitmap · 383d0b05

Committed by Jens Freimann on Jul 29, 2014

This patch adapts handling of local interrupts to be more compliant with
the z/Architecture Principles of Operation and introduces a data
structure which allows more efficient handling of interrupts.

* get rid of li->active flag, use bitmap instead
* Keep interrupts in a bitmap instead of a list
* Deliver interrupts in the order of their priority as defined in the
  PoP
* Use a second bitmap for sigp emergency requests, as a CPU can have
  one request pending from every other CPU in the system.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

383d0b05
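
The bitmap approach listed in the commit above can be sketched as follows: if each interrupt type is assigned a bit position in priority order, delivering the highest-priority pending interrupt is a matter of finding the lowest set bit. The enum values and function names are invented for the example (they are not the s390 interrupt types), and the convention that bit 0 is the highest priority is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interrupt types, listed from highest to lowest priority. */
enum demo_irq {
    DEMO_IRQ_MACHINE_CHECK = 0,
    DEMO_IRQ_PROGRAM       = 1,
    DEMO_IRQ_EXTERNAL      = 2,
    DEMO_IRQ_IO            = 3,
    DEMO_IRQ_COUNT
};

/* Mark an interrupt type as pending. */
static void irq_set_pending(uint64_t *pending, enum demo_irq irq)
{
    assert(irq >= 0 && irq < DEMO_IRQ_COUNT);
    *pending |= UINT64_C(1) << irq;
}

/*
 * Remove and return the highest-priority pending interrupt,
 * or -1 when nothing is pending.
 */
static int irq_take_next(uint64_t *pending)
{
    int i;

    for (i = 0; i < DEMO_IRQ_COUNT; i++) {
        uint64_t bit = UINT64_C(1) << i;

        if (*pending & bit) {
            *pending &= ~bit;
            return i;
        }
    }
    return -1;
}
```

Unlike a list, the order in which interrupts were injected does not matter here: delivery order is fixed by the bit positions alone.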

KVM: s390: add bitmap for handling cpu-local interrupts · c0e6159d

Committed by Jens Freimann on Jul 29, 2013

Add a bitmap to the vcpu structure which is used to keep track
of local pending interrupts. Also add an enum with all interrupt
types sorted in order of priority (highest to lowest).
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

c0e6159d

KVM: s390: refactor interrupt delivery code · 0fb97abe

Committed by Jens Freimann on Jul 29, 2014

Move delivery code for cpu-local interrupt from the huge do_deliver_interrupt()
to smaller functions which handle one type of interrupt.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

0fb97abe

KVM: s390: add defines for virtio and pfault interrupt code · 60f90a14

Committed by Jens Freimann on Nov 10, 2014

Get rid of open coded value for virtio and pfault completion interrupts.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

60f90a14

KVM: s390: external param not valid for cpu timer and ckc · af43eb2f

Committed by David Hildenbrand on Nov 07, 2014

The 32bit external interrupt parameter is only valid for timing-alert and
service-signal interrupts.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

af43eb2f

KVM: s390: refactor interrupt injection code · 0146a7b0

Committed by Jens Freimann on Jul 28, 2014

In preparation for the rework of the local interrupt injection code,
factor out injection routines from kvm_s390_inject_vcpu().
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

0146a7b0

KVM: S390: Create helper function get_guest_storage_key · 9fcf93b5

Committed by Jason J. Herne on Sep 23, 2014

Define get_guest_storage_key which can be used to get the value of a guest
storage key. This complements the functionality provided by the helper function
set_guest_storage_key. Both functions are needed for live migration of s390
guests that use storage keys.
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

9fcf93b5

KVM: s390: trigger the right CPU exit for floating interrupts · da00fcbd

Committed by Christian Borntraeger on Nov 21, 2014

When injecting a floating interrupt and no CPU is idle we
kick one CPU to do an external exit. In case of I/O we
should trigger an I/O exit instead. This does not matter
for Linux guests as external and I/O interrupts are
enabled/disabled at the same time, but play safe anyway.

The same holds true for machine checks. Since there is no
special exit, just reuse the generic stop exit. The injection
code inside the VCPU loop will recheck anyway and rearm the
proper exits (e.g. control registers) if necessary.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>

da00fcbd

KVM: s390: Fix rewinding of the PSW pointing to an EXECUTE instruction · 04b41acd

Committed by Thomas Huth on Nov 12, 2014

A couple of our interception handlers rewind the PSW to the beginning
of the instruction to run the intercepted instruction again during the
next SIE entry. This normally works fine, but there is also the
possibility that the instruction did not get run directly but via an
EXECUTE instruction.
In this case, the PSW does not point to the instruction that caused the
interception, but to the EXECUTE instruction! So we've got to rewind the
PSW to the beginning of the EXECUTE instruction instead.
This is now accomplished with a new helper function kvm_s390_rewind_psw().
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

04b41acd

KVM: s390: Small fixes for the PFMF handler · a02689fe

Committed by Thomas Huth on Nov 10, 2014

This patch includes two small fixes for the PFMF handler: First, the
start address for PFMF has to be masked according to the current
addressing mode, which is now done with kvm_s390_logical_to_effective().
Second, the protection exceptions have a lower priority than the
specification exceptions, so the check for low-address protection
has to be moved after the last spot where we inject a specification
exception.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

a02689fe
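
The first fix in the commit above concerns masking an address according to the current addressing mode. On s390 the modes are 24-bit, 31-bit and 64-bit, so the idea can be sketched as below. The enum and the `to_effective` function are stand-ins written for this example; they are not the kernel's kvm_s390_logical_to_effective(), which takes its mode from the guest PSW.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation of the current addressing mode. */
enum demo_amode {
    DEMO_AMODE_24,
    DEMO_AMODE_31,
    DEMO_AMODE_64
};

/* Truncate a logical address to the width of the addressing mode. */
static uint64_t to_effective(uint64_t addr, enum demo_amode mode)
{
    switch (mode) {
    case DEMO_AMODE_24:
        return addr & UINT64_C(0x0000000000ffffff);
    case DEMO_AMODE_31:
        return addr & UINT64_C(0x000000007fffffff);
    case DEMO_AMODE_64:
        return addr;
    }
    assert(!"unknown addressing mode");
    return addr;
}
```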

24 Nov, 2014 4 commits

kvm: x86: avoid warning about potential shift wrapping bug · 2b4a273b

Committed by Paolo Bonzini on Nov 24, 2014

cs.base is declared as a __u64 variable and vector is a u32 so this
causes a static checker warning.  The user indeed can set "sipi_vector"
to any u32 value in kvm_vcpu_ioctl_x86_set_vcpu_events(), but the
value should really have 8-bit precision only.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2b4a273b
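
The warning described in the commit above comes from shifting a 32-bit value and then storing the result in a 64-bit variable: the shift happens at 32-bit width, so high bits are lost before the widening. The sketch below contrasts that with widening first and with limiting the input to 8 bits. The shift amount of 12 and the function names are assumptions for illustration; the log does not show which remedy the patch chose.

```c
#include <assert.h>
#include <stdint.h>

/* Shift performed in 32 bits, then widened: upper bits are lost. */
static uint64_t base_shift_then_widen(uint32_t vector)
{
    return vector << 12;
}

/* Widened first, then shifted: no bits are lost. */
static uint64_t base_widen_then_shift(uint32_t vector)
{
    return (uint64_t)vector << 12;
}

/* Input limited to 8 bits, which cannot overflow a 32-bit shift by 12. */
static uint64_t base_from_8bit_vector(uint32_t vector)
{
    return (uint64_t)(uint8_t)vector << 12;
}
```

For any vector that fits in 8 bits all three functions agree, which is why the commit message calls the issue a warning about a potential bug.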

KVM: x86: move device assignment out of kvm_host.h · c9eab58f

Committed by Paolo Bonzini on Nov 24, 2014

Create a new header, and hide the device assignment functions there.
Move struct kvm_assigned_dev_kernel to assigned-dev.c by modifying
arch/x86/kvm/iommu.c to take a PCI device struct.

Based on a patch by Radim Krcmar <rkrcmark@redhat.com>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c9eab58f

kvm: x86: mask out XSAVES · b65d6e17

Committed by Paolo Bonzini on Nov 21, 2014

This feature is not supported inside KVM guests yet, because we do not emulate
MSR_IA32_XSS.  Mask it out.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b65d6e17

kvm: x86: move assigned-dev.c and iommu.c to arch/x86/ · c274e03a

Committed by Radim Krčmář on Nov 21, 2014

Now that ia64 is gone, we can hide deprecated device assignment in x86.

Notable changes:
 - kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl()

The easy parts were removed from generic kvm code, remaining
 - kvm_iommu_(un)map_pages() would require new code to be moved
 - struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c274e03a

22 Nov, 2014 3 commits

kvm: remove IA64 ioctls · 6b397158

Committed by Radim Krčmář on Nov 20, 2014

KVM ia64 is no longer present so new applications shouldn't use them.
The main problem is that they most likely didn't work even before,
because of a conflict in the #defines:

  #define KVM_SET_GUEST_DEBUG       _IOW(KVMIO,  0x9b, struct kvm_guest_debug)
  #define KVM_IA64_VCPU_SET_STACK   _IOW(KVMIO,  0x9b, void *)

The argument to KVM_SET_GUEST_DEBUG is:

  struct kvm_guest_debug {
  	__u32 control;
  	__u32 pad;
  	struct kvm_guest_debug_arch arch;
  };

  struct kvm_guest_debug_arch {
  };

meaning that sizeof(struct kvm_guest_debug) == sizeof(void *) == 8
and KVM_SET_GUEST_DEBUG == KVM_IA64_VCPU_SET_STACK.

KVM_SET_GUEST_DEBUG is handled in virt/kvm/kvm_main.c before even calling
kvm_arch_vcpu_ioctl (which would have handled KVM_IA64_VCPU_SET_STACK),
so KVM_IA64_VCPU_SET_STACK would just return -EINVAL.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6b397158
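
The collision described in the commit above can be reproduced with the standard Linux `_IOW` macro, which encodes the type, the number and the argument size into the request value. In this Linux-only sketch the `demo_` names are hypothetical; the struct mirrors only the two 32-bit fields quoted in the message (the empty arch struct adds nothing), and 0xAE is the KVM ioctl type.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

#define DEMO_KVMIO 0xAE   /* KVM's ioctl type byte */

/* Mirrors the two fields quoted above; the empty arch struct is omitted. */
struct demo_guest_debug {
    uint32_t control;
    uint32_t pad;
};

/* Request number built the way KVM_SET_GUEST_DEBUG is defined above. */
static unsigned long demo_set_guest_debug_nr(void)
{
    return _IOW(DEMO_KVMIO, 0x9b, struct demo_guest_debug);
}

/* Request number built the way KVM_IA64_VCPU_SET_STACK is defined above. */
static unsigned long demo_ia64_set_stack_nr(void)
{
    return _IOW(DEMO_KVMIO, 0x9b, void *);
}

/* Returns 1 when both definitions encode to the same request value. */
static int demo_requests_collide(void)
{
    return demo_set_guest_debug_nr() == demo_ia64_set_stack_nr();
}
```

On a platform with 8-byte pointers both arguments have size 8, so the two request values are identical and a dispatcher that checks one first can never reach the other.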

kvm: remove CONFIG_X86 #ifdefs from files formerly shared with ia64 · 3bf58e9a

Committed by Radim Krcmar on Nov 21, 2014

Signed-off-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

3bf58e9a

kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/ · 6ef768fa

Committed by Paolo Bonzini on Nov 20, 2014

ia64 does not need them anymore.  Ack notifiers become x86-specific
too.
Suggested-by: Gleb Natapov <gleb@kernel.org>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6ef768fa

20 Nov, 2014 6 commits

kvm: Documentation: remove ia64 · c32a4272

Committed by Tiejun Chen on Nov 20, 2014

kvm/ia64 is gone, clean up Documentation too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c32a4272

KVM: ia64: remove · 003f7de6

Committed by Paolo Bonzini on Nov 19, 2014

KVM for ia64 has been marked as broken not just once, but twice even,
and the last patch from the maintainer is now roughly 5 years old.
Time for it to rest in peace.
Acked-by: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

003f7de6

KVM: x86: Remove FIXMEs in emulate.c · 86619e7b

Committed by Nicholas Krause on Nov 19, 2014

Remove FIXME comments about needing fault addresses to be returned.  These
are propagated from walk_addr_generic to gva_to_gpa and from there to
ops->read_std and ops->write_std.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

86619e7b

KVM: emulator: remove duplicated limit check · 997b0412

Committed by Paolo Bonzini on Nov 19, 2014

The check on the higher limit of the segment, and the check on the
maximum accessible size, is the same for both expand-up and
expand-down segments.  Only the computation of "lim" varies.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

997b0412

KVM: emulator: remove code duplication in register_address{,_increment} · 01485a22

Committed by Paolo Bonzini on Nov 19, 2014

register_address has been a duplicate of address_mask ever since the
ancestor of __linearize was born in 90de84f5 (KVM: x86 emulator:
preserve an operand's segment identity, 2010-11-17).

However, we can put it to a better use by including the call to reg_read
in register_address.  Similarly, the call to reg_rmw can be moved to
register_address_increment.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

01485a22

KVM: x86: Move __linearize masking of la into switch · 31ff6488

Committed by Nadav Amit on Nov 19, 2014

In __linearize there is a check of whether masking of the linear address
is needed.  It occurs immediately after a switch that evaluates the same
condition.  Merge them.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

31ff6488

openanolis / cloud-kernel · synced successfully more than 1 year ago