提交 · b52f3ed02221252d8ee2c7d756e76fad4a5e84f6 · openeuler / raspberrypi-kernel

12 5月, 2016 4 次提交

irqbypass: Disallow NULL token · b52f3ed0

由 Alex Williamson 提交于 5月 05, 2016

A NULL token is meaningless and can only lead to unintended problems.
Error on registration with a NULL token, ignore de-registrations with
a NULL token.
Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b52f3ed0

kvm: introduce KVM_MAX_VCPU_ID · 0b1b1dfd

由 Greg Kurz 提交于 5月 09, 2016

The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest, and
also the upper limit for vCPU ids. This is okay for all archs except PowerPC
which can have higher ids, depending on the cpu/core/thread topology. In the
worst case (single threaded guest, host with 8 threads per core), it limits
the maximum number of vCPUS to KVM_MAX_VCPUS / 8.

This patch separates the vCPU numbering from the total number of vCPUs, with
the introduction of KVM_MAX_VCPU_ID, as the maximal valid value for vCPU ids
plus one.

The corresponding KVM_CAP_MAX_VCPU_ID allows userspace to validate vCPU ids
before passing them to KVM_CREATE_VCPU.

This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
Other archs continue to return KVM_MAX_VCPUS instead.
Suggested-by: NRadim Krcmar <rkrcmar@redhat.com>
Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0b1b1dfd

KVM: remove NULL return path for vcpu ids >= KVM_MAX_VCPUS · 9b9e3fc4

由 Greg Kurz 提交于 5月 09, 2016

Commit c896939f ("KVM: use heuristic for fast VCPU lookup by id") added
a return path that prevents vcpu ids to exceed KVM_MAX_VCPUS. This is a
problem for powerpc where vcpu ids can grow up to 8*KVM_MAX_VCPUS.

This patch simply reverses the logic so that we only try fast path if the
vcpu id can be tried as an index in kvm->vcpus[]. The slow path is not
affected by the change.
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9b9e3fc4

Merge tag 'kvm-arm-for-4.7' of... · bdb4094e

由 Paolo Bonzini 提交于 5月 11, 2016

Merge tag 'kvm-arm-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Changes for Linux v4.7

Reworks our stage 2 page table handling to have page table manipulation
macros separate from those of the host systems as the underlying
hardware page tables can be configured to be noticably different in
layout from the stage 1 page tables used by the host.

Adds 16K page size support based on the above.

Adds a generic firmware probing layer for the timer and GIC so that KVM
initializes using the same logic based on both ACPI and FDT.

Finally adds support for hardware updating of the access flag.

bdb4094e

10 5月, 2016 7 次提交

Merge tag 'kvm-s390-next-4.7-2' of... · 6ac0f61f

由 Paolo Bonzini 提交于 5月 10, 2016

Merge tag 'kvm-s390-next-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: features and fixes for 4.7 part2

- Use hardware provided information about facility bits that do not
  need any hypervisor activitiy
- Add missing documentation for KVM_CAP_S390_RI
- Some updates/fixes for handling cpu models and facilities

6ac0f61f

MIPS: KVM: Add missing disable FPU hazard barriers · 4ac33429

由 James Hogan 提交于 4月 22, 2016

Add the necessary hazard barriers after disabling the FPU in
kvm_lose_fpu(), just to be safe.
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄmÃ¡Å™" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4ac33429

MIPS: KVM: Fix preemption warning reading FPU capability · 556f2a52

由 James Hogan 提交于 4月 22, 2016

Reading the KVM_CAP_MIPS_FPU capability returns cpu_has_fpu, however
this uses smp_processor_id() to read the current CPU capabilities (since
some old MIPS systems could have FPUs present on only a subset of CPUs).

We don't support any such systems, so work around the warning by using
raw_cpu_has_fpu instead.

We should probably instead claim not to support FPU at all if any one
CPU is lacking an FPU, but this should do for now.
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄmÃ¡Å™" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

556f2a52

MIPS: KVM: Fix preemptable kvm_mips_get_*_asid() calls · f049729c

由 James Hogan 提交于 4月 22, 2016

There are a couple of places in KVM fault handling code which implicitly
use smp_processor_id() via kvm_mips_get_kernel_asid() and
kvm_mips_get_user_asid() from preemptable context. This is unsafe as a
preemption could cause the guest kernel ASID to be changed, resulting in
a host TLB entry being written with the wrong ASID.

Fix by disabling preemption around the kvm_mips_get_*_asid() call and
the corresponding kvm_mips_host_tlb_write().
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄmÃ¡Å™" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f049729c

MIPS: KVM: Fix timer IRQ race when writing CP0_Compare · b45bacd2

由 James Hogan 提交于 4月 22, 2016

Writing CP0_Compare clears the timer interrupt pending bit
(CP0_Cause.TI), but this wasn't being done atomically. If a timer
interrupt raced with the write of the guest CP0_Compare, the timer
interrupt could end up being pending even though the new CP0_Compare is
nowhere near CP0_Count.

We were already updating the hrtimer expiry with
kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
kvm_mips_resume_hrtimer(). Close the race window by expanding out
kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
CP0_Compare between the freeze and resume. Since the pending timer
interrupt should not be cleared when CP0_Compare is written via the KVM
user API, an ack argument is added to distinguish the source of the
write.

Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄmÃ¡Å™" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.16.x-
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b45bacd2

MIPS: KVM: Fix timer IRQ race when freezing timer · 4355c44f

由 James Hogan 提交于 4月 22, 2016

There's a particularly narrow and subtle race condition when the
software emulated guest timer is frozen which can allow a guest timer
interrupt to be missed.

This happens due to the hrtimer expiry being inexact, so very
occasionally the freeze time will be after the moment when the emulated
CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
be generated), but before the moment when the hrtimer is due to expire
(so no IRQ is generated). The IRQ won't be generated when the timer is
resumed either, since the resume CP0_Count will already match CP0_Compare.

With VZ guests in particular this is far more likely to happen, since
the soft timer may be frozen frequently in order to restore the timer
state to the hardware guest timer. This happens after 5-10 hours of
guest soak testing, resulting in an overflow in guest kernel timekeeping
calculations, hanging the guest. A more focussed test case to
intentionally hit the race (with the help of a new hypcall to cause the
timer state to migrated between hardware & software) hits the condition
fairly reliably within around 30 seconds.

Instead of relying purely on the inexact hrtimer expiry to determine
whether an IRQ should be generated, read the guest CP0_Compare and
directly check whether the freeze time is before or after it. Only if
CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
determine whether the last IRQ has already been generated (which will
have pushed back the expiry by one timer period).

Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄmÃ¡Å™" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.16.x-
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4355c44f

kvm: arm64: Enable hardware updates of the Access Flag for Stage 2 page tables · 06485053

由 Catalin Marinas 提交于 4月 13, 2016

The ARMv8.1 architecture extensions introduce support for hardware
updates of the access and dirty information in page table entries. With
VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the
PTE_AF bit cleared in the stage 2 page table, instead of raising an
Access Flag fault to EL2 the CPU sets the actual page table entry bit
(10). To ensure that kernel modifications to the page table do not
inadvertently revert a bit set by hardware updates, certain Stage 2
software pte/pmd operations must be performed atomically.

The main user of the AF bit is the kvm_age_hva() mechanism. The
kvm_age_hva_handler() function performs a "test and clear young" action
on the pte/pmd. This needs to be atomic in respect of automatic hardware
updates of the AF bit. Since the AF bit is in the same position for both
Stage 1 and Stage 2, the patch reuses the existing
ptep_test_and_clear_young() functionality if
__HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the
existing pte_young/pte_mkold mechanism is preserved.

The kvm_set_s2pte_readonly() (and the corresponding pmd equivalent) have
to perform atomic modifications in order to avoid a race with updates of
the AF bit. The arm64 implementation has been re-written using
exclusives.

Currently, kvm_set_s2pte_writable() (and pmd equivalent) take a pointer
argument and modify the pte/pmd in place. However, these functions are
only used on local variables rather than actual page table entries, so
it makes more sense to follow the pte_mkwrite() approach for stage 1
attributes. The change to kvm_s2pte_mkwrite() makes it clear that these
functions do not modify the actual page table entries.

The (pte|pmd)_mkyoung() uses on Stage 2 entries (setting the AF bit
explicitly) do not need to be modified since hardware updates of the
dirty status are not supported by KVM, so there is no possibility of
losing such information.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

06485053

09 5月, 2016 9 次提交

x86/kvm: Do not use BIT() in user-exported header · 3dbe3458

由 Borislav Petkov 提交于 5月 05, 2016

Apparently, we're not exporting BIT() to userspace.
Reported-by: NBrooks Moses <bmoses@google.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

3dbe3458

KVM: s390: Populate mask of non-hypervisor managed facility bits · 60a37709

由 Alexander Yarygin 提交于 4月 01, 2016

When a guest is initializing, KVM provides facility bits that can be
successfully used by the guest. It's done by applying
kvm_s390_fac_list_mask mask on host facility bits stored by the STFLE
instruction. Facility bits can be one of two kinds: it's either a
hypervisor managed bit or non-hypervisor managed.

The hardware provides information which bits need special handling.
Let's automatically passthrough to guests new facility bits, that
don't require hypervisor support.
Signed-off-by: NAlexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NEric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

60a37709

s390/sclp: Add hmfai field · 154fa27e

由 Alexander Yarygin 提交于 4月 01, 2016

Let's add hypervisor-managed facility-apportionment indications field to
SCLP structs. KVM will use it to reduce maintenance cost of
Non-Hypervisor-Managed facility bits.
Signed-off-by: NAlexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NEric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

154fa27e

KVM: s390: Enable all facility bits that are known good for passthrough · ed8dda0b

由 Alexander Yarygin 提交于 3月 31, 2016

Some facility bits are in a range that is defined to be "ok for guests
without any necessary hypervisor changes". Enable those bits.
Signed-off-by: NAlexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

ed8dda0b

KVM: s390: document KVM_CAP_S390_RI · 051c87f7

由 David Hildenbrand 提交于 4月 19, 2016

We forgot to document that capability, let's add documentation.
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

051c87f7

KVM: s390: force ibc into valid range · 053dd230

由 David Hildenbrand 提交于 4月 04, 2016

Some hardware variants will round the ibc value up/down themselves,
others will report a validity intercept. Let's always round it up/down.

This patch will also make sure that the ibc is set to 0 in case we don't
have ibc support (lowest_ibc == 0).
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

053dd230

KVM: s390: cleanup cpuid handling · 9bb0ec09

由 David Hildenbrand 提交于 4月 04, 2016

We only have one cpuid for all VCPUs, so let's directly use the one in the
cpu model. Also always store it directly as u64, no need for struct cpuid.
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

9bb0ec09

KVM: s390: enable SRS only if enabled for the guest · bd50e8ec

由 David Hildenbrand 提交于 3月 04, 2016

If we don't have SIGP SENSE RUNNING STATUS enabled for the guest, let's
not enable interpretation so we can correctly report an invalid order.
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

bd50e8ec

KVM: s390: enable PFMFI only if guest has EDAT1 · d6af0b49

由 David Hildenbrand 提交于 3月 04, 2016

Only enable PFMF interpretation if the necessary facility (EDAT1) is
available, otherwise the pfmf handler in priv.c will inject an exception
Reviewed-by: NDominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

d6af0b49

04 5月, 2016 2 次提交

KVM: s390: support NQ only if the facility is enabled for the guest · edc5b055

由 David Hildenbrand 提交于 3月 04, 2016

While we can not fully fence of the Nonquiescing Key-Setting facility,
we should as try our best to hide it.
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

edc5b055

KVM: s390: cmma: don't check entry content · 4a5e7e38

由 David Hildenbrand 提交于 4月 12, 2016

We should never inject an exception after we manually rewound the PSW
(to retry the ESSA instruction in this case). This will mess up the PSW.
So this never worked and therefore never really triggered.

Looking at the details, we don't even have to perform any validity checks.
1. Bits 52-63 of an entry are stored as 0 by the hardware.
2. We are dealing with absolute addresses but only check for the prefix
   starting at address 0. This isn't correct and doesn't make much sense,
   cpus could still zap the prefix of other cpus. But as prefix pages
   cannot be swapped out without a notifier being called for the affected
   VCPU, a zap can never remove a protected prefix.
Reviewed-by: NDominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

4a5e7e38

03 5月, 2016 11 次提交

kvm: robustify steal time record · 35f3fae1

由 Wanpeng Li 提交于 5月 03, 2016

Guest should only trust data to be valid when version haven't changed
before and after reads of steal time. Besides not changing, it has to
be an even number. Hypervisor may write an odd number to version field
to indicate that an update is in progress.

kvm_steal_clock() in guest has already done the read side, make write
side in hypervisor more robust by following the above rule.
Reviewed-by: NWincy Van <fanwenyi0529@gmail.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

35f3fae1

clocksource: arm_arch_timer: Remove arch_timer_get_timecounter · a53d892d