提交 · 4b8473c9c19dff1b0c672f182cc50b9952cf42e7 · openeuler / Kernel

17 10月, 2013 11 次提交

KVM: PPC: Book3S HV: Add support for guest Program Priority Register · 4b8473c9

由 Paul Mackerras 提交于 9月 20, 2013

POWER7 and later IBM server processors have a register called the
Program Priority Register (PPR), which controls the priority of
each hardware CPU SMT thread, and affects how fast it runs compared
to other SMT threads.  This priority can be controlled by writing to
the PPR or by use of a set of instructions of the form or rN,rN,rN
which are otherwise no-ops but have been defined to set the priority
to particular levels.

This adds code to context switch the PPR when entering and exiting
guests and to make the PPR value accessible through the SET/GET_ONE_REG
interface.  When entering the guest, we set the PPR as late as
possible, because if we are setting a low thread priority it will
make the code run slowly from that point on.  Similarly, the
first-level interrupt handlers save the PPR value in the PACA very
early on, and set the thread priority to the medium level, so that
the interrupt handling code runs at a reasonable speed.
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

4b8473c9

KVM: PPC: Book3S HV: Store LPCR value for each virtual core · a0144e2a

由 Paul Mackerras 提交于 9月 20, 2013

This adds the ability to have a separate LPCR (Logical Partitioning
Control Register) value relating to a guest for each virtual core,
rather than only having a single value for the whole VM.  This
corresponds to what real POWER hardware does, where there is a LPCR
per CPU thread but most of the fields are required to have the same
value on all active threads in a core.

The per-virtual-core LPCR can be read and written using the
GET/SET_ONE_REG interface.  Userspace can can only modify the
following fields of the LPCR value:

DPFD	Default prefetch depth
ILE	Interrupt little-endian
TC	Translation control (secondary HPT hash group search disable)

We still maintain a per-VM default LPCR value in kvm->arch.lpcr, which
contains bits relating to memory management, i.e. the Virtualized
Partition Memory (VPM) bits and the bits relating to guest real mode.
When this default value is updated, the update needs to be propagated
to the per-vcore values, so we add a kvmppc_update_lpcr() helper to do
that.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
[agraf: fix whitespace]
Signed-off-by: NAlexander Graf <agraf@suse.de>

a0144e2a

KVM: PPC: BookE: Add GET/SET_ONE_REG interface for VRSAVE · 8b75cbbe

由 Paul Mackerras 提交于 9月 20, 2013

This makes the VRSAVE register value for a vcpu accessible through
the GET/SET_ONE_REG interface on Book E systems (in addition to the
existing GET/SET_SREGS interface), for consistency with Book 3S.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

8b75cbbe

KVM: PPC: Book3S HV: Avoid unbalanced increments of VPA yield count · 8c2dbb79

由 Paul Mackerras 提交于 9月 06, 2013

The yield count in the VPA is supposed to be incremented every time
we enter the guest, and every time we exit the guest, so that its
value is even when the vcpu is running in the guest and odd when it
isn't. However, it's currently possible that we increment the yield
count on the way into the guest but then find that other CPU threads
are already exiting the guest, so we go back to nap mode via the
secondary_too_late label. In this situation we don't increment the
yield count again, breaking the relationship between the LSB of the
count and whether the vcpu is in the guest.

To fix this, we move the increment of the yield count to a point
after we have checked whether other CPU threads are exiting.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

8c2dbb79

KVM: PPC: Book3S HV: Pull out interrupt-reading code into a subroutine · c934243c

由 Paul Mackerras 提交于 9月 06, 2013

This moves the code in book3s_hv_rmhandlers.S that reads any pending
interrupt from the XICS interrupt controller, and works out whether
it is an IPI for the guest, an IPI for the host, or a device interrupt,
into a new function called kvmppc_read_intr.  Later patches will
need this.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

c934243c

KVM: PPC: Book3S HV: Restructure kvmppc_hv_entry to be a subroutine · 218309b7

由 Paul Mackerras 提交于 9月 06, 2013

We have two paths into and out of the low-level guest entry and exit
code: from a vcpu task via kvmppc_hv_entry_trampoline, and from the
system reset vector for an offline secondary thread on POWER7 via
kvm_start_guest. Currently both just branch to kvmppc_hv_entry to
enter the guest, and on guest exit, we test the vcpu physical thread
ID to detect which way we came in and thus whether we should return
to the vcpu task or go back to nap mode.

In order to make the code flow clearer, and to keep the code relating
to each flow together, this turns kvmppc_hv_entry into a subroutine
that follows the normal conventions for call and return. This means
that kvmppc_hv_entry_trampoline() and kvmppc_hv_entry() now establish
normal stack frames, and we use the normal stack slots for saving
return addresses rather than local_paca->kvm_hstate.vmhandler. Apart
from that this is mostly moving code around unchanged.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

218309b7

KVM: PPC: Book3S HV: Implement H_CONFER · 42d7604d

由 Paul Mackerras 提交于 9月 06, 2013

The H_CONFER hypercall is used when a guest vcpu is spinning on a lock
held by another vcpu which has been preempted, and the spinning vcpu
wishes to give its timeslice to the lock holder.  We implement this
in the straightforward way using kvm_vcpu_yield_to().
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

42d7604d

KVM: PPC: Book3S: Add GET/SET_ONE_REG interface for VRSAVE · c0867fd5

由 Paul Mackerras 提交于 9月 06, 2013

The VRSAVE register value for a vcpu is accessible through the
GET/SET_SREGS interface for Book E processors, but not for Book 3S
processors.  In order to make this accessible for Book 3S processors,
this adds a new register identifier for GET/SET_ONE_REG, and adds
the code to implement it.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

c0867fd5

KVM: PPC: Book3S HV: Implement timebase offset for guests · 93b0f4dc

由 Paul Mackerras 提交于 9月 06, 2013

This allows guests to have a different timebase origin from the host.
This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

Therefore this provides a new per-vcpu value accessed via the one_reg
interface using the new KVM_REG_PPC_TB_OFFSET identifier.  This value
defaults to 0 and is not modified by KVM.  On entering the guest, this
value is added onto the timebase, and on exiting the guest, it is
subtracted from the timebase.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register.  Writing to the TBU40 register only
alters the upper 40 bits of the timebase, leaving the lower 24 bits
unchanged.  This provides a way to modify the timebase for guest
migration without disturbing the synchronization of the timebase
registers across CPU cores.  The kernel rounds up the value given
to a multiple of 2^24.

Timebase values stored in KVM structures (struct kvm_vcpu, struct
kvmppc_vcore, etc.) are stored as host timebase values.  The timebase
values in the dispatch trace log need to be guest timebase values,
however, since that is read directly by the guest.  This moves the
setting of vcpu->arch.dec_expires on guest exit to a point after we
have restored the host timebase so that vcpu->arch.dec_expires is a
host timebase value.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

93b0f4dc

KVM: PPC: Book3S HV: Save/restore SIAR and SDAR along with other PMU registers · 14941789

由 Paul Mackerras 提交于 9月 06, 2013

Currently we are not saving and restoring the SIAR and SDAR registers in
the PMU (performance monitor unit) on guest entry and exit.  The result
is that performance monitoring tools in the guest could get false
information about where a program was executing and what data it was
accessing at the time of a performance monitor interrupt.  This fixes
it by saving and restoring these registers along with the other PMU
registers on guest entry/exit.

This also provides a way for userspace to access these values for a
vcpu via the one_reg interface.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

14941789

KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg · 3b783474

由 Michael Neuling 提交于 9月 03, 2013

This reserves space in get/set_one_reg ioctl for the extra guest state
needed for POWER8.  It doesn't implement these at all, it just reserves
them so that the ABI is defined now.

A few things to note here:

- This add *a lot* state for transactional memory.  TM suspend mode,
  this is unavoidable, you can't simply roll back all transactions and
  store only the checkpointed state.  I've added this all to
  get/set_one_reg (including GPRs) rather than creating a new ioctl
  which returns a struct kvm_regs like KVM_GET_REGS does.  This means we
  if we need to extract the TM state, we are going to need a bucket load
  of IOCTLs.  Hopefully most of the time this will not be needed as we
  can look at the MSR to see if TM is active and only grab them when
  needed.  If this becomes a bottle neck in future we can add another
  ioctl to grab all this state in one go.

- The TM state is offset by 0x80000000.

- For TM, I've done away with VMX and FP and created a single 64x128 bit
  VSX register space.

- I've left a space of 1 (at 0x9c) since Paulus needs to add a value
  which applies to POWER7 as well.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

3b783474

16 10月, 2013 2 次提交

G
Merge tag 'kvm-arm-for-3.13-1' of git://git.linaro.org/people/cdall/linux-kvm-arm into next · d5701426
由 Gleb Natapov 提交于 10月 16, 2013
```
Updates for KVM/ARM including cpu=host and Cortex-A7 support
```
d5701426

KVM: ARM: Remove non-ASCII space characters · a7265fb1

由 Christoffer Dall 提交于 10月 15, 2013

Some strange character leaped into the documentation, which makes
git-send-email behave quite strangely.  Get rid of this before it bites
anyone else.

Cc: Anup Patel <anup.patel@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

a7265fb1

15 10月, 2013 1 次提交

KVM: Drop FOLL_GET in GUP when doing async page fault · f2e10669

由 chai wen 提交于 10月 14, 2013

Page pinning is not mandatory in kvm async page fault processing since
after async page fault event is delivered to a guest it accesses page once
again and does its own GUP.  Drop the FOLL_GET flag in GUP in async_pf
code, and do some simplifying in check/clear processing.
Suggested-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NGu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Nchai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

f2e10669

14 10月, 2013 7 次提交

KVM: s390: Get rid of KVM_HPAGE defines · a7efdf6b

由 Christoffer Dall 提交于 10月 13, 2013

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

a7efdf6b

KVM: PPC: Get rid of KVM_HPAGE defines · 2c5350e9

由 Christoffer Dall 提交于 10月 02, 2013

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

2c5350e9

KVM: ia64: Get rid of KVM_HPAGE defines · bbba7938

由 Christoffer Dall 提交于 10月 02, 2013

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

bbba7938

KVM: mips: Get rid of KVM_HPAGE defines · 015e0513

由 Christoffer Dall 提交于 10月 02, 2013

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

015e0513

KVM: arm64: Get rid of KVM_HPAGE defines · ef0cfe71

由 Christoffer Dall 提交于 10月 02, 2013

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

ef0cfe71

KVM: ARM: Get rid of KVM_HPAGE defines · dc6f6763

由 Christoffer Dall 提交于 10月 02, 2013

The KVM_HPAGE_DEFINES are a little artificial on ARM, since the huge
page size is statically defined at compile time and there is only a
single huge page size.

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

dc6f6763

KVM: Move gfn_to_index to x86 specific code · 6d9d41e5

由 Christoffer Dall 提交于 10月 02, 2013

The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an unconvenient way for other architectures. Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

6d9d41e5

13 10月, 2013 3 次提交

KVM: ARM: Add support for Cortex-A7 · e8c2d99f

由 Jonathan Austin 提交于 9月 26, 2013

This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts.

As Cortex-A7 is architecturally compatible with A15, this patch is largely just
generalising existing code. Areas where 'implementation defined' behaviour
is identical for A7 and A15 is moved to allow it to be used by both cores.

The check to ensure that coprocessor register tables are sorted correctly is
also moved in to 'common' code to avoid each new cpu doing its own check
(and possibly forgetting to do so!)
Signed-off-by: NJonathan Austin <jonathan.austin@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

e8c2d99f

KVM: ARM: fix the size of TTBCR_{T0SZ,T1SZ} masks · 5e497046

由 Jonathan Austin 提交于 9月 26, 2013

The T{0,1}SZ fields of TTBCR are 3 bits wide when using the long descriptor
format. Likewise, the T0SZ field of the HTCR is 3-bits. KVM currently
defines TTBCR_T{0,1}SZ as 3, not 7.

The T0SZ mask is used to calculate the value for the HTCR, both to pick out
TTBCR.T0SZ and mask off the equivalent field in the HTCR during
read-modify-write. The incorrect mask size causes the (UNKNOWN) reset value
of HTCR.T0SZ to leak in to the calculated HTCR value. Linux will hang when
initializing KVM if HTCR's reset value has bit 2 set (sometimes the case on
A7/TC2)

Fixing T0SZ allows A7 cores to boot and T1SZ is also fixed for completeness.
Signed-off-by: NJonathan Austin <jonathan.austin@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

5e497046

KVM: ARM: Fix calculation of virtual CPU ID · 1158fca4

由 Jonathan Austin 提交于 9月 26, 2013

KVM does not have a notion of multiple clusters for CPUs, just a linear
array of CPUs. When using a system with cores in more than one cluster, the
current method for calculating the virtual MPIDR will leak the (physical)
cluster information into the virtual MPIDR. One effect of this is that
Linux under KVM fails to boot multiple CPUs that aren't in the 0th cluster.

This patch does away with exposing the real MPIDR fields in favour of simply
using the virtual CPU number (but preserving the U bit, as before).
Signed-off-by: NJonathan Austin <jonathan.austin@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

1158fca4

11 10月, 2013 1 次提交

KVM: nVMX: Fully support nested VMX preemption timer · 7854cbca

由 Arthur Chunqi Li 提交于 9月 16, 2013

This patch contains the following two changes:
1. Fix the bug in nested preemption timer support. If vmexit L2->L0
with some reasons not emulated by L1, preemption timer value should
be save in such exits.
2. Add support of "Save VMX-preemption timer value" VM-Exit controls
to nVMX.

With this patch, nested VMX preemption timer features are fully
supported.
Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com>
Reviewed-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7854cbca

03 10月, 2013 13 次提交

KVM: mmu: change useless int return types to void · 8a3c1a33