提交 · cf29b21595b91eecce6ca69b0f92d60bca076ef0 · openanolis / cloud-kernel

06 11月, 2015 1 次提交

KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails · cf29b215

由 Paul Mackerras 提交于 10月 27, 2015

When handling a hypervisor data or instruction storage interrupt (HDSI
or HISI), we look up the SLB entry for the address being accessed in
order to translate the effective address to a virtual address which can
be looked up in the guest HPT.  This lookup can occasionally fail due
to the guest replacing an SLB entry without invalidating the evicted
SLB entry.  In this situation an ERAT (effective to real address
translation cache) entry can persist and be used by the hardware even
though there is no longer a corresponding SLB entry.

Previously we would just deliver a data or instruction storage interrupt
(DSI or ISI) to the guest in this case.  However, this is not correct
and has been observed to cause guests to crash, typically with a
data storage protection interrupt on a store to the vmemmap area.

Instead, what we do now is to synthesize a data or instruction segment
interrupt.  That should cause the guest to reload an appropriate entry
into the SLB and retry the faulting instruction.  If it still faults,
we should find an appropriate SLB entry next time and be able to handle
the fault.
Tested-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

cf29b215

05 11月, 2015 1 次提交

KVM: VMX: Fix commit which broke PML · a3eaa864

由 Kai Huang 提交于 11月 04, 2015

I found PML was broken since below commit:

	commit feda805f
	Author: Xiao Guangrong <guangrong.xiao@linux.intel.com>
	Date:   Wed Sep 9 14:05:55 2015 +0800

	KVM: VMX: unify SECONDARY_VM_EXEC_CONTROL update

	Unify the update in vmx_cpuid_update()
Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com>
	[Rewrite to use vmcs_set_secondary_exec_control. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

The reason is in above commit vmx_cpuid_update calls vmx_secondary_exec_control,
in which currently SECONDARY_EXEC_ENABLE_PML bit is cleared unconditionally (as
PML is enabled in creating vcpu). Therefore if vcpu_cpuid_update is called after
vcpu is created, PML will be disabled unexpectedly while log-dirty code still
thinks PML is used.

Fix this by clearing SECONDARY_EXEC_ENABLE_PML in vmx_secondary_exec_control
only when PML is not supported or not enabled (!enable_pml). This is more
reasonable as PML is currently either always enabled or disabled. With this
explicit updating SECONDARY_EXEC_ENABLE_PML in vmx_enable{disable}_pml is not
needed so also rename vmx_enable{disable}_pml to vmx_create{destroy}_pml_buffer.

Fixes: feda805fSigned-off-by: NKai Huang <kai.huang@linux.intel.com>
[While at it, change a wrong ASSERT to an "if".  The condition can happen
 if creating the VCPU fails with ENOMEM. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a3eaa864

04 11月, 2015 11 次提交

KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0() · 879ae188

由 Laszlo Ersek 提交于 11月 04, 2015

Commit b18d5431 ("KVM: x86: fix CR0.CD virtualization") was
technically correct, but it broke OVMF guests by slowing down various
parts of the firmware.

Commit fb279950 ("KVM: vmx: obey KVM_QUIRK_CD_NW_CLEARED") quirked the
first function modified by b18d5431, vmx_get_mt_mask(), for OVMF's
sake. This restored the speed of the OVMF code that runs before
PlatformPei (including the memory intensive LZMA decompression in SEC).

This patch extends the quirk to the second function modified by
b18d5431, kvm_set_cr0(). It eliminates the intrusive slowdown that
hits the EFI_MP_SERVICES_PROTOCOL implementation of edk2's
UefiCpuPkg/CpuDxe -- which is built into OVMF --, when CpuDxe starts up
all APs at once for initialization, in order to count them.

We also carry over the kvm_arch_has_noncoherent_dma() sub-condition from
the other half of the original commit b18d5431.

Fixes: b18d5431
Cc: stable@vger.kernel.org
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: NXiao Guangrong <guangrong.xiao@linux.intel.com>
Tested-by: NJanusz Mocek <januszmk6@gmail.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>#
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

879ae188

KVM: x86: allow RSM from 64-bit mode · 89651a3d

由 Paolo Bonzini 提交于 11月 03, 2015

The SDM says that exiting system management mode from 64-bit mode
is invalid, but that would be too good to be true.  But actually,
most of the code is already there to support exiting from compat
mode (EFER.LME=1, EFER.LMA=0).  Getting all the way from 64-bit
mode to real mode only requires clearing CS.L and CR4.PCIDE.

Cc: stable@vger.kernel.org
Fixes: 660a5d51Tested-by: NLaszlo Ersek <lersek@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

89651a3d

KVM: VMX: fix SMEP and SMAP without EPT · 656ec4a4

由 Radim Krčmář 提交于 11月 02, 2015

The comment in code had it mostly right, but we enable paging for
emulated real mode regardless of EPT.

Without EPT (which implies emulated real mode), secondary VCPUs won't
start unless we disable SM[AE]P when the guest doesn't use paging.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

656ec4a4

KVM: x86: move kvm_set_irq_inatomic to legacy device assignment · 8a22f234

由 Paolo Bonzini 提交于 10月 28, 2015

The function is not used outside device assignment, and
kvm_arch_set_irq_inatomic has a different prototype.  Move it here and
make it static to avoid confusion.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8a22f234

KVM: device assignment: remove pointless #ifdefs · 76954056

由 Paolo Bonzini 提交于 10月 28, 2015

The symbols are always defined.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

76954056

KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic · b97e6de9

由 Paolo Bonzini 提交于 10月 28, 2015

We do not want to do too much work in atomic context, in particular
not walking all the VCPUs of the virtual machine.  So we want
to distinguish the architecture-specific injection function for irqfd
from kvm_set_msi.  Since it's still empty, reuse the newly added
kvm_arch_set_irq and rename it to kvm_arch_set_irq_inatomic.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b97e6de9

KVM: x86: zero apic_arb_prio on reset · 0669a510

由 Radim Krčmář 提交于 10月 30, 2015

BSP doesn't get INIT so its apic_arb_prio isn't zeroed after reboot.
BSP won't get lowest priority interrupts until other VCPUs get enough
interrupts to match their pre-reboot apic_arb_prio.

That behavior doesn't fit into KVM's round-robin-like interpretation of
lowest priority delivery ... userspace should KVM_SET_LAPIC on reset, so
just zero apic_arb_prio there.
Reported-by: NYuki Shibuya <shibuya.yk@ncos.nec.co.jp>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0669a510

drivers/hv: share Hyper-V SynIC constants with userspace · c75efa97

由 Andrey Smetanin 提交于 10月 16, 2015

Moved Hyper-V synic contants from guest Hyper-V drivers private
header into x86 arch uapi Hyper-V header.

Added Hyper-V synic msr's flags into x86 arch uapi Hyper-V header.
Signed-off-by: NAndrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com>
Signed-off-by: NDenis V. Lunev <den@openvz.org>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c75efa97

KVM: x86: handle SMBASE as physical address in RSM · f40606b1

由 Radim Krčmář 提交于 10月 30, 2015

GET_SMSTATE depends on real mode to ensure that smbase+offset is treated
as a physical address, which has already caused a bug after shuffling
the code.  Enforce physical addressing.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reported-by: NLaszlo Ersek <lersek@redhat.com>
Tested-by: NLaszlo Ersek <lersek@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f40606b1

KVM: x86: add read_phys to x86_emulate_ops · 7a036a6f

由 Radim Krčmář 提交于 10月 30, 2015

We want to read the physical memory when emulating RSM.

X86EMUL_IO_NEEDED is returned on all errors for consistency with other
helpers.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Tested-by: NLaszlo Ersek <lersek@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7a036a6f

KVM: x86: removing unused variable · 2da29bcc

由 Saurabh Sengar 提交于 10月 30, 2015

removing unused variables, found by coccinelle
Signed-off-by: NSaurabh Sengar <saurabh.truth@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2da29bcc

29 10月, 2015 3 次提交

KVM: s390: use simple switch statement as multiplexer · 46b708ea

由 Christian Borntraeger 提交于 7月 23, 2015

We currently do some magic shifting (by exploiting that exit codes
are always a multiple of 4) and a table lookup to jump into the
exit handlers. This causes some calculations and checks, just to
do an potentially expensive function call.

Changing that to a switch statement gives the compiler the chance
to inline and dynamically decide between jump tables or inline
compare and branches. In addition it makes the code more readable.

bloat-o-meter gives me a small reduction in code size:

add/remove: 0/7 grow/shrink: 1/1 up/down: 986/-1334 (-348)
function old new delta
kvm_handle_sie_intercept 72 1058 +986
handle_prog 704 696 -8
handle_noop 54 - -54
handle_partial_execution 60 - -60
intercept_funcs 120 - -120
handle_instruction 198 - -198
handle_validity 210 - -210
handle_stop 316 - -316
handle_external_interrupt 368 - -368

Right now my gcc does conditional branches instead of jump tables.
The inlining seems to give us enough cycles as some micro-benchmarking
shows minimal improvements, but still in noise.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>

46b708ea

KVM: s390: drop useless newline in debugging data · 58c383c6

由 Christian Borntraeger 提交于 10月 12, 2015

the s390 debug feature does not need newlines. In fact it will
result in empty lines. Get rid of 4 leftovers.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NCornelia Huck <cornelia.huck@de.ibm.com>

58c383c6

KVM: s390: SCA must not cross page boundaries · c5c2c393

由 David Hildenbrand 提交于 10月 26, 2015

We seemed to have missed a few corner cases in commit f6c137ff
("KVM: s390: randomize sca address").

The SCA has a maximum size of 2112 bytes. By setting the sca_offset to
some unlucky numbers, we exceed the page.

0x7c0 (1984) -> Fits exactly
0x7d0 (2000) -> 16 bytes out
0x7e0 (2016) -> 32 bytes out
0x7f0 (2032) -> 48 bytes out

One VCPU entry is 32 bytes long.

For the last two cases, we actually write data to the other page.
1. The address of the VCPU.
2. Injection/delivery/clearing of SIGP externall calls via SIGP IF.

Especially the 2. happens regularly. So this could produce two problems:
1. The guest losing/getting external calls.
2. Random memory overwrites in the host.

So this problem happens on every 127 + 128 created VM with 64 VCPUs.

Cc: stable@vger.kernel.org # v3.15+
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

c5c2c393

23 10月, 2015 10 次提交

arm64: kvm: restore EL1N SP for panic · db85c55f

由 Mark Rutland 提交于 10月 12, 2015

If we panic in hyp mode, we inject a call to panic() into the EL1N host
kernel. If a guest context is active, we first attempt to restore the
minimal amount of state necessary to execute the host kernel with
restore_sysregs.

However, the SP is restored as part of restore_common_regs, and so we
may return to the host's panic() function with the SP of the guest. Any
calculations based on the SP will be bogus, and any attempt to access
the stack will result in recursive data aborts.

When running Linux as a guest, the guest's EL1N SP is like to be some
valid kernel address. In this case, the host kernel may use that region
as a stack for panic(), corrupting it in the process.

Avoid the problem by restoring the host SP prior to returning to the
host. To prevent misleading backtraces in the host, the FP is zeroed at
the same time. We don't need any of the other "common" registers in
order to panic successfully.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: <kvmarm@lists.cs.columbia.edu>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

db85c55f

arm/arm64: KVM: Improve kvm_exit tracepoint · b5905dc1

由 Christoffer Dall 提交于 8月 30, 2015

The ARM architecture only saves the exit class to the HSR (ESR_EL2 for
arm64) on synchronous exceptions, not on asynchronous exceptions like an
IRQ. However, we only report the exception class on kvm_exit, which is
confusing because an IRQ looks like it exited at some PC with the same
reason as the previous exit. Add a lookup table for the exception index
and prepend the kvm_exit tracepoint text with the exception type to
clarify this situation.

Also resolve the exception class (EC) to a human-friendly text version
so the trace output becomes immediately usable for debugging this code.

Cc: Wei Huang <wei@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

b5905dc1

KVM: arm/arm64: implement kvm_arm_[halt,resume]_guest · 3b92830a

由 Eric Auger 提交于 9月 25, 2015

We introduce kvm_arm_halt_guest and resume functions. They
will be used for IRQ forward state change.

Halt is synchronous and prevents the guest from being re-entered.
We use the same mechanism put in place for PSCI former pause,
now renamed power_off. A new flag is introduced in arch vcpu state,
pause, only meant to be used by those functions.
Signed-off-by: NEric Auger <eric.auger@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

3b92830a

KVM: arm/arm64: check power_off in critical section before VCPU run · 101d3da0

由 Eric Auger 提交于 9月 25, 2015

In case a vcpu off PSCI call is called just after we executed the
vcpu_sleep check, we can enter the guest although power_off
is set. Let's check the power_off state in the critical section,
just before entering the guest.
Signed-off-by: NEric Auger <eric.auger@linaro.org>
Reported-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

101d3da0

KVM: arm/arm64: check power_off in kvm_arch_vcpu_runnable · 4f5f1dc0

由 Eric Auger 提交于 9月 25, 2015

kvm_arch_vcpu_runnable now also checks whether the power_off
flag is set.
Signed-off-by: NEric Auger <eric.auger@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

4f5f1dc0

KVM: arm/arm64: rename pause into power_off · 3781528e

由 Eric Auger 提交于 9月 25, 2015

The kvm_vcpu_arch pause field is renamed into power_off to prepare
for the introduction of a new pause field. Also vcpu_pause is renamed
into vcpu_sleep since we will sleep until both power_off and pause are
false.
Signed-off-by: NEric Auger <eric.auger@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

3781528e

arm/arm64: KVM : Enable vhost device selection under KVM config menu · 75755c6d

由 Wei Huang 提交于 10月 09, 2015

vhost drivers provide guest VMs with better I/O performance and lower
CPU utilization. This patch allows users to select vhost devices under
KVM configuration menu on ARM. This makes vhost support on arm/arm64
on a par with other architectures (e.g. x86, ppc).
Signed-off-by: NWei Huang <wei@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

75755c6d

arm/arm64: KVM: Rework the arch timer to use level-triggered semantics · 4b4b4512

由 Christoffer Dall 提交于 8月 30, 2015

The arch timer currently uses edge-triggered semantics in the sense that
the line is never sampled by the vgic and lowering the line from the
timer to the vgic doesn't have any effect on the pending state of
virtual interrupts in the vgic. This means that we do not support a
guest with the otherwise valid behavior of (1) disable interrupts (2)
enable the timer (3) disable the timer (4) enable interrupts. Such a
guest would validly not expect to see any interrupts on real hardware,
but will see interrupts on KVM.

This patch fixes this shortcoming through the following series of
changes.

First, we change the flow of the timer/vgic sync/flush operations. Now
the timer is always flushed/synced before the vgic, because the vgic
samples the state of the timer output. This has the implication that we
move the timer operations in to non-preempible sections, but that is
fine after the previous commit getting rid of hrtimer schedules on every
entry/exit.

Second, we change the internal behavior of the timer, letting the timer
keep track of its previous output state, and only lower/raise the line
to the vgic when the state changes. Note that in theory this could have
been accomplished more simply by signalling the vgic every time the
state *potentially* changed, but we don't want to be hitting the vgic
more often than necessary.

Third, we get rid of the use of the map->active field in the vgic and
instead simply set the interrupt as active on the physical distributor
whenever the input to the GIC is asserted and conversely clear the
physical active state when the input to the GIC is deasserted.

Fourth, and finally, we now initialize the timer PPIs (and all the other
unused PPIs for now), to be level-triggered, and modify the sync code to
sample the line state on HW sync and re-inject a new interrupt if it is
still pending at that time.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

4b4b4512

arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block · d35268da

由 Christoffer Dall 提交于 8月 25, 2015

We currently schedule a soft timer every time we exit the guest if the
timer did not expire while running the guest.  This is really not
necessary, because the only work we do in the timer work function is to
kick the vcpu.

Kicking the vcpu does two things:
(1) If the vpcu thread is on a waitqueue, make it runnable and remove it
from the waitqueue.
(2) If the vcpu is running on a different physical CPU from the one
doing the kick, it sends a reschedule IPI.

The second case cannot happen, because the soft timer is only ever
scheduled when the vcpu is not running.  The first case is only relevant
when the vcpu thread is on a waitqueue, which is only the case when the
vcpu thread has called kvm_vcpu_block().

Therefore, we only need to make sure a timer is scheduled for
kvm_vcpu_block(), which we do by encapsulating all calls to
kvm_vcpu_block() with kvm_timer_{un}schedule calls.

Additionally, we only schedule a soft timer if the timer is enabled and
unmasked, since it is useless otherwise.

Note that theoretically userspace can use the SET_ONE_REG interface to
change registers that should cause the timer to fire, even if the vcpu
is blocked without a scheduled timer, but this case was not supported
before this patch and we leave it for future work for now.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

d35268da

KVM: Add kvm_arch_vcpu_{un}blocking callbacks · 3217f7c2

由 Christoffer Dall 提交于 8月 27, 2015

Some times it is useful for architecture implementations of KVM to know
when the VCPU thread is about to block or when it comes back from
blocking (arm/arm64 needs to know this to properly implement timers, for
example).

Therefore provide a generic architecture callback function in line with
what we do elsewhere for KVM generic-arch interactions.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

3217f7c2

21 10月, 2015 6 次提交

KVM: PPC: Book3S HV: Handle H_DOORBELL on the guest exit path · 70aa3961

由 Gautham R. Shenoy 提交于 10月 15, 2015

Currently a CPU running a guest can receive a H_DOORBELL in the
following two cases:
1) When the CPU is napping due to CEDE or there not being a guest
vcpu.
2) The CPU is running the guest vcpu.

Case 1), the doorbell message is not cleared since we were waking up
from nap. Hence when the EE bit gets set on transition from guest to
host, the H_DOORBELL interrupt is delivered to the host and the
corresponding handler is invoked.

However in Case 2), the message gets cleared by the action of taking
the H_DOORBELL interrupt. Since the CPU was running a guest, instead
of invoking the doorbell handler, the code invokes the second-level
interrupt handler to switch the context from the guest to the host. At
this point the setting of the EE bit doesn't result in the CPU getting
the doorbell interrupt since it has already been delivered once. So,
the handler for this doorbell is never invoked!

This causes softlockups if the missed DOORBELL was an IPI sent from a
sibling subcore on the same CPU.

This patch fixes it by explitly invoking the doorbell handler on the
exit path if the exit reason is H_DOORBELL similar to the way an
EXTERNAL interrupt is handled. Since this will also handle Case 1), we
can unconditionally clear the doorbell message in
kvmppc_check_wake_reason.
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

70aa3961

KVM: PPC: Implement extension to report number of memslots · bfec5c2c

由 Nikunj A Dadhania 提交于 10月 16, 2015

QEMU assumes 32 memslots if this extension is not implemented. Although,
current value of KVM_USER_MEM_SLOTS is 32, once KVM_USER_MEM_SLOTS
changes QEMU would take a wrong value.
Signed-off-by: NNikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

bfec5c2c

KVM: PPC: Book3S HV: Make H_REMOVE return correct HPTE value for absent HPTEs · c64dfe2a

由 Paul Mackerras 提交于 5月 18, 2015

This fixes a bug where the old HPTE value returned by H_REMOVE has
the valid bit clear if the HPTE was an absent HPTE, as happens for
HPTEs for emulated MMIO pages and for RAM pages that have been paged
out by the host. If the absent bit is set, we clear it and set the
valid bit, because from the guest's point of view, the HPTE is valid.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

c64dfe2a

KVM: PPC: Book3S HV: Don't fall back to smaller HPT size in allocation ioctl · 572abd56

由 Paul Mackerras 提交于 9月 29, 2015

Currently the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested
size of HPT, and if that is not possible, then try to allocate smaller
sizes (by factors of 2) until either a minimum is reached or the
allocation succeeds. This is not ideal for userspace, particularly in
migration scenarios, where the destination VM really does require the
size requested. Also, the minimum HPT size of 256kB may be
insufficient for the guest to run successfully.

This removes the fallback to smaller sizes on allocation failure for
the KVM_PPC_ALLOCATE_HTAB ioctl. The fallback still exists for the
case where the HPT is allocated at the time the first VCPU is run, if
no HPT has been allocated by ioctl by that time.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

572abd56

KVM: arm: use GIC support unconditionally · 4a5d69b7

由 Arnd Bergmann 提交于 10月 12, 2015

The vgic code on ARM is built for all configurations that enable KVM,
but the parent_data field that it references is only present when
CONFIG_IRQ_DOMAIN_HIERARCHY is set:

virt/kvm/arm/vgic.c: In function 'kvm_vgic_map_phys_irq':
virt/kvm/arm/vgic.c:1781:13: error: 'struct irq_data' has no member named 'parent_data'

This flag is implied by the GIC driver, and indeed the VGIC code only
makes sense if a GIC is present. This changes the CONFIG_KVM symbol
to always select GIC, which avoids the issue.

Fixes: 662d9715 ("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

4a5d69b7

KVM: arm/arm64: Fix memory leak if timer initialization fails · 399ea0f6

由 Pavel Fedin 提交于 10月 06, 2015

Jump to correct label and free kvm_host_cpu_state
Reviewed-by: NWei Huang <wei@redhat.com>
Signed-off-by: NPavel Fedin <p.fedin@samsung.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

399ea0f6

19 10月, 2015 2 次提交

KVM: x86: MMU: Initialize force_pt_level before calling mapping_level() · 8c85ac1c

由 Takuya Yoshikawa 提交于 10月 19, 2015

Commit fd136902 ("KVM: x86: MMU: Move mapping_level_dirty_bitmap()
call in mapping_level()") forgot to initialize force_pt_level to false
in FNAME(page_fault)() before calling mapping_level() like
nonpaging_map() does.  This can sometimes result in forcing page table
level mapping unnecessarily.

Fix this and move the first *force_pt_level check in mapping_level()
before kvm_vcpu_gfn_to_memslot() call to make it a bit clearer that
the variable must be initialized before mapping_level() gets called.

This change can also avoid calling kvm_vcpu_gfn_to_memslot() when
!check_hugepage_cache_consistency() check in tdp_page_fault() forces
page table level mapping.
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8c85ac1c

kvm: x86: zero EFER on INIT · 5690891b

由 Paolo Bonzini 提交于 10月 19, 2015

Not zeroing EFER means that a 32-bit firmware cannot enter paging mode
without clearing EFER.LME first (which it should not know about).
Yang Zhang from Intel confirmed that the manual is wrong and EFER is
cleared to zero on INIT.

Fixes: d28bc9dd
Cc: stable@vger.kernel.org
Cc: Yang Z Zhang <yang.z.zhang@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5690891b

16 10月, 2015 6 次提交

KVM: x86: move steal time initialization to vcpu entry time · 7cae2bed

由 Marcelo Tosatti 提交于 10月 14, 2015

As reported at https://bugs.launchpad.net/qemu/+bug/1494350,
it is possible to have vcpu->arch.st.last_steal initialized
from a thread other than vcpu thread, say the iothread, via
KVM_SET_MSRS.

Which can cause an overflow later (when subtracting from vcpu threads
sched_info.run_delay).

To avoid that, move steal time accumulation to vcpu entry time,
before copying steal time data to guest.
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: NDavid Matlack <dmatlack@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7cae2bed

KVM: x86: MMU: Eliminate an extra memory slot search in mapping_level() · 5225fdf8

由 Takuya Yoshikawa 提交于 10月 16, 2015

Calling kvm_vcpu_gfn_to_memslot() twice in mapping_level() should be
avoided since getting a slot by binary search may not be negligible,
especially for virtual machines with many memory slots.
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5225fdf8

KVM: x86: MMU: Remove mapping_level_dirty_bitmap() · d8aacf5d

由 Takuya Yoshikawa 提交于 10月 16, 2015

Now that it has only one caller, and its name is not so helpful for
readers, remove it.  The new memslot_valid_for_gpte() function
makes it possible to share the common code between
gfn_to_memslot_dirty_bitmap() and mapping_level().
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d8aacf5d

KVM: x86: MMU: Move mapping_level_dirty_bitmap() call in mapping_level() · fd136902

由 Takuya Yoshikawa 提交于 10月 16, 2015

This is necessary to eliminate an extra memory slot search later.
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fd136902

KVM: x86: MMU: Simplify force_pt_level calculation code in FNAME(page_fault)() · 5ed5c5c8

由 Takuya Yoshikawa 提交于 10月 16, 2015

As a bonus, an extra memory slot search can be eliminated when
is_self_change_mapping is true.
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5ed5c5c8

KVM: x86: MMU: Make force_pt_level bool · cd1872f0

由 Takuya Yoshikawa 提交于 10月 16, 2015

This will be passed to a function later.
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

cd1872f0

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功