Commits · 6a39bbc5da27c3b2520876b71e4f7b50f5313503 · openanolis / cloud-kernel

19 Jun, 2015 · 20 commits

KVM: MTRR: do not map huge page for non-consistent range · 6a39bbc5

Authored by Xiao Guangrong on Jun 15, 2015

Based on Intel's SDM, mapping a huge page whose 4k pages do not all have
the same memory cache type causes undefined behavior.

To avoid this kind of undefined behavior, we force the use of 4k pages
in this case.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
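The rule in this commit message can be illustrated with a small userspace sketch. It is not the kernel's implementation (which walks the MTRRs rather than probing every frame); the callback type, the function names, and the demo layout are all hypothetical and exist only to show the idea: map a huge page only when every 4k frame under it has the same memory type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback: returns the MTRR memory type of one 4k guest frame. */
typedef uint8_t (*mem_type_fn)(uint64_t gfn);

/*
 * Returns true when all 4k frames in [gfn, gfn + nr_pages) share one
 * memory type. nr_pages must be at least 1.
 */
static bool range_has_consistent_type(mem_type_fn type_of, uint64_t gfn,
                                      uint64_t nr_pages)
{
    uint8_t first = type_of(gfn);

    for (uint64_t i = 1; i < nr_pages; i++)
        if (type_of(gfn + i) != first)
            return false;
    return true;
}

/* Returns how many 4k frames to map at once: the whole range, or just 1. */
static uint64_t mapping_size_in_pages(mem_type_fn type_of, uint64_t gfn,
                                      uint64_t nr_pages)
{
    return range_has_consistent_type(type_of, gfn, nr_pages) ? nr_pages : 1;
}

/* Demo layout (an assumption): frames below 0x100 are UC (0), the rest WB (6). */
static uint8_t demo_type_of(uint64_t gfn)
{
    return gfn < 0x100 ? 0 : 6;
}
```

With the demo layout, a 2M range (512 frames) starting at frame 0 mixes UC and WB and falls back to 4k mappings, while the range starting at frame 0x200 is uniformly WB and may be mapped as one huge page.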

KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type · fa612137

Authored by Xiao Guangrong on Jun 15, 2015

mtrr_for_each_mem_type() is ready now, use it to simplify
kvm_mtrr_get_guest_memory_type()
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: introduce mtrr_for_each_mem_type · f571c097

Authored by Xiao Guangrong on Jun 15, 2015

It walks all MTRRs and gets all the memory cache type settings for the
specified range; it also checks whether the range is fully covered by MTRRs
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Adjust for range_size->range_shift change. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: introduce fixed_mtrr_addr_* functions · f7bfb57b

Authored by Xiao Guangrong on Jun 15, 2015

Two functions are introduced:
- fixed_mtrr_addr_to_seg() translates the address to the fixed
  MTRR segment

- fixed_mtrr_addr_seg_to_range_index() translates the address to
  the index of kvm_mtrr.fixed_ranges[]

They will be used in a later patch
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Adjust for range_size->range_shift change. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
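A minimal sketch of the two translations described above, assuming the fixed-MTRR layout from Intel's SDM (8 ranges of 64K below 0x80000, 16 ranges of 16K up to 0xC0000, 64 ranges of 4K up to 1M, 88 ranges in total). The struct and function names here are illustrative and are not the kernel's own; the per-range size is stored as a shift so that the index is computed without a 64-bit division.

```c
#include <stddef.h>
#include <stdint.h>

/* One fixed-MTRR segment (names are hypothetical). */
struct fixed_seg {
    uint64_t start;      /* first address covered */
    uint64_t end;        /* one past the last address covered */
    int range_shift;     /* log2 of the range size: 16, 14 or 12 */
    int range_start;     /* index of this segment's first range */
};

static const struct fixed_seg fixed_segs[] = {
    { 0x00000, 0x80000,  16, 0  },  /* 8 ranges of 64K  */
    { 0x80000, 0xc0000,  14, 8  },  /* 16 ranges of 16K */
    { 0xc0000, 0x100000, 12, 24 },  /* 64 ranges of 4K  */
};

/* Returns the segment containing addr, or -1 if addr is at or above 1M. */
static int addr_to_seg(uint64_t addr)
{
    for (size_t i = 0; i < sizeof(fixed_segs) / sizeof(fixed_segs[0]); i++)
        if (addr >= fixed_segs[i].start && addr < fixed_segs[i].end)
            return (int)i;
    return -1;
}

/* Returns the index into a flat array of 88 fixed ranges, or -1. */
static int addr_to_range_index(uint64_t addr)
{
    int seg = addr_to_seg(addr);

    if (seg < 0)
        return -1;
    return fixed_segs[seg].range_start +
           (int)((addr - fixed_segs[seg].start) >> fixed_segs[seg].range_shift);
}
```

For example, address 0x80000 is the first 16K range and maps to index 8, and the last byte below 1M (0xfffff) maps to index 87.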

KVM: MTRR: sort variable MTRRs · 19efffa2

Authored by Xiao Guangrong on Jun 15, 2015

Sort all valid variable MTRRs by base address; this helps us check
whether a range is fully contained in variable MTRRs
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Fix list insertion sort, simplify var_mtrr_range_is_valid to just
 test the V bit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
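Keeping the ranges sorted by base address amounts to an insertion into an ordered list. The sketch below uses a plain singly linked list in userspace; the struct and function names are hypothetical and the kernel's own list helpers are not used here.

```c
#include <stddef.h>
#include <stdint.h>

/* A variable range node (hypothetical, reduced to what the sort needs). */
struct var_range {
    uint64_t base;
    struct var_range *next;
};

/* Inserts node into the list at *head, keeping ascending base order. */
static void insert_sorted(struct var_range **head, struct var_range *node)
{
    struct var_range **pos = head;

    /* Skip every entry whose base is not larger than the new one. */
    while (*pos && (*pos)->base <= node->base)
        pos = &(*pos)->next;

    node->next = *pos;
    *pos = node;
}
```

Once the list is ordered, a coverage check can walk it once and detect the first gap between the end of one range and the start of the next.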

KVM: MTRR: introduce var_mtrr_range · a13842dc

Authored by Xiao Guangrong on Jun 15, 2015

It gets the range for the specified variable MTRR
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Simplify boolean operations. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
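As a rough illustration of how a range can be derived from a variable MTRR's base/mask pair, the sketch below computes the half-open interval the pair describes. It assumes a contiguous mask whose bits above the supported physical-address width are already set, so that inverting the mask yields only the offset bits; the function name and the 4k page mask macro are made up for this example.

```c
#include <stdint.h>

#define PAGE_MASK_4K (~0xfffULL)

/*
 * Computes [*start, *end) from raw PHYSBASE/PHYSMASK values.
 * The low 12 bits (memory type in the base, valid bit in the mask)
 * are dropped before the range is derived.
 */
static void var_range_bounds(uint64_t base, uint64_t mask,
                             uint64_t *start, uint64_t *end)
{
    *start = base & PAGE_MASK_4K;
    mask &= PAGE_MASK_4K;
    /* Setting every offset bit gives the last byte; add 1 for the end. */
    *end = (*start | ~mask) + 1;
}
```

For a base of 0xC0000006 (WB type in the low bits) and a mask of 0xFFFFFFFFF0000800 (valid bit set, 256M span), this yields the range from 0xC0000000 up to, but not including, 0xD0000000.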

KVM: MTRR: introduce fixed_mtrr_segment table · de9aef5e

Authored by Xiao Guangrong on Jun 15, 2015

This table summarizes the fixed MTRR information and introduces some APIs
to abstract its operations, which helps us clean up the code and will be
used in later patches
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Change range_size to range_shift, in order to avoid udivdi3 errors.
 - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: improve kvm_mtrr_get_guest_memory_type · 3f3f78b6

Authored by Xiao Guangrong on Jun 15, 2015

 - kvm_mtrr_get_guest_memory_type() only checks one page in MTRRs, so
   it's unnecessary to check whether the range is partially covered
   by MTRRs

 - optimize the check for overlapping memory types and add some comments
   to explain the precedence
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: do not split 64 bits MSR content · 86fd5270

Authored by Xiao Guangrong on Jun 15, 2015

Variable MTRR MSRs are 64 bits wide and are accessed directly at full
length, so there is no reason to split them into two 32-bit halves
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: clean up mtrr default type · 10fac2dc

Authored by Xiao Guangrong on Jun 15, 2015

Drop kvm_mtrr->enable, omit the decode/encode work and get rid of
all the hard-coded values
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: exactly define the size of variable MTRRs · 910a6aae

Authored by Xiao Guangrong on Jun 15, 2015

Only KVM_NR_VAR_MTRR variable MTRRs are available in a KVM guest
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: remove mtrr_state.have_fixed · 70109e7d

Authored by Xiao Guangrong on Jun 15, 2015

vMTRR does not depend on any host MTRR feature and fixed MTRRs have always
been implemented, so drop this field
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: handle MSR_MTRRcap in kvm_mtrr_get_msr · eb839917

Authored by Xiao Guangrong on Jun 15, 2015

MSR_MTRRcap is an MTRR MSR, so move its handler to the common place; also
add some comments to make the hard-coded values more readable
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: move MTRR related code to a separate file · ff53604b

Authored by Xiao Guangrong on Jun 15, 2015

MTRR code is located in x86.c and mmu.c, so move it to a separate file to
make the organization clearer; this file will be the place where we fully
implement vMTRR
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: fix CR0.CD virtualization · b18d5431

Authored by Xiao Guangrong on Jun 15, 2015

Currently, CR0.CD is not checked when we virtualize the memory cache type
for noncoherent_dma guests. This patch fixes it by:

- setting UC for all memory if CR0.CD = 1
- zapping all the last sptes in MMU if CR0.CD is changed
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nSVM: Check for NRIPS support before updating control field · f104765b

Authored by Bandan Das on Jun 11, 2015

If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: fix checkpatch.pl errors in kvm/coalesced_mmio.h · 0b8ba4a2

Authored by Kevin Mulvey on Jun 16, 2015

Tabs rather than spaces
Signed-off-by: Kevin Mulvey <kmulvey@linux.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: fix checkpatch.pl errors in kvm/async_pf.h · d626f3d5

Authored by Kevin Mulvey on Jun 16, 2015

fix brace spacing
Signed-off-by: Kevin Mulvey <kmulvey@linux.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: irqchip: Break up high order allocations of kvm_irq_routing_table · e73f61e4

Authored by Joerg Roedel on May 08, 2015

The allocation size of the kvm_irq_routing_table depends on
the number of irq routing entries because they are all
allocated with one kzalloc call.

When the irq routing table gets bigger this requires high
order allocations which fail from time to time:

	qemu-kvm: page allocation failure: order:4, mode:0xd0

This patch fixes this issue by breaking up the allocation of
the table and its entries into individual kzalloc calls.
These could all be satisfied with order-0 allocations, which
are less likely to fail.

The downside of this change is the lower performance, because
of more calls to kzalloc. But given how often kvm_set_irq_routing
is called in the lifetime of a guest, it doesn't really
matter much.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[Avoid sparse warning through rcu_access_pointer. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
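The allocation pattern described here can be sketched in userspace with calloc standing in for kzalloc. The types and function names below are hypothetical and much simpler than the kernel's routing structures; the point is only that each entry gets its own small allocation, so no single request grows with the size of an entry times the number of entries.

```c
#include <stddef.h>
#include <stdlib.h>

/* A simplified routing entry (fields are placeholders). */
struct route_entry {
    unsigned int gsi;
    unsigned int type;
};

/* The table only holds pointers; each entry is allocated on its own. */
struct route_table {
    size_t nr;
    struct route_entry **entries;
};

static void route_table_free(struct route_table *t)
{
    if (!t)
        return;
    if (t->entries)
        for (size_t i = 0; i < t->nr; i++)
            free(t->entries[i]);    /* free(NULL) is a no-op */
    free(t->entries);
    free(t);
}

/* Allocates a table with nr zeroed entries; returns NULL on failure. */
static struct route_table *route_table_alloc(size_t nr)
{
    struct route_table *t = calloc(1, sizeof(*t));

    if (!t)
        return NULL;
    t->entries = calloc(nr ? nr : 1, sizeof(*t->entries));
    if (!t->entries) {
        free(t);
        return NULL;
    }
    t->nr = nr;
    for (size_t i = 0; i < nr; i++) {
        t->entries[i] = calloc(1, sizeof(*t->entries[i]));
        if (!t->entries[i]) {
            route_table_free(t);    /* releases whatever was allocated */
            return NULL;
        }
    }
    return t;
}
```

The trade-off matches the one stated in the commit message: more allocator calls per table update, in exchange for requests that are each small enough to be unlikely to fail.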

Merge tag 'kvm-arm-for-4.2' of... · 05fe125f

Authored by Paolo Bonzini on Jun 19, 2015

Merge tag 'kvm-arm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM changes for v4.2:

- Proper guest time accounting
- FP access fix for 32bit
- The usual pile of GIC fixes
- PSCI fixes
- Random cleanups


18 Jun, 2015 · 1 commit

KVM: arm/arm64: vgic: Remove useless arm-gic.h #include · c62e631d

Authored by Marc Zyngier on Jun 18, 2015

Back in the day, vgic.c used to have intimate knowledge of
the actual GICv2. These days, this has been abstracted away into
hardware-specific backends.

Remove the now useless arm-gic.h #include directive, making it
clear that GICv2 specific code doesn't belong here.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

17 Jun, 2015 · 9 commits

KVM: arm/arm64: vgic: Avoid injecting reserved IRQ numbers · 4839ddc2

Authored by Marc Zyngier on Jun 17, 2015

Commit fd1d0ddf (KVM: arm/arm64: check IRQ number on userland
injection) rightly limited the range of interrupts userspace can
inject in a guest, but failed to consider the (unlikely) case where
a guest is configured with 1024 interrupts.

In this case, interrupts ranging from 1020 to 1023 are unusable,
as they have a special meaning for the GIC CPU interface.

Make sure that these numbers cannot be used as an IRQ. Also delete
a redundant (and similarly buggy) check in kvm_set_irq.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: <stable@vger.kernel.org> # 4.1, 4.0, 3.19, 3.18
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
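The check described above can be sketched as a small predicate. This is an illustration rather than the kernel's code: the macro and function names are hypothetical, and the only facts relied on are that IDs 0 to 31 are per-CPU interrupts (SGIs and PPIs) and that IDs 1020 to 1023 are reserved by the GIC architecture.

```c
#include <stdbool.h>

#define NR_LOCAL_IRQS   32    /* SGIs and PPIs occupy IDs 0-31 */
#define MAX_USABLE_IRQS 1020  /* IDs 1020-1023 are special */

/*
 * Returns true if irq may be injected as a shared interrupt in a guest
 * configured with nr_irqs interrupt IDs.
 */
static bool spi_number_is_valid(unsigned int irq, unsigned int nr_irqs)
{
    unsigned int limit = nr_irqs < MAX_USABLE_IRQS ? nr_irqs : MAX_USABLE_IRQS;

    return irq >= NR_LOCAL_IRQS && irq < limit;
}
```

Clamping the limit is what covers the 1024-interrupt case: without it, a bound check against nr_irqs alone would accept 1020 through 1023.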

arm/arm64: KVM: vgic: Do not save GICH_HCR / ICH_HCR_EL2 · 4642019d

Authored by Marc Zyngier on Jun 11, 2015

The GIC Hypervisor Configuration Register is used to enable
the delivery of virtual interrupts to a guest, as well as to
define in which conditions maintenance interrupts are delivered
to the host.

This register doesn't contain any information that we need to
read back (the EOIcount is utterly useless for us).

So let's save ourselves some cycles, and not save it before
writing zero to it.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: arm: vgic: Drop useless Group0 warning · f5a202db

Authored by Marc Zyngier on May 29, 2015

If a GICv3-enabled guest tries to configure Group0, we print a
warning on the console (because we don't support Group0 interrupts).

This is fairly pointless, and would allow a guest to spam the
console. Let's just drop the warning.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

ARM: kvm: psci: fix handling of unimplemented functions · e2d99736

Authored by Lorenzo Pieralisi on Jun 10, 2015

According to the PSCI specification and the SMC/HVC calling
convention, PSCI function_ids that are not implemented must
return NOT_SUPPORTED as return value.

The current KVM implementation treats an unhandled PSCI function_id
as an error and injects an undefined instruction into the guest
if the PSCI implementation is called with a function_id that is not
handled by the resident PSCI version (i.e. it is not implemented),
which is not the behaviour expected by a guest when calling a
PSCI function_id that is not implemented.

This patch fixes this issue by returning NOT_SUPPORTED whenever
the kvm PSCI call is executed for a function_id that is not
implemented by the PSCI kvm layer.

Cc: <stable@vger.kernel.org> # 3.18+
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
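The behaviour required by the specification can be sketched as a dispatcher whose default case returns NOT_SUPPORTED instead of signalling an error. The function below is a hypothetical stand-in, not KVM's handler; it implements a single call (PSCI_VERSION, reporting version 0.2) so the fallback path is easy to see. The function ID and return code values follow the PSCI specification.

```c
#include <stdint.h>

#define PSCI_0_2_FN_PSCI_VERSION 0x84000000u
#define PSCI_RET_NOT_SUPPORTED   (-1)

/* Returns the PSCI result for function_id (hypothetical dispatcher). */
static long psci_call(uint32_t function_id)
{
    switch (function_id) {
    case PSCI_0_2_FN_PSCI_VERSION:
        /* Major version in bits 31:16, minor in bits 15:0: 0.2 -> 2. */
        return 2;
    default:
        /* Unknown IDs are reported to the caller, not treated as faults. */
        return PSCI_RET_NOT_SUPPORTED;
    }
}
```

A guest probing for an optional function can then test the return value and fall back gracefully, which is not possible if the probe raises an undefined instruction exception.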

KVM: arm64: fix misleading comments in save/restore · 921ef1e1

Authored by Alex Bennée on Jun 04, 2015

The elr_el2 and spsr_el2 registers in fact contain the processor state
before entry into EL2. In the case of guest state it could be in either
el0 or el1.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: arm/arm64: Enable the KVM-VFIO device · 8889583c

Authored by Kim Phillips on Jun 05, 2015

The KVM-VFIO device is used by the QEMU VFIO device. It is used to
record the list of in-use VFIO groups so that KVM can manipulate
them.
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

arm/arm64: KVM: Properly account for guest CPU time · 1b3d546d

Authored by Christoffer Dall on May 28, 2015

Until now we have been calling kvm_guest_exit after re-enabling
interrupts when we come back from the guest, but this has the
unfortunate effect that CPU time accounting done in the context of timer
interrupts occurring while the guest is running doesn't properly notice
that the time since the last tick was spent in the guest.

Inspired by the comment in the x86 code, move the kvm_guest_exit() call
below the local_irq_enable() call and change __kvm_guest_exit() to
kvm_guest_exit(), because we are now calling this function with
interrupts enabled. We have to now explicitly disable preemption and
not enable preemption before we've called kvm_guest_exit(), since
otherwise we could be preempted and everything happening before we
eventually get scheduled again would be accounted for as guest time.

At the same time, move the trace_kvm_exit() call outside of the atomic
section, since there is no reason for us to do that with interrupts
disabled.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

kvm: remove one useless check extension · ea2c6d97

Authored by Tiejun Chen on May 04, 2015

We already check KVM_CAP_IRQFD in the generic code once CONFIG_HAVE_KVM_IRQFD is enabled:

kvm_vm_ioctl_check_extension_generic()
    |
    + switch (arg) {
    +   ...
    +   #ifdef CONFIG_HAVE_KVM_IRQFD
    +       case KVM_CAP_IRQFD:
    +   #endif
    +   ...
    +   return 1;
    +   ...
    + }
    |
    + kvm_vm_ioctl_check_extension()

So it's not necessary to check this in the arch code again; also fix one
typo, s/emlation/emulation.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

arm: KVM: force execution of HCPTR access on VM exit · 85e84ba3

Authored by Marc Zyngier on Mar 16, 2015

On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.

On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.

If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.

The same condition exists when trapping to enable VFP for the guest.

The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only take place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.

The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.
Reported-by: Vikram Sethi <vikrams@codeaurora.org>
Cc: stable@kernel.org	# v3.9+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

10 Jun, 2015 · 2 commits

KVM: arm64: add active register handling to GICv3 emulation as well · c11b5329

Authored by Andre Przywara on Apr 23, 2015

Commit 47a98b15 ("arm/arm64: KVM: support for un-queuing active
IRQs") introduced handling of the GICD_I[SC]ACTIVER registers,
but only for the GICv2 emulation. For the sake of completeness and
as this is a pre-requisite for save/restore of the GICv3 distributor
state, we should also emulate their handling in the distributor and
redistributor frames of an emulated GICv3.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

ARM: KVM: Remove pointless void pointer cast · a5f56ba3

Authored by Firo Yang on Apr 23, 2015

No need to cast the void pointer returned by kmalloc() in
arch/arm/kvm/mmu.c::kvm_alloc_stage2_pgd().
Signed-off-by: Firo Yang <firogm@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

05 Jun, 2015 · 8 commits

KVM: x86: mark legacy PCI device assignment as deprecated · e80a4a94

Authored by Paolo Bonzini on Jun 04, 2015

Follow up to commit e194bbdf.
Suggested-by: Bandan Das <bsd@redhat.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: advertise KVM_CAP_X86_SMM · 6d396b55

Authored by Paolo Bonzini on Apr 01, 2015

... and we're done. :)

Because SMBASE is usually relocated above 1M on modern chipsets, and
SMM handlers might indeed rely on 4G segment limits, we only expose it
if KVM is able to run the guest in big real mode.  This includes any
of VMX+emulate_invalid_guest_state, VMX+unrestricted_guest, or SVM.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: add SMM to the MMU role, support SMRAM address space · 699023e2

Authored by Paolo Bonzini on May 18, 2015

This is now very simple to do.  The only interesting part is a simple
trick to find the right memslot in gfn_to_rmap, retrieving the address
space from the spte role word.  The same trick is used in the auditing
code.

The comment on top of union kvm_mmu_page_role has been stale forever,
so remove it.  Speaking of stale code, remove pad_for_nice_hex_output
too: it was splitting the "access" bitfield across two bytes and thus
had effectively turned into pad_for_ugly_hex_output.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: work on all available address spaces · 9da0e4d5

Authored by Paolo Bonzini on May 18, 2015

This patch has no semantic change, but it prepares for the introduction
of a second address space for system management mode.

A new function x86_set_memory_region (and the "slots_lock taken"
counterpart __x86_set_memory_region) is introduced in order to
operate on all address spaces when adding or deleting private
memory slots.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: use vcpu-specific functions to read/write/translate GFNs · 54bf36aa

Authored by Paolo Bonzini on Apr 08, 2015

We need to hide SMRAM from guests not running in SMM.  Therefore,
all uses of kvm_read_guest* and kvm_write_guest* must be changed to
check whether the VCPU is in system management mode and use a
different set of memslots.  Switch from kvm_* to the newly-introduced
kvm_vcpu_*, which call into kvm_arch_vcpu_memslots_id.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: pass struct kvm_mmu_page to gfn_to_rmap · e4cd1da9

Authored by Paolo Bonzini on May 18, 2015

This is always available (with one exception in the auditing code),
and with the same auditing exception the level was coming from
sp->role.level.

Later, the spte's role will also be used to look up the right memslots
array.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: implement multiple address spaces · f481b069

Authored by Paolo Bonzini on May 17, 2015

Only two ioctls have to be modified; the address space id is
placed in the higher 16 bits of their slot id argument.

As of this patch, no architecture defines more than one
address space; x86 will be the first.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
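The encoding mentioned in this commit message, with the address space id in the upper 16 bits of the slot argument, can be sketched with three small helpers. The helper names are hypothetical; only the bit layout comes from the text above.

```c
#include <stdint.h>

/* Packs an address space id and a memslot id into one 32-bit slot value. */
static uint32_t slot_pack(uint16_t as_id, uint16_t id)
{
    return ((uint32_t)as_id << 16) | id;
}

/* Extracts the address space id from the upper 16 bits. */
static uint16_t slot_as_id(uint32_t slot)
{
    return (uint16_t)(slot >> 16);
}

/* Extracts the memslot id from the lower 16 bits. */
static uint16_t slot_id(uint32_t slot)
{
    return (uint16_t)(slot & 0xffff);
}
```

A nice property of this layout is backward compatibility: callers that only ever pass small slot numbers implicitly select address space 0.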

KVM: add vcpu-specific functions to read/write/translate GFNs · 8e73485c

Authored by Paolo Bonzini on May 17, 2015

We need to hide SMRAM from guests not running in SMM.  Therefore, all
uses of kvm_read_guest* and kvm_write_guest* must be changed to use
different address spaces, depending on whether the VCPU is in system
management mode.  We need to introduce a new family of functions for
this purpose.

For now, the VCPU-based functions have the same behavior as the
existing per-VM ones, they just accept a different type for the
first argument.  Later however they will be changed to use one of many
"struct kvm_memslots" stored in struct kvm, through an architecture hook.
VM-based functions will unconditionally use the first memslots pointer.

Whenever possible, this patch introduces slot-based functions with an
__ prefix, with two wrappers for generic and vcpu-based actions.
The exceptions are kvm_read_guest and kvm_write_guest, which are copied
into the new functions kvm_vcpu_read_guest and kvm_vcpu_write_guest.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
