Commits · a4cd8b23ac5786943202c0174c717956947db43c · openeuler / raspberrypi-kernel

12 Jul, 2011 40 commits

KVM: PPC: e500: enable magic page · a4cd8b23

Committed by Scott Wood on Jun 14, 2011

This is a shared page used for paravirtualization.  It is always present
in the guest kernel's effective address space at the address indicated
by the hypercall that enables it.

The physical address specified by the hypercall is not used, as
e500 does not have real mode.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: e500: Support large page mappings of PFNMAP vmas. · 9973d54e

Committed by Scott Wood on Jun 14, 2011

This allows large pages to be used on guest mappings backed by things like
/dev/mem, resulting in a significant speedup when guest memory
is mapped this way (it's useful for directly-assigned MMIO, too).

This is not a substitute for hugetlbfs integration, but is useful for
configurations where devices are directly assigned on chips without an
IOMMU -- in these cases, we need guest physical and true physical to
match, and be contiguous, so static reservation and mapping via /dev/mem
is the most straightforward way to set things up.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: e500: Eliminate shadow_pages[], and use pfns instead. · 59c1f4e3

Committed by Scott Wood on Jun 14, 2011

This is in line with what other architectures do, and will allow us to
map things other than ordinary, unreserved kernel pages -- such as
dedicated devices, or large contiguous reserved regions.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: e500: don't use MAS0 as intermediate storage. · 0ef30995

Committed by Scott Wood on Jun 14, 2011

This avoids races.  It also means that we use the shadow TLB way,
rather than the hardware hint -- if this is a problem, we could do
a tlbsx before inserting a TLB0 entry.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: e500: Disable preloading TLB1 in tlb_load(). · 6fc4d1eb

Committed by Scott Wood on Jun 14, 2011

Since TLB1 loading doesn't check the shadow TLB before allocating another
entry, you can get duplicates.

Once shadow PIDs are enabled in a later patch, we won't need to
invalidate the TLB on every switch, so this optimization won't be
needed anyway.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: e500: Save/restore SPE state · 4cd35f67

Committed by Scott Wood on Jun 14, 2011

This is done lazily.  The SPE save will be done only if the guest has
used SPE since the last preemption or heavyweight exit.  Restore will be
done only on demand, when enabling MSR_SPE in the shadow MSR, in response
to an SPE fault or mtmsr emulation.

For SPEFSCR, Linux already switches it on context switch (non-lazily), so
the only remaining bit is to save it between qemu and the guest.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

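The lazy scheme described in this commit message can be pictured with a small model. This is only a sketch under stated assumptions: `struct demo_vcpu`, `demo_enable_spe()` and `demo_leave_guest()` are hypothetical names, and counters stand in for the actual register loads and stores.

```c
#include <stdbool.h>

/* Hypothetical model of lazy SPE switching; not the kernel's data layout. */
struct demo_vcpu {
    bool spe_live; /* models MSR_SPE being set in the shadow MSR */
    int  loads;    /* times guest SPE state was loaded into registers */
    int  saves;    /* times guest SPE state was saved back to memory */
};

/* SPE fault or emulated mtmsr that enables SPE: restore on demand. */
void demo_enable_spe(struct demo_vcpu *v)
{
    if (!v->spe_live) {
        v->loads++;         /* stands in for loading the guest's SPE registers */
        v->spe_live = true;
    }
}

/* Preemption or heavyweight exit: save only if the guest used SPE. */
void demo_leave_guest(struct demo_vcpu *v)
{
    if (v->spe_live) {
        v->saves++;         /* stands in for storing the guest's SPE registers */
        v->spe_live = false;
    }
}
```

In this model a guest that never touches SPE costs no saves or loads at all, which is the point of doing the work lazily.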

KVM: PPC: booke: use shadow_msr · ecee273f

Committed by Scott Wood on Jun 14, 2011

Keep the guest MSR and the guest-mode true MSR separate, rather than
modifying the guest MSR on each guest entry to produce a true MSR.

Any bits which should be modified based on guest MSR must be explicitly
propagated from vcpu->arch.shared->msr to vcpu->arch.shadow_msr in
kvmppc_set_msr().

While we're modifying the guest entry code, reorder a few instructions
to bury some load latencies.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

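A minimal sketch of the split between guest MSR and shadow MSR, assuming a hypothetical `struct demo_vcpu` and an illustrative choice of which bits are forced on and which are propagated. The real set of propagated bits is decided in kvmppc_set_msr() and is not reproduced here.

```c
#include <stdint.h>

/* Illustrative MSR bits; which ones are forced or propagated is an assumption. */
#define DEMO_MSR_EE (1u << 15)
#define DEMO_MSR_PR (1u << 14)
#define DEMO_MSR_DE (1u << 9)
#define DEMO_MSR_IS (1u << 5)
#define DEMO_MSR_DS (1u << 4)

/* Bits the host always sets while the guest runs (hypothetical choice). */
#define DEMO_FORCED_ON  (DEMO_MSR_EE | DEMO_MSR_PR | DEMO_MSR_IS | DEMO_MSR_DS)
/* Bits copied from the guest's view into the true MSR (hypothetical choice). */
#define DEMO_PROPAGATED (DEMO_MSR_DE)

struct demo_vcpu {
    uint32_t guest_msr;  /* what the guest believes its MSR is */
    uint32_t shadow_msr; /* what the hardware MSR holds in guest mode */
};

void demo_vcpu_init(struct demo_vcpu *v)
{
    v->guest_msr = 0;
    v->shadow_msr = DEMO_FORCED_ON;
}

/* Update the guest MSR and explicitly propagate the selected bits. */
void demo_set_msr(struct demo_vcpu *v, uint32_t new_msr)
{
    v->guest_msr = new_msr;
    v->shadow_msr &= ~DEMO_PROPAGATED;
    v->shadow_msr |= new_msr & DEMO_PROPAGATED;
}
```

Because the shadow value is kept up to date at write time, guest entry can load it directly instead of recomputing it from the guest MSR on every entry.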

powerpc/e500: SPE register saving: take arbitrary struct offset · c51584d5

Committed by Scott Wood on Jun 14, 2011

Previously, these macros hardcoded THREAD_EVR0 as the base of the save
area, relative to the base register passed.  This base offset is now
passed as a separate macro parameter, allowing reuse with other SPE
save areas, such as used by KVM.
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

powerpc/e500: Save SPEFSCR in flush_spe_to_thread() · 685659ee

Committed by yu liu on Jun 14, 2011

giveup_spe() saves the SPE state which is protected by MSR[SPE].
However, modifying SPEFSCR does not trap when MSR[SPE]=0.
And since SPEFSCR is already saved/restored in _switch(),
not all the callers want to save SPEFSCR again.
Thus, saving SPEFSCR does not belong in giveup_spe().

This patch moves SPEFSCR saving to flush_spe_to_thread(),
and cleans up the caller that needs to save SPEFSCR accordingly.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Resolve real-mode handlers through function exports · a22a2dac

Committed by Alexander Graf on Jun 07, 2011

Up until now, Book3S KVM had variables stored in the kernel that a kernel module
or the kvm code in the kernel could read from to figure out where some real mode
helper functions are located.

This is all unnecessary. The high bits of the EA get ignored in real mode, so we
can just use the pointer as is. Also, it's a lot easier on relocations when we
use the normal way of resolving the address to a function, instead of jumping
through hoops.

This patch fixes compilation with CONFIG_RELOCATABLE=y.
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: fix partial application of "exit timing in ticks" · 24294b9a

Committed by Stuart Yoder on May 17, 2011

When http://www.spinics.net/lists/kvm-ppc/msg02664.html
was applied to produce commit b51e7aa7ed6d8d134d02df78300ab0f91cfff4d2,
the removal of the conversion in add_exit_timing was left out.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: MMU: make kvm_mmu_reset_context() flush the guest TLB · 45bd07b9

Committed by Avi Kivity on Jun 12, 2011

kvm_set_cr0() and kvm_set_cr4(), and possibly other functions,
assume that kvm_mmu_reset_context() flushes the guest TLB.  However,
it does not.

Fix by flushing the tlb (and syncing the new root as well).
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Adjust shadow paging to work when SMEP=1 and CR0.WP=0 · 411c588d

Committed by Avi Kivity on Jun 06, 2011

When CR0.WP=0, we sometimes map user pages as kernel pages (to allow
the kernel to write to them).  Unfortunately this also allows the kernel
to fetch from these pages, even if CR4.SMEP is set.

Adjust for this by also setting NX on the spte in these circumstances.
Signed-off-by: Avi Kivity <avi@redhat.com>

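The adjustment can be sketched on a bare 64-bit page table entry. The bit positions are the architectural x86 ones; `demo_adjust_spte()` and the exact condition under which a user page gets remapped as a kernel page are assumptions made for illustration, not the shadow MMU's actual code path.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_WRITABLE (1ull << 1)
#define PTE_USER     (1ull << 2)
#define PTE_NX       (1ull << 63)

/*
 * Hypothetical helper: given a shadow PTE for a read-only user page,
 * decide how to map it when the guest kernel writes to it.
 */
uint64_t demo_adjust_spte(uint64_t spte, bool cr0_wp, bool cr4_smep,
                          bool kernel_write_fault)
{
    bool user_page = (spte & PTE_USER) != 0;
    bool read_only = (spte & PTE_WRITABLE) == 0;

    if (!cr0_wp && kernel_write_fault && user_page && read_only) {
        /* Let the kernel write: map the page as a writable kernel page. */
        spte |= PTE_WRITABLE;
        spte &= ~PTE_USER;
        /* A kernel page would bypass SMEP, so forbid fetches entirely. */
        if (cr4_smep)
            spte |= PTE_NX;
    }
    return spte;
}
```

Without the NX bit, the remapped page would look like supervisor memory to the CPU, and a kernel-mode instruction fetch from it would succeed even though the guest enabled SMEP.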

KVM: Enable ERMS feature support for KVM · a01c8f9b

Committed by Yang, Wei on Jun 14, 2011

This patch exposes ERMS feature to KVM guests.

With Enhanced REP MOVSB/STOSB (ERMS), the fast-string operation attempts to
move as much of the data as possible using larger-size loads and stores.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Expose RDWRGSFS bit to KVM guests · 176f61da

Committed by Yang, Wei on Jun 14, 2011

This patch exposes RDWRGSFS bit to KVM guests.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Add RDWRGSFS support when setting CR4 · 74dc2b4f

Committed by Yang, Wei on Jun 14, 2011

This patch adds RDWRGSFS support when setting CR4.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Remove RDWRGSFS bit from CR4_RESERVED_BITS · d9c3476d

Committed by Yang, Wei on Jun 14, 2011

This patch removes RDWRGSFS bit from CR4_RESERVED_BITS.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Enable DRNG feature support for KVM · 4a00efdf

Committed by Yang, Wei Y on Jun 13, 2011

This patch exposes DRNG feature to KVM guests.

The RDRAND instruction can provide software with sequences of
random numbers generated from white noise.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix XSAVE bit scanning (now properly) · 02668b06

Committed by Andre Przywara on Jun 10, 2011

commit 123108f1c1aafd51d6a5c79cc04d7999dd88a930 tried to fix KVM's
XSAVE valid feature scanning, but it was wrong. It was not considering
the sparse nature of this bitfield, instead reading values from
uninitialized members of the entries array.
This patch now separates subleaf indices from KVM's array indices
and fills the entry before querying its value.
This fixes AVX support in KVM guests.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

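The indexing issue can be illustrated with a short sketch that walks a sparse feature mask while keeping the sub-leaf number and the output array position as separate counters. `struct demo_entry`, `demo_query()` and `demo_fill()` are hypothetical names, and the queried value is made up; only the two-counter pattern is the point.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_entry {
    uint32_t index; /* sub-leaf number this entry describes */
    uint32_t eax;   /* value reported for that sub-leaf */
};

/* Stand-in for querying one sub-leaf; the returned value is made up. */
uint32_t demo_query(uint32_t subleaf)
{
    return 64u * subleaf;
}

/*
 * Fill `out` densely from a sparse bit mask. `idx` walks sub-leaf numbers,
 * `n` walks array slots; each slot is filled before anything reads it.
 */
size_t demo_fill(uint64_t mask, struct demo_entry *out, size_t max)
{
    size_t n = 0;

    for (uint32_t idx = 0; idx < 64 && n < max; idx++) {
        if (!(mask & (1ull << idx)))
            continue;
        out[n].index = idx;
        out[n].eax = demo_query(idx);
        n++;
    }
    return n;
}
```

With a mask such as 0x207 (bits 0, 1, 2 and 9), the fourth array slot describes sub-leaf 9, so using one counter for both roles would either skip slots or read slots that were never written.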

KVM: Add instruction fetch checking when walking guest page table · e57d4a35

Committed by Yang, Wei Y on Jun 03, 2011

This patch adds instruction fetch checking when walking guest page table,
to implement SMEP when emulating instead of executing natively.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Mask function7 ebx against host capability word9 · 611c120f

Committed by Yang, Wei Y on Jun 03, 2011

This patch masks CPUID leaf 7 ebx against host capability word9.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

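The masking amounts to a bitwise AND between what the hypervisor is willing to expose and what the host reports. In this sketch the bit positions are the architectural CPUID.(EAX=7,ECX=0):EBX ones, while `DEMO_KVM_LEAF7_EBX` and `demo_guest_leaf7_ebx()` are hypothetical names and the exposed set is an assumption.

```c
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits. */
#define F_FSGSBASE (1u << 0)
#define F_SMEP     (1u << 7)
#define F_ERMS     (1u << 9)

/* Features the hypervisor is prepared to expose (hypothetical list). */
#define DEMO_KVM_LEAF7_EBX (F_FSGSBASE | F_SMEP | F_ERMS)

/* A feature reaches the guest only if the host has it and KVM allows it. */
uint32_t demo_guest_leaf7_ebx(uint32_t host_caps_word)
{
    return DEMO_KVM_LEAF7_EBX & host_caps_word;
}
```

A host bit outside the exposed set is dropped, and an exposed bit the host lacks is dropped as well, so the guest never sees a feature the hardware cannot back.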

KVM: Add SMEP support when setting CR4 · c68b734f

Committed by Yang, Wei Y on Jun 03, 2011

This patch adds SMEP handling when setting CR4.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

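One plausible shape for the CR4 handling is a check that refuses to set a feature bit the guest's CPUID does not advertise. The CR4 bit positions below are architectural; `demo_set_cr4()` and its boolean parameters are hypothetical, and the sketch ignores every other CR4 bit and side effect.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_FSGSBASE (1ull << 16) /* enables RDFSBASE/WRGSBASE and friends */
#define CR4_SMEP     (1ull << 20)

/*
 * Hypothetical CR4 write handler. Returns 0 on success, 1 when the write
 * should be rejected (the caller would then inject #GP into the guest).
 */
int demo_set_cr4(uint64_t *cr4, uint64_t val,
                 bool guest_has_smep, bool guest_has_fsgsbase)
{
    if ((val & CR4_SMEP) && !guest_has_smep)
        return 1;
    if ((val & CR4_FSGSBASE) && !guest_has_fsgsbase)
        return 1;

    *cr4 = val;
    return 0;
}
```

The same pattern would apply to the RDWRGSFS bit once it is removed from the reserved mask: the bit becomes legal in principle, and the per-guest CPUID check decides whether a particular guest may set it.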

KVM: Remove SMEP bit from CR4_RESERVED_BITS · 8d9c975f

Committed by Yang, Wei Y on Jun 03, 2011

This patch removes SMEP bit from CR4_RESERVED_BITS.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: nVMX: Fix bug preventing more than two levels of nesting · 509c75ea

Committed by Nadav Har'El on Jun 02, 2011

The nested VMX feature is supposed to fully emulate VMX for the guest. This
(theoretically) not only allows it to run its own guests, but also
to further emulate VMX for its own guests, and allow arbitrarily deep nesting.

This patch fixes a bug (discovered by Kevin Tian) in handling a VMLAUNCH
by L2, which prevented deeper nesting.

Deeper nesting now works (I only actually tested L3), but is currently
*absurdly* slow, to the point of being unusable.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: fold decode_cache into x86_emulate_ctxt · 9dac77fa

Committed by Avi Kivity on Jun 01, 2011

This saves a lot of pointless casts between x86_emulate_ctxt and decode_cache.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: rename decode_cache::eip to _eip · 36dd9bb5

Committed by Avi Kivity on Jun 01, 2011

The name eip conflicts with a field of the same name in x86_emulate_ctxt,
which we plan to fold decode_cache into.

The name _eip is unfortunate, but what's really needed is a refactoring
here, not a better name.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: Silence warning on 32-bit hosts · 2e4ce7f5

Committed by Jan Kiszka on Jun 01, 2011

The variable a is now unused on CONFIG_X86_32.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Use opcode::execute for CLI/STI(FA/FB) · f411e6cd

Committed by Takuya Yoshikawa on May 29, 2011

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

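The opcode::execute conversions in this series follow a table-dispatch pattern: each opcode byte maps to a handler function instead of a case in one large switch. A minimal sketch for CLI (FA) and STI (FB), assuming hypothetical `demo_` types and handlers that skip the privilege checks a real emulator needs:

```c
#include <stddef.h>
#include <stdint.h>

#define EFLAGS_IF (1u << 9) /* interrupt enable flag */

struct demo_ctxt {
    uint32_t eflags;
};

struct demo_opcode {
    int (*execute)(struct demo_ctxt *ctxt); /* NULL when not table-driven */
};

int demo_em_cli(struct demo_ctxt *ctxt)
{
    ctxt->eflags &= ~EFLAGS_IF;
    return 0;
}

int demo_em_sti(struct demo_ctxt *ctxt)
{
    ctxt->eflags |= EFLAGS_IF;
    return 0;
}

/* One slot per opcode byte; unlisted slots are zero-initialized. */
const struct demo_opcode demo_table[256] = {
    [0xFA] = { demo_em_cli },
    [0xFB] = { demo_em_sti },
};

/* Returns the handler's result, or -1 when the opcode has no handler. */
int demo_emulate(struct demo_ctxt *ctxt, uint8_t opcode)
{
    const struct demo_opcode *op = &demo_table[opcode];

    if (op->execute == NULL)
        return -1;
    return op->execute(ctxt);
}
```

Moving an instruction into the table then means writing one em_xxx()-style function and adding one table entry, which is why the series can convert instructions a few at a time.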

KVM: x86 emulator: Use opcode::execute for LOOP/JCXZ · d06e03ad

Committed by Takuya Yoshikawa on May 29, 2011

  LOOP/LOOPcc      : E0-E2
  JCXZ/JECXZ/JRCXZ : E3
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Clean up INT n/INTO/INT 3(CC/CD/CE) · 5c5df76b

Committed by Takuya Yoshikawa on May 29, 2011

Call emulate_int() directly to avoid spaghetti goto's.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Use opcode::execute for MOV(8C/8E) · 1bd5f469

Committed by Takuya Yoshikawa on May 29, 2011

Different functions for those which take segment register operands.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Use opcode::execute for RET(C3) · ebda02c2

Committed by Takuya Yoshikawa on May 29, 2011

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Use opcode::execute for XCHG(86/87) · e4f973ae

Committed by Takuya Yoshikawa on May 29, 2011

In addition, replace one "goto xchg" with an em_xchg() call.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Use opcode::execute for TEST(84/85, A8/A9) · 9f21ca59

Committed by Takuya Yoshikawa on May 29, 2011

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Use opcode::execute for some instructions · db5b0762

Committed by Takuya Yoshikawa on May 29, 2011

Move the following functions to the opcode tables:

  RET (Far return) : CB
  IRET             : CF
  JMP (Jump far)   : EA

  SYSCALL          : 0F 05
  CLTS             : 0F 06
  SYSENTER         : 0F 34
  SYSEXIT          : 0F 35
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Rename emulate_xxx() to em_xxx() · e01991e7

Committed by Takuya Yoshikawa on May 29, 2011

The next patch will change these to be called by opcode::execute.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: Use the pointers ctxt and c consistently · 9d74191a

Committed by Takuya Yoshikawa on May 29, 2011

We should use the local variables ctxt and c when the emulate_ctxt and
decode appear many times.  At least, we need to be consistent about
how we use these in a function.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: nVMX: Miscellaneous small corrections · 2844d849

Committed by Nadav Har'El on May 25, 2011

Small corrections of KVM (spelling, etc.) not directly related to nested VMX.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: nVMX: Add VMX to list of supported cpuid features · 7b8050f5

Committed by Nadav Har'El on May 25, 2011

If the "nested" module option is enabled, add the "VMX" CPU feature to the
list of CPU features KVM advertises with the KVM_GET_SUPPORTED_CPUID ioctl.

Qemu uses this ioctl, and intersects KVM's list with its own list of desired
cpu features (depending on the -cpu option given to qemu) to determine the
final list of features presented to the guest.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

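The two-step negotiation described above reduces to bit operations on a feature word. In this sketch VMX is CPUID.1:ECX bit 5, which is architectural; `demo_kvm_supported_ecx()` and `demo_guest_ecx()` are hypothetical stand-ins for what KVM reports through the ioctl and for what userspace does with the result.

```c
#include <stdbool.h>
#include <stdint.h>

#define F_VMX (1u << 5) /* CPUID.1:ECX bit 5 */

/* What the hypervisor advertises: VMX only when nesting is enabled. */
uint32_t demo_kvm_supported_ecx(uint32_t base, bool nested)
{
    return nested ? (base | F_VMX) : (base & ~F_VMX);
}

/* What the guest sees: the intersection with userspace's wish list. */
uint32_t demo_guest_ecx(uint32_t kvm_supported, uint32_t qemu_wanted)
{
    return kvm_supported & qemu_wanted;
}
```

Under this model the guest gets VMX only when both sides agree: the module option makes it available, and the CPU model chosen in userspace has to ask for it.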

KVM: nVMX: Additional TSC-offset handling · 7991825b

Committed by Nadav Har'El on May 25, 2011

In the unlikely case that L1 does not capture MSR_IA32_TSC, L0 needs to
emulate this MSR write by L2 by modifying vmcs02.tsc_offset. We also need to
set vmcs12.tsc_offset, for this change to survive the next nested entry (see
prepare_vmcs02()).
Additionally, we need to modify vmx_adjust_tsc_offset(): The semantics
of this function are that the TSC of all guests on this vcpu, L1 and possibly
several L2s, needs to be adjusted. To do this, we need to adjust vmcs01's
tsc_offset (this offset will also apply to each L2 we enter). We can't set
vmcs01 now, so we have to remember this adjustment and apply it when we
later exit to L1.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

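The bookkeeping for the adjustment can be sketched with plain integers in place of VMCS fields. `struct demo_vmx` and the three `demo_` functions are hypothetical; the model only shows that an adjustment made while L2 runs is applied to the active offset immediately and also remembered for L1.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_vmx {
    bool    in_l2;             /* true while an L2 guest is running */
    int64_t active_tsc_offset; /* offset in the VMCS currently loaded */
    int64_t vmcs01_tsc_offset; /* L1's offset, remembered while L2 runs */
};

/* Adjust the TSC of every guest on this vcpu by `adjustment`. */
void demo_adjust_tsc_offset(struct demo_vmx *vmx, int64_t adjustment)
{
    vmx->active_tsc_offset += adjustment;
    if (vmx->in_l2)
        vmx->vmcs01_tsc_offset += adjustment; /* applied on exit to L1 */
}

/* Nested entry: stash L1's offset, then add L1's chosen offset for L2. */
void demo_enter_l2(struct demo_vmx *vmx, int64_t l2_extra_offset)
{
    vmx->vmcs01_tsc_offset = vmx->active_tsc_offset;
    vmx->active_tsc_offset += l2_extra_offset;
    vmx->in_l2 = true;
}

/* Nested exit: L1 resumes with its remembered, possibly adjusted, offset. */
void demo_exit_to_l1(struct demo_vmx *vmx)
{
    vmx->active_tsc_offset = vmx->vmcs01_tsc_offset;
    vmx->in_l2 = false;
}
```

For example, starting from an L1 offset of 100, entering L2 with an extra 50 and then adjusting by -10 leaves L2 running at 140 and brings L1 back at 90 after the nested exit.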