Commits · edcafe3c5a06f46407c3f60145a36f269e56ff7f · openanolis / cloud-kernel

01 Mar, 2010 · 40 commits

KVM: VMX: Give the guest ownership of cr0.ts when the fpu is active · edcafe3c

Committed by Avi Kivity on Dec 30, 2009

If the guest fpu is loaded, there is nothing interesting about cr0.ts; let
the guest play with it as it will.  This makes context switches between fpu
intensive guest processes faster, as we won't trap the clts and cr0 write
instructions.

[marcelo: fix cr0 read shadow update on fpu deactivation; kills F8 install]
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

edcafe3c
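
The ownership idea in this commit can be illustrated with a small user-space model. On VMX, a guest/host mask marks which CR0 bits are host-owned; guest reads of those bits come from a read shadow, and guest writes exit to the host only when they would change a host-owned bit relative to that shadow. The sketch below is not the kernel code: `struct cr0_model` and the three helper functions are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

#define X86_CR0_TS (1UL << 3)   /* Task Switched bit of CR0 */

/* Hypothetical model of the VMX fields that virtualize CR0. */
struct cr0_model {
    unsigned long guest_cr0;    /* value held in the real guest CR0 */
    unsigned long read_shadow;  /* what the guest sees for host-owned bits */
    unsigned long host_mask;    /* 1 = host-owned bit, 0 = guest-owned bit */
};

/* Value a guest read of CR0 would observe. */
static unsigned long guest_read_cr0(const struct cr0_model *m)
{
    return (m->guest_cr0 & ~m->host_mask) | (m->read_shadow & m->host_mask);
}

/* A guest CR0 write exits only if it changes a host-owned bit
 * relative to the read shadow. */
static bool guest_write_traps(const struct cr0_model *m, unsigned long val)
{
    return ((val ^ m->read_shadow) & m->host_mask) != 0;
}

/* clts exits only if TS is host-owned and set in the read shadow. */
static bool guest_clts_traps(const struct cr0_model *m)
{
    return (m->host_mask & m->read_shadow & X86_CR0_TS) != 0;
}
```

With TS cleared from `host_mask` while the guest fpu is loaded, neither `clts` nor a CR0 write that only toggles TS causes an exit in this model, which is the saving the commit message describes.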

KVM: Lazify fpu activation and deactivation · 02daab21

Committed by Avi Kivity on Dec 30, 2009

Defer fpu deactivation as much as possible - if the guest fpu is loaded, keep
it loaded until the next heavyweight exit (where we are forced to unload it).
This reduces unnecessary exits.

We also defer fpu activation on clts; while clts signals the intent to use the
fpu, we can't be sure the guest will actually use it.

Signed-off-by: Avi Kivity <avi@redhat.com>

02daab21

KVM: VMX: Allow the guest to own some cr0 bits · e8467fda

Committed by Avi Kivity on Dec 29, 2009

We will use this later to give the guest ownership of cr0.ts.

Signed-off-by: Avi Kivity <avi@redhat.com>

e8467fda

KVM: Replace read accesses of vcpu->arch.cr0 by an accessor · 4d4ec087

Committed by Avi Kivity on Dec 29, 2009

Since we'd like to allow the guest to own a few bits of cr0 at times, we need
to know when we access those bits.

Signed-off-by: Avi Kivity <avi@redhat.com>

4d4ec087

KVM: VMX: trace clts and lmsw instructions as cr accesses · a1f83a74

Committed by Avi Kivity on Dec 29, 2009

clts writes cr0.ts; lmsw writes cr0[0:15] - record that in ftrace.

Signed-off-by: Avi Kivity <avi@redhat.com>

a1f83a74

KVM: PPC: Make large pages work · 4b5c9b7f

Committed by Alexander Graf on Jan 10, 2010

An SLB entry contains two pieces of information related to size:

  1) PTE size
  2) SLB size

The L bit defines the PTE to be "large" (usually meaning 16MB);
SLB_VSID_B_1T defines that the SLB entry should span 1TB instead of the
default 256MB.

Apparently I messed things up and just put those two in one box,
shook it heavily and came up with the current code, which handles
large pages incorrectly because it also treats large page SLB entries
as "1TB" segment entries.

This patch splits those two features apart, making Linux guests boot
even when they have > 256MB.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

4b5c9b7f
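
The distinction between the two size attributes can be sketched as a decoder that reads them independently. This is an illustrative user-space model, not the patch itself: the bit positions are assumptions modelled on Linux's `SLB_VSID_*` definitions, and `SLB_VSID_B_MASK`, `struct slb_sizes` and `decode_slb_vsid` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, modelled on Linux's SLB_VSID_* definitions. */
#define SLB_VSID_B_MASK 0xc000000000000000ULL /* segment size field */
#define SLB_VSID_B_1T   0x4000000000000000ULL /* 1TB segment */
#define SLB_VSID_L      0x0000000000000100ULL /* large PTE */

/* The two size attributes are independent of each other. */
struct slb_sizes {
    bool large_page;  /* PTE size: large (e.g. 16MB) vs base page */
    bool tb_segment;  /* segment size: 1TB vs the default 256MB */
};

static struct slb_sizes decode_slb_vsid(uint64_t vsid_word)
{
    struct slb_sizes s;

    s.large_page = (vsid_word & SLB_VSID_L) != 0;
    s.tb_segment = (vsid_word & SLB_VSID_B_MASK) == SLB_VSID_B_1T;
    return s;
}
```

An entry with only the L bit set decodes as a large page inside a 256MB segment, which is the case the earlier code treated as a 1TB segment.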

KVM: PPC: Pass through program interrupts · 5f2b105a

Committed by Alexander Graf on Jan 10, 2010

When we get a program interrupt in guest kernel mode, we try to emulate the
instruction.

If that fails, we report to the user and try again - at the exact same
instruction pointer. So if the guest kernel really does trigger an invalid
instruction, we loop forever.

So let's forward program exceptions to the guest when we don't
know the instruction we're supposed to emulate.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

5f2b105a

KVM: PPC: Pass program interrupt flags to the guest · ff1ca3f9

Committed by Alexander Graf on Jan 08, 2010

When we need to reinject a program interrupt into the guest, we also need to
reinject the corresponding flags into the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

ff1ca3f9

KVM: PPC: Fix HID5 setting code · d35feb26

Committed by Alexander Graf on Jan 08, 2010

The code to unset HID5.dcbz32 is broken.
This patch makes it do the right rotate magic.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

d35feb26

KVM: PPC: Emulate trap SRR1 flags properly · 25a8a02d

Committed by Alexander Graf on Jan 08, 2010

Book3S needs some flags in SRR1 to get to know details about an interrupt.

One such example is the trap instruction. It tells the guest kernel that
a program interrupt is due to a trap using a bit in SRR1.

This patch implements the above behavior, making WARN_ON behave like WARN_ON.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

25a8a02d

KVM: PPC: Call SLB patching code in interrupt safe manner · 021ec9c6

Committed by Alexander Graf on Jan 08, 2010

Currently we're racy when doing the transition from IR=1 to IR=0, from
the module memory entry code to the real mode SLB switching code.

To work around that I took a look at the RTAS entry code which is faced
with a similar problem and did the same thing:

  A small helper in linear mapped memory that does mtmsr with IR=0 and
  then RFIs into the actual handler.

Thanks to that trick we can safely take page faults in the entry code
and only need to be really wary of what to do as of the SLB switching
part.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

021ec9c6

KVM: PPC: Get rid of unnecessary RFI · bc90923e

Committed by Alexander Graf on Jan 08, 2010

Using an RFI in IR=1 is dangerous. We need to set two SRRs and then do an RFI
without getting interrupted at all, because every interrupt could potentially
overwrite the SRR values.

Fortunately, we don't need to RFI in at least this particular case of the code,
so we can just replace it with an mtmsr and b.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

bc90923e

KVM: PPC: Implement 'skip instruction' mode · b4433a7c

Committed by Alexander Graf on Jan 08, 2010

To fetch the last instruction we were interrupted on, we enable DR in early
exit code, where we are still in a very transitional phase between guest
and host state.

Most of the time this seemed to work, but another CPU can easily flush our
TLB and HTAB, which makes us go into the Linux page fault handler, which totally
breaks because we still use the guest's SLB entries.

To work around that, let's introduce a second KVM guest mode that defines
that whenever we get a trap, we don't call the Linux handler or go into
the KVM exit code, but just jump over the faulting instruction.

That way a potentially bad lwz doesn't trigger any faults and we can later
on interpret the invalid instruction we fetched as "fetch didn't work".

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

b4433a7c

KVM: PPC: Use PACA backed shadow vcpu · 7e57cba0

Committed by Alexander Graf on Jan 08, 2010

We're being horribly racy right now. All the entry and exit code hijacks
random fields from the PACA that could easily be used by different code in
case we get interrupted, for example by a #MC or even page fault.

After discussing this with Ben, we figured it's best to reserve some more
space in the PACA and just shove off some vcpu state to there.

That way we can drastically improve the readability of the code, make it
less racy and less complex.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

7e57cba0

KVM: PPC: Add helpers for CR, XER · 992b5b29

Committed by Alexander Graf on Jan 08, 2010

We now have helpers for the GPRs, so let's also add some for CR and XER.

Having them in the PACA simplifies code a lot, as we don't need to care
about where to store CC or not to overflow any integers.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

992b5b29

KVM: PPC: Use accessor functions for GPR access · 8e5b26b5

Committed by Alexander Graf on Jan 08, 2010

All code in PPC KVM currently accesses gprs in the vcpu struct directly.

While there's nothing wrong with that wrt the current way gprs are stored
and loaded, it doesn't suffice for the PACA acceleration that will follow
in this patchset.

So let's just create little wrapper inline functions that we call whenever
a GPR needs to be read from or written to. The compiled code shouldn't really
change at all for now.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

8e5b26b5
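
The wrapper pattern described in this commit can be sketched as follows. This is a minimal illustration rather than the code from the patch: `struct vcpu_model`, `vcpu_get_gpr` and `vcpu_set_gpr` are hypothetical names, and the storage is a plain array, as the message says it was before the PACA work.

```c
#include <stdint.h>

/* Hypothetical stand-in for the vcpu structure; only the GPRs are modelled. */
struct vcpu_model {
    uint64_t gpr[32];
};

/* Read one guest GPR. Callers never touch vcpu->gpr[] directly, so the
 * backing storage can later move elsewhere without changing them. */
static inline uint64_t vcpu_get_gpr(const struct vcpu_model *vcpu, int num)
{
    return vcpu->gpr[num];
}

/* Write one guest GPR through the same indirection. */
static inline void vcpu_set_gpr(struct vcpu_model *vcpu, int num, uint64_t val)
{
    vcpu->gpr[num] = val;
}
```

Because the helpers are trivial inline functions, a compiler would be expected to emit the same loads and stores as direct field access, which matches the remark that the compiled code should not change for now.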

KVM: Fix the explanation of write_emulated · 0d178975

Committed by Takuya Yoshikawa on Jan 06, 2010

The explanation of write_emulated is confused with
that of read_emulated. This patch fixes it.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0d178975

KVM: VMX: Enable EPT 1GB page support · 878403b7

Committed by Sheng Yang on Jan 05, 2010

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

878403b7

KVM: x86: Rename gb_page_enable() to get_lpage_level() in kvm_x86_ops · 17cc3935

Committed by Sheng Yang on Jan 05, 2010

Then the callback can provide the maximum supported large page level, which
is more flexible.

Also make the gb page support x86_64 specific.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

17cc3935

KVM: x86: Moving PT_*_LEVEL to mmu.h · c9c54174

Committed by Sheng Yang on Jan 05, 2010

We can use them in x86.c and vmx.c now...

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c9c54174

KVM: PPC: Enable lightweight exits again · 97c4cfbe

Committed by Alexander Graf on Jan 04, 2010

The PowerPC C ABI defines that registers r14-r31 need to be preserved across
function calls. Since our exit handler is written in C, we can make use of that
and don't need to reload r14-r31 on every entry/exit cycle.

This technique is also used in the BookE code and is called "lightweight exits"
there. To follow the tradition, it's called the same in Book3S.

So far this optimization was disabled though, as the code did not do what it
was expected to do and failed to work.

This patch fixes and enables lightweight exits again.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

97c4cfbe

KVM: PPC: Fix typo in rebolting code · b480f780

Committed by Alexander Graf on Jan 04, 2010

When we're loading bolted entries into the SLB again, we're checking if an
entry is in use and only slbmte it when it is.

Unfortunately, the check always goes to the skip label of the first entry,
resulting in an endless loop when it actually gets triggered.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

b480f780

KVM: avoid taking ioapic mutex for non-ioapic EOIs · 46a929bc

Committed by Avi Kivity on Dec 28, 2009

When the guest acknowledges an interrupt, it sends an EOI message to the local
apic, which broadcasts it to the ioapic.  To handle the EOI, we need to take
the ioapic mutex.

On large guests, this causes a lot of contention on this mutex.  Since large
guests usually don't route interrupts via the ioapic (they use msi instead),
this is completely unnecessary.

Avoid taking the mutex by introducing a handled_vectors bitmap.  Before taking
the mutex, check if the ioapic was actually responsible for the acked vector.
If not, we can return early.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

46a929bc
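
The early-return check described in this commit can be sketched in user-space C. This is a model under stated assumptions, not the kernel implementation: only the `handled_vectors` name comes from the commit message, while `struct ioapic_model`, the helper functions, and the `lock_acquisitions` counter (which stands in for taking the ioapic mutex) are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_VECTORS    256
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical ioapic model; the counter stands in for the mutex. */
struct ioapic_model {
    unsigned long handled_vectors[NR_VECTORS / BITS_PER_WORD];
    unsigned int lock_acquisitions; /* times the "mutex" was taken */
    unsigned int eois_handled;      /* EOIs processed under the lock */
};

/* Record that the ioapic routes this vector (vec must be < NR_VECTORS). */
static void mark_vector_handled(struct ioapic_model *io, unsigned int vec)
{
    io->handled_vectors[vec / BITS_PER_WORD] |= 1UL << (vec % BITS_PER_WORD);
}

static bool vector_is_handled(const struct ioapic_model *io, unsigned int vec)
{
    return (io->handled_vectors[vec / BITS_PER_WORD] >>
            (vec % BITS_PER_WORD)) & 1UL;
}

/* EOI broadcast from the local apic. */
static void ioapic_eoi(struct ioapic_model *io, unsigned int vec)
{
    if (!vector_is_handled(io, vec))
        return;                 /* fast path: vector is not ours, no lock */

    io->lock_acquisitions++;    /* real code would lock the mutex here */
    io->eois_handled++;         /* ... process the EOI, then unlock */
}
```

In this model an EOI for an MSI-delivered vector returns before the lock is touched, so only vectors the ioapic actually routes contribute to contention.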

KVM: Fill out ftrace exit reason strings · f4c9e87c

Committed by Avi Kivity on Dec 28, 2009

Some exit reasons missed their strings; fill out the table.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f4c9e87c

KVM: Bump maximum vcpu count to 64 · 0680fe52

Committed by Avi Kivity on Dec 27, 2009

With slots_lock converted to rcu, the entire kvm hotpath on modern processors
(with npt or ept) now scales beautifully.  Increase the maximum vcpu count to
64 to reflect this.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0680fe52

KVM: convert slots_lock to a mutex · 79fac95e

Committed by Marcelo Tosatti on Dec 23, 2009

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

79fac95e

KVM: switch vcpu context to use SRCU · f656ce01

Committed by Marcelo Tosatti on Dec 23, 2009

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f656ce01

KVM: convert io_bus to SRCU · e93f8a0f

Committed by Marcelo Tosatti on Dec 23, 2009

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e93f8a0f

KVM: x86: switch kvm_set_memory_alias to SRCU update · a983fb23

Committed by Marcelo Tosatti on Dec 23, 2009

Using a similar two-step procedure as for memslots.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

a983fb23

KVM: use SRCU for dirty log · b050b015

Committed by Marcelo Tosatti on Dec 23, 2009

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

b050b015

KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update · bc6678a3

Committed by Marcelo Tosatti on Dec 23, 2009

Use two steps for memslot deletion: mark the slot invalid (which stops
instantiation of new shadow pages for that slot, but allows destruction),
then instantiate the new empty slot.

Also simplifies kvm_handle_hva locking.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

bc6678a3

KVM: use gfn_to_pfn_memslot in kvm_iommu_map_pages · 3ad26d81

Committed by Marcelo Tosatti on Dec 23, 2009

So it's possible to iommu map a memslot before making it visible to
kvm.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3ad26d81

KVM: introduce gfn_to_pfn_memslot · 506f0d6f

Committed by Marcelo Tosatti on Dec 23, 2009

Which takes a memslot pointer instead of using kvm->memslots.

To be used by SRCU conversion later.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

506f0d6f

KVM: split kvm_arch_set_memory_region into prepare and commit · f7784b8e

Committed by Marcelo Tosatti on Dec 23, 2009

Required for SRCU conversion later.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f7784b8e

KVM: modify alias layout in x86s struct kvm_arch · fef9cce0

Committed by Marcelo Tosatti on Dec 23, 2009

Have a pointer to an allocated region inside x86's kvm_arch.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

fef9cce0

KVM: modify memslots layout in struct kvm · 46a26bf5

Committed by Marcelo Tosatti on Dec 23, 2009

Have a pointer to an allocated region inside struct kvm.

[alex: fix ppc book 3s]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

46a26bf5

KVM: trivial document fixes · 2044892d

Committed by Wu Fengguang on Dec 24, 2009

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2044892d

KVM: powerpc: Change maintainer · ddf0289d

Committed by Alexander Graf on Dec 20, 2009

Progress on KVM for Embedded PowerPC has stalled, but for Book3S there's quite
a lot of work to do and going on.

So in agreement with Hollis and Avi, we should switch maintainers for PowerPC.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

ddf0289d

KVM: powerpc: Remove AGGRESSIVE_DEC · 0bb1fb71

Committed by Alexander Graf on Dec 21, 2009

Because we now emulate the DEC interrupt according to real life behavior,
there's no need to keep the AGGRESSIVE_DEC hack around.

Let's just remove it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

0bb1fb71

KVM: powerpc: Improve DEC handling · 7706664d

Committed by Alexander Graf on Dec 21, 2009

We treated the DEC interrupt like an edge based one. This is not true for
Book3s. The DEC keeps firing until mtdec is issued again and thus clears
the interrupt line.

So let's implement this logic in KVM too. This patch moves the line clearing
from the firing of the interrupt to the mtdec emulation.

This makes PPC64 guests work without AGGRESSIVE_DEC defined.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

7706664d
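
The level-style behavior described in this commit can be sketched as a tiny state model. It is an illustration only: `struct dec_model` and the three functions are hypothetical names, and real KVM tracks pending interrupts differently.

```c
#include <stdbool.h>

/* Hypothetical model of the guest-visible decrementer interrupt line. */
struct dec_model {
    bool line_raised;
};

/* The decrementer expired: raise the line. */
static void dec_expire(struct dec_model *d)
{
    d->line_raised = true;
}

/* Should a DEC interrupt be delivered now? With level semantics,
 * delivering the interrupt does not lower the line. */
static bool dec_interrupt_pending(const struct dec_model *d)
{
    return d->line_raised;
}

/* Emulated mtdec: the guest reprograms the decrementer, and only
 * this lowers the line. */
static void emulate_mtdec(struct dec_model *d)
{
    d->line_raised = false;
}
```

Under the earlier edge-style handling, the equivalent of `dec_interrupt_pending` would have cleared the line itself, so a guest that had interrupts disabled at that moment could miss the decrementer entirely.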

openanolis / cloud-kernel · synced successfully more than 1 year ago