提交 · 342d3db763f2621ed4546ebf8f6c61cb29d7fbdb · openeuler / Kernel

05 3月, 2012 1 次提交

KVM: PPC: Implement MMU notifiers for Book3S HV guests · 342d3db7

由 Paul Mackerras 提交于 12月 12, 2011

This adds the infrastructure to enable us to page out pages underneath
a Book3S HV guest, on processors that support virtualized partition
memory, that is, POWER7.  Instead of pinning all the guest's pages,
we now look in the host userspace Linux page tables to find the
mapping for a given guest page.  Then, if the userspace Linux PTE
gets invalidated, kvm_unmap_hva() gets called for that address, and
we replace all the guest HPTEs that refer to that page with absent
HPTEs, i.e. ones with the valid bit clear and the HPTE_V_ABSENT bit
set, which will cause an HDSI when the guest tries to access them.
Finally, the page fault handler is extended to reinstantiate the
guest HPTE when the guest tries to access a page which has been paged
out.

Since we can't intercept the guest DSI and ISI interrupts on PPC970,
we still have to pin all the guest pages on PPC970.  We have a new flag,
kvm->arch.using_mmu_notifiers, that indicates whether we can page
guest pages out.  If it is not set, the MMU notifier callbacks do
nothing and everything operates as before.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

342d3db7

23 7月, 2011 1 次提交

virtio: expose for non-virtualization users too · e7254219

由 Ohad Ben-Cohen 提交于 7月 05, 2011

virtio has been so far used only in the context of virtualization,
and the virtio Kconfig was sourced directly by the relevant arch
Kconfigs when VIRTUALIZATION was selected.

Now that we start using virtio for inter-processor communications,
we need to source the virtio Kconfig outside of the virtualization
scope too.

Moreover, some architectures might use virtio for both virtualization
and inter-processor communications, so directly sourcing virtio
might yield unexpected results due to conflicting selections.

The simple solution offered by this patch is to always source virtio's
Kconfig in drivers/Kconfig, and remove it from the appropriate arch
Kconfigs. Additionally, a virtio menu entry has been added so virtio
drivers don't show up in the general drivers menu.

This way anyone can use virtio, though it's arguably less accessible
(and neat!) for virtualization users now.

Note: some architectures (mips and sh) seem to have a VIRTUALIZATION
menu merely for sourcing virtio's Kconfig, so that menu is removed too.
Signed-off-by: NOhad Ben-Cohen <ohad@wizery.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

e7254219

12 7月, 2011 2 次提交

KVM: PPC: book3s_hv: Add support for PPC970-family processors · 9e368f29

由 Paul Mackerras 提交于 6月 29, 2011

This adds support for running KVM guests in supervisor mode on those
PPC970 processors that have a usable hypervisor mode.  Unfortunately,
Apple G5 machines have supervisor mode disabled (MSR[HV] is forced to
1), but the YDL PowerStation does have a usable hypervisor mode.

There are several differences between the PPC970 and POWER7 in how
guests are managed.  These differences are accommodated using the
CPU_FTR_ARCH_201 (PPC970) and CPU_FTR_ARCH_206 (POWER7) CPU feature
bits.  Notably, on PPC970:

* The LPCR, LPID or RMOR registers don't exist, and the functions of
  those registers are provided by bits in HID4 and one bit in HID0.

* External interrupts can be directed to the hypervisor, but unlike
  POWER7 they are masked by MSR[EE] in non-hypervisor modes and use
  SRR0/1 not HSRR0/1.

* There is no virtual RMA (VRMA) mode; the guest must use an RMO
  (real mode offset) area.

* The TLB entries are not tagged with the LPID, so it is necessary to
  flush the whole TLB on partition switch.  Furthermore, when switching
  partitions we have to ensure that no other CPU is executing the tlbie
  or tlbsync instructions in either the old or the new partition,
  otherwise undefined behaviour can occur.

* The PMU has 8 counters (PMC registers) rather than 6.

* The DSCR, PURR, SPURR, AMR, AMOR, UAMOR registers don't exist.

* The SLB has 64 entries rather than 32.

* There is no mediated external interrupt facility, so if we switch to
  a guest that has a virtual external interrupt pending but the guest
  has MSR[EE] = 0, we have to arrange to have an interrupt pending for
  it so that we can get control back once it re-enables interrupts.  We
  do that by sending ourselves an IPI with smp_send_reschedule after
  hard-disabling interrupts.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

9e368f29

KVM: PPC: Add support for Book3S processors in hypervisor mode · de56a948

由 Paul Mackerras 提交于 6月 29, 2011

This adds support for KVM running on 64-bit Book 3S processors,
specifically POWER7, in hypervisor mode. Using hypervisor mode means
that the guest can use the processor's supervisor mode. That means
that the guest can execute privileged instructions and access privileged
registers itself without trapping to the host. This gives excellent
performance, but does mean that KVM cannot emulate a processor
architecture other than the one that the hardware implements.

This code assumes that the guest is running paravirtualized using the
PAPR (Power Architecture Platform Requirements) interface, which is the
interface that IBM's PowerVM hypervisor uses. That means that existing
Linux distributions that run on IBM pSeries machines will also run
under KVM without modification. In order to communicate the PAPR
hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
to include/linux/kvm.h.

Currently the choice between book3s_hv support and book3s_pr support
(i.e. the existing code, which runs the guest in user mode) has to be
made at kernel configuration time, so a given kernel binary can only
do one or the other.

This new book3s_hv code doesn't support MMIO emulation at present.
Since we are running paravirtualized guests, this isn't a serious
restriction.

With the guest running in supervisor mode, most exceptions go straight
to the guest. We will never get data or instruction storage or segment
interrupts, alignment interrupts, decrementer interrupts, program
interrupts, single-step interrupts, etc., coming to the hypervisor from
the guest. Therefore this introduces a new KVMTEST_NONHV macro for the
exception entry path so that we don't have to do the KVM test on entry
to those exception handlers.

We do however get hypervisor decrementer, hypervisor data storage,
hypervisor instruction storage, and hypervisor emulation assist
interrupts, so we have to handle those.

In hypervisor mode, real-mode accesses can access all of RAM, not just
a limited amount. Therefore we put all the guest state in the vcpu.arch
and use the shadow_vcpu in the PACA only for temporary scratch space.
We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
We don't have a shared page with the guest, but we still need a
kvm_vcpu_arch_shared struct to store the values of various registers,
so we include one in the vcpu_arch struct.

The POWER7 processor has a restriction that all threads in a core have
to be in the same partition. MMU-on kernel code counts as a partition
(partition 0), so we have to do a partition switch on every entry to and
exit from the guest. At present we require the host and guest to run
in single-thread mode because of this hardware restriction.

This code allocates a hashed page table for the guest and initializes
it with HPTEs for the guest's Virtual Real Memory Area (VRMA). We
require that the guest memory is allocated using 16MB huge pages, in
order to simplify the low-level memory management. This also means that
we can get away without tracking paging activity in the host for now,
since huge pages can't be paged or swapped.

This also adds a few new exports needed by the book3s_hv code.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

de56a948

17 5月, 2010 3 次提交

KVM: PPC: Enable Book3S_32 KVM building · 4f841390

由 Alexander Graf 提交于 4月 16, 2010

Now that we have all the bits and pieces in place, let's enable building
of the Book3S_32 target.
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

4f841390

KVM: PPC: Use CONFIG_PPC_BOOK3S define · 00c3a37c

由 Alexander Graf 提交于 4月 16, 2010

Upstream recently added a new name for PPC64: Book3S_64.

So instead of using CONFIG_PPC64 we should use CONFIG_PPC_BOOK3S consotently.
That makes understanding the code easier (I hope).
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

00c3a37c

KVM: PPC: Use KVM_BOOK3S_HANDLER · c14dea04

由 Alexander Graf 提交于 4月 16, 2010

So far we had a lot of conditional code on CONFIG_KVM_BOOK3S_64_HANDLER.
As we're moving towards common code between 32 and 64 bits, most of
these ifdefs can be moved to a more generic term define, called
CONFIG_KVM_BOOK3S_HANDLER.

This patch adds the new generic config option and moves ifdefs over.
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

c14dea04

01 3月, 2010 1 次提交

KVM: Add KVM_MMIO kconfig item · 50eb2a3c

由 Avi Kivity 提交于 12月 20, 2009

s390 doesn't have mmio, this will simplify ifdefing it out.
Signed-off-by: NAvi Kivity <avi@redhat.com>

50eb2a3c

25 1月, 2010 1 次提交

KVM: powerpc: Show timing option only on embedded · e1f829b6

由 Alexander Graf 提交于 12月 20, 2009

Embedded PowerPC KVM has an exit timing implementation to track and evaluate
how much time was spent in which exit path.

For Book3S, we don't implement it. So let's not expose it as a config option
either.
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

e1f829b6

15 1月, 2010 1 次提交

vhost_net: a kernel-level virtio server · 3a4d5c94

由 Michael S. Tsirkin 提交于 1月 14, 2010

What it is: vhost net is a character device that can be used to reduce
the number of system calls involved in virtio networking.
Existing virtio net code is used in the guest without modification.

There's similarity with vringfd, with some differences and reduced scope
- uses eventfd for signalling
- structures can be moved around in memory at any time (good for
  migration, bug work-arounds in userspace)
- write logging is supported (good for migration)
- support memory table and not just an offset (needed for kvm)

common virtio related code has been put in a separate file vhost.c and
can be made into a separate module if/when more backends appear.  I used
Rusty's lguest.c as the source for developing this part : this supplied
me with witty comments I wouldn't be able to write myself.

What it is not: vhost net is not a bus, and not a generic new system
call. No assumptions are made on how guest performs hypercalls.
Userspace hypervisors are supported as well as kvm.

How it works: Basically, we connect virtio frontend (configured by
userspace) to a backend. The backend could be a network device, or a tap
device.  Backend is also configured by userspace, including vlan/mac
etc.

Status: This works for me, and I haven't see any crashes.
Compared to userspace, people reported improved latency (as I save up to
4 system calls per packet), as well as better bandwidth and CPU
utilization.

Features that I plan to look at in the future:
- mergeable buffers
- zero copy
- scalability tuning: figure out the best threading model to use

Note on RCU usage (this is also documented in vhost.h, near
private_pointer which is the value protected by this variant of RCU):
what is happening is that the rcu_dereference() is being used in a
workqueue item.  The role of rcu_read_lock() is taken on by the start of
execution of the workqueue item, of rcu_read_unlock() by the end of
execution of the workqueue item, and of synchronize_rcu() by
flush_workqueue()/flush_work(). In the future we might need to apply
some gcc attribute or sparse annotation to the function passed to
INIT_WORK(). Paul's ack below is for this RCU usage.

(Includes fixes by Alan Cox <alan@linux.intel.com>,
David L Stevens <dlstevens@us.ibm.com>,
Chris Wright <chrisw@redhat.com>)
Acked-by: NRusty Russell <rusty@rustcorp.com.au>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3a4d5c94

05 11月, 2009 1 次提交

Include Book3s_64 target in buildsystem · c4f9c779

由 Alexander Graf 提交于 10月 30, 2009

Now we have everything in place to be able to build KVM, so let's add it
as config option and in the Makefile.
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

c4f9c779

10 9月, 2009 2 次提交

KVM: remove old KVMTRACE support code · 2023a29c

由 Marcelo Tosatti 提交于 6月 18, 2009

Return EOPNOTSUPP for KVM_TRACE_ENABLE/PAUSE/DISABLE ioctls.
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

2023a29c

A
KVM: Move common KVM Kconfig items to new file virt/kvm/Kconfig · 0ba12d10
由 Avi Kivity 提交于 5月 21, 2009
```
Reduce Kconfig code duplication.
Signed-off-by: NAvi Kivity <avi@redhat.com>
```
0ba12d10

24 3月, 2009 2 次提交

KVM: Add CONFIG_HAVE_KVM_IRQCHIP · 5d9b8e30

由 Avi Kivity 提交于 1月 04, 2009

Two KVM archs support irqchips and two don't.  Add a Kconfig item to
make selecting between the two models easier.
Signed-off-by: NAvi Kivity <avi@redhat.com>

5d9b8e30

KVM: ppc: E500 core-specific code · bc8080cb

由 Hollis Blanchard 提交于 1月 03, 2009

Signed-off-by: NLiu Yu <yu.liu@freescale.com>
Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

bc8080cb

31 12月, 2008 4 次提交

KVM: ppc: Implement in-kernel exit timing statistics · 73e75b41

由 Hollis Blanchard 提交于 12月 02, 2008

Existing KVM statistics are either just counters (kvm_stat) reported for
KVM generally or trace based aproaches like kvm_trace.
For KVM on powerpc we had the need to track the timings of the different exit
types. While this could be achieved parsing data created with a kvm_trace
extension this adds too much overhead (at least on embedded PowerPC) slowing
down the workloads we wanted to measure.

Therefore this patch adds a in-kernel exit timing statistic to the powerpc kvm
code. These statistic is available per vm&vcpu under the kvm debugfs directory.
As this statistic is low, but still some overhead it can be enabled via a
.config entry and should be off by default.

Since this patch touched all powerpc kvm_stat code anyway this code is now
merged and simplified together with the exit timing statistic code (still
working with exit timing disabled in .config).
Signed-off-by: NChristian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

73e75b41

KVM: ppc: fix Kconfig constraints · 74ef740d

由 Hollis Blanchard 提交于 11月 07, 2008

Make sure that CONFIG_KVM cannot be selected without processor support
(currently, 440 is the only processor implementation available).
Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

74ef740d

KVM: ppc: Refactor powerpc.c to relocate 440-specific code · 9dd921cf

由 Hollis Blanchard 提交于 11月 05, 2008

This introduces a set of core-provided hooks. For 440, some of these are
implemented by booke.c, with the rest in (the new) 44x.c.

Note that these hooks are link-time, not run-time. Since it is not possible to
build a single kernel for both e500 and 440 (for example), using function
pointers would only add overhead.
Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

9dd921cf

KVM: ppc: combine booke_guest.c and booke_host.c · d9fbd03d

由 Hollis Blanchard 提交于 11月 05, 2008

The division was somewhat artificial and cumbersome, and had no functional
benefit anyways: we can only guests built for the real host processor.
Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

d9fbd03d

15 10月, 2008 1 次提交

KVM: ppc: enable KVM_TRACE building for powerpc · 12f67556

由 Jerone Young 提交于 7月 14, 2008

This patch enables KVM_TRACE to build for PowerPC arch. This means just
adding sections to Kconfig and Makefile.
Signed-off-by: NJerone Young <jyoung5@us.ibm.com>
Signed-off-by: NChristian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: NAvi Kivity <avi@qumranet.com>

12f67556

27 4月, 2008 1 次提交

KVM: ppc: PowerPC 440 KVM implementation · bbf45ba5

由 Hollis Blanchard 提交于 4月 16, 2008

This functionality is definitely experimental, but is capable of running
unmodified PowerPC 440 Linux kernels as guests on a PowerPC 440 host. (Only
tested with 440EP "Bamboo" guests so far, but with appropriate userspace
support other SoC/board combinations should work.)

See Documentation/powerpc/kvm_440.txt for technical details.

[stephen: build fix]
Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
Acked-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAvi Kivity <avi@qumranet.com>

bbf45ba5

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功