Commits · 52e1718c6fd1a1f54c676c2107dc931e93865fe8 · openanolis / cloud-kernel

08 Apr, 2012 7 commits

KVM: PPC: e500: clean up arch/powerpc/kvm/e500.h · 52e1718c

Authored by Scott Wood on Dec 20, 2011

Move vcpu to the beginning of vcpu_e500 to give it appropriate
prominence, especially if more fields end up getting added to the
end of vcpu_e500 (and vcpu ends up in the middle).

Remove gratuitous "extern" and add parameter names to prototypes.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: fix bisectability]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

52e1718c

KVM: PPC: e500: merge <asm/kvm_e500.h> into arch/powerpc/kvm/e500.h · fc6cf995

Authored by Scott Wood on Dec 20, 2011

Keeping two separate headers for e500-specific things was a
pain, and wasn't even organized along any logical boundary.

There was TLB stuff in <asm/kvm_e500.h> despite the existence of
arch/powerpc/kvm/e500_tlb.h, and nothing in <asm/kvm_e500.h> needed
to be referenced from outside arch/powerpc/kvm.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: fix bisectability]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

fc6cf995

KVM: PPC: e500: rename e500_tlb.h to e500.h · 29a5a6f9

Authored by Scott Wood on Dec 20, 2011

This is in preparation for merging in the contents of
arch/powerpc/include/asm/kvm_e500.h.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

29a5a6f9

KVM: PPC: booke: Move vm core init/destroy out of booke.c · fafd6832

Authored by Scott Wood on Dec 20, 2011

e500mc will want to do lpid allocation/deallocation here.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

fafd6832

KVM: PPC: booke: add booke-level vcpu load/put · 94fa9d99

Authored by Scott Wood on Dec 20, 2011

This gives us a place to put load/put actions that correspond to
code that is booke-specific but not specific to a particular core.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

94fa9d99

KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv · 043cc4d7

Authored by Scott Wood on Dec 20, 2011

We'll use it on e500mc as well.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

043cc4d7

KVM: Factor out kvm_vcpu_kick to arch-generic code · b6d33834

Authored by Christoffer Dall on Mar 08, 2012

The kvm_vcpu_kick function performs roughly the same functionality on
almost all architectures, so we shouldn't have separate copies.

PowerPC keeps a pointer to interchanging waitqueues on the vcpu_arch
structure and to accommodate this special need a
__KVM_HAVE_ARCH_VCPU_GET_WQ define and accompanying function
kvm_arch_vcpu_wq have been defined. For all other architectures this
is a generic inline that just returns &vcpu->wq.
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

b6d33834
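
The override pattern described in this commit message (a generic accessor that an architecture can replace by defining a macro) can be sketched roughly as follows. This is an illustrative user-space model, not the kernel code: every `demo_`/`DEMO_` name and the simplified wait queue type are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the kernel's definitions. */
struct demo_waitqueue { int sleepers; };
struct demo_vcpu_arch { struct demo_waitqueue *wqp; };
struct demo_vcpu {
    struct demo_waitqueue wq;
    struct demo_vcpu_arch arch;
};

/* An architecture with special needs defines this and supplies its own accessor. */
#define DEMO_HAVE_ARCH_VCPU_GET_WQ

#ifdef DEMO_HAVE_ARCH_VCPU_GET_WQ
static struct demo_waitqueue *demo_arch_vcpu_wq(struct demo_vcpu *vcpu)
{
    return vcpu->arch.wqp;   /* pointer that may be switched between queues */
}
#else
static struct demo_waitqueue *demo_arch_vcpu_wq(struct demo_vcpu *vcpu)
{
    return &vcpu->wq;        /* generic default */
}
#endif

/* Shared kick path: wake one sleeper on whichever queue the accessor returns. */
static int demo_vcpu_kick(struct demo_vcpu *vcpu)
{
    struct demo_waitqueue *wq = demo_arch_vcpu_wq(vcpu);

    assert(wq != NULL);
    if (wq->sleepers == 0)
        return 0;            /* nobody to wake */
    wq->sleepers--;
    return 1;
}
```

With the macro defined, the kick acts on the queue the architecture points at; removing the define makes the same kick code fall back to the embedded queue.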

03 Apr, 2012 5 commits

KVM: PPC: Book3S: PR: Fix preemption · 592f5d87

Authored by Alexander Graf on Mar 13, 2012

We were leaking preemption counters. Fix the code to always toggle
between preempt and non-preempt properly.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

592f5d87

KVM: PPC: Save/Restore CR over vcpu_run · e1f8acf8

Authored by Alexander Graf on Mar 05, 2012

On PPC, CR2-CR4 are nonvolatile, thus have to be saved across function calls.
We didn't respect that for any architecture until Paul spotted it in his
patch for Book3S-HV. This patch saves/restores CR for all KVM capable PPC hosts.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

e1f8acf8

KVM: PPC: Book3S HV: Save and restore CR in __kvmppc_vcore_entry · a5ddea0e

Authored by Paul Mackerras on Feb 03, 2012

The ABI specifies that CR fields CR2--CR4 are nonvolatile across function
calls.  Currently __kvmppc_vcore_entry doesn't save and restore the CR,
leading to CR2--CR4 getting corrupted with guest values, possibly leading
to incorrect behaviour in its caller.  This adds instructions to save
and restore CR at the points where we save and restore the nonvolatile
GPRs.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

a5ddea0e

KVM: PPC: Book3S HV: Fix kvm_alloc_linear in case where no linears exist · b4e51229

Authored by Paul Mackerras on Feb 03, 2012

In kvm_alloc_linear we were using and dereferencing ri after the
list_for_each_entry had come to the end of the list.  In that
situation, ri is not really defined and probably points to the
list head.  This will happen every time if the free_linears list
is empty, for instance.  This led to a NULL pointer dereference
crash in memset on POWER7 while trying to allocate an HPT in the
case where no HPTs were preallocated.

This fixes it by using a separate variable for the return value
from the loop iterator.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

b4e51229
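
The iterator pitfall described in this commit message can be illustrated outside the kernel. The following is a minimal sketch, not the actual kvm_alloc_linear code: `struct demo_linear`, `demo_alloc_linear` and the hand-rolled circular list with a sentinel head are assumptions standing in for the kernel's list helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one preallocated linear region. */
struct demo_linear {
    struct demo_linear *next;  /* circular list; the sentinel head closes the ring */
    int in_use;
    int id;
};

/*
 * Return the first free entry, or NULL if nothing is free (including the
 * empty-list case). The cursor `pos` is never used after the loop: once the
 * walk reaches the sentinel, `pos` no longer refers to a real entry. The
 * result lives in a separate variable that is set only on a match.
 */
static struct demo_linear *demo_alloc_linear(struct demo_linear *head)
{
    struct demo_linear *ret = NULL;
    struct demo_linear *pos;

    assert(head != NULL);
    for (pos = head->next; pos != head; pos = pos->next) {
        if (!pos->in_use) {
            pos->in_use = 1;
            ret = pos;
            break;
        }
    }
    return ret;
}
```

On an empty list the loop body never runs and the function returns NULL, so the caller can fail cleanly instead of handing a bogus pointer to something like memset.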

KVM: PPC: Book3S: Compile fix for ppc32 in HIOR access code · b8e6f8ae

Authored by Alexander Graf on Mar 13, 2012

We were failing to compile on book3s_32 with the following errors:

arch/powerpc/kvm/book3s_pr.c:883:45: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
arch/powerpc/kvm/book3s_pr.c:898:79: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]

Fix this by explicitly casting the u64 to long before we use it as a pointer.

Also, on PPC32 we can not use get_user/put_user for 64bit wide variables,
as there is no single instruction that could load or store variables that big.

So instead, we have to use copy_from/to_user which works everywhere.
Reported-by: Jörg Sommer <joerg@alea.gnuu.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

b8e6f8ae
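
The two points in this commit message (narrow the 64-bit value explicitly before using it as a pointer, and move 64-bit data with a byte copy rather than a single wide access) can be sketched in portable user-space C. This is only an analogy: `uintptr_t` stands in for the kernel's `long` cast, `memcpy` stands in for copy_from_user, and the `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Turn a 64-bit address, as it might be carried in an ioctl structure, into
 * a pointer. Casting through uintptr_t narrows explicitly, so a 32-bit build
 * does not complain about an integer-to-pointer cast of different size.
 */
static void *demo_u64_to_ptr(uint64_t addr)
{
    /* The address must actually fit in a pointer on this platform. */
    assert((uint64_t)(uintptr_t)addr == addr);
    return (void *)(uintptr_t)addr;
}

/* Fetch a 64-bit value with a byte copy instead of one wide load. */
static uint64_t demo_load_u64(uint64_t addr)
{
    uint64_t val;

    memcpy(&val, demo_u64_to_ptr(addr), sizeof(val));
    return val;
}
```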

02 Apr, 2012 1 commit

powerpc/kvm: Fallout from system.h disintegration · 95327d08

Authored by Benjamin Herrenschmidt on Apr 01, 2012

Add a missing include to fix build
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

95327d08

29 Mar, 2012 1 commit

Disintegrate asm/system.h for PowerPC · ae3a197e

Authored by David Howells on Mar 28, 2012

Disintegrate asm/system.h for PowerPC.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
cc: linuxppc-dev@lists.ozlabs.org

ae3a197e

20 Mar, 2012 1 commit

powerpc: remove the second argument of k[un]map_atomic() · 2480b208

Authored by Cong Wang on Nov 25, 2011

Signed-off-by: Cong Wang <amwang@redhat.com>

2480b208

08 Mar, 2012 2 commits

arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice · 9cc815e4

Authored by Danny Kukawka on Feb 16, 2012

arch/powerpc/kvm/book3s_hv.c: included 'linux/sched.h' twice,
remove the duplicate.
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

9cc815e4

KVM: Introduce kvm_memory_slot::arch and move lpage_info into it · db3fe4eb

Authored by Takuya Yoshikawa on Feb 08, 2012

Some members of kvm_memory_slot are not used by every architecture.

This patch is the first step to make this difference clear by
introducing kvm_memory_slot::arch;  lpage_info is moved into it.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

db3fe4eb

05 Mar, 2012 23 commits

KVM: PPC: Add HPT preallocator · d2a1b483

Authored by Alexander Graf on Jan 16, 2012

We're currently allocating 16MB of linear memory on demand when creating
a guest. That does work sometimes, but finding 16MB of linear memory
available in the system at runtime is definitely not a given.

So let's add another command line option similar to the RMA preallocator,
that we can use to keep a pool of page tables around. Now, when a guest
gets created it has a pretty low chance of receiving an OOM.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

d2a1b483

KVM: PPC: Initialize linears with zeros · b7f5d011

Authored by Alexander Graf on Jan 17, 2012

RMAs and HPT preallocated spaces should be zeroed, so we don't accidentally
leak information from previous VM executions.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

b7f5d011

KVM: PPC: Convert RMA allocation into generic code · b4e70611

Authored by Alexander Graf on Jan 16, 2012

We have code to allocate big chunks of linear memory on bootup for later use.
This code is currently used for RMA allocation, but can be useful beyond that
extent.

Make it generic so we can reuse it for other stuff later.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

b4e70611

KVM: PPC: E500: Fail init when not on e500v2 · 9cf7c0e4

Authored by Alexander Graf on Jan 19, 2012

When enabling the current KVM code on e500mc, I get the following oops:

    Oops: Exception in kernel mode, sig: 4 [#1]
    SMP NR_CPUS=8 P2041 RDB
    Modules linked in:
    NIP: c067df4c LR: c067df44 CTR: 00000000
    REGS: ee055ed0 TRAP: 0700   Not tainted  (3.2.0-10391-g36c5afe)
    MSR: 00029002 <CE,EE,ME>  CR: 24042022  XER: 00000000
    TASK = ee0429b0[1] 'swapper/0' THREAD: ee054000 CPU: 2
    GPR00: c067df44 ee055f80 ee0429b0 00000000 00000058 0000003f ee211600 60c6b864
    GPR08: 7cc903a6 0000002c 00000000 00000001 44042082 2d180088 00000000 00000000
    GPR16: c0000a00 00000014 3fffffff 03fe9000 00000015 7ff3be68 c06e0000 00000000
    GPR24: 00000000 00000000 00001720 c067df1c c06e0000 00000000 ee054000 c06ab51c
    NIP [c067df4c] kvmppc_e500_init+0x30/0xf8
    LR [c067df44] kvmppc_e500_init+0x28/0xf8
    Call Trace:
    [ee055f80] [c067df44] kvmppc_e500_init+0x28/0xf8 (unreliable)
    [ee055fb0] [c0001d30] do_one_initcall+0x50/0x1f0
    [ee055fe0] [c06721dc] kernel_init+0xa4/0x14c
    [ee055ff0] [c000e910] kernel_thread+0x4c/0x68
    Instruction dump:
    9421ffd0 7c0802a6 93410018 9361001c 90010034 93810020 93a10024 93c10028
    93e1002c 4bfffe7d 2c030000 408200a4 <7c1082a6> 90010008 7c1182a6 9001000c
    ---[ end trace b8ef4903fcbf9dd3 ]---

Since it doesn't make sense to run the init function on any non-supported
platform, we can just call our "is this platform supported?" function and
bail out of init() if it's not.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

9cf7c0e4

KVM: Move gfn_to_memslot() to kvm_host.h · 9d4cba7f

Authored by Paul Mackerras on Jan 12, 2012

This moves __gfn_to_memslot() and search_memslots() from kvm_main.c to
kvm_host.h to reduce the code duplication caused by the need for
non-modular code in arch/powerpc/kvm/book3s_hv_rm_mmu.c to call
gfn_to_memslot() in real mode.

Rather than putting gfn_to_memslot() itself in a header, which would
lead to increased code size, this puts __gfn_to_memslot() in a header.
Then, the non-modular uses of gfn_to_memslot() are changed to call
__gfn_to_memslot() instead.  This way there is only one place in the
source code that needs to be changed should the gfn_to_memslot()
implementation need to be modified.

On powerpc, the Book3S HV style of KVM has code that is called from
real mode which needs to call gfn_to_memslot() and thus needs this.
(Module code is allocated in the vmalloc region, which can't be
accessed in real mode.)

With this, we can remove builtin_gfn_to_memslot() from book3s_hv_rm_mmu.c.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

9d4cba7f
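
The idea of keeping the lookup in a header as an inline helper, so that both modular and built-in callers share one implementation, can be sketched like this. It is an illustration only: `struct demo_memslot` and `demo_search_memslots` are hypothetical, and the simple linear scan is an assumption rather than a copy of the kernel's search_memslots().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory slot: a contiguous run of guest frame numbers. */
struct demo_memslot {
    uint64_t base_gfn;
    uint64_t npages;
};

/*
 * Header-style inline lookup: return the slot containing gfn, or NULL.
 * Being static inline, every includer gets its own copy, which is what lets
 * code that cannot call into a module still use the same source.
 */
static inline const struct demo_memslot *
demo_search_memslots(const struct demo_memslot *slots, size_t nslots, uint64_t gfn)
{
    size_t i;

    assert(slots != NULL || nslots == 0);
    for (i = 0; i < nslots; i++) {
        if (gfn >= slots[i].base_gfn &&
            gfn - slots[i].base_gfn < slots[i].npages)
            return &slots[i];
    }
    return NULL;
}
```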

KVM: PPC: Rename MMIO register identifiers · b3c5d3c2

Authored by Alexander Graf on Jan 07, 2012

We need the KVM_REG namespace for generic register settings now, so
let's rename the existing users to something different, enabling
us to reuse the namespace for more visible interfaces.

While at it, also move these private constants to a private header.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

b3c5d3c2

KVM: PPC: Move kvm_vcpu_ioctl_[gs]et_one_reg down to platform-specific code · 31f3438e

Authored by Paul Mackerras on Dec 12, 2011

This moves the get/set_one_reg implementation down from powerpc.c into
booke.c, book3s_pr.c and book3s_hv.c.  This avoids #ifdefs in C code,
but more importantly, it fixes a bug on Book3s HV where we were
accessing beyond the end of the kvm_vcpu struct (via the to_book3s()
macro) and corrupting memory, causing random crashes and file corruption.

On Book3s HV we only accept setting the HIOR to zero, since the guest
runs in supervisor mode and its vectors are never offset from zero.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf: update to apply on top of changed ONE_REG patches]
Signed-off-by: Avi Kivity <avi@redhat.com>

31f3438e

KVM: PPC: Add support for explicit HIOR setting · 1022fc3d

Authored by Alexander Graf on Sep 14, 2011

Until now, we always set HIOR based on the PVR, but this is just wrong.
Instead, we should be setting HIOR explicitly, so user space can decide
what the initial HIOR value is - just like on real hardware.

We keep the old PVR based way around for backwards compatibility, but
once user space uses the SET_ONE_REG based method, we drop the PVR logic.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

1022fc3d

KVM: PPC: Add generic single register ioctls · e24ed81f

Authored by Alexander Graf on Sep 14, 2011

Right now we transfer a static struct every time we want to get or set
registers. Unfortunately, over time we realize that there are more of
these than we thought of before and the extensibility and flexibility of
transferring a full struct every time is limited.

So this is a new approach to the problem. With these new ioctls, we can
get and set a single register that is identified by an ID. This allows for
very precise and limited transmittal of data. When we later realize that
it's a better idea to shove over multiple registers at once, we can reuse
most of the infrastructure and simply implement a GET_MANY_REGS / SET_MANY_REGS
interface.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

e24ed81f
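
The single-register approach described in this commit message (one call, one register, selected by an ID) can be modelled in a few lines. This is a user-space sketch under stated assumptions, not the KVM ioctl implementation: the `demo_` types, the register IDs and the two-register vcpu are all hypothetical, and a plain `memcpy` stands in for the user-space copy.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register IDs; real IDs are defined by the kernel headers. */
#define DEMO_REG_HIOR 1ULL
#define DEMO_REG_PVR  2ULL

/* Request: which register, and where the caller's 64-bit buffer lives. */
struct demo_one_reg {
    uint64_t id;
    uint64_t addr;
};

struct demo_vcpu {
    uint64_t hior;
    uint64_t pvr;
};

/* Map an ID to its storage, or NULL for an unknown register. */
static uint64_t *demo_reg_slot(struct demo_vcpu *vcpu, uint64_t id)
{
    switch (id) {
    case DEMO_REG_HIOR: return &vcpu->hior;
    case DEMO_REG_PVR:  return &vcpu->pvr;
    default:            return NULL;
    }
}

static int demo_get_one_reg(struct demo_vcpu *vcpu, const struct demo_one_reg *reg)
{
    uint64_t *slot = demo_reg_slot(vcpu, reg->id);

    if (!slot)
        return -EINVAL;
    memcpy((void *)(uintptr_t)reg->addr, slot, sizeof(*slot));
    return 0;
}

static int demo_set_one_reg(struct demo_vcpu *vcpu, const struct demo_one_reg *reg)
{
    uint64_t *slot = demo_reg_slot(vcpu, reg->id);

    if (!slot)
        return -EINVAL;
    memcpy(slot, (const void *)(uintptr_t)reg->addr, sizeof(*slot));
    return 0;
}
```

Adding a register in this model means adding one ID and one case, without changing the layout of any structure that is copied across the boundary.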

KVM: PPC: Use the vcpu kmem_cache when allocating new VCPUs · 6b75e6bf

Authored by Sasha Levin on Dec 07, 2011

Currently the code kzalloc()s new VCPUs instead of using the kmem_cache
which is created when KVM is initialized.

Modify it to allocate VCPUs from that kmem_cache.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

6b75e6bf

KVM: PPC: booke: Add booke206 TLB trace · d37b1a03

Authored by Liu Yu on Dec 20, 2011

The existing kvm_stlb_write/kvm_gtlb_write were a poor match for
the e500/book3e MMU -- mas1 was passed as "tid", mas2 was limited
to "unsigned int" which will be a problem on 64-bit, mas3/7 got
split up rather than treated as a single 64-bit word, etc.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
[scottwood@freescale.com: made mas2 64-bit, and added mas8 init]
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

d37b1a03

KVM: PPC: Book3s HV: Implement get_dirty_log using hardware changed bit · 82ed3616

Authored by Paul Mackerras on Dec 15, 2011

This changes the implementation of kvm_vm_ioctl_get_dirty_log() for
Book3s HV guests to use the hardware C (changed) bits in the guest
hashed page table. Since this makes the implementation quite different
from the Book3s PR case, this moves the existing implementation from
book3s.c to book3s_pr.c and creates a new implementation in book3s_hv.c.
That implementation calls kvmppc_hv_get_dirty_log() to do the actual
work by calling kvm_test_clear_dirty on each page. It iterates over
the HPTEs, clearing the C bit if set, and returns 1 if any C bit was
set (including the saved C bit in the rmap entry).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

82ed3616

KVM: PPC: Book3S HV: Use the hardware referenced bit for kvm_age_hva · 55514893

Authored by Paul Mackerras on Dec 15, 2011

This uses the host view of the hardware R (referenced) bit to speed
up kvm_age_hva() and kvm_test_age_hva().  Instead of removing all
the relevant HPTEs in kvm_age_hva(), we now just reset their R bits
if set.  Also, kvm_test_age_hva() now scans the relevant HPTEs to
see if any of them have R set.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

55514893

KVM: PPC: Book3s HV: Maintain separate guest and host views of R and C bits · bad3b507

Authored by Paul Mackerras on Dec 15, 2011

This allows both the guest and the host to use the referenced (R) and
changed (C) bits in the guest hashed page table. The guest has a view
of R and C that is maintained in the guest_rpte field of the revmap
entry for the HPTE, and the host has a view that is maintained in the
rmap entry for the associated gfn.

Both views are updated from the guest HPT. If a bit (R or C) is zero
in either view, it will be initially set to zero in the HPTE (or HPTEs),
until set to 1 by hardware. When an HPTE is removed for any reason,
the R and C bits from the HPTE are ORed into both views. We have to
be careful to read the R and C bits from the HPTE after invalidating
it, but before unlocking it, in case of any late updates by the hardware.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

bad3b507
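
The two rules in this commit message (a bit starts at zero in the HPTE if it is zero in either view, and on removal the hardware bits are ORed into both views) reduce to a little bit arithmetic. The sketch below is a simplified model: the `DEMO_` bit values and the `demo_views` structure are assumptions, and the real code keeps the two views in different structures with their own layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions for the referenced and changed bits. */
#define DEMO_HPTE_R 0x100ULL
#define DEMO_HPTE_C 0x080ULL
#define DEMO_HPTE_RC (DEMO_HPTE_R | DEMO_HPTE_C)

/* The guest's view and the host's view of R and C for one page. */
struct demo_views {
    uint64_t guest;
    uint64_t host;
};

/* Initial R/C for a new HPTE: a bit is set only if both views have it set. */
static uint64_t demo_initial_rc(const struct demo_views *v)
{
    assert(v != NULL);
    return v->guest & v->host & DEMO_HPTE_RC;
}

/* On removal, fold whatever the hardware recorded into both views. */
static void demo_harvest_rc(struct demo_views *v, uint64_t hpte_r)
{
    uint64_t rc = hpte_r & DEMO_HPTE_RC;

    assert(v != NULL);
    v->guest |= rc;
    v->host |= rc;
}
```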

KVM: PPC: Book3S HV: Keep HPTE locked when invalidating · a92bce95

Authored by Paul Mackerras on Dec 15, 2011

This reworks the implementations of the H_REMOVE and H_BULK_REMOVE
hcalls to make sure that we keep the HPTE locked and in the reverse-
mapping chain until we have finished invalidating it.  Previously
we would remove it from the chain and unlock it before invalidating
it, leaving a tiny window when the guest could access the page even
though we believe we have removed it from the guest (e.g.,
kvm_unmap_hva() has been called for the page and has found no HPTEs
in the chain).  In addition, we'll need this for future patches where
we will need to read the R and C bits in the HPTE after invalidating
it.

Doing this required restructuring kvmppc_h_bulk_remove() substantially.
Since we want to batch up the tlbies, we now need to keep several
HPTEs locked simultaneously.  In order to avoid possible deadlocks,
we don't spin on the HPTE bitlock for any except the first HPTE in
a batch.  If we can't acquire the HPTE bitlock for the second or
subsequent HPTE, we terminate the batch at that point, do the tlbies
that we have accumulated so far, unlock those HPTEs, and then start
a new batch to do the remaining invalidations.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

a92bce95
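
The batching rule described in this commit message (wait only for the first lock in a batch, and cut the batch short if a later lock is contended) can be sketched in a single-threaded model. This is not the kvmppc_h_bulk_remove() code: the `demo_` names are hypothetical, a boolean flag stands in for the HPTE bit lock, a counter stands in for the batched tlbie sequence, and where the real code would spin on the first lock this sketch simply returns 0 so the caller can retry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_hpte {
    bool locked;
    bool valid;
};

static bool demo_try_lock(struct demo_hpte *h)
{
    if (h->locked)
        return false;
    h->locked = true;
    return true;
}

/*
 * Invalidate as many of items[0..n) as can be locked without waiting while
 * other locks are held. Returns how many entries were handled; the caller
 * starts a new batch at the first unhandled entry.
 */
static size_t demo_remove_batch(struct demo_hpte **items, size_t n, size_t *flushes)
{
    size_t taken = 0;
    size_t i;

    assert(items != NULL && flushes != NULL);
    for (i = 0; i < n; i++) {
        if (!demo_try_lock(items[i])) {
            if (i == 0)
                return 0;  /* real code would spin here; the sketch asks for a retry */
            break;         /* never wait while already holding locks */
        }
        items[i]->valid = false;
        taken++;
    }
    if (taken > 0)
        (*flushes)++;      /* one flush covers the whole batch */
    for (i = 0; i < taken; i++)
        items[i]->locked = false;  /* unlock only after the flush */
    return taken;
}
```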

KVM: PPC: Add KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS · b5434032

Authored by Matt Evans on Dec 07, 2011

PPC KVM lacks these two capabilities, and as such a userland system must assume
a max of 4 VCPUs (following api.txt).  With these, a userland can determine
a more realistic limit.
Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

b5434032

KVM: PPC: Fix vcpu_create dereference before validity check. · 03cdab53

Authored by Matt Evans on Dec 06, 2011

Fix usage of vcpu struct before check that it's actually valid.
Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

03cdab53

KVM: PPC: Allow for read-only pages backing a Book3S HV guest · 4cf302bc

Authored by Paul Mackerras on Dec 12, 2011

With this, if a guest does an H_ENTER with a read/write HPTE on a page
which is currently read-only, we make the actual HPTE inserted be a
read-only version of the HPTE. We now intercept protection faults as
well as HPTE not found faults, and for a protection fault we work out
whether it should be reflected to the guest (e.g. because the guest HPTE
didn't allow write access to usermode) or handled by switching to
kernel context and calling kvmppc_book3s_hv_page_fault, which will then
request write access to the page and update the actual HPTE.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

4cf302bc

KVM: PPC: Implement MMU notifiers for Book3S HV guests · 342d3db7

Authored by Paul Mackerras on Dec 12, 2011

This adds the infrastructure to enable us to page out pages underneath
a Book3S HV guest, on processors that support virtualized partition
memory, that is, POWER7.  Instead of pinning all the guest's pages,
we now look in the host userspace Linux page tables to find the
mapping for a given guest page.  Then, if the userspace Linux PTE
gets invalidated, kvm_unmap_hva() gets called for that address, and
we replace all the guest HPTEs that refer to that page with absent
HPTEs, i.e. ones with the valid bit clear and the HPTE_V_ABSENT bit
set, which will cause an HDSI when the guest tries to access them.
Finally, the page fault handler is extended to reinstantiate the
guest HPTE when the guest tries to access a page which has been paged
out.

Since we can't intercept the guest DSI and ISI interrupts on PPC970,
we still have to pin all the guest pages on PPC970.  We have a new flag,
kvm->arch.using_mmu_notifiers, that indicates whether we can page
guest pages out.  If it is not set, the MMU notifier callbacks do
nothing and everything operates as before.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

342d3db7

KVM: PPC: Implement MMIO emulation support for Book3S HV guests · 697d3899

Authored by Paul Mackerras on Dec 12, 2011

This provides the low-level support for MMIO emulation in Book3S HV
guests.  When the guest tries to map a page which is not covered by
any memslot, that page is taken to be an MMIO emulation page.  Instead
of inserting a valid HPTE, we insert an HPTE that has the valid bit
clear but another hypervisor software-use bit set, which we call
HPTE_V_ABSENT, to indicate that this is an absent page.  An
absent page is treated much like a valid page as far as guest hcalls
(H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that
an absent HPTE doesn't need to be invalidated with tlbie since it
was never valid as far as the hardware is concerned.

When the guest accesses a page for which there is an absent HPTE, it
will take a hypervisor data storage interrupt (HDSI) since we now set
the VPM1 bit in the LPCR.  Our HDSI handler for HPTE-not-present faults
looks up the hash table and if it finds an absent HPTE mapping the
requested virtual address, will switch to kernel mode and handle the
fault in kvmppc_book3s_hv_page_fault(), which at present just calls
kvmppc_hv_emulate_mmio() to set up the MMIO emulation.

This is based on an earlier patch by Benjamin Herrenschmidt, but since
heavily reworked.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

697d3899

KVM: PPC: Maintain a doubly-linked list of guest HPTEs for each gfn · 06ce2c63

Authored by Paul Mackerras on Dec 12, 2011

This expands the reverse mapping array to contain two links for each
HPTE which are used to link together HPTEs that correspond to the
same guest logical page. Each circular list of HPTEs is pointed to
by the rmap array entry for the guest logical page, pointed to by
the relevant memslot. Links are 32-bit HPT entry indexes rather than
full 64-bit pointers, to save space. We use 3 of the remaining 32
bits in the rmap array entries as a lock bit, a referenced bit and
a present bit (the present bit is needed since HPTE index 0 is valid).
The bit lock for the rmap chain nests inside the HPTE lock bit.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

06ce2c63
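
The rmap entry layout described in this commit message (a 32-bit HPTE index plus a few flag bits, with a present bit because index 0 is itself valid) can be sketched with plain bit operations. The exact bit positions below are assumptions chosen for illustration, as are all `demo_`/`DEMO_` names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low 32 bits hold the index of the chain's first HPTE. */
#define DEMO_RMAP_INDEX_MASK 0xffffffffULL
#define DEMO_RMAP_PRESENT    (1ULL << 32)
#define DEMO_RMAP_REFERENCED (1ULL << 33)
#define DEMO_RMAP_LOCK       (1ULL << 34)

/* Point the entry at a chain head, keeping the other flag bits intact. */
static uint64_t demo_rmap_set_head(uint64_t rmap, uint32_t hpte_index)
{
    return (rmap & ~DEMO_RMAP_INDEX_MASK) | hpte_index | DEMO_RMAP_PRESENT;
}

/* A zero index alone cannot mean "empty", so the present bit decides. */
static int demo_rmap_has_chain(uint64_t rmap)
{
    return (rmap & DEMO_RMAP_PRESENT) != 0;
}

static uint32_t demo_rmap_head(uint64_t rmap)
{
    assert(demo_rmap_has_chain(rmap));
    return (uint32_t)(rmap & DEMO_RMAP_INDEX_MASK);
}
```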

KVM: PPC: Allow I/O mappings in memory slots · 9d0ef5ea

Authored by Paul Mackerras on Dec 12, 2011

This provides for the case where userspace maps an I/O device into the
address range of a memory slot using a VM_PFNMAP mapping.  In that
case, we work out the pfn from vma->vm_pgoff, and record the cache
enable bits from vma->vm_page_prot in two low-order bits in the
slot_phys array entries.  Then, in kvmppc_h_enter() we check that the
cache bits in the HPTE that the guest wants to insert match the cache
bits in the slot_phys array entry.  However, we do allow the guest to
create what it thinks is a non-cacheable or write-through mapping to
memory that is actually cacheable, so that we can use normal system
memory as part of an emulated device later on.  In that case the actual
HPTE we insert is a cacheable HPTE.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

9d0ef5ea

KVM: PPC: Allow use of small pages to back Book3S HV guests · da9d1d7f

Authored by Paul Mackerras on Dec 12, 2011

This relaxes the requirement that the guest memory be provided as
16MB huge pages, allowing it to be provided as normal memory, i.e.
in pages of PAGE_SIZE bytes (4k or 64k).  To allow this, we index
the kvm->arch.slot_phys[] arrays with a small page index, even if
huge pages are being used, and use the low-order 5 bits of each
entry to store the order of the enclosing page with respect to
normal pages, i.e. log_2(enclosing_page_size / PAGE_SIZE).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

da9d1d7f
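
The encoding described in this commit message, where the low-order 5 bits of each slot_phys entry carry log_2(enclosing_page_size / PAGE_SIZE), can be sketched as follows. This is a model under assumptions: a 4k PAGE_SIZE is fixed via `DEMO_PAGE_SHIFT`, and the helper names are hypothetical rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12            /* assume 4k normal pages */
#define DEMO_ORDER_MASK 0x1fULL       /* low-order 5 bits hold the order */
#define DEMO_PAGE_MASK  (~((1ULL << DEMO_PAGE_SHIFT) - 1))

/*
 * Build an entry from a page-aligned physical address and the shift of the
 * page that encloses it (12 for a 4k page, 24 for a 16MB huge page).
 */
static uint64_t demo_slot_phys_pack(uint64_t pa, unsigned int enclosing_shift)
{
    uint64_t order;

    assert(enclosing_shift >= DEMO_PAGE_SHIFT);
    order = enclosing_shift - DEMO_PAGE_SHIFT;
    assert(order <= DEMO_ORDER_MASK);
    assert((pa & ~DEMO_PAGE_MASK) == 0);  /* low bits must be free for the order */
    return pa | order;
}

static unsigned int demo_slot_phys_order(uint64_t entry)
{
    return (unsigned int)(entry & DEMO_ORDER_MASK);
}

static uint64_t demo_slot_phys_addr(uint64_t entry)
{
    return entry & DEMO_PAGE_MASK;
}
```

With these assumptions a 16MB huge page backing 4k pages is stored with order 12, and a normal page with order 0.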
