Commits · 3a167beac07cba597856c12b87638a06b0d53db7 · openeuler / Kernel

17 Oct, 2013 5 commits

kvm: powerpc: Add kvmppc_ops callback · 3a167bea

Authored by Aneesh Kumar K.V on Oct 07, 2013

This patch adds a new callback, kvmppc_ops. This will help us in enabling
both HV and PR KVM together in the same kernel. The actual change to
enable them together is done in a later patch in the series.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: squash in booke changes]
Signed-off-by: Alexander Graf <agraf@suse.de>

3a167bea

KVM: PPC: Book3S PR: Mark pages accessed, and dirty if being written · adc0bafe

Authored by Paul Mackerras on Sep 20, 2013

The mark_page_dirty() function, despite what its name might suggest,
doesn't actually mark the page as dirty as far as the MM subsystem is
concerned.  It merely sets a bit in KVM's map of dirty pages, if
userspace has requested dirty tracking for the relevant memslot.
To tell the MM subsystem that the page is dirty, we have to call
kvm_set_pfn_dirty() (or an equivalent such as SetPageDirty()).

This adds a call to kvm_set_pfn_dirty(), and while we are here, also
adds a call to kvm_set_pfn_accessed() to tell the MM subsystem that
the page has been accessed.  Since we are now using the pfn in
several places, this adds a 'pfn' variable to store it and changes
the places that used hpaddr >> PAGE_SHIFT to use pfn instead, which
is the same thing.

This also changes a use of HPTE_R_PP to PP_RXRX.  Both are 3, but
PP_RXRX is more informative as being the read-only page permission
bit setting.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

adc0bafe

KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page() · d78bca72

Authored by Paul Mackerras on Sep 20, 2013

When the MM code is invalidating a range of pages, it calls the KVM
kvm_mmu_notifier_invalidate_range_start() notifier function, which calls
kvm_unmap_hva_range(), which arranges to flush all the existing host
HPTEs for guest pages.  However, the Linux PTEs for the range being
flushed are still valid at that point.  We are not supposed to establish
any new references to pages in the range until the ...range_end()
notifier gets called.  The PPC-specific KVM code doesn't get any
explicit notification of that; instead, we are supposed to use
mmu_notifier_retry() to test whether we are or have been inside a
range flush notifier pair while we have been getting a page and
instantiating a host HPTE for the page.

This therefore adds a call to mmu_notifier_retry() inside
kvmppc_mmu_map_page().  This call is inside a region locked with
kvm->mmu_lock, which is the same lock that is taken by the KVM
MMU notifier functions, thus ensuring that no new notification can
proceed while we are in the locked region.  Inside this region we
also create the host HPTE and link the corresponding hpte_cache
structure into the lists used to find it later.  We cannot allocate
the hpte_cache structure inside this locked region because that can
lead to deadlock, so we allocate it outside the region and free it
if we end up not using it.

This also moves the updates of vcpu3s->hpte_cache_count inside the
regions locked with vcpu3s->mmu_lock, and does the increment in
kvmppc_mmu_hpte_cache_map() when the pte is added to the cache
rather than when it is allocated, in order that the hpte_cache_count
is accurate.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

d78bca72
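
The retry protocol described in the d78bca72 message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct mmu_state`, `range_start()`, `range_end()`, `needs_retry()` and `map_page()` are hypothetical names, and the places where kvm->mmu_lock would be held are marked with comments rather than real locking.

```c
#include <stdlib.h>

/* Hypothetical model of the notifier bookkeeping; not the kernel's structs. */
struct mmu_state {
    unsigned long seq;  /* advanced each time a range flush completes */
    int in_progress;    /* number of range flushes currently running */
    void *entry;        /* stand-in for an installed hpte_cache entry */
};

/* Models the ...range_start() notifier: a flush of a page range begins. */
void range_start(struct mmu_state *s) { s->in_progress++; }

/* Models the ...range_end() notifier: the flush ends, the sequence advances. */
void range_end(struct mmu_state *s) { s->seq++; s->in_progress--; }

/* Nonzero if a flush is running now or has completed since the snapshot. */
int needs_retry(const struct mmu_state *s, unsigned long snapshot)
{
    return s->in_progress > 0 || s->seq != snapshot;
}

/*
 * Models the tail of the map path. The caller sampled `snapshot` from
 * s->seq before looking up the page.
 * Returns 0 if the mapping was installed, 1 if the caller must retry,
 * -1 if the allocation failed.
 */
int map_page(struct mmu_state *s, unsigned long snapshot)
{
    /* Allocate before taking the lock, as the commit message describes. */
    void *new_entry = malloc(64);
    if (!new_entry)
        return -1;

    /* --- the lock would be taken here --- */
    if (needs_retry(s, snapshot)) {
        /* --- unlock --- */
        free(new_entry);        /* allocated but not used */
        return 1;
    }
    free(s->entry);             /* toy model: replace any earlier entry */
    s->entry = new_entry;
    /* --- unlock --- */
    return 0;
}
```

A caller that gets 1 back would take a fresh snapshot of `seq`, look the page up again, and call `map_page()` once more.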

KVM: PPC: Book3S PR: Better handling of host-side read-only pages · 93b159b4

Authored by Paul Mackerras on Sep 20, 2013

Currently we request write access to all pages that get mapped into the
guest, even if the guest is only loading from the page. This reduces
the effectiveness of KSM because it means that we unshare every page we
access. Also, we always set the changed (C) bit in the guest HPTE if
it allows writing, even for a guest load.

This fixes both these problems. We pass an 'iswrite' flag to the
mmu.xlate() functions and to kvmppc_mmu_map_page() to indicate whether
the access is a load or a store. The mmu.xlate() functions now only
set C for stores. kvmppc_gfn_to_pfn() now calls gfn_to_pfn_prot()
instead of gfn_to_pfn() so that it can indicate whether we need write
access to the page, and get back a 'writable' flag to indicate whether
the page is writable or not. If that 'writable' flag is clear, we then
make the host HPTE read-only even if the guest HPTE allowed writing.

This means that we can get a protection fault when the guest writes to a
page that it has mapped read-write but which is read-only on the host
side (perhaps due to KSM having merged the page). Thus we now call
kvmppc_handle_pagefault() for protection faults as well as HPTE not found
faults. In kvmppc_handle_pagefault(), if the access was allowed by the
guest HPTE and we thus need to install a new host HPTE, we then need to
remove the old host HPTE if there is one. This is done with a new
function, kvmppc_mmu_unmap_page(), which uses kvmppc_mmu_pte_vflush() to
find and remove the old host HPTE.

Since the memslot-related functions require the KVM SRCU read lock to
be held, this adds srcu_read_lock/unlock pairs around the calls to
kvmppc_handle_pagefault().

Finally, this changes kvmppc_mmu_book3s_32_xlate_pte() to not ignore
guest HPTEs that don't permit access, and to return -EPERM for accesses
that are not permitted by the page protections.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

93b159b4
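
The rules stated in the 93b159b4 message can be condensed into a small decision function. This is a sketch of the described behaviour only: `struct map_decision` and `decide_mapping()` are hypothetical names, and `host_writable` stands in for the 'writable' flag that the message says comes back from gfn_to_pfn_prot().

```c
#include <stdbool.h>

/* Hypothetical summary of one guest access; these are not kernel types. */
struct map_decision {
    bool request_write;  /* ask the host for write access to the page */
    bool set_changed;    /* set the changed (C) bit in the guest HPTE */
    bool host_hpte_rw;   /* install the host HPTE as read-write */
};

struct map_decision decide_mapping(bool guest_allows_write, bool iswrite,
                                   bool host_writable)
{
    struct map_decision d;

    /* Only stores need write access; loads leave KSM-shared pages shared. */
    d.request_write = iswrite;

    /* C is set for stores only, never for a load. */
    d.set_changed = iswrite && guest_allows_write;

    /* A page that is read-only on the host stays read-only in the host
     * HPTE, even if the guest HPTE allows writing. */
    d.host_hpte_rw = guest_allows_write && host_writable;

    return d;
}
```

When `host_hpte_rw` is false but the guest HPTE allows writing, a later guest store raises the protection fault that the message says is now routed to kvmppc_handle_pagefault().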

KVM: PPC: Book3S PR: Use 64k host pages where possible · c9029c34

Authored by Paul Mackerras on Sep 20, 2013

Currently, PR KVM uses 4k pages for the host-side mappings of guest
memory, regardless of the host page size.  When the host page size is
64kB, we might as well use 64k host page mappings for guest mappings
of 64kB and larger pages and for guest real-mode mappings.  However,
the magic page has to remain a 4k page.

To implement this, we first add another flag bit to the guest VSID
values we use, to indicate that this segment is one where host pages
should be mapped using 64k pages.  For segments with this bit set
we set the bits in the shadow SLB entry to indicate a 64k base page
size.  When faulting in host HPTEs for this segment, we make them
64k HPTEs instead of 4k.  We record the pagesize in struct hpte_cache
for use when invalidating the HPTE.

For now we restrict the segment containing the magic page (if any) to
4k pages.  It should be possible to lift this restriction in future
by ensuring that the magic 4k page is appropriately positioned within
a host 64k page.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

c9029c34

30 Jun, 2013 2 commits

KVM: PPC: Book3S PR: Allow guest to use 1TB segments · 0f296829

Authored by Paul Mackerras on Jun 22, 2013

With this, the guest can use 1TB segments as well as 256MB segments.
Since we now have the situation where a single emulated guest segment
could correspond to multiple shadow segments (as the shadow segments
are still 256MB segments), this adds a new kvmppc_mmu_flush_segment()
to scan for all shadow segments that need to be removed.

This restructures the guest HPT (hashed page table) lookup code to
use the correct hashing and matching functions for HPTEs within a
1TB segment. We use the standard hpt_hash() function instead of
open-coding the hash calculation, and we use HPTE_V_COMPARE() with
an AVPN value that has the B (segment size) field included. The
calculation of avpn is done a little earlier since it doesn't change
in the loop starting at the do_second label.

The computation in kvmppc_mmu_book3s_64_esid_to_vsid() changes so that
it returns a 256MB VSID even if the guest SLB entry is a 1TB entry.
This is because the users of this function are creating 256MB SLB
entries. We set a new VSID_1T flag so that entries created from 1T
segments don't collide with entries from 256MB segments.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

0f296829
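
The statement in 0f296829 that one guest segment may correspond to multiple shadow segments follows from the two segment sizes. The sketch below works through the arithmetic. The macro and function names are hypothetical; only the sizes (256MB = 2^28 bytes, 1TB = 2^40 bytes) come from the message.

```c
#include <stdint.h>

#define SEG_SHIFT_256M 28  /* 2^28 bytes = 256MB */
#define SEG_SHIFT_1T   40  /* 2^40 bytes = 1TB */

/* Number of 256MB shadow segments needed to cover one guest segment. */
uint64_t shadow_segments_per_guest_segment(int guest_is_1t)
{
    return guest_is_1t ? (1ULL << (SEG_SHIFT_1T - SEG_SHIFT_256M)) : 1;
}

/* Which 256MB slice of its enclosing 1TB segment an effective address is in. */
uint64_t slice_within_1t(uint64_t ea)
{
    uint64_t slices = 1ULL << (SEG_SHIFT_1T - SEG_SHIFT_256M);
    return (ea >> SEG_SHIFT_256M) & (slices - 1);
}
```

A single 1TB guest segment can therefore be backed by up to 4096 shadow segments, which is why a flush has to scan for all of them instead of removing just one.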

KVM: PPC: Book3S PR: Fix proto-VSID calculations · 8ed7b7e9

Authored by Paul Mackerras on Jun 22, 2013

This makes sure the calculation of the proto-VSIDs used by PR KVM
is done with 64-bit arithmetic. Since vcpu3s->context_id[] is int,
when we do vcpu3s->context_id[0] << ESID_BITS the shift will be done
with 32-bit instructions, possibly leading to significant bits
getting lost, as the context id can be up to 524283 and ESID_BITS is
18. To fix this we cast the context id to u64 before shifting.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

8ed7b7e9
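
The truncation that 8ed7b7e9 fixes can be reproduced in a few lines. This is a sketch, not the kernel code: the function names are hypothetical, and `uint32_t` is used in place of `int` so that the 32-bit wrap-around is well defined in C (overflowing a signed shift is undefined behaviour).

```c
#include <stdint.h>

#define ESID_BITS 18  /* value quoted in the commit message */

/* Emulates the buggy path: the shift is carried out in 32 bits. */
uint64_t proto_vsid_32bit(uint32_t context_id)
{
    return (uint32_t)(context_id << ESID_BITS);
}

/* The fix: widen the context id to 64 bits before shifting. */
uint64_t proto_vsid_64bit(uint32_t context_id)
{
    return (uint64_t)context_id << ESID_BITS;
}
```

For the context id 524283 mentioned in the message, the 64-bit shift gives 0x1FFFEC0000, a 37-bit value, while the 32-bit shift keeps only 0xFFEC0000 and drops the top five bits.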

21 Jun, 2013 1 commit

powerpc/mm: handle hugepage size correctly when invalidating hpte entries · db3d8534

Authored by Aneesh Kumar K.V on Jun 20, 2013

If a hash bucket gets full, we "evict" a more or less random entry from it.
When we do that we don't invalidate the TLB (hpte_remove) because we assume
the old translation is still technically "valid". This implies that when
we are invalidating or updating pte, even if HPTE entry is not valid
we should do a tlb invalidate. With hugepages, we need to pass the correct
actual page size value for tlb invalidation.

This change updates patch 0608d692
"powerpc/mm: Always invalidate tlb on hpte invalidate and update" to handle
transparent hugepages correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

db3d8534

30 Apr, 2013 1 commit

powerpc: Decode the pte-lp-encoding bits correctly. · b1022fbd

Authored by Aneesh Kumar K.V on Apr 28, 2013

We look at both the segment base page size and actual page size and store
the pte-lp-encodings in an array per base page size.

We also update all relevant functions to take an actual page size argument
so that we can use the correct PTE LP encoding in the HPTE. This should also
give us basic Multiple Page Size per Segment (MPSS) support, which is needed
to enable THP on ppc64.

[Fixed PR KVM build --BenH]
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

b1022fbd

17 Mar, 2013 1 commit

powerpc: Rename USER_ESID_BITS* to ESID_BITS* · af81d787

Authored by Aneesh Kumar K.V on Mar 13, 2013

Now we use the ESID_BITS of the kernel address to build the proto-VSID, so
rename USER_ESID_BITS to ESID_BITS.
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.8]

af81d787

30 Oct, 2012 1 commit

KVM: do not treat noslot pfn as a error pfn · 81c52c56

Authored by Xiao Guangrong on Oct 16, 2012

This patch filters noslot pfns out of the error pfns, based on Marcelo's comment:
a noslot pfn is not an error pfn.

After this patch,
- is_noslot_pfn indicates that the gfn is not in a slot
- is_error_pfn indicates that the gfn is in a slot but an error occurred
  when translating the gfn to a pfn
- is_error_noslot_pfn indicates that the pfn is either an error pfn or a
  noslot pfn
is_invalid_pfn can then be removed, which makes the code cleaner.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

81c52c56
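
The three predicates listed in the 81c52c56 message can be modelled as follows. This is a sketch under stated assumptions: the `DEMO_` constants and `demo_` functions are hypothetical, and the bit layout is chosen for illustration only, not copied from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_pfn_t;

/* Illustrative encoding only; the real constants live in the kernel headers. */
#define DEMO_PFN_ERR_MASK        (0x7ffULL << 52)  /* translation errors */
#define DEMO_PFN_NOSLOT          (0x1ULL << 63)    /* gfn is not in any slot */
#define DEMO_PFN_ERR_NOSLOT_MASK (DEMO_PFN_ERR_MASK | DEMO_PFN_NOSLOT)

/* The gfn is not in a slot. */
bool demo_is_noslot_pfn(demo_pfn_t pfn)
{
    return pfn == DEMO_PFN_NOSLOT;
}

/* The gfn is in a slot, but translating it to a pfn failed. */
bool demo_is_error_pfn(demo_pfn_t pfn)
{
    return (pfn & DEMO_PFN_ERR_MASK) != 0;
}

/* Either of the two conditions above. */
bool demo_is_error_noslot_pfn(demo_pfn_t pfn)
{
    return (pfn & DEMO_PFN_ERR_NOSLOT_MASK) != 0;
}
```

With the cases kept in disjoint bits, a noslot pfn no longer tests positive as an error pfn, which is the separation the message describes.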

06 Oct, 2012 1 commit

KVM: PPC: Book3s: PR: Add (dumb) MMU Notifier support · 9b0cb3c8

Authored by Alexander Graf on Aug 10, 2012

Now that we have very simple MMU Notifier support for e500 in place,
also add the same simple support to book3s. It gets us one step closer
to actual fast support.
Signed-off-by: Alexander Graf <agraf@suse.de>

9b0cb3c8

17 Sep, 2012 1 commit

powerpc/mm: Convert virtual address to vpn · 5524a27d

Authored by Aneesh Kumar K.V on Sep 10, 2012

This patch converts various functions to take a virtual page number
instead of a virtual address. The virtual page number is the virtual address
shifted right by VPN_SHIFT (12) bits. This enables us to have an
address range of up to 76 bits.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

5524a27d
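
The 76-bit figure in 5524a27d follows from simple arithmetic. The sketch below assumes the virtual page number is held in a 64-bit variable. That width is an assumption made here; the message states only the shift and the resulting range. The function names are hypothetical.

```c
#include <stdint.h>

#define VPN_SHIFT 12  /* value quoted in the commit message */

/* Virtual page number of a virtual address that fits in 64 bits. */
uint64_t va_to_vpn(uint64_t va)
{
    return va >> VPN_SHIFT;
}

/* Address bits that can be described when the VPN occupies vpn_bits bits. */
unsigned int addressable_bits(unsigned int vpn_bits)
{
    return vpn_bits + VPN_SHIFT;
}
```

Dropping the low 12 bits lets a 64-bit VPN describe 64 + 12 = 76 bits of virtual address.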

16 Aug, 2012 1 commit

KVM: PPC: Add cache flush on page map · 249ba1ee

Authored by Alexander Graf on Aug 03, 2012

When we map a page that wasn't icache cleared before, do so when first
mapping it in KVM using the same information bits as the Linux mapping
logic. That way we are 100% sure that any page we map does not have stale
entries in the icache.
Signed-off-by: Alexander Graf <agraf@suse.de>

249ba1ee

16 May, 2012 1 commit

powerpc/kvm: Fix VSID usage in 64-bit "PR" KVM · ffe36492

Authored by Benjamin Herrenschmidt on Mar 23, 2012

The code forgot to scramble the VSIDs the way we normally do
and was basically using the "proto VSID" directly with the MMU.

This means that in practice, KVM used random VSIDs that could
collide with segments used by other user space programs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[agraf: simplify ppc32 case]
Signed-off-by: Alexander Graf <agraf@suse.de>

ffe36492

05 Mar, 2012 1 commit

KVM: PPC: Use get/set for to_svcpu to help preemption · 468a12c2

Authored by Alexander Graf on Dec 09, 2011

When running the 64-bit Book3s PR code without CONFIG_PREEMPT_NONE, we were
doing a few things wrong, most notably access to PACA fields without making
sure that the pointers stay stable across the access (preempt_disable()).

This patch moves to_svcpu towards a get/put model which allows us to disable
preemption while accessing the shadow vcpu fields in the PACA. That way we
can run preemptible and everyone's happy!
Reported-by: Jörg Sommer <joerg@alea.gnuu.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

468a12c2

24 Oct, 2010 9 commits

KVM: PPC: Implement correct SID mapping on Book3s_32 · 8b6db3bc

Authored by Alexander Graf on Aug 15, 2010

Up until now we were doing segment mappings wrong on Book3s_32. For Book3s_64
we were using a trick where we know that a single mmu_context gives us 16 bits
of context ids.

The mm system on Book3s_32 instead uses a clever algorithm to distribute VSIDs
across the available range, so a context id really only gives us 16 available
VSIDs.

To keep at least a few guest processes in the SID shadow, let's map a number of
contexts that we can use as VSID pool. This makes the code be actually correct
and shouldn't hurt performance too much.
Signed-off-by: Alexander Graf <agraf@suse.de>

8b6db3bc

KVM: PPC: Remove unused define · cb24c508

Authored by Alexander Graf on Aug 02, 2010

The define VSID_ALL is unused. Let's remove it.
Signed-off-by: Alexander Graf <agraf@suse.de>

cb24c508

KVM: PPC: Revert "KVM: PPC: Use kernel hash function" · b9877ce2

Authored by Alexander Graf on Aug 02, 2010

It turns out the in-kernel hash function is sub-optimal for our subtle
hash inputs where every bit is significant. So let's revert to the original
hash functions.

This reverts commit 05340ab4f9a6626f7a2e8f9fe5397c61d494f445.
Signed-off-by: Alexander Graf <agraf@suse.de>

b9877ce2

KVM: PPC: Move slb debugging to tracepoints · 928d78be

Authored by Alexander Graf on Aug 02, 2010

This patch moves debugging printks for shadow SLB debugging over to tracepoints.
Signed-off-by: Alexander Graf <agraf@suse.de>

928d78be

KVM: PPC: Fix sid map search after flush · c22c3196

Authored by Alexander Graf on Aug 02, 2010

After a flush the sid map contained lots of entries with 0 for their gvsid and
hvsid value. Unfortunately, 0 can be a real value the guest searches for when
looking up a vsid so it would incorrectly find the host's 0 hvsid mapping which
doesn't belong to our sid space.

So let's also check for the valid bit that indicates that the sid we're
looking at actually contains useful data.
Signed-off-by: Alexander Graf <agraf@suse.de>

c22c3196
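
The fix described in c22c3196 amounts to one extra condition in the lookup loop. The sketch below is a simplified model, not the kernel code: `struct sid_entry` and `sid_map_find()` are hypothetical, with field names borrowed from the message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SID map entry; field names follow the commit message. */
struct sid_entry {
    uint64_t gvsid;  /* guest VSID */
    uint64_t hvsid;  /* host VSID it maps to */
    bool valid;      /* false for entries cleared by a flush */
};

/* Return the valid entry that maps gvsid, or NULL if there is none. */
struct sid_entry *sid_map_find(struct sid_entry *map, size_t n, uint64_t gvsid)
{
    for (size_t i = 0; i < n; i++) {
        /* Without the valid check, a flushed (all-zero) entry would
         * match a lookup for gvsid 0. */
        if (map[i].valid && map[i].gvsid == gvsid)
            return &map[i];
    }
    return NULL;
}
```

Looking up gvsid 0 in a freshly flushed map now returns NULL instead of a stale entry whose hvsid is 0.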

KVM: PPC: Move book3s_64 mmu map debug print to trace point · 82fdee7b

Authored by Alexander Graf on Aug 02, 2010

This patch moves Book3s MMU debugging over to tracepoints.
Signed-off-by: Alexander Graf <agraf@suse.de>

82fdee7b

KVM: PPC: correctly check gfn_to_pfn() return value · 49451389

Authored by Gleb Natapov on Jul 29, 2010

On failure gfn_to_pfn returns bad_page so use correct function to check
for that.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

49451389

KVM: PPC: Magic Page Book3s support · e8508940

Authored by Alexander Graf on Jul 29, 2010

We need to override EA as well as PA lookups for the magic page. When the guest
tells us to project it, the magic page overrides any guest mappings.

In order to reflect that, we need to hook into all the MMU layers of KVM to
force map the magic page if necessary.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

e8508940

KVM: PPC: Convert MSR to shared page · 666e7252

Authored by Alexander Graf on Jul 29, 2010

One of the most obvious registers to share with the guest directly is the
MSR. The MSR contains the "interrupts enabled" flag which the guest has to
toggle in critical sections.

So in order to bring the overhead of interrupt en- and disabling down, let's
put msr into the shared page. Keep in mind that even though you can fully read
its contents, writing to it doesn't always update all state. There are a few
safe fields that don't require hypervisor interaction. See the documentation
for a list of MSR bits that are safe to be set from inside the guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

666e7252

01 Aug, 2010 3 commits

KVM: PPC: Make use of hash based Shadow MMU · fef093be

Authored by Alexander Graf on Jun 30, 2010

We just introduced generic functions to handle shadow pages on PPC.
This patch makes the respective backends make use of them, getting
rid of a lot of duplicate code along the way.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

fef093be

KVM: PPC: Use kernel hash function · 3b249157

Authored by Alexander Graf on Jun 21, 2010

The Linux kernel already provides a hash function. Let's reuse that
instead of reinventing the wheel!
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

3b249157

KVM: PPC: Remove obsolete kvmppc_mmu_find_pte · a576f7a2

Authored by Alexander Graf on Jun 21, 2010

Initially we had to search for pte entries to invalidate them. Since
the logic has improved since then, we can just get rid of the search
function.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

a576f7a2

17 May, 2010 7 commits

KVM: PPC: Fix Book3S_64 Host MMU debug output · 5156f274

Authored by Alexander Graf on Apr 20, 2010

We have some debug output in Book3S_64. Some of that was invalid though,
partially not even compiling because it accessed incorrect variables.

So let's fix that up, making debugging more fun again.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

5156f274

KVM: PPC: Be more informative on BUG · ac214671

Authored by Alexander Graf on Apr 20, 2010

We have a condition in the ppc64 host mmu code that should never occur.
Unfortunately, it just did happen to me and I was rather puzzled on why,
because BUG_ON doesn't tell me anything useful.

So let's add some more debug output in case this goes wrong. Also change
BUG to WARN, since I don't want to reboot every time I mess something up.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

ac214671

KVM: PPC: Convert u64 -> ulong · af7b4d10

Authored by Alexander Graf on Apr 20, 2010

There are some pieces in the code that I overlooked that still use
u64s instead of longs. This slows down 32 bit hosts unnecessarily, so
let's just move them to ulong.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

af7b4d10

KVM: PPC: Release clean pages as clean · 33fd27c7

Authored by Alexander Graf on Apr 16, 2010

When we have mapped a page read-only, we can just release it as clean to
KVM's page claim mechanisms, because we're pretty sure it hasn't been
touched.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

33fd27c7

KVM: PPC: Extract MMU init · 9cc5e953

Authored by Alexander Graf on Apr 16, 2010

The host shadow mmu code needs to get initialized. It needs to fetch a
segment it can use to put shadow PTEs into.

That initialization code was in generic code, which is icky. Let's move
it over to the respective MMU file.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

9cc5e953

KVM: PPC: Improve indirect svcpu accessors · c7f38f46

Authored by Alexander Graf on Apr 16, 2010

We already have some inline functions we use to access vcpu or svcpu structs,
depending on whether we're on booke or book3s. Since we just put a few more
registers into the svcpu, we also need to make sure the respective callbacks
are available and get used.

So this patch converts direct uses of the fields that now live in the svcpu
struct into inline function calls. While at it, it also moves the definitions
of those inline functions to the respective header files for booke and book3s,
greatly improving readability.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

c7f38f46

KVM: PPC: Add check if pte was mapped secondary · a1eda280

Authored by Alexander Graf on Mar 24, 2010

Some HTAB providers (namely the PS3) ignore the SECONDARY flag. They
just put an entry in the htab as secondary when they see fit.

So we need to check the return value of htab_insert to remember the
correct slot id so we can actually invalidate the entry again.

Fixes KVM on the PS3.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

a1eda280

25 Apr, 2010 1 commit

KVM: PPC: Enable use of secondary htab bucket · 20a340ab

Authored by Alexander Graf on Feb 19, 2010

We had code to make use of the secondary htab buckets, but kept that
disabled because it was unstable when I put it in.

I checked again if that's still the case and apparently it was only
exposing some instability that was there anyways before. I haven't
seen any badness related to usage of secondary htab entries so far.

This should speed up guest memory allocations by quite a bit, because
we now have more space to put PTEs in.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

20a340ab

05 Nov, 2009 1 commit

Add book3s_64 Host MMU handling · 0d8dc681

Authored by Alexander Graf on Oct 30, 2009

We designed the Book3S port of KVM to be as modular as possible. Most
of the code could be easily used on a Book3S_32 host as well.

The main difference between 32 and 64 bit cores is the MMU. To keep
things well separated, we treat the book3s_64 MMU as one possible compile
option.

This patch adds all the MMU helpers the rest of the code needs in
order to modify the host's MMU, like setting PTEs and segments.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

0d8dc681

openeuler / Kernel synced successfully more than 1 year ago