提交 · d19469e8415813cceaa494b6f538e327b9a95f3b · openeuler / Kernel

10 3月, 2017 1 次提交

power/mm: update pte_write and pte_wrprotect to handle savedwrite · d19469e8

由 Aneesh Kumar K.V 提交于 3月 09, 2017

We use pte_write() to check whethwer the pte entry is writable. This is
mostly used to later mark the pte read only if it is writable. The other
use of pte_write() is to check whether the pte_entry is writable so that
hardware page table entry can be marked accordingly. This is used in kvm
where we look at qemu page table entry and update hardware hash page table
for the guest with correct write enable bit.

With the above, for the first usage we should also check the savedwrite
bit so that we can correctly clear the savedwite bit. For the later, we
add a new variant __pte_write().

With this we can revert write_protect_page part of 595cd8f2 ("mm/ksm:
handle protnone saved writes when making page write protect"). But I left
it as it is as an example code for savedwrite check.

Fixes: c137a275 ("powerpc/mm/autonuma: switch ppc64 to its own implementation of saved write")
Link: http://lkml.kernel.org/r/1488203787-17849-2-git-send-email-aneesh.kumar@linux.vnet.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d19469e8

18 2月, 2017 1 次提交

KVM: PPC: Book3S HV: Disable HPT resizing on POWER9 for now · bcd3bb63

由 Paul Mackerras 提交于 2月 18, 2017

The new HPT resizing code added in commit b5baa687 ("KVM: PPC:
Book3S HV: KVM-HV HPT resizing implementation", 2016-12-20) doesn't
have code to handle the new HPTE format which POWER9 uses.  Thus it
would be best not to advertise it to userspace on POWER9 systems
until it works properly.

Also, since resize_hpt_rehash_hpte() contains BUG_ON() calls that
could be hit on POWER9, let's prevent it from being called on POWER9
for now.
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

bcd3bb63

17 2月, 2017 1 次提交

KVM: PPC: Book3S HV: Turn "KVM guest htab" message into a debug message · 3a4f1760

由 Thomas Huth 提交于 2月 16, 2017

The average user likely does not know what a "htab" or "LPID" is,
and it's annoying that these messages are quickly filling the dmesg
log when you're doing boot cycle tests, so let's turn it into a debug
message instead.
Signed-off-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

3a4f1760

16 2月, 2017 1 次提交

KVM: PPC: Book3S HV: Prevent double-free on HPT resize commit path · 5b73d634

由 David Gibson 提交于 2月 15, 2017

resize_hpt_release(), called once the HPT resize of a KVM guest is
completed (successfully or unsuccessfully) frees the state structure for
the resize.  It is currently not safe to call with a NULL pointer.

However, one of the error paths in kvm_vm_ioctl_resize_hpt_commit() can
invoke it with a NULL pointer.  This will occur if userspace improperly
invokes KVM_PPC_RESIZE_HPT_COMMIT without previously calling
KVM_PPC_RESIZE_HPT_PREPARE, or if it calls COMMIT twice without an
intervening PREPARE.

To fix this potential crash bug - and maybe others like it, make it safe
(and a no-op) to call resize_hpt_release() with a NULL resize pointer.

Found by Dan Carpenter with a static checker.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

5b73d634

31 1月, 2017 13 次提交

KVM: PPC: Book3S HV: KVM-HV HPT resizing implementation · b5baa687

由 David Gibson 提交于 12月 20, 2016

This adds the "guts" of the implementation for the HPT resizing PAPR
extension.  It has the code to allocate and clear a new HPT, rehash an
existing HPT's entries into it, and accomplish the switchover for a
KVM guest from the old HPT to the new one.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

b5baa687

KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation · 5e985969

由 David Gibson 提交于 12月 20, 2016

This adds a not yet working outline of the HPT resizing PAPR
extension.  Specifically it adds the necessary ioctl() functions,
their basic steps, the work function which will handle preparation for
the resize, and synchronization between these, the guest page fault
path and guest HPT update path.

The actual guts of the implementation isn't here yet, so for now the
calls will always fail.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

5e985969

KVM: PPC: Book3S HV: Create kvmppc_unmap_hpte_helper() · 639e4597

由 David Gibson 提交于 12月 20, 2016

The kvm_unmap_rmapp() function, called from certain MMU notifiers, is used
to force all guest mappings of a particular host page to be set ABSENT, and
removed from the reverse mappings.

For HPT resizing, we will have some cases where we want to set just a
single guest HPTE ABSENT and remove its reverse mappings.  To prepare with
this, we split out the logic from kvm_unmap_rmapp() to evict a single HPTE,
moving it to a new helper function.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

639e4597

KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size · f98a8bf9

由 David Gibson 提交于 12月 20, 2016

The KVM_PPC_ALLOCATE_HTAB ioctl() is used to set the size of hashed page
table (HPT) that userspace expects a guest VM to have, and is also used to
clear that HPT when necessary (e.g. guest reboot).

At present, once the ioctl() is called for the first time, the HPT size can
never be changed thereafter - it will be cleared but always sized as from
the first call.

With upcoming HPT resize implementation, we're going to need to allow
userspace to resize the HPT at reset (to change it back to the default size
if the guest changed it).

So, we need to allow this ioctl() to change the HPT size.

This patch also updates Documentation/virtual/kvm/api.txt to reflect
the new behaviour.  In fact the documentation was already slightly
incorrect since 572abd56 "KVM: PPC: Book3S HV: Don't fall back to
smaller HPT size in allocation ioctl"
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

f98a8bf9

KVM: PPC: Book3S HV: Split HPT allocation from activation · aae0777f

由 David Gibson 提交于 12月 20, 2016

Currently, kvmppc_alloc_hpt() both allocates a new hashed page table (HPT)
and sets it up as the active page table for a VM. For the upcoming HPT
resize implementation we're going to want to allocate HPTs separately from
activating them.

So, split the allocation itself out into kvmppc_allocate_hpt() and perform
the activation with a new kvmppc_set_hpt() function. Likewise we split
kvmppc_free_hpt(), which just frees the HPT, from kvmppc_release_hpt()
which unsets it as an active HPT, then frees it.

We also move the logic to fall back to smaller HPT sizes if the first try
fails into the single caller which used that behaviour,
kvmppc_hv_setup_htab_rma(). This introduces a slight semantic change, in
that previously if the initial attempt at CMA allocation failed, we would
fall back to attempting smaller sizes with the page allocator. Now, we
try first CMA, then the page allocator at each size. As far as I can tell
this change should be harmless.

To match, we make kvmppc_free_hpt() just free the actual HPT itself. The
call to kvmppc_free_lpid() that was there, we move to the single caller.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

aae0777f

KVM: PPC: Book3S HV: Don't store values derivable from HPT order · 3d089f84

由 David Gibson 提交于 12月 20, 2016

Currently the kvm_hpt_info structure stores the hashed page table's order,
and also the number of HPTEs it contains and a mask for its size.  The
last two can be easily derived from the order, so remove them and just
calculate them as necessary with a couple of helper inlines.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

3d089f84

KVM: PPC: Book3S HV: Gather HPT related variables into sub-structure · 3f9d4f5a

由 David Gibson 提交于 12月 20, 2016

Currently, the powerpc kvm_arch structure contains a number of variables
tracking the state of the guest's hashed page table (HPT) in KVM HV.  This
patch gathers them all together into a single kvm_hpt_info substructure.
This makes life more convenient for the upcoming HPT resizing
implementation.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

3f9d4f5a

KVM: PPC: Book3S HV: Rename kvm_alloc_hpt() for clarity · db9a290d

由 David Gibson 提交于 12月 20, 2016

The difference between kvm_alloc_hpt() and kvmppc_alloc_hpt() is not at
all obvious from the name.  In practice kvmppc_alloc_hpt() allocates an HPT
by whatever means, and calls kvm_alloc_hpt() which will attempt to allocate
it with CMA only.

To make this less confusing, rename kvm_alloc_hpt() to kvm_alloc_hpt_cma().
Similarly, kvm_release_hpt() is renamed kvm_free_hpt_cma().
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

db9a290d

KVM: PPC: Book3S HV: Enable radix guest support · 8cf4ecc0

由 Paul Mackerras 提交于 1月 30, 2017

This adds a few last pieces of the support for radix guests:

* Implement the backends for the KVM_PPC_CONFIGURE_V3_MMU and
  KVM_PPC_GET_RMMU_INFO ioctls for radix guests

* On POWER9, allow secondary threads to be on/off-lined while guests
  are running.

* Set up LPCR and the partition table entry for radix guests.

* Don't allocate the rmap array in the kvm_memory_slot structure
  on radix.

* Don't try to initialize the HPT for radix guests, since they don't
  have an HPT.

* Take out the code that prevents the HV KVM module from
  initializing on radix hosts.

At this stage, we only support radix guests if the host is running
in radix mode, and only support HPT guests if the host is running in
HPT mode.  Thus a guest cannot switch from one mode to the other,
which enables some simplifications.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8cf4ecc0

KVM: PPC: Book3S HV: Implement dirty page logging for radix guests · 8f7b79b8

由 Paul Mackerras 提交于 1月 30, 2017

This adds code to keep track of dirty pages when requested (that is,
when memslot->dirty_bitmap is non-NULL) for radix guests. We use the
dirty bits in the PTEs in the second-level (partition-scoped) page
tables, together with a bitmap of pages that were dirty when their
PTE was invalidated (e.g., when the page was paged out). This bitmap
is stored in the first half of the memslot->dirty_bitmap area, and
kvm_vm_ioctl_get_dirty_log_hv() now uses the second half for the
bitmap that gets returned to userspace.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8f7b79b8

KVM: PPC: Book3S HV: MMU notifier callbacks for radix guests · 01756099

由 Paul Mackerras 提交于 1月 30, 2017

This adapts our implementations of the MMU notifier callbacks
(unmap_hva, unmap_hva_range, age_hva, test_age_hva, set_spte_hva)
to call radix functions when the guest is using radix.  These
implementations are much simpler than for HPT guests because we
have only one PTE to deal with, so we don't need to traverse
rmap chains.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

01756099

KVM: PPC: Book3S HV: Page table construction and page faults for radix guests · 5a319350

由 Paul Mackerras 提交于 1月 30, 2017

This adds the code to construct the second-level ("partition-scoped" in
architecturese) page tables for guests using the radix MMU.  Apart from
the PGD level, which is allocated when the guest is created, the rest
of the tree is all constructed in response to hypervisor page faults.

As well as hypervisor page faults for missing pages, we also get faults
for reference/change (RC) bits needing to be set, as well as various
other error conditions.  For now, we only set the R or C bit in the
guest page table if the same bit is set in the host PTE for the
backing page.

This code can take advantage of the guest being backed with either
transparent or ordinary 2MB huge pages, and insert 2MB page entries
into the guest page tables.  There is no support for 1GB huge pages
yet.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5a319350

KVM: PPC: Book3S HV: Add basic infrastructure for radix guests · 9e04ba69

由 Paul Mackerras 提交于 1月 30, 2017

This adds a field in struct kvm_arch and an inline helper to
indicate whether a guest is a radix guest or not, plus a new file
to contain the radix MMU code, which currently contains just a
translate function which knows how to traverse the guest page
tables to translate an address.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9e04ba69

24 11月, 2016 1 次提交

KVM: PPC: Book3S HV: Adapt to new HPTE format on POWER9 · abb7c7dd

由 Paul Mackerras 提交于 11月 16, 2016

This adapts the KVM-HV hashed page table (HPT) code to read and write
HPT entries in the new format defined in Power ISA v3.00 on POWER9
machines.  The new format moves the B (segment size) field from the
first doubleword to the second, and trims some bits from the AVA
(abbreviated virtual address) and ARPN (abbreviated real page number)
fields.  As far as possible, the conversion is done when reading or
writing the HPT entries, and the rest of the code continues to use
the old format.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

abb7c7dd

21 11月, 2016 3 次提交

KVM: PPC: Book3S HV: Add a per vcpu cache for recently page faulted MMIO entries · a56ee9f8

由 Yongji Xie 提交于 11月 04, 2016

This keeps a per vcpu cache for recently page faulted MMIO entries.
On a page fault, if the entry exists in the cache, we can avoid some
time-consuming paths, for example, looking up HPT, locking HPTE twice
and searching mmio gfn from memslots, then directly call
kvmppc_hv_emulate_mmio().

In current implenment, we limit the size of cache to four. We think
it's enough to cover the high-frequency MMIO HPTEs in most case.
For example, considering the case of using virtio device, for virtio
legacy devices, one HPTE could handle notifications from up to
1024 (64K page / 64 byte Port IO register) devices, so one cache entry
is enough; for virtio modern devices, we always need one HPTE to handle
notification for each device because modern device would use a 8M MMIO
register to notify host instead of Port IO register, typically the
system's configuration should not exceed four virtio devices per
vcpu, four cache entry is also enough in this case. Of course, if needed,
we could also modify the macro to a module parameter in the future.
Signed-off-by: NYongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

a56ee9f8

KVM: PPC: Book3S HV: Clear the key field of HPTE when the page is paged out · f0585982

由 Yongji Xie 提交于 11月 04, 2016

Currently we mark a HPTE for emulated MMIO with HPTE_V_ABSENT bit
set as well as key 0x1f. However, those HPTEs may be conflicted with
the HPTE for real guest RAM page HPTE with key 0x1f when the page
get paged out.

This patch clears the key field of HPTE when the page is paged out,
then recover it when HPTE is re-established.
Signed-off-by: NYongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

f0585982

KVM: PPC: Book3S HV: Fix sparse static warning · 025c9511

由 Daniel Axtens 提交于 10月 10, 2016

Squash a couple of sparse warnings by making things static.

Build tested.
Signed-off-by: NDaniel Axtens <dja@axtens.net>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

025c9511

01 5月, 2016 1 次提交

powerpc/mm: Drop WIMG in favour of new constants · 30bda41a

由 Aneesh Kumar K.V 提交于 4月 29, 2016

PowerISA 3.0 introduces two pte bits with the below meaning for radix:
  00 -> Normal Memory
  01 -> Strong Access Order (SAO)
  10 -> Non idempotent I/O (Cache inhibited and guarded)
  11 -> Tolerant I/O (Cache inhibited)

We drop the existing WIMG bits in the Linux page table in favour of the
above constants. We loose _PAGE_WRITETHRU with this conversion. We only
use writethru via pgprot_cached_wthru() which is used by
fbdev/controlfb.c which is Apple control display and also PPC32.

With respect to _PAGE_COHERENCE, we have been marking hpte always
coherent for some time now. htab_convert_pte_flags() always added
HPTE_R_M.

NOTE: KVM changes need closer review.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

30bda41a

03 3月, 2016 1 次提交

powerpc/mm: Move hash related mmu-*.h headers to book3s/ · f64e8084

由 Aneesh Kumar K.V 提交于 3月 01, 2016

No code changes.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f64e8084

21 10月, 2015 1 次提交

KVM: PPC: Book3S HV: Don't fall back to smaller HPT size in allocation ioctl · 572abd56

由 Paul Mackerras 提交于 9月 29, 2015

Currently the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested
size of HPT, and if that is not possible, then try to allocate smaller
sizes (by factors of 2) until either a minimum is reached or the
allocation succeeds. This is not ideal for userspace, particularly in
migration scenarios, where the destination VM really does require the
size requested. Also, the minimum HPT size of 256kB may be
insufficient for the guest to run successfully.

This removes the fallback to smaller sizes on allocation failure for
the KVM_PPC_ALLOCATE_HTAB ioctl. The fallback still exists for the
case where the HPT is allocated at the time the first VCPU is run, if
no HPT has been allocated by ioctl by that time.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

572abd56

12 10月, 2015 1 次提交

powerpc/mm: Differentiate between hugetlb and THP during page walk · 891121e6

由 Aneesh Kumar K.V 提交于 10月 09, 2015

We need to properly identify whether a hugepage is an explicit or
a transparent hugepage in follow_huge_addr(). We used to depend
on hugepage shift argument to do that. But in some case that can
result in wrong results. For ex:

On finding a transparent hugepage we set hugepage shift to PMD_SHIFT.
But we can end up clearing the thp pte, via pmdp_huge_get_and_clear.
We do prevent reusing the pfn page via the usage of
kick_all_cpus_sync(). But that happens after we updated the pte to 0.
Hence in follow_huge_addr() we can find hugepage shift set, but transparent
huge page check fail for a thp pte.

NOTE: We fixed a variant of this race against thp split in commit
691e95fd
("powerpc/mm/thp: Make page table walk safe against thp split/collapse")

Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in
follow_page_mask occasionally.

In the long term, we may want to switch ppc64 64k page size config to
enable CONFIG_ARCH_WANT_GENERAL_HUGETLB
Reported-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

891121e6

22 8月, 2015 1 次提交

KVM: PPC: Book3S HV: Fix bug in dirty page tracking · 08fe1e7b

由 Paul Mackerras 提交于 6月 24, 2015

This fixes a bug in the tracking of pages that get modified by the
guest. If the guest creates a large-page HPTE, writes to memory
somewhere within the large page, and then removes the HPTE, we only
record the modified state for the first normal page within the large
page, when in fact the guest might have modified some other normal
page within the large page.

To fix this we use some unused bits in the rmap entry to record the
order (log base 2) of the size of the page that was modified, when
removing an HPTE. Then in kvm_test_clear_dirty_npages() we use that
order to return the correct number of modified pages.

The same thing could in principle happen when removing a HPTE at the
host's request, i.e. when paging out a page, except that we never
page out large pages, and the guest can only create large-page HPTEs
if the guest RAM is backed by large pages. However, we also fix
this case for the sake of future-proofing.

The reference bit is also subject to the same loss of information. We
don't make the same fix here for the reference bit because there isn't
an interface for userspace to find out which pages the guest has
referenced, whereas there is one for userspace to find out which pages
the guest has modified. Because of this loss of information, the
kvm_age_hva_hv() and kvm_test_age_hva_hv() functions might incorrectly
say that a page has not been referenced when it has, but that doesn't
matter greatly because we never page or swap out large pages.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

08fe1e7b

26 5月, 2015 1 次提交

KVM: use kvm_memslots whenever possible · 9f6b8029

由 Paolo Bonzini 提交于 5月 17, 2015

kvm_memslots provides lockdep checking.  Use it consistently instead of
explicit dereferencing of kvm->memslots.
Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9f6b8029

21 4月, 2015 3 次提交

KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT · e23a808b

由 Paul Mackerras 提交于 3月 28, 2015

This creates a debugfs directory for each HV guest (assuming debugfs
is enabled in the kernel config), and within that directory, a file
by which the contents of the guest's HPT (hashed page table) can be
read.  The directory is named vmnnnn, where nnnn is the PID of the
process that created the guest.  The file is named "htab".  This is
intended to help in debugging problems in the host's management
of guest memory.

The contents of the file consist of a series of lines like this:

  3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196

The first field is the index of the entry in the HPT, the second and
third are the HPT entry, so the third entry contains the real page
number that is mapped by the entry if the entry's valid bit is set.
The fourth field is the guest's view of the second doubleword of the
entry, so it contains the guest physical address.  (The format of the
second through fourth fields are described in the Power ISA and also
in arch/powerpc/include/asm/mmu-hash64.h.)
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

e23a808b

KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte · a4bd6eb0

由 Aneesh Kumar K.V 提交于 3月 20, 2015

This adds helper routines for locking and unlocking HPTEs, and uses
them in the rest of the code.  We don't change any locking rules in
this patch.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

a4bd6eb0

KVM: PPC: Book3S HV: Remove RMA-related variables from code · 31037eca

由 Aneesh Kumar K.V 提交于 3月 20, 2015

We don't support real-mode areas now that 970 support is removed.
Remove the remaining details of rma from the code.  Also rename
rma_setup_done to hpte_setup_done to better reflect the changes.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

31037eca

17 4月, 2015 2 次提交

powerpc/mm/thp: Return pte address if we find trans_splitting. · 7d6e7f7f

由 Aneesh Kumar K.V 提交于 3月 30, 2015

For THP that is marked trans splitting, we return the pte.
This require the callers to handle the pmd_trans_splitting scenario,
if they care. All the current callers are either looking at pfn or
write_ok, hence we don't need to update them.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7d6e7f7f

powerpc/mm/thp: Make page table walk safe against thp split/collapse · 691e95fd

由 Aneesh Kumar K.V 提交于 3月 30, 2015

We can disable a THP split or a hugepage collapse by disabling irq.
We do send IPI to all the cpus in the early part of split/collapse,
and disabling local irq ensure we don't make progress with
split/collapse. If the THP is getting split we return NULL from
find_linux_pte_or_hugepte(). For all the current callers it should be ok.
We need to be careful if we want to use returned pte_t pointer outside
the irq disabled region. W.r.t to THP split, the pfn remains the same,
but then a hugepage collapse will result in a pfn change. There are
few steps we can take to avoid a hugepage collapse.One way is to take page
reference inside the irq disable region. Other option is to take
mmap_sem so that a parallel collapse will not happen. We can also
disable collapse by taking pmd_lock. Another method used by kvm
subsystem is to check whether we had a mmu_notifer update in between
using mmu_notifier_retry().
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

691e95fd

17 12月, 2014 2 次提交

KVM: PPC: Book3S HV: Remove code for PPC970 processors · c17b98cf

由 Paul Mackerras 提交于 12月 03, 2014

This removes the code that was added to enable HV KVM to work
on PPC970 processors.  The PPC970 is an old CPU that doesn't
support virtualizing guest memory.  Removing PPC970 support also
lets us remove the code for allocating and managing contiguous
real-mode areas, the code for the !kvm->arch.using_mmu_notifiers
case, the code for pinning pages of guest memory when first
accessed and keeping track of which pages have been pinned, and
the code for handling H_ENTER hypercalls in virtual mode.

Book3S HV KVM is now supported only on POWER7 and POWER8 processors.
The KVM_CAP_PPC_RMA capability now always returns 0.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

c17b98cf

KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions · 3c78f78a

由 Suresh E. Warrier 提交于 12月 03, 2014

This patch adds trace points in the guest entry and exit code and also
for exceptions handled by the host in kernel mode - hypercalls and page
faults. The new events are added to /sys/kernel/debug/tracing/events
under a new subsystem called kvm_hv.
Acked-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NSuresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: NAlexander Graf <agraf@suse.de>

3c78f78a

15 12月, 2014 2 次提交

KVM: PPC: Book3S HV: ptes are big endian · ffada016

由 Cédric Le Goater 提交于 11月 21, 2014

When being restored from qemu, the kvm_get_htab_header are in native
endian, but the ptes are big endian.

This patch fixes restore on a KVM LE host. Qemu also needs a fix for
this :

     http://lists.nongnu.org/archive/html/qemu-ppc/2014-11/msg00008.htmlSigned-off-by: NCédric Le Goater <clg@fr.ibm.com>
Signed-off-by: NAlexander Graf <agraf@suse.de>

ffada016

KVM: PPC: Book3S HV: Add missing HPTE unlock · f6fb9e84

由 Aneesh Kumar K.V 提交于 11月 05, 2014

In kvm_test_clear_dirty(), if we find an invalid HPTE we move on to the
next HPTE without unlocking the invalid one.  In fact we should never
find an invalid and unlocked HPTE in the rmap chain, but for robustness
we should unlock it.  This adds the missing unlock.
Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

f6fb9e84

24 9月, 2014 1 次提交

kvm: Fix page ageing bugs · 57128468

由 Andres Lagar-Cavilla 提交于 9月 22, 2014

1. We were calling clear_flush_young_notify in unmap_one, but we are
within an mmu notifier invalidate range scope. The spte exists no more
(due to range_start) and the accessed bit info has already been
propagated (due to kvm_pfn_set_accessed). Simply call
clear_flush_young.

2. We clear_flush_young on a primary MMU PMD, but this may be mapped
as a collection of PTEs by the secondary MMU (e.g. during log-dirty).
This required expanding the interface of the clear_flush_young mmu
notifier, so a lot of code has been trivially touched.

3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate
the access bit by blowing the spte. This requires proper synchronizing
with MMU notifier consumers, like every other removal of spte's does.
Signed-off-by: NAndres Lagar-Cavilla <andreslc@google.com>
Acked-by: NRik van Riel <riel@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

57128468

03 9月, 2014 1 次提交

powerpc/kvm/cma: Fix panic introduces by signed shift operation · 02a68d05

由 Laurent Dufour 提交于 9月 02, 2014

fc95ca72 introduces a memset in
kvmppc_alloc_hpt since the general CMA doesn't clear the memory it
allocates.

However, the size argument passed to memset is computed from a signed value
and its signed bit is extended by the cast the compiler is doing. This lead
to extremely large size value when dealing with order value >= 31, and
almost all the memory following the allocated space is cleaned. As a
consequence, the system is panicing and may even fail spawning the kdump
kernel.

This fix makes use of an unsigned value for the memset's size argument to
avoid sign extension. Among this fix, another shift operation which may
lead to signed extended value too is also fixed.

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Alexander Graf <agraf@suse.de>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

02a68d05

07 8月, 2014 1 次提交

PPC, KVM, CMA: use general CMA reserved area management framework · fc95ca72

由 Joonsoo Kim 提交于 8月 06, 2014

Now, we have general CMA reserved area management framework, so use it
for future maintainabilty.  There is no functional change.
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: NMichal Nazarewicz <mina86@mina86.com>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Tested-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Gleb Natapov <gleb@kernel.org>
Acked-by: NMarek Szyprowski <m.szyprowski@samsung.com>
Tested-by: NMarek Szyprowski <m.szyprowski@samsung.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fc95ca72

28 7月, 2014 1 次提交

KVM: PPC: Allow kvmppc_get_last_inst() to fail · 51f04726

由 Mihai Caraman 提交于 7月 23, 2014

On book3e, guest last instruction is read on the exit path using load
external pid (lwepx) dedicated instruction. This load operation may fail
due to TLB eviction and execute-but-not-read entries.

This patch lay down the path for an alternative solution to read the guest
last instruction, by allowing kvmppc_get_lat_inst() function to fail.
Architecture specific implmentations of kvmppc_load_last_inst() may read
last guest instruction and instruct the emulation layer to re-execute the
guest in case of failure.

Make kvmppc_get_last_inst() definition common between architectures.
Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: NAlexander Graf <agraf@suse.de>

51f04726

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功