提交 · 6d5315b3a60cc9b136b86b4dafeb73bf272d24c0 · openeuler / raspberrypi-kernel

20 6月, 2016 1 次提交

KVM: PPC: Book3S HV: Fix TB corruption in guest exit path on HMI interrupt · fd7bacbc

由 Mahesh Salgaonkar 提交于 5月 15, 2016

When a guest is assigned to a core it converts the host Timebase (TB)
into guest TB by adding guest timebase offset before entering into
guest. During guest exit it restores the guest TB to host TB. This means
under certain conditions (Guest migration) host TB and guest TB can differ.

When we get an HMI for TB related issues the opal HMI handler would
try fixing errors and restore the correct host TB value. With no guest
running, we don't have any issues. But with guest running on the core
we run into TB corruption issues.

If we get an HMI while in the guest, the current HMI handler invokes opal
hmi handler before forcing guest to exit. The guest exit path subtracts
the guest TB offset from the current TB value which may have already
been restored with host value by opal hmi handler. This leads to incorrect
host and guest TB values.

With split-core, things become more complex. With split-core, TB also gets
split and each subcore gets its own TB register. When a hmi handler fixes
a TB error and restores the TB value, it affects all the TB values of
sibling subcores on the same core. On TB errors all the thread in the core
gets HMI. With existing code, the individual threads call opal hmi handle
independently which can easily throw TB out of sync if we have guest
running on subcores. Hence we will need to co-ordinate with all the
threads before making opal hmi handler call followed by TB resync.

This patch introduces a sibling subcore state structure (shared by all
threads in the core) in paca which holds information about whether sibling
subcores are in Guest mode or host mode. An array in_guest[] of size
MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore.
The subcore id is used as index into in_guest[] array. Only primary
thread entering/exiting the guest is responsible to set/unset its
designated array element.

On TB error, we get HMI interrupt on every thread on the core. Upon HMI,
this patch will now force guest to vacate the core/subcore. Primary
thread from each subcore will then turn off its respective bit
from the above bitmap during the guest exit path just after the
guest->host partition switch is complete.

All other threads that have just exited the guest OR were already in host
will wait until all other subcores clears their respective bit.
Once all the subcores turn off their respective bit, all threads will
will make call to opal hmi handler.

It is not necessary that opal hmi handler would resync the TB value for
every HMI interrupts. It would do so only for the HMI caused due to
TB errors. For rest, it would not touch TB value. Hence to make things
simpler, primary thread would call TB resync explicitly once for each
core immediately after opal hmi handler instead of subtracting guest
offset from TB. TB resync call will restore the TB with host value.
Thus we can be sure about the TB state.

One of the primary threads exiting the guest will take up the
responsibility of calling TB resync. It will use one of the top bits
(bit 63) from subcore state flags bitmap to make the decision. The first
primary thread (among the subcores) that is able to set the bit will
have to call the TB resync. Rest all other threads will wait until TB
resync is complete. Once TB resync is complete all threads will then
proceed.
Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

fd7bacbc

20 5月, 2016 1 次提交

arch: fix has_transparent_hugepage() · fd8cfd30

由 Hugh Dickins 提交于 5月 19, 2016

I've just discovered that the useful-sounding has_transparent_hugepage()
is actually an architecture-dependent minefield: on some arches it only
builds if CONFIG_TRANSPARENT_HUGEPAGE=y, on others it's also there when
not, but on some of those (arm and arm64) it then gives the wrong
answer; and on mips alone it's marked __init, which would crash if
called later (but so far it has not been called later).

Straighten this out: make it available to all configs, with a sensible
default in asm-generic/pgtable.h, removing its definitions from those
arches (arc, arm, arm64, sparc, tile) which are served by the default,
adding #define has_transparent_hugepage has_transparent_hugepage to
those (mips, powerpc, s390, x86) which need to override the default at
runtime, and removing the __init from mips (but maybe that kind of code
should be avoided after init: set a static variable the first time it's
called).
Signed-off-by: NHugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Yang Shi <yang.shi@linaro.org>
Cc: Ning Qu <quning@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: Vineet Gupta <vgupta@synopsys.com>		[arch/arc]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[arch/s390]
Acked-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fd8cfd30

13 5月, 2016 1 次提交

KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2

由 Christian Borntraeger 提交于 5月 13, 2016

Some wakeups should not be considered a sucessful poll. For example on
s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
would be considered runnable - letting all vCPUs poll all the time for
transactional like workload, even if one vCPU would be enough.
This can result in huge CPU usage for large guests.
This patch lets architectures provide a way to qualify wakeups if they
should be considered a good/bad wakeups in regard to polls.

For s390 the implementation will fence of halt polling for anything but
known good, single vCPU events. The s390 implementation for floating
interrupts does a wakeup for one vCPU, but the interrupt will be delivered
by whatever CPU checks first for a pending interrupt. We prefer the
woken up CPU by marking the poll of this CPU as "good" poll.
This code will also mark several other wakeup reasons like IPI or
expired timers as "good". This will of course also mark some events as
not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
we prefer to not poll, unless we are really sure, though.

This patch successfully limits the CPU usage for cases like uperf 1byte
transactional ping pong workload or wakeup heavy workload like OLTP
while still providing a proper speedup.

This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
wakeups that are considered not good for polling.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
Cc: David Matlack <dmatlack@google.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
[Rename config symbol. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3491caf2

12 5月, 2016 1 次提交

kvm: introduce KVM_MAX_VCPU_ID · 0b1b1dfd

由 Greg Kurz 提交于 5月 09, 2016

The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest, and
also the upper limit for vCPU ids. This is okay for all archs except PowerPC
which can have higher ids, depending on the cpu/core/thread topology. In the
worst case (single threaded guest, host with 8 threads per core), it limits
the maximum number of vCPUS to KVM_MAX_VCPUS / 8.

This patch separates the vCPU numbering from the total number of vCPUs, with
the introduction of KVM_MAX_VCPU_ID, as the maximal valid value for vCPU ids
plus one.

The corresponding KVM_CAP_MAX_VCPU_ID allows userspace to validate vCPU ids
before passing them to KVM_CREATE_VCPU.

This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
Other archs continue to return KVM_MAX_VCPUS instead.
Suggested-by: NRadim Krcmar <rkrcmar@redhat.com>
Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0b1b1dfd

11 5月, 2016 26 次提交

powerpc/pci: Export pci_traverse_device_nodes() · cdddc577

由 Gavin Shan 提交于 5月 03, 2016

This renames traverse_pci_devices() to pci_traverse_device_nodes().
The function traverses all subordinate device nodes of the specified
one. Also, below cleanup applied to the function. No logical changes
introduced.

   * Rename "pre" to "fn".
   * Avoid assignment in if condition reported from checkpatch.pl.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

cdddc577

powerpc/pci: Introduce pci_remove_device_node_info() · de5a28ac

由 Gavin Shan 提交于 5月 03, 2016

This implements and exports pci_remove_device_node_info(). It's
used to remove the pdn (struct pci_dn) for the indicated device
node. The function is going to be used by PowerNV PCI hotplug
driver.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

de5a28ac

powerpc/pci: Export pci_add_device_node_info() · d8f66f41

由 Gavin Shan 提交于 5月 03, 2016

This renames update_dn_pci_info() to pci_add_device_node_info()
with corresponding adjustment on the parameter type and exports it.
The function is used to create pdn (struct pci_dn) for the indicated
device node. Another function add_pdn(), almost wrapper of
pci_add_device_node_info(), to be used in traverse_pci_devices(). No
logical changes introduced.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d8f66f41

powerpc/pci: Rename pcibios_find_pci_bus() · 3773dd25

由 Gavin Shan 提交于 5月 03, 2016

This renames pcibios_find_pci_bus() to pci_find_bus_by_node() to
avoid conflicts with those PCI subsystem weak function names, which
have prefix "pcibios". No logical changes introduced.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3773dd25

powerpc/pci: Rename pcibios_{add, remove}_pci_devices() · bd251b89

由 Gavin Shan 提交于 5月 03, 2016

This renames pcibios_{add,remove}_pci_devices() to avoid conflicts
with names of the weak functions in PCI subsystem, which have the
prefix "pcibios". No logical changes introduced.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-By: NAlistair Popple <alistair@popple.id.au>
Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bd251b89

powerpc/powernv: Data type unsigned int for PE number · 689ee8c9

由 Gavin Shan 提交于 5月 03, 2016

This changes the data type of PE number from "int" to "unsigned int"
in order to match the fact PE number is never negative:

   * The number of PE to which the specified PCI device is attached.
   * The PE number map for SRIOV VFs.
   * The returned PE number from pnv_ioda_alloc_pe().
   * The returned PE number from pnv_ioda2_pick_m64_pe().
Suggested-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-By: NAlistair Popple <alistair@popple.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

689ee8c9

powerpc/pci: Cleanup on struct pci_controller_ops · 062b26ba

由 Gavin Shan 提交于 5月 03, 2016

Each PHB has one instance of "struct pci_controller_ops" that includes
various callbacks called by PCI subsystem. In the definition of this
struct, some callbacks have explicit names for its arguments, but the
left don't have.

This adds all explicit names of the arguments to the callbacks in
"struct pci_controller_ops" so that the code looks consistent. Also,
argument name @dev is replaced by @pdev as the later one is the
preferred name for PCI device.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: NDaniel Axtens <dja@axtens.net>
Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

062b26ba

powerpc/mm/slice: Remove slice_mm_new_context() · 62ccf5bf

由 Aneesh Kumar K.V 提交于 5月 02, 2016

The usage in mm mmu_context_nohash.c is bogus, because we set the
context.id value to MMU_NO_CONTEXT 4 lines previously in the same
function, meaning slice_mm_new_context() will always be true.

The book3s 64 usage was removed in the previous commit. So remove it as
unused.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

62ccf5bf

powerpc/mm/radix: Document software bits for radix · 69dfbaeb

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Add #defines for Power ISA 3.0 software defined bits.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

69dfbaeb

powerpc/mm/radix: Add THP support for 4K linux page size · ab624762

由 Aneesh Kumar K.V 提交于 4月 29, 2016

This adds THP support for 4K Linux page size config with radix. We still
don't do THP with 4K Linux page size and hash page table. Hash page
table needs a 16MB hugepage and we can't do THP with 16MM hugepage and
4K Linux page size.

We add missing functions to 4K hash config to get it to build and
hash__has_transparent_hugepage() makes sure we don't enable THP for 4K
hash config. To catch wrong usage of THP related with 4K config, we add
BUG() in those dummy functions we added to get it compile.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ab624762

powerpc/mm/radix: Add radix THP callbacks · bde3eb62

由 Aneesh Kumar K.V 提交于 4月 29, 2016

The deposited pgtable_t is a pte fragment hence we cannot use page->lru
for linking then together. We use the first two 64 bits for pte fragment
as list_head type to link all deposited fragments together. On withdraw
we properly zero then out.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bde3eb62

powerpc/mm/thp: Abstraction for THP functions · 3df33f12

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3df33f12

powerpc/mm: THP is only available on hash64 as of now · 6a1ea362

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Only code movement in this patch. No functionality change.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6a1ea362

powerpc/mm/radix: Add hugetlb support 4K page size · c0a6c719

由 Aneesh Kumar K.V 提交于 4月 29, 2016

We have hugepage at the pmd level with 4K radix config. Hence we don't
need to use hugepd format with radix.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

c0a6c719

powerpc/mm: Add radix support for hugetlb · 48483760

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

48483760

powerpc/mm: pte_frag abstraction · 5ed7ecd0

由 Aneesh Kumar K.V 提交于 4月 29, 2016

In this patch we make the number of pte fragments per level 4 page table
page a variable. Radix level 4 table size is 256 bytes and hence we can
have 256 fragments per level 4 page. We don't update the fragment count
in this patch. We need to do performance measurements to find the right
value for fragment count.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5ed7ecd0

powerpc/mm: vmalloc abstraction in preparation for radix · d6a9996e

由 Aneesh Kumar K.V 提交于 4月 29, 2016

The vmalloc range differs between hash and radix config. Hence make
VMALLOC_START and related constants a variable which will be runtime
initialized depending on whether hash or radix mode is active.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d6a9996e

powerpc/mm: Add radix pgalloc details · a2f41eb9

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a2f41eb9

powerpc/mm: Make 4K and 64K use pte_t for pgtable_t · 934828ed

由 Aneesh Kumar K.V 提交于 4月 29, 2016

This patch switches 4K Linux page size config to use pte_t * type
instead of struct page * for pgtable_t. This simplifies the code a lot
and helps in consolidating both 64K and 4K page allocator routines. The
changes should not have any impact, because we already store physical
address in the upper level page table tree and that implies we already
do struct page * to physical address conversion.

One change to note here is we move the pgtable_page_dtor() call for
nohash to pte_fragment_free_mm(). The nohash related change is due to
the related changes in pgtable_64.c.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

934828ed

powerpc/mm: Rename function to indicate we are allocating fragments · 74701d59

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Only code cleanup. No functionality change.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

74701d59

powerpc/mm: Simplify the code dropping 4-level table #ifdef · bcbe7f77

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Simplify the code by dropping 4-level page table #ifdef. We are always
4-level now.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bcbe7f77

powerpc/mm: Revert changes made to nohash pgalloc-64.h · 27209206

由 Aneesh Kumar K.V 提交于 4月 29, 2016

This reverts pgalloc related changes WRT implementing 4-level page
table for 64K Linux page size and storing of physical address in higher
level page tables since they are only applicable to book3s64 variant
and we now have a separate copy for book3s64. This helps to keep these
headers simpler.

Cc: Scott Wood <scottwood@freescale.com>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

27209206

powerpc/mm: Copy pgalloc (part 2) · 75a9b8a6

由 Aneesh Kumar K.V 提交于 4月 29, 2016

This moves the nohash variant of pgalloc headers to nohash/ directory
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

75a9b8a6

powerpc/mm: Make a copy of pgalloc.h for 32 and 64 book3s · 101ad5c6

由 Aneesh Kumar K.V 提交于 4月 29, 2016

This patch start to make a book3s variant for pgalloc headers. We have
multiple book3s specific changes such as:
  * 4 level page table
  * store physical address in higher level table
  * use pte_t * for pgtable_t

Having a book3s64 specific variant helps to keep code simpler and remove
lots of #ifdef around code.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

101ad5c6

powerpc/mm/radix: Add MMU_FTR_RADIX · a8ed87c9

由 Aneesh Kumar K.V 提交于 4月 29, 2016

We are going to add asm changes in the follow up patches. Add the
feature bit now so that we can get it all build.

mpe: When CONFIG_PPC_RADIX_MMU=n we omit MMU_FTR_RADIX from the
MMU_FTRS_POSSIBLE mask. This allows the compiler to work out that those
checks will always be false and so the code can be elided completely.

Note we do *not* define MMU_FTR_RADIX to 0 in the RADIX_MMU=n case,
because that doesn't work with the ASM_FTR patching. In particular an
IF_SET section will result in a mask and value of zero, which is always
true, meaning the section *won't* be patched, which is the opposite of
what we want.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: NBalbir Singh <bsingharora@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a8ed87c9

powerpc/mm: Add mask of possible MMU features · 773edead

由 Michael Ellerman 提交于 5月 11, 2016

Follow the example of the cpu feature code, and add a mask of possible
MMU features, MMU_FTRS_POSSIBLE.

This is used in mmu_has_feature(), which allows the possible mask to act
as a shortcut for any features that are not possible, but still allows
the feature bit itself to be defined.

We will use this in the next commit to allow MMU_FTR_RADIX checks to be
elided when MMU_FTR_RADIX is not possible.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

773edead

02 5月, 2016 1 次提交

powerpc: Fix bad inline asm constraint in create_zero_mask() · b4c11211

由 Anton Blanchard 提交于 4月 30, 2016

In create_zero_mask() we have:

	addi	%1,%2,-1
	andc	%1,%1,%2
	popcntd	%0,%1

using the "r" constraint for %2. r0 is a valid register in the "r" set,
but addi X,r0,X turns it into an li:

	li	r7,-1
	andc	r7,r7,r0
	popcntd	r4,r7

Fix this by using the "b" constraint, for which r0 is not a valid
register.

This was found with a kernel build using gcc trunk, narrowed down to
when -frename-registers was enabled at -O2. It is just luck however
that we aren't seeing this on older toolchains.

Thanks to Segher for working with me to find this issue.

Cc: stable@vger.kernel.org
Fixes: d0cebfa6 ("powerpc: word-at-a-time optimization for 64-bit Little Endian")
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b4c11211

01 5月, 2016 9 次提交

powerpc/mm/radix: Add tlbflush routines · 1a472c9d

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Core kernel doesn't track the page size of the VA range that we are
invalidating. Hence we end up flushing TLB for the entire mm here. Later
patches will improve this.

We also don't flush page walk cache separetly instead use RIC=2 when
flushing TLB, because we do a MMU gather flush after freeing page table.

MMU_NO_CONTEXT is updated for hash.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

1a472c9d

powerpc/mm: Hash abstraction for tlbflush routines · 676012a6

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

676012a6

powerpc/mm/radix: Add mmu context handling callback for radix · 7e381c0f

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7e381c0f

powerpc/mm: Abstraction for switch_mmu_context() · d2adba3f

由 Aneesh Kumar K.V 提交于 4月 29, 2016

How we switch MMU context differs between hash and radix. For hash we
need to switch the SLB details and for radix we need to switch the PID
SPR.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d2adba3f

A
powerpc/mm/radix: Add radix callbacks for vmemmap and map_kernel page() · d9225ad9
由 Aneesh Kumar K.V 提交于 4月 29, 2016
```
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
```
d9225ad9

powerpc/mm: Abstraction for vmemmap and map_kernel_page() · 31a14fae

由 Aneesh Kumar K.V 提交于 4月 29, 2016

For hash we create vmemmap mapping using bolted hash page table entries.
For radix we fill the radix page table. The next patch will add the
radix details for creating vmemmap mappings.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

31a14fae

powerpc/mm/radix: Add radix callbacks for early init routines · 2bfd65e4

由 Aneesh Kumar K.V 提交于 4月 29, 2016

This adds routines for early setup for radix. We use device tree
property "ibm,processor-radix-AP-encodings" to find supported page
sizes. If we don't find the above we consider 64K and 4K as supported
page sizes.

We do map vmemap using 2M page size if we can. The linear mapping is
done such that we use required page size for that range. For example
memory of 3.5G is mapped such that we use 1G mapping till 3G range and
use 2M mapping for the rest.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

2bfd65e4

powerpc/mm: Abstract early MMU init in preparation for radix · 756d08d1

由 Aneesh Kumar K.V 提交于 4月 29, 2016

Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

756d08d1

powerpc/mm/radix: Add radix callback for pmd accessors · 6cc1a0ee

由 Aneesh Kumar K.V 提交于 4月 29, 2016

This only does 64K Linux page support for now. 64K hash Linux config
THP needs to differentiate it from hugetlb huge page because with THP we
need to track hash pte slot information with respect to each subpage.
This is not needed with hugetlb hugepage, because we don't do MPSS with
hugetlb.

Radix doesn't have any such restrictions.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6cc1a0ee