1. 31 Aug 2017, 1 commit
  2. 28 Apr 2017, 1 commit
  3. 01 Apr 2017, 2 commits
  4. 31 Mar 2017, 4 commits
  5. 10 Feb 2017, 1 commit
    • powerpc/pseries: Add support for hash table resizing · dbcf929c
      Committed by David Gibson
      This adds support for using two hypercalls to change the size of the
      main hash page table while running as a PAPR guest. For now these
      hypercalls are only in experimental qemu versions.
      
      The interface is two-part: first, H_RESIZE_HPT_PREPARE is used to
      allocate and prepare the new hash table. This may be slow, but can be
      done asynchronously. Then, H_RESIZE_HPT_COMMIT is used to switch to the
      new hash table. This requires that no CPUs be concurrently updating the
      HPT, and so must be run under stop_machine().
      
      This also adds a debugfs file which can be used to manually control
      HPT resizing, for testing purposes (see the sketch below).
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Paul Mackerras <paulus@samba.org>
      [mpe: Rename the debugfs file to "hpt_order"]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
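      The following is a minimal sketch (not the actual kernel implementation) of the
      two-phase resize flow described above, driven from the guest side. The wrapper
      names resize_hpt_sketch() and commit_fn() are hypothetical; plpar_hcall_norets(),
      H_IS_LONG_BUSY(), stop_machine() and the two hypercall numbers are existing
      interfaces named in or implied by the commit. Retry/backoff and error handling
      are simplified.

      static int commit_fn(void *data)
      {
              unsigned long shift = *(unsigned long *)data;

              /* Runs with all other CPUs stopped, so nothing else updates the HPT. */
              if (plpar_hcall_norets(H_RESIZE_HPT_COMMIT, 0, shift) != H_SUCCESS)
                      return -EIO;
              return 0;
      }

      static int resize_hpt_sketch(unsigned long shift)
      {
              long rc;

              /* Phase 1: allocate and prepare the new HPT; may be slow. */
              do {
                      rc = plpar_hcall_norets(H_RESIZE_HPT_PREPARE, 0, shift);
              } while (H_IS_LONG_BUSY(rc));

              if (rc != H_SUCCESS)
                      return -EIO;

              /* Phase 2: switch to the new HPT with no concurrent HPT updates. */
              return stop_machine(commit_fn, &shift, NULL);
      }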
  6. 30 Jan 2017, 1 commit
    • powerpc/mm/hash: Properly mask the ESID bits when building proto VSID · 79270e0a
      Committed by Aneesh Kumar K.V
      The proto VSID is built using both the MMU context id and the effective
      segment ID (ESID). We should not have overlapping bits between those;
      that could result in a VSID collision. The current code missed masking
      the top bits of the ESID, which means that for kernel addresses we
      ended up using the top 4 bits of the ESID as part of the proto VSID,
      which is wrong.
      
      The current code uses the top 4 context values (0x7fffc - 0x7ffff) for
      the kernel. With those context IDs used for the kernel, we don't run
      into VSID collisions, because we get the same proto VSID whether or not
      we mask the ESID bits. For example:
      
        ea         = 0xf000000000000000
        context    = 0x7ffff

        without masking:
        proto_vsid = (0x7ffff << 6) | (0xf000000000000000 >> 40)
                   = 0x1ffffc0 | 0xf00000
                   = 0x1ffffc0

        with masking:
        proto_vsid = (0x7ffff << 6) | ((0xf000000000000000 >> 40) & 0x3f)
                   = 0x1ffffc0 | (0xf00000 & 0x3f)
                   = 0x1ffffc0 | 0
                   = 0x1ffffc0
      
      So although there is no actual bug, the code is overly subtle, so fix
      it to save ourselves pain in the future (the masked construction is
      sketched below).
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
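      As a rough illustration only, here is a minimal sketch of the masked proto-VSID
      construction for 1T segments, using the shift (40) and mask (0x3f) from the
      example above. The SKETCH_* constants and the helper name are made up for this
      sketch; the real kernel uses its own SLB/VSID macros.

      #define SKETCH_SID_SHIFT_1T     40
      #define SKETCH_ESID_BITS_1T     6
      #define SKETCH_ESID_MASK_1T     ((1UL << SKETCH_ESID_BITS_1T) - 1)  /* 0x3f */

      static unsigned long proto_vsid_1t(unsigned long context, unsigned long ea)
      {
              /* Mask the ESID so it cannot spill into the context bits. */
              return (context << SKETCH_ESID_BITS_1T) |
                     ((ea >> SKETCH_SID_SHIFT_1T) & SKETCH_ESID_MASK_1T);
      }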
  7. 16 Nov 2016, 1 commit
    • powerpc/64: Simplify adaptation to new ISA v3.00 HPTE format · 6b243fcf
      Committed by Paul Mackerras
      This changes the way that we support the new ISA v3.00 HPTE format.
      Instead of adapting everything that uses HPTE values to handle either
      the old format or the new format, depending on which CPU we are on,
      we now convert explicitly between old and new formats if necessary
      in the low-level routines that actually access HPTEs in memory.
      This limits the amount of code that needs to know about the new
      format and makes the conversions explicit (the pattern is sketched
      below).  This is OK because the old format contains all the
      information that is in the new format.
      
      This also fixes operation under a hypervisor, because the H_ENTER
      hypercall (and other hypercalls that deal with HPTEs) will continue
      to require the HPTE value to be supplied in the old format.  At
      present the kernel will not boot in HPT mode on POWER9 under a
      hypervisor.
      
      This fixes and partially reverts commit 50de596d
      ("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29).
      
      Fixes: 50de596d ("powerpc/mm/hash: Add support for Power9 Hash")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
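      A rough sketch of the pattern described above, not the actual kernel routine:
      the format conversion happens only at the point where the HPTE is written to
      memory, so the rest of the code keeps using the old layout. The function name
      is made up; hpte_old_to_new_v()/hpte_old_to_new_r() are the conversion helpers
      this change introduces, and the valid-bit ordering of a real insert is omitted.

      static void write_hpte_sketch(__be64 *hptep, unsigned long v, unsigned long r)
      {
              if (cpu_has_feature(CPU_FTR_ARCH_300)) {
                      /* Convert to the ISA v3.00 layout only at the memory interface. */
                      r = hpte_old_to_new_r(v, r);
                      v = hpte_old_to_new_v(v);
              }

              hptep[1] = cpu_to_be64(r);
              hptep[0] = cpu_to_be64(v);
      }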
  8. 09 Sep 2016, 1 commit
    • powerpc/mm: Speed up computation of base and actual page size for a HPTE · 0eeede0c
      Committed by Paul Mackerras
      This replaces a 2-D search through an array with a simple 8-bit table
      lookup for determining the actual and/or base page size for an HPT
      entry (see the sketch below).
      
      The encoding in the second doubleword of the HPTE is designed to encode
      the actual and base page sizes without using any more bits than would be
      needed for a 4k page number, by using between 1 and 8 low-order bits of
      the RPN (real page number) field to encode the page sizes.  A single
      "large page" bit in the first doubleword indicates that these low-order
      bits are to be interpreted like this.
      
      We can determine the page sizes by using the low-order 8 bits of the RPN
      to look up a 256-entry table.  For actual page sizes less than 1MB, some
      of the upper bits of these 8 bits are going to be real address bits, but
      we can cope with that by replicating the entries for those smaller page
      sizes.
      
      While we're at it, let's move the hpte_page_size() and hpte_base_page_size()
      functions from a KVM-specific header to a header for 64-bit HPT systems,
      since this computation doesn't have anything specifically to do with KVM.
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
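      A simplified sketch of the lookup described above: the real kernel packs base
      and actual page-size indices into one byte per table entry, while this version
      just stores the actual page shift directly. The table contents, its name and
      the function name are illustrative; HPTE_V_LARGE and HPTE_R_RPN_SHIFT are the
      usual HPTE field definitions.

      static unsigned char actual_shift_tbl[256];     /* filled at boot from the LP encodings */

      static unsigned long hpte_actual_page_size_sketch(unsigned long v, unsigned long r)
      {
              if (!(v & HPTE_V_LARGE))
                      return 1UL << 12;       /* not a large page: plain 4k */

              /* The low-order 8 bits of the RPN select the (replicated) entry. */
              return 1UL << actual_shift_tbl[(r >> HPTE_R_RPN_SHIFT) & 0xff];
      }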
  9. 01 Aug 2016, 2 commits
  10. 26 Jul 2016, 2 commits
  11. 21 Jul 2016, 1 commit
  12. 14 Jun 2016, 2 commits
  13. 01 May 2016, 4 commits
  14. 03 Mar 2016, 1 commit
  15. 02 Mar 2016, 1 commit
  16. 22 Feb 2016, 1 commit
    • powerpc: Add POWER9 cputable entry · c3ab300e
      Committed by Michael Neuling
      Add a cputable entry for POWER9.  More code is required to actually
      boot and run on a POWER9, but this gets the base piece in, which we
      can start building on.

      The entry is copied over from POWER8, except that it:
      - Adds a new CPU_FTR_ARCH_300 bit to start hanging new architecture
        features from (in subsequent patches); a minimal example follows
        this entry.
      - Advertises the new user feature bits PPC_FEATURE2_ARCH_3_00 &
        HAS_IEEE128 when on POWER9.
      - Drops CPU_FTR_SUBCORE.
      - Drops PMU code and machine check.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
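      Illustrative only: once the cputable entry advertises CPU_FTR_ARCH_300, later
      patches can gate ISA v3.00 behaviour on it with the existing cpu_has_feature()
      helper, along the lines of the sketch below (the function name is made up).

      static bool on_isa_v3_00(void)
      {
              /* True only when running on a CPU whose cputable entry sets the bit. */
              return cpu_has_feature(CPU_FTR_ARCH_300);
      }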
  17. 14 Dec 2015, 1 commit
  18. 12 Oct 2015, 1 commit
    • powerpc/mm: Differentiate between hugetlb and THP during page walk · 891121e6
      Committed by Aneesh Kumar K.V
      We need to properly identify whether a hugepage is an explicit or
      a transparent hugepage in follow_huge_addr(). We used to depend
      on the hugepage shift argument to do that, but in some cases that
      gives the wrong result. For example:
      
      On finding a transparent hugepage we set the hugepage shift to PMD_SHIFT.
      But we can end up clearing the THP pte via pmdp_huge_get_and_clear().
      We do prevent reuse of the pfn page via kick_all_cpus_sync(), but that
      happens after we have updated the pte to 0. Hence in follow_huge_addr()
      we can find the hugepage shift set while the transparent huge page
      check fails for a THP pte.
      
      NOTE: We fixed a variant of this race against THP split in commit
      691e95fd ("powerpc/mm/thp: Make page table walk safe against thp
      split/collapse").
      
      Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in
      follow_page_mask occasionally.
      
      In the long term, we may want to switch the ppc64 64K page size config
      to enable CONFIG_ARCH_WANT_GENERAL_HUGETLB. A sketch of the walker
      usage this change enables follows this entry.
      Reported-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
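      A sketch of the caller-side pattern this change enables: the page-table walker
      reports an explicit is_thp flag instead of the caller inferring THP from the
      hugepage shift. classify_huge_mapping() is a made-up helper and the walker
      signature shown should be treated as illustrative.

      static int classify_huge_mapping(struct mm_struct *mm, unsigned long addr)
      {
              bool is_thp;
              unsigned int shift;
              pte_t *ptep;

              ptep = __find_linux_pte_or_hugepte(mm->pgd, addr, &is_thp, &shift);
              if (!ptep || !shift)
                      return 0;               /* not a huge mapping */

              return is_thp ? 1 : 2;          /* 1 = transparent, 2 = explicit hugetlb */
      }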
  19. 11 Jun 2015, 1 commit
    • powerpc/mmu: Add userspace-to-physical addresses translation cache · 15b244a8
      Committed by Alexey Kardashevskiy
      We are adding support for DMA memory pre-registration to be used in
      conjunction with VFIO. The idea is that userspace which is going to
      run a guest may want to pre-register a userspace memory region so
      that it all gets pinned once and never goes away. Having done this,
      a hypervisor will not have to pin/unpin pages on every DMA map/unmap
      request. This also helps when the same memory would otherwise be
      pinned multiple times.
      
      Another use is in-kernel real-mode (MMU off) acceleration of DMA
      requests, where real-time translation of guest physical to host
      physical addresses is non-trivial and may fail because Linux ptes may
      be temporarily invalid. Also, by caching host physical addresses
      (compared to just pinning at the start and then walking the page table
      again on every H_PUT_TCE), we can be sure that the addresses we put
      into the TCE table are the ones we already pinned.
      
      This adds a list of memory regions to mm_context_t. Each region
      consists of a header and a list of physical addresses. This adds an
      API to:
      1. register/unregister memory regions;
      2. do final cleanup (which puts all pre-registered pages);
      3. do userspace-to-physical address translation;
      4. manage usage counters; multiple registration of the same memory
         is allowed (once per container).
      
      This implements two counters per registered memory region (see the
      sketch below):
      - @mapped: incremented on every DMA mapping and decremented on
        unmapping; initialized to 1 when a region is registered; once it
        becomes zero, no more mappings are allowed;
      - @used: incremented on every "register" ioctl and decremented on
        "unregister"; unregistration is allowed for DMA-mapped regions
        unless it is the very last reference. For the very last reference
        this checks whether the region is still mapped and, if so, returns
        -EBUSY so that userspace knows the memory is still pinned and
        unregistration needs to be retried; @used remains 1.
      
      Host physical addresses are stored in a vmalloc'ed array. In order to
      access these in real mode (MMU off), there is a real_vmalloc_addr()
      helper. The in-kernel acceleration patch set will move it from KVM to
      MMU code.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
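      An illustrative sketch of the per-region bookkeeping described above. All
      structure and field names here are hypothetical; the real type lives in the
      powerpc MMU context code and is laid out differently.

      struct mem_region_sketch {
              struct list_head next;          /* chained off mm_context_t */
              unsigned long ua;               /* userspace address of the region */
              unsigned long entries;          /* number of pinned pages */
              atomic64_t mapped;              /* DMA mappings; starts at 1, 0 = going away */
              unsigned long used;             /* "register" ioctl references */
              unsigned long *hpas;            /* vmalloc'ed host physical addresses */
      };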
  20. 17 Mar 2015, 1 commit
  21. 05 Dec 2014, 1 commit
    • powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault · aefa5688
      Committed by Aneesh Kumar K.V
      updatepp can get called for a no-HPTE fault when we find from the
      Linux page table that the translation was hashed before. In that case
      we are sure that there is no existing translation, hence we can avoid
      doing the tlbie (see the sketch below).
      
      We could possibly race with a parallel fault filling the TLB. But
      that should be OK, because updatepp is only ever relaxing permissions.
      We also look at the Linux pte permission bits when filling the hash
      pte permission bits, and we hold the Linux pte busy bits while
      inserting/updating a hash pte entry, hence a parallel update of the
      Linux pte is not possible. On the other hand, mprotect involves
      ptep_modify_prot_start, which causes an hpte invalidate rather than
      an updatepp.
      
      Performance numbers:
      We use random_access_bench, written by Anton.

      Kernel with THP disabled and a smaller hash page table size.

      Without the fix:

          86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
           2.10%  random_access_b  random_access_bench              [.] doit
           1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
           1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
           1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
           0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
           0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
      
      With the fix:
      
          27.54%  random_access_b  random_access_bench              [.] doit
          22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
           3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
           1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
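      A rough sketch of the optimisation described above, not the actual backend
      routine: the fault path passes a flag down when it is handling a no-HPTE
      fault, and updatepp skips the tlbie in that case because no stale translation
      can be cached. The function body is illustrative; HPTE_NOHPTE_UPDATE and
      HPTE_LOCAL_UPDATE follow the flags this change introduces.

      static long hpte_updatepp_sketch(unsigned long slot, unsigned long newpp,
                                       unsigned long vpn, int bpsize, int ssize,
                                       unsigned long flags)
      {
              /* ... locate the HPTE in its group and relax the pp bits ... */

              if (!(flags & HPTE_NOHPTE_UPDATE)) {
                      /*
                       * Only invalidate the TLB when a previous translation may
                       * actually be cached, e.g. tlbie(vpn, bpsize, bpsize, ssize,
                       * flags & HPTE_LOCAL_UPDATE) in the native backend.
                       */
              }

              return 0;
      }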
  22. 08 Oct 2014, 2 commits
  23. 25 Sep 2014, 1 commit
  24. 28 Jul 2014, 1 commit
  25. 22 Jul 2014, 1 commit
    • powerpc: subpage_protect: Increase the array size to take care of 64TB · dad6f37c
      Committed by Aneesh Kumar K.V
      We now support a TASK_SIZE of 64TB, hence the array should have 8
      entries.

      This fixes the crash below:
      
      Unable to handle kernel paging request for data at address 0x000100bd
      Faulting instruction address: 0xc00000000004f914
      cpu 0x13: Vector: 300 (Data Access) at [c000000fea75fa90]
          pc: c00000000004f914: .sys_subpage_prot+0x2d4/0x5c0
          lr: c00000000004fb5c: .sys_subpage_prot+0x51c/0x5c0
          sp: c000000fea75fd10
         msr: 9000000000009032
         dar: 100bd
       dsisr: 40000000
        current = 0xc000000fea6ae490
        paca    = 0xc00000000fb8ab00   softe: 0        irq_happened: 0x00
          pid   = 8237, comm = a.out
      enter ? for help
      [c000000fea75fe30] c00000000000a164 syscall_exit+0x0/0x98
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  26. 11 Oct 2013, 1 commit
  27. 25 Jun 2013, 1 commit
  28. 21 Jun 2013, 1 commit
  29. 30 Apr 2013, 1 commit