Commits · ac55c768143aa34cc3789c4820cbb0809a76fd9c · openeuler / raspberrypi-kernel

06 Oct, 2014 1 commit

sparc64: Switch to 4-level page tables. · ac55c768

Committed by David S. Miller on Sep 26, 2014

This has become necessary with chips that support more than 43-bits
of physical addressing.

Based almost entirely upon a patch by Bob Picco.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>

04 May, 2014 1 commit

sparc64: Fix huge PMD invalidation. · 51e5ef1b

Committed by David S. Miller on Apr 24, 2014

On sparc64 "present" and "valid" are separate PTE bits; this allows us to
naturally distinguish between the user explicitly asking for PROT_NONE
with mprotect() and other situations.

However we weren't handling this properly in the huge PMD paths.

First of all, the page table walker in the TSB miss path only checks
for _PAGE_PMD_HUGE.  So the generic pmdp_invalidate() would clear
_PAGE_PRESENT but the TLB miss paths would still load it into the TLB
as a valid huge PMD.

Fix this by clearing the valid bit in pmdp_invalidate(), and also
checking the valid bit in USER_PGTABLE_CHECK_PMD_HUGE using "brgez"
since _PAGE_VALID is bit 63 in both the sun4u and sun4v pte layouts.
Signed-off-by: David S. Miller <davem@davemloft.net>

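The reason a single "brgez" can test the valid bit is that bit 63 is the sign bit of a 64-bit register. The following is a minimal C sketch of that equivalence, not the kernel's code; the macro and function names are hypothetical, and only the bit position (63) comes from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: the valid bit is bit 63, as the message states. */
#define SKETCH_PAGE_VALID (UINT64_C(1) << 63)

/* Explicit mask test on the valid bit. */
static inline bool pte_valid_by_mask(uint64_t pte)
{
	return (pte & SKETCH_PAGE_VALID) != 0;
}

/*
 * Sign test: with bit 63 as the valid bit, "valid" is the same as
 * "negative when read as a signed 64-bit value". brgez takes its branch
 * when the register is >= 0, which here means the valid bit is clear.
 */
static inline bool pte_valid_by_sign(uint64_t pte)
{
	return (int64_t)pte < 0;
}
```

Both helpers agree for every input, which is why the assembler path needs no separate mask-and-compare step.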

14 Nov, 2013 1 commit

sparc64: Encode huge PMDs using PTE encoding. · a7b9403f

Committed by David S. Miller on Sep 26, 2013

Now that we have 64-bits for PMDs we can stop using special encodings
for the huge PMD values, and just put real PTEs in there.

We allocate a _PAGE_PMD_HUGE bit to distinguish between plain PMDs and
huge ones.  It is the same for both 4U and 4V PTE layouts.

We also use _PAGE_SPECIAL to indicate the splitting state, since a
huge PMD cannot also be special.

All of the PMD --> PTE translation code disappears, and most of the
huge PMD bit modifications and tests just degenerate into the PTE
operations.  In particular USER_PGTABLE_CHECK_PMD_HUGE becomes
trivial.

As a side effect, normal PMDs don't shift the physical address around.
This also speeds up the page table walks in the TLB miss paths since
they don't have to do the shifts any more.

Another non-trivial aspect is that pte_modify() has to be changed
to preserve the _PAGE_PMD_HUGE bits as well as the page size field
of the pte.
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Nov, 2013 2 commits

sparc64: Move to 64-bit PGDs and PMDs. · 2b77933c

Committed by David S. Miller on Sep 25, 2013

To make the page tables compact, we were using 32-bit PGDs and PMDs.
We only had to support <= 43 bits of physical addresses so this was
quite feasible.

In order to support larger physical addresses we have to move to
64-bit PGDs and PMDs.

Most of the changes are straightforward:

1) {pgd,pmd}_t --> unsigned long

2) Anything that tries to use plain "unsigned int" types with pgd/pmd
   values needs to be adjusted.  In particular things like "0U" become
   "0UL".

3) {PGDIR,PMD}_BITS decrease by one.

4) In the assembler page table walkers, use "ldxa" instead of "lduwa"
   and adjust the low bit masks to clear out the low 3 bits instead of
   just the low 2 bits during pgd/pmd address formation.

Also, use PTRS_PER_PGD and PTRS_PER_PMD in the sizing of the
swapper_{pg_dir,low_pmd_dir} arrays.

This patch does not try to take advantage of having 64-bits in the
PMDs to simplify the hugepage code, that will come in a subsequent
change.
Signed-off-by: David S. Miller <davem@davemloft.net>

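Items 3 and 4 above follow from simple arithmetic: doubling the entry size halves the number of entries per table, which removes one index bit and adds one bit of entry alignment. The sketch below works this through under the assumption that a PGD or PMD table occupies one 8KB page; the function names are hypothetical and this is not the kernel's code.

```c
#include <stdint.h>

/* Number of index bits needed to address every entry in one table. */
static inline unsigned int table_index_bits(uint64_t table_bytes,
					    uint64_t entry_bytes)
{
	uint64_t entries = table_bytes / entry_bytes;
	unsigned int bits = 0;

	while (entries > 1) {
		entries >>= 1;
		bits++;
	}
	return bits;
}

/* Mask that clears the low bits of a byte offset so it is entry-aligned. */
static inline uint64_t entry_align_mask(uint64_t entry_bytes)
{
	return ~(entry_bytes - 1);
}
```

With these assumptions, table_index_bits(8192, 4) is 11 and table_index_bits(8192, 8) is 10, while the alignment mask grows from clearing 2 low bits to clearing 3.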

sparc64: Move from 4MB to 8MB huge pages. · 37b3a8ff

Committed by David S. Miller on Sep 25, 2013

The impetus for this is that we would like to move to 64-bit PMDs and
PGDs, but that would result in only supporting a 42-bit address space
with the current page table layout.  It'd be nice to support at least
43-bits.

The reason we'd end up with only 42-bits after making PMDs and PGDs
64-bit is that we only use half-page sized PTE tables in order to make
PMDs line up to 4MB, the hardware huge page size we use.

So what we do here is we make huge pages 8MB, and fabricate them using
4MB hw TLB entries.

Facilitate this by providing a "REAL_HPAGE_SHIFT" which is used in
places that really need to operate on hardware 4MB pages.

Use full pages (512 entries) for PTE tables, and adjust PMD_SHIFT,
PGD_SHIFT, and the build time CPP test as needed.  Use a CPP test to
make sure REAL_HPAGE_SHIFT and the _PAGE_SZHUGE_* we use match up.

This makes the pgtable cache completely unused, so remove the code
managing it and the state used in mm_context_t.  Now we have fewer
spinlocks taken in the page table allocation path.

The technique we use to fabricate the 8MB pages is to transfer bit 22
from the missing virtual address into the PTE's physical address field.
That takes care of the transparent huge pages case.

For hugetlb, we fill things in at the PTE level and that code already
puts the sub huge page physical bits into the PTEs, based upon the
offset, so there is nothing special we need to do.  It all just works
out.

So, a small amount of complexity in the THP case, but this code is
about to get much simpler when we move to 64-bit PMDs as we can move
away from the fancy 32-bit huge PMD encoding and just put a real PTE
value in there.

With bug fixes and help from Bob Picco.
Signed-off-by: David S. Miller <davem@davemloft.net>

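The bit-22 transfer described above can be sketched in C as follows. This is an illustration only: it assumes the physical address sits at its natural bit positions inside the PTE and that the huge PTE points at an 8MB-aligned region, and the names are hypothetical rather than taken from the kernel.

```c
#include <stdint.h>

#define SKETCH_REAL_HPAGE_SHIFT 22	/* 4MB hardware page */
#define SKETCH_HALF_SELECT (UINT64_C(1) << SKETCH_REAL_HPAGE_SHIFT)

/*
 * Given a PTE value describing an 8MB-aligned region, return the PTE
 * for the 4MB half that contains vaddr: bit 22 of the missing virtual
 * address is copied into bit 22 of the PTE's physical address field.
 */
static inline uint64_t pte_for_4mb_half(uint64_t huge_pte, uint64_t vaddr)
{
	return (huge_pte & ~SKETCH_HALF_SELECT) | (vaddr & SKETCH_HALF_SELECT);
}
```

A miss in the lower half leaves the PTE unchanged, while a miss in the upper half yields a PTE offset by 4MB, so two hardware TLB entries together cover one 8MB huge page.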

21 Feb, 2013 1 commit

sparc64: Fix huge PMD to PTE translation for sun4u in TLB miss handler. · 76968ad2

Committed by David S. Miller on Feb 20, 2013

When we set the sun4u version of the PTE execute bit, it's:

	or	REG, _PAGE_EXEC_4U, REG

_PAGE_EXEC_4U is 0x1000, unfortunately the immediate field of the
'or' instruction is a signed 13-bit value.  So the above actually
assembles into:

	or	REG, -4096, REG

completely corrupting the final PTE value.

Set it with a:

	sethi	%hi(_PAGE_EXEC_4U), TMP
	or	REG, TMP, REG

sequence instead.

This fixes "git gc" crashes on sun4u machines.
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>

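The corruption described above comes from sign extension of the 13-bit immediate. The C sketch below models that arithmetic and the sethi-based fix; the function names are hypothetical, and the only values taken from the message are 0x1000 and the 13-bit field width.

```c
#include <stdint.h>

/* Sign-extend a 13-bit immediate field the way a simm13 operand is read. */
static inline int64_t sign_extend_simm13(uint32_t field)
{
	field &= 0x1fff;
	return (field & 0x1000) ? (int64_t)field - 0x2000 : (int64_t)field;
}

/* What "or REG, imm, REG" computes when imm is a 13-bit immediate field. */
static inline uint64_t or_with_simm13(uint64_t reg, uint32_t field)
{
	return reg | (uint64_t)sign_extend_simm13(field);
}

/* What "sethi %hi(value), TMP" leaves in TMP: bits 10..31, low 10 bits zero. */
static inline uint64_t sethi_hi(uint32_t value)
{
	return (uint64_t)(value >> 10) << 10;
}
```

Here sign_extend_simm13(0x1000) is -4096, so the "or" sets every bit from 12 upward, whereas sethi_hi(0x1000) is exactly 0x1000 and sets only the intended bit.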

09 Oct, 2012 3 commits

sparc64: Support transparent huge pages. · 9e695d2e

Committed by David Miller on Oct 08, 2012

This is relatively easy since PMDs now cover exactly 4MB of memory.

Our PMD entries are 32-bits each, so we use a special encoding.  The
lowest bit, PMD_ISHUGE, determines the interpretation.  This is possible
because sparc64's page tables are purely software entities so we can use
whatever encoding scheme we want.  We just have to make the TLB miss
assembler page table walkers aware of the layout.

set_pmd_at() works much like set_pte_at() but it has to operate in two
regimes.  In the first regime we are going to a huge page from a table of
non-huge PTEs, so we have to queue up TLB flushes based upon what mappings
are valid in the PTE table.  In the second regime we are going from
huge-page to non-huge-page, and in that case we need only queue up a
single TLB flush to push out the huge page mapping.

We still have 5 bits remaining in the huge PMD encoding so we can very
likely support any new pieces of THP state tracking that might get added
in the future.

With lots of help from Johannes Weiner.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sparc64: Document PGD and PMD layout. · dbc9fdf0

Committed by David Miller on Oct 08, 2012

We're going to be messing around with the PMD interpretation and layout
for the sake of transparent huge pages, so we better clearly document what
we're starting with.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sparc64: Halve the size of PTE tables · 56a70b8c

Committed by David Miller on Oct 08, 2012

The reason we want to do this is to facilitate transparent huge page
support.

Right now PMDs cover 8MB of address space, and our huge page size is 4MB.
The current transparent hugepage support is not able to handle HPAGE_SIZE
!= PMD_SIZE.

So make PTE tables be sized to half of a page instead of a full page.

We can still map properly the whole supported virtual address range which
on sparc64 requires 44 bits.  Add a compile time CPP test which ensures
that this requirement is always met.

There is a minor inefficiency added by this change.  We only use half of
the page for PTE tables.  It's not trivial to use only half of the page
yet still get all of the pgtable_page_{ctor,dtor}() stuff working
properly.  It is doable, and that will come in a subsequent change.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

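The 8MB and 4MB figures above can be checked with a short calculation. The sketch below assumes 8KB base pages and 8-byte PTEs, which are assumptions for illustration rather than values stated in this message; the function name is hypothetical.

```c
#include <stdint.h>

/* Bytes of virtual address space mapped by one PTE table (one PMD entry). */
static inline uint64_t pmd_span_bytes(uint64_t pte_table_bytes,
				      uint64_t pte_bytes,
				      uint64_t page_bytes)
{
	return (pte_table_bytes / pte_bytes) * page_bytes;
}
```

Under these assumptions a full-page table gives pmd_span_bytes(8192, 8, 8192), which is 8MB, and a half-page table gives pmd_span_bytes(4096, 8, 8192), which is 4MB and so matches the huge page size.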

05 Aug, 2011 1 commit

sparc: Access kernel TSB using physical addressing when possible. · 9076d0e7

Committed by David S. Miller on Aug 05, 2011

On sun4v this is basically required since we point the hypervisor and
the TSB walking hardware at these tables using physical addressing
too.
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Dec, 2008 1 commit

sparc64: Stop using memory barriers for atomics and locks. · 293666b7

Committed by David S. Miller on Nov 15, 2008

The kernel always executes in the TSO memory model now,
so none of this stuff is necessary any more.

With helpful feedback from Nick Piggin.
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Jul, 2008 1 commit

sparc, sparc64: use arch/sparc/include · a439fe51

Committed by Sam Ravnborg on Jul 27, 2008

The majority of this patch was created by the following script:

***
ASM=arch/sparc/include/asm
mkdir -p $ASM
git mv include/asm-sparc64/ftrace.h $ASM
git rm include/asm-sparc64/*
git mv include/asm-sparc/* $ASM
sed -ie 's/asm-sparc64/asm/g' $ASM/*
sed -ie 's/asm-sparc/asm/g' $ASM/*
***

The rest was an update of the top-level Makefile to use sparc
for header files when sparc64 is being built.
And a small fixlet to pick up the correct unistd.h from
sparc64 code.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>

18 Jul, 2008 1 commit

sparc: copy sparc64 specific files to asm-sparc · a00736e9

Committed by Sam Ravnborg on Jun 19, 2008

Used the following script to copy the files:
cd include
set -e
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
	if [ -f asm-sparc/$FILE ]; then
		echo $FILE exist in asm-sparc
	else
		git mv asm-sparc64/$FILE asm-sparc/$FILE
		printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
		git add asm-sparc64/$FILE
	fi
done
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>

29 May, 2007 1 commit

[SPARC64]: Fix two bugs wrt. kernel 4MB TSB. · 2d9e2763

Committed by David S. Miller on May 29, 2007

1) The TSB lookup was not using the correct hash mask.

2) It was not aligned on a boundary equal to its size,
   which is required by the sun4v Hypervisor.

wasn't having its return value checked, and that bug will be fixed up
as well in a subsequent changeset.
Signed-off-by: David S. Miller <davem@davemloft.net>

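Both fixes listed above rely on the table size being a power of two. The sketch below shows the generic form of the two checks: size alignment of the base address, and masking a page number down to a table index. It is an assumption-laden illustration with hypothetical names; the kernel's actual hash and the hypervisor's exact requirements may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static inline bool is_power_of_two(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* True if base sits on a boundary equal to the table's own size. */
static inline bool aligned_to_size(uint64_t base, uint64_t size_bytes)
{
	assert(is_power_of_two(size_bytes));
	return (base & (size_bytes - 1)) == 0;
}

/*
 * Direct-mapped index: page number masked by (nentries - 1). Using a
 * mask that does not match the table's real entry count is the kind of
 * bug item 1 describes.
 */
static inline uint64_t tsb_index(uint64_t vaddr, unsigned int page_shift,
				 uint64_t nentries)
{
	assert(is_power_of_two(nentries));
	return (vaddr >> page_shift) & (nentries - 1);
}
```

For example, a 32KB table at 0x10000 passes the alignment check while the same table at 0x4000 does not.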

17 Mar, 2007 1 commit

[SPARC64]: Get DEBUG_PAGEALLOC working again. · d1acb421

Committed by David S. Miller on Mar 16, 2007

We have to make sure to use base-pagesize TLB entries even during the
early transition period where we need TLB miss handling but don't have
the kernel page tables set up yet for the linear region.

Also, it is necessary therefore to not use the 4MB TSB for these
translations, and instead use the normal kernel TSB.  This allows us
to also get rid of the 4MB tsb for debug builds which shrinks the
kernel a little bit.
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Mar, 2006 9 commits

[SPARC64]: Create a separate kernel TSB for 4MB/256MB mappings. · d7744a09

Committed by David S. Miller on Feb 21, 2006

It can map all of the linear kernel mappings with zero TSB hash
conflicts for systems with 16GB or less ram.  In such cases, on
SUN4V, once we load up this TSB the first time with all the
mappings, we never take a linear kernel mapping TLB miss ever
again, the hypervisor handles them all.
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: More TLB/TSB handling fixes. · 8b234274

Committed by David S. Miller on Feb 17, 2006

The SUN4V convention with non-shared TSBs is that the context
bit of the TAG is clear.  So we have to choose an "invalid"
bit and initialize new TSBs appropriately.  Otherwise a zero
TAG looks "valid".

Make sure, for the window fixup cases, that we use the right
global registers and that we don't potentially trample on
the live global registers in etrap/rtrap handling (%g2 and
%g6) and that we put the missing virtual address properly
in %g5.
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Initial sun4v TLB miss handling infrastructure. · d257d5da

Committed by David S. Miller on Feb 06, 2006

Things are a little tricky because, unlike sun4u, we have
to:

1) do a hypervisor trap to do the TLB load.
2) do the TSB lookup calculations by hand.
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Access TSB with physical addresses when possible. · 517af332

Committed by David S. Miller on Feb 01, 2006

This way we don't need to lock the TSB into the TLB.
The trick is that every TSB load/store is registered into
a special instruction patch section.  The default uses
virtual addresses, and the patch instructions use physical
address load/stores.

We can't do this on all chips because only cheetah+ and later
have the physical variant of the atomic quad load.
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Kill out-of-date commentary in asm-sparc64/tsb.h · b0fd4e49

Committed by David S. Miller on Jan 31, 2006

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Increase swapper_tsb size to 32K. · 2f7ee7c6

Committed by David S. Miller on Jan 31, 2006

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix incorrect TSB lock bit handling. · 4753eb2a

Committed by David S. Miller on Jan 31, 2006

The TSB_LOCK_BIT define is actually a special
value shifted down by 32-bits for the assembler
code macros.

In C code, this isn't what we want.
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Add infrastructure for dynamic TSB sizing. · 98c5584c

Committed by David S. Miller on Jan 31, 2006

This also cleans up tsb_context_switch().  The assembler
routine is now __tsb_context_switch() and the former is
an inline function that picks out the bits from the mm_struct
and passes it into the assembler code as arguments.

setup_tsb_parms() computes the locked TLB entry to map the
TSB.  Later when we support using the physical address quad
load instructions of Cheetah+ and later, we'll simply use
the physical address for the TSB register value and set
the map virtual and PTE both to zero.
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Move away from virtual page tables, part 1. · 74bf4312

Committed by David S. Miller on Jan 31, 2006

We now use the TSB hardware assist features of the UltraSPARC
MMUs.

SMP is currently knowingly broken, we need to find another place
to store the per-cpu base pointers.  We hid them away in the TSB
base register, and that obviously will not work any more :-)

Another known broken case is non-8KB base page size.

Also noticed that flush_tlb_all() is not referenced anywhere, only
the internal __flush_tlb_all() (local cpu only) is used by the
sparc64 port, so we can get rid of flush_tlb_all().

The kernel gets its own 8KB TSB (swapper_tsb) and each address space
gets its own private 8K TSB.  Later we can add code to dynamically
increase the size of per-process TSB as the RSS grows.  An 8KB TSB is
good enough for up to about a 4MB RSS, after which the TSB starts to
incur many capacity and conflict misses.

We even accumulate OBP translations into the kernel TSB.

Another area for refinement is large page size support.  We could use
a secondary address space TSB to handle those.
Signed-off-by: David S. Miller <davem@davemloft.net>

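The "8KB TSB is good enough for about a 4MB RSS" estimate above can be reproduced with a one-line calculation. The sketch assumes a 16-byte TSB entry (an 8-byte tag plus an 8-byte translation) and 8KB base pages; both are assumptions for illustration, and the names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_TSB_ENTRY_BYTES 16	/* assumed: 8-byte tag + 8-byte TTE */

/* Memory that a TSB can map when every entry holds a distinct page. */
static inline uint64_t tsb_reach_bytes(uint64_t tsb_bytes, uint64_t page_bytes)
{
	return (tsb_bytes / SKETCH_TSB_ENTRY_BYTES) * page_bytes;
}
```

Under these assumptions tsb_reach_bytes(8192, 8192) is 4MB, the point beyond which capacity and conflict misses start to dominate, and a 32KB table reaches 16MB.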