提交 · 86505fc06b6f1ee8a13473053a41ed01948e2d4f · openanolis / cloud-kernel

29 7月, 2016 1 次提交

sparc64 mm: Fix base TSB sizing when hugetlb pages are used · af1b1a9b

由 Mike Kravetz 提交于 7月 15, 2016

do_sparc64_fault() calculates both the base and huge page RSS sizes and
uses this information in calls to tsb_grow(). The calculation for base
page TSB size is not correct if the task uses hugetlb pages. hugetlb
pages are not accounted for in RSS, therefore the call to get_mm_rss(mm)
does not include hugetlb pages. However, the number of pages based on
huge_pte_count (which does include hugetlb pages) is subtracted from
this value. This will result in an artificially small and often negative
RSS calculation. The base TSB size is then often set to max_tsb_size
as the passed RSS is unsigned, so a negative value looks really big.

THP pages are also accounted for in huge_pte_count, and THP pages are
accounted for in RSS so the calculation in do_sparc64_fault() is correct
if a task only uses THP pages.

A single huge_pte_count is not sufficient for TSB sizing if both hugetlb
and THP pages can be used. Instead of a single counter, use two: one
for hugetlb and one for THP.
Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

af1b1a9b

19 5月, 2014 1 次提交

sparc: drop use of extern for prototypes in arch/sparc/include/asm · f05a6865

由 Sam Ravnborg 提交于 5月 16, 2014

Drop extern for all prototypes and adjust alignment of parameters
as required after the removal.
In a few rare cases adjust linelength to conform to maximum 80 chars,
and likewise in a few rare cases adjust alignment of parameters
to static functions.
Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f05a6865

13 11月, 2013 1 次提交

sparc64: Move from 4MB to 8MB huge pages. · 37b3a8ff

由 David S. Miller 提交于 9月 25, 2013

The impetus for this is that we would like to move to 64-bit PMDs and
PGDs, but that would result in only supporting a 42-bit address space
with the current page table layout.  It'd be nice to support at least
43-bits.

The reason we'd end up with only 42-bits after making PMDs and PGDs
64-bit is that we only use half-page sized PTE tables in order to make
PMDs line up to 4MB, the hardware huge page size we use.

So what we do here is we make huge pages 8MB, and fabricate them using
4MB hw TLB entries.

Facilitate this by providing a "REAL_HPAGE_SHIFT" which is used in
places that really need to operate on hardware 4MB pages.

Use full pages (512 entries) for PTE tables, and adjust PMD_SHIFT,
PGD_SHIFT, and the build time CPP test as needed.  Use a CPP test to
make sure REAL_HPAGE_SHIFT and the _PAGE_SZHUGE_* we use match up.

This makes the pgtable cache completely unused, so remove the code
managing it and the state used in mm_context_t.  Now we have less
spinlocks taken in the page table allocation path.

The technique we use to fabricate the 8MB pages is to transfer bit 22
from the missing virtual address into the PTEs physical address field.
That takes care of the transparent huge pages case.

For hugetlb, we fill things in at the PTE level and that code already
puts the sub huge page physical bits into the PTEs, based upon the
offset, so there is nothing special we need to do.  It all just works
out.

So, a small amount of complexity in the THP case, but this code is
about to get much simpler when we move the 64-bit PMDs as we can move
away from the fancy 32-bit huge PMD encoding and just put a real PTE
value in there.

With bug fixes and help from Bob Picco.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37b3a8ff

09 10月, 2012 3 次提交

sparc64: Support transparent huge pages. · 9e695d2e

由 David Miller 提交于 10月 08, 2012

This is relatively easy since PMD's now cover exactly 4MB of memory.

Our PMD entries are 32-bits each, so we use a special encoding.  The
lowest bit, PMD_ISHUGE, determines the interpretation.  This is possible
because sparc64's page tables are purely software entities so we can use
whatever encoding scheme we want.  We just have to make the TLB miss
assembler page table walkers aware of the layout.

set_pmd_at() works much like set_pte_at() but it has to operate in two
page from a table of non-huge PTEs, so we have to queue up TLB flushes
based upon what mappings are valid in the PTE table.  In the second regime
we are going from huge-page to non-huge-page, and in that case we need
only queue up a single TLB flush to push out the huge page mapping.

We still have 5 bits remaining in the huge PMD encoding so we can very
likely support any new pieces of THP state tracking that might get added
in the future.

With lots of help from Johannes Weiner.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9e695d2e

sparc64: Eliminate PTE table memory wastage. · c460bec7

由 David Miller 提交于 10月 08, 2012

We've split up the PTE tables so that they take up half a page instead of
a full page.  This is in order to facilitate transparent huge page
support, which works much better if our PMDs cover 4MB instead of 8MB.

What we do is have a one-behind cache for PTE table allocations in the
mm struct.

This logic triggers only on allocations.  For example, we don't try to
keep track of free'd up page table blocks in the style that the s390 port
does.

There were only two slightly annoying aspects to this change:

1) Changing pgtable_t to be a "pte_t *".  There's all of this special
   logic in the TLB free paths that needed adjustments, as did the
   PMD populate interfaces.

2) init_new_context() needs to zap the pointer, since the mm struct
   just gets copied from the parent on fork.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c460bec7

sparc64: Only support 4MB huge pages and 8KB base pages. · 15b9350a

由 David Miller 提交于 10月 08, 2012

Narrowing the scope of the page size configurations will make the
transparent hugepage changes much simpler.

In the end what we really want to do is have the kernel support multiple
huge page sizes and use whatever is appropriate as the context dictactes.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

15b9350a

28 7月, 2008 1 次提交

sparc, sparc64: use arch/sparc/include · a439fe51

由 Sam Ravnborg 提交于 7月 27, 2008

The majority of this patch was created by the following script:

***
ASM=arch/sparc/include/asm
mkdir -p $ASM
git mv include/asm-sparc64/ftrace.h $ASM
git rm include/asm-sparc64/*
git mv include/asm-sparc/* $ASM
sed -ie 's/asm-sparc64/asm/g' $ASM/*
sed -ie 's/asm-sparc/asm/g' $ASM/*
***

The rest was an update of the top-level Makefile to use sparc
for header files when sparc64 is being build.
And a small fixlet to pick up the correct unistd.h from
sparc64 code.
Signed-off-by: NSam Ravnborg <sam@ravnborg.org>

a439fe51

18 7月, 2008 2 次提交

sparc64: Remove 4MB and 512K base page size options. · f7fe9334

由 David S. Miller 提交于 7月 17, 2008

Adrian Bunk reported that enabling 4MB page size breaks the build.
The problem is that MAX_ORDER combined with the page shift exceeds the
SECTION_SIZE_BITS we use in asm-sparc64/sparsemem.h

There are several ways I suppose we could work around this.  For one
we could define a CONFIG_FORCE_MAX_ZONEORDER to decrease MAX_ORDER in
these higher page size cases.

But I also know that these page size cases are broken wrt. TLB miss
handling especially on pre-hypervisor systems, and there isn't an easy
way to fix that.

These options were meant to be fun experimental hacks anyways, and
only 8K and 64K make any sense to support.

So remove 512K and 4M base page size support.  Of course, we still
support these page sizes for huge pages.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f7fe9334

sparc: join the remaining header files · f5e706ad

由 Sam Ravnborg 提交于 7月 17, 2008

With this commit all sparc64 header files are moved to asm-sparc.
The remaining files (71 files) were too different to be trivially
merged so divide them up in a _32.h and a _64.h file which
are both included from the file with no bit size.

The following script were used:
cd include
FILES=`wc -l asm-sparc64/*h | grep -v '^     1' | cut -b 20-`

for FILE in ${FILES}; do
  echo $FILE:
  BASE=`echo $FILE | cut -d '.' -f 1`
  FN32=${BASE}_32.h
  FN64=${BASE}_64.h
  GUARD=___ASM_SPARC_`echo $BASE | tr '-' '_' | tr [:lower:] [:upper:]`_H
  git mv asm-sparc/$FILE asm-sparc/$FN32
  git mv asm-sparc64/$FILE asm-sparc/$FN64
  echo git mv done
  printf "#ifndef %s\n" $GUARD                             >   asm-sparc/$FILE
  printf "#define %s\n" $GUARD                             >>  asm-sparc/$FILE
  printf "#if defined(__sparc__) && defined(__arch64__)\n" >>  asm-sparc/$FILE
  printf "#include <asm-sparc/%s>\n" $FN64                 >>  asm-sparc/$FILE
  printf "#else\n"                                         >>  asm-sparc/$FILE
  printf "#include <asm-sparc/%s>\n" $FN32                 >>  asm-sparc/$FILE
  printf "#endif\n"                                        >>  asm-sparc/$FILE
  printf "#endif\n"                                        >>  asm-sparc/$FILE
  git add asm-sparc/$FILE
  echo new file done
  printf "#include <asm-sparc/%s>\n" $FILE                 >  asm-sparc64/$FILE
  git add asm-sparc64/$FILE
  echo sparc64 file done
done

The guard contains three '_' to avoid conflict with existing guards.
In additing the two Kbuild files are emptied to avoid breaking
headers_* targets.
We will reintroduce the exported header files when the necessary
kbuild changes are merged.
Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f5e706ad

09 5月, 2007 1 次提交

consolidate asm/const.h to linux/const.h · 6df95fd7

由 Randy Dunlap 提交于 5月 08, 2007

Make a global linux/const.h header file instead of having multiple,
per-arch files, and convert current users of asm/const.h to use
linux/const.h.

Built on x86_64 and sparc64.

[akpm@linux-foundation.org: fix include/asm-x86_64/Kbuild]
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6df95fd7

26 4月, 2006 1 次提交
- D
  Don't include linux/config.h from anywhere else in include/ · 62c4f0a2
  由 David Woodhouse 提交于 4月 26, 2006
```
Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
```
  62c4f0a2
22 3月, 2006 1 次提交
- D
  [SPARC64]: Add a secondary TSB for hugepage mappings. · dcc1e8dd
  由 David S. Miller 提交于 3月 22, 2006
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  dcc1e8dd
20 3月, 2006 9 次提交

[SPARC64]: Optimized TSB table initialization. · bb8646d8

由 David S. Miller 提交于 3月 18, 2006

We only need to write an invalid tag every 16 bytes,
so taking advantage of this can save many instructions
compared to the simple memset() call we make now.

A prefetching implementation is implemented for sun4u
and a block-init store version if implemented for Niagara.

The next trick is to be able to perform an init and
a copy_tsb() in parallel when growing a TSB table.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bb8646d8

[SPARC64]: Use 13-bit context size always. · 97c4b6f9

由 David S. Miller 提交于 2月 26, 2006

We no longer have the problems that require using the smaller
sizes.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97c4b6f9

[SPARC64]: Fix TLB context allocation with SMT style shared TLBs. · a0663a79

由 David S. Miller 提交于 2月 23, 2006

The context allocation scheme we use depends upon there being a 1<-->1
mapping from cpu to physical TLB for correctness.  Chips like Niagara
break this assumption.

So what we do is notify all cpus with a cross call when the context
version number changes, and if necessary this makes them allocate
a valid context for the address space they are running at the time.

Stress tested with make -j1024, make -j2048, and make -j4096 kernel
builds on a 32-strand, 8 core, T2000 with 16GB of ram.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a0663a79

D
[SPARC64]: Hypervisor TSB context switching. · 618e9ed9
由 David S. Miller 提交于 2月 09, 2006
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
618e9ed9

[SPARC64]: Access TSB with physical addresses when possible. · 517af332

由 David S. Miller 提交于 2月 01, 2006

This way we don't need to lock the TSB into the TLB.
The trick is that every TSB load/store is registered into
a special instruction patch section.  The default uses
virtual addresses, and the patch instructions use physical
address load/stores.

We can't do this on all chips because only cheetah+ and later
have the physical variant of the atomic quad load.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

517af332

D
[SPARC64]: Preload TSB entries from update_mmu_cache(). · b70c0fa1
由 David S. Miller 提交于 1月 31, 2006
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
b70c0fa1

[SPARC64]: Dynamically grow TSB in response to RSS growth. · bd40791e

由 David S. Miller 提交于 1月 31, 2006

As the RSS grows, grow the TSB in order to reduce the likelyhood
of hash collisions and thus poor hit rates in the TSB.

This definitely needs some serious tuning.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bd40791e

[SPARC64]: Add infrastructure for dynamic TSB sizing. · 98c5584c

由 David S. Miller 提交于 1月 31, 2006

This also cleans up tsb_context_switch().  The assembler
routine is now __tsb_context_switch() and the former is
an inline function that picks out the bits from the mm_struct
and passes it into the assembler code as arguments.

setup_tsb_parms() computes the locked TLB entry to map the
TSB.  Later when we support using the physical address quad
load instructions of Cheetah+ and later, we'll simply use
the physical address for the TSB register value and set
the map virtual and PTE both to zero.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

98c5584c

[SPARC64]: Move away from virtual page tables, part 1. · 74bf4312

由 David S. Miller 提交于 1月 31, 2006

We now use the TSB hardware assist features of the UltraSPARC
MMUs.

SMP is currently knowingly broken, we need to find another place
to store the per-cpu base pointers.  We hid them away in the TSB
base register, and that obviously will not work any more :-)

Another known broken case is non-8KB base page size.

Also noticed that flush_tlb_all() is not referenced anywhere, only
the internal __flush_tlb_all() (local cpu only) is used by the
sparc64 port, so we can get rid of flush_tlb_all().

The kernel gets it's own 8KB TSB (swapper_tsb) and each address space
gets it's own private 8K TSB.  Later we can add code to dynamically
increase the size of per-process TSB as the RSS grows.  An 8KB TSB is
good enough for up to about a 4MB RSS, after which the TSB starts to
incur many capacity and conflict misses.

We even accumulate OBP translations into the kernel TSB.

Another area for refinement is large page size support.  We could use
a secondary address space TSB to handle those.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

74bf4312

17 4月, 2005 1 次提交

Linux-2.6.12-rc2 · 1da177e4

由 Linus Torvalds 提交于 4月 16, 2005

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

1da177e4

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功