1. 04 May 2014, 3 commits
    • sparc64: Fix hex values in comment above pte_modify(). · c2e4e676
      David S. Miller authored
      When _PAGE_SPECIAL and _PAGE_PMD_HUGE were added to the mask, the
      comment was not updated.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: Fix bugs in get_user_pages_fast() wrt. THP. · 04df419d
      David S. Miller authored
      The large PMD path needs to check _PAGE_VALID not _PAGE_PRESENT, to
      decide if it needs to bail and return 0.
      
      pmd_large() should therefore just check _PAGE_PMD_HUGE.
      
      Calls to gup_huge_pmd() are guarded with a check of pmd_large(), so we
      just need to add a valid bit check.
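
      As a rough, self-contained sketch of the check described above (the bit
      positions and helper names below are illustrative stand-ins, not the
      kernel's actual definitions):

      #include <stdint.h>
      #include <stdio.h>

      #define _PAGE_VALID     (1ULL << 63)   /* illustrative position */
      #define _PAGE_PMD_HUGE  (1ULL << 8)    /* illustrative position */

      /* Per the commit, testing _PAGE_PMD_HUGE alone decides "large". */
      static int pmd_large(uint64_t pmd)
      {
              return (pmd & _PAGE_PMD_HUGE) != 0;
      }

      /* The guarded huge-PMD path bails with 0 when _PAGE_VALID is clear;
       * checking _PAGE_PRESENT here was the bug being fixed. */
      static int gup_huge_pmd_ok(uint64_t pmd)
      {
              return (pmd & _PAGE_VALID) != 0;
      }

      int main(void)
      {
              uint64_t invalid_huge = _PAGE_PMD_HUGE;
              uint64_t valid_huge   = _PAGE_PMD_HUGE | _PAGE_VALID;

              printf("invalid huge pmd: large=%d gup_ok=%d\n",
                     pmd_large(invalid_huge), gup_huge_pmd_ok(invalid_huge));
              printf("valid huge pmd:   large=%d gup_ok=%d\n",
                     pmd_large(valid_huge), gup_huge_pmd_ok(valid_huge));
              return 0;
      }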
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: Fix huge PMD invalidation. · 51e5ef1b
      David S. Miller authored
      On sparc64 "present" and "valid" are seperate PTE bits, this allows us to
      naturally distinguish between the user explicitly asking for PROT_NONE
      with mprotect() and other situations.
      
      However we weren't handling this properly in the huge PMD paths.
      
      First of all, the page table walker in the TSB miss path only checks
      for _PAGE_PMD_HUGE.  So the generic pmdp_invalidate() would clear
      _PAGE_PRESENT but the TLB miss paths would still load it into the TLB
      as a valid huge PMD.
      
      Fix this by clearing the valid bit in pmdp_invalidate(), and also
      checking the valid bit in USER_PGTABLE_CHECK_PMD_HUGE using "brgez"
      since _PAGE_VALID is bit 63 in both the sun4u and sun4v pte layouts.
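
      Since _PAGE_VALID sits in the sign bit, the "brgez" (branch if register
      greater than or equal to zero) test is just a signed comparison; a small
      hedged C model of that check:

      #include <stdint.h>
      #include <stdio.h>

      #define _PAGE_VALID (1ULL << 63)   /* bit 63 in both sun4u and sun4v layouts */

      /* A signed ">= 0" test fires exactly when the valid bit is clear,
       * which is what the TSB miss path wants to catch after pmdp_invalidate()
       * has cleared _PAGE_VALID. */
      static int pmd_rejected(uint64_t pmd)
      {
              return (int64_t)pmd >= 0;
      }

      int main(void)
      {
              uint64_t live = _PAGE_VALID | 0x1234;
              uint64_t invalidated = live & ~_PAGE_VALID;

              printf("live pmd rejected: %d\n", pmd_rejected(live));
              printf("invalidated pmd rejected: %d\n", pmd_rejected(invalidated));
              return 0;
      }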
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 19 Dec 2013, 1 commit
    • mm: fix TLB flush race between migration, and change_protection_range · 20841405
      Rik van Riel authored
      There are a few subtle races, between change_protection_range (used by
      mprotect and change_prot_numa) on one side, and NUMA page migration and
      compaction on the other side.
      
      The basic race is that there is a time window between when the PTE gets
      made non-present (PROT_NONE or NUMA), and the TLB is flushed.
      
      During that time, a CPU may continue writing to the page.
      
      This is fine most of the time; however, compaction or the NUMA migration
      code may come in and migrate the page away.
      
      When that happens, the CPU may continue writing, through the cached
      translation, to what is no longer the current memory location of the
      process.
      
      This only affects x86, which has a somewhat optimistic pte_accessible.
      All other architectures appear to be safe, and will either always flush,
      or flush whenever there is a valid mapping, even with no permissions
      (SPARC).
      
      The basic race looks like this:
      
      CPU A			CPU B			CPU C
      
      						load TLB entry
      make entry PTE/PMD_NUMA
      			fault on entry
      						read/write old page
      			start migrating page
      			change PTE/PMD to new page
      						read/write old page [*]
      flush TLB
      						reload TLB from new entry
      						read/write new page
      						lose data
      
      [*] the old page may belong to a new user at this point!
      
      The obvious fix is to flush remote TLB entries, by making sure that
      pte_accessible is aware of the fact that PROT_NONE and PROT_NUMA memory
      may still be accessible if there is a TLB flush pending for the mm.
      
      This should fix both NUMA migration and compaction.
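
      A minimal sketch of the shape of that fix (the types, flag values and the
      pending-flush flag are simplified stand-ins for the real mm_struct
      plumbing):

      #include <stdbool.h>
      #include <stdint.h>

      #define _PAGE_PRESENT   (1u << 0)   /* illustrative */
      #define _PAGE_PROTNONE  (1u << 8)   /* covers PROT_NONE/NUMA in this sketch */

      struct mm_sketch {
              int tlb_flush_pending;      /* set before clearing PTEs, cleared after the flush */
      };

      /* A non-present PROT_NONE/NUMA PTE must still count as accessible while
       * a TLB flush for this mm is pending, because other CPUs may keep
       * writing through stale TLB entries until the flush completes. */
      static bool pte_accessible_sketch(const struct mm_sketch *mm, uint32_t pte)
      {
              if (pte & _PAGE_PRESENT)
                      return true;
              if ((pte & _PAGE_PROTNONE) && mm->tlb_flush_pending)
                      return true;
              return false;
      }

      int main(void)
      {
              struct mm_sketch mm = { .tlb_flush_pending = 1 };
              return pte_accessible_sketch(&mm, _PAGE_PROTNONE) ? 0 : 1;
      }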
      
      [mgorman@suse.de: fix build]
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 14 Nov 2013, 1 commit
    • sparc64: Encode huge PMDs using PTE encoding. · a7b9403f
      David S. Miller authored
      Now that we have 64-bits for PMDs we can stop using special encodings
      for the huge PMD values, and just put real PTEs in there.
      
      We allocate a _PAGE_PMD_HUGE bit to distinguish between plain PMDs and
      huge ones.  It is the same for both 4U and 4V PTE layouts.
      
      We also use _PAGE_SPECIAL to indicate the splitting state, since a
      huge PMD cannot also be special.
      
      All of the PMD --> PTE translation code disappears, and most of the
      huge PMD bit modifications and tests just degenerate into the PTE
      operations.  In particular USER_PGTABLE_CHECK_PMD_HUGE becomes
      trivial.
      
      As a side effect, normal PMDs don't shift the physical address around.
      This also speeds up the page table walks in the TLB miss paths since
      they don't have to do the shifts any more.
      
      Another non-trivial aspect is that pte_modify() has to be changed
      to preserve the _PAGE_PMD_HUGE bits as well as the page size field
      of the pte.
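
      A hedged sketch of what that preservation looks like (the mask names and
      bit positions are invented for the illustration; only the requirement
      that _PAGE_PMD_HUGE, _PAGE_SPECIAL and the page size field survive comes
      from the commit messages in this log):

      #include <stdint.h>

      #define _PAGE_PADDR     0x0000fffffffff000ULL  /* invented span: physical address */
      #define _PAGE_SZ_FIELD  0x0000000000000007ULL  /* invented span: page size field */
      #define _PAGE_PMD_HUGE  (1ULL << 8)            /* invented position */
      #define _PAGE_SPECIAL   (1ULL << 9)            /* invented position */

      /* pte_modify() swaps the protection bits but must keep everything that
       * identifies what is mapped and at which size, so a huge PMD stays huge
       * across an mprotect()-style protection change. */
      static uint64_t pte_modify_sketch(uint64_t pte, uint64_t newprot)
      {
              const uint64_t keep = _PAGE_PADDR | _PAGE_SZ_FIELD |
                                    _PAGE_PMD_HUGE | _PAGE_SPECIAL;

              return (pte & keep) | (newprot & ~keep);
      }

      int main(void)
      {
              uint64_t huge = _PAGE_PMD_HUGE | 0x2000;

              return (pte_modify_sketch(huge, 0) & _PAGE_PMD_HUGE) ? 0 : 1;
      }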
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Nov 2013, 2 commits
    • sparc64: Move to 64-bit PGDs and PMDs. · 2b77933c
      David S. Miller authored
      To make the page tables compact, we were using 32-bit PGDs and PMDs.
      We only had to support <= 43 bits of physical addresses so this was
      quite feasible.
      
      In order to support larger physical addresses we have to move to
      64-bit PGDs and PMDs.
      
      Most of the changes are straight-forward:
      
      1) {pgd,pmd}_t --> unsigned long
      
      2) Anything that tries to use plain "unsigned int" types with pgd/pmd
         values needs to be adjusted.  In particular things like "0U" become
         "0UL".
      
      3) {PGDIR,PMD}_BITS decrease by one.
      
      4) In the assembler page table walkers, use "ldxa" instead of "lduwa"
         and adjust the low bit masks to clear out the low 3 bits instead of
         just the low 2 bits during pgd/pmd address formation.
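
      A toy model of the address formation change in item 4 (the field widths
      and shifts are invented; the point is only the 8-byte scaling and the
      low 3-bit mask that replaces the old low 2-bit mask):

      #include <stdint.h>
      #include <stdio.h>

      #define PMD_SHIFT_SKETCH  23
      #define PMD_INDEX_MASK    0x3ffULL

      /* With 64-bit entries the walker loads 8 bytes ("ldxa" instead of
       * "lduwa"), scales the index by 8 (shift left by 3) and keeps the low
       * 3 bits of the resulting entry address clear. */
      static uint64_t pmd_entry_addr(uint64_t pmd_table, uint64_t vaddr)
      {
              uint64_t index = (vaddr >> PMD_SHIFT_SKETCH) & PMD_INDEX_MASK;

              return (pmd_table & ~7ULL) + (index << 3);
      }

      int main(void)
      {
              printf("entry at %#llx\n",
                     (unsigned long long)pmd_entry_addr(0x40000000ULL, 0x12345678ULL));
              return 0;
      }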
      
      Also, use PTRS_PER_PGD and PTRS_PER_PMD in the sizing of the
      swapper_{pg_dir,low_pmd_dir} arrays.
      
      This patch does not try to take advantage of having 64-bits in the
      PMDs to simplify the hugepage code, that will come in a subsequent
      change.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: Move from 4MB to 8MB huge pages. · 37b3a8ff
      David S. Miller authored
      The impetus for this is that we would like to move to 64-bit PMDs and
      PGDs, but that would result in only supporting a 42-bit address space
      with the current page table layout.  It'd be nice to support at least
      43-bits.
      
      The reason we'd end up with only 42-bits after making PMDs and PGDs
      64-bit is that we only use half-page sized PTE tables in order to make
      PMDs line up to 4MB, the hardware huge page size we use.
      
      So what we do here is we make huge pages 8MB, and fabricate them using
      4MB hw TLB entries.
      
      Facilitate this by providing a "REAL_HPAGE_SHIFT" which is used in
      places that really need to operate on hardware 4MB pages.
      
      Use full pages (512 entries) for PTE tables, and adjust PMD_SHIFT,
      PGD_SHIFT, and the build time CPP test as needed.  Use a CPP test to
      make sure REAL_HPAGE_SHIFT and the _PAGE_SZHUGE_* we use match up.
      
      This makes the pgtable cache completely unused, so remove the code
      managing it and the state used in mm_context_t.  Now we take fewer
      spinlocks in the page table allocation path.
      
      The technique we use to fabricate the 8MB pages is to transfer bit 22
      from the missing virtual address into the PTE's physical address field.
      That takes care of the transparent huge pages case.
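
      A hedged model of that bit-22 trick (the PTE layout here is invented;
      only the use of bit 22 to select the upper or lower 4MB half comes from
      the text above):

      #include <stdint.h>
      #include <stdio.h>

      #define REAL_HPAGE_SIZE  (4UL * 1024 * 1024)   /* 4MB hardware page */
      #define HPAGE_HALF_BIT   (1ULL << 22)          /* selects which 4MB half */

      /* The stored huge PTE describes the low 4MB half of the 8MB page; on a
       * TLB miss, bit 22 of the faulting virtual address is copied into the
       * physical address field so the upper half maps 4MB further on. */
      static uint64_t tlb_load_value(uint64_t huge_pte, uint64_t miss_vaddr)
      {
              return huge_pte | (miss_vaddr & HPAGE_HALF_BIT);
      }

      int main(void)
      {
              uint64_t pte = 0x80000000ULL;   /* pretend physical base of the 8MB page */
              uint64_t lo = tlb_load_value(pte, 0x10000000ULL);
              uint64_t hi = tlb_load_value(pte, 0x10000000ULL + REAL_HPAGE_SIZE);

              printf("lower half: %#llx\n", (unsigned long long)lo);
              printf("upper half: %#llx (4MB further)\n", (unsigned long long)hi);
              return 0;
      }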
      
      For hugetlb, we fill things in at the PTE level and that code already
      puts the sub huge page physical bits into the PTEs, based upon the
      offset, so there is nothing special we need to do.  It all just works
      out.
      
      So, a small amount of complexity in the THP case, but this code is
      about to get much simpler when we move the 64-bit PMDs as we can move
      away from the fancy 32-bit huge PMD encoding and just put a real PTE
      value in there.
      
      With bug fixes and help from Bob Picco.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 29 Jun 2013, 1 commit
  6. 20 Jun 2013, 1 commit
  7. 20 Apr 2013, 1 commit
    • sparc64: Fix race in TLB batch processing. · f36391d2
      David S. Miller authored
      As reported by Dave Kleikamp, when we emit cross calls to do batched
      TLB flush processing we have a race because we do not synchronize on
      the sibling cpus completing the cross call.
      
      So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
      and either flushes are missed or flushes will flush the wrong
      addresses.
      
      Fix this by using generic infrastructure to synchronize on the
      completion of the cross call.
      
      This first required getting the flush_tlb_pending() call out from
      switch_to() which operates with locks held and interrupts disabled.
      The problem is that smp_call_function_many() cannot be invoked with
      IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().
      
      We get the batch processing outside of locked IRQ disabled sections by
      using some ideas from the powerpc port. Namely, we only batch inside
      of arch_{enter,leave}_lazy_mmu_mode() calls.  If we're not in such a
      region, we flush TLBs synchronously.
      
      1) Get rid of xcall_flush_tlb_pending and per-cpu type
         implementations.
      
      2) Do TLB batch cross calls instead via:
      
      	smp_call_function_many()
      		tlb_pending_func()
      			__flush_tlb_pending()
      
      3) Batch only in lazy mmu sequences:
      
      	a) Add 'active' member to struct tlb_batch
      	b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
      	c) Set 'active' in arch_enter_lazy_mmu_mode()
      	d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
      	e) Check 'active' in tlb_batch_add_one() and do a synchronous
                 flush if it's clear.
      
      4) Add infrastructure for synchronous TLB page flushes.
      
      	a) Implement __flush_tlb_page and per-cpu variants, patch
      	   as needed.
      	b) Likewise for xcall_flush_tlb_page.
      	c) Implement smp_flush_tlb_page() to invoke the cross-call.
      	d) Wire up global_flush_tlb_page() to the right routine based
                 upon CONFIG_SMP
      
      5) It turns out that singleton batches are very common, 2 out of every
         3 batch flushes have only a single entry in them.
      
         The batch flush waiting is very expensive, both because of the poll
         on sibling cpu completion, as well as because passing the tlb batch
         pointer to the sibling cpus invokes a shared memory dereference.
      
         Therefore, in flush_tlb_pending(), if there is only one entry in
         the batch perform a completely asynchronous global_flush_tlb_page()
         instead.
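
      A toy, single-threaded model of points 3) and 5) above (the real code
      keeps one struct tlb_batch per cpu and the flushes are cross calls,
      neither of which is modelled here):

      #include <stdbool.h>
      #include <stdio.h>

      #define TLB_BATCH_NR 192   /* illustrative capacity */

      struct tlb_batch {
              bool active;                        /* inside lazy mmu mode? */
              unsigned long nr;
              unsigned long vaddrs[TLB_BATCH_NR];
      };

      static struct tlb_batch batch;              /* stand-in for the per-cpu instance */

      static void flush_one_page(unsigned long vaddr)
      {
              printf("flush page %#lx\n", vaddr);
      }

      static void flush_tlb_pending(void)
      {
              if (batch.nr == 1)
                      /* Singleton batches are the common case: flush just the
                       * one page (the real code fires an asynchronous
                       * global_flush_tlb_page() instead of a waited-on batch). */
                      flush_one_page(batch.vaddrs[0]);
              else if (batch.nr > 1)
                      printf("cross-call flush of %lu pages\n", batch.nr);
              batch.nr = 0;
      }

      static void arch_enter_lazy_mmu_mode(void)
      {
              batch.active = true;
      }

      static void arch_leave_lazy_mmu_mode(void)
      {
              flush_tlb_pending();
              batch.active = false;
      }

      static void tlb_batch_add_one(unsigned long vaddr)
      {
              if (!batch.active) {
                      flush_one_page(vaddr);      /* not batching: flush synchronously */
                      return;
              }
              batch.vaddrs[batch.nr++] = vaddr;
              if (batch.nr == TLB_BATCH_NR)
                      flush_tlb_pending();
      }

      int main(void)
      {
              tlb_batch_add_one(0x1000);          /* outside lazy mode: immediate flush */

              arch_enter_lazy_mmu_mode();
              tlb_batch_add_one(0x2000);
              tlb_batch_add_one(0x3000);
              arch_leave_lazy_mmu_mode();         /* runs the queued batch */
              return 0;
      }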
      Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
  8. 14 Feb 2013, 1 commit
  9. 19 Dec 2012, 1 commit
  10. 09 Oct 2012, 4 commits
    • sparc64: Support transparent huge pages. · 9e695d2e
      David Miller authored
      This is relatively easy since PMDs now cover exactly 4MB of memory.
      
      Our PMD entries are 32-bits each, so we use a special encoding.  The
      lowest bit, PMD_ISHUGE, determines the interpretation.  This is possible
      because sparc64's page tables are purely software entities so we can use
      whatever encoding scheme we want.  We just have to make the TLB miss
      assembler page table walkers aware of the layout.
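
      A minimal sketch of that kind of encoding (only the low PMD_ISHUGE bit
      is named by the commit; everything else here is invented):

      #include <stdint.h>
      #include <stdio.h>

      #define PMD_ISHUGE  0x00000001U   /* lowest bit selects the interpretation */

      static int pmd_is_huge(uint32_t pmd)
      {
              return pmd & PMD_ISHUGE;
      }

      int main(void)
      {
              uint32_t table_pmd = 0x00400000U;              /* refers to a PTE table */
              uint32_t huge_pmd  = 0x00800000U | PMD_ISHUGE; /* encodes a 4MB huge mapping */

              printf("table pmd huge? %d\n", pmd_is_huge(table_pmd));
              printf("huge pmd huge?  %d\n", pmd_is_huge(huge_pmd));
              return 0;
      }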
      
      set_pmd_at() works much like set_pte_at() but it has to operate in two
      regimes.  In the first we are going from non-huge-page to huge-page,
      replacing a PMD that pointed to a table of non-huge PTEs, so we have to
      queue up TLB flushes based upon what mappings are valid in the PTE table.
      In the second regime we are going from huge-page to non-huge-page, and in
      that case we need only queue up a single TLB flush to push out the huge
      page mapping.
      
      We still have 5 bits remaining in the huge PMD encoding so we can very
      likely support any new pieces of THP state tracking that might get added
      in the future.
      
      With lots of help from Johannes Weiner.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sparc64: Document PGD and PMD layout. · dbc9fdf0
      David Miller authored
      We're going to be messing around with the PMD interpretation and layout
      for the sake of transparent huge pages, so we better clearly document what
      we're starting with.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sparc64: Halve the size of PTE tables · 56a70b8c
      David Miller authored
      The reason we want to do this is to facilitate transparent huge page
      support.
      
      Right now PMDs cover 8MB of address space, and our huge page size is 4MB.
      The current transparent hugepage support is not able to handle
      HPAGE_SIZE != PMD_SIZE.
      
      So make PTE tables be sized to half of a page instead of a full page.
      
      We can still properly map the whole supported virtual address range, which
      on sparc64 requires 44 bits.  Add a compile-time CPP test which ensures
      that this requirement is always met.
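
      The kind of compile-time CPP test being described might look roughly like
      this (the macro names and the exact arithmetic are illustrative, not the
      kernel's):

      #define PAGE_SHIFT_SKETCH  13   /* 8KB base pages */
      #define PTE_BITS_SKETCH    10   /* half-page PTE tables in this sketch */
      #define PMD_BITS_SKETCH    10
      #define PGDIR_BITS_SKETCH  11

      #if (PAGE_SHIFT_SKETCH + PTE_BITS_SKETCH + PMD_BITS_SKETCH + PGDIR_BITS_SKETCH) < 44
      #error "Page table layout no longer covers the required 44-bit address space"
      #endif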
      
      There is a minor inefficiency added by this change.  We only use half of
      the page for PTE tables.  It's not trivial to use only half of the page
      yet still get all of the pgtable_page_{ctor,dtor}() stuff working
      properly.  It is doable, and that will come in a subsequent change.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sparc64: Only support 4MB huge pages and 8KB base pages. · 15b9350a
      David Miller authored
      Narrowing the scope of the page size configurations will make the
      transparent hugepage changes much simpler.
      
      In the end what we really want to do is have the kernel support multiple
      huge page sizes and use whatever is appropriate as the context dictates.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 14 May 2012, 1 commit
    • sparc: Kill mmu_{un,}lockarea(). · 679bea5e
      David S. Miller authored
      These were used on sun4c during floppy data transfers, since on that
      chip we had to lock the cpu mappings into the TLB because we could not
      take a TLB miss during the assembler floppy interrupt handler that
      does the data transfer.
      
      That is no longer necessary since we've removed sun4c support, thus
      this stuff can disappear completely.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 02 Apr 2012, 1 commit
    • sparc: pgtable_64: change include order · 2533e824
      Aaro Koskinen authored
      Fix the following build breakage in v3.4-rc1:
      
        CC      arch/sparc/kernel/cpu.o
      In file included from /home/aaro/git/linux/arch/sparc/include/asm/pgtable_64.h:15:0,
                       from /home/aaro/git/linux/arch/sparc/include/asm/pgtable.h:4,
                       from arch/sparc/kernel/cpu.c:15:
      include/asm-generic/pgtable-nopud.h:13:16: error: unknown type name 'pgd_t'
      include/asm-generic/pgtable-nopud.h:25:28: error: unknown type name 'pgd_t'
      include/asm-generic/pgtable-nopud.h:26:27: error: unknown type name 'pgd_t'
      include/asm-generic/pgtable-nopud.h:27:31: error: unknown type name 'pgd_t'
      include/asm-generic/pgtable-nopud.h:28:30: error: unknown type name 'pgd_t'
      include/asm-generic/pgtable-nopud.h:38:34: error: unknown type name 'pgd_t'
      In file included from /home/aaro/git/linux/arch/sparc/include/asm/pgtable_64.h:783:0,
                       from /home/aaro/git/linux/arch/sparc/include/asm/pgtable.h:4,
                       from arch/sparc/kernel/cpu.c:15:
      include/asm-generic/pgtable.h: In function 'pgd_none_or_clear_bad':
      include/asm-generic/pgtable.h:258:2: error: implicit declaration of function 'pgd_none' [-Werror=implicit-function-declaration]
      include/asm-generic/pgtable.h:260:2: error: implicit declaration of function 'pgd_bad' [-Werror=implicit-function-declaration]
      include/asm-generic/pgtable.h: In function 'pud_none_or_clear_bad':
      include/asm-generic/pgtable.h:269:6: error: request for member 'pgd' in something not a structure or union
      Reported-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 29 Mar 2012, 1 commit
  14. 18 Nov 2011, 1 commit
  15. 26 Jul 2011, 1 commit
  16. 25 May 2011, 1 commit
    • sparc: mmu_gather rework · 90f08e39
      Peter Zijlstra authored
      Rework the sparc mmu_gather usage to conform to the new world order :-)
      
      Sparc mmu_gather does two things:
       - tracks vaddrs to unhash
       - tracks pages to free
      
      Split these two things like powerpc has done and keep the vaddrs
      in per-cpu data structures and flush them on context switch.
      
      The remaining bits can then use the generic mmu_gather.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: David Miller <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 22 Apr 2011, 1 commit
  18. 27 Oct 2010, 1 commit
  19. 21 Feb 2010, 1 commit
    • MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself · 4b3073e1
      Russell King authored
      On VIVT ARM, when we have multiple shared mappings of the same file
      in the same MM, we need to ensure that we have coherency across all
      copies.  We do this via make_coherent() by making the pages
      uncacheable.
      
      This used to work fine, until we allowed highmem with highpte - we
      now have a page table which is mapped as required, and is not available
      for modification via update_mmu_cache().
      
      Ralf Baechle suggested getting rid of the PTE value passed to
      update_mmu_cache():
      
        On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
        to construct a pointer to the pte again.  Passing a pte_t * is much
        more elegant.  Maybe we might even replace the pte argument with the
        pte_t?
      
      Ben Herrenschmidt would also like the pte pointer for PowerPC:
      
        Passing the ptep in there is exactly what I want.  I want that
        -instead- of the PTE value, because I have issue on some ppc cases,
        for I$/D$ coherency, where set_pte_at() may decide to mask out the
        _PAGE_EXEC.
      
      So, pass in the mapped page table pointer into update_mmu_cache(), and
      remove the PTE value, updating all implementations and call sites to
      suit.
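
      In rough terms the interface change looks like this (the struct and
      typedef below are reduced to stand-ins so the prototypes stay
      self-contained):

      struct vm_area_struct;
      typedef unsigned long pte_t;   /* stand-in; the real pte_t is per-arch */

      /* Before: implementations received the PTE by value, which may already
       * be stale relative to what set_pte_at() actually wrote. */
      void update_mmu_cache_old(struct vm_area_struct *vma,
                                unsigned long address, pte_t pte);

      /* After: they receive a pointer to the mapped PTE, so PowerPC can see
       * what set_pte_at() really stored (e.g. with _PAGE_EXEC masked out) and
       * VIVT ARM can reach the entry even when the table lives in highmem. */
      void update_mmu_cache(struct vm_area_struct *vma,
                            unsigned long address, pte_t *ptep);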
      
      Includes a fix from Stephen Rothwell:
      
        sparc: fix fallout from update_mmu_cache API change
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  20. 29 Sep 2009, 1 commit
  21. 26 Aug 2009, 1 commit
    • sparc64: Validate linear D-TLB misses. · d8ed1d43
      David S. Miller authored
      When page alloc debugging is not enabled, we essentially accept any
      virtual address for linear kernel TLB misses.  But with kgdb, kernel
      address probing, and other facilities we can try to access arbitrary
      crap.
      
      So, make sure the address we miss on will translate to physical memory
      that actually exists.
      
      In order to make this work we have to embed the valid address bitmap
      into the kernel image.  And in order to make that less expensive we
      make an adjustment, in that the max physical memory address is
      decreased to "1 << 41", even on the chips that support a 42-bit
      physical address space.  We can do this because bit 41 indicates
      "I/O space" and thus covers non-memory ranges.
      
      The result of this is that:
      
      1) kpte_linear_bitmap shrinks from 2K to 1K in size
      
      2) we need 64K more for the valid address bitmap
      
      We can't let the valid address bitmap be dynamically allocated
      once we start using it to validate TLB misses, otherwise we have
      crazy issues to deal with wrt. recursive TLB misses and such.
      
      If we're in a TLB miss it could be the deepest trap level that's legal
      inside of the cpu.  So if we TLB miss referencing the bitmap, the cpu
      will be out of trap levels and enter RED state.
      
      To guard against out-of-range accesses to the bitmap, we have to check
      to make sure no bits in the physical address above bit 40 are set.  We
      could export and use last_valid_pfn for this check, but that's just an
      unnecessary extra memory reference.
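
      A hedged sketch of that range guard (the constant name is invented; the
      1 << 41 limit and the "no bits above bit 40" rule come from the text
      above):

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define MAX_PHYS_BITS_SKETCH 41   /* bitmap covers physical addresses below 1 << 41 */

      static bool paddr_in_bitmap_range(uint64_t paddr)
      {
              /* Reject any address with bits above bit 40 set before the
               * valid-address bitmap is ever dereferenced. */
              return (paddr >> MAX_PHYS_BITS_SKETCH) == 0;
      }

      int main(void)
      {
              printf("%d\n", paddr_in_bitmap_range(0x000001ffffffffffULL)); /* 1 */
              printf("%d\n", paddr_in_bitmap_range(0x0000020000000000ULL)); /* 0 */
              return 0;
      }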
      
      On the plus side of all this, since we load all of these translations
      into the special 4MB mapping TSB, and we check the TSB first for TLB
      misses, there should be absolutely no real cost for these new checks
      in the TLB miss path.
      
      Reported-by: heyongli@gmail.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 12 Sep 2008, 1 commit
  23. 28 Jul 2008, 1 commit
    • sparc, sparc64: use arch/sparc/include · a439fe51
      Sam Ravnborg authored
      The majority of this patch was created by the following script:
      
      ***
      ASM=arch/sparc/include/asm
      mkdir -p $ASM
      git mv include/asm-sparc64/ftrace.h $ASM
      git rm include/asm-sparc64/*
      git mv include/asm-sparc/* $ASM
      sed -ie 's/asm-sparc64/asm/g' $ASM/*
      sed -ie 's/asm-sparc/asm/g' $ASM/*
      ***
      
      The rest was an update of the top-level Makefile to use sparc
      for header files when sparc64 is being built.
      And a small fixlet to pick up the correct unistd.h from
      sparc64 code.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
  24. 18 Jul 2008, 2 commits
    • sparc64: Remove 4MB and 512K base page size options. · f7fe9334
      David S. Miller authored
      Adrian Bunk reported that enabling 4MB page size breaks the build.
      The problem is that MAX_ORDER combined with the page shift exceeds the
      SECTION_SIZE_BITS we use in asm-sparc64/sparsemem.h
      
      There are several ways I suppose we could work around this.  For one
      we could define a CONFIG_FORCE_MAX_ZONEORDER to decrease MAX_ORDER in
      these higher page size cases.
      
      But I also know that these page size cases are broken wrt. TLB miss
      handling especially on pre-hypervisor systems, and there isn't an easy
      way to fix that.
      
      These options were meant to be fun experimental hacks anyways, and
      only 8K and 64K make any sense to support.
      
      So remove 512K and 4M base page size support.  Of course, we still
      support these page sizes for huge pages.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc: join the remaining header files · f5e706ad
      Sam Ravnborg authored
      With this commit all sparc64 header files are moved to asm-sparc.
      The remaining files (71 files) were too different to be trivially
      merged, so they were split into a _32.h and a _64.h file, which
      are both included from the file with no bit-size suffix.
      
      The following script were used:
      cd include
      FILES=`wc -l asm-sparc64/*h | grep -v '^     1' | cut -b 20-`
      
      for FILE in ${FILES}; do
        echo $FILE:
        BASE=`echo $FILE | cut -d '.' -f 1`
        FN32=${BASE}_32.h
        FN64=${BASE}_64.h
        GUARD=___ASM_SPARC_`echo $BASE | tr '-' '_' | tr [:lower:] [:upper:]`_H
        git mv asm-sparc/$FILE asm-sparc/$FN32
        git mv asm-sparc64/$FILE asm-sparc/$FN64
        echo git mv done
        printf "#ifndef %s\n" $GUARD                             >   asm-sparc/$FILE
        printf "#define %s\n" $GUARD                             >>  asm-sparc/$FILE
        printf "#if defined(__sparc__) && defined(__arch64__)\n" >>  asm-sparc/$FILE
        printf "#include <asm-sparc/%s>\n" $FN64                 >>  asm-sparc/$FILE
        printf "#else\n"                                         >>  asm-sparc/$FILE
        printf "#include <asm-sparc/%s>\n" $FN32                 >>  asm-sparc/$FILE
        printf "#endif\n"                                        >>  asm-sparc/$FILE
        printf "#endif\n"                                        >>  asm-sparc/$FILE
        git add asm-sparc/$FILE
        echo new file done
        printf "#include <asm-sparc/%s>\n" $FILE                 >  asm-sparc64/$FILE
        git add asm-sparc64/$FILE
        echo sparc64 file done
      done
      
      The guard contains three '_' to avoid conflict with existing guards.
      In addition the two Kbuild files are emptied to avoid breaking
      headers_* targets.
      We will reintroduce the exported header files when the necessary
      kbuild changes are merged.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 20 May 2008, 1 commit
  26. 28 Apr 2008, 1 commit
    • mm: introduce pte_special pte bit · 7e675137
      Nick Piggin authored
      s390, for one, cannot implement VM_MIXEDMAP with pfn_valid, due to their memory
      model (which is more dynamic than most).  Instead, they had proposed to
      implement it with an additional path through vm_normal_page(), using a bit in
      the pte to determine whether or not the page should be refcounted:
      
      vm_normal_page()
      {
      	...
              if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
                      if (vma->vm_flags & VM_MIXEDMAP) {
      #ifdef s390
      			if (!mixedmap_refcount_pte(pte))
      				return NULL;
      #else
                              if (!pfn_valid(pfn))
                                      return NULL;
      #endif
                              goto out;
                      }
      	...
      }
      
      This is fine, however if we are allowed to use a bit in the pte to determine
      refcountedness, we can use that to _completely_ replace all the vma based
      schemes.  So instead of adding more cases to the already complex vma-based
      scheme, we can have a clearly separate and simple pte-based scheme (and get
      slightly better code generation in the process):
      
      vm_normal_page()
      {
      #ifdef s390
      	if (!mixedmap_refcount_pte(pte))
      		return NULL;
      	return pte_page(pte);
      #else
      	...
      #endif
      }
      
      And finally, we may rather make this concept usable by any architecture rather
      than making it s390 only, so implement a new type of pte state for this.
      Unfortunately the old vma based code must stay, because some architectures may
      not be able to spare pte bits.  This makes vm_normal_page a little bit more
      ugly than we would like, but the 2 cases are clearly separate.
      
      So introduce a pte_special pte state, and use it in mm/memory.c.  It is
      currently a noop for all architectures, so this doesn't actually result in any
      compiled code changes to mm/memory.o.
      
      BTW:
      I haven't put vm_normal_page() into arch code as-per an earlier suggestion.
      The reason is that, regardless of where vm_normal_page is actually
      implemented, the *abstraction* is still exactly the same. Also, while it
      depends on whether the architecture has pte_special or not, those are the
      only two possible cases, and it really isn't an arch specific function --
      the role of the arch code should be to provide primitive functions and
      accessors with which to build the core code; pte_special does that. We do
      not want architectures to know or care about vm_normal_page itself, and
      we definitely don't want them being able to invent something new there
      out of sight of mm/ code. If we made vm_normal_page an arch function, then
      we have to make vm_insert_mixed (next patch) an arch function too. So I
      don't think moving it to arch code fundamentally improves any abstractions,
      while it does practically make the code more difficult to follow, for both
      mm and arch developers, and easier to misuse.
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Carsten Otte <cotte@de.ibm.com>
      Cc: Jared Hulbert <jaredeh@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  27. 26 Mar 2008, 2 commits
  28. 17 Oct 2007, 1 commit
  29. 17 Jul 2007, 1 commit
  30. 09 May 2007, 1 commit
  31. 26 Apr 2007, 2 commits