Commits · e9b0a0712148abe96ff717a2b9f8dab1d433e0d5 · openeuler / Kernel

27 Mar, 2006 1 commit

[PATCH] ia64: ioremap: check EFI for valid memory attributes · e9b0a071

Authored by Bjorn Helgaas on Mar 26, 2006

Check the EFI memory map so we can use the correct memory attributes for
ioremap().  Previously, we always used uncacheable access, which blows up on
some machines for regular system memory.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e9b0a071

23 Mar, 2006 5 commits

[IA64] add init declaration - nolwsys · 03906ea0

Authored by Chen, Kenneth W on Mar 12, 2006

Add __initdata to nolwsys.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

03906ea0

[IA64] add init declaration - gate page functions · 914a4ea4

Authored by Chen, Kenneth W on Mar 12, 2006

Add init declaration to a bunch of patch functions and the gate
page setup function.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

914a4ea4

[IA64] add init declaration to memory initialization functions · dae28066

Authored by Chen, Kenneth W on Mar 22, 2006

Add init declaration to variables/functions used for memory
initialization.  I don't think they would clash with memory
hotplug.  If they do, please yell.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

dae28066

[IA64] add init declaration to cpu initialization functions · 244fd545

Authored by Chen, Kenneth W on Mar 12, 2006

Add init declaration to cpu initialization functions.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

244fd545

[IA64] fix ia64 is_hugepage_only_range · 2332c9ae

Authored by Chen, Kenneth W on Mar 22, 2006

Fix the is_hugepage_only_range() definition to mean "overlaps"
instead of "within architectural restricted hugetlb address
range".  Simplify the ia64 specific code that used to use
is_hugepage_only_range() to just check which region the
address is in.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

2332c9ae
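
The "overlaps" semantics described in 2332c9ae can be sketched as follows. This is a minimal illustration, not the kernel's code: it assumes that ia64 selects one of eight regions with the top three address bits and that region 4 is the hugepage region, and the names `region_number` and `overlaps_hugepage_region` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the top three address bits select the region, and
 * region 4 is the one reserved for hugepages. */
#define RGN_SHIFT 61
#define RGN_HPAGE 4u

static unsigned region_number(uint64_t addr)
{
    return (unsigned)(addr >> RGN_SHIFT);
}

/* Nonzero if the first or the last byte of [addr, addr + len) lies in
 * the hugepage region, i.e. the range overlaps it at either end. */
static int overlaps_hugepage_region(uint64_t addr, uint64_t len)
{
    assert(len > 0);
    return region_number(addr) == RGN_HPAGE ||
           region_number(addr + len - 1) == RGN_HPAGE;
}
```

A "within" test would require both ends to be in the region; the "overlaps" test also flags a range that starts in the neighbouring region and runs into the hugepage region.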

22 Mar, 2006 2 commits

[PATCH] hugepage: is_aligned_hugepage_range() cleanup · 42b88bef

Authored by David Gibson on Mar 22, 2006

Quite a long time back, prepare_hugepage_range() replaced
is_aligned_hugepage_range() as the callback from mm/mmap.c to arch code to
verify if an address range is suitable for a hugepage mapping.
is_aligned_hugepage_range() stuck around, but only to implement
prepare_hugepage_range() on archs which didn't implement their own.

Most archs (everything except ia64 and powerpc) used the same
implementation of is_aligned_hugepage_range().  On powerpc, which
implements its own prepare_hugepage_range(), the custom version was never
used.

In addition, "is_aligned_hugepage_range()" was a bad name, because it
suggests it returns true iff the given range is a good hugepage range,
whereas in fact it returns 0-or-error (so the sense is reversed).

This patch cleans up by abolishing is_aligned_hugepage_range().  Instead
prepare_hugepage_range() is defined directly.  Most archs use the default
version, which simply checks the given region is aligned to the size of a
hugepage.  ia64 and powerpc define custom versions.  The ia64 one simply
checks that the range is in the correct address space region in addition to
being suitably aligned.  The powerpc version (just as previously) checks
for suitable addresses, and if necessary performs low-level MMU frobbing to
set up new areas for use by hugepages.

No libhugetlbfs testsuite regressions on ppc64 (POWER5 LPAR).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

42b88bef
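
The default check described in 42b88bef (both the start address and the length must be aligned to the hugepage size, with a 0-or-error return) might look roughly like this. It is a sketch under assumptions: the 4 MB hugepage size is hypothetical, and the function name carries a `_sketch` suffix to make clear it is not the kernel's definition.

```c
#include <errno.h>

/* Hypothetical 4 MB hugepage; the real size is architecture-specific. */
#define HPAGE_SHIFT 22
#define HPAGE_SIZE  (1UL << HPAGE_SHIFT)
#define HPAGE_MASK  (~(HPAGE_SIZE - 1))

/* Returns 0 when the range is usable and a negative errno otherwise,
 * which is the "0-or-error" sense the commit message describes. */
static int prepare_hugepage_range_sketch(unsigned long addr, unsigned long len)
{
    if (len & ~HPAGE_MASK)      /* length not a multiple of HPAGE_SIZE */
        return -EINVAL;
    if (addr & ~HPAGE_MASK)     /* start not aligned to HPAGE_SIZE */
        return -EINVAL;
    return 0;
}
```

Because success is 0, a caller tests `if (prepare_hugepage_range_sketch(addr, len))` to detect a bad range, which is the reversed sense the old `is_aligned_hugepage_range()` name obscured.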

[PATCH] remove set_page_count() outside mm/ · 7835e98b

Authored by Nick Piggin on Mar 22, 2006

set_page_count usage outside mm/ is limited to setting the refcount to 1.
Remove set_page_count from outside mm/, and replace those users with
init_page_count() and set_page_refcounted().

This allows more debug checking, and tighter control on how code is allowed
to play around with page->_count.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

7835e98b

17 Jan, 2006 1 commit

[IA64] Simple memory hot-add for ia64. · 1681b8e1

Authored by Yasunori Goto on Jan 07, 2006

First step to memory hotplug for ia64 (add only,
all new memory is added to node 0, does not use
ZONE_EASY_RECLAIM yet).
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

1681b8e1

14 Jan, 2006 1 commit

[IA64] Hole in IA64 TLB flushing from system threads · cfbb1426

Authored by Jack Steiner on Dec 22, 2005

I originally thought this was a bug only in the SN code, but I think I
also see a hole in the generic IA64 tlb code. (Separate patch was sent
for the SN problem).

It looks like there is a bug in the TLB flushing code. During context switch,
kernel threads (kswapd, for example) inherit the mm of the task that was
previously running on the cpu. Normally, this is ok because the previous context
is still loaded into the RR registers. However, if the owner of the mm
migrates to another cpu, changes its context number, and references a
page before kswapd issues a tlb_purge for that same page, the purge will be
done with a stale context number (& RR registers).
Signed-off-by: Tony Luck <tony.luck@intel.com>

cfbb1426

06 Jan, 2006 1 commit

[IA64] support for cpu0 removal · ff741906

Authored by Ashok Raj on Nov 11, 2005

Here is the BSP removal support for IA64. It's pretty much the same thing that
was released a while back, but has your feedback incorporated.

- Removed CONFIG_BSP_REMOVE_WORKAROUND and associated cmdline param
- Fixed compile issue with sn2/zx1 due to an undefined fix_b0_for_bsp
- some formatting nits (whitespace etc)

This has been tested on tiger and long back by alex on hp systems as well.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

ff741906

07 Dec, 2005 1 commit

[IA64] Limit the maximum NODEDATA_ALIGN() offset · acb7f672

Authored by Jack Steiner on Dec 05, 2005

The per-node data structures are allocated with strided offsets that are a
function of the node number. This prevents excessive cache-aliasing from
occurring.

On systems with a large number of nodes, the strided offset becomes
too large. This patch restricts the maximum offset to 32MB. This is far larger
than the size of any current L3 cache.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

acb7f672
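
The strided offset with a 32MB cap described in acb7f672 can be illustrated with a small sketch. The 64 KB per-cpu page size, the 1 MB base alignment, and the choice to cap by masking (so the offset wraps around below 32MB) are assumptions made for illustration; the helper name `nodedata_align_sketch` is hypothetical.

```c
#include <stdint.h>

/* Assumptions for illustration: 64 KB per-cpu page, 1 MB base alignment. */
#define PERCPU_PAGE_SIZE      (64ULL * 1024)
#define NODE_BASE_ALIGN       (1024ULL * 1024)
#define MAX_NODE_ALIGN_OFFSET (32ULL * 1024 * 1024)

/* Round addr up to the base alignment, then add a per-node stride.
 * Masking with (MAX_NODE_ALIGN_OFFSET - 1) keeps the stride below
 * 32MB, wrapping it for large node numbers. */
static uint64_t nodedata_align_sketch(uint64_t addr, uint64_t node)
{
    uint64_t base = (addr + NODE_BASE_ALIGN - 1) & ~(NODE_BASE_ALIGN - 1);
    uint64_t stride = (node * PERCPU_PAGE_SIZE) & (MAX_NODE_ALIGN_OFFSET - 1);

    return base + stride;
}
```

With these assumed sizes, nodes 0 and 512 receive the same offset, since 512 strides of 64 KB is exactly 32MB; every stride in between still lands on a different cache alias.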

09 Nov, 2005 1 commit

[IA64] fix memory less node allocation · 97835245

Authored by Bob Picco on Oct 29, 2005

The original memory less node allocation attempted to use NODEDATA_ALIGN for
alignment.  The bootmem allocator only allows power-of-two alignments. This
causes a BUG_ON for some nodes. For cpu only nodes just allocate with a
PERCPU_PAGE_SIZE alignment.

Some older firmware reports SLIT distances of 0xff and results in bestnode
not being computed. This is now treated correctly.

The failed allocation check was removed because it's redundant.  The
bootmem allocator already makes this check.

This fix has been boot tested on a 4 node machine which has 4 cpu only nodes
and 1 memory node.  Thanks to Pete Keilty for reporting this and helping me
test it.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

97835245

04 Nov, 2005 1 commit

[IA64] make mmu_context.h and tlb.c 80-column friendly · 58cd9082

Authored by Chen, Kenneth W on Oct 29, 2005

wrap_mmu_context(), delayed_tlb_flush(), get_mmu_context() all
have an extra { } block which causes one extra indentation.
get_mmu_context() is particularly bad with 5 indentations to
the innermost "if".  It finally gets on my nerves that I can't
keep the code within 80 columns.  Remove the extra { } block
and while I'm at it, reformat all the comments to be 80-column
friendly.  No functional change at all with this patch.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

58cd9082

01 Nov, 2005 1 commit

[IA64] Use bitmaps for efficient context allocation/free · dcc17d1b

Authored by Peter Keilty on Oct 31, 2005

Corrects the very inefficient method of finding free context_ids in
get_mmu_context().  Instead of walking the task_list of all processes,
2 bitmaps are used to efficiently store and look up state: in use and
needs flushing. The entire rid address space is now used before calling
wrap_mmu_context and global tlb flushing.

Special thanks to Ken and Rohit for their review and modifications in
using a bit flushmap.
Signed-off-by: Peter Keilty <peter.keilty@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

dcc17d1b
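
The two-bitmap scheme described in dcc17d1b can be sketched in user-space C as below. This is an illustration under stated assumptions, not the kernel implementation: the id space is shrunk to 64 entries so each bitmap fits in one word, the global TLB flush is only counted, and all names (`ctx_alloc`, `ctx_get`, `ctx_put`, `wrap_ctx`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CTX 64  /* hypothetical tiny id space: one 64-bit word per map */

struct ctx_alloc {
    uint64_t inuse;     /* bit set: id handed out and not yet reclaimed */
    uint64_t flushmap;  /* bit set: id freed, needs a flush before reuse */
    unsigned flushes;   /* number of simulated global TLB flushes */
};

/* Index of the lowest clear bit, or -1 if every bit is set. */
static int find_zero_bit(uint64_t map)
{
    for (int i = 0; i < MAX_CTX; i++)
        if (!(map & (UINT64_C(1) << i)))
            return i;
    return -1;
}

/* Id space exhausted: flush once, then reclaim every freed id. */
static void wrap_ctx(struct ctx_alloc *a)
{
    a->flushes++;               /* stands in for the global TLB flush */
    a->inuse &= ~a->flushmap;
    a->flushmap = 0;
}

/* Hand out the lowest unused id; wrap only when none is left. */
static int ctx_get(struct ctx_alloc *a)
{
    int id = find_zero_bit(a->inuse);

    if (id < 0) {
        wrap_ctx(a);
        id = find_zero_bit(a->inuse);
        if (id < 0)
            return -1;          /* every id is genuinely live */
    }
    a->inuse |= UINT64_C(1) << id;
    return id;
}

/* Freeing only marks the id as needing a flush; it stays "in use"
 * so it cannot be handed out again before the next wrap. */
static void ctx_put(struct ctx_alloc *a, int id)
{
    assert(id >= 0 && id < MAX_CTX);
    assert(a->inuse & (UINT64_C(1) << id));
    a->flushmap |= UINT64_C(1) << id;
}
```

Deferring reclamation to the wrap is what lets the whole id space be consumed before any global flush is paid for, and no walk over the task list is needed to find a free id.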

30 Oct, 2005 5 commits

[PATCH] memory hotplug locking: node_size_lock · 208d54e5

Authored by Dave Hansen on Oct 29, 2005

pgdat->node_size_lock is basically only needed in one place in the normal
code: show_mem(), which is the arch-specific sysrq-m printing function.

Strictly speaking, the architectures not doing memory hotplug do not need this
locking in show_mem(). However, they are all included for completeness. This
should also make any future consolidation of all of the implementations a
little more straightforward.

This lock is also held in the sparsemem code during a memory removal, as
sections are invalidated. This is the place where pfn_valid() is made false
for a memory area that's being removed. The lock is only required when doing
pfn_valid() operations on memory which the user does not already have a
reference on the page, such as in show_mem().
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

208d54e5

[PATCH] mm: flush_tlb_range outside ptlock · 663b97f7

Authored by Hugh Dickins on Oct 29, 2005

There was one small but very significant change in the previous patch:
mprotect's flush_tlb_range fell outside the page_table_lock: as it is in 2.4,
but that doesn't prove it safe in 2.6.

On some architectures flush_tlb_range comes to the same as flush_tlb_mm, which
has always been called from outside page_table_lock in dup_mmap, and is so
proved safe. Others required a deeper audit: I could find no reliance on
page_table_lock in any; but in ia64 and parisc found some code which looks a
bit as if it might want preemption disabled. That won't do any actual harm,
so pending a decision from the maintainers, disable preemption there.

Remove comments on page_table_lock from flush_tlb_mm, flush_tlb_range and
flush_tlb_page entries in cachetlb.txt: they were rather misleading (what
generic code does is different from what usually happens), the rules are now
changing, and it's not yet clear where we'll end up (will the generic
tlb_flush_mmu happen always under lock? never under lock? or sometimes under
and sometimes not?).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

663b97f7

[PATCH] mm: init_mm without ptlock · 872fec16

Authored by Hugh Dickins on Oct 29, 2005

First step in pushing down the page_table_lock. init_mm.page_table_lock has
been used throughout the architectures (usually for ioremap): not to serialize
kernel address space allocation (that's usually vmlist_lock), but because
pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it.

Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
and drop it when allocating a new one, to check lest a racing task already
did. Similarly no page_table_lock in vmalloc's map_vm_area.

Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
user mms, which are converted only by a later patch, for now they have to lock
differently according to whether or not it's init_mm.

If sources get muddled, there's a danger that an arch source taking
init_mm.page_table_lock will be mixed with common source also taking it (or
neither take it). So break the rules and make another change, which should
break the build for such a mismatch: remove the redundant mm arg from
pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).

Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
took page_table_lock for no good reason.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

872fec16

[PATCH] mm: ia64 use expand_upwards · 46dea3d0

Authored by Hugh Dickins on Oct 29, 2005

ia64 has expand_backing_store function for growing its Register Backing Store
vma upwards. But more complete code for this purpose is found in the
CONFIG_STACK_GROWSUP part of mm/mmap.c. Uglify its #ifdefs further to provide
expand_upwards for ia64 as well as expand_stack for parisc.

The Register Backing Store vma should be marked VM_ACCOUNT. Implement the
intention of growing it only a page at a time, instead of passing an address
outside of the vma to handle_mm_fault, with unknown consequences.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

46dea3d0

[PATCH] mm: vm_stat_account unshackled · ab50b8ed

Authored by Hugh Dickins on Oct 29, 2005

The original vm_stat_account has fallen into disuse, with only one user, and
only one user of vm_stat_unaccount.  It's easier to keep track if we convert
them all to __vm_stat_account, then free it from its __shackles.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

ab50b8ed

28 Oct, 2005 1 commit

[IA64] - Avoid slow TLB purges on SGI Altix systems · c1902aae

Authored by Dean Roe on Oct 27, 2005

flush_tlb_all() can be a scaling issue on large SGI Altix systems
since it uses the global call_lock and always executes on all cpus.
When a process enters flush_tlb_range() to purge TLBs for another
process, it is possible to avoid flush_tlb_all() and instead allow
sn2_global_tlb_purge() to purge TLBs only where necessary.

This patch modifies flush_tlb_range() so that this case can be handled
by platform TLB purge functions and updates ia64_global_tlb_purge()
accordingly.  sn2_global_tlb_purge() now calculates the region register
value from the mm argument introduced with this patch.
Signed-off-by: Dean Roe <roe@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

c1902aae

26 Oct, 2005 1 commit

[IA64] wider use of for_each_cpu_mask() in arch/ia64 · dc565b52

Authored by hawkes@sgi.com on Oct 10, 2005

In arch/ia64 change the explicit use of for-loops and NR_CPUS into the
general for_each_cpu() or for_each_online_cpu() constructs, as
appropriate.  This widens the scope of potential future optimizations
of the general constructs, as well as takes advantage of the existing
optimizations of first_cpu() and next_cpu().
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

dc565b52

05 Oct, 2005 3 commits

[PATCH] V5 ia64 SPARSEMEM - SPARSEMEM code changes · 2d4b1fa2

Authored by Bob Picco on Oct 04, 2005

This patch is the minimal set of changes required by ia64 to use SPARSEMEM.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

2d4b1fa2

[PATCH] V5 ia64 SPARSEMEM - eliminate contig_page_data · c678796c

Authored by Bob Picco on Oct 04, 2005

For FLATMEM contig_page_data has been made transparent to the arch code.
This patch conforms to that change.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

c678796c

[PATCH] V5 ia64 SPARSEMEM - Kconfig and Makefile · da9577c5

Authored by Bob Picco on Oct 04, 2005

The patch modifies the Kconfig file to introduce the new memory model
options and other related SPARSEMEM changes.  There is also a minor change
in the Makefile.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

da9577c5

08 Sep, 2005 1 commit

[PATCH] Kprobes: prevent possible race conditions ia64 changes · 1f7ad57b

Authored by Prasanna S Panchamukhi on Sep 06, 2005

This patch contains the ia64 architecture specific changes to prevent the
possible race conditions.
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

1f7ad57b

07 Sep, 2005 1 commit

[IA64] page_not_present fault in region 5 is normal · 63028aa7

Authored by Kiyoshi Ueda on Aug 24, 2005

When copying data from user-space to kernel-space by __copy_user(),
a page_not_present fault sometimes occurs at vmalloced kernel address
because of VHPT pre-fetching.

Ignore the page_not_present fault in ia64_do_page_fault() before
jumping into exception handlers.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

63028aa7

31 Aug, 2005 1 commit

[IA64] Fix nasty VMLPT problem... · 6cf07a8c

Authored by Peter Chubb on Aug 23, 2005

I've solved the problem I was having with the simulator and not
booting Debian.

The problem is that the number of bits for the virtual linear array
short-format VHPT (Virtually mapped linear page table, VMLPT for
short) is being tested incorrectly.

There are two problems:
      1. The PAL call that should tell the kernel the size of the
      virtual address space isn't implemented for the simulator, so
      the kernel uses the default 50.  This is addressed separately
      in dc90e95f.

      2. In arch/ia64/mm/init.c there's code to calculate the size
      of the VMLPT based on the number of implemented virtual address
      bits and the page size.  It checks to see if the VMLPT base
      address overlaps the top of the mapped region, but this check
      doesn't allow for the address space hole, and in fact will
      never trigger.

Here's an alternative test and panic, that I think is more accurate.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>

6cf07a8c

25 Aug, 2005 1 commit

[IA64] Rationalise Region Definitions · 0a41e250

Authored by Peter Chubb on Aug 16, 2005

Currently, region numbers are defined in several files, with several
names.  For example, we have REGION_KERNEL in asm/page.h and
RGN_KERNEL in pgtable.h.

We also have address definitions that should depend on the
RGN_XXX macros, but are currently just long constants.

The following patch reorganises all the definitions so that they have
the same form (RGN_XXX), are in one place, and that addresses that
depend on RGN_XXX are derived from them.

(This is a necessary but not sufficient patch to allow UML-like
operation on IA64).

Thanks to David Mosberger for catching the change I missed in mmu_context.h.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>

0a41e250

07 Jul, 2005 2 commits

[IA64] fix generic/up builds · 8d7e3517

Authored by Tony Luck on Jul 06, 2005

Jesse Barnes provided the original version of this patch months ago, but
other changes kept conflicting with it, so it got deferred. Greg Edwards
dug it out of obscurity just over a week ago, and almost immediately
another conflicting patch appeared (Bob Picco's memory-less nodes).

I've resolved the conflicts and got it running again. CONFIG_SGI_TIOCX
is set to "y" in defconfig, which causes a Tiger to not boot (oops in
tiocx_init). But that can be resolved later ... get this in now before it
gets stale again.
Signed-off-by: Tony Luck <tony.luck@intel.com>

8d7e3517

[IA64] memory-less-nodes repost · 564601a5

Authored by bob.picco on Jun 30, 2005

I reworked how nodes with only CPUs are treated.  The patch below seems
simpler to me and has eliminated the complicated routine
reassign_cpu_only_nodes.  There isn't any longer the requirement
to modify ACPI NUMA information which was in large part the
complexity introduced in reassign_cpu_only_nodes. 

This patch will produce a different number of nodes. For example,
reassign_cpu_only_nodes would reduce two CPU-only nodes and one memory node
configuration to one memory+CPUs node configuration.  This patch
doesn't change the number of nodes which means the user will see three.  Two
nodes without memory and one node with all the memory.

While doing this patch, I noticed that early_nr_phys_cpus_node isn't serving
any useful purpose.  It is called once in find_pernode_space but the value
isn't used to compute pernode space.
Signed-off-by: bob.picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

564601a5

24 Jun, 2005 2 commits

[PATCH] Kprobes/IA64: kdebug die notification mechanism · 7213b252

Authored by Anil S Keshavamurthy on Jun 23, 2005

As many of you know, kprobes exists in the mainline kernel for various
architectures including i386, x86_64, ppc64 and sparc64.  The patches
following this mail are a port of Kprobes and Jprobes for IA64.

I have tested these patches for Kprobes and Jprobes and they seem to work
fine.  I have tested them by inserting kprobes on various slots and various
templates including various types of branch instructions.

I have also tested this patch using the tool
http://marc.theaimsgroup.com/?l=linux-kernel&m=111657358022586&w=2 and the
kprobes for IA64 works great.

Here is the list of TODO items; patches for them will appear soon.

1) Support kprobes on "mov r1=ip" type of instruction
2) Support Kprobes and Jprobes to exist on the same address
3) Support Return probes
4) Architecture independent cleanup of kprobes

This patch adds the kdebug die notification mechanism needed by Kprobes.

For break instruction on Branch type slot, imm21 is ignored and value
zero is placed in IIM register, hence we need to handle kprobes
for switch case zero.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>

From: Rusty Lynch <rusty.lynch@intel.com>

At the point in traps.c where we receive a break with a zero value, we
cannot say if the break was a result of a kprobe or some other debug facility.

This simple patch changes the informational string to a more correct "break
0" value, and applies to the 2.6.12-rc2-mm2 tree with all the kprobes
patches that were just recently included for the next mm cut.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

7213b252

[PATCH] remove non-DISCONTIG use of pgdat->node_mem_map · 408fde81

Authored by Dave Hansen on Jun 23, 2005

This patch effectively eliminates direct use of pgdat->node_mem_map outside
of the DISCONTIG code.  On a flat memory system, these fields aren't
currently used, neither are they on a sparsemem system.

There was also a node_mem_map(nid) macro on many architectures.  Its use
along with the use of ->node_mem_map itself was not consistent.  It has
been removed in favor of two new, more explicit, arch-independent macros:

	pgdat_page_nr(pgdat, pagenr)
	nid_page_nr(nid, pagenr)

I called them "pgdat" and "nid" because we overload the term "node" to mean
"NUMA node", "DISCONTIG node" or "pg_data_t" in very confusing ways.  I
believe the newer names are much clearer.

These macros can be overridden in the sparsemem case with a theoretically
slower operation using node_start_pfn and pfn_to_page(), instead.  We could
make this the only behavior if people want, but I don't want to change too
much at once.  One thing at a time.

This patch removes more code than it adds.

Compile tested on alpha, alpha discontig, arm, arm-discontig, i386, i386
generic, NUMAQ, Summit, ppc64, ppc64 discontig, and x86_64.  Full list
here: http://sr71.net/patches/2.6.12/2.6.12-rc1-mhp2/configs/

Boot tested on NUMAQ, x86 SMP and ppc64 power4/5 LPARs.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin J. Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

408fde81
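
The two macros named in 408fde81 can be illustrated with a self-contained sketch. The macro bodies below are an assumption about what such accessors would look like for a flat per-node `node_mem_map`; `struct page`, `pg_data_t`, `NODE_DATA()` and the two-node table are stand-ins defined here purely so the example compiles in user space.

```c
/* Stand-in types: the real kernel structures carry far more state. */
struct page { unsigned long flags; };
typedef struct pglist_data { struct page *node_mem_map; } pg_data_t;

/* Hypothetical two-node system with four pages per node. */
#define MAX_NUMNODES 2
static struct page node0_pages[4], node1_pages[4];
static pg_data_t node_data[MAX_NUMNODES] = {
    { node0_pages },
    { node1_pages },
};
#define NODE_DATA(nid) (&node_data[(nid)])

/* Assumed shape of the accessors: index into the node's mem_map,
 * starting either from a pg_data_t pointer or from a node id. */
#define pgdat_page_nr(pgdat, pagenr) ((pgdat)->node_mem_map + (pagenr))
#define nid_page_nr(nid, pagenr)     pgdat_page_nr(NODE_DATA(nid), (pagenr))
```

Callers that go through these accessors never touch `->node_mem_map` directly, which is what would allow a sparsemem build to substitute a different definition.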

22 Jun, 2005 1 commit

[PATCH] Hugepage consolidation · 63551ae0

Authored by David Gibson on Jun 21, 2005

A lot of the code in arch/*/mm/hugetlbpage.c is quite similar.  This patch
attempts to consolidate a lot of the code across the arch's, putting the
combined version in mm/hugetlb.c.  There are a couple of uglyish hacks in
order to convert all the hugepage archs, but the result is a very large
reduction in the total amount of code.  It also means things like hugepage
lazy allocation could be implemented in one place, instead of six.

Tested, at least a little, on ppc64, i386 and x86_64.

Notes:
	- this patch changes the meaning of set_huge_pte() to be more
	  analogous to set_pte()
	- does SH4 need a special huge_ptep_get_and_clear()??
Acked-by: William Lee Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

63551ae0

09 Jun, 2005 1 commit

[IA64] Fill holes in FIXADDR_USER space with zero pages. · ad597bd5

Authored by David Mosberger-Tang on Jun 08, 2005

This fixes an oops reported by Jason Baron.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

ad597bd5

26 Apr, 2005 3 commits

[IA64] Need to handle lfetch in "no_context" case. · f0a8d3c9

Authored by Tony Luck on Apr 25, 2005

Thanks to Mark for tracking down this one. Users of __copy_from_user_inatomic()
will be sad if we don't handle lfetch faults for the "no_context" case.
Signed-off-by: Tony Luck <tony.luck@intel.com>

f0a8d3c9

[IA64] MAX_PGT_FREES_PER_PASS must be 'L' to avoid warning · e96c9b47

Authored by Tony Luck on Apr 25, 2005

'min' is very picky about types of arguments, make it happy
Signed-off-by: Tony Luck <tony.luck@intel.com>

e96c9b47

[IA64] Percpu quicklist for combined allocator for pgd/pmd/pte. · fde740e4

Authored by Robin Holt on Apr 25, 2005

This patch introduces using the quicklists for pgd, pmd, and pte levels
by combining the alloc and free functions into a common set of routines.
This greatly simplifies the reading of this header file.

This patch is simple but necessary for large numa configurations.
It simply ensures that only pages from the local node are added to a
cpu's quicklist. This prevents the trapping of pages on a remote node's
quicklist by starting a process, touching a large number of pages to
fill pmd and pte entries, migrating to another node, and then unmapping
or exiting. With those conditions, the pages get trapped and if the
machine has more than 100 nodes of the same size, the calculation of
the pgtable high water mark will be larger than any single node so page
table cache flushing will never occur.

I ran lmbench lat_proc fork and lat_proc exec on a zx1 with and without
this patch and did not notice any change.

On an sn2 machine, there was a slight improvement which is possibly
due to pages from other nodes trapped on the test node before starting
the run. I did not investigate further.

This patch shrinks the quicklist based upon free memory on the node
instead of the high/low water marks. I have written it to enable
preemption periodically and recalculate the amount to shrink every time
we have freed enough pages that the quicklist size should have grown.
I rescan the node's zones each pass because other processes may be
draining node memory at the same time as we are adding.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

fde740e4

20 Apr, 2005 2 commits

[PATCH] freepgt: hugetlb_free_pgd_range · 3bf5ee95

Authored by Hugh Dickins on Apr 19, 2005

ia64 and ppc64 had hugetlb_free_pgtables functions which were no longer being
called, and it wasn't obvious what to do about them.

The ppc64 case turns out to be easy: the associated tables are noted elsewhere
and freed later, safe to either skip its hugetlb areas or go through the
motions of freeing nothing. Since ia64 does need a special case, restore to
ppc64 the special case of skipping them.

The ia64 hugetlb case has been broken since pgd_addr_end went in, though it
probably appeared to work okay if you just had one such area; in fact it's
been broken much longer if you consider a long munmap spanning from another
region into the hugetlb region.

In the ia64 hugetlb region, more virtual address bits are available than in
the other regions, yet the page tables are structured the same way: the page
at the bottom is larger. Here we need to scale down each addr before passing
it to the standard free_pgd_range. Was about to write a hugely_scaled_down
macro, but found htlbpage_to_page already exists for just this purpose. Fixed
off-by-one in ia64 is_hugepage_only_range.

Uninline free_pgd_range to make it available to ia64. Make sure the
vma-gathering loop in free_pgtables cannot join a hugepage_only_range to any
other (safe to join huges? probably but don't bother).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

3bf5ee95

[PATCH] freepgt: free_pgtables use vma list · e0da382c

Authored by Hugh Dickins on Apr 19, 2005

Recent woes with some arches needing their own pgd_addr_end macro; and 4-level
clear_page_range regression since 2.6.10's clear_page_tables; and its
long-standing well-known inefficiency in searching throughout the higher-level
page tables for those few entries to clear and free: all can be blamed on
ignoring the list of vmas when we free page tables.

Replace exit_mmap's clear_page_range of the total user address space by
free_pgtables operating on the mm's vma list; unmap_region use it in the same
way, giving floor and ceiling beyond which it may not free tables. This
brings lmbench fork/exec/sh numbers back to 2.6.10 (unless preempt is enabled,
in which case latency fixes spoil unmap_vmas throughput).

Beware: the do_mmap_pgoff driver failure case must now use unmap_region
instead of zap_page_range, since a page table might have been allocated, and
can only be freed while it is touched by some vma.

Move free_pgtables from mmap.c to memory.c, where its lower levels are adapted
from the clear_page_range levels. (Most of free_pgtables' old code was
actually for a non-existent case, prev not properly set up, dating from before
hch gave us split_vma.) Pass mmu_gather** in the public interfaces, since we
might want to add latency lockdrops later; but no attempt to do so yet, going
by vma should itself reduce latency.

But what if is_hugepage_only_range? Those ia64 and ppc64 cases need careful
examination: put that off until a later patch of the series.

What of x86_64's 32bit vdso page __map_syscall32 maps outside any vma?

And the range to sparc64's flush_tlb_pgtables? It's less clear to me now that
we need to do more than is done here - every PMD_SIZE ever occupied will be
flushed, do we really have to flush every PGDIR_SIZE ever partially occupied?
A shame to complicate it unnecessarily.

Special thanks to David Miller for time spent repairing my ceilings.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e0da382c

openeuler / Kernel: mirror synced successfully about 1 year ago