1. 17 July 2007, 10 commits
    • hugetlb: fix race in alloc_fresh_huge_page() · f96efd58
      Committed by Joe Jin
      That static `nid' index needs locking.  Without it we can end up calling
      alloc_pages_node() with an illegal node ID and the kernel crashes.
      Acked-by: Gurudas Pai <gurudas.pai@oracle.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f96efd58
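      A minimal sketch of the pattern this fix describes, assuming the usual
      nodemask helpers; the names are illustrative, not the literal
      mm/hugetlb.c diff:

        static DEFINE_SPINLOCK(nid_lock);

        static struct page *alloc_fresh_huge_page_sketch(gfp_t gfp, int order)
        {
            static int nid;     /* round-robin node index shared by all callers */
            int this_nid;

            /* Unlocked, two callers could both advance `nid` past the
             * online map and hand an illegal node ID to alloc_pages_node(). */
            spin_lock(&nid_lock);
            this_nid = nid;
            nid = next_node(nid, node_online_map);
            if (nid == MAX_NUMNODES)
                nid = first_node(node_online_map);
            spin_unlock(&nid_lock);

            return alloc_pages_node(this_nid, gfp, order);
        }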
    • vmscan: fix comments related to shrink_list() · 2706a1b8
      Committed by Anderson Briglia
      Fix the shrink_list name in comments in some files under the mm/ directory.
      Signed-off-by: Anderson Briglia <anderson.briglia@indt.org.br>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2706a1b8
    • slob: improved alignment handling · 55394849
      Committed by Nick Piggin
      Remove the core slob allocator's minimum alignment restrictions, and instead
      introduce the alignment restrictions at the slab API layer.  This lets us heed
      the ARCH_KMALLOC/SLAB_MINALIGN directives, and also use __alignof__ (unsigned
      long) for the default alignment (which should allow relaxed alignment
      architectures to take better advantage of SLOB's small minimum alignment).
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      55394849
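      A hedged sketch of the layering the commit describes: the core allocator
      takes an explicit alignment and the slab API layer computes it.
      slob_alloc() here stands in for the internal entry point:

        #ifndef ARCH_KMALLOC_MINALIGN
        #define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long)
        #endif

        void *kmalloc_sketch(size_t size, gfp_t gfp)
        {
            /* The minimum alignment now lives here, at the API layer;
             * the core slob allocator only honours what it is passed. */
            int align = max_t(int, ARCH_KMALLOC_MINALIGN,
                              __alignof__(unsigned long));

            return slob_alloc(size, gfp, align);
        }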
    • slob: remove bigblock tracking · d87a133f
      Committed by Nick Piggin
      Remove the bigblock lists in favour of using compound pages and going directly
      to the page allocator.  Allocation size is stored in page->private, which also
      makes ksize more accurate than it previously was.
      
      Saves ~0.5K of code, and 12-24 bytes of overhead per >= PAGE_SIZE allocation.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d87a133f
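      A sketch of the >= PAGE_SIZE path the commit describes, under the
      assumption that compound pages and page->private are used as stated;
      the function names are hypothetical:

        static void *slob_big_alloc(size_t size, gfp_t gfp)
        {
            int order = get_order(size);
            struct page *page = alloc_pages(gfp | __GFP_COMP, order);

            if (!page)
                return NULL;
            page->private = size;   /* ksize() can now report it exactly */
            return page_address(page);
        }

        static size_t slob_big_ksize(const void *block)
        {
            return virt_to_page(block)->private;
        }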
    • slob: rework freelist handling · 95b35127
      Committed by Nick Piggin
      Improve slob by turning the freelist into a list of pages using struct page
      fields; each page then has a singly linked freelist of slob blocks, via a
      pointer in the struct page.
      
      - The first benefit is that the slob freelists can be indexed by a smaller
        type (2 bytes, if the PAGE_SIZE is reasonable).
      
      - Next is that freeing is much quicker because it does not have to traverse
        the entire freelist. Allocation can be slightly faster too, because we can
        skip almost-full freelist pages completely.
      
      - Slob pages are then freed immediately when they become empty, rather than
        having a periodic timer try to free them. This improves both efficiency
        and memory consumption.
      
      Then, we don't encode separate size and next fields into each slob block;
      rather, we use the sign bit to distinguish between "size" and "next". Size-1
      blocks contain a "next" offset, and others contain the "size" in the
      first unit and the "next" in the second unit.
      
      - This allows the minimum slob allocation alignment to go from 8 bytes to
        2 bytes on 32-bit and from 12 bytes to 2 bytes on 64-bit. In practice,
        it is best to align them to word size; however, some architectures
        (e.g. cris) could gain space savings from turning off this extra
        alignment.
      
      Then, make kmalloc use its own slob_block at the front of the allocation
      in order to encode allocation size, rather than rely on not overwriting
      slob's existing header block.
      
      - This reduces kmalloc allocation overhead similarly to alignment reductions.
      
      - Decouples kmalloc layer from the slob allocator.
      
      Then, add a page flag specific to slob pages.
      
      - This means kfree of a page aligned slob block doesn't have to traverse
        the bigblock list.
      
      I would get benchmarks, but my test box's network doesn't come up with
      slob before this patch. I think something is timing out. Anyway, things
      are faster after the patch.
      
      Code size goes up about 1K, however dynamic memory usage _should_ be
      lower even on relatively small memory systems.
      
      A future todo item is to restore the cyclic free-list search, rather than
      always beginning at the start.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      95b35127
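      A simplified sketch of the sign-bit encoding described above; the 2-byte
      unit type and the helper names are assumptions, and the real code stores
      page-relative offsets rather than raw values:

        typedef short slobidx_t;    /* 2 bytes is enough for in-page offsets */

        static void set_slob(slobidx_t *s, slobidx_t size, slobidx_t next)
        {
            if (size > 1) {
                s[0] = size;    /* positive first unit: this is a "size" */
                s[1] = next;    /* the link lives in the second unit */
            } else {
                s[0] = -next;   /* negative unit: a size-1 block whose
                                 * value is the (negated) "next" offset */
            }
        }

        static slobidx_t slob_units(const slobidx_t *s)
        {
            return s[0] > 0 ? s[0] : 1;
        }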
    • MM: alloc_large_system_hash() can free some memory for non power-of-two bucketsize · 1037b83b
      Committed by Eric Dumazet
      alloc_large_system_hash() is called at boot time to allocate space for
      several large hash tables.
      
      Recently, the TCP hash table was changed and its bucketsize is no longer
      a power of two.
      
      On most setups, alloc_large_system_hash() allocates one big page (order >
      0) with __get_free_pages(GFP_ATOMIC, order).  This single high-order page
      has a power-of-two size, bigger than the needed size.

      We can free all pages that won't be used by the hash table.
      
      On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory.
      
      TCP established hash table entries: 32768 (order: 6, 393216 bytes)
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1037b83b
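      A sketch of the trick, assuming split_page() to make the tail pages
      independently freeable; the variable names are illustrative:

        unsigned long size = nentries * bucketsize;     /* actual need */
        int order = get_order(size);
        unsigned long table = __get_free_pages(GFP_ATOMIC, order);

        if (table) {
            unsigned long alloc_end = table + (PAGE_SIZE << order);
            unsigned long used = table + PAGE_ALIGN(size);

            /* Turn the high-order block into independent order-0 pages,
             * then return every page past the table's real end. */
            split_page(virt_to_page((void *)table), order);
            while (used < alloc_end) {
                free_page(used);
                used += PAGE_SIZE;
            }
        }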
    • Make /proc/slabinfo use seq_list_xxx helpers · b92151ba
      Committed by Pavel Emelianov
      This entry prints a header in the .start callback.  This is OK, but the
      more elegant solution would be to move it into the .show callback and use
      seq_list_start_head() in the .start one.

      I have left it as is in order to make the patch just switch to the new
      API and nothing more.
      
      [adobriyan@sw.ru: Wrong pointer was used as kmem_cache pointer]
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b92151ba
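      A sketch of what the seq_list_xxx conversion typically looks like;
      cache_chain, cache_chain_mutex and print_slabinfo_header() stand in
      for the real slab globals:

        static void *s_start(struct seq_file *m, loff_t *pos)
        {
            mutex_lock(&cache_chain_mutex);
            if (!*pos)
                print_slabinfo_header(m);   /* header still emitted in .start */
            return seq_list_start(&cache_chain, *pos);
        }

        static void *s_next(struct seq_file *m, void *p, loff_t *pos)
        {
            return seq_list_next(p, &cache_chain, pos);
        }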
    • MM: use DIV_ROUND_UP() in mm/memory.c · 68e116a3
      Committed by Rolf Eike Beer
      Replace a hand-coded version of DIV_ROUND_UP().
      Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      68e116a3
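      For reference, the macro from <linux/kernel.h> and the kind of open-coded
      form it replaces (a representative example, not the exact mm/memory.c
      expression):

        #define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

        /* before: (size + PMD_SIZE - 1) / PMD_SIZE */
        /* after:  DIV_ROUND_UP(size, PMD_SIZE)     */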
    • hugetlb: remove unnecessary nid initialization · 31a5c6e4
      Committed by Nishanth Aravamudan
      nid is initialized to numa_node_id() but will either be overwritten in
      the loop or not used in the conditional. So remove the initialization.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      31a5c6e4
    • change zonelist order: zonelist order selection logic · f0c0b2b8
      Committed by KAMEZAWA Hiroyuki
      Make the zonelist creation policy selectable from a sysctl/boot option (v6).

      This patch makes NUMA's zonelist (of pgdat) order selectable.
      The available orders are Default (automatic), Node-based, and Zone-based.
      
      [Default Order]
      The kernel selects Node-based or Zone-based order automatically.
      
      [Node-based Order]
      This policy treats the locality of memory as the most important parameter.
      Zonelist order is created by each zone's locality. This means lower zones
      (ex. ZONE_DMA) can be used before higher zone (ex. ZONE_NORMAL) exhausion.
      IOW. ZONE_DMA will be in the middle of zonelist.
      current 2.6.21 kernel uses this.
      
      Pros.
       * A user can expect local memory as much as possible.
      Cons.
       * A lower zone will be exhausted before a higher zone. This may cause
         OOM_KILL.

      Maybe suitable if ZONE_DMA is relatively big, you never see OOM_KILL
      because of ZONE_DMA exhaustion, and you need the best locality.
      
      (example)
      assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL.
      
      *node(0)'s memory allocation order:
      
       node(0)'s NORMAL -> node(0)'s DMA -> node(1)'s NORMAL.
      
      *node(1)'s memory allocation order:
      
       node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA.
      
      [Zone-based order]
      This policy treats the zone type as the most important parameter.
      The zonelist order is created from the zone-type order. This means a lower
      zone is never used before the higher zones are exhausted.
      IOW, ZONE_DMA will always be at the tail of the zonelist.
      
      Pros.
       * OOM_KILL (because of a lower zone) occurs only if all zones are
         exhausted.
      Cons.
       * Memory locality may not be the best.
      
      (example)
      assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL.
      
      *node(0)'s memory allocation order:
      
       node(0)'s NORMAL -> node(1)'s NORMAL -> node(0)'s DMA.
      
      *node(1)'s memory allocation order:
      
       node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA.
      
      The boot option "numa_zonelist_order=" and a proc/sysctl entry are
      supported.

      command:
      %echo N > /proc/sys/vm/numa_zonelist_order

      will rebuild the zonelist in Node-based order.

      command:
      %echo Z > /proc/sys/vm/numa_zonelist_order

      will rebuild the zonelist in Zone-based order.
      
      Thanks to Lee Schermerhorn, who gave me much help and code.
      
      [Lee.Schermerhorn@hp.com: add check_highest_zone to build_zonelists_in_zone_order]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: "jesse.barnes@intel.com" <jesse.barnes@intel.com>
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f0c0b2b8
  2. 12 July 2007, 1 commit
    • security: Protection for exploiting null dereference using mmap · ed032189
      Committed by Eric Paris
      Add a new security check on mmap operations to see if the user is attempting
      to mmap into the low area of the address space.  The amount of space
      protected is indicated by the new proc tunable /proc/sys/vm/mmap_min_addr,
      which defaults to 0, preserving existing behavior.
      
      This patch uses a new SELinux security class, "memprotect."  Policy already
      contains a number of allow rules like a_t self:process * (unconfined_t being
      one of them), which mean that putting this check in the process class (its
      best current fit) would make it useless, as all user processes, which we also
      want to protect against, would be allowed.  Giving the new class the name
      memprotect will also make it possible for us to move some of the other
      memory-protection permissions out of 'process' and into the new class the
      next time we bump the policy version number (which I also think is a good
      idea for the future).
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Acked-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      ed032189
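      A sketch of the idea, not the actual SELinux hook: deny mappings below
      the tunable floor.  Using CAP_SYS_RAWIO as the override capability is an
      assumption of this sketch:

        unsigned long mmap_min_addr;    /* /proc/sys/vm/mmap_min_addr */

        static int mmap_low_check_sketch(unsigned long addr)
        {
            /* the default mmap_min_addr == 0 preserves the old behaviour */
            if (addr < mmap_min_addr && !capable(CAP_SYS_RAWIO))
                return -EACCES;
            return 0;
        }

      Raising the floor from userspace would then look like:

      %echo 65536 > /proc/sys/vm/mmap_min_addr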
  3. 10 July 2007, 3 commits
    • xip sendfile removal · d054fe3d
      Committed by Carsten Otte
      This patch removes xip_file_sendfile, the sendfile implementation for
      xip, without replacement. Those customers that use xip on s390 are not
      using sendfile() as far as we know, and s390 is so far the only platform
      this could potentially be used on.
      Having sendfile is not a popular feature for execute-in-place file
      systems; however, we have a working implementation of splice_read() based
      on fs/splice.c if anyone asks for it.
      At this point in time, it does not seem preferable to merge
      splice_read() for xip because it causes extra maintenance effort due to
      code duplication, and it requires a struct page behind the xip memory
      segment. We'd like to get rid of that in favor of supporting flash-based
      embedded platforms (Monta Vista work) soon.
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      d054fe3d
    • shmem: convert to using splice instead of sendfile() · ae976416
      Committed by Hugh Dickins
      Remove shmem_file_sendfile and resurrect shmem_readpage, as used by tmpfs
      to support loop and sendfile in 2.4 and 2.5.  Now tmpfs can support splice,
      loop and sendfile in the simplest way, using generic_file_splice_read and
      generic_file_splice_write (with the aid of shmem_prepare_write).
      
      We could make some efficiency tweaks later, if there's a real need;
      but this is stable and works well as is.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      ae976416
    • sendfile: kill generic_file_sendfile() · 0452a4e5
      Committed by Jens Axboe
      It's no longer used.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      0452a4e5
  4. 09 July 2007, 1 commit
  5. 07 July 2007, 2 commits
  6. 06 July 2007, 1 commit
    • Fix slab redzone alignment · 87a927c7
      Committed by David Woodhouse
      Commit b46b8f19 fixed a couple of bugs
      by switching the redzone to 64 bits. Unfortunately, it neglected to
      ensure that the _second_ redzone, after the slab object, is aligned
      correctly. This caused illegal instruction faults on sparc32, which for
      some reason not entirely clear to me are not trapped and fixed up.
      
      Two things need to be done to fix this:
        - increase the object size, rounding up to alignof(long long) so
          that the second redzone can be aligned correctly.
        - If SLAB_STORE_USER is set but alignof(long long)==8, allow a
          full 64 bits of space for the user word at the end of the buffer,
          even though we may not _use_ the whole 64 bits.
      
      This patch should be a no-op on any 64-bit architecture or any 32-bit
      architecture where alignof(long long) == 4. Of the others, it's tested
      on ppc32 by myself and a very similar patch was tested on sparc32 by
      Mark Fortescue, who reported the new problem.
      
      Also, fix the conditions for FORCED_DEBUG, which hadn't been adjusted to
      the new sizes. Again noticed by Mark.
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      87a927c7
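      A sketch of the two fixes in the commit's own terms, assuming the ALIGN()
      helper; this is not the literal mm/slab.c diff:

        if (flags & SLAB_RED_ZONE) {
            /* round the object up so the redzone placed *after* it
             * is 64-bit aligned even on 32-bit machines */
            size = ALIGN(size, __alignof__(unsigned long long));
            size += 2 * sizeof(unsigned long long);   /* redzones either side */
        }
        if (flags & SLAB_STORE_USER)
            /* reserve a full 64 bits for the user word, even though
             * we may not use all of them */
            size += sizeof(unsigned long long);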
  7. 04 July 2007, 1 commit
  8. 02 July 2007, 1 commit
  9. 29 June 2007, 1 commit
    • mm: kill validate_anon_vma to avoid mapcount BUG · 30acbaba
      Committed by Hugh Dickins
      validate_anon_vma gave a useful check on the integrity of the anon_vma list
      when Andrea was developing obj rmap; but it was not enabled in SLES9
      itself, nor in mainline, until Nick changed commented-out RMAP_DEBUG to
      configurable CONFIG_DEBUG_VM in 2.6.17.  Now Petr Vandrovec reports that
      its BUG_ON(mapcount > 100000) can easily crash a CONFIG_DEBUG_VM=y system.
      
      That limit was just an arbitrary number to protect against an infinite
      loop.  We could raise it to something enormous (depending on sizeof struct
      vma and size of memory?); but I rather think validate_anon_vma has outlived
      its usefulness, and is better just removed - which gives a magnificent
      performance boost to anything like Petr's test program ;)
      
      Of course, a very long anon_vma list is bad news for preemption latency,
      and I believe there has been one recent report of such: let's not forget
      that, but validate_anon_vma only makes it worse not better.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Cc: Petr Vandrovec <petr@vmware.com>
      Acked-by: Nick Piggin <npiggin@suse.de>
      Cc: Andrea Arcangeli <andrea@suse.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      30acbaba
  10. 24 June 2007, 1 commit
  11. 22 June 2007, 1 commit
  12. 17 June 2007, 3 commits
  13. 16 June 2007, 1 commit
    • mm: Fix memory/cpu hotplug section mismatch and oops. · d09c6b80
      Committed by Paul Mundt
      When building with memory hotplug enabled and cpu hotplug disabled, we
      end up with the following section mismatch:
      
      WARNING: mm/built-in.o(.text+0x4e58): Section mismatch: reference to
      .init.text: (between 'free_area_init_node' and '__build_all_zonelists')
      
      This happens as a result of:
      
              -> free_area_init_node()
                -> free_area_init_core()
                  -> zone_pcp_init() <-- all __meminit up to this point
                    -> zone_batchsize() <-- marked as __cpuinit
      
      This happens because CONFIG_HOTPLUG_CPU=n sets __cpuinit to __init, but
      CONFIG_MEMORY_HOTPLUG=y unsets __meminit.
      
      Changing zone_batchsize() to __devinit fixes this.
      
      __devinit is the only thing that is common between CONFIG_HOTPLUG_CPU=y and
      CONFIG_MEMORY_HOTPLUG=y. In the long run, perhaps this should be moved to
      another section identifier completely. Without this, memory hot-add
      of offline nodes (via hotadd_new_pgdat()) will oops if CPU hotplug is
      not also enabled.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      --
      
       mm/page_alloc.c |    2 +-
       1 file changed, 1 insertion(+), 1 deletion(-)
      d09c6b80
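      The diffstat corresponds to a one-line annotation change, roughly:

        -static int __cpuinit zone_batchsize(struct zone *zone)
        +static int __devinit zone_batchsize(struct zone *zone)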
  14. 09 June 2007, 4 commits
  15. 01 June 2007, 3 commits
  16. 31 May 2007, 2 commits
  17. 24 May 2007, 3 commits
  18. 22 May 2007, 1 commit
    • Detach sched.h from mm.h · e8edc6e0
      Committed by Alexey Dobriyan
      The first thing mm.h does is include sched.h, solely for the can_do_mlock()
      inline function, which dereferences "current".  By dealing with
      can_do_mlock(), mm.h can be detached from sched.h, which is good.  See
      below for why.
      
      This patch
      a) removes the unconditional inclusion of sched.h from mm.h
      b) makes can_do_mlock() a normal function in mm/mlock.c
      c) exports can_do_mlock() so as not to break compilation
      d) adds sched.h inclusions back to files that were getting it indirectly.
      e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that
         were getting them indirectly
      
      Net result is:
      a) mm.h users would get less code to open, read, preprocess, parse, ... if
         they don't need sched.h
      b) sched.h stops being dependency for significant number of files:
         on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
         after patch it's only 3744 (-8.3%).
      
      Cross-compile tested on
      
      	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
      	alpha alpha-up
      	arm
      	i386 i386-up i386-defconfig i386-allnoconfig
      	ia64 ia64-up
      	m68k
      	mips
      	parisc parisc-up
      	powerpc powerpc-up
      	s390 s390-up
      	sparc sparc-up
      	sparc64 sparc64-up
      	um-x86_64
      	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
      
      as well as my two usual configs.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e8edc6e0
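      A rough sketch of the mechanical part of the move: the inline leaves mm.h
      (where its "current" dereference forced the sched.h include) and becomes
      an ordinary exported function:

        /* include/linux/mm.h: declaration only, no sched.h needed */
        extern int can_do_mlock(void);

        /* mm/mlock.c */
        int can_do_mlock(void)
        {
            if (capable(CAP_IPC_LOCK))
                return 1;
            if (current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur != 0)
                return 1;
            return 0;
        }
        EXPORT_SYMBOL(can_do_mlock);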