Commits · e5a5623b28198aa91ea71ee5d3846757fc76bc87 · openeuler / raspberrypi-kernel

14 Jan, 2011 2 commits

mm: remove unused get_vm_area_node · e5a5623b

Authored by David Rientjes on Jan 13, 2011

get_vm_area_node() is unused in the kernel and can thus be removed.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e5a5623b

mm: convert sprintf_symbol to %pS · 62c70bce

Authored by Joe Perches on Jan 13, 2011

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

62c70bce

03 Dec, 2010 1 commit

vmalloc: eagerly clear ptes on vunmap · 64141da5

Authored by Jeremy Fitzhardinge on Dec 02, 2010

On stock 2.6.37-rc4, running:

  # mount lilith:/export /mnt/lilith
  # find  /mnt/lilith/ -type f -print0 | xargs -0 file

crashes the machine fairly quickly under Xen.  Often it results in oops
messages, but the couple of times I tried just now, it just hung quietly
and made Xen print some rude messages:

    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
    3000000000000000) for mfn 1d7058 (pfn 18fa7)
    (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
    1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
    (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04

Which means the domain tried to map a pagetable page RW, which would
allow it to map arbitrary memory, so Xen stopped it.  This is because
vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
finished with them, and those pages got recycled as pagetable pages
while still having these RW aliases.

Removing those mappings immediately removes the Xen-visible aliases, and
so it has no problem with those pages being reused as pagetable pages.
Deferring the TLB flush doesn't upset Xen because it can flush the TLB
itself as needed to maintain its invariants.

When unmapping a region in the vmalloc space, clear the ptes
immediately.  There's no point in deferring this because there's no
amortization benefit.

The TLBs are left dirty, and they are flushed lazily to amortize the
cost of the IPIs.

The specific motivation for this patch is an oops-causing regression
since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
pages").  XFS also uses vm_map_ram() and could cause similar problems.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

64141da5
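
As a rough illustration of the split described in the commit above (ptes cleared at unmap time, TLB flush deferred), here is a small userspace model in C. The names `ptes`, `unmap_range()` and `flush_if_dirty()` are hypothetical and are not kernel symbols; the array only stands in for page-table entries.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPTES 8

static unsigned long ptes[NPTES];	/* stand-in for page-table entries */
static bool tlb_dirty;			/* a flush is owed, but not done yet */
static int tlb_flushes;			/* counts the (expensive) flushes */

/* Eager part: the mappings disappear right away, so no alias survives. */
static void unmap_range(size_t start, size_t n)
{
	for (size_t i = start; i < start + n && i < NPTES; i++)
		ptes[i] = 0;
	tlb_dirty = true;
}

/* Lazy part: one flush covers every unmap issued since the last flush. */
static void flush_if_dirty(void)
{
	if (tlb_dirty) {
		tlb_flushes++;
		tlb_dirty = false;
	}
}
```

Several `unmap_range()` calls followed by one `flush_if_dirty()` pay for a single flush, which is the amortization the message refers to.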

27 Oct, 2010 3 commits

mm: add vzalloc() and vzalloc_node() helpers · e1ca7788

Authored by Dave Young on Oct 26, 2010

Add vzalloc() and vzalloc_node() to encapsulate the
vmalloc-then-memset-zero operation.

Use __GFP_ZERO to zero fill the allocated memory.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e1ca7788
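
The pattern that vzalloc() encapsulates can be sketched in userspace, with malloc()/calloc() standing in for the kernel allocators. This is only an analogy under that assumption; `alloc_then_memset()` and `zalloc_sketch()` are made-up names.

```c
#include <stdlib.h>
#include <string.h>

/* Open-coded pattern: allocate, then clear by hand at every call site. */
static void *alloc_then_memset(size_t size)
{
	void *p = malloc(size);

	if (p)
		memset(p, 0, size);
	return p;
}

/* Helper in the spirit of vzalloc(): a single call returns zeroed memory. */
static void *zalloc_sketch(size_t size)
{
	return calloc(1, size);
}
```

Both return zero-filled memory or NULL; the helper form removes the chance of a caller forgetting the memset.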

vmalloc: annotate lock context change on s_start/stop() · e199b5d1

Authored by Namhyung Kim on Oct 26, 2010

s_start() and s_stop() grab/release vmlist_lock but were missing proper
annotations.  Add them.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e199b5d1

vmalloc: rename temporary variable in __insert_vmap_area() · 170168d0

Authored by Namhyung Kim on Oct 26, 2010

Rename redundant 'tmp' to fix the following sparse warnings:

 mm/vmalloc.c:296:34: warning: symbol 'tmp' shadows an earlier one
 mm/vmalloc.c:293:24: originally declared here
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

170168d0

02 Oct, 2010 1 commit

vmalloc: pcpu_get/free_vm_areas() aren't needed on UP · 0bc14062

Authored by Tejun Heo on Sep 03, 2010

These functions are used only by the percpu memory allocator on SMP.
Don't build them on UP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <npiggin@kernel.dk>

0bc14062

17 Sep, 2010 1 commit

mm, x86: Saving vmcore with non-lazy freeing of vmas · 3ee48b6a

Authored by Cliff Wickman on Sep 16, 2010

During the reading of /proc/vmcore the kernel is doing
ioremap()/iounmap() repeatedly, and the buildup of un-flushed
vm_area_structs is causing a great deal of overhead (rb_next()
is chewing up most of that time).

The solution is to provide a function, set_iounmap_nonlazy().  It
causes a subsequent call to iounmap() to immediately purge the
vma area (with try_purge_vmap_area_lazy()).

With this patch we have seen the time for writing a 250MB
compressed dump drop from 71 seconds to 44 seconds.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: kexec@lists.infradead.org
Cc: <stable@kernel.org>
LKML-Reference: <E1OwHZ4-0005WK-Tw@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3ee48b6a

08 Sep, 2010 1 commit

vmalloc: pcpu_get/free_vm_areas() aren't needed on UP · 4f8b02b4

Authored by Tejun Heo on Sep 03, 2010

These functions are used only by the percpu memory allocator on SMP.
Don't build them on UP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: Christoph Lameter <cl@linux.com>

4f8b02b4

10 Aug, 2010 2 commits

mm/vmalloc.c: check kmalloc() return value · 51980ac9

Authored by Kulikov Vasiliy on Aug 09, 2010

kmalloc() may fail; if so, return -ENOMEM.
Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

51980ac9

mm: use ERR_CAST · e7d86340

Authored by Julia Lawall on Aug 09, 2010

Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)).  The former makes the
purpose of the operation clearer; the latter otherwise looks like a
no-op.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
type T;
T x;
identifier f;
@@

T f (...) { <+...
- ERR_PTR(PTR_ERR(x))
+ x
 ...+> }

@@
expression x;
@@

- ERR_PTR(PTR_ERR(x))
+ ERR_CAST(x)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e7d86340
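
To make the ERR_CAST() idiom concrete, the following is a userspace imitation of the error-pointer helpers, modeled on the ones in the kernel's linux/err.h. The struct names and the two lookup functions are hypothetical; only the helper names mirror the kernel's.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, and decode it again. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Re-type an error pointer without touching its value. */
static inline void *ERR_CAST(const void *ptr) { return (void *)ptr; }

struct inode_sketch { int ino; };	/* hypothetical types */
struct dentry_sketch { int id; };

static struct inode_sketch *lookup_inode(int fail)
{
	static struct inode_sketch inode = { 1 };

	return fail ? ERR_PTR(-ENOMEM) : &inode;
}

static struct dentry_sketch *make_dentry(int fail)
{
	static struct dentry_sketch dentry = { 7 };
	struct inode_sketch *inode = lookup_inode(fail);

	if (IS_ERR(inode))
		return ERR_CAST(inode);	/* previously ERR_PTR(PTR_ERR(inode)) */
	return &dentry;
}
```

The error code survives the change of pointer type, and the intent (forwarding an error between two pointer types) is visible at the call site.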

27 Jul, 2010 1 commit

vmap: add flag to allow lazy unmap to be disabled at runtime · a0d40c80

Authored by Jeremy Fitzhardinge on Mar 26, 2010

Add a flag to force lazy_max_pages() to zero to prevent any outstanding
mapped pages.  We'll need this for Xen.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nick Piggin <npiggin@suse.de>

a0d40c80
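
A minimal sketch of how such a flag might gate the lazy budget, assuming a budget that is otherwise passed in by the caller. The names `lazy_unmap_enabled`, `lazy_max_pages_sketch()` and `should_purge_now()` are illustrative only; the real budget calculation is not described in the message above.

```c
#include <stdbool.h>

/* Runtime switch; clearing it forces the lazy budget to zero. */
static bool lazy_unmap_enabled = true;

static unsigned long lazy_max_pages_sketch(unsigned long budget_pages)
{
	return lazy_unmap_enabled ? budget_pages : 0;
}

/* With a zero budget, any outstanding lazily-freed page triggers a purge. */
static bool should_purge_now(unsigned long lazy_pages,
			     unsigned long budget_pages)
{
	return lazy_pages > lazy_max_pages_sketch(budget_pages);
}
```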

10 Jul, 2010 1 commit

x86, ioremap: Fix incorrect physical address handling in PAE mode · ffa71f33

Authored by Kenji Kaneshige on Jun 18, 2010

The current x86 ioremap() doesn't properly handle physical addresses
above 32 bits in X86_32 PAE mode.  When such an address is passed to
ioremap(), its upper 32 bits are wrongly cleared.  Due to this bug,
ioremap() can map the wrong address into linear address space.

In my case, a 64-bit MMIO region was assigned to a PCI device (ioat
device) on my system.  Because of the ioremap() bug, the wrong physical
address (instead of the MMIO region) was mapped into linear address space.
As a result, loading the ioatdma driver caused unexpected behavior
(kernel panic, kernel hangup, ...).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <4C1AE680.7090408@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

ffa71f33
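
The class of bug described above can be reproduced in a few lines of portable C. The sketch assumes the truncation comes from passing a 64-bit physical address through a 32-bit variable (on X86_32, unsigned long is 32 bits wide); the message does not say exactly where in ioremap() this happened, and both function names are hypothetical.

```c
#include <stdint.h>

/* uint32_t models a 32-bit unsigned long: the upper half is lost. */
static uint64_t through_32bit_var(uint64_t phys)
{
	uint32_t truncated = (uint32_t)phys;

	return truncated;
}

/* A 64-bit type keeps the whole PAE physical address. */
static uint64_t through_64bit_var(uint64_t phys)
{
	uint64_t kept = phys;

	return kept;
}
```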

03 Feb, 2010 2 commits

mm: purge fragmented percpu vmap blocks · 02b709df

Authored by Nick Piggin on Feb 01, 2010

Improve handling of fragmented per-CPU vmaps.  Previously we didn't free
a per-CPU map until all of its addresses had been used and freed, so
fragmented blocks could fill up vmalloc space even if they actually had
no active vmap regions within them.

Add some logic to allow all CPUs to have these blocks purged in the case
of failure to allocate a new vm area, and also put some logic to trim
such blocks of a current CPU if we hit them in the allocation path (so
as to avoid a large build up of them).

Christoph reported some vmap allocation failures when using the per CPU
vmap APIs in XFS, which cannot be reproduced after this patch and the
previous bug fix.

Cc: linux-mm@kvack.org
Cc: stable@kernel.org
Tested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
--
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

02b709df

mm: percpu-vmap fix RCU list walking · de560423

Authored by Nick Piggin on Feb 01, 2010

RCU list walking of the per-cpu vmap cache was broken.  It did not use
RCU primitives, and also the union of free_list and rcu_head is
obviously wrong (because free_list is indeed the list we are RCU
walking).

While we are there, remove a couple of unused fields from an earlier
iteration.

These APIs aren't actually used anywhere, because of problems with the
XFS conversion.  Christoph has now verified that the problems are solved
with these patches.  Also it is an exported interface, so I think it
will be good to be merged now (and Christoph wants to get the XFS
changes into their local tree).

Cc: stable@kernel.org
Cc: linux-mm@kvack.org
Tested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
--
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

de560423

21 Jan, 2010 1 commit

vmalloc: remove BUG_ON due to racy counting of VM_LAZY_FREE · 88f50044

Authored by Yongseok Koh on Jan 19, 2010

In free_unmap_vmap_area_noflush(), va->flags is marked as VM_LAZY_FREE
first, and then vmap_lazy_nr is increased atomically.

But in __purge_vmap_area_lazy(), while traversing vmap_area_list, nr
is counted by checking whether VM_LAZY_FREE is set in va->flags.  After
counting nr, the kernel reads vmap_lazy_nr atomically and checks a
BUG_ON condition on whether nr is greater than vmap_lazy_nr, to prevent
vmap_lazy_nr from going negative.

The problem is that, if execution is interrupted right after marking
VM_LAZY_FREE, the increment of vmap_lazy_nr can be delayed.  Consequently,
the BUG_ON condition can be met because nr counts more than vmap_lazy_nr.

This is highly probable when vmalloc/vfree are called frequently.  The
scenario has been verified by adding a delay between marking VM_LAZY_FREE
and increasing vmap_lazy_nr in free_unmap_vmap_area_noflush().

Although vmap_lazy_nr is used for checking the high watermark, it was
never a strict watermark.  And although the BUG_ON condition is meant to
prevent vmap_lazy_nr from going negative, vmap_lazy_nr is a signed
variable, so it can temporarily go negative.

Consequently, removing the BUG_ON condition is the proper fix.

A possible BUG_ON message is like the below.

   kernel BUG at mm/vmalloc.c:517!
   invalid opcode: 0000 [#1] SMP
   EIP: 0060:[<c04824a4>] EFLAGS: 00010297 CPU: 3
   EIP is at __purge_vmap_area_lazy+0x144/0x150
   EAX: ee8a8818 EBX: c08e77d4 ECX: e7c7ae40 EDX: c08e77ec
   ESI: 000081fe EDI: e7c7ae60 EBP: e7c7ae64 ESP: e7c7ae3c
   DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
   Call Trace:
   [<c0482ad9>] free_unmap_vmap_area_noflush+0x69/0x70
   [<c0482b02>] remove_vm_area+0x22/0x70
   [<c0482c15>] __vunmap+0x45/0xe0
   [<c04831ec>] vmalloc+0x2c/0x30
   Code: 8d 59 e0 eb 04 66 90 89 cb 89 d0 e8 87 fe ff ff 8b 43 20 89 da 8d 48 e0 8d 43 20 3b 04 24 75 e7 fe 05 a8 a5 a3 c0 e9 78 ff ff ff <0f> 0b eb fe 90 8d b4 26 00 00 00 00 56 89 c6 b8 ac a5 a3 c0 31
   EIP: [<c04824a4>] __purge_vmap_area_lazy+0x144/0x150 SS:ESP 0068:e7c7ae3c

[ See also http://marc.info/?l=linux-kernel&m=126335856228090&w=2 ]
Signed-off-by: Yongseok Koh <yongseok.koh@samsung.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

88f50044
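
The window described in this commit can be replayed deterministically in a single-threaded model. This is a simplified sketch: it counts one unit per area rather than pages, and `mark_lazy()`, `account_lazy()` and `count_marked()` are hypothetical names for the two halves of the free path and the purge-side count.

```c
#include <stdbool.h>

#define NAREAS 4

struct area_sketch {
	bool lazy_free;		/* stands in for VM_LAZY_FREE in va->flags */
};

static struct area_sketch areas[NAREAS];
static int lazy_nr;		/* stands in for vmap_lazy_nr */

/* Step 1 of the free path: set the flag. */
static void mark_lazy(int i)
{
	areas[i].lazy_free = true;
}

/* Step 2 of the free path: bump the counter (may be delayed). */
static void account_lazy(void)
{
	lazy_nr++;
}

/* Purge side: count the flagged areas, as the list walk does. */
static int count_marked(void)
{
	int nr = 0;

	for (int i = 0; i < NAREAS; i++)
		if (areas[i].lazy_free)
			nr++;
	return nr;
}
```

If the purge side runs between `mark_lazy()` and `account_lazy()`, it sees a count larger than the counter, which is the condition the removed BUG_ON tested for.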

16 Dec, 2009 1 commit

vmalloc(): adjust gfp mask passed on nested vmalloc() invocation · 976d6dfb

Authored by Jan Beulich on Dec 14, 2009

- avoid wasting more precious resources (DMA or DMA32 pools), when
  being called through vmalloc_32{,_user}()
- explicitly allow using high memory here even if the outer allocation
  request doesn't allow it
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

976d6dfb

29 Oct, 2009 1 commit

vmalloc: fix use of non-existent percpu variable in put_cpu_var() · 3f04ba85

Authored by Tejun Heo on Oct 29, 2009

vmalloc used the non-existent percpu variable vmap_cpu_blocks instead of
the intended vmap_block_queue.  This went unnoticed because
put_cpu_var() didn't evaluate the parameter.  Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <npiggin@suse.de>

3f04ba85
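
Why a wrong variable name can compile is easy to show with a macro that never expands its argument. The macros below are illustrative stand-ins, not the kernel's definitions of put_cpu_var(); `block_queue` and `put_count` are hypothetical.

```c
static int block_queue;		/* stands in for the real per-CPU variable */
static int put_count;

/* The argument is never expanded, so any token sequence compiles. */
#define put_var_unchecked(var)	do { put_count++; } while (0)

/* Taking the address forces the name to refer to something real. */
#define put_var_checked(var)	do { (void)&(var); put_count++; } while (0)

static void demo(void)
{
	put_var_unchecked(no_such_variable);	/* builds despite the bogus name */
	put_var_checked(block_queue);		/* a bogus name would not build */
}
```

Replacing `block_queue` with `no_such_variable` in the checked call produces a compile error, which is the kind of check that would have caught the typo.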

12 Oct, 2009 1 commit

headers: remove sched.h from interrupt.h · d43c36dc

Authored by Alexey Dobriyan on Oct 07, 2009

Now that m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>

d43c36dc

08 Oct, 2009 2 commits

mm, perf_event: Make vmalloc_user() align base kernel virtual address to SHMLBA · 2dca6999

Authored by David Miller on Sep 21, 2009

When a vmalloc'd area is mmap'd into userspace, some kind of
co-ordination is necessary for this to work on platforms with cpu
D-caches which can have aliases.

Otherwise kernel side writes won't be seen properly in userspace
and vice versa.

If the kernel side mapping and the user side one have the same
alignment, modulo SHMLBA, this can work as long as VM_SHARED is
set on the VMA, and for all current users this is true.  VM_SHARED
will force SHMLBA alignment of the user side mmap on platforms where
D-cache aliasing matters.

The bulk of this patch is just making it so that a specific
alignment can be passed down into __get_vm_area_node().  All
existing callers pass in '1' which preserves existing behavior.
vmalloc_user() gives SHMLBA for the alignment.

As a side effect this should get the video media drivers and other
vmalloc_user() users into more working shape on such systems.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <200909211922.n8LJMYjw029425@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

2dca6999
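
The "same alignment, modulo SHMLBA" condition from this commit can be written as a short check. The sketch assumes a 16 KiB SHMLBA purely for illustration (the real value is architecture-specific), and `align_up()` and `same_cache_colour()` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

#define SHMLBA_SKETCH 0x4000UL	/* assumed value, for illustration only */

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Two mappings cannot alias in the D-cache if they agree modulo SHMLBA. */
static bool same_cache_colour(uintptr_t kaddr, uintptr_t uaddr)
{
	return ((kaddr ^ uaddr) & (SHMLBA_SKETCH - 1)) == 0;
}
```

If the kernel-side base is produced by `align_up(addr, SHMLBA_SKETCH)` and the user-side mmap is also SHMLBA-aligned, both sides share the same offset modulo SHMLBA and the check holds.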

mm: includecheck fix: vmalloc.c · 3700c155

Authored by Jaswinder Singh Rajput on Oct 07, 2009

Fix the following 'make includecheck' warning:

  mm/vmalloc.c: linux/highmem.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3700c155

23 Sep, 2009 1 commit

kcore: register module area in generic way · 81ac3ad9

Authored by KAMEZAWA Hiroyuki on Sep 22, 2009

Some archs define MODULES_VADDR/MODULES_END, which is not in the VMALLOC
area.  This is handled only on x86-64.  This patch makes it more generic,
and we can use vread/vwrite to access the area.  Fix it.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

81ac3ad9

22 Sep, 2009 4 commits

mm: replace various uses of num_physpages by totalram_pages · 4481374c

Authored by Jan Beulich on Sep 21, 2009

Sizing of memory allocations shouldn't depend on the number of physical
pages found in a system, as that generally includes (perhaps a huge amount
of) non-RAM pages.  The amount of what actually is usable as storage
should instead be used as a basis here.

Some of the calculations (i.e.  those not intending to use high memory)
should likely even use (totalram_pages - totalhigh_pages).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4481374c

kcore: fix vread/vwrite to be aware of holes · d0107eb0

Authored by KAMEZAWA Hiroyuki on Sep 21, 2009

vread/vwrite access the vmalloc area without checking whether a page is
present.  In most cases, this works well.

In the old days, the only caller of get_vm_area() was IOREMAP, and there
was no memory hole within a vm_struct's [addr...addr + size - PAGE_SIZE]
(-PAGE_SIZE is for a guard page).

After the per-cpu-alloc patch, get_vm_area() is used to reserve contiguous
virtual addresses that are remapped _later_.  So there tends to be a hole
in the valid vmalloc area in the vm_struct lists, and skipping the hole
(the unmapped pages) is necessary.  This patch updates vread/vwrite() to
avoid memory holes.

Routines which access the vmalloc area without knowing what an address is
used for are
  - /proc/kcore
  - /dev/kmem

kcore checks IOREMAP, /dev/kmem doesn't.  After this patch, IOREMAP is
checked and /dev/kmem will avoid reading/writing it.  Fixes to /proc/kcore
will be in the next patch in the series.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Mike Smith <scgtrp@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d0107eb0
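
A hole-aware read of the kind this commit describes might look roughly like the following userspace model. A NULL entry in the page array marks a hole; zero-filling the corresponding output is an assumption of this sketch, as the message only says the hole is skipped. `PG` and `read_skipping_holes()` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define PG 16	/* tiny stand-in for PAGE_SIZE */

/*
 * Copy npages "pages" into dst.  pages[i] == NULL marks a hole, i.e. a
 * reserved but unmapped page; it is not dereferenced, and the matching
 * part of dst is zero-filled.  Returns the number of bytes really copied.
 */
static size_t read_skipping_holes(char *dst, char *const pages[],
				  size_t npages)
{
	size_t copied = 0;

	for (size_t i = 0; i < npages; i++) {
		if (pages[i]) {
			memcpy(dst + i * PG, pages[i], PG);
			copied += PG;
		} else {
			memset(dst + i * PG, 0, PG);
		}
	}
	return copied;
}
```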

vmalloc: unmap vmalloc area after hiding it · dd32c279

Authored by KAMEZAWA Hiroyuki on Sep 21, 2009

The vmap area should be purged after the vm_struct is removed from the
list, because vread/vwrite etc. believe the range is valid while it is on
the vm_struct list.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Mike Smith <scgtrp@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dd32c279

vmalloc.c: fix double error checking · bf88c8c8

Authored by Figo.zhang on Sep 21, 2009

There is no need for double error checking.
Signed-off-by: Figo.zhang <figo1802@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bf88c8c8

14 Aug, 2009 2 commits

vmalloc: implement pcpu_get_vm_areas() · ca23e405

Authored by Tejun Heo on Aug 14, 2009

To directly use spread NUMA memories for percpu units, the percpu
allocator will be updated to allow sparsely mapping units in a chunk.
As the distances between units can be very large, this makes
allocating a single vmap area for each chunk undesirable.  This patch
implements pcpu_get_vm_areas() and pcpu_free_vm_areas(), which
allocate and free sparse congruent vmap areas.

pcpu_get_vm_areas() takes @offsets and @sizes arrays which define
distances and sizes of vmap areas.  It scans down from the top of the
vmalloc area looking for the top-most address which can accommodate all
the areas.  The top-down scan is to avoid interacting with regular
vmallocs, which can push these congruent areas up little by little,
ending up wasting address space and page tables.

To speed up the top-down scan, the highest possible address hint is
maintained.  Although the scan is linear from the hint, given the
usual large holes between memory addresses between NUMA nodes, the
scanning is highly likely to finish after finding the first hole for
the last unit, which is scanned first.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <npiggin@suse.de>

ca23e405

vmalloc: separate out insert_vmalloc_vm() · cf88c790

Authored by Tejun Heo on Aug 14, 2009

Separate out insert_vmalloc_vm() from __get_vm_area_node().
insert_vmalloc_vm() initializes vm_struct from vmap_area and inserts
it into vmlist.  insert_vmalloc_vm() only initializes fields which can
be determined from @vm, @flags and @caller.  The rest should be
initialized by the caller.  For __get_vm_area_node(), all other fields
just need to be cleared and this is done by using kzalloc instead of
kmalloc.

This will be used to implement pcpu_get_vm_areas().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <npiggin@suse.de>

cf88c790

12 Jun, 2009 2 commits

vmalloc: use kzalloc() instead of alloc_bootmem() · 43ebdac4

Authored by Pekka Enberg on May 25, 2009

We can call vmalloc_init() after kmem_cache_init() and use kzalloc() instead of
the bootmem allocator when initializing vmalloc data structures.
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>

43ebdac4

kmemleak: Add the vmalloc memory allocation/freeing hooks · 89219d37

Authored by Catalin Marinas on Jun 11, 2009

This patch adds the callbacks to kmemleak_(alloc|free) functions from
vmalloc/vfree.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

89219d37

07 May, 2009 1 commit

alloc_vmap_area: fix memory leak · 2498ce42

Authored by Ralph Wuerthner on May 06, 2009

If alloc_vmap_area() fails, the allocated struct vmap_area has to be freed.
Signed-off-by: Ralph Wuerthner <ralphw@linux.vnet.ibm.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2498ce42

01 Apr, 2009 1 commit

vmap: remove needless lock and list in vmap · d086817d

Authored by MinChan Kim on Mar 31, 2009

vmap's dirty_list is unused.  It is meant for optimizing flushing, but
Nick hasn't written that code yet, so we don't need it until it is
actually used.

This patch removes vmap_block's dirty_list and the code related to it.
Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d086817d

28 Feb, 2009 2 commits

mm: fix lazy vmap purging (use-after-free error) · cbb76676

Authored by Vegard Nossum on Feb 27, 2009

I just got this new warning from kmemcheck:

    WARNING: kmemcheck: Caught 32-bit read from freed memory (c7806a60)
    a06a80c7ecde70c1a04080c700000000a06709c1000000000000000000000000
     f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f
     ^

    Pid: 0, comm: swapper Not tainted (2.6.29-rc4 #230)
    EIP: 0060:[<c1096df7>] EFLAGS: 00000286 CPU: 0
    EIP is at __purge_vmap_area_lazy+0x117/0x140
    EAX: 00070f43 EBX: c7806a40 ECX: c1677080 EDX: 00027b66
    ESI: 00002001 EDI: c170df0c EBP: c170df00 ESP: c178830c
     DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
    CR0: 80050033 CR2: c7806b14 CR3: 01775000 CR4: 00000690
    DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
    DR6: 00004000 DR7: 00000000
     [<c1096f3e>] free_unmap_vmap_area_noflush+0x6e/0x70
     [<c1096f6a>] remove_vm_area+0x2a/0x70
     [<c1097025>] __vunmap+0x45/0xe0
     [<c10970de>] vunmap+0x1e/0x30
     [<c1008ba5>] text_poke+0x95/0x150
     [<c1008ca9>] alternatives_smp_unlock+0x49/0x60
     [<c171ef47>] alternative_instructions+0x11b/0x124
     [<c171f991>] check_bugs+0xbd/0xdc
     [<c17148c5>] start_kernel+0x2ed/0x360
     [<c171409e>] __init_begin+0x9e/0xa9
     [<ffffffff>] 0xffffffff

It happened here:

    $ addr2line -e vmlinux -i c1096df7
    mm/vmalloc.c:540

Code:

	list_for_each_entry(va, &valist, purge_list)
		__free_vmap_area(va);

It's this instruction:

    mov    0x20(%ebx),%edx

Which corresponds to a dereference of va->purge_list.next:

    (gdb) p ((struct vmap_area *) 0)->purge_list.next
    Cannot access memory at address 0x20

It seems that we should use "safe" list traversal here, as the element
is freed inside the loop. Please verify that this is the right fix.
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@kernel.org>		[2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cbb76676
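
The "safe" traversal mentioned in this commit saves the next pointer before the current element is freed; in the kernel this is what list_for_each_entry_safe() provides. The following is a userspace sketch with a plain singly linked list, where `struct node`, `push()` and `free_all_safe()` are hypothetical.

```c
#include <stdlib.h>

struct node {
	struct node *next;
	int id;
};

/* Prepend a node; on allocation failure the list is returned unchanged. */
static struct node *push(struct node *head, int id)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->id = id;
	n->next = head;
	return n;
}

/*
 * Unsafe shape (do not use):
 *	for (n = head; n; n = n->next) free(n);
 * The loop step reads n->next after n has been freed.
 */
static int free_all_safe(struct node *head)
{
	struct node *n, *next;
	int freed = 0;

	for (n = head; n; n = next) {
		next = n->next;	/* fetched before n is freed */
		free(n);
		freed++;
	}
	return freed;
}
```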

mm: vmap fix overflow · 7766970c

Authored by Nick Piggin on Feb 27, 2009

The new vmap allocator can wrap the address and get confused in the case
of large allocations or VMALLOC_END near the end of address space.

Problem reported by Christoph Hellwig on a 32-bit XFS workload.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reported-by: Christoph Hellwig <hch@lst.de>
Cc: <stable@kernel.org>		[2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7766970c
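
An address wrap of the kind described above is usually guarded with an explicit overflow test before the range comparison. A minimal sketch, assuming unsigned address arithmetic; `range_fits()` and its parameters are hypothetical and not the allocator's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Does [addr, addr + size) lie below vend?  Near the top of the address
 * space addr + size can wrap to a small value, which would wrongly pass
 * a plain "addr + size <= vend" comparison.
 */
static bool range_fits(uintptr_t addr, uintptr_t size, uintptr_t vend)
{
	if (addr + size < addr)		/* unsigned wrap-around */
		return false;
	return addr + size <= vend;
}
```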

25 Feb, 2009 1 commit

x86: make vmap yell louder when it is used under irqs_disabled() · 34754b69

Authored by Peter Zijlstra on Feb 25, 2009

Signed-off-by: Ingo Molnar <mingo@elte.hu>

34754b69

24 Feb, 2009 1 commit

vmalloc: add @align to vm_area_register_early() · c0c0a293

Authored by Tejun Heo on Feb 24, 2009

Impact: allow larger alignment for early vmalloc area allocation

Some early vmalloc users might want larger alignment, for example, for
custom large page mapping.  Add @align to vm_area_register_early().
While at it, drop docbook comment on non-existent @size.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>

c0c0a293
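
How an alignment argument changes early address assignment can be sketched with a simple bump allocator. This is an illustration only: `struct early_area`, `early_next`, the start address and `register_early()` are hypothetical and do not reflect the layout of the kernel's vm_struct or vmlist.

```c
#include <stdint.h>

struct early_area {
	uintptr_t addr;		/* filled in at registration time */
	uintptr_t size;		/* set by the caller beforehand */
};

static uintptr_t early_next = 0x100000;	/* hypothetical start of the region */

/* Give vm the next free address rounded up to align (a power of two). */
static void register_early(struct early_area *vm, uintptr_t align)
{
	uintptr_t addr = (early_next + align - 1) & ~(align - 1);

	vm->addr = addr;
	early_next = addr + vm->size;
}
```

A caller that wants a large-page mapping passes the large page size as `align`; other callers keep page alignment and are packed right after the previous area.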

21 Feb, 2009 1 commit

vmalloc: call flush_cache_vunmap() from unmap_kernel_range() · f6fcba70

Authored by Tejun Heo on Feb 20, 2009

Impact: proper vcache flush on unmap_kernel_range()

flush_cache_vunmap() should be called before pages are unmapped.  Add
a call to it in unmap_kernel_range().
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Nick Piggin <npiggin@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: <stable@kernel.org>		[2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f6fcba70

20 Feb, 2009 3 commits

vmalloc: add un/map_kernel_range_noflush() · 8fc48985

Authored by Tejun Heo on Feb 20, 2009

Impact: two more public map/unmap functions

Implement map_kernel_range_noflush() and unmap_kernel_range_noflush().
These functions respectively map and unmap an address range in the kernel
VM area but don't do any vcache or tlb flushing.  These will be used by
the new percpu allocator.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>

8fc48985

vmalloc: implement vm_area_register_early() · f0aa6617

Authored by Tejun Heo on Feb 20, 2009

Impact: allow multiple early vm areas

There are places where a kernel VM area needs to be allocated before
vmalloc is initialized.  This is done by allocating a static vm_struct,
initializing several fields and linking it to vmlist; later vmalloc
initialization picks these up from vmlist.  This is currently done
manually and if there is more than one such area, there's no defined
way to arbitrate who gets which address.

This patch implements vm_area_register_early(), which takes a vm_area
struct with flags and size initialized, assigns an address to it and puts
it on the vmlist.  This way, multiple early vm areas can determine
which addresses they should use.  The only current user - alpha mm
init - is converted to use it.
Signed-off-by: Tejun Heo <tj@kernel.org>

f0aa6617

vmalloc: call flush_cache_vunmap() from unmap_kernel_range() · 73426952

Authored by Tejun Heo on Feb 20, 2009

Impact: proper vcache flush on unmap_kernel_range()

flush_cache_vunmap() should be called before pages are unmapped.  Add
a call to it in unmap_kernel_range().
Signed-off-by: Tejun Heo <tj@kernel.org>

73426952