提交 · b554cb426a955a267dba524f98f99e29bc947643 · Linux-御风守护者 / linux

24 3月, 2011 1 次提交

mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm · cae5d390

由 Stephen Wilson 提交于 3月 13, 2011

Now that gate vma's are referenced with respect to a particular mm and not a
particular task it only makes sense to propagate the change to this predicate as
well.
Signed-off-by: NStephen Wilson <wilsons@start.ca>
Reviewed-by: NMichel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

cae5d390

10 3月, 2011 1 次提交

block: remove per-queue plugging · 7eaceacc

由 Jens Axboe 提交于 3月 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

7eaceacc

14 1月, 2011 1 次提交

mlock: do not hold mmap_sem for extended periods of time · 53a7706d

由 Michel Lespinasse 提交于 1月 13, 2011

__get_user_pages gets a new 'nonblocking' parameter to signal that the
caller is prepared to re-acquire mmap_sem and retry the operation if
needed.  This is used to split off long operations if they are going to
block on a disk transfer, or when we detect contention on the mmap_sem.

[akpm@linux-foundation.org: remove ref to rwsem_is_contended()]
Signed-off-by: NMichel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

53a7706d

24 12月, 2010 2 次提交

nommu: Provide stubbed alloc/free_vm_area() implementation. · 29c185e5

由 Paul Mundt 提交于 12月 24, 2010

Now that these have been introduced in to the vmalloc API, sync up the
nommu side of things. At present we don't deal with VMAs as such, so for
the time being these will simply BUG() out. In the future it should be
possible to support this interface by layering on top of the vm_regions.
Signed-off-by: NPaul Mundt <lethal@linux-sh.org>

29c185e5

nommu: Fix up vmalloc_node() symbol export regression. · 9a14f653

由 Paul Mundt 提交于 12月 24, 2010

Commit e1ca7788 ("mm: add vzalloc() and vzalloc_node() helpers") ended up
accidentally deleting the vmalloc_node() symbol export, resulting in:

"vmalloc_node" [net/core/pktgen.ko] undefined!
"vmalloc_node" [net/netfilter/x_tables.ko] undefined!

regressions.
Signed-off-by: NPaul Mundt <lethal@linux-sh.org>

9a14f653

25 11月, 2010 1 次提交

nommu: yield CPU while disposing VM · 04c34961

由 Steven J. Magnani 提交于 11月 24, 2010

Depending on processor speed, page size, and the amount of memory a
process is allowed to amass, cleanup of a large VM may freeze the system
for many seconds.  This can result in a watchdog timeout.

Make sure other tasks receive some service when cleaning up large VMs.
Signed-off-by: NSteven J. Magnani <steve@digidescorp.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

04c34961

30 10月, 2010 1 次提交

audit mmap · 120a795d

由 Al Viro 提交于 10月 30, 2010

Normal syscall audit doesn't catch 5th argument of syscall.  It also
doesn't catch the contents of userland structures pointed to be
syscall argument, so for both old and new mmap(2) ABI it doesn't
record the descriptor we are mapping.  For old one it also misses
flags.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

120a795d

27 10月, 2010 1 次提交

mm: add vzalloc() and vzalloc_node() helpers · e1ca7788

由 Dave Young 提交于 10月 26, 2010

Add vzalloc() and vzalloc_node() to encapsulate the
vmalloc-then-memset-zero operation.

Use __GFP_ZERO to zero fill the allocated memory.
Signed-off-by: NDave Young <hidave.darkstar@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e1ca7788

21 8月, 2010 1 次提交

mm: make the vma list be doubly linked · 297c5eee

由 Linus Torvalds 提交于 8月 20, 2010

It's a really simple list, and several of the users want to go backwards
in it to find the previous vma.  So rather than have to look up the
previous entry with 'find_vma_prev()' or something similar, just make it
doubly linked instead.
Tested-by: NIan Campbell <ijc@hellion.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

297c5eee

14 8月, 2010 1 次提交

NOMMU: Remove an extraneous no_printk() · fe622e76

由 David Howells 提交于 8月 13, 2010

Remove an extraneous no_printk() in mm/nommu.c that got missed when the
function got generalised from several things that used it in commit
12fdff3f ("Add a dummy printk function for the maintenance of unused
printks").

Without this, the following error is observed:

  mm/nommu.c:41: error: conflicting types for 'no_printk'
  include/linux/kernel.h:314: error: previous definition of 'no_printk' was here
Reported-by: NMichal Simek <monstr@monstr.eu>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fe622e76

26 5月, 2010 1 次提交

nommu: allow private mappings of read-only devices · 3c7b2045

由 Bernd Schmidt 提交于 5月 25, 2010

Slightly rearrange the logic that determines capabilities and vm_flags.
Disable BDI_CAP_MAP_DIRECT in all cases if the device can't support the
protections.  Allow private readonly mappings of readonly backing devices.
Signed-off-by: NBernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: NMike Frysinger <vapier@gentoo.org>
Acked-by: NDavid McCullough <davidm@snapgear.com>
Acked-by: NGreg Ungerer <gerg@uclinux.org>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Acked-by: NDavid Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3c7b2045

26 3月, 2010 2 次提交

NOMMU: Fix __get_user_pages() to pin last page on offset buffers · e1ee65d8

由 David Howells 提交于 3月 25, 2010

Fix __get_user_pages() to make it pin the last page on a buffer that doesn't
begin at the start of a page, but is a multiple of PAGE_SIZE in size.

The problem is that __get_user_pages() advances the pointer too much when it
iterates to the next page if the page it's currently looking at isn't used from
the first byte. This can cause the end of a short VMA to be reached
prematurely, resulting in the last page being lost.
Signed-off-by: NSteven J. Magnani <steve@digidescorp.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e1ee65d8

NOMMU: Revert 'nommu: get_user_pages(): pin last page on non-page-aligned start' · 7561e8ca

由 David Howells 提交于 3月 25, 2010

Revert the following patch:

	commit c08c6e1f
	Author: Steven J. Magnani <steve@digidescorp.com>
	Date:   Fri Mar 5 13:42:24 2010 -0800

	nommu: get_user_pages(): pin last page on non-page-aligned start

As it assumes that the mappings begin at the start of pages - something that
isn't necessarily true on NOMMU systems.  On NOMMU systems, it is possible for
a mapping to only occupy part of the page, and not necessarily touch either end
of it; in fact it's also possible for multiple non-overlapping mappings to
coexist on one page (consider direct mappings of ROMFS files, for example).
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NSteven J. Magnani <steve@digidescorp.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7561e8ca

25 3月, 2010 1 次提交

nommu: fix an incorrect comment in the do_mmap_shared_file() · 3fa30460

由 David Howells 提交于 3月 23, 2010

Fix an incorrect comment in the do_mmap_shared_file().  If a mapping is
requested MAP_SHARED, then a private copy cannot be made and still provide
correct semantics.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reported-by: NDave Hudson <uclinux@blueteddy.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3fa30460

13 3月, 2010 1 次提交

Add generic sys_old_mmap() · a4679373

由 Christoph Hellwig 提交于 3月 10, 2010

Add a generic implementation of the old mmap() syscall, which expects its
argument in a memory block and switch all architectures over to use it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Acked-by: NGreg Ungerer <gerg@uclinux.org>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a4679373

07 3月, 2010 2 次提交

nommu: get_user_pages(): pin last page on non-page-aligned start · c08c6e1f

由 Steven J. Magnani 提交于 3月 05, 2010

The noMMU version of get_user_pages() fails to pin the last page when the
start address isn't page-aligned.  The patch fixes this in a way that
makes find_extend_vma() congruent to its MMU cousin.
Signed-off-by: NSteven J. Magnani <steve@digidescorp.com>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c08c6e1f

mm: change anon_vma linking to fix multi-process server scalability issue · 5beb4930

由 Rik van Riel 提交于 3月 05, 2010

The old anon_vma code can lead to scalability issues with heavily forking
workloads.  Specifically, each anon_vma will be shared between the parent
process and all its child processes.

In a workload with 1000 child processes and a VMA with 1000 anonymous
pages per process that get COWed, this leads to a system with a million
anonymous pages in the same anon_vma, each of which is mapped in just one
of the 1000 processes.  However, the current rmap code needs to walk them
all, leading to O(N) scanning complexity for each page.

This can result in systems where one CPU is walking the page tables of
1000 processes in page_referenced_one, while all other CPUs are stuck on
the anon_vma lock.  This leads to catastrophic failure for a benchmark
like AIM7, where the total number of processes can reach in the tens of
thousands.  Real workloads are still a factor 10 less process intensive
than AIM7, but they are catching up.

This patch changes the way anon_vmas and VMAs are linked, which allows us
to associate multiple anon_vmas with a VMA.  At fork time, each child
process gets its own anon_vmas, in which its COWed pages will be
instantiated.  The parents' anon_vma is also linked to the VMA, because
non-COWed pages could be present in any of the children.

This reduces rmap scanning complexity to O(1) for the pages of the 1000
child processes, with O(N) complexity for at most 1/N pages in the system.
 This reduces the average scanning cost in heavily forking workloads from
O(N) to 2.

The only real complexity in this patch stems from the fact that linking a
VMA to anon_vmas now involves memory allocations.  This means vma_adjust
can fail, if it needs to attach a VMA to anon_vma structures.  This in
turn means error handling needs to be added to the calling functions.

A second source of complexity is that, because there can be multiple
anon_vmas, the anon_vma linking in vma_adjust can no longer be done under
"the" anon_vma lock.  To prevent the rmap code from walking up an
incomplete VMA, this patch introduces the VM_LOCK_RMAP VMA flag.  This bit
flag uses the same slot as the NOMMU VM_MAPPED_COPY, with an ifdef in mm.h
to make sure it is impossible to compile a kernel that needs both symbolic
values for the same bitflag.

Some test results:

Without the anon_vma changes, when AIM7 hits around 9.7k users (on a test
box with 16GB RAM and not quite enough IO), the system ends up running
>99% in system time, with every CPU on the same anon_vma lock in the
pageout code.

With these changes, AIM7 hits the cross-over point around 29.7k users.
This happens with ~99% IO wait time, there never seems to be any spike in
system time.  The anon_vma lock contention appears to be resolved.

[akpm@linux-foundation.org: cleanups]
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5beb4930

17 1月, 2010 4 次提交

nommu: fix shared mmap after truncate shrinkage problems · 7e660872

由 David Howells 提交于 1月 15, 2010

Fix a problem in NOMMU mmap with ramfs whereby a shared mmap can happen
over the end of a truncation.  The problem is that
ramfs_nommu_check_mappings() checks that the reduced file size against the
VMA tree, but not the vm_region tree.

The following sequence of events can cause the problem:

	fd = open("/tmp/x", O_RDWR|O_TRUNC|O_CREAT, 0600);
	ftruncate(fd, 32 * 1024);
	a = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	munmap(a, 32 * 1024);
	ftruncate(fd, 16 * 1024);
	c = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

Mapping 'a' creates a vm_region covering 32KB of the file.  Mapping 'b'
sees that the vm_region from 'a' is covering the region it wants and so
shares it, pinning it in memory.

Mapping 'a' then goes away and the file is truncated to the end of VMA
'b'.  However, the region allocated by 'a' is still in effect, and has
_not_ been reduced.

Mapping 'c' is then created, and because there's a vm_region covering the
desired region, get_unmapped_area() is _not_ called to repeat the check,
and the mapping is granted, even though the pages from the latter half of
the mapping have been discarded.

However:

	d = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

Mapping 'd' should work, and should end up sharing the region allocated by
'a'.

To deal with this, we shrink the vm_region struct during the truncation,
lest do_mmap_pgoff() take it as licence to share the full region
automatically without calling the get_unmapped_area() file op again.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7e660872

nommu: don't need get_unmapped_area() for NOMMU · efc1a3b1

由 David Howells 提交于 1月 15, 2010

get_unmapped_area() is unnecessary for NOMMU as no-one calls it.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

efc1a3b1

nommu: remove a superfluous check of vm_region::vm_usage · 779c1023

由 David Howells 提交于 1月 15, 2010

In split_vma(), there's no need to check if the VMA being split has a
region that's in use by more than one VMA because:

 (1) The preceding test prohibits splitting of non-anonymous VMAs and regions
     (eg: file or chardev backed VMAs).

 (2) Anonymous regions can't be mapped multiple times because there's no handle
     by which to refer to the already existing region.

 (3) If a VMA has previously been split, then the region backing it has also
     been split into two regions, each of usage 1.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

779c1023

nommu: struct vm_region's vm_usage count need not be atomic · 1e2ae599

由 David Howells 提交于 1月 15, 2010

The vm_usage count field in struct vm_region does not need to be atomic as
it's only even modified whilst nommu_region_sem is write locked.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1e2ae599

07 1月, 2010 2 次提交

NOMMU: Use copy_*_user_page() in access_process_vm() · 7959722b

由 Jie Zhang 提交于 1月 06, 2010

The MMU code uses the copy_*_user_page() variants in access_process_vm()
rather than copy_*_user() as the former includes an icache flush.  This
is important when doing things like setting software breakpoints with
gdb.  So switch the NOMMU code over to do the same.

This patch makes the reasonable assumption that copy_from_user_page()
won't fail - which is probably fine, as we've checked the VMA from which
we're copying is usable, and the copy is not allowed to cross VMAs.  The
one case where it might go wrong is if the VMA is a device rather than
RAM, and that device returns an error which - in which case rubbish will
be returned rather than EIO.
Signed-off-by: NJie Zhang <jie.zhang@analog.com>
Signed-off-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NDavid McCullough <david_mccullough@mcafee.com>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Acked-by: NGreg Ungerer <gerg@uclinux.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7959722b

NOMMU: Avoiding duplicate icache flushes of shared maps · cfe79c00

由 Mike Frysinger 提交于 1月 06, 2010

When working with FDPIC, there are many shared mappings of read-only
code regions between applications (the C library, applet packages like
busybox, etc.), but the current do_mmap_pgoff() function will issue an
icache flush whenever a VMA is added to an MM instead of only doing it
when the map is initially created.

The flush can instead be done when a region is first mmapped PROT_EXEC.
Note that we may not rely on the first mapping of a region being
executable - it's possible for it to be PROT_READ only, so we have to
remember whether we've flushed the region or not, and then flush the
entire region when a bit of it is made executable.

However, this also affects the brk area. That will no longer be
executable. We can mprotect() it to PROT_EXEC on MPU-mode kernels, but
for NOMMU mode kernels, when it increases the brk allocation, making
sys_brk() flush the extra from the icache should suffice. The brk area
probably isn't used by NOMMU programs since the brk area can only use up
the leavings from the stack allocation, where the stack allocation is
larger than requested.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cfe79c00

31 12月, 2009 1 次提交

mm: move sys_mmap_pgoff from util.c · 66f0dc48

由 Hugh Dickins 提交于 12月 30, 2009

Move sys_mmap_pgoff() from mm/util.c to mm/mmap.c and mm/nommu.c,
where we'd expect to find such code: especially now that it contains
the MAP_HUGETLB handling. Revert mm/util.c to how it was in 2.6.32.

This patch just ignores MAP_HUGETLB in the nommu case, as in 2.6.32,
whereas 2.6.33-rc2 reported -ENOSYS. Perhaps validate_mmap_request()
should reject it with -EINVAL? Add that later if necessary.
Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

66f0dc48

16 12月, 2009 1 次提交

nommu: fix malloc performance by adding uninitialized flag · ea637639

由 Jie Zhang 提交于 12月 14, 2009

The NOMMU code currently clears all anonymous mmapped memory.  While this
is what we want in the default case, all memory allocation from userspace
under NOMMU has to go through this interface, including malloc() which is
allowed to return uninitialized memory.  This can easily be a significant
performance penalty.  So for constrained embedded systems were security is
irrelevant, allow people to avoid clearing memory unnecessarily.

This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
memory for the brk and stack region.
Signed-off-by: NJie Zhang <jie.zhang@analog.com>
Signed-off-by: NRobin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ea637639

01 11月, 2009 1 次提交

NOMMU: Don't pass NULL pointers to fput() in do_mmap_pgoff() · 89a86402

由 David Howells 提交于 10月 30, 2009

Don't pass NULL pointers to fput() in the error handling paths of the NOMMU
do_mmap_pgoff() as it can't handle it.

The following can be used as a test program:

	int main() { static long long a[1024 * 1024 * 20] = { 0 }; return a;}

Without the patch, the code oopses in atomic_long_dec_and_test() as called by
fput() after the kernel complains that it can't allocate that big a chunk of
memory.  With the patch, the kernel just complains about the allocation size
and then the program segfaults during execve() as execve() can't complete the
allocation of all the new ELF program segments.
Reported-by: NRobin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NRobin Getz <rgetz@blackfin.uclinux.org>
Cc: stable@kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

89a86402

28 9月, 2009 1 次提交

const: mark struct vm_struct_operations · f0f37e2f

由 Alexey Dobriyan 提交于 9月 27, 2009

* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code

But leave TTM code alone, something is fishy there with global vm_ops
being used.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f0f37e2f

25 9月, 2009 2 次提交

NOMMU: Ignore mmap() address param as it is a hint · 06aab5a3

由 David Howells 提交于 9月 24, 2009

Ignore the address parameter given to NOMMU mmap() as it is a hint, rather
than giving an error if it's non-zero. MAP_FIXED still gets an error.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

06aab5a3

NOMMU: Fix MAP_PRIVATE mmap() of objects where the data can be mapped directly · 645d83c5

由 David Howells 提交于 9月 24, 2009

Fix MAP_PRIVATE mmap() of files and devices where the data in the backing store
might be mapped directly.  Use the BDI_CAP_MAP_DIRECT capability flag to govern
whether or not we should be trying to map a file directly.  This can be used to
determine whether or not a region has been filled in at the point where we call
do_mmap_shared() or do_mmap_private().

The BDI_CAP_MAP_DIRECT capability flag is cleared by validate_mmap_request() if
there's any reason we can't use it.  It's also cleared in do_mmap_pgoff() if
f_op->get_unmapped_area() fails.

Without this fix, attempting to run a program from a RomFS image on a
non-mappable MTD partition results in a BUG as the kernel attempts XIP, and
this can be caught in gdb:

Program received signal SIGABRT, Aborted.
0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
(gdb) bt
#0  0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
#1  0xc005f168 in do_mmap_pgoff (file=0xc31a6620, addr=<value optimized out>, len=3808, prot=3, flags=6146, pgoff=0) at mm/nommu.c:1373
#2  0xc00a96b8 in elf_fdpic_map_file (params=0xc33fbbec, file=0xc31a6620, mm=0xc31bef60, what=0xc0213144 "executable") at mm.h:1145
#3  0xc00aa8b4 in load_elf_fdpic_binary (bprm=0xc316cb00, regs=<value optimized out>) at fs/binfmt_elf_fdpic.c:343
#4  0xc006b588 in search_binary_handler (bprm=0x6, regs=0xc33fbce0) at fs/exec.c:1234
#5  0xc006c648 in do_execve (filename=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460, regs=0xc33fbce0) at fs/exec.c:1356
#6  0xc0008cf0 in sys_execve (name=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460) at arch/frv/kernel/process.c:263
#7  0xc00075dc in __syscall_call () at arch/frv/kernel/entry.S:897

Note that this fix does the following commit differently:

	commit a190887b
	Author: David Howells <dhowells@redhat.com>
	Date:   Sat Sep 5 11:17:07 2009 -0700
	nommu: fix error handling in do_mmap_pgoff()
Reported-by: NGraff Yang <graff.yang@gmail.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

645d83c5

24 9月, 2009 2 次提交

truncate: new helpers · 25d9e2d1

由 npiggin@suse.de 提交于 8月 21, 2009

Introduce new truncate helpers truncate_pagecache and inode_newsize_ok.
vmtruncate is also consolidated from mm/memory.c and mm/nommu.c and
into mm/truncate.c.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NNick Piggin <npiggin@suse.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

25d9e2d1

nommu: fix two build breakages · 4266c97a

由 Hugh Dickins 提交于 9月 23, 2009

My 58fa879e "mm: FOLL flags for GUP flags"
broke CONFIG_NOMMU build by forgetting to update nommu.c foll_flags type:

  mm/nommu.c:171: error: conflicting types for `__get_user_pages'
  mm/internal.h:254: error: previous declaration of `__get_user_pages' was here
  make[1]: *** [mm/nommu.o] Error 1

My 03f6462a "mm: move highest_memmap_pfn"
broke CONFIG_NOMMU build by forgetting to add a nommu.c highest_memmap_pfn:

  mm/built-in.o: In function `memmap_init_zone':
  (.meminit.text+0x326): undefined reference to `highest_memmap_pfn'
  mm/built-in.o: In function `memmap_init_zone':
  (.meminit.text+0x32d): undefined reference to `highest_memmap_pfn'

Fix both breakages, and give myself 30 lashes (ouch!)
Reported-by: NMichal Simek <michal.simek@petalogix.com>
Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4266c97a

22 9月, 2009 4 次提交

nommu: add support for Memory Protection Units (MPU) · eb8cdec4

由 Bernd Schmidt 提交于 9月 21, 2009

Some architectures (like the Blackfin arch) implement some of the
"simpler" features that one would expect out of a MMU such as memory
protection.

In our case, we actually get read/write/exec protection down to the page
boundary so processes can't stomp on each other let alone the kernel.

There is a performance decrease (which depends greatly on the workload)
however as the hardware/software interaction was not optimized at design
time.
Signed-off-by: NBernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: NBryan Wu <cooloney@kernel.org>
Signed-off-by: NMike Frysinger <vapier@gentoo.org>
Acked-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

eb8cdec4

mm: FOLL flags for GUP flags · 58fa879e

由 Hugh Dickins 提交于 9月 21, 2009

__get_user_pages() has been taking its own GUP flags, then processing
them into FOLL flags for follow_page().  Though oddly named, the FOLL
flags are more widely used, so pass them to __get_user_pages() now.
Sorry, VM flags, VM_FAULT flags and FAULT_FLAGs are still distinct.

(The patch to __get_user_pages() looks peculiar, with both gup_flags
and foll_flags: the gup_flags remain constant; but as before there's
an exceptional case, out of scope of the patch, in which foll_flags
per page have FOLL_WRITE masked off.)
Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

58fa879e

mm: remove unused GUP flags · 1c3aff1c

由 Hugh Dickins 提交于 9月 21, 2009

GUP_FLAGS_IGNORE_VMA_PERMISSIONS and GUP_FLAGS_IGNORE_SIGKILL were
flags added solely to prevent __get_user_pages() from doing some of
what it usually does, in the munlock case: we can now remove them.
Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Acked-by: NRik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1c3aff1c

mm: includecheck fix for mm/nommu.c · 72ff13b7

由 Jaswinder Singh Rajput 提交于 9月 21, 2009

Fix the following 'make includecheck' warning:

  mm/nommu.c: internal.h is included more than once.
Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

72ff13b7

06 9月, 2009 1 次提交

nommu: fix error handling in do_mmap_pgoff() · a190887b

由 David Howells 提交于 9月 05, 2009

Fix the error handling in do_mmap_pgoff().  If do_mmap_shared_file() or
do_mmap_private() fail, we jump to the error_put_region label at which
point we cann __put_nommu_region() on the region - but we haven't yet
added the region to the tree, and so __put_nommu_region() may BUG
because the region tree is empty or it may corrupt the region tree.

To get around this, we can afford to add the region to the region tree
before calling do_mmap_shared_file() or do_mmap_private() as we keep
nommu_region_sem write-locked, so no-one can race with us by seeing a
transient region.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a190887b

19 8月, 2009 1 次提交

nommu: check fd read permission in validate_mmap_request() · 28d7a6ae

由 Graff Yang 提交于 8月 18, 2009

According to the POSIX (1003.1-2008), the file descriptor shall have been
opened with read permission, regardless of the protection options specified to
mmap().  The ltp test cases mmap06/07 need this.
Signed-off-by: NGraff Yang <graff.yang@gmail.com>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

28d7a6ae

17 8月, 2009 1 次提交

Security/SELinux: seperate lsm specific mmap_min_addr · 788084ab

由 Eric Paris 提交于 7月 31, 2009

Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable.  This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.
Signed-off-by: NEric Paris <eparis@redhat.com>
Signed-off-by: NJames Morris <jmorris@namei.org>

788084ab

06 8月, 2009 1 次提交

Security/SELinux: seperate lsm specific mmap_min_addr · a2551df7

由 Eric Paris 提交于 7月 31, 2009

Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable.  This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.
Signed-off-by: NEric Paris <eparis@redhat.com>
Signed-off-by: NJames Morris <jmorris@namei.org>

a2551df7

26 6月, 2009 1 次提交

nommu: provide follow_pfn(). · dfc2f91a

由 Paul Mundt 提交于 6月 26, 2009

With the introduction of follow_pfn() as an exported symbol, modules have
begun making use of it. Unfortunately this was not reflected on nommu at
the time, so the in-tree users have subsequently all blown up with link
errors there.

This provides a simple follow_pfn() that just returns addr >> PAGE_SHIFT,
which will do the right thing on nommu. There is no need to do range
checking within the vma, as the find_vma() case will already take care of
this.
Signed-off-by: NPaul Mundt <lethal@linux-sh.org>

dfc2f91a

Linux-御风守护者 / linux 与 Fork 源项目一致

Linux-御风守护者 / linux
与 Fork 源项目一致