- 26 January 2011, 1 commit
-
-
Committed by Andrea Arcangeli

Commit 5d689240 ("thp: select CONFIG_COMPACTION if TRANSPARENT_HUGEPAGE enabled") causes this warning during the configuration process:

    warning: (TRANSPARENT_HUGEPAGE) selects COMPACTION which has unmet direct dependencies (EXPERIMENTAL && HUGETLB_PAGE && MMU)

COMPACTION doesn't depend on HUGETLB_PAGE, and it doesn't depend on THP either; it is also useful for regular alloc_pages(order > 0) allocations, including the kernel stack allocated during fork (THREAD_ORDER = 1). It's always better to enable COMPACTION.

The warning should really be an error, because we would end up with MIGRATION not selected, and COMPACTION cannot work without migration (although it appears to build, thanks to an inline migrate_pages() stub returning -ENOSYS).

I'd also like to remove EXPERIMENTAL: compaction has been in the kernel for some releases (for full safety the default remains disabled, which I think is enough).

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Luca Tettamanti <kronos.it@gmail.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 14 January 2011, 4 commits
-
-
Committed by Andrea Arcangeli

With transparent hugepage support we need compaction for the "defrag" sysfs controls to be effective. At the moment THP hangs the system if COMPACTION isn't selected, as without COMPACTION lumpy reclaim isn't entirely disabled. So at the moment the two aren't orthogonal. Once lumpy reclaim is removed from the VM I could in theory drop the select COMPACTION, but even then 99% of THP users would still be making a mistake by disabling compaction, even if the mistake wouldn't result in a fatal runtime failure but only in slightly degraded performance.

So from a theoretical standpoint forcing the select below is not needed (the dependency is strict neither at compile time nor at runtime), but from a practical standpoint it is safer. If anybody really wants THP to run without compaction, it'd be such a weird setup that editing the Kconfig file to allow it will surely not be a problem.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Andrea Arcangeli

Allow choosing between the always|madvise defaults for page faults and khugepaged at config time. madvise guarantees zero risk of a higher memory footprint for applications (applications using madvise(MADV_HUGEPAGE) won't risk using any more memory by backing their virtual regions with hugepages).

Initially set the default to N and don't depend on EMBEDDED.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
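For context, a sketch of how this compile-time choice typically seeds the runtime default; the flag names mirror later mainline mm/huge_memory.c and should be treated as illustrative:

    /* mm/huge_memory.c (sketch): the Kconfig choice seeds the sysfs default */
    unsigned long transparent_hugepage_flags __read_mostly =
    #ifdef CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS
        (1 << TRANSPARENT_HUGEPAGE_FLAG) |          /* default: always */
    #endif
    #ifdef CONFIG_TRANSPARENT_HUGEPAGE_MADVISE
        (1 << TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG) | /* default: madvise only */
    #endif
        0;

Either default can still be overridden at runtime through the transparent_hugepage sysfs controls; the Kconfig choice only picks the boot-time value.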
-
Committed by Johannes Weiner

Add support for transparent hugepages to 32-bit x86.

Share the same VM_ bit flag with VM_MAPPED_COPY; mm/nommu.c will never support transparent hugepages, so the bit cannot collide.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Andrea Arcangeli
Add config option.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 02 October 2010, 1 commit
-
-
Committed by Tejun Heo

On UP, percpu allocations were redirected to kmalloc. This has the following problems.

* For a certain number of allocations (determined by PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), the percpu allocator can be used before the usual kernel memory allocator is brought online. On SMP, this is used to initialize the kernel memory allocator.

* The percpu allocator honors alignment up to PAGE_SIZE but kmalloc() doesn't. For example, workqueue makes use of larger alignments for cpu_workqueues.

Currently, users of percpu allocators need to handle UP differently, which is somewhat fragile and ugly. Other than a small amount of memory, there isn't much to lose by enabling the percpu allocator on UP. It can simply use the kernel-memory-based chunk allocation which was added for SMP archs without MMUs.

This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too, and makes the UP build use percpu-km. As percpu addresses and kernel addresses are always identity mapped and static percpu variables don't need any special treatment, nothing is arch dependent and mm/percpu.c implements a generic setup_per_cpu_areas() for UP.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
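A kernel-side sketch of the point being made: after this change the same percpu API behaves identically on UP and SMP, including alignment guarantees kmalloc() cannot give (the struct and init function here are hypothetical):

    #include <linux/percpu.h>
    #include <linux/types.h>

    struct pkt_stats {
        u64 rx, tx;
    } ____cacheline_aligned;          /* alignment honored up to PAGE_SIZE */

    static struct pkt_stats __percpu *stats;

    static int __init stats_init(void)
    {
        stats = alloc_percpu(struct pkt_stats);  /* now works the same on UP */
        if (!stats)
            return -ENOMEM;
        this_cpu_inc(stats->rx);                 /* per-CPU access, no locking */
        return 0;
    }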
-
- 10 September 2010, 1 commit
-
-
Committed by Andrea Arcangeli

COMPACTION enables MIGRATION, but MIGRATION spawns a warning if NUMA or memory hotplug aren't selected. However, MIGRATION doesn't actually depend on them. The check is presumably just trying to be strict by double-checking who is enabling MIGRATION, but it doesn't know that compaction also enables MIGRATION.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 08 September 2010, 1 commit
-
-
Committed by Tejun Heo

On UP, percpu allocations were redirected to kmalloc. This has the following problems.

* For a certain number of allocations (determined by PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), the percpu allocator can be used before the usual kernel memory allocator is brought online. On SMP, this is used to initialize the kernel memory allocator.

* The percpu allocator honors alignment up to PAGE_SIZE but kmalloc() doesn't. For example, workqueue makes use of larger alignments for cpu_workqueues.

Currently, users of percpu allocators need to handle UP differently, which is somewhat fragile and ugly. Other than a small amount of memory, there isn't much to lose by enabling the percpu allocator on UP. It can simply use the kernel-memory-based chunk allocation which was added for SMP archs without MMUs.

This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too, and makes the UP build use percpu-km. As percpu addresses and kernel addresses are always identity mapped and static percpu variables don't need any special treatment, nothing is arch dependent and mm/percpu.c implements a generic setup_per_cpu_areas() for UP.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
-
- 14 July 2010, 1 commit
-
-
Committed by Yinghai Lu

Renamed via the following script:

    FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
    sed -i \
        -e 's/lmb/memblock/g' \
        -e 's/LMB/MEMBLOCK/g' \
        $FILES

    for N in $(find . -name lmb.[ch]); do
        M=$(echo $N | sed 's/lmb/memblock/g')
        mv $N $M
    done

then undo a few false hits, such as lmbench and dlmb. Also move memblock.c from lib/ to mm/.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 25 May 2010, 1 commit
-
-
Committed by Mel Gorman

CONFIG_MIGRATION currently depends on CONFIG_NUMA or on the architecture being able to hot-remove memory. The main users of page migration, such as sys_move_pages(), sys_migrate_pages() and cpuset process migration, are only beneficial on NUMA, so this made sense.

As memory compaction operates within a zone and is useful on both NUMA and non-NUMA systems, this patch allows CONFIG_MIGRATION to be set if the user selects CONFIG_COMPACTION as an option.

[akpm@linux-foundation.org: Depend on CONFIG_HUGETLB_PAGE]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 February 2010, 1 commit
-
-
Committed by Yinghai Lu

Add vmemmap_alloc_block_buf for the mem map only. It falls back to the old allocation path if it cannot get a block that big.

Before this patch, when a node has 128GB of RAM installed, the memmap is split into two or more parts:

    [    0.000000]  [ffffea0000000000-ffffea003fffffff] PMD -> [ffff880100600000-ffff88013e9fffff] on node 1
    [    0.000000]  [ffffea0040000000-ffffea006fffffff] PMD -> [ffff88013ec00000-ffff88016ebfffff] on node 1
    [    0.000000]  [ffffea0070000000-ffffea007fffffff] PMD -> [ffff882000600000-ffff8820105fffff] on node 0
    [    0.000000]  [ffffea0080000000-ffffea00bfffffff] PMD -> [ffff882010800000-ffff8820507fffff] on node 0
    [    0.000000]  [ffffea00c0000000-ffffea00dfffffff] PMD -> [ffff882050a00000-ffff8820709fffff] on node 0
    [    0.000000]  [ffffea00e0000000-ffffea00ffffffff] PMD -> [ffff884000600000-ffff8840205fffff] on node 2
    [    0.000000]  [ffffea0100000000-ffffea013fffffff] PMD -> [ffff884020800000-ffff8840607fffff] on node 2
    [    0.000000]  [ffffea0140000000-ffffea014fffffff] PMD -> [ffff884060a00000-ffff8840709fffff] on node 2
    [    0.000000]  [ffffea0150000000-ffffea017fffffff] PMD -> [ffff886000600000-ffff8860305fffff] on node 3
    [    0.000000]  [ffffea0180000000-ffffea01bfffffff] PMD -> [ffff886030800000-ffff8860707fffff] on node 3
    [    0.000000]  [ffffea01c0000000-ffffea01ffffffff] PMD -> [ffff888000600000-ffff8880405fffff] on node 4
    [    0.000000]  [ffffea0200000000-ffffea022fffffff] PMD -> [ffff888040800000-ffff8880707fffff] on node 4
    [    0.000000]  [ffffea0230000000-ffffea023fffffff] PMD -> [ffff88a000600000-ffff88a0105fffff] on node 5
    [    0.000000]  [ffffea0240000000-ffffea027fffffff] PMD -> [ffff88a010800000-ffff88a0507fffff] on node 5
    [    0.000000]  [ffffea0280000000-ffffea029fffffff] PMD -> [ffff88a050a00000-ffff88a0709fffff] on node 5
    [    0.000000]  [ffffea02a0000000-ffffea02bfffffff] PMD -> [ffff88c000600000-ffff88c0205fffff] on node 6
    [    0.000000]  [ffffea02c0000000-ffffea02ffffffff] PMD -> [ffff88c020800000-ffff88c0607fffff] on node 6
    [    0.000000]  [ffffea0300000000-ffffea030fffffff] PMD -> [ffff88c060a00000-ffff88c0709fffff] on node 6
    [    0.000000]  [ffffea0310000000-ffffea033fffffff] PMD -> [ffff88e000600000-ffff88e0305fffff] on node 7
    [    0.000000]  [ffffea0340000000-ffffea037fffffff] PMD -> [ffff88e030800000-ffff88e0707fffff] on node 7

After the patch we get:

    [    0.000000]  [ffffea0000000000-ffffea006fffffff] PMD -> [ffff880100200000-ffff88016e5fffff] on node 0
    [    0.000000]  [ffffea0070000000-ffffea00dfffffff] PMD -> [ffff882000200000-ffff8820701fffff] on node 1
    [    0.000000]  [ffffea00e0000000-ffffea014fffffff] PMD -> [ffff884000200000-ffff8840701fffff] on node 2
    [    0.000000]  [ffffea0150000000-ffffea01bfffffff] PMD -> [ffff886000200000-ffff8860701fffff] on node 3
    [    0.000000]  [ffffea01c0000000-ffffea022fffffff] PMD -> [ffff888000200000-ffff8880701fffff] on node 4
    [    0.000000]  [ffffea0230000000-ffffea029fffffff] PMD -> [ffff88a000200000-ffff88a0701fffff] on node 5
    [    0.000000]  [ffffea02a0000000-ffffea030fffffff] PMD -> [ffff88c000200000-ffff88c0701fffff] on node 6
    [    0.000000]  [ffffea0310000000-ffffea037fffffff] PMD -> [ffff88e000200000-ffff88e0701fffff] on node 7

-v2: change buf to vmemmap_buf, per Ingo; also add CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER, per Ingo
-v3: per Andrew, use sizeof(name) instead of a hard-coded 15

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-19-git-send-email-yinghai@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 05 January 2010, 1 commit
-
-
Committed by Paul Mundt

We previously had two quicklists: one for the PGD case and one for PTEs. Now that the PGD/PMD cases are handled through slab caches due to the multi-level configurability, only the PTE quicklist remains. As such, reduce NR_QUICK to its appropriate size and bump down the PTE quicklist index.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
-
- 22 December 2009, 1 commit
-
-
Committed by Andi Kleen

The injector filter requires stable_page_flags(), which is supplied by procfs, so make it depend on that.

Also add ifdefs around the filter code in memory-failure.c so that when the filter is disabled due to missing dependencies, the whole thing still builds.

Reported-by: Ingo Molnar
Signed-off-by: Andi Kleen <ak@linux.intel.com>
-
- 17 December 2009, 1 commit
-
-
Committed by David Howells

In NOMMU mode, clamp dac_mmap_min_addr to zero so that the tests on it are skipped by the compiler. We do this because the minimum mmap address doesn't make any sense in NOMMU mode.

mmap_min_addr and round_hint_to_min() can be discarded entirely in NOMMU mode.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
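A sketch of the trick, using the names from the commit text; check_hint() is a hypothetical caller shown only to illustrate the dead-code elimination:

    /* sketch: after the patch */
    #ifdef CONFIG_MMU
    extern unsigned long dac_mmap_min_addr;
    #else
    #define dac_mmap_min_addr  0UL   /* no minimum mmap address on NOMMU */
    #endif

    static inline int check_hint(unsigned long hint)
    {
        /* with dac_mmap_min_addr fixed at 0UL, the compiler drops this test */
        if (hint < dac_mmap_min_addr)
            return -EPERM;
        return 0;
    }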
-
- 16 December 2009, 5 commits
-
-
Committed by Andi Kleen
Signed-off-by: Andi Kleen <ak@linux.intel.com>
-
Committed by Wu Fengguang

When specified, only poison pages if ((page_flags & mask) == value):

- corrupt-filter-flags-mask
- corrupt-filter-flags-value

This allows stress testing of many kinds of pages.

Strictly speaking, the buddy pages case requires taking the zone lock, to avoid setting PG_hwpoison on a "was buddy but now allocated to someone" page. However, we can just do nothing, because we set PG_locked at the beginning; this prevents the page allocator from allocating the page to someone. (It will BUG() on the unexpected PG_locked, which is fine for hwpoison testing.)

[AK: Add select PROC_PAGE_MONITOR to satisfy dependency]
CC: Nick Piggin <npiggin@suse.de>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
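A sketch of the filter predicate this describes, modeled on the hwpoison filter code; treat the exact identifiers as illustrative:

    /* poison only pages whose flags match the configured mask/value pair */
    static u64 hwpoison_filter_flags_mask;
    static u64 hwpoison_filter_flags_value;

    static int hwpoison_filter_flags(struct page *p)
    {
        if (!hwpoison_filter_flags_mask)
            return 0;            /* filter disabled: poison everything */
        if ((stable_page_flags(p) & hwpoison_filter_flags_mask) ==
            hwpoison_filter_flags_value)
            return 0;            /* flags match: allow poisoning */
        return -EINVAL;          /* no match: skip this page */
    }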
-
Committed by Hugh Dickins

Now that ksm pages are swappable, and the known holes are plugged, remove mention of unswappable kernel pages from the KSM documentation and comments.

Remove the totalram_pages/4 initialization of max_kernel_pages. In fact, remove max_kernel_pages altogether: we can reinstate it if removal turns out to break someone's script, but if we later want to limit KSM's memory usage, limiting the stable nodes would not be an effective approach.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wright <chrisw@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Hugh Dickins

CONFIG_DEBUG_SPINLOCK adds 12 or 16 bytes to a 32- or 64-bit spinlock_t, and CONFIG_DEBUG_LOCK_ALLOC adds another 12 or 24 bytes to it: lockdep enables both of those, and CONFIG_LOCK_STAT adds 8 or 16 bytes on top.

When 2.6.15 placed the split page table lock inside struct page (usually sized 32 or 56 bytes), only CONFIG_DEBUG_SPINLOCK was a possibility, and we ignored the enlargement (but fitted in CONFIG_GENERIC_LOCKBREAK's 4 bytes by letting the spinlock_t occupy both page->private and page->mapping).

Should these debugging options be allowed to double the size of a struct page, when only one minority use of the page (as a page table) needs to fit a spinlock in there? Perhaps not. Take the easy way out: switch off SPLIT_PTLOCK_CPUS when DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC is in force.

I've sometimes tried to be cleverer, kmallocing a cacheline for the spinlock when it doesn't fit, but given up each time. Falling back to mm->page_table_lock (as we do when ptlock is not split) lets lockdep check out the strictest path anyway. And now that some arches allow 8192 CPUs, use 999999 for infinity.

(What has this got to do with KSM swapping? It doesn't care about the size of struct page, but may care about random junk in page->mapping - to be explained separately later.)

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
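For reference, a sketch of how the cutoff works: the split lock is only used when NR_CPUS reaches the configured threshold, so a threshold of 999999 effectively disables it (the macro mirrors mainline of that era; treat it as illustrative):

    /* include/linux/mm_types.h (sketch) */
    #define USE_SPLIT_PTLOCKS  (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS)

    #if USE_SPLIT_PTLOCKS
    /* a spinlock_t embedded in struct page guards each page-table page */
    #else
    /* everything falls back to the single mm->page_table_lock */
    #endif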
-
Committed by Hugh Dickins

Remove three degrees of obfuscation, left over from when we had CONFIG_UNEVICTABLE_LRU: MLOCK_PAGES is CONFIG_HAVE_MLOCKED_PAGE_BIT is CONFIG_HAVE_MLOCK is CONFIG_MMU. rmap.o (and memory-failure.o) are only built when CONFIG_MMU, so they don't need such conditions at all.

Somehow, I feel no compulsion to remove the CONFIG_HAVE_MLOCK* lines from 169 defconfigs: leave those to evolve in due course.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 18 November 2009, 1 commit
-
-
Committed by Andi Kleen

Allow memory hotplug and hibernation in the same kernel. Memory hotplug and hibernation were mutually exclusive in Kconfig. This is obviously a problem for distribution kernels that want to support both in the same image.

After some discussions with Rafael and others, the only real conflict is a parallel memory hot-add or hot-remove while a hibernation operation is in progress. It was also working for s390 before.

This patch removes the Kconfig-level exclusion and simply makes the memory add/remove functions grab the pm_mutex to exclude against hibernation.

Fixes a regression: old kernels didn't exclude memory hot-add and hibernation.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
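A sketch of the serialization described above; the commit text names pm_mutex, and the worker function here is hypothetical:

    /* mm/memory_hotplug.c (sketch) */
    int add_memory(int nid, u64 start, u64 size)
    {
        int ret;

        mutex_lock(&pm_mutex);     /* exclude a concurrent hibernation */
        ret = do_add_memory(nid, start, size);  /* hypothetical worker */
        mutex_unlock(&pm_mutex);
        return ret;
    }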
-
- 29 October 2009, 1 commit
-
-
Committed by Russell King

Currently, sparsemem is only available if EXPERIMENTAL is enabled. However, it hasn't ever actually been marked experimental.

It's been about four years since sparsemem was merged, and we have platforms which depend on it; allow architectures to decide whether sparsemem should be the default memory model.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 27 October 2009, 1 commit
-
-
Committed by Kumar Gala
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 08 October 2009, 1 commit
-
-
Committed by Hugh Dickins

Adjust the max_kernel_pages default to a quarter of totalram_pages, instead of nr_free_buffer_pages() / 4: the KSM pages themselves come from highmem, and even on a 16GB PAE machine, 4GB of KSM pages would only be pinning 32MB of lowmem with their rmap_items, so there is no need for the more obscure calculation (nor for its own special init function).

There is no way for the user to switch KSM on if CONFIG_SYSFS is not enabled, so in that case default run to KSM_RUN_MERGE.

Update the KSM documentation and Kconfig to reflect the new defaults.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 27 September 2009, 1 commit
-
-
Committed by Linus Torvalds

This build failure triggers:

    In file included from include/linux/suspend.h:8,
                     from arch/x86/kernel/asm-offsets_32.c:11,
                     from arch/x86/kernel/asm-offsets.c:2:
    include/linux/mm.h:503:2: error: #error SECTIONS_WIDTH+NODES_WIDTH+ZONES_WIDTH > BITS_PER_LONG - NR_PAGEFLAGS

because, due to the hwpoison page flag, we ran out of page flags on 32-bit. Don't turn on hwpoison on 32-bit NUMA (it's rare in any case).

Also clean up the Kconfig dependencies in the generic MM code by introducing ARCH_SUPPORTS_MEMORY_FAILURE.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
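The failing check is a compile-time assertion that all the fields packed into page->flags still leave room for the flag bits; one more flag (hwpoison) pushed the 32-bit NUMA layout over the edge. A sketch of the arithmetic:

    /*
     * page->flags must hold all of these at once (sketch):
     *
     *   | SECTION | NODE | ZONE | ... unused ... | page flag bits |
     *
     * so the build asserts:
     */
    #if SECTIONS_WIDTH + NODES_WIDTH + ZONES_WIDTH > BITS_PER_LONG - NR_PAGEFLAGS
    #error "not enough room in page->flags for all fields"
    #endif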
-
- 22 September 2009, 2 commits
-
-
Committed by Hugh Dickins
Add Documentation/vm/ksm.txt: how to use the Kernel Samepage Merging feature.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Hugh Dickins

This patch presents the mm interface to a dummy version of ksm.c, for better scrutiny of that interface: the real ksm.c follows later.

When CONFIG_KSM is not set, madvise(2) rejects MADV_MERGEABLE and MADV_UNMERGEABLE with EINVAL, since that seems more helpful than pretending they can be serviced. But when CONFIG_KSM=y, accept them even if KSM is not currently running, and even on areas which KSM will not touch (e.g. hugetlb or shared file or special driver mappings).

Like other madvise calls, report ENOMEM despite success if any area in the range is unmapped, and use EAGAIN to report out of memory.

Define the vma flag VM_MERGEABLE to identify an area on which KSM may try merging pages: leave it to ksm_madvise() to decide whether to set it. Define the mm flag MMF_VM_MERGEABLE to identify an mm which might contain VM_MERGEABLE areas, to minimize callouts when forking or exiting.

Based upon earlier patches by Chris Wright and Izik Eidus.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
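A runnable userspace illustration of the madvise interface being added; MADV_MERGEABLE is the real flag, while the mapping itself is just an example:

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 4096 * 16;
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return 1;
        memset(p, 'x', len);   /* identical pages: candidates for merging */

        /* EINVAL here means the kernel was built without CONFIG_KSM */
        if (madvise(p, len, MADV_MERGEABLE) != 0)
            perror("madvise(MADV_MERGEABLE)");
        return 0;
    }

On a CONFIG_KSM=y kernel the call succeeds even if the KSM daemon is not currently running (it is started via /sys/kernel/mm/ksm/run).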
-
- 16 September 2009, 2 commits
-
-
Committed by Andi Kleen

Useful for some testing scenarios, although specific testing is often done better through MADV_POISON.

This can be done with the x86-level MCE injector too, but this interface allows doing it independently of low-level x86 changes.

v2: Add module license (Haicheng Li)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
-
Committed by Andi Kleen

Add the high-level memory handler that poisons pages that got corrupted by hardware (typically by a two-bit flip in a DIMM or a cache) at the Linux level. The goal is to prevent everyone from accessing these pages in the future.

This is done at the VM level by marking a page hwpoisoned and taking the appropriate action based on the type of page it is. The code that does this is portable and lives in mm/memory-failure.c.

To quote the overview comment:

    High level machine check handler. Handles pages reported by the
    hardware as being corrupted, usually due to a 2-bit ECC memory or cache
    failure.

    This focuses on pages detected as corrupted in the background. When the
    current CPU tries to consume corruption, the currently running process
    can just be killed directly instead. This implies that if the error
    cannot be handled for some reason, it's safe to just ignore it, because
    no corruption has been consumed yet. Instead, when that happens another
    machine check will happen.

    Handles page cache pages in various states. The tricky part here is
    that we can access any page asynchronously with respect to other VM
    users, because memory failures could happen anytime and anywhere,
    possibly violating some of their assumptions. This is why this code has
    to be extremely careful. Generally it tries to use normal locking rules,
    as in getting the standard locks, even if that means the error handling
    takes potentially a long time.

    Some of the operations here are somewhat inefficient and have
    non-linear algorithmic complexity, because the data structures have not
    been optimized for this case. This is in particular the case for the
    mapping from a vma to a process. Since this case is expected to be rare,
    we hope we can get away with this.

There are in principle two strategies to kill processes on poison:

- just unmap the data and wait for an actual reference before killing
- kill as soon as corruption is detected

Both have advantages and disadvantages and should be used in different situations. Right now both are implemented and can be switched with the new sysctl vm.memory_failure_early_kill. The default is early kill.

The patch does some rmap data structure walking on its own to collect the processes to kill. This is unusual because normally all rmap data structure knowledge is in rmap.c only; I put it here for now to keep everything together, and rmap knowledge has been seeping out anyway.

Includes contributions from Johannes Weiner, Chris Mason, Fengguang Wu, Nick Piggin (who did a lot of great work) and others.

Cc: npiggin@suse.de
Cc: riel@redhat.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
-
- 01 September 2009, 1 commit
-
-
Committed by H. Peter Anvin

CONFIG_PAGEFLAGS_EXTENDED disables a trick to conserve page flags. This trick is intended to be enabled when the pressure on page flags is very high. The previous condition was:

    depends on 64BIT || SPARSEMEM_VMEMMAP || !NUMA || !SPARSEMEM

... however, the sparsemem code already has a way to crowd out the node number from the pageflags, which means that !NUMA actually doesn't contribute to hard pageflags exhaustion.

This is required for the new PG_uncached flag to not cause pageflags exhaustion on x86_32 + PAE + SPARSEMEM + !NUMA.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4A9828F4.4040905@zytor.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh Siddha <suresh.siddha@intel.com>
-
- 17 August 2009, 1 commit
-
-
Committed by Eric Paris

Currently SELinux enforcement of controls on the ability to map low memory is determined by the mmap_min_addr tunable. This patch causes SELinux to ignore the tunable and instead use a separate Kconfig option specific to how much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO, and SELinux permissions will always protect the amount of low memory designated by CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (the usual reason being that they run WINE as a non-root user) to do so, and still have SELinux controls preventing confined domains (like a web server) from being able to map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
-
- 06 August 2009, 1 commit
-
-
Committed by Eric Paris

Currently SELinux enforcement of controls on the ability to map low memory is determined by the mmap_min_addr tunable. This patch causes SELinux to ignore the tunable and instead use a separate Kconfig option specific to how much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO, and SELinux permissions will always protect the amount of low memory designated by CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (the usual reason being that they run WINE as a non-root user) to do so, and still have SELinux controls preventing confined domains (like a web server) from being able to map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
-
- 17 June 2009, 1 commit
-
-
Committed by KOSAKI Motohiro

Currently, nobody wants to turn UNEVICTABLE_LRU off. Thus this configurability is unnecessary.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andi Kleen <andi@firstfloor.org>
Acked-by: Minchan Kim <minchan.kim@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 June 2009, 1 commit
-
-
Committed by Gerald Schaefer
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 04 June 2009, 1 commit
-
-
Committed by Christoph Lameter

This patch removes the dependency of mmap_min_addr on CONFIG_SECURITY. It also sets a default mmap_min_addr of 4096. mmapping addresses below 4096 will then only be possible for processes with CAP_SYS_RAWIO.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Eric Paris <eparis@redhat.com>
Looks-ok-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
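A sketch of the resulting rule; the helper name is hypothetical, but the capability check is the mechanism the text describes:

    /* low mappings need CAP_SYS_RAWIO once addr < mmap_min_addr */
    static int mmap_min_addr_check(unsigned long addr)
    {
        if (addr >= mmap_min_addr)
            return 0;               /* high enough: always allowed */
        if (capable(CAP_SYS_RAWIO))
            return 0;               /* privileged: allowed anyway */
        return -EPERM;              /* refuse the low mapping */
    }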
-
- 07 May 2009, 1 commit
-
-
Committed by David Howells

NOMMU mmap() has an option, controlled by a sysctl variable, that determines whether the allocations made by do_mmap_private() should have the excess space trimmed off and returned to the allocator. Make the initial setting of this variable a Kconfig option.

The reason there can be excess space is that the allocator only allocates in power-of-2 sized chunks, but mmap()s can be made in sizes that aren't a power of 2. There are two alternatives:

(1) Keep the excess as dead space. The dead space then remains unused for the lifetime of the mapping. Mappings of shared objects such as libc, ld.so or busybox's text segment may retain their dead space forever.

(2) Return the excess to the allocator. This means that the dead space is limited to less than a page per mapping, but for a transient process there's more chance of fragmentation, as the excess space may be reused fairly quickly.

During the boot process a lot of transient processes are created, and this can cause a lot of fragmentation as the pagecache and various slabs grow greatly during this time. By turning off the trimming of excess space during boot and disabling batching of frees, Coldfire can manage to boot.

A better way of doing things might be to have /sbin/init turn this option off. By that point libc, ld.so and init - which are all long-duration processes - have all been loaded and trimmed.

Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
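A sketch of how such a Kconfig default typically seeds the sysctl; the variable follows mm/nommu.c naming conventions, but treat the exact line as illustrative:

    /* mm/nommu.c (sketch): boot-time default for the excess-trimming sysctl */
    int sysctl_nr_trim_pages = CONFIG_NOMMU_INITIAL_TRIM_EXCESS;

Setting the option to 0 preserves the excess as dead space (alternative 1); any non-zero value makes do_mmap_private() give the excess back (alternative 2).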
-
- 14 April 2009, 1 commit
-
-
Committed by David Howells
Point the UNEVICTABLE_LRU config option at the documentation describing the option.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 01 April 2009, 2 commits
-
-
Committed by David Howells
Make CONFIG_UNEVICTABLE_LRU available when CONFIG_MMU=n. There's no logical reason it shouldn't be available, and it can be used for ramfs.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Greg Ungerer <gerg@snapgear.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by David Howells
The mlock() facility does not exist for NOMMU since all mappings are effectively locked anyway, so we don't make the bits available when they're not useful.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Greg Ungerer <gerg@snapgear.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 January 2009, 1 commit
-
-
Committed by Geert Uytterhoeven
commit 8308c54d ("generic: redefine resource_size_t as phys_addr_t") made CONFIG_RESOURCES_64BIT obsolete, but didn't remove it. Remove it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 20 October 2008, 1 commit
-
-
Committed by Lee Schermerhorn

When the system contains lots of mlocked or otherwise unevictable pages, the pageout code (kswapd) can spend lots of time scanning over these pages. Worse still, the presence of lots of unevictable pages can confuse kswapd into thinking that more aggressive pageout modes are required, resulting in all kinds of bad behaviour.

This adds infrastructure to manage pages excluded from reclaim, i.e. hidden from vmscan. It is based on a patch by Larry Woodman of Red Hat, reworked to maintain "unevictable" pages on a separate per-zone LRU list, to "hide" them from vmscan. Kosaki Motohiro added the support for the memory controller unevictable lru list.

Pages on the unevictable list have both PG_unevictable and PG_lru set. Thus, PG_unevictable is analogous to and mutually exclusive with PG_active: it specifies which LRU list the page is on.

The unevictable infrastructure is enabled by a new mm Kconfig option, [CONFIG_]UNEVICTABLE_LRU.

A new function, 'page_evictable(page, vma)', in vmscan.c tests whether or not a page may be evictable. Subsequent patches will add the various !evictable tests. We'll want to keep these tests light-weight for use in shrink_active_list() and, possibly, the fault path.

To avoid races between tasks putting pages [back] onto an LRU list and tasks that might be moving a page from the non-evictable to the evictable state, the new function 'putback_lru_page()' - the inverse of 'isolate_lru_page()' - tests the "evictability" of a page after placing it on the LRU, before dropping the reference. If the page has become unevictable, putback_lru_page() will redo the 'putback', thus moving the page to the unevictable list. This way, we avoid "stranding" evictable pages on the unevictable list; see the sketch after this message.

[akpm@linux-foundation.org: fix fallout from out-of-order merge]
[riel@redhat.com: fix UNEVICTABLE_LRU and !PROC_PAGE_MONITOR build]
[nishimura@mxp.nes.nec.co.jp: remove redundant mapping check]
[kosaki.motohiro@jp.fujitsu.com: unevictable-lru-infrastructure: putback_lru_page()/unevictable page handling rework]
[kosaki.motohiro@jp.fujitsu.com: kill unnecessary lock_page() in vmscan.c]
[kosaki.motohiro@jp.fujitsu.com: revert migration change of unevictable lru infrastructure]
[kosaki.motohiro@jp.fujitsu.com: revert to unevictable-lru-infrastructure-kconfig-fix.patch]
[kosaki.motohiro@jp.fujitsu.com: restore patch failure of vmstat-unevictable-and-mlocked-pages-vm-events.patch]
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Debugged-by: Benjamin Kidwell <benjkidwell@yahoo.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
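A sketch of the putback_lru_page() recheck described above, simplified from the mm/vmscan.c logic of that era; treat the details as illustrative:

    /* mm/vmscan.c (sketch) */
    void putback_lru_page(struct page *page)
    {
    redo:
        ClearPageUnevictable(page);
        if (page_evictable(page, NULL)) {
            /* back onto a normal (in)active LRU list */
            lru_cache_add_lru(page, page_lru(page));
        } else {
            /* hide the page from vmscan */
            add_page_to_unevictable_list(page);
        }
        /*
         * The page's state may have changed while we put it back: if an
         * unevictable page just became evictable, re-isolate it and redo
         * the putback so it is not stranded on the unevictable list.
         */
        if (unlikely(PageUnevictable(page) && page_evictable(page, NULL) &&
                     !isolate_lru_page(page))) {
            put_page(page);   /* drop old ref; isolation took a new one */
            goto redo;
        }
        put_page(page);       /* drop the reference taken at isolation */
    }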
-