- 22 9月, 2009 40 次提交
-
-
由 Izik Eidus 提交于
Adding Hugh Dickins into the authors list. Signed-off-by: NIzik Eidus <ieidus@redhat.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
KSM's scan allows for user pages to be COWed or unmapped at any time, without requiring any notification. But its stable tree does assume that when it finds a KSM page where it placed a KSM page, then it is the same KSM page that it placed there. mremap move could break that assumption: if an area containing a KSM page was unmapped, then an area containing a different KSM page was moved with mremap into the place of the original, before KSM's scan came around to notice. That could then poison a node of the stable tree, so that memcmps would "lie" and upset the ordering of the tree. Probably noone will ever need mremap move on a VM_MERGEABLE area; except that prohibiting it would make trouble for schemes in which we try making everything VM_MERGEABLE e.g. for testing: an mremap which normally works would then fail mysteriously. There's no need to go to any trouble, such as re-sorting KSM's list of rmap_items to match the new layout: simply unmerge the area to COW all its KSM pages before moving, but leave VM_MERGEABLE on so that they're remerged later. Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NChris Wright <chrisw@redhat.com> Signed-off-by: NIzik Eidus <ieidus@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Izik Eidus 提交于
Ksm is code that allows merging of identical pages between one or more applications, in a way invisible to the applications that use it. Pages that are merged are marked as read-only, then COWed when any application tries to change them. Whereas fork() allows sharing anonymous pages between parent and child, ksm can share anonymous pages between unrelated processes. Ksm works by walking over the memory pages of the applications it scans, in order to find identical pages. It uses two sorted data structures, called the stable and unstable trees, to locate identical pages in an effective way. When ksm finds two identical pages, it marks them as readonly and merges them into a single page. After the pages have been marked as readonly and merged into one, Linux treats them as normal copy-on-write pages, copying to a fresh anonymous page if write access is required later. Ksm scans and merges anonymous pages only in those memory areas that have been registered with it by madvise(addr, length, MADV_MERGEABLE). The ksm scanner is controlled by sysfs files in /sys/kernel/mm/ksm/: max_kernel_pages - the maximum number of unswappable kernel pages which may be allocated by ksm (0 for unlimited). kernel_pages_allocated - how many ksm pages are currently allocated, sharing identical content between different processes (pages unswappable in this release). pages_shared - how many pages have been saved by sharing with ksm pages (kernel_pages_allocated being excluded from this count). pages_to_scan - how many pages ksm should scan before sleeping. sleep_millisecs - how many milliseconds ksm should sleep between scans. run - write 0 to disable ksm, read 0 while ksm is disabled (default), write 1 to run ksm, read 1 while ksm is running, write 2 to disable ksm and unmerge all its pages. Includes contributions by Andrea Arcangeli Chris Wright and Hugh Dickins. [hugh.dickins@tiscali.co.uk: fix rare page leak] Signed-off-by: NIzik Eidus <ieidus@redhat.com> Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NChris Wright <chrisw@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
KSM will need to identify its kernel merged pages unambiguously, and /proc/kpageflags will probably like to do so too. Since KSM will only be substituting anonymous pages, statistics are best preserved by making a PageKsm page a special PageAnon page: one with no anon_vma. But KSM then needs its own page_add_ksm_rmap() - keep it in ksm.h near PageKsm; and do_wp_page() must COW them, unlike singly mapped PageAnons. Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NChris Wright <chrisw@redhat.com> Signed-off-by: NIzik Eidus <ieidus@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
page_dup_rmap(), used on each mapped page when forking, was originally just an inline atomic_inc of mapcount. 2.6.22 added CONFIG_DEBUG_VM out-of-line checks to it, which would need to be ever-so-slightly complicated to allow for the PageKsm() we're about to define. But I think these checks never caught anything. And if it's coding errors we're worried about, such checks should be in page_remove_rmap() too, not just when forking; whereas if it's pagetable corruption we're worried about, then they shouldn't be limited to CONFIG_DEBUG_VM. Oh, just revert page_dup_rmap() to an inline atomic_inc of mapcount. Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NChris Wright <chrisw@redhat.com> Signed-off-by: NIzik Eidus <ieidus@redhat.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
This patch presents the mm interface to a dummy version of ksm.c, for better scrutiny of that interface: the real ksm.c follows later. When CONFIG_KSM is not set, madvise(2) reject MADV_MERGEABLE and MADV_UNMERGEABLE with EINVAL, since that seems more helpful than pretending that they can be serviced. But when CONFIG_KSM=y, accept them even if KSM is not currently running, and even on areas which KSM will not touch (e.g. hugetlb or shared file or special driver mappings). Like other madvices, report ENOMEM despite success if any area in the range is unmapped, and use EAGAIN to report out of memory. Define vma flag VM_MERGEABLE to identify an area on which KSM may try merging pages: leave it to ksm_madvise() to decide whether to set it. Define mm flag MMF_VM_MERGEABLE to identify an mm which might contain VM_MERGEABLE areas, to minimize callouts when forking or exiting. Based upon earlier patches by Chris Wright and Izik Eidus. Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NChris Wright <chrisw@redhat.com> Signed-off-by: NIzik Eidus <ieidus@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
The out-of-tree KSM used ioctls on fds cloned from /dev/ksm to register a memory area for merging: we prefer now to use an madvise(2) interface. This patch just defines MADV_MERGEABLE (to tell KSM it may merge pages in this area found identical to pages in other mergeable areas) and MADV_UNMERGEABLE (to undo that). Most architectures use asm-generic, but alpha, mips, parisc, xtensa need their own definitions: included here for mmotm convenience, but we'll probably want to split this and feed pieces to arch maintainers. Based upon earlier patches by Chris Wright and Izik Eidus. Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NChris Wright <chrisw@redhat.com> Signed-off-by: NIzik Eidus <ieidus@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Helge Deller <deller@gmx.de> Cc: Chris Zankel <chris@zankel.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
madvise.c has several levels of switch statements, what to do in which? Move MADV_DOFORK code down from madvise_vma() to madvise_behavior(), so madvise_vma() can be a simple router, to madvise_behavior() by default. vma->vm_flags is an unsigned long so use the same type for new_flags. Add missing comment lines to describe MADV_DONTFORK and MADV_DOFORK. Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NChris Wright <chrisw@redhat.com> Signed-off-by: NIzik Eidus <ieidus@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Izik Eidus 提交于
KSM is a linux driver that allows dynamicly sharing identical memory pages between one or more processes. Unlike tradtional page sharing that is made at the allocation of the memory, ksm do it dynamicly after the memory was created. Memory is periodically scanned; identical pages are identified and merged. The sharing is made in a transparent way to the processes that use it. Ksm is highly important for hypervisors (kvm), where in production enviorments there might be many copys of the same data data among the host memory. This kind of data can be: similar kernels, librarys, cache, and so on. Even that ksm was wrote for kvm, any userspace application that want to use it to share its data can try it. Ksm may be useful for any application that might have similar (page aligment) data strctures among the memory, ksm will find this data merge it to one copy, and even if it will be changed and thereforew copy on writed, ksm will merge it again as soon as it will be identical again. Another reason to consider using ksm is the fact that it might simplify alot the userspace code of application that want to use shared private data, instead that the application will mange shared area, ksm will do this for the application, and even write to this data will be allowed without any synchinization acts from the application. Ksm was designed to be a loadable module that doesn't change the VM code of linux. This patch: The set_pte_at_notify() macro allows setting a pte in the shadow page table directly, instead of flushing the shadow page table entry and then getting vmexit to set it. It uses a new change_pte() callback to do so. set_pte_at_notify() is an optimization for kvm, and other users of mmu_notifiers, for COW pages. It is useful for kvm when ksm is used, because it allows kvm not to have to receive vmexit and only then map the ksm page into the shadow page table, but instead map it directly at the same time as Linux maps the page into the host page table. Users of mmu_notifiers who don't implement new mmu_notifier_change_pte() callback will just receive the mmu_notifier_invalidate_page() callback. Signed-off-by: NIzik Eidus <ieidus@redhat.com> Signed-off-by: NChris Wright <chrisw@redhat.com> Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Avi Kivity <avi@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Johannes Weiner 提交于
By the time PG_mlocked is cleared in the page freeing path, nobody else is looking at our page->flags anymore. It is thus safe to make the test-and-clear non-atomic and thereby removing an unnecessary and expensive operation from a hotpath. Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Figo.zhang 提交于
There is no need for double error checking. Signed-off-by: NFigo.zhang <figo1802@gmail.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Akinobu Mita 提交于
__get_free_pages() with __GFP_HIGHMEM is not safe because the return address cannot represent a highmem page. get_zeroed_page() already has such a debug checking. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
The pages in the list passed move_active_pages_to_lru() are already touched by shrink_active_list(). IOW the prefetch in move_active_pages_to_lru() don't populate any cache. it's pointless. This patch remove it. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
The page_lru() already evaluate PageActive() and PageSwapBacked(). We don't need to re-evaluate it. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
The move_active_pages_to_lru() function is called under irq disabled and ClearPageActive() doesn't need irq disabling. Then, this patch move it into shrink_active_list(). Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
The VM already avoids attempting to reclaim anon pages in various places, But it doesn't avoid it for lumpy reclaim. It shuffles lru list unnecessary so that it is pointless. [akpm@linux-foundation.org: cleanup] Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wu Fengguang 提交于
global_lru_pages() / zone_lru_pages() can be used in two ways: - to estimate max reclaimable pages in determine_dirtyable_memory() - to calculate the slab scan ratio When swap is full or not present, the anon lru lists are not reclaimable and also won't be scanned. So the anon pages shall not be counted in both usage scenarios. Also rename to _reclaimable_pages: now they are counting the possibly reclaimable lru pages. It can greatly (and correctly) increase the slab scan rate under high memory pressure (when most file pages have been reclaimed and swap is full/absent), thus reduce false OOM kills. Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: NRik van Riel <riel@redhat.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NWu Fengguang <fengguang.wu@intel.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Cc: David Howells <dhowells@redhat.com> Cc: "Li, Ming Chun" <macli@brc.ubc.ca> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
Reported-by: NChristian Thaeter <ct@pipapo.org> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
__add_zone_page_state() and __sub_zone_page_state() are unused. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rik van Riel 提交于
When way too many processes go into direct reclaim, it is possible for all of the pages to be taken off the LRU. One result of this is that the next process in the page reclaim code thinks there are no reclaimable pages left and triggers an out of memory kill. One solution to this problem is to never let so many processes into the page reclaim path that the entire LRU is emptied. Limiting the system to only having half of each inactive list isolated for reclaim should be safe. Signed-off-by: NRik van Riel <riel@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
If the system is running a heavy load of processes then concurrent reclaim can isolate a large number of pages from the LRU. /proc/vmstat and the output generated for an OOM do not show how many pages were isolated. This has been observed during process fork bomb testing (mstctl11 in LTP). This patch shows the information about isolated pages. Reproduced via: ----------------------- % ./hackbench 140 process 1000 => OOM occur active_anon:146 inactive_anon:0 isolated_anon:49245 active_file:79 inactive_file:18 isolated_file:113 unevictable:0 dirty:0 writeback:0 unstable:0 buffer:39 free:370 slab_reclaimable:309 slab_unreclaimable:5492 mapped:53 shmem:15 pagetables:28140 bounce:0 Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: NRik van Riel <riel@redhat.com> Acked-by: NWu Fengguang <fengguang.wu@intel.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
If sc->isolate_pages() return 0, we don't need to call shrink_page_list(). In past days, shrink_inactive_list() handled it properly. But commit fb8d14e1 (three years ago commit!) breaked it. current shrink_inactive_list() always call shrink_page_list() although isolate_pages() return 0. This patch restore proper return value check. Requirements: o "nr_taken == 0" condition should stay before calling shrink_page_list(). o "nr_taken == 0" condition should stay after nr_scan related statistics modification. Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NRik van Riel <riel@redhat.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
Currently the pgmoved variable has two meanings. It causes harder reviewing. This patch separates it. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NRik van Riel <riel@redhat.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
It is possible for the oom killer to select current as the task to kill. When this happens, alloc_flags needs to be updated accordingly to set ALLOC_NO_WATERMARKS so the subsequent allocation attempt may use memory reserves as the result of its thread having TIF_MEMDIE set if the allocation is not __GFP_NOMEMALLOC. Acked-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
Recently we encountered OOM problems due to memory use of the GEM cache. Generally a large amuont of Shmem/Tmpfs pages tend to create a memory shortage problem. We often use the following calculation to determine the amount of shmem pages: shmem = NR_ACTIVE_ANON + NR_INACTIVE_ANON - NR_ANON_PAGES however the expression does not consider isolated and mlocked pages. This patch adds explicit accounting for pages used by shmem and tmpfs. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: NRik van Riel <riel@redhat.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Acked-by: NWu Fengguang <fengguang.wu@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
The amount of memory allocated to kernel stacks can become significant and cause OOM conditions. However, we do not display the amount of memory consumed by stacks. Add code to display the amount of memory used for stacks in /proc/meminfo. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
It is often useful to know the statistics for all pages that are handled like page cache pages when looking at OOM log output. Therefore show_free_areas() should also display buffer cache statistics. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: NWu Fengguang <fengguang.wu@intel.com> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
show_free_areas() displays only a limited amount of zone counters. This patch includes additional counters in the display to allow easier debugging. This may be especially useful if an OOM is due to running out of DMA memory. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Acked-by: NWu Fengguang <fengguang.wu@intel.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andi Kleen 提交于
Remove some very outdated recommendations in Documentation/memory.txt Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
If an OOM happens, we really want to know the number of remaining reclaimable pages. So the reclaimable slab and unreclaimable slab fields should not be combined for display. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
page_remove_rmap() has multiple PageAnon() tests and it has deep nesting. Clean this up. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Reviewed-by: NWu Fengguang <fengguang.wu@intel.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lee Schermerhorn 提交于
I noticed that alloc_bootmem_huge_page() will only advance to the next node on failure to allocate a huge page, potentially filling nodes with huge-pages. I asked about this on linux-mm and linux-numa, cc'ing the usual huge page suspects. Mel Gorman responded: I strongly suspect that the same node being used until allocation failure instead of round-robin is an oversight and not deliberate at all. It appears to be a side-effect of a fix made way back in commit 63b4613c ["hugetlb: fix hugepage allocation with memoryless nodes"]. Prior to that patch it looked like allocations would always round-robin even when allocation was successful. This patch--factored out of my "hugetlb mempolicy" series--moves the advance of the hstate next node from which to allocate up before the test for success of the attempted allocation. Note that alloc_bootmem_huge_page() is only used for order > MAX_ORDER huge pages. I'll post a separate patch for mainline/stable, as the above mentioned "balance freeing" series renamed the next node to alloc function. Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com> Reviewed-by: NMel Gorman <mel@csn.ul.ie> Reviewed-by: NAndy Whitcroft <apw@canonical.com> Reviewed-by: NAndi Kleen <andi@firstfloor.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lee Schermerhorn 提交于
Attempt to clarify huge page administration and usage, and updates the doucmentation to mention the balancing of huge pages across nodes when allocating and freeing. Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Eric Whitney <eric.whitney@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lee Schermerhorn 提交于
Use the [modified] free_pool_huge_page() function to return unused surplus pages. This will help keep huge pages balanced across nodes between freeing of unused surplus pages and freeing of persistent huge pages [from set_max_huge_pages] by using the same node id "cursor". It also eliminates some code duplication. Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Eric Whitney <eric.whitney@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lee Schermerhorn 提交于
Free huges pages from nodes in round robin fashion in an attempt to keep [persistent a.k.a static] hugepages balanced across nodes New function free_pool_huge_page() is modeled on and performs roughly the inverse of alloc_fresh_huge_page(). Replaces dequeue_huge_page() which now has no callers, so this patch removes it. Helper function hstate_next_node_to_free() uses new hstate member next_to_free_nid to distribute "frees" across all nodes with huge pages. Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Eric Whitney <eric.whitney@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Randy Dunlap 提交于
Ummark function as having kernel-doc notation, fixing the kernel-doc warning. Warning(mm/page_alloc.c:4519): No description found for parameter 'zone' Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
In test, some pages in swap-cache can't be migrated, as they aren't rmap. unmap_and_move() ignores swap-cache page which is just read in and hasn't rmap (see the comments in the code), but swap_aops provides .migratepage. Better to migrate such pages instead of ignore them. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Yakui Zhao <yakui.zhao@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
To initialize hotadded node, some pages are allocated. At that time, the node hasn't memory, this makes the allocation always fail. In such case, let's allocate pages from other nodes. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NYakui Zhao <yakui.zhao@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
Pages on movable zone have two types, MIGRATE_MOVABLE and MIGRATE_RESERVE, both them can be movable, because only movable memory allocation can get pages from movable zone. This makes pages in movable zone always be able to migrate. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Yakui Zhao <yakui.zhao@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
Pages marked as isolated should not be allocated again. If such pages reside in pcp list, they can be allocated too, so there is a ping-pong memory offline frees some pages to pcp list and the pages get allocated and then memory offline frees them again, this loop will happen again and again. This should have no impact in normal code path, because in normal code path, pages in pcp list aren't isolated, and below loop will break in the first entry. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Yakui Zhao <yakui.zhao@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-