- 13 1月, 2012 3 次提交
-
-
由 Mel Gorman 提交于
This patch adds a lightweight sync migrate operation MIGRATE_SYNC_LIGHT mode that avoids writing back pages to backing storage. Async compaction maps to MIGRATE_ASYNC while sync compaction maps to MIGRATE_SYNC_LIGHT. For other migrate_pages users such as memory hotplug, MIGRATE_SYNC is used. This avoids sync compaction stalling for an excessive length of time, particularly when copying files to a USB stick where there might be a large number of dirty pages backed by a filesystem that does not support ->writepages. [aarcange@redhat.com: This patch is heavily based on Andrea's work] [akpm@linux-foundation.org: fix fs/nfs/write.c build] [akpm@linux-foundation.org: fix fs/btrfs/disk-io.c build] Signed-off-by: NMel Gorman <mgorman@suse.de> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Asynchronous compaction is used when allocating transparent hugepages to avoid blocking for long periods of time. Due to reports of stalling, there was a debate on disabling synchronous compaction but this severely impacted allocation success rates. Part of the reason was that many dirty pages are skipped in asynchronous compaction by the following check; if (PageDirty(page) && !sync && mapping->a_ops->migratepage != migrate_page) rc = -EBUSY; This skips over all mapping aops using buffer_migrate_page() even though it is possible to migrate some of these pages without blocking. This patch updates the ->migratepage callback with a "sync" parameter. It is the responsibility of the callback to fail gracefully if migration would block. Signed-off-by: NMel Gorman <mgorman@suse.de> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
This is a preparation before removing a flag PCG_ACCT_LRU in page_cgroup and reducing atomic ops/complexity in memcg LRU handling. In some cases, pages are added to lru before charge to memcg and pages are not classfied to memory cgroup at lru addtion. Now, the lru where the page should be added is determined a bit in page_cgroup->flags and pc->mem_cgroup. I'd like to remove the check of flag. To handle the case pc->mem_cgroup may contain stale pointers if pages are added to LRU before classification. This patch resets pc->mem_cgroup to root_mem_cgroup before lru additions. [akpm@linux-foundation.org: fix CONFIG_CGROUP_MEM_CONT=n build] [hughd@google.com: fix CONFIG_CGROUP_MEM_RES_CTLR=y CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n build] [akpm@linux-foundation.org: ksm.c needs memcontrol.h, per Michal] [hughd@google.com: stop oops in mem_cgroup_reset_owner()] [hughd@google.com: fix page migration to reset_owner] Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Miklos Szeredi <mszeredi@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Ying Han <yinghan@google.com> Signed-off-by: NHugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 1月, 2012 3 次提交
-
-
由 Wang Sheng-Hui 提交于
lru_to_page is not used in mm/migrate.c. Signed-off-by: NWang Sheng-Hui <shhuiw@gmail.com> Acked-by: NMel Gorman <mgorman@suse.de> Acked-by: NKyungmin Park <kyungmin.park@samsung.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wang Sheng-Hui 提交于
migration_entry_wait() can also be called from hugetlb_fault() now. Remove the incorrect comment. Signed-off-by: NWang Sheng-Hui <shhuiw@gmail.com> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jacobo Giralt 提交于
migrate_page_move_mapping() drops a reference from the old page after unfreezing its counter. Both operations can be merged into a single atomic operation by directly unfreezing to one less reference. The same applies to migrate_huge_page_move_mapping(). Signed-off-by: NJacobo Giralt <jacobo.giralt@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 12月, 2011 1 次提交
-
-
由 Hillf Danton 提交于
Avoid unlocking and unlocked page if we failed to lock it. Signed-off-by: NHillf Danton <dhillf@gmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 11月, 2011 1 次提交
-
-
由 Minchan Kim 提交于
unmap_and_move() is one a big messy function. Clean it up. Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 10月, 2011 1 次提交
-
-
由 Paul Gortmaker 提交于
The files changed within are only using the EXPORT_SYMBOL macro variants. They are not using core modular infrastructure and hence don't need module.h but only the export.h header. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 20 10月, 2011 1 次提交
-
-
由 Hugh Dickins 提交于
I don't usually pay much attention to the stale "? " addresses in stack backtraces, but this lucky report from Pawel Sikora hints that mremap's move_ptes() has inadequate locking against page migration. 3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page(): kernel BUG at include/linux/swapops.h:105! RIP: 0010:[<ffffffff81127b76>] [<ffffffff81127b76>] migration_entry_wait+0x156/0x160 [<ffffffff811016a1>] handle_pte_fault+0xae1/0xaf0 [<ffffffff810feee2>] ? __pte_alloc+0x42/0x120 [<ffffffff8112c26b>] ? do_huge_pmd_anonymous_page+0xab/0x310 [<ffffffff81102a31>] handle_mm_fault+0x181/0x310 [<ffffffff81106097>] ? vma_adjust+0x537/0x570 [<ffffffff81424bed>] do_page_fault+0x11d/0x4e0 [<ffffffff81109a05>] ? do_mremap+0x2d5/0x570 [<ffffffff81421d5f>] page_fault+0x1f/0x30 mremap's down_write of mmap_sem, together with i_mmap_mutex or lock, and pagetable locks, were good enough before page migration (with its requirement that every migration entry be found) came in, and enough while migration always held mmap_sem; but not enough nowadays, when there's memory hotremove and compaction. The danger is that move_ptes() lets a migration entry dodge around behind remove_migration_pte()'s back, so it's in the old location when looking at the new, then in the new location when looking at the old. Either mremap's move_ptes() must additionally take anon_vma lock(), or migration's remove_migration_pte() must stop peeking for is_swap_entry() before it takes pagetable lock. Consensus chooses the latter: we prefer to add overhead to migration than to mremapping, which gets used by JVMs and by exec stack setup. Reported-and-tested-by: NPaweł Sikora <pluto@agmk.net> Signed-off-by: NHugh Dickins <hughd@google.com> Acked-by: NAndrea Arcangeli <aarcange@redhat.com> Acked-by: NMel Gorman <mgorman@suse.de> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 6月, 2011 1 次提交
-
-
由 Andrea Arcangeli 提交于
swapcache will reach the below code path in migrate_page_move_mapping, and swapcache is accounted as NR_FILE_PAGES but it's not accounted as NR_SHMEM. Hugh pointed out we must use PageSwapCache instead of comparing mapping to &swapper_space, to avoid build failure with CONFIG_SWAP=n. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Acked-by: NHugh Dickins <hughd@google.com> Cc: stable@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 5月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
Convert page_lock_anon_vma() over to use refcounts. This is done to prepare for the conversion of anon_vma from spinlock to mutex. Sadly this inceases the cost of page_lock_anon_vma() from one to two atomics, a follow up patch addresses this, lets keep that simple for now. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: NHugh Dickins <hughd@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Miller <davem@davemloft.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Tony Luck <tony.luck@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Namhyung Kim <namhyung@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 24 3月, 2011 1 次提交
-
-
由 KAMEZAWA Hiroyuki 提交于
Remove initialization of vaiable in caller of memory cgroup function. Actually, it's return value of memcg function but it's initialized in caller. Some memory cgroup uses following style to bring the result of start function to the end function for avoiding races. mem_cgroup_start_A(&(*ptr)) /* Something very complicated can happen here. */ mem_cgroup_end_A(*ptr) In some calls, *ptr should be initialized to NULL be caller. But it's ugly. This patch fixes that *ptr is initialized by _start function. Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 3月, 2011 3 次提交
-
-
由 Andrea Arcangeli 提交于
__GFP_NO_KSWAPD allocations are usually very expensive and not mandatory to succeed as they have graceful fallback. Waiting for I/O in those, tends to be overkill in terms of latencies, so we can reduce their latency by disabling sync migrate. Unfortunately, even with async migration it's still possible for the process to be blocked waiting for a request slot (e.g. get_request_wait in the block layer) when ->writepage is called. To prevent __GFP_NO_KSWAPD blocking, this patch prevents ->writepage being called on dirty page cache for asynchronous migration. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=31142 [mel@csn.ul.ie: Avoid writebacks for NFS, retry locked pages, use bool] Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: Arthur Marsh <arthur.marsh@internode.on.net> Cc: Clemens Ladisch <cladisch@googlemail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Reported-by: NAlex Villacis Lasso <avillaci@ceibo.fiec.espol.edu.ec> Tested-by: NAlex Villacis Lasso <avillaci@ceibo.fiec.espol.edu.ec> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Peter Zijlstra 提交于
The normal code pattern used in the kernel is: get/put. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NHugh Dickins <hughd@google.com> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
This function basically does: remove_from_page_cache(old); page_cache_release(old); add_to_page_cache_locked(new); Except it does this atomically, so there's no possibility for the "add" to fail because of a race. If memory cgroups are enabled, then the memory cgroup charge is also moved from the old page to the new. This function is currently used by fuse to move pages into the page cache on read, instead of copying the page contents. [minchan.kim@gmail.com: add freepage() hook to replace_page_cache_page()] Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Acked-by: NRik van Riel <riel@redhat.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 2月, 2011 1 次提交
-
-
由 Greg Thelen 提交于
The move_pages() usage of find_task_by_vpid() requires rcu_read_lock() to prevent free_pid() from reclaiming the pid. Without this patch, RCU warnings are printed in v2.6.38-rc4 move_pages() with: CONFIG_LOCKUP_DETECTOR=y CONFIG_PREEMPT=y CONFIG_LOCKDEP=y CONFIG_PROVE_LOCKING=y CONFIG_PROVE_RCU=y Previously, migrate_pages() went through a similar transformation replacing usage of tasklist_lock with rcu read lock: commit 55cfaa3c Author: Zeng Zhaoming <zengzm.kernel@gmail.com> Date: Thu Dec 2 14:31:13 2010 -0800 mm/mempolicy.c: add rcu read lock to protect pid structure commit 1e50df39 Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Date: Thu Jan 13 15:46:14 2011 -0800 mempolicy: remove tasklist_lock from migrate_pages Signed-off-by: NGreg Thelen <gthelen@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Zeng Zhaoming <zengzm.kernel@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 2月, 2011 2 次提交
-
-
由 Minchan Kim 提交于
If migrate_huge_page by memory-failure fails , it calls put_page in itself to decrease page reference and caller of migrate_huge_page also calls putback_lru_pages. It can do double free of page so it can make page corruption on page holder. In addtion, clean of pages on caller is consistent behavior with migrate_pages by cf608ac1 ("mm: compaction: fix COMPACTPAGEFAILED counting"). Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Arcangeli 提交于
In some cases migrate_pages could return zero while still leaving a few pages in the pagelist (and some caller wouldn't notice it has to call putback_lru_pages after commit cf608ac1 ("mm: compaction: fix COMPACTPAGEFAILED counting")). Add one missing putback_lru_pages not added by commit cf608ac1 ("mm: compaction: fix COMPACTPAGEFAILED counting"). Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 1月, 2011 1 次提交
-
-
由 Minchan Kim 提交于
Callers of migrate_pages should putback_lru_pages to return pages isolated to LRU or free list. Now comment is rather confusing. It says caller always have to call it. It is more clear to point out that the caller has to call it if migrate_pages's return value isn't zero. Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 1月, 2011 8 次提交
-
-
由 Daisuke Nishimura 提交于
In the current implementation mem_cgroup_end_migration() decides whether the page migration has succeeded or not by checking "oldpage->mapping". But if we are tring to migrate a shmem swapcache, the page->mapping of it is NULL from the begining, so the check would be invalid. As a result, mem_cgroup_end_migration() assumes the migration has succeeded even if it's not, so "newpage" would be freed while it's not uncharged. This patch fixes it by passing mem_cgroup_end_migration() the result of the page migration. Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
2.6.37 added an unmap_and_move_huge_page() for memory failure recovery, but its anon_vma handling was still based around the 2.6.35 conventions. Update it to use page_lock_anon_vma, get_anon_vma, page_unlock_anon_vma, drop_anon_vma in the same way as we're now changing unmap_and_move(). I don't particularly like to propose this for stable when I've not seen its problems in practice nor tested the solution: but it's clearly out of synch at present. Signed-off-by: NHugh Dickins <hughd@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: <stable@kernel.org> [2.6.37, 2.6.36] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
Increased usage of page migration in mmotm reveals that the anon_vma locking in unmap_and_move() has been deficient since 2.6.36 (or even earlier). Review at the time of f1819427 ("mm: fix hang on anon_vma->root->lock") missed the issue here: the anon_vma to which we get a reference may already have been freed back to its slab (it is in use when we check page_mapped, but that can change), and so its anon_vma->root may be switched at any moment by reuse in anon_vma_prepare. Perhaps we could fix that with a get_anon_vma_unless_zero(), but let's not: just rely on page_lock_anon_vma() to do all the hard thinking for us, then we don't need any rcu read locking over here. In removing the rcu_unlock label: since PageAnon is a bit in page->mapping, it's impossible for a !page->mapping page to be anon; but insert VM_BUG_ON in case the implementation ever changes. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NHugh Dickins <hughd@google.com> Reviewed-by: NMel Gorman <mel@csn.ul.ie> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: <stable@kernel.org> [2.6.37, 2.6.36] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
mm: migration: use rcu_dereference_protected when dereferencing the radix tree slot during file page migration migrate_pages() -> unmap_and_move() only calls rcu_read_lock() for anonymous pages, as introduced by git commit 989f89c5 ("fix rcu_read_lock() in page migraton"). The point of the RCU protection there is part of getting a stable reference to anon_vma and is only held for anon pages as file pages are locked which is sufficient protection against freeing. However, while a file page's mapping is being migrated, the radix tree is double checked to ensure it is the expected page. This uses radix_tree_deref_slot() -> rcu_dereference() without the RCU lock held triggering the following warning. [ 173.674290] =================================================== [ 173.676016] [ INFO: suspicious rcu_dereference_check() usage. ] [ 173.676016] --------------------------------------------------- [ 173.676016] include/linux/radix-tree.h:145 invoked rcu_dereference_check() without protection! [ 173.676016] [ 173.676016] other info that might help us debug this: [ 173.676016] [ 173.676016] [ 173.676016] rcu_scheduler_active = 1, debug_locks = 0 [ 173.676016] 1 lock held by hugeadm/2899: [ 173.676016] #0: (&(&inode->i_data.tree_lock)->rlock){..-.-.}, at: [<c10e3d2b>] migrate_page_move_mapping+0x40/0x1ab [ 173.676016] [ 173.676016] stack backtrace: [ 173.676016] Pid: 2899, comm: hugeadm Not tainted 2.6.37-rc5-autobuild [ 173.676016] Call Trace: [ 173.676016] [<c128cc01>] ? printk+0x14/0x1b [ 173.676016] [<c1063502>] lockdep_rcu_dereference+0x7d/0x86 [ 173.676016] [<c10e3db5>] migrate_page_move_mapping+0xca/0x1ab [ 173.676016] [<c10e41ad>] migrate_page+0x23/0x39 [ 173.676016] [<c10e491b>] buffer_migrate_page+0x22/0x107 [ 173.676016] [<c10e48f9>] ? buffer_migrate_page+0x0/0x107 [ 173.676016] [<c10e425d>] move_to_new_page+0x9a/0x1ae [ 173.676016] [<c10e47e6>] migrate_pages+0x1e7/0x2fa This patch introduces radix_tree_deref_slot_protected() which calls rcu_dereference_protected(). Users of it must pass in the mapping->tree_lock that is protecting this dereference. Holding the tree lock protects against parallel updaters of the radix tree meaning that rcu_dereference_protected is allowable. [akpm@linux-foundation.org: remove unneeded casts] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Milton Miller <miltonm@bga.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: <stable@kernel.org> [2.6.37.early] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Arcangeli 提交于
No pmd_trans_huge should ever materialize in migration ptes areas, because we split the hugepage before migration ptes are instantiated. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Acked-by: NRik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
With the introduction of the boolean sync parameter, the API looks a little inconsistent as offlining is still an int. Convert offlining to a bool for the sake of being tidy. Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
mm: migration: allow migration to operate asynchronously and avoid synchronous compaction in the faster path Migration synchronously waits for writeback if the initial passes fails. Callers of memory compaction do not necessarily want this behaviour if the caller is latency sensitive or expects that synchronous migration is not going to have a significantly better success rate. This patch adds a sync parameter to migrate_pages() allowing the caller to indicate if wait_on_page_writeback() is allowed within migration or not. For reclaim/compaction, try_to_compact_pages() is first called asynchronously, direct reclaim runs and then try_to_compact_pages() is called synchronously as there is a greater expectation that it'll succeed. [akpm@linux-foundation.org: build/merge fix] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Lumpy reclaim is disruptive. It reclaims a large number of pages and ignores the age of the pages it reclaims. This can incur significant stalls and potentially increase the number of major faults. Compaction has reached the point where it is considered reasonably stable (meaning it has passed a lot of testing) and is a potential candidate for displacing lumpy reclaim. This patch introduces an alternative to lumpy reclaim whe compaction is available called reclaim/compaction. The basic operation is very simple - instead of selecting a contiguous range of pages to reclaim, a number of order-0 pages are reclaimed and then compaction is later by either kswapd (compact_zone_order()) or direct compaction (__alloc_pages_direct_compact()). [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: use conventional task_struct naming] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 12月, 2010 1 次提交
-
-
由 Michal Nazarewicz 提交于
GCC complained about update_mmu_cache() not being defined in migrate.c. Including <asm/tlbflush.h> seems to solve the problem. Signed-off-by: NMichal Nazarewicz <m.nazarewicz@samsung.com> Signed-off-by: NKyungmin Park <kyungmin.park@samsung.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 10月, 2010 3 次提交
-
-
由 Gleb Natapov 提交于
The vma returned by find_vma does not necessarily include the target address. If this happens the code tries to follow a page outside of any vma and returns ENOENT instead of EFAULT. Signed-off-by: NGleb Natapov <gleb@redhat.com> Acked-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
Presently update_nr_listpages() doesn't have a role. That's because lists passed is always empty just after calling migrate_pages. The migrate_pages cleans up page list which have failed to migrate before returning by aaa994b3. [PATCH] page migration: handle freeing of pages in migrate_pages() Do not leave pages on the lists passed to migrate_pages(). Seems that we will not need any postprocessing of pages. This will simplify the handling of pages by the callers of migrate_pages(). At that time, we thought we don't need any postprocessing of pages. But the situation is changed. The compaction need to know the number of failed to migrate for COMPACTPAGEFAILED stat This patch makes new rule for caller of migrate_pages to call putback_lru_pages. So caller need to clean up the lists so it has a chance to postprocess the pages. [suggested by Christoph Lameter] Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andi Kleen <andi@firstfloor.org> Reviewed-by: NMel Gorman <mel@csn.ul.ie> Reviewed-by: NWu Fengguang <fengguang.wu@intel.com> Acked-by: NChristoph Lameter <cl@linux.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wu Fengguang 提交于
This removes more dead code that was somehow missed by commit 0d99519e (writeback: remove unused nonblocking and congestion checks). There are no behavior change except for the removal of two entries from one of the ext4 tracing interface. The nonblocking checks in ->writepages are no longer used because the flusher now prefer to block on get_request_wait() than to skip inodes on IO congestion. The latter will lead to more seeky IO. The nonblocking checks in ->writepage are no longer used because it's redundant with the WB_SYNC_NONE check. We no long set ->nonblocking in VM page out and page migration, because a) it's effectively redundant with WB_SYNC_NONE in current code b) it's old semantic of "Don't get stuck on request queues" is mis-behavior: that would skip some dirty inodes on congestion and page out others, which is unfair in terms of LRU age. Inspired by Christoph Hellwig. Thanks! Signed-off-by: NWu Fengguang <fengguang.wu@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: David Howells <dhowells@redhat.com> Cc: Sage Weil <sage@newdream.net> Cc: Steve French <sfrench@samba.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 10月, 2010 1 次提交
-
-
由 Andi Kleen 提交于
31bit s390 doesn't have huge pages and failed with: > mm/migrate.c: In function 'remove_migration_pte': > mm/migrate.c:143:3: error: implicit declaration of function 'pte_mkhuge' > mm/migrate.c:143:7: error: incompatible types when assigning to type 'pte_t' from type 'int' Put that code into a ifdef. Reported by Heiko Carstens Signed-off-by: NAndi Kleen <ak@linux.intel.com>
-
- 08 10月, 2010 1 次提交
-
-
由 Naoya Horiguchi 提交于
This patch extends page migration code to support hugepage migration. One of the potential users of this feature is soft offlining which is triggered by memory corrected errors (added by the next patch.) Todo: - there are other users of page migration such as memory policy, memory hotplug and memocy compaction. They are not ready for hugepage support for now. ChangeLog since v4: - define migrate_huge_pages() - remove changes on isolation/putback_lru_page() ChangeLog since v2: - refactor isolate/putback_lru_page() to handle hugepage - add comment about race on unmap_and_move_huge_page() ChangeLog since v1: - divide migration code path for hugepage - define routine checking migration swap entry for hugetlb - replace "goto" with "if/else" in remove_migration_pte() Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NAndi Kleen <ak@linux.intel.com>
-
- 10 8月, 2010 3 次提交
-
-
由 Rik van Riel 提交于
KSM reference counts can cause an anon_vma to exist after the processe it belongs to have already exited. Because the anon_vma lock now lives in the root anon_vma, we need to ensure that the root anon_vma stays around until after all the "child" anon_vmas have been freed. The obvious way to do this is to have a "child" anon_vma take a reference to the root in anon_vma_fork. When the anon_vma is freed at munmap or process exit, we drop the refcount in anon_vma_unlink and possibly free the root anon_vma. The KSM anon_vma reference count function also needs to be modified to deal with the possibility of freeing 2 levels of anon_vma. The easiest way to do this is to break out the KSM magic and make it generic. When compiling without CONFIG_KSM, this code is compiled out. Signed-off-by: NRik van Riel <riel@redhat.com> Tested-by: NLarry Woodman <lwoodman@redhat.com> Acked-by: NLarry Woodman <lwoodman@redhat.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Tested-by: NDave Young <hidave.darkstar@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rik van Riel 提交于
Always (and only) lock the root (oldest) anon_vma whenever we do something in an anon_vma. The recently introduced anon_vma scalability is due to the rmap code scanning only the VMAs that need to be scanned. Many common operations still took the anon_vma lock on the root anon_vma, so always taking that lock is not expected to introduce any scalability issues. However, always taking the same lock does mean we only need to take one lock, which means rmap_walk on pages from any anon_vma in the vma is excluded from occurring during an munmap, expand_stack or other operation that needs to exclude rmap_walk and similar functions. Also add the proper locking to vma_adjust. Signed-off-by: NRik van Riel <riel@redhat.com> Tested-by: NLarry Woodman <lwoodman@redhat.com> Acked-by: NLarry Woodman <lwoodman@redhat.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rik van Riel 提交于
Subsitute a direct call of spin_lock(anon_vma->lock) with an inline function doing exactly the same. This makes it easier to do the substitution to the root anon_vma lock in a following patch. We will deal with the handful of special locks (nested, dec_and_lock, etc) separately. Signed-off-by: NRik van Riel <riel@redhat.com> Acked-by: NMel Gorman <mel@csn.ul.ie> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Tested-by: NLarry Woodman <lwoodman@redhat.com> Acked-by: NLarry Woodman <lwoodman@redhat.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 5月, 2010 1 次提交
-
-
FILE_MAPPED per memcg of migrated file cache is not properly updated, because our hook in page_add_file_rmap() can't know to which memcg FILE_MAPPED should be counted. Basically, this patch is for fixing the bug but includes some big changes to fix up other messes. Now, at migrating mapped file, events happen in following sequence. 1. allocate a new page. 2. get memcg of an old page. 3. charge ageinst a new page before migration. But at this point, no changes to new page's page_cgroup, no commit for the charge. (IOW, PCG_USED bit is not set.) 4. page migration replaces radix-tree, old-page and new-page. 5. page migration remaps the new page if the old page was mapped. 6. Here, the new page is unlocked. 7. memcg commits the charge for newpage, Mark the new page's page_cgroup as PCG_USED. Because "commit" happens after page-remap, we can count FILE_MAPPED at "5", because we should avoid to trust page_cgroup->mem_cgroup. if PCG_USED bit is unset. (Note: memcg's LRU removal code does that but LRU-isolation logic is used for helping it. When we overwrite page_cgroup->mem_cgroup, page_cgroup is not on LRU or page_cgroup->mem_cgroup is NULL.) We can lose file_mapped accounting information at 5 because FILE_MAPPED is updated only when mapcount changes 0->1. So we should catch it. BTW, historically, above implemntation comes from migration-failure of anonymous page. Because we charge both of old page and new page with mapcount=0, we can't catch - the page is really freed before remap. - migration fails but it's freed before remap or .....corner cases. New migration sequence with memcg is: 1. allocate a new page. 2. mark PageCgroupMigration to the old page. 3. charge against a new page onto the old page's memcg. (here, new page's pc is marked as PageCgroupUsed.) 4. page migration replaces radix-tree, page table, etc... 5. At remapping, new page's page_cgroup is now makrked as "USED" We can catch 0->1 event and FILE_MAPPED will be properly updated. And we can catch SWAPOUT event after unlock this and freeing this page by unmap() can be caught. 7. Clear PageCgroupMigration of the old page. So, FILE_MAPPED will be correctly updated. Then, for what MIGRATION flag is ? Without it, at migration failure, we may have to charge old page again because it may be fully unmapped. "charge" means that we have to dive into memory reclaim or something complated. So, it's better to avoid charge it again. Before this patch, __commit_charge() was working for both of the old/new page and fixed up all. But this technique has some racy condtion around FILE_MAPPED and SWAPOUT etc... Now, the kernel use MIGRATION flag and don't uncharge old page until the end of migration. I hope this change will make memcg's page migration much simpler. This page migration has caused several troubles. Worth to add a flag for simplification. Reviewed-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Tested-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Reported-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 5月, 2010 1 次提交
-
-
由 Mel Gorman 提交于
This patch is the core of a mechanism which compacts memory in a zone by relocating movable pages towards the end of the zone. A single compaction run involves a migration scanner and a free scanner. Both scanners operate on pageblock-sized areas in the zone. The migration scanner starts at the bottom of the zone and searches for all movable pages within each area, isolating them onto a private list called migratelist. The free scanner starts at the top of the zone and searches for suitable areas and consumes the free pages within making them available for the migration scanner. The pages isolated for migration are then migrated to the newly isolated free pages. [aarcange@redhat.com: Fix unsafe optimisation] [mel@csn.ul.ie: do not schedule work on other CPUs for compaction] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Acked-by: NRik van Riel <riel@redhat.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-