Commit aa88b68c authored by Kirill A. Shutemov, committed by Linus Torvalds

thp: keep huge zero page pinned until tlb flush

Andrea has found[1] a race condition between the MMU-gather based TLB flush
and split_huge_page() or the shrinker freeing the huge zero page under us
(patches 1/2 and 2/2 respectively).

With the new THP refcounting, we don't need patch 1/2: mmu_gather keeps the
page pinned until the flush is complete, and the pin prevents the page from
being split under us.

We still need patch 2/2. This is a simplified version of Andrea's patch:
we don't need any fancy encoding.

[1] http://lkml.kernel.org/r/1447938052-22165-1-git-send-email-aarcange@redhat.com

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Parent 66ee95d1
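To illustrate the mechanism the commit message describes, here is a minimal
user-space sketch of the deferred-put pattern the patch relies on. It is a toy
model, not kernel code: the struct and function names mirror their kernel
counterparts, but the types and signatures are simplified, and the actual TLB
flush is only a comment.

#include <assert.h>
#include <stdio.h>

/* Toy stand-ins for the kernel structures involved. */
struct page { int refcount; };

static struct page huge_zero_page = { .refcount = 1 }; /* base reference */

/* mmu_gather analogue: remembers pages whose put is deferred to flush time */
struct mmu_gather {
	struct page *deferred[16];
	int nr;
};

static void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
{
	/* keep the mapping's pin; only record that it must be dropped later */
	tlb->deferred[tlb->nr++] = page;
}

static void tlb_finish_mmu(struct mmu_gather *tlb)
{
	/* ...the real TLB flush would happen here, before any pin is dropped... */
	for (int i = 0; i < tlb->nr; i++)
		tlb->deferred[i]->refcount--;	/* put_huge_zero_page() analogue */
	tlb->nr = 0;
}

int main(void)
{
	struct mmu_gather tlb = { .nr = 0 };

	huge_zero_page.refcount++;		/* a mapping took a reference */
	tlb_remove_page(&tlb, &huge_zero_page);	/* zap: defer the put */

	/*
	 * Between zap and flush the refcount is still elevated, so a
	 * shrinker running here sees the page as pinned and cannot free
	 * it while stale TLB entries may still point at it.
	 */
	assert(huge_zero_page.refcount == 2);

	tlb_finish_mmu(&tlb);			/* flush first, then unpin */
	assert(huge_zero_page.refcount == 1);
	printf("zero page survived until the flush\n");
	return 0;
}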
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -152,6 +152,7 @@ static inline bool is_huge_zero_pmd(pmd_t pmd)
 }
 
 struct page *get_huge_zero_page(void);
+void put_huge_zero_page(void);
 
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 #define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
@@ -208,6 +209,10 @@ static inline bool is_huge_zero_page(struct page *page)
 	return false;
 }
 
+static inline void put_huge_zero_page(void)
+{
+	BUILD_BUG();
+}
+
 static inline struct page *follow_devmap_pmd(struct vm_area_struct *vma,
 		unsigned long addr, pmd_t *pmd, int flags)
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -232,7 +232,7 @@ struct page *get_huge_zero_page(void)
 	return READ_ONCE(huge_zero_page);
 }
 
-static void put_huge_zero_page(void)
+void put_huge_zero_page(void)
 {
 	/*
 	 * Counter should never go to zero here. Only shrinker can put
@@ -1684,12 +1684,12 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	if (vma_is_dax(vma)) {
 		spin_unlock(ptl);
 		if (is_huge_zero_pmd(orig_pmd))
-			put_huge_zero_page();
+			tlb_remove_page(tlb, pmd_page(orig_pmd));
 	} else if (is_huge_zero_pmd(orig_pmd)) {
 		pte_free(tlb->mm, pgtable_trans_huge_withdraw(tlb->mm, pmd));
 		atomic_long_dec(&tlb->mm->nr_ptes);
 		spin_unlock(ptl);
-		put_huge_zero_page();
+		tlb_remove_page(tlb, pmd_page(orig_pmd));
 	} else {
 		struct page *page = pmd_page(orig_pmd);
 		page_remove_rmap(page, true);
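Note how the two changed call sites hand the zero page to the mmu_gather
instead of dropping the reference immediately: tlb_remove_page() batches the
page, and the batch is only freed after the TLB flush, reaching release_pages()
(via free_pages_and_swap_cache() in this era of the kernel). That is why
release_pages(), in the final hunk below, must learn to recognize the huge
zero page and drop its dedicated reference counter rather than treat it as an
ordinary compound page.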
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -728,6 +728,11 @@ void release_pages(struct page **pages, int nr, bool cold)
 			zone = NULL;
 		}
 
+		if (is_huge_zero_page(page)) {
+			put_huge_zero_page();
+			continue;
+		}
+
 		page = compound_head(page);
 		if (!put_page_testzero(page))
 			continue;