• H
    x86,mm: fix pte_special versus pte_numa · b38af472
    Hugh Dickins 提交于
    Sasha Levin has shown oopses on ffffea0003480048 and ffffea0003480008 at
    mm/memory.c:1132, running Trinity on different 3.16-rc-next kernels:
    where zap_pte_range() checks page->mapping to see if PageAnon(page).
    
    Those addresses fit struct pages for pfns d2001 and d2000, and in each
    dump a register or a stack slot showed d2001730 or d2000730: pte flags
    0x730 are PCD ACCESSED PROTNONE SPECIAL IOMAP; and Sasha's e820 map has
    a hole between cfffffff and 100000000, which would need special access.
    
    Commit c46a7c81 ("x86: define _PAGE_NUMA by reusing software bits on
    the PMD and PTE levels") has broken vm_normal_page(): a PROTNONE SPECIAL
    pte no longer passes the pte_special() test, so zap_pte_range() goes on
    to try to access a non-existent struct page.
    
    Fix this by refining pte_special() (SPECIAL with PRESENT or PROTNONE) to
    complement pte_numa() (SPECIAL with neither PRESENT nor PROTNONE).  A
    hint that this was a problem was that c46a7c81 added pte_numa() test
    to vm_normal_page(), and moved its is_zero_pfn() test from slow to fast
    path: This was papering over a pte_special() snag when the zero page was
    encountered during zap.  This patch reverts vm_normal_page() to how it
    was before, relying on pte_special().
    
    It still appears that this patch may be incomplete: aren't there other
    places which need to be handling PROTNONE along with PRESENT?  For
    example, pte_mknuma() clears _PAGE_PRESENT and sets _PAGE_NUMA, but on a
    PROT_NONE area, that would make it pte_special().  This is side-stepped
    by the fact that NUMA hinting faults skipped PROT_NONE VMAs and there
    are no grounds where a NUMA hinting fault on a PROT_NONE VMA would be
    interesting.
    
    Fixes: c46a7c81 ("x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels")
    Reported-by: NSasha Levin <sasha.levin@oracle.com>
    Tested-by: NSasha Levin <sasha.levin@oracle.com>
    Signed-off-by: NHugh Dickins <hughd@google.com>
    Signed-off-by: NMel Gorman <mgorman@suse.de>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Cyrill Gorcunov <gorcunov@gmail.com>
    Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
    Cc: <stable@vger.kernel.org>	[3.16]
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    b38af472
pgtable.h 21.5 KB