• H
    mm/thp: try_to_unmap() use TTU_SYNC for safe splitting · af64f69b
    Hugh Dickins 提交于
    stable inclusion
    from linux-4.19.197
    commit 205899d6be9a1d5494318920a211852a32886e46
    
    --------------------------------
    
    [ Upstream commit 732ed558 ]
    
    Stressing huge tmpfs often crashed on unmap_page()'s VM_BUG_ON_PAGE
    (!unmap_success): with dump_page() showing mapcount:1, but then its raw
    struct page output showing _mapcount ffffffff i.e.  mapcount 0.
    
    And even if that particular VM_BUG_ON_PAGE(!unmap_success) is removed,
    it is immediately followed by a VM_BUG_ON_PAGE(compound_mapcount(head)),
    and further down an IS_ENABLED(CONFIG_DEBUG_VM) total_mapcount BUG():
    all indicative of some mapcount difficulty in development here perhaps.
    But the !CONFIG_DEBUG_VM path handles the failures correctly and
    silently.
    
    I believe the problem is that once a racing unmap has cleared pte or
    pmd, try_to_unmap_one() may skip taking the page table lock, and emerge
    from try_to_unmap() before the racing task has reached decrementing
    mapcount.
    
    Instead of abandoning the unsafe VM_BUG_ON_PAGE(), and the ones that
    follow, use PVMW_SYNC in try_to_unmap_one() in this case: adding
    TTU_SYNC to the options, and passing that from unmap_page().
    
    When CONFIG_DEBUG_VM, or for non-debug too? Consensus is to do the same
    for both: the slight overhead added should rarely matter, except perhaps
    if splitting sparsely-populated multiply-mapped shmem.  Once confident
    that bugs are fixed, TTU_SYNC here can be removed, and the race
    tolerated.
    
    Link: https://lkml.kernel.org/r/c1e95853-8bcd-d8fd-55fa-e7f2488e78f@google.com
    Fixes: fec89c10 ("thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers")
    Signed-off-by: NHugh Dickins <hughd@google.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Jue Wang <juew@google.com>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Ralph Campbell <rcampbell@nvidia.com>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Wang Yugui <wangyugui@e16-tech.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    
    Note on stable backport: upstream TTU_SYNC 0x10 takes the value which
    5.11 commit 013339df ("mm/rmap: always do TTU_IGNORE_ACCESS") freed.
    It is very tempting to backport that commit (as 5.10 already did) and
    make no change here; but on reflection, good as that commit is, I'm
    reluctant to include any possible side-effect of it in this series.
    Signed-off-by: NHugh Dickins <hughd@google.com>
    Signed-off-by: NSasha Levin <sashal@kernel.org>
    Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
    af64f69b
page_vma_mapped.c 8.0 KB