    mm: don't put pinned pages into the swap cache · 8d757a4b
    Linus Torvalds committed
    stable inclusion
    from stable-5.10.9
    commit 72c5ce89427feb277ac6f998a6ec27b820863fb5
    bugzilla: 47457
    
    --------------------------------
    
    [ Upstream commit feb889fb ]
    
    So technically there is nothing wrong with adding a pinned page to the
    swap cache, but the pinning obviously means that the page can't actually
    be free'd right now anyway, so it's a bit pointless.
    
    However, the real problem is not with it being a bit pointless: the real
    issue is that after we've added it to the swap cache, we'll try to unmap
    the page.  That will succeed, because the code in mm/rmap.c doesn't know
    or care about pinned pages.
    
    Even the unmapping isn't fatal per se, since the page will stay around
    in memory due to the pinning, and we do hold the connection to it using
    the swap cache.  But when we then touch it next and take a page fault,
    the logic in do_swap_page() will map it back into the process as a
    possibly read-only page, and we'll then break the page association on
    the next COW fault.
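    
    The chain of events above can be sketched as a toy userspace model.
    This is not kernel code: the toy_* names are made up for illustration,
    a "page" is just a small buffer, and the swap-in step models only the
    read-only case of the "possibly read-only" mapping described above.

```c
#include <stdbool.h>
#include <string.h>

/* Toy model, not kernel code: a "page" is just a small buffer. */
struct toy_page {
	char data[16];
};

/* Toy process mapping: which page the PTE points at, and its write bit. */
struct toy_mapping {
	struct toy_page *page;	/* NULL while unmapped */
	bool writable;
};

/* Reclaim unmaps the page; the returned pointer stands in for the
 * swap cache entry that still remembers it. */
static struct toy_page *toy_unmap(struct toy_mapping *m)
{
	struct toy_page *cached = m->page;

	m->page = NULL;
	m->writable = false;
	return cached;
}

/* Swap-in fault: map the cached page back, but read-only. */
static void toy_swap_in(struct toy_mapping *m, struct toy_page *cached)
{
	m->page = cached;
	m->writable = false;
}

/* Write fault on a read-only mapping: copy into 'fresh' and map that. */
static void toy_cow_fault(struct toy_mapping *m, struct toy_page *fresh)
{
	memcpy(fresh, m->page, sizeof(*fresh));
	m->page = fresh;
	m->writable = true;
}

/* True while the process mapping and the pin holder use the same page. */
static bool toy_still_shared(const struct toy_mapping *m,
			     const struct toy_page *pinned)
{
	return m->page == pinned;
}
```

    Calling toy_unmap(), toy_swap_in() and then toy_cow_fault() on a
    mapping whose page is also held by a pinner leaves toy_still_shared()
    false: the process now writes to the copy while the pin holder still
    refers to the original page.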
    
    Honestly, this issue could have been fixed in any of those other places:
    (a) we could refuse to unmap a pinned page (which makes conceptual
    sense), or (b) we could make sure to re-map a pinned page writably in
    do_swap_page(), or (c) we could just make do_wp_page() not COW the
    pinned page (which was what we historically did before that "mm:
    do_wp_page() simplification" commit).
    
    But while all of them are equally valid models for breaking this chain,
    not putting pinned pages into the swap cache in the first place is the
    simplest one by far.
    
    It's also the safest one: the reason why do_wp_page() was changed in the
    first place was that getting the "can I re-use this page" wrong is so
    fraught with errors.  If you do it wrong, you end up with an incorrectly
    shared page.
    
    As a result, using "page_maybe_dma_pinned()" in either do_wp_page() or
    do_swap_page() would be a serious bug since it is only a (very good)
    heuristic.  Re-using the page requires a hard black-and-white rule with
    no room for ambiguity.
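    
    To illustrate why the check is only a heuristic, here is a minimal
    userspace sketch.  It assumes the small-page scheme in which a pin adds
    a large bias (GUP_PIN_COUNTING_BIAS, 1024) to the ordinary reference
    count and the check simply compares the count against that bias.  The
    toy_* names and TOY_PIN_BIAS are hypothetical stand-ins, not kernel
    symbols.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's GUP_PIN_COUNTING_BIAS (1 << 10). */
#define TOY_PIN_BIAS 1024u

/* An ordinary reference bumps the count by one. */
static unsigned int toy_get(unsigned int refcount)
{
	return refcount + 1;
}

/* A pin bumps the same count by the bias. */
static unsigned int toy_pin(unsigned int refcount)
{
	return refcount + TOY_PIN_BIAS;
}

/* Heuristic: a count at or above the bias is treated as "maybe pinned".
 * Enough ordinary references produce the same answer as one pin. */
static bool toy_maybe_pinned(unsigned int refcount)
{
	return refcount >= TOY_PIN_BIAS;
}
```

    In this model a page with one reference and one pin is reported as
    pinned, but so is an unpinned page that has accumulated 1024 ordinary
    references.  That false positive is why the result cannot serve as a
    black-and-white rule for re-using a page.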
    
    In contrast, saying "this page is very likely dma pinned, so let's not
    add it to the swap cache and try to unmap it" is an obviously safe thing
    to do, and if the heuristic might very rarely be a false positive, no
    harm is done.
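    
    The decision described above can be sketched as follows.  This is a
    hypothetical userspace model of the reclaim step, not the actual
    change: the struct, the enum and toy_reclaim_step() are invented for
    illustration, and the exact placement of the check in the reclaim path
    is an assumption.

```c
#include <stdbool.h>

/* Hypothetical summary of the page state that the reclaim step looks at. */
struct toy_reclaim_page {
	bool anon_swap_backed;	/* anonymous, swap-backed page */
	bool in_swap_cache;	/* already has a swap cache entry */
	bool maybe_dma_pinned;	/* result of the pin heuristic */
};

enum toy_action {
	TOY_KEEP,		/* leave the page mapped and in memory */
	TOY_ADD_TO_SWAP,	/* add to the swap cache, then try to unmap */
	TOY_CONTINUE		/* nothing to decide here, carry on */
};

/* Decide what to do with one page during reclaim. */
static enum toy_action toy_reclaim_step(const struct toy_reclaim_page *p,
					bool may_swap)
{
	if (p->anon_swap_backed && !p->in_swap_cache) {
		if (!may_swap)
			return TOY_KEEP;
		/* Likely pinned: it could not be freed anyway, so skip it. */
		if (p->maybe_dma_pinned)
			return TOY_KEEP;
		return TOY_ADD_TO_SWAP;
	}
	return TOY_CONTINUE;
}
```

    A false positive from the heuristic only makes this function return
    TOY_KEEP for a page that could have been swapped out, which costs a
    missed reclaim opportunity and nothing else.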
    
    Fixes: 09854ba9 ("mm: do_wp_page() simplification")
    Reported-and-tested-by: Martin Raiber <martin@urbackup.org>
    Cc: Pavel Begunkov <asml.silence@gmail.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Peter Xu <peterx@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Chen Jun <chenjun102@huawei.com>
    Acked-by: Xie XiuQi <xiexiuqi@huawei.com>