• M
    mm: page_alloc: avoid marking zones full prematurely after zone_reclaim() · fed2719e
    Mel Gorman 提交于
    The following problem was reported against a distribution kernel when
    zone_reclaim was enabled but the same problem applies to the mainline
    kernel.  The reproduction case was as follows
    
    1. Run numactl -m +0 dd if=largefile of=/dev/null
       This allocates a large number of clean pages in node 0
    
    2. numactl -N +0 memhog 0.5*Mg
       This start a memory-using application in node 0.
    
    The expected behaviour is that the clean pages get reclaimed and the
    application uses node 0 for its memory.  The observed behaviour was that
    the memory for the memhog application was allocated off-node since
    commits cd38b115 ("mm: page allocator: initialise ZLC for first zone
    eligible for zone_reclaim") and commit 76d3fbf8 ("mm: page
    allocator: reconsider zones for allocation after direct reclaim").
    
    The assumption of those patches was that it was always preferable to
    allocate quickly than stall for long periods of time and they were meant
    to take care that the zone was only marked full when necessary but an
    important case was missed.
    
    In the allocator fast path, only the low watermarks are checked.  If the
    zones free pages are between the low and min watermark then allocations
    from the allocators slow path will succeed.  However, zone_reclaim will
    only reclaim SWAP_CLUSTER_MAX or 1<<order pages.  There is no guarantee
    that this will meet the low watermark causing the zone to be marked full
    prematurely.
    
    This patch will only mark the zone full after zone_reclaim if it the min
    watermarks are checked or if page reclaim failed to make sufficient
    progress.
    
    [mhocko@suse.cz: fix alloc_flags test]
    Signed-off-by: NMel Gorman <mgorman@suse.de>
    Reported-by: NHedi Berriche <hedi@sgi.com>
    Tested-by: NHedi Berriche <hedi@sgi.com>
    Reviewed-by: NMichal Hocko <mhocko@suse.cz>
    Reviewed-by: NWanpeng Li <liwanp@linux.vnet.ibm.com>
    Signed-off-by: NMichal Hocko <mhocko@suse.cz>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    fed2719e
page_alloc.c 171.9 KB