    mm: do batched scans for mem_cgroup · f8629631
    Committed by Wu Fengguang
    For mem_cgroup, shrink_zone() may call shrink_list() with nr_to_scan=1, in
    which case shrink_list() _still_ calls isolate_pages() with the much
    larger SWAP_CLUSTER_MAX.  It effectively scales up the inactive list scan
    rate by up to 32 times.
    
    For example, with 16k inactive pages and DEF_PRIORITY=12, (16k >> 12)=4.
    So when shrink_zone() expects to scan 4 pages in each of the active and
    inactive lists, 4 pages of the active list are scanned, while the inactive
    list is in effect (over) scanned by SWAP_CLUSTER_MAX=32 pages.  That can
    break the balance between the two lists.
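
The arithmetic in the example above can be checked with a small user-space
sketch. It is not kernel code: scan_target() and inactive_scanned_unbatched()
are hypothetical helpers, and the rounding rule assumes the inactive list is
always isolated in whole SWAP_CLUSTER_MAX chunks, as described above.

```c
#include <stdio.h>

#define DEF_PRIORITY     12     /* default reclaim priority, as in the message */
#define SWAP_CLUSTER_MAX 32UL   /* isolation chunk size, as in the message */

/* Pages shrink_zone() intends to scan on a list at the given priority. */
static unsigned long scan_target(unsigned long nr_pages, int priority)
{
    return nr_pages >> priority;
}

/*
 * Pages effectively scanned on the inactive list when every request is
 * served in whole SWAP_CLUSTER_MAX chunks (assumed model of the unbatched
 * behaviour): the target is rounded up to the next chunk multiple.
 */
static unsigned long inactive_scanned_unbatched(unsigned long target)
{
    if (target == 0)
        return 0;
    return ((target + SWAP_CLUSTER_MAX - 1) / SWAP_CLUSTER_MAX)
           * SWAP_CLUSTER_MAX;
}

/* Print intended versus effective scan counts for one list size. */
static void report(unsigned long nr_pages)
{
    unsigned long target = scan_target(nr_pages, DEF_PRIORITY);

    printf("intended=%lu effective=%lu\n",
           target, inactive_scanned_unbatched(target));
}
```

Calling report(16 * 1024) shows an intended scan of 4 pages against an
effective scan of 32 pages under this model.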
    
    It can further impact the scan of anon active list, due to the anon
    active/inactive ratio rebalance logic in balance_pgdat()/shrink_zone():
    
    inactive anon list over scanned => inactive_anon_is_low() == TRUE
                                    => shrink_active_list()
                                    => active anon list over scanned
    
    So the end result may be
    
    - anon inactive  => over scanned
    - anon active    => over scanned (maybe not as much)
    - file inactive  => over scanned
    - file active    => under scanned (relatively)
    
    The accesses to nr_saved_scan are not lock protected and so are not 100%
    accurate; however, we can tolerate small errors and the resulting slightly
    imbalanced scan rates between zones.
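
One way to picture batching with a saved-scan counter is the sketch below.
It is a user-space illustration under assumptions, not the patch itself:
try_batch_scan() is a hypothetical name, and the only element taken from the
message is the idea of an nr_saved_scan counter that accumulates small scan
requests until a full batch is available.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Accumulate small scan requests in *nr_saved_scan and release them only
 * once at least `batch` pages are pending. Returns the number of pages to
 * scan now: 0 while the batch is still filling, or the full accumulated
 * count once it is ready (the counter is then reset to 0).
 *
 * As in the message, the counter is updated without locking; a concurrent
 * caller could lose or duplicate a small amount of pending scan work.
 */
static unsigned long try_batch_scan(unsigned long nr_to_scan,
                                    unsigned long *nr_saved_scan,
                                    unsigned long batch)
{
    unsigned long nr;

    assert(nr_saved_scan != NULL && batch > 0);

    *nr_saved_scan += nr_to_scan;
    nr = *nr_saved_scan;

    if (nr >= batch)
        *nr_saved_scan = 0;   /* batch is full: hand it out and start over */
    else
        nr = 0;               /* not enough yet: defer the scan */

    return nr;
}
```

With requests of 4 pages and a batch of 32, the first seven calls return 0
and the eighth returns 32, so the long-run scan rate matches what was
requested instead of being rounded up on every call.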
    
    Cc: Rik van Riel <riel@redhat.com>
    Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
    Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
    Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>