• K
    vmscan: kill prev_priority completely · 25edde03
    KOSAKI Motohiro 提交于
    Since 2.6.28 zone->prev_priority is unused. Then it can be removed
    safely. It reduce stack usage slightly.
    
    Now I have to say that I'm sorry. 2 years ago, I thought prev_priority
    can be integrate again, it's useful. but four (or more) times trying
    haven't got good performance number. Thus I give up such approach.
    
    The rest of this changelog is notes on prev_priority and why it existed in
    the first place and why it might be not necessary any more. This information
    is based heavily on discussions between Andrew Morton, Rik van Riel and
    Kosaki Motohiro who is heavily quotes from.
    
    Historically prev_priority was important because it determined when the VM
    would start unmapping PTE pages. i.e. there are no balances of note within
    the VM, Anon vs File and Mapped vs Unmapped. Without prev_priority, there
    is a potential risk of unnecessarily increasing minor faults as a large
    amount of read activity of use-once pages could push mapped pages to the
    end of the LRU and get unmapped.
    
    There is no proof this is still a problem but currently it is not considered
    to be. Active files are not deactivated if the active file list is smaller
    than the inactive list reducing the liklihood that file-mapped pages are
    being pushed off the LRU and referenced executable pages are kept on the
    active list to avoid them getting pushed out by read activity.
    
    Even if it is a problem, prev_priority prev_priority wouldn't works
    nowadays. First of all, current vmscan still a lot of UP centric code. it
    expose some weakness on some dozens CPUs machine. I think we need more and
    more improvement.
    
    The problem is, current vmscan mix up per-system-pressure, per-zone-pressure
    and per-task-pressure a bit. example, prev_priority try to boost priority to
    other concurrent priority. but if the another task have mempolicy restriction,
    it is unnecessary, but also makes wrong big latency and exceeding reclaim.
    per-task based priority + prev_priority adjustment make the emulation of
    per-system pressure. but it have two issue 1) too rough and brutal emulation
    2) we need per-zone pressure, not per-system.
    
    Another example, currently DEF_PRIORITY is 12. it mean the lru rotate about
    2 cycle (1/4096 + 1/2048 + 1/1024 + .. + 1) before invoking OOM-Killer.
    but if 10,0000 thrreads enter DEF_PRIORITY reclaim at the same time, the
    system have higher memory pressure than priority==0 (1/4096*10,000 > 2).
    prev_priority can't solve such multithreads workload issue. In other word,
    prev_priority concept assume the sysmtem don't have lots threads."
    Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: NMel Gorman <mel@csn.ul.ie>
    Reviewed-by: NJohannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: NRik van Riel <riel@redhat.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Chris Mason <chris.mason@oracle.com>
    Cc: Nick Piggin <npiggin@suse.de>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Michael Rubin <mrubin@google.com>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    25edde03
vmstat.c 28.4 KB