• M
    mm: introduce MADV_PAGEOUT · 23757dcc
    Minchan Kim 提交于
    commit 1a4e58cce84ee88129d5d49c064bd2852b481357 upstream
    
    When a process expects no accesses to a certain memory range for a long
    time, it could hint kernel that the pages can be reclaimed instantly but
    data should be preserved for future use.  This could reduce workingset
    eviction so it ends up increasing performance.
    
    This patch introduces the new MADV_PAGEOUT hint to madvise(2) syscall.
    MADV_PAGEOUT can be used by a process to mark a memory range as not
    expected to be used for a long time so that kernel reclaims *any LRU*
    pages instantly.  The hint can help kernel in deciding which pages to
    evict proactively.
    
    A note: It doesn't apply SWAP_CLUSTER_MAX LRU page isolation limit
    intentionally because it's automatically bounded by PMD size.  If PMD
    size(e.g., 256) makes some trouble, we could fix it later by limit it to
    SWAP_CLUSTER_MAX[1].
    
    - man-page material
    
    MADV_PAGEOUT (since Linux x.x)
    
    Do not expect access in the near future so pages in the specified
    regions could be reclaimed instantly regardless of memory pressure.
    Thus, access in the range after successful operation could cause
    major page fault but never lose the up-to-date contents unlike
    MADV_DONTNEED. Pages belonging to a shared mapping are only processed
    if a write access is allowed for the calling process.
    
    MADV_PAGEOUT cannot be applied to locked pages, Huge TLB pages, or
    VM_PFNMAP pages.
    
    [1] https://lore.kernel.org/lkml/20190710194719.GS29695@dhcp22.suse.cz/
    
    [minchan@kernel.org: clear PG_active on MADV_PAGEOUT]
      Link: http://lkml.kernel.org/r/20190802200643.GA181880@google.com
    [akpm@linux-foundation.org: resolve conflicts with hmm.git]
    Link: http://lkml.kernel.org/r/20190726023435.214162-5-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
    Reported-by: Nkbuild test robot <lkp@intel.com>
    Acked-by: NMichal Hocko <mhocko@suse.com>
    Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Richard Henderson <rth@twiddle.net>
    Cc: Ralf Baechle <ralf@linux-mips.org>
    Cc: Chris Zankel <chris@zankel.net>
    Cc: Daniel Colascione <dancol@google.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Oleksandr Natalenko <oleksandr@redhat.com>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Sonny Rao <sonnyrao@google.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Tim Murray <timmurray@google.com>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com>
    Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>
    23757dcc
vmscan.c 124.0 KB