    alinux: mm: Support kidled · f55ac551
    Committed by Gavin Shan
    This enables scanning pages at a fixed interval to determine their access
    frequency (hot/cold). The result is exported to user space per memory
    cgroup through "memory.idle_page_stats". The design is outlined
    below:
    
       * A kernel thread is spawned when this feature is enabled by writing
         a non-zero value to "/sys/kernel/mm/kidled/scan_period_in_seconds".
         The thread sequentially scans the nodes and the pages that are
         linked on their LRU lists.
    
       * For each page, its age information is stored in the page flags or
         in a per-node array. The age is the number of scan intervals in
         which the page wasn't accessed. The page flag (PG_idle) is also
         leveraged: the page's age is increased by one if the idle flag
         stays set across two consecutive scans. Otherwise, the page's age
         is reset to zero. The age information is also cleared when the
         page is freed, so that stale age information isn't picked up when
         the page is allocated again.
    
       * Initially, the thread sets the idle flag on the page and clears the
         access bit in its PTE. In the next scanning period, the PTE access
         bit is synchronized with the page flag: the flag is cleared if the
         access bit is set, and kept otherwise. For unmapped pages, the flag
         is cleared when the page is accessed.
    
       * Eventually, the page's age information is accounted to the unstable
         bucket of its memory cgroup as statistics. The unstable bucket is
         copied to the stable bucket once all pages in all nodes have been
         scanned. The stable bucket is exported to user space through
         "memory.idle_page_stats".
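
    The scan/age/bucket flow described above can be sketched as a small
    user-space model. This is only an illustration, not the kernel code:
    the names Page, scan_page and KidledModel are hypothetical, the PTE
    access bit and PG_idle are modelled as plain booleans, and the
    statistics are keyed by exact age rather than by the bucket ranges
    the kernel exports.

```python
from dataclasses import dataclass


@dataclass
class Page:
    cgroup: str
    size: int = 4096
    accessed: bool = False  # stands in for the PTE access bit
    idle: bool = False      # stands in for PG_idle
    age: int = 0            # scan intervals without access


def scan_page(page):
    """Apply one scan to a page and return its updated age."""
    # Synchronize the access bit into the idle flag.
    if page.accessed:
        page.idle = False
    # The flag survived since the previous scan: the page got older.
    if page.idle:
        page.age += 1
    else:
        page.age = 0
    # Re-arm for the next period: set the flag, clear the access bit.
    page.idle = True
    page.accessed = False
    return page.age


def free_page(page):
    """Clear the age state so a later allocation starts fresh."""
    page.idle = False
    page.age = 0


class KidledModel:
    def __init__(self):
        self.unstable = {}  # filled while a scan round is in progress
        self.stable = {}    # snapshot of the last completed round

    def scan_round(self, pages):
        self.unstable = {}
        for page in pages:
            age = scan_page(page)
            per_cgroup = self.unstable.setdefault(page.cgroup, {})
            per_cgroup[age] = per_cgroup.get(age, 0) + page.size
        # All pages were scanned once: publish the statistics.
        self.stable = {cg: dict(h) for cg, h in self.unstable.items()}


pages = [Page("test") for _ in range(4)]
k = KidledModel()
for _ in range(3):
    pages[0].accessed = True  # page 0 is touched in every period
    k.scan_round(pages)
print(k.stable)  # {'test': {0: 4096, 2: 12288}}
```

    In this model the page touched in every period stays at age 0, while
    the three untouched pages reach age 2 after three rounds, because the
    first round only arms the idle flag.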
    
    TESTING
    =======
    
       * cgroup1, unmapped pagecache
    
         # dd if=/dev/zero of=/ext4/test.data oflag=direct bs=1M count=128
         #
         # echo 1 > /sys/kernel/mm/kidled/use_hierarchy
         # echo 15 > /sys/kernel/mm/kidled/scan_period_in_seconds
         # mkdir -p /cgroup/memory
         # mount -tcgroup -o memory none /cgroup/memory
         # echo 1 > /cgroup/memory/memory.use_hierarchy
         # mkdir -p /cgroup/memory/test
         # echo 1 > /cgroup/memory/test/memory.use_hierarchy
         #
         # echo $$ > /cgroup/memory/test/cgroup.procs
         # dd if=/ext4/test.data of=/dev/null bs=1M count=128
         # < wait a few minutes >
         # cat /cgroup/memory/test/memory.idle_page_stats | grep cfei
           cfei   0   0   0   134217728   0   0   0   0
         # cat /cgroup/memory/memory.idle_page_stats | grep cfei
           cfei   0   0   0   134217728   0   0   0   0
    
       * cgroup1, mapped pagecache
    
         # < create same file and memory cgroups as above >
         #
         # echo $$ > /cgroup/memory/test/cgroup.procs
         # < run program to mmap the whole created file and access the area >
         # < wait a few minutes >
         # cat /cgroup/memory/test/memory.idle_page_stats | grep cfei
           cfei   0   134217728   0   0   0   0   0   0
         # cat /cgroup/memory/memory.idle_page_stats | grep cfei
           cfei   0   134217728   0   0   0   0   0   0
    
       * cgroup1, mapped and locked pagecache
    
         # < create same file and memory cgroups as above >
         #
         # echo $$ > /cgroup/memory/test/cgroup.procs
         # < run program to mmap the whole created file and mlock the area >
         # < wait a few minutes >
         # cat /cgroup/memory/test/memory.idle_page_stats | grep cfui
           cfui   0   134217728   0   0   0   0   0   0
         # cat /cgroup/memory/memory.idle_page_stats | grep cfui
           cfui   0   134217728   0   0   0   0   0   0
    
       * cgroup1, anonymous and locked area
    
         # < create memory cgroups as above >
         #
         # echo $$ > /cgroup/memory/test/cgroup.procs
         # < run program to mmap anonymous area and mlock it >
         # < wait a few minutes >
         # cat /cgroup/memory/test/memory.idle_page_stats | grep csui
           csui   0   0   134217728   0   0   0   0   0
         # cat /cgroup/memory/memory.idle_page_stats | grep csui
           csui   0   0   134217728   0   0   0   0   0
    
       * Rerunning the above test cases in cgroup2 gives the same results.
         However, the cgroups are set up differently, as below:
    
         # mkdir -p /cgroup
         # mount -tcgroup2 none /cgroup
         # echo "+memory" > /cgroup/cgroup.subtree_control
         # mkdir -p /cgroup/test
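
    The "< run program ... >" placeholders in the test cases above do not
    name the programs that were used. The following is one possible
    stand-in, written in Python under that assumption: map_and_touch(),
    lock_anonymous() and parse_stats_row() are hypothetical helper names,
    and mlock() is reached through ctypes because the Python mmap module
    has no locking call of its own.

```python
import ctypes
import mmap
import os

PAGE = mmap.PAGESIZE


def map_and_touch(path):
    """Map the whole file read-only and read one byte from every page."""
    with open(path, "rb") as f:
        area = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # The mapping stays valid after the file object is closed.
    for offset in range(0, len(area), PAGE):
        area[offset]  # a read is enough to fault the page in
    return area  # keep it open for as long as the pages should stay mapped


def lock_anonymous(length):
    """Create a private anonymous mapping and mlock() it."""
    area = mmap.mmap(-1, length,
                     flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     prot=mmap.PROT_READ | mmap.PROT_WRITE)
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int
    buf = ctypes.c_char.from_buffer(area)  # only used to get the address
    try:
        rc = libc.mlock(ctypes.addressof(buf), length)
    finally:
        del buf  # release the buffer export so the mapping can be closed
    if rc != 0:
        err = ctypes.get_errno()
        area.close()
        raise OSError(err, os.strerror(err))
    return area


def parse_stats_row(line):
    """Split one row of memory.idle_page_stats into (tag, byte counts)."""
    fields = line.split()
    return fields[0], [int(value) for value in fields[1:]]


# The row shown in the first test case accounts for the whole 128 MiB file.
tag, counts = parse_stats_row("cfei   0   0   0   134217728   0   0   0   0")
assert tag == "cfei" and sum(counts) == 128 * 1024 * 1024
```

    In a real run the returned mapping would be kept open during the
    "wait a few minutes" step, for example by sleeping before exit.
    lock_anonymous() may fail with an OSError when the requested length
    exceeds RLIMIT_MEMLOCK.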

    Signed-off-by: Gavin Shan <shan.gavin@linux.alibaba.com>
    Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
    Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>