• M
    mm: pagemap: avoid unnecessary overhead when tracepoints are deactivated · 24b7e581
    Mel Gorman 提交于
    This was formerly the series "Improve sequential read throughput" which
    noted some major differences in performance of tiobench since 3.0.
    While there are a number of factors, two that dominated were the
    introduction of the fair zone allocation policy and changes to CFQ.
    
    The behaviour of fair zone allocation policy makes more sense than
    tiobench as a benchmark and CFQ defaults were not changed due to
    insufficient benchmarking.
    
    This series is what's left.  It's one functional fix to the fair zone
    allocation policy when used on NUMA machines and a reduction of overhead
    in general.  tiobench was used for the comparison despite its flaws as
    an IO benchmark as in this case we are primarily interested in the
    overhead of page allocator and page reclaim activity.
    
    On UMA, it makes little difference to overhead
    
              3.16.0-rc3   3.16.0-rc3
                 vanilla lowercost-v5
    User          383.61      386.77
    System        403.83      401.74
    Elapsed      5411.50     5413.11
    
    On a 4-socket NUMA machine it's a bit more noticable
    
              3.16.0-rc3   3.16.0-rc3
                 vanilla lowercost-v5
    User          746.94      802.00
    System      65336.22    40852.33
    Elapsed     27553.52    27368.46
    
    This patch (of 6):
    
    The LRU insertion and activate tracepoints take PFN as a parameter
    forcing the overhead to the caller.  Move the overhead to the tracepoint
    fast-assign method to ensure the cost is only incurred when the
    tracepoint is active.
    Signed-off-by: NMel Gorman <mgorman@suse.de>
    Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    24b7e581
swap.c 30.6 KB