    alinux: memcg: Introduce memory.wmark_min_adj · 60be0f54
    Committed by Xunlei Pang
    In a co-location environment there is usually some degree of memory
    overcommitment, so BATCH tasks may break the shared global min
    watermark. All types of applications then fall into the direct
    reclaim slow path, which hurts the RT of LS tasks.
    (NOTE: BATCH tasks tolerate large latency spikes, even of seconds,
    as long as their overall throughput is not hurt. LS tasks are
    Latency-Sensitive: they may time out or fail when a sudden latency
    spike lasts, typically, hundreds of ms.)
    
    BATCH tasks are not sensitive to memory latency, so they can be
    assigned a strict min watermark, different from that of LS tasks
    (which can accordingly be assigned a lenient min watermark). This
    isolates the two classes from each other during global memory
    allocation. It is similar to the idea behind ALLOC_HARDER for
    rt_task(), see gfp_to_alloc_flags().
    
    memory.wmark_min_adj stands for memcg global WMARK_MIN adjustment.
    It implements the separate per-memcg min watermarks described
    above. Its valid range is [-25, 50], specifically:
    a negative value is relative to [0, WMARK_MIN],
    a positive value is relative to [WMARK_MIN, WMARK_LOW].
    For example:
      -25 means "WMARK_MIN + (WMARK_MIN - 0) * (-25%)"
       50 means "WMARK_MIN + (WMARK_LOW - WMARK_MIN) * 50%"
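
    The two formulas above can be sketched in plain C as follows. This
    is only an illustration of the arithmetic, not the code added by
    this commit: the helper name wmark_min_adjusted() and the use of
    integer percent division are assumptions.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the kernel implementation): return the
 * min watermark, in pages, that results from applying an adjustment
 * "adj" in [-25, 50] to the global WMARK_MIN / WMARK_LOW values.
 */
static unsigned long wmark_min_adjusted(unsigned long wmark_min,
                                        unsigned long wmark_low,
                                        int adj)
{
    assert(adj >= -25 && adj <= 50);
    assert(wmark_low >= wmark_min);

    if (adj < 0) {
        /* Negative: move down inside [0, WMARK_MIN]. */
        return wmark_min - wmark_min * (unsigned long)(-adj) / 100;
    }

    /* Zero or positive: move up inside [WMARK_MIN, WMARK_LOW]. */
    return wmark_min + (wmark_low - wmark_min) * (unsigned long)adj / 100;
}
```

    With WMARK_MIN = 1000 pages and WMARK_LOW = 1250 pages, this sketch
    gives 750 pages for -25 and 1125 pages for 50.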
    
    Note that the minimum, -25, is what ALLOC_HARDER uses, so it is safe
    for us to adopt; the maximum, 50, is an empirical value.
    
    A negative memory.wmark_min_adj means high QoS requirements: the
    memcg can allocate below the global WMARK_MIN, which is similar to
    the idea behind ALLOC_HARDER, see gfp_to_alloc_flags().
    
    A positive memory.wmark_min_adj means low QoS requirements. When an
    allocation breaks the memcg min watermark it would traditionally
    trigger direct reclaim; we throttle the task instead, to further
    prevent it from disturbing others.
    
    With this interface, we can assign positive values for BATCH memcgs
    and negative values for LS memcgs.
    
    The default value of memory.wmark_min_adj is 0, and a memcg inherits
    the value from its parent. Note that the final effective
    wmark_min_adj takes all hierarchical values into account: it is the
    maximal (most conservative) wmark_min_adj along the hierarchy,
    excluding intermediate default values (zero).
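
    One possible reading of the hierarchy rule above, as a small C
    sketch. struct cg_node and effective_wmark_min_adj() are
    hypothetical names used only for illustration; the sketch assumes
    that every zero on the path to the root is skipped and that the
    result is 0 when no level sets a value.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memcg: its own setting and its parent. */
struct cg_node {
    int wmark_min_adj;              /* 0 means "default, not set" */
    const struct cg_node *parent;   /* NULL at the root */
};

/*
 * Walk from the given node up to the root and return the largest
 * non-zero wmark_min_adj seen, or 0 if every level uses the default.
 */
static int effective_wmark_min_adj(const struct cg_node *cg)
{
    int found = 0;
    int eff = 0;

    for (; cg != NULL; cg = cg->parent) {
        if (cg->wmark_min_adj == 0)
            continue;               /* skip default values */
        if (!found || cg->wmark_min_adj > eff) {
            eff = cg->wmark_min_adj;
            found = 1;
        }
    }
    return eff;
}
```

    Under this reading, a leaf set to -25 below a parent set to 20 ends
    up with an effective value of 20, while a leaf left at 0 below a
    parent set to -25 ends up with -25.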
    Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
    Reviewed-by: Gavin Shan <shan.gavin@linux.alibaba.com>
    Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
sched.h 53.7 KB