    writeback: max, min and target dirty pause time · 7ccb9ad5
    Committed by Wu Fengguang
    Control the pause time and the call intervals to balance_dirty_pages()
    with three parameters:
    
    1) max_pause, limited by bdi_dirty and MAX_PAUSE
    
    2) the target pause time, grows with the number of dd tasks
       and is normally limited by max_pause/2
    
    3) the minimal pause, set to half the target pause and used to skip
       short sleeps and accumulate them into bigger ones
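
    As a rough illustration of how the three parameters relate, here is a
    userspace sketch (not the code in page-writeback.c). The 200ms value for
    MAX_PAUSE_MS, the millisecond units, and the names derive_pauses(),
    bdi_limit_ms and wanted_ms are assumptions made for the example; the
    message itself only says that max_pause is limited by bdi_dirty and
    MAX_PAUSE.

```c
/* Assumed cap in milliseconds; the message only names MAX_PAUSE. */
#define MAX_PAUSE_MS 200

struct pause_params {
	long max_pause;    /* upper bound on a single sleep */
	long target_pause; /* the pause we aim for */
	long min_pause;    /* shorter sleeps are skipped and accumulated */
};

static long min_long(long a, long b)
{
	return a < b ? a : b;
}

/*
 * bdi_limit_ms: hypothetical bound derived from bdi_dirty
 * wanted_ms:    hypothetical target pause before any limiting
 */
struct pause_params derive_pauses(long bdi_limit_ms, long wanted_ms)
{
	struct pause_params p;

	/* 1) max_pause is limited by the bdi-derived bound and MAX_PAUSE */
	p.max_pause = min_long(bdi_limit_ms, MAX_PAUSE_MS);
	/* 2) the target pause is normally limited by max_pause / 2 */
	p.target_pause = min_long(wanted_ms, p.max_pause / 2);
	/* 3) the minimal pause is half the target pause */
	p.min_pause = p.target_pause / 2;
	return p;
}
```

    With these assumptions, derive_pauses(1000, 30) yields 200/30/15 ms,
    while a tight bdi bound such as derive_pauses(40, 30) yields 40/20/10 ms.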
    
    The typical behaviors after this patch:
    
    - if task_ratelimit is ever far below dirty_ratelimit, the pause time
      will stay constant at max_pause and nr_dirtied_pause will fluctuate
      with task_ratelimit
    
    - in the normal cases, nr_dirtied_pause will remain stable (keeping
      pace with dirty_ratelimit) and the pause time will fluctuate with
      task_ratelimit
    
    In summary, one of the two has to fluctuate with task_ratelimit, because
    
    	task_ratelimit = nr_dirtied_pause / pause
    
    We normally prefer a stable nr_dirtied_pause, until reaching max_pause.
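
    The trade-off can be sketched in userspace C as follows. This is only an
    illustration of the relation above, not the kernel implementation: the
    name plan_pause(), the pages-per-second and millisecond units, and the
    clamping to at least one page are assumptions of the example.

```c
struct pause_plan {
	long pages;    /* nr_dirtied_pause: pages to dirty before sleeping */
	long pause_ms; /* how long to sleep afterwards */
};

/*
 * Rates are in pages per second and must be positive.
 * task_ratelimit = nr_dirtied_pause / pause, so one side has to move.
 */
struct pause_plan plan_pause(long dirty_ratelimit, long task_ratelimit,
			     long target_pause_ms, long max_pause_ms)
{
	struct pause_plan plan;

	/* Normal case: a stable page count that tracks dirty_ratelimit... */
	plan.pages = dirty_ratelimit * target_pause_ms / 1000;
	if (plan.pages < 1)
		plan.pages = 1;
	/* ...and a pause that fluctuates with task_ratelimit. */
	plan.pause_ms = plan.pages * 1000 / task_ratelimit;

	if (plan.pause_ms > max_pause_ms) {
		/*
		 * task_ratelimit is far below dirty_ratelimit: pin the pause
		 * at max_pause and let the page count fluctuate instead.
		 */
		plan.pause_ms = max_pause_ms;
		plan.pages = task_ratelimit * max_pause_ms / 1000;
		if (plan.pages < 1)
			plan.pages = 1;
	}
	return plan;
}
```

    For example, with dirty_ratelimit = 10000, a 20ms target and a 200ms
    max_pause, halving task_ratelimit from 10000 to 5000 keeps pages at 200
    and doubles the pause from 20ms to 40ms; dropping task_ratelimit to 500
    pins the pause at 200ms and shrinks pages to 100.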
    
    The notable behavior changes are:
    
    - in stable workloads, nr_dirtied_pause will no longer suddenly switch
      between widely separated trajectories, which was Peter's concern. It
      will be as smooth as dirty_ratelimit and change proportionally with
      it (as always, assuming bdi bandwidth does not fluctuate across 2^N
      lines, otherwise nr_dirtied_pause will show up in 2+ parallel
      trajectories)
    
    - in the rare cases when something keeps task_ratelimit far below
      dirty_ratelimit, the smoothness can no longer be retained and
      nr_dirtied_pause will be "dancing" with task_ratelimit. This fixes a
      bug (not that destructive, but still not good) in which
    	  dirty_ratelimit gets brought down undesirably
    	  <= balanced_dirty_ratelimit is underestimated
    	  <= weakly executed task_ratelimit
    	  <= pause goes too large and gets trimmed down to max_pause
    	  <= nr_dirtied_pause (based on dirty_ratelimit) is set too large
    	  <= dirty_ratelimit being much larger than task_ratelimit
    
    - introduce min_pause to avoid small pause sleeps
    
    - when pause is trimmed down to max_pause, try to compensate it at the
      next pause time
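
    The last two points (skipping short sleeps and compensating for a trimmed
    pause) can be pictured with the userspace sketch below. It is a simplified
    model, not the logic in balance_dirty_pages(): struct dirtier, carry_ms
    and settle_pause() are hypothetical names, and capping the carried-over
    excess at one max_pause is an assumption of the example.

```c
struct dirtier {
	long carry_ms; /* sleep time owed from skipped or trimmed pauses */
};

/* Returns how many milliseconds to sleep now; 0 means "skip this sleep". */
long settle_pause(struct dirtier *d, long pause_ms,
		  long min_pause_ms, long max_pause_ms)
{
	long want = d->carry_ms + pause_ms;

	if (want < min_pause_ms) {
		/* Too short to be worth a sleep: accumulate it instead. */
		d->carry_ms = want;
		return 0;
	}
	if (want > max_pause_ms) {
		/* Trim to max_pause and try to make up for it next time. */
		long excess = want - max_pause_ms;

		d->carry_ms = excess < max_pause_ms ? excess : max_pause_ms;
		return max_pause_ms;
	}
	d->carry_ms = 0;
	return want;
}
```

    With min_pause = 10ms and max_pause = 200ms, three consecutive 4ms pauses
    result in sleeps of 0, 0 and 12ms; a 260ms pause is trimmed to 200ms and
    the remaining 60ms is added to the following pause.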
    
    The "refactor" type of changes are:
    
    The max_pause equation is transformed slightly to make it a bit more
    efficient.
    
    We now scale target_pause by (N * 10ms) on 2^N concurrent tasks, which
    is effectively equal to the original scaling of max_pause by (N * 20ms),
    because the original code implicitly used target_pause ~= max_pause / 2.
    Based on the same implicit ratio, target_pause starts at 10ms for 1 dd.
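
    The scaling can be worked through with a small userspace sketch. The
    helper names are hypothetical, times are in milliseconds rather than
    jiffies, the task count is assumed to be at least 1, and the max_pause/2
    cap on the target is deliberately left out.

```c
/* floor(log2(n)) for n >= 1 */
static int ilog2_floor(unsigned long n)
{
	int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* 10ms for a single dd, plus N * 10ms on 2^N concurrent tasks */
long target_pause_ms(unsigned long nr_tasks)
{
	return 10 + 10L * ilog2_floor(nr_tasks);
}

/* The max_pause implied by the old target_pause ~= max_pause / 2 ratio */
long implied_max_pause_ms(unsigned long nr_tasks)
{
	return 2 * target_pause_ms(nr_tasks);
}
```

    Under these assumptions 1, 2 and 8 tasks give target pauses of 10ms, 20ms
    and 40ms, and each doubling of the task count adds 20ms to the implied
    max_pause, matching the original (N * 20ms) scaling.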
    
    CC: Jan Kara <jack@suse.cz>
    CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
page-writeback.c 63.6 KB