    timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC · 4396e058
    Thomas Gleixner committed
    Tracers want a correlated time between the kernel instrumentation and
    user space. We really do not want to export sched_clock() to user
    space, so we need to provide something sensible for this.
    
    Using separate data structures with a non-blocking, sequence-count
    based update mechanism allows us to do that. The data structure
    required for the readout has a sequence counter and two copies of the
    timekeeping data.
    
    On the update side:
    
      smp_wmb();
      tkf->seq++;
      smp_wmb();
      update(tkf->base[0], tk);
      smp_wmb();
      tkf->seq++;
      smp_wmb();
      update(tkf->base[1], tk);
    
    On the reader side:
    
      do {
         seq = tkf->seq;
         smp_rmb();
         idx = seq & 0x01;
         now = now(tkf->base[idx]);
         smp_rmb();
      } while (seq != tkf->seq);
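
    The update and reader sequences above can be sketched in user space
    with C11 atomics. This is only an illustration, not the kernel code:
    struct tk_base, struct tk_fast, tkf_update() and tkf_read() are
    hypothetical names, the release/acquire fences merely approximate
    smp_wmb()/smp_rmb(), and a single updater is assumed. A strictly
    conforming C11 version would also access the copied fields atomically.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one copy of the timekeeping data. */
struct tk_base {
    uint64_t base_mono;   /* monotonic time at the last update */
    uint64_t cycle_last;  /* clock counter value at the last update */
    uint32_t mult;        /* slope numerator */
    uint32_t shift;       /* slope = mult / 2^shift */
};

struct tk_fast {
    atomic_uint seq;
    struct tk_base base[2];
};

/* Update side: a single updater is assumed. */
static void tkf_update(struct tk_fast *tkf, const struct tk_base *tk)
{
    atomic_thread_fence(memory_order_release);      /* stands in for smp_wmb() */
    atomic_fetch_add_explicit(&tkf->seq, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    tkf->base[0] = *tk;                             /* seq is odd: readers use base[1] */
    atomic_thread_fence(memory_order_release);
    atomic_fetch_add_explicit(&tkf->seq, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    tkf->base[1] = *tk;                             /* seq is even: readers use base[0] */
}

/* Reader side: never blocks, retries only if seq changed underneath it. */
static uint64_t tkf_read(struct tk_fast *tkf, uint64_t cycles)
{
    unsigned int seq;
    uint64_t now;

    do {
        seq = atomic_load_explicit(&tkf->seq, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);  /* stands in for smp_rmb() */
        const struct tk_base *b = &tkf->base[seq & 0x01];
        now = b->base_mono +
              (((cycles - b->cycle_last) * b->mult) >> b->shift);
        atomic_thread_fence(memory_order_acquire);
    } while (seq != atomic_load_explicit(&tkf->seq, memory_order_relaxed));

    return now;
}
```

    The reader picks the copy from the low bit of seq, so a reader that
    interrupts the updater always finds one copy that is not being
    written.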
    
    So if an NMI hits the update of base[0] it will use base[1] which is
    still consistent, but this timestamp is not guaranteed to be monotonic
    across an update.
    
    The timestamp is calculated by:
    
    	now = base_mono + clock_delta * slope
    
    So if the update lowers the slope, readers who are forced to the
    not yet updated second array are still using the old steeper slope.
    
     tmono
     ^
     |    o  n
     |   o n
     |  u
     | o
     |o
     |12345678---> reader order
    
     o = old slope
     u = update
     n = new slope
    
    So reader 6 will observe time going backwards versus reader 5.
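
    A small numeric sketch of that effect, using made-up values: the
    struct layout, tk_now() and the integer slopes (3 before the update,
    2 after it, update at clock value 100) are assumptions chosen only to
    make the arithmetic easy to follow.

```c
#include <stdint.h>

/* Hypothetical, simplified copy of the timekeeping data. */
struct tk_base {
    uint64_t base_mono;   /* monotonic time at the last update */
    uint64_t cycle_last;  /* clock value at the last update */
    uint64_t slope;       /* time units per clock tick (integer for simplicity) */
};

/* now = base_mono + clock_delta * slope */
static uint64_t tk_now(const struct tk_base *b, uint64_t cycles)
{
    return b->base_mono + (cycles - b->cycle_last) * b->slope;
}

/* Old copy: slope 3. New copy: taken at clock value 100, slope lowered to 2. */
static const struct tk_base old_base = { 0, 0, 3 };
static const struct tk_base new_base = { 300, 100, 2 };

/* Reader 5 is forced onto the not yet updated copy at clock value 104. */
static uint64_t reader5(void) { return tk_now(&old_base, 104); }  /* 0 + 104 * 3 = 312 */

/* Reader 6 runs later, at clock value 105, and sees the updated copy. */
static uint64_t reader6(void) { return tk_now(&new_base, 105); }  /* 300 + 5 * 2 = 310 */
```

    Reader 6 reads the clock later than reader 5 but computes 310 versus
    312, because reader 5 extrapolated past the update point with the
    old steeper slope.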
    
    While other CPUs are likely to be able to observe that, the only way
    for a CPU local observation is when an NMI hits in the middle of
    the update. Timestamps taken from that NMI context might be ahead
    of the following timestamps. Callers need to be aware of that and
    deal with it.
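
    One possible way for a caller to deal with it, shown purely as a
    sketch: remember the last timestamp handed out and never report an
    older one. struct trace_clock_state and trace_stamp() are
    hypothetical names, and the sketch ignores reentrancy (an NMI
    arriving inside trace_stamp() itself), which a real tracer would
    have to handle.

```c
#include <stdint.h>

/* Hypothetical per-CPU tracer state: the last timestamp handed out. */
struct trace_clock_state {
    uint64_t last;
};

/*
 * Return a timestamp that never goes backwards on this CPU. If the raw
 * value is behind the previous one (for example because the previous one
 * was taken from an NMI in the middle of an update), repeat the previous
 * value instead.
 */
static uint64_t trace_stamp(struct trace_clock_state *st, uint64_t raw)
{
    if (raw < st->last)
        return st->last;

    st->last = raw;
    return raw;
}
```

    Clamping trades accuracy for ordering: events may share a timestamp,
    but no event appears to precede an earlier one on the same CPU.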
    
    V2: Got rid of clock monotonic raw and reorganized the data
        structures. Folded in the barrier fix from Mathieu.
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Signed-off-by: John Stultz <john.stultz@linaro.org>