1. 11 Jan, 2017 1 commit
    • nohz: Fix collision between tick and other hrtimers · 24b91e36
      Committed by Frederic Weisbecker
      When the tick is stopped and an interrupt occurs afterward, we check on
      that interrupt exit if the next tick needs to be rescheduled. If it
      doesn't need any update, we don't want to do anything.
      
      In order to check if the tick needs an update, we compare it against the
      clockevent device deadline. Now that's a problem because the clockevent
      device is at a lower level than the tick itself if it is implemented
      on top of hrtimer.
      
      All hrtimers share this clockevent device. So comparing the next tick
      deadline against the clockevent device deadline is wrong because the
      device may be programmed for another hrtimer whose deadline collides
      with the tick. As a result we may end up not reprogramming the tick
      accidentally.
      
      In a worst case scenario under full dynticks mode, the tick stops firing
      once per second (1 Hz) as it is supposed to, leaving /proc/stat stalled:
      
            Task in a full dynticks CPU
            ----------------------------
      
            * hrtimer A is queued 2 seconds ahead
            * the tick is stopped, scheduled 1 second ahead
            * tick fires 1 second later
            * on tick exit, nohz schedules the tick 1 second ahead but sees
              that the clockevent device is already programmed to that deadline;
              fooled by hrtimer A, it does not reschedule the tick.
            * hrtimer A is cancelled before its deadline
            * tick never fires again until an interrupt happens...
      
      In order to fix this, store the next tick deadline in the tick_sched
      local structure and reuse that value later to check whether we need to
      reprogram the clock after an interrupt.
      
      On the other hand, ts->sleep_length still wants to know about the next
      clock event and not just the tick, so we want to improve the related
      comment to avoid confusion.
      Reported-by: James Hartsock <hartsjc@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Rik van Riel <riel@redhat.com>
      Link: http://lkml.kernel.org/r/1483539124-5693-1-git-send-email-fweisbec@gmail.com
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
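
The collision described in this commit can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: `struct clock_dev`, `struct tick_state`, and the three helper functions are hypothetical names made up for the illustration, not the kernel's actual structures or code. The model only captures the idea that the device deadline is shared by all hrtimers, while the stored tick deadline belongs to the tick alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Simplified clockevent device: it only knows the earliest deadline
 * among ALL queued hrtimers, not which timer owns that deadline. */
struct clock_dev {
    uint64_t next_event;
};

/* Simplified per-CPU tick state (illustrative, not the kernel's struct). */
struct tick_state {
    bool tick_stopped;
    uint64_t next_tick;   /* deadline the tick itself was last programmed for */
};

/* Remember the deadline whenever the stopped tick is (re)programmed. */
static void tick_record_deadline(struct tick_state *ts, uint64_t expires)
{
    assert(ts != NULL);
    ts->tick_stopped = true;
    ts->next_tick = expires;
}

/* Old check: skip reprogramming if the device already targets `expires`.
 * Wrong when another hrtimer happens to own that device deadline. */
static bool skip_reprogram_old(const struct tick_state *ts,
                               const struct clock_dev *dev, uint64_t expires)
{
    return ts->tick_stopped && expires == dev->next_event;
}

/* Fixed check: compare against the tick's own recorded deadline. */
static bool skip_reprogram_new(const struct tick_state *ts, uint64_t expires)
{
    return ts->tick_stopped && expires == ts->next_tick;
}
```

Replaying the scenario from the message: with hrtimer A queued at 2 s and the tick last programmed for 1 s, a request to move the tick to 2 s is skipped by the old check (the device deadline matches, but it belongs to hrtimer A), whereas the new check sees that the tick's own deadline differs and lets the reprogramming happen.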
  2. 29 Mar, 2016 1 commit
  3. 02 Mar, 2016 1 commit
    • nohz: New tick dependency mask · d027d45d
      Committed by Frederic Weisbecker
      The tick dependency is evaluated on every IRQ and context switch. This
      consists of a batch of checks which determine whether it is safe to
      stop the tick or not. These checks are often split into many details:
      posix cpu timers, scheduler, sched clock, perf events... each of which
      is made of smaller details: posix cpu timers involve checking process
      wide timers then thread wide timers. Perf involves checking freq events
      then more per cpu details.
      
      Checking all of this information asynchronously every time we update the
      full dynticks state brings avoidable overhead and a messy layout.
      
      Instead, let's introduce tick dependency masks: one for system wide
      dependency (unstable sched clock, freq based perf events), one for CPU
      wide dependency (sched, throttling perf events), and task/signal level
      dependencies (posix cpu timers). The subsystems are responsible
      for setting and clearing their dependency through a set of APIs that will
      take care of concurrent dependency mask modifications and kick targets
      to restart the relevant CPU tick whenever needed.
      
      This new dependency engine stays beside the old one until all subsystems
      having a tick dependency are converted to it.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
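
The dependency-mask idea from this commit can be sketched in user space with C11 atomics. Everything named here (`enum dep_bit`, `dep_set()`, `dep_clear()`, `can_stop_tick()`) is a hypothetical stand-in chosen for the illustration and should not be read as the kernel's API. The sketch assumes one mask per scope and that a subsystem setting the first bit in an empty mask is the one that must kick the target CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative dependency bits, one per subsystem that may need the tick. */
enum dep_bit {
    DEP_POSIX_TIMER,
    DEP_PERF_EVENTS,
    DEP_SCHED,
    DEP_CLOCK_UNSTABLE
};

#define DEP_MASK(bit) (1u << (bit))

typedef atomic_uint dep_mask_t;

/* Set a dependency. Returns true when the mask was previously empty,
 * i.e. the caller is the one who should kick the target to restart
 * its tick. Concurrent setters are serialized by the atomic OR. */
static bool dep_set(dep_mask_t *mask, enum dep_bit bit)
{
    assert((unsigned int)bit < 32u);
    unsigned int prev = atomic_fetch_or(mask, DEP_MASK(bit));
    return prev == 0;
}

/* Clear a dependency; other subsystems' bits are left untouched. */
static void dep_clear(dep_mask_t *mask, enum dep_bit bit)
{
    assert((unsigned int)bit < 32u);
    atomic_fetch_and(mask, ~DEP_MASK(bit));
}

/* The tick may stop only when no scope holds any dependency. */
static bool can_stop_tick(dep_mask_t *global, dep_mask_t *cpu, dep_mask_t *task)
{
    return atomic_load(global) == 0 &&
           atomic_load(cpu) == 0 &&
           atomic_load(task) == 0;
}
```

With this layout the check on IRQ exit or context switch reduces to reading a few words, and the cost of evaluating each subsystem's state moves to the moment that state actually changes.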
  4. 08 Jul, 2015 1 commit
    • tick/broadcast: Make idle check independent from mode and config · f32dd117
      Committed by Thomas Gleixner
      Currently the broadcast busy check, which prevents the idle code from
      going into deep idle, works only in one shot mode.
      
      If NOHZ and HIGHRES are off (config or command line) there is no
      sanity check at all, so under certain conditions cpus are allowed to
      go into deep idle, where the local timer stops, and are not woken up
      again because there is no broadcast timer installed or a hrtimer based
      broadcast device is not evaluated.
      
      Move tick_broadcast_oneshot_control() into the common code and provide
      proper subfunctions for the various config combinations.
      
      The common check in tick_broadcast_oneshot_control() is for the C3STOP
      misfeature flag of the local clock event device. If it's not set, idle
      can proceed. If set, further checks are necessary.
      
      Provide checks for the trivial cases:
      
       - If broadcast is disabled in the config, then return busy
      
       - If oneshot mode (NOHZ/HIGHRES) is disabled in the config, return
         busy if the broadcast device is hrtimer based.
      
       - If oneshot mode is enabled in the config, call the original
         tick_broadcast_oneshot_control() function. That function needs
         extra checks which will be implemented in separate patches.
      
      [ Split out from a larger combo patch ]
      Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
      Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
      Cc: Catalin Marinas <Catalin.Marinas@arm.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
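
The decision order described in this commit can be written out as a small user-space sketch. In the kernel the config cases are resolved at build time; here they are modelled as runtime flags purely for illustration. `struct idle_check_cfg`, `broadcast_idle_check()`, and `oneshot_control_stub()` are hypothetical names, and returning 0 when oneshot mode is off and the broadcast device is not hrtimer based is an assumption drawn from the wording of the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Runtime stand-ins for what the kernel decides via config options. */
struct idle_check_cfg {
    bool local_c3stop;            /* local timer stops in deep idle */
    bool broadcast_enabled;       /* broadcast support built in */
    bool oneshot_enabled;         /* NOHZ/HIGHRES built in */
    bool bc_is_hrtimer;           /* broadcast device is hrtimer based */
    int (*oneshot_control)(void); /* stand-in for the original function */
};

/* Placeholder for the original oneshot path; always allows deep idle. */
static int oneshot_control_stub(void)
{
    return 0;
}

/* Returns 0 if deep idle may proceed, -EBUSY if it must be refused. */
static int broadcast_idle_check(const struct idle_check_cfg *c)
{
    /* Common check: without C3STOP the local timer keeps running. */
    if (!c->local_c3stop)
        return 0;

    /* No broadcast support: nothing could wake the CPU up again. */
    if (!c->broadcast_enabled)
        return -EBUSY;

    /* Periodic mode only: an hrtimer based broadcast device cannot help. */
    if (!c->oneshot_enabled)
        return c->bc_is_hrtimer ? -EBUSY : 0;

    /* Oneshot mode: defer to the original control function. */
    assert(c->oneshot_control != NULL);
    return c->oneshot_control();
}
```

Putting the C3STOP test first keeps the common case cheap: devices whose local timer survives deep idle never reach the config-dependent branches.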
  5. 22 Apr, 2015 1 commit
  6. 01 Apr, 2015 2 commits