1. 16 Jan, 2018 21 commits
    • hrtimer: Use irqsave/irqrestore around __run_hrtimer() · dd934aa8
      Anna-Maria Gleixner committed
      __run_hrtimer() is called with the hrtimer_cpu_base.lock held and
      interrupts disabled. Before invoking the timer callback the base lock is
      dropped, but interrupts stay disabled.
      
      The upcoming support for softirq based hrtimers requires that interrupts
      are enabled before the timer callback is invoked.
      
      To avoid code duplication, take hrtimer_cpu_base.lock with
      raw_spin_lock_irqsave(flags) at the call site and hand in the flags as
      a parameter. So raw_spin_unlock_irqrestore() before the callback invocation
      will either keep interrupts disabled in interrupt context or restore to
      interrupt enabled state when called from softirq context.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-26-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
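The effect of handing the saved flags into __run_hrtimer() can be illustrated with a small userspace model. This is only a sketch: a single boolean stands in for the CPU interrupt state, and every `model_*` name is hypothetical, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the CPU interrupt-enable state. */
static bool irqs_enabled = true;

/* Models raw_spin_lock_irqsave(): remember the old state, then disable. */
static bool model_lock_irqsave(void)
{
    bool flags = irqs_enabled;

    irqs_enabled = false;
    return flags;
}

/* Models raw_spin_unlock_irqrestore(): restore whatever was saved. */
static void model_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

/* Stand-in for __run_hrtimer(): returns the state the callback would observe. */
static bool model_run_hrtimer(bool flags)
{
    bool seen_by_callback;

    model_unlock_irqrestore(flags);   /* drop the lock before the callback */
    seen_by_callback = irqs_enabled;  /* the callback would run here */
    (void)model_lock_irqsave();       /* retake the lock, interrupts off again */
    return seen_by_callback;
}

/* Entry state false models hard interrupt context, true models softirq context. */
static bool model_expire_from(bool entry_irqs_enabled)
{
    bool flags, seen;

    irqs_enabled = entry_irqs_enabled;
    flags = model_lock_irqsave();
    seen = model_run_hrtimer(flags);
    model_unlock_irqrestore(flags);
    return seen;
}
```

In this model, calling `model_expire_from(false)` lets the callback see interrupts disabled, while `model_expire_from(true)` lets it see them enabled, which is the behaviour the commit message describes.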
    • hrtimer: Factor out __hrtimer_next_event_base() · ad38f596
      Anna-Maria Gleixner committed
      Preparatory patch for softirq based hrtimers to avoid code duplication.
      
      No functional change.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-25-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Factor out __hrtimer_start_range_ns() · 138a6b7a
      Anna-Maria Gleixner committed
      Preparatory patch for softirq based hrtimers. To avoid code duplication,
      factor out the __hrtimer_start_range_ns() function from hrtimer_start_range_ns().
      
      No functional change.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-24-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Remove the 'base' parameter from hrtimer_reprogram() · 3ec7a3ee
      Anna-Maria Gleixner committed
      hrtimer_reprogram() must have access to the hrtimer_clock_base of the new
      first expiring timer to access hrtimer_clock_base.offset for adjusting the
      expiry time to CLOCK_MONOTONIC. This is required to evaluate whether the
      new left most timer in the hrtimer_clock_base is the first expiring timer
      of all clock bases in a hrtimer_cpu_base.
      
      The only user of hrtimer_reprogram() is hrtimer_start_range_ns(), which has
      a pointer to the hrtimer_clock_base already and hands it in as a parameter. But
      hrtimer_start_range_ns() will be split for the upcoming support for softirq
      based hrtimers to avoid code duplication and will lose the direct access to
      the clock base pointer.
      
      Instead of handing in timer and timer->base as parameters, remove the base
      parameter from hrtimer_reprogram() and retrieve the clock base internally.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-23-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
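A minimal sketch of the idea, assuming simplified stand-in types: the function takes only the timer and looks up the clock base through the timer itself, then uses the base offset to express the expiry in CLOCK_MONOTONIC terms. The `*_model` names are hypothetical and do not mirror the kernel structures in full.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t ktime_model_t;          /* nanoseconds; stand-in for ktime_t */

struct clock_base_model {
    ktime_model_t offset;               /* offset of this clock to CLOCK_MONOTONIC */
};

struct timer_model {
    ktime_model_t expires;              /* expiry in the timer's own clock */
    struct clock_base_model *base;      /* clock base the timer is queued on */
};

/* Takes only the timer; the clock base is retrieved internally via timer->base. */
static ktime_model_t model_monotonic_expiry(const struct timer_model *timer)
{
    const struct clock_base_model *base = timer->base;

    return timer->expires - base->offset;
}
```

Because the base is derived from the timer, callers that no longer hold a clock base pointer can still call the function.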
    • hrtimer: Make remote enqueue decision less restrictive · 2ac2dccc
      Anna-Maria Gleixner committed
      The current decision whether a timer can be queued on a remote CPU checks
      for timer->expiry <= remote_cpu_base.expires_next.
      
      This is too restrictive because a timer with the same expiry time as an
      existing timer will be enqueued on the right-hand side of the existing timer
      inside the rbtree, i.e. behind the first expiring timer.
      
      So it's safe to allow enqueuing timers with the same expiry time as the
      first expiring timer on a remote CPU base.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-22-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
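The difference between the two decisions comes down to one comparison operator. The sketch below contrasts them; the function names are hypothetical and plain 64-bit integers stand in for expiry times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old decision: refuse remote enqueue when expires <= expires_next. */
static bool old_can_enqueue_remote(int64_t expires, int64_t remote_expires_next)
{
    return !(expires <= remote_expires_next);
}

/* New decision: refuse only when the new timer would expire strictly earlier. */
static bool new_can_enqueue_remote(int64_t expires, int64_t remote_expires_next)
{
    return !(expires < remote_expires_next);
}
```

The two functions agree everywhere except for equal expiry times, where only the new check allows the remote enqueue.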
    • hrtimer: Unify remote enqueue handling · 14c80341
      Anna-Maria Gleixner committed
      hrtimer_reprogram() is conditionally invoked from hrtimer_start_range_ns()
      when hrtimer_cpu_base.hres_active is true.
      
      In the !hres_active case there is a special condition for the nohz_active
      case:
      
        If the newly enqueued timer expires before the first expiring timer on a
        remote CPU then the remote CPU needs to be notified and woken up from a
        NOHZ idle sleep to take the new first expiring timer into account.
      
      Previous changes have already established the prerequisites to make the
      remote enqueue behaviour the same whether high resolution mode is active or
      not:
      
        If the timer to be enqueued expires before the first expiring timer on a
        remote CPU, then it cannot be enqueued there.
      
      This was done for the high resolution mode because there is no way to
      access the remote CPU timer hardware. The same is true for NOHZ, but was
      handled differently by unconditionally enqueuing the timer and waking up
      the remote CPU so it can reprogram its timer. Again there is no compelling
      reason for this difference.
      
      hrtimer_check_target(), which makes the 'can remote enqueue' decision, is
      already unconditional, but not yet functional because nothing updates
      hrtimer_cpu_base.expires_next in the !hres_active case.
      
      To unify this the following changes are required:
      
       1) Make the store of the new first expiry time unconditional in
          hrtimer_reprogram() and check __hrtimer_hres_active() before proceeding
          to the actual hardware access. This check also lets the compiler
          eliminate the rest of the function in case of CONFIG_HIGH_RES_TIMERS=n.
      
       2) Invoke hrtimer_reprogram() unconditionally from
          hrtimer_start_range_ns()
      
       3) Remove the remote wakeup special case for the !high_res && nohz_active
          case.
      
      Confine the timers_nohz_active static key to timer.c which is the only user
      now.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-21-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
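Step 1 of the list above can be sketched as a userspace model: the cached first expiry is always stored, and only the hardware access depends on high resolution mode. The struct, the counter standing in for the clock event device, and the function name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cpu_base_model {
    bool hres_active;          /* high resolution mode active on this CPU */
    int64_t expires_next;      /* cached first expiry */
    unsigned int hw_programs;  /* counts modelled clock event device accesses */
};

/* Models the unified reprogram path described in the commit message. */
static void model_reprogram(struct cpu_base_model *base, int64_t expires)
{
    if (expires >= base->expires_next)
        return;                    /* not the new first expiring timer */

    base->expires_next = expires;  /* unconditional store */

    if (!base->hres_active)
        return;                    /* no hardware access without high-res mode */

    base->hw_programs++;           /* stands in for programming the hardware */
}
```

With this shape, the remote enqueue check sees an up-to-date `expires_next` whether or not high resolution mode is active.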
    • hrtimer: Unify hrtimer removal handling · 61bb4bcb
      Anna-Maria Gleixner committed
      When the first hrtimer on the current CPU is removed,
      hrtimer_force_reprogram() is invoked but only when
      CONFIG_HIGH_RES_TIMERS=y and hrtimer_cpu_base.hres_active is set.
      
      hrtimer_force_reprogram() updates hrtimer_cpu_base.expires_next and
      reprograms the clock event device. When CONFIG_HIGH_RES_TIMERS=y and
      hrtimer_cpu_base.hres_active is set, a pointless hrtimer interrupt can be
      prevented.
      
      hrtimer_check_target() makes the 'can remote enqueue' decision. As soon as
      hrtimer_check_target() is unconditionally available and
      hrtimer_cpu_base.expires_next is updated by hrtimer_reprogram(),
      hrtimer_force_reprogram() needs to be available unconditionally as well to
      prevent the following scenario with CONFIG_HIGH_RES_TIMERS=n:
      
      - the first hrtimer on this CPU is removed and hrtimer_force_reprogram() is
        not executed
      
      - CPU goes idle (next timer is calculated and hrtimers are taken into
        account)
      
      - a hrtimer is enqueued remotely on the idle CPU: hrtimer_check_target()
        compares expiry value and hrtimer_cpu_base.expires_next. The expiry value
        is after expires_next, so the hrtimer is enqueued. This timer will fire
        late, if it expires before the effective first hrtimer on this CPU and
        the comparison was with an outdated expires_next value.
      
      To prevent this scenario, make hrtimer_force_reprogram() unconditional
      except the effective reprogramming part, which gets eliminated by the
      compiler in the CONFIG_HIGH_RES_TIMERS=n case.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-20-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
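The stale-cache scenario from the commit message can be replayed in a small userspace model. This is a sketch under simplifying assumptions: timers live in a tiny array instead of an rbtree, and all `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_TIMERS 4

struct cpu_base_model {
    int64_t timers[MODEL_MAX_TIMERS];  /* expiry times of queued timers */
    int nr;                            /* number of queued timers */
    int64_t expires_next;              /* cached first expiry */
};

/* Earliest queued expiry, or INT64_MAX when the queue is empty. */
static int64_t model_next_event(const struct cpu_base_model *b)
{
    int64_t min = INT64_MAX;

    for (int i = 0; i < b->nr; i++)
        if (b->timers[i] < min)
            min = b->timers[i];
    return min;
}

/* Remove timer idx; optionally refresh the cache as a forced reprogram would. */
static void model_remove(struct cpu_base_model *b, int idx, bool force_reprogram)
{
    b->timers[idx] = b->timers[b->nr - 1];
    b->nr--;
    if (force_reprogram)
        b->expires_next = model_next_event(b);
}

/* Remote enqueue is allowed only if the new timer does not become the first one. */
static bool model_can_enqueue_remote(const struct cpu_base_model *b, int64_t expires)
{
    return expires >= b->expires_next;
}
```

Starting from timers at 100 and 500 with `expires_next == 100`, removing the first timer without the refresh leaves the cache at 100, so a remote timer at 300 is accepted although the CPU will not wake before 500. With the refresh, the cache becomes 500 and the same enqueue is refused.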
    • hrtimer: Make hrtimer_force_reprogramm() unconditionally available · ebba2c72
      Anna-Maria Gleixner committed
      hrtimer_force_reprogram() needs to be available unconditionally for softirq
      based hrtimers. Move the function and all required struct members out of
      the CONFIG_HIGH_RES_TIMERS #ifdef.
      
      There is no functional change because hrtimer_force_reprogram() is only
      invoked when hrtimer_cpu_base.hres_active is true and
      CONFIG_HIGH_RES_TIMERS=y.
      
      Making it unconditional increases the text size for the
      CONFIG_HIGH_RES_TIMERS=n case slightly, but avoids replication of that code
      for the upcoming softirq based hrtimers support. Most of the code gets
      eliminated in the CONFIG_HIGH_RES_TIMERS=n case by the compiler.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-19-anna-maria@linutronix.de
      [ Made it build on !CONFIG_HIGH_RES_TIMERS ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Make hrtimer_reprogramm() unconditional · 11a9fe06
      Anna-Maria Gleixner committed
      hrtimer_reprogram() needs to be available unconditionally for softirq based
      hrtimers. Move the function and all required struct members out of the
      CONFIG_HIGH_RES_TIMERS #ifdef.
      
      There is no functional change because hrtimer_reprogram() is only invoked
      when hrtimer_cpu_base.hres_active is true. Making it unconditional
      increases the text size for the CONFIG_HIGH_RES_TIMERS=n case, but avoids
      replication of that code for the upcoming softirq based hrtimers support.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-18-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Make hrtimer_cpu_base.next_timer handling unconditional · eb27926b
      Anna-Maria Gleixner committed
      hrtimer_cpu_base.next_timer stores the pointer to the next expiring timer
      in a CPU base.
      
      This pointer cannot be dereferenced and is solely used to check whether a
      hrtimer which is removed is the hrtimer which is the first to expire in the
      CPU base. If this is the case, then the timer hardware needs to be
      reprogrammed to avoid an extra interrupt for nothing.
      
      Again, this is conditional functionality, but there is no compelling reason
      to make this conditional. As a preparation, hrtimer_cpu_base.next_timer
      needs to be available unconditionally.
      
      Aside from that, the upcoming support for softirq based hrtimers requires access
      to this pointer unconditionally as well, so our motivation is not entirely
      simplicity based.
      
      Make the update of hrtimer_cpu_base.next_timer unconditional and remove the
      #ifdef cruft. The impact on CONFIG_HIGH_RES_TIMERS=n && CONFIG_NOHZ=n is
      marginal as it's just a store on an already dirtied cacheline.
      
      No functional change.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-17-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Make the remote enqueue check unconditional · 07a9a7ea
      Anna-Maria Gleixner committed
      hrtimer_cpu_base.expires_next is used to cache the next event armed in the
      timer hardware. The value is used to check whether an hrtimer can be
      enqueued remotely. If the new hrtimer is expiring before expires_next, then
      remote enqueue is not possible as the remote hrtimer hardware cannot be
      accessed for reprogramming to an earlier expiry time.
      
      The remote enqueue check is currently conditional on
      CONFIG_HIGH_RES_TIMERS=y and hrtimer_cpu_base.hres_active. There is no
      compelling reason to make this conditional.
      
      Move hrtimer_cpu_base.expires_next out of the CONFIG_HIGH_RES_TIMERS=y
      guarded area and remove the conditionals in hrtimer_check_target().
      
      The check is currently a NOOP for the CONFIG_HIGH_RES_TIMERS=n and the
      !hrtimer_cpu_base.hres_active case because in these cases nothing updates
      hrtimer_cpu_base.expires_next yet. This will be changed with later patches
      which further reduce the #ifdef zoo in this code.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-16-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Use accesor functions instead of direct access · 851cff8c
      Anna-Maria Gleixner committed
      __hrtimer_hres_active() is now available unconditionally, so replace open
      coded direct accesses to hrtimer_cpu_base.hres_active.
      
      No functional change.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-15-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Make the hrtimer_cpu_base::hres_active field unconditional, to simplify the code · 28bfd18b
      Anna-Maria Gleixner committed
      The hrtimer_cpu_base::hres_active field currently depends on CONFIG_HIGH_RES_TIMERS=y,
      and all functions related to this member are conditional as well.
      
      To simplify the code make it unconditional and set it to zero during initialization.
      
      (This will also help with the upcoming softirq based hrtimers code.)
      
      The conditional code sections can be avoided by adding IS_ENABLED(HIGHRES)
      conditionals into common functions, which ensures dead code elimination.
      
      There is no functional change.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-14-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
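The dead code elimination argument can be sketched outside the kernel with a plain compile-time constant. `MODEL_HIGH_RES_TIMERS` is a hypothetical stand-in for the kernel's IS_ENABLED() test on the config option; the struct and functions are likewise illustrative only.

```c
#include <assert.h>

/* Hypothetical compile-time switch; 0 models CONFIG_HIGH_RES_TIMERS=n. */
#define MODEL_HIGH_RES_TIMERS 0

struct cpu_base_model {
    unsigned int hres_active : 1;   /* always present, zero-initialised */
};

/* Accessor in a common function: constant-folds to 0 when the switch is 0. */
static inline int model_hres_active(const struct cpu_base_model *base)
{
    return MODEL_HIGH_RES_TIMERS ? base->hres_active : 0;
}

/* Returns 1 if the modelled hardware path would run, 0 otherwise. */
static int model_reprogram_path(const struct cpu_base_model *base)
{
    if (!model_hres_active(base))
        return 0;   /* with the switch at 0, everything below is dead code */
    return 1;
}
```

Since the accessor evaluates to a constant 0 when the switch is off, an optimising compiler can drop the code behind the check without any `#ifdef` in the callers.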
    • hrtimer: Store running timer in hrtimer_clock_base · 3f0b9e8e
      Anna-Maria Gleixner committed
      The pointer to the currently running timer is stored in hrtimer_cpu_base
      before the base lock is dropped and the callback is invoked.
      
      This results in two levels of indirections and the upcoming support for
      softirq based hrtimer requires splitting the "running" storage into soft
      and hard IRQ context expiry.
      
      Storing both in the cpu base would require conditionals in all code paths
      accessing that information.
      
      It's possible to have a per clock base sequence count and running pointer
      without changing the semantics of the related mechanisms because the timer
      base pointer cannot be changed while a timer is running the callback.
      
      Unfortunately this makes cpu_clock base larger than 32 bytes on 32-bit
      kernels. Instead of having huge gaps due to alignment, remove the alignment
      and let the compiler pack CPU base for 32-bit kernels. The resulting cache access
      patterns are fortunately not really different from the current
      behaviour. On 64-bit kernels the 64-byte alignment stays and the behaviour is
      unchanged. This was determined by analyzing the resulting layout and
      looking at the number of cache lines involved for the frequently used
      clocks.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-12-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Switch 'for' loop to _ffs() evaluation · c272ca58
      Anna-Maria Gleixner committed
      Looping over all clock bases to find active bits is suboptimal if not all
      bases are active.
      
      Avoid this by converting it to a __ffs() evaluation. The functionality is
      moved into its own function and is called via a macro as suggested by
      Peter Zijlstra.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-11-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
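A userspace sketch of the technique: instead of testing every bit position, repeatedly take the lowest set bit of the active mask and clear it. The kernel's __ffs() is replaced here by the GCC/Clang builtin `__builtin_ctz()`, which is an assumption about the toolchain; the function names are hypothetical.

```c
#include <assert.h>

/* Return the index of the lowest set bit and clear it; *active must be non-zero. */
static unsigned int model_next_base(unsigned int *active)
{
    unsigned int idx = (unsigned int)__builtin_ctz(*active);

    *active &= ~(1U << idx);
    return idx;
}

/* Visit only the active bases; writes their indices to out, returns the count. */
static int model_collect_active(unsigned int active, unsigned int *out)
{
    int n = 0;

    while (active)
        out[n++] = model_next_base(&active);
    return n;
}
```

The loop body runs once per set bit, so a mask with few active bases costs few iterations regardless of how many bases exist.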
    • tracing/hrtimer: Print the hrtimer mode in the 'hrtimer_start' tracepoint · 63e2ed36
      Anna-Maria Gleixner committed
      The 'hrtimer_start' tracepoint lacks the mode information. The mode is
      important because consecutive starts can switch from ABS to REL or from
      PINNED to non PINNED.
      
      Append the mode field.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-10-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers) · 48d0c9be
      Anna-Maria Gleixner committed
      The POSIX specification defines that relative CLOCK_REALTIME timers are not
      affected by clock modifications. Those timers have to use CLOCK_MONOTONIC
      to ensure POSIX compliance.
      
      The introduction of the additional HRTIMER_MODE_PINNED mode broke this
      requirement for pinned timers.
      
      There is no user space visible impact because user space timers are not
      using pinned mode, but for consistency reasons this needs to be fixed.
      
      Check whether the mode has the HRTIMER_MODE_REL bit set instead of
      comparing with HRTIMER_MODE_ABS.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Fixes: 597d0275 ("timers: Framework for identifying pinned timers")
      Link: http://lkml.kernel.org/r/20171221104205.7269-7-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
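Why comparing against HRTIMER_MODE_ABS stops working once a pinned bit exists can be shown with a small sketch. The enum below is a model: the bit values (ABS = 0, REL = 1, PINNED = 2) are assumed for illustration and the `MODEL_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative mode bits; values are assumptions made for this sketch. */
enum model_hrtimer_mode {
    MODEL_MODE_ABS        = 0x0,
    MODEL_MODE_REL        = 0x1,
    MODEL_MODE_PINNED     = 0x2,
    MODEL_MODE_ABS_PINNED = 0x2,
    MODEL_MODE_REL_PINNED = 0x3
};

/* Old style: anything that is not exactly ABS is treated as relative. */
static bool old_is_relative(int mode)
{
    return mode != MODEL_MODE_ABS;
}

/* New style: test the REL bit, ignoring unrelated bits such as PINNED. */
static bool new_is_relative(int mode)
{
    return (mode & MODEL_MODE_REL) != 0;
}
```

The two checks agree for the plain modes and for a relative pinned timer, but disagree for an absolute pinned timer, which the equality comparison misclassifies.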
    • hrtimer: Fix hrtimer_start[_range_ns]() function descriptions · 6de6250c
      Anna-Maria Gleixner committed
      The hrtimer_start[_range_ns]() functions start a timer reliably on this CPU only
      when HRTIMER_MODE_PINNED is set.
      
      Furthermore the HRTIMER_MODE_PINNED mode is not considered when a hrtimer is initialized.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-6-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Clean up the 'int clock' parameter of schedule_hrtimeout_range_clock() · 90777713
      Anna-Maria Gleixner committed
      schedule_hrtimeout_range_clock() uses an 'int clock' parameter for the
      clock ID, instead of the customary predefined "clockid_t" type.
      
      In hrtimer coding style the canonical variable name for the clock ID is
      'clock_id', therefore change the name of the parameter here as well
      to make it all consistent.
      
      While at it, clean up the description for the 'clock_id' and 'mode'
      function parameters. The clock modes and the clock IDs are not
      restricted as the comment suggests.
      
      Fix the mode description as well for the callers of schedule_hrtimeout_range_clock().
      
      No functional changes intended.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-5-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Correct blatantly incorrect comment · d05ca13b
      Thomas Gleixner committed
      The protection of a hrtimer which runs its callback against migration to a
      different CPU has nothing to do with hard interrupt context.
      
      The protection against migration of a hrtimer running the expiry callback
      is the pointer in the cpu_base which holds a pointer to the currently
      running timer. This pointer is evaluated in the code which potentially
      switches the timer base and makes sure it's kept on the CPU on which the
      callback is running.
      Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-3-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • hrtimer: Optimize the hrtimer code by using static keys for migration_enable/nohz_active · ae67bada
      Thomas Gleixner committed
      The hrtimer_cpu_base::migration_enable and ::nohz_active fields
      were originally introduced to avoid accessing global variables
      for these decisions.
      
      Still that results in a (cache hot) load and conditional branch,
      which can be avoided by using static keys.
      
      Implement it with static keys and optimize for the most critical
      case of high performance networking which tends to disable the
      timer migration functionality.
      
      No change in functionality.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1801142327490.2371@nanos
      Link: https://lkml.kernel.org/r/20171221104205.7269-2-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 15 Jan, 2018 2 commits
  3. 04 Jan, 2018 1 commit
  4. 30 Dec, 2017 4 commits
    • timers: Invoke timer_start_debug() where it makes sense · fd45bb77
      Thomas Gleixner committed
      The timer start debug function is called before the proper timer base is
      set. As a consequence the trace data contains the stale CPU and flags
      values.
      
      Call the debug function after setting the new base and flags.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de
    • nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() · 5d62c183
      Thomas Gleixner committed
      The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
      subsequently invokes tick_nohz_stop_sched_tick() are:
      
        if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))
      
      If need_resched() is not set, but a timer softirq is pending then this is
      an indication that the softirq code punted and delegated the execution to
      softirqd. need_resched() is not true because the current interrupted task
      takes precedence over softirqd.
      
      Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
      timer interrupts because the timer wheel contains an expired timer, but
      softirqs are not yet executed. So it returns an immediate expiry request,
      which causes the timer to fire immediately again. Lather, rinse and
      repeat....
      
      Prevent that by adding a check for a pending timer soft interrupt to the
      conditions in tick_nohz_stop_sched_tick() which avoid calling
      get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
      prevents a repetitive programming of an already expired timer.
      Reported-by: Sebastian Siewior <bigeasy@linutronix.d>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
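The added condition can be sketched as a predicate over the inputs that decide whether the tick stays programmed. This is a rough model only: the struct, its field names, and `model_keep_tick()` are hypothetical, and the other inputs are placeholders for whatever existing conditions already avoid the next-timer lookup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the conditions evaluated before stopping the tick. */
struct tick_inputs_model {
    bool rcu_needs_cpu;
    bool arch_needs_cpu;
    bool irq_work_needs_cpu;
    bool timer_softirq_pending;   /* the condition added by the fix */
};

/* True: keep the tick for one more period and skip the next-timer lookup. */
static bool model_keep_tick(const struct tick_inputs_model *in)
{
    return in->rcu_needs_cpu || in->arch_needs_cpu ||
           in->irq_work_needs_cpu || in->timer_softirq_pending;
}
```

With only `timer_softirq_pending` set, the predicate holds, so the modelled code never asks the timer wheel for the already expired timer and cannot reprogram it in a loop.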
    • timers: Reinitialize per cpu bases on hotplug · 26456f87
      Thomas Gleixner committed
      The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
      them with a potentially stale clk and next_expiry value, which can cause
      trouble when the CPU is plugged.
      
      Add a prepare callback which forwards the clock, sets next_expiry to far in
      the future and resets the control flags to a known state.
      
      Set base->must_forward_clk so the first timer which is queued will try to
      forward the clock to current jiffies.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
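The prepare step described above can be sketched as a function that resets one timer base to a known state. The struct is a reduced model, the "far in the future" delta is an illustrative constant, and the current jiffies value is passed in as a parameter; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative "far in the future" distance; the exact value is an assumption. */
#define MODEL_NEXT_TIMER_MAX_DELTA ((1UL << 30) - 1)

struct timer_base_model {
    unsigned long clk;          /* wheel clock of this base */
    unsigned long next_expiry;  /* cached next expiry */
    bool is_idle;               /* control flag */
    bool must_forward_clk;      /* first enqueue should forward clk */
};

/* Models a hotplug prepare callback for a single base. */
static void model_timers_prepare_cpu(struct timer_base_model *base,
                                     unsigned long jiffies_now)
{
    base->clk = jiffies_now;                                    /* forward the clock */
    base->next_expiry = base->clk + MODEL_NEXT_TIMER_MAX_DELTA; /* far in the future */
    base->is_idle = false;                                      /* known flag state */
    base->must_forward_clk = true;
}
```

After the call, no stale `clk` or `next_expiry` from before the CPU went offline remains in the base.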
    • timers: Use deferrable base independent of base::nohz_active · ced6d5c1
      Anna-Maria Gleixner committed
      During boot and before base::nohz_active is set in the timer bases, deferrable
      timers are enqueued into the standard timer base. This works correctly as
      long as base::nohz_active is false.
      
      Once base::nohz_active is set and a timer which was enqueued before that
      is accessed, the lock selector code chooses the lock of the deferrable
      base. This causes unlocked access to the standard base and, in case the
      timer is removed, it does not clear the pending flag in the standard base
      bitmap, which causes get_next_timer_interrupt() to return bogus values.
      
      To prevent that, the deferrable timers must be enqueued in the deferrable
      base, even when base::nohz_active is not set. Those deferrable timers also
      need to be expired unconditionally.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
      ced6d5c1
  5. 28 Dec, 2017 1 commit
  6. 18 Dec, 2017 1 commit
    • P
      sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION · bf29cb23
      Paul E. McKenney committed
      CONFIG_NO_HZ_FULL doesn't make sense without CONFIG_CPU_ISOLATION. In
      fact, enabling the first without the second is a regression, as the
      nohz_full= boot parameter gets silently ignored.
      
      Besides, this unnatural combination hangs the RCU gp kthread when running
      rcutorture, for reasons that are not yet fully understood:
      
      	rcu_preempt kthread starved for 9974 jiffies! g4294967208
      	+c4294967207 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
      	rcu_preempt     I 7464     8      2 0x80000000
      	Call Trace:
      		__schedule+0x493/0x620
      		schedule+0x24/0x40
      		schedule_timeout+0x330/0x3b0
      		? preempt_count_sub+0xea/0x140
      		? collect_expired_timers+0xb0/0xb0
      		rcu_gp_kthread+0x6bf/0xef0
      
      This commit therefore makes NO_HZ_FULL select CPU_ISOLATION, which
      prevents all these bad behaviours.
      Reported-by: kernel test robot <xiaolong.ye@intel.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      Fixes: 5c4991e2 ("sched/isolation: Split out new CONFIG_CPU_ISOLATION=y config from CONFIG_NO_HZ_FULL")
      Link: http://lkml.kernel.org/r/1513275507-29200-2-git-send-email-frederic@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      bf29cb23
  7. 15 Dec, 2017 1 commit
    • T
      posix-timer: Properly check sigevent->sigev_notify · cef31d9a
      Thomas Gleixner committed
      timer_create() specifies via sigevent->sigev_notify the signal delivery for
      the new timer. The valid modes are SIGEV_NONE, SIGEV_SIGNAL, SIGEV_THREAD
      and (SIGEV_SIGNAL | SIGEV_THREAD_ID).
      
      The sanity check in good_sigevent() is only checking the valid combination
      for the SIGEV_THREAD_ID bit, i.e. SIGEV_SIGNAL, but if SIGEV_THREAD_ID is
      not set it accepts any random value.
      
      This has no real effects on the posix timer and signal delivery code, but
      it affects show_timer() which handles the output of /proc/$PID/timers. That
      function uses a string array to pretty print sigev_notify. The access to
      that array has no bounds check, so random sigev_notify values cause
      accesses beyond the array bounds.
      
      Add proper checks for the valid notify modes and remove the SIGEV_THREAD_ID
      masking from various code paths as SIGEV_NONE can never be set in
      combination with SIGEV_THREAD_ID.
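
      A minimal user-space sketch of such a check follows, assuming the Linux
      values of the notify modes. The MODEL_* constants and both helper names
      are hypothetical and exist only for this illustration; this is not the
      code of good_sigevent() or show_timer().

```c
#include <stdbool.h>

/*
 * Linux values of the notify modes, redefined locally (assumption) so the
 * sketch does not depend on which libc headers expose SIGEV_THREAD_ID.
 */
enum {
	MODEL_SIGEV_SIGNAL    = 0,
	MODEL_SIGEV_NONE      = 1,
	MODEL_SIGEV_THREAD    = 2,
	MODEL_SIGEV_THREAD_ID = 4,
};

/* Hypothetical helper: accept only the four valid modes, reject the rest. */
bool model_notify_is_valid(int sigev_notify)
{
	switch (sigev_notify) {
	case MODEL_SIGEV_NONE:
	case MODEL_SIGEV_SIGNAL:
	case MODEL_SIGEV_THREAD:
	case MODEL_SIGEV_SIGNAL | MODEL_SIGEV_THREAD_ID:
		return true;
	default:
		return false;
	}
}

/* Name table indexed by the notify mode with the THREAD_ID bit masked off. */
static const char *const model_notify_names[] = {
	[MODEL_SIGEV_SIGNAL] = "signal",
	[MODEL_SIGEV_NONE]   = "none",
	[MODEL_SIGEV_THREAD] = "thread",
};

/* Pretty-print a mode; invalid values never reach the array lookup. */
const char *model_notify_name(int sigev_notify)
{
	if (!model_notify_is_valid(sigev_notify))
		return "invalid";
	return model_notify_names[sigev_notify & ~MODEL_SIGEV_THREAD_ID];
}
```

      Because every accepted value maps to index 0, 1 or 2 after masking, the
      table lookup cannot run past the end of the array once the validity
      check is in place.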
      Reported-by: Eric Biggers <ebiggers3@gmail.com>
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: stable@vger.kernel.org
      cef31d9a
  8. 22 Nov, 2017 5 commits
    • K
      timer: Pass function down to initialization routines · 188665b2
      Kees Cook committed
      In preparation for removing more macros, pass the function down to the
      initialization routines instead of doing it in macros.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      188665b2
    • K
      timer: Switch callback prototype to take struct timer_list * argument · 354b46b1
      Kees Cook committed
      Since all callbacks have been converted, we can switch the core
      prototype to "struct timer_list *" now too.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      354b46b1
    • K
      timer: Pass timer_list pointer to callbacks unconditionally · c1eba5bc
      Kees Cook committed
      Now that all timer callbacks are already taking their struct timer_list
      pointer as the callback argument, just do this unconditionally and remove
      the .data field.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      c1eba5bc
    • K
      treewide: setup_timer() -> timer_setup() · e99e88a9
      Kees Cook committed
      This converts all remaining cases of the old setup_timer() API into using
      timer_setup(), where the callback argument is the structure already
      holding the struct timer_list. These should have no behavioral changes,
      since they just change which pointer is passed into the callback with
      the same available pointers after conversion. It handles the following
      examples, in addition to some other variations.
      
      Casting from unsigned long:
      
          void my_callback(unsigned long data)
          {
              struct something *ptr = (struct something *)data;
          ...
          }
          ...
          setup_timer(&ptr->my_timer, my_callback, ptr);
      
      and forced object casts:
      
          void my_callback(struct something *ptr)
          {
          ...
          }
          ...
          setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);
      
      become:
      
          void my_callback(struct timer_list *t)
          {
              struct something *ptr = from_timer(ptr, t, my_timer);
          ...
          }
          ...
          timer_setup(&ptr->my_timer, my_callback, 0);
      
      Direct function assignments:
      
          void my_callback(unsigned long data)
          {
              struct something *ptr = (struct something *)data;
          ...
          }
          ...
          ptr->my_timer.function = my_callback;
      
      have a temporary cast added, along with converting the args:
      
          void my_callback(struct timer_list *t)
          {
              struct something *ptr = from_timer(ptr, t, my_timer);
          ...
          }
          ...
          ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;
      
      And finally, callbacks without a data assignment:
      
          void my_callback(unsigned long data)
          {
          ...
          }
          ...
          setup_timer(&ptr->my_timer, my_callback, 0);
      
      have their argument renamed to verify they're unused during conversion:
      
          void my_callback(struct timer_list *unused)
          {
          ...
          }
          ...
          timer_setup(&ptr->my_timer, my_callback, 0);
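
      The converted callbacks rely on recovering the enclosing object from the
      embedded timer pointer. A minimal user-space sketch of that idea is
      shown here; struct timer_list is reduced to a stand-in, and the model_*
      macros and function are hypothetical equivalents written for this
      example (they use the GNU __typeof__ extension), not the kernel's own
      definitions.

```c
#include <stddef.h>

/* Reduced stand-in for the kernel's struct timer_list (assumption). */
struct timer_list {
	void (*function)(struct timer_list *);
	unsigned int flags;
};

/* User-space equivalents of container_of() and from_timer(). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define model_from_timer(var, callback_timer, timer_fieldname) \
	model_container_of(callback_timer, __typeof__(*var), timer_fieldname)

/* Simplified timer_setup(): record the callback; there is no data field. */
void model_timer_setup(struct timer_list *timer,
		       void (*callback)(struct timer_list *),
		       unsigned int flags)
{
	timer->function = callback;
	timer->flags = flags;
}

struct something {
	int fired;                   /* counts callback invocations */
	struct timer_list my_timer;  /* embedded timer */
};

void my_callback(struct timer_list *t)
{
	/* Walk back from the embedded timer to the object that contains it. */
	struct something *ptr = model_from_timer(ptr, t, my_timer);

	ptr->fired++;
}
```

      After model_timer_setup(&s.my_timer, my_callback, 0) on a
      struct something s, invoking s.my_timer.function(&s.my_timer)
      increments s.fired without any unsigned long cast, which is the point
      of passing the timer pointer instead of a data word.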
      
      The conversion is done with the following Coccinelle script:
      
      spatch --very-quiet --all-includes --include-headers \
      	-I ./arch/x86/include -I ./arch/x86/include/generated \
      	-I ./include -I ./arch/x86/include/uapi \
      	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
      	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
      	--dir . \
      	--cocci-file ~/src/data/timer_setup.cocci
      
      @fix_address_of@
      expression e;
      @@
      
       setup_timer(
      -&(e)
      +&e
       , ...)
      
      // Update any raw setup_timer() usages that have a NULL callback, but
      // would otherwise match change_timer_function_usage, since the latter
      // will update all function assignments done in the face of a NULL
      // function initialization in setup_timer().
      @change_timer_function_usage_NULL@
      expression _E;
      identifier _timer;
      type _cast_data;
      @@
      
      (
      -setup_timer(&_E->_timer, NULL, _E);
      +timer_setup(&_E->_timer, NULL, 0);
      |
      -setup_timer(&_E->_timer, NULL, (_cast_data)_E);
      +timer_setup(&_E->_timer, NULL, 0);
      |
      -setup_timer(&_E._timer, NULL, &_E);
      +timer_setup(&_E._timer, NULL, 0);
      |
      -setup_timer(&_E._timer, NULL, (_cast_data)&_E);
      +timer_setup(&_E._timer, NULL, 0);
      )
      
      @change_timer_function_usage@
      expression _E;
      identifier _timer;
      struct timer_list _stl;
      identifier _callback;
      type _cast_func, _cast_data;
      @@
      
      (
      -setup_timer(&_E->_timer, _callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, &_callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, _callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)_callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, &_callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
       _E->_timer@_stl.function = _callback;
      |
       _E->_timer@_stl.function = &_callback;
      |
       _E->_timer@_stl.function = (_cast_func)_callback;
      |
       _E->_timer@_stl.function = (_cast_func)&_callback;
      |
       _E._timer@_stl.function = _callback;
      |
       _E._timer@_stl.function = &_callback;
      |
       _E._timer@_stl.function = (_cast_func)_callback;
      |
       _E._timer@_stl.function = (_cast_func)&_callback;
      )
      
      // callback(unsigned long arg)
      @change_callback_handle_cast
       depends on change_timer_function_usage@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _origtype;
      identifier _origarg;
      type _handletype;
      identifier _handle;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *t
       )
       {
      (
      	... when != _origarg
      	_handletype *_handle =
      -(_handletype *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle =
      -(void *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle;
      	... when != _handle
      	_handle =
      -(_handletype *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle;
      	... when != _handle
      	_handle =
      -(void *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      )
       }
      
      // callback(unsigned long arg) without existing variable
      @change_callback_handle_cast_no_arg
       depends on change_timer_function_usage &&
                           !change_callback_handle_cast@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _origtype;
      identifier _origarg;
      type _handletype;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *t
       )
       {
      +	_handletype *_origarg = from_timer(_origarg, t, _timer);
      +
      	... when != _origarg
      -	(_handletype *)_origarg
      +	_origarg
      	... when != _origarg
       }
      
      // Avoid already converted callbacks.
      @match_callback_converted
       depends on change_timer_function_usage &&
                  !change_callback_handle_cast &&
      	    !change_callback_handle_cast_no_arg@
      identifier change_timer_function_usage._callback;
      identifier t;
      @@
      
       void _callback(struct timer_list *t)
       { ... }
      
      // callback(struct something *handle)
      @change_callback_handle_arg
       depends on change_timer_function_usage &&
      	    !match_callback_converted &&
                  !change_callback_handle_cast &&
                  !change_callback_handle_cast_no_arg@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _handletype;
      identifier _handle;
      @@
      
       void _callback(
      -_handletype *_handle
      +struct timer_list *t
       )
       {
      +	_handletype *_handle = from_timer(_handle, t, _timer);
      	...
       }
      
      // If change_callback_handle_arg ran on an empty function, remove
      // the added handler.
      @unchange_callback_handle_arg
       depends on change_timer_function_usage &&
      	    change_callback_handle_arg@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _handletype;
      identifier _handle;
      identifier t;
      @@
      
       void _callback(struct timer_list *t)
       {
      -	_handletype *_handle = from_timer(_handle, t, _timer);
       }
      
      // We only want to refactor the setup_timer() data argument if we've found
      // the matching callback. This undoes changes in change_timer_function_usage.
      @unchange_timer_function_usage
       depends on change_timer_function_usage &&
                  !change_callback_handle_cast &&
                  !change_callback_handle_cast_no_arg &&
      	    !change_callback_handle_arg@
      expression change_timer_function_usage._E;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type change_timer_function_usage._cast_data;
      @@
      
      (
      -timer_setup(&_E->_timer, _callback, 0);
      +setup_timer(&_E->_timer, _callback, (_cast_data)_E);
      |
      -timer_setup(&_E._timer, _callback, 0);
      +setup_timer(&_E._timer, _callback, (_cast_data)&_E);
      )
      
      // If we fixed a callback from a .function assignment, fix the
      // assignment cast now.
      @change_timer_function_assignment
       depends on change_timer_function_usage &&
                  (change_callback_handle_cast ||
                   change_callback_handle_cast_no_arg ||
                   change_callback_handle_arg)@
      expression change_timer_function_usage._E;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type _cast_func;
      typedef TIMER_FUNC_TYPE;
      @@
      
      (
       _E->_timer.function =
      -_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_timer.function =
      -&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_timer.function =
      -(_cast_func)_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_timer.function =
      -(_cast_func)&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -(_cast_func)_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -(_cast_func)&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      )
      
      // Sometimes timer functions are called directly. Replace matched args.
      @change_timer_function_calls
       depends on change_timer_function_usage &&
                  (change_callback_handle_cast ||
                   change_callback_handle_cast_no_arg ||
                   change_callback_handle_arg)@
      expression _E;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type _cast_data;
      @@
      
       _callback(
      (
      -(_cast_data)_E
      +&_E->_timer
      |
      -(_cast_data)&_E
      +&_E._timer
      |
      -_E
      +&_E->_timer
      )
       )
      
      // If a timer has been configured without a data argument, it can be
      // converted without regard to the callback argument, since it is unused.
      @match_timer_function_unused_data@
      expression _E;
      identifier _timer;
      identifier _callback;
      @@
      
      (
      -setup_timer(&_E->_timer, _callback, 0);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, _callback, 0L);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, _callback, 0UL);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, 0);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, 0L);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, 0UL);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_timer, _callback, 0);
      +timer_setup(&_timer, _callback, 0);
      |
      -setup_timer(&_timer, _callback, 0L);
      +timer_setup(&_timer, _callback, 0);
      |
      -setup_timer(&_timer, _callback, 0UL);
      +timer_setup(&_timer, _callback, 0);
      |
      -setup_timer(_timer, _callback, 0);
      +timer_setup(_timer, _callback, 0);
      |
      -setup_timer(_timer, _callback, 0L);
      +timer_setup(_timer, _callback, 0);
      |
      -setup_timer(_timer, _callback, 0UL);
      +timer_setup(_timer, _callback, 0);
      )
      
      @change_callback_unused_data
       depends on match_timer_function_unused_data@
      identifier match_timer_function_unused_data._callback;
      type _origtype;
      identifier _origarg;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *unused
       )
       {
      	... when != _origarg
       }
      Signed-off-by: Kees Cook <keescook@chromium.org>
      e99e88a9
    • K
      treewide: init_timer() -> setup_timer() · b9eaf187
      Kees Cook committed
      This mechanically converts all remaining cases of ancient open-coded timer
      setup to the old setup_timer() API, which is the first step in timer
      conversions. This has no behavioral changes, since it ultimately just
      changes the order of assignment to fields of struct timer_list when
      finding variations of:
      
          init_timer(&t);
          t.function = timer_callback;
          t.data = timer_callback_arg;
      
      to be converted into:
      
          setup_timer(&t, timer_callback, timer_callback_arg);
      
      The conversion is done with the following Coccinelle script, which
      is an improved version of scripts/cocci/api/setup_timer.cocci, in the
      following ways:
       - handles assignments-before-init_timer() cases
       - limits the .data case removal to the specific struct timer_list instance
       - handles calls by dereference (timer->field vs timer.field)
      
      spatch --very-quiet --all-includes --include-headers \
      	-I ./arch/x86/include -I ./arch/x86/include/generated \
      	-I ./include -I ./arch/x86/include/uapi \
      	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
      	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
      	--dir . \
      	--cocci-file ~/src/data/setup_timer.cocci
      
      @fix_address_of@
      expression e;
      @@
      
       init_timer(
      -&(e)
      +&e
       , ...)
      
      // Match the common cases first to avoid Coccinelle parsing loops with
      // "... when" clauses.
      
      @match_immediate_function_data_after_init_timer@
      expression e, func, da;
      @@
      
      -init_timer
      +setup_timer
       ( \(&e\|e\)
      +, func, da
       );
      (
      -\(e.function\|e->function\) = func;
      -\(e.data\|e->data\) = da;
      |
      -\(e.data\|e->data\) = da;
      -\(e.function\|e->function\) = func;
      )
      
      @match_immediate_function_data_before_init_timer@
      expression e, func, da;
      @@
      
      (
      -\(e.function\|e->function\) = func;
      -\(e.data\|e->data\) = da;
      |
      -\(e.data\|e->data\) = da;
      -\(e.function\|e->function\) = func;
      )
      -init_timer
      +setup_timer
       ( \(&e\|e\)
      +, func, da
       );
      
      @match_function_and_data_after_init_timer@
      expression e, e2, e3, e4, e5, func, da;
      @@
      
      -init_timer
      +setup_timer
       ( \(&e\|e\)
      +, func, da
       );
       ... when != func = e2
           when != da = e3
      (
      -e.function = func;
      ... when != da = e4
      -e.data = da;
      |
      -e->function = func;
      ... when != da = e4
      -e->data = da;
      |
      -e.data = da;
      ... when != func = e5
      -e.function = func;
      |
      -e->data = da;
      ... when != func = e5
      -e->function = func;
      )
      
      @match_function_and_data_before_init_timer@
      expression e, e2, e3, e4, e5, func, da;
      @@
      (
      -e.function = func;
      ... when != da = e4
      -e.data = da;
      |
      -e->function = func;
      ... when != da = e4
      -e->data = da;
      |
      -e.data = da;
      ... when != func = e5
      -e.function = func;
      |
      -e->data = da;
      ... when != func = e5
      -e->function = func;
      )
      ... when != func = e2
          when != da = e3
      -init_timer
      +setup_timer
       ( \(&e\|e\)
      +, func, da
       );
      
      @r1 exists@
      expression t;
      identifier f;
      position p;
      @@
      
      f(...) { ... when any
        init_timer@p(\(&t\|t\))
        ... when any
      }
      
      @r2 exists@
      expression r1.t;
      identifier g != r1.f;
      expression e8;
      @@
      
      g(...) { ... when any
        \(t.data\|t->data\) = e8
        ... when any
      }
      
      // It is dangerous to use setup_timer if data field is initialized
      // in another function.
      @script:python depends on r2@
      p << r1.p;
      @@
      
      cocci.include_match(False)
      
      @r3@
      expression r1.t, func, e7;
      position r1.p;
      @@
      
      (
      -init_timer@p(&t);
      +setup_timer(&t, func, 0UL);
      ... when != func = e7
      -t.function = func;
      |
      -t.function = func;
      ... when != func = e7
      -init_timer@p(&t);
      +setup_timer(&t, func, 0UL);
      |
      -init_timer@p(t);
      +setup_timer(t, func, 0UL);
      ... when != func = e7
      -t->function = func;
      |
      -t->function = func;
      ... when != func = e7
      -init_timer@p(t);
      +setup_timer(t, func, 0UL);
      )
      Signed-off-by: Kees Cook <keescook@chromium.org>
      b9eaf187
  9. 14 Nov, 2017 1 commit
  10. 13 Nov, 2017 1 commit
  11. 12 Nov, 2017 2 commits
    • D
      timers: Add a function to start/reduce a timer · b24591e2
      David Howells committed
      Add a function, similar to mod_timer(), that will start a timer if it isn't
      running and will modify it if it is running and has an expiry time longer
      than the new time.  If the timer is running with an expiry time that's the
      same or sooner, no change is made.
      
      The function looks like:
      
      	int timer_reduce(struct timer_list *timer, unsigned long expires);
      
      This can be used by code such as networking code to make it easier to share
      a timer for multiple timeouts.  For instance, in upcoming AF_RXRPC code,
      the rxrpc_call struct will maintain a number of timeouts:
      
      	unsigned long	ack_at;
      	unsigned long	resend_at;
      	unsigned long	ping_at;
      	unsigned long	expect_rx_by;
      	unsigned long	expect_req_by;
      	unsigned long	expect_term_by;
      
      each of which is set independently of the others.  With timer reduction
      available, when the code needs to set one of the timeouts, it only needs to
      look at that timeout and then call timer_reduce() to modify the timer,
      starting it or bringing it forward if necessary.  There is no need to refer
      to the other timeouts to see which is earliest and no need to take any lock
      other than, potentially, the timer lock inside timer_reduce().
      
      Note that this does not protect against concurrent invocations of any of
      the timer functions.
      
      As an example, the expect_rx_by timeout above, which terminates a call if
      we don't get a packet from the server within a certain time window, would
      be set something like this:
      
      	unsigned long now = jiffies;
      	unsigned long expect_rx_by = now + packet_receive_timeout;
      	WRITE_ONCE(call->expect_rx_by, expect_rx_by);
      	timer_reduce(&call->timer, expect_rx_by);
      
      The timer service code (which might, say, be in a work function) would then
      check all the timeouts to see which, if any, had triggered, and deal with
      those:
      
      	t = READ_ONCE(call->ack_at);
      	if (time_after_eq(now, t)) {
      		cmpxchg(&call->ack_at, t, now + MAX_JIFFY_OFFSET);
      		set_bit(RXRPC_CALL_EV_ACK, &call->events);
      	}
      
      and then restart the timer if necessary by finding the soonest timeout that
      hasn't yet passed and then calling timer_reduce().
      
      The disadvantage of doing things this way rather than comparing the timers
      each time and calling mod_timer() is that you *will* take timer events
      unless you can finish what you're doing and delete the timer in time.
      
      The advantage of doing things this way is that you don't need to use a lock
      to work out when the next timer should be set, other than the timer's own
      lock - which you might not have to take.
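
      The start-or-reduce rule can be modelled in user space as a minimal
      sketch. struct model_timer and both function names are hypothetical, no
      locking is modelled, and the return convention (0 if the timer was
      inactive, 1 if it was already pending) is an assumption borrowed from
      mod_timer().

```c
#include <stdbool.h>

/* Wraparound-safe "a is before b" for jiffies-style counters. */
bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical stand-in for a timer: only the state the rule needs. */
struct model_timer {
	bool pending;           /* is the timer currently queued? */
	unsigned long expires;  /* expiry time in jiffies, valid if pending */
};

/*
 * Model of the rule described above:
 *  - not running:                      start it at 'expires'
 *  - running, expiry later than new:   bring it forward to 'expires'
 *  - running, expiry same or sooner:   leave it alone
 */
int model_timer_reduce(struct model_timer *timer, unsigned long expires)
{
	if (!timer->pending) {
		timer->pending = true;
		timer->expires = expires;
		return 0;
	}
	if (model_time_before(expires, timer->expires))
		timer->expires = expires;
	return 1;
}
```

      With this rule a caller that sets one timeout only passes that timeout
      to the reduce function; the timer ends up armed for the earliest value
      requested so far, and the service code recomputes the next expiry after
      it has run.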
      
      [ tglx: Fixed weird formatting and adapted it to pending changes ]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: keyrings@vger.kernel.org
      Cc: linux-afs@lists.infradead.org
      Link: https://lkml.kernel.org/r/151023090769.23050.1801643667223880753.stgit@warthog.procyon.org.uk
      b24591e2
    • A
      pstore: Use ktime_get_real_fast_ns() instead of __getnstimeofday() · df27067e
      Arnd Bergmann committed
      __getnstimeofday() is a rather odd interface, with a number of quirks:
      
      - The caller may come from NMI context, but the implementation is not NMI
        safe. One way to get there from NMI is:
      
            NMI handler:
              something bad
                panic()
                  kmsg_dump()
                    pstore_dump()
                       pstore_record_init()
                         __getnstimeofday()
      
      - The calling conventions are different from any other timekeeping functions,
        to deal with returning an error code during suspended timekeeping.
      
      Address the above issues by using a completely different method to get the
      time: ktime_get_real_fast_ns() is NMI safe and has a reasonable behavior
      when timekeeping is suspended: it returns the time at which it got
      suspended. As Thomas Gleixner explained, this is safe, as
      ktime_get_real_fast_ns() does not call into the clocksource driver that
      might be suspended.
      
      The result can easily be transformed into a timespec structure. Since
      ktime_get_real_fast_ns() was not exported to modules, add the export.
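
      The nanosecond-to-timespec transformation mentioned above amounts to a
      division and a remainder. A minimal user-space sketch follows; the
      helper name and the constant are hypothetical and chosen for this
      example, and the input is assumed to be an unsigned 64-bit nanosecond
      count.

```c
#include <stdint.h>
#include <time.h>

#define MODEL_NSEC_PER_SEC 1000000000ULL

/* Split a 64-bit nanosecond count into whole seconds and leftover ns. */
struct timespec model_ns_to_timespec(uint64_t ns)
{
	struct timespec ts;

	ts.tv_sec = (time_t)(ns / MODEL_NSEC_PER_SEC);
	ts.tv_nsec = (long)(ns % MODEL_NSEC_PER_SEC);
	return ts;
}
```

      For an input of 1500000000 ns the helper yields tv_sec = 1 and
      tv_nsec = 500000000. On systems where time_t is 32 bits wide the
      seconds field still overflows in 2038, which this sketch does not
      address.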
      
      The pstore behavior for the suspended case changes slightly, as it now
      stores the timestamp at which timekeeping was suspended instead of storing
      a zero timestamp.
      
      This change is not addressing y2038-safety, that's subject to a more
      complex follow up patch.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Colin Cross <ccross@android.com>
      Link: https://lkml.kernel.org/r/20171110152530.1926955-1-arnd@arndb.de
      df27067e