1. 20 May, 2016 · 1 commit
  2. 05 May, 2016 · 1 commit
  3. 23 Apr, 2016 · 3 commits
    • sched/fair: Correctly handle nohz ticks CPU load accounting · 1f41906a
      Frederic Weisbecker committed
      Ticks can happen while the CPU is in dynticks-idle or dynticks-singletask
      mode. In fact "nohz" or "dynticks" only mean that we exit the periodic
      mode and we try to minimize the ticks as much as possible. The nohz
      subsystem uses a confusing terminology with the internal state
      "ts->tick_stopped" which is also available through its public interface
      with tick_nohz_tick_stopped(). This is a misnomer as the tick is instead
      reduced with the best effort rather than stopped. In the best case the
      tick can indeed be actually stopped but there is no guarantee about that.
      If a timer needs to fire one second later, a tick will fire while the
      CPU is in nohz mode and this is a very common scenario.
      
      Now this confusion happens to be a problem with CPU load updates:
      cpu_load_update_active() doesn't handle nohz ticks correctly because it
      assumes that ticks are completely stopped in nohz mode and that
      cpu_load_update_active() can't be called in dynticks mode. When that
      happens, the whole previous tickless load is ignored and the function
      just records the load for the current tick, ignoring potentially long
      idle periods behind.
      
      In order to solve this, we could account the current load for the
      previous nohz time but there is a risk that we account the load of a
      task that got freshly enqueued for the whole nohz period.
      
      So instead, let's record the dynticks load on nohz frame entry so we know
      what to record in case of nohz ticks, then use this record to account
      the tickless load on nohz ticks and nohz frame end.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1460555812-25375-3-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
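The idea described in this commit message can be illustrated with a minimal user-space sketch. It is a simplified model, not the kernel's code: the struct, the function names and the halving average are all assumptions made for illustration. The load seen at nohz frame entry is recorded, the skipped ticks are later accounted with that recorded value, and only the tick that actually fires is accounted with the current load.

```c
#include <stdint.h>

/* Hypothetical per-CPU state; names are illustrative, not the kernel's. */
struct sketch_cpu_load {
    uint64_t load;          /* decayed load average */
    uint64_t tickless_load; /* load recorded when the nohz frame began */
};

/* Record the load seen at nohz frame entry. */
void sketch_nohz_enter(struct sketch_cpu_load *cl, uint64_t load_now)
{
    cl->tickless_load = load_now;
}

/*
 * On a nohz tick or at nohz frame end: fold in every skipped tick using
 * the recorded load, then fold in the current tick using the current load.
 * The average used here (halving) is an assumption of this sketch.
 */
void sketch_nohz_account(struct sketch_cpu_load *cl,
                         unsigned int missed_ticks, uint64_t load_now)
{
    for (unsigned int i = 0; i < missed_ticks; i++)
        cl->load = (cl->load + cl->tickless_load) / 2;
    cl->load = (cl->load + load_now) / 2;
}
```

In this model a task enqueued just before the tick only contributes to the last step, so its load is not charged to the whole idle period.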
    • sched/fair: Gather CPU load functions under a more conventional namespace · cee1afce
      Frederic Weisbecker committed
      The CPU load update related functions have a weak naming convention
      currently, starting with update_cpu_load_*() which isn't ideal as
      "update" is a very generic concept.
      
      Since two of these functions are public already (and a third is to come)
      that's enough to introduce a more conventional naming scheme. So let's
      do the following rename instead:
      
      	update_cpu_load_*() -> cpu_load_update_*()
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1460555812-25375-2-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • time: Introduce do_sys_settimeofday64() · 86d34732
      Baolin Wang committed
      The do_sys_settimeofday() function uses a timespec, which is not year
      2038 safe on 32bit systems.
      
      Thus this patch introduces do_sys_settimeofday64(), which allows us to
      transition users of do_sys_settimeofday() to using 64bit time types.
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
      [jstultz: Include errno-base.h to avoid build issue on some arches]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
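To make the year 2038 limit concrete, here is a small user-space sketch. The struct and function names are hypothetical and are not the kernel's types. It checks whether a 64-bit seconds value would still fit a signed 32-bit seconds field; the last representable second is 2147483647, which is 03:14:07 UTC on 19 January 2038.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-bit time value, in the spirit of a 64bit time type. */
struct sketch_timespec64 {
    int64_t tv_sec;
    long tv_nsec;
};

/* True if tv_sec still fits a signed 32-bit seconds field. */
bool sketch_fits_32bit_time(const struct sketch_timespec64 *ts)
{
    return ts->tv_sec >= INT32_MIN && ts->tv_sec <= INT32_MAX;
}
```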
  4. 29 Mar, 2016 · 1 commit
  5. 26 Mar, 2016 · 1 commit
  6. 23 Mar, 2016 · 1 commit
  7. 18 Mar, 2016 · 2 commits
    • param: convert some "on"/"off" users to strtobool · 4cc7ecb7
      Kees Cook committed
      This changes several users of manual "on"/"off" parsing to use
      strtobool.
      
      Some side-effects:
      - these uses will now parse y/n/1/0 meaningfully too
      - the early_param uses will now bubble up parse errors
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Amitkumar Karwar <akarwar@marvell.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Joe Perches <joe@perches.com>
      Cc: Kalle Valo <kvalo@codeaurora.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Nishant Sarmukadam <nishants@marvell.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Steve French <sfrench@samba.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
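The side effects listed in this commit message (accepting y/n/1/0 as well as "on"/"off", and reporting parse errors) can be sketched in user space as follows. This is an illustrative stand-in, not the kernel's strtobool(); the function name and the exact set of accepted spellings are assumptions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parser: returns 0 on success, -EINVAL on unparsable input. */
int sketch_parse_bool(const char *s, bool *res)
{
    if (s == NULL || res == NULL)
        return -EINVAL;

    switch (s[0]) {
    case 'y': case 'Y': case '1':
        *res = true;
        return 0;
    case 'n': case 'N': case '0':
        *res = false;
        return 0;
    case 'o': case 'O':
        /* "on" and "off" share a first letter, so check the second. */
        if (s[1] == 'n' || s[1] == 'N') {
            *res = true;
            return 0;
        }
        if (s[1] == 'f' || s[1] == 'F') {
            *res = false;
            return 0;
        }
        break;
    default:
        break;
    }
    return -EINVAL;
}
```

A caller that previously compared against "on" and silently ignored anything else would now see the error return and could pass it up.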
    • timer: convert timer_slack_ns from unsigned long to u64 · da8b44d5
      John Stultz committed
      This patchset introduces a /proc/<pid>/timerslack_ns interface which
      would allow controlling processes to be able to set the timerslack value
      on other processes in order to save power by avoiding wakeups (something
      Android currently does via out-of-tree patches).
      
      The first patch tries to fix the internal timer_slack_ns usage which was
      defined as a long, which limits the slack range to ~4 seconds on 32bit
      systems.  It converts it to a u64, which provides the same basically
      unlimited slack (500 years) on both 32bit and 64bit machines.
      
      The second patch introduces the /proc/<pid>/timerslack_ns interface
      which allows the full 64bit slack range for a task to be read or set on
      both 32bit and 64bit machines.
      
      With these two patches, on a 32bit machine, after setting the slack on
      bash to 10 seconds:
      
      $ time sleep 1
      
      real    0m10.747s
      user    0m0.001s
      sys     0m0.005s
      
      The first patch is a little ugly, since I had to chase the slack delta
      arguments through a number of functions converting them to u64s.  Let me
      know if it makes sense to break that up more or not.
      
      Other than that things are fairly straightforward.
      
      This patch (of 2):
      
      The timer_slack_ns value in the task struct is currently an unsigned
      long.  This means that on 32bit applications, the maximum slack is just
      over 4 seconds.  However, on 64bit machines, it's much, much larger (~500
      years).

      This disparity could make application development a little confusing,
      so this patch converts the timer_slack_ns value (as well as the
      default_slack) to a u64.  This means both 32bit and 64bit systems
      have the same effective internal slack range.

      Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specifies
      the interface as an unsigned long, so we preserve that limitation on
      32bit systems, where SET_TIMERSLACK can only set the slack to an unsigned
      long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is
      actually larger than what can be stored by an unsigned long.

      This patch also modifies hrtimer functions which specified the slack
      delta as an unsigned long.
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oren Laadan <orenl@cellrox.com>
      Cc: Ruchi Kandoi <kandoiruchi@google.com>
      Cc: Rom Lemarchand <romlem@android.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Android Kernel Team <kernel-team@android.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
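The ranges quoted in this commit message can be checked with a short sketch. The helper names are hypothetical, and a 32-bit unsigned long is assumed and modelled with uint32_t. The first helper mimics the clamping described for GET_TIMERSLACK on 32bit systems; the second converts nanoseconds to whole seconds so the "just over 4 seconds" figure can be reproduced.

```c
#include <stdint.h>

/* Assumption: unsigned long is 32 bits wide, as on a 32bit system. */
#define SKETCH_ULONG_MAX_32 UINT32_MAX

/* Value a GET_TIMERSLACK-style call would report for a 64-bit slack. */
uint32_t sketch_get_timerslack_32(uint64_t slack_ns)
{
    if (slack_ns > SKETCH_ULONG_MAX_32)
        return SKETCH_ULONG_MAX_32; /* too large to report exactly */
    return (uint32_t)slack_ns;
}

/* Whole seconds contained in a nanosecond count. */
uint64_t sketch_ns_to_sec(uint64_t ns)
{
    return ns / UINT64_C(1000000000);
}
```

With these helpers, the largest 32-bit value corresponds to 4 whole seconds, while the largest u64 value corresponds to several hundred years, the same order of magnitude as the figure in the message.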
  8. 08 Mar, 2016 · 1 commit
    • time/timekeeping: Work around false positive GCC warning · 6436257b
      Ingo Molnar committed
      Newer GCC versions trigger the following warning:
      
        kernel/time/timekeeping.c: In function ‘get_device_system_crosststamp’:
        kernel/time/timekeeping.c:987:5: warning: ‘clock_was_set_seq’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          if (discontinuity) {
           ^
        kernel/time/timekeeping.c:1045:15: note: ‘clock_was_set_seq’ was declared here
          unsigned int clock_was_set_seq;
                       ^
      
      GCC clearly is unable to recognize that the 'do_interp' boolean tracks
      the initialization status of 'clock_was_set_seq'.
      
      The GCC version used was:
      
        gcc version 5.3.1 20151207 (Red Hat 5.3.1-2) (GCC)
      
      Work around it by initializing clock_was_set_seq to 0. Compilers that
      are able to recognize the code flow will eliminate the unnecessary
      initialization.
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  9. 03 Mar, 2016 · 6 commits
    • hrtimer: Revert CLOCK_MONOTONIC_RAW support · 82e88ff1
      Thomas Gleixner committed
      Revert commits:
      a6e707dd: KVM: arm/arm64: timer: Switch to CLOCK_MONOTONIC_RAW
      9006a018: hrtimer: Catch illegal clockids
      9c808765: hrtimer: Add support for CLOCK_MONOTONIC_RAW
      
      Marc found out that there are fundamental issues with that patch series
      because __hrtimer_get_next_event() and hrtimer_forward() need support for
      CLOCK_MONOTONIC_RAW. This is not easily fixed, so revert the whole lot.
      Reported-by: Marc Zyngier <marc.zyngier@arm.com>
      Link: http://lkml.kernel.org/r/56D6CEF0.8060607@arm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • time: Add history to cross timestamp interface supporting slower devices · 2c756feb
      Christopher S. Hall committed
      Another representative use case of time sync and the correlated
      clocksource (in addition to PTP noted above) is PTP synchronized
      audio.
      
      In a streaming application, as an example, samples will be sent and/or
      received by multiple devices with a presentation time that is in terms
      of the PTP master clock. Synchronizing the audio output on these
      devices requires correlating the audio clock with the PTP master
      clock. The more precise this correlation is, the better the audio
      quality (i.e. out of sync audio sounds bad).
      
      From an application standpoint, to correlate the PTP master clock with
      the audio device clock, the system clock is used as an intermediate
      timebase. The transforms such an application would perform are:
      
          System Clock <-> Audio clock
          System Clock <-> Network Device Clock [<-> PTP Master Clock]
      
      Modern Intel platforms can perform a more accurate cross timestamp in
      hardware (ART, audio device clock).  The audio driver requires
      ART->system time transforms -- the same as required for the network
      driver. These platforms offload audio processing (including
      cross-timestamps) to a DSP which, to ensure uninterrupted audio
      processing, communicates with and responds to the host only once every
      millisecond. As a result, it takes up to a millisecond for the DSP to
      receive a request. The request is then processed by the DSP, the audio
      output hardware is polled for completion, the result is copied into
      shared memory, and the host is notified. All of these operations occur
      on a millisecond cadence.  This transaction requires about 2 ms, but
      under heavier workloads it may take up to 4 ms.
      
      Adding a history allows these slow devices the option of providing an
      ART value outside of the current interval. In this case, the callback
      provided is an accessor function for the previously obtained counter
      value. If get_device_system_crosststamp() receives a counter value
      previous to cycle_last, it consults the history provided as an
      argument in history_ref and interpolates the realtime and monotonic
      raw system time using the provided counter value. If there are any
      clock discontinuities, e.g. from calling settimeofday(), the monotonic
      raw time is interpolated in the usual way, but the realtime clock time
      is adjusted by scaling the monotonic raw adjustment.

      When an accessor function is used a history argument *must* be
      provided. The history is initialized using ktime_get_snapshot(), which
      must be called before the counter values are read.
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
      [jstultz: Fixed up cycles_t/cycle_t type confusion]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
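The interpolation mentioned in this commit message can be pictured with a simplified sketch. It is plain linear interpolation between two (counter, time) pairs and is not the kernel's implementation; the struct and function names are hypothetical, and the discontinuity handling described in the message is left out.

```c
#include <stdint.h>

/* A (counter, raw time) pair; hypothetical stand-in for a snapshot. */
struct sketch_snapshot {
    uint64_t cycles;
    uint64_t raw_ns;
};

/*
 * Linearly interpolate the raw time for a counter value lying between two
 * snapshots. Assumes start->cycles <= cycles <= end->cycles and that the
 * intermediate product fits in 64 bits.
 */
uint64_t sketch_interpolate_raw(const struct sketch_snapshot *start,
                                const struct sketch_snapshot *end,
                                uint64_t cycles)
{
    uint64_t total = end->cycles - start->cycles;
    uint64_t partial = cycles - start->cycles;
    uint64_t span_ns = end->raw_ns - start->raw_ns;

    if (total == 0)
        return start->raw_ns;
    return start->raw_ns + span_ns * partial / total;
}
```

A counter value a quarter of the way through the interval maps to a time a quarter of the way between the two recorded times.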
    • time: Add driver cross timestamp interface for higher precision time synchronization · 8006c245
      Christopher S. Hall committed
      ACKNOWLEDGMENT: cross timestamp code was developed by Thomas Gleixner
      <tglx@linutronix.de>. It has changed considerably and any mistakes are
      mine.
      
      The precision with which events on multiple networked systems can be
      synchronized using, as an example, PTP (IEEE 1588, 802.1AS) is limited
      by the precision of the cross timestamps between the system clock and
      the device (timestamp) clock. Precision here is the degree of
      simultaneity when capturing the cross timestamp.
      
      Currently the PTP cross timestamp is captured in software using the
      PTP device driver ioctl PTP_SYS_OFFSET. Reads of the device clock are
      interleaved with reads of the realtime clock. At best, the precision
      of this cross timestamp is on the order of several microseconds due to
      software latencies. Sub-microsecond precision is required for
      industrial control and some media applications. To achieve this level
      of precision hardware supported cross timestamping is needed.
      
      The function get_device_system_crosststamp() allows device drivers
      to return a cross timestamp with system time properly scaled to
      nanoseconds.  The realtime value is needed to discipline that clock
      using PTP and the monotonic raw value is used for applications that
      don't require a "real" time, but need an unadjusted clock time.  The
      get_device_system_crosststamp() code calls back into the driver to
      ensure that the system counter is within the current timekeeping
      update interval.
      
      Modern Intel hardware provides an Always Running Timer (ART) which is
      exactly related to TSC through a known frequency ratio. The ART is
      routed to devices on the system and is used to precisely and
      simultaneously capture the device clock with the ART.
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
      [jstultz: Reworked to remove extra structures and simplify calling]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
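The "known frequency ratio" between ART and TSC mentioned in this commit message can be sketched as a scaled conversion. The function name and its parameters (numerator, denominator, offset) are assumptions of this sketch rather than anything stated in the message; splitting the division into quotient and remainder is one way to keep the intermediate products small.

```c
#include <stdint.h>

/*
 * Convert an ART value to a TSC value as art * numerator / denominator
 * plus an offset. Assumes denominator != 0.
 */
uint64_t sketch_art_to_tsc(uint64_t art, uint32_t numerator,
                           uint32_t denominator, uint64_t offset)
{
    uint64_t quot = art / denominator;
    uint64_t rem = art % denominator;

    /* quot * num covers the whole multiples, rem * num / den the rest. */
    return quot * numerator + rem * numerator / denominator + offset;
}
```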
    • time: Remove duplicated code in ktime_get_raw_and_real() · ba26621e
      Christopher S. Hall committed
      The code in ktime_get_snapshot() is a superset of the code in
      ktime_get_raw_and_real(). Further, ktime_get_raw_and_real() is
      called only by the PPS code, pps_get_ts(). Consolidate the
      pps_get_ts() code into a single function calling ktime_get_snapshot()
      and eliminate ktime_get_raw_and_real(). A side effect of this is that
      the raw and real results of pps_get_ts() correspond to exactly the
      same clock cycle. Previously these values represented separate reads
      of the system clock.
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
    • time: Add timekeeping snapshot code capturing system time and counter · 9da0f49c
      Christopher S. Hall committed
      In the current timekeeping code there isn't any interface to
      atomically capture the current relationship between the system counter
      and system time. ktime_get_snapshot() returns this triple (counter,
      monotonic raw, realtime) in the system_time_snapshot struct.
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
      [jstultz: Moved structure definitions around to clean things up,
       fixed cycles_t/cycle_t confusion.]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
    • time: Add cycles to nanoseconds translation · 6bd58f09
      Christopher S. Hall committed
      The timekeeping code does not currently provide a way to translate
      externally provided clocksource cycles to system time. The cycle count
      is always provided by the result of the clocksource read() method
      internal to the timekeeping code. The added function
      timekeeping_cycles_to_ns() calculates a nanosecond value from a cycle
      count that can be added to the tk_read_base.base value, yielding the
      current system time. This allows clocksource cycle values external to
      the timekeeping code to provide a cycle count that can be transformed
      to system time.
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
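A cycles-to-nanoseconds translation of the kind described here is commonly done with a multiply and a shift. The following user-space sketch assumes that scheme; the struct layout and names are hypothetical and only loosely modelled on the read base mentioned in the message.

```c
#include <stdint.h>

/* Hypothetical subset of a timekeeping read base. */
struct sketch_read_base {
    uint64_t cycle_last; /* counter value at the last update */
    uint64_t mask;       /* counter width mask */
    uint32_t mult;       /* scaling multiplier */
    uint32_t shift;      /* scaling shift */
    uint64_t base_ns;    /* system time at the last update */
};

/* Translate an externally provided counter value to system time. */
uint64_t sketch_cycles_to_ns(const struct sketch_read_base *b,
                             uint64_t cycles)
{
    /* Masking makes the delta correct across a counter wrap. */
    uint64_t delta = (cycles - b->cycle_last) & b->mask;

    return b->base_ns + ((delta * b->mult) >> b->shift);
}
```

The second case in the test below exercises a 32-bit counter that wrapped between the last update and the provided value.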
  10. 02 Mar, 2016 · 6 commits
    • sched-clock: Migrate to use new tick dependency mask model · 4f49b90a
      Frederic Weisbecker committed
      Instead of checking sched_clock_stable from the nohz subsystem to verify
      its tick dependency, migrate it to the new mask in order to include it
      in the all-in-one check.
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • posix-cpu-timers: Migrate to use new tick dependency mask model · b7878300
      Frederic Weisbecker committed
      Instead of providing asynchronous checks for the nohz subsystem to verify
      posix cpu timers tick dependency, migrate the latter to the new mask.
      
      In order to keep track of the running timers and expose the tick
      dependency accordingly, we must probe the timers queuing and dequeuing
      on threads and process lists.
      
      Unfortunately it implies both task and signal level dependencies. We
      should be able to further optimize this and merge all that on the task
      level dependency, at the cost of a bit of complexity and maybe overhead.
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • sched: Migrate sched to use new tick dependency mask model · 76d92ac3
      Frederic Weisbecker committed
      Instead of providing asynchronous checks for the nohz subsystem to verify
      sched tick dependency, migrate sched to the new mask.
      
      Every time a task is enqueued or dequeued, we evaluate the state of the
      tick dependency on top of the policy of the tasks in the runqueue, by
      order of priority:
      
      SCHED_DEADLINE: Need the tick in order to periodically check for runtime
      SCHED_FIFO    : Don't need the tick (no round-robin)
      SCHED_RR      : Need the tick if more than 1 task of the same priority
                      for round robin (simplified with checking if more than
                      one SCHED_RR task no matter what priority).
      SCHED_NORMAL  : Need the tick if more than 1 task for round-robin.
      
      We could optimize that further with one flag per sched policy on the tick
      dependency mask and perform only the checks relevant to the policy
      concerned by an enqueue/dequeue operation.
      
      Since the checks aren't based on the current task anymore, we could get
      rid of the task switch hook but it's still needed for posix cpu
      timers.
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
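The per-policy rules listed in this commit message can be written down as a small decision function. This is a sketch that follows the order given in the message, not the kernel's function; the struct of per-policy task counts and all names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-runqueue task counts, one per scheduling policy. */
struct sketch_rq_counts {
    unsigned int nr_deadline;
    unsigned int nr_fifo;
    unsigned int nr_rr;
    unsigned int nr_normal;
};

/* Evaluate the policies by order of priority, as the message lists them. */
bool sketch_sched_can_stop_tick(const struct sketch_rq_counts *rq)
{
    if (rq->nr_deadline > 0)
        return false;           /* runtime must be checked periodically */
    if (rq->nr_fifo > 0)
        return true;            /* no round-robin among FIFO tasks */
    if (rq->nr_rr > 0)
        return rq->nr_rr == 1;  /* more than one RR task needs the tick */
    return rq->nr_normal <= 1;  /* more than one normal task needs it */
}
```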
    • perf: Migrate perf to use new tick dependency mask model · 555e0c1e
      Frederic Weisbecker committed
      Instead of providing asynchronous checks for the nohz subsystem to verify
      perf event tick dependency, migrate perf to the new mask.
      
      Perf needs the tick for two situations:
      
      1) Freq events. We could set the tick dependency when those are
      installed on a CPU context. But setting a global dependency on top of
      the global freq events accounting is much easier. If people want that
      to be optimized, we can still refine that on the per-CPU tick dependency
      level. This patch doesn't change the current behaviour anyway.

      2) Throttled events: this is a per-cpu dependency.
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • nohz: Use enum code for tick stop failure tracing message · e6e6cc22
      Frederic Weisbecker committed
      It makes nohz tracing more lightweight, standard and easier to parse.
      
      Examples:
      
             user_loop-2904  [007] d..1   517.701126: tick_stop: success=1 dependency=NONE
             user_loop-2904  [007] dn.1   518.021181: tick_stop: success=0 dependency=SCHED
          posix_timers-6142  [007] d..1  1739.027400: tick_stop: success=0 dependency=POSIX_TIMER
             user_loop-5463  [007] dN.1  1185.931939: tick_stop: success=0 dependency=PERF_EVENTS
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • nohz: New tick dependency mask · d027d45d
      Frederic Weisbecker committed
      The tick dependency is evaluated on every IRQ and context switch. This
      consists of a batch of checks which determine whether it is safe to
      stop the tick or not. These checks are often split into many details:
      posix cpu timers, scheduler, sched clock, perf events... each of which
      is made of smaller details: posix cpu timers involve checking process
      wide timers then thread wide timers. Perf involves checking freq events
      then more per cpu details.

      Checking this information asynchronously every time we update the full
      dynticks state brings avoidable overhead and a messy layout.
      
      Let's introduce instead tick dependency masks: one for system wide
      dependency (unstable sched clock, freq based perf events), one for CPU
      wide dependency (sched, throttling perf events), and task/signal level
      dependencies (posix cpu timers). The subsystems are responsible
      for setting and clearing their dependency through a set of APIs that will
      take care of concurrent dependency mask modifications and kick targets
      to restart the relevant CPU tick whenever needed.
      
      This new dependency engine stays beside the old one until all subsystems
      having a tick dependency are converted to it.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
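The dependency mask idea can be sketched with C11 atomics. The names, the bit layout and the "kick" convention below are assumptions of this sketch and not the kernel's API. Each subsystem sets or clears its own bit, and the setter learns whether the mask went from empty to non-empty, which is the moment a stopped tick would need to be restarted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical dependency bits, one per subsystem. */
enum sketch_tick_dep {
    SKETCH_DEP_POSIX_TIMER    = 1u << 0,
    SKETCH_DEP_PERF_EVENTS    = 1u << 1,
    SKETCH_DEP_SCHED          = 1u << 2,
    SKETCH_DEP_CLOCK_UNSTABLE = 1u << 3,
};

/*
 * Set a dependency bit. Returns true when the mask was empty before,
 * i.e. when the caller would have to kick the target to restart its tick.
 */
bool sketch_tick_dep_set(atomic_uint *mask, unsigned int bit)
{
    unsigned int prev = atomic_fetch_or(mask, bit);

    return prev == 0;
}

/* Clear a dependency bit. */
void sketch_tick_dep_clear(atomic_uint *mask, unsigned int bit)
{
    atomic_fetch_and(mask, ~bit);
}

/* The tick may stop only when no dependency is set. */
bool sketch_can_stop_tick(atomic_uint *mask)
{
    return atomic_load(mask) == 0;
}
```

The check for stopping the tick then becomes a single load of the mask instead of a series of asynchronous per-subsystem checks.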
  11. 29 Feb, 2016 · 1 commit
    • Handle ISO 8601 leap seconds and encodings of midnight in mktime64() · ede5147d
      David Howells committed
      Handle the following ISO 8601 features in mktime64():
      
       (1) Leap seconds.
      
           Leap seconds are indicated by the seconds parameter being the value
           60.  Handle this by treating it the same as 00 of the following
           minute.
      
           It has been pointed out that a minute may contain two leap seconds.
           However, pending discussion of what that looks like and how to handle
           it, I'm not going to concern myself with it.
      
       (2) Alternate encodings of midnight.
      
           Two different encodings of midnight are permitted - 00:00:00 and
           24:00:00 - the first is midnight today and the second is midnight
           tomorrow and is exactly equivalent to the first with tomorrow's date.
      
      As it happens, we don't actually need to change mktime64() to handle either
      of these - just comment them as valid parameters.
      
      This facility will be used by the X.509 parser.  Doing it in mktime64()
      makes the policy common to the whole kernel and easier to find.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      cc: John Stultz <john.stultz@linaro.org>
      cc: Rudolf Polzer <rpolzer@google.com>
      cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
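Why no code change is needed can be seen from a user-space sketch of this kind of date-to-seconds arithmetic. The function below is an illustrative reimplementation under the name sketch_mktime64(), not the kernel source. Because hours, minutes and seconds are simply accumulated, a seconds value of 60 lands on 00 of the following minute and 24:00:00 lands on 00:00:00 of the following day.

```c
#include <stdint.h>

/* Seconds since 1970-01-01 00:00:00 for a Gregorian date and time. */
int64_t sketch_mktime64(unsigned int year, unsigned int mon,
                        unsigned int day, unsigned int hour,
                        unsigned int min, unsigned int sec)
{
    /* Start the year in March so that February, with its leap day,
     * comes last: months 1..12 become 11, 12, 1..10. */
    if (mon <= 2) {
        mon += 10;
        year -= 1;
    } else {
        mon -= 2;
    }

    int64_t days = (int64_t)(year / 4 - year / 100 + year / 400 +
                             367 * mon / 12 + day) +
                   (int64_t)year * 365 - 719499;

    /* Out-of-range hour or sec values carry over naturally here. */
    return ((days * 24 + hour) * 60 + min) * 60 + sec;
}
```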
  12. 27 Feb, 2016 · 1 commit
  13. 15 Feb, 2016 · 1 commit
  14. 13 Feb, 2016 · 1 commit
    • nohz: Implement wide kick on top of irq work · 8537bb95
      Frederic Weisbecker committed
      It simplifies it and allows wide kick to be performed, even when IRQs
      are disabled, without an asynchronous level in the middle.
      
      This comes at a cost of some more overhead on features like perf and
      posix cpu timers slow-paths, which is probably not very important
      for nohz full users.
      Requested-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  15. 27 Jan, 2016 · 4 commits
  16. 26 Jan, 2016 · 1 commit
    • tick/sched: Hide unused oneshot timer code · 7809998a
      Arnd Bergmann committed
      A couple of functions in kernel/time/tick-sched.c are only
      relevant for oneshot timer mode, i.e. when hires-timers or
      nohz mode are enabled. If both are disabled, we get gcc warnings
      about them:
      
      kernel/time/tick-sched.c:98:16: warning: 'tick_init_jiffy_update' defined but not used [-Wunused-function]
       static ktime_t tick_init_jiffy_update(void)
                      ^
      kernel/time/tick-sched.c:112:13: warning: 'tick_sched_do_timer' defined but not used [-Wunused-function]
       static void tick_sched_do_timer(ktime_t now)
                   ^
      kernel/time/tick-sched.c:134:13: warning: 'tick_sched_handle' defined but not used [-Wunused-function]
       static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
                   ^
      
      This encloses the whole set of functions in an appropriate ifdef
      to avoid the warning and to make it clearer when they are used.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1453736525-1959191-1-git-send-email-arnd@arndb.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  17. 22 Jan, 2016 · 1 commit
  18. 17 Jan, 2016 · 3 commits
  19. 16 Jan, 2016 · 1 commit
  20. 29 Dec, 2015 · 1 commit
  21. 19 Dec, 2015 · 1 commit
  22. 17 Dec, 2015 · 1 commit
    • timekeeping: Cap adjustments so they don't exceed the maxadj value · ec02b076
      John Stultz committed
      It has been occasionally noted that users have seen
      confusing warnings like:
      
          Adjusting tsc more than 11% (5941981 vs 7759439)
      
      We try to limit the maximum total adjustment to 11% (10% tick
      adjustment + 0.5% frequency adjustment). But this is done by
      bounding the requested adjustment values, and the internal
      steering that is done by tracking the error from what was
      requested and what was applied, does not have any such limits.
      
      This is usually not problematic, but in some cases has a risk
      that an adjustment could cause the clocksource mult value to
      overflow, so it's an indication things are outside of what is
      expected.

      It turns out most of the reports of this 11% warning are on systems
      using chrony, which utilizes the adjtimex() ADJ_TICK interface
      (which allows a +-10% adjustment). The original rationale for
      ADJ_TICK is unclear to me, but my assumption is that it was originally
      added to allow broken systems to get a big constant correction at boot
      (see adjtimex userspace package for an example) which would allow
      the system to work w/ ntpd's 0.5% adjustment limit.

      Chrony uses ADJ_TICK to make very aggressive short term corrections
      (usually right at startup), which push us close enough to the max
      bound that a few late ticks can cause the internal steering to push
      past the max adjust value (tripping the warning).
      
      Thus this patch adds some extra logic to enforce the max adjustment
      cap in the internal steering.
      
      Note: This has the potential to slow corrections when the ADJ_TICK
      value is furthest away from the default value. So it would be good to
      get some testing from folks using chrony, to make sure we don't
      cause any troubles there.
      
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Tested-by: Miroslav Lichvar <mlichvar@redhat.com>
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
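The effect of enforcing a cap can be illustrated with a generic clamp. This is a sketch of the general idea and not the logic the patch adds to the internal steering; the function name and the numbers used with it are illustrative assumptions.

```c
#include <stdint.h>

/*
 * Clamp a requested multiplier to [base_mult - maxadj, base_mult + maxadj].
 * Assumes maxadj <= base_mult and base_mult + maxadj <= UINT32_MAX.
 */
uint32_t sketch_cap_mult(uint32_t base_mult, uint32_t maxadj,
                         int64_t requested_mult)
{
    int64_t lo = (int64_t)base_mult - maxadj;
    int64_t hi = (int64_t)base_mult + maxadj;

    if (requested_mult < lo)
        return (uint32_t)lo;
    if (requested_mult > hi)
        return (uint32_t)hi;
    return (uint32_t)requested_mult;
}
```

With a base of 1000000 and an 11% bound of 110000, a request of 1200000 is held at 1110000, while a request inside the bound passes through unchanged.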