1. 26 Apr, 2018 1 commit
    • Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME · a3ed0e43
      Thomas Gleixner committed
      Revert commits
      
      92af4dcb ("tracing: Unify the "boot" and "mono" tracing clocks")
      127bfa5f ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
      7250a404 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
      d6c7270e ("timekeeping: Remove boot time specific code")
      f2d6fdbf ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
      d6ed449a ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
      72199320 ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")
      
      As stated in the pull request for the unification of CLOCK_MONOTONIC and
      CLOCK_BOOTTIME, it was clear that we might have to revert the change.
      
      As reported by several folks, systemd and other applications rely on the
      documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
      changes. After resume, daemons time out and other timeout-related issues are
      observed. Rafael compiled this list:
      
      * systemd kills daemons on resume, after >WatchdogSec seconds
        of suspending (Genki Sky).  [Verified that that's because systemd uses
        CLOCK_MONOTONIC and expects it to not include the suspend time.]
      
      * systemd-journald misbehaves after resume:
        systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
        corrupted or uncleanly shut down, renaming and replacing.
        (Mike Galbraith).
      
      * NetworkManager reports "networking disabled" and networking is broken
        after resume 50% of the time (Pavel).  [May be because of systemd.]
      
      * MATE desktop dims the display and starts the screensaver right after
        system resume (Pavel).
      
      * Full system hang during resume (me).  [May be due to systemd or NM or both.]
      
      That happens on Debian and openSUSE systems.
      
      It's sad that these problems were caught neither in -next nor by those
      folks who expressed interest in this change.
      Reported-by: Rafael J. Wysocki <rjw@rjwysocki.net>
      Reported-by: Genki Sky <sky@genki.is>
      Reported-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      a3ed0e43
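
The difference this revert restores can be observed from user space. The following is a minimal sketch, assuming a Linux system where both clocks are available through `clock_gettime()`; the helper names `clock_ns()` and `suspended_ns()` are made up for illustration and are not part of any kernel or libc API.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

/* Read a POSIX clock and return its value in nanoseconds, or -1 on error. */
int64_t clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Rough estimate of the time spent suspended since boot.
 * CLOCK_BOOTTIME keeps counting during suspend, while CLOCK_MONOTONIC
 * (with the documented behaviour restored by the revert) does not.
 * Returns -1 if either clock cannot be read.
 */
int64_t suspended_ns(void)
{
    int64_t mono = clock_ns(CLOCK_MONOTONIC);
    int64_t boot = clock_ns(CLOCK_BOOTTIME);

    if (mono < 0 || boot < 0)
        return -1;
    return boot - mono;
}
```

A watchdog that measures its timeout with `CLOCK_MONOTONIC` therefore does not see the suspend interval, which is the behaviour systemd was reported to depend on.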
  2. 08 Apr, 2018 1 commit
  3. 13 Mar, 2018 1 commit
    • hrtimer: Unify MONOTONIC and BOOTTIME clock behavior · 127bfa5f
      Thomas Gleixner committed
      Now that the MONOTONIC and BOOTTIME clocks are identical, remove all the special
      casing.
      
      The user space visible interfaces still support both clocks, but their behavior
      is identical.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20180301165150.410218515@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      127bfa5f
  4. 16 Jan, 2018 13 commits
    • hrtimer: Implement support for softirq based hrtimers · 5da70160
      Anna-Maria Gleixner committed
      hrtimer callbacks are always invoked in hard interrupt context. Several
      users in tree require soft interrupt context for their callbacks and
      achieve this by combining a hrtimer with a tasklet. The hrtimer schedules
      the tasklet in hard interrupt context and the tasklet callback gets invoked
      in softirq context later.
      
      That's suboptimal, and aside from that the real-time patch moves most of the
      hrtimers into softirq context. So adding native support for hrtimers
      expiring in softirq context is a valuable extension for both mainline and
      the RT patch set.
      
      Each valid hrtimer clock id has two associated hrtimer clock bases: one for
      timers expiring in hardirq context and one for timers expiring in softirq
      context.
      
      Implement the functionality to associate a hrtimer with the hard or softirq
      related clock bases and update the relevant functions to take them into
      account when the next expiry time needs to be evaluated.
      
      Add a check into the hard interrupt context handler functions to check
      whether the first expiring softirq based timer has expired. If it's expired
      the softirq is raised and the accounting of softirq based timers to
      evaluate the next expiry time for programming the timer hardware is skipped
      until the softirq processing has finished. At the end of the softirq
      processing the regular processing is resumed.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-29-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5da70160
    • hrtimer: Add clock bases and hrtimer mode for softirq context · 98ecadd4
      Anna-Maria Gleixner committed
      Currently hrtimer callback functions are always executed in hard interrupt
      context. Users of hrtimers, which need their timer function to be executed
      in soft interrupt context, make use of tasklets to get the proper context.
      
      Add additional hrtimer clock bases for timers which must expire in softirq
      context, so the detour via the tasklet can be avoided. This is also
      required for RT, where the majority of hrtimers are moved into softirq
      hrtimer context.
      
      The selection of the expiry mode happens via a mode bit. Introduce
      HRTIMER_MODE_SOFT and the matching combinations with the ABS/REL/PINNED
      bits and update the decoding of hrtimer_mode in tracepoints.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-27-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      98ecadd4
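
The idea of a mode bit selecting between a hard and a soft clock base per clock ID can be sketched in plain C. This is an illustration only, not the kernel code: the `SKETCH_` and `BASE_` names, the bit values, and the layout with the soft bases mirrored after the hard bases are assumptions chosen for the example.

```c
/* Illustrative mode bits; the names and values are assumptions of this sketch. */
enum {
    SKETCH_MODE_ABS    = 0x00,
    SKETCH_MODE_REL    = 0x01,
    SKETCH_MODE_PINNED = 0x02,
    SKETCH_MODE_SOFT   = 0x04,  /* callback should run in softirq context */
};

/* Hypothetical base layout: hard bases first, soft bases mirrored after them. */
enum sketch_base {
    BASE_MONOTONIC,
    BASE_REALTIME,
    BASE_BOOTTIME,
    BASE_TAI,
    BASE_MONOTONIC_SOFT,
    BASE_REALTIME_SOFT,
    BASE_BOOTTIME_SOFT,
    BASE_TAI_SOFT,
    BASE_MAX
};

/*
 * Map a hard clock index (BASE_MONOTONIC .. BASE_TAI) and a mode to the
 * base the timer is queued on. The SOFT bit shifts the index into the
 * second half of the table.
 */
int select_base(int clock_index, unsigned int mode)
{
    int offset = (mode & SKETCH_MODE_SOFT) ? BASE_MAX / 2 : 0;

    return offset + clock_index;
}
```

With this layout a caller only has to OR the soft bit into the mode it already passes, for example `select_base(BASE_REALTIME, SKETCH_MODE_REL | SKETCH_MODE_SOFT)`, and no tasklet detour is needed.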
    • hrtimer: Make hrtimer_reprogram() unconditional · 11a9fe06
      Anna-Maria Gleixner committed
      hrtimer_reprogram() needs to be available unconditionally for softirq based
      hrtimers. Move the function and all required struct members out of the
      CONFIG_HIGH_RES_TIMERS #ifdef.
      
      There is no functional change because hrtimer_reprogram() is only invoked
      when hrtimer_cpu_base.hres_active is true. Making it unconditional
      increases the text size for the CONFIG_HIGH_RES_TIMERS=n case, but avoids
      replication of that code for the upcoming softirq based hrtimers support.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-18-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      11a9fe06
    • hrtimer: Make hrtimer_cpu_base.next_timer handling unconditional · eb27926b
      Anna-Maria Gleixner committed
      hrtimer_cpu_base.next_timer stores the pointer to the next expiring timer
      in a CPU base.
      
      This pointer cannot be dereferenced and is solely used to check whether a
      hrtimer which is removed is the hrtimer which is the first to expire in the
      CPU base. If this is the case, then the timer hardware needs to be
      reprogrammed to avoid an extra interrupt for nothing.
      
      Again, this is conditional functionality, but there is no compelling reason
      to make this conditional. As a preparation, hrtimer_cpu_base.next_timer
      needs to be available unconditionally.
      
      Aside from that, the upcoming support for softirq based hrtimers requires access
      to this pointer unconditionally as well, so our motivation is not entirely
      simplicity based.
      
      Make the update of hrtimer_cpu_base.next_timer unconditional and remove the
      #ifdef cruft. The impact on CONFIG_HIGH_RES_TIMERS=n && CONFIG_NOHZ=n is
      marginal as it's just a store on an already dirtied cacheline.
      
      No functional change.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-17-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      eb27926b
    • hrtimer: Make the remote enqueue check unconditional · 07a9a7ea
      Anna-Maria Gleixner committed
      hrtimer_cpu_base.expires_next is used to cache the next event armed in the
      timer hardware. The value is used to check whether an hrtimer can be
      enqueued remotely. If the new hrtimer is expiring before expires_next, then
      remote enqueue is not possible as the remote hrtimer hardware cannot be
      accessed for reprogramming to an earlier expiry time.
      
      The remote enqueue check is currently conditional on
      CONFIG_HIGH_RES_TIMERS=y and hrtimer_cpu_base.hres_active. There is no
      compelling reason to make this conditional.
      
      Move hrtimer_cpu_base.expires_next out of the CONFIG_HIGH_RES_TIMERS=y
      guarded area and remove the conditionals in hrtimer_check_target().
      
      The check is currently a NOOP for the CONFIG_HIGH_RES_TIMERS=n and the
      !hrtimer_cpu_base.hres_active case because in these cases nothing updates
      hrtimer_cpu_base.expires_next yet. This will be changed with later patches
      which further reduce the #ifdef zoo in this code.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-16-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      07a9a7ea
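
The remote enqueue rule described above reduces to a single comparison. The sketch below is a simplified stand-alone illustration, not the body of hrtimer_check_target(); the type and function names with a `_sketch` suffix are hypothetical, and clock base offsets are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t ktime_sketch_t;     /* nanoseconds, stand-in for ktime_t */

struct cpu_base_sketch {
    ktime_sketch_t expires_next;    /* next event armed in that CPU's timer hardware */
};

/*
 * Returns true when the timer must NOT be enqueued on the remote CPU:
 * it would expire before the event the remote hardware is armed for,
 * and that hardware cannot be reprogrammed from this CPU.
 */
bool remote_enqueue_refused(ktime_sketch_t expires,
                            const struct cpu_base_sketch *remote)
{
    return expires < remote->expires_next;
}
```

A timer that expires at or after `expires_next` can be queued remotely, because the remote CPU will reprogram its own hardware when it handles the already armed event.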
    • hrtimer: Make the hrtimer_cpu_base::hres_active field unconditional, to simplify the code · 28bfd18b
      Anna-Maria Gleixner committed
      The hrtimer_cpu_base::hres_active field depends on CONFIG_HIGH_RES_TIMERS=y
      currently, and all functions related to this member are conditional as well.
      
      To simplify the code make it unconditional and set it to zero during initialization.
      
      (This will also help with the upcoming softirq based hrtimers code.)
      
      The conditional code sections can be avoided by adding IS_ENABLED(HIGHRES)
      conditionals into common functions, which ensures dead code elimination.
      
      There is no functional change.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-14-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      28bfd18b
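
The pattern of replacing an #ifdef with a compile-time constant conditional can be illustrated outside the kernel. This is a sketch under assumptions: `SKETCH_HIGHRES_ENABLED` stands in for `IS_ENABLED(CONFIG_HIGH_RES_TIMERS)`, and the struct and function names are invented for the example.

```c
/* Stand-in for the kernel's IS_ENABLED() test; 0 means "configured out". */
#ifndef SKETCH_HIGHRES_ENABLED
#define SKETCH_HIGHRES_ENABLED 0
#endif

struct hres_base_sketch {
    unsigned int hres_active : 1;   /* always present, zero after initialization */
};

/*
 * The field exists in every configuration, so no #ifdef is needed here.
 * Because the condition is a compile-time constant, an optimizing compiler
 * can discard the load entirely when the feature is configured out.
 */
int hres_active_sketch(const struct hres_base_sketch *base)
{
    return SKETCH_HIGHRES_ENABLED ? base->hres_active : 0;
}
```

Callers use `hres_active_sketch()` unconditionally, and the configuration decision lives in one place instead of being spread across several #ifdef blocks.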
    • hrtimer: Make room in 'struct hrtimer_cpu_base' · da21c5a5
      Anna-Maria Gleixner committed
      The upcoming softirq based hrtimers support requires an additional field in
      the hrtimer_cpu_base struct, which would grow the struct size beyond a
      cache line.
      
      The hrtimer_cpu_base::nr_retries and ::nr_hangs members are solely
      used for diagnostic output and have no requirement to be 'unsigned int'.
      
      Make them 'unsigned short' to create room for the new struct member.
      
      No functional change.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-13-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      da21c5a5
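
The space argument can be checked with a small layout experiment. The structs below are a hedged sketch, not the real hrtimer_cpu_base: only the two counters from the commit text are modelled, and `new_member` is a hypothetical placeholder for the field the softirq support needs.

```c
/* Before: two diagnostic counters, each a full 'unsigned int'. */
struct stats_before {
    unsigned int nr_retries;
    unsigned int nr_hangs;
};

/* After: the counters shrink to 'unsigned short', freeing room for a new member. */
struct stats_after {
    unsigned short nr_retries;
    unsigned short nr_hangs;
    unsigned int   new_member;      /* hypothetical placeholder field */
};

/* On common ABIs the extra member fits without growing the struct. */
_Static_assert(sizeof(struct stats_after) <= sizeof(struct stats_before),
               "shrinking the counters should make room for the new member");
```

The static assertion holds where `short` is 2 bytes and `int` is 4 bytes with natural alignment, which is an assumption about the target ABI rather than a guarantee of the C standard.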
    • hrtimer: Store running timer in hrtimer_clock_base · 3f0b9e8e
      Anna-Maria Gleixner committed
      The pointer to the currently running timer is stored in hrtimer_cpu_base
      before the base lock is dropped and the callback is invoked.
      
      This results in two levels of indirections and the upcoming support for
      softirq based hrtimer requires splitting the "running" storage into soft
      and hard IRQ context expiry.
      
      Storing both in the cpu base would require conditionals in all code paths
      accessing that information.
      
      It's possible to have a per clock base sequence count and running pointer
      without changing the semantics of the related mechanisms because the timer
      base pointer cannot be changed while a timer is running the callback.
      
      Unfortunately this makes cpu_clock base larger than 32 bytes on 32-bit
      kernels. Instead of having huge gaps due to alignment, remove the alignment
      and let the compiler pack CPU base for 32-bit kernels. The resulting cache access
      patterns are fortunately not really different from the current
      behaviour. On 64-bit kernels the 64-byte alignment stays and the behaviour is
      unchanged. This was determined by analyzing the resulting layout and
      looking at the number of cache lines involved for the frequently used
      clocks.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-12-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3f0b9e8e
    • hrtimer: Clean up 'enum hrtimer_mode' · 19b51cb5
      Anna-Maria Gleixner committed
      It's not obvious that the HRTIMER_MODE variants are bit combinations,
      because all modes are hard coded constants currently.
      
      Change it so the bit meanings are clear; and use the symbols for creating
      modes which combine bits.
      
      While at it get rid of the ugly tail comments as well.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-8-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      19b51cb5
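
A sketch of what such a cleanup looks like: the base modes are defined as individual bits and the combined modes are built from those symbols instead of being hard-coded numbers. The enumerator names follow the commit text, but the numeric values and the helper functions are assumptions of this stand-alone example.

```c
enum hrtimer_mode {
    HRTIMER_MODE_ABS    = 0x00,     /* expiry is an absolute time */
    HRTIMER_MODE_REL    = 0x01,     /* expiry is relative to now */
    HRTIMER_MODE_PINNED = 0x02,     /* timer is bound to one CPU */

    /* Combined modes spelled out with the symbols, so the bits are visible. */
    HRTIMER_MODE_ABS_PINNED = HRTIMER_MODE_ABS | HRTIMER_MODE_PINNED,
    HRTIMER_MODE_REL_PINNED = HRTIMER_MODE_REL | HRTIMER_MODE_PINNED,
};

/* Hypothetical helpers showing how individual bits are tested. */
int mode_is_relative(enum hrtimer_mode mode)
{
    return (mode & HRTIMER_MODE_REL) != 0;
}

int mode_is_pinned(enum hrtimer_mode mode)
{
    return (mode & HRTIMER_MODE_PINNED) != 0;
}
```

Note that the absolute mode is the absence of the relative bit, so it is tested with `!mode_is_relative(mode)` rather than with a mask of its own.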
    • hrtimer: Fix hrtimer_start[_range_ns]() function descriptions · 6de6250c
      Anna-Maria Gleixner committed
      The hrtimer_start[_range_ns]() functions start a timer reliably on this CPU only
      when HRTIMER_MODE_PINNED is set.
      
      Furthermore the HRTIMER_MODE_PINNED mode is not considered when a hrtimer is initialized.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-6-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6de6250c
    • hrtimer: Clean up the 'int clock' parameter of schedule_hrtimeout_range_clock() · 90777713
      Anna-Maria Gleixner committed
      schedule_hrtimeout_range_clock() uses an 'int clock' parameter for the
      clock ID, instead of the customary predefined "clockid_t" type.
      
      In hrtimer coding style the canonical variable name for the clock ID is
      'clock_id', therefore change the name of the parameter here as well
      to make it all consistent.
      
      While at it, clean up the description for the 'clock_id' and 'mode'
      function parameters. The clock modes and the clock IDs are not
      restricted as the comment suggests.
      
      Fix the mode description as well for the callers of schedule_hrtimeout_range_clock().
      
      No functional changes intended.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-5-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      90777713
    • hrtimer: Fix kerneldoc syntax for 'struct hrtimer_cpu_base' · 1fbc78b3
      Anna-Maria Gleixner committed
      The '/**' sequence marks the start of a structure description. Add the
      missing second asterisk. While at it adapt the ordering of the struct
      members to the struct definition and document the purpose of
      expires_next more precisely.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-4-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1fbc78b3
    • hrtimer: Optimize the hrtimer code by using static keys for migration_enable/nohz_active · ae67bada
      Thomas Gleixner committed
      The hrtimer_cpu_base::migration_enable and ::nohz_active fields
      were originally introduced to avoid accessing global variables
      for these decisions.
      
      Still that results in a (cache hot) load and conditional branch,
      which can be avoided by using static keys.
      
      Implement it with static keys and optimize for the most critical
      case of high performance networking which tends to disable the
      timer migration functionality.
      
      No change in functionality.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1801142327490.2371@nanos
      Link: https://lkml.kernel.org/r/20171221104205.7269-2-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ae67bada
  5. 30 Jun, 2017 1 commit
  6. 14 Jun, 2017 4 commits
  7. 15 Apr, 2017 1 commit
  8. 18 Mar, 2017 1 commit
  9. 03 Mar, 2017 1 commit
  10. 10 Feb, 2017 1 commit
    • time: Remove CONFIG_TIMER_STATS · dfb4357d
      Kees Cook committed
      Currently CONFIG_TIMER_STATS exposes process information across namespaces:
      
      kernel/time/timer_list.c print_timer():
      
              SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
      
      /proc/timer_list:
      
       #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570
      
      Given that the tracer can give the same information, this patch entirely
      removes CONFIG_TIMER_STATS.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: John Stultz <john.stultz@linaro.org>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: linux-doc@vger.kernel.org
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Xing Gao <xgao01@email.wm.edu>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Jessica Frazelle <me@jessfraz.com>
      Cc: kernel-hardening@lists.openwall.com
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Michal Marek <mmarek@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-api@vger.kernel.org
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      dfb4357d
  11. 26 Dec, 2016 1 commit
    • ktime: Get rid of the union · 2456e855
      Thomas Gleixner committed
      ktime is a union because the initial implementation stored the time in
      scalar nanoseconds on 64-bit machines and in an endianness-optimized timespec
      variant for 32-bit machines. The Y2038 cleanup removed the timespec variant
      and switched everything to scalar nanoseconds. The union remained, but
      became completely pointless.
      
      Get rid of the union and just keep ktime_t as a simple typedef of type s64.
      
      The conversion was done with coccinelle and some manual mopping up.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      2456e855
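
What a scalar ktime looks like in practice can be shown with a user-space stand-in. This is a minimal sketch assuming `int64_t` in place of the kernel's `s64`; the `_sketch` names are invented here and do not correspond to the kernel helpers one to one.

```c
#include <stdint.h>

typedef int64_t ktime_sketch_t;     /* plain signed 64-bit nanosecond count */

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Build a time value from seconds and nanoseconds. */
ktime_sketch_t ktime_set_sketch(int64_t secs, long nsecs)
{
    return secs * NSEC_PER_SEC_SKETCH + nsecs;
}

/* With a scalar type, arithmetic and comparison are ordinary integer operations. */
ktime_sketch_t ktime_add_sketch(ktime_sketch_t a, ktime_sketch_t b)
{
    return a + b;
}

int ktime_before_sketch(ktime_sketch_t a, ktime_sketch_t b)
{
    return a < b;
}
```

Code that previously had to go through a union member to reach the nanosecond value can operate on the value directly, which is the kind of mechanical rewrite a coccinelle script handles well.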
  12. 15 Jul, 2016 1 commit
  13. 18 Mar, 2016 1 commit
    • timer: convert timer_slack_ns from unsigned long to u64 · da8b44d5
      John Stultz committed
      This patchset introduces a /proc/<pid>/timerslack_ns interface which
      would allow controlling processes to be able to set the timerslack value
      on other processes in order to save power by avoiding wakeups (Something
      Android currently does via out-of-tree patches).
      
      The first patch tries to fix the internal timer_slack_ns usage which was
      defined as a long, which limits the slack range to ~4 seconds on 32bit
      systems.  It converts it to a u64, which provides the same basically
      unlimited slack (500 years) on both 32bit and 64bit machines.
      
      The second patch introduces the /proc/<pid>/timerslack_ns interface
      which allows the full 64bit slack range for a task to be read or set on
      both 32bit and 64bit machines.
      
      With these two patches, on a 32bit machine, after setting the slack on
      bash to 10 seconds:
      
      $ time sleep 1
      
      real    0m10.747s
      user    0m0.001s
      sys     0m0.005s
      
      The first patch is a little ugly, since I had to chase the slack delta
      arguments through a number of functions converting them to u64s.  Let me
      know if it makes sense to break that up more or not.
      
      Other than that things are fairly straightforward.
      
      This patch (of 2):
      
      The timer_slack_ns value in the task struct is currently an unsigned
      long.  This means that on 32bit applications, the maximum slack is just
      over 4 seconds.  However, on 64bit machines, it's much, much larger (~500
      years).
      
      This disparity could make application development a little awkward, so
      convert the timer_slack_ns value (as well as the default_slack) to a u64.
      This means both 32bit and 64bit systems have the same effective internal
      slack range.
      
      Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specifies
      the interface as an unsigned long, so we preserve that limitation on
      32bit systems, where SET_TIMERSLACK can only set the slack to an unsigned
      long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is
      actually larger than what can be stored by an unsigned long.
      
      This patch also modifies hrtimer functions which specified the slack
      delta as an unsigned long.
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oren Laadan <orenl@cellrox.com>
      Cc: Ruchi Kandoi <kandoiruchi@google.com>
      Cc: Rom Lemarchand <romlem@android.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Android Kernel Team <kernel-team@android.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      da8b44d5
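
The ranges quoted in the message follow from simple arithmetic on nanosecond counts. The sketch below emulates a 32-bit `unsigned long` with `uint32_t` so that it behaves the same on any host; the function names are hypothetical and the clamping mirrors only what the commit text describes for GET_TIMERSLACK.

```c
#include <stdint.h>

/* Largest slack, in whole seconds, that fits in a nanosecond counter of the given maximum. */
uint64_t max_slack_seconds(uint64_t max_ns)
{
    return max_ns / 1000000000ULL;
}

/*
 * Value a 32-bit ABI would report for a 64-bit internal slack:
 * anything that does not fit is reported as the 32-bit maximum.
 */
uint32_t get_timerslack_32(uint64_t slack_ns)
{
    return slack_ns > UINT32_MAX ? UINT32_MAX : (uint32_t)slack_ns;
}
```

`max_slack_seconds(UINT32_MAX)` gives 4, matching the "just over 4 seconds" limit, while `max_slack_seconds(UINT64_MAX)` corresponds to several hundred years. A 10 second slack set through the /proc interface would read back as the clamped maximum through the 32-bit prctl path.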
  14. 03 Mar, 2016 1 commit
  15. 27 Jan, 2016 1 commit
    • hrtimer: Add support for CLOCK_MONOTONIC_RAW · 9c808765
      Marc Zyngier committed
      The KVM/ARM timer implementation arms a hrtimer when a vcpu is
      blocked (usually because it is waiting for an interrupt)
      while its timer is going to kick in the future.
      
      It is essential that this timer doesn't get adjusted, or the
      guest will end up being woken up at the wrong time (NTP running
      on the host seems to confuse the hell out of some guests).
      
      In order to allow this, let's add CLOCK_MONOTONIC_RAW support
      to hrtimer (it is so far only supported for posix timers). It also
      has the (limited) benefit of fixing de0421d5 ("mac80211_hwsim:
      shuffle code to prepare for dynamic radios"), which already uses
      this functionality without realizing it wasn't implemented (just being
      lucky...).
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Tomasz Nowicki <tn@semihalf.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Link: http://lkml.kernel.org/r/1452879670-16133-2-git-send-email-marc.zyngier@arm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      9c808765
  16. 17 Jan, 2016 1 commit
  17. 19 Jun, 2015 4 commits
    • timer: Minimize nohz off overhead · 683be13a
      Thomas Gleixner committed
      If nohz is disabled on the kernel command line the [hr]timer code
      still calls wake_up_nohz_cpu() and tick_nohz_full_cpu(), a pretty
      pointless exercise. Cache nohz_active in [hr]timer per cpu bases and
      avoid the overhead.
      
      Before:
        48.10%  hog       [.] main
        15.25%  [kernel]  [k] _raw_spin_lock_irqsave
         9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
         6.50%  [kernel]  [k] mod_timer
         6.44%  [kernel]  [k] lock_timer_base.isra.38
         3.87%  [kernel]  [k] detach_if_pending
         3.80%  [kernel]  [k] del_timer
         2.67%  [kernel]  [k] internal_add_timer
         1.33%  [kernel]  [k] __internal_add_timer
         0.73%  [kernel]  [k] timerfn
         0.54%  [kernel]  [k] wake_up_nohz_cpu
      
      After:
        48.73%  hog       [.] main
        15.36%  [kernel]  [k] _raw_spin_lock_irqsave
         9.77%  [kernel]  [k] _raw_spin_unlock_irqrestore
         6.61%  [kernel]  [k] lock_timer_base.isra.38
         6.42%  [kernel]  [k] mod_timer
         3.90%  [kernel]  [k] detach_if_pending
         3.76%  [kernel]  [k] del_timer
         2.41%  [kernel]  [k] internal_add_timer
         1.39%  [kernel]  [k] __internal_add_timer
         0.76%  [kernel]  [k] timerfn
      
      We probably should have a cached value for nohz full in the per cpu
      bases as well to avoid the cpumask check. The base cache line is hot
      already, the cpumask not necessarily.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Joonwoo Park <joonwoop@codeaurora.org>
      Cc: Wenbo Wang <wenbo.wang@memblaze.com>
      Link: http://lkml.kernel.org/r/20150526224512.207378134@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      683be13a
    • timer: Reduce timer migration overhead if disabled · bc7a34b8
      Thomas Gleixner committed
      Eric reported that the timer_migration sysctl is not really nice
      performance wise as it needs to check at every timer insertion whether
      the feature is enabled or not. Further the check does not live in the
      timer code, so we have an extra function call which checks an extra
      cache line to figure out that it is disabled.
      
      We can do better and store that information in the per cpu (hr)timer
      bases. I pondered to use a static key, but that's a nightmare to
      update from the nohz code and the timer base cache line is hot anyway
      when we select a timer base.
      
      The old logic enabled the timer migration unconditionally if
      CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
      line.
      
      With this modification, we start off with migration disabled. The user
      visible sysctl is still set to enabled. If the kernel switches to NOHZ,
      migration is enabled, unless the user disabled it via the sysctl
      prior to the switch. If nohz=off is on the kernel command line,
      migration stays disabled no matter what.
      
      Before:
        47.76%  hog       [.] main
        14.84%  [kernel]  [k] _raw_spin_lock_irqsave
         9.55%  [kernel]  [k] _raw_spin_unlock_irqrestore
         6.71%  [kernel]  [k] mod_timer
         6.24%  [kernel]  [k] lock_timer_base.isra.38
         3.76%  [kernel]  [k] detach_if_pending
         3.71%  [kernel]  [k] del_timer
         2.50%  [kernel]  [k] internal_add_timer
         1.51%  [kernel]  [k] get_nohz_timer_target
         1.28%  [kernel]  [k] __internal_add_timer
         0.78%  [kernel]  [k] timerfn
         0.48%  [kernel]  [k] wake_up_nohz_cpu
      
      After:
        48.10%  hog       [.] main
        15.25%  [kernel]  [k] _raw_spin_lock_irqsave
         9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
         6.50%  [kernel]  [k] mod_timer
         6.44%  [kernel]  [k] lock_timer_base.isra.38
         3.87%  [kernel]  [k] detach_if_pending
         3.80%  [kernel]  [k] del_timer
         2.67%  [kernel]  [k] internal_add_timer
         1.33%  [kernel]  [k] __internal_add_timer
         0.73%  [kernel]  [k] timerfn
         0.54%  [kernel]  [k] wake_up_nohz_cpu
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Joonwoo Park <joonwoop@codeaurora.org>
      Cc: Wenbo Wang <wenbo.wang@memblaze.com>
      Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      bc7a34b8
    • hrtimer: Allow hrtimer::function() to free the timer · 887d9dc9
      Peter Zijlstra committed
      Currently an hrtimer callback function cannot free its own timer
      because __run_hrtimer() still needs to clear HRTIMER_STATE_CALLBACK
      after it. Freeing the timer would result in a clear use-after-free.
      
      Solve this by using a scheme similar to regular timers; track the
      current running timer in hrtimer_clock_base::running.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: ktkhai@parallels.com
      Cc: rostedt@goodmis.org
      Cc: juri.lelli@gmail.com
      Cc: pang.xunlei@linaro.org
      Cc: wanpeng.li@linux.intel.com
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: umgwanakikbuti@gmail.com
      Link: http://lkml.kernel.org/r/20150611124743.471563047@infradead.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      887d9dc9
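
The scheme of tracking the running timer in the base rather than in the timer itself can be sketched in user space. This is a heavily simplified illustration, without locking or sequence counts, and every name with a `_sketch` suffix is hypothetical; it is not the kernel's __run_hrtimer().

```c
#include <stddef.h>
#include <stdlib.h>

enum restart_sketch { SKETCH_NORESTART, SKETCH_RESTART };

struct timer_sketch {
    enum restart_sketch (*function)(struct timer_sketch *timer);
    int enqueued;
};

struct clock_base_sketch {
    struct timer_sketch *running;   /* only compared, never dereferenced */
};

/* "Is this timer's callback running?" is answered by a pointer comparison. */
int timer_is_running_sketch(const struct clock_base_sketch *base,
                            const struct timer_sketch *timer)
{
    return base->running == timer;
}

void run_timer_sketch(struct clock_base_sketch *base, struct timer_sketch *timer)
{
    enum restart_sketch (*fn)(struct timer_sketch *) = timer->function;

    /* The "callback in progress" state lives in the base, not in the timer. */
    base->running = timer;
    timer->enqueued = 0;

    /* A callback that frees its timer must return NORESTART. */
    if (fn(timer) == SKETCH_RESTART)
        timer->enqueued = 1;        /* safe: a restarting callback kept the timer */

    /* Completion is recorded in the base, so the timer memory is not touched. */
    base->running = NULL;
}

/* Example callback that releases its own timer. */
enum restart_sketch free_self_sketch(struct timer_sketch *timer)
{
    free(timer);
    return SKETCH_NORESTART;
}
```

Because nothing is written to the timer after a non-restarting callback returns, a callback such as `free_self_sketch()` does not cause a use-after-free in this model.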
    • hrtimer: Remove HRTIMER_STATE_MIGRATE · c04dca02
      Oleg Nesterov committed
      I do not understand HRTIMER_STATE_MIGRATE. Unless I am totally
      confused it looks buggy and simply unneeded.
      
      migrate_hrtimer_list() sets it to keep hrtimer_active() == T, but this
      is not enough: this can fool, say, hrtimer_is_queued() in
      dequeue_signal().
      
      Can't migrate_hrtimer_list() simply use HRTIMER_STATE_ENQUEUED?
      This fixes the race and we can kill STATE_MIGRATE.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: ktkhai@parallels.com
      Cc: rostedt@goodmis.org
      Cc: juri.lelli@gmail.com
      Cc: pang.xunlei@linaro.org
      Cc: wanpeng.li@linux.intel.com
      Cc: umgwanakikbuti@gmail.com
      Link: http://lkml.kernel.org/r/20150611124743.072387650@infradead.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      c04dca02
  18. 08 Jun, 2015 1 commit
  19. 22 Apr, 2015 4 commits