1. 01 12月, 2016 1 次提交
    • B
      alarmtimer: Add tracepoints for alarm timers · 4a057549
      Baolin Wang 提交于
      Alarm timers are one of the mechanisms to wake up a system from suspend,
      but there exist no tracepoints to analyse which process/thread armed an
      alarmtimer.
      
      Add tracepoints for start/cancel/expire of individual alarm timers and one
      for tracing the suspend time decision when to resume the system.
      
      The following trace excerpt illustrates the new mechanism:
      
      Binder:3292_2-3304  [000] d..2   149.981123: alarmtimer_cancel:
      alarmtimer:ffffffc1319a7800 type:REALTIME
      expires:1325463120000000000 now:1325376810370370245
      
      Binder:3292_2-3304  [000] d..2   149.981136: alarmtimer_start:
      alarmtimer:ffffffc1319a7800 type:REALTIME
      expires:1325376840000000000 now:1325376810370384591
      
      Binder:3292_9-3953  [000] d..2   150.212991: alarmtimer_cancel:
      alarmtimer:ffffffc1319a5a00 type:BOOTTIME
      expires:179552000000 now:150154008122
      
      Binder:3292_9-3953  [000] d..2   150.213006: alarmtimer_start:
      alarmtimer:ffffffc1319a5a00 type:BOOTTIME
      expires:179551000000 now:150154025622
      
      system_server-3000  [002] ...1  162.701940: alarmtimer_suspend:
      alarmtimer type:REALTIME expires:1325376840000000000
      
      The wakeup time which is selected at suspend time allows to map it back to
      the task arming the timer: Binder:3292_2.
      
      [ tglx: Store alarm timer expiry time instead of some useless RTC relative
        	information, add proper type information for wakeups which are
        	handled via the clock_nanosleep/freezer and massage the changelog. ]
      Signed-off-by: NBaolin Wang <baolin.wang@linaro.org>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Link: http://lkml.kernel.org/r/1480372524-15181-5-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      4a057549
  2. 30 11月, 2016 1 次提交
    • J
      timekeeping: Add a fast and NMI safe boot clock · 948a5312
      Joel Fernandes 提交于
      This boot clock can be used as a tracing clock and will account for
      suspend time.
      
      To keep it NMI safe since we're accessing from tracing, we're not using a
      separate timekeeper with updates to monotonic clock and boot offset
      protected with seqlocks. This has the following minor side effects:
      
      (1) Its possible that a timestamp be taken after the boot offset is updated
      but before the timekeeper is updated. If this happens, the new boot offset
      is added to the old timekeeping making the clock appear to update slightly
      earlier:
         CPU 0                                        CPU 1
         timekeeping_inject_sleeptime64()
         __timekeeping_inject_sleeptime(tk, delta);
                                                      timestamp();
         timekeeping_update(tk, TK_CLEAR_NTP...);
      
      (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
      partially updated.  Since the tk->offs_boot update is a rare event, this
      should be a rare occurrence which postprocessing should be able to handle.
      Signed-off-by: NJoel Fernandes <joelaf@google.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1480372524-15181-6-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      948a5312
  3. 16 11月, 2016 3 次提交
  4. 26 10月, 2016 2 次提交
    • D
      timers: Fix documentation for schedule_timeout() and similar · 4b7e9cf9
      Douglas Anderson 提交于
      The documentation for schedule_timeout(), schedule_hrtimeout(), and
      schedule_hrtimeout_range() all claim that the routines couldn't possibly
      return early if the task state was TASK_UNINTERRUPTIBLE. This is simply
      not true since wake_up_process() will cause those routines to exit early.
      
      We cannot make schedule_[hr]timeout() loop until the timeout expires if the
      task state is uninterruptible because we have users which rely on the
      existing and designed behaviour.
      
      Make the documentation match the (correct) implementation.
      
      schedule_hrtimeout() returns -EINTR even when a uninterruptible task was
      woken up. This might look strange, but making the return code depend on the
      state is too much of an effort as it would affect all the call sites. There
      is no value in doing so, but we spell it out clearly in the documentation.
      Suggested-by: NDaniel Kurtz <djkurtz@chromium.org>
      Signed-off-by: NDouglas Anderson <dianders@chromium.org>
      Cc: huangtao@rock-chips.com
      Cc: heiko@sntech.de
      Cc: broonie@kernel.org
      Cc: briannorris@chromium.org
      Cc: Andreas Mohr <andi@lisas.de>
      Cc: linux-rockchip@lists.infradead.org
      Cc: tony.xie@rock-chips.com
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: linux@roeck-us.net
      Cc: tskd08@gmail.com
      Link: http://lkml.kernel.org/r/1477065531-30342-2-git-send-email-dianders@chromium.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      4b7e9cf9
    • D
      timers: Fix usleep_range() in the context of wake_up_process() · 6c5e9059
      Douglas Anderson 提交于
      Users of usleep_range() expect that it will _never_ return in less time
      than the minimum passed parameter. However, nothing in the code ensures
      this, when the sleeping task is woken by wake_up_process() or any other
      mechanism which can wake a task from uninterruptible state.
      
      Neither usleep_range() nor schedule_hrtimeout_range*() have any protection
      against wakeups. schedule_hrtimeout_range*() is designed this way despite
      the fact that the API documentation does not mention it.
      
      msleep() already has code to handle this case since it will loop as long
      as there was still time left.  usleep_range() has no such loop, add it.
      
      Presumably this problem was not detected before because usleep_range() is
      only used in a few places and the function is mostly used in contexts which
      are not exposed to wakeups of any form.
      
      An effort was made to look for users relying on the old behavior by
      looking for usleep_range() in the same file as wake_up_process().
      No problems were found by this search, though it is conceivable that
      someone could have put the sleep and wakeup in two different files.
      
      An effort was made to ask several upstream maintainers if they were aware
      of people relying on wake_up_process() to wake up usleep_range(). No
      maintainers were aware of that but they were aware of many people relying
      on usleep_range() never returning before the minimum.
      Reported-by: NTao Huang <huangtao@rock-chips.com>
      Signed-off-by: NDouglas Anderson <dianders@chromium.org>
      Cc: heiko@sntech.de
      Cc: broonie@kernel.org
      Cc: briannorris@chromium.org
      Cc: Andreas Mohr <andi@lisas.de>
      Cc: linux-rockchip@lists.infradead.org
      Cc: tony.xie@rock-chips.com
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: djkurtz@chromium.org
      Cc: linux@roeck-us.net
      Cc: tskd08@gmail.com
      Link: http://lkml.kernel.org/r/1477065531-30342-1-git-send-email-dianders@chromium.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6c5e9059
  5. 17 10月, 2016 1 次提交
  6. 11 10月, 2016 1 次提交
    • E
      latent_entropy: Mark functions with __latent_entropy · 0766f788
      Emese Revfy 提交于
      The __latent_entropy gcc attribute can be used only on functions and
      variables.  If it is on a function then the plugin will instrument it for
      gathering control-flow entropy. If the attribute is on a variable then
      the plugin will initialize it with random contents.  The variable must
      be an integer, an integer array type or a structure with integer fields.
      
      These specific functions have been selected because they are init
      functions (to help gather boot-time entropy), are called at unpredictable
      times, or they have variable loops, each of which provide some level of
      latent entropy.
      Signed-off-by: NEmese Revfy <re.emese@gmail.com>
      [kees: expanded commit message]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      0766f788
  7. 05 10月, 2016 1 次提交
    • J
      timekeeping: Fix __ktime_get_fast_ns() regression · 58bfea95
      John Stultz 提交于
      In commit 27727df2 ("Avoid taking lock in NMI path with
      CONFIG_DEBUG_TIMEKEEPING"), I changed the logic to open-code
      the timekeeping_get_ns() function, but I forgot to include
      the unit conversion from cycles to nanoseconds, breaking the
      function's output, which impacts users like perf.
      
      This results in bogus perf timestamps like:
       swapper     0 [000]   253.427536:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426573:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426687:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426800:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426905:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427022:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427127:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427239:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427346:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427463:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   255.426572:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
      
      Instead of more reasonable expected timestamps like:
       swapper     0 [000]    39.953768:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.064839:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.175956:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.287103:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.398217:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.509324:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.620437:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.731546:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.842654:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.953772:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    41.064881:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
      
      Add the proper use of timekeeping_delta_to_ns() to convert
      the cycle delta to nanoseconds as needed.
      
      Thanks to Brendan and Alexei for finding this quickly after
      the v4.8 release. Unfortunately the problematic commit has
      landed in some -stable trees so they'll need this fix as
      well.
      
      Many apologies for this mistake. I'll be looking to add a
      perf-clock sanity test to the kselftest timers tests soon.
      
      Fixes: 27727df2 "timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING"
      Reported-by: NBrendan Gregg <bgregg@netflix.com>
      Reported-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Tested-and-reviewed-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1475636148-26539-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      58bfea95
  8. 13 9月, 2016 1 次提交
  9. 02 9月, 2016 1 次提交
    • W
      tick/nohz: Fix softlockup on scheduler stalls in kvm guest · 08d07259
      Wanpeng Li 提交于
      tick_nohz_start_idle() is prevented to be called if the idle tick can't 
      be stopped since commit 1f3b0f82 ("tick/nohz: Optimize nohz idle 
      enter"). As a result, after suspend/resume the host machine, full dynticks 
      kvm guest will softlockup:
      
       NMI watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [swapper/0:0]
       Call Trace:
        default_idle+0x31/0x1a0
        arch_cpu_idle+0xf/0x20
        default_idle_call+0x2a/0x50
        cpu_startup_entry+0x39b/0x4d0
        rest_init+0x138/0x140
        ? rest_init+0x5/0x140
        start_kernel+0x4c1/0x4ce
        ? set_init_arg+0x55/0x55
        ? early_idt_handler_array+0x120/0x120
        x86_64_start_reservations+0x24/0x26
        x86_64_start_kernel+0x142/0x14f
      
      In addition, cat /proc/stat | grep cpu in guest or host:
      
      cpu  398 16 5049 15754 5490 0 1 46 0 0
      cpu0 206 5 450 0 0 0 1 14 0 0
      cpu1 81 0 3937 3149 1514 0 0 9 0 0
      cpu2 45 6 332 6052 2243 0 0 11 0 0
      cpu3 65 2 328 6552 1732 0 0 11 0 0
      
      The idle and iowait states are weird 0 for cpu0(housekeeping). 
      
      The bug is present in both guest and host kernels, and they both have 
      cpu0's idle and iowait states issue, however, host kernel's suspend/resume 
      path etc will touch watchdog to avoid the softlockup.
      
      - The watchdog will not be touched in tick_nohz_stop_idle path (need be 
        touched since the scheduler stall is expected) if idle_active flags are 
        not detected.
      - The idle and iowait states will not be accounted when exit idle loop 
        (resched or interrupt) if idle start time and idle_active flags are 
        not set. 
      
      This patch fixes it by reverting commit 1f3b0f82 since can't stop 
      idle tick doesn't mean can't be idle.
      
      Fixes: 1f3b0f82 ("tick/nohz: Optimize nohz idle enter")
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Cc: Sanjeev Yadav<sanjeev.yadav@spreadtrum.com>
      Cc: Gaurav Jindal<gaurav.jindal@spreadtrum.com>
      Cc: stable@vger.kernel.org
      Cc: kvm@vger.kernel.org
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Link: http://lkml.kernel.org/r/1472798303-4154-1-git-send-email-wanpeng.li@hotmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      08d07259
  10. 01 9月, 2016 5 次提交
    • V
      time: Avoid undefined behaviour in ktime_add_safe() · 979515c5
      Vegard Nossum 提交于
      I ran into this:
      
          ================================================================================
          UBSAN: Undefined behaviour in kernel/time/hrtimer.c:310:16
          signed integer overflow:
          9223372036854775807 + 50000 cannot be represented in type 'long long int'
          CPU: 2 PID: 4798 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #91
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
           0000000000000000 ffff88010ce6fb88 ffffffff82344740 0000000041b58ab3
           ffffffff84f97a20 ffffffff82344694 ffff88010ce6fbb0 ffff88010ce6fb60
           000000000000c350 ffff88010ce6f968 dffffc0000000000 ffffffff857bc320
          Call Trace:
           [<ffffffff82344740>] dump_stack+0xac/0xfc
           [<ffffffff82344694>] ? _atomic_dec_and_lock+0xc4/0xc4
           [<ffffffff8242df78>] ubsan_epilogue+0xd/0x8a
           [<ffffffff8242e6b4>] handle_overflow+0x202/0x23d
           [<ffffffff8242e4b2>] ? val_to_string.constprop.6+0x11e/0x11e
           [<ffffffff8236df71>] ? timerqueue_add+0x151/0x410
           [<ffffffff81485c48>] ? hrtimer_start_range_ns+0x3b8/0x1380
           [<ffffffff81795631>] ? memset+0x31/0x40
           [<ffffffff8242e6fd>] __ubsan_handle_add_overflow+0xe/0x10
           [<ffffffff81488ac9>] hrtimer_nanosleep+0x5d9/0x790
           [<ffffffff814884f0>] ? hrtimer_init_sleeper+0x80/0x80
           [<ffffffff813a9ffb>] ? __might_sleep+0x5b/0x260
           [<ffffffff8148be10>] common_nsleep+0x20/0x30
           [<ffffffff814906c7>] SyS_clock_nanosleep+0x197/0x210
           [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150
           [<ffffffff823c7113>] ? __this_cpu_preempt_check+0x13/0x20
           [<ffffffff8162ef60>] ? __context_tracking_exit.part.3+0x30/0x1b0
           [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150
           [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
           [<ffffffff845f85aa>] entry_SYSCALL64_slow_path+0x25/0x25
          ================================================================================
      
      Add a new ktime_add_unsafe() helper which doesn't check for overflow, but
      doesn't throw a UBSAN warning when it does overflow either.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: NVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      979515c5
    • V
      time: Avoid undefined behaviour in timespec64_add_safe() · 469e857f
      Vegard Nossum 提交于
      I ran into this:
      
          ================================================================================
          UBSAN: Undefined behaviour in kernel/time/time.c:783:2
          signed integer overflow:
          5273 + 9223372036854771711 cannot be represented in type 'long int'
          CPU: 0 PID: 17363 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #88
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
          04/01/2014
           0000000000000000 ffff88011457f8f0 ffffffff82344f50 0000000041b58ab3
           ffffffff84f98080 ffffffff82344ea4 ffff88011457f918 ffff88011457f8c8
           ffff88011457f8e0 7fffffffffffefff ffff88011457f6d8 dffffc0000000000
          Call Trace:
           [<ffffffff82344f50>] dump_stack+0xac/0xfc
           [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
           [<ffffffff8242f4c8>] ubsan_epilogue+0xd/0x8a
           [<ffffffff8242fc04>] handle_overflow+0x202/0x23d
           [<ffffffff8242fa02>] ? val_to_string.constprop.6+0x11e/0x11e
           [<ffffffff823c7837>] ? debug_smp_processor_id+0x17/0x20
           [<ffffffff8131b581>] ? __sigqueue_free.part.13+0x51/0x70
           [<ffffffff8146d4e0>] ? rcu_is_watching+0x110/0x110
           [<ffffffff8242fc4d>] __ubsan_handle_add_overflow+0xe/0x10
           [<ffffffff81476ef8>] timespec64_add_safe+0x298/0x340
           [<ffffffff81476c60>] ? timespec_add_safe+0x330/0x330
           [<ffffffff812f7990>] ? wait_noreap_copyout+0x1d0/0x1d0
           [<ffffffff8184bf18>] poll_select_set_timeout+0xf8/0x170
           [<ffffffff8184be20>] ? poll_schedule_timeout+0x2b0/0x2b0
           [<ffffffff813aa9bb>] ? __might_sleep+0x5b/0x260
           [<ffffffff833c8a87>] __sys_recvmmsg+0x107/0x790
           [<ffffffff833c8980>] ? SyS_recvmsg+0x20/0x20
           [<ffffffff81486378>] ? hrtimer_start_range_ns+0x3b8/0x1380
           [<ffffffff845f8bfb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60
           [<ffffffff8148bcea>] ? do_setitimer+0x39a/0x8e0
           [<ffffffff813aa9bb>] ? __might_sleep+0x5b/0x260
           [<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790
           [<ffffffff833c91e9>] SyS_recvmmsg+0xd9/0x160
           [<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790
           [<ffffffff823c7853>] ? __this_cpu_preempt_check+0x13/0x20
           [<ffffffff8162f680>] ? __context_tracking_exit.part.3+0x30/0x1b0
           [<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790
           [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
           [<ffffffff845f936a>] entry_SYSCALL64_slow_path+0x25/0x25
          ================================================================================
      
      Line 783 is this:
      
      783         set_normalized_timespec64(&res, lhs.tv_sec + rhs.tv_sec,
      784                         lhs.tv_nsec + rhs.tv_nsec);
      
      In other words, since lhs.tv_sec and rhs.tv_sec are both time64_t, this
      is a signed addition which will cause undefined behaviour on overflow.
      
      Note that this is not currently a huge concern since the kernel should be
      built with -fno-strict-overflow by default, but could be a problem in the
      future, a problem with older compilers, or other compilers than gcc.
      
      The easiest way to avoid the overflow is to cast one of the arguments to
      unsigned (so the addition will be done using unsigned arithmetic).
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: NVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      469e857f
    • R
      timekeeping: Prints the amounts of time spent during suspend · 0bf43f15
      Ruchi Kandoi 提交于
      In addition to keeping a histogram of suspend times, also
      print out the time spent in suspend to dmesg.
      
      This helps to keep track of suspend time while debugging using
      kernel logs.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: NRuchi Kandoi <kandoiruchi@google.com>
      [jstultz: Tweaked commit message]
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      0bf43f15
    • K
      clocksource: Defer override invalidation unless clock is unstable · 36374583
      Kyle Walker 提交于
      Clocksources don't get the VALID_FOR_HRES flag until they have been
      checked by a watchdog. However, when using an override, the
      clocksource_select logic will clear the override value if the
      clocksource is not marked VALID_FOR_HRES during that inititial check.
      When using the boot arguments clocksource=<foo>, this selection can
      run before the watchdog, and can cause the override to be incorrectly
      cleared.
      
      To address this condition, the override_name is only invalidated for
      unstable clocksources. Otherwise, the override is left intact until after
      the watchdog has validated the clocksource as stable/unstable.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NKyle Walker <kwalker@redhat.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      36374583
    • P
      hrtimer: Spelling fixes · b4d90e9f
      Pratyush Patel 提交于
      Fix a minor spelling error.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: NPratyush Patel <pratyushpatel.1995@gmail.com>
      [jstultz: Added commit message]
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      b4d90e9f
  11. 24 8月, 2016 2 次提交
    • J
      timekeeping: Cap array access in timekeeping_debug · a4f8f666
      John Stultz 提交于
      It was reported that hibernation could fail on the 2nd attempt, where the
      system hangs at hibernate() -> syscore_resume() -> i8237A_resume() ->
      claim_dma_lock(), because the lock has already been taken.
      
      However there is actually no other process would like to grab this lock on
      that problematic platform.
      
      Further investigation showed that the problem is triggered by setting
      /sys/power/pm_trace to 1 before the 1st hibernation.
      
      Since once pm_trace is enabled, the rtc becomes unmeaningful after suspend,
      and meanwhile some BIOSes would like to adjust the 'invalid' RTC (e.g, smaller
      than 1970) to the release date of that motherboard during POST stage, thus
      after resumed, it may seem that the system had a significant long sleep time
      which is a completely meaningless value.
      
      Then in timekeeping_resume -> tk_debug_account_sleep_time, if the bit31 of the
      sleep time happened to be set to 1, fls() returns 32 and we add 1 to
      sleep_time_bin[32], which causes an out of bounds array access and therefor
      memory being overwritten.
      
      As depicted by System.map:
      0xffffffff81c9d080 b sleep_time_bin
      0xffffffff81c9d100 B dma_spin_lock
      the dma_spin_lock.val is set to 1, which caused this problem.
      
      This patch adds a sanity check in tk_debug_account_sleep_time()
      to ensure we don't index past the sleep_time_bin array.
      
      [jstultz: Problem diagnosed and original patch by Chen Yu, I've solved the
       issue slightly differently, but borrowed his excelent explanation of the
       issue here.]
      
      Fixes: 5c83545f "power: Add option to log time spent in suspend"
      Reported-by: NJanek Kozicki <cosurgi@gmail.com>
      Reported-by: NChen Yu <yu.c.chen@intel.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Cc: linux-pm@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xunlei Pang <xpang@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: stable <stable@vger.kernel.org>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Link: http://lkml.kernel.org/r/1471993702-29148-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      a4f8f666
    • J
      timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING · 27727df2
      John Stultz 提交于
      When I added some extra sanity checking in timekeeping_get_ns() under
      CONFIG_DEBUG_TIMEKEEPING, I missed that the NMI safe __ktime_get_fast_ns()
      method was using timekeeping_get_ns().
      
      Thus the locking added to the debug checks broke the NMI-safety of
      __ktime_get_fast_ns().
      
      This patch open-codes the timekeeping_get_ns() logic for
      __ktime_get_fast_ns(), so can avoid any deadlocks in NMI.
      
      Fixes: 4ca22c26 "timekeeping: Add warnings when overflows or underflows are observed"
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Cc: stable <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1471993702-29148-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      27727df2
  12. 09 8月, 2016 1 次提交
    • C
      timers: Fix get_next_timer_interrupt() computation · 46c8f0b0
      Chris Metcalf 提交于
      The tick_nohz_stop_sched_tick() routine is not properly
      canceling the sched timer when nothing is pending, because
      get_next_timer_interrupt() is no longer returning KTIME_MAX in
      that case.  This causes periodic interrupts when none are needed.
      
      When determining the next interrupt time, we first use
      __next_timer_interrupt() to get the first expiring timer in the
      timer wheel.  If no timer is found, we return the base clock value
      plus NEXT_TIMER_MAX_DELTA to indicate there is no timer in the
      timer wheel.
      
      Back in get_next_timer_interrupt(), we set the "expires" value
      by converting the timer wheel expiry (in ticks) to a nsec value.
      But we don't want to do this if the timer wheel expiry value
      indicates no timer; we want to return KTIME_MAX.
      
      Prior to commit 500462a9 ("timers: Switch to a non-cascading
      wheel") we checked base->active_timers to see if any timers
      were active, and if not, we didn't touch the expiry value and so
      properly returned KTIME_MAX.  Now we don't have active_timers.
      
      To fix this, we now just check the timer wheel expiry value to
      see if it is "now + NEXT_TIMER_MAX_DELTA", and if it is, we don't
      try to compute a new value based on it, but instead simply let the
      KTIME_MAX value in expires remain.
      
      Fixes: 500462a9 "timers: Switch to a non-cascading wheel"
      Signed-off-by: NChris Metcalf <cmetcalf@mellanox.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/1470688147-22287-1-git-send-email-cmetcalf@mellanox.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      46c8f0b0
  13. 19 7月, 2016 2 次提交
  14. 15 7月, 2016 2 次提交
  15. 11 7月, 2016 1 次提交
  16. 07 7月, 2016 12 次提交
    • A
      timers: Implement optimization for same expiry time in mod_timer() · f00c0afd
      Anna-Maria Gleixner 提交于
      The existing optimization for same expiry time in mod_timer() checks whether
      the timer expiry time is the same as the new requested expiry time. In the old
      timer wheel implementation this does not take the slack batching into account,
      neither does the new implementation evaluate whether the new expiry time will
      requeue the timer to the same bucket.
      
      To optimize that, we can calculate the resulting bucket and check if the new
      expiry time is different from the current expiry time. This calculation
      happens outside the base lock held region. If the resulting bucket is the same
      we can avoid taking the base lock and requeueing the timer.
      
      If the timer needs to be requeued then we have to check under the base lock
      whether the base time has changed between the lockless calculation and taking
      the lock. If it has changed we need to recalculate under the lock.
      
      This optimization takes effect for timers which are enqueued into the less
      granular wheel levels (1 and above). With a simple test case the functionality
      has been verified:
      
                  Before        After
       Match:       5.5%        86.6%
       Requeue:    94.5%        13.4%
       Recalc:                  <0.01%
      
      In the non optimized case the timer is requeued in 94.5% of the cases. With
      the index optimization in place the requeue rate drops to 13.4%. The case
      where the lockless index calculation has to be redone is less than 0.01%.
      
      With a real world test case (networking) we observed the following changes:
      
                  Before        After
       Match:      97.8%        99.7%
       Requeue:     2.2%         0.3%
       Recalc:                  <0.001%
      
      That means two percent fewer lock/requeue/unlock operations done in one of
      the hot path use cases of timers.
      Signed-off-by: NAnna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.778527749@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f00c0afd
    • A
      timers: Split out index calculation · ffdf0477
      Anna-Maria Gleixner 提交于
      For further optimizations we need to seperate index calculation
      from queueing. No functional change.
      Signed-off-by: NAnna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.691159619@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ffdf0477
    • T
      timers: Only wake softirq if necessary · 4e85876a
      Thomas Gleixner 提交于
      With the wheel forwading in place and with the HZ=1000 4ms folding we can
      avoid running the softirq at all.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.607650550@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4e85876a
    • T
      timers: Forward the wheel clock whenever possible · a683f390
      Thomas Gleixner 提交于
      The wheel clock is stale when a CPU goes into a long idle sleep. This has the
      side effect that timers which are queued end up in the outer wheel levels.
      That results in coarser granularity.
      
      To solve this, we keep track of the idle state and forward the wheel clock
      whenever possible.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a683f390
    • T
      timers/nohz: Remove pointless tick_nohz_kick_tick() function · ff006732
      Thomas Gleixner 提交于
      This was a failed attempt to optimize the timer expiry in idle, which was
      disabled and never revisited. Remove the cruft.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.431073782@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ff006732
    • A
      timers: Optimize collect_expired_timers() for NOHZ · 23696838
      Anna-Maria Gleixner 提交于
      After a NOHZ idle sleep the timer wheel must be forwarded to current jiffies.
      There might be expired timers so the current code loops and checks the expired
      buckets for timers. This can take quite some time for long NOHZ idle periods.
      
      The pending bitmask in the timer base allows us to do a quick search for the
      next expiring timer and therefore a fast forward of the base time which
      prevents pointless long lasting loops.
      
      For a 3 seconds idle sleep this reduces the catchup time from ~1ms to 5us.
      Signed-off-by: NAnna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.351296290@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      23696838
    • A
      timers: Move __run_timers() function · 73420fea
      Anna-Maria Gleixner 提交于
      Move __run_timers() below __next_timer_interrupt() and next_pending_bucket()
      in preparation for __run_timers() NOHZ optimization.
      
      No functional change.
      Signed-off-by: NAnna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.271872665@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      73420fea
    • T
      timers: Remove set_timer_slack() leftovers · 53bf837b
      Thomas Gleixner 提交于
      We now have implicit batching in the timer wheel. The slack API is no longer
      used, so remove it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Andrew F. Davis <afd@ti.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jaehoon Chung <jh80.chung@samsung.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathias Nyman <mathias.nyman@intel.com>
      Cc: Pali Rohár <pali.rohar@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sebastian Reichel <sre@kernel.org>
      Cc: Ulf Hansson <ulf.hansson@linaro.org>
      Cc: linux-block@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mmc@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: linux-usb@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.189813118@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      53bf837b
    • T
      timers: Switch to a non-cascading wheel · 500462a9
      Thomas Gleixner 提交于
      The current timer wheel has some drawbacks:
      
      1) Cascading:
      
         Cascading can be an unbound operation and is completely pointless in most
         cases because the vast majority of the timer wheel timers are canceled or
         rearmed before expiration. (They are used as timeout safeguards, not as
         real timers to measure time.)
      
      2) No fast lookup of the next expiring timer:
      
         In NOHZ scenarios the first timer soft interrupt after a long NOHZ period
         must fast forward the base time to the current value of jiffies. As we
         have no way to find the next expiring timer fast, the code loops linearly
         and increments the base time one by one and checks for expired timers
         in each step. This causes unbound overhead spikes exactly in the moment
         when we should wake up as fast as possible.
      
      After a thorough analysis of real world data gathered on laptops,
      workstations, webservers and other machines (thanks Chris!) I came to the
      conclusion that the current 'classic' timer wheel implementation can be
      modified to address the above issues.
      
      The vast majority of timer wheel timers is canceled or rearmed before
      expiry. Most of them are timeouts for networking and other I/O tasks. The
      nature of timeouts is to catch the exception from normal operation (TCP ack
      timed out, disk does not respond, etc.). For these kinds of timeouts the
      accuracy of the timeout is not really a concern. Timeouts are very often
      approximate worst-case values and in case the timeout fires, we already
      waited for a long time and performance is down the drain already.
      
      The few timers which actually expire can be split into two categories:
      
       1) Short expiry times which expect halfways accurate expiry
      
       2) Long term expiry times are inaccurate today already due to the
          batching which is done for NOHZ automatically and also via the
          set_timer_slack() API.
      
      So for long term expiry timers we can avoid the cascading property and just
      leave them in the less granular outer wheels until expiry or
      cancelation. Timers which are armed with a timeout larger than the wheel
      capacity are no longer cascaded. We expire them with the longest possible
      timeout (6+ days). We have not observed such timeouts in our data collection,
      but at least we handle them, applying the rule of the least surprise.
      
      To avoid extending the wheel levels for HZ=1000 so we can accomodate the
      longest observed timeouts (5 days in the network conntrack code) we reduce the
      first level granularity on HZ=1000 to 4ms, which effectively is the same as
      the HZ=250 behaviour. From our data analysis there is nothing which relies on
      that 1ms granularity and as a side effect we get better batching and timer
      locality for the networking code as well.
      
      Contrary to the classic wheel the granularity of the next wheel is not the
      capacity of the first wheel. The granularities of the wheels are in the
      currently chosen setting 8 times the granularity of the previous wheel.
      
      So for HZ=250 we end up with the following granularity levels:
      
       Level Offset   Granularity                  Range
           0      0          4 ms                 0 ms -        252 ms
           1     64         32 ms               256 ms -       2044 ms (256ms - ~2s)
           2    128        256 ms              2048 ms -      16380 ms (~2s   - ~16s)
           3    192       2048 ms (~2s)       16384 ms -     131068 ms (~16s  - ~2m)
           4    256      16384 ms (~16s)     131072 ms -    1048572 ms (~2m   - ~17m)
           5    320     131072 ms (~2m)     1048576 ms -    8388604 ms (~17m  - ~2h)
           6    384    1048576 ms (~17m)    8388608 ms -   67108863 ms (~2h   - ~18h)
           7    448    8388608 ms (~2h)    67108864 ms -  536870911 ms (~18h  - ~6d)
      
      That's a worst case inaccuracy of 12.5% for the timers which are queued at the
      beginning of a level.
      
      So the new wheel concept addresses the old issues:
      
      1) Cascading is avoided completely
      
      2) By keeping the timers in the bucket until expiry/cancelation we can track
         the buckets which have timers enqueued in a bucket bitmap and therefore can
         look up the next expiring timer very fast and O(1).
      
      A further benefit of the concept is that the slack calculation which is done
      on every timer start is no longer necessary because the granularity levels
      provide natural batching already.
      
      Our extensive testing with various loads did not show any performance
      degradation vs. the current wheel implementation.
      
      This patch does not address the 'fast lookup' issue as we wanted to make sure
      that there is no regression introduced by the wheel redesign. The
      optimizations are in follow up patches.
      
      This patch contains fixes from Anna-Maria Gleixner and Richard Cochran.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.108621834@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      500462a9
    • T
      timers: Give a few structs and members proper names · 494af3ed
      Thomas Gleixner 提交于
      Some of the names in the internal implementation of the timer code
      are not longer correct and others are simply too long to type.
      
      Clean it up before we switch the wheel implementation over to
      the new scheme.
      
      No functional change.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.948752516@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      494af3ed
    • T
      timers: Remove the deprecated mod_timer_pinned() API · 177ec0a0
      Thomas Gleixner 提交于
      We switched all users to initialize the timers as pinned and call
      mod_timer(). Remove the now unused timer API function.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.706205231@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      177ec0a0
    • T
      timers: Make 'pinned' a timer property · e675447b
      Thomas Gleixner 提交于
      We want to move the timer migration logic from a 'push' to a 'pull' model.
      
      Under the current 'push' model pinned timers are handled via
      a runtime API variant: mod_timer_pinned().
      
      The 'pull' model requires us to store the pinned attribute of a timer
      in the timer_list structure itself, as a new TIMER_PINNED bit in
      timer->flags.
      
      This flag must be set at initialization time and the timer APIs
      recognize the flag.
      
      This patch:
      
       - Implements the new flag and associated new-style initialization
         methods
      
       - makes mod_timer() recognize new-style pinned timers,
      
       - and adds some migration helper facility to allow
         step by step conversion of old-style to new-style
         pinned timers.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.049338558@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e675447b
  17. 05 7月, 2016 1 次提交
  18. 01 7月, 2016 2 次提交