1. 15 May 2017: 4 commits
    • sched/clock: Remove watchdog touching · 3067a33d
      Authored by Peter Zijlstra
      Commit:
      
        2bacec8c ("sched: touch softlockup watchdog after idling")
      
      introduced the touch_softlockup_watchdog_sched() call without
      justification, and I feel sched_clock management is not the right
      place; it should only be concerned with producing semi-coherent time.
      
      If this causes watchdog problems, we can find a better place for the call.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3067a33d
    • sched/clock: Remove unused argument to sched_clock_idle_wakeup_event() · ac1e843f
      Authored by Peter Zijlstra
      The argument to sched_clock_idle_wakeup_event() has not been used in a
      long time. Remove it.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ac1e843f
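      A minimal sketch of the interface change described above; the old parameter name is an assumption, not quoted from the diff:

        /* before: callers passed a delta the implementation never used */
        /*   void sched_clock_idle_wakeup_event(u64 delta_ns);          */

        /* after: the unused argument is gone, callers simply signal the wakeup */
        void sched_clock_idle_wakeup_event(void);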
    • x86/tsc, sched/clock, clocksource: Use clocksource watchdog to provide stable sync points · b421b22b
      Authored by Peter Zijlstra
      Currently we keep sched_clock_tick() active for stable TSC in order to
      keep the per-CPU state semi up-to-date. The (obvious) problem is that
      by the time we detect TSC is borked, our per-CPU state is also borked.
      
      So hook into the clocksource watchdog and call a method after it has
      found the TSC to still be stable.
      
      There's the obvious race where the TSC goes wonky between finding it
      stable and us running the callback, but closing that is too much work
      and not really worth it, since we're already detecting TSC wobbles
      after the fact; so we cannot, by definition, fully avoid funny clock
      values.
      
      And since the watchdog runs less often than the tick, this is also an
      optimization.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b421b22b
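      A minimal sketch of the hook described above, assuming the watchdog calls back into the sched_clock code only when the comparison against the reference clocksource passed; the callback name is illustrative, following the changelog's intent rather than the exact diff:

        #include <linux/sched/clock.h>

        /* invoked from the clocksource watchdog after the TSC passed its check */
        static void watchdog_tsc_still_stable(void)
        {
            sched_clock_tick_stable();  /* resync per-CPU sched_clock state */
        }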
    • sched/clock: Initialize all per-CPU state before switching (back) to unstable · cf15ca8d
      Authored by Peter Zijlstra
      In preparation for not keeping the sched_clock_tick() active for
      stable TSC, we need to explicitly initialize all per-CPU state
      before switching back to unstable.
      
      Note: this patch loses the __gtod_offset calculation; it will be
      restored in the next one.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cf15ca8d
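      A rough sketch of the per-CPU initialization this refers to, with assumed structure and field names (not the literal patch):

        #include <linux/percpu.h>
        #include <linux/cpumask.h>
        #include <linux/types.h>

        struct sched_clock_data {   /* simplified; field names assumed */
            u64 tick_raw;           /* last raw sched_clock() sample   */
            u64 tick_gtod;          /* last GTOD (ktime) sample        */
            u64 clock;              /* last value handed out           */
        };
        static DEFINE_PER_CPU(struct sched_clock_data, sched_clock_data);

        static void sched_clock_init_all_cpus(u64 now)
        {
            int cpu;

            for_each_possible_cpu(cpu) {
                struct sched_clock_data *scd = &per_cpu(sched_clock_data, cpu);

                scd->tick_raw = scd->tick_gtod = scd->clock = now;
            }
        }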
  2. 27 Mar 2017: 1 commit
  3. 23 Mar 2017: 2 commits
  4. 02 Mar 2017: 2 commits
  5. 20 Jan 2017: 1 commit
    • sched/clock: Fix hotplug crash · acb04058
      Authored by Peter Zijlstra
      Mike reported that he could trigger the WARN_ON_ONCE() in
      set_sched_clock_stable() using hotplug.
      
      This exposed a fundamental problem with the interface: we should never
      mark the TSC stable if we ever find it to be unstable. Therefore
      set_sched_clock_stable() is a broken interface.
      
      The reason it existed is that not having it is a pain: it means all
      relevant architecture code needs to call clear_sched_clock_stable()
      where appropriate.
      
      Of the three architectures that select HAVE_UNSTABLE_SCHED_CLOCK, ia64
      and parisc are trivial in that they never called
      set_sched_clock_stable(), so add an unconditional call to
      clear_sched_clock_stable() to them.
      
      For x86 the story is a lot more involved, and what this patch tries to
      do is ensure we preserve the status quo. So even if Cyrix or Transmeta
      have a usable TSC, they never called set_sched_clock_stable(), so they
      now get explicitly marked unstable.
      Reported-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 9881b024 ("sched/clock: Delay switching sched_clock to stable")
      Link: http://lkml.kernel.org/r/20170119133633.GB6536@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      acb04058
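      A sketch of the direction described above for the trivial architectures; the init hook name is hypothetical:

        #include <linux/init.h>
        #include <linux/sched/clock.h>

        void __init arch_clock_setup(void)  /* hypothetical arch hook */
        {
            /*
             * Only ever mark the clock unstable; nothing may flip it back
             * to stable later, so the broken set_sched_clock_stable()
             * interface is not needed at all.
             */
            clear_sched_clock_stable();
        }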
  6. 14 Jan 2017: 3 commits
    • sched/clock: Provide better clock continuity · 5680d809
      Authored by Peter Zijlstra
      When switching between the unstable and stable variants it is
      currently possible that clock discontinuities occur.
      
      And while these will mostly be 'small', attempt to do better.
      
      As observed on my IVB-EP, the sched_clock() is ~1.5s ahead of the
      ktime_get_ns() based timeline at the point of switchover
      (sched_clock_init_late()) after SMP bringup.
      
      Equally, when the TSC is later found to be unstable -- typically
      because SMM tries to hide its SMI latencies by mucking with the TSC --
      we want to avoid large jumps.
      
      Since the clocksource watchdog reports the issue after the fact, we
      cannot exactly fix up time, but since SMI latencies are typically
      small (~10ns range), the discontinuity is mainly due to drift between
      sched_clock() and ktime_get_ns() (which on my desktop is ~79s over
      24 days).
      
      I dislike this patch because it adds overhead to the good case in
      favour of dealing with badness. But given the widespread failure of
      TSC stability this is worth it.
      
      Note that in case the TSC makes drastic jumps after SMP bringup we're
      still hosed. There's just not much we can do in that case without
      stupid overhead.
      
      If we were to somehow expose tsc_clocksource_reliable (which is hard
      because this code is also used on ia64 and parisc) we could avoid some
      of the newly introduced overhead.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5680d809
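      A very rough sketch of the continuity idea, with assumed names: capture the offset between the raw clock and the GTOD-based timeline at the switch point and fold it into every later reading, so the reported clock does not jump:

        #include <linux/sched/clock.h>
        #include <linux/timekeeping.h>

        static u64 switchover_offset;   /* assumed name */

        static void note_stable_switchover(void)
        {
            /* how far sched_clock() has drifted ahead of ktime_get_ns() */
            switchover_offset = sched_clock() - ktime_get_ns();
        }

        static u64 continuous_clock(void)
        {
            /* stays continuous with the pre-switch, GTOD based timeline */
            return sched_clock() - switchover_offset;
        }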
    • sched/clock: Delay switching sched_clock to stable · 9881b024
      Authored by Peter Zijlstra
      Currently we switch to the stable sched_clock if we guess the TSC is
      usable, and then switch back to the unstable path if it turns out TSC
      isn't stable during SMP bringup after all.
      
      Delay switching to the stable path until after SMP bringup is
      complete. This way we'll avoid switching during the time we detect the
      worst of the TSC offences.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9881b024
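      A sketch of the new ordering, with hypothetical helper names; the point is only that the stable path is committed to after SMP bringup, not during early TSC setup:

        #include <linux/init.h>

        void __init sched_clock_init_late(void)
        {
            /* SMP bringup and its TSC sync checks are done by now */
            if (tsc_still_looks_stable())       /* hypothetical check */
                __set_sched_clock_stable();     /* commit to the fast path */
        }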
    • sched/clock: Update static_key usage · 555570d7
      Authored by Peter Zijlstra
      sched_clock was still using the deprecated static_key interface.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      555570d7
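      For reference, a sketch of the API migration this refers to (illustrative, not the actual diff):

        #include <linux/jump_label.h>

        /* deprecated interface the code was still using */
        static struct static_key clock_stable_old = STATIC_KEY_INIT_FALSE;
        /* ... tested with static_key_false(&clock_stable_old) ... */

        /* current interface */
        static DEFINE_STATIC_KEY_FALSE(clock_stable);

        static inline bool clock_is_stable(void)
        {
            return static_branch_likely(&clock_stable);
        }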
  7. 13 Apr 2016: 2 commits
  8. 02 Mar 2016: 1 commit
  9. 09 Dec 2015: 1 commit
    • watchdog: introduce touch_softlockup_watchdog_sched() · 03e0d461
      Authored by Tejun Heo
      touch_softlockup_watchdog() is used to tell the watchdog that a
      scheduler stall is expected.  One group of usages is from paths where
      the task may not be able to yield for a long time, such as performing
      slow PIO to a finicky device or coming out of suspend.  The other is to
      account for the scheduler and timer going idle.
      
      For scheduler softlockup detection, there's no reason to distinguish
      the two cases; however, a workqueue lockup detector is planned and it
      can use the same signals from the former group, while the latter would
      spuriously prevent detection.  This patch introduces a new function,
      touch_softlockup_watchdog_sched(), and converts the latter group to
      call it instead.  For now, it just calls touch_softlockup_watchdog(),
      so there's no functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      03e0d461
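      A minimal sketch of the new function as the changelog describes it, assuming nothing beyond what is stated above:

        #include <linux/sched.h>

        void touch_softlockup_watchdog_sched(void)
        {
            /*
             * Identical behaviour for now; the separate entry point only
             * matters once the workqueue lockup detector wants to ignore
             * "scheduler/timer went idle" touches.
             */
            touch_softlockup_watchdog();
        }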
  10. 23 Nov 2015: 1 commit
    • treewide: Remove old email address · 90eec103
      Authored by Peter Zijlstra
      There were still a number of references to my old Red Hat email
      address in the kernel source. Remove these while keeping the
      Red Hat copyright notices intact.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      90eec103
  11. 13 Feb 2015: 1 commit
    • kernel/sched/clock.c: add another clock for use with the soft lockup watchdog · 545a2bf7
      Authored by Cyril Bur
      When the hypervisor pauses a virtualised kernel, the kernel will observe
      a jump in the timebase; this can cause spurious messages from the
      softlockup detector.
      
      Whilst these messages are harmless, they are accompanied by a stack
      trace, which causes undue concern; more problematically, the stack trace
      in the guest has nothing to do with the observed problem and can only
      be misleading.
      
      Furthermore, on POWER8 this is completely avoidable with the introduction
      of the Virtual Time Base (VTB) register.
      
      This patch (of 2):
      
      This permits the use of arch-specific clocks, so that virtualised
      kernels can use their notion of 'running' time rather than the elapsed
      wall time, which will include host execution time.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Jones <drjones@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: chai wen <chaiw.fnst@cn.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ben Zhang <benzh@chromium.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      545a2bf7
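      A hedged sketch of the generic side of this: a weak default that falls back to the ordinary scheduler clock, which an architecture (powerpc with VTB, per the changelog) can override with a clock that only advances while the guest actually runs:

        #include <linux/sched.h>

        unsigned long long __weak running_clock(void)
        {
            return local_clock();   /* default: same as the normal clock */
        }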
  12. 27 Aug 2014: 1 commit
  13. 08 Apr 2014: 1 commit
  14. 11 Mar 2014: 1 commit
  15. 23 Jan 2014: 1 commit
    • sched/clock: Fixup early initialization · d375b4e0
      Authored by Peter Zijlstra
      The code would assume sched_clock_stable() and switch to !stable
      later; this switch brings a discontinuity in time.
      
      The discontinuity on switching from stable to unstable was always
      present, but previously we would set stable/unstable before
      initializing TSC and usually stick to the one we start out with.
      
      So the static_key bits brought an extra switch where there previously
      wasn't one.
      
      Things are further complicated by the fact that we cannot use
      static_key as early as we usually call set_sched_clock_stable().
      
      Fix things by tracking the stable state in a regular variable and only
      setting the static_key to the right state in sched_clock_init(), which
      is run right after late_time_init->tsc_init().
      
      Before this we would not be using the TSC anyway.
      Reported-and-Tested-by: Sasha Levin <sasha.levin@oracle.com>
      Reported-by: dyoung@redhat.com
      Fixes: 35af99e6 ("sched/clock, x86: Use a static_key for sched_clock_stable")
      Cc: jacob.jun.pan@linux.intel.com
      Cc: Mike Galbraith <bitbucket@online.de>
      Cc: hpa@zytor.com
      Cc: paulmck@linux.vnet.ibm.com
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: lenb@kernel.org
      Cc: rjw@rjwysocki.net
      Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Cc: rui.zhang@intel.com
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140122115918.GG3694@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d375b4e0
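      A sketch of the scheme described above, with assumed names: early code only toggles a plain variable, and the static_key is brought in line once, from sched_clock_init(), when jump labels are usable:

        #include <linux/init.h>

        static int clock_stable_early = 1;  /* plain variable, safe very early */

        void clear_sched_clock_stable(void)
        {
            clock_stable_early = 0;
            /* after sched_clock_init() has run, also flip the static_key here */
        }

        void __init sched_clock_init(void)  /* right after late_time_init->tsc_init() */
        {
            if (!clock_stable_early)
                __clear_sched_clock_stable_key();   /* hypothetical: one-time key flip */
            /* ... existing initialization ... */
        }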
  16. 13 Jan 2014: 3 commits
    • sched/clock: Fix up clear_sched_clock_stable() · 6577e42a
      Authored by Peter Zijlstra
      The below tells us the static_key conversion has a problem; since the
      exact point of clearing that flag isn't too important, delay the flip
      and use a workqueue to process it.
      
      [ ] TSC synchronization [CPU#0 -> CPU#22]:
      [ ] Measured 8 cycles TSC warp between CPUs, turning off TSC clock.
      [ ]
      [ ] ======================================================
      [ ] [ INFO: possible circular locking dependency detected ]
      [ ] 3.13.0-rc3-01745-g848b0d0322cb-dirty #637 Not tainted
      [ ] -------------------------------------------------------
      [ ] swapper/0/1 is trying to acquire lock:
      [ ]  (jump_label_mutex){+.+...}, at: [<ffffffff8115a637>] jump_label_lock+0x17/0x20
      [ ]
      [ ] but task is already holding lock:
      [ ]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109408b>] cpu_hotplug_begin+0x2b/0x60
      [ ]
      [ ] which lock already depends on the new lock.
      [ ]
      [ ]
      [ ] the existing dependency chain (in reverse order) is:
      [ ]
      [ ] -> #1 (cpu_hotplug.lock){+.+.+.}:
      [ ]        [<ffffffff810def00>] lock_acquire+0x90/0x130
      [ ]        [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0
      [ ]        [<ffffffff81093fdc>] get_online_cpus+0x3c/0x60
      [ ]        [<ffffffff8104cc67>] arch_jump_label_transform+0x37/0x130
      [ ]        [<ffffffff8115a3cf>] __jump_label_update+0x5f/0x80
      [ ]        [<ffffffff8115a48d>] jump_label_update+0x9d/0xb0
      [ ]        [<ffffffff8115aa6d>] static_key_slow_inc+0x9d/0xb0
      [ ]        [<ffffffff810c0f65>] sched_feat_set+0xf5/0x100
      [ ]        [<ffffffff810c5bdc>] set_numabalancing_state+0x2c/0x30
      [ ]        [<ffffffff81d12f3d>] numa_policy_init+0x1af/0x1b7
      [ ]        [<ffffffff81cebdf4>] start_kernel+0x35d/0x41f
      [ ]        [<ffffffff81ceb5a5>] x86_64_start_reservations+0x2a/0x2c
      [ ]        [<ffffffff81ceb6a2>] x86_64_start_kernel+0xfb/0xfe
      [ ]
      [ ] -> #0 (jump_label_mutex){+.+...}:
      [ ]        [<ffffffff810de141>] __lock_acquire+0x1701/0x1eb0
      [ ]        [<ffffffff810def00>] lock_acquire+0x90/0x130
      [ ]        [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0
      [ ]        [<ffffffff8115a637>] jump_label_lock+0x17/0x20
      [ ]        [<ffffffff8115aa3b>] static_key_slow_inc+0x6b/0xb0
      [ ]        [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20
      [ ]        [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70
      [ ]        [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150
      [ ]        [<ffffffff81076612>] native_cpu_up+0x3a2/0x890
      [ ]        [<ffffffff810941cb>] _cpu_up+0xdb/0x160
      [ ]        [<ffffffff810942c9>] cpu_up+0x79/0x90
      [ ]        [<ffffffff81d0af6b>] smp_init+0x60/0x8c
      [ ]        [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197
      [ ]        [<ffffffff8164e32e>] kernel_init+0xe/0x130
      [ ]        [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0
      [ ]
      [ ] other info that might help us debug this:
      [ ]
      [ ]  Possible unsafe locking scenario:
      [ ]
      [ ]        CPU0                    CPU1
      [ ]        ----                    ----
      [ ]   lock(cpu_hotplug.lock);
      [ ]                                lock(jump_label_mutex);
      [ ]                                lock(cpu_hotplug.lock);
      [ ]   lock(jump_label_mutex);
      [ ]
      [ ]  *** DEADLOCK ***
      [ ]
      [ ] 2 locks held by swapper/0/1:
      [ ]  #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81094037>] cpu_maps_update_begin+0x17/0x20
      [ ]  #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109408b>] cpu_hotplug_begin+0x2b/0x60
      [ ]
      [ ] stack backtrace:
      [ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc3-01745-g848b0d0322cb-dirty #637
      [ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
      [ ]  ffffffff82c9c270 ffff880236843bb8 ffffffff8165c5f5 ffffffff82c9c270
      [ ]  ffff880236843bf8 ffffffff81658c02 ffff880236843c80 ffff8802368586a0
      [ ]  ffff880236858678 0000000000000001 0000000000000002 ffff880236858000
      [ ] Call Trace:
      [ ]  [<ffffffff8165c5f5>] dump_stack+0x4e/0x7a
      [ ]  [<ffffffff81658c02>] print_circular_bug+0x1f9/0x207
      [ ]  [<ffffffff810de141>] __lock_acquire+0x1701/0x1eb0
      [ ]  [<ffffffff816680ff>] ? __atomic_notifier_call_chain+0x8f/0xb0
      [ ]  [<ffffffff810def00>] lock_acquire+0x90/0x130
      [ ]  [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20
      [ ]  [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20
      [ ]  [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0
      [ ]  [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20
      [ ]  [<ffffffff8115a637>] jump_label_lock+0x17/0x20
      [ ]  [<ffffffff8115aa3b>] static_key_slow_inc+0x6b/0xb0
      [ ]  [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20
      [ ]  [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70
      [ ]  [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150
      [ ]  [<ffffffff81076612>] native_cpu_up+0x3a2/0x890
      [ ]  [<ffffffff810941cb>] _cpu_up+0xdb/0x160
      [ ]  [<ffffffff810942c9>] cpu_up+0x79/0x90
      [ ]  [<ffffffff81d0af6b>] smp_init+0x60/0x8c
      [ ]  [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197
      [ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
      [ ]  [<ffffffff8164e32e>] kernel_init+0xe/0x130
      [ ]  [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0
      [ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
      [ ] ------------[ cut here ]------------
      [ ] WARNING: CPU: 0 PID: 1 at /usr/src/linux-2.6/kernel/smp.c:374 smp_call_function_many+0xad/0x300()
      [ ] Modules linked in:
      [ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc3-01745-g848b0d0322cb-dirty #637
      [ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
      [ ]  0000000000000009 ffff880236843be0 ffffffff8165c5f5 0000000000000000
      [ ]  ffff880236843c18 ffffffff81093d8c 0000000000000000 0000000000000000
      [ ]  ffffffff81ccd1a0 ffffffff810ca951 0000000000000000 ffff880236843c28
      [ ] Call Trace:
      [ ]  [<ffffffff8165c5f5>] dump_stack+0x4e/0x7a
      [ ]  [<ffffffff81093d8c>] warn_slowpath_common+0x8c/0xc0
      [ ]  [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0
      [ ]  [<ffffffff81093dda>] warn_slowpath_null+0x1a/0x20
      [ ]  [<ffffffff8110b72d>] smp_call_function_many+0xad/0x300
      [ ]  [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30
      [ ]  [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30
      [ ]  [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0
      [ ]  [<ffffffff8110ba96>] smp_call_function+0x46/0x80
      [ ]  [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30
      [ ]  [<ffffffff8110bb3c>] on_each_cpu+0x3c/0xa0
      [ ]  [<ffffffff810ca950>] ? sched_clock_idle_sleep_event+0x20/0x20
      [ ]  [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0
      [ ]  [<ffffffff8104f964>] text_poke_bp+0x64/0xd0
      [ ]  [<ffffffff810ca950>] ? sched_clock_idle_sleep_event+0x20/0x20
      [ ]  [<ffffffff8104ccde>] arch_jump_label_transform+0xae/0x130
      [ ]  [<ffffffff8115a3cf>] __jump_label_update+0x5f/0x80
      [ ]  [<ffffffff8115a48d>] jump_label_update+0x9d/0xb0
      [ ]  [<ffffffff8115aa6d>] static_key_slow_inc+0x9d/0xb0
      [ ]  [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20
      [ ]  [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70
      [ ]  [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150
      [ ]  [<ffffffff81076612>] native_cpu_up+0x3a2/0x890
      [ ]  [<ffffffff810941cb>] _cpu_up+0xdb/0x160
      [ ]  [<ffffffff810942c9>] cpu_up+0x79/0x90
      [ ]  [<ffffffff81d0af6b>] smp_init+0x60/0x8c
      [ ]  [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197
      [ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
      [ ]  [<ffffffff8164e32e>] kernel_init+0xe/0x130
      [ ]  [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0
      [ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
      [ ] ---[ end trace 6ff1df5620c49d26 ]---
      [ ] tsc: Marking TSC unstable due to check_tsc_sync_source failed
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-v55fgqj3nnyqnngmvuu8ep6h@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6577e42a
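      A sketch of the deferral the changelog describes: instead of flipping the jump label from the hotplug path (which trips over jump_label_mutex, as in the lockdep splat above), the flip is queued to run later from process context. The key name here is an assumption:

        #include <linux/workqueue.h>
        #include <linux/jump_label.h>

        static struct static_key sched_clock_unstable_key = STATIC_KEY_INIT_FALSE;

        static void __clear_sched_clock_stable(struct work_struct *work)
        {
            /* process context: no hotplug / jump_label lock inversion */
            static_key_slow_inc(&sched_clock_unstable_key);
        }
        static DECLARE_WORK(sched_clock_work, __clear_sched_clock_stable);

        void clear_sched_clock_stable(void)
        {
            schedule_work(&sched_clock_work);   /* defer the jump_label flip */
        }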
    • sched/clock, x86: Use a static_key for sched_clock_stable · 35af99e6
      Authored by Peter Zijlstra
      In order to avoid the runtime condition and variable load, turn
      sched_clock_stable into a static_key.
      
      Also provide a shorter implementation of local_clock() and
      cpu_clock(int) when sched_clock_stable==1.
      
                              MAINLINE   PRE       POST
      
          sched_clock_stable: 1          1         1
          (cold) sched_clock: 329841     221876    215295
          (cold) local_clock: 301773     234692    220773
          (warm) sched_clock: 38375      25602     25659
          (warm) local_clock: 100371     33265     27242
          (warm) rdtsc:       27340      24214     24208
          sched_clock_stable: 0          0         0
          (cold) sched_clock: 382634     235941    237019
          (cold) local_clock: 396890     297017    294819
          (warm) sched_clock: 38194      25233     25609
          (warm) local_clock: 143452     71234     71232
          (warm) rdtsc:       27345      24245     24243
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      35af99e6
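      Roughly, the shorter fast path the numbers above are measuring (a sketch, not the literal code): when the clock is known stable, local_clock() bypasses the per-CPU correction machinery entirely:

        #include <linux/sched.h>
        #include <linux/smp.h>

        u64 local_clock(void)
        {
            if (sched_clock_stable())   /* static_key: patched-in branch, no data load */
                return sched_clock();
            return sched_clock_cpu(raw_smp_processor_id());
        }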
    • sched/clock: Remove local_irq_disable() from the clocks · ef08f0ff
      Authored by Peter Zijlstra
      Now that x86 no longer requires IRQs disabled for sched_clock() and
      ia64 never had this requirement (it doesn't seem to do cpufreq at
      all), we can remove the requirement of disabling IRQs.
      
                              MAINLINE   PRE        POST
      
          sched_clock_stable: 1          1          1
          (cold) sched_clock: 329841     257223     221876
          (cold) local_clock: 301773     309889     234692
          (warm) sched_clock: 38375      25280      25602
          (warm) local_clock: 100371     85268      33265
          (warm) rdtsc:       27340      24247      24214
          sched_clock_stable: 0          0          0
          (cold) sched_clock: 382634     301224     235941
          (cold) local_clock: 396890     399870     297017
          (warm) sched_clock: 38194      25630      25233
          (warm) local_clock: 143452     129629     71234
          (warm) rdtsc:       27345      24307      24245
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-36e5kohiasnr106d077mgubp@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ef08f0ff
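      The shape of the change, sketched with the old state left as comments (structure assumed, not the literal diff):

        #include <linux/sched.h>

        u64 cpu_clock(int cpu)
        {
            /*
             * before:  local_irq_save(flags);
             *          clock = sched_clock_cpu(cpu);
             *          local_irq_restore(flags);
             * after: the readout no longer needs IRQs disabled
             */
            return sched_clock_cpu(cpu);
        }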
  17. 08 Apr 2013: 1 commit
    • sched_clock: Prevent 64bit inatomicity on 32bit systems · a1cbcaa9
      Authored by Thomas Gleixner
      The sched_clock_remote() implementation has the following inatomicity
      problem on 32bit systems when accessing the remote scd->clock, which
      is a 64bit value.
      
      CPU0			CPU1
      
      sched_clock_local()	sched_clock_remote(CPU0)
      ...
      			remote_clock = scd[CPU0]->clock
      			    read_low32bit(scd[CPU0]->clock)
      cmpxchg64(scd->clock,...)
      			    read_high32bit(scd[CPU0]->clock)
      
      While the update of scd->clock is using an atomic64 mechanism, the
      readout on the remote cpu is not, which can cause completely bogus
      readouts.
      
      It is a quite rare problem, because it requires the update to hit the
      narrow race window between the low/high readout and the update must go
      across the 32bit boundary.
      
      The resulting misbehaviour is that CPU1 will see CPU0's sched_clock
      ~4 seconds ahead of its own and update CPU1's sched_clock value
      to this bogus timestamp. This stays that way, due to the clamping
      implementation, for about 4 seconds until the synchronization with
      CLOCK_MONOTONIC undoes the problem.
      
      The issue is hard to observe, because it might only result in a less
      accurate SCHED_OTHER timeslicing behaviour. To create observable
      damage on realtime scheduling classes, it is necessary that the bogus
      update of CPU1's sched_clock happens in the context of a realtime
      thread, which then gets charged 4 seconds of RT runtime, which causes
      the RT throttler mechanism to trigger and prevent scheduling of RT
      tasks for a little less than 4 seconds. So this is quite unlikely as
      well.
      
      The issue was quite hard to decode as the reproduction time is between
      2 days and 3 weeks and intrusive tracing makes it less likely, but the
      following trace recorded with trace_clock=global, which uses
      sched_clock_local(), gave the final hint:
      
        <idle>-0   0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80
        <idle>-0   0d..30 400269.477151: hrtimer_start:  hrtimer=0xf7061e80 ...
      irq/20-S-587 1d..32 400273.772118: sched_wakeup:   comm= ... target_cpu=0
        <idle>-0   0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80
      
      What happens is that CPU0 goes idle and invokes
      sched_clock_idle_sleep_event(), which invokes sched_clock_local(), and
      CPU1 runs a remote wakeup for CPU0 at the same time, which invokes
      sched_clock_remote(). The time jump gets propagated to CPU0 via
      sched_clock_remote() and stays stale on both cores for ~4 seconds.
      
      There are only two other possibilities, which could cause a stale
      sched clock:
      
      1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic
         wrong value.
      
      2) sched_clock() which reads the TSC returns a sporadic wrong value.
      
      #1 can be excluded because sched_clock would continue to increase for
         one jiffy and then go stale.
      
      #2 can be excluded because it would not make the clock jump
         forward. It would just result in a stale sched_clock for one jiffy.
      
      After quite some brain twisting and finding the same pattern on other
      traces, sched_clock_remote() remained the only place which could cause
      such a problem and as explained above it's indeed racy on 32bit
      systems.
      
      So while on 64bit systems the readout is atomic, we need to verify the
      remote readout on 32bit machines. We need to protect the local->clock
      readout in sched_clock_remote() on 32bit as well because an NMI could
      hit between the low and the high readout, call sched_clock_local() and
      modify local->clock.
      
      Thanks to Siegfried Wulsch for bearing with my debug requests and
      going through the tedious tasks of running a bunch of reproducer
      systems to generate the debug information which let me decode the
      issue.
      Reported-by: Siegfried Wulsch <Siegfried.Wulsch@rovema.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304051544160.21884@ionos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      a1cbcaa9
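      A sketch of the 32-bit fix as described above; helper names follow the changelog, the surrounding structure and fields are assumed:

        /* sched_clock_data: per-CPU state with a u64 'clock' field (assumed) */
        static u64 sched_clock_remote(struct sched_clock_data *scd)
        {
            struct sched_clock_data *my_scd = this_scd();   /* this CPU's data */
            u64 this_clock, remote_clock;

        #if BITS_PER_LONG != 64
            /*
             * 32-bit: a plain 64-bit load can be torn by a concurrent update.
             * cmpxchg64() with old == new == 0 reads the remote clock
             * atomically (and never changes its value), and the local value
             * is taken via sched_clock_local() so an NMI hitting between the
             * two 32-bit halves cannot corrupt it either.
             */
            this_clock   = sched_clock_local(my_scd);
            remote_clock = cmpxchg64(&scd->clock, 0, 0);
        #else
            /* 64-bit loads are atomic; plain reads are fine */
            this_clock   = sched_clock_local(my_scd);
            remote_clock = scd->clock;
        #endif
            /* ... pick the larger value and propagate it, as before ... */
            return max(this_clock, remote_clock);
        }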
  18. 17 Nov 2011: 1 commit
  19. 31 Oct 2011: 1 commit
  20. 23 Nov 2010: 1 commit
  21. 09 Jun 2010: 1 commit
  22. 15 Apr 2010: 1 commit
  23. 15 Dec 2009: 1 commit
    • sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK · b9f8fcd5
      Authored by David Miller
      Relax stable-sched-clock architectures to not save/disable/restore
      hardirqs in cpu_clock().
      
      The background is that I was trying to resolve a sparc64 perf
      issue when I discovered this problem.
      
      On sparc64 I implement pseudo NMIs by simply running the kernel
      at IRQ level 14 when local_irq_disable() is called; this allows
      performance counter events to still come in at IRQ level 15.
      
      This doesn't work if any code in an NMI handler does
      local_irq_save() or local_irq_disable(), since the "disable" will
      kick us back to cpu IRQ level 14, thus letting NMIs back in, and
      we recurse.
      
      The only path that does that in the perf event IRQ
      handling path is the code supporting frequency-based events.  It
      uses cpu_clock().
      
      cpu_clock() simply invokes sched_clock() with IRQs disabled.
      
      And that's a fundamental bug all on its own, particularly for
      the HAVE_UNSTABLE_SCHED_CLOCK case.  NMIs can thus get into the
      sched_clock() code, interrupting its IRQ-disabled code
      sections.
      
      Furthermore, for the not-HAVE_UNSTABLE_SCHED_CLOCK case, the IRQ
      disabling done by cpu_clock() is just pure overhead and
      completely unnecessary.
      
      So the core problem is that sched_clock() is not NMI safe, but
      we are invoking it from NMI contexts in the perf events code
      (via cpu_clock()).
      
      A less important issue is the overhead of IRQ disabling when it
      isn't necessary in cpu_clock().
      
      CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures are not
      affected by this patch.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20091213.182502.215092085.davem@davemloft.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b9f8fcd5
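      A sketch of what the relaxation means for the stable-clock case (configuration guard as named in the changelog, body assumed):

        #include <linux/sched.h>

        #ifndef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
        u64 cpu_clock(int cpu)
        {
            /* no local_irq_save()/restore(): cheap and safe from NMI context */
            return sched_clock();
        }
        #endif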
  24. 01 Oct 2009: 1 commit
  25. 19 Sep 2009: 1 commit
    • sched_clock: Make it NMI safe · def0a9b2
      Authored by Peter Zijlstra
      Arjan complained about the suckyness of TSC on modern machines, and
      asked if we could do something about that for PERF_SAMPLE_TIME.
      
      Make cpu_clock() NMI safe by removing the spinlock and using
      cmpxchg. This also makes it smaller and more robust.
      
      Affects architectures that use HAVE_UNSTABLE_SCHED_CLOCK, i.e. IA64
      and x86.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      def0a9b2
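      A sketch of the lockless update pattern the changelog describes (field and helper names assumed): the per-CPU clock is advanced with cmpxchg instead of a spinlock, which is what makes it callable from NMI context:

        /* sched_clock_data: per-CPU state with a u64 'clock' field (assumed) */
        static u64 scd_update(struct sched_clock_data *scd, u64 now)
        {
            u64 old, clock;

            do {
                old   = scd->clock;
                clock = max(old, now);  /* keep the per-CPU clock monotonic */
            } while (cmpxchg64(&scd->clock, old, clock) != old);

            return clock;
        }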
  26. 09 May 2009: 1 commit
  27. 27 Feb 2009: 3 commits
  28. 31 Dec 2008: 1 commit
    • sched_clock: prevent scd->clock from moving backwards, take #2 · 1c5745aa
      Authored by Thomas Gleixner
      Redo:
      
        5b7dba4f: sched_clock: prevent scd->clock from moving backwards
      
      which had to be reverted due to s2ram hangs:
      
        ca7e716c: Revert "sched_clock: prevent scd->clock from moving backwards"
      
      ... this time with resume restoring GTOD later in the sequence
      taken into account as well.
      
      The "timekeeping_suspended" flag is not very nice but we cannot call into
      GTOD before it has been properly resumed and the scheduler will run very
      early in the resume sequence.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1c5745aa
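      A simplified sketch of the two ingredients the changelog names (field names assumed, not the real arithmetic): skip GTOD while timekeeping is still suspended during resume, and never hand out a value smaller than the last one:

        /* sched_clock_data: per-CPU state with u64 'clock' and 'tick_gtod' fields (assumed) */
        static u64 scd_sample(struct sched_clock_data *scd)
        {
            u64 now;

            /* cannot call into GTOD before resume has restored it */
            if (unlikely(timekeeping_suspended))
                now = scd->tick_gtod;           /* last cached GTOD sample */
            else
                now = ktime_to_ns(ktime_get());

            /* prevent scd->clock from moving backwards */
            scd->clock = max(scd->clock, now);
            return scd->clock;
        }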