1. 19 Apr, 2014 1 commit
  2. 04 Apr, 2014 1 commit
    • kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one · 62572e29
      Ben Zhang committed
      I ran into a scenario where while one cpu was stuck and should have
      panic'd because of the NMI watchdog, it didn't.  The reason was another
      cpu was spewing stack dumps on to the console.  Upon investigation, I
      noticed that when writing to the console and also when dumping the
      stack, the watchdog is touched.
      
      This causes all the cpus to reset their NMI watchdog flags and the
      'stuck' cpu just spins forever.
      
      This change causes the semantics of touch_nmi_watchdog to be changed
      slightly.  Previously, I accidentally changed the semantics and we
      noticed there was a codepath in which touch_nmi_watchdog could be
      touched from a preemptible area.  That caused a BUG() to happen when
      CONFIG_DEBUG_PREEMPT was enabled.  I believe it was the acpi code.
      
      My attempt here re-introduces the change to have the
      touch_nmi_watchdog() code only touch the local cpu instead of all of the
      cpus.  But instead of using __get_cpu_var(), I use the
      __raw_get_cpu_var() version.
      
      This avoids the preemption problem.  However my reasoning wasn't because
      I was trying to be lazy.  Instead I rationalized it as, well if
      preemption is enabled then interrupts should be enabled too and the NMI
      watchdog will have no reason to trigger.  So it won't matter if the
      wrong cpu is touched because the percpu interrupt counters the NMI
      watchdog uses should still be incrementing.
      
      Don said:
      
      : I'm ok with this patch, though it does alter the behaviour of how
      : touch_nmi_watchdog works.  For the most part I don't think most callers
      : need to touch all of the watchdogs (on each cpu).  Perhaps a corner case
      : will pop up (the scheduler??  to mimic touch_all_softlockup_watchdogs() ).
      :
      : But this does address an issue where if a system is locked up and one cpu
      : is spewing out useful debug messages (or error messages), the hard lockup
      : will fail to go off.  We have seen this on RHEL also.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Ben Zhang <benzh@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
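
      The difference between touching every CPU and touching only the
      local one can be sketched as a small userspace simulation.  This is
      not the kernel code: NR_CPUS, the flag array and all function names
      below are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical CPU count for the simulation */

/* One "touched" flag per simulated CPU (stand-in for a per-cpu variable). */
static bool nmi_touch[NR_CPUS];

/* Old semantics: touching the watchdog marks every CPU as touched. */
void touch_all_cpus(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        nmi_touch[cpu] = true;
}

/* New semantics: only the calling CPU is marked as touched. */
void touch_local_cpu(int this_cpu)
{
    assert(this_cpu >= 0 && this_cpu < NR_CPUS);
    nmi_touch[this_cpu] = true;
}

/*
 * Simulated NMI check on one CPU.  A touched CPU gets a free pass and its
 * flag is cleared; otherwise a lockup is reported if the CPU is stuck.
 * Returns true when a hard lockup would be reported.
 */
bool nmi_check(int cpu, bool cpu_is_stuck)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (nmi_touch[cpu]) {
        nmi_touch[cpu] = false;
        return false;
    }
    return cpu_is_stuck;
}
```

      With touch_all_cpus(), a CPU that keeps printing to the console
      keeps clearing the check for a stuck CPU as well.  With
      touch_local_cpu(), the stuck CPU is still caught.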
  3. 25 Feb, 2014 1 commit
    • watchdog: Simplify a little the IPI call · e0a23b06
      Frederic Weisbecker committed
      In order to remotely restart the watchdog hrtimer, update_timers()
      allocates a csd on the stack and passes it to __smp_call_function_single().
      
      There is no particular need, however, for a specific csd here. Let's
      simplify that a little by calling smp_call_function_single()
      which can already take care of the csd allocation by itself.
      Acked-by: Don Zickus <dzickus@redhat.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  4. 25 Sep, 2013 2 commits
    • watchdog: update watchdog_thresh properly · 9809b18f
      Michal Hocko committed
      watchdog_thresh controls how often the nmi perf event counter checks the
      per-cpu hrtimer_interrupts counter and blows up if the counter hasn't changed
      since the last check.  The counter is updated by per-cpu
      watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh
      period which guarantees that hrtimer is scheduled 2 times per the main
      period.  Both hrtimer and perf event are started together when the
      watchdog is enabled.
      
      So far so good.  But...
      
      But what happens when watchdog_thresh is updated from sysctl handler?
      
      proc_dowatchdog will set a new sampling period and hrtimer callback
      (watchdog_timer_fn) will use the new value in the next round.  The
      problem, however, is that nobody tells the perf event that the sampling
      period has changed so it is ticking with the period configured when it
      has been set up.
      
      This might result in an ear ripping dissonance between perf and hrtimer
      parts if the watchdog_thresh is increased.  And even worse it might lead
      to KABOOM if the watchdog is configured to panic on such a spurious
      lockup.
      
      This patch fixes the issue by updating both the nmi perf event counter and
      the hrtimers if the threshold value has changed.
      
      The nmi one is disabled and then reinitialized from scratch.  This has
      an unpleasant side effect that the allocation of the new event might
      fail theoretically so the hard lockup detector would be disabled for
      such cpus.  On the other hand such a memory allocation failure is very
      unlikely because the original event is deallocated right before.
      
      It would be much nicer if we just changed perf event period but there
      doesn't seem to be any API to do that right now.  It is also unfortunate
      that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we
      cannot use on_each_cpu() and do the same thing from the per-cpu context.
      The update from the current CPU should be safe because
      perf_event_disable removes the event atomically before it clears the
      per-cpu watchdog_ev so it cannot change anything under running handler
      feet.
      
      The hrtimer is simply restarted (thanks to Don Zickus who has pointed
      this out) if it is queued because we cannot rely on it firing and adapting
      to the new sampling period before a new nmi event triggers (when the
      threshold is decreased).
      
      [akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place]
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Fabio Estevam <festevam@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
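
      The period arithmetic described in this commit can be worked through
      with a short sketch.  It assumes the hrtimer period is exactly 2/5
      of watchdog_thresh seconds, as the message states; the function
      names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* hrtimer period in nanoseconds: 2/5 of watchdog_thresh (given in seconds). */
uint64_t hrtimer_period_ns(unsigned int watchdog_thresh)
{
    return (uint64_t)watchdog_thresh * 2 * (NSEC_PER_SEC / 5);
}

/*
 * Minimum number of hrtimer ticks guaranteed to land inside one NMI check
 * window.  nmi_thresh is the threshold the perf event was set up with,
 * hrtimer_thresh is the threshold the hrtimer currently uses; they differ
 * when only the hrtimer picked up a sysctl update.
 */
uint64_t min_ticks_per_nmi_window(unsigned int nmi_thresh,
                                  unsigned int hrtimer_thresh)
{
    assert(hrtimer_thresh > 0);
    return ((uint64_t)nmi_thresh * NSEC_PER_SEC) /
           hrtimer_period_ns(hrtimer_thresh);
}
```

      With both sides at 10 seconds the hrtimer fires every 4 seconds, so
      at least 2 ticks fall into each NMI window.  If the threshold is
      raised to 60 while the perf event still samples every 10 seconds,
      the hrtimer period becomes 24 seconds and the guaranteed tick count
      drops to 0, which is the spurious lockup this commit addresses.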
    • watchdog: update watchdog attributes atomically · 359e6fab
      Michal Hocko committed
      proc_dowatchdog doesn't synchronize multiple callers, so two parallel
      callers might race between watchdog_enable_all_cpus and
      watchdog_disable_all_cpus (e.g. the watchdog gets enabled even though
      watchdog_thresh was already set to 0).
      
      This patch adds a local mutex which synchronizes callers to the sysctl
      handler.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 31 Jul, 2013 1 commit
  6. 20 Jun, 2013 3 commits
    • watchdog: Boot-disable by default on full dynticks · 940be35a
      Frederic Weisbecker committed
      When the watchdog runs, it prevents the full dynticks
      CPUs from stopping their tick because the hard lockup
      detector uses perf events internally, which in turn
      rely on the periodic tick.
      
      Since this is a rather confusing behaviour that is not
      easy to track down and identify for those who want to
      test CONFIG_NO_HZ_FULL, let's default disable the
      watchdog on boot time when full dynticks is enabled.
      
      The user can still enable it later on runtime using
      proc or sysctl.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anish Singh <anish198519851985@gmail.com>
    • watchdog: Rename confusing state variable · 3c00ea82
      Frederic Weisbecker committed
      We have two very conflicting state variable names in the
      watchdog:
      
      * watchdog_enabled: This one reflects the user interface. It's
      set to 1 by default and can be overridden with boot options
      or sysctl/procfs interface.
      
      * watchdog_disabled: This is the internal toggle state that
      tells if watchdog threads, timers and NMI events are currently
      running or not. This state mostly depends on the user settings.
      It's a convenient state latch.
      
      Now we really need to find clearer names because those
      are just too confusing to encourage deep review.
      
      watchdog_enabled now becomes watchdog_user_enabled to reflect
      its purpose as an interface.
      
      watchdog_disabled becomes watchdog_running to suggest its
      role as a pure internal state.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anish Singh <anish198519851985@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
    • watchdog: Register / unregister watchdog kthreads on sysctl control · b8900bc0
      Frederic Weisbecker committed
      The user activation/deactivation of the watchdog through boot parameters
      or sysctl is currently implemented with a dance involving kthreads parking
      and unparking methods: the threads are unconditionally registered on
      boot and they park as soon as the user wants the watchdog to be disabled.
      
      This method involves a few noisy details to handle though: the watchdog
      kthreads may be unparked anytime due to hotplug operations, after which
      the watchdog internals have to decide to park again if it is user-disabled.
      
      As a result the setup() and unpark() methods need to be able to request a
      reparking. This is not currently supported in the kthread infrastructure
      so this piece of the watchdog code only works halfway.
      
      Besides, unparking/reparking the watchdog kthreads consumes unnecessary
      cputime on hotplug operations when those could be simply ignored in the
      first place.
      
      As suggested by Srivatsa, let's instead only register the watchdog
      threads when they are needed. This way we don't need to think about
      hotplug operations and we don't burden the CPU onlining when the watchdog
      is simply disabled.
      Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anish Singh <anish198519851985@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
  7. 14 Mar, 2013 1 commit
  8. 19 Feb, 2013 1 commit
  9. 08 Feb, 2013 1 commit
  10. 20 Dec, 2012 1 commit
    • watchdog: Fix disable/enable regression · 3935e895
      Bjørn Mork committed
      Commit 8d451690 ("watchdog: Fix CPU hotplug regression") causes an
      oops or hard lockup when doing
      
       echo 0 > /proc/sys/kernel/nmi_watchdog
       echo 1 > /proc/sys/kernel/nmi_watchdog
      
      and the kernel is booted with nmi_watchdog=1 (default)
      
      Running laptop-mode-tools and disconnecting/connecting AC power will
      cause this to trigger, making it a common failure scenario on laptops.
      
      Instead of bailing out of watchdog_disable() when !watchdog_enabled we
      can initialize the hrtimer regardless of watchdog_enabled status.  This
      makes it safe to call watchdog_disable() in the nmi_watchdog=0 case,
      without the negative effect on the enabled => disabled => enabled case.
      
      All these tests pass with this patch:
      - nmi_watchdog=1
        echo 0 > /proc/sys/kernel/nmi_watchdog
        echo 1 > /proc/sys/kernel/nmi_watchdog
      
      - nmi_watchdog=0
        echo 0 > /sys/devices/system/cpu/cpu1/online
      
      - nmi_watchdog=0
        echo mem > /sys/power/state
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51661
      
      Cc: <stable@vger.kernel.org> # v3.7
      Cc: Norbert Warmuth <nwarmuth@t-online.de>
      Cc: Joseph Salisbury <joseph.salisbury@canonical.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 18 Dec, 2012 1 commit
  12. 05 Dec, 2012 1 commit
  13. 27 Nov, 2012 1 commit
  14. 13 Aug, 2012 1 commit
  15. 09 Aug, 2012 1 commit
  16. 31 Jul, 2012 1 commit
    • NMI watchdog: fix for lockup detector breakage on resume · 45226e94
      Sameer Nanda committed
      On the suspend/resume path the boot CPU does not go through an
      offline->online transition.  This breaks the NMI detector post-resume
      since it depends on PMU state that is lost when the system gets
      suspended.
      
      Fix this by forcing a CPU offline->online transition for the lockup
      detector on the boot CPU during resume.
      
      To provide more context, we enable NMI watchdog on Chrome OS.  We have
      seen several reports of systems freezing up completely which indicated
      that the NMI watchdog was not firing for some reason.
      
      Debugging further, we found a simple way of repro'ing system freezes --
      issuing the command 'taskset 1 sh -c "echo nmilockup > /proc/breakme"'
      after the system has been suspended/resumed one or more times.
      
      With this patch in place, the system freezes result in panics, as
      expected.
      
      These panics provide a nice stack trace for us to debug the actual issue
      causing the freeze.
      
      [akpm@linux-foundation.org: fiddle with code comment]
      [akpm@linux-foundation.org: make lockup_detector_bootcpu_resume() conditional on CONFIG_SUSPEND]
      [akpm@linux-foundation.org: fix section errors]
      Signed-off-by: Sameer Nanda <snanda@chromium.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Mandeep Singh Baines <msb@chromium.org>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 14 Jun, 2012 1 commit
    • watchdog: Quiet down the boot messages · a7027046
      Don Zickus committed
      A bunch of bugzillas have complained how noisy the nmi_watchdog
      is during boot-up especially with its expected failure cases
      (like virt and bios resource contention).
      
      This is my attempt to quiet them down and keep it less confusing
      for the end user.  What I did is print the message for cpu0 and
      save it for future comparisons.  If future cpus have an
      identical message as cpu0, then don't print the redundant info.
      However, if a future cpu has a different message, happily print
      that loudly.
      
      Before the change, you would see something like:
      
          ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
          CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
          Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
          ... version:                2
          ... bit width:              40
          ... generic registers:      2
          ... value mask:             000000ffffffffff
          ... max period:             000000007fffffff
          ... fixed-purpose events:   3
          ... event mask:             0000000700000003
          NMI watchdog enabled, takes one hw-pmu counter.
          Booting Node   0, Processors  #1
          NMI watchdog enabled, takes one hw-pmu counter.
           #2
          NMI watchdog enabled, takes one hw-pmu counter.
           #3 Ok.
          NMI watchdog enabled, takes one hw-pmu counter.
          Brought up 4 CPUs
          Total of 4 processors activated (22607.24 BogoMIPS).
      
      After the change, it is simplified to:
      
          ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
          CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
          Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
          ... version:                2
          ... bit width:              40
          ... generic registers:      2
          ... value mask:             000000ffffffffff
          ... max period:             000000007fffffff
          ... fixed-purpose events:   3
          ... event mask:             0000000700000003
          NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
          Booting Node   0, Processors  #1 #2 #3 Ok.
          Brought up 4 CPUs
      
      V2: little changes based on Joe Perches' feedback
      V3: printk cleanup based on Ingo's feedback; checkpatch fix
      V4: keep printk as one long line
      V5: Ingo fix ups
      Reported-and-tested-by: Nathan Zimmer <nzimmer@sgi.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: nzimmer@sgi.com
      Cc: joe@perches.com
      Link: http://lkml.kernel.org/r/1339594548-17227-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
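
      The "print for cpu0, stay quiet for identical later CPUs" idea from
      this commit can be sketched in userspace as follows.  This is only
      an illustration: the function name, the buffer size and the choice
      to compare message strings are assumptions, and the real patch may
      compare something else (for example an error code).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Message remembered from cpu0, used to suppress identical repeats. */
static char cpu0_msg[128];
static bool have_cpu0_msg;

/*
 * Report the watchdog status for one CPU.
 * cpu0 always prints and its message is saved.  Later CPUs print only
 * when their message differs from the saved one.
 * Returns true if a line was printed.
 */
bool report_watchdog_status(int cpu, const char *msg)
{
    assert(msg != NULL);

    if (cpu == 0) {
        snprintf(cpu0_msg, sizeof(cpu0_msg), "%s", msg);
        have_cpu0_msg = true;
        printf("NMI watchdog: %s\n", msg);
        return true;
    }

    /* Same outcome as cpu0: the information is redundant, stay quiet. */
    if (have_cpu0_msg && strcmp(cpu0_msg, msg) == 0)
        return false;

    /* Different outcome: print it so the discrepancy is visible. */
    printf("NMI watchdog: CPU%d: %s\n", cpu, msg);
    return true;
}
```

      Calling this once per CPU during bring-up yields a single line when
      all CPUs agree, and one extra line for each CPU that deviates.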
  18. 08 Apr, 2012 1 commit
  19. 24 Mar, 2012 3 commits
  20. 11 Feb, 2012 1 commit
  21. 27 Jan, 2012 1 commit
    • bugs, x86: Fix printk levels for panic, softlockups and stack dumps · b0f4c4b3
      Prarit Bhargava committed
      rsyslog will display KERN_EMERG messages on a connected
      terminal.  However, these messages are useless/undecipherable
      for a general user.
      
      For example, after a softlockup we get:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Stack:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Call Trace:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Code: ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89
       d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 <e8> ea 69 dd ff 4c 29 e8 48 89 c7 e8 0f bc da ff 49 89 c4 49 89
      
      This happens because the printk levels for these messages are
      incorrect. Only an informational message should be displayed on
      a terminal.
      
      I modified the printk levels for various messages in the kernel
      and tested the output by using the drivers/misc/lkdtm.c kernel
      modules (ie, softlockups, panics, hard lockups, etc.) and
      confirmed that the console output was still the same and that
      the output to the terminals was correct.
      
      For example, in the case of a softlockup we now see the much
      more informative:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 10:18:06 ...
       BUG: soft lockup - CPU4 stuck for 60s!
      
      instead of the above confusing messages.
      
      AFAICT, the messages no longer have to be KERN_EMERG.  In the
      most important case of a panic we set console_verbose().  As for
      the other less severe cases the correct data is output to the
      console and /var/log/messages.
      
      Successfully tested by me using the drivers/misc/lkdtm.c module.
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Cc: dzickus@redhat.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1327586134-11926-1-git-send-email-prarit@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  22. 01 Nov, 2011 1 commit
  23. 18 Sep, 2011 1 commit
  24. 14 Aug, 2011 1 commit
  25. 15 Jul, 2011 1 commit
    • perf, x86: P4 PMU - Introduce event alias feature · f9129870
      Cyrill Gorcunov committed
      Instead of hw_nmi_watchdog_set_attr() weak function
      and appropriate x86_pmu::hw_watchdog_set_attr() call
      we introduce an event alias mechanism which allows us
      to drop these routines completely and isolate quirks
      of Netburst architecture inside P4 PMU code only.
      
      The main idea remains the same though -- to allow
      nmi-watchdog and perf top run simultaneously.
      
      Note the aliasing mechanism applies to generic
      PERF_COUNT_HW_CPU_CYCLES event only because arbitrary
      event (say passed as RAW initially) might have some
      additional bits set inside ESCR register changing
      the behaviour of event and we can't guarantee anymore
      that alias event will give the same result.
      
      P.S. Huge thanks to Don and Steven for testing
           and early review.
      Acked-by: Don Zickus <dzickus@redhat.com>
      Tested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Stephane Eranian <eranian@google.com>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Arnaldo Carvalho de Melo <acme@redhat.com>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  26. 01 Jul, 2011 3 commits
  27. 24 May, 2011 2 commits
  28. 23 May, 2011 4 commits
  29. 29 Apr, 2011 1 commit