1. 05 Sep, 2015 4 commits
    • watchdog: introduce watchdog_park_threads() and watchdog_unpark_threads() · 81a4beef
      Ulrich Obergfell committed
      Originally watchdog_nmi_enable(cpu) and watchdog_nmi_disable(cpu) were
      only called in watchdog thread context.  However, the following commits
      utilize these functions outside of watchdog thread context too.
      
        commit 9809b18f
        Author: Michal Hocko <mhocko@suse.cz>
        Date:   Tue Sep 24 15:27:30 2013 -0700
      
            watchdog: update watchdog_thresh properly
      
        commit b3738d29
        Author: Stephane Eranian <eranian@google.com>
        Date:   Mon Nov 17 20:07:03 2014 +0100
      
            watchdog: Add watchdog enable/disable all functions
      
      Hence, it is now possible that these functions execute concurrently with
      the same 'cpu' argument.  This concurrency is problematic because per-cpu
      'watchdog_ev' can be accessed/modified without adequate synchronization.
      
      The patch series aims to address the above problem.  However, instead of
      introducing locks to protect per-cpu 'watchdog_ev' a different approach is
      taken: Invoke these functions by parking and unparking the watchdog
      threads (to ensure they are always called in watchdog thread context).
      
        static struct smp_hotplug_thread watchdog_threads = {
                 ...
                .park   = watchdog_disable, // calls watchdog_nmi_disable()
                .unpark = watchdog_enable,  // calls watchdog_nmi_enable()
        };
      
      Both previously mentioned commits call these functions in a similar way
      and thus in principle contain some duplicate code.  The patch series also
      avoids this duplication by providing a commonly usable mechanism.
      
      - Patch 1/4 introduces the watchdog_{park|unpark}_threads functions that
        park/unpark all watchdog threads specified in 'watchdog_cpumask'. They
        are intended to be called inside of kernel/watchdog.c only.
      
      - Patch 2/4 introduces the watchdog_{suspend|resume} functions which can
        be utilized by external callers to deactivate the hard and soft lockup
        detector temporarily.
      
      - Patch 3/4 utilizes watchdog_{park|unpark}_threads to replace some code
        that was introduced by commit 9809b18f.
      
      - Patch 4/4 utilizes watchdog_{suspend|resume} to replace some code that
        was introduced by commit b3738d29.
      
      A few corner cases should be mentioned here for completeness.
      
      - kthread_park() of watchdog/N could hang if cpu N is already locked up.
        However, if watchdog is enabled the lockup will be detected anyway.
      
      - kthread_unpark() of watchdog/N could hang if cpu N got locked up after
        kthread_park(). The occurrence of this scenario should be _very_ rare
        in practice, in particular because it is not expected that temporary
        deactivation will happen frequently, and if it happens at all it is
        expected that the duration of deactivation will be short.
      
      This patch (of 4): introduce watchdog_park_threads() and watchdog_unpark_threads()
      
      These functions are intended to be used only from inside kernel/watchdog.c
      to park/unpark all watchdog threads that are specified in
      watchdog_cpumask.
      Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
      Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
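The park/unpark approach described in commit 81a4beef can be illustrated with a minimal userspace sketch. This is not the kernel implementation: every `sim_*` name is hypothetical, the CPU mask is a plain integer standing in for 'watchdog_cpumask', and the two callbacks only stand in for the `.park` and `.unpark` hooks shown in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_NR_CPUS 8

/* Hypothetical per-CPU flag standing in for the state behind 'watchdog_ev'. */
static bool sim_nmi_active[SIM_NR_CPUS];

/* Stand-in for the .park callback (the commit's watchdog_disable). */
static void sim_watchdog_disable(unsigned int cpu)
{
    sim_nmi_active[cpu] = false;
}

/* Stand-in for the .unpark callback (the commit's watchdog_enable). */
static void sim_watchdog_enable(unsigned int cpu)
{
    sim_nmi_active[cpu] = true;
}

/* Park the simulated thread of every CPU whose bit is set in 'cpumask'. */
static void sim_watchdog_park_threads(uint32_t cpumask)
{
    for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
        if (cpumask & (UINT32_C(1) << cpu))
            sim_watchdog_disable(cpu);
}

/* Unpark the simulated thread of every CPU whose bit is set in 'cpumask'. */
static void sim_watchdog_unpark_threads(uint32_t cpumask)
{
    for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
        if (cpumask & (UINT32_C(1) << cpu))
            sim_watchdog_enable(cpu);
}

/* Count the CPUs whose simulated detector is currently active. */
static unsigned int sim_count_active(void)
{
    unsigned int n = 0;

    for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
        if (sim_nmi_active[cpu])
            n++;
    return n;
}
```

The point of the design is that the per-CPU state is only ever changed through the two callbacks, so a single path owns it and no extra lock is needed around it.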
    • kernel/watchdog: move NMI function header declarations from watchdog.h to nmi.h · aacfbe6a
      Guenter Roeck committed
      The kernel's NMI watchdog has nothing to do with the watchdog subsystem.
      Its header declarations should be in linux/nmi.h, not linux/watchdog.h.
      
      The code provided two sets of dummy functions if HARDLOCKUP_DETECTOR is
      not configured, one in the include file and one in kernel/watchdog.c.
      Remove the dummy functions from kernel/watchdog.c and use those from the
      include file.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watchdog: simplify housekeeping affinity with the appropriate mask · 314b08ff
      Frederic Weisbecker committed
      housekeeping_mask gathers all the CPUs that aren't part of the nohz_full
      set.  This is exactly what we want the watchdog to be affine to without
      the need to use complicated cpumask operations.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • smpboot: allow passing the cpumask on per-cpu thread registration · 230ec939
      Frederic Weisbecker committed
      This makes registration cheaper and simpler for smpboot per-cpu
      kthread users that don't always need to update the cpumask after thread
      creation.
      
      [sfr@canb.auug.org.au: fix for allow passing the cpumask on per-cpu thread registration]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 25 Jun, 2015 1 commit
    • watchdog: add watchdog_cpumask sysctl to assist nohz · fe4ba3c3
      Chris Metcalf committed
      Change the default behavior of watchdog so it only runs on the
      housekeeping cores when nohz_full is enabled at build and boot time.
      Allow modifying the set of cores the watchdog is currently running on
      with a new kernel.watchdog_cpumask sysctl.
      
      In the current system, the watchdog subsystem runs a periodic timer that
      schedules the watchdog kthread to run.  However, nohz_full cores are
      designed to allow userspace application code running on those cores to
      have 100% access to the CPU.  So the watchdog system prevents the
      nohz_full application code from being able to run the way it wants to,
      thus the motivation to suppress the watchdog on nohz_full cores, which
      this patchset provides by default.
      
      However, if we disable the watchdog globally, then the housekeeping
      cores can't benefit from the watchdog functionality.  So we allow
      disabling it only on some cores.  See Documentation/lockup-watchdogs.txt
      for more information.
      
      [jhubbard@nvidia.com: fix a watchdog crash in some configurations]
      Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
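The default described in commit fe4ba3c3 (run the watchdog only on housekeeping cores, i.e. the CPUs outside the nohz_full set) amounts to a single mask operation. The sketch below assumes at most 64 CPUs and uses plain integers in place of kernel cpumasks; the `sim_*` function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Default set of CPUs for the watchdog: every possible CPU that is
 * not part of the nohz_full set (the housekeeping CPUs).
 */
static uint64_t sim_default_watchdog_mask(uint64_t possible, uint64_t nohz_full)
{
    return possible & ~nohz_full;
}

/* True if the watchdog would run on 'cpu' for the given mask. */
static bool sim_watchdog_runs_on(uint64_t watchdog_mask, unsigned int cpu)
{
    return cpu < 64 && ((watchdog_mask >> cpu) & 1u) != 0;
}
```

For example, with eight possible CPUs (0xFF) and CPUs 4 to 7 in nohz_full (0xF0), the default mask is 0x0F, so the watchdog runs on CPUs 0 to 3 only.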
  3. 20 May, 2015 1 commit
  4. 19 May, 2015 1 commit
    • watchdog: Fix merge 'conflict' · ab992dc3
      Peter Zijlstra committed
      Two watchdog changes that came through different trees had a non
      conflicting conflict, that is, one changed the semantics of a variable
      but no actual code conflict happened. So the merge appeared fine, but
      the resulting code did not behave as expected.
      
      Commit 195daf66 ("watchdog: enable the new user interface of the
      watchdog mechanism") changes the semantics of watchdog_user_enabled,
      which thereafter is only used by the functions introduced by
      b3738d29 ("watchdog: Add watchdog enable/disable all functions").
      
      There further appears to be a distinct lack of serialization between
      setting and using watchdog_enabled, so perhaps we should wrap the
      {en,dis}able_all() things in watchdog_proc_mutex.
      
      This patch fixes an s2r failure reported by Michal, which I cannot
      readily explain. But this does make the code internally consistent
      again.
      Reported-and-tested-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 15 Apr, 2015 9 commits
  6. 02 Apr, 2015 1 commit
  7. 13 Feb, 2015 1 commit
    • kernel/sched/clock.c: add another clock for use with the soft lockup watchdog · 545a2bf7
      Cyril Bur committed
      When the hypervisor pauses a virtualised kernel, the kernel will observe a
      jump in timebase. This can cause spurious messages from the softlockup
      detector.
      
      Whilst these messages are harmless, they are accompanied by a stack
      trace which causes undue concern. More problematically, the stack trace
      in the guest has nothing to do with the observed problem and can only be
      misleading.
      
      Furthermore, on POWER8 this is completely avoidable with the introduction
      of the Virtual Time Base (VTB) register.
      
      This patch (of 2):
      
      This permits the use of arch specific clocks with which virtualised kernels
      can use their notion of 'running' time, not the elapsed wall time which
      will include host execution time.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Jones <drjones@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: chai wen <chaiw.fnst@cn.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ben Zhang <benzh@chromium.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
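The difference between wall time and 'running' time in commit 545a2bf7 can be shown with a small userspace sketch. It assumes the time a guest spent paused is known as a single number and never exceeds the wall time; the struct, its fields, and the `sim_*` functions are hypothetical and do not correspond to the kernel's clock interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_NSEC_PER_SEC UINT64_C(1000000000)

/* Hypothetical view of a guest's clocks (assumes paused_ns <= wall_ns). */
struct sim_guest_clock {
    uint64_t wall_ns;    /* elapsed wall time, including host execution time */
    uint64_t paused_ns;  /* time during which the hypervisor paused the guest */
};

/* Wall clock: jumps forward across a pause. */
static uint64_t sim_wall_clock(const struct sim_guest_clock *c)
{
    return c->wall_ns;
}

/* 'Running' clock: only counts time in which the guest actually ran. */
static uint64_t sim_running_clock(const struct sim_guest_clock *c)
{
    return c->wall_ns - c->paused_ns;
}

/* Soft lockup check: has more than 'thresh_ns' passed since the last touch? */
static bool sim_is_softlockup(uint64_t now_ns, uint64_t touch_ns, uint64_t thresh_ns)
{
    return now_ns - touch_ns > thresh_ns;
}
```

With a 20 second threshold, a guest that was paused for 25 of the last 30 seconds looks locked up on the wall clock but not on the running clock, which is the spurious report the commit wants to avoid.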
  8. 14 Oct, 2014 1 commit
    • kernel/watchdog.c: control hard lockup detection default · 6e7458a6
      Ulrich Obergfell committed
      In some cases we don't want hard lockup detection enabled by default.
      An example is when running as a guest.  Introduce
      
        watchdog_enable_hardlockup_detector(bool)
      
      allowing those cases to disable hard lockup detection.  This must be
      executed early by the boot processor from e.g.  smp_prepare_boot_cpu, in
      order to allow kernel command line arguments to override it, as well as
      to avoid hard lockup detection being enabled before we've had a chance
      to indicate that it's unwanted.  In summary,
      
        initial boot:					default=enabled
        smp_prepare_boot_cpu
          watchdog_enable_hardlockup_detector(false):	default=disabled
        cmdline has 'nmi_watchdog=1':			default=enabled
      
      The running kernel still has the ability to enable/disable at any time
      with /proc/sys/kernel/nmi_watchdog as usual.  However even when the
      default has been overridden /proc/sys/kernel/nmi_watchdog will initially
      show '1'.  To truly turn it on one must disable/enable it, i.e.
      
        echo 0 > /proc/sys/kernel/nmi_watchdog
        echo 1 > /proc/sys/kernel/nmi_watchdog
      
      This patch will be immediately useful for KVM with the next patch of this
      series.  Other hypervisor guest types may find it useful as well.
      
      [akpm@linux-foundation.org: fix build]
      [dzickus@redhat.com: fix compile issues on sparc]
      Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
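The ordering summarised in commit 6e7458a6 (boot default, then an early override by the platform, then the kernel command line) can be modelled as three assignments applied in sequence. This is only a sketch of the ordering: the struct and the `sim_*` functions are hypothetical, and only the values 0 and 1 of 'nmi_watchdog=' are modelled.

```c
#include <stdbool.h>

/* Hypothetical holder for the hard lockup detector default. */
struct sim_hardlockup_cfg {
    bool enabled;
};

/* Step 1: initial boot, default=enabled. */
static void sim_boot_init(struct sim_hardlockup_cfg *cfg)
{
    cfg->enabled = true;
}

/* Step 2: early override, e.g. a guest passing 'false' to disable the default. */
static void sim_enable_hardlockup_detector(struct sim_hardlockup_cfg *cfg, bool val)
{
    cfg->enabled = val;
}

/* Step 3: 'nmi_watchdog=<0|1>' on the command line is parsed later and wins. */
static void sim_parse_nmi_watchdog(struct sim_hardlockup_cfg *cfg, int value)
{
    cfg->enabled = (value != 0);
}
```

Because the command line is applied last, the early override has to run before it is parsed; otherwise it would silently discard the user's choice.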
  9. 10 Oct, 2014 1 commit
    • softlockup: make detector be aware of task switch of processes hogging cpu · b1a8de1f
      chai wen committed
      For now, soft lockup detector warns once for each case of process
      softlockup.  But the thread 'watchdog/n' may not always get the cpu at the
      time slot between the task switch of two processes hogging that cpu to
      reset soft_watchdog_warn.
      
      An example would be two processes hogging the cpu.  Process A causes the
      softlockup warning and is killed manually by a user.  Process B
      immediately becomes the new process hogging the cpu preventing the
      softlockup code from resetting the soft_watchdog_warn variable.
      
      This case is a false negative of "warn only once for a process", as there
      may be a different process that is going to hog the cpu.  Resolve this by
      saving/checking the task pointer of the hogging process and use that to
      reset soft_watchdog_warn too.
      
      [dzickus@redhat.com: update comment]
      Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
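One possible shape of the "save and compare the task pointer" idea from commit b1a8de1f is sketched below in userspace C. It is a simplified model under stated assumptions, not the code of the patch: the types and the `sim_on_softlockup()` function are hypothetical, and the sketch assumes that noticing a different hogging task only re-arms the warning, which is then printed on the next detection.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task descriptor. */
struct sim_task {
    int pid;
};

/* Hypothetical per-CPU detector state. */
struct sim_cpu_state {
    bool soft_watchdog_warn;            /* a warning has already been printed */
    const struct sim_task *saved_task;  /* task that was hogging when we warned */
};

/*
 * Called each time the detector sees a soft lockup with 'current' on the CPU.
 * Returns true if a warning should be printed now.
 */
static bool sim_on_softlockup(struct sim_cpu_state *st, const struct sim_task *current)
{
    if (st->soft_watchdog_warn) {
        /* A different task is hogging the CPU: re-arm the warning. */
        if (st->saved_task != current) {
            st->soft_watchdog_warn = false;
            st->saved_task = current;
        }
        return false;
    }

    st->soft_watchdog_warn = true;
    st->saved_task = current;
    return true;
}
```

In the scenario from the commit message, process A triggers one warning, repeated detections of A stay quiet, and once process B takes over the CPU the warning is re-armed so that B gets reported as well.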
  10. 27 Aug, 2014 1 commit
  11. 18 Aug, 2014 2 commits
  12. 09 Aug, 2014 1 commit
  13. 07 Aug, 2014 1 commit
  14. 24 Jun, 2014 2 commits
    • kernel/watchdog.c: print traces for all cpus on lockup detection · ed235875
      Aaron Tomlin committed
      A 'softlockup' is defined as a bug that causes the kernel to loop in
      kernel mode for more than a predefined period of time, without giving
      other tasks a chance to run.
      
      Currently, upon detection of this condition by the per-cpu watchdog
      task, debug information (including a stack trace) is sent to the system
      log.
      
      On some occasions, we have observed that the "victim" rather than the
      actual "culprit" (i.e.  the owner/holder of the contended resource) is
      reported to the user.  Often this information has proven to be
      insufficient to assist debugging efforts.
      
      To avoid loss of useful debug information, for architectures which
      support NMI, this patch makes it possible to improve soft lockup
      reporting.  This is accomplished by issuing an NMI to each cpu to obtain
      a stack trace.
      
      If NMI is not supported we just revert back to the old method.  A sysctl
      and a boot-time parameter are available to toggle this feature.
      
      [dzickus@redhat.com: add CONFIG_SMP in certain areas]
      [akpm@linux-foundation.org: additional CONFIG_SMP=n optimisations]
      [mq@suse.cz: fix warning]
      Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/watchdog.c: remove preemption restrictions when restarting lockup detector · bde92cf4
      Don Zickus committed
      Peter Wu noticed the following splat on his machine when updating
      /proc/sys/kernel/watchdog_thresh:
      
        BUG: sleeping function called from invalid context at mm/slub.c:965
        in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
        3 locks held by init/1:
         #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff8117b663>] vfs_write+0x143/0x180
         #1:  (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff810e02d3>] proc_dowatchdog+0x33/0x110
         #2:  (cpu_hotplug.lock){.+.+.+}, at: [<ffffffff810589c2>] get_online_cpus+0x32/0x80
        Preemption disabled at:[<ffffffff810e0384>] proc_dowatchdog+0xe4/0x110
      
        CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-testing #34
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
          dump_stack+0x4e/0x7a
          __might_sleep+0x11d/0x190
          kmem_cache_alloc_trace+0x4e/0x1e0
          perf_event_alloc+0x55/0x440
          perf_event_create_kernel_counter+0x26/0xe0
          watchdog_nmi_enable+0x75/0x140
          update_timers_all_cpus+0x53/0xa0
          proc_dowatchdog+0xe4/0x110
          proc_sys_call_handler+0xb3/0xc0
          proc_sys_write+0x14/0x20
          vfs_write+0xad/0x180
          SyS_write+0x49/0xb0
          system_call_fastpath+0x16/0x1b
        NMI watchdog: disabled (cpu0): hardware events not enabled
      
      What happened is after updating the watchdog_thresh, the lockup detector
      is restarted to utilize the new value.  Part of this process involved
      disabling preemption.  Once preemption was disabled, perf tried to
      allocate a new event (as part of the restart).  This caused the above
      BUG_ON as you can't sleep with preemption disabled.
      
      The preemption restriction seemed aggressive as we are not doing anything
      on that particular cpu, but with all the online cpus (which are
      protected by the get_online_cpus lock).  Remove the restriction and the
      BUG_ON goes away.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Reported-by: Peter Wu <peter@lekensteyn.nl>
      Tested-by: Peter Wu <peter@lekensteyn.nl>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>		[3.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 19 Apr, 2014 1 commit
  16. 04 Apr, 2014 1 commit
    • kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one · 62572e29
      Ben Zhang committed
      I ran into a scenario where while one cpu was stuck and should have
      panic'd because of the NMI watchdog, it didn't.  The reason was another
      cpu was spewing stack dumps on to the console.  Upon investigation, I
      noticed that when writing to the console and also when dumping the
      stack, the watchdog is touched.
      
      This causes all the cpus to reset their NMI watchdog flags and the
      'stuck' cpu just spins forever.
      
      This change causes the semantics of touch_nmi_watchdog to be changed
      slightly.  Previously, I accidentally changed the semantics and we
      noticed there was a codepath in which touch_nmi_watchdog could be
      touched from a preemptible area.  That caused a BUG() to happen when
      CONFIG_DEBUG_PREEMPT was enabled.  I believe it was the acpi code.
      
      My attempt here re-introduces the change to have the
      touch_nmi_watchdog() code only touch the local cpu instead of all of the
      cpus.  But instead of using __get_cpu_var(), I use the
      __raw_get_cpu_var() version.
      
      This avoids the preemption problem.  However my reasoning wasn't because
      I was trying to be lazy.  Instead I rationalized it as, well if
      preemption is enabled then interrupts should be enabled too and the NMI
      watchdog will have no reason to trigger.  So it won't matter if the
      wrong cpu is touched because the percpu interrupt counters the NMI
      watchdog uses should still be incrementing.
      
      Don said:
      
      : I'm ok with this patch, though it does alter the behaviour of how
      : touch_nmi_watchdog works.  For the most part I don't think most callers
      : need to touch all of the watchdogs (on each cpu).  Perhaps a corner case
      : will pop up (the scheduler??  to mimic touch_all_softlockup_watchdogs() ).
      :
      : But this does address an issue where if a system is locked up and one cpu
      : is spewing out useful debug messages (or error messages), the hard lockup
      : will fail to go off.  We have seen this on RHEL also.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Ben Zhang <benzh@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
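The behaviour change in commit 62572e29 can be modelled with a per-CPU "touched" flag that suppresses one hard lockup check. The sketch below is a userspace simulation under that assumption; the flag array and all `sim_*` functions are hypothetical and simplified compared with the kernel's per-cpu variables.

```c
#include <stdbool.h>

#define SIM_NR_CPUS 4

/* Hypothetical per-CPU "watchdog was touched" flags. */
static bool sim_nmi_touch[SIM_NR_CPUS];

/* Old semantics: a touch from any CPU resets the flag on every CPU. */
static void sim_touch_all_cpus(void)
{
    for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
        sim_nmi_touch[cpu] = true;
}

/* New semantics: a touch only affects the CPU that issued it. */
static void sim_touch_local_cpu(unsigned int this_cpu)
{
    sim_nmi_touch[this_cpu] = true;
}

/*
 * NMI-time check on 'cpu'. 'counter_advanced' says whether the per-CPU
 * interrupt counter moved since the last check. Returns true if a hard
 * lockup should be reported.
 */
static bool sim_nmi_check(unsigned int cpu, bool counter_advanced)
{
    if (sim_nmi_touch[cpu]) {
        sim_nmi_touch[cpu] = false;  /* a touch suppresses exactly one check */
        return false;
    }
    return !counter_advanced;
}
```

With CPU 1 stuck and CPU 0 printing to the console, the old semantics keep clearing CPU 1's check so the lockup is never reported, while the local-only touch leaves CPU 1's flag alone and the lockup is detected.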
  17. 25 Feb, 2014 1 commit
    • watchdog: Simplify a little the IPI call · e0a23b06
      Frederic Weisbecker committed
      In order to remotely restart the watchdog hrtimer, update_timers()
      allocates a csd on the stack and passes it to __smp_call_function_single().
      
      There is no particular need, however, for a specific csd here. Let's
      simplify that a little by calling smp_call_function_single()
      which can already take care of the csd allocation by itself.
      Acked-by: Don Zickus <dzickus@redhat.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  18. 25 Sep, 2013 2 commits
    • watchdog: update watchdog_thresh properly · 9809b18f
      Michal Hocko committed
      watchdog_thresh controls how often the nmi perf event counter checks the per-cpu
      hrtimer_interrupts counter and blows up if the counter hasn't changed
      since the last check.  The counter is updated by per-cpu
      watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh
      period which guarantees that hrtimer is scheduled 2 times per the main
      period.  Both hrtimer and perf event are started together when the
      watchdog is enabled.
      
      So far so good.  But...
      
      But what happens when watchdog_thresh is updated from sysctl handler?
      
      proc_dowatchdog will set a new sampling period and hrtimer callback
      (watchdog_timer_fn) will use the new value in the next round.  The
      problem, however, is that nobody tells the perf event that the sampling
      period has changed so it is ticking with the period configured when it
      has been set up.
      
      This might result in an ear ripping dissonance between perf and hrtimer
      parts if the watchdog_thresh is increased.  And even worse it might lead
      to KABOOM if the watchdog is configured to panic on such a spurious
      lockup.
      
      This patch fixes the issue by updating both the nmi perf event counter and
      hrtimers if the threshold value has changed.
      
      The nmi one is disabled and then reinitialized from scratch.  This has
      an unpleasant side effect that the allocation of the new event might
      fail theoretically so the hard lockup detector would be disabled for
      such cpus.  On the other hand such a memory allocation failure is very
      unlikely because the original event is deallocated right before.
      
      It would be much nicer if we just changed perf event period but there
      doesn't seem to be any API to do that right now.  It is also unfortunate
      that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we
      cannot use on_each_cpu() and do the same thing from the per-cpu context.
      The update from the current CPU should be safe because
      perf_event_disable removes the event atomically before it clears the
      per-cpu watchdog_ev so it cannot change anything under running handler
      feet.
      
      The hrtimer is simply restarted (thanks to Don Zickus who has pointed
      this out) if it is queued because we cannot rely on it firing and adapting to
      the new sampling period before a new nmi event triggers (when the
      threshold is decreased).
      
      [akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place]
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Fabio Estevam <festevam@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
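The timing relationship in commit 9809b18f can be checked with a little arithmetic. The sketch assumes, as the commit message describes, that the hrtimer period is 2/5 of watchdog_thresh and that the perf event checks once per watchdog_thresh seconds; the `sim_*` helpers are hypothetical.

```c
#include <stdint.h>

#define SIM_NSEC_PER_SEC UINT64_C(1000000000)

/* hrtimer period in nanoseconds: 2/5 of watchdog_thresh (given in seconds). */
static uint64_t sim_hrtimer_period_ns(unsigned int watchdog_thresh)
{
    return (uint64_t)watchdog_thresh * 2 * (SIM_NSEC_PER_SEC / 5);
}

/*
 * Number of complete hrtimer periods that fit into one NMI check period.
 * 'nmi_thresh' is the threshold the perf event was set up with,
 * 'timer_thresh' the threshold the hrtimer currently uses.
 */
static unsigned int sim_timer_fires_per_nmi_period(unsigned int nmi_thresh,
                                                   unsigned int timer_thresh)
{
    uint64_t period = sim_hrtimer_period_ns(timer_thresh);

    if (period == 0)
        return 0;
    return (unsigned int)((uint64_t)nmi_thresh * SIM_NSEC_PER_SEC / period);
}
```

With both sides at the same threshold of 10 the timer period is 4 seconds and fits twice into the 10 second check period. If only the hrtimer picks up a new threshold of 60, its period grows to 24 seconds while the perf event still checks every 10 seconds, so no complete timer period fits into a check period and the counter may not have advanced when the check runs, which is the spurious lockup the commit describes.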
    • watchdog: update watchdog attributes atomically · 359e6fab
      Michal Hocko committed
      proc_dowatchdog doesn't synchronize multiple callers, so two parallel
      callers can interleave watchdog_enable_all_cpus and
      watchdog_disable_all_cpus (e.g. the watchdog gets enabled even though
      watchdog_thresh was already set to 0).
      
      This patch adds a local mutex which synchronizes callers to the sysctl
      handler.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
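The fix in commit 359e6fab, a mutex local to the sysctl handler, can be sketched in userspace with a POSIX mutex. This is an analogy rather than kernel code: the kernel uses its own mutex type, and the variables and `sim_*` functions below are hypothetical stand-ins for the handler and its state.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the mutex that serializes the sysctl handler. */
static pthread_mutex_t sim_watchdog_proc_mutex = PTHREAD_MUTEX_INITIALIZER;

static int sim_watchdog_thresh = 10;
static bool sim_watchdog_running = true;

/*
 * Simulated handler: the threshold and the running state are updated
 * under one lock, so a parallel caller never sees one without the other.
 */
static void sim_proc_dowatchdog(int new_thresh)
{
    pthread_mutex_lock(&sim_watchdog_proc_mutex);
    sim_watchdog_thresh = new_thresh;
    sim_watchdog_running = (new_thresh > 0);
    pthread_mutex_unlock(&sim_watchdog_proc_mutex);
}

/* True if the running state matches the threshold. */
static bool sim_state_consistent(void)
{
    bool ok;

    pthread_mutex_lock(&sim_watchdog_proc_mutex);
    ok = (sim_watchdog_running == (sim_watchdog_thresh > 0));
    pthread_mutex_unlock(&sim_watchdog_proc_mutex);
    return ok;
}
```

Without the lock, two callers could interleave between the two assignments and leave the watchdog running with a threshold of 0, which is the inconsistency the commit message mentions.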
  19. 31 Jul, 2013 1 commit
  20. 20 Jun, 2013 3 commits
    • watchdog: Boot-disable by default on full dynticks · 940be35a
      Frederic Weisbecker committed
      When the watchdog runs, it prevents the full dynticks
      CPUs from stopping their tick because the hard lockup
      detector uses perf events internally, which in turn
      rely on the periodic tick.
      
      Since this is a rather confusing behaviour that is not
      easy to track down and identify for those who want to
      test CONFIG_NO_HZ_FULL, let's default disable the
      watchdog on boot time when full dynticks is enabled.
      
      The user can still enable it later on runtime using
      proc or sysctl.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anish Singh <anish198519851985@gmail.com>
    • watchdog: Rename confusing state variable · 3c00ea82
      Frederic Weisbecker committed
      We have two very conflicting state variable names in the
      watchdog:
      
      * watchdog_enabled: This one reflects the user interface. It's
      set to 1 by default and can be overridden with boot options
      or sysctl/procfs interface.
      
      * watchdog_disabled: This is the internal toggle state that
      tells if watchdog threads, timers and NMI events are currently
      running or not. This state mostly depends on the user settings.
      It's a convenient state latch.
      
      Now we really need to find clearer names because those
      are just too confusing to encourage deep review.
      
      watchdog_enabled now becomes watchdog_user_enabled to reflect
      its purpose as an interface.
      
      watchdog_disabled becomes watchdog_running to suggest its
      role as a pure internal state.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anish Singh <anish198519851985@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
    • watchdog: Register / unregister watchdog kthreads on sysctl control · b8900bc0
      Frederic Weisbecker committed
      The user activation/deactivation of the watchdog through boot parameters
      or sysctl is currently implemented with a dance involving kthreads parking
      and unparking methods: the threads are unconditionally registered on
      boot and they park as soon as the user wants the watchdog to be disabled.
      
      This method involves a few noisy details to handle though: the watchdog
      kthreads may be unparked anytime due to hotplug operations, after which
      the watchdog internals have to decide to park again if it is user-disabled.
      
      As a result the setup() and unpark() methods need to be able to request a
      reparking. This is not currently supported in the kthread infrastructure
      so this piece of the watchdog code only works halfway.
      
      Besides, unparking/reparking the watchdog kthreads consumes unnecessary
      cputime on hotplug operations when those could be simply ignored in the
      first place.
      
      As suggested by Srivatsa, let's instead only register the watchdog
      threads when they are needed. This way we don't need to think about
      hotplug operations and we don't burden the CPU onlining when the watchdog
      is simply disabled.
      Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anish Singh <anish198519851985@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
  21. 14 Mar, 2013 1 commit
  22. 19 Feb, 2013 1 commit
  23. 08 Feb, 2013 1 commit
  24. 20 Dec, 2012 1 commit
    • watchdog: Fix disable/enable regression · 3935e895
      Bjørn Mork committed
      Commit 8d451690 ("watchdog: Fix CPU hotplug regression") causes an
      oops or hard lockup when doing
      
       echo 0 > /proc/sys/kernel/nmi_watchdog
       echo 1 > /proc/sys/kernel/nmi_watchdog
      
      and the kernel is booted with nmi_watchdog=1 (default)
      
      Running laptop-mode-tools and disconnecting/connecting AC power will
      cause this to trigger, making it a common failure scenario on laptops.
      
      Instead of bailing out of watchdog_disable() when !watchdog_enabled we
      can initialize the hrtimer regardless of watchdog_enabled status.  This
      makes it safe to call watchdog_disable() in the nmi_watchdog=0 case,
      without the negative effect on the enabled => disabled => enabled case.
      
      All these tests pass with this patch:
      - nmi_watchdog=1
        echo 0 > /proc/sys/kernel/nmi_watchdog
        echo 1 > /proc/sys/kernel/nmi_watchdog
      
      - nmi_watchdog=0
        echo 0 > /sys/devices/system/cpu/cpu1/online
      
      - nmi_watchdog=0
        echo mem > /sys/power/state
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51661
      
      Cc: <stable@vger.kernel.org> # v3.7
      Cc: Norbert Warmuth <nwarmuth@t-online.de>
      Cc: Joseph Salisbury <joseph.salisbury@canonical.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>