1. 01 1月, 2009 1 次提交
  2. 03 12月, 2008 1 次提交
  3. 17 10月, 2008 1 次提交
    • A
      Make the taint flags reliable · 25ddbb18
      Andi Kleen 提交于
      It's somewhat unlikely that it happens, but right now a race window
      between interrupts or machine checks or oopses could corrupt the tainted
      bitmap because it is modified in a non atomic fashion.
      
      Convert the taint variable to an unsigned long and use only atomic bit
      operations on it.
      
      Unfortunately this means the intvec sysctl functions cannot be used on it
      anymore.
      
      It turned out the taint sysctl handler could actually be simplified a bit
      (since it only increases capabilities) so this patch actually removes
      code.
      
      [akpm@linux-foundation.org: remove unneeded include]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      25ddbb18
  4. 10 9月, 2008 1 次提交
  5. 03 9月, 2008 1 次提交
  6. 30 8月, 2008 1 次提交
    • A
      Don't trigger softlockup detector on network fs blocked tasks · 316d9679
      Andi Kleen 提交于
      Pulling the ethernet cable on a 2.6.27-rc system with NFS mounts
      currently leads to an ongoing flood of soft lockup detector backtraces
      for all tasks blocked on the NFS mounts when the hickup takes
      longer than 120s.
      
      I don't think NFS problems should be all that noisy.
      
      Luckily there's a reasonably easy way to distingush this case.
      
      Don't report task softlockup warnings for tasks in TASK_KILLABLE
      state, which is used by the network file systems.
      
      I believe this patch is a 2.6.27 candidate.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      316d9679
  7. 27 7月, 2008 1 次提交
  8. 05 7月, 2008 1 次提交
  9. 01 7月, 2008 1 次提交
  10. 30 6月, 2008 1 次提交
  11. 25 6月, 2008 1 次提交
  12. 19 6月, 2008 1 次提交
    • J
      softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression · 9c106c11
      Jason Wessel 提交于
      The touch_nmi_watchdog() routine on x86 ultimately calls
      touch_softlockup_watchdog().  The problem is that to touch the
      softlockup watchdog, the cpu_clock code has to be called which could
      involve multiple cpu locks and can lead to a hard hang if one of the
      locks is held by a processor that is not going to return anytime soon
      (such as could be the case with kgdb or perhaps even with some other
      kind of exception).
      
      This patch causes the public version of the
      touch_softlockup_watchdog() to defer the cpu clock access to a later
      point.
      
      The test case for this problem is to use the following kernel config
      options:
      
      CONFIG_KGDB_TESTS=y
      CONFIG_KGDB_TESTS_ON_BOOT=y
      CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000"
      
      It should be noted that kgdb test suite and these options were not
      available until 2.6.26-rc2, so it was necessary to patch the kgdb
      test suite during the bisection.
      
      I would consider this patch a regression fix because the problem first
      appeared in commit 27ec4407 when some
      logic was added to try to periodically sync the clocks.  It was
      possible to work around this particular problem by simply not
      performing the sync anytime the system was in a critical context.
      This was ok until commit 3e51f33f,
      which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some
      multi-cpu locks to sync the clocks.  It became clear that accessing
      this code from an nmi was the source of the lockups.  Avoiding the
      access to the low level clock code from an code inside the NMI
      processing also fixed the problem with the 27ec44... commit.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9c106c11
  13. 18 6月, 2008 1 次提交
  14. 02 6月, 2008 1 次提交
    • J
      softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression · 8c2238ea
      Jason Wessel 提交于
      The touch_nmi_watchdog() routine on x86 ultimately calls
      touch_softlockup_watchdog().  The problem is that to touch the
      softlockup watchdog, the cpu_clock code has to be called which could
      involve multiple cpu locks and can lead to a hard hang if one of the
      locks is held by a processor that is not going to return anytime soon
      (such as could be the case with kgdb or perhaps even with some other
      kind of exception).
      
      This patch causes the public version of the
      touch_softlockup_watchdog() to defer the cpu clock access to a later
      point.
      
      The test case for this problem is to use the following kernel config
      options:
      
      CONFIG_KGDB_TESTS=y
      CONFIG_KGDB_TESTS_ON_BOOT=y
      CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000"
      
      It should be noted that kgdb test suite and these options were not
      available until 2.6.26-rc2, so it was necessary to patch the kgdb
      test suite during the bisection.
      
      I would consider this patch a regression fix because the problem first
      appeared in commit 27ec4407 when some
      logic was added to try to periodically sync the clocks.  It was
      possible to work around this particular problem by simply not
      performing the sync anytime the system was in a critical context.
      This was ok until commit 3e51f33f,
      which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some
      multi-cpu locks to sync the clocks.  It became clear that accessing
      this code from an nmi was the source of the lockups.  Avoiding the
      access to the low level clock code from an code inside the NMI
      processing also fixed the problem with the 27ec44... commit.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8c2238ea
  15. 25 5月, 2008 2 次提交
  16. 01 3月, 2008 1 次提交
  17. 02 2月, 2008 1 次提交
  18. 26 1月, 2008 2 次提交
    • I
      softlockup: fix signedness · 90739081
      Ingo Molnar 提交于
      fix softlockup tunables signedness.
      
      mark tunables read-mostly.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      90739081
    • I
      softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks · 82a1fcb9
      Ingo Molnar 提交于
      this patch extends the soft-lockup detector to automatically
      detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are
      printed the following way:
      
       ------------------>
       INFO: task prctl:3042 blocked for more than 120 seconds.
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
       prctl         D fd5e3793     0  3042   2997
              f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286
              f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000
              f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b
       Call Trace:
        [<c04883a5>] schedule_timeout+0x6d/0x8b
        [<c04883d8>] schedule_timeout_uninterruptible+0x15/0x17
        [<c0133a76>] msleep+0x10/0x16
        [<c0138974>] sys_prctl+0x30/0x1e2
        [<c0104c52>] sysenter_past_esp+0x5f/0xa5
        =======================
       2 locks held by prctl/3042:
       #0:  (&sb->s_type->i_mutex_key#5){--..}, at: [<c0197d11>] do_fsync+0x38/0x7a
       #1:  (jbd_handle){--..}, at: [<c01ca3d2>] journal_start+0xc7/0xe9
       <------------------
      
      the current default timeout is 120 seconds. Such messages are printed
      up to 10 times per bootup. If the system has crashed already then the
      messages are not printed.
      
      if lockdep is enabled then all held locks are printed as well.
      
      this feature is a natural extension to the softlockup-detector (kernel
      locked up without scheduling) and to the NMI watchdog (kernel locked up
      with IRQs disabled).
      
      [ Gautham R Shenoy <ego@in.ibm.com>: CPU hotplug fixes. ]
      [ Andrew Morton <akpm@linux-foundation.org>: build warning fix. ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      82a1fcb9
  19. 20 10月, 2007 1 次提交
  20. 17 10月, 2007 5 次提交
    • R
      softlockup: add a /proc tuning parameter · c4f3b63f
      Ravikiran G Thirumalai 提交于
      Control the trigger limit for softlockup warnings.  This is useful for
      debugging softlockups, by lowering the softlockup_thresh to identify
      possible softlockups earlier.
      
      This patch:
      1. Adds a sysctl softlockup_thresh with valid values of 1-60s
         (Higher value to disable false positives)
      2. Changes the softlockup printk to print the cpu softlockup time
      
      [akpm@linux-foundation.org: Fix various warnings and add definition of "two"]
      Signed-off-by: NRavikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: NShai Fultheim <shai@scalex86.org>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c4f3b63f
    • I
      softlockup watchdog: style cleanups · a5f2ce3c
      Ingo Molnar 提交于
      kernel/softirq.c grew a few style uncleanlinesses in the past few
      months, clean that up. No functional changes:
      
         text    data     bss     dec     hex filename
         1126      76       4    1206     4b6 softlockup.o.before
         1129      76       4    1209     4b9 softlockup.o.after
      
      ( the 3 bytes .text increase is due to the "<1>" appended to one of
        the printk messages. )
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a5f2ce3c
    • I
      softlockup: improve debug output · 43581a10
      Ingo Molnar 提交于
      Improve the debuggability of kernel lockups by enhancing the debug
      output of the softlockup detector: print the task that causes the lockup
      and try to print a more intelligent backtrace.
      
      The old format was:
      
        BUG: soft lockup detected on CPU#1!
         [<c0105e4a>] show_trace_log_lvl+0x19/0x2e
         [<c0105f43>] show_trace+0x12/0x14
         [<c0105f59>] dump_stack+0x14/0x16
         [<c015f6bc>] softlockup_tick+0xbe/0xd0
         [<c013457d>] run_local_timers+0x12/0x14
         [<c01346b8>] update_process_times+0x3e/0x63
         [<c0145fb8>] tick_sched_timer+0x7c/0xc0
         [<c0140a75>] hrtimer_interrupt+0x135/0x1ba
         [<c011bde7>] smp_apic_timer_interrupt+0x6e/0x80
         [<c0105aa3>] apic_timer_interrupt+0x33/0x38
         [<c0104f8a>] syscall_call+0x7/0xb
         =======================
      
      The new format is:
      
        BUG: soft lockup detected on CPU#1! [prctl:2363]
      
        Pid: 2363, comm:                prctl
        EIP: 0060:[<c013915f>] CPU: 1
        EIP is at sys_prctl+0x24/0x18c
         EFLAGS: 00000213    Not tainted  (2.6.22-cfs-v20 #26)
        EAX: 00000001 EBX: 000003e7 ECX: 00000001 EDX: f6df0000
        ESI: 000003e7 EDI: 000003e7 EBP: f6df0fb0 DS: 007b ES: 007b FS: 00d8
        CR0: 8005003b CR2: 4d8c3340 CR3: 3731d000 CR4: 000006d0
         [<c0105e4a>] show_trace_log_lvl+0x19/0x2e
         [<c0105f43>] show_trace+0x12/0x14
         [<c01040be>] show_regs+0x1ab/0x1b3
         [<c015f807>] softlockup_tick+0xef/0x108
         [<c013457d>] run_local_timers+0x12/0x14
         [<c01346b8>] update_process_times+0x3e/0x63
         [<c0145fcc>] tick_sched_timer+0x7c/0xc0
         [<c0140a89>] hrtimer_interrupt+0x135/0x1ba
         [<c011bde7>] smp_apic_timer_interrupt+0x6e/0x80
         [<c0105aa3>] apic_timer_interrupt+0x33/0x38
         [<c0104f8a>] syscall_call+0x7/0xb
         =======================
      
      Note that in the old format we only knew that some system call locked
      up, we didnt know _which_. With the new format we know that it's at a
      specific place in sys_prctl(). [which was where i created an artificial
      kernel lockup to test the new format.]
      
      This is also useful if the lockup happens in user-space - the user-space
      EIP (and other registers) will be printed too. (such a lockup would
      either suggest that the task was running at SCHED_FIFO:99 and looping
      for more than 10 seconds, or that the softlockup detector has a
      false-positive.)
      
      The task name is printed too first, just in case we dont manage to print
      a useful backtrace.
      
      [satyam@infradead.org: fix warning]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NSatyam Sharma <satyam@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      43581a10
    • I
      fix the softlockup watchdog to actually work · a115d5ca
      Ingo Molnar 提交于
      this Xen related commit:
      
         commit 966812dc
         Author: Jeremy Fitzhardinge <jeremy@goop.org>
         Date:   Tue May 8 00:28:02 2007 -0700
      
             Ignore stolen time in the softlockup watchdog
      
      broke the softlockup watchdog to never report any lockups. (!)
      
      print_timestamp defaults to 0, this makes the following condition
      always true:
      
      	if (print_timestamp < (touch_timestamp + 1) ||
      
      and we'll in essence never report soft lockups.
      
      apparently the functionality of the soft lockup watchdog was never
      actually tested with that patch applied ...
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a115d5ca
    • I
      softlockup: use cpu_clock() instead of sched_clock() · a3b13c23
      Ingo Molnar 提交于
      sched_clock() is not a reliable time-source, use cpu_clock() instead.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a3b13c23
  21. 18 7月, 2007 1 次提交
    • R
      Freezer: make kernel threads nonfreezable by default · 83144186
      Rafael J. Wysocki 提交于
      Currently, the freezer treats all tasks as freezable, except for the kernel
      threads that explicitly set the PF_NOFREEZE flag for themselves.  This
      approach is problematic, since it requires every kernel thread to either
      set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
      care for the freezing of tasks at all.
      
      It seems better to only require the kernel threads that want to or need to
      be frozen to use some freezer-related code and to remove any
      freezer-related code from the other (nonfreezable) kernel threads, which is
      done in this patch.
      
      The patch causes all kernel threads to be nonfreezable by default (ie.  to
      have PF_NOFREEZE set by default) and introduces the set_freezable()
      function that should be called by the freezable kernel threads in order to
      unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
      threads call set_freezable(), so it shouldn't cause any (intentional)
      change of behaviour to appear.  Additionally, it updates documentation to
      describe the freezing of tasks more accurately.
      
      [akpm@linux-foundation.org: build fixes]
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: NNigel Cunningham <nigel@nigel.suspend2.net>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83144186
  22. 10 5月, 2007 1 次提交
    • R
      Add suspend-related notifications for CPU hotplug · 8bb78442
      Rafael J. Wysocki 提交于
      Since nonboot CPUs are now disabled after tasks and devices have been
      frozen and the CPU hotplug infrastructure is used for this purpose, we need
      special CPU hotplug notifications that will help the CPU-hotplug-aware
      subsystems distinguish normal CPU hotplug events from CPU hotplug events
      related to a system-wide suspend or resume operation in progress.  This
      patch introduces such notifications and causes them to be used during
      suspend and resume transitions.  It also changes all of the
      CPU-hotplug-aware subsystems to take these notifications into consideration
      (for now they are handled in the same way as the corresponding "normal"
      ones).
      
      [oleg@tv-sign.ru: cleanups]
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8bb78442
  23. 09 5月, 2007 3 次提交
  24. 30 9月, 2006 1 次提交
  25. 01 8月, 2006 1 次提交
  26. 28 6月, 2006 2 次提交
  27. 26 6月, 2006 2 次提交
  28. 26 4月, 2006 2 次提交
  29. 28 3月, 2006 1 次提交
    • A
      [PATCH] Notifier chain update: API changes · e041c683
      Alan Stern 提交于
      The kernel's implementation of notifier chains is unsafe.  There is no
      protection against entries being added to or removed from a chain while the
      chain is in use.  The issues were discussed in this thread:
      
          http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
      
      We noticed that notifier chains in the kernel fall into two basic usage
      classes:
      
      	"Blocking" chains are always called from a process context
      	and the callout routines are allowed to sleep;
      
      	"Atomic" chains can be called from an atomic context and
      	the callout routines are not allowed to sleep.
      
      We decided to codify this distinction and make it part of the API.  Therefore
      this set of patches introduces three new, parallel APIs: one for blocking
      notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
      really just the old API under a new name).  New kinds of data structures are
      used for the heads of the chains, and new routines are defined for
      registration, unregistration, and calling a chain.  The three APIs are
      explained in include/linux/notifier.h and their implementation is in
      kernel/sys.c.
      
      With atomic and blocking chains, the implementation guarantees that the chain
      links will not be corrupted and that chain callers will not get messed up by
      entries being added or removed.  For raw chains the implementation provides no
      guarantees at all; users of this API must provide their own protections.  (The
      idea was that situations may come up where the assumptions of the atomic and
      blocking APIs are not appropriate, so it should be possible for users to
      handle these things in their own way.)
      
      There are some limitations, which should not be too hard to live with.  For
      atomic/blocking chains, registration and unregistration must always be done in
      a process context since the chain is protected by a mutex/rwsem.  Also, a
      callout routine for a non-raw chain must not try to register or unregister
      entries on its own chain.  (This did happen in a couple of places and the code
      had to be changed to avoid it.)
      
      Since atomic chains may be called from within an NMI handler, they cannot use
      spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
      entirely in the unregister routine, which is okay since unregistration is much
      less frequent that calling a chain.
      
      Here is the list of chains that we adjusted and their classifications.  None
      of them use the raw API, so for the moment it is only a placeholder.
      
        ATOMIC CHAINS
        -------------
      arch/i386/kernel/traps.c:		i386die_chain
      arch/ia64/kernel/traps.c:		ia64die_chain
      arch/powerpc/kernel/traps.c:		powerpc_die_chain
      arch/sparc64/kernel/traps.c:		sparc64die_chain
      arch/x86_64/kernel/traps.c:		die_chain
      drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
      kernel/panic.c:				panic_notifier_list
      kernel/profile.c:			task_free_notifier
      net/bluetooth/hci_core.c:		hci_notifier
      net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
      net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
      net/ipv6/addrconf.c:			inet6addr_chain
      net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
      net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
      net/netlink/af_netlink.c:		netlink_chain
      
        BLOCKING CHAINS
        ---------------
      arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
      arch/s390/kernel/process.c:		idle_chain
      arch/x86_64/kernel/process.c		idle_notifier
      drivers/base/memory.c:			memory_chain
      drivers/cpufreq/cpufreq.c		cpufreq_policy_notifier_list
      drivers/cpufreq/cpufreq.c		cpufreq_transition_notifier_list
      drivers/macintosh/adb.c:		adb_client_list
      drivers/macintosh/via-pmu.c		sleep_notifier_list
      drivers/macintosh/via-pmu68k.c		sleep_notifier_list
      drivers/macintosh/windfarm_core.c	wf_client_list
      drivers/usb/core/notify.c		usb_notifier_list
      drivers/video/fbmem.c			fb_notifier_list
      kernel/cpu.c				cpu_chain
      kernel/module.c				module_notify_list
      kernel/profile.c			munmap_notifier
      kernel/profile.c			task_exit_notifier
      kernel/sys.c				reboot_notifier_list
      net/core/dev.c				netdev_chain
      net/decnet/dn_dev.c:			dnaddr_chain
      net/ipv4/devinet.c:			inetaddr_chain
      
      It's possible that some of these classifications are wrong.  If they are,
      please let us know or submit a patch to fix them.  Note that any chain that
      gets called very frequently should be atomic, because the rwsem read-locking
      used for blocking chains is very likely to incur cache misses on SMP systems.
      (However, if the chain's callout routines may sleep then the chain cannot be
      atomic.)
      
      The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
      material written by Keith Owens and suggestions from Paul McKenney and Andrew
      Morton.
      
      [jes@sgi.com: restructure the notifier chain initialization macros]
      Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: NChandra Seetharaman <sekharan@us.ibm.com>
      Signed-off-by: NJes Sorensen <jes@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e041c683