1. 05 6月, 2014 2 次提交
    • P
      printk: ignore too long messages · f40e4b9f
      Petr Mladek 提交于
      There was no check for too long messages.  The check for free space always
      passed when first_seq and next_seq were equal.  Enough free space was not
      guaranteed, though.
      
      log_store() might be called to store messages up to 64kB + 64kB + 16B.
      This is sum of maximal text_len, dict_len values, and the size of the
      structure printk_log.
      
      On the other hand, the minimal size for the main log buffer currently is
      4kB and it is enforced only by Kconfig.
      
      The good news is that the usage looks safe right now.  log_store() is
      called only from vprintk_emit() and cont_flush().  Here the "text" part is
      always passed via a static buffer and the length is limited to
      LOG_LINE_MAX which is 1024.  The "dict" part is NULL in most cases.  The
      only exceptions is when vprintk_emit() is called from printk_emit() and
      dev_vprintk_emit().  But printk_emit() is currently used only in
      devkmsg_writev() and here "dict" is NULL as well.  In dev_vprintk_emit(),
      "dict" is limited by the static buffer "hdr" of the size 128 bytes.  It
      meas that the current maximal printed text is 1024B + 128B + 16B and it
      always fit the log buffer.
      
      But it is only matter of time when someone calls printk_emit() with unsafe
      parameters, especially the "dict" one.
      
      This patch adds a check for the free space when the buffer is empty.  It
      reuses the already existing log_has_space() function but it has to add an
      extra parameter.  It defines whether the buffer is empty.  Note that the
      same values of "first_idx" and "next_idx" might also mean that the buffer
      is full.
      
      If the buffer is empty, we must respect the current position of the
      indexes.  We cannot reset them to the beginning of the buffer.  Otherwise,
      the functions reading the buffer would get crazy.
      
      The question is what to do when the message is too long.  This patch uses
      the easiest solution and just ignores the problematic message.  Let's do
      something better in a followup patch.
      Signed-off-by: NPetr Mladek <pmladek@suse.cz>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f40e4b9f
    • P
      printk: split code for making free space in the log buffer · 0a581694
      Petr Mladek 提交于
      The check for free space in the log buffer always passes when "first_seq"
      and "next_seq" are equal.  In theory, it might cause writing outside of
      the log buffer.
      
      Fortunately, the current usage looks safe because the used "text" and
      "dict" buffers are quite limited.  See the second patch for more details.
      
      Anyway, it is better to be on the safe side and add a check.  An easy
      solution is done in the 2nd patch and it is improved in the 4th patch.
      
      5th patch fixes the computation of the printed message length.
      
      1st and 3rd patches just do some code refactoring to make the other
      patches easier.
      
      This patch (of 5):
      
      There will be needed some fixes in the check for free space.  They will be
      easier if the code is moved outside of the quite long log_store()
      function.
      
      This patch does not change the existing behavior.
      Signed-off-by: NPetr Mladek <pmladek@suse.cz>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0a581694
  2. 29 5月, 2014 1 次提交
    • S
      printk/of_serial: fix serial console cessation part way through boot. · 7fa21dd8
      Stephen Chivers 提交于
      Commit 5f5c9ae5
      "serial_core: Unregister console in uart_remove_one_port()"
      fixed a crash where a serial port was removed but
      not deregistered as a console.
      
      There is a side effect of that commit for platforms having serial consoles
      and of_serial configured (CONFIG_SERIAL_OF_PLATFORM). The serial console
      is disabled midway through the boot process.
      
      This cessation of the serial console affects PowerPC computers
      such as the MVME5100 and SAM440EP.
      
      The sequence is:
      
      	bootconsole [udbg0] enabled
      	....
      	serial8250/16550 driver initialises and registers its UARTS,
      	one of these is the serial console.
      	console [ttyS0] enabled
      	....
      	of_serial probes "platform" devices, registering them as it goes.
      	One of these is the serial console.
      	console [ttyS0] disabled.
      
      The disabling of the serial console is due to:
      
      	a.  unregister_console in printk not clearing the
      	    CONS_ENABLED bit in the console flags,
      	    even though it has announced that the console is disabled; and
      
      	b.  of_platform_serial_probe in of_serial not setting the port type
      	    before it registers with serial8250_register_8250_port.
      
      This patch ensures that the serial console is re-enabled when of_serial
      registers a serial port that corresponds to the designated console.
      Signed-off-by: NStephen Chivers <schivers@csc.com>
      Tested-by: NStephen Chivers <schivers@csc.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [unregister_console]
      Cc: stable <stable@vger.kernel.org> # 3.15
      
      ===
      The above failure was identified in Linux-3.15-rc2.
      
      Tested using MVME5100 and SAM440EP PowerPC computers with
      kernels built from Linux-3.15-rc5 and tty-next.
      
      The continued operation of the serial console is vital for computers
      such as the MVME5100 as that Single Board Computer does not
      have any grapical/display hardware.
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7fa21dd8
  3. 06 5月, 2014 1 次提交
  4. 04 4月, 2014 5 次提交
    • J
      printk: fix one circular lockdep warning about console_lock · 72581487
      Jane Li 提交于
      Fix a warning about possible circular locking dependency.
      
      If do in following sequence:
      
          enter suspend ->  resume ->  plug-out CPUx (echo 0 > cpux/online)
      
      lockdep will show warning as following:
      
        ======================================================
        [ INFO: possible circular locking dependency detected ]
        3.10.0 #2 Tainted: G           O
        -------------------------------------------------------
        sh/1271 is trying to acquire lock:
        (console_lock){+.+.+.}, at: console_cpu_notify+0x20/0x2c
        but task is already holding lock:
        (cpu_hotplug.lock){+.+.+.}, at: cpu_hotplug_begin+0x2c/0x58
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
        -> #2 (cpu_hotplug.lock){+.+.+.}:
          lock_acquire+0x98/0x12c
          mutex_lock_nested+0x50/0x3d8
          cpu_hotplug_begin+0x2c/0x58
          _cpu_up+0x24/0x154
          cpu_up+0x64/0x84
          smp_init+0x9c/0xd4
          kernel_init_freeable+0x78/0x1c8
          kernel_init+0x8/0xe4
          ret_from_fork+0x14/0x2c
      
        -> #1 (cpu_add_remove_lock){+.+.+.}:
          lock_acquire+0x98/0x12c
          mutex_lock_nested+0x50/0x3d8
          disable_nonboot_cpus+0x8/0xe8
          suspend_devices_and_enter+0x214/0x448
          pm_suspend+0x1e4/0x284
          try_to_suspend+0xa4/0xbc
          process_one_work+0x1c4/0x4fc
          worker_thread+0x138/0x37c
          kthread+0xa4/0xb0
          ret_from_fork+0x14/0x2c
      
        -> #0 (console_lock){+.+.+.}:
          __lock_acquire+0x1b38/0x1b80
          lock_acquire+0x98/0x12c
          console_lock+0x54/0x68
          console_cpu_notify+0x20/0x2c
          notifier_call_chain+0x44/0x84
          __cpu_notify+0x2c/0x48
          cpu_notify_nofail+0x8/0x14
          _cpu_down+0xf4/0x258
          cpu_down+0x24/0x40
          store_online+0x30/0x74
          dev_attr_store+0x18/0x24
          sysfs_write_file+0x16c/0x19c
          vfs_write+0xb4/0x190
          SyS_write+0x3c/0x70
          ret_fast_syscall+0x0/0x48
      
        Chain exists of:
           console_lock --> cpu_add_remove_lock --> cpu_hotplug.lock
      
        Possible unsafe locking scenario:
               CPU0                    CPU1
               ----                    ----
        lock(cpu_hotplug.lock);
                                       lock(cpu_add_remove_lock);
                                       lock(cpu_hotplug.lock);
        lock(console_lock);
          *** DEADLOCK ***
      
      There are three locks involved in two sequence:
      a) pm suspend:
      	console_lock (@suspend_console())
      	cpu_add_remove_lock (@disable_nonboot_cpus())
      	cpu_hotplug.lock (@_cpu_down())
      b) Plug-out CPUx:
      	cpu_add_remove_lock (@(cpu_down())
      	cpu_hotplug.lock (@_cpu_down())
      	console_lock (@console_cpu_notify()) => Lockdeps prints warning log.
      
      There should be not real deadlock, as flag of console_suspended can
      protect this.
      
      Although console_suspend() releases console_sem, it doesn't tell lockdep
      about it.  That results in the lockdep warning about circular locking
      when doing the following: enter suspend -> resume -> plug-out CPUx (echo
      0 > cpux/online)
      
      Fix the problem by telling lockdep we actually released the semaphore in
      console_suspend() and acquired it again in console_resume().
      Signed-off-by: NJane Li <jiel@marvell.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72581487
    • P
      printk: do not compute the size of the message twice · fce6e033
      Petr Mladek 提交于
      This is just a tiny optimization.  It removes duplicate computation of
      the message size.
      Signed-off-by: NPetr Mladek <pmladek@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fce6e033
    • P
      printk: use also the last bytes in the ring buffer · 39b25109
      Petr Mladek 提交于
      It seems that we have newer used the last byte in the ring buffer.  In
      fact, we have newer used the last 4 bytes because of padding.
      
      First problem is in the check for free space.  The exact number of free
      bytes is enough to store the length of data.
      
      Second problem is in the check where the ring buffer is rotated.  The
      left side counts the first unused index.  It is unused, so it might be
      the same as the size of the buffer.
      
      Note that the first problem has to be fixed together with the second
      one.  Otherwise, the buffer is rotated even when there is enough space
      on the end of the buffer.  Then the beginning of the buffer is rewritten
      and valid entries get corrupted.
      Signed-off-by: NPetr Mladek <pmladek@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      39b25109
    • P
      printk: add comment about tricky check for text buffer size · e8c42d36
      Petr Mladek 提交于
      There is no check for potential "text_len" overflow.  It is not needed
      because only valid level is detected.  It took me some time to
      understand why.  It would deserve a comment ;-)
      Signed-off-by: NPetr Mladek <pmladek@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e8c42d36
    • P
      printk: remove obsolete check for log level "c" · c64730b2
      Petr Mladek 提交于
      The kernel log level "c" was removed in commit 61e99ab8 ("printk:
      remove the now unnecessary "C" annotation for KERN_CONT").  It is no
      longer detected in printk_get_level().  Hence we do not need to check it
      in vprintk_emit.
      Signed-off-by: NPetr Mladek <pmladek@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c64730b2
  5. 18 2月, 2014 1 次提交
    • L
      printk: fix syslog() overflowing user buffer · e4178d80
      Linus Torvalds 提交于
      This is not a buffer overflow in the traditional sense: we don't
      overflow any *kernel* buffers, but we do mis-count the amount of data we
      copy back to user space for the SYSLOG_ACTION_READ_ALL case.
      
      In particular, if the user buffer is too small to hold everything, and
      *if* there is a continuation line at just the right place, we can end up
      giving the user more data than he asked for.
      
      The reason is that we first count up the number of bytes all the log
      records contains, then we walk the records again until we've skipped the
      records at the beginning that won't fit, and then we walk the rest of
      the records and copy them to the user space buffer.
      
      And in between that "skip the initial records that won't fit" and the
      "copy the records that *will* fit to user space", we reset the 'prev'
      variable that contained the record information for the last record not
      copied.  That meant that when we started copying to user space, we now
      had a different character count than what we had originally calculated
      in the first record walk-through.
      
      The fix is to simply not clear the 'prev' flags value (in both cases
      where we had the same logic: syslog_print_all and kmsg_dump_get_buffer:
      the latter is used for pstore-like dumping)
      Reported-and-tested-by: NDebabrata Banerjee <dbanerje@akamai.com>
      Acked-by: NKay Sievers <kay@vrfy.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e4178d80
  6. 24 1月, 2014 1 次提交
    • A
      printk: flush conflicting continuation line · 1d3fa370
      Arun KS 提交于
      An earlier newline was missing and current print is from different task.
      In this scenario flush the continuation line and store this line
      seperatly.
      
      This patch fix the below scenario of timestamp interleaving,
         [   28.154370 ] read_word_reg : reg[0x 3], reg[0x 4]  data [0x 642]
         [   28.155428 ] uart disconnect
         [   31.947341 ] dvfs[cpufreq.c<275>]:plug-in cpu<1> done
         [   28.155445 ] UART detached : send switch state 201
         [   32.014112 ] read_reg : reg[0x 3] data[0x21]
      
      [akpm@linux-foundation.org: simplify and condense the code]
      Signed-off-by: NArun KS <getarunks@gmail.com>
      Signed-off-by: NArun KS <arun.ks@broadcom.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d3fa370
  7. 22 1月, 2014 1 次提交
    • S
      kernel/printk/printk.c: use memblock apis for early memory allocations · 9da791df
      Santosh Shilimkar 提交于
      Switch to memblock interfaces for early memory allocator instead of
      bootmem allocator.  No functional change in beahvior than what it is in
      current code from bootmem users points of view.
      
      Archs already converted to NO_BOOTMEM now directly use memblock
      interfaces instead of bootmem wrappers build on top of memblock.  And
      the archs which still uses bootmem, these new apis just fallback to
      exiting bootmem APIs.
      Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9da791df
  8. 13 11月, 2013 4 次提交
  9. 05 8月, 2013 1 次提交
    • A
      register_console: prevent adding the same console twice · 16cf48a6
      Andreas Bießmann 提交于
      This patch guards the console_drivers list to be corrupted. The
      for_each_console() macro insist on a strictly forward list ended by NULL:
      
       con0->next->con1->next->NULL
      
      Without this patch it may happen easily to destroy this list for example by
      adding 'earlyprintk' twice, especially on embedded devices where the early
      console is often a single static instance.  This will result in the following
      list:
      
       con0->next->con0
      
      This in turn will result in an endless loop in console_unlock() later on by
      printing the first __log_buf line endlessly.
      Signed-off-by: NAndreas Bießmann <andreas@biessmann.de>
      Cc: Kay Sievers <kay@vrfy.org>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      16cf48a6
  10. 01 8月, 2013 5 次提交
  11. 15 7月, 2013 1 次提交
    • P
      kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      
      [1] https://lkml.org/lkml/2013/5/20/589Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      0db0628d
  12. 13 6月, 2013 1 次提交
    • K
      kmsg: honor dmesg_restrict sysctl on /dev/kmsg · 637241a9
      Kees Cook 提交于
      The dmesg_restrict sysctl currently covers the syslog method for access
      dmesg, however /dev/kmsg isn't covered by the same protections.  Most
      people haven't noticed because util-linux dmesg(1) defaults to using the
      syslog method for access in older versions.  With util-linux dmesg(1)
      defaults to reading directly from /dev/kmsg.
      
      To fix /dev/kmsg, let's compare the existing interfaces and what they
      allow:
      
       - /proc/kmsg allows:
        - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
          single-reader interface (SYSLOG_ACTION_READ).
        - everything, after an open.
      
       - syslog syscall allows:
        - anything, if CAP_SYSLOG.
        - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
          dmesg_restrict==0.
        - nothing else (EPERM).
      
      The use-cases were:
       - dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
       - sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
         destructive SYSLOG_ACTION_READs.
      
      AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't
      clear the ring buffer.
      
      Based on the comments in devkmsg_llseek, it sounds like actions besides
      reading aren't going to be supported by /dev/kmsg (i.e.
      SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
      syslog syscall actions.
      
      To this end, move the check as Josh had done, but also rename the
      constants to reflect their new uses (SYSLOG_FROM_CALL becomes
      SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
      SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
      allows destructive actions after a capabilities-constrained
      SYSLOG_ACTION_OPEN check.
      
       - /dev/kmsg allows:
        - open if CAP_SYSLOG or dmesg_restrict==0
        - reading/polling, after open
      
      Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192
      
      [akpm@linux-foundation.org: use pr_warn_once()]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reported-by: NChristian Kujau <lists@nerdbynature.de>
      Tested-by: NJosh Boyer <jwboyer@redhat.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      637241a9
  13. 08 5月, 2013 1 次提交
  14. 01 5月, 2013 4 次提交
    • T
      workqueue: include workqueue info when printing debug dump of a worker task · 3d1cb205
      Tejun Heo 提交于
      One of the problems that arise when converting dedicated custom
      threadpool to workqueue is that the shared worker pool used by workqueue
      anonimizes each worker making it more difficult to identify what the
      worker was doing on which target from the output of sysrq-t or debug
      dump from oops, BUG() and friends.
      
      This patch implements set_worker_desc() which can be called from any
      workqueue work function to set its description.  When the worker task is
      dumped for whatever reason - sysrq-t, WARN, BUG, oops, lockdep assertion
      and so on - the description will be printed out together with the
      workqueue name and the worker function pointer.
      
      The printing side is implemented by print_worker_info() which is called
      from functions in task dump paths - sched_show_task() and
      dump_stack_print_info().  print_worker_info() can be safely called on
      any task in any state as long as the task struct itself is accessible.
      It uses probe_*() functions to access worker fields.  It may print
      garbage if something went very wrong, but it wouldn't cause (another)
      oops.
      
      The description is currently limited to 24bytes including the
      terminating \0.  worker->desc_valid and workder->desc[] are added and
      the 64 bytes marker which was already incorrect before adding the new
      fields is moved to the correct position.
      
      Here's an example dump with writeback updated to set the bdi name as
      worker desc.
      
       Hardware name: Bochs
       Modules linked in:
       Pid: 7, comm: kworker/u9:0 Not tainted 3.9.0-rc1-work+ #1
       Workqueue: writeback bdi_writeback_workfn (flush-8:0)
        ffffffff820a3ab0 ffff88000f6e9cb8 ffffffff81c61845 ffff88000f6e9cf8
        ffffffff8108f50f 0000000000000000 0000000000000000 ffff88000cde16b0
        ffff88000cde1aa8 ffff88001ee19240 ffff88000f6e9fd8 ffff88000f6e9d08
       Call Trace:
        [<ffffffff81c61845>] dump_stack+0x19/0x1b
        [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff81200150>] bdi_writeback_workfn+0x2a0/0x3b0
       ...
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Acked-by: NJan Kara <jack@suse.cz>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3d1cb205
    • T
      dump_stack: unify debug information printed by show_regs() · a43cb95d
      Tejun Heo 提交于
      show_regs() is inherently arch-dependent but it does make sense to print
      generic debug information and some archs already do albeit in slightly
      different forms.  This patch introduces a generic function to print debug
      information from show_regs() so that different archs print out the same
      information and it's much easier to modify what's printed.
      
      show_regs_print_info() prints out the same debug info as dump_stack()
      does plus task and thread_info pointers.
      
      * Archs which didn't print debug info now do.
      
        alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
        metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
        um, xtensa
      
      * Already prints debug info.  Replaced with show_regs_print_info().
        The printed information is superset of what used to be there.
      
        arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
      
      * s390 is special in that it used to print arch-specific information
        along with generic debug info.  Heiko and Martin think that the
        arch-specific extra isn't worth keeping s390 specfic implementation.
        Converted to use the generic version.
      
      Note that now all archs print the debug info before actual register
      dumps.
      
      An example BUG() dump follows.
      
       kernel BUG at /work/os/work/kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      v2: Typo fix in x86-32.
      
      v3: CPU number dropped from show_regs_print_info() as
          dump_stack_print_info() has been updated to print it.  s390
          specific implementation dropped as requested by s390 maintainers.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>		[tile bits]
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a43cb95d
    • T
      dump_stack: implement arch-specific hardware description in task dumps · 98e5e1bf
      Tejun Heo 提交于
      x86 and ia64 can acquire extra hardware identification information
      from DMI and print it along with task dumps; however, the usage isn't
      consistent.
      
      * x86 show_regs() collects vendor, product and board strings and print
        them out with PID, comm and utsname.  Some of the information is
        printed again later in the same dump.
      
      * warn_slowpath_common() explicitly accesses the DMI board and prints
        it out with "Hardware name:" label.  This applies to both x86 and
        ia64 but is irrelevant on all other archs.
      
      * ia64 doesn't show DMI information on other non-WARN dumps.
      
      This patch introduces arch-specific hardware description used by
      dump_stack().  It can be set by calling dump_stack_set_arch_desc()
      during boot and, if exists, printed out in a separate line with
      "Hardware name:" label.
      
      dmi_set_dump_stack_arch_desc() is added which sets arch-specific
      description from DMI data.  It uses dmi_ids_string[] which is set from
      dmi_present() used for DMI debug message.  It is superset of the
      information x86 show_regs() is using.  The function is called from x86
      and ia64 boot code right after dmi_scan_machine().
      
      This makes the explicit DMI handling in warn_slowpath_common()
      unnecessary.  Removed.
      
      show_regs() isn't yet converted to use generic debug information
      printing and this patch doesn't remove the duplicate DMI handling in
      x86 show_regs().  The next patch will unify show_regs() handling and
      remove the duplication.
      
      An example WARN dump follows.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #3
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f500 ffffffff82228240 0000000000000040 ffffffff8234a08e
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a0c3>] init_workqueues+0x35/0x505
        ...
      
      v2: Use the same string as the debug message from dmi_present() which
          also contains BIOS information.  Move hardware name into its own
          line as warn_slowpath_common() did.  This change was suggested by
          Bjorn Helgaas.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      98e5e1bf
    • T
      dump_stack: consolidate dump_stack() implementations and unify their behaviors · 196779b9
      Tejun Heo 提交于
      Both dump_stack() and show_stack() are currently implemented by each
      architecture.  show_stack(NULL, NULL) dumps the backtrace for the
      current task as does dump_stack().  On some archs, dump_stack() prints
      extra information - pid, utsname and so on - in addition to the
      backtrace while the two are identical on other archs.
      
      The usages in arch-independent code of the two functions indicate
      show_stack(NULL, NULL) should print out bare backtrace while
      dump_stack() is used for debugging purposes when something went wrong,
      so it does make sense to print additional information on the task which
      triggered dump_stack().
      
      There's no reason to require archs to implement two separate but mostly
      identical functions.  It leads to unnecessary subtle information.
      
      This patch expands the dummy fallback dump_stack() implementation in
      lib/dump_stack.c such that it prints out debug information (taken from
      x86) and invokes show_stack(NULL, NULL) and drops arch-specific
      dump_stack() implementations in all archs except blackfin.  Blackfin's
      dump_stack() does something wonky that I don't understand.
      
      Debug information can be printed separately by calling
      dump_stack_print_info() so that arch-specific dump_stack()
      implementation can still emit the same debug information.  This is used
      in blackfin.
      
      This patch brings the following behavior changes.
      
      * On some archs, an extra level in backtrace for show_stack() could be
        printed.  This is because the top frame was determined in
        dump_stack() on those archs while generic dump_stack() can't do that
        reliably.  It can be compensated by inlining dump_stack() but not
        sure whether that'd be necessary.
      
      * Most archs didn't use to print debug info on dump_stack().  They do
        now.
      
      An example WARN dump follows.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Hardware name: empty
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a071>] init_workqueues+0x35/0x505
        ...
      
      v2: CPU number added to the generic debug info as requested by s390
          folks and dropped the s390 specific dump_stack().  This loses %ksp
          from the debug message which the maintainers think isn't important
          enough to keep the s390-specific dump_stack() implementation.
      
          dump_stack_print_info() is moved to kernel/printk.c from
          lib/dump_stack.c.  Because linkage is per objecct file,
          dump_stack_print_info() living in the same lib file as generic
          dump_stack() means that archs which implement custom dump_stack()
          - at this point, only blackfin - can't use dump_stack_print_info()
          as that will bring in the generic version of dump_stack() too.  v1
          The v1 patch broke build on blackfin due to this issue.  The build
          breakage was reported by Fengguang Wu.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	[s390 bits]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      196779b9
  15. 30 4月, 2013 3 次提交
  16. 23 3月, 2013 1 次提交
    • F
      printk: Provide a wake_up_klogd() off-case · dc72c32e
      Frederic Weisbecker 提交于
      wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk()
      nor printk_sched() are in use and there are actually no waiter on
      log_wait waitqueue.  It should be a stub in this case for users like
      bust_spinlocks().
      
      Otherwise this results in this warning when CONFIG_PRINTK=n and
      CONFIG_IRQ_WORK=n:
      
      	kernel/built-in.o In function `wake_up_klogd':
      	(.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue'
      
      To fix this, provide an off-case for wake_up_klogd() when
      CONFIG_PRINTK=n.
      
      There is much more from console_unlock() and other console related code
      in printk.c that should be moved under CONFIG_PRINTK.  But for now,
      focus on a minimal fix as we passed the merged window already.
      
      [akpm@linux-foundation.org: include printk.h in bust_spinlocks.c]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reported-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dc72c32e
  17. 18 2月, 2013 1 次提交
  18. 08 2月, 2013 1 次提交
  19. 31 1月, 2013 1 次提交
  20. 05 1月, 2013 1 次提交
    • R
      printk: fix incorrect length from print_time() when seconds > 99999 · 35dac27c
      Roland Dreier 提交于
      print_prefix() passes a NULL buf to print_time() to get the length of
      the time prefix; when printk times are enabled, the current code just
      returns the constant 15, which matches the format "[%5lu.%06lu] " used
      to print the time value.  However, this is obviously incorrect when the
      whole seconds part of the time gets beyond 5 digits (100000 seconds is a
      bit more than a day of uptime).
      
      The simple fix is to use snprintf(NULL, 0, ...) to calculate the actual
      length of the time prefix.  This could be micro-optimized but it seems
      better to have simpler, more readable code here.
      
      The bug leads to the syslog system call miscomputing which messages fit
      into the userspace buffer.  If there are enough messages to fill
      log_buf_len and some have a timestamp >= 100000, dmesg may fail with:
      
          # dmesg
          klogctl: Bad address
      
      When this happens, strace shows that the failure is indeed EFAULT due to
      the kernel mistakenly accessing past the end of dmesg's buffer, since
      dmesg asks the kernel how big a buffer it needs, allocates a bit more,
      and then gets an error when it asks the kernel to fill it:
      
          syslog(0xa, 0, 0)                       = 1048576
          mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa4d25d2000
          syslog(0x3, 0x7fa4d25d2010, 0x100008)   = -1 EFAULT (Bad address)
      
      As far as I can see, the bug has been there as long as print_time(),
      which comes from commit 084681d1 ("printk: flush continuation lines
      immediately to console") in 3.5-rc5.
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Sylvain Munaut <s.munaut@whatever-company.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      35dac27c
  21. 18 12月, 2012 1 次提交
  22. 18 11月, 2012 1 次提交
    • F
      printk: Wake up klogd using irq_work · 74876a98
      Frederic Weisbecker 提交于
      klogd is woken up asynchronously from the tick in order
      to do it safely.
      
      However if printk is called when the tick is stopped, the reader
      won't be woken up until the next interrupt, which might not fire
      for a while. As a result, the user may miss some message.
      
      To fix this, lets implement the printk tick using a lazy irq work.
      This subsystem takes care of the timer tick state and can
      fix up accordingly.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      74876a98
  23. 24 10月, 2012 1 次提交
    • D
      console: use might_sleep in console_lock · 6b898c07
      Daniel Vetter 提交于
      Instead of BUG_ON(in_interrupt()), since that doesn't check for all
      the newfangled stuff like preempt.
      
      Note that this is valid since the console_sem is essentially used like
      a real mutex with only two twists:
      - we allow trylock from hardirq context
      - across suspend/resume we lock the logical console_lock, but drop the
        semaphore protecting the locking state.
      
      Now that doesn't guarantee that no one is playing tricks in
      single-thread atomic contexts at suspend/resume/boot time, but
      - I couldn't find anything suspicious with some grepping,
      - might_sleep shouldn't die,
      - and I think the upside of catching more potential issues is worth
        the risk of getting a might_sleep backtrace that would have been
        save (and then dealing with that fallout).
      
      Cc: Dave Airlie <airlied@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b898c07